Information and Randomness: An Algorithmic Perspective (Texts in Theoretical Computer Science. An EATCS Series)


Author: Cristian S. Calude


Cristian S. Calude

Information and Randomness
An Algorithmic Perspective
Second Edition, Revised and Extended
Forewords by Gregory J. Chaitin and Arto Salomaa

Springer

Author
Prof. Dr. Cristian S. Calude
Department of Computer Science
Auckland University
Private Bag 92019
Auckland, New Zealand
[email protected]
www.cs.auckland.ac.nz/~cristian

Series Editors
Prof. Dr. Wilfried Brauer
Institut für Informatik
Technische Universität München
Arcisstrasse 21, 80333 München, Germany
[email protected]

Prof. Dr. Grzegorz Rozenberg
Leiden Institute of Advanced Computer Science
University of Leiden
Niels Bohrweg 1, 2333 CA Leiden, The Netherlands
[email protected]

Prof. Dr. Arto Salomaa
Turku Centre for Computer Science
Lemminkäisenkatu 14 A, 20520 Turku, Finland
[email protected]

Library of Congress Cataloging-in-Publication Data
Calude, Cristian, 1952-
Information and randomness: an algorithmic perspective / Cristian Calude; forewords by Gregory J. Chaitin and Arto Salomaa. - 2nd ed.
p. cm. - (Texts in theoretical computer science)
Includes bibliographical references and index.
ISBN 3540434666 (hc: alk. paper)
1. Machine theory. 2. Computational complexity. 3. Stochastic processes. I. Title. II. EATCS monographs on theoretical computer science.
QA267 .C22 2002 616.9'2-dc21

2002075734

ACM Computing Classification (1998): GA, F.2.1-2, F.1, E.1 ISBN 3-540-43466-6 Springer-Verlag Berlin Heidelberg New York ISBN 3-540-57456-5 1. edition Springer-Verlag Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law. Springer-Verlag Berlin Heidelberg New York, a member of BertelsmannSpringer Science+ Business Media GmbH © Springer-Verlag Berlin Heidelberg 1994, 2002

Printed in Germany

The use of general descriptive names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover Design: KünkelLopka, Heidelberg
Typesetting: Camera-ready by the authors
Printed on acid-free paper
SPIN 10797625

45/3142SR - 5 43 2 1 0

Editor's Foreword

The first edition of the monograph Information and Randomness: An Algorithmic Perspective by Cristian Calude was published in 1994. In my Foreword I said:

"The research in algorithmic information theory is already some 30 years old. However, only the recent years have witnessed a really vigorous growth in this area. ... The present book by Calude fits very well in our series. Much original research is presented ... making the approach richer in consequences than the classical one. Remarkably, however, the text is so self-contained and coherent that the book may also serve as a textbook. All proofs are given in the book and, thus, it is not necessary to consult other sources for classroom instruction."

The vigorous growth in the study of algorithmic information theory has continued during the past few years, which is clearly visible in the present second edition. Many new results, examples, exercises and open problems have been added. The additions include two entirely new chapters: "Computably Enumerable Random Reals" and "Randomness and Incompleteness". The really comprehensive new bibliography makes the book very valuable for a researcher. The new results about the characterization of computably enumerable random reals, as well as the fascinating Omega Numbers, should contribute much to the value of the book as a textbook. The author has been directly involved in these results that have appeared in the prestigious journals Nature, New Scientist and Pour la Science.

May 2002

Arto Salomaa The Academy of Finland

Foreword

Algorithmic information theory (AIT) is the result of putting Shannon's information theory and Turing's computability theory into a cocktail shaker and shaking vigorously. The basic idea is to measure the complexity of an object by the size in bits of the smallest program for computing it.

AIT appeared in two installments. In the original formulation of AIT, AIT1, which lasted about a decade, there were $2^N$ programs of size $N$. For the past thirty years, AIT1 has been superseded by a theory, AIT2, in which no extension of a valid program is a valid program. Therefore there are much fewer than $2^N$ possible programs of size $N$. I have been the main intellectual driving force behind both AIT1 and AIT2, and in my opinion AIT1 is only of historical or pedagogic interest. I am happy that this book concentrates mostly on AIT2.

Recently I have used LISP to convert AIT2 into a theory of the size of real computer programs, programs that you can actually run, yielding a new version of AIT2, AIT3. Fortunately for the readers, this book concentrates on theory and avoids computer programming.

In my opinion, program-size complexity is a much deeper concept than run-time complexity, which however is of greater practical importance in designing useful algorithms.

The main applications of AIT are two-fold. First, to give a mathematical definition of what it means for a string of bits to be patternless, random, unstructured, typical. Indeed, most bit strings are algorithmically irreducible and therefore random. And, even more important, AIT casts an entirely new light on the incompleteness phenomenon discovered by Gödel. AIT does this by placing information-theoretic limits on the power of any formal axiomatic theory. The new information-theoretic viewpoint provided by AIT suggests that incompleteness is natural and pervasive and cannot be brushed away in our everyday mathematical work. Indeed, AIT provides theoretical support for a quasi-empirical attitude to the foundations of mathematics and for adopting new arithmetical axioms that are not self-evident but are only justified pragmatically.

There are also connections between AIT and physics. The program-size complexity measure of AIT is analogous to the Boltzmann entropy concept that plays a key role in statistical mechanics. And my work on Hilbert's 10th problem using AIT shows that God not only plays dice in quantum mechanics and nonlinear dynamics, but even in elementary number theory. AIT thus plays a role in recent efforts to build a bridge between theoretical computer science and theoretical physics. In this spirit, I should point out that a universal Turing machine is, from a physicist's point of view, just a physical system with such a rich repertoire of possible behavior that it can simulate any other physical system. This bridge-building is also connected with recent efforts by theoretical physicists to understand complex physical systems such as those encountered in biology.

This book, benefiting as it does from Cristian Calude's own research in AIT and from his experience teaching AIT in university courses around the world, has helped to make the detailed mathematical techniques of AIT accessible to a much wider audience. This vastly expanded second edition collects in one place much exciting recent work of its author and others, and offers leisurely discussions of applications to philosophy and physics. I am sure that it will be even more successful and influential than the first edition.

May 2002

G. J. Chaitin IBM Watson Research Center

Preface to the Second Edition

The second edition of this book is more than a simple corrected, updated version. The main reason is the following: Algorithmic Information Theory (AIT) has made remarkable progress in the last few years. I would like to mention just a few facts:

• The publication of Chaitin's trilogy [130, 131, 132] in which a programming-oriented vision of AIT has been projected. These books have made the theory accessible to a much wider category of readers and paved the way to new results, e.g. the calculation of 64 exact bits of an Omega Number [77].

• The solution of a long-time open problem posed by Solovay (see [375, 85, 266]) has opened new ways of understanding randomness, especially regarding computably enumerable random reals. This problem has stirred the interest in AIT of a group of experts in computability theory and the results are remarkable (a book overview [182] is due to appear soon).

• The spectacular result of Solovay [377], who has effectively constructed a universal Chaitin computer $U$ such that ZFC, if arithmetically sound, cannot determine any single bit of its halting probability, $\Omega_U$, has produced a new understanding of the relations between incompleteness and randomness.

• AIT has become increasingly more useful for other subjects, in particular physics and quantum computing.

More and more researchers have been attracted to AIT. Various articles have appeared not only in prestigious international journals or conference proceedings (see the bibliography), but also in science magazines, such as AMS: Math in the Media, AMS Notices, American Scientist, Complexity, EATCS Bulletin, Nature, New Scientist, Pour La Science, and books devoted to larger audiences, e.g. Impossibility - The Limits of Science and the Science of Limits by J. Barrow, Five More Golden Rules by J. Casti or Cornerstones of Undecidability by G. Rozenberg and A. Salomaa, to name only a few. Finally, more researchers use the content-oriented name AIT instead of Kolmogorov complexity.[1]

What's new in this edition? Here are a few points:

• Some errors and typos have been corrected.

• The terminology in computability theory has been modernized: it refers to partial computable functions, computable sets and functions, computably enumerable sets and reals instead of partial recursive functions, recursive sets and functions, recursively enumerable sets and reals.

• Two new chapters have been added, Computably Enumerable Random Reals and Randomness and Incompleteness.

• Many results, examples, problems and exercises have been added.

• The list of open problems has been revised.

• The bibliography has been revised and about 200 new references have been added.

In this book I treat some important problems in AIT; I do not offer a general and exhaustive presentation of AIT. There are other complementary approaches to some of the main problems discussed in this book; for example the line adopted in Traub and Werschulz [403]. The selection of the material is subjective and follows my own interests. In the selection an important role was played by the reactions of my students at the University of Western Ontario, London, Canada, the University of Auckland, New Zealand, the Technical University of Vienna, Austria, the Japan Advanced Institute of Science and Technology, Japan, the University of Bucharest and "Ovidius" University of Constanta, Romania, and Universidad de Buenos Aires, Argentina. I would like to thank all of them.

[1] Mathematical Reviews and Zentralblatt Math have chosen the name "Algorithmic Information Theory" (68Q30) for the field.


I am extremely grateful to Wilfried Brauer, Greg Chaitin, Grzegorz Rozenberg and Arto Salomaa; without their encouragement and strong support the first edition of the book would not have appeared and the second one would have only been a dream.

I have learned a lot from many colleagues, from their publications, from their discussions, from their co-operation. I warmly thank A. Arslanov, Veronica Backer, John Barrow, Douglas Bridges, Cezar Campeanu, John Casti, Richard Coles, Jack Copeland, John Dawson Jr., Michael Dinneen, Monica Dumitrescu, Cristian Grozea, Josef Gruska, Juris Hartmanis, Lane Hemaspaandra, Peter Hertling, Juraj Hromkovič, Hajime Ishihara, Helmut Jürgensen, Bakh Khoussainov, Tien Kieu, Shane Legg, Solomon Marcus, Walter Meyerstein, Anil Nerode, Andre Nies, George Odifreddi, Boris Pavlov, Chi-Kou Shu, Ted Slaman, Bob Soare, Bob Solovay, Ludwig Staiger, Karl Svozil, Ioan Tomescu, Vladimir Uspensky, Yongge Wang, Klaus Weihrauch, Tudor Zamfirescu, Marius Zimand.

I am very grateful to Joshua Arulanandham, Elena Calude, Greg Chaitin, Simona Dragomir, Cristian Grozea, Peter Hertling, Bakh Khoussainov, Ion Mandoiu, Carlos Parra, Ludwig Staiger, Garry Tee and Marius Zimand for their comments which helped me to improve the book.

Last, but not least, I reserve a big thank you to Ingeborg Mayer and Ulrike Stricker from Springer-Verlag, Heidelberg, for a most pleasant and efficient co-operation.

May 2002

Cristian S. Calude Auckland, New Zealand

Preface to the First Edition

We sail within a vast sphere, ever drifting in uncertainty, driven from end to end. When we think to attach ourselves to any point and to fasten to it, it wavers and leaves us; and if we follow it, it eludes our grasp, slips past us, and vanishes forever.
Blaise Pascal

This book represents an elementary and, to a large extent, subjective introduction to algorithmic information theory (AIT). As is clear from its name, this theory deals with algorithmic methods in the study of the quantity of information. While the classical theory of information is based on Shannon's concept of entropy, AIT adopts as a primary concept the information-theoretic complexity or descriptional complexity of an individual object. The entropy is a measure of ignorance concerning which possibility holds in a set endowed with an a priori probability distribution. Its point of view is largely global. The classical definition of randomness as considered in probability theory and used, for instance, in quantum mechanics allows one to speak of a process (such as tossing a coin, or measuring the diagonal polarization of a horizontally-polarized photon) as being random. It does not allow one to call a particular outcome (or string of outcomes, or sequence of outcomes) random, except in an intuitive, heuristic sense. The information-theoretic complexity of an object (independently introduced in the mid 1960s by R. J. Solomonoff, A. N. Kolmogorov and G. J. Chaitin) is a measure of the difficulty of specifying that object; it focuses the attention on the individual, allowing one to formalize the randomness intuition. An algorithmically random string is one not producible from a description significantly shorter than itself, when a universal computer is used as the decoding apparatus.

Our interest is mainly directed to the basics of AIT. The first three chapters present the necessary background, i.e. relevant notions and results from recursion theory, topology, probability, noiseless coding and descriptional complexity. In Chapter 4 we introduce two important tools: the Kraft-Chaitin Theorem (an extension of Kraft's classical condition for the construction of prefix codes corresponding to arbitrary recursively enumerable codes) and relativized complexities and probabilities. As a major result, one computes the halting probability of a universal, self-delimiting computer and one proves that Chaitin's complexity equals, within $O(1)$, the halting entropy (Coding Theorem).

Chapter 5 is devoted to the definition of random strings and to the proof that these strings satisfy almost all stochasticity requirements, e.g. almost all random strings are Borel normal. Random sequences are introduced and studied in Chapter 6. In contrast with the case of strings, for which randomness is a matter of degree, the definition of random sequences is "robust". With probability one every sequence is random (Martin-Löf Theorem) and every sequence is reducible to a random one (Gács Theorem); however, the set of random sequences is topologically "small". Chaitin's Omega Number, defined as the halting probability of a universal self-delimiting computer, has a random sequence of binary digits; the randomness property is preserved even when we re-write this number in an arbitrary base. In fact, a more general result is true: random sequences are invariant under change of base.

We develop the theory of complexity and randomness with respect to an arbitrary alphabet, not necessarily binary. This approach is more general and richer in consequences than the classical one; see especially Sections 4.5 and 6.7.

The concepts and results of AIT are relevant for other subjects, for instance for logic, physics and biology. A brief exploration of some applications may be found in Chapter 7. Finally, Chapter 8 is dedicated to some open problems.

The literature on AIT has grown significantly in the last years. Chaitin's books Algorithmic Information Theory, Information, Randomness & Incompleteness and Information-Theoretic Incompleteness are fundamental for the subject. Osamu Watanabe has edited a beautiful volume entitled Kolmogorov Complexity and Computational Complexity published in 1992 by Springer-Verlag. Ming Li and Paul Vitanyi have written a comprehensive book, An Introduction to Kolmogorov Complexity and Its Applications, published by Springer-Verlag. Karl Svozil is the author of an important book entitled Randomness & Undecidability in Physics, published by World Scientific in 1993.

The bibliography tries to be as complete as possible. In crediting a result I have cited the first paper in which the result is stated and completely proven.

*

I am most grateful to Arto Salomaa for being the springboard of the project leading to this book, for his inspiring comments, suggestions and permanent encouragement. I reserve my deepest gratitude to Greg Chaitin for many illuminating conversations about AIT that have improved an earlier version of the book, for permitting me to incorporate some of his beautiful unpublished results and for writing the Foreword.

My warm thanks go to Charles Bennett, Ronald Book, Egon Börger, Wilfried Brauer, Douglas Bridges, Cezar Campeanu, Ion Chiţescu, Rusins Freivalds, Peter Gács, Josef Gruska, Juris Hartmanis, Lane Hemaspaandra (Hemachandra), Gabriel Istrate, Helmut Jürgensen, Mike Lennon, Ming Li, Jack Lutz, Solomon Marcus, George Markowsky, Per Martin-Löf, Hermann Maurer, Ion Mandoiu, Michel Mendes-France, George Odifreddi, Roger Penrose, Marian Pour-El, Grzegorz Rozenberg, Charles Rackoff, Sergiu Rudeanu, Bob Solovay, Ludwig Staiger, Karl Svozil, Andy Szilard, Doru Ştefanescu, Garry Tee, Monica Tataram, Mark Titchener, Vladimir Uspensky, Dragoş Vaida, and Marius Zimand for stimulating discussions and comments; their beautiful ideas and/or results are now part of this book.

This book was typeset using the LaTeX package CLMono01 produced by Springer-Verlag. I offer special thanks to Helmut Jürgensen, Kai Salomaa, and Jeremy Gibbons, my TeX and LaTeX teachers.

I have taught parts of this book at Bucharest University (Romania), the University of Western Ontario (London, Canada) and Auckland University (New Zealand). I am grateful to all these universities, specifically to the respective chairs Ioan Tomescu, Helmut Jürgensen, and Bob Doran, for the assistance generously offered. My eager students have influenced this book more than they may imagine.


I am indebted to Bruce Benson, Rob Burrowes, Peter Dance, and Peter Shields for their competent technical support. The co-operation with Frank Holzwarth, J. Andrew Ross, and Hans Wössner from Springer-Verlag was particularly efficient and pleasant.

Finally, a word of gratitude to my wife Elena and daughter Andreea; I hope that they do not hate this book as writing it took my energy and attention for a fairly long period.

March 1994

Cristian S. Calude Auckland, New Zealand

Contents

1 Mathematical Background
  1.1 Prerequisites
  1.2 Computability Theory
  1.3 Topology
  1.4 Probability Theory
  1.5 Exercises and Problems

2 Noiseless Coding
  2.1 Prefix-free Sets
  2.2 Instantaneous Coding
  2.3 Exercises and Problems
  2.4 History of Results

3 Program-size
  3.1 An Example
  3.2 Computers and Complexities
  3.3 Algorithmic Properties of Complexities
  3.4 Quantitative Estimates
  3.5 Halting Probabilities
  3.6 Exercises and Problems
  3.7 History of Results

4 Computably Enumerable Instantaneous Codes
  4.1 The Kraft-Chaitin Theorem
  4.2 Relativized Complexities and Probabilities
  4.3 Speed-up Theorem
  4.4 Algorithmic Coding Theorem
  4.5 Binary vs Non-binary Coding (1)
  4.6 Exercises and Problems
  4.7 History of Results

5 Random Strings
  5.1 An Empirical Analysis
  5.2 Chaitin's Definition of Random Strings
  5.3 Relating Complexities K and H
  5.4 A Statistical Analysis
  5.5 A Computational Analysis
  5.6 Borel Normality
  5.7 Extensions of Random Strings
  5.8 Binary vs Non-binary Coding (2)
  5.9 Exercises and Problems
  5.10 History of Results

6 Random Sequences
  6.1 From Random Strings to Random Sequences
  6.2 The Definition of Random Sequences
  6.3 Characterizations of Random Sequences
  6.4 Properties of Random Sequences
  6.5 The Reducibility Theorem
  6.6 The Randomness Hypothesis
  6.7 Exercises and Problems
  6.8 History of Results

7 Computably Enumerable Random Reals
  7.1 Chaitin's Omega Number
  7.2 Is Randomness Base Invariant?
  7.3 Most Reals Obey No Probability Laws
  7.4 Computable and Uncomputable Reals
  7.5 Computably Enumerable Reals, Domination and Degrees
  7.6 A Characterization of Computably Enumerable Random Reals
  7.7 Degree-theoretic Properties of Computably Enumerable Random Reals
  7.8 Exercises and Problems
  7.9 History of Results

8 Randomness and Incompleteness
  8.1 The Incompleteness Phenomenon
  8.2 Information-theoretic Incompleteness (1)
  8.3 Information-theoretic Incompleteness (2)
  8.4 Information-theoretic Incompleteness (3)
  8.5 Coding Mathematical Knowledge
  8.6 Finitely Refutable Mathematical Problems
  8.7 Computing 64 Bits of a Computably Enumerable Random Real
  8.8 Turing's Barrier Revisited
  8.9 History of Results

9 Applications
  9.1 The Infinity of Primes
  9.2 The Undecidability of the Halting Problem
  9.3 Counting as a Source of Randomness
  9.4 Randomness and Chaos
  9.5 Randomness and Cellular Automata
  9.6 Random Sequences of Reals and Riemann's Zeta-function
  9.7 Probabilistic Algorithms
  9.8 Structural Complexity
  9.9 What Is Life?
  9.10 Randomness in Physics
  9.11 Metaphysical Themes

10 Open Problems

Bibliography

Notation Index

Subject Index

Name Index

Chapter 1

Mathematical Background

Cum Deus calculat, fit mundus.
Leibniz

In this chapter we collect facts and results which will be used freely in the book, in an attempt to make it as self-contained as possible.

1.1 Prerequisites

We denote by $\mathbf{N}$, $\mathbf{Q}$, $\mathbf{I}$ and $\mathbf{R}$, respectively, the sets of natural, rational, irrational and real numbers; $\mathbf{N}_+ = \mathbf{N} \setminus \{0\}$ and $\mathbf{R}_+ = \{x \in \mathbf{R} \mid x \geq 0\}$. If $S$ is a finite set, then $\#S$ denotes the cardinality of $S$. We shall use the following functions:

i) $\mathrm{rem}(m, i)$, the remainder of the integral division of $m$ by $i$ ($m, i \in \mathbf{N}_+$),
ii) $\lfloor \alpha \rfloor$, the "floor" of the real $\alpha$ (rounding downwards),
iii) $\lceil \alpha \rceil$, the "ceiling" of the real $\alpha$ (rounding upwards),
iv) $\binom{n}{m}$, the binomial coefficient,
v) $\log_Q$, the base $Q$ logarithm, and $\log(n) = \lfloor \log_2(n + 1) \rfloor$.

It is easily seen that $|\log_2 n - \log n| \leq 1$, for all $n \geq 1$, and

$$\log n + \log m - 1 \leq \log nm \leq \log n + \log m + 1,$$

for all $n, m > 0$. By $\mid$ we denote the divisibility predicate. By $\subset$ we denote the (non-strict) inclusion relation between sets.
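The integer logarithm $\log(n) = \lfloor \log_2(n+1) \rfloor$ and the two inequalities stated above can be checked numerically. The following is a minimal sketch in Python; the function names `rem`, `log` and `check_inequalities` are illustrative choices, not notation from the book, and the check covers only a finite range.

```python
import math


def rem(m: int, i: int) -> int:
    """Remainder of the integral division of m by i (m, i positive)."""
    return m % i


def log(n: int) -> int:
    """Integer logarithm from the text: floor(log2(n + 1)), computed exactly."""
    if n < 0:
        raise ValueError("n must be a natural number")
    # For k >= 1, k.bit_length() - 1 equals floor(log2(k)).
    return (n + 1).bit_length() - 1


def check_inequalities(limit: int) -> bool:
    """Check both inequalities for all 1 <= n, m <= limit."""
    for n in range(1, limit + 1):
        # |log2 n - log n| <= 1
        if abs(math.log2(n) - log(n)) > 1:
            return False
        for m in range(1, limit + 1):
            lower = log(n) + log(m) - 1
            upper = log(n) + log(m) + 1
            if not (lower <= log(n * m) <= upper):
                return False
    return True
```

Using `bit_length` keeps the computation in exact integer arithmetic, so the floor is never affected by floating-point rounding.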

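We fix $A = \{a_1, \ldots, a_Q\}$, $Q \geq 2$, a finite alphabet. By $A^*$ we denote the set of all strings $x_1 x_2 \cdots x_n$ with elements $x_i \in A$ ($1 \leq i \leq n$); the empty string is denoted by $\lambda$. $A^*$ is a (free) monoid under concatenation (this operation is associative and the empty string is the null element). Let $A^+ = A^* \setminus \{\lambda\}$. For $x$ in $A^*$, $|x|_A$ is the length of $x$ ($|\lambda|_A = 0$). If there is no ambiguity we write $|x|$ instead of $|x|_A$. Every total ordering on $A$, say $a_1 < a_2 < \cdots < a_Q$, induces a quasi-lexicographical order on $A^*$, in which shorter strings precede longer ones and strings of equal length are ordered lexicographically:

$$\lambda < a_1 < a_2 < \cdots < a_Q < a_1 a_1 < a_1 a_2 < \cdots < a_1 a_Q < a_2 a_1 < \cdots < a_Q a_Q < a_1 a_1 a_1 < \cdots$$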

We consider the following bijection between non-negative integers and binary strings on the alphabet $A_2 = \{0, 1\}$:

0 ↦ λ
1 ↦ 0
2 ↦ 1
3 ↦ 00
4 ↦ 01
5 ↦ 10
6 ↦ 11
...

The image of $n$, denoted $\mathrm{bin}(n)$, is the binary representation of the number $n + 1$ without the leading 1. The quasi-lexicographical order on binary strings induced by the natural order $0 < 1$ can be defined in terms of this bijection: for $x, y \in \{0, 1\}^*$, $x < y$ if $\mathrm{bin}^{-1}(x) < \mathrm{bin}^{-1}(y)$. The length of $\mathrm{bin}(n)$ is almost equal to $\log_2(n)$; more precisely, $|\mathrm{bin}(n)| = \log n$.

In general we denote by $\mathrm{string}_Q(n)$ the $n$th string on an alphabet $A$ with $Q$ elements according to the quasi-lexicographical order. In particular, $\mathrm{bin}(n) = \mathrm{string}_2(n)$. In this way we get a bijective function $\mathrm{string}_Q : \mathbf{N} \to A^*$; $|\mathrm{string}_Q(n)| = \lfloor \log_Q(n(Q - 1) + 1) \rfloor$. In any context in which the alphabet $A$ is clear, we will write string instead of $\mathrm{string}_Q$.

On $A^*$ we define the prefix-order relation as follows: $x <_p y$ if there is a string $z \in A^*$ such that $y = xz$.
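The bijection $\mathrm{string}_Q$ and its special case $\mathrm{bin}$ can be computed directly. The sketch below is one possible implementation, assuming the alphabet is passed as a Python string whose character order gives the total ordering on $A$; the names `string_q`, `string_q_inverse`, `bin_string` and `length_formula` are hypothetical helpers, not notation from the book.

```python
def string_q(n: int, alphabet: str) -> str:
    """n-th string (n = 0, 1, 2, ...) over `alphabet` in quasi-lexicographical order."""
    q = len(alphabet)
    symbols = []
    while n > 0:
        n -= 1                          # shift so that symbols are indexed 0..q-1
        symbols.append(alphabet[n % q])
        n //= q
    return "".join(reversed(symbols))


def string_q_inverse(x: str, alphabet: str) -> int:
    """Position of the string x in the quasi-lexicographical enumeration."""
    q = len(alphabet)
    n = 0
    for symbol in x:
        n = n * q + alphabet.index(symbol) + 1
    return n


def bin_string(n: int) -> str:
    """bin(n): the binary representation of n + 1 without its leading 1."""
    return format(n + 1, "b")[1:]


def length_formula(n: int, q: int) -> int:
    """floor(log_q(n(q - 1) + 1)), computed in exact integer arithmetic."""
    target = n * (q - 1) + 1
    k = 0
    while q ** (k + 1) <= target:
        k += 1
    return k
```

For example, `[string_q(n, "01") for n in range(7)]` reproduces the table above, with the empty Python string standing for $\lambda$.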

If $x \in A^*$ and $i \in \mathbf{N}$, then $x^i$ is the concatenation $xx \cdots x$ ($i$ times), in case $i > 0$; $x^0 = \lambda$. For two subsets $S, T \subset A^*$ their concatenation $ST$ is defined to be the set $\{xy \mid x \in S, y \in T\}$. For $m$ in $\mathbf{N}$, $A^m = \{x \in A^* \mid |x| = m\}$. In case $m \geq 1$ we consider the alphabet $B = A^m$ and construct the free monoid $B^* = (A^m)^*$. Every $x \in B^*$ belongs to $A^*$, but the converse is false. For $x \in B^*$ we denote by $|x|_m$ the length of $x$ (according to $B$), which is exactly $|x|/m$.

For $Q \in \mathbf{N}$, $Q \geq 2$, let $A_Q$ be the alphabet $\{0, 1, \ldots, Q - 1\}$. The elements of $A_Q$ are to be considered as the digits used in natural positional representations of numbers in base $Q$. Thus, an element $a \in A_Q$ denotes both the symbol used in number representations and the numerical value in the range from 0 to $Q - 1$ which it represents. By $(n)_Q$ we denote the base-$Q$ representation of the number $n$.
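The positional representation $(n)_Q$ differs from $\mathrm{string}_Q(n)$: it uses the digit 0 with its numerical value and is not a bijection onto all strings, since leading zeros are dropped. A minimal sketch follows, with digits returned as a list of integers; `to_base` and `from_base` are illustrative names, and representing 0 by the single digit 0 is an assumption of this sketch.

```python
from typing import List


def to_base(n: int, q: int) -> List[int]:
    """Digits of the base-q representation of n, most significant first."""
    if n < 0 or q < 2:
        raise ValueError("need n >= 0 and q >= 2")
    if n == 0:
        return [0]                      # convention assumed here for n = 0
    digits = []
    while n > 0:
        digits.append(n % q)            # least significant digit first
        n //= q
    return digits[::-1]


def from_base(digits: List[int], q: int) -> int:
    """Numerical value of a digit list read in base q."""
    value = 0
    for d in digits:
        if not 0 <= d < q:
            raise ValueError("digit out of range")
        value = value * q + d
    return value
```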

By $A^\omega$ we denote the set of all (infinite) sequences $\mathbf{x} = x_1 x_2 \cdots x_n \cdots$ with elements $x_i$ in $A$. The set $A^\omega$ is no longer a monoid, but it comes equipped with an interesting probabilistic structure, which will be discussed in Section 1.4. For $\mathbf{x} \in A^\omega$ and $n \in \mathbf{N}_+$, we put $\mathbf{x}(n) = x_1 \cdots x_n \in A^*$. For $S \subset A^*$,

$$SA^\omega = \{\mathbf{x} \in A^\omega \mid \mathbf{x}(n) \in S, \text{ for some } n \geq 1\}; \quad xA^\omega = \{x\}A^\omega, \; x \in A^*.$$

For orders of magnitude we will use Bachmann's notation. Let $f, g : A^* \to \mathbf{R}_+$ be two functions. We say that $f \leq g + O(1)$ if there exists a constant $c > 0$ such that $f(x) \leq g(x) + c$, for all strings $x \in A^*$; sometimes we may use the notation $f \lesssim g$. If $f \leq g + O(1)$ and $g \leq f + O(1)$, then we write $f \approx g$. In general,

$$O(f) = \{g : A^* \to \mathbf{R}_+ \mid \text{there exist } c \in \mathbf{R}_+, m \in \mathbf{N} \text{ such that } g(x) \leq c f(x), \text{ for all strings } x, |x| \geq m\}.$$

A partial function $\varphi : X \overset{o}{\to} Y$ is a function defined on a subset $Z$ of $X$, called the domain of $\varphi$ (we write $\mathrm{dom}(\varphi)$). In case $\mathrm{dom}(\varphi) = X$ we say that $\varphi$ is total and we indicate this by writing $\varphi : X \to Y$. For $x \in \mathrm{dom}(\varphi)$ we write $\varphi(x) < \infty$; in the opposite case, i.e. when $x \notin \mathrm{dom}(\varphi)$, we put $\varphi(x) = \infty$. The range of $\varphi$ is $\mathrm{range}(\varphi) = \{\varphi(x) \mid x \in \mathrm{dom}(\varphi)\}$; the graph of $\varphi$ is $\mathrm{graph}(\varphi) = \{(x, \varphi(x)) \mid x \in \mathrm{dom}(\varphi)\}$. Two partial functions $\varphi, f : X \overset{o}{\to} Y$ are equal if $\mathrm{dom}(\varphi) = \mathrm{dom}(f)$ and $\varphi(x) = f(x)$, for all $x \in \mathrm{dom}(\varphi)$.

Each chapter is divided into sections. The definitions, theorems, propositions, lemmata, corollaries and facts are sequentially numbered within each chapter. We adopt the abbreviation iff for "if and only if". Each proof ends with the Halmos end mark $\Box$.
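The partial-function vocabulary introduced above (domain, range, graph, totality, equality) can be illustrated on finite examples. The sketch below models a partial function with finite domain as a Python dictionary whose keys form the domain; this encoding and the helper names are assumptions made for illustration only.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

PartialFn = Dict[Hashable, Hashable]    # keys = dom(phi), values = phi(x)


def dom(phi: PartialFn) -> Set[Hashable]:
    return set(phi.keys())


def rng(phi: PartialFn) -> Set[Hashable]:
    """range(phi) = { phi(x) | x in dom(phi) }."""
    return set(phi.values())


def graph(phi: PartialFn) -> Set[Tuple[Hashable, Hashable]]:
    return set(phi.items())


def is_total(phi: PartialFn, X: Iterable[Hashable]) -> bool:
    """phi is total on X when dom(phi) = X."""
    return dom(phi) == set(X)


def equal(phi: PartialFn, f: PartialFn) -> bool:
    """Equality of partial functions: same domain, same values on it."""
    return dom(phi) == dom(f) and all(phi[x] == f[x] for x in dom(phi))


# Halving, defined only on the even elements of X = {0, ..., 5}.
X = range(6)
half = {x: x // 2 for x in X if x % 2 == 0}
```

Here `half` is partial on `X` because the odd arguments lie outside its domain, which corresponds to $\varphi(x) = \infty$ in the notation of the text.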

1.2 Computability Theory

Algorithmic information theory is essentially based on recursion theory. Informally, an algorithm for computing a partial function $\varphi : \mathbf{N} \overset{o}{\to} \mathbf{N}$ is a finite set of instructions which, given an input $x \in \mathrm{dom}(\varphi)$, yields after a finite number of steps the output $y = \varphi(x)$. The algorithm must specify unambiguously how to obtain each step in the computation from the previous steps and from the input. In case $\varphi$ is computed by an algorithm we call it a partial computable function; if $\varphi$ is also total, then it is called a computable function. These informal notions have as formal models the partial computable functions, abbreviated p.c. functions, and, respectively, the computable functions (the old terminology referred to partial recursive and recursive functions). A partial string function $\varphi : A^* \overset{o}{\to} A^*$ is called partial computable if there exists a p.c. function $f : \mathbf{N} \overset{o}{\to} \mathbf{N}$ such that

$$\varphi(x) = \mathrm{string}(f(\mathrm{string}^{-1}(x))),$$

for all $x \in A^*$, and similarly for computable functions.

There are many equivalent ways to formally define p.c. functions, i.e. by means of Turing machines, Gödel-Kleene equations, Kleene operations, Markov algorithms, abstract programming languages, etc. The particular formalism does not matter for what follows. The main result to be used is the possibility of enumerating all p.c. functions

$$\varphi_e^{(n)} : (A^*)^n \overset{o}{\to} A^*$$

in such a way that the following two conditions are fulfilled:

Universality Theorem. There is a p.c. function of two variables $\varphi_z^{(2)}(e, x)$ such that

$$\varphi_z^{(2)}(e, x) = \varphi_e^{(1)}(x).$$

Uniform Composition Theorem. There is a computable function of two variables comp such that

$$\varphi_{\mathrm{comp}(x,y)}^{(1)}(z) = \varphi_x^{(1)}(\varphi_y^{(1)}(z)).$$

The p.c. functions of a variable $\varphi_x = \varphi_x^{(1)}$ are essential for the whole theory as there exist pairing functions, i.e. computable bijective functions $\langle\ ,\ \rangle : A^* \times A^* \to A^*$, which may be iterated and by which one can reduce the number of arguments. As a basic result one gets

Theorem 1.1 (Kleene's Recursion Theorem). For every $m \in \mathbf{N}_+$ and every computable function $f$ there effectively exists an $x$ (called fixed point of $f$) such that $\varphi_x^{(m)} = \varphi_{f(x)}^{(m)}$.

Given a polynomial $P(x, y_1, y_2, \ldots, y_m)$ with integer coefficients, consider the set

$$D = \{x \in \mathbf{N} \mid P(x, y_1, y_2, \ldots, y_m) = 0, \text{ for some } y_1, y_2, \ldots, y_m \in \mathbf{Z}\}.$$

We call a set Diophantine if it is of the above form. The main relation is given by the following result:

Theorem 1.2 (Matiyasevich). A set is c.e. iff it is Diophantine.

If the polynomial $P$ is built up not only by means of the operations of addition and multiplication, but also by exponentiation, then it is called an exponential polynomial. Using the exponential polynomial instead of polynomial we may define in a straightforward way the notion of exponential Diophantine set. Of course, by Matiyasevich's Theorem, a set is c.e. iff it is exponential Diophantine. However, a stronger result may be proven. We call a set $D$ singlefold exponential Diophantine if it is exponential Diophantine via the exponential Diophantine polynomial $P(x, y_1, y_2, \ldots, y_m)$ and for $x \in D$ there is a unique $m$-tuple of non-negative integers $y_1, y_2, \ldots, y_m$ such that

$$P(x, y_1, y_2, \ldots, y_m) = 0.$$


Theorem 1.3 (Jones-Matiyasevich). A set is c.e. iff it is singlefold exponential Diophantine.

For more details see Matiyasevich's monograph [309], and Jones and Matiyasevich's paper [242]. It is not known whether singlefold representations are always possible without exponentiation.

A function $f : A^* \to \mathbb{R}_+$ is called semi-computable from below (or lower semi-computable) if its graph approximation set
\[ \{(x, r) \in A^* \times \mathbb{Q} \mid r < f(x)\} \]
is c.e. A function $f$ is semi-computable from above (or upper semi-computable) if $-f$ is semi-computable from below. If $f$ is semi-computable from both below and above, then $f$ is called computable. It is not too difficult to see the following two facts.

A function $f$ is semi-computable from below iff there exists a nondecreasing (in $n$) computable function $g : A^* \times \mathbb{N} \to \mathbb{Q}$ such that $f(x) = \lim_{n \to \infty} g(x, n)$, for every $x \in A^*$.

A function $f$ is computable iff there exists a computable function $g : A^* \times \mathbb{N} \to \mathbb{Q}$ such that for all $x \in A^*$, $n \ge 1$, $|f(x) - g(x, n)| < 1/n$.
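As a hedged illustration of the last characterization, the sketch below gives rational approximations $g(x, n)$ for the function $f(x) = \sqrt{|x|}$. Both the choice of $f$ and the name `g_approx` are assumptions made for this example only; they do not come from the text.

```python
from fractions import Fraction
from math import isqrt

def g_approx(x: str, n: int) -> Fraction:
    """Rational g(x, n) with 0 <= sqrt(len(x)) - g(x, n) < 1/n, for n >= 1."""
    length = len(x)
    d = 2 * n                                   # grid of multiples of 1/(2n)
    # isqrt(length * d * d) is floor(sqrt(length) * d), computed exactly.
    return Fraction(isqrt(length * d * d), d)

if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, g_approx("abcde", n))
```

Since only integer arithmetic is used, the approximation error can be checked exactly: $g^2 \le |x| < (g + 1/n)^2$.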

For more details of recursion theory we recommend the following books: Azra, Jaulin [8], Borger [44], Bridges [46], Calude [51], Cohen [143], Mal'cev [292], Odifreddi [321, 322], Rogers [347], Salomaa [355], Soare [372], Wood [440].

1.3 Topology

We are going to use some rudiments of topology, mainly to measure the size of different sets. The idea that comes naturally to mind is to use a Baire-like classification. Given a set $X$, a topology on $X$ is a collection $\tau$ of subsets of $X$ such that

1. $\emptyset \in \tau$ and $X \in \tau$.

2. For every $U \in \tau$ and $V \in \tau$, we have $U \cap V \in \tau$.

3. For every $W \subseteq \tau$, we have $\bigcup W \in \tau$.
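For a finite set the three axioms can be checked mechanically. The following is a minimal sketch; the function name `is_topology` and the sample families are hypothetical. It relies on the observation that for a finite family, closure under pairwise unions already gives closure under arbitrary unions (the empty union being $\emptyset$).

```python
from itertools import combinations

def is_topology(X, family):
    """Check the three topology axioms for a finite family of subsets of X."""
    X = frozenset(X)
    T = {frozenset(U) for U in family}
    if frozenset() not in T or X not in T:       # axiom 1
        return False
    if any(not U <= X for U in T):               # members must be subsets of X
        return False
    for U, V in combinations(T, 2):
        if (U & V) not in T:                     # axiom 2
            return False
        if (U | V) not in T:                     # axiom 3 (finite case)
            return False
    return True

if __name__ == "__main__":
    X = {1, 2, 3}
    print(is_topology(X, [set(), {1}, {1, 2}, X]))   # a chain of open sets
    print(is_topology(X, [set(), {1}, {2}, X]))      # {1} | {2} is missing
```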

When a topology has been chosen, its members are called open sets. Their complements are called closed sets. The pair $(X, \tau)$ is called a topological space. An alternative, equivalent way to define a topology is by means of a closure operator $\mathrm{Cl}$ (mapping subsets of $X$ into subsets of $X$) satisfying the following (Kuratowski) properties:

1. $\mathrm{Cl}(\emptyset) = \emptyset$.

2. $Z \subseteq \mathrm{Cl}(Z)$, for all subsets $Z \subseteq X$.

3. $\mathrm{Cl}(\mathrm{Cl}(Z)) = \mathrm{Cl}(Z)$, for all subsets $Z \subseteq X$.

4. $\mathrm{Cl}(Y \cup Z) = \mathrm{Cl}(Y) \cup \mathrm{Cl}(Z)$, for all subsets $Y, Z \subseteq X$.

For instance, in the topological space $(X, \tau)$ the closure operator $\mathrm{Cl}_\tau$ is defined by
\[ \mathrm{Cl}_\tau(Z) = \bigcap \{F \subseteq X \mid Z \subseteq F,\ F \text{ is closed}\}. \]
Let $\tau$ be a topology on a set $X$ and let $\mathrm{Cl}_\tau$ be its closure operator. A set $T \subseteq X$ is said to be rare with respect to $\tau$ if for every $x \in X$ and every open neighbourhood $N_x$ of $x$ one has $N_x \not\subseteq \mathrm{Cl}_\tau(T)$. A set which is a countable union of rare sets is called meagre, or a set of the first Baire category. A set which is not meagre is called a second Baire category set. A dense set is a set whose closure is equal to the whole space. Passing to complements we get co-rare, co-meagre, co-dense sets.

Intuitively, the properties of being rare, meagre, dense, second Baire category, co-meagre, co-rare describe an increasing scale for the "sizes" of subsets of $X$, according to the topology $\tau$. Thus, for instance, a dense set is "larger" than a rare one, and a co-rare set is "larger" than a dense set.

We shall work with spaces of strings and sequences endowed with topologies induced by various order relations. If $<$ is an order relation on $A^*$, then the induced topology is defined by means of the closure operator $\mathrm{Cl}_{\tau(<)}$ acting as follows:
\[ \mathrm{Cl}_{\tau(<)}(Z) = \{u \in A^* \mid v > u, \text{ for some } v \in Z\}, \]

or, equivalently, by means of the basic open neighbourhoods N;;={VEA*lu

The space of sequences $A^\omega$ is endowed with the topology generated by the sets $xA^\omega$, $x \in A^*$. Various conditions of constructivity will be discussed when using these topologies.

Let $(X, \tau)$ be a topological space. A subset $S$ of $X$ is compact if whenever $W \subseteq \tau$ and $S \subseteq \bigcup W$, there is a finite $V \subseteq W$ such that $S \subseteq \bigcup V$. If $X$ is itself compact, then we say that the topological space $(X, \tau)$ is compact.

Let $(X_i, \tau_i)$ be topological spaces for all $i \in I$. Let $X$ be the Cartesian product $X = \prod_{i \in I} X_i$. Let $p_i$ be the projection from $X$ onto the $i$th coordinate space $X_i$.

There is a unique topology $\tau$ on $X$ - called the product topology - which is the smallest topology on $X$ making all coordinate projections continuous, i.e. for all $W \in \tau_i$, one has $p_i^{-1}(W) \in \tau$.

Theorem 1.4 (Tychonoff). Let $(X_i, \tau_i)$ be compact topological spaces for all $i \in I$. Then the Cartesian product $X = \prod_{i \in I} X_i$ endowed with the product topology is compact.

In the case of the space of sequences $A^\omega$ one can see that the topology induced by the family $(xA^\omega)_{x \in A^*}$ coincides with the product topology of an infinity of copies of $A$, each of which comes with the discrete topology (i.e. every subset of $A$ is open). So, by Tychonoff's Theorem, the space of all sequences is compact.

For more on topology see Kelley [251].

1.4 Probability Theory

In this section we describe the probabilities on the space of sequences $A^\omega$. Probabilities are easiest to define for finite sets; see, for instance, Chung [140], Feller [192]. The classical example concerns a toss of a fair coin (that is, heads and tails are equally likely). We may model this situation by means of an alphabet $A = \{0, 1\}$, 0 = "heads", 1 = "tails". We agree to set to 1 the probability of all possible outcomes. Also, if two possible outcomes cannot both happen, then we assume that their probabilities add. Introducing the notation "$p(\ldots)$" for "the probability of ...", we may write the relations
\[ p(0) = p(1), \quad p(0) + p(1) = 1, \]
so $p(0) = p(1) = 1/2$.

Let us toss our fair coin. If we toss it twice we get four possible outcomes 00, 01, 10, 11, each of which has the probability 1/4. In general, if the coin is tossed $n$ times, we get $2^n$ possible strings of length $n$ over the alphabet $A = \{0, 1\}$, and each string has probability $2^{-n}$. If $A = \{1, 2, 3, 4, 5, 6\}$ and $p(1) = p(2) = p(3) = p(4) = p(5) = p(6) = 1/6$, then we model the tossing of a fair die.

More formally, let $p$ be a function from an alphabet $A$ to $[0, 1]$. The pair $(A, p)$ is called a probabilistic model if $\sum_{a \in A} p(a) = 1$. An event $X$ is a subset of $A$ and its probability is defined by $P(X) = \sum_{x \in X} p(x)$. The function $P$ is called a probability distribution. If $p(a) = 1/\#A$, for all $a \in A$, then $P$ is the uniform probability distribution on $A$: $P(X) = \#X/\#A$, for all subsets $X$ of $A$.
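The definitions above translate directly into a few lines of code. This is only a sketch under the stated definitions; the names `is_probabilistic_model`, `probability` and `die` are chosen for illustration and are not from the text.

```python
from fractions import Fraction

def is_probabilistic_model(p):
    """(A, p) is a probabilistic model if p maps A into [0, 1] and sums to 1."""
    return all(0 <= v <= 1 for v in p.values()) and sum(p.values()) == 1

def probability(p, event):
    """P(X) = sum of p(x) over x in X, for an event X (a subset of A)."""
    return sum(p[a] for a in event)

# The fair die: A = {1, ..., 6}, p(a) = 1/6 (uniform distribution).
die = {a: Fraction(1, 6) for a in range(1, 7)}

if __name__ == "__main__":
    print(is_probabilistic_model(die))
    print(probability(die, {2, 4, 6}))   # event "the outcome is even"
```

Exact rationals are used so that the condition $\sum_{a \in A} p(a) = 1$ can be tested without rounding issues.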

The conditional probability $P(X \mid Y)$ of the event $X$ assuming the event $Y$ is defined by the formula
\[ P(X \mid Y) = \frac{P(X \cap Y)}{P(Y)}. \]
If $P(Y) = 0$, then $P(X \mid Y)$ is undefined.

The events $X, Y$ are independent if $P(X \cap Y) = P(X)P(Y)$. If $P(Y) \ne 0$, then $X, Y$ are independent iff $P(X \mid Y) = P(X)$. For $X \subseteq A$ we put $\overline{X} = A \setminus X$. The events $X_1, X_2, \ldots, X_n$ are independent if for every $Y_1 \in \{X_1, \overline{X}_1\}, Y_2 \in \{X_2, \overline{X}_2\}, \ldots, Y_n \in \{X_n, \overline{X}_n\}$ we have
\[ P\Big(\bigcap_{i=1}^n Y_i\Big) = \prod_{i=1}^n P(Y_i). \]
It is easy to show that the events $X_1, X_2, \ldots, X_{n-1}$ are independent provided the events $X_1, X_2, \ldots, X_n$ are independent.
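A small sketch of conditional probability and independence for two tosses of a fair coin, under the uniform distribution on $\{0,1\}^2$. The helper names (`P`, `conditional`, `independent`) and the particular events are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space: all outcomes of two tosses, with the uniform distribution.
A = {"".join(t) for t in product("01", repeat=2)}

def P(X):
    """Uniform probability P(X) = #X / #A."""
    return Fraction(len(X & A), len(A))

def conditional(X, Y):
    """P(X | Y) = P(X n Y) / P(Y); undefined when P(Y) = 0."""
    if P(Y) == 0:
        raise ValueError("P(X | Y) is undefined when P(Y) = 0")
    return P(X & Y) / P(Y)

def independent(X, Y):
    return P(X & Y) == P(X) * P(Y)

first_is_0 = {a for a in A if a[0] == "0"}
second_is_0 = {a for a in A if a[1] == "0"}
at_least_one_1 = {a for a in A if "1" in a}

if __name__ == "__main__":
    print(independent(first_is_0, second_is_0))
    print(independent(first_is_0, at_least_one_1))
    print(conditional(first_is_0, at_least_one_1))
```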

Let us return to the uniform probability distribution on $A^n$, where $A = \{0, 1\}$. What about letting $n$ tend to infinity? We will get all possible sequences of 0's and 1's, i.e. the space $A^\omega = \{0, 1\}^\omega$. Note that each sequence has probability zero, but this does not determine the probabilities of other interesting sets of possible outcomes, as in the finite case. To see this we "convert" our sequences into "reals" in the interval $[0, 1]$ by preceding each sequence by a "binary point" and regarding it as a binary expansion. For instance, to the sequence 0101010101010101... we associate the number 0.0101010101010101..., i.e.
\[ \frac{1}{4} + \frac{1}{16} + \cdots = \frac{1}{3}. \]
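The value $1/3$ can be checked by summing finite prefixes of the expansion exactly. The helper `value_of_prefix` below is a hypothetical name; the sketch only evaluates a finite binary prefix $0.b_1b_2\ldots b_n$ as a rational number.

```python
from fractions import Fraction

def value_of_prefix(bits: str) -> Fraction:
    """Value of the finite binary expansion 0.b1 b2 ... bn as an exact rational."""
    return sum((Fraction(int(b), 2 ** (i + 1)) for i, b in enumerate(bits)),
               Fraction(0))

if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        prefix = "01" * k
        # The gap to 1/3 shrinks like 4**(-k) / 3.
        print(k, value_of_prefix(prefix), Fraction(1, 3) - value_of_prefix(prefix))
```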

Every number in $[0, 1]$ has such an expansion; the dyadic rationals $k2^{-n}$ (and only they) have in fact two such expansions. On $[0, 1]$ we have the usual Lebesgue measure which assigns to each subinterval its length. Via this identification a possible string of outcomes of the first $n$ tosses corresponds to the set of all infinite sequences beginning with that string, hence to a subinterval of $[0, 1]$ of length $2^{-n}$. Furthermore, every set of $k$ different strings of outcomes for exactly $n$ tosses corresponds to a set in $[0, 1]$ with Lebesgue measure $k2^{-n}$.

Bearing this correspondence in mind - as a guide - we move to the "direct" construction of the uniform probability on the space of all sequences over the alphabet $A$ (which is not necessarily binary). For the rest of this section we shall follow Calude and Chitescu [68].

First, let us review some notions. A (Boolean) ring of sets is a non-empty class $R$ of sets which is closed under union and difference. A (Boolean) algebra of sets is a non-empty class $R$ of sets which is closed under union and complementation. A $\sigma$-ring is a ring which is closed under countable union, and a $\sigma$-algebra is an algebra which is closed under countable union. In every set $X$, the collection of all finite sets is a ring, but not an algebra unless $X$ is finite. The collection of all finite and co-finite sets is an algebra, but not a $\sigma$-algebra, unless $X$ is finite. The collection of all subsets of a given set is a $\sigma$-algebra. So, for any family $C$ of subsets of a given set we can construct the smallest $\sigma$-algebra containing $C$; it is called the $\sigma$-algebra generated by $C$. In a topological space, the $\sigma$-algebra generated by the open sets is called the Borel $\sigma$-algebra, and its sets are called Borel sets.

Let $R$ be a ring. A measure is a real-valued, non-negative and countably additive function $\mu$ defined on $R$ such that $\mu(\emptyset) = 0$. (The function $\mu$ is countably additive if for every disjoint sequence $\{E_n\}_{n \ge 0}$ of sets in $R$, whose union is also in $R$, we have $\mu(\bigcup_{n \ge 0} E_n) = \sum_{n \ge 0} \mu(E_n)$.) A measure for which the whole space has measure one is called a probability.

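Every ring $R$ generates a unique $\sigma$-ring $S(R)$. If $\mu$ is a finite measure on a ring $R$, then there is a unique measure $\bar{\mu}$ on the $\sigma$-ring $S(R)$ such that for every $E \in R$, $\bar{\mu}(E) = \mu(E)$; the measure $\bar{\mu}$ is finite. See, for instance, Dudley [187].

Let us now consider the total space $A^\omega$. One can see that the class of sets
\[ \mathcal{P} = \{xA^\omega \mid x \in A^*\} \cup \{\emptyset\} \]
has the following properties:

1. $xA^\omega \subseteq yA^\omega$ iff $y$ is a prefix of $x$.

2. $xA^\omega \cap yA^\omega \ne \emptyset$ iff $x$ is a prefix of $y$ or $y$ is a prefix of $x$.

3. $X \cap Y \in \{X, Y, \emptyset\}$, for all $X, Y \in \mathcal{P}$.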

Next let us consider the topology on $A^\omega$ generated by $\mathcal{P}$, which coincides with the product topology on $A^\omega$, previously discussed. Also, note that every element in $\mathcal{P}$ is both open and compact and the $\sigma$-algebra generated by $\mathcal{P}$ is exactly the Borel $\sigma$-algebra. Indeed, because $\mathcal{P}$ consists of open sets, we get one inclusion; on the other hand, every open set is a union (at most countable) of sets in the canonical basis generating the product topology, and every set in this basis is a finite union of sets in $\mathcal{P}$.

Let us illustrate this mechanism with the set appearing in the Law of Large Numbers:
\[ X = \Big\{x \in A^\omega \;\Big|\; \lim_{n \to \infty} \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{2}\Big\}. \]
A sequence $x$ is in $X$ if for every positive integer $k$ there is a positive integer $m$ such that
\[ \Big|\frac{x_1 + x_2 + \cdots + x_n}{n} - \frac{1}{2}\Big| < \frac{1}{k}, \]
for all $n \ge m$. In set-theoretical terms,
\[ X = \bigcap_{k \ge 1} \bigcup_{m \ge 1} \bigcap_{n \ge m} \Big\{x \in A^\omega \;\Big|\; \Big|\frac{x_1 + x_2 + \cdots + x_n}{n} - \frac{1}{2}\Big| < \frac{1}{k}\Big\}. \]


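For all positive integers $k, n$ the set
\[ \Big\{x \in A^\omega \;\Big|\; \Big|\frac{x_1 + x_2 + \cdots + x_n}{n} - \frac{1}{2}\Big| < \frac{1}{k}\Big\} \]
is a finite union of open sets $uA^\omega \in \mathcal{P}$, taken over all strings $u = u_1u_2 \ldots u_n$ of length $n$ such that
\[ \Big|\frac{u_1 + u_2 + \cdots + u_n}{n} - \frac{1}{2}\Big| < \frac{1}{k}. \]
Hence $X$ is a Borel set as an intersection of unions of intersections of finite unions of elements in $\mathcal{P}$.

Theorem 1.5 (Compactness). If $X$ and $(X_i)_{i \in \mathbb{N}}$ are in $\mathcal{P}$, and $X = \bigcup_{i \in \mathbb{N}} X_i$, the $X_i$ being mutually disjoint, then only a finite number of the $X_i$ are non-empty.

Proof. Let $X = \bigcup_{i \in \mathbb{N}} X_i$ be as above and suppose $X_i \ne \emptyset$, for infinitely many $i \in \mathbb{N}$. Because $X$ is compact and all $X_i$ are open, we can find a natural $n$ such that
\[ X = \bigcup_{i=1}^n X_i. \]
Let $m > n$ be such that $X_m \ne \emptyset$. Every sequence $x \in X_m$ belongs to $X$; consequently it belongs to some $X_i$ with $i \le n < m$, contradicting the fact that $X_i$ and $X_m$ are disjoint. □

Before proceeding further we note that for every string $x \in A^*$ and natural $k \ge |x|$, there exists a single partition of $xA^\omega$ formed with elements $zA^\omega$, with $|z| = k$, namely
\[ xA^\omega = \bigcup_{\{y \in A^* \,\mid\, |y| = k - |x|\}} xyA^\omega. \]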

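We introduce the class $\mathcal{C}$ of all finite mutually disjoint unions of sets in $\mathcal{P}$.

Theorem 1.6. The class $\mathcal{C}$ is an algebra.

Proof. We divide the proof into two steps.

Step 1. For every $X \in \mathcal{P}$, $Y \in \mathcal{C}$, $Y \subseteq X$ we have $X \setminus Y \in \mathcal{C}$.

As the case $X = \emptyset$ is obvious, we take
\[ X = xA^\omega, \quad Y = \bigcup_{i=1}^n y_iA^\omega, \]
where $x, y_i \in A^*$ and $y_iA^\omega \cap y_jA^\omega = \emptyset$, for $i \ne j$. Of course, $x$ is a prefix of every $y_i$, say $y_i = xz_i$. Taking $k = \max\{|y_1|, \ldots, |y_n|\}$ we may write
\[ xA^\omega = \bigcup_{|u| = k - |x|} xuA^\omega, \quad y_iA^\omega = \bigcup_{|v| = k - |y_i|} xz_ivA^\omega. \]
Then
\[ X \setminus Y = \bigcup_{w \in F} xwA^\omega, \]
where
\[ F = \{w \in A^{k - |x|} \mid w \ne z_iv, \text{ for all } 1 \le i \le n \text{ and all } v \in A^{k - |y_i|}\}. \]
The last union is disjoint, so $X \setminus Y \in \mathcal{C}$.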

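Step 2. The class $\mathcal{C}$ is closed under union and difference. Let
\[ X = \bigcup_{i=1}^m X_i, \quad Y = \bigcup_{j=1}^n Y_j, \]
be in $\mathcal{C}$.

a) We have $X \cap Y \in \mathcal{C}$, because $X_i \cap Y_j \in \mathcal{P}$.

b) We have $X \setminus Y \in \mathcal{C}$. Indeed,
\[ X \setminus Y = \bigcup_{i=1}^m \bigcap_{j=1}^n (X_i \setminus Y_j) = \bigcup_{i=1}^m \bigcap_{j=1}^n (X_i \setminus (X_i \cap Y_j)). \]
Because $X_i \cap Y_j \in \mathcal{C}$ by a), Step 1 gives the relation
\[ X_i \setminus (X_i \cap Y_j) \in \mathcal{C}. \]
Applying a) again we get $X \setminus Y \in \mathcal{C}$.

c) We have $X \cup Y \in \mathcal{C}$. Indeed,
\[ X \cup Y = \bigcup_{i=1}^m \bigcup_{j=1}^n (X_i \cup (Y_j \setminus X_i)). \]
□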

Now we describe the probabilities defined on the algebra $\mathcal{C}$ - which can be extended to the generated $\sigma$-algebra by standard methods.

Theorem 1.7. There exists a bijective correspondence between the probabilities defined on the $\sigma$-algebra generated by $\mathcal{C}$ and the functions $h : A^* \to [0, 1]$ having the following two properties:

1) $h(\lambda) = 1$,

2) $h(x) = \sum_{i=1}^Q h(xa_i)$, for all $x \in A^*$.

Proof. Let $\mathcal{H}$ be the set of all functions $h$ having properties 1) and 2) and let Prob be the set of all probabilities on the $\sigma$-algebra generated by $\mathcal{C}$. One can easily check (by induction on $l$) that 2) is equivalent to

3) For all $x \in A^*$ and $l \ge 1$, $h(x) = \sum_{|v| = l} h(xv)$.

Step 1. We define the function $S : \mathcal{H} \to \mathrm{Prob}$ as follows: $S(h) = \mu_h$, where $\mu_h : \mathcal{C} \to [0, 1]$,
\[ \mu_h(\emptyset) = 0, \quad \mu_h\Big(\bigcup_{i=1}^k x_iA^\omega\Big) = \sum_{i=1}^k h(x_i). \]
The above definition is correct since in case $X \in \mathcal{C}$ has two disjoint representations
\[ X = \bigcup_{i=1}^m x_iA^\omega = \bigcup_{j=1}^n y_jA^\omega, \]
we have
\[ \sum_{i=1}^m h(x_i) = \sum_{j=1}^n h(y_j). \tag{1.1} \]

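Let $k$ be the maximum of the lengths of the strings $x_i, y_j$; we may write the formulae
\[ x_iA^\omega = \bigcup_{|v| = k - |x_i|} x_ivA^\omega, \quad y_jA^\omega = \bigcup_{|w| = k - |y_j|} y_jwA^\omega. \]
We shall prove the inequality
\[ \sum_{i=1}^m h(x_i) \le \sum_{j=1}^n h(y_j). \tag{1.2} \]
To this end we fix $i$ and a string $v \in A^*$ such that $|v| = k - |x_i|$. Because
\[ x_ivA^\omega \subseteq \bigcup_{j=1}^n y_jA^\omega, \]
there exists a unique $j_{i,v}$ such that $y_{j_{i,v}}$ is a prefix of $x_iv$. The equality
\[ y_{j_{i,v}}A^\omega = \bigcup_{|w| = k - |y_{j_{i,v}}|} y_{j_{i,v}}wA^\omega \]
yields the existence of a unique string $w_{i,v}$ such that $x_iv = y_{j_{i,v}}w_{i,v}$. In this way we get the injective correspondence $(i, v) \mapsto (j_{i,v}, w_{i,v})$. So, by 3),
\[ \sum_{i=1}^m h(x_i) = \sum_{i,v} h(x_iv) = \sum_{i,v} h(y_{j_{i,v}}w_{i,v}) \le \sum_{j,w} h(y_jw) = \sum_{j=1}^n h(y_j). \]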

Due to the symmetry, the opposite inequality to (1.2) still holds true, thus establishing the equality (1.1).

Now we prove that $\mu_h$ is finitely additive. If $X = \bigcup_{i=1}^m x_iA^\omega \in \mathcal{C}$ and $X = \bigcup_{j=1}^n Y_j$, with $Y_j = \bigcup_{k=1}^{n_j} y_{j,k}A^\omega \in \mathcal{C}$ mutually disjoint, then we can write
\[ X = \bigcup_{j=1}^n \bigcup_{k=1}^{n_j} y_{j,k}A^\omega. \]
According to the definition of $\mu_h$ one has
\[ \mu_h(X) = \sum_{i=1}^m h(x_i), \quad \sum_{j=1}^n \mu_h(Y_j) = \sum_{j=1}^n \sum_{k=1}^{n_j} h(y_{j,k}). \]
The last sum is in fact equal to $\mu_h(X)$, because of the equalities
\[ X = \bigcup_{j=1}^n \bigcup_{k=1}^{n_j} y_{j,k}A^\omega = \bigcup_{i=1}^m x_iA^\omega, \]
expressing $X$ in two different ways (we have made use of the correctness of the definition of $\mu_h$).

The last step of this part consists in proving the countable additivity of $\mu_h$. Let $X = \bigcup_{i=1}^m Y_i \in \mathcal{C}$ (where all $Y_i \in \mathcal{P}$ and are mutually disjoint). We consider also a sequence $(X_n)_{n \in \mathbb{N}}$ of mutually disjoint sets in $\mathcal{C}$ such that $X = \bigcup_{n \ge 0} X_n$. We must show that $\mu_h(X) = \sum_{n \ge 0} \mu_h(X_n)$. The last equality will be proven by showing that only finitely many $X_n$ are non-empty and using the - already proven - additivity. We write
\[ X_n = \bigcup_{j=1}^{k_n} X_{n,j}, \]
with the sets $X_{n,j} \in \mathcal{P}$ mutually disjoint. Put $Z(i, j, n) = Y_i \cap X_{n,j}$, for all $i, j, n$. For every $i = 1, 2, \ldots, m$ one has $Y_i = \bigcup_{n,j} Z(i, j, n)$. Applying Theorem 1.5 successively for $i = 1, 2, \ldots, m$ we find the naturals $n_1, n_2, \ldots, n_m$ such that $Z(i, j, n) = \emptyset$, for $n \ge n_i$. Let $N = \max\{n_1, n_2, \ldots, n_m\}$. We claim that $X_n = \emptyset$, for all $n > N$. This assertion is equivalent to the fact that for such $n$ one has $X_n \cap X = \emptyset$, or $X_n \cap Y_i = \emptyset$, for all $i = 1, 2, \ldots, m$. But,
\[ X_n \cap Y_i = \bigcup_{j=1}^{k_n} (X_{n,j} \cap Y_i) = \emptyset. \]

Step 2. The function $T : \mathrm{Prob} \to \mathcal{H}$ is defined by $T(\mu) = h_\mu$, where $h_\mu(x) = \mu(xA^\omega)$; the function $h_\mu$ satisfies conditions 1) and 2) in the statement of the theorem. First of all, $h_\mu(\lambda) = \mu(\lambda A^\omega) = \mu(A^\omega) = 1$. Next, let $x \in A^*$ and compute
\[ \sum_{i=1}^Q h_\mu(xa_i) = \sum_{i=1}^Q \mu(xa_iA^\omega) = \mu(xA^\omega) = h_\mu(x), \]
due to the equality
\[ xA^\omega = \bigcup_{i=1}^Q xa_iA^\omega, \]
the union being disjoint.

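Step 3. The mappings $S, T$ are mutually inverse. We first take $h \in \mathcal{H}$ and show that $T(S(h)) = h$. For every $x \in A^*$ we have
\[ T(S(h))(x) = T(\mu_h)(x) = h_{\mu_h}(x) = \mu_h(xA^\omega) = h(x). \]
We now take $\mu \in \mathrm{Prob}$ and show that $S(T(\mu)) = \mu$. Again, for $x \in A^*$,
\[ S(T(\mu))(xA^\omega) = S(h_\mu)(xA^\omega) = \mu_{h_\mu}(xA^\omega) = h_\mu(x) = \mu(xA^\omega). \]
So, $\mu$ and $S(T(\mu))$ coincide on $\mathcal{P}$. Actually, they are equal because every $X \in \mathcal{C}$ is of the form $X = \bigcup_{i=1}^k x_iA^\omega$, where the sets $x_iA^\omega \in \mathcal{P}$ are mutually disjoint; we use the additivity to write
\[ S(T(\mu))(X) = \sum_{i=1}^k S(T(\mu))(x_iA^\omega) = \sum_{i=1}^k \mu(x_iA^\omega) = \mu(X). \]
□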

The reader may well ask why we have presented the proof in such a detailed manner. Actually, the reason is contained in Fact 1.8. First we define the notion of semi-ring, according to Halmos [220]. A semi-ring is a non-empty class $SR$ of sets such that:

1. If $E, F \in SR$, then $E \cap F \in SR$.

2. If $E, F \in SR$ and $E \subseteq F$, then there are sets $C_0, C_1, \ldots, C_n \in SR$ such that
\[ E = C_0 \subseteq C_1 \subseteq \cdots \subseteq C_n = F, \]
and $C_i \setminus C_{i-1} \in SR$, for $i = 1, 2, \ldots, n$.

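Fact 1.8. The class $\mathcal{P}$ is a semi-ring iff $Q = 2$.

Proof. We first assume that $Q = 2$. If $X = xyA^\omega \subseteq xA^\omega = Y$ are in $\mathcal{P}$, then let $t = |y|$ and $y = y_1y_2 \ldots y_t$. Then
\[ X = X_0 = xy_1y_2 \ldots y_tA^\omega \subseteq xy_1y_2 \ldots y_{t-1}A^\omega = X_1 \subseteq xy_1y_2 \ldots y_{t-2}A^\omega = X_2 \subseteq \cdots \subseteq xA^\omega = X_t = Y, \]
and obviously
\[ (xy_1y_2 \ldots y_{t-i}A^\omega) \setminus (xy_1y_2 \ldots y_{t-i}y_{t-i+1}A^\omega) = xy_1 \ldots y_{t-i}\overline{y}_{t-i+1}A^\omega, \]
where
\[ \overline{y}_j = \begin{cases} a_2, & \text{if } y_j = a_1, \\ a_1, & \text{if } y_j = a_2. \end{cases} \]
So, $X_i \setminus X_{i-1}$ is in $\mathcal{P}$, which is therefore a semi-ring.

We now assume that $Q > 2$. Let $X = xa_iA^\omega \subseteq xA^\omega = Y$. In case $X \subseteq Z \subseteq Y$, where $Z = uA^\omega \in \mathcal{P}$, we must have $Z = X$ or $Z = Y$. Any chain from $X$ to $Y$ must therefore contain the step from $X$ to $Y$, for which
\[ Y \setminus X = \bigcup_{j=1,\, j \ne i}^Q xa_jA^\omega, \]
a disjoint union of $Q - 1 \ge 2$ non-empty sets of $\mathcal{P}$, and consequently this difference cannot belong to $\mathcal{P}$. □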

So, the classical extensions of a measure from a semi-ring $SR$ to the ring generated by $SR$ - see for instance Halmos [220] - do not apply if $Q > 2$.

The most important example of a measure, which we will be constantly using, is the uniform probability measure
\[ \mu(xA^\omega) = Q^{-|x|}, \]
which obviously satisfies the conditions in Theorem 1.7. It corresponds to the Lebesgue measure on $[0, 1]$.
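The two conditions of Theorem 1.7 can be checked mechanically on short strings. The sketch below assumes the uniform measure is given by $h(x) = Q^{-|x|}$ over a binary alphabet; the names `h_uniform`, `satisfies_conditions` and `mu` are illustrative only.

```python
from fractions import Fraction
from itertools import product

A = "01"          # alphabet; Q = len(A)

def h_uniform(x: str) -> Fraction:
    """h(x) = Q^(-|x|), i.e. the measure of the cylinder xA^omega."""
    return Fraction(1, len(A) ** len(x))

def satisfies_conditions(h, max_len: int) -> bool:
    """Check h(lambda) = 1 and h(x) = sum_a h(xa) for all |x| <= max_len."""
    if h("") != 1:
        return False
    for n in range(max_len + 1):
        for letters in product(A, repeat=n):
            x = "".join(letters)
            if h(x) != sum(h(x + a) for a in A):
                return False
    return True

def mu(h, prefixes) -> Fraction:
    """mu_h of a disjoint union of cylinders xA^omega (prefixes assumed prefix-free)."""
    return sum((h(x) for x in prefixes), Fraction(0))

if __name__ == "__main__":
    print(satisfies_conditions(h_uniform, 4))
    # Two representations of the same set 0A^omega give the same measure.
    print(mu(h_uniform, ["0"]), mu(h_uniform, ["00", "01"]))
```

The last line illustrates why the definition of $\mu_h$ does not depend on the chosen disjoint representation.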

1.5 Exercises and Problems

1. Let $X \subseteq A^*$. Prove that the following definitions are equivalent:

a) The set $X$ is c.e.

b) The set $X$ is the domain of a p.c. function $\varphi : A^* \to A^*$.

c) The set $X$ is the range of a p.c. function $\varphi : A^* \to A^*$.

d) The set $X$ is empty or the range of a computable function $\varphi : A^* \to A^*$.

2. Show that a function $\varphi : A^* \to A^*$ is computable iff the set $\{1^{|x|}0xy \mid \varphi(x) = y\}$ is c.e.

3. Show that a function $f : A^* \to \mathbb{R}_+$ is semi-computable from above iff the set $\{(x, r) \in A^* \times \mathbb{Q} \mid r > f(x)\}$ is c.e.

4. Let $f : A^* \to \mathbb{R}_+$ be semi-computable from above and $g : A^* \to \mathbb{R}_+$ be semi-computable from below. Then show that the set $\{x \in A^* \mid f(x) < g(x)\}$ is c.e.

5. Show that if $A = \mathbb{N}$ and $p$ is defined by $p(n) = 2^{-n-1}$, then $(A, p)$ is a probabilistic model.

6. Show that if $A = \mathbb{N}$ and $p$ is defined by $p(n) = \frac{1}{(n+1)(n+2)}$, then $(A, p)$ is a probabilistic model.

7. Let $P$ be the uniform probability distribution on $A = \{0, 1\}^{10}$. What is the probability of the events:
a) $X_1 = \{x = x_1 \ldots x_{10} \mid x_2 = 0, x_5 = 1\}$,
b) $X_2 = \{x = x_1 \ldots x_{10} \mid \sum_{i=1}^{10} x_i = 3\}$?

8. Prove the following properties of probability distributions:

a) $P(\emptyset) = 0$ and $P(A) = 1$.

b) If $(X_i)_{i=1,\ldots,n}$ are disjoint subsets of $A$, then $P(\bigcup_{i=1}^n X_i) = \sum_{i=1}^n P(X_i)$.

c) If $X \subseteq Y$, then $P(Y \setminus X) = P(Y) - P(X)$.

d) For every $X, Y \subseteq A$, $P(X \cup Y) = P(X) + P(Y) - P(X \cap Y)$.

9. We toss a fair coin four times. Assume that we know that at least one time we have got 1. What is the probability that we have got 1 in the first toss, i.e. $x_1 = 1$? Compare this probability to the probability of the event $x_1 = 1$.

10. Prove that the following pairs of events are independent, $(X, \overline{Y})$, $(\overline{X}, Y)$, $(\overline{X}, \overline{Y})$, provided $X$ and $Y$ are independent.

11. Show that if $(X_1, Y)$ and $(X_2, Y)$ are independent and $X_1, X_2$ are disjoint, then $(X_1 \cup X_2, Y)$ are independent.

12. Show that if $X_1, X_2, \ldots, X_n$ are independent, then so are the events $Y_1, Y_2, \ldots, Y_n$ provided $Y_1 \in \{X_1, \overline{X}_1\}, Y_2 \in \{X_2, \overline{X}_2\}, \ldots, Y_n \in \{X_n, \overline{X}_n\}$.

13. Show that if $Y \cap Z = \emptyset$ and $Y, X_2, \ldots, X_n$ and $Z, X_2, \ldots, X_n$ are independent, then so are $(Y \cup Z), X_2, \ldots, X_n$.

14. Show that if $X_1, X_2, \ldots, X_n$ are independent, then so are $(X_1 \cap X_2), \ldots, X_n$.

15. (Bernoulli scheme with finitely many tosses) Consider $A = \{0, 1\}^n$, $a = a_1a_2 \ldots a_n \in A$ and define
\[ p(a) = p_0^{\text{number of 0s in } a}\, p_1^{\text{number of 1s in } a}, \]
where $p_0 + p_1 = 1$, $p_0, p_1 \ge 0$. Prove:

a) For every $1 \le i_1 < i_2 < \cdots < i_k \le n$ and $b_1, b_2, \ldots, b_k \in \{0, 1\}$, we have: $P(\{a \in A \mid a_{i_1} = b_1, a_{i_2} = b_2, \ldots, a_{i_k} = b_k\}) = p_{b_1}p_{b_2} \cdots p_{b_k}$.

b) The events $\{a \in A \mid a_{i_1} = b_1\}, \{a \in A \mid a_{i_2} = b_2\}, \ldots, \{a \in A \mid a_{i_k} = b_k\}$ are independent.

16. Show that every measure $\mu$ is additive (for all pairwise disjoint sets $(E_n)_{0 \le n \le m}$, $\mu(\bigcup_{n=0}^m E_n) = \sum_{n=0}^m \mu(E_n)$), monotone (if $E \subseteq F$, then $\mu(E) \le \mu(F)$), sub-additive (for every sequence of sets $(E_n)_{n \ge 0}$, $\mu(\bigcup_{n \ge 0} E_n) \le \sum_{n \ge 0} \mu(E_n)$).

17. A null set is a set of measure zero. Show that a countable union of null sets is a null set.

18. Let $X$ be a set for which i) $\mu(X)$ exists and ii) for every $\varepsilon > 0$ there exists a set $Y$ such that $X \subseteq Y$ and $\mu(Y) \le \varepsilon$. Then, prove that $X$ is a null set.

19. (Bernoulli scheme with infinitely many tosses) Consider $A = \{0, 1\}$, $p_0 + p_1 = 1$, $p_0, p_1 \ge 0$. The measure
\[ \mu(xA^\omega) = \prod_{i=1}^n p_{a_i}, \]
where $x = a_1a_2 \ldots a_n \in A^n$, gives the probability of getting a particular sequence $a_1a_2 \ldots a_n$ of 0s and 1s in the first $n$ tosses in which 0 appears with probability $p_0$ and 1 appears with probability $p_1$. If $p_0 = p_1$, then we get the Lebesgue measure.

20. Let $y \in A^\omega$ be fixed. Define

if x

Chapter 2

Noiseless Coding

A poem is never finished, only abandoned.
Paul Valéry

In this chapter we consider the problem of safe transmission of a message over a channel, which cannot be affected by noise. We are looking for error-free and the fastest possible methods for transmitting messages. This is a rather special, but important, problem in classical information theory. We rely mainly on the following two central tools: prefix-free sets and Shannon entropy. Undoubtedly, the prefix-free sets are the easiest codes to construct, and most interesting problems on codes can be raised for prefix-free sets. Shannon entropy is a measure of the degree of ignorance concerning which possibility holds in an ensemble with a given a priori probability distribution. Later on, we shall contrast the Shannon measure with the information content of an individual (finite) object - viewed as a measure of how difficult it is to specify that object.

2.1 Prefix-free Sets

We start with the following guessing game where one person has thought of an arbitrary natural number and the other person tries to guess it. The person who guesses is only allowed to ask questions of the form: "Is your number less than $n$?" for every natural $n \ge 0$; the other player answers yes or no. The aim is to guess the number as fast as possible.

As an example consider the following questions:

1. Is your number less than 1?
2. Is your number less than 2?
3. Is your number less than 3?
4. Is your number less than 4?
5. and so on until the first yes is obtained.

To guess the number 10 we need to ask 11 questions; in general, to guess the number $n$ we have to ask $n + 1$ questions. It is convenient to adopt the following convention: the representation of $n$ is the string of answers that would be given when the number to be guessed is $n$, where 0 stands for yes and 1 stands for no. Accordingly, the above set of questions leads to the set $S = \{1^i0 \mid i \ge 0\}$, where $1^n0$ is a "name" for $n$. It is important to note the following remarkable property of the set $S$: no string in $S$ is a proper prefix of a different string in $S$. Sets having this property are called prefix-free sets; they will be formally introduced in the following definition.
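The unary "names" $1^n0$ and the prefix-free property are easy to experiment with. In the sketch below the function names `unary` and `is_prefix_free` are illustrative; the check uses the fact that, after lexicographic sorting, a string that is a prefix of some other string in the set is also a prefix of its immediate successor.

```python
def unary(n: int) -> str:
    """Name of n in the guessing game: n answers 'no' (1), then one 'yes' (0)."""
    return "1" * n + "0"

def is_prefix_free(strings) -> bool:
    """True if no string in the set is a proper prefix of a different one."""
    words = sorted(set(strings))
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

if __name__ == "__main__":
    S = [unary(n) for n in range(20)]
    print(unary(10), len(unary(10)))      # 11 symbols are needed to name 10
    print(is_prefix_free(S))
    print(is_prefix_free(["0", "01"]))    # "0" is a proper prefix of "01"
```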

Fix an alphabet $A = \{a_1, a_2, \ldots, a_Q\}$, $Q \ge 2$.

Definition 2.1. i) A string x E A* is a prefix of another string y (written x

Example 2.2. For every natural $n$, the set $S = A^n$ is a (finite) prefix-free set. Every prefix-free set $S$ containing the empty string $\lambda$ is equal to $A^0 = \{\lambda\}$.

Example 2.3. The set $S = \{$ [...] $> 1\}$ is prefix-free over the alphabet $A = \{a_1, a_2, a_3\}$.

We may ask the following question: "Is there any way to represent all positive integers by means of a prefix-free set?" The answer is affirmative, and the first solution which comes to mind is the set $S = \{1^i0 \mid i \ge 1\}$, already obtained. Since it requires $n + 1$ bits to represent $n$, this solution can hardly be considered as practical. To discuss some ways to improve this solution we will start by modifying the set of questions in the guessing game:

1. Is your number less than 1?
2. Is your number less than 2?
3. Is your number less than 4?
4. Is your number less than 8?
5. Is your number less than 16?
6. and so on until the first yes is obtained and then the process continues as a binary search.

We are led to

Example 2.4. Represent $n \in \mathbb{N}_+$ as the string $1^{\log n}0\,\mathrm{bin}(n)$ and get a prefix-free set $S$ in which every natural $n \ge 1$ can be represented by $2\log n + 1$ bits.

For a further improvement we proceed as follows. For every $x \in \{0, 1\}^*$ we construct the new string $\overline{x}$ obtained by inserting a 0 in front of each letter in $x$, and finally adding 1; $\overline{\lambda} = 1$. For instance,
\[ \overline{0} = 001, \quad \overline{1} = 011, \quad \overline{00} = 00001, \quad \overline{01} = 00011, \quad \overline{10} = 01001, \quad \overline{11} = 01011. \]
It is clear that $|\overline{x}| = 2|x| + 1$.

Finally, let
\[ d(x) = \overline{\mathrm{bin}(|x|)}\,x, \]
for every $x \in A^*$. We shall call $d(x)$ the binary self-delimiting version of $x$. For example, $d(0101) = \overline{\mathrm{bin}(4)}\,0101 = \overline{01}\,0101 = 000110101$. (Recall that $\mathrm{bin} : \mathbb{N}_+ \to \{0, 1\}^*$ is the function returning for $n$ the binary representation of the number $n + 1$ without the leading 1.)

Example 2.5. The set $S = \{d(x) \mid x \in \{0, 1\}^*\}$ is prefix-free and every string $x \in \{0, 1\}^*$ can be represented within $S$ using $|d(x)| = |x| + 2\log|x| + 1$ bits. Consequently, every natural $n > 0$ has a representation in $S$ of $\log n + 2\log\log n + 1$ bits.

Furthermore, by replacing 0 by $a_1$ and 1 by $a_2$ we can consider that the function bin takes values in $\{a_1, a_2\}^* \subseteq A^*$. The set $\{d(x) \mid x \in A^*\} \subseteq A^*$ is prefix-free, where
\[ d(x) = \overline{\mathrm{bin}(|x|)}\,x \]
is the self-delimiting version of the string $x \in A^*$.
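The construction of $d(x)$ can be sketched as follows for the binary alphabet. The function names (`bin_`, `bar`, `d`, `decode_first`) are chosen for this illustration; `decode_first` is an added helper, not part of the text, showing that a code-string $d(x)$ can be cut off from whatever follows it without lookahead.

```python
from itertools import product

def bin_(n: int) -> str:
    """Binary representation of n + 1 without the leading 1."""
    return format(n + 1, "b")[1:]

def bar(x: str) -> str:
    """Insert a 0 in front of each letter of x, then append 1."""
    return "".join("0" + c for c in x) + "1"

def d(x: str) -> str:
    """Binary self-delimiting version of x: bar(bin(|x|)) followed by x."""
    return bar(bin_(len(x))) + x

def decode_first(s: str):
    """Split s = d(x) + rest into (x, rest)."""
    i, bits = 0, ""
    while s[i] == "0":            # pairs "0c" carry the letters of bin(|x|)
        bits += s[i + 1]
        i += 2
    i += 1                        # skip the terminating 1 of bar(...)
    length = int("1" + bits, 2) - 1
    return s[i:i + length], s[i + length:]

if __name__ == "__main__":
    print(d("0101"))
    print(decode_first(d("0101") + "111"))
    codes = sorted(d("".join(t)) for n in range(6) for t in product("01", repeat=n))
    print(not any(b.startswith(a) for a, b in zip(codes, codes[1:])))
```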

2.2 Instantaneous Coding

Consider two alphabets $Y = \{y_1, y_2, \ldots, y_N\}$ and $A = \{a_1, a_2, \ldots, a_Q\}$ such that $2 \le Q < N$. If $Y$ is the alphabet of a given initial information source and $A$ is the input alphabet of a communication channel, then in order to transmit the letters (i.e. strings on $Y$) through the given channel an encoding process has to be developed, even if we assume that there is no noise on the communication channel.

Definition 2.6. i) A (finite) code is an injective function $\varphi : Y \to A^*$. The elements of $\varphi(Y)$ are called code-strings. ii) An instantaneous code or prefix code is a code $\varphi$ such that $\varphi(Y)$ is prefix-free.

Example 2.7. Let $Y = \{y_1, y_2, y_3, y_4\}$ and $A = \{0, 1\}$. Consider the following functions defined on $Y$:
\[
\begin{array}{c|cccc}
 & y_1 & y_2 & y_3 & y_4 \\ \hline
\varphi_1 & 00 & 01 & 10 & 11 \\
\varphi_2 & 10 & 110 & 1110 & 11110 \\
\varphi_3 & 10 & 10 & 110 & 1110 \\
\varphi_4 & 01 & 011 & 0111 & 01111
\end{array}
\]
The codes $\varphi_1, \varphi_2$ are instantaneous while the code $\varphi_4$ is not ($\varphi_4(Y)$ is not prefix-free); $\varphi_3$ is not even a code.

In what follows we will be concerned with instantaneous codes. Their main property is the uniqueness of decodability: a code is uniquely decodable if for each source sequence of finite length (i.e. string), the corresponding sequence of code-strings does not coincide with the sequence of code-strings for any other source sequence. In other words, the (unique) extension of $\varphi$ to $Y^*$ is injective. For example, the sequence 0010001101 in code $\varphi_1$ can be split as 00, 10, 00, 11, 01 and decoded as $y_1y_3y_1y_4y_2$.

Not every uniquely decodable code is instantaneous (e.g. $\varphi_4$), but as we shall see later, such a code can always be converted into an instantaneous code. The advantage of the prefix-free condition resides in the possibility to decode without delay, because the end of a code-string can be immediately recognized and subsequent parts of the message do not have to be observed before decoding is started. A simple way of building prefix codes is given by the following theorem.
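The four functions of Example 2.7 and the "decode without delay" property can be tried out directly. The sketch below is illustrative: the dictionaries mirror the table of Example 2.7, while the helper names (`is_code`, `is_prefix_free`, `is_instantaneous`, `decode`) are assumptions.

```python
def is_code(phi) -> bool:
    """A code is an injective function Y -> A*."""
    return len(set(phi.values())) == len(phi)

def is_prefix_free(words) -> bool:
    ws = sorted(set(words))
    return not any(b.startswith(a) for a, b in zip(ws, ws[1:]))

def is_instantaneous(phi) -> bool:
    return is_code(phi) and is_prefix_free(phi.values())

def decode(phi, message: str):
    """Decode letter by letter; valid for instantaneous codes only."""
    inverse = {w: y for y, w in phi.items()}
    out, buffer = [], ""
    for letter in message:
        buffer += letter
        if buffer in inverse:          # end of a code-string recognized at once
            out.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("message is not a sequence of code-strings")
    return out

phi1 = {"y1": "00", "y2": "01", "y3": "10", "y4": "11"}
phi2 = {"y1": "10", "y2": "110", "y3": "1110", "y4": "11110"}
phi3 = {"y1": "10", "y2": "10", "y3": "110", "y4": "1110"}
phi4 = {"y1": "01", "y2": "011", "y3": "0111", "y4": "01111"}

if __name__ == "__main__":
    for name, phi in [("phi1", phi1), ("phi2", phi2), ("phi3", phi3), ("phi4", phi4)]:
        print(name, is_code(phi), is_instantaneous(phi))
    print(decode(phi1, "0010001101"))
```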

Theorem 2.8 (Kraft). Let $(n_i)$, $i = 1, 2, \ldots, N$, be positive integers. These numbers are the lengths of the code-strings of an instantaneous code $\varphi : Y \to A^*$ iff $\sum_{i=1}^N Q^{-n_i} \le 1$.

Proof. Let $\varphi : Y \to A^*$ be an instantaneous code such that $|\varphi(y_i)| = n_i$, $1 \le i \le N$. Let $r_i$ be the number of the code-strings having length $i$. Clearly, $r_j = 0$ in case $j > m = \max\{n_1, \ldots, n_N\}$. As the code is instantaneous, the following relations hold true:
\begin{align*}
r_1 &\le Q, \\
r_2 &\le (Q - r_1)Q = Q^2 - r_1Q, \\
r_3 &\le ((Q - r_1)Q - r_2)Q = Q^3 - r_1Q^2 - r_2Q, \\
&\ \ \vdots \\
r_m &\le Q^m - r_1Q^{m-1} - r_2Q^{m-2} - \cdots - r_{m-1}Q.
\end{align*}
Dividing the last inequality by $Q^m$ we get
\[ \sum_{i=1}^m r_iQ^{-i} \le 1. \]
The required inequality follows by virtue of the relation
\[ \sum_{i=1}^m r_iQ^{-i} = \sum_{j=1}^N Q^{-n_j} \le 1. \tag{2.1} \]
For the converse implication we use (2.1) to get, step by step, the inequalities
\begin{align*}
r_1Q^{-1} &\le \sum_{i=1}^m r_iQ^{-i} \le 1, \\
r_1Q^{-1} + r_2Q^{-2} &\le \sum_{i=1}^m r_iQ^{-i} \le 1, \\
&\ \ \vdots
\end{align*}
so
\[ r_1 \le Q, \quad r_2 \le Q^2 - r_1Q, \quad \ldots, \quad r_m \le Q^m - r_1Q^{m-1} - \cdots - r_{m-1}Q, \]
showing that we have enough elements to construct the instantaneous code whose code-strings have lengths $n_1, \ldots, n_N$. □

Remark. The inequality $\sum_{i=1}^N Q^{-n_i} \le 1$ is called Kraft's inequality. Kraft's Theorem does not assert that every code which satisfies the inequality therein must be a prefix code. A counter-example is offered by the code $\varphi_4$: it satisfies Kraft's inequality, but it is not prefix-free. Nevertheless, there is a prefix code $\varphi_2$ whose code-strings have the same lengths as those of the code $\varphi_4$. The relation between these codes is a special instance of the following more general result.

Theorem 2.9 (McMillan). If a code is uniquely decodable with code-strings of lengths $n_1, n_2, \ldots, n_N$, then Kraft's inequality is satisfied.


Proof. Let $r$ be a positive integer. Then
\[ \Big(\sum_{k=1}^N Q^{-n_k}\Big)^r = \Big(\sum_{k_1=1}^N Q^{-n_{k_1}}\Big)\Big(\sum_{k_2=1}^N Q^{-n_{k_2}}\Big) \cdots \Big(\sum_{k_r=1}^N Q^{-n_{k_r}}\Big) = \sum_{k_1=1}^N \sum_{k_2=1}^N \cdots \sum_{k_r=1}^N Q^{-(n_{k_1} + n_{k_2} + \cdots + n_{k_r})}, \]
because a finite number of terms can always be rearranged without affecting their sum. Now $n_{k_1} + n_{k_2} + \cdots + n_{k_r}$ is exactly the number of code letters in some sequence of $r$ code-strings. The numbers $k_1, k_2, \ldots, k_r$ vary, so all possible sequences of $r$ code-strings are generated in this way. Let $r_i$ be the number of sequences of $r$ code-strings which contain $i$ letters; clearly, $1 \le i \le rm$, where $m = \max\{n_1, n_2, \ldots, n_N\}$. So,
\[ \Big(\sum_{k=1}^N Q^{-n_k}\Big)^r = \sum_{i=1}^{rm} r_iQ^{-i}. \tag{2.2} \]
Since the code is uniquely decodable all sequences of $r$ code-strings with a total of $i$ letters have to be distinct, i.e. $r_i \le Q^i$. Accordingly, in view of (2.2),
\[ \sum_{k=1}^N Q^{-n_k} \le \Big(\sum_{i=1}^{rm} 1\Big)^{1/r} = (rm)^{1/r}. \]
Allowing $r$ to tend to $\infty$, the right-hand side tends to 1. □

Corollary 2.10. Each uniquely decodable code can be replaced by a prefix code without changing the lengths of the code-strings.

Proof. Use Theorem 2.8 and Theorem 2.9. □
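The corollary can be made concrete: given lengths satisfying Kraft's inequality, a prefix code with exactly those lengths can be written down. The sketch below uses a standard "next free codeword" construction, which is not necessarily the one intended in the proof of Theorem 2.8; the function names are hypothetical, and digits are rendered with `str`, so it assumes $Q \le 10$.

```python
from fractions import Fraction

def kraft_sum(lengths, Q=2) -> Fraction:
    """Exact value of sum_i Q^(-n_i)."""
    return sum((Fraction(1, Q ** n) for n in lengths), Fraction(0))

def to_base(c: int, Q: int, width: int) -> str:
    """Base-Q representation of c, padded to the given width (Q <= 10 assumed)."""
    digits = []
    for _ in range(width):
        c, r = divmod(c, Q)
        digits.append(str(r))
    return "".join(reversed(digits))

def prefix_code_from_lengths(lengths, Q=2):
    """Return prefix-free code-strings with the given lengths, sorted by length."""
    if kraft_sum(lengths, Q) > 1:
        raise ValueError("Kraft's inequality is violated")
    words, c, prev = [], 0, 0
    for n in sorted(lengths):
        c *= Q ** (n - prev)       # extend the counter to the new length
        words.append(to_base(c, Q, n))
        c += 1                     # next string not extending a used code-string
        prev = n
    return words

if __name__ == "__main__":
    lengths = [2, 3, 4, 5]          # lengths of the code-strings of phi_4
    print(kraft_sum(lengths))
    print(prefix_code_from_lengths(lengths))
```

For the lengths 2, 3, 4, 5 of $\varphi_4$ this yields a prefix code different from $\varphi_2$ but with the same lengths, which is all the corollary claims.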

Let us now consider a probabilistic model on $Y$, i.e. a function $p : Y \to (0, 1]$ such that
\[ \sum_{i=1}^N p(y_i) = 1. \]
The self-information of $y_i$ is defined by $-\log_2 p(y_i)$.

For example, if we assume that all 26 letters (plus the space symbol) of the English alphabet are equally distributed, then the self-information of an arbitrary letter is $-\log_2 27^{-1} = \log_2 27 \approx 4.75$ bits. Of course, the above hypothesis is false for English!² Suppose now that we have defined a function $f$ which assigns the value $f_k$ to $y_k$. Then $E(f)$, the expectation (or average, mean) of $f$, is defined by the formula

$$E(f) = \sum_{k=1}^{N} P(y_k) f_k.$$

The entropy is the average of self-information, i.e.

$$\mathcal{H}(Y) = -\sum_{k=1}^{N} P(y_k)\log_2 P(y_k).$$

We shall use the entropy to study the instantaneous codes. To this end we fix a probabilistic model $P$ on $Y$ and define the average length of the instantaneous code $\varphi : Y \to A^*$ with respect to $P$ to be the number

$$L_\varphi = \sum_{k=1}^{N} P(y_k)\,|\varphi(y_k)|.$$

Notice that $L_\varphi$ is the expectation of the function $f(y_k) = |\varphi(y_k)|$.

Example 2.11. Consider a uniform code $\varphi : Y \to A^*$, i.e. [...]

[...] Let $\varphi : Y \to A^*$ be an instantaneous code. Then

$$L_\varphi \ge \frac{\mathcal{H}(Y)}{\log_2 Q}.$$

² Disregarding the space symbol, the most common letter in English, e, occurs with a frequency of about 13%; the least common letters, q and z, occur with a frequency of about 0.1%.

Proof. First we prove the following:

Intermediate Step 1. Let $c_1, \ldots, c_N, q_1, \ldots, q_N$ be positive reals. If $\sum_{i=1}^{N} q_i = 1$, then

$$\sum_{i=1}^{N} q_i c_i \ge \prod_{i=1}^{N} c_i^{q_i}.$$

Consider the concave function $f : (0, \infty) \to \mathbf{R}$, $f(x) = \log_2 x$. Consequently,

$$\log_2\left(\sum_{i=1}^{N} q_i c_i\right) \ge \sum_{i=1}^{N} q_i \log_2 c_i = \log_2 \prod_{i=1}^{N} c_i^{q_i}.$$

Since the function $\log_2$ is increasing, the required inequality follows.

Intermediate Step 2. Let $q_i, p_i \in (0, \infty)$, $1 \le i \le N$, and

$$\sum_{i=1}^{N} p_i = \sum_{i=1}^{N} q_i = 1.$$

Then the following inequality holds true:

$$-\sum_{i=1}^{N} q_i \log_2 p_i \ge -\sum_{i=1}^{N} q_i \log_2 q_i.$$

We put $c_i = p_i q_i^{-1}$ and we note that the hypothesis of the Intermediate Step 1 is satisfied, so

$$\sum_{i=1}^{N} q_i c_i \ge \prod_{i=1}^{N} c_i^{q_i}, \qquad \sum_{i=1}^{N} p_i \ge \prod_{i=1}^{N} p_i^{q_i} q_i^{-q_i},$$

$$\prod_{i=1}^{N} p_i^{q_i} \le \prod_{i=1}^{N} q_i^{q_i}, \qquad \sum_{i=1}^{N} q_i \log_2 p_i \le \sum_{i=1}^{N} q_i \log_2 q_i.$$

We are now in a position to conclude our proof: we apply the Intermediate Step 2 (for $n_i = |\varphi(y_i)|$, $q_i = P(y_i)$, $p_i = Q^{-n_i}\left(\sum_{j=1}^{N} Q^{-n_j}\right)^{-1}$) and finally use Kraft's inequality ($\sum_{j=1}^{N} Q^{-n_j} \le 1$):

$$\mathcal{H}(Y) = -\sum_{i=1}^{N} P(y_i)\log_2 P(y_i) \le -\sum_{i=1}^{N} P(y_i)\log_2 \frac{Q^{-n_i}}{\sum_{j=1}^{N} Q^{-n_j}} \le \log_2 Q \cdot \sum_{i=1}^{N} P(y_i)\,|\varphi(y_i)| = L_\varphi \log_2 Q.$$

□

Example 2.13. The above lower bound can be achieved; for instance, consider $N = 8$, $Q = 2$, $P(y_i) = \frac{1}{8}$ and a uniform code of length 3. We get $\mathcal{H}(Y) = L_\varphi = 3$.
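The lower bound on the average length can be checked numerically. The sketch below is only an illustration under stated assumptions: probabilities are given as a list of floats, codes as lists of strings, and the helper names are hypothetical.

```python
import math

def entropy(probs):
    """Entropy in bits: the average of the self-information -log2 P(y)."""
    return -sum(p * math.log2(p) for p in probs)

def average_length(probs, code):
    """Average code-string length with respect to the probabilities."""
    return sum(p * len(w) for p, w in zip(probs, code))

def shannon_lower_bound(probs, q=2):
    """Entropy divided by log2 Q, the lower bound on the average length."""
    return entropy(probs) / math.log2(q)

# The uniform case: N = 8, Q = 2, code-strings of length 3.
uniform_probs = [1 / 8] * 8
uniform_code = [format(i, "03b") for i in range(8)]

# A non-uniform model with a binary prefix code (values chosen for illustration).
skewed_probs = [0.7, 0.2, 0.1]
skewed_code = ["0", "10", "11"]
```

For the uniform model the entropy and the average length coincide; for the skewed model the average length, 1.3, stays above the entropy, roughly 1.157.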

2.3 Exercises and Problems

1. Let $S \subset A^*$. Show that the following statements are equivalent: a) $S$ is prefix-free, b) $S \cap SA^+ = \emptyset$, c) if $xu = yv$ and $x, y \in S$, then $x = y$ and $u = v$.

2. Show that for all prefix-free sets $S, T \subset A^*$, if $SA^* = TA^*$, then $S = T$.

3. Show that for every set $S \subset A^+$ the set $T = A^*S \setminus A^*SA^+$ is maximal prefix-free (i.e. it is prefix-free and it has no proper prefix-free superset in $A^*$).

4. A recursive strategy for the guessing game proceeds as follows: first guess the number of binary digits of $n$, then use the binary search to determine the exact value of $n$. To guess the number of digits of $n$, the same rule is applied. Finally, we are led to the following sequence of questions: Is your number less than 1? Is your number less than 2? Is your number less than $2^2$? Is your number less than $2^4$? Is your number less than $2^{16}$?

and so on. a) Design encoding/decoding procedures for the recursive strategy. b) Show that the length of the code-string corresponding to $n \ge 1$, in this representation, is $f(n) + f^2(n) + \cdots + f^*(n) + 1$, where $f(n) = \log n$, $f^m$ is the $m$-fold iteration of $f$ and $f^*(n) = f^m(n)$, where $m$ is the greatest natural such that $f^m(n) \ne 0$.

5. Show that for every prefix-free set $S \subset A^*$, $\sum_{u \in S} Q^{-|u|} \le 1$.

6. Think of $A^*$ as an infinite tree with the empty string as the root. Show that an instantaneous code $\varphi : Y \to A^*$ is complete (i.e. Kraft's inequality holds true with equality for $\varphi$) iff the code is tree saturated, i.e. if exactly $Q$ edges start from every non-terminal vertex.

7. Show that $\mathcal{H}(Y) \le \log_2 N$, with equality only when $P(y_i) = N^{-1}$, for all $1 \le i \le N$.

8. Show that for every probabilistic model $P$ on $Y$ there is an instantaneous code $\varphi : Y \to A^*$ such that

$$L_\varphi < \frac{\mathcal{H}(Y)}{\log_2 Q} + 1.$$

9. Assume that in the probabilistic model of $Y$ we choose two probabilities $p_i = P(y_i) > p_j = P(y_j)$ and we replace them by $p_i - \varepsilon$ and $p_j + \varepsilon$, respectively, under the proviso $0 < 2\varepsilon < p_i - p_j$. Show that in this way $\mathcal{H}(Y)$ increases. In this way we can explain why the entropy acts as a measure of uncertainty.

10. Show that for every code $\varphi : Y \to A^*$ whose extension to $Y^*$ is injective one has

$$L_\varphi \ge \frac{\mathcal{H}(Y)}{\log_2 Q} - \log_2 \log_2 Q - 2$$

(for every probabilistic model $P$ on $Y$).

11. An instantaneous code (over the alphabet $A_Q$) has code-string lengths $l_1, l_2, \ldots, l_m$ which satisfy the strict inequality

$$\sum_{i=1}^{m} Q^{-l_i} < 1.$$

Show that there exist arbitrarily long strings in $A^*$ which cannot be decoded into sequences of code-strings.

12. (Shannon-Fano code) Let $a_1, a_2, \ldots, a_N$ be positive reals with $\sum_{j=1}^{N} a_j \le 1$. Construct a binary prefix-free code $x_1, x_2, \ldots, x_N$ such that the code-strings are in quasi-lexicographical order and

$$|x_j| \le -\log a_j + 2, \quad \text{for all } 1 \le j \le N.$$

13. For $S \subset A^*$, $s \in S$ and natural $j \ge 1$ define the set

$$T(S, s, j) = \{s^i u \mid 0 \le i \le j,\ u \in S \setminus \{s\}\} \cup \{s^{j+1}\}.$$

Prove: a) If $S$ is prefix-free, then $T(S, s, j)$ is prefix-free. b) If $S$ is complete (in the sense of Exercise 2.3.6), then so is $T(S, s, j)$.

14. Show that the set of all p.c. functions $\varphi : A^* \to A^*$ having a prefix-free domain has a universal p.c. function (having itself a prefix-free domain).

15. A set of strings is suffix-free if no string is a suffix of any string in the set. Show that every suffix-free code is uniquely decodable. What is the minimum average length over all suffix-free codes?

2.4 History of Results

The guessing game comes from Bentley and Yao [34], Knuth [256]. Kraft's Theorem was proven in Kraft [262]. One can safely say that coding theory was born in 1948, after the seminal paper by Shannon [364]. See Berstel and Perrin [36], Csiszár and Körner [153], Cover and Thomas [152], Guiaşu [219], Jones [241], Jürgensen and Duske [244] and Khinchin [252] for further details on codes. Exercises 2.3.9 and 2.3.10 are from Leung-Yan-Cheong and Cover [276], where some relations between Shannon's entropy and Chaitin's complexity have been established. Exercise 2.3.13, due to Titchener [399], generalizes a construction of a statistically synchronizable code.

Chapter 3

Program-size

We have art to save ourselves from the truth.
Friedrich Nietzsche

One way to measure the information content of some text is to determine the size of the smallest string (code, input) from which it can be reproduced by some computer (decoder, interpreter). This idea has been independently formalized in a number of different ways by Solomonoff, Kolmogorov and Chaitin.

3.1 An Example

There are many ways to compress information; an important one consists of detecting patterns. There is no visible pattern in a long table of trigonometric functions. A much more compact way to convey the information in such a table is to provide instructions for calculating the table, e.g. using Euler's equation $e^{ix} = \cos x + i \sin x$. Such a description not only is compact, but can be used to generate arbitrarily long trigonometric tables.

The above method is inadequate for empirical data. For instance, consider the collection of gold medal winners in the Olympic Games since 1896 (see Rozenberg and Salomaa [348]). For such information the amount of compression is practically nil, especially if attention is restricted to the least significant digits. Moreover, since the tendency is for (slow) improvement, in the long run the most significant digits have some kind of regularity, a property which allows us to make predictions. Is it possible to find an "objective" indicator to distinguish between these two different cases?

Empirical data give rise to strings which have to be "explained" and new ones have to be predicted. This can be done by theories. A crude model of a theory is just a computer program which reproduces the empirical observations. Usually, there exist infinitely many such programs - for instance, the number 123 can be produced by the algorithms

Subtract $i$ from $123 + i$,

for $i = 1, 2, 3, \ldots$. Minimal programs are clearly most interesting. They can be used to measure the amount of compression of the initial data.
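The contrast between patterned and patternless data can be observed with an off-the-shelf compressor. This is only a rough sketch: the length of a zlib-compressed string is a crude, computable stand-in for "size of a short description", not the program-size complexity studied in this chapter, and the sample data are made up for illustration.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed form (maximum compression level)."""
    return len(zlib.compress(data, 9))

# 10,000 bytes with an obvious pattern.
patterned = b"0123456789" * 1000

# 10,000 pseudo-random bytes from a seeded generator (no pattern zlib can exploit).
rng = random.Random(0)
patternless = bytes(rng.getrandbits(8) for _ in range(10_000))

print(compressed_size(patterned), compressed_size(patternless))
```

The patterned input shrinks to a tiny fraction of its size, while the pseudo-random input does not shrink at all. Note that the pseudo-random bytes do have a short description (the seed and the generator), which zlib simply cannot find; this is one reason why a general-purpose compressor gives only an upper estimate.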

3.2 Computers and Complexities

We view a computer as a p.c. function which reads a string (over some alphabet) as an input and then may or may not print another string, as output. With reference to a fixed computer, the complexity of a string $x$ is defined as the length of the shortest string $y$ which when input to the computer will determine the output of $x$. If one chooses to think of the input as a program + data, then the computer acts as a unary p.c. function; if the program and the data come separately, then the computer will be a binary p.c. function.

Whereas Kolmogorov and (in a first stage) Chaitin do not impose any restrictions on computers, it was realized that the domain of each computer should be prefix-free. Here is Chaitin's motivation for this point of view (see Chaitin [124]):

A key point that must be stipulated... is that an input program must be self-delimited: its total length (in bits) must be given within the program itself. (This seemingly minor point, which paralyzed progress in the field for nearly a decade, is what entailed the redefinition of algorithmic randomness.) Real programming languages are self-delimiting, because they provide constructs for beginning and ending a program. Such constructs allow a program to contain well-defined subprograms nested in them. Because a self-delimiting program is built up by concatenation and nesting self-delimiting subprograms, a program is syntactically complete only when the last open subprogram is closed. In essence the beginning and ending constructs for programs and subprograms function respectively like left and right parentheses in mathematical expressions.
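The parenthesis analogy in the quotation can be made concrete. The following is a toy sketch, assuming programs are words over the two-letter alphabet of left and right parentheses; `read_program` and `complete_programs` are hypothetical names, not anything defined in the text.

```python
import itertools

def read_program(stream):
    """Read characters from an iterator until the first balanced expression closes.

    Returns the expression, or None if the input ends or is malformed first.
    The reader never looks past the closing parenthesis.
    """
    depth, out = 0, []
    for ch in stream:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        else:
            return None          # not a letter of the alphabet
        if depth < 0:
            return None          # closing parenthesis with nothing open
        out.append(ch)
        if depth == 0:
            return "".join(out)  # last open subprogram is closed
    return None                  # input exhausted while still open

def complete_programs(max_len):
    """All words of length <= max_len that the reader accepts in full."""
    words = ("".join(p) for n in range(1, max_len + 1)
             for p in itertools.product("()", repeat=n))
    return [w for w in words if read_program(iter(w)) == w]

tape = iter("(()())()(")
first = read_program(tape)   # the reader stops by itself
rest = "".join(tape)         # whatever was left unread on the tape
```

Because the reader decides on its own where a program ends, no accepted word can be a proper prefix of another accepted word; the set of complete programs is prefix-free.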

Definition 3.1. A computer is a p.c. function $\varphi : A^* \times A^* \to A^*$. A Chaitin computer is a computer $C$ such that for every $v \in A^*$, the domain of $C_v$ is prefix-free, where $C_v : A^* \to A^*$, $C_v(x) = C(x, v)$, for all $x \in A^*$.¹

Comment. If $C(x, v) < \infty$ and $y$ [...]

During each cycle of operation the machine reads the content of the scanned program tape cell and of the scanned worktape cell; it may halt, move the read head of the program tape one cell to the right, write a 0, a 1 or a blank on the scanned worktape cell, move the read/write head of the worktape one cell to the left or to the right, and write a 0 or a 1 on the scanned output tape cell and move the write head of the output tape one cell to the right. The machine changes state: the action performed and the next state are both functions of the present state and the contents of the two cells being scanned by the program tape head and by the worktape head. If, after finitely many steps, $M$ halts with the program tape head scanning the last bit of $x$, then the computation is a success, $M(x) < \infty$; the output of the computation is the string $M(x) \in A^*$ that has been written on the output tape. Otherwise, the computation is a failure, $M(x) = \infty$, and there is no output.

In view of the above definition, a successful computation must end with the program tape head scanning the last bit of the program. Since the program tape head is read-only and cannot move left, this implies that for every self-delimiting Turing machine $M$ the halting set $\{x \in A^* \mid M(x) < \infty\}$ is prefix-free.

Definition 3.2. A (Chaitin) computer $\psi$ is universal if for each (Chaitin) computer [...]

¹ We follow Solovay [375] for the terminology; a copy of this important, but unfortunately not (yet?) published, manuscript was kindly supplied to us by Charles Bennett.

Proof. We sketch the proof for Chaitin computers. Let $F : \mathbf{N}_+ \times A^* \times A^* \to A^*$ be a universal p.c. function for the class of all p.c. functions $C : A^* \times A^* \to A^*$ such that for every $v \in A^*$ the set $\{u \in A^* \mid C(u, v) < \infty\}$ is prefix-free. Then put

$$\psi(a_1^i a_2 u, v) = F(i, u, v).$$

□

We fix a universal computer $\psi$ and a Chaitin universal computer $U$ as standard universal computers (they are not necessarily the computers constructed in the proof of Theorem 3.3) and we use them for measuring program-size complexities throughout the rest of the book. Also we use the convention that the minimum of the empty set is $\infty$.

Definition 3.4. a) The Kolmogorov-Chaitin absolute complexity (for short, the absolute complexity) associated with the computer $\varphi$ is the partial function $K_\varphi : A^* \to \mathbf{N}$,

$$K_\varphi(x) = \min\{|u| \mid u \in A^*,\ \varphi(u, \lambda) = x\}.$$

In the case $\varphi = \psi$ we put $K(x) = K_\psi(x)$.

b) The Chaitin self-delimiting absolute complexity or absolute program-size complexity (for short, program-size complexity) associated with Chaitin's computer $C$ is the partial function $H_C : A^* \to \mathbf{N}$,

$$H_C(x) = \min\{|u| \mid u \in A^*,\ C(u, \lambda) = x\}.$$

In the case $C = U$ we put $H(x) = H_U(x)$.

c) The canonical program defined with respect to Chaitin's universal computer $U$ is

$$x^* = \min\{u \in A^* \mid U(u, \lambda) = x\},$$

where the minimum is taken according to the quasi-lexicographical order on $A^*$ induced by $a_1 < a_2 < \cdots < a_Q$.
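To see the definitions at work on something finite, here is a small sketch. It assumes a hypothetical toy "computer" given as a lookup table from programs to outputs (the second argument is fixed to the empty string); a real universal computer is of course not a finite table, and all names below are illustrative.

```python
# Hypothetical toy computer: program u -> output C(u, lambda).
# The domain {"00", "01", "10", "110", "111"} is prefix-free.
TOY = {"00": "x", "01": "y", "10": "y", "110": "z", "111": "x"}

def quasi_lex_key(u):
    """Quasi-lexicographical order: shorter strings first, then alphabetical."""
    return (len(u), u)

def complexity(machine, x):
    """Length of a shortest program producing x (infinity if there is none)."""
    lengths = [len(u) for u, out in machine.items() if out == x]
    return min(lengths) if lengths else float("inf")  # min of the empty set is infinity

def canonical_program(machine, x):
    """Quasi-lexicographically first program producing x, or None."""
    programs = [u for u, out in machine.items() if out == x]
    return min(programs, key=quasi_lex_key) if programs else None
```

For the table above, the string "y" has two programs of length 2, and the canonical one is "01", the first in quasi-lexicographical order; the complexity of a string always equals the length of its canonical program whenever one exists.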

By definition, $x^*$ is the most compact way for $U$ to compute $x$: the computation $U(x^*, \lambda) = x$ produces $x$ by freeing $|x| - |x^*|$ bits of memory. What is the least thermodynamic cost of generating a string $x$ from the canonical program $x^*$? Zurek [454] has proven that the computation $U(x^*, \lambda) = x$ can be achieved reversibly, with no cost in terms of entropy increase. Let us note that a reversible computation, i.e. a computation which can be undone, can be performed only by using computer memory to keep track of the exact logical path from input to output (see further Calude and Casti [65]): thermodynamic irreversibility is inevitable only in the presence of logically irreversible operations.

Corollary 3.5. For every computer [...] □

Corollary 3.6. All sections $\psi_y$, $U_y$ of $\psi$ and $U$, respectively, are surjective.

Proof. Given the strings $y$ and $z$ we construct the computer $C(\lambda, y) = z$ and we use Definition 3.2. □

Lemma 3.7. For every $x \in A^*$:

$$x^* \text{ does exist and } x^* \ne \lambda, \tag{3.2}$$

$$x = U(x^*, \lambda), \tag{3.3}$$

$$H(x) = |x^*|. \tag{3.4}$$

Proof. The partial function $U_\lambda$ is surjective and $\lambda \notin \mathrm{dom}(U_\lambda)$. □

Definition 3.8. a) The Kolmogorov-Chaitin conditional complexity (for short, the conditional complexity) induced by the computer $\varphi$ is defined by

$$K_\varphi(x/v) = \min\{|y| \mid y \in A^*,\ \varphi(y, v) = x\}.$$

Put $K(x/v) = K_\psi(x/v)$.

b) The Chaitin self-delimiting conditional complexity or conditional program-size complexity induced by Chaitin's computer $C$ is defined by

$$H_C(x/v) = \min\{|y| \mid y \in A^*,\ C(y, v^*) = x\}.$$

Put $H(x/v) = H_U(x/v)$.

Corollary 3.9. For every computer [...] $+\ O(1)$,

$$H(x/v) \le H_C(x/v) + O(1). \tag{3.5}$$

In what follows, Corollary 3.5 and Corollary 3.9 will be referred to as the Invariance Theorem. Let us note that for every two universal computers $\psi, \omega$ there exists a constant $c$ such that for all $x$ and $y$ in $A^*$ one has

$$|K_\psi(x) - K_\omega(x)| \le c$$

and

$$|K_\psi(x/y) - K_\omega(x/y)| \le c.$$

The same result holds true for Chaitin complexities. Hence, both absolute and conditional complexities are essentially asymptotically independent of the chosen universal computers. However, here we may find the reason that many upper bounds on $K$ and $H$ hold true only to within an additive constant.

Corollary 3.10. For all strings $x, v \in A^*$,

$$0 < K(x/v) < \infty, \qquad 0 < H(x/v) < \infty. \tag{3.6}$$

Proof. Take $y = v$, $y = v^*$ in Corollary 3.6. □

We are going to express most of the following results in terms of Chaitin computers; it is plain that all subsequent results hold true also for computers in general.

Definition 3.11. Fix a computable bijection $\langle \cdot, \cdot \rangle : A^* \times A^* \to A^*$ and denote by $(\cdot)_i : A^* \to A^*$, $i = 1, 2$, its inverse components. Put

$$H_C(x, y) = H_C(\langle x, y \rangle), \qquad H(x, y) = H_U(\langle x, y \rangle).$$
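One possible computable bijection of this kind can be written down explicitly. The sketch below is an assumption made for illustration, not the pairing fixed in the text: it works over the binary alphabet, identifies strings with naturals through the quasi-lexicographical enumeration, and uses Cantor's pairing on the naturals.

```python
from math import isqrt

def string(n):
    """The n-th binary string in quasi-lexicographical order: '', '0', '1', '00', ..."""
    return bin(n + 1)[3:]            # drop the '0b1' prefix of the binary form of n + 1

def index(s):
    """Inverse of string: position of s in the quasi-lexicographical enumeration."""
    return int("1" + s, 2) - 1

def pair(x, y):
    """A computable bijection from pairs of strings to strings (Cantor pairing)."""
    a, b = index(x), index(y)
    return string((a + b) * (a + b + 1) // 2 + b)

def unpair(z):
    """The two inverse components of pair, returned together."""
    n = index(z)
    w = (isqrt(8 * n + 1) - 1) // 2  # w = a + b, recovered from the triangular numbers
    b = n - w * (w + 1) // 2
    return string(w - b), string(b)
```

Both directions are computable, so composing a Chaitin computer with `pair` or with the components of `unpair` yields again a Chaitin computer, which is how the pairing is used in the proofs that follow.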

Proposition 3.12. One has

$$H(x, y) = H(y, x) + O(1). \tag{3.7}$$

Proof. We define the computer

$$C(u, \lambda) = \langle (U(u, \lambda))_2, (U(u, \lambda))_1 \rangle.$$

In view of the Invariance Theorem one has

$$H(x, y) = H(\langle x, y \rangle) \le H_C(\langle x, y \rangle) + c = H(y, x) + c,$$

for some constant $c > 0$. □

Remark. If $f : A^* \to A^*$ is a computable bijection, then $H(f(x)) = H(x) + O(1)$. Indeed, we can use the Chaitin computer $C(u, \lambda) = f(U(u, \lambda))$. In the proof of Proposition 3.12 we have used the function

$$f(x) = \langle (x)_2, (x)_1 \rangle.$$

Lemma 3.13. The following two formulae are true:

$$H(x/x) = O(1), \tag{3.8}$$

$$H(string(H(x))/x) = O(1). \tag{3.9}$$

Proof. We have only to prove that conditional program-size complexity induced by a universal computer is bounded above. For (3.8) we use Chaitin's computer

$$C(\lambda, u) = U(u, \lambda), \quad u \in A^*.$$

By (3.3), $C(\lambda, x^*) = x$, so $H_C(x/x) = 0$. Formula (3.8) follows from the Invariance Theorem.

For the second formula we construct Chaitin's computer

$$D(\lambda, u) = string(|u|), \quad \text{if } U(u, \lambda) < \infty.$$

Again by (3.3),

$$D(\lambda, x^*) = string(|x^*|) = string(H(x)), \qquad H_D(string(H(x))/x) = 0,$$

and the required formula follows from the Invariance Theorem. □

Lemma 3.14. There exists a natural $c$ such that for all $x, y \in A^*$ one has

$$H(x) \le H(x, y) + c, \tag{3.10}$$

$$H(x/y) \le H(x) + c, \tag{3.11}$$

$$H(x, y) \le H(x) + H(y/x) + c, \tag{3.12}$$

$$H(x, y) \le H(x) + H(y) + c. \tag{3.13}$$

Proof. First we use the Chaitin computer $C(u, \lambda) = (U(u, \lambda))_1$:

$$H(x) \le H_C(x) + c \le \min\{|u| \mid u \in A^*,\ (U(u, \lambda))_1 = x,\ (U(u, \lambda))_2 = y\} + c = H(x, y) + c,$$

proving (3.10). For (3.11) we can use the Chaitin computer $D(u, v) = U(u, \lambda)$:

$$H(x/y) \le H_D(x/y) + c = H(x) + c.$$

To get (3.12) we construct a Chaitin computer $C$ satisfying the following property: if $U(u, x^*) = y$, then $C(x^*u, \lambda) = \langle x, y \rangle$. For the construction we use the c.e. (infinite) set $V = \mathrm{dom}(U_\lambda)$. The computation of $C$ on the input $(x, \lambda)$ proceeds as follows:

1. Generate all elements of $V$ until we find (if possible) a string $v \in V$ which is a prefix of $x$.

2. Compute $w \in A^*$ such that $x = vw$ and try to compute $U(w, v)$.

3. If $U(w, v) < \infty$, then put $C(x, \lambda) = \langle U(v, \lambda), U(w, v) \rangle$.

Clearly, $C$ is a p.c. function and $C(u, v) = \infty$, for $v \ne \lambda$. Assume that $x, y \in \mathrm{dom}(C_\lambda)$ and $x$ is a prefix of $y$. Then there exist strings

$$u_x, u_y \in \mathrm{dom}(U_\lambda), \quad w_x \in \mathrm{dom}(U_{u_x}), \quad w_y \in \mathrm{dom}(U_{u_y})$$

such that

$$x = u_x w_x, \qquad y = u_y w_y.$$

Since $u_x$ and $u_y$ are both prefixes of $y$ and they belong to the prefix-free set $\mathrm{dom}(U_\lambda)$, it follows that $u_x = u_y$. Moreover, $\{w_x, w_y\} \subset \mathrm{dom}(U_u)$, where $u = u_x = u_y$, and $uw_x$, $uw_y$ are prefixes of $y$; we deduce that $w_x = w_y$, i.e. $x = y$. So $C$ is a Chaitin computer.

Next we show that $C$ satisfies the condition cited above. Let $v = x^*u$ and assume that $U(u, x^*) = y$. Obviously, $x^* \in V$; during the first step of the computation of $C(x^*u, \lambda)$ we get $x^*$; next we compute $u$ and $U(u, x^*) = y < \infty$. According to the third step of the computation,

$$C(x^*u, \lambda) = \langle U(x^*, \lambda), U(u, x^*) \rangle = \langle x, y \rangle.$$

In the case $H(y/x) = |u|$ one has $U(u, x^*) = y$ and consequently there exists a natural $c$ such that

$$H(x, y) = H(\langle x, y \rangle) \le H_C(\langle x, y \rangle) + c \le |x^*u| + c = H(x) + H(y/x) + c.$$

As concerns (3.13),

$$H(x, y) \le H(x) + H(y/x) + c_1 \le H(x) + H(y) + c_2,$$

by (3.12) and (3.11). □

Proposition 3.15 (Sub-additivity). The following formula is true:

$$H(xy) \le H(x) + H(y) + O(1). \tag{3.14}$$

Proof. We use Chaitin's computer $C(w, \lambda) = (U(w, \lambda))_1 (U(w, \lambda))_2$ and the relation (3.13). □

Definition 3.16. The mutual algorithmic information of the strings $x$ and $y$, according to Chaitin's computer $C$, is

$$H_C(x : y) = H_C(y) - H_C(y/x).$$

Also, $H(x : y) = H_U(x : y)$.

Proposition 3.17. There is a constant $c > 0$ such that

$$H(x : y) \ge -c, \tag{3.15}$$

$$H(x : y) \le H(x) + H(y) - H(x, y) + c. \tag{3.16}$$

Proof. The inequality (3.15) follows from (3.11). By (3.12) we get

$$H(x : y) = H(y) - H(y/x) \le H(y) + H(x) - H(x, y) + c.$$

□

Lemma 3.18. The following formulae hold true:

$$H(x : x) = H(x) + O(1), \tag{3.17}$$

$$H(x : \lambda) = O(1), \tag{3.18}$$

$$H(\lambda : x) = O(1). \tag{3.19}$$

Proof. Formula (3.17) comes from (3.8). By (3.15), $H(x : \lambda) \ge -c$, for some positive constant $c$. Furthermore,

$$H(x : \lambda) \le H(x) + H(\lambda) - H(x, \lambda) + c_1 \le H(x) - H(x, \lambda) + c_2 \le c_3,$$

because $H(x, \lambda) = H_C(x)$, where $C(u, \lambda) = (U(u, \lambda))_1$.

Finally, using (3.15) and the Chaitin computer $D(v, \lambda) = (U(v, \lambda))_2$ we can prove (3.19). □

3.3 Algorithmic Properties of Complexities

We begin this section by considering the set of canonical programs

$$CP = \{x^* \mid x \in A^*\}$$

(see Definition 3.4c). We shall prove that $CP$ is an immune set, i.e. $CP$ is infinite and has no infinite c.e. subset.

Theorem 3.19. The set of canonical programs is immune.

Proof. The set $CP$ is clearly infinite, as the function $x \mapsto x^*$ is injective. We now proceed by contradiction, starting with the assumption that there exists an infinite c.e. set $S \subset CP$. Let $S$ be enumerated by the injective computable function $f : \mathbf{N} \to A^*$. We define the function $g : \mathbf{N} \to A^*$ by

$$g(0) = f(0), \qquad g(n + 1) = f(\min j\,[\,|f(j)| > n + 1\,]).$$

It is straightforward to check that $g$ is (total) computable, $S' = g(\mathbf{N}_+)$ is c.e. infinite, $S' \subset S$ and $|g(i)| > i$, for all $i > 0$. Using the prefix-free set in Example 2.5 we can construct a Chaitin computer $C$ such that for every $i \ge 2$, there exists a string $u$ such that $C(u, \lambda) = g(i)$ and

$$|u| \le \log i + 2 \log\log i \le 3 \log i.$$

By the Invariance Theorem we get a constant $c_1$ such that for all $i \in \mathbf{N}$,

$$H(g(i)) \le H_C(g(i)) + c_1 \le 3 \log i + c_1. \tag{3.20}$$

We continue with a result which is interesting in itself:

Intermediate Step. There exists a constant $c_2 \ge 0$ such that for every $x$ in $CP$, one has

$$H(x) \ge |x| - c_2. \tag{3.21}$$

We construct Chaitin's computer

$$D(u, \lambda) = U(U(u, \lambda), \lambda)$$

and pick the constant $c_2$ coming from the Invariance Theorem (applied to $U$ and $D$). Taking $x = y^*$, $z = x^*$, we have

$$D(z, \lambda) = U(U(z, \lambda), \lambda) = U(U(x^*, \lambda), \lambda) = U(x, \lambda) = U(y^*, \lambda) = y,$$

so

$$H_D(y) \le H(x),$$

$$|x| = |y^*| = H(y) \le H_D(y) + c_2 \le H(x) + c_2.$$

For $i \ge 1$, if $g(i) \in CP$, then $|g(i)| > i$, so by (3.20) and (3.21)

$$i - c_2 < |g(i)| - c_2 \le H(g(i)) \le 3 \log i + c_1,$$

and consequently only a finite number of elements in $S'$ can be in $CP$. □

Remark. In view of (3.21), the canonical programs have high complexity. We shall elaborate more on this idea in Chapter 5.

Corollary 3.20. The function $f : A^* \to A^*$, $f(x) = x^*$ is not computable.

Proof. The function $f$ is injective and its range is exactly $CP$. □

Theorem 3.21. The program-size complexity $H(x)$ is semi-computable from above, but not computable.

Proof. We have to prove that the "approximation from above" of the graph of $H(x)$, i.e. the set $\{(x, n) \mid x \in A^*,\ n \in \mathbf{N},\ H(x) < n\}$, is c.e. This is easy since $H(x) < n$ iff there exist $y \in A^*$ and $t \in \mathbf{N}$ such that $|y| < n$ and $U(y, \lambda) = x$ in at most $t$ steps. For the second part of the theorem we prove a bit more, namely:

Claim. There is no p.c. function [...]

For infinitely many $i > 0$, [...]

(Recall that $C(a_1^i a_2) = f(a_1^i a_2)$ is a Chaitin computer.) Accordingly, in view of the Invariance Theorem, for infinitely many $i > 0$, we have [...]

This yields a contradiction. □
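The "approximation from above" used in the first part of the proof can be mimicked on a finite toy. The sketch below assumes a hypothetical machine described by a table giving, for each program, its running time and its output; for a real universal computer no such table is available and one can only run all programs in parallel for more and more steps.

```python
# Hypothetical toy machine: program -> (number of steps until halting, output).
# The domain {"00", "010", "011", "1"} is prefix-free.
TOY = {"00": (50, "x"), "010": (3, "x"), "011": (7, "y"), "1": (1000, "y")}

def approx_H(machine, x, t):
    """Length of a shortest program seen to output x within t steps.

    Returns infinity if no program has produced x yet. As t grows the value
    can only decrease, and it eventually reaches the true complexity.
    """
    lengths = [len(u) for u, (steps, out) in machine.items()
               if out == x and steps <= t]
    return min(lengths, default=float("inf"))

# Successive approximations for "x": nothing yet, then length 3, finally length 2.
history = [approx_H(TOY, "x", t) for t in (2, 3, 49, 50)]
```

The approximations are non-increasing in the time bound, which is exactly semi-computability from above. What the toy hides is the obstacle behind the second part of the theorem: in general there is no way to know at which stage the current value has become final.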

3.4 Quantitative Estimates

In this section we derive some elementary estimates for program-size complexities. Similar results can be obtained for the conditional program-size complexities. Sharper estimates, requiring more involved proofs, will be presented later.

Theorem 3.22. There exists a natural constant $c > 0$ such that for all $x \in A^+$,

$$K(x) \le |x| + c, \qquad H(x) \le |x| + 2 \log |x| + c. \tag{3.22}$$

Proof. For $K$ take the computer [...]

Lemma 3.23. For every Chaitin computer $C$ and each natural $n$ one has

$$\#\{x \in A^* \mid H_C(x) = n\} \le Q^n. \tag{3.23}$$

Proposition 3.24. Let $E \subset A^*$ be a set having $m > 0$ elements and $\varepsilon > 0$. Then, for every Chaitin computer $C$,

$$\#\{x \in E \mid H_C(x) \ge \log_Q m - \varepsilon\} > m\left(1 - \frac{Q^{1-\varepsilon}}{Q - 1}\right). \tag{3.24}$$

Proof. A simple computation produces the required inequality (using (3.23)):

$$\#\{x \in E \mid H_C(x) \ge \log_Q m - \varepsilon\} = m - \#\{x \in E \mid H_C(x) < \log_Q m - \varepsilon\}$$

$$\ge m - \#\{x \in A^* \mid H_C(x) < \lfloor \log_Q m - \varepsilon \rfloor + 1\}$$

$$= m - \sum_{0 \le i \le \lfloor \log_Q m - \varepsilon \rfloor} \#\{x \in A^* \mid H_C(x) = i\}$$

$$\ge m - \frac{Q^{\lfloor \log_Q m - \varepsilon \rfloor + 1} - 1}{Q - 1} > m\left(1 - \frac{Q^{1-\varepsilon}}{Q - 1}\right).$$

□

Corollary 3.25. For every Chaitin computer $C$, natural $n$ and positive real $\varepsilon$, one has

$$\#\{x \in A^n \mid H_C(x) \ge n - \varepsilon\} > Q^n\left(1 - \frac{Q^{1-\varepsilon}}{Q - 1}\right). \tag{3.25}$$

Proof. Take $E = A^n$ in Proposition 3.24. □
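The size of the bound in (3.25) is easy to tabulate. The following sketch only evaluates the formulas; it assumes integer values of the slack for exact arithmetic, and the function names are illustrative.

```python
from fractions import Fraction

def incompressible_fraction_bound(eps, q=2):
    """Lower bound 1 - q**(1 - eps) / (q - 1) on the fraction of strings of
    length n whose complexity is at least n - eps (eps an integer here)."""
    return 1 - Fraction(q) ** (1 - eps) / (q - 1)

def max_strings_below(k, q=2):
    """At most (q**k - 1) / (q - 1) strings can have complexity below k,
    since there are only that many candidate programs of length < k."""
    return (q ** k - 1) // (q - 1)

# Binary strings of length n = 20 with slack eps = 10:
n, eps = 20, 10
bound = incompressible_fraction_bound(eps)               # 511/512
direct = 1 - Fraction(max_strings_below(n - eps), 2 ** n)  # from plain counting
```

With a slack of 10 bits, more than 511 out of every 512 binary strings of a given length cannot be compressed by more than 10 bits, whatever Chaitin computer is used; the plain count of short programs gives an even larger fraction.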

Proposition 3.26. If $F : A^* \to \mathbf{N}$ is an arbitrary function such that $H(x) \le F(x) + O(1)$, then

$$\#\{x \in A^* \mid F(x) < m\} < Q^{m + O(1)}.$$

Proof. Clearly, $\{x \in A^* \mid F(x) < m\} \subset \{x \in A^* \mid H(x) < m + c\}$, for some constant $c > 0$. Consequently,

$$\log_Q \#\{x \in A^* \mid F(x) < m\} \le \log_Q \#\{x \in A^* \mid H(x) < m + c\} \le \log_Q \frac{Q^{m+c} - 1}{Q - 1} < m + c.$$

□

Proposition 3.27. Let $F : A^* \to \mathbf{N}$ be a function semi-computable from above. If there exists a constant $q > 0$ such that for all natural $m > 0$

$$\#\{x \in A^* \mid F(x) < m\} < \log m + q,$$

then $H(x) \le F(x) + O(1)$.

Proof. Let $\{(x_1, m_1), (x_2, m_2), \ldots\}$ be an injective computable enumeration of the c.e. set $\{(x, m) \in A^* \times \mathbf{N} \mid F(x) < m\}$. We construct Chaitin's computer $C$ by the following algorithm:

1. All strings $y \in A^*$ are available.

2. For $i = 1, 2, \ldots$ generate $(x_i, m_i)$, choose the first available $y_i \in A^{\log m_i + q}$ and put $C(d(y_i), \lambda) = x_i$.

3. The string $y_i$ is no longer available.

Recall that $d$ comes from Example 2.5. In view of the hypothesis, we have "enough" elements to run every step, so in case $F(x) < m$ there exists $y \in A^{\log m + q}$ with $C(d(y), \lambda) = x$, i.e. $H_C(x) \le \log m + 2 \log\log m + O(1)$. In particular, $F(x) < F(x) + 1$, so

$$H_C(x) \le \log(F(x) + 1) + 2 \log\log(F(x) + 1) + O(1) \le F(x) + O(1).$$

Finally, we use the Invariance Theorem. □

3.5 Halting Probabilities

It is well known that the halting problem for an arbitrary (Chaitin) computer is unsolvable (see Section 9.2). Following Chaitin, we switch the point of view from a deterministic one to a probabilistic one. To this end we define - for a given Chaitin computer - the halting probabilities.

Definition 3.28. Given a Chaitin computer $C$ we define the following "probabilities":

$$P_C(x) = \sum_{\{u \in A^* \mid C(u, \lambda) = x\}} Q^{-|u|},$$

$$P_C(x/y) = \sum_{\{u \in A^* \mid C(u, y^*) = x\}} Q^{-|u|}.$$

In the case $C = U$ we put, using the common convention, $P(x) = P_U(x)$, $P(x/y) = P_U(x/y)$. We say that $P_C(x)$ is the absolute algorithmic probability of Chaitin's computer $C$ with output $x$ (it measures the probability that $C$ produces $x$); $P_C(x/y)$ is the conditional algorithmic probability.

The above names are not "metaphorical". Indeed, $P$ is just a probability on the space of all sequences with elements in $A$, i.e. $A^\omega$, endowed with the uniform distribution. See Section 1.4 for more details and specific notation. As a consequence, for every Chaitin computer $C$, $0 \le P_C(x) \le 1$ and $0 \le P_C(x/y) \le 1$, for all strings $x, y$. Actually, we can prove a bit more.

Lemma 3.29. For every Chaitin computer $C$ and all strings $x$ and $y$,

$$\Omega_C = \sum_{x \in A^*} P_C(x) \le 1, \tag{3.26}$$

$$\sum_{x \in A^*} P_C(x/y) \le 1. \tag{3.27}$$

Proof. For (3.26) we can write

$$\Omega_C = \sum_{x \in A^*} P_C(x) = \sum_{x \in A^*} \sum_{\{u \in A^* \mid C(u, \lambda) = x\}} Q^{-|u|} = \sum_{u \in \mathrm{dom}(C_\lambda)} Q^{-|u|} \le 1,$$

the "series" still being a probability. The same argument works for (3.27). □

Remark. The number $\Omega_C = \sum_{x \in A^*} P_C(x)$ expresses the (absolute) halting probability of Chaitin's computer $C$.

Lemma 3.30. For every Chaitin computer $C$ and all strings $x, y$,

$$P_C(x) \ge Q^{-H_C(x)}, \tag{3.28}$$

$$P_C(x/y) \ge Q^{-H_C(x/y)}. \tag{3.29}$$

Proof. One has

$$P_C(x) = \sum_{\{u \in A^* \mid C(u, \lambda) = x\}} Q^{-|u|}$$

and $H_C(x) = |u|$ for a shortest $u$ with $C(u, \lambda) = x$, so the term $Q^{-H_C(x)}$ occurs in the sum. □
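The algorithmic probabilities can be computed exactly when the computer is a finite table. The sketch below assumes such a hypothetical toy Chaitin computer over the binary alphabet (second argument fixed to the empty string); the names `P`, `omega` and `H` mirror the notation of the text but are otherwise illustrative.

```python
from fractions import Fraction

# Hypothetical toy Chaitin computer: program u -> output C(u, lambda).
# The domain {"00", "010", "011", "10"} is prefix-free.
TOY = {"00": "x", "010": "x", "011": "y", "10": "z"}

def P(machine, x, q=2):
    """Algorithmic probability of x: sum of q**(-|u|) over programs u producing x."""
    return sum(Fraction(1, q ** len(u)) for u, out in machine.items() if out == x)

def omega(machine, q=2):
    """Halting probability: sum of q**(-|u|) over the whole (prefix-free) domain."""
    return sum(Fraction(1, q ** len(u)) for u in machine)

def H(machine, x):
    """Program-size complexity of x relative to the toy machine (None if undefined)."""
    return min((len(u) for u, out in machine.items() if out == x), default=None)
```

For this table the probability of "x" is 1/4 + 1/8 = 3/8, which is at least 2 to the power minus its complexity (1/4), and the halting probability is 3/4. The sum stays at most 1 precisely because the domain is prefix-free, as Kraft's inequality requires.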

In the case of the universal Chaitin computer, neither the absolute nor the conditional algorithmic probability can be 0 or 1.

Scholium 3.31. For all $x, y \in A^*$,

$$0 < P(x) < 1, \tag{3.30}$$

$$0 < P(x/y) < 1. \tag{3.31}$$

Proof. In view of Lemma 3.30, with $C = U$, $P(x) \ge Q^{-H(x)} = Q^{-|x^*|} > 0$. Using (3.26), $\sum_{x \in A^*} P(x) \le 1$ and the fact that each term of the series is non-zero we deduce that $P(x) < 1$. A similar reasoning works for (3.31). □

Proposition 3.32. For every Chaitin computer $C$ and all naturals $n, m \ge 1$, the following four formulae are true:

$$\#\{x \in A^* \mid H_C(x) < m\} \le (Q^m - 1)/(Q - 1), \tag{3.32}$$

$$\#\{x \in A^* \mid H_C(x/y) < m\} \le (Q^m - 1)/(Q - 1), \tag{3.33}$$

$$\#\{x \in A^* \mid P_C(x) > n/m\} < m/n, \tag{3.34}$$

$$\#\{x \in A^* \mid P_C(x/y) > n/m\} < m/n. \tag{3.35}$$

Proof. For (3.32) we use Lemma 3.23. For (3.34) let $S = \{x \in A^* \mid P_C(x) > \frac{n}{m}\}$ and assume, by absurdity, that $\#S \ge \frac{m}{n}$. Then, by (3.26):

$$1 \ge \sum_{x \in A^*} P_C(x) \ge \sum_{x \in S} P_C(x) > \frac{n}{m}\,\#S \ge 1,$$

a contradiction. □

3.6 Exercises and Problems

1. Show that every prefix-free c.e. set of strings is the domain of some Chaitin computer.

2. Show that there exists a natural $c$ such that for all $x \in A^*$, $H(x^*/x) \le c$, and $H(x/x) \le c$.

3. Consider Ackermann-Péter's computable and non-primitive recursive function $a : \mathbf{N} \times \mathbf{N} \to \mathbf{N}$,

$$a(0, x) = x + 1,$$
$$a(n + 1, 0) = a(n, 1),$$
$$a(n + 1, x + 1) = a(n, a(n + 1, x)).$$

Show that for every unary primitive recursive function $f$ there exists a natural constant $c$ (depending upon $f$ and $a$) such that $f(x) < a(c, x)$, for all $x \ge c$; see Calude [51] for other properties of $a$. For every natural $n$ define the string $s(n) = 1^{a(n,n)}$. a) For every $n \in \mathbf{N}$, $K(s(n)) = K(string(n)) + O(1)$. b) There is no primitive recursive function $f : \mathbf{N} \to \mathbf{N}$ such that $f(K(s(n))) \ge a(n, n)$.

4. Fix a letter $a \in A$. Show that there exists a constant $c > 0$ such that $K(a^n/n) \le c$, for all natural $n$, but $K(a^n) \ge \log n - c$, for infinitely many $n$.

5. Show that there exists a natural $c$ such that for all $x \in CP$, $H(x) < |x| + c$. (Hint: use Chaitin's computer $C(u, \lambda) = u$, $u \in \mathrm{dom}(U_\lambda)$.)

6. (Chaitin) Show that the complexity of a LISP S-expression is bounded from above by its size + 3.

7. Show that the conditional program-size complexity is semi-computable from above but not computable.

8. (Chaitin) Show that $H(x) \le |x| + \log |x| + 2 \log\log |x| + c$; furthermore, one can indefinitely improve the upper bound (3.22). (Hint: use Chaitin's computer $C(bin(|bin(|x|)|)x, \lambda) = x$.)

9. The function $H(x/y)$ is not computable; is it semi-computable from above?

10. If $y \in A^*$, $m \in \mathbf{N}$ and $S \subset A^*$ is a prefix-free set such that $\sum_{x \in S} Q^{-|x|} \ge Q^{-m}/(Q - 1)$, then there exists an element $x \in S$ such that $H(x/y) \ge |x| - m$.

11. Show that the halting set $K = \{x \in A^* \mid \varphi_x(x) < \infty\}$ and the self-delimiting halting set $K^S = \{x \in A^* \mid C_x(x) < \infty\}$ ($(C_x)$ is a c.e. enumeration of all Chaitin computers) are readily computed from one another, i.e. there exists a computable bijection $F : A^* \to A^*$ such that $F(K) = K^S$.

12. (Levin) Show that the following statements are equivalent: a) The function $F : A^* \to \mathbf{N}$ is a function semi-computable from above and $K \overset{+}{\prec} F$, b) $\#\{x \in A^* \mid F(x) < m\} < Q^{m + O(1)}$.

13. (Chaitin) A sequence $\mathbf{x} \in A^\omega$ is computable iff $K(\mathbf{x}(n)) \overset{+}{\prec} K(string(n))$. Show that the equivalence is no longer true in case the formula on the right-hand side is valid only for infinitely many $n$.

14. Show that $K \overset{+}{\prec} H \overset{+}{\prec} K + 2 \log K$.

15. Show that $H \overset{+}{\prec} K + \log K + 2 \log\log K$; one can indefinitely improve this upper bound.

16. Let $f : \mathbf{N} \to A^*$ be a computable function such that $|f(n)| = n$, for all $n \ge 0$. Then, $H(x/f(|x|)) \le |x| + O(1)$.

17. Show that $K(string(n)) \le \log_Q(n) + O(1)$.

18. Show that there exist infinitely many $n$ such that $K(string(n)) \ge \log_Q(n)$.

19. Show that if $m < n$, then $m + K(string(m)) < n + K(string(n))$.

20. (Kamae) Prove that for each natural $m$ there is a string $x$ such that for all but finitely many strings $y$ one has $K(x) - K(x/y) \ge m$.

21. Show that the above statement is false for $H(x/y)$.

22. (Chaitin) An information content measure is a partial function $H : \mathbf{N} \to \mathbf{N}$ which is semi-computable from above and $\sum_{n \ge 0} 2^{-H(n)} \le 1$. In case $H(n) = \infty$, as usual, $2^{-\infty} = 0$ and this term contributes zero to the above sum. Prove:

a) The Invariance Theorem remains true for the information content measure.

b) For all natural $n$,

$$H(n) \le 2 \log n + c,$$
$$H(n) \le \log n + 2 \log\log n + c',$$
$$H(n) \le \log n + \log\log n + 2 \log\log\log n + c'', \ \ldots$$

c) For infinitely many natural $n$,

$$H(n) > \log n,$$
$$H(n) > \log n + \log\log n,$$
$$H(n) > \log n + \log\log n + \log\log\log n, \ \ldots$$

23. Reformulate the results in this chapter in terms of information content measure.

24. (Shen) Show that for all strings $x, y, z$ of length less than $n \ge 1$ one has

$$2H(x, y, z) \le H(x, y) + H(x, z) + H(y, z) + O(1).$$

3.7 History of Results

The theory of program-size complexity was initiated independently by Solomonoff [373], Kolmogorov [259] and Chaitin [110]. Chaitin refers to the Kolmogorov-Chaitin complexity as blank-endmarker complexity. The importance of the self-delimiting property was discovered, again independently, by Schnorr [361], Levin [278] and Chaitin [114]; however, the theory of self-delimiting complexity was essentially developed by Chaitin (see [122]). Related results may be found in Fine [197], Gács [199], Katseff and Sipser [249], Meyer [313]. The proof of Theorem 3.19 comes from Gewirtz [208]. The halting probabilities have been introduced and studied by Chaitin [114]; see also Willis [435]. For more historical facts see Chaitin [122, 131, 132, 134], Li and Vitányi [282], Uspensky [407].

Overviews on program-size complexity can be found in Zvonkin and Levin [455], Gewirtz [208], Chaitin [118, 121, 122, 125], Schnorr [361], Martin-Löf [301], Cover and Thomas [152], Gács [203], Kolmogorov and Uspensky [261], Calude [51], Li and Vitányi [280, 282], Uspensky [407, 408], Denker, Woyczynski and Ycart [173], Gruska [217], Delahaye [164], Ferbus-Zanda and Grigorieff [195], Sipser [368], Yang and Shen [445, 446].

Chapter 4

Computably Enumerable Instantaneous Codes

Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.
Antoine de Saint-Exupéry

In this chapter - which is basically technical - we present two main tools used to design Chaitin computers and consequently to establish upper bounds: the extension of the Kraft condition (see Theorem 2.8) to arbitrary c.e. sets and relativized computation. New formulae, closely analogous to expressions in classical information theory, are derived.

4.1

The Kraft-Chaitin Theorem

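We devote this section to the proof of the following important result.

Theorem 4.1. Let $\varphi : \mathbf{N}_+ \overset{o}{\to} \mathbf{N}$ be a p.c. function having as domain an initial segment of $\mathbf{N}_+$. The following two statements are equivalent:

(1) We can effectively construct an injective p.c. function $\theta : \mathrm{dom}(\varphi) \to A^*$ such that

a) for every $n \in \mathrm{dom}(\varphi)$, $|\theta(n)| = \varphi(n)$,

b) $\mathrm{range}(\theta)$ is prefix-free.

(2) One has
\[ \sum_{i \in \mathrm{dom}(\varphi)} Q^{-\varphi(i)} \le 1. \qquad (4.1) \]

Before proceeding to the proof let us state some remarks. An initial segment of $\mathbf{N}_+$ is a finite set of the form $\{1, 2, \ldots, n\}$ or $\mathbf{N}_+$. In (4.1) we can write, equivalently, $\sum_{i \ge 1} Q^{-\varphi(i)} \le 1$, with the convention that $Q^{-\varphi(i)} = 0$ whenever $\varphi(i) = \infty$, i.e. whenever $i \notin \mathrm{dom}(\varphi)$.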
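Proof of Theorem 4.1. The direct implication is trivial: by (1,b) (see also Exercise 2.3.5) one has
\[ \sum_{i \in \mathrm{dom}(\varphi)} Q^{-\varphi(i)} = \sum_{x \in \mathrm{range}(\theta)} Q^{-|x|} \le 1. \]

So we focus our attention on the converse implication. We will construct three sequences $(M_n)_{n \in \mathrm{dom}(\varphi)}$, $(m_n)_{n \in \mathrm{dom}(\varphi)}$, $(\mu_n)_{n \in \mathrm{dom}(\varphi)}$ of finite subsets of $A^*$, non-negative integers and strings, respectively, such that
\[ m_n = \max\{|x| \mid x \in M_n,\ |x| \le \varphi(n)\}, \qquad \mu_n = \min(M_n \cap A^{m_n}), \]
where min is taken according to the lexicographical order. The sets $M_n$ are constructed as follows: $M_0 = \{\lambda\}$, and if $M_1, \ldots, M_n$ have been constructed and $\varphi(n+1) \ne \infty$, then
\[ M_{n+1} = (M_n \setminus \{\mu_n\}) \cup T_{n+1}, \]
where
\[ T_{n+1} = \{\mu_n a_1^{j} a_p \mid 0 \le j \le \varphi(n) - m_n - 1,\ 2 \le p \le Q\}. \]
Note that $T_{n+1} = \emptyset$ if $\varphi(n) = m_n$.

Finally, we put
\[ \theta(n) = \mu_n a_1^{\varphi(n) - m_n}. \]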

The proof consists of checking, by induction on $n \ge 0$, the following five conditions:

A) $\sum_{x \in M_n} Q^{-|x|} = 1 - \sum_{i=0}^{n-1} Q^{-\varphi(i)}$.

B) For all $p \ge 0$, $\#(A^p \cap M_n) \le Q - 1$.

C) The string $\mu_n$ does exist.

D) The sets $M_n$ and $\{\theta(0), \theta(1), \ldots, \theta(n-1)\}$ are disjoint.

E) The set $M_n \cup \{\theta(0), \theta(1), \ldots, \theta(n-1)\}$ is prefix-free.

The induction basis is very simple: $M_0 = \{\lambda\}$, so $m_0 = 0$, $\theta(0) = a_1^{\varphi(0)}$. Consequently,
\[ \sum_{x \in M_0} Q^{-|x|} = 1 = 1 - \sum_{i=0}^{-1} Q^{-\varphi(i)}, \]
the last sum being empty. For all $p \ge 1$, $\#(A^p \cap M_0) = 0 \le Q - 1$. Finally, $\mu_0 = \lambda$ and the last two conditions are vacuously true.

We assume now that conditions A) to E) are true for some fixed $n \ge 0$ and prove that they remain true for $n + 1$. We start by proving the formula (4.2). In fact, $M_n \cap T_{n+1} = \emptyset$. Otherwise, $\emptyset \ne M_n \cap T_{n+1} \subset M_n$ and $M_n$ is prefix-free. So, for some $0 \le j \le \varphi(n) - m_n - 1$ and $2 \le p \le Q$, $\mu_n a_1^{j} a_p \in M_n \cap T_{n+1} \subset M_n$. As $\mu_n \in M_n$, it follows that $M_n$ is no longer prefix-free, a contradiction.

We continue by checking the validity of conditions A) - E). For A), using (4.2), the induction hypothesis and the construction of $M_{n+1}$, we have

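\[
\begin{aligned}
\sum_{x \in M_{n+1}} Q^{-|x|} &= \sum_{x \in M_n \setminus \{\mu_n\}} Q^{-|x|} + \sum_{x \in T_{n+1}} Q^{-|x|} \\
&= \sum_{x \in M_n} Q^{-|x|} - Q^{-m_n} + (Q-1) \sum_{j=0}^{\varphi(n)-m_n-1} Q^{-(m_n+j+1)} \\
&= 1 - \sum_{i=0}^{n-1} Q^{-\varphi(i)} - Q^{-m_n} + (Q-1)\, Q^{-m_n-1} \sum_{j=0}^{\varphi(n)-m_n-1} Q^{-j} \\
&= 1 - \sum_{i=0}^{n} Q^{-\varphi(i)},
\end{aligned}
\]
provided $m_n \le \varphi(n) - 1$, and
\[ \sum_{x \in M_{n+1}} Q^{-|x|} = 1 - \sum_{i=0}^{n-1} Q^{-\varphi(i)} - Q^{-m_n} = 1 - \sum_{i=0}^{n} Q^{-\varphi(i)}, \]
in case $m_n = \varphi(n)$.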
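For B), observe that every string in $T_{n+1}$ has length between $m_n + 1$ and $\varphi(n)$. If $k \le m_n$ or $k > \varphi(n)$, then $M_{n+1} \cap A^k \subset M_n \cap A^k$, so in all these situations B) is true by virtue of the induction hypothesis.

In case
\[ m_n + 1 \le k \le \varphi(n) \qquad (4.3) \]
we have
\[ M_n \cap A^k = \emptyset. \qquad (4.4) \]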

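Indeed, if $x \in A^k$ and $k$ satisfies (4.3), then $x \notin M_n$. For such a $k$,
\[
\begin{aligned}
M_{n+1} \cap A^k &= ((M_n \setminus \{\mu_n\}) \cup T_{n+1}) \cap A^k \\
&= ((M_n \setminus \{\mu_n\}) \cap A^k) \cup (T_{n+1} \cap A^k) \\
&= (M_n \cap A^k) \cup (T_{n+1} \cap A^k) \\
&= T_{n+1} \cap A^k,
\end{aligned}
\]
the last equality holding in view of (4.4). Hence,
\[ \#(M_{n+1} \cap A^k) = \#(T_{n+1} \cap A^k) = Q - 1. \]

For C), $\mu_{n+1}$ does exist if in $M_{n+1}$ we can find at least one string of length less than or equal to $\varphi(n+1)$. To prove this we assume, for the sake of a contradiction, that every string in $M_{n+1}$ has length greater than $\varphi(n+1)$. We have
\[ \sum_{x \in M_{n+1}} Q^{-|x|} = \sum_{p=\varphi(n+1)+1}^{\infty} \#(M_{n+1} \cap A^p)\, Q^{-p} < (Q-1) \sum_{p=\varphi(n+1)+1}^{\infty} Q^{-p} = Q^{-\varphi(n+1)}, \]
as $M_{n+1} \cap A^p = \emptyset$, for almost all $p \in \mathbf{N}$, and by B), $\#(M_{n+1} \cap A^p) \le Q - 1$. From A) we get
\[ 1 - \sum_{i=0}^{n} Q^{-\varphi(i)} = \sum_{x \in M_{n+1}} Q^{-|x|} < Q^{-\varphi(n+1)}, \]
which contradicts the hypothesis (4.1), thus concluding the existence of $\mu_{n+1}$.

In proving D) we write $M_{n+1} \cap \{\theta(0), \theta(1), \ldots, \theta(n)\}$ as a union of four sets:
\[ (M_n \setminus \{\mu_n\}) \cap \{\theta(0), \ldots, \theta(n-1)\}, \quad T_{n+1} \cap \{\theta(0), \ldots, \theta(n-1)\}, \quad (M_n \setminus \{\mu_n\}) \cap \{\theta(n)\}, \quad T_{n+1} \cap \{\theta(n)\}, \]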

each of which will be shown to be empty. Indeed, the first set is empty by virtue of the induction hypothesis. For the second set we note that in case $\theta(i) \in T_{n+1}$ (for some $0 \le i \le n-1$), then $\theta(i) = \mu_n a_1^{j} a_p$, for some $0 \le j \le \varphi(n) - m_n - 1$ and $2 \le p \le Q$. So, $\mu_n$

E C

In -

For E) we write
\[ M_{n+1} \cup \{\theta(0), \theta(1), \ldots, \theta(n)\} = (M_n \setminus \{\mu_n\}) \cup \{\theta(0), \theta(1), \ldots, \theta(n-1)\} \cup T_{n+1} \cup \{\theta(n)\}. \]

The set $M_n \cup \{\theta(0), \theta(1), \ldots, \theta(n-1)\}$ is prefix-free by induction hypothesis; $T_{n+1} \cup \{\theta(n)\}$ is prefix-free by construction. To finish, four cases should be analysed:

• The set $(M_n \setminus \{\mu_n\}) \cup \{\theta(n)\}$ is prefix-free as $\mu_n$

In is prefix-free

o if x

In,

o if O(i)

0 (the case t = 0 implies O(i)

0 (the case t = 0 is impossible), so O(i)

The injectivity of $\theta$ follows directly from E). Hence, the theorem has been proved. □

Theorem 4.2 (Kraft-Chaitin). Let $f : \mathbf{N}_+ \overset{o}{\to} A^* \times \mathbf{N}_+$ be a p.c. function whose domain is an initial segment of $\mathbf{N}_+$. For every $k \in \mathrm{dom}(f)$ put $f(k) = (x_k, n_k)$. If
\[ \sum_{k=1}^{\infty} Q^{-n_k} \le 1, \]
then we can effectively construct a Chaitin computer $C$ such that for every $k \in \mathrm{dom}(f)$ there exists a string $u_k$ of length $n_k$ with $C(u_k, \lambda) = x_k$. Furthermore, for every string $v$,
\[ P_C(v) = \sum_{x_k = v} Q^{-n_k}, \]
and
\[ H_C(v) = \min\{n_k \mid x_k = v\}. \]

Proof. The p.c. function $\varphi : \mathrm{dom}(f) \to \mathbf{N}_+$ given by $\varphi(k) = n_k$ does satisfy the hypothesis of Theorem 4.1. So, we may define the Chaitin computer $C$ by $C(\theta(k), \lambda) = x_k$, for every $k \in \mathrm{dom}(f)$ ($\theta$ comes from Theorem 4.1). It is straightforward to check that $C$ has all the desired properties. □

Comments. a) According to Theorem 4.1 we only have to make sure that the lengths satisfy the inequality (4.1) to get automatically the prefix-free set.

b) Examples of functions satisfying the inequality (4.1) of Theorem 4.1 can be found in Exercise 4.6.3.

c) The algorithm described in the proof of Theorem 4.1 produces the same code-strings as Chaitin's original algorithm [121]:

1. Put $\theta(1) = a_1^{\varphi(1)}$.

2. If $\theta(2), \ldots, \theta(n)$ have been constructed and $\varphi(n+1) \ne \infty$, then put
\[ \theta(n+1) = \min\{x \in A^{\varphi(n+1)} \mid x \not\subset_p \theta(i),\ \theta(i) \not\subset_p x,\ \text{for all } 1 \le i \le n\}, \]
where the minimum is taken according to the quasi-lexicographical order.
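The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `allocate` is hypothetical, the alphabet is passed as a string of sorted symbols, and the search for the minimum is done by brute force over all strings of the requested length, which is adequate only for small lengths.

```python
from itertools import product

def is_prefix(u, v):
    """True if u is a prefix of v."""
    return v.startswith(u)

def allocate(lengths, alphabet="01"):
    """Serve length requests one by one, always taking the smallest free string.

    lengths: requested code-string lengths, in the order they arrive.
    Returns the list of code-strings; raises ValueError if a request
    cannot be served without breaking prefix-freeness.
    """
    codes = []
    for n in lengths:
        # product() enumerates strings of length n in lexicographic order,
        # which coincides with the quasi-lexicographical order on A^n.
        for letters in product(alphabet, repeat=n):
            x = "".join(letters)
            if all(not is_prefix(x, c) and not is_prefix(c, x) for c in codes):
                codes.append(x)
                break
        else:
            raise ValueError(f"no free string of length {n}")
    return codes

print(allocate([2, 1, 3, 3]))
```

For the requests 2, 1, 3, 3 over the binary alphabet the Kraft sum is exactly 1 and every request is served; a third request of length 1 after two such requests would raise the error.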

d) Following Chaitin [121] the above problem may be thought of as a storage allocation problem. We have a unit of storage and requests of storage of type $Q^{-n}$ of the unit. Storage is never freed. The allocation algorithm is able to service a series of (possibly infinite) storage allocation requests as long as the total storage requested is not greater than the unit. See Exercise 4.6.4 for a geometric interpretation.

Corollary 4.3. Let $w : \mathbf{N}_+ \overset{o}{\to} \mathbf{Q}$ be a p.c. function having as domain an initial segment of $\mathbf{N}_+$. If
\[ \sum_{i \ge 1} w(i) \le 1, \]
then we can effectively construct a p.c. function $\theta : \mathrm{dom}(w) \to A^*$ having the following two properties:

a) $|\theta(i)| \le -\log_Q w(i) + 1$,

b) $\theta(\mathrm{dom}(w))$ is prefix-free.

Proof. To $w$ we associate the p.c. function $\varphi : \mathrm{dom}(w) \to \mathbf{N}$ defined by
\[ \varphi(i) = \min\{k \in \mathbf{N} \mid Q^{-k} \le w(i)\}. \]
It is plain that
\[ \sum_{i \in \mathrm{dom}(w)} Q^{-\varphi(i)} \le \sum_{i \in \mathrm{dom}(w)} w(i) \le 1, \]
so Theorem 4.1 applies. We get a p.c. function $\theta$ having a prefix-free range and $|\theta(i)| = \varphi(i) \le -\log_Q w(i) + 1$. □
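As a small numerical illustration of the proof, the sketch below computes the lengths $\varphi(i)$ for a few rational weights and checks the Kraft sum. The helper name `length_for_weight` and the sample weights are assumptions made for the example; exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def length_for_weight(w, Q=2):
    """Smallest k >= 0 with Q**(-k) <= w, for a rational weight 0 < w <= 1."""
    k = 0
    while Fraction(1, Q**k) > w:
        k += 1
    return k

# Hypothetical weights with total 29/30 <= 1.
weights = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 5), Fraction(1, 10)]
lengths = [length_for_weight(w) for w in weights]
kraft = sum(Fraction(1, 2**k) for k in lengths)

print(lengths, kraft)
```

Each computed length $k$ satisfies $Q^{-k} \le w(i) < Q^{-(k-1)}$, which is the exact form of the bound in a).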
4.2

Relativized Complexities and Probabilities

Some important relations between complexities and probabilities make use of the non-computable function $x \to x^*$. To avoid this difficulty we embed the non-computable computation involving $x^*$ into a larger computational process, evaluating a c.e. set containing (strictly) $CP$ in such a way that the main property of $CP$ is preserved. Recall that $U(x^*, \lambda) = x$ and consequently (the immune set) $CP$ can be embedded into the c.e. set
\[ \bigcup_{t \in A^*} \{w \in A^* \mid U(w, \lambda) = t\}. \]

Accordingly, the following definitions make sense for every Chaitin computer $C$:
\[ H_C(x/y; w) = \min\{|z| \mid z \in A^*,\ U(w, \lambda) = y,\ C(z, w) = x\}, \]
\[ P_C(x/y; w) = \sum_{\{z \in A^* \mid U(w, \lambda) = y,\ C(z, w) = x\}} Q^{-|z|}, \]
\[ H(x/y; w) = H_U(x/y; w), \qquad P(x/y; w) = P_U(x/y; w). \]

The following relations are obviously true for all $x, y, w \in A^*$:
\[ H_C(x/y) = H_C(x/y; y^*), \qquad P_C(x/y) = P_C(x/y; y^*), \]
\[ 0 \le P_C(x/y; w) \le 1, \qquad \sum_{x \in A^*} P_C(x/y; w) \le 1, \]
\[ P_C(x/y; w) \ge Q^{-H_C(x/y; w)}, \qquad 0 < P(x/y; w) < 1. \]

We refer to $H_C(x/y; w)$ as the (Chaitin) relativized complexity of $x, y$ with respect to $w$ and Chaitin computer $C$. Similarly, $P_C(x/y; w)$ is the relativized probability.

Theorem 4.4. For every Chaitin computer $C$ there exists a constant $c > 0$ (depending upon $U$ and $C$) such that for all $x, y \in A^*$ one has
\[ H(x) \le -\log_Q P_C(x) + c, \qquad (4.5) \]
\[ H(x/y) \le -\log_Q P_C(x/y) + c. \qquad (4.6) \]

Proof. A simple dovetailing argument shows that the set $T = \{(x, n) \in A^* \times \mathbf{N} \mid P_C(x) > Q^{-n}\}$ is c.e. Let $B = \{(x, n+1) \in A^* \times \mathbf{N} \mid (x, n) \in T\}$ and put
\[ M = \sum_{(x, n+1) \in B} Q^{-(n+1)} = Q^{-1} \sum_{(x, n) \in T} Q^{-n}. \]

We shall prove that $M \le 1$. To this end we first introduce a piece of notation: for every real $a > 0$, if $Q^n < a \le Q^{n+1}$ for some integer $n$, then put $n = \lg_Q a$ (that is, $\lg_Q a = \lceil \log_Q a \rceil - 1$). The following relations hold true:

1) if $a > 0$, then $Q^{\lg_Q a} < a$,

2) if $a > 0$, then $\lg_Q a < \log_Q a \le \lg_Q a + 1$,

3) if $a$ is a positive real and $m$ is an integer, then $\lg_Q a \ge m$ iff $\log_Q a > m$.

The first two relations are direct consequences of the definition of $\lg_Q$. If $a > 0$ and $m$ is an integer, then from $Q^n < a \le Q^{n+1}$ and $\lg_Q a \ge m$ we deduce $m \le \lg_Q a = n = \log_Q Q^n < \log_Q a$. Conversely, if $\log_Q a > m$, $Q^n < a \le Q^{n+1}$, then $Q^{n+1} \ge a > Q^m$, so $n + 1 > m$, i.e. $n = \lg_Q a \ge m$ ($n, m \in \mathbf{Z}$).
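The notation $\lg_Q$ is easy to misread at exact powers of $Q$, so a small sketch may help. The function name `lg` is an assumption made for this example; it works on exact rationals and returns the integer $n$ with $Q^n < a \le Q^{n+1}$.

```python
from fractions import Fraction

def lg(a, Q=2):
    """Return the integer n with Q**n < a <= Q**(n+1), for a rational a > 0."""
    a = Fraction(a)
    if a <= 0:
        raise ValueError("a must be positive")
    n = 0
    while Fraction(Q) ** n >= a:        # move down until Q**n < a
        n -= 1
    while Fraction(Q) ** (n + 1) < a:   # move up until a <= Q**(n+1)
        n += 1
    return n

# Relation 3): lg(a) >= m holds exactly when a > Q**m, i.e. log_Q a > m.
for a in [Fraction(1, 3), 1, 5, 8]:
    for m in range(-4, 5):
        assert (lg(a) >= m) == (Fraction(a) > Fraction(2) ** m)
```

Note that $\lg_Q$ of an exact power $Q^k$ is $k - 1$, not $k$; this is what makes the first relation a strict inequality.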

Next we define the sets
\[ N_x = \{n \in \mathbf{N} \mid P_C(x) > Q^{-n}\}, \qquad x \in A^*. \]
Since $n \in N_x$ implies $n + 1 \in N_x$ it follows that $N_x$ is infinite, provided $P_C(x) > 0$. Moreover,
\[ M = Q^{-1} \sum_{x \in A^*} \sum_{n \in N_x} Q^{-n} \]
and
\[ n \in N_x \iff P_C(x) > Q^{-n} \iff \log_Q P_C(x) > -n \iff \lg_Q P_C(x) \ge -n. \]
Accordingly,
\[ \sum_{n \in N_x} Q^{-n} = \sum_{n \ge -\lg_Q P_C(x)} Q^{-n} = \frac{Q^{\lg_Q P_C(x) + 1}}{Q - 1} < \frac{Q \cdot P_C(x)}{Q - 1} \le Q \cdot P_C(x), \]
and finally
\[ M = Q^{-1} \sum_{x \in A^*} \sum_{n \in N_x} Q^{-n} \le \sum_{x \in A^*} P_C(x) \le 1. \]

Using the Kraft-Chaitin Theorem we construct a Chaitin computer $D : A^* \times \{\lambda\} \overset{o}{\to} A^*$ satisfying the following property: for every $(x, n) \in T$ there exists a string $v \in A^*$ such that $D(v, \lambda) = x$ and $|v| = n + 1$. We prove that $D$ satisfies the relation
\[ H_D(x) = 1 - \lg_Q P_C(x). \]
Notice that
\[ D(v, \lambda) = x \iff (x, |v|) \in B \iff P_C(x) > Q^{1 - |v|} \]
and
\[
\begin{aligned}
H_D(x) &= \min\{|v| \mid v \in A^*,\ D(v, \lambda) = x\} \\
&= \min\{|v| \mid v \in A^*,\ P_C(x) > Q^{1 - |v|}\} \\
&= \min\{|v| \mid v \in A^*,\ |v| \ge 1 - \lg_Q P_C(x)\} \\
&= 1 - \lg_Q P_C(x).
\end{aligned}
\]
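The geometric series behind the estimate of $M$ can be written out in full. In the routine check below, $k$ is only an abbreviation, introduced here, for $-\lg_Q P_C(x)$.

```latex
\[
\sum_{n \ge k} Q^{-n}
  \;=\; Q^{-k} \sum_{j \ge 0} Q^{-j}
  \;=\; \frac{Q^{-k}}{1 - Q^{-1}}
  \;=\; \frac{Q^{1-k}}{Q - 1},
  \qquad k = -\lg_Q P_C(x),
\]
\[
Q^{-1} \sum_{n \in N_x} Q^{-n}
  \;=\; \frac{Q^{\lg_Q P_C(x)}}{Q - 1}
  \;\le\; Q^{\lg_Q P_C(x)}
  \;<\; P_C(x).
\]
```

The last strict inequality is relation 1) for $\lg_Q$; summing over $x$ gives $M \le \sum_{x} P_C(x) \le 1$.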

For the conditional case we extend $D$ on a c.e. subset of $A^* \times A^+$. To this end we let $v = U(w, \lambda)$, $x \in A^*$ and define the c.e. sets
\[ T_v^w = \{(x, n) \in A^* \times \mathbf{N} \mid P_C(x/v; w) > Q^{-n}\}, \]
\[ B_v^w = \{(x, n+1) \in A^* \times \mathbf{N} \mid (x, n) \in T_v^w\}. \]
It should be noted that in case $w = v^* \in CP$ ($U(v^*, \lambda) = v$) one has
\[ T_v^{v^*} = \{(x, n) \in A^* \times \mathbf{N} \mid P_C(x/v) > Q^{-n}\}. \]

A similar counting argument shows that $M(w, v) \le 1$, where
\[ M(w, v) = \sum_{(x, n+1) \in B_v^w} Q^{-(n+1)}. \]
Indeed, writing $N_{v,x}^w = \{n \in \mathbf{N} \mid P_C(x/v; w) > Q^{-n}\}$, it follows that
\[ M(w, v) = Q^{-1} \sum_{x \in A^*} \sum_{n \in N_{v,x}^w} Q^{-n} \]
and
\[ M(w, v) \le \sum_{x \in A^*} Q^{\lg_Q P_C(x/v; w)} < \sum_{x \in A^*} P_C(x/v; w) \le 1. \]

Using the Kraft-Chaitin Theorem again we extend $D$ on a c.e. subset of $A^* \times A^+$ such that
\[ H_D(x/y) = 1 - \lg_Q P_C(x/y). \]
The computation of $D$ proceeds as follows: if $U(w, \lambda) = v$ and $(x, n+1) \in B_v^w$, then there exists $y \in A^*$ with $D(y, w) = x$ and $|y| = n + 1$. In case $U(w, \lambda) = v$, one has
\[ D(y, w) = x \iff P_C(x/v; w) > Q^{1 - |y|}. \]
Indeed,
\[ D(y, w) = x \iff (x, |y|) \in B_v^w \iff P_C(x/v; w) > Q^{-(|y| - 1)} = Q^{1 - |y|}. \]

Next let $w = v^*$ ($U(v^*, \lambda) = v$). One can easily check that
\[
\begin{aligned}
H_D(x/v) &= \min\{|y| \mid y \in A^*,\ D(y, v^*) = x\} \\
&= \min\{|y| \mid y \in A^*,\ P_C(x/v) > Q^{1 - |y|}\} \\
&= \min\{|y| \mid y \in A^*,\ |y| \ge 1 - \lg_Q P_C(x/v)\} \\
&= 1 - \lg_Q P_C(x/v).
\end{aligned}
\]
Formulae (4.5) and (4.6) can now be derived from the Invariance Theorem. □

Remark.

In view of the relations
\[ P_D(x) = Q^{-1} \sum_{n \in N_x} Q^{-n}, \qquad P_D(x/y) = Q^{-1} \sum_{n \in N_{y,x}^{y^*}} Q^{-n}, \]
it follows that $P_D(x) < P_C(x)$ and $P_D(x/y) < P_C(x/y)$.

Corollary 4.5. For every Chaitin computer $C$ there exists a constant $c > 0$ (depending upon $U$ and $C$) such that for all $x, y \in A^*$
\[ P(x) \ge Q^{-c} P_C(x), \qquad (4.7) \]
\[ P(x/y) \ge Q^{-c} P_C(x/y). \qquad (4.8) \]

Proof. The constant $c$ comes from Theorem 4.4 (formulae (4.5) and (4.6)). It follows that
\[ P_C(x) \le Q^{c - H(x)}, \qquad P_C(x/y) \le Q^{c - H(x/y)}. \]
Using Lemma 3.30 (with $C = U$) we get
\[ Q^{-c} P_C(x) \le Q^{-H(x)} \le P(x), \qquad Q^{-c} P_C(x/y) \le Q^{-H(x/y)} \le P(x/y). \]
□

Theorem 4.6 (Chaitin). The following formulae are true:
\[ H(x) = -\log_Q P(x) + O(1), \qquad (4.9) \]
\[ H(x/y) = -\log_Q P(x/y) + O(1). \qquad (4.10) \]

Proof. We use Theorem 4.4 and Lemma 3.30. □

Remark. Actually, we have proven a bit more than stated in (4.9) and (4.10): namely, there exists a constant $c > 0$ such that
\[ 0 \le H(x) + \log_Q P(x) \le c, \qquad 0 \le H(x/y) + \log_Q P(x/y) \le c. \]

As a by-product we are able to show that there are only a few minimal programs.

Corollary 4.7. For every $x, v \in A^*$
\[ \#\{y \in A^* \mid U(y, \lambda) = x,\ |y| \le H(x) + n\} < Q^{n + O(1)}, \qquad (4.11) \]
\[ \#\{y \in A^* \mid U(y, v^*) = x,\ |y| \le H(x/v) + n\} < Q^{n + O(1)}. \qquad (4.12) \]

Recall that $\langle\ ,\ \rangle$ is a computable bijection between $A^* \times A^*$ and $A^*$ (with $(\cdot)_i$, $i = 1, 2$, as inverses) and $P(x, y) = P(\langle x, y \rangle)$.

Theorem 4.8. One has
\[ P(x) \asymp \sum_{y \in A^*} P(x, y). \qquad (4.13) \]

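Proof. The Chaitin computer $C(x, \lambda) = (U(x, \lambda))_1$ has the following property: if $U(y, \lambda) = \langle u, v \rangle$, then $C(y, \lambda) = u$. We compute
\[ P_C(x) = \sum_{\{y \in A^* \mid C(y, \lambda) = x\}} Q^{-|y|} = \sum_{u \in A^*} \ \sum_{\{y \in A^* \mid U(y, \lambda) = \langle x, u \rangle\}} Q^{-|y|}. \]
All terms of the series above are positive and for every string $y \in A^*$ with $(U(y, \lambda))_1 = x$ there is a unique string $u \in A^*$ such that $U(y, \lambda) = \langle x, u \rangle$ (because $u = (U(y, \lambda))_2$ and $\langle\ ,\ \rangle$ is one-to-one). So,
\[ P_C(x) = \sum_{u \in A^*} P(x, u) \]
and
\[ P(x) \ge Q^{-c} P_C(x) = Q^{-c} \Big( \sum_{u \in A^*} P(x, u) \Big). \]

For the converse relation we define the Chaitin computer
\[ D(z, \lambda) = \langle U(z, \lambda), U(z, \lambda) \rangle, \]
we evaluate the sum of the series
\[ \sum_{y \in A^*} P_D(x, y) = \sum_{y \in A^*} \ \sum_{\{z \in A^* \mid D(z, \lambda) = \langle x, y \rangle\}} Q^{-|z|} = \sum_{\{z \in A^* \mid U(z, \lambda) = x\}} Q^{-|z|} = P(x), \]
and we get a constant $d > 0$ with
\[ P(x, y) \ge Q^{-d} P_D(x, y). \]
Finally,
\[ P(x) = \sum_{y \in A^*} P_D(x, y) \le Q^{d} \Big( \sum_{y \in A^*} P(x, y) \Big). \]
□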

Theorem 4.9. There exist a Chaitin computer $C$ and a constant $c > 0$ such that for all strings $x, y$ one has
\[ H_C(y/x) = H(x, y) - H(x) + c. \qquad (4.14) \]

Proof. First we prove the existence of a constant $c > 0$ (depending upon $U$) such that
\[ Q^{H(x) - c} \Big( \sum_{y \in A^*} P(x, y) \Big) \le 1. \qquad (4.15) \]
From (4.9), $H(x) = -\log_Q P(x) + O(1)$, so we can find a natural $n$ such that for all $x \in A^*$
\[ H(x) \le -\log_Q P(x) + n, \]
or, equivalently,
\[ Q^{H(x) - n} \le \frac{1}{P(x)}. \]
From (4.13) we can get a real $a > 0$ such that
\[ \frac{1}{P(x)} \le a \Big( \sum_{y \in A^*} P(x, y) \Big)^{-1}. \]
Accordingly,
\[ Q^{H(x) - n} \le a \Big( \sum_{y \in A^*} P(x, y) \Big)^{-1} \]
and we may take in (4.15) $c = n + \lceil \log_Q a \rceil + 1$.

For every $x \in \mathrm{dom}(U_\lambda)$, $x = U(u, \lambda)$, we generate the c.e. set
\[ B_x^u = \{|v| - |u| + c \mid v \in A^*,\ (U(v, \lambda))_1 = x\} \subset \mathbf{Z} \]
($c$ comes from (4.13)). In case $u = x^*$ ($U(x^*, \lambda) = x$) we have
\[ B_x = B_x^{x^*} = \{|v| - |x^*| + c \mid v \in A^*,\ (U(v, \lambda))_1 = x\} = \{|v| - H(x) + c \mid v \in A^*,\ (U(v, \lambda))_1 = x\}. \]

We then compute the sum of the series:
\[
\begin{aligned}
\sum_{\{v \in A^* \mid (U(v, \lambda))_1 = x\}} Q^{-(|v| - H(x) + c)} &= Q^{H(x) - c} \sum_{\{v \in A^* \mid (U(v, \lambda))_1 = x\}} Q^{-|v|} \\
&= Q^{H(x) - c} \Big( \sum_{v \in A^*} P(x, v) \Big) \le 1,
\end{aligned}
\]

by (4.15). It is worth noting that in the general relativized case $U(u, \lambda) = x$ we cannot claim the validity of the inequality
\[ \sum_{\{v \in A^* \mid (U(v, \lambda))_1 = x\}} Q^{-(|v| - |u| + c)} \le 1 \]
because $|v| - |u| + c$ may be negative for some values of $u, v \in A^*$. To avoid this difficulty (which prevents us using the Kraft-Chaitin Theorem) we shall proceed as follows. For every string $u \in A^*$ with $U(u, \lambda) = x \ne \infty$ we generate the elements of the set $B_x^u = \{|v_1| - |u| + c,\ |v_2| - |u| + c, \ldots\}$ and we test, at every step $t \ge 1$, the condition
\[ \sum_{i=1}^{t} Q^{-(|v_i| - |u| + c)} \le 1. \]
At the first failure we stop the generation process. Now we are in a position to make use of the Kraft-Chaitin Theorem to get the $u$th section of a Chaitin computer $C$ satisfying the property: if $U(u, \lambda) = x$ and $(U(y, \lambda))_1 = x$, then
\[ C(v, u) = (U(y, \lambda))_2, \qquad |v| = |y| - |u| + c. \]
It is clear that in the special case $u = x^*$, the Kraft-Chaitin inequality is fulfilled; however, for $U(u, \lambda) = x$ we cannot decide, during the execution of the algorithm, if $u = x^*$, since $CP$ is immune.

Next we are going to prove formula (4.14). If $H_C(y/x) = |v|$, then $C(v, x^*) = y$, i.e. there exists a string $w$ such that $(U(w, \lambda))_1 = x$, $C(v, x^*) = (U(w, \lambda))_2 = y$ and $|v| = |w| - |x^*| + c = |w| - H(x) + c$. So,
\[ x = (U(w, \lambda))_1, \quad y = (U(w, \lambda))_2, \quad U(w, \lambda) = \langle x, y \rangle, \quad H(x, y) \le |w|, \]
\[ H_C(y/x) = |v| = |w| - H(x) + c \ge H(x, y) - H(x) + c. \]
Conversely, let $H(x) = |x^*|$, $H(x, y) = |w|$, $U(w, \lambda) = \langle x, y \rangle$. Clearly, $|w| - H(x) + c \in B_x = B_x^{x^*}$ and the Kraft-Chaitin Theorem applies producing a string $v$ such that $|v| = |w| - H(x) + c$ with $C(v, x^*) = y$. Accordingly,
\[ H_C(y/x) \le |v| = |w| - H(x) + c = H(x, y) - H(x) + c. \]
□

Theorem 4.10. The following formulae are valid:
\[ H(x, y) = H(x) + H(y/x) + O(1), \qquad (4.16) \]
\[ H(x : y) = H(x) + H(y) - H(x, y) + O(1), \qquad (4.17) \]
\[ H(x : y) = H(y : x) + O(1), \qquad (4.18) \]
\[ P(y/x) \asymp \frac{P(x, y)}{P(x)}, \qquad (4.19) \]
\[ H(y/x) = \log_Q \frac{P(x)}{P(x, y)} + O(1), \qquad (4.20) \]
\[ H(x : y) = \log_Q \frac{P(x, y)}{P(x) P(y)} + O(1). \qquad (4.21) \]

Proof. For (4.16) we construct a Chaitin computer $C$ and a natural $c > 0$ such that
\[ H_C(y/x) = H(x, y) - H(x) + c. \]
(See Theorem 4.9.) Accordingly,
\[ H(x, y) = H_C(y/x) + H(x) - c \ge H(y/x) + H(x) + O(1) \]
(we have applied the Invariance Theorem). To get the converse inequality we rely on Lemma 3.14 (formula (3.12)). From (4.16) we easily derive (4.17):
\[ H(x : y) = H(y) - H(y/x) = H(y) + H(x) - H(x, y) + O(1). \]
The same is true for (4.18):
\[ H(x : y) = H(x) + H(y) - H(x, y) + O(1) = H(x) + H(y) - H(y, x) + O(1), \]
by virtue of Proposition 3.12. For (4.19) we note that
\[ H(x, y) = H(x) + H(y/x) + O(1), \quad H(x) = -\log_Q P(x) + O(1), \quad H(y/x) = -\log_Q P(y/x) + O(1); \]
we have used Theorem 4.6. By virtue of the same result we deduce the existence of some constant $d > 0$ such that $-d \le H(y/x) + \log_Q P(y/x) \le d$. On the other hand, there exists a natural $m$ such that
\[ P(y/x) \le m P(x, y)/P(x), \qquad P(x, y) \le m P(y/x) P(x) \]
(see (4.19)). Combining the "left" inequalities we get
\[ -d \le H(y/x) + \log_Q P(y/x) \le H(y/x) + \log_Q \frac{m P(x, y)}{P(x)}, \]
so
\[ H(y/x) \ge \log_Q \frac{P(x)}{P(x, y)} + O(1). \]
From the "right" inequalities we infer
\[ H(y/x) \le \log_Q \frac{P(x)}{P(x, y)} + O(1), \]
thus proving formula (4.20). Finally, (4.21) is a direct consequence of formulae (4.10) and (4.20). □

Corollary 4.11. One has
\[ H(x, \mathrm{string}(H(x))) = H(x) + O(1). \]

Proof. We use Lemma 3.13 and Theorem 4.10:
\[ H(x, \mathrm{string}(H(x))) = H(x) + H(\mathrm{string}(H(x))/x) + O(1) = H(x) + O(1). \]
□

4.3

Speed-up Theorem

We define the halting probability of a Chaitin computer and we prove a result asserting that there is no "optimal" universal Chaitin computer, in the sense of the best halting probability. We fix a universal Chaitin computer $U$ and let $U(w, \lambda) = y$, $y \ne \lambda$. We define the halting probability of $C$ on section $y$ to be
\[ \Omega(C, y; w) = \sum_{x \in A^*} P_C(x/y; w). \]
In case $y = \lambda$, the absolute halting probability is
\[ \Omega(C) = \sum_{x \in A^*} P_C(x). \]
Finally, if $C = U$, then we put $\Omega = \Omega(U)$.

The inequalities $0 < \Omega < 1$ will be derived in Corollary 7.3.

Theorem 4.12 (Speed-up Theorem). Let $U$ and $V$ be two universal Chaitin computers and assume that $U(w, \lambda) = y$. Furthermore, suppose that
\[ 1 - Q^{1-k} < \Omega(V, y; w) < 1 - Q^{-k}, \]
for some natural $k > 0$. Under these conditions we can effectively construct a universal Chaitin computer $W$ satisfying the following three properties.

For all $x \in A^*$,
\[ H_W(x/y; w) \le H_V(x/y; w). \qquad (4.22) \]
For all but a finite set of strings $x \in A^*$,
\[ H_W(x/y; w) < H_V(x/y; w), \qquad (4.23) \]
\[ \Omega(W, y; w) > \Omega(V, y; w). \qquad (4.24) \]

Proof. We fix $y$ with $U(w, \lambda) = y$ and let
\[ B = \{(x, n) \in A^* \times \mathbf{N} \mid V(z, w) = x,\ |z| = n,\ \text{for some } z \in A^*\}. \]
Since $V_w$ is surjective, it follows that $B$ is c.e. and infinite. We fix a one-to-one computable function $f : \mathbf{N}_+ \to A^* \times \mathbf{N}$ such that $\mathrm{range}(f) = B$. We denote by $(\cdot)_i$ ($i = 1, 2$) the projection of $A^* \times \mathbf{N}$ onto the $i$th coordinate. A simple computation shows the validity of the formula
\[ \Omega(V, y; w) = \sum_{(x, n) \in B} Q^{-n}. \]

In view of the inequality
\[ \Omega(V, y; w) > 1 - Q^{1-k} \]
we can construct enough elements in the sequence $(f(i))_2$, $i \ge 1$; eventually we get an $N > 0$ such that
\[ \sum_{i=1}^{N} Q^{-(f(i))_2} > 1 - Q^{1-k}. \]
Next we claim that
\[ \#\{i \in \mathbf{N} \mid i > N,\ (f(i))_2 \le k\} \le Q. \]
Indeed, on the contrary,
\[ \Omega(V, y; w) = \sum_{i \ge 1} Q^{-(f(i))_2} > \sum_{i=1}^{N} Q^{-(f(i))_2} + Q^{1-k} > 1 - Q^{1-k} + Q^{1-k} = 1. \]

Consequently, there exists a natural $M > N$ (we do not have any indication concerning the effective computability of $M$) such that for all $i \ge M$, $(f(i))_2 > k$. On this basis we construct the computable function $g : \mathbf{N}_+ \to A^* \times \mathbf{N}$ by the formula
\[ g(i) = \begin{cases} f(i), & \text{if } i \le N \text{ or } (i > N,\ (f(i))_2 \le k), \\ ((f(i))_1, (f(i))_2 - 1), & \text{otherwise,} \end{cases} \]
and we prove that
\[ \sum_{i \ge 1} Q^{-(g(i))_2} \le 1. \]
First, we consider the number
\[ S = \sum_{i=1}^{N} Q^{-(g(i))_2} + \sum_{N+1 \le i \le M,\ (f(i))_2 \le k} Q^{-(g(i))_2}, \]
where $M$ is the above bound. It is seen that
\[ S \ge \sum_{i=1}^{N} Q^{-(g(i))_2} = \sum_{i=1}^{N} Q^{-(f(i))_2} > 1 - Q^{1-k}. \qquad (4.25) \]

Now, a simple computation gives
\[
\begin{aligned}
\sum_{i \ge 1} Q^{-(g(i))_2} &= S + Q \sum_{\{i > N,\ (f(i))_2 > k\}} Q^{-(f(i))_2} \\
&= S + Q \cdot (\Omega(V, y; w) - S) \\
&= Q \cdot \Omega(V, y; w) + (1 - Q) S \\
&< Q(1 - Q^{-k}) + (1 - Q)(1 - Q^{1-k}) \\
&= 1 - (Q - 2) Q^{1-k} \\
&\le 1.
\end{aligned}
\]

In view of the Kraft-Chaitin Theorem there exists (and we can effectively construct) a Chaitin computer $W$ such that for all $i \ge 1$ there is a string $z_i \in A^*$ of length $(g(i))_2$ with $W(z_i, w) = (g(i))_1 = (f(i))_1$. In the case $n = H_V(x/y; w)$ we deduce that $(x, n) \in B$, i.e. $(x, n) = f(i)$, for some $i \ge 1$. In case $f(i) = g(i)$, $W(z_i, w) = x$, for some $z_i \in A^*$, $|z_i| = (g(i))_2 = n$; otherwise (i.e. in case $f(i) \ne g(i)$) $W(z_i, w) = x$, for some string $z_i \in A^*$, $|z_i| = (g(i))_2 = n - 1$. In both cases $H_W(x/y; w) \le n$, which shows that $W$ is a universal Chaitin computer and (4.22) holds. Furthermore, the set $\{i \in \mathbf{N} \mid f(i) = g(i)\}$ is finite, so the inequality $H_W(x/y; w) < n$ is valid for almost all strings $x$.

Finally,
\[ \Omega(W, y; w) = \sum_{i \ge 1} Q^{-(g(i))_2} = Q\, \Omega(V, y; w) + (1 - Q) S > \Omega(V, y; w), \]
proving (4.24). (The number $S$ comes from (4.25).) □
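The length-shortening step of the proof can be tried on a toy example. The sketch below is only an illustration: a short finite list of lengths stands in for the infinite sequence $(f(i))_2$, the alphabet is binary ($Q = 2$), and the function name `speed_up_lengths` is hypothetical.

```python
from fractions import Fraction

def speed_up_lengths(lengths, k, Q=2):
    """Apply the shortening step from the proof to a finite list of lengths.

    lengths[i-1] plays the role of (f(i))_2. Returns (N, new_lengths), where
    new_lengths[i-1] plays the role of (g(i))_2.
    """
    threshold = 1 - Fraction(Q) ** (1 - k)
    total, N = Fraction(0), None
    for i, n in enumerate(lengths, start=1):
        total += Fraction(1, Q**n)
        if total > threshold:      # first N with partial sum > 1 - Q**(1-k)
            N = i
            break
    if N is None:
        raise ValueError("partial sums never exceed 1 - Q**(1-k)")
    new = [n if (i <= N or n <= k) else n - 1
           for i, n in enumerate(lengths, start=1)]
    return N, new

old = [1, 3, 4, 5]                  # sum of 2**(-n) is 23/32, between 1/2 and 3/4
N, new = speed_up_lengths(old, k=2)
omega_old = sum(Fraction(1, 2**n) for n in old)
omega_new = sum(Fraction(1, 2**n) for n in new)
print(N, new, omega_old, omega_new)
```

Here every length after position $N$ that exceeds $k$ drops by one, the new Kraft sum stays at most 1, and the total probability strictly increases.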

Corollary 4.13. Let $U$ be a universal Chaitin computer such that
\[ 1 - Q^{1-k} < \Omega(U) < 1 - Q^{-k}, \]
for some natural $k$. Then we can effectively find a universal Chaitin computer $W$ satisfying the following three properties.

For all $x \in A^*$,
\[ H_W(x) \le H_U(x). \qquad (4.26) \]
For all but a finite set of strings $x \in A^*$,
\[ H_W(x) < H_U(x), \qquad (4.27) \]
\[ \Omega(W) > \Omega(U). \qquad (4.28) \]

Remark. A similar result can be deduced for conditional complexities and probabilities.

4.4

Algorithmic Coding Theorem

In this section we prove the universality of the representation formula (4.9) in Theorem 4.6, i.e. we show that it is valid not only for the probability $P$, but also for a class of "semi-measures".

Definition 4.14. a) A semi-measure is a function $\nu : A^* \to [0, 1]$ satisfying the inequality
\[ \sum_{x \in A^*} \nu(x) \le 1. \]

b) A semi-measure $\nu$ is enumerable if the graph approximation set of $\nu$, $\{(r, x) \in \mathbf{Q} \times A^* \mid r < \nu(x)\}$, is c.e., and computable if the above set is computable.

Example 4.15. The function $\nu : A^* \to [0, 1]$ defined by
\[ \nu(x) = 2^{-|x| - 1} Q^{-|x|} \]
is a computable semi-measure.
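The semi-measure property in Example 4.15 can be checked by grouping strings according to their length; there are $Q^n$ strings of length $n$.

```latex
\[
\sum_{x \in A^*} \nu(x)
  \;=\; \sum_{n \ge 0} Q^{n} \cdot 2^{-n-1} Q^{-n}
  \;=\; \sum_{n \ge 0} 2^{-n-1}
  \;=\; 1 .
\]
```

So the sum attains the bound 1 exactly.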

Definition 4.16. Let $\mathcal{S}$ be a class of semi-measures. A semi-measure $\nu_0 \in \mathcal{S}$ is called universal for $\mathcal{S}$ if for every semi-measure $\nu \in \mathcal{S}$, there exists a constant $c > 0$ (depending upon $\nu_0$ and $\nu$) such that $\nu_0(x) \ge c\,\nu(x)$, for all strings $x \in A^*$.

Theorem 4.17. The class of all enumerable semi-measures contains a universal semi-measure.

Proof. Using a standard technique we can prove that the class of enumerable semi-measures is c.e., i.e. there exists a c.e. set $T \subset \mathbf{N} \times \mathbf{Q} \times A^*$ such that the sections $T_i$ of $T$ are exactly the graph approximations of the enumerable semi-measures. We denote by $\nu_i$ the semi-measure whose graph approximation is $T_i$. Finally we put
\[ m(x) = \sum_{n \ge 0} 2^{-n-1} \nu_n(x). \]
We first show that $m$ is a semi-measure, i.e.
\[ \sum_{x \in A^*} m(x) = \sum_{x \in A^*} \sum_{n \ge 0} 2^{-n-1} \nu_n(x) = \sum_{n \ge 0} 2^{-n-1} \sum_{x \in A^*} \nu_n(x) \le \sum_{n \ge 0} 2^{-n-1} = 1. \]
The semi-measure $m$ is enumerable since for all $x \in A^*$, $r \in \mathbf{Q}$ one has $m(x) > r$ iff $\sum_{j=1}^{k} 2^{-n_j - 1} \nu_{n_j}(x) > r$, for some $k \ge 1$, $n_1, \ldots, n_k \ge 0$. Finally, $m$ is universal since
\[ m(x) \ge 2^{-n-1} \nu_n(x), \]
for all $n \ge 0$ and $x \in A^*$. □
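The weighted sum defining $m$ can be illustrated on a finite family. The sketch below is a toy version under clear assumptions: only two hand-picked semi-measures with finite support are mixed (the real construction ranges over an enumeration of all enumerable semi-measures), and the names `mixture`, `nu0`, `nu1` are hypothetical.

```python
from fractions import Fraction

def mixture(semi_measures):
    """Return m(x) = sum_n 2**(-n-1) * nu_n(x) for a finite list of semi-measures.

    Each semi-measure is a dict from strings to Fractions; missing keys count as 0.
    """
    support = set().union(*semi_measures)
    return {x: sum(Fraction(1, 2 ** (n + 1)) * nu.get(x, 0)
                   for n, nu in enumerate(semi_measures))
            for x in support}

nu0 = {"0": Fraction(1, 2), "1": Fraction(1, 4)}
nu1 = {"1": Fraction(1, 3), "00": Fraction(1, 3)}
m = mixture([nu0, nu1])

# m dominates each nu_n up to the multiplicative constant 2**(-n-1).
for n, nu in enumerate([nu0, nu1]):
    assert all(m[x] >= Fraction(1, 2 ** (n + 1)) * p for x, p in nu.items())
```

The domination constant for $\nu_n$ depends only on its index $n$, not on $x$, which is the content of universality.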

In what follows we fix a universal enumerable semi-measure m.

Theorem 4.18 (Algorithmic Coding Theorem). The following formulae are true:
\[ H(x) = -\log_Q P(x) + O(1) = -\log_Q m(x) + O(1). \]

Proof. The equality
\[ H(x) = -\log_Q P(x) + O(1) \]
is exactly Theorem 4.6. So, we shall prove the formula
\[ \log_Q m(x) = \log_Q P(x) + O(1). \]
Since $P = P_U$ is an enumerable semi-measure and $m$ is universal it follows that $m(x) \ge c P(x)$, for some constant $c > 0$. To show the converse inequality we make use of the Kraft-Chaitin Theorem and we prove the inequality $H(x) \le -\log_Q m(x) + O(1)$. To this end we consider an injective computable function $f : \mathbf{N} \to A^* \times \mathbf{N}_+$ such that $f(\mathbf{N}) = \{(x, k) \in A^* \times \mathbf{N}_+ \mid Q^{-k+1} < m(x)\}$. We put $f(t) = (x_t, k_t)$. It is seen that
\[
\begin{aligned}
\sum_{t \ge 0} Q^{-k_t} &= \sum_{x \in A^*} \ \sum_{Q^{-k} < m(x)} Q^{-k-1} = \sum_{x \in A^*} \ \sum_{k > -\log_Q m(x)} Q^{-k-1} \\
&= \sum_{x \in A^*} \ \sum_{k \ge -\lg_Q m(x)} Q^{-k-1} = \sum_{x \in A^*} \frac{Q^{\lg_Q m(x)}}{Q - 1} \\
&< \sum_{x \in A^*} \frac{m(x)}{Q - 1} \le 1.
\end{aligned}
\]
(We have made use of the equivalence
\[ Q^{-k} < m(x) \iff k > -\log_Q m(x) \iff k \ge -\lg_Q m(x); \]
see the proof of Theorem 4.4.) According to the Kraft-Chaitin Theorem there exists a Chaitin computer $C : A^* \times A^* \overset{o}{\to} A^*$ satisfying the following property: for every natural $t$ there exists a string $u_t$ of length $|u_t| = k_t$ such that $C(u_t, \lambda) = x_t$. As for every string $x \in A^*$ there exists a natural $t$ such that $x = x_t$, we deduce that $H_C(x) \le -\log_Q m(x) + O(1)$; using the Invariance Theorem we deduce the inequality
\[ H(x) \le -\log_Q m(x) + O(1), \]
thus completing the proof. □

Comment. Classically, for every probability measure $w : A^* \to [0, 1]$, $\sum_{x \in A^*} w(x) = 1$, we can construct a prefix-code $f_w$ such that
\[ |f_w(x)| \le -\log_Q w(x) + 1, \]
for all $x \in A^*$. In the case of semi-computable measures $w$ there is a universal code with a self-delimiting p.c. decoding function, independent of $w$, such that
\[ H(x) \le -\log_Q w(x) + c_w, \]
where $c_w$ depends upon $w$.

Example 4.19. Consider a Chaitin computer $C : A^* \overset{o}{\to} A^*$; when the computer asks for a new symbol we toss a coin to decide whether to give 0 or 1. The probability that $C$ outputs $x$ is
\[ P_C(x) = \sum_{\{y \in \{0,1\}^* \mid C(y) = x\}} 2^{-|y|}. \]
The semi-measure $P_C$ is enumerable, so, in view of the Algorithmic Coding Theorem, $P_C(x)$ is at most a constant times larger than the maximal element $2^{-H(x)} = \max\{2^{-|y|} \mid C(y) = x,\ y \in \{0,1\}^*\}$.

Comment. Let us illustrate the Algorithmic Coding Theorem with an example from Cover and Thomas [152]. We imagine a monkey trying to "type" the entire works of Shakespeare, say 1,000,000 bits long. If the monkey types "at random" on a dumb typewriter, the probability that the result is Shakespeare's work is $2^{-1,000,000}$; if the monkey sits in front of a computer terminal, then the algorithmic probability that it types the same text is
\[ 2^{-H(\mathrm{Shakespeare})} \approx 2^{-250,000}, \]
an event with an extremely small chance of happening, but still more likely than the first event. The use of the typewriter reproduces exactly the input produced by the typing while a computer "runs" the input and produces an output. Consequently, a random input to a computer is much more likely to produce an "interesting" output than a "random" input to a typewriter. Is this a way to create "sense" out of "nonsense"?

As a different application of the Algorithmic Coding Theorem we will present another proof of Proposition 3.15.

Example 4.20. The property of sub-additivity of the program-size complexity follows from the Algorithmic Coding Theorem.

Proof. As we noted before (see the proof of Proposition 3.15) it is enough to prove the formula
\[ H(\langle x, y \rangle) \le H(x) + H(y) + O(1). \]
To this end we consider the function $\mu : A^* \to [0, 1]$ defined by
\[ \mu(\langle x, y \rangle) = P(x) P(y). \]
It is clear that $\mu$ is an enumerable semi-measure:
\[ \{(r, \langle x, y \rangle) \in \mathbf{Q} \times A^* \mid r < \mu(\langle x, y \rangle)\} = \{(r, \langle x, y \rangle) \in \mathbf{Q} \times A^* \mid r_1 < P(x),\ r_2 < P(y),\ r = r_1 r_2, \text{ for some rationals } r_1, r_2\} \]
and
\[ \sum_{\langle x, y \rangle \in A^*} \mu(\langle x, y \rangle) = \sum_{x, y \in A^*} P(x) P(y) \le 1. \]
Finally, using the Algorithmic Coding Theorem we get
\[
\begin{aligned}
H(\langle x, y \rangle) &\le -\log_Q P(\langle x, y \rangle) + O(1) \\
&\le -\log_Q \mu(\langle x, y \rangle) + O(1) \\
&= -\log_Q (P(x) P(y)) + O(1) \\
&\le H(x) + H(y) + O(1).
\end{aligned}
\]
□

The uncertainty appearing in the Algorithmic Coding Theorem is a source of concern for applications in physics; see for example Schack [357]. Fortunately, a sharper version of the theorem can be proved. To this end we will study the coding phenomenon further.

Recall that a one-to-one function $C : A^* \to A^*$ such that $C(A^*)$ is prefix-free is called prefix-code. For example, for every surjective Chaitin computer $M$, $C_M(x) = x_M^* = \min\{y \in A^* \mid M(y) = x\}$ is a prefix-code; universal Chaitin computers are surjective. The average code-string length of a prefix-code $C$ with respect to a semi-measure $P$ is
\[ L_{C,P} = \sum_{x} P(x) \cdot |C(x)|. \]
The minimal average code-string length with respect to a semi-measure $P$ is
\[ L_P = \inf\{L_{C,P} \mid C \text{ prefix-code}\}. \]
The entropy of a semi-measure $P$ is
\[ \mathcal{H}_P = -\sum_{x} P(x) \cdot \log_Q P(x). \]

Shannon's classical result [364] (see further [152]) can be expressed for semi-measures as follows:

Theorem 4.21. The following inequalities hold true for every semi-measure $P$:
\[ \mathcal{H}_P - 1 \le \mathcal{H}_P + \Big( \sum_{x} P(x) \Big) \log_Q \Big( \sum_{x} P(x) \Big) \le L_P \le \mathcal{H}_P + 1. \]

If $P$ is a measure, then $\log_Q(\sum_{x} P(x)) = 0$, so we get the classical inequality $\mathcal{H}_P \le L_P$. However, this inequality is not true for every semi-measure. For example, take $A = \{0, 1\}$, $P(x) = 2^{-2|x| - 3}$ and $C(x) = x_1 x_1 \ldots x_n x_n 01$. It follows that
\[ L_P \le L_{C,P} = \mathcal{H}_P - \frac{1}{4}. \]
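The value $1/4$ in this example can be checked by summing over string lengths. The sketch below is a direct computation, not part of the original argument; the function name `gap_up_to` is hypothetical, and it uses the facts that there are $2^n$ binary strings of length $n$, that $-\log_2 P(x) = 2|x| + 3$, and that $|C(x)| = 2|x| + 2$.

```python
from fractions import Fraction

def gap_up_to(N):
    """Partial sums, over strings of length <= N, of entropy and average code length.

    P(x) = 2**(-2|x|-3), |C(x)| = 2|x| + 2, binary alphabet.
    Returns (entropy_part, avg_length_part).
    """
    entropy = avg_len = Fraction(0)
    for n in range(N + 1):
        mass = 2**n * Fraction(1, 2 ** (2 * n + 3))  # total P-mass of length-n strings
        entropy += mass * (2 * n + 3)                 # contribution of -log2 P(x)
        avg_len += mass * (2 * n + 2)                 # contribution of |C(x)|
    return entropy, avg_len

h, l = gap_up_to(10)
print(h - l)   # approaches 1/4 as N grows
```

Each length $n$ contributes exactly $2^{-n-3}$ to the difference, so the partial gap is $1/4 - 2^{-N-3}$.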

Next we investigate conditions under which given a semi-measure $P$, we can find a (universal) Chaitin computer $M$ such that $H_M(x)$ is equal, up to an additive constant, to $-\log_Q P(x)$. In what follows we will assume that $P(x) > 0$, for every $x$.

Theorem 4.22. Assume that $P$ is a semi-measure and there exist a c.e. set $S \subset A^* \times \mathbf{N}$ and a constant $c \ge 0$ such that the following two conditions are satisfied for every $x \in A^*$:

(i) $\sum_{(x, n) \in S} Q^{-n} \le P(x)$,

(ii) if $P(x) > Q^{-n}$, then $(x, m) \in S$, for some $m \le n + c$.

Then, there exists a Chaitin computer $M$ (depending upon $S$) such that for all $x$,
\[ -\log_Q P(x) \le H_M(x) \le (1 + c) - \log_Q P(x). \qquad (4.29) \]

Proof. In view of (i),
\[ \sum_{(x, n) \in S} Q^{-n} \le \sum_{x} P(x) \le 1, \]
so using the Kraft-Chaitin Theorem we can construct a Chaitin computer $M$ such that for every $(x, n) \in S$ there exists a string $v_{x,n}$ of length $n$ such that $M(v_{x,n}) = x$. If $(x, m) \notin S$, for all $m$, then $P(x) = 0$ and $H_M(x) = \infty$, so (4.29) is satisfied. If $(x, m) \in S$, for some $m$, then using (i) and (ii) we get
\[
\begin{aligned}
H_M(x) &= \min\{|v| \mid v \in A^*,\ M(v) = x\} \\
&= \min\{n \mid n \in \mathbf{N},\ (x, n) \in S\} \qquad (4.30) \\
&\le \min\{m \mid m \in \mathbf{N},\ P(x) > Q^{-m}\} + c \\
&= \min\{m \mid m \in \mathbf{N},\ m > -\log_Q P(x)\} + c \\
&= \min\{m \mid m \in \mathbf{N},\ m \ge -\lg_Q P(x)\} + c \\
&\le (1 + c) - \log_Q P(x).
\end{aligned}
\]
If $(x, n)$ is in $S$, then $P(x) \ge Q^{-n}$, hence $-\log_Q P(x) \le H_M(x)$ because of (4.30). □

Remark. Theorem 4.22 makes no direct computability assumptions on $P$.

Lemma 4.23. Let $M$ be a Chaitin computer such that $\Omega_M < 1$. Then, there exists a universal Chaitin computer $U$ satisfying the inequality $H_U(x) \le H_M(x)$, for all $x$.

Proof. By hypothesis, $\Omega_M < 1$, so there is a non-negative integer $k$ such that $\Omega_M + Q^{-k} \le 1$. Let $V$ be a universal Chaitin computer. The set
\[ S = \{(M(x), |x|) \mid M(x) < \infty\} \cup \{(V(x), |x| + k) \mid V(x) < \infty\} \]
is c.e. and
\[ \sum_{(y, n) \in S} Q^{-n} \le \Omega_M + Q^{-k} \le 1. \]
Consequently, in view of the Kraft-Chaitin Theorem, there exists a Chaitin computer $U$ such that for $(y, n) \in S$ there is a program $z$ of length $n$ such that $U(z) = y$. Clearly, for every $x$,
\[ H_U(x) \le \min\{|w| + k \mid V(w) = x\} = H_V(x) + k, \]
and
\[ H_U(x) = \min\{|v| \mid U(v) = x\} \le H_M(x), \]
so $U$ is universal and satisfies the required inequality. □

Lemma 4.24. Let $M$ be a Chaitin computer. Then, there exists a Chaitin computer $M'$ such that $\Omega_{M'} < 1$ and $H_{M'}(x) = H_M(x) + 1$, for all $x$.

Proof. Apply the Kraft-Chaitin Theorem to the set
\[ \{(M(x), |x| + 1) \mid M(x) < \infty\} \]
to obtain the Chaitin computer $M'$. □

Corollary 4.25. Under the hypotheses of Theorem 4.22, a universal Chaitin computer $U$ can be constructed such that for all $x$,
\[ H_U(x) \le (2 + c) - \log_Q P(x). \qquad (4.31) \]

Proof. Use Lemmas 4.24, 4.23 to get a universal Chaitin computer $U$ such that $H_U(x) \le H_M(x) + 1$, for all $x$. □

Proposition 4.26. Assume that $P$ is a semi-measure semi-computable from below. Then, there exists a Chaitin computer $M$ (depending upon $P$) such that for all $x$,

\[ -\log_Q P(x) \le H_M(x) \le 2 - \log_Q P(x). \tag{4.32} \]

Consequently, minimal programs for $M$ are almost optimal: the code $C_M$ satisfies the inequalities

\[ 0 \le L_{C_M,P} - \mathcal{H}_P \le 2. \]

Proof. We take $S = \{(x, n + 1) \mid P(x) > Q^{-n}\}$. For every $x$ we have

\[ \sum_{(x,n) \in S} Q^{-n} = \sum_{n > 1 - \log_Q P(x)} Q^{-n} = \sum_{n \ge 1 - \lg P(x)} Q^{-n} = \frac{Q^{\lg P(x)}}{Q - 1} < P(x), \]

so condition (i) in Theorem 4.22 is satisfied. Condition (ii) holds for $c = 1$. Hence by (4.29) we get

\[ 0 \le L_{C_M,P} - \mathcal{H}_P = \sum_x P(x) \cdot (H_M(x) + \log_Q P(x)) \le 2. \]

□

Corollary 4.27. Assume that $f : A^* \to \mathbf{N}$ is a function such that the set $\{(x, n) \mid f(x) < n\}$ is c.e. and $\sum_x Q^{-f(x)} \le 1$. Let $P(x) = Q^{-f(x)}$. Then $P$ is a semi-measure semi-computable from below, and there exists a Chaitin computer $M$ (depending upon $f$) such that for all $x$,

\[ H_M(x) \le 1 + f(x). \tag{4.33} \]

Minimal programs for $M$ are almost optimal: the code $C_M$ satisfies the inequalities $0 \le L_{C_M,P} - \mathcal{H}_P \le 1$. There exists a universal Chaitin computer $U$ (depending upon $f$) such that the code $C_U$ satisfies the inequalities

\[ 0 \le L_{C_U,P} - \mathcal{H}_P \le 2. \]

Proof. We take $S = \{(x, n) \mid n > f(x)\}$. Clearly, $S = \{(x, n) \mid P(x) > Q^{-n}\}$. The first condition in Theorem 4.22 is satisfied as

\[ \sum_{n > f(x)} Q^{-n} = \frac{P(x)}{Q - 1} \le P(x), \]

for every $x$, and the second condition is satisfied for $c = 0$. □
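The geometric sum used in the proof of Corollary 4.27 can be spelled out step by step. This is only a worked restatement of that step, assuming $Q \ge 2$ and $P(x) = Q^{-f(x)}$ as in the corollary; the summation index $j$ is introduced here for illustration.

```latex
\[
\sum_{n > f(x)} Q^{-n}
  = Q^{-f(x)-1} \sum_{j \ge 0} Q^{-j}
  = Q^{-f(x)-1} \cdot \frac{Q}{Q-1}
  = \frac{Q^{-f(x)}}{Q-1}
  = \frac{P(x)}{Q-1}
  \le P(x).
\]
```

The last inequality is an equality exactly when $Q = 2$.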

Remark. When the semi-measure $P$ is given, an optimal prefix-code can be found for $P$. However, that code may be far from optimal for a different semi-measure. For example, let $A = \{0, 1\}$ and $C$ be a prefix-code such that $|C(x)| = 2|x| + 2$, for all $x$. Let $a > 0$ and consider the measure

Two radically different situations appear: if $a \le 1$, then

but if $a > 1$, then $L_{C,P_a} - \mathcal{H}_{P_a} < \infty$.

So, $C$ is asymptotically optimal for every measure $P_a$ with $1 < a$, but $C$ is far away from optimality if $0 < a \le 1$. Note that $P_a$ is computable provided $a$ is computable.

The next result shows that minimal programs are asymptotically optimal for every semi-measure semi-computable from below.

Theorem 4.28. Let $P$ be a semi-measure semi-computable from below, and $U$ a universal Chaitin computer. Then, there exists a constant $c_P$ (depending upon $P$) such that

\[ 0 \le L_{C_U,P} - \mathcal{H}_P \le 1 + c_P. \]

Proof. We take $M$ the Chaitin computer constructed in Proposition 4.26 and let $c_M$ be the simulation constant of $M$ on $U$. Then,

so we can take $c_P = c_M$. □


Remark. Theorem 4.28, which generalizes a result in [151] proven for computable measures, is important only for semi-measures for which the entropy is infinite. For example, the entropy of the semi-measure

\[ P(x) = \frac{2^{-|x|}}{(|x| + 2) \log_2(|x| + 2)} \]

is infinite. Using Lemma 4.23 we can obtain sharper inequalities. For example, for every universal Chaitin computer $U$, the code $C_U$ is almost optimal with respect to $P_U$:

\[ 0 \le L_{C_U,P_U} - \mathcal{H}_{P_U} \le 2. \]

If $f$ is a function as in Corollary 4.27 such that $\sum_x Q^{-f(x)} < 1$, then there exists a universal Chaitin computer $U$ such that

\[ 0 \le L_{C_U,P} - \mathcal{H}_P \le 1. \]

For example, we can take $f(x) = H_U(x)$, where $U$ is a universal Chaitin computer.

Proposition 4.29. Let $P$ be a computable semi-measure. Then, there exists a Chaitin computer $M$ such that

Proof. Note that $-\lg P(x) = \min\{n \mid n \in \mathbf{N},\ P(x) > Q^{-n}\}$ and then apply Theorem 4.22 to the set $S = \{(x, -\lg P(x)) \mid x \in A^*\}$ and constant $c = 0$. □

Corollary 4.30. Let $P$ be a computable semi-measure. Then, there exists a universal Chaitin computer $U$ such that

\[ H_U(x) \le 1 - \log_Q P(x). \]

We are now in the position to characterize all Chaitin computers satisfying the Algorithmic Coding Theorem and to construct a class of (universal) Chaitin computers for which the inequality is satisfied with constant $c = 0$.

Proposition 4.31. Let $M$ be a Chaitin computer and $c \ge 0$. The following statements are equivalent:

(a) For all $x$, $H_M(x) \le (1 + c) - \log_Q P_M(x)$.

(b) For all non-negative $n$, if $P_M(x) > Q^{-n}$, then $H_M(x) \le n + c$.

Proof. From $H_M(x) \le (1 + c) - \log_Q P_M(x)$ and $P_M(x) > Q^{-n}$ we deduce

\[ Q^{-n} < P_M(x) \le Q^{(1+c) - H_M(x)}, \]

hence $H_M(x) < n + 1 + c$, i.e. $H_M(x) \le n + c$. Conversely, if $P_M(x) > 0$, take $n = \min\{m \in \mathbf{N} \mid P_M(x) > Q^{-m}\}$; then $n \le 1 - \log_Q P_M(x)$, so by (b) we have $H_M(x) \le n + c \le (1 + c) - \log_Q P_M(x)$. □

Remark. For any Chaitin computer $M$ satisfying one of the equivalent conditions in Proposition 4.31, the Algorithmic Coding Theorem holds:

(4.34)

In fact, a Chaitin computer $M$ satisfies (4.34) iff condition (b) is satisfied. Every universal Chaitin computer $U$ satisfies condition (b), but not all Chaitin computers satisfy this condition. Indeed, to construct such an example, consider the following enumeration: for every string $x$ enumerate $Q^{|x|}$ copies of the pair $(x, 3|x| + 1)$. Use the Kraft-Chaitin Theorem to construct a Chaitin computer $M$ such that for every string $x$ there exist $Q^{|x|}$ different strings $u_x^i$, all of length $3|x| + 1$, such that

\[ M(u_x^i) = x, \quad i = 1, 2, \ldots, Q^{|x|}. \]

It is seen that $P_M(x) = Q^{-2|x| - 1}$, so taking $n_x = 2|x| + 2$ we get $P_M(x) > Q^{-n_x}$, but there is no constant $c$ such that $H_M(x) \le n_x + c$, for all strings $x$.
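The counterexample in the remark above can be checked with exact arithmetic. The sketch below is illustrative only: the helper names (`p_m`, `h_m`, `n_x`, `kraft_sum`) are hypothetical and not notation from the text, and a string $x$ is represented only by its length $n = |x|$, since every quantity involved depends on $x$ only through $|x|$.

```python
from fractions import Fraction


def p_m(Q, n):
    """P_M(x) for |x| = n: Q^n programs, each of length 3n + 1."""
    return (Q ** n) * Fraction(1, Q ** (3 * n + 1))


def h_m(n):
    """H_M(x) for |x| = n: every program producing x has length 3n + 1."""
    return 3 * n + 1


def n_x(n):
    """The threshold n_x = 2|x| + 2 chosen in the remark."""
    return 2 * n + 2


def kraft_sum(Q, max_len):
    """Partial Kraft sum of the enumeration over all strings of length <= max_len."""
    # There are Q^n strings of length n, each contributing P_M(x) = Q^-(2n+1).
    return sum((Q ** n) * p_m(Q, n) for n in range(max_len + 1))


if __name__ == "__main__":
    Q = 2
    for n in range(6):
        # Columns: |x|, whether P_M(x) > Q^(-n_x), and the gap H_M(x) - n_x.
        print(n, p_m(Q, n) > Fraction(1, Q ** n_x(n)), h_m(n) - n_x(n))
```

The partial Kraft sums stay below 1, so the Kraft-Chaitin Theorem applies, while the gap $H_M(x) - n_x = |x| - 1$ grows without bound, which is why no constant $c$ works.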

Some Chaitin computers satisfy condition (b) with $c = 0$, so their canonical programs are almost optimal. A class of (universal) such computers is provided in the next proposition.

Proposition 4.32. Let $M$ be a Chaitin computer such that for all programs $x \ne x'$ with $M(x) = M(x')$ we have $|x| \ne |x'|$. Then, for all $x$,

(4.35)

Proof. Consider the set $S = \{(x, |y|) \mid M(y) = x\}$, and note that

\[ P_M(x) = \sum_{(x,n) \in S} Q^{-n}, \]

as programs producing the same output have different lengths. In view of the hypothesis,

\[ P_M(x) > Q^{-n} \Rightarrow \exists (x, k_1) \in S\, [(k_1 < n) \lor (k_1 = n \land \exists k_2 (k_2 \ne k_1 \land (x, k_2) \in S))], \]

hence the second condition in Theorem 4.22 is satisfied with $c = 0$. Using Theorem 4.22 we deduce the existence of a Chaitin computer $M'$ such that $H_{M'}(x) \le 1 - \log_Q P_{M'}(x)$, for all $x$. Inequality (4.35) follows from $H_M(x) = \min\{n \mid (x, n) \in S\} = H_{M'}(x)$. □

Remark. Not every universal Chaitin computer satisfies the hypothesis of Proposition 4.32. However, if $V$ is a universal Chaitin computer, then one can effectively construct a universal Chaitin computer $U$ such that programs producing the same output via $U$ have different lengths and $H_U(x) = H_V(x)$, for every $x$; $P_U(x) \le P_V(x)$, for all $x$. Indeed, enumerate the graph of $V$ and as soon as a pair $(x, V(x))$ appears in the list do not include in the list any pair $(x', V(x'))$ with $x \ne x'$ and $V(x) = V(x')$. The set enumerated in this way, which is a subset of the graph of $V$, is the graph of the universal Chaitin computer $U$ satisfying the required condition.

4.5

Binary vs Non-binary Coding (1)

The time has come to ask the following question: "Why did we choose to present the theory in an apparently more general setting, i.e. with respect to an arbitrary alphabet, not necessarily binary?" It seems that there is a widespread feeling that the binary case encompasses the whole strength and generality of coding phenomena, at least from an algorithmic point of view. For instance, Li and Vitanyi write in their book [282]:

[the] measure treated in the main text is universal in the sense that neither the restriction to binary objects to be described, nor the restriction to binary descriptions (programs) results in any loss of generality.


The problem is the following: does there exist a binary asymptotically optimal coding of all strings over an alphabet with $q > 2$ elements? Surprisingly, the answer is negative.

We let $q > p \ge 2$ be naturals, and fix two alphabets, $A$, $X$, having $q$ and $p$ elements, respectively. The lengths of $x \in A^*$ and $y \in X^*$ will be denoted by $|x|_A$ and $|y|_X$, respectively. We fix the universal computer $\psi : A^* \times A^* \to A^*$ and the universal Chaitin computer $U : A^* \times A^* \to A^*$. We denote by $K$ the Kolmogorov-Chaitin complexity induced by $\psi$ and by $H$ the Chaitin complexity associated with $U$. We shall prove that the following two problems have negative answers:

1. Does there exist a computer $\eta : X^* \times A^* \to A^*$ which is universal for the class of all computers acting on $A^*$, i.e. a computer $\eta$ for which there exists a constant $c > 0$ such that for every $y \in A^*$, if $\psi(x, \lambda) = y$, then $\eta(z, \lambda) = y$, for some $z \in X^*$ with $|z|_X \le |x|_A + c$?

2. Does there exist a Chaitin computer $C : X^* \times A^* \to A^*$ which is universal for the class of all Chaitin computers acting on $A^*$?

We begin with a preliminary result.

Lemma 4.33. Consider the function $f : \mathbf{N} \to \mathbf{N}$ defined by

\[ f(n) = \lfloor (n + 1) \log_q p \rfloor + 1. \]

For every natural $n > \left\lfloor \frac{1 + \log_q p}{1 - \log_q p} \right\rfloor + 1$ one has

\[ q^{f(n)} > p^{f(n)} + p^n. \]

Proof. Clearly, $q^{f(n)} > p^{n+1}$. The inequality

\[ p^{n+1} \ge p^{f(n)} + p^n \]

is true for all natural $n > \left\lfloor \frac{1 + \log_q p}{1 - \log_q p} \right\rfloor + 1$. □
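The two inequalities used in the proof of Lemma 4.33 can be tested numerically for small alphabets. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, $f(n)$ is computed with exact integer arithmetic (as the largest $k$ with $q^k \le p^{n+1}$, plus one) to avoid floating-point rounding, and only the threshold uses floating-point logarithms.

```python
import math


def f(n, p, q):
    """f(n) = floor((n + 1) * log_q p) + 1, computed without floating point."""
    target = p ** (n + 1)
    k = 0
    while q ** (k + 1) <= target:  # find the largest k with q^k <= p^(n+1)
        k += 1
    return k + 1


def threshold(p, q):
    """floor((1 + log_q p) / (1 - log_q p)) + 1, using floating-point logs."""
    r = math.log(p, q)
    return math.floor((1 + r) / (1 - r)) + 1


def lemma_holds(n, p, q):
    """Check q^f(n) > p^(n+1) >= p^f(n) + p^n for one value of n."""
    fn = f(n, p, q)
    return q ** fn > p ** (n + 1) >= p ** fn + p ** n


if __name__ == "__main__":
    p, q = 2, 3
    t = threshold(p, q)
    print(t, all(lemma_holds(n, p, q) for n in range(t + 1, t + 50)))
```

Chaining the two inequalities gives $q^{f(n)} > p^{f(n)} + p^n$, so beyond the threshold there are more strings of length $f(n)$ over the larger alphabet than the counts $p^{f(n)}$ and $p^n$ combined.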

The next result says that complexities cannot be optimized better than linearly, i.e. the Invariance Theorem is the best possible result in this direction.


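Lemma 4.34. Fix a real number $0 < a < 1$. There is no computer $\eta : A^* \times A^* \to A^*$ and no Chaitin computer $C : A^* \times A^* \to A^*$ such that for all computers $\varphi$ and all Chaitin computers $D$,

\[ K_\eta(x) \le a K_\varphi(x) + O(1) \quad \text{and} \quad H_C(x) \le a H_D(x) + O(1). \]

Proof. Take $\varphi = \psi$ to see that the computer $\eta$ is universal. For $\varphi = \eta$ we get $K_\eta(x) \le a K_\eta(x) + O(1)$, i.e. $(1 - a) K_\eta(x) \le O(1)$, which means that $K_\eta$ is bounded, a contradiction. The same argument works for Chaitin computers. □

Theorem 4.35. There is no computer $\eta : X^* \times A^* \to A^*$ which is universal for the class of all computers acting on $A^*$.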

Proof. Assume, by absurdity, that $\eta$ satisfies the universality condition, i.e. there exists a constant $c > 0$ such that for every $y \in A^*$ there exists an $x \in X^*$ for which $\eta(x, \lambda) = y$, and

\[ |x|_X \le K(y) + c. \]

In view of Lemma 4.33, for every natural $n > \left\lfloor \frac{1 + \log_q p}{1 - \log_q p} \right\rfloor + 1$ one has

Consider an injective, c.e. enumeration of the domain of $\eta \subset X^*$,

We put $m_i = |e_i|_X$. For every $e_i \in X^*$ such that $m_i > \left\lfloor \frac{1 + \log_q p}{1 - \log_q p} \right\rfloor + 1$ we get, in a consistent way, i.e. without repetitions, a string

and put

Clearly, we may lose some finite subset of $e_i$'s; however, this does not affect the universality of $\eta$. So, $\Gamma : A^* \times A^* \to A^*$ is a computer which, in view of its construction and Lemma 4.33, satisfies the inequality

\[ K_\Gamma(x) \le K_\eta(x) \log_q p + O(1). \]

We have contradicted Lemma 4.34, as $0 < \log_q p < 1$. □

Theorem 4.36. There is no Chaitin computer $C : X^* \times A^* \to A^*$ which is universal for the class of all Chaitin computers acting on $A^*$.

Proof. We use the same construction as in the proof of Theorem 4.35. We have to check that in case $C$ is a Chaitin computer (i.e. $\mathrm{dom}(C)$ is prefix-free), then the domain of the resulting new computer is also prefix-free. We use the Kraft-Chaitin Theorem: for $C$ one has

\[ \sum_{i=1}^{\infty} p^{-m_i} \le 1, \]

so

\[ \sum_{i=1}^{\infty} Q^{-(\lfloor (m_i + 1) \log_q p \rfloor + 1)} \le 1, \]

as

\[ Q^{-(\lfloor (m_i + 1) \log_q p \rfloor + 1)} \le Q^{-(m_i + 1) \log_q p} \le p^{-m_i}. \]

So, according to the Kraft-Chaitin Theorem, for every $n > \left\lfloor \frac{1 + \log_q p}{1 - \log_q p} \right\rfloor + 1$ and $e_n \in \mathrm{dom}(C)$, there exists a string $x'_n \in A^{f(m_n)}$ such that the set

\[ \left\{ x'_n \in A^* \;\middle|\; n > \left\lfloor \frac{1 + \log_q p}{1 - \log_q p} \right\rfloor + 1 \right\} \subset A^* \]

is prefix-free. By Lemma 4.33, $x'_n$ can be taken in $A^{f(m_n)} \setminus X^{f(m_n)}$. We now define the Chaitin computer $\Gamma' : A^* \times A^* \to A^*$,

\[ \Gamma'(x'_n, \lambda) = C(e_n, \lambda), \quad \text{for } n > \left\lfloor \frac{1 + \log_q p}{1 - \log_q p} \right\rfloor + 1. \]

We thus have the same contradiction as in the proof of Theorem 4.35. □

The negative answers stated at the beginning of this section can be obtained just by taking $X = \{0, 1\}$.


4.6

Exercises and Problems

1. Show that the code $\varphi : \mathbf{N}_+ \to \{0, 1\}^*$, $\varphi(i) = 01^i$, $i \ge 1$, is not a prefix code, but it is uniquely decodable.

2. Let $\varphi : Y \to A^*$ be a prefix code. Show that the induced morphism (i.e. $\varphi : Y^* \to A^*$) is also a prefix code.

3. (Leung-Yan-Cheong and Cover) For every $i \ge 1$ put $l_i = \lceil \log_2(\frac{i}{2} + 1) \rceil$. Show that the following functions $\varphi : \mathbf{N}_+ \to \mathbf{N}$ satisfy the Kraft-Chaitin inequality in Theorem 4.1, for every $Q \ge 2$:

\[ \varphi(i) = l_i + a \lceil \log_2 l_i \rceil + \log_2((2^a - 1)/(2^a - 2)), \quad a > 1, \]
\[ \varphi(i) = l_i + 2 \lfloor \log_2(l_i + 1) \rfloor, \]
\[ \varphi(i) = l_i + \lfloor \log_2 l_i + \log_2(\log_2 l_i) + \cdots \rfloor + 4. \]

(We consider only iterates for which $\log_2(\log_2(\cdots(\log_2 l_i)\cdots))$ is positive.)

4. (Pippenger) To every string $x \in A^*$ we associate the interval

\[ I(x) = [kQ^{-|x|}, (k + 1)Q^{-|x|}), \]

where $k \in \{0, 1, \ldots, Q^{|x|} - 1\}$ is the exact position of $x$ among the strings in $A^{|x|}$, ordered lexicographically. In this way one gets a one-to-one function from $A^*$ onto the set of intervals $\{[kQ^{-n}, (k + 1)Q^{-n}) \mid n \ge 0,\ 0 \le k \le Q^n - 1\}$.
a) Show that a subset $S$ of $A^*$ is prefix-free iff to all distinct $x, y \in S$ there correspond disjoint intervals $I(x) \cap I(y) = \emptyset$.
b) Rewrite the algorithm presented in the proof of Theorem 4.1 in a geometrical form according to the above equivalence between strings and intervals; prove the correctness of the algorithm.

5. (Mandoiu) Let $c : \mathbf{N}_+ \to A^*$ be a p.c. code-string function. We say that $c$ is a free-extendable code if for all natural numbers $n \ge 1$ and every p.c. code-length function $f : \mathbf{N}_+ \to \mathbf{N}$ such that $f(i) = |c(i)|$, $1 \le i \le n$ (recall that $f$ satisfies condition (4.1) in Theorem 4.1), there exists a p.c. code-string function $c' : \mathrm{dom}(f) \to A^*$ such that $c(i) = c'(i)$, for $1 \le i \le n$, and $|c'(k)| = f(k)$, for all $k \in \mathrm{dom}(f)$. Informally, in a free-extendable code the code-strings are selected in a way that allows the continuation of the completion of every finite part of the code with new code-strings, for all possible compatible code-length functions. For example, the code-function $c : \mathbf{N}_+ \to \{0, 1\}^*$ defined by $c(i) = 0^{i-1}1$, $i \ge 1$, is a free-extendable code. However, not all prefix codes are free-extendable. Even in the binary case we may consider $c : \mathbf{N}_+ \to \{0, 1\}^*$, $c(1) = 00$, $c(2) = 10$, $c(3) = 01$, $c(4) = 11$ and $c(k) = \infty$, for $k \ge 5$. This prefix code is not free-extendable. Indeed, let $n = 2$ and $f : \{1, 2, 3\} \to \mathbf{N}$, $f(1) = 2$, $f(2) = 2$, $f(3) = 1$. Clearly, $f$ is a code-length function compatible with $c$ for $n = 2$, but there is no prefix code $c' : \{1, 2, 3\} \to A^*$ with $c'(1) = 00$, $c'(2) = 10$ and $|c'(3)| = 1$. Show that Theorem 4.1 is still valid for free-extendable codes.

6. (Grozea) Let $M \subset A^*$ be finite and prefix-free. An extension of $M$ is a string $x$, $x \notin M$, such that $M \cup \{x\}$ is still prefix-free. An extension root of $M$ is a minimal extension for $M$, i.e. $x$ is an extension of $M$, but no proper prefix of $x$ is an extension of $M$. We denote by $D(M)$ the set of all extension roots of $M$.
a) Calculate $D(M)$ for the following sets $M$ over the alphabet $\{a, b, c\}$: i) $\{ab, ac\}$, ii) $\{a\}$, iii) $\{abc\}$, iv) $\emptyset$.
b) Prove: i) $D(\emptyset) = \{\lambda\}$, ii) $M \cap D(M) = \emptyset$, iii) $D(M)$ is a finite prefix-free set, iv) $\#D(M) < Q \cdot \#M + 1$, v) every string that can be used to extend $M$ in a prefix-free manner has a prefix in $D(M)$, vi) for each $x \in D(M)$, $D(M \cup \{x\}) = D(M) \setminus \{x\}$, vii) $\mu(M) + \mu(D(M)) = 1$.
c) The profile of a set of strings $M$ is the histogram of the lengths of strings in $M$: $\mathrm{profile}(M)(i) = \#\{x \in M \mid i = |x|\}$, for $i \in \mathbf{N}$. A set $M$ has a thin profile (over $A$) if its profile is bounded by $Q - 1$ (recall that $Q = \#A$). Prove that a finite prefix-free set $M$ is free-extendable iff $D(M)$ has a thin profile.
d) Deduce the Kraft-Chaitin Theorem from the above statement. (Hint: the empty set is free-extendable.)

7. Show that for every Chaitin computer $C$ the sets $\{(x, n) \in A^* \times \mathbf{N} \mid H_C(x) \le n\}$ and $\{(x, n, m) \in A^* \times \mathbf{N} \times \mathbf{N}_+ \mid P_C(x) > n/m\}$ are c.e.

8. Show that given $y^*$ and $C$ one can computably enumerate the following two sets: $\{(x, n) \in A^* \times \mathbf{N} \mid H_C(x/y) \le n\}$, $\{(x, n, m) \in A^* \times \mathbf{N} \times \mathbf{N}_+ \mid P_C(x/y) > n/m\}$.

9. Show that the set $\{(x, y, n) \in A^* \times A^* \times \mathbf{N} \mid H(x/y) \le n\}$ is not c.e.

10. Show that $H(\mathrm{string}(H(x))/x) = O(1)$.

11. Show that $H(x, \mathrm{string}(H(x))) = H(x) + O(1)$.

12. As a cross between Kolmogorov-Chaitin complexity and Chaitin complexity we define $H_C(x/y)$, in which $C$ is self-delimiting, but $C$ receives $y$ instead of $y^*$:

\[ H_C(x/y) = \min\{|z| \mid z \in A^*,\ C(z, y) = x\}. \]

a) Show that the Invariance Theorem remains valid for $H$. Fix a universal computer $U$ and denote by $H$ its complexity ($H(x/y) = H_U(x/y)$, $H(x) = H(x/\lambda)$).
b) Show that there exists a constant $c > 0$ such that for all strings $x, y$, $H(x/y) \ge H(x/y) - c$.
c) Show that there exists a constant $d > 0$ such that for all strings $x, y$, $H(x/y) \le H(x/y) + H(y) + d$.
d) Prove the formula $H(x, \mathrm{string}(H(x))) = H(x) + O(1)$.
e) Prove the formula $H(\mathrm{string}(H(x))) \ne O(1)$.

13. Let $U$ and $V$ be two universal computers and assume that $U(w, \lambda) = y$ and $0 < Q^{1-k} < D(V, y; w) < 1 - Q^{-k}$, for some natural $k$. Show that we can effectively construct a universal computer $W$ such that
a) $H_W(x/y; w) \ge H_V(x/y; w)$, for all $x \in A^*$,
b) $H_W(x/y; w) > H_V(x/y; w)$, for all but a finite set of $x \in A^*$,
c) $D(W, y; w) < D(V, y; w)$.

14. Show that in the proof of Theorem 4.17 we may use any computable sequence $a : \mathbf{N} \to [0, 1]$ such that $\sum_{n > 0} a(n) = 1$, for instance the sequence $a(n) = 6/(\pi n)^2$.

15. Prove that among all computable semi-measures there is no universal one.

16. Show that $H(x, \mathrm{string}(H(x))) \asymp H(x)$.

17. Prove that for every universal enumerable semi-measure $m$,

\[ \sum_{y \in A^*} m(\langle x, y \rangle) \asymp m(x). \]

18. Show that $H(x, y) \asymp H(x) + H(y / \langle x, \mathrm{string}(H(x)) \rangle)$.

19. Let $A$, $X$ be two alphabets, and let $\varphi$ be a p.c. injective function from $X^*$ to $A^*$. We denote by $H_X$, $H_A$ the Chaitin complexities induced by two fixed Chaitin universal computers acting on $X$ and $A$, respectively. Show that if $A \subset X$, then $H_X(u) \le H_A(u) + O(1)$, for all $u \in \mathrm{dom}(\varphi)$.
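The correspondence between strings and intervals in Exercise 4 (Pippenger) lends itself to small experiments. The sketch below is one possible encoding, not a solution to the exercise: the function names are hypothetical, intervals are stored as pairs of exact fractions, and the position $k$ of a string is computed as its value in base $Q$ with respect to the order of letters in `alphabet`.

```python
from fractions import Fraction
from itertools import product


def interval(x, alphabet):
    """Return I(x) = [k * Q^-|x|, (k + 1) * Q^-|x|) as a pair (left, right)."""
    Q = len(alphabet)
    k = 0
    for letter in x:  # k = lexicographic position of x among strings of length |x|
        k = k * Q + alphabet.index(letter)
    width = Fraction(1, Q ** len(x))
    return (k * width, (k + 1) * width)


def disjoint(i, j):
    """Two half-open intervals are disjoint iff one ends before the other starts."""
    return i[1] <= j[0] or j[1] <= i[0]


def is_prefix_free(strings):
    """No string in the collection is a proper prefix of another one."""
    return not any(a != b and b.startswith(a) for a in strings for b in strings)


def intervals_pairwise_disjoint(strings, alphabet):
    ivs = [interval(s, alphabet) for s in strings]
    return all(disjoint(ivs[i], ivs[j])
               for i in range(len(ivs)) for j in range(i + 1, len(ivs)))


def all_strings(alphabet, max_len):
    """All strings over the alphabet of length at most max_len."""
    return ["".join(t) for n in range(max_len + 1) for t in product(alphabet, repeat=n)]
```

For instance, `interval("01", "01")` is $[1/4, 1/2)$, which lies inside `interval("0", "01")` $= [0, 1/2)$, reflecting the fact that 0 is a prefix of 01.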

4.7

History of Results

The Kraft-Chaitin Theorem comes from Chaitin [114], where a geometric proof is sketched and credit is given for this idea to N. J. Pippenger. The present proof is due to Calude and Grozea [81]; for other proofs see Calude and Kurta [92], Sălăgean-Mandache [354], Vereshchagin [415]. In this chapter we have followed Chaitin [114, 118, 121], although the proofs are quite different. The Speed-up Theorem was proven by Gewirtz [208], which is also a good introduction to the topic of this chapter. The Algorithmic Coding Theorem comes from Chaitin [113] and Gacs [199, 203]. The semi-measures were introduced in Zvonkin and Levin [455]. Section 4.5 is essentially based on Calude and Campeanu [64]. The material on prefix-free extendable codes comes from Mandoiu [294, 296] and Grozea [216]; see also Calude and Tomescu [99]. The analysis of the coding phenomenon was taken from Calude, Ishihara and Yamaguchi [87]. Example 4.20 comes from Hammer [221]. For applications of AIT in physics and quantum computing see, for example, Calude, Dinneen and Svozil [78], Denker, Woyczyński and Ycart [173], Ford [193], Kieu [254], Ruelle [351], Schack [357], Schmidhuber [358], Segre [363], Svozil [391, 393, 394, 395]. A nice presentation of universal coding appears in Andreasen [3].

Kolmogorov's interest in complexity and randomness went back to the early 1950s:

Information theory must precede probability theory, and not be based on it. By the very essence of this discipline, the foundations of information theory have a finite combinatorial character. The applications of probability theory can be put on a uniform basis. It is always a matter of consequences of hypotheses about the impossibility of reducing in one way or another the complexity of the description of the objects in question. Naturally, this approach to the matter does not prevent the development of probability theory as a branch of mathematics being a special case of general measure theory. The concepts of information theory as applied to infinite sequences give rise to very interesting investigations, which, without being indispensable as a basis of probability theory, can acquire a certain value in the investigation of the algorithmic side of mathematics as a whole.

Chaitin's early interest in complexity and randomness is described in his introductory chapter to [125] entitled A Life in Math:

In high school I was also interested in game theory, information theory and in Gödel's incompleteness theorem. These subjects were still relatively new and exciting then, and there were not too many books about them or about the computers either, which were also a novelty at that time. I first had the idea of defining randomness via algorithmic incompressibility as part of the answer for an essay question on the entrance exam to get into the Science Honors Program! But I forgot the idea for a few years.

More facts on the history of the subject may be found in Cover, Gacs and Gray [151], Chaitin [125, 122, 131, 132, 134], Li and Vitanyi [281, 282] and Uspensky [408].

Chapter 5

Random Strings

We all agree that your theory is crazy, but is it crazy enough?
Niels Bohr

In this chapter we will address the question: "What is a random string?" A detailed analysis, at both empirical and formal levels, suggests that the correct question is not "Is x a random string?" but "To what extent is x random?"

5.1

An Empirical Analysis

Paradoxes often turn out to be a major source of inspiration for mathematical ideas. This is the case with Berry's paradox for randomness. (G. G. Berry was an Oxford librarian and the paradox was first published by Bertrand Russell [352].) Consider the number one million, one hundred one thousand, one hundred twenty one. This number appears to be

the first number not nameable in under ten words.

However, the above expression has only nine words, pointing out a naming inconsistency: it is an instance of Berry's paradox.

We can reformulate the above argument in terms of program-size complexity. Assume that there exists a computable lower bound $B$ for $H$. Clearly, $B$ is unbounded as $H$ is unbounded. Hence, for every non-negative integer $m$ we can effectively compute a string $x$ of complexity greater than $m$. Indeed, we compute $B(u)$ for all strings $u$ till we get a value greater than $m$. (Of course, we do not know when we obtain the first $u$ such that $B(u) > m$, but we are sure that eventually such a string will be found.) So we can construct the following computable function: $f(m) = \min\{x \mid B(x) > m\}$. By construction, $H(f(m)) > m$. Since $f$ is computable, $H(f(m)) \le H(\mathrm{string}(m)) + O(1) \le \log m + O(1)$, so $m \le \log m + O(1)$, a contradiction.

It follows that the property of nameability is inherently ambiguous and, consequently, too powerful to be freely used. The list of similar properties is indeed very long; another famous example refers to the classification of numbers as interesting versus dull. There can be no dull numbers: if there were, the first such number would be interesting on account of its dullness.

Of course, we may discuss the linguistic and mathematical soundness of the above analysis. For instance, what is the smallest even number greater than two, which is not the sum of two primes? We do not pursue such a course here (see, for instance, Borel [43]); our aim is more modest, namely to explain Chaitin's idea of using the inconsistency in Berry's paradox as a powerful method to measure the complexity of finite objects (see Chaitin [112]).

We pass to another example, which is also a paradox: the paradox of randomness. Consider the following 32-length binary strings:

x = 00000000000000000000000000000000,
y = 10011001100110011001100110011001,
z = 01101000100110100101100100010110,
u = 00001001100000010100000010100010,
v = 01101000100110101101100110100101.

According to classical probability theory the strings x, y, z, u, v are all equally probable, i.e. the probability of each is $2^{-32}$. However, a simple analysis reveals that these five strings are extremely different from the point of view of regularity. The string x has a maximum regularity which can be expressed by the following compact definition: only zeros. The string y is a bit more complex. To specify it we may use the following definition: eight blocks 1001. The string z is obtained by concatenating the string 0110100010011010 with its mirror. The strings u, v look definitely less regular, i.e. more complex. However, they are quite different. For a more compact definition of u we proceed as follows: we order the binary strings of a given length according to the increasing frequency of the ones, and within classes of equal frequency in lexicographical order (0 < 1), and we define a string by its number in this enumeration. To specify the position of a string with small frequency of ones (i.e. $m/n \le 1/2$, where $m$ is the number of ones and $n$ is the length) one needs approximately $n\mathcal{H}(m/n)$ binary digits, where $\mathcal{H} : [0, 1/2] \to \mathbf{R}$ is the entropy function defined by

\[ \mathcal{H}(0) = 0, \quad \mathcal{H}(t) = -t \log_2 t - (1 - t) \log_2(1 - t). \]
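The compact descriptions of x, y and z and the entropy estimate for a string with few ones can be tried out directly. The following is a small illustrative sketch, not taken from the text: the function names are hypothetical, the string with 8 ones among 32 bits is the fourth string of the list above, and `exact_index_bits` (the base-2 logarithm of the number of strings of length $n$ with at most $m$ ones) is added here only as a point of comparison for the estimate $n\mathcal{H}(m/n)$.

```python
import math

x = "0" * 32                     # "only zeros"
y = "1001" * 8                   # "eight blocks 1001"
half = "0110100010011010"
z = half + half[::-1]            # the string followed by its mirror image
u = "00001001100000010100000010100010"   # 8 ones among 32 bits


def entropy(t):
    """Binary entropy function; by convention entropy(0) = entropy(1) = 0."""
    if t in (0, 1):
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)


def entropy_estimate(s):
    """Approximate number of bits n * H(m/n) needed to index s (assumes m/n <= 1/2)."""
    n, m = len(s), s.count("1")
    return n * entropy(m / n)


def exact_index_bits(s):
    """log2 of the number of strings of length n having at most m ones."""
    n, m = len(s), s.count("1")
    return math.log2(sum(math.comb(n, j) for j in range(m + 1)))


if __name__ == "__main__":
    print(round(entropy_estimate(u), 2), round(exact_index_bits(u), 2))
```

Under these assumptions the estimate for the 32-bit string with 8 ones is roughly 26 bits, below the 32 bits needed to write the string out literally.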

We need a constant number, say $c > 0$, of binary digits to specify the above enumeration, so our string will require approximately

\[ n\mathcal{H}(m/n) + c \]

binary digits. Clearly, the above number is smaller than $n$ for small values of the fraction $m/n$. The string u does satisfy this condition, since 8/32 < 1/2, hence u admits a much shorter definition. In contrast, the last string, v, appears to have no shorter definition at all.

The above distinction is very sharp in the case of long strings (e.g. it is easier to specify the number $10^{10^{10}}$ than the first 100 digits of $\pi$), in contrast to the case of short strings (what are the "random" strings of length 1?), when it becomes meaningless.

Suppose that persons A and B give us a sequence of 32 bits each, saying that they were obtained from independent coin tosses. If A gives the string u and B gives the string x, then we would believe A and not believe B: the string u seems to be random, but the string x does not, and we know a bit about the reason for this phenomenon. Laplace [273], pp. 16-17, was, in a sense, aware of this paradox, as may be clear from the following quotation:

In the game of heads and tails, if head comes up a hundred times in a row then this appears to us extraordinary, because after dividing the nearly infinite number of combinations that can arise in a hundred throws into regular sequences, or those in which we observe a rule that is easy to grasp, and into irregular sequences, the latter are incomparably more numerous.

In other words: the non-random strings are the strings possessing some kind of regularity, and since the number of all those strings (of a given length) is small, the occurrence of such a string is extraordinary. Furthermore, regularity is a good basis for compression. Accordingly, randomness means the absence of any compression possibility; it corresponds to maximum information content (because after dropping any part of the string, there remains no possibility of recovering it). As we have noticed, most strings have this property. In contrast, most strings we deal with do not.

A simple counting analysis is illustrative. A string of length $n$ will be said to be $c$-incompressible if its compressed length is greater than or equal to $n - c$. For example, the 16-incompressible strings of length 64 are exactly the strings whose compressed length is 48 or larger. Note that every $c$-incompressible string is $(c+1)$-incompressible, so every 4-incompressible string is 5-incompressible. Based on the fact that the number of strings of length $n$ is $2^n$, it turns out that at least half of all the strings of every length are 1-incompressible, at least three-quarters are 2-incompressible, at least seven-eighths are 3-incompressible, and so on. In general, at least $1 - 2^{-c}$ of all strings of length $n$ are $c$-incompressible. For example, about 99.9% of all strings of length 64 cannot be compressed by more than 16% and about 99.99999998% of these strings cannot be compressed by more than 50%.

The information content of a phrase in a natural language (English, for example) can be recovered even if some letters (words) are omitted. The reason comes from the redundancy of most spoken languages. As a consequence, there exist many efficient programs to compress texts written in natural languages. It is important to emphasize that all these methods work very well on texts written in some natural language, but they do not work well on average, i.e. on all possible combinations of letters of the same length. Redundancy is also a very powerful handle to readers of mathematical books (and, in general, of scientific literature), and also to cryptanalysts (e.g. Caesar's ciphers, which are just permutations of letters, can be broken by frequency analysis; see more on this topic in Salomaa [356]). A hypothetical language in which there are only strings with maximum information content gives no preference to strings (i.e. they have equal frequency); this makes the cipher impossible to break. However, such languages do not exist (and cannot be constructed, even with the help of the best computers available now or in the future); redundancy is essential and inescapable in a spoken language (and to a large extent in most artificial languages; see Marcus [298]). Furthermore, as Bennett [28] points out:

From the earliest days of information theory it has been appreciated that information per se is not a good measure of message value. For example, a typical sequence of coin tosses has high information content but little value; an ephemeris, giving the positions of the moon and planets every day for a hundred years, has no more information than the equations of motion and initial conditions from which it was calculated, but saves its owner the effort of recalculating these positions. The value of a message thus appears to reside not in its information (its absolutely unpredictable parts), nor in its obvious redundancy (verbatim repetitions, unequal digit frequencies), but rather in what might be called its buried redundancy - parts predictable only with difficulty, things the receiver could in principle have figured out without being told, but only at considerable cost in money, time, or computation. In other words, the value of a message is the amount of mathematical or other work plausibly done by its originator, which its receiver is saved from having to repeat.
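The counting analysis of incompressible strings given earlier in this section can be reproduced with a few lines of exact arithmetic. The sketch rests on two assumptions that are made explicit here: compressed versions are themselves binary strings, and each compressed string decompresses to at most one original. The function names are hypothetical.

```python
from fractions import Fraction


def short_descriptions(n, c):
    """Number of binary strings of length < n - c, i.e. 2^(n-c) - 1 candidates."""
    return 2 ** max(n - c, 0) - 1


def incompressible_fraction_lower_bound(n, c):
    """Lower bound on the fraction of length-n strings that are c-incompressible."""
    # At most short_descriptions(n, c) strings can have a compressed length
    # below n - c, because each short description yields at most one string.
    return 1 - Fraction(short_descriptions(n, c), 2 ** n)


if __name__ == "__main__":
    for c in (1, 2, 3, 10, 32):
        print(c, float(incompressible_fraction_lower_bound(64, c)))
```

The bound is slightly better than $1 - 2^{-c}$ because there are only $2^{n-c} - 1$ strings shorter than $n - c$.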

In the next example we will discuss the frequency problem. Suppose that we have a bag containing 90 round discs, bearing the numbers 1 to 90. We extract one disc from the bag "at random" and we note: i) whether the number it bears is odd/even,

ii) the remainder of the integer division of the number it bears by 5, and replace the disc.

We repeat this experiment 100 times and get the following two tables:

0 1 0 0 1 0 1 1 1 0
1 1 1 1 0 0 0 1 0 1
1 0 1 0 1 0 0 0 0 0
0 1 0 1 0 0 1 0 1 0
1 0 0 0 1 1 0 1 1 1
0 0 0 0 1 0 1 0 1 1
1 0 0 1 0 1 1 1 1 1
1 1 1 0 0 0 1 0 1 1
0 1 1 0 1 0 0 1 0 0
1 0 0 0 1 1 0 0 1 0

3 3 0 1 1 1 2 4 3 3
4 2 2 1 0 0 2 3 2 2
2 4 1 2 3 0 4 1 4 1
4 3 3 0 4 4 3 0 0 1
1 2 4 0 4 4 2 4 2 0
0 2 0 1 0 1 4 3 2 2
4 1 2 0 4 2 2 4 0 1
1 2 3 0 3 4 2 2 1 1
0 0 4 3 3 3 3 3 0 3
1 1 0 1 4 3 0 2 1 4

(Odd numbers have been denoted by 1 and even numbers by 0.) The relative frequency of the result one is 49/100. If we consider only the first, third, fifth and so forth, i.e. we take only numbers in odd columns, we find that ones appear in 24 cases out of 50; the relative frequency is 48/100. Using only the numbers appearing on positions 1, 4, 7, ... we get 20 ones out of 40, i.e. the relative frequency is 50/100. Choosing numbers according to the sequence of primes (2, 3, 5, ..., 89, 97) we get 16 ones out of 25 (frequency 64/100), and according to the Fibonacci sequence (1, 2, 3, 5, 8, 13, 21, 34, 55, 89) we get 6 ones out of 10. These calculations show that, in all different selections which we have tried out, the ones always appear with a relative frequency of about 1/2. Similar results can be seen in the second table; for instance, the relative frequency of all numbers i = 0, 1, 2, 3, 4 is about 1/5 (both on the whole and on odd columns). According to prime or Fibonacci selections, the results are a bit different.
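The disc experiment and the selection rules discussed above can be simulated. The sketch below does not reproduce the tables of the text; it draws its own 100 numbers with a pseudo-random generator (the seed and all function names are arbitrary choices made for illustration) and computes relative frequencies along the whole sequence, the odd positions, every third position, the prime positions and the Fibonacci positions.

```python
import random


def selections(length):
    """Position sets (1-based) used to sub-sample a sequence of the given length."""
    positions = list(range(1, length + 1))
    primes = [k for k in positions
              if k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))]
    fibonacci, a, b = [], 1, 2
    while a <= length:
        fibonacci.append(a)
        a, b = b, a + b
    return {
        "all": positions,
        "odd positions": positions[0::2],
        "every third": positions[0::3],
        "primes": primes,
        "fibonacci": fibonacci,
    }


def experiment(seed=0, draws=100):
    """Draw discs numbered 1..90 with replacement; record parity and remainder mod 5."""
    rng = random.Random(seed)
    numbers = [rng.randint(1, 90) for _ in range(draws)]
    return [n % 2 for n in numbers], [n % 5 for n in numbers]


def relative_frequency(sequence, positions, value):
    """Relative frequency of `value` among the entries at the given 1-based positions."""
    picked = [sequence[k - 1] for k in positions]
    return picked.count(value) / len(picked)


if __name__ == "__main__":
    parity, remainder = experiment()
    for name, pos in selections(100).items():
        print(name, len(pos), relative_frequency(parity, pos, 1))
```

With only 10 or 25 selected positions, as for the Fibonacci and prime rules, the observed frequencies fluctuate far more than on the full sequence, which is consistent with the remark that those selections give somewhat different results.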


Of course, the above results come from a "real experiment". However, it is not too difficult to construct an "ideal" string (knowing the counting rules in advance). For example, the following string over the alphabet {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} has ideal behaviour with respect to the relative frequency computed according to the sequences containing i) all positions, ii) odd/even positions, iii) prime positions, and iv) Fibonacci positions:

76385799450482632013227791898895517410165702366053
44280199754483613219166840273555078206493093682174.

Before presenting some conclusions it is natural to ask the following question: Are there any random strings? Of course, we do not yet have the necessary tools to answer this question properly, but we may try to approach it informally. Consider the minimal or canonical programs defined in Section 3.1. We claim that every such program should be random, independently of whether it generates a random output or not. Indeed, assume that x is a minimal program generating y. If x is not random, then there exists a program z generating x which is substantially smaller than x. Now, consider the program

from z calculate x, then from x calculate y.

This program is only a few letters longer than z, and thus it should be much shorter than x, which was supposed to be minimal. We have reached a contradiction.

Our analysis leads to the following empirical conclusions:

• Testing the randomness property is computationally hard.
• Randomness is an asymptotic property; it is meaningless for short strings.
• Randomness excludes order and regularity.
• Randomness implies equal frequencies for all digits.
• Randomness can be identified, to a large extent, with incompressibility.
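The last conclusion can be illustrated, very roughly, with an off-the-shelf compressor. The sketch below is an assumption-laden stand-in: `zlib` is a practical compressor, not the program-size complexity used in this book, and the helper `compression_ratio` is hypothetical. It only shows that a highly regular string shrinks a lot while pseudo-random bytes do not.

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)

regular = b"01" * 5000                      # 10000 bytes with an obvious pattern
rng = random.Random(0)                      # seeded, so the run is reproducible
irregular = bytes(rng.getrandbits(8) for _ in range(10000))

print("regular  :", round(compression_ratio(regular), 3))
print("irregular:", round(compression_ratio(irregular), 3))
```

A ratio close to (or slightly above) 1 for the second string only says that this particular compressor found no regularity; it proves nothing about incompressibility in the sense discussed here.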


5.2 Chaitin's Definition of Random Strings

To motivate our approach we use the analogy between "tallness" and "randomness". To assess whether a person is or is not tall we proceed as follows. We choose a unit of measure (say, centimetre) and we evaluate the height. We get an absolute value. Next, we establish "a set of reference people". For instance, if we wish to assess how tall a little girl is we fix an age and we relate her height to the average height of girls of that age. But if we discuss the same question for a teenager, the situation is completely different. It follows that the adjective tall is relative. To assess it correctly we need both components: the exact one (height) and the relative one (comparison within a fixed set). It is fortunate that in English we have two words to express this: height and tall.

For randomness we proceed in a similar way, trying to capture, as well as possible, the idea that a string is random if it cannot be algorithmically compressed. First we use a measure of complexity of strings (K or H); this represents the "absolute component". Secondly, we define randomness "relative to a set", the relative component. In our case we assess the degree of randomness of a string with respect to the set of all strings having the same length. Of course, the success or failure of the approach depends upon the measure of complexity we are adopting. In searching for an appropriate inequality marking the border between randomness and non-randomness we follow the ideas of Chaitin and we first analyse the asymptotical behaviour of the complexity H.

Theorem 5.1 (Chaitin). Let $f : \mathbf{N} \to A^*$ be an injective, computable function.

a) One has
$$\sum_{n \ge 0} Q^{-H(f(n))} \le 1.$$

b) Consider a computable function $g : \mathbf{N}_+ \to \mathbf{N}_+$.

i) If $\sum_{n \ge 1} Q^{-g(n)} = \infty$, then $H(f(n)) > g(n)$, for infinitely many $n \in \mathbf{N}_+$.

ii) If $\sum_{n \ge 1} Q^{-g(n)} < \infty$, then $H(f(n)) \le g(n) + O(1)$.


Proof. a) It is plain that
$$\sum_{n \ge 0} Q^{-H(f(n))} \le \sum_{x \in A^*} Q^{-H(x)} \le \sum_{x \in A^*} P(x) \le 1.$$
(We have used Lemma 3.30 and Lemma 3.29.)

b) i) Assume first that $\sum_{n \ge 1} Q^{-g(n)} = \infty$. If there exists a natural $N$ such that $H(f(n)) \le g(n)$, for all $n \ge N$, then we get a contradiction:
$$\infty = \sum_{n \ge N} Q^{-g(n)} \le \sum_{n \ge N} Q^{-H(f(n))} \le \sum_{n \ge 0} Q^{-H(f(n))} \le 1.$$

ii) In view of the hypothesis in b) ii), there exists a natural $N$ such that $\sum_{n \ge N} Q^{-g(n)} \le 1$. We can use the Kraft-Chaitin Theorem in order to construct a Chaitin computer $C : A^* \times A^* \stackrel{o}{\to} A^*$ with the following property: for every $n \ge N$ there exists $x \in A^*$ with $|x| = g(n)$ and $C(x, \lambda) = f(n)$. So, there exists a natural $c$ such that for all $n \ge N$,
$$H(f(n)) \le H_C(f(n)) + c \le g(n) + c.$$

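Example 5.2. $\sum_{n \ge 0} Q^{-H(\mathrm{string}(n))} \le 1$.

Example 5.3. i) We take $g(n) = \lfloor \log_Q n \rfloor$. It can be seen that
$$\sum_{n \ge 1} Q^{-g(n)} = \infty,$$
so $H(\mathrm{string}(n)) > \lfloor \log_Q n \rfloor$, for infinitely many $n \ge 1$.

ii) For $g(n) = 2\lfloor \log_Q n \rfloor$, one has
$$\sum_{n \ge 1} Q^{-g(n)} \le Q \sum_{n \ge 1} \frac{1}{n^2} < \infty,$$
so $H(\mathrm{string}(n)) \le 2\lfloor \log_Q n \rfloor + O(1)$. □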


For $Q > 2$ and $g(n) = \lfloor \log_{Q-1} n \rfloor$, one has
$$\sum_{n \ge 1} Q^{-g(n)} < Q \sum_{n \ge 1} \frac{1}{n^{\log_{Q-1} Q}} < \infty,$$
so $H(\mathrm{string}(n)) \le \lfloor \log_{Q-1} n \rfloor + O(1)$.
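The two series in Example 5.3 can be compared numerically. The sketch below is an illustration with exact rational arithmetic; the helper names `floor_log` and `partial_sum` are hypothetical. Cutting the sum at $Q^k - 1$ groups the terms by the value of $\lfloor \log_Q n \rfloor$, so for $g(n) = \lfloor \log_Q n \rfloor$ the partial sum is $k(Q-1)$, which grows without bound, while for $g(n) = 2\lfloor \log_Q n \rfloor$ it is $Q(1 - Q^{-k})$, which stays below $Q$.

```python
from fractions import Fraction

def floor_log(n, Q):
    """Largest integer k with Q**k <= n (integer arithmetic, no rounding issues)."""
    k = 0
    while Q ** (k + 1) <= n:
        k += 1
    return k

def partial_sum(g, Q, upto):
    """Exact value of the sum of Q**(-g(n)) for n = 1, ..., upto."""
    return sum(Fraction(1, Q ** g(n)) for n in range(1, upto + 1))

Q = 2
for k in (4, 8, 12):
    upto = Q ** k - 1
    divergent = partial_sum(lambda n: floor_log(n, Q), Q, upto)
    convergent = partial_sum(lambda n: 2 * floor_log(n, Q), Q, upto)
    print(k, divergent, float(convergent))
```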

Remark. Chaitin's complexity $H$ can be characterized as a minimal function, semi-computable in the limit from above, that lies on the borderline between the convergence and the divergence of the series
$$\sum_{n \ge 0} Q^{-H(\mathrm{string}(n))}.$$

We are now able to analyse the maximum Chaitin complexity of strings of a given length.

Theorem 5.4. For every $n \in \mathbf{N}$, one has
$$\max_{x \in A^n} H(x) = n + H(\mathrm{string}(n)) + O(1).$$

Proof. In view of Theorem 4.10, for every string $x$ of length $n$,
$$H(x) \le H(\mathrm{string}(n), x) + O(1) \le H(\mathrm{string}(n)) + H(x/\mathrm{string}(n)) + O(1).$$
To get the relation
$$\max_{x \in A^n} H(x) \le n + H(\mathrm{string}(n)) + O(1)$$
we shall prove that for every string $x$ of length $n$,
$$H(x/\mathrm{string}(n)) \le n + O(1).$$
We fix $n \ge 0$ and define the Chaitin computer $C_n : A^n \times A^* \stackrel{o}{\to} A^*$ by
$$C_n(x, y) = x \quad \text{if } U(y, \lambda) \neq \infty,$$
for $x \in A^n$, $y \in A^*$. Accordingly, $U((\mathrm{string}(n))^*, \lambda) = \mathrm{string}(n)$ and
$$H(x/\mathrm{string}(n)) \le H_{C_n}(x/\mathrm{string}(n)) + O(1) = \min\{|z| \mid z \in A^*,\ C_n(z, (\mathrm{string}(n))^*) = x\} + O(1) \le n + O(1).$$


To prove the converse relation we need the following:

Intermediate Step. For every $n \ge 0$,
$$\#\{x \in A^n \mid H(x) < n + H(\mathrm{string}(n)) - t + O(1)\} < Q^{n-t+O(1)}.$$

By Theorem 4.10 one has
$$H(x) < n + H(\mathrm{string}(n)) - t + O(1) \iff H(x/\mathrm{string}(n)) < n - t + O(1),$$
so
$$\#\{x \in A^n \mid H(x) < n + H(\mathrm{string}(n)) - t + O(1)\} = \#\{x \in A^n \mid H(x/\mathrm{string}(n)) < n - t + O(1)\} < Q^{n-t+O(1)}.$$
Accordingly, not all strings of length $n$ have complexity less than $n + H(\mathrm{string}(n)) + O(1)$, i.e.
$$\max_{x \in A^n} H(x) \ge n + H(\mathrm{string}(n)) + O(1).$$ □

The above discussion may be concluded with the following definition. Let $\Sigma : \mathbf{N} \to \mathbf{N}$ be the function defined by
$$\Sigma(n) = \max_{x \in A^n} H(x).$$
In view of Theorem 5.4, $\Sigma(n) = n + H(\mathrm{string}(n)) + O(1)$. We define the random strings of length $n$ to be the strings with maximal self-delimiting complexity among the strings of length $n$, i.e. the strings $x \in A^n$ having
$$H(x) \approx \Sigma(n).$$

Definition 5.5. A string $x \in A^*$ is Chaitin $m$-random ($m$ is a natural number) if $H(x) \ge \Sigma(|x|) - m$; $x$ is Chaitin random if it is 0-random.

The above definition depends upon the fixed universal computer $U$; the generality of the approach comes from the Invariance Theorem. Obviously, for every length $n$ and for every $m \ge 0$ there exists a Chaitin $m$-random string $x$ of length $n$. We denote by $\mathrm{RAND}^C_m$, $\mathrm{RAND}^C$, respectively, the sets of Chaitin $m$-random strings and random strings.

It is worth noting that the property of Chaitin $m$-randomness is asymptotic. Indeed, for $x \in \mathrm{RAND}^C_m$, the larger the difference between $|x|$ and $m$, the more random $x$ is. There is no sharp dividing line between randomness and pattern, but it looks as though all $x \in \mathrm{RAND}^C_m$ with $m \le H(\mathrm{string}(|x|))$ have a true random behaviour.

How many strings $x \in A^n$ have maximal complexity, i.e. $H(x) = \Sigma(|x|)$? The answer will be presented in the next theorem.

Theorem 5.6. There exists a natural constant $c > 0$ such that
$$\gamma(n) = \#\{x \in A^n \mid H(x) = \Sigma(|x|)\} > Q^{n-c},$$
for all natural $n$.

Proof. We make use of the formula

$$H(0^{n-\sigma(n)}(\gamma(n))_Q) = \Sigma(n) + O(1),$$
to be proven in Theorem 9.4.³ Here we work with the alphabet $A_Q = \{0, 1, \ldots, Q-1\}$, $(m)_Q$ is the base-$Q$ representation of the natural $m$ and $\sigma(n) = |(\gamma(n))_Q| \le n$, by Lemma 9.3. From the above formula the evaluation follows quite easily since if there are $j$ consecutive 0s at the left end of a string, this makes its complexity drop by $j - O(\log_Q j)$, because, roughly speaking, we can replace these $j$ 0s by a minimum-size self-delimiting program for $j$, which is only $H(\mathrm{string}(j)) = O(\log_Q j)$ long.

We construct Chaitin's computer $C$ acting as follows: if $U(u, \lambda) = \mathrm{string}(n)$ and $U(v, \lambda) = \mathrm{string}(j)$, then
$$C(uvy, \lambda) = 0^j y,$$
for every string $y$ of length $|y| = n - j$. It follows that
$$H_C(0^j y) \le H(\mathrm{string}(n)) + H(\mathrm{string}(j)) + n - j,$$
so
$$H(0^j y) \le \Sigma(n) + H(\mathrm{string}(j)) - j + O(1) \le \Sigma(n) - j + O(\log_Q j) + O(1).$$

³ Recall that $0^m$ is the string $00\ldots0$ of length $m$.

If $0^{n-\sigma(n)}(\gamma(n))_Q = 0^j y$, then
$$H(0^{n-\sigma(n)}(\gamma(n))_Q) = \Sigma(n) + O(1) \le \Sigma(n) - j + O(\log_Q j) + O(1).$$
Thus there can be at most $O(1)$ consecutive 0s at the left end of $0^{n-\sigma(n)}(\gamma(n))_Q$. Hence, for some constant $c > 0$ one has
$$\gamma(n) > Q^{n-c},$$
which was to be proven. □

How large is $c$? Out of $Q^n$ strings of length $n$, at most $1 + Q + Q^2 + \cdots + Q^{n-m-1} = (Q^{n-m} - 1)/(Q - 1)$ can be described by programs of length less than $n - m$. The ratio between $(Q^{n-m} - 1)/(Q - 1)$ and $Q^n$ is less than $10^{-i}$ as $Q^m \ge 10^i$, irrespective of the value of $n$. For instance, this happens in case $Q = 2$, $m = 20$, $i = 6$; it means that less than one in a million among the binary strings of any given length is not Chaitin 20-random.
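The arithmetic behind this estimate can be checked exactly. The sketch below is illustrative only; the function name `short_program_fraction` is hypothetical, and it simply computes the ratio between the number of strings of length less than $n - m$ and the number $Q^n$ of strings of length $n$.

```python
from fractions import Fraction

def short_program_fraction(Q, n, m):
    """Ratio between the number of strings of length < n - m and Q**n."""
    programs = (Q ** (n - m) - 1) // (Q - 1)   # 1 + Q + ... + Q**(n-m-1), an exact integer
    return Fraction(programs, Q ** n)

for n in (25, 50, 100):
    ratio = short_program_fraction(2, n, 20)
    print(n, float(ratio), ratio < Fraction(1, 10 ** 6))
```

For $Q = 2$ and $m = 20$ the ratio is always below $2^{-20}$, which is slightly less than $10^{-6}$, whatever the value of $n \ge m$.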

5.3 Relating Complexities K and H

In this section we prove a useful result relating the blank-endmarker complexity K and the self-delimiting complexity H.

Theorem 5.7. For all $x \in A^*$ and $t \in \mathbf{N}$, if $K(x) < |x| - t$, then
$$H(x) < |x| + H(\mathrm{string}(|x|)) - t + O(\log_Q t).$$

Proof. We start by noting that the set $\{(x, t) \in A^* \times \mathbf{N} \mid K(x) < |x| - t\}$ is c.e., thus if $K(x) < |x| - t$, then we can eventually discover this. Moreover, there are fewer than
$$Q^{n-t}/(Q-1)$$
strings of length $n$ having this property. Thus if we are given $|x| = n$ and $t$ we need to know only $n - t$ digits to pick out any particular string $x \in A^n$ with this property. That is, as the first $x$ that we discover has this property, the second $x$ that we discover has this property, ..., the $i$th $x$ that we discover has this property, and $i < Q^{n-t}/(Q-1)$, it follows that any $x \in A^n$ that satisfies the inequality
$$K(x) < |x| - t$$
has the property that
$$H(x/\langle \mathrm{string}(n), \mathrm{string}(t) \rangle) < n - t + O(1).$$
So, by Lemma 3.14 and Theorem 4.10:
$$\begin{aligned}
H(x) &< H(x/\langle \mathrm{string}(n), \mathrm{string}(t) \rangle) + H(\langle \mathrm{string}(n), \mathrm{string}(t) \rangle) + O(1) \\
&< n - t + H(\mathrm{string}(n), \mathrm{string}(t)) + O(1) \\
&< n - t + H(\mathrm{string}(n)) + H(\mathrm{string}(t)) + O(1) \\
&< n - t + H(\mathrm{string}(n)) + O(\log_Q t),
\end{aligned}$$
since in general $H(\mathrm{string}(m)) < O(\log_Q m)$. □

Corollary 5.8. For every $t \in \mathbf{N}$ and $x \in \mathrm{RAND}^C_t$, one has $K(x) \ge |x| - T$, whenever $T - O(\log_Q T) \ge t$.

Proof. We fix $t \in \mathbf{N}$ and we pick $x \in \mathrm{RAND}^C_t$, i.e. $H(x) \ge \Sigma(|x|) - t$. If $K(x) < |x| - T$, then by Theorem 5.7
$$H(x) < \Sigma(|x|) - T + O(\log_Q T),$$
which means $T - O(\log_Q T) < t$. □

The old version of algorithmic randomness for strings (see Kolmogorov [259] and Chaitin [110, 111, 122]) made use of the concept of blank-endmarker program-size complexity; in that approach a string $x$ is $t$-random if $K(x) \ge |x| - t$. Corollary 5.8 shows that Chaitin randomness is as strong as the old notion of randomness. Solovay [375] has proven that Chaitin's definition is actually stronger. There are many arguments favouring the new approach, an important one being the possibility to define random (infinite) sequences in complexity-theoretic terms. One can do this with self-delimiting complexity, but not with blank-endmarker complexity (see Chapter 6).

5.4 A Statistical Analysis

In this section we confront Chaitin's definition of randomness with the probability point of view. As we have already said, the present proposal identifies randomness with incompressibility. In order to justify this option we have to show that the strings that are incompressible have various properties of stochasticity identified by classical probability theory. It is not so difficult, although tedious, to check separately such a single property. However, we may proceed in a better way, due to the celebrated theory developed by Martin-Löf: we demonstrate that the incompressible strings do possess all conceivable effectively testable properties of stochasticity. Here we include the known properties, but also possible unknown ones. A general transfer principle will emerge, by virtue of which various results from classical probability theory carry over automatically to random strings.

The ideas of Martin-Löf's theory are rooted in statistical practice. We are given an element $x$ of some sample space (associated with some distribution) and we want to test the hypothesis "$x$ is a typical outcome". Being typical means "belonging to every reasonable majority". An element $x$ will be "random" just in case $x$ lies in the intersection of all such majorities.

A level of a statistical test is a set of strings which are found to be relatively non-random (by the test). Each level is a subset of the previous level, containing fewer and fewer strings, considered more and more non-random. The number of strings decreases exponentially fast at each level. In the binary case, a test contains at level 1 all possible strings, at level 2 only at most 1/2 of the strings, at level 3 only 1/4 of all strings, and so on; accordingly, at level $m$ the test contains at most $2^{n-m}$ strings of length $n$. We now give the formal definition.

Definition 5.9. A c.e. set $V \subset A^* \times \mathbf{N}_+$ is called a Martin-Löf test if the following two properties hold true:

1) $V_{m+1} \subset V_m$, for all $m \ge 1$ (here $V_m = \{x \in A^* \mid (x, m) \in V\}$ is the $m$-section of $V$),

2) $\#(A^n \cap V_m) < Q^{n-m}/(Q-1)$, for all $n \ge m \ge 1$.

By definition, the empty set is a Martin-Löf test.


The set $V_m$ is called the critical region at level $Q^{-m}/(Q-1)$. (Getting an outcome string $x$ in $V_m$ means rejection of the randomness hypothesis for $x$.) A string $x$ is declared "random" at level $m$ by $V$ in case $x \notin V_m$ and $|x| > m$. The next example models the following simple idea (see the second example discussed in Section 5.1): if a binary string $x$ has too many ones (zeros), then it cannot be random.
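For a finite set of pairs the two conditions of Definition 5.9 can be checked directly. The sketch below is an illustration under stated assumptions: a test is represented as a finite Python set of `(string, level)` pairs (so the c.e. requirement is trivially met), the counting condition is checked only for $n \ge m$, exactly as the definition is worded, and the names `section`, `is_martin_lof_test` and `critical_level` are hypothetical.

```python
from fractions import Fraction

def section(V, m):
    """The m-section V_m = {x : (x, m) in V}."""
    return {x for (x, level) in V if level == m}

def is_martin_lof_test(V, Q):
    """Check conditions 1) and 2) of Definition 5.9 for a finite set V."""
    if not V:
        return True                              # the empty set is a test by definition
    if any(m < 1 for (_, m) in V):
        return False                             # levels must be positive
    top = max(m for (_, m) in V)
    for m in range(1, top + 1):
        Vm = section(V, m)
        if not section(V, m + 1) <= Vm:          # condition 1): V_{m+1} is a subset of V_m
            return False
        for n in {len(x) for x in Vm if len(x) >= m}:
            count = sum(1 for x in Vm if len(x) == n)
            if not count < Fraction(Q ** (n - m), Q - 1):   # condition 2)
                return False
    return True

def critical_level(V, x):
    """Largest level at which x is rejected, or 0 if x is never rejected."""
    return max((m for (y, m) in V if y == x), default=0)

W = {("000", 1), ("010", 1), ("111", 1)}
print(is_martin_lof_test(W, 2), critical_level(W, "010"), critical_level(W, "011"))
```

The set `W` used in the usage line is the finite test that appears later in this section as an example of a non-representable test.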

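Example 5.10. The set
$$V = \left\{(x, m) \in A^* \times \mathbf{N}_+ \;\Big|\; \left|\frac{N_i(x)}{|x|} - \frac{1}{Q}\right| > Q^m \frac{1}{\sqrt{|x|}}\right\},$$
where $N_i(x)$ is the number of occurrences of the letter $a_i$ in $x$, is a Martin-Löf test.

Proof. Clearly, $V$ is c.e. and satisfies condition 1). In view of the formula
$$\#\left\{x \in A^n \;\Big|\; \left|\frac{N_i(x)}{|x|} - \frac{1}{Q}\right| > \varepsilon\right\} \le \frac{Q^{n-2}(Q-1)}{n\varepsilon^2},$$
one gets
$$\#\left\{x \in A^n \;\Big|\; \left|\frac{N_i(x)}{|x|} - \frac{1}{Q}\right| > Q^m \frac{1}{\sqrt{|x|}}\right\} \le \frac{Q^{n-2}(Q-1)}{Q^{2m}} = Q^{n-2-2m}(Q-1) < \frac{Q^{n-m}}{Q-1}.$$ □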
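The counting formula used in this proof can be checked by brute force for small parameters. The sketch below is illustrative: it counts the strings of length $n$ whose number of occurrences of one fixed letter deviates from $n/Q$ by more than $\varepsilon n$, using the fact that exactly $\binom{n}{k}(Q-1)^{n-k}$ strings contain that letter $k$ times, and compares the count with the bound $Q^{n-2}(Q-1)/(n\varepsilon^2)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def count_deviating(n, Q, eps):
    """Number of x in A^n with |N_i(x)/n - 1/Q| > eps, for one fixed letter a_i."""
    target = Fraction(1, Q)
    return sum(comb(n, k) * (Q - 1) ** (n - k)
               for k in range(n + 1)
               if abs(Fraction(k, n) - target) > eps)

def chebyshev_bound(n, Q, eps):
    """The upper bound Q^(n-2) (Q-1) / (n eps^2) as an exact rational."""
    return Fraction(Q) ** (n - 2) * (Q - 1) / (n * eps ** 2)

eps = Fraction(3, 10)
print(count_deviating(10, 2, eps), float(chebyshev_bound(10, 2, eps)))
```

For binary strings of length 10 and $\varepsilon = 3/10$ only the strings with 0, 1, 9 or 10 ones deviate that much, which gives 22 strings, well below the bound of about 284.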

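Example 5.11. Let $\varphi : A^* \stackrel{o}{\to} A^*$ be a computer. The set
$$V(\varphi) = \{(x, m) \in A^* \times \mathbf{N}_+ \mid K_\varphi(x) < |x| - m\}$$
is a Martin-Löf test.

Proof. First we show, by a dovetailing argument, that $V(\varphi)$ is c.e.: one has
$$K_\varphi(x) < |x| - m \iff \varphi(y) = x \text{ for some } y \in A^* \text{ with } |y| < |x| - m.$$
Condition 1) is clearly satisfied. For the inequality 2) we proceed to the following computation:
$$\#(A^n \cap (V(\varphi))_m) = \#\{x \in A^n \mid (x, m) \in V(\varphi)\} \le \#\{y \in A^* \mid |y| < n - m\} = \frac{Q^{n-m} - 1}{Q - 1} < Q^{n-m}/(Q-1).$$ □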

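Comment. A test of the form $V(\varphi)$ …

Definition 5.12. To every Martin-Löf test $V$ we associate the critical level $m_V : A^* \to \mathbf{N}$,
$$m_V(x) = \begin{cases} \max\{m \ge 1 \mid (x, m) \in V\}, & \text{if } (x, 1) \in V, \\ 0, & \text{otherwise.} \end{cases}$$
A string $x$ is declared $q$-random by a Martin-Löf test $V$ if $x \notin V_q$ and $q < |x|$.

Remark. If $x$ is $q$-random with respect to $V$, then $m_V(x) < |x| - 1$.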

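Definition 5.13. A Martin-Löf test $W$ is called representable if $W = V(\varphi)$, for some computer $\varphi$.

Proposition 5.14. Let $W = V(\varphi)$ be a representable Martin-Löf test.

1. If $m_W(x) > 0$, then $m_W(x) = |x| - K_\varphi(x) - 1$.

2. One has $m_W(x) = 0 \iff K_\varphi(x) \ge |x| - 1$.

Proof. If $m_W(x) \ge 1$, then
$$\begin{aligned}
m_W(x) &= \max\{m \ge 1 \mid (x, m) \in W\} \\
&= \max\{m \ge 1 \mid (x, m) \in V(\varphi)\} \\
&= \max\{m \ge 1 \mid m \le |x| - K_\varphi(x) - 1\} \\
&= |x| - K_\varphi(x) - 1.
\end{aligned}$$
Finally,
$$m_W(x) = 0 \iff m_{V(\varphi)}(x) = 0 \iff (x, 1) \notin V(\varphi) \iff K_\varphi(x) \ge |x| - 1.$$ □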
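The relation between the critical level of a test of the form $V(\varphi)$ and the complexity $K_\varphi$ can be observed on a toy example. In the sketch below the "computer" is just a finite lookup table `phi` from programs to outputs (an assumption made for illustration; real computers are partial computable functions), and the names `K`, `in_test` and `critical_level` are hypothetical.

```python
def K(phi, x):
    """Length of a shortest program z with phi(z) = x, or None if there is none."""
    lengths = [len(z) for z, out in phi.items() if out == x]
    return min(lengths) if lengths else None

def in_test(phi, x, m):
    """Membership of (x, m) in V(phi) = {(x, m) : K_phi(x) < |x| - m}."""
    k = K(phi, x)
    return k is not None and k < len(x) - m

def critical_level(phi, x):
    """max{m >= 1 : (x, m) in V(phi)}, or 0 when (x, 1) is not in V(phi)."""
    levels = [m for m in range(1, len(x) + 1) if in_test(phi, x, m)]
    return max(levels, default=0)

# A toy computer given by a finite table (purely illustrative).
phi = {"": "000", "0": "0101", "1": "111111", "00": "000"}

for x in ["000", "0101", "111111", "11"]:
    k = K(phi, x)
    formula = len(x) - k - 1 if k is not None else None
    print(repr(x), "K =", k, "critical level =", critical_level(phi, x), "|x| - K - 1 =", formula)
```

Whenever the critical level is positive it coincides with $|x| - K_\varphi(x) - 1$; the string `"11"` has no program in the table, so its critical level is 0.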

Example 5.15. Not every Martin-Löf test is representable.

Proof. Indeed, we take $W = \{(000, 1), (010, 1), (111, 1)\}$. Obviously, $W$ is a Martin-Löf test. Assume that there exists a computer

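$$\varphi(\lambda) = \begin{cases} x_t, & \text{if } t = \min_r\{m_r = |x_r| - 1\} \in \mathbf{N}, \\ \infty, & \text{otherwise,} \end{cases}$$
and for $z \neq \lambda$,
$$\varphi(z) = \begin{cases} x_t, & \text{if } z = \mathrm{string}(s(|z| - 1) + i - 1),\ t = \min_r\{m_r = |x_r| - |z| - 1, \text{ and } \#\{1 \le j \le r \mid m_j = |x_j| - |z| - 1\} = i\} \in \mathbf{N}, \\ \infty, & \text{otherwise.} \end{cases}$$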

(An expression of the form $\min_r\{\ldots\}$ may be sometimes undefined; the notation $\min_r\{\ldots\} \in \mathbf{N}$ emphasizes the fact that the minimum is defined.) We are now proving the equality $W = V(\varphi)$. … $m$, two cases may occur:

1) If $|x| = m + 1$, then this is the unique pair $(x, m) \in W$ with $|x| = m + 1$, so we put $z = \lambda$; as
$$\#(A^n \cap W_m) \le Q^{n-m-1} = Q^{m+1-m-1} = Q^0 = 1,$$
there exists a unique string $y \in A^{m+1}$ such that $(y, m) \in W$, namely $y = x$.

2) If $|x| > m + 1$, we then compute
$$i = \#\{1 \le j \le t \mid m_j = m,\ |x| = |x_j|\}$$
and notice that, by hypothesis, $1 \le i \le Q^{|x|-m-1}$. We put
$$z = \mathrm{string}(s(|x| - m - 2) + i - 1)$$
and observe that from … In both cases we have found a string $z \in A^{|x|-m-1}$ such that $x =$
Remark. All conditions in Theorem 5.16 are necessary. For example, the Martin-Löf test $V = \{(00, 1), (000, 1), (000, 2)\}$ is not representable: it satisfies (5.1), but $(00, 1)$, $(000, 2)$ are both in $V$.


Corollary 5.17. For every Martin-Löf test $V$ and every string $u \in A^*$, $|u| \ge 2$, the set $uV = \{(ux, m) \mid (x, m) \in V\}$ is a representable Martin-Löf test.

Proof. The set $uV$ is a Martin-Löf test and

and $A^{m+1} \cap (uV)_m = \emptyset$. □

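Lemma 5.18. For every $u \in A^*$, $|u| \ge 2$, there exists a constant $j \ge 0$ such that $K(x) \le K(ux) + j$, for all $x \in A^*$.

Proof. We define a computer $\varphi(z) = h_u(\psi(z))$, where $\psi$ is the universal computer defining $K$ and $h_u(ux) = x$. By the Invariance Theorem, we have
$$K(x) \le K_\varphi(x) + j,$$
and
$$\begin{aligned}
K_\varphi(x) &= \min\{|z| \mid z \in A^*,\ \varphi(z) = x\} \\
&= \min\{|z| \mid z \in A^*,\ h_u(\psi(z)) = x\} \\
&= \min\{|z| \mid z \in A^*,\ \psi(z) = ux,\ h_u(ux) = x\} \\
&= K(ux).
\end{aligned}$$
So, $K(x) \le K(ux) + j$. □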

Definition 5.19. A Martin-Löf test $U$ is called universal in case for every Martin-Löf test $V$, there exists a constant $c$ (depending upon $U$ and $V$) such that $V_{m+c} \subset U_m$, $m = 1, 2, \ldots$

Theorem 5.20. A Martin-Löf test $U$ is universal iff for every Martin-Löf test $V$ there exists a constant $c$ (depending upon $U$ and $V$) such that
$$m_V(x) \le m_U(x) + c,$$
for all $x \in A^*$.


Proof. Let $U$ be universal. If $m_V(x) = 0$, the inequality is obviously true. If $m_V(x) > 0$, one has $x \in V_{m_V(x)}$. We have to check the inequality only in case $m_V(x) - c > 0$. Accordingly,
$$V_{m_V(x)} = V_{(m_V(x)-c)+c} \subset U_{m_V(x)-c},$$
so $x \in U_{m_V(x)-c}$, i.e. $m_V(x) - c \le m_U(x)$.

Conversely, assume $m \ge 1$, $V_{m+c} \neq \emptyset$. If $x \in V_{m+c}$, then $m_V(x) \ge m + c$, so
$$m_U(x) \ge m_V(x) - c \ge m \ge 1,$$
i.e. $x \in U_{m_U(x)} \subset U_m$. Hence $V_{m+c} \subset U_m$, showing that $U$ is universal. □

We now introduce another measure for randomness.

Definition 5.21. A function $\delta : A^* \to \mathbf{N}_+$ semi-computable from below is called a deficiency of randomness function if for all natural $n, m \ge 1$,
$$\#\{x \in A^n \mid \delta(x) > m\} \le (Q^{n-m} - 1)/(Q - 1).$$

Example 5.22. The function $\delta : A^* \to \mathbf{N}_+$ defined by $\delta(\lambda) = \delta(a_m) = 0$, $m = 2, \ldots, Q$, $\delta(x_1 x_2 \ldots x_n) = \max\{i \ge 1 \mid x_1 = x_2 = \cdots = x_{2i-1} = a_1\}$ is a deficiency of randomness function.

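Example 5.23. We take a computer $\varphi$ and define
$$\delta(x) = \begin{cases} |x| - K_\varphi(x), & \text{if } K_\varphi(x) < |x| - 1, \\ 1, & \text{otherwise.} \end{cases}$$
Then $\delta$ is a deficiency of randomness function.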

Theorem 5.24. Let $\delta : A^* \to \mathbf{N}_+$ be a function. The following statements are equivalent:

1. The function $\delta$ is a deficiency of randomness function.

2. There exists a Martin-Löf test $V$ such that $\delta(x) = m_V(x) + 1$, for all strings $x \in A^*$.


Proof. Assume first that $\delta$ is a deficiency of randomness function and construct the set
$$V[\delta] = \{(x, m) \in A^* \times \mathbf{N}_+ \mid \delta(x) > m\}.$$
It is clear that $V[\delta]$ is a Martin-Löf test. If $\delta(x) > 1$, then
$$m_{V[\delta]}(x) = \max\{i \ge 1 \mid x \in (V[\delta])_i\} = \max\{i \ge 1 \mid \delta(x) > i\} = \delta(x) - 1 > 0.$$
Finally, $\delta(x) = 1$ iff $x \notin (V[\delta])_1$, i.e. $m_{V[\delta]}(x) = 0$.

Conversely, it is clear that $\delta$ is semi-computable from below and
$$\begin{aligned}
\#\{x \in A^n \mid \delta(x) > m\} &= \#\{x \in A^n \mid m_V(x) + 1 > m\} \\
&= \#\{x \in A^n \mid m_V(x) \ge m\} \\
&\le \#\{x \in A^n \mid x \in V_m\} \\
&\le (Q^{n-m} - 1)/(Q - 1).
\end{aligned}$$ □

Theorem 5.25. Let $\psi : A^* \stackrel{o}{\to} A^*$ be a computer. The following statements are equivalent:

A) The computer $\psi$ is universal, i.e. for every computer $\varphi : A^* \stackrel{o}{\to} A^*$ there exists a constant $c$ (depending upon $\psi$ and $\varphi$) such that
$$K_\psi(x) \le K_\varphi(x) + c,$$
for all $x \in A^*$.

B) For every Martin-Löf test $V$ there exists a natural $q$ (depending upon $V$ and $\psi$) such that for all $x \in A^*$
$$m_V(x) \le |x| - K_\psi(x) + q.$$

C) For every deficiency of randomness function $\delta : A^* \to \mathbf{N}_+$ there exists a constant $s$ (depending upon $\delta$ and $\psi$) such that for all $x \in A^*$
$$\delta(x) \le |x| - K_\psi(x) + s.$$

D) The Martin-Löf test $V(\psi)$ is universal and there exists a constant $d$ with
$$K_\psi(x) \le |x| + d,$$
for all $x \in A^*$.

Proof. A) $\Rightarrow$ B) Let $V$ be a Martin-Löf test and take $u \in A$, $W = uV$; so, by Corollary 5.17, $W$ is representable, i.e. $W = V(\varphi)$, for some computer $\varphi$. Clearly, $m_V(x) = m_{V(\varphi)}(ux)$. If $m_V(x) = 0$, then $K_\psi(x) \le |x| + d$ (because $\psi$ is universal), i.e. $m_V(x) = 0 \le |x| - K_\psi(x) + d$. If $m_V(x) > 0$, then
$$K_\varphi(ux) = |ux| - m_{V(\varphi)}(ux) - 1 = |x| - m_{V(\varphi)}(ux) = |x| - m_V(x).$$
Hence, by Lemma 5.18,
$$K_\psi(x) \le K_\psi(ux) + j \le K_\varphi(ux) + c + j = |x| - m_V(x) + c + j.$$
We take $q = \max(c + j, d)$, to get the inequality in B).

B) $\Rightarrow$ C) Let $\delta$ be a deficiency of randomness function and construct the Martin-Löf test $V[\delta] = \{(x, m) \in A^* \times \mathbf{N}_+ \mid \delta(x) > m\}$ as in the proof of Theorem 5.24. By B),
$$\delta(x) = m_{V[\delta]}(x) + 1 \le |x| - K_\psi(x) + (q + 1),$$
for all $x \in A^*$.

C) $\Rightarrow$ D) Let $V$ be a Martin-Löf test and $x \in A^*$. In view of Theorem 5.24, $\delta(x) = m_V(x) + 1$ is a deficiency of randomness function that satisfies (by C)) the inequality
$$1 + m_V(x) = \delta(x) \le |x| - K_\psi(x) + s,$$
for all $x \in A^*$. If $m_{V(\psi)}(x) = 0$, then $|x| - K_\psi(x) \le 1$, so $m_V(x) \le s$. If $m_{V(\psi)}(x) > 0$, then $m_{V(\psi)}(x) = |x| - K_\psi(x) - 1$ and $m_V(x) \le |x| - K_\psi(x) - 1 + s \le m_{V(\psi)}(x) + s$. So, $V(\psi)$ is universal. We now take $d = s$ and notice that $0 \le m_V(x) \le |x| - K_\psi(x) + s$.

D) $\Rightarrow$ A) Let $\varphi$ be a computer and consider the Martin-Löf test $V(\varphi)$. If $m_{V(\varphi)}(x) = 0$, then $K_\varphi(x) \ge |x| - 1$; but $K_\psi(x) \le |x| + d$ (by D)), so $K_\psi(x) \le K_\varphi(x) + (d + 1)$. If $m_{V(\varphi)}(x) > 0$, then $m_{V(\varphi)}(x) = |x| - K_\varphi(x) - 1 \le m_{V(\psi)}(x) + t$, i.e. $m_{V(\psi)}(x) \ge |x| - K_\varphi(x) - (1 + t)$. If $m_{V(\psi)}(x) = 0$, then $K_\psi(x) - d \le |x| \le K_\varphi(x) + 1 + t$. If $m_{V(\psi)}(x) > 0$, then $m_{V(\psi)}(x) = |x| - K_\psi(x) - 1 \ge |x| - K_\varphi(x) - t - 1$, so $K_\psi(x) \le K_\varphi(x) + t$. We set $c = d + 1 + t$; then $K_\psi(x) \le K_\varphi(x) + c$. So, $\psi$ is a universal computer. □


Theorem 5.26 (Martin-Löf asymptotical formula). Let $\psi$ be a universal computer and $U$ be a universal Martin-Löf test. Then there exists a constant $c$ (depending upon $\psi$, $U$) such that for all $x \in A^*$
$$\big|\, |x| - K_\psi(x) - m_U(x) \,\big| \le c.$$

Proof. In view of Theorem 5.25 ($\psi$ is a universal computer and $U$ is a universal Martin-Löf test) we can pick $q$ and $t$ such that for all $x \in A^*$
$$m_U(x) \le |x| - K_\psi(x) + q$$
and
$$m_{V(\psi)}(x) \le m_U(x) + t.$$
We are now using Proposition 5.14. If $m_{V(\psi)}(x) = 0$, then $|x| - K_\psi(x) \le 1$ and $m_U(x) \ge 0 \ge -t \ge -t + |x| - K_\psi(x) - 1$. If $m_{V(\psi)}(x) \neq 0$, then $m_U(x) \ge m_{V(\psi)}(x) - t = |x| - K_\psi(x) - 1 - t$. Finally we take
$$c = \max(q, 1 + t).$$ □

Theorem 5.27. We fix $t \in \mathbf{N}$. Almost all strings in $\mathrm{RAND}^C_t$ will be declared eventually random by every Martin-Löf test.

Proof. If $x \in \mathrm{RAND}^C_t$, and $\psi$ is a universal computer, then $K_\psi(x) \ge |x| - T$, for all natural $T$ with $T - O(\log_Q T) \ge t$, by virtue of Corollary 5.8. We fix now a Martin-Löf test $V$. There exists a $q > 1$ such that
$$V_{i+q} \subset (V(\psi))_i,$$
for all $i = 1, 2, \ldots$; so, $x \notin V_{T+q}$. So, if $|x| \ge T + q$, then $V$ declares $x$ random. □

Corollary 5.28. Every deficiency of randomness function $\delta$ is bounded on every set $\mathrm{RAND}^C_t$.

Comment. Theorem 5.27 says that all Chaitin $t$-random strings pass all possible effective tests of stochasticity. We have here a first (and strong) argument supporting the adequacy of Chaitin's definition.

5.5 A Computational Analysis

We pursue the analysis of the relevance of Chaitin's definition by confronting it with a natural, computational requirement: there should be no algorithmic way to recognize which strings are random. First we show that the absolute complexity $H$ is not computable.

Theorem 5.29. There is no p.c. function

domain such that H(x) =
$f(a_1^i a_2) = \min\{x \in B \mid H(x) \ge Q^i\}$, $i \ge 1$. Since 0, $H(f(a_1^i a_2)) \ge Q^i$ and for all $x \in A^*$, $H(x) \le H_f(x) + c$. Accordingly, for infinitely many $i > 0$, we have

1

o

This yields a contradiction.

With the same proof we may show that there is no p. c. function
all x E dom(
Proof Let us introduce the set

Ct = {x E A* I H(x)

~

Ixl- t},

and prove that the set Ct is immune. As RAN Df is an infinite subset of Ct, we deduce that RANDf itself is immune. Assume, by absurdity,


that $D$ is an infinite computable subset of $C_t$. We define the p.c. function $F : A^* \stackrel{o}{\to} A^*$ by
$$F(a_1^i a_2) = \min\{x \in D \mid |x| \ge t + 2(i + 1)\}.$$

It is plain that F has a computable graph. Furthermore

For infinitely many natural i, we have

o

This yields a contradiction.

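Recall now that $(\varphi_x)_{x \in A^*}$ is an acceptable Gödel numbering, $\varphi_x : A^* \stackrel{o}{\to} A^*$ and $W_x = \mathrm{dom}(\varphi_x)$. The above theorem can be expressed as
$$(\forall B \subset A^*)(B \text{ infinite and c.e.} \Rightarrow B \setminus \mathrm{RAND}^C_t \neq \emptyset).$$
There are two (classically equivalent) ways to represent the above statement:

1. $(\forall x \in A^*)\,(W_x \text{ infinite} \Rightarrow (\exists y \in A^*)\ y \in W_x \setminus \mathrm{RAND}^C_t)$,

2. $(\forall x \in A^*)\,(W_x \subset \mathrm{RAND}^C_t \Rightarrow (\exists n \in \mathbf{N})\ \#(W_x) \le n)$.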

Based on these statements we can formulate two constructive versions of immunity: The set $R \subset A^*$ is called constructively immune if there exists a p.c. function $\varphi : A^* \stackrel{o}{\to} A^*$ such that for all $x \in A^*$, if $W_x$ is infinite, then $\varphi(x) \neq \infty$ and $\varphi(x) \in W_x \setminus R$. The set $R \subset A^*$ is called effectively immune if there exists a p.c. function $\sigma : A^* \stackrel{o}{\to} \mathbf{N}$ such that for all $x \in A^*$, if $W_x \subset R$, then $\sigma(x) \neq \infty$ and $\#(W_x) \le \sigma(x)$.

It is worth noting that there exist constructively immune sets which are not effectively immune and vice versa. Moreover, if the complement of an immune set is c.e., then that set is constructively immune. Hence, $\mathrm{RAND}^C_t$ is constructively immune, since its complement is c.e. We now present a direct proof of this fact.

Theorem 5.32. For every $t \ge 0$, $\mathrm{RAND}^C_t$ is constructively immune.

Proof. The complement, $A^* \setminus C_t$, is a c.e. and infinite set, so we can construct two computable functions $f, g : A^* \to A^*$ such that for all $x \in A^*$:

1. $W_{f(x)} = W_x \cap (A^* \setminus C_t)$,
2. $\mathrm{range}\, \varphi_{g(x)} = W_x$,
3. if $W_x \neq \emptyset$, then $\varphi_{g(x)}$ is total.

We define the p.c. function $h : A^* \stackrel{o}{\to} A^*$ by
$$h(x) = \begin{cases} \varphi_{g(x)}(a_1), & \text{if } W_x \neq \emptyset, \\ \infty, & \text{otherwise.} \end{cases}$$
If $W_x$ is infinite, then $h(f(x)) \neq \infty$ and $h(f(x)) = \varphi_{g(f(x))}(a_1)$, thus
$$h(f(x)) \in W_{f(x)} = W_x \cap (A^* \setminus C_t).$$
In other words, for all $x \in A^*$, if $W_x$ is infinite, then $h(f(x)) \neq \infty$ and $h(f(x)) \in W_x \setminus C_t$, i.e. $C_t$ is constructively immune with respect to $\varphi(x) = h(f(x))$. Again, $\mathrm{RAND}^C_t$ is an infinite subset of $C_t$, a constructively immune set, so it is constructively immune. □

We focus our attention on the second constructive version of the immunity property: effective immunity. We start with a general result, which is interesting in itself. Let $\langle\ ,\ \rangle : A^* \times \mathbf{N} \to A^*$ be a computable bijective function.

Theorem 5.33. We can effectively compute a constant $d$ such that if
$$W_e \subset \{\langle w, m \rangle \in A^* \times \mathbf{N} \mid H(w) > m\},$$
then $n \le H(e) + d$, for all $\langle u, n \rangle \in W_e$.

Proof. We choose the universal Chaitin computer $\Theta(a_1^i a_2 v) = \varphi_{\mathrm{string}(i)}(v)$ and note that $H_\Theta(x) \le H_{\varphi_{\mathrm{string}(i)}}(x) + i + 1$. So,
$$H(x) \le H_\Theta(x) + c \le H_{\varphi_{\mathrm{string}(i)}}(x) + c + i + 1.$$
The partial function $f : A^* \stackrel{o}{\to} A^*$ operates on inputs of the form $a_1^n a_2 u$ ($n \in \mathbf{N}_+$, $u \in A^*$) according to the following instructions:

1. $f$ tries to compute $U(u, \lambda)$;
2. in case a result is obtained, say $U(u, \lambda) = e$, $f$ starts generating the c.e. set $W_e$ until a pair $\langle w, m \rangle \in W_e$ is found such that $m > |u| + n + c + 1$, and
3. $f$ outputs $w$.

There exists a computable function $F : A^* \to A^*$ such that for all $m \in \mathbf{N}$, $u \in \mathrm{dom}(U_\lambda)$,
$$\varphi_{F(\mathrm{string}(m))}(u) = f(a_1^m a_2 u).$$
To this computable function we apply the Recursion Theorem to get a natural $n$ such that
$$\varphi_{\mathrm{string}(n)}(u) = \varphi_{F(\mathrm{string}(n))}(u) = f(a_1^n a_2 u), \qquad (5.2)$$
for all $u \in A^*$. We claim that if $U(v, \lambda) = e$, then $\varphi_{\mathrm{string}(n)}(v) = \infty$. Indeed, if $\varphi_{\mathrm{string}(n)}(v) = w \neq \infty$, then (by (5.2))
$$\langle w, m \rangle \in W_e, \quad H(w) > m > |v| + n + c + 1,$$
and
$$H(w) \le H_{\varphi_{\mathrm{string}(n)}}(w) + n + c + 1 \le |v| + n + c + 1.$$
We have arrived at a contradiction. If $U_\lambda(v) = e$ and $\langle w, m \rangle \in W_e$, then $m \le |v| + n + c + 1$. Indeed,
$$f(a_1^n a_2 v) = \varphi_{\mathrm{string}(n)}(v) = \infty.$$
In particular, if $|v| = H(e)$, then $m \le H(e) + d$, where $d = c + n + 1$. □

Next we get a stronger version of Theorem 5.31.

Corollary 5.34. If $g : \mathbf{N} \to \mathbf{N}$ is a computable function with $g(n) \le n - t$ and $\lim_{n \to \infty} g(n) = \infty$, then $\{w \in A^* \mid H(w) > g(|w|)\}$ is immune.

Proof. Let $W_e \subset \{w \in A^* \mid H(w) > g(|w|)\}$. We put
$$V_e = \{\langle w, g(|w|) \rangle \mid w \in W_e\}.$$
Clearly, $V_e$ is c.e. and $V_e = W_{f(e)}$, for some computable function $f : A^* \to A^*$. So,
$$V_e = W_{f(e)} \subset \{\langle w, m \rangle \in A^* \times \mathbf{N} \mid H(w) > m\}$$
and in view of Theorem 5.33 from $\langle w, g(|w|) \rangle \in V_e = W_{f(e)}$ we deduce $g(|w|) \le H(f(e)) + d$, i.e. $V_e$ is finite. This shows that $W_e$ itself is finite. □

Scholium 5.35. If $g : \mathbf{N} \to \mathbf{N}$ is a computable function which converges computably to infinity, $\lim_{n \to \infty} g(n) = \infty$ (i.e. there exists an increasing computable function $r : \mathbf{N} \to \mathbf{N}$, such that if $n \ge r(k)$, then $g(n) \ge k$), and the set $S = \{w \in A^* \mid H(w) > g(|w|)\}$ is infinite, then $S$ is effectively immune.

Proof. In the context of the proof of Corollary 5.34, $\#W_e = \#W_{f(e)}$ and if $w \in W_e \subset \{u \in A^* \mid H(u) > g(|u|)\}$, then $g(|w|) < H(w)$ and
$$g(|w|) \le H(f(e)) + d \le |f(e)| + 2\log|f(e)| + d + c + c'.$$
If $|w| \ge r(|f(e)| + 2\log|f(e)| + d + c + c')$, then $g(|w|) > |f(e)| + 2\log|f(e)| + d + c + c'$, so $w \notin W_e$. Accordingly, if $w \in W_e$, then $|w| < r(|f(e)| + 2\log|f(e)| + d + c + c')$, i.e.
$$\#W_e \le \left(Q^{r(|f(e)| + 2\log|f(e)| + d + c + c')} - 1\right)/(Q - 1),$$
and the upper bound is a computable function of $e$. □

Corollary 5.36. For all $t \ge 0$, $\mathrm{RAND}^C_t$ is effectively immune.

Proof. An infinite subset of an effectively immune set is effectively immune. □

Corollary 5.37. The set $\{\langle w, m \rangle \mid H(w) \le m\}$ is c.e., but not computable.

5.6 Borel Normality

Another important restriction pertaining to a good definition of randomness concerns the frequency of letters and blocks of letters. In a "truly random" string each letter has to appear with approximately the same frequency, namely $Q^{-1}$. Moreover, the same property should extend to "reasonably long" substrings.

Recall that $N_i(x)$ is the number of occurrences of the letter $a_i$ in the string $x$; $1 \le i \le Q$. We now fix an integer $m > 1$ and consider the alphabet $B = A^m = \{y_1, \ldots, y_{Q^m}\}$ ($\#B = Q^m$). For every $1 \le i \le Q^m$ we denote by $N_i^m$ the integer-valued function defined on $B^*$ by $N_i^m(x)$ = the number of occurrences of $y_i$ in the string $x \in B^*$. For example, we take $A = \{0, 1\}$, $m = 2$, $B = A^2 = \{00, 01, 10, 11\} = \{y_1, y_2, y_3, y_4\}$, $x = y_1 y_3 y_3 y_4 y_3 \in B^*$ ($x = 0010101110 \in A^*$). It is easy to see that $|x|_2 = 5$, $|x| = 10$, $N_1^2(x) = 1$, $N_2^2(x) = 0$, $N_3^2(x) = 3$, $N_4^2(x) = 1$. Note that the string $y_2 = 01$ appears three times in $x$, but not in the right positions.

Not every string $x \in A^*$ belongs to $B^*$. However, there is a possibility "to approximate" such a string by a string in $B^*$. We proceed as follows. For $x \in A^*$ and $1 \le i \le |x|$ we denote by $[x; i]$ the prefix of $x$ of length $|x| - \mathrm{rem}(|x|, i)$ (i.e. $[x; i]$ is the longest prefix of $x$ whose length is divisible by $i$). Clearly, $[x; 1] = x$ and $[x; i] \in (A^i)^*$. We are now in a position to extend the functions $N_i^m$ from $B^*$ to $A^*$: we put
$$N_i^m(x) = N_i^m([x; m]),$$
in case $|x|$ is not divisible by $m$. Similarly,
$$|x|_m = |[x; m]|_m.$$
For $x \in A^\infty$ and $n \ge 1$, $x(n) = x_1 x_2 \ldots x_n \in A^*$, so $N_i(x(n))$ counts the number of occurrences of the letter $a_i$ in the prefix of length $n$ of $x$.

Definition 5.38. A non-empty string $x \in A^*$ is called $\varepsilon$-limiting ($\varepsilon$ is a fixed positive real) if for all $1 \le i \le Q$, $x$ satisfies the inequality
$$\left|\frac{N_i(x)}{|x|} - Q^{-1}\right| \le \varepsilon. \qquad (5.3)$$

Comments. i) Since $0 \le N_i(x) \le |x|$, the left-hand side member of (5.3) is always less than $(Q-1)/Q$.

ii) In the binary case $Q = 2$, a string $x$ is $\varepsilon$-limiting iff the inequality (5.3) is satisfied for some $i = 1, 2$. This is because $|N_1(x)/|x| - 2^{-1}| = |N_2(x)/|x| - 2^{-1}|$.

Definition 5.39. Let $\varepsilon > 0$ and $m \ge 1$.

a) We say that a non-empty string $x \in A^*$ is $(\varepsilon, m)$-limiting if
$$\left|\frac{N_i^m(x)}{|x|_m} - Q^{-m}\right| \le \varepsilon,$$
for every $1 \le i \le Q^m$.

b) A non-empty string $x \in A^*$ is called Borel $(\varepsilon, m)$-normal if $x$ is $(\varepsilon, j)$-limiting, for every $1 \le j \le m$.

Definition 5.40. i) A non-empty string $x \in A^*$ is called $m$-limiting if $x$ is $\left(\sqrt{(\log_Q |x|)/|x|},\ m\right)$-limiting, i.e.
$$\left|\frac{N_i^m(x)}{|x|_m} - Q^{-m}\right| \le \sqrt{\frac{\log_Q |x|}{|x|}},$$
for every $1 \le i \le Q^m$.

ii) If for every natural $m$, $1 \le m \le \log_Q \log_Q |x|$, $x$ is $m$-limiting, then we say that $x$ is Borel normal.
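The block counts and the Borel normality condition can be computed directly from these definitions. The following Python sketch is illustrative: the function names `block_counts`, `is_m_limiting` and `is_borel_normal` are hypothetical, strings are ordinary Python strings over a given alphabet, and the tolerance $\sqrt{(\log_Q |x|)/|x|}$ is evaluated in floating point, so borderline cases may be affected by rounding.

```python
import math
from itertools import product

def block_counts(x, m, alphabet):
    """Counts N_i^m(x) of the aligned blocks of length m in [x; m], and |x|_m."""
    usable = len(x) - len(x) % m                 # length of the prefix [x; m]
    blocks = [x[k:k + m] for k in range(0, usable, m)]
    counts = {"".join(b): 0 for b in product(alphabet, repeat=m)}
    for b in blocks:
        counts[b] += 1
    return counts, len(blocks)

def is_m_limiting(x, m, alphabet):
    """Check |N_i^m(x)/|x|_m - Q^(-m)| <= sqrt(log_Q|x| / |x|) for every block."""
    Q, n = len(alphabet), len(x)
    eps = math.sqrt(math.log(n, Q) / n)
    counts, nblocks = block_counts(x, m, alphabet)
    return all(abs(c / nblocks - Q ** (-m)) <= eps for c in counts.values())

def is_borel_normal(x, alphabet):
    """x is m-limiting for every m with 1 <= m <= log_Q log_Q |x|."""
    Q, n = len(alphabet), len(x)
    max_m = 0
    while Q ** (Q ** (max_m + 1)) <= n:          # m <= log_Q log_Q n, in integer arithmetic
        max_m += 1
    return all(is_m_limiting(x, m, alphabet) for m in range(1, max_m + 1))

print(block_counts("0010101110", 2, "01"))
print(is_borel_normal("0001101100011011", "01"), is_borel_normal("01" * 8, "01"))
```

The first line reproduces the block counts of the example $x = 0010101110$ given above. The string `"01" * 8` has perfectly balanced letters but consists of a single repeated block of length 2, so it fails the condition for $m = 2$.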

We now use a simple combinatorial formula (see Natanson [318]).

Fact 5.41. For all naturals i, m

Lemma 5.42. For every c

Proof In (5.4) put x

k

°

and real x

> 0, 1 ::; m ::; M

> 0,

and 1 ::; i ::; Qm,

= Q-m,i = lM/mJ:

LtJ (lM/mJ) ( k=O

~

k

_ Q-m)2lM/mJ2(Qm _ 1)LM/mJ-k

lM/mJ

= lM/mJQm LM/mJ-2m(Qm - 1).

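Next define the set

\[ T = \left\{ k \in N \;\Big|\; 0 \le k \le \lfloor M/m \rfloor,\ \left| \frac{k}{\lfloor M/m \rfloor} - Q^{-m} \right| > \varepsilon \right\}. \]

On one hand:

\[ \#\left\{x \in A^M \;\Big|\; \left| \frac{N_i^m(x)}{\lfloor M/m \rfloor} - Q^{-m} \right| > \varepsilon \right\} = \sum_{k \in T} \#\{x \in A^M \mid N_i^m(x) = k\} \]
\[ = Q^{\mathrm{rem}(M,m)} \sum_{k \in T} \#\{x \in B^{\lfloor M/m \rfloor} \mid N_i^m(x) = k\} = Q^{\mathrm{rem}(M,m)} \sum_{k \in T} \binom{\lfloor M/m \rfloor}{k} (Q^m - 1)^{\lfloor M/m \rfloor - k}. \]

On the other hand:

\[ \lfloor M/m \rfloor\, Q^{m\lfloor M/m \rfloor - 2m}(Q^m - 1) > \sum_{k \in T} \varepsilon^2 \lfloor M/m \rfloor^2 \binom{\lfloor M/m \rfloor}{k} (Q^m - 1)^{\lfloor M/m \rfloor - k} \]
\[ = \varepsilon^2 \lfloor M/m \rfloor^2\, Q^{-\mathrm{rem}(M,m)}\, \#\left\{x \in A^M \;\Big|\; \left| \frac{N_i^m(x)}{\lfloor M/m \rfloor} - Q^{-m} \right| > \varepsilon \right\}. \qquad \Box \]

Remark. For every $1 \le m \le M$, $1 \le i \le Q^m$,

\[ \#\left\{x \in A^M \;\Big|\; \left| \frac{N_i^m(x)}{\lfloor M/m \rfloor} - Q^{-m} \right| > \sqrt{\frac{\log_Q M}{M}} \right\} \le \frac{Q^{M-2m}(Q^m - 1)\, M}{\lfloor M/m \rfloor \log_Q M}. \tag{5.6} \]

Comment. In case $m = 1$ and $1 \le i \le Q$, $N_i^1(x) = N_i(x)$, formula (5.5) becomes

\[ \#\left\{x \in A^M \;\Big|\; \left| \frac{N_i(x)}{M} - Q^{-1} \right| > \varepsilon \right\} \le \frac{Q^{M-2}(Q - 1)}{\varepsilon^2 M}, \]

and the inequality (5.6) reduces to

\[ \#\left\{x \in A^M \;\Big|\; \left| \frac{N_i(x)}{M} - Q^{-1} \right| > \sqrt{\frac{\log_Q M}{M}} \right\} \le \frac{Q^{M-2}(Q - 1)}{\log_Q M}. \]

In view of Definition 5.40, a string $x \in A^M$ is not Borel normal in case

\[ \left| \frac{N_i^m(x)}{\lfloor M/m \rfloor} - Q^{-m} \right| > \sqrt{\frac{\log_Q M}{M}}, \]

for some $1 \le m \le \log_Q \log_Q M$, $1 \le i \le Q^m$.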
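For small parameters the counting bound of Lemma 5.42 can be checked by brute force. The sketch below uses exact rational arithmetic; the function names and the choice of test parameters are assumptions made for illustration, and the alphabet is taken to be the digits `0..Q-1` (so `Q <= 10`).

```python
from fractions import Fraction
from itertools import product
from math import comb

def identity_sides(Q, m, M):
    """Both sides of the identity obtained from (5.4) with x = Q**-m, i = floor(M/m)."""
    i = M // m
    p = Fraction(1, Q ** m)
    lhs = sum(comb(i, k) * (Fraction(k, i) - p) ** 2 * i ** 2 * (Q ** m - 1) ** (i - k)
              for k in range(i + 1))
    rhs = i * Fraction(Q) ** (m * i - 2 * m) * (Q ** m - 1)
    return lhs, rhs

def count_deviating(Q, m, M, block, eps):
    """Number of x in A^M whose aligned frequency of `block` deviates from Q**-m by more than eps."""
    alphabet = [str(d) for d in range(Q)]
    n_blocks = M // m
    count = 0
    for letters in product(alphabet, repeat=M):
        x = "".join(letters)
        occurrences = sum(1 for k in range(n_blocks) if x[k * m:(k + 1) * m] == block)
        if abs(Fraction(occurrences, n_blocks) - Fraction(1, Q ** m)) > eps:
            count += 1
    return count

def lemma_bound(Q, m, M, eps):
    """Right-hand side of (5.5)."""
    return Fraction(Q) ** (M - 2 * m) * (Q ** m - 1) / (eps ** 2 * (M // m))

print(identity_sides(2, 2, 5))
print(count_deviating(2, 2, 5, "01", Fraction(1, 4)), "<=", lemma_bound(2, 2, 5, Fraction(1, 4)))
```

The enumeration is exponential in `M`, so this is only practical for very short strings; it is meant as a sanity check of the formulas, not as an algorithm.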

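Lemma 5.43. We can effectively compute a natural $N$ such that for all naturals $M \ge N$,

\[ \#\{x \in A^M \mid x \text{ is not Borel normal}\} \le \frac{Q^M}{\sqrt{\log_Q M}}. \tag{5.7} \]

Proof. We put $S = \{m \in N \mid 1 \le m \le \log_Q \log_Q M\}$. Using formula (5.6) we perform the following computation:

\[ \#\{x \in A^M \mid x \text{ is not Borel normal}\} \le \sum_{m \in S} \sum_{i=1}^{Q^m} \frac{Q^{M-2m}(Q^m - 1)\, M}{\lfloor M/m \rfloor \log_Q M} \le \frac{Q^M}{\log_Q M} \sum_{m \in S} \frac{M}{\lfloor M/m \rfloor} \le \frac{2\,Q^M}{\log_Q M} \sum_{m \in S} m \le \frac{Q^M}{\sqrt{\log_Q M}}, \]

for sufficiently large $M$. $\Box$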

Corollary 5.44. There exists a natural $N$ (which can be effectively computed) such that for all $M \ge N$ one has

\[ \#\{x \in A^* \mid N \le |x| \le M,\ x \text{ is not Borel normal}\} \le \frac{Q^{M+3}}{\sqrt{\log_Q M}}. \tag{5.8} \]

Proof. By Lemma 5.43 we get a bound $N$ for which the inequality (5.7) is true. Accordingly, using a proof by induction on $M$ we can show the inequalities

\[ \#\{x \in A^* \mid N \le |x| \le M,\ x \text{ is not Borel normal}\} \le \sum_{i=N}^{M} \frac{Q^i}{\sqrt{\log_Q i}} \le \frac{Q^{M+3}}{\sqrt{\log_Q M}}. \qquad \Box \]

Theorem 5.45 (Calude). We can effectively find two natural constants $c$ and $M$ such that every $x \in A^*$ with $|x| \ge M$ and which is not Borel normal satisfies the inequality

\[ K(x) \le |x| - \frac{1}{2} \log_Q \log_Q |x| + c. \tag{5.9} \]

Proof. We define the computable function $f : N_+ \to A^*$ by $f(t)$ = the $t$th string $x$ (according to the quasi-lexicographical order) which is not Borel normal and has length greater than $N$. (The constant $N$ comes from Corollary 5.44.) In view of (5.8),

\[ t \le \frac{Q^{|x|+3}}{\sqrt{\log_Q |x|}}, \tag{5.10} \]

provided $f(t) = x$.

Finally, we define the p.c. function $\theta : A^* \to A^*$ by

\[ \theta(u) = f(string^{-1}(u)) \tag{5.11} \]

and consider $b$ the constant coming from the Invariance Theorem. If $x = f(t)$, then $\theta(string(t)) = f(t) = x$ and (by (5.10))

\[ K(x) \le K_\theta(x) + b \le |string(t)| + b \le \log_Q(t + 1) + b < \log_Q\left( \frac{Q^{|x|+3}}{\sqrt{\log_Q |x|}} + 1 \right) + b < |x| - \frac{1}{2} \log_Q \log_Q |x| + c. \qquad \Box \]

Theorem 5.46. For every natural $t \ge 0$ we can effectively compute a natural number $M_t$ (depending upon $t$) such that every string of length greater than $M_t$ in $RAND_t^C$ is Borel normal.

Proof. From $t$ we construct a minimal $T$ (depending upon $t$) such that $T - O(\log T) \ge t$, as in Corollary 5.8. Let $c$, $M$ be the constants coming from Theorem 5.45 and put

\[ M_t = \max\left\{ M,\ Q^{Q^{2(T+c)}} \right\}. \]

If $x$ is not Borel normal, then $x$ satisfies the inequality (5.9) (since $M_t \ge M$):

\[ K(x) \le |x| - \frac{1}{2} \log_Q \log_Q |x| + c < |x| - \frac{1}{2} \log_Q \log_Q M_t + c \le |x| - T, \]

so, by Corollary 5.8,

\[ H(x) \le \Sigma(|x|) - t, \]

a contradiction. $\Box$

Corollary 5.47. Almost all strings in $RAND_t^C$ are:

a) $m$-limiting,

b) Borel $(\varepsilon, m)$-normal,

c) $\varepsilon$-limiting.

Proof. Every Chaitin $t$-random string $x$ which is $m$-limiting and satisfies the inequality $\log_Q |x| \le \varepsilon^2 |x|$ is also Borel $(\varepsilon, m)$-normal. $\Box$

We are now able to prove that every string can be embedded into a Chaitin $t$-random string.

Theorem 5.48. For every natural $t$ and for every string $x$ we can find two strings $u$, $v$ such that $uxv \in RAND_t^C$.

Proof. We fix $t \in N$, $x \in A^i$, $i \ge 1$. Almost all strings $z \in RAND_t^C$ are Borel normal by Theorem 5.46, i.e. they satisfy the inequality

\[ \left| \frac{N_j^m(z)}{\lfloor n/m \rfloor} - Q^{-m} \right| \le \sqrt{\frac{\log_Q n}{n}}, \]

for every $1 \le j \le Q^m$, $1 \le m \le \log_Q \log_Q n$; $n = |z|$.

We take $m = i$, $x$ to be the $j$th string of length $i$ and we pick a string $z \in RAND_t^C$ such that $n = |z| = Q^{Q^{2i+1}}$. It follows that

\[ \left| \frac{N_j^i(z)}{\lfloor n/i \rfloor} - Q^{-i} \right| \le \sqrt{\frac{\log_Q n}{n}}, \]

in particular,

\[ Q^{-i} - \sqrt{\frac{\log_Q n}{n}} \le \frac{N_j^i(z)}{\lfloor n/i \rfloor}. \]

To prove that $N_j^i(z) > 0$ it is enough to show that

\[ Q^{-i} > \sqrt{\frac{\log_Q n}{n}}, \]

which is true because

\[ \sqrt{\frac{\log_Q n}{n}} = Q^{\frac{1}{2}\left(2i + 1 - Q^{2i+1}\right)} \]

and

\[ \frac{1}{2} Q^{2i+1} \ge \frac{1}{2}\, 2^{2i+1} = 4^i > 2i + \frac{1}{2}. \qquad \Box \]

Theorem 5.48 answers only one question concerning various (potential) possibilities to extend arbitrary strings to random/non-random strings. To settle all these questions we shall use a topological approach.

5.7 Extensions of Random Strings

In this section we deal with the following problem: to what extent is it possible to extend an arbitrary string to a Chaitin random or non-random string? We shall use some topological arguments. Let $<$ be a partial order on $A^*$ which is computable, i.e. the predicate "$u < v$" is computable. We denote by $\tau(<)$ the topology generated by the family

\[ O_w = \{x \in A^* \mid w < x\}, \quad w \in A^*. \]

It can be seen that the closure operator acts as follows: for every $B \subset A^*$,

\[ B \to Cl(B) = \{x \in A^* \mid x < z, \text{ for some } z \in B\}. \]

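Example 5.49. The following partial orders on $A^*$ are computable:

a) $x <_p y$ iff $y = xz$, for some $z \in A^*$ (prefix order),

b) $x <_s y$ iff $y = zx$, for some $z \in A^*$ (suffix order),

c) $x <_i y$ iff $y = uxv$, for some $u, v \in A^*$ (infix order).

The next result is easy to prove.

Lemma 5.50. Let $<$ be a partial order on $A^*$. For every $B \subset A^*$ and $w \in A^*$ the following statements are equivalent:

1) $B \cap O_w = \emptyset$,

2) $Cl(B) \cap O_w = \emptyset$,

3) $w \notin Cl(B)$.

Recall that a set $B \subset A^*$ is a) dense if $Cl(B) = A^*$, b) rare (or nowhere dense) if $O_w \not\subset Cl(B)$, for all $w \in A^*$.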

A set $B \subset A^*$ is computably rare if for every string $w \in A^*$ we can computably obtain a witness certifying that $O_w \not\subset Cl(B)$, i.e. a string $v$ with $w < v$ such that $v \notin Cl(B)$. We put this in a formal way as follows.

Definition 5.51. A set $B \subset A^*$ is computably rare if there exists a computable function $r : N \to N$ such that the following conditions hold for all $n \in N$:

1. $string(n) < string(r(n))$,

2. $B \cap O_{string(r(n))} = \emptyset$.

Remarks. i) The family of computably rare sets is closed under subset.

ii) Every computably rare set is rare.

Example 5.52. Each basic open set $O_w$ is not (computably) rare.

Definition 5.53. A partial order $<$ on $A^*$ is unbounded if for every $x \in A^*$ and $n \in N$ there exists a string $y \in A^*$, $|y| \ge n$, such that $x < y$.

Clearly, the prefix, suffix and infix orders are unbounded.
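The prefix, suffix and infix orders, together with the embedding order introduced next, are easy to express as computable predicates. The Python sketch below uses hypothetical function names and treats each order in its reflexive form; `unbounded_witness` illustrates Definition 5.53 for the prefix order by padding with an arbitrarily chosen letter.

```python
def is_prefix(x: str, y: str) -> bool:
    """y = xz for some string z."""
    return y.startswith(x)

def is_suffix(x: str, y: str) -> bool:
    """y = zx for some string z."""
    return y.endswith(x)

def is_infix(x: str, y: str) -> bool:
    """y = uxv for some strings u, v."""
    return x in y

def is_embedded(x: str, y: str) -> bool:
    """x is a scattered subword of y (letters of x appear in y in order)."""
    remaining = iter(y)
    return all(letter in remaining for letter in x)

def unbounded_witness(x: str, n: int, pad: str = "0") -> str:
    """A string y with |y| >= n having x as a prefix (witness of unboundedness)."""
    return x + pad * max(0, n - len(x))

print(is_prefix("01", "0110"), is_suffix("10", "0110"), is_infix("11", "0110"))
print(is_embedded("00", "0110"), is_infix("00", "0110"))
print(unbounded_witness("01", 5))
```

Since the witness has `x` as a prefix, it also has `x` as an infix and as an embedded subword, so the same padding shows that those orders are unbounded as well.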

Example 5.54. The following partial orders are unbounded (here $x = x_1x_2 \cdots x_n \in A^n$ and $y = y_1y_2 \cdots y_k \in A^k$):

a) $x <_e y$ iff $y = u_1x_1u_2x_2 \cdots u_nx_nu_{n+1}$, for some $u_1, u_2, \ldots, u_{n+1} \in A^*$ (embedding order),

b) $x <_m y$ iff $x_{n-i} < y_{k-i}$, for all $0 \le i \le \min(k, n) - 1$, and if $n > k$, then $x_j = a_1$, for all $1 \le j \le n - k$ (masking order),

c) $x <_{pm} y$ iff $|x| \le |y|$ and $x_i < y_i$, for all $1 \le i \le |x|$ (prefix-masking order),

d) $x <_l y$ iff $x <_p y$, or $x = wa_iv$ and $y = wa_jz$ with $i < j$, for some $w, v, z \in A^*$ (lexicographical order),

e) $x <_q y$ iff $|x| < |y|$, or $|x| = |y|$ and $x <_l y$ (quasi-lexicographical order).

Sometimes, the distinction between rare and computably rare sets is sharp, as in the case of prefix or suffix orders. In other cases (see for instance the infix, embedding, masking and prefix-masking orders) the computably rare sets coincide with the rare sets. (See Exercises 5.9.21-23.)

Intuitively, the properties of being computably rare, rare and dense (according to some topology $\tau$) describe an increasing scale for the sizes of subsets of $A^*$, with respect to $\tau$. Thus a dense set is "larger" than a rare one, and (sometimes) a rare set is larger than a computably rare one.

Theorem 5.55. If $<$ is a computable and unbounded partial order on $A^*$, then the set $\{w \in A^* \mid H(w) < |w| - t\}$ is dense in $\tau(<)$, for all natural $t$.

Proof. Let $x \in A^*$. For $x$ we define the computable function $f : N \to N$ by

\[ f(n) = \min\{m \in N \mid |string(m)| \ge |x| + 2(n + 1),\ x < string(m),\ string(m) \ne string(i),\ 0 \le i < n\}. \]

Consider Chaitin's computer $C(a_1^n a_2, \lambda) = string(f(n))$. It is not difficult to see that

\[ H_C(string(f(n))) = n + 1 < |string(f(n))| - |x| - n. \]

Using the Invariance Theorem we get a constant $c$ such that

\[ H(string(f(n))) < |string(f(n))| - |x| - n + c. \]

For $n > t + c - |x|$ we get

\[ H(string(f(n))) < |string(f(n))| - t, \]

and $x < string(f(n))$. $\Box$

Remarks. a) A stronger form of the above statement can actually be proven: for every increasing, unbounded (not necessarily computable) function $f : N \to N$, the set

\[ \{x \in A^* \mid H(x) \le f(|x|)\} \]

is dense in $\tau(<)$. b) We can interpret Theorem 5.55 as follows: each section of the Martin-Löf test $\{(x, m) \in A^* \times N_+ \mid H(x) < |x| - m\}$ is dense with respect to $\tau(<)$.

Corollary 5.56. If $<$ is a computable and unbounded partial order on $A^*$, then $A^* \setminus RAND_t^C$ is dense in $\tau(<)$, for all natural $t$.

Proof. We use Theorem 5.55 and the relation

\[ \{x \in A^* \mid H(x) < |x| - t\} \subset A^* \setminus RAND_t^C. \qquad \Box \]

We fix an arbitrary string $x \in A^*$ and consider the following question: "Is it possible to find a Chaitin non-random string $y$ having $x$ as a prefix?" Theorem 5.55 with $< \,=\, <_p$ answers this question affirmatively.

Theorem 5.57. For every natural $t$ and every string $x \in A^*$ there exists a string $u \in A^*$ such that for every string $z \in A^*$, $xuz \notin RAND_t^C$.

Proof. We fix $t \in N$ and $x \in A^*$ and get $T$ such that $T - O(\log T) \ge t$.

Next we define the computer $\varphi(z10^{\log m}) = string(r(m))z$, where $r$ is a computable function such that $string(r(n)) = string(n)0^{n - |string(n)|}$. So, if $x = string(m)$, then $u = 0^{m - |string(m)|}$. It can be seen that for every $z \in A^*$

\[ K(string(r(m))z) \le K_\varphi(string(r(m))z) + c \le 1 + \log m + |z| + c < |string(r(m))z| - T, \]

provided $m > 1 + c + T + \log m$. Consequently, by Corollary 5.8 we get

\[ H(string(r(m))z) \le \Sigma(|string(r(m))z|) - t, \]

showing that every extension $string(r(m))z$ of $string(m)$ lies in $A^* \setminus RAND_t^C$. $\Box$

Corollary 5.58. For every natural $t$ we can find a string $x$ no extension of which is in $RAND_t^C$.

The above result shows that in Theorem 5.48 we need both the prefix $u$ and the suffix $v$, i.e. it is not possible to fix $u = \lambda$ and then find an appropriate $v$. However, such a possibility is regained - conforming with the probabilistic intuition - as soon as we switch from $RAND_t^C$ with a fixed $t$ to $RAND_t^C$ with an appropriate, small $t$.

We start first with a preliminary result.

Lemma 5.59. Let $<$ be a partial order on $A^*$ which is computable and unbounded. Assume that $<$ satisfies the relation

\[ \sum_{x < w} Q^{-|w| - \lfloor \log_Q |w| \rfloor} = \infty, \quad \text{for all } x \in A^*. \]

Then, for every string $x$ we can find a string $y$ such that $x < y$ and $H(y) \ge |y| + \lfloor \log_Q |y| \rfloor$.

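Proof. Assume, for the sake of contradiction, that there exists $x \in A^*$ such that for all $u$ with $x < u$ one has

\[ H(u) < |u| + \lfloor \log_Q |u| \rfloor. \]

We put

\[ X = \{v \in A^* \mid x < v,\ H(v) < |v| + \lfloor \log_Q |v| \rfloor\} \]

and note that, by Kraft's inequality for the prefix-free domain of the universal Chaitin computer,

\[ \sum_{w \in X} Q^{-H(w)} \le \sum_{w \in A^*} Q^{-H(w)} \le 1. \]

So,

\[ 1 \ge \sum_{w \in X} Q^{-H(w)} \ge \sum_{w \in X} Q^{-|w| - \lfloor \log_Q |w| \rfloor} = \sum_{x < w} Q^{-|w| - \lfloor \log_Q |w| \rfloor} = \infty, \]

since

\[ \{w \in A^* \mid x < w,\ H(w) \ge |w| + \lfloor \log_Q |w| \rfloor\} = \emptyset. \]

We thus have a contradiction. $\Box$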

Theorem 5.60. For every string $x$ and natural $n$ we can find a string $u$ such that: i) $|xu| \ge n$, ii) for some natural $t$ (which is about $\lfloor \log_Q |xu| \rfloor$), $xu \in RAND_t^C$.

Proof. Let $< \,=\, <_p$. We have:

\[ \sum_{x <_p w} Q^{-|w| - \lfloor \log_Q |w| \rfloor} = \sum_{y \in A^*} Q^{-|xy| - \lfloor \log_Q |xy| \rfloor} \ge \sum_{y \in A^*} Q^{-|xy| - \log_Q |xy|} \]
\[ \ge Q^{-|x| - \log_Q |x|} \sum_{n=1}^{\infty} \sum_{y \in A^n} Q^{-|y| - \log_Q |y|} = Q^{-|x| - \log_Q |x|} \sum_{n=1}^{\infty} \frac{1}{n} = \infty. \]

Now we take $x \in A^*$ and $n \ge 1$. Let $y$ be an arbitrary string such that $|xy| \ge n$. By virtue of the preceding argument we can find a string $w$ such that $xy <_p w$ and

\[ H(w) \ge |w| + \lfloor \log_Q |w| \rfloor = \Sigma(|w|) - t, \]

where $t = H(string(|w|)) - \lfloor \log_Q |w| \rfloor$ is about $\lfloor \log_Q |w| \rfloor$. $\Box$

5.8 Binary vs Non-binary Coding (2)

We continue the discussion in Section 4.5 on the relevance of the size of the alphabet for coding. It is common sense to notice that one needs fewer digits to code numbers in ternary than in binary; new names are about $\log_3 2$ times shorter. Is this trade-off a consequence of the special coding scheme? The answer is negative. We will show that there is no optimal instantaneous code for all positive integers, and the binary is the worst possible. Codes over a fixed alphabet can be indefinitely improved themselves, but only "slightly"; in contrast, changing the size of the alphabet determines a significant, not linear, improvement.

The key relation describing the above phenomenon can be expressed in terms of program-size complexity: changing the size of the coding alphabet from $q$ to $Q$, $2 \le q < Q$, results in an improvement of the complexity by a factor of $\log_Q q$. For $Q \ge 2$ we put $A_Q = \{0, 1, \ldots, Q - 1\}$. The program-size complexity induced by $C : A_Q^* \to A_Q^*$ will be denoted by $H_{Q,C}$; $H_{Q,U} = H_Q$.

Lemma 5.61. Let $f : N_+ \to A_Q^*$ be an injective (not necessarily computable) function having a prefix-free range. Then for every natural $m$ the following inequality,

\[ |f(n)| > \log_Q n + m, \]

holds true for infinitely many $n$.

Proof. Assume, for the sake of contradiction, that for some natural $m$ the inequality $|f(n)| \le \log_Q n + m$ holds true for almost all $n$. Then, the divergent series $\sum_{n \ge 1} Q^{-\log_Q n - m}$ appears to converge:

\[ \sum_{n \ge 1} Q^{-\log_Q n - m} \le \sum_{n \ge 1} Q^{-|f(n)|} + O(1) < \infty. \qquad \Box \]

Corollary 5.62. For every natural m,

for infinitely many $n$.

It follows that, for all $m$, the code lengths of no prefix-free coding of all natural numbers over the alphabet $A_Q$ can be brought below $\log_Q n + m$ for almost all $n$. By working with a larger alphabet this improvement can be achieved, even with a trivial, computable code. For instance, consider the code $F : N_+ \to A_{Q+1}^*$ given by $F(n) = string_Q(n)Q$, the concatenation between $string_Q(n)$ and the new letter $Q$. Clearly,

\[ |F(n)| = \lfloor \log_Q(n(Q - 1) + 1) \rfloor + 1 \le \log_Q n + 2. \]
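The trivial code $F$ can be tried out directly. The sketch below assumes that $string_Q(n)$ denotes the $n$th string over $\{0, \ldots, Q-1\}$ in quasi-lexicographical order, with $string_Q(0)$ the empty string; the function names are hypothetical and letters are written as single decimal digits, so it only covers $Q \le 9$.

```python
def string_q(n: int, Q: int) -> str:
    """n-th string over {0, ..., Q-1} in quasi-lexicographical order; string_q(0) == ''."""
    length = 0
    while n >= Q ** length:      # skip all Q**length strings of the current length
        n -= Q ** length
        length += 1
    digits = []
    for _ in range(length):      # write the remaining offset in base Q, padded to `length`
        n, d = divmod(n, Q)
        digits.append(str(d))
    return "".join(reversed(digits))

def code_f(n: int, Q: int) -> str:
    """F(n) = string_Q(n) followed by the new letter Q (a code over Q + 1 letters)."""
    return string_q(n, Q) + str(Q)

def floor_log(Q: int, value: int) -> int:
    """floor(log_Q(value)) for value >= 1, computed with integers only."""
    k = 0
    while Q ** (k + 1) <= value:
        k += 1
    return k

for n in range(1, 8):
    print(n, code_f(n, 2), len(code_f(n, 2)), floor_log(2, n * (2 - 1) + 1) + 1)
```

Because the letter `Q` occurs only in the last position of each codeword, no codeword can be a proper prefix of another, which is what makes the code instantaneous.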

The next result is a stronger version of Theorem 5.1 b) ii):

Lemma 5.63. Let $F : N_+ \to A_Q^*$ be computable and injective, and let $g : N_+ \to N_+$ be a function semi-computable from above such that

\[ \sum_{n \ge 1} Q^{-g(n)} < \infty. \]

Then

\[ H_Q(F(n)) \le g(n) + O(1). \]

Proof. First let us notice that

\[ \sum_{n \ge N} \sum_{m \ge g(n)} Q^{-m} = \sum_{n \ge N} \sum_{m=0}^{\infty} Q^{-(g(n)+m)} = \sum_{n \ge N} Q^{-g(n)} \sum_{m=0}^{\infty} Q^{-m} = \frac{Q}{Q - 1} \sum_{n \ge N} Q^{-g(n)} \le 1, \]

in case $\sum_{n \ge N} Q^{-g(n)} \le (Q - 1)/Q$. In view of the Kraft-Chaitin Theorem, there exists a Chaitin computer $C : A_Q^* \to A_Q^*$ such that for all $n \ge N$ and $m \ge g(n)$ there exists a string $x$ of length $m$ such that $C(x) = F(n)$. In particular, there exists a string $x_n$ of length $g(n)$ such that $C(x_n) = F(n)$, for $n \ge N$; using the Invariance Theorem we deduce the required formula:

\[ H_Q(F(n)) \le g(n) + O(1). \qquad \Box \]

Example 5.64. For every natural Q ~ 2, and every real a, one has

iff a

> 1.

The next result will prove that there is no optimal instantaneous code for all natural numbers, and the binary coding is the worst possible. For larger and larger alphabets we get better and better codings; in contrast to the case of codes over a fixed alphabet, which themselves can be indefinitely, but only "slightly" (see Knuth [256]) improved, changing the alphabet results in a significant improvement.

Theorem 5.65. Let $q, Q \ge 2$ be naturals and consider two computable one-to-one functions $F : N_+ \to A_Q^*$, $f : N_+ \to A_q^*$. Then,

\[ H_Q(F(n)) < (\log_Q q) H_q(f(n)) + O(1). \]

Proof. We take the function $g(n) = \lfloor (\log_Q q) H_q(f(n)) \rfloor$. One has

\[ \sum_{n \ge 1} Q^{-g(n)} < \sum_{n \ge 1} Q^{1 - (\log_Q q) H_q(f(n))} = Q \sum_{n \ge 1} q^{-H_q(f(n))} \le Q < \infty. \]

As $g$ is semi-computable from above one can use Lemma 5.63 to get the required inequality. $\Box$

Corollary 5.66. Let $2 \le q < Q$ be naturals. Then, for every string $x \in A_q^*$ one has

\[ H_Q(x) \le (\log_Q q) H_q(x) + O(1). \]

Proof. We consider the computable one-to-one functions $F : N_+ \to A_Q^*$, $f : N_+ \to A_q^*$ given by $F(n) = f(n) = string_q(n)$, and use Lemma 5.63. $\Box$

Corollary 5.67. For all $2 \le q \le Q$,

Theorem 5.68. Let $2 \le q < Q$. Then, there exists a constant $a$ (which depends upon $q$, $Q$) such that for all $x \in A_q^*$ we have

Proof. We use Corollary 5.66 and Lemma 5.63 (and take $F : N_+ \to A_q^*$, $F(n) = string_q(n)$, $g(n) = \lfloor (\log_q Q) H_Q(F(n)) \rfloor$). $\Box$

What is the complexity of strings $x \in A_q^*$ regarded as strings in $A_Q^*$?

Proposition 5.69. For every $2 \le q < Q$ and all $x \in A_q^*$,

\[ H_Q(x) < |x| + O(1). \]

Proof. Indeed, by Corollary 5.66 and the relation $H_Q(string_q(n)) \le O(\log_q n)$ one has

\[ H_Q(x) \le (\log_Q q) H_q(x) + O(1) \le (\log_Q q) \Sigma_q(|x|) + O(1) \le (\log_Q q)(|x| + O(\log_q |x|)) = (\log_Q q)|x| + O(\log_q |x|) < |x| + O(1). \qquad \Box \]

Corollary 5.70. No string $x \in A_q^*$ is random over $A_Q$.

In the binary case we have only two such strings, namely 00 ... 0 and 11 ... 1, which are obviously non-random. In the non-binary case we have

strings of length n over the alphabet AQ which are non-binary because they do not contain all Q letters. For instance, for Q = 3 one has 3 x 2n such strings, many of them being random as binary strings (in fact, according to a result of Chaitin [127], more than 3 x 2n - c2 , where C2 is a constant which depends on the size of the alphabet but not on the length

n).

5.9 Exercises and Problems

1. (von Mises) Imagine the following two experiments.

a) We have two dice and we keep throwing them and recording the results.

b) Consider a road along which milestones are placed, large ones for whole miles and smaller ones for tenths of a mile. The relative frequency of the appearance of a large stone will lie around 1/10; the deviations from 1/10 will become smaller and smaller as the number of stones passed increases.

In what sense does the sequence of observations of large/small stones differ from the sequence obtained from throwing a pair of dice? Which one could be called "random" and why?

2. Show that for every computer cp and all natural numbers n, m, n 2: m, if Q > 2 or m > 0, then

lim #{x E An I K
n->oo

= 00.

The above formula is no longer true in the case Q = 2 and m = O. 3. Let (x, m) be in A * x N +. Show that the following statements are equivalent:

a) Ixi 2: l. b) The set H(x, m) = {(x, 1), ... , (x, mn is a Martin-Lof test. c) The set H(x,m) Martin-Lof test.

=

{(y,n) E A* x N+

I1

::; n ::; m,x

4. Give an example of a non-computable Martin-Löf test.

5. Construct a Martin-Löf test which rejects the hypothesis "$x$ is random" at the level $Q^{-m}/(Q - 1)$ provided the first letters of $x$ are all equal to a fixed element $a \in A$.

6. Construct a p.c. function $f : N_+^2 \to A^* \times N$ having the following two properties:

a) For all natural $i, j$, if $f(i, j) \ne \infty$, then $f(i, k) \ne \infty$, for all $k \le j$.

b) A set $X \subset A^* \times N$ is c.e. iff $X = \{f(i, j) \mid (i, j) \in dom(f),\ j \ge 1\}$, for some natural $i$.

7. (Martin-Löf) Show that the set of all Martin-Löf tests is c.e., i.e. there exists a c.e. set $T \subset N_+ \times A^* \times N_+$ such that: i) every section $T_i$ of $T$ is a Martin-Löf test, and ii) for every Martin-Löf test $V$ there exists an $i$ such that $V = T_i$. (Hint: Use the p.c. function in the preceding exercise in the following procedure (for generating the $i$th section $T_i$ of $T$):

(a) Put $T_i = \emptyset$.

(b) Put $j = 1$.

(c) If $f(i, j) = \infty$, then continue indefinitely.

(d) Compute $f(i, j) = (x_j, m_j)$.

(e) Put $T_i = T_i \cup \{(x_j, 1), \ldots, (x_j, m_j)\}$.

(f) If $T_i$ is not a Martin-Löf test, then put $T_i = \emptyset$.

(g) Put $j = j + 1$ and go to step (c).)

8. (Martin-Löf) Let $T$ be a c.e. set enumerating all Martin-Löf tests. Show that the set

\[ U = \{(x, m) \in A^* \times N_+ \mid (i, x, m + i) \in T, \text{ for some } i > 0\} \]

is a universal Martin-Löf test.

9. Show that every section of a universal Martin-Löf test is infinite.

10. Is every universal Martin-Löf test non-computable?

11. Is every universal Martin-Löf test representable?

12. Show that among the computable Martin-Löf tests there is no universal one.

13. A Martin-Löf test $V$ is called weakly computable in case the set $\{(x, m_V(x)) \mid x \in V_1\}$ is c.e. Show that every computable Martin-Löf test is weakly computable, but the converse is false.

14. Show that every weakly computable Martin-Löf test is uniformly embeddable into a weakly computable representable Martin-Löf test.

15. Show that the following two conditions are equivalent for an arbitrary weakly computable Martin-Löf test $V$: a) $V$ is representable, b) for all naturals $n > m > 0$,

16. Give an example of a Martin-Löf test which is not weakly computable. Show that the Martin-Löf test $W$ is computable and satisfies b) in the preceding exercise iff $W$ is representable by a (total) computable function.

17. Show that the Martin-Löf test $W$ is weakly computable and satisfies b) in Exercise 5.9.15 iff $W$ is representable by an injective p.c. function.

18. Prove Theorem 5.27 directly, without using the complexity $K$.

19. (Campeanu) Show that a Martin-Löf test $V$ is representable by a Chaitin computer iff

\[ \sum_{(x,m) \in V} Q^{-|x| + m + 1} \le 1. \]

L

(x,m)EV

20. Two functions $f, g : A^* \to A^*$ are equal almost everywhere on a set $X \subset A^*$ (written $f = g$ a.e. on $X$) if $f(x) = g(x)$, for all but finitely many $x \in X$. Show that if $f, g : A^* \to A^*$ are computable functions and $f = g$ a.e. on $A^* \setminus RAND_t^C$, then $f = g$ a.e. on $A^*$.

21. Prove: a) If < is a partial computable (unbounded) order on A* and is a computable bijection, then the partial order x

y

{==}

f(x)

f : A*

->

A*

< f(y)

is computable (unbounded). b) A set B c A* is rare (computably rare, dense) in r( <) iff f(B) {f(x) I x E B} is rare (computably rare, dense) in r( A *, mir( A) a, a E A*, mir(xy) = mir(y)mir(x), x, y E A*.

=

A, mir( a)

23. Let < be a computable, unbounded partial order on A *. Show that a set B is computably rare iff there exist a natural i and a computable function f : N -> N such that string(n) < string(f(n)) , for every n E N, and 08tring(f(n)) n B = 0, for all strings with Istring (n) I > i. 24. Suppose that < is a computable and unbounded partial order on A * and assume the existence of a computable function s : N -> A * such that for all natural numbers i,j if s(i)

< x, s(j) < x, for some string x, then i =

j.

Then we can find a rare set which is not computably rare. Illustrate the above situation with examples. Show that the above condition is preserved under computable bijections. 25. Suppose that < is a computable and unbounded partial order on A * and for all strings x, yEA * there exists a string z with x < z, y < z. Then, i) every rare set is computably rare, ii) every non-rare set is dense. Illustrate the above situation with examples. 26. Prove that for every natural t, the set RAN Df is computably rare with respect to the topologies r( <m), r( 0 such that for all naturals m and d, with d ;:::: c, the set {x E A* Ilxl ;: : m, K(x/ Ixl) :s; d} is dense. 28. Relativizing the notion of computably rare set with respect to some fixed set C we get the notion of computably rare set in C. Show that in case C is sparse (i.e. there is a natural k such that #(C nAn) :s; n k + k, for all n EN), then the set RANDf is computably rare in C. The same conclusion can be obtained in case C is co-sparse, i.e. when the complement of C is sparse.


29. Call a subset B of A* meagre in case limn->oo #{x E B Ilxl ::::; n}/n = 0. For example, the set of all binary strings which have twice as many zeros as ones is meagre. Show that if B is computable and meagre, then for each natural t there are only finitely many strings x E B n RAN

Df .

30. Let f : A* x A* ----; [0,00) be a non-negative semi-computable function from below. Show that then

H(y/x) ::::; -logQ f(x, y) + 0(1), for all strings x E A* such that

I:yEA*

f(x, y) ::::; 1.

31. Let A = {O, I} and put for every string x E A* rx

= N 1 (x)/lxl,

1fx

= r~h(x)(l_ rx)No(x),

where Ni(x) is the number of occurrences of i = 0,1 in x. Next define the function 8 : A* ----; [0,00) by 8(x) = log2(1fx ) + Ixl -log2(1 + Ixl). a) Show that 8 is a deficiency of randomness function. b) Express 8 in terms of the entropy function h. (Hint: 8(x) = Ixl(lh(1fx)) -log2(1 + Ixl).) c) Show that there exists a constant c > such that if

°

Irx -

1/21 >

then K(x) ::::; IxI32. (Kramosil) Let (X n k:':l be a sequence of strings such that K(x n ) and IX n I = n. Show that for every natural n ~ 1 . Ni(xn) 11m n->oo

In/mJ --

Q-m

~

Ixnl-t

,

for every 1 ::::; i ::::; Qm. 33. (Chaitin) For every natural t, show that there are infinitely many natural n for which all strings of length n have the property

H(x) < Ixl + llogQ IxlJ - t. 34. Let

x

=

be a random string and random?

X1X2 ... Xn

X1aX2a ... aX n

a

E

A.

Is the string

35. Let x = X1X2 ... Xn be a random string over the binary alphabet {O, I}. Construct the new binary string y = Y1Y2· .. Yn, where Y1 = Xl, Yj = Xj EBXj-1, for j = 2,3, ... ,n and EB is the modulo-2 addition. Is Y random? 36. Show that the maximal number of consecutive ones in an m-random binary string of length n is m + o (10g2 n).


37. (Staiger) If C c AQ and x E CW, then there is a constant c > 0 such that for all finite prefixes w

K(w) :::;

~ logQ #C + clog Q Iwl, n

and ~ logQ #C

< 1 iff C =I- A Q. 38. (Staiger) A function f : N ---. AQ has

bounded ambiguity if there exists a positive integer k such that #(f-l({X})) :::; k, for all x E A Q. Show that

2: Q-HQ(f(n)) <

00,

n;:::O

provided

f

is computable and has bounded ambiguity.

39. Assume that $g : N \to A_Q^*$ is computable, $h : N \to N$ is semi-computable from above and $\sum_{n \ge 0} Q^{-h(n)} < \infty$. Show that $H_Q(g(n)) \le h(n) + O(1)$.

40. (Staiger) Assume that $f : N \to A_Q^*$ is computable and has bounded ambiguity. Let $g : N \to A_q^*$ be computable. Then

\[ H_q(g(n)) \le (\log_q Q) H_Q(f(n)) + O(1). \]

5.10 History of Results

Random strings were first studied by Kolmogorov [259], Chaitin [110, 111] and Martin-Löf [302]. This chapter is essentially based on Chaitin [118, 122, 125]. Berry's paradox was used for the first time as a tool for incompleteness by Chaitin [112] (for earlier discussions see Russell [352] and Gardner [205]); our presentation follows Bennett [32]. The paradox of randomness appears for the first time in Laplace [273]; see also Chaitin [122], Martin-Löf [301], Kolmogorov and Uspensky [261], Calude [51], Cover, Gacs and Gray [151]. The experiment on discs was first described by von Mises [422]. For other aspects of the theory of Kolmogorov random strings see Bennett, Gacs, Li, Vitanyi and Zurek [31], Chaitin [118, 120, 121, 122], Kolmogorov [260], Zvonkin and Levin [455], Fine [197], Gewirtz [208], Gacs [203], Martin-Löf [302], van Lambalgen [412], Calude [51], Li and Vitanyi [280, 281], Uspensky [407], Uspensky and Shen [409], Vereshchagin [415]. The representability results (Theorem 5.16, Theorem 5.25) come from Calude, Chitescu and Staiger [73]; see also Staiger [382], Duta [188].

Martin-Löf's asymptotical formula was first proven in Martin-Löf [302]; see also [51]. Theorem 5.6 was proven in Chaitin [127]; Theorem 5.32 is due to Chaitin [111]. Theorem 5.33 comes from van Lambalgen [412]. The topological results are due to Zimand [451] (see more in [452]), Calude and Chitescu [72], Calude and Campeanu [63], Calude [54]. The study of Borel normality for random sequences was initiated by Chaitin [111] and Kolmogorov (unpublished result quoted in [297]); more detailed results have been obtained in Chaitin [125], Knuth [255], Marandijan [297], Li and Vitanyi [282] and Calude [53]. For a general discussion about the significance of this property see Bernoulli [35] and von Mises [421, 422]. The material covered in Section 5.8 comes from Calude and Campeanu [64]; see also Campeanu [107, 108] and Staiger [383].

The effectively immune sets have been studied by Smullyan [370] and Sacks [353]. Blum [37] has studied immunity in a context very close to ours. Constructive immune sets were introduced by Li [283]. For more details on these matters see Rogers [347] and Odifreddi [321].

Exercises 5.9.13-17 come from Calude and Chitescu [73]; Exercises 5.9.21-27 come from [63]. I have used [451] for Exercise 5.9.28 and [367] for Exercise 5.9.29. A more general result than Exercise 5.9.20 was proven in Calude and Istrate [88]. Exercise 5.9.31 comes from Gacs [203] and Exercise 5.9.32 is due to Kramosil [263, 249]. Exercises 5.9.37-40 come from Staiger [383].

For other presentations of randomness see Barrow [15], Bassein [20], Beltrami [25], Bennett and Gardner [32], Calude [57, 58, 59], Casti [103, 104, 106], Chaitin [130, 131, 132, 134, 135], Davies [155], Davies and Gribbin [156], Davis [157], Davis and Hersh [160], Delahaye [164, 165], Dembski [167], Ferbus-Zanda, Grigorieff [195], Markowsky [300], Pagels [328], Paulos [329], Rucker [349, 350], Ruelle [351], Stewart [380], Tymczko [406], Vidakovic [417], Volchan [420], and Wolfram [439].

Chapter 6

Random Sequences

Make everything as simple as possible, but not simpler.
Albert Einstein

In this chapter we address the problem of defining the notion of random sequence. Various equivalent characterizations - measure-theoretical, information-theoretical, topological - will be discussed. We will then establish the main properties of random sequences and formulate the "randomness hypothesis".

6.1 From Random Strings to Random Sequences

It is quite clear that all informal requirements discussed for random strings should transfer automatically to random sequences. But in this case we have also to cope with all the traps of infinity. Indeed, it is not difficult to shuffle 52 cards, but we may ask about the real significance of shuffling the points of the unit interval! We discuss two examples (presented in Theorem 6.1 and Theorem 6.10) that point out the extent to which we can transfer randomness from the finite case to the infinite one. We begin with the following result proved by Borel [40, 41].

Theorem 6.1. Almost all real numbers, when expressed in any base, contain every possible digit or possible string of digits.

Proof. Let $A = \{0, 1, \ldots, Q - 1\}$, with $Q > 2$. Notice that for all $a \in A$ and $x \in A^*$, $x$ does not contain the digit $a$ iff $x \in (A \setminus \{a\})^*$. Accordingly, for every $k > 0$,

\[ N(k) = \#\{x \in A^* \mid |x| \le k,\ x \text{ does not contain } a\} = \frac{(Q - 1)^{k+1} - 1}{Q - 2}, \]

and

\[ \frac{N(k)}{\#\{x \in A^* \mid |x| \le k\}} = \frac{((Q - 1)^{k+1} - 1)(Q - 1)}{(Q^{k+1} - 1)(Q - 2)}, \]

so

\[ \lim_{k \to \infty} \frac{N(k)}{\#\{x \in A^* \mid |x| \le k\}} = 0. \]

We may now write the formula

\[ \lim_{k \to \infty} \frac{\#\{x \in A^* \mid |x| \le k,\ x \text{ does contain } a\}}{\#\{x \in A^* \mid |x| \le k\}} = 1, \]

which shows that almost all reals, when expressed in any scale $Q \ge 2$, contain every possible digit $a \in \{0, 1, \ldots, Q - 1\}$. The case of strings of digits can be easily settled just by working with a large enough base. For instance, if the string 957 never occurs in the ordinary decimal for some number, then the digit 957 never occurs in base 1000. $\Box$

Theorem 6.1 suggests that for sequences, like strings, randomness refers to typicality; in particular, "almost all sequences" should be "random". To go on we introduce some new notation. The set of all sequences over the alphabet $A$ is denoted by $A^\omega$, i.e. $A^\omega = \{x \mid x = x_1x_2 \ldots x_n \ldots,\ x_i \in A\}$.
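The counting ratio used in the proof of Theorem 6.1 can be evaluated exactly for concrete values. The following Python sketch (function names are hypothetical) computes the fraction of strings of length at most $k$ that avoid one fixed digit, both by direct summation and through the closed form, which requires $Q > 2$.

```python
from fractions import Fraction

def fraction_missing_digit(Q: int, k: int) -> Fraction:
    """Share of strings of length <= k over Q letters that avoid one fixed letter."""
    avoiding = sum((Q - 1) ** j for j in range(k + 1))   # strings over the other Q - 1 letters
    total = sum(Q ** j for j in range(k + 1))            # all strings of length <= k
    return Fraction(avoiding, total)

def closed_form(Q: int, k: int) -> Fraction:
    """The same ratio via the closed form from the proof (needs Q > 2)."""
    return Fraction(((Q - 1) ** (k + 1) - 1) * (Q - 1), (Q ** (k + 1) - 1) * (Q - 2))

for k in (1, 10, 50, 200):
    print(k, float(fraction_missing_digit(10, k)))
```

The printed values decrease towards zero, in line with the limit computed in the proof.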

For every sequence $x = x_1x_2 \ldots x_n \ldots \in A^\omega$ we put:

a) $x(n) = x_1x_2 \ldots x_n \in A^*$, $n > 0$,

b) $x_{m,n} = x_mx_{m+1} \ldots x_n$, in case $n \ge m > 0$, and $x_{m,n} = \lambda$, in the remaining cases.

For every $x = x_1 \ldots x_m \in A^*$ and $y = y_1 \ldots y_n \ldots \in A^\omega$ we denote by $xy$ the concatenation sequence $x_1 \ldots x_my_1 \ldots y_n \ldots$; in particular, $\lambda y = y$. For $X \subset A^*$, we put

\[ XA^\omega = \{xy \mid x \in X,\ y \in A^\omega\}. \]

In case $X$ is a singleton, i.e. $X = \{x\}$, we write $xA^\omega$ instead of $XA^\omega$.

Encouraged by Theorem 6.1 we may define a "random sequence" as a sequence whose prefixes are "c-random". Let us first interpret the above definition in terms of the complexity $K$: a sequence $x \in A^\omega$ is "random" iff there exists a constant $c$ such that for all natural $n \ge 1$,

\[ K_\psi(x(n)) > n - c. \]

Here $\psi$ is a fixed universal computer. To get an image of the nature of the above definition let us consider the binary case (i.e. $A = \{0, 1\}$) and denote by $N_0^n(x)$ the number of successive zeros ending in position $n$ of the sequence $x$. A result in classical probability theory says (see Feller [192], p. 210, problem 5) that with probability 1

\[ \limsup_{n \to \infty} \frac{N_0^n(x)}{\log_2 n} = 1. \]

This means that for almost all sequences $x \in A^\omega$ there exist infinitely many $n$ for which $x(n) = x_{1,n-\log n}0^{\log n}$, i.e.

The above result suggests that there is no sequence satisfying the above condition of randomness. In fact, we shall prove that the above result is true for all sequences (not only with probability 1)! Hence, the complexity K is not an adequate instrument to define random sequences. We start with a technical result, which is interesting in itself.

Lemma 6.2. Let $n(1), n(2), \ldots, n(k)$ be natural numbers, $k \ge 1$. The following assertions are equivalent:

i) One has

\[ \sum_{i=1}^{k} Q^{-n(i)} \ge 1. \tag{6.1} \]

ii) One can effectively find $k$ strings $s(1), s(2), \ldots, s(k)$ in $A^*$ with $|s(i)| = n(i)$, for all $1 \le i \le k$, and such that

\[ \bigcup_{i=1}^{k} s(i)A^\omega = A^\omega. \tag{6.2} \]

6. Random Sequences

150

Proof. i) $\Rightarrow$ ii) We may assume that the numbers $n(1), n(2), \ldots, n(k)$ are increasingly ordered: $n(1) \leq n(2) \leq \cdots \leq n(k)$. In view of (6.1), the numbers $n(1), n(2), \ldots, n(k)$ are not all distinct. So we put

$$n(1) = n(2) = \cdots = n(t_1) = m_1 < n(t_1 + 1) = n(t_1 + 2) = \cdots = n(t_1 + t_2) = m_2 < \cdots < n(t_1 + \cdots + t_{u-1} + 1) = \cdots = n(t_1 + \cdots + t_{u-1} + t_u) = m_u.$$

There are two distinct situations.

First Situation. One has $t_1 \geq Q^{m_1}$. In this case we take $\{s(1), s(2), \ldots, s(Q^{m_1})\}$ to be $A^{m_1}$, in lexicographical order. The remaining strings $s(i)$ can be taken arbitrarily, subject to $|s(i)| = n(i)$, because one has

$$\bigcup_{i=1}^{Q^{m_1}} s(i)A^\omega = A^\omega.$$

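Second Situation. There exists a natural $2 \leq h \leq u$ such that

$$t_1 Q^{-m_1} + t_2 Q^{-m_2} + \cdots + t_{h-1} Q^{-m_{h-1}} < 1$$

and

$$t_1 Q^{-m_1} + t_2 Q^{-m_2} + \cdots + t_{h-1} Q^{-m_{h-1}} + t_h Q^{-m_h} \geq 1.$$

Multiplying by $Q^{m_h}$ one can effectively find a natural $1 \leq t \leq t_h$ such that

$$t_1 Q^{m_h - m_1} + t_2 Q^{m_h - m_2} + \cdots + t_{h-1} Q^{m_h - m_{h-1}} + t = Q^{m_h}. \qquad (6.3)$$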

We choose $s(1), s(2), \ldots, s(t_1)$ to be the first (in lexicographical order) $t_1$ strings of length $m_1$. We have

$$\bigcup_{i=1}^{t_1} s(i)A^\omega = \bigcup xA^\omega,$$

where $x$ runs over the first $t_1 Q^{m_h - m_1}$ strings of length $m_h$ (in lexicographical order). The procedure continues in the same manner. Assume that we have already constructed the strings $s(1), s(2), \ldots, s(t_1)$ (of length $m_1$), $s(t_1 + 1), s(t_1 + 2), \ldots, s(t_1 + t_2)$ (of length $m_2$), ..., $s(t_1 + \cdots + t_{i-1} + 1), s(t_1 + \cdots + t_{i-1} + 2), \ldots, s(t_1 + \cdots + t_{i-1} + t_i)$ (of length $m_i$), for $i < h$. Suppose also that

$$\bigcup_{j=1}^{T_i} s(j)A^\omega = \bigcup_{x \in X_i} xA^\omega,$$

where $X_i$ consists of the first $t_1 Q^{m_i - m_1} + t_2 Q^{m_i - m_2} + \cdots + t_{i-1} Q^{m_i - m_{i-1}} + t_i$ strings of length $m_i$ (in lexicographical order), and $T_i = t_1 + t_2 + \cdots + t_i$. In view of (6.3), the set $A^{m_i} \setminus X_i$ is not empty. Let $x$ be the first element (in lexicographical order) of the set $A^{m_i} \setminus X_i$. Then let $y$ be the first (in lexicographical order) element of $A^{m_{i+1} - m_i}$ and $s(T_i + 1) = xy$. We construct the next strings of length $m_{i+1}$ (in lexicographical order):

$$s(T_i + 1), s(T_i + 2), \ldots, s(T_i + t_{i+1}) = s(t_1 + t_2 + \cdots + t_{i+1}),$$

if $i + 1 < h$, and

$$s(T_{h-1} + 1), s(T_{h-1} + 2), \ldots, s(T_{h-1} + t) = s(t_1 + t_2 + \cdots + t_{h-1} + t),$$

if $i = h - 1$. It is seen that

$$\bigcup_{j=1}^{T} s(j)A^\omega = A^\omega,$$

where $T = t_1 + t_2 + \cdots + t_{h-1} + t$, again by virtue of (6.3). So, if $k > T$, the remaining strings $s(i)$, $i > T$, can be taken arbitrarily with the condition $|s(i)| = n(i)$; the property (6.2) will hold true.

ii) $\Rightarrow$ i) Again assume that $n(1) \leq n(2) \leq \cdots \leq n(k)$, and put

$$J_i = \{x \in A^{n(k)} \mid s(i) <_p x\},$$

$1 \leq i \leq k$. Condition (6.2) implies that

$$A^{n(k)} \subset \bigcup_{i=1}^{k} J_i$$

and this in turn implies the inequality

$$\sum_{i=1}^{k} \#J_i \geq \#A^{n(k)}.$$

This means that

$$\sum_{i=1}^{k} Q^{n(k) - n(i)} \geq Q^{n(k)},$$

which is exactly (6.1). $\Box$
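The implication i) $\Rightarrow$ ii) is effective, and a small program makes this concrete. The Python sketch below uses a greedy variant of the lexicographic construction (shortest lengths first, each string taken as the next unused block in lexicographic order); it is not a transcription of the proof. The names `covering_strings` and `covers` and the default binary alphabet are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def covering_strings(lengths, alphabet="01"):
    """Given lengths n(1..k) with sum Q**(-n(i)) >= 1, return strings s(i)
    with |s(i)| = n(i) whose cylinders s(i)A^omega cover A^omega."""
    q = len(alphabet)
    if sum(Fraction(1, q ** n) for n in lengths) < 1:
        raise ValueError("condition (6.1) fails")
    result = [None] * len(lengths)
    covered = Fraction(0)  # measure of the initial segment already covered
    # Process lengths in increasing order, remembering original positions.
    for idx in sorted(range(len(lengths)), key=lambda i: lengths[i]):
        n = lengths[idx]
        if covered < 1:
            # `covered` is a multiple of Q**(-n), so this is an exact integer.
            rank = int(covered * q ** n)
            covered += Fraction(1, q ** n)
        else:
            rank = 0  # everything is covered; any string of length n works
        # Write `rank` in base Q with exactly n digits.
        digits = []
        for _ in range(n):
            rank, d = divmod(rank, q)
            digits.append(alphabet[d])
        result[idx] = "".join(reversed(digits))
    return result


def covers(strings, alphabet="01"):
    """Check that every word of maximal length has some s(i) as a prefix."""
    top = max(len(s) for s in strings)
    return all(
        any("".join(w).startswith(s) for s in strings)
        for w in product(alphabet, repeat=top)
    )


print(covering_strings([2, 1, 2]))  # ['10', '0', '11']
```

The check in `covers` is finite because a cylinder $s(i)A^\omega$ is determined by the words of any fixed length at least $|s(i)|$ that extend $s(i)$.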

Definition 6.3. A p.c. function $F : \mathbf{N} \overset{o}{\to} \mathbf{N}$ is said to be small if

$$\sum_{n=0}^{\infty} Q^{-F(n)} = \infty.$$
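A quick numerical experiment shows what smallness means. The Python sketch below computes exact partial sums of $Q^{-F(n)}$ for $F(n) = \lfloor \log_Q n \rfloor$ with $Q = 2$; the helper names `floor_log` and `partial_sum` are illustrative assumptions, and starting the sum at $n = 1$ is a convenience because this $F$ is undefined at $0$.

```python
from fractions import Fraction


def floor_log(n, q):
    """Largest integer e >= 0 with q**e <= n, i.e. floor(log_q n), for n >= 1."""
    e = 0
    while q ** (e + 1) <= n:
        e += 1
    return e


def partial_sum(F, q, upto):
    """Sum of q**(-F(n)) for 1 <= n <= upto, computed exactly."""
    return sum(Fraction(1, q ** F(n)) for n in range(1, upto + 1))


q = 2
# Each block [2**k, 2**(k+1)) contributes exactly 1 to the sum,
# so the partial sums grow without bound.
for k in range(1, 5):
    print(2 ** k - 1, partial_sum(lambda n: floor_log(n, q), q, 2 ** k - 1))
```

By contrast, the identity function $F(n) = n$ is not small, since its partial sums stay below $1$.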

Example 6.4. a) Let $k \in \mathbf{N}$. The constant function $F : \mathbf{N} \to \mathbf{N}$ given by $F(n) = k$, for all $n \in \mathbf{N}$, is a small function.

b) Take $a$ to be a strictly positive rational, $a < 1$ or $a \geq Q$. The p.c. function $F(n) = \lfloor \log_a n \rfloor$, for $n \geq 1$, is a small function. In particular, $F(n) = \lfloor \log_Q n \rfloor$ is small.

Lemma 6.5. Let $F$ be a small function and let $k$ be an integer such that $F(n) + k \geq 0$, for all $n \in \mathrm{dom}(F)$. We define the function $F + k : \mathrm{dom}(F) \to \mathbf{N}$ by $(F + k)(n) = F(n) + k$. Then, $F + k$ is a small function.

Lemma 6.6. Let $g$ be a small function with a computable graph. Then one can effectively find another small function $G$ with a computable domain such that:

a) The function $g$ extends $G$.

b) For every $n \in \mathrm{dom}(G)$ one has $G(n) \leq n$.

c) For every natural $k$ there exists at most one natural $n$ with $G(n) = n - k$.

Proof. We define the p.c. function $G : \mathbf{N} \overset{o}{\to} \mathbf{N}$ as follows:

$$G(n) = \begin{cases} g(n), & \text{if } g(n) \leq n \text{ and } m - g(m) \neq n - g(n), \text{ for every natural } m < n, \\ \infty, & \text{otherwise.} \end{cases}$$

Since $g$ has a computable graph, it follows that all conditions in the above definition are computable and $G$ satisfies the above three requirements. In particular, $G$ has a computable graph. It remains to be proven that $G$ has a computable domain and

$$\sum_{n=0}^{\infty} Q^{-G(n)} = \infty. \qquad (6.4)$$

To this end we define the sets

$$X = \{n \in \mathbf{N} \mid g(n) \leq n\}, \qquad X_k = \{n \in \mathbf{N} \mid g(n) = n - k\}, \quad k \in \mathbf{N}.$$

Notice that $X = \bigcup_{k \geq 0} X_k$ and the sets $X_k$ are pairwise disjoint. Because $g$ is small and

$$\sum_{n \in \mathbf{N} \setminus X} Q^{-g(n)} < \infty$$

one has

$$\sum_{n \in X} Q^{-g(n)} = \infty,$$

which means that

$$\sum_{k \in Y} \sum_{n \in X_k} Q^{-g(n)} = \infty, \qquad (6.5)$$

where $Y = \{k \in \mathbf{N} \mid X_k \neq \emptyset\}$.

For every $k \in Y$ we denote by $n_k$ the smallest element of $X_k$. Then $\mathrm{dom}(G) = \{n_k \mid X_k \neq \emptyset\}$. So,

$$G(n) < \infty \text{ iff } G(n) \leq m, \text{ for some } m \leq n.$$

Accordingly, $\mathrm{dom}(G)$ is computable. We put

$$\alpha = \sum_{k \in Y} \sum_{n \in X_k \setminus \{n_k\}} Q^{-g(n)},$$

and

$$\beta = \sum_{k \in Y} Q^{-g(n_k)} = \sum_{n=0}^{\infty} Q^{-G(n)}.$$

(The sum over the empty set is null.) So, we can write (6.5) in the form

$$\alpha + \beta = \infty. \qquad (6.6)$$

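For every $k \in Y$ one has

$$\sum_{n \in X_k \setminus \{n_k\}} Q^{-g(n)} = \sum_{n \in X_k \setminus \{n_k\}} Q^{-(n-k)} \leq \sum_{n > n_k} Q^{k-n} = Q^{k - n_k}/(Q - 1) = Q^{-g(n_k)}/(Q - 1).$$

It follows that

$$\alpha \leq \sum_{k \in Y} Q^{-g(n_k)}/(Q - 1) = \beta/(Q - 1).$$

From (6.6) we deduce that $\infty = \alpha + \beta \leq \beta + \beta/(Q - 1)$; hence $\beta = \infty$, which is precisely (6.4). $\Box$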
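The passage from $g$ to $G$ in Lemma 6.6 is a simple thinning procedure, sketched below in Python on a finite range. The name `thin_out`, the use of `None` for "undefined", and the sample function $g(n) = \lfloor \log_2 n \rfloor$ are assumptions chosen for illustration.

```python
def thin_out(g, limit):
    """Tabulate G on {0, ..., limit-1} following the definition in the proof:
    G(n) = g(n) if g(n) <= n and m - g(m) != n - g(n) for every m < n,
    and G(n) is undefined (None) otherwise. `g` returns None where it
    is undefined."""
    seen = set()   # the differences m - g(m) for the m processed so far
    table = {}
    for n in range(limit):
        value = g(n)
        if value is None:
            table[n] = None
            continue
        diff = n - value
        table[n] = value if value <= n and diff not in seen else None
        seen.add(diff)
    return table


def g(n):
    # Sample small function: floor(log2 n) for n >= 1, undefined at 0.
    return n.bit_length() - 1 if n >= 1 else None


defined = {n: v for n, v in thin_out(g, 10).items() if v is not None}
print(defined)  # {1: 0, 3: 1, 5: 2, 6: 2, 7: 2, 9: 3}
```

On the points where it is defined, the table agrees with $g$, satisfies $G(n) \leq n$, and the differences $n - G(n)$ are pairwise distinct, which are the requirements a), b) and c) of the lemma.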

Proposition 6.7. Let $g$ be a small function with computable graph. Then we can effectively construct a computable function $f : \mathbf{N} \to A^*$ such that for every sequence $x$ in $A^\omega$ the set

$$C(x) = \{n \in \mathbf{N} \mid f(n) = x(|f(n)|) \text{ and } |f(n)| = g(n)\}$$

is infinite.

Proof. Given $g$ we construct $G$ according to Lemma 6.6. We can define $f$ by the following procedure (we rely on the computability of $\mathrm{dom}(G)$):

Stage 0.

1. Compute $n_0 = \min\{n \in \mathbf{N} \mid \sum_{j=0}^{n} Q^{-G(j)} \geq 1\}$.

2. Extract from the vector $(G(0), G(1), \ldots, G(n_0))$ all finite components and call them $(G(i(0)), G(i(1)), \ldots, G(i(k_0)))$, where $i(0) < i(1) < \cdots < i(k_0)$.

3. We construct the strings $s(0), s(1), \ldots, s(k_0)$ in $A^*$ such that $|s(j)| = G(i(j))$, $0 \leq j \leq k_0$, and for every $x$ in $A^\omega$ there exists a natural $0 \leq j \leq k_0$ satisfying $s(j) = x(|s(j)|)$. This is done by Lemma 6.2, because of the choice of $n_0$.

4. We put $f(i(j)) = s(j)$, for all natural $0 \leq j \leq k_0$ and $f(m) = \lambda$, for every $m \in \{0, 1, \ldots, n_0\} \setminus \{i(0), i(1), \ldots, i(k_0)\}$.

Stage $q + 1$.

1. Compute $n_{q+1} = \min\{n \in \mathbf{N} \mid n > n_q,\ \sum_{j=n_q+1}^{n} Q^{-G(j)} \geq 1\}$.

2. Extract from the vector $(G(n_q + 1), G(n_q + 2), \ldots, G(n_{q+1}))$ all finite components and call them $(G(i(k_q + 1)), G(i(k_q + 2)), \ldots, G(i(k_{q+1})))$, where $i(k_q + 1) < i(k_q + 2) < \cdots < i(k_{q+1})$.

3. Find the strings $s(k_q + 1), s(k_q + 2), \ldots, s(k_{q+1})$ in $A^*$ having $|s(j)| = G(i(j))$, for all $j = k_q + 1, k_q + 2, \ldots, k_{q+1}$ and such that for each $x \in A^\omega$ there exists a natural $k_q + 1 \leq j \leq k_{q+1}$ with $s(j) = x(|s(j)|)$.

4. We define $f(i(j)) = s(j)$, for all natural $j \in \{k_q + 1, k_q + 2, \ldots, k_{q+1}\}$ and $f(m) = \lambda$, for $m \in \{n_q + 1, n_q + 2, \ldots, n_{q+1}\} \setminus \{i(k_q + 1), i(k_q + 2), \ldots, i(k_{q+1})\}$.

The above procedure defines a computable function $f$. For every $x \in A^\omega$ the set $C(x)$ is infinite because $\mathrm{dom}(G)$ is infinite and $G$ is small. $\Box$
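Step 1 of each stage, the search for the cut points $n_0 < n_1 < \cdots$, can be sketched as follows. This Python fragment is an illustration only: the name `stage_boundaries`, the convention that `None` marks an undefined value, and the sample function (which is small but not of the thinned form produced by Lemma 6.6) are assumptions made here.

```python
from fractions import Fraction


def stage_boundaries(G, q, stages):
    """Return [n_0, n_1, ...]: n_0 is the least n with
    sum_{j <= n} q**(-G(j)) >= 1, and n_{k+1} is the least n > n_k with
    sum_{n_k < j <= n} q**(-G(j)) >= 1. G(j) is None where undefined.
    The search does not terminate if G is not small."""
    bounds = []
    n = -1
    for _ in range(stages):
        total = Fraction(0)
        while total < 1:
            n += 1
            value = G(n)
            if value is not None:
                total += Fraction(1, q ** value)
        bounds.append(n)
    return bounds


def G(n):
    # Illustrative choice: floor(log2 n) for n >= 1, undefined at 0.
    return n.bit_length() - 1 if n >= 1 else None


print(stage_boundaries(G, 2, 4))  # [1, 3, 7, 15]
```

Within each block the finite values of $G$ satisfy condition (6.1), which is exactly what Lemma 6.2 needs in step 3.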

Proposition 6.8. Let $g$ be a small function with a computable graph. Let $\psi : A^* \overset{o}{\to} A^*$ be a universal computer. Then we can find a natural $c$ (depending upon $g$ and $\psi$) such that for every sequence $x \in A^\omega$ there exist infinitely many $n \in \mathrm{dom}(g)$ satisfying the inequality

$$K_\psi(x(n)) \leq n - g(n) + c. \qquad (6.7)$$

Proof. Given $g$ we construct $G$ as in Lemma 6.6. Using $G$ we construct the computable function $f : \mathbf{N} \to A^*$ having the property that for every sequence $x \in A^\omega$ the set

$$C(x) = \{n \in \mathbf{N} \mid f(n) = x(|f(n)|),\ |f(n)| = G(n)\}$$

is infinite (we made use of Proposition 6.7). Now we can define the p.c. function

$$\varphi(y) = \begin{cases} f(n)y, & \text{if there exists an } n \text{ with } n - G(n) = |y|, \\ \infty, & \text{otherwise.} \end{cases}$$

It is clear that in view of Lemma 6.6 there exists at most one natural $n$ such that $n - G(n) = |y|$; so, the above definition is correct.

We now take a sequence $x \in A^\omega$. Notice that the set

$$D(x) = \{n \in \mathbf{N} \mid x(|f(n)|) = f(n),\ |f(n)| = G(n) < n\}$$

is infinite because $C(x)$ is infinite and the set

$$\{n \in \mathbf{N} \mid f(n) = x(|f(n)|),\ |f(n)| = G(n) = n\}$$

has at most one element (according to Lemma 6.6). For every $n \in D(x)$ we construct the string

$$y(f, n) = x_{u+1} x_{u+2} \ldots x_{u+n-G(n)},$$

where $u = |f(n)| = G(n)$. We have

$$\varphi(y(f, n)) = f(n)\,y(f, n) = x(n),$$

which shows that

$$K_\varphi(x(n)) \leq |y(f, n)| = n - G(n).$$

The Invariance Theorem furnishes a natural $c$ such that

$$K_\psi(x(n)) \leq K_\varphi(x(n)) + c \leq n - G(n) + c = n - g(n) + c,$$

for every $n \in D(x) \subset \mathrm{dom}(g)$. $\Box$

The next step consists of the elimination of the constant $c$ in (6.7). To this end we prove:

Lemma 6.9. Let $F$ be a small function with computable domain. Then we can effectively construct a small function $F^*$ with the same domain as $F$ which has the following supplementary property: for every natural $c$ there exists a natural $N_c$ such that

$$F^*(n) \geq F(n) + c,$$

for all $n \in \mathrm{dom}(F)$, $n \geq N_c$.

Proof. Let $r : \mathbf{N} \to \mathbf{N}$ be a computable, strictly increasing function such that $\mathrm{dom}(F) = \{r(i) \mid i \in \mathbf{N}\}$. We put $u(n) = F(r(n))$. Then

$$\sum_{n=0}^{\infty} Q^{-u(n)} = \infty. \qquad (6.8)$$

In view of (6.8) we can effectively find a computable, strictly increasing function $s : \mathbf{N} \to \mathbf{N}$ such that

$$\sum_{n=s(i)+1}^{s(i+1)} Q^{-u(n)} \geq Q^{i+1}, \qquad (6.9)$$

for all natural $i$. Now we can define the computable function $v : \mathbf{N} \to \mathbf{N}$ by $v(n) = u(n) + i + 1$, if $s(i) + 1 \leq n \leq s(i + 1)$. From (6.9) it follows that

$$\sum_{n=s(i)+1}^{s(i+1)} Q^{-v(n)} \geq 1. \qquad (6.10)$$

From (6.10) we get

$$\sum_{n=0}^{\infty} Q^{-v(n)} = \infty.$$

The required function $F^*$ may now be defined by the formula $F^*(r(n)) = v(n)$, $n \in \mathbf{N}$. $\Box$
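The block construction behind (6.9) and (6.10) can be tried out on a simple case. In the Python sketch below the name `slow_down`, the starting cut point $s(0) = 0$ and the constant sample function $u(n) = 0$ are assumptions made for illustration; the loop would not terminate for a function $u$ that is not small.

```python
from fractions import Fraction


def slow_down(u, q, blocks, s0=0):
    """Return (s, v): cut points s(0) < s(1) < ... and a dict v with
    v[n] = u(n) + i + 1 for s(i) + 1 <= n <= s(i+1), where each block
    satisfies sum q**(-u(n)) >= q**(i+1), as in (6.9)."""
    s = [s0]
    v = {}
    for i in range(blocks):
        total = Fraction(0)
        n = s[-1]
        while total < q ** (i + 1):
            n += 1
            total += Fraction(1, q ** u(n))
            v[n] = u(n) + i + 1
        s.append(n)
    return s, v


s, v = slow_down(lambda n: 0, 2, 3)
print(s)  # [0, 2, 6, 14]
```

On block $i$ the new function exceeds $u$ by $i + 1$, so it eventually dominates $u + c$ for every fixed $c$, while each block still contributes at least $1$ to the sum of $Q^{-v(n)}$.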

Theorem 6.10. Let $F$ be a small function with a computable domain and let $\psi$ be a universal computer. Then, for each sequence $x \in A^\omega$ the inequality

$$K_\psi(x(n)) \leq n - F(n)$$

holds true for infinitely many $n \in \mathrm{dom}(F)$.

Proof. From $F$ we construct $F^*$ as in Lemma 6.9. We may apply Proposition 6.8 to $F^*$ (instead of $g$) to get a natural $c$ (depending upon $F$ and $\psi$) such that the set

$$H(x) = \{n \in \mathbf{N} \mid K_\psi(x(n)) \leq n - F^*(n) + c\}$$

is infinite for every $x \in A^\omega$. For this constant $c$ we get, using Lemma 6.9, a natural $N_c$ such that $F^*(n) \geq F(n) + c$, for all $n \geq N_c$, $n \in \mathrm{dom}(F)$. It follows that for every $x \in A^\omega$ the set

$$T(x) = H(x) \cap \{n \in \mathrm{dom}(F) \mid F^*(n) \geq F(n) + c\}$$

is still infinite and for every $n \in T(x)$ one has

$$K_\psi(x(n)) \leq n - F^*(n) + c \leq n - F(n). \qquad \Box$$

In particular:

Corollary 6.11. For every sequence $x \in A^\omega$ and every universal computer $\psi$,

$$K_\psi(x(n)) \leq n - \log n,$$

for infinitely many $n$.

So, by Theorem 6.10, there is no way to extend the notion of randomness from strings to sequences, using the absolute complexity $K$.

6.2 The Definition of Random Sequences

Formalizing the idea that a sequence is random in case it passes all conceivable effectively testable properties of stochasticity we get a first formal definition of random sequences. Our main instrument will be the sequential Martin-Löf test.

Definition 6.12. A Martin-Löf test $V \subset A^* \times \mathbf{N}_+$ is called sequential if it satisfies the additional sequential property: for all natural $m \geq 1$, if $x \in V_m$ and $x <_p y$, then $y \in V_m$.

Example 6.13. The set

$$\hat{H}(x, m) = \{(y, n) \in A^* \times \mathbf{N}_+ \mid 1 \leq n \leq m,\ x <_p y\}$$

is a sequential Martin-Löf test.

Recall (see Exercise 5.9.3) that for every $x \in A^*$ and natural $m \geq 1$ with $|x| > m$, $H(x, m) = \{(x, 1), \ldots, (x, m)\}$.

Theorem 6.14. Let $x_1, x_2, \ldots, x_k$ be strings and $m_1, m_2, \ldots, m_k \geq 1$ be natural numbers such that $|x_i| \geq m_i$, for all $1 \leq i \leq k$. We put

$$H = \bigcup_{i=1}^{k} H(x_i, m_i).$$

The following statements are equivalent:

i) The set

$$\hat{H} = \{(y, n) \mid y \in A^*,\ 1 \leq n \leq m,\ x <_p y, \text{ for some } (x, m) \in H\}$$

is a sequential Martin-Löf test.

ii) The set $H$ is a Martin-Löf test and for every prefix-free subset $\{x_{i_1}, \ldots, x_{i_r}\} \subset \{x_1, \ldots, x_k\}$ one has

$$\sum_{j=1}^{r} Q^{-|x_{i_j}|} < Q^{-\min\{m_{i_1}, \ldots, m_{i_r}\}}/(Q - 1).$$

Proof Let Pf(A*) be the family of all finite subsets of A* and define the function

--+

Pf(A*),
It is obvious that
a)

For all X E Pf(A*),
b)

For every n E N, {x E An I y An I y

c)

d)

E X}

{x E


Only b) deserves a proof. Let x E An be such that y

Intermediate Step. Let X c A* be a finite, prefix-free set. Assume that n ~ lxi, for every x E X. Then

#{x E An I y

#{x

E

=1=

Y2, then {x

E

An I Y1

An I y

E

X} =

An I Y2

E

L #{x E An I y

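We are now ready to conclude the proof of the theorem.

i) $\Rightarrow$ ii) Let $\{x_{i_1}, \ldots, x_{i_r}\}$ be a prefix-free set and let $n \geq |x_{i_j}|$, $1 \leq j \leq r$, $m = \min\{m_{i_1}, \ldots, m_{i_r}\}$. In view of the construction of $\hat{H}$ one has

$$\{x \in A^n \mid x_{i_j} <_p x, \text{ for some } 1 \leq j \leq r\} \subset A^n \cap \hat{H}_m$$

(recall that $\hat{H} \subset A^* \times \mathbf{N}$, $\hat{H}_m = \{y \in A^* \mid (y, m) \in \hat{H}\}$). So,

$$\sum_{j=1}^{r} Q^{n - |x_{i_j}|} = \#\{x \in A^n \mid x_{i_j} <_p x, \text{ for some } 1 \leq j \leq r\} \leq \#(A^n \cap \hat{H}_m) < Q^{n-m}/(Q - 1).$$

(We have used the Intermediate Step for $X = \{x_{i_1}, \ldots, x_{i_r}\}$.) Hence

$$\sum_{j=1}^{r} Q^{-|x_{i_j}|} < Q^{-m}/(Q - 1).$$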

ii) =? i) Clearly, ii satisfies the sequentiability condition and iii+1 it remains to prove the cardinality inequality. To this end we put

c iii;


One has

mi ~ m, for some 1 SiS k} {xEAnly
X,

Accordingly, #(An n Hm)

#{x E An I y

L .

yEcp(X)

< Qn-min{mi I Xi Ecp(X)} /(Q - 1) < Qn-m /(Q _ 1).

D

Remark. For a set $H$ as in Theorem 6.14, condition ii) is actually effectively computable.

Theorem 6.15. The set of all sequential Martin-Löf tests is c.e. More precisely, there exists a c.e. set $T \subset \mathbf{N} \times (A^* \times \mathbf{N})$ such that for every $V \subset A^* \times \mathbf{N}_+$ the following equivalence holds true: $V$ is a sequential Martin-Löf test iff there exists a natural $i \geq 1$ such that

$$V = \{(x, m) \mid x \in A^*,\ m \in \mathbf{N}_+,\ (i, x, m) \in T\}.$$

Proof. Consider a p.c. function $f : \mathbf{N} \times \mathbf{N} \overset{o}{\to} A^* \times \mathbf{N}$ as in the proof of Theorem 3.3.

The following procedure constructs the section $T_i$ of $T$ (recall that $T_i = \{(x, m) \in A^* \times \mathbf{N}_+ \mid (i, x, m) \in T\}$):

1. Put $T_i = \emptyset$.

2. Put $j = 1$.

3. If $f(i, j) = \infty$, then $T_i$ remains unchanged (the procedure continues indefinitely).

4. If $f(i, j) \neq \infty$, then compute $f(i, j) = (x_j, m_j)$.

5. Put $R_i = T_i \cup \hat{H}(x_j, m_j)$.

6. If $R_i$ is not a sequential Martin-Löf test, then $T_i$ remains unchanged. Stop.

7. Put $T_i = R_i$, $j = j + 1$ and go to step 3.

Only step 6 may cause a slight difficulty ($R_i$ is infinite at that moment). We can overcome this by using Theorem 6.14. So, all that remains to be proven reduces to

i) if $V_i$ is a sequential Martin-Löf test, then $V_i = T_i$,

ii) every $T_i$ is a sequential Martin-Löf test.

For i) assume $V_i \neq \emptyset$. We shall prove that $T_i \subset V_i$ in the non-trivial case when $T_i$ is non-empty. Let $(x, m)$ be in $T_i$. According to steps 5 and 7 in the above procedure one must find a natural $j \geq 1$ such that $(x, m) \in \hat{H}(x_j, m_j)$, where $f(i, j) = (x_j, m_j) \in T_i$. So,

$$\bigcup_{t=1}^{j} \hat{H}(x_t, m_t) \subset T_i$$

(in particular, $(x_j, m_j) \in T_i$). Using the properties of $f$, $f(i, k) = (x_k, m_k) \neq \infty$, for $k = 1, 2, \ldots, j$. So, all $(x_k, m_k)$ are in $V_i$, $k = 1, 2, \ldots, j$. Since $V_i$ is a sequential Martin-Löf test, $|x_k| > m_k$, $1 \leq k \leq j$. The proof will be completed for i) in case we show that

$$N_j = \bigcup_{t=1}^{j} \hat{H}(x_t, m_t)$$

is a sequential Martin-Löf test. But $N_j \subset V_i$, which is a sequential Martin-Löf test, so $N_j$ is itself a sequential Martin-Löf test.

For ii) we consider two situations: a) the procedure eventually halts, b) the procedure continues indefinitely. In a) $T_i = \emptyset$ or $T_i = \bigcup_{k=1}^{j} \hat{H}(x_k, m_k)$, for some $j \geq 1$; in both cases $T_i$ is a sequential Martin-Löf test. For b) we have again to consider two possibilities: the procedure runs step 3 indefinitely (and this case reduces to a previous analysis since the result is a finite set) or the procedure runs steps 3, 4, 5, 6, 7 indefinitely, in which case

$$T_i = \bigcup_{k=1}^{\infty} \hat{H}(x_k, m_k).$$

The set $T_i$ is c.e. and all properties of a sequential Martin-Löf test are clearly fulfilled, except the cardinality inequality. To prove this we proceed by reductio ad absurdum, i.e. we assume the existence of naturals $m, n \geq 1$ such that

$$\#\{x \in A^n \mid (x, m) \in T_i\} \geq Q^{n-m}/(Q - 1).$$

Hence, we can find $r$ strings $x_{j_1}, \ldots, x_{j_r}$, all of them of length $n$, such that $(x_{j_u}, m_{j_u}) \in T_i$ and

$$r \geq Q^{n-m}/(Q - 1).$$

Assume that $j_1 < j_2 < \cdots < j_r$. Because

$$H = \bigcup_{k=1}^{r} \hat{H}(x_{j_k}, m_{j_k}) \subset \bigcup_{t=1}^{j_r} \hat{H}(x_t, m_t)$$

it follows that $H$ is a Martin-Löf test (use Theorem 6.14) and

$$r = \#\{x \in A^n \mid (x, m) \in H\} < Q^{n-m}/(Q - 1) \leq r,$$

a contradiction. $\Box$

Theorem 6.16 (Martin-Löf). There exists a sequential Martin-Löf test $U$ possessing the following property: for every sequential Martin-Löf test $V$ there exists a natural $c$ (depending upon $U$ and $V$) such that for all natural $m \geq 1$ we have

$$V_{m+c} \subset U_m.$$

Proof. Using the c.e. set $T$ constructed in Theorem 6.15 we define the set

$$U = \{(x, m) \in A^* \times \mathbf{N}_+ \mid (i, x, m + i) \in T, \text{ for some } i \geq 1\}.$$

Clearly, $U$ is c.e. If $(x, m) \in U$ and $x <_p y$, then $(x, m + i) \in T_i$, for some $i \geq 1$, so $(y, m + i) \in T_i$, i.e. $(y, m) \in U$. Next we fix the naturals $n, m \geq 1$. One has

$$\#\{x \in A^n \mid (i, x, m + i) \in T, \text{ for some } i \geq 1\} < \sum_{i=1}^{\infty} Q^{n-(m+i)}/(Q - 1) = Q^{n-m}(Q - 1)^{-2} \leq Q^{n-m}/(Q - 1).$$

Now let us assume that $V$ is a non-empty sequential Martin-Löf test. In view of Theorem 6.15, $V = T_c$, for some $c \geq 1$. Then

$$V_{m+c} = \{x \in A^* \mid (x, m + c) \in V\} = \{x \in A^* \mid (x, m + c) \in T_c\} = \{x \in A^* \mid (c, x, m + c) \in T\} \subset \{x \in A^* \mid (i, x, m + i) \in T, \text{ for some } i \geq 1\} = U_m. \qquad \Box$$

Definition 6.17. A sequential Martin-Löf test $U$ having the property in Theorem 6.16 is called a universal sequential Martin-Löf test.

The critical level $m_V$ induced by a sequential Martin-Löf test $V$ has the following extra properties:

1. $m_V(x) \leq m_V(y)$, whenever $x <_p y$.

2. $\lim_{n \to \infty} m_V(x(n))$ exists, possibly equal to $\infty$, for every sequence $x \in A^\omega$.

As in the case of Martin-Löf tests one can prove the following characterization of universal sequential Martin-Löf tests in terms of induced critical levels.

Theorem 6.18. A sequential Martin-Löf test $U$ is universal iff for every sequential Martin-Löf test $V$ there exists a natural $c$ (depending upon $U$ and $V$) such that $m_V(x) \leq m_U(x) + c$, for all $x \in A^*$.

As a step towards proving the independence of the definition of randomness for sequences with respect to the chosen universal sequential Martin-Löf test we prove the following result.

Lemma 6.19. Let $U, W$ be universal sequential Martin-Löf tests. Let $x \in A^\omega$. Then

$$\lim_{n \to \infty} m_U(x(n)) < \infty \iff \lim_{n \to \infty} m_W(x(n)) < \infty.$$

Proof. Assume that $\lim_{n \to \infty} m_U(x(n)) < \infty$. Since $U$ is universal we can find a constant $c$ such that $m_W(y) \leq m_U(y) + c$, for all $y \in A^*$, so $m_W(x(n)) \leq m_U(x(n)) + c$, for all $n \geq 1$. The converse implication follows by exchanging the roles of $U$ and $W$. $\Box$

Actually, we do not know whether sequences $x$ satisfying the inequality $\lim_{n \to \infty} m_U(x(n)) < \infty$, for some universal sequential Martin-Löf test, do exist! We now proceed to this existence proof.

Theorem 6.20. Let $V$ be a sequential Martin-Löf test. Then, for every natural $m \geq 1$, $V_m A^\omega \neq A^\omega$.

Proof. First proof: topological. For every $m \geq 1$, $V_m \subset V_1$, so it will be enough to prove that $\bigcap_{x \in V_1} (A^\omega \setminus xA^\omega) \neq \emptyset$. Consider the compact topological space $(A^\omega, \tau)$, where $A$ comes equipped with the discrete topology and $A^\omega$ is endowed with the product topology. In this space every set

$$D_x = A^\omega \setminus xA^\omega, \quad x \in A^*$$

is closed. The assertion of the theorem will be proven in case we show that the family $(D_x, x \in V_1)$ possesses the finite intersection property. To this end let $y_1, \ldots, y_t$ be in $V_1$ and let us show that

$$\bigcap_{i=1}^{t} D_{y_i}$$

is non-empty, or, equivalently,

$$\bigcup_{i=1}^{t} y_i A^\omega \neq A^\omega. \qquad (6.11)$$

Without loss of generality we may assume that the set $\{y_i \mid 1 \leq i \leq t\}$ is prefix-free, because from $x <_p y$ it follows that $yA^\omega \subset xA^\omega$. Suppose, for the sake of a contradiction, that

$$\bigcup_{i=1}^{t} y_i A^\omega = A^\omega, \qquad (6.12)$$

for some prefix-free set $\{y_i \mid 1 \leq i \leq t\} \subset V_1$. We use Theorem 6.14: we take $H = \bigcup_{i=1}^{t} H(y_i, m_V(y_i))$ and notice that $\hat{H} \subset \hat{V} = V$ ($V$ is a sequential Martin-Löf test). The prefix-free set $\{y_i \mid 1 \leq i \leq t\}$ satisfies the inequality

$$\sum_{i=1}^{t} Q^{-|y_i|} < Q^{-\min\{m_V(y_i) \mid 1 \leq i \leq t\}}/(Q - 1).$$

Furthermore, since every $y_i \in V_1$ it follows that $m_V(y_i) \geq 1$, so

$$\sum_{i=1}^{t} Q^{-|y_i|} < \frac{1}{Q(Q - 1)} < 1. \qquad (6.13)$$

Now we put $n_i = |y_i|$ and assume that $n_1 \leq n_2 \leq \cdots \leq n_t$. In view of (6.12), for every $z \in A^{n_t}$ the following inclusion holds:

$$zA^\omega \subset \bigcup_{i=1}^{t} y_i A^\omega.$$

For every $z \in A^{n_t}$ there exists a unique string $y_i$ (the set $\{y_j \mid 1 \leq j \leq t\}$ is prefix-free) such that $y_i <_p z$. There are

$$h = \sum_{i=1}^{t} Q^{n_t - n_i}$$

possibilities of finding such strings. We derive a contradiction showing that $h < Q^{n_t}$ (or, equivalently, $\sum_{i=1}^{t} Q^{-n_i} < 1$), because of (6.13).

Second proof: graph-theoretical. Recall that a subtree is a non-empty set $S \subset A^*$ such that for every $x \in S$ one has

$$C(x) = \{y \in A^* \mid y <_p x\} \subset S.$$

Every sequence $x \in A^\omega$ generates the set of all its prefixes

$$B(x) = \{x(n) \mid n \geq 1\} \cup \{\lambda\} \subset A^*,$$

which is an infinite subtree of $A^*$, linearly ordered by the prefix-order $<_p$. We make use of:

König's Lemma. For every infinite subtree $S \subset A^*$, there exists a sequence $x \in A^\omega$ such that $B(x) \subset S$.

We put

$$S = A^* \setminus V_1, \quad S_0 = \{\lambda\}, \quad S_n = A^n \cap S, \quad n \geq 1,$$

and we shall prove that $S$ is an infinite subtree. Indeed, for every natural $n \geq 1$ one has $\#(A^n \cap V_1) < Q^{n-1}/(Q - 1)$ which implies that

$$\#S_n > Q^n - Q^{n-1}/(Q - 1) > 0.$$

So, $S$ is infinite. Next we pick some element $x \in S$ and show that $C(x) \subset S$. Assuming the contrary, let $y \in C(x) \setminus S$ and put $n = |x| \geq 1$. Since $y \notin S$, it follows that $y \in V_1$, contradicting the fact that $x \notin V_1$ (because $y <_p x$ and $V$ is sequential).

$$V = \{(00, 1), (010, 1), (011, 1), (100, 1), (1010, 1), (1011, 1), (1100, 1), (1101, 1), (1110, 1), (1111, 1)\},$$

$$W = V \cup H(1111, 1).$$
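The finite set $V$ just displayed can be examined mechanically. The following Python sketch checks, for the binary alphabet, that the cylinders of the strings at level 1 cover $A^\omega$, that the cardinality inequality $\#(A^n \cap V_1) < Q^{n-1}/(Q - 1)$ holds, and that the set is not closed under extensions. All function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

V1 = ["00", "010", "011", "100", "1010", "1011",
      "1100", "1101", "1110", "1111"]


def covers_everything(words, alphabet="01"):
    """True if every sequence has a prefix in `words`, checked on all
    strings of the maximal length occurring in `words`."""
    top = max(map(len, words))
    return all(
        any("".join(w).startswith(v) for v in words)
        for w in product(alphabet, repeat=top)
    )


def cardinality_ok(words, m, q=2):
    """Check #(A^n & V_m) < Q^(n-m)/(Q-1) for every length n that occurs."""
    lengths = {len(w) for w in words}
    return all(
        sum(1 for w in words if len(w) == n) < Fraction(q ** n, q ** m * (q - 1))
        for n in lengths
    )


def extension_closed_up_to(words, top, alphabet="01"):
    """Check the sequential property on one-letter extensions of the
    words shorter than `top`."""
    have = set(words)
    return all(
        w + a in have
        for w in words if len(w) < top
        for a in alphabet
    )


print(covers_everything(V1), cardinality_ok(V1, 1), extension_closed_up_to(V1, 4))
# True True False
```

So the covering property is compatible with the cardinality inequality once the sequential property is dropped, which is why Theorem 6.20 is stated for sequential tests.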

The Martin-Löf tests $V$ and $W$ satisfy the equalities $V_1 A^\omega = A^\omega = W_1 A^\omega$.

Theorem 6.21. Let $x \in A^\omega$ and assume that $V$ is a sequential Martin-Löf test. Then

$$\lim_{n \to \infty} m_V(x(n)) = \infty \iff x \in \bigcap_{m=1}^{\infty} V_m A^\omega.$$

Hence, the set

$$\mathrm{rand}(V) = \{x \in A^\omega \mid \lim_{n \to \infty} m_V(x(n)) < \infty\}$$

is non-empty.

Proof. Let $x$ be in $A^\omega$. It is obvious that $\lim_{n \to \infty} m_V(x(n)) = \infty$ iff for every natural $m \geq 1$ there exists a natural $n_m \geq 1$ such that $m_V(x(n_m)) \geq m$, i.e. $x(n_m) \in V_m$. This means that $x \in V_m A^\omega$, for all $m \geq 1$.

Theorem 6.20 shows that $V_m A^\omega \neq A^\omega$, for all $m \geq 1$, which implies the relation

$$A^\omega \setminus \Big(\bigcap_{m=1}^{\infty} V_m A^\omega\Big) \neq \emptyset. \qquad \Box$$

Definition 6.22. Let $V$ be a sequential Martin-Löf test. The elements of the (non-empty) set $\mathrm{rand}(V)$ are called random sequences with respect to $V$.

Example 6.23. Take $x \in A^*$ and a natural $m \geq 1$ with $|x| > m$. One has $\mathrm{rand}(\hat{H}(x, m)) = A^\omega$.

Example 6.23 shows that $\mathrm{rand}(V)$ can be "too" large in case $V$ is rudimentary. In the case of a universal sequential Martin-Löf test, the situation is completely different.

Theorem 6.24. Let $U$ be a universal sequential Martin-Löf test. Then

$$\mathrm{rand}(U) = \bigcap_{V} \mathrm{rand}(V),$$

where $V$ runs over all sequential Martin-Löf tests.

Proof. Let $x \in \mathrm{rand}(U)$ (which is non-empty, by Theorem 6.21). Then $\lim_{n \to \infty} m_U(x(n)) < \infty$. According to Theorem 6.18, for every sequential Martin-Löf test $V$ there exists a natural $c$ (depending upon $U$ and $V$) such that $m_V(x(n)) \leq m_U(x(n)) + c$. It follows that $\lim_{n \to \infty} m_V(x(n)) < \infty$, i.e. $x \in \mathrm{rand}(V)$. $\Box$

Theorem 6.24 validates the following statistical definition of random sequences:

Definition 6.25 (Martin-Löf). A sequence $x \in A^\omega$ is called random in case $x$ is random with respect to every sequential Martin-Löf test, i.e. $x \in \mathrm{rand}(V)$, for every sequential Martin-Löf test $V$.

In view of Theorem 6.24, $\mathrm{rand}(U) = \mathrm{rand}(U')$, for all universal sequential Martin-Löf tests $U, U'$; so, we shall adopt the notation $\mathrm{rand} = \mathrm{rand}(U)$, where $U$ is a universal sequential Martin-Löf test.

6.3 Characterizations of Random Sequences

In this section we discuss various characterizations of random sequences. We shall mainly rely on Martin-Löf's constructive measure approach and on Chaitin complexity.

We begin with the measure-theoretical characterization developed by Martin-Löf. The main idea is to isolate the set of all sequences having "all verifiable" properties that from the point of view of classical probability theory are effectively satisfied with "probability 1" with respect to the unbiased discrete probability.

Recall that the unbiased discrete probability on $A$ is defined by the function

$$h : 2^A \to [0, 1], \quad h(X) = \frac{\#X}{Q},$$

for all subsets $X \subset A$ (here $2^A$ is the power set of $A$). Hence, $h(\{a_i\}) = Q^{-1}$, for every $1 \leq i \leq Q$. This uniform measure induces the product measure $\mu$ on $A^\omega$; it is plain that $\mu$ is a probabilistic measure defined on all Borel subsets of $A^\omega$ and has the property that

$$\mu(xA^\omega) = Q^{-|x|},$$

for all strings $x \in A^*$. This is the main example of computable probability in Martin-Löf's sense ([302]; for more details about the above construction see Section 1.4).

If $x = x_1 x_2 \ldots x_n \in A^*$ is a string of length $n$, then $\mu(xA^\omega) = Q^{-n}$ and the expression $\mu(\ldots)$ can be interpreted as "the probability that a sequence $y = y_1 y_2 \ldots y_n \ldots \in A^\omega$ has the first element $y_1 = x_1$, the second element $y_2 = x_2$, ..., the $n$th element $y_n = x_n$". Independence means that the probability of an event of the form $y_i = x_i$ does not depend upon the probability of the event $y_j = x_j$.
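The identity $\mu(xA^\omega) = Q^{-|x|}$ makes the measure of a finite union of cylinders directly computable. The Python sketch below is an illustration under the assumption that strings are given as Python `str` values over an alphabet of size `q`; the function name is hypothetical.

```python
from fractions import Fraction


def cylinder_union_measure(words, q=2):
    """Exact measure of the union of the cylinders xA^omega, x in `words`.
    Words having a proper prefix in the set are dropped first, so the
    remaining cylinders are pairwise disjoint and their measures add up."""
    unique = set(words)
    minimal = [
        x for x in unique
        if not any(x[:k] in unique for k in range(len(x)))
    ]
    return sum(Fraction(1, q ** len(x)) for x in minimal)


print(cylinder_union_measure(["0", "01", "10"]))  # 3/4
```

The reduction to a prefix-free subset is what allows the measures to be added without double counting.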

Every open set $G \subset A^\omega$ is $\mu$-measurable and

$$\mu(G) = \sum_{x \in X} Q^{-|x|},$$

where

$$G = XA^\omega = \bigcup_{x \in X} xA^\omega,$$

for some prefix-free subset $X \subset A^*$. Finally, $S \subset A^\omega$ is a null set in case for every real $\varepsilon > 0$ there exists an open set $G_\varepsilon$ which contains $S$ and $\mu(G_\varepsilon) < \varepsilon$. For instance, every enumerable subset of $A^\omega$ is a null set. An important result which can be easily proven is the following: the union of an enumerable sequence of null sets is still a null set.

A property $P$ of sequences $x \in A^\omega$ is true almost everywhere in the sense of $\mu$ in case the set of sequences not having the property $P$ is a null set. The main example of such a property was discovered by Borel and it is known as the Law of Large Numbers. Consider the binary alphabet $A = \{0, 1\}$ and for every sequence $x = x_1 x_2 \ldots x_m \ldots \in A^\omega$ and natural number $n \geq 1$ put

$$S_n(x) = x_1 + x_2 + \cdots + x_n.$$

Borel's Theorem can be phrased as follows: The limit of $S_n/n$, when $n \to \infty$, exists almost everywhere in the sense of $\mu$ and has the value $1/2$. In other words, the set of sequences not satisfying the relation $\lim_{n \to \infty} S_n(x)/n = \frac{1}{2}$,

$$\Big\{x \in A^\omega \ \Big|\ \lim_{n \to \infty} \frac{x_1 + x_2 + \cdots + x_n}{n} \neq \frac{1}{2}\Big\},$$

is a null set. (A stronger result will be proved in Theorem 6.27.)

The oscillations in the values of the ratio $S_n(x)/n$ can roughly be described by the following result:

There exists a null set $N \subset A^\omega$ such that for every $x \notin N$ and every natural $n \geq 1$ one can find two naturals $m, q$ (depending upon $x$ and $n$) such that

$$S_m(x) < (n - \sqrt{n})/2$$

and

$$S_q(x) > (n + \sqrt{n})/2.$$

The above properties are asymptotic, in the sense that the infinite behaviour of a sequence $x$ determines if $x$ does or does not have such a property. Kolmogorov has proven a result (known as the All or Nothing Law) stating that practically any conceivable property is true or false almost everywhere with respect to $\mu$. It is clear that a sequence satisfying a property false almost everywhere with respect to $\mu$ is very "particular". Accordingly, it is tempting to try to say that a sequence $x$ is "random" iff it satisfies every property true almost everywhere with respect to $\mu$. Unfortunately, we may define for every sequence $x$ the property $P_x$ as follows: $y$ satisfies $P_x$ iff for every $n \geq 1$ there exists a natural $m \geq n$ such that $x_m \neq y_m$.

Every $P_x$ is an asymptotic property which is true almost everywhere with respect to $\mu$ and $x$ does not have property $P_x$. Accordingly, no sequence can verify all properties true almost everywhere with respect to $\mu$. The above definition is vacuous!

The above analysis may suggest that there is no truly lawless sequence. Indeed, a "universal" non-trivial property shared by all sequences was discovered by van der Waerden (see for example [214]): In every binary sequence at least one of the two symbols must occur in arithmetical progressions of every length.

Looking at the proof of van der Waerden's result (and of a few similar ones) we notice that they are all non-constructive. To be more precise, there is no algorithm which will tell in a finite amount of time which alternative is true: 0 occurs in arithmetical progressions of every length or 1 occurs in arithmetical progressions of every length.

However, there is a way to overcome the above difficulty: We consider not all asymptotic properties true almost everywhere with respect to $\mu$, but only a sequence of such properties. So, the important question becomes: "What sequences of properties should be considered?" Clearly, the "larger" the chosen sequence of properties is, the "more random" will be the sequences satisfying that sequence of properties. In the context of our discussion a constructive selection criterion seems to be quite natural. Accordingly, we will impose the minimal computational restriction on objects, i.e. each set of strings will be c.e., and every convergent process will be regulated by a computable function. As a result, constructive variants of open and null sets will play a crucial role.

Consider the compact topological space $(A^\omega, \tau)$ used in the topological proof of Theorem 6.20. The basic open sets are exactly the sets $xA^\omega$, with $x \in A^*$. Accordingly, an open set $G \subset A^\omega$ is of the form $G = XA^\omega$, where $X \subset A^*$.

Definition 6.26. a) A constructively open set $G \subset A^\omega$ is an open set $G = XA^\omega$ for which $X \subset A^*$ is c.e.

b) A constructive sequence of constructively open sets, for short, c.s.c.o. sets, is a sequence $(G_m)_{m \geq 1}$ of constructively open sets $G_m = X_m A^\omega$ such that there exists a c.e. set $X \subset A^* \times \mathbf{N}$ with

$$X_m = \{x \in A^* \mid (x, m) \in X\},$$

for all natural $m \geq 1$.

c) A constructively null set $S \subset A^\omega$ is a set such that there exists a c.s.c.o. sets $(G_m)_{m \geq 1}$ for which

$$S \subset \bigcap_{m \geq 1} G_m$$

and

$$\lim_{m \to \infty} \mu(G_m) = 0, \text{ constructively},$$

i.e. there exists an increasing, unbounded, computable function $H : \mathbf{N} \to \mathbf{N}$ such that $\mu(G_m) < Q^{-k}/(Q - 1)$ whenever $m \geq H(k)$.

It is clear that $\mu(S) = 0$, for every constructive null set, but the converse is not true.

Our first example of a constructive null set is a strong form of the Law of Large Numbers: we will show that the set of binary sequences not satisfying the relation $\lim_{n \to \infty} S_n(x)/n = \frac{1}{2}$ is not only a null set but also a constructive null set.

Theorem 6.27 (Constructive Law of Large Numbers). Let $A = \{0, 1\}$. Then, the set

$$Y = \Big\{x \in A^\omega \ \Big|\ \lim_{n \to \infty} \frac{x_1 + x_2 + \cdots + x_n}{n} \neq \frac{1}{2}\Big\}$$

is a constructive null set.

Proof. We will use Chernoff's bound: for every positive integer $t$ there exists a rational $q_t \in (0, 1)$ such that for all $n$ we have

$$\mu\Big(\Big\{x \in A^\omega \ \Big|\ \Big|\frac{x_1 + x_2 + \cdots + x_n}{n} - \frac{1}{2}\Big| \geq \frac{1}{t}\Big\}\Big) \leq 2q_t^n.$$

Then

$$\mu\Big(\Big\{x \in A^\omega \ \Big|\ \Big|\frac{x_1 + x_2 + \cdots + x_n}{n} - \frac{1}{2}\Big| \geq \frac{1}{t}, \text{ for some } n \geq k\Big\}\Big) \leq \frac{2q_t^k}{1 - q_t}.$$

Given positive integers $m, t$ we can effectively find the smallest $k$ such that

$$\frac{2q_t^k}{1 - q_t} \leq \frac{1}{m2^t},$$

which we will denote by $k_{m,t}$. Hence

$$\mu\Big(\bigcup_{t=1}^{\infty} \Big\{x \in A^\omega \ \Big|\ \Big|\frac{x_1 + x_2 + \cdots + x_n}{n} - \frac{1}{2}\Big| \geq \frac{1}{t}, \text{ for some } n \geq k_{m,t}\Big\}\Big) \leq \sum_{t=1}^{\infty} \frac{1}{m2^t} \leq \frac{1}{m}.$$

The sets

$$X_m = \Big\{x = x_1 \ldots x_s \in A^* \ \Big|\ \Big|\frac{x_1 + x_2 + \cdots + x_s}{s} - \frac{1}{2}\Big| \geq \frac{1}{t},\ s \geq k_{m,t}, \text{ for some } t \geq 1\Big\}$$

form a c.s.c.o. sets, $Y \subset \bigcap_{m \geq 1} X_m A^\omega$ and $\lim_{m \to \infty} \mu(X_m A^\omega) = 0$, constructively. Consequently, $Y$ is a constructive null set. $\Box$
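The exponential bound used at the start of the proof can be compared with exact values for small $n$. The Python sketch below computes the measure of the deviation event exactly from binomial coefficients and compares it with Hoeffding's inequality, i.e. with the particular choice $q_t = e^{-2/t^2}$. This choice is an assumption made for illustration: it is not rational, whereas the proof needs a rational $q_t$, for which any rational number strictly between $e^{-2/t^2}$ and $1$ would serve.

```python
from fractions import Fraction
from math import comb, exp


def deviation_probability(n, t):
    """Exact mu-measure of {x : |S_n(x)/n - 1/2| >= 1/t} for fair bits."""
    bad = sum(
        comb(n, s) for s in range(n + 1)
        if abs(Fraction(s, n) - Fraction(1, 2)) >= Fraction(1, t)
    )
    return Fraction(bad, 2 ** n)


def hoeffding_bound(n, t):
    """2 * q_t**n with q_t = exp(-2 / t**2) (Hoeffding's inequality)."""
    return 2 * exp(-2 * n / t ** 2)


for n in (10, 50, 200):
    print(n, float(deviation_probability(n, 4)), hoeffding_bound(n, 4))
```

The exact probabilities decay geometrically in $n$, which is what makes the tail sum over $n \geq k$ in the proof a convergent geometric series.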

Lemma 6.28. For every sequential Martin-Löf test $V$ and for all naturals $m \geq 1$, $n \geq 1$,

$$\sum_{x \in A^n \cap V_m} Q^{-|x|} < Q^{-m}/(Q - 1).$$

Proof. We use the cardinality inequality in the definition of a sequential Martin-Löf test. $\Box$

Lemma 6.29. Let $V$ be a sequential Martin-Löf test. Then

$$\lim_{m \to \infty} \mu(V_m A^\omega) = 0, \text{ constructively}.$$

Proof. We take $V$ and define for every natural $m \geq 1$ the sets $W_m = V_m A^\omega$. It is seen that for each $m \geq 1$, $W_m = \bigcup_{n=2}^{\infty} X_n$, where

$$X_n = \bigcup_{x \in A^n \cap V_m} xA^\omega.$$

Furthermore, $X_n \subset X_{n+1}$ and

$$\mu(X_n) = \sum_{x \in A^n \cap V_m} \mu(xA^\omega) = \sum_{x \in A^n \cap V_m} Q^{-|x|} = Q^{-n}\#(A^n \cap V_m) < Q^{-m}/(Q - 1),$$

in view of Lemma 6.28 and of the fact that the sets $\{xA^\omega \mid x \in A^n \cap V_m\}$ are mutually disjoint. So,

$$\mu(W_m) = \lim_{n \to \infty} \mu(X_n) \leq Q^{-m}/(Q - 1).$$

Finally, we put $H(m) = m + 1$ and notice that if $m \geq H(k)$, then $\mu(W_m) < Q^{-k}/(Q - 1)$. $\Box$

Lemma 6.30. Let $V$ be a sequential Martin-Löf test. Then

$$\bigcap_{m \geq 1} V_m A^\omega$$

is a constructive null set.

Proof. Again we put $W_m = V_m A^\omega$. Since $V$ is c.e. it follows that the sequence $(W_m)_{m \geq 1}$ is a c.s.c.o. sets and the proof is finished by Lemma 6.29. $\Box$

Theorem 6.31 (Martin-Löf). The set $A^\omega \setminus \mathrm{rand}$ is a maximal constructive null set. More precisely, $A^\omega \setminus \mathrm{rand}$ equals the union of all constructive null sets.

Proof. We fix a universal sequential Martin-Löf test $U$. Since

$$A^\omega \setminus \mathrm{rand} = \bigcap_{m=1}^{\infty} U_m A^\omega$$

we may apply Lemma 6.30 to conclude that the family of non-random sequences forms a constructive null set. Next let $S \subset A^\omega$ be an arbitrary constructive null set. We shall prove that $S \subset A^\omega \setminus \mathrm{rand}$. To this end consider a c.s.c.o. sets $(G_m)_{m \geq 1}$ such that

$$S \subset \bigcap_{m \geq 1} G_m$$

and

$$\mu(G_t) < Q^{-m}/(Q - 1),$$

whenever $t \geq H(m)$. (Here $H : \mathbf{N} \to \mathbf{N}$ is a fixed increasing, unbounded, computable function.)

We write

$$G_m = X_m A^\omega = (X_m A^*)A^\omega,$$

for all $m \geq 1$, where $X_m \subset A^*$ is a c.e. set. We have to construct a sequential Martin-Löf test $V$ such that

$$\bigcap_{m=1}^{\infty} G_m = \bigcap_{m=1}^{\infty} V_m A^\omega. \qquad (6.14)$$

We put H(m)

Vm =

n

XiA*,

i=l

for every natural m 2:: 1. Clearly, the set V = {(x,m) E A* x N+ I x E Vm } is c.e., Vm+1 C Vm, and if x

6. Random Sequences

176 the naturals n, m

~

#(An

1. A simple computation shows that

n Vm ) < #(XH(m)A* nAn) Qn#(XH(m)A* n An)Q-n QnJL(((XH(m)A*) n An)Aw) < QnJL((XH(m)A*)AW) < Qn-m /(Q -1).

So, V is a sequential Martin-LM test. The equality (6.14) holds by virtue of the strict monotonicity of H. According to the universality of U one can find a natural c such that Vm+c cUm, for all m ~ 1. Then, 00

Sen

n 00

(VmAW)

c

m=l

n 00

(Vm+c AW )

c

m=l

(UmAW)

= A W\

rand.

m=l

o

As an easy consequence of Theorem 6.31 we deduce:

Corollary 6.32. Almost all sequences are random, and this fact is constructively valid.

Proof. It is enough to notice that $\mu(\mathrm{rand}) = 1$, constructively. □

The next theorem characterizes rand in terms of Chaitin's complexity. We need two technical results first.

Proposition 6.33. A sequence $x \in A^\omega$ is random iff for every c.e. set $\mathrm{Covering} \subset A^* \times \mathbf{N}_+$ such that

$$\mu(\mathrm{Covering}_jA^\omega) < \frac{Q^{-j}}{Q-1},$$

for all $j \ge 1$, there exists a natural $i$ such that $x \notin \mathrm{Covering}_iA^\omega$.

Proof. Assume $x \in \mathrm{rand}$ and pick an arbitrary Covering with the properties stated in the proposition. We shall prove that

$$\bigcap_{i=1}^{\infty}\mathrm{Covering}_iA^\omega \subset A^\omega \setminus \mathrm{rand}, \qquad (6.15)$$

which will imply that

$$x \notin \bigcap_{i=1}^{\infty}\mathrm{Covering}_iA^\omega.$$

To prove (6.15) we put

$$V_m = \bigcap_{i=1}^{m}\mathrm{Covering}_iA^*$$

and $V = \{(x,m) \mid x \in V_m\}$. We claim that $V$ is a sequential Martin-Löf test. Indeed, we have to check only the cardinality condition:

$$\begin{aligned}
\#(A^n \cap V_m) &= \#\Big(A^n \cap \bigcap_{i=1}^{m}(\mathrm{Covering}_iA^*)\Big)\\
&\le \#(A^n \cap (\mathrm{Covering}_mA^*))\\
&= Q^n\mu((A^n \cap (\mathrm{Covering}_mA^*))A^\omega)\\
&\le Q^n\mu((\mathrm{Covering}_mA^*)A^\omega)\\
&= Q^n\mu(\mathrm{Covering}_mA^\omega)\\
&< \frac{Q^{n-m}}{Q-1}.
\end{aligned}$$

Accordingly,

$$\bigcap_{m=1}^{\infty}\mathrm{Covering}_mA^\omega = \bigcap_{m=1}^{\infty}((\mathrm{Covering}_mA^*)A^\omega) = \bigcap_{m=1}^{\infty}(V_mA^\omega) \subset \bigcap_{m=1}^{\infty}(V_{m+c}A^\omega) \subset \bigcap_{m=1}^{\infty}(U_mA^\omega) = A^\omega \setminus \mathrm{rand}.$$

For the converse implication we shall prove that for every Covering satisfying the required properties, the set

$$S = \bigcap_{m=1}^{\infty}\mathrm{Covering}_mA^\omega$$

is a constructive null set. Indeed, we take $H$ to be the identity function and notice that

$$\mu(\mathrm{Covering}_mA^\omega) < \frac{Q^{-k}}{Q-1},$$

whenever $m \ge H(k) = k$. So, in view of Theorem 6.31, if $x \in S$, then $x \notin \mathrm{rand}$. □


Proposition 6.34. For every c.e. set $B \subset A^* \times \mathbf{N}_+$ we can effectively find a c.e. set $G \subset A^* \times \mathbf{N}_+$ such that each section $G_i$ is prefix-free and $B_iA^\omega = G_iA^\omega$, for all natural $i \ge 1$.

Proof. Let $g : \mathbf{N}_+ \to A^* \times \mathbf{N}_+$ be an injective p.c. function such that $\mathrm{range}(g) = B$ (if $B$ is finite and has $m$ elements, then $\mathrm{dom}(g) = \{1, 2, \ldots, m\}$; if $B$ is infinite, then $\mathrm{dom}(g) = \mathbf{N}_+$). We put $g(i) = (x_i, m_i)$, in case $i \in \mathrm{dom}(g)$.

We construct the injective p.c. function $f : \mathbf{N}_+^2 \to A^*$ as follows:

$$f(m_1, 1) = x_1,$$

and, if $g(k+1) \ne \infty$, then

$$f(m_{k+1},\ 1 + \#\{1 \le i \le k \mid m_i = m_{k+1}\}) = x_{k+1}.$$

Notice that if $f(m,j) \ne \infty$, then $f(m,k) \ne \infty$, for $1 \le k < j$, and

$$B_i = \{f(i,j) \mid j \ge 1,\ f(i,j) \ne \infty\}.$$

We are now in a position to describe a procedure for the (uniform) construction of the section $G_i$:

= 0.

1.

Put Gi

2. 3.

= O. If I (i, j) =

4.

Compute I(i,j)

5.

If X

6.

E A* I Xj

Put j

00,

then stop.

= Xj'

lxi,

7. 8.

Put Gi = Gi U {Xj}. Put j = j + 1 and go to step 3.

Clearly, $G_i$ is a prefix-free set. We show that $B_iA^\omega = G_iA^\omega$. If $x \in G_iA^\omega$, then $x(n) \in G_i$, for some natural $n \ge 1$. There are three possibilities: i) $x(n) \in B_i$, ii) z
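The idea behind Proposition 6.34 can be illustrated on finite sets. The sketch below is a deliberate simplification and not the book's procedure: the text builds $G_i$ on-line from an enumeration of a c.e. set, whereas here (an assumption made for illustration) the whole finite set is available at once, so it suffices to keep the strings that have no proper prefix in the set. The helper names are hypothetical.

```python
from itertools import product


def prefix_free_core(strings):
    """Keep exactly the strings that have no proper prefix in the set."""
    s = set(strings)
    return {x for x in s if not any(x[:k] in s for k in range(len(x)))}


def is_prefix_free(strings):
    """True if no string of the set is a proper prefix of another one."""
    s = list(strings)
    return not any(a != b and b.startswith(a) for a in s for b in s)


def covers(strings, w):
    """True if w (a sufficiently long string) lies in the cylinder set of `strings`."""
    return any(w.startswith(x) for x in strings)


if __name__ == "__main__":
    B = {"0", "01", "011", "10", "101", "11"}
    G = prefix_free_core(B)
    print(sorted(G), is_prefix_free(G))
    # Both sets cover the same strings of length 4, i.e. the same cylinders.
    print(all(covers(B, "".join(w)) == covers(G, "".join(w))
              for w in product("01", repeat=4)))
```

The comparison on strings of a fixed length is a finite stand-in for the equality $B_iA^\omega = G_iA^\omega$.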


Theorem 6.35 (Chaitin). A sequence $x \in A^\omega$ is random iff there exists a natural $c > 0$ such that $H(x(n)) \ge n - c$, for all natural $n \ge 1$.

Proof. Assume, by absurdity, that for every $m > 0$ there exists an $n_m$ such that $H(x(n_m)) < n_m - m$. Let $c > 0$ be a natural number such that

$$c + H(x) + \log_Q P(x) > 0,$$

for all strings $x \in A^*$; see Theorem 4.6. We define the set

$$\mathrm{Covering} = \{(s,t) \in A^* \times \mathbf{N}_+ \mid H(s) < |s| - t - c - 1\}.$$

It is plain that Covering is c.e. and

$$\begin{aligned}
\mu(\mathrm{Covering}_tA^\omega) &\le \sum_{s \in \mathrm{Covering}_t} Q^{-|s|}\\
&= \sum_{\{s \in A^* \mid H(s) < |s|-t-c-1\}} Q^{-|s|}\\
&< \sum_{\{s \in A^* \mid H(s) < |s|-t-c-1\}} Q^{-H(s)-t-c-1}\\
&\le Q^{-t-c-1}\sum_{s \in A^*} Q^{-H(s)}\\
&< Q^{-t-1}\sum_{s \in A^*} P(s)\\
&\le Q^{-t-1}\\
&< \frac{Q^{-t}}{Q-1}.
\end{aligned}$$

We now prove that $x \in \bigcap_{t\ge 1}\mathrm{Covering}_tA^\omega$. Indeed, given $t > 0$ we construct $m_t = n_{t+c+1}$ and use the hypothesis

$$H(x(m_t)) = H(x(n_{t+c+1})) < n_{t+c+1} - (t+c+1) = m_t - t - c - 1,$$

i.e. $x(m_t) \in \mathrm{Covering}_t$. By Proposition 6.33, $x \notin \mathrm{rand}$.

Conversely, assume that $x \notin \mathrm{rand}$, i.e. (by Proposition 6.33) there exists a c.e. set $\mathrm{Covering} \subset A^* \times \mathbf{N}_+$ such that for all natural $i \ge 1$,

$$\mu(\mathrm{Covering}_iA^\omega) < \frac{Q^{-i}}{Q-1} \quad\text{and}\quad x \in \mathrm{Covering}_iA^\omega.$$


Moreover, by Proposition 6.34, we may assume that $\mathrm{Covering}_i$ is prefix-free, for all $i \ge 1$. Notice that the series

$$\sum_{n=2}^{\infty}\frac{Q^{n-n^2}}{Q-1}$$

converges and has a sum less than 1. Next we compute

$$\begin{aligned}
\sum_{n=2}^{\infty}\ \sum_{s \in \mathrm{Covering}_{n^2}} Q^{-(|s|-n)} &= \sum_{n=2}^{\infty} Q^n \sum_{s \in \mathrm{Covering}_{n^2}} Q^{-|s|}\\
&= \sum_{n=2}^{\infty} Q^n\mu(\mathrm{Covering}_{n^2}A^\omega)\\
&< \sum_{n=2}^{\infty}\frac{Q^nQ^{-n^2}}{Q-1} \le 1.
\end{aligned}$$

By the Kraft-Chaitin Theorem we get a Chaitin computer $C$ satisfying the following requirement: for all $n \ge 2$ and $s \in \mathrm{Covering}_{n^2}$ there exists a string $u$ of length $|s| - n$ such that $C(u, \lambda) = s$, i.e. $H_C(s) \le |s| - n$. By the Invariance Theorem we get a constant $c$ such that

$$\text{for all } n \ge 2,\ s \in \mathrm{Covering}_{n^2}: \quad H(s) \le |s| - n + c. \qquad (6.16)$$

Next we prove that for all natural $i \ge 1$ there exist infinitely many $m$ such that $x(m) \in \mathrm{Covering}_{i^2}$. By hypothesis,

$$x \in \bigcap_{j=1}^{\infty}\mathrm{Covering}_jA^\omega,$$

so for every $i$ we can find a natural $m_{i^2}$ with $x(m_{i^2}) \in \mathrm{Covering}_{i^2}$. We have to prove that we can choose these numbers $m_{i^2}$ as large as we wish. Assume, for the sake of a contradiction, that $m_{i^2} \le N$, for all $i$ and some fixed $N$. This means the existence of a string $s$ of length less than $N$ such that $s \in \mathrm{Covering}_{i^2}$, for all $i \ge 1$. Accordingly, for every $i \ge 1$ one has

$$Q^{-N} < Q^{-|s|} = \mu(sA^\omega) \le \mu(\mathrm{Covering}_{i^2}A^\omega)$$

and

$$\mu(\mathrm{Covering}_{i^2}A^\omega) < \frac{Q^{-i^2}}{Q-1},$$

a contradiction. In conclusion, given $d > 0$ we pick $i > d + c$ and $m \ge 2$ in order to get $x(m) \in \mathrm{Covering}_{i^2}$; by (6.16),

$$H(x(m)) \le m - i + c < m - d.$$

□
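Theorem 6.35 measures randomness through the gap $n - H(x(n))$. Since $H$ is not computable, the following sketch only mimics the idea and should not be read as an implementation of the theorem: it replaces $H$ by the length of a `zlib` encoding, which is at best a crude, constant-shifted upper-bound proxy (an assumption made here for illustration). The pseudo-random sample comes from Python's `random` module and is of course computable, hence not random in the sense of this chapter; it merely looks incompressible to `zlib`.

```python
import random
import zlib


def deficiency_proxy(bits: str) -> int:
    """len(bits) minus the bit length of a zlib encoding of the packed bits.

    A large positive value means the prefix is highly compressible.
    This is a heuristic stand-in for n - H(x(n)), not the quantity itself.
    """
    if not bits:
        return 0
    packed = int(bits, 2).to_bytes((len(bits) + 7) // 8, "big")
    return len(bits) - 8 * len(zlib.compress(packed, 9))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only for reproducibility
    regular = "0" * 8000
    noisy = "".join(rng.choice("01") for _ in range(8000))
    print("regular prefix:", deficiency_proxy(regular))
    print("pseudo-random prefix:", deficiency_proxy(noisy))
```

The constant prefix shows a large positive gap, while the pseudo-random one does not; only the first observation says anything rigorous, namely that a compressible sequence cannot satisfy $H(x(n)) \ge n - c$ for all $n$.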

From Proposition 6.33 we immediately obtain:

Corollary 6.36. A sequence $x \in A^\omega$ is random iff for every c.e. set $\mathrm{Covering} \subset A^* \times \mathbf{N}_+$ such that each section $\mathrm{Covering}_i \subset A^*$ is prefix-free and

$$\mu(\mathrm{Covering}_iA^\omega) < \frac{Q^{-i}}{Q-1},$$

for all $i \ge 1$, there exists a natural $n$ such that $x \notin \mathrm{Covering}_nA^\omega$.

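The following measure-theoretical criterion is very powerful.

Theorem 6.37 (Solovay). A sequence $x \in A^\omega$ is random iff for every c.e. set $X \subset A^* \times \mathbf{N}_+$ such that

$$\sum_{i\ge 1}\mu(X_iA^\omega) < \infty,$$

there exists a natural $N$ such that for all $i > N$, $x \notin X_iA^\omega$.

Proof. Assume first that $x \notin \mathrm{rand}$. Then we can find a c.e. set $X \subset A^* \times \mathbf{N}_+$ such that every section $X_i \subset A^*$ is prefix-free and $\mu(X_iA^\omega) < Q^{-i}/(Q-1)$, for all $i \ge 1$, and $x \in \bigcap_{j\ge 1} X_jA^\omega$ (see Corollary 6.36). A routine computation shows that

$$\sum_{i\ge 1}\mu(X_iA^\omega) < \sum_{i\ge 1}\frac{Q^{-i}}{Q-1} = \frac{1}{(Q-1)^2} < \infty.$$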

Conversely, let $X \subset A^* \times \mathbf{N}_+$ be a c.e. set such that $\sum_{i\ge 1}\mu(X_iA^\omega) < \infty$, each $X_i \subset A^*$ is prefix-free and $x \in X_iA^\omega$, for infinitely many $i$. We construct the set


where

For every $n \ge 1$,

and the sequence $x$ is in $\bigcap_{n\ge 1} B_nA^\omega$, i.e. for every $n \ge 1$ there exists a natural $m \ge 1$ such that

We just take $m = \max\{i_1, i_2, \ldots, i_t\}$, where $t > Q^{n+c}$ and

$$x \in \bigcap_{j=1}^{t} X_{i_j}A^\omega.$$

□

A stronger complexity-theoretic characterization of rand is contained in the following theorem.

Theorem 6.38 (Chaitin). A sequence $x \in A^\omega$ is random iff

$$\lim_{n\to\infty}(H(x(n)) - n) = \infty.$$

Proof. We use Theorem 6.37 and assume that $x \in X_iA^\omega$, for infinitely many $i > 0$, where $X \subset A^* \times \mathbf{N}_+$ is a c.e. set having all sections prefix-free and $\sum_{i\ge 1}\mu(X_iA^\omega) < \infty$. There exists a natural $N > 0$ such that

$$\sum_{i\ge N}\sum_{u \in X_i} Q^{-|u|} = \sum_{i\ge N}\mu(X_iA^\omega) \le 1.$$

In view of the Kraft-Chaitin Theorem, applied to the set $\{(u, |u|) \mid u \in X_i,\ i \ge N\}$, there exists a Chaitin computer $C$ satisfying the following property: if $i \ge N$ and $u \in X_i$, then $H_C(u) = |u|$.

So, for $i \ge N$ and $u \in X_i$, $H(u) \le H_C(u) + c = |u| + c$, for some constant $c$; in particular, $H(x(n)) \le n + c$, for infinitely many $n$, which shows that $\lim_{n\to\infty}(H(x(n)) - n) \ne \infty$.

Conversely, assume that the relation $\lim_{n\to\infty}(H(x(n)) - n) = \infty$ does not hold true, i.e. there exists a natural $k > 0$ such that for every $N > 0$ we can find an $n \ge N$ such that $H(x(n)) < n + k$. In view of the Intermediate Step in the proof of Theorem 5.4, for every $n \ge 0$,

$$\#\{x \in A^n \mid H(x) < n + H(\mathrm{string}(n)) - t + O(1)\} < Q^{n-t+O(1)}.$$

In particular, there exists a constant $c > 0$ such that

$$\#\{x \in A^n \mid H(x) < n + k\} < Q^{n+k-H(\mathrm{string}(n))+c}. \qquad (6.17)$$

We put $B_n = \{z \in A^n \mid H(z) < n + k\} \subset A^n$ and $B = \{(z,n) \in A^* \times \mathbf{N}_+ \mid z \in B_n\}$. Every $B_n$ is prefix-free and, by (6.17),

$$\mu(B_nA^\omega) = \sum_{z \in B_n} Q^{-|z|} = Q^{-n}\,\#B_n < Q^{k-H(\mathrm{string}(n))+c}.$$

On one hand,

$$\sum_{n\ge 1}\mu(B_nA^\omega) \le \sum_{n\ge 1} Q^{k-H(\mathrm{string}(n))+c} \le Q^{k+c} < \infty,$$

and on the other hand for every natural $N > 0$ there exists an $n \ge N$ such that $H(x(n)) < n + k$, i.e. $x(n) \in B_n$ or $x \in B_nA^\omega$. So, $x \in B_nA^\omega$, for infinitely many $n$, which, again by Theorem 6.37, shows that $x$ is not random. □

We finish this section with a variant of the measure-theoretical characterization.

Theorem 6.39. A sequence $x \in A^\omega$ is random iff for each computable function $f : \mathbf{N} \to \mathbf{N}$ and every c.e. set $X \subset A^* \times \mathbf{N}_+$ such that

$$\sum_{i\ge f(n)}\mu(X_iA^\omega) < \frac{Q^{-n}}{Q-1},$$

for all $n \ge 1$, there exists a natural $N > 0$ such that $x \notin X_iA^\omega$, for all $i > N$.

Proof. Assume that $f, X$ satisfy the above requirements and $x \in X_nA^\omega$, for infinitely many $n$; we shall prove that $x$ is not random. To this end we construct the c.e. set

$$B = \{(y,n) \in A^* \times \mathbf{N}_+ \mid y \in X_i, \text{ for some } i \ge f(n)\}.$$


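By a simple computation we get

$$\mu(B_nA^\omega) = \mu\Big(\bigcup_{i\ge f(n)} X_iA^\omega\Big) \le \sum_{i\ge f(n)}\mu(X_iA^\omega) < \frac{Q^{-n}}{Q-1},$$

and $x \in B_nA^\omega$ for infinitely many $n$; so, $x$ is not random. Conversely, assume that $X \subset A^* \times \mathbf{N}_+$ is a c.e. set such that $\mu(X_nA^\omega) < Q^{-n}/(Q-1)$ and $x \in \bigcap_{n\ge 1} X_nA^\omega$. Clearly,

$$\sum_{i>N}\mu(X_iA^\omega) \le \frac{Q^{-N}}{Q-1}.$$

So, if we take $f(n) = \max(n-1, 0)$,

$$\sum_{i\ge f(n)}\mu(X_iA^\omega) < \frac{Q^{-n}}{Q-1}$$

and $x \in X_iA^\omega$, for all $i \ge 1$. □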

6.4 Properties of Random Sequences

In this section we shall study various quantitative and qualitative properties of random sequences. The results will provide evidence about the quality of this model of randomness. It is an intuitive fact (although not operational) that if we delete (or add) a million letters from (to) the beginning of a random sequence, the new sequence thus obtained is still random. Next we make this idea precise.

Let us start with some notation. For $u, v, y \in A^*$ and $x \in A^\omega$, if $x = yuz$, for some $z \in A^\omega$, then we write

$$x(y; u \to v) = yvz.$$

Two particular cases are interesting:

1. (Addition of a string) The case $y = u = \lambda$: $x = z$ and $x(y; u \to v) = vz = vx$.

2. (Deletion of a string) The case $y = v = \lambda$: $x = uz$ and $x(y; u \to v) = z$.

Theorem 6.40. Let $x = yuz$ be in $A^\omega$ ($y, u \in A^*$, $z \in A^\omega$). The following two assertions are equivalent:

a) The sequence $x$ is random.

b) For every $v \in A^*$, the sequence $x(y; u \to v)$ is random.

Proof. As

$$x = x(y; u \to u),$$

we have to prove only the direct implication. To this end we consider the c.e. set

$$V = \{(yub, m) \in A^* \times \mathbf{N}_+ \mid m \ge 1,\ b \in A^*,\ (yvb, m + |yv|) \in U\},$$

where $U \subset A^* \times \mathbf{N}_+$ is a universal sequential Martin-Löf test. It is easy to see that $V$ itself is a sequential Martin-Löf test, because

$$\#(A^n \cap V_m) \le \#(A^{n+|v|-|u|} \cap U_{m+|yv|}) < \frac{Q^{n-m-|yu|}}{Q-1},$$

for all natural $n > m \ge 1$, $n \ge |yu|$.

To finish our proof we shall show that $x$ is not in $\mathrm{rand}(V)$ (see Definition 6.25) whenever $x(y; u \to v) = yvz$ is not random. Let $k \ge 1$. Since the sequence $yvz$ is not random we can pick a natural number $M_k \ge |yv|$ such that for every $n \ge M_k$ one has

$$((yvz)(n),\ k + |yv|) \in U,$$

which means that

$$((yv)z(t),\ k + |yv|) \in U,$$

for all natural $t \ge M_k - |yv|$. Let $N_k = M_k + |yuv|$. One can check that for every natural $n \ge N_k$, $x(n) \in V_k$, i.e. $x \notin \mathrm{rand}(V)$. □

Remark. By replacing "random" by "non-random" in the statement of Theorem 6.40 we get a valid result.

Next we study the relation between randomness and computability for sequences. The main result will assert that the slightest possibility of computing an infinite part of a given sequence makes that sequence non-random.


Theorem 6.41. Let $x \in A^\omega$ be a sequence for which there exists a strictly increasing sequence of naturals $i(k)$, $k \ge 1$, such that the set $\{(i(k), x_{i(k)}) \mid k \ge 1\}$ is computable. Then $x$ is non-random.

Proof. We may assume that $i(1) = 1$, deleting - if necessary - from $x$ the first $i(1) - 1$ digits with the aid of Theorem 6.40. Let us define the increasing unbounded computable functions $h, s : \mathbf{N}_+ \to \mathbf{N}$ by

$$h(t) = i(t+1), \qquad s(n) = \#\{k \in \mathbf{N} \mid k \ge 1,\ i(k) \le n\}.$$

It is seen that $s(h(t)) = t + 1$, for all $t \ge 1$. We will construct a sequential Martin-Löf test $V$ such that $x \notin \mathrm{rand}(V)$. For $1 \le m \le n - 1$, we define

$$A^n \cap V_m = \begin{cases} B(h(m))A^{n-h(m)}, & \text{if } n > h(m),\\ \emptyset, & \text{otherwise,}\end{cases}$$

where

$$B(t) = \{y \in A^t \mid y = y_1y_2\cdots y_t,\ y_{i(1)} = x_{i(1)},\ y_{i(2)} = x_{i(2)},\ \ldots,\ y_{i(s(t))} = x_{i(s(t))}\}, \quad t \ge 1.$$

The definition works because $i(s(n)) \le n < i(s(n)+1)$. Clearly, $V$ is computable and $V_{m+1} \subset V_m$, $m \ge 1$ ($h$ is increasing). We finish the proof with the following simple computation:

$$\#(A^n \cap V_m) = Q^{h(m)-s(h(m))}Q^{n-h(m)} = Q^{n-s(h(m))} < \frac{Q^{n-m}}{Q-1},$$

because $s(h(m)) = m + 1$. Finally, for every $m \ge 1$ and $n > h(m)$ one has $x(n) \in V_m$, i.e. $m_V(x(n)) \ge m$. □

Corollary 6.42. Let $x = x_1x_2\cdots x_n\cdots$ be in $A^\omega$. Assume that there exists $1 \le i \le Q$ such that the set

$$X_i = \{t \ge 1 \mid x_t = a_i\}$$

includes an infinite c.e. set $M$. Then $x$ is non-random.

Proof. One can find an infinite computable set $T \subset M$ and one constructs the infinite computable set $\{(t, a_i) \mid t \in T\}$, where $T$ is enumerated in increasing order. By Theorem 6.41 the sequence $x$ is not random. □


Example 6.43 (von Mises). Start with an arbitrary sequence $x = x_1x_2\cdots x_n\cdots$ over the alphabet $A = \{0, 1\}$ and define a new sequence $y = y_1y_2\cdots y_n\cdots$ over the alphabet $\{0, 1, 2\}$, by

$$y_1 = x_1, \qquad y_n = x_{n-1} + x_n, \quad n \ge 2.$$

Then, $y$ is not random, even if $x$ is a random sequence.

Von Mises' [422] motivation is simple: the strings 02 and 20 never appear in $y$.³ We shall prove that in a random sequence every string appears infinitely often, so as a bonus we get a proof within the framework of our theory for von Mises' claim. We start with a new piece of notation. For every string $y \in A^+$ and each sequence $x = x_1x_2\cdots x_n\cdots$ we put

$$N(x, y) = \#\{n \in \mathbf{N}_+ \mid x_nx_{n+1}\cdots x_{n+|y|-1} = y\}.$$
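Von Mises' observation is easy to test on finite prefixes. The sketch below is an illustration only: sequences are represented as Python lists (a choice made here), `count_occurrences` is a finite analogue of $N(x, y)$ restricted to a prefix, and the input prefix is pseudo-random, which is merely a convenient source of test data.

```python
import random


def von_mises(x):
    """y_1 = x_1 and y_n = x_{n-1} + x_n for n >= 2, for a list x of 0/1 values."""
    return [x[0]] + [x[i - 1] + x[i] for i in range(1, len(x))]


def count_occurrences(x, y):
    """Finite analogue of N(x, y): number of start positions of y inside the list x."""
    m = len(y)
    return sum(1 for n in range(len(x) - m + 1) if x[n:n + m] == y)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed, only for reproducibility
    x = [rng.randint(0, 1) for _ in range(10_000)]
    y = von_mises(x)
    print(count_occurrences(y, [0, 2]), count_occurrences(y, [2, 0]))
    print(count_occurrences(y, [1, 1]))
```

The first two counts are always zero: $y_n = 0$ forces $x_n = 0$ while $y_{n+1} = 2$ forces $x_n = 1$, and symmetrically for 20.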

Definition 6.44. For $x \in A^\omega$ and $y \in A^*$ we say that:

i) The string $y$ does not occur in $x$ if $N(x, y) = 0$.

ii) The string $y$ occurs $m$ times in $x$ if $N(x, y) = m \ge 1$.

iii) The string $y$ occurs infinitely many times in $x$ if $N(x, y) = \infty$.

In cases ii) and iii) we say that $y$ occurs in $x$.

Remarks. Let $x$ be in $A^\omega$.

A) The following assertions are equivalent:

i) Every string $y$ occurs in $x$.

ii) Every string $y$ occurs infinitely many times in $x$.

B) The following assertions are equivalent:

i) There exists a string $y$ which does not occur in $x$.

ii) There exists an infinite set of strings $y$ which do not occur in $x$. (More precisely, every string in the set $A^*yA^*$ does not occur in $x$.)

We need some preliminary results.

Definition 6.45. A string $x$ is unbordered if for all strings $y, z$, $y \ne \lambda$, $x \ne yzy$.

³ Actually, there are many other strings which do not appear in $y$.


Remark. An equivalent form of the above property can be stated as follows: the string $x = x_1x_2\cdots x_m$ is unbordered iff for every natural $1 \le k \le m - 1$ one has

$$x_1x_2\cdots x_{m-k} \ne x_{k+1}x_{k+2}\cdots x_m.$$

Fact 6.46. Let $x = x_1x_2\cdots x_n$ be in $A^*$, $n \ge 1$. Then the string $v(x) = a_1^n x a_2^n$ is unbordered.

Proof. Let $v(x) = y_1y_2\cdots y_{3n}$. Consider $k \in \mathbf{N}$, $1 \le k \le 3n - 1$. We shall prove that at least one of the $3n - k$ equalities

$$y_q = y_{q+k}, \quad 1 \le q \le 3n - k, \qquad (6.18)$$

is false. There are some cases to be examined. We skip the trivial situations corresponding to $n = 1, 2$.

Case 1: $2n \le k \le 3n - 1$. The equality $y_1 = y_{k+1}$ is false.

Case 2: $n \le k \le 2n - 1$.

a) Assume first that $k \le (3n-1)/2$. Taking $q = 1$ in (6.18) we get $a_1 = x_{k-n+1}$; for $q = k + 1$ in (6.18) we get $y_{k+1} = y_{2k+1}$, i.e. $x_{k-n+1} = a_2$, and one of these equalities must be false.

b) If $k > (3n-1)/2$, then taking $q = n$ in (6.18) we get $a_1 = y_n = y_{n+k} = a_2$, a contradiction.

Case 3: $1 \le k \le n - 1$. We consider the equality $y_q = y_{q+k}$ from (6.18) as follows:

1. for $q = n + 1 - k$, giving $a_1 = x_1$,

2. for $q = n + 1$, giving $x_1 = x_{k+1}$,

3. for $q = n + k + 1$, giving $x_{k+1} = y_{n+1+2k}$ (in case of the validity of the previous two equalities).

There are two possibilities according to the relation between $k$ and $(n-1)/2$.

i) If $k > (n-1)/2$, then from $x_{k+1} = y_{n+1+2k}$ we deduce $x_{k+1} = a_2$ and one of these equalities is false.

ii) If $k \le (n-1)/2$, then we consider the natural $t$ satisfying the inequalities

$$k + 1 + tk \le n, \qquad k + 1 + (t+1)k > n.$$

Recalling the equalities already obtained,

$$a_1 = x_1 = x_{k+1} = x_{2k+1},$$

we take successively

$q = n + 1 + 2k$, to get $x_{2k+1} = x_{3k+1}$,

$q = n + 1 + 3k$, to get $x_{3k+1} = x_{4k+1}$,

...

$q = n + 1 + tk$, to get $x_{tk+1} = x_{(t+1)k+1}$,

$q = n + 1 + (t+1)k$, to get (assuming all previous equalities)

$$a_1 = x_{(t+1)k+1} = a_2.$$

The last equality is false. □
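Fact 6.46 lends itself to an exhaustive check on short strings. The sketch below is an illustration under assumptions made here: strings are Python `str` values, the two distinguished letters $a_1 \ne a_2$ are taken to be `"a"` and `"b"`, and `is_unbordered` tests the equivalent condition that no non-empty proper prefix is also a suffix.

```python
from itertools import product


def is_unbordered(w: str) -> bool:
    """True if no non-empty proper prefix of w is also a suffix of w."""
    return not any(w[:k] == w[-k:] for k in range(1, len(w)))


def v(x: str, a1: str = "a", a2: str = "b") -> str:
    """v(x) = a1^n x a2^n with n = len(x); a1 and a2 must be distinct letters."""
    n = len(x)
    return a1 * n + x + a2 * n


if __name__ == "__main__":
    # Check every non-empty string of length at most 6 over a three-letter alphabet.
    ok = all(is_unbordered(v("".join(w)))
             for n in range(1, 7)
             for w in product("abc", repeat=n))
    print(ok)
    print(is_unbordered("abab"), is_unbordered("aab"))
```

A string has a border of some length exactly when it has one of length at most half its own length, which is why the prefix/suffix test agrees with the form $x \ne yzy$ of Definition 6.45.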

We are now going to set a piece of new notation. Let $n \in \mathbf{N}_+$ and $c \in A^+$. We define the set

$$M(n, c) = \{x \in A^n \mid c <_i x\}.$$

Of course, $M(0, c) = M(1, c) = \cdots = M(|c|-1, c) = \emptyset$. Finally we put

$$R(n, c) = \#(A^n \setminus M(n, c)) = Q^n - \#M(n, c).$$

Fact 6.47. Let $c = c_1c_2\cdots c_L$ be an unbordered string with length $L \ge 2$. Then $R(n, c)$ satisfies the following equations:

$$R(n, c) = Q^n, \quad 0 \le n < L,$$

$$R(n+1, c) = Q\,R(n, c) - R(n+1-L, c), \quad n \ge L. \qquad (6.19)$$

Proof. The first group of equalities being obvious we pass to the second group. It is readily seen that $M(n+1, c) = S(n+1, c) \cup T(n+1, c)$, where

$$T(n+1, c) = \{x \in A^{n+1} \mid x = x_1x_2\cdots x_nx_{n+1},\ c <_i x_1x_2\cdots x_n\},$$

$$S(n+1, c) = \{x \in A^{n+1} \mid x = x_1x_2\cdots x_{n+1},\ c <_i x \text{ and } c \text{ is not an infix of } x_1x_2\cdots x_n\}.$$

The sets $T(n+1, c)$ and $S(n+1, c)$ are disjoint and

$$\#M(n+1, c) = \#T(n+1, c) + \#S(n+1, c) = Q\,\#M(n, c) + \#S(n+1, c).$$

We have the relation

$$Q^{n+1} - R(n+1, c) = Q(Q^n - R(n, c)) + \#S(n+1, c). \qquad (6.20)$$

Obviously,

$$S(n+1, c) \subset \{x \in A^{n+1} \mid x = x_1x_2\cdots x_{n+1-L}c_1c_2\cdots c_L, \text{ and } c \text{ is not an infix of } x_1x_2\cdots x_{n+1-L}\}. \qquad (6.21)$$

Due to the special property of $c$, we shall prove that the above inclusion is in fact an equality. Indeed, we pick

$$x = x_1x_2\cdots x_{n+1-L}c_1c_2\cdots c_L$$

in the set on the right-hand side of (6.21) and prove that $c$ is not an infix of $x_1x_2\cdots x_{n+1-L}c_1c_2\cdots c_{L-1}$. In any case $c$ is not an infix of $x_1x_2\cdots x_{n+1-L}$. It remains to show that for every natural $q$ such that $2L - n - 1 \le q \le L - 1$ one has

$$c_1c_2\cdots c_q \ne c_{L-q+1}c_{L-q+2}\cdots c_L.$$

This is true because for every such $q$ one can take $k = L - q$ in Definition 6.45 applied to the unbordered string $c$. From the proved equality we deduce that $\#S(n+1, c) = R(n+1-L, c)$; from (6.20) we get (6.19). □
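The recurrence (6.19) can be confirmed by brute force for small parameters. The sketch below is an illustration only: `R` counts the strings of length $n$ avoiding $c$ by direct enumeration, and the example strings and alphabet are choices made here, not taken from the text.

```python
from itertools import product


def R(n: int, c: str, alphabet: str) -> int:
    """Number of strings of length n over `alphabet` that do not contain c as an infix."""
    return sum(1 for w in product(alphabet, repeat=n) if c not in "".join(w))


def recurrence_holds(c: str, alphabet: str, n_max: int) -> bool:
    """Check R(n+1) = Q*R(n) - R(n+1-L) for all L <= n < n_max, with L = len(c)."""
    Q, L = len(alphabet), len(c)
    return all(
        R(n + 1, c, alphabet) == Q * R(n, c, alphabet) - R(n + 1 - L, c, alphabet)
        for n in range(L, n_max)
    )


if __name__ == "__main__":
    print(recurrence_holds("aab", "ab", 10))   # "aab" is unbordered
    print(recurrence_holds("aba", "ab", 8))    # "aba" has the border "a"
```

For the bordered string `"aba"` the check fails (for instance $R(5) = 21$ while the recurrence would give 20), which shows that the unbordered hypothesis in Fact 6.47 is really used.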

Comment. Formula (6.19) shows that in case of an unbordered string $c$, $\#M(n, c)$ depends only upon the length of the string $c$ and $n$.

To go on we need the following algebraic result.

Fact 6.48. Consider the algebraic equation

$$z^n - qz^{n-1} + 1 = 0,$$

where $n \ge 2$ is natural and $q \ge 2$ is real.

a) The equation has multiple roots iff $n = q = 2$.

b) For every root $z$ of the equation one has $|z| < q$.

Proof. We know that the equation has multiple roots iff $f$ and its derivative $f'$ have common roots, where $f : \mathbf{C} \to \mathbf{C}$ is given by $f(z) = z^n - qz^{n-1} + 1$. One has $f'(z) = z^{n-2}(nz - q(n-1))$. For $t = q(n-1)/n$, $f(t) = 1 - (q/n)^n(n-1)^{n-1}$.

Intermediate Step. One has $f(t) \le 0$, with equality iff $n = q = 2$. Indeed, it is seen that $f(t) \le 0$ iff $n^n \le q^n(n-1)^{n-1}$. In view of the inequality

$$2^n(n-1)^{n-1} \le q^n(n-1)^{n-1},$$

it is sufficient to show that $n^n \le 2^n(n-1)^{n-1}$. For $n = 2$ this is true with equality, i.e. $f(t) = 0$, for $n = q = 2$. For $n > 2$ one has $n^n < 2^n(n-1)^{n-1}$, because the last inequality may be equivalently rewritten as $n^n(n-1)^{-n} < 2^n/(n-1)$, and the sequence $n^n(n-1)^{-n}$ decreases to $e$. Its maximum value is 4 (for $n = 2$) and $4 \le 2^n/(n-1)$, for $n \ge 3$.

Consider the function $F : \mathbf{R} \to \mathbf{R}$ given by $F(x) = x^n - qx^{n-1} + 1$. Assume $n \ge 3$ (the case $n = 2$ is obvious). The derivative $F'$ has the roots $0, t$. In case $n$ is odd, the sign of the derivative gives three solutions for the equation $F(x) = 0$, lying in the intervals $(-q, 0)$, $(0, t)$ and $(t, q)$. In case $n$ is even, one has two solutions in the intervals $(0, t)$, $(t, q)$. Therefore, the assertion is true for real roots (we have constantly used the Intermediate Step).

We now prove that every non-real root $z$ has $|z| < 2$. To this end we take a non-real root $z = \rho(\cos\theta + i\sin\theta)$, where $\rho = |z|$. Writing that $z$ satisfies the equation we get

$$\rho = q\,\frac{\sin(n-1)\theta}{\sin n\theta}$$

(we have $\sin n\theta \ne 0$ since otherwise $\sin(n-1)\theta = 0$, i.e. $\sin\theta = 0$).

Using this value for $\rho$ and again writing that $z$ verifies the equation we get $\rho^n = \sin(n-1)\theta/\sin\theta$. But, $|\sin k\theta| \le k|\sin\theta|$, for every natural $k$. So, assuming $\rho \ge 2$ we get the false inequality $2^n \le n - 1$. Thus, $\rho < 2 \le q$. □

Remark. One can show that for $n > 2$, the non-real roots have modulus strictly less than 1.

Fact 6.49. For every unbordered string $c \in A^*$ of length $\ge 3$, there exists a natural $M$ such that for every $m \ge M$

(6.22)

Proof. We put $L = |c| \ge 3$. In view of Fact 6.47 there exist $L$ complex numbers $a_1, \ldots, a_L$ such that for every natural $m \ge 1$

$$R(m, c) = \sum_{i=1}^{L} a_i\lambda_i^m,$$

where $\lambda_1, \ldots, \lambda_L$ are the (simple) roots of the equation $x^L - Qx^{L-1} + 1 = 0$. For every $1 \le i \le L$, $|\lambda_i/Q| < 1$, so

Accordingly, from Fact 6.48, we can find a natural $M > 0$ such that

for all $m \ge M$, $1 \le i \le L$, which implies (6.22). □

Theorem 6.50 (Calude-Chitescu). Every non-empty string occurs infinitely many times in every random sequence.

Proof. We proceed by contradiction. Let $x$ be a sequence having the property that some string $y$ does not occur infinitely many times in $x$. We shall prove that $x \notin \mathrm{rand}$. Deleting, if necessary, an initial string from $x$ (using Theorem 6.40) we may assume that $y$ does not occur in $x$. In view of Fact 6.46 there exists an unbordered string $c$ of length $L = |c| \ge 3$ with $y <_i c$.

Next we define the set $V \subset A^* \times \mathbf{N}_+$, which depends upon $c$. For every $n \ge 1$ we put

$$S(n) = \{x \in A^n \mid c \not<_i x\},$$

and notice that $\#(S(n)) = R(n, c)$. So, it will be enough to define the sequence of sets $A^n \cap V_m$, for all $n, m \ge 1$:

$$A^n \cap V_m = S(m^2)A^{n-m^2},$$

if $m^2 < n$ and $m > M$, and

$$A^n \cap V_m = S(M^2)A^{n-M^2},$$

in case $M^2 < n$ and $1 \le m \le M$; finally, $A^n \cap V_m = \emptyset$ in the remaining cases.

The set $V$ is a computable sequential Martin-Löf test. The inclusion $V_{m+1} \subset V_m$, valid for every $m \ge 1$, is proved separately for the following three cases: a1) $(m+1)^2 < n$, $m > M$; a2) $(m+1)^2 < n$, $m = M$; b) $M^2 < n$, $m + 1 \le M$. A simple computation shows

$$\#(A^n \cap V_m) = \begin{cases} R(m^2, c)\,Q^{n-m^2}, & \text{if } m^2 < n,\ m > M,\\ R(M^2, c)\,Q^{n-M^2}, & \text{if } M^2 < n,\ 1 \le m \le M,\\ 0, & \text{otherwise.}\end{cases}$$

The inequality follows - in the first two non-trivial cases - by (6.23). For every natural $n > (M+1)^2$ one has $m_V(x(n)) \ge \lfloor\sqrt{n}\rfloor - 1$, since for $M < m < \sqrt{n}$, $x(n) \in V_m$. So,

$$\lim_{n\to\infty} m_V(x(n)) = \infty,$$

which shows that $x \notin \mathrm{rand}(V)$, i.e. $x \notin \mathrm{rand}$. □

Every random sequence $x$ generates an immune set as follows: at least one of the letters $a_i$, $1 \le i \le Q$, appears in $x$ infinitely many times, i.e. the set $X_i$ is infinite. By Corollary 6.42, at least one of the sets $X_i$, $i = 1, 2, \ldots, Q$, is immune. Using Theorem 6.50 we may get a slightly stronger result:

Corollary 6.51. If $x$ is random, then each set $X_i = \{t \ge 1 \mid x_t = a_i\}$ is immune.

Also, as a by-product of Theorem 6.50 we get

Corollary 6.52. If $x \in A^\omega$ and $x_i \ne a$, for some fixed letter $a \in A$, and all $i \ge 1$, then $x$ is not random.

So, each sequence $x$ over $A$ is non-random as a sequence over a larger alphabet $B \ne A$, $A \subset B$.


The result in Theorem 6.50 can be studied from a quantitative point of view. We arrive, in a natural way, at the Borel normality of the random sequences. Borel was working with the interval $[0, 1]$ endowed with Lebesgue measure and a criterion, equivalent to that presented in Definition 6.53; his main result states that almost all real numbers in $[0, 1]$ are normal (see [40, 41]). We shall use the same counting notation as in the study of Borel normality for random strings, i.e. we employ the functions $N_i^m$, $1 \le i \le Q^m$, $m \ge 1$. So, for $x \in A^\omega$ and $n \ge 1$, $x(n) = x_1x_2\cdots x_n \in A^*$, so $N_i(x(n))$ counts the number of occurrences of the letter $a_i$ in the prefix of length $n$ of $x$.

Definition 6.53. a) The sequence $x$ is called Borel $m$-normal ($m \ge 1$) in case for every $1 \le i \le Q^m$ one has

$$\lim_{n\to\infty}\frac{N_i^m(x(n))}{\lfloor n/m \rfloor} = Q^{-m}.$$

b) The sequence $x$ is called Borel normal if it is Borel $m$-normal, for every natural $m \ge 1$.

Remark. In case $m = 1$, the property of Borel 1-normality can be written in the form

$$\lim_{n\to\infty}\frac{N_i(x(n))}{n} = Q^{-1},$$

for every $1 \le i \le Q$. It corresponds to the Law of Large Numbers (see [192, 35, 421, 422]).
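The ratio in Definition 6.53 can be computed for finite prefixes. The sketch below is an illustration under assumptions made here: $N_i^m$ is read as counting the non-overlapping blocks $x_1\cdots x_m$, $x_{m+1}\cdots x_{2m}$, ... (which matches the division by $\lfloor n/m \rfloor$), and the sample sequence is the concatenation of all binary strings in length-lexicographic order, chosen only because its letter counts are easy to predict.

```python
from collections import Counter
from itertools import product


def block_frequencies(prefix: str, m: int) -> dict:
    """Relative frequency of each length-m block among the floor(n/m)
    consecutive, non-overlapping blocks of the prefix."""
    blocks = len(prefix) // m
    if blocks == 0:
        return {}
    counts = Counter(prefix[j * m:(j + 1) * m] for j in range(blocks))
    return {b: c / blocks for b, c in counts.items()}


def all_strings_up_to(k: int, alphabet: str = "01") -> str:
    """Concatenation of all strings of length 1..k in length-lexicographic order."""
    return "".join("".join(w)
                   for n in range(1, k + 1)
                   for w in product(alphabet, repeat=n))


if __name__ == "__main__":
    prefix = all_strings_up_to(10)
    print(block_frequencies(prefix, 1))
    print(block_frequencies(prefix, 2))
```

For $m = 1$ both letters have frequency exactly $1/2$ at these particular prefix lengths, because every family of strings of a fixed length contains equally many 0s and 1s; for larger $m$ the frequencies only approach $Q^{-m}$.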

We start with some preliminary results. Let $Q \ge 2$ and $(x_n^i)_{n\ge 1}$, $1 \le i \le Q$, be $Q$ sequences such that $x_n^i \ge 0$ and $\sum_{i=1}^{Q} x_n^i = 1$, for all $n \ge 1$.

Lemma 6.54. The following assertions are equivalent:

i) For all $1 \le i \le Q$, $\liminf_n x_n^i = Q^{-1}$.

ii) For all $1 \le i \le Q$, $\liminf_n x_n^i \ge Q^{-1}$.

Proof. Suppose, by absurdity, that $\liminf_n x_n^i > Q^{-1}$, for some $1 \le i \le Q$. One has

$$1 = \liminf_n (x_n^1 + x_n^2 + \cdots + x_n^Q) \ge \sum_{j=1}^{Q}\liminf_n x_n^j > 1,$$

a contradiction. □

Lemma 6.55. If for every $1 \le i \le Q$,

$$\liminf_n x_n^i = Q^{-1},$$

then for all $1 \le i \le Q$,

$$\lim_n x_n^i = Q^{-1}.$$

Proof. Assume, by absurdity, that $\liminf_n x_n^i \ne \limsup_n x_n^i$, for some $1 \le i \le Q$, i.e. there exists a $\delta > 0$ such that $\limsup_n x_n^i = Q^{-1} + \delta$. Since $\liminf_n(-x_n^i) = -\limsup_n x_n^i$, it follows that

$$\liminf_n(1 - x_n^i) = 1 + \liminf_n(-x_n^i) = 1 - \limsup_n x_n^i = \frac{Q-1}{Q} - \delta.$$

On the other hand,

$$\liminf_n(1 - x_n^i) = \liminf_n\sum_{j=1,\,j\ne i}^{Q} x_n^j \ge \sum_{j=1,\,j\ne i}^{Q}\liminf_n x_n^j = \frac{Q-1}{Q} > \frac{Q-1}{Q} - \delta,$$

a contradiction. □

First we deal with the case $m = 1$. For every sequence $x \in A^\omega$ we consider the sequences

$$\left(\frac{N_i(x(n))}{n}\right)_{n\ge 1}, \quad i = 1, \ldots, Q,$$

which satisfy the conditions in Lemma 6.54 and Lemma 6.55. So, in order to prove that

$$\lim_{n\to\infty}\frac{N_i(x(n))}{n} = Q^{-1},$$

whenever $x$ is random, it suffices to show that

$$\liminf_n\frac{N_i(x(n))}{n} \ge Q^{-1},$$

for every $1 \le i \le Q$. Assume, by absurdity, that there exists an $i$, $1 \le i \le Q$, such that

$$\liminf_n\frac{N_i(x(n))}{n} < Q^{-1}.$$

Elementary reasoning shows that the set

$$\left\{n \ge 1 \;\Big|\; \frac{1}{Q} - \frac{N_i(x(n))}{n} > \varepsilon\right\}$$

is infinite, for some rational, small enough $\varepsilon > 0$. Consider now the computable set $S \subset A^* \times \mathbf{N}_+$:

$$S = \left\{(y, n) \in A^* \times \mathbf{N}_+ \;\Big|\; y \in A^n,\ \frac{1}{Q} - \frac{N_i(y)}{n} > \varepsilon\right\}. \qquad (6.24)$$

Clearly, $x \in S_nA^\omega$, for infinitely many $n$ (here $S_n = \{y \in A^* \mid (y, n) \in S\}$). Using Theorem 6.37 now, it is clear that all that remains to show reduces to the convergence of the series

$$\sum_{n\ge 1}\mu(S_nA^\omega),$$

when $S$ comes from (6.24). A combinatorial argument ($S_n \subset A^n$) shows that

when S comes from (6.24). A combinatorial argument (Sn C An ) shows that p'(SnAW)

=

(~) (Q _1)n-k

I:

Q-n.

{kEN I O::;k
. (~) (1 -

I:

Q-1)n-kQ-k.

{kEN I O::;k
The following result is folklore (see [255]): Lemma 6.56. Let a,j3,f E (O,I),a and s = n(a + f). Then

I: {kEN I O::;k
+ 13

(n) a k

= 1,0

n - k 13 k

<

f

<e

< min(a,j3),n:2: 1

_e2n 2(3.

(6.25)

We now take $\alpha = \frac{Q-1}{Q}$, $\beta = \frac{1}{Q}$. From (6.25) we can write the inequality

which enables us to conclude the convergence of the series $\sum_{n\ge 1}\mu(S_nA^\omega)$.

We have proven:

Theorem 6.57. Every random sequence is Borel 1-normal.

Passing to the question of Borel $m$-normality for $m > 1$ we write a sequence $x \in A^\omega$ as

$$x = (x_1x_2\cdots x_m)(x_{m+1}x_{m+2}\cdots x_{2m})\cdots,$$

i.e. as a sequence over the alphabet $Y = A^m$. The following preparatory result is interesting in itself.

Theorem 6.58. We fix the natural $m > 1$ and the alphabets $A$ and $Y = A^m$. A sequence $x$ is random over $A$ iff $x$ is random over $Y$.

Proof. First proof: measure-theoretical. We make use of Theorem 6.37. We denote by $\mu_A$, $\mu_Y$, respectively, the measures on spaces $A^\omega$, $Y^\omega$. First assume that $x \in A^\omega$ is random over $Y$. Let $S \subset A^* \times \mathbf{N}_+$ be c.e. such that each of its sections $S_n$ is prefix-free and $\sum_{n\ge 1}\mu_A(S_nA^\omega) < \infty$. For every natural $n \ge 1$ we define the set

$$T_n = \{z \in S_n \mid m \mid |z|\} \cup \{zy \mid z \in S_n,\ m \nmid |z|,\ y \in A^*,\ |y| = m - |z| + m\lfloor |z|/m \rfloor\} \subset A^*,$$

and we notice that $T_n$ may be regarded as a subset of $Y^*$. (Recall that $n \nmid m$ means $n$ does not divide $m$.) Furthermore, $T_n$ is prefix-free ($S_n$ is prefix-free) and setting $R_z = \{y \in A^* \mid |y| = m - |z| + m\lfloor |z|/m \rfloor\}$, $C_z = \{n \in \mathbf{N} \mid n \nmid |z|\}$, one gets

$$\begin{aligned}
\mu_Y(T_nY^\omega) &= \sum_{w \in S_n,\ m \mid |w|}(Q^m)^{-|w|/m} + \sum_{w \in T_n\setminus S_n}(Q^m)^{-|w|/m}\\
&= \sum_{w \in S_n,\ m \mid |w|}(Q^m)^{-|w|/m} + \sum_{z \in S_n,\ m \in C_z}\ \sum_{y \in R_z}(Q^m)^{-|zy|/m}\\
&= \sum_{w \in S_n,\ m \mid |w|} Q^{-|w|} + \sum_{z \in S_n,\ m \in C_z}\ \sum_{y \in R_z}(Q^m)^{-(1+\lfloor |z|/m \rfloor)}\\
&= \sum_{w \in S_n,\ m \mid |w|} Q^{-|w|} + \sum_{z \in S_n,\ m \in C_z} Q^{-m(1+\lfloor |z|/m \rfloor)}\,\#R_z\\
&= \sum_{w \in S_n,\ m \mid |w|} Q^{-|w|} + \sum_{w \in S_n,\ m \nmid |w|} Q^{-|w|}\\
&= \mu_A(S_nA^\omega).
\end{aligned}$$

Since $x$ is random over $Y$ there exists a natural $N > 0$ such that $x \notin T_nY^\omega$, for all $n \ge N$. We prove that $x \notin S_nA^\omega$, for every $n \ge N$. Indeed, if for some $n \ge N$, $x(t) \in S_n$, then there are two possibilities: a) $m$ divides $t$, in which case $x(t) \in T_n$; b) $m$ does not divide $t$, in which case $x(t + t') \in T_n$, where $t' = m - t + m\lfloor t/m \rfloor$. In both cases we get a contradiction.

Conversely, assume that $x$ is random over $A$. Let $T \subset Y^* \times \mathbf{N}_+$ be a c.e. set such that every section $T_n$ is prefix-free and $\sum_{n\ge 1}\mu_Y(T_nY^\omega) < \infty$. Let $S = T \subset A^* \times \mathbf{N}_+$. Clearly, $S_n$ is prefix-free and $\mu_A(S_nA^\omega) = \mu_Y(T_nY^\omega)$, so $\sum_{n\ge 1}\mu_A(S_nA^\omega) < \infty$. Since $x$ is random over $A$, $x \notin S_nA^\omega$, for almost all $n \ge 1$, so $x \notin T_nY^\omega$, for almost all $n \ge 1$, i.e. $x$ is random over $Y$. □

We will offer a second proof of Theorem 6.58. Before proceeding to the proof itself we need the following two preliminary results which establish a natural connection between sequential Martin-Löf tests on the alphabets $A$ and $Y = A^m$ with $m \ge 1$. To avoid confusion, for $w \in A^*$ whose length is divisible by $m$ and $x \in A^\omega$, we write $w_A, x_A$, in case $w, x$ are used in their capacity as elements in $A^*$, respectively, $A^\omega$, and $w_Y, x_Y$, in case $w, x$ are used as elements in $Y^*$, respectively, $Y^\omega$. The same convention will concern the length: we write $|w|_A$, $|x(n)|_A$, respectively, $|w|_Y$, $|x(n)|_Y$.

Lemma 6.59. Let $m \ge 1$, $Y = A^m$, and let $W$ be a sequential Martin-Löf test over $Y$. Then the set $V \subset A^* \times \mathbf{N}$ defined by $V_i = W_iA^*$, for $i \in \mathbf{N}_+$, is a sequential Martin-Löf test over $A$ such that for $k \in \mathbf{N}$ and $w \in (A^m)^k$ one has

$$m_V(w_A) = m_W(w_Y).$$

Proof. Clearly, $V$ is a c.e. subset of $A^* \times \mathbf{N}$. If $y \in V_{i+1}$, then there is an $x \in W_{i+1}$ such that $x$ is a prefix of $y$; as $W_{i+1} \subset W_i$, it follows that $y \in V_i$.

Let $n \in \mathbf{N}$ and consider $A^n \cap V_i$. There are integers $k$ and $r$ such that $n = km + r$ and $0 \le r < m$. It follows that $A^n \cap V_i = (Y^k \cap W_i)A^r$ and, therefore,

$$\#(A^n \cap V_i) = \#(Y^k \cap W_i)\cdot Q^r < \frac{\#(Y^{k-i})\cdot Q^r}{\#Y - 1} = \frac{Q^{n-mi}}{Q^m - 1} \le \frac{Q^{n-i}}{Q-1}.$$

This proves the cardinality condition. Next consider $x \in V_i$ and $y \in A^*$ such that $x <_p y$. There is a $z \in W_i$ such that $z$ is a prefix of $x$, hence of $y$, so $y \in V_i$.

Finally, let $k \in \mathbf{N}$ and $w \in A^{mk}$. Then the relation $m_V(w_A) = m_W(w_Y)$ follows from the fact that $A^{mk} \cap V_i = (A^m)^k \cap W_i$, for all $i \ge 1$. □

In the situation of Lemma 6.59, the set $W$ itself is a Martin-Löf test over $A$, but never a sequential Martin-Löf test over $A$ - except when $m = 1$, i.e. $A = Y$ - as it fails the sequentiality condition.

Lemma 6.60. Let $m \in \mathbf{N}$, $Y = A^m$, and let $V$ be a sequential Martin-Löf test over $A$. Then, the set $W \subset Y^* \times \mathbf{N}$ defined by $W_i = V_{(i+1)m} \cap Y^*$, for $i \in \mathbf{N}$, is a sequential Martin-Löf test over $Y$ such that for $k \in \mathbf{N}$ and $w \in A^{km}$ one has

$$(m_W(w_Y) + 1)m \le m_V(w_A) < (m_W(w_Y) + 2)m.$$

Proof. Clearly, $W$ is a c.e. subset of $Y^* \times \mathbf{N}$. Moreover, one has $W_{i+1} = V_{(i+2)m} \cap Y^* \subset V_{(i+1)m} \cap Y^* = W_i$, for all $i \in \mathbf{N}$. Consider $n, i \in \mathbf{N}$. Then

$$\#(Y^n \cap W_i) = \#(A^{nm} \cap V_{(i+1)m}) < \frac{Q^{nm-(i+1)m}}{Q-1} = \frac{\#Y^{n-i}}{\#Y(Q-1)} \le \frac{\#Y^{n-i}}{\#Y - 1}.$$

Finally, we suppose that $x, y \in Y^*$, $x <_p y$ and $x \in W_i$; then $x \in V_{(i+1)m}$, so $y \in V_{(i+1)m} \cap Y^* = W_i$. Now let $w \in A^{mk} = Y^k$. The relation

$$w \in W_i \iff w \in V_{(i+1)m}$$

implies that

$$(m_W(w_Y) + 1)m \le m_V(w_A) < (m_W(w_Y) + 2)m.$$

□

Combining Lemma 6.59 and Lemma 6.60, one gets a second proof for Theorem 6.58. We use sequential Martin-Löf tests. Let $x$ be random over $Y$ and assume that $x$ is not random over $A$. Then there is a sequential Martin-Löf test $V$ over $A$ such that $m_V(x_A(n))$ is unbounded. Consider the sequential Martin-Löf test $W$ defined in Lemma 6.60 and $n \in \mathbb{N}$. It follows that

$$(m_W(x_Y(n)) + 1)m \le m_V(x_A(nm)) < (m_W(x_Y(n)) + 2)m$$

and, therefore, $m_W(x_Y(n))$ is also unbounded, that is, $x$ is not random over $Y$, a contradiction. Conversely, assume that $x$ is random over $A$, but $x$ is not random over $Y$. Then there is a sequential Martin-Löf test $W \subset Y^* \times \mathbb{N}$ such that $m_W(x_Y(n))$ is unbounded. By Lemma 6.59, a sequential Martin-Löf test $V$ over $A$ can be derived from $W$ such that $m_V(x_A(nm)) = m_W(x_Y(n))$, for all $n \in \mathbb{N}$. Hence $m_V(x_A(n))$ is also unbounded and, therefore, $x$ is not random over $A$, again a contradiction.

Remark. Theorem 6.58 is an analogue of a result concerning numbers (see Niven and Zuckerman [320], Theorem 8.2): For every $m \ge 2$, a number $\alpha$ is normal to the base $Q \ge 2$ iff $\alpha$ is normal to the base $Q^m$.

Theorem 6.61. Every random sequence is Borel $m$-normal, for every natural $m \ge 1$.

Proof. We use Theorem 6.57 and Theorem 6.58. □
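The passage from the alphabet $A$ to the alphabet $Y = A^m$ used in Lemmas 6.59 and 6.60 amounts to reading a sequence in aligned blocks of length $m$, so that the first $n$ symbols over $Y$ correspond to the first $nm$ symbols over $A$. A minimal Python sketch of this re-blocking follows; the function names `reblock` and `block_frequencies` are illustrative assumptions, not notation from the text, and the sample prefix is arbitrary.

```python
from collections import Counter


def reblock(prefix, m):
    """Read a string over A as a list of symbols of Y = A^m (aligned blocks)."""
    usable = len(prefix) - len(prefix) % m  # drop an incomplete last block
    return [prefix[i:i + m] for i in range(0, usable, m)]


def block_frequencies(prefix, m):
    """Relative frequency of each aligned block of length m in the prefix."""
    blocks = reblock(prefix, m)
    if not blocks:
        return {}
    counts = Counter(blocks)
    return {block: c / len(blocks) for block, c in counts.items()}


if __name__ == "__main__":
    x = "0110100110010110"  # arbitrary illustrative prefix over A = {0, 1}
    print(reblock(x, 2))            # the same prefix read over Y = A^2
    print(block_frequencies(x, 2))  # empirical frequencies of the blocks
```

For a finite prefix these frequencies are only empirical; statements such as Theorem 6.61 concern their limits along the whole sequence.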

It is worth mentioning that there exists another possibility of counting the occurrences of a string $y$ in a sequence $x$, quite different from that adopted in Definition 6.53. Indeed, given $y \in A^*$ and $x \in A^\omega$ we put $m = |y|$ and

$$F(x, y, n) = \#\{1 \le j \le n - m - 1 \mid x_j x_{j+1} \cdots x_{j+m-1} = y\},$$

for every $n \ge m$ ($F(x, y, n) = 0$, for $n < m$), and ask about the value of the limit $\lim_{n \to \infty} n^{-1} F(x, y, n)$. Due to a classical result in Niven and Zuckerman [320] (see also Kuipers and Niederreiter [268]), the above limit has the value $Q^{-m}$, for all $y \in A^m$, exactly in case $x$ is Borel $m$-normal. Following Knuth [255], $x$ is called an $m$-distributed sequence. Accordingly, we can state:

Corollary 6.62. For every random sequence $x \in A^\omega$ and every string $y \in A^*$ of length $m$,

$$\lim_{n \to \infty} \frac{F(x, y, n)}{n} = Q^{-m}.$$
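The sliding-window count $F(x, y, n)$ can be explored numerically on a computable sequence. The Python sketch below is an illustration under stated assumptions: it counts every window of length $m$ lying inside the first $n$ symbols (which differs from the index range displayed above by at most a bounded number of windows and so does not affect the limit), and it uses the decimal Champernowne sequence $0\,1\,2\,3 \ldots$ as test input. The names `champernowne_decimal` and `sliding_count` are hypothetical.

```python
from itertools import count


def champernowne_decimal(n):
    """First n symbols of 0, 1, 2, 3, ... written in base 10 and concatenated."""
    pieces, total = [], 0
    for k in count(0):
        s = str(k)
        pieces.append(s)
        total += len(s)
        if total >= n:
            break
    return "".join(pieces)[:n]


def sliding_count(x_prefix, y, n):
    """Number of (possibly overlapping) occurrences of y inside x_prefix[:n]."""
    m = len(y)
    if n < m:
        return 0
    # Assumption: all windows contained in the first n symbols are counted.
    return sum(1 for j in range(n - m + 1) if x_prefix[j:j + m] == y)


if __name__ == "__main__":
    n = 100_000
    x = champernowne_decimal(n)
    for y in ("7", "42"):
        # Empirical ratio F(x, y, n) / n, to be compared with Q^{-m} for Q = 10.
        print(y, sliding_count(x, y, n) / n, 10 ** (-len(y)))
```

Such an experiment only illustrates the definition: a finite prefix can neither establish nor refute the limit in Corollary 6.62.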

Remark. A) There exist many Borel normal sequences which are not random, e.g. Champernowne's binary sequence 010001101100001101110001101100000110111000110111000011011100 ... over the alphabet {0, 1}, or Champernowne's decimal sequence 012345678910111213141516171819202122232425262728293031323334 ... over the alphabet A = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} (see Exercise 6.7.2). The reason is simple: these sequences are computable, a property which excludes randomness.

B) Becher and Figueira [24] have constructed a computable real which is normal in every base.

C) It is still unknown whether the decimal representations of some familiar irrationals like $\pi$, $e$, $\sqrt{2}$, $\log 2$ are or are not Borel normal. An interesting theory explaining the normality, hence pseudo-randomness, of a collection of celebrated constants including $\pi$ or $\log 2$ is presented in Bailey and Crandall [11, 12].

We close this section with a topological property. There is a very popular analogy between sets having measure zero and sets of first Baire category (see, for instance, Oxtoby's book [326]). However, there are many sets of measure zero which are not of first Baire category and vice versa. For instance, Oxtoby and Ulam have proven that the Law of Large Numbers fails to be true in the sense of topological category (i.e. the set $Y$ appearing in Theorem 6.27 is of first category, although of constructive measure one). Consider again the compact topological space $(A^\omega, \tau)$. A set $B \subset A^\omega$ is called a first category set (in the sense of Baire) if it can be written in the form

$$B = \bigcup_{n=1}^{\infty} B_n,$$

where all sets $B_n$ are nowhere dense, i.e. $\mathrm{int}(\mathrm{Cl}(B_n)) = \emptyset$, for every natural $n \ge 1$. A set which is not a first category set is called a second category set and the complement of a first category set is a residual.

Theorem 6.63. The set of all random sequences is a first Baire category set.

Proof. In view of the formula

$$A^\omega \setminus \mathrm{rand} = \bigcap_{m=1}^{\infty} (U_m A^\omega),$$

where $U$ is a fixed universal sequential Martin-Löf test, one has

$$\mathrm{rand} = \bigcup_{m=1}^{\infty} F_m, \qquad F_m = A^\omega \setminus (U_m A^\omega).$$

These sets are closed and have an empty interior. Indeed, only the second claim must be proven. We choose an arbitrary $m \ge 1$ and show that there is no non-empty open set $G \subset F_m$. It is sufficient to prove that for every $x \in A^*$, $xA^\omega \not\subset F_m$. It is plain that the constant sequence $y = a_1 a_1 \ldots a_1 \ldots$ is not in rand and the non-random sequence $xy$ (see Theorem 6.41) is in $U_m A^\omega \cap xA^\omega$. □

Corollary 6.64. Both sets rand and $A^\omega \setminus \mathrm{rand}$ are dense in $(A^\omega, \tau)$. Moreover, $A^\omega \setminus \mathrm{rand}$ is a residual.

Proof. The set rand is dense because $\mu(\mathrm{rand}) = 1$ and every non-empty open set has non-zero measure. By Corollary 6.32, $A^\omega \setminus \mathrm{rand}$ is a residual. It is also dense in $(A^\omega, \tau)$ since for every $x \in A^\omega$ the sequence $x(n) a_1 a_1 \ldots a_1 \ldots$ tends to $x$ as $n \to \infty$ and every element $x(n) a_1 a_1 \ldots a_1 \ldots$ is not random. □


Corollary 6.32 asserts that rand has constructively measure one. We now prove that Theorem 6.63 is also constructively valid.

Definition 6.65. A set $B \subset A^\omega$ is called a constructively first category set (in Baire sense) if there exist a c.e. set $E \subset A^* \times \mathbb{N}$ and a computable function $f : A^* \times \mathbb{N} \to A^*$ satisfying the following two conditions:

1. One has
$$B \subset \bigcup_{m=1}^{\infty} \bigl(A^\omega \setminus (E_m A^\omega)\bigr),$$
where $E_m = \{y \in A^* \mid (y, m) \in E\}$, for $m \ge 1$.

2. For every string $x \ne \lambda$ and every natural $m \ge 1$, $x <_p f(x, m)$ and $f(x, m) \in E_m$.

Theorem 6.66 (Calude-Chitescu). The set rand is a constructively first category set.

Proof. Let $U$ be the universal sequential Martin-Löf test constructed in Theorem 6.16 and put $E = U$, hence $E_m = U_m$, for all natural $m \ge 1$. To define the computable function $f$ we consider the sequential Martin-Löf test $W = \{(a_1^{n+1} y, n) \mid y \in A^*, n \ge 1\}$. We take $x \in A^*$ and $m \in \mathbb{N}$. In case $x = \lambda$ or $m = 0$, we put $f(x, m) = \lambda$. Assume now that $x \ne \lambda$ and $m \ge 1$. We construct the sequential Martin-Löf test $V = \{(xy, n) \mid y \in A^*, n \ge 1, (y, n) \in U\}$, and we constructively pick a natural $c$ such that $V_{n+c} \cup W_{n+c} \subset U_n$, $n = 1, 2, \ldots$ (see again the proof of Theorem 6.16). In fact, $c$ can be obtained as the maximum of the "Gödel numbers" of the c.e. sets $W$ and $V$. We put

f(x,m) = xaic+rn+l. All that remains to be proven is that f(x, m) E Urn. Indeed,

a2c+rn+l 1

TT

E V2c+rn C

Urn,

l.e.

o


6.5 The Reducibility Theorem

In this section we discuss the extent to which random sequences can generate, in an algorithmic way, all sequences. For strings, this is fairly obvious: the random strings generate, by means of the universal Chaitin computer, all strings. For sequences, the same phenomenon occurs, but it is far more complicated to describe it formally. Before stating and proving the main result we need some more notation. We put $A^\infty = A^* \cup A^\omega$ and note that the prefix-order relation defined on $A^*$ can be extended to $A^\infty$. For $\alpha, \beta \in A^\infty$ we say that $\alpha$ is a prefix of $\beta$ (and we write $\alpha$

2. a,(3 E A* and a

c

A 00 and every string x E A * , xX

= {a E X I x

If $H \subset A^*$, then $HX = \bigcup_{x \in H} xX$. In particular, for $X = A^\omega$ we get the (basic) open sets generating the topology $\tau$ on $A^\omega$ used in our previous section: $xA^\omega$, $HA^\omega$. In particular, a set $D \subset A^\omega$ is a (constructive) closed set (or $\Pi_1^0$) if $A^\omega \setminus D = HA^\omega$, for some (c.e.) subset $H \subset A^*$. In what follows we shall freely use the measure $\mu$.

Definition 6.67. A function $F : A^\infty \to A^\infty$ is continuous if the following two conditions hold true:

a) $F$ is prefix-increasing, i.e. for all $x, y \in A^*$, $F(x) <_p F(y)$ whenever $x <_p y$.

b) for every $x \in A^\omega$, $F(x) = \sup\{F(x(n)) \mid n \ge 1\}$.

Comment. The set $A^\infty$ comes equipped with a natural structure of computable complete partial order (cpo) under $<_p$. The continuity in Definition 6.67 is exactly the continuity in the sense of complete partial orders (cpos); see more in Weihrauch [429].
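To make the two conditions of Definition 6.67 concrete, the following Python sketch checks prefix-monotonicity of a string function on all short inputs and computes the supremum of a finite chain of prefixes. The example function `pair_xor` (one output bit per pair of input bits) and all names are illustrative assumptions; the check is exhaustive only up to the stated length and is not a proof.

```python
from itertools import product


def is_prefix(u, v):
    """True when u is a prefix of v (the order used for strings in the text)."""
    return v.startswith(u)


def pair_xor(x):
    """Hypothetical example: emit the XOR of each complete pair of input bits."""
    return "".join(str(int(x[i]) ^ int(x[i + 1])) for i in range(0, len(x) - 1, 2))


def is_prefix_increasing(f, alphabet, max_len):
    """Check f(u) is a prefix of f(x) for every prefix u of every x, |x| <= max_len."""
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            x = "".join(symbols)
            if any(not is_prefix(f(x[:k]), f(x)) for k in range(n + 1)):
                return False
    return True


def sup_of_chain(values):
    """Supremum of a finite chain of strings under the prefix order (the longest)."""
    best = ""
    for v in values:
        if not (is_prefix(best, v) or is_prefix(v, best)):
            raise ValueError("values do not form a chain")
        if len(v) > len(best):
            best = v
    return best


if __name__ == "__main__":
    print(is_prefix_increasing(pair_xor, "01", 6))
    x = "011010011001"
    # Finite stage of condition b): sup of F(x(n)) over the available prefixes.
    print(sup_of_chain(pair_xor(x[:n]) for n in range(len(x) + 1)))
```

For an infinite sequence the supremum in condition b) is the limit of such finite stages, which exists precisely because the values form a chain.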


Definition 6.68. A continuous function $F : A^\infty \to A^\infty$ is computable if its graph approximation set $\{(x, y) \in A^* \times A^* \mid y <_p F(x)\}$ is c.e.

Definition 6.69. An element $x \in A^*$ is called a non-terminal string for the continuous function $F : A^\infty \to A^\infty$ if there exists a string $y$ with $x <_p y$ and $F(x) \ne F(y)$.

Definition 6.70. A computable function $F : A^\infty \to A^\infty$ is a process if the set

$$\{(x, F(x)) \mid x \in A^* \text{ is a non-terminal string for } F\}$$

is c.e.

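Lemma 6.71. If $F : A^* \to A^*$ is a computable prefix-increasing function, then its extension $\overline{F} : A^\infty \to A^\infty$ defined by $\overline{F}(x) = F(x)$, for all $x \in A^*$, and $\overline{F}(x) = \sup\{F(x(n)) \mid n \ge 1\}$, for $x \in A^\omega$, is a process.

Proof. It is obvious that $\overline{F}$ is a continuous function. Also, the set

$$\{(x, y) \in A^* \times A^* \mid y <_p \overline{F}(x)\} = \{(x, y) \in A^* \times A^* \mid y <_p F(x)\}$$

is c.e. ($F$ is computable). Finally, the set

$$\{(x, \overline{F}(x)) \mid x \in A^* \text{ is a non-terminal string for } \overline{F}\} = \{(x, F(x)) \mid x \in A^*,\ F(x) \ne F(y), \text{ for some } y \in A^* \text{ with } x <_p y\}$$

is c.e. □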

Lemma 6.72. Let $F : A^\infty \to A^\infty$ be a computable function. Then there exists a computable and prefix-increasing function $G : A^* \to A^*$ such that

$$F(x) = \sup\{G(x(n)) \mid n \ge 1\},$$

for all $x \in A^\omega$.

Proof The set B = {(x, y) E A* x A* I y

1) F(x)

= sup{y E A*

2) F(x)

I (x,y) E B}, for all x E A*,

sup{y E A* I (x(n),y) E B, for some n sup{g2(k) IkE N,gl(k)

~

I}

for every x E AW. We define the function G : A * --+ A * by

G(x)

= sup{g2(k) I gl(k)

One can see that G is a monotonic computable function. Using the construction of G and the continuity of F we get, for every x E AW, sup{G(x(n)) I n ~ I}

supsup{g2(k) I gl(k)

n2':l

supF(x(n)) n2':l

F(x). On the other hand, for every kEN such that gl (k)

g2(k)

=

max { k, Ig1 (k ) I}. Therefore, F(x)

~ I},

completing the proof.

o

Remark. In view of Lemma 6.72 we will speak about a string being non-terminal for either a computable function defined on $A^\infty$ or a prefix-increasing computable function defined on $A^*$.

We are now in a position to state the main result of this section.

Theorem 6.73 (Kučera-Gács). Let $C \subset A^\omega$ be a constructive closed set and $k_0$ be a natural number such that $\mu(C) > k_0^{-1}$. Then there effectively exists a process $F : A^\infty \to A^\infty$ with $A^\omega = F(C)$.

The proof will be divided into several steps. Let $C = A^\omega \setminus \bigcup_{i \ge 0} w_i A^\omega$ (where the map carrying $i$ into $w_i$ is computable). We put

$$C_t = A^\omega \setminus \bigcup_{i=0}^{t} w_i A^\omega.$$

It is seen that the sequence $(C_t)_{t \ge 0}$ is decreasing and $C = \bigcap_{t \ge 0} C_t$. Without loss of generality we may assume that the set $\{w_0, w_1, \ldots\}$ is prefix-free. We shall use two non-decreasing sequences of natural numbers $(n_k)_{k \ge 0}$ and $(m_k)_{k \ge 0}$ as follows: for $0 \le k < k_0$, for $k \ge k_0$, and for $0 \le k < k_0$, for $k \ge k_0$. For every natural $k$, we put

and for all tEN, for 0 ~ k < ko, for k ;::: ko.

Fact 6.74. For all natural $k, t$, $R_t^k$ is non-empty and computable.

Proof. For $0 \le k < k_0$, $R_t^k = \{\lambda\}$. If $k \ge k_0$, $t \in \mathbb{N}$ and $R_t^k = \emptyset$, then $\mu(C_t \cap xA^\omega) < k^{-1} Q^{-n_k}$, for all $x \in T_k$. It follows that


and a contradiction. Furthermore, for all $k \ge k_0$, $t \ge 0$, and $x \in T_k$ one has

$$x \in R_t^k \iff \mu(C_t \cap xA^\omega) \ge k^{-1} Q^{-n_k} \iff \mu\Bigl(xA^\omega \setminus \bigcup_{i=0}^{t} w_i A^\omega\Bigr) \ge k^{-1} Q^{-n_k} \iff Q^{-n_k} - \sum_{\substack{0 \le i \le t \\ x <_p w_i}} Q^{-|w_i|} \ge k^{-1} Q^{-n_k},$$

and the last condition is computable. □
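The last condition is a comparison of rational numbers, which is why it can be decided mechanically. The Python sketch below computes $\mu(xA^\omega \setminus \bigcup_i w_i A^\omega)$ exactly for a finite prefix-free list of strings and tests the threshold $k^{-1} Q^{-|x|}$. It is only an illustration: the names `cylinder_measure_left` and `passes_threshold` are hypothetical, the code reads $|x|$ in place of $n_k$ (assuming $x \in T_k$ has length $n_k$), and it additionally returns measure $0$ when some $w_i$ is a prefix of $x$.

```python
from fractions import Fraction


def is_prefix(u, v):
    """True when u is a prefix of v."""
    return v.startswith(u)


def cylinder_measure_left(x, removed, Q):
    """Exact measure of x A^omega minus the cylinders w A^omega, w in removed.

    `removed` is assumed to be a finite prefix-free list of strings.
    """
    if any(is_prefix(w, x) for w in removed):
        return Fraction(0)  # the whole cylinder x A^omega has been removed
    total = Fraction(1, Q ** len(x))
    for w in removed:
        if is_prefix(x, w):  # w A^omega lies inside x A^omega
            total -= Fraction(1, Q ** len(w))
    return total


def passes_threshold(x, removed, Q, k):
    """Decide mu(C_t intersected with x A^omega) >= k^{-1} Q^{-|x|} exactly."""
    return cylinder_measure_left(x, removed, Q) >= Fraction(1, k * Q ** len(x))


if __name__ == "__main__":
    removed = ["00", "011"]  # illustrative w_0, w_1 over A = {0, 1}
    print(cylinder_measure_left("0", removed, 2))
    print(passes_threshold("0", removed, 2, 3))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison is not affected by floating-point rounding.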

Fact 6.75. For all $t, k \in \mathbb{N}$ and $x \in R_t^k$ one has

$$\#(xR_t^{k+1}) \ge Q^{m_{k+1} - m_k}.$$

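Proof. Let $k, t \in \mathbb{N}$. If $k < k_0$, then $x = \lambda$, so the required inequality comes from Fact 6.74 (i.e. $R_t^{k+1} \ne \emptyset$). Let $k \ge k_0$, $x \in R_t^k$ and put $r = \#(xR_t^{k+1})$. Using the definition of $R_t^{k+1}$ it follows that for all $y \in xT_{k+1} \setminus xR_t^{k+1}$ one has

$$\mu(C_t \cap yA^\omega) < (k+1)^{-1} Q^{-n_{k+1}}.$$

Also, $\mu(C_t \cap yA^\omega) \le \mu(yA^\omega) \le Q^{-n_{k+1}}$, for every $y \in xR_t^{k+1} \cap xT_{k+1}$. Accordingly, the following computation is valid:

$$\mu(C_t \cap xA^\omega) = \sum_{y \in xT_{k+1}} \mu(C_t \cap yA^\omega) = \sum_{y \in xT_{k+1} \setminus xR_t^{k+1}} \mu(C_t \cap yA^\omega) + \sum_{y \in xT_{k+1} \cap xR_t^{k+1}} \mu(C_t \cap yA^\omega)$$

$$< (k+1)^{-1} Q^{-n_{k+1}} (Q^{n_{k+1} - n_k} - r) + rQ^{-n_{k+1}} < rQ^{-n_{k+1}} + (k+1)^{-1} Q^{-n_k}.$$

By hypothesis $x \in R_t^k$, so in view of the above inequalities we get

$$k^{-1} Q^{-n_k} \le rQ^{-n_{k+1}} + (k+1)^{-1} Q^{-n_k}, \qquad r \ge k^{-1} (k+1)^{-1} Q^{n_{k+1} - n_k}.$$

Using the construction of the sequence $(n_k)$ we can write the relations

$$n_{k+1} - n_k = (k+1)^2 + (k+1)\lfloor \log (k+1)^2 \rfloor - k^2 - k \lfloor \log k^2 \rfloor \ge 2k + 1 + \lfloor \log (k+1)^2 \rfloor > 2k + \log (k+1)^2.$$

Finally, one has

$$r > k^{-1} (k+1)^{-1} Q^{2k + \log (k+1)^2} > (k+1) k^{-1} Q^{2k} > Q^{2k-1},$$

thus concluding the proof. □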

Next we define a sequence $(F_t)_{t \ge 0}$ of functions $F_t : A^* \to A^*$. First we put $F_0(\lambda) = \lambda$. Then we define $F_0$ on sections $T_k$, one by one. Let $x \in T_{k+1}$ and $x' = x(n_k)$. There are two cases:

A1) If $x' \in R_0^k$, $|F_0(x')| = m_k$, $x$ is the $i$th element of $x'R_0^{k+1}$ and $1 \le i \le Q^{m_{k+1} - m_k}$, then let $F_0(x)$ be the $i$th element of $F_0(x')S_{k+1}$ (in this case one can see that, following Fact 6.75, $F_0(x')S_{k+1}$ has $Q^{m_{k+1} - m_k}$ elements).

A2) In the opposite case we put $F_0(x) = F_0(x')$.

Rules A1), A2) define $F_0$ recursively for each string $x$ in $\bigcup_{k \ge 0} T_k$. We extend $F_0$ to all strings in $A^*$ by the formula

$$F_0(x) = F_0(x(n_k)),$$

where $k$ is the greatest integer for which $n_k \le |x|$.

Inductively, assume that we have defined $F_0, F_1, \ldots, F_t$ and we describe a procedure for $F_{t+1}$. Again, $F_{t+1}(\lambda) = \lambda$. Let $x \in T_{k+1}$ and $x' = x(n_k)$. There will be three cases:

B1) If $|F_t(x)| = m_{k+1}$, then $F_{t+1}(x) = F_t(x)$.

B2) If $x' \in R_{t+1}^k$, $|F_t(x')| = m_k$, $x$ is the $i$th element of the set $\{z \in x'R_{t+1}^{k+1} \mid |F_t(z)| \ne m_{k+1}\}$ and $1 \le i \le \#X$, where

$$X = \{y \in F_t(x')S_{k+1} \mid \text{for every } z \in x'R_{t+1}^{k+1},\ F_t(z) \ne y\},$$

then $F_{t+1}(x)$ is the $i$th element of $X$.

B3) In the remaining cases, we put $F_{t+1}(x) = F_{t+1}(x')$.

We employ the same procedure, i.e. we extend $F_{t+1}$ to $A^*$ by defining

$$F_{t+1}(x) = F_{t+1}(x(n_k)),$$

where $k$ is the greatest integer with $n_k \le |x|$.

Since all sets $R_t^k$ are computable it follows that the functions $F_t$ are themselves all computable. We will start to prove some peculiar properties of these functions.

Fact 6.76. For all $t, k \in \mathbb{N}$ and $x \in T_k$, $|F_t(x)| \le m_k$.

Proof. Let $t, k \in \mathbb{N}$. We distinguish two cases:

i) If $t = 0$, then we proceed by induction on $k$. If $k = 0$, then $x = \lambda$, $|F_0(\lambda)| = 0 = m_0$. Suppose that the inequality holds true for all strings in $T_k$ and let $x \in T_{k+1}$. From the construction of $F_0$ one has


By virtue of the induction hypothesis, $|F_0(x(n_k))| \le m_k$, so $|F_0(x)| \le m_{k+1}$.

ii) In case $t > 0$ we still proceed by induction on $k$. If $k = 0$, then $x = \lambda$, $|F_{t+1}(\lambda)| = 0 = m_0$. Let $x \in T_{k+1}$; from the construction of $F_{t+1}$,

(we have used the induction hypothesis $|F_{t+1}(x(n_k))| \le m_k$). □

Remark. For every $x \in T_k$, if $|F_t(x)| = m_k$, then $F_j(x) = F_t(x)$, for every $j \ge t$.

Fact 6.77. For all $t, k \in \mathbb{N}$ and $x \in T_{k+1}$ the following two assertions are true:

a) If $|F_t(x)| = m_{k+1}$, then $|F_t(x(n_k))| = m_k$.

b) If $|F_t(x)| < m_{k+1}$, then $F_t(x) = F_t(x(n_k))$.

Proof. a) We proceed by induction on $t$. For $t = 0$, the statement is true by virtue of A1). Suppose that the assertion holds true for $F_t$ and let $x \in T_{k+1}$ so that $|F_{t+1}(x)| = m_{k+1}$. According to the definition of $F_{t+1}$ we have to analyse two situations: i) if $F_{t+1}$ comes out through B1), then $|F_t(x)| = m_{k+1}$, hence, by induction, $|F_t(x(n_k))| = m_k$ and $|F_{t+1}(x(n_k))| = m_k$; ii) in case $F_{t+1}$ comes out through B2), $m_k = |F_t(x(n_k))| = |F_{t+1}(x(n_k))|$.

b) In case $t = 0$ we use A2); otherwise, B3). □

Fact 6.78. All functions $F_t$ are prefix-increasing.

Proof It is enough to prove that Ft(x(nk))


In case B2), 1Ft (x') I = mk,Ft+1(x') Ft+1(x), hence FH1(X')

= Ft(x').

In this case Ft(x')

In case B3), $F_{t+1}(x) = F_{t+1}(x')$, therefore the assertion holds true. □

Fact 6.79. For all $t \in \mathbb{N}$, $x \in A^*$ one has $F_t(x) <_p F_{t+1}(x)$.

Proof It is enough to prove that Ft(x)

In case B1), Ft+1(x) = Ft(x). In case B2), 1Ft (x) I #- mk+l and Ft(x) = Ft(x') (by Fact 6.77b)). By monotonicity and the induction hypothesis Ft+1 (x')

□

Since the sequence $(F_t(x))_{t \ge 0}$ is prefix-increasing for every $x \in A^*$, we can define the function $F : A^* \to A^\infty$ by

$$F(x) = \sup\{F_t(x) \mid t \ge 0\}.$$

The next step consists in extending $F$ to $A^\infty$ by the formula (see Lemma 6.71) $F(x) = \sup\{F(x(n)) \mid n \ge 0\}$.

Fact 6.80. For each $x \in A^*$, $F(x) \in A^*$.

Proof. Let $x \in A^*$. If $x = \lambda$, then $F_t(\lambda) = \lambda$. If $x \ne \lambda$, then there exists a natural number $k$ such that $n_k \le |x| < n_{k+1}$. Therefore, using Fact 6.77 one has $|F_t(x)| = |F_t(x(n_k))| \le m_k$, for each $t \in \mathbb{N}$. Thus the set $\{F_t(x) \mid t \ge 0\}$ is finite and $F(x) = \sup\{F_t(x) \mid t \ge 0\} = \max\{F_t(x) \mid t \ge 0\} \in A^*$. □

Fact 6.81. The function $F : A^* \to A^*$ is prefix-increasing.

Proof If x'

F(x')

= sup{Ft(x')

I t ~ O}

0

Fact 6.82. The function $F : A^\infty \to A^\infty$ is computable.

Proof. From Fact 6.80, Fact 6.81 and the construction of $F$ it follows that $F$ is continuous. Further, the set

$$\{(x, y) \in A^* \times A^* \mid y <_p F(x)\} = \{(x, y) \in A^* \times A^* \mid y <_p F_t(x), \text{ for some } t \in \mathbb{N}\}$$

is c.e. (each $F_t$ is computable), showing that $F$ is computable. □

Fact 6.83. For every $k \in \mathbb{N}$ and $x' \in T_k$, the following assertions are equivalent:

a) The string $x'$ is non-terminal for $F$.

b) One has $|F_t(x')| = m_k$, for some $t \ge 0$.

Proof. a) $\Rightarrow$ b) If $x'$ is non-terminal for $F$, then there exist an index $j > k$ and a string $y \in x'T_j$ such that $F(x') \ne F(y)$. Suppose that $|F_t(x')| < m_k$, for every $t \ge 0$. Using Fact 6.77 b) we deduce that for every $t \in \mathbb{N}$ and all $i \in \{k, k+1, \ldots, j-1\}$, $|F_t(y(n_i))| < m_i$, so $F_t(y(n_{i+1})) = F_t(y(n_i))$. We get $F_t(y) = F_t(x')$, for all natural $t$, i.e. $F(y) = F(x')$, a contradiction.

b) $\Rightarrow$ a) If $|F_t(x')| = m_k$, then $F(x') = F_t(x')$. Let $t_0 = \min\{t \ge 0 \mid |F_t(x')| = m_k\}$. There are two cases:

1. If $t_0 = 0$, then $x' \in R_0^k$ (see the definition of $F_0$, case A1)). Let $x = \min x'R_0^{k+1}$. Again, by virtue of A1), one has $|F_0(x)| = m_{k+1}$. Therefore, $F(x') = F_0(x') \ne F_0(x) = F(x)$, which shows that $x'$ is a non-terminal string for $F$.

2. If $t_0 > 0$, then let $t_0 = 1 + t_1$. Then $|F_{t_1}(x')| < m_k$ and $|F_{1+t_1}(x')| = m_k$. From the definition of $F_{1+t_1}$ (case B2)) it follows that $x' \in R_{1+t_1}^k$. Let $x = \min x'R_{1+t_1}^{k+1}$. Since $|F_{t_1}(x')| < m_k$ we can apply Fact 6.77 b) and deduce that $|F_{t_1}(x)| < m_k$. We put

$$B = \{y \in F_{t_1}(x')S_{k+1} \mid F_{t_1}(z) = y, \text{ for some } z \in x'R_{1+t_1}^{k+1}\},$$

and prove that $B = \emptyset$. Indeed, on the contrary, there is a $z \in x'R_{1+t_1}^{k+1}$ such that $F_{t_1}(z) \in F_{t_1}(x')S_{k+1}$. Let $t_2$ be the smallest integer for which $|F_{t_2}(z)| = m_{k+1}$ (hence $t_2 \le t_1$). It follows that $F_{t_2}(z)$ has been defined via case B2) (in case $t_2 > 0$) or A2) (in case $t_2 = 0$). Since both strings $x, z$ belong to $x'R_{1+t_1}^{k+1} \subset x'R_{t_2}^{k+1}$ and $|F_{t_2}(z)| = m_{k+1}$ it follows that $|F_{t_2}(x)| = m_{k+1}$ (the values of $F_{t_2}$ have been assigned in lexicographical order and $x$ is less than $z$ according to this order). We have contradicted Fact 6.79: $|F_{t_1}(x)| < m_k$ and $t_2 \le t_1$.

As a direct consequence of the equality B =

0 we deduce the formula

and the least element of the above set is exactly $F_{1+t_1}(x)$. Accordingly, $|F_{1+t_1}(x)| = m_{k+1}$ and $F(x') = F_{1+t_1}(x') \ne F_{1+t_1}(x) = F(x)$, i.e. $x'$ is a non-terminal string for $F$. □

Fact 6.84. The function $F$ is a process.

Proof. Let $x \in A^*$ and $k \in \mathbb{N}$ such that $n_k \le |x| < n_{k+1}$. Then $F(x) = F(x(n_k))$, as all functions $F_t$ have this property. In view of Fact 6.83, $x$ is a non-terminal string for $F$ iff there is an integer $t$ such that $|F_t(x(n_k))| = m_k$. So, the set

$$\{(x, F(x)) \mid x \text{ is a non-terminal string for } F\} = \{(x, F(x)) \mid n_k \le |x| < n_{k+1},\ |F_t(x(n_k))| = m_k, \text{ for some } t, k \in \mathbb{N}\}$$

is c.e. (as the binary function $F_t(x)$ is computable). □

We put for $t, k \in \mathbb{N}$, and

Fact 6.85. For all $k, t \in \mathbb{N}$ and $x' \in M_t^k$ one has

$$F_t(x'M_t^{k+1}) = F_t(x')S_{k+1}.$$

Proof. We proceed by induction on $t$ and we prove a stronger relation, namely: for all $t, k \in \mathbb{N}$, $x' \in M_t^k$, $y \in F_t(x')S_{k+1}$ there is a unique string $x \in x'M_t^{k+1}$ such that $F_t(x) = y$.


For t = 0 we analyse two cases: i) k+1 < ko and mk+l = nk+l = 0 (and in this case the assertion is clearly true), ii) k+ 1 2:: ko and x' E R~, lFo(x' ) I = mk. Let y E FO(X ' )Sk+l. There exists an i, 1 ~ i ~ Q ffi k+ 1 - ffik , such that y is exactly the ith element of x' R~+l. Therefore, from the definition of Fo (case A2)), Fo(x) = y. As lFo(x) I = Iyl = mk+l, x E x' M~+l, x is the unique string in x' M~+1 having the above property. Now let us pass from t to t+ 1. We fix kEN and x' E M tk+1 . If k+ 1 < ko, then the statement is true. In case k + 1 2:: ko we use the construction of the set Ml+l to deduce that x, E .Rf+l and IFt+1 (x') I = mk. Let y E Ft (X ' )Sk+l. There are two cases to be checked: i) If there exists x E x ' R;1l such that F t (x) = y, then Ft+1 (x) = y (because 1Ft (x) I = mk) and x E x'M::r In view of the inclusion x ' R;1l c x'R;+l and the induction hypothesis, there is a unique string x in x ' R;1l such that Ft+l(X) = y.

ii) If Ft(z) X

i= y, for all Z E x' ~1l, = {y'

E Ft (X ' )Sk+l

then y E X, where

I Ft(z) i= y',

for all z E x ' R;1l}.

The induction hypothesis says that $F_t$ is a bijection between the sets $x'M_t^{k+1}$ and $F_t(x')S_{k+1}$. Let

$$B = \{z \in x'R_{t+1}^{k+1} \mid |F_t(z)| = m_{k+1}\} \subset M_t^{k+1}.$$

We deduce that $\#F_t(B) = \#B$, $X = F_t(x')S_{k+1} \setminus F_t(B)$ and

$$\#X = Q^{m_{k+1} - m_k} - \#F_t(B) = Q^{m_{k+1} - m_k} - \#B \le \#(x'R_{t+1}^{k+1}) - \#B = \#(x'R_{t+1}^{k+1} \setminus B)$$

(the inequality comes from Fact 6.75). Suppose now that $y$ is the $i$th element of $X$, $1 \le i \le \#X$. Let $x$ be the $i$th element of $x'R_{t+1}^{k+1} \setminus B$. From the definition of $F_{t+1}$ (case B2)) $F_{t+1}(x) = y$ and $x$ is the unique string with the above property. □

Fact 6.86. For all $t \in \mathbb{N}$, if $x \in M_t \setminus M_{t+1}$, then $x \notin \bigcup_{i \ge t+1} M_i$.

Proof. Let $t \in \mathbb{N}$, $x \in M_t \setminus M_{t+1}$, $|x| = n_k$. From the relation $x \notin M_{t+1}$ we get $x \notin R_{t+1}^k$. Since the sequence $(R_t^k)_{t \ge 0}$ is decreasing, it follows that $x \notin \bigcup_{i \ge t+1} R_i^k$, i.e. $x \notin \bigcup_{i \ge t+1} M_i$. □

Fact 6.86 allows us to define the lower limit of the sequence $(M_t)_{t \ge 0}$:

$$M = \bigcup_{t \ge 0} \bigcap_{i \ge t} M_i.$$

We put $M^k = M \cap T_k$. It is seen that $M^k = \{\lambda\}$, for $0 \le k < k_0$. Fact 6.86 shows that $M^k = \{x \in T_k \mid x \in M_t \text{ infinitely often}\}$.

Fact 6.87. For all $k \in \mathbb{N}$ and $x' \in M^k$ one has $F(x'M^{k+1}) = F(x')S_{k+1}$.

Proof. Let $k \in \mathbb{N}$, $x' \in M^k$ and $y \in F(x'M^{k+1})$.

From the relation $x' \in M^k$ it follows that a natural $t_0$ exists such that $x' \in M_t^k$, for all $t \ge t_0$. Accordingly, $F(x') = F_t(x')$, for all $t \ge t_0$. Thus

and using Fact 6.86, for each $t \ge t_0$ there exists an element $x_t \in x'M_t^{k+1}$ with $F_t(x_t) = y$. But

and the last set is finite. Therefore we can find an $x \in x'T_{k+1}$ such that $\{t \in \mathbb{N} \mid t \ge t_0,\ x_t = x\}$ is infinite, hence $x \in x'M^{k+1}$. It is easy to see that $F(x) = y$, thus finishing the proof. □

Remark. i.e. Mko i-

In case k

=

ko, Fact 6.87 shows that F(Mko)

{A},

0.

To complete the proof of Theorem 6.73 we state:

Fact 6.88. For every sequence $y \in A^\omega$ there exists a sequence $x \in C$ such that $F(x) = y$.

Proof. Starting with the sequence $y \in A^\omega$ we construct a sequence $(x_k)_{k \ge k_0}$ of strings satisfying the following properties: $x_k \in M^k$, $x_k <_p x_{k+1}$ and $F(x_k) = y(m_k)$, for all $k \ge k_0$.

First, we take $x_{k_0} \in M^{k_0}$ (which is non-empty). If $x_k \in M^k$ with $F(x_k) = y(m_k)$, then $y(m_{k+1}) \in y(m_k)S_{k+1} = F(x_k)S_{k+1}$. By Fact 6.87 there exists a string $x_{k+1} \in x_k M^{k+1}$ such that $F(x_{k+1}) = y(m_{k+1})$. We put $x = \sup\{x_k \mid k \ge k_0\}$. Using the continuity of $F$,

$$F(x) = F(\sup\{x_k \mid k \ge k_0\}) = \sup\{F(x_k) \mid k \ge k_0\} = \sup\{y(m_k) \mid k \ge k_0\} = y.$$

It remains to be proven that x E C. Suppose, by absurdity, C = A W \ Ui?OWiAw. There is an index j E N such that x Let k ~ ko with nk ~ IWjl; obviously, Wj

----+

that x rt E wjAw. For each Xk rt M tk , 0

Aoo such that

$A^\omega = F(\mathrm{rand})$.

Proof. The set rand contains a constructive closed set of measure greater than $1 - Q^{-1}$. □

We close this section with some results analysing Theorem 6.73. First we show that the result is false in case $C$ is closed, but not constructively closed.

Proposition 6.90. For each natural $n \ge 1$ there exists a closed set $C \subset A^\omega$ with $\mu(C) \ge 1 - Q^{-n}$ such that for every computable function $F : A^\infty \to A^\infty$, $A^\omega \ne F(C)$.

Proof. Let $M = \{g : A^* \to A^* \mid g \text{ is computable and prefix-increasing}\}$. Clearly, $M$ is enumerable; we fix a (non-computable) enumeration of $M$, $M = \{g_i \mid i \ge 0\}$. Let $G_i : A^\infty \to A^\infty$ be the extension of $g_i$ to $A^\infty$, as defined in Lemma 6.71. For all $n, i \in \mathbb{N}$ and $y \in A^{n+i+1}$ we put

$$O_y = \{x \in A^\omega \mid y <_p G_i(x)\}.$$

Each set $O_y$ is open ($G_i$ is continuous). Furthermore, for all naturals $n, i$ there exists a string $y_{n,i} \in A^{n+i+1}$ with $\mu(O_{y_{n,i}}) \le Q^{-(n+i+1)}$. Indeed, suppose that for some natural $n, i$ and all strings $y \in A^{n+i+1}$ one has $\mu(O_y) > Q^{-(n+i+1)}$. The sets $(O_y)$, $y \in A^{n+i+1}$, are disjoint, so

$$1 = \mu(A^\omega) \ge \mu\Bigl(\bigcup_{|y| = n+i+1} O_y\Bigr) = \sum_{|y| = n+i+1} \mu(O_y) > \sum_{|y| = n+i+1} Q^{-(n+i+1)} = 1,$$

a contradiction. We now fix $n \in \mathbb{N}$ and let

$$C = A^\omega \setminus \bigcup_{i \ge 0} O_{y_{n,i}}.$$

The set $C$ is closed (but not constructively closed). Next,

$$\mu(C) \ge 1 - \sum_{i \ge 0} Q^{-(n+i+1)} = 1 - Q^{-n}/(Q - 1) \ge 1 - Q^{-n}.$$

Let $F : A^\infty \to A^\infty$ be a computable function. From Lemma 6.72 there exists a computable and increasing function $g_i : A^* \to A^*$ such that $F(x) = G_i(x)$, for all $x \in A^\omega$. Finally, $C \cap O_{y_{n,i}} = \emptyset$ implies $G_i(C) \cap y_{n,i}A^\omega = \emptyset$, i.e. $F(C) \cap y_{n,i}A^\omega = \emptyset$; this shows that $F(C) \ne A^\omega$. □

Next we show that the quantitative condition $\mu(C) > k_0^{-1}$ is not necessary.

Proposition 6.91. Assume that $Q > 2$ and let $B = \{a_1, a_2, \ldots, a_{Q-1}\} \subset A$ and $C = B^\omega \subset A^\omega$. Then $C$ is a constructive null set and there is a process (which can be effectively constructed) $F : A^\omega \to A^\omega$ such that $F(C) = A^\omega$.

Proof. A straightforward computation shows that

$$\mu(C) = 1 - \mu(B^* a_Q A^\omega) = 1 - \mu\Bigl(\bigcup_{n \ge 0} B^n a_Q A^\omega\Bigr) = 0.$$

Next we define the computable functions $G : A^* \to \{a_1, a_2\}^*$ and $g : \{a_1, a_2\}^* \to A^*$ as follows: $G$ is a monoid morphism acting on generators by $G(a_i) = a_i$, $i = 1, 2$, $G(a_i) = \lambda$, $2 < i \le Q$, and

$$g(x) = \begin{cases} a_i G(y), & \text{in case } x = w_i y,\ 1 \le i \le Q,\ y \in \{a_1, a_2\}^*, \\ \lambda, & \text{otherwise.} \end{cases}$$

Here $w_1 = a_1$, $w_2 = a_2 a_1$, ..., $w_{Q-1} = a_2^{Q-2} a_1$, $w_Q = a_2^{Q-1}$.

The definition of $g$ is correct since the set $\{w_i \mid 1 \le i \le Q\}$ is prefix-free (more exactly, for every $x \in \{a_1, a_2\}^\omega$ there exists a unique $1 \le i \le Q$ such that $x \in w_i\{a_1, a_2\}^\omega$). We define the computable function $F : A^* \to A^*$, $F(x) = g(G(x))$. Clearly, $F$ is prefix-increasing, so according to Lemma 6.71 the extension $F : A^\infty \to A^\infty$ is a process. For every

we can construct the sequence

for which the following relations hold true:

$$F(x) = \sup\{F(x(n)) \mid n \ge 1\} = \sup\{g(x(n)) \mid n \ge 1\} = y.$$

□
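The idea behind this proof, namely that sequences over the two letters $a_1, a_2$ can encode sequences over all $Q$ letters through the prefix-free code $w_1, \ldots, w_Q$, can be sketched in Python. The sketch below rests on stated assumptions: symbols $a_i$ are represented by the integers $1, \ldots, Q$; the code words are taken to be $w_i = a_2^{i-1} a_1$ for $i < Q$ and $w_Q = a_2^{Q-1}$; and $g$ is read as decoding its input code word by code word (a recursive reading of the definition, which is an assumption of this illustration). All function names are hypothetical.

```python
def codeword(i, Q):
    """Assumed code word w_i over {1, 2}: (i-1) twos then a one; w_Q is Q-1 twos."""
    return [2] * (i - 1) + [1] if i < Q else [2] * (Q - 1)


def erase(x):
    """The morphism G: keep the symbols a_1, a_2 and erase all others."""
    return [s for s in x if s in (1, 2)]


def decode(u, Q):
    """Decode a string over {1, 2} code word by code word; an incomplete tail
    produces no output, which keeps the map prefix-increasing."""
    out, run = [], 0
    for s in u:
        if s == 1:
            out.append(run + 1)  # w_{run+1} = 2^run 1, with run <= Q - 2
            run = 0
        else:
            run += 1
            if run == Q - 1:     # w_Q = 2^(Q-1)
                out.append(Q)
                run = 0
    return out


def F(x, Q):
    """Sketch of F = g o G on finite strings."""
    return decode(erase(x), Q)


def encode(y, Q):
    """A preimage over {a_1, a_2} of the string y over {a_1, ..., a_Q}."""
    return [s for i in y for s in codeword(i, Q)]


if __name__ == "__main__":
    Q = 3
    y = [1, 3, 2, 2, 1]
    x = encode(y, Q)      # uses only a_1 and a_2, so x avoids a_Q
    print(x, F(x, Q))
```

Since `encode` never uses the symbol $a_Q$, every finite target string has a preimage among the strings over $B$, which is the finite counterpart of $F(C) = A^\omega$.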


Is it possible to replace the measure-theoretical condition in Theorem 6.73 by a more general condition not involving the measure? The answer is affirmative and a result in this sense will be presented in what follows. Let $\Sigma$ and $\Gamma$ be two fixed alphabets having $p$ and $q$ elements, respectively. If $X \subset \Sigma^\infty$ and $n \in \mathbb{N}$, the set $\{y \in \Sigma^n \mid X \cap y\Sigma^\infty \ne \emptyset\}$ will be denoted by $X^{[n]}$.

Definition 6.92. Let $g : \mathbb{N} \to \mathbb{N}$ be an increasing function and $h : \mathbb{N} \to \mathbb{N}$ be a function with $h(n) \ge 2$, for all $n \in \mathbb{N}$. A set $X \subset \Sigma^\omega$ is called a $(g, h)$-Cantor set if it is non-empty and for each $n \in \mathbb{N}$ and each $x \in X^{[g(n)]}$ we have

$$\#\bigl(x\Sigma^\infty \cap X^{[g(n+1)]}\bigr) \ge h(n+1).$$

A set $X \subset \Sigma^\omega$ is called a computably growing Cantor set if there is a computable increasing function $g : \mathbb{N} \to \mathbb{N}$ such that $X$ is a $(g, 2)$-Cantor set; here 2 is the constant function $h(n) = 2$.
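The branching condition of Definition 6.92 can be checked on finite data. The Python sketch below takes a finite set of strings standing in for $X^{[g(N)]}$ (the prefixes of length $g(N)$ of some set $X$) and verifies the condition for all levels $n < N$. This is only a finite approximation under stated assumptions: the function names are hypothetical, `g` and `h` are ordinary Python functions, and nothing is claimed about levels beyond $N$.

```python
from itertools import product


def prefixes_of_length(strings, n):
    """Finite stand-in for X^[n]: the length-n prefixes of the given strings."""
    return {s[:n] for s in strings if len(s) >= n}


def satisfies_cantor_condition(top_level, g, h, N):
    """Check the (g, h)-Cantor branching condition for levels n = 0, ..., N-1.

    `top_level` plays the role of X^[g(N)]; it must be non-empty.
    """
    if not top_level:
        return False
    for n in range(N):
        lower = prefixes_of_length(top_level, g(n))
        upper = prefixes_of_length(top_level, g(n + 1))
        for x in lower:
            extensions = sum(1 for y in upper if y.startswith(x))
            if extensions < h(n + 1):
                return False
    return True


if __name__ == "__main__":
    full = {"".join(t) for t in product("01", repeat=4)}
    print(satisfies_cantor_condition(full, lambda n: 2 * n, lambda n: 2, 2))
    thin = {"0000", "0011"}
    print(satisfies_cantor_condition(thin, lambda n: 2 * n, lambda n: 2, 2))
    print(satisfies_cantor_condition(thin, lambda n: 4 * n, lambda n: 2, 1))
```

The last two calls show how the choice of $g$ matters: the same two strings fail the condition when levels are two symbols apart and satisfy it when the first level already has length four.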

The main result is the following stronger form of reducibility:

Theorem 6.93 (Hertling). Let $g : \mathbb{N} \to \mathbb{N}$ and $h : \mathbb{N} \to \mathbb{N}$ be two increasing computable functions with $g(0) = h(0) = 0$. Let $C \subset \Sigma^\omega$ be a constructively closed set which contains a $(g, n \mapsto q^{h(n+1) - h(n)})$-Cantor set. Then there is a process $F : \Sigma^\infty \to \Gamma^\infty$ satisfying the following two conditions:

1. $F(C) = \Gamma^\omega$.

2. For all $n \in \mathbb{N}$ and all non-terminal strings for $F$, $x \in \Sigma^*$ with $|x| \ge g(n)$, we have $|F(x)| \ge h(n)$.

Before presenting the proof we will state the following important consequence:

Corollary 6.94. Let $C \subset \Sigma^\omega$ be a constructively closed set which contains a computably growing Cantor set. Then there is a process $F : \Sigma^\infty \to \Gamma^\infty$ with $F(C) = \Gamma^\omega$.

Proof. Assume that $g : \mathbb{N} \to \mathbb{N}$ is a computable increasing function and $X \subset C$ is a $(g, 2)$-Cantor set. Let $c \in \mathbb{N}$ be a number with $2^c \ge q$. We define two functions $\tilde{g}, h : \mathbb{N} \to \mathbb{N}$ by $\tilde{g}(0) = 0$, $\tilde{g}(n) = g(c \cdot n)$, for $n > 0$, and $h(n) = n$ for all $n$. These functions are computable, increasing and satisfy $\tilde{g}(0) = h(0) = 0$. The set $X$ is a $(\tilde{g}, 2^c)$-Cantor set, hence a $(\tilde{g}, n \mapsto q^{h(n+1) - h(n)})$-Cantor set. The corollary follows from Theorem 6.93. □

We continue with the proof of Theorem 6.93. Let $w_0, w_1, w_2, \ldots$ be a computable sequence of strings in $\Sigma^*$ with

For $t \in \mathbb{N}$ we define

$$C_t = \Sigma^\omega \setminus \bigcup_{k < t} w_k \Sigma^\omega.$$
The strategy is the following. We will construct a computable sequence (ft)tEN of computable prefix-increasing functions ft : ~* -+ r* whose extensions map Ct onto rw , f t ( Ct) = rw , and we will show that the function f : ~* -+ r* defined by f(x) = the longest string in {ft(y) I y

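$$p^{g(n+1) - g(n)} \ge q^{h(n+1) - h(n)}, \quad \text{for all } n \in \mathbb{N}. \tag{6.26}$$

The set $X$ will be defined completely by the sets $D_t^n$ defined for $t, n \in \mathbb{N}$ by

$$D_t^n = \{x \in \Sigma^{g(n)} \mid x\Sigma^\omega \subset C_t \text{ or } (x\Sigma^\infty \cap C_t \ne \emptyset \text{ and } \#(x\Sigma^\infty \cap D_t^{n+1}) \ge q^{h(n+1) - h(n)})\}.$$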

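Lemma 6.95.

1. The sets $D_t^n$ are well-defined for all $t, n \in \mathbb{N}$.

2. The set $\{(t, n, x) \in \mathbb{N}^2 \times \Sigma^* \mid x \in D_t^n\}$ is computable.

3. $D_0^n = \Sigma^{g(n)}$, for all $n \in \mathbb{N}$.

4. $D_{t+1}^n \subset D_t^n$, for all $t, n \in \mathbb{N}$.

5. If $x \in D_t^n$, then $\#(x\Sigma^\infty \cap D_t^{n+1}) \ge q^{h(n+1) - h(n)}$, for all $t, n \in \mathbb{N}$.

6. $\lambda \in D_t^0$, for all $t \in \mathbb{N}$.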

Proof. 1. If $|x| \ge \max\{|w_i| \mid i < t\}$, then either $x\Sigma^\omega \subset C_t$ or $x\Sigma^\infty \cap C_t = \emptyset$. Hence, the sets $D_t^n$ are well-defined for $g(n) \ge \max\{|w_i| \mid i < t\}$, so all sets $D_t^n$ are well-defined.

2. The set $\{(t, x) \in \mathbb{N} \times \Sigma^* \mid x\Sigma^\infty \cap \Sigma^\omega \subset C_t\}$ is computable, so the set $\{(t, n, x) \in \mathbb{N}^2 \times \Sigma^* \mid x \in D_t^n\}$ is computable.

3. This follows immediately from $C_0 = \Sigma^\omega$.

4. This follows from $C_{t+1} \subset C_t$ for all $t \in \mathbb{N}$.

5. This follows from (6.26) and the definition of $D_t^n$.

6. We shall show that $X^{[g(n)]} \subset D_t^n$ for all $t, n \in \mathbb{N}$. Indeed, $X \ne \emptyset$ and $g(0) = 0$, hence $\lambda \in X^{[g(0)]}$. For the proof of the inclusion $X^{[g(n)]} \subset D_t^n$ we fix a number $t$ and distinguish the following two cases:

• If $g(n) \ge \max\{|w_i| \mid i < t\}$, then $x \in X^{[g(n)]}$ and $X \subset C \subset C_t$, hence $x\Sigma^\infty \cap C_t \ne \emptyset$. The general assumption (first case) gives $x\Sigma^\infty \cap \Sigma^\omega \subset C_t$, so using the definition of $D_t^n$ we conclude that $x \in D_t^n$.

• If $g(n) < \max\{|w_i| \mid i < t\}$, then $x \in X^{[g(n)]}$ and $X \subset C \subset C_t$ imply $x\Sigma^\infty \cap C_t \ne \emptyset$. Furthermore, $x \in X^{[g(n)]}$ implies by the definition of $X$ that there are at least $q^{h(n+1) - h(n)}$ strings $y$ in the set $x\Sigma^\infty \cap X^{[g(n+1)]}$. By virtue of the induction hypothesis all of these strings lie in $D_t^{n+1}$, so by definition of $D_t^n$, $x \in D_t^n$. □

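Lemma 6.96. There exists a computable function

$$F : \mathbb{N} \times \bigcup_m \Sigma^{g(m)} \to \bigcup_m \Gamma^{h(m)}$$

satisfying the following five properties (1)-(5) for all $t, n \in \mathbb{N}$ and $x, y \in \bigcup_m \Sigma^{g(m)}$. The properties are expressed in terms of the sequence of functions $f_t(x) = F(t, x)$ and the set $L_t^n = \{x \in \Sigma^{g(n)} \mid |f_t(x)| = h(n)\}$.

(1) $f_t(x) <_p f_{t+1}(x)$.

(2) If $x <_p y$, then $f_t(x) <_p f_t(y)$.

(3) If $|x| = g(n)$, then $|f_t(x)| \le h(n)$.

(4) $L_t^n = \{x \in \Sigma^{g(n)} \mid x \text{ is non-terminal for } f_t\}$.

(5) If $x \in D_t^n \cap L_t^n$, then $f_t$ maps $x\Sigma^\infty \cap D_t^{n+1} \cap L_t^{n+1}$ bijectively onto $f_t(x)\Gamma^\infty \cap \Gamma^{h(n+1)}$.

Proof. First we indicate the construction of $F$ via $f_t$. We start by defining $f_0(x)$ for all $x \in \bigcup_m \Sigma^{g(m)}$. We set $f_0(\lambda) = \lambda$ and for $y \in \Sigma^{g(n+1)}$ we put $x = y(g(n))$⁴ and

$$f_0(y) = \begin{cases} \text{the } j\text{th string in } f_0(x)\Gamma^\infty \cap \Gamma^{h(n+1)}, & \text{if } x \in L_0^n \text{ and } y \text{ is the } j\text{th string in } x\Sigma^\infty \cap \Sigma^{g(n+1)}, \text{ for some } j \text{ with } 1 \le j \le q^{h(n+1) - h(n)}, \\ f_0(x), & \text{otherwise.} \end{cases}$$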

In the definition of $f_0$ we have used the lexicographical order. The function $f_0$ is well-defined by (6.26). Now we fix a number $t \in \mathbb{N}$ and construct $f_{t+1}$. We set $f_{t+1}(\lambda) = \lambda$, and for $y \in \Sigma^{g(n+1)}$ we put $x = y(g(n))$ and distinguish three cases. We can assume that $f_t$ has been constructed and $f_{t+1}(x)$ was already defined. We may use the induction hypotheses (1) to (5).

First case: $y \in L_t^{n+1}$. Then we set $f_{t+1}(y) = f_t(y)$.

Second case: $y \notin L_t^{n+1}$ and $x \notin D_{t+1}^n \cap L_{t+1}^n$. Then we set $f_{t+1}(y) = f_{t+1}(x)$.

Third case: $y \notin L_t^{n+1}$ and $x \in D_{t+1}^n \cap L_{t+1}^n$. Let $k = \#(x\Sigma^\infty \cap D_{t+1}^{n+1} \cap L_t^{n+1})$ and $l = q^{h(n+1) - h(n)} - k$. We claim that

(6.27) and

According to our assumption $x \in D_{t+1}^n$ and Lemma 6.95.5 we know that there are at least $q^{h(n+1) - h(n)}$ elements in $x\Sigma^\infty \cap D_{t+1}^{n+1}$, which proves (6.27). The claim (6.28) follows for $k = 0$ immediately from the relation $\#(f_{t+1}(x)) = h(n)$.

⁴In analogy with the notation $u(n) = u_1 u_2 \ldots u_n$ we write $y(n) = y_1 y_2 \ldots y_n$ provided $n \le |y|$.

Assume k #- O. From x~oo n Lr+ 1 #- 0 and #(h(x)) :::; h(n) (induction hypothesis (3)) we deduce that x is non-terminal for ft. By induction hypothesis (4) we obtain #(ft(x)) = h(n). Using ft(x)

n Dr:l n Lr+l) c h(x )rOO n rh(n+l) = fHI (x )rOO n rh(n+I).

According to Lemma 6.95.4 we have Dr:l c Dr+l. Hence, by induction hypothesis (5) the function ft is injective on the set

Thus #(f(x~OO

n D~:l n Lr+I))

=

k,

which implies (6.28). We have proved (6.27) and (6.28). Now we can define fHI(Y). Let ZI, ... , Zz be the lexicographically ordered list of strings in

and let YI, ... , Yz be the first l strings (according to the lexicographical n+1 n Ln+1) order) in (x~OO n Dn+l) HI \ (x~OO n DHI t . We define ifY=Yj, forsomej E {1, ... ,l}, otherwise. This ends the construction of F. It is clear that F is defined on N x Um~g(m), and F(N x Um ~g(m)) C Um rh(m). We will check that F has indeed all properties (1)-(5). In the following we will always assume by the induction hypothesis that any of these conditions is true for smaller values of t or for shorter strings x, y.

(1) We have h()..) = ).. by construction. Hence, (1) is true for x = ).. and all tEN. We fix numbers t, n E N and a string Y E ~g(n+1). It is sufficient to prove the inequality

This is clear by construction of fHI in the first case. We set x = y(g(n)). First, we claim that our assumption Y tt Lr+1 in the second and third case implies h(Y) = ft(x). The assumption Y tt Lr+ 1 implies Ih(y)1 ::; h(n).


We have either Ift(x)1 = h(n), in which case the induction hypothesis (2) implies ft(Y) = ft(x), or Ift(x)1 < h(n), in which case the induction hypothesis (4) implies that x is not non-terminal for ft, hence ft(y) = ft (x). We have proved the first claim: in the second and third cases we have ft(y) = ft(x). By induction hypothesis (1) we have ft(x)

in the second and third cases. (2) For t = 0, (2) follows immediately from the definition of fo and induction. We now fix t, n E N and a string y E L;9(n+I) and set x = y(g(n)). It is sufficient to prove that

In the second and third cases this follows from the definition of ft+1(Y). In the first case we have

by induction hypothesis (2) and the construction of ft+I. Hence it is sufficient to prove ft+1 (x) = ft (x). Indeed, the first case assumption y E L~+I and Ift(x) I ~ h(n) (induction hypothesis (3)) imply that x is non-terminal for it- Using the induction hypothesis (4) we conclude 1ft (x) I = h(n). With ft(x)

In the first case in the definition of ft+1 (y) this follows from ft+I (y) = ft(y) and Ift(y)1 ~ h(n) (induction hypothesis (3)). In the second and third cases this follows from the definition of ft+I(y) and Ift+I(X)1 ~ h(n) (induction hypothesis (3)). (4) For the proof of (4) we need the following property:

Intermediate Step. For all $t, n \in \mathbf{N}$, $L^{n}_{t+1} \subset L^{n}_{t} \cup D^{n}_{t+1}$.

Indeed, the inclusion $L^{0}_{t+1} \subset L^{0}_{t} \cup D^{0}_{t+1}$ is obviously true for any $t \in \mathbf{N}$ because $L^{0}_{t} = \{\lambda\} = D^{0}_{t}$ for all $t$. Now we fix $t, n \in \mathbf{N}$ and a string $y \in L^{n+1}_{t+1}$. We have to show that $y \in L^{n+1}_{t}$ or $y \in D^{n+1}_{t+1}$. Assume $y \notin L^{n+1}_{t}$. We set $x = y(g(n))$. In the construction of $f_{t+1}(y)$ either the second or the third case must be valid. But we cannot have $f_{t+1}(y) = f_{t+1}(x)$ because together with $|f_{t+1}(x)| \leq h(n)$ (induction hypothesis (3)) this would contradict $y \in L^{n+1}_{t+1}$. Hence, $f_{t+1}(y)$ must be defined according to the second subcase of the third case; that is, we have $y = y_j$ for some $y_j \in x\Sigma^{\infty} \cap D^{n+1}_{t+1}$. We have shown that if $y \in L^{n+1}_{t+1} \setminus L^{n+1}_{t}$, then $y \in D^{n+1}_{t+1}$, which proves the statement.

We are now in a position to prove (4). For $t = 0$, (4) follows from the definition of $f_0$ and from $|f_0(x)| \leq h(n)$ for $x \in \Sigma^{g(n)}$ (induction hypothesis (3)). For the case of general $t$, we fix numbers $t, n \in \mathbf{N}$ and a string $x \in \Sigma^{g(n)}$. We will show that

$$x \in L^{n}_{t+1} \ \text{ iff } \ x \text{ is non-terminal for } f_{t+1}.$$

First we assume x (j. Lr+l' Let y be an arbitrary string in xL;oo n L;g(n+1). Then in the definition of fH 1 (y) we are not in the third case. The first case cannot be valid because the first case condition y E L~+1 together with 1ft (x) I ::; h( n) (induction hypothesis (3)) and the induction hypothesis (4) would imply x E Lr. This, together with the relations ft(x)


(3)). We fix numbers t, n E N and fix a string x E Dr+1 n Lf+1. We have to show that ft+1 maps xL;oo n Dftl n L~tl bijectively onto ft+1 (x )rOO n r h (n+1). For elements y E xL;oo n Dftl n Lftl the value ft+1 (y) must be defined according to the first case or the first subcase of the third case (because of the inequality Ift+1 (x) I ::; h( n) following from the induction hypothesis (3)). Hence, the set xL;oo n Dftl n L~tl splits into the set xL;oo n Dftl n Lr+1, on which fHI is defined according to the first case, I on which f HI is n+1 n L HI n+1 \ xL;oo n DHI n+1 n L t n+, and the set xL;oo n DHI defined according to the first sub case of the third case. In the discussion in the third case we have seen that ft (and hence fHI) maps the set xL;oo n Dftl n Lr+ I injectively into the set fHI(X)r OO n r h (n+1). The definition of ft+1 in the third case ensures that indeed fHI maps the set xL;oo n Dftl n L~tII bijectively onto the set ft+1(x)r OO n r h (n+1). 0 We have proved that F has all properties (1) to (5). In view of (1), (2) and (3) the function f(x) defined by

f (x) = the longest string in {it (y) I y

If(x)1

~

h(n) iff x is non-terminal for f.

(6.29)

We fix a string $x \in \Sigma^{*}$ with length $g(n) \leq |x| < g(n+1)$. First assume that $|f(x)| \geq h(n)$. By definition of $f$ we have $f(x) = f(x(g(n)))$ and there must be a number $t$ with $f_t(x(g(n))) = f(x(g(n)))$. By (3) we have $|f_t(x(g(n)))| = h(n)$ and by (4) the string $x(g(n))$ must be non-terminal for $f_t$. Hence, also $x$ is non-terminal for $f$.

To prove the converse implication in (6.29) assume that $x$ is non-terminal for $f$. Then there must be a string $y \in x\Sigma^{\infty} \cap \Sigma^{*}$ with $f(y) \neq f(x)$. By the definition of $f$ we can assume that $y \in \bigcup_{m \geq n+1} \Sigma^{g(m)}$. For large enough $t$ we have $f(x(g(n))) = f_t(x)$ and $f(y) = f_t(y)$. Hence, $x(g(n))$ is non-terminal for $f_t$. By (4) we conclude that $|f_t(x(g(n)))| = h(n)$, and hence also $|f(x)| \geq |f(x(g(n)))| = h(n)$. This ends the proof of (6.29), which implies that $f$ is a process that satisfies the second assertion in Theorem 6.93.

Finally we have to show that $f(C) = \Gamma^{\omega}$. To this end we define the sets

$$M^{n}_{t} = L^{n}_{t} \cap D^{n}_{t}, \quad \text{for } t, n \in \mathbf{N}.$$


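Lemma 6.97. We fix $t, n \in \mathbf{N}$. If $x \in M^{n}_{t} \setminus M^{n}_{t+1}$, then $x \notin M^{n}_{s}$ for all $s > t$.

Proof. If $x \in M^{n}_{t}$, then $x \in L^{n}_{t}$. By (1) and (3) we get $x \in L^{n}_{t+1}$. With $x \notin M^{n}_{t+1}$ we conclude $x \notin D^{n}_{t+1}$. Lemma 6.95.4 implies $x \notin D^{n}_{s}$, for any $s > t$. □

Corollary 6.98. For each $n \in \mathbf{N}$ there is a $t \in \mathbf{N}$ with $M^{m}_{s} = M^{m}_{t}$, for all $s \geq t$ and $m \leq n$.

Proof. The assertion follows from Lemma 6.97 and the fact that each set $M^{m}_{t}$ is a subset of the finite set $\Sigma^{g(m)}$. □

We define the function $s : \mathbf{N} \to \mathbf{N}$ by

$$s(n) = \min\{t \in \mathbf{N} \mid M^{m}_{r} = M^{m}_{t} \text{ for all } r \geq t \text{ and } m \leq n\}.$$

Property (3) implies that $|f(x)| \leq h(m)$ for all $x \in \Sigma^{g(m)}$, $m \in \mathbf{N}$. Hence, the function $f$ coincides with $f_{s(n+1)}$ on the sets $M^{n}_{s(n+1)}$ and $M^{n+1}_{s(n+1)}$, and $|f(x)| = h(n)$ for $x \in M^{n}_{s(n+1)}$, for any $n \in \mathbf{N}$. Applying (5) to $s(n+1)$ we deduce that for each $x \in M^{n}_{s(n+1)}$, the function $f$ maps the set $x\Sigma^{\infty} \cap M^{n+1}_{s(n+1)}$ bijectively onto the set $f(x)\Gamma^{\infty} \cap \Gamma^{h(n+1)}$. Note that $M^{n}_{s(n+1)} = M^{n}_{s(n)}$. We claim that for each $n \in \mathbf{N}$,

$$f \text{ maps } \Sigma^{g(n)} \cap \bigcap_{m \leq n} M^{m}_{s(m)}\Sigma^{\omega} \text{ bijectively onto } \Gamma^{h(n)}. \qquad (6.30)$$

This is clear for $n = 0$ because $g(0) = h(0) = 0$ and $M^{0}_{t} = \{\lambda\}$ for all $t$ ($M^{0}_{s(0)} = \{\lambda\}$). Assume that it is true for $n$. We have proved that for each $x \in \Sigma^{g(n)} \cap \bigcap_{m \leq n} M^{m}_{s(m)}\Sigma^{\omega}$ the function $f$ maps $x\Sigma^{\infty} \cap M^{n+1}_{s(n+1)}$ bijectively onto the set $f(x)\Gamma^{\infty} \cap \Gamma^{h(n+1)}$. This gives the claim (6.30) for $n+1$.

We define the set $Y \subset \Sigma^{\omega}$ by $Y = \bigcap_{n} M^{n}_{s(n)}\Sigma^{\omega}$. By (6.30), $f$ maps $Y$ bijectively onto $\Gamma^{\omega}$. We claim that $Y \subset C$. Let $x \in Y$. Then for every $n$, $x(g(n)) \in M^{n}_{s(n)} \subset D^{n}_{s(n)}$. Hence, $x(g(n))\Sigma^{\omega} \cap C_{s(n)} \neq \emptyset$, so $x(g(n))\Sigma^{\omega} \cap C \neq \emptyset$. Since $C$ is constructively closed we deduce that $x \in C$ and thus $Y \subset C$. This completes the proof of the relation $f(C) = \Gamma^{\omega}$, hence of Theorem 6.93. □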


Comment. Let $\Sigma$ be a finite alphabet. Every constructively closed subset of $\Sigma^{\omega}$ with positive measure contains a computably growing Cantor set. Hence, we can apply Corollary 6.94 in order to obtain for any constructively closed set $C \subset \Sigma^{\omega}$ a process $F$ with $F(C) = \Gamma^{\omega}$, i.e. Theorem 6.73 follows. A sharper constructive result appears in Exercise 6.7.21.

6.6 The Randomness Hypothesis

Some other equivalent definitions of random sequences have been proposed by various authors. In this section we will briefly review some of these characterizations and the "randomness hypothesis" will be stated.

A very interesting approach to randomness, a topological one, has been proposed by Hertling and Weihrauch [235]. We present the main ideas here. A randomness space is a triple $(X, B, \mu)$, where $X$ is a topological space, $B$, a map from $\mathbf{N}$ to the power set of $X$, is a total numbering of a subbase of the topology of $X$, and $\mu$ is a measure defined on the $\sigma$-algebra generated by the topology of $X$.$^5$ Let $(W_n)_n$ be a sequence of open subsets of $X$; a sequence $(V_n)_n$ of open subsets of $X$ is called $W$-computable if there is a c.e. set $A \subset \mathbf{N}$ such that $V_n = \bigcup_{\pi(n,i) \in A} W_i$ for all $n \in \mathbf{N}$.$^6$ Next we define $W'_i = W'(i) = \bigcap_{j \in D(1+i)} W_j$, for all $i \in \mathbf{N}$; here $D : \mathbf{N} \to \{E \mid E \subset \mathbf{N} \text{ is finite}\}$ is the bijection defined by

$$D^{-1}(E) = \sum_{i \in E} 2^{i}.$$

Note that if $B$ is a numbering of a subbase of a topology, then $B'$ is a numbering of a base of the same topology. A randomness test on $X$ is a $B'$-computable sequence $(W_n)_n$ of open sets with $\mu(W_n) \leq 2^{-n}$, for all $n \in \mathbf{N}$. An element $x \in X$ is called random if $x \notin \bigcap_{n \in \mathbf{N}} W_n$, for every randomness test $(W_n)_n$ on $X$.

The simplest example of randomness space is $(\Sigma, B, \mu)$, where $\Sigma = \{s_0, \ldots, s_k\}$ is a finite, non-empty set, the numbering $B$ is given by $B_i = \{s_i\}$ for $i \leq k$ and $B_i = X$ for $i > k$, and the measure $\mu$ is given by $\mu(\{s_i\}) = \frac{1}{k+1}$. Notice that $\mu$ is a probability measure. Every element of $\Sigma$ is random because the measure of any non-empty open set is at least $\frac{1}{k+1}$.

$^5$ Recall that a subbase of a topology is a set $\beta$ of open sets such that the sets $\bigcap_{W \in E} W$, for finite, non-empty sets $E \subset \beta$, form a basis of the topology.

$^6$ $\pi(n, i)$ is a computable bijection; for example, $\pi(n, i) = (n + i)(n + i + 1)/2 + i$.
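The two codings used in the definitions above are easy to make concrete. The following is a minimal sketch, assuming the example pairing $\pi(n, i) = (n + i)(n + i + 1)/2 + i$ from the footnote and the bijection $D$ given by $D^{-1}(E) = \sum_{i \in E} 2^{i}$; the function names `pair`, `unpair`, `D` and `D_inverse` are illustrative choices, not notation from the text.

```python
def pair(n, i):
    """pi(n, i) = (n + i)(n + i + 1)/2 + i, the example pairing from the footnote."""
    return (n + i) * (n + i + 1) // 2 + i


def unpair(z):
    """Inverse of pair: recover (n, i) from z."""
    d = 0  # d = n + i is the largest integer with d(d + 1)/2 <= z
    while (d + 1) * (d + 2) // 2 <= z:
        d += 1
    i = z - d * (d + 1) // 2
    return d - i, i


def D(code):
    """Finite set E of naturals with sum(2**i for i in E) == code."""
    return frozenset(i for i in range(code.bit_length()) if (code >> i) & 1)


def D_inverse(E):
    """D^{-1}(E) = sum of 2**i over i in E."""
    return sum(1 << i for i in E)
```

Since $D(0) = \emptyset$, the index shift in $D(1+i)$ guarantees that every $W'_i$ is an intersection over a non-empty finite set.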

Consider now the topological space $A^{\omega}$ (where $A$ comes equipped with the discrete topology and $A^{\omega}$ is endowed with the product topology) and the numbering $B$ of a subbase (in fact a base) of the topology given by

$$B_i = \mathrm{string}(i)A^{\omega} = \{x \in A^{\omega} \mid \mathrm{string}(i) <_p x\}.$$

Finally, a sequence is Hertling-Weihrauch random if it is random in the space $(A^{\omega}, B, \mu)$. It turns out that the above definition of random sequences coincides with the other definitions presented in this chapter:$^7$

Theorem 6.99. Let $x \in A^{\omega}$. The following statements are equivalent:

1. The sequence $x$ is Martin-Löf random (Definition 6.25, Theorem 6.31).
2. The sequence $x$ is Chaitin random (Theorems 6.35 and 6.38).
3. The sequence $x$ is Solovay random (Theorem 6.31, 6.39).
4. The sequence $x$ is Hertling-Weihrauch random.

In what follows we will simply call "algorithmically random", "random" for short, a sequence satisfying one of the above equivalent conditions. Theorem 6.99 motivates the following "randomness hypothesis": A sequence is "algorithmically random" if it satisfies one of the equivalent conditions in Theorem 6.99.

Various arguments discussed in this chapter support this hypothesis. Here is another argument due to Fouché [194]. If X is a II~ set which contains a random sequence, then it has non-zero measure. So, if a II~ event is reflected in some random sequence, then the event must be probabilistically significant. For a more detailed discussion see Delahaye [164]. In what follows we will drop the adjective "algorithmic" and refer to random sequences/reals. Finally, are there "natural" examples of random sequences? A detailed answer to this question will be given in the next chapter.

$^7$ Other equivalent characterizations, including variants of Theorem 6.35 in which the program-size complexity is replaced by the "monotonic complexity" or the "a priori entropy", are presented in Li, Vitanyi [282], Delahaye [164], Uspensky [407], Uspensky, Shen [409], Vereshchagin [415].

6.7 Exercises and Problems

1. Suppose that $x \in (0,1)$ is expressed in the scale of $Q$ (i.e. with digits $0, 1, \ldots, Q-1$), and the digit $a$ occurs $n_a$ times in the first $n$ places in the sequence of digits of $x$. If $n^{-1} n_a \to \beta$ when $n \to \infty$, then we say that $a$ has frequency $\beta$ in $x$. Borel called $x$ simple normal if for every digit $a$

$$\lim_{n \to \infty} \frac{n_a}{n} = \frac{1}{Q}.$$

a) Show that almost all reals are simple normal in every scale. b) Show that the set of all Borel normal sequences (in some fixed scale) is a first Baire category set.

2. Show that Champernowne's binary sequence is normal in base 2 and Champernowne's decimal sequence is normal in the scale of 10. It seems that these sequences are not normal in any other scales except powers of their bases.

3. Show that the sequence of primes $23571113171923\ldots$ is Borel normal in the scale of 10.

4. Assume that $F$ is a small function taking at most finitely many zero values and having a computable domain. Prove: a) For every universal computer $\psi$ and every sequence $x$, one has $(x(n), F(n)) \in V(\psi)$, for infinitely many $n \in \mathbf{N}$. b) For every natural $k \geq 1$ and every sequence $x$, one has $(x(n), k) \in V(\psi)$, for infinitely many $n \in \mathbf{N}$.

5. Let $U$ be a universal Martin-Löf test and $F$ a small function with computable domain, taking at most finitely many zero values. Prove: a) For every $x \in A^{\omega}$, $(x(n), F(n)) \in U$, for infinitely many natural $n$. b) For every $x \in A^{\omega}$, and every natural $k \geq 1$, $x(n) \in U_k$, for infinitely many natural $n$.

6. Let $U$ be a universal Martin-Löf test and let $F$ be a small function with a computable domain. Show that for every sequence $x$ one has $m_U(x(n)) \geq F(n)$, for infinitely many $n \in \mathrm{dom}(F)$. In particular, for every natural $k \geq 1$, $m_U(x(n)) \geq k$, for infinitely many natural $n$.

7. Show that no universal Martin-Löf test is sequential. In particular, the universal Martin-Löf test $V(\psi)$, where $\psi$ is a universal computer, is not sequential.


8. Show that among the computable sequential Martin-Löf tests there is no universal one.

9. Prove that $A^{\omega}$ with the product topology induced by the discrete topology on $A$ is metrizable and complete under the metric

$$d(x, y) = \sum_{n=1}^{\infty} \frac{|x_n - y_n|}{1 + |x_n - y_n|} \, 2^{-n}.$$

10. Give an example of a first category set which is not a constructive first category set.

11. Show that the set $A^{\omega} \setminus \mathrm{rand}$ has the power of the continuum.

12. Show that for every sequential Martin-Löf test $V$, the set $\mathrm{rand}(V)$ has the power of the continuum.

13. Show that for every computable function $f : A^{*} \to A$ and each random sequence $x$, the set $\{n \geq 1 \mid f(x(n)) = x_{n+1}\}$ is finite.

14. Show that the set of all sequences $x \in A^{\omega}$, for which there exists a constant $c$ and infinitely many natural $n$ such that $K(x(n)) \geq n - c$, has measure one.

15. Show that if there is a constant $c > 0$ such that $K(x(n)) \geq n - c$, for infinitely many natural $n$, then $x$ is random.

16. Let $f : \mathbf{N} \to \mathbf{N}$ be a function such that the series

$$\sum_{n=1}^{\infty} Q^{-f(n)}$$

is convergent. Show that the set

$$\{x \in A^{\omega} \mid K(x(n)) \geq n - f(n), \text{ for all but finitely many } n\}$$

has measure one.

17. Let $f : \mathbf{N} \to \mathbf{N}$ be a computable function such that the series

$$\sum_{n=1}^{\infty} Q^{-f(n)}$$

is constructively convergent. If the sequence $x$ is random, then $K(x(n)) \geq n - f(n)$ for all but finitely many natural $n$.

18. Show that the set $\{x \in A^{\omega} \mid \text{there is a natural } c \text{ such that } K(x(n)) > n - c, \text{ for infinitely many } n\}$ has measure one.


19. A p.c. function tp : A* ~ A* is called a monotonic function (Zvonkin and Levin [455]) or a process (Schnorr [359]) if tp(x)

[F(x)1 :2: n -

3y'nlogQ n

> 0 such that on

+ c.

(Hint: there exists a constant $c > 0$ such that for all natural $k \geq k_0$, if $n < n_{k+1}$, then $m_k \geq n - 3\sqrt{n}\,\log_Q n + c$.)

21. (Hertling) Let $\Sigma$ and $\Gamma$ be two alphabets with $p$ and $q$ elements, respectively. Let $C \subset \Sigma^{\omega}$ be a constructive closed set with positive measure. Prove that for every $\varepsilon > 0$ there exist a constant $c$ and a process $F : \Sigma^{\infty} \to \Gamma^{\infty}$ with $F(C) = \Gamma^{\omega}$ and

$$|F(x)| \geq \log_q p \cdot |x| - (2 + \varepsilon) \cdot \log_q p \cdot \sqrt{|x| \cdot \log_p |x|} - c,$$

for all non-terminating strings $x \in \Sigma^{+}$ for $F$.

6.8 History of Results

Borel [40, 41] was probably the first author who systematically studied the random sequences. He was followed by von Mises who, starting in 1919, tried to base probability theory on random sequences (Kollectives) [421, 422]. Von Mises' path has been followed by many authors, notably Church [141] and Wald [427]; see also Ville [418]. The oscillation of the complexity of strings in arbitrary sequences was discovered by Chaitin [111] and Martin-Löf [304]; for alternative proofs see Katseff [248], and Calude and Chitescu [71] (our presentation follows [71]). Various equivalent definitions of random sequences come from Martin-Löf [302, 301], Chaitin [110, 111, 113, 114, 118, 121, 122, 123, 125], Solovay (quoted in [121]), Schnorr [360], Levin [277] and Gacs [200]. Independent proofs of the equivalence between Martin-Löf and Chaitin definitions have been obtained by Schnorr and Solovay, cf. [121, 133]. Martin-Löf [302] has proven that, in a constructive measure-theoretical sense, almost all sequences are random; the computational and topological properties of random sequences come from Calude and Chitescu [72, 69]. For more facts concerning the property of Borel normality see Copeland and Erdos [146], Kuipers and Niederreiter [268] and Niven and Zuckerman [320]. Chaitin [111] investigated the Borel normality property for the first time for random sequences; he proved that any Omega Number is Borel normal in any base; this result was generalized for all numbers having a random sequence of digits in Calude [53]; see also Campeanu [108]. The Reducibility Theorem is due to Kucera [265] and Gacs [202]; we have followed the proof in Mandoiu [295]. Theorem 6.93 was proved by Hertling [232]. Chaitin's Omega Numbers, discovered by Chaitin in [114], are the first "concrete" examples of numbers having a random binary expansion. Omega Numbers have received a great deal of attention; see, for instance, Barrow [15], Bennett and Gardner [32], Casti [103, 104], Davies [155]. We will devote most parts of Chapters 7 and 8 to Omega Numbers. Exercises 6.7.4-8 come from Calude and Chitescu [71]. We have followed Martin-Löf [304] for Exercises 6.7.15-18 and Gacs [202] for Exercise 6.7.20. Exercise 6.7.21 comes from Hertling [232].

More details can be found in Arslanov [6], Calude [51], Calude and Chitescu [69], Chaitin [110, 111, 114, 118, 121, 122, 123], Calude, Hromkovic [86], Davie [154], Cover [150], Cover, Gacs and Gray [151], Dellacherie [166], Fine [197], Gacs [201, 203], Gewirtz [208], Khoussainov [253], Knuth [255], Kolmogorov and Uspensky [261], Kramosil [263], Kramosil and Sindelar [264], Levin [277, 278], Li and Vitanyi [280, 282], Marandijan [297], Martin-Löf [301, 302], Mendes-France [311], Schnorr [359, 361], Sipser [367], Svozil [391], van Lambalgen [411, 412], von Mises [421, 422], Vereshchagin [415] and Zvonkin and Levin [455]. The randomness hypothesis has been proposed and discussed by Delahaye [164], and, independently, by Calude [59].

Interesting non-technical discussions pertaining to randomness in general and random sequences in particular may be found in Barrow [15], Beltrami [25], Bennett and Gardner [32], Casti [103, 104], Chown [139, 139], Davies [155], Davies and Gribbin [156], Davis [157], Davis and Hersh [160], Delahaye [165], Pagels [328], Paulos [329], Rucker [349, 350], Ruelle [351], Stewart [380] and Tymoczko [406]. More references and applications will be cited in Chapter 9.

Chapter 7

Computably Enumerable Random Reals

Not everything that can be counted counts, and not everything that counts can be counted.
Albert Einstein

In this chapter we will introduce and study the class of c.e. random reals. A key result will show that this class coincides with the class of all Chaitin's Omega Numbers.

7.1 Chaitin's Omega Number

In this section we briefly study Chaitin's random number $\Omega_U$ representing the halting probability of a universal Chaitin computer $U_\lambda$. Recall that

$$\Omega_U = \sum_{u \in \mathrm{dom}(U_\lambda)} Q^{-|u|}$$

is the halting probability of a universal Chaitin computer $U$ with null-free data ($= \lambda$). In contexts in which there is no danger of confusion we will write $U, M, C$ instead of $U_\lambda, M_\lambda, C_\lambda$.

Let $A_Q = \{0, 1, 2, \ldots, Q-1\}$ and $f : \mathbf{N}_+ \to A_Q^{*}$ be an injective computable function such that $f(\mathbf{N}_+) = \mathrm{dom}(U_\lambda)$ and put

$$\omega_k = \sum_{i=1}^{k} Q^{-|f(i)|}. \qquad (7.1)$$

It is clear that the sequence $(\omega_k)_{k \geq 0}$ increasingly converges to $\Omega$. Let

$$\Omega = \Omega_U = 0.\Omega_1 \Omega_2 \ldots \Omega_n \ldots$$

be the non-terminating base $Q$ expansion of $\Omega$ (at this moment we do not know that $\Omega$ is actually an irrational number!) and put

$$\Omega(i) = 0.\Omega_1 \Omega_2 \ldots \Omega_i.$$

Lemma 7.1. If $\omega_n \geq \Omega(i)$, then

$$\Omega(i) \leq \omega_n < \Omega < \Omega(i) + Q^{-i}.$$

Proof. The inequalities follow from the following simple fact:

$$Q^{-i} \geq \sum_{j=i+1}^{\infty} \Omega_j Q^{-j},$$

as $\Omega_j \in \{0, 1, 2, \ldots, Q-1\}$. □
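The partial sums $\omega_k$ of (7.1) can be illustrated on a toy example. The sketch below is only an illustration under strong assumptions: `toy_domain` is a hypothetical finite prefix-free set of binary programs listed in some enumeration order, not the domain of a universal Chaitin computer (which is infinite and only computably enumerable), and the helper names are invented for this example.

```python
from fractions import Fraction


def is_prefix_free(programs):
    """True if no program is a proper prefix of another one."""
    return not any(p != q and q.startswith(p) for p in programs for q in programs)


def omega_approximations(programs, Q=2):
    """Partial sums omega_k = sum_{i <= k} Q**(-|f(i)|), where f(i) = programs[i - 1]."""
    total = Fraction(0)
    sums = []
    for u in programs:
        total += Fraction(1, Q ** len(u))
        sums.append(total)
    return sums


# Hypothetical enumeration f(1), f(2), ... of a finite prefix-free set.
toy_domain = ["10", "0", "1101", "111"]
approximations = omega_approximations(toy_domain)
```

For this toy set the sums increase and stay below 1, as Kraft's inequality guarantees for any prefix-free set. For a universal $U$ the same sums increase towards $\Omega_U$, but no computable bound on the remaining tail is available.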

Theorem 7.2 (Chaitin). The sequence $r_Q(\Omega) \in A_Q^{\omega}$ is random.

Proof. We define a Chaitin computer $M$ as follows: given $x \in A_Q^{*}$ we compute $y = U(x)$ and the smallest number (if it exists) $t$ with $\omega_t \geq 0.y$. Let $M(x)$ be the first (in quasi-lexicographical order) string not belonging to the set $\{U(f(1)), U(f(2)), \ldots, U(f(t))\}$ if both $y$ and $t$ exist, and $M(x) = \infty$ if $U(x) = \infty$ or $t$ does not exist. If $M(x) < \infty$ and $x'$ is a string with $U(x) = U(x')$, then $M(x) = M(x')$. Applying this to an arbitrary $x$ with $M(x) < \infty$ and to the canonical program $x' = (U(x))^{*}$ of $U(x)$ yields

$$H_M(M(x)) \leq |x'| = H_U(U(x)). \qquad (7.2)$$

Furthermore, by the universality of $U$ there is a constant $c > 0$ with

$$H_U(M(x)) \leq H_M(M(x)) + c \qquad (7.3)$$

for all $x$ with $M(x) < \infty$. Now, we fix a number $n$ and assume that $x$ is a string with

$$U(x) = \Omega_1 \Omega_2 \ldots \Omega_n.$$

Then $M(x) < \infty$. Let $t$ be the smallest number (computed in the second step of $M$) with $\omega_t \geq 0.\Omega_1 \Omega_2 \ldots \Omega_n$. Using Lemma 7.1 we have

$$0.\Omega_1 \Omega_2 \ldots \Omega_n \leq \omega_t < \omega_t + \sum_{s=t+1}^{\infty} Q^{-|f(s)|} = \Omega_U < 0.\Omega_1 \Omega_2 \ldots \Omega_n + Q^{-n}. \qquad (7.4)$$

Hence

$$\sum_{s=t+1}^{\infty} Q^{-|f(s)|} \leq Q^{-n}.$$

This implies $|f(s)| \geq n$, for every $s \geq t + 1$. From the construction of $M$ we conclude that $H_U(M(x)) \geq n$. Using (7.3) and (7.2) we obtain

$$n \leq H_U(M(x)) \leq H_M(M(x)) + c \leq H_U(U(x)) + c = H_U(\Omega_1 \Omega_2 \ldots \Omega_n) + c,$$

which proves that the sequence $r_Q(\Omega) = \Omega_1 \Omega_2 \ldots$ is random. □

In what follows we shall call $\Omega_U$ Chaitin's Omega Number, in short, Omega Number. As pointed out in Theorem 7.2, $\Omega_U$ is a natural example of a number having a random sequence of digits (in base $Q$). The following properties of $\Omega_U$ follow immediately:

Corollary 7.3. A Chaitin's Omega Number is a transcendental number in the interval $(0, 1)$.

Corollary 7.4. Every Chaitin's Omega Number is Borel normal in base $Q$.

7.2 Is Randomness Base Invariant?

In this section we deal with the question of robustness of the definition of random sequences, a natural test of the validity of the Randomness Hypothesis. In what follows we will confine ourselves to only one aspect, namely the question: "Is randomness an invariant for the natural representation of numbers?"

A given real number may be represented in many different ways. In what follows we focus on the usual natural (positional) representations of numbers. Even for these representations, only very little is known about the connection between combinatorial properties of the representations of a number and properties of the number itself. We know of only one major exception: a real number is rational iff its natural representation is ultimately periodic. This statement is true regardless of the base.$^1$

It seems natural to ask the following question: "For a given class of number representations $R$, which combinatorial properties of number representations in $R$ are invariant under transformations between representations?" If $P$ is such an invariant property, $r \in R$ is a number representation, and $s$ is a real number, then a representation $r(s)$ of $s$ according to $r$ has property $P$ iff for every $r' \in R$, the representation $r'(s)$ of $s$ according to $r'$ has property $P$. Thus, relative to the class $R$, the property $P$ can be considered as a property of the numbers themselves rather than of their representations. Of course, in formulating the above question one has to be slightly more careful as numbers may have more than one representation for a fixed representation system $r$.

Without loss of generality, we consider only numbers in the open interval $(0,1)$ in the sequel; that is, we ask the following question: "Assume that the natural positional representation of a number $s \in (0,1)$ at one base is an infinite random sequence; is the natural positional representation of this number at any other base also an infinite random sequence?" Intuitively, the answer is affirmative.

The intuition seems to be based on two kinds of arguments. First, the base transformation is a computable function which gives equal "preference" to all digits and cannot do much harm to a random sequence. The flaw with this argument is that even very simple computable functions can easily destroy much of randomness, as shown in this chapter. The second intuitive argument is that for a base transformation there always is an inverse base transformation, and if the first one destroys randomness the second one cannot recover it. To cast this idea into rigorous terms will be one of the main tasks of the present section.

It should be mentioned that the main difficulty comes from the fact that there is no (total) computable continuous transformation capable of carrying, in the limit, numbers in some base into another base. The lack of uniformity could be avoided just by using partial transformations; this option raises some technical difficulties. The intuitive answer is, nevertheless, correct. We prove that, for the class of natural representations, randomness is a property of numbers rather than their representations.

We shall again use the alphabet $A_Q = \{0, 1, \ldots, Q-1\}$. The elements of $A_Q$ are to be considered as the digits used in natural positional representations of numbers in the open interval $(0,1)$ at base $Q$, $Q > 1$. Thus, an element $a \in A_Q$ denotes both the symbol used in number representations and the numerical value in the range from $0$ to $Q - 1$ which it represents. The value of a string $x_1 x_2 \ldots x_n \in A_Q^{*}$ is

$$v_Q(x_1 x_2 \ldots x_n) = \sum_{i=1}^{n} x_i Q^{-i}.$$

With a sequence $x = x_1 x_2 \ldots \in A_Q^{\omega}$ one associates its value

$$v_Q(x) = \sum_{i=1}^{\infty} x_i Q^{-i}. \qquad (7.5)$$

Clearly, $v_Q(x(n)) \to v_Q(x)$ as $n \to \infty$.

If $v_Q(x)$ is irrational, then $v_Q(x') = v_Q(x)$ implies $x' = x$. On the other hand, for rational numbers there sometimes are two different natural positional representations. Since we are considering randomness properties of natural positional representations of numbers and since the natural positional representations of rational numbers are far from being random, this will not cause a problem in the sequel. Let $I$ denote the set of irrational numbers in $(0,1)$. Let $r_Q$ be defined on $I$ as the inverse of $v_Q$; that is, for an irrational number $s \in (0,1)$, $r_Q(s)$ is the unique infinite sequence over $A_Q$ such that $s = v_Q(r_Q(s))$.

$^1$ For continued fraction representations we have more results: 1) A real number is rational iff its continued fraction representation terminates. 2) A real number is quadratic irrational, i.e. solution of a quadratic equation with integer coefficients, but not rational iff its continued fraction representation is ultimately periodic.
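The value function and the remark about rationals having two representations can be checked with exact arithmetic. This is a small sketch; the helper names `value` and `value_eventually_periodic` are illustrative and do not come from the text, and sequences are restricted to eventually periodic ones so that their values can be computed exactly.

```python
from fractions import Fraction


def value(digits, Q):
    """v_Q(x_1 ... x_n) = sum of x_i * Q**(-i) for a finite digit string."""
    return sum((Fraction(d, Q ** (i + 1)) for i, d in enumerate(digits)), Fraction(0))


def value_eventually_periodic(prefix, period, Q):
    """Exact v_Q of the infinite sequence: prefix followed by period repeated forever."""
    n, p = len(prefix), len(period)
    # Geometric series: one period contributes value(period) * Q**(-n),
    # and each further repetition is scaled by Q**(-p).
    tail = value(period, Q) * Fraction(1, Q ** n) / (1 - Fraction(1, Q ** p))
    return value(prefix, Q) + tail


# The rational 1/2 has two binary representations: 1000... and 0111...
half_a = value_eventually_periodic([1], [0], 2)
half_b = value_eventually_periodic([0], [1], 2)
```

Both `half_a` and `half_b` evaluate to 1/2, which is why $r_Q$ is defined only on the irrationals $I$, where the representation is unique.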

First we prove that, if the natural positional representation with respect to a base $Q$ is random, then its natural representation is also random with respect to any other base $p$. The proof is achieved in two steps. First, we consider the case of $p = Q^{m}$ for any $m \in \mathbf{N}$. Secondly, we consider the case of $p = Q - 1$. When combined, this allows for the transition between any two bases.

The transition from $Q$ to $Q^{m}$ is intuitively very simple. In $x \in A_Q^{\omega}$ successive strings of length $m$ are considered as symbols in $A_{Q^m}$. In this case the number representations do not play a role at all (see Theorem 6.58). To avoid any ambiguity we shall denote by $\mathrm{rand}(A_Q^{\omega})$ the set of all random sequences over the alphabet $A_Q$.

Theorem 7.5. Let $s \in I$ and $Q \in \mathbf{N}$ with $Q \geq 2$. Then

$$r_Q(s) \in \mathrm{rand}(A_Q^{\omega}) \ \text{ iff } \ r_{Q^m}(s) \in \mathrm{rand}(A_{Q^m}^{\omega}), \text{ for all } m \in \mathbf{N}.$$

Proof. Let $m \in \mathbf{N}$, $m > 1$, and let $\alpha_m : A_Q^{m} \to A_{Q^m}$ be the bijection defined by

$$\alpha_m(0^{m}) = 0, \ \alpha_m(0^{m-1}1) = 1, \ \ldots, \ \alpha_m((Q-1)^{m}) = Q^{m} - 1,$$

that is, for $w \in A_Q^{m}$, $\alpha_m(w) = Q^{m} v_Q(w)$. One extends $\alpha_m$ to a bijection of $(A_Q^{m})^{\omega}$ onto $A_{Q^m}^{\omega}$ by

$$\alpha_m(x_1 x_2 \ldots) = \alpha_m(x_1 \ldots x_m)\,\alpha_m(x_{m+1} \ldots x_{2m}) \ldots$$

for $x = x_1 x_2 \ldots \in A_Q^{\omega}$. Let $s \in I$ and $y = r_Q(s) \in A_Q^{\omega}$. By Theorem 6.58,

$$y \in \mathrm{rand}(A_Q^{\omega}) \ \text{ iff } \ y \in \mathrm{rand}((A_Q^{m})^{\omega}).$$

Moreover,

$$y \in \mathrm{rand}((A_Q^{m})^{\omega}) \ \text{ iff } \ \alpha_m(y) \in \mathrm{rand}(A_{Q^m}^{\omega}),$$

as $\alpha_m$ is a bijection of $A_Q^{m}$ onto $A_{Q^m}$. Clearly, $v_{Q^m}(\alpha_m(y)) = s$. □
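The regrouping map $\alpha_m$ from the proof can be tried out on finite prefixes. The following is a minimal sketch with illustrative names (`alpha`, `regroup`, `value`); it only checks that grouping $m$ base-$Q$ digits into one base-$Q^m$ digit preserves the represented value, and says nothing about randomness itself.

```python
from fractions import Fraction


def value(digits, Q):
    """v_Q of a finite digit string, computed exactly."""
    return sum((Fraction(d, Q ** (i + 1)) for i, d in enumerate(digits)), Fraction(0))


def alpha(block, Q):
    """alpha_m(w) = Q**m * v_Q(w): read a block of m base-Q digits as one base-Q**m digit."""
    digit = 0
    for x in block:
        digit = digit * Q + x
    return digit


def regroup(digits, Q, m):
    """Apply alpha_m blockwise; the length of digits is assumed to be a multiple of m."""
    if len(digits) % m != 0:
        raise ValueError("length must be a multiple of m")
    return [alpha(digits[i:i + m], Q) for i in range(0, len(digits), m)]


binary_prefix = [1, 0, 1, 1, 0, 1]
base4_prefix = regroup(binary_prefix, 2, 2)
```

Here `base4_prefix` is `[2, 3, 1]`, and both prefixes have the value 45/64.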

We now turn to the transition from base $Q + 1$ natural representations to base $Q$. In this case, we need a function that achieves this transition. The obvious idea is to find an injective computable mapping of $A_{Q+1}^{*}$ into $A_Q^{*}$ that preserves the number represented and is continuous in the topology generated by $<_p$. One can prove that such a function does not exist. To understand the reason let us consider, following Staiger [386], the binary and ternary expansions of the real $s = \frac{1}{2}$: $r_2(s) = 100\ldots00\ldots \in \{0,1\}^{\omega}$, $r_3(s) = 11\ldots11\ldots \in \{0,1,2\}^{\omega}$, and we observe that we cannot know the first bit of $r_2(s)$ until we know the whole sequence $r_3(s)$. For more details see Weihrauch [430, 431]. As a consequence, one has to use a function with weaker properties and this leads to more complicated proofs than one would intuitively anticipate.

In the sequel, let $Q \in \mathbf{N}$, $Q \geq 2$. Let

$$D_Q = \{w \in A_{Q+1}^{*} \mid v_{Q+1}(w) \leq 1 - Q^{-|w|}\}.$$

Let $\Gamma = \Gamma_Q$ be the partial mapping of $A_{Q+1}^{*}$ into $A_Q^{*}$ with domain $D_Q$ and defined by

$$\Gamma(w) = \min\{z \in A_Q^{|w|} \mid v_{Q+1}(w) \leq v_Q(z)\},$$

for $w \in D_Q$; here the minimum is taken with respect to the quasi-lexicographical order on $A_Q^{*}$. Clearly, $\Gamma$ is well-defined, $D_Q$ is computable, and $\Gamma$ is a p.c. function.

In several lemmata we state basic properties of natural positional representations and of the mapping $\Gamma$ which are needed to establish our main result. The definition of $\Gamma$ is based on the following idea: from the first $n$ digits of the natural positional representation of a number in $I$ at base $Q + 1$ one can determine the first $n$ digits of its representation at base $Q$. In this sense $\Gamma$ is a continuous (with respect to the topology generated by the prefix-order), "almost" injective, p.c. function which also preserves randomness. This function is not total because of "overflow carries" that would disturb continuity; fortunately, these discontinuities are very rare. The partial function $\Gamma$ is not injective because $A_{Q+1}^{*}$ is much bigger than $A_Q^{*}$.

Lemma 7.6. Let $x \in A_{Q+1}^{\omega}$ and $n \in \mathbf{N}_+$. If $x(n) \in D_Q$, then $x(n+1) \in D_Q$.

Proof. Assume the contrary, i.e. for some $n \in \mathbf{N}_+$, $x(n) \in D_Q$ and $x(n+1) \notin D_Q$. Thus $v_{Q+1}(x(n)) \leq 1 - Q^{-n}$ and $v_{Q+1}(x(n+1)) > 1 - Q^{-n-1}$. Using the fact that the last digit of $x(n+1)$ is at most $Q$, one obtains

$$1 - Q^{-n-1} < v_{Q+1}(x(n+1)) \leq v_{Q+1}(x(n)Q) = v_{Q+1}(x(n)) + \frac{Q}{(Q+1)^{n+1}} \leq 1 - Q^{-n} + \frac{Q}{(Q+1)^{n+1}},$$

and, therefore,

$$\left(\frac{Q+1}{Q}\right)^{n+1} < \frac{Q}{Q-1},$$

a contradiction. □
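The set $D_Q$ and the mapping $\Gamma$ can be computed directly from their definitions. The sketch below reads the definitions literally and relies on one assumption worth stating: among base-$Q$ strings of a fixed length, quasi-lexicographical order agrees with numerical order, so the minimum is obtained by rounding $v_{Q+1}(w) \cdot Q^{|w|}$ up to the next integer. The names `in_D`, `gamma` and `check_lemma_7_6` are illustrative.

```python
from fractions import Fraction
from itertools import product


def value(digits, base):
    """v_base of a finite digit string, computed exactly."""
    return sum((Fraction(d, base ** (i + 1)) for i, d in enumerate(digits)), Fraction(0))


def in_D(w, Q):
    """Membership in D_Q: v_{Q+1}(w) <= 1 - Q**(-|w|)."""
    return value(w, Q + 1) <= 1 - Fraction(1, Q ** len(w))


def gamma(w, Q):
    """Least base-Q string z with |z| = |w| and v_{Q+1}(w) <= v_Q(z); None outside D_Q."""
    if not in_D(w, Q):
        return None
    n = len(w)
    scaled = value(w, Q + 1) * Q ** n
    k = -((-scaled.numerator) // scaled.denominator)  # ceiling; at most Q**n - 1 on D_Q
    digits = []
    for _ in range(n):  # write k with exactly n base-Q digits
        k, d = divmod(k, Q)
        digits.append(d)
    return digits[::-1]


def check_lemma_7_6(Q, max_len):
    """Exhaustively test: w in D_Q with |w| >= 1 implies wa in D_Q for every digit a."""
    for n in range(1, max_len + 1):
        for w in product(range(Q + 1), repeat=n):
            if in_D(w, Q) and not all(in_D(w + (a,), Q) for a in range(Q + 1)):
                return False
    return True
```

For instance, `gamma([1], 2)` is `[1]` (the ternary value 1/3 rounded up to 1/2), while `gamma([2], 2)` is undefined because 2/3 exceeds $1 - 2^{-1}$. The exhaustive check is only a finite sanity test of Lemma 7.6 for small $Q$ and short strings, not a proof.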

Remark. By Lemma 7.6, the set DQ is a computable open set with respect to the topology generated by
Proof If U

U,W

E A

Q.

as VQ+I(O)

= 0 :S 1 _

Q-I.

Then

then, obviously, VQ(U)

C DQ

lui :S Iwi

and

<

vQ(w)

<

vQ( u)

+~

vQ(u)

+ Qlul

Q_ 1 1

(IW!-IUI ~

1) Qi

1 - Qlwl·

Conversely, let U = UI ... Un and W = WI ... wm with UI, ... , Un, WI, ... , Wm E AQ and n :S m. Assume that U is not a prefix of w, that is, there is an i such that i :S nand Ui =I- Wi. Moreover, we may assume that Uj = Wj for all j with j < i. The inequality


implies n

m

O:S 2)Wj - Uj)Qm- j j=i

+

L

WjQm-j:s Qm-n -1,

j=n+l

where (Wi -Ui), ... , (w n -Un) E {-(Q-l), ... , -1,0,1, ... , (Q -I)} and Wn+l, ... ,Wm E AQ.

Suppose that Wi - Ui 2: 1. Then the above inequality implies n

Qm-n -12: Qm-i - (Q -1)

L

Qm- j

= Qm-n,

j=i+l

a contradiction. Similarly, if Wi - Ui :S -1 then, by the same inequality, m

L

O:s _Qm-i + (Q - 1)

Qm- j

= -1,

j=i+l

o

again a contradiction.

Lemma 7.9. Let u, W E A Q+1 and a E AQ such that

U

E

DQ. Then

UW, uwa E DQ and, if

vdr(uwa)) :S vdr(uw)) and

Q-l

+ Qluwl+l 1

1 - Qluwl'

1

1 - Qluwl+l·

vQ(r(uw)) :S vQ(r(u))

+ Qlul

vdr(uwa)) :S vQ(r(u))

+ Qlul

then

Proof. By Lemma 7.6, UW, uwa E DQ. One computes Q-l

< vQ(r(uw)) + Qluwl+l 1

1 Q-l Qluw l + Qluwl+l

1

1 Qluwl+l

< vQ(r(u)) + Qlul < vQ(r(u)) + Qlul as

-1 Q-l 1 Qluwl+l - Qluwl - Qluwl+l·

o


Lemma 7.10. If $u \in D_Q$, $a \in A_Q$, then $ua \in D_Q$ and $v_Q(\Gamma(ua)) \leq v_Q(\Gamma(u)a)$.

Proof. By Lemma 7.6, $ua \in D_Q$. One computes

$$v_Q(\Gamma(u)a) = v_Q(\Gamma(u)) + \frac{a}{Q^{|u|+1}} \geq v_{Q+1}(u) + \frac{a}{Q^{|u|+1}} \geq v_{Q+1}(u) + \frac{a}{(Q+1)^{|u|+1}} = v_{Q+1}(ua)$$

and, therefore, $v_Q(\Gamma(ua)) \leq v_Q(\Gamma(u)a)$ by the definition of $\Gamma$. □

Lemma 7.11. Let $u, w \in A_{Q+1}^*$ be such that $u \in D_Q$. Then $uw, uwQ \in D_Q$ and

$$v_{Q+1}(uwQ) \le v_Q(\Gamma(uw)) + \frac{Q-1}{Q^{|uw|+1}}.$$

Proof. By Lemma 7.6, $uw \in D_Q$ and $uwa \in D_Q$, for all $a \in A_{Q+1}$. By the definition of $\Gamma$ one has

$$v_Q(\Gamma(uw)) \ge v_{Q+1}(uw) = v_{Q+1}(uwQ) - \frac{Q}{(Q+1)^{|uw|+1}}.$$

Moreover, from $Q \ge 2$ and $|uw| + 1 \ge |u| + 1 \ge 2$ it follows that

$$\frac{Q-1}{Q} \ge \left(\frac{Q}{Q+1}\right)^{|uw|+1}$$

and, therefore,

$$\frac{Q-1}{Q^{|uw|+1}} \ge \frac{Q}{(Q+1)^{|uw|+1}}.$$ □

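Lemma 7.12. Let $u \in D_Q$ and $v \in A_{Q+1}^*$ with $u <_p v$. Then $v \in D_Q$ and $\Gamma(u) <_p \Gamma(v)$.

Proof. By Lemma 7.6, $v \in D_Q$. By the characterization of the prefix order in terms of $v_Q$, it remains to show that $|\Gamma(u)| \le |\Gamma(v)|$ and

$$v_Q(\Gamma(u)) \le v_Q(\Gamma(v)) \le v_Q(\Gamma(u)) + Q^{-|\Gamma(u)|} - Q^{-|\Gamma(v)|}.$$

The first inequality follows from $|\Gamma(u)| = |u| \le |v| = |\Gamma(v)|$. The second inequality is equivalent to

$$v_Q(\Gamma(u)) \le v_Q(\Gamma(v)) \le v_Q(\Gamma(u)) + Q^{-|u|} - Q^{-|v|}.$$

From the definition of $\Gamma$ and $u <_p v$ one has

$$v_Q(\Gamma(v)) \ge v_{Q+1}(v) \ge v_{Q+1}(u)$$

and, therefore, $v_Q(\Gamma(v)) \ge v_Q(\Gamma(u))$. Let $v = uw$. We prove the remaining claim, that is, that

$$v_Q(\Gamma(uw)) \le v_Q(\Gamma(u)) + \frac{1}{Q^{|u|}} - \frac{1}{Q^{|uw|}},$$

by induction on the length of $w$. For $|w| = 0$ nothing needs to be proven. Consider $w = w'a$ with $a \in A_{Q+1}$ and assume that

$$v_Q(\Gamma(uw')) \le v_Q(\Gamma(u)) + \frac{1}{Q^{|u|}} - \frac{1}{Q^{|uw'|}}.$$

As $u \in D_Q$, also $uw', uw'a \in D_Q$ by Lemma 7.6. We distinguish two cases. First, assume that $a \ne Q$. Lemma 7.10 implies that

$$v_Q(\Gamma(uw'a)) \le v_Q(\Gamma(uw')a) \le v_Q(\Gamma(uw')) + \frac{Q-1}{Q^{|uw'|+1}}.$$

Using the induction hypothesis and Lemma 7.9, one obtains

$$v_Q(\Gamma(uw'a)) \le v_Q(\Gamma(u)) + \frac{1}{Q^{|u|}} - \frac{1}{Q^{|uw'a|}},$$

as required. Now assume that $a = Q$. By Lemma 7.11 one has

$$v_{Q+1}(uw'Q) \le v_Q(\Gamma(uw')) + \frac{Q-1}{Q^{|uw'|+1}}$$

and, therefore, by the definition of $\Gamma$,

$$v_Q(\Gamma(uw'Q)) \le v_Q(\Gamma(uw')) + \frac{Q-1}{Q^{|uw'|+1}}.$$

Using the induction hypothesis and Lemma 7.9, one again obtains the required inequality. □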

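Lemma 7.13. Let $w \in A_{Q+1}^{\omega}$ be such that $w(n_0) \in D_Q$, for some $n_0 \in \mathbf{N}_+$. Then $\lim_{n \to \infty} \Gamma(w(n))$ exists and

$$v_{Q+1}(w) = v_Q\left(\lim_{n \to \infty} \Gamma(w(n))\right).$$

Proof. In view of Lemma 7.6, for all $n, m \ge n_0$ with $n \le m$ one has $w(n), w(m) \in D_Q$; therefore, $\Gamma(w(n)) <_p \Gamma(w(m))$ by Lemma 7.12, and

$$\lim_{n \to \infty} \Gamma(w(n)) \in A_Q^{\omega}.$$

For $n \ge n_0$ one has

$$v_{Q+1}(w(n)) \le v_Q(\Gamma(w(n))) < v_{Q+1}(w(n)) + Q^{-n}$$

and, therefore,

$$\lim_{n \to \infty} v_{Q+1}(w(n)) \le \lim_{n \to \infty} v_Q(\Gamma(w(n))) \le \lim_{n \to \infty} \left(v_{Q+1}(w(n)) + Q^{-n}\right) = v_{Q+1}(w),$$

as required. □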

Due to cardinality restrictions, the partial function $\Gamma$ cannot be injective. However, each set $\Gamma^{-1}(u)$ is "small".

Lemma 7.14. The partial function $\Gamma$ is surjective and for every string $u \in A_Q^*$ one has

$$\#(\Gamma^{-1}(u)) < \left(\frac{Q+1}{Q}\right)^{|u|} + 1.$$

Proof. For a natural $n$, a string $u \in A_Q^n$ is the image of every string $w \in A_{Q+1}^n$ such that

$$v_{Q+1}(w) \le v_Q(u) < v_{Q+1}(w) + Q^{-n}.$$

As $u$ ranges over $A_Q^n$, its values under $v_Q$ range over the numbers

$$0, Q^{-n}, 2Q^{-n}, \ldots, (Q^n - 1)Q^{-n},$$

and, similarly, as $w$ ranges over $A_{Q+1}^n$ its values under $v_{Q+1}$ range over the numbers

$$0, (Q+1)^{-n}, 2(Q+1)^{-n}, \ldots, ((Q+1)^n - 1)(Q+1)^{-n}.$$

To prove the surjectivity of $\Gamma$ it suffices to show that for every natural $r < Q^n$, there is a natural $t < (Q+1)^n$ such that

$$\frac{t}{(Q+1)^n} \le \frac{r}{Q^n} < \frac{t}{(Q+1)^n} + \frac{1}{Q^n}.$$

Given $r$, the number

$$t = \left\lfloor \frac{r (Q+1)^n}{Q^n} \right\rfloor$$

satisfies the above inequality. Moreover, $0 \le t < (Q+1)^n$, as required. Thus, $\Gamma$ is surjective.

For every $u \in A_Q^n$ with $v_Q(u) = rQ^{-n}$, the size of the set $\Gamma^{-1}(u)$ is bounded by $i + 1$, where $i$ is natural and maximal with the following property: there exists a natural $t < (Q+1)^n$ such that

$$\frac{t}{(Q+1)^n} \le \frac{r}{Q^n} < \frac{t}{(Q+1)^n} + \frac{1}{Q^n}$$

and

$$\frac{t+i}{(Q+1)^n} \le \frac{r}{Q^n} < \frac{t+i}{(Q+1)^n} + \frac{1}{Q^n}.$$

Any such $i$ has to satisfy the inequalities

$$\frac{i}{(Q+1)^n} \le \frac{r}{Q^n} - \frac{t}{(Q+1)^n} < \frac{1}{Q^n}.$$

This implies

$$i < \left(\frac{Q+1}{Q}\right)^{n}.$$ □
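Lemma 7.6 and Lemma 7.14 can be checked exhaustively for small bases and lengths. The sketch below is a hedged illustration, not the book's construction: it assumes that $D_Q$ consists of the strings $w$ over $A_{Q+1}$ with $v_{Q+1}(w) \le 1 - Q^{-|w|}$ (as used in the proofs of Lemma 7.6 and Lemma 7.16) and that $\Gamma(w)$ is the string $u$ of length $|w|$ over $A_Q$ with the least value $v_Q(u) \ge v_{Q+1}(w)$ (as suggested by the proof of Lemma 7.14). All function names are hypothetical.

```python
from itertools import product


def to_int(digits, base):
    """Integer t with v_base(digits) = t / base**len(digits)."""
    t = 0
    for d in digits:
        t = t * base + d
    return t


def in_D(w, Q):
    """Assumed membership test for D_Q: v_{Q+1}(w) <= 1 - Q^(-|w|)."""
    n = len(w)
    # t/(Q+1)^n <= (Q^n - 1)/Q^n, cross-multiplied to stay in exact integers
    return to_int(w, Q + 1) * Q ** n <= (Q ** n - 1) * (Q + 1) ** n


def gamma(w, Q):
    """Assumed Gamma: the length-|w| string u over A_Q with least v_Q(u) >= v_{Q+1}(w)."""
    n = len(w)
    r = -((-to_int(w, Q + 1) * Q ** n) // (Q + 1) ** n)  # ceiling division
    digits = []
    for _ in range(n):
        r, d = divmod(r, Q)
        digits.append(d)
    return tuple(reversed(digits))


def extension_closed(Q, n):
    """Lemma 7.6 on strings: one-digit extensions of w in D_Q (|w| = n) stay in D_Q."""
    return all(in_D(w + (a,), Q)
               for w in product(range(Q + 1), repeat=n) if in_D(w, Q)
               for a in range(Q + 1))


def surjective_with_small_fibres(Q, n):
    """Lemma 7.14 on length n: every u is hit and #fibre < ((Q+1)/Q)^n + 1."""
    counts = {u: 0 for u in product(range(Q), repeat=n)}
    for w in product(range(Q + 1), repeat=n):
        if in_D(w, Q):
            counts[gamma(w, Q)] += 1
    # c < L + 1 with L = ((Q+1)/Q)^n is tested as (c - 1) * Q^n < (Q+1)^n
    return all(c >= 1 and (c - 1) * Q ** n < (Q + 1) ** n
               for c in counts.values())
```

Working with the integer numerators of $v_Q$ and $v_{Q+1}$ avoids any rounding: all comparisons are cross-multiplied.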

Lemma 7.15. Let $S \subset A_Q^*$. If $S$ is prefix-free, then $\Gamma^{-1}(S)$ is also prefix-free.

Proof. Suppose $S$ is prefix-free and consider $u, v \in \Gamma^{-1}(S)$ such that $u$

Lemma 7.16. If $x \in \mathrm{rand}(A_{Q+1}^{\omega})$, then $x(n) \in D_Q$, for some $n \in \mathbf{N}_+$.

Proof. Assume that $x \in \mathrm{rand}(A_{Q+1}^{\omega})$ and $x(n) \notin D_Q$ for all $n \in \mathbf{N}_+$, that is, $v_{Q+1}(x(n)) > 1 - Q^{-n}$. Therefore, $\lim_{n \to \infty} v_{Q+1}(x(n)) = 1$, a contradiction, as $x \in \mathrm{rand}(A_{Q+1}^{\omega})$ implies that $v_{Q+1}(x)$ is irrational. □

Remark. The statement of Lemma 7.16 is actually true for all sequences except the sequence $QQQ\cdots$.

Theorem 7.17. Let $x \in \mathrm{rand}(A_{Q+1}^{\omega})$ and $y \in A_Q^{\omega}$ such that $v_{Q+1}(x) = v_Q(y)$. Then $y \in \mathrm{rand}(A_Q^{\omega})$.

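Proof. We will use Theorem 6.37. We denote by $\mu_{A_Q}$, $\mu_{A_{Q+1}}$, respectively, the uniform measures on the spaces $A_Q^{\omega}$, $A_{Q+1}^{\omega}$. Assume now that $x \in \mathrm{rand}(A_{Q+1}^{\omega})$. Let $S \subset A_Q^* \times \mathbf{N}_+$ be a c.e. set such that every section $S_i$ is prefix-free and

$$\sum_{j \ge 1} \mu_{A_Q}(S_j A_Q^{\omega}) < \infty.$$

Next we construct the set

$$T = \{(u, j) \in A_{Q+1}^* \times \mathbf{N}_+ \mid u \in D_Q, \ \Gamma(u) \in S_j\},$$

so that $T_j = \Gamma^{-1}(S_j)$. Clearly, $T$ is c.e. ($D_Q$ is computable and $S_j$ is c.e.). We shall prove that

$$\sum_{j \ge 1} \mu_{A_{Q+1}}(T_j A_{Q+1}^{\omega}) < \infty.$$

To this end we first note the equality

$$\Gamma^{-1}(S_j) A_{Q+1}^{\omega} = \bigcup_{w \in S_j} \Gamma^{-1}(w) A_{Q+1}^{\omega}.$$

For $w \in S_j$ we have

$$\mu_{A_{Q+1}}\left(\Gamma^{-1}(w) A_{Q+1}^{\omega}\right) = \#(\Gamma^{-1}(w)) (Q+1)^{-|w|} < \left(\left(\frac{Q+1}{Q}\right)^{|w|} + 1\right)(Q+1)^{-|w|} = Q^{-|w|} + (Q+1)^{-|w|} < \frac{2}{Q^{|w|}},$$

as $\Gamma^{-1}(w) \subset A_{Q+1}^{|w|}$ is a prefix-free set, and, by Lemma 7.14,

$$\#(\Gamma^{-1}(w)) < \left(\frac{Q+1}{Q}\right)^{|w|} + 1.$$

Finally,

$$\sum_{j \ge 1} \mu_{A_{Q+1}}(T_j A_{Q+1}^{\omega}) = \sum_{j \ge 1} \mu_{A_{Q+1}}\left(\bigcup_{w \in S_j} \Gamma^{-1}(w) A_{Q+1}^{\omega}\right) = \sum_{j \ge 1} \sum_{w \in S_j} \mu_{A_{Q+1}}\left(\Gamma^{-1}(w) A_{Q+1}^{\omega}\right) \le \sum_{j \ge 1} \sum_{w \in S_j} 2 Q^{-|w|} = 2 \sum_{j \ge 1} \mu_{A_Q}(S_j A_Q^{\omega}) < \infty.$$

We have used Lemma 7.15 for the second equality; the last equality holds true because each section $S_j$ is prefix-free.

By hypothesis, $x$ is random, so there exists a natural $N$ such that for all natural $i \ge N$, $x \notin T_i A_{Q+1}^{\omega} = \Gamma^{-1}(S_i) A_{Q+1}^{\omega}$. We show that $\Gamma(x) \notin S_i A_Q^{\omega}$, for almost all $i$. In view of the convergence of the series $\sum_{j \ge 1} \mu_{A_Q}(S_j A_Q^{\omega})$ it follows that

$$\lim_{m \to \infty} \mu_{A_Q}(S_m A_Q^{\omega}) = 0,$$

so

$$\lim_{m \to \infty} \min\{|w| \mid w \in S_m\} = \infty.$$

Now we use Lemma 7.16 to get the constant $k$ with the property that $x(n) \in D_Q$, for all $n \ge k$. Let $M$ be such that for all $i \ge M$, if $w \in S_i$, then $|w| > k$. For all $i \ge \max\{M, N\}$, if $\Gamma(x) \in S_i A_Q^{\omega}$, then $\Gamma(x)(n) \in S_i$, for some $n \ge k$, and $\Gamma(x)(n) = \Gamma(x(n))$. We deduce that $x(n) \in \Gamma^{-1}(S_i)$, i.e. $x \in \Gamma^{-1}(S_i) A_{Q+1}^{\omega}$, which is absurd. □

By combining Theorem 6.58 and Theorem 7.17 we derive the main result of this section: randomness is invariant with respect to transformations between natural positional representations of numbers in $(0,1)$.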

Theorem 7.18 (Calude-Jürgensen). Let $p, Q \in \mathbf{N}$ with $p, Q \ge 2$ and let $x \in A_p^{\omega}$ and $y \in A_Q^{\omega}$ be such that $v_p(x) = v_Q(y)$. Then

$$x \in \mathrm{rand}(A_p^{\omega}) \text{ iff } y \in \mathrm{rand}(A_Q^{\omega}).$$

Proof. Without loss of generality, assume that $p < Q$. Let $m$ be the smallest integer such that $p^m \ge Q$. By Theorem 6.58, $x \in \mathrm{rand}(A_p^{\omega})$ iff $x \in \mathrm{rand}(A_{p^m}^{\omega})$. Now let $Q = p^m - i$. Applying Theorem 7.17 $i$ times yields $y \in \mathrm{rand}(A_Q^{\omega})$ iff $x \in \mathrm{rand}(A_{p^m}^{\omega})$. □

Corollary 7.19. Let $s \in I$ and $Q \in \mathbf{N}$ with $Q \ge 2$. Then

for all $p \in \mathbf{N}$ with $p \ge 2$.

Proof. The statement is a direct consequence of Theorem 7.5 and Theorem 7.18. □

Comment. A complexity-theoretic proof (based on Theorem 5.68) of Theorem 7.18 has been obtained by Staiger [386]. Random reals can be defined "directly" using the Hertling-Weihrauch topological approach, hence another proof of Theorem 7.18 was obtained: see [235]. Thus, randomness is invariant with respect to the natural positional representations of numbers in $I$.

Definition 7.20. Now consider an arbitrary real number $s$. For $Q \in \mathbf{N}$, $Q \ge 2$, its natural positional representation over $A_Q$ consists of its sign $\mathrm{sgn}(s)$, a string $i_Q(s) \in A_Q^*$ representing the integer part of $s$, a dot, and a sequence $x = F_Q(s) \in A_Q^{\omega}$ representing the fraction part of $s$. We say that $s$ is random (with respect to natural positional representations) if, for some $Q \ge 2$, the sequence $i_Q(s)F_Q(s)$ is in $\mathrm{rand}(A_Q^{\omega})$.

Remark. Note that $s$ is random iff $F_Q(s) \in \mathrm{rand}(A_Q^{\omega})$. Thus, if $s$ is random, then also $Q \cdot s$ and $s/Q$ are random. Theorem 7.18 implies that this concept of a random number is well-defined.

Corollary 7.21. Every random number is Borel normal in any base.

Proof. We use Theorem 6.61 and Theorem 7.18. □

7.3 Most Reals Obey No Probability Laws

Having defined the random reals, the first question which naturally comes to mind is: "How many reals are random?" In measure-theoretical terms the answer is "almost all", using Theorem 6.31.² This gives the intuition that most real numbers are random; they do not satisfy any probability laws. This intuition is not confirmed from a topological point of view: in topological terms the answer is "very few", as the set of random sequences is a first Baire category set by Theorem 6.63. Both results are constructively true. Is there any weaker sense in which the intuition regarding the lack of order of reals can be recaptured? The answer is affirmative and a constructive result can be proved. To obtain it we first need some extra notation.

Recall that for $b \ge 2$, $A_b = \{0, 1, \ldots, b-1\}$. For $u, v \in A_b^*$ the number

$$N_v(u) = \mathrm{card}\{1 \le j \le |u| \mid j \equiv 1 \ (\mathrm{mod}\ |v|), \ u_j u_{j+1} \cdots u_{j+|v|-1} = v\}$$

counts the occurrences of the string $v$ in $u$. As in the case of Borel normality, to compute $P_v(u)$, the relative frequency of the string $v \in A_b^*$ in $u \in A_b^*$, we group the elements of $u$ in blocks of length $|v|$ (we ignore the last block in case it has length less than $|v|$) and we divide the number of occurrences of $v$ in the sequence of blocks by the number of total blocks:

$$P_v(u) = \frac{N_v(u)}{|u|/|v|} = \frac{|v| N_v(u)}{|u|}.$$
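The block-counting convention is easy to get wrong by one position, so a small Python sketch may help. It assumes the reading $P_v(u) = |v| N_v(u) / |u|$ of the displayed formula; the function names `count_blocks` and `block_frequency` are hypothetical, and strings are represented as ordinary Python `str` values over the digit alphabet.

```python
from fractions import Fraction


def count_blocks(v, u):
    """N_v(u): occurrences of v among the consecutive length-|v| blocks of u.

    The 1-based positions j = 1, 1+|v|, 1+2|v|, ... of the definition are the
    0-based offsets 0, |v|, 2|v|, ...; an incomplete last block is ignored.
    """
    k = len(v)
    return sum(1 for j in range(0, len(u) - k + 1, k) if u[j:j + k] == v)


def block_frequency(v, u):
    """P_v(u) = |v| * N_v(u) / |u|, returned as an exact fraction."""
    return Fraction(len(v) * count_blocks(v, u), len(u))
```

For instance, with `u = "0110100110"` the blocks of length 2 are `01, 10, 10, 01, 10`, so `count_blocks("01", u)` is 2 and `block_frequency("01", u)` is 2/5; occurrences of `01` that straddle two blocks are deliberately not counted.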

To each string $w \in A_b^*$ we associate the open interval

The family $\{I_{b,w}\}_{w \in A_b^*}$ is a base for the natural topology on $[0,1]$. For a real $s \in [0,1)$ and a string $v \in A_b^*$ we define

Definition 7.22 (Jürgensen-Thierrin). A real number $s \in [0,1]$ is called disjunctive in base $b$ in case $r_b(s)$ contains all possible strings over $A_b$.³ A real number is called absolutely disjunctive or a lexicon if it is disjunctive in every base.

² It is worth mentioning that under the usual identification of $A_Q^{\omega}$ with $(0,1)$ the measure used in Theorem 6.31 coincides with the usual Lebesgue measure, which is not the case for the corresponding topologies.

A lexicon contains all writings, which have been or will be ever written, in any possible language. Disjunctivity is a "qualitative" analogue of Borel normality. Clearly, every random real is a lexicon, but the converse is false.

Remark. In contrast to randomness, disjunctivity is not invariant under the change of base, see Hertling [231, 233].

In what follows we will denote by $\mathcal{L}$ the set of absolutely disjunctive reals. Let $F$ be the computable set $\{(b,a,n,v) \mid b \ge 2, a \in (0,1) \cap \mathbf{Q}, n \ge 1, v \in A_b^*\}$. For $(b,a,n,v) \in F$ we define the sets

$$R^{+}_{(b,a,n,v)} = \{0 \le r \le 1 \mid \exists m \ge n \text{ such that } P_v(r_b(r)(m)) \ge a\},$$

$$R^{-}_{(b,a,n,v)} = \{0 \le r \le 1 \mid \exists m \ge n \text{ such that } P_v(r_b(r)(m)) \le a\}.$$

It is readily seen that

$$\bigcap_{(b,a,n,v)} R^{-}_{(b,a,n,v)} = \{0 \le r \le 1 \mid \forall b \ge 2, \forall v \in A_b^*, \ P^{-}_{b,v}(r) = 0\},$$

and

$$\bigcap_{(b,a,n,v)} R^{+}_{(b,a,n,v)} = \{0 \le r \le 1 \mid \forall b \ge 2, \forall v \in A_b^*, \ P^{+}_{b,v}(r) = 1\}.$$

A set $R \subset [0,1]$ is residual if it contains the intersection of a countable family of open dense sets.⁴ To get a constructive version of this definition we require that the family of open dense sets is enumerated by a c.e. set, and we have a constructive "witness" to guarantee that each basic open set $I_{b,u}$ intersects the family of open dense sets. We are led to the following definition:

³ Recall that $r_b(s)$ is the inverse of the function $v_b(s)$ defined in (7.5).

⁴ See Oxtoby [326] for more details.

Definition 7.23. A set $R \subset [0,1]$ is constructively residual if there exists a c.e. set $E \subset \{(b,u,m) \in \mathbf{N} \times \mathbf{N}_+ \times \mathbf{N} \mid b \ge 2, u \in A_b^*, m \ge 1\}$ and a computable function $f : \mathbf{N}_+ \times \mathbf{N} \to \mathbf{N}_+$ such that the following three conditions hold true:

1. For all $b \ge 2$, $m \ge 1$, $u \in A_b^*$, $f(u,m) \in A_b^*$.

2. $\bigcap_{m=1}^{\infty} \left(\bigcup_{(b,w,m) \in E} I_{b,w}\right) \subset R$.

3. For all $b \ge 2$, $m \ge 1$, $u \in A_b^*$ we have $u <_p f(u,m)$ and $(b, f(u,m), m) \in E$.

The complement of a constructively residual set is a constructive first Baire category set; as a consequence, a constructively residual set is residual, but the converse is false (see, for example, Martin-Löf [303]).

Definition 7.24. The statement "constructively, the typical number has, or most numbers have, property $P$" means that the set of all numbers with property $P$ is constructively residual.

Lemma 7.25. Constructively, most numbers are in

$$R^{+} = \bigcap_{(b,a,n,v) \in F} R^{+}_{(b,a,n,v)}.$$

Proof. We fix a computable bijection $\psi : \mathbf{N} \to F$ and define the auxiliary computable functions $t : \mathbf{N} \times \mathbf{N} \times ([0,1) \cap \mathbf{Q}) \to \mathbf{N}$ and $\theta : F \times \mathbf{N}_+ \to \mathbf{N}_+$ by

and

$$\theta((b,a,n,v),u) = u \, 0^{\max(n-|u|,0)} \, v^{t(\max(|u|,n),|v|,a)}.$$

We fix $(b,a,n,v) \in F$ and $u \in A_b^*$. We note that

$$P_v(\theta((b,a,n,v),u)) \ge \frac{|v| \, t(\max(|u|,n),|v|,a)}{|\theta((b,a,n,v),u)|} = \frac{|v| \, t(\max(|u|,n),|v|,a)}{\max(|u|,n) + |v| \, t(\max(|u|,n),|v|,a)} \ge a,$$

and

$$m = |\theta((b,a,n,v),u)| \ge n,$$

so

$$I_{b,\theta((b,a,n,v),u)} \subset R^{+}_{(b,a,n,v)}.$$

For every string $u \in A_b^*$,

$$I_{b,u} \cap I_{b,\theta((b,a,n,v),u)} \ne \emptyset,$$

so the open set

$$\bigcup_{u \in A_b^*} I_{b,\theta((b,a,n,v),u)}$$

is dense in $[0,1]$. In conclusion, the set of real numbers the lemma speaks about is constructively residual via the c.e. set

$$E = \{(b, \theta((b,a,n,v),u), m) \mid b \ge 2, u \in A_b^*, m \ge 1, \psi(m) = ((b,a,n,v),u)\},$$

and the computable function $f : \mathbf{N}_+ \times \mathbf{N} \to \mathbf{N}_+$ defined by $f(u,m) = \theta(\psi(m), u)$. □

In view of the fact that for every rational $a \in (0,1)$ and all strings $u, v \in A_b^*$ there exists a string $w \in A_b^*$ such that $P_v(uw) \le a$, we can modify the definition of $\theta$ in the above proof appropriately to guarantee the inequality $P_v(\theta((b,a,n,v),u)) \le a$. So, the set

$$R^{-} = \bigcap_{(b,a,n,v) \in F} R^{-}_{(b,a,n,v)}$$

is constructively residual. Finally, the set $R^{-} \cap R^{+}$ is constructively residual too. We have proven:

Theorem 7.26 (Calude-Zamfirescu). Constructively, for most numbers $r \in [0,1]$, using any base $b$ and choosing any string $v \in A_b^*$,

As an immediate consequence we derive a constructive version of a result due to Oxtoby and Ulam [327].

Corollary 7.27. Constructively, a typical number does not obey the Law of Large Numbers.

Proof. Indeed, the set of all reals $r \in [0,1]$ such that in their dyadic expansion the digits 0 and 1 appear with probability one-half lies in the complement of the constructively residual set from Theorem 7.26. □

As we have seen, random numbers are transcendental, but the converse implication is false.

Definition 7.28. A real number $a \in [0,1]$ is called a Liouville number if $a$ is irrational, and for all $n \in \mathbf{N}$ there exist $p, q \in \mathbf{N}$, $q > 1$, such that

$$\left| a - \frac{p}{q} \right| < \frac{1}{q^{n}}.$$

Liouville numbers are transcendental (see [326]) but not random (see Exercise 7.8.15).
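As a concrete illustration of the approximation condition, one can test it on the classical Liouville constant $L = \sum_{k \ge 1} 10^{-k!}$, which does not appear in the text and is used here only as an example. The sketch below assumes the standard form of the condition, $|a - p/q| < q^{-n}$, with $q = 10^{n!}$ and $p/q$ the $n$-th partial sum; since $L$ itself cannot be represented exactly, a longer partial sum plus an explicit bound on the neglected tail stands in for it. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def liouville_partial(n):
    """p_n / q_n = sum_{k=1}^{n} 10^(-k!), a fraction with denominator 10^(n!)."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, n + 1))


def witnesses_liouville(n, extra_terms=2):
    """Check |L - p/q| < q^(-n) for q = 10^(n!).

    L is replaced by a partial sum with extra_terms more terms; the remaining
    tail is positive and smaller than 2 * 10^(-(n + extra_terms + 1)!).
    """
    q = 10 ** factorial(n)
    approx = liouville_partial(n)
    longer = liouville_partial(n + extra_terms)
    tail_bound = Fraction(2, 10 ** factorial(n + extra_terms + 1))
    return (longer - approx) + tail_bound < Fraction(1, q ** n)
```

The check only covers the approximation part of the definition; the irrationality of $L$ is a separate (classical) fact that the code does not verify.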

Corollary 7.29. Constructively, the typical Liouville number is a lexicon.

Proof. Since the constructively residual set in Theorem 7.26 is a subset of $\mathcal{L}$, the set of absolutely disjunctive reals, constructively most numbers from $[0,1]$ are in $\mathcal{L}$. But most reals are constructively Liouville numbers, as the proof from [326], p. 8 can be readily constructivized. □

The set of all numbers each of which is a lexicon is large not only in the sense of constructive category, but also in the sense of constructive measure theory: this set contains all random numbers, so it has constructive measure one by Corollary 6.32. This suggests that constructively $\mathcal{L}$ may contain nearly all elements of $[0,1]$. But what does "nearly all" mean? Classically, a set contains nearly all numbers if its complement is σ-porous [448]. The complement of a σ-porous set is simultaneously residual and of measure one (but the complement of a null set of first category may well not contain nearly all elements, see Zamfirescu [448]). The fact that a σ-porous set has measure zero is a consequence of Lebesgue's Density Theorem (see Oxtoby [326]), a constructive form of which will be presented in what follows. A comprehensive study of porous and σ-porous sets appears in [342].

Definition 7.30. A set $M \subset [0,1]$ is called constructively megaporous if there exist a base $b \ge 2$, a rational number $r \in (0,1)$ and a computable function $f : A_b^* \to A_b^*$ such that each interval $I_{b,u}$ of length less than $r$ contains a subinterval $I_{b,f(u)}$ disjoint from $M$ and having length greater than $r b^{-|u|}$. A c.e. union of constructively megaporous sets is called constructively σ-megaporous. More precisely, $M$ is constructively σ-megaporous if $M = \bigcup_{n=1}^{\infty} M_n$, and there exist two computable functions $T : \mathbf{N} \times \mathbf{N}_+ \to \mathbf{N}_+$, $R : \mathbf{N} \to \mathbf{Q}$ such that $M_n$ is constructively megaporous under $T(n, \cdot)$ and $R(n)$.

Definition 7.31. We will say that constructively, nearly every point of $[0,1]$ enjoys property $P$ if the set of points not enjoying $P$ is constructively σ-megaporous.

Theorem 7.32 (Calude-Zamfirescu). Constructively, nearly every real number is a lexicon.

Proof. Let $\gamma : \{(b,w) \mid b \ge 2, w \in A_b^*\} \to \mathbf{N}$ be a computable bijection, and define the computable functions $T(n,u) = uw$, $R(n) = b^{-|w|-1}$, whenever $n = \gamma(b,w)$. Again, if $n = \gamma(b,w)$, we put $L_n = \{0 \le x \le 1 \mid w \text{ is not contained in } r_b(x)\}$. It is seen that $[0,1] \setminus \mathcal{L} = \bigcup_{i=1}^{\infty} L_i$, and each $L_n$ is constructively megaporous with respect to the base $b$, the computable function $T(n, \cdot)$ and the rational $R(n)$. □

The following result is a constructive version of (a weak form of) Lebesgue's Density Theorem.

Theorem 7.33. Every constructively σ-megaporous set is constructively null.

Proof. In view of Theorem 6.31 the union of all constructive null sets is a (maximal) constructive null set. Consequently, it is enough to prove the theorem for constructively megaporous sets. Let $M$ be constructively megaporous with respect to the base $b$, the rational $r$ and the computable function $f$. To estimate the size of $M$ we will generate, in a computable way, smaller and smaller coverings of $M$. We start with an integer $n$ such that $b^{-n} < r$. For a string $w \in A_b^*$ we put

$$E(w) = \{y \in A_b^* \mid w <_p y, \ |y| = |f(w)|, \ y \ne f(w)\}.$$

The first covering is

$$M \subset \bigcup_{|u|=n} I_{b,u}.$$

The second iteration is

$$M \subset \bigcup_{|u|=n} \ \bigcup_{v_1 \in E(u)} I_{b,v_1} = \bigcup_{|u|=n} I_{b,u} \setminus I_{b,f(u)}.$$

The measure of this covering is

$$\mu\left(\bigcup_{|u|=n} I_{b,u} \setminus I_{b,f(u)}\right) = \sum_{|u|=n} \mu(I_{b,u} \setminus I_{b,f(u)}) = \sum_{|u|=n} \left(b^{-|u|} - b^{-|f(u)|}\right) < \sum_{|u|=n} b^{-|u|}(1-r) = 1 - r.$$

In general, a proof by induction shows that

$$M \subset \bigcup_{|u|=n} \ \bigcup_{v_1 \in E(u)} \cdots \bigcup_{v_k \in E(v_{k-1})} \ \bigcup_{v_{k+1} \in E(v_k)} I_{b,v_{k+1}} = \bigcup_{|u|=n} \ \bigcup_{v_1 \in E(u)} \cdots \bigcup_{v_k \in E(v_{k-1})} I_{b,v_k} \setminus I_{b,f(v_k)}$$

and that the measure of this covering is smaller than $(1-r)^{k+1}$.

We conclude that $M$ is constructively null with respect to the c.e. family

$$G = \{(w,n) \in A_b^* \times \mathbf{N} \mid w \in F_n, \ n = 1, 2, \ldots\},$$

where $F_0 = \{u \in A_b^* \mid |u| = n\}$ and $F_{k+1} = \{u \in A_b^* \mid u \in E(w), \text{ for some } w \in F_k\}$. □

The above result is stronger than the classical one as, for instance, constructive null sets are even "smaller" than classical null sets: the union of all null sets coincides with the whole space, while the union of all constructive null sets is a constructive null set.
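The shrinking coverings in the proof of Theorem 7.33 can be traced on a familiar example. The sketch below is an illustration under stated assumptions, not the book's construction: it takes $M$ to be the set of reals in $[0,1]$ whose ternary expansion avoids the digit 1 (the Cantor set), base $b = 3$, the hypothetical witness function $f(u) = u1$, and $r = 1/4$, so that $I_{3,f(u)}$ has length $3^{-|u|-1} > r \cdot 3^{-|u|}$. The functions `refine` and `measure` are hypothetical names for one covering step $F_k \to F_{k+1}$ and for the total length of a covering.

```python
from fractions import Fraction
from itertools import product


def refine(strings, f, base):
    """One step F_k -> F_{k+1}: replace w by all y extending w with |y| = |f(w)|, y != f(w)."""
    out = []
    for w in strings:
        target = f(w)
        extra = len(target) - len(w)
        for tail in product(range(base), repeat=extra):
            y = w + tail
            if y != target:
                out.append(y)
    return out


def measure(strings, base):
    """Total length of the intervals I_{b,w}, each of length base^(-|w|)."""
    return sum(Fraction(1, base ** len(w)) for w in strings)


def covering_measures(base, f, start_len, steps):
    """Measures of F_0, F_1, ..., F_steps, starting from all strings of length start_len."""
    level = list(product(range(base), repeat=start_len))
    result = [measure(level, base)]
    for _ in range(steps):
        level = refine(level, f, base)
        result.append(measure(level, base))
    return result
```

With `covering_measures(3, lambda u: u + (1,), 2, 3)` the successive measures are $1, 2/3, 4/9, 8/27$, each below the bound $(1-r)^k = (3/4)^k$ for $k \ge 1$, in line with the estimate in the proof.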

7.4 Computable and Uncomputable Reals

The complexity of real numbers is a central topic in classical computability theory (see Turing [404], Rice [345], Calude [51], Soare [372], Odifreddi [321, 322], Bridges [46]), computable analysis (see Martin-Löf [303], Weihrauch [429, 431], Pour-El and Richards [338], Ko [258], Bridges [47]), AIT (see Chaitin [122, 121, 130, 132, 131], Martin-Löf [301, 302]) and information-based complexity (see Traub, Wasilkowski and Wozniakowski [402]). An important class of reals is certainly the set of computable reals. In order to define them we introduce the notions of computable sequence of rationals and computable convergence rate.

Definition 7.34. 1) A sequence $(a_i)$ of rationals $a_i$ is called computable if there is a computable function which, given a non-negative integer $n$, computes a name for the rational $a_n$, with respect to a standard computable enumeration of rationals.⁵

2) A sequence $(a_i)$ of reals $a_i$ is said to converge computably if it converges and there is a computable function $g : \mathbf{N} \to \mathbf{N}$ such that $|a_i - \lim_{k \to \infty} a_k| \le 2^{-j}$, for all $i, j$ with $i \ge g(j)$.

3) A real $\alpha$ is called computable if there exists a computable sequence of rationals which converges computably to $\alpha$.
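A minimal sketch of Definition 7.34 for one concrete real may be useful. It assumes the example $\alpha = \sqrt{2}$ (not discussed at this point in the text) and uses the hypothetical names `a` for the computable sequence of rationals and `g` for the computable convergence rate; rationals are represented directly as `Fraction` values rather than by names in an enumeration.

```python
from fractions import Fraction
from math import isqrt


def a(i):
    """i-th term of a computable sequence of rationals: floor(sqrt(2) * 2^i) / 2^i."""
    return Fraction(isqrt(2 * 4 ** i), 2 ** i)


def g(j):
    """Convergence rate: |a(i) - sqrt(2)| <= 2^(-j) for all i >= g(j)."""
    return j


def within(i, j):
    """Certify 0 <= sqrt(2) - a(i) < 2^(-j) using rational arithmetic only."""
    x = a(i)
    return x * x <= 2 < (x + Fraction(1, 2 ** j)) ** 2
```

The helper `within` avoids floating-point square roots: squaring is monotone on non-negative rationals, so $x^2 \le 2 < (x + 2^{-j})^2$ certifies the error bound exactly.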

Theorem 7.35. Let $\alpha$ be a real in $(0,1)$. Then, the following statements are equivalent:

(1) The real $\alpha$ is computable.

(2) There exists a computable sequence $(a_n)$ of rationals with $|\alpha - a_n| \le 2^{-n}$, for all $n$.

(3) The left Dedekind cut $L(\alpha) = \{q \in \mathbf{Q} \mid q < \alpha\}$ is computable.

(4) There exists a computable function $f : \mathbf{N} \to \{0,1\}$ such that $\alpha = \sum_{i=1}^{\infty} f(i) 2^{-i}$.

Proof. We will prove the implications (1) ⇒ (2) ⇒ (3) ⇒ (4) ⇒ (1). The implications (1) ⇒ (2) and (4) ⇒ (1) are obvious. For (2) ⇒ (3) we take a real $\alpha \notin \mathbf{Q}$ and compute $a_1, a_2, \ldots, a_k$ till $|a_k - q| > 2^{-k}$. Then, $|a_k - q| > 2^{-k} \ge |a_k - \alpha|$, hence

$$q < \alpha \text{ iff } q < a_k.$$

To compute $k$ as above we note that $|a_{m+1} - q| > 2^{-m-1}$ provided $|\alpha - q| > 2^{-m}$, which is true because $\alpha$ is not rational. Finally assume that $\alpha \notin \mathbf{Q}$. Then the implication (3) ⇒ (4) follows from the fact that, writing $a_i = f(i)$ for the binary digits of $\alpha$, the equivalence

$$a_{n+1} = 1 \text{ iff } \sum_{i=1}^{n} a_i 2^{-i} + 2^{-n-1} < \alpha$$

is true for all $n \in \mathbf{N}$. □

⁵ For example, use the bijection $\nu_+ : \mathbf{N} \to [0,\infty) \cap \mathbf{Q}$ defined by $\nu_+(0) = 0$, $\nu_+(2n) = 1 + \nu_+(n)$, $\nu_+(2n+1) = 1/(1 + \nu_+(n))$ from Yi-Ting [447]; see also Weihrauch [429, 431]. Sometimes the inverse function will be denoted by $e : \mathbf{Q} \to \mathbf{N}$.
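The search used for the implication (2) ⇒ (3) can be sketched as a short program. The code below is illustrative only: it assumes an irrational $\alpha$ (here $\sqrt{2}$, chosen as an example) given through a hypothetical function `approx` with $|\alpha - \mathrm{approx}(k)| \le 2^{-k}$, and the name `below_alpha` is likewise an assumption. For rational $\alpha$ the loop need not terminate when $q = \alpha$, which is exactly the non-uniformity the proof has to work around.

```python
from fractions import Fraction
from math import isqrt


def approx(k):
    """Rational a_k with |sqrt(2) - a_k| <= 2^(-k): floor(sqrt(2) * 2^k) / 2^k."""
    return Fraction(isqrt(2 * 4 ** k), 2 ** k)


def below_alpha(q, approx):
    """Decide q < alpha for irrational alpha from approximations of precision 2^(-k).

    Searches for the first k with |a_k - q| > 2^(-k); from then on q and alpha
    lie on the same side of a_k, so comparing q with a_k settles the question.
    """
    k = 0
    while abs(approx(k) - q) <= Fraction(1, 2 ** k):
        k += 1
    return q < approx(k)
```

For example, `below_alpha(Fraction(7, 5), approx)` stops at $k = 7$, where $a_7 = 181/128$ differs from $7/5$ by more than $2^{-7}$, and reports that $7/5$ lies in the left cut of $\sqrt{2}$.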

Comment. The implications (2) ⇒ (3) and (3) ⇒ (4) are not uniformly constructive as the proof splits into two cases, $\alpha \in \mathbf{Q}$ and $\alpha \notin \mathbf{Q}$. To get a better insight into this phenomenon let us look at another equivalent definition of computable reals.

As we have seen, a sequence $(a_i)$ of rationals is computable if we can effectively decide the $i$th member. However, we may be unable to decide the rationals that do not occur in the sequence. If $(a_i)$ is a sequence of rationals, we denote the set $\{q \in \mathbf{Q} \mid \exists i \in \mathbf{N} \ (q = a_i)\}$ by $\{a_i\}$. For computable sequences of rationals it is obvious that $\{a_i\}$ is a c.e. set.

Theorem 7.36. If a sequence $(a_i)$ of rationals is computable and converges computably, then the set $\{a_i\}$ is computable.

Proof. Let $(a_i)$ be a computable sequence of rationals converging computably to $\alpha$. Then there is a total computable function $g$ such that for each $n$, $|a_i - \alpha| \le 2^{-n}$, for all $i \ge g(n)$. We give a procedure for deciding if $p \in \{a_i\}$ for an arbitrary rational $p$. We distinguish three cases.

(1) The real $\alpha$ is irrational. To decide $p \in \{a_i\}$ we perform the following procedure: Enumerate intervals $(a_k - 3 \cdot 2^{-n}, a_k + 3 \cdot 2^{-n})$ with $k \ge g(n)$ until finding the first such interval not containing $p$. Such an interval will be found because $p \ne \alpha$ and $(a_i)$ converges to $\alpha$. Then $a_l \in (a_k - 3 \cdot 2^{-n}, a_k + 3 \cdot 2^{-n})$, for all $l \ge k$. Hence $p \in \{a_i\}$ iff $p \in \{a_0, \ldots, a_{k-1}\}$.

(2) The real

(3) The real
□

Remark. The last proof is not uniformly constructive in $(a_i)$ and $g$. Indeed, a uniform procedure does not exist, as one can see by considering a c.e. but not computable set $S$ and the following list of sequences $(a_i^{(j)})$, for $j = 0, 1, 2, \ldots$, where

if $i, j \in \mathbf{N}$, and $j \in S$,
if $i, j \in \mathbf{N}$, and $j \notin S$.

Example 7.37. All algebraic numbers, $\log_2 3$, $\pi$ and the Euler number $e$ are computable.

Actually, all real numbers commonly used in numerical analysis and the natural sciences are computable. Of course, not all real numbers are computable (in fact, most reals are not computable).6 Given a computable sequence (ai) of rationals which converges computably to a computable real N as in Definition 7.34, by computing ag(n) one obtains a rational approximation of

Example 7.38 (Specker). If $h$ is an injective, total computable function which enumerates a c.e. set of non-negative integers which is not computable, then the real

$$\alpha = \sum_{k=0}^{\infty} 2^{-h(k)} \qquad (7.6)$$

is the limit of the computable sequence of partial sums $\left(\sum_{k=0}^{n} 2^{-h(k)}\right)_n$, but it is not a computable real.

Proof. The sequence $\left(\sum_{k=0}^{n} 2^{-h(k)}\right)_n$ converges computably iff the range of $h$ is computable. □

The real $\alpha$ can be approximated by a computable, converging sequence of rationals, but not with a computable convergence rate [379]. Such a number is called a Specker real.

Example 7.39. Every Chaitin Omega Number is a Specker real.

We continue with a simple but intriguing example. Let $\mathrm{time}_U(\mathrm{string}_i)$ be the running time of the computation $U(\mathrm{string}_i)$, and define the real number

$$Y_U = \sum_{i} 2^{-i} / \mathrm{time}_U(\mathrm{string}_i). \qquad (7.7)$$

Note that $\mathrm{time}_U(\mathrm{string}_i)$ is a positive integer in case $\mathrm{string}_i \in \mathrm{dom}(U)$, and $\mathrm{time}_U(\mathrm{string}_i) = \infty$, in the opposite case. At first glance the analogy between (7.6) and (7.7) suggests that $Y_U$ is uncomputable because $Y_U$ seems to be essentially defined in terms of an uncomputable set, $\mathrm{dom}(U)$. This intuition is false:

Example 7.40. The real $Y_U$ is computable.

Proof. Indeed, we can construct an algorithm computing, for every positive integer $n$, the $n$th digit of $Y_U$. The idea is simple: only the terms $2^{-i}/\mathrm{time}_U(\mathrm{string}_i)$ for which $\mathrm{time}_U(\mathrm{string}_i) = \infty$ may cause problems in (7.7) because at every finite step of the computation they appear to be non-zero when, in fact, they are zero! The solution is to run all non-halting programs $\mathrm{string}_i$ enough times such that their cumulative contribution is too small to affect the $n$th digit of $Y_U$. □

tinuous but uncomputable solution for the wave equation even if the initial conditions are continuous and computable, see Pour-El and Richards [338]; see also Weihrauch and Zhong [432]

Proposition 7.41. Let $h : \mathbf{N} \to \mathbf{N}$ be an injective, total computable function and define the sequence $(a_n)$ of rationals by $a_n = \sum_{m=0}^{n} 2^{-h(m)}$. The sequence $(2^{-h(n)})$ is a computable sequence of rationals which always converges to zero, and the sequence $(a_n)$ is an increasing, computable, converging sequence of rationals.

Proof. It is clear that both sequences of rationals $(2^{-h(n)})$ and $(a_n)$ are computable. The claim that $(2^{-h(n)})$ converges to zero is equivalent to

$$(\forall n)(\exists m)(\forall i \ge m) \ h(i) \ge n.$$

This follows from our assumption that $h$ is injective: for each $n$ there is a number $m$ such that $h(\mathbf{N}) \cap \{0, 1, \ldots, n-1\} \subset \{h(0), h(1), \ldots, h(m-1)\}$. The injectivity of $h$ implies $h(i) \ge n$, for all $i \ge m$. The sequence $(a_n)$ is obviously increasing and converges because it is bounded by $\sum_{n=0}^{\infty} 2^{-n} = 2$. □
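The idea behind Example 7.40 can be made concrete with a toy model. The sketch below is hedged throughout: it replaces the universal machine by a hypothetical running-time function `toy_time` (program $i$ halts after $i^2$ steps unless 3 divides $i$, in which case it never halts), assumes the sum in (7.7) starts at $i = 1$, and computes a rational approximation within $2^{-n}$ rather than an individual digit. Only a step-bounded query `halts_within` is used, mirroring what can actually be observed about a machine.

```python
from fractions import Fraction


def toy_time(i):
    """Hypothetical stand-in for time_U(string_i): None means 'never halts'."""
    return None if i % 3 == 0 else i * i


def halts_within(i, steps):
    """Observable information only: the halting time if it is <= steps, else None."""
    t = toy_time(i)
    return t if t is not None and t <= steps else None


def approx_Y(n):
    """Rational lower approximation of Y = sum_{i>=1} 2^(-i)/time(i), error < 2^(-n).

    Terms with i > n + 1 contribute at most 2^(-(n+1)) in total.  Each program
    i <= n + 1 is run for s = 2^(n+1) steps; a program still running then adds
    less than 2^(-i)/s, hence less than 1/s = 2^(-(n+1)) over all such i.
    """
    s = 2 ** (n + 1)
    total = Fraction(0)
    for i in range(1, n + 2):
        t = halts_within(i, s)
        if t is not None:
            total += Fraction(1, 2 ** i * t)
    return total
```

The point of the bound is that a program which has not halted after $s$ steps contributes either $0$ or at most $2^{-i}/s$, so it never has to be classified as halting or non-halting.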

Proposition 7.42. Let $h : \mathbf{N} \to \mathbf{N}$ be an injective, total computable function and $a_n = \sum_{m=0}^{n} 2^{-h(m)}$. Then, the following conditions are equivalent:

(a) The range $h(\mathbf{N})$ of $h$ is a computable set.

(b) The sequence $(2^{-h(n)})$ converges computably.

(c) The sequence $(a_n)$ converges computably.

(d) The limit of the sequence $(a_n)$ is a computable real.

Proof. We will prove the implications (a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (c) ⇒ (a).

For the implication (a) ⇒ (b) we assume that $h(\mathbf{N})$ is a computable set. Then the function $g : \mathbf{N} \to \mathbf{N}$ defined by

$$g(n) = \min\{m \mid \{0, 1, \ldots, n-1\} \cap h(\mathbf{N}) \subset h(\{0, 1, \ldots, m-1\})\}$$

is a total computable function and satisfies $2^{-h(m)} \le 2^{-n}$, for all $n$ and $m \ge g(n)$. Hence, $(2^{-h(n)})$ converges computably.

We continue with (b) ⇒ (c). Let $g$ be a total computable function such that $2^{-h(m)} \le 2^{-n}$, for all $n$ and $m \ge g(n)$. Then

$$\left| a_m - \sum_{k=0}^{\infty} 2^{-h(k)} \right| \le 2^{-n},$$

for all $m \ge g(n+1)$. Therefore the sequence $(a_n)$ converges computably. Since $h$ is computable, the sequence $(a_n)$ is a computable sequence of rationals.

The implication (c) ⇒ (d) follows directly from the definition of computable reals. The implication (d) ⇒ (c) follows directly from Theorem 7.44.

For the implication (c) ⇒ (a) we assume that $g : \mathbf{N} \to \mathbf{N}$ is a computable function such that

$$\left| a_m - \sum_{k=0}^{\infty} 2^{-h(k)} \right| \le 2^{-n},$$

for all $n$ and $m \ge g(n)$. Then

$$n \in h(\mathbf{N}) \text{ iff } n \in h(\{0, 1, \ldots, g(n+1)\}).$$

Hence, $h(\mathbf{N})$ is a computable set. □
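The constructions in this proof can be exercised on a toy instance in which the range of $h$ is computable. The sketch below is only an illustration under stated assumptions: `h` is a hypothetical injective function with range the even numbers, `in_range` is the decision procedure that condition (a) provides, and the limit $\sum_{j \ge 0} 2^{-2j} = 4/3$ is specific to this choice. For a function $h$ with uncomputable range no such `in_range` exists, which is precisely why the Specker sequence of Example 7.38 has no computable convergence rate.

```python
from fractions import Fraction


def h(k):
    """Toy injective computable function: swap neighbours, then double (range = evens)."""
    return 2 * (k ^ 1)


def in_range(n):
    """Decision procedure for h(N); it exists here because the range is computable."""
    return n % 2 == 0


def g(n):
    """Least m with {0,...,n-1} ∩ h(N) ⊆ h({0,...,m-1}), as in (a) ⇒ (b)."""
    needed = {x for x in range(n) if in_range(x)}
    seen, m = set(), 0
    while not needed <= seen:
        seen.add(h(m))
        m += 1
    return m


def a(m):
    """Partial sum a_m = sum_{k=0}^{m} 2^(-h(k))."""
    return sum(Fraction(1, 2 ** h(k)) for k in range(m + 1))


def member_via_modulus(n):
    """(c) ⇒ (a): with modulus G(n) = g(n+1) for (a_m), test n in h({0,...,G(n+1)})."""
    return n in {h(k) for k in range(g(n + 2) + 1)}
```

In this toy setting the argument is circular (the modulus `g` is itself built from `in_range`), so the code illustrates the two implications separately rather than providing a new decision procedure.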

We continue with a special type of convergence.

Definition 7.43. We say that a sequence $(a_i)$ of reals converges monotonically to the real $\alpha$ if there exists a constant $c > 0$ such that for all $i$ and all $j \ge i$,

$$|\alpha - a_j| \le c \cdot |\alpha - a_i|.$$

For example, any converging and monotonic, i.e. either non-decreasing (e.g. $a_n = \sum_{m=0}^{n} 2^{-h(m)}$) or non-increasing, sequence of reals converges monotonically: one can take the constant $c = 1$. The following result is simple, but rather unexpected:

Theorem 7.44 (Calude-Hertling). Every computable sequence of rationals which converges monotonically to a computable real converges computably.

Proof. Let $(a_i)$ be a computable sequence of rationals which converges monotonically to a computable real $\alpha$. Let $c \ge 0$ be a constant such that for all $i$ and all $j \ge i$,

$$|\alpha - a_j| \le 2^{c} \cdot |\alpha - a_i|.$$

Furthermore, let $(b_i)$ be a computable sequence of rationals with $|\alpha - b_i| \le 2^{-i}$, for all $i$. For any $i$ there exists a number $k$ with $|\alpha - a_k| \le 2^{-i-2-c}$. For this $k$ we have

$$|a_k - b_{i+2+c}| \le |a_k - \alpha| + |\alpha - b_{i+2+c}| \le 2^{-i-2-c} + 2^{-i-2-c} = 2^{-i-1-c}.$$

Hence, we can define a computable function $h : \mathbf{N} \to \mathbf{N}$ by

$$h(i) = \min\{k \mid |a_k - b_{i+2+c}| \le 2^{-i-1-c}\}.$$

In view of the monotonicity of $(a_i)$, for any $i$ and any $j \ge h(i)$ we have

$$|\alpha - a_j| \le 2^{c} \cdot |\alpha - a_{h(i)}| \le 2^{c} \cdot \left(|\alpha - b_{i+2+c}| + |b_{i+2+c} - a_{h(i)}|\right) \le 2^{c} \cdot \left(2^{-i-2-c} + 2^{-i-1-c}\right) < 2^{-i}.$$

Hence, the sequence $(a_i)$ converges computably. □
Remark. The converse implication in Theorem 7.44 is not true, as the following example shows: the sequence $(a_i)$ defined by $a_i = 2^{-i}$ if $i$ is even and $a_i = 2^{-2i}$ if $i$ is odd converges computably to zero, but it does not converge monotonically.

Lemma 7.45. Let $(a_n)$ be a computable sequence of rationals which converges computably, and let $(b_n)$ be a computable sequence of rationals which converges non-computably. Then $(a_n + b_n)$ is a computable sequence of rationals which converges non-computably to the sum of the limits of $(a_n)$ and $(b_n)$.

Proof. It is clear that the sum of two computable, converging sequences of rationals is again a computable, converging sequence of rationals converging to the sum of the limits of the two sequences. Let $\alpha = \lim_{n \to \infty} a_n$ and $\beta = \lim_{n \to \infty} b_n$. We have to show that $(a_n + b_n)$ does not converge computably. For the sake of a contradiction assume that $(a_n + b_n)$ converges computably and that $g$ is a total computable function such that $|\alpha + \beta - a_m - b_m| \le 2^{-n}$, for all $n$ and $m \ge g(n)$. Furthermore let $f$ be a total computable function such that $|\alpha - a_m| \le 2^{-n}$, for all $n$ and $m \ge f(n)$. We define the total computable function $h$ by $h(n) = \max\{f(n+1), g(n+1)\}$. For arbitrary $n$ and $m \ge h(n)$ we obtain

$$|\beta - b_m| \le |\alpha + \beta - a_m - b_m| + |\alpha - a_m| \le 2^{-n-1} + 2^{-n-1} = 2^{-n}.$$

Hence, also the sequence $(b_n)$ converges computably, in contradiction to our assumption. □

Next we prove that every computable real can be approximated by a computable sequence of rationals which converges non-computably.

Theorem 7.46. For every computable real $\alpha$ there is a computable sequence $(a_n)$ of rationals which converges to $\alpha$, but which does not converge computably.

Proof. Let $h$ be an injective, total computable function with uncomputable range, i.e. such that the set $h(\mathbf{N})$ is an uncomputable set. By Propositions 7.41 and 7.42 the sequence $(2^{-h(n)})$ is a computable sequence of rationals which converges non-computably to zero. Let $(b_n)$ be a computable sequence of rationals which converges computably to $\alpha$. By Lemma 7.45 the sequence $(a_n)$ defined by

$$a_n = b_n + 2^{-h(n)}$$

is a computable sequence of rationals which converges non-computably to $\alpha$. □

Theorem 7.46 states that we can approximate every computable real non-computably, i.e. very slowly. Thus, the fact that a computable sequence of rationals converges non-computably does not imply that the limit is uncomputable. Furthermore we may ask whether, given a computable sequence of rationals, one can decide whether its limit is computable or not, and also, whether it converges computably or not. The answer to both these questions is negative. We will use the following notation: a number $i$ is called a Gödel number of a computable sequence of rationals $(a_n)$ if an = vQ (
Definition 7.47. We say that it is impossible to decide whether the elements in a certain set $A$ of computable sequences of rationals have a certain property, if there is no algorithm which, given a Gödel number of an element of the set $A$, decides whether this element has the property or not.

Theorem 7.48. It is impossible to decide whether:

(1)

a converging, increasing, computable sequence of rationals converges computably,

(2)

a converging, increasing, computable sequence of rationals converges to a computable real or to an uncomputable real,

(3)

a computable sequence of rationals which converges non-computably converges to a computable real or to an uncomputable real.

Proof. Let us fix a c.e. but not computable set $X \subset \mathbb{N}$ and a total computable, injective function $f$ such that $f(\mathbb{N}) = X$. We define a sequence $(g_i)$ of functions $g_i$ as follows:

$$g_i(n) = \begin{cases} 2n, & \text{if there is no } j \le n \text{ with } f(j) = i, \\ 2f(n) + 1, & \text{if there is a } j \le n \text{ with } f(j) = i. \end{cases}$$

It is clear that the functions $g_i$ are total computable and injective. Furthermore, the range $g_i(\mathbb{N})$ of $g_i$ is a computable set iff $i \notin X$. For each $i$ we define a computable, increasing, converging sequence $(a_n^{(i)})$ of rationals by

$$a_n^{(i)} = \sum_{j=0}^{n} 2^{-g_i(j)}.$$
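To make the construction of the functions $g_i$ and of the partial sums concrete, here is a minimal Python sketch. The enumeration `f` is a hypothetical toy (an injective enumeration of the perfect squares, whose range is of course computable); it only stands in for an enumeration of a genuinely non-computable c.e. set $X$, which no runnable example can supply. The names `g` and `a` are likewise chosen for this illustration.

```python
from fractions import Fraction

def f(n):
    # Hypothetical toy enumeration (total and injective). It stands in for
    # an enumeration of the c.e. set X; its range is the perfect squares.
    return n * n

def g(i, n):
    # g_i(n) = 2n as long as i has not been enumerated up to stage n,
    # and 2 f(n) + 1 once some j <= n with f(j) = i has appeared.
    seen = any(f(j) == i for j in range(n + 1))
    return 2 * f(n) + 1 if seen else 2 * n

def a(i, n):
    # Partial sum a_n^(i) = sum_{j=0}^{n} 2^(-g_i(j)), as an exact rational.
    return sum(Fraction(1, 2 ** g(i, j)) for j in range(n + 1))
```

With this toy `f`, the index 3 is never enumerated, so `g(3, n)` is always `2 * n`, while for the index 4 the values switch to odd numbers from stage 2 onwards.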

From $i$ one can compute a Gödel number of the sequence $(a_n^{(i)})$. Hence, if one could decide whether any converging, increasing, computable sequence of rationals converges computably, then one could also decide, for arbitrary $i$, whether the sequence $(a_n^{(i)})$ converges computably. By Proposition 7.42 this is the case iff the set $g_i(\mathbb{N})$ is computable. But we have constructed the $g_i$ in such a way that the set $g_i(\mathbb{N})$ is computable iff $i \notin X$. Hence, we cannot decide this question, proving the first assertion.

For the second assertion one argues in exactly the same way with the same class of sequences of rationals, but now using the fact, proved in

7.4 Computable and Uncomputable Reals


Proposition 7.42, that for any $i$ the sequence $(a_n^{(i)})$ has a computable limit iff the range $g_i(\mathbb{N})$ is computable.

For the third assertion we shall argue in the same way, but with a different class of sequences of rationals: we will use a class of sequences which converge non-computably. Therefore we define a second sequence $(h_i)$ of functions $h_i$ by

$$h_i(n) = \begin{cases} 2f(n), & \text{if there is no } j \le n \text{ with } f(j) = i, \\ 2n + 1, & \text{if there is a } j \le n \text{ with } f(j) = i. \end{cases}$$

It is clear that the functions $h_i$ are total computable and injective. Furthermore the range $h_i(\mathbb{N})$ is a computable set iff $i \in X$. For each $i$ we define a sequence $(b_n^{(i)})$ of rationals by

$$b_n^{(i)} = a_n^{(i)} + 2^{-h_i(n)}.$$

This is certainly a computable and converging sequence of rationals which has the limit $\lim_{n\to\infty} b_n^{(i)} = \lim_{n\to\infty} a_n^{(i)}$; compare Proposition 7.41 and Lemma 7.45. But it converges non-computably by Lemma 7.45, because both sequences $(a_n^{(i)})$ and $(2^{-h_i(n)})$ converge, but the sequence $(a_n^{(i)})$ converges computably iff $i \notin X$ and the sequence $(2^{-h_i(n)})$ converges computably iff $i \in X$; compare Proposition 7.41. From $i$ one can compute a Gödel number of the sequence $(b_n^{(i)})$. Hence, if one could decide whether any converging, computable sequence of rationals which converges non-computably has a computable limit, then one could also decide, for an arbitrary $i$, whether the sequence $(b_n^{(i)})$ has a computable limit. This limit is the same as the limit of $(a_n^{(i)})$. By Proposition 7.42 it is a computable real iff the set $g_i(\mathbb{N})$ is a computable set. But this is the case iff $i \notin X$. Hence, we cannot decide this question. This proves the last assertion. □

Theorem 7.46 and Theorem 7.48 tell us that a computable sequence of rationals which converges non-computably may converge to a computable or an uncomputable real, and that it is impossible to decide whether the limit is computable or uncomputable. Is there still a difference between the rate of convergence of a computable sequence of rationals with computable limit and the rate of convergence of a computable sequence of rationals with uncomputable limit? We shall see later that this is indeed the case.


We are naturally led to the question: "can we slow down arbitrarily the rate of convergence of a computable sequence of rationals with computable limit?" The answer is negative. The first result states that no computable sequence (ai) of rationals which converges to a computable real can dominate a computable sequence of rationals converging to a non-computable real. Hence, although we can have a slow computable approximation of computable reals, we cannot slow it down arbitrarily.

Theorem 7.49. Let $(a_n)$ be a computable sequence of rationals converging to a computable real $\alpha$, and let $(b_n)$ be a computable sequence of rationals converging to a non-computable real $\beta$. Then, for every $c > 0$ there are infinitely many $i$ such that

$$|\beta - b_i| > c \cdot |\alpha - a_i|.$$

Proof. For the sake of a contradiction we assume that there are constants $c, d \in \mathbb{N}$ such that

$$|\beta - b_i| \le 2^c \cdot |\alpha - a_i|,$$

for all $i \ge d$. Let $(\tilde a_i)$ be a computable sequence of rationals such that for all $i$

$$|\alpha - \tilde a_i| \le 2^{-i}.$$

We define the computable function $h : \mathbb{N} \to \mathbb{N}$ by

$$h(i) = \min\{k \mid |a_k - \tilde a_k| \le 2^{-i-c-1} \text{ and } k \ge \max\{i + c + 1, d\}\}.$$

This function is well-defined because the sequences $(a_k)_k$ and $(\tilde a_k)_k$ tend to the same limit. We calculate for all $i$,

$$\begin{aligned} |\beta - b_{h(i)}| &\le 2^c \cdot |\alpha - a_{h(i)}| \\ &\le 2^c \cdot \left(|\alpha - \tilde a_{h(i)}| + |\tilde a_{h(i)} - a_{h(i)}|\right) \\ &\le 2^c \cdot \left(2^{-i-c-1} + 2^{-i-c-1}\right) \\ &= 2^{-i}. \end{aligned}$$

Hence, the computable sequence $(b_{h(i)})$ converges computably. This contradicts the assumption that its limit $\beta$ is a non-computable real. □

Theorem 7.61 will show that Theorem 7.49 is also true if we replace the computable real $\alpha$ by a non-random real $\alpha$ and the non-computable real

7.5 C.E. Reals, Domination and Degrees


$\beta$ by a random real $\beta$. In fact, the "domination relation" implies an estimate for the program-size complexity of the binary expansions of the reals. But first we shall define the domination relation introduced by Solovay [375].

7.5 Computably Enumerable Reals, Domination and Degrees

In this section we will introduce the notion of computably enumerable (c.e.) reals and will develop tools to compare the information contents of these types of reals.

Definition 7.50 (Soare). A real $\alpha$ is called computably enumerable (c.e.) if there is a computable, increasing sequence of rationals which converges to $\alpha$.

Note that the property of being c.e. depends only on the fractional part of the real number. In what follows we will concentrate more on reals in the unit interval. We start with several characterizations of c.e. reals. We fix an alphabet $\Sigma = \{0, 1\}$. For a prefix-free set $A \subset \Sigma^*$ we define a real number by

$$2^{-A} = \sum_{x \in A} 2^{-|x|},$$

which, due to Kraft's inequality, lies in the interval $[0, 1]$. For a set $X \subset \mathbb{N}$ we define the number

$$2^{-X-1} = \sum_{n \in X} 2^{-n-1}.$$

This number also lies in the interval $[0,1]$. If we disregard all finite sets $X$, which correspond to rational numbers $2^{-X-1}$, we get a bijection $X \mapsto 2^{-X-1}$ between the class of infinite subsets of $\mathbb{N}$ and the real numbers in the interval $(0, 1]$. If $0.y$ is the binary expansion of a real $\alpha$ with infinitely many ones, then $\alpha = 2^{-X_\alpha - 1}$ where $X_\alpha = \{i \mid y_i = 1\}$. Clearly, if $X_\alpha$ is c.e., then the number $2^{-X_\alpha - 1}$ is c.e., but the converse is not true as the Chaitin Omega Numbers show.
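For finite sets the two numbers just defined can be computed exactly. The following Python sketch is only an illustration under that finiteness assumption; the function names `is_prefix_free`, `weight_of_code` and `weight_of_set` are hypothetical and chosen for this example.

```python
from fractions import Fraction

def is_prefix_free(strings):
    # In lexicographic order, any string extending `a` follows `a` directly
    # (possibly after other extensions of `a`), so checking neighbours suffices.
    items = sorted(set(strings))
    return all(not b.startswith(a) for a, b in zip(items, items[1:]))

def weight_of_code(strings):
    # 2^(-A): the sum of 2^(-|x|) over all strings x in A.
    return sum(Fraction(1, 2 ** len(x)) for x in set(strings))

def weight_of_set(numbers):
    # 2^(-X-1): the sum of 2^(-n-1) over all n in X.
    return sum(Fraction(1, 2 ** (n + 1)) for n in set(numbers))
```

For the prefix-free set {0, 10, 110} the first sum is 7/8, in line with Kraft's inequality, and the finite set {0, 2} corresponds to the rational 5/8 = 0.101 in binary.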


We start with a characterization of c.e. reals $\alpha$ in terms of prefix-free c.e. sets of strings (which are exactly the domains of Chaitin computers) and in terms of the sets $X_\alpha$.

Theorem 7.51 (Calude-Hertling-Khoussainov-Wang). For a real $\alpha \in (0,1]$ the following conditions are equivalent:

1. The number $\alpha$ is c.e.

2. There is a computable, non-decreasing sequence of rationals $(a_n)$ which converges to $\alpha$.

3. The Dedekind set $\{p \in \mathbb{Q} \mid p < \alpha\}$ is c.e.

4. There is an infinite prefix-free c.e. set $A \subset \Sigma^*$ with $\alpha = 2^{-A}$.

5. There is an infinite prefix-free computable set $A \subset \Sigma^*$ with $\alpha = 2^{-A}$.

6. There is a total computable function $f : \mathbb{N}^2 \to \{0, 1\}$ such that:

(a) If for some $k, n$ we have $f(k, n) = 1$ and $f(k, n + 1) = 0$, then there is an $l < k$ with $f(l, n) = 0$ and $f(l, n + 1) = 1$.

(b) We have $k \in X_\alpha$ iff $\lim_{n\to\infty} f(k, n) = 1$.

Proof. It is obvious that conditions 1, 2 and 3 are equivalent, that $4 \Rightarrow 3$, and $5 \Rightarrow 4$.

For the implication $1 \Rightarrow 5$ we start with an increasing computable sequence of rationals $(a_j)$ with limit $\alpha$, and we assume that $0 < a_j < \alpha \le 1$, for all $j$. Using the computable sequence $(a_j)$ of rationals one can construct a non-decreasing computable sequence $(n_i)$ of positive integers and an increasing computable sequence $(k_j)$ of non-negative integers such that

$$\sum_{i=0}^{k_j} 2^{-n_i} < a_j < 2^{-j} + \sum_{i=0}^{k_j} 2^{-n_i},$$

for all $j$. Obviously $\sum_{i=0}^{\infty} 2^{-n_i} = \alpha$. By the Kraft-Chaitin Theorem 4.2 there are a one-to-one computable sequence $(x_i)$ of strings with $|x_i| = n_i$, for all $i$, and a Chaitin computer whose domain $A$ is the set $\{x_i \mid i \in \mathbb{N}\}$. The set $A$ is computable because the sequence $(|x_i|)$ of the lengths of the $x_i$ is non-decreasing. We obtain $\alpha = 2^{-A}$.

We now prove the implication $6 \Rightarrow 2$. We write $f_{k,n}$ for $f(k,n)$. We claim that 6(a) implies

$$0.f_{0,n} f_{1,n} \ldots f_{m,n} \le 0.f_{0,n+1} f_{1,n+1} \ldots f_{m,n+1}, \tag{7.8}$$


for all $m, n$. Assume that (7.8) is not true for some $m$ and some $n$. We fix this number $n$ and choose $m$ minimal such that (7.8) is not true. Then, because of $0.f_{0,n} f_{1,n} \ldots f_{m-1,n} \le 0.f_{0,n+1} f_{1,n+1} \ldots f_{m-1,n+1}$, we must have $f_{m,n} = 1$ and $f_{m,n+1} = 0$. By 6(a) there is a number $l < m$ with $f_{l,n} = 0$ and $f_{l,n+1} = 1$. Using the inequality

$$0.f_{0,n} f_{1,n} \ldots f_{l-1,n} \le 0.f_{0,n+1} f_{1,n+1} \ldots f_{l-1,n+1}$$

we obtain

$$\begin{aligned} 0.f_{0,n} f_{1,n} \ldots f_{m,n} &= 0.f_{0,n} f_{1,n} \ldots f_{l-1,n}\, 0\, f_{l+1,n} \ldots f_{m,n} \\ &< 0.f_{0,n} f_{1,n} \ldots f_{l-1,n}\, 1 \\ &\le 0.f_{0,n+1} f_{1,n+1} \ldots f_{l-1,n+1}\, 1 \\ &\le 0.f_{0,n+1} f_{1,n+1} \ldots f_{l-1,n+1}\, 1\, f_{l+1,n+1} \ldots f_{m,n+1} \\ &= 0.f_{0,n+1} f_{1,n+1} \ldots f_{m,n+1}, \end{aligned}$$

a contradiction! Thus, (7.8) is true for all $m, n$.

We define next the computable sequence $(a_n)$ of rationals by $a_n = 0.f_{0,n} f_{1,n} \ldots f_{n,n}$. Then, by (7.8), $a_n \le a_{n+1}$, for all $n$. Let $0.y = 0.y_0 y_1 y_2 \ldots$ be the binary expansion of $\alpha$ which contains infinitely many ones. We can prove by induction (on $k$) that the assumption 6(a) implies that for each $k$ the sequence $f(k, 0), f(k, 1), f(k, 2), \ldots$ changes its value only finitely many times. Hence the limit $\lim_{n\to\infty} f(k, n)$ exists. By 6(b), for each number $L$ there is a number $N_L$ with $y_k = f_{k,n}$ for all $k \le L$ and $n \ge N_L$. Hence, $|a_n - \alpha| \le 2^{-L}$, for all $n \ge \max\{L, N_L\}$. We conclude that $\lim_{n\to\infty} a_n = \alpha$. Hence, $(a_n)$ is a non-decreasing computable sequence of rationals converging to $\alpha$.

For the implication $1 \Rightarrow 6$ we consider an increasing computable sequence of rationals $(a_n)$ with limit $\alpha$. Again we can assume that $0 < a_n < \alpha \le 1$, for all $n$. We define $f$ such that $0.f_{0,n} f_{1,n} f_{2,n} \ldots$ is the binary expansion of $a_n$ containing infinitely many ones, for each $n$. Then $f$ is computable. From $a_n < a_{n+1}$ it follows that $f$ satisfies 6(a). The equivalence

$$k \in X_\alpha \text{ iff } \lim_{n\to\infty} f(k, n) = 1$$

follows from $\lim_{n\to\infty} a_n = \alpha$ and $a_n < \alpha$, for all $n$. □
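The passage from an increasing sequence of rationals to a function $f$ as in condition 6 can be tried out on a small example. The Python sketch below is an illustration only: the sequence `a` (with limit 2/3) is a hypothetical choice, digits are counted from 0, and the checks run over a finite range, so they illustrate condition 6(a) and inequality (7.8) rather than prove them.

```python
from fractions import Fraction
from math import ceil

def bit(q, k):
    # k-th digit (k = 0, 1, 2, ...) after the radix point of the binary
    # expansion of q in (0, 1] that contains infinitely many ones.
    return (ceil(q * 2 ** (k + 1)) - 1) % 2

def a(n):
    # Hypothetical increasing computable sequence of rationals, limit 2/3.
    return Fraction(2, 3) - Fraction(1, 3 * 2 ** n)

def f(k, n):
    # f(k, n) is the k-th digit of the expansion of a(n).
    return bit(a(n), k)

def satisfies_6a(f, K, N):
    # Check condition 6(a) for all k < K and all n < N.
    for n in range(N):
        for k in range(K):
            if f(k, n) == 1 and f(k, n + 1) == 0:
                if not any(f(l, n) == 0 and f(l, n + 1) == 1 for l in range(k)):
                    return False
    return True

def truncation(f, n):
    # The rational 0.f(0,n) f(1,n) ... f(n,n) used in the proof of 6 => 2.
    return sum(Fraction(f(k, n), 2 ** (k + 1)) for k in range(n + 1))
```

For instance, 1/2 is expanded as 0.0111..., so `bit(Fraction(1, 2), 0)` is 0, and the truncations `truncation(f, n)` form a non-decreasing sequence on the range tested.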

In order to compare the information contents of c.e. reals, Solovay [375] (see also Chaitin [118]) has introduced the following definition.


Definition 7.52 (Solovay). The real $\alpha$ is said to dominate the real $\beta$ if there are a partially computable function $f : \mathbb{Q} \to \mathbb{Q}$ and a constant $c > 0$ with the property that if $p$ is a rational number less than $\alpha$, then $f(p)$ is (defined and) less than $\beta$, and it satisfies the inequality

$$c \cdot (\alpha - p) \ge \beta - f(p).$$

In this case we write $\alpha \ge_{dom} \beta$ or $\beta \le_{dom} \alpha$. The relation $\le_{dom}$ is called the Solovay domination relation.

Roughly speaking, a real $\alpha$ dominates a real $\beta$ if from any good approximation to $\alpha$ from below (say, from a rational number $p < \alpha$ with $\alpha - p < 2^{-n}$) one can effectively obtain a good approximation to $\beta$ from below (a rational number $f(p) < \beta$ with $\beta - f(p) < 2^{-n+\text{constant}}$). For c.e. reals this can also be expressed as follows.

Lemma 7.53. A c.e. real $\alpha$ dominates a c.e. real $\beta$ iff there are computable, increasing (or non-decreasing) sequences $(a_i)$ and $(b_i)$ of rationals and a constant $c$ with $\lim_{n\to\infty} a_n = \alpha$, $\lim_{n\to\infty} b_n = \beta$, and $c(\alpha - a_n) \ge \beta - b_n$, for all $n$.

Proof. First, we assume that $\alpha$ dominates $\beta$. Let $(a_n)$ and $(\tilde b_n)$ be increasing, computable sequences of rationals converging to $\alpha$ and $\beta$, respectively. Since $\alpha$ dominates $\beta$ there are a constant $c > 0$ and an increasing, total computable function $g : \mathbb{N} \to \mathbb{N}$ with $c(\alpha - a_n) \ge \beta - \tilde b_{g(n)}$, for all $n$. We set $b_n = \tilde b_{g(n)}$.

On the other hand, assume now that $(\tilde a_n)$ and $(\tilde b_n)$ are computable, non-decreasing sequences converging to $\alpha$ and to $\beta$, respectively, and that $c > 0$ is a rational constant such that $c(\alpha - \tilde a_n) \ge \beta - \tilde b_n$, for all $n$. The sequences $(a_n)$ and $(b_n)$ defined by $a_n = \tilde a_n - 2^{-n}$ and $b_n = \tilde b_n - c 2^{-n}$ are computable, increasing, converge to $\alpha$ and to $\beta$, respectively, and satisfy $c(\alpha - a_n) \ge \beta - b_n$, for all $n$. We define a partially computable function $f : \mathbb{Q} \to \mathbb{Q}$ as follows. Given $p \in \mathbb{Q}$, compute the smallest $i$ such that $a_i \ge p$. If such an $i$ has been found, set $f(p) = b_i$. If $p < \alpha$, then $f(p)$ is defined and is smaller than $\beta$. It is clear that this function $f$ shows $\beta \le_{dom} \alpha$. □
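The partially computable function $f$ built at the end of this proof is an unbounded search. A minimal Python sketch, assuming two hypothetical toy sequences `a` (limit 1) and `b` (limit 1/2) that satisfy the inequality with $c = 1$, could look as follows; the `max_steps` parameter is an addition of this sketch, so that the case $p \ge \alpha$, where $f$ is undefined, can be observed without looping forever.

```python
from fractions import Fraction
from itertools import count

def a(n):
    # Hypothetical increasing sequence of rationals with limit alpha = 1.
    return 1 - Fraction(1, 2 ** n)

def b(n):
    # Hypothetical increasing sequence with limit beta = 1/2. For every n,
    # 1 * (alpha - a(n)) >= beta - b(n), so the constant c = 1 works.
    return Fraction(1, 2) - Fraction(1, 2 ** (n + 1))

def f(p, max_steps=None):
    # Search for the smallest i with a(i) >= p and return b(i).
    # Without a step bound the search diverges exactly when p >= alpha.
    steps = count() if max_steps is None else range(max_steps)
    for i in steps:
        if a(i) >= p:
            return b(i)
    return None  # no value found within the step budget
```

For example, `f(Fraction(3, 4))` stops at `i = 2` and returns 3/8, and indeed 1 - 3/4 = 1/4 is at least 1/2 - 3/8 = 1/8.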

Next we prove a few results about the structure of c.e. reals under $\le_{dom}$.

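Lemma 7.54. Let $\alpha$, $\beta$ and $\gamma$ be c.e. reals. Then the following conditions hold:

1. The relation $\ge_{dom}$ is reflexive and transitive.

2. For every $\alpha, \beta$ one has $\alpha + \beta \ge_{dom} \alpha$.

3. If $\gamma \ge_{dom} \alpha$ and $\gamma \ge_{dom} \beta$, then $\gamma \ge_{dom} \alpha + \beta$.

4. For every non-negative $\alpha$ and positive $\beta$ one has $\alpha \cdot \beta \ge_{dom} \alpha$.

5. If $\alpha$ and $\beta$ are non-negative, and $\gamma \ge_{dom} \alpha$ and $\gamma \ge_{dom} \beta$, then $\gamma \ge_{dom} \alpha \cdot \beta$.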

Proof. Statement 1 follows from the definition. For 2 we consider a rational number $p < \alpha + \beta$; because $\alpha$ and $\beta$ are c.e. reals, we can compute two rational numbers $p_1, p_2$ such that $p_1 < \alpha$, $p_2 < \beta$ and $p_1 + p_2 \ge p$. Now $\alpha + \beta - p \ge \alpha + \beta - p_1 - p_2 > \alpha - p_1$. Hence $\alpha + \beta \ge_{dom} \alpha$. For 3 we start with a constant $c$ such that for each rational number $p < \gamma$ we can find, in an effective manner, two rational numbers $p_1 < \alpha$ and $p_2 < \beta$ satisfying $c(\gamma - p) \ge \alpha - p_1$ and $c(\gamma - p) \ge \beta - p_2$. Then

$$2c \cdot (\gamma - p) \ge \alpha - p_1 + \beta - p_2 = \alpha + \beta - (p_1 + p_2).$$

The assertion 4 is clear for $\alpha = 0$. Let us assume that $\alpha > 0$. Given a rational $p < \alpha\beta$ we can compute two positive rationals $p_1 < \alpha$ and $p_2 < \beta$ such that $p_1 p_2 \ge p$. For $c = 1/\beta$ we obtain

$$c \cdot (\alpha\beta - p) \ge c \cdot (\alpha\beta - p_1 p_2) \ge c \cdot (\alpha\beta - p_1 \beta) = \alpha - p_1.$$

For the assertion 5, it follows immediately from Lemma 7.53 that all c.e. reals dominate 0. Therefore the assertion is true if $\alpha = 0$ or $\beta = 0$. Assume that $\alpha > 0$ and $\beta > 0$, and that $c$ is a constant such that, given a rational $p < \gamma$, we can find rationals $p_1 < \alpha$ and $p_2 < \beta$ satisfying $c(\gamma - p) \ge \alpha - p_1$ and $c(\gamma - p) \ge \beta - p_2$. We can assume that $p_1$ and $p_2$ are positive. With $\tilde c = c \cdot (\alpha + \beta)$ we obtain

$$\begin{aligned} \alpha\beta - p_1 p_2 &= \alpha(\beta - p_2) + p_2(\alpha - p_1) \\ &\le (\alpha + p_2)\, c\, (\gamma - p) \\ &< (\alpha + \beta)\, c\, (\gamma - p) \\ &= \tilde c\, (\gamma - p). \end{aligned}$$

□

Corollary 7.55. The sum of a random c.e. real and a c.e. real is a random c.e. real. The product of a positive random c.e. real with a positive c.e. real is a random c.e. real.

Proof. This follows from Lemma 7.54 and Theorem 7.59. □

Corollary 7.56. The class of random c.e. reals is closed under addition. The class of positive random c.e. reals is closed under multiplication.

Remark. Corollary 7.55 contrasts with the fact that addition and multiplication do not preserve randomness. For example, if $\alpha$ is a random number, then $1 - \alpha$ is random as well, but $\alpha + (1 - \alpha) = 1$ is not random.

For two reals $\alpha$ and $\beta$, $\alpha =_{dom} \beta$ denotes the conjunction $\alpha \ge_{dom} \beta$ and $\beta \ge_{dom} \alpha$. For a real $\alpha$, let

$$[\alpha] = \{\beta \in \mathbb{R} \mid \alpha =_{dom} \beta\} \quad \text{and} \quad \mathbb{R}_{c.e.} = \{[\alpha] \mid \alpha \text{ is a c.e. real}\}.$$

Theorem 7.57. The structure $(\mathbb{R}_{c.e.}; \le_{dom})$ is an upper semi-lattice. It has a least element which is the $=_{dom}$-equivalence class containing exactly all computable real numbers.

Proof. By Lemma 7.54 the structure $(\mathbb{R}_{c.e.}; \le_{dom})$ is an upper semi-lattice. Let $\alpha$ be a computable real, so there exists an increasing computable sequence $(a_n)$ of rationals with $|\alpha - a_n| \le 2^{-n}$. Clearly, if $\alpha$ dominates a c.e. real $\beta$, then $\beta$ must also be computable. Now let $\beta$ be a c.e. real and $(b_n)$ be an increasing computable sequence of rationals converging to $\beta$. We define an increasing computable sequence $(\tilde a_n)$ of rationals by $\tilde a_n = a_{g(n)}$, where $g : \mathbb{N} \to \mathbb{N}$ is the total computable function defined by

$$g(-1) = -1 \quad \text{and} \quad g(n) = \min\{m \mid m > g(n-1) \text{ and } 2^{-m} \le b_{n+1} - b_n\},$$

for all $n \in \mathbb{N}$. Then $(\tilde a_n) \to \alpha$, and $\beta - b_n > \alpha - \tilde a_n$ for all $n \in \mathbb{N}$. Hence, $\beta$ dominates $\alpha$. □
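The function $g$ from this proof is a simple minimisation, and the resulting inequality can be checked on an example. The Python sketch below uses two hypothetical toy sequences: `a` approximates $\alpha = 1/3$ with error at most $2^{-n}$, and `b` converges slowly to $\beta = 1$. Both limits are computable here, so the sketch illustrates only the bookkeeping, not a genuinely non-computable $\beta$.

```python
from fractions import Fraction

ALPHA = Fraction(1, 3)  # toy computable real
BETA = Fraction(1)      # toy limit of the slow sequence b

def a(n):
    # Increasing, with |ALPHA - a(n)| <= 2^(-n).
    return ALPHA - Fraction(1, 3 * 2 ** n)

def b(n):
    # Increasing and slowly converging to BETA.
    return 1 - Fraction(1, n + 2)

def g_values(count):
    # g(-1) = -1 and g(n) = min{m | m > g(n-1) and 2^(-m) <= b(n+1) - b(n)}.
    values, prev = [], -1
    for n in range(count):
        m = prev + 1
        gap = b(n + 1) - b(n)
        while Fraction(1, 2 ** m) > gap:
            m += 1
        values.append(m)
        prev = m
    return values
```

With these choices the first values of $g$ are 3, 4, 5, 6, and for each $n$ in the tested range the remainder `BETA - b(n)` exceeds `ALPHA - a(g(n))`.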

Comment. Corollary 7.110 and Theorem 7.109 will show that $(\mathbb{R}_{c.e.}; \le_{dom})$ also has a greatest element, which is the equivalence class containing exactly all Chaitin Omega Numbers.

We are now in a position to describe the relationship between the domination relation and the program-size complexity.

Lemma 7.58. For every $c \in \mathbb{N}$ there is a positive integer $N_c$ such that for every $n \in \mathbb{N}$ and all strings $x, y \in \Sigma^n$ with $|0.x - 0.y| \le c \cdot 2^{-n}$ we have $|H(y) - H(x)| \le N_c$.


Proof. For $n \ge 1$ and two strings $x, y \in \Sigma^n$ with $|0.x - 0.y| \le c \cdot 2^{-n}$, one can compute $y$ if one knows the canonical program $x^*$ of $x$ and the integer $2^n \cdot (0.x - 0.y) \in [-c, c]$. Consequently, there is a constant $N_c > 0$ depending only upon $c$ such that $H(y) \le H(x) + N_c$, for all $n \ge 1$, and all $x, y \in \Sigma^n$ with $|0.x - 0.y| \le c \cdot 2^{-n}$. The lemma follows by symmetry. □

Theorem 7.59 (Solovay). Let $x, y \in \Sigma^\omega$ be two infinite binary sequences such that both $0.x$ and $0.y$ are c.e. reals and $0.x \ge_{dom} 0.y$. Then

$$H(y(n)) \le H(x(n)) + O(1).$$

Proof. In view of the fact that $0.x \ge_{dom} 0.y$, there is a constant $c \in \mathbb{N}$ such that, for every $n \in \mathbb{N}$, given $x(n)$, we can find, in an effective manner, a rational $p_n < 0.y$ satisfying

$$\frac{2c}{2^{n+1}} \ge c \cdot \left(0.x - \left(0.x(n) - \frac{1}{2^{n+1}}\right)\right) \ge 0.y - p_n > 0.$$

Let $z_{p_n}$ be the first $n + 1$ digits of the binary expansion of $p_n$. Then

$$0 \le 0.y(n) - 0.z_{p_n} \le \frac{2c + 1}{2^{n+1}}.$$

Hence, by Lemma 7.58, we have

$$H(y(n)) \le H(z_{p_n}) + O(1) \le H(x(n)) + O(1).$$

□

Remark. If $\alpha \le_{dom} \beta$, then $\beta$ is "more random" than $\alpha$ in the sense that the program-size complexity of the first $n$ digits of $\alpha$ does not exceed the complexity of the first $n$ digits of $\beta$ by more than a constant, cf. Theorem 7.59. The more random an effective object is, the closer it is to Chaitin Omega Numbers; the less random an effective object is, the closer it is to computable reals. The converse implication is false, see Exercise 7.8.26.

A slightly more general form of Theorem 7.59 is true: the hypothesis that the sequence is increasing is not necessary.

Theorem 7.60. Let $(a_i)$ and $(b_i)$ be converging sequences with $0.x = \lim_{i\to\infty} a_i$ and $0.y = \lim_{i\to\infty} b_i$. If $(a_i)$ dominates $(b_i)$, then

$$H(y(n)) \le H(x(n)) + O(1).$$

Proof. For every $n$ and large enough $i$ we have $|0.x - a_i| \le 2^{-n-1}$ and hence,

$$|0.x(n) - a_i| \le |0.x(n) - 0.x| + |0.x - a_i| \le 2^{-n}.$$

Therefore, given $x(n)$, we can compute an index $i_n$ such that

$$|0.x(n) - a_{i_n}| \le 2^{-n}.$$

For this index $i_n$ we have

$$|0.x - a_{i_n}| \le |0.x - 0.x(n)| + |0.x(n) - a_{i_n}| \le 3 \cdot 2^{-n-1}.$$

Let $c > 0$ be a constant such that

$$|0.y - b_i| \le c \cdot |0.x - a_i|,$$

for all $i$. Let $z_n$ be the string consisting of the first $n + 1$ digits after the radix point of the binary expansion of $b_{i_n}$ (containing infinitely many ones). Then

$$\begin{aligned} |0.y(n) - 0.z_n| &\le |0.y(n) - 0.y| + |0.y - b_{i_n}| + |b_{i_n} - 0.z_n| \\ &\le 2^{-n-1} + c \cdot |0.x - a_{i_n}| + 2^{-n-1} \\ &\le 2^{-n-1} + c \cdot 3 \cdot 2^{-n-1} + 2^{-n-1} \\ &= (3c + 2) \cdot 2^{-n-1}. \end{aligned}$$

Hence, by Lemma 7.58, we have

$$H(y(n)) \le H(z_n) + O(1) \le H(x(n)) + O(1).$$

□

Theorem 7.61. Let $(a_n)$ be a computable sequence of rationals converging to a non-random real $\alpha$, and let $(b_n)$ be a computable sequence of rationals converging to a random real $\beta$. Then, for every $c > 0$ there are infinitely many $i$ such that

$$|\beta - b_i| > c \cdot |\alpha - a_i|.$$

Proof. For the sake of a contradiction assume that the assertion is not true and that $(a_i)$ dominates $(b_i)$. Let $\alpha = 0.x$ and $\beta = 0.y$ (we can assume without loss of generality that $\alpha$ and $\beta$ lie in the interval $[0,1)$). Then, by Theorem 7.60, there is a constant $c$ such that $H(y(n)) \le H(x(n)) + c$, for all $n$. This implies that $x$ is also random, i.e. $\alpha$ is random, a contradiction. □


We are now in a position to cast new light on Theorem 7.44.

Lemma 7.62. Let $(b_i)$ be a computable sequence of rationals which converges to a random real $\beta$. Then for every $d > 0$ and almost all $i$,

$$|\beta - b_i| > 2^{d-i}.$$

Proof. Let $d > 0$ be fixed. It is clear that we can assume without loss of generality that $\beta$ and all rationals $b_i$ lie in the interval $(0,1)$. Let $0.y$ be the binary expansion of $\beta$. For every $i$, let $z_i \in \Sigma^{i+1}$ be the string consisting of the first $i + 1$ digits after the radix point of the binary expansion of $b_i$ (containing infinitely many ones). Then

$$0 \le b_i - 0.z_i \le 2^{-i-1}.$$

Since the sequence $(z_i)$ is a computable sequence of strings there exists a constant $e_1$ such that for all $i$

$$H(z_i) \le 2 \log i + e_1. \tag{7.9}$$

For the sake of a contradiction let us assume that there are infinitely many $i$ with $|\beta - b_i| \le 2^{d-i}$. Then for all these $i$ we have

$$\begin{aligned} |0.y(i) - 0.z_i| &\le |0.y(i) - 0.y| + |0.y - b_i| + |b_i - 0.z_i| \\ &\le 2^{-i-1} + 2^{d+1} \cdot 2^{-i-1} + 2^{-i-1} \\ &= (2 + 2^{d+1}) \cdot 2^{-i-1}. \end{aligned}$$

With Lemma 7.58 we conclude that there is a constant $e_2$ such that $H(y(i)) \le H(z_i) + e_2$ for all these $i$. Using (7.9) we obtain

$$H(y(i)) \le 2 \log i + e_1 + e_2,$$

for infinitely many $i$. This contradicts the randomness of $y$, i.e. the randomness of the real $\beta$. □

The following result is a scholium to Theorem 7.44.

Scholium 7.63. Let $(a_i)$ be a computable sequence of rationals which converges computably to a computable real $\alpha$, and let $(b_i)$ be a computable sequence of rationals which converges monotonically to a random real $\beta$. Then for every $c > 0$ there exists a $d > 0$ such that for all $i \ge d$

$$|\beta - b_i| > c \cdot |\alpha - a_i|. \tag{7.10}$$

Proof. Let $(a_i)$ and $(b_i)$ be as in the scholium and fix a number $c > 0$. We show that (7.10) is true for almost all $i$. First, we show that it is sufficient to prove this for $c = 1$. Indeed, since we can enlarge $c$, we can assume that $c$ is a rational. Then we can prove the assertion for the sequence $(c a_i)$ instead of $(a_i)$ with the constant $c$ in (7.10) replaced by 1. The sequence $(c a_i)$ is also a computable sequence of rationals and it converges computably to the computable real $c\alpha$.

Secondly, we show that we can restrict ourselves to the case that the sequence $(a_i)$ is of the form $a_i = 2^{-s(i)}$ where $s : \mathbb{N} \to \mathbb{N}$ is a computable, non-decreasing, unbounded function with $s(0) = 0$. Indeed, since we will show $|\beta - b_i| > |\alpha - a_i|$ only for almost all $i$, we can forget finitely many terms of both sequences $(a_i)$ and $(b_i)$ and assume that $|\alpha - a_i| \le 1$, for all $i$. Since the sequence $(a_i)$ converges computably to $\alpha$ there is a computable function $g : \mathbb{N} \to \mathbb{N}$ with

$$|\alpha - a_i| \le 2^{-j},$$

for all $i, j$ with $i \ge g(j)$. We can additionally assume that $g$ is increasing and, because of $|\alpha - a_i| \le 1$ for all $i$, also that $g(0) = 0$. We define the computable, non-decreasing, unbounded function $s : \mathbb{N} \to \mathbb{N}$ by

$$s(0) = 0 \quad \text{and} \quad s(i) = \max\{j \mid g(j) \le i\}, \text{ for } i > 0.$$

Then we observe $i \ge g(s(i))$ and hence $|\alpha - a_i| \le 2^{-s(i)}$, for all $i$. Therefore, it is sufficient to prove that

$$|\beta - b_i| > 2^{-s(i)} \tag{7.11}$$

holds true for almost all $i$.

Hence, from now on we assume that $s : \mathbb{N} \to \mathbb{N}$ is a computable, non-decreasing, unbounded function with $s(0) = 0$ and we wish to show that (7.11) is true for almost all $i$. We define the computable non-decreasing function $f : \mathbb{N} \to \mathbb{N}$ by $f(i) = \max\{j \mid s(j) \le i\}$, for all $i$. Then we have for all $k \ge 0$

$$f(s(k)) = \max\{j \mid s(j) \le s(k)\} \ge k.$$

Finally we define a computable sequence $(\tilde b_i)$ by $\tilde b_i = b_{f(i)}$. Since the sequence $(b_i)$ converges monotonically there exists a constant $d \ge 0$ such that for all $i, j$ with $j \ge i$,

$$|\beta - b_j| \le 2^d \cdot |\beta - b_i|.$$

By Lemma 7.62 there exists a constant $e_1$ such that

$$|\beta - \tilde b_j| > 2^{d-j},$$

for all $j \ge e_1$. We set $e_2 = f(e_1) + 1$. Then $s(i) > e_1$ for all $i \ge e_2$. Because of $i \le f(s(i))$ for all $i \ge 0$ we obtain for all $i \ge e_2$

$$|\beta - b_i| \ge 2^{-d} \cdot |\beta - b_{f(s(i))}| = 2^{-d} \cdot |\beta - \tilde b_{s(i)}| > 2^{-d} \cdot 2^{d - s(i)} = 2^{-s(i)},$$

which completes the proof. □
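The auxiliary functions $s$ and $f$ used in this proof are both obtained by bounded search from an increasing function $g$ with $g(0) = 0$. The following Python sketch illustrates them for a hypothetical modulus $g(j) = j^2$ (an assumption of this example only) and lets one check the two facts used above, namely $i \ge g(s(i))$ and $f(s(k)) \ge k$.

```python
def g(j):
    # Hypothetical modulus of convergence: increasing, with g(0) = 0.
    return j * j

def s(i):
    # s(i) = max{j | g(j) <= i}. The search stops because g is increasing,
    # and it starts correctly because g(0) = 0 <= i.
    j = 0
    while g(j + 1) <= i:
        j += 1
    return j

def f(i):
    # f(i) = max{j | s(j) <= i}. The search stops because s is
    # non-decreasing and unbounded, and s(0) = 0 <= i.
    j = 0
    while s(j + 1) <= i:
        j += 1
    return j
```

For this choice of `g`, `s(i)` is the integer square root of `i`, so the first values of `s` are 0, 1, 1, 1, 2, and `f(1)` is 3.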

We have considered arbitrary converging and computable sequences $(a_i)$ and $(b_i)$ and have explicitly formulated two gaps with respect to the convergence rates, one from computable to non-computable reals, and one from non-random to random reals. Both results were based on the inequality $|\beta - b_i| > c|\alpha - a_i|$ holding for infinitely many $i$. Can we claim that $(b_i)$ converges slower than $(a_i)$? If we compare monotonically converging sequences with computable limit and monotonically converging sequences with random limit and replace the quantifier "for infinitely many $i$" by the quantifier "for almost all $i$", then it is justified to say that $(b_i)$ converges slower than $(a_i)$.

Theorem 7.64. Let $(a_i)$ be a computable sequence of rationals which converges monotonically to a computable real $\alpha$, and let $(b_i)$ be a computable sequence of rationals which converges monotonically to a random real $\beta$. Then for every $c > 0$ there exists a $d > 0$ such that for all $i \ge d$

$$|\beta - b_i| > c \cdot |\alpha - a_i|. \tag{7.12}$$

Proof. This follows immediately from Proposition 7.44 and Scholium 7.63.

□

We continue by comparing the domination relation with Turing reducibility. For every infinite sequence $x \in \Sigma^\omega$ such that $0.x$ is a c.e. real, let

$$A_x = \{v \in \Sigma^* \mid 0.v \le 0.x\} \quad \text{and} \quad A_x^{\#} = \{string(n) \mid x_n = 1\}.$$

Then, obviously, $A_x$ is a c.e. set which is Turing equivalent to $A_x^{\#}$.⁸ In the following, we establish the relationship between domination and Turing reducibility. Recall that we denote by $\chi_A$ the characteristic function of $A \subset \Sigma^*$.

⁸ Note that $A_x^{\#}$ is not necessarily a c.e. set.

Definition 7.65. A set $A \subset \Sigma^*$ is Turing reducible to a set $B \subset \Sigma^*$ (we write $A \le_T B$) if there is an oracle Turing machine $M$ such that $M^B(x) = \chi_A(x)$, for all $x \in \Sigma^*$.

Lemma 7.66. Let $x, y \in \Sigma^\omega$ be two infinite binary sequences such that both $0.x$ and $0.y$ are c.e. reals and $0.x \ge_{dom} 0.y$. Then $A_y \le_T A_x$.

Proof. Without loss of generality, we may assume that

$$x, y \notin \{x0000\ldots, x1111\ldots \mid x \in \Sigma^*\}. \tag{7.13}$$

Let $f : \Sigma^* \to \Sigma^*$ be a partially computable function and $c \in \mathbb{N}$ a constant satisfying the following inequality for all $n > 0$:

$$0 < 0.y - 0.f(x(n-1)) \le \frac{c}{2^n}.$$

Given a string $z$ we wish to decide whether $z \in A_y$. Using the oracle $A_x^{\#}$ we compute the least $i \ge 0$ such that either $0.f(x(i-1)) \ge 0.z$ or

$$0.z - 0.f(x(i-1)) > \frac{c}{2^i}.$$

Such an $i$ must exist in view of the relation $y \notin \{x0000\ldots, x1111\ldots \mid x \in \Sigma^*\}$. Finally, if $0.f(x(i-1)) \ge 0.z$, then $z \in A_y$; otherwise $z \notin A_y$. □

Does the converse of Lemma 7.66 hold true? A negative answer will be given in Corollary 7.114. Let $(CE; \le_T)$ denote the upper semi-lattice structure of the class of c.e. sets under Turing reducibility.

Definition 7.67. A strong homomorphism from a partially ordered set $(X, \le)$ to another partially ordered set $(Y, \le)$ is a mapping $h : X \to Y$ such that

1. For all $x, x' \in X$, if $x \le x'$, then $h(x) \le h(x')$.

2. For all $y, y' \in Y$, if $y \le y'$, then there exist $x, x'$ in $X$ such that $x \le x'$ and $h(x) = y$, $h(x') = y'$.

Theorem 7.68. There is a strong homomorphism from $(\mathbb{R}_{c.e.}; \le_{dom})$ onto $(CE; \le_T)$.

Proof. By Lemma 7.54 the structure $(\mathbb{R}_{c.e.}; \le_{dom})$ is an upper semi-lattice. Every $=_{dom}$-equivalence class of c.e. reals contains a c.e. real of the form $0.x$. Lemma 7.66 shows that by $0.x \mapsto A_x$ one defines a mapping from $(\mathbb{R}_{c.e.}; \le_{dom})$ to $(CE; \le_T)$, which satisfies the first condition in the definition of a strong homomorphism.

We have to show that this mapping also satisfies the second condition. Let $B, C \subset \Sigma^*$ be two c.e. sets with $C \le_T B$. To this end we will show that there are two c.e. reals $0.x$ and $0.y$ with the following three properties:

(I) $0.x$ dominates $0.y$,

(II) $A_x$ is Turing equivalent to $B$, and

(III) $A_y$ is Turing equivalent to $C$.

We can assume that the sets $B$ and $C$ are infinite and have the form $B = \{string(n) \mid n \in \tilde B\}$ and $C = \{string(n) \mid n \in \tilde C\}$, where $\tilde B$ is a c.e. set of odd natural numbers and $\tilde C$ is a c.e. set of even natural numbers. Then the set $D = B \cup C$ is Turing equivalent to $B$. Writing $\tilde D = \tilde B \cup \tilde C$, we define two sequences $x, y \in \Sigma^\omega$ by $x = \chi_{\tilde D}$ and $y = \chi_{\tilde C}$. The real numbers $0.x$ and $0.y$ are c.e. They have the properties (II) and (III) because $A_x$ is Turing equivalent to $A_x^{\#} = D$, which is Turing equivalent to $B$, and $A_y$ is Turing equivalent to $A_y^{\#} = C$. We are left to show that $0.x$ dominates $0.y$. Let $b_0, b_1, b_2, \ldots$ and $c_0, c_1, c_2, \ldots$ be one-to-one computable enumerations of $\tilde B$ and of $\tilde C$, respectively. The rational sequences

$$\left(\sum_{i=0}^{n} (2^{-b_i} + 2^{-c_i})\right)_n \quad \text{and} \quad \left(\sum_{i=0}^{n} 2^{-c_i}\right)_n$$

are increasing, computable, converge to $0.x$ and to $0.y$, respectively, and satisfy the inequality

$$0.x - \sum_{i=0}^{n} (2^{-b_i} + 2^{-c_i}) \ge 0.y - \sum_{i=0}^{n} 2^{-c_i}.$$

Hence, by Lemma 7.53, the number $0.x$ dominates $0.y$. □

Definition 7.69. Two sets $A, B$ are Turing equivalent if $A$ and $B$ are Turing reducible to each other. An equivalence class with respect to the relation $\equiv_T$ is called a Turing degree. A c.e. Turing degree is a Turing degree containing a c.e. set.

We write $\mathbf{a}$, $\mathbf{b}$, and so on to denote the Turing degrees. We define $\mathbf{a} \le \mathbf{b}$ if there is some $A \in \mathbf{a}$ and $B \in \mathbf{b}$ such that $A \le_T B$. Turing degrees form a partial order with respect to $\le_T$ which we denote by $D(\le)$. For example, $\mathbf{0}$ is the c.e. Turing degree containing all computable sets. Finally, identifying $\mathbb{N}$ with $\Sigma^*$ via the computable bijection $string$ we can talk about reducibility between sets of non-negative integers.

Recall that $(\varphi_x)$ is a Gödel numbering of all p.c. string functions. In what follows we will use a standard enumeration $(D_i)$ of the class of finite sets of strings ($D_0$ denotes $\emptyset$).

Definition 7.70. (a) Let

$$Halt = \{x \in \Sigma^* \mid \varphi_x(x) < \infty\},^{9}$$

and let $\Delta_2^0$ be the class of all sets $A \le_T Halt$.

(b) A computable approximation to a $\Delta_2^0$ set $A$ is a sequence $(D_{f(i)})$ of finite sets indexed by some computable function $f$ such that $\chi_A(x) = \lim_{i\to\infty} \chi_{D_{f(i)}}(x)$, for all $x$.

For $q \in \mathbb{Q} \cap [0,1]$ we write $q(x) = i$ if the $x$th bit of the binary representation of $q$ containing infinitely many ones is $i$. Rephrasing the Shoenfield Limit Lemma (see Odifreddi [321], p. 373) we get:

Proposition 7.71. For a real $\alpha \in [0,1]$ the following two conditions are equivalent:

(1) There exists a computable sequence $(a_i)$ of rationals converging to $\alpha$.

(2) $\alpha = 0.\chi_A$, for some $\Delta_2^0$ set $A$.

Proof. For the direct implication we can assume that all rationals $a_i$ lie in the unit interval $[0,1]$. We define $x \in A[s]$ if $x < s$ and $a_s(x) = 1$. Then $\chi_A = \lim_{s\to\infty} \chi_{A[s]}$ is a $\Delta_2^0$ set and $\alpha = 0.\chi_A$. Conversely, suppose $\alpha = 0.\chi_A$ where $A$ is a $\Delta_2^0$ set and $\{A[s]\}_{s \in \mathbb{N}}$ is a computable approximation to $A$. Let $q_s = 0.\chi_{A[s]}$. Then clearly $(q_s)$ is a computable sequence converging to $\alpha$. □

⁹ The standard notation $K$ was not convenient in this context. It is well known that the decision problem for $Halt$, the Halting Problem, is uncomputable; an information-theoretic proof will be discussed in Section 9.2.

Definition 7.72. We define the degree of a real $\alpha$, $\deg_T(\alpha)$, to be the degree of $A$, where $0.\chi_A$ is the fractional part of $\alpha$. Note that either there is a unique such set $A$ or there are two, one finite and one cofinite.

Theorem 7.73. Suppose $\alpha = 0.\chi_A$, for some $\Delta_2^0$ set $A$. Then, for every c.e. degree $\mathbf{b}$ there exists a computable sequence $(q_i)$ with limit $\alpha$ such that $\{q_i\}$ has degree $\mathbf{b}$.

Proof. Let $(p_i)$ be a computable sequence converging to $\alpha$ such that $\{p_i\}$ is infinite. We can construct a computable subsequence $(r_j)$ of $(p_i)$ such that $\theta(r_j)$ is strictly increasing. Let $B$ be an arbitrary infinite c.e. set of natural numbers and $b_0, b_1, b_2, \ldots$ be an effective injective enumeration of $B$. Then $(q_i) = (r_{b_i})$ is a computable sequence of rationals which converges to $\alpha$. We claim that $\{q_i\} \equiv_T B$. Indeed, a natural number $m$ is in $B$ iff $r_m$ is in $\{q_i\}$. Conversely, for an arbitrary rational number $s$ we can decide $s \in \{q_i\}$ by first asking whether $s \in \{r_i\}$. This is decidable because $\theta(r_i)$ is strictly increasing. If the answer is positive we compute the unique number $b$ with $r_b = s$, and ask whether $b \in B$. □

So far we have considered arbitrary computable sequences of rationals that converge. It is possible for the left cut $L(\alpha)$ to be c.e. and the set $A$ satisfying the equality $\alpha = 0.\chi_A$ not to be c.e. (see Exercise 7.8.23). Next we define the strongly $\omega$-c.e. sets and prove that if $L(\alpha)$ is c.e., then $A$ is a strongly $\omega$-c.e. set.

Definition 7.74. Let $A$ be a $\Delta_2^0$ set. We say that $A$ is strongly $\omega$-c.e. if there is a computable approximation $(A[s])_s$ to $A$ such that

1. $A[0] = \emptyset$,

2. If $x \in A[s] \setminus A[s+1]$, then there exists $y < x$ such that $y \in A[s+1] \setminus A[s]$.

The following theorem gives another characterization of c.e. reals.

Theorem 7.75. Let $\alpha$ be in $[0,1]$. Then, the following two conditions are equivalent:

(1) The real $\alpha$ is c.e.

(2) There is a strongly $\omega$-c.e. set $A$ such that $\alpha = 0.\chi_A$.

Proof. The implication $(1) \Rightarrow (2)$ holds for $\alpha = 0$. Suppose $\alpha > 0$ and $(q_i)$ is an increasing computable sequence of rationals in $[0,1]$ converging to $\alpha$. We define $\chi_A = \lim_{s\to\infty} \chi_{A[s]}$, where $A[s] = \{x \mid x < s \text{ and } q_s(x) = 1\}$. Then, $\alpha = 0.\chi_A$ and A is strongly $\omega$-c.e.

'*

For the converse implication, (2) (1), we consider a real 0: = O,XA, for some strongly w-c.e. set A. Let qs = O'XA[sj, where {A[s]} is a computable approximation to A satisfying Definition 7.74. Then L(o:) can be 0 enumerated from an enumeration of {qs I SEN}, so 0: is c.e. Corollary 7.76. If A is a strongly w-c. e. set, then A is of c. e. degree. Proof. As L(O.XA) rem 7.75.

=T

A, for A

c

N, the assertion follows from Theo0
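As an illustration of the approximation used in the proof of Theorem 7.75, the following Python sketch builds the stages $A[s]$ from a finite prefix of an increasing sequence of rationals and checks the two conditions of Definition 7.74 on that prefix. The function names and the sample sequence are illustrative assumptions, not part of the text, and a finite check of this kind can only support, never prove, that a set is strongly $\omega$-c.e.

```python
from fractions import Fraction
from math import floor


def bit(q, x):
    """Digit x (x = 0 is the first digit after the point) of the binary expansion of q in [0, 1)."""
    return floor(q * 2 ** (x + 1)) % 2


def stages_from_rationals(qs):
    """A[s] = {x | x < s and digit x of q_s is 1}, as in the proof of Theorem 7.75."""
    return [{x for x in range(s) if bit(q, x) == 1} for s, q in enumerate(qs)]


def satisfies_definition_7_74(stages):
    """Check A[0] is empty and that whenever x leaves, some y < x enters at the same stage."""
    if stages[0]:
        return False
    for old, new in zip(stages, stages[1:]):
        entered = new - old
        for x in old - new:
            if not any(y < x for y in entered):
                return False
    return True


# Hypothetical sample: q_s = 2/3 - 2^-(s+1), an increasing sequence with limit 2/3.
sample = [Fraction(2, 3) - Fraction(1, 2 ** (s + 1)) for s in range(16)]
approximation = stages_from_rationals(sample)
print(satisfies_definition_7_74(approximation))
```

The check relies on the fact that, for an increasing sequence, the first binary digit in which $q_s$ and $q_{s+1}$ differ changes from 0 to 1, which supplies the smaller element $y$ required by condition 2.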

Definition 7.77. A set $B \subset \mathbb{Q}$ of rationals is called a representation of $\alpha$ if there is an increasing computable sequence $(q_i)$ of rationals with limit $\alpha$ and $\{q_i\} = B$.

To study the degrees of sets of rational numbers we will identify a set $B \subset \mathbb{Q}$ with its image under a fixed computable bijection $\theta : \mathbb{Q} \to \mathbb{N}$ and call $\theta(B)$ a representation of $\alpha$. Next we will look at the Turing degrees of representations of c.e. reals. Clearly, $\deg_T(\alpha) = \deg_T(L(\alpha))$.

Lemma 7.78. Every c.e. degree is the degree of $L(\alpha)$, for some c.e. real $\alpha$.

Proof. Let $A$ be a c.e. set of degree $\mathbf{a}$ and let $\alpha$ be the c.e. real equal to $0.\chi_A$. Then it is clear that $L(\alpha) =_T A$. □

Definition 7.79. A splitting of a c.e. set $A$ is a pair of disjoint c.e. sets $A_1$ and $A_2$ such that $A_1 \cup A_2 = A$. Then we say that $A_1$ and $A_2$ form a splitting of $A$ and that each of the sets $A_1$ and $A_2$ is a half of a splitting of $A$.

Recall that the disjoint sum of two sets $A, B$ is defined by

$$A \oplus B = \{2n \mid n \in A\} \cup \{2n+1 \mid n \in B\}.$$

It is not difficult to see that $\deg_T(A \oplus B)$ is the least upper bound of $\deg_T(A)$ and $\deg_T(B)$, and so $D(\le)$ forms an upper semi-lattice. If $A_1$ and $A_2$ form a splitting of a c.e. set $A$, then $A =_T A_1 \oplus A_2$. The following two lemmata show the connection between representations of c.e. reals and splitting.

Lemma 7.80. If $B$ is a representation of a c.e. real $\alpha$, then $B$ is an infinite half of a splitting of $L(\alpha)$.

Proof. It is clear that any representation $B$ of a c.e. real $\alpha$ is an infinite c.e. subset of $L(\alpha)$. Hence, all we have to show is that $L(\alpha) \setminus B$ is c.e. Let $(q_i)$ be the increasing computable sequence of rationals with $B = \{q_i\}$. The set $L(\alpha)$ is c.e. For each element $p \in L(\alpha)$ we can wait until we find a $q_j$ with $p \le q_j$ (as rationals), and choose $p$ iff $p \notin \{q_0, \ldots, q_j\}$. Hence, we can enumerate $L(\alpha) \setminus B$. □
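The enumeration of $L(\alpha) \setminus B$ described in the proof of Lemma 7.80 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the name `other_half`, the sample representation `q` and the finite piece of the left cut are all hypothetical, and the search halts only because every input is assumed to lie in $L(\alpha)$.

```python
from fractions import Fraction


def other_half(left_cut_enumeration, q):
    """Yield the elements of L(alpha) that are not in B = {q(0), q(1), ...}.

    q is assumed to be increasing with limit alpha, and every p supplied
    by left_cut_enumeration is assumed to satisfy p < alpha.
    """
    for p in left_cut_enumeration:
        seen = []
        j = 0
        # Wait until some q(j) >= p appears; this exists because p < alpha.
        while True:
            seen.append(q(j))
            if q(j) >= p:
                break
            j += 1
        # p belongs to B iff it already occurred among q(0), ..., q(j).
        if p not in seen:
            yield p


def q(i):
    # Hypothetical representation: q_i = 2/3 - 2^-(i+1), increasing with limit 2/3.
    return Fraction(2, 3) - Fraction(1, 2 ** (i + 1))


# A finite piece of an enumeration of L(2/3): 1/6 and 5/12 are in B, 1/2 is not.
piece = [Fraction(1, 6), Fraction(1, 2), Fraction(5, 12)]
print(list(other_half(piece, q)))
```

Because the sequence is increasing, no element larger than $q_j$ can equal $p \le q_j$, so inspecting the finite prefix $q_0, \ldots, q_j$ settles membership of $p$ in $B$.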

Lemma 7.81. Let $B$ be a representation of a c.e. real $\alpha$ and let $C \subset B$. Then the following two conditions are equivalent:

1. $C$ is a representation of $\alpha$.
2. $C$ is an infinite half of a splitting of $B$.

Proof. The direct implication follows the proof of Lemma 7.80. For the converse, let $(q_i)$ be the increasing computable sequence of rationals with $B = \{q_i\}$, let $C$ be an infinite half of a splitting of $B$, and let $D$ be the other half of this splitting. We construct an increasing rational sequence $(p_i)$ with limit $\alpha$ and $C = \{p_i\}$ by going through the list $(q_i)$, waiting for each element $q_i$ until it is enumerated either in $C$ or in $D$, and finally choosing it iff it is enumerated in $C$. □
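The selection procedure in the converse direction of Lemma 7.81 can be sketched as follows. The function name `select_half` and the sample data are hypothetical, and plain integers stand in for rationals; the sketch assumes that the two enumerations really do split the set listed by `q_sequence`.

```python
def select_half(q_sequence, enumerate_c, enumerate_d):
    """Yield, in increasing order, the elements of q_sequence that are enumerated in C.

    enumerate_c and enumerate_d are enumerations (in any order) of the two
    halves C and D of a splitting of B = set(q_sequence).
    """
    seen_c, seen_d = set(), set()
    it_c, it_d = iter(enumerate_c), iter(enumerate_d)
    for q in q_sequence:
        # Wait until q shows up in one of the two enumerations.
        while q not in seen_c and q not in seen_d:
            progressed = False
            for iterator, seen in ((it_c, seen_c), (it_d, seen_d)):
                item = next(iterator, None)
                if item is not None:
                    seen.add(item)
                    progressed = True
            if not progressed:
                raise ValueError("element is in neither half: not a splitting")
        if q in seen_c:
            yield q


# Hypothetical data: B listed in increasing order, C and D enumerated out of order.
b_sequence = [1, 2, 3, 4, 5, 6]
c_enumeration = [4, 2, 6]
d_enumeration = [1, 5, 3]
print(list(select_half(b_sequence, c_enumeration, d_enumeration)))
```

The output order is inherited from the increasing sequence $(q_i)$, which is why the selected subsequence is again increasing even though $C$ itself may be enumerated in any order.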

Remark. From Lemma 7.80 it follows that $L(\alpha)$ is an upper bound for the degrees of representations of $\alpha$.

Corollary 7.82. If $B$ is a representation of a c.e. real $\alpha$, then $B \le_T L(\alpha)$.

For the special case of computable reals we then get the following:

Corollary 7.83. If $\alpha$ is a computable real, then every representation of $\alpha$ is computable.

For a c.e. real $\alpha$, let $\mathfrak{R}(\alpha)$ be the partial order (with respect to Turing reducibility) of those c.e. Turing degrees below $\deg_T(L(\alpha))$ that contain a representation of $\alpha$.

Proposition 7.84. For every c.e. real $\alpha$, $\mathfrak{R}(\alpha)$ is an upper semi-lattice.

Proof. Let $\alpha$ be a c.e. real. Then $\mathfrak{R}(\alpha)$ is closed under the usual join operation on Turing degrees. Indeed suppose $\mathbf{a}, \mathbf{b} \in \mathfrak{R}(\alpha)$ with $A$ and $B$ being representations of $\alpha$ in $\mathbf{a}$ and $\mathbf{b}$, respectively. Let $C = A \cup B$. Then $C$ is the representation of $\alpha$ formed by effectively enumerating the sequences of $A$ and $B$ in increasing order (as rationals). We claim that $\deg_T(C) = \mathbf{a} \cup \mathbf{b}$, i.e. $\theta(C) =_T \theta(A) \oplus \theta(B)$. It is obvious that $\theta(C) \le_T \theta(A) \oplus \theta(B)$. For the converse we use Lemma 7.81: the set $A$ is a half of a splitting of $C$, hence $\theta(A) \le_T \theta(C)$, and the same holds for $B$. □

We further study the upper semi-lattice $\mathfrak{R}(\alpha)$. We first prove that $\mathbf{0}$ and $\deg_T(L(\alpha))$ are in $\mathfrak{R}(\alpha)$.

Proposition 7.85. For any c.e. real $\alpha$ there is a computable representation of $\alpha$.

Proof. The classical result that every infinite c.e. set contains an infinite computable subset yields the assertion. □

Theorem 7.86. Every non-computable c.e. real number $\alpha$ has a non-computable representation.

Proof. We fix an increasing computable sequence $(q_i)$ converging to $\alpha$ such that $\{q_i\}$ is computable. We construct by stages a non-computable representation $B$ such that $(q_i)$ is a subsequence of $B$ and $B$ is not the complement of any c.e. set.

At stage $s = 0$ let $b_0 = q_0$. At stage $s+1$ we have already constructed $B[s] = \{b_0, \ldots, b_{k_s}\}$, where $b_0 < \cdots < b_{k_s}$ (as rationals) and $b_{k_s} = q_s$. If there is a least $e < s+1$ such that $W_e[s] \cap B[s] = \emptyset$ and an $x \in W_e[s]$ with $q_s < x \le q_{s+1}$, then let $b_{k_s+1} = x$, $b_{k_s+2} = q_{s+1}$ and $k_{s+1} = k_s + 2$. If there is no such $e$, then let $b_{k_s+1} = q_{s+1}$ and $k_{s+1} = k_s + 1$.

We complete the construction by letting $B = \bigcup_s B[s]$.

Clearly $(b_i)$ is an increasing computable sequence of rationals converging to $\alpha$. It remains to show that $B$ is not computable. Suppose $B$ is computable. Then let $e$ be the least index such that $\overline{B} = W_e$. Let $s_0$ be a stage such that for all $i < e$ and all $s \ge s_0$ we have $W_i[s] \cap B[s] \ne \emptyset$ or there is no $x \in W_i[s]$ with $q_s < x \le q_{s+1}$. We will show that for all $p > q_{s_0}$ (as rationals), $p \in L(\alpha)$ is decidable, contradicting the hypothesis of the theorem. To compute $p \in L(\alpha)$, we enumerate $B$ and $W_e$ until $p$ occurs in one of them. If $p \in B$ then $p \in L(\alpha)$. Otherwise $p \in W_e$ and we claim that $p \notin L(\alpha)$. Indeed, suppose that $p \in L(\alpha)$. Then at some least stage $t > s_0$, $q_t < p \le q_{t+1}$, and the construction enumerates some $p' \in B$ for $q_t < p' \le q_{t+1}$ and $p' \in W_e$. This contradicts $B \cap W_e = \emptyset$ and hence $B$ is not computable. □

Theorem 7.87 (Calude-Coles-Hertling-Khoussainov). Let $\alpha$ be a c.e. real. Then $\alpha$ has a representation of degree $L(\alpha)$. Furthermore, every representation of $\alpha$ can be extended to a representation of degree $L(\alpha)$.

Proof. Let $(p_i)$ be an increasing computable sequence of rationals converging to $\alpha$. We shall construct a new computable sequence $(q_i)$ of rationals such that $\{q_i\}$ is a representation of $\alpha$ with $\{q_i\} =_T L(\alpha)$. Additionally we define $l_i = \max\{\theta(p_j) \mid j \le i\}$, for all $i$, and the sequence $(j_i)$ of natural numbers with $q_{j_i} = p_i$, for all $i$. We start with $j_0 = 0$ and $q_0 = p_0$. Given $j_i$ with $q_{j_i} = p_i$, we define $j_{i+1} > j_i$ such that

$$j_{i+1} - j_i = \#(\{q \in \mathbb{Q} \mid p_i < q \le p_{i+1},\ \theta(q) \le l_{i+1}\}),$$

and for $m = 1, \ldots, j_{i+1} - j_i$ we define the numbers $q_{j_i+m}$ as the rational numbers in this set in increasing order.

It is obvious that $(q_i)$ is an increasing computable sequence of rationals converging to $\alpha$, and $q_{j_i} = p_i$, for all $i$. From Corollary 7.82 we know that $\{q_i\} \le_T L(\alpha)$. We still have to prove that $L(\alpha) \le_T \{q_i\}$. Let $p \in \mathbb{Q}$. In order to decide $p \in L(\alpha)$ we compute the minimal $k$ with $l_k \ge \theta(p)$. Then we check whether $p \le q_{j_k}$. If $p \le q_{j_k}$, then $p \in L(\alpha)$. If $p > q_{j_k}$, then $p \in L(\alpha)$ iff $p \in \{q_i\}$. □

Comment. The following alternative proof for the first assertion of Theorem 7.87 shows that we can obtain a representation of a of degree L( a) consisting only of dyadic rational numbers. We fix an increasing computable sequence (Pi) of dyadic rationals with limit a with increasing denominator

for a computable sequence (ni)i of integers and a computable, increasing sequence (ki)i of natural numbers. We shall construct a new computable sequence (qi)i of rationals such that {qd is a representation of a having Turing degree degT(L(a)). To this end we will define a sequence (ji) of natural numbers such that qji = Pi, for all i. We start with jo

= 0 and qo = PO. Given ji with % = Pi, we set m

qji+ m = qji

+ 2ki+l

Of course, (qi) is an increasing computable sequence of rationals converging to a since % = Pi, for all i. We have to show that L(a) "5.T {qi}. If a is a rational, then L(a) is computable, so "5.T {qd. So we assume that a is irrational. If the set {qi} contains a dyadic number 2~tl, then it contains all dyadic numbers in

,a)

the interval e~tl whose denominator is at most2k. But {qi} does not contain any number greater than a. Furthermore, the denominator of the dyadic number qji is at least 2ki 2: 2i. Hence, given {qi} as an oracle, for an arbitrary natural number l we can compute a dyadic rational (2n+ 1 )2- with k 2: l and such that the interval (2~tl, 2~t3) contains a.

k


Using $\{q_i\}$, for a given rational number $r$, we can decide whether $r < \alpha$ by computing such an interval which contains $\alpha$ but not $r$ (any sufficiently small interval containing the irrational number $\alpha$ will not contain $r$) and checking whether $r$ lies to the left or to the right of this interval.

Corollary 7.88. Every c.e. degree contains a representation of a c.e. real.

Proof. By Lemma 7.78 and Theorem 7.87. □

By Lemma 7.80 every representation of a c.e. real $\alpha$ is a half of a splitting of $L(\alpha)$. The following result shows that there is a representation of $\alpha$ of the same degree as the other half.

Theorem 7.89. Suppose $B$ is a representation of a c.e. real $\alpha$. Then there is a representation $C$ of $\alpha$ such that $C =_T L(\alpha) \setminus B$.

Proof. Let $(b_i)$ be the increasing computable sequence such that $B = \{b_i\}$. Let $(p_i)$ be a representation of $\alpha$ such that $\{p_i\}$ is computable and $\{p_i\} \cap \{b_i\} = \emptyset$. We construct a new increasing computable sequence of rationals $(c_i)$ such that $\{c_i\} =_T L(\alpha) \setminus B$. To this end we define $l_i = \max\{\theta(p_j) \mid j \le i\}$, for all $i$, and a sequence $(j_i)_i$ of natural numbers with $c_{j_i} = p_i$, for all $i$. We start with $j_0 = 0$ and $c_0 = p_0$. Let $b_{p_i}$ denote the least rational in $B$ which is greater than $p_i$. Then given $j_i$ with $c_{j_i} = p_i$, we define $j_{i+1} > j_i$ such that

$$j_{i+1} - j_i = \#(\{q \in \mathbb{Q} \mid p_i < q \le p_{i+1},\ \theta(q) \le l_{i+1},\ q \notin \{b_0, \ldots, b_{p_{i+1}}\}\}),$$

and for $m = 1, \ldots, j_{i+1} - j_i$ we define $c_{j_i+m}$ to be those rational numbers in this set in increasing order. Let $C = \{c_i\}$. It is clear that $(c_i)$ is an increasing computable sequence of rationals converging to $\alpha$, since $c_{j_i} = p_i$, for all $i$. We now show that $C =_T L(\alpha) \setminus B$.

First, $C \le_T L(\alpha) \setminus B$ as follows. Let $p \in \mathbb{Q}$. If $p \notin L(\alpha) \setminus B$, then $p \notin C$. Otherwise, if $p \in L(\alpha) \setminus B$, enumerate $C$ until reaching a least $c_i$ such that $c_i \ge p$. Then $p \in C$ iff $p \in \{c_0, \ldots, c_i\}$.

Secondly, $L(\alpha) \setminus B \le_T C$ as follows. Let $p \in \mathbb{Q}$. Compute the least $k$ such that $l_k \ge \theta(p)$ and then check whether $p \le c_{j_k}$. If $p \le c_{j_k}$, then enumerate $B$ until reaching a least $b_i$ such that $p \le b_i$, and conclude $p \in L(\alpha) \setminus B$ iff $p \notin \{b_0, \ldots, b_i\}$. Otherwise, $p > c_{j_k}$ and we can conclude that $p \in L(\alpha) \setminus B$ iff $p \in C$. □

Remark. Theorem 7.89 is also a strengthening of Theorem 7.87: we can take $B$ to be a computable representation in order to obtain the first part of Theorem 7.87.

So we have established that for non-computable c.e. reals $\alpha$, $\#(\mathfrak{R}(\alpha)) \ge 2$. Are there intermediate representations? That is, for every non-computable c.e. real $\alpha$, is there a representation $B$ such that $\emptyset <_T B <_T L(\alpha)$?

Definition 7.90. The jump $A'$ of a set $A$ is the relativization of Halt to oracle $A$.$^{10}$ A degree $\mathbf{a}$ is low if $\mathbf{a}' = \mathbf{0}'$.

For the next theorem we need a classical result due to Sacks (see, for example, Odifreddi [321] or Soare [372]).

Theorem 7.91 (Sacks Splitting Theorem). Let $A$ and $D$ be two non-computable c.e. sets. Then there are low c.e. sets $B$ and $C$ such that $A = B \cup C$, $B \cap C = \emptyset$ and $D \not\le_T B, C$.

In view of Theorem 7.87 every non-computable c.e. real has a non-computable representation.

Corollary 7.92. Assume that $\alpha$ is a non-computable c.e. real, $A$ is a non-computable representation of $\alpha$, and $D$ is a non-computable c.e. set. Then there exists a non-computable representation $B$ of $\alpha$ such that $B \le_T A$, $D \not\le_T B$ and $B$ is low.

Proof. We apply the Sacks Splitting Theorem to $A$ and $D$. At least one of the obtained sets $B$ and $C$ is non-computable, hence also infinite and by Lemma 7.81 a representation of $\alpha$. □

Corollary 7.93. Assume that $\alpha$ is a non-computable c.e. real and $A$ is a non-computable representation of $\alpha$. Then there is a low representation $B$ of $\alpha$ such that $\emptyset <_T B <_T A$.

Proof. We apply Corollary 7.92 with $D = A$. □

$^{10}$ In other words, $x \in A'$ iff $x \in W_x^A$. By (

Repeated applications of Corollary 7.93 show that every non-computable c.e. real number has infinitely many representations of different degrees, and in fact the representations of non-computable c.e. reals are downwards dense. That is, if A is a non-computable representation of a, then there is a non-computable representation B of a such that B
$\alpha$?

Definition 7.94. We say that $\alpha$ realizes the cone if for every c.e. set $A \le_T L(\alpha)$ there is a computable increasing sequence of rationals of degree $A$ converging to $\alpha$.

Next we consider the set

$$S(A) = \{\mathbf{c} \mid \deg_T(A_1) = \mathbf{c}, \text{ for some set } A_1 \text{ which is a half of a splitting of } A\}.$$

Definition 7.95. A c.e. set $A$ has the Universal Splitting Property (USP) if $S(A) = \{\mathbf{b} \mid \mathbf{b} \le \deg_T(A)\}$. Otherwise $A$ is non-USP. A c.e. degree $\mathbf{a}$ such that every c.e. set of degree $\mathbf{a}$ is non-USP is called a completely non-USP degree.

The existence of completely non-USP degrees comes from:

Theorem 7.96 (Lerman-Remmel). There is a completely non-USP degree.

We are now in a position to answer the above question:

Theorem 7.97. There is a non-computable c.e. real that does not realize the cone.

Proof. By Theorem 7.96 we can take a c.e. degree $\mathbf{a}$ which is completely non-USP. Let $\alpha$ be a c.e. real such that $L(\alpha) \in \mathbf{a}$. Since $\mathbf{a}$ is completely non-USP, there is a c.e. degree $\mathbf{b}$ such that $\mathbf{0} < \mathbf{b} < \mathbf{a}$ and $\mathbf{b}$ contains no half of a splitting of $L(\alpha)$.

Now suppose $\{q_i\}$ is a representation of $\alpha$. By Lemma 7.80 $\{q_i\}$ is a half of a splitting of $L(\alpha)$, and hence cannot have Turing degree $\mathbf{b}$. □

7.6 A Characterization of Computably Enumerable Random Reals

We have seen that every Chaitin Omega Number is a c.e. random real. Indeed, in view of (7.1), every $\Omega_U$ is c.e. and by Theorem 7.2 random. Are there c.e. random reals which are not Omega Numbers? In this section we will prove that the answer, rather unexpectedly, is negative. We now introduce a relation between c.e. sets which is very close, but not equivalent, to the domination relation.

Definition 7.98. Let $A, B$ be infinite, prefix-free c.e. sets. We say that the set $A$ strongly simulates the set $B$ (we write $B \le_{ss} A$) if there is a partially computable function $f : \Sigma^* \to \Sigma^*$ which satisfies the following three conditions:

1) $A = \mathrm{dom}(f)$,
2) $B = f(A)$,
3) $|x| \le |f(x)| + O(1)$, for all $x \in A$.

Note that $\le_{ss}$ is reflexive and transitive. Hence, the relation $\le_{ss}$ defines a partially ordered set $\langle \mathrm{C.E.}_{ss}; \le_{ss} \rangle$, where $\mathrm{C.E.}_{ss}$ is the set of $\equiv_{ss}$-equivalence classes of C.E.

Lemma 7.99. If $A, B$ are infinite prefix-free c.e. sets and $B \le_{ss} A$, then $\mu(B\Sigma^\omega) \le_{dom} \mu(A\Sigma^\omega)$.

Proof. Let $(x_i)$ be a one-to-one computable enumeration of $A$. Let $f$ be a function and $c > 0$ be a constant as in the above definition. For each $n$ and each $y \in B \setminus \{f(x_0), \ldots, f(x_n)\}$ there is a string $x \in A \setminus \{x_0, \ldots, x_n\}$ with $y = f(x)$ and $|x| \le |f(x)| + c$. Hence,

$$\mu(B\Sigma^\omega) - \mu(\{f(x_0), \ldots, f(x_n)\}\Sigma^\omega) = \mu((B \setminus \{f(x_0), \ldots, f(x_n)\})\Sigma^\omega) \le 2^c \cdot \mu((A \setminus \{x_0, \ldots, x_n\})\Sigma^\omega) = 2^c \cdot (\mu(A\Sigma^\omega) - \mu(\{x_0, \ldots, x_n\}\Sigma^\omega)).$$ □

The following partial converse of Lemma 7.99 is true and very important.

Theorem 7.100. Let $\alpha$ be a c.e. real, and $B$ be an infinite prefix-free c.e. set. If $\mu(B\Sigma^\omega) \le_{dom} \alpha$, then there is an infinite prefix-free c.e. set $A \subset \Sigma^*$ such that

$$B \le_{ss} A \quad \text{and} \quad \mu(A\Sigma^\omega) = \alpha.$$

Proof. Assume that $\mu(B\Sigma^\omega) \le_{dom} \alpha$. Let $(y_i)$ be a one-to-one computable enumeration of $B$ and $(a_n)$ be an increasing computable sequence of positive rationals converging to $\alpha$. In view of the domination property of $\alpha$, there are an increasing, total computable function $f : \mathbb{N} \to \mathbb{N}$ and a constant $c \in \mathbb{N}$ such that, for each $n \in \mathbb{N}$,

$$2^c \cdot (\alpha - a_n) \ge \mu(B\Sigma^\omega) - \sum_{i=0}^{f(n)} 2^{-|y_i|}. \qquad (7.14)$$

Without loss of generality, we may assume that$^{11}$

$$a_0 > \sum_{i=0}^{f(0)} 2^{-|y_i| - c}. \qquad (7.15)$$

$^{11}$ Otherwise we increase $c$.

We now construct a computable sequence $(n_i)$ of numbers and a computable double sequence $(m_{i,j})_{i,j \ge 0}$ of elements in $\mathbb{N} \cup \{\infty\}$. These numbers $n_i$ and the numbers $m_{i,j} \ne \infty$ will be the lengths of the strings in the set $A$ which will be constructed. The numbers $n_i$ will guarantee that $B \le_{ss} A$. The numbers $m_{i,j}$ will be used "to fill" the set $A$ up in order to get exactly $\alpha = \mu(A\Sigma^\omega)$. This will follow directly from equation (7.16) below.

Construction of $(n_i)$: Put $n_i = |y_i| + c$, for all $i$.

Beginning of construction of $(m_{i,j})$.

Stage 0. Let $m_{i,j} = \infty$, for all $i < f(0)$ and $j \in \mathbb{N}$, and define the positive integers $(m_{f(0),j})_j$ inductively in such a way that

$$\sum_{j=0}^{\infty} 2^{-m_{f(0),j}} = a_0 - \sum_{i=0}^{f(0)} 2^{-n_i}.$$

Stage $s$ ($s \ge 1$). If

$$a_s \le \sum_{i=0}^{f(s)} 2^{-n_i} + \sum_{i=0}^{f(s-1)} \sum_{j=0}^{\infty} 2^{-m_{i,j}},$$

then let $m_{i,j} = \infty$, for all $i$ with $f(s-1) < i \le f(s)$ and $j \in \mathbb{N}$. Otherwise, let $m_{i,j} = \infty$, for all $i$ with $f(s-1) < i < f(s)$ and $j \in \mathbb{N}$, and let positive integers $(m_{f(s),j})_{j \ge 0}$ be inductively defined in such a way that

$$\sum_{j=0}^{\infty} 2^{-m_{f(s),j}} = a_s - \left( \sum_{i=0}^{f(s)} 2^{-n_i} + \sum_{i=0}^{f(s-1)} \sum_{j=0}^{\infty} 2^{-m_{i,j}} \right).$$

End of construction of $(m_{i,j})$.

Next we prove the equality

$$\alpha = \sum_{i=0}^{\infty} \left( 2^{-n_i} + \sum_{j=0}^{\infty} 2^{-m_{i,j}} \right), \qquad (7.16)$$

by distinguishing the following two cases.

Case 1. If there are infinitely many stages s such that

then (7.16) holds.

Case 2. Assume the inequality

holds true for almost all sEN and we note that (7.17) For the inverse estimate, we define So to be the largest stage such that

Such a stage $s_0$ exists because of (7.15) and the construction. By (7.14) we have

$$\alpha - a_{s_0} \ge \sum_{i=f(s_0)+1}^{\infty} 2^{-|y_i| - c}.$$

Hence, by the construction,
(7.18)
By combining (7.17) and (7.18) we obtain the equality (7.16) also in this case.

Let $h : \mathbb{N} \to \{(i,j) \in \mathbb{N}^2 \mid m_{i,j} \ne \infty\}$ be a computable bijection (note that by construction the set $\{(i,j) \in \mathbb{N}^2 \mid m_{i,j} < \infty\}$ is infinite) and define a computable sequence $(m'_i)$ of numbers by $m'_i = m_{h(i)}$. Using this sequence we define $(n'_i)$ by $n'_{2i} = n_i$ and $n'_{2i+1} = m'_i$. By the Kraft-Chaitin Theorem 4.2 and (7.16), combined with $0 < \alpha \le 1$, we can construct a one-to-one computable sequence $(x_i)$ of strings with $|x_i| = n'_i$ such that the set $\{x_i \mid i \in \mathbb{N}\}$ is prefix-free. Setting $A = \{x_i \mid i \in \mathbb{N}\}$ and using (7.16), we obtain

$$\mu(A\Sigma^\omega) = \sum_{i=0}^{\infty} 2^{-n'_i} = \sum_{i=0}^{\infty} 2^{-n_i} + \sum_{i=0}^{\infty} 2^{-m'_i} = \alpha.$$

Finally we define a computable function $g : A \to B$ by $g(x_{2i}) = y_i$ and such that $|g(x_{2i+1})| \ge |x_{2i+1}|$, for all $i$. This is possible because $B$ is infinite. Obviously, $g(A) = B$, and $|x| \le |g(x)| + c$, for all $x \in A$, showing that $B \le_{ss} A$. □
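The step that turns a computable list of lengths into a prefix-free set of strings can be illustrated with a small Python sketch. It uses one standard greedy allocation over the binary alphabet and is offered only as an assumption-laden illustration: it is not claimed to be the proof of Theorem 4.2, and the names `kraft_chaitin` and `is_prefix_free` are hypothetical.

```python
def kraft_chaitin(lengths):
    """Assign, online, a binary string of each requested length so that the result is prefix-free.

    Works whenever the sum of 2^-n over the requested lengths is at most 1.
    """
    free = [""]      # roots of pairwise disjoint free subtrees, all of distinct lengths
    assigned = []
    for n in lengths:
        candidates = [w for w in free if len(w) <= n]
        if not candidates:
            raise ValueError("the requested lengths violate the Kraft inequality")
        w = max(candidates, key=len)   # longest free root that is short enough
        free.remove(w)
        x = w + "0" * (n - len(w))     # leftmost string of length n below w
        # The siblings hanging off the path from w to x stay free.
        for k in range(len(w), n):
            free.append(x[:k] + "1")
        assigned.append(x)
    return assigned


def is_prefix_free(strings):
    """True iff no string in the list is a proper prefix of, or equal to, another one."""
    return not any(
        i != j and t.startswith(s)
        for i, s in enumerate(strings)
        for j, t in enumerate(strings)
    )


codes = kraft_chaitin([2, 1, 3, 3])
print(codes, is_prefix_free(codes))
```

Choosing the longest admissible free root keeps all free roots at distinct lengths, which is what guarantees that a request never fails while the Kraft sum stays at most 1.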

Corollary 7.101. The mapping from $\langle \mathrm{C.E.}_{ss}; \le_{ss} \rangle$ to $\langle \mathbb{R}_{c.e.}; \le_{dom} \rangle$ induced by $A \mapsto 2^{-A}$, for $A \in \mathrm{C.E.}$, is a strong homomorphism.

Proof. The statement in the corollary follows from Lemma 7.99 and Theorem 7.100. □

Definition 7.102 (Solovay). We say that a computable, increasing and converging sequence $(a_i)$ of rationals is universal if for every computable, increasing and converging sequence $(b_i)$ of rationals there exists a number $c > 0$ such that $c(\alpha - a_n) \ge \beta - b_n$, for all $n \in \mathbb{N}$, where $\alpha = \lim_{n \to \infty} a_n$ and $\beta = \lim_{n \to \infty} b_n$. A real is called $\Omega$-like if it is the limit of a universal computable, increasing sequence of rationals.

Theorem 7.103 (Solovay). Let $U$ be a universal Chaitin computer. Every computable, increasing sequence of rationals converging to $\Omega_U$ is universal.

Proof. Let $(a_n)$ be an increasing, computable sequence of rationals with limit $\Omega_U$, and let $(b_n)$ be an increasing, computable, converging sequence of rationals. Set $\beta = \lim_{n \to \infty} b_n$. We will show that there is a constant $c > 0$ with $c(\Omega_U - a_n) \ge \beta - b_n$, for all $n$.

Let $(x_i)$ be a one-to-one, computable enumeration of $\mathrm{dom}(U)$, and $w_n = \sum_{i=0}^{n} 2^{-|x_i|}$. We define a total computable, increasing function $g : \mathbb{N} \to \mathbb{N}$, where we also define $g(-1) = -1$, by

$$g(n) = \min\{j > g(n-1) \mid w_j \ge a_n\}.$$

We have already seen that the sequence $(w_{g(n)})$ is an increasing, computable sequence with limit $\Omega_U$. In view of the inequality $\Omega_U - a_n \ge \Omega_U - w_{g(n)}$, it is sufficient to prove that there is a constant $c > 0$ such that for all $n \in \mathbb{N}$,

$$c(\Omega_U - w_{g(n)}) \ge \beta - b_n.$$

For each $i \in \mathbb{N}$, let $y_i$ be the first string (with respect to the quasi-lexicographical ordering) which is not in the set

$$\{U(x_j) \mid j \le g(i)\} \cup \{y_j \mid j < i\}.$$

Furthermore, put $n_i = \lfloor -\log(b_{i+1} - b_i) \rfloor + 1$. Since

$$\sum_{i=0}^{\infty} 2^{-n_i} \le \beta - b_0 < 1,$$

by the Kraft-Chaitin Theorem 4.2 we can construct a Chaitin computer $C$ such that, for every $i \in \mathbb{N}$, there is a string $u_i \in \Sigma^{n_i}$ satisfying $C(u_i) = y_i$. Hence, there is a constant $c_C$ such that $H_U(y_i) \le n_i + c_C$. In view of the choice of $y_i$, there is a string $x'_i \in \mathrm{dom}(U) \setminus \{x_j \mid j \le g(i)\}$ such that $|x'_i| \le n_i + c_C$ and $U(x'_i) = y_i$ (here we have used the fact that $U$ is surjective). For different $i$ and $j$ we have $y_i \ne y_j$, hence $x'_i \ne x'_j$. Finally we obtain

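$$\Omega_U - w_{g(n)} \ge \sum_{i=n}^{\infty} 2^{-|x'_i|} \ge \sum_{i=n}^{\infty} 2^{-n_i - c_C} \ge 2^{-c_C - 1} \sum_{i=n}^{\infty} (b_{i+1} - b_i) = 2^{-c_C - 1}(\beta - b_n),$$

which proves the assertion. □

We continue by observing that: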

Lemma 7.104. Any $\Omega$-like real dominates every c.e. real.

Theorem 7.105 (Calude-Hertling-Khoussainov-Wang). For every $\Omega$-like real $\alpha$ we can construct a universal Chaitin computer $U$ such that $\alpha = \Omega_U$. Hence, every $\Omega$-like real is a Chaitin Omega Number.

Proof. Let $V$ be a universal Chaitin computer. Since $\alpha$ is $\Omega$-like it dominates every c.e. real, in particular

$$\Omega_V = \mu(\mathrm{dom}(V)\Sigma^\omega) \le_{dom} \alpha.$$

By Theorem 7.100 there exist an infinite prefix-free c.e. set $A$ with $\mu(A\Sigma^\omega) = \alpha$, a computable function $f : A \to \mathrm{dom}(V)$ with $A = \mathrm{dom}(f)$, $f(A) = \mathrm{dom}(V)$, and a constant $c > 0$ such that $|x| \le |f(x)| + c$, for all $x \in A$. We define a Chaitin computer $U$ by $U(x) = V(f(x))$. The universality of $V$ implies the universality of $U$ and

$$\Omega_U = \mu(A\Sigma^\omega) = \alpha.$$ □

In view of Lemma 7.104 and Theorem 7.105 we get:

Theorem 7.106. Let $\alpha$ be a c.e. real. The following statements are equivalent:

1. There exists a universal computable, increasing sequence of rationals converging to $\alpha$.
2. Every computable, increasing sequence of rationals with limit $\alpha$ is universal.
3. The real $\alpha$ dominates every c.e. real.

Random reals can be directly defined as follows: a real $\alpha$ is random iff for every Martin-Löf test $A$, $\alpha \notin \bigcap_{i \ge 0} A_i$. In the context of reals, a Martin-Löf test $A$ is a constructive sequence of constructively open sets $(A_n)$ in the space $\Sigma^\omega$ such that $\mu(A_n) \le 2^{-n}$.

Lemma 7.107 (Slaman). Let $(a_n)$, $(b_n)$ be two computable, increasing sequences of rationals converging to $\alpha$ and $\beta$, respectively. One of the following two conditions holds:

A) There is a Martin-Löf test $A$ such that $\alpha \in \bigcap_{i \ge 0} A_i$.
B) There is a rational constant $c > 0$ such that $c(\alpha - a_i) \ge \beta - b_i$, for all $i \in \mathbb{N}$.

Proof. We enumerate the Martin-Löf test $A$ by stages. Let $A_n[s]$ be the union of finitely many open c.e. sets that have been enumerated into $A_n$ during stages less than $s$. We put $A_n[0] = \emptyset$ and $A_n[s+1] = A_n[s] \cup (a_s, a_s + (b_s - b_{s_0})2^{-n})$, in case $a_s \notin A_n[s]$ and $b_s \ne b_{s_0}$; here $s_0$ is the last stage during which we enumerated a c.e. open set into $A_n$ or $s_0 = 0$ if there was no such stage; otherwise, $A_n[s+1] = A_n[s]$. Clearly, $A_n = \bigcup_s A_n[s]$ is a disjoint union of c.e. open sets.

Let $t_1, t_2, \ldots, t_n, \ldots$ be the sequence of stages during which we enumerate open sets into $A_n$. Then,

$$\mu\Big(\bigcup_s A_n[s]\Big) = \sum_{i \ge 1} \mu(A_n[t_i]) = \frac{1}{2^n}\Big((b_{t_1} - b_0) + \sum_{j=1}^{\infty}(b_{t_{j+1}} - b_{t_j})\Big) \le \frac{1}{2^n}(\beta - b_0) < \frac{1}{2^n}.$$

If $\alpha \in \bigcap_{i \ge 0} A_i$, then A) holds. Assume that $\alpha \notin A_n$, for some $n$. We shall prove that $2^n(\alpha - a_i) \ge \beta - b_i$, for almost all $i$, so B) holds. If the open set $(a_s, a_s + (b_s - b_{s_0})2^{-n})$ is enumerated into $A_n$ at stage $s$, then there is a stage $t > s$ such that $a_t > a_s + (b_s - b_{s_0})2^{-n}$. We fix $i > 0$ and let $t_0$ be the greatest stage $t \le i$ such that we enumerate something into $A_n$ during stage $t$ or $t_0 = 0$, otherwise. Let $t_1, t_2, \ldots, t_n, \ldots$ be the sequence of stages after $t_0$ during which we enumerate open sets into $A_n$. Clearly, $t_0 \le i \le t_1$. As

$$\alpha - a_{t_1} > a_{t_k} - a_{t_1} + (b_{t_k} - b_{t_{k-1}})2^{-n},$$

for all $k$, it follows that

$$\alpha - a_{t_1} \ge \sum_{k \ge 1}(b_{t_k} - b_{t_{k-1}})2^{-n} = (\beta - b_{t_0})2^{-n}.$$

Finally, for every $i \ge \max\{t_0, t_1\}$,

$$\alpha - a_i \ge \alpha - a_{t_1} \ge (\beta - b_{t_0})2^{-n} \ge (\beta - b_i)2^{-n},$$

because $(a_n)$, $(b_n)$ are increasing. □
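The stage construction of a single level $A_n$ in the proof of Lemma 7.107 can be sketched on finite prefixes of the two sequences. The function name `enumerate_level` and the sample sequences are assumptions made for illustration; the sketch only checks the telescoping bound on the total length of the enumerated intervals, not the dichotomy A)/B) itself.

```python
from fractions import Fraction


def enumerate_level(a, b, n):
    """Run the stages of A_n on finite prefixes a[0..], b[0..] of the two sequences.

    Returns the list of enumerated open intervals (as pairs of endpoints)
    and the last stage s0 at which an interval was enumerated (0 if none).
    """
    intervals = []
    s0 = 0
    for s in range(len(a)):
        covered = any(low < a[s] < high for low, high in intervals)
        if not covered and b[s] != b[s0]:
            intervals.append((a[s], a[s] + (b[s] - b[s0]) / 2 ** n))
            s0 = s
    return intervals, s0


# Hypothetical sample sequences: a_s = s/(s+1) -> 1 and b_s = 1 - 2^-(s+1) -> 1.
a = [Fraction(s, s + 1) for s in range(20)]
b = [1 - Fraction(1, 2 ** (s + 1)) for s in range(20)]
intervals, last_stage = enumerate_level(a, b, 3)
total_length = sum(high - low for low, high in intervals)
print(len(intervals), total_length)
```

Each new interval has length $(b_s - b_{s_0})2^{-n}$ and then $s_0$ is reset to $s$, so the lengths telescope to $(b_{s_0} - b_0)2^{-n}$, which is the source of the measure bound in the proof.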

Theorem 7.108 (Slaman). Every c.e. random real is $\Omega$-like.

Proof. We apply Lemma 7.107: if A) holds, then $\alpha$ is not random; if B) holds, then $\beta \le_{dom} \alpha$, and the theorem follows as $\beta$ has been arbitrarily chosen. □

The following theorem summarizes the characterization of c.e. random reals:

Theorem 7.109. Let $\alpha \in (0,1)$. The following conditions are equivalent:

1. The real $\alpha$ is c.e. and random.
2. For some universal Chaitin computer $U$, $\alpha = \Omega_U$.
3. The real $\alpha$ is $\Omega$-like.
4. Every computable, increasing sequence of rationals with limit $\alpha$ is universal.

7.7 Degree-theoretic Properties of Computably Enumerable Random Reals

In this section we prove a few important degree-theoretic properties of c.e. random reals. We first obtain the following addendum to Theorem 7.57.

Corollary 7.110. The structure $\langle \mathbb{R}_{c.e.}; \le_{dom} \rangle$ has a greatest element which is the $=_{dom}$-equivalence class containing exactly all Chaitin Omega Numbers.

In analogy with Corollary 7.55 we obtain:

Corollary 7.111.

(1) The fractional part of the sum of an Omega Number and a c.e. real is an Omega Number.
(2) The fractional part of the product of an Omega Number with a positive c.e. real is an Omega Number.
(3) The fractional parts of the sum and product of two Omega Numbers are again Omega Numbers.

Proof. Use Lemma 7.54 and Theorem 7.109. □

We continue with a classical result:

Theorem 7.112 (Chaitin). Given the first $n$ bits of $\Omega_U$ one can decide whether $U(x)$ halts or not for every string $x$ of length at most $n$.

Proof. Assume that $\Omega = 0.\Omega_1\Omega_2 \ldots \Omega_n \ldots$, $x$ is an arbitrary program of length less than $n$ and proceed by dovetailing the computations of $U$ on all possible binary strings ordered quasi-lexicographically (considered as possible inputs). That is, we execute one step of the computation of $U$ on the first input, then the second step of the computation of $U$ on the first input and the first two steps of the computation of $U$ on the second input, and so on, and we observe halting computations. Any halting computation of $U$ on $x$ improves the approximation of $\Omega$ by $2^{-|x|}$. This process eventually leads to an approximation of $\Omega$ which is better than $0.\Omega_1\Omega_2 \ldots \Omega_n$. At this stage we check whether $x$ is among the halting programs; if it is not, then $x$ will never halt, because a new halting program $x$ will contribute to the approximation of $\Omega$ with $2^{-|x|} \ge 2^{-n}$, contradicting (7.4). □

Remark. The number $\Omega_U$ includes a tremendous amount of mathematical knowledge. According to Bennett [32, 206],

[Omega] embodies an enormous amount of wisdom in a very small space inasmuch as its first few thousand digits, which could be written on a small piece of paper, contain the answers to more mathematical questions than could be written down in the entire universe.

Of course, the above comment is not valid for every $\Omega_U$. Indeed, in view of Theorem 6.40, for every positive integer $n$ one can construct an Omega Number whose first $n$ bits are 0. However, the claim becomes true for every $\Omega_U$ if we replace the bound "a few thousand" by some appropriate larger number.
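The decision procedure in the proof of Theorem 7.112 can be made concrete on a toy model. In the sketch below, a finite table `toy_computer` of halting times stands in for the universal computer $U$; the table, its programs and the function names are hypothetical, and a genuine $\Omega_U$ is of course not available in this way. Dovetailing is simulated by raising a common step bound `t`.

```python
from fractions import Fraction
from math import floor


def halting_programs(n, omega_prefix, halting_time):
    """Return the programs of length <= n that halt, given the first n bits of Omega.

    halting_time is a toy stand-in for U: it maps every program of a prefix-free
    set to its number of steps, or to None if the program never halts.
    """
    target = Fraction(int(omega_prefix, 2), 2 ** n)   # the rational 0.Omega_1...Omega_n
    t = 0
    while True:
        halted = [x for x, steps in halting_time.items() if steps is not None and steps <= t]
        approximation = sum((Fraction(1, 2 ** len(x)) for x in halted), Fraction(0))
        if approximation >= target:
            # Any further halting program of length <= n would add at least 2^-n,
            # pushing the sum past 0.Omega_1...Omega_n + 2^-n > Omega.
            return sorted(x for x in halted if len(x) <= n)
        t += 1


# Hypothetical prefix-free toy computer: program -> halting time (None = diverges).
toy_computer = {"0": 3, "10": None, "110": 7, "1110": 2, "1111": None}
omega = sum(Fraction(1, 2 ** len(x)) for x, steps in toy_computer.items() if steps is not None)


def first_bits(n):
    """First n binary digits of the toy Omega, as a string."""
    return format(floor(omega * 2 ** n), "0{}b".format(n))


print(halting_programs(3, first_bits(3), toy_computer))
```

The running time of the loop is governed by how late the last relevant program halts, which in the genuine setting is the reason the procedure, although computable relative to the bits of $\Omega_U$, is unrealistically slow.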

It is worth noting that even if we get, by some kind of miracle, the first $n$ digits of $\Omega_U$, the task of solving the problems whose answers are embodied in these $n$ bits is computable but unrealistically difficult: the time it takes to find all halting programs of length less than $n$ from $0.\Omega_1\Omega_2 \ldots \Omega_n$ grows faster than any computable function of $n$. In a truly poetic description, Bennett continues:

Throughout history mystics and philosophers have sought a compact key to universal wisdom, a finite formula or text which, when known and understood, would provide the answer to every question. The use of the Bible, the Koran and the I Ching for divination and the tradition of the secret books of Hermes Trismegistus, and the medieval Jewish Cabala exemplify this belief or hope. Such sources of universal wisdom are traditionally protected from casual use by being hard to find, hard to understand when found, and dangerous to use, tending to answer more questions and deeper ones than the searcher wishes to ask. The esoteric book is, like God, simple yet undescribable. It is omniscient, and transforms all who know it ... Omega is in many senses a cabalistic number. It can be known of, but not known, through human reason. To know it in detail, one would have to accept its uncomputable digit sequence on faith, like words of a sacred text.

The converse implication in Theorem 7.112 is false. We shall return to this discussion in Sections 8.5 and 8.7.

Corollary 7.113. The real $0.\chi_{Halt}$ is not an Omega Number.

Proof. It is well known that $0.\chi_{Halt}$ is not random. □

Now we can answer the question raised after Lemma 7.66. Recall that the sets $A_\Omega$ and $A_{\chi_{Halt}}$ are defined as before Lemma 7.66.

Corollary 7.114. Let $\Omega$ be an Omega Number. Then the following statements hold:

1. $0.\chi_{Halt} \not\ge_{dom} \Omega$,
2. $A_\Omega =_T A_{\chi_{Halt}} =_T Halt$.

Proof. The first claim follows from Corollary 7.113. The relations $A_\Omega \le_T Halt =_T A_{\chi_{Halt}}$ are clear and $A_{\chi_{Halt}} \le_T A_\Omega$ follows from Lemma 7.66. □

Clearly, all Omega Numbers are in $\Delta^0_2$. Does there exist a random real in $\Delta^0_2$ which is not in the set $\{\alpha, 1 - \alpha \mid \alpha \text{ is c.e. random}\}$? The answer is positive.

Proposition 7.115. There is a random sequence $y$ with $A_y \in \Delta^0_2$ such that neither $0.y$ nor $1 - 0.y$ is a c.e. real.

Proof. Let $x = x_1x_2 \ldots$ be an infinite binary sequence such that $0.x$ is an Omega Number, hence $\Omega$-like. We define an infinite binary sequence $y = y_1y_2 \ldots$ by

if $i = 1$, if $3^n < i \le 2 \cdot 3^n$, if $2 \cdot 3^n < i \le 3^{n+1}$.

The sequence $y$ is obtained by recursively reordering the digits of the sequence $x$. Hence, also $y$ is a random sequence in $\Delta^0_2$. Next we show that neither $0.y$ nor $1 - 0.y$ is a c.e. real. In fact, we show more:

$$0.x \not\ge_{dom} 0.y \quad \text{and} \quad 0.x \not\ge_{dom} 1 - 0.y. \qquad (7.19)$$

By symmetry, it suffices to show that $0.x$ does not dominate $0.y$. For the sake of a contradiction, assume that $0.x \ge_{dom} 0.y$. Then, by Theorem 7.59,
and hence, by the definition of $y$, we obtain
for all $n \in \mathbb{N}$. That is,
Since

$$\lim_{n \to \infty} \left(3^{n+1} - 2 \cdot 3^n - H(\mathrm{string}(2 \cdot 3^n))\right) = \infty,$$

the sequence $x$ is not random by Theorem 6.99, hence we have proved (7.19). We conclude that neither $0.y$ nor $1 - 0.y$ is a c.e. real. □

If one could solve the Halting Problem, then one could compute the program-size complexity. What about the converse implication: can the Halting Problem be solved if one could compute program-size complexity?

We will show that the above question has an affirmative answer. In fact, a stronger result will be proven. To this end we need some more notation and definitions. Recall that Wx is the domain of i.px'

Definition 7.116. Let A, Be 'E*. (a) We say that A is weak truth-table (wtt) reducible to B (we write A :;'wtt B) if A :;'T B via a Turing reduction which on input x only queries strings of length less than g(x), where g : 'E* -+ N is a fixed computable function. (b) We say that A is truth-table (tt) reducible to B (we write A :;'tt B) if there is a computable sequence of Boolean functions {Fx }XE~*' Fx : 'ErxH -+ 'E, such that for all x, we have

7. C.E. Random Reals

306

(c) For two infinite sequences x, y E 2:;w we write O.X :;'wtt O.y (O.x :;'tt O.y) in case A~ :;'wtt A~ (A~ :;'tt A~).

Note that in contrast with tt-reductions, a wtt-reduction may diverge.
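The difference between the two notions can be seen in a small sketch. It is an illustration only, under simplifying assumptions that are not in the text: strings are identified with natural numbers, and the names `tt_reduce`, `bound`, `table` and the toy oracle `chi_B` are hypothetical. A tt-reduction fixes in advance both how much of the oracle is read (the number $r_x$) and a Boolean function that is defined for every possible pattern of answers, so it cannot diverge whatever the oracle is.

```python
def tt_reduce(x, oracle, bound, table):
    """Evaluate chi_A(x) = F_x(chi_B(0) ... chi_B(r_x)) for a tt-reduction.

    oracle(i) returns the bit chi_B(i); bound(x) plays the role of r_x;
    table(x, answers) plays the role of the Boolean function F_x and must
    be defined for every tuple of r_x + 1 bits.
    """
    r = bound(x)
    answers = tuple(oracle(i) for i in range(r + 1))
    return table(x, answers)


# Toy instance (an assumption for illustration): B = multiples of 3,
# and x is in A iff exactly one of x, x + 1 is in B.
def chi_B(i):
    return 1 if i % 3 == 0 else 0


def bound(x):
    return x + 1


def table(x, answers):
    return answers[x] ^ answers[x + 1]


def chi_A(x):
    return tt_reduce(x, chi_B, bound, table)
```

A wtt-reduction keeps only the computable bound on the queries: the computation performed on the answers may fail to halt for oracles other than the intended one.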

Definition 7.117. A c.e. set A is tt (wtt)-complete if Halt $\leq_{tt} A$ (Halt $\leq_{wtt} A$).

We will use Arslanov's¹² Completeness Criterion (see Odifreddi [321], p. 338 or Soare [372], p. 88) for wtt-reducibility.

Theorem 7.118 (Arslanov's Completeness Criterion). A c.e. set A is wtt-complete iff there is a function $f \leq_{wtt} A$ without fixed-points, i.e. $W_x \neq W_{f(x)}$, for all $x \in \Sigma^*$.

Next we show that c.e. random reals are wtt-complete, but not tt-complete.¹³

Theorem 7.119 (Arslanov-Calude-Chaitin-Nies). The set

£H $= \{(x, n) \mid x \in \Sigma^*, n \in \mathbf{N}, H(x) \leq n\}$

is wtt-complete.

Proof. We will use Theorem 7.118 and the formula

$$\max_{x \in \Sigma^n} H(x) = n + O(\log n) \qquad (7.20)$$

from Theorem 5.4. First we construct a positive integer $c > 0$ and a p.c. function $\psi : \Sigma^* \overset{o}{\to} \Sigma^*$ such that for every $x \in \Sigma^*$ with $W_x \neq \emptyset$,

$$U(\psi(x)) \in W_x \qquad (7.21)$$

and

$$|\psi(x)| \leq p(x) + c. \qquad (7.22)$$

¹² M. Arslanov.
¹³ In the next result Arslanov is for A. Arslanov, son of M. Arslanov.


We now consider a Chaitin computer C such that $C(0^{p(x)}1) \in W_x$ whenever $W_x \neq \emptyset$. Let $c'$ be the simulation constant of C on U in the Invariance Theorem and let $\theta$ be a p.c. function satisfying the following condition: if $C(u)$ is defined, then $U(\theta(u)) = C(u)$ and $|\theta(u)| \leq |u| + c'$. We put $c = c' + 1$ and note that in case $W_x \neq \emptyset$, $C(0^{p(x)}1) \in W_x$, so $\theta(0^{p(x)}1)$ is defined and belongs to $W_x$. Finally, we put $\psi(x) = \theta(0^{p(x)}1)$ and note that

Next we define the function

$$F(y) = \min\{x \in \Sigma^* \mid H(x) > p(y) + c\},$$

where the minimum is taken according to the quasi-lexicographical order and c comes from (7.22). In view of (7.20) it follows that

$$F(y) = \min\{x \in \Sigma^* \mid H(x) > p(y) + c, \ |x| \leq p(y) + c\}.$$

The function F is total, H-computable¹⁴ and $U(\psi(y)) \neq F(y)$ whenever $W_y \neq \emptyset$. Indeed, if $W_y \neq \emptyset$ and $U(\psi(y)) = F(y)$, then $\psi(y)$ is defined, so $U(\psi(y)) \in W_y$ and $|\psi(y)| \leq p(y) + c$. But, in view of the construction of F, $H(F(y)) > p(y) + c$, an inequality which contradicts (7.22):

$$H(F(y)) \leq |\psi(y)| \leq p(y) + c.$$

Let f be an H-computable function satisfying $W_{f(y)} = \{F(y)\}$. To compute $f(y)$ in terms of $F(y)$ we need to perform the test $H(x) > p(y) + c$ only for those strings x satisfying the inequality $|x| \leq p(y) + c$, so the function f is wtt-reducible to £H. We conclude by proving that for every $y \in \Sigma^*$, $W_{f(y)} \neq W_y$. If $W_{f(y)} = W_y$, then $W_y = \{F(y)\}$, so by (7.21), $U(\psi(y)) \in W_y$, that is, $U(\psi(y)) = F(y)$. Consequently, by (7.22), $H(F(y)) \leq |\psi(y)| \leq p(y) + c$, which contradicts the construction of F. □
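The bounded search that makes f a wtt-reduction can be sketched as follows. This is a toy illustration, not the construction itself: the real H is not computable, so the function `toy_H` below is an arbitrary stand-in playing the role of the oracle, and the names `quasi_lex` and `first_complex_string` are hypothetical. The point is only that the oracle is consulted on strings of length at most the bound, and on nothing else.

```python
from itertools import product


def quasi_lex(max_len, alphabet="01"):
    """Yield all strings of length <= max_len in quasi-lexicographical order."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def first_complex_string(H, bound):
    """Quasi-lex first x with |x| <= bound and H(x) > bound, or None.

    H is an oracle for complexity; it is queried only on strings of
    length at most bound, which is the computable bound on queries
    required of a wtt-reduction.
    """
    for x in quasi_lex(bound):
        if H(x) > bound:
            return x
    return None


def toy_H(x):
    # Stand-in oracle for illustration only; NOT program-size complexity.
    return len(x) + x.count("1")
```

With the stand-in oracle, `first_complex_string(toy_H, 2)` scans the empty string, `0`, `1`, `00` and stops at `01`.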

Theorem 7.120. The set £H is wtt-reducible to $\Omega_U$.
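The computational idea behind this reduction can be tried out on a toy model. The sketch below is an illustration under loud assumptions: dom(U) is replaced by a small finite prefix-free list `TOY_DOMAIN` (so the "Omega" here is a dyadic rational, not a random real), and the names `weight`, `omega_prefix`, `halting_stage` and `programs_up_to_length` are hypothetical. It shows how the first n binary digits of the sum fix a stage t of the enumeration after which only programs longer than n can still appear.

```python
from fractions import Fraction

# Hypothetical finite stand-in for an enumeration g of dom(U); prefix-free.
TOY_DOMAIN = ["110", "0", "1110", "10"]


def weight(program):
    """Contribution 2^(-|program|) of one program to the sum."""
    return Fraction(1, 2 ** len(program))


def omega_prefix(omega, n):
    """Return 0.Omega(n): omega truncated to its first n binary digits."""
    return Fraction(int(omega * 2 ** n), 2 ** n)


def halting_stage(g, omega_n):
    """Smallest t such that sum_{i <= t} 2^(-|g(i)|) >= omega_n."""
    partial = Fraction(0)
    for t, program in enumerate(g):
        partial += weight(program)
        if partial >= omega_n:
            return t
    raise ValueError("omega_n exceeds the total weight of the enumeration")


def programs_up_to_length(g, n):
    """All programs of length <= n, recovered from the first n digits."""
    omega = sum(weight(p) for p in g)
    t = halting_stage(g, omega_prefix(omega, n))
    # Everything enumerated after stage t has total weight < 2^(-n),
    # hence length > n, so the short programs are all among g(0..t).
    return {p for p in g[: t + 1] if len(p) <= n}
```

In the toy model the sum is 15/16; for n = 1 the search stops at stage t = 1, and the two programs still to come (`1110` and `10`) both have length greater than 1.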

Proof. Let $g : \mathbf{N} \to \Sigma^*$ be a computable, one-to-one function which enumerates the domain of U and we put $\omega_m = \sum_{i=0}^{m} 2^{-|g(i)|}$. Given x and $n > 0$ we compute the smallest $t \geq 0$ such that

$$\omega_t \geq 0.\Omega(n).$$

¹⁴ That is, computable using the subroutine H.

From Lemma 7.1

$$0.\Omega(n) \leq \omega_t < \omega_t + \sum_{s=t+1}^{\infty} 2^{-|g(s)|} = \Omega < 0.\Omega(n) + 2^{-n}$$

we deduce that $|g(s)| > n$, for every $s \geq t + 1$. Consequently, for all x,

$$x \notin \{g(0), g(1), \ldots, g(t)\} \text{ iff } H(x) > n.$$

Indeed, if $x \notin \{g(0), g(1), \ldots, g(t)\}$, then $H(x) > n$ as $H(x) = |g(s)|$, for some $s \geq t + 1$; conversely, if $H(x) \leq n$, then x must be produced via U by one of the elements of the set $\{g(0), g(1), \ldots, g(t)\}$. □

As a consequence we obtain

Theorem 7.121 (Juedes-Lathrop-Lutz). If Halt $\leq_{tt} x$, then x is not random.

Proof. Assume x is random and Halt $\leq_{tt} x$; that is, there exists a computable sequence of Boolean functions $\{F_u\}_{u \in \Sigma^*}$, $F_u : \Sigma^{r_u+1} \to \Sigma$, such that for all $w \in \Sigma^*$, we have $\chi_A(w) = F_w(x_0 x_1 \ldots x_{r_w})$. We will construct a Martin-Löf test V such that $x \in \bigcap_{n \geq 0} V_n\Sigma^\omega$, which will contradict the randomness of x. For every string z let

$$M(z) = \{u \in \Sigma^{r_z+1} \mid F_z(u) = 0\}.$$

Consider the set

$$\{z \in \Sigma^* \mid \mu(M(z)\Sigma^\omega) \geq 1/2\}$$

of inputs to the tt-reduction of Halt to x where at least half of the possible oracle strings give the output 0. This set is c.e., so let $W_{z_0}$ be a name for it. From the construction it follows that

$$z_0 \in \text{Halt iff } F_{z_0}(x_0 x_1 \ldots x_{r_{z_0}}) = 1,$$

hence if we put $r = r_{z_0} + 1$ and

we ensure that V is c.e. and $\mu(V_0\Sigma^\omega) \leq 1/2$. Moreover, $x \in V_0\Sigma^\omega$, because if $u = x|r$, then

Assume now that $z_n$, $V_n$ have been constructed such that $x \in V_n\Sigma^\omega$ and $\mu(V_n\Sigma^\omega) \leq 2^{-n-1}$. Let $z_{n+1} \notin \{z_0, z_1, \ldots, z_n\}$ be such that

$$W_{z_{n+1}} = \{u \in \Sigma^* \mid \mu(M(u)\Sigma^\omega \cap V_n\Sigma^\omega) \geq \mu(V_n\Sigma^\omega)/2\}.$$

Then

$$z_{n+1} \in \text{Halt iff } \mu(M(u)\Sigma^\omega \cap V_n\Sigma^\omega) \geq \mu(V_n\Sigma^\omega)/2.$$

Finally, we put $r = r_{z_{n+1}} + 1$ and

$$V_{n+1} = \{u \in \Sigma^r \mid u|r_{z_n} \in V_n \text{ and } (\mu(M(z_{n+1})\Sigma^\omega \cap V_n\Sigma^\omega) \geq \mu(V_n\Sigma^\omega)/2 \text{ iff } F_{z_{n+1}}(u) = 1)\}$$

and note that $V_{n+1}$ is c.e., $x \in V_{n+1}$ and

$$\mu(V_{n+1}\Sigma^\omega) \leq \mu(V_n\Sigma^\omega)/2 \leq 2^{-n-2}.$$

Consequently, $(V_n)$ is a Martin-Löf test with $x \in \bigcap_{n \geq 0} V_n\Sigma^\omega$. □

Because Omega Numbers are the same as $\Omega$-like reals, compared with a non-Omega Number, an Omega Number either contains more information or at least has its information structured in a more useful way. Indeed, we can find a good approximation from below to any c.e. real from a good approximation from below to any fixed Omega Number. Sometimes we wish to compute not just an arbitrary approximation (say, of precision $2^{-n}$) from below to a c.e. real, but instead, a special approximation, namely the first n digits of its binary expansion. Is the information in $\Omega$ organized in such a way as to guarantee that for any c.e. real $\alpha$ there exists a total computable function $g : \mathbf{N} \to \mathbf{N}$ (depending upon $\alpha$) such that from the first $g(n)$ digits of $\Omega$ we can actually compute the first n digits of $\alpha$? We show that the answer to this question is negative if one demands that the computation is done by a total computable function.

Theorem 7.122. The following statements hold:

1. For every c.e. real $\alpha$, $\alpha \leq_{tt} 0.\chi_{Halt}$.

2. $0.\chi_{Halt} \not\leq_{tt} \Omega$.

Proof For the first assertion we observe that for an arbitrary c.e. real O.x the set Ax is c.e., whence Ax :::;1 Halt (i.e. there is a computable oneto-one function 9 with Ax = g-l(Halt)). Since A~ :::;tt Ax we obtain

A~ :::;tt Halt. The second assertion follows from Theorem 7.121 and the randomness of D

n.

7.8

Exercises and Problems

1. Let X be an infinite c.e. subset of $dom(U_\lambda)$. Show that $\Omega(X) = \sum_{u \in X} Q^{-|u|}$ is also a Chaitin Omega Number.

2. (Hartmanis-Hemachandra-Kurtz) Show that a computable real function f has a Chaitin random root iff the set of roots of f has positive $\mu$ measure.

3. (Hemaspaandra) Is $x^2$ a random number provided $x \in (0, 1)$ is random?

4. Let A, B be two alphabets and $t : A^* \overset{o}{\to} B^*$ be a p.c., prefix-increasing function. Let $\mu_A$, $\mu_B$ be the product measures on $A^\omega$, $B^\omega$, respectively. We denote by T the natural extension of t to $A^\infty$, i.e. $T : A^\infty \overset{o}{\to} B^\infty$, $T(x) = t(x)$, for every $x \in dom(t)$, and $T(x) = \lim_{n \to \infty} t(x(n))$, for every $x \in A^\omega$. Call the transformation T measure-bounded in case there exists a natural $M \geq 1$ such that

for every c.e. subset $S \subset B^*$.

a) Show that the base transformation $\Gamma$ (see Section 7.2) is measure-bounded.

b) Show that every measure-bounded transformation T preserves random sequences, i.e. if $x \in A^\omega$ is a random sequence (over A) and $T(x) \in B^\omega$, then $T(x)$ is a random sequence (over B).

5. Show that the computable transformations $x \mapsto y$, $x \mapsto z$ mapping every binary sequence $x = x_1 x_2 \ldots x_n \ldots$ into the sequences $y = 0x_1 0x_2 \ldots 0x_n \ldots$ and $z = x_1 x_1 x_2 x_2 \ldots x_n x_n \ldots$ do not preserve randomness.

6. To each binary sequence $x = x_1 x_2 \ldots x_n \ldots \in \{0, 1\}^\omega$ we associate, following Szilard [396], the binary sequence $z = z_1 z_2 \ldots z_n \ldots$ where $z_1 = x_1$, $z_j = x_j \oplus x_{j-1}$, for $j = 2, 3, \ldots$ and $\oplus$ is the modulo-2 addition.

a) Show that y is random provided x is random.

b) Compare this result with von Mises' sequence y in Example 6.43.

c) Show that each of the sequences x, y, z can be obtained from the other two by computable transformations.

7. Let $x \in A^\omega$ be a random sequence over the alphabet A containing at least three letters and let $a \in A$. Delete from x consistently all occurrences of the letter a. Show that the new sequence is random over the alphabet $A \setminus \{a\}$.

8. Let $p : \mathbf{N} \to \mathbf{N}$ be a computable permutation of the naturals. Show that a sequence $x_1 x_2 \ldots x_n \ldots$ is random iff the sequence $x_{p(1)} x_{p(2)} \ldots x_{p(n)} \ldots$ is random.


9. Show that no sequence $x \in A_q^\omega$ is random over the alphabet $A_Q$ in case $Q > q \geq 2$.

10. (Dragomir) Let x be a random sequence over the alphabet {0, 1, 2}, y a random sequence over the alphabet {0, 1}, and z a random sequence over the alphabet {3, 4}. Construct a new sequence w over the alphabet {0, 1, 2, 3, 4} by inserting in x elements from z as follows: if $y_i = 1$, then insert on the ith position of x the letter $z_i$. All elements in x remain unchanged; they are just shifted to the right by accepting new elements from the disjoint alphabet {3, 4}. For example, if $y = 000101100\ldots$, then $w = x_1 x_2 x_3 x_4 z_4 x_5 x_6 z_6 x_7 z_7 x_8 x_9 \ldots$. Is w random?

11. (Staiger) Let $\alpha \in [0, 1]$ be a real number, and let $x \in A_Q^\omega$ and $y \in A_q^\omega$ be its base Q and base q expansions, respectively. Prove that there is a constant $c > 0$ such that for every $l \in \mathbf{N}$ the following equations hold true:

$$|K_Q(x(\lfloor l \cdot \log_Q q \rfloor)) - \log_Q q \cdot K_q(y(l))| \leq c,$$
$$|H_Q(x(\lfloor l \cdot \log_Q q \rfloor)) - \log_Q q \cdot H_q(y(l))| \leq c.$$

12. Deduce from the above relations the invariance of randomness under the change of base.

13. (Hertling-Weihrauch) Use the topological definition of random reals to prove the invariance under the change of base. (Hint: Consider the set of reals $\mathbf{R}$ with the usual Lebesgue measure $\mu$ and B the numbering of a base of the real line topology defined by $B_{\pi(i,j)} = \{x \in \mathbf{R} \mid |x - \nu_D(i)| < 2^{-j}\}$, where $\nu_D(\langle k, l, m \rangle) = (k - l)2^{-m}$ is defined on the set of dyadic reals $D = \{x \in \mathbf{R} \mid x = (i - j)2^{-k}, \text{ for some } i, j, k\}$. For the unit interval (0, 1) we work with the restriction of the Lebesgue measure and $B_i \cap [0, 1]$.)

14. The lower and upper limits of the relative complexity of a sequence $x \in A_r^\omega$ are defined by

$$\underline{\kappa}(x) = \liminf_{n \to \infty} \frac{H_r(x(n))}{n} \quad \text{and} \quad \overline{\kappa}(x) = \limsup_{n \to \infty} \frac{H_r(x(n))}{n}.$$

a) Prove that every $x \in A_r^\omega$ with $\underline{\kappa}(x) = 1$ is Borel normal.

b) Prove that every computable sequence $x \in A_r^\omega$ has $\overline{\kappa}(x) = 0$.

15. (Staiger) In view of Exercise 7.8.14, we can define the lower and upper limits of the relative complexity of a real number by $\underline{\kappa}(\nu_r(x)) = \underline{\kappa}(x)$ and $\overline{\kappa}(\nu_r(x)) = \overline{\kappa}(x)$. Prove that every Liouville number $\alpha \in [0, 1]$ has $\underline{\kappa}(\alpha) = 0$. Deduce that no Liouville number is random.

16. Prove that there are uncountably many Liouville numbers $\alpha$ such that for every $b \in \mathbf{N}$, $b \geq 2$, the sequence $x \in A_b^\omega$ with $\nu_b(x) = \alpha$ is disjunctive.

17. Show that the class of computable reals forms a real closed field.


18. Show that there is an algorithm to determine for any computable reals $\alpha \neq \beta$ whether $\alpha < \beta$ or $\alpha > \beta$.

19. Show that there is no algorithm to determine for any computable reals $\alpha$, $\beta$ whether $\alpha = \beta$ or $\alpha \neq \beta$.

20. Show that there exist two infinite prefix-free c.e. sets A and B such that $\mu(A\Sigma^\omega) = \mu(B\Sigma^\omega) = 1$ but $A \not\leq_{ss} B$ and $B \not\leq_{ss} A$. Hence, the mapping in Corollary 7.101 cannot be one-to-one.

21. Show that for every universal Chaitin computer U we can effectively construct two universal Chaitin computers $V_1$ and $V_2$ such that $\Omega_{V_1} = \frac{1}{2} \cdot \Omega_U$ and $\Omega_{V_2} = \frac{1}{2}(1 + \Omega_U)$.

22. Let U be a universal Chaitin computer, $\Omega_U = 0.\omega_1 \ldots$, and let $s = s_1 \ldots s_m$ be a binary string. Show that we can effectively construct a universal Chaitin computer W such that $\Omega_W = 0.s_1 \ldots s_m \omega_1 \ldots$.

23. (Soare) We associate to every subset $A \subset \mathbf{N}$ the real $\alpha = 0.\chi_A(1)\chi_A(2)\ldots$, where $\chi_A(i) = 1$ if $i \in A$ and $\chi_A(i) = 0$ if $i \notin A$, and we write $\alpha = 0.\chi_A$. Construct a set A which is not c.e. but $L(\alpha)$ is a c.e. set.

24. Let D be a total standard notation of all finite sets of words in $\Sigma^*$. Let $A, B \subset \Sigma^*$. Show that $A \leq_{tt} B$ iff there are two total computable functions $f : \Sigma^* \to \mathbf{N}$ and $g : \Sigma^* \to \Sigma^*$ such that $x \in A$ iff $\chi_B(f(x)) \in D_{g(x)}$.

25. (Soare) Show that $A \leq_{tt} L(\alpha)$ but $L(\alpha)$ is not necessarily truth-table reducible to A, although $L(\alpha) \leq_T A$.

26. (Calude-Coles) Show that there are c.e. reals $0.x$ and $0.y$ such that $H(x(n)) \leq H(y(n)) + O(1)$ and $0.y$ does not dominate $0.x$.

27. With reference to Corollary 7.92, construct directly a low non-computable representation B avoiding the upper cone of a c.e. D.

28. Show that O.x :::;tt O.y iff there are two total computable functions 9 : N -} Nand F: 2:* -} 2:* with x(n) = F(y(g(n))), for all n. 29. The preorder :::;tt has a maximum among the c.e. reals, but this maximum is not D, as no random c.e. real is maximal. 30. Show that for every c.e. real O.x there exist a total computable function g: N -} N and a p.c. function F : 2:* ~ 2:* with x(n) = F(D(g(n))), for all n. (Hint: use A~ :::;tt Ax.)

31. (Slaman) Let $(V_n)$ be a universal Martin-Löf test. Prove that for every $n \geq 1$, $\mu(V_n\Sigma^\omega)$ is c.e. and random.

32. (Downey) Prove that the following conditions are equivalent: a) b is the m-degree of a splitting of $L(\alpha)$, b) b is the wtt-degree of a representation of $\alpha$.


33. (Downey-Hirschfeldt-Nies) Show that for c.e. reals, a -:::;dom f3 iff there exists an integer c > 0 and a c.e. real '"Y such that cf3 = a + '"Y. (Hint: let (an) be a computable increasing sequence with limit a; then, by speedingup the enumeration, we can construct a computable, increasing sequence f3n with limit f3 such that for all n, f3n+ I - f3n < c . (a n+I - an); at each stage one part of c(an+l - an) makes f3n+1 - f3n and the other part makes '"Yn+l - '"Yn.) 34. (Downey-LaForte) Show the existence of an uncomputable c.e. real a such that every prefix-free set A such that a = OA is computable. 35. (Arslanov) We say that a set X is (n + l)-c.e. if X = Xl \ X 2 , for some c.e. set Xl and n-c.e. set X 2 ; c.e. sets are called l-c.e. sets. Show that for every positive integer n every sequence of n-c.e. degree strictly below 0' is not random. 36. (Arslanov) A sequence x is w-c.e. if there exist two computable functions f,g such that Xk = lims-oo f(s,k), f(O,k) = 0, and #({s EN I f(s,k) =1= f (s + 1, k)}) -:::; g( k). Show that there exist w-c.e. random sequences x such that x =T 0'. Give a direct construction of a non-computable c.e. real that does not realize the cone. (Hint: try a finite injury priority argument with strategies that resemble those needed to construct sets without the USP together with a technique to deal with computable sequences of rationals.) 37. (Kucera-Terwijn) Show the existence of a c.e. set A such that rand A rand. Here rand A is the relativization of rand to oracle A.

=

38. (Kummer) Construct a set A such that there is a constant c with K(XA(l) ... XA(n)) ~ 2logn - c, for infinitely many i.

7.9

History of Results

Theorem 7.2 was proved by Chaitin [114]; see also [122, 118, 121]. Section 7.2 follows Calude and Jürgensen [89]; other proofs of invariance can be found in Hertling and Weihrauch [235] and Staiger [383]. The material presented in Section 7.3 comes from Calude and Zamfirescu [100, 101]. Definition 7.22 comes from Jürgensen and Thierrin [245]. For disjunctive sequences see Staiger [384, 388]. The equivalence of the statements 1 and 3 in Theorem 7.106 comes from Chaitin [118]. The analysis of the convergence of computable sequences of rationals was developed in Calude and Hertling [82]; see also Ho [238]. The definition of c.e. reals was given in Soare [371]; we direct the reader to [371] for related work on the relative computability of cuts of arbitrary reals. Solovay's manuscript [375] contains the definition of the domination relation and its basic properties. The paper Calude, Hertling, Khoussainov and Wang contains the first detailed analysis of the Solovay domination relation. It has been followed by many papers, including Hertling and Wang [234], Hertling and Weihrauch [235], Slaman [369], Kučera and Slaman [266], Downey, Hirschfeldt and Nies [184], Downey and LaForte [185], Downey, Hirschfeldt and LaForte [183], Wu [442], Zheng [450], Rettinger, Zheng, Gengler and von Braunmühl [344], Downey [181], Downey and Hirschfeldt [182]. See also Calude [60, 66, 61].

Theorem 7.108 was proved in Slaman; the final paper, which has appeared as Kučera and Slaman [266], also contains a discussion of early results in the area of random reals published by Demuth [168, 169, 170, 171]. Kučera [265] and Kautz [250] were among the first studies of c.e. degrees of random reals. For example, they observed that 0' is the only c.e. degree which contains random reals. Kučera [265] has used Arslanov's Completeness Criterion to show that all random sets of c.e. T-degree are T-complete. Hence, every Chaitin Omega Number is T-complete. Theorem 7.119 is a stronger result; it summarizes results obtained in Arslanov, Calude [7], Chaitin [129], and Calude and Nies [95].

Theorem 7.96 and other facts regarding the universal splitting property come from Lerman and Remmel [274, 275]. Tadaki [397] has introduced and studied the following generalization of $\Omega$:

$$\Omega_U^D = \sum_{x \in dom(U)} 2^{-|x|/D},$$

where $D \in (0, 1]$. The numbers $\Omega_U^D$ and the real function $D \mapsto \Omega_U^D$ have very interesting randomness properties.

Exercise 7.8.2 comes from Hartmanis, Hemachandra and Kurtz [226]. Exercise 7.8.4 generalizes Proposition 6.5 in Schnorr [359]. Exercise 7.8.9 comes from Calude and Campeanu [64]. Exercise 7.8.10 was communicated to us by S. Dragomir [186]. Exercises 7.8.23, 25 come from Soare [371]. Exercise 7.8.26 was proved in Calude and Coles [75]; a simpler proof was discovered by Vereshchagin [416]. Exercise 7.8.31 comes from A. Kučera and Slaman [266]. Exercise 7.8.32 was proved in Downey [181], Exercise 7.8.33 was proved in Downey, Hirschfeldt and Nies [184] and Exercise 7.8.34 comes from Downey and LaForte [185]. Exercises 7.8.35, 36 come from Arslanov [6, 5, 4]. Exercise 7.8.37 comes from Kučera and Terwijn [267]. Kummer [269, 270] is the author of Exercise 7.8.38.

Chapter 8

Randomness and Incompleteness

All truth passes through three stages. First, it is ridiculed. Second, it is violently opposed. Third, it is accepted as being self-evident.
Arthur Schopenhauer

8.1

The Incompleteness Phenomenon

Gödel's Incompleteness Theorem (GIT) has the same scientific status as Einstein's principle of relativity, Heisenberg's uncertainty principle, and Watson and Crick's double helix model of DNA. Incompleteness has captured the interest of many: numerous books and thousands of technical papers discuss it and its implications. The March 29, 1999 issue of TIME magazine included Gödel and Turing in its list of the twenty greatest scientists and thinkers of the twentieth century. Interest in incompleteness dates from early times. Incompleteness was an important issue for Aristotle, Kant, Gauss and Kronecker, but it did not have a fully explicit, precise meaning before the works of Hilbert and Ackermann, Whitehead and Russell, Gödel and Turing.

In a famous lecture before the International Congress of Mathematicians (Paris, 1900), David Hilbert expressed his conviction of the solvability of every mathematical problem:

Wir müssen wissen. Wir werden wissen.¹

Hilbert highlighted the need to clarify the methods of mathematical reasoning, using a formal system of explicit assumptions, or axioms. Hilbert's vision was the culmination of 2000 years of mathematics going back to Euclidean geometry. He stipulated that such a formal axiomatic system should be both 'consistent' (free of contradictions) and 'complete' (in that it represents all the truth). In their monumental Principia Mathematica (1925-1927), Whitehead and Russell developed the first coherent and precise formal system aimed to describe the whole of mathematics. Although Principia Mathematica held great promise for Hilbert's demand, it fell short of actually proving its completeness. After proving the completeness of the system of predicate logic in his doctoral dissertation (1929), Gödel continued the investigation of the completeness problem for more comprehensive formal systems, especially systems encompassing all known methods of mathematical proof. In 1931 Gödel proved his famous first incompleteness result,² which reads:

Theorem 8.1 (Gödel's Incompleteness Theorem). Every axiomatic formal system which is (1) finitely specified, (2) rich enough to include the arithmetic, and (3) sound, is incomplete; that is, there exists (and can be effectively constructed) an arithmetical statement which (A) can be expressed in the formal system, (B) is true, but (C) is unprovable within the formal system.

Our main example of an axiomatic formal system is the Zermelo-Fraenkel set theory with choice, ZFC. We fix an interpretation of Peano Arithmetic (PA) in ZFC. Each sentence of the language of PA has a translation into a sentence of the language of ZFC, determined by the interpretation of PA in ZFC. A "sentence of arithmetic" indicates a sentence of the language of ZFC that is the translation of some sentence of PA. We will assume that ZFC is arithmetically sound: that is, any sentence of arithmetic which is a theorem of ZFC is true (in the standard model of PA).³

All conditions are necessary. Condition (1) says that there is an algorithm listing all axioms and inference rules (which could be infinite): the axioms and inference rules form a c.e. set. Taking as axioms all true arithmetical statements will not do, as this set is not c.e. But what does it mean to be a "true arithmetical statement"? It is a statement about non-negative integers which cannot be invalidated by finding any combination of non-negative integers that contradicts it. In Connes' terminology (see [145], p. 6), a true arithmetical statement is a "primordial mathematical reality". Condition (2) says that the formal system has all the symbols and axioms used in arithmetic, the symbols for 0 (zero), S (successor), + (plus), × (times), = (equality) and the axioms making them work (as, for example, $x + S(y) = S(x + y)$). Condition (2) cannot be satisfied if you do not have individual terms for 0, 1, 2, .... For example, Tarski proved that Euclidean geometry, which refers to points, circles and lines, is complete. Finally, (3) means that the formal system is free of contradictions.

The essence of GIT is to distinguish between truth and provability. A close analogy in real life is the distinction between truths and judicial decisions, between what is true and what can be proved in court.⁴ The essence of the original formulation of GIT involves the set Arith of true arithmetical sentences in which we use the usual operations of successor, addition and multiplication.⁵ It reads as follows:

Theorem 8.2 (Incompleteness of Arith). There is no formal axiomatic system satisfying all properties (1)-(2) in Theorem 8.1 and proving all true statements of Arith.

¹ We must know. We will know.
² The second incompleteness result states that consistency cannot be proved within the system.

Proof. Assume by contradiction that Arith is c.e., so there exists a computable function enumerating all elements of Arith. Let $F(i)$ be an arithmetical formula saying that the ith p.c. function $\varphi_i$ halts in i, i.e. $\varphi_i(i) < \infty$. It is clear that Arith is capable of expressing $F(i)$. But deciding whether $F(i)$ is true or false is equivalent to solving the Halting Problem. If there is no mechanical procedure for deciding the Halting Problem,⁶ then there is no complete set of underlying axioms either. Indeed, if there were, they would provide a (tremendously long) procedure for running through all possible proofs to show which programs halt! □

³ The metatheory is ZFC itself; that is, "we know" that PA itself is arithmetically sound.
⁴ The Scottish judicial system, which admits three forms of verdict (guilty, not-guilty and not-proven), comes closer to the picture described by GIT.
⁵ Actually, Gödel has investigated the more powerful system constructed in the Russell and Whitehead Principia Mathematica.

The above reasoning is important not only for justifying the GIT for Arith, but also because it shows that the details of the formal axiomatic system are not relevant for GIT! Indeed, we can ignore anything regarding the inner mechanism of the system, what the axioms are or what logic is used. What is important is the fact that there should be a proof-checking algorithm, an algorithm which may help to run through all possible proofs in size order, see which ones are correct and then print out all and only the theorems. This is impractical, but conceptually important:

the essence of a formal axiomatic system is the fact that its theorems form a c.e. set (under a suitable codification).

So, we are now in a position to reformulate the GIT for Arith as:

Theorem 8.3. The set Arith is not c.e.
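The remark that a proof-checking algorithm makes the set of theorems c.e. can be made concrete with a toy system. The sketch below is an assumption-laden illustration, not a formalization of arithmetic: "proofs" are binary strings, the hypothetical checker `check_proof` accepts a string of the form $1^a 0 1^b$ as a proof of the statement "a+b=c" with c = a + b, and `theorems` runs through all candidate proofs in size order, printing all and only the theorems.

```python
from itertools import count, islice, product


def candidate_proofs(alphabet="01"):
    """Yield every string over the alphabet, shortest first."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def check_proof(candidate):
    """Toy proof checker: return the theorem proved, or None if invalid.

    A valid proof is a string 1^a 0 1^b; it proves "a+b=c" with c = a + b.
    """
    if candidate.count("0") != 1:
        return None
    left, right = candidate.split("0")
    a, b = len(left), len(right)
    return f"{a}+{b}={a + b}"


def theorems():
    """Enumerate all and only the theorems of the toy system."""
    for candidate in candidate_proofs():
        theorem = check_proof(candidate)
        if theorem is not None:
            yield theorem


first_six = list(islice(theorems(), 6))
```

The enumeration never stops by itself, so a statement that is not a theorem is never reported as such; this is the difference between a c.e. set and a decidable one.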

As Chaitin has observed, there is more information in the above argument than in the original proof due to Gödel. Following Gödel we know that the axiomatic formal system is incomplete; however,

there still might be a mechanical procedure to decide if a given assertion is true or false!

This possibility was ruled out by the above argument. GIT ended a hundred years of attempts to put mathematics on an axiomatic basis. GIT does not destroy the fundamental idea of formalism, but suggests that a) mathematics will be described by many formal systems as opposed to a universal one, b) a more sophisticated and comprehensive form of formal system than that envisaged by Hilbert is required (see also Post [337]).

⁶ An information-theoretic proof of the undecidability of the Halting Problem will be presented in Section 9.2.


Anticipating resistance to his results, Gödel wrote his papers very carefully. He took pains to convince various people about the validity of his assertions and results, but he avoided any public debate and considered his results to have been accepted by those whose opinion mattered to him; see Dawson [161]. Unlike the other critics, Post expressed "the greatest admiration" for Gödel's work, conceding that

after all it is not ideas but the execution of ideas that constitute[s] ... greatness.

Gödel's result provoked Hilbert's anger, but he apparently accepted its correctness (cf. [161]). Hilbert never cited Gödel's work.

There is a variety of reactions in interpreting GIT, ranging from pessimism to optimism or simple dismissal (as irrelevant for the practice of mathematics). For pessimists, this result can be interpreted as the final, definite failure of any attempt to formalize the whole of mathematics. For example, H. Weyl acknowledged that GIT has exercised a "constant drain on the enthusiasm" with which he engaged himself in mathematics, and for S. Jaki, GIT is a fundamental barrier in understanding the Universe. In contrast, scientists like F. Dyson acknowledge the limit placed by GIT on our ability to discover the truth in mathematics, but interpret this in an optimistic way, as a guarantee that mathematics will go on forever (see Barrow [16], pp. 218-221). A lucid analysis of the impact of GIT in physics is presented in Barrow [17].

The reactions of two great philosophers are also of interest. Wittgenstein's negative comments (dated 1938 and posthumously published in "Remarks on the foundations of mathematics" in [436]) are now generally considered an embarrassment in the work of a great philosopher. Russell realized the importance of Gödel's work, but expressed his continuous puzzlement in a rather ambiguous way in a letter from 1 April 1963 (addressed to L. Henkin; see [161]):

Are we to think that 2+2 is not 4, but 4.001?

Following the same source, Gödel remarked (in a letter addressed to A. Robinson) that "Russell evidently misinterprets my result; however he does so in a very interesting manner ...".

In the long run Gödel's own interpretations of incompleteness prevailed: GIT neither rejected the notion of formal system (quite the opposite) nor caused despair over the imposed limitations. It reaffirms the creative power of human reason. In Post's celebrated words:

mathematical proof is [an] essentially creative [activity].


How large is the set of true and unprovable statements? If we fix a formal system satisfying all three conditions (1)-(3) in Theorem 8.1, then the set of true and unprovable statements is topologically "large" (constructively, a set of second Baire category, and in some cases even "larger"), cf. Calude, Jürgensen and Zimand [91]. No similar probabilistic result has been (yet?) proven. As we shall see later in this chapter (e.g. in Corollary 8.8), AIT forms of GIT suggest reinforcement of the above results: incompleteness is not an accident, it is a pervasive phenomenon. This raises the natural question (see Chaitin [135]): "How come that in spite of incompleteness, mathematicians are making so much progress?"

8.2

Information-theoretic Incompleteness (1)

This section presents the first information-theoretic approach to incompleteness. Incompleteness asserts a coding impossibility: an axiomatic system satisfying properties (1)-(3) in Theorem 8.1 does not have enough resources to "code" all true statements which it can express. Is it possible to get a more quantitative form of this fact? AIT is able to shed more light on GIT by analysing, following Chaitin [113, 115, 120, 122, 123, 125],⁷ the reason for this phenomenon. The main result can be informally stated as:

An axiomatic formal system of complexity N cannot yield a theorem that asserts that a specific object is of complexity substantially greater than N.

We consider an axiomatic formal system F whose rules of inference form a c.e. set of ordered pairs of the form $\langle a, T \rangle$, indicating that the theorem T is deducible from the axiom a.⁸ So, F is fixed and a, which is a string via some standard encoding, varies. The first information-theoretic version of GIT (see [123, 122, 125, 131, 136]) reads:

Theorem 8.4 (Chaitin Information-theoretic Incompleteness (I)). We consider an axiomatic formal system $F_a$ consisting of all theorems derived from an axiom a using the rules of inference F. There exists a constant $c_F$, depending upon the formal system $F_a$, such that if

$$a \vdash_F \text{``}H(x) > n\text{''} \text{ only if } H(x) > n,$$

then

$$a \vdash_F \text{``}H(x) > n\text{''} \text{ only if } n < H(a) + c_F.$$

⁷ See van Lambalgen [412] or Raatikainen [340] for critical discussions.
⁸ One often writes $a \vdash_F T$ instead of $\langle a, T \rangle$.

Proof. We shall present three proofs.

Information-theoretic direct proof. We consider the following Chaitin computer C: for $u, v \in \Sigma^*$ such that $U(u) = string(k)$ and $U(v) = a$ we put

$C(uv)$ = the first string s that can be shown in $F_a$ to have complexity greater than $k + |v|$.

Note that in the above definition "first" refers to the quasi-lexicographical order. To understand how C actually works just notice that the set

$$F_a = \{T \mid a \vdash_F T\} = \{T \mid \langle a, T \rangle\}$$

is c.e. Among the admissible inputs for C we may find the minimal self-delimiting descriptions for $string(k)$ and a, i.e.

$$u = (string(k))^*, \quad v = a^*,$$

having complexity $H(string(k))$, $H(a)$, respectively. If $C(uv) = s$, then

$$H_C(s) \leq |uv| \leq |(string(k))^* a^*|.$$

On the other hand, using the Invariance Theorem for U and C we get a constant d such that

$$k + |a^*| < H(s) \leq |(string(k))^* a^*| + d.$$

We therefore get the following inequalities:

$$k + H(a) < H(s) \leq H(string(k)) + H(a) + d,$$

hence

$$k < H(string(k)) + d = O(\log k),$$

which can be true only for finitely many values of the natural k. We now pick $c_F = k$, where k is a value that violates the above inequality. We have proven that s cannot exist for $k = c_F$, i.e. the theorem is proved.
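The last step, that an inequality of the form $k < O(\log k)$ can hold only for finitely many k, can be checked numerically. The sketch below is purely illustrative: it assumes a bound of the concrete shape $k < a \cdot \log_2(k+1) + d$, where the constants a and d and the function name `largest_k` are hypothetical and are not taken from the text (the true constants depend on U and C).

```python
import math


def largest_k(a, d, limit=10_000):
    """Largest k in [1, limit] with k < a*log2(k+1) + d, or None if none.

    For k >= 2 the difference k - a*log2(k+1) - d is increasing whenever
    a <= 2, so once the inequality fails it fails for all larger k.
    """
    best = None
    for k in range(1, limit + 1):
        if k < a * math.log2(k + 1) + d:
            best = k
    return best


threshold = largest_k(2, 10)
```

With a = 2 and d = 10 the inequality holds up to k = 18 and fails from k = 19 on, so any value above 18 could serve as the constant in this toy setting.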

Recursion-theoretic proof. Recall that $d(x)$ is a self-delimiting version of the string $x$. Let $(C_e)_{e \in \Sigma^*}$ be a c.e. enumeration of all Chaitin computers. We construct the Chaitin computer

$$C_w(d(x)) = y,$$

if $y$ is the first string such that a statement of the form "$C_x(d(x)) \neq z$" is provable in $F_a$ and $z = y$.

We prove first that

$$C_w(d(w)) = \infty.$$

Indeed, if $C_w(d(w)) \neq \infty$, then $C_w(d(w), \lambda) = y$, for some string $y \in \Sigma^*$; we admit that $y$ is the first such string. On the other hand one has

$$a \vdash_F \text{``}C_w(d(w), \lambda) \neq y\text{''},$$

and, in view of the soundness of the formal system,

$$C_w(d(w), \lambda) \neq y.$$

We thus have a contradiction. The set of axioms $a$ augmented with the axiom

$$\{C_w(d(w), \lambda) = y\}$$

is consistent, for every string $y$. Otherwise we would have

$$a \vdash_F \text{``}C_w(d(w), \lambda) \neq y\text{''},$$


for some string $y$, a false relation. Finally, the set of axioms $a$ augmented with the axiom

$$\{H(y) \le |d(w)| + c\}$$

($c$ comes from the Invariance Theorem applied to $C_w$ and $U$) is also consistent, showing that in the formal system $F_a$ one cannot deduce any statement of the form "$H(y) > |d(w)| + c$".

Information-theoretic indirect proof. We delete from the list of theorems all statements which are not of the form "$H(y) > m$" (this operation can be effectively performed, so it may increase the complexity by at most a constant factor) and identify the set of theorems with a c.e. subset with Gödel number $e$ of the set

$$\{\langle w, m \rangle \in \Sigma^* \times \mathbf{N} \mid H(w) > m\}.$$

In view of Theorem 5.33 all codes of theorems are bounded in the second argument by a constant (not depending on $e$), thus completing the proof.

□

Remark. A false reading of Theorem 8.4 might say that the complexity of theorems proven by $F_a$ is bounded by $H(a) + c_F$. This reading is wrong: if the set of theorems proven by $F_a$ is infinite, then their program-size complexities will be arbitrarily large.

How does Theorem 8.4 compare with Theorem 8.2? To answer this question we need a result of the type of Theorem 8.3 for Theorem 8.4. This is Theorem 5.31 (more precisely, in its proof we showed that the set $C = \{\langle w, m \rangle \in \Sigma^* \times \mathbf{N} \mid H(w) > m\}$ is immune). Of course, no immune set is c.e., and the converse implication is not generally true. Is Arith immune? The answer is negative, as it is clear that Arith has infinite c.e. subsets.

To understand better that immunity is a stronger form of non-computability than non-c.e., let us stop for a moment and describe a set which is not immune. Following Delahaye [164] such a set $A$ may be called "approximable" as it is either finite or contains an infinite c.e. set $B$, so $A = \bigcup_{n \ge 1} A_n$, where

$$A_n = (A \cap \{x \in \Sigma^* \mid n \ge |x|\}) \cup B,$$

i.e. $A$ is a union of c.e. sets.


To conclude, Theorem 8.4 is stronger than Theorem 8.2. Recognizing high complexity is a difficult task even for ZFC. The difficulty depends upon the choice of $U$: some $U$'s are worse than others. Raatikainen [340] has shown that there exists a universal Chaitin computer $U$ so that ZFC, if arithmetically sound, can prove no statement of the form "$H_U(x) > n$". It follows that ZFC, if arithmetically sound, can prove no (obviously, true) statement of the form "$H_U(x) > 0$".

8.3 Information-theoretic Incompleteness (2)

Consider now a Diophantine equation, i.e. an equation of the form

$$P(n, x, y_1, y_2, \ldots, y_m) = 0,$$

where $P$ is a polynomial with integer coefficients. The variable $n$ plays an important role as it is considered to be a parameter; for each value of $n$ we define the set

$$D_n = \{x \in \mathbf{N} \mid P(n, x, y_1, y_2, \ldots, y_m) = 0, \text{ for some } y_1, y_2, \ldots, y_m \in \mathbf{Z}\}.$$

It is clear that for every polynomial $P$ of $m + 2$ arguments the associated set $D_n$ is c.e. By Matiyasevich's Theorem, every c.e. set is of the form $D_n$.

In particular, there exists a universal polynomial $P$ such that the corresponding set $D_n$ encodes all c.e. sets. So,

$$P(n, x, y_1, y_2, \ldots, y_m) = 0, \tag{8.1}$$

iff the $n$th computer program outputs $x$ at "time" $(y_1, y_2, \ldots, y_m)$. The diagonal set is not c.e., so there is no mechanical procedure for deciding whether equation (8.1) has a solution. In other words, no system of axioms and rules of deduction can permit one to prove whether equation (8.1) has a solution or not. Accordingly, we have obtained the following:

Theorem 8.5 (Diophantine Form of Incompleteness). No formal axiomatic system with properties (1)-(3) in Theorem 8.1 can decide whether a Diophantine equation has a solution or not.
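The claim that every set $D_n$ is c.e. rests on a plain exhaustive search: try all candidate values of $x$ and all integer tuples $(y_1, \ldots, y_m)$ in growing boxes, and output $x$ as soon as a zero of $P$ is found. The following sketch illustrates this under stated assumptions: `enumerate_D` and the toy polynomial `toy_P` are hypothetical names and examples, and the universal polynomial of the text is vastly larger than anything shown here.

```python
from itertools import count, islice, product


def enumerate_D(P, n, m):
    """Yield the elements of D_n = {x in N | P(n, x, y_1..y_m) = 0 for some integers y_i}.

    Stage `bound` examines every x <= bound together with every integer tuple
    whose entries lie in [-bound, bound]; each member of D_n is yielded once.
    The generator never terminates by itself: D_n is only enumerated, not decided.
    """
    seen = set()
    for bound in count(0):
        y_values = range(-bound, bound + 1)
        for x in range(bound + 1):
            if x in seen:
                continue
            for ys in product(y_values, repeat=m):
                if P(n, x, *ys) == 0:
                    seen.add(x)
                    yield x
                    break


def toy_P(n, x, y):
    """Hypothetical toy polynomial: x - n - y^2, so D_n = {n + y^2}."""
    return x - n - y * y


if __name__ == "__main__":
    print(list(islice(enumerate_D(toy_P, 2, 1), 4)))
```

Membership of a given $x$ is confirmed after finitely many stages, but non-membership is never reported; this asymmetry is exactly what separates c.e. sets from computable ones.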


Is there any relation between randomness and the sets of solutions of Diophantine equations? The answer is affirmative. For technical reasons we shall deal with exponential Diophantine equations, the larger class of equations which are built with addition, multiplication and exponentiation of non-negative integers and variables. Consider also an Omega Number $\Omega_U$. First we prove the following technical result:

Theorem 8.6 (Chaitin). Given a universal Chaitin computer $U$ one can effectively construct an exponential Diophantine equation

$$P(n, x, y_1, y_2, \ldots, y_m) = 0 \tag{8.2}$$

such that for every natural fixed $k$ the equation

$$P(k, x, y_1, y_2, \ldots, y_m) = 0$$

has an infinity of solutions iff the $k$th bit of the binary expansion of $\Omega_U$ is 1.

Proof. Consider the sequence of rationals (7.1) defining $\Omega_U$ and note that the predicate "the $n$th bit of $\Omega_U(k)$ is 1" is computable. Using now Jones and Matiyasevich's Theorem⁹ one gets an equation of the form (8.2). This equation has exactly one solution $y_1, y_2, \ldots, y_m$ if the $n$th bit of $\Omega_U(k)$ is 1, and it has no solution $y_1, y_2, \ldots, y_m$ if the $n$th bit of $\Omega_U(k)$ is 0. The number of different $m$-tuples $y_1, y_2, \ldots, y_m$ of natural numbers which are solutions of the equation (8.2) is therefore infinite iff the $n$th bit of the base 2 expansion of $\Omega_U$ is 1. □

It is interesting to remark on the sharp difference between the following two questions:

1. Does the exponential Diophantine equation $P = 0$ have a solution?

2. Does the exponential Diophantine equation $P = 0$ have an infinity of solutions?
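For the first question, the answers for the parameter values $n = 0, 1, \ldots, N-1$ can all be recovered from a single number, namely how many of the $N$ equations are solvable: one searches all instances in parallel and stops once that many solutions have appeared. The sketch below illustrates this idea under stated assumptions; `solvable_parameters` and the toy equation $y^2 - n = 0$ are hypothetical illustrations, with solutions sought among the non-negative integers.

```python
from itertools import count, product


def solvable_parameters(P, N, m, how_many):
    """Return the set of n < N for which P(n, y_1..y_m) = 0 has a natural solution.

    `how_many` is the promised number of solvable instances (about log2(N) bits
    of advice). If the promise overstates the true count, the search never stops.
    """
    found = set()
    for bound in count(0):
        if len(found) == how_many:
            return found          # every remaining instance must be unsolvable
        for n in range(N):
            if n in found:
                continue
            # Try every tuple of naturals with entries at most `bound`.
            if any(P(n, *ys) == 0 for ys in product(range(bound + 1), repeat=m)):
                found.add(n)


if __name__ == "__main__":
    # Toy family: y^2 - n = 0 is solvable iff n is a perfect square.
    print(sorted(solvable_parameters(lambda n, y: y * y - n, 10, 1, 4)))
```

The advice is a number between $0$ and $N$, so it carries roughly $\log_2 N$ bits, in contrast with the $N$ bits one would need if the answers were independent.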

The first question never leads to randomness. If one considers such an equation with a parameter $n$, and asks whether or not there is a solution for $n = 0, 1, 2, \ldots, N-1$, then the $N$ answers to these $N$ questions contain only $\log_2 N$ bits of information. Indeed, we can determine which equation has a solution if we know how many of them are solvable. The second question may sometimes lead to randomness, as in Theorem 8.6. It is remarkable that Chaitin [121] has effectively constructed such an equation; the result is a huge equation.¹⁰

9 See Theorem 1.3.

We are now in a position to prove the second information-theoretic version of GIT (see [123, 122, 125, 131]):

Theorem 8.7 (Chaitin Information-theoretic Incompleteness (II)). Assume that the set of theorems of a formal axiomatic system $T$ is c.e. If $T$ has the property that any statement of the form "the $n$th bit of $\Omega_U$ is a 0", "the $n$th bit of $\Omega_U$ is a 1", can be represented in $T$ and such a statement is a theorem of $T$ only if it is true, then $T$ can enable us to determine the positions and values of at most finitely many scattered bits of $\Omega_U$.

Proof. We will present two proofs.

First proof. If $T$ provides $k$ different bits of $\Omega_U$, then it gives us a covering $\mathrm{Cover}_k$ of measure $2^{-k}$ which includes $\Omega_U$. Indeed, we enumerate $T$ until $k$ bits of $\Omega_U$ are determined, and put

$$\mathrm{Cover}_k = \{x_1 \alpha_1 x_2 \alpha_2 \cdots x_k \alpha_k \mid |x_1| = i_1 - 1, |x_2| = i_2 - i_1 - 1, \ldots, |x_k| = i_k - i_{k-1} - 1\} \subset \{0, 1\}^*$$

($i_1 < i_2 < \cdots < i_k$ are the positions where the right 0/1 choice was given by $T$, and $\alpha_j$ denotes the bit given at position $i_j$). Accordingly,

$$\mu(\mathrm{Cover}_k \{0, 1\}^{\omega}) = 2^{i_k - k} / 2^{i_k} = 2^{-k},$$

and the assumption that $T$ yields infinitely many different bits of $\Omega_U$ contradicts the randomness of $\Omega_U$.

10 A 900,000-character, 17,000-variable universal exponential Diophantine equation. See also the recent software in [130].
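The measure computation in the first proof can be checked on a small instance: fixing $k$ bits at positions $i_1 < \cdots < i_k$ leaves $i_k - k$ free positions among the first $i_k$, hence $2^{i_k - k}$ strings of length $i_k$ and total measure $2^{-k}$. The sketch below assumes 1-based positions and uses hypothetical names (`cover`, `measure`) and a made-up set of determined bits.

```python
from fractions import Fraction
from itertools import product


def cover(known_bits):
    """All binary strings of length i_k that agree with the determined bits.

    `known_bits` maps 1-based positions i_1 < ... < i_k to bit values 0 or 1.
    """
    length = max(known_bits)
    free_positions = [p for p in range(1, length + 1) if p not in known_bits]
    strings = []
    for choice in product("01", repeat=len(free_positions)):
        symbols = [""] * length
        for position, bit in known_bits.items():
            symbols[position - 1] = str(bit)
        for position, symbol in zip(free_positions, choice):
            symbols[position - 1] = symbol
        strings.append("".join(symbols))
    return strings


def measure(strings):
    """Measure of the union of the cylinders generated by equal-length strings."""
    return sum(Fraction(1, 2 ** len(s)) for s in strings)


if __name__ == "__main__":
    known = {2: 1, 5: 0, 6: 1}        # hypothetical determined bits, k = 3
    strings = cover(known)
    print(len(strings), measure(strings))
```

With three determined bits the cover has measure $2^{-3}$, independently of where the bits are located.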


Second proof. Assume that $T$ may give us an infinity of positions and corresponding values of $\Omega_U$. Then we can get an increasing function $i : \mathbf{N} \to \mathbf{N}$ such that the set

$$\{(i(k), \Omega_{i(k)}) \mid k \ge 0\}$$

is computable. Then, by virtue of Theorem 6.41, the sequence $r_2(\Omega_U)$ is not random, a contradiction. □

In fact one can give a bound on the number of bits of $\Omega_U$ which ZFC can determine; this bound can be explicitly formulated, but it is not computable. For example, in [130] Chaitin has described, in a dialect of Lisp, a universal Chaitin computer $U$ and a formal axiomatic system $T$ satisfying properties (1)-(3) in Theorem 8.1 such that $T$ can determine the value of at most $H(T) + 15{,}328$ bits of $\Omega_U$ (an uncomputable number). Consider now all statements of the form

$$\text{``The } n\text{th binary digit of the expansion of } \Omega_U \text{ is } k\text{''}, \tag{8.3}$$

for all $n \ge 0$, $k = 0, 1$.

Theorem 8.7 can be restated in the following form which shows the pervasive nature of incompleteness:

Corollary 8.8 (Chaitin). If ZFC is arithmetically sound and $U$ is a universal Chaitin computer, then almost all true statements of the form (8.3) are unprovable in $T$.

To compare Theorem 8.4 with Theorem 8.7 we need the following:

Definition 8.9. A set of non-negative integers $A$ is called random if the sequence $\mathbf{x} = x_1 x_2 \cdots x_n \cdots$ defined by

$$x_i = \begin{cases} 1, & \text{if } i \in A, \\ 0, & \text{if } i \notin A, \end{cases}$$

is random.

Random sets are immune, but the converse is not necessarily true. In particular, the immune set $C$ in Theorem 5.31 is not random, hence Theorem 8.7 is stronger than Theorem 8.4. Indeed, the analogue of Theorem 5.31 is:


Theorem 8.10. The set $A$ of non-negative integers $n$ such that ZFC proves a theorem of the form (8.3) is random.

Remark. Of course, stronger and stronger forms of incompleteness can be imagined just following, for example, the arithmetical hierarchy. As noted by Delahaye [164], the beauty of the information-theoretic forms of incompleteness is given by the natural and simple constructions.

8.4 Information-theoretic Incompleteness (3)

In this section we fix $T$ = ZFC. Note that each statement of the form (8.3) can be formalized in PA. Moreover, if $U$ is a Chaitin computer which PA can prove universal and ZFC proves the assertion (8.3), then this assertion is true. By tuning the construction of the universal Chaitin computer, Solovay [377] has obtained a dramatic improvement of Corollary 8.8:

Theorem 8.11 (Solovay). We can effectively construct a universal Chaitin computer $U$ such that ZFC, if arithmetically sound, cannot determine any single bit of $\Omega_U$.

Note that Corollary 8.8 holds true for every universal Chaitin computer $U$ (it is easy to see that the finite set of (true) statements of the form (8.3) which can be proven in ZFC can be arbitrarily large) while Theorem 8.11 constructs a specific $U$. We will first obtain a stronger result, Theorem 8.12, from which Theorem 8.11 follows.

In what follows, if $j$ is one of 0 or 1, the string of length 1 whose sole component is $j$ will be denoted by $(j)$.

Theorem 8.12 (Calude). Assume ZFC is arithmetically sound. Let $i \ge 0$ and consider the c.e. random real

Then, we can effectively construct a universal Chaitin computer, $U$ (depending upon ZFC and $\alpha$), such that the following three conditions are satisfied:

a) PA proves the universality of $U$.

b) ZFC can determine at most $i$ initial bits of $\Omega_U$.

c) $\alpha = \Omega_U$.

Proof. We start by fixing a universal Chaitin computer $V$ such that the universality of $V$ is provable in PA and $\Omega_V = \alpha$. We use Theorem 7.109 and Exercise 7.8.22 to effectively construct a universal Chaitin computer $\widetilde{V}$ such that

$$\Omega_{\widetilde{V}} = 0.\underbrace{0 \cdots 0}_{i\ 0\text{s}} \alpha_{i+1} \alpha_{i+2} \cdots$$

if $i \ge 1$, and a universal Chaitin computer $\widetilde{V}$ such that

$$\Omega_{\widetilde{V}} = 0.\alpha_1 \alpha_2 \cdots$$

in case $i = 0$. Next we construct, by cases, a p.c. function $W(l, s)$ ($l$ is a non-negative integer and $s \in \Sigma^*$) as follows:

Step 1: Set $W(l, \lambda)$ to be undefined.

Step 2: If $i = 0$, then go to Step 6. Otherwise, set

$$W(l, (1)) = W(l, 10) = \cdots = W(l, \underbrace{1 \cdots 1}_{i\ 1\text{s}} 0) = \lambda.$$

Step 3: If $s = 00t$, for some $t \in \Sigma^*$, then set $W(l, s) = \widetilde{V}(t)$, and stop.

Step 4: If $s = 01t$, for some $t \in \Sigma^*$, then go to Step 5.

Step 5: List all theorems of ZFC, in some definite order, not depending on $t$, and search for a theorem of the form (8.3). If no such theorem is found, then $W(l, s)$ is undefined, and stop. If such a theorem is found, then let $n, l, k$ be its parameters.

• If $|t| \neq n$, then $W(l, s)$ is undefined, and stop.

• If $|t| = n$, then let $r$ be the unique dyadic rational, in $[0, 1)$, whose binary expansion is $t(k)$ and set $r' = r + 2^{-(n+1)}$. Search for the least integer $m$ such that $\Omega_l[m] \in (r, r')$. If this search fails, or $s \in D_l[m]$, then $W(l, s)$ is undefined, and stop. In the opposite case set $W(l, s) = \lambda$, and stop.

Step 6: If $s = (0)t$, for some string $t$, then set $W(l, s) = \widetilde{V}(t)$, and stop.

Step 7: If $s = (1)t$, for some string $t$, then go to Step 5.

The Recursion Theorem 1.1 provides a $j$ such that $\varphi_j(s) = W(j, s)$. We fix such a $j$ and set $U = \varphi_j$. We will show that $U$ is a universal Chaitin computer which satisfies conditions a)-c).

First we prove that $U$ is a Chaitin computer. Let $i = 0$. Suppose that $s_1$ and $s_2$ are in the domain of $U$ and $s_1$

$$\{1, 10, 110, \ldots, \underbrace{1 \cdots 1}_{i\ 1\text{s}} 0\},$$

so $s_1 = s_2$. If $k = 0$, then two cases may appear. If $s_i = 00t_i$, then $t_1, t_2$ belong to the domain of the Chaitin computer $\widetilde{V}$ (see Step 3), so $t_1 = t_2$ and $s_1 = s_2$. If $s_i = 01t_i$, then in view of Step 5, a similar argument as in case $i = 0$ shows that $s_1 = s_2$. It follows that $U$ is a Chaitin computer, i.e. $U = \psi_j$ and $\Omega_j = \Omega_U$. The universality of $U$ follows from the definition of $W(l, s)$ on Steps 3 and 6 as $V$ and $\widetilde{V}$ are universal. Furthermore, $U$ inherits from $V$ ($\widetilde{V}$) the fact that its universality is provable in PA.

Assume now that $i = 0$ and ZFC can determine some bit of $\Omega_U$. Then, in the course of the computation the integers $n$ and $k$ are defined. Let $r$ be a dyadic rational with denominator $2^{n+1}$ such that

$$r < \Omega_U < r + 2^{-(n+1)}$$

($r$ exists because $\Omega_U$ is irrational). Let $r' = r + 2^{-(n+1)}$.

Since ZFC is arithmetically sound, the assertion "The $n$th binary bit of $\Omega_U$ is $k$" is true. Hence the first $n + 1$ bits of the binary expansion of $r$ have the form $t(k)$ where $t$ is a string of length $n$. For all sufficiently large $m$, $\Omega_j[m]$ will lie in the interval $(r, r')$.

Let $s = (1)t$ and consider the computation of $U(s)$. The rationals $r$ and $r'$ involved in that computation are exactly the ones just defined above. The search for an $m$ such that $\Omega_j[m] \in (r, r')$ will succeed and $s \notin D_j[m]$, because, if $s \in D_j[m]$, then $U(s)$ is undefined. But $D_j[m] \subset D_j$, so $s \in D_j$, the domain of $U$, a contradiction. Consequently, $U(s)$ is defined, and $D_j$ contains in addition to the members of $D_j[m]$ the string $s$ of length $n + 1$. It follows that $\Omega_U \ge r + 2^{-(n+1)} = r'$, which contradicts the definition of $r$.

With a similar argument as above one can show that the assumption that ZFC can determine some bit of $\Omega_U$ beyond its first $i \ge 1$ bits leads to a contradiction. The analysis just described above shows that for $i = 0$, $U((1)t)$ is undefined, and in case $i \ge 1$, $U(01t)$ is undefined, for every string $t$.

To finish the proof we notice that for $i = 0$,

$$\Omega_V = \frac{1}{2} \cdot \Omega_{\widetilde{V}} = \Omega_U,$$

where $\widetilde{V}$ denotes the universal Chaitin computer constructed at the beginning of the proof, and for $i \ge 1$,

□

Definition 8.13. A Chaitin computer satisfying all conditions in Theorem 8.12 will be called a Solovay computer.

If we set $i = 0$ in Theorem 8.12, then we get Theorem 8.11. More precisely,

Corollary 8.14. Assume that ZFC is arithmetically sound. Then, every c.e. random real $\alpha \in (0, 1/2)$ is the halting probability of a Solovay computer which cannot determine any single bit of $\alpha$. No c.e. random real $\alpha \in (1/2, 1)$ has the above property.

Proof. Indeed, every c.e. random real in the interval $(0, 1/2)$ has its 1st digit 0, so it can be represented as the halting probability of a Solovay computer for which ZFC cannot determine any single bit. However, if $\alpha$ is c.e. and random, but $\alpha > 1/2$, then ZFC can determine the 0th bit of $\alpha$ which is 1. □


GIT has a constructive proof, but the proof of Theorem 8.7 appears to be non-constructive. Is it possible to get a constructive variant of Theorem 8.7? The answer is affirmative and here is a possible variant:

Theorem 8.15. If ZFC is arithmetically sound and $U$ is a Solovay computer, then the statement "the 0th bit of the binary expansion of $\Omega_U$ is 0" is true but unprovable in ZFC.

Proof. We start with a universal Chaitin computer $U$ and effectively construct a Solovay computer $U'$ such that $\Omega_{U'} = \frac{1}{2} \cdot \Omega_U$. Then, $\Omega_{U'}$ is less than 1/2, so its 0th bit is 0, but ZFC cannot prove this fact! □
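The arithmetic behind the halving step can be seen on a finite toy domain: prefixing every program with one extra bit keeps the domain prefix-free and halves the sum $\sum_p 2^{-|p|}$, so the result lies below $1/2$ and its leading binary digit is 0. The sketch below shows only this arithmetic; the finite set `toy_domain` and the function names are hypothetical, a real universal Chaitin computer has an infinite domain, and the Solovay construction itself is far more delicate.

```python
from fractions import Fraction


def is_prefix_free(domain):
    """True if no string in `domain` is a proper prefix of another one."""
    ordered = sorted(domain)
    # In lexicographic order a prefix is immediately followed by one of its extensions.
    return all(not b.startswith(a) for a, b in zip(ordered, ordered[1:]))


def halting_probability(domain):
    """Sum of 2^(-|p|) over the programs p of a finite prefix-free domain."""
    return sum(Fraction(1, 2 ** len(p)) for p in domain)


def prepend_zero(domain):
    """Domain of the modified machine U'(0p) = U(p)."""
    return {"0" + p for p in domain}


if __name__ == "__main__":
    toy_domain = {"0", "10", "110"}       # hypothetical halting programs
    halved = prepend_zero(toy_domain)
    print(halting_probability(toy_domain), halting_probability(halved))
```

For the toy domain the two sums are $7/8$ and $7/16$; the second is below $1/2$, which is the only property of the halved number that the proof uses.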

We can now use Theorem 8.6 to effectively construct an exponential Diophantine equation which has only finitely many solutions, but this fact cannot be proven in ZFC. In fact, for every binary string $s = s_1 s_2 \cdots s_n$ we use Exercise 7.8.22 to effectively construct a Solovay computer $U$ such that the binary expansion of $\Omega_U$ has the string $(0)s_1 s_2 \cdots s_n$ as prefix. Consequently, the following statements

"The 1st binary digit of the expansion of $\Omega_U$ is $s_1$",

"The 2nd binary digit of the expansion of $\Omega_U$ is $s_2$",

$\vdots$

"The $(n+1)$th binary digit of the expansion of $\Omega_U$ is $s_n$",

are true but unprovable in ZFC.

8.5 Coding Mathematical Knowledge

Understanding the power and the limitations of human knowledge is an exciting but tremendously difficult task. In this section we shall confine ourselves to an answer to the following question: "Is there any mathematical 'wisdom' in an Omega Number?" Theorem 7.109 suggests that the answer is affirmative. Hence it is natural to want to see how we can use an Omega Number to solve a mathematical problem. We consider Fermat's Last Theorem:


Theorem 8.16 (Wiles). The equation

$$(1 + x)^{w+3} + (1 + y)^{w+3} = (1 + z)^{w+3}$$

has no natural solutions.¹¹

Or the Goldbach Conjecture:¹²

Conjecture 8.17 (Goldbach). Every even number greater than 2 is the sum of two primes.¹³

Or Riemann's Hypothesis:¹⁴

Conjecture 8.18 (Riemann's Hypothesis). All complex roots (zeros) $s = \mathrm{Re}(s) + i\,\mathrm{Im}(s)$ of the Riemann zeta-function

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}$$

(i.e. the values for which $\zeta(s) = 0$) are located on the straight line $\mathrm{Re}(s) = 1/2$ in the complex plane (except for the known zeros, which are the negative even integers).

11 Pierre de Fermat made this assertion in a note in the margin of the first edition of the Greek text of Diophantus's Arithmetica (1621); he added that he had discovered a truly remarkable proof of it that the margin was not large enough to include. Three centuries of effort culminated with Andrew Wiles' 1995 proof that Fermat's assertion is true; see, for instance, van der Poorten [414].
12 The conjecture was stated in 1742 by Goldbach in a letter to Euler [178]. Hardy [222] states that the Goldbach problem is "probably as difficult as any of the unsolved problems in mathematics".
13 As in the following examples: 6 = 3+3, 8 = 3+5, 10 = 3+7 = 5+5, 12 = 5+7, ...
14 The problem was first proposed in [346].

We could solve all these important problems, and many others, by just knowing enough bits of $\Omega$! How? Just by solving the Halting Problem for a few programs. All the above mathematical problems can be refuted if appropriate numerical (more precisely, natural) counter-examples can be guessed.¹⁵ A finitely refutable statement is equivalent to the assertion that some program, searching systematically for some non-existent object, never halts. Furthermore, each fixed instance of any of the above problems can be algorithmically tested, so we may construct¹⁶ a Chaitin computer which halts only if it eventually discovers an appropriate counter-example. For instance, we may construct a Chaitin computer which halts iff it finds an even number greater than 2 which is not the sum of two primes (less than itself). Due to the inequalities

$$\Omega(n) < \Omega < \Omega(n) + 2^{-n}, \quad n = 1, 2, \ldots$$

one can solve the Halting Problem for all programs of length shorter than $n$ as follows: We start a systematic (dovetailing) search through all programs that eventually halt until enough halting programs $p$ have been found such that $\sum_p 2^{-|p|} > \Omega(n)$. Notice that we will never get all these programs, but if we have enough patience (and computational resources) we finally get enough programs $p_{i_1}, p_{i_2}, p_{i_3}, \ldots, p_{i_k}$ of lengths $l_{i_1}, l_{i_2}, l_{i_3}, \ldots, l_{i_k}$, such that

$$\sum_{j=1}^{k} 2^{-l_{i_j}} > \Omega(n).$$

In the above list there are programs longer than $n$ bits, as well as some shorter ones. It really does not matter; the main thing is that the list $p_{i_1}, p_{i_2}, p_{i_3}, \ldots, p_{i_k}$ contains all halting programs shorter than $n$ (otherwise, their contribution to $\Omega$ would be larger than $\Omega(n) + 2^{-n}$, a contradiction). If $n$ is large enough, then among the halting programs $p_{i_1}, p_{i_2}, p_{i_3}, \ldots, p_{i_k}$ we will find programs deciding all finitely refutable conjectures which can be expressed by reasonably long strings.

15 For Riemann's Hypothesis start with Euler's identity $\sum_{n=1}^{\infty} n^{-a} = \prod_{p} (1 - p^{-a})^{-1}$, in which the product on the right-hand side is taken over all primes, and write the expansion $(1 - p^{-a})^{-1} = 1 + p^{-a} + p^{-2a} + \cdots$, to see the connection with the Fundamental Theorem of Arithmetic.
16 This is an instructive exercise to do!

So, in order to improve our knowledge in mathematics we should "compute" more and more digits of $\Omega$. Is it simple? Is it feasible? First note that not all problems are finitely refutable. Here are three examples of problems which are not finitely refutable:

• Is $\pi$ Borel normal?

• Are there infinitely many twin primes (i.e. consecutive odd primes such as 11, 13 or 857, 859)?

• $P \neq NP$ (there are mathematical problems for which the validity of guessed solutions can be quickly verified, but for which solutions cannot be found as fast).¹⁷

Is an Omega Number powerless for the above problems? How large is the class of finitely refutable problems? A more detailed discussion will be presented in the next section. For the moment we will present a few more examples.

A statement expressible within a formal axiomatic system is independent of the system if neither the statement, nor its negation, can be proven (within the system). The Parallel Postulate (through a given point there is exactly one line parallel to a given line), the Continuum Hypothesis (there is no cardinal number strictly in between the cardinal of the set of natural numbers and the cardinal of the set of reals) and (a slight variation of) Ramsey's Theorem (if a partition of a "big" finite set contains only a few classes, then at least one of these classes is "big enough") are probably the best known examples of independent statements (from Euclidean axioms, Zermelo-Fraenkel set theory and Peano arithmetic, respectively). Let $T$ be a formal axiomatic system with properties (1)-(3) in Theorem 8.1 and $s$ be a statement expressible in $T$. We construct the program $p(s)$ that searches systematically among the proofs of $T$ for a proof or refutation of $s$. Then, $s$ is independent with respect to $T$ iff $p(s)$ never halts.
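The procedure described earlier in this section, which uses the first $n$ bits of $\Omega$ to settle halting questions, can be sketched on a finite toy machine. Everything specific here is an assumption made for illustration: the table `TOY_MACHINE`, the helper names, and the decision to read the truncated value $\Omega(n)$ off the table. For a genuine universal Chaitin computer the table is infinite and $\Omega(n)$ would have to be supplied from outside, since it is not computable.

```python
from fractions import Fraction
from math import floor

# Hypothetical prefix-free toy machine: program -> halting step, or None if it never halts.
TOY_MACHINE = {"0": 3, "10": None, "110": 50, "1110": 7, "11110": None, "111110": 2}


def halts_within(program, steps):
    """Simulate the toy machine on `program` for at most `steps` steps."""
    halting_step = TOY_MACHINE[program]
    return halting_step is not None and halting_step <= steps


def omega_prefix(n):
    """Omega truncated to n binary digits; this plays the role of the oracle."""
    omega = sum(Fraction(1, 2 ** len(p)) for p, t in TOY_MACHINE.items() if t is not None)
    return Fraction(floor(omega * 2 ** n), 2 ** n)


def decide_halting(n, omega_n):
    """Decide halting for every program of length at most n, given Omega(n)."""
    halted, total, steps = set(), Fraction(0), 0
    while total <= omega_n:               # dovetail until the sum exceeds Omega(n)
        steps += 1
        for program in TOY_MACHINE:
            if program not in halted and halts_within(program, steps):
                halted.add(program)
                total += Fraction(1, 2 ** len(program))
    # A short program missing from `halted` would add at least 2^-n to Omega,
    # pushing it above Omega(n) + 2^-n, so it can never halt.
    return {p: p in halted for p in TOY_MACHINE if len(p) <= n}


if __name__ == "__main__":
    print(decide_halting(3, omega_prefix(3)))
```

The loop terminates only because $\Omega(n) < \Omega$ strictly, which holds for the chosen toy table with $n = 3$ and always holds for a genuine, irrational $\Omega$. The bound `len(p) <= n` covers in particular the programs shorter than $n$ mentioned in the text, and the number of simulation steps needed before the sum exceeds $\Omega(n)$ is what makes the method infeasible in practice.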

8.6 Finitely Refutable Mathematical Problems

The phenomenon of a set being finite, but undecidable, is, of course, a consequence of allowing non-constructive arguments in proofs. In this section we discuss a few ramifications of this phenomenon. The conclusion reinforces the fact that there is a big difference between finiteness and constructive finiteness. We start by showing that every number-theoretic statement that can be expressed in first-order logic can be reduced to a finite set, to be called a test set. Thus, if one knew the test set, one could determine the truth of the statement. This rather simple result models what is sometimes referred to as experimental mathematics. Simply stated, if the statement is true we do not need to do anything and if it is false we find the smallest counter-example by computer. We then show how several classical problems fall into this category. The crucial point is, of course, that we may not be able to know what the finite test set is.

17 See the discussion in Section 9.8.

Let $k \in \mathbf{N}$ and consider a $k$-ary predicate $P$ on $\mathbf{N}$ and the formula

$$f = Q_1 n_1 Q_2 n_2 \cdots Q_k n_k\, P(n_1, n_2, \ldots, n_k),$$

where $Q_1, Q_2, \ldots, Q_k \in \{\forall, \exists\}$ are quantifier symbols. In analogy to the arithmetic classes, we say that $f$ is in the class $\widehat{\Pi}_s$ or $\widehat{\Sigma}_s$ if the quantifier prefix of $f$ starts with $\forall$ or $\exists$, respectively, and contains $s - 1$ alternations of quantifier symbols. When $P$ is computable, then $f$ is in $\Pi_s$ or $\Sigma_s$, respectively. It is sufficient to consider only such formulae $f$ in which no two consecutive quantifier symbols are the same; in the sequel we make this assumption without special mention. With $f$ as above, one has $s = k$.

As usual, we write $P(n_1, \ldots, n_k)$ instead of $P(n_1, \ldots, n_k) = 1$ when $n_1, \ldots, n_k$ are elements of $\mathbf{N}$. Thus,

Moreover, since we consider variable symbols only in the domain $\mathbf{N}$, if $f$ is any formula in first-order logic, we write $f$ is true instead of $f$ is true in $\mathbf{N}$. For $s \in \mathbf{N}$, let $\widehat{\Gamma}_s$ denote any of $\widehat{\Pi}_s$ and $\widehat{\Sigma}_s$, and let $\Gamma_s$ denote any of $\Pi_s$ and $\Sigma_s$.

We refer to the task of proving or refuting a first-order logic formula as a problem; problems expressed by formulae in $\widehat{\Gamma}_s$ will be called $\widehat{\Gamma}_s$-problems. We say that a problem is being solved if the corresponding formula is proved or disproved to be true; that is, if the truth value of the formula is determined. A problem is said to be finitely solvable if it can be solved by examining finitely many cases. Here is a precise definition:

Definition 8.19. Let

$$f = Q_1 n_1 Q_2 n_2 \cdots Q_s n_s\, P(n_1, n_2, \ldots, n_s),$$

with $s \in \mathbf{N}$, where $Q_1, Q_2, \ldots, Q_s$ are alternating quantifier symbols.

1. A test set for $f$ is a set $T \subset \mathbf{N}^s$ such that $f$ is true in $\mathbf{N}^s$ iff it is true in $T$.

2. The problem expressed by $f$ is finitely solvable if there is a finite test set for $f$.

We now examine several classical problems, mainly $\Pi_1$-problems and $\widehat{\Pi}_1$-problems. As a first example we consider the predicate

$$P(n) = \begin{cases} 1, & \text{if } n \text{ is even or } n = 1 \text{ or } n \text{ is a prime}, \\ 0, & \text{otherwise}, \end{cases}$$

i.e. $P(n) = 0$ iff $n$ is an odd number greater than 1 which is not a prime. Then the problem expressed by the formula $\forall n P(n)$ is finitely solvable;¹⁸ indeed, it is sufficient to check all $n$ up to and including 9.

Goldbach's Conjecture (see Conjecture 8.17) is a $\Pi_1$-problem. Using a carefully optimized segmented sieve and an efficient checking algorithm, the conjecture has been verified up to $4 \cdot 10^{14}$ (see [443]). To express it in our terminology, let $P_G : \mathbf{N} \to \{0, 1\}$ be such that

$$P_G(n) = \begin{cases} 1, & \text{if } n \text{ is odd or } n \text{ is the sum of two primes}, \\ 0, & \text{otherwise}. \end{cases}$$

Thus, $f_G = \forall n\, P_G(n)$ is true iff Goldbach's Conjecture is true.

Similarly, Riemann's Hypothesis (see Conjecture 8.18) is a $\Pi_1$-problem. By a result of [159], Riemann's Hypothesis can be expressed in terms of the function $\delta_R : \mathbf{N} \to \mathbf{R}$ defined by

$$\delta_R(k) = \prod_{n < k} \prod_{j \le n} \eta_R(j),$$

where

$$\eta_R(j) = \begin{cases} p, & \text{if } j = p^r \text{ for some prime } p \text{ and some } r \in \mathbf{N}, \\ 1, & \text{otherwise}. \end{cases}$$

Riemann's Hypothesis is equivalent to the assertion that

for all $n \in \mathbf{N}$. (For proofs see [159] or [309], pp. 117-122.) Hence, let

$$P_R(n) = \begin{cases} 1, & \text{if the above assertion holds for } n, \\ 0, & \text{otherwise}. \end{cases}$$

Thus, $f_R = \forall n\, P_R(n)$ is true iff Riemann's Hypothesis is true. Clearly, $P_R$ is decidable; therefore Riemann's Hypothesis is a $\Pi_1$-problem. Riemann's Hypothesis has been checked for the first 59,974,310,000 zeros and progress is continuous (cf. [444]).

18 This example is based on a folklore joke on induction proofs. To prove that all odd natural numbers greater than 2 are primes one proceeds as follows: 3 is a prime; 5 is a prime; 7 is a prime; 9 is a measuring error; 11 is a prime; 13 is a prime; this is enough evidence.

Of course, not every mathematical statement is a $\Pi_1$-problem. For instance, the Twin-Prime Conjecture, stating the existence of infinitely many twin primes,¹⁹ is not a $\Pi_1$-problem. With

$$P_T(n, m) = \begin{cases} 1, & \text{if } m > n \text{ and } m \text{ and } m + 2 \text{ are primes}, \\ 0, & \text{otherwise}, \end{cases}$$

this conjecture can be stated as

$$f_T = \forall n\, \exists m\, P_T(n, m).$$

The formula $f_T$ is in the class $\Pi_2$. Bennett [206, 30] claims that most mathematical conjectures can be settled indirectly by proving stronger $\Pi_1$-problems.²⁰ For the Twin-Prime Conjecture such a stronger $\Pi_1$-problem is obtained as follows. We consider the predicate

$$P'_T(n) = \begin{cases} 1, & \text{if there is an } m \text{ with } 10^{n-1} \le m \le 10^n,\ m \text{ and } m + 2 \text{ primes}, \\ 0, & \text{otherwise}, \end{cases}$$

and let

$$f'_T = \forall n\, P'_T(n).$$

Thus, $f'_T$ gives rise to a $\Pi_1$-problem and, if $f'_T$ is true, then also $f_T$ is true.

Theorem 8.20 (Calude-Jürgensen-Legg). Every $f \in \widehat{\Gamma}_s$ is finitely solvable, for all $s \in \mathbf{N}$.

19 Consecutive odd primes such as 857 and 859.
20 This "embedding method" has some limits itself. For instance, it fails for questions about $\Omega$ itself.


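Proof. Let

$$f = Q_1 n_1 Q_2 n_2 \cdots Q_s n_s\, P(n_1, n_2, \ldots, n_s),$$

with $s \in \mathbf{N}$, where $Q_1, Q_2, \ldots, Q_s$ are alternating quantifier symbols. We determine a sequence $N_1, N_2, \ldots, N_s$ of finite sets with $N_i \subset \mathbf{N}^i$ such that the problem posed by $f$ can be solved by checking all $s$-tuples $(n_1, n_2, \ldots, n_s) \in N_s$. We define the sets $N_i$ by induction on $i$. For this purpose, let

$$f_i(m_1, \ldots, m_{i-1}) = Q_i n_i Q_{i+1} n_{i+1} \cdots Q_s n_s\, P(m_1, \ldots, m_{i-1}, n_i, \ldots, n_s),$$

where $m_1, \ldots, m_{i-1} \in \mathbf{N}$. In particular,

$$f_1() = f \quad \text{and} \quad f_{s+1}(m_1, \ldots, m_s) = P(m_1, \ldots, m_s).$$

For $i = 1$, if $Q_1 = \forall$, let

$$v_1 = \begin{cases} 1, & \text{if } f = f_1() \text{ is true}, \\ \min\{m_1 \mid m_1 \in \mathbf{N}, \neg f_2(m_1)\}, & \text{otherwise}; \end{cases}$$

if $Q_1 = \exists$, let

$$v_1 = \begin{cases} 1, & \text{if } f = f_1() \text{ is not true}, \\ \min\{m_1 \mid m_1 \in \mathbf{N}, f_2(m_1)\}, & \text{otherwise}. \end{cases}$$

Let $N_1 = \{(m_1) \mid m_1 \in \mathbf{N}, m_1 \le v_1\}$.

Now, suppose $N_{i-1}$ has been defined and $i \le s$. For each $(m_1, \ldots, m_{i-1}) \in N_{i-1}$, we define $v_i(m_1, \ldots, m_{i-1}) \in \mathbf{N}_0$ as follows. If $Q_i = \forall$, let

$$v_i(m_1, \ldots, m_{i-1}) = \begin{cases} 1, & \text{if } f_i(m_1, \ldots, m_{i-1}) \text{ is true}, \\ \min\{m_i \mid m_i \in \mathbf{N}, \neg f_{i+1}(m_1, \ldots, m_i)\}, & \text{otherwise}; \end{cases}$$

if $Q_i = \exists$, let

$$v_i(m_1, \ldots, m_{i-1}) = \begin{cases} 1, & \text{if } f_i(m_1, \ldots, m_{i-1}) \text{ is not true}, \\ \min\{m_i \mid m_i \in \mathbf{N}, f_{i+1}(m_1, \ldots, m_i)\}, & \text{otherwise}. \end{cases}$$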


Let

$$N_i = \{(m_1, \ldots, m_i) \mid (m_1, \ldots, m_{i-1}) \in N_{i-1},\ m_i \in \mathbf{N},\ m_i \le v_i(m_1, \ldots, m_{i-1})\}.$$

We now prove, by induction on $i$, that each set $T_i = N_i \times \mathbf{N}^{s-i}$ is a test set for $f$. Then, in particular, $N_s$ is a finite test set for $f$.

Consider $i = 1$. Suppose first that $Q_1 = \forall$. When $f$ is true, the set $N_1$ is $\{(1)\}$ and, clearly, the set $T_1$ is a test set²¹ for $f$. When $f$ is false the set $N_1$ consists of all positive integers up to the first counter-example for the first variable of $P$. Hence, again, $T_1$ is a test set for $f$. On the other hand, suppose that $Q_1 = \exists$. Then $N_1 = \{(1)\}$ when $f$ is false. Clearly $T_1$ is a test set²² for $f$. When $f$ is true the set $N_1$ consists of all positive integers up to the first witness for the first variable of $P$. Again $T_1$ is a test set for $f$.

Now consider $i > 1$ and assume that $T_{i-1}$ is a test set for $f$. First suppose that $Q_i = \forall$. We consider $(m_1, \ldots, m_{i-1}) \in N_{i-1}$. If $f_i(m_1, \ldots, m_{i-1})$ is true then $v_i(m_1, \ldots, m_{i-1}) = 1$. As $T_{i-1}$ is a test set for $f$, to test whether $f$ is true on $\{(m_1, \ldots, m_{i-1})\} \times \mathbf{N}^{s-i+1}$ it suffices to test on $\{(m_1, \ldots, m_{i-1}, 1)\} \times \mathbf{N}^{s-i}$, and $(m_1, \ldots, m_{i-1}, 1) \in N_i$. If $f_i(m_1, \ldots, m_{i-1})$ is false then $N_i$ contains all the $i$-tuples $(m_1, \ldots, m_{i-1}, m_i)$ with $m_i$ ranging from 1 to the smallest counter-example. Hence, as $T_{i-1}$ is a test set for $f$ so is $T_i$.

Now suppose that $Q_i = \exists$. If $f_i(m_1, \ldots, m_{i-1})$ is false then $v_i(m_1, \ldots, m_{i-1}) = 1$. As $T_{i-1}$ is a test set for $f$, to test whether $f$ is true on $\{(m_1, \ldots, m_{i-1})\} \times \mathbf{N}^{s-i+1}$ it suffices to test on $\{(m_1, \ldots, m_{i-1}, 1)\} \times \mathbf{N}^{s-i}$, and $(m_1, \ldots, m_{i-1}, 1) \in N_i$. If $f_i(m_1, \ldots, m_{i-1})$ is true then $N_i$ contains all the $i$-tuples $(m_1, \ldots, m_{i-1}, m_i)$ with $m_i$ ranging from 1 to the smallest witness. Hence, as $T_{i-1}$ is a test set for $f$ so is $T_i$. □

The proof of Theorem 8.20 is non-constructive and this remains so even when $P$ is decidable. Thus, from this proof we do not learn anything about the number of cases one needs to check in order to prove or disprove the truth of $f$. It is clear from the theories of arithmetic classes and degrees of unsolvability that, in general, finite test sets cannot be constructed for this type of problem even when the predicate is computable.

21 In fact, the empty set would be a test set for $f$. However, if one uses this idea, i.e. sets $v_1$ to 0 rather than 1 (and similarly for $v_i$ in general), then the "construction" seems to break down.
22 Again the empty set could have been used were it not for problems with the subsequent steps of the "construction".


We will try to shed some light, from a different perspective, on some of the reasons why this cannot be done. The proof of Theorem 8.20 highlights a typical pitfall in proofs in computability theory when the reasoning of classical logic is used. The proof and the statement proved are computationally meaningless as neither helps with actually solving the $\Gamma_s$-problem. The "construction" of the sets $N_i$ in the proof disguises the fact that none of these finite sets may be computable. See, for example, the formula $f_G$ expressing Goldbach's Conjecture.

We now analyse the case of $\Pi_1$-problems in greater detail. For $f \in \Gamma_s$, let $N(f) = N_s$ with $N_s$ as in the proof of Theorem 8.20. In particular, when $s = 1$, then $N(f)$ is the set $\{(n_1) \mid n_1 \in \mathbb{N},\ n_1 \le \nu_1\}$. For this case, we define $\nu(f) = \nu_1$.

Let $X$ be an arbitrary but fixed alphabet. We use $X$ as the alphabet for programs of universal Chaitin computers. We also fix a computable bijective function $\langle\ ,\ \rangle : X^* \times \mathbb{N} \to X^*$. Consider $f = \forall n P(n)$, where $P$ is a computable predicate on $\mathbb{N}$. We assume that $P$ is given as a program for an arbitrary, but fixed, universal Chaitin computer $U$. Thus $P$ is given as a string $\pi_P \in X^*$ such that $U(\langle \pi_P, n \rangle) = P(n)$, for all $n \in \mathbb{N}$. One can, therefore, consider $\nu$ as a partial function of $X^*$ into $\mathbb{N}_0$; that is, $\nu(\pi_P) = \nu(f)$ with $f$ as above.

We first determine an upper bound on $\nu(f)$ for $f \in \Pi_1$. The busy beaver function $\sigma : \mathbb{N} \to \mathbb{N}$ is defined by

$$\sigma(n) = \max\{U(x) \mid x \text{ is a program of length } n \text{ for } U \text{ and } U \text{ halts on } x\}.$$

Let $P$ be a computable unary predicate on $\mathbb{N}$, and let $f = \forall n P(n)$, hence $f \in \Pi_1$. We consider a program $p_f$ for $U$ such that

$$U(p_f) = \min\{n \mid \neg P(n)\},$$

if $f$ is not true, and such that $U$ runs forever on $p_f$ if $f$ is true. Such a program always exists because the program, which tries $P(1), P(2), \ldots$, and halts with the first $n$ such that $\neg P(n)$, has the required properties. Let $m_f = |p_f|$. If $f$ is not true, then $U$ halts on $p_f$ with output $\nu(f)$. Hence $\nu(f) \le \sigma(m_f)$. If $f$ is true, then $\nu(f) = 0$. This proves the following statement.

Proposition 8.21. For every $f \in \Pi_1$, $\nu(f) \le \sigma(m_f)$.
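The search program described above can be sketched in Python. This is only an illustration under stated assumptions: the predicate is passed as an ordinary Python function, and the name `first_counterexample` and the optional `limit` parameter are hypothetical additions, not part of the text. Without a limit the search does not terminate when the universal statement is true, exactly as the text says.

```python
from itertools import count
from typing import Callable, Optional


def first_counterexample(P: Callable[[int], bool],
                         limit: Optional[int] = None) -> Optional[int]:
    """Try P(1), P(2), ... and return the first n with P(n) false.

    With limit=None the loop runs forever when "for all n, P(n)" is true.
    With a finite limit the search gives up after P(limit) and returns None.
    """
    candidates = count(1) if limit is None else range(1, limit + 1)
    for n in candidates:
        if not P(n):
            return n
    return None  # reachable only when a finite limit was supplied


if __name__ == "__main__":
    # "for all n, n*n < 50" is false; its first counter-example is n = 8.
    print(first_counterexample(lambda n: n * n < 50))
    # "for all n, n > 0" is true; a bounded search finds nothing.
    print(first_counterexample(lambda n: n > 0, limit=1000))
```

The bounded variant is what one would run in practice; the point of the surrounding discussion is that no computable choice of `limit` is guaranteed to be large enough.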


By Proposition 8.21, to solve the problem of $f$ we only need to check the truth value of $P(n)$ for all $n$ not exceeding $\sigma(m_f)$. This could be very useful if $\sigma$ were computable. However, $\sigma$ grows faster than any computable function. Hence, the bound $\nu(f) \le \sigma(m_f)$ does not help in the actual solution of the problem of $f$. In fact, no computable bound exists! Here is the argument. For any $\pi \in X^*$, we define the predicate $P_\pi$ on $\mathbb{N}$ by

$$P_\pi(n) = \begin{cases} 1, & \text{if } U(\pi) \text{ does not halt within } n \text{ steps}, \\ 0, & \text{otherwise.} \end{cases}$$

Clearly, the predicate is computable. Let $f_\pi = \forall n P_\pi(n)$. Then $f_\pi$ is true iff $U(\pi)$ does not halt.

Assume now that there is a program to compute an upper bound of $\nu(f)$ for any $f \in \Pi_1$; this program takes, as input, a program $p$ computing the predicate $P^p$ and produces as output an integer $\nu'(p)$ such that $\nu(f^p) \le \nu'(p)$, where $f^p = \forall n P^p(n)$. We show that this assumption implies the existence of an algorithm deciding the Halting Problem for $U$, a contradiction. Indeed, consider $\pi \in X^*$. To decide whether $U(\pi)$ halts, we first compute a program $p_\pi$ producing $P_\pi$. Next we compute $\nu'(p_\pi)$. As $f_\pi = f^{p_\pi}$, we have $\nu(f_\pi) \le \nu'(p_\pi)$. Hence, to determine whether $f_\pi$ is true, it is sufficient to determine whether $P_\pi(n) = 1$, for all $n \le \nu'(p_\pi)$. If so, then $f_\pi$ is true and $U(\pi)$ does not halt; otherwise $U(\pi)$ halts.

Theorem 8.22. The upper bound $\nu$ is T-complete.

Proof. We already showed that an oracle for $\nu$ or an upper bound on $\nu$ allows us to decide the Halting Problem for $U$. The converse follows from Proposition 8.21. □
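The shape of this reduction can be sketched in Python, under loud assumptions: a "program" is modelled as a function returning a generator, each `next` call counts as one step, and the bound passed to `decide_halting` stands in for the hypothetical $\nu'(p_\pi)$. No such bound is computable in general; here it is supplied by hand for toy programs only. All names are illustrative.

```python
from typing import Callable, Iterator

Program = Callable[[], Iterator[None]]  # toy stand-in for a program pi


def halts_within(prog: Program, n: int) -> bool:
    """True if prog terminates within n steps (one step per next() call)."""
    run = prog()
    for _ in range(n):
        try:
            next(run)
        except StopIteration:
            return True
    return False


def P_pi(prog: Program, n: int) -> int:
    """The predicate P_pi(n): 1 if prog does not halt within n steps, else 0."""
    return 0 if halts_within(prog, n) else 1


def decide_halting(prog: Program, bound: int) -> bool:
    """Decide halting, ASSUMING bound >= nu(f_pi); the bound is hypothetical."""
    # f_pi is true iff P_pi(n) = 1 for all n <= bound, i.e. iff prog never halts.
    return any(P_pi(prog, n) == 0 for n in range(1, bound + 1))


def countdown(k: int) -> Iterator[None]:
    for _ in range(k):
        yield


def forever() -> Iterator[None]:
    while True:
        yield


if __name__ == "__main__":
    print(decide_halting(lambda: countdown(5), bound=10))  # halting toy program
    print(decide_halting(forever, bound=10))               # non-halting toy program
```

The sketch makes the direction of the test explicit: finding some $n$ with $P_\pi(n) = 0$ means the program halts, while $P_\pi(n) = 1$ up to a valid bound means it never does.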

Corollary 8.23. There is no constructive proof showing that every $f \in \Pi_1$ has a finite test set.

Remark. With appropriate modifications, a statement similar to Corollary 8.23 can be proved for $\Sigma_1$. In fact, for any $s \in \mathbb{N}$ and any $\Gamma_s$, there is no constructive proof of the fact that every $f \in \Gamma_s$ has a finite test set.

Many true $\Pi_1$-problems are undecidable, hence independent with respect to an axiomatic formal system with (1)-(3). The analysis above can help us in understanding this phenomenon. Knowing that $P$ is false can be used to get a proof that "$P$ is false": we keep computing $P(n)$, for large enough $n$, until we get an $n$ such that $\neg P(n)$. But this situation is not symmetric: if we know that $P$ is true we might not be able to prove that "$P$ is true", and this case is quite frequent [91]. Indeed, even when we "have" the proof, i.e. we have successfully checked that $P(n) \ne 0$, for all $n \le \nu((\forall n)P(n))$, we might not be able to "realize" that we have achieved the necessary bound. The correspondence $P \mapsto \nu((\forall n)P(n))$ exists and is perfectly legitimate from a classical point of view, but has no constructive "meaning". To a large extent the mathematical activity can be regarded as a gigantic, collective effort to compute individual instances of the function $\nu((\forall n)P(n))$. This point of view is consistent with Post's description of mathematical creativity [336]:

Every symbolic logic is incomplete and extendible relative to the class of propositions constituting Halt. The conclusion is inescapable that even for such a fixed, well defined body of mathematical propositions, mathematical thinking is, and must remain, essentially creative.

In essence, the seemingly paradoxical situation arises from the fact that, in classical logic, it may happen that only finite resources are needed for defining a finite object but finite resources will not suffice to determine the same object constructively. The finite "character" of a problem may nevertheless rule out - in a very fundamental way - that its solution can be obtained by finite means.

8.7

Computing 64 Bits of a Computably Enumerable Random Real

Any attempt to compute the uncomputable or to decide the undecidable is without doubt challenging. Various successful attempts have been reported, see, for example, Marxen and Buntrock [308], Stewart [381], Casti [105]. What about computing an exact approximation of a c.e. random real? Computing some initial bits of an Omega Number is even more difficult. According to Theorem 7.109, c.e. random reals can be coded by universal Chaitin computers through their halting probabilities. How "good" or "bad" are these names? First we start with the register machine model used by Chaitin [121].


Recall that any register machine has a finite number of registers, each of which may contain an arbitrarily large non-negative integer. The list of instructions is given below in two forms: our compact form and its corresponding Chaitin [121] version. The main difference between Chaitin's implementation and ours is in the encoding: we use 7-bit codes instead of 8-bit codes.

L: ? L1    (L: GOTO L1)
This is an unconditional branch to L1. L1 is a label of some instruction in the program of the register machine.

L: ^ R L1    (L: JUMP R L1)
Set the register R to be the label of the next instruction and go to the instruction with label L1.

L: @ R    (L: GOBACK R)
Go to the instruction with a label which is in R. This instruction will be used in conjunction with the jump instruction to return from a subroutine. The instruction is illegal (i.e. a run-time error occurs) if R has not been explicitly set to a valid label of an instruction in the program.

L: = R1 R2 L1    (L: EQ R1 R2 L1)
This is a conditional branch. The last 7 bits of register R1 are compared with the last 7 bits of register R2. If they are equal, then the execution continues at the instruction with label L1. If they are not equal, then execution continues with the next instruction in sequential order. R2 may be replaced by a constant which can be represented by a 7-bit ASCII code, i.e. a constant from 0 to 127.

L: # R1 R2 L1    (L: NEQ R1 R2 L1)
This is a conditional branch. The last 7 bits of register R1 are compared with the last 7 bits of register R2. If they are not equal, then the execution continues at the instruction with label L1. If they are equal, then execution continues with the next instruction in sequential order. R2 may be replaced by a constant which can be represented by a 7-bit ASCII code, i.e. a constant from 0 to 127.

L: ) R    (L: RIGHT R)
Shift register R right 7 bits, i.e. the last character in R is deleted.

L: ( R1 R2    (L: LEFT R1 R2)
Shift register R1 left 7 bits, add to it the rightmost 7 bits of register R2, and then shift register R2 right 7 bits. The register R2 may be replaced by a constant from 0 to 127.

L: & R1 R2    (L: SET R1 R2)
The content of register R1 is replaced by the content of register R2. R2 may be replaced by a constant from 0 to 127.

L: ! R    (L: READ R)
One bit is read into the register R, so the numerical value of R becomes either 0 or 1. Any attempt to read past the last data-bit results in a run-time error.

L: /    (L: DUMP)
All register names and their contents, as bit strings, are written out. This instruction is also used for debugging.

L: %    (L: HALT)
Halts the execution. This is the last instruction for each register machine program.
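The 7-bit shift and compare instructions can be illustrated with a short Python sketch. It assumes registers are modelled as non-negative integers whose base-128 digits are the stored characters; the helper names (`right`, `left`, `eq`, `pack`, `unpack`) are hypothetical and are not taken from Chaitin's or the authors' implementation.

```python
MASK = 0x7F  # the rightmost 7 bits of a register


def right(r: int) -> int:
    """RIGHT R: shift right 7 bits, deleting the last character."""
    return r >> 7


def left(r1: int, r2: int) -> tuple:
    """LEFT R1 R2: append the last character of R2 to R1, then shift R2 right."""
    return (r1 << 7) | (r2 & MASK), r2 >> 7


def eq(r1: int, r2: int) -> bool:
    """EQ R1 R2: compare only the last 7 bits of each register."""
    return (r1 & MASK) == (r2 & MASK)


def pack(text: str) -> int:
    """Store an ASCII string in a register, 7 bits per character."""
    value = 0
    for ch in text:
        value = (value << 7) | (ord(ch) & MASK)
    return value


def unpack(value: int) -> str:
    """Read a register back as text (leading NUL characters are dropped)."""
    chars = []
    while value:
        chars.append(chr(value & MASK))
        value >>= 7
    return "".join(reversed(chars))


if __name__ == "__main__":
    c = 110                 # SET c 110, the ASCII code of 'n'
    c, _ = left(c, 101)     # LEFT c 101 appends 'e'
    print(unpack(c))        # the register now spells "ne"
    print(eq(c, 101))       # only the last character is compared
```

Treating a constant operand as a register that holds just that constant keeps `left` and `eq` uniform, which is a design choice of this sketch rather than something the text prescribes.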


Definition 8.24. A register machine program consists of a finite list of labelled instructions from the above list, with the restriction that the HALT instruction appears only once, as the last instruction of the list. The data (a binary string) immediately follow the HALT instruction.

The use of undefined variables is a run-time error. A program not reading the whole data or attempting to read past the last data-bit results in a run-time error. Because of the position of the HALT instruction and the specific way data are read, register machine programs are Chaitin computers. To be more precise, we present a context-free grammar G = (N, Σ, P, S) in Backus-Naur form which generates the register machine programs.

(1) N is the finite set of non-terminal variables:

    N     = {S} ∪ INST ∪ TOKEN
    INST  = {⟨RMS_Ins⟩, ⟨?_Ins⟩, ⟨^_Ins⟩, ⟨@_Ins⟩, ⟨=_Ins⟩, ⟨#_Ins⟩, ⟨)_Ins⟩, ⟨(_Ins⟩, ⟨&_Ins⟩, ⟨!_Ins⟩, ⟨/_Ins⟩, ⟨%_Ins⟩}
    TOKEN = {⟨DATA⟩, ⟨LABEL⟩, ⟨REGISTER⟩, ⟨CONSTANT⟩, ⟨SPECIAL⟩, ⟨SPACE⟩, ⟨ALPHA⟩, ⟨LS⟩}

(2) Σ, the alphabet of the register machine programs, is a finite set of terminals, disjoint from N:

    Σ          = ⟨ALPHA⟩ ∪ ⟨SPECIAL⟩ ∪ ⟨SPACE⟩ ∪ ⟨CONSTANT⟩
    ⟨ALPHA⟩    = {a, b, c, ..., z}
    ⟨SPECIAL⟩  = {:, /, ?, ^, @, =, #, ), (, &, !, ?, %}
    ⟨SPACE⟩    = {'space', 'tab'}
    ⟨CONSTANT⟩ = {d | 0 ≤ d ≤ 127}

(3) P (a subset of N × (N ∪ Σ)*) is the finite set of rules (productions):

    S          → ⟨RMS_Ins⟩* ⟨%_Ins⟩ ⟨DATA⟩
    ⟨DATA⟩     → (0|1)*
    ⟨LABEL⟩    → 0 | (1|2|...|9)(0|1|2|...|9)*
    ⟨LS⟩       → : ⟨SPACE⟩*
    ⟨REGISTER⟩ → ⟨ALPHA⟩(⟨ALPHA⟩ ∪ (0|1|2|...|9))*
    ⟨RMS_Ins⟩  → ⟨?_Ins⟩ | ⟨^_Ins⟩ | ⟨@_Ins⟩ | ⟨=_Ins⟩ | ⟨#_Ins⟩ | ⟨)_Ins⟩ | ⟨(_Ins⟩ | ⟨&_Ins⟩ | ⟨!_Ins⟩ | ⟨/_Ins⟩

    (L: HALT)
    ⟨%_Ins⟩ → ⟨LABEL⟩⟨LS⟩%

    (L: GOTO L1)
    ⟨?_Ins⟩ → ⟨LABEL⟩⟨LS⟩?⟨SPACE⟩*⟨LABEL⟩

    (L: JUMP R L1)
    ⟨^_Ins⟩ → ⟨LABEL⟩⟨LS⟩^⟨SPACE⟩*⟨REGISTER⟩⟨SPACE⟩+⟨LABEL⟩

    (L: GOBACK R)
    ⟨@_Ins⟩ → ⟨LABEL⟩⟨LS⟩@⟨SPACE⟩*⟨REGISTER⟩

    (L: EQ R 0/127 L1 or L: EQ R R2 L1)
    ⟨=_Ins⟩ → ⟨LABEL⟩⟨LS⟩=⟨SPACE⟩*⟨REGISTER⟩⟨SPACE⟩+⟨CONSTANT⟩⟨SPACE⟩+⟨LABEL⟩
            | ⟨LABEL⟩⟨LS⟩=⟨SPACE⟩*⟨REGISTER⟩⟨SPACE⟩+⟨REGISTER⟩⟨SPACE⟩+⟨LABEL⟩

    (L: NEQ R 0/127 L1 or L: NEQ R R2 L1)
    ⟨#_Ins⟩ → ⟨LABEL⟩⟨LS⟩#⟨SPACE⟩*⟨REGISTER⟩⟨SPACE⟩+⟨CONSTANT⟩⟨SPACE⟩+⟨LABEL⟩
            | ⟨LABEL⟩⟨LS⟩#⟨SPACE⟩*⟨REGISTER⟩⟨SPACE⟩+⟨REGISTER⟩⟨SPACE⟩+⟨LABEL⟩

    (L: RIGHT R)
    ⟨)_Ins⟩ → ⟨LABEL⟩⟨LS⟩)⟨SPACE⟩*⟨REGISTER⟩

    (L: LEFT R L1)
    ⟨(_Ins⟩ → ⟨LABEL⟩⟨LS⟩(⟨SPACE⟩*⟨REGISTER⟩⟨SPACE⟩+⟨CONSTANT⟩
            | ⟨LABEL⟩⟨LS⟩(⟨SPACE⟩*⟨REGISTER⟩⟨SPACE⟩+⟨REGISTER⟩

    (L: SET R 0/127 or L: SET R R2)
    ⟨&_Ins⟩ → ⟨LABEL⟩⟨LS⟩&⟨SPACE⟩*⟨REGISTER⟩⟨SPACE⟩+⟨CONSTANT⟩
            | ⟨LABEL⟩⟨LS⟩&⟨SPACE⟩*⟨REGISTER⟩⟨SPACE⟩+⟨REGISTER⟩

    (L: READ R)
    ⟨!_Ins⟩ → ⟨LABEL⟩⟨LS⟩!⟨SPACE⟩*⟨REGISTER⟩

    (L: DUMP)
    ⟨/_Ins⟩ → ⟨LABEL⟩⟨LS⟩/

(4) S ∈ N is the start symbol for the set of register machine programs.

It is important to observe that the above construction is universal in the sense of AIT. Register machine programs are self-delimiting because the HALT instruction is at the end of any valid program. Note that the data, which immediately follow the HALT instruction, are read bit by bit with no endmarker. This type of construction was first programmed in Lisp by Chaitin [121, 132].

To minimize the number of programs of a given length that need to be simulated, we have used "canonical programs" instead of general register machine programs. A canonical program is a register machine program in which (1) labels appear in increasing numerical order starting with 0, (2) new register names appear in increasing lexicographical order starting from 'a', (3) there are no leading or trailing spaces, (4) operands are separated by a single space, (5) there is no space after labels or operators, (6) instructions are separated by a single space. Note that for every register machine program there is a unique canonical program which is equivalent to it; that is, both programs have the same domain and produce the same output on a given input.

If x is a program and y is its canonical program, then $|y| \le |x|$.

Here is an example of a canonical program:

    0:!a 1:^b 4 2:!c 3:?11 4:=a 0 8 5:&c 110 6:(c 101 7:@b 8:&c 101 9:(c 113 10:@b 11:%10

To facilitate understanding of the code we rewrite the instructions with additional comments and spaces:

    0:! a         // read the first data-bit into register a
    1:^ b 4       // jump to a subroutine at line 4
    2:! c         // on return from the subroutine call c is written out
    3:? 11        // go to the halting instruction
    4:= a 0 8     // the rightmost 7 bits are compared with the constant 0; if they are
                  // equal, then go to label 8
    5:& c 'n'     // else, continue here and
    6:( c 'e'     // store the character string 'ne' in register c
    7:@ b         // go back to the instruction with label 2 stored in
                  // register b
    8:& c 'e'     // store the character string 'eq' in register c
    9:( c 'q'
    10:@ b
    11:%          // the halting instruction
    10            // the input data

For optimization reasons, our particular implementation designates the first maximal sequence of SET/LEFT instructions as (static) register preloading instructions. We "compress" these canonical programs by a) deleting all labels, spaces and the colon symbol with the first non-static instruction having an implicit label 0, b) separating multiple operands by a single comma symbol, c) replacing constants with their ASCII numerical values. The compressed format of the above program is

    !a^b,4!c?11=a,0,8&c,110(,c,101@b&c,101(,c,113@b%10

Note that compressed programs are canonical programs because during the process of "compression" everything remains the same except for the elimination of space. Compressed programs use an alphabet with 49 symbols (including the halting character). The length is calculated as the sum of the program length and the data length (7 times the number of characters). For example, the length of the above program is 7 × (48 + 2) = 350.

In what follows we will be focusing on compressed programs. A Java version interpreter for register machine compressed programs has been implemented; it imitates Chaitin's construction in [121]. This interpreter has been used to test the Halting Problem for all register machine programs of at most 84 bits. The results have been obtained according to the following procedure:

1. Start by generating all programs of 7 bits and test which of them stops. All strings of length 7 which can be extended to programs are considered prefixes for possible halting programs of length 14 or longer; they will simply be called prefixes. In general, all strings of length n which can be extended to programs are prefixes for possible halting programs of length n + 7 or longer. Compressed prefixes are prefixes of compressed (canonical) programs.

2. Testing the Halting Problem for programs of length n ∈ {7, 14, 21, ..., 84} was done by running all candidates (that is, programs of length n which are extensions of prefixes of length n - 7) for up to 100 instructions, and proving that any generated program which does not halt after running 100 instructions never halts.

For example, (uncompressed) programs that match the regular expression "0:\^ a 5.* 5:\? 0" never halt on any input. For example, each of the programs "!a!b!a!b/%10101010" and "!a?0%10101010" produces a run-time error; the first program "under reads" the data and the second one "over reads" the data. The program "!a?1!b%1010" loops.
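The length convention for compressed programs can be checked with a few lines of Python. This is a sketch under the assumption, read off the worked example 7 × (48 + 2) = 350, that every character of the program (including the halting character %) and every data bit counts for 7 bits; the function name is hypothetical.

```python
def compressed_length_bits(program: str) -> int:
    """Length in bits of a compressed program followed by its binary data.

    Assumes the convention of the worked example: every program character,
    the halting character '%', and every data bit each count for 7 bits.
    """
    code, halt, data = program.partition("%")
    if not halt:
        raise ValueError("a program must contain the halting character '%'")
    if set(data) - {"0", "1"}:
        raise ValueError("the data after '%' must be a binary string")
    return 7 * (len(code) + 1 + len(data))


if __name__ == "__main__":
    example = "!a^b,4!c?11=a,0,8&c,110(,c,101@b&c,101(,c,113@b%10"
    print(compressed_length_bits(example))          # the worked example
    print(compressed_length_bits("/" * 100 + "%"))  # 100 dump instructions
```

The same function reproduces the size quoted for the trivial program made of 100 dump instructions, which has 101 characters and no data.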

Comment. One would naturally want to know the shortest program that halts with more than 100 steps. If this program is larger than 84 bits, then all of our looping programs never halt. The trivial program with a sequence of 100 dump instructions runs for 101 steps but can we do better? The answer is yes. The following family of programs $\{P_1, P_2, \ldots\}$ recursively count to $2^i$ but have linear growth in size. The programs $P_1$ to $P_4$ are given below:²³

    /&a,0=a,1,5&a,1?2%
    /&a,0&b,0=b,1,6&b,1?3=a,1,9&a,1?2%
    /&a,0&b,0&c,0=c,1,7&c,1?4=b,1,10&b,1?3=a,1,13&a,1?2%
    /&a,0&b,0&c,0&d,0=d,1,8&d,1?5=c,1,11&c,1?4=b,1,14&b,1?3=a,1,17&a,1?2%

23 In all cases the data length is zero.

In order to construct the program $P_{i+1}$ from $P_i$ only four instructions are added, while updating "goto" labels. The running time $t(i)$, excluding the halt instruction, of program $P_i$ is found by the following recurrence: $t(1) = 6$, $t(i) = 2 \cdot t(i-1) + 4$. Thus, since $t(4) = 76$ and $t(5) = 156$, $P_5$ is the smallest program in this family to exceed 100 steps. The size of $P_5$ is 86 bytes (602 bits), which is smaller than the trivial dump program of 707 bits. It is an open question what the smallest program that halts after 100 steps is. A hybrid program, given below, created by combining $P_2$ and the trivial dump programs, is the smallest known.

    &a,0/&b,0/////////////////////=b,1,26&b,1?2=a,1,29&a,1?0%

This program of 57 bytes (399 bits) runs for 102 steps. Note that the problem of finding the smallest program with the above property is undecidable (see [131]).

The distribution of halting compressed programs of up to 84 bits for U, the Chaitin universal computer processing compressed programs, is presented in Table 1. All binary strings representing programs have the length divisible by 7.
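Returning to the family $P_i$ of the Comment: the recurrence $t(1) = 6$, $t(i) = 2 \cdot t(i-1) + 4$ is easy to tabulate. The Python sketch below is only an arithmetic check; the function name is hypothetical, and the closed form $t(i) = 10 \cdot 2^{i-1} - 4$ is derived here from the recurrence rather than quoted from the text.

```python
def running_time(i: int) -> int:
    """Steps executed by P_i, excluding the halt instruction (from the recurrence)."""
    if i < 1:
        raise ValueError("the family starts at P_1")
    t = 6
    for _ in range(i - 1):
        t = 2 * t + 4
    return t


def closed_form(i: int) -> int:
    """Solution of the recurrence: t(i) = 10 * 2**(i - 1) - 4."""
    return 10 * 2 ** (i - 1) - 4


if __name__ == "__main__":
    for i in range(1, 6):
        print(i, running_time(i), closed_form(i))
    # First member of the family that runs for more than 100 steps.
    print(next(i for i in range(1, 100) if running_time(i) > 100))
```

The values 6, 16, 36, 76, 156 show that $P_5$ is the first member of the family whose running time exceeds 100 steps.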

    Program plus data length    Number of halting programs
               7                            1
              14                            1
              21                            3
              28                            8
              35                           50
              42                          311
              49                        1,012
              56                        4,382
              63                       19,164
              70                       99,785
              77                      515,279
              84                    2,559,837

Table 1. Distribution of halting programs
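The counts in Table 1 determine the finite approximations of the halting probability discussed next: a halting program of length k bits contributes $2^{-k}$. The Python sketch below sums these contributions with exact integer arithmetic; the dictionary and function names are illustrative, and the counts are copied from Table 1.

```python
# Number of halting compressed programs for each "program plus data" length (Table 1).
HALTING_COUNTS = {
    7: 1, 14: 1, 21: 3, 28: 8, 35: 50, 42: 311,
    49: 1_012, 56: 4_382, 63: 19_164, 70: 99_785,
    77: 515_279, 84: 2_559_837,
}


def omega_approximation(n_bits: int) -> str:
    """Binary digits (after the point) of the sum of 2**(-k) over halting programs
    of length k <= n_bits, written with exactly n_bits digits."""
    total = sum(count << (n_bits - length)
                for length, count in HALTING_COUNTS.items()
                if length <= n_bits)
    return format(total, "0{}b".format(n_bits))


if __name__ == "__main__":
    for n in sorted(HALTING_COUNTS):
        print(n, "0." + omega_approximation(n))
```

Scaling everything by $2^{n}$ keeps the computation in integers, so carries between 7-bit blocks (for example the 311 programs of length 42, which overflow one block) are handled automatically.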


Computing all halting programs of up to 84 bits for U seems to give the exact values of the first 84 bits of $\Omega_U$. False! To understand the point let us first ask ourselves whether the converse implication in Theorem 7.112 is true. The answer is negative. Globally, if we can compute all bits of $\Omega_U$, then we can decide the Halting Problem for every program for U and conversely. However, if we can solve for U the Halting Problem for all programs up to N bits long we still might not get the exact value of any bit of $\Omega_U$ (much less all values for the first N bits). Indeed, a large set of very long halting programs can contribute to the values of more significant bits of the expansion of $\Omega_U$. So, to be able to compute the exact values of the first N bits of $\Omega_U$ we need to be able to prove that longer programs do not affect the first N bits of $\Omega_U$. And, fortunately, this is the case for our computation. Due to our specific procedure for solving the Halting Problem, any compressed halting program of length n has a compressed prefix of length n - 7. This gives an upper bound for the number of possible compressed halting programs of length n.

Let $\Omega_U^n$ be the approximation of $\Omega_U$ given by the summation of all halting programs of up to n bits in length. Compressed prefixes are partitioned into two cases - ones with a HALT (%) instruction and ones without. Hence, halting programs may have one of the following two forms: either "x y HALT u", where x is a prefix of length k not containing HALT, y is a sequence of instructions of length n - k not containing HALT and u are the data of length m ≥ 0; or "x u", where x is a prefix of length k containing one occurrence of HALT followed by data (possibly empty) and u are the data of length m ≥ 1. In both cases the prefix x has been extended by at least one character. Accordingly, the "tail" contribution to the value of

$$\Omega_U = \sum_{n=0}^{\infty} \ \sum_{\{|w| = n,\ U(w) \text{ halts}\}} 2^{-|w|}$$

is bounded from above by the sum of the following two convergent series (which reduce to two independent sums of geometric progressions):

$$\sum_{m=0}^{\infty} \sum_{n=k}^{\infty} \underbrace{\#\{x \mid \text{prefix } x \text{ not containing HALT},\ |x| = k\}}_{x} \cdot \underbrace{48^{\,n-k}}_{y} \cdot \underbrace{1}_{\text{HALT}} \cdot \underbrace{2^{m}}_{u} \cdot 128^{-(n+m+1)}$$

and

$$\sum_{m=1}^{\infty} \underbrace{\#\{x \mid \text{prefix } x \text{ containing HALT},\ |x| = k\}}_{x} \cdot \underbrace{2^{m}}_{u} \cdot 128^{-(m+k)}.$$

The number 48 comes from the fact that the alphabet has 49 characters and the last instruction before the data is HALT (%). There are 402,906,842 prefixes not containing HALT and 1,748,380 prefixes containing HALT. Hence, the "tail" contribution of all programs of length 91 or greater is bounded by

$$\sum_{m=0}^{\infty} \sum_{n=13}^{\infty} 402906842 \cdot 48^{\,n-13} \cdot 2^{m} \cdot 128^{-(n+m+1)} + \sum_{m=1}^{\infty} 1748380 \cdot 2^{m} \cdot 128^{-(m+13)} \qquad (8.4)$$

$$= 402906842 \cdot \frac{64}{63 \cdot 128 \cdot 48^{13}} \cdot \sum_{n=13}^{\infty} \left(\frac{48}{128}\right)^{n} + 1748380 \cdot \frac{1}{63 \cdot 128^{13}} < 2^{-68},$$

i.e. by our method we can get at most 68 correct first bits of $\Omega_U^{84}$. Actually we do not have 68 correct bits, but only 64 because adding a 1 to the 68th bit may cause an overflow up to the 65th bit. From (8.4) it follows that no other overflows may occur.

The following list presents the main results of the computation:

    Ω_U^7  = 0.0000001
    Ω_U^14 = 0.00000010000001
    Ω_U^21 = 0.000000100000010000011
    Ω_U^28 = 0.0000001000000100000110001000
    Ω_U^35 = 0.00000010000001000001100010000110010
    Ω_U^42 = 0.000000100000010000011000100001101000110111
    Ω_U^49 = 0.0000001000000100000110001000011010001111101110100
    Ω_U^56 = 0.00000010000001000001100010000110100011111100101100011110
    Ω_U^63 = 0.000000100000010000011000100001101000111111001011101100111011100
    Ω_U^70 = 0.0000001000000100000110001000011010001111110010111011100111001111001001
    Ω_U^77 = 0.00000010000001000001100010000110100011111100101110111010000011100000101001111
    Ω_U^84 = 0.000000100000010000011000100001101000111111001011101110100001000001111011011011011101

The exact bits (the first 64) are those before the vertical bar in the 84-bit approximation:

    Ω_U^84 = 0.0000001000000100000110001000011010001111110010111011101000010000|01111011011011011101

We have obtained:

Theorem 8.25 (Calude-Dinneen-Shu). The first 64 exact bits of $\Omega_U$ are:

    0000001000000100000110001000011010001111110010111011101000010000

Omega's first 64 digits. (Picture by J. Arulanandham and M. J. Dinneen)
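The tail bound (8.4) behind Theorem 8.25 can be re-derived with exact rational arithmetic. The Python sketch below evaluates the two geometric sums in closed form; the prefix counts are the ones quoted in the text, while the function and variable names are illustrative only.

```python
from fractions import Fraction

PREFIXES_WITHOUT_HALT = 402_906_842   # prefixes not containing HALT (from the text)
PREFIXES_WITH_HALT = 1_748_380        # prefixes containing HALT (from the text)


def tail_bound() -> Fraction:
    """Closed form of the two series in (8.4)."""
    data_ratio = Fraction(2, 128)      # each extra data bit: factor 2 * 128**(-1)
    code_ratio = Fraction(48, 128)     # each extra instruction character

    # sum over m >= 0 of (2/128)**m = 64/63
    data_sum_from_0 = 1 / (1 - data_ratio)
    # sum over n >= 13 of 48**(n-13) * 128**(-n) = 128**(-13) / (1 - 48/128)
    code_sum = Fraction(1, 128 ** 13) / (1 - code_ratio)
    # the extra factor 1/128 accounts for the HALT character
    first = PREFIXES_WITHOUT_HALT * data_sum_from_0 * code_sum * Fraction(1, 128)

    # sum over m >= 1 of (2/128)**m = 1/63
    data_sum_from_1 = data_ratio / (1 - data_ratio)
    second = PREFIXES_WITH_HALT * data_sum_from_1 * Fraction(1, 128 ** 13)
    return first + second


if __name__ == "__main__":
    bound = tail_bound()
    print(float(bound))
    print(bound < Fraction(1, 2 ** 68))
```

Under these assumptions the bound falls strictly between $2^{-69}$ and $2^{-68}$, which is consistent with the claim that programs of 91 bits or more cannot disturb bits beyond the overflow range discussed above.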


As we have already mentioned, solving the Halting Problem for programs of up to n bits might not be enough to compute exactly the first n bits of the halting probability. In our case, we have solved the Halting Problem for programs of at most 84 bits, but we have obtained only 64 exact initial bits of the halting probability. The method, which combines programming with mathematical proofs, can be improved in many respects. However, due to the impossibility of testing that long looping programs never actually halt (the undecidability of the Halting Problem), the method is essentially non-scalable.

Finally, there is no contradiction between Theorem 8.14 and Theorem 8.25. Omega Numbers are halting probabilities of Chaitin universal computers, and each Ω is the halting probability of an infinite number of such computers - among them, Solovay computers; ZFC cannot determine more than the initial run of 1s of their halting probabilities. But the same Ω can be defined as the halting probability of a Chaitin universal computer which is not a Solovay computer, so ZFC, if supplied with that different computer, may be able to compute more (but, according to Theorem 8.8, always only finitely many) digits of the same Ω. Such a computer has been used for the Ω discussed in this section.

The web site ftp://ftp.cs.auckland.ac.nz/pub/CDMTCS/Omega/ contains all programs used for the computation as well as all intermediate and final data files (3 gigabytes in gzip format).

Finally, let us compare the following three numbers: π, $\chi_{Halt}$ and Ω. Of course, π is computable, but $\chi_{Halt}$ and Ω are not computable. One consequence of this distinction is the following: we can compute as many digits of π as we want provided we have enough resources (money and time), but this is not possible for $\chi_{Halt}$ and Ω. We can compute infinitely many correct bits of the binary expansion of $\chi_{Halt}$, but again this is impossible for Ω because of Theorem 8.8.

8.8

Turing's Barrier Revisited

Classically, there are two equivalent ways to look at the mathematical notion of proof: a) as a finite sequence of sentences strictly obeying some axioms and inference rules, b) as a specific type of computation. Indeed, from a proof given as a sequence of sentences one can easily construct a machine producing that sequence as the result of some finite computation and, conversely, given a machine computing a proof we can just print all sentences produced during the computation and arrange them in a sequence. A proof is an explicit sequence of reasoning steps that can be inspected at leisure; in theory, if followed with care, such a sequence either reveals a gap or mistake, or can convince a sceptic of its conclusion, in which case the theorem is considered proven.

This equivalence has stimulated the construction of programs which perform like artificial mathematicians.²⁴ From proving simple theorems of Euclidean geometry to the proof of the Four-Colour Theorem, these "theorem provers" have been very successful. Of course, this was a good reason for sparking lots of controversies. Artificial mathematicians are far less ingenious and subtle than human mathematicians, but they surpass their human counterparts by being infinitely more patient and diligent. What about making errors? Are human mathematicians less prone to errors? This is a difficult question which requires more attention.

If a conventional proof is replaced by a "quantum computational proof" (or a proof produced as a result of a molecular experiment), then the conversion from a computation to a sequence of sentences may be impossible, e.g. due to the size of the computation. For example, a quantum machine could be used to create some proof that relied on quantum interference among all the computations going on in superposition. The quantum machine would say "your conjecture is true", but there will be no way to exhibit all trajectories followed by the quantum machine in reaching that conclusion. In other words, the quantum machine has the ability to check a proof, but it may fail to reveal any "trace" of how it did it. Even worse, any attempt to watch the inner working of the quantum machine (e.g. by "looking" at any information concerning the state of the ongoing proof) may compromise forever the proof itself!

These facts may not affect the essence of mathematical objects and constructions (which have an autonomous reality quite independent of the physical reality), but they seem to have an impact on how we learn/understand mathematics (which is through the physical world). Indeed, our glimpses of mathematics seem to be "revealed" through physical objects, i.e. human brains, silicon computers, quantum Turing machines, etc., hence, according to Deutsch [176], they have to obey not only the axioms and the inference rules of the theory, but the laws of physics as well.

The question of trespassing on Turing's barrier, i.e. the possibility to solve a Turing undecidable problem, to compute an uncomputable function, has been considered by various authors, e.g. Siegelmann [366], Casti [105], Copeland [147, 148], Calude and Casti [65]. Is there any hope for quantum (or DNA) computing to challenge the Turing barrier, i.e. to solve an undecidable problem, to compute an uncomputable function? According to Feynman's argument (see [196], a paper reproduced also in [236]) any quantum system can be simulated with arbitrary precision by a (probabilistic) Turing machine, so the answer seems to be negative. However, some recent tentative approaches promise a positive answer: for quantum approaches²⁵ see [78, 189, 97, 254]²⁶ and for DNA methods see Calude and Paun [96].

Is incompleteness affected? We need more understanding of the quantum world to be able to answer this question. One step towards a possible answer to this question is to look at the quantum version of the Omega Number, the number $\Omega_q$ invented in 1995 by G. Chaitin, K. Svozil and A. Zeilinger (see [393, 434]; see also [254, 419]). The number $\Omega_q$ is the probability amplitude with which a random quantum program halts on a self-delimiting universal quantum machine (hence, the halting probability of a self-delimiting universal quantum machine is $|\Omega_q|^2$).²⁷ For computing $\Omega_q$ only the quantum versions of classical bits in the domain of the quantum machine are allowed as inputs, so from the computability point of view $\Omega_q$ is an Ω, hence all information-theoretic results remain unchanged. The halting probability of any quantum device capable of solving the Halting Problem (for classical Turing machines) will be an α number (as introduced in Becher, Daicz and Chaitin [23]), a random, but not c.e. real; the "incompleteness" derived from such a number has not (yet) been studied.

As is pointed out in [78], all these theoretical proposals for trespassing on Turing's barrier may have for the time being a fairly low impact on computer technology because for practical purposes the halting computation has a non-zero, but very small, chance of detection. So, when reality seems so far away from theory, why are we concerned with the latter? According to Landauer [272] the answer is:

Because it is at the very core of science. ... Information, numerical or otherwise, is not an abstraction, but it is inevitably tied to ... the physical universe, its contents and its laws.

24 Other types of "reasoning" such as medical diagnosis or legal inference have been successfully modelled and implemented; see, for example, the British Nationality Act which has been encoded in first-order logic and a machine has been used to uncover its potential logical inconsistencies.
25 Randomness is essential. For the idea of a quantum random generator see Svozil [390]; a quantum random machine is described at http://www.gapoptic.unige.ch/Prototypes/QRNG/default.asp.
26 Halting programs can be recognized by simply running them; the main difficulty is to detect non-halting programs. Calude and Pavlov [97] have constructed a mathematical quantum "device" (with sensitivity ε) to solve the Halting Problem. The "device" works on a randomly chosen test-vector for T units of time. If the "device" produces a click, then the program halts. If it does not produce a click, then either the program does not halt or the test-vector has been chosen from an undistinguishable set of vectors $F_{\varepsilon,T}$. The last case is not dangerous as our main result proves: the Wiener measure of $F_{\varepsilon,T}$ constructively tends to zero when T tends to infinity. The "device", working in time T, appropriately computed, will determine with a pre-established precision whether an arbitrary program halts or not.
27 Things are more complicated as the halt bit of the quantum machine might enter a superposition state and remain there while other parts of the output state describing the quantum machine continue to change. Finally, to settle the matter one has to perform a measurement.

8.9

History of Results

GIT is presented in many papers and books. Here is an incomplete list of references dealing with GIT in the form discussed in this chapter: Barrow [16, 17], Chaitin [112, 113, 114, 115, 120, 121, 122, 123, 125, 131, 132, 135], Davis [157], Dawson Jr. [161]' Delahaye [164, 165], Detlefsen [174], Kieu [254], Manca [293], Pagels [328], Rucker [349, 350], Svozil [391] and van Lambalgen [412]. Chaitin's Omega Number has received a great deal of attention subsequently (see, for instance, Barrow [15], Bennett, Gardner [32], Casti [103, 104], Davies [155]). Solovay Theorem 8.11 comes from [377]. Theorem 8.12 was proved in Calude [62] (see also [61]). Section 8.6 follows Calude, Jurgensen and Legg [90] while Section 8.7 follows Calude, Dinneen and Shu [77]. In Dawson Jr. [161] there is a brief citation from Godel's 1940 lecture regarding the idea of a random sequence. This is an extraordinary statement for the time it was made, because Cohen and Solovay had announced their results on the independence ofthe Continuum Hypothesis much later and AIT was developed only in the mid 1960s. In [162] Dawson Jr. wrote:

8.9 History of Results The statement in question is indeed extraordinary. It is taken from Godel's lecture at Brown University on 15 November 1940, the text of which was only published posthumously, in vol. III of Godel's Collected Works. 28 His exact words (quoted from pp. 184-185 of that volume) are: "It is to be expected that also not-A will be consistent with the axioms of mathematics, because an inconsistency of not-A would imply an inconsistency of the notion of a random sequence, where by a random sequence I mean one which follows no mathematical law whatsoever, and it seems very unlikely that this notion should imply a contradiction. Another argument which makes the consistency of not-A plausible is that an inconsistency of not-A would yield a proof for the axiom of choice, whereas the axiom of choice is generally conjectured to be independent." In his introductory note to that lecture of Godel's, Robert Solovay comments (p. 118), "At first glance [Godel's remark about random sequences] seems a foreshadowing of my notion of a real being random over a transitive model of set theory [to wit, that it] ... lies in no Borel set of Lebesgue measure zero coded by a real of [that model]. The analogous notion (of an absolutely random real) would be ... [one] that lies in no ordinal-definable set of measure zero. It is of course [now known that it is] consistent that such reals exist .... " Solovay goes on to say, "Upon reflection, however, I doubt that this notion is what Godel had in mind. More likely, it seems to me that by 'random' he meant a real which is not ordinal definable. This seems to be what the phrase 'no mathematical law whatsoever' was intended to express." And of course, it was just six years later that Godel first broached the notion of ordinal definability, in his lecture at the Princeton Bicentennial conference. 
I'm not aware of any other statements by Gödel concerning the notion of randomness, though there might be something hidden somewhere in his Arbeitshefte, not all of which have yet been transcribed from the shorthand. There are also other stunning instances of Gödel's prescience. One such is his lecture to the Zilsel circle in 1938, in which he anticipated Kreisel's "no-counterexample interpretation". In studying Gödel's papers, I sometimes had the eerie feeling I was dealing with someone not quite human, someone with a direct pipeline to mathematical truth (a genius, that is, possessed of an extraordinary mathematical intuition).

$^{28}$See [190].

Chapter 9

Applications

Theory is to practice as rigor is to vigor.
D. E. Knuth

This chapter discusses some applications of the main results in AIT. They reflect both the power and the beauty of the theory. This part is not so homogeneous; it is not a conclusion, nor a justification.

9.1 The Infinity of Primes

In this section we present Chaitin's proof [119] of one of the most important results in mathematics (if not the most important):

Theorem 9.1. The set of primes is infinite.

Proof. We start by formalizing the following idea: if there are only finitely many primes, the prime factorization of a number would usually be a much more compact representation for it than its base 2 numeral, which is absurd.$^{1}$

Let $A = \{0, 1\}$ and assume, for a proof by absurdity, that $p_1, p_2, \ldots, p_k$ are the only primes. To represent a natural number $n$ by means of the primes we write

$$n = \prod_{i=1}^{k} p_i^{e_i},$$

i.e. we give the exponents $e_1, e_2, \ldots, e_k$ of $p_1, p_2, \ldots, p_k$. Note that the uniqueness of factorization property is not needed here. So,

$$H(\mathrm{string}(n)) < H(\langle \mathrm{string}(e_1), \ldots, \mathrm{string}(e_k) \rangle) + O(1) < \sum_{i=1}^{k} H(\mathrm{string}(e_i)) + O(1).$$

By virtue of the inequalities

$$2^{e_i} \le p_i^{e_i} \le n$$

one deduces

$$e_i \le \log_2 n, \quad \text{and} \quad H(\mathrm{string}(e_i)) \le \log\log n + O(\log\log\log n).$$

In case $\mathrm{string}(n)$ is random, $H(\mathrm{string}(n))$ is approximately $\log n + O(\log\log n)$ and the inequality

$$\log n + O(\log\log n) \le k(\log\log n + O(\log\log\log n))$$

can be true only for finitely many $n$. □

$^{1}$This idea appears formulated as a counting argument in Hardy and Wright [223].
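The counting form of this idea can be checked numerically. The sketch below is an illustration only, not part of Chaitin's proof: it counts the integers up to a limit that are products of a fixed finite set of "primes" and compares the count with the number of available exponent vectors, at most $(\lfloor\log_2 n\rfloor + 1)^k$. The function names are hypothetical.

```python
def count_products(primes, limit):
    """Count the integers in [1, limit] all of whose prime factors lie in `primes`."""
    seen = {1}
    frontier = [1]
    while frontier:
        m = frontier.pop()
        for p in primes:
            q = m * p
            if q <= limit and q not in seen:
                seen.add(q)
                frontier.append(q)
    return len(seen)


def exponent_vector_bound(k, limit):
    """Number of exponent vectors (e_1, ..., e_k) with 0 <= e_i <= floor(log2(limit)).

    For limit >= 1, floor(log2(limit)) + 1 equals limit.bit_length().
    """
    return limit.bit_length() ** k


if __name__ == "__main__":
    primes = (2, 3, 5)  # pretend these were the only primes
    for limit in (10, 10**3, 10**6):
        print(limit, count_products(primes, limit),
              exponent_vector_bound(len(primes), limit))
```

For a fixed number k of primes the bound grows only polylogarithmically in the limit, so most numbers below a large limit cannot be represented at all; this is the finite counterpart of the complexity inequality above.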

With a variation of the above method one can prove the existence of a positive constant $a$ such that for infinitely many $n$ we have

$$p_n \le a\, n \log n\, (\log\log n)^2.$$

9.2 The Undecidability of the Halting Problem

We now pass to the fundamental theorem in computability theory, the undecidability of the Halting Problem. The following proof - due to Chaitin - has two advantages: a) it uses no formal model of computability, b) it identifies one of the sources of undecidability.


Theorem 9.2. There is no computable function deciding whether a computer program ever halts.

Proof. Without restricting the generality of the proof we may assume that all programs incorporate inputs - which are coded as natural numbers. So, a program may run forever or may just eventually stop, in which case it prints a natural number.

Here is an information-theoretic approach. Assume that there exists a halting program deciding whether an arbitrary program will ever halt. Construct the following program:

1. read a natural N;
2. generate all programs up to N bits in size;
3. use the halting program to check for each generated program whether it halts; remove all non-halting programs;
4. simulate the running of the above generated halting programs;
5. output twice the biggest value output by these programs.

The above program halts for every natural N. How long is it? It is about log N bits. Reason: to know N we need $\log_2 N$ bits (in binary); the rest of the program is of constant size, so our program is $\log N + O(1)$ bits long. Now we observe that there is a big difference between the size - in bits - of our program and the size of the output produced by this program. Indeed, for large enough N, the size of our program is less than N bits (because $\log N + O(1) < N$). Accordingly, the program will be generated by itself - at some stage of the computation. In this case we have a contradiction since our program will output a natural number two times bigger than the output produced by itself! □
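The size comparison at the heart of the argument is elementary arithmetic, sketched below. The constant `c` stands for the (unknown) size in bits of the fixed part of the program; its value and the function names are assumptions made only for illustration.

```python
def program_size_bits(N, c):
    """Size of the constructed program: the binary numeral of N plus c constant bits."""
    return N.bit_length() + c


def least_self_including_N(c):
    """Smallest N for which the program is shorter than N bits.

    From this N on, the program appears among the programs it generates in step 2.
    """
    N = 1
    while program_size_bits(N, c) >= N:
        N += 1
    return N


if __name__ == "__main__":
    for c in (0, 100, 10_000):
        print(c, least_self_including_N(c))
```

Whatever the constant is, such an N exists because log N + c grows more slowly than N; the contradiction in the proof needs nothing more than this.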

9.3 Counting as a Source of Randomness

We consider a natural number $n$, and denote by $\gamma(n)$ the number of Chaitin random strings of length $n$. We fix a natural base $Q \ge 2$, and write the above number in base $Q$. The resulting string (over the alphabet containing the letters $0, 1, \ldots, Q-1$) is itself... random!

Let $A_Q = \{0, 1, \ldots, Q-1\}$, and denote by $(m)_Q$ the base $Q$ representation of the natural $m$. Recall that $\Sigma(n) = \max_{|x|=n} H(x)$ and let

$$\gamma(n) = \#\{x \in A_Q^n \mid H(x) = \Sigma(|x|)\},$$

and

$$\sigma(n) = |(\gamma(n))_Q|.$$

Lemma 9.3. For all natural $n$, $\sigma(n) \le n$.

Proof. Indeed, at least $Q$ strings of length $n$ have $H$-complexity less than $\Sigma(n)$, i.e. the strings $i^n$, $i \in A_Q$. So, $\gamma(n) \le Q^n - Q$. The greatest number whose base $Q$ representation is a string of length $n$ is $Q^n - 1 > \gamma(n)$; it corresponds to the string $(Q-1)^n$. □
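The numeral padded with leading zeros to length n, which is the object the next theorem is about, can be sketched as follows. The complexity H is not computable, so the sketch only handles the base-Q bookkeeping for an arbitrary natural m in the range allowed by Lemma 9.3; the function names are hypothetical.

```python
def base_q_numeral(m, Q):
    """Digits of the natural number m in base Q, most significant digit first."""
    digits = []
    while True:
        m, d = divmod(m, Q)
        digits.append(d)
        if m == 0:
            break
    return digits[::-1]


def padded_numeral(m, n, Q):
    """The numeral (m)_Q preceded by enough zeros to reach length n."""
    numeral = base_q_numeral(m, Q)
    if len(numeral) > n:
        raise ValueError("m does not fit into n base-Q digits")
    return [0] * (n - len(numeral)) + numeral


if __name__ == "__main__":
    Q, n = 3, 4
    largest_possible_count = Q**n - Q  # upper bound on gamma(n) from Lemma 9.3
    print(padded_numeral(largest_possible_count, n, Q))
    print(padded_numeral(5, n, Q))
```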

The following result is proved in [127]:

Theorem 9.4 (Chaitin). One has

$$H(0^{n-\sigma(n)} (\gamma(n))_Q) = \Sigma(n) + O(1).$$

Proof. By Lemma 9.3 one has $|(\gamma(n))_Q| \le n$, so $|0^{n-\sigma(n)}(\gamma(n))_Q| = n$. Next we construct a special Chaitin computer $C$. It acts as follows: for $u \in \mathrm{dom}(U_\lambda)$ one computes $U(u, \lambda) = \mathrm{string}(n)$, then a string $v \in \mathrm{dom}(U_\lambda)$ such that $U(v, \lambda) = \mathrm{string}(|n + |u| - m|)$, for some $0 < m \le Q^n - Q$. Next we add a bit $i$ which is 0 in case $n + |u| - m \le 0$, and 1 in the opposite situation. Finally, we compute a string $w$ such that $U(w, u) = (m)_Q$. In this way we get an admissible input for $C$, i.e. $z = uviw$. To compute $C(z, \lambda)$ we enumerate the first $Q^n - m$ strings $x$ of length $n$ with $H(x) < m$ and print out the first remaining string of length $n$. It is plain that $C$ is a Chaitin computer. It is worth mentioning that for some very small $m$, $C(z, \lambda) = \infty$, since very few strings $x \in A^n$ have complexity $H(x) < m$.

Let us now examine the behaviour of $C$ on a special input (the reader is warned that we cannot algorithmically distinguish this special input from an ordinary one; we are only sure that it is an acceptable input). Assume that for some natural $n$ one has

a) the string $u$ is a minimum-size self-delimiting program for $\mathrm{string}(n)$, i.e. $u = (\mathrm{string}(n))^*$; this string is $H(\mathrm{string}(n))$ long,

b) $m = \Sigma(n)$ and $v$ is a minimum-size self-delimiting program for $\mathrm{string}(|n + H(\mathrm{string}(n)) - \Sigma(n)|)$, a string of length $O(1)$ since $\Sigma(n) = n + H(\mathrm{string}(n)) + O(1)$,

c) a bit $i$ which is 0 or 1 according to the validity of the inequality $n + H(\mathrm{string}(n)) \le \Sigma(n)$, and

d) the string $w$ is a minimum-size self-delimiting program for computing the base $Q$ numeral $0^{n-\sigma(n)}(\gamma(n))_Q$ (of length $n$) if we are given a minimum-size self-delimiting program for $\mathrm{string}(n)$; this is a string of length $H(0^{n-\sigma(n)}(\gamma(n))_Q / \mathrm{string}(n))$.

The size of the result of concatenating together these four strings is exactly

$$H(\mathrm{string}(n)) + H(\mathrm{string}(|n + H(\mathrm{string}(n)) - \Sigma(n)|)) + 1 + H(0^{n-\sigma(n)}(\gamma(n))_Q / \mathrm{string}(n))$$
$$= H(\mathrm{string}(n)) + H(0^{n-\sigma(n)}(\gamma(n))_Q / \mathrm{string}(n)) + O(1).$$

The Chaitin computer $C$ on the above input computes a string of complexity $\Sigma(n)$. Indeed, it outputs the first string $x$ of length $n$ with $H(x) \ge \Sigma(n)$. Accordingly, the complexity of the output produced will be, by the Invariance Theorem, at most

$$H(\mathrm{string}(n)) + H(0^{n-\sigma(n)}(\gamma(n))_Q / \mathrm{string}(n)) + O(1).$$

Hence it must be the case that

$$\Sigma(n) \le H(\mathrm{string}(n)) + H(0^{n-\sigma(n)}(\gamma(n))_Q / \mathrm{string}(n)) + O(1).$$

Since

$$\Sigma(n) = n + H(\mathrm{string}(n)) + O(1),$$

it follows that

$$n \le H(0^{n-\sigma(n)}(\gamma(n))_Q / \mathrm{string}(n)) + O(1).$$

Recalling that $|0^{n-\sigma(n)}(\gamma(n))_Q| = n$ one deduces the equality

$$H(0^{n-\sigma(n)}(\gamma(n))_Q / \mathrm{string}(n)) = n + O(1).$$

Since for every string $x$ of length $n$ we have

$$H(x) = H(\mathrm{string}(n)) + H(x / \mathrm{string}(n)) + O(1)$$

it follows that

$$H(0^{n-\sigma(n)}(\gamma(n))_Q) = n + H(\mathrm{string}(n)) + O(1) = \Sigma(n) + O(1).$$

□

Corollary 9.5. For every natural $n$,

$$H((\gamma(n))_Q) = \Sigma(|(\gamma(n))_Q|) + O(1).$$

9.4 Randomness and Chaos

According to Percival [330], p. 11,

Small changes lead to bigger changes later. This behaviour is the signature of chaos. Chaos seems to be everywhere, from the Earth's orbit round the Sun to the beating of the human heart, from the swing of a pendulum to the behaviour of financial markets.

A simple way to get a chaotic behaviour is to use the Baker (doubling)$^{2}$ map $b: [0,1) \to [0,1)$, $b(x) = 2x \pmod 1$. Here "mod 1" means "ignore the integer part". If we choose the infinite binary representation for reals $x \in [0,1)$, then $b$ can be regarded as the (computable) map $b: \{0,1\}^\omega \to \{0,1\}^\omega$,

$$b(x_1 x_2 x_3 \cdots) = x_2 x_3 \cdots.$$

Consider a system in which a state is an element of $\{0,1\}^\omega$ and the evolution is given by Baker's map. We assume that time is discrete. So, starting from each state $x \in \{0,1\}^\omega$ we obtain the trajectory

$$b^0(x) = x,$$
$$b^1(x) = b(x),$$
$$b^2(x) = b(b(x)),$$
$$b^3(x) = b(b(b(x))),$$
$$\vdots$$
$$b^n(x) = b(\ldots b(b(x)) \ldots),$$
$$\vdots$$

$^{2}$For a more general discussion see Moore [315].

Given an initial part of the sequence

$$b^0(x), b^1(x), \ldots, b^{n-1}(x)$$

we would like to compute the next state $b^n(x)$ and if this is not possible, then we would like to compute a "prediction" for $b^n(x)$ which should be a better approximation of the true value than a random coin toss. If the initial state $x$ is "randomly drawn" from $\{0,1\}^\omega$, then according to Corollary 6.32 with probability 1 $x$ is random, hence due to Theorem 6.40 each element of the trajectory $(b^n(x))_n$ is random. So, the behaviour of the system cannot be predicted better than a coin toss, a conclusion argued by Ford in [193]. See also Wolfram [438, 439], White [433], Batterman and White [21], Brudno [50], Calude and Dumitrescu [80], Fouché [194].
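The two views of the Baker map, doubling on $[0,1)$ and shifting on binary expansions, can be compared on exact rationals. The sketch below uses Python's `fractions.Fraction` to avoid floating-point rounding; the function names are hypothetical and the example point is an arbitrary choice.

```python
from fractions import Fraction


def baker(x):
    """Doubling map b(x) = 2x mod 1 on [0, 1), computed exactly."""
    y = 2 * x
    return y - 1 if y >= 1 else y


def binary_digits(x, k):
    """First k binary digits of x in [0, 1)."""
    digits = []
    for _ in range(k):
        x = 2 * x
        bit = 1 if x >= 1 else 0
        digits.append(bit)
        x -= bit
    return digits


if __name__ == "__main__":
    x = Fraction(5, 16)           # 0.0101 in binary
    known = binary_digits(x, 4)   # what an observer knows about x
    for n in range(5):
        # After n steps only the last 4 - n known digits still describe b^n(x).
        print(n, known[n:])
        x = baker(x)
```

Each application of the map discards one known digit, so a finite observation of the initial state says nothing about the state after sufficiently many steps; for a random initial state the undiscarded digits are as unpredictable as coin tosses.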

9.5 Randomness and Cellular Automata

Cellular automata have been introduced by Ulam and von Neumann [424] as models for natural complex systems, especially self-reproducing biological systems. Since then they have been analysed in many other contexts, e.g. for the simulation of physical phenomena, for computability questions (cellular automata form a universal model of computation), for random number generation, in the framework of formal language theory, in symbolic dynamics, and many more. See, for example, Wolfram [437] and other papers in the same volume [438, 439], Toffoli and Margolus [401] and Lind and Marcus [285].

Cellular automata show a uniform behaviour in a certain region of the space. They operate on configurations which consist of a discrete lattice of cells each of which is in one of finitely many states. Time is discrete; at each time step the value of each cell is updated uniformly according to a finite set of rules. The new value of a cell depends only on the current values of finitely many cells in its neighbourhood. Although cellular automata can be described easily by a finite set of rules (the local function) they exhibit a rich and complicated global behaviour which often seems chaotic or random (see Wolfram [438]). In what follows we follow Calude, Hertling, Jürgensen and Weihrauch [83] to give several rigorous mathematical characterizations of random configurations and analyse the behaviour of cellular automata on random and non-random configurations.

We fix an alphabet $\Sigma$ with $Q \ge 2$ elements, and a positive integer $d \ge 1$. Then $\mathbb{Z}^d$ is the $d$-dimensional lattice over the integers $\mathbb{Z}$. The space $\Sigma^{\mathbb{Z}^d}$ is called a full shift space. We call the elements of $\Sigma$ the states, the number $d$ the dimension, and the elements $c \in \Sigma^{\mathbb{Z}^d}$ the configurations of the full shift space. For a configuration $c \in \Sigma^{\mathbb{Z}^d}$ and $a \in \mathbb{Z}^d$ we write $c_a$ instead of $c(a)$; elements of $\mathbb{Z}^d$ will be sometimes called cells and $c_a$ will then be the content of cell $a$. For $r \in \mathbb{N}$, let $[-r, r]$ denote the set $\{-r, \ldots, 0, \ldots, r\}$.

On the spaces $\Sigma^{\mathbb{Z}^d}$ we use the product topology induced by infinitely many copies of the discrete topology on the finite space $\Sigma$. The space $\Sigma^{\mathbb{Z}^d}$ is compact because it is a countable product of compact spaces (Tychonoff's Theorem). This space is in fact a metric space. One can, for example, use the metric dist defined by $\mathrm{dist}(c, c') = 2^{-m(c,c')}$, where

$$m(c, c') = \min\{r \in \mathbb{N} \mid c_a \ne c'_a, \text{ for some } a \in [-r, r]^d\},$$

for $c, c' \in \Sigma^{\mathbb{Z}^d}$; here $\min \emptyset = \infty$. The sets

$$\{c \in \Sigma^{\mathbb{Z}^d} \mid c_z = s\}, \quad s \in \Sigma,\ z \in \mathbb{Z}^d,$$

form a subbase of the topology on $\Sigma^{\mathbb{Z}^d}$.

Cellular automata operate on full shift spaces. The name shift spaces comes from the fact that the shift mappings on the space $\Sigma^{\mathbb{Z}^d}$ play an important role. Each integer vector $a = (a_1, \ldots, a_d) \in \mathbb{Z}^d$ induces a bijection $\sigma_a^{(d)} : \Sigma^{\mathbb{Z}^d} \to \Sigma^{\mathbb{Z}^d}$ defined by $\sigma_a^{(d)}(c)_b = c_{b+a}$, for every $b \in \mathbb{Z}^d$; it is called the shift map associated with $a$. The superscript $(d)$ will be omitted when the dimension is clear from the context. The shift map $\sigma_{e_i}$ associated with the unit vector $e_i = (0, \ldots, 0, 1, 0, \ldots, 0) \in \mathbb{Z}^d$ having a one in position $i$ and zeros in all other positions is also written $\sigma_i$. The shift mapping $\sigma_1$ is the usual left shift in the one-dimensional case.

Next we define a random configuration of a full shift space. First let us look at the simplest case, when the dimension $d$ is equal to 1. The

simplest way to define randomness for two-way infinite sequences over $\Sigma$, that is, for elements of $\Sigma^{\mathbb{Z}}$, is to use a standard computable bijection from $\mathbb{Z}$ to $\mathbb{N}$, e.g. the bijection $\langle\cdots\rangle : \mathbb{Z} \to \mathbb{N}$ defined by

$$\langle z \rangle = \begin{cases} 2z, & \text{if } z \ge 0, \\ 2(-z) - 1, & \text{if } z < 0. \end{cases}$$

This bijection induces a bijection from $\Sigma^\omega$ to $\Sigma^{\mathbb{Z}}$ in an obvious way: one maps an element $x = x_1 x_2 \cdots \in \Sigma^\omega$ to the two-way sequence $q = (q_z)_z \in \Sigma^{\mathbb{Z}}$ defined by $q_z = x_{\langle z \rangle}$, for all $z \in \mathbb{Z}$.
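A minimal sketch of this bijection and of the induced folding of a one-way sequence into a two-way one is given below. Sequences are represented by finite prefixes and windows, positions are counted from 0, and the function names are hypothetical.

```python
def z_to_n(z):
    """The bijection from the integers to the naturals: 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ..."""
    return 2 * z if z >= 0 else 2 * (-z) - 1


def n_to_z(n):
    """Inverse of z_to_n."""
    return n // 2 if n % 2 == 0 else -((n + 1) // 2)


def fold(prefix, m):
    """Window q_{-m} ... q_m of the two-way sequence with q_z = x_{<z>}.

    `prefix` must contain at least 2m + 1 symbols of the one-way sequence x.
    """
    return [prefix[z_to_n(z)] for z in range(-m, m + 1)]


if __name__ == "__main__":
    print([z_to_n(z) for z in range(-3, 4)])
    print("".join(fold("abcde", 2)))
```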

Definition 9.6. A two-way infinite sequence $q \in \Sigma^{\mathbb{Z}}$ is called random if the corresponding one-way infinite sequence $x \in \Sigma^\omega$ is random.

This procedure can also be carried out in the case of any dimension $d \ge 1$. To this end we use a computable bijection from $\mathbb{Z}^d$ onto $\mathbb{N}$. For example, the mapping $\pi : \mathbb{N}^2 \to \mathbb{N}$ defined by $\pi(i, j) = \frac{1}{2}(i + j)(i + j + 1) + i$ is a bijection. For $d \ge 2$ we define $\langle\cdots\rangle : \mathbb{Z}^d \to \mathbb{N}$ recursively by $\langle z_1, \ldots, z_d \rangle = \pi(\langle z_1 \rangle, \langle z_2, \ldots, z_d \rangle)$. This is a computable bijection for each $d \ge 1$.

If $L_1$ and $L_2$ are countable sets, then a total mapping $f : L_1 \to L_2$ induces a mapping $\hat{f} : \Sigma^{L_2} \to \Sigma^{L_1}$ via

$$\hat{f}(p)_h = p_{f(h)},$$

for all $p \in \Sigma^{L_2}$ and $h \in L_1$. If $f$ is a bijection, then also $\hat{f}$ is a bijection. Hence, for each $d \ge 1$, the induced mapping $\widehat{\langle\cdots\rangle} : \Sigma^\omega \to \Sigma^{\mathbb{Z}^d}$ is a bijection. It is clear that it is even a homeomorphism that induces a bijection of the following subbases of the respective topologies: the preimage under $\widehat{\langle\cdots\rangle}$ of the cylinder $\{c \in \Sigma^{\mathbb{Z}^d} \mid c_z = s\} \subset \Sigma^{\mathbb{Z}^d}$, for $s \in \Sigma$ and $z \in \mathbb{Z}^d$, is the cylinder $\{c \in \Sigma^\omega \mid c_{\langle z \rangle} = s\}$, and these sets form a subbase of the product topology on $\Sigma^\omega$.

Furthermore, if we consider the product measure $\bar{\mu}$ on $\Sigma^\omega$ and $\bar{\mu}$ on $\Sigma^{\mathbb{Z}^d}$ of the uniform measure $\mu$ on $\Sigma$, then $\widehat{\langle\cdots\rangle}$ is also measure-preserving, i.e.

$$\bar{\mu}(\widehat{\langle\cdots\rangle}^{-1}(U)) = \bar{\mu}(U),$$

for all open $U \subset \Sigma^{\mathbb{Z}^d}$. Thus, the mapping $\widehat{\langle\cdots\rangle}$ shows that the spaces $\Sigma^\omega$ and $\Sigma^{\mathbb{Z}^d}$ are identical with respect to topology and measure. Using these considerations we can give the following:

Definition 9.7 (Calude-Hertling-Jürgensen-Weihrauch). A two-way sequence $c \in \Sigma^{\mathbb{Z}^d}$ is called random if the one-way infinite sequence $\widehat{\langle\cdots\rangle}^{-1}(c) \in \Sigma^\omega$ is random.

Does the construction above depend upon the bijection $\langle\cdots\rangle : \mathbb{Z}^d \to \mathbb{N}$? Does the choice of the bijection influence the definition? Certainly it does, because the notion of randomness for elements of $\Sigma^\omega$ is not invariant under an arbitrary permutation of its entries.
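The recursive numbering of the cells of $\mathbb{Z}^d$ can be sketched as follows, assuming the pairing function $\pi(i,j) = \frac{1}{2}(i+j)(i+j+1) + i$ and the one-dimensional bijection $0, -1, 1, -2, 2, \ldots \mapsto 0, 1, 2, 3, 4, \ldots$ described in the text. The function names are hypothetical.

```python
import math


def z_to_n(z):
    """One-dimensional numbering of the integers."""
    return 2 * z if z >= 0 else 2 * (-z) - 1


def pair(i, j):
    """Pairing function pi(i, j) = (i + j)(i + j + 1)/2 + i."""
    return (i + j) * (i + j + 1) // 2 + i


def unpair(n):
    """Inverse of pair: recover (i, j) from n."""
    s = (math.isqrt(8 * n + 1) - 1) // 2  # s = i + j, index of the diagonal containing n
    i = n - s * (s + 1) // 2
    return i, s - i


def cell_number(cell):
    """Number <z_1, ..., z_d> of a cell of Z^d, defined recursively on the dimension."""
    if len(cell) == 1:
        return z_to_n(cell[0])
    return pair(z_to_n(cell[0]), cell_number(cell[1:]))


if __name__ == "__main__":
    for cell in [(0, 0), (1, 0), (0, 1), (1, -1), (-2, 3, 1)]:
        print(cell, cell_number(cell))
```

Reading a configuration in the order given by `cell_number` produces the one-way sequence whose randomness defines the randomness of the configuration in Definition 9.7.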

Example 9.8. For every sequence $c \in \Sigma^\omega$, there exists a bijection $\psi : \mathbb{N} \to \mathbb{N}$ such that the sequence $c_{\psi(1)} c_{\psi(2)} \cdots \in \Sigma^\omega$ is not random.

Proof. If the sequence $c_1 c_2 \cdots$ is not random we can take $\psi$ to be the identity. Otherwise we can assume, without loss of generality, that $\Sigma = \{0, 1, \ldots, Q-1\}$, for some $Q \ge 2$. Some element of $\Sigma$ appears in the sequence infinitely many times, say $c_i = 0$, for infinitely many $i$. Let $f : \mathbb{N} \to \mathbb{N}$ be the unique increasing function such that $c_{f(i)}$ is the $(i+1)$st zero in $c_1 c_2 \cdots$, for all $i$. We define $\psi$ by

$$\psi(i) = \begin{cases} f(2j+1), & \text{if } i = f(2j) + 1, \\ f(2j) + 1, & \text{if } i = f(2j+1), \\ i, & \text{if } i \notin \bigcup_{j \in \mathbb{N}} \{f(2j) + 1, f(2j+1)\}. \end{cases}$$

Then the sequence $c_{\psi(1)} c_{\psi(2)} \cdots$ does not contain an isolated zero, hence it does not contain the string 101. Consequently, it is not random by Theorem 6.50. □
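The permutation in this proof can be tried out on a finite prefix. The sketch below counts positions from 0 and pairs consecutive zeros by swapping the cell right after the first zero of each pair with the cell holding the second zero; on a finite prefix with an odd number of zeros the last zero stays unpaired. The function names are hypothetical.

```python
def pairing_permutation(c):
    """Permutation psi of the positions of the finite prefix c, as in the proof."""
    zeros = [i for i, symbol in enumerate(c) if symbol == 0]
    psi = list(range(len(c)))
    for j in range(0, len(zeros) - 1, 2):
        a = zeros[j] + 1      # position right after the first zero of the pair
        b = zeros[j + 1]      # position of the second zero of the pair
        psi[a], psi[b] = psi[b], psi[a]
    return psi


def rearrange(c, psi):
    """The rearranged prefix c_{psi(0)} c_{psi(1)} ..."""
    return [c[psi[i]] for i in range(len(c))]


if __name__ == "__main__":
    c = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1]
    d = rearrange(c, pairing_permutation(c))
    print("".join(map(str, c)))
    print("".join(map(str, d)))
```

Since the swapped positions of different pairs lie in disjoint intervals, the swaps do not interfere, and every zero ends up adjacent to another zero.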

Remark. In view of Exercise 7.8.8, if $\psi : \mathbb{N} \to \mathbb{N}$ is a computable bijection, then a sequence $c_1 c_2 \cdots \in \Sigma^\omega$ is random iff the sequence $c_{\psi(1)} c_{\psi(2)} \cdots \in \Sigma^\omega$ is random. Hence, if we choose a bijection $b : \mathbb{Z}^d \to \mathbb{N}$ such that $\langle\cdots\rangle \circ b^{-1}$ is computable, then we obtain via $b$ the same randomness notion on $\Sigma^{\mathbb{Z}^d}$ as via the bijection $\widehat{\langle\cdots\rangle}$.

There is another more direct way to define randomness on full shift spaces, without reference to random one-way infinite sequences: we will use the Hertling-Weihrauch topological approach discussed in Section 6.6. In order to view the full shift space $\Sigma^{\mathbb{Z}^d}$ as a randomness space $(\Sigma^{\mathbb{Z}^d}, B, \bar{\mu})$ we have to describe the measure $\bar{\mu}$ and the numbering $B$ of a subbase of the topology. The measure $\bar{\mu}$ is given by

$$\bar{\mu}(\{c \in \Sigma^{\mathbb{Z}^d} \mid c_z = s\}) = 1/Q,$$

for $s \in \Sigma$ and $z \in \mathbb{Z}^d$. The numbering $B$ is defined by

$$B_{j + Q \cdot \langle z_1, \ldots, z_d \rangle} = \{c \in \Sigma^{\mathbb{Z}^d} \mid c_{(z_1, \ldots, z_d)} = s_j\},$$

for $1 \le j \le Q$ and $(z_1, \ldots, z_d) \in \mathbb{Z}^d$. If $\langle\cdots\rangle$ is the bijection from $\mathbb{Z}^d$ to $\mathbb{N}$ defined above, then we obtain the same randomness notion as in Definition 9.7. In fact a more general result is true. Before stating and proving it we will give another characterization for computable sequences of open sets in $\Sigma^{\mathbb{Z}^d}$. For an arbitrary finite set $A \subset \mathbb{Z}^d$ and $v \in \Sigma^A$ we set

$$[v] = \{c \in \Sigma^{\mathbb{Z}^d} \mid c_z = v_z, \text{ for all } z \in A\}.$$

The set

$$\mathrm{Cubes}(\Sigma, d) = \bigcup_{r \ge 0} \Sigma^{[-r,r]^d}$$

is countable. The sets $[v]$ for elements $v \in \mathrm{Cubes}(\Sigma, d)$ form a base of the topology on $\Sigma^{\mathbb{Z}^d}$. We define the "length-lexicographical" bijection $\mathrm{Cube} : \mathbb{N} \to \mathrm{Cubes}(\Sigma, d)$ in the following way. For fixed $r \ge 0$ we define an ordering between the cells in $[-r, r]^d$ by $z < \tilde{z}$ if $\langle z \rangle < \langle \tilde{z} \rangle$ for $z, \tilde{z} \in [-r, r]^d$. With respect to this ordering on $[-r, r]^d$ and a fixed ordering on $\Sigma$ we consider the lexicographical ordering on $\Sigma^{[-r,r]^d}$. Finally we define Cube in such a way that first Cube lists all elements in $\Sigma^{[0,0]^d}$ according to their lexicographical order, then all elements in $\Sigma^{[-1,1]^d}$ according to their lexicographical order, then all elements in $\Sigma^{[-2,2]^d}$ according to their lexicographical order, and so on. The following result is easy, but useful.

Lemma 9.9. Consider a sequence $(U_i)_i$ of open subsets of $\Sigma^{\mathbb{Z}^d}$. Then, the following conditions are equivalent:

1. The sequence $(U_i)_i$ is $B'$-computable.

2. The sequence $(U_i)_i$ is Cube-computable.

3. The sequence $(\widehat{\langle\cdots\rangle}^{-1}(U_i))_i$ of open subsets of $\Sigma^{\mathbb{N}}$ is $(\nu(j)\Sigma^{\mathbb{N}})_j$-computable.

Theorem 9.10. Let $d \ge 1$ be a positive integer. For a two-way sequence $c \in \Sigma^{\mathbb{Z}^d}$ the following conditions are equivalent:

1. The infinite one-way sequence $\widehat{\langle\cdots\rangle}^{-1}(c) \in \Sigma^{\mathbb{N}}$ is random.

2. The two-way sequence $c$ is a random element of the randomness space $(\Sigma^{\mathbb{Z}^d}, B, \bar{\mu})$.

Proof. The equivalence follows from Lemma 9.9 and from the fact that the homeomorphism $\widehat{\langle\cdots\rangle} : \Sigma^\omega \to \Sigma^{\mathbb{Z}^d}$ is measure-preserving. □

Remark. If $(U_i)_i$ is a universal Martin-Löf test on $\Sigma^\omega$, then $(\widehat{\langle\cdots\rangle}(U_i))_i$ is a universal Martin-Löf test on $\Sigma^{\mathbb{Z}^d}$.

In the case of dimension $d = 1$ the first of the conditions in Theorem 9.10 says that a two-way infinite sequence $c = \cdots c_{-3} c_{-2} c_{-1} c_0 c_1 c_2 c_3 \cdots \in \Sigma^{\mathbb{Z}}$ is random iff the one-way infinite sequence $c_0 c_{-1} c_1 c_{-2} c_2 c_{-3} c_3 \cdots \in \Sigma^\omega$ is random. This is also equivalent to the following condition:

3. The one-way sequences $c_0, c_1, c_2, \ldots$ and $c_{-1}, c_{-2}, c_{-3}, \ldots$ are random.

Comment. This last condition is often expressed by saying that the sequences $(c_0, c_1, c_2, \ldots)$ and $(c_{-1}, c_{-2}, c_{-3}, \ldots)$ are "independently random".

Next we will use Martin-Löf tests to get more insight into the nature of randomness of two-way sequences. One must distinguish between Martin-Löf tests for two-way infinite sequences and for one-way infinite sequences. Let $(U_i)_i$ be a universal Martin-Löf test on the space $(\Sigma^\omega, B, \bar{\mu})$ of one-way infinite sequences, and let $A \subset \mathbb{N}$ be a c.e. set such that

$$U_n = \bigcup_{i \in \mathbb{N},\ \pi(n,i) \in A} \nu(i)\Sigma^\omega,$$

for all $n$ (here $\nu : \mathbb{N} \to \Sigma^*$ is the standard computable bijection). Let $A_n = \{\nu(i) \mid \pi(n, i) \in A\}$, for all $n$. We assume without loss of generality that all sets $A_n$ are suffix-closed, i.e. if a prefix of a string $w$ is contained in $A_n$, then also $w$ itself is in $A_n$. Then a two-way infinite sequence $c = \cdots c_{-3} c_{-2} c_{-1} c_0 c_1 c_2 c_3 \cdots \in \Sigma^{\mathbb{Z}}$ is non-random iff for each $n \in \mathbb{N}$ there is an $m \in \mathbb{N}$ with $c_0 c_{-1} c_1 c_{-2} c_2 \cdots c_{-m} c_m \in A_n$. But notice that we cannot replace $c_0 c_{-1} c_1 c_{-2} c_2 \cdots c_{-m} c_m$ by $c_{-m} \cdots c_{-1} c_0 c_1 \cdots c_m$:

Proposition 9.11. Every random two-way infinite sequence $c = \cdots c_{-2} c_{-1} c_0 c_1 c_2 \cdots \in \Sigma^{\mathbb{Z}}$ has the property that for every $n \in \mathbb{N}$ there is an $m \in \mathbb{N}$ with $c_{-m} \cdots c_{-1} c_0 c_1 \cdots c_m \in A_n$.

Proof. Let us fix a number $n$ and an arbitrary string $w = w_1 \cdots w_l \in A_n$. For every random sequence $c = \cdots c_{-2} c_{-1} c_0 c_1 c_2 \cdots \in \Sigma^{\mathbb{Z}}$ there exists an $m > l$ such that $c_{-m} \cdots c_{-m+l-1} = w$, hence the string $w$ is a prefix of $c_{-m} \cdots c_{-1} c_0 c_1 \cdots c_m$. Because $A_n$ is assumed to be suffix-closed we conclude that $c_{-m} \cdots c_{-1} c_0 c_1 \cdots c_m \in A_n$. □

Next we observe that the shift mappings preserve randomness.

Proposition 9.12. Let $d \ge 1$ and $a \in \mathbb{Z}^d$ an integer vector. If $c \in \Sigma^{\mathbb{Z}^d}$ is random, then also $\sigma_a(c)$ is random.

Proof. If $(U_i)_i$ is a Martin-Löf test on $\Sigma^{\mathbb{Z}^d}$, then also $((\sigma_a)^{-1}(U_i))_i$ is a Martin-Löf test on $\Sigma^{\mathbb{Z}^d}$, for arbitrary $a \in \mathbb{Z}^d$. Assume that $\sigma_a(c)$ is non-random. Then there is a Martin-Löf test $(U_i)_i$ on $\Sigma^{\mathbb{Z}^d}$ with $\sigma_a(c) \in \bigcap_{i \in \mathbb{N}} U_i$. Then also $c \in \bigcap_{i \in \mathbb{N}} (\sigma_a)^{-1}(U_i)$. We conclude that $c$ is non-random as well. □

Definition 9.13. Two configurations $c^{(1)}, c^{(2)} \in \Sigma^{\mathbb{Z}^d}$ are called equivalent (we write: $c^{(1)} \equiv_{\mathrm{Shift}} c^{(2)}$) if one of them can be obtained by shifting the other one appropriately, i.e. if there exists an integer vector $a \in \mathbb{Z}^d$ with $c^{(2)} = \sigma_a^{(d)}(c^{(1)})$.

This defines an equivalence relation on the space $\Sigma^{\mathbb{Z}^d}$, and often instead of the space $\Sigma^{\mathbb{Z}^d}$ one considers the quotient space $\Sigma^{\mathbb{Z}^d} / \equiv_{\mathrm{Shift}}$ obtained by identifying equivalent configurations. Proposition 9.12 tells us that the randomness notion on $\Sigma^{\mathbb{Z}^d}$ induces a natural randomness notion on this quotient space. Is it also possible to obtain this randomness notion directly by applying the definition of a randomness space to the quotient space? It is interesting that this is not the case, at least not by using the quotient topology on the quotient space. We give the reason for the one-dimensional case. We need first to define a new notion, namely that of the rich two-way sequence, the two-way analogue of the disjunctive one-way sequence.

Definition 9.14. Let $A, B \subset \mathbb{Z}^d$ be two finite sets and $a \in \mathbb{Z}^d$ an integer vector. The sets $A, B$ are called $a$-equivalent if $A = a + B$. Two elements $v \in \Sigma^A$ and $w \in \Sigma^B$ are called equivalent if there exist an integer vector $a$ and two $a$-equivalent finite sets $A, B$ such that $v_{a+b} = w_b$, for all $b \in B$. The equivalence classes of elements of $\Sigma^A$ for finite subsets $A \subset \mathbb{Z}^d$ are called patterns (of dimension $d$ over $\Sigma$). The equivalence classes of elements of $\Sigma^{\{1, 2, \ldots, n\}^d}$ for any positive integer $n$ are called cube patterns. The number $n$ is called the side length of the cube pattern.

Definition 9.15. We say that a pattern, given by a representative $w \in \Sigma^A$ for some finite set $A \subset \mathbb{Z}^d$, occurs in $c \in \Sigma^{\mathbb{Z}^d}$ if there exists an integer vector $b \in \mathbb{Z}^d$ such that $c_{b+a} = w_a$ for all $a \in A$. A two-way sequence $c \in \Sigma^{\mathbb{Z}^d}$ is called rich if every pattern over $\Sigma$ and of dimension $d$ occurs in $c$.

It is clear that a configuration $c$ is rich iff every cube pattern (over $\Sigma$, of dimension $d$) occurs in $c$.
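Occurrence of a pattern can only be tested on a finite window of a configuration, so the sketch below is a finite approximation of Definition 9.15 for dimension d = 2: it checks whether a rectangular pattern occurs in a window and lists the cube patterns of a given side length that are missing from it. The representation by nested lists and the function names are assumptions made for illustration.

```python
from itertools import product


def occurs(pattern, window):
    """True if the rectangular pattern occurs somewhere in the finite window."""
    ph, pw = len(pattern), len(pattern[0])
    wh, ww = len(window), len(window[0])
    for bi in range(wh - ph + 1):
        for bj in range(ww - pw + 1):
            if all(window[bi + i][bj + j] == pattern[i][j]
                   for i in range(ph) for j in range(pw)):
                return True
    return False


def missing_cube_patterns(window, alphabet, n):
    """All n-by-n patterns over the alphabet that do not occur in the window."""
    missing = []
    for cells in product(alphabet, repeat=n * n):
        pattern = [list(cells[i * n:(i + 1) * n]) for i in range(n)]
        if not occurs(pattern, window):
            missing.append(pattern)
    return missing


if __name__ == "__main__":
    window = [[0, 1, 1],
              [1, 0, 0],
              [0, 0, 1]]
    print(occurs([[1, 1], [0, 0]], window))
    print(len(missing_cube_patterns(window, (0, 1), 2)))
```

A rich configuration is one for which, for every side length, the list of missing cube patterns becomes empty once the window is large enough.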

Remark. In contrast to randomness, richness is very fragile even under the computable rearrangement of sequences. Indeed, if a one-way infinite sequence $c = c_0 c_1 c_2 \cdots \in \Sigma^\omega$ is rich, then also the two-way infinite sequence $\widehat{\langle\cdots\rangle}(c) = \cdots c_3 c_1 c_0 c_2 c_4 \cdots \in \Sigma^{\mathbb{Z}}$ is rich, but the converse is not true. To see this let $c = c_0 c_1 c_2 \cdots$ be a one-way rich sequence and define another one-way sequence $\tilde{c}$ by $\tilde{c}_{2i} = c_i$ and $\tilde{c}_{2i+1} = s$ for all $i$, where $s$ is a fixed element of $\Sigma$. Then $\tilde{c}$ is not rich, but the corresponding two-way sequence

$$\widehat{\langle\cdots\rangle}(\tilde{c}) = \cdots s s\, \tilde{c}_0 \tilde{c}_2 \tilde{c}_4 \cdots$$

is rich. By choosing a different bijection from $\mathbb{Z}$ to $\mathbb{N}$ one can achieve the equivalence of the richness notions on $\Sigma^\omega$ and $\Sigma^{\mathbb{Z}}$. It is not difficult to check that a two-way sequence $c = \cdots c_{-2} c_{-1} c_0 c_1 c_2 \cdots$ is rich iff the following one-way sequence is rich:

... C-15 Cll ... C15 ....

Finally, note also that in contrast to randomness, richness is not base invariant.


Lemma 9.16. Every random configuration is rich.

Proof. We fix an arbitrary cube pattern. By a simple counting argument one can easily prove in an effective way that the set of all configurations which do not contain this pattern has measure zero. Therefore all such configurations are non-random. Since this is true for all cube patterns, it follows that all random configurations are rich. □

Remark. In fact, much more is true. One can define in a natural way normal configurations, in which all patterns occur with the expected frequency. In the same way as one proves that every random real number has a normal binary expansion, one can also prove that every random configuration is normal. It is clear that every normal configuration is rich.

We can now come back to the problem of randomness on the quotient space. A base of the quotient topology on $\Sigma^{\mathbb{Z}} / \equiv_{\mathrm{Shift}}$ is given by the sets

$$\{[c]_{\equiv_{\mathrm{Shift}}} \mid c \in \Sigma^{\mathbb{Z}^d} \text{ and } c \text{ contains the string } w\},$$

for arbitrary $w \in \Sigma^*$. But any of these basic open sets contains the $\equiv_{\mathrm{Shift}}$-equivalence classes of all rich sequences! Hence, any open set in the quotient space contains the $\equiv_{\mathrm{Shift}}$-equivalence classes of all rich sequences. In particular, for any sequence $(U_i)_i$ of open subsets $U_i$ of the quotient space, the $\equiv_{\mathrm{Shift}}$-equivalence classes of all rich sequences lie in the intersection $\bigcap_{i \in \mathbb{N}} U_i$. Therefore, any Martin-Löf test on the quotient space would show that these classes are non-random. Hence, the direct approach via Martin-Löf tests cannot give the "most natural" randomness notion on the quotient space $\Sigma^{\mathbb{Z}^d} / \equiv_{\mathrm{Shift}}$.

Cellular automata are continuous functions which operate on a full shift space $\Sigma^{\mathbb{Z}^d}$ and commute with the shift mappings $\sigma_a$, for $a \in \mathbb{Z}^d$.

Definition 9.17. A cellular automaton (in short, CA) is a triple $(\Sigma, d, F)$ consisting of a finite set $\Sigma$ containing at least two elements, called the set of states, a positive integer $d$, called the dimension, and a continuous function

$$F : \Sigma^{\mathbb{Z}^d} \to \Sigma^{\mathbb{Z}^d}$$

which commutes with the shift mappings $\sigma_i$ for $i = 1, \ldots, d$. The function $F$ is called the global map of the CA.

The usual definition of CA involves the so-called local function. Since the space $\Sigma^{\mathbb{Z}^d}$ is a compact metric space any continuous function $F : \Sigma^{\mathbb{Z}^d} \to \Sigma^{\mathbb{Z}^d}$ is uniformly continuous. Hence, if $F$ is continuous and commutes with the shift mappings, then there exist a finite set $A \subset \mathbb{Z}^d$ and a function $f : \Sigma^A \to \Sigma$ such that

$$F(c)_b = f(c_{b+A}),$$

for all $c \in \Sigma^{\mathbb{Z}^d}$ and $b \in \mathbb{Z}^d$, where $c_{b+A} \in \Sigma^A$ is defined in an obvious way: $(c_{b+A})_a = c_{b+a}$, for all $a \in A$. The function $f$ is called a local function for $F$, and $F$ is said to be induced by $f$.

Obviously, one could choose $A$ to be the $d$-dimensional cube $[-r, r]^d$, for some sufficiently large $r$. On the other hand it is clear that any function $F$ induced by a local function $f$ is the global map of a CA. Whenever we consider a local function for some CA we will assume that there is a natural number $r$ such that $f$ maps $\Sigma^{[-r,r]^d}$ to $\Sigma$. The number $r$ will be called the radius of $f$. Let $f : \Sigma^{[-r,r]^d} \to \Sigma$ be a local function with radius $r$. It induces a function $f^*$ mapping any $v \in \Sigma^{[-k,k]^d}$ for arbitrary $k \ge 2r + 1$ to an element $f^*(v) \in \Sigma^{[-k+r,k-r]^d}$ in an obvious way. This function induces a mapping $f^{\mathrm{pattern}}$ which maps any cube pattern of side length $k$ for any $k \ge 2r + 1$ to a cube pattern of side length $k - 2r$ in an obvious way.

Example 9.18. Let us consider the dimension $d = 2$ and a local function $f : \Sigma^{[-r,r]^2} \to \Sigma$ with radius $r$. We take a square pattern $P$ with $k \cdot k$ cells, for some $k \ge 2r + 1$. For simplicity let us assume that the indices of the cells are running from 1 to $k$, in both dimensions. We define the image pattern $Q$, which is a square pattern with $(k - 2r) \cdot (k - 2r)$ cells, in the following way. The indices of the cells of the image pattern $Q$ are running from $1 + r$ to $k - r$, in both dimensions. For any index $(i, j)$ of a cell in the image pattern $Q$, hence, with $1 + r \le i \le k - r$ and $1 + r \le j \le k - r$, the value of the cell with index $(i, j)$ in the image pattern $Q$ is defined to be the value of the local function $f$, applied to the square subpattern of $P$ with side length $2r + 1$ and centre $(i, j)$, hence, to the square subpattern of $P$ with the cells running from $i - r$ to $i + r$ in the first dimension and from $j - r$ to $j + r$ in the second dimension.
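The construction of the image pattern translates directly into code. In the sketch below patterns are lists of rows indexed from 0 (so the image cells run from r to k - r - 1), and the local function used in the demonstration, the parity of the neighbourhood, is a hypothetical choice made only for illustration.

```python
def image_pattern(P, f, r):
    """Apply the local function f of radius r to the k-by-k pattern P.

    The result has (k - 2r) rows and columns; each cell is f applied to the
    (2r + 1)-by-(2r + 1) subpattern of P centred at the corresponding cell.
    """
    k = len(P)
    if k < 2 * r + 1 or any(len(row) != k for row in P):
        raise ValueError("P must be a square pattern of side length at least 2r + 1")
    return [[f([row[j - r:j + r + 1] for row in P[i - r:i + r + 1]])
             for j in range(r, k - r)]
            for i in range(r, k - r)]


def parity(neighbourhood):
    """Hypothetical local function: sum of all cells in the neighbourhood modulo 2."""
    return sum(sum(row) for row in neighbourhood) % 2


if __name__ == "__main__":
    P = [[1, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 1]]
    for row in image_pattern(P, parity, 1):
        print(row)
```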

Definition 9.19. A CA $(\Sigma, d, F)$ is finitely injective if for all configurations $c^{(1)}, c^{(2)} \in \Sigma^{\mathbb{Z}^d}$ with $c^{(1)} \ne c^{(2)}$ and $c^{(1)}_a = c^{(2)}_a$, for almost all $a \in \mathbb{Z}^d$, we have $F(c^{(1)}) \ne F(c^{(2)})$.

Definition 9.20. A continuous function $F : \Sigma^{\mathbb{Z}^d} \to \Sigma^{\mathbb{Z}^d}$ is measure-preserving if $\bar{\mu}(F^{-1}(U)) = \bar{\mu}(U)$, for all open $U \subset \Sigma^{\mathbb{Z}^d}$.

Theorem 9.21 (Moore-Maruoka-Kimura-Calude-Hertling-Jürgensen-Weihrauch). Let $(\Sigma, d, F)$ be a CA, and $f : \Sigma^{[-r,r]^d} \to \Sigma$ be a local function inducing $F$. The following conditions are equivalent:

1. $F$ is surjective.

2. For every finite pattern $w$ there exists a configuration $c$ such that $w$ occurs in $F(c)$.

3. $F$ is finitely injective.

4. For every $n \ge 2r + 1$ and every cube pattern $w$ of side length $n$ we have

$$\#((f^{\mathrm{pattern}})^{-1}\{w\}) = Q^{(n+2r)^d - n^d}. \qquad (9.1)$$

5. $F$ is measure-preserving.

6. For all configurations $c$, if $c$ is rich, then also $F(c)$ is a rich configuration.

7. For all configurations $c$, if $c$ is random, then also $F(c)$ is a random configuration.
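For one-dimensional binary automata of radius 1 the balance condition in item 4 can be tested by brute force. The sketch below reads (9.1) as saying that every pattern of side length n has exactly $Q^{(n+2r)^d - n^d}$ preimages under $f^{\mathrm{pattern}}$ (here $2^2 = 4$), a reading that is an assumption consistent with the bound $M > Q^{k^d - n^d}$ used in the proof. Wolfram's rule numbering and the function names are illustrative choices, not taken from the text.

```python
from itertools import product


def rule_table(rule_number):
    """Local function of the elementary CA with the given Wolfram rule number."""
    return {(a, b, c): (rule_number >> (4 * a + 2 * b + c)) & 1
            for a, b, c in product((0, 1), repeat=3)}


def preimage_counts(f, n):
    """Number of length-(n + 2) patterns mapped onto each length-n pattern."""
    counts = {w: 0 for w in product((0, 1), repeat=n)}
    for v in product((0, 1), repeat=n + 2):
        w = tuple(f[v[i:i + 3]] for i in range(n))
        counts[w] += 1
    return counts


def is_balanced(f, n):
    """True if every length-n pattern has exactly Q^((n+2r)^d - n^d) = 4 preimages."""
    return all(count == 4 for count in preimage_counts(f, n).values())


if __name__ == "__main__":
    for rule in (90, 110):
        print(rule, [is_balanced(rule_table(rule), n) for n in (3, 4, 5)])
```

Rule 90 (each cell becomes the sum modulo 2 of its two neighbours) is surjective and passes the test for every n tried, while rule 110 already fails it, in line with the equivalence of items 1 and 4.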

Proof. The implication 1 =} 2 is trivial.

For 2 =} 1 let c E L;Zd be an arbitrary configuration. By 2, for each n there exists a configuration c(n) such that

The sequence (c(n))n has an accumulation point c in the compact space L;Zd. By continuity of F we conclude that F(c) = c. For 4 =} 2 it is sufficient to deduce that for every cube pattern w there exists a configuration c such that w occurs in F(c). This is the case iff #( (jpattern) -1 {w}) 2:: 1. Therefore, 2 follows immediately from 4. For the implication 2 =} 33 we assume that 3 is not true. Let c(l), c(2) E L;Zd be two different configurations with C~l) = c~2) , for almost all a E Zd, and with F(c(l)) = F(c(2)). Let 3A

strengthening of the Garden of Eden Theorem [316].

9. Applications

378

and $k = 4r + 2l + 1$, where $|a| = \max\{|a_1|, \ldots, |a_d|\}$ for $a = (a_1, \ldots, a_d) \in \mathbb{Z}^d$.

We introduce an equivalence relation between cube patterns of side length $k$ by calling two cube patterns $v$ and $w$ of side length $k$ interchangeable if they are equal to each other or if each of them is equal to the pattern represented by $c^{(1)}_{[-2r-l,\,2r+l]^d}$ or to the pattern represented by $c^{(2)}_{[-2r-l,\,2r+l]^d}$. Obviously, if $v$ and $w$ are interchangeable, then $f^{\mathrm{pattern}}(v)$ and $f^{\mathrm{pattern}}(w)$ are equivalent. Let us fix a positive integer $i$ and extend this relation to cube patterns of side length $ik$ in the following way. Each cube pattern of side length $ik$ can be viewed as consisting of $i^d$ non-overlapping cube patterns of side length $k$. Two cube patterns $v$ and $w$ of side length $ik$ are called interchangeable if each of these $i^d$ cube subpatterns of $v$ of side length $k$ is interchangeable with the cube subpattern of $w$ of side length $k$ at the corresponding position. Since the outer $2r$ layers of any two interchangeable cube patterns of side length $k$ are identical (this is especially true for the two cube patterns represented by $c^{(1)}_{[-2r-l,\,2r+l]^d}$ and by $c^{(2)}_{[-2r-l,\,2r+l]^d}$), we conclude that

$$f^{\mathrm{pattern}}(v) = f^{\mathrm{pattern}}(w),$$

for any two interchangeable cube patterns of side length $ik$. With respect to the "interchangeable" equivalence relation the set of all cube patterns of side length $ik$ splits into exactly $(Q^{k^d} - 1)^{i^d}$ equivalence classes.⁴ Hence, the set $f^{\mathrm{pattern}}$(cube patterns of side length $ik$) contains at most $(Q^{k^d} - 1)^{i^d}$ cube patterns. They have side length $ik - 2r$, of course. But there are altogether $Q^{(ik-2r)^d}$ cube patterns of side length $ik - 2r$. We claim that for sufficiently large $i$

$$(Q^{k^d} - 1)^{i^d} < Q^{(ik-2r)^d}. \qquad (9.2)$$

In order to prove the claim we choose $i$ so large that

$$k^d - \left(k - \frac{2r}{i}\right)^d < \log_Q \frac{Q^{k^d}}{Q^{k^d} - 1}.$$

Raising $Q$ to these powers and rearranging gives

$$Q^{k^d} - 1 < Q^{(k - 2r/i)^d},$$

⁴ Recall that $\Sigma$ has $Q$ elements.

9.5 Randomness and Cellular Automata


and raising both sides to the power $i^d$ finally gives (9.2). We can now finish the argument. According to (9.2), for sufficiently large $i$ there exists a cube pattern of side length $ik - 2r$ which is not in the set $f^{\mathrm{pattern}}$(cube patterns of side length $ik$). This cube pattern cannot occur in $F(c)$, for any configuration $c$, a contradiction.

For the implication $3 \Rightarrow 4$⁵ we assume that 4 is not true. If there exists a cube pattern $w$ of side length $n$ such that equation (9.1) is not true then there must be a pattern $v$ of side length $n$ such that

$$\#((f^{\mathrm{pattern}})^{-1}\{v\}) > Q^{(n+2r)^d - n^d}. \qquad (9.3)$$

We set $M = \#((f^{\mathrm{pattern}})^{-1}\{v\})$ and $k = n + 2r$. Let us fix a state $s \in \Sigma$ and let $\mathbf{r} = (r, r, \ldots, r) \in \mathbb{Z}^d$ be the integer vector with constant value $r$. We fix a positive integer $i$ and consider the set $S$ of all configurations $c \in \Sigma^{\mathbb{Z}^d}$ such that each of the $i^d$ cube patterns represented by

$$c_{\mathbf{r} + ka + \{1, \ldots, k\}^d},$$

for some $a \in \{0, \ldots, i-1\}^d$, is one of the patterns in $(f^{\mathrm{pattern}})^{-1}\{v\}$, and such that $c_b = s$, for all $b \in \mathbb{Z}^d \setminus \{r+1, \ldots, r+ik\}^d$. There are exactly $M^{i^d}$ such configurations, i.e. $\#(S) = M^{i^d}$. The images $F(c^{(1)})$ and $F(c^{(2)})$ of any two configurations $c^{(1)} \in S$ and $c^{(2)} \in S$ are identical outside the cube $\{1, \ldots, 2r+ik\}^d$, i.e.

$$F(c^{(1)})_a = F(c^{(2)})_a,$$

for all $a \in \mathbb{Z}^d \setminus \{1, \ldots, 2r+ik\}^d$. Furthermore the $i^d$ cube subpatterns

$$F(c^{(1)})_{2\mathbf{r} + ka + \{1, \ldots, n\}^d},$$

for $a \in \{0, \ldots, i-1\}^d$, are all equal to $v$. Hence, the set $F(S)$ contains at most $Q^{(2r+ik)^d - i^d n^d}$ configurations. We claim that for sufficiently large $i$

$$Q^{(2r+ik)^d - i^d n^d} < M^{i^d}. \qquad (9.4)$$

In order to prove the claim we choose $i$ so large that

$$\left(k + \frac{2r}{i}\right)^d - k^d < \log_Q \frac{M}{Q^{k^d - n^d}}$$

⁵ A strengthening of a result by Maruoka and Kimura [306].


(remember $M > Q^{k^d - n^d}$). Raising $Q$ to these powers and rearranging we obtain

$$Q^{k^d - n^d} \cdot Q^{(k + 2r/i)^d - k^d} = Q^{(k + 2r/i)^d - n^d} < M,$$

and raising both sides to the power $i^d$ finally gives (9.4). We now finish the argument. According to (9.4), for sufficiently large $i$ there exist two different configurations $c^{(1)}$ and $c^{(2)}$ with $c^{(1)}_a = s = c^{(2)}_a$, for all $a \in \mathbb{Z}^d \setminus \{r+1, \ldots, r+ik\}^d$ and $F(c^{(1)}) = F(c^{(2)})$. This shows that $F$ is not finitely injective, a contradiction.

We now prove $4 \Leftrightarrow 5$. For a vector $a \in \mathbb{Z}^d$, a positive number $n$, and a cube pattern $w$ of side length $n$, the set

$$C_{a,w} = \{c \in \Sigma^{\mathbb{Z}^d} \mid c_{a + \{1, \ldots, n\}^d} \text{ is a representative for } w\}$$

has measure $1/Q^{n^d}$, and its pre-image

$$F^{-1}(C_{a,w}) = \{c \in \Sigma^{\mathbb{Z}^d} \mid f^*(c_{-\mathbf{r} + a + \{1, \ldots, n+2r\}^d}) \text{ is a representative for } w\}$$

has measure

$$\#((f^{\mathrm{pattern}})^{-1}\{w\}) / Q^{(n+2r)^d}.$$

Therefore, if $F$ is measure-preserving, then 4 is true. On the other hand, if 4 is true, then each set $C_{a,w}$ has the same measure as its pre-image $F^{-1}(C_{a,w})$. Since every open set can be written as the disjoint union of sets $C_{a,w}$, we conclude that 4 implies 5.

The implication $2 \Leftrightarrow 6$ is trivial; the implication $7 \Rightarrow 2$ follows by Lemma 9.16.

Finally, for $5 \Rightarrow 7$ we assume that $c$ is a configuration such that $F(c)$ is non-random. Then there is a Martin-Löf test $(U_i)_i$ such that

$$F(c) \in \bigcap_{i \in \mathbb{N}} U_i.$$

The sequence of open sets $(F^{-1}(U_i))_i$ is also a Martin-Löf test. In view of 5 we have

$$\mu(F^{-1}(U_i)) = \mu(U_i) \leq 2^{-i}.$$

The facts that $F$ is induced by a local function $f$ and that the sequence $(U_i)_i$ of open sets is B'-computable, imply that the sequence $(F^{-1}(U_i))_i$ of open sets is B'-computable. We have

$$c \in \bigcap_{i \in \mathbb{N}} F^{-1}(U_i).$$

Hence, $c$ is non-random. □
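The role of the pattern map $f^{\mathrm{pattern}}$ in the proof above can be illustrated in the one-dimensional case $d = 1$. The following is a minimal sketch, not the construction used in the text: the function names and the two radius-1 example rules (an XOR rule and an AND rule over the alphabet $\{0, 1\}$) are assumptions chosen for the illustration. For each pattern $w$ of length $n$ it counts the patterns of length $n + 2r$ that are mapped to $w$.

```python
from collections import Counter
from itertools import product


def f_pattern(local_rule, r, pattern):
    """Apply a radius-r local rule to a finite 1-D pattern.

    The image is 2r cells shorter than the input pattern.
    """
    width = 2 * r + 1
    return tuple(local_rule(pattern[j:j + width])
                 for j in range(len(pattern) - 2 * r))


def preimage_counts(local_rule, r, alphabet, n):
    """For each pattern w of length n, count the patterns of length n + 2r mapped to w."""
    counts = Counter({w: 0 for w in product(alphabet, repeat=n)})
    for p in product(alphabet, repeat=n + 2 * r):
        counts[f_pattern(local_rule, r, p)] += 1
    return counts


def xor_rule(window):
    # Hypothetical example rule: XOR of the two outer cells.
    return window[0] ^ window[2]


def and_rule(window):
    # Hypothetical example rule: AND of all three cells.
    return window[0] & window[1] & window[2]


xor_counts = preimage_counts(xor_rule, 1, (0, 1), 3)
and_counts = preimage_counts(and_rule, 1, (0, 1), 3)
```

Under these assumptions, every pattern of length 3 has the same number of pre-images under the XOR rule, whereas under the AND rule the pattern 101 has no pre-image at all and therefore cannot occur in any image $F(c)$.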


First, from Condition 2 in Theorem 9.21 it follows that if $F$ is not surjective, then there is no configuration $c$ such that $F(c)$ is rich or random. Hence, a non-surjective CA "destroys" both richness and randomness. Secondly, we ask what happens when one applies CA to a non-random configuration or to a non-rich configuration.⁶

Example 9.22. The function $F : \Sigma^{\mathbb{Z}} \to \Sigma^{\mathbb{Z}}$ defined by

$$F(c)_i = c_{2i}$$

is computable and measure-preserving. If all odd entries $c_{2i+1}$ are equal to one fixed element $s \in \Sigma$, then the sequence

$$(c_i)_i$$

is non-random. But its image under $F$, the sequence

$$(c_{2i})_i,$$

can still be random.

It is not clear a priori whether the same phenomenon can occur when one considers CA. One-dimensional CA preserve non-randomness, i.e. they transform non-random two-way infinite sequences into non-random two-way infinite sequences.

Theorem 9.23 (Calude-Hertling-Jürgensen-Weihrauch). Let $(\Sigma, d, F)$ be a CA.

1. If a configuration $c \in \Sigma^{\mathbb{Z}^d}$ is not rich, then also $F(c)$ is not rich.

2. If $d = 1$ and a configuration $c \in \Sigma^{\mathbb{Z}^d}$ is non-random, then also $F(c)$ is non-random.

Proof. Let $f : \Sigma^{[-r,r]^d} \to \Sigma$ be a local function inducing $F$.

1. Let us fix a non-rich configuration $c$ and a cube pattern of side length $k$ which does not occur in $c$. Hence, at most $Q^{k^d} - 1$ cube patterns of side length $k$ can occur in $c$. Let us consider cube patterns of side length $ik$, for an arbitrary positive integer $i$. Since cube patterns of side length $ik$ can be viewed as consisting of $i^d$ non-overlapping cube patterns of side length $k$, we conclude that at most $(Q^{k^d} - 1)^{i^d}$ different cube patterns of side length $ik$ can occur in $c$. Let $P_{ik}$ denote the set of all cube patterns of side length $ik$ which occur in $c$. We have just proved

$$\#(P_{ik}) \leq (Q^{k^d} - 1)^{i^d}.$$

Hence, also the set $f^{\mathrm{pattern}}(P_{ik})$ contains at most $(Q^{k^d} - 1)^{i^d}$ different cube patterns. These cube patterns have side length $ik - 2r$, of course. But there are altogether $Q^{(ik-2r)^d}$ cube patterns of side length $ik - 2r$. By exactly the same counting argument as in the proof of the implication $2 \Rightarrow 3$ of Theorem 9.21 we conclude that for sufficiently large $i$ there exists a cube pattern of side length $ik - 2r$ which is not in the set $f^{\mathrm{pattern}}(P_{ik})$. This cube pattern cannot occur in $F(c)$. Hence, $F(c)$ is not rich.

⁶ We have seen examples of very simple computable functions on the space of one-way infinite sequences which transform some non-random sequences into random ones.
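The counting argument compares at most $(Q^{k^d} - 1)^{i^d}$ image patterns with the $Q^{(ik-2r)^d}$ patterns of side length $ik - 2r$ that exist in total. As a rough numerical illustration, the sketch below searches for the first $i$ at which the first quantity drops below the second. The function name `smallest_i` and the sample parameters are assumptions made for this example only.

```python
def smallest_i(Q, k, r, d=1):
    """Smallest i with (Q**(k**d) - 1)**(i**d) < Q**((i*k - 2*r)**d).

    The search starts at the first i for which i*k - 2*r is positive,
    so that all exponents are non-negative integers.
    """
    i = 2 * r // k + 1
    while not (Q ** (k ** d) - 1) ** (i ** d) < Q ** ((i * k - 2 * r) ** d):
        i += 1
    return i


# Binary alphabet, one dimension, radius 1, one missing pattern of length 3.
first_i = smallest_i(Q=2, k=3, r=1)
```

With these sample values the inequality reads $7^i < 2^{3i-2}$, and it first holds at $i = 11$ ($7^{11} = 1977326743 < 2^{31} = 2147483648$, while $7^{10} > 2^{28}$).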

2. For the second assertion we assume that the dimension $d$ of the cellular automaton is 1. We fix a non-random configuration $c$ and a Martin-Löf test $(U_i)_i$ on $\Sigma^{\mathbb{Z}^d}$ such that $c \in \bigcap_{i \in \mathbb{N}} U_i$. We show that there is a Martin-Löf test $(V_i)_i$ on $\Sigma^{\mathbb{Z}^d}$ such that $F(c) \in \bigcap_{i \in \mathbb{N}} V_i$. By Lemma 9.9 and a compactness argument we deduce from the fact that the sequence $(U_i)_i$ of open sets is B'-computable that the set

$$\{\pi(i,j) \in \mathbb{N} \mid [\mathrm{Cube}(j)] \subset U_i\} \qquad (9.5)$$

is c.e. We set $l = \lceil \log_2(Q^{2r}) \rceil$, and define

$$V_i = \bigcup \{[f^*(v)] \mid v \in \mathrm{Cubes}(\Sigma, 1),\ \text{side length}(v) \geq 2r + 1,\ [v] \subset U_{l+i}\}.$$

We claim that the sequence $(V_i)_i$ is a Martin-Löf test with $F(c) \in \bigcap_{i \in \mathbb{N}} V_i$. It is clear that it is a sequence of open sets which is B'-computable (we use the fact that the set in (9.5) is c.e. and Lemma 9.9). For arbitrary $i$ we have $c \in U_{l+i}$. Hence, there is an element $v \in \mathrm{Cubes}(\Sigma, 1)$ of side length $\geq 2r + 1$ with $c \in [v]$ and $[v] \subset U_{l+i}$. This shows $F(c) \in V_i$. Finally we have to show that $\mu(V_i) \leq 2^{-i}$, for all $i$. We fix an $i$. There exists a set

$$W \subset \{v \in \mathrm{Cubes}(\Sigma, d) \mid \text{side length}(v) \geq 2r + 1,\ [v] \subset U_{l+i}\}$$

such that

$$\bigcup_{v \in W} [v] = U_{l+i}$$

and for any two $v, w \in W$, the sets $[v]$ and $[w]$ are disjoint. If $v, w \in \mathrm{Cubes}(\Sigma, d)$ and $[v] \subset [w]$, then also $[f^*(v)] \subset [f^*(w)]$.

Hence,

$$V_i = \bigcup_{v \in W} [f^*(v)].$$

Since for arbitrary $v \in \mathrm{Cubes}(\Sigma, 1)$ with side length$(v) \geq 2r + 1$ we have

$$\mu([f^*(v)]) = Q^{2r} \cdot \mu([v]),$$

we obtain

$$\mu(V_i) = \mu\Big(\bigcup_{v \in W} [f^*(v)]\Big) \leq \sum_{v \in W} \mu([f^*(v)]) = \sum_{v \in W} Q^{2r} \cdot \mu([v]) = Q^{2r} \cdot \mu(U_{l+i}) \leq Q^{2r} \cdot 2^{-l-i} \leq 2^{-i}.$$

Consequently, $(V_i)_i$ is a Martin-Löf test with

$$F(c) \in \bigcap_{i \in \mathbb{N}} V_i,$$

hence, $F(c)$ is non-random. □

9.6 Random Sequences of Reals and Riemann's Zeta-function

In this section we will answer the following question, formulated in the first edition of this book: "Do the Zeros of Riemann's zeta-function form a random sequence?" It is a well-known fact that the real parts of the non-trivial zeros s of the Riemann zeta-function are close to 1/2, so they form a highly organized set. In fact a large proportion of them lies on the line Re(s) = 1/2. The imaginary parts Im( s) of the same zeros s are far from displaying any order and here is the argument.

Let us note that the zeta-function has infinitely many, but only countably many, zeros. The non-trivial zeros lie in the strip $0 < \mathrm{Re}(s) < 1$ and are symmetric with respect to the real axis. Hence it is sufficient to consider the zeros with positive imaginary part. Since they do not have an accumulation point in the complex plane, we can order them to a sequence $s_k = \mathrm{Re}(s_k) + i\,\mathrm{Im}(s_k)$, $k \in \mathbb{N} = \{0, 1, 2, \ldots\}$, by the size of the imaginary part.⁷ The Rademacher-Hlawka Theorem (cf. [215, 237]) states that for every real number $t \neq 0$, the fractional parts $f_k(t)$ of the sequence

$$t \cdot \mathrm{Im}(s_k), \quad k \in \mathbb{N},$$

form a uniformly distributed sequence modulo 1 in the sense that for every subinterval $[a, b] \subset [0, 1]$, the portion of points $f_0(t), \ldots, f_N(t)$ that belong to this subinterval tends to $b - a$ as $N \to \infty$. In other words, the probability of $f_k(t)$ lying within an interval coincides with the measure of the interval. There is a close relation between uniform distribution and Borel normality (see Kuipers and Niederreiter [268]): the number $x$ is normal to the base $b$ iff the sequence $(b^n x)_{n \geq 0}$ is uniformly distributed. Finally, the Riemann zeta-function can be written in two different ways as an infinite product, specifically, using Euler's formula

$$\zeta(s) = \Big(\prod_{p\ \mathrm{prime}} (1 - p^{-s})\Big)^{-1},$$

and the Riemann-Hadamard formula

$$\zeta(s) = f(s) \cdot \prod_{n=1}^{\infty} (1 - s\rho_n^{-1}),$$

where $\rho_n$ are the zeros of the Riemann zeta-function and $f(s)$ is a relatively simple fudge factor. With the identification "zeros of Riemann's zeta-function = energy levels" and "logarithms of primes = lengths of periodic orbits" M. Berry and other physicists (see Cipra [142]) have been able to use the Riemann zeta-function as a simple model for quantum mechanics chaos (to test ideas about how to bridge the apparently incompatible chaotic and quantum mechanical descriptions of the microscopic world).

⁷ Zeros with the same imaginary part - if any - are ordered by their real parts and multiple zeros are listed according to their multiplicities.


We will now examine, following Calude, Hertling and Khoussainov [84], the nature of the question stated in the beginning of this section. First, one of the "indirect" goals of the question was to ask for a model of the notion of a random sequence of reals. This part of the question is methodological since approaches can be quite different and inequivalent. It is desirable that any definition implies that any random sequence of reals is uniformly distributed. Secondly, suppose that "we have" a definition of a random sequence of reals. Consider the set

of imaginary parts of all non-trivial zeros (with positive imaginary part) of the Riemann zeta-function. An important point about this notation is that it does not specify how the sequence $\mathrm{Im}(s_k)$ was being constructed. Therefore, in order to answer the original question one has to specify how the imaginary parts of the zeros are being (effectively) enumerated. As we have seen, the Rademacher-Hlawka Theorem tells us that with respect to a specific, natural enumeration, the sequence of imaginary parts of the zeros of the Riemann zeta-function is uniformly distributed modulo 1. Two questions naturally arise: a) Is this sequence random? b) Is the sequence defined by another enumeration of the zeros random? Note that neither randomness nor uniform distribution is invariant with respect to arbitrary enumerations. Thus, to answer the original question we need to develop an appropriate theory of randomness for sequences of reals, and, according to it, to decide whether a sequence of imaginary parts of the zeros of the Riemann zeta-function is or is not random.

We will introduce randomness for real numbers via representations of real numbers with respect to some natural base. We fix a base $b \geq 2$, set $\Sigma_b = \{0, 1, \ldots, b-1\}$, and consider the representation of reals in the unit interval $\nu_b : \Sigma_b^{\omega} \to [0, 1]$ in (7.5). For sequences of real numbers we can proceed in the same way as for reals by "merging" the digits of the expansions of the fractional parts of the real numbers in a computable way.

Definition 9.24 (Calude-Hertling-Khoussainov). Let $f : \mathbb{N}^2 \to \mathbb{N}$ be a computable bijection and let $b \geq 2$ be an integer. A sequence $(a_n)_n$ of real numbers $a_n$ is called $f$-random to base $b$ if there exists a random sequence $q = q_1 q_2 \ldots \in \Sigma_b^{\omega}$ such that $\nu_b(q_{f(1,n)} q_{f(2,n)} \ldots)$ is the fractional part of $a_n$, for all $n \in \mathbb{N}$.
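The "merging" of digit expansions in Definition 9.24 can be sketched on finite prefixes. The example below is only an illustration under stated assumptions: it uses the Cantor pairing function on non-negative integers as the computable bijection and 0-based digit positions, neither of which is fixed by the text, and the names `pair`, `merge` and `extract` are hypothetical.

```python
def pair(j, n):
    """Cantor pairing: a computable bijection from pairs of non-negative integers to non-negative integers (assumed choice)."""
    return (j + n) * (j + n + 1) // 2 + n


def merge(digit_rows, length):
    """Prefix of the merged sequence q, where q[pair(j, n)] is digit j of the n-th real.

    Positions not determined by the supplied finite rows are left as None.
    """
    q = [None] * length
    for n, row in enumerate(digit_rows):
        for j, digit in enumerate(row):
            index = pair(j, n)
            if index < length:
                q[index] = digit
    return q


def extract(q, n, count):
    """Recover the first `count` digits of the n-th real from the merged prefix q."""
    return [q[pair(j, n)] for j in range(count)]


rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # first digits of three reals, base 10
q = merge(rows, 6)
```

Because the pairing is a bijection, every digit of every real occupies exactly one position of the merged sequence, so each real can be read back from it.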


Note that this definition leads to the same randomness notion for all computable bijections $f$.

Lemma 9.25. Let $f : \mathbb{N}^2 \to \mathbb{N}$ be an arbitrary computable bijection and $b \geq 2$ be an arbitrary base. Then any sequence $(a_n)$ of real numbers is $f$-random to base $b$ iff it is $\pi$-random to base $b$.

Proof. We can assume that all numbers $a_n$ lie in the interval $[0, 1)$. The bijection $f \circ \pi^{-1}$ is computable. We fix a sequence $q \in \Sigma_b^{\omega}$. The sequence $q = q_1 q_2 \ldots$ is random iff the sequence $p = q_{f \circ \pi^{-1}(1)} q_{f \circ \pi^{-1}(2)} \ldots q_{f \circ \pi^{-1}(n)} \ldots$ is random (see Lemma 9.27 below). Furthermore, $q_{f(i,n)} = p_{\pi(i,n)}$, for all $i, n$, hence $a_n = \nu_b(q_{f(1,n)} q_{f(2,n)} \ldots)$ iff $a_n = \nu_b(p_{\pi(1,n)} p_{\pi(2,n)} \ldots)$, for all $n$. This proves the assertion. □

Lemma 9.25 justifies the following definition.

Definition 9.26. Let $b \geq 2$ be an integer. A sequence $(a_n)$ of real numbers $a_n$ is called random to base $b$ if there exists a random sequence $q = q_1 q_2 \ldots \in \Sigma_b^{\omega}$ such that $a_n = \nu_b(q_{\pi(1,n)} q_{\pi(2,n)} \ldots)$.

This notion of randomness has natural properties, as Proposition 9.28 and Theorem 9.29 show. We will use the following simple fact (see, for example, Lemma 3.4 in Book, Lutz and Wagner [39]).

Lemma 9.27. Let $f : \mathbb{N} \to \mathbb{N}$ be a computable one-to-one function. If $\sigma_1 \sigma_2 \ldots \in \Sigma^{\omega}$ is a random sequence, then the sequence $\sigma_{f(1)} \sigma_{f(2)} \ldots$ is random as well.

Proof. Consider the computable function $m : \mathbb{N} \to \mathbb{N}$ defined for every $i > 0$ by $m(i) = \max\{n \mid f(n) \leq i\}$. The function $F : \Sigma^* \to \Sigma^*$ defined by $F(x_1 x_2 \ldots x_n) = x_{f(1)} x_{f(2)} \ldots x_{f(m(n))}$ is computable and prefix-increasing (if $x$

Proposition 9.28. If a sequence of reals contains a non-random real, then the sequence itself is non-random to any base $b$.

Proof. Let $(a_n)$ be a sequence of reals which is random to some base $b \geq 2$. We can assume that $a_n \in [0, 1)$, for all $n$. There is a random sequence $q \in \Sigma_b^{\omega}$ such that $a_n = \nu_b(q_{\pi(1,n)} q_{\pi(2,n)} \ldots)$, for all $n$. But, by Lemma 9.27, the sequence $q_{\pi(1,n)} q_{\pi(2,n)} \ldots$ is also random, for each $n \in \mathbb{N}$. Thus, all real numbers $a_n$ are random to base $b$, hence, random. □

Theorem 9.29. If a sequence of real numbers is random to some base $b$, then it is uniformly distributed modulo 1.

Proof. Let the sequence $(a_n)$ of reals be random to some base $b \geq 2$. We can assume that all reals $a_n$ lie in $[0, 1)$. For each integer $N \geq 1$ and $0 \leq r < s \leq 1$ we put

$$A([r, s), N) = \frac{\#(\{i \leq N - 1 \mid a_i \in [r, s)\})}{N}.$$

We have to show $\lim_{N \to \infty} A([r, s), N) = s - r$, for all $0 \leq r < s \leq 1$.

For each $n$, let $p(n) = p(n)_1 p(n)_2 \ldots$ be the expansion of $a_n$ in base $b$. We know that the sequence $q$ with $q_{\pi(j,n)} = p(n)_j$, for all $n, j$, is random. We fix a number $k \geq 1$. By Lemma 9.27 the sequence

$$q^{(k)} = p(0)_1 \ldots p(0)_k \, p(1)_1 \ldots p(1)_k \, p(2)_1 \ldots p(2)_k \ldots$$

is random. Let us consider each block $p(n)_1 \ldots p(n)_k$ in this sequence as one digit in the alphabet $\Sigma_{b^k}$. In other words, consider the sequence $r^{(k)} \in \Sigma_{b^k}^{\omega}$ with $\nu_{b^k}(r^{(k)}) = \nu_b(q^{(k)})$. This sequence is random as well and therefore also normal, see Theorem 6.57. Its normality implies that for each interval $[l/b^k, (l+1)/b^k)$ with $0 \leq l < b^k$ the asymptotic portion of numbers $a_n$ in this interval is $\lim_{N \to \infty} A([l/b^k, (l+1)/b^k), N) = 1/b^k$. This immediately implies

$$\lim_{N \to \infty} A([l/b^k, m/b^k), N) = (m - l)/b^k,$$

for $l, m \in \mathbb{N}$, $0 \leq l \leq m \leq b^k$. Let $[r, s)$ be an arbitrary interval with $0 \leq r < s \leq 1$. It contains an interval $[l/b^k, m/b^k)$ with $l, m \in \mathbb{N}$, $0 \leq l \leq m \leq b^k$, and with length at least $s - r - 2 \cdot b^{-k}$. Hence

$$\liminf_{N \to \infty} A([r, s), N) \geq s - r - 2 \cdot b^{-k}.$$

In the same way one proves $\limsup_{N \to \infty} A([r, s), N) \leq s - r + 2 \cdot b^{-k}$. Note that we have proved this for arbitrary $k \geq 1$. Thus, the desired assertion $\lim_{N \to \infty} A([r, s), N) = s - r$ follows. □
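The quantity $A([r, s), N)$ from the proof can be computed directly for any finite prefix of a sequence. The sketch below is an illustration only: it evaluates the proportion for the multiples of $\sqrt{2}$, a sequence chosen here as an assumption because it is uniformly distributed modulo 1 (Weyl's theorem). Since its terms are computable reals, it is not random in the sense of Definition 9.26, so uniform distribution alone does not imply randomness.

```python
import math


def proportion(seq, r, s):
    """A([r, s), N): share of the first N = len(seq) terms whose fractional part lies in [r, s)."""
    hits = sum(1 for a in seq if r <= a - math.floor(a) < s)
    return hits / len(seq)


N = 100_000
multiples = [k * math.sqrt(2) for k in range(N)]  # assumed sample sequence
estimate = proportion(multiples, 0.25, 0.5)       # should be close to s - r = 0.25
```

For growing $N$ the computed proportion approaches the interval length $s - r$, which is the limit statement proved above.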


Remark. Note that the randomness notion for sequences of real numbers introduced in Definition 9.26 is base independent. Note also the following difference between random sequences over a finite alphabet $\Sigma$ and random sequences of real numbers: while the property of randomness for a sequence over a finite alphabet is invariant under the change of a finite portion of the sequence, the same is not true for the property of randomness for a sequence of reals. For example, just inserting one non-random real number somewhere into a random sequence of reals, or changing a number in the sequence into a number which already appears in the sequence at some other place, makes the sequence non-random. "Most" sequences of real numbers are random: in a perfect analogy with the case of reals, with probability 1 every sequence of real numbers is random. Examples of random sequences of real numbers can be easily constructed from random reals; however, we do not have a natural example.

Now we return to the zeros of the Riemann zeta-function. As we have seen, the sequence $(\mathrm{Im}(s_k))_{k \geq 0}$ of the (positive) imaginary parts of the non-trivial zeros of the Riemann zeta-function is uniformly distributed modulo 1. By Theorem 9.29, this is a property shared by all random sequences. But neither the sequence $(\mathrm{Im}(s_k))_{k \geq 0}$ nor any other sequence containing imaginary parts of zeros of the Riemann zeta-function is random. We formulate a slightly more general result.

Lemma 9.30. Let $U \subset \mathbb{C}$ be a connected open subset of the complex plane and let $f : U \to \mathbb{C}$ be an analytic function which is computable at least on some open subset of $U$. If a sequence of real numbers contains a real or imaginary part of a zero of $f$, then this sequence is not random.

Proof. By Proposition 1 in Pour-El and Richards [338] (Chapter 1.2), the function $f$ is computable on any compact subset of its domain $U$. By a result of Orevkov [325] each zero of $f$ is a computable complex number, i.e. its real and imaginary parts are computable real numbers. Any computable real number is non-random, hence, every sequence $(y_n)$ which contains at least one real or imaginary part of a zero of $f$ is not random by Proposition 9.28. □

We are now ready to state the main result (anticipated also in Longpre and Kreinovich [287]):


Theorem 9.31 (Calude-Hertling-Khoussainov). No sequence $(y_k)$ of reals which contains at least one imaginary part of a zero of the Riemann zeta-function is random.

Proof. For complex numbers $s$ with $\mathrm{Re}(s) > 1$ the value $\zeta(s)$ of the zeta-function is given by the absolutely convergent sum $\sum_{n=1}^{\infty} n^{-s}$. Hence the zeta-function is computable in the half plane $\{s \mid \mathrm{Re}(s) > 1\}$. The assertion follows from Lemma 9.30 since the domain of definition of the zeta-function is the connected open set $\mathbb{C} \setminus \{1\}$. □
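The computability claim in this proof rests on the fact that, for $\mathrm{Re}(s) > 1$, the series $\sum n^{-s}$ converges absolutely and its tail can be bounded explicitly. The following minimal sketch illustrates this with floating-point partial sums; the function names and the integral tail bound $N^{1-\sigma}/(\sigma - 1)$, with $\sigma = \mathrm{Re}(s)$, are choices made for this example and are not taken from the text.

```python
import math


def zeta_partial(s, terms):
    """Partial sum of the Dirichlet series sum_{n=1}^{terms} n**(-s); s may be real or complex."""
    return sum(n ** (-s) for n in range(1, terms + 1))


def tail_bound(sigma, terms):
    """Upper bound terms**(1 - sigma) / (sigma - 1) on the neglected tail, valid for sigma = Re(s) > 1."""
    return terms ** (1 - sigma) / (sigma - 1)


approx = zeta_partial(2, 10_000)   # approximates zeta(2) = pi**2 / 6
error = tail_bound(2, 10_000)      # guaranteed truncation error
```

Choosing the number of terms large enough makes the error bound as small as desired, which is the sense in which the values of the zeta-function are computable on this half plane.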

9.7 Probabilistic Algorithms

Probabilistic algorithms are very efficient, but only "probably correct". Our aim is to prove that if sufficiently long random strings are supplied, the probabilistic reasoning involved in many probabilistic algorithms can be converted into rigorous, deterministic correctness proofs. To start with, we describe the famous probabilistic algorithms for testing primality due to Solovay-Strassen [378] and Miller-Rabin [314, 341]. The common idea of these algorithms is the following: To test whether a natural $n$ is prime, process the following two steps:

• take $k$ natural numbers uniformly distributed between 1 and $n - 1$, inclusive,
• for each such number $i$ check whether some fixed predicate $W(i, n)$ holds.

The predicate $W(i, n)$ is true if $i$ is a witness of $n$'s compositeness. If $W(i, n)$ is true, then $n$ is composite; if it is not true, then $n$ is prime with probability greater than $1 - 2^{-k}$. Such a "witness of compositeness" does exist because at least half of the numbers $i \in \{1, 2, \ldots, n-1\}$ are witnesses of $n$'s compositeness - if $n$ is composite - and none of them are - in case $n$ is prime. Furthermore, the predicates $W(i, n)$ are different in case of the two algorithms cited above, but they have an important common feature: the running time of a program computing $W(i, n)$ is bounded by a polynomial in the size of $n$, i.e. in $\log n$.

Here is a general definition of a probabilistic algorithm. A pair $(f, \varepsilon)$, where $f : \mathbb{N} \times A^* \to \mathbb{N}$ is a p.c. function and $0 < \varepsilon < 2^{-1}$ is a computable real, is called a probabilistic algorithm that $\varepsilon$-computes the partial function $g : \mathbb{N} \to \mathbb{N}$, provided that the following two conditions hold true:

i) If $g(n) \neq \infty$ and $f(n, x) = g(n)$, for some $n \in \mathbb{N}$, $x \in A^*$, then $f(n, xy) = g(n)$, for every $y \in A^*$. (A probabilistic algorithm reaching an acceptable "state" does not need any further "random" inputs.)

ii) For every $n \in \mathrm{dom}(g)$ there exists a natural $t_{\varepsilon,n}$ such that

$$\#\{x \in A^* \mid t_{\varepsilon,n} = |x|,\ f(n, x) = g(n)\} > (1 - \varepsilon) Q^{t_{\varepsilon,n}}.$$

(The probability that $f$ computes $g$ is greater than $1 - \varepsilon$, if the encoding of the "random" factor is long enough.)

A model for a probabilistic algorithm can be found in Gill [209]. Let us show that the above primality probabilistic tests are examples of probabilistic algorithms. We put $Q = 2$, $A = \{0, 1\}$. For every subset $I \subset \{1, 2, \ldots, n-1\}$, consider the binary string $x$ of length $n - 1$ defined by $x = x_1 x_2 \ldots x_{n-1}$, $x_i = 1$, if $i \in I$, $x_i = 0$, in the opposite case. Condition i) is satisfied for $t_{\varepsilon,n} = n - 1$. Condition ii) is also satisfied. Indeed, if $n$ is prime, then

$$\#\{x \in A^* \mid f(n, x) = g(n),\ |x| = n - 1\} = 2^{n-1} > (1 - 2^{-1}) 2^{n-1};$$

if $n$ is composite one has

$$\#\{x \in A^* \mid f(n, x) = g(n),\ |x| = n - 1\} = \#\{x \in A^* \mid |x| = n - 1,\ x_i = 1 \text{ and } W(i, n) \text{ holds true for some } 1 \leq i \leq n - 1\}$$

$$> 2^{n-1} - \sum_{k=0}^{n-1} \binom{n-1}{k} 2^{-k} = 2^{n-1} - (3/2)^{n-1} > 2^{n-1}(1 - 2^{-1}), \quad \text{for } n \geq 5.$$

A classical result due to De Leeuw, Moore, Shannon and Shapiro [163] asserts that the class of p. c. functions coincides with the class of partial functions computed by probabilistic algorithms. Next we prove a slightly generalized version (due to Calude and Zimand [102]) of a result first proved by Chaitin and Schwartz [137].
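The witness-based primality test described at the start of this section can be sketched as follows. This is a minimal illustration that takes the Miller-Rabin strong-pseudoprime condition as the predicate $W(i, n)$; the function names `is_witness` and `probably_prime`, the default number of rounds and the use of Python's `random` module as the source of candidate witnesses are assumptions of this example, not part of the text.

```python
import random


def is_witness(i, n):
    """Miller-Rabin predicate W(i, n) for odd n > 2: True if i proves that n is composite."""
    d, s = n - 1, 0
    while d % 2 == 0:          # write n - 1 = d * 2**s with d odd
        d //= 2
        s += 1
    x = pow(i, d, n)
    if x == 1 or x == n - 1:
        return False
    for _ in range(s - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return False
    return True


def probably_prime(n, k=20, rng=random):
    """Test n with k candidate witnesses drawn uniformly from {1, ..., n - 1}."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    for _ in range(k):
        if is_witness(rng.randint(1, n - 1), n):
            return False       # a witness was found: n is certainly composite
    return True                # no witness found: n is prime with high probability
```

A prime has no witnesses, so the answer for a prime is always correct; an error can only occur when every sampled number fails to be a witness for a composite $n$. Theorem 9.32 below concerns exactly this source of error when the sampled numbers are read off a sufficiently long random string.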

Theorem 9.32. Let $f : \mathbb{N} \times A^* \to \mathbb{N}$, $g, h : \mathbb{N} \to \mathbb{N}$ be three computable functions, and $i \in \mathbb{N}$. Assume that:

A) $f$ is a probabilistic algorithm that $\varepsilon$-computes $g$.

B) For every natural $n$ there exist a natural $i_n$ and a computable real $0 < \nu_n < 1/2$ such that

$$\lim_{n \to \infty} \nu_n = 0,$$

and

Then, there exists a natural $N$ such that for all $n \geq N$, if for some $y \in A^*$ we have $n = h(|y|)$, $|y| \geq i_n$, then $f(n, x) = g(n)$, for every Chaitin $i$-random string $x$ with $n = h(|x|)$.

Proof. First let $T$ be such that

$$T - O(\log T) \geq i.$$

Next notice that the set

$$W = \{(x, m) \mid f(h(|x|), x) \neq g(h(|x|)),\ \#\{y \in A^* \mid |x| = |y|,\ f(h(|y|), y) = g(h(|y|))\} > (1 - Q^{-m}/(Q - 1)) Q^{|x|}\}$$

is a Martin-Löf test. So, if $U$ is a universal Martin-Löf test and $m = m_U$, then

$$m_W(z) \leq m(z) + O(1).$$

Let q be the constant furnished by Martin-Lof's asymptotical formula and define the bound

In view of B) there exists a natural $N$ such that for all $n \geq N$, $k_n > T$. Let $k = k_N$. We shall prove that for all $n \geq N$, $f(n, x) = g(n)$ provided i) there exists a string $y$ with $h(|y|) = n$ and $|y| \geq i_n$, and ii) $x$ is a Chaitin $i$-random string such that $h(|x|) = n$. We proceed by reductio ad absurdum. Suppose $x$ to be a Chaitin random string, $n = h(|x|) \geq N$ and

$$f(n, x) \neq g(n).$$

It is not difficult to see that

$$\#\{z \in A^* \mid |z| = |x|,\ f(n, z) = g(n)\} \geq (1 - \nu_n) Q^{|x|}.$$

But, $(x, m_W(x) + 1) \notin W$, so

$$\#\{z \in A^* \mid |z| = |x|,\ f(n, z) = g(n)\} \leq (1 - Q^{-m_W(x)-1}/(Q - 1)) Q^{|x|}.$$

Combining the last inequalities, we get

$$\nu_n \geq Q^{-m_W(x)-1}/(Q - 1),$$

or, equivalently,

so

Finally, we use the Martin-Lof asymptotical formula

K(x)

< Ixl- m(x) + q

< Ixl + (q + i + 1) -llogQ

vn(d -l)J

$|x| - k_n < |x| - T$, since $k_n > T$. In view of Corollary 5.8, $x \notin \mathrm{RAND}^C_i$, thus contradicting the hypothesis. □

We have obtained the main result in [137]:

Corollary 9.33 (Chaitin-Schwartz). For almost all inputs $n$, the probabilistic algorithms of Solovay-Strassen and Miller-Rabin are error-free in case they use a long enough Chaitin $i$-random input.

Proof. Consider

$$h(n) = n + 1, \quad \nu_n = 2^{-\lfloor n/3 \rfloor}, \quad i_n = \max\{n - 1, a\}.$$

□

An analysis of the proof of Theorem 9.32 reveals the number of potential witnesses of compositeness which must be visited to ensure the primality of numbers of some special form correctly with high probability (in fact with certainty - if some "oracle" gives us a long Chaitin random string⁸). For instance, a number of the following simple form

$$N = 10^n + m$$

requires $O(\log n) + O(\log m)$ potential witnesses.

Mersenne numbers

$$N = 2^n - 1$$

require $O(\log n) = O(\log \log N)$ potential witnesses.

Fermat numbers

$$N = 2^{2^n} + 1$$

require $O(\log \log n) = O(\log \log \log \log N)$ potential witnesses.

Finally, Eisenstein-Bell numbers

$$N = 2^{2^{\cdot^{\cdot^{\cdot^{2}}}}} + 1 \quad (n \text{ 2's altogether})$$

need $O(\log n) = O(\log^k N)$ witnesses, for every natural $k$.

9.8 Structural Complexity

There is no general agreement as to what defines the structural complexity,⁹ but there is a more common view as concerns the position of this area inside theoretical computer science - a leading role. We are not going to describe this fascinating subject; instead we shall give the reader an idea about the impact of AIT in structural complexity. See more details in Barthelemy, Cohen and Lobstein [19], Downey [180], Garey and Johnson [207], Balcazar, Diaz and Gabarro [14], Hemachandra and Ogihara [229], Li and Vitanyi [280, 282], Longpre [286], Wagner and Wechsung [426] and Watanabe [428]. Perhaps the best known and most discussed problem of structural complexity is the (in)famous problem P =? NP. Here is a very common illustration.

⁸ We know, by virtue of results proven in Section 5.5, that, in spite of the fact that almost all strings are Chaitin t-random, no algorithm can produce an infinity of such strings.

⁹ As usual, a criterion like "I know it when I see it" works very well.


Given an undirected graph $G$ we recall that a Hamiltonian path in $G$ is a path through each of the vertices of $G$, passing through each vertex exactly once.¹⁰ The main problem connected to Hamiltonian paths is to find such a path if it does exist: construct an algorithm such that for every graph $G$ it computes a Hamiltonian path in $G$, or tells us one does not exist.¹¹ A lot of work has been invested in this problem. One way to solve it is to proceed by trial and error. The resulting algorithm may run - in the worst case - more than $O(2^n)$ steps, where $n$ is the number of edges of $G$. For a size $> 10^3$ the performance is pretty bad! What would be very desirable is a "polynomial-time algorithm", i.e. an algorithm running in time bounded by a low degree polynomial, say of order 3 or 4. Nobody at the time being knows such an algorithm!

There is also a sense in which the above problem may be considered typical for a large class of similar problems¹² which are all equally difficult: if we can solve any of these problems by a fast algorithm - fast, in structural complexity, means in polynomial-time - then we can solve all of them fast. It is important to note the difference between two important measures of complexity: time and space. With respect to the space complexity, the above problem is tractable, i.e. it may be solved in polynomial-space (write: PSPACE) since space is reusable. We do not know if this problem is in P, i.e. if it can be solved in polynomial-time. Actually, most people think that the answer is negative! On the other hand, finding a Hamiltonian path is a problem that can be solved non-deterministically in polynomial-time, i.e. it lies in the class NP.

The problem P =? NP is really meta-mathematical! Indeed, assume an appropriate coding and measure of the size of proofs. For example, a Hamiltonian path is a proof that the graph has a Hamiltonian path; moreover, the validity of this proof can be checked in polynomial-time. As we hinted in the above discussion, the difference between P and NP - if any - may be seen as a difference between constructing a polynomial-size proof and verifying a polynomial-size proof. If P = NP, then the two tasks have the same degree of difficulty.

¹⁰ This problem is extremely useful in many practical situations. Just choose, at random, a book in operations research and you will be convinced.

¹¹ Technically, NP problems are decision problems that give a YES/NO answer; in this case the output would be "YES, there is a Hamiltonian path", or "NO, there is no Hamiltonian path".

¹² Most of these problems have a strong practical significance.
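The contrast between constructing and verifying a proof can be made concrete for Hamiltonian paths. The sketch below is illustrative only: vertices are assumed to be numbered $0, \ldots, n-1$, edges are given as pairs, and the function names are hypothetical. Verification takes time polynomial in the size of the graph, whereas the trial-and-error search may inspect all $n!$ orderings of the vertices.

```python
from itertools import permutations


def is_hamiltonian_path(n, edges, path):
    """Verify in polynomial time that `path` visits each vertex 0..n-1 exactly once along edges."""
    adjacent = {frozenset(e) for e in edges}
    visits_all_once = sorted(path) == list(range(n))
    follows_edges = all(frozenset((u, v)) in adjacent
                        for u, v in zip(path, path[1:]))
    return visits_all_once and follows_edges


def find_hamiltonian_path(n, edges):
    """Trial and error: try every ordering of the vertices (exponentially many in general)."""
    for candidate in permutations(range(n)):
        if is_hamiltonian_path(n, edges, candidate):
            return list(candidate)
    return None


line_graph = [(0, 1), (1, 2), (2, 3)]   # a path on four vertices
star_graph = [(0, 1), (0, 2), (0, 3)]   # a star: no Hamiltonian path exists
```

Here `find_hamiltonian_path(4, line_graph)` returns a path that `is_hamiltonian_path` accepts, while for `star_graph` the search exhausts all orderings and reports that no path exists.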


Two more mathematical problems are quite relevant for our discussion. Both of them belong to number theory and are currently open. The prime number problem asks for a polynomial-time algorithm to check whether an arbitrary number $n$ is prime. It should be emphasized that the interest is in a polynomial-time algorithm in the number of digits representing the number $n$ (not in $n$, a trivial problem). It is plain that this problem is in co-NP¹³ as determining whether a number is composite is in NP, a proof being just a prime factor; Pratt [339] has shown that it is also in NP. Miller [314] has proven that this problem is in P if one assumes the extended Riemann Hypothesis. The other problem, the factorization problem, asks for non-trivial factors of the natural number $n$, if $n$ is composite. It is basic for many public-key crypto-systems ("trapdoor ones") and it is widely believed to be intractable. See more in Salomaa [356]. But, we do not even know if this problem is NP-complete, i.e. we do not know if the Hamiltonian path problem can be solved fast given a routine for factorization.

We may ask: "Why is the problem P =? NP so hard?" To answer this question we have to rely on a technique from computability theory known as relativization. Roughly speaking, this means the introduction of the so-called oracles - devices able to perform even "non-algorithmic tasks". Most statements true for oracle-free machines remain true for machines with oracles. An important step in this direction has been made by Baker, Gill and Solovay [13]: they have shown that the P =? NP problem cannot be settled by arguments that relativize.

Theorem 9.34. There exist two computable oracles $B$, $A$ such that $P(A) \neq NP(A)$ and $P(B) = NP(B)$.

Hartmanis and Hopcroft [227] have proven the following independence result:

Theorem 9.35. There exist two computable sets $A$, $B$ with $P(A) \neq NP(A)$ and $P(B) = NP(B)$, but neither result is provable in ZFC.

More light has been shed on this problem by the Bennett and Gill [33] result:

¹³ co-NP is the class of sets $X$ such that the predicate $x \notin X$ is in NP.


Theorem 9.36. If A is a random oracle, then P(A) =J NP(A), i.e. with probability 1, P(A) =J NP(A). Hemaspaandra and Zimand [230] have obtained the following stronger result:

Theorem 9.37. Relative to a random oracle, there is a language in NP on which each polynomial-time algorithm is correct on half of the inputs at each sufficiently large length, and is wrong on the other half.

A modification of the central idea in AIT has been developed by Hartmanis [224]: consider not only the length of a computer outputting a string, but also, simultaneously, the running time of the computer. Given a universal computer $\psi$ and two computable functions G, g, a string x of length n is in the "generalized Kolmogorov class" $K_\psi[g(n), G(n)]$ if there is a string y of length at most g(n) with the property that $\psi$ will generate x on input y in at most G(n) steps. A set X of strings has small generalized Kolmogorov complexity if there exist constants c, k such that for almost all x in X, one has

$$x \in K_\psi[c \log |x|, |x|^k].$$

This class is usually denoted by K[log, poly].

For any set X we denote by $\mathrm{enum}_X$ the function that for every natural n has as value a string encoding the set of all strings in X of length at most n. The set X is self-p-printable if there is a (deterministic) oracle computer that computes the function $\mathrm{enum}_X$ relative to X and that runs in polynomial time. Every self-p-printable set is sparse, i.e. there is a polynomial P such that for every natural n, the number of strings $x \in X$ of length less than n is bounded by P(n). An easy characterization follows:

P = NP iff for every self-p-printable set X, P(X) = NP(X).
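The definition of the generalized Kolmogorov class given above can be made concrete with a small sketch. The machine below is a hypothetical toy decoder, deliberately not universal, and its step count (one step per output symbol) is an assumption made only to keep the example short; the brute-force search is exponential in g(n) and is meant purely as an illustration.

```python
from itertools import product

def toy_machine(program: str, max_steps: int):
    """Hypothetical toy computer (NOT universal); one step per output symbol.

    '0' + w      -> outputs w literally
    '1' + b + k  -> outputs the bit b repeated k times, k written in binary
    Returns the output, or None if the program is malformed or too slow.
    """
    if not program:
        return None
    if program[0] == "0":
        out_len = len(program) - 1
        return program[1:] if out_len <= max_steps else None
    if len(program) < 3:
        return None
    k = int(program[2:], 2)
    return program[1] * k if k <= max_steps else None

def in_class(x: str, g, G) -> bool:
    """Is x in K[g(n), G(n)] for the toy machine? Tries every program of length <= g(n)."""
    n = len(x)
    for length in range(1, g(n) + 1):
        for bits in product("01", repeat=length):
            if toy_machine("".join(bits), G(n)) == x:
                return True
    return False

log_bound = lambda n: 2 * max(1, n.bit_length())   # plays the role of c * log n
linear_time = lambda n: n                          # plays the role of n^k with k = 1
```

With these bounds a highly regular string such as 64 zeros belongs to the class, since the 9-bit program `101000000` prints it within 64 steps, whereas an irregular string of the same length does not.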

Hartmanis and Hemachandra [225] have proven that the class of self-p-printable sets can be viewed as a relativized version of K[log, poly]:


Theorem 9.38. A set X is self-p-printable iff $X \in K[\log, \mathrm{poly}]$.

A very interesting approach has been inaugurated by Book, Lutz and Wagner [39] (see also Book [38]). They have studied the algorithmically random languages (RAND) in a framework which is very close to the main stream of Chapter 6. Motivated by Theorem 9.36 of Bennett and Gill, they have designed a new way to gain information about the complexity of a language L. Here is a typical result:

Theorem 9.39. a) Let $L \subset A^\omega$ be a union of constructively closed sets¹⁴ that is closed under finite variation. Then $\mu(L) = 1$ iff $L \cap RAND \neq \emptyset$.

b) Let L be an intersection of constructively open sets that is closed under finite variation. Then $\mu(L) = 1$ iff $RAND \subset L$.

Finally, consider the exponential complexity classes

$$E = \mathrm{DTIME}(2^{\mathrm{linear}}) \quad \text{and} \quad E_2 = \mathrm{DTIME}(2^{\mathrm{polynomial}}).$$

There are several reasons for considering these classes (Lutz [289, 290]):

1. Both classes $E$, $E_2$ have rich internal structures.

2. $E_2$ is the smallest deterministic time complexity class known to contain NP and PSPACE.

3. $P \subset E \subset E_2$, $E \neq E_2$, and E contains many NP-complete problems.

4. Both classes $E$, $E_2$ have been proven to contain intractable problems.

¹⁴ That is, L is a union of a family of sets each of which is the complement of a constructively open set.

In view of property 2 there may well be a natural "notion of smallness" for subsets of $E_2$ such that P is a small subset of $E_2$, but NP is not. Similarly, it may be that P is a small subset of E, but that $NP \cap E$ is not! In the language of constructive measure theory smallness can be translated by "measure zero" (with respect to the induced spaces E or $E_2$). One can prove that indeed P has constructive measure zero in E and $E_2$, Lutz [289]. This motivates Lutz [290] to adopt the following quantitative hypothesis:

The set NP does not have measure zero.

This is a strong hypothesis, as it implies $P \neq NP$. It is consistent with Zimand's [453] topological analysis (with respect to a natural, constructive topology, if $NP \setminus P$ is non-empty, then it is a second Baire category set, while NP-complete sets form a first category class) and appears to have more explanatory power than traditional, qualitative hypotheses. As currently we are unable to prove or disprove this conjecture, the best strategy seems to be to investigate it as a scientific hypothesis; its importance will be evaluated in terms of the extent and credibility of its consequences. Some interesting results have been obtained by Lutz [289] and Lutz and Mayordomo [291]. For instance, they have proven the following result:

Theorem 9.40. For every real $0 < \alpha < 1$, only a subset of measure zero of the languages decidable in exponential time are $\leq^P_{n^\alpha\text{-tt}}$-reducible to languages that are not exponentially dense.

Here the truth-table $\leq^P_{n^\alpha\text{-tt}}$-reducibility is "truth-table reducibility with $n^\alpha$ queries on inputs of length n".

9.9 What Is Life?

The idea that the Universe is a living organism is very old. Aristotle thought that the entire Universe "resembles a gigantic organism, and it is directed towards some final cosmic goal".¹⁵ But, "What is life?" "When must life arise and evolve?" Or, maybe better, "How likely is life to appear and evolve?" "How common is life in the Universe?"

¹⁵ Teleology is the idea that physical processes can be determined by, or drawn towards, an a priori determined end-state.

The evolution of life on Earth is seen as a deterministic affair, but a somewhat creative element is introduced through random variations and natural selection. Essentially, there are two views as regards the origins of life. The first one claims that the precise physical processes leading to the first living organism are exceedingly improbable, and life is in a way intimately linked to planet Earth (the events preceding the appearance of the first living organism would be very unlikely to have been repeated elsewhere). The second one puts no sharp division between living and non-living organisms. So, the origin of life is only one step, maybe a major one, along the long path of the progressive complexification and organization of matter.

To be able to analyse these views we need some coherent concept of life! Do we have it? It is not difficult to recognize life when we see it, but it looks tremendously difficult to set up a list of distinct features shared by all, and only by, living organisms. The ability to reproduce, the response to external stimuli, and growth are among the most frequently cited properties. But, unfortunately, none of these properties "defines" life. Just consider an example: a virus does not satisfy any of the above criteria of life though viral diseases clearly imply biological activity.

A very important step towards understanding life was taken by Stanley Miller and Harold Urey; their classical experiment led to amino acids, which are not living organisms or molecules, but the building blocks of proteins. Life is ultimately based on these two groups of chemicals: nucleic acids and proteins. Both are made from carbon, hydrogen, oxygen, nitrogen and small quantities of other elements (sulphur, phosphorus). Nucleic acids are responsible for storing and transmitting all the information required to build the organism and make it work - the genetic code. The role of proteins is twofold: structural and catalytic. Little is known about the crucial jump from amino acids to proteins and even less about the origins of nucleic acids.
Along the line of reasoning suggested by the Miller and Urey primeval soup and Darwinian evolution it appears that the spontaneous generation of life from simple inanimate chemicals occurs far more easily than its deep complexity would suggest. In other words, life appears to be a rather common feature in the Universe! Von Neumann wished to isolate the mathematical essence of life¹⁶ as it evolves from the above physics and biochemistry. In [424] he made the first step by showing that the exact reproduction of universal Turing machines is possible in a particular deterministic model Universe.

¹⁶ In Chaitin's words: If mathematics can be made out of Darwin, then we will have added something basic to mathematics; while if it cannot, then Darwin must be wrong, and life remains a miracle ...


Following this path of thought it may be possible to formulate a way to differentiate between dead and living matter: by the degree of organization. According to Chaitin [122] an organism is a highly interdependent region, one for which the complexity of the whole is much less than the sum of the complexities of its parts. Life means unity. Dead versus living can be summarized as the whole versus the sum of its parts.

Charles Bennett's thesis is that a structure is deep if it is superficially random but subtly redundant, in other words, if almost all its algorithmic probability is contributed by slow-running programs. To model this idea Bennett has introduced the notion of "logical depth": a string's logical depth reflects the amount of computational work required to expose its "buried redundancy":¹⁷

A typical sequence of coin tosses has high information content, but little message value. ... The value of a message thus appears to reside ... in what might be called its buried redundancy - parts predictable only with difficulty, things the receiver could in principle have figured out without being told, but only at considerable cost in time, money and computation. In other words, the value of a message is the amount of mathematical or other work plausibly done by its originator, which its receiver is saved from having to repeat.
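Chaitin's criterion quoted above (the complexity of the whole is much less than the sum of the complexities of the parts) can be explored numerically with a rough sketch. Program-size complexity is uncomputable, so the code below substitutes the length of a zlib-compressed encoding, which is only a crude computable upper bound; the function names, the threshold-free ratio and the sample data are all assumptions introduced for illustration.

```python
import random
import zlib

def c(data: bytes) -> int:
    """Compressed length in bytes: a crude stand-in for program-size complexity."""
    return len(zlib.compress(data, 9))

def interdependence(parts) -> float:
    """Ratio C(whole) / sum of C(parts); values well below 1 suggest interdependent parts."""
    whole = b"".join(parts)
    return c(whole) / sum(c(p) for p in parts)

rng = random.Random(0)

def noise(n: int) -> bytes:
    """Pseudo-random bytes, used here as (practically) incompressible material."""
    return bytes(rng.getrandbits(8) for _ in range(n))

block = noise(2000)
organism = [block] * 4                      # parts that share all their information
heap = [noise(2000) for _ in range(4)]      # unrelated parts

print(round(interdependence(organism), 2), round(interdependence(heap), 2))
```

For the four identical parts the ratio is far below 1, while for the four unrelated parts it stays close to 1. A compressor sees only shallow regularities, so this proxy says nothing about Bennett's logical depth.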

We arrive at a point when the question

Is the Universe a computer?

becomes inevitable. Maybe Douglas Adams' story ([1], pp. 134-137) is after all not science fiction: the answer to the Great Question of Life, the Universe and Everything, the Ultimate answer searched for in seven and a half million years of work, is

"Forty-two", said Deep Thought, with infinite majesty and calm.¹⁸

¹⁷ See Bennett [27], p. 297 and for more details [26, 28, 31].

¹⁸ "I checked it very thoroughly", said the computer, "and that quite definitely is the answer. I think the problem, to be quite honest with you, is that you've never actually known what the question is." ... "The Ultimate Question?" "Yes!" "Of Life, the Universe and Everything?" "Yes!" "But can you do it?" cried Loonquawl. Deep Thought pondered this for another long moment. Finally: "No", he said firmly. ... "But I'll tell you who can," said Deep Thought. ... "I speak of none but the computer that is to come after me," ... "A computer which can calculate the Question to the Ultimate Answer, a computer of such infinite and subtle complexity that organic life itself shall form a part of its operational matrix. ... Yes! I shall design this computer for you. And I shall name it also unto you. And it shall be called ... The Earth."

For John Wheeler the Universe is a gigantic information processing system in which the output is as yet undetermined. He coined the slogan: It from bit! That is, it - every force, particle, etc. - is ultimately present through bits of information. And Wheeler is not unique in this view. Ed Fredkin and Tom Toffoli emphatically say yes: the Universe is a gigantic cellular automaton. No doubt! The only problem is that somebody else is using it. All we have to do is "hitch a ride" on his huge ongoing computation, and try to discover which parts of it happen to go near where we want - says Toffoli [400]. For the physicist Frank Tipler the Universe can be equated with its own simulation viewed very abstractly. Feynman [196] considered the

... possibility that there is to be an exact simulation, that the computer will do exactly the same as nature, ... that everything that happens in a finite volume of space and time would have to be exactly analyzable with a finite number of logical operations.

He concludes:

The present theory of physics is not that way, apparently. It allows space to go down to infinitesimal distances.

This is a strong objection, but perhaps not a fatal one. As Paul Davies argues, the continuity of time and of space are only assumptions about the world, they are merely our working hypotheses. They cannot be proven! Here is his argument:

... we can never be sure that at some small scale of size, well below what can be observed, space and time might not be discrete. What would this mean? For one thing it would mean that time advanced in little hops, as in a cellular automaton, rather than smoothly. The situation would resemble a movie film which advances one frame at a time. The film appears to us as continuous, because we cannot resolve the short time intervals between frames. Similarly, in physics, our current experiments can measure intervals of time as short as $10^{-26}$ seconds; there is no sign of any jumps at that level. But, however fine our resolution becomes, there is still the possibility that the little hops are yet smaller. Similar remarks apply to the assumed continuity of space.
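The picture of time advancing "in little hops, as in a cellular automaton" can be made concrete with a minimal sketch of a one-dimensional, two-state automaton. The choice of Wolfram's elementary rule numbering, of rule 30 as the default and of a periodic boundary are assumptions made only for this illustration; they are not taken from Fredkin, Toffoli or Davies.

```python
def step(cells, rule=30):
    """One discrete time hop: every cell is updated from its left, own and right state.

    `rule` is an elementary rule number (0..255); bit number 4*l + 2*c + r of
    `rule` gives the new state for the neighbourhood (l, c, r). The row is
    treated as a ring (periodic boundary).
    """
    n = len(cells)
    return [
        (rule >> ((cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def evolve(cells, hops, rule=30):
    """Return the whole history: the initial row followed by `hops` successive rows."""
    history = [list(cells)]
    for _ in range(hops):
        history.append(step(history[-1], rule))
    return history

if __name__ == "__main__":
    row = [0] * 15 + [1] + [0] * 15
    for line in evolve(row, 12):
        print("".join("#" if cell else "." for cell in line))
```

Each row of the printout is one "frame" of the film: the rule is local and deterministic, and nothing happens between two consecutive frames.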

And, we may add, the results proved by methods of non-standard analysis reinforce the duality between the continuous and the discrete.

A computer simulation is usually regarded as a model, as a (simplified) representation, as an image of the reality. Is it possible to realistically claim that the activity going on inside a computer could ever create a real Universe? Can a computer simulate consciousness? Roger Penrose dedicated a fascinating book to this problem [331].¹⁹ His conclusion is strong: a brain's physical action evokes awareness, but physical action cannot, even in principle, be simulated computationally. It may even be possible that awareness cannot be explained in any scientific terms.²⁰ (An account of these matters was presented in [333].) Tipler distinguishes two "worlds": one inside the computer and the other outside.

The key question is this: Do the simulated people exist? As far as the simulated people can tell, they do. By assumption, any action which real people can and do carry out to determine if they exist - reflecting on the fact that they think, interacting with the environment - the simulated people also can do, and in fact do do. There is simply no way for the simulated people to tell that they are "really" inside the computer, that they are merely simulated, and not real. They can't get at the real substance, the physical computer, from where they are, inside the program.

How do we know that we ourselves are real and not "simulated" by a gigantic computer?²¹ "Obviously, we can't know" says Tipler. But this is irrelevant. The existence of the Universe itself is irrelevant:

Such a physically real universe would be equivalent to a Kantian thing-in-itself. As empiricists, we are forced to dispense with such an inherently unknowable object: the universe must be an abstract program.

¹⁹ It will be soon followed by another one.

²⁰ In his own words [332]: I ... suggest that the outward manifestations of conscious mental activity cannot even be properly simulated by calculation.

²¹ Following Ilya Prigogine, God is reduced to a mere archivist turning pages of a cosmic history book already written; according to Paul Erdős, God has a large book containing all mathematics - and every mathematician is allowed to look into it only once, maybe twice, the rest being his job to discover.

The "world view from within" and "from the outside" have been suggested by other authors as well. Svozil has dedicated a chapter of his book [391] to a detailed presentation of his own views. Here are the main facts summarized in Svozil [392]:²²

Epistemologically, the intrinsic/extrinsic concept, or, by another naming, the endophysics/exophysics concept, is related to the question of how a mathematical or a logical or an algorithmic universe is perceived from within/from the outside. The physical universe (in Rössler's dictum, the "Cartesian prison"), by definition, can be perceived from within only.

Extrinsic or exophysical perception can be conceived as a hierarchical process, in which the system under observation and the experimenter form a two-level hierarchy. The system is laid out and the experimenter peeps at every relevant feature of it without changing it. The restricted entanglement between the system and the experimenter can be represented by a one-way information flow from the system to the experimenter; the system is not affected by the experimenter's actions.

Intrinsic or endophysical perception can be conceived as a nonhierarchical effort. The experimenter is part of the universe under observation. Experiments use devices and procedures which are realisable by internal resources, i.e., from within the universe. The total integration of the experimenter in the observed system can be represented by a two-way information flow, where "measurement apparatus" and "observed entity" are interchangeable and any distinction between them is merely a matter of intent and convention. Endophysics is limited by the self-referential character of any measurement.
An intrinsic measurement can often be related to the paradoxical attempt to obtain the "true" value of an observable while - through interaction - it causes "disturbances" of the entity to be measured, thereby changing its state.

Among other questions one may ask, "what kind of experiments are intrinsically operational and what type of theories will be intrinsically reasonable?" Imagine, for example, some artificial intelligence living in a (hermetic) cyberspace. This agent might develop a "natural science" by performing experiments and developing theories. It is tempting to speculate that also a figure in a novel, imagined by the poet and the reader, is such an agent. Intrinsic phenomenologically, the virtual backflow could manifest itself by some violation of a "superselection rule;" i.e., by some virtual phenomenon which violates the fundamental laws of a virtual reality, such as symmetry and conservation principles.

²² Historically, Archimedes conceived points outside the world, from which one could move the earth. Archimedes' use of "points outside the world" was in a mechanical rather than in a metatheoretical context: he claimed to be able to move any given weight by any given force, however small.

The whole story is fascinating. Most facts are currently at the stage of hypotheses, beliefs ... Here are some relevant references for the interested reader: Akin [2], Barrow [15], Barrow and Tipler [18], Bennett [26, 28], Calude and Salomaa [98], Chaitin [119, 122, 125], Davies [155], Davies and Gribbin [156], Feynman [196], Levy [279], Penrose [331, 332], Svozil [391], Tymoczko [406] and von Neumann [423, 424]. As a bridge to the next section we quote the conclusion reached by Deutsch [176], p. 101:

The reason why we find it possible to construct, say, electronic calculators, and indeed why we can perform mental arithmetic, cannot be found in mathematics or logic. The reason is that the laws of physics "happen" to permit the existence of physical models for the operations of arithmetic such as addition, subtraction and multiplication. If they did not, these familiar operations would be non-computable functions. We might still know of them and invoke them in mathematical proofs (which would be presumably called "non-constructive") but we could not perform them.

9.10 Randomness in Physics

All science is founded on the assumption that the physical Universe is ordered and rational. The most powerful expression of this state of affairs is found in the successful application of mathematics to make predictions expressed by means of the laws of physics. Where do these laws come from? Why do they operate universally and unfailingly? Nobody seems to have reasonable answers to these questions. The most we can do is to explain that the hypothesis of order is supported by our daily observations: the rhythm of day and night, the pattern of planetary motion, the regular ticking of clocks. However, there is a limit to this perceived order: the vagaries of weather, the devastation of earthquakes, or the fall of meteorites are (perceived) as fortuitous. How are we to reconcile these seemingly random processes with the supposed order? There are at least two ways. The most common one starts by observing that even if the individual chance events may give the impression of lawlessness, disorderly processes may still have deep (statistical) regularities. This is the case for most interpretations of quantum mechanics - to which we shall return later. It is not too hard to notice some limits to this kind of explanation. It is common sense to say that "casino managers put as much faith in the laws of chance as engineers put in the laws of physics". We may ask: "How can the same physical process obey two contradictory laws, the laws of chance and the laws of physics?" As an example consider the spin of a roulette wheel.
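The coexistence of unpredictable individual events with firm statistical regularities can be seen in a short simulation of the roulette wheel. The sketch assumes a European wheel with 37 pockets of which 18 are red, labels the pockets 0 to 17 as red purely for convenience, and uses a seeded pseudo-random generator in place of a physical wheel; all of these are assumptions of the illustration, not claims of the text.

```python
import random

POCKETS = 37   # assumed European wheel: 18 red, 18 black, 1 green
RED = 18

def red_frequency(spins: int, seed: int = 0) -> float:
    """Fraction of simulated spins that land on red."""
    rng = random.Random(seed)
    reds = sum(1 for _ in range(spins) if rng.randrange(POCKETS) < RED)
    return reds / spins

if __name__ == "__main__":
    for spins in (10, 1_000, 100_000):
        print(spins, round(red_frequency(spins), 4), "expected", round(RED / POCKETS, 4))
```

No single spin can be foretold, yet the observed frequency settles near 18/37 as the number of spins grows; this is the regularity on which the casino manager relies.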

There is a second, "symmetric" approach, which is mainly suggested by AIT. As our direct information refers to finite experiments, it is not out of the question to discover local rules functioning on large, but finite, scales, even if the global behaviour of the process is truly random.²³ But, to perceive this global randomness we have to have access to infinity!

²³ Recall that in a random sequence every string - of any length - appears infinitely many times. So, in such a random sequence the first billion digits may be exactly the first digits of the expansion of π!

It is important to notice that, consistently with our common experience, facing global randomness does not imply the impossibility of making predictions. Space scientists can pinpoint and predict planetary locations and velocities "well enough" to plan missions years in advance. Astronomers can predict solar or lunar eclipses centuries before their occurrence. We have to be aware that all these results - as superb as they may be - are only true within a certain degree of precision. Of course, in the process of solving equations, say of motion, small errors accumulate, making the predictions less reliable as the time gets longer. We face the limits of our methods! Why are our tools so imperfect? The reason may be found in some facts proved in Chapter 6: a random sequence cannot be "computed", it is only possible to approximate it very crudely. AIT gives researchers an appreciation of how little complexity in a system is needed to produce extremely complicated phenomena and how difficult it is to describe the Universe. We shall return to this point of view in Section 9.11. It is important to note the main conclusions of Svozil (for a detailed and convincing argumentation see [391]):

• Chaos in physics corresponds to randomness in mathematics.
• Randomness in physics may correspond to uncomputability in mathematics.

Where do we stand with regard to computability in physics? The most striking results have been obtained by Pour-El and Richards [338] (for an ample discussion see Penrose's book [331]) for the wave equation. They have proven that even though solutions of the wave equation behave deterministically, in the most common sense, there exist computable initial data²⁴ with the strange property that for a later computable time the determined value of the field is non-computable. Thus, we get a certain possibility that the equations - of a possible field theory - give rise to a non-computable evolution. In the same spirit, da Costa and Doria [149] have proven that the problem whether a given Hamiltonian can be integrated by quadratures is undecidable; their approach led to an incompleteness theorem for Hamiltonian mechanics.

²⁴ More precisely, the initial condition is $C^1$ (i.e. continuous, with continuous derivative), but not twice differentiable. Penrose [331], pp. 243-244, observes that the initial data do not vary in the smooth way one would "normally" require for a physically sensible field. Of course, one may ask whether the physical Universe is really "normal". Once again, note the indirect way we are using the hypothesis of order! See also Weihrauch and Zhong [432].

Perhaps the most important relation between randomness and the Universe is provided by quantum mechanics. Let us examine it very briefly. This theory pertains to events involving atoms and particles smaller than atoms, events such as collisions or the emission of radiation. In all these situations the theory is able to tell what will probably happen, not what will certainly happen. The classical idea of causality (i.e. the idea that the present state is the effect of a previous state and cause of the state which is to follow) implies that in order to predict the future we must know the present, with enough precision.²⁵ Not so in quantum mechanics! For quantum events this is impossible in view of Heisenberg's Uncertainty Principle. According to this principle it is impossible to measure both the position and the momentum of a particle accurately at the same time. Worse than this, there exists an absolute limit on the product of these inaccuracies expressed by the formula $\Delta p \cdot \Delta q \geq h$, where q, p refer, respectively, to the position and momentum and $\Delta p$, $\Delta q$ to the corresponding inaccuracies. In other words, the more accurately the position q is measured, the less accurately can the momentum p be determined, and vice versa. Measurement with infinite precision is ruled out: if the position were measured to infinite precision, then the momentum would become completely uncertain, and if the momentum were measured exactly, then the particle's location would be completely uncertain. To get some concrete feeling let us assume that the position of an electron is measured to an accuracy of $10^{-9}$ m; then the momentum would become so uncertain that one could not expect that, 1 second later, the electron would be closer than 100 kilometres away (see Penrose [331], p. 248).
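The electron estimate quoted above can be checked with a few lines of arithmetic. The sketch assumes the limiting case in which the product of the inaccuracies equals the quantum constant, and it shows both the reduced constant ħ and Planck's constant h; which of the two underlies the quoted figure is an assumption here, since the text gives only the order of magnitude. The numerical values of the constants are the standard SI ones.

```python
H = 6.62607015e-34        # Planck constant, J*s
HBAR = 1.054571817e-34    # reduced Planck constant h / (2*pi), J*s
M_E = 9.1093837015e-31    # electron mass, kg

def velocity_spread(dq: float, quantum: float = HBAR, mass: float = M_E) -> float:
    """Velocity uncertainty (m/s) when position is known to within dq metres.

    Uses dp = quantum / dq (the limiting case of the uncertainty relation)
    and dv = dp / mass.
    """
    return quantum / (dq * mass)

if __name__ == "__main__":
    dq = 1e-9  # one nanometre
    for name, q in (("hbar", HBAR), ("h", H)):
        km = velocity_spread(dq, q) * 1.0 / 1000.0   # distance after 1 second, in km
        print(f"{name}: about {km:.0f} km after one second")
```

With ħ the spread after one second is a little over 100 km, matching the figure cited from Penrose; with h it is larger by a factor of 2π, which only strengthens the conclusion.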
²⁵ In company with Laplace: a thing cannot occur without a cause which produces it.

Borel [42] proved that if a mass of 1 gram is displaced through a distance of 1 centimetre on a star at the distance of Sirius it would influence the magnitude of gravitation on the Earth by a factor of only $10^{-100}$. More recently, it has been proven that the presence/absence of an electron at a distance of $10^{10}$ light years would affect the gravitational force at the Earth by an amount that could change the angles of molecular trajectories by as much as 1 radian after about 56 collisions.

But, what is the point of view of the main "actors"?

Heisenberg: In experiments about atomic events we have to do with things and facts, with phenomena that are just as real as any phenomena in daily life. But the atoms or the elementary particles themselves are not as real: they form a world of potentialities or possibilities rather than one of things or facts.

Bohr: Physics is not about how the world is, it is about what we can say about this world.

Dirac: The only object of theoretical physics is to calculate results that can be compared with experiment, and it is quite unnecessary that any satisfying description of the whole course of the phenomenon should be given.

Einstein was very upset about this situation! His opposition to the probabilistic aspect of quantum mechanics²⁶ is very well known:

Quantum mechanics is very impressive. But an inner voice tells me that it is not yet the real thing. The theory produces a good deal but hardly brings us closer to the secret of the Old One. I am at all events convinced that He does not play dice.²⁷

It is important to note that Einstein was not questioning the use of probabilities in quantum theory (as a measure of temporary ignorance or error), but the implication that the individual microscopic events are themselves indeterminate, unpredictable, random.

²⁶ Recall that Einstein put forward the concept of the photon in 1905 - out of which the idea of wave-particle duality was developed!

²⁷ From his reply to one of Max Born's letters in 1926, quoted from Penrose [331], p. 280.

Quantum randomness is precisely the kind of randomness usually considered in probability theory. It is a "global" randomness, in the sense that it addresses processes (e.g. measuring the diagonal polarization of a horizontally polarized photon) and not individuals (it does not allow one to call a particular measurement random). AIT succeeds in formalizing the notion of individual random sequence using a self-delimiting universal computer. However, we have to pay a price: if a more powerful computer is used - for instance, a computer supplied with an oracle for the Halting Problem - then the definition changes. Moreover, there is no hope of obtaining a "completely invariant" definition of random sequences because of Berry's paradox. In Bennett's words [29]:

The only escape is to conclude that the notion of definability or nameability cannot be completely formalized, while retaining its usual meaning.

Here are some more references: Barrow [15], Barrow and Tipler [18], Brown, Calude and Doran [79], Chaitin [118, 120, 121, 122], Davies [155], Davies and Gribbin [156], Davis and Hersh [160], Denbigh and Denbigh [172], Hawking [228], Levin [278], Li and Vitanyi [282], Mendès France [312], Penrose [331, 332], Peterson [334] and Svozil [391].

9.11 Metaphysical Themes

After physics, metaphysics ... Metaphysics is a branch of philosophy which studies the ultimate nature and structure of the world. Kant considered that the three fundamental concepts of metaphysics were the self, the world and God. The nature of God involves the problem of the infinity of God. This remark generated many important scholastic studies about the relation between the finite and the infinite.²⁸ In this context one can formulate one of the most intriguing questions:²⁹

Is the existence of God an axiom or a theorem?

²⁸ The work of Scotus [362] has to be specifically mentioned [389].

²⁹ A very interesting point of view is discussed in Odifreddi [323, 324]; see also Calude, Marcus and Ştefănescu [93].

Following the discussion in the preceding section we would like to suggest replacing the hypothesis of order by its opposite:

The Universe is Lawless.³⁰

³⁰ For a more elaborate discussion see Calude and Meyerstein [94]; for an original presentation of scientific knowledge from the perspective of AIT see Brisson and Meyerstein [48].

First let us note that the ancient Greeks and Romans would not have objected to the idea that the Universe is essentially governed by chance - in fact they made their gods play dice quite literally, by throwing dice in their temples, to see the will of gods; the Emperor Claudius even wrote a book on the art of winning at dice.³¹ Poincaré may have suspected and even understood the chaotic nature of our living Universe. More than 90 years ago he wrote:

If we knew exactly the laws of nature and the situation of the universe at the initial moment, we could predict exactly the situation of that universe at a succeeding moment. But even if it were the case that the natural law no longer had any secret for us, we could still only know the initial situation approximately. If that enabled us to predict the succeeding situation with the same approximation, that is all we require, that [it] is governed by the laws. But it is not always so; it may happen that small differences in the initial conditions produce very great ones in the final phenomena. A small error in the former will produce an enormous error in the latter. Prediction becomes impossible, and we have the fortuitous phenomenon.

Of course, one may discuss this hypothesis and appreciate its value (if any) by its fruitfulness. We may observe, following Davies [155],

apparently random events in nature may not be random at all ... Chaitin's theorem ensures we can never prove that the outcome of a sequence of quantum-mechanical measurements is actually random. It certainly appears random, but so do the digits of π. Unless you have the "code" or algorithm that reveals the underlying order, you might as well be dealing with something that is truly random. ... Might there be a "message" in this code that contains some profound secrets of the universe?

This type of argument - which is very appealing - has been used to reconcile "acts of God" with physical reality. Most of those discussions have been focused on quantum indeterminism, which in the light of AIT is a severe limitation. Randomness is omnipresent in the Universe, and by no means is it a mark of the microscopic Universe!

³¹ However, from the point of view of Christianity, playing dice with God was definitely a pagan practice - it violates the first commandment. St Augustine is reported to have said that nothing happens by chance, because everything is controlled by the will of God.
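Davies' remark that the digits of π appear random, although a short "code" generates them, can be tried out directly. The sketch below uses the unbounded spigot algorithm usually attributed to Gibbons; the choice of that algorithm, the sample size of 1000 digits and the use of digit frequencies as the only statistic are assumptions of this illustration, and passing such a crude check is of course no proof of randomness.

```python
from collections import Counter
from itertools import islice

def pi_digits():
    """Yield the decimal digits of pi (3, 1, 4, 1, 5, ...) one at a time.

    Unbounded spigot with exact integer arithmetic: a few lines of "code"
    that reveal the order behind a random-looking sequence.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            # Right-hand sides use the old values of q, r, n.
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

digits = list(islice(pi_digits(), 1000))
counts = Counter(digits)

if __name__ == "__main__":
    print("".join(map(str, digits[:20])))
    print(sorted(counts.items()))
```

Each of the ten digits turns up roughly one hundred times among the first thousand, much as in a sequence of fair ten-sided dice throws, and yet the whole sequence has very low program-size complexity.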


A famous parable due to John Wheeler and discussed by Davies [155] may illuminate our point. One day Wheeler was the subject in the game of 20 questions.³² Wheeler started asking simple questions: Is it big? Is it living? Eventually he guessed: Is it a cloud? And the answer came back "Yes" in a general burst of laughter. The players revealed their strategy: no word had been chosen, but they tried to answer his questions randomly, only keeping consistent with their previous answers. In the end an answer came out. The answer was not a priori determined - as a fair play of the game would require - but neither was it arbitrary: it resulted from Wheeler's questions and the players' binary answers, i.e. to a large extent by pure chance.

Going on to a more serious argument we mention Gödel [210], who discusses the essence of time. Under the influence of Einstein - during their stay at the Institute for Advanced Study in Princeton³³ - Gödel produced some new solutions for Einstein's gravitational field equations. His main conclusion is that the lapse of time might be unreal and illusory.³⁴ In his own words:

It seems that one obtains an unequivocal proof for the view of those philosophers who, like Parmenides and Kant, and the modern idealists, deny the objectivity of change and consider change as an illusion or an appearance due to our special perception.

His model describes a rotating Universe giving rise to space-time trajectories that loop back upon themselves. Time is not a straight linear sequence of events - as is commonly suggested by the arrow - but a curving line. There is no absolute space; matter has inertia only relative to other matter in the Universe.

By making a round trip on a rocket ship in a sufficiently wide curve, it is possible in these worlds to travel into any region of the past, present, and future, and back again.

32 Players agree on a word and the subject tries to guess that word by asking at most 20 questions. Only binary yes-no answers are allowed.

33 See the nice book by Regis [343].

34 Karl Svozil pointed out in [392] that "Gödel himself looked into celestial data for support of his solutions to the Einstein equations; physicists today tend to believe that the matter distribution of the universe rules out these solutions, but one never knows


It is to be remarked that the hypothesis of lawlessness offers a simpler way to deal with questions like: Does God exist? Is God omnipotent? Is God rational? Do the laws of physics contradict the laws of chance?

Finally, let us go back to the widely held conviction that the future is determined by the present, and therefore a careful study of the present allows us to unveil the future. As is clear, we do not subscribe to the first part of the statement, but we claim that our working hypothesis is consistent with the second part of it. We hope that the results presented in this book contribute to this assertion.

The above results support Chaitin's claim that randomness has pervaded the inner structure of mathematics! It is important to note that the above assertion does not mean a "mandate for revolution, anarchy, and license". It means that our notion of proof should be accordingly "modified". This point of view is consistent with the opinion expressed (30 years ago) by Gödel [212, 213]:

... besides mathematical intuition there exists another (though only probable) criterion of truth of mathematical axioms, namely their fruitfulness in mathematics, and one may add, possibly also in physics ... The simplest case of an application of the criterion under discussion arises when some ... axiom has number-theoretical consequences verifiable by computation up to any given integer.

... axioms need not be evident in themselves, but rather their justification lies (exactly as in physics) in the fact that they make it possible for these "sense perceptions" to be deduced. I think that ... this view has been largely justified by subsequent developments, and it is to be expected that it will be still more so in the future. It has turned out that the solution of certain arithmetical problems requires the use of assumptions essentially transcending arithmetic ... Of course, under these circumstances mathematics may lose a good deal of its "absolute certainty"; but, under the influence of the modern criticism of the foundations, this has already happened to a large extent

We end with an impressive remark made by Bridges [46]. Consider the following function f, defined on the set N of natural numbers:

$$
f(n) =
\begin{cases}
1, & \text{if the Continuum Hypothesis is true,} \\
0, & \text{if the Continuum Hypothesis is false.}
\end{cases}
$$

Deep work by Gödel [211] and Cohen [144] shows that neither the Continuum Hypothesis nor its negation can be proven within ZFC. According to classical logic, f is computable because there exists an algorithm that computes it: that algorithm is either the one which always produces 0, or else the one which always produces 1. The trouble is we cannot know the correct one! And, as the Continuum Hypothesis is independent of the axioms of ZFC - the standard framework for mathematics - we will never know which of the two algorithms actually computes f.

As the most recent developments show, the blend of logical and empirical-experimental arguments ("quasi-empirical mathematics" for Tymoczko [405, 406], Chaitin [132, 135] or "experimental mathematics" for Bailey and Borwein [9], Borwein [45]; see also Bailey, Borwein and Devlin [10]) may lead to a new way to understand (and practise) mathematics; see also Chaitin [126], Jaffe and Quinn [240] 35, Zeilberger [449] and Horgan [239].

35 One distinguishes between "theoretical mathematics" (referring to the speculative and intuitive work) and "rigorous mathematics" (the proof-oriented phase) in an attempt to build a framework assuring a positive role for speculation and experiment.
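Bridges' remark about the function f defined through the Continuum Hypothesis can be made concrete with a small sketch. The names `always_zero`, `always_one` and `CANDIDATES` are illustrative assumptions, not taken from the source. The point is only that both candidate algorithms are trivial to write down; classically one of them computes f on every input, yet nothing in the code tells us which one.

```python
def always_zero(n: int) -> int:
    """Candidate algorithm: ignores its input and always produces 0."""
    return 0


def always_one(n: int) -> int:
    """Candidate algorithm: ignores its input and always produces 1."""
    return 1


# Classically, exactly one of these two total computable functions equals f.
# Since the Continuum Hypothesis is independent of ZFC, no proof in ZFC
# can single out the right candidate.
CANDIDATES = (always_zero, always_one)

if __name__ == "__main__":
    for candidate in CANDIDATES:
        # Each candidate is a perfectly ordinary constant function.
        print(candidate.__name__, [candidate(n) for n in range(5)])
```

The sketch deliberately contains no function `f`: writing one would require deciding the Continuum Hypothesis, which is exactly what cannot be done within ZFC.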

Chapter 10

Open Problems

It's kind of fun to do the impossible.
Walt Disney

AIT raises a large number of challenging open problems; they are motivated both by the inner structure of the theory and by the interaction of the theory with other subjects.

1. We start with a group of problems communicated to us by Greg Chaitin:
a) Further develop AIT for enumeration computers; see Chaitin [116, 125], Solovay [376], Becher, Daicz and Chaitin [23] and Becher and Chaitin [22].
b) Discover interesting instances of randomness in other areas of mathematics, e.g. algebra, calculus or geometry.
c) Prove that a famous mathematical conjecture is unsolvable in the usual formalizations of number theory.
d) Develop formal definitions for intelligence and measures of its various components. Apply the AIT to AI.
e) Develop measures of self-organization and proofs that life must evolve. More precisely, set up a non-deterministic model universe, ... formally define what it means for a region of spacetime in that Universe to be an organism and what is its degree of organization, and ... rigorously demonstrate that, starting from simple initial conditions, organisms will appear and evolve in degree of organization in a reasonable amount of time and with high probability. See more in von Neumann [424], Chaitin [122, 125], Levy [279].

2. Study the class of functions f : A* -> A* such that f(x) is a random string whenever x is a random string.
3. Study the class of reals which can be approximated by computable sequences of rationals converging monotonically.
4. How large is the class of finitely refutable mathematical problems?
5. We have seen that the program-size complexity can be used to study the rate of convergence of computable sequences of rationals. It would be interesting to apply these ideas to questions of physical interest (as in Pour-El and Richards [338] and Weihrauch and Zhong [432]). For example, is it possible to construct problems which on computable and low program-size complexity inputs have noncomputable solutions with high complexity, perhaps even random solutions?
6. Extend the invariance of randomness with respect to natural positional representations to other types of representations.
7. (Conjecture) In the context of GIT, the class of true but unprovable statements is "large" in probabilistic terms.
8. Define and study the symmetry of random strings and sequences. Is the absence of symmetry related to randomness? See in this respect Marcus [299].
9. Do arbitrary CA of higher dimension preserve non-randomness?
10. Analyse the behaviour of CA with respect to the complexity of finite patterns.
11. We have seen that surjective CA are measure-preserving with respect to the uniform measure, hence they are dynamical systems in the sense of ergodic theory. For non-surjective CA one has to consider other measures in order to apply results from ergodic theory. For an application of ergodic theory to CA see Lind [284], Cervelle, Durand and Formenti [109], Dubacq, Durand and Formenti [175], Galato [204] and V'yugin [425]. It seems to be very interesting to combine AIT and ergodic theory to study CA and other dynamical systems; see, for example, Brudno [50], White [433] and Batterman and White [21].
12. Construct a simpler Diophantine equation satisfying Theorem 8.6.
13. Find an appropriate notion of "pseudo-random sequence of reals" such that the zeros of Riemann's zeta-function form a pseudo-random sequence. A meaningful definition should be base invariant and a "pseudo-random sequence of reals" should be uniformly distributed modulo 1.

For other open problems see Chaitin [122, 132], Uspensky [407], Downey [181] and Downey and Hirschfeldt [182].
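Problem 13 asks that a "pseudo-random sequence of reals" be uniformly distributed modulo 1. As a rough illustration of what that requirement means (a sketch only, not an approach to the open problem), the Python fragment below measures how often the fractional parts of a sequence fall into a subinterval [a, b) and compares that frequency with the interval length b - a. The test sequence n·√2 is our own choice, used because it is a classical example of a sequence uniformly distributed modulo 1; the helper names `fractional_parts` and `empirical_frequency` are hypothetical, and nothing here involves the zeros of the zeta-function.

```python
import math


def fractional_parts(alpha: float, count: int) -> list:
    """Return the fractional parts of n * alpha for n = 1, ..., count."""
    return [(n * alpha) % 1.0 for n in range(1, count + 1)]


def empirical_frequency(points: list, a: float, b: float) -> float:
    """Fraction of the points that fall into the half-open interval [a, b)."""
    hits = sum(1 for x in points if a <= x < b)
    return hits / len(points)


if __name__ == "__main__":
    points = fractional_parts(math.sqrt(2), 10_000)
    for a, b in [(0.0, 0.25), (0.3, 0.8), (0.9, 1.0)]:
        freq = empirical_frequency(points, a, b)
        # For a sequence uniformly distributed modulo 1, the frequency
        # approaches the interval length b - a as the count grows.
        print(f"[{a}, {b}): frequency {freq:.4f}, length {b - a:.4f}")
```

A finite check of this kind can only suggest uniform distribution; the definition concerns the limit over all subintervals of [0, 1).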

Bibliography

[1] D. Adams. The Hitch Hiker's Guide to the Galaxy, Pan Books, London, 1979.
[2] E. Akin. The spiteful computer: a determinism paradox, Math. Intelligencer 14 (1992), 45-47.
[3] P. Andreasen. Universal Source Coding, Masters Thesis, University of Copenhagen, Denmark, 2001.
[4] A. Arslanov. On a conjecture of M. Van Lambalgen, EATCS Bull. 62 (1997), 195-198.
[5] A. Arslanov. Contributions to Algorithmic Information Theory, Ph.D. Thesis, University of Auckland, New Zealand, 1998.
[6] A. Arslanov. On the phenomenon of autocomputability, Electron. Notes Theor. Comput. Sci. 31 (2001), 1-14.
[7] A. Arslanov, C. Calude. Program-size complexity computes the halting problem II, EATCS Bull. 57 (1995), 199-200.
[8] J. P. Azra, B. Jaulin. Recursivite, Gauthier-Villars, Paris, 1973.
[9] D. H. Bailey, J. M. Borwein. Experimental mathematics: recent developments and future outlook, in World Mathematical Year 2000 Book, Springer-Verlag, Berlin, to appear; see also http://www.cecm.sfu.ca/projects/IntegerRelations/2001/future.html.

[10] D. H. Bailey, J. Borwein, K. Devlin. The Experimental Mathematician, A K Peters, Wellesley, MA, to appear.
[11] D. H. Bailey, R. E. Crandall. On the random character of fundamental constant expansions, Exp. Math. 10, 2 (2001), 175-190.


[12] D. H. Bailey, R. E. Crandall. Random generators and normal numbers, Manuscript, March 2002. http://www.nersc.gov/~dhbailey/dhbpapers/bcnormal.pdf.
[13] T. Baker, J. Gill, R. Solovay. Relativizations of the P =? NP question, SIAM J. Comput. 4 (1975), 431-442.
[14] J. Balcazar, J. Diaz, J. Gabarro. Structural Complexity I, Springer-Verlag, Heidelberg, 1995.
[15] J. Barrow. Pi in the Sky, Clarendon Press, Oxford, 1992.
[16] J. Barrow. Impossibility - The Limits of Science and the Science of Limits, Oxford University Press, Oxford, 1998.
[17] J. Barrow. Mathematical jujitsu: some informal thoughts about Gödel and physics, Complexity 5 (2000), 28-34.
[18] J. Barrow, F. J. Tipler. The Anthropic Cosmological Principle, Oxford University Press, Oxford, 1986.
[19] J.-P. Barthelemy, G. Cohen, A. Lobstein. Complexite algorithmique et problemes de communications, Masson, Paris, 1992.
[20] S. Bassein. A sampler of randomness, Amer. Math. Monthly 103, 6 (1996), 483-490.
[21] R. W. Batterman, H. S. White. Chaos and algorithmic complexity, Found. Phys. 26 (1996), 307-336.
[22] V. Becher, G. Chaitin. Another example of higher order randomness, CDMTCS Research Report 187, 2002, 16pp.
[23] V. Becher, S. Daicz, G. Chaitin. A highly random number, in C. S. Calude, M. J. Dinneen, S. Sburlan (eds.). Combinatorics, Computability and Logic, Proceedings of DMTCS'01, Springer-Verlag, London, 2001, 55-68.
[24] V. Becher, S. Figueira. An example of a computable absolutely normal number, Theoret. Comput. Sci. 270 (2002), 947-958.
[25] E. Beltrami. What Is Random? Chance and Order in Mathematics and Life, Copernicus, New York, 1999.
[26] C. H. Bennett. The thermodynamics of computation - a review, Int. J. Theoret. Physics 21 (1982), 905-940.


[27] C. H. Bennett. Dissipation, information, computational complexity and the definition of organization, in D. Pines (ed.). Emerging Syntheses in Science, Proc. Workshop, Santa Fe Institute, 1985, 297-313.
[28] C. H. Bennett. Logical depth and physical complexity, in R. Herken (ed.). The Universal Turing Machine. A Half-Century Survey, Oxford University Press, Oxford, 1988, 227-258.
[29] C. H. Bennett. E-mail to C. Calude, 25 April 1993.
[30] C. H. Bennett. Chaitin's Omega, in M. Gardner (ed.). Fractal Music, Hypercards, and More ..., W. H. Freeman, New York, 1992, 307-319.
[31] C. H. Bennett, P. Gacs, M. Li, P. M. Vitanyi, W. H. Zurek. Thermodynamics of computation and information distance, Proc. STOC'93, 21-30.
[32] C. H. Bennett, M. Gardner. The random number omega bids fair to hold the mysteries of the universe, Scientific American 241 (1979), 20-34.
[33] C. H. Bennett, J. Gill. Relative to a random oracle A, P^A ≠ NP^A ≠ co-NP^A, with probability one, SIAM J. Comput. 10 (1981), 96-113.
[34] J. L. Bentley, A. C. Yao. An almost optimal algorithm for unbounded search, Inf. Proc. Lett. 5 (1976), 82-87.
[35] J. Bernoulli. The law of the large numbers, in J. R. Newman (ed.). The World of Mathematics, Vol. 3, Simon and Schuster, New York, 1956, 1452-1455.
[36] J. Berstel, D. Perrin. Theory of Codes, Academic Press, New York, 1985.
[37] M. Blum. On the size of machines, Inform. and Control 11 (1967), 257-265.
[38] R. V. Book. On languages reducible to algorithmically random languages, SIAM J. Comput. 23 (1994), 1275-1282.

[39] R. V. Book, J. Lutz, K. Wagner. On complexity classes and algorithmically random languages, Proc. STACS-92, Lecture Notes Comput. Sci. 577, Springer-Verlag, Berlin, 1992, 319-328.
[40] E. Borel. Les probabilites denombrables et leurs applications arithmetiques, Rend. Circ. Mat. Palermo 27 (1909), 247-271.
[41] E. Borel. Leçons sur la theorie des fonctions, Gauthier-Villars, Paris, 1914 (2nd ed).
[42] E. Borel. Le hasard, Alcan, Paris, 1928.
[43] E. Borel. Les paradoxes de l'infini, Gallimard, Paris, 1946.

[44] E. Borger. Computability, Complexity, Logic, North-Holland, Amsterdam, 1989.
[45] J. M. Borwein. The experimental mathematician: the pleasure of discovery and the role of proof, Int. J. Comput. Math. Learning, to appear.
[46] D. S. Bridges. Computability: A Mathematical Sketchbook, Springer-Verlag, Berlin, 1994.
[47] D. S. Bridges, F. Richman. Varieties of Constructive Mathematics, Cambridge University Press, Cambridge, 1987.
[48] L. Brisson, F. W. Meyerstein. Inventing the Universe, State University of New York Press, New York, 1995.
[49] R. H. Brown. Does God play dice?, in P. A. P. Moran (ed.). Chance in Nature, Australian Academy of Science, Sydney, 1979, 29-34.
[50] A. A. Brudno. Entropy and the complexity of the trajectories of a dynamical system, Trans. Moscow Math. Soc. 2 (1983), 127-151.
[51] C. Calude. Theories of Computational Complexity, North-Holland, Amsterdam, 1988.
[52] C. Calude. Meanings and texts: an algorithmic metaphor, in M. Balat, J. Deledalle-Rhodes (eds.). Signs of Humanity, Mouton de Gruyter, Berlin, 1992, 95-97.


[53] C. Calude. Borel normality and algorithmic randomness, in G. Rozenberg, A. Salomaa (eds.). Developments in Language Theory, World Scientific, Singapore, 1994, 113-129 (with a note by G. J. Chaitin).
[54] C. Calude. What is a random string? J. UCS 1 (1995), 48-66.
[55] C. Calude. What is a random string? - Extended Abstract, in W. Depauli-Schimanovich, E. Koehler, F. Stadler (eds.). The Foundational Debate, Complexity and Constructivity in Mathematics and Physics, Kluwer, Dordrecht, 1995, 101-113.
[56] C. Calude. Algorithmic information theory: open problems, J. UCS 2 (1996), 439-441.
[57] C. Calude. Computability and information, in E. Craig (ed.). Routledge Encyclopedia of Philosophy, Routledge, London, Vol. 2 (1998), 477-482.
[58] C. S. Calude. Who is afraid of randomness?, in E. von Collani (ed.). Millennial Symposium 'Defining the Science of Stochastics', Wuerzburg University, 2000, 99-122.
[59] C. S. Calude. A glimpse into algorithmic information theory, in L. Cavedon, P. Blackburn, N. Braisby, A. Shimojima (eds.). Logic, Language and Computation, Vol. 3, CSLI Series, CSLI Lectures Notes 111, Stanford, 2000, 67-83.
[60] C. S. Calude. A characterization of c.e. random reals, Theoret. Comput. Sci. 217 (2002), 3-14.
[61] C. S. Calude. Incompleteness, complexity, randomness and beyond, Minds and Machines, to appear; see also CDMTCS Research Report 166, 2001, 11pp.
[62] C. S. Calude. Chaitin Ω numbers, Solovay machines and incompleteness, in K.-I. Ko, A. Nerode, K. Weihrauch (eds.). "Computability and Complexity in Analysis", Theoret. Comput. Sci. 284 (2002), 269-277.
[63] C. Calude, C. Campeanu. Note on the topological structure of random strings, Theoret. Comput. Sci. 112 (1993), 383-390.
[64] C. Calude, C. Campeanu. Are binary codings universal?, Complexity 1 (1996), 47-50.


[65] C. S. Calude, J. L. Casti. Parallel thinking, Nature 392 (1998), 549-551.
[66] C. S. Calude, G. J. Chaitin. Randomness everywhere, Nature 400, 22 July (1999), 319-320.
[67] C. Calude, I. Chiţescu. Strong noncomputability of random strings, Int. J. Comput. Math. 11 (1982), 43-45.
[68] C. Calude, I. Chiţescu. Probabilities on the space of sequences, Technical Report 103, 1994, Computer Science Department, University of Auckland, New Zealand, 10pp.
[69] C. Calude, I. Chiţescu. Random sequences: some topological and measure-theoretical properties, An. Univ. Bucureşti Mat.-Inf. 2 (1988), 27-32.
[70] C. Calude, I. Chiţescu. A combinatorial characterization of P. Martin-Lof tests, Int. J. Comput. Math. 17 (1988), 53-64.
[71] C. Calude, I. Chiţescu. Upper limitation of Kolmogorov complexity and universal P. Martin-Lof tests, J. Comput. Math. 7 (1989), 61-70.
[72] C. Calude, I. Chiţescu. Qualitative properties of P. Martin-Lof random sequences, Boll. Unione Mat. Ital. (7) 3-B (1989), 229-240.
[73] C. Calude, I. Chiţescu, L. Staiger. P. Martin-Lof tests: representability and embeddability, Rev. Roumaine Math. Pures Appl. 30 (1985), 719-732.
[74] C. S. Calude, R. J. Coles. On a theorem of Solovay, CDMTCS Research Report 94, 1999, 14pp.
[75] C. S. Calude, R. J. Coles. Program-size complexity of initial segments and domination relation reducibility, in J. Karhumaki, H. A. Maurer, G. Paun, G. Rozenberg (eds.). Jewels Are Forever, Springer-Verlag, Berlin, 1999, 225-237.
[76] C. S. Calude, R. Coles, P. H. Hertling, B. Khoussainov. Degree-theoretic aspects of computably enumerable reals, in S. B. Cooper, J. K. Truss (eds.). Models and Computability, Cambridge University Press, Cambridge, 1999, 23-39.


[77] C. S. Calude, M. J. Dinneen, C.-K. Shu. Computing a glimpse of randomness, Exp. Math., to appear in 2002; see also CDMTCS Research Report 167, 2001, 12pp.
[78] C. S. Calude, M. J. Dinneen, K. Svozil. Reflections on quantum computing, Complexity 6, 1 (2000), 35-37.
[79] C. Calude, R. W. Doran. Does God play dice?, EATCS Bull. 50 (1993), 338-341.
[80] C. S. Calude, Monica Dumitrescu. Entropic measures, Markov information sources and complexity, Appl. Math. Comput., to appear in 2002; see also CDMTCS Research Report 150, 2001, 12pp.
[81] C. Calude, C. Grozea. Kraft-Chaitin inequality revisited, J. UCS 2 (1996), 306-310.
[82] C. S. Calude, P. Hertling. Computable approximations of reals: an information-theoretic analysis, Fundam. Informaticae 33 (1998), 1-16.
[83] C. S. Calude, P. Hertling, H. Jurgensen, K. Weihrauch. Randomness on full shift spaces, Chaos, Solitons, Fractals, 12/3 (2001), 491-503.
[84] C. Calude, P. Hertling, B. Khoussainov. Do the zeros of Riemann's zeta-function form a random sequence?, EATCS Bull. 62 (1997), 199-207.
[85] C. S. Calude, P. Hertling, B. Khoussainov, Y. Wang. Recursively enumerable reals and Chaitin Ω numbers, Theoret. Comput. Sci. 255 (2001), 125-149. Also in M. Morvan, C. Meinel, D. Krob (eds.). STACS'98, Paris, 1998, Lecture Notes Comput. Sci. 1373, Springer-Verlag, Berlin, 1998, 596-606.
[86] C. Calude, J. Hromkovic. Complexity: a language-theoretic point of view, in G. Rozenberg, A. Salomaa (eds.). Handbook of Formal Languages, Vol. II, Springer-Verlag, Berlin, 1997, 1-60.
[87] C. S. Calude, H. Ishihara, T. Yamaguchi. Minimal programs are almost optimal, Int. J. Found. Comput. Sci. 12, 4 (2001), 479-489.

[88] C. Calude, G. Istrate. Determining and stationary sets for some classes of partial recursive functions, Theoret. Comput. Sci. 82 (1991), 151-155.
[89] C. Calude, H. Jurgensen. Randomness as an invariant for number representations, in H. Maurer, J. Karhumäki, G. Rozenberg (eds.). Results and Trends in Theoretical Computer Science, Springer-Verlag, Berlin, 1994, 44-66.
[90] C. S. Calude, H. Jurgensen, S. Legg. Solving finitely refutable mathematical problems, in C. S. Calude, G. Paun (eds.). Finite Versus Infinite. Contributions to an Eternal Dilemma, Springer-Verlag, London, 2000, 39-52.
[91] C. Calude, H. Jurgensen, M. Zimand. Is independence an exception?, Appl. Math. Comput. 66 (1994), 63-76.
[92] C. Calude, E. Kurta. On Kraft-Chaitin inequality, Rev. Roumaine Math. Pures Appl. 35 (1990), 597-604.
[93] C. Calude, S. Marcus, D. Ştefănescu. The Creator versus its creation: from Scotus to Gödel, Collegium Logicum. Annals of the Kurt-Gödel-Society, Vol. 3, Institute of Computer Science, AS CR Prague, Vienna, 1999, 1-10.

[94] C. S. Calude, F. W. Meyerstein. Is the universe lawful?, Chaos, Solitons, Fractals 10 (1999), 1075-1084.
[95] C. S. Calude, A. Nies. Chaitin Ω numbers and strong reducibilities, J. UCS 3 (1997), 1161-1166.
[96] C. S. Calude, G. Paun. Computing with Cells and Atoms, Taylor & Francis, London, 2001.
[97] C. S. Calude, B. Pavlov. Coins, quantum measurements, and Turing's barrier, Quantum Inf. Process. 1, 1-2 (2002), 107-127; see also CDMTCS Research Report 170, 2001, 15pp.
[98] C. Calude, A. Salomaa. Algorithmically coding the universe, in G. Rozenberg, A. Salomaa (eds.). Developments in Language Theory, World Scientific, Singapore, 1994, 472-492.
[99] C. S. Calude, I. Tomescu. Optimum extendible prefix codes, J. UCS 3 (1997), 1167-1179.


[100] C. Calude, T. Zamfirescu. The typical number is a lexicon, New Zealand J. Math. 27 (1998), 7-13.
[101] C. S. Calude, T. Zamfirescu. Most numbers obey no probability laws, Publ. Mathematicae Debrecen 54 Supplement (1999), 619-623.
[102] C. Calude, M. Zimand. A relation between correctness and randomness in the computation of probabilistic algorithms, Int. J. Comput. Math. 16 (1984), 47-53.
[103] J. L. Casti. Paradigms Lost, Avon Books, New York, 1990.
[104] J. L. Casti. Searching for Certainty, William Morrow, New York, 1990.
[105] J. L. Casti. Computing the uncomputable, New Scientist, 154/2082, 17 May (1997), 34.
[106] J. Casti. Five More Golden Rules: Knots, Codes, Chaos, and Other Great Theories of 20th-Century Mathematics, John Wiley & Sons, New York, 2000.
[107] C. Campeanu. Topological Methods in Complexity Theory, Ph.D. Thesis, Bucharest University, Romania, 1995.
[108] C. Campeanu. Random numbers are Borel normal, EATCS Bull. 58 (1996), 155-158.
[109] J. Cervelle, B. Durand, E. Formenti. Algorithmic information theory and cellular automata dynamics, in J. Sgall, A. Pultr, P. Kolman (eds.). Proc. MFCS'2001, Lecture Notes Comput. Sci. 2136, Springer-Verlag, Heidelberg, 2001, 248-260.
[110] G. J. Chaitin. On the length of programs for computing finite binary sequences, J. Assoc. Comput. Mach. 13 (1966), 547-569. (Reprinted in: [122], 219-244.)
[111] G. J. Chaitin. On the length of programs for computing finite binary sequences: statistical considerations, J. Assoc. Comput. Mach. 16 (1969), 145-159. (Reprinted in: [122], 245-260.)
[112] G. J. Chaitin. Computational complexity and Gödel's incompleteness theorem, Notices Amer. Math. Soc. 17 (1970), 672. (Reprinted in: [122], 284.)


[113] G. J. Chaitin. Information-theoretic limitations of formal systems, J. Assoc. Comput. Mach. 21 (1974), 403-424. (Reprinted in: [122], 113-128.)
[114] G. J. Chaitin. A theory of program size formally identical to information theory, J. Assoc. Comput. Mach. 22 (1975), 329-340. (Reprinted in: [122], 197-223.)
[115] G. J. Chaitin. Randomness and mathematical proof, Scientific American 232 (1975), 47-52. (Reprinted in: [122], 3-13.)
[116] G. J. Chaitin. Algorithmic entropy of sets, Comput. Math. Appl. 2 (1976), 233-245. (Reprinted in: [122], 153-168.)
[117] G. J. Chaitin. Information-theoretic characterizations of recursive infinite strings, Theoret. Comput. Sci. 2 (1976), 45-48. (Reprinted in: [122], 203-206.)
[118] G. J. Chaitin. Algorithmic information theory, IBM J. Res. Dev. 21 (1977), 350-359, 496. (Reprinted in: [122], 44-58.)
[119] G. J. Chaitin. Toward a mathematical definition of "life", in R. D. Levine, M. Tribus (eds.). The Maximum Entropy Formalism, MIT Press, Cambridge, MA, 1979, 477-498. (Reprinted in: [122], 92-110.)
[120] G. J. Chaitin. Gödel's theorem and information, Int. J. Theoret. Physics 21 (1982), 941-954. (Reprinted in: [122], 61-71.)
[121] G. J. Chaitin. Algorithmic Information Theory, Cambridge University Press, Cambridge, 1987 (3rd printing 1990).
[122] G. J. Chaitin. Information, Randomness and Incompleteness: Papers on Algorithmic Information Theory, World Scientific, Singapore, 1987 (2nd ed, 1990).
[123] G. J. Chaitin. Incompleteness theorems for random reals, Adv. Appl. Math. 8 (1987), 119-146. (Reprinted in: [122], 129-152.)
[124] G. J. Chaitin. Randomness in arithmetic, Scientific American 259 (1988), 80-85. (Reprinted in: [122], 14-19.)
[125] G. J. Chaitin. Information-Theoretic Incompleteness, World Scientific, Singapore, 1992.


[126] G. J. Chaitin. Randomness in arithmetic and the decline and fall of reductionism in pure mathematics, EATCS Bull. 50 (1993), 314-328.
[127] G. J. Chaitin. On the number of N-bit strings with maximum complexity, Appl. Math. Comput. 59 (1993), 97-100.
[128] G. J. Chaitin. The Berry paradox, Complexity 1 (1995), 26-30.
[129] G. J. Chaitin. Program-size complexity computes the halting problem I, EATCS Bull. 57 (1995), 199-200.
[130] G. J. Chaitin. The Limits of Mathematics, Springer-Verlag, Singapore, 1998.
[131] G. J. Chaitin. The Unknowable, Springer-Verlag, Singapore, 2000.
[132] G. J. Chaitin. Exploring Randomness, Springer-Verlag, London, 2001.
[133] G. J. Chaitin. Personal communication to C. S. Calude, December 2001.
[134] G. J. Chaitin. Conversations with a Mathematician, Springer-Verlag, London, 2002.
[135] G. J. Chaitin. Computers, paradoxes and the foundations of mathematics, American Sci. 90, March-April (2002), 164-171.
[136] G. J. Chaitin. Meta-mathematics and the foundations of mathematics, CDMTCS Research Report 182, 2002, 14pp.
[137] G. J. Chaitin, J. T. Schwartz. A note on Monte-Carlo primality tests and algorithmic information theory, Comm. Pure Appl. Math. 31 (1978), 521-527. (Reprinted in: [122], 197-202.)
[138] D. G. Champernowne. The construction of decimals normal in the scale of ten, J. London Math. Soc. 8 (1933), 254-260.
[139] M. Chown. The Omega man, New Scientist 10 March (2001), 29-31.
M. Chown. Smash and grab, New Scientist 6 April (2002), 24-28.
[140] K. L. Chung. Elementary Probability Theory with Stochastic Processes, Springer-Verlag, New York (3rd ed, 1979).


[141] A. Church. On the concept of a random sequence, Bull. Amer. Math. Soc. 46 (1940), 130-135.
[142] B. Cipra. Prime formula weds number theory and quantum physics, Science 274 (20 December) (1996), 2014-2015.
[143] D. E. Cohen. Computability and Logic, Ellis Horwood, John Wiley & Sons, New York, 1987.
[144] P. J. Cohen. Set Theory and the Continuum Hypothesis, Benjamin, New York, 1966.
[145] A. Connes, A. Lichnerowicz, M. P. Schützenberger. Triangle of Thoughts, American Mathematical Society, Providence, RI, 2001.
[146] A. H. Copeland, P. Erdos. Note on normal numbers, Bull. Amer. Math. Soc. 52 (1946), 857-860.
[147] J. Copeland. The modern history of computing, in Edward N. Zalta (ed.). The Stanford Encyclopedia of Philosophy (Fall 1999 Edition), http://plato.stanford.edu/entries/computing-history/.
[148] J. Copeland. Narrow versus wide mechanism: Including a reexamination of Turing's views on the mind-machine issue, J. Philos. XCVI, 1 (2000), 5-32.
[149] N. C. A. da Costa, F. A. Doria. Undecidability and incompleteness in classical mechanics, Int. J. Theoret. Physics 30 (1991), 1041-1073.
[150] T. M. Cover. Universal gambling schemes and the complexity measures of Kolmogorov and Chaitin, Technical Report 12, 1974, Stanford University, CA, 29pp.
[151] T. M. Cover, P. Gacs, R. M. Gray. Kolmogorov's contributions to information theory and algorithmic complexity, Ann. Probab. 17 (1989), 840-865.
[152] T. M. Cover, J. Y. Thomas. Elements of Information Theory, John Wiley & Sons, New York, 1991.
[153] I. Csiszar, J. Korner. Information Theory, Academic Press, New York, 1981.


[154] G. Davie. Recursive events in random sequences, Arch. Math. Logic, 40, 8 (2001), 629-638.
[155] P. Davies. The Mind of God, Science and the Search for Ultimate Meaning, Penguin Books, London, 1992.
[156] P. Davies, J. Gribbin. The Matter Myth. Beyond Chaos and Complexity, Penguin Books, London, 1992.
[157] M. Davis. What is a computation?, in L. A. Steen (ed.). Mathematics Today: Twelve Informal Essays, Springer-Verlag, New York, 1978, 241-267.
[158] M. Davis, H. Putnam, J. Robinson. The decision problem for exponential diophantine equations, Ann. Math. 74 (1961), 425-436.
[159] M. Davis, Yu. V. Matiyasevich, J. Robinson. Hilbert's tenth problem. Diophantine equations: positive aspects of a negative solution, in F. E. Browder (ed.). Mathematical Developments Arising from Hilbert Problems, American Mathematical Society, Providence, RI, 1976, 323-378.
[160] P. J. Davis, R. Hersh. The Mathematical Experience, Birkhauser, Boston, 1981.
[161] J. W. Dawson, Jr. Logical Dilemmas. The Life and Work of Kurt Gödel, A K Peters, Wellesley, MA, 1997.
[162] J. W. Dawson, Jr. E-mail to C. Calude, 21 May 1997.
[163] K. De Leeuw, E. F. Moore, C. E. Shannon, N. Shapiro. Computability by probabilistic machines, in C. E. Shannon, J. McCarthy (eds.). Automata Studies, Princeton University Press, Princeton, NJ, 1956, 183-212.
[164] J.-P. Delahaye. Information, Complexite et Hasard, Hermes, Paris, 1994.
[165] J.-P. Delahaye. Les nombres omega, Pour la Science, 292 May (2002), 98-103.
[166] C. Dellacherie. Nombres au hazard. De Borel a Martin-Loef, Gaz. Math., Soc. Math. France 11 (1978), 23-58.


[167] W. A. Dembski. Randomness, in E. Craig (ed.). Routledge Encyclopedia of Philosophy, Routledge, London, Vol. 8 (1998), 56-59.
[168] O. Demuth. On constructive pseudorandom numbers, Comment. Math. Univ. Carolin. 16 (1975), 315-331 (in Russian).
[169] O. Demuth. On some classes of arithmetical real numbers, Comment. Math. Univ. Carolin. 23 (1982), 453-465 (in Russian).
[170] O. Demuth. Reducibilities of sets based on constructive functions of a real variable, Comment. Math. Univ. Carolin. 26 (1988), 143-156.
[171] O. Demuth. Remarks on the structure of tt-degrees based on the construction of measure theory, Comment. Math. Univ. Carolin. 29 (1988), 233-247.
[172] K. G. Denbigh, J. S. Denbigh. Entropy in Relation to Incomplete Knowledge, Cambridge University Press, Cambridge, 1985.
[173] M. Denker, M. W. Woyczynski, B. Ycart. Introductory Statistics and Random Phenomena: Uncertainty, Complexity, and Chaotic Behavior in Engineering and Science, Birkhauser, Boston, 1998.
[174] M. Detlefsen. Gödel's theorems, in E. Craig (ed.). Routledge Encyclopedia of Philosophy, Routledge, London, Vol. 4 (1998), 106-119.
[175] J.-C. Dubacq, B. Durand, E. Formenti. Kolmogorov complexity and cellular automata classification, Theoret. Comput. Sci. 259 (2001), 271-285.
[176] D. Deutsch. Quantum theory, the Church-Turing principle and the universal quantum computer, Proc. R. Soc. London A400 (1985), 97-117.
[177] A. K. Dewdney. A computer trap for the busy beaver, the hardest-working Turing machine, Scientific American 251 (1984), 10-17.
[178] L. E. Dickson. History of the Theory of Numbers, 3 volumes, Carnegie Institute, Washington, DC, 1919, 1920, 1923.
[179] P. A. M. Dirac. The Principles of Quantum Mechanics, Oxford University Press, Oxford, 1930.


[180] R. Downey. An invitation to structural complexity, New Zealand J. Math. 21 (1992), 33-89.
[181] R. G. Downey. Some computability-theoretical aspects of reals and randomness, CDMTCS Research Report 173, 2002, 42pp.
[182] R. Downey, D. Hirschfeldt. Algorithmic Randomness and Complexity, Springer-Verlag, Berlin, in preparation.
[183] R. Downey, D. R. Hirschfeldt, G. L. LaForte. Randomness and reducibility, in J. Sgall, A. Pultr, P. Kolman (eds.). Proc. MFCS'2001, Lecture Notes Comput. Sci. 2136, Springer-Verlag, Heidelberg, 2001, 316-327.
[184] R. Downey, D. R. Hirschfeldt, A. Nies. Randomness, computability and density, in A. Ferreira, H. Reichel (eds.). Proc. STACS 2001, Springer-Verlag, Berlin, 2001, 195-205; full paper to appear in SIAM J. Comput.
[185] R. Downey, G. L. LaForte. Presentations of computably enumerable reals, in K.-I. Ko, A. Nerode, K. Weihrauch (eds.). "Computability and Complexity in Analysis", Theoret. Comput. Sci. 284 (2002), 539-555.
[186] S. Dragomir. E-mail to C. Calude, 5 December 2001.
[187] R. M. Dudley. Real Analysis and Probability, Wadsworth & Brooks/Cole, Pacific Grove, CA, 1989.
[188] N. Duta. Representability and embeddability of P. Martin-Löf tests, Stud. Cercet. Mat. 47 (1995), 245-262.
[189] G. Etesi, I. Nemeti. Non-Turing computations via Malament-Hogarth space-times, Int. J. Theoretical Physics 41 (2002), 341-370.
[190] S. Feferman, J. Dawson, Jr., S. C. Kleene, G. H. Moore, R. M. Solovay, J. van Heijenoort (eds.). Kurt Gödel Collected Works, Vol. I, Oxford University Press, New York, 1986.
[191] S. Feferman, J. Dawson, Jr., S. C. Kleene, G. H. Moore, R. M. Solovay, J. van Heijenoort (eds.). Kurt Gödel Collected Works, Vol. II, Oxford University Press, New York, 1990.


[192] W. Feller. An Introduction to Probability Theory and Its Applications, Vol. 1, Chapman & Hall, London; John Wiley & Sons, New York (3rd ed, 1968).
[193] J. Ford. How random is a random coin toss?, Phys. Today 36 (1983), 40-47.
[194] W. L. Fouche. Descriptive complexity and reflective properties of combinatorial configurations, J. London Math. Soc. 54 (1996), 199-208.
[195] M. Ferbus-Zanda, S. Grigorieff. Is randomness "native" to computer science?, EATCS Bull. 74 (2001), 78-118.
[196] R. Feynman. Simulating physics with computers, Int. J. Theoret. Physics 21 (1982), 467-488.
[197] T. L. Fine. Theories of Probability. An Examination of Foundations, Academic Press, New York, 1973.
[198] E. Fredkin, T. Toffoli. Conservative logic, Int. J. Theoret. Physics 21 (1982), 219-255.
[199] P. Gacs. On the symmetry of algorithmic information, Sov. Math. Dokl. 15 (1974), 1477-1480; correction, ibid. 15 (1974), 1480.
[200] P. Gacs. Exact expressions for some randomness tests, Z. Math. Logik Grundlag. Math. 26 (1980), 385-394.
[201] P. Gacs. On the relation between descriptional complexity and algorithmic probability, Theoret. Comput. Sci. 22 (1983), 71-93.
[202] P. Gacs. Every sequence is reducible to a random one, Inform. and Control 70 (1986), 186-192.
[203] P. Gacs. Lecture Notes on Descriptional Complexity and Randomness, Boston University, 1988, 62pp.
[204] S. Galato. A proof of the Beyer-Stein-Ulam relation between complexity and entropy, Discrete Mathematics 223 (2000), 367-372.
[205] M. Gardner. A collection of tantalizing fallacies of mathematics, Scientific American 198 (1958), 92.
[206] M. Gardner. Fractal Music, Hypercards, and More ..., W. H. Freeman, New York, 1992, 307-319.


[207] M. Garey, D. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, New York, 1979.
[208] W. L. Gewirtz. Investigations in the theory of descriptive complexity, Report NSO-5, Courant Institute of Mathematical Sciences, New York University, 1974, 60pp.
[209] J. Gill. Computational complexity of probabilistic Turing machines, SIAM J. Comput. 6 (1976), 675-695.
[210] K. Gödel. An example of a new type of cosmological solutions of Einstein's field equations of gravitation, Rev. Modern Physics 21 (1949), 447-450. (Reprinted in: [191], 190-198.)
[211] K. Gödel. The Consistency of the Continuum Hypothesis, Princeton University Press, Princeton, NJ, 1940.
[212] K. Gödel. Russell's mathematical logic, in P. Benacerraf, H. Putnam (eds.). Philosophy of Mathematics, Prentice-Hall, Englewood Cliffs, NJ, 1964, 211-232. (Reprinted in: [191], 119-141.)
[213] K. Gödel. What is Cantor's continuum problem?, in P. Benacerraf, H. Putnam (eds.). Philosophy of Mathematics, Prentice-Hall, Englewood Cliffs, NJ, 1964, 258-273. (Reprinted in: [191], 176-187.)
[214] R. L. Graham, B. L. Rothschild, J. H. Spencer. Ramsey Theory, John Wiley & Sons, New York (2nd ed, 1990).
[215] E. Grosswald (ed.). Collected papers of Hans Rademacher, MIT Press, Cambridge, MA, 1974, 454-455.
[216] C. Grozea. Free-extendible prefix-free sets and an extension of the Kraft-Chaitin theorem, J. UCS 6 (2000), 130-135.
[217] J. Gruska. Foundations of Computing, Thomson International Computer Press, Boston, 1997.
[218] J. Gruska. Quantum Computing, McGraw-Hill, London, 1999.
[219] S. Guiaşu. Information Theory and Applications, McGraw-Hill, New York, 1977.


[220] P. R. Halmos. Measure Theory, Van Nostrand, Princeton, NJ, 1950. (Reprinted: Springer-Verlag, Berlin, 1974.)
[221] D. Hammer. Complexity Inequalities, Wissenschaft & Technik Verlag, Berlin, 1998.
[222] G. H. Hardy. Goldbach's theorem, Mat. Tid. B 1 (1922), 1-16. (Reprinted in: Collected Papers of G. H. Hardy, Vol. 1, Oxford University Press, Oxford, 1966, 545-560.)
[223] G. H. Hardy, E. M. Wright. An Introduction to the Theory of Numbers, Clarendon Press, Oxford (5th ed, 1979).
[224] J. Hartmanis. Generalized Kolmogorov complexity and the structure of feasible computations, in Proc. 24th IEEE Symp. Foundations of Computer Science, 1983, 439-445.
[225] J. Hartmanis, L. Hemachandra. On sparse oracles separating feasible complexity classes, Inf. Process. Lett. 28 (1988), 291-295.
[226] J. Hartmanis, L. Hemachandra, S. A. Kurtz. Complexity corresponds to topology, Technical Report 88-17, University of Chicago, 1988, 12pp.
[227] J. Hartmanis, J. E. Hopcroft. Independence results in computer science, SIGACT News 8 (1976), 13-24.
[228] S. W. Hawking. A Brief History of Time: From the Big Bang to Black Holes, Bantam Press, London, 1988.
[229] L. Hemachandra, M. Ogihara. The Complexity Theory Companion, Springer-Verlag, Heidelberg, 2002.
[230] L. Hemaspaandra, M. Zimand. Strong self-reducibility precludes strong immunity, Math. Syst. Theory 29 (1996), 535-548.
[231] P. Hertling. Disjunctive ω-words and real numbers, J. UCS 2 (1996), 549-568.
[232] P. Hertling. Surjective functions on computably growing Cantor sets, J. UCS 3 (1997), 1226-1240.
[233] P. Hertling. Simply normal numbers to different bases, J. UCS 2 (2002), 235-242.


[234] P. Hertling, Y. Wang. Invariance properties of random sequences, J. UCS 12 (1997), 1241-1249.
[235] P. Hertling, K. Weihrauch. Randomness spaces, in K. G. Larsen, S. Skyum, G. Winskel (eds.). Automata, Languages and Programming, Proc. 25th Int. Coll., ICALP'98, Aalborg, Denmark, July 1998, Springer-Verlag, Berlin, 1998, 796-807.
[236] J. G. Hey (ed.). Feynman and Computation. Exploring the Limits of Computers, Perseus Books, Reading, MA, 1999.
[237] E. Hlawka. The Theory of Uniform Distribution, A B Academic Publishers, Zurich, 1984.
[238] C.-K. Ho. Relatively recursive reals and real functions, Theoret. Comput. Sci. 219 (1999), 99-120.
[239] J. Horgan. The death of proof, Scientific American 269 (1993), 74-82.
[240] A. Jaffe, F. Quinn. "Theoretical mathematics": toward a cultural synthesis of mathematics and theoretical physics, Bull. Amer. Math. Soc. 29 (1993), 1-13.
[241] D. S. Jones. Elementary Information Theory, Clarendon Press, Oxford, 1979.
[242] J. P. Jones, Yu. V. Matiyasevich. Register machine proof of the theorem on exponential diophantine representation of enumerable sets, J. Symb. Logic 49 (1984), 818-829.
[243] D. Juedes, J. Lathrop, J. Lutz. Computational depth and reducibility, Theoret. Comput. Sci. 132 (1994), 37-70.
[244] H. Jurgensen, J. Duske. Codierungstheorie, BI, Mannheim, 1977.
[245] H. Jurgensen, G. Thierrin. Some structural properties of ω-languages, 13th Natl. School with Int. Participation "Applications of Mathematics in Technology", Sofia, 1988, 56-63.
[246] H. Jurgensen, H. J. Shyr, G. Thierrin. Disjunctive ω-languages, EIK 19 (1983), 267-278.
[247] T. Kamae. On Kolmogorov's complexity and information, Osaka J. Math. 10 (1973), 305-307.


[248] H. P. Katseff. Complexity dips in random infinite binary sequences, Inform. and Control 38 (1978), 258-263.
[249] H. P. Katseff, M. Sipser. Several results in program size complexity, Theoret. Comput. Sci. 15 (1981), 291-309.
[250] S. Kautz. Degrees of Random Sets, Ph.D. Thesis, Cornell University, Ithaca, NY, 1991.
[251] J. L. Kelley. General Topology, Van Nostrand, Princeton, NJ, 1955.
[252] A. I. Khinchin. Mathematical Foundations of Information Theory, Dover, New York, 1957.
[253] B. Khoussainov. Randomness, computability, and algebraic specifications, Ann. Pure Appl. Logic 91 (1998), 1-15.
[254] T. D. Kieu. Hilbert's incompleteness, Chaitin's Ω number and quantum physics, Los Alamos preprint archive http://arXiv:quant-ph/0111062 v1, 10 November 2001.
[255] D. E. Knuth. The Art of Computer Programming, Vol. 2, Seminumerical Algorithms, Addison-Wesley, Reading, MA (2nd ed, 1981).
[256] D. E. Knuth. Supernatural numbers, in D. A. Klarner (ed.). The Mathematical Gardner, Prindle, Weber & Schmidt, Wadsworth, Boston, MA, 1981, 310-325.
[257] D. E. Knuth. Theory and practice, EATCS Bull. 27 (1985), 14-21.
[258] Ker-I Ko. Complexity of Real Functions, Birkhäuser, Berlin, 1991.
[259] A. N. Kolmogorov. Three approaches for defining the concept of "information quantity", Probl. Inf. Transm. 1 (1965), 3-11.
[260] A. N. Kolmogorov. Logical basis for information theory and probability theory, IEEE Trans. Inf. Theory 14 (1968), 662-664.
[261] A. N. Kolmogorov, V. A. Uspensky. Algorithms and randomness, Theory Probab. Appl. 32 (1988), 389-412. (Two corrections in: Uspensky [407], p. 102.)


[262] L. G. Kraft. A Device for Quantizing Grouping and Coding Amplitude Modulated Pulses, MS Thesis, MIT, Cambridge, MA, 1949.
[263] I. Kramosil. Recursive classification of pseudo-random sequences, Kybernetika (Prague) 20 (1984), 1-34 (supplement).
[264] I. Kramosil, J. Sindelar. Infinite pseudo-random sequences of high algorithmic complexity, Kybernetika (Prague) 20 (1984), 429-437.
[265] A. Kucera. Measure, $\Pi^0_1$-classes and complete extensions of PA, in H.-D. Ebbinghaus, G. H. Müller, G. E. Sacks (eds.). Recursion Theory Week, Proceedings, Oberwolfach 1984, Lecture Notes Math. 1141, Springer-Verlag, Berlin, 1985, 245-259.
[266] A. Kucera, T. A. Slaman. Randomness and recursive enumerability, SIAM J. Comput. 31 (2001), 199-211.
[267] A. Kucera, S. Terwijn. Lowness for the class of random sets, J. Symb. Logic 64 (1999), 1396-1402.
[268] L. Kuipers, H. Niederreiter. Uniform Distribution of Sequences, John Wiley & Sons, New York, 1974.
[269] M. Kummer. On the complexity of random strings, in C. Puech, R. Reischuk (eds.). Proceedings of STACS'96, Lecture Notes Comput. Sci. 1046, Springer-Verlag, Berlin, 1996, 25-38.
[270] M. Kummer. Kolmogorov complexity and instance complexity of recursively enumerable sets, SIAM J. Comput. 25 (1996), 1123-1143.
[271] R. Landauer. Uncertainty principle and minimal energy dissipation in the computer, Int. J. Theoret. Physics 21 (1982), 283-297.
[272] R. Landauer. Computation: a fundamental physical view, Physica Scripta 35 (1987), 88-95.
[273] P. S. Laplace. A Philosophical Essay on Probability Theories, Dover, New York, 1951.
[274] M. Lerman, J. B. Remmel. The universal splitting property, I, in D. van Dalen, D. Lascar, T. J. Smiley (eds.). Logic Colloquium '80, North-Holland, Amsterdam, 1982, 181-208.


[275] M. Lerman, J. B. Remmel. The universal splitting property, II, J. Symb. Logic 49 (1984), 137-150.
[276] S. K. Leung-Yan-Cheong, T. M. Cover. Some equivalences between Shannon entropy and Kolmogorov complexity, IEEE Trans. Info. Theory 24 (1978), 331-338.
[277] L. A. Levin. On the notion of random sequence, Sov. Math. Dokl. 14 (1973), 1413-1416.
[278] L. A. Levin. Randomness conservation inequalities: information and independence in mathematical theories, Probl. Inf. Transm. 10 (1974), 206-210.
[279] S. Levy. Artificial Life, Pantheon Books, New York, 1992.
[280] M. Li, P. M. Vitanyi. Kolmogorov complexity and its applications, in J. van Leeuwen (ed.). Handbook of Theoretical Computer Science, Vol. A, North-Holland, Amsterdam, MIT Press, Boston, 1990, 187-254.
[281] M. Li, P. M. Vitanyi. Combinatorics and Kolmogorov complexity, Proc. 6th IEEE Structure in Complexity Theory Conf., 1991, 154-163.
[282] M. Li, P. M. Vitanyi. An Introduction to Kolmogorov Complexity and Its Applications, Springer-Verlag, Berlin, 1993 (2nd ed, 1997).
[283] X. Li. Effective immune sets, program index sets and effectively simple sets: generalizations and applications of the recursion theorem, in C.-T. Chong, M. J. Wicks (eds.). South-East Asian Conference on Logic, Elsevier, Amsterdam, 1983, 97-106.
[284] D. Lind. Applications of ergodic theory and sofic systems to cellular automata, Physica D 10 (1984), 36-44.
[285] D. Lind, B. Marcus. An Introduction to Symbolic Dynamics and Coding, Cambridge University Press, Cambridge, 1995.
[286] L. Longpre. Resource bounded Kolmogorov complexity: a link between computational complexity and information theory, Technical Report 86-776, Cornell University, August 1986, 101pp.


[287] L. Longpre and V. Kreinovich. Zeros of Riemann's zeta function are uniformly distributed, but not random: an answer to Calude's open problem, EATCS Bull. 59 (1996), 163-164.
[288] J. H. Loxton. A method of Mahler in transcendence theory and some of its applications, Bull. Austral. Math. Soc. 29 (1984), 127-136.
[289] J. H. Lutz. Almost everywhere high nonuniform complexity, J. Comput. Syst. Sci. 44 (1992), 220-258.
[290] J. H. Lutz. The quantitative structure of exponential time, Proc. Eighth Annual Structure in Complexity Theory Conf., San Diego, CA, May 18-21, 1993, IEEE Computer Society Press, 1993, 158-175.
[291] J. H. Lutz, E. Mayordomo. Measure, stochasticity, and the density of hard languages, SIAM J. Comput. 23 (1994), 762-779.
[292] A. I. Mal'cev. Algorithms and Recursive Functions, Wolters-Noordhoff, Groningen, 1970.
[293] V. Manca. Logica Matematica, Bollati Boringhieri, 2001.
[294] I. Mandoiu. Kraft-Chaitin's theorem for free-extendable codes, St. Cerc. Mat. 44 (1992), 497-501.
[295] I. Mandoiu. On a theorem of Gacs, Int. J. Comput. Math. 48 (1993), 157-169.
[296] I. Mandoiu. Optimum extensions of prefix codes, Inf. Process. Lett. 66 (1998), 35-40.
[297] H. B. Marandijan. Selected topics in recursive function theory in computer science, ID - TR 75, Technical University of Denmark, Lyngby, 1990, 93pp.
[298] S. Marcus (ed.). Contextual Ambiguities in Natural & Artificial Languages, Comm. & Cognition, Vol. 2, Ghent, Belgium, 1983.
[299] S. Marcus. Symmetry in the simplest case: the real line, Computers Math. Applic. 17 (1989), 103-115.
[300] G. Markowsky. An introduction to algorithmic information theory: its history and some examples, Complexity 2 (1997), 14-22.


[301] P. Martin-Löf. Algorithms and Random Sequences, Erlangen University, Nürnberg, Erlangen, 1966.
[302] P. Martin-Löf. The definition of random sequences, Inform. and Control 9 (1966), 602-619.
[303] P. Martin-Löf. Notes on Constructive Mathematics, Almqvist & Wiksell, Stockholm, 1970.
[304] P. Martin-Löf. Complexity oscillations in infinite binary sequences, Z. Wahrscheinlichkeitstheorie Verw. Geb. 19 (1971), 225-230.
[305] P. Martin-Löf. On the notion of randomness, in A. Kino, J. Myhill, R. E. Vesley (eds.). Intuitionism and Proof Theory, North-Holland, Amsterdam, 1970, 73-78.
[306] A. Maruoka, M. Kimura. Conditions for injectivity of global maps for tessellation automata, Information and Control 32 (1976), 158-162.
[307] A. Maruoka, M. Kimura. Injectivity and surjectivity of parallel maps for cellular automata, J. Comput. Syst. Sci. 18 (1979), 47-64.
[308] H. Marxen, J. Buntrock. Attacking the busy beaver 5, EATCS Bull. 40 (1990), 247-251.
[309] Yu. V. Matiyasevich. Hilbert's Tenth Problem, MIT Press, Cambridge, MA, 1993.
[310] M. Mendes-France. Nombres normaux. Applications aux fonctions pseudo-aléatoires, J. Anal. Math. Jerusalem 20 (1967), 1-56.
[311] M. Mendes-France. Suites de nombres au hasard (d'après Knuth), Séminaire de Théorie des Nombres, Exposé 6, 1974, 1-11.
[312] M. Mendes-France. The Planck constant of a curve, in J. Belair, S. Dubuc (eds.). Fractal Geometry and Analysis, Kluwer Academic, Boston, 1991, 325-366.
[313] A. R. Meyer. Program size in restricted programming languages, Inform. and Control 21 (1972), 322-394.


[314] G. L. Miller. Riemann's hypothesis and tests for primality, J. Comput. Syst. Sci. 13 (1976), 300-317.
[315] C. Moore. Generalized shifts: unpredictability and undecidability in dynamical systems, Nonlinearity 4 (1991), 199-230.
[316] E. F. Moore. Machine models of self reproduction, Proc. Symp. Appl. Math., American Mathematical Society, 14 (1962), 17-33.
[317] A. Muchnik, A. Semenov, V. A. Uspensky. Mathematical metaphysics of randomness, Theoret. Comput. Sci. 207 (1998), 263-317.
[318] I. P. Natanson. Theory of Functions of A Real Variable, Frederick Ungar, New York, 1955.
[319] M. Nivat. Infinite words, infinite trees, infinite computations, in J. W. De Bakker, J. van Leeuwen (eds.). Foundations of Computer Science III, Mathematical Centre Tracts 109, Amsterdam, 1979, 3-52.
[320] I. Niven, H. S. Zuckerman. On the definition of normal numbers, Pacific J. Math. 1 (1951), 103-110.
[321] P. Odifreddi. Classical Recursion Theory, Vol. 1, North-Holland, Amsterdam, 1989.
[322] P. Odifreddi. Classical Recursion Theory, Vol. 2, North-Holland, Amsterdam, 1999.
[323] P. Odifreddi. La prova di dio, Manuscript, January 1994, 8pp.
[324] P. Odifreddi. Ultrafilters, dictators, and Gods, in C. S. Calude, G. Paun (eds.). Finite Versus Infinite. Contributions to an Eternal Dilemma, Springer-Verlag, London, 2000, 255-262.
[325] V. P. Orevkov. A new proof of the uniqueness theorem for constructive differentiable functions of a complex variable, Zap. Nauchn. Sem. LOMI 40 (1974), 119-126 (in Russian); English translation in J. Sov. Math. 8 (1977), 329-334.
[326] J. C. Oxtoby. Measure and Category, Springer-Verlag, Berlin, 1971.


[327] J. C. Oxtoby, S. M. Ulam. Measure-preserving homeomorphisms and metrical transitivity, Ann. Math. 42 (1941), 874-925.
[328] H. R. Pagels. The Dreams of Reason, Bantam Books, New York, 1989.
[329] J. A. Paulos. Beyond Numeracy, Vintage Books, Random House, New York, 1992.
[330] I. Percival. Chaos: a science for the real world, in N. Hall (ed.). New Scientist Guide to Chaos, Penguin Books, London, 1991, 11-21.
[331] R. Penrose. The Emperor's New Mind. Concerning Computers, Minds, and the Laws of Physics, Oxford University Press, Oxford, 1989.
[332] R. Penrose. Precis of The Emperor's New Mind. Concerning Computers, Minds, and the Laws of Physics (together with responses by critics and a reply by the author), Behav. Brain Sci. 13 (1990), 643-705.
[333] R. Penrose. Computability and the Mind, 1993 Forder Lecture, Auckland University, 30 April 1993.
[334] I. Peterson. Islands of Truth: A Mathematical Mystery Cruise, W. H. Freeman, New York, 1990.
[335] S. Porrot, M. Dauchet, B. Durand, N. K. Vereshchagin. Deterministic rational transducers and random sequences, Theoret. Comput. Sci. 378 (1998), 258-272.
[336] E. L. Post. Recursively enumerable sets of positive integers and their decision problems, Bull. Amer. Math. Soc. (New Series) 50 (1944), 284-316.
[337] E. L. Post. Absolutely unsolvable problems and relatively undecidable propositions: account of an anticipation, in M. Davis (ed.). The Undecidable, Raven Press, New York, 1965, 340-433.
[338] M. Pour-El, I. Richards. Computability in Analysis and Physics, Springer-Verlag, Berlin, 1989.
[339] V. Pratt. Every prime has a succinct certificate, SIAM J. Comput. 4 (1975), 214-220.


[340] P. Raatikainen. On interpreting Chaitin's incompleteness theorem, J. Philos. Logic 27 (1998), 569-586.
[341] M. O. Rabin. Probabilistic algorithms, in J. F. Traub (ed.). Algorithms and Complexity: New Directions and Recent Results, Academic Press, New York, 1976, 21-39.
[342] D. L. Renfro. A Study of Porous and Sigma-Porous Sets, Longman, 2002, to appear.
[343] E. Regis. Who Got Einstein's Office? Eccentricity and Genius at the Institute for Advanced Study, Penguin Books, New York, 1989.
[344] R. Rettinger, X. Zheng, R. Gengler, B. von Braunmühl. Monotonically computable real numbers, in C. S. Calude, M. J. Dinneen, S. Sburlan (eds.). Combinatorics, Computability and Logic, Proc. DMTCS'01, Springer-Verlag, London, 2001, 187-202.
[345] H. Rice. Recursive reals, Proc. Amer. Math. Soc. 5 (1954), 784-791.
[346] B. Riemann. Über die Anzahl der Primzahlen unter einer gegebenen Grösse, in Gesammelte mathematische Werke und wissenschaftlicher Nachlass, Springer-Verlag, Berlin, 1990, 177-185.
[347] H. Rogers. Theory of Recursive Functions and Effective Computability, McGraw-Hill, New York, 1967.
[348] G. Rozenberg, A. Salomaa. Cornerstones of Undecidability, Prentice-Hall, Englewood Cliffs, NJ, 1994.
[349] R. Rucker. Infinity and the Mind, Bantam Books, New York, 1983.
[350] R. Rucker. Mind Tools, Houghton Mifflin, Boston, 1987.
[351] D. Ruelle. Chance and Chaos, Princeton University Press, Princeton, NJ, 1991.
[352] B. Russell. Mathematical logic as based on the theory of types, Amer. J. Math. 30 (1908), 222. (Reprinted in: [410], 153.)


[353] G. E. Sacks. A simple set which is not effectively simple, Proc. Amer. Math. Soc. 15 (1964), 51-55.
[354] A. M. Salagean-Mandache. A geometrical proof of Kraft-Chaitin theorem, An. Univ. Bucureşti Mat. 39/40 (1990/91), 3 Matematica-Informatica, 90-97.
[355] A. Salomaa. Computation and Automata, Cambridge University Press, Cambridge, 1985.
[356] A. Salomaa. Public-Key Cryptography, Springer-Verlag, Berlin, 1990 (2nd ed, 1996).
[357] R. Schack. Algorithmic information and simplicity in statistical physics, Int. J. Theor. Physics 36 (1997), 209-226.
[358] J. Schmidhuber. Algorithmic theories of everything, Los Alamos preprint archive http://arXiv:quant-ph/0011122, 30 November 2000.
[359] C. P. Schnorr. Zufälligkeit und Wahrscheinlichkeit: Eine algorithmische Behandlung der Wahrscheinlichkeitstheorie, Lecture Notes Math. 218, Springer-Verlag, Berlin, 1971.
[360] C. P. Schnorr. Process complexity and effective random tests, J. Comput. Syst. Sci. 7 (1973), 376-388.
[361] C. P. Schnorr. A survey of the theory of random sequences, in R. E. Butts, J. Hintikka (eds.). Basic Problems in Methodology and Linguistics, Reidel, Dordrecht, 1977, 193-210.
[362] D. Scotus. Philosophical Writings, Nelson, New York, 1962.
[363] G. Segre. The definition of a random sequence of qubits: from noncommutative algorithmic probability theory to quantum algorithmic information theory and back, Los Alamos preprint archive http://arXiv:quant-ph/0009009 v3, 7 November 2000.
[364] C. E. Shannon. A mathematical theory of communication, Bell Syst. Tech. J. 27 (1948), 379-423, 623-656.
[365] A. Shen. A strange application of Kolmogorov complexity, Manuscript, 1993, 4pp.


[366] H. Siegelmann. Computation beyond the Turing limit, Science 268 (1995), 545-548.
[367] M. Sipser. A complexity-theoretic approach to randomness, Proc. 15th Annual ACM Symp. Theory of Computing, 1983, 330-335.
[368] M. Sipser. Introduction to the Theory of Computation, PWS Publishing, Boston, 1997.
[369] T. A. Slaman. Random implies Ω-like, Manuscript, 14 December 1998, 2pp.
[370] R. M. Smullyan. Effectively simple sets, Proc. Amer. Math. Soc. 15 (1964), 893-895.
[371] R. I. Soare. Recursion theory and Dedekind cuts, Trans. Amer. Math. Soc. 140 (1969), 271-294.
[372] R. I. Soare. Recursively Enumerable Sets and Degrees. A Study of Computable Functions and Computably Generated Sets, Springer-Verlag, Berlin, 1987.
[373] R. J. Solomonoff. A formal theory of inductive inference, Part 1 and Part 2, Inform. and Control 7 (1964), 1-22, 224-254.
[374] R. J. Solomonoff. Complexity-based induction systems: comparisons and convergence theorems, IEEE Trans. Inf. Theory 24 (1978), 422-432.
[375] R. M. Solovay. Draft of a paper (or series of papers) on Chaitin's work ... done for the most part during the period of Sept.-Dec. 1974, Manuscript, IBM Thomas J. Watson Research Center, Yorktown Heights, New York, May 1975, 215pp.
[376] R. M. Solovay. On random r.e. sets, in A. I. Arruda, N. C. A. Da Costa, R. Chuaqui (eds.). Non-Classical Logics, Model Theory and Computability, North-Holland, Amsterdam, 1977, 283-307.
[377] R. M. Solovay. A version of Ω for which ZFC can not predict a single bit, in C. S. Calude, G. Paun (eds.). Finite Versus Infinite. Contributions to an Eternal Dilemma, Springer-Verlag, London, 2000, 323-334.
[378] R. Solovay, V. Strassen. A fast Monte Carlo test for primality, SIAM J. Comput. 6 (1977), 84-85. Erratum: 7 (1978), 118.


[379] E. Specker. Nicht konstruktiv beweisbare Satze der Analysis, J. Symb. Logic 14 (1949), 145-158.
[380] I. Stewart. The Problems of Mathematics, Oxford University Press, Oxford, New York, 1992.
[381] I. Stewart. Deciding the undecidable, Nature 352 (1991), 664-665.
[382] L. Staiger. Representable Martin-Löf tests, Kybernetika 21 (1985), 235-243.
[383] L. Staiger. Kolmogorov complexity and Hausdorff dimension, Inform. and Comput. 103 (1993), 159-194.
[384] L. Staiger. ω-languages, in G. Rozenberg, A. Salomaa (eds.). Handbook of Formal Languages, Vol. III, Springer-Verlag, Berlin, 1997, 339-387.
[385] L. Staiger. A tight upper bound on Kolmogorov complexity by Hausdorff dimension and uniformly optimal prediction, Theory Comput. Syst. 31 (1998), 215-229.
[386] L. Staiger. The Kolmogorov complexity of real numbers, in G. Ciobanu, Gh. Paun (eds.). Proc. Fundamentals of Computation Theory, Lecture Notes Comput. Sci. 1684, Springer-Verlag, Berlin, 1999, 536-546.
[387] L. Staiger. The Kolmogorov complexity of Liouville numbers, CDMTCS Research Report 096, 1999, 11pp.
[388] L. Staiger. How large is the set of disjunctive sequences? J. UCS 8 (2002), 348-362.
[389] D. Ştefănescu. Scotus, E-mail to C. Calude, 12 May 1993.
[390] K. Svozil. The quantum coin toss-testing microphysical undecidability, Phys. Lett. A143 (1990), 433-437.
[391] K. Svozil. Randomness & Undecidability in Physics, World Scientific, Singapore, 1993.
[392] K. Svozil. E-mail to C. Calude, 14 June 1993.
[393] K. Svozil. Halting probability amplitude of quantum computers, J. UCS 1 (1995), 201-203.


[394] K. Svozil. Quantum information theory, J. UCS 5 (1996), 311-346.
[395] K. Svozil. The Church-Turing thesis as a guiding principle for physics, in C. S. Calude, J. Casti, M. J. Dinneen (eds.). Unconventional Models of Computation, Springer-Verlag, Singapore, 1998, 371-385.
[396] A. Szilard. Private communication to C. Calude, 10 November 1993.
[397] K. Tadaki. A generalization of Chaitin's halting probability Ω and halting self-similar sets, Hokkaido Math. J. 31 (2002), 219-253.
[398] F. J. Tipler. The Omega point as Eschaton: answers to Pannenberg's questions for scientists, Zygon 24 (1989), 241-242.
[399] M. R. Titchener. Construction and properties of the augmented and binary-depletion codes, IEE Proc. 132 (1984), 163-169.
[400] T. Toffoli. Physics and computation, Int. J. Theoret. Physics 21 (1982), 165-175.
[401] T. Toffoli, N. Margolus. Invertible cellular automata: a review, Physica D 45 (1990), 229-253.
[402] J. F. Traub, G. W. Wasilkowski, H. Wozniakowski. Information-Based Complexity, Academic Press, New York, 1988.
[403] J. F. Traub, A. G. Werschulz. Complexity and Information, Cambridge University Press, Cambridge, 1998.
[404] A. M. Turing. On computable numbers with an application to the Entscheidungsproblem, Proc. London Math. Soc. 42 (1936-7), 230-265; a correction, ibid. 43 (1937), 544-546.
[405] T. Tymoczko. The four-colour problem and its philosophical significance, J. Philos. 2 (1979), 57-83.
[406] T. Tymoczko. New Directions in the Philosophy of Mathematics, Birkhäuser, Boston, 1986 (2nd ed, 1998).
[407] V. A. Uspensky. Complexity and entropy: an introduction to the theory of Kolmogorov complexity, in [428], 86-102.


[408] V. A. Uspensky. Kolmogorov and mathematical logic, J. Symb. Logic 57 (1992), 385-412.
[409] V. A. Uspensky, A. Shen. Relations between varieties of Kolmogorov complexities, Math. Syst. Theory 29 (1996), 271-292.
[410] J. van Heijenoort (ed.). From Frege to Gödel. A Source Book in Mathematical Logic, 1879-1931, Harvard University Press, Cambridge, MA, 1967.
[411] M. van Lambalgen. Von Mises' definition of random sequences reconsidered, J. Symb. Logic 52 (1987), 725-755.
[412] M. van Lambalgen. Algorithmic information theory, J. Symb. Logic 54 (1989), 1389-1400.
[413] M. van Lambalgen. The axiomatization of randomness, J. Symb. Logic 55 (1990), 1143-1167.
[414] A. van der Poorten. Notes on Fermat's Last Theorem, Wiley Interscience, New York, 1996.
[415] N. K. Vereshchagin. Kolmogorov Complexity, Universität Würzburg, 1998, 116pp, http://www-info4.informatik.uni-wuerzburg.de/veranstalt/.
[416] N. K. Vereshchagin. An enumerable undecidable set with low prefix complexity: a simplified proof, http://lpcs.math.msu.ru/~ver/papers/calude.ps.
[417] B. Vidakovic. Algorithmic complexity, universal priors and Ockham's Razor, Resenhas do Instituto de Matematica e Estatistica da Universidade de Sao Paolo 3, 4 (1998), 359-390.
[418] J. Ville. Étude critique de la notion de collectif, Gauthier-Villars, Paris, 1939.
[419] P. M. Vitanyi. Quantum Kolmogorov complexity based on classical descriptions, IEEE Trans. Inf. Theory 47, 6 (2001), 2464-2479.
[420] S. B. Volchan. The algorithmic theory of randomness, Amer. Math. Monthly 1 (2002), 46-63.


[421] R. von Mises. Probability, Statistics and Truth, G. Allen and Unwin, London; Macmillan, New York (2nd revised English edition prepared by Hilda Geiringer), 1961.
[422] R. von Mises. Mathematical Theory of Probability and Statistics, edited and complemented by Hilda Geiringer, Academic Press, New York, 1974.
[423] J. von Neumann. The Computer and the Brain, Silliman Lectures Series, Yale University Press, New Haven, CT, 1958.
[424] J. von Neumann. Theory of Self-Reproducing Automata, edited and complemented by A. W. Burks, University of Illinois Press, Urbana, 1966.
[425] V. V. V'yugin. Ergodic theorems for individual random sequences, Theoret. Comput. Sci. 207 (1998), 343-361.
[426] K. Wagner, G. Wechsung. Computational Complexity, D. Reidel, Dordrecht, 1986.
[427] A. Wald. Die Widerspruchsfreiheit des Kollectivbegriffes, Ergeb. math. Kolloq. 8 (1937), 38-72.
[428] O. Watanabe (ed.). Kolmogorov Complexity and Computational Complexity, Springer-Verlag, Berlin, 1992.
[429] K. Weihrauch. Computability, Springer-Verlag, Berlin, 1987.
[430] K. Weihrauch. The degrees of discontinuity of some translators between representations of the real numbers, Inf.-Ber. 129, Fern Universität Hagen, 1992.
[431] K. Weihrauch. Computable Analysis, Springer-Verlag, Berlin, 2000.
[432] K. Weihrauch, N. Zhong. Is the linear Schrödinger propagator Turing computable?, in J. Blank, V. Brattka, P. Hertling (eds.). Computability and Complexity in Analysis, Lecture Notes Comput. Sci. 2064, Springer-Verlag, Heidelberg, 2000, 248-260.
[433] H. S. White. Algorithmic complexity of points in dynamical systems, Ergodic Theory Dyn. Syst. 13 (1993), 807-830.

[434] C. P. Williams, S. H. Clearwater. Ultimate Zero and One, Copernicus, New York, 2000.
[435] D. G. Willis. Computational complexity and probability constructions, J. Assoc. Comput. Mach. 17 (1970), 241-259.
[436] L. Wittgenstein. Selections from "Remarks on the Foundations of Mathematics", in P. Benacerraf, H. Putnam (eds.). Philosophy of Mathematics: Selected Readings, Prentice-Hall, Englewood Cliffs, NJ, 1964, 421-480.
[437] S. Wolfram. Universality and complexity in cellular automata, Physica D 10 (1984), 1-35.
[438] S. Wolfram. Origins of randomness in physical systems, Physical Rev. Lett. 55 (1985), 298-301.
[439] S. Wolfram. A New Kind of Science, Wolfram Media, 2002.
[440] D. Wood. Theory of Computation, Harper & Row, New York, 1987.
[441] J. B. Wright, E. G. Wagner, J. W. Thatcher. A uniform approach to inductive posets and inductive closure, Theoret. Comput. Sci. 7 (1978), 57-77.
[442] G. Wu. Prefix-free languages and initial segments of computably enumerable degrees, in J. Wang (ed.). COCOON 2001, Lecture Notes Comput. Sci. 2108, Springer-Verlag, Heidelberg, 2001, 576-585.
[443] http://www.informatik.uni-giessen.de/staff/richstein/ca/Goldbach.html.
[444] http://www.hipilib.de/zeta.
[445] E. H. Yang, S. Y. Shen. Chaitin complexity, Shannon-information content of a single event, and infinite random sequences 1, Sci. China Ser. A 34, 10 (1991), 1183-1193.
[446] E. H. Yang, S. Y. Shen. Chaitin complexity, Shannon-information content of a single event, and infinite random sequences 2, Sci. China Ser. A 34, 11 (1991), 1307-1319.

[447] S. Yi-Ting. A "natural" enumeration of non-negative rational numbers: an informal discussion, Amer. Math. Monthly 87 (1980), 25-29.
[448] T. Zamfirescu. Porosity in convexity, Real Anal. Exch. 15 (1989/90), 424-436.
[449] D. Zeilberger. Theorems for a price: tomorrow's semi-rigorous mathematical culture, Notices Amer. Math. Soc. 40 (1993), 978-981.
[450] X. Zheng. Closure properties of real number classes under limits and computable operators, in D. Z. Du (eds.). Proc. COCOON 2000, Lecture Notes Comput. Sci. 1858, Springer-Verlag, Heidelberg, 2000, 170-179.
[451] M. Zimand. On the topological size of random strings, Z. Math. Logik Grundlag. Math. 32 (1986), 81-88.
[452] M. Zimand. Positive Relativizations and Baire Classification, Ph.D. Thesis, Bucharest University, Romania, 1991.
[453] M. Zimand. If not empty, NP \ P is topologically large, Theoret. Comput. Sci. 119 (1993), 293-310.
[454] W. H. Zurek. Thermodynamic cost of computation, algorithmic complexity and the information metric, Nature 341 (1989), 119-124.
[455] A. Zvonkin, L. A. Levin. The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms, Usp. Mat. Nauk 156 (1970), 85-127.

Notation Index

N, 1
N+, 1
Q, 1
R, 1
R+, 1
I, 1
⌊α⌋, 1
⌈α⌉, 1
log = ⌊log_2⌋, 1
string(n), 1
|x|, 1
log_Q, 1
A*, 1
#, 1
rem(m, i), 1
C, 1
(~), 1
string_Q(n), 2
φ: X → Y, 3
φ(x) < ∞, 3
φ(x) = ∞, 3
+ --<, 3
A^ω, 3
x, 3
dom(φ), 3
O(1), 3
f ≤ g + O(1), 3
graph(φ), 3
range(φ), 3
xA^ω, 3
SA^ω, 3
x*, 37
H_C, 37
K
P_C, 47
0,(C,y;w), 70
0,(C), 71
m(x), 74
RAND e, 105
RAND;;', 105
0, 115
Ni, 124
rand(V), 168
rand, 169
A^∞, 204
Halt, 284
a, 284
Arith, 317
U, 391

Subject Index

absolute algorithmic probability, 47 absolute complexity, 36 absolute disjunctive real, 254 Algorithmic Coding Theorem, 75 average code-string length, 78 Bachmann's notation, 3 Baker's map, 366 binary self-delimiting version of a string, 23 Boolean algebra, 10 - σ-algebra, 10 Boolean ring, 10 - σ-ring, 10 Borel - m-normal sequence, 194 - normal - sequence, 194 - set, 10 - string, 125 bounded ambiguity, 145 busy beaver function, 341 canonical program, 37 Cantor set, 220 c.e. Turing degree, 283 Chaitin complexity - absolute, 36 - self-delimiting, 28 - conditional, 38 - Omega Number, 237, 310 Chernoff's bound, 173 closure operator, 7 code-string, 54 computer, 33, 35 - Chaitin, 36, 307 - universal (Chaitin), 36 co-dense set, 7

co-meagre set, 7 co-rare set, 7 computable - approximation, 284 - real, 260 computably - enumerable (c.e.) real, 271 - enumerable (c.e.) set, 5 - growing Cantor set, 220 - rare set, 131 conditional - complexity, 38 - probability, 9 - program-size complexity, 38 constructive sequence of constructively open sets, 172 constructively - first category set, 203 - immune set, 120 - megaporous set, 258 - null set, 172 - open set, 172 - residual, 255 - σ-megaporous set, 258 continuous function, 204 converge computably, 260 deficiency of randomness function, 115 degree of a real, 285 dense set, 7 Diophantine set, 5 disjunctive in base, 253 domain, 3 domination relation, 271 effectively immune set, 120 entropy, 78

enumerable semi-measure, 74 exponential Diophantine set, 5 Fermat's Last Theorem, 332 function, 3 - busy beaver, 341 - continuous, 204 - deficiency of randomness, 115 - pairing, 4 - partial, 3 - computable (p.c.), 4 - prefix-increasing, 204 - semi-computable from above, 6 - semi-computable from below, 6 - string, 4 Gödel - number, 267-269 - Incompleteness Theorem, 315 Goldbach Conjecture, 333, 337 graph, 3 - approximation set, 205 Halting Problem, 284, 305, 362, 409 Hertling-Weihrauch random, 230 immune set, 60 König's Lemma, 167 Kolmogorov-Chaitin - complexity, 36 - conditional complexity, 38 Kraft-Chaitin Theorem, 59, 272, 297 Lebesgue's Density Theorem, 257 lexicon, 254 Liouville number, 257 Martin-Löf - asymptotical formula, 118 - sequential test, 158 - test, 109 Matiyasevich's Theorem, 5 Ω-like real, 297 pairing functions, 4 partial - function, 3 - computable (p.c.) function, 4

prefix-free set, 22 prefix-increasing function, 204 probabilistic model, 9 probability distribution, 9 product topology, 8 randomness - hypothesis, 229, 230 - space, 229 range, 3 rare set, 7 representation, 286 residual, 254 - set, 202 - sequence, 169 - string, 105 second Baire category set, 7, 202 self-delimiting version of a string, 24 semi-computable - from above function, 6 - from below function, 6 semi-measure, 74 semi-ring, 18 set - Borel, 10 - Cantor, 220 - co-dense, 7 - co-meagre, 7 - co-rare, 7 - computably enumerable (c.e.), 5 - c.e. rare, 131 - computably growing Cantor set, 220 - constructively first category set, 203 - constructively immune set, 120 - constructively null set, 172 - constructively open set, 172 - constructively residual, 255 - constructively σ-megaporous set, 258 - dense set, 7 - Diophantine, 5 - effectively immune, 120 - exponential Diophantine, 5 - graph approximation, 205 - immune, 60 - prefix-free, 22 - rare, 7 - residual, 254

- second Baire category, 7, 202 - first Baire category, 7 - singlefold exponential Diophantine, 5 - suffix-free, 32 - tt (wtt)-complete, 306 σ-algebra, 10 σ-ring, 10 singlefold exponential Diophantine set, 5 Specker real, 263 splitting, 286 string function, 4 strongly simulates, 294 suffix-free set, 32

uniform probability distribution, 9 universal - (Chaitin) computer, 36 - Martin-Löf test, 114 - sequence, 297 - sequential Martin-Löf test, 164 Universality Theorem, 4

truth-table (tt) reducible, 305

weak truth-table (wtt) reducible, 305

tt (wtt)-complete set, 306 Turing - degree, 283 - equivalent, 283 - reducible, 282 Twin-Prime Conjecture, 338

Name Index

Ackermann, W., 49 Adams, D., 400, 419 Akin, E., 404, 419 Andreasen, P., 92, 419 Archimedes, 403 Aristotle, 398 Arruda, A.I., 447 Arslanov, A., 234, 306, 314, 419 Arslanov, M., 314 Azra, J.P., 6, 419 Bachmann, P., 3 Backus, 346 Bailey, D.H., 201, 413, 419, 420 Baire, R., 6, 7, 201-203, 253, 320, 398 Baker, T., 395, 420 Balc
Bohr, N., 95, 408 Book, R.V., XV, 386, 397, 421, 422 Borel, XIV Borel, E., 10, 11, 96, 123, 125, 127-130, 146, 147, 169, 170, 194, 197, 200, 201, 231, 233, 234, 239, 252-254, 335, 384, 407, 422, 423, 431 Börger, E., XV, 6, 422 Borwein, J.M., 413, 419, 422 Brattka, V., 451 Brauer, W., XV Bridges, D.S., XV, 6, 167, 260, 412, 422 Brisson, L., 409, 422 Browder, F.E., 431 Brown, R.H., 409, 422 Brudno, A.A., 367, 417, 422 Buntrock, J., 343, 442 Burks, A.W., 451 Burrowes, R., XV Butts, R.E., 446 Câmpeanu, C., XV, 91, 142, 146, 234, 423, 427 Calude, A., XVI Calude, C., VIII, 6, 10, 37, 50, 52, 91, 92, 128, 145, 146, 192, 203, 233, 234, 251, 256, 258, 272, 289, 299, 306, 312-314, 320, 354, 357, 358, 367, 368, 370, 377, 381, 384, 385, 389, 390, 404, 409, 419, 420, 422-427, 433, 441, 443, 445, 447-449 Calude, E., XI, XVI Casti, J.L., 37, 146, 343, 357, 358, 416, 424, 427, 449 Cervelle, J., 416, 427 Chaitin, G.J., VIII, XIII-XV, 32-43, 45-53, 59-62, 64-66, 68-71, 73, 75, 76, 78-86, 87, 88-93, 96, 102-107,

108, 109, 118, 119, 121, 129, 131, 133, 134, 138, 140, 142, 144-146, 169, 176, 179, 180, 182, 204, 230, 233, 234, 237-239, 272, 273, 277, 294, 298, 299, 301, 302, 306, 307, 310, 313, 314, 318, 320-322, 324-327, 334, 341, 343, 344, 348, 350, 357, 358, 361-365, 390-393, 399, 400, 404, 409, 410, 412, 413, 415-417, 420, 423, 426-430, 441, 447, 449 Champernowne, D.G., 201, 231, 429 Chitescu, I., 192, 203 Chitescu, I., XV, 10, 145, 146, 233, 234, 424 Chong, C.-T., 440 Chown, M., 234, 429 Chung, K.L., 8, 429 Church, A., 233, 430, 432 Ciobanu, G., 448 Cipra, B., 384, 430 Clearwater, S.H., 452 Cohen, D.E., 6, 430 Cohen, G., 393, 420 Cohen, P.J., 413, 430 Coles, R.J., 289, 312, 314, 424 Connes, A., 317, 430 Cooper, B., 424 Copeland, A.H., 234, 430 Copeland, J., 357, 430 Cover, T.M., 32, 52, 77, 89, 93, 145, 234, 430, 440 Craig, E., 432 Crandall, R.E., 201, 419, 420 Csiszár, I., 32, 430 da Costa, N.C.A., 406, 430, 447 Daicz, S., 358, 415, 420 Dance, P., XV Dauchet, M., 444 Davie, G., 234, 431 Davies, P., 146, 234, 358, 401, 404, 409-411, 431 Davis, M., 146, 234, 358, 431, 444 Davis, P.J., 146, 234, 409, 431 Dawson, J., Jr., 319, 358, 431, 433 De Bakker, J.W., 443 De Leeuw, K., 390, 431 de Saint Exupéry, A., 53 Dedekind, R., 260, 272

Delahaye, J.-P., 52, 146, 230, 234, 323 328, 358, 431 Dellacherie, C., 234, 431 Dembski, W.A., 146, 432 Demuth, O., 314, 432 Denbigh, J.S., 409, 432 Denbigh, K.G., 409, 432 Denker, 52, 92, 432 Detlefsen, M., 358, 432 Deutsch, D., 356, 404, 432 Devlin, K., 413, 419 Dewdney, A.K., 432 Diaz, J., 393, 420 Dickson, L.E., 432 Dinneen, M.J., 92, 354, 358, 420, 425, 445, 449 Diophantus, 333 Dirac, P.A.M., 408, 432 Disney, W., 415 Doran, R.W., XV, 409, 425 Doria, F.A., 406, 430 Downey, R., 314, 393, 417, 433 Dragomir, S., 311, 314, 433 Du, D.Z., 453 Dubuc, S., 442 Dutil, N., 145, 433 Dubacq, J.-C., 416, 432 Dudley, R.M., 11, 433 Dumitrescu, M., 367, 425 Durand, B., 416, 427, 432, 444 Duske, J., 32, 437 Dyson, F., 319 Ebbinghaus, H.-D., 439 Einstein, A., 147, 237, 315, 408, 411, 435, 445 Eisenstein, F., 393 Erdős, P., 234, 402, 430 Etesi, G., 433 Euclid, 335 Euler, L., 33, 262, 333, 384 Fano, R.M., 31 Feferman, S., 433 Feller, W., 8, 149, 434 Ferbus-Zanda, 52, 146, 434 Fermat, P., 332, 333, 393 Ferreira, A., 433 Feynman, R., 357, 401, 404, 434, 437

Fibonacci, L., 100, 101 Figueira, S., 201, 420 Fine, T.L., 52, 145, 234, 434 Ford, J., 92, 367, 434 Formenti, E., 416, 427, 432 Fouché, W., 230, 367, 434 Fraenkel, 316, 335 Fredkin, F., 434 Freivalds, R., XV Gödel, K., VIII, 4, 120, 203, 315, 318, 319, 323, 411-413, 420, 426-428, 433, 435, 450 Gács, P., XIV, XV, 52, 91, 93, 145, 146, 207, 233, 234, 267-269, 421, 430, 434, 441 Gabarro, J., 393, 420 Galato, S., 416, 434 Gardner, M., 145, 146, 234, 358, 421, 434, 438 Garey, M., 393, 435 Geiringer, H., 451 Gengler, R., 314, 445 Gewirtz, W.L., 52, 91, 145, 234, 435 Gibbons, J., XV Gill, J., 390, 395, 397, 420, 421, 435 Goldbach, C., 333 Graham, R.L., 435 Gray, R.M., 93, 145, 234, 430 Gribbin, J., 146, 234, 404, 409, 431 Grigorieff, S., 52, 146, 434 Grosswald, E., 435 Grozea, C., 90-92, 425, 435 Gruska, J., XI, XV, 52, 435 Guiaşu, S., 32, 435 Hadamard, J., 384 Hall, N., 444 Halmos, P.R., 3, 17, 18, 436 Hammer, D., 92, 436 Hardy, G.H., 333, 361, 436 Hartmanis, J., XI, XV, 310, 314, 395, 396, 436 Hawking, S.W., 409, 436 Heisenberg, W., 407, 408 Hemachandra, L., XV, 310, 314, 393, 396, 436 Hemaspaandra, L., 396, 436 Henkin, L., 319

Herken, R., 421 Hersh, R., 146, 234, 409, 431 Hertling, 289, 368, 425 Hertling, P.H., 229, 230, 233, 234, 252, 254, 272, 289, 299, 311, 313, 314, 368, 370, 377, 381, 384, 385, 389, 424, 425, 436, 437, 451 Hey, J.G., 437 Hilbert, D., VIII Hintikka, J., 446 Hirschfeldt, D., 314, 417, 433 Hlawka, E., 384, 385, 437 Ho, C.-K., 313, 437 Holzwarth, F., XVI Hopcroft, J.E., 395, 436 Horgan, J., 413, 437 Hromkovič, J., 234, 425 Ishihara, H., 92, 425 Istrate, G., XV, 146, 426 Jürgensen, H., XV, 32, 251, 253, 313, 320, 368, 370, 377, 381, 425, 426, 437 Jaffe, A., 413, 437 Jaki, S., 319 Jaulin, B., 6, 419 Johnson, D., 393, 435 Jones, D.S., 32, 437 Jones, J.P., 6, 325, 437 Juedes, D., 308, 437 Kamae, T., 51, 437 Kant, I., 409, 411 Karhumäki, J., 424, 426 Katseff, H.P., 52, 233, 438 Kautz, S., 314, 438 Kelley, J.L., 8, 438 Khinchin, A.I., 32, 438 Khoussainov, B., 234, 272, 289, 299, 314, 384, 385, 389, 424, 425, 438 Kieu, T.D., 92, 358, 438 Kimura, M., 379, 442 Kino, A., 442 Klarner, D.A., 438 Kleene, S.C., 4, 5, 433 Knuth, D.E., 32, 138, 146, 234, 361, 438, 442 Ko, Ker-I, 260, 438 Kolman, P., 427, 433

Kolmogorov, A.N., XIII, XIV, 33, 34, 36, 38, 52, 86, 90, 92, 108, 145, 146, 171, 234, 396, 424, 430, 432, 436-440, 449-451 König, D., 167 Körner, J., 32, 430 Kraft, L.G., XIV, 75, 88, 90, 138, 25, 26, 30-32, 53, 59, 62, 64, 68, 73, 75, 76, 88-91, 103, 138, 180, 182, 426, 439, 441 Kramosil, I., 144, 146, 234, 439 Kreinovich, V., 388, 441 Kreisel, G., 262 Krob, D., 425 Kučera, A., 207, 234, 313, 314, 439 Kuipers, L., 201, 234, 384, 439 Kummer, M., 313, 314, 439 Kuratowski, K., 7 Kurta, E., 91, 426 Kurtz, S.A., 310, 314, 436 LaForte, G.L., 314, 433 Landauer, R., 439 Laplace, P.S., 97, 145, 407, 439 Larsen, K.G., 437 Lascar, D., 439 Lathrop, 308, 437 Lebesgue, H., 10, 18, 194, 253 Legg, S., 358, 426 Leibniz, G., 1 Lennon, M., XV Lerman, M., 293, 314, 439, 440 Leung-Yan Cheong, S.K., 32, 89, 440 Levin, L.A., 50, 52, 91, 145, 233, 234, 409, 440, 453 Levine, R.D., 428 Levy, S., 404, 416, 440 Li, M., XIV, XV, 52, 85, 93, 145, 146, 230, 234, 393, 409, 421, 440 Li, X., 146, 440 Lichnerowicz, A., 430 Lind, D., 367, 416, 440 Liouville, 257, 311 Lobstein, A., 393, 420 Longpré, L., 388, 393, 440, 441 Loxton, J.H., 441 Lutz, J., XV, 308, 386, 397, 398, 422, 437, 441

Măndoiu, I., XI, XV, 89, 92, 234, 441 Mahler, K., 441 Mal'cev, A.I., 6, 441 Manca, V., 358, 441 Marandijan, H.B., 146, 234, 441 Marcus, B., 367, 440 Marcus, S., XV, 99, 409, 416, 426, 441 Margolus, N., 367, 449 Markov, A.A., 4 Markowski, G., XV, 146, 441 Martin-Löf, P., XV, XIV, 52, 109-118, 133, 141, 142, 145, 146, 158, 159, 161-169, 174-177, 185, 186, 193, 198-200, 202, 203, 230-234, 255, 260, 300, 372, 373, 375, 380, 382, 383, 391, 392, 424, 442 Maruoka, A., 379, 442 Marxen, J., 343, 442 Matiyasevich, Yu.V., 5, 6, 324, 325, 431, 437, 442 Maurer, H., 424, 426 Maurer, H.A., XV Mayordomo, E., 398, 441 McCarthy, J., 431 McMillan, B., 26 Meinel, C., 425 Mendès-France, M., XV, 234, 409, 442 Meyer, A.R., 52, 442 Meyerstein, F.W., 409, 422, 426 Michel-France, M., 442 Miller, G.L., 389, 392, 395, 443 Miller, S., 399 Moore, C., 366, 443 Moore, E.F., 377, 390, 431, 443 Moore, G.H., 433 Moran, P.A.P., 422 Morvan, M., 425 Muchnik, A., 443 Müller, G.H., 439 Myhill, J., 442 Németi, I., 433 Natanson, I.P., 125, 443 Naur, P., 346 Newman, J.R., 421 Niederreiter, H., 201, 234, 384, 439 Nies, A., 306, 314, 426, 433 Nietzsche, F., 33 Nivat, M., 443

Niven, I., 200, 201, 234, 443 Odifreddi, P., XV, 6, 146, 260, 284, 292, 306, 409, 443 Ogihara, M., 393, 436 Orevkov, V.P., 388, 443 Oxtoby, J.C., 201, 202, 254, 256, 257, 443, 444 Păun, G., 357, 424, 426, 443, 447, 448 Pagels, H.R., 146, 234, 358, 444 Parmenides, 411 Parra, C., XI Pascal, B., XIII Paulos, J.A., 146, 234, 444 Pavlov, B., 357, 426 Peano, 335 Penrose, R., XV, 402, 404, 406-409, 444 Percival, I., 366, 444 Perrin, D., 32, 421 Péter, R., 49 Peterson, I., 409, 444 Pines, D., 421 Pippenger, N.J., 89, 91 Poincaré, H., 410 Porrot, S., 444 Post, E.L., 318, 319, 343, 444 Pour-El, M., XV, 260, 263, 388, 406, 416, 444 Pratt, V., 395, 444 Prigogine, I., 402 Puech, C., 439 Pultr, A., 427, 433 Putnam, H., 431, 435, 452 Quinn, F., 413, 437 Raatikainen, P., 320, 324, 445 Rabin, M.O., 389, 392, 445 Rackoff, C., XV Rademacher, H., 384, 385, 435 Ramsey, F.P., 335 Regis, E., 411, 445 Reichel, H., 433 Reischuk, R., 439 Remmel, J.B., 293, 314, 439, 440 Renfro, D.L., 445 Rettinger, R., 314, 445 Rice, H., 260, 445

Richards, I., 260, 263, 388, 406, 416, 444 Richman, F., 167, 422 Riemann, B., 333, 383-385, 388, 389, 395, 443, 445 Robinson, A., 319 Robinson, J., 431 Rogers, H., 6, 146, 445 Ross, J.A., XVI Rossler, M.O., 403 Rothschild, B.L., 435 Rozenberg, G., X, XV, 33, 423, 424-426, 445, 448 Rucker, R., 146, 234, 358, 445 Rudeanu, S., XV Ruelle, D., 92, 146, 234, 445 Russell, B., 95, 145, 317, 435, 445 Sacks, G.E., 146, 292, 439, 446 Salagean-Mandache, A.M., 91, 446 Salomaa, A., V, X, XV, 6, 33, 99, 395, 404, 423, 425, 426, 445, 446, 448 Salomaa, K., XV Sburlan, S., 420, 445 Schützenberger, P.M., 430 Schack, R., 78, 92, 446 Schmidhuber, J., 92, 446 Schnorr, C.P., 52, 233, 234, 314, 446 Schopenhauer, A., 315 Schwartz, J.T., 390, 392, 429 Scotus, D., 409, 426, 446, 448 Segre, G., 92, 446 Semenov, A., 443 Sgall, J., 427, 433 Shannon, C.E., VII, XIII, 21, 28, 31, 32, 390, 431, 440, 446 Shapiro, N., 390, 431 Shen, S.Y., 51, 52, 145, 230, 446, 450, 452 Shields, P., XV Shoenfield, J.R., 284 Shu, C.-K., 354, 358, 425 Shyr, H.J., 437 Siegelmann, H., 357, 447 Sindelar, J., 234, 439 Sipser, M., 52, 234, 438, 447 Skyum, S., 437 Slaman, T.A., 300, 301, 314, 439, 447 Smiley, T.J., 439

Smullyan, R.M., 146, 447 Soare, R., 6, 260, 271, 292, 306, 312-314, 447 Solomonoff, R.J., XIII, 33, 52, 447 Solovay, R.M., XV, 35, 108, 181, 233, 234, 271, 273, 274, 277, 297, 298, 313, 328, 358, 389, 392, 395, 415, 420, 433, 447 Specker, E., 262, 263, 448 Spencer, J.H., 435 St Augustine, 410 Staiger, L., XV, 145, 146, 243, 252, 313, 424, 448 Steen, L.A., 431 Ştefănescu, D., XV, 409, 426, 448 Stewart, I., 146, 235, 343, 448 Strassen, V., 389, 392, 447 Svozil, K., XV, 92, 234, 357, 358, 403, 404, 406, 409, 411, 425, 448, 449 Szilard, A., XV, 310, 449 Tatanim, M., XV Tadaki, K., 314, 449 Tarski, A., 317 Tee, G., XV Terwijn, S., 313 Thatcher, J.W., 452 Thierrin, G., 253, 313, 437 Thomas, J.A., 32, 52, 77, 430 Tipler, F.J., 401, 402, 404, 409, 420, 449 Titchener, M., XV, 32, 449 Toffoli, T., 367, 401, 434, 449 Tomescu, I., XV, 92, 426 Traub, J.F., X, 260, 445, 449 Tribus, M., 428 Trismegistus, H., 303 Truss, J.K., 424 Turing, A.M., VII, VIII, 4, 35, 36, 260, 281-284, 305, 315, 399, 421, 432, 435, 449 Tychonoff, A.N., 8, 368 Tymoczko, T., 404, 413, 449 Ulam, S.M., 202, 256, 367, 444 Urey, H., 399 Uspensky, V., XV, 52, 93, 145, 230, 234, 417, 438, 443, 449, 450

V'yugin, V.V., 416, 451 Valéry, P., 21 van Dalen, D., 439 van der Poorten, A., 333, 450 van der Waerden, 171 van Heijenoort, J., 433, 450 van Lambalgen, M., 145, 146, 234, 320, 358, 450 van Leeuwen, J., 440, 443 Vereshchagin, N.K., 91, 145, 230, 234, 314, 444, 450 Vesley, R.E., 442 Vidakovic, B., 146, 450 Ville, J., 233, 450 Vitanyi, P.M., XIV, 52, 85, 93, 145, 146, 230, 234, 393, 409, 421, 440, 450 Volchan, S.B., 146, 450 von Braunmühl, B., 314, 445 von Mises, R., 140, 145, 187, 233, 234, 310, 450, 451 von Neumann, J., 367, 399, 404, 416, 451 Wagner, E.G., 393, 452 Wagner, K., 386, 397, 422, 451 Wald, A., 233, 451 Wang, J., 452 Wang, Y., 272, 299, 314, 425, 437 Wasilkowski, G.W., 260, 449 Watanabe, O., XIV, 393, 451 Wechsung, G., 393, 451 Weihrauch, K., 204, 229, 230, 243, 252, 260, 263, 311, 314, 368, 370, 377, 381, 406, 416, 425, 437, 451 Werschulz, A.G., 449 Weyl, H., 319 Wheeler, J., 401, 411 White, H.S., 367, 417, 420, 451 Whitehead, A.N., 317 Wicks, M.J., 440 Wiener, N., 357 Wiles, A., 333 Williams, C.P., 452 Willis, D.G., 52, 452 Winskel, G., 437 Wittgenstein, L., 452 Wössner, H., XVI Wolfram, S., 146, 367, 368, 452 Wood, D., 6, 452

Woźniakowski, H., 260, 449 Woyczyński, W., 52, 92, 432 Wright, E.M., 361, 436 Wright, J.B., 452 Wu, G., 314, 452 Yamaguchi, T., 92, 425 Yang, 52, 452 Yao, A.C., 32, 421 Ycart, B., 52, 92, 432 Yu-Ting, S., 260, 453

Zalta, E.N., 430 Zamfirescu, T., 256-258, 313, 427, 453 Zeilberger, D., 413, 453 Zeilinger, A., 357 Zermelo, E., 316, 335 Zheng, X., 314, 445, 453 Zhong, N., 263, 406, 451 Zimand, M., XV, 146, 320, 390, 396, 398, 426, 436, 453 Zuckerman, H.S., 200, 201, 234, 443 Zurek, W.H., 37, 145, 421, 453 Zvonkin, A., 52, 91, 145, 234, 453
