CLASSICAL AND NEW PARADIGMS OF COMPUTATION AND THEIR COMPLEXITY HIERARCHIES
TRENDS IN LOGIC Studia Logica Library VOLUME 23 Managing Editor Ryszard Wójcicki, Institute of Philosophy and Sociology, Polish Academy of Sciences, Warsaw, Poland Editors Vincent F. Hendricks, Department of Philosophy and Science Studies, Roskilde University, Denmark Daniele Mundici, Department of Mathematics “Ulisse Dini”, University of Florence, Italy Ewa Orłowska, National Institute of Telecommunications, Warsaw, Poland Krister Segerberg, Department of Philosophy, Uppsala University, Sweden Heinrich Wansing, Institute of Philosophy, Dresden University of Technology, Germany
SCOPE OF THE SERIES
Trends in Logic is a bookseries covering essentially the same area as the journal Studia Logica – that is, contemporary formal logic and its applications and relations to other disciplines. These include artificial intelligence, informatics, cognitive science, philosophy of science, and the philosophy of language. However, this list is not exhaustive, moreover, the range of applications, comparisons and sources of inspiration is open and evolves over time.
Volume Editor Ryszard Wójcicki
The titles published in this series are listed at the end of this volume.
CLASSICAL AND NEW PARADIGMS OF COMPUTATION AND THEIR COMPLEXITY HIERARCHIES Papers of the conference “Foundations of the Formal Sciences III” Edited by
BENEDIKT LÖWE Universiteit van Amsterdam, The Netherlands
BORIS PIWINGER University of Wien, Austria and
THORALF RÄSCH University of Potsdam, Germany
KLUWER ACADEMIC PUBLISHERS DORDRECHT / BOSTON / LONDON
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN 1-4020-2775-3 (HB) ISBN 1-4020-2776-1 (e-book)
Published by Kluwer Academic Publishers, P.O. Box 17, 3300 AA Dordrecht, The Netherlands. Sold and distributed in North, Central and South America by Kluwer Academic Publishers, 101 Philip Drive, Norwell, MA 02061, U.S.A. In all other countries, sold and distributed by Kluwer Academic Publishers, P.O. Box 322, 3300 AH Dordrecht, The Netherlands.
Printed on acid-free paper
All Rights Reserved © 2004 Kluwer Academic Publishers No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Printed in the Netherlands.
Table of Contents Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii Schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
xi
List of Participants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii Complexity hierarchies derived from reduction functions . . . . . . . . Benedikt L¨owe
1
Quantum query algorithms and lower bounds . . . . . . . . . . . . . . . . . . 15 Andris Ambainis Algebras of minimal rank: overview and recent developments . . . . . 33 Markus Bl¨aser Recent developments in iterated forcing theory . . . . . . . . . . . . . . . . . 47 J¨org Brendle Classification problems in algebra and topology . . . . . . . . . . . . . . . . 61 Riccardo Camerlo Using easy optimization problems to solve hard ones . . . . . . . . . . . . 77 Lars Engebretsen On Sacks forcing and the Sacks property . . . . . . . . . . . . . . . . . . . . . 95 Stefan Geschke and Sandra Quickert Supertask computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 Joel David Hamkins A refinement of Jensen’s constructible hierarchy . . . . . . . . . . . . . . . 159 Peter Koepke and Marc van Eijmeren Effective Hausdorff dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 Elvira Mayordomo C´amara Axiomatizability of algebras of binary relations . . . . . . . . . . . . . . . . 187 Szabolcs Mikul´as v
vi
CONTENTS
Forcing axioms and projective sets of reals . . . . . . . . . . . . . . . . . . . . 207 Ralf Schindler Post’s and other problems of supertasks of higher type . . . . . . . . . . . 223 Philip D. Welch References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Preface “Foundations of the Formal Sciences” (FotFS) is a series of interdisciplinary conferences in mathematics, philosophy, computer science and linguistics. The main goal is to reestablish the traditionally strong links between these areas of research that have been lost in the past decades. The first two conferences of the series took place in Berlin (May 1999, Foundations of the Formal Sciences I, organized by Benedikt L¨owe and Florian Rudolph) and Bonn (November 2000, Foundations of the Formal Sciences II, organized by Peter Koepke, Benedikt L¨owe, Wolfgang Malzkorn, Thoralf R¨asch, and Rainer Stuhlmann-Laeisz). Proceedings volumes have appeared as a double special issue of the journal Synthese [L¨owRud1 02] and as a volume in this book series [L¨owMalR¨as03]. In February 2003, we had the fourth conference of the series FotFS IV on “The History of the Concept of the Formal Sciences” in Bonn, and the fifth conference FotFS V on “Infinite Games” is already in preparation for Fall 2004 as a joint activity of the Institute for Logic, Language and Computation in Amsterdam and the Mathematisches Institut in Bonn. The conference Foundations of the Formal Sciences III was a PhD EuroConference in the European Community program “Improving Human Research Potential and the Socio-Economic Knowledge Base” funded under the contract number HPCF-2000-00040. It was also a satellite meeting of the joint meeting of the Deutsche Mathematikervereinigung ¨ ¨ (DMV) and the Osterreichische Mathematische Gesellschaft (OMG), the ¨ 15. OMG-Kongreß. The conference FotFS III took place in the Institut f¨ur Formale Logik of the Universit¨at Wien located in the Josephinum. The Josephinum was founded in 1785 by Emperor Joseph II. as an academy for military doctors and still houses Vienna’s premier medical museum (the Museum des Instituts f¨ur Geschichte der Medizin) with a famous collection of wax anatomical figures. It was depicted on the back of the old Austrian 50Schilling bill before the Euro was introduced on January 1st, 2002 (see Figure 1). The participants of the conference enjoyed the blend of the historical location close to the centre of Vienna and the modern equipment of the logic institute.
vii
viii
PREFACE
Fig. 1. The Josephinum on the 50-Schilling Bill
Holding a conference on the Formal Sciences (Formalwissenschaften) in Vienna is particularly fitting. The Formal Sciences (mathematics, computer science, formal philosophy and theoretical linguistics) have found their niches in the university cultures, mostly determined by the history of the subjects. In Germany, mathematics and computer science are often grouped with the Natural Sciences in a Mathematisch-Naturwissenschaftliche Fakult¨at where the explicit mention of mathematics seems to hint at its special nature distinct from the natural sciences. In the American system, mathematics can be both part of the Arts and part of the Sciences, whereas computer science is often associated with Engineering. One of the most prominent members of the Vienna Circle, Rudolf Carnap, stressed the special nature of the Formal Sciences [Car1 35], and it seems that some relicts of this special status are still preserved in the Austrian university system:
PREFACE
ix
Until recently, the Universit¨at Wien had a Natur- und Formalwissenschaftliche Fakult¨at (the Institut f¨ur Mathematik and the Institut f¨ur Formale Logik representing the Formal Sciences part) until it was renamed Fakult¨at f¨ur Naturwissenschaften und Mathematik. The Wirtschaftsuniversit¨at Wien still has a Fachbereich f¨ur Sozial-, Geistes- und Formalwissenschaften, and in general, you will encounter the term Formalwissenschaften surprisingly often in the Austrian texts and discussions about academia and the classification of sciences, at least much more often than in Germany, the United States and other countries. During the Opening Ceremony of FotFS III, the Universit¨at Wien was represented by Marianne Popp, the Dean of the Faculty of Natural Sciences and Mathematics. The participants were also greeted in the name of the Institut f¨ur Mathematik of the Universit¨at Wien and the ¨ ¨ Osterreichische Mathematische Gesellschaft (OMG) by Karl Sigmund. ¨ In his rˆole as the president of the OMG, Professor Sigmund was the chair ¨ of the organizing committee of the 15th OMG-Kongreß. We would like to thank all speakers at the conference for contributing to the success of the conference. In addition to that, we would like to thank the Institut f¨ur Formale Logik at the Universit¨at Wien, and in particular, its director, Sy D. Friedman, for giving us the opportunity to have the conference in Vienna and for the institutional support and access to the institute infrastructure. We also would like to thank the Mathematisches Institut of the Rheinische Friedrich-Wilhelms-Universit¨at Bonn for the support during the preparation phase. Of course, a conference can’t be organized without a lot of help. We extend our thanks to all helpers at this conference and name them in alphabetic order: Gudrun Fankhaenel, Andreas Metzler, Mag. Pavel Ondrejovic, Dipl.-Math. Philipp Rohde, David Schrittesser, Margit Svitil, Lic. Marc van Eijmeren. We’d also like to thank the series editor, Heinrich Wansing (Dresden), and Jolanda Voogd and Charles Erkelens at Kluwer Academic Publishers for their support during the planning and preparation of the proceedings volume. The LATEX-classfile fotfs.cls which was used for the layout of this volume is due to Dipl.-Math. Philipp Rohde (Aachen). We received technical assistance from Dipl.-Math. Stefan Bold (Bonn).
x
PREFACE
Last but not least, we have to thank the numerous referees who helped us a lot to produce this volume by producing illuminating and competent reports. Amsterdam/Wien/Potsdam, March 2003
B.L. B.P. Th.R.
Schedule Sep 24 Sep 23 Sep 22 Sep 21 Time 09:30-10:00 Hamkins Schindler 10:00-10:30 Berenbrink BREAK BREAK 10:30-11:00 Welch B R E A K van Eijmeren 11:00-11:30 11:30-12:00 Ambainis Engebretsen Zapletal 12:00-12:30 CLOSING 12:30-13:00 13:00-13:30 13:30-14:00 LUNCH LUNCH 14:00-14:30 BREAK BREAK 14:30-15:00 O P E N I N G 15:00-15:30 Brendle Mikulas Veith 15:30-16:00 16:00-16:30 B R E A K BREAK BREAK 16:30-17:00 Quickert Bl¨aser Kutz 17:00-17:30 Schuster Grohe Mayordomo 17:30-18:00 B R E A K 18:00-18:30 Camerlo 18:30-19:00
xi
List of Participants Andris Ambainis (Institute for Advanced Study, Princeton NJ) David Asper´o (Universit¨at Wien, Vienna) George Barmpalias (University of Leeds, Leeds) Arnold Beckmann (Technische Universit¨at Wien, Vienna) Petra Berenbrink (University of Warwick, Warwick) Markus Bl¨aser (Medizinische Universit¨at zu L¨ubeck, L¨ubeck) J¨org Brendle (Kobe University, Kobe) Riccardo Camerlo (Universit`a di Torino, Torino) Lars Engebretsen (Royal Institute of Technology, Stockholm) Gudrun Fankhaenel (RhFWU Bonn, Bonn) Martin Goldstern (Technische Universit¨at Wien, Vienna) Martin Grohe (University of Edinburgh, Edinburgh) Joel David Hamkins (CUNY College of Staten Island, New York NY) Oliver Kutz (Universit¨at Leipzig, Leipzig) Benedikt L¨owe (RhFWU Bonn, Bonn) Elvira Mayordomo C´amara (Universidad de Zaragoza, Zaragoza) Andreas Metzler (Universit¨at Wien, Vienna) Szabolcz Mikulas (Birkbeck College, London) Heike Mildenberger (Universit¨at Wien, Vienna) Pavel Ondrejovic (Universit¨at Wien, Vienna) Boris Piwinger (Universit¨at Wien, Vienna) Sandra Quickert (RhFWU Bonn, Bonn) Thoralf R¨asch (Universit¨at Potsdam, Potsdam) Philipp Rohde (RhFWU Bonn, Bonn) Ralf-Dieter Schindler (Universit¨at Wien, Vienna) David Schrittesser (Universit¨at Wien, Vienna) Peter Schuster (LMU M¨unchen, M¨unchen) Helmut Schwichtenberg (LMU M¨unchen, M¨unchen) Emil Simeonov (Technische Universit¨at Wien, Vienna) Marc van Eijmeren (RhFWU Bonn, Bonn) Helmut Veith (Technische Universit¨at Wien, Vienna) Philip D. Welch (Universit¨at Wien, Vienna) Tina Wieczorek (Technische Universit¨at Berlin, Berlin) Jindˇrich Zapletal (University of Florida, Gainesville FL)
xiii
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Complexity hierarchies derived from reduction functions Benedikt L¨owe Institute for Logic, Language and Computation Universiteit van Amsterdam Plantage Muidergracht 24 1018 TV Amsterdam, The Netherlands E-mail:
[email protected]
Abstract. This paper is an introduction to the entire volume: the notions of reduction functions and their derived complexity classes are introduced abstractly and connected to the areas covered by this volume.
1
Introduction
Logic is famous for what Hofstadter calls limitative theorems: G¨odel’s incompleteness theorems, Tarski’s result on the undefinability of truth, and Turing’s proof of the non-computability of the halting problem. For each of these limitative results we have three ingredients: 1. an informal notion to be investigated (e.g., provability, expressibility, computability), 2. a formalization of that notion (e.g., provability as formalized in Peano arithmetic PA, definability in a formal language, the Turing computable functions), Received: February 26th, 2003; In revised version: November 5th, 2003; Accepted by the editors: November 21st, 2003. 2000 Mathematics Subject Classification. 03–02 03D55 03D15 68Q15 03E15 03D28 68Q17.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 1-14. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
¨ BENEDIKT LOWE
2
3. and finally a theorem that there is some limitation to the formalized version of the informal notion (e.g., the theorems of G¨odel, Tarski and Turing). The limitative theorems split the world into cis and trans relative to a barrier and show that some objects are beyond the barrier (trans). In our examples, the statement “PA is consistent” Cons(PA) is beyond the barrier marked by the formal notion of provability in PA, a truth predicate satisfying the convention (T) is beyond the barrier of definability, and the halting problem is “beyond the Turing limit” (Siegelmann). Even more importantly, these results relativize: the notion of formal provability in the stronger system PA + Cons(PA) gives us another barrier, but for this barrier Cons(PA) lies cis, and Cons(PA + Cons(PA)) lies trans. This calls for iteration, and instead of seeing limitative theorems mainly as obstacles, we can see their barriers positively, as a means of defining relative complexity hierarchies.1 This is the approach we chose for this introductory paper. Let us illustrate the problems and methodology of limitative results in some examples: (Example 1) Is there an effective algorithm to determine whether a number is prime? The informal notion involved in this question is the notion of “effective algorithm”. In order to give a positive answer to the question in (Example 1), you do not necessarily need a metatheoretical apparatus or even a proper formal definition of what an effective algorithm is. You have to display an algorithm, and convince people that it is effective. But this pragmatic approach is not possible if you want to answer such a question in the negative. Consider the following statement: (Example 2) There is no effective algorithm that determines whether a given graph has an independent subset of size k.2 1
2
In a much more general approach, Wolfram Hogrebe writes in his preface to the proceedings volume of the XIX. Deutscher Kongreß f¨ur Philosophie that was held in Bonn in September 2002 with the general topic Grenzen und Grenz¨uberschreitungen: “Die limitativen Aspekte der Condition Humaine ... [sind] nie nur Beschr¨ankungen, sondern immer auch Bedingungen der Konturf¨ahigkeit im Ausdrucksraum.” [Hog1 02] An independent subset of a graph is a set A such that for a, b ∈ A, there is no edge between a and b.
COMPLEXITY HIERARCHIES DERIVED FROM REDUCTION FUNCTIONS
3
The statement of (Example 2) is a universally quantified statement, so in order to prove it, you have to show for each effective algorithm that it doesn’t do the job. This requires a formalization of the notion of an “effective algorithm”, the development of a metatheory of algorithms. The usual formalization of “effective algorithm” in computer science is “polynomial time algorithm”.3 Using that formalization, Agrawal, Saxena, and Kayal have answered the question in (Example 1) positively in their already famous [AgrKaySax∞]: they give an O(log7.5 n)-time algorithm for determining whether a given number is prime. Looking at (Example 2), the question of whether a given graph has an independent subset of size k is NP-complete; cf. [Pap94, Theorem 9.4]. Consequently, the validity of the statement of (Example 2) is equivalent to P = NP. In light of this, a proof of P = NP would be another limitative theorem, locating all problems solvable in polynomial time cis and the NP-complete problems like TSP, SAT and the problem mentioned in Example 2 trans of the barrier determined by our formalization of “effective algorithm”.4 Let us move from the examples from computer science to an example from pure mathematics: In classical geometry, you are interested in constructions with ruler and compass: what lengths can be constructed from a given unit length by drawing auxiliary lines√with ruler and compass. √ 3 E.g., can we give a geometric construction of 2, of 2, of π? Again, you can prove existential statements of the form “There is a ruler-and-compass construction of X” by just displaying the construc√ tion, e.g., for 2. On the other hand, there is no way to prove a statement of the form “X cannot be constructed with ruler and compass” without a means of proving things about arbitrary ruler-and-compass constructions, i.e., of talking about ruler-and-compass constructions as mathematical entities. In fact, you can do this, and prove non-constructibility (as is taught in algebra courses) by interpreting ruler-and-compass constructions as a certain class of field extensions. This is (a small fragment of) what is called Galois theory, and it would deserve the name metageometry. 3
4
Cf. the introduction of Engebretsen’s paper in this volume: “It has been widely accepted that a running time that can be bounded by a function that is a polynomial in the input length is a robust definition of ‘reasonable running time’ (Engebretsen, p. 78)”. Here, TSP is the Traveling Salesman Problem [Pap94, § 1.3] and SAT is the Satisfiability Problem [Pap94, § 4.2]; cf. also Section 6.
¨ BENEDIKT LOWE
4
For classical construction problems like squaring the circle, doubling the cube (the Delic problem), triangulation of arbitrary angles, and construction methods for the regular p-gon, humankind had been seeking unsuccessfully for positive answers (i.e., proofs of the existential formula) for centuries. Mathematicians embedded ruler-and-compass construction into the richer theory of field extensions by associating to a given real r a field extension and proving that r being ruler-and-compass constructible is equivalent to a property of the associated field extension.5 Thus, the universally quantified negative forms of the classical problems mentioned above are transformed into simple statements about algebraic objects: the cube cannot be doubled by ruler and compass because √ 3 2 has degree 3 over Q; the angle of 60◦ cannot be triangulated, since the polynomial X 3 − 43 X − 81 is the minimal polynomial of cos(20◦ ) over Q; the regular 7-gon can not be constructed since the 7th cyclotomic polynomial has degree 6 over Q and is the minimal polynomial of a 7th root of unity. In the cases discussed, the limitative results relativize: similar to the relativization of G¨odel’s incompleteness theorem for Peano Arithmetic PA, we get notions of relative computability and relative ruler-and-compass constructibility. These give rise to problems that are not even computable if we have the halting problem as an oracle, or numbers that are √ 3 not ruler-and-compass constructible even if the ruler has 2 marked on it. As soon as a limitative theorem has told us that there are things on the other side of the barrier, we are not content anymore with the simple dichotomy of cis and trans. We want to compare those that are beyond the barrier according to their complexity. This leads to questions of the following type:
(Example 3) Is it harder to determine whether a Π2 expression is satisfiable than to determine whether a quantifier free expression is satisfiable? ¨ The first problem is known as Schonfinkel-Bernays SAT, the latter is SAT itself. Both are NP-hard by Cook’s Theorem [Pap94, Theorem 5
Cf., e.g., [Lan93, Chapter VI].
COMPLEXITY HIERARCHIES DERIVED FROM REDUCTION FUNCTIONS
5
8.2]. Are they equally hard, or is the first one harder than the second one?6 In order to talk about questions like this, we need a relation “is harder than” or “is at least as hard as” and a corresponding complexity hierarchy. In this paper, we shall restrict our attention to a special class of complexity hierarchies, viz. those induced by reduction functions. This choice is motivated by the fact that the hierarchies investigated in computer science are of this type, and some of the most famous hierarchies in mathematical logic (e.g., the Wadge hierarchy, one-reducibility and many-one-reducibility) are as well. We shall introduce a notion of complexity hierarchy in an abstract way in Section 2 and then specialize in the sections to follow. Before starting with our abstract account, we should add a disclaimer: this paper is not a survey of notions of complexity and definitely not a history of complexity. Its purpose is to describe a general feature of several complexity notions that occur in mathematics and computer science and use that to tie together the papers in this volume. We shall give some pointers to the literature for the interested reader but there is no intention to be comprehensive.
2
Abstract notions
Let K be some basic set of objects. This can be the set of graph structures on a given set, of natural numbers, of real numbers etc. Suppose that we fixed some set F of functions from K to K that is closed under compositions (i.e., if f and g are from F , then f ◦ g is from F as well) and contains the identity function. We will call these functions reductions. Depending on the context, we can choose different sets F . We can now use F to define a partial preorder (i.e., a reflexive and transitive relation) on the power set of K, ℘(K): A ≤F B : ⇐⇒ ∃f ∈ F (A = f −1 [B]). The notion of a reduction and the derived partial preorder occur in many areas of mathematics and computer science: the “efficient reduc6
¨ Schonfinkel-Bernays SAT is NEXP-complete [Pap94, Theorem 20.3] and SAT is in NP. Hence, the answer to the question in (Example 3) is ‘Yes’.
¨ BENEDIKT LOWE
6
tions” of theoretical computer science7 , the continuous and Borel reductions of descriptive set theory8 , the computable reductions from recursion theory9 , and also the Rudin-Keisler ordering of the theory of ultrafilters.10 In this paper, we will only talk about comparing sets of objects of the same type. If you want to compare problems from different areas, the corresponding asymmetric version of the reductions mentioned here are the Galois-Tukey reductions or Galois correspondences. Given our reducibility relation ≤F defined from F , we can now define the corresponding equivalence relation A ≡F B : ⇐⇒ A ≤F B ∧ A ≤F B, and then look at the set of equivalence classes CF := ℘(K)/≡F . The reducibility relation ≤F transfers directly to a relation ≤F on the equivalence classes where it is a partial order. We denote the equivalence class of a set A by [A]≡F . The elements of CF can now be called the F -complexity classes or F -degrees. In the following, we shall give examples of K and F and the derived notions of complexity classes.
3
Sets of Reals
In descriptive set theory, the set of objects K is the set of real numbers R. This set is naturally endowed with the canonical topology. Traditionally, the topological space R is endowed with several complexity hierarchies of sets of real numbers that –at least prima facie– are not derived from complexity functions: the Borel hierarchy and the projective hierarchy. For the Borel hierarchy, we set Σ01 to be the set of all open sets. For any ordinal α, we set Π0α to be the collection of complements of sets in Σ0α (so Π01 is the set ofclosed sets) and Σ0α is the collection of all sets that are countable unions i∈N Ai , where each Ai is in some Π0αi for αi < α. 7 8 9 10
Cf. [Pap94, Definition 8.1] and Section 6. Cf. Sections 3 and 4. Cf. Section 5. Note that the Rudin-Keisler ordering is of slightly different type because ultrafilters are sets of sets of numbers: for ultrafilters U and V on N, we say that U is Rudin-Keisler reducible to V (U ≤RK V ) if there is a function f : N → N such that U := {f −1 [X] ; X ∈ V }.
COMPLEXITY HIERARCHIES DERIVED FROM REDUCTION FUNCTIONS
7
The index α indicates how often you have to iterate the operations “complementation” and “countable union” to get a set which is in Σ0α . Thus, in a preconceived notion of complexity, we could call cBorel (A) := min{α ; A ∈ Σ0α ∪ Π0α } a complexity measure for Borel sets and can derive a complexity relation A ≤Borel B : ⇐⇒ cBorel (A) ≤ cBorel (B) from it. Similarly, we can look at the projective hierarchy (defined on finite Cartesian products of R). We let Σ10 be the set of Borel sets, and for each n, we let Π1n be the collection of complements of sets in Σ1n . Then Σ1n+1 is defined to be the set of projections of sets in Π1n , where A ⊆ Rn is a projection of B ⊆ Rn+1 if x ∈ A ⇐⇒ ∃y(y, x ∈ B). Again, if we set cProj (A) to be the least n such that A ∈ Σ1n ∪ Π1n , we can call the following relation a complexity relation: A ≤Proj B : ⇐⇒ cProj (A) ≤ cProj (B). These two hierarchies are proper hierarchies, i.e., for α < ω1 and 1 ≤ n < ω, the following inclusions are all proper:11 ∆0α Σ0α ∪ Π0α ∆0α+1 ∆1n Σ1n ∪ Π1n ∆1n+1 . By the technique used to prove this chain of strict inclusions, it is connected to the limitative theorems of logic: As in the limitative theorems, the non-equality of the complexity classes (often called a hierarchy theorem) is proved using the method of diagonalization via universal sets. Even more connections to logic emerge: if you look at the wellknown L´evy hierarchy of formulas where you count alternations of quantifiers12 , then sets definable with a real number parameter and firstorder quantifiers over the standard model of (second-order) arithmetic 11 12
Here, ∆0α := Σ0α ∩ Π0α and ∆1n := Σ1n ∩ Π1n . A formula with n alternations of quantifiers starting with ∃ is called a Σn+1 formula, a formula with n alternations starting with ∀ is called Πn+1 . The L´evy hierarchy was introduced in [L´ev65].
8
¨ BENEDIKT LOWE
with a Σn (Πn ) formula are exactly the Σ0n (Π0n ) sets, and those definable with a real number parameter and both first- and second-order quantifiers over the standard model of (second-order) arithmetic with a Σn (Πn ) formula are exactly the Σ1n (Π1n ) sets. Thus, Borel and projective complexity fit well into the concept of formula (L´evy) complexity. These complexity relations are related to reduction functions in the following way: Let FW be the set of continuous functions from R to R. Then the relation ≤FW defined on ℘(R) is called Wadge reducibility ≤W . If you take a set A ∈ Σ0α \Π0α , then (using Wadge’s Lemma and Borel Determinacy) {B ; B ≤W A} = Σ0α . Similarly, (under additional assumptions) we get a description of the projective classes as initial segments of the Wadge hierarchy: ≤W is a refinement of both ≤Borel and ≤Proj . The Wadge hierarchy is one of the most fundamental hierarchies in the foundations of mathematics. Under the assumption of determinacy of games, the Wadge hierarchy is an almost linear and well-founded backbone of the class of sets of real numbers.13 Since the theory of real numbers and sets of reals is intricately connected to questions about the axiomatic framework of mathematics as a whole, the Wadge hierarchy serves as a stratification of an important part of the logical strength of set-theoretic axiom systems for mathematics. For a basic introduction into the theory of Wadge degrees, we refer the reader to Van Wesep’s basic paper [Van78].
4
Equivalence Classes
Closely connected to the Wadge hierarchy is the area called Descriptive Set Theory of Borel Equivalence Relations. Riccardo Camerlo’s paper “Classification Problems in Algebra and Topology” in this volume gives an overview of this area of research. Take two equivalence relations E and F on the set of real numbers R. As relations, they are subsets of R×R. A Borel function f : R → R gives 13
By results of Wadge, Martin, Monk, Steel and van Wesep.
COMPLEXITY HIERARCHIES DERIVED FROM REDUCTION FUNCTIONS
9
rise to a Borel function f : R × R → R × R by f(x, y) := f (x), f (y). Look at the class FBorel of functions like this and call ≤FBorel Borel reducibility ≤B . Note that this means that E ≤B F if and only if there is a Borel function f : R → R such that x E y ⇐⇒ f (x) F f (y). This hierarchy (as opposed to the Borel fragment of the Wadge hierarchy) is not at all linear: Camerlo mentions the Adams-Kechris theorem on p. 74 of his paper in this volume; you can embed the Borel sets (partially ordered by inclusion) into this hierarchy. Equivalence relations on the real numbers (or on other Polish spaces) play a rˆole in classification projects. We will give an example how complexity enters the discussion here (the example is from Simon Thomas’ survey article [Tho0 01]): In 1937, Reinhold Baer had given a (simple) complete invariant for additive subgroups of Q, thus classifying these groups up to isomorphism.14 Kurosh and Mal’cev gave complete invariants for additive subgroups of Qn , but “the associated equivalence relation is so complicated that the problem of deciding whether two [invariants are the same] ... is as difficult as that of deciding whether the corresponding groups are isomorphic. It is natural to ask whether the classification problem for [additive subgroups of Qn ] ... is genuinely more difficult when n ≥ 2. [Tho0 01, p. 330]”. This problem has been solved by Hjorth [Hjo99] by showing that the isomorphism equivalence relation for subgroups of Q2 has strictly higher complexity than the isomorphism equivalence relation for subgroups of Q and in fact, that it is strictly more complicated than E0 , the first nonsmooth Borel equivalence relation (cf. Camerlo’s Theorem 2.2).15 This complexity result can be seen as a metatheoretical explanation for the fact that Kurosh and Mal’cev couldn’t come up with better invariants for these classes of groups. 14
15
Subgroups of Q can be seen as subsets of N via a bijection between Q and N. Via characteristic functions, subsets of N can be interpreted as infinite 0-1-sequences, and those can be seen as a real number (e.g., via the binary expansion). Thus, the isomorphism relation on subgroups of Q can be investigated as an “equivalence relation on the real numbers”. Hjorth comments on the connection between this theorem and certain pretheoretical notions of classifiability in his [Hjo00, p. 55, fn. 2].
¨ BENEDIKT LOWE
10
5
Recursion Theory
From sets of reals, we move to sets of integers now. In Recursion Theory, K is the set of natural numbers N. We shall look at two different kinds of reduction functions: the set Ft of total recursive functions, and the set F1 of total injective recursive functions. Each of these sets of reductions gives rise to recursion-theoretic reducibility relations: ≤Ft is known as many-one reducibility ≤m , and ≤F1 is known as 1-reducibility ≤1 .16 One of the mentioned three classical limitative theorems stems from Recursion Theory: Let us assume that we’re looking at computer programs in binary code. We can ask whether a program, given its own code as an input, will eventually halt or run into an infinite loop. Call the former programs halting and the latter looping. Now, is there a program that, given any binary code b, can determine whether the program with code b is halting or looping? Diagonalization easily shows that there can’t be such a program. The set H := {b ; the machine with code b is halting} is called Turing’s halting problem: in terms of complexity the diagonalization argument shows that H ≤m A for any computable set A (and so also H ≤1 A). Hence the derived sets of degrees CFt , ≤m , and CF1 , ≤1 are nontrivial, and, in fact, as the degrees of Borel equivalence relations, they are very far from being linear orders. A huge part of the literature on recursion-theoretic degrees investigates the structure of the complexity hierarchies of recursion theory. Even more prominently than ≤1 and ≤m features Turing reducibility ≤T defined by the notion of computation in an oracle. The literature on complexity hierarchies in recursion theory also includes Sacks’ seminal [Sac71] in which he introduced Sacks forcing. The properties of Sacks forcing used in recursion theory are similar to the minimality property proved in Lemma 2.8 of the survey paper on set-theoretic properties of Sacks forcing by Geschke and Quickert in this volume. Of course, the above definitions are not restricted to the classical notion of computation. When you change the model of computation and 16
Cf. [Hin78] and [Soa87].
COMPLEXITY HIERARCHIES DERIVED FROM REDUCTION FUNCTIONS
11
replace “recursive”/“computable” in the above definitions by some predicate connected to a generalized model of computation, you get new complexity hierarchies with a lot of structure to investigate. One instance of this is the notion of the Infinite Time Turing Machine introduced by Hamkins and Kidder. These machines have the same architecture as normal Turing machines but can go on through the ordinals in their computations. Being computable by an Infinite Time Turing Machine is a much more liberal property of functions, so the hierarchies we get are much coarser. In this volume, there are two papers dealing with Infinite Time Turing Machines: a gentle introduction by Joel Hamkins entitled “Supertask Computation” and a paper by Philip Welch that discusses supertasks on sets of reals, thus connecting Infinite Time recursion theory to descriptive set theory (cf. Section 3). Having seen the rich structure theory of recursion theoretic hierarchies for the classical notion of computability, one can ask with Hamkins (Question 2 of his contribution to this volume): What is the structure of infinite time Turing degrees? To what extent do its properties mirror or differ from the classical structure?17
6
Complexity Theory in Computer Science
Computable functions, the reductions used in recursion theory, are not good enough in theoretical computer science: a computable function in general has no bound on computing time or storage space that is used while it’s being computed. The fine hierarchies of computer science that want to distinguish between problems that are solvable in a realistic timeframe and those that require too much time cannot be content with reductions of that sort. The class of reductions in theoretical computer science is the class of functions computable with logarithmic space usage (logspace reductions). I.e., there is an algorithm that computes f (x) from x and satisfies 17
In his [Wel0 99], Philip Welch discusses an aspect of Hamkins’ Question 2: minimality in the Infinite Time Turing degrees. Again, this is connected to minimality of Sacks reals as discussed in the survey of Geschke and Quickert, and Welch uses a variant of Sacks’ argument from [Sac71].
12
¨ BENEDIKT LOWE
the following condition: if the input has n bits, the program will never use more than c · log(n) bits of storage space during the computation (where c is a constant).18 If F is the class of logspace reductions, then CF , ≤F is the usual world of complexity classes: first of all the deterministic ones, the class of polynomial time decidable sets P, those decidable with polynomial scratch space usage PSPACE, those decidable in exponential time EXP, and also their nondeterministic counterparts NP, coNP, NPSPACE, NEXP, and others. The usual inclusion diagram of complexity classes (cf. [Pap94, §§ 7.2 & 7.3]) can be verified easily: P ⊆ NP ∩ coNP ⊆ PSPACE ⊆ EXP ⊆ NEXP. While there are some non-equality results, one puzzling and titillating feature of the complexity classes of computer science is that we don’t know for sure that all of the inclusions are proper. In particular, we are lacking the limitative theorem P = NP. For the reader unfamiliar with structural complexity theory, Rod Downey’s “Invitation to structural complexity” [Dow92] will be a nice way to approach the subject. As in recursion theory, also in complexity theory, a new model of computation gives rise to new classes of reduction functions, and thus, a fortiori, to different hierarchies.19 Examples are the Blum-Shub-Smale theory of computation with real numbers20 or quantum computability as discussed in Ambainis’ survey in this volume.
7
Other Complexity Hierarchies
We abstractly discussed hierarchies derived in a particular way from reduction functions. Of course, there are more complexity hierarchies in logic than that. Fundamental are the hierarchies of proof theory that are related to reduction functions in a less direct way: Fix a language L and a collection 18 19
20
Cf. [Pap94, Chapter 8]. Interesting to note is the connection to Infinite Time Turing Machines: Schindler opened the discussion on analogues of the P = NP question for Infinite Time Turing Machines in his [Sch2 ∞]. Cf. also [HamWel0 03] and [DeoHamSch2 ∞]. Cf. [Blu+98]; in Footnote 4 of his article in this volume, Hamkins mentions that the BlumShub-Smale theory was the key motivation for the development of Infinite Time Turing Machines.
COMPLEXITY HIERARCHIES DERIVED FROM REDUCTION FUNCTIONS
13
Φ of L-sentences. From now on, we mean by an L-theory a primitively recursive set of L-sentence (understood as an axiom system of the theory). Let Proof(v0 , v1 , v2 ) be the predicate “v0 is the G¨odel number of a proof of the sentence with the G¨odel number v1 in the theory with the index v2 ”. If S and T are L-theories, we call a function f a Φ-reduction of S to T , if it is primitively recursive and PRA ∀ϕ ∈ Φ ∀n ( Proof(n, ϕ, S) → Proof(f (n), ϕ, T ) ) . We write S ≤Φ T if such a Φ-reduction exists.21 As a famous special case, you can look at Φ∗ = {⊥}. Then the existence of a Φ∗ -reduction says that every proof of an inconsistency in S can be effectively transformed into a proof of an inconsistency in T . In other words, if T is consistent, then so is S. This gives rise to the consistency strength hierarchy ≤Cons .22 G¨odel’s second incompleteness theorem states that PA
22 23
24
Here, PRA is primitive recursive arithmetic, a weak subsystem of second-order arithmetic; cf. [Sim2 99]. Cf. [Rat99, §§ 2.5 & 2.7]. Cf. [Ste0 82a] for a mathematical discussion of the connections between the theory of inner models of set theory and this remarkable fact. Cf. [Sim2 99], [Poh96], [Kah02], and [Kan0 94]. For an example of a strength analysis by large cardinals, see Schindler’s paper in this volume where he computes the consistency strength
14
¨ BENEDIKT LOWE
The relations between the different aspects and contexts of complexity are large in number, and a short introductory article to this volume can’t list them all. The fact that we can only provide a very passing glance at the fascinating subject of complexity stresses the importance, the liveliness and vigour of the topic of the conference FotFS III. The slightly different but deeply related notions of complexity in the several areas discussed in this volume are entrenched within the research communities of mathematical logic and computer science and will serve as a bridge between the two subject areas for years to come.
of the theory BPFA+“every projective set of reals is Lebesgue measurable” in terms of large cardinals.
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Quantum query algorithms and lower bounds Andris Ambainis School of Mathematics Institute for Advanced Study Princeton, NJ 08540, USA E-mail:
[email protected]
Abstract. Many quantum algorithms can be analyzed in a query (oracle) model where input is given by a black box that answers queries and the complexity of the algorithm is measured by the number of queries to the black box that it uses. We survey the main results and techniques in this model, with an emphasis on lower bound methods.
1
Introduction
The modern version of the Church-Turing thesis asserts that any reasonable model of computation can be efficiently (with at most a polynomial slowdown) simulated by a probabilistic Turing machine. Quantum computing challenges this thesis from an unexpected direction. The Turing machine model captures our intuition about how a general computation might proceed. Thus, either the modern Church-Turing thesis is correct or there is something missing in our intuition. The missing part is that our intuition is based on classical mechanics. Everything Received: November 25th, 2001; In revised version: March 12th, 2002; May 14th, 2002; Accepted by the editors: May 14th, 2002. 2000 Mathematics Subject Classification. 81P68 68Q17 68Q05.
Supported by NSF Grant CCR-9987845 and the State of New Jersey. Thanks to Yaoyun Shi and the FotFS referee for useful comments.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 15-32. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
16
ANDRIS AMBAINIS
that we encounter in our everyday life and all of today’s computers function according to the principles of classical mechanics. Because of that, when we define an abstract model of computation like Turing machines, we are likely to incorporate some principles of classical mechanics into this model in an implicit form. By using axioms of quantum mechanics instead, one can define quantum Turing machines and other models of quantum computation. There is evidence that those models are more powerful than their conventional (classical) counterparts. Simon (cf. [Sim0 97]) showed an oracle problem which can be solved by a quantum algorithm in a polynomial time but requires exponential time classically. Shor (cf. [Sho1 97]) gave a polynomial-time algorithm for factoring integers and discrete logarithms, problems that are conjectured to be very hard classically. Since public-key cryptosystems like RSA are based on the hardness of those problems, building a quantum computer would undercut the foundations of today’s cryptography. Another important quantum algorithm is due to Grover (cf. [Gro2 96]). It speeds up a very general class of search problems quadratically. While this speedup is just polynomial, Grover’s algorithm applies to a large variety of problems which includes both NP-complete problems like satisfiability or 3-coloring and search problems solvable in polynomial time. Almost all known quantum algorithms can be expressed in a query (oracle) model where the input is given by a black box which answers queries of a certain form. The algorithms that fall into this framework include Grover’s algorithm, Simon’s algorithm, period-finding (a key subroutine of Shor’s factoring/discrete log algorithm) and many others. One of the advantages of this model is that we can also prove lower bounds on quantum algorithms. In this paper, we describe the techniques for algorithms and lower bounds in the query model. We focus on Grover’s search algorithm and present the algorithm and three methods of proving that this algorithm is optimal. Then, we list results for various other problems.
2 2.1
Definitions Notation
We use ⊕ to denote XOR (exclusive OR). If x1 ∈ {0, 1}, x2 ∈ {0, 1}, x1 ⊕x2 denotes XOR of bits x1 , x2 . If x1 ∈ {0, 1}n , x2 ∈ {0, 1}n , x1 ⊕x2
QUANTUM QUERY ALGORITHMS
17
denotes bitwise XOR of n-bit strings x1 and x2 . We use + to denote the usual addition of integers. We use the O and Θ notation (cf. [CorLei1 Riv90]) standard in computer science. Let f (N ) and g(N ) be functions defined on positive integers N and taking positive values. We say that f = O(g) if there exists a constant c such that f (N ) ≤ cg(N ). We say that f = o(g) if, for any c > 0, there exists N0 such that f (N ) ≤ cg(N ) for all N > N0 . We say that f = Ω(g) if there exists c > 0 and N0 such that f (N ) ≥ cg(N ) for all N > N0 . 2.2
Quantum computing
We introduce the basic model of quantum computing. For more details, see the textbooks by Gruska (cf. [Gru1 99]) and Nielsen and Chuang (cf. [ChuNie00]). Quantum states: We consider finite dimensional quantum systems. An n-dimensional pure1 quantum state is a vector |ψ ∈ Cn of norm 1. We use |0, |1, . . ., |n − 1 to denote an orthonormal basis for Cn . n−1 Then, any state can be expressed as |ψ = i=0 ai |ifor some a0 ∈ 2 C, a1 ∈ C, . . ., an−1 ∈ C. Since the norm of |ψ is 1, n−1 i=0 |ai | = 1. We call the states |0, . . ., |n − 1 basis states. Any state of the form n−1 i=0 ai |i is called a superposition of |0, . . ., |n−1. The coefficient ai is called amplitude of |i. A quantum system can undergo two basic operations: a unitary evolution and a measurement. Unitary evolution: A unitary transformation U is a linear transformation on Ck that preserves the 2 norm (i.e., maps vectors of unit norm to vectors of unit norm). If, before applying U , the system was in a state |ψ, then the state after the transformation is U |ψ. Measurements: In this survey, we just use the simplest case of quantum measurement. It is the full measurement in the computational basis. Performing this measurement on a state |ψ = a0 |0 + . . . + an−1 |n − 1 gives the outcome i with probability |ai |2 . The measurement changes the state of the system to |i. Notice that the measurement destroys the original state |ψ and repeating the measurement 1
There are also mixed quantum states but they are not used in this paper. For the rest of the paper, “quantum state” means “pure quantum state”.
18
ANDRIS AMBAINIS
gives the same i with probability 1 (because the state after the first measurement is |i). More general classes of measurements are general von Neumann and POVM (Positive Operator-Valued Measure) measurements (cf. [ChuNie00]). 2.3
Query model
In the query model, the input x1 , . . ., xN is contained in a black box and can be accessed by queries to the black box. In each query, we give i to the black box and the black box outputs xi . The goal is to solve the problem with the minimum number of queries. The classical version of this model is known as decision trees (cf. [BuhdWo02]).
| i> | b>
0
1
x1 x2
... ...
0 xn
| i> | b+ x i >
Fig. 1. Quantum black box.
There are two ways to define the query box in the quantum model. The first is an extension of the classical query (cf. Figure 1). It has two inputs i, consisting of log N bits and b consisting of 1 bit. If the input to the query box is a basis state |i|b, the output is |i|b ⊕ xi . If the input is a superposition i,b ai,b |i|b, the output is i,b ai,b |i|b ⊕ xi . Notice that this definition applies both to the case when the xi are binary and to the case when they are k-valued. In the k-valued case, we just make b consist of log2 k bits and take b ⊕ xi to be bitwise XOR of b and xi . In the second form of quantum query (which only applies to problems with {0, 1}-valued xi ), the blackbox has just one input i. If the input is a state i ai |i, the output is i ai (−1)xi |i. While this form is less intuitive, it is very convenient for use in quantum algorithms, including Grover’s search algorithm (cf. [Gro2 96]).
QUANTUM QUERY ALGORITHMS
19
In this survey, we assume the first form as our main definition but use the second when describing Grover’s algorithm. This is possible to do because a query of the second type can be simulated by a query of the first type (cf. [Gro2 96]). Conversely, an oracle of the first type can be simulated by a generalization of the sign oracle which receives i ai,b |i|b b·xi as an input and outputs i ai (−1) |i|b. A quantum query algorithm with T queries is just a sequence of unitary transformations U0 → O → U1 → O → . . . → UT −1 → O → UT on some finite-dimensional space Ck . The transformations U0 , U1 , . . . , UT can be any unitary transformations that do not depend on the bits x1 , . . . , xN inside the black box. O are query transformations that consist of applying the black box to the first log N + 1 bits of the state. That is, we represent basis states of Ck as |i, b, z. Then, O maps |i, b, z to |i, b⊕xi , z. We use Ox to denote the query transformation corresponding to an input x = (x1 , . . . , xN ). The computation starts with the state |0. Then, we apply U0 , Ox , . . . , Ox , UT and measure the final state. The result of the computation is the rightmost bit of the state obtained by the measurement (or several bits if we are considering a problem where the answer has more than 2 values). The quantum algorithm computes a function f (x1 , . . . , xN ) if, for every x = (x1 , . . . , xN ) for which f is defined, the probability that the rightmost bit of UT Ox UT −1 . . . Ox U0 |0 equals f (x1 , . . . , xN ) is at least 1 − ε for some fixed ε < 1/2. The query complexity of f is the smallest number of queries used by a quantum algorithm that computes f . We denote it by Q(f ).
3
Grover’s search algorithm
Consider the following problem. Problem 3.1. Input: x1 ∈ {0, 1}, x2 ∈ {0, 1}, . . ., xN ∈ {0, 1} such that exactly one xi is 1. Output: the i such that xi = 1. Classically, one needs Ω(N ) queries to solve this problem and there is no better algorithm than just querying the locations one by one until we find xi = 1. Surprisingly, there is a better algorithm in the quantum case.
20
ANDRIS AMBAINIS
Theorem 3.2 (Grover). There is a quantum algorithm that solves Prob√ lem 3.1 with O( N ) queries. [Gro2 96] This is achieved by the following algorithm (cf. [Gro2 96]). 1. Apply a unitary transformation U mapping |0 to √1N N i=1 |i. √ π 2. Repeat for 4 N times: N (a) Apply the query transformation O which maps i=1 ai |i to N xi i=1 ai (−1) |i. (b) Apply the following “diffusion operator” D D|1 = − NN−2 |1 + N2 |2 + . . . + N2 |N D|2 = N2 |1 − NN−2 |2 + . . . + N2 |N ... D|N = N2 |1 + N2 |2 + . . . − NN−2 |N
3. Measure the state, output the result of the measurement. We note that Grover’s algorithm is efficient not just in the number of queries but also in the running time. The reason for that is that the diffusion operator D can be implemented in O(log N ) time steps (cf.√[Gro2 96]). Therefore, the whole algorithm can be implemented in O( N log N ) time.
1/
Ν
... |1> |2>
1/
Ν
− 1/
Ν
...
... |Ν>
Fig. 2. Amplitudes before the first query
|1> |2>
... |Ν>
Fig. 3. Amplitudes after the first query
QUANTUM QUERY ALGORITHMS
3/
Ν
1/
Ν
− 1/
Ν
... |1> |2>
21
... |Ν>
Fig. 4. Amplitudes after D
We now informally explain why this algorithm works. Our presentation follows the “inversion against average” of Grover (cf. [Gro2 97]). A different visualization of Grover’s algorithm (by two reflections of a vector) is described in (cf. [Aha99]). To understand the algorithm, plot the amplitudes of |1, . . ., |N at each step. After the first step, the state is √1N N |i and the amplitudes i=1 1 are √N (Figure 2). After the first query, the amplitude of |i with xi = 1 becomes − √1N (Figure 3). Then, the diffusion operator D is applied. D. Then, the state after D is Let |ψ = N i=1 ai |i be the state before 2 N −2 a + a |i where a = − |ψ = N i i j=i N aj . We can rewrite this i=1 i N N 1 N 2 2 as ai = −ai + N j=1 N aj j=1 N aj . Let A = j=1 N aj and ai + ai = be the average of amplitudes ai . Then, we have ai + ai = 2A and, if ai = A + ∆, then ai = A − ∆. Thus, the effect of the “diffusion transform” is that every amplitude ai is replaced by its reflection against the average of all ai .
In particular, after the first query, the amplitude of |i with xi = 1 is and all the other amplitudes are √1N . The average is √1N − N √2 N which is almost √1N . Therefore, after applying D, the amplitude of |i with xi = 1 becomes almost √3N and the amplitudes of all other basis states |j slightly less than √1N (cf. Figure 4).
− √1N
22
ANDRIS AMBAINIS
The next query makes the amplitude of |i with xi = 1 approximately − √3N . The average of all amplitudes is slightly less than √1N and reflecting against it make the amplitude of |i with xi = 1 almost √5N . Thus, √ each step increases the amplitude of |i with xi = 1 by O(1/ N ) and decreases the other amplitudes. A precise calculation (cf. √ π [Gro2 96], [Gro2 97]) shows that, after 4 N steps, the amplitude of |i with xi = 1 is 1 − o(1). Therefore, the measurement gives the correct answer with probability 1 − o(1).
Boyer et al. in [Boy+98] have extended Grover’s algorithm to the case when there can be more than one i with xi = 1. The simplest case is if the number of xi = 1 is known. If there are k such values, √ we can run the same algorithm with π4 N/k iterations instead of π4 N . An analysis similar to one above shows that this gives a random i such that xi = 1 with high probability. A more difficult case is if k is not known in advance. The problem is that, after reaching the maximum, the amplitudes of |i with xi = 1 start to decrease. Therefore, if we do too many iterations, we might not get the right answer. This problem can be handled in two ways. The first is running the algorithm above several times with a different number of steps (cf. [Boy+98]). The second is invoking a different algorithm called “quantum counting” (cf. [Bra1 HøyTap98]) to estimate the number of xi = 1 and then choose the number of steps for the search algorithm based on that. Either of those approaches gives us solution to:
Problem 3.3. Input: x1 ∈ {0, 1}, x2 ∈ {0, 1}, . . ., xN ∈ {0, 1}. Output: i with xi = 1, if there is one, “none” if xi = 0 for all i. Theorem √ 3.4 (Boyer). There is an algorithm that solves the Problem 3.3 with O( N ) queries. [Boy+98] Many problems can be solved by reductions to Problems 3.1 and 3.3. For example, consider the satisfiability which is the canonical NPcomplete problem. We have a Boolean formula F (a1 , . . . , an ) and we have to find whether there exists a satisfying assignment (a1 ∈ {0, 1}, . . ., an ∈ {0, 1} for which F (a1 , . . . , an ) = 1). We can reduce the satisfiability to Problem 3.3 by setting N = 2n and defining x1 , . . . , xN to be F (a1 , . . . , an ) for N = 2n possible assignments a1 ∈ {0, 1}, . . ., an ∈ {0, 1}.
QUANTUM QUERY ALGORITHMS
23
This means that we construct an algorithm that takes a1 , . . ., an and checks if F (a1 , . . . , an ) = 1. Then, if we replace the black box in the Grover’s algorithm by this algorithm,√we get an algorithm that finds a satisfying assignement in the time O( 2n ) times time needed to check one assignement. A similar reduction applies to any other problem that can be solved by checking all possibilities in some search space.
4
Lower bounds for search
A very natural question is: can we do even better? Is there a more complicated algorithm that solves Problem 3.1 with even less queries? If this was the case, we would get better quantum algorithms for all the problems that reduce to it. In particular, an O(logc N ) query algorithm for Problem 3.1 would result in a polynomial-time quantum algorithm for satisfiability. The next theorem shows that the speed-up achieved by Grover’s algorithm is maximal. Theorem 4.1 (Bennett). Any quantum algorithm for Problem 3.3 uses √ Ω( N ) queries. [Ben+97] There are several proofs for this theorem using different techniques. 4.1
Classical adversary
Theorem 4.1 was first shown by Bennett et al. (cf. [Ben+97]) using a classical adversary method (also called “hybrid argument”). In this proof, √ we assume that we are given a quantum algorithm A that uses o( N ) queries. We then produce an input x1 , . . ., xN on which A outputs an incorrect answer. To do that, we first analyze the algorithm on the input x1 = x2 = . . . = xN = 0 on which it must output “none”. We find xi such that the total probability of querying xi at all steps is small. Then, we show that if we change xi to 1 the final state of the algorithm does not change much. This implies that the algorithm either outputs the wrong answer on the original input x1 = x2 = . . . = xN = 0 or on the modified input x1 = . . . = xi−1 = 0, xi = 1, xi+1 = . . . = xN = 0.
24
ANDRIS AMBAINIS
We now formalize this argument. Let |ψi be the state before the ith query (with the previous queries answered according to x1 = . . . = xN = 0). We decompose this state as |ψi =
N
αij |j|ψij
j=1
with |j being the index that is queried and|ψij being the rest of the state. For each j, we consider the sum Si = i |αij |. This sum measures how much the algorithm queries the item xj . Then, we have the following two lemmas. Lemma 4.2 (Bennett). Consider the algorithm A on inputs x1 = . . . = xN = 0 and x1 = . . . = xi−1 = 0, xi = 1, xi+1 = . . . = xN = 0. The probability distributions on possible answers on those inputs differ by at most 8Si . [Ben+97] Therefore, if the algorithm outputs “none” on the input x1 = . . . = xN = 0 with probability at least 1 − ε, it also outputs “none” on x1 = . . . = xi−1 = 0, xi = 1, xi+1 = . . . = xN = 0 with probability at least 1 − ε − 8Si . Lemma 4.3 (Bennett). Let T be the number of queries. There exists i ∈ {1, . . . , N } with Si ≤ √TN . [Ben+97] √ Therefore, if the number of queries is less than c N for a small constant c, the probability of the algorithm A outputting “none” on x1 = . . . = xi−1 = 0, xi = 1, xi+1 = . . . = xN = 0 is close to 1 − ε for some i. This means that A is incorrect on this input. 4.2
Polynomials method
The second approach (cf. [Bea0 +01]) uses a completely different idea. We describe the probability of the algorithm outputting a given answer by a low-degree polynomial p. If the algorithm computes f with probability 1 − ε, the polynomial approximates f within ε. Then, the lower bound on the number of queries follows from a lower bound on the degree of the polynomial approximating f . The key result is
QUANTUM QUERY ALGORITHMS
25
Lemma 4.4. Let |ψ be the state of a quantum algorithm after k queries. Then, |ψ = i αi (x1 , . . . , xN )|i, with αi (x1 , . . . , xN ) being polynomials of degree at most k. [Bea0 +01], [For2 Rog1 99] At the end of the algorithm, the probability of the measurement giving i is the square of the amplitude of |i. Since the amplitude is a polynomial of degree at most k, its square is a polynomial of degree at most 2k. Consider the version of Problem 3.3 where the algorithm is only required to find if there is xi = 1 or not. Let p(x1 , . . . , xN ) be the probability of the algorithm outputting the answer “there exists xi = 1”. Then, by the argument above, p is a polynomial in x1 , . . . , xN of degree at most 2T . If the algorithm is correct, this polynomial must satisfy the following properties: 1. 0 ≤ p(0, . . . , 0) ≤ ε; 2. If xi = 1 for some i, then 1 − ε ≤ p(x1 , . . . , xN ) ≤ 1. The authors of [NisSze92], [Pat1 92]√show that any polynomial satisfying these constraints has degree Ω( N ). The first step of the proof is to use the symmetry to transform the polynomial in N variables into a polynomial in 1 variable (x1 + x2 + . . . + xN ) of the same degree. This gives a polynomial p(x) such that 0 ≤ p(0) ≤ ε but 1 − ε ≤ p(x) ≤ 1 for all x ∈ {1, . . . , N }. Then, one can apply results from approximation theory √ (cf. [Che66]) to show that any such polynomial must have degree Ω( N ). 4.3
Quantum adversaries
The third approach is the “quantum adversary” method by the present author (cf. [Amb0 00]). The main idea is to run the algorithm on an input that is a quantum state. That is, we consider a quantum system consisting of two registers HA and HI , with one register (HA ) containing the algorithm’s state and the other register (HI ) containing the input. At the beginning HA contains the algorithm’s starting state (|0) and HI contains a state of the form x αx |x where x = (x1 , x2 , . . . , xN ) ranges over all possible inputs. If the algorithm performs a unitary transformation Ui , we just perform Ui on HA . Queries O are replaced by unitary transformations O that are equal to Ox on the subspace consisting of states |ψ|x, |ψ ∈ HA .
26
ANDRIS AMBAINIS
If the input register HI contains a basis state |x, then the state in HA at any moment is just the state of the algorithm on the input x. For example, consider the search (Problem 3.1). First, we look at the case when HI contains the basis state |x with xi = 1 and xj = 0 for j = i. The correct answer on this input is i. Therefore, the end state in HA must be |i. This means that the algorithm transforms the starting state containing |0 in HA register into the end state containing |i in HA :
. . 0 . . . 0 1 0 . . . 0 → |i| 0 . |0| 0 . . . 0 1 0 . i−1
N −i
i−1
(1)
N −i
We can also put a combination of several basis states |x into HI . Then, we are running the algorithm on an input that is a quantum state. Since unitary transformations are linear, the end state of the algorithm on the starting state |0 x αx |x is just a linear combination of the end states for the starting states For the search problem, we consider the |0|x. 1 √ starting state that is x N |x, with summation over all x that have exactly one xi = 1. Then, by summing up (1) for i = 1, . . . , N , we get N N 1 1 √ |i| 0 . √ |0| 0 . . . 0 1 0 . . . 0 . . . 0 1 0 . . . 0 → N N i=1 i=1 i−1 i−1 N −i N −i
(2)
Notice an important difference between the starting state and the end state in (2). In the starting state, the HA and HI registers are uncorrelated. For any state in HI , HA register contains the same state |0. In the end state, the two registers are perfectly correlated. For every of N basis states |0 . . . 010 . . . 0 in HI , HA contains a different state. This is also true for other problems. In general, if the algorithm computes a function f , the parts of state in HI that correspond to inputs x with one value of f (x) become correlated with parts of HA which correspond to the algorithm outputting this answer2 . To bound the number of queries, we introduce a quantity measuring the correlations between HA and HI . We show bounds on this quantity correlations in the starting state, end state and the maximum change in one query. This implies a lower bound on the number of queries. For more details, cf. [Amb0 00]. 2
Those correlations are often called quantum entanglement because they are different from correlations between classical random variables, cf. [ChuNie00].
QUANTUM QUERY ALGORITHMS
4.4
27
Comparison of different methods
√ While all three described methods give the same Ω( N ) lower bound for Problem 3.3, they are of different strength for other problems. Classical adversary arguments can be usually recast into the quantum adversary framework. Thus, the quantum adversary method is a generalization of the classical adversary. The strengths of the quantum adversary and the polynomials method are incomparable. For many problems, both of them give optimal lower bounds. The polynomials method usually needs the problem to have a certain degree of symmetry. The reason is that it is easier to deal with polynomials in one variable, rather than in N variables. For search and other symmetric functions, one can convert an N -variable polynomial to a one-variable polynomial. However, if the problem lacks symmetry, it might be difficult. On the other hand, the polynomials method is better for proving lower bounds for quantum algorithms with zero error or very small error (cf. [Buh+99]). It is also the only method which gives a lower bound for the collision problem (cf. [Aar02], [Shi02]).
5
Other problems
A variety of other problems have been studied in the query model. We now list some main results and open problems. 5.1
Period-finding
This is a class of problems for which quantum algorithms have achieved the most impressive speedups. Two examples are: Problem 5.1. We are given a black box computing a 2-1 function f : {0, 1}n → {0, 1}n such that there exists x ∈ {0, 1}n , x = 0n such that f (x ⊗ y) = f (y) for all y. We want to find x. [Sim0 97] Classically, this problem requires querying f (x) for Ω(2n/2 ) values of x. In the quantum case, O(n) quantum queries are enough (cf. [Sim0 97]).
28
ANDRIS AMBAINIS
This was the first example of an exponential advantage for a quantum algorithm. It led to Shor’s factoring algorithm (cf. [Sho1 97]) which is based on the following black-box problem. Problem 5.2. We are given a black box computing a k-to-1 function f : Z → {1, . . . , N } such that f (x) = f (x + k) for all x. We want to find k. [Sho1 97] The problem can be solved with O(1) queries, cf. [Sho1 97], (com√ 3 pared to Ω( N ) classically, cf. [Cle00]). Both of those problems (and several other problems) are particular cases of the Abelian hidden subgroup problem (cf. [EkeMos0 98]). 5.2
Finding collisions
The algorithms for period-finding rely on the algebraic structure of the problem. If there is no such structure, the problem becomes much harder. Problem 5.3. We are given f : {1, . . . , N } → {1, . . . , N } such that f is 2-1 and want to find x, y such that f (x) = f (y). Notice that Problem 5.1 is a particular case of Problem 5.3 with more structure (f (x) = f (y) if and only if y = x + z). The absence of this structure changes the complexity dramatically. While Problem 5.1 can be solved with O(n) = O(log N ) queries, the best quantum algorithm for Problem 5.3 uses O(N 1/3 ) queries (cf. [Bra1 HøyTap97]). The algorithm uses Grover’s search algorithm. Proving a lower bound was open for four years, until Aaronson (cf. [Aar02]) showed an Ω(N 1/5 ) lower bound and Shi (cf. [Shi02]) improved it to Ω(N 1/3 ). The proof uses the polynomials method in a clever way. A related problem is element distinctness: Problem 5.4. We are given f : {1, . . . , N } → {1, . . . , N } such that there is a unique x, y such that x = y and f (x) = f (y). We want to find x, y. There is a quantum algorithm that solves element distinctness with O(N 3/4 ) queries (cf. [Buh+01]). It is based on a recursive use of Grover’s algorithm in a clever way. An Ω(N 2/3 ) lower bound has been shown by
QUANTUM QUERY ALGORITHMS
29
Shi (cf. [Shi02]). Closing the gap between those two results is an open problem. There is also a similar gap between upper and lower bounds for problems intermediate between 5.3 and 5.4 (functions f that have k pairs x, y such that f (x) = f (y)). 5.3
Counting and approximate counting
This is a class of problems studied by [Gro2 98], [Bra1 HøyTap98], [NayWu99] and others. Up to quadratic speedup can be achieved. An example is Problem 5.5. We are given x1 ∈ {0, 1}, x2 ∈ {0, 1}, . . ., xN ∈ {0, 1} and want to find k = x1 + . . . + xN . Classically, this requires Ω(N ) queries. In the quantum case, O( (k + 1)(N − k + 1)) queries are sufficient3 If k ≈ N/2, this is the same as O(N ) and there is no speedup. However, if k is close to 0 or N , quantum algorithms achieve up to√ quadratic speedup. If k = O(1), O( (k + 1)(N − k + 1) becomes O( N ), but classical algorithms still need Ω(N ) queries. For k ≈ N/2, although quantum algorithms need the same number of queries as classical for exact counting, they are better for approximate counting. Problem 5.6 (ε-approximate count). We are given x1 ∈ {0, 1}, x2 ∈ {0, 1}, . . ., xN ∈ {0, 1} and we know that either at least 21 + ε fraction of them are 0 or at least 12 + ε fraction of them are 1. We want to distinguish these two cases.
This can be solved with quantum algorithms with O(1/ε) queries, compared with O(1/ε2 ) queries needed classically (cf. [Gro2 98]). Both exact and approximate counting results are tight. Matching lower bounds can be shown by either the polynomials method (cf. [NayWu99]) or the quantum adversary (cf. [Amb0 00]). For more information, cf. [Bra1 +02], [Gro2 98], [NayWu99]. 3
Cf. [Bra1 HøyTap98].
30
5.4
ANDRIS AMBAINIS
Binary search
Problem 5.7. We are given x1 = . . . = xi = 0, xi+1 = . . . = xN = 1 and want to find i. Classically, log2 N queries are needed. Quantum algorithms can solve this problem with 0.53... log2 N (cf. [Far1 +99]) or 0.65... log2 N (cf. [HøyNeeShi01]) queries using two different techniques. The best lower bound is 0.22... log2 N (cf. [HøyNeeShi01]) by a variant of “quantum adversary”. Thus, the speedup is at most a constant. It remains open to determine the exact constant achievable by quantum algorithms for this basic problem. Related problems are searching and element distinctness, both of which have been studied in [HøyNeeShi01]. 5.5
General bounds
There are classes of problems for which we can bound the maximum improvement over classical algorithms achievable for any problem in this class. The broadest such class is the class of total Boolean functions. Let f : {0, 1}N → {0, 1} be a total function of N variables x1 ∈ {0, 1}, . . ., xN ∈ {0, 1}. Let Q(f ) be the quantum query complexity of f and D(f ) be the deterministic query complexity. Beals et al. in [Bea0 +01] showed that D(f ) = O(Q6 (f )). Thus, for total functions, quantum algorithms can have at most a polynomial speedup. This contrasts with partial functions for which an exponential speedup is possible. (An example is Problem 5.1 or Problem 5.2.) It is open what is the maximum possible speedup achievable for total f . The best known example is the OR function x1 OR x2 OR . . . OR xN . Its deterministic query complexity is N but its quantum query complexity √ is O( N ) which is obtained by Grover’s algorithm. (Solving Problem 3.3 is enough to compute the OR function. If we find any xi = 1, OR of all xi is 1. If there is none, OR is 0.) An excellent survey on this topic is [BuhdWo02].
QUANTUM QUERY ALGORITHMS
5.6
31
Other results
Van Dam in [vDa98] has shown that any function of N 0,1-valued variables can be computed with N/2+o(N ) queries. This is optimal because it is also known that any quantum algorithm needs N/2 queries to compute the parity of x1 , . . . , xN (cf. [Bea0 +01], [Far1 +98]). Buhrman et al. in [Buh+99] have studied the number of queries needed for zero-error quantum algorithms and small-error quantum algorithms. They showed that, for several problems, the speedup achieved by quantum algorithms disappears in the zero-error model. Thus, requiring a quantum algorithm to give the correct answer with probability 1 − ε (instead of 1) can be important for the construction of good quantum algorithms. They also studied the tradeoff between the decrease in the error and increase in the number of queries for the search problem. The present author and de Wolf in [Amb0 dWo01] studied the averagecase query complexity when we allow the algorithm to use different numbers of queries on different inputs and count the expected number of queries, under some distribution over possible values of x1 , . . . , xN .
6
Beyond the query model
A related area is quantum communication complexity. In the standard model of communication complexity (cf. [KusNis97]), we have two parties (often called Alice and Bob) computing a function f (x, y) of two inputs, with the input x known to one party (Alice) and the input y known to the other party (Bob). Since neither party knows both inputs, communication between the two parties is needed to compute f . The communication complexity of f is the minimum number of bits that need to be communicated between Alice and Bob to compute f . The speedups for problems in the query model often imply √ similar speedups for communication problems. For example, the O( N )-query algorithm for problem 3.3 can be used to construct a communication protocol that solves the√following problem (called set disjointness) by communicating just O( N log N ) quantum bits4 between Alice and Bob (cf. [BuhCleWig98]). This provides almost quadratic improvement over the classical case because, classically, Ω(N ) bit communication is needed to solve this problem (cf. [KalSch5 92], [Raz1 92]). 4
√ ∗ The result has been further improved to O( N clog n ) bits by [HøydWo02].
32
ANDRIS AMBAINIS
Problem 6.1. Alice is given a subset A ⊆ {1, . . . , N }, Bob is given B ⊆ {1, . . . , N }. They want to find out if A and B are disjoint. Query algorithms imply corresponding communication algorithms but proving lower bounds for quantum communication complexity is more difficult (because communication complexity is a more general setting). Proving a good lower bound for Problem 6.1 was open for about √ five years until, very recently, Razborov in [Raz1 03] showed that Ω( N ) bits of communication are needed. The proof uses polynomials method in a combination with several other ideas. For a survey on quantum communication complexity, cf. [dWo02]. A good book on classical communication complexity is [KusNis97].
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Algebras of minimal rank: overview and recent developments Markus Bl¨aser Institut f¨ur Theoretische Informatik ETH Z¨urich 8092 Z¨urich, Switzerland E-mail:
[email protected]
Abstract. Let R(A) denote the rank (also called bilinear complexity) of a finite dimensional associative algebra A. A fundamental lower bound for R(A) is the so-called Alder-Strassen bound R(A) ≥ 2 dim A − t, where t is the number of maximal twosided ideals of A. The class of algebras for which the AlderStrassen bound is sharp, the so-called algebras of minimal rank, has received a wide attention in algebraic complexity theory. A lot of effort has been spent on describing algebras of minimal rank in terms of their algebraic structure. We review the known characterisation results, study some examples of algebras of minimal rank, and have a quick look at some central techniques. We put special emphasis on recent developments.
1
Introduction
How many operations are necessary and sufficient to multiply two matrices? Or two triangular matrices? How about two polynomials modulo a third fixed one? These questions are examples of perhaps the most fundamental problem in algebraic complexity theory, the complexity of multiplication (in associative algebras). To be more specific, let A be a finite Received: November 21st, 2001; Accepted by the editors: January 11th, 2002. 2000 Mathematics Subject Classification. 68Q17 16Z05.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 33-46. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
¨ MARKUS BLASER
34
dimensional associative k-algebra with unity 1. (In the following, we will often speak of an algebra for short.) We fix a basis of A, say v1 , . . . , vN , and define a set of bilinear forms corresponding to the multiplication in A as follows: If vµ v ν =
N
(κ) αµ,ν vκ
for 1 ≤ µ, ν ≤ N
κ=1 (κ)
with structural constants αµ,ν ∈ k, then these constants and the identity N N N Xµ vµ Yν v ν = bκ (X, Y )vκ µ=1
ν=1
κ=1
define the desired bilinear forms b1 , . . . , bN . The bilinear complexity or rank of these bilinear forms b1 , . . . , bN is the smallest number of essential bilinear multiplications necessary and sufficient to compute b1 , . . . , bN from the indeterminates X1 , . . . , XN and Y1 , . . . , YN . More precisely, the bilinear complexity of b1 , . . . , bN is the smallest number r of products p = u (X) · v (Y ) with linear forms u and v in the Xi ’s and Yj ’s, respectively, such that b1 , . . . , bN are contained in the linear span of p1 , . . . , pr . From this definition it is obvious that the bilinear complexity of b1 , . . . , bN does not depend on the choice of v1 , . . . , vN . Thus we may speak about the bilinear complexity of (the multiplication in) A. For a modern introduction to this topic and more background, the reader is referred to [B¨urClaSho0 97]. For more details on associative algebras, we recommend [DroKir94]. A central lower bound for the rank of an associative algebra A is the so-called Alder-Strassen bound (cf. [AldStr81]). It states that the rank of A is bounded from below by twice the dimension of A minus the number of twosided ideals in A. This bound is sharp in the sense that there are algebras for which equality holds. Since then, a lot of effort has been spent on characterising these algebras in terms of their algebraic structure. There has been some success for certain classes of algebras, like division algebras, commutative algebras, etc., but up to now, no general result holding over all fields is known. Recently, some progress has been made, namely a complete characterisation of all semisimple algebras of minimal rank and of all algebras over algebraically closed fields has been provided (cf. [Bl¨a00]). The latter results can also be generalised to perfect fields (cf. [Bl¨a01], [Bl¨a02]). In contrast to prior results, these classes
ALGEBRAS OF MINIMAL RANK
35
of algebras also contain algebras such that the quotient algebra A/ rad A has factors that are not division algebras. Here rad A denotes the radical of A which is defined as the intersection of all maximal twosided ideals of A. Closely related to the bilinear complexity is the multiplicative complexity. Here we consider products pλ = uλ (X, Y ) · vλ (X, Y ) where uλ and vλ are linear forms both in the Xi ’s and the Yj ’s. The multiplicative complexity is clearly a lower bound for the bilinear complexity and it is easy to show that twice the multiplicative complexity is an upper bound for the bilinear complexity, see, e.g., [B¨urClaSho0 97, Chapter 14]. 1.1
Model of Computation
Next we give a coordinate-free definition of rank. Such a definition turns out to be more appropriate in the following. For a vector space V , V ∗ denotes the dual space of V . Definition 1.1. Let k be a field, U , V , and W finite dimensional vector spaces over k, and ϕ : U × V → W be a bilinear map. 1. A sequence β = (f1 , g1 , w1 , . . . , fr , gr , wr ) such that f ∈ U ∗ , g ∈ V ∗ , and w ∈ W is called a bilinear computation of length r for ϕ if ϕ(u, v) =
r
f (u)g (v)w
for all u ∈ U, v ∈ V .
=1
2. The length of a shortest bilinear computation for ϕ is called the bilinear complexity or the rank of ϕ and is denoted by R(ϕ). 3. If A is a finite dimensional associative k-algebra with unity, then the rank of A is defined as the rank of the multiplication map of A, which is a bilinear map A × A → A. The rank of A is denoted by R(A). The products f (u) · g (v) stand in one-to-one correspondence with the products p defined above. Definition 1.1 also explains the alternate name rank: One can identify Bil(U, V ; W ), the set of all bilinear maps U × V → W , with U ∗ ⊗ V ∗ ⊗ W . There the triads f ⊗ g ⊗ w are rank one elements. Just as the rank of an ordinary matrix M is the minimal number of rank one matrices such that M is contained in the linear span of these rank one matrices, the rank of a bilinear map ϕ is the minimal
¨ MARKUS BLASER
36
number of rank one triads such that ϕ (more precisely, the tensor of ϕ) can be written as the sum of these triads. If we allow that f and g are linear forms on U × V in the above definition, we get quadratic computations and multiplicative complexity. With the introduced terminology, the Alder-Strassen bound can be rephrased as follows: R(A) ≥ 2 dim A − # of maximal twosided ideals of A
(1)
for any algebra A. Definition 1.2. An algebra A is called an algebra of minimal rank if equality holds in (1). One prominent example of an algebra of minimal rank is k 2×2 , the algebra of 2 × 2-matrices. Strassen [Str69] showed R(k 2×2 ) ≤ 7. On the other hand, k 2×2 is a simple algebra, i.e., it has only one maximal twosided ideal. For a long time it had been open whether k 3×3 is of minimal rank or not, see [B¨urClaSho0 97, Problem 17.1]. The idea was that if one could characterise all algebras of minimal rank in terms of their algebraic structure, then one simply has to verify whether k 3×3 has this structure or not. Meanwhile, we know that k 3×3 is not of minimal rank [Bl¨a99]. Nevertheless, the characterisation of the algebras of minimal rank is an interesting and important topic in algebraic complexity theory on its own, see, e.g., [Str90, § 12, Problem 4] or [B¨urClaSho0 97, Problem 17.5]. 1.2
Organisation of the Article
The organisation of the article follows closely the structure of the investigated algebras. We first treat division algebras, then local algebras, commutative algebras, and basic algebras. These results are the “old” results in the sense that the characterisation results for these classes are known for more than fifteen years. After that we turn to the new results, starting with simple algebras. After that, we study semisimple algebras and finally consider arbitrary algebras over perfect fields. We conclude this article with some interesting open problems.
ALGEBRAS OF MINIMAL RANK
2
37
Division Algebras
The easiest example to study in the context of bilinear complexity are division algebras. Recall that an algebra is called a division algebra if all of its nonzero elements are invertible. For this class, Fidducia and Zalcstein in [FidZal0 77] proved one of the first lower bounds for the complexity of associative algebras, namely R(D) ≥ 2 dim D − 1 for any division algebra D. This is nothing else than the Alder-Strassen bound for division algebras, since division algebras have only one maximal twosided ideal, the zero ideal. Division algebras are also the first class of algebras of minimal rank for which a characterisation in terms of their structure was obtained (cf. [dGr83]). Assume that D is a division algebra of minimal rank. Furthermore, let β = (f1 , g1 , w1 , . . . , fr , gr , wr ) be a bilinear computation for D where r = 2 dim D − 1. We claim that f1 , . . . , fr generate D∗ (as a vector space). If this was not the case, then there would be some nonzero x ∈ r =1 ker f . But this would mean x=x·1=
r
f (x)g (1)w = 0,
=1
a contradiction. Thus without loss of generality, f1 , . . . , fN generate D∗ (with N = dim D). Let x1 , . . . , xN be the dual basis of f1 , . . . , fN . By a technique called “sandwiching” (see, e.g., [dGr86]) we may also assume that xN = 1. Loosely speaking, sandwiching allows us to replace the basis x1 , . . . , xN by ax1 b, . . . , axN b for any invertible elements a, b ∈ D. This of course changes also the computation β. But still β is a computation for D. Now gN , . . . , g2N −1 generate D∗ , too. If this was not the case, then −1 we could find some nonzero y ∈ 2N j=N ker gj . But then y = xN · y =
2N −1
f (xN )g (y)w = 0,
=1
−1 since xN ∈ N i=1 ker fi by the definition of dual basis. Once more, we have obtained a contradiction. Let yN , y1 , . . . , yN −1 denote the dual basis of gN , gN +1 , . . . , g2N −1 . (The shift of the indices is introduced for convenience.) Again we also can achieve that yN = 1. This does not change
¨ MARKUS BLASER
38
x1 , . . . , xN . Exploiting the definition of dual basis, we obtain xi · 1 =
2N −1
f (xi )g (1)w = gi (1)wi
for 1 ≤ i ≤ N − 1,
=1
1 · yj =
2N −1
f (1)g (yj )w = fN +j (1)wN +j
for 1 ≤ j ≤ N − 1.
=1
The factors gi (1) and fN +j (1) are all nonzero and thus we may assume that they all equal one by scaling the products appropriately. It follows that x i · yj =
2N −1
f (xi )g (yj )w
=1
= gi (yj )wi + fN +j (xi )wN +j ∈ k · xi + k · yj
for 1 ≤ i, j ≤ N − 1.
(2)
A pair of bases 1, x1 , . . . , xN −1 and 1, y1 , . . . , yN −1 such that (2) holds is called a multiplicative pair or M-pair for short. Such pairs play a crucial role in the characterisation of algebras of minimal rank for which the quotient A/ rad A consists solely of factors that are division algebras. In the case of a division algebra, we can proceed as follows: Assume that D is not simply generated. Consider D as a vector space over K := k(x1 ). Since D is not simply generated, we have d := dimK D > 1. By a suitable permutation, we can achieve that y1 , . . . , yd is a basis of the K-vector space D. Now (2) implies x1 · y1 , . . . , x1 · yd ∈ k · x1 + k · y1 + · · · + k · yd . But over k, x1 · y1 , . . . , x1 · yd and y1 , . . . , yd are linearly independent. Comparing dimensions we get a contradiction, as d > 1. Thus D has to be simply generated, that is, it is isomorphic to k[X]/(p(X)) for some irreducible polynomial p. This yields the following classification result. Theorem 2.1 (de Groote). A division algebra D over some field k is of minimal rank iff D is simply generated and #k ≥ 2 dim D − 2. In particular, D is a field. [dGr83]
ALGEBRAS OF MINIMAL RANK
39
The field k has to be large enough since the upper bound R(D) ≤ 2 dim D − 1 is obtained via polynomial multiplication. Two univariate polynomials can be multiplied modulo a third one of degree n with 2n − 1 bilinear multiplications iff #k ≥ 2n − 2, as shown by Winograd in [Win0 79]. To avoid such lengthy discussion on the size of the field, we assume that in the following the underlying field k is always infinite.
3
Local Algebras
An algebra A is called local if A/ rad A is a division algebra. One example of local algebras of minimal rank can be easily obtained from those in the previous section. Let p be an irreducible polynomial of degree d in the indeterminate X over k. Then k[X]/(p ) is a simply generated local algebra. For = 1, this is even a simply generated division algebra, hence it is of minimal rank. For arbitrary , k[X]/(p ) is an algebra of minimal rank, too. We have R(k[X]/(p )) ≤ 2 · d − 1 by utilising fast polynomial multiplication modulo p . One the other hand, k[X]/(p ) has only one maximal twosided ideal, thus R(k[X]/(p )) ≥ 2 · d − 1 by the Alder-Strassen bound (1). A second example are the so-called generalised null algebras. A generalised null algebra A is generated by nilpotent elements e1 , . . . , em such that ei · ej = 0 whenever i = j. Generalised null algebras are local, their maximal ideal is (e1 , . . . , em ). Moreover, they are of minimal rank. Again the upper bound is obtained via univariate polynomial multiplication. Let us study this fact with a little example. Consider the three dimensional algebra T = k[X, Y ]/(X 2 , XY, Y 2 ). By the Alder-Strassen theorem, R(T ) ≥ 5. One the other hand, given two elements a + bX + cY
and
a + b X + c Y,
we can multiply them with five bilinear multiplications as follows. We first multiply a + bX with a + b X (modulo X 2 ). Note that this is nothing else than univariate polynomial multiplication. The result aa +(ba + ab )X can obviously computed with three multiplications. In the same way, we multiply a + cY with a + c Y . Again this needs three multiplications. But we compute the product aa for a second time. Hence we can save one multiplication and get the upper bound five. The same trick works for arbitrary generalised null algebras.
40
¨ MARKUS BLASER
De Groote and Heintz showed that there are no further commutative local algebras. Theorem 3.1 (De Groote & Heintz). A commutative local algebra is of minimal rank iff it is simply generated or a generalised null algebra. [dGrHei1 83] B¨uchi and Clausen extended this result to arbitrary local algebras. It turns out that local algebras of minimal rank are “almost commutative”. ¨ Theorem 3.2 (Buchi & Clausen). A local algebra A is of minimal rank iff it is simply generated or there exists a commutative local subalgebra B ⊆ A of minimal rank such that A = LA + B = RA + B. [B¨ucCla85] Here LA and RA denote the left and right annihilator of rad A, which are defined by LA = {x ∈ rad A | x(rad A) = {0}} and RA = {x ∈ rad A | (rad A)x = {0}}. The proofs of the above results extensively utilize the concept of Mpairs of bases. 3.1
Dependence on the Characteristic
Consider the four dimensional local algebra S = k[X, Y ]/(X 2 , Y 2 ). We have S = k · 1 + k · X + k · Y + k · XY as vector spaces. Over fields of characteristic distinct from two, S is generated by the two elements X + Y and X − Y , because (X + Y )2 = −(X − Y )2 = 2XY . Moreover, (X + Y )(X − Y ) = 0 (in S). Therefore S is a generalised null algebra and we have R(S) = 7. On the other hand, S is not of minimal rank over fields of characteristic two. If S would be an algebra of minimal rank, then it has to be a generalised null algebra. Take two nilpotent elements in S, say aX + bY + cXY and a X + bY + c XY , that annihilate each other. We have (aX + bY + cXY )(a X + b Y + c XY ) = (ab + ba )XY = (ab − ba )XY = 0.
ALGEBRAS OF MINIMAL RANK
41
But that means that the determinant ab − ba vanishes. In particular, any elements that pairwisely annihilate each other cannot generate both X and Y , a contradiction.
4
Commutative Algebras
Commutative algebras A are isomorphic to a direct product of commutative local algebras A = A1 ×· · ·×At , a result due to Artin. In the previous section, we have already characterised all local algebras of minimal rank. This would already complete the classification of commutative algebras t of minimal rank provided that R(A) = τ =1 R(Aτ ) would hold. However it is not known whether this holds in general. In fact, this is the socalled direct t sum conjecture made by Strassen in [Str73]. It is clear that R(A) ≤ τ =1 R(Aτ ) holds, since we can combine the single computations for each Aτ . Until today, we do not know any example where this inequality is strict. On the other hand, Sch¨onhage in [Sch7 81] constructed an example for which strict inequality holds for a related quantity, the socalled border rank. (Actually, his example did not involve multiplication in algebras but multiplication of rectangular matrices.) In our case here, we are lucky. De Groote and Heintz in [dGrHei1 83] showed that the direct sum conjecture holds for commutative algebras of minimal rank. They even proved the stronger result that every computation of A splits into t computations for A1 , . . . , At . Summarising the thoughts above, we obtain the following characterisation. Theorem 4.1 (De Groote & Heintz). A commutative algebra is of minimal rank iff it is isomorphic to a product of commutative local algebras of minimal rank as described in Theorem 3.1. [dGrHei1 83]
5
Basic Algebras over Algebraically Closed Fields
An algebra A is called basic if A/ rad A is isomorphic to a direct product of division algebras. Since over algebraically closed fields k, the only division algebra is k itself, we have A/ rad A ∼ = k t for basic algebras over algebraically closed fields. Again utilising the concept of M-pair, Heintz and Morgenstern in [Hei1 Mor1 86] were able to prove the following result.
¨ MARKUS BLASER
42
Theorem 5.1 (Heintz & Morgenstern). A basic algebra A over an algebraically closed field has minimal rank if and only if there exist w1 , . . . , wm ∈ rad A with wi wj = 0 for i = j such that rad A = LA + (w1 ) + · · · + (wm ) = RA + (w1 ) + · · · + (wm ). Up to LA and RA , basic algebras of minimal rank over algebraically closed fields are commutative.
6
Simple Algebras
All examples of algebras of minimal rank seen in the previous section base on polynomial multiplication. In this section, we study an example of a different kind, the algebra k 2×2 . Recall that an algebra A is simple if its only maximal twosided ideal is the zero ideal. By Wedderburn’s theorem, this is equivalent to the fact that there exists a division algebra D and an integer n such that A ∼ = Dn×n . In other words, simple algebras are matrix algebras where the entries of the matrices are elements from k-division algebras. By Strassen’s algorithm for multiplying 2 × 2-matrices, R(k 2×2 ) ≤ 7. On the other hand, R(A) ≥ 2 dim A − 1 for any simple algebra by the Alder-Strassen bound. Hence, k 2×2 is of minimal rank. It turns out that this is the only simple algebra of minimal rank. In fact, the question whether for any other k n×n the Alder-Strassen bound is sharp, started the efforts in characterising the algebras of minimal rank. For algebras of the form k n×n it was finally proven in [Bl¨a99] that R(k n×n ) ≥ 2n2 + n − 3.
(3)
(This even holds for the multiplicative complexity.) It follows that for n ≥ 3, k n×n is not of minimal rank. The technique for showing this result is the so-called substitution method due to Pan (cf. [Pan66]). We will exemplify this method at the end of this section. For simple algebras of the form Dn×n with dim D ≥ 2, it was shown in [Bl¨a00] that R(Dn×n ) ≥ 2 dim Dn×n
for any n ≥ 2.
The methods are centered around the problem of finding elements a, b ∈ Dn×n such that dim([a, b] · Dn×n ) is large, where [a, b] = ab − ba denotes the commutator or Lie product of a and b. Loosely speaking, the
ALGEBRAS OF MINIMAL RANK
43
“more noncommutative” an algebra is, the harder is multiplication in this algebra. The methods are beyond the scope of the present article, the interested reader is referred to [Bl¨a00]. Altogether, the above arguments yield the following result. Theorem 6.1. A simple algebra A ∼ = Dn×n is of minimal rank iff either n = 2 and D = k or n = 1 and D is a division algebra of minimal rank. The methods developed in [Bl¨a00] also yield the assymptocially best lower bound for the rank of multiplication in simple algebras R(Dn×n ) ≥ 25 dim Dn×n − 3n. By adjusting the correction term −3n appropriately, this bounds holds for arbitrary algebras, too. For semisimple algebras, this bound also holds for the multiplicative complexity. 6.1
The Substitution Method
As an application of the substitution method, we prove that R(k 2×2 ) ≥ 7. The proof of (3) for larger formats is proven along the same lines, the arguments are however much more involved. Originally, one takes a computation and restricts the computed bilinear map to say ker f1 (one coordinate is substituted by a linear form in the other ones). This trivialises one bilinear product. Then one proceeds with the remaining bilinear map in a recursive fashion. We here present a refined approach, basically due to Alder and Strassen. Let β = (f1 , g1 , w1 , . . . , fr , gr , wr ) be a bilinear computation for 2×2 k . Since the multiplication is surjective, w1 , . . . , wr generate k 2×2 . Let U and V be the vector spaces 00 0∗ U= and V = . ∗∗ 0∗ Above, the expressions on the righthand side denote the spaces of matrices obtained by substituting arbitrary elements from k for the “∗”. Note that U is a right and V is a left ideal. By “sandwiching”, we can achieve that w1 ∈ / U but w1 ∈ U + V . We just have to find invertible matrices a and b such that a · w1 · b has a zero as (1, 1)-entry and a nonzero value as (1, 2)-entry. We claim that
¨ MARKUS BLASER
44
g2 , . . . , gr generate (k 2×2 )∗ . If this was not the case, then there would be some nonzero x ∈ ri=2 ker gi . But then a · x ∈ k · w1
for each a ∈ k 2×2 .
This cannot be the case, because k 2×2 · x is a left ideal and has dimension at least two. Again by a suitable permutation, we can assume that g2 |V , g3 |V form a basis of V ∗ . Next we show that f4 |U , . . . , fr |U generate U ∗ . If this were wrong, then there would be some x ∈ U \ {0} such that x ∈ rj=4 ker fj . Let a ∈ k 2×2 be arbitrary. Since g2 |V , g3 |V form a basis of V ∗ , there is an a1 ∈ V such that gj (a1 ) = gj (a) for j = 2, 3. Therefore x · (a − a1 ) = f1 (x)g1 (a − a1 )w1 +
3 j=2
+
r i=4
fj (x) gj (a − a1 ) wj
f (x) g (a − a1 )wi .
i i
=0
(4)
=0
Since x ∈ U and U is a right ideal, x · a ∈ U , too. As U is even minimal, x · k 2×2 = U . Because V is a left ideal and a1 ∈ V , xa1 ∈ U ∩ V holds. Thus (4) implies U = x · k 2×2 ⊆ k · w1 + U ∩ V. / U , intersecting both sides with U yields U ⊆ U ∩ V , a Since w1 ∈ contradiction. Hence, we may assume that f4 |U , f5 |U generate U ∗ . Finally, we show that g2 , g3 , g6 , . . . , gr generate (k 2×2 )∗ . This yields r ≥ 7. Again, the proof is by contradiction. Assume that there is some nonzero
x ∈ ker g2 ∩ ker g3 ∩
r
ker gj .
j=6
/ V . The same reaSince g2 |V , . . . , g3 |V generate V ∗ , we know that x ∈ soning as before shows k 2×2 · x ⊆ k · w1 + U. By construction, the righthand side is contained in U + V . The left hand side, however, contains a matrix that has a nonzero entry in position (1, 1). Thus we have encountered a contradiction.
ALGEBRAS OF MINIMAL RANK
7
45
Semisimple Algebras
An algebra is called semisimple if rad A = {0}. By Wedderburn’s theorem, a semisimple algebra A is isomorphic to a product A1 × · · · × At of simple algebras Aτ . Note that t is exactly the number of maximal twosided ideals in A. In the course of the proof of their bound, Alder and Strassen showed that R(A × B) ≥ 2 dim A − 1 + R(B) for any simple algebra A and any arbitrary algebra B (cf. [AldStr81]). Applying induction, we get that a semisimple algebra is of minimal rank iff each of the simple algebras Aτ is of minimal rank. Simple algebras of minimal rank have been characterised in the previous section. Theorem 7.1. A semisimple algebra A ∼ = A1 × · · · × At is of minimal rank iff each simple factor Aτ is of minimal rank, that is, Aτ is either isomorphic to k 2×2 or is a division algebra of minimal rank.
8
Algebras over Perfect Fields
The most general characterisation result we know today, is the complete characterisation of all algebras of minimal rank over perfect fields1 . Note that “most” fields are perfect, like all algebraically closed fields, all fields of characteristic zero, and all finite fields (which we exclude in this article for the sake of simplicity). Theorem 8.1. An algebra A over a perfect field k is of minimal rank iff 2×2 A∼ × · · · × k 2×2 ×A = C1 × · · · × Cs × k
(5)
u times where the factors C1 , . . . , Cs are local algebras having minimal rank with dim(Cσ / rad Cσ ) ≥ 2, A is of minimal rank with A / rad A = k t , and there exist w1 , . . . , wm ∈ rad A with wi wj = 0 for i = j such that rad A = LA + (w1 ) + · · · + (wm ) = RA + (w1 ) + · · · + (wm ). Any of the integers s, u, or m may be zero and the factor A in (5) is optional. 1
Cf. [Bl¨a01] and [Bl¨a02]. See also [Bl¨a00] for a characterisation over algebraically closed fields.
¨ MARKUS BLASER
46
Over finite fields, a similar characterisation result can be obtained, too, see [Bl¨a01], [Bl¨a02] for more details. It is interesting to note that algebras of minimal rank are composed solely from examples we have already known before. Loosely speaking, the only ways to obtain algebras of minimal rank are via polynomial multiplication and Strassen’s algorithm for multiplying 2 × 2-matrices.
9
Open Problems
The obvious open problem is the following: Determine all algebras of minimal rank over arbitrary fields! The reason why the results in the preceding section were only proven over perfect fields, is that the proof heavily relies on the Wedderburn-Mal’cev theorem2 . From the Wedderburn-Mal’cev theorem it follows, that any algebra A over a perfect field has a subalgebra B such that A/ rad A = B and B ⊕ rad A = A. This nice structure property is used in the proof of Theorem 8.1. One actually does not need that the field is perfect, one only needs the existence of a subalgebra B having the described properties. Another interesting problem –but probably a hopeless one in the general case– is the determination of the so-called algorithm varieties of an algebra. Basically we want to determine all optimal algorithms (up to equivalence) for multiplication in a given algebra. Of course, we have to know the rank of the algebra before we can start such a task, thus algebras of minimal rank are a natural candidate. A remarkable result is the proof by de Groote in [dGr78] that up to equivalence, there is only one way to multiply 2 × 2-matrices with seven bilinear multiplications. It is beyond the scope of this survey to define the used notion of equivalence precisely, but we have already seen some examples: An easy one is the permutation of the products, a second one is the scaling of the factors of each product. “Sandwiching” is a third example. Further results in this direction are [dGr83] and [Fel1 86]. Note added in proof (December 2003). Meanwhile, the author obtained a complete characterisation of the algebras of minimal rank over arbitrary fields. It turns out that Theorem 8.1 is also valid over arbitrary fields.
2
Cf., e.g., [DroKir94].
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Recent developments in iterated forcing theory J¨org Brendle The Graduate School of Science and Technology Kobe University Rokko–dai 1–1, Nada–ku Kobe 657–8501, Japan E-mail:
[email protected]
Abstract. These are expanded notes of a lecture on iterated forcing theory. Particular focus is put on recently developed techniques like constructions with mixed support, or Shelah’s iteration along templates.
1
What iterated forcing is about
Forcing, created by Cohen to prove the independence of the continuum hypothesis CH from ZFC, the standard axiom system for set theory, was soon transformed into a general and powerful technique to obtain a plethora of independence results. The aim of these notes is to provide a glimpse of goals, methods and problems of present-day forcing theory. While forcing can be used to get independence of combinatorial statements about larger cardinals as well, we shall focus on the continuum. So assume ϕ is a statement about the combinatorial structure of the real line. Received: November 12th, 2001; In revised version: February 28th, 2002; Accepted by the editors: March 4th, 2002. 2000 Mathematics Subject Classification. 03E35 03E17 03E40 03E50.
Supported by Grant-in-Aid for Scientific Research (C)(2)12640124, Japan Society for the Promotion of Science.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 47-60. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
¨ JORG BRENDLE
48
(A) A pattern encountered often is that ϕ is decided by CH, say, without loss, (i) CH ⇒ ϕ while it is independent of c ≥ ℵ2 , (ii) CON(c > ℵ1 + ¬ϕ) CON(c > ℵ1 + ϕ) A typical example would be a statement like the real line is not the union of less than c many meager sets which is true under CH by the Baire category theorem1 . (B) Some interesting statements are even independent of CH, e.g., Suslin’s hypothesis which asserts that every Dedekind-complete linear order without endpoints satisfying the countable chain condition c.c.c. (that is, every collection of pairwise disjoint non-empty open intervals is at most countable) is isomorphic to the real line. Then one typically gets both (i) CON(CH + ¬ϕ) CON(CH + ϕ) and (ii) CON(c > ℵ1 + ¬ϕ) CON(c > ℵ1 + ϕ) (C) Non-trivial statements which are decided even by c = ℵ2 while being independent of ZFC, i.e., (i) CH ⇒ ϕ (ii) c = ℵ2 ⇒ ϕ (iii) CON(c > ℵ2 + ¬ϕ) CON(c > ℵ2 + ϕ) have been extremely rare so far. An example is the existence of a Gross space, that is, an uncountable-dimensional vector space E over a countable field equipped with a symmetric bilinear form such that the orthogonal complement of every infinite-dimensional subspace has dimension less than dim(E) (see also below, Subsection 2.1). What is the reason for (C) occurring so seldom?—First of all c = ℵ2 is very weak combinatorially so that there is not much implied by it. Yet, 1
As usual c = |R| denotes the size of the continuum.
ITERATED FORCING THEORY
49
there is a deeper, forcing-theoretic reason: while we have a very good theory for making the continuum at most ℵ2 , i.e., for dealing with problems like (A)(ii), (B)(i)(ii), we are severely lacking in iteration theory when it comes to forcing c > ℵ2 . In fact, there are also a number of statements falling into patterns (A) or (B) for which (iii) CON(c > ℵ2 + ¬ϕ) CON(c > ℵ2 + ϕ) is still open. For example, CH implies “there is a set of reals of size c which cannot be mapped continuously onto the reals”, and this is consistently true with c being of arbitrary size, while it is also consistent this fails and c = ℵ2 . However, the latter consistency is still open when c > ℵ2 (cf. [Mil0 93] or [Mil0 00]). To get deeper into these problems, let us briefly recall what forcing does. It blows up a ground model M to a larger model M [G], called a generic extension, M → M [G] in the following fashion: in M , there is a partial order P such that G is P-generic over M , that is G is a filter and for all dense subsets D of P which belong to M , G ∩ D = ∅. Except for trivial choices of P, such G can be shown not to belong to M , and M [G] then can be described as the minimal model containing both M and G. The generic extension process can be iterated, M → M [G] → M [G][H] where P ∈ M , G is P-generic over M , Q ∈ M [G] and H is Q-generic over M [G]. There is an alternative way to describe this: namely, one can ˙ so that forcing with it does the same produce in M a partial order P Q thing as forcing first with P and then with Q, that is, there is a G H ˙ which is P Q-generic over M such that M [G][H] = M [G H], and the two-step iteration can be desribed in one step. Continuing we get M → M [G1 ] → M [G2 ] → M [G3 ] → ... → M [Gω ] ˙ 1 , P3 = P2 Q ˙ 2 , ... and the Gi are Pi -generic where P1 = Q0 , P2 = P1 Q over M . But what happens in the limit step? What is the model M [Gω ]? Or, to rephrase this, what is Pω ? There may be many possible choices, the two simplest ones being the direct limit given by Pω = n Pn , and
¨ JORG BRENDLE
50
thus being the minimal possible choice, and the inverse limit which can be thought of as the collection of all “threads”, thus being the maximal possible choice. More explicitly, one thinks of Pω as the set of all functions f with domain ω such that f n forces (in the sense of Pn ) that f (n) ˙ n . It should be clear that this iterative procedure can be is a member of Q continued as far as one wishes through the realm of the ordinals. 1.1
Classical iterations
Let us now assume we have an iterated forcing construction ˙ α ; α < κ Pα , Q with direct limit Pκ as sketched above. Here κ is an arbitrary limit ordinal; in most of the interesting cases, however, κ will be a regular uncountable cardinal. There are two main and, by now, classical techniques which have been used extensively for adjoining real numbers and, thus, for getting consistency results of the form (ii) and (iii) above, namely (1) finite support iteration of c.c.c. forcing, the technique originally developed by Solovay and Tennenbaum to show the consistency of Suslin’s hypothesis mentioned above, (2) countable support iteration of proper forcing, created by Shelah [She98]. For the first, the direct limit is taken everywhere so that this can be thought of as forcing with finite partial functions, while in the second, we use the inverse limit at ordinals of countable cofinality and the direct limit elsewhere which essentially amounts to forcing with countable partial functions. Some care has to be taken as to what kind of forcing can be iterated with which technique so as to avoid collapsing cardinals. For ˙ n have an uncountable antichain (are not example, if countably many Q c.c.c.), the direct limit will collapse ℵ1 so that finite support iteration is ruled out. This is why one uses it only for c.c.c. partial orders. The latter class in fact is closed under finite support iterations so that we can iterate as long as we wish without collapsing cardinals. Similarly, since countable support iteration is closed under proper forcing, the latter is a natural choice for (2). A partial order P is called proper [She98] if for large enough χ and every countable elementary substructure N of Hχ containing P, if p ∈ N ∩ P, then there is q ≤ p which is (P, N )-generic. Here
ITERATED FORCING THEORY
51
q is called (P, N )-generic if for every P-name for an ordinal γ˙ which belongs to N , q γ˙ ∈ N . (That is, if the name is in N , then q forces the evaluation of the name to be in N .) See below, Corollary 2.2, for an illustration of how this definition works. The proper partial orders form a much larger class than the c.c.c. partial orders. On the other hand, every proper partial order preserves ℵ1 . Both methods have rather obvious flaws. Let us explain this in somewhat more detail. Cohen forcing C is the collection of all finite partial functions from ω to 2, ordered by extension, i.e., q ≤ p iff q ⊇ p; more generally, given a cardinal λ, Cλ is the collection of all finite partial functions from λ to 2. A generic for C is called a Cohen real. Now one has ˙ α ; α < κ adds a Fact 1.1. A non-trivial finite support iteration Pα , Q Cohen real in each limit step of countable cofinality. More generally, it adds a generic for Cλ at each limit step of cofinality λ. ˙ α has Proof. The iteration being non-trivial means that Pα forces that Q 0 1 two incompatible elements, say q˙α and q˙α . Let µ < κ be an ordinal of countable cofinality. Without loss µ = ω. Let c˙ be the Pω -name given by: p forces c(n) ˙ = 0 iff pn forces that p(n) ≤ q˙α0 (so p forces c(n) ˙ = 1 iff pn forces that p(n) and q˙α0 are incompatible). It is easy to see that c˙ is forced to be C-generic over the ground model. The general case is analogous. q.e.d. A consequence of this is that if, say, we start with a model for CH, κ is regular uncountable, and Pκ forces c = κ, then Pκ also forces Martin’s Axiom MA for C, that is, the statement that for every collection of < c many dense sets of C there is a filter G ⊆ C which meets them all. Incidentally, the latter statement is equivalent to saying that the real line is not the union of less than c many meager sets (cov(M) = c in symbols), a property we considered above. More generally, Pκ also forces Martin’s Axiom for Cλ where λ < κ. This is a stronger statement than just having MA for C, see below (Subsection 2.1). However, there are many situations in which one wants to have cov(M) small. So what do we do in that case? In a sense, (2) is a much more powerful technique when it comes to making c ≤ ℵ2 for it both avoids the flaw sketched above and applies to a much wider class of partial orders. However, it necessarily forces c ≤ ℵ2 . To appreciate this, let Cω1 be the forcing for adding a Cohen
52
¨ JORG BRENDLE
subset of ω1 , that is, all countable partial functions from ω1 to 2 ordered by extension; similarly, for λ ≥ ℵ1 , Cωλ1 is the set of all countable partial functions from λ to 2. Then ˙ α ; α < κ Fact 1.2. A non-trivial countable support iteration Pα , Q adds a Cohen subset of ω1 in each limit step of cofinality ω1 . More generally, it adds a generic for Cωλ1 at each limit step of cofinality λ. The proof is the same as the one of Fact 1.1. Furthermore Fact 1.3. Forcing with Cω1 forces CH. Proof. Let {fα : ω → 2; α < c} enumerate the reals of the ground model. Note that given a condition p ∈ Cω1 and α < c, we can extend p to a condition q such that for some limit ordinal β < ω1 , q(β+n) = fα (n) for all n, that is, q codes the real fα at β. An easy density argument then shows that in the generic extension, all ground model reals are coded at some β by the generic function. This means that the size of the set of old reals has become ℵ1 . On the other hand, Cω1 , being ω-closed, does not q.e.d. add any new reals so that c = ℵ1 in the generic extension. ˙ α ; α < κ is a countable support iteration which Now assume Pα , Q adds new reals in each stage where κ = ω2 + ω1 . Since reals are added, Pω2 forces c = ℵ2 . By Facts 1.2 and 1.3, after the subsequent ω1 steps CH is forced so that ℵ2 has been collapsed by stage κ if not earlier. This shows that with a countable support iteration one can never go past c = ℵ2 , and this is the main reason there have been fewer consistency results with c ≥ ℵ3 , a fact mentioned above.
2 2.1
Recent developments Iterations with mixed support
Constructions with mixed support have been used for making Martin’s axiom MA for large Cohen algebras Cλ fail. We illustrate this with a simple paradigm due to Shelah [FucSheSou97], namely, adding Cohen reals with mixed support. This is a product construction, and no iteration in the strict sense, but the general method works along the same lines. Let κ be a cardinal. The p.o. P = Pκ consists of countable partial functions p : κ × ω → 2 such that {n; p(α, n) is defined} is finite for
ITERATED FORCING THEORY
53
all α. Denote by p(α, ·) the restriction of p to {α} × ω. Set supp(p) = {α; p(α, ·) = ∅}, the support of p, and define the ordering ≤ by q ≤ p iff q ⊇ p and {α ∈ supp(p); q(α, ·) p(α, ·)} is finite. P has the following properties: – Assuming CH, P has the ℵ2 -chain condition which entails that forcing with P preserves cardinals ≥ ℵ2 (this is a standard ∆-system argument). – P is proper so that forcing with P preserves ℵ1 (Lemma 2.1 and Corollary 2.2). – P adds κ many Cohen reals (this is easy) and thus forces cov(M) ≥ κ. – If ♦ holds in the ground model, ♣ holds in the generic extension (see below for definitions etc.). Say that q ≤0 p if q ≤ p and q(α, ·) = p(α, ·) for all α ∈ supp(p). Note that ≤0 is an ω-closed partial order. The following argument is archetypal for constructions with mixed support. Lemma 2.1. Let γ˙ be a P-name for an ordinal, p ∈ P. Then there is q ≤0 p such that the set Dq,γ˙ = {r ≤ q; supp(r) = supp(q) and r decides γ} ˙ is predense in P below q. [FucSheSou97] Proof. Suppose not. We recursively construct sets {pζ ; ζ < ω1 } and {qζ ; ζ < ω1 } of conditions such that (i) (ii) (iii) (iv) (v)
p0 ≤0 p and pξ ≤0 pζ for ξ ≥ ζ, qζ ≤ pζ , supp(qζ ) = supp(pζ ), if qζ (α, ·) pζ (α, ·), then α belongs to ξ<ζ supp(pξ ), ˙ and qζ decides γ, the qζ are pairwise incompatible.
This can be done by assumption. For indeed, suppose we are at step ζ of and pξ , qξ , ξ < ζ, have been produced. Set pζ = the construction, ξ<ζ pξ . Clearly pζ ≤0 p and, by assumption, we can find qζ ≤ pζ deciding γ˙ which is incompatible with all r ≤ pζ which decide γ˙ and have the same support as pζ . A fortiori, qζ is incompatible with all previous qξ so that (iv) and (v) are satisfied. Defining pζ such that it agrees with pζ on supp(pζ ) and with qζ on supp(qζ )\supp(pζ ), we guarantee (i), (ii) and (iii).
54
¨ JORG BRENDLE
Define F : ω1 → ω1 by F (ζ) = min{ξ ≤ ζ; for all α, if qζ (α, ·) = pζ (α, ·), then α ∈ supp(pξ )}. By (iii) and by definition of the partial order, F is regressive, i.e., F (ζ) < ζ for all ζ. Using Fodor’s Lemma we see there are a stationary set S ⊆ ω1 and a ζ0 such that F (ζ) = ζ0 for all ζ ∈ S. By pruning further, if necessary, we may assume without loss that there is r ∈ P, supp(r) = supp(pζ0 ), r ≤ pζ0 such that qζ (supp(r) × ω) = r for all ζ ∈ S. This means, however, that all qζ , ζ ∈ S, are pairwise compatible, contradicting clause (v). q.e.d. Corollary 2.2. P is a proper forcing notion. [FucSheSou97] Proof. This is a standard argument. We include a proof as an illustration of how properness works. Assume N is a countable elementary substructure of some Hχ containing P and a condition p. Let {γ˙ n ; n ∈ ω} list the P-names for ordinals belonging to N . By repeatedly applying the previous lemma inside N , we can construct a ≤0 -decreasing sequence of conditions qn ∈ P ∩ N , ≤0 -below p, such that Dqn ,γ˙ n is predense in P below qn . (Notice, however, that the whole sequence {qn ; n ∈ ω} does not belong to N .) Put q = n qn . We claim that q is a (P, N )-generic condition. For indeed, assume r ≤ q and r γ˙ n = δ. Since Dqn ,γ˙ n is predense below q there is s ∈ Dqn ,γ˙ n compatible with r. Since s forces a value to γ˙ n , we must have s γ˙ n = δ. As Dqn ,γ˙ n ∈ N is countable, Dqn ,γ˙ n ⊆ N and, a fortiori, s ∈ N and δ ∈ N , as required. q.e.d. What is this good for? Recall the combinatorial principle ♦ (diamond) asserts the existence of a sequence Aα ⊆ α; α < ω1 such that for every subset B ⊆ ω1 there is α < ω1 with B ∩ α = Aα . By ♣ (club), we mean that there is a sequence Aα ⊆ α cofinal; α < ω1 is a limit ordinal such that for all uncountable B ⊆ ω1 there is α with Aα ⊆ B. Finally, •| (“stick”) says there is a family A of countable subsets of ω1 of size ℵ1 such that every uncountable subset of ω1 contains a member of A. These principles play an important role in a number of applications of set theory to other fields of mathematics. The definitions immediately give the implications ♦ ⇒ ♣ ⇒ •| and ♦ ⇒ CH ⇒ •| . Furthermore, it is known that ♦ is equivalent to CH + ♣. The connection to forcing axioms is given by the easy Fact 2.3. If •| holds, Martin’s axiom MAℵ1 fails for Cω1 .
ITERATED FORCING THEORY
55
Proof. Let M be a model of (a large enough fragment of) set theory of size ℵ1 containing a witness A for •| . Note there can be no Cω1 -generic subset B over M , for, by genericity, such B would not contain any countable set of M , violating that A is a witness for •| . This means there is no filter meeting the (ℵ1 many) dense sets of Cω1 of M . q.e.d. Using an argument very similar to Lemma 2.1, one proves that a witness for ♦ in the ground model still witnesses ♣ in the generic extension via P. (In fact, ♦ is not necessary, for it can be shown that P generically adjoins a ♣-sequence anyway.) Thus, one gets a model where MA for countable partial orderings (for C) is true (equivalently, where the covering number cov(M) for the meager ideal is large) while ♣ holds which is a strong way of saying MA fails for Cω1 , by Fact 2.3. We close this subsection with a list of results which have been obtained by similar, albeit more complicated, methods. – Adjoining many random reals with mixed support (that is, a mixed support version of a large measure algebra), one gets a model where the covering number cov(N ) of the null ideal is large while ♣ holds. [Bre1 ∞a] – Proper non-c.c.c. iterations with mixed support. They are typically used to show the simultaneous consistency of ♣ with something else (e.g., the additivity of the meager ideal2 add(M) being ℵ2 ), or to prove various versions of ♣ may not coincide [DˇzaShe99]. In most models of this kind c = ℵ2 . – However, with a very sophisticated iterated mixed support construction involving “historic” definition of the conditions, Shelah has shown the consistency3 of c = ℵ3 (or larger) and “there are no Gross spaces” [She∞a]. In this model, MAℵ1 for σ-centered partial orders and a version of ♣ for ω2 hold simultaneously (so that MA fails for Cω2 in a strong way). – There is a standard way of iterating σ-centered forcing with mixed support making c arbitrarily large [Bre1 00a]. However, ♣ does in general not hold in the generic extension, and the method has no applications so far. 2
3
The additivity of the meager ideal add(M) is the size of the smallest family of meager sets whose union is not meager. The statement c = ℵ2 is known to imply the existence of a Gross space [Bre1 ∞a], see above.
56
¨ JORG BRENDLE
Problem 2.4. Is ♣ and add(M) > ℵ2 (simultaneously) consistent? We expect consistency can be obtained by a mixed support construction. 2.2
Iterations along templates
Let us step back for a moment and rethink what an iteration is after all. In most of the interesting cases used in applications, including the constructions with mixed support discussed above, we define, by a simultaneous recursion, both the iteration Pα (that is, the initial segment ˙ α (with which we want to of the eventual forcing) and the iterands Q force at a certain step α). As discussed before, we thus get a system ˙ α ; α < κ where the connection between the Pα and the Q ˙ α is Pα , Q ˙ given by Pα+1 = Pα Qα . The forcing we eventually force with is the direct limit Pκ of the Pα ’s. However, in utmost generality, an iteration is nothing but a system Bi ; i ∈ I of complete Boolean algebras and complete embeddings eij : Bi → Bj ; i < j between them, where I, ≤ is a directed index set. The forcing notion defined by this system is the direct limit of the Bi ’s. This is all nice but this general approach has not proved useful at all ˙ α; α < so far. On the other hand, considering only iterations of the Pα , Q κ-form seems to be too restrictive. Note, the latter are a special case of the former, satisfying additionally – wellfoundedness of the index set I, – I is linearly ordered, – the initial segments of the iteration and the new iterands are handed down simultaneously. Which of these conditions are needed? There is no a priori need for the second and third conditions, but the status of the first is somewhat different. While it can be avoided in a number of cases, e.g., – since adding Cohen reals is a commutative process, it is irrelevant in which order they are adjoined, and we may think of them as being adjoined along some illfounded order, – the same is true for random reals, if added with a large measure algebra, – illfounded iterations of Sacks forcing have been considered as well,
ITERATED FORCING THEORY
57
this is not true in general. In particular, when it comes to adding dominating reals, illfounded iterations are forbidden, for there cannot be models of set theory Mn ; n ∈ ω with Mn+1 ⊆ Mn and reals fn dominating over Mn such that fi ; i ≥ n + 1 ∈ Mn , by a theorem of Hjorth (see [Bre1 02] for a more detailed discussion of all this). As a paradigm for a method satisfying the first condition but not the second and third, we briefly describe how to iterate Hechler forcing along a template. Iterating along a template is a novel technique, introduced by Shelah [She∞b] to show the consistency of d < a (see also [Bre1 02]). Hechler forcing D consists of all pairs (s, f ) where s ∈ ω <ω and f ∈ ω ω with s ⊆ f . The order is given by (t, g) ≤ (s, f ) if t ⊇ s and g(n) ≥ f (n) for all n. D is a c.c.c. (even σ-centered) forcing notion which generically adds a dominating real, that is, a real d ∈ ω ω such that for all f ∈ ω ω of the ground model, d(n) ≥ f (n) holds for almost all n. To see this simply set d = {s ∈ ω <ω ; (s, f ) belongs to the generic filter for some f }. Definition 2.5. Let (L, ≤) be a linear order, and set Lx = {y ∈ L; y < x} for all x ∈ L. (L, I) is a template if I is a family of subsets of L satisfying (1) ∅, L ∈ I, (2) I is closed under finite unions and intersections, (3) if y < x belong to L, then there is A ∈ I ∩ ℘(Lx ) such that y ∈ A, (4) if A ∈ I and x ∈ L \ A, then A ∩ Lx ∈ I, and (5) I is wellfounded under ⊆, i.e., there is a function Dp : I → On, recursively defined by Dp(∅) = 0 and Dp(A) = sup{Dp(B) + 1; B ∈ I and B ⊂ A} for A ∈ I \ {∅}. We are ready for the definition of the iteration. Definition 2.6. Assume (L, I) is a template. We define, for A ∈ I, by recursion on Dp(A) the p.o. PA. – Dp(A) = 0, i.e., A = ∅. Let P∅ = {∅}. – Dp(A) > 0. Then PA consists of all finite partial functions p with domain contained in A and such that, letting x = max(dom(p)), there is B ∈ I ∩ ℘(A ∩ Lx ) (so Dp(B) < Dp(A)) such that p(A ∩ Lx ) ∈ PB and p(x) = (spx , f˙xp ) where spx ∈ ω <ω and f˙xp is a PB-name
58
¨ JORG BRENDLE
for an element of ω ω such that p(A ∩ Lx ) PB “spx ⊆ f˙xp ” (so this ˙ means that p(x) is a PB-name for a condition in Hechler forcing D). The ordering of PA is given by: q ≤PA p if dom(q) ⊇ dom(p) and • either y = max(dom(q)) > x = max(dom(p)) and there is B ∈ I ∩ ℘(A ∩ Ly ) such that p, q(A ∩ Ly ) ∈ PB and q(A ∩ Ly ) ≤PB p, • or x = max(dom(q)) = max(dom(p)) and there is B ∈ I ∩ ℘(A ∩ Lx ) such that p(A ∩ Lx ), q(A ∩ Lx ) ∈ PB, f˙xp , f˙xq are PB-names, q(A ∩ Lx ) ≤PB p(A ∩ Lx ), spx ⊆ sqx , and q(A ∩ Lx ) PB “f˙xp (n) ≤ f˙xq (n) for all n” (the last two clauses mean nothing but q(A ∩ Lx ) PB “q(x) ≤D˙ p(x)”). The first thing one needs to check is that this recursive definition makes sense at all, namely, that the orderings ≤PA defined along the way are transitive. To this effect, one proves by induction (simultaneously to the recursive definition) that the p.o.’s completely embed one into the other, i.e., that for A, B ∈ I, A ⊆ B, PA is a subforcing of PB. The latter is crucial by itself, for it shows that we indeed deal with an iteration in the usual sense. While complete embeddability is a triviality ˙ α ; α < κ-form, this is not so anymore in the for iterations of the Pα , Q present context. In an iterated forcing construction like the one described above, the recursive definition of the iteration and the description of the iterands are separated in a translucent way. Namely, while the latter are indexed by members of L, the former is indexed by members from the wellfounded set I. In general, I is not linear, and L is not wellfounded. In fact, there is no reason at all to consider only iterands which are ordered linearly, and, using a somewhat more complicated definition, the above can be extended to the situation where L is an arbitrary partial order. Results obtained so far with this method include: (1) The consistency of d < a [She∞b]. The dominating number d is the size of the least F ⊆ ω ω such that every function in ω ω is eventually dominated by a member of F. A family A ⊆ [ω]ω is called almost disjoint if any two distinct members of A have finite intersection. The almost disjointness number a is the size of the least infinite maximal almost disjoint family. The proof goes roughly as follows:
ITERATED FORCING THEORY
59
Assume CH in the ground model. Then (L, I) will be a template such that L has a cofinal subset of size ω2 . This guarantees we get a cofinal family of Hechler reals of cofinality ω2 so that b = d = ℵ2 will hold4 . Since a ≥ b in ZFC, it suffices to prove that no almost disjoint family of size ℵ2 can be maximal in the generic extension. Suppose A˙ = {A˙ α ; α < κ = ω2 + ω2 } were a name for such a family. Reordering the A˙ α , if necessary, and using CH and a ∆-system argument, we may assume the names A˙ α , α < ω2 , are isomorphic. The exact definition of (L, I) now is such that an additional name A˙ κ can be found such that, for any β < κ, the complete subalgebra generated by A˙ κ and A˙ β is isomorphic to the complete subalgebra generated by A˙ α and A˙ β for some (almost all) α = β, α < ω2 . This means A˙ κ is forced to be almost disjoint from all A˙ β , so that A˙ was not maximal. (2) Shelah [She∞b] also showed the consistency of u < a (using the consistency of the existence of a measurable cardinal)5 . Furthermore, he obtained that a can be singular (of uncountable cofinality). (3) The number a can consistently be singular of countable cofinality (e.g., a = ℵω is consistent).[Bre1 03] Let us close with two problems. Problem 2.7 (Shelah). Show the consistency of u < a on the basis of ZFC. [She00] We expect this can be done with a template-version of iterated Mathias forcing. Problem 2.8. Show the consistency of cov(N ) = ℵ3 and cov(M) = ℵ2 . While iterations along a template as sketched above seem not readily adaptable to this problem, we expect the solution to be along similar lines in the sense that it should be a recursively defined iteration whose iterands are given separately. 4
5
The unbounding number b is the least size of an unbounded family F ⊆ ω ω , i.e., there is no single g ∈ ω ω which eventually dominates all members of F. The ultrafilter number u is the smallest size of a base of an ultrafilter on ω.
¨ JORG BRENDLE
60
For more extensive lists of problems see either [Mil0 93] (with a recent update online with references: cf. [Mil0 00]), or the more recent [She00] (in particular, Sections 2 and 3) or Section 5 in [Bre1 01] both of which also contain a more extensive list of references than the present paper. Note added to the revised version (February 2002). Problem 2.8 has been solved positively by the author since the original writing of this article.6
6
Cf. [Bre1 ∞b].
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Classification problems in algebra and topology Riccardo Camerlo Dipartimento di Matematica Politecnico di Torino Corso Duca degli Abruzzi, 24 10129 Torino, Italy E-mail:
[email protected]
Abstract. Descriptive set theory can provide a suitable framework to investigate complexity in mathematics, most notably in algebra, analysis and topology. Some aspects of Borel reducibility theory for equivalence relations are presented. Examples and open problems are discussed.
Introduction and basic notions The purpose of this paper is to give a brief account on some aspects of Borel reducibility theory for equivalence relations on standard Borel spaces. Excellent expositions of the subject can be found in several papers or books on this topic; cf., e.g., [BecKec96], [Hjo00], [Kec02]. Here the aim is to focus on some parts of the hierarchy of equivalence relations, discussing some examples including a couple of new ones and commenting on some interesting open problems in the field, the solution of some of which would likely bring a better understanding of the theory. Received: November 5th, 2001; In revised version: March 30th, 2002; Accepted by the editors: April 11th, 2002. 2000 Mathematics Subject Classification. 03E15 54H20.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 61-75. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
62
RICCARDO CAMERLO
Quite often in mathematics, people deal with the problem of classifying objects up to some notion of similarity, like isomorphism, homeomorphism, isometry, biembeddability, etc. This usually amounts to finding necessary and sufficient criteria for two objects in the domain of discussion to be equivalent. In other words, the goal is to find a criterion Φ which, applied to objects x, y, says that these are equivalent if and only if Φ(x) = Φ(y). So, these values Φ(x) are called complete invariants for the equivalence. In order to be useful, this assignment x → Φ(x) should be explicitly definable. Moreover, particularly for classification problems arising in real analysis, topology or algebra, the class of objects to be classified very often carries a natural structure turning it into a Polish space (separable, completely metrisable topological space). So the notions of similarity alluded to above are really equivalence relations on such spaces. Let me recall some examples to illustrate this. Universal Polish metric space. There is a Polish metric space U , called Urysohn space, such that every Polish metric space is isometric to a closed subspace of U . The collection F(U ) of closed subspaces of U , which is a standard Borel space endowed with the Effros Borel structure, can thus be regarded as the class of all Polish metric spaces. Relations among Polish metric spaces like homeomorphism or isometry are then equivalence relations on F(U ). Similarly compact metric spaces are isometric to compact subsets of U ; thus K(U ) (the Polish space of compact subsets of U ) is the class of all compact metric spaces. Universal compact metrisable space. Hilbert cube [0, 1]N contains a topological copy of every compact metrisable space. So K([0, 1]N ) is the Polish space of all compact metrisable spaces. Continua (that is compact connected non empty metrisable spaces) form a closed subset C([0, 1]N ) of K([0, 1]N ). These spaces are suitable for discussing questions concerning topological but not metric considerations about compact Polish spaces. Spaces of countable structures. If L = {Ri , fj , ck | i ∈ I, j ∈ J, k ∈ K} is a countable first order language and ar denotes the arity function, then L-structures with universe N can be coded as elements of the Polish space XL =
i∈I
2N
ar(i)
×
j∈J
NN
ar(j)
× NK .
CLASSIFICATION PROBLEMS IN ALGEBRA AND TOPOLOGY
63
The relation of isomorphism on countable L-structures is an equivalence relation on XL . This is the orbit equivalence relation of a continuous action –called logic action– of S∞ , the symmetric group of N, on XL (given a group G acting on a set A, the orbit equivalence is defined by letting a, a ∈ A be related if and only if there is a g ∈ G such that ga = a ). If ϕ ∈ S∞ , x ∈ XL , the result ϕx of applying ϕ to x is the countable L-structure obtained from x by relabeling the elements of the universe of the structure using ϕ. For example, if L consists of one binary relation symbol, then for all natural numbers n and m one has ϕx(n, m) = x(ϕ−1 (n), ϕ−1 (m)). If A ⊆ XL is a Borel subclass of XL closed under isomorphism, then the restriction of the isomorphism relation to A can be considered as well. It remains to decide how to compare, in some explicit way, different classification problems, that is different equivalence relations. There are several ways to do this; the theory of Borel reducibility is the consequence of the following choice. While encompassing most of the usual explicit constructions of mathematics, this choice allows to use many techniques from descriptive set theory. Let X, Y be Polish spaces and let E, F be equivalence relations defined on X, Y respectively. Say that E is Borel reducible to F if there is a Borel function g : X → Y such that ∀x, x ∈ X (xEx ⇔ g(x)F g(x )). This is denoted by E ≤B F . Indeed, when this situation arises, in order to know whether x, x ∈ X are similar with respect to the notion of similarity E it is sufficient to look whether elements g(x) and g(x ) in Y are similar with respect to F . So the classification problem for E is at most as complicated as the classification problem for F and this justifies the notation E ≤B F . Moreover, the requirement that g be Borel grants that this comparison is done in some explicitly definable way. Of course it is possible to study different notions of reducibility, like the ones obtained by requiring the function g above to be continuous or measurable with respect to some fixed measures on X, Y or to some σ-algebras other than the Borel one. If E ≤B F but ¬(F ≤B E), then the classification problem for E is strictly simpler than the one for F ; this is denoted by E
64
RICCARDO CAMERLO
Note that this notion does not involve explicitly the topology on the spaces where the equivalence relations are defined, but just the σ-algebra of Borel subsets. So these definitions may be extended to equivalence relations defined on sets where, rather than a Polish topology, it is more natural to find a σ-algebra of subsets that is the σ-algebra of Borel sets for some Polish topology. A set X together with such a σ-algebra is called a standard Borel space. Since ≤B is reflexive and transitive, it is a preorder in the class of equivalence relations on standard Borel spaces. Identifying equivalence relations E, F whenever E ∼B F , the relation ≤B induces a partial ordering. The study of this partial ordering is the scope of the theory.
1
Smooth equivalence relations
The easiest case of a classification problem for an equivalence relation E on a standard Borel space X is when it is possible to assign complete invariants that are themselves concrete objects, namely elements of some Polish space Y . This means that there is a Borel function g : X → Y such that, for x, x ∈ X, xEx ⇔ g(x) = g(x ); using the notation introduced above, E ≤B =Y where =Y (in the sequel simply denoted Y ) is equality on Y . Such an equivalence relation E is called smooth or concretely classifiable. Note that such an E is Borel as a subset of X 2 . The cardinality of the quotient space X/E can be finite, countably infinite or the continuum. Thanks to the Silver dichotomy, which asserts that for a Π11 equivalence relation E either E ≤B N or R ≤B E, smooth equivalence relations E, F on X, Y respectively are always comparable with respect to ≤B and they are ranked according to the cardinality of their quotient spaces: E ≤B F ⇔ card(X/E) ≤ card(Y /F ). So, up to ∼B , there are countably many smooth equivalence relations, which can be listed as follows: 0
(1)
(for notational uniformity the empty space and the unique equivalence relation on it are considered as well). Examples. 1. For n a positive natural number, let M(n × n, C) be the space of all n × n complex matrices. The equivalence relation of similar-
CLASSIFICATION PROBLEMS IN ALGEBRA AND TOPOLOGY
65
ity on M(n × n, C) is a smooth equivalence relation: such matrices are classified by their Jordan normal form. 2. Isomorphism for Bernoulli measure preserving automorphisms on [0, 1] is smooth by [Orn70], since these are classified by their entropy, a real number. 3. If U is the Urysohn space, the isometry relation on K(U ) (that is isometry for compact metric spaces) is smooth (cf. [Gro0 99]). The situation is very different for isometry on all of F(U ). The latter has been classified precisely in [GaoKec03]. Smoothness is perhaps the most desirable situation when dealing with an equivalence relation, in the sense that in this case one usually has a very good understanding of the equivalence. However this turns out to be a rather uncommon situation in mathematics.
2
Countable Borel equivalence relations
A countable Borel equivalence relation E on a standard Borel space X is an equivalence relation that is Borel as a subset of X 2 and such that all equivalence classes [x]E are countable. If F is a Borel equivalence relation, then F is essentially countable if, for some countable Borel equivalence relation E, the inequality F ≤B E holds. Smooth equivalence relations give thus the easiest examples of essentially countable Borel equivalence relations. Other examples of countable Borel equivalence relations are orbit equivalences induced by Borel actions of countable groups on standard Borel spaces. In fact, this is the only actual case by the following theorem of [Fel0 Moo0 77]. Theorem 2.1. If E is a countable Borel equivalence relation on a standard Borel space X, then there are a countable group G and a Borel action G × X → X inducing E. A very interesting fact concerning Borel equivalence relations is the Glimm-Effros dichotomy which, in the following general form, is due to [Har0 KecLou90]. Theorem 2.2. There is a non-smooth countable Borel equivalence relation E0 such that, for any Borel equivalence relation E, either E is smooth or E0 ≤B E.
66
RICCARDO CAMERLO
So, for Borel equivalence relations, the chain (1) can be continued as 0
(2)
and this chain contains all Borel equivalence relations (up to ∼B ) that are Borel reducible to E0 . 2.1
Recognizing non-smoothness
When trying to classify an equivalence relation, a first step may be to show that it is not smooth. As for other similar problems, to show E is not smooth it is enough to find an equivalence relation F , already known to be non-smooth, and show F ≤B E. By the definition, this amounts to set up a construction that assigns to F -equivalent elements some Eequivalent objects but, to F -inequivalent elements, E-inequivalent objects must be associated. This last requirement often involves some kind of diagonalisation. As an illustration, a couple of combinatorial arguments are discussed here. Recall the discussion of the space XL of countable L-structures for a given countable first order language L and its Borel invariant subclasses. If G is a subgroup of S∞ then the restriction of the logic action to G×XL (or G × A where A is the invariant class under consideration) induces a subequivalence relation which may be called G-isomorphism. An instance of this is when G = Rec is the group of recursive permutations and A = [u] is a single isomorphism class. If the subgroup G is not closed in S∞ , then the (right or left, equivalently) coset equivalence CG induced by G on S∞ is not smooth, by [Mil1 77]. This will be compared with G-isomorphism ∼ =uG on some isomorphism class [u] . For G, H subgroups of S∞ , let EGH be the equivalence relation on S∞ defined by letting xEGH y if and only if there are ϕ in G and σ in H such that ϕxσ = y. Proposition 2.3. Let H be the automorphism group of u ∈ XL and G be a subgroup of S∞ . Then EGH ∼B ∼ =uG . Proof. Define ∗ : S∞ → [u] , x → x∗ letting x∗ be the model obtained by applying x to u. So ∗ is continuous and surjective. To check that it is a reduction, let x, y ∈ S∞ . If there are ϕ ∈ G, σ ∈ H such that ϕxσ = y, since σ is in the stabiliser H of u, ϕ is an isomorphism between x∗ and
CLASSIFICATION PROBLEMS IN ALGEBRA AND TOPOLOGY
67
y ∗ . Conversely, let ψ ∈ G be an isomorphism between x∗ and y ∗ ; then y −1 ψx is an automorphism σ of u, so ψxσ −1 = y. Since H is closed, then ∗ has a Borel right inverse witnessing ∼ =uG ≤B H EG . q.e.d. Corollary 2.4. If u is a rigid structure, then for any subgroup G ⊆ S∞ , CG ∼B ∼ =uG is not smooth. =uG . In particular, if G is not closed, ∼ Proposition 2.5. Let G, H be countable subgroups of S∞ . Suppose G is closed under the following operations (this happens, for instance, when G = Rec): – if ϕ ∈ G and ϕ : N → N is defined, for k ∈ N, by ϕ (2k) = 2ϕ(k) , ϕ (2k + 1) = 2k + 1, then ϕ ∈ G. – if ϕ ∈ G and if ϕ (k) = then ϕ ∈ G.
ϕ(2k) 2
defines a bijection ϕ from N onto N,
Suppose moreover that there exists a set A ⊆ N with the following properties: 1. A is H-invariant; 2. A is infinite and coinfinite; 3. H acts freely on A. Then CG ≤B EGH . Proof. Distinguish two cases. Case 1. There is a finite orbit of the action of H on A. By condition 3, every orbit of H on A is finite with fixed cardinality m ∈ N and there are infinitely many such orbits. Moreover, H must also consist of m elements: let H = {id, σ1 , . . . , σm−1 }. Let {a0n }n∈N be a transversal for the action of H on A and, for n ∈ N, 1 ≤ h < m, let ahn = σh (a0n ). Thus A = {ahn }(n,h)∈N×m . Let also N\A = {bn }n∈N be an enumeration without repetitions. For every h < m let {qnh }n∈N be an injective sequence of odd numbers such that:
– h = h ⇒ {qnh }n∈N ∩ {qnh }n∈N = ∅; – {qnh }(n,h)∈N×m = 2N + 1;
68
RICCARDO CAMERLO
– for every h = 0, there is no function in G mapping qn0 to qnh for all n ∈ N. For x ∈ S∞ , define x∗ ∈ S∞ letting x∗ (bn ) = 2x(n) , x∗ (ahn ) = qnh . So ∗ : S∞ → S∞ , x → x∗ is a continuous injection. To check that it reduces CG to EGH , let x, y ∈ S∞ . Assume ϕx = y for some ϕ ∈ G. Define ϕ : N → N letting, for k ∈ N, ϕ (2k) = 2ϕ(k) , ϕ (2k + 1) = 2k + 1. Then ϕ ∈ G and ϕ x∗ = y ∗ . Conversely, suppose ϕx∗ σ = y ∗ for some ϕ ∈ G, σ ∈ H. Then, by evaluating on each a0n and using condition 3 above, σ = id and ϕx∗ = y ∗ ; in particular, ϕ maps even numbers to even numbers and it is the identity on the odd numbers. Define ϕ (k) = ϕ(2k) 2 for all natural numbers k; then ϕ ∈ G and ϕ x = y. Case 2. Every H-orbit on A is infinite. So it can be assumed that A consists of just one infinite H-orbit. Let N \ A = {bn }n∈N be an enumeration without repetitions. Let also X be the Polish space of bijections from A to 2N + 1. The claim is that ∃z ∈ X ∀σ ∈ H \ {id} ∀ϕ ∈ G ∃a ∈ A ϕzσ(a) = z(a). Indeed, for every σ ∈ H \ {id}, ϕ ∈ G, the set {z ∈ X | ∃a ∈ A ϕzσ(a) = z(a)} is open dense in X. Define ∗ : S∞ → S∞ , x → x∗ letting x∗ |A = z and x∗ (bn ) = 2x(n) for all n. Then ∗ is continuous; to check that it is the desired reduction let x, y ∈ S∞ . Suppose first ϕx = y for some ϕ ∈ G and define ϕ : N → N by letting, for k ∈ N, ϕ (2k) = 2ϕ(k) , ϕ (2k + 1) = 2k + 1. Then ϕ ∈ G and ϕ x∗ = y ∗ . Assume conversely that there are ϕ ∈ G, σ ∈ H such that ϕx∗ σ = y ∗ . This implies that for all a ∈ A one has ϕzσ(a) = z(a) and so σ = id, whence ϕx∗ = y ∗ . As before, let for all k ∈ N; then ϕ ∈ G and ϕ x = y. q.e.d. ϕ (k) = ϕ(2k) 2 Corollary 2.6. Let L be the language for group theory and Gr(Z) be the class of groups isomorphic to Z. Assume G is a countable subgroup
CLASSIFICATION PROBLEMS IN ALGEBRA AND TOPOLOGY
69
of S∞ satisfying the closure hypotheses of Proposition 2.5. Then CG Borel reduces to G-isomorphism on Gr(Z). Thus, if G is not closed, G-isomorphism on Gr(Z) is not smooth. In particular this holds for G = Rec. All technical effort in the proof of Proposition 2.5 was essentially to get rid of the presence of the group H, making it impossible for two elements in the range of the reduction to be related because of the effect of some non trivial member of H. This made sure that if x, y were CG inequivalent, then their images x∗ , y ∗ remained EGH -inequivalent. Another illustration of this kind of arguments is the following. Proposition 2.7. Let G be a countable subgroup of S∞ satisfying the closure hypotheses of Proposition 2.5 and let τ : N → N be defined by: • for all even n, we let τ (n) := n + 2; • τ (1) := 0; • τ (m) := m − 2 if m ∈ 2N + 1, m ≥ 3. τ
Then CG ≤B EG where τ
∀x, y ∈ S∞ (xEG y ⇔ ∃ϕ ∈ G ∃z ∈ Z ϕxτ z = y). Proof. Let {qn }n∈Z = 2N + 1 be an injective enumeration of odd natural numbers, indexed by integers, such that (a) for p ∈ 2Z \ {0}, the function qn → qn+p is not the restriction of any element of G; and (b) for p ∈ 2Z + 1, the function q2n → q2n+p is not the restriction of any element of G. Define ∗ : S∞ → S∞ , x → x∗ letting, for n ∈ N: x∗ (6n) = 2x(2n); x∗ (6n + 1) = q−2n−1 ; x∗ (6n + 2) = q2n ; x (6n + 3) = q−2n−2 ; x∗ (6n + 4) = q2n+1 ; x∗ (6n + 5) = 2x(2n + 1). ∗
So ∗ is a continuous injection; to check that it is actually a reduction, let x, y ∈ S∞ . Assume first ϕx = y for some ϕ ∈ G and define ψ : N → N by: • for all even m, we let ψ(m) := 2ϕ( m2 ); • for all odd m, we let ψ(m) := m.
70
RICCARDO CAMERLO
Then ψ ∈ G and ψx∗ = y ∗ . Suppose conversely that ψx∗ τ z = y ∗ for some ψ ∈ G, z ∈ Z. It must be z = 0 since otherwise requirements (a) and (b) would be contradicted. This implies that ψ sends even numbers to even numbers and it is the identity on odd numbers; so define ϕ : N → N by ϕ(n) = ψ(2n) . Then ϕ ∈ G and ϕx = y. q.e.d. 2 Corollary 2.8. If G is a countable subgroup of S∞ satisfying the closure hypotheses of Proposition 2.5, then the equivalence relation CG Borel reduces to G-isomorphism on total orderings of type ζ, the order type of the integers. In particular, this holds for G = Rec. Proof. Let L = {R} be the language for order theory and let u ∈ XL = 2 2N be defined by letting u(n, m) = 1 if and only if one of the following holds: (n, m ∈ 2N and n < m) or (n ∈ 2N + 1 and m ∈ 2N) or (n, m ∈ 2N + 1 and m < n). Then [u] is the class of total orderings isomorphic to ζ and the automorphism group of u is {τ z }z∈Z , where τ is as in Proposition 2.7. Now apply Propositions 2.3 and 2.7. q.e.d. In particular, the following has been proved. Theorem 2.9. The relations of recursive isomorphism on rigid structures, groups isomorphic to Z and total orderings of order type ζ are non smooth countable Borel equivalence relations. 2.2
Hyperfinite equivalence relations
Definition 2.10. An equivalence relation E on a standard Borel space X is called hyperfinite if there is a sequence En of Borel equivalence relations on X with finite classes such that • En ⊆En+1 for all n ∈ N; • E = n∈N En . The following characterisations of hyperfinite equivalence relations are very useful (cf. [DouJacKec94]). Theorem 2.11. The following are equivalent for any countable Borel equivalence relation E on standard Borel space X:
CLASSIFICATION PROBLEMS IN ALGEBRA AND TOPOLOGY
71
1. E is hyperfinite; 2. E is hypersmooth, that is E is the increasing union n∈N En where each En is a smooth equivalence relation; 3. E ≤B E0 ; 4. there is a Borel action of Z on X inducing E; 5. there is a Borel partial ordering < defined on X whose restriction to each equivalence class is a total order which is finite or has order type ζ. Examples. 1. Let elements x, y ∈ 2N be equivalent if they are eventually equal, that is there is n ∈ N such that, for all m ≥ n, x(m) = y(m). This hyperfinite equivalence relation is the one usually denoted E0 . 2. Again for x, y ∈ 2N , let x ET y if x, y have the same tail, that is there are n, m ∈ N such that, for all k ∈ N, x(n + k) = y(m + k). 3. For x, y ∈ R, let x EV y ⇔ x − y ∈ Q. This is the Vitali equivalence relation. Note that E0 ∼B ET ∼B EV , since they are hyperfinite but not smooth. Open problems 1. [DouJacKec94] The increasing union problem. Let X be a standard Borel space and suppose that, for each n ∈ N, En is a hyperfinite equivalence relation on X. Suppose further that En ⊆ En+1 for all n. Is E = n∈N En hyperfinite? This is a very interesting structural open problem. The question is spontaneous and simple to state but apparently very hard to solve. Among the various instances of this problem let me single out the following. 2. [JacKecLou02] Commensurability. For x, y ∈ R+ define xEc y ⇔ x ∈ Q+ . Is Ec hyperfinite? y The relation Ec is the relation of commensurability. It is one of the most ancient equivalence relations in the history of mathematics and thus it appears natural to try and settle its classification problem. Moreover, it is the multiplicative analogue of Vitali equivalence relation. To observe that this is indeed a particular case of Problem 1, note that the multiplicative group Q+ is isomorphic to the additive group Z<ω . The latter is the increasing union of subgroups isomorphic to Zn for n ∈ N and it can be shown that Borel actions of such groups on standard Borel spaces
72
RICCARDO CAMERLO
induce hyperfinite equivalence relations. On the other hand, classification of Ec would likely provide clues for the solution of the more general Problem 1. 3. (Weiss) Borel actions of countable Abelian groups. Let G be a countable Abelian group and EG be the orbit equivalence relation induced by a Borel action of G on some standard Borel space. Is EG hyperfinite? Since every countable Abelian group is homomorphic image of Q+ , an affirmative answer to Problem 1 would imply an affirmative solution for this. In turn, if this holds, Problem 2 would also be solved in the affirmative. Moreover, for any countable group H, every orbit equivalence relation induced by a Borel action of H on some standard Borel space Borel reduces to the equivalence relation on P(Z × H), the power set of Z × H, induced by the translation action of Z × H. Since Q+ $ Z × Q+ , Problem 3 is actually equivalent to the following instance of it. 4. [JacKecLou02] Translation action of Q+ . Let E be the orbit equivalence relation induced by the translation action Q+ × P(Q+ ) → P(Q+ ) (q, A) → qA = {qa}a∈A . Is E hyperfinite? Note that, thinking of positive real numbers as Dedekind cuts, R+ ⊆ P(Q+ ), so the commensurability relation of Problem 2 is just the restriction of this E to a subspace. 2.3
Universal countable Borel equivalence relations
Definition 2.12. A universal countable Borel equivalence relation is a countable Borel equivalence relation E∞ such that, for all countable Borel equivalence relations E, the inequality E ≤B E∞ holds. Theorem 2.13. There is a universal countable Borel equivalence relation. [DouJacKec94] Examples. 1. [DouJacKec94] The universal equivalence relation usually denoted by E∞ is defined as the orbit equivalence relation induced by the free group F2 on two generators on its power set P(F2 ) by translation: for g ∈ F2 , A ∈ P(F2 ), gA = {ga}a∈A .
CLASSIFICATION PROBLEMS IN ALGEBRA AND TOPOLOGY
73
2. [And2 CamHjo01] For n ≥ 2 let ∼ =nr be the equivalence relation on N the Cantor space n defined by letting x, y ∈ nN be equivalent if there is a recursive permutation ϕ such that xϕ = y. So this equivalence relation is induced by a right action on nN of the group of recursive permutations. For n ≥ 5, ∼ =nr is a universal countable Borel equivalence relation. 3. [And2 CamHjo01] If G is a countable group containing a copy of F2 , then conjugacy equivalence relation on subgroups of G is a universal countable Borel equivalence relation. 4. [Cam02] The equivalence relations of recursive isomorphism on countable trees, groups, Boolean algebras, fields, total orderings are all universal countable Borel equivalence relations. This is the countable analogue of results of S∞ -universality which will be discussed later.
Open problems 1. [JacKecLou02] Universality of free actions. Are there a countable group G and a free Borel action of G on some standard Borel space inducing a universal countable Borel equivalence relation? Note that the naive attempt to consider the free part of the translation action of F2 on P(F2 ) is fruitless here, since equivalence relations induced by free Borel actions of F2 are treeable, a much simpler class than the universal ones. 2. [DouKec00] Turing degrees. Is Turing equivalence ≡T a universal countable Borel equivalence relation? A positive answer here would imply the failure of a conjecture of Martin’s (see the appendix in [KecMos1 78]). 3. (Hjorth) Upward universality. Let E, F be countable Borel equivalence relations. Suppose E ⊆ F and E be universal. Is F universal? This is another interesting structural problem. A positive answer here, together with Example 2, would solve Problem 2 in the affirmative. Note however that, by results of Adams, for E, F countable Borel equivalence relations, E ⊆ F does not imply E ≤B F . 4. [DouKec00] Recursive permutations of binary sequences. Recall the notation from Example 2. Is ∼ =2r universal? It would be surprising if the number 5, which is the least n for which n ∼ =r has been proved to be universal so far, had an essential role.
74
2.4
RICCARDO CAMERLO
The structure of countable Borel equivalence relations
The relation ≤B on countable Borel equivalence relations was linear up to E0 , as depicted in (2). However the general structure is very different as shown by the following theorem of [AdaKec00]. Theorem 2.14. There is a map A → EA assigning to each Borel subset A ⊆ R a countable Borel equivalence relation EA such that, for all Borel A, B ⊆ R, A ⊆ B ⇔ EA ≤B EB . Open problems 1. Local structure. Theorem 2.14 says that there is no nice general theory of the structure of ≤B on countable Borel equivalence relations. However it would be interesting to investigate some local behaviour of this partial order, like points with immediate successors or predecessors, the structure of cones CE = {F | F ≤B E}, maximal sets D of pairwise incomparable elements etc. 2. Natural examples. The equivalence relations EA produced in Theorem 2.14 are built with ad hoc contructions. On the other hand it would be nice to find more examples of natural countable Borel equivalence relations pairwise non Borel bireducible. 3. [JacKecLou02] Essential countability. Let E be an essentially countable Borel equivalence relation. Is there a countable Borel equivalence relation F such that E ∼B F ?
3
Classification by countable structures
An equivalence relation on a standard Borel space is classifiable by countable structures if it is Borel reducible to the relation of isomorphism on some Borel class of countable structures. Such a relation is S∞ -universal if any equivalence relation induced by a Borel action of a closed subgroup of S∞ Borel reduces to it (recall that isomorphism is in fact induced by a continuous action of S∞ ). Theorem 3.1 (Becker, Kechris). There are S∞ -universal equivalence relations. [BecKec96]
CLASSIFICATION PROBLEMS IN ALGEBRA AND TOPOLOGY
75
Examples. 1. [Mek81], [Fri0 Sta1 89], [CamGao01] The relations of isomorphism on countable groups, trees, total orderings, fields, Boolean algebras are S∞ -universal equivalence relations. 2. [GaoKec03] Isometry on 0-dimensional locally compact Polish metric spaces and on ultrametric Polish spaces are S∞ -universal equivalence relations. Open problems 1. [Fri0 Sta1 89] Cofinality of trees. Let E be classifiable by countable structures and suppose that, for each α ∈ ω1 , isomorphism of well founded trees of rank ≤ α is Borel reducible to E. Is E an S∞ universal equivalence relation? 2. [Cam02] Isomorphism and recursive isomorphism. In connection with Example 4 from Section 2.3 and Example 1 above, let C be a Borel class of countable structures and suppose isomorphism on C is S∞ -universal. Is recursive isomorphism on C a universal countable Borel equivalence relation? 3. Natural examples. Find other interesting classes of countable structures whose isomorphism relation is S∞ -universal. Similarly, find classes of Polish spaces such that the relations of homeomorphism or isometry are S∞ -universal.
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Using easy optimization problems to solve hard ones Lars Engebretsen Department of Numerical Analysis and Computer Science Royal Institute of Technology SE–100 44 Stockholm, Sweden E-mail:
[email protected]
Abstract. In combinatorial optimization one typically wants to minimize some cost function given a set of constraints that must be satisfied. A classical example is the so called Traveling Salesman Problem (TSP), in which we are given a set of cities with certain distances between them and we are asked to find the shortest possible tour that visits every city exactly once and then returns to the starting point. This problem is believed to be hard to solve exactly—it has been shown to belong to a certain class of notoriously hard problems, the so called NP-hard problems. Despite huge efforts, there is no currently known efficient algorithm that solves any of the NP-hard problems and it is widely believed that no such algorithm exists. However, it is always possible to find an approximation to the optimum tour by solving a much easier problem, the so called Minimum Spanning Tree problem. This is an example of a general paradigm in the field of approximation algorithms for optimization problems: Instead of solving a very hard problem we solve an easy one and then convert the optimal solution to the easy problem into an approximately optimal solution to the hard one. Obviously, it is important to be careful when converting the optimal solution to the easy problem into an approximately optimal solution to the hard one. In many applications, the analysis of the algorithm is greatly simplified by the use of random selections. Typically, we then use information obtained from the solution to the easy problem to bias our selection of a solution to the hard one. Received: November 14th, 2001; In revised version: February 22nd, 2002; Accepted by the editors: April 11th, 2002. 2000 Mathematics Subject Classification. 68Q25 68W20 68W25 68W40.
Supported in part by the Royal Swedish Academy of Sciences.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 77-93. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
78
1
LARS ENGEBRETSEN
Introduction
A profound question in theoretical computer science is to determine how hard it is to solve certain given problems. The definition of “hard” in this context varies from application to application, but a common definition is to estimate the amount of time needed to solve the problem. It has been widely accepted that a running time that can be bounded by a function that is a polynomial in the input length is a robust definition of “reasonable running time”; the class of all problems that can be solved in polynomial time is denoted by P. However, there also exists a large class of decision problems, i.e., problems that either accept or reject a given input, with the property that an affirmative answer can be verified in polynomial time with the aid of a proof; this class is denoted by NP. Consider the following problem: Given a Boolean formula on n variables, accept if it is satisfiable, i.e., if there exists any truth assignment to the variables such that the formula evaluates to True. Clearly, a satisfying assignment to the variables is a proof of the fact that a formula is satisfiable. Such a proof can be verified in polynomial time by direct substitution in the formula. It is, however, not known how to find a proof in polynomial time. We can solve the above problem by trying all possible proofs but that takes time exponential in n. The above problem is in fact contained in a subset of NP, consisting of the so called NP-complete problems. Those problems are equally hard in the sense that if one of them can be solved in polynomial time, then all of them can. Since nobody has been able to construct a polynomial time algorithm for any NP-complete problem, it is widely believed that no such algorithm exists. In his paper [Lev73], Levin wrote (although he did not use the term NP-complete): [The NP-complete] problems can be solved by the trivial algorithm that consists of exhaustive search in the solution space. However, these algorithms require exponential time and mathematicians believe that there are no faster algorithms for such problems. [. . . ] (For instance, it has not yet been shown that finding a mathematical proof requires more time than it takes to verify its correctness.)
This essentially summarizes the state of affairs also today. In fact, what is written above is exactly the P = NP question, the question whether it takes more effort to find a proof than it does to verify its correctness. During the first years after the introduction of the NP-completeness concept, several problems with wide use in applications were shown to be NP-complete (cf. [GarJoh79]).
USING EASY OPTIMIZATION PROBLEMS TO SOLVE HARD ONES
2
79
Optimization Problems
While optimization problems strictly speaking cannot belong to NP since they are not decision problems, they can often be shown to be at least as hard as some NP-complete problem. In that case they are called NP-hard and cannot be solved in polynomial time if P = NP. However, it is often enough to know that a solution is roughly the best possible. For an optimization problem known to be NP-hard, the following question is therefore a natural one to study: Is it possible to find, in polynomial time, a solution close to the optimum? For a large class of problems, a certain schema has been very successful for designing provably good polynomial time approximation algorithms: First solve an easier optimization problem exactly and then extract an approximate solution to the harder problem from the optimal solution to the easier one. In this paper, we illustrate this schema by describing approximation algorithms for the two NP-hard problems Maximum Satisfiability and Maximum Cut. We then conclude by stating some open problems. Definition 2.1. A CNF-clause over Boolean variables is a collection of literals, i.e., variables or their negations. A clause is satisfied by an assignment to the variables if the assignment sets at least one of the literals in the clause to True. For instance, one example of a CNF-clause over {x1 , x2 , x3 } is x1 ∨ x2 ∨ x3 . The bar over x3 indicates that x3 is negated. The clause x1 ∨ x2 ∨ x3 is satisfied if either x1 or x2 is True or if x3 is False. Definition 2.2. Maximum Satisfiability is the following problem: Given a collection of Boolean variables, a collection of CNF-clauses over those variables, and a collection of positive weights corresponding to each clause respectively, find a truth assignment to the Boolean variables that maximizes the total weight of the clauses containing at least one true literal. For instance, the following collection of clauses define an instance of Maximum Satisfiability over {x1 , x2 , x3 , x4 } (let us give the same weight to all clauses for the sake of simplicity): x 1 ∨ x2 ∨ x3 , x1 ∨ x2 ∨ x3 ,
x1 ∨ x2 ∨ x3 , x1 ∨ x2 ∨ x3 ,
x1 ∨ x4 x3 ∨ x4 .
80
LARS ENGEBRETSEN
Fig. 1. The graph shown above is an example of an instance and a feasible solution for the Maximum Cut problem. If the nodes are partitioned into black and white nodes as above, all but six of the edges connect a white vertex with a black one.
Definition 2.3. Maximum Cut is the following problem: Given a graph and a collection of positive weights corresponding to each edge respectively, find a partition of the vertex set of the graph that maximizes the total weight of the edges whose endpoints belong to different parts in the partition. Definition 2.4. A randomized c-approximation algorithm for a maximization problem is a polynomial time algorithm that for any instance outputs a solution with expected weight at least c times the value of the optimum solution. Such an algorithm has expected performance ratio c.
3
A Naive Probabilistic Algorithm
Let us study the following heuristic for Maximum Satisfiability over the Boolean variables {x1 , x2 , . . . , xn }: For each variable xi in the instance independently set xi to True with probability 1/2 and False with probability 1/2. For each clause Cj , we introduce the indicator random variable Yj for the event that Cj is satisfied. If we denote the weight of the clause mCj by wj we can write the weight of the satisfied clauses as W = j=1 wj Yj if there are m clauses. Once we have used two important facts, it is rather straightforward to compute E[W ], the expected weight of the satisfied clauses. Firstly, expectation is a linear operator. Secondly, the expected value of an indicator random variable is the probability that the variable is 1. Thus, E[W ] =
m j=1
wj E[Yj ] =
m j=1
wj Pr[Yj = 1].
USING EASY OPTIMIZATION PROBLEMS TO SOLVE HARD ONES
81
Notice that this means that we can compute Pr[Yj = 1] separately, regardless of the dependence between different Yj , which vastly simplifies the analysis. The key observation now is that for any fixed clause there is only one assignment that does not satisfy the clause: The assignment that sets all literals in the clause to False. A literal can be either a variable or a negated variable, but in both cases the probability that the literal is False is 1/2. Since we assign values to the variables independently, the probability that all literals in the clause Cj are false is 2−|Cj | , where |Cj | denotes the number of literals in the clause Cj . Thus, the clause is satisfied with probability 1−2−|Cj | , and we can write the expected weight of satisfied clauses as E[W ] =
m
wj (1 − 2−|Cj | ).
(1)
j=1
To prove an expected performance ratio, we must somehow relate the above expression tothe optimum value. A very crude upper bound on the objective value is m j=1 wj , since the optimum value is always at most the total weight of the instance. Similarly, a very crude lower bound on −|Cj | the factor 1 − 2 is 1/2, corresponding to |Cj | = 1, which implies w that E[W ] ≥ 21 m j=1 j . We have now proved the following theorem: Theorem 3.1. The algorithm described in Section 3 is a randomized 21 approximation algorithm for Maximum Satisfiability. The above algorithm is usually attributed to Johnson. For the case when all clauses have the same weight, a derandomized version of the algorithm, i.e., a version where the random selections have been replaced by deterministic ones in such a way that the quality of the solution does not decrease, is exactly what Johnson calls Algorithm B2 in his paper [Joh74], although he does not state the algorithm in terms of a probabilistic construction.
4
The Plan: Relaxations and Randomized Rounding
The algorithm from Section 3 seems to be a rather naive heuristic since it always selects truth values according to the same distribution, regardless of the instance. It seems reasonable that it would help to skew the distribution towards the optimal assignment. Moreover, we very crudely
82
LARS ENGEBRETSEN
bounded the optimal value by the total number of clauses in the instance when we analyzed the algorithm. When the instance is almost satisfiable, this bound is quite good, but for instances were the optimum value is far from the number of clauses, we need a better upper bound. At first, it seems hard to achieve any of the two improvements suggested above, since it is NP-hard to compute the true optimum value. However, it turns out that it is in many cases fruitful to use the concept of relaxations from combinatorial optimization [Lue73]. Definition 4.1. Let P be an optimization problem of the form “maximize Φ(y) subject to y ∈ C”. A problem P , of the form “maximize Φ (y) subject to y ∈ C ”, is a relaxation of P if C ⊆ C and Φ (y) ≥ Φ(y) for all y ∈ C. A relaxation alone provides us with an upper bound on the optimum value. But we are usually interested in more than this; we also want to obtain a feasible solution with weight close to the optimum value. For instance, in an approximation algorithm for Maximum Satisfiability, just knowing that it is possible to satisfy many clauses is not as interesting as obtaining an assignment to the variables satisfying many clauses. The potential problem of solving a relaxed version of the problem is that we usually end up with a solution that is not feasible for the original problem. To convert the optimal solution to the relaxed problem into a feasible solution to the original problem, a procedure called randomized rounding is often used. Typically, the combinatorial optimization problem is first expressed as an integer program, i.e., a program of the form “maximize some given function of integer-valued variables subject to a given collection of constraints of those variables”. This program is NP-hard to solve optimally, but it is possible to relax the program into something that is solvable in polynomial time. Then the solution to the relaxed problem is used to obtain a probability distribution according to which a feasible solution to the original problem is selected. To give a bound on the performance, we use a two-step argument: Firstly, we know that the optimum value of the relaxation of a maximization problem is an upper bound on the optimum value of the original problem. Secondly, we prove that the solution selected in the random process has an expected weight at most some factor c from the optimum value of the relaxed problem. Consequently, the solution selected has an
USING EASY OPTIMIZATION PROBLEMS TO SOLVE HARD ONES
83
expected weight at most a factor c from the optimum weight of the original problem. We exemplify the above procedure below by describing a linear relaxation of Maximum Satisfiability [GoeWil1 94] and a semidefinite relaxation of Maximum Cut [GoeWil1 95].
5
Relaxations via Linear Programming
Definition 5.1. A linear program in standard form is written min cT x : x ≥ 0 ∧ ∀i = 1, . . . , m(aTi x = bi ) , x
where a1 , . . . , am , c, and x are n-dimensional real vectors and b1 , . . . , bm are real scalars. It is possible to express inequality constraints by using additional variables. The constraint ai x ≥ bi can be rewritten as ai x − zi = bi together with zi ≥ 0, where zi is a new scalar variable. Finally, we remark that a linear program in n variables can be solved optimally in time O(n3 L), where L is the number of bits needed to store the program [Ye91]. 5.1
A Linear Relaxation of Maximum Satisfiability
We start by formulating Maximum Satisfiability as an integer program. For for each variable xi we introduce an indicator variable yi for the event that xi is True and for each clause Cj we introduce an indicator variable zj for the event that Cj is satisfied. We also introduce for each clause Cj the sets Tj , containing the variables appearing unnegated in Cj , and Fj , containing the variables appearing negated in Cj . Finally we denote the weight of the clause Cj by wj . Then the optimum of the Maximum Satisfiability instance is given by the optimum of the following integer program: m maximize wj zj j=1
subject to
i∈Tj
yi +
(1 − yi ) ≥ zj
i∈Fj
yi ∈ {0, 1} for all i = 1, . . . , n, zj ∈ {0, 1} for all j = 1, . . . , m.
(2)
84
LARS ENGEBRETSEN
To see this, remember that the value of yi decides the truth assignment to the corresponding xi . Then notice that zj = 0 only when yi = 0 for all i ∈ Ti and yi = 1 for all i ∈ Fi , i.e., when all variables appearing unnegated in Cj are False and all variables appearing negated in Cj are True. As soon as any of the unnegated variables are True or any of the negated variables are False,
yi +
i∈Tj
(1 − yi ) ≥ 1,
i∈Fj
and zj assumes the value 1, since all weights are positive. Since it is NP-hard to solve the above integer program, we relax the constraints on yi and zj to obtain the following linear program:
maximize
m
wj zj
j=1
subject to
yi +
i∈Tj
(1 − yi ) ≥ zj
(3)
i∈Fj
0 ≤ yi ≤ 1 for all i = 1, . . . , n, 0 ≤ zj ≤ 1 for all j = 1, . . . , m. To see why this is a relaxation, note that the set of feasible solutions to this program contains all solutions feasible to the integer program. For those solutions the objective functions are identical in the two programs. We denote the optimum solution to the relaxed problem above by yi∗ and zj∗ . There is a yi∗ for each variable xi and a zj∗ for each clause Cj and the optimum value of the relaxed problem is
wL =
m
wj zj∗ .
j=1
Similarly, we denote the optimum value of the integer program by wI . Note that although we do not know the optimum to the integer program, we do know that wL ≥ wI , by the definition of a relaxation.
USING EASY OPTIMIZATION PROBLEMS TO SOLVE HARD ONES
5.2
85
Rounding with Independent Random Assignments
Given an optimal solution to the relaxed program, we select a truth assignment to the variables xi as follows: Independently for each i, set True with probability yi∗ , (4) xi = False with probability 1 − yi∗ . Let us now analyze how many clauses this assignment satisfies on average. Introduce an indicator random variable 1 if Cj is satisfied, (5) Zj = 0 otherwise. Then we can use these random variables to compute the weight of our randomly selected solution as W =
m
w j Zj .
j=1
As in Section 3, we can express E[W ] as E[W ] =
m
wj E[Zj ] =
j=1
m
wj Pr[Zj = 1].
j=1
What remains is to bound Pr[Zj = 1]. We do this by bounding Pr[Zj = 0]. The only way the clause is left unsatisfied is if all literals in the clause are false. Since the variables are assigned truth values independently, Pr[Zj = 0] = (1 − yi∗ ) yi∗ . i∈Tj
i∈Fj
By the arithmetic/geometric mean inequality, √ a1 + a2 + · · · + ak k a 1 a2 · · · a k ≤ k for non-negative real numbers a1 , a2 , . . . , ak . Since 0 ≤ yi∗ ≤ 1, we can use the arithmetic/geometric mean inequality to bound Pr[Zj = 0] by ∗ ∗ |Cj | i∈Tj (1 − yi ) + i∈Fj yi . Pr[Zj = 0] ≤ |Cj |
86
LARS ENGEBRETSEN
If we rearrange the terms in the sums, we obtain Pr[Zj = 0] ≤
1−
i∈Tj
yi∗ +
i∈Fj (1
− yi∗ )
|Cj |
|Cj |
.
Since i∈Tj yi∗ + i∈Fj (1 − yi∗ ) ≥ zj∗ by the bound on zj from the linear program (3), zj∗ |Cj | . Pr[Zj = 1] = 1 − Pr[Zj = 0] ≥ 1 − 1 − |Cj | This expression can be bounded from below by β|Cj | zj∗ , where βk = 1 − (1 − k1 )k . To see this, it is enough to notice that the continuously differentiable function z k − βk z f (z) = 1 − 1 − k
is zero when z ∈ {0, 1} and that its second derivative z k−2 k − 1 1− f (z) = − k k
is negative when z ∈ [0, 1]. This implies that f is concave on [0, 1], which in turn implies that f (z) ≥ (1 − z)f (0) + zf (1) = 0 on that interval. Therefore, m E[W ] ≥ β|Cj | wj zj∗ . (6) j=1
Since βk < 1 − 1/e, we can reformulate this as m 1 1 wL . wj zj∗ = 1 − E[W ] ≥ 1 − e e j=1
Finally, we use the fact that wL is the optimum value of a relaxation of Maximum Satisfiability. This means that wL ≥ wI , which together with the above bound shows that E[W ] ≥ (1 − 1/e)wI , i.e., that the above algorithm is a randomized (1 − 1/e)-approximation algorithm for Maximum Satisfiability.
USING EASY OPTIMIZATION PROBLEMS TO SOLVE HARD ONES
5.3
87
A Probabilistic Combination of Two Algorithms
To obtain an approximation algorithm for Maximum Satisfiability with expected performance ratio better than 1 − 1/e we need to use a refinement of the above technique. We notice that the naive algorithm from Section 3 actually has good performance when there are only long clauses in the instance, while the above algorithm has good performance when there are only short clauses in the instance. Let us now combine these two algorithms as follows: For a given instance, we run both algorithms. This gives us two solution candidates. As our solution, we select the candidate giving the largest weight. By the expression (1) and the bound (6), the expected weight of the solution produced by this algorithm is m m E[W ] ≥ max wj (1 − 2−|Cj | ), β|Cj | wj zj∗ . j=1
j=1
To bound this quantity, we note that max{a1 , a2 } ≥ (a1 + a2 )/2, which enables us to write m 1 − 2−|Cj | + β|Cj | zj∗ 1 − 2−|Cj | + β|Cj | , ≥ wj zj∗ E[W ] ≥ wj 2 2 j=1 j=1 m
where the last inequality follows since 0 ≤ zj∗ ≤ 1. It is therefore enough to prove that 1 − 2−k + βk ≥ 3/2 for each k to obtain a 34 -approximation algorithm. This bound holds with equality when k = 1 or k = 2, and 1 − 2−k + βk ≥ 2 − 1/8 − 1/e > 3/2 when k ≥ 3. Thus, we have proved the following theorem: Theorem 5.2. The algorithm described in Section 5 is a randomized 43 approximation algorithm for Maximum Satisfiability. [GoeWil1 94]
The first randomized 34 -approximation algorithm for Maximum Satisfiability was constructed by Yannakakis [Yan94], and the above algorithm was constructed after the algorithm of Yannakakis was first presented.
6
Relaxations via Semidefinite Programming
Definition 6.1. A square matrix A is called positive semidefinite (PSD) if x · Ax ≥ 0, where · denotes the inner product, for all x ∈ Rn .
88
LARS ENGEBRETSEN
There are many equivalent definitions of a semidefinite program in the literature. To state one, we need to define the inner product between matrices, A • B = i,j aij bij . Definition 6.2. A semidefinite program on n2 variables is a program of the form minX {C • X : X is PSD ∧ ∀i = 1, . . . , m(Ai • X = bi )}, where A1 , . . . , Am , C, and X are real n × n-matrices and b1 , . . . , bm are real scalars. Given that certain technical requirements –which we will not delve into here– are satisfied, a semidefinite program on n variables can be solved within an additive error of ε in time polynomial in n and log 1ε [Ali95]. Lemma 6.3. Every linear program is also a semidefinite program. Proof. Suppose that the linear program is of standard form as in Definition 5.1. We can then transform it into a semidefinite program as follows: For a vector v let diag(v) denote the matrix that has the entries of v on the diagonal and zeros everywhere else. Set C = diag(c), and Ai = diag(ai ) for i = 1, . . . , m. Add constraints ensuring that X is diagonal to the semidefinite program. q.e.d. If A is positive semidefinite, there exists an n × n-matrix B such that A = BB T . This matrix can be computed in time O(n3 ) by incomplete Cholesky factorization [Gol3 vLo83]. This computation is needed in the algorithms below to compute from an n × n matrix A = (aii ) vectors {v1 , . . . , vn } such that aii = vi ·vi . We can use the above fact to rephrase the definition of a semidefinite program in a way suitable for our applications below. Definition 6.4. A semidefinite program on n vector-valued variables is a program of the form “maximize L(v1 , . . . , vn ) subject to |vi |2 = 1 for all i = 1, . . . , n and Cj (v1 , . . . , vn ) ≥ 0 for all j = 1, . . . , r” where the functions L and Cj are linear in the inner products vi · vi . 6.1
A Semidefinite Relaxation of Maximum Cut
Goemans and Williamson [GoeWil1 95] construct an approximation algorithm for Maximum Cut by studying a relaxation of an integer quadratic program. We will present their construction here as an example of a semidefinite relaxation combined with randomized rounding. For a graph
USING EASY OPTIMIZATION PROBLEMS TO SOLVE HARD ONES
89
G = (V, E) with vertices V = {1, . . . , n}, we introduce for each vertex i in the graph a variable yi ∈ {−1, 1}. If we denote by wii the weight of the edge (i, i ), the weight of the maximum cut in the graph is given by the optimum of the integer quadratic program 1 − yi yi maximize wii 2 (7) (i,i )∈E subject to yi ∈ {−1, 1} for all i. The partition (V1 , V2 ) yielding the optimum cut can be formed as V1 = {i : yi = 1} and V2 = {i : yi = −1}. It is NP-hard to solve (7) optimally. To obtain a polynomial time approximation algorithm, Goemans and Williamson [GoeWil1 95] construct a relaxation of the above integer quadratic program by using ndimensional real vectors vi instead of the integer variables yi . The products yi yi are then replaced by the inner products vi · vi . This gives the semidefinite program 1 − vi · vi maximize wii 2 (8) (i,i )∈E subject to |vi |2 = 1 for all i. To see that this is a relaxation of the integer quadratic program above, we note that we can transform a feasible solution to the integer quadratic program into a feasible solution to the semidefinite program as follows: Let the first coordinate of vi in the semidefinite program be the value of yi in the integer quadratic program; all other components of vi are set to 0. Then the feasible solutions to the integer quadratic program are feasible also for the semidefinite relaxation, and the objective functions are identical for solutions feasible to the integer quadratic program. We denote the optimum solution to the relaxed problem above by vi∗ , and the corresponding optimum value by 1 − vi∗ · vi∗ . wii wS = 2 (i,i )∈E
Similarly, we denote the optimum value of the integer quadratic program by wQ . Although we do not know how to compute it in polynomial time, we do know that wS ≥ wQ by the definition of a relaxation.
90
LARS ENGEBRETSEN
v∗i&
v∗i θ θ
θ
Fig. 2. When the vector r is in the shaded region, (vi∗ · r)(vi∗ · r) < 0. If r is uniformly distributed in the plane, the probability that r is in the shaded region is 2ϑ/2π = ϑ/π.
6.2
Rounding with a Random Hyperplane
To obtain a partition of the graph from the solution to the semidefinite relaxation, we select a random vector r, uniformly distributed on the unit sphere in Rn , and set V1 = {i : vi∗ · r > 0} and V2 = {i : vi∗ · r < 0}. Vectors satisfying vi∗ · r = 0 can be assigned a part in the partition arbitrarily since they occur with probability zero. As before, we can write the expected weight of the solution produced by the randomized rounding as E[W ] = wii Pr[(vi∗ · r)(vi∗ · r) < 0], (i,i )∈E
and we now try to bound the separation probability. Notice that the angle between vi∗ and the projection of r into the plane spanned by vi∗ and vi∗ is uniformly distributed, and thus arccos vi∗ · vi∗ . π We want to bound this from below by some constant factor α times (1 − vi∗ · vi∗ )/2 since then we can write 1 − vi∗ · vi∗ , E[W ] ≥ α wii 2 Pr[(vi∗ · r)(vi∗ · r) < 0] =
(i,i )∈E
which would show that the algorithm has expected performance ratio α. We compute a lower bound on α by considering all possible configurations of the vectors vi∗ and vi∗ , i.e., we use the bound α = min
2 ϑ 2 arccos vi∗ · vi∗ . min = ∗ ∗ π ϑ∈[0,π] 1 − cos ϑ π(1 − vi · vi )
USING EASY OPTIMIZATION PROBLEMS TO SOLVE HARD ONES
91
Consider the function a(ϑ) =
2ϑ π(1 − cos ϑ)
(9)
which describes the approximation factor for an edge in the instance as a function of the angle between the vectors corresponding to the endpoints of the edge. It is continuously differentiable in (0, π], a(π) = 1 and limϑ→0+ a(ϑ) = ∞; therefore α = minϑ∈(0,π) a(ϑ). Any local extreme point in (0, π) satisfies a (ϑ) = 0. This equation can be solved numerically to obtain that α ≈ 0.87856. It is also possible to prove that α > 0.87856, which shows that E[W ] > 0.87856
1 − vi∗ · vi∗ ≥ 0.87856wS ≥ 0.87856wQ . w 2 ii
(i,i )∈E
To conclude, we have now proved the following: Theorem 6.5. The algorithm described in Section 6 is a randomized 0.87856-approximation algorithm for Maximum Cut. [GoeWil1 95]
7
Directions for Future Research
We have presented algorithms for the Maximum Satisfiability and Maximum Cut problems using a general framework: relaxations combined with randomized rounding. We now conclude by stating some open problems related to the results presented above. 7.1
Purely Combinatorial Algorithms
Relaxations and randomized rounding give us a framework for constructing approximation algorithms. However, it is not clear that they help us understand the structure of the problem. Maybe we could get a better understanding of why certain problems are more difficult than others by constructing purely combinatorial approximation algorithms with the same performance ratio as the currently best known algorithms.
92
7.2
LARS ENGEBRETSEN
The Approximability of Maximum Satisfiability
The best known algorithm for Maximum Satisfiability that is based on linear programming and randomized rounding is the 43 -approximation algorithm presented in Section 5. Is there a better one, or can we prove that 3/4 is the best possible ratio such an algorithm can achieve, unless P = NP? As a comparison, the currently best known algorithm, which uses a mixture of several different techniques, has expected performance ratio 0.7846 (cf. [AsaWil1 00]) and there is an algorithm based on semidefinite programming and randomized rounding with an expected performance ratio of 7/8 for Maximum 3-Satisfiability, i.e., Maximum Satisfiability restricted to instances where the clauses have at most three variables (cf. [Kar1 Zwi97]). 7.3
The Approximability of Maximum Cut
The currently best known algorithm for Maximum Cut is in fact the 0.878-approximation algorithm presented in Section 6. It is known that the analysis of the algorithm is tight (cf. [Kar1 99], [Fei0 Sch0 01]) and that it is NP-hard to approximate Maximum Cut within 16/17 ≈ 0.942 (cf. [H˚as01]). Can we close this gap, i.e., either provide a better approximation algorithm or a better approximation hardness result? 7.4
Special Cases
Another promising direction for future research is to study special classes of input instances. One such class is defined by so called “almost satisfiable” instances, which for Maximum Cut amounts to instances that are almost bipartite, i.e., where it is possible to cut a fraction 1−ε of the edges for some small ε. By studying the function a(ϑ) defined in equation (9) when ϑ is close to 1 it is possible to see that the algorithm presented in Section 6 cuts a fraction 1 − O(ε1/2 ) of the edges in such instances. Zwick in [Zwi98b] shows similar results for Maximum 2-Satisfiability and Maximum Horn Satisfiability. 7.5
Optimal Approximation Hardness Results
For some optimization problem, just guessing an assignment uniformly at random works gives an approximation algorithm. For instance, this
USING EASY OPTIMIZATION PROBLEMS TO SOLVE HARD ONES
93
simple algorithm gives expected performance ratios of 1/2 for Maximum Cut and 1 − 2−k for Maximum Ek-Satisfiability—Maximum Satisfiability where clauses contain exactly k literals. For Maximum Cut and Maximum E2-Satisfiability, respectively, there are better polynomial time approximation algorithms, with expected performance ratios of 0.878 and 0.931, respectively [GoeWil1 95], [Fei0 Goe95]. However, it is NP-hard to approximate Max Ek-Satisfiability when k ≥ 3 within 1−2−k +ε for any constant ε > 0 [H˚as01]; we call such problems approximation resistant. Which problems are approximation resistant? For constraints on two or three Boolean variables, this problem has been solved: It is known that none of the constraints on two Boolean variables is approximation resistant –this is a corollary of the Goemans-Williamson algorithm (cf. [GoeWil1 95])– and that the approximation resistant constraints on three Boolean variables are exactly those implied by parity (cf. [H˚as01], [Zwi98a]). For constraints on two variables over some finite domain, the situation is not completely resolved. There are algorithms establishing that many such constraints are not approximation resistant (cf. [And0 EngH˚as01], [EngGur02], [Fri2 Jer97], [GoeWil1 01]). In fact, the only natural constraint on two non-Boolean variables whose status with respect to approximation resistance has not yet been resolved is the generalization of E2-Satisfiability, i.e., the constraint (x1 = c1 ) ∨ (x2 = c2 ).
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
On Sacks forcing and the Sacks property Stefan Geschke II. Mathematisches Institut Freie Universit¨at Berlin Arnimallee 3 14195 Berlin, Germany E-mail:
[email protected]
Sandra Quickert Mathematisches Institut Rheinische Friedrich-Wilhelms-Universit¨at Bonn Beringstraße 6 53115 Bonn, Germany E-mail:
[email protected]
Abstract. In this survey we explain the general idea of forcing, present Sacks forcing with some of its properties, give an overview of closely related forcing notions, and investigate the influence of some combinatorial principles to the question whether a c.c.c. forcing notion can have the Sacks property. Moreover, we discuss the iterated Sacks model, which is the standard model of set theory in which the Continuum Hypothesis fails, but where all other reasonably defined cardinal characteristics of the continuum have the minimal possible value. We also discuss the countable support side-by-side product of Sacks forcing. Received: November 23rd, 2001; In revised version: May 7th, 2002; January 26th, 2003; June 2nd, 2003; Accepted by the editors: June 3rd, 2003. 2000 Mathematics Subject Classification. 03E35.
The second author was supported by the Graduiertenf¨orderung of Northrhine-Westfalia, Germany.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 95-139. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
96
0
STEFAN GESCHKE AND SANDRA QUICKERT
Introduction and outline of the paper
The technique of forcing was invented by Paul Cohen (cf. [Coh63]) in order to produce a model of the commonly accepted system of set-theoretic axioms, ZFC, the Zermelo-Fraenkel axioms ZF together with the axiom of choice, in which the Continuum Hypothesis (CH) fails. The existence of such a model shows that CH does not follow from ZFC. Earlier it was shown by G¨odel (cf. [Kun80, Chapter VI]) that CH cannot be refuted from ZFC, assuming of course that ZFC is consistent. Forcing turned out to be a very powerful tool for proving independence results in mathematics. Starting from a model V , the ground model, which satisfies ZFC, a partial order P ∈ V is fixed which codes the desired properties of the model one wishes to construct. Then forcing with P over V yields a generic extension V [G] where all this information is decoded and ZFC holds as well. V [G] is obtained by adding a certain generic object G (a subset of P) to V . The model V [G] is the smallest model of ZFC which extends V and contains G. Sacks forcing is one such partial order and was invented by Gerald Sacks to produce a minimal forcing extension V [G] in the following sense: If W is a model of ZFC such that V ⊆ W ⊆ V [G], then either W = V or W = V [G]. In this article we give an overview of the properties and applications of Sacks forcing. In Section 1 we illustrate the idea of forcing by an example. The reader who is interested in learning the details of forcing is referred to [Kun80], [Jec78], or [Jec02]. Our notation follows [Kun80] and whenever we state a fact about forcing without mentioning a source, this fact can be found in [Kun80] or, if Borel sets and descriptive set theory are involved, in [Jec78]. This section is ment as a sketchy introduction to forcing, and it certainly does not replace some detailed treatment as in the books mentioned above. After that, in Section 2, we introduce Sacks forcing and study some of its properties. We isolate the so-called Sacks property and mention some of its consequences in Section 3. Section 4 is devoted to relatives of Sacks forcing which share some of its important features. Sacks forcing does not satisfy the countable chain condition (c.c.c.). So in Section 5 we wonder whether there is any c.c.c. forcing that has the Sacks property.
ON SACKS FORCING AND THE SACKS PROPERTY
97
We then turn to iterations of Sacks forcing. Section 6 deals with countable support side-by-side products of Sacks forcing and in Section 7 we discuss the countable support iteration of Sacks forcing of length ω2 .
1
Collapsing the continuum to ℵ1
When G¨odel proved that CH is consistent with ZFC, he started from a model V of ZFC and constructed a definable class L, the class of constructible sets, inside V such that L satisfies ZFC+CH. (Actually, G¨odel started from a model of ZF and produced a model of ZFC+CH.) Forcing works the opposite way. We start from a model V of ZFC and extend it. As usual, we will frequently talk about models of set theory, which just means models of ZFC. Whenever we consider two models V0 , V1 of set theory such that V0 ⊆ V1 , we will tacitly assume that V0 is transitive in V1 , i.e., for all x ∈ V0 and y ∈ V1 , if y ∈ x in V1 , then y ∈ V0 . This guarantees that V0 and V1 agree on elementary properties of sets, for instance whether a set x ∈ V0 is an ordinal or not.1 Suppose we wish to construct a model of ZFC + CH by forcing. Let V be a model of ZFC. If V satisfies CH, we are done. Otherwise, in V the cardinal 2ℵ0 , the size of R and of ℘(ω), is bigger than ℵ1 . In this case we try to add a function f : ℵ1 → ℘(ω) which is onto. Since V does not satisfy CH, no such function exists in V . However, for every countable ordinal α, f α could be an element of V . We define a forcing notion P, i.e., a partial order, which consists of possible initial segments of a surjective map f : ℵ1 → ℘(ω). Let P := {p : ran(p) ⊆ ℘(ω) & ∃α < ω1 (dom(p) = α)}. The order on P is reverse inclusion. More precisely, for p, q ∈ P let p ≤ q iff q ⊆ p, i.e., if p extends q. If p ≤ q, we say that p is stronger than q. The elements of P are the forcing conditions or just conditions. Two conditions p0 , p1 ∈ P are compatible iff there is q ∈ P such that q ≤ p0 , p1 . Otherwise p0 and p1 are incompatible. In this case we write p0 ⊥ p1 . Our particular forcing notion P is a tree of height ℵ1 (the root is the empty function, and stronger (i.e., smaller with respect to ≤) conditions are higher up the tree). In general, a forcing notion can be any partial order. For technical reasons, we assume that every partial order Q has a largest element 1Q . 1
Cf. [Kun80, IV.3] for more information on this subject.
98
STEFAN GESCHKE AND SANDRA QUICKERT
We now describe how we get a surjection f : ℵ1 → ℘(ω) from P. A subset G ⊆ P is a filter iff – for all p, q ∈ G there is r ∈ G such that r ≤ p, q and – for all p ∈ G and all q ∈ P with p ≤ q, q ∈ G. In our case, a filter in P is the same as a branch in the tree P. If G ⊆ P is a filter such that {α < ω1 : ran(p) ⊆ ℘(ω) & ∃p ∈ G(dom(p) = α)} = ω1 , then fG := G is a function from ℵ1 to ℘(ω). However, if G is an element of V , then fG is not onto ℘(ω) since 2ℵ0 > ℵ1 in V . This is where genericity comes into play. We have to make sure that G is sufficiently complicated. D ⊆ P is dense in P if for all p ∈ P there is q in D such that q ≤ p. A filter G ⊆ P is P-generic over V if it has nonempty intersection with every D ∈ V which is dense in P. A set D ⊆ P is dense below a condition p ∈ P if for all q ∈ P with q ≤ p there is q ∈ D with q ≤ q. It is not difficult to check that if D is dense below p and G is P-generic over V with p ∈ G, then D ∩ G = ∅. Except for trivial cases, filters that are generic over V cannot be elements of V . Thus, we need someone outside V who chooses a P-generic filter over V . It can be shown that for each partial order Q in V it is safe (i.e., it does not lead to a contradiction unless ZFC itself is contradictory) to assume that a Q-generic filter over V exists. More precisely, whenever we consider a specific forcing notion Q it will have a reasonable description. If the existence of a forcing notion with this description is consistent with ZFC, then there is a model V of set theory such that in V we have a partial order Q satisfying the description of Q and there is a model W of set theory such that V ⊆ W and in W there is a Q -generic filter G over V . So, let G be P-generic over V . For every x ∈ ℘(ω) the set Dx := {p ∈ P : x ∈ ran(p)} is dense in P. Since Dx ∈ V and G is generic, there is p ∈ G∩Dx , i.e., there is p ∈ G such that x ∈ ran(p). This implies that f := fG is a function onto ℘(ω). Similarly, for every α < ω1 , the set Dα := {p ∈ P : α ∈ dom(p)} is dense in P. It follows that for every α < ω1 there is p ∈ G such that α ∈ dom(p). This implies dom(f ) = ℵ1 . It follows that f : ℵ1 → ℘(ω) is onto. After constructing the desired map f from G, we have to explain how to get a model V [G] of ZFC from V and G. Roughly speaking, V [G] consists of all sets that are definable from G together with parameters from V . More precisely, a class of so-called P-names is defined in V .
ON SACKS FORCING AND THE SACKS PROPERTY
99
Then, by means of some simple algorithm, from every P-name x˙ ∈ V a set x˙ G is computed using G. The set x˙ G is the evaluation of x˙ with respect to G. The model V [G] consists of all x˙ G . Every element y of V has a canonical P-name yˇ with the property that for every P-generic G, yˇG = y. In other words, V is included in V [G]. Moreover, there is a name G˙ such that for every P-generic filter G, G˙ G = G. V [G] turns out to be a model of ZFC, and it has the same ordinals as V . In V [G] we have the function f = fG , as defined above. The function f has at least one name in V , so we can pick one and call it f˙. Thus, f is the interpretation of the name f˙ under G, and we denote this fact by f = f˙G . We argued that f : ℵ1 → ℘(ω) is onto. However, we were talking about the ℵ1 and ℘(ω) of V , not of V [G]. Formally, the symbols ℵ1 and ℘(ω) stand for definitions of certain sets. The set ℘(ω) is the power set of the first limit ordinal and ℵ1 is the first ordinal α such that there is no map from ω onto α. As it turns out, ω is the same in V as in V [G] V [G], and this holds for every forcing extension. Let ℵV1 and ℵ1 denote the first uncountable ordinal in V or V [G], respectively. Relativizations of power sets, i.e., ℘(ω)V and ℘(ω)V [G] , are defined similarly. In order V [G] to prove that V [G] is a model of CH it is enough to show ℵV1 = ℵ1 and ℘(ω)V = P (ω)V [G] . Both of these statements follow from the fact that V [G] does not contain any countable sequences of ordinals that are not already elements of V (i.e., there are no new countable sequences of ordinals), which we prove in a moment. We need some information on how properties of V [G] are connected to the properties of P. Let ϕ(x1 , . . . , xn ) be a formula in the language of set theory (first order logic with the binary relation symbol ∈) with all free variables among x1 , . . . , xn . Then for every condition p ∈ P and P-names x˙ 1 , . . . , x˙ n we say that p forces ϕ(x˙ 1 , . . . , x˙ n ) iff for every Pgeneric filter G with p ∈ G, V [G] satisfies ϕ((x˙ 1 )G , . . . , (x˙ n )G ).2 In this case we write p P ϕ(x˙ 1 , . . . , x˙ n ). It is crucial for the theory of forcing that the relation P is actually definable in V . Moreover, we have the following important fact: Theorem 1.1 (The Forcing Lemma). Let Q be a forcing notion and suppose that G is Q-generic over V . Then V [G] |= ϕ((x˙ 1 )G , . . . , (x˙ n )G ) 2
Here ϕ(x˙ 1 , . . . , x˙ n ) denotes ϕ with every free xi replaced by the name x˙ i .
100
STEFAN GESCHKE AND SANDRA QUICKERT
(i.e., ϕ((x˙ 1 )G , . . . , (x˙ n )G ) is true in V [G]) iff there is q ∈ G such that q P ϕ(x˙ 1 , . . . , x˙ n ). Thus, nothing in V [G] depends on chance, everything is forced at some point. We say that a condition p decides a statement ϕ if p ϕ or p ¬ϕ. Similarly, if x˙ is a name, we say that p decides x˙ if there is some ground model set x such that p x˙ = xˇ. It is worth noting that for conditions p and q with q ≤ p, q forces everything that p forces. Moreover, for every condition p and every statement ϕ there is q ≤ p such that q decides ϕ. Returning to our example, how can we see that there are no new countable sequences of ordinals? The forcing notion P is σ-closed, that is, if (pi )i<ω is a sequence of conditions in P such that pi+1 ≤P pi for each i, then there is some p ∈ P such that p ≤ pi for every i ∈ ω. Just let p := i∈ω pi . Assume there is a P-name h˙ and a condition p which forces that h˙ is a function from ω to the ordinals. Let p ≤ p be arbitrary. Then, using Theorem 1.1 and the remarks following it, we find a decreasing sequence (qi )i<ω of conditions such that q0 ≤P p and for each i there is some ˙ ordinal αi such that qi P h(i) = αi .3 Let q ∈ P be such that q ≤ qi for all i ∈ ω. Now for all i ∈ ω, q ˙h(i) = αi . In other words, q forces that h˙ is the sequence (αi )i∈ω , which is an element of the ground model. Since q ≤ p and p was arbitrary below p, the set of conditions that force h˙ to be a sequence in the ground model is dense below p. It follows that every generic filter G that contains p intersects this set. Thus, whenever G is P-generic over V and p ∈ G, then h˙ G ∈ V . It follows that for every P-generic filter G over V , in V [G] there are no new countable sequences of ordinals and thus, ℵ1 and ℘(ω) are the same in V [G] as in V . But this implies that V [G] is a model of ZFC+CH since in V [G] there is a surjection from ℵ1 onto ℘(ω). It is worth mentioning that if p forces h˙ to be a function from ω to the ordinals, then we may already assume that every condition q forces h˙ to be a function from ω to the ordinals. This is because we can replace h˙ by a name g˙ such that every condition q forces “g˙ is a function from ω to the 3
˙ ı) = α Formally, the last part should have been stated as qi P h(ˇ ˇ i since we have to use names to the right of the forcing relation. However, we identify the elements of V with their canonical names as long as it is clear what we mean.
ON SACKS FORCING AND THE SACKS PROPERTY
101
˙ then h˙ = g”. ordinals and if pˇ ∈ G, ˙ The existence of g˙ follows from the next theorem, the so-called Maximality Principle. Theorem 1.2. Let ϕ(x, y1 , . . . , yn ) be a formula in the language of set theory. Let Q be a forcing notion, q ∈ Q, and suppose that y˙1 , . . . , y˙ n are Q-names such that q ∃xϕ(x, y˙1 , . . . , y˙n ). Then there is a Q-name x˙ such that q ϕ(x, ˙ y˙1 , . . . , y˙n ). In particular, if every q ∈ Q forces ∃xϕ(x, y˙1 , . . . , y˙ n ), then 1Q forces this and thus, by Theorem 1.2, there is a Q-name x˙ such that every q ∈ Q forces ϕ(x, ˙ y˙1 , . . . , y˙n ).
2
Sacks forcing and its properties
The forcing notion which we introduced in Section 1 for collapsing 2ℵ0 to ℵ1 was a rather special case. As mentioned before, forcing was invented to enlarge 2ℵ0 , not to make it smaller. Enlarging 2ℵ0 can be done by adding new reals, where a “real” is an element of one of the sets ℘(ω), 2ω , ω ω , R, and [0, 1]. In our context, usually it does not matter whether we look at ℘(ω), 2ω , ω ω , R, or [0, 1], and here is why: The first two spaces can be identified in a natural way. The space ω ω is homeomorphic to the set of irrational numbers, a subset of R whose complement is countable and therefore neglegible for our purposes. The compact unit interval is a continuous image of 2ω via a mapping that fails to be injective only on countably many places, i.e., again on a neglegible set. Finally, [0, 1] is homeomorphic to the two-point compactification of R. These mappings, which are almost bijections between the different spaces, are all nicely definable and show that once we know all the elements of one of those spaces, we know all the elements of the other spaces too. Moreover, if we investigate some frequently studied ideals such as the ideal of measure zero sets, of meager sets, or of countable sets, it turns out that, as far as the basic properties of these ideals concerned, it does not matter on which incarnation of “the reals” the respective ideals are considered.
102
STEFAN GESCHKE AND SANDRA QUICKERT
The forcing notion used by Cohen to add new reals is now called Cohen forcing and is defined as follows. Let C be the set 2<ω of all finite sequences of 0’s and 1’s. The set C is ordered by reverse inclusion. If G is C-generic over the ground model, then the Cohen real added by C is just G, a function from ω to 2. Thus, Cohen forcing uses finite approximations to create a new real. To enlarge the continuum using C one has to use iterate Cohen forcing in order to add many new reals. Iterated forcing will be discussed later. Let us consider another description of Cohen forcing. Whenever we consider a Boolean algebra B as a forcing notion, we mean the partial order (B\{0B }, ≤) where ≤ is the natural order on B. Let (P, ≤P ) and (Q, ≤Q ) be two partial orders. A mapping e : P → Q is a dense embedding iff the following conditions hold: – ∀p, p ∈ P(p ≤P p ⇒ e(p) ≤Q e(p )) – ∀p, p ∈ P(p ⊥P p ⇒ e(p) ⊥Q e(p )) – ∀q ∈ Q∃p ∈ P(e(p) ≤Q q) Note that we do not demand e to be 1-1. But condition (ii) implies that for all p, p ∈ P with e(p) = e(p ) and every P-generic filter G we have p ∈ G iff p ∈ G. Every partial order P has a so-called completion, a complete Boolean algebra B such that P densely embeds into (B\{0B }, ≤). The Boolean algebra B is unique up to isomorphism. We call two partial orders P and Q forcing equivalent (or just equivalent) iff they have isomorphic completions. If P and Q are equivalent, then they produce the same generic extensions. Therefore, equivalent partial orders can be considered the same as far as forcing is concerned. Many forcing notions adding generic reals can be described in the following way: Let Bor(R) denote the Boolean algebra of Borel subsets of the reals. If I ⊆ Bor(R) is an ideal, then Bor(R)/I is a Boolean algebra. As a forcing notion, this Boolean algebra is equivalent to (Bor(R)\I, ⊆). Let M denote the ideal of meager subsets of the real line. Then Cohen forcing is equivalent to Bor(R)/M since Bor(R)/M has a dense subset which is isomorphic to C. Another important ideal in Bor(R) is the ideal N of Borel sets of measure zero. The forcing notion Bor(R)/N is called random real forcing and adds a random real. Both forcing notions, Cohen and random real forcing satisfy the countable chain condition (c.c.c.), that is, they only have countable antichains. Here an antichain is a set of pairwise
ON SACKS FORCING AND THE SACKS PROPERTY
103
incompatible elements. Forcing notions with the c.c.c. do not collapse cardinals. More exactly, if P is c.c.c. and G is P-generic over the ground model V , then an ordinal α is a cardinal in V [G] iff it is a cardinal in V . If I ⊆ Bor(R) is a σ-ideal, then the generic real added by Bor(R)/I is determined as follows: We work with Bor(R)\I instead of Bor(R)/I. Let G be Bor(R)\I-generic over the ground model V . Then G is nonempty and contains a unique real xG , and this real is the generic real added by Bor(R)/I. While xG is not a generic filter for some partial order, G can be recovered from xG (i.e., G and xG are interdefinable), and thus it makes sense to talk about xG as the generic object added by Bor(R)/I. In a moment we will see how G can be recovered from xG . To describe the properties of xG , we have to mention Borel codes (cf. [Jec78, page 537]). A Borel set X ⊆ R is not merely a set of real numbers, but it has a description of how it can be obtained from open sets by taking complements and unions over countable families. This description can be coded as an element of ω ω , and this code is a Borel code for X. (Note that a Borel code for a given set X is not unique.) In fact, usually mathematicians use Borel codes when they are talking about Borel sets. E.g., they are talking about the open set (0, 1) instead of thinking about every single real inside this interval. The notation (0, 1) can be considered as a Borel code for a certain open set. This Borel code has different interpretations in different models of set theory. However, usually the structure of a set matters, and the individual elements which are inside are unimportant. When we talk about a Borel set X in two different models of set theory, we really choose a Borel code for X and then identify the two Borel sets that are obtained in the respective models by applying the definition of a Borel set coded by the Borel code. This does not depend on the particular Borel code we chose for X. The generic real xG has the property that it is not an element of any element of I. More precisely, let X ∈ I. Then X is Borel and thus has a Borel code. (We work in V , so far.) Now the Borel set in V [G] which has this Borel code does not contain xG . Roughly speaking, xG avoids all sets from I. Now G can be recovered from xG as follows: Let (Bor(R))V denote the set of ground model Borel sets. Then G is the set of all X ∈ (Bor(R))V which in V [G] contain xG . Since xG avoids all elements of the ideal I, the ground model Borel sets which in V [G] contain xG are indeed elements of (Bor(R))V \I. By the definition of xG , every ele-
104
STEFAN GESCHKE AND SANDRA QUICKERT
ment of G contains xG (in V [G]). On the other hand, if X is a ground model Borel set which in V [G] contains xG , then X is an element of (Bor(R))V \I which is compatible with all elements of G since the intersection of X with an arbitrary element of G is again a ground model Borel set that in V [G] contains xG . It is a general fact about generic filters that they contain a condition iff the condition is compatible with all the elements of the filter. This implies X ∈ G.4 One σ-ideal of Bor(R) stands out as the smallest σ-ideal containing all the singletons, namely the ideal ctbl of countable sets. Sacks forcing is Bor(R)/ctbl. A Sacks real is a generic real for Sacks forcing. There is another ideal related to Sacks forcing, Marczewski’s ideal s0 introduced in [Mar0 35].5 A set X ⊆ R is perfect iff it is nonempty, closed, and has no isolated points. A set X ⊆ R is in the ideal s0 iff every perfect set P ⊆ R has a perfect subset Q ⊆ P which is disjoint from X. Note that no perfect set is in s0 . A set X ⊆ R is s-measurable iff for every perfect set P ⊆ R there is a perfect set Q ⊆ P such that either Q ∩ X = ∅ or Q ⊆ X. The set of s-measurable sets is called the Marczewski field. A well known theorem due to Alexandroff and Hausdorff says that every uncountable Borel set includes a perfect set (cf. [Jec78, Theorem 94]). It follows that every Borel set is s-measurable and that the Borel sets in s0 are precisely the countable sets. Moreover, every s-measurable set X ⊆ R which is not in the ideal s0 has a perfect subset. And perfect sets are clearly Borel. It follows that there is a dense embedding from the partial order Bor(R)\ctbl into the partial order of s-measurable sets that are not in s0 . This implies that the Boolean algebra Bor(R)/ctbl and the quotient of the Boolean algebra of s-measurable sets modulo the ideal s0 have isomorphic completions and are thus forcing equivalent. We give a combinatorial description of Sacks forcing. First note that every perfect set X ⊆ R has a perfect subset which is homeomorphic to 2ω .6 It follows that the copies of 2ω are dense in Bor(R)\ctbl. Therefore 4
5 6
Note that all the time we considered I as the actual set of ground model Borel sets, i.e., as an ideal in (Bor(R))V . Usually I fails to be an ideal in Bor(R) in V (G). If I is nicely definable like M and N , then typically the ground model ideal I is different from the ideal in V [G] which is obtained by applying the definition of I in V [G]. Cf. [Bre1 95] for some information on Marczewski’s ideal and other similarly defined ideals. Cf. [Jec78, Lemma 4.2] for this. The statement follows from the usual proof of the fact that every perfect set is of size 2ℵ0 .
ON SACKS FORCING AND THE SACKS PROPERTY
105
forcing with the partial order of perfect subsets of 2ω gives the same generic extensions as forcing with Bor(R)/ctbl. Let X be a subset of 2ω . Then the set T (X) := {xn : n ∈ ω ∧ x ∈ X} of initial segments of elements of X is a subtree of the tree (2<ω , ⊆). For a subtree T of 2<ω let [T ] := {x ∈ 2ω : ∀n ∈ ω(xn ∈ T )} be the set of branches through T . For every subtree T of 2<ω the set [T ] is closed. For every closed set X ⊆ 2ω , [T (X)] = X. In this translation between closed subsets of 2ω and subtrees of 2<ω the perfect sets correspond to perfect trees. A nonempty subtree T of 2<ω is perfect iff for every s ∈ T there are t0 , t1 ∈ T such that s ⊆ t0 , t1 and t0 and t1 are incomparable with respect to ⊆. This gives rise to our official definition of Sacks forcing S. Definition 2.1. Sacks forcing is the set S of all perfect subtrees of 2<ω ordered by inclusion. With this definition of Sacks forcing, whenever we refer to a Sacks real we mean the unique element x of p∈G [p] where G is an S-generic filter. If x is a Sacks real and this is witnessed by G, then G is the set of all perfect trees p in the ground model such that x ∈ [p]. It is quite obvious that Sacks forcing does not have the c.c.c. In fact, there are antichains of size 2ℵ0 : Fix an almost disjoint family {Aα | α < 2ℵ0 } of subsets of ω, and for each α < 2ℵ0 choose a perfect tree Tα whose splitting levels are exactly the elements of Aα , for example, Tα := {s ∈ 2<ω : ∀n < |s|(n ∈ / Aα → s(n) = 0)}. If α = β, then Tα ∩ Tβ includes no perfect tree, so they are incompatible. Nevertheless, S does not collapse ℵ1 either, since it satisfies Baumgartner’s Axiom A [Bau83, Section 7]: Definition 2.2. A forcing notion (P, ≤) satisfies Axiom A iff there is a decreasing chain of partial orders ≤ = ≤0 ⊇ ≤1 ⊇ ≤2 . . . on P such that 1. for every sequence (pn )n<ω in P such that for all n ∈ ω, pn+1 ≤n pn there is p ∈ P such that for all n ∈ ω, p ≤n pn (such a sequence (pn )n∈ω is called a fusion sequence and p is a fusion of (pn )n∈ω ) and 2. if A ⊆ P is an antichain and p ∈ P, then for every n ∈ ω there is q ≤n p such that q is compatible with at most countably many members of A. Axiom A can be seen as a generalisation of the c.c.c. since each c.c.c. forcing satisfies Axiom A: Just let ≤n be equality for all n > 0.
106
STEFAN GESCHKE AND SANDRA QUICKERT
Axiom A forcings belong to a wider class of forcing notions preserving ℵ1 , namely the class of proper forcings [She98, III.1]. Properness was introduced by Shelah as a strengthening of “not collapsing ℵ1 ” which behaves nicely with respect to iterations. Countable support iterations of proper forcings are again proper while countable support iterations of ℵ1 -preserving forcing notions may collapse ℵ1 [She98, III.3]. Instead of invoking properness it can be seen very quickly that Axiom A suffices to guarantee that ℵ1 is preserved. Lemma 2.3. Let P be a forcing notion satisfying Axiom A. Then for every P-generic filter G over the ground model V the following holds: If x is a countable set of ordinals in V [G], then there is a set y ∈ V such that x ⊆ y and y is countable in V . In particular, forcing with P does not collapse ℵ1 . Proof. Let x˙ be a P-name and suppose that p ∈ P forces that x˙ is a countable set of ordinals. We will construct a fusion sequence (qn )n∈ω starting with q0 = p and a countable set y such that for every fusion q of this sequence, q x˙ ⊆ yˇ. We start by choosing a name h˙ for a function from ω into the ordinals such that p forces h˙ to be a function from ω onto x. ˙ The existence of h˙ is guaranteed by the Maximality Principle. Let n ∈ ω and suppose we have already defined qn . The set of conditions ˙ deciding h(n) is dense in P. It follows that there is a maximal antichain ˙ An ⊆ P consisting of conditions that decide h(n). By condition (ii) of Axiom A there is qn+1 ≤n qn such that qn+1 is compatible with at most countably many elements of An . Let q be a fusion of (qn )n∈ω , and put ˙ = α)}. y := {α : ∃q ≤ q∃n ∈ ω(q h(n) Then, whenever G is P-generic over V and q ∈ G, for every n ∈ ω there ˙ is q ∈ G such that q ≤ q and q h(n) =α ˇ where α = h˙ G (n). (Recall that V [G] has the same ordinals as V .) It follows that x˙ G = h˙ G [ω] ⊆ y. It remains to show that y is countable. Let n ∈ ω and q ≤ q. Since An is a maximal antichain, there is r ∈ An such that q and r are compatible. ˙ By the choice of An , there is αr such that r h(n) = α ˇ r . It follows ˙ ˙ that if q decides h(n), then q decides h(n) to be αr . Since q ≤ q, r and q are compatible. By the choice of qn+1 and since q ≤ qn+1 , q is compatible with at most countably many elements of An . It follows
ON SACKS FORCING AND THE SACKS PROPERTY
107
˙ that {α : ∃q ≤ q(q h(n) = α ˇ )} is countable. But this implies the countability of y. q.e.d. The conclusion of Lemma 2.3 is useful enough to get a name. Definition 2.4. Let V0 and V1 be models of set theory such that V0 ⊆ V1 and V0 and V1 have the same ordinals. We say that V1 has the ℵ0 -covering property over V0 iff for every set A ∈ V1 of ordinals which is countable in V1 there is a set B ∈ V0 of ordinals which is countable in V0 such that A ⊆ B. Lemma 2.3 says that if P is a forcing notion satisfying Axiom A, then for every P-generic filter G over the ground model V , V [G] has the ℵ0 -covering property over V . Properness instead of Axiom A is actually sufficient for this (cf. [She98, III.1.16]). The definition of Axiom A seems rather technical. However, it is a quite natural property which is typically satisfied by forcing notions where the conditions are trees which are required to split often. Let us show Lemma 2.5. Sacks forcing S satisfies Axiom A. Proof. For n ∈ ω and p ∈ S, let pn consist of those t ∈ p that are minimal in p (with respect to ⊆) such that t has exactly n proper initial segments that have two immediate successors in p. For p, q ∈ S let p ≤n q iff p ≤ q and pn = q n . Suppose (pn )n∈ω is a sequence in S such that pn+1 ≤n pn for all n ∈ ω. Then q := pn is easily seen to be a perfect tree, i.e., an element of S. Clearly, q ≤n pn for every n ∈ ω. In the context of Sacks forcing we will refer to the intersection of a fusion sequence as the fusion of the sequence. Now let n ∈ ω, p ∈ S, and suppose that A is an antichain in S. Every σ ∈ 2n uniquely determines an element tσ of pn (using the natural bijection between 2n and pn ). Let p ∗ σ := {s ∈ p : s ⊆ tσ ∨ tσ ⊆ s}. Clearly, p ∗ σ ∈ S and p ∗ σ ≤ p. For every σ ∈ 2n let qσ ≤ p ∗ σ besuch that qσ is compatible with at most one element of A. Now q := σ∈2n qσ ∈ S and q ≤n p. Note that with this definition of q, for every σ ∈ 2n we have qσ = q ∗ σ. If q is compatible with some r ∈ A, then there is σ ∈ 2n such that q ∗ σ is compatible with r. But every q ∗σ is compatible with at most one element of A. It follows that q is compatible with at most 2n elements of A. q.e.d.
108
STEFAN GESCHKE AND SANDRA QUICKERT
Since Sacks forcing satisfies Axiom A, it does not collapse ℵ1 . Under CH, S is of size ℵ1 and therefore does not collapse any cardinal above ℵ1 , either. However, if CH fails in the ground model, forcing with S could easily collapse 2ℵ0 . In fact, it is possible to have a ground model V such that CH fails in V and whenever G is S-generic over V , 2ℵ0 = ℵ1 in V [G]. We will give the argument for this in Section 7. Petr Simon [Sim1 93] showed that adding a Sacks real collapses 2ℵ0 to the unboundedness number b, the least size of a set U ⊆ ω ω such that for all f : ω → ω there is g ∈ U such that for infinitely many n ∈ ω, g(n) > f (n). But b < 2ℵ0 is not the only reason for Sacks forcing to collapse cardinals. Judah, Miller, and Shelah in [JudMil0 She92] showed that it is consistent with Martin’s Axiom (which implies b = 2ℵ0 ) that Sacks forcing collapses cardinals. On the other hand, Shelah showed that it is consistent with ZFC + ¬CH that Sacks forcing does not collapse cardinals (cf. [Car0 Lav89]). Moreover, Carlson and Laver [Car0 Lav89] showed that some strengthened form of Martin’s Axiom implies that Sacks forcing does not collapse 2ℵ0 to ℵ1 . This stronger version of Martin’s axiom is known to be consistent with ZFC + 2ℵ0 =ℵ2 . This also implies that it is consistent with ZFC + ¬CH that Sacks forcing does not collapse any cardinal. Let us go back to the proof of Lemma 2.5. This proof shows that Sacks forcing S actually satisfies a stronger form of Axiom A: Given a condition p, an integer n, and an antichain A, we can find q ≤n p which is compatible with only finitely many members of A. This already implies that Sacks forcing has the so-called Sacks property. However, we can do even better. Definition 2.6. Let f : ω → ω\{0}. C : ω → [ω]<ℵ0 is an f -cone iff for all n ∈ ω, |C(n)| ≤ f (n). A real r ∈ ω ω is covered by C iff for all n ∈ ω, r(n) ∈ C(n). Let V0 and V1 be models of set theory such that V0 ⊆ V1 . We say that V1 has the Sacks property over V0 iff every real r : ω → ω in V1 is covered by a 2n -cone C : ω → [ω]<ω in V0 . (Here 2n is meant as a shortcut for the function n → 2n .) A forcing notion P has the Sacks property iff for every P-generic filter G over the ground model V , V [G] has the Sacks property over V . A subtree T of ω <ω is binary iff every t ∈ T has at most 2 immediate successors in T . A real r : ω → ω is covered by T iff r ∈ [T ]. Let V0 and V1 be as before. Then V1 has the 2-localization property over V0 iff every
ON SACKS FORCING AND THE SACKS PROPERTY
109
real r ∈ ω ω in V1 is covered by a binary tree in V0 . A forcing notion P has the 2-localization property iff for every P-generic filter G over the ground model V , V [G] has the 2-localization property over V . The Sacks property clearly follows from the 2-localization property. Lemma 2.7. Sacks forcing has the 2-localization property. In particular, it has the Sacks property. Proof. Let p ∈ S, and suppose that z˙ is an S-name such that p z˙ ∈ ω ˇ ωˇ . It suffices to find a binary tree T and q ≤ p such that q forces z˙ to be a branch through T . The condition q will be the fusion of a fusion sequence (pn )n∈ω . We then define T to be Tq (z) ˙ := {s ∈ ω <ω : ∃q ≤ q(q sˇ ⊆ z)}, ˙ the tree of q-possibilities for z. ˙ Note that q forces z˙ to be a branch of Tq (z). ˙ While choosing the sequence (pn )n∈ω all we have to make sure is that Tq (z) ˙ becomes binary. The set {q ∈ S : q decides all of z} ˙ ∪ {q ∈ S : ∀r ≤ q(r does not decide all of z)} ˙ is dense in S. If there is some q ≤ p which decides all of z˙ we are done since in this case Tq (z) ˙ does not have incomparable elements. Thus, we may assume that no condition below p decides all of z. ˙ For every condition q ≤ p let z˙q be the longest initial segment of z˙ that is decided by q. In particular, q (z˙q )ˇ ⊆ z. ˙ Since no condition <ω below p decides all of z, ˙ z˙q ∈ ω . Let p0 := p. Let n ∈ ω and suppose that we have already chosen pn . For σ ∈ 2n consider the conditions pn ∗ (σ 0) and pn ∗ (σ 1) where si denotes the concatenation of s with the sequence of length 1 that has the value i. Since pn ∗ (σ 0) and pn ∗ (σ 1) do not decide all of z, ˙ there are qσ 0 ≤ pn ∗ (σ 0) and qσ 1 ≤ pn ∗(σ 1) such that z˙qσ 0 and z˙qσ 1 are incomparable with respect to ⊆. Now pn+1 := i∈2,σ∈2n qσ i ≤n pn . Let q := n∈ω pn . It is easily checked by induction on n that for all n ∈ ω the finite tree Tn which is generated by (i.e., consists of all initial segments of elements of) {z˙q∗σ : σ ∈ 2n } has the following properties:
110
STEFAN GESCHKE AND SANDRA QUICKERT
1. Tn is a finite binary tree of height at least n, ˙ and 2. Tn ⊆ T = Tq (z), 3. if t ∈ T is of length ≤ n, then t ∈ Tn . It follows that T = n∈ω Tn and that T is binary.
q.e.d.
Let us analyze this proof. For every p ∈ S and every name z˙ for an element of ω ω such that no condition below p decides all of z˙ we found a condition q ≤ p such that Tq (z) ˙ is binary. It follows from the construction of q that h : [q] → [Tq (z)]; ˙ x → {z˙q∗σ : σ ∈ 2<ω ∧ x ∈ [q ∗ σ]} is well-defined and in fact a homeomorphism. Obviously, q forces the Sacks real to be a branch of q (every condition in S does this). Moreover, q forces that h maps the Sacks real to z. ˙ This shows the following: Let G be S-generic over the ground model V . Then for every element z of ω ω in V [G] with z ∈ V there is a homeomorphism h ∈ V between a perfect subset of 2ω (the set [q] for some suitable q ∈ G) and a perfect subset of ω ω (the set [Tq (z)] ˙ for some name z˙ for z) such that (in V [G]) h maps the Sacks real x added by G to z. The homeomorphism h induces an order isomorphism between the set of perfect subtrees of q and the set of perfect subtrees of Tq (z). ˙ The partial orders of perfect subtrees of q and of Tq (r) ˙ are clearly isomorphic to S. It is easily checked that {p ≤ q : z ∈ [p]} is a generic filter for the partial order of perfect subtrees of q. But this implies that {p ∈ S : p ⊆ Tq (z) ˙ ∧ z ∈ [p]} is a generic filter for the partial order of perfect subtrees of Tq (z). ˙ From this fact it easily follows that if z happens to be an element of 2ω , then z is a Sacks real belonging to some S-generic filter which is usually different from, but interdefinable with G using ground model parameters. It follows that every new element of 2ω is again a Sacks real. (This was proved by Sacks in [Sac71].) The existence of a ground model homeomorphism h mapping the original Sacks real x to z implies that x can be reconstructed from z using a ground model parameter, namely h (or rather h−1 ). This observation already indicates the minimality of the Sacks extension mentioned in the introduction: For every S-generic filter G over the ground model V and every model W of set theory such that V ⊆ W ⊆ V [G], either W = V or V [G] = W . A real r which codes (i.e., is interdefinable with) some
ON SACKS FORCING AND THE SACKS PROPERTY
111
P-generic filter G over V for some partial order P is called a minimal real iff V [G] is a minimal extension of V . From [Jec78, Lemma 25.2 and Lemma 25.3] we quote the following: If G is a P-generic filter over the ground model V for some forcing notion P and W is a model of ZFC such that V ⊆ W ⊆ V [G], then there is a set A of ordinals such that W = V [A] where V [A] is the smallest model of ZFC which extends V and contains A. Note that it is not obvious that V [A] exists at all for a given set A ∈ V [G] of ordinals. However, it can be shown that for every set A ∈ V [G] of ordinals there is a forcing notion Q ∈ V and a complete embedding e : Q → P such that V [e−1 [G]] is the smallest model of ZFC which extends V and contains A. A mapping e : Q → P is a complete embedding iff for all q, q ∈ Q, – q ≤ q implies e(q) ≤ e(q ), and – e(q) ⊥ e(q ) iff q ⊥ q , and – for all p ∈ P there is q ∈ Q such that for all q ∈ Q with q ≤ q, e(q ) is compatible with p. If e : Q → P is a complete embedding (in the ground model V ), then for every P-generic filter G over V , e−1 [G] is Q-generic over V . Lemma 2.8. Sacks reals are minimal. Proof. Let A˙ be a name for a set of ordinals. Then there is an ordinal α such that 1S forces A˙ to be a subset of α. Let z˙ be a name for the characteristic function of A˙ from α to 2. Let p ∈ S and let r˙ be a name for the Sacks real added by the S-generic filter. We have to find q ≤ p such that q decides all of z˙ or such that for every S-generic filter G over the ground model V with q ∈ G, r˙G can be reconstructed from z˙G using parameters in V . We proceed essentially as in the proof of Lemma 2.7. Again we may assume that no condition below p decides all of z˙ since otherwise we are done. For every condition q ≤ p let z˙q be the longest initial segment of z˙ that is decided by q, as before. Since no condition below p decides all of z, ˙ the domain of z˙q is an ordinal < α. Let p0 := p. Let n ∈ ω and suppose that we have already chosen pn . For σ ∈ 2n consider the conditions pn ∗ (σ 0) and pn ∗ (σ 1). Since pn ∗ (σ 0) and pn ∗ (σ 1)
112
STEFAN GESCHKE AND SANDRA QUICKERT
do not decide all of z, ˙ there are qσ 0 ≤ pn ∗ (σ 0) and qσ 1 ≤ pn ∗ (σ 1) such that z˙qσ 0 and z˙qσ 1 are incomparable with respect to ⊆. Let pn+1 := i ≤n pn . i∈2,σ∈2n qσ Let q := n∈ω pn . Now if G is S-generic over V and q ∈ G, then r˙G is the unique element of [q] such that for all σ ∈ 2<ω with r˙G ∈ [q ∗ σ], z˙q∗σ ⊆ z˙G . This shows that q ∈ G implies r˙G ∈ V [A˙ G ] and we are done. q.e.d.
Brendle in [Bre1 00b] proved some interesting theorems about the set of all Sacks reals over the ground model V in a generic extension V [G]. He showed that after adding a single Sacks real over V , the set of ground model reals is in the ideal s0 . Since every new real is a Sacks real, this means that the set of Sacks reals is big with respect to s0 , i.e., its complement is in s0 . It follows that after adding a Sacks real x0 over V and then adding a Sacks real x1 over V [x0 ], in V [x0 ][x1 ] the set of Sacks reals over V is in s0 . The results about Sacks forcing mentioned so far show that when adding a single Sacks real, we have a very good control about all the new reals. Let us mention another result along these lines. Let U be an ultrafilter on ω in the ground model V . Suppose G is P-generic over V for some forcing notion P. Then U is destroyed by adding G if the filter generated by U in V [G] is not an ultrafilter anymore. Otherwise the ultrafilter is preserved. It can be shown that adding any new real destroys some ultrafilter from the ground model (cf. [Bar2 Jud95, Theorem 6.2.2]). However, some ground model ultrafilters are preserved when adding a Sacks real. An ultrafilter U on ω is a p-point if for every countable family F ⊆ U there is a set U ∈ U which is almost included in all elements of F. The set U is called a pseudo-intersection of F. In [Bar2 Jud95, Theorem 7.3.48] it was shown that Miller forcing (see Section 4 for the definition of Miller forcing) preserves p-points. The same proof works for Sacks forcing. Lemma 2.9. If U is a p-point in the ground model V and G is S-generic over V , then U generates a p-point in V [G]. In other words, Sacks forcing preserves p-points. It should be pointed out that it is very easy to destroy every ground model ultrafilter by adding a new real. A real r ∈ 2ω is called a splitting real over the ground model V if for A := r−1 (1) and all B ∈ ℘(ω) ∩ V
ON SACKS FORCING AND THE SACKS PROPERTY
113
that are neither finite nor cofinite, A ∩ B and B\A are both infinite. Both random reals and Cohen reals are splitting. It is easily checked that adding a splitting real destroys all ultrafilters from the ground model. To conclude this section, we mention a result about the Π13 -theory of L[x] where x is a Sacks real over G¨odel’s universe L of constructible sets. Note that the minimality of the Sacks extension implies that by adding a Sacks real x to L, one obtains a minimal model of V = L, i.e., for every class C ⊆ L[x] that is a model of ZFC, either C = L[x] or C = L, and in the latter case C |= V = L. Recall that ϕ(x) is a Σ12 -formula if ϕ(x) is of the form ∃r1 ∈ ω ω ∀r2 ∈ ω ω ψ(x, r1 , r2 ) where ψ is arithmetical, i.e., only has quantifiers ranging over natural numbers (see, e.g., [Jec78, 7.40]). Shoenfield’s Absoluteness Theorem (cf. [Jec78, Theorem 98]) implies that Σ12 -formulas are absolute for forcing extensions. That is, if V [G] is a forcing extension of the ground model V , ϕ(x) is a Σ12 -formula, and r is an element of ω ω in V , then ϕ(r) holds in V iff it holds in V [G]. A Π13 -statement is a statement of the form ∀r0 ∈ ω ω ∃r1 ∈ ω ω ∀r2 ∈ ω ω ψ(r0 , r1 , r2 ) where ψ is arithmetical. Lemma 2.10. If ϕ is a Π13 -statement which holds in a universe containing a non-constructible real, then it holds in L[G] where G is S-generic over L. It follows that if there is any model V which is different from L such that in V every real r satisfies the Σ12 -formula ∃r1 ∈ ω ω ∀r2 ∈ ω ω ψ(r, r1 , r2 ), then Sacks forcing over L adds no real for which this statement fails. As Woodin remarked in [Woo99, Theorem 1.5], Lemma 2.10 is the consequence of the following result of Mansfield (cf. [Jec78, Theorem 99]): Theorem 2.11. Let A be a set of reals defined by a Σ12 -formula with constructible parameters. If A contains a non-constructible element, then it includes a constructibly coded perfect subset.
114
STEFAN GESCHKE AND SANDRA QUICKERT
To see why this implies Lemma 2.10, let V be a model of ZFC containing a non-constructible real, let ϕ(x) be a Σ12 -formula with only constructible parameters, and suppose that V |= ∀rϕ(r). Fix a generic extension V [G] in which ℵL1 is countable. Then in particular, all constructible dense sets of S are countable in V [G]. We have already seen that in the Sacks extension every new real is itself a Sacks real. Therefore it suffices to show that the Sacks generic real satisfies ϕ in L[G] where G is S-generic over L. In order to show this, let T ∈ S ∩ L. In V [G], A = {r ∈ [T ] | ϕ(r)} is a Σ12 -set containing a non-constructible element (from V ), and thus Theorem 2.11 yields a perfect tree T ∈ L such that [T ] ⊆ A. By the choice of V [G], [T ] also contains a Sacks real over L, and since Σ12 statements are absolute, neither T nor any perfect subtree of it can force ¬ϕ(r). ˙ As Brendle and L¨owe observed in [Bre1 L¨ow99], a slight generalization of Theorem 2.11 can be used to show the equivalence of the statements “every Σ12 -set of reals is s-measurable” and “every ∆12 -set of reals is s-measurable”. Here a set of reals is Σ12 iff it is the set of reals satisfying a fixed Σ12 -formula with a fixed additional real parameter. A set of reals is ∆12 iff the set and its complement are Σ12 . The class of Σ12 -sets can also be defined topologically (cf. [Jec78, p. 500]). The Σ11 -sets are the analytic sets, i.e., the projections of Borel sets, the Π11 -sets are the complements of Σ11 -sets, and the Σ12 -sets are the projections of Π11 -sets.
3
Consequences of the Sacks property
Before discussing consequences of the Sacks property, let us observe that in the definition of the Sacks property it is not essential that the size of C(n) is bounded by 2n . The number 2n could be replaced by any other non-decreasing unbounded function from ω to ω. This can be seen as follows: Let V0 and V1 be models of set theory with V0 ⊆ V1 . Let f, g : ω → ω\{0} be non-decreasing and unbounded functions in V0 . Suppose that every real in V1 is covered by an f -cone in V0 . In other words, suppose that V1 has the Sacks property over V0 with 2n replaced by f . We show that every new real in V1 is covered by a g-cone from V0 . Note that we may assume f (0) ≤ g(0) since every f -cone is the coordinatewise union of finitely many f -cones C with |C(0)| = 1.
ON SACKS FORCING AND THE SACKS PROPERTY
115
Let r : ω → ω be a real in V1 . In V0 choose a strictly increasing sequence (mn )n∈ω of natural numbers such that for every n ∈ ω, g(mn ) ≥ f (n). By our assumption f (0) ≤ g(0), we may assume m0 = 0. Identifying ω with ω <ω for a moment, we find an f -cone C : ω → [ω <ω ]<ℵ0 in V0 which covers (rmn )n∈ω . Let D : ω → [ω]<ℵ0 be defined as follows: For every n ∈ ω let D(n) := {s(n) : s ∈ C(n) ∧ n ∈ dom(s)}. Now for every n ∈ ω, if m is in the interval [mn , mn+1 ), then |D(m)| ≤ f (n) ≤ g(mn ) ≤ g(m). It follows that D is a g-cone in V0 covering r. The Sacks property implies that the new reals in the generic extension are very well controlled by ground model reals. For example, if V1 has the Sacks property over V0 , then the 2ℵ0 of V0 is still uncountable in V1 . This is because in V1 , countably many 2n -cones do not suffice to cover all of ω ω , but the 2n -cones from V0 do cover all of ω ω . We mention two more examples of this: If a forcing notion P has the Sacks property and G is P-generic over the ground model V , then the ideal N V [G] of measure zero sets in the generic extension V [G] is generated by the measure zero Borel sets from the ground model7 . Similarly, the ideal of nowhere dense subsets of 2ω in V [G] is generated by the nowhere dense Borel sets in V . Both of these consequences of the Sacks property follow from the following lemma. Lemma 3.1. Let V0 and V1 be models of set theory such that V0 ⊆ V1 and V0 and V1 have the same ordinals. Suppose that V1 has the Sacks property over V0 . Then a) for every measure zero set A ⊆ R in V1 there is a Borel set B ⊆ R in V0 of measure zero such that A ⊆ B holds in V1 and b) for every nowhere dense set A ⊆ 2ω in V1 there is a nowhere dense Borel set B ⊆ 2ω in V0 such that A ⊆ B holds in V1 . Proof. a) Let λ denote the Lebesgue measure on R. Recall that if A ⊆ R is of measure zero, then for every sequence (εn )n∈ω of positive real numbers there is a sequence (Un )n∈ωof finite unions of open intervals with rational endpoints such that A ⊆ n∈ω Un and for all n ∈ ω, λ(Un ) < εn . We do some preparations in V0 . Let (On )n∈ω be an enumeration of all finite unions of open intervals with rational endpoints, let e : ω × ω → ω 7
Whether a Borel code codes a measure zero set is absolute, that is, it does not depend on the model of set theory in which we evaluate the Borel code. The analogue is true for Borel codes of nowhere dense sets and of meager sets. Cf. [Jec78, Lemma 42.4].
116
STEFAN GESCHKE AND SANDRA QUICKERT
be a bijection, and fix a matrix (εij )i,j∈ω of positive real numbers such that for all i ∈ ω, 1 εij < i . 2 j∈ω Now for every i ∈ ω there is fi ∈ ω ω in V1 such that A ⊆ j∈ω Ofi (j) and for every j ∈ ω, 2e(i,j) · λ(Ofi (j) ) < εij . Consider the real r : ω → ω defined by r(n) := m iff m = fi (j) and n = e(i, j). By the Sacks property of P there is a 2n -cone C : ω → [ω]<ℵ0 in V0 that covers r. By the definition of r, A is a subset of {Om : m ∈ C(e(i, j)) ∧ 2e(i,j) · λ(Om ) < εij }. B := i∈ω j∈ω
Since only ground model parameters are used in the definition of B, B is a Borel set coded in the ground model. For all i, j ∈ ω, the set C(e(i, j)) has at most 2e(i,j) elements. It follows that the measure of {Om : m ∈ C(e(i, j)) ∧ 2e(i,j) · λ(Om ) < εij } is not greater than εij . Therefore, and by the choice of (εij )i,j∈ω , for every i ∈ ω, 1 {Om : m ∈ C(e(i, j)) ∧ 2e(i,j) · λ(Om ) < εij } < i . λ 2 j∈ω It follows that B is of measure zero. b) Note that if A ⊆ 2ω is nowhere dense, then for all n ∈ ω there is m > n and t : [n, m) → 2 such that {x ∈ 2ω : t ⊆ x} is disjoint from A. (Here [n, m) denotes the interval {n, . . . , m − 1} in ω.) This can be seen as follows: Let n ∈ ω and let (si )i<2n be an enumeration of 2n . Since A is nowhere dense, there is a sequence (ti )i<2n in 2<ω such that for all i < 2ω the set {x ∈ 2ω : s i t0 . . . ti ⊆ x} is disjoint from A (recall that st is the concatenation of the two finite sequences s and t). Now it is easy to check that t := t 0 . . . t2n has the desired property. Using this observation we can choose a strictly increasing sequence (ni )i∈ω of natural numbers and a sequence (ti )n∈ω such that for all i ∈ ω, ti ∈ 2[ni ,ni+1 ) and {x ∈ 2ω : ti ⊆ x} is disjoint from A. In other words, letting z := t 0 t1 . . . we have the following: For all i ∈ ω the set {x ∈ 2ω : x[ni , ni+1 ) = z[ni , ni+1 )} is disjoint from A. This property of z
ON SACKS FORCING AND THE SACKS PROPERTY
117
does not change if we replace the sequence (ni )i∈ω by another strictly increasing sequence (mi )i∈ω , provided for every i ∈ ω there is j ∈ ω such that [nj , nj+1 ) ⊆ [mi , mi+1 ). By the Sacks property, there is a 2n cone C : ω → [ω]<ω in V0 covering i → ni . Putting m0 := max(C(0)) and mi+1 := max(C(mi + 1)) for every i ∈ ω, we obtain sequence in V0 such that for every i ∈ ω the set {x ∈ 2ω : x[mi , mi+1 ) = z[mi , mi+1 )} is disjoint from A. Now let (ji )i∈ω be a sequence in V0 of natural numbers such that for all i ∈ ω, ji+1 − ji > 2i . By the Sacks property, there is a 2n -cone C : ω → [2<ω ]<ℵ0 in V0 covering i → zmji+1 . We may assume that for every i ∈ ω every t ∈ C(i) has dom(t) = mji+1 . Since ji+1 − ji > 2i and |C(i)| ≤ 2i , there is ui : [mji , mji+1 ) → 2 such that for every t ∈ C(i) there is k ∈ [ji , ji+1 ) such that t agrees with ui on [mk , mk+1 ). Since zmji+1 ∈ C(i), no x ∈ A extends ui . The sequence (ui )i∈ω can be chosen in V0 . Now the set {x ∈ 2ω : ∃i ∈ ω(ui ⊆ x)} is coded in V0 and disjoint from A in V1 . Its complement B is a nowhere dense Borel set in the ground model which covers A in V1 . q.e.d. The last lemma implies that forcing notions with the Sacks property neither add Cohen reals nor random reals. Moreover, we can compute the cofinality of N , the least size of a family F ⊆ N such that for all A ∈ N there is B ∈ F with A ⊆ B, in forcing extensions with the Sacks property over the ground model. Corollary 3.2. Let V0 and V1 be models of set theory such that V0 ⊆ V1 , V0 and V1 have the same ordinals, and V1 has the Sacks property over V0 . Then in V1 the cofinality of N (ordered by ⊆) is at most |(2ℵ0 )V0 | where (2ℵ0 )V0 denotes the ordinal which is the size of 2ω in V0 . In particular, if V0 satisfies CH, then the cofinality of N is ℵ1 in V1 . Proof. By Lemma 3.1, the Borel measure zero sets coded in V0 are cofinal in N V1 . In V0 there is a bijection between 2ℵ0 and all Borel subsets of 2ω . Since (2ℵ0 )V0 is an ordinal, and since V0 and V1 have the same ordinals, in V1 there is a bijection between |(2ℵ0 )V0 | and all Borel sets in 2ω which are coded in V0 . Therefore, in V1 the cofinality of N is at most |(2ℵ0 )V0 |. Now suppose that V0 satisfies CH. Then V1 |= |(2ℵ0 )V0 | ≤ ℵ1 . But the cofinality of N is uncountable and therefore it must be ℵ1 in V1 . q.e.d.
118
STEFAN GESCHKE AND SANDRA QUICKERT
The cofinality of N is the largest cardinal that appears in Cicho´n’s diagram (cf. [Bar2 Jud95, Chapter 2]). Thus, Corollary 3.2 implies that if one starts from a model of CH and then adds a P-generic filter over the ground model for some forcing notion P which has the Sacks property, then one ends up with a model where all the cardinals in Cicho´n’s diagram are ℵ1 . Of course this is only interesting if there is a forcing notion with the Sacks property that increases 2ℵ0 . We will go back to this when we discuss iterated forcing. The cardinals in Cicho´n’s diagram, as well as other cardinal characteristics of the continuum, are studied in depth in [Bar2 Jud95]. In this book the values of many cardinal characteristics of the continuum are computed in various models of set theory. Corollary 3.2 and further characterizations of the Sacks property for proper forcing notions can be found there. For a nice overview of cardinal characteristics of the continuum see [Bla1 ∞].
4
Variants of Sacks forcing
We give a brief overview of some relatives of Sacks forcing. Except for forcing with finitely branching trees, all the forcing notions we mention in this section are discussed in [Bar2 Jud95]. All the variants of Sacks forcing in this section satisfy Axiom A and have the Laver property, a weakening of the Sacks property. A forcing notion P has the Laver property iff for every P-generic filter G over the ground model V and every r : ω → ω in V [G] the following holds: If r is bounded by a ground model real, i.e., if there is b : ω → ω in V such that for all but finitely many n ∈ ω we have r(n) ≤ b(n) in V [G], then r is covered by a 2n -cone from the ground model. Since the Sacks property implies that every r : ω → ω in the generic extension is bounded by some ground model real, the Sacks property is equivalent to the Laver property together with the property that every element of ω ω in the extension is bounded by a ground model real. A forcing notion with the latter property is called ω ω -bounding8 . Random real forcing and Cohen forcing do not have the Laver property, because the Laver property implies that any new real in the generic extension can be covered by a meager set of measure zero. In particular, forcing notions with the Laver property neither add random nor Cohen reals. Ex8
For example, random real forcing is ω ω -bounding, Cohen forcing is not.
ON SACKS FORCING AND THE SACKS PROPERTY
119
cept for Mathias forcing, all the variants of Sacks forcing in this section give minimal generic extensions like Sacks forcing does. We first summarize the properties of Sacks forcing S. Sacks forcing consists of all perfect subtrees of 2<ω , ordered by inclusion. It has the Sacks property (even the 2-localization property) and preserves p-points. Moreover, if ϕ is a Π13 -statement which holds in some model of set theory with a non-constructible real, then it holds in L[G] where G is S-generic over L. Forcing with finitely branching trees. There are some forcing notions which are very closely related to Sacks forcing in that the conditions are also finitely branching trees with certain perfectness properties. For example, for n ∈ ω with n ≥ 2 a subtree p of n<ω is n-perfect iff every s ∈ p has an extension which has exactly n immediate successors in p. The forcing notion consisting of n-perfect trees ordered by inclusion has the n-localization property (which is obtained from the 2-localization property by replacing “binary tree” by “n-ary tree” where n-ary trees are those trees in which every node has at most n immediate successors) but not the (n − 1)-localization property [NewRos93]. Note that the nlocalization property implies the Sacks property. Laver forcing L [Lav76] consists of all subtrees p of ω <ω such that almost all s ∈ p have infinitely many immediate successors in p. The set L is ordered by inclusion. Laver forcing adds a dominating real, i.e., a function d : ω → ω which bounds all ground model functions from ω to ω. This implies that Laver forcing destroys every ultrafilter from the ground model. If G is L-generic over the ground model V , then in V [G] the set R ∩ V has outer measure one. Modifying Mansfield’s proof of Theorem 2.11, one can show that if ϕ is a Π13 -statement which holds in a universe with a dominating real over L, then ϕ holds in L[G] where G is L-generic over L. For this notice that if there is a dominating real over L, then there is a so-called strongly dominating real over L (cf. [Gol1 +95]). Moreover, if A is a Σ12 set in constructible parameters which contains a strongly dominating real over L, then A contains all the branches of a constructibly coded Laver condition. This can be shown using ideas from [Gol1 +95]. Miller forcing or rational perfect set forcing RP [Mil0 84] consists of all subtrees p of ω <ω such that every s ∈ p has an extension t ∈ p which has infinitely many immediate successors in p. The elements of RP are sometimes called superperfect trees. Miller forcing is ordered by
120
STEFAN GESCHKE AND SANDRA QUICKERT
inclusion. It preserves p-points and therefore does not add a dominating real. If G is RP-generic over the ground model V , then R ∩ V is neither meager nor of measure zero in V [G]. As for Laver forcing, a variant of Theorem 2.11 is true for Miller forcing. A real r : ω → ω is unbounded over a class C iff it is not bounded by a function in ω ω ∩ C. Using ideas from [Kec77] it is possible to show that every constructibly coded Σ12 -set which contains an unbounded real over L contains all the branches of a constructibly coded Miller condition. This implies that every Π13 -statement which holds in a model of set theory with an unbounded real over L also holds in every model of the form L[G] where G is RP-generic over L. Mathias forcing M [Mat77] consists of all pairs (s, A) such that s ⊆ ω is finite and A ⊆ ω\ s is infinite. The set M is ordered as follows: (s, A) ≤ (t, B) iff t ⊆ s, A ⊆ B and s\t ⊆ B. Mathias forcing adds a dominating real. If G is M-generic over the ground model V , then R ∩ V is a meager set of measure zero in V [G]. Of course, we can only mention the most popular relatives of Sacks forcing. More forcing notions which use trees as conditions, usually called tree forcings, are studied in [Bre1 95]. A general framework for constructing tree forcings (creature forcing with tree creatures) was developed by Rosłanowski and Shelah in [RosShe99]. Some forcing notions can be considered as suborders of Sacks forcing, but have not been mentioned in this section since they seem to be based on a different philosophy. Cohen forcing is an example of this: Every Cohen condition p, a finite partial function from ω to 2, determines a perfect tree Tp := {q ∈ 2<ω : p ∪ q is a function}. The mapping p → Tp is an embedding of Cohen forcing into Sacks forcing, but this embedding is obviously not a complete embedding. In the same way, Silver forcing and Grigorieff forcing can be considered as suborders of Sacks forcing.9 However, Silver forcing and Grigorieff forcing are much closer to Sacks forcing than Cohen forcing is since they use as their conditions partial functions from ω to 2 whose domains may be infinite. They are both proper. Grigorieff forcing is ω ω -bounding, but Silver 9
Cf. [She98, VI.4] for the definitions of these two forcing notions. Shelah used Grigorieff forcing to construct a model of set theory without p-points.
ON SACKS FORCING AND THE SACKS PROPERTY
121
forcing is not. Note that the Silver forcing in [She98, VI.4] is different from the similar Pˇr´ıkr´y-Silver forcing as defined in [Bau83, Section 7]. Pˇr´ıkr´y-Silver forcing consists of all functions from co-infinite subsets of ω to 2 and is ordered by reverse inclusion. Pˇr´ıkr´y-Silver forcing satisfies Axiom A. In the next section we mention some results about possible c.c.c. relatives of Sacks forcing.
5
The Sacks property and the c.c.c.
In the past many people investigated the question how close or how far away a c.c.c. forcing notion can be from Cohen forcing or random real forcing. Here “being close” simply means that Cohen forcing, respectively random real forcing can be completely embedded into the given forcing notion. If so, then such a forcing also adds a Cohen, respectively random real. One instance of this question, namely the question whether random real forcing completely embeds into every ω ω -bounding forcing notion that is c.c.c. goes back to the Scottish Book [Mau81] problem of von Neumann about measure algebras. Since the Sacks property implies that neither Cohen nor random reals are added, it is a good test question to ask when a c.c.c. forcing can have the Sacks property. Concerning reasonably definable partial orders, Shelah showed in [She94] that no c.c.c. Souslin forcing has the Laver property where a forcing notion is Souslin iff its underlying set, the order, and the incompatibility relation are analytic, i.e., Σ11 . So in particular, a c.c.c. Souslin forcing cannot have the Sacks property. Moreover, Shelah showed that if a c.c.c. Souslin forcing adds an unbounded real, then it adds a Cohen real. It is unknown whether there is a model of ZFC where every c.c.c. forcing adding an unbounded real adds a Cohen real. There are some combinatorical statements which imply or negate the existence of c.c.c. forcings (not necessarily nicely definable) with the Sacks property. We mention the combinatorial statements involved and give a summary of results. ♦: There is a sequence (Aα )α<ω1 such that for all α < ω1 , Aα ⊆ α and for every A ⊆ ω1 the set {α < ω1 | A ∩ α = Aα } is stationary. In [Jen70], Jensen constructed a c.c.c. forcing with the Sacks property from ♦. The principle ♦, which implies CH, is used in an inductive construction of length ω1 to predict initial parts of antichains of the final
122
STEFAN GESCHKE AND SANDRA QUICKERT
forcing notion already at a countable stage of the construction. During the construction one makes sure that the predicted antichains cannot be extended to uncountable antichains in the final forcing notion, so that the final forcing notion becomes c.c.c. This is similar to the standard construction of a Souslin tree from ♦ as presented in [Kun80, Theorem 7.8]. CCC(S): For every family D of size at most 2ℵ0 of dense subsets of Sacks forcing S there is a c.c.c. suborder P of S such that D ∩ P is dense in P for each D ∈ D. This principle together with a large size of the continuum was shown to be consistent with ZFC by Veliˇckovi´c [Vel91]. From CCC(S) one can easily extract a c.c.c. suborder of S which has the Sacks property. OCA (Open Coloring Axiom): Let X be a subset of the reals and ˙ 1 suppose that K0 and K1 are disjoint subsets of [X]2 with [X]2 = K0 ∪K 2 such that K0 is open in the topology that [X] inherits from the product topology on X 2 . Then either there is an uncountable Y ⊆ S such that [Y ]2 ⊆ K0 , or S = i∈ω Yi where [Yi ]2 ⊆ K1 for every i ∈ ω. From this principle, which implies that the size of the continuum is at least ℵ2 , Veliˇckovi´c [Vel02] deduced that no c.c.c. forcing adding a new real has the Sacks property. Principle ( ) for κ: Let I be a p-ideal on [κ]≤ℵ0 where I ⊆ ([κ]≤ℵ0 ) is a p-ideal if for every countable set C ⊆ I there is X ∈ I such that every member of C is almost included in X. Then either there is ωan uncountable ω Y ⊆ κ such that [Y ] ⊆ I, or κ = i∈ω Yi where [Yi ] ∩ I = ∅ for every i ∈ ω. The second author showed in [Qui02] that this principle also implies that no c.c.c. forcing has the Sacks property. Since ( ) restricted to κ = ℵ1 is consistent with CH [AbrTod97], this also shows that CH is not strong enough to decide the existence of c.c.c. forcings with the Sacks property. The case of the Laver property is different: CH already implies the existence of a c.c.c. forcing with the Laver property. In [She01] it is stated that Mathias forcing defined relatively to a Ramsey ultrafilter is an example (cf. [Mat77] for the definition of Mathias forcing relative to a Ramsey ultrafilter). A model where such forcings do not exist was found by Shelah [She01]. He showed that the countable support iteration of Mathias forcing of length ω2 yields such a model. To conclude this section, we list some questions in this area.
ON SACKS FORCING AND THE SACKS PROPERTY
123
1. Which combinatorical principles imply or deny the existence of a c.c.c. forcing preserving p-points? 2. Are there forcing notions not preserving p-points, but not adding splitting reals? 3. Which combinatorial principles imply or deny the existence of a c.c.c. forcing not adding splitting reals? The motivation for the second question is to look for less “brutal” methods to destroy ultrafilters than adding a splitting real. The other questions are variants of von Neumann’s problem. As mentioned before, both Cohen and Random real forcing add a splitting real and therefore destroy every ultrafilter in the ground model.
6
Products of Sacks forcing
If one wants to add many Sacks reals to a model of set theory, for example to increase 2ℵ0 , then there are essentially two possibilities: The first possibility is to use some kind of side-by-side product of many copies of S to add many Sacks reals over ground model at once (in a parallel way). The second possibility is to iterate Sacks forcing, i.e., to add one Sacks real after the other. We will describe the second possibility in the next section. Forcing with side-by-side products has two advantages over the iteration: First, it is technically easier to deal with and second, it is possible to construct models with an arbitrarily large size of the continuum. The iteration, at least the countable support iteration, which is discussed in the next section, only gives models where the size of the continuum is at most ℵ2 . However, the countable support iteration of Sacks forcing usually yields stronger consistency results than forcing with side-by-side products. Definition 6.1. The countable support (side-by-side) product P of a family (Qi )i∈I of forcing notions is the set of all p ∈ i∈I Qi with countable support supt(p) := {i ∈ I : p(i) = 1Qi }. The set P is ordered componentwise, i.e., for p, q ∈ P, p ≤ q iff for all i ∈ I, p(i) ≤ q(i). Of course one can define products with finite supports or with uncountable supports. However, countable supports are what is appropriate for forcing notions satisfying Axiom A. Something that all reasonable products P of a family (Qi )i∈I (like the finite support product and the
124
STEFAN GESCHKE AND SANDRA QUICKERT
countable support product) have in common, is the following: For every i ∈ I the embedding ei : Qi → P that maps every q ∈ Qi to the sequence which has q as its i-th coordinate and is 1 everywhere else is a complete embedding. In particular, if G is P-generic over the ground model V , then for every i ∈ I, e−1 [G] is Qi -generic over V . In other words, if P is the countable support product of κ copies of Sacks forcing, then P adds κ Sacks reals. An easy density argument shows that the Sacks reals added by different factors of P are indeed different. Under CH, countable support products of copies of Sacks forcing do not collapse cardinals [Bau85]. The proof of this fact consists of two parts: First one shows that ℵ1 is not collapsed by a countable support product of copies of Sacks forcing. Then one uses CH to show that in countable support products of forcing notions of size at most ℵ1 all antichains are of size at most ℵ1 . This can be done using a simple ∆-system argument. It follows that no cardinal above ℵ1 is collapsed (cf. [Kun80, Lemma 6.9]). We only show that countable support products of copies of Sacks forcing do not collapse ℵ1 . In the proof presented here we get the 2-localization property more or less for free. The following lemma is essentially taken from [NewRos93]. Lemma 6.2. Countable support products of copies of Sacks forcing have the 2-localization property. In particular, they have the Sacks property and do not collapse ℵ1 . The proof of this lemma follows closely the proof of Lemma 2.7. But we have to deal with countably supported κ-sequences of conditions in S this time. For this we need a lot of notation. Fix a cardinal κ and let P be the countable support product of κ copies of S.10 In order to show that P has the 2-localization property, we have to extend the notions “fusion” and “fusion sequence” to P. For this it is convenient to pass from our original definition of fusion sequences in S to a more flexible one. Now a sequence (pn )n∈ω in S is a fusion sequence iff there is a nondecreasing unbounded function f : ω → ω such that for all n ∈ ω, pn+1 ≤f (n) pn . It is easily checked that a sequence (pn )n∈ω which is a fusion sequence according to the new definition has a subsequence which is a fusion sequence according to the old definition. If (p n )n∈ω is a fusion sequence in according to the new definition, its fusion n∈ω pn is again a condition in S, as before. 10
The elements of P are functions from κ to S or equivalently, κ-sequences of elements of S.
ON SACKS FORCING AND THE SACKS PROPERTY
125
Definition 6.3. For a finite set F ⊆ κ and η : F → ω let the relation ≤F,η on P be defined as follows: For all p, q ∈ P let p ≤F,η q if p ≤ q and all α ∈ F , p(α) ≤η(α) q(α). For p, F , and η as before and σ ∈ for η(α) let p ∗ σ be such that for all α ∈ F , (p ∗ σ)(α) = p(α) ∗ σ(α) α∈F 2 and for all α ∈ κ\F , (p ∗ σ)(α) = p(α). A sequence (pn )n∈ω of conditions in P is a fusion sequence if there is an increasing sequence (Fn )n∈ω of finite subsets of κ and a sequence (ηn )n∈ω such that for all n ∈ ω, ηn : Fn → ω, for all i ∈ Fn we have ηn (i) ≤ ηn+1 (i), pn+1 ≤Fn ,ηn pn , and for all α ∈ supt(pn ) there is m ∈ ω such that α ∈ Fm and ηm (α) ≥ n. If (pn )n∈ω is a fusion sequence in P, then for each α ∈ κ, (pn (α))n∈ω is a fusion sequence in S or constant with value 1S . This shows that pn (α) pω := n∈ω
α∈κ
is a condition in P, the fusion of the sequence (pn )n∈ω . For the proof of Lemma 6.2 it is enough to show Lemma 6.4. For every p ∈ P and every name z˙ for an element of ω ω there are a condition q ≤ p and a binary tree T ⊆ ω <ω such that q forces z˙ to be a branch of T . Proof. Let p and z˙ be as above. Our goal is to construct q ≤ p such that ˙ = {s ∈ ω <ω : ∃r ≤ q(r sˇ ⊆ z)} ˙ Tq (z) is binary. We may assume that no condition below p decides all of z. ˙ Let F be a finite subset of κ and let η : F →ω be a function. A condition r ∈ P is (F, η)-faithful iff for all σ, τ ∈ α∈F 2η(α) and all r ≤F η r the following holds: If z˙r ∗σ and z˙r ∗τ are incomparable elements of ω <ω (with respect to ⊆), then already z˙r∗σ and z˙r∗τ are incomparable. Claim 1. Suppose r ∈ P is (F, η)-faithful. a) Let j ∈ κ\F and define F := F ∪ {j} and η := η ∪ {(j, 0)}. Then r is (F , η )-faithful. b) Let β ∈ F , and let η be such that for all α ∈ F \{β}, η (α) = η(α) and η (β) = η(β) + 1. Then there is a condition r ≤F,η r such that r is (F, η )-faithful.
126
STEFAN GESCHKE AND SANDRA QUICKERT
Part a) of this claim follows immediately fromthe definitions. For the proof of b) fix an enumeration {σ1 , . . . , σm } of i∈F 2η(i) . We define a ≤F,η -descending sequence (rk )k≤m below r. For k ∈ {1, . . . , m} and l ∈ 2 let σkl be such that for all α ∈ F \{β}, σkl (α) = σk (α) and σkl (β) = σk (β)l. Let r0 := r. Suppose we have constructed rk for some k < m 0 and we want to build rk+1 . Let rk+1 ≤F,η rk be such that z˙rk+1 ∗σk+1 1 and z˙rk+1 ∗σk+1 are incomparable if such a condition exists. Otherwise, let rk+1 := rk . Finally, let r := rm . It follows from the construction that r is (F, η )-faithful. This shows part b) of the claim. Using a) and b) of Claim 1 together with some bookkeeping, we construct a sequence (pm , Fm , ηm , jm )m∈ω such that 1. (pm )m∈ω is a fusion sequence in P witnessed by (Fm , ηm )m∈ω where F0 = η0 = ∅, 2. for all m ∈ ω, pm is (Fm , ηm )-faithful and pm ≤ p, 3. for all m ∈ ω and all i ∈ Fm , ηm+1 (i) ≤ ηm (i) + 1, 4. for all m ∈ ω, jm is the unique element of Fm+1 such that either jm ∈ Fm and ηm+1 (jm ) = ηm (jm )+1 or jm ∈ Fm and ηm+1 (jm ) = 1, and 5. for all m ∈ ω and all σ ∈ i∈Fm 2ηm (i) , pm ∗ σ decides at least zm. ˙ ˙ Note Let q be the fusion of the pm , m ∈ ω, and let T := Tq (z). that for all m ∈ ω, q ≤Fm ,ηm pm . For each m ∈ ω let Tm+1 be the tree generated by z˙pm ∗σ : σ ∈ 2ηm (i) , i∈Fm
and let T0 be the “tree” generated by z˙p0 . Now each Tm is a subtree of T . We show that T = m∈ω Tm . Let t ∈ T and m := dom(t). Let r ≤ q be a condition which forces that t is an initial segment of z. ˙ The set {q ∗ σ : σ ∈ i∈Fm 2ηm (i) } is a maximal antichain below q. It follows that there is σ ∈ i∈Fm 2ηm (i) such that r is compatible with q ∗ σ. Since q ∗ σ ≤ pm ∗ σ, r is compatible with pm ∗ σ. This implies t ⊆ z˙pm ∗σ and thus t ∈ Tm . It remains to prove that T is binary. We first show that for every m ∈ ω and t ∈ Tm , if there is s ∈ T such that s and t are incomparable, then there is s ∈ Tm such that s ⊆ s and already s and t are incomparable. Let s, t, and m be as before. Let r ≤ q be a condition which forces s to be an initial segment of z. ˙ Then there is σ ∈ i∈Fm 2ηm (i) such that r is
ON SACKS FORCING AND THE SACKS PROPERTY
127
compatible with pm ∗ σ. Let τ ∈ i∈Fm 2ηm (i) be such that t ⊆ z˙pm ∗τ . Clearly, σ = τ . Let m ≤ m minimal such that σ := (σ(α)ηm (α))α∈Fm and
τ := (τ (α)ηm (α))α∈Fm
are different. Now s := z˙pm ∗σ and z˙pm ∗τ are incomparable by the (Fm , ηm )-faithfulness of pm . Clearly, s ∈ Tm , and s and t are incomparable. Now suppose that T is not binary. Then there is t ∈ T such that t has three distinct immediate successors t0 , t1 , and t2 in T . Let m ∈ ω be minimal such that Tm contains at least two of the tj , j < 3. By what we proved Tm in fact contains all the tj . For every j < 3 let justηmbefore, (i) σj ∈ i∈Fm 2 be such that tj ⊆ z˙pm ∗σj . For every n ≤ m and every n j < 3 let σj := (σj (α)ηn (α))α∈Fn . Now let n ≤ m be minimal such that the set {σjn : j < 3} has at least two elements. Then {σjn : j < 3} has exactly 2 elements and n < m by the construction of Fn and ηn . Without loss of generality we may assume that σ0n and σ1n are different. Since m is minimal such that Tm contains at least two of the tj and since n < m, z˙pn ∗σ0n and z˙pn ∗σ1n are comparable. However, this contradicts the (Fn , ηn )-faithfulness of pn . q.e.d. Corollary 6.5. Let V be a model of set theory satisfying CH and suppose that κ is a cardinal V . Then there is a generic extension V [G] of V such that V [G] |= 2ℵ0 ≥ κ, V [G] has the same cardinals as V , and V [G] has the 2-localization property over V . In particular, it is consistent with an arbitrarily large size of the continuum that all cardinals in Cicho´n’s diagram are ℵ1 . Proof. In V let P be the countable support product of κ copies of Sacks forcing. By CH, P has no antichains of size ℵ2 and therefore does not collapse any cardinal above ℵ1 , as we mentioned before. By Lemma 6.2, P does not collapse ℵ1 . Let G be P-generic over V . Then V [G] and V have the same cardinals and V [G] has the Sacks property over V by Lemma 6.2. Now it follows from Corollary 3.2 that all the cardinals in Cicho´n’s diagram are ℵ1 . It remains to evaluate the size of the continuum in V [G]. As mentioned before, it is easily checked that the Sacks reals added by
128
STEFAN GESCHKE AND SANDRA QUICKERT
the different factors of P are pairwise different. It follows that 2ℵ0 is at least κ in V [G]. q.e.d. In order to determine the exact value of 2ℵ0 in V [G] in this proof, one has to analyze the construction of names for subsets of ω. For simplicity, let us assume that V satisfies the Generalized Continuum Hypothesis (GCH) and suppose that κ is of uncountable cofinality in V . Now it can be shown that in V there is a set N of size κ such that every subset of ω in V [G] has a name in N . This implies that in V [G], 2ℵ0 is exactly κ. Countable support products of Sacks forcing have been used by Baumgartner (cf. [Bau85]) to show the consistency of the total failure of Martin’s Axiom: The topological version of Martin’s Axiom can be stated as follows: No compact space without uncountable disjoint families of open sets is the union of < 2ℵ0 nowhere dense sets. The Baire Category Theorem implies that no compact space is the union of countably many nowhere dense sets. Martin’s Axiom fails totally if every compact space without uncountable disjoint families of open sets is the union of ℵ1 nowhere dense sets, but 2ℵ0 > ℵ1 . Baumgartner’s article includes a nice introduction to countable support products of Sacks forcing.
7
Iteration of Sacks forcing
Side-by-side products of partial orders adding reals, such as the countable support products of Sacks forcing introduced in Section 6, have one major disadvantage when it comes to strong independence results: The reals added by the different factors of the product are independent over the ground model. That is, if P is the countable support product of κ copies of Sacks forcing, G is P-generic over the the ground model V , and for each α < κ, xα denotes the Sacks real added by the α-th factor of P, then for every set A ∈ V with A ⊆ κ we have (2ω )V [(xα )α∈A ] ∩ (2ω )V [(xα )α∈κ\A ] = (2ω )V . Recall that officially we only defined V [X] if X is a set of ordinals. But it should be clear how to code (xα )α∈A and (xα )α∈κ\A by sets of ordinals. In Section 6 we showed that forcing with countable support products of copies of Sacks forcing over models of CH produces models of set theory in which many cardinal characteristics of the continuum, such as the cardinals in Cicho´n’s diagram, are small (i.e., equal to ℵ1 ), while the
ON SACKS FORCING AND THE SACKS PROPERTY
129
size of the continuum is big. However, there is one class of cardinal characteristics of the continuum whose members are still big in such models. Let X be a set and f : X → X. A point (x, y) ∈ X 2 is covered by f iff f (x) = y or f (y) = x. A family F of functions from X to X covers A ⊆ X 2 iff every point in A is covered by some member of F. One might ask how many continuous functions are needed to cover 2 R . As it turns out, the number of continuous functions needed to cover (2ω )2 is the the same as for R2 [GesGol1 Koj∞]. In the present context it is more convenient to consider 2ω instead of R. Let V be a model of GCH and suppose that κ > ℵ1 is a cardinal of uncountable cofinality in V . Let P be the countable support product of κ copies of Sacks forcing and let G be P-generic over V . For α < κ let xα denote the Sacks real added by the α-th factor of P, as above. Suppose that in V [G], F is a family of size < κ of continuous functions from 2ω to 2ω . Since every continuous function from 2ω to 2ω is in fact a closed subset of (2ω )2 , it can be coded (for example, using Borel codes) by a subset of ω. Now it can be shown that there is a set A ∈ V such that A is of size < κ in V and F ⊆ V [(xα )α∈A ]. Let β, γ ∈ κ\A. Since xγ ∈ V [(xα )α∈A∪{β} ] and xβ ∈ V [(xα )α∈A∪{γ} ], no function from F covers (xβ , xγ ). It follows that in V [G], F does not cover (2ω )2 . Since F was an arbitrary family of size < κ of continuous functions on 2ω , we have shown that in V [G], (2ω )2 cannot be covered by less than κ continuous functions. So, if we want to show that it is consistent that (2ω )2 can be covered by less than 2ℵ0 continuous functions, then we have to look for a different construction of models of set theory. We have to iterate Sacks forcing. The fundamental paper about iterated Sacks forcing is [BauLav79]. ˙ be a P-name for a Definition 7.1. Let P be a partial order and let Q ˙ is the partial order consisting of all pairs (p, q) partial order. Then P ∗ Q ˙ ˙ The set P ∗ Q ˙ is such that p ∈ P and q˙ is a P-name for an element of Q. ˙ let (p0 , q˙0 ) ≤ (p1 , q˙1 ) iff ordered as follows: For (p0 , q˙0 ), (p1 , q˙1 ) ∈ P ∗ Q p0 ≤ p1 and p0 q˙0 ≤ q˙1 . ˙ Let 1˙ Q be a P-name There is an obvious embedding e : P → P ∗ Q: ˙ for the largest element of Q. For every p ∈ P let e(p) := (p, 1˙ Q ). It is easily checked that e is a complete embedding. In particular, if G is ˙ P ∗ Q-generic over the ground model V , then G0 := e−1 [G] is P-generic
130
STEFAN GESCHKE AND SANDRA QUICKERT
˙G over V . In V [G] let G1 := {q˙G0 : ∃p ∈ P((p, q) ˙ ∈ G)}. Then G1 is Q 0 generic over V [G0 ]. On the other hand, if G0 is P-generic over V and G1 ˙ G -generic over V [G0 ], then {(p, q) ˙ : p ∈ G0 ∧ q˙G ∈ G1 } is Q ˙ ∈ P∗Q 0 0 ˙ is P ∗ Q-generic over V . ˙ is the same as first adding a PThis shows that forcing with P ∗ Q ˙ generic filter G0 and then adding a QG0 -generic filter over it. The point of iterated forcing is that we can add generics over each other using a single forcing notion in the ground model. The iteration becomes really relevant only when we want to add infinitely many generics over each other. Definition 7.2. Let δ be an ordinal. A countable support iteration of ˙ α )α<δ ) with the following length δ is an object of the form ((Pα )α≤δ , (Q properties: 1. (Pα )α≤δ is a sequence of forcing notions and P0 is the trivial forcing notion containing just one element, ˙ α is a Pα -name ˙ α )α<δ is a sequence such that for each α < δ, Q 2. (Q for a forcing notion (more precisely, a name for a set together with a name for an ordering on that set), ˙ α , and 3. for all α < δ, Pα+1 = Pα ∗ Q 4. if β ≤ δ is a limit ordinal, then Pβ consist of all functions q with domain β such that ˙ α and qα ∈ (a) for all α < β, q(α) is a Pα -name for a condition in Q Pα and (b) the support supt(q) := {α < β : 1Pα q(α) = 1Q˙ α } of q is ˙ α) countable (where 1Q˙ α is a Pα -name for the largest element of Q and Pβ is ordered as follows: for p, q ∈ Pβ we have p ≤ q iff for all α < β, pα Pα p(α) ≤ q(α). This definition deserves several remarks. First, it should be clear how one has to change the definition in order to obtain the definition of finite support iteration: Just replace “countably many” in 4b by “finitely many”. While finite supports are suitable for iterating c.c.c. forcings since finite support iterations of c.c.c. forcings are again c.c.c., countable supports are appropriate for Axiom A forcings since with countable supports it is possible to extend the fusion technology used to deal with Axiom A forcings to the iteration.
ON SACKS FORCING AND THE SACKS PROPERTY
131
˙ α )α<δ ) is already deterA countable support iteration ((Pα )α≤δ , (Q ˙ α . Therefore, slightly abusing notation, we will frequently mined by the Q ˙ α , α < δ. For all α, β ≤ δ call Pδ the countable support iteration of the Q with α < β there is a natural embedding eαβ : Pα → Pβ mapping every p ∈ Pα to p where p α = p and for all γ ∈ [α, β), p (γ) = 1Q˙ γ . The eαβ are complete embeddings. Via eαδ we may consider Pα as a subset of Pδ . Let G be a Pδ -generic filter over the ground model V . Then for every α < δ, Gα := G ∩ Pα is Pα -generic over V . Thus, we have an increasing chain V ⊆ V [G1 ] ⊆ · · · ⊆ V [Gα ] ⊆ · · · ⊆ V [G] of models of set theory. For every α < δ, V [G] is a generic extension of V [Gα ], obtained by adding a Q-generic filter over V [Gα ], where Q ∈ V [Gα ] is the so-called quotient of Pδ over Gα . The partial order Q can ˙ α )α<δ ) and Gα and is be constructed in a natural way from ((Pα )α≤δ , (Q 11 a countable support iteration of length δ − α. The countable support iteration of Sacks forcing of length δ is the ˙ α )α<δ ) is a countable support iterforcing notion Sδ where ((Sα )α≤δ , (Q ˙ ation such that for every α < δ, Qα is an Sα -name for Sacks forcing. The Sacks model is a model of set theory of the form V [G] where G is Sω2 -generic over V and V is a model of CH. The formulation “the Sacks model” is slightly misleading, since the model is not uniquely determined. However, Sω2 -generic extensions of models of CH are sufficiently similar to each other to be considered the same most of the time. If the ground model satisfies CH, then Sω2 has no antichains of length ℵ2 (cf. [BauLav79] or, for a more general result about proper forcing, [She98, Chapter III, Theorem 4.1]) and therefore does not collapse any cardinal ≥ ℵ2 . Since countable support iterations of Sacks forcing have the 2-localization property, as we will show in a moment, they do not collapse ℵ1 . It follows that forcing with Sω2 over models of CH does not collapse cardinals. Since Sω2 has the 2-localization property, it follows from Corollary 3.2 that all the cardinals in Cicho´n’s diagram are ℵ1 in the Sacks model. If G is Sδ -generic over the ground model V and α ≤ δ, then the quotient Q of Sδ over G ∩ Sα is (equivalent to) the countable support iteration of Sacks forcing of length δ−α in V [Gα ]. Moreover, V [Gα+1 ] = V [Gα ][xα ] where xα is the Sacks real added by the last factor of the 11
Cf. [Gol1 93, Section 4] for more on quotient forcing.
132
STEFAN GESCHKE AND SANDRA QUICKERT
iteration Sα+1 . Using this notation we see that for all β ≤ δ, V [Gβ ] = V [(xα )α<β ]. As in the case of countable support products, an easy density argument shows that the xα are pairwise different. Analyzing Sω2 -names for subsets of ω one can see that the size of the continuum does not exceed ℵ2 in the Sacks model. It follows that the size of the continuum is exactly ℵ2 in the Sacks model. We already mentioned that countable support iterations of Sacks forcing cannot be used to construct models of set theory where 2ℵ0 > ℵ2 . This is because countable support iterations of nontrivial forcing notions of some length with cofinality ℵ1 collapse the size of the continuum to ℵ1 .12 Thus, if δ is an ordinal with cofinality > ℵ1 and G is Sδ -generic over the ground model V , then for cofinally many α < δ, V [Gα ] is a model of CH. It follows that 2ℵ0 cannot be bigger than ℵ2 in V [G]. The Sacks model is generally known as the model of set theory in which all reasonable cardinal characteristics of the continuum (that can consistently be smaller than 2ℵ0 ) are ℵ1 . Recently this has been turned into a provable statement by Zapletal (cf. [Zap04]). For a large class of combinatorial cardinal characteristics of the continuum he showed that if there is a forcing extension of the ground model V in which the cardinal characteristic in question is smaller than 2ℵ0 , then this cardinal is ℵ1 in V [G] where G is S(2ℵ0 )+ -generic over V . The model V [G] actually is “the” Sacks model. Even if V is not a model of CH, forcing with Sω1 , an initial part of S(2ℵ0 )+ , collapses the 2ℵ0 to ℵ1 . Now the (2ℵ0 )+ of V is ℵ2 . The remaining part of the iteration is just forcing with Sω2 over a model of CH. The proof of the following theorem illustrates the main techniques for proving statements about the Sacks model. The theorem together with Lemma 7.4 summarizes a sizeable portion of what is known about this model of set theory. Theorem 7.3. In the Sacks model, (2ω )2 can be covered by ℵ1 continuous functions from 2ω to 2ω , while 2ℵ0 = ℵ2 . This theorem was explicitly shown in [Har2 vSt02]. However, implicitly it already follows from Miller’s [Mil0 83] about mapping sets of reals onto the reals and from the work of Groszek (cf. [Gro1 81]) showing that the constructible degrees of reals in a Sω2 -generic extension of L are 12
Cf. [Gol1 93, Section 0] for more details.
ON SACKS FORCING AND THE SACKS PROPERTY
133
wellordered of ordertype ω2 . Constructible degrees of reals are defined as follows: For x, y ∈ 2ω let x ≤ y iff x ∈ L[y]. The model L[y] can be constructed without forcing13 . A constructible degree is an equivalence class with respect to the relation ≤ ∩ ≥. The order ≤ induces an order on the constructible degrees, the constructibility order. Stronger versions of Theorem 7.3 can be found in [CiePaw∞], [Ges+02], and [Ste2 99]. There are more consistency results about the structure of constructible degrees of reals in models of set theory. Let us mention two examples. Groszek [Gro1 94] showed that it is consistent to have a reversed copy of ω1 as an initial segment of the order of constructible degrees. Kanovei and Zapletal [Kan1 Zap98] showed that there is a generic extension L[G] of L in which there are a strictly increasing sequence (an )n∈ω of constructible degrees and a constructible degree b such that the following holds: For two constructible degrees c, d let c, d denote the constructible degree which is the least upper bound of c and d. Then the sequence (an , b)n∈ω is strictly increasing and if x ∈ L[G] is not an element of any intermediate model of the form L[a0 , . . . , an , b], then L[G] = L[x]. We go back to the proof of Theorem 7.3. Since the number of ground model reals is ℵ1 in the Sacks model, the theorem follows immediately from the next lemma, which gives some additional information. Lemma 7.4. a) Let δ be an ordinal and suppose that G be Sδ -generic over the ground model V . For α < δ let Gα := G ∩ Sα . Then in V [G] the following holds: Let x ∈ 2ω and let α ≤ δ be the first ordinal such that x ∈ V [Gα ]. Then for all y ∈ 2ω ∩ V [Gα ] there is a continuous function f : 2ω → 2ω coded in V such that f (x) = y. In particular, (2ω )2 is covered by the continuous functions coded in the ground model. b) For every ordinal δ, Sδ has the 2-localization property and therefore has the Sacks property. Our proof of a) will be more or less the same as the proof of a similar statement in [GesGol1 Koj∞], which talks about slightly different forcing notions. In order to find the required continuous functions in the ground model, we first observe that for 2ω an extension theorem similar to the Tietze-Urysohn Theorem holds. 13
Cf. [Jec78, Section 15].
134
STEFAN GESCHKE AND SANDRA QUICKERT
Claim 2. Whenever A is a closed subset of 2ω and f : A → 2ω is continuous, then f can be extended to a continuous function f : 2ω → 2ω . This can be seen as follows: First of all, it suffices to extend every continuous g : A → 2 to a continuous function g : 2ω → 2 since f is built from countably many such functions. If g : A → 2 is continuous, then g −1 (0) and g −1 (1) are disjoint closed subsets of 2ω . Since the topology on 2ω is generated by clopen sets and since g −1 (0) and g −1 (1) are in fact compact, there is a clopen set B ⊆ 2ω such that g −1 (1) ⊆ B and g −1 (0) ⊆ 2ω \B. Let g : 2ω → 2 be the characteristic function of B. Since B is clopen, g is continuous. By the choice of B, g extends g. This finishes the proof of Claim 2. This extension property shows that it actually suffices to show that for x and y as in a) there are a closed set A ⊆ 2ω and a continuous function f : A → 2ω such that f (and therefore A) is coded in V and f (x) = y. Since we intend to work in V , we have to choose suitable names for x and y first. It is worth mentioning that no new reals are added at limit stages of forcing iterations of uncountable cofinality. That is, if α is minimal with x ∈ V [Gα ], then α is of countable cofinality14 .
Claim 3. Let x˙ be an Sα -name such that there is no β < α with x˙ Gα ∈ V [Gβ ]. Then there is an Sα -name z˙ such that x˙ Gα = z˙Gα and for all β < α and all Sα -generic filters H over V , z˙H ∈ V [Hβ ] where Hβ := H ∩ Sβ . We first show that there is p ∈ Sα such that for no Sα -generic filter H over V with p ∈ H there is β < α such that x ∈ V [Hβ ]. Note that every Sβ -name, β < α, corresponds canonically to some Sα -name for the same object. We may therefore identify every Sβ -name with the corresponding Sα -name. Now consider the sets D0 := {p ∈ Sα : For no Sα -generic filter H over V with p ∈ H there is β < α such that x˙ H ∈ V [Hβ ]} and D1 := {p ∈ Sα : There are β < α ˙ and an Sβ -name z˙ such that p x˙ = z}. 14
Cf., e.g., [Bar2 Jud95, Lemma 1.5.7].
ON SACKS FORCING AND THE SACKS PROPERTY
135
D0 ∪ D1 is dense in Sα : Suppose that p ∈ Sα is not in D0 . Then there are an Sα -generic filter H, β < α, and an Sβ -name z˙ such that p ∈ H and x˙ H = z˙Hβ . Since x˙ H = z˙Hβ there is some q ∈ H which forces x˙ = z. ˙ Since H is a filter, we may assume q ≤ p. Clearly, q ∈ D1 . This shows the density of D0 ∪D1 . By the genericity of G, there is p ∈ G∩(D0 ∪D1 ). If p ∈ D1 , then α is not minimal with x ∈ V [Gα ]. This shows p ∈ D1 . Therefore p is as required. Using the Maximality Principle it is now easy to construct an Sα name z˙ for an element of 2ω such that p z˙ = x˙ and for no Sα -generic filter H over V there is β < α such that z˙H ∈ V [Hβ ].15 This finishes the proof of Claim 3. Using Claim 3 we can find an Sα -name x˙ for an element of 2ω such that x = x˙ Gα and for all Sα -generic filters H and all β < α, x˙ H ∈ V [Hβ ]. In other words, x˙ is a name for a real not added before stage α of the iteration. For y ∈ 2ω ∩ V [Gα ] there is an Sα -name y˙ for an element of 2ω such that y = y˙ Gα . For every p ∈ Sα we will construct q ≤ p with the following property:
(∗)q,x,˙ y˙
Let Tq (x) ˙ be the tree of q-possibilities for x. ˙ Then in V there ω is a continuous map f : [Tq (x)] ˙ → 2 such that if H is Sα generic over V with q ∈ H, then f maps x˙ H to y˙H .
Clearly, q forces f (x) ˙ = y. ˙ The function f is only defined on a closed ω subset of 2 , but by Claim 2 we can extend it (still in the ground model) to a continuous function g : 2ω → 2ω . Thus, the set of conditions forcing g(x) ˙ = y˙ for some continuous ω ω ground model function g : 2 → 2 is dense in Sα . This shows that for every Sδ -generic filter G there is a continuous ground model function g : 2ω → 2ω such that g(x˙ G ) = y˙G . This finishes the proof of part a) of Lemma 7.4 provided we know Lemma 7.5. Let α be an ordinal and x˙ an Sα -name for an element of 2ω which is not added in a proper initial stage of the iteration. Let y˙ be an Sα -name for an element of 2ω . Then for every p ∈ Sα there is q ≤ p such that (∗)q,x,˙ y˙ holds. 15
Choose an Sα -name y˙ for an element of 2ω which is not added before stage α. Let z˙ be such that p z˙ = x˙ and q z˙ = y˙ for all q ∈ Sα that are incompatible with p.
136
STEFAN GESCHKE AND SANDRA QUICKERT
Proof. The proof of this lemma is closely related to the proof of the 2localization property of countable support products of Sacks forcing in Section 6. We have to define the notion of a fusion sequence for iterations of Sacks forcing. Let F be a finite subset of α and η : F → ω. For p, q ∈ Sα let p ≤F,η q if and only if for all γ ∈ F , pγ p(γ) ≤η(γ) q(γ). A sequence (pn )n∈ω is a fusion sequence if there are an increasing sequence (Fn )n∈ω of finite subsets of α and a sequence (ηn )n∈ω such that for all n ∈ ω, ηn : Fn → ω, for all i ∈ Fn we have ηn (i) ≤ ηn+1 (i), pn+1 ≤Fn ,ηn pn , and for all γ ∈ supt(pn ) there is m ∈ ω such that γ ∈ Fm and ηm (γ) ≥ n. The fusion pω of a fusion sequence (pn )n∈ω in Sα is defined as follows: Let γ < α and suppose we have already defined pω γ. Let pω (γ) be an Sγ -name for a condition in S such that pω γ forces pω (γ) to be the intersection of the pn (γ), n ∈ ω. We also use a notion of faithfulness. For F and η as above, p ∈ Sα , and σ ∈ γ∈F 2η(γ) let p ∗ σ ∈ Sα be such that for all γ ∈ F , p ∗ σγ p ∗ σ(γ) = p(γ) ∗ σ(γ) and for γ ∈ α\F , p ∗ σ(γ) = p(γ). A condition p ∈ Sα is (F, η)-faithful if for all σ, τ ∈ γ∈F 2η(γ) with σ = τ , x˙ p∗σ and x˙ p∗τ , the longest initial segments of x˙ decided by p ∗ σ, respectively by p ∗ τ , are incomparable (with respect to ⊆)16 . The following claim is the iteration version of Claim 1. Claim 4. Let F and η be as before and suppose that q ∈ Sα is (F, η)faithful. a) Let β ∈ α\F and let F := F ∪ {β} and η := η ∪ {(β, 0)}. Then there is r ≤F,η q such that r is (F , η )-faithful. b) Let β ∈ F and let η := ηF \{β} ∪ {(β, η(β) + 1)}. Then there is r ≤F,η q such that r is (F, η )-faithful. Part a) follows immediately from the definitions. b) let δ := For part η(γ) max F and let {σ0 , . . . , σm } be an enumeration of γ∈F 2 . We define a ≤F,η sequence (qj )j≤m in Sα along with names qσ,0 and qσ,1 , -decreasing η(γ) σ ∈ γ∈F 2 , for conditions. 16
Compare this to the corresponding definitions in Section 6.
ON SACKS FORCING AND THE SACKS PROPERTY
137
Let j ∈ {1, . . . m} and assume that qj−1 has been constructed already. Since x˙ is not added in a proper initial stage of the iteration, there are qσj ,0 and qσj ,1 such that for i ∈ 2, we have qj−1 ∗ σj δ qσj ,i ≤ (q(δ) ∗ (σj (δ)i))q(δ, α) and qj−1 ∗ σj δ x˙ qσj ,0 and x˙ qσj ,1 are incomparable. Let qj ≤F,η qj−1 be such that qj ∗ σδ decides x˙ qσj ,0 and x˙ qσj ,1 . This finishes the inductive construction of the qj . Now let r ≤F,η qm be such that rδ = qm δ and for all σ ∈ γ∈F 2η(γ) and all coordinatewise extensions τ ∈ γ∈F 2η (γ) of σ, r ∗ τ δ r ∗ τ [δ, α) = qσ,τ (η(β)) . It is easy to check that r works for Claim 4. To conclude the proof of Lemma 7.5, let p, x˙ and y˙ be as in the Lemma. Using some bookkeeping and parts a) and b) of Claim 4 we construct a sequence (pn )n∈ω and a sequence (Fn , ηn )n∈ω witnessing that (pn )n∈ω is a fusion sequence such that p = p0 and for all n ∈ ω, pn is (Fn , ηn )-faithful. We can construct the sequences (pn )n∈ω and (Fn , ηn )n∈ω with additional property that for every n ∈ ω and for every σ ∈ theη(γ) , pn ∗ σ decides yn. ˙ γ∈Fn 2 Let q be the fusion of the sequence (pn )n∈ω . We have to check that (∗)q,x,˙ y˙ holds. Let a ∈ [Tq (x)] ˙ and n ∈ ω. Now q ≤Fn ,ηn pn and pn is (Fn , ηn )-faithful. It follows that there is exactly one σa,n ∈ γ∈Fn 2ηn (γ) such that x˙ q∗σa,n ⊆ a. Let f (a) := n∈ω y˙ q∗σa,n . Since for all n ∈ ω, q ∗ σa,n ≤ pn ∗ σa,n and pn ∗ σa,n decides yn, ˙ f (a) ∈ 2ω . It is easily ω checked that f : [Tq (x)] ˙ → 2 is a continuous map witnessing (∗)q,x,˙ y˙ . q.e.d.
Now we turn the proof of part b) of Lemma 7.4. Let G be Sδ -generic over the ground model V . By Claim 3, for x ∈ ω ω there is α ≤ δ and an Sα -name for an element of ω ω such that x = x˙ Gα and for all Sα -generic filters H over V there is no β < α such that x˙ H ∈ V [Hβ ]. Using the proof of Lemma 7.5 for the name x˙ for an element of ω ω instead of 2ω (and any Sα -name y˙ for an element of 2ω ), for every p ∈ Sα we obtain a condition q ≤ p such that (∗)q,x,˙ y˙ is satisfied. This is because the proof of Lemma 7.5 does not depend on the fact that x˙ is a name for
138
STEFAN GESCHKE AND SANDRA QUICKERT
an element of 2ω . It works for names for elements of ω ω as well. From the construction of q it follows that Tq (x) ˙ is binary. Clearly, q forces x˙ to be a branch through Tq (x). ˙ By the genericity of Gα , there is q ∈ Gα such that Tq (x) ˙ is binary. This shows the 2-localization property of Sδ . Note that the name y˙ is actually not needed in the proof of the 2localization property. It is only used in order to be consistent with the notation of Lemma 7.5. q.e.d. (Lemma 7.4) Lemma 7.4 shows how S can collapse cardinals: Let G be Sω2 -generic over the ground model V . Suppose that V satisfies CH. Then in V [G] the size of the continuum is ℵ2 . Let H be S-generic over V [G]. Then V [G][H] is obtained by forcing with an iteration of Sacks forcing of length ω2 + 1 over V . For γ < (ω2 )V + 1 let xγ denote the generic real added at stage γ. Here (ω2 )V denotes the ordinal that is ω2 in V . By Lemma 7.4, for every γ < (ω2 )V there is a continuous function fγ : 2ω → 2ω such that fγ ∈ V and fγ (x(ω2 )V ) = xγ . Since there are only ℵ1 continuous function from 2ω to 2ω in V , this implies (ω2 )V ≤ ℵ1 in V [G][H]. This shows that adding a Sacks real over the Sacks model collapses ℵ2 . At the end of this section let us mention another description of Sδ , which is related more closely to the representation of Sacks forcing as the partial order of uncountable Borel subsets of R. Definition 7.6. Let δ be an ordinal. A set B ⊆ Rδ is ctbl-perfect iff 1. for every ordinal β < δ and every sequence s ∈ Bβ := {uβ : u ∈ B} the set {r ∈ R : sr ∈ Bβ + 1} is uncountable and 2. for every increasing sequence (βn )n∈ω of ordinals below δ and every increasing sequence (with respect to inclusion) (sn )n∈ωsuch that for all n ∈ ω, sn ∈ Bβn it is the case that n∈ω sn ∈ B n∈ω βn . It can be shown that the partial order consisting of all ctbl-perfect Borel subsets of Rδ is forcing equivalent to Sδ (cf. [Zap04, 3.1]). This way of representing iterations of Sacks forcing was used by Kanovei [Kan1 99] to iterate Sacks forcing along non-wellfounded linear orders. He showed that whenever I is a linearly ordered index set, it is possible to add a sequence (ai )i∈I of reals to the ground model V such that for each j ∈ I, aj is a Sacks real over V [(ai )i<j ].
ON SACKS FORCING AND THE SACKS PROPERTY
139
Using the representation of Sδ in terms of ctbl-perfect Borel sets, we can give the formulation of a set theoretic axiom that axiomatizes the properties of the Sacks model very well. Definition 7.7. Consider the following game between player I and player II lasting ω1 rounds. At round β ∈ ω1 player I plays an ordinal αβ < ω1 , a ctbl-perfect Borel set Bβ ⊆ Rαβ , and a Borel function fβ : Bβ → R. Then player II responds by a ctbl-perfect Borel set Cβ ⊆ Bβ . Player I wins iff β<ω1 fβ [Cβ ] = R. The Covering Property Axiom (CPA) is the statement “CH fails and player II has no winning strategy in the above game”. CPA was invented by Ciesielski and Pawlikowski [CiePaw∞]. They showed that CPA holds in the Sacks model and implies many statements that were previously known to be true in the Sacks model. In fact, many of these statements already follow from weaker versions of CPA which are also defined in [CiePaw∞] and which seem to be much easier to apply. In particular, these weaker version of CPA avoid some of the technicalities and notational inconveniences usually associated with countable support iterations. Recently, Zapletal [Zap04] proved that if V satisfies CPA, then for a large class of cardinal characteristics of the reals the following holds: If there is a generic extension V [G] of V where the cardinal characteristic in question is < 2ℵ0 , then this cardinal characteristic is < 2ℵ0 in V . This shows that CPA indeed axiomatizes the Sacks model very well. The formulation of CPA is sufficiently general to be easily adapted for other models of set theory as well17 .
17
Cf. [CiePaw∞] and [Zap04] for this.
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Supertask computation Joel David Hamkins The College of Staten Island of The City University of New York Mathematics Department 1S-215 2800 Victory Boulevard Staten Island, NY 10314, USA The Graduate Center of The City University of New York Mathematics Program 365 Fifth Avenue New York, NY 10016, USA E-mail:
[email protected]
Abstract. Infinite time Turing machines extend the classical Turing machine concept to transfinite ordinal time, thereby providing a natural model of infinitary computability that sheds light on the power and limitations of supertask algorithms.
1
Supertasks
What would you compute with an infinitely fast computer? What could you compute? To make sense of these questions, one would want to understand the algorithms that the machines would carry out, computational tasks involving infinitely many steps of computation. Such tasks, known Received: November 19th, 2001; In revised version: February 8th, 2002; Accepted by the editors: February 13th, 2002. 2000 Mathematics Subject Classification. 03D10 68Q05 03D25.
My research has been supported in part by grants from the PSC-CUNY Research Foundation and from the NSF.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 141-158. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
142
JOEL DAVID HAMKINS
as supertasks, have been studied since antiquity from a variety of viewpoints. Zeno of Elea (ca. 450 B.C.) was perhaps the first to grapple with supertasks, in his famous paradox that it is impossible to go from here to there, because before doing so one must first get halfway there, and before that halfway to the halfway point, and so on, ad infinitum. Zeno takes the impossibility of completing a supertask as the foundation of his reductio. More recently, twentieth century philosophers (cf. [Tho1 54]) have introduced Thomson’s lamp, which is on for 21 minute, off for 41 minute, on for 18 minute, and so on. After one minute, is it on or off? In a more intriguing example, let’s suppose that you have infinitely many one dollar bills (numbered 1, 3, 5, . . .) and in some nefarious underground bar, the Devil explains to you that he has an attachment to your particular bills, and is willing to pay you two dollars for each of your one dollar bills. To carry out the exchange, he proposes an infinite series of transactions, in each of which he will hand over to you two dollars and take from you one dollar. The first transaction will take 21 hour, the second 14 hour, the third 81 hour, and so on, so that after one hour the entire exchange is complete. Should you accept his proposal? Perhaps you will become richer? At the very least, you think, it will do no harm, and so the contract is signed and the procedure begins.
It appears initially that you have made a good bargain, because at every step of the transaction, you receive two dollars but give up only one. The Devil is particular, however, about the order in which the bills are exchanged: he always buys from you your lowest-numbered bill, paying you with higher-numbered bills. (So on the first transaction he accepts from you bill number 1, and pays you with bills numbered 2 and 4, and on the second transaction he buys from you bill number 2, which he had just paid you, and pays you bills numbered 6 and 8, and so on.) When the transaction is complete, you discover that you have no money left at all! The reason is that at the nth exchange, the Devil took from you bill number n, and never subsequently returned it to you. Thus, the final destination of every individual bill is under the ownership of that shrewd banker, the Devil. The point is that you should have paid more attention to the details of the supertask transaction that you had agreed to undertake. And similarly, when we design supertask algorithms to solve mathematical questions,
SUPERTASK COMPUTATION
143
we must take care not to make inadvertent assumptions about what may be true only for finite algorithms. Supertasks have also been studied by the physicists (cf. [Ear95]). Using only the Newtonian gravity law (and neglecting relativity), it is possible to arrange finitely many stars in orbiting pairs, each pair orbiting the common center of mass of all the pairs, and a single tiny moon racing faster around, squeezing just so between the dual stars so as to pick up speed with every such transaction. Assuming point masses (or collapsing stars to avoid collision), the arrangement leads by Newton’s law of gravitation to infinite acceleration in finite time. Other supertasks reveal apparent violations of the conservation of energy in Newtonian physics: infinitely many billiard balls, of successively diminishing size converging to a point, are initially at rest, but then the first is set rolling, and each ball transfers in turn all the energy to the next; after a finite amount of time, all motion has ceased, though every interaction is energy conserving. Still other arrangements have the balls spaced out further and further out to infinity, and the interesting thing about both of these examples is that time-symmetry allows them to run in reverse, with static configurations of balls suddenly coming into motion without violating conservation of energy in any interaction. More computationally significant supertasks have been proposed by physicists in the context of relativity theory (cf. [EarNor1 93], [Hog0 92], [Hog0 94]). Suppose that you want to know the answer to some number theoretic conjecture, such as whether there are additional Fermat primes n (primes of the form 22 + 1), a conjecture that can be confirmed with a single numerical example. The way to solve the problem is to board a rocket, while setting your graduate students to work on earth looking for an example. While you fly faster and faster around the earth, your graduate students, and their graduate students and so on, continue the exhaustive search, with the agreement that if they ever find an example, they will send a radio signal up to the rocket. The point is that meanwhile, by accelerating sufficiently fast towards the speed of light, it is possible to arrange that because of relativistic time contraction, what is a finite amount of time on the rocket corresponds to an infinite amount of time on the earth. The general observation is that by means of such communication between two reference frames, what corresponds to an infinite search can be completed in a finite amount of time.
144
JOEL DAVID HAMKINS
Even more complicated arrangements, with rockets flying around rockets, can be arranged to solve more complicated number theoretic questions. And more complicated relativistic spacetimes can be (mathematically) constructed to avoid the unpleasantness of infinite acceleration required in the rocket examples above (cf. [Pit1 90]). These computational examples speak to the generalized Church’s thesis, the widely accepted philosophical principle that the classical theory of computability has correctly captured the notion of what it means to be computable. Because the relativistic rocket examples provide algorithms for computing functions, such as the halting problem, that are not computable by Turing machines, one can view them as refuting the generalized Church’s thesis. Supporters of this view emphasize that when thinking about what is in principle computable, we must attend to the computational power available to us as a consequence of the fact that we live in a relativistic or quantum-mechanical universe. To ignore this power is to pretend that we live in a Newtonian world. Another (probably specious) argument against the generalized Church’s thesis consists of the observation that a particle undergoing Brownian motion can be used to generate a random bit stream that we have no reason to think is recursive. Therefore, proponents argue, we have no reason to believe the generalized Church’s thesis. Apart from the question of what one can actually compute in this world, whether Newtonian or relativistic or quantum-mechanical, mathematicians are interested in what in principle a supertask can accomplish. Buchi [B¨uc62] and others initiated the study of ω-automata and Buchi machines, involving automata and Turing machine computations of length ω which accept or reject infinite input. Moving to a higher level in the hierarchy, Gerald Sacks and many others (cf. [Sac90]) founded the field of higher recursion theory, including α-recursion and E-recursion, a huge body of work analyzing computation on infinite objects. Blum, Shub and Smale in [BluShuSma89] have presented a model of computation on the real numbers, a kind of flowchart machine where the basic units of computation consist of real numbers, in full glorious precision. Apart from this previous mathematical work, I would like to propose here a new model of infinitary computability: infinite time Turing machines. This model offers the strong computational power of higher recursion theory while remaining very close in spirit to the computability concept of ordinary Turing machines.
SUPERTASK COMPUTATION
2
145
Infinite time Turing machines
I propose to extend the Turing machine concept to transfinite ordinal time, thereby providing a natural model for infinitary computability.1 The idea is to allow somehow a Turing machine to compute for infinitely many steps, while preserving the information produced up to that point. So let me explain specifically how the machines work. The machine hardware is identical to a classical Turing machine, with a head moving back and forth reading and writing zeros and ones on a tape according to the rigid instructions of a finite program, with finitely many states. What is new is the transfinite behavior of the machine, behavior providing a natural theory of computation on the reals that directly generalizes the classical finite theory to the transfinite. For convenience, the machines start input:
1
1
0
1
1
0
···
scratch:
0
0
0
0
0
0
···
output:
0
0
0
0
0
0
···
Fig. 1. An infinite time Turing machine: the computation begins
have three tapes –one for the input, one for scratch work and one for the output– and the computation begins with the input written out on the input tape, with the head on the left-most cell in the start state. The successor steps of computation proceed in exactly the classical manner: the head reads the contents of the cells on which it rests, reflects on its state and follows the rigid instructions of the finite program it is running: accordingly, it writes on the tape, moves the head one cell to the left or the right or not at all and switches to a new state. Thus, the classical procedure determines the configuration of the machine at stage α + 1, given the configuration at any stage α. 1
Infinite time Turing machines were originally defined by Jeff Kidder in 1990, and he and I worked out the early theory together while we were graduate students at UC Berkeley. Later, Andy Lewis and I solved some of the early questions, and presented a complete introduction in [HamLew00], later solving Post’s problem for supertasks in [HamLew02]. Benedikt L¨owe in [L¨ow01], Dan Seabold in [HamSea01] and especially Philip Welch in [Wel0 99], [Wel0 00b], [Wel0 00a] have also made important contributions.
146
JOEL DAVID HAMKINS
We extend the computation into transfinite ordinal time by simply specifying the behavior of the machine at limit ordinals. When a classical Turing machine fails to halt, it is usually thought of as some sort of failure; the result is discarded even though the machine might have been writing some very interesting information on the tape (such as all the theorems of mathematics, for example, or the members of some other computably enumerable set). With infinite time Turing machines, however, we hope to preserve this information by taking some kind of limit of the earlier configurations and continuing the computation transfinitely. Specifically, at any limit ordinal stage λ, the head resets to the left-most cell; the machine is placed in the special limit state, just another of the finitely many states; and the values in the cells of the tape are updated by computing the lim sup of the previous cell values. With the limit stage limit input:
1
1
0
1
0
0
···
scratch:
0
1
1
0
0
1
···
output:
1
1
0
1
1
1
···
Fig. 2. The limit configuration
configuration thus completely specified, the machine simply continues computing. If after some amount of time the halt state is reached, the machine gives as output whatever is written on the output tape. Because there seems to be no need to limit ourselves to finite input and output –the machines have plenty of time to consult the entire input tape and to write on the entire output tape before halting– the natural context for these machines is Cantor Space 2ω , the space of infinite binary sequences. For our purposes here, let’s denote this space by R and refer to its members as real numbers, intending by this terminology to mean infinite binary sequences. We regard the set of natural numbers N as a subset of R by identifying the number 0 with the sequence 000 . . . , the number 1 with 100 . . . , the number 2 with 110 . . . , and so on. Because every program p determines a function –the function sending input x to the output of the computation of program p on input x– the machines provide a model of computation on the reals. We define that a
SUPERTASK COMPUTATION
147
. partial function f .. R → R is infinite time computable (or supertask computable, or for brevity, just computable, when the infinite time context is understood) when there is a program p such that f (x) = y if and only if the computation of program p on input x yields output y. A set of reals A ⊆ R is infinite time decidable (or supertask decidable or again, just decidable) when its characteristic function, the function with value 1 for inputs in A and 0 for inputs not in A, is computable. The set A is infinite time semi-decidable when the function of affirmative values 1A, i.e., the function with domain A and constant value 1, is computable. (Thus, the semi-decidable sets correspond in the classical theory to the recursively enumerable sets, though since here we have sets of reals, we hesitate to describe them as enumerable.) Since it is an easy matter to change any output value to 1, the semi-decidable sets are exactly the domains of the computable functions, just as in the classical theory. Theorem 2.1. Every supertask computation halts or repeats in countably many steps. Proof. Suppose that a supertask computation does not halt by any countable stage of computation. The point is now that a simple cofinality argument shows that the complete configuration of the machine at stage ω1 –the position of the head, the state and the contents of the cells– must have occurred earlier. For example, one can find a countable ordinal α0 by which time all of the cells that have stabilized by ω1 have stabilized. And then one can construct a countable increasing sequence of countable ordinals α0 < α1 < . . . such that all the cells that change their value after αn do so at least once between αn and αn+1 . These ordinals exist because ω1 is regular and there are only countably many cells. At the limit stage αω = sup αn , which is still a countable ordinal, I claim that the configuration is the same as at ω1 : since it is a limit ordinal, the head is on the first cell and in the limit state; and by construction the contents of each cell are computing the same lim sup that they compute at ω1 . Since beyond α0 the only cells that change are the ones that will change unboundedly often, it follows that limits of this configuration are the very same configuration again, and the machine is caught in an endlessly repeating loop. So the proof is complete. q.e.d. Please observe in this argument that, contrary to the classical situation, a computation that merely repeats a complete machine configuration
148
JOEL DAVID HAMKINS
need not be caught in an endlessly repeating loop. After ω many repetitions, the limit configuration may allow it to escape. One example of this phenomenon would be the machine which does nothing at all except halt when it is in the limit state; this machine repeats its initial configuration many times, yet still halts at ω.
3
How powerful are the machines?
One naturally wants to understand the power of the new machines. The first observation, of course, is that the classical halting problem for ordinary Turing machines –the question of whether a given program p halts on given input n in finitely many steps– is decidable in ω many steps by an infinite time Turing machine. To see this, one programs an infinite time Turing machine to simply simulate the operation of p on n, and if the simulated computation ever halts our algorithm gives the output that yes, indeed, the computation did halt. Otherwise, the limit state will be attained, and when this occurs the machine will that know the simulated computation failed to halt; so it outputs the answer that no, the computation did not halt. The power of infinite time Turing machines, though, far transcends the classical halting problem. The truth is that any question of first order number theory is supertask decidable. With an infinite time Turing machine, one could solve the prime pairs conjecture (which asserts that there are infinitely many primes pairs, pairs of primes differing by two), for example, and the question of whether there are infinitely many Fermat n primes (primes of the form 22 + 1) and so on: there is a general decision algorithm for any such conjecture. The point is that to decide a question of the form ∃n ϕ(n, x), where n ranges over the natural numbers, one can simply try out all the possible values of n in turn. One either finds a witness n or else knows at the limit that there is no such witness, and in this way decides whether ∃n ϕ(n, x). Iterating this idea, one concludes by induction on the complexity of the statement that any first order number theoretic question is decidable with only a finite number of limits, i.e., before stage ω 2 . In fact, the class of sets that are decidable in time uniformly before ω 2 is exactly the class of arithmetic sets, the sets of reals that are definable by a statement using quantifiers over the natural numbers (cf. [HamLew00, Theorem 2.6 ]).
SUPERTASK COMPUTATION
149
Theorem 3.1. Arithmetic truth is infinite time decidable. One can push this much harder to see that even more complex questions, questions from the lower part of the projective hierarchy in second order number theory, are supertask decidable. The fact is that any Π11 set is decidable and more. To prove this, it suffices to consider the most complex Π11 set, the well-known set WO, consisting of the reals coding a wellorders of a subset of N. An infinite binary sequence x codes a relation on N when i j if and only if x( i, j ) = 1, where ·, · is the G¨odel pairing function coding pairs of natural numbers with natural numbers. Theorem 3.2. The set WO is infinite time decidable. Proof. This argument is known as the “count-through” argument. We would like to describe a supertask algorithm which on input x decides whether x codes a well order on a subset of N or not. In ω many steps, it is easy to check whether x codes a linear order: this amounts merely to checking that the relation coded by x is transitive, irreflexive and connected. For example, the machine must check that whenever i j and j k then also i k, and all these requirements can be enumerated and checked in ω many steps. Next, the algorithm will attempt to find the least element in the field of the relation . This can be done by keeping a current-best-guess on the scratch tape and systematically looking for better guesses, whenever a new smaller element is found. When such a better guess is found, it replaces the current guess on the scratch tape, and a special flag cell is flashed on and then off again. At the limit, if the flag is on, it means that infinitely often the guess was changed, and so the relation has an infinite descending sequence. Thus, in this case the input is definitely not a well order and the computation can halt with a negative output. Conversely, if the flag is off, it means that the guess was only changed finitely often, and the machine has successfully found the least element. The algorithm now proceeds to erase all mention of this element from the field of the relation . This produces a new smaller relation, and the algorithm proceeds to find the least element of it. In this way, the relation is eventually erased from the bottom as the computation proceeds. If the relation is not a well order, eventually the algorithm will erase the well founded initial segment of it, and then discover that there is no least element remaining, and reject the input. If the relation is a well order,
150
JOEL DAVID HAMKINS
then the algorithm will eventually erase the entire field, and recognize that it has done so, and accept the input as a well order. This completes the proof. q.e.d. Since WO is well-known as a complete Π11 set, we conclude as a corollary that every Π11 set is infinite time decidable and hence also, every Σ11 set is infinite time decidable. But one can’t go much further in the projective hierarchy, because every semi-decidable set has complexity ∆12 . For a finer stratification, let me mention that the arithmetic sets are exactly the sets which can be decided by an algorithm using a bounded finite number of limits, and the hyperarithmetic sets, the ∆11 sets, are exactly the sets which can be decided in some bounded recursive ordinal length of time. Thus, the arithmetic sets are those that can be decided uniformly in time before ω 2 , and the hyperarithmetic sets are exactly those which can be decided uniformly in time before ω1CK . Much of the classical computability theory generalizes to the supertask context of infinite time Turing machines. For example, the s-m-n theorem and the Recursion Theorem go through with virtually identical proofs. But some other classical results, even very elementary ones, do not generalize. One surprising result, for example, is the following. Theorem 3.3. There is a non-computable function whose graph is semidecidable. This follows from what I have called the Lost Melody Theorem [HamLew00, Theorem 4.9], which asserts the existence of a real c such that { c } is decidable, but c is not writable. Imagine the real c as the melody that you can recognize when someone sings it, but you cannot sing it on your own. Using such a lost melody real c, one can prove Theorem 3.3 with the function f (x) = c. Indeed, since this function is constant and the graph is decidable, the theorem can be strengthened to the assertion that there is a non-computable constant function whose graph is decidable. To give some idea of how one proves the Lost Melody Theorem, let me mention that the real c will be the least real in the G¨odel constructible universe L hierarchy that codes the ordinal supremum of the places where all computations on input 0 have either halted or repeated. Since this ordinal is above every writable ordinal, the real c cannot be writable. But the real c codes enough information about itself so that an infinite time Turing machine can verify that a given real is c or not.
SUPERTASK COMPUTATION
4
151
How long do the computations take?
One naturally wants to understand how long a supertask computation can take. Therefore, I define an ordinal α to be clockable if there is a computation on input 0 that takes exactly α many steps to complete (so that the αth step of computation is the act of moving to the halt state). Such a computation is a clock of sorts, a way to count exactly up to α. It is very easy to see that any finite n is clockable; one can simply have a machine cycle through n states and then halt. The ordinal ω is clockable, by the machine that halts whenever it sees the limit state. And these same ideas show that if α is clockable, then so is α + n and α + ω. Thus, every ordinal up to ω 2 is clockable. The ordinal ω 2 itself is clockable: one can recognize it as the first limit of limit ordinals, by flashing a flag on and then off again every time the limit state is encountered. The ordinal ω 2 will be first time this flag is on at a limit stage. Going beyond this, it is easy to see that if α and β are clockable, so are α + β and αβ. Undergraduate students might enjoy finding algorithms to clock specific 2 ordinals, such as ω ω , and I can recommend this as a way to help them understand the ordinals more deeply. Most readers will have guessed that the analysis extends much further. In fact, any recursive ordinal is clockable. This can be seen by optimizing the count-through argument in Theorem 3.2. Specifically, after writing a real coding a recursive ordinal on the tape in ω many steps, one proceeds to count through it in an optimized fashion. Rather than merely guessing the least element of the relation, one guesses the ω many least elements of the relation (while simultaneously erasing the previous guesses). In this way, each block of ω many steps of the algorithm will erase ω many elements from the field of the relation. Some have been surprised that the clockable ordinals extend beyond the recursive ordinals, but in fact they extend well beyond the recursive ordinals. To see at least the beginnings of this, let me show that the ordinal ω1CK + ω is clockable, where ω1CK is the supremum of the recursive ordinals. Kleene has proved that there is a recursive relation whose wellfounded part has order type ω1CK . Consider the supertask algorithm that writes this relation on the tape and then attempts to count through it. By stage ω1CK the ill-founded part will have been reached, but it takes the algorithm an additional ω many steps to realize this. So it can halt at stage ω1CK + ω.
152
JOEL DAVID HAMKINS
One is left to wonder, is ω1CK itself clockable? More generally, Are there gaps in the clockable ordinals? After all, if a child can count to twenty-seven, then one might expect the child also to be able to count to any smaller number, such as nineteen2 . The question is whether we expect the same to be true for infinite time Turing machines. Theorem 4.1. Gaps exist in the clockable ordinals. Proof. Consider the algorithm which simulates all programs on input 0, recording which have halted. When a stage is found at which no programs halt, then halt. This produces a clockable ordinal above a nonclockable ordinal, so gaps exist. q.e.d. The argument can be modified to show that the next gap above any clockable ordinal has size ω. Other arguments establish that complicated behavior can occur at limits of gaps, because the lengths of the gaps are unbounded in the clockable ordinals. Question 1. What is the structure of the clockable ordinals? For example, one might wonder whether the first gap begins at ω1CK , the supremum of the recursive ordinals? (It does, since no admissible ordinal is clockable [HamLew00].) There is another way for infinite time Turing machines to operate as clocks, and this is by counting through a real coding a well order in the manner of Theorem 3.2. To assist with this analysis, we define that a real is writable if it is the output of a supertask computation on input 0. An ordinal is writable if it is coded by a writable real. It is easy to see that there are no gaps in the writable ordinals, because if one can write down real coding α, it is an easy matter to write down from this a real coding any particular β < α. In [HamLew00], Andy Lewis and I proved that the order types of the clockable and writable ordinals are the same, but the question was left open as to whether these two classes of ordinals had the same supremum. This was solved by Philip Welch in [Wel0 00a], allowing Andy Lewis and I to greatly simplify arguments in [HamLew02]. 2
Friends with children have informed me that such an expectation is unwarranted; one sometimes can’t get the child to stop at the right time. This reminds me of a time when my younger brother was in kindergarten, the children all sat in a big circle taking turns saying the next letter of the alphabet: A, B, C, and so on, around the circle in the manner of the usual song. After the letter K, the next child contributed LMNOP, thinking that this was only one letter.
SUPERTASK COMPUTATION
153
Theorem 4.2 (Welch). Every clockable ordinal is writable. The supremum of the writable and clockable ordinals is the same.
5
The supertask halting problems
Any notion of computation naturally provides a corresponding halting problem, the question of whether a given computation will halt. In the supertask context, we divide the halting problem into two parts, a boldface and a lightface problem: H = { p, x | program p halts on input x } h = { p | program p halts on input 0 } In the classical theory, of course, these two sets are Turing equivalent, but here the situation is different. Nevertheless, for undecidability the classical arguments do directly generalize. Theorem 5.1. The halting problems h and H are semi-decidable but not decidable. For semi-decidability, the point is that given a program p and input x (or input 0), one can simply simulate p on x to see if it halts. If it does, output the answer that yes, it halted; otherwise, keep simulating. For undecidability, in the case of H one can use the classical diagonalization argument; for the lightface halting problem h, one appeals to the Recursion Theorem, just as in the classical theory.
6
Oracles
There are two natural types of oracles to use in the infinite time Turing machine context. On the one hand, one can use an individual real as an oracle just as one does in the classical context, by simply adding an oracle tape containing this real, and allowing the machine to access this tape during the computation. This corresponds exactly to adding an extra input tape and thinking of the oracle real as a fixed additional input. But this is ultimately not the right type of oracle to consider. Rather, an oracle is more properly the same type of object as one that might be decidable or semi-decidable, namely, a set of reals, not an individual
154
JOEL DAVID HAMKINS
real. Since such a set could be uncountable, we can’t expect to be able to write out the entire contents of the oracle on an extra tape. Rather, we provide an oracle model of relative computability by which the machine can make arbitrary membership queries of the oracle. Specifically, for a fixed oracle set of reals A, we equip an infinite time Turing machine with an initially blank oracle tape on which the machine can read or write. By attempting to switch to a special query state, the machine receives the answer (by moving actually to the yes or no state) as to whether the real currently written on the oracle tape is in A or not. In this way, the machine is able to ask, of any real x that it is capable of producing, whether x ∈ A or not. This model of oracle computation has proven robust, and it closely follows the well-known definition of L[A] in set theory, the constructible universe relative to the predicate A, in which at any given stage in the construction one is allowed to apply the predicate only to previously constructed objects. From the notion of oracle computation, one can of course define a notion of relative computability. Specifically, the set A is computable from B, written A ≤∞ B, if and only if A is supertask decidable using oracle B. One then also defines A ≡∞ B if and only if A ≤∞ B and B ≤∞ A, and this is the equivalence relation of the infinite time Turing degrees. The strict version A <∞ B holds if and only if A ≤∞ B and A ≡∞ B.
7
Supertask Jump Operators
The two halting problems give rise of course to two jump operators. Specifically, for any set A we have the boldface and lightface jumps3 : A = H A = { p, x | program p halts on input x with oracle A } A = A ⊕ hA = A ⊕ { p | program p halts on input 0 with oracle A } We include the factor A explicitly in A, because in general A may not be computable from hA . Indeed, there are some sets A that are not computable from any real at all. Jump Theorem 7.1 For any set, A <∞ A <∞ A. 3
Compare Welch’s Definition 1.9 and Definition 2.1 on pages 227 and 231 of this volume, respectively.
SUPERTASK COMPUTATION
155
To prove this theorem, one first observes that A ≤∞ A ≤∞ A, since A is explicitly computable from A and A is merely the 0th slice of A. Secondly, one knows that A <∞ A because the undecidability of the relativized halting problem means that hA is not computable from A. The nontrivial aspect of this theorem is the assertion that A <∞ A. This assertion is what separates the two jump operators, and is the reason that we know the two halting problems h ≡∞ 0 and H ≡∞ 0 are not equivalent. This follows from the more specific result that the set A is not computable from A ⊕ z for any real z. In particular, 0 is not computable from any real. In fact, the boldface jump jumps much higher than the lightface jump, and absorbs many iterates of the weaker jump, since A ≡∞ A; indeed, for any ordinal α which is A-writable, (α) A ≡∞ A (cf. [HamLew00]).
8
Post’s Problem for Supertasks
Post’s problem is the question in classical computability theory of whether there are any non-decidable semi-decidable degrees strictly below the halting problem, or equivalently, whether there are any intermediate semi-decidable degrees between 0 and the Turing jump 0 . This question has a natural supertask analogue: Supertask Post’s Problem 8.1 Are there any intermediate semi-decidable supertask degrees between 0 and the supertask jump 0? The answer is delicately mixed. On the one hand, in the context of degrees in the real numbers, we have a negative answer. This contrasts sharply with the classical theory. Theorem 8.2. There are no reals z such that 0 <∞ z <∞ 0. Proof. Suppose that 0 ≤∞ z ≤∞ 0. So z is the output of program p using 0 as an oracle. Consider the algorithm which computes approximations to 0, and uses program p with these approximations in an attempt to produce z. If one of the proper approximations to 0 can successfully produce z, then z is writable and 0 ≡∞ z. Conversely, if none of the proper approximations can produce z, then on input z we can recognize 0 as the true approximation, the first approximation able to produce z. So z ≡∞ 0. q.e.d.
156
JOEL DAVID HAMKINS
On the other hand, when it comes to sets of reals, we have an affirmative answer. Theorem 8.3. There are semi-decidable sets of reals A with 0 <∞ A <∞ 0. Indeed, there are incomparable semi-decidable sets A ⊥ B. Please consult [HamLew00] for the proof. Let me mention here, though, that the basic idea of the argument is to generalize the Friedburg-Muchnik priority argument to the supertask context, much as Sacks’ did for α-recursion theory. Building A and B in stages, we attempt to meet the requirements ϕB and ϕA p = A p = B by adding writable reals to A and B that have not yet appeared on the higher priority computations. One technical fact to make this idea work is that for any clockable ordinal α, there are many writable reals not appearing during the course of any supertask computation of length α. Thus, we can find a supply of new writable reals to add to A and B in order to satisfy the later requirements, without injuring the witnessing computations of earlier higher-priority requirements.
9
Other Models of Infinitary Computation
Let me briefly compare the infinite time Turing machine model of supertask computation with some other well-known models. The Blum-Shub-Smale machines (cf. [BluShuSma89]) were the original inspiration for infinite time Turing machines.4 Programs and computations for BSS machines are finite, but the basic units of computation are full precision real numbers. They are in essence finite state register machines, where the registers each hold a real number. The primary purpose of introducing the BSS machines was to provide a theoretical foundation for analyzing computational algorithms using the concepts of real analysis rather than arithmetic. The machines allow one to analyze 4
Jeff Kidder and I heard Lenore Blum’s lectures for the Berkeley Logic Colloquium in 1989, and had the idea to generalize the Turing machine concept in a different direction: to infinite time rather than infinite precision.
SUPERTASK COMPUTATION
157
the dynamical features, for example, of actual algorithms in numerical analysis, such as Newton’s method, and illuminate questions of stability and convergence for such algorithms. The classical approach to these problems, using the Turing machine model with ever greater decimal approximations, forces one into the realm of finite combinatorics, where one becomes lost in a jumble of discrete approximation error analysis, when one would rather fly smoothly above it in the heaven of differential equations. In another direction, the theory of higher recursion provides a model of infinitary computability by setting a very general theoretical context for recursion on infinite objects, and one should expect many parallels between it and the theory of infinite time Turing machines. The anonymous referee of [HamLew02] and Philip Welch have pointed out, for example, that the infinitary priority argument [HamLew02, Theorem 4.1], stated as Theorem 8.3 above, parallels Sacks’ version of the Friedburg-Munchnik proof for α-recursion [Sac90], specifically when α is λ, the supremum of the clockable ordinals. One can identify the writable reals in our argument with the ordinal stages at which they appear and get Sacks’ sets, and conversely, Sacks’ could have written out codes for those stages and gotten our sets. This identification reveals that the ≤∞ -degree structure of sets of writable reals below 0 is exactly that of the λ-degrees. Accordingly, one can obtain not only the answer to Post’s problem, but all the theorems from λ-recursion theory for this class of degrees, such as the Shore Density Theorem, etc., for free. It will be very interesting to see if these ideas will allow one to prove the theorems in the general case of all degrees. Lastly, let me mention quantum Turing machines, if only because I am often asked about them in connection with infinite time Turing machines. Quantum Turing machines are like classical Turing machines, except that the configuration of the machine at any given stage is a superposition of classical configurations; the different components of these superpositions, like the wave functions of quantum mechanics, may constructively or destructively interfere with one another as the computation proceeds. By means of clever quantum algorithms, one can effectively carry out parallel computation in these different components, constructively interfering their output to assemble the information into a final answer. In this way, quantum Turing machines allow for an exponential increase in the speed of computation of many important functions. But
158
JOEL DAVID HAMKINS
because quantum Turing machines, at the end of the day, are simulable by classical Turing machines, they do not introduce new decidable sets or new computable functions. And so while quantum Turing machines are without a doubt extremely important in matters of computational feasibility, they do not really provide a model of infinitary computability. Infinite time Turing machines are simply much more powerful than quantum Turing machines.
10
Questions for the Future
I close this article by asking the open-ended question: Question 2. What is the structure of infinite time Turing degrees? To what extent do its properties mirror or differ from the classical structure? This question really stands for the dozens of specific open questions that one might ask: does the Sacks Density Theorem, for example, hold in the supertask context for arbitrary sets of reals? The field is wide open.
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
A refinement of Jensen’s constructible hierarchy Peter Koepke Mathematisches Institut Rheinische Friedrich-Wilhelms-Universit¨at Bonn Beringstraße 1 D-53115 Bonn, Germany E-mail:
[email protected]
Marc van Eijmeren Ginnekenweg 278 4835 NK Breda, The Netherlands E-mail:
[email protected]
Abstract. Appropriate hierarchies are essential for the study of minimal or constructible inner models of set theory, in particular for proving strong combinatorial principles in those models. We define a fine structural hierarchy (Fα ) and compare it to Jensen’s well-known Jα -hierarchy for G¨odel’s constructible universe L.
1
Introduction
Under the axioms of Zermelo-Fraenkel set theory (ZF) the universe V of all sets can be layered into a hierarchy of initial segments which is Received: December 20th, 2001; In revised version: March 11th, 2003; November 5th, 2003; Accepted by the editors: November 5th, 2003. 2000 Mathematics Subject Classification. 03E45.
The second author was funded by the Mathematical Institute of the University of Bonn and the DAAD.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 159-169. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
160
PETER KOEPKE AND MARC VAN EIJMEREN
indexed by the ordinal numbers. John von Neumann made the following recursive definition: V0 := ∅, Vα+1 := ℘(Vα ), and Vλ := α<λ Vα , for limit ordinals λ. The axiom schema of foundation (among others) implies that this hierarchy exhausts the set theoretical universe: V := V α . A set x is of ordinal rank α if x ∈ Vα+1 \Vα . The rank α∈Ord function measures the complexity of x in terms of the membership relation. Transfinite induction along the ordinals is essentially the only method to carry out involved arguments and constructions on infinite sets. Via the von Neumann hierarchy, some inductive arguments can be applied to all sets in the universe V. In his fundamental work on the relative consistency of the axiom of choice and of the generalised continuum hypothesis, Kurt G¨odel defined the constructible universe L which is the smallest inner model of ZF. The model L has become the prototype for a canonical model of set theory. The constructible universe is defined and analysed through definability hierarchies. The present paper illustrates that there exists a variety of appropriate hierarchies for L. We briefly recall the two classical examples: the Lα -hierarchy by G¨odel and the Jα -hierarchy by Jensen. We shall then introduce a new hierarchy (Fα ) for L and relate it to Jensen’s hierarchy. The main technical result will be that the Jensen-hierarchy consists of the limit levels of the Fα -hierarchy which underlines again the canonical nature of the Jensen-hierarchy. G¨odel in [G¨od39] defined the hierarchy (Lα )α∈Ord by iterating a definable powerset operation instead of the unrestricted powerset operation. Denoting the collection of all first-order definable subsets of (X, ∈) by Def(X), the hierarchy is defined by: L0 := ∅, Lα+1 := Def(Lα ), and Lλ := α<λ Lα , for limit ordinals λ. The sets formed in thevarious levels of this hierarchy are called the constructible sets, L := α∈Ord Lα is defined to be the constructible universe. The elements in Lα+1 = Def(Lα ) can be classified according to their defining formulas over (Lα , ∈) and to the parameters used. The formulas can be indexed by natural numbers, and the parameters have already been located in some earlier level. This allows to denote every set in L by a finite sets of ordinals or, indeed, by one ordinal so that the methods of induction and recursion can be applied to all elements of L. In this way,
A REFINEMENT OF JENSEN’S CONSTRUCTIBLE HIERARCHY
161
G¨odel proved that the class L is an inner model of set theory in which the axiom of choice and the generalized continuum hypothesis hold. The constructible universe satisfies many combinatorial properties and allows involved constructions that cannot be carried out under the ZFC axioms alone. Ronald Jensen analysed the process of set formation in the constructible universe in his paper The fine structure of the constructible hierarchy [Jen72] for which he received the 2003 AMS Steele Prize for a Seminal Contribution to Research. The theory is based on the analysis of first-order formulas which define a set in Lα+1 \Lα . Jensen’s approach requires a thorough syntactical study of formulas of arbitrary quantifier complexity. For his work, Jensen introduced the J-hierarchy for L, whose levels are closed with respect to rudimentary functions. These are the functions generated by the scheme: – constant functions and projection functions are rudimentary; – the formation of unordered pairs is rudimentary; – if f (x0 , . . . , xn−1 ) and g0 (y ), . . . , gn−1 (y ) are rudimentary functions then the composition f (g0 (y ), . . . , gn−1 (y )) is rudimentary; – if g is a rudimentary function then f (y, x) = z∈y g(z, x) is rudimentary. The smallest, rudimentary closed set containing X is denoted by rud(X). Then the J-hierarchy is defined recursively as follows: J0 := ∅, Jα+1 := rud(Jα ∪ {Jα }), and Jλ := α<λ Jα for limit ordinals λ. The fine structure theory augments the structures Jα with a host of additional components like canonical well-orders, Skolem functions and truth predicates in order to reduce arbitrary definability over some Jα to Σ1 -definability over the augmented structure. The fundamental lemmas of the theory ensure that the additional components are preserved under various operations in the hierarchy. We state some properties of the Jhierarchy. Proofs can be found in [Dev84, Chap. VI]. Lemma 1.1. There exists a Σ1 (Jα ) map from ωα onto ωα × ωα. Lemma 1.2. There exists a Σ 1 (Jα ) map from ωα onto Jα . Theorem 1.3. For every α ≥ 1, ℘(Jα )∩Jα+1 = Σω (Jα ). So in particular Σω (Jα ) ⊆ Jα+1 for every α ≥ 1.
162
PETER KOEPKE AND MARC VAN EIJMEREN
Recently, some variants of fine structure theory have been proposed which try to keep the “model theoretic” intuitions of Jensen’s theory whilst reducing the syntactical complexities. S. D. Friedman and the first author [Fri1 Koe97], e.g., modified the standard Lα -hierarchy in a way which allows to use the combinatorics of Silver machines [Sil7?] in combinatorial proofs whilst retaining G¨odel’s iterated definability approach. In this note, we present yet another fine structural hierarchy (Fα )α∈Ord . The central idea is to incorporate canonical well-orders and Skolem functions directly into the hierarchy, so that these do not have to be defined over the structures. The F-hierarchy is defined via quantifier-free definability, which simplifies fine structural arguments. It has been used by the first author to provide a simple proof of the Jensen covering theorem. Details and proofs will be published in [Koe∞]. In the present article we prove that the F-hierarchy is a refinement of the J-hierarchy; the J-hierarchy consists of the limit structures of the F-hierarchy, hence the F-hierarchy is an adequate hierarchy for the constructible universe. The proof uses the fact that sufficiently much of the recursion for the F-hierarchy can be carried out inside the structures Jα . The original motivation for introducing the F-hierarchy was core model theory. The F-hierarchy can be used nicely for structuring the Dodd-Jensen core model K, but there are problems with higher core models so far.
2
The F-hierarchy
We introduce the F-hierarchy and state some basic properties. We approximate the constructible universe by a hierarchy (Fα )α∈Ord of structures Fα = (Fα , Iα , <α , Sα , ∈). Each Fα will be a transitive set and <ω α∈Ord Fα = L. The functions Iα and Sα are the restrictions to Fα of a global interpretation function I and a global Skolem function S defined on L<ω . The relation <α is the restriction of a ternary guarded constructible well-order < to F3α . Let us first define a “Skolem” language S adequate for the structures Fα : Definition 2.1. Let S be the first-order language with the following components:
A REFINEMENT OF JENSEN’S CONSTRUCTIBLE HIERARCHY
163
– variable symbols v˙ n for n < ω; – logical symbols = ˙ (equality), ∧˙ (conjunction), ¬˙ (negation), ∃˙ (existential quantification), (, ) (brackets); – function symbols I˙ (interpretation), S˙ (Skolem function) of variable finite arity; – a binary relation symbol ∈˙ (set-membership) and a ternary relation ˙ (guarded well-order). symbol ≺ The syntax and semantics of S are defined as usual. The variable arity of I˙ and S˙ is handled by bracketing: if n < ω and t0 , . . . , tn−1 are ˙ 0 , . . . , tn−1 ) and S(t ˙ 0 , . . . , tn−1 ) are S-terms. If t0 , t1 , S-terms then I(t ˙ 1 and t0 ≺ ˙ t1 t2 are atomic S-formulas. We also t2 are S-terms then t0 ∈t denote the set of first-order S-formulas by S. We assume that S is G¨odelized in an effective way: the set of formulas satisfies S ⊆ ω <ω , and the usual syntactical operations of S are recursively definable over Vω . This includes the simultaneous substitu in ϕ. The notation ϕ(v˙ 0 , . . . , v˙ n−1 ) tion ϕ wt of terms t for variables w indicates that the free variables of ϕ are contained in {v˙ 0 , . . . , v˙ n−1 }. By S0 we denote the collection of quantifier-free formulas of S. The formulas of S are interpreted in S-structures in the obvious way. If A = ˙ A , S˙ A , ∈˙ A ) is an S-structure, ϕ(v˙ 0 , . . . , v˙ n−1 ) ∈ S and a0 , . . . , (A, I˙A , ≺ an−1 ∈ A then A |= ϕ[a0 , . . . , an−1 ] says that A is a model of ϕ under the variable assignment v˙ i → ai for i < n. The Fα -hierarchy is defined by iterated S0 -definability: Definition 2.2. The fine hierarchy consists of a monotone sequence of S-structures Fα = (Fα , I, ≺, S, ∈) = (Fα , IFα , ≺Fα , SFα , ∈) where Fα , IFα , ≺Fα , SFα are defined by recursion on α ∈ Ord: For α ≤ ω let Fα = Vα and ∀x ∈ Fα : I(x) = S(x) = 0. Let <ω be a recursive binary relation which well-orders Fω = Vω in ordertype ω and which extends the ∈-relation on {Vn | n < ω}. Define the ternary relation ≺ on Fα by: x ≺z y iff (z = Fn for some n < α, x, y ∈ z and x <ω y). This defines the structures Fα for α ≤ ω. Assume that α ≥ ω and that Fα = (Fα , I, ≺, S, ∈) has been defined. For ϕ(v˙ 0 , . . . , v˙ n ) ∈ S0 and p ∈ Fα set I(Fα , ϕ, p) = {x ∈ Fα | Fα |= ϕ[x, p ]}.
(*)
164
PETER KOEPKE AND MARC VAN EIJMEREN
In this case, (Fα , ϕ, p) is a name for its interpretation I(Fα , ϕ, p) in the fine hierarchy. The next fine level is defined as Fα+1 = {I(Fα , ϕ, p) | ϕ ∈ S0 , p ∈ F<ω α }. We have to define I(z) for “new” vectors z ∈ (Fα+1 )<ω \F<ω α . Certain assignments were made in (*); in all other cases set I(z) = 0. Define the ternary relation ≺Fα+1 extending ≺Fα by adjoining the following triples: x ≺Fα y iff x, y ∈ Fα and there is a name (Fβ , ϕ, p) for x such that every name (Fγ , ψ, q) for y is lexicographically greater than (Fβ , ϕ, p), where coordinates are well-ordered from left to right by ∈, <ω and ≺Fmax(β,γ) , respectively. The Skolem function S finds witnesses of existential statements. We only need to define S(z) for z ∈ (Fα+1 )<ω \F<ω z ) = 0 except α . Set S( when z = (Fα , ϕ, p), where ϕ ∈ S0 , p ∈ F<ω and I(F , ) = ∅. In α ϕ, p α that case let S(z) be the ≺Fα -least element of I(Fα , ϕ, p). This defines Fα+1 = (Fα+1 , I, ≺, S, ∈). Assume that λ > ω is a limit ordinal and that Fα is defined for α < λ. Thenthe limit structure Fλ = (Fλ , I, ≺, S, ∈)is defined by unions: F λ = α<λ Fα , IFλ = α<λ IFα , ≺Fλ = α<λ ≺Fα , SFλ = α<λ SFα . The fine hierarchy satisfies some natural hierarchical properties some of which were already tacitly assumed in the previous definition: Proposition 2.3. For every γ ∈ Ord: 1. α ≤ γ → Fα ⊆ Fγ ; 2. α < γ → Fα ∈ Fγ ; 3. Fγ is transitive. First-order definability can be emulated in S0 : Proposition 2.4. For every ∈-formula ϕ(v˙ 0 , . . . , v˙ n−1 ) one can uniformly define a quantifier free formula ϕ∗ (v˙ 0 , . . . , v˙ n−1 , v˙ n , . . . , v˙ n+k ) ∈ S0 such that for all α ≥ ω and for all a0 , . . . , am ∈ Fα : (Fα , ∈) |= ϕ[a0 , . . . , an−1 ] ⇐⇒ Fα+k |= ϕ∗ [a0 , . . . , an−1 , Fα , Fα+1 , . . . , Fα+k ].
A REFINEMENT OF JENSEN’S CONSTRUCTIBLE HIERARCHY
165
Proposition 2.5. Every Fωα is closed with respect to first order definability. Theorem 2.6. α∈Ord Fα = L. So we have defined a hierarchy for the constructible universe. Proofs of the above theorem and propositions can be found in [Koe∞]. We show that the Fα -hierarchy is a refinement of the Jα -hierarchy: Theorem 2.7. For every α ∈ Ord, Fωα = Jα . The proof of this result occupies the rest of this paper and consists of a sequence of lemmas. Using Proposition 2.5 one can easily prove one inclusion. Lemma 2.8. For every α ∈ Ord, Jα ⊆ Fωα . Proof. By induction on α. For α = 0 and α = 1 this is true by definition, so assume Jγ ⊆ Fωγ for every γ < α. Jα is the rudimentary closure of the Jγ ’s. As rudimentary functions are first-order definable, Fωα is rudimentarily closed by Proposition 2.5. Hence Fωα includes Jα . q.e.d. For the other inclusion, we show that the F-hierarchy is absolute for each Jα where α > 1, i.e., the above recursive definition of the Fhierarchy can be carried out within Jα and will define the same sequence as the recursive definition in V for all ordinals in Jα . Since Jα is rudimentary closed and contains the language S as an element, we see by inspection that the recursive conditions in the definition of the F-hierarchy are absolute for Jα . Properties like I(Fγ , ϕ, p) = {x ∈ Fγ | Fγ |= ϕ[x, p ]} or Fγ+1 = {I(Fγ , ϕ, p) | ϕ ∈ S0 , p ∈ F<ω γ } refer to a quantifier-free satisfaction relation. Evaluating this within Jα involves evaluating finite sequences of values of terms and truth-values according to the structure of the formula ϕ considered. These finite sequences exist in the structure Jα just like they exist in the universe V. The other recursive conditions in Definition 2 are absolute for similar reasons. So the recursion can indeed be carried out absolutely in Jα provided we can prove the following
166
PETER KOEPKE AND MARC VAN EIJMEREN
Claim 5. For all ordinals α and all µ < ωα, we have (Fν | ν < µ) ∈ Jα . This can be shown by induction on α. The limit case is trivial: if α is a limit ordinal the inductive hypothesis implies that for all β < α, all initial segments of the F-hierarchy of length < ωβ are in Jβ . This yields the property for Jα . Now consider the successor case: assume that α = β + 1 and that ∀µ < ωβ (Fν | ν < µ) ∈ Jβ . Then (Fν | ν < ωβ) is definable over Jβ by the recursion formula and is thus an element of Jα . To show that the sequences (Fν | ν < µ) for ωβ < µ < ωα are elements of Jα it suffices to show by induction on ν, ωβ ≤ ν < ωα that Fν ∈ Jα . Since the claim holds for β, (Fν | ν < ωβ) is recursively definable over Jβ . Then the structure Fωβ is definable over Jβ as the union of that tower of structures. Thus Fωβ ∈ Jβ . For the successor step of this inner induction, we use a surjection h ∈ Jβ+1 from ωβ onto Fωβ+n to define an Fωβ+n -like structure over ωβ, hence over Jβ . We can then define the next level of the F-hierarchy within Jβ+1 . We start with some technical lemmas. Lemma 2.9. Let x ∈ Jβ+1 and let h ∈ Jβ+1 be a surjection from ωβ onto x. Then x<ω ∈ Jβ+1 . Proof. The set (ωβ)<ω = {s | Func(s) ∧ dom(s) ∈ ω ∧ ran(s) ⊆ ωβ} is a definable subset of Jβ . Hence (ωβ)<ω ∈ Jβ+1 . Since Jβ+1 is rudimentarily closed: x<ω = {h ◦ s | s ∈ (ωβ)<ω } h◦s = s∈(ωβ)<ω
∈ Jβ+1 . q.e.d.
For a surjection h from ωβ onto x we denote by h the surjection from (ωβ)<ω onto x<ω induced by h: for any s = (ξ0 , . . . , ξn−1 ) set h(s) := h ◦ s = (h(ξ0 ), . . . , h(ξn−1 )). Then h ∈ Jβ+1 implies that h ∈ Jβ+1 .
A REFINEMENT OF JENSEN’S CONSTRUCTIBLE HIERARCHY
167
Lemma 2.10. For every α > 1, S0 ∈ Jα . Proof. In [Dev84, VI.1.14], a language similar to S is defined over J1 . Therefore, S0 is definable over J1 and an element of Jα for α > 1. q.e.d. The following lemma then concludes the proof of our theorem:
Lemma 2.11. Let ν = ωβ + n, Fν = (Fν , I, ≺, S, ∈) ∈ Jβ+1 , hν a surjection from ωβ onto Fν and hν ∈ Jβ+1 . Then Fν+1 ∈ Jβ+1 and there is a surjection hν+1 from ωβ onto Fν+1 such that hν+1 ∈ Jβ+1 . ¯ ν = (F ¯ ν , iν , ≺ν , sν , ∈ν ) over ωβ which is Proof. Define a structure F analogous to Fν : ¯ ν := {γ ∈ ωβ | hν (γ) ∈ Fν } F iν (ϕ, s) := {γ ∈ ωβ | hν (γ) ∈ Iν (ϕ, hν (s))} ≺ν := {(γ, δ) | (hν (γ), hν (δ)) ∈ <ν } ∈ν := {(γ, δ) | hν (γ) ∈ hν (δ)} These are all subsets of Jβ and elements of Jβ+1 , hence definable over Jβ . We complete the structure by defining: iν :=
{((ϕ, s), iν (ϕ, s))}
ϕ∈S0 s∈(ωβ)<ω
sν :=
{(s, sν (hν (s)))}
s∈(ωβ)<ω
fν := (fν , iν , ≺ν , sν , ∈ν ). The latter definitions are compositions of rudimentary functions and elements of Jβ+1 , hence fν ∈ Jβ+1 .
168
PETER KOEPKE AND MARC VAN EIJMEREN
We can now extend fν to a structure fν+1 = (fν+1 , iν+1 , ≺ν+1 , sν+1 ) in a similar way as in the F-hierarchy: iν+1 (ϕ, s) := {γ ∈ ωβ | fν |= ϕ[γ, s]} {((ϕ, s), iν+1 (ϕ, s))} iν+1 := ϕ∈S0 s∈(ωβ)<ω
fν+1 :=
{iν+1 (ϕ, s)}
ϕ∈S0 s∈(ωβ)<ω
≺ν+1 :=≺ν ∪{(γ, δ) | γ ∈ fν ∧ δ ∈ fν+1 \fν } ∪ {(γ, δ) | ∃ϕ, s ∀ψ, t(γ = iν+1 (ϕ, s) ∧ δ = iν+1 (ψ, t) ∧ (ϕ ∈ ψ ∨ (ϕ = ψ ∧ s
We look closer at the definitions to show that the structure fν+1 is an element of Jβ+1 . First we remark that similarly to ordinary Σ0 -satisfaction, the f-satisfaction relation for quantifier-free formulae is rudimentary. For ordinary Σ0 -satisfaction this carried out in detail in [Dev84]. Therefore iν+1 ∈ Jβ+1 and fν+1 ∈ Jβ+1 . To see that ≺ν+1 ∈ Jβ+1 we note first that the first part of the definition is an element of Jβ+1 by assumption and the second as well as we know fν+1 ∈ Jβ+1 . For the third part we note that s
t(n) → (∃m < n) s(m) ≺β t(m)))) Finally the Skolem function sν+1 is readily definable from ≺ν and iν+1 , hence is an element of Jβ+1 . So fν+1 := (fν+1 , iν+1 , ≺ν+1 , sν+1 ) ∈ Jβ+1 .
A REFINEMENT OF JENSEN’S CONSTRUCTIBLE HIERARCHY
169
We now define Fν+1 :=
hν (γ)
γ∈fν+1
iν+1 (ϕ, s) Iν+1
:= {hν (γ) | γ ∈ iν+1 } := {((ϕ, hν (s)), iν+1 (ϕ, s))} ϕ∈S0 s∈(ωβ)<ω
<ν+1 := {(hν (γ), hν (δ) | (γ, δ) ∈≺ν+1 } Sν+1 := {(hν (s), hν (γ)) | (s, γ) ∈ sν+1 } Fν+1 := (Fν+1 , Iν+1 , <ν+1 , Sν+1 ) By a now familiar argument we find Fν+1 ∈ Jβ+1 We still need to construct a surjection hν+1 from ωβ onto Fν+1 . We can define a surjection g from ωβ onto ω <ω × (ωβ)<ω (see for example [Dod82]). Let g be its restriction to S0 ×(ωβ)<ω . Then we define hν+1 := hν ◦ iν+1 ◦ g . It is easily seen this is a surjection and by construction we have g ∈ Jβ+1 , hence hν+1 ∈ Jβ+1 . q.e.d. In all, we have proved Theorem 2.7.
q.e.d.
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Effective Hausdorff dimension Elvira Mayordomo C´amara Departamento de Inform´atica e Ingenier´ıa de Sistemas Universidad de Zaragoza 50015 Zaragoza, Spain E-mail: [email protected]
Abstract. Lutz in [Lut00a] has recently proved a new characterization of Hausdorff dimension in terms of gales, which are betting strategies that generalize martingales. We present here this characterization and give three instances of how it can be used to define effective versions of Hausdorff dimension in the contexts of constructible, finite-state, and resource-bounded computation.
1
Introduction
Resource-bounded measure, a generalization of classical Lebesgue measure, was developed by Lutz in 1991 [Lut92] in order to investigate the internal structures of complexity classes. This line of research has proven to be very fruitful [Amb0 May1 97], [Lut97], [LutMay01], but there are certain inherent limitations to the information that resource-bounded measure can provide. These limitations, also present in classical Lebesgue measure, come from the fact that measure cannot make quantitative Received: November 23rd, 2001; In revised version: March 27th, 2002; June 5th, 2002; Accepted by the editors: June 18th, 2002. 2000 Mathematics Subject Classification. 68Q15 68Q30, 94A15.
This research was supported in part by Spanish Government MEC project PB98-0937-C0402. I am very grateful to Jack Lutz for his helpful remarks on earlier drafts of this paper. I thank Jack Lutz and John Hitchcock for providing early drafts of their papers. I also thank an anonymous referee for several suggestions that improved the paper.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 171-186. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
172
´ ELVIRA MAYORDOMO CAMARA
distinctions inside a measure zero set, and also from the Kolmogorov zero-one law ([Dai01], [Lut98]) that implies that for most sets of interest in computational complexity and recursion theory the measure is zero, one or undefined. Hausdorff dimension was defined as an augmentation of Lebesgue measure theory [Hau19]. Every subset of a given metric space is assigned a dimension. We are interested in the metric space being the Cantor space C consisting of all infinite binary strings. In this case the dimension of each set X ⊆ C is a real number dimH (X) ∈ [0, 1]. Hausdorff dimension is monotone, dimH (∅) = 0, dimH (C) = 1, and intermediate values occur for many interesting sets. Also, if dimH (X) < 1 then X is a measure zero subset of C, so Hausdorff dimension can quantitatively distinguish among measure zero sets. Hausdorff dimension was originally defined topologically (cf. [Hau19], [Fal85]). Lutz in [Lut00a] recently proved a new characterization of Hausdorff dimension in terms of gales or supergales, which are betting strategies that generalize martingales. The class of gales that “succeed” on the sequences in a set X ⊆ C determines the dimension of X. Therefore a natural generalization of Hausdorff dimension arises by restricting the class of admissible gales or supergales. This is useful when we want a nontrivial dimension inside a countable set such as the set of all decidable sequences. In this paper we present three effectivizations of Hausdorff dimension. The first one is constructive dimension (cf. [Lut00b], [Lut02]), defined by restricting to constructive (lower semicomputable) supergales, the second is finite-state dimension ([Dai+01]), defined by the natural finite-state effectivization, and the third one is resource-bounded dimension, in which a particular complexity time-bound is enforced on gales. Hausdorff dimension has also been related to measures of information content. Ryabko (cf. [Rya86], [Rya93], [Rya94]), Staiger (cf. [Sta0 93], [Sta0 98]) and Cai and Hartmanis (cf. [CaiHar3 94]) have all proven results that relate Hausdorff dimension with Kolmogorov complexity. This line of thought is also present in effective dimension. Constructive dimension can be fully characterized in terms of Kolmogorov complexity (cf. [May02]) and finite state dimension is definable in terms of sequence compressibility (cf. [Dai+01]). We present here these two results supporting the intuition that dimension is a measure of information density.
EFFECTIVE HAUSDORFF DIMENSION
173
In Section 2 we introduce a few preliminary definitions. Section 3 contains Lutz’s characterization of Hausdorff dimension, with the key concepts of gale and supergale. Section 4 presents constructive dimension of sets and sequences. Section 5 contains finite-state dimension, and Section 6 is a brief sketch of resource-bounded dimension.
2
Preliminaries
We work in the set {0, 1}∗ of all (finite, binary) strings and in the Cantor space C of all (infinite, binary) sequences. We write |w| for the length of a string w ∈ {0, 1}∗ . The empty string is denoted λ. For S ∈ C and i, j ∈ N, i ≥ j, we write S[i..j] for the string consisting of the ith through jth bits of S, stipulating that S[0..0] is the leftmost bit of S. For each w ∈ {0, 1}∗ , w is a prefix of S if w = S [0..|w| − 1] or w = λ. We will freely identify each language A ⊆ {0, 1}∗ with its characteristic sequence χA ∈ C. We will refer to the low levels of the arithmetical hierarchy of sets of sequences, specifically to sets in Π01 , Σ02 , and Π02 . The definition can be found in [Rog0 67]. We will also mention sequences in Σ01 , that is, r.e. sequences, sequences in Π01 , that is, co-r.e. sequences, and sequences in ∆02 , that is, sequences that are decidable relative to the halting oracle. if A function f : {0,1}∗ → R is lower semicomputable its lower − graph Graph (f ) = (x, s) ∈ {0, 1}∗ × [0, ∞) s < f (x) is recursively enumerable. A real number α is computable if there is a computable function f : N → Q such that for all r ∈ N, |f (r) − α| < 2−r . A real number α is ∆02 -computable if there is a function f : N → Q that is computable relative to the halting oracle such that for all r ∈ N, |f (r) − α| < 2−r . Let t : N → N. A function f : {0, 1}∗ → R is computable in time (space) t if there is a function fˆ : {0, 1}∗ × N → Q that can be computed by a Turing Machine in time (space) t and such that for all x ∈ {0, 1}∗ , r ∈ N, |fˆ(x, r) − f (x)| < 2−r . The Kolmogorov complexity K(x) of a finite binary sequence x is the length of the shortest description of x (the full definition and properties can be found in the book by Li and Vit´anyi [LiVit97]).
174
´ ELVIRA MAYORDOMO CAMARA
We use the LOGSPACE-uniform version of the bounded-depth circuit complexity class AC0 ([Joh90]). For each f : N → N, SIZE(f ) is the set of languages A ⊆ {0, 1}∗ such that for all n ∈ N there is an n-input boolean circuit of at most f (n) gates that recognizes A ∩ {0, 1}n . We define the frequency of a nonempty string w ∈ {0, 1}∗ to be , where #(b, w) denotes the number of occurthe ratio freq(w) = #(1,w) |w| rences of the bit b in w. For each α ∈ [0, 1], we define the set FREQ(α) = S ∈ C lim freq(S [0..n − 1]) = α . n→∞
The binary Shannon entropy function H : [0, 1] → [0, 1] is defined as H(x) = x log
1 1 , + (1 − x) log 1−x x
with H(0) = H(1) = 0 to ensure continuity at 0 and 1.
3
Lutz’ Characterization of Hausdorff Dimension
This section reviews the gale characterization of classical Hausdorff dimension, which motivates our development. Definition 3.1. Let s ∈ [0, ∞) 1. An s-supergale is a function d : {0, 1}∗ → [0, ∞) that satisfies the condition d(w) ≥ 2−s [d(w0) + d(w1)] (∗) for all w ∈ {0, 1}∗ . 2. An s-gale is an s-supergale that satisfies condition (∗) with equality for all w ∈ {0, 1}∗ . 3. A supermartingale is a 1-supergale. 4. A martingale is a 1-gale. Intuitively, an s-supergale is a strategy for betting on the successive bits of a sequence S ∈ C. For each prefix w of S, d(w) is the capital (amount of money) that d has after betting on the bits w of S. When betting on the next bit b of a prefix wb of S (assuming that b is equally likely to be 0 or 1), condition (∗) tells us that the expected value of d(wb)
EFFECTIVE HAUSDORFF DIMENSION
175
–the capital that d expects to have after this bet– is (d(w0) + d(w1))/2 ≤ 2s−1 d(w). If s = 1, this expected value is at most d(w) –the capital that d has before the bet– so the payoffs are at most “fair.” If s < 1, this expected value is less than d(w), so the payoffs are “less than fair.” Similarly, of s > 1, the payoffs can be “more than fair.” Note: We will use in Section 4 the concept of β-martingale, where β is a real number in [0, 1]. In this case when betting on the next bit b of a prefix wb or S we assume that β is the probability of b being 1. Specifically, a β martingale is a function d : {0, 1}∗ → [0, ∞) that satisfies the condition d(w) = (1 − β) d(w0) + β d(w1) for all w ∈ {0, 1}∗ . Note that for β = 1/2 we obtain the above definition. Of course the objective of an s-supergale is to win a lot of money. Definition 3.2. Let d be an s-supergale, where s ∈ [0, ∞). 1. We say that d succeeds on a sequence S ∈ C if lim sup d(S [0..n − 1]) = ∞. n→∞
2. The success set of d is
S ∞ [d] = S ∈ C d succeeds on S .
3. For X ⊆ C, G(X) is the set of all s ∈ [0, ∞) such that there is an s-gale d for which X ⊆ S ∞ [d]. 4. For X ⊆ C, G(X) is the set of all s ∈ [0, ∞) such that there is an s-supergale d for which X ⊆ S ∞ [d]. Note that if s, s ∈ [0, ∞) then for every s-supergale d, the func tion d : {0, 1}∗ → [0, ∞) defined by d (w) = 2(s −s)|w| d(w) is an s supergale. It was shown in [Lut00a] that the following definition is equivalent to the classical definition of Hausdorff dimension in C. Definition 3.3. The Hausdorff dimension of a set X ⊆ C is dimH (X) = inf G(X) = inf G(X).
176
´ ELVIRA MAYORDOMO CAMARA
See [Fal85] for a good overview of classical Hausdorff dimension, including the original topological definition based on open covers by balls of diminishing radii (cf. [Hau19]). The gale characterization of Hausdorff dimension that we have presented can be generalized by restricting the class of gales or supergales that are allowed. We will follow this idea in the rest of the paper. Eggleston in [Egg49] proved the following classical result on the Hausdorff dimension of a set of sequences with a fixed asymptotic frequency. Theorem 3.4. For each real number α ∈ [0, 1], dimH (FREQ(α)) = H(α). We will reformulate this last result in the contexts of the dimensions defined in Sections 4, 5, and 6.
4
Constructive Dimension
In this section we present the first effective version of of Hausdorff dimension that is defined by restricting the class of supergales to those that are lower semicomputable. We first give the definitions of constructive dimension of a set and constructive dimension of a sequence, and then we relate them and give their main properties. We finish the section summarising the main results on constructive dimension. The results in this section are mainly from [Lut00b] and [Lut02], also including those in [May02] and [Hit02]. An s-supergale is constructive if it is lower semicomputable. We define constructive dimension as follows. Definition 4.1. Let X ⊆ C. 1. Gconstr (X) is the set of all s ∈ [0, ∞) such that there is a constructive s-supergale d for which X ⊆ S ∞ [d]. 2. The constructive dimension of a set X ⊆ C is cdim(X) = inf Gconstr (X). 3. The constructive dimension of an individual sequence S ∈ C is dim(S) = cdim({S}).
EFFECTIVE HAUSDORFF DIMENSION
177
By Lutz characterization of Hausdorff dimension (Definition 3.3), we conclude that cdim(X) ≥ dimH (X) for all X ⊆ C. But in fact much more is true for certain classes, as Hitchcock shows in [Hit02]. For sets that are low in the arithmetical hierarchy, constructive dimension and Hausdorff dimension coincide. Theorem 4.2 (Hitchcock). If X ⊆ C is a union of Π01 sets, then dimH (X) = cdim(X). We note that Theorem 4.7 below yields a new proof of Theorem 4.2 from Theorem 5 of Staiger in [Sta0 98]. Hitchcock also proves that this is an optimal result for the arithmetical hierarchy, since it cannot be extended to sets in Π02 . For Hausdorff dimension, all singletons have dimension 0, and in fact all countable sets have Hausdorff dimension 0. The situation changes dramatically when we restrict to constructive supergales, since a singleton can have positive constructive dimension, and in fact can have any constructive dimension. Theorem 4.3. For every α ∈ [0, 1], there is an S ∈ C such that dim(S) = α. [Lut02] We note that Theorem 4.7 below yields a new proof of Theorem 4.3 from Lemma 3.4 of Cai Hartmanis [CaiHar3 94]. The constructive dimension of any set X ⊆ C is completely determined by the dimension of the individual sequences in the set. Theorem 4.4 (Lutz). For all X ⊆ C, cdim(X) = supx∈X dim(x).1 There is no analogue of this last theorem for Hausdorff dimension or for any of the concepts we will define in Sections 5 and 6. The key ingredient in the proof of Theorem 4.4 is the existence of optimal constructive supergales, that is, constructive supergales that multiplicatively dominate any other constructive supergale. This is analogous to the existence of universal tests of randomness in the theory of random sequences. Theorem 4.2 together with Theorem 4.4 implies that the classical Hausdorff dimension of every Σ02 set X ⊆ C has the pointwise characterization dimH (X) = supx∈X dim(x). 1
Cf. [Lut02].
178
´ ELVIRA MAYORDOMO CAMARA
Theorem 4.4 immediately implies that constructive dimension has the countable stability property, which also holds for classical Hausdorff dimension. The proof can be found in [Lut02]. Corollary 4.5. For all X0 , X1 , X2 , . . . ⊆ C, ∞ Xk = sup cdim(Xk ). cdim k=0
k∈N
Let α ∈ [0, 1]. If we define DIM≤α = S ∈ C dim(S) ≤ α , then this is the largest set of constructive dimension α. Theorem 4.6. For every α ∈ [0, 1], the set DIM≤α has the following two properties. 1. cdim(DIM≤α ) = α. 2. For all X ⊆ C, if cdim(X) ≤ α, then X ⊆ DIM≤α . The proof can be found in [Lut02]. We note that Theorem 4.7 below yields a new proof of Theorem 4.6 (part 1) from Theorem 2 of Ryabko [Rya84]. The constructive dimension of a sequence can be characterized in terms of the Kolmogorov complexities of its prefixes. Theorem 4.7. For all A ∈ C, K(A [0..n − 1]) . n→∞ n This latest theorem is proven in [May02] and justifies the intuition that the constructive dimension of a sequence is a measure of its algorithmic information density. The relation of dimension and information content will appear again in a different context in the next section, where the finite-state dimension of a sequence (to be defined) is characterized in terms of its compressibility. We now briefly state the main results proven so far on constructive dimension, including the existence of ∆02 sequences of any ∆02 dimension, the constructive version of Eggleston theorem, and the constructive dimension of sequences that are random relative to a non-uniform distribution. An important result in the theory of random sequences is the existence of random sequences in ∆02 . We have the following analogue for constructive dimension. dim(A) = lim inf
EFFECTIVE HAUSDORFF DIMENSION
179
Theorem 4.8. For every ∆02 -computable real number α ∈ [0, 1] there is a ∆02 sequence S such that dim(S) = α. [Lut02] And this cannot be improved to Σ01 or Π01 sequences since they all have constructive dimension 0. This is the constructive version of the classical Theorem 3.4 of Eggleston in [Egg49] and can be found in [Lut02]. Theorem 4.9. If α is ∆02 -computable real number in [0, 1] then cdim(FREQ(α)) = H(α). An anonymous referee has pointed out that an alternative proof of Theorem 4.9 can be derived from the newer Theorem 4.7 and earlier results of Eggleston (cf. [Egg49]) and Kolmogorov (cf. [ZvoLev70]). In fact, this approach shows that Theorem 4.9 holds for arbitrary α ∈ [0, 1]. A sequence is (Martin-L¨of) random [Mar3 66] if it passes every algorithmically implementable test of randomness. This can be reformulated in terms of martingales as follows Definition 4.10. A sequence A ∈ C is (Martin-L¨of) random if there is no constructive supermartingale d such that A ∈ S ∞ [d]. [Sch6 71] By this definition, random sequences have constructive dimension 1. For nonuniform distributions we have the concept of β-randomness, for β any real number in (0, 1) representing the bias. Definition 4.11. Let β ∈ (0, 1). A sequence A ∈ C is (Martin-L¨of) random relative to β if there is no constructive β-martingale d such that A ∈ S ∞ [d]. [Sch6 71] Lutz relates randomness relative to a non-uniform distribution with Shannon information theory. Theorem 4.12. Let β ∈ (0, 1) be a computable real number. Let A ∈ C be random relative to β. Then dim(A) = H(β). A more general result for randomness relative to sequences of cointosses is obtained in [Lut02]. Very recently, Lutz in [Lut02] has introduced the concept of dimension of finite binary strings, with very interesting connections with constructive dimension. We cannot cover this topic here due to lack of space.
´ ELVIRA MAYORDOMO CAMARA
180
5
Finite-State Dimension
Our second effectivization of Hausdorff dimension will be the most restrictive of those presented here, we will go all the way to the level of finite-state computation. In this section we use gales computed by multiaccount finite-state gamblers to develop the finite-state dimensions of sets of binary sequences and individual binary sequences. The theorem of Eggleston is shown to hold for finite-state dimension. Every rational number in [0, 1] is the finite-state dimension of a sequence in the lowlevel complexity class AC0 . The main theorem of this section shows that the finite-state dimension of a sequence is precisely the infimum of all compression ratios achievable on the sequence by information-lossless finite-state compressors. All the results in this section are from [Dai+01]. We start by introducing the concept of finite-state gambler that is used to develop finite-state dimension. Intuitively, a finite-state gambler is a finite-state device that places k separate binary bets on each of the successive bits of its input sequence. These bets correspond to k separate accounts into which the gambler’s capital is divided. Bets are required to be rational numbers in B = Q ∩ [0, 1]. Definition 5.1. If k is a positive integer, then a k-account finite-state gambler (k-account FSG) is a 5-tuple q0 , c0 ), G = (Q, δ, β, where – – – – –
Q is a nonempty finite set of states, δ : Q × {0, 1} → Q is the transition function, β : Q → Bk is the betting function, q0 ∈ Q is the initial state, and c0 = (c0,1 , . . . , c0,k ), the initial capital vector, is a sequence of nonnegative rational numbers.
A finite state gambler (FSG) is a k-account FSG for some positive integer k. The case k = 1, where there is only one account, is the model of finite-state gambling that has been considered (in essentially equivalent
EFFECTIVE HAUSDORFF DIMENSION
181
form) by Schnorr and Stimm in [Sch6 Sti72], Feder in [Fed91], and others. q0 , c0 ) is in state q ∈ Q Intuitively, if a k-account FSG G = (Q, δ, β, and its current capital vector is c = (c1 , . . . , ck ) ∈ (Q ∩ [0, ∞))k , then for each of its accounts i ∈ {1, . . . , k}, it places the bet βi (q) ∈ B. If the payoffs are fair, then after this bet G will be in state δ(q, b) and its ith account will have capital if b = 1, 2βi (q)ci 2ci [(1 − b)(1 − βi (q)) + bβi (q)] = 2(1 − βi (q))ci if b = 0. This suggests the following definition. q0 , c0 ) be a k-account finite-state gamDefinition 5.2. Let G = (Q, δ, β, bler. 1. For each 1 ≤ i ≤ k, the ith martingale of G is the function dG,i : {0, 1}∗ → [0, ∞) defined by the recursion dG,i (λ) = c0,i , dG,i (wb) = 2dG,i (w)[(1 − b)(1 − βi (δ(w))) + bβi (δ(w))] for all w ∈ {0, 1}∗ and b ∈ {0, 1}. 2. The total martingale (or simply the martingale ) of G is the function dG =
k
dG,i .
i=1
3. For s ∈ [0, ∞), the s-gale of an FSG G is the function dG : {0, 1}∗ → [0, ∞) (s)
defined by (s)
dG (w) = 2(s−1)|w| dG (w) for all w ∈ {0, 1}∗ . In particular, note that dG = dG . 4. For s ∈ [0, ∞), a finite-state s-gale is an s-gale d for which there (s) exists an FSG G such that dG = d. (1)
182
´ ELVIRA MAYORDOMO CAMARA
We now use finite-state gales to define finite-state dimension. Definition 5.3. Let X ⊆ C. 1. GFS (X) is the set of all s ∈ [0, ∞) such that there is a finite-state s-gale d for which X ⊆ S ∞ [d]. 2. The finite-state dimension of set X is dimFS (X) = inf GFS (X). 3. The finite-state dimension of a sequence S ∈ C is dimFS (S) = dimFS ({S}). In general, dimFS (X) is a real number satisfying 0 ≤ dimH (X) ≤ dimFS (X) ≤ 1. Like Hausdorff dimension, finite-state dimension has the stability property. Theorem 5.4. For all X, Y ⊆ C, dimFS (X ∪ Y ) = max {dimFS (X), dimFS (Y )} . Let us briefly digress on the role of multiple accounts in the FSG model. The proof of Theorem 5.4 is based on the fact that if d1 and d2 are finite-state s-gales then d1 + d2 is a finite-state s-gale, that is, finite-state gales are closed under nonnegative, rational, linear combinations. But multiple accounts are required for this closure property to hold, since there exist 1-account finite-state s-gales d1 and d2 such that d1 + d2 is not a 1-account finite-state s-gale. Notwithstanding the usefulness of the above closure property, the question remains whether multiple accounts are strictly necessary for a theory of finite-state dimension. The next result shows that multiple accounts are not strictly necessary if we are willing to accept a large blowup in the number of states. Theorem 5.5. For each n-state, k-account FSG G and each ε ∈ (0, 1) 1 there is an n · k ε -state, 1-account FSG G such that for all s ∈ [0, 1], S ∞ [dG ] ⊆ S ∞ [dG (s)
(s+ε)
].
The theorem of Eggleston in [Egg49, Theorem 3.4] holds for finitestate dimension.
EFFECTIVE HAUSDORFF DIMENSION
183
Theorem 5.6. For all α ∈ Q ∩ [0, 1], dimFS (FREQ(α)) = H(α). The following theorem says that every rational number r ∈ [0, 1] is the finite-state dimension of a reasonably simple sequence. Theorem 5.7. For every r ∈ Q ∩ [0, 1] there exists S ∈ AC0 such that dimFS (S) = r. The main result in this section is that we can characterize the finitestate dimensions of individual sequences in terms of finite-state compressibility. We first recall the definition of an information-lossless finitestate compressor. (This idea is due to Huffman [Huf59]. Further exposition may be found in [Koh78] or [Kur74].) Definition 5.8. A finite-state compressor (FSC ) is a 4-tuple C = (Q, δ, ν, q0 ), where – – – –
Q is a nonempty, finite set of states, δ : Q × {0, 1} → Q is the transition function, ν : Q × {0, 1} → {0, 1}∗ is the output function, and q0 ∈ Q is the initial state.
For q ∈ Q and w ∈ {0, 1}∗ , we define the output from state q on input w to be the string ν(q, w) defined by the recursion ν(q, λ) = λ ν(q, wb) = ν(q, w)ν(δ(q, w), b) for all w ∈ {0, 1}∗ and b ∈ {0, 1}. We then define the output of C on input w ∈ {0, 1}∗ to be the string C(w) = ν(q0 , w). Definition 5.9. An FSC C = (Q, δ, ν, q0 ) is information-lossless (IL) if the function {0, 1}∗ → {0, 1}∗ × Q w → (C(w), δ(w)) is one-to-one. An information-lossless finite-state compressor (ILFSC) is an FSC that is IL.
184
´ ELVIRA MAYORDOMO CAMARA
That is, an ILFSC is an FSC whose input can be reconstructed from the output and final state reached on that input. Intuitively, an FSC C compresses a string w if |C(w)| is significantly less than |w|. Of course, if C is IL, then not all strings can be compressed. Our interest here is in the degree (if any) to which the prefixes of a given sequence S ∈ C can be compressed by an ILFSC. Definition 5.10. 1. If C is an FSC and S ∈ C, then the compression ratio of C on S is C (S) = lim inf n→∞
|C(S [0..n − 1])| . n
2. The finite-state compression ratio of a sequence S ∈ C is FS (S) = inf {C (S)|C is an ILFSC} . The following theorem says that finite-state dimension and finitestate compressibility are one and the same for individual sequences. Theorem 5.11. For all S ∈ C, dimFS (S) = FS (S). Theorem 5.11 is a new instance of the relation of dimension and information. Finite-state dimension is a real-time effectivization of a powerful tool of fractal geometry. As such it should prove to be a useful tool for improving our understanding of real-time information processing.
6
Dimension in Complexity Classes
We now use resource-bounded gales to develop dimension in complexity classes. Let p be the class of polynomial time computable functions from {0, 1}∗ to R. Let pspace be the class of polynomial space computable functions from {0, 1}∗ to R. Using these two classes we define p-dimension and pspace-dimension in the natural way. Definition 6.1. Let ∆ be p or pspace. Let X ⊆ C.
EFFECTIVE HAUSDORFF DIMENSION
185
1. G∆ (X) is the set of all s ∈ [0, ∞) such that there is an s-gale d ∈ ∆ for which X ⊆ S ∞ [d]. 2. The ∆-dimension of a set X ⊆ C is dim∆ (X) = inf G∆ (X). Let E = DTIME(2linear ) be the class of languages that can be recognized in linear exponential time. Let ESPACE = DSPACE(2linear ) be that class of languages that can be recognized in linear exponential space. The following result is proven in [Lut00a] and implies that p and pspace are the right dimension bounds for the classes E and ESPACE, respectively. Theorem 6.2. 1. dimp (E) = 1. 2. dimpspace (ESPACE) = 1. Therefore we can define dimension in the classes E and ESPACE as follows. Definition 6.3. For each X ⊆ C, 1. dim(X|E) = dimp (X ∩ E). 2. dim(X|ESPACE) = dimpspace (X ∩ ESPACE). These definitions endow the classes E and ESPACE with internal dimension structure. Several interesting results have already been proven ([Lut00a], [Hit01], [Amb1 +01]), indicating that dimension can give a new insight to open questions in Computational Complexity, in particular since many sets of interest in Computational Complexity have “fractallike” structures. We mention two of these results. Theorem 6.4. For all α ∈ Q, dim(FREQ(α) | E) = dimp (FREQ(α)) = H(α). Theorem 6.4 is proven in [Lut00a] for α a “p-computable” real number. Notice that we have proven Eggleston’s theorem in each of our formulations (for appropriate restrictions of α). This indicates that, no matter how we limit our power to compute dimension, a bounded asymptotic frequency of the elements implies a restriction on the dimension of the set. We finish with a result on nonuniform classes that generalizes the n known fact ([Lut92]) that SIZE( 2n ) has measure zero in ESPACE:
186
´ ELVIRA MAYORDOMO CAMARA
Theorem 6.5. For every real α ∈ [0, 1], n 2 dim SIZE α ESPACE = α. n
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Axiomatizability of algebras of binary relations Szabolcs Mikul´as School of Computer Science and Information Systems Birkbeck College, University of London Malet Street, London WC1E 7HX, UK E-mail: [email protected]
Abstract. We present an overview of the axiomatizability problem of algebras of binary relations. The focus will be on the finite and non-finite axiomatizability of several fragments of Tarski’s class of representable relation algebras. We examine the step-by-step method for establishing finite axiomatizability and ultraproduct constructions for establishing non-finite axiomatizability. We conclude with some open problems that could be tackled using either of the above methods.
1
Introduction
In algebraic logic, the most extensively investigated class of algebras of binary relations is the class RRA of representable relation algebras. These are Boolean algebras equipped with some extra-Boolean operations arising from the nature of relations, like relation composition, relation converse and the identity relation. The class RRA is a variety, i.e., it Received: November 23rd, 2001; In revised version: March 6th, 2002; April 29th, 2002; Accepted by the editors: June 12th, 2002. 2000 Mathematics Subject Classification. 03G15 03G10.
Research supported by OTKA T30314. Thanks are due to an anonymous referee for careful reading and many valuable suggestions. Thanks go to I. Hodkinson and J. Crampton who made useful comments on an earlier version of the paper.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 187-205. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
´ SZABOLCS MIKULAS
188
is an equational class (cf. [Tar55]). When Tarski defined RRA, he raised the problem of whether RRA was finitely axiomatizable using equations; Monk answered the question negatively in [Mon0 64]. An interesting question is whether algebras of binary relations with smaller clones of operations can be finitely axiomatized. See [Sch1 91] for a survey of axiomatizability results of subreducts of RRA, i.e., classes of algebras where only a subset of the operations of RRA are present. In the last decade, researchers have continued to investigate subreducts of RRA, and significant results have been obtained. In this paper, we will mainly concentrate on subreducts of RRA in which the similarity type includes relation composition and at least one of the lattice operations of meet and join. We will recall the most important finite axiomatizability results and describe the so-called step-by-step method, that has been successfully used to achieve finite axiomatizations. Furthermore, we will show how to use ultraproduct constructions to establish non-finite axiomatizability results. In both cases, we will use games to make the constructions more intuitive. At the end of the paper we will mention some open problems.
2
Basics
First we recall the definition of representable relation algebras. Definition 2.1. A representable relation algebra is an algebra A = (A, 0, 1, · , + , − , ; , , 1 ) such that A ⊆ ℘(W ) (the powerset of W ) for some equivalence relation W , 0 = ∅, 1 = W , · is intersection, + is union, − is complement with respect to W , ; is relation composition, is relation converse, and 1 is the identity relation. More formally, for all elements x, y ∈ A, x ; y = {(u, v) ∈ W : (u, w) ∈ x and (w, v) ∈ y for some w} x = {(u, v) ∈ W : (v, u) ∈ x} 1 = {(u, v) ∈ W : u = v}. The set W is the unit of A, and the base of the algebra is the field of W , i.e., the smallest set U such that W ⊆ U × U . We denote the class of isomorphs of representable relation algebras by RRA. We also use ‘RRA’ as an abbreviation for the phrase ‘representable relation algebra’.
AXIOMATIZABILITY OF ALGEBRAS OF BINARY RELATIONS
189
We will also consider classes of algebras with different similarity types than that of RRA. In general, we will say that an algebra is representable if it is isomorphic to a set algebra of binary relations. As we mentioned earlier, RRA is a non-finitely axiomatizable variety. It is an interesting problem to find out precisely which combinations of operations cause non-finite axiomatizability. This is one of the motivations for investigating algebras with smaller similarity types. Definition 2.2. Let A ∈ RRA and τ be a set of operations whose elements are definable by fixed terms. The τ -reduct of A is the algebra (A, o)o∈τ . The τ -subreduct of RRA, R(τ ), is the class of subalgebras1 of the τ -reducts of elements of RRA. When forming subreducts we will look at the following operations besides the primitive operations of the similarity type of RRA. The residuals \ and / of ; are defined as follows: x\y = − (x ; − y)
x/y = − ( − x ; y ).
Their semantics is given by x\y = {(u, v) ∈ W : for all w, (w, u) ∈ x implies (w, v) ∈ y} x/y = {(u, v) ∈ W : for all w, (v, w) ∈ y implies (u, w) ∈ x}. The conjugates of ; are and . They are defined in RRA as x y = (x ; y)
x y = (x ; y ).
Their meaning is x y = {(u, v) ∈ W : for some w, (w, u) ∈ x and (w, v) ∈ y} x y = {(u, v) ∈ W : for some w, (v, w) ∈ y and (u, w) ∈ x}. While RRA is a variety, subreducts of RRA are not always closed under homomorphic images. However, it is not difficult to show that they are closed under forming subalgebras, products and ultraproducts. In other words, any subreduct of RRA is a quasivariety. Hence subreducts 1
As usual in algebraic logic, we assume that subalgebras include isomorphic copies as well.
190
´ SZABOLCS MIKULAS
of RRA can be defined by sets of quasiequations, i.e., universal formulas of the form (σ1 = 1 ∧ · · · ∧ σn = n ) → σ0 = 0 . The question is for which choice of τ does there exist a finite set of quasiequations axiomatizing R(τ ). Remark 2.3. There is a subtle but important point worth mentioning here. Recall that, in the semantics of the extra-Boolean operations, the unit W of the algebra occurs as a parameter. Since we are talking about axiomatizing subreducts of RRA, an abstract algebra belongs to such a class if and only if it can be represented as an algebra of binary relations where the operations are computed in an equivalence relation. In certain cases, it is a natural question whether an algebra can be represented as a set algebra of binary relations where the operations are computed in a transitive relation, for example. The two problems are not equivalent in general; for different choices of W we might have different semantics of the extra-Boolean operations. As an example, see the case of semilattice-ordered residuated semigroups in the next section.
3
Finite axiomatizability
In this section we look at subreducts of RRA that can be axiomatized by finite sets of quasiequations. We will use the step-by-step method to establish finite axiomatizability. The origin of the method goes back to the early 20th century. The same idea has been used to prove modeltheoretical results (e.g., L¨owenheim–Skolem theorems) and completeness theorems (e.g., Henkin’s completeness proof for first-order logic). In algebraic logic, it has been used to establish axiomatizability results, see for instance [Mad82], [And1 Tho2 88]; see also [HirHod1 97a] which contains more historical remarks about the method used in other branches of logic as well. 3.1
Semilattice-ordered semigroups
First let us look at the subreduct R( · , ; ) whose similarity type consists of intersection · and composition ; . Clearly · satisfies the semilattice
AXIOMATIZABILITY OF ALGEBRAS OF BINARY RELATIONS
axioms, viz.
191
x·x = x x·y = y·x (x · y) · z = x · (y · z).
As usual in semilattices, we define x ≤ y ⇐⇒ x = x · y. It is easy to check that composition is an associative operation, i.e., (x ; y) ; z = x ; (y ; z). Finally, let us observe that composition is a monotone operation with respect to the ordering ≤: (x ≤ y ∧ u ≤ v) → x ; u ≤ y ; v. Thus { · , ; }-reducts of RRAs are (lower) semilattice-ordered semigroups. Bredikhin and Schein [Bre0 Sch1 78] showed that the converse holds as well. Theorem 3.1. Let Σ be the axioms for semilattice-ordered semigroups. Then we have A ∈ R( · , ; ) ⇐⇒ A |= Σ. Proof. To prove2 the theorem, we have to show that every semilatticeordered semigroup can be represented over an equivalence relation, i.e., it is isomorphic to a set algebra of binary relations where composition is computed in an equivalence relation. We will build such a representation in a step-by-step manner. The main idea is to define a chain of partial representations such that the union of the chain defines the required isomorphism. The elements of the chain will satisfy certain coherency conditions that we will maintain during the construction. As we go along the chain we will approximate the structure of the abstract algebra to be represented by achieving a saturation property. We will define a chain of labelled graphs G0 ⊆ · · · ⊆ G α ⊆ · · · 2
The present proof is based on [And1 Mik94].
´ SZABOLCS MIKULAS
192
such that Gα = (Uα , Eα , α ), where Uα denotes the set of nodes, Eα ⊆ Uα × Uα is the set of edges, and α : Eα → A is the labelling of the edges by elements of A. We will maintain the following two coherency conditions: Eα is a transitive relation for (x, y), (y, z) ∈ Eα , α (x, z) ≤ α (x, y) ; α (y, z).
(1) (2)
We say that Gα is coherent if it satisfies the two coherency conditions (1) and (2) above. The graphs form a chain in the following sense: for α < β, Uα ⊆ Uβ ,
Eα ⊆ Eβ ,
α is the restriction of β to Eα .
Let G = (U, E, ) be the union of the chain: U= Uα E = Eα = α . α
α
α
Clearly, G is coherent if all Gα are coherent. The saturation property we want to achieve is: for every a, b ∈ A and (x, y) ∈ E, (x, y) ≤ a ; b implies (x, z) = a, (z, y) = b for some z ∈ U.
(3)
Given a coherent and saturated graph G = (U, E, ), we define h : A → ℘(U × U ) as follows: for every a ∈ A, h(a) = {(x, y) ∈ U × U : (x, y) ≤ a}. Let B be the { · , ; }-reduct of the full RRA with unit U × U . We show that h is an embedding of A into B. By the semilattice axioms and the definition of h, h is a homomorphism with respect to · . Using the coherency condition (2), one easily shows that h(a) ; h(b) ⊆ h(a ; b), while h(a ; b) ⊆ h(a) ; h(b)
AXIOMATIZABILITY OF ALGEBRAS OF BINARY RELATIONS
193
follows from the saturation property (3). Injectivity is guaranteed once we make sure that {(x, y) : (x, y) ≤ a} = {(x, y) : (x, y) ≤ b} for distinct elements a and b of A; see the initial step of the construction below. Then h is indeed the desired embedding. Thus it remains to construct the chain of graphs Gα . There are three cases in the construction according to whether α is 0, a successor, or limit ordinal. Initial step. In the base step of the construction, for every element a of the algebra A, we take arbitrary distinct points ua and va and label the edge (ua , va ) by a. We choose these points such that they are distinct for different elements of the algebra, i.e., for a = b, {ua , va } ∩ {ub , vb } = ∅. We let U0 = {ua , va : a ∈ A}, E0 = {(ua , va ) : a ∈ A} and 0 = {((ua , va ), a) : a ∈ A}. The coherency conditions are trivially satisfied by G0 = (U0 , E0 , 0 ). Moreover, {(x, y) : 0 (x, y) = a} = {(x, y) : 0 (x, y) = b} for distinct elements a and b of A (thus h indeed is an injection). Finally, let D0 be an enumeration of all “defects”, i.e., those triples ((x, y), a, b) such that (x, y) ∈ E0 , a, b ∈ A and 0 (x, y) ≤ a ; b. Successor step. Let us assume that we have defined a coherently labelled graph Gα = (Uα , Eα , α ). Also assume that Dα is an enumeration of all triples ((x, y), a, b) such that (x, y) ∈ Eα , a, b ∈ A and α (x, y) ≤ a ; b. We define Gα+1 = (Uα+1 , Eα+1 , α+1 ) as follows. Let ((x, y), a, b) be the first actual defect, i.e., the first element of Dα such that there is no w ∈ Uα such that α (x, w) = a and α (w, y) = b. We repair this defect as follows. Choose a fresh point z ∈ / Uα . We define Uα+1 = Uα ∪ {z} Eα+1 = {(x, z), (z, y)}∪ {(r, z) : (r, x) ∈ Eα } ∪ {(z, s) : (y, s) ∈ Eα } α+1 = α ∪ {((x, z), a), ((z, y), b)}∪ {((r, z), α (r, x) ; a) : (r, x) ∈ Eα }∪ {((z, s), b ; α (y, s)) : (y, s) ∈ Eα }. Clearly, Eα+1 is transitive. Furthermore, using the axioms, it is routine to show that Gα+1 satisfies the coherency condition (2).
194
´ SZABOLCS MIKULAS
Finally, we define Dα+1 by appending the new possible defects to the list Dα . That is, we add the triples ((x, y), a, b) such that (x, y) ∈ Eα+1 , a, b ∈ A and α+1 (x, y) ≤ a ; b to the end of the enumeration Dα . Limit step. Take the union of the chain of labelled graphs defined so far. Obviously, it satisfies the coherency conditions. Since we do not introduce any defect in the limit step, we can define the list of defects as the union of the previous lists. The construction continues as long as there are defects that we have not dealt with yet. Note that by the careful enumeration of possible defects, every defect is repaired sooner or later. See [And1 Mik94] for a precise proof that the construction indeed terminates. Thus we can conclude that the final graph G is free from defects, i.e., it satisfies the saturation condition. q.e.d. Remark 3.2. 1. Note that the relation E constructed above is irreflexive and asymmetric, thus it is not an equivalence relation. However, the embedding h shows that the composition operation is computed in the equivalence relation U × U . In fact, using the similarity type { ; , · } we cannot distinguish between representations using equivalence or transitive relations. The picture will drastically change when we include the residuals into the similarity type, see below. 2. The coherency conditions have a universal flavor; these properties talk about all edges, and they have to be maintained during the whole construction. On the other hand, one might argue that the saturation condition is rather existential—it requires the existence of “witnessing” edges providing the decomposition of the edge with label a ; b. One way to make this idea more explicit is to introduce games. There are two players, an existential player ∃ and a universal player ∀. They start a game using an abstract algebra A — ∃’s task is to show that A can be represented, while ∀ tries to create obstacles. They build a chain of coherently labelled graphs. In the nth round of the game, ∀ points out a defect, i.e., an edge (x, y) labelled with a ; b such that there are no edges (x, z) and (z, y) labelled by a and b, respectively. Then ∃ tries to respond with an extension of the labelled graph containing the two labelled edges missing from the previous graph. Of course, ∃ has to make sure that the
AXIOMATIZABILITY OF ALGEBRAS OF BINARY RELATIONS
195
new graph is coherent—the coherency conditions are necessary requirements for any (partial) representation. The universal player ∀ wins the game if there is n < ω such that in the nth round ∃ cannot respond with a coherently labelled graph. Otherwise ∃ wins the game. This idea has been made explicit in the context of relation algebras, see [HirHod1 97b]. Roughly speaking, a relation algebra is representable if and only if the existential player has a winning strategy in the representability game sketched above. In general, the existence of a winning strategy for ∃ can be described by a set of first-order sentences that provide an axiomatization for the class of algebras to be represented. Usually this axiomatization is infinite, though. 3.2
Semilattice-ordered residuated semigroups
Next we look at the possibility of extending the similarity type without losing finite axiomatizability. We mentioned earlier the residuals \ and / of composition. Their abstract characterization is given by the following two formulas: x ; y ≤ z ←→ y ≤ x\z x ; y ≤ z ←→ x ≤ z/y. For instance, the first one says that x\z is the largest element a such that x ; a ≤ z. It is easy to check that the above two formulas are valid in representable algebras. It turns out that, together with the semilatticeordered semigroup axioms Σ from Theorem 3.1, they are enough to ensure representability over transitive relations, (cf. [And1 Mik94]). Theorem 3.3. Every semilattice-ordered residuated semigroup can be represented as a set algebra of binary relations over a transitive relation. Proof. The proof of Theorem 3.3 is similar to that of Theorem 3.1. Again, one can define a chain of labelled graphs whose union defines the required representation of the abstract algebra. The coherency conditions are the same. But we require an additional saturation property, viz. for all a ∈ A and x ∈ U , there exist y, z ∈ U such that (y, x) = a
(x, z) = a.
(4)
This property can be achieved by adding the appropriate edges in the successor step α + 1 of the construction. Of course, one has to label
196
´ SZABOLCS MIKULAS
the other edges emerging when we make sure that Eα+1 is transitive; this can be done in a similar way as above. Furthermore, the new graph has to satisfy the coherency conditions — a consequence of the axioms. Finally, one has to show that h is a homomorphism with respect to the residuals as well. By definition of h, (u, v) ∈ h(a\b) iff (u, v) ≤ a\b. The latter holds if and only if a ; (u, v) ≤ b by the new axiom. Then we have that, for every z, (z, u) ≤ a implies (z, u) ; (u, v) ≤ b. The coherency condition (2) implies that, for every z, (z, u) ≤ a implies (z, v) ≤ b. Hence (u, v) ∈ h(a)\h(b). A similar argument using the saturation property (4) shows that h(a)\h(b) ⊆ h(a\b). Hence h is the required embedding. q.e.d. 3.3
Axiomatizing R( · , ; , \, /)
So far we proved a representation theorem for semilattice-ordered residuated semigroups over transitive relations. That is, we proved that the class of (isomorphs of) algebras of binary relations with set-theoretic operations of intersection, composition, and left and right residuals computed in a transitive relation is finitely axiomatizable. Consider now the problem of the finite axiomatizability of R( · , ; , \, /). Recall that, in subreducts of RRA, the operations are computed in an equivalence relation. In other words, for axiomatizing R( · , ; , \, /) we need a representation theorem with equivalence relations. The difference between the two axiomatizability problems is indicated by the following observation. In R( · , ; , \, /), the identity relation is always contained in a\a, while it clearly fails in the representation we gave in the proof above (E is irreflexive). Additional axioms ensure that we can achieve finite axiomatizability in this case as well. Theorem 3.4. R( · , ; , \, /) is finitely axiomatizable. The axioms are those for semilattice-ordered residuated semigroups together with the following four additional axioms: x ≤ y → z ≤ z ; (x\y) x ≤ y → z ≤ z ; (y/x)
x ≤ y → z ≤ (x\y) ; z x ≤ y → z ≤ (y/x) ; z.
The proof goes along similar lines as for Theorem 3.3. However, it is more involved, since, in the constructed labelled graphs, Eα has to be not
AXIOMATIZABILITY OF ALGEBRAS OF BINARY RELATIONS
197
just transitive but symmetric and reflexive as well. For details we refer the interested reader to [And1 Mik94]. Of course, there are several other possible similarity types. We will look at adding union + or converse in the next section. Discussion of the identity constant is postponed until the last section.
4
Non-finite axiomatizability
In this section we look at classes of algebras that cannot be axiomatized by finite sets of quasiequations. We will also sketch a method using ultraproduct constructions to establish non-finite axiomatizability (NFA, for short). This method dates back to the work of Tarski’s school of logic in the 1950s. The idea of establishing NFA of R(τ ) using ultraproducts is the following. We will define Boolean algebras with extra operators of the RRA similarity type, An (n < ω), and show that their τ -reducts are not representable while an ultraproduct of the reducts is representable. By Ło´s’ theorem [Hod0 93, Theorem 9.5.1], this is enough to show that R(τ ) is not finitely axiomatizable in first-order logic. Now assume that τ ⊆ τ and that R(τ ) is not finitely axiomatizable. What can we say about R(τ )? There is no general theorem indicating that R(τ ) is not finitely axiomatizable, but this is often the case. If we can show that in fact an ultraproduct of An for n < ω (and not just an ultraproduct of their τ -reducts) is representable, then we can conclude that no R(τ ) with τ ⊆ τ can be finitely axiomatized. If this is the case, we will say that R(τ ) is hereditarily non-finitely axiomatizable (HNFA, for short). The following results show that including union + or converse into the similarity type leads to non-finite axiomatizability. – – – – –
R( + , ; ) is NFA, (cf. [And1 88]); R( ; , ) is NFA, (cf. [Bre0 77]); R( · , + , ; ) is HNFA, (cf. [And1 91]); R( · , ; , ) is HNFA, (cf. [Hod1 Mik00], [Hai91]); R( · , ; , 1 , ) and R( · , ; , 1 , ) are HNFA, (cf. [Hod1 Mik00]).
Thus, while the subreduct to intersection and composition is finitely axiomatizable, if intersection is replaced by either union or converse, the
198
´ SZABOLCS MIKULAS
resulting subreduct is not finitely axiomatizable. If to intersection and composition we adjoin union or converse (or the identity relation and one of the conjugates), we get a hereditarily non-finitely axiomatizable class. The problem of HNFA in the first two cases seems to be open. Below we will show how to establish the HNFA of R( · , ; , , 0) using a variant of the “rainbow” construction from [Hod1 Mik00]. This particular similarity type will make the presentation of the idea slightly simpler. Theorem 4.1. R( · , ; , , 0) is HNFA. Proof. We will define algebras An (n < ω) of the RRA similarity type. Then we will show that (i) their { · , ; , , 0}-reducts are not representable, while (ii) their ultraproduct A is in RRA. We will show (i) by using the games mentioned in Remark 3.2. We will provide the universal player ∀ with tools (the green atoms gi (0 ≤ i ≤ 2n ) below) to gradually build a linearly ordered sequence. The algebra An is designed so that the existential player ∃ has to respond with a similar linearly ordered sequence (using the red atoms ri (1 ≤ i < 2n )) in any representation of An . But ∀’s sequence is longer than the sequence ∃ can build (there are more green atoms than red). Hence there is a round in the game played on An when ∃ is not able to respond to ∀’s move. Thus ∀ can win the game, whence An cannot be represented. However, as n grows ∃ can survive more and more rounds in the representability game, since it takes more and more rounds to show that ∀’s sequence is longer than ∃’s. Then in the limit, i.e., in the game played on the ultraproduct A, ∃ can survive infinitely many rounds. It follows that A ∈ RRA, establishing (ii). We are ready to define the algebras. The construction is rather involved, since we have to make sure that ∃ can survive several rounds, i.e., the reason for non-representability does not become immediate. Let n be any natural number. An has the following atoms (minimal non-zero elements): – – – – –
identity: 1 , greens: gi (0 ≤ i ≤ 2n ), whites: w, wij (0 ≤ i ≤ j ≤ 2n ), yellow: y, black: b,
AXIOMATIZABILITY OF ALGEBRAS OF BINARY RELATIONS
– reds: ri
199
(0 < i < 2n ).
The elements of the algebra are the sums of the atoms, i.e., all possible subsets of the set of atoms. Then the Boolean operations are defined as set-theoretic union, intersection and complementation. All elements x are self-converse, i.e., x = x. Next we will define composition for atoms. Then composition for other elements of the algebra can be defined using additivity: x;y =
{u ; v : u ≤ x, v ≤ y, u, v atoms}.
The composition of two atoms x0 and x1 is defined by specifying precisely which atoms are below x0 ; x1 , or equivalently, by specifying which atoms are not below x0 ; x1 . We say that a triple (x0 , x1 , x2 ) of atoms is a forbidden triangle if x2 is not one of the atoms below x0 ; x1 , in symbols (x0 ; x1 ) · x2 = 0. Since the ultraproduct of the algebras An is to be in RRA, we want An to satisfy the so-called cycle law. Since every element is self-converse, it says that, for every permutation π of 3, we have (xπ(0) ; xπ(1) ) · xπ(2) = 0 as well. In other words, if (x0 , x1 , x2 ) is a forbidden triangle, so is every permutation. The forbidden triangles (up to permutations) are precisely the following: (green,green,green) (yellow,yellow,yellow) (green,green,white) (yellow,yellow,black) (ri ,rj ,rk ) unless i + j = k or i + k = j or j + k = i (gi ,gi+1 ,rj ) unless j = 1 (gi ,y,wjk ) unless i ∈ {j, k}, where, e.g., (green,green,white) stands for: (g ; g ) · w = 0 for all green atoms g, g and any white atom w. We also require that (1 , x, y) and its permutations are forbidden for all distinct atoms x and y, since we want 1 ; x = x. As an example let us look at the composition of two green atoms with consecutive indices. We have gi ; gi+1 = r1 + b + y,
´ SZABOLCS MIKULAS
200
since the composition of two green atoms with consecutive indices cannot be white, green, the identity, or any red but r1 . Note that the composition of the yellow atom with itself cannot be black or yellow. Thus we have (gi ; gi+1 ) · (y ; y) = r1 . Finally, let us look at (gi ; gi+2k ) · (y ; y) · (rk ; rk ) for some indices i and k > 0 such that i + 2k < 2n . We have already seen that the intersection of the composition of green atoms and of the composition of yellow with itself cannot be green, white, black or yellow, and identity is ruled out by the different indices on the green atoms; thus it must be a union of red atoms. But the only red atom below rk ; rk is r2k . Thus the value of the above displayed term must be r2k . Next we show that the { · , ; , , 0}-reduct Bn of An is not representable as a set algebra of binary relations over an equivalence relation. Lemma 4.2. For any n < ω, An is not in RRA. Moreover, Bn , the { · , ; , , 0}-reduct of An , is not representable either. Proof. We will show that the universal player ∀ can win the representability game played on An . First let us recall the coherency conditions for this game. Let us assume that we have an edge (x, y) labelled by a and let I be a finite index set such that, for each i ∈ I, the edges (x, zi ) and (zi , y) are labelled by bi and ci , respectively. Then by the second coherency condition (cf.(2) in the proof of Theorem 3.1, a ≤ bi ; ci for each i ∈ I. Hence a ≤ I (bi ; ci ) must hold as well. Since, we are considering a subreduct of RRA, the first coherency condition (cf. (1)) says that the labelled graph must be an equivalence relation. Since converse is in the similarity type, if an edge (x, y) is labelled by a, then the converse edge (y, x) must be labelled by a . Since a = a , the direction of the edges will not matter; the label of (x, y) must be the same as the label of (y, x).
(5)
Furthermore, all reflexive edges must be labelled — this can be done using the identity atom 1 . We will not bother mentioning this in the future. Consequently, in the picture illustrating the argument below, Figure 1,
AXIOMATIZABILITY OF ALGEBRAS OF BINARY RELATIONS
201
we do not indicate the orientation of the edges, and the reflexive edges are not shown. The game starts with an edge (u, v) labelled by w = 0. In the initial round of the game, ∀ points out that w ≤ g0 ; y. Thus ∃ has to create edges (u, u0 ) and (u0 , v) labelled by g0 and y, respectively. Since w ≤ gi ; y for all i ≤ 2n , in the ith round of the game ∀ can force the existence of ui (i ≤ 2n ) such that (u, ui ) is labelled by gi and (ui , v) is labelled by y. Note that by (5), (ui , u) must be gi and (v, ui ) must be y. Now let us have a look at how ∃ can respond to the above challenges of ∀. In the first round of the game, ∃ has to label the edge (u0 , u1 ). Since (g0 ; g1 ) · (y ; y) = r1 , the coherency requirement on the graph says that ∃ must label (u0 , u1 ) by r1 . The same argument shows that, in the (i + 1)th round, ∃ has to use r1 to color the edge between ui and ui+1 (for i < 2n ). Indeed, (gi ; gi+1 ) · (y ; y) = r1 for every i < 2n , whence the color of (ui , ui+1 ) must be r1 . So far so good, but ∃ has to label all the edges (ui , uj ) for 0 ≤ i, j ≤ 2n . In particular, in the second round, ∃ must label (u0 , u2 ). By (g0 ; g2 ) · (y ; y) · (r1 ; r1 ) = r2 , ∃ is forced to use r2 to color (u0 , u2 ). In general, we have that (gi ; gi+2 ) · (y ; y) · (r1 ; r1 ) = r2 determines the color of (ui , ui+2 ) to be r2 . By induction, we get that (uk , uk+2l ) must be labelled by r2l , since (gk ; gk+2l ) · (y ; y) · (r2l−1 ; r2l−1 ) = r2l . In particular, the label of (u0 , u2n−1 ) and (u2n−1 , u2n ) is r2n−1 . In the 2n th round, ∃ has to find a non-zero label for (u0 , u2n ). But this is impossible, since (g0 ; g2n ) · (y ; y) · (r2n−1 ; r2n−1 ) = 0. Thus ∃ cannot respond with a coherent graph in the 2n th round. See Figure 1 for an illustration of the argument; the solid lines represent ∀’s moves and the dotted lines show ∃’s responses. q.e.d. One might wonder why we made the definition of An so complicated. Why not simply use red and green atoms and require that, say, the composition of green atoms must be red, gi ; gi+1 = r1 , and (ri ; rj ) · rk = 0 whenever the sum of two indices is different from the third one? It is not difficult to see that this would imply that g0 ; g2n−1 = g2n−1 ; g2n = r2n−1 . Then ∀ could win the game just in two rounds independently of the value of n. Thus the ultraproduct would not be representable either. Let i and j be different, non-consecutive indices. In our construction, gi ; gj is the sum of y, b and all the red atoms. In addition to the green atoms, ∀ has to use yellow to force ∃ to use reds: y ; y cannot be yellow
´ SZABOLCS MIKULAS
202 r2
u0
r1
u1
r1
r2
r2n−1
r2n−1
u2
y
u2n−1
···
u2n −2
··· g2n −2
y
r1
u2n −1
r1
u2n
g2n −1
y g0
g2
g1
y g2n−1
y
y
y
g2n
u
w
v
Fig. 1. The reason for non-representability
or black. Furthermore, ∃ has some freedom in choosing the index of the red atom—the choice is constrained by the forbidden red triangles, i.e., by the indices of the red atoms used so far in the game. In the proof of Lemma 4.2, it was ∃’s responsibility to color the (red) edges (ui , uj ). Would not it be faster for ∀ to win the game by choosing some of these red colors? For instance, ∀ might start the game with an edge (u0 , u2 ) and label it with r3 . Then ∀ would force ∃ to extend the graph by pointing out that r3 ≤ g0 ; g2 and that r3 ≤ y ; y. Then we would have edges (u0 , u) ∈ g0 , (u, u2 ) ∈ g2 , (u0 , v) ∈ y and (v, u2 ) ∈ y. If ∀ could force ∃ to label new edges (u0 , u1 ) and (u1 , u2 ) by r1 like in the above proof (by requiring that (u, u1 ) be g1 and (u1 , v) be y), it would be the end for ∃ (since (r1 ; r1 ) · r3 = 0). This is where the indexed white atoms wij come into the picture. In the present scenario, ∃ is to choose the color of (u, v), and w02 is a possibility. Recall that (gk ; y) · wij = 0 whenever k ∈ / {i, j}. Hence if ∃ labels (u, v) by w02 , then ∀ cannot require the existence of u1 such that (u, u1 ) be labelled by g1 and (u1 , v) by y. Looking at ∃’s strategy in general, it should be obvious that it is not a clever idea to use red atoms (unless it is unavoidable) and that using green and yellow atoms is not advisable either (since they can force later edges to be red). When the graph has to be extended, ∃’s strategy is to use an indexed white whenever this is a possible choice. The black atom
AXIOMATIZABILITY OF ALGEBRAS OF BINARY RELATIONS
203
gives ∃ another option to avoid using red, green or yellow; when ∀ does not use yellow and green for extending the graph and no indexed white is available, ∃ uses black. Finally, ∃ uses a red atom only if neither white nor black can be used. ∃ chooses the index of the red atom depending on the number of rounds played so far and on the indices of the red atoms occurring in the graph; cf. [Hod1 Mik00] for details. Thus the black and the indexed white atoms give ∃ some leeway to postpone losing the game. In fact, the only way ∀ can win the game is to play a sequence of green and yellow atoms as described in the proof of Lemma 4.2. Furthermore, a win for ∀ requires more and more rounds as n grows. When the two players play the game on the ultraproduct A of the algebras An (n < ω), then ∃ can survive infinitely many rounds and thus has a winning strategy. Hence A is representable as an RRA. We can conclude that R( · , ; , , 0) is HNFA. We refer the interested reader to [Hod1 Mik00] for a precise proof. q.e.d. Remark 4.3. 1. It is not very difficult to show that the same construction can be used to establish that the { · , , }-reduct of An is not representable using an equivalence relation. Hence we have that R( · , , ) is HNFA.3 2. A related issue to finitely axiomatizing a class K of algebras is the problem whether the equational theory of K is finitely axiomatizable in equational logic. We will not pursue this issue here; the interested reader is referred to [Hod1 Mik00] where it is shown that, for all finite n, there is a valid equation that fails in the nth reduct algebra, establishing that the equational theory is not finitely axiomatizable either.
5
Open problems
So far the general, if somewhat simplified, picture is that the interactions between composition ; and union + or converse lead to non-finite axiomatizability. But what can we say about the interaction between ; , · and identity 1 ? 3
A joint result with I. Hodkinson.
204
´ SZABOLCS MIKULAS
Identity. The most obvious candidate as an axiom for the identity constant is x ; 1 = 1 ; x = x. But it is not very hard to show that it is not enough for axiomatizing R( · , ; , 1 ) over R( · , ; ).4 In fact, one of the most challenging open problems in this field is the following.5 Question 3. Is R( · , ; , 1 ) finitely axiomatizable? A related question is whether R( ; , · , \, /, 1 , 0) admits a finite axiomatization. In [And1 Mik94] a representability result was established for the specific case in which the identity constant 1 satisfies the axiom x ; y ≤ 1 → (x = 0 ∨ y = 0 ∨ x = y = 1 ) as well. However, this axiom is usually not valid, hence the general finite axiomatizability question remains open. Conjugates. In Remark 4.3, we mentioned the HNFA of R( · , , ). An alternative would be if we allow representations over transitive (not necessarily equivalence) relations. Is this class of algebras finitely axiomatizable? It can be shown that the class of algebras representable over transitive relations of similarity type { · , ; , , } is HNFA [Mik02]. It would be interesting to find the borderline precisely where we lose finite axiomatizability. Complexity. It is well known that the equational theory of RRA is undecidable [Tar41]. On the other hand, the equational theory of the positive fragment of RRA, R(1, 0, · , + , ; , , 1 ), is decidable [And1 Bre0 94], but not much is known about the exact complexity of specific subreducts. As an example we mention that it follows from [Lam58] that the equational theory of R( · , ; , \, /) is decidable in exponential time, but no matching lower bound has been found so far. 4
5
There are semilattice-ordered monoids which are not representable because they fail to validate the formula (x ; y) · 1 = 0 → (y ; x) · 1 = 0, for instance. This problem was mentioned by B. M. Schein at the Algebraic Logic Colloquium in Budapest in 1988.
AXIOMATIZABILITY OF ALGEBRAS OF BINARY RELATIONS
205
Finite base property. The step-by-step construction usually yields a representation on an infinite base. It would be interesting to find out which (classes of) algebras can be finitely represented. There are results about finite representability in certain cases, cf. [And1 Hod1 N´em99], but it is not clear if existing techniques can be applied to algebras with transitive units. A related issue is when it is enough to consider algebras with finite bases to refute non-valid formulas. Let us mention a specific problem [vBe91, p.349]. Question 4. Is it the case that any non-valid equation of R( · , ; , \, /) can be refuted in an algebra represented on a finite equivalence relation? The same question applies when we consider algebras represented over transitive relations.
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Forcing axioms and projective sets of reals Ralf Schindler Institut f¨ur formale Logik Universit¨at Wien W¨ahringer Straße 25 1090 Wien, Austria E-mail: [email protected]
Abstract. This paper is an introduction to forcing axioms and large cardinals. Specifically, we shall discuss the large cardinal strength of forcing axioms in the presence of regularity properties for projective sets of reals. The new result shown in this paper says that ZFC + the bounded proper forcing axiom (BPFA) + “every projective set of reals is Lebesgue measurable” is equiconsistent with ZFC + “there is a Σ1 reflecting cardinal above a remarkable cardinal.”
1
Introduction.
The current paper is in the tradition of the following result. Theorem 1.1. The theory ZFC + Martin’s axiom (MA) + “every projective set of reals is Lebesgue measurable” is equiconsistent to the existence of a weakly compact cardinal. [Har0 She85, Theorem B] Received: November 21st, 2001; Accepted by the editors: January 11th, 2002. 2000 Mathematics Subject Classification. 03E55 03E15 03E35 03E60.
The author would like to thank Ilijas Farah, Ernest Schimmerling, and Boban Veliˇckovi´c for sharing with him mathematical and historical insights.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 207-222. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
208
RALF SCHINDLER
This theorem links a large cardinal concept with a forcing axiom (MA) and a regularity property for the projective sets of reals. We shall be interested in considering forcing axioms which are somewhat stronger than MA. In particular, we shall produce the following new result.1 Theorem 1.2. Equiconsistent are: 1. ZFC + “there is a Σ1 reflecting cardinal above a remarkable cardinal”, and 2. ZFC + the bounded proper forcing axiom (BPFA) + “every projective set of reals is Lebesgue measurable”. We don’t expect the reader to be familiar with the concepts appearing in the previous theorems, as we are going to explain them here. We do have to presuppose, though, a basic understanding of forcing and constructibility theory; this material is covered by the text books [Jec78] and [Kun80]. Large cardinals are ubiquitous in set theory. In [Hau14], Hausdorff had asked whether there is a regular fixed point of the aleph function i → ℵi . We know today that the existence of such a fixed point cannot be shown in ZFC, because if κ is regular and κ = ℵκ then Lκ |= ZFC. (Here and in what follows we assume that ZFC is consistent.) We may say that a formula ϕ(−) defines a large cardinal concept if ZFC proves that ∀κ(ϕ(κ) ⇒ Lκ |= ZFC). A large cardinal axiom is a statement of the form ∃κ ϕ(κ) for some large cardinal concept given by ϕ(−). Any large cardinal axiom is thereby independent from ZFC. Forcing axioms are also independent from ZFC. The prototype of all forcing axioms, Martin’s axiom, was isolated from the Solovay-Tennenbaum proof of the consistency of the non-existence of Suslin trees on ω1 (cf. [Jec78, Theorem 50] or [Kun80, II, §4]). A forcing axiom asserts the existence of reasonably generic filters for posets from a fixed class of forcings. One may often view forcing axioms as a subsitute for CH (the continuum hypothesis) in contexts where CH fails. Projective sets of reals are primarily studied in descriptive set theory. A projective set comes from a closed set via a finite process of taking complements and projections (cf. [Jec78, p. 500]). The theory ZFC typically decides questions about “simple” projective sets, but it leaves the 1
This theorem was first presented in a talk at the 6i`eme Atelier International de Th´eorie des Ensembles in Luminy, France, on Sept 20, 2000.
FORCING AXIOMS AND PROJECTIVE SETS OF REALS
209
arbitrary case open. For instance, every analytic set of reals is Lebesgue measurable in ZFC, whereas the Lebesgue measurability of all ∆12 sets of reals is independent from ZFC. Nevertheless, descriptive set theorists believe it is true that every projective set of reals is Lebesgue measurable. We are now going to study the Lebesgue measurability of all projective sets of reals in the context of forcing axioms. Set theorists obtain insight into the mathematical structure provided by various hypotheses by determining their consistency strength. We’ll write Con(Σ) ⇒ Con(Σ ) for the statement that the consistency of the theory Σ implies that of Σ . We say that Σ and Σ are equiconsistent just in case Con(Σ) ⇔ Con(Σ ). The consistency strength (or, large cardinal strength) of a given hypothesis Ψ is that large cardinal axiom ∃κ ϕ(κ) such that ZFC + Ψ is equiconsistent with ZFC + ∃κ ϕ(κ). This paper can also be read as an introduction to a method for proving equiconsistency results. Typically, the direction Con(ZFC + Ψ) ⇒ Con(ZFC + ∃κ ϕ(κ)) uses constructibility theory, whereas Con(ZFC + ∃κ ϕ(κ)) ⇒ Con(ZFC + Ψ) is an application of the method of forcing.
2 MA and the Harrington-Shelah theorem. Let P be a poset, and let D be a collection of sets such that every D ∈ D is a subset of P. We say that a filter F ⊆ P is D-generic just in case F ∩ D = ∅ for every D ∈ D. D ⊆ P is predense (in P) if every p ∈ P is compatible with some q ∈ D. D ⊆ P is dense (in P) if for every p ∈ P there is some stronger q ∈ D. If D is the collection of all dense subsets of P then a D-generic filter cannot exist, unless P contains atoms. On the other hand, if D is countable then a D-generic filter always exists. In particular, if M is a countable transitive model of ZFC with P ∈ M and if D ∈ M is the collection of all the dense subsets of P which exist in M then there is a D-generic filter; it is one of the key building blocks of the theory of forcing that such a filter G can then be “adjoined” to M to produce another model of ZFC, M [G]. The approach of forcing axioms is to ask for more than countably many dense sets to be met at once, where P will have to belong to a specific class of posets. In what follows we shall focus on the case that no more than ℵ1 many dense sets are to be met.
210
RALF SCHINDLER
Definition 2.1. Let Γ ⊆ V be a class of posets. We say that MA(Γ) holds if for every P ∈ Γ and for every collection D of sets such that • Card(D) ≤ ℵ1 , and • D is dense in P for every D ∈ D there is some D-generic filter. Let c.c.c. denote the class of all posets which have the countable chain condition. We take Martin’s axiom, which is abbreviated by MA and which was mentioned in Theorem 1.1, to say that MA(c.c.c.) holds. In particular, MA –as we define it– implies that 2ℵ0 ≥ ℵ2 (i.e., that the continuum hypothesis, CH, is false). The reader may consult [Jec78, §§22, 23, 44] or [Kun80, Chapters II and VII] on MA. One can construe MA as a generalization of the Baire category theorem (cf. [Kun80, Theorem II.2.22]). The reader might also enjoy the discussion of the plenty consequences of MA that can be found in [Fre84]. It can be shown that Con(ZFC) implies Con(ZFC + MA) (cf. the paragraph right after Theorem 2.3 below). If 2ℵ0 > ℵ2 then one can require κ many dense sets to be met for any κ < 2ℵ0 , but it is inconsistent to require 2ℵ0 many dense sets to be met. By [Kun80, Theorem II.3.4], MA is equivalent to the fact that for every P ∈ c.c.c. which is of size ≤ ℵ1 and for every collection D of sets such that Card(D) ≤ ℵ1 and D is dense in P for every D ∈ D there is some D-generic filter. This readily implies that MA is equivalent to BMA(c.c.c.) according to the following definition. Definition 2.2. Let Γ ⊆ V be a class of posets. We say that BMA(Γ) holds if for every P ∈ Γ and for every collection D of sets such that • Card(D) ≤ ℵ1 , and • D is predense in P and has size ≤ ℵ1 for every D ∈ D there is some D-generic filter. In order to uncover the relation of MA with the Lebesgue measurability of projective sets of reals we need a first large cardinal concept. We say that a cardinal κ is inaccessible if κ > ℵ0 , κ is regular, and 2γ < κ L[x] for all cardinals γ < κ. We say that ω1 is inaccessible to the reals if ω1 is countable for every real x. Notice that if ω1 is inaccessible to the reals then ω1 is an inaccessible cardinal from the point of view of L.
FORCING AXIOMS AND PROJECTIVE SETS OF REALS
211
Theorem 2.3. If every projective set of reals is Lebesgue measurable then ω1 is inaccessible to the reals. In fact, if every Σ13 set of reals is Lebesgue measurable then ω1 is inaccessible to the reals. [She84] There is a poset P ∈ c.c.c. such that MA provably holds in VP (cf. [Jec78, §23] or [Kun80, VIII. §6]). Let us assume that V = L. We P shall then have that ω1V = ω1V = ω1L , and we may apply Theorem 2.3 inside VP to get that in VP there is a projective set of reals which is not Lebesgue measurable. In particular, MA does not imply that all projective sets of reals are Lebesgue measurable: Corollary 2.4. Equiconsistent are: 1. ZFC, and 2. ZFC + MA + “there is a projective set of reals which is not Lebesgue measurable”. We now want to approach Theorem 1.1. A cardinal κ is called weakly compact if κ is inaccessible and there is no Aronszajn tree on κ, i.e., for every tree T of height κ such that each level of T is of size < κ there is a branch through T of length κ. Equivalently, κ is weakly compact if κ is Π11 indescribable in that for every A ⊆ κ and for every Π1 formula ϕ(−), if Vκ+1 |= ϕ(A) then there is some λ < κ with Vλ+1 |= ϕ(A ∩ λ) (cf. [Jec78, p. 386]). The proof of Con(2) ⇒ Con(1) needs an improvement of Theorem 2.3 in the presence of MA. Lemma 2.5. Assume that MA holds. If ℵ1 is inaccessible to the reals then in fact ℵ1 is weakly compact in L. [Har0 She85] Proof. Let us suppose that MA holds and that ℵ1 is not weakly compact L[x] in L. We shall produce a real x such that ω1 = ω1 . As ℵ1 is not weakly compact in L, there is a tree T ∈ L such that (in V!) T is an Aronszajn tree on ω1 (this is due to Silver, cf. [Har0 She85, Claim 5]). We may and shall assume that T has infinitely many nodes of height 0. For a sequence d = (di : i < ω1 ) of subsets of ω we define P(d) iff p is a finite order preserving partial function from by setting p ∈ P(d) T into Q (the rationals) such that if a ∈ dom(p) has height ω · i in T and p(a) ∈ N ⊆ Q then p(a) ∈ di . We set p ≤P(d) q iff p ⊃ q. It is not hard ∈ c.c.c. to verify that P(d)
212
RALF SCHINDLER
L[b ] Now let b = (bi : i < ω1 ) be such that bi ⊆ ω and ω1 i > i for each i < ω1 . It suffices to show that there is some c ⊆ ω with b ∈ L[c]. For i < ω1 let us write T i for the restriction of T to nodes of height < i. We shall define bn = (bni : i < ω1 ) and Fn by induction on n < ω. Let b0i = bi for i < ω1 . Given bn , let Fn be, by an application of MA, a Dn -generic filter on P(bn ), where Dn is some appropriate collection of dense sets with Card(Dn ) = ℵ1 such that the following will hold true:
(1)n m ∈ bni iff there is an a of height ω · i in T with Fn (a) = m, and (2)n Fn is continuous at limit ordinals. ⊆ ω be a canonical code for Fn (T ω · (i + 1)), where i < ω1 . Let bn+1 i Finally, let c ⊆ ω be a code for (bn0 : n < ω). It is now easy to verify that (bni : n < ω, i < ω1 ) ∈ L[c]: Let i < ω1 , and suppose that we already know (bnj : n < ω, j < i). As bn+1 codes j Fn (T ω · (j + 1)) for every j < i, we can thus compute Fn (T ω · i). But (2)n then gives us Fn (T ω·i+1), and hence by (1)n we can compute bni . q.e.d. Let us now turn towards Con(1) ⇒ Con(2). The key concept here is L(R)-absoluteness for a class of posets. Let Γ ⊆ V be a class of posets. We say that L(R)-absoluteness for Γ holds (in V) iff for all P ∈ Γ we have that the first order theories (with names for reals from V) of L(R) P and L(RV ) agree with each other. Lemma 2.6. Let Γ ⊆ V be a definable class of posets. Let κ be an Col(ω,<κ) and for all inaccessible cardinal, and suppose that for all P˙ ∈ ΓV ˙ Col(ω,<κ)P reals x ∈ V we have that there is some poset Q ∈ Vκ with Q x ∈ V . Then L(R)-absoluteness for Γ holds in VCol(ω,<κ) . Proof. Let G be Col(ω,< κ)-generic over V, and let H be P˙ G -generic over V[G]. Finally, let E be Col(ω, (2ℵ0 )V[G][H] )-generic over V[G][H]. Let (ei : i < ω) ∈ V[G][H][E] be such that {ei : i < ω} = R ∩ V[G][H]. By working inside V[G][H][E] we may use the hypothesis of Lemma 2.6 to construct (αi , Gi : i < ω) such that for all i < ω we shall have that αi < αi+1 < κ, Gi is Col(ω,< αi )-generic over V, Gi−1 ⊆ Gi (with the convention that G−1 = ∅), Gi ∈ V[G][H], and ei ∈ V[Gi ]. Set G = i Gi . Because Col(ω,< κ) has the κ-c.c., G is Col(ω,< κ)generic over V, and every real in V[G ] is in V[Gi ] for some i < ω. We thus get that R ∩ V[G ] = R ∩ V[G][H].
FORCING AXIOMS AND PROJECTIVE SETS OF REALS
213
This yields that the first order theories of L(RV[G] ) and L(RV[G][H] ) agree with each other: because Col(ω,< κ) is homogeneous (cf. [Jec78, Theorem 63]), we get that for any formula ϕ(v ) and for any x ∈ R ∩ V[ x] ˙ |= ϕ(xˇ)” iff V[G] we have that L(RV[G] ) |= ϕ(x) iff Col(ω,<κ) “L(R) L(RV[G][H] ) |= ϕ(x). q.e.d. Theorem 2.7. Let κ be inaccessible. Then in VCol(ω,<κ) we have that every projective set of reals is Lebesgue measurable. [Sol70] Corollary 2.8. Let Γ ⊆ V be a definable class of posets. Suppose that κ is an inaccessible cardinal and ϕ is a statement such that Col(ω,<κ) ˙ (a) for all P˙ ∈ ΓV and for all reals x ∈ VCol(ω,<κ)P we have that there is some poset Q ∈ Vκ with x ∈ VQ , (b) ϕ holds in VCol(ω,<κ) , and (c) provably in ZFC + ϕ, there is some P ∈ Γ such that MA(Γ) holds in VP .
There is then a poset S such that in VS we have that: MA(Γ) holds and all projective sets of reals are Lebesgue measurable. The same result holds with “ MA(Γ)” being replaced by “ BMA(Γ).” Proof. We set S = Col(ω,< κ) P˙ for some P˙ which is provided by (b) and an application of (c) inside VCol(ω,<κ) . The result follows by Lemma 2.6 and Theorem 2.7, via (a). q.e.d. We are now in a position to be able to give a proof for Con(2) ⇒ Con(1) of Theorem 1.1. As it is provable in ZFC that there is a poset P ∈ c.c.c. such that MA holds in VP , we may apply Corollary 2.8 with ϕ being some logical truth, say. It remains to see that (a) of Corollary 2.8 holds for Γ = c.c.c. and for κ being weakly compact: Lemma 2.9 (Kunen). Let κ be weakly compact. Suppose that VCol(ω,<κ) |= “P˙ is a poset which has the c.c.c.”. ˙
Then for all reals x ∈ VCol(ω,<κ)P we have that there is some poset Q ∈ Vκ with x ∈ VQ . [Har0 She85]
214
RALF SCHINDLER
Proof. It is easy to verify that Col(ω,< κ) P˙ ∈ c.c.c. Let us work with ˙ It suffices to show Boolean algebras, and set B = r.o.(Col(ω,< κ) P). that any X ⊆ B of size < κ generates a complete subalgebra of B of size < κ. ¯ of B with X ⊆ B ¯ ⊆ Well, there is clearly a complete subalgebra B ¯ ≤ κ. W.l.o.g., B ¯ ⊆ κ. Let A ⊆ [κ]<κ be the set of all B and Card(B) ¯ As κ is Π1 indescribable, there is some λ < κ maximal antichains of B. 1 ¯ such that X ⊆ λ, B∩λ is a < λ-complete Boolean algebra, and A∩[λ]<λ ¯ ¯ is the set of all maximal antichains of B∩λ. But then B∩λ is a complete Boolean algebra, and hence a complete subalgebra of B. q.e.d.
3 PFA and projective sets. If X is a set then [X]ω denotes the set of all subsets of X of size ℵ0 . Let α ≥ ℵ1 . A set S ⊆ [α]ω is called stationary iff for all models M = (α; . . .) of finite type and with universe α there is some (X; . . .) ≺ M such that X ∈ S. A poset P is called proper if for all α ≥ ℵ1 , whenever S ⊆ [α]ω is stationary in [α]ω then VP |= Sˇ is stationary. We let proper denote the class of all posets which are proper. The proper forcing axiom, abbreviated by PFA, says that MA(proper) holds. The reader may find information on PFA in [Jec86] and [Bau84]. In particular, if there is a supercompact cardinal then there is a proper poset P such that PFA holds in VP (cf. [Jec86, Theorem 6.2]). In this section we shall see that PFA implies that all projective sets of reals are Lebesgue measurable. It is not known how to prove this directly, though. We have to show a transfer theorem of the following form: if PFA holds then there are certain inner models for such-and-such large cardinals, which in turn implies Lebesgue measurability for projective sets. The argument will in fact give that PFA implies much more, namely Projective Determinacy. Here is the key concept which links PFA with inner model theory. Let κ be an infinite cardinal. We say that κ holds if there is a sequence (Ci : κ < i < κ+ ∧ lim(i)) such that each Ci is a closed unbounded subset of i of order type ≤ κ and Cj = Ci ∩ j whenever j is a limit point of i. The principle κ was isolated by Jensen in the context of his seminal work on G¨odel’s constructible universe.
FORCING AXIOMS AND PROJECTIVE SETS OF REALS
215
Lemma 3.1. PFA implies ¬κ for all κ ≥ ℵ1 . [Tod84] Proof. Let T be the set of all limit ordinals i with κ < i < κ+ . Suppose (Ci : i ∈ T ) witnesses that κ holds. For j < i ∈ T let us write j ≺ i iff j is a limit point of Ci (iff j < i and Cj = Ci ∩ j). Then (T, ≺) is a tree, which we shall denote by T. Set P = {p ⊆ κ+ : p is closed and countable}, ordered by endextension. Notice that P is ω-closed, and forcing with P adds a closed unbounded subset of κ+ of order type ω1 . The proof of the following Claim makes use of all the properties of our κ -sequence. Claim 6. In VP , there is no branch through T G˙ of order type ω1 . Proof of Claim. Suppose that p ∈ P forces that there is some such branch. Notice that P × P is also ω-closed. Let G0 × G1 be P × P generic over V with p ∈ G0 ∩G1 . Let bh ∈ V[Gh ] be a branch through T G˙ Gh of order type ω1 , for h ∈ {0, 1}. Let Ah = i∈bh Ci for h ∈ {0, 1}. As cf V[G0 ×G1 ] (κ+ ) = ω1 , it is straightforward to see that we must have A0 = A1 . Therefore, A0 ∈ V. But then we must have otpV (A0 ) = κ+ , and therefore there is some i ∈ T with otp(Ci ) > κ. Contradiction! q.e.d. (Claim)
˙ Inside VP , there is hence a forcing Q adding, with finite con for ˙ ditions, a specializing function f for T G, i.e., an order preserving in the proof of Lemma 2.5). A f : T G˙ → Q (cf. the forcing P(d) ˙ ∆-system argument shows that Q has the c.c.c. In particular, the iteration ˙ is proper. P Q ˙ such that otp(p) ≥ For k < ω1 , let Dk be the set of all (p, q) ˙ ∈ P Q k + 1 and if i is the kth element of p then (p, q) ˙ decides f˙(i). Each Dk is dense. Let D = {Dk : k < ω1 }. By PFA, let (H0 , H1 ) be a D-genericfilter. Let i = sup( H0 ), and let f be the specializing function for T H0 which is provided by H1 . Then f witnesses that T H0 cannot have a branch of order type ω1 . However, Ci ∩ H0 is exactly one such branch. Contradiction! q.e.d. The following two results were products of Jensen’s fine structural analysis of L. Lemma 3.2 (Jensen). V=L implies κ for all κ ≥ ℵ1 . [Jen72]
216
RALF SCHINDLER
We say that 0# exists if there is an elementary embedding π : L → L with π = id. The existence of 0# is a large cardinal axiom in that if π : L → L has critical point κ then Lκ |= ZFC. Lemma 3.3 (Jensen). Suppose that 0# does not exist. If κ is a singular cardinal then κ+L = κ+ . In fact, if X ⊆ L then there is a set Y ∈ L with Y ⊃ X and Card(Y ) ≤ Card(X) + ℵ1 . [DevJen75] Corollary 3.4. PFA implies that 0# exists. Proof. Assume that PFA holds. Set κ = ℵω . By Lemma 3.1 we know that κ fails. On the other hand, L |= V = L, and therefore we have that L |= κ by Lemma 3.2. Let (Ci : κ < i < κ+L , lim(i)) ∈ L be a witness. By Lemma 3.3, κ+L = κ+ , and therefore (Ci : κ < i < κ+ , lim(i)) in fact witnesses that κ holds in V. Contradiction! q.e.d. We may look at the previous proof from a more abstract point of view. It tells us that PFA implies that there can be no pair (κ, W ) such that κ is a cardinal, W is an inner model, κ+W = κ+ , and W |= κ . However, the existence of such pairs (κ, W ) can be shown from weaker anti large cardinal hypotheses than the non-existence of 0# . Such an analysis leads to the following remarkable theorem, via work of Martin, Mitchell, Schimmerling, Steel, and Woodin. In its statement, PD denotes Projective Determinacy (cf. [Jec78] p. 560). We refer the reader to [Kan0 94, Chapter 6] for a thorough and nicely written discussion of the significance of PD in modern set theory. Theorem 3.5 (Martin, Mitchell, Schimmerling, Steel, Todorˇcevi´c, Woodin). PFA implies PD. Proof. The basic idea, which is due to Woodin, is to prove by induction on n < ω that V is closed under the operator x → M# n (x) for all n < ω. # Here, Mn (x) denotes a sufficiently iterable sharp for an inner model with n Woodin cardinals which contains x. The reader may find the details of such an induction in the paper [For1 MagSch2 01]. The point is that if # n < ω is such that V is closed under x → M# n (x) but Mn+1 (x0 ) does not exist for some x0 then there is a constructible inner model L[E] such that κ+L[E] = κ+ and L[E] |= κ for all large enough singular cardinals κ. This gives a contradiction with PFA via Lemma 3.1. q.e.d.
FORCING AXIOMS AND PROJECTIVE SETS OF REALS
217
A classical result of Mycielski and Swierczkowski yields that PD implies that every projective set of reals is Lebesgue measurable (cf. [Jec78, Theorem 102 (a)]). We therefore finally established the following. Corollary 3.6. PFA implies that every projective set of reals is Lebesgue measurable. We do not know whether there is a proof of Corollary 3.6 which does not go through PD. We let stap denote the class of all posets P which are stationary preserving, i.e., such that for all S ⊆ ω1 we have that if S is stationary, then VP |= “Sˇ is stationary”. Every proper forcing is clearly stationary preserving, but the converse is false. The papers [For1 MagShe88a] and [For1 MagShe88a] introduced Martin’s maximum, abbreviated by MM, as MA(stap). Obviously, MM implies PFA. We refer the reader to [Jec86, III §7] or [For1 MagShe88a] and [For1 MagShe88b] for further information on MM.
4
A few remarks on OCA.
Let X be a set. By [X]2 we denote the set of all {x, y} ⊆ X with x = y. Let [X]2 = K0 ∪ K1 with K0 ∩ K1 = ∅; K0 , K1 is then called a colouring of [X]2 . For h ∈ {0, 1} we say that Y ⊆ X is h-homogeneous iff [Y ]2 ⊆ Kh . Now let X ⊆ R. We view X as a topological space with the topology being induced by that of R. The set [X]2 inherits a topology from X (the product topology). We say that OCA(X) holds iff for every colouring K0 , K1 of [X]2 such that K0 is open in [X]2 , either there is an uncountable 0homogeneous Y ⊆ X or else X is a countable union of 1-homogeneous sets. The Open Colouring Axiom, abbreviated by OCA, is the statement that OCA(X) holds for every X ⊆ R. The reader may find background information about OCA in [Tod89, Chapter 8] and [Far0 Tod93, Chapter 10]. Lemma 4.1. PFA implies OCA. [Tod89, Theorem 8.0]
218
RALF SCHINDLER
Proof. Let X ⊆ R, and let K0 , K1 be a colouring of [X]2 . Let us suppose that X is not the union of countably many 1-homogeneous sets. Then in VCol(ω,R) , X is still not the union of countably many 1-homogeneous sets. By [Tod89, Theorem 4.4], in VCol(ω,R) there is therefore some uncountable Y˙ ⊆ X such that P˙ = {p ⊆ Y˙ : Card(p) < ℵ0 , [p]2 ⊆ K0 }, ordered by reverse inclusion, has the c.c.c. Now Col(ω, R) P˙ ∈ proper, and an application of PFA yields an uncountable 0-homogeneous subset of X. q.e.d. It can be shown, however, that Con(ZFC) implies Con(ZFC + OCA). In particular, OCA cannot imply that every projective set of reals is Lebesgue measurable, by Theorem 2.3. We say that OCA∗ (X) holds iff for every colouring K0 , K1 of [X]2 such that K0 is open in [X]2 , either there is a perfect 0-homogeneous Y ⊆ X or else X is a countable union of 1-homogeneous sets. For all X ⊆ R, OCA∗ (X) implies that X has the perfect subset property; therefore, OCA∗ (X) cannot hold for all X ⊆ R (under the axiom of choice, that is). Theorem 4.2. Projective Determinacy PD implies OCA∗ (X) for every projective X ⊆ R. [Fen0 83] The following corollary is thus another example of a transfer theorem. It follows from Theorems 3.5 and 4.2. Again (cf. the remark right after the statement of Corollary 3.6) we do not know whether there is a proof of Corollary 4.3 which does not go through PD. Corollary 4.3. PFA implies OCA∗ (X) for every projective X ⊆ R. Question 5. Suppose that OCA∗ (X) holds for every projective X ⊆ R. Is then every projective set of reals Lebesgue measurable?
5
BPFA and projective sets.
We shall from now on be concerned with bounded forcing axioms, i.e., with BMA(Γ) for classes of posets Γ. It turns out that BMA(Γ) is much
FORCING AXIOMS AND PROJECTIVE SETS OF REALS
219
weaker than MA(Γ) for any reasonable class Γ of posets which is significantly larger than c.c.c. The next lemma says that BMA(Γ) can be phrased in terms of generic absoluteness. The proof can be found in [Bag00]. Lemma 5.1. Let P be a poset. Then BMA(P) holds iff for every Σ1 formula ϕ(v1 , . . . , vn ) and for all x1 , . . . , xn ∈ Hℵ2 , ϕ(x1 , . . . , xn ) ⇔ VP |= ϕ(ˇ x1 , . . . , xˇn ). The bounded proper forcing axiom, abbreviated by BPFA, was introduced by Goldstern and Shelah in [Gol1 She95] as BMA(proper). This section will produce a proof of Theorem 1.2. We say that a regular cardinal κ is Σ1 reflecting iff for all a ∈ Hκ and for every formula ϕ(−), if there is some regular ϑ ≥ κ with Hϑ |= Φ(a) then there is some regular ϑ¯ < κ with Hϑ¯ |= Φ(a) (cf. [Gol1 She95, Definition 2.2]). It can easily be verified that every Σ1 reflecting cardinal is inaccessible and that if λ is a Mahlo cardinal then the set of all κ < λ which are Σ1 reflecting in Vλ is stationary in λ. Theorem 5.2. If BPFA holds then ℵ2 is a Σ1 reflecting cardinal in L. On the other hand, if κ is a Σ1 reflecting cardinal then there is a poset P ∈ proper such that in VP , BPFA holds. [Gol1 She95] We say that a cardinal κ is remarkable iff for all regular cardinals ϑ > κ there are π, M , κ ¯ , σ, N , and ϑ¯ such that the following hold: π : M → Hϑ is an elementary embedding, M is countable and transitive, π(¯ κ) = κ, σ : M → N is an elementary embedding with critical point κ ¯, ¯ N is countable and transitive, ϑ = M ∩ OR is a regular cardinal in N , ¯ and M = HN¯ , i.e., M ∈ N and N |= “M is the set of all sets σ(¯ κ) > ϑ, ϑ ¯ (cf. [Sch2 00a, Definition 0.4]). which are hereditarily smaller than ϑ” We are now ready to prove Theorem 1.2. Let us commence with proving Con(2) ⇒ Con(1). By Theorems 2.3 and 5.2 it will be enough to verify that if BPFA holds and ℵ1 is inaccessible to the reals then in fact ℵ1 is remarkable in L. For this in turn, via Lemma 5.1, it suffices to prove the following. Lemma 5.3. Suppose that ℵ1 is not remarkable in L. There is then a L[x] poset T ∈ proper such that in VT , there is a real x with ω1 = ω1 . [Sch2 01]
220
RALF SCHINDLER
Proof. There is an ω-closed poset P such that in VP , Hℵ2 = Lℵ2 [A] for some A ⊆ ω1 . This part of the argument just uses the non-existence ¯ where of 0# . We may for instance let P = Col(δ + , 2δ ) Col(ω1 , δ) P ℵ0 ¯ is δ is a singular cardinal of uncountable cofinality with δ = δ and P + a forcing which codes some D ⊆ δ of the intermediate model, where Lδ+ [D] = Hδ+ , by A ⊆ ω1 using δ + many pairwise almost-disjoint subsets of δ provided by Jensen’s Covering Lemma for L. ˙ by Now fix such P and A ∈ VP . In VP , we may define a poset Q ˙ iff there is an ordinal α < ω1 such that p : α → {0, 1}, and setting p ∈ Q for all β ≤ α we have that L[A ∩ β, pβ] |= β is countable. It can be verified that the fact that ℵ1 is not remarkable in L implies that ˙ ∈ proper. Therefore, P Q ˙ ∈ proper. VP |= Q ˙ Forcing with P Q adds some B ⊆ ω1 such that L[B ∩ β] |= β is countable ˙
for every β < ω1 . Inside VPQ , we may therefore define a sequence (ai : i < ω1 ) of pairwise almost-disjoint subsets of ω such that for every i < ω1 , ai = the L[B ∩ i]-least subset of ω which is almost-disjoint ˙ from every aj , j < i. We may then define a poset S˙ ∈ VPQ by setting p = ((p), r(p)) ∈ S˙ iff (p) : n → {0, 1} for some n < ω and r(p) is a finite subset of ω1 . We let p = ((p), r(p)) ≤S˙ q = ((q), r(q)) iff (p) ⊃ (q), r(p) ⊃ r(q), and for all i ∈ r(q), if i ∈ B then {m ∈ dom((p)) \ dom((q)) : (p)(m) = 1} ∩ ai = ∅. The generic object will give the characteristic function of some subset a ˙ of ω such that i ∈ B iff a ∩ ai is finite. We shall have that VPQ |= S˙ ∈ ˙ S˙ ∈ proper. c.c.c., so that T := P Q T However, in V , a will be such that Hℵ2 = Lℵ2 [a]. q.e.d. Let us now turn towards proving Con(1) ⇒ Con(2) of Theorem 1.2. We plan on using Corollary 2.8 for Γ = proper. Fix κ < λ such that κ is remarkable and λ is Σ1 reflecting. It is easy to verify that λ is still Σ1 reflecting in VCol(ω,<κ) . Let ϕ denote the statement that there is a Σ1 reflecting cardinal. We then have (b) and (c) of Corollary 2.8, the latter one by the second part of Theorem 5.2 (more precisely, by the proof of
FORCING AXIOMS AND PROJECTIVE SETS OF REALS
221
[Gol1 She95, Theorem 4.1]). It remains to verify (a) of Corollary 2.8; we shall state this as a lemma. Lemma 5.4. Let κ be a remarkable cardinal. Let G be Col(ω,< κ)generic over V. Let P ∈ V[G] be a proper poset, and let H be P-generic over V[G]. Then for every real x in V[G][H] there is a poset Qx ∈ Vκ such that x is Qx -generic over V. [Sch2 01] Proof. Let ϑ > κ be a large enough regular cardinal; in particular, we want that P(P) ⊆ Hϑ [G], i.e., that the power set of P be contained in Hϑ [G]. Let x ∈ R ∩ V[G][H], and let x˙ ∈ Hϑ [G] be such that x˙ H = x. The fact that κ is remarkable in V implies that in V[G], there is a stationary set Σ of countable subsets of Hϑ [G] such that for each X ∈ Σ we have the following: there are π, α, and ϑ¯ such that V[G∩HV α]
π : (Hϑ¯
V[G]
V ; ∈, HV ϑ¯ , G ∩ Hα ) → (Hϑ
; ∈, HV ϑ , G)
is an elementary embedding with X = ran(π), α is the critical point of π, and α and ϑ¯ are regular cardinals of V (and hence of V[G ∩ HV α ]) (this actually characterizes remarkability; cf. [Sch2 01, Lemma 1.6]). As P is proper, Σ is still stationary in V[G][H]. Now consider the structure ˙ H). M = (Hϑ [G]; ∈, P, x, Because Σ is stationary in V[G][H], we may pick some X ∈ Σ such that X is the universe of an elementary submodel of M. Let V[G∩HV α]
π : (Hϑ¯
V[G]
¯ x, ¯ → (H ¯˙ H) ; ∈, P, ϑ
; ∈, P, x, ˙ H)
be the inverse of the Mostowski collapse of X. By the elementarity of V[G∩HV α] ¯ ¯ is P-generic π, H over Hϑ¯ , and hence over all of V[G ∩ HV α ], V[G∩HV V α] ¯ as P(P) ∩ V[G ∩ H ] ⊆ H ¯ . Moreover, by the definability of α
ϑ
¯ H ¯ pn forcing, we get that n ∈ x¯˙ iff ∃p ∈ H ˇ ∈ x¯˙ iff ∃p ∈ H p n ˇ ∈ x˙ ¯ H H ¯˙ ¯ iff n ∈ x˙ iff n ∈ x. So x˙ = x, and we may set Qx = Col(ω,< α) P ¯
H ¯ Notice finally that Qx ∈ Vκ . ¯˙ = P. where P
q.e.d.
222
RALF SCHINDLER
6 BMM. This final section will be concerned with another bounded forcing axiom. Bounded Martin’s maximum, abbreviated by BMM, was introduced in [Gol1 She95] as the statement that BMA(stap) holds. There is a lengthy discussion of BMM in [Woo99, §10.3]. Lemma 6.1. Suppose that X # does not exist for some set X. Then there L[x] is a poset T ∈ stap such that in VT , there is a real x with ω1 = ω1 . [Sch2 00] Proof. The point is that if X # does not exist for some set X then we can still define (a version of) the forcing T from the proof of Lemma 5.3; moreover, T will be stationary preserving. q.e.d. As a corollary to Lemma 6.1 do we get that BMM is much stronger than BPFA in the presence of Lebesgue measurability of projective sets. Corollary 6.2. Suppose that BMM holds. If ω1 is inaccessible to the reals then V is closed under sharps, i.e., X # exists for every set X. In particular, if BMM holds and every projective set of reals is Lebesgue measurable then V is closed under sharps. This corollary can be improved to get the following better lower bound on BMM + “every projective set of reals is Lebesgue measurable.” Theorem 6.3. If BMM holds and every projective set of reals is Lebesgue measurable then there is an inner model with a strong cardinal. [Sch2 00] As an upper bound, we only have: Theorem 6.4. Suppose that ZFC + “there is a proper class of Woodin cardinals” is consistent. Then so is ZFC + BMM + “every projective set of reals is Lebesgue measurable.” [Woo99] This leads to the obvious question: what is the consistency strength of ZFC + BMM + “every projective set of reals is Lebesgue measurable”? Let me finish with a conjecture which consists of two parts. Conjecture 6.5. 1. If BMM holds and every projective set of reals is Lebesgue measurable then there is an inner model with infinitely many strong cardinals. 2. If BMM holds and every set of reals in L(R) is Lebesgue measurable then the axiom of determinacy holds in L(R).
Benedikt L¨owe, Boris Piwinger, Thoralf R¨asch (eds.) Classical and New Paradigms of Computation and their Complexity Hierarchies Papers of the conference “Foundations of the Formal Sciences III” held in Vienna, September 21-24, 2001
Post’s and other problems of supertasks of higher type Philip D. Welch Department of Mathematics University of Bristol Bristol BS8 1TW, England Institut f¨ur Formale Logik W¨ahringerstr. 25 1090 Wien, Austria E-mail: [email protected]
Abstract. We consider the theory of supertasks as implemented on infinite time Turing machines. We consider in particular computations in the type of sets of reals. We give some commentary on this and a number of open problems are raised.
1
Introduction
The theory of infinite time computations on sets of integers is now, at least, reasonably understood. We discuss here some open questions in both this theory but more particularly in that of the theory concerning computations in the next type up, that is using reals as inputs/outputs, and Received: December 10th, 2001; In revised version: March 18th, 2002; Accepted by the editors: April 29th, 2002. 2000 Mathematics Subject Classification. 03D75 03D60 03D65 03E60.
We should like to thank the Institut f¨ur Formale Logik in Vienna for a Gastprofessur where this paper was written.
B. Löwe et al. (eds.), Classical and New Paradigms of Computation and their Complexity Hierarchies, 223-237. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
224
PHILIP D. WELCH
sets of reals as oracles. This higher type theory has not as yet received the same attention as the theory on integers. The aim of this article is to provide some commentary on what we feel this higher type theory should look like, and to ask a number of questions whose answers should help develop that theory, and place it in a context with other theories of reducibilities at this level. In this section we give some of the basic definitions1 , and mention an alternative machine architecture that yields the same class of computable functions on single tape machines. In the next section we state the corresponding definitions and facts for the higher type theory. We give no proofs of these latter results. Often they are simply the analogous ones for the theory on integers, and the reader can construct them for his or herself. We refer the reader to [HamLew00] or to Joel Hamkins’ article in this volume, for the basic structure, definitions, and results concerning such machines. We let Pn denote the nth program in some fixed enumeration. In Section 2 we shall assume some familiarity with the notions of determinacy of two person perfect information games—see, e.g., [Kan0 94]. For a pointclass Γ, “Det(Γ)” will denote the assertion that games with payoff sets A ∈ Γ have winning strategies for one of the players. It was noted in [HamSea01] that the machine architecture could be replaced by a single tape machine using 0 and 1 as an alphabet, but only as long as one considered computable functions f : N → N. For functions F : R → R it turned out that a single tape is insufficient. We first note here that a not unattractive one (or three) tape machine model is obtained which computes the same class of functions (on either integers or reals) as the original machine architecture, by allowing into the alphabet blank cells on the tape2 . The purpose of a “blank” or empty cell is to signify ambiguity. We adopt a new limit rule for specifying the contents of cells at limit stages: if a cell’s value has varied cofinally often below λ, we set the value to the “ambiguous” value of a blank. Ambiguity limit rule. If λ is a limit ordinal, then the contents of the ith cell on the tape at time λ, Ci (λ), is given by: 1 2
For more details, cf. [HamLew00]. To be sure one now has a machine that in essence works on 3ω instead of 2ω , although for the theorem above we still consider x ∈ R input as an infinite sequence from {0, 1} only. A function F : R → R is computable by such a machine, if for all real input strings, the output strings are also real strings, i.e., without blanks.
SUPERTASKS OF HIGHER TYPE
225
If there is ν0 < λ such that for all ν < λ(ν0 < ν implies Ci (ν0 ) = Ci (ν)) then set Ci (λ) = Ci (ν0 ); otherwise set Ci (λ) to be a blank. One then has: Theorem 1.1. Let C be the class of functions F : R → R computable by the Hamkins-Lewis machines of [HamLew00], and C those of the one-tape machine just specified. Then C = C . It is not hard to see that if we now define a one-tape machine using this ambiguity limit rule, then its action can be simulated on a {0, 1}valued three tape [HamLew00]-machine. It was noted in [HamSea01] that the single tape machines with {0, 1} values only computed a class of functions that was not closed under composition. This had the effect that the natural simulation of a three-tape machine by a one-tape machine could not realise that it had performed a final compression of the simulated output3 . We have here an extra symbol –namely the blank– in the alphabet, and this gives us enough room to set flags and enable us to close up under composition, and thus calculate the same functions as the three-tape model. (For readers familiar with [HamSea01] this is really an exercise.) However, the theorem above notwithstanding, we shall stick to the definitions and conventions of [HamLew00] for this paper. The rest of the definitions of this subsection are all taken from that paper. It should be noted that we deliberately elide the distinction between subsets of ω and their characteristic functions as elements of 2ω and write “n ∈ g” interchangeably for “g(n) = 1” etc. Definition 1.2. For x ∈ 2ω , ϕn (x) denotes the result of running Pn on input string x. For any n, x it is easily seen that there is a countable ordinal α so that the behaviour of ϕn (x) has “settled” by stage α: either it has halted or it has entered a permanent looping cycle: notationally, either ϕn (x)↓y for some y ∈ 2ω , or else ϕp (x)↑. For a machine with an oracle f ∈ 2ω we 3
A compression is a map of the form x0 y0 z0 x1 y1 z1 . . . → z0 z1 z2 . . .; although clearly one-tape computable on its own, one cannot compose it in a one-tape computable fashion with the other steps of the simulation. One needs a flag or some device such as further scratch pad (cf. [HamSea01] Theorem 2.3.) that is not part of the final output to allow us to know when the final compression is complete, and the one-tape computation may halt.
226
PHILIP D. WELCH
allow the machine to receive 0/1 answers to queries of the form “Does f (n) = 1”? We write “ϕfp (x)↑y” for “program p with input x and oracle f eventually has a settled y on its output tape”; in other words, there is a time ν, so that for all later times ϕfp (x) has y sitting on its output tape—although without requiring that ϕfp (x)↓. The relations “ϕfp (x)↑y”, “ϕfp (x)↑”, and “ϕfp (x)↓y” are ∆12 relations of f, x, (y). We let WO denote the set of reals in 2ω that code wellorderings of ω. Definition 1.3. We let x ∈ W if and only if there is a p ∈ ω such that ϕp (0)↓x; and we say that W is the class of writable reals. Fact 1.4. The writable ordinals, (those ordinals α so that there is x ∈ W ∩ WO with ||x|| = α) form an initial segment of the countable ordinals. [HamLew00, Theorems 3.7 & 8.3] Definition 1.5. Let λ := sup{||x|| : x ∈ WO ∩W}, and γ := sup{α : there is some p ∈ ω such that ϕp (0)↓ and halts in exactly α steps}. Definition 1.6. We define EW := {x ∈ 2ω : there is some p such that ϕp (0)↑x}, and call EW the set of eventually writable reals. Moreover, ζ := sup{||x|| : x ∈ WO ∩EW}. Definition 1.7. We define the set of accidentally writable reals as follows: AW := {x ∈ 2ω : there is some p such that x appears on any tape at any time of the computation of ϕp (0) }. Let Σ := sup{||t|| : t ∈ AW ∩ WO}. The accidentally writable reals are thus likely to be transient. (An equivalent class of reals is obtained by restricting the appearance of such reals to the output tape alone.) Clearly AW ⊇ EW ⊇ W. We let HC denote the class of hereditarily countable sets. Definition 1.8. We write H(λ) (H(ζ), H(Σ) respectively) for the class of sets y ∈ HC so that ∃¯ y ∈ W (EW, AW respectively) with y¯ coding y. We use the notation W f , λf . . . , Hf (λf ), . . . for the notions relativised to a real f .
SUPERTASKS OF HIGHER TYPE
227
Definition 1.9 (The weak jump). For f ∈ 2ω , define f to be the set {n ∈ ω : ϕfn (0)↓}.4 It is easy to see that f → f is a ∆12 function. The notions of g is a decidable (or semi-decidable) set of integers is the natural one: g must have a totally (or partially) computable characteristic function: Definition 1.10. g ∈ 2ω is semi-decidable in the real f if for some e we have for all n ∈ ω n ∈ g if and only if ϕfe (n)↓1. As usual g is decidable (in f ) if both it and its complement are semidecidable (in f ). For the ordinary notion of Turing computability one has the equivalence between the semi-decidable, i.e., the recursively enumerable, sets of integers as those that are the domain of some computable function, with, as alternative, those that are the range of such. One can establish this by observing that from the universality of Kleene’s T-predicate one may simply test integers to see if they code a whole course of computation of the relevant function, and seeing if they terminate in a desired number output. The point is that the course of computation can be coded as something within the domain of the Turing machine. But is that the case for infinite time Turing machines? Or do we have in the phraseology of Sacks [Sac80], a “violation of parity” (between the type of the object in the domain of the computation, and that of the computation itself)? In essence we are asking whether (a code for) a halting computation can also be the result of a halting output of another computation. For this it is enough to have a real code for the length of any halting computation to be a potential output of some halting computation. Once we have such a code to hand we can then write a program simulating the original course of computation along that ordinal, and so output a code for the course of that computation. In the terms we have defined, we are asking whether γ ≤ λ. This had been the principal open problem left from [HamLew00]. The answer is affirmative. Theorem 1.11. The ordinals λ and γ are the same, and by relativisation we also have for all f ∈ 2ω that λf = γ f . [Wel0 00b, Theorem 1.1] 4
Compare Hamkins’s definition on page 154 of this volume.
228
PHILIP D. WELCH
Once we have this result we can answer a number of questions, as to what the decidable sets of integers are, what is complexity of the weak jump operator, what are the sets H(λ) etc. Theorem 1.12. For ϑ ∈ {λ, ζ, Σ} the models H(ϑ) and Lϑ are the same, and similarly for any f ∈ 2ω and ϑ ∈ {λf , ζ f , Σf }, Hf (ϑ) = Lϑ [f ]. [Wel0 00b, Corollaries 3.1, 3.3, 3.5] Corollary 1.13. The decidable sets of integers are precisely those of ℘(ω) ∩ Lλ . The relationship between the various H(ϑ) sets is given by: Theorem 1.14 (The (λ, ζ, Σ) Theorem5 ). 1. Lλ ≺Σ1 Lζ ≺Σ2 LΣ . 2. (λ, ζ, Σ) is the lexicographically least triple of ordinals satisfying this relation. This theorem implies that we may consider the definable sets of integers as those Σ1 -definable inside the least model of KP with a transitive Σ2 -end extension. It further implies that Lλ , Lζ are admissible sets satisfying strong reflection properties. For example, there are unboundedly many α below any of the ordinals λ, ζ, Σ where Lα |= Σ2 -KP (meaning Kripke-Platek set theory with Σ2 -Collection and ∆2 -Comprehension axioms). By appealing to Theorem 1.11 as γ f = λf , Theorem 1.12, and by considering the running of such machines inside Lλf [f ] one sees: Corollary 1.15. The set f is an element of Σ1 (Lλf [f ]). As f is not an f -decidable set of integers, it cannot lie inside Lλf [f ]. It is thus perhaps unsurprisingly a Σ1 -mastercode for Lλ . Corollary 1.16. The set f is (ordinary) Turing isomorphic to the set Σ1 -Th(Lλ ), i.e., the complete Σ1 -theory of Lλ . [Wel0 00a] There is the natural reducibility relation associated to reals: 5
Cf. [Wel0 00a].
SUPERTASKS OF HIGHER TYPE
229
Definition 1.17. For x, y ∈ 2ω we write x ≤∞ y if there is an n such that for all k we have ϕyn (k)↓1 iff x(k) = 1 & ϕyn (k)↓0 iff x(k) = 0. Corollary 1.15 enables one to see that the assignment f → λf satisfies the usual Spector Criterion: Lemma 1.18. The statement x ≤∞ y is equivalent to x ∈ Lλy [y]; thus the statement x ≤∞ y implies that λx ≤ λy ; and the assignment x → λx then satisfies a Spector criterion: x ≤∞ y implies (x ≤∞ y ⇐⇒ λx < λy ). The analogy of ≤∞ with ordinary Turing reducibility is explored in [HamLew00]. We take Lemma 1.18 as indicating that the proper analogy here is rather with something intermediate between hyperdegrees and that of ∆12 -degrees6 . The principal open question we can formulate at this point is the following. Question 6. Let D be a countable set of infinite time degrees. Does D have a minimal upper bound? This question remains unresolved for hyperdegrees. For ∆12 -degrees there are two differing but complete pictures, both with affirmative answers, depending on whether R ⊆ L or not (cf. [Fri0 74]). There are some partial results on Question 1 in [Wel0 99], but they are very partial, and the problem looks difficult7 . 1.1
Eventually infinite time degrees
There is one sense in which one could argue that the ordinal ζ is perhaps more fundamental than λ. It is after all the point by which the behaviour 6
7
With the assignments x → ω1x (where ω1x is the least ordinal not recursive in x) and x → ∆12 (x), respectively. [Wel0 99] uses some unexplained notation which we here clarify: F (e, f ) = β abbreviates ”ϕfe (0)↓ in exactly β steps”
230
PHILIP D. WELCH
of the machine on a computation of the form ϕe (k) is determined: either it has halted, or it has begun to loop permanently at some ordinal ζ¯ ≤ ζ.8 We could thus define a reducibility relation “x is eventually writable from y” and a corresponding jump: Definition 1.19. We write x ≤e∞ y if x ∈ EW y , and x˜ for the set of indices {p ∈ ω : there is some y ∈ EW x such that ϕxp (0)↑y}. We have parallel to the above: Lemma 1.20. The statement x ≤e∞ y is equivalent to x ∈ Lζ y [y]; thus the statement x ≤e∞ y implies ζ x ≤ ζ y ; and the assignment x → ζ x then satisfies the Spector criterion: x ≤e∞ y implies (˜ x ≤e∞ y ⇐⇒ ζ x < ζ y ). Lemma 1.21. The real x˜ is Turing isomorphic to the complete Σ2 (Lζ x [x]) set of integers. Again the variant of Question 1 arises. Question 7. Let D be a countable set of eventually infinite time degrees. Does D have a ≤e∞ minimal upper bound? Let F = {x ∈ 2ω : x ∈ Lζ x }. We note that it makes no difference whether we write ζ x or λx in this definition, since there is a program that given an x as input, searches for its L-rank. Once found we can ensure the program halts. Hence if x ∈ F then x ∈ Lλx . There is an analogy here with C1 , the largest thin Π11 set of quickly constructible reals.9 We may define a natural hierarchy of eventually infinite degrees through F in the spirit of [Kec75] (where this is done for Q and hyperdegrees) by e0 := [0]e∞ , eα+1 := [e˜α ]e∞ , eµ := lub{eα : α < µ} for limit ordinals µ if defined. (Here “lub” abbreviates “least upper bound” in the degree ordering.) Question 8. What is the natural length of this hierarchy? That is, what is the least so that e is undefined10 ? 8
9
10
Moreover there is e so that this upper bound is attained here. It would take us too far afield here, but it is essentially the pattern of the universal machine ϕU ’s output tape at stage ζ that is recursively isomorphic to certain sets of interest occurring in the revision theory of truth based on Herzberger sequences. Here, C1 := {x : x ∈ Lω1x }. Cf. [Kec75] or [Mos1 80, 4F.4] for the basic properties and structure of C1 . The version of this question for the infinite time degrees has a known answer: the natural hierarchy is undefined even at the ωth stage.
SUPERTASKS OF HIGHER TYPE
2
231
Higher Type Computation
We briefly review again the relevant definitions from [HamLew00]: For A ⊆ 2ω we write ϕA p (x) for the function defined by the pth program in a recursive listing of programs allowing queries of the set A, as to whether a real currently on the scratch tape is, or is not, in A (and receiving a simple 0/1 answer), using real input x. Definition 2.1 (The strong jump). We define 0 to be the set {(n, x) ∈ ω × 2ω : ϕn (x)↓}. For A ⊆ 2ω define A := {(n, x) ∈ ω × 2ω : 11 ϕA n (x)↓}. There is an associated reducibility relation associated to sets of reals: Definition 2.2. We write A ≤∞ B if there is an n such that for all x ∈ 2ω we have ϕB n (x)↓1 if and only if x ∈ A, and / A. ϕB n (x)↓0 if and only if x ∈ We make specifically semi-decidability into a definition: Definition 2.3. A set of reals A is semi-decidable in a set of reals B if and only if: ! ∃n∀x ∈ 2ω ϕB n (x)↓1 if and only if x ∈ A . A set of reals A is semi-decidable in a set of reals B and a real y ∈ 2ω if and only if: ! ∃n∀x ∈ 2ω ϕB,y n (x)↓1 if and only if x ∈ A . We want a notion of “semi-decidability” that results in a pointclass of sets closed under continuous preimages. Thus (when sufficient determinacy) it shall have a Wadge rank, which in turn gives a measure of complexity to the notion of semi-decidability we have defined12 . The notion “A is decidable in B” (respectively “in a real and B”) is as usual: this relation holds if both A and ¬A are semi-decidable in B (respectively in a real and B). 11 12
Cf. [HamLew00]; compare also Hamkins’s definition on page 154 of this volume. Wadge defined the following ordering on sets of reals: A ≤W B if there is a continuous reduction of A to B: i.e., there is a continuous f : 2ω → 2ω so that x ∈ A if and only if f (x) ∈ B. Then ≤W is easily seen to be reflexive and transitive (cf. Section 3 of the introduction to this volume; p. 6). This will give us a notion of Wadge rank of the sets of
232
PHILIP D. WELCH
Definition 2.4. A ≤r,∞ B denotes that A is decidable in B and a real. For the rest of this section “semi-decidability” will abbreviate “semidecidability in a real”—the boldface notion. Definition 2.5. Define Γ0 := {A ⊆ 2ω : A is semi-decidable from a real} and ∆0 := Γ0 ∩ ¬Γ0 . Theorem 2.6 (Hamkins, Lewis). ∆0 ⊆ ∆12 is a σ-algebra, closed under Suslin’s operation A, properly containing the Selivanovski C-sets13 . [HamLew00] It is easy to verify that the semi-decidable in z sets of reals are provably ∆12 (z) and thus, by a result of Solovay (a proof is given in [Fen2 Nor0 74]), are all absolutely measurable, and enjoy the property of Baire. There are two questions related to hierarchies with a classical flavour (as opposed to the descriptive set-theoretic viewpoint based on the later effective theory of Moschovakis and others). The first is to ask how the semi-decidable sets sit with relation to the Kolmogorov R-sets. Question 9. Classify the semi-decidable sets of reals within the hierarchy of the R-sets. We conjecture that they sit very low down in this hierarchy. The reader may consult [Hin78, V.4 & V.5] for an account of classical hierarchies through ∆12 . There is another notion of generalised computation, due to Blackwell [Bla0 78], the “Borel programmable functions”— derived from the theory of dynamic programming in probability theory. For this, let X, Y be perfect polish spaces, and 2ω be Cantor space. A program is a function p : 2ω → 2ω so that for all n ∈ ω, x(n) ≤ p(x)(n)
13
reals in any given pointclass Γ provided we know that ≤W restricted to Γ is wellfounded. Assuming sufficient determinacy Wadge’s Lemma (cf.[Mos1 80, 7D.3]) asserts that ≤W so restricted is (virtually) a linear ordering, and a theorem of Martin (cf.[Mos1 80, 7D.14]) asserts –still assuming sufficient determinacy– that it is wellfounded. For Γ equal to the class of Borel sets no determinacy is required (indeed, unlike Borel Determinacy, these facts can be proven in second order arithmetic). Assuming the full axiom of determinacy, AD, then the Wadge hierarchy so obtained has rank Θ := sup{γ : there is some surjective g : R → γ}. The Selivanovski C-sets are the smallest σ-algebra of sets containing the Borel sets, and closed under operation A.
SUPERTASKS OF HIGHER TYPE
233
(thus we are really thinking –viacharacteristic functions– of p as operating on a subset x of ω and returning a possibly larger set p(x)). We iterate p in an obvious way, and let pα be the αth iterate, with pλ (x)(n) = supα<λ pα (n) for all n at limits λ. Let pΩ be the fixed point function attained at some Ω < ω1 . A Borel function e : X → 2ω is called an encoder, a Borel d : 2ω → Y is a decoder. Let ci be the constant function on ω with value i. Definition 2.7. A function f : X → Y between perfect polish spaces X, Y is Borel programmable if it is of the form d ◦ pΩ ◦ e for some d, p, e of the above form. A set A ⊆ X is Borel programmable if its characteristic function A : X → 2ω is (where A (x) = c1 if x ∈ A and A (x) = c0 if x ∈ / A.) Question 10. Are the infinite time Turing machine decidable sets all Borel programmable sets? It is easy to see from the definition of Borel programmable functions, that their computation could be effected on an infinite time Turing machine (equipped with oracles for the requisite Borel codes of the functions concerned). The question concerns the reverse inclusion. The question prima facie is not unreasonable, since by a result of Burgess and Lockhart [BurLoc83] the Borel programmable sets also properly include the C-sets. However we conjecture the answer is negative.14 In [HamLew02] a positive solution to Post’s problem (whether there can be A with 0 <∞ A <∞ 0) is proposed. In fact two countable semi-decidable sets A, B ⊆ 2ω , which could be construed as constructed definably over Lλ , are seen to be ≤∞ -incomparable viaa FriedbergMuchnik type construction. Since the sets constructed are countable, they are obviously both of degree 0 when considering the boldface reducibility of decidability in reals. One can “remedy” that as follows Lemma 2.8 (V=L). The set F of fast reals satisfies: 0 <∞ F <∞ 0. Moreover F is not decidable in a real, and 0 is not decidable in any real and F ; that is 0
We are grateful to John Burgess for bringing this example to our attention—albeit in a rather different context.
234
PHILIP D. WELCH
We take the view here that one tends not to consider the relation of “A is ∆1n in B” between sets of reals (although of course this makes perfect sense for sets of integers). We should like to generalise the notion of boldface semi-decidability that is afforded by Kleene degrees (see [HrbSim2 80] for a discussion of this notion). Briefly, for sets of reals A, B one writes A ≤K B (“A is Kleene recursive in B”) iff there is a real y so that the characteristics function χA of A is recursive (in the sense of Kleene [Kle59]) in y, χB and the existential integer quantifier 2 E. The computational model here differs from the infinite time Turing machines, but one can view it as a machine equipped with ability to quiz an oracle for B and y, with a countably infinite memory, and an ability to search that memory in a finite amount of time. The Kleene recursive sets are then the Borel sets, and the Kleene semi-recursive sets are the coanalytic sets. Solovay had shown (cf. [Sol71]) that under AD, the axiom of determinacy, the Kleene degrees are well-ordered. The complete semi-recursive set is WO, the Π11 set of reals coding wellorders. The nature of the reducibility ordering ≤K depends on one’s universe of discourse: contrasting with Solovay’s AD result already mentioned, in L, or in set generic extensions thereof, there are 2ℵ0 ≤K -incomparable semi-recursive sets below the degree of WO (cf. [HrbSim2 80]). A sharper result than Solovay’s (no pun intended) is the fact that if Π11 -determinacy holds then all Kleene semi-recursive, non-recursive sets have the same Kleene degree. This is a result of Steel [Ste0 80]. The converse also holds by Harrington [Har0 78, Theorem 4.4]. We should expect an entirely analogous picture to obtain for the degrees of the semi-decidable sets defined above. In the following we use freely the natural generalisations of the definitions and notations from the first section. We thus, for example, write “λB ” for the first ordinal that is not (coded by) the output of any halting computation with oracle B, on input 0. It follows from a version of the (λ, ζ, Σ) Theorem 1.14 that the sets of reals semi-decidable in x ∈ 2ω are those sets A expressible with a Σ1 formula ψ0 (v0 , v1 ) as follows:
y ∈ A if and only if Lλx,y [x, y] |= ψ0 (x, y).
SUPERTASKS OF HIGHER TYPE
235
The relation A is semi-decidable in B (A, B ⊆ 2ω ) and x ∈ 2ω can be seen to be given similarly as: y ∈ A if and only if Lλx,y,B [x, y, B] |= ψ0 (x, y, B). In the latter equivalence, we should have precisely the notion of A being Kleene semi-recursive to B, if one simply replaced λx,y,B by x,y,B ω1 . However we are not simply mimicking a (version equivalent to) Kleene’s definition here. Suppose we are given any function f : D → On of ordinary Turing degrees to countable ordinals that is definable via a Σ1 formula ψ(v0 , v1 ) so that for any (ordinary) Turing degree [y]T we have f ([y]T ) = α iff L[y] |= ψ(y, α); then we may define a slice through ∆12 by defining a lightface pointclass Γ0 as follows: A ∈ Γ0 if and only if for some formula ϑ(v0 , v1 ) we have x ∈ A if and only if Lf (x) [x] |= ϑ(x).15 How high a rank f has in D ℵ1 modulo the Martin measure (cf. [Kan0 94, p. 386]), assuming say Det(∆12 ), determines the complexity of the class Γ0 . By some measures λx is a large ordinal, and we are dealing with a complex class. We may then, to be specific, ask questions about the semi-decidable sets below 0 as follows: 1. Is there A ⊆ 2ω , A ∈ Γ0 , A not decidable in a real, but satisfying that 0 is not decidable in a real and A? 2. Are there A, B ∈ Γ0 neither of which is decidable in a real from the other? The outcome, just as for Kleene degrees, depends on the set theoretical universe one inhabits. Theorem 2.9. In L, the constructible universe, (or in set generic extensions thereof) there are A, B as in 2. We have not checked, but fully expect that, there are 2ℵ0 many such incomparable sets A. We list a number of problems, which are supposed to elucidate the boldface reducibility ordering in the constructible universe. We expect their answer (and solution) to be similar to that for Kleene degrees. 15
A boldface definition would add in a real parameter.
236
PHILIP D. WELCH
Question 11. Assume V = L. 1. Can there be a semi-decidable set A of minimal ≤r,∞ -degree? 2. Can there be infinite descending sequences of ≤r,∞ -degrees? 3. Are the ≤r,∞ -degrees of semi-decidable sets dense between 0 and 0? (We conjecture the answer to 3. to be affirmative, thus solving the other two parts.) In the following we let for Γ (any pointclass) Boolean(Γ) denote the pointclass obtained from the sets of Γ and closing up under complementation and finite unions. Theorem 2.10. Det(Boolean(Γ0 )) implies that every A ∈ Γ0 \∆0 satisfies: 0 and A are mutually decidable in a real. Thus sufficient determinacy ensures the answers to both questions is negative. In essence all one has to do for this latter theorem is to quote a result of Steel (cf. [Ste0 80]) that for any boldface pointclass Γ the determinacy of sets in Boolean(Γ) ensures that any set not Γ-self-dual is continuously reducible to another such set. There remains the question of how to measure the complexity of Γ0 . One method is to ascertain how much det4erminacy Det(Boolean(Γ0 )) actually is. Can one find an equivalence in terms of inner models of large cardinal axioms? We formulate this as follows: Question 12. Determine the strength of Det(Boolean(Γ0 )). Can the sharp of an inner model of some large cardinal axiom be found which is equivalent to this? In general the direction from a sharp to the determinacy is difficult. For the theory of Kleene degrees we have already remarked that there is a precise answer: Det(Boolean(Π11 )) yields, by the above comments, that all non-Borel co-analytic sets are Kleene mutually reducible to the complete Π11 set. This conclusion had been obtained by Steel assuming just Det(Π11 ).16 We can appeal to methods of Steel [Ste0 82b], using Friedman-style games to show that Det(Γ0 ) implies reasonable large cardinal strength. We first have: 16
Of course Harrington showed that Det(Π11 ) implied the existence of x# for any real x, and this was known by work of Martin to imply Det(Boolean(Π11 )).
SUPERTASKS OF HIGHER TYPE
237
Theorem 2.11. The following is a (lightface) decidable relation “x, y ∈ 2ω codes countable premice M, N <∗ 0¶ and M ≤∗ N in the (pre)mouse ordering.” In the above the relations <∗ and ≤∗ are the canonical (pre)mouse orderings:17 For premice M ≤∗ N is to be interpreted as “In the comparison coiteration of M with N to models Mϑ , Nϑ then either (i) ϑ is least so that one of the models is illfounded, and that model is Mϑ or (ii) Mϑ is an initial segment of Nϑ .” The notation 0¶ (read as “zero pistol ”) denotes the sharp for an inner model of a strong cardinal (if it exists). Using the fact that Det(Γ0 ) implies Det(Π11 ) we may use the Σ13 -correctness of the core model built assuming there is no inner model with a strong cardinal (cf.[Ste0 Wel0 94]) and obtain: Theorem 2.12. Det(Γ0 ) implies that 0¶ exists. In fact one has x¶ for any real x. But this seems far from an optimal result. Indeed to ask a question as to lower bounds: Question 13. Does the mutual decidability in reals of all A, B ∈ Γ0 \∆0 imply x¶ exists for all x ∈ R? Does it imply x# exists? We expect the answers should be affirmative.
17
Cf. [Zem02, § 5.4].
References [Aar02] [Abr84]
[AbrTod97] [AdaKec00]
[AgrKaySax∞]
[Aha99] [AldStr81]
[Ali95]
[AltFer02]
[Amb0 00] [Amb0 dWo01] [Amb0 May1 97] [Amb1 +01] [And0 EngH˚as01]
[And1 88] [And1 91] [And1 Bre0 94]
[And1 Hod1 N´em99]
[And1 Mik94]
Scott Aaronson, Quantum lower bound for the collision problem, in: [Rei02, p. 635–642] Uri Abraham, A minimal model for ¬CH: iteration on Jensen’s reals, Transactions of the American Mathematical Society 281 (1984), p. 657–674 Uri Abraham, Stevo Todorˇcevi´c, Partition Properties of ω1 compatible with CH, Fundamenta Mathematicae 152 (1997), p. 165–181 Scot Adams, Alexander S. Kechris, Linear algebraic groups and countable Borel equivalence relations, Journal of the American Mathematical Society 13 (2000), p. 909–943 Manindra Agrawal, Neeraj Kayal, Nitin Saxena, PRIMES is in P, preprint available at http://www.cse.iitk.ac.in/news/primality.ps Dorit Aharonov, Quantum Computation—A Review, in: [Sta2 99, p. 259–346] Alexander Alder, Volker Strassen, On the algorithmic complexity of associative algebras, Theoretical Computer Science 15 (1981), p. 201– 211 Farid Alizadeh, Interior point methods in semidefinite programming with applications to combinatorial optimization, SIAM Journal on Optimization 5 (1995), p. 13–51 Helmut Alt, Afonso Ferreira (eds.), 19th Annual Symposium on Theoretical Aspects of Computer Science, Antibes–Juan les Pins, France, March 14-16, 2002, Springer Verlag, Berlin, 2002 [Lecture Notes in Computer Science 2285] Andris Ambainis, Quantum lower bounds by quantum arguments, Journal of Computer and System Sciences 64 (2002), p. 750–767 Andris Ambainis, Ronald de Wolf, Average-case quantum query complexity, Journal of Physics A 34 (2001), p. 6741–6754 Klaus Ambos-Spies, Elvira Mayordomo, Resource-bounded measure and randomness, in: [Sor97, p. 1–47] Klaus Ambos-Spies, Wolfgang Merkle, Jan Reimann, Frank Stephan, Hausdorff dimension in exponential time, in: [Sud+01, p. 210–217] Gunnar Andersson, Lars Engebretsen, Johan H˚astad, A new way of using semidefinite programming with applications to linear equations mod p, Journal of Algorithms 39 (2001), p. 162–204 Hajnal Andr´eka, On the representation problem of distributive semilattice-ordered semigroups, preprint 1988 Hajnal Andr´eka, Representation of distributive lattice-ordered semigroups with binary relations, Algebra Universalis 28 (1991), p. 12–25 Hajnal Andr´eka, Dimitrij A. Bredikhin, The equational theory of union-free algebras of relations, Algebra Universalis 33 (1994), p. 516– 532 Hajnal Andr´eka, Ian Hodkinson, Istv´an N´emeti, Finite algebras of relations are representable on finite sets, Journal of Symbolic Logic 64 (1999), p. 243–267 Hajnal Andr´eka, Szabolcs Mikul´as, Lambek calculus and its relational semantics: completeness and incompleteness, Journal of Logic, Language and Information 3 (1994), p. 1–37
240 [And1 Mon0 N´em91] [And1 Tho2 88]
[And2 CamHjo01]
[AsaWil1 00] [Bag00] [Bar0 70]
[Bar1 +95]
[Bar2 Jud95] [Bar3 KeiKun80]
[Bau83] [Bau84] [Bau85] [BauLav79] [BauMar2 She84]
[Bea0 +01]
[Bea1 99]
[BecKec96]
REFERENCES Hajnal Andr´eka, J. Donald Monk, Istv´an N´emeti (eds.), Algebraic Logic, North-Holland, Amsterdam, 1991 Hajnal Andr´eka, Richard J. Thompson, A Stone type representation theorem for algebras of relations of higher rank, Transactions of the American Mathematical Society 309 (1988), p. 671–682 Alessandro Andretta, Riccardo Camerlo, Greg Hjorth, Conjugacy equivalence relation on subgroups, Fundamenta Mathematicae 167 (2001), p. 189–212 Takao Asano, David P. Williamson, Improved approximation algorithms for MAX SAT, in: [Shm0 00, p. 96–105] Joan Bagaria, Bounded forcing axioms as principles of generic absoluteness, Archive for Mathematical Logic 39 (2000), p. 393–401 Yehoshua Bar-Hillel (ed.), Mathematical logic and foundations of set theory, Proceedings of an International Colloquium held under the auspices of the Israel Academy of Sciences and Humanities, Jerusalem, 1114 November 1968, North-Holland Publishing Co., Amsterdam-London 1970 [Studies in Logic and the Foundations of Mathematics] Amotz Bar-Noy, Faith Fich, Tamar Flash, Orna Grumberg, Joseph Halpern, Shmuel Klein, Nati Linial, Moni Naor, David Peleg, Ron Roth, Jeannette Schmidt, Ron Shamir, Micha Sharir, Oded Shmueli, Mike Sipser, Avi Wigderson (eds.), Proceedings of 3rd Israel Symposium on Theory of Computing and Systems, Tel Aviv, Israel, 4–6 January, 1995, IEEE Computer Society Press, New York NY, 1995 ´ Tomek Bartoszynski, Haim Judah, Set Theory, On the Structure of the Real Line, A K Peters, Wellesley MA, 1995 Jon Barwise, H. Jerome Keisler, Kenneth Kunen (eds.), The Kleene Symposium, Proceedings of the Symposium held at the University of Wisconsin, Madison, June 18–24, 1978, North-Holland, Amsterdam, 1980 [Studies in Logic and the Foundations of Mathematics 101] James E. Baumgartner, Iterated forcing, in: [Mat83, p. 1–59] James E Baumgartner, Applications of the proper forcing axiom, in: [KunVau84, p. 913–959] James E. Baumgartner, Sacks forcing and the total failure of Martin’s axiom, Topology and its Applications 19 (1985), p. 211–225 James E. Baumgartner, Richard Laver, Iterated perfect-set forcing, Annals of Mathematical Logic 17 (1979), p. 271–288 James E. Baumgartner, Donald A. Martin, Saharon Shelah (eds.), Axiomatic set theory. Proceedings of the AMS-IMS-SIAM joint summer research conference held in Boulder, Colorado, June 19–25, 1983, American Mathematical Society, Providence RI, 1984 [Contemporary Mathematics 31] Robert Beals, Harry Buhrman, Richard Cleve, Michele Mosca, Ronald de Wolf, Quantum lower bounds by polynomials, Journal of the ACM 48 (2001), p. 778–797 Paul Beame (ed.), 40th Annual Symposium on Foundations of Computer Science, FOCS ’99, 17-18 October, 1999, New York NY, USA, IEEE Computer Society Press, New York NY, 1999 Howard Becker, Alexander S. Kechris, The descriptive set theory of Polish group actions, Cambridge University Press, Cambridge, 1996 [London Mathematical Society Lecture Notes Series 232]
REFERENCES [Ben+97]
[Bla0 78] [Bl¨a99] [Bl¨a00] [Bl¨a01]
[Bl¨a02] [Bla1 ∞]
[BluShuSma89]
[Blu+98]
[Boy+98] [Bra1 +02] [Bra1 HøyTap97] [Bra1 HøyTap98] [Bre0 77] [Bre0 Sch1 78]
[Bre1 95] [Bre1 00a] [Bre1 00b] [Bre1 01] [Bre1 02] [Bre1 03]
241
Charles H. Bennett, Ethan Bernstein, Gilles Brassard, Umesh Vazirani, Strengths and weaknesses of quantum computing, SIAM Journal on Computing 26 (1997), p. 1510–1523 David Blackwell, Borel-programmable functions, Annals of Probability 6 (1978), p. 321–324 Markus Bl¨aser, Lower bounds for the multiplicative complexity of matrix multiplication, Computational Complexity 8 (1999), p. 203–226 Markus Bl¨aser, Lower bounds for the bilinear complexity of associative algebras, Computational Complexity 9 (2000), p. 73–112 Markus Bl¨aser, Algebras of minimal rank over perfect fields, Technical ¨ Informatik und Mathematik — Medizinische Report — Institut fur ¨ Universit¨at zu Lubeck SIIM-TR-A-01-13 2001 Markus Bl¨aser, Algebras of minimal rank over perfect fields, in: [Mar1 02, p. 113–122] Andreas R. Blass, Combinatorial cardinal characteristics of the continuum, preprint available at http://www.math.lsa.umich.edu/∼ablass/hbk.pdf Lenore Blum, Michael Shub, Steven Smale, On a theory of computation and complexity over the real numbers: NP-completeness, recursive functions and universal machines, Bulletin of the American Mathematical Society 21 (1989), p. 1–46 Leonore Blum, Felipe Cucker, Michael Shub, Steve Smale, Complexity and real computation, With a foreword by Richard M. Karp, Springer Verlag, New York NY, 1998 Michel Boyer, Gilles Brassard, Peter Høyer, Alain Tapp, Tight bounds on quantum searching, Fortschritte der Physik 46 (1998), p. 493–505 Gilles Brassard, Peter Høyer, Michele Mosca, Alain Tapp, Quantum amplitude amplification and estimation, in: [LomBra0 02, p. 53–74] Gilles Brassard, Peter Høyer, Alain Tapp, Quantum algorithm for the collision problem, ACM SIGACT News 28 (1997), p. 14–19 Gilles Brassard, Peter Høyer, Alain Tapp, Quantum counting, in: [GulSkyWin1 98, p. 820–831] Dmitri A. Bredikhin, Abstract characteristic of some classes of algebras of relations (Russian), in: [Una77, p. 3–19] Dmitri A. Bredikhin, Boris M. Schein, Representation of ordered semigroups and lattices by binary relations, Colloquium Mathematicum 39 (1978), p. 1–12 J¨org Brendle, Strolling through paradise, Fundamenta Mathematicae 148 (1995), p. 1–25 J¨org Brendle, A new method for iterating σ–centered forcing, Ky¯oto ¯ ¯ ¯ daigaku surikaiseki kenkyusho k¯okyuroku 1143 (2000), p. 1–7 J¨org Brendle, How small can the set of generics be?, in: [BusH´ajPud00, p. 109–126] J¨org Brendle, Cardinal invariants of the continuum — A survey, Ky¯oto ¯ ¯ ¯ daigaku surikaiseki kenkyusho k¯okyuroku 1202 (2001), p. 7–32 J¨org Brendle, Mad families and iteration theory, in: [Zha02, p. 1–31] J¨org Brendle, The almost disjointness number may have countable cofinality, Transactions of the American Mathematical Society 355 (2003), p. 2633–2649
242 [Bre1 ∞a] [Bre1 ∞b] [Bre1 L¨ow99] [B¨uc62] [B¨ucCla85]
[Buh+99]
[BuhCleWig98] [BuhdWo02]
[Buh+01]
[B¨urClaSho0 97]
[BurLoc83]
[BusH´ajPud00]
[CaiHar3 94]
[Cal86]
[Cam02] [CamGao01]
[Car0 Lav89] [Car1 35] [Cha02]
REFERENCES J¨org Brendle, Cardinal invariants of the continuum and combinatorics on uncountable cardinals, in preparation J¨org Brendle, Shattered iterations, in preparation J¨org Brendle, Benedikt L¨owe, Solovay-type characterizations for forcing-algebras, Journal of Symbolic Logic 64 (1999), p. 1307–1323 ¨ J. Richard Buchi, On a decision method in restricted second order arithmetic, in: [NagSupTar62, p. 1–11] ¨ Werner Buchi, Michael Clausen, On a Class of Primary Algebras of Minimal Rank, Linear Algebra and its Applications 69 (1985), p. 249–268 Harry Buhrman, Richard Cleve, Ronald de Wolf, Christof Zalka, Bounds for small-error and zero-error quantum algorithms, in: [Bea1 99, p. 358-368] Harry Buhrman, Richard Cleve, Avi Wigderson, Quantum vs. classical communication and computation, in: [Vit98, p. 63–68] Harry Buhrman, Ronald de Wolf, Complexity measures and decision tree complexity: a survey, Theoretical Computer Science 288 (2002), p. 21–43 ¨ Harry Buhrman, Christoph Durr, Mark Heiligman, Peter Høyer, Fr´ed´eric Magniez, Miklos Santha, Ronald de Wolf, Quantum algorithms for element distinctness, in: [Sud+01, p. 131–137] ¨ Peter Burgisser, Michael Clausen, Amin Shokrollahi, Algebraic Complexity Theory, Springer Verlag, Berlin, 1997 [Grundlehren der Mathematischen Wissenschaften 315] John P. Burgess, Richard A. Lockhart, Classical hierarchies from a modern standpoint III, Fundamenta Mathematicae 115 (1983), p. 107–118 Samuel R. Buss, Petr H´ajek, Pavel Pudl´ak (eds.), Logic Colloquium ’98, Proceedings of the Annual European Summer Meeting of the Association for Symbolic Logic held at the University of Economics, Prague, August 9–15, 1998, Association for Symbolic Logic, Urbana IL, 2000 [Lecture Notes in Logic 13] Jin-Yi Cai, Juris Hartmanis, On Hausdorff and topological dimensions of the Kolmogorov complexity of the real line, Journal of Computer and Systems Sciences 49 (1994), p. 605–619 Jacques Calmet (ed.), Algebraic algorithms and error-correcting codes, Proceedings of the AAECC-3 conference held in Grenoble, July 15–19, 1985, Springer Verlag, Berlin, 1986 [Lecture Notes in Computer Science 229] Riccardo Camerlo, The relation of recursive isomorphism for countable structures, Journal of Symbolic Logic 67 (2002), p. 879–895 Riccardo Camerlo, Su Gao, The completeness of the isomorphism relation for countable Boolean algebras, Transactions of the American Mathematical Society 353 (2001), p. 491–518 Tim Carlson, Richard Laver, Sacks reals and Martin’s axiom, Fundamenta Mathematicae 133 (1989), p. 161–168 Rudolf Carnap, Formalwissenschaft und Realwissenschaft, Erkenntnis 5 (1935), p. 30–37 Bernhard Chazelle (ed.), The 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS’02), November 16 - 19, 2002, Vancouver, BC, Canada, IEEE Computer Society Press, New York NY, 2002
REFERENCES [Che66] [Cho+00]
[ChuNie00] [CiePaw∞]
[Cle00] [Coh63]
[Con+02]
[Coo0 71] [Coo1 Tru99]
[CorLei1 Riv90] [Dai01] [Dai+01] [dGr78]
[dGr83]
[dGr86] [dGrHei1 83]
243
Elliot W. Cheney, Introduction to Approximation Theory, McGrawHill, New York NY, 1966 Peter A. Cholak, Steffen Lempp, Manuel Lerman, Richard A. Shore (eds.), Computability theory and its applications, Current trends and open problems, Proceedings of a 1999 AMS-IMS-SIAM joint summer research conference, Boulder CO, USA, June 13–17, 1999, American Mathematical Society, Providence RI, 2000 [Contemporary Mathematics 257] Isaac L. Chuang, Michael A. Nielsen, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge, 2000 Krzysztof Ciesielski, Janusz Pawlikowski, Covering Property Axiom CPA, A combinatorial core of the iterated perfect set model, Cambridge University Press, Cambridge, to appear [Cambridge Tracts in Mathematics], preprint available at http://jacobi.math.wvu.edu/∼kcies/prepF/ BookCPA/CPAbook.html Richard Cleve, The query complexity of order-finding, in: [Pud+00, p. 54–59] Paul Cohen, The independence of the continuum hypothesis, Proceedings of the National Academy of Sciences of the United States of America 50 (1963), p. 1143–1148 Ricardo Conejo, Stephan Eidenbenz, Matthew Hennessy, Rafael Morales, Francisco T. Ruiz, Peter Widmayer (eds.), Automata, Languages and Programming, 29th International Colloquium, ICALP 2002, Malaga, Spain, July 8-13, 2002, Springer Verlag, Berlin, 2002 [Lecture Notes in Computer Science 2380] Stephen A. Cook, The complexity of theorem proving procedures, in: [Har1 BanUll71, p. 151–158] S. Barry Cooper, John K. Truss (eds.), Sets and proofs, Invited papers from Logic Colloquium ’97 (European Meeting of the Association for Symbolic Logic) held in Leeds, July 6–13, 1997, Cambridge University Press, Cambridge, 1999 [London Mathematical Society Lecture Note Series 258] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Introduction to Algorithms, MIT Press, Cambridge MA, 1990 Jack J. Dai, A stronger Kolmogorov zero-one law for resource-bounded measure, in: [Tit01, p. 204–209] Jack J. Dai, James I. Lathrop, Jack H. Lutz, Elvira Mayordomo, Finite state dimension, in: [OreSpi1 vLe01, p. 1028–1039] Hans F. de Groote, On the varieties of optimal algorithms for the computation of bilinear mappings: Optimal algorithms for 2 × 2-matrix multiplication, Theoretical Computer Science 7 (1978), p. 127–148 Hans F. de Groote, Characterization of division algebras of minimal rank and the structure of their algorithm varieties, SIAM Journal on Computing 12 (1983), p. 101–117 Hans F. de Groote, Lectures on the Complexity of Bilinear Problems, Springer Verlag, Berlin 1986 [Lecture Notes in Computer Science 245] Hans F. de Groote, Joos Heintz, Commutative algebras of minimal rank, Linear Algebra and its Applications 55 (1983), p. 37–68.
244 [DeoHamSch2 ∞]
[Deu85]
[Dev84] [DevJen75] [dWo02] [Dod82] [DouJacKec94]
[DouKec00] [Dow92] [DroKir94]
[DˇzaShe99]
[Ear95]
[EarNor1 93]
[Egg49]
[EkeMos0 98] [EklG¨ob99]
[EngGur02] [Fal85] [Far0 Tod93]
REFERENCES Vinay Deolalikar, Joel D. Hamkins, Ralf-Dieter Schindler, P=NP ∩ coNP for infinite time Turing machines, submitted to: Journal of Logic and Computation David Deutsch, Quantum theory, the Church-Turing principle and the universal quantum computer, Proceedings of the Royal Society London Series A 400 (1985), p. 97–117 Keith J. Devlin, Constructibility, Springer Verlag, Berlin, 1984 [Perspectives in Mathematical Logic] Keith J. Devlin, Ronald B. Jensen, Marginalia to a theorem of Silver, in: [M¨ul0 ObePot75, p. 115–142] Ronald de Wolf, Quantum communication and complexity, Theoretical Computer Science 287 (2002), p. 337–353 Anthony J. Dodd, The Core Model, Cambridge University Press, Cambridge, 1982 [London Mathematical Society Lecture Notes Series 61] Randall Dougherty, Steve Jackson, Alexander S. Kechris, The structure of hyperfinite Borel equivalence relations, Transactions of the American Mathematical Society 341 (1994), p. 193–225 Randall Dougherty, Alexander S. Kechris, How many Turing degrees are there?, in: [Cho+00, p. 83–94] Rod Downey, An invitation to structural complexity, New Zealand Journal of Mathematics 21 (1992), p. 33-89 Yurij A. Drozd, Vladimir V. Kirichenko, Finite dimensional algebras, With an appendix by Vlastimil Dlab, Translated from the Russian by Vlastimil Dlab, Springer Verlag, Berlin, 1994 Mirna Dˇzamonja, Saharon Shelah, Similar but not the same: various versions of ♣ do not coincide, Journal of Symbolic Logic 64 (1999), p. 180–198 [Sh574] John Earman, Bangs, Crunches, Whimpers and Shrieks: Singularities and Acausalities in Relativistic Spacetimes, Oxford University Press, New York NY, 1995 John Earman, John D. Norton, Forever is a day: supertasks in Pitowski and Malament-Hogarth spacetimes, Philosophy of Science 60 (1993), p. 22-42 Harold Gordon Eggleston, The fractional dimension of a set defined by decimal properties, Quarterly Journal of Mathematics 20 (1949), p. 31–36 Artur Ekert, Michele Mosca, The hidden subgroup problem and eigenvalue estimation on a quantum computer, in: [Wil0 99, p. 174–188] Paul C. Eklof, R¨udiger G¨obel (eds.), Abelian groups and modules, Proceedings of the International Conference held at the Dublin Institute of Technology, Dublin, August 10–14, 1998, Birkh¨auser Verlag, Basel, 1999 [Trends in Mathematics] Lars Engebretsen, Venkatesan Guruswami, Is constraint satisfaction over two variables always easy?, in: [RolVad02, p. 224–238] Kenneth J. Falconer, The Geometry of Fractal Sets, Cambridge University Press, Cambridge, 1985 [Cambridge Tracts in Mathematics 85] Ilijas Farah, Stevo Todorˇcevi´c, Some applications of the method of forcing, Yenisei, Moscow, 1993 [Yenisei Series in Pure and Applied Mathematics]
REFERENCES [Far1 +98]
[Far1 +99]
[Fed91] [Fei0 Goe95]
[Fei0 Sch0 01] [Fei1 +98]
[Fel0 Moo0 77]
[Fel1 86] [Fen0 83] [Fen2 Nor0 74] [FidZal0 77]
[For1 MagSch2 01]
[For1 MagShe88a]
[For1 MagShe88b]
[For2 Rog1 99]
[Fre84] [Fri0 74] [Fri0 Sta1 89]
245
Edward Farhi, Jeffrey Goldstone, Sam Gutmann, Michael Sipser, A limit on the speed of quantum computation in determining parity, Physical Review Letters 81 (1998), p. 5442–5444 Edward Farhi, Jeffrey Goldstone, Sam Gutmann, Michael Sipser, Invariant Quantum Algorithms for Insertion into an Ordered List, Technical Report MIT-CTP-2815, online available at http://arXiv.org/abs/quant-ph/9901059 Meir Feder, Gambling using a finite state machine, IEEE Transactions on Information Theory 37 (1991), p. 1459–1461 Uriel Feige, Michel X. Goemans, Approximating the value of two prover proof systems, with applications to MAX 2SAT and MAX DICUT, in: [Bar1 +95, p. 182–189] Uriel Feige, Gideon Schechtman, On the integrality ratio of semidefinite relaxations of MAX CUT, in: [VitSpi1 Yan01, p. 433–442] Joan Feigenbaum, Richard Beigel, Stephen Fenner, Judy Goldsmith, Martin Kummer, Elvira Mayordomo, Moni Naor, Steven Rudich, Dan Spielman, Alan L. Selman (eds.), Thirteenth Annual IEEE Conference on Computational Complexity, June 15–18, 1998, Buffalo, New York, IEEE Computer Society Press, New York NY, 1998 Jacob Feldman, Calvin C. Moore, Ergodic equivalence relations, cohomology, and von Neumann algebras, I, Transactions of the American Mathematical Society 234 (1977), p. 289–324 Annemarie Fellmann, Optimal algorithms for finite dimensional simply generated algebras, in: [Cal86, p. 288–295] Qi Feng, Homogeneity for open partitions of pairs of reals, Transactions of the American Mathematical Society 339 (1983), p. 659–684 Jens Eric Fenstad, Dag Normann, On absolutely measurable sets, Fundamenta Mathematicae 81 (1974), p. 91–98 Charles M. Fiduccia, Yechezkel Zalcstein, Algebras having linear multiplicative complexities, Journal of the Association for Computing Machinery 24 (1977) p. 311–331 Matthew Foreman, Menachem Magidor, Ralf Schindler, The consistency strength of successive cardinals with the tree property, Journal of Symbolic Logic 66 (2001), p. 1837–1847 Matthew Foreman, Menachem Magidor, Saharon Shelah, Martin’s maximum, saturated ideals and non-regular ultrafilters I, Annals of Mathematics 127 (1988), p. 1–47 Matthew Foreman, Menachem Magidor, Saharon Shelah, Martin’s maximum, saturated ideals and non-regular ultrafilters II, Annals of Mathematics 127 (1988), p. 521–545 Lance Fortnow, John Rogers, Complexity limitations on quantum computing, Journal of Computer and System Sciences 59 (1999), p. 240– 252 David H. Fremlin, Consequences of Martin’s axiom, Cambridge University Press, Cambridge, 1984 [Cambridge Tracts in Mathematics 84] Harvey Friedman, Minimality in the ∆12 -degrees, Fundamenta Mathematicae 81 (1974), p. 183–192 Harvey Friedman, Lee Stanley, A Borel reducibility theory for classes of countable structures, Journal of Symbolic Logic 54 (1989), p. 894– 914
246 [Fri1 Koe97] [Fri2 Jer97]
[FucSheSou97] [GaoKec03]
[GarJoh79]
[GesGol1 Koj∞]
[Ges+02]
[G¨od39]
[GoeWil1 94]
[GoeWil1 95]
[GoeWil1 01]
[Gol1 93] [Gol1 +95]
[Gol1 She95] [Gol3 vLo83] [Gro0 99] [Gro1 81] [Gro1 94] [Gro2 96]
REFERENCES Sy D. Friedman, Peter Koepke, An elementary approach to the fine structure of L, Bulletin of Symbolic Logic 3 (1997), p. 453–468 Alan Frieze, Mark Jerrum, Improved approximation algorithms for MAX k-CUT and MAX BISECTION, Algorithmica 18 (1997), p. 67– 81 Sakae Fuchino, Saharon Shelah, Lajos Soukup, Sticks and clubs, Annals of Pure and Applied Logic 90 (1997), p. 57–77 [Sh544] Su Gao, Alexander S. Kechris, On the classification of Polish metric spaces up to isometry, American Mathematical Society, Providence RI, 2003 [Memoirs of the American Mathematical Society 161] Michael R. Garey, David S. Johnson, Computers and Intractability: a guide to the theory of NP-completeness, W. H. Freeman and Company, San Francisco CA, 1979 Stefan Geschke, Martin Goldstern, Menachem Kojman, Continuous Ramsey theory on Polish spaces and covering the plane by functions, submitted; preprint available at http://arxiv.org/abs/math.LO/0205331 Stefan Geschke, Menachem Kojman, Wiesław Kubi´s, Rene Schipperus, Convex decompositions in the plane and continuous pair colorings of the irrationals, Israel Journal of Mathematics 131 (2002), p. 285– 317 Kurt G¨odel, Consistency-Proof for the Generalized Continuum Hypothesis, Proceedings of the National Academy of Sciences of the United States of America 25 (1939), p. 220–224 Michel X. Goemans, David P. Williamson, New 43 -approximation algorithms for the maximum satisfiability problem, SIAM Journal on Discrete Mathematics 7 (1994), p. 656–666 Michel X. Goemans, David P. Williamson, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, Journal of the ACM 42 (1995), p. 1115–1145 Michel X. Goemans, David P. Williamson, Approximation algorithms for MAX-3-CUT and other problems via complex semidefinite programming, in: [VitSpi1 Yan01, p. 443–452] Martin Goldstern, Tools for your forcing construction, in: [Jud93, p. 305–360] Martin Goldstern, Miroslav Repick´y, Saharon Shelah, Otmar Spinas, On tree ideals, Proceedings of the American Mathematical Society 123 (1995), p. 1573–1581 Martin Goldstern, Saharon Shelah, The bounded proper forcing axiom, Journal of Symbolic Logic 60 (1995), p. 58–73 Gene H. Golub, Charles F. van Loan, Matrix Computations, North Oxford Academic Publishing, Oxford, 1983 Misha Gromov, Metric structures for Riemannian and non-Riemannian spaces,Birkh¨auser, Boston MA, 1999 [Progress in Mathematics 152] Marcia Groszek, Iterated perfect set forcing and constructible degrees, Ph.D. thesis, Harvard University 1981 Marcia Groszek, ω1∗ as an initial segment of the c-degrees, Journal of Symbolic Logic 59 (1994), p. 956–976 Lov K. Grover, A fast quantum mechanical algorithm for database search, in: [Mil2 96, p. 212–219]
REFERENCES [Gro2 97] [Gro2 98] [Gru1 99] [GulSkyWin1 98]
[Hai91] [HamLew00] [HamLew02]
[HamSea01]
[HamWel0 03] [Har0 78] [Har0 KecLou90]
[Har0 She85]
[Har1 BanUll71]
[Har2 vSt02]
[H˚as01] [Hau14] [Hau19] [Hei1 Mor1 86] [HemSel97] [Hen1 +02]
247
Lov K. Grover, Quantum mechanics helps in searching for a needle in a haystack, Physical Review Letters 79 (1997), p. 325–328 Lov K. Grover, A framework for fast quantum mechanical algorithms, in: [Vit98, p. 53–62] Jozef Gruska, Quantum Computing, McGraw-Hill, New York NY, 1999 Kim Guldstrand Larsen, Sven Skyum, Glynn Winskel (eds.) Automata, Languages and Programming, 25th International Colloquium, ICALP’98, Aalborg, Denmark, July 13-17, 1998, Springer Verlag, Berlin, 1998 [Lecture Notes in Computer Science 1443] Mark D. Haiman, Arguesian lattices which are not type I, Algebra Universalis 28 (1991), p. 128–137 Joel David Hamkins, Andrew Lewis, Infinite time Turing machines, Journal of Symbolic Logic 65 (2000), p. 567–604 Joel David Hamkins, Andrew Lewis, Post’s problem for supertasks has both positive and negative solutions, Archive for Mathematical Logic 41 (2002), p. 507–523 Joel David Hamkins, Daniel Seabold, Infinite time Turing machines with only one tape, Mathematical Logic Quarterly 47 (2001), p. 271– 287 Joel D. Hamkins, Philip D. Welch, Pf = NPf for almost all f , Mathematical Logic Quarterly 49 (2003), p. 536–540 Leo A. Harrington, Analytic determinacy and 0# , Journal for Symbolic Logic 43 (1978), p. 684–693 Leo A. Harrington, Alexander S. Kechris, Alain Louveau, A GlimmEffros dichotomy for Borel equivalence relations, Journal of the American Mathematical Society 3 (1990), p. 903–928 Leo A. Harrington, Saharon Shelah, Some exact equiconsistency results in set theory, Notre Dame Journal of Formal Logic 26 (1985), p. 178–188 Michael A. Harrison, Ranan B. Banerji, Jeffrey D. Ullman (eds.), Proceedings of the Third Annual ACM Symposium on Theory of Computing, Shaker Heights, Ohio, 3–5 May 1971, ACM Press, New York NY, 1971 Klaas Pieter Hart, Bert Jan van der Steeg, A small transitive family of continuous functions on the Cantor set, Topology and its Applications, 123 (2002), p. 409–420 Johan H˚astad, Some optimal inapproximability results, Journal of the ACM 48 (2001), p. 798–859 Felix Hausdorff, Grundz¨uge der Mengenlehre, Verlag Veit & Co, Leipzig, 1914 Felix Hausdorff, Dimension und a¨ usseres Mass, Mathematische Annalen 79 (1919), p. 157–179 Joos Heintz, Jacques Morgenstern, On associative algebras of minimal rank, in: [Pol86, p. 1–24] Lane A. Hemaspaandra, Alan L. Selman (eds.), Complexity Theory Retrospective II, Springer Verlag, New York NY, 1997 C. Ward Henson, Jos´e Iovino, Alexander S. Kechris, Edward Odell, Analysis and Logic, Cambridge University Press, Cambridge, 2002 [London Mathematical Society Lecture Note Series 262]
248 [Hin78] [HirHod1 97a] [HirHod1 97b] [Hit01]
[Hit02] [Hjo99] [Hjo00]
[Hod0 93] [Hod1 Mik00] [Hog0 92]
[Hog0 94] [Hog1 02] [Hog1 +02]
[HøydWo02] [HøyNeeShi01] [HrbSim2 80] [Huf59]
[HulFor0 Bur1 94]
[JacKecLou02]
[Jec78]
REFERENCES Peter G. Hinman, Recursion Theoretic Hierarchies, Springer Verlag, Berlin, 1978 [Perspectives in Mathematical Logic Series] Robin Hirsch, Ian Hodkinson, Step by step – building representations in algebraic logic, Journal of Symbolic Logic 62 (1997), p. 225–279 Robin Hirsch, Ian Hodkinson, Complete representations in algebraic logic, Journal of Symbolic Logic 62 (1997), p. 816–847 John M. Hitchcock, MAX 3SAT is exponentially hard to approximate if NP has positive dimension, Theoretical Computer Science 289 (2002), p. 861-869 John M. Hitchcock, Correspondence principles for effective dimensions, in: [Con+02, p. 561–571] Greg Hjorth, Around nonclassifiability for countable torsion free abelian groups, in: [EklG¨ob99, p. 269–292] Greg Hjorth, Classification and orbit equivalence relations, American Mathematical Society, Providence RI, 2000 [Mathematical Surveys and Monographs 75] Wilfrid Hodges, Model Theory, Cambridge University Press, Cambridge, 1993 Ian Hodkinson, Szabolcs Mikul´as, Axiomatizability of reducts of algebras of relations, Algebra Universalis 43 (2000), p. 127–156 Mark L. Hogarth, Does general relativity allow an observer to view an eternity in a finite time?, Foundations of Physics Letters 5 (1992), p. 173-181 Mark L. Hogarth, Non-Turing computers and non-Turing computability, in: [HulFor0 Bur1 94, p. 126-138] Wolfram Hogrebe, Vorwort, in: [Hog1 +02] Wolfram Hogrebe, Martin Booms, Joachim Bromand, Jochen Faseler, ¨ Niels May, Kati Muller, Martin Rohrmeier, Tilmann Wegerhoff, Hanna Zimmermann (eds.), Grenzen und Grenz¨uberschreitungen, XIX. Deutscher Kongreß f¨ur Philosophie, 23.–27.September 2002 in Bonn, Sektionsbeitr¨age, Sinclair Press, Bonn, 2002 Peter Høyer, Ronald de Wolf, Improved quantum communication complexity bounds for disjointness and equality, in: [AltFer02, p. 299–310] Peter Høyer, Jan Neerbek, Yaoyun Shi, Quantum bounds for ordered searching and sorting, in: [OreSpi1 vLe01, p. 346-357] Karel Hrbacek, Stephen Simpson, On Kleene degrees of Analytic Sets, in: [Bar3 KeiKun80, p. 347–352] David A. Huffman, Canonical forms for information-lossless finitestate logical machines, Transactions on Circuit Theory CT-6 (Special Supplement) (1959), p. 41–59; also available in [Moo1 64, p. 866-871] David Hull, Mickey Forbes, Richard B. Burian (eds.), PSA 1994, Volume One, Proceedings of the 1994 Biennial Meeting of the Philosophy of Science Association, Michigan State University, East Lansing MI, 1994 Steve Jackson, Alexander S. Kechris, Alain Louveau, Countable Borel equivalence relations, Journal of Mathematical Logic 2 (2002), p. 1– 80 Thomas Jech, Set Theory, Academic Press, New York NY, 1978 [Pure and Applied Mathematics]
REFERENCES [Jec86] [Jec02]
[Jen70] [Jen72] [Joh74] [Joh90] [Jud93]
[JudMil0 She92]
[Kah02]
[KalSch5 92]
[Kan0 94] [Kan1 99] [Kan1 Zap98]
[Kar0 97]
[Kar1 98]
[Kar1 99] [Kar1 Zwi97] [Kec75]
[Kec77]
249
Thomas Jech, Multiple Forcing, Cambridge University Press, Cambridge, 1986 [Cambridge Tracts in Mathematics 88] Thomas Jech, Set Theory, The Third Millenium Edition, Springer Verlag 2002 [Springer Monographs in Mathematics]; third edition of [Jec78] Ronald B. Jensen, Definable sets of minimal degree, in: [Bar0 70, p. 122–128] Ronald Jensen, The fine structure of the constructible hierarchy, Annals of Mathematical Logic 4 (1972), p. 229–308 David S. Johnson, Approximation algorithms for combinatorial problems, Journal of Computer and System Sciences 9 (1974), p. 256–278 David S. Johnson, A catalog of complexity classes, in: [vLe+90, p. 67– 181] Haim Judah (ed.), Set theory of the reals, Proceedings of the Winter Institute held at Bar-Ilan University, Ramat Gan, January 1991, Bar-Ilan University, American Mathematical Society, Providence RI, 1993 [Israel Mathematical Conference Proceedings 6] Haim Judah, Arnold W. Miller, Saharon Shelah, Sacks forcing, Laver forcing, and Martin’s axiom, Archive for Mathematical Logic 31 (1992), p. 145–161 Reinhard Kahle, Mathematical Proof Theory in the Light of Ordinal Analysis, Synthese 133 (2002), p. 237-255; i.e., in: [L¨owRud1 02, p. 237–255] Bala Kalyanasundaram, Georg Schnitger, The probabilistic communication complexity of set intersection, SIAM Journal on Discrete Mathematics 5 (1992), p. 545–557 Akihiro Kanamori, The Higher Infinite, Springer Verlag, Berlin, 1994 [Perspectives in Mathematical Logic] Vladimir Kanovei, On non-well-founded iterations of the perfect set forcing, Journal of Symbolic Logic 64 (1999), p. 551–574 Vladimir Kanovei, Jindˇrich Zapletal, Pyramidal structure of constructibility degrees, Mathematical Notes 63 (1998), p. 556–559; translation from: Rossi˘ıskaya Akademiya Nauk—Matematicheskie Zametki 63 (1998), p. 631–635 Anna R. Karlin (ed.), 38th Annual Symposium on Foundations of Computer Science, Miami Beach, Florida, USA, 20–22 October 1997, IEEE Computer Society Press, New York NY, 1997 Howard Karloff (ed.), Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, 25–27 January, 1998, ACM Press, New York NY, 1998 Howard Karloff, How Good Is the Goemans-Williamson MAX CUT Algorithm?, SIAM Journal on Computing 29 (1999), p. 336–350 Howard Karloff, Uri Zwick, A 7/8-approximation algorithm for MAX 3SAT?, in: [Kar0 97, p. 406–415] Alexander S. Kechris, The Theory of Countable Analytic Sets, Transactions of the American Mathematical Society 202 (1975), p. 259– 297 Alexander S. Kechris, On a notion of smallness for subsets of the Baire space, Transactions of the American Mathematical Society 229 (1977), p. 191–207
250 [Kec02] [KecMar2 Ste0 88]
[KecMos1 78]
[Kle59]
[Koe∞] [Koh78]
[Kos+92]
[Kun80] [KunVau84] [Kur74]
[KusNis97] [Lam58] [Lan93] [Lav76] [Lev73] [L´ev65]
[LiVit97]
[LomBra0 02]
REFERENCES Alexander S. Kechris, Actions of Polish groups and classification problems, in: [Hen1 +02] Alexander S. Kechris, Donald A. Martin, John R. Steel (eds.), Cabal Seminar 81–85, Proceedings of the Caltech-UCLA Logic Seminar held during 1981–85, Springer Verlag, Berlin, 1988 [Lecture Notes in Mathematics 1333] Alexander S. Kechris, Yiannis N. Moschovakis (eds.), Cabal Seminar 76–77, Proceedings of the Caltech-UCLA Logic Seminar 1976–77, Springer Verlag, Berlin, 1978 [Lecture Notes in Mathematics 689] Stephen C. Kleene, Recursive functionals and quantifiers of finite types I, Transactions of the American Mathematical Society 61 (1959), p. 193–213 Peter Koepke, A New Finestructural Hierarchy for the Constructible Universe, in preparation Zvi Kohavi, Switching and Finite Automata Theory, McGraw-Hill Book Company, D¨usseldorf, 2 1978 [McGraw-Hill Computer Science Series] Rao Kosaraju, Mike Fellows, Avi Wigderson, John Ellis (eds.), Proceedings of the twenty-fourth Annual ACM Symposium on Theory of Computing, Victoria, British Columbia, Canada, 1992, ACM Press, New York NY, 1992 Kenneth Kunen, Set theory, An introduction to independence proofs, North-Holland, Amsterdam, 1980 [Studies in Logic] Kenneth Kunen, Jerry E. Vaughan (eds.), Handbook of Set Theoretical Topology, North-Holland, Amsterdam, 1984 Avgust A. Kurmit, Information-Lossless Automata of Finite Order, John Wiley & Sons, New York NY 1974; translated from the Russian original “Avtomaty bez poteri informacii koneqnogo pordka” by D. Louvish in the Israel Program for Scientific Translations Eyal Kushilevitz, Noam Nisan, Communication Complexity, Cambridge University Press, Cambridge, 1997 Joachim Lambek, The mathematics of sentence structure, American Mathematical Monthly 65 (1958), p. 154–170 Serge Lang, Algebra, Addison-Wesley, Reading MA, 3 1993 Richard Laver, On the consistency of Borel’s conjecture, Acta Mathematica 137 (1976), p. 151–169 Leonid A. Levin, Universalnye problemy perebora, Problemy peredaqi informacii 9 (1973), p. 115–116 Azriel L´evy, A hierarchy of formulas in set theory, American Mathematical Society, Providence RI, 1965 [Memoirs of the American Mathematical Society 57] Ming Li, Paul M.B. Vit´anyi, An Introduction to Kolmogorov Complexity and its Applications, Springer Verlag, New York NY, 2 1997 [Graduate Texts in Computer Science] Samuel J. Lomonaco, Jr., Howard E. Brandt (eds.), Quantum computation and information, Papers from the AMS Special Session held in Washington, DC, January 19–21, 2000, American Mathematical Society, Providence RI, 2002 [Contemporary Mathematics 305]
REFERENCES [LouSai88] [L¨ow01]
[L¨owMalR¨as03]
[L¨owRud1 02]
[Lue73] [Lut92] [Lut97] [Lut98] [Lut00a] [Lut00b] [Lut02]
[LutMay01] [Mad82]
[Mar0 35]
[Mar1 02]
[Mar2 75] [Mar3 66] [Mat77] [Mat83]
[Mau81] [May02]
251
Alain Louveau, Jean Saint-Raymond, The strength of Borel Wadge determinacy, in: [KecMar2 Ste0 88, p. 1–30] Benedikt L¨owe, Revision sequences and computers with an infinite amount of time, Journal of Logic and Computation 11 (2001), p. 25– 40; also in: [Wan01, p. 37–59] Benedikt L¨owe, Wolfgang Malzkorn, Thoralf R¨asch (eds.), Foundations of the Formal Sciences II, Applications of Mathematical Logic to Philosophy and Linguistics, Papers of a Conference held in Bonn, November 10-13, 2000, Kluwer Academic Publishers, Dordrecht, 2003 [Trends in Logic 17] Benedikt L¨owe, Florian Rudolph (eds.), Foundations of the Formal Sciences I, Humboldt-Universit¨at zu Berlin, May 7-9, 1999, special issue of Synthese 133 (2002); Number 1-2, October/November 2002 David G. Luenberger, Linear and Nonlinear Programming, Addison Wesley, Reading MA, 2 1973 Jack H. Lutz, Almost everywhere high nonuniform complexity, Journal of Computer and System Sciences 44 (1992), p. 220–258 Jack H. Lutz, The quantitative structure of exponential time, in: [HemSel97, p. 225–254] Jack H. Lutz, Resource-bounded measure, in: [Fei1 +98, p. 236–248] Jack H. Lutz, Dimension in complexity classes, in: [Tit00, p. 158–169] Jack H. Lutz, Gales and the constructive dimension of individual sequences, in: [Mon1 RolWel1 00, p. 902–913] Jack H. Lutz, The dimensions of individual strings and sequences, preprint 2002; available at http://arxiv.org/abs/cs.CC/0203017 Jack H. Lutz, Elvira Mayordomo, Twelve problems in resourcebounded measure, in: [P˘auRozSal01, p. 83–101] Roger D. Maddux, Some varieties containing relation algebras, Transactions of the American Mathematical Society 272 (1982), p. 501– 526 Edward Marczewski (Szpilrajn), Sur une classe de fonctions de M. Sierpi´nski et la classe correspondante d’ensembles, Fundamenta Mathematicae 24 (1935), p. 17–34 Danielle Martin (ed.), Proceedings of the 17th Annual IEEE Conference on Computational Complexity, May 21-24, 2002, Montr´eal, Qu´ebec, Canada, IEEE Computer Society Press, 2002 Donald A. Martin, Borel determinacy, Annals of Mathematics 102 (1975), p. 363–371 Per Martin-L¨of, The definition of random sequences, Information and Control 9 (1966), p. 602–619 Adrian R.D. Mathias, Happy families, Annals of Mathematical Logic 12 (1977), p. 59–111 Adrian R.D. Mathias (ed.), Surveys in set theory, Cambridge University Press, Cambridge, 1983 [London Mathematical Society Lecture Note Series 87] Daniel Mauldin, The Scottish Book, Mathematics of the Scottish Caf´e, Birkh¨auser, Basel, 1981 Elvira Mayordomo, A Kolmogorov complexity characterization of constructive Hausdorff dimension, Information Processing Letters 84 (2002), p. 1–3
252 [Mek81] [Mik02] [Mil0 83] [Mil0 84] [Mil0 93] [Mil0 00]
[Mil1 77]
[Mil2 96]
[Mon0 64] [Mon1 RolWel1 00]
[Moo1 64]
[Mos1 80]
[Mot98]
[M¨ul0 ObePot75]
[NagSupTar62]
[NayWu99] [NewRos93]
[NisSze92] [OreSpi1 vLe01]
REFERENCES Alan H. Mekler, Stability of nilpotent groups of class 2 and prime exponent, Journal of Symbolic Logic 46 (1981), p. 781–788 Szabolcs Mikul´as, Complicated conjugates, manuscript 2002 Arnold Miller, Mapping a set of reals onto the reals, Journal of Symbolic Logic 48 (1983), p. 575–584 Arnold Miller, Rational perfect set forcing, in: [BauMar2 She84, p. 143– 159] Arnold Miller, Arnie Miller’s problem list, in: [Jud93, p. 645–654] Arnold Miller, Some interesting problems, version August 2000 (update of [Mil0 93]), available at http://www.math.wisc.edu/∼miller/res/ Douglas E. Miller, On the measurability of orbits in Borel actions, Proceedings of the American Mathematical Society 63 (1977), p. 165– 170 Gary L. Miller (ed.), Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing, Philadelphia, Pennsylvania, USA, May 22-24, 1996, ACM Press, New York NY, 1996 J. Donald Monk, On representable relation algebras, Michigan Mathematical Journal 11 (1964), p. 207–210 Ugo Montanari, Jos´e D.P. Rolim, Emo Welzl (eds.), Automata, Languages and Programming, 27th International Colloquium, ICALP 2000, Geneva, Switzerland, July 9-15, 2000, Springer Verlag, Berlin, 2000 [Lecture Notes in Computer Science 1853] Edward F. Moore (ed.), Sequential Machine: Selected Papers, AddisonWesley, Reading, 1964 [Addison-Wesley Series in Computer Science and Information Processing] Yiannis N. Moschovakis, Descriptive Set Theory, North-Holland, Amsterdam, 1980 [Studies in Logic and the Foundations of Mathematics 100] Rajeev Motwani (ed.), 39th Annual Symposium on Foundations of Computer Science, FOCS ’98, November 8-11, 1998, Palo Alto, California, USA, IEEE Computer Society Press, New York NY, 1998 ¨ Gert H. Muller, Arnold Oberschelp, Klaus Potthoff (eds.), Proceedings of the International Summer Institute and Logic Colloquium (ISILC) 1974, Kiel 1974, Springer Verlag, Berlin, 1975 [Lecture Notes in Mathematics 499] Ernest Nagel, Patrick Suppes, Alfred Tarski (eds.), Logic, Methodology and Philosophy of Science, Proceedings of the 1960 International Congress, Stanford University Press, Stanford CA, 1962 Ashwin Nayak, Felix Wu, The quantum query complexity of approximating the median and related statistics, in: [VitLarLei0 99, p. 384–393] Ludomir Newelski, Andrzej Rosłanowski, The ideal determined by the unsymmetric game, Proceedings of the American Mathematical Society 117 (1993), p. 823–831 Noam Nisan, Mario Szegedy, On the degree of Boolean functions as real polynomials, Computational Complexity 4 (1994), p. 301–313 Fernando Orejas, Paul G. Spirakis, Jan van Leeuwen (eds.), Automata, Languages and Programming, 28th International Colloquium, ICALP 2001, Crete, Greece, July 8-12, 2001, Springer Verlag, Berlin, 2001 [Lecture Notes in Computer Science 2076]
REFERENCES [Orn70] [Pan66] [Pap94] [Pat1 92] [P˘auRozSal01]
[Pit1 90] [Poh96] [Pol86]
[Pud+00]
[Qui02] [Rat99] [Raz1 92] [Raz1 03] [Rei02]
[Rog0 67] [RolVad02]
[RosShe99]
[Rya84] [Rya86]
253
Donald S. Ornstein, Bernoulli shifts with the same entropy are isomorphic, Advances in Mathematics 4 (1970), p. 337–352 Victor Y. Pan, Methods for computing values of polynomials, Russian Mathematical Surveys 21 (1966), p. 105–136 Christos H. Papadimitriou, Computational Complexity, AddisonWesley, Reading MA, 1994 Ramamohan Paturi, On the degree of polynomials that approximate symmetric Boolean functions, in: [Kos+92, p. 468–474] Gheorghe P˘aun, Grzegorz Rozenberg, Arto Salomaa (eds.), Current trends in theoretical computer science, Entering the 21st century, Based on columns and tutorials published in the Bulletin of the European Association for Theoretical Computer Science (EATCS), 1992–2000, World Scientific Publishing, Singapore, 2001 Itamar Pitowsky, The physical Church Thesis and Physical computational complexity, Iyyun 39 (1990), p. 81-99 Wolfram Pohlers, Pure proof theory, aims, methods and results, Bulletin of Symbolic Logic 2 (1996), p. 159–188 Alain Poli (ed.), Applied algebra, algorithmics and error-correcting codes, Proceedings of the second international conference held in Toulouse, October 1–5, 1984, Springer Verlag, Berlin, 1986 [Lecture Notes in Computer Science 228] Pavel Pudl´ak, Ricard Gavald´a, Russell Impagliazzo, Ran Raz, Alexander Razborov, Uwe Sch¨oning, Denis Th´erien, Luca Trevisan, Ramarathnam Venkatesan (eds.), 15th Annual IEEE Conference on Computational Complexity (CoCo’00), July 04–07, 2000, Florence, Italy, IEEE Computer Society Press, New York NY, 2000 Sandra Quickert, CH and the Sacks property, Fundamenta Mathematicae 171 (2002), p. 93–100 Michael Rathjen, The Realm of Ordinal Analysis, in: [Coo1 Tru99, p. 219–279] Alexander A. Razborov, On the distributional complexity of disjointness, Theoretical Computer Science 106 (1992), p. 385–390 Alexander A. Razborov, Quantum communication complexity of symmetric predicates, Izvestiya — Mathematics 67 (2003), p. 145–159 John Reif (ed.), Proceedings on 34th Annual ACM Symposium on Theory of Computing, May 19-21, 2002, Montr´eal, Qu´ebec, Canada, ACM Press, New York NY, 2002 Hartley Rogers, Jr., Theory of Recursive Functions and Effective Computability, McGraw-Hill, New York NY, 1967 Jose D. P. Rolim, Salil Vadhan, Randomization and Approximation Techniques: 6th International Workshop, RANDOM 2002, Cambridge, MA, USA, September 13-15, 2002, Springer, Heidelberg, 2002 [Lecture Notes in Computer Science 2483] Andrzej Rosłanowski, Saharon Shelah, Norms on possibilities I: forcing with trees and creatures, American Mathematical Society, Providence RI, 1999 [Memoirs of the American Mathematical Society 141] Boris Ya. Ryabko, Coding of combinatorial sources and Hausdorff dimension, Soviets Mathematics Doklady 30 (1984), p. 219–222 Boris Ya. Ryabko, Noiseless coding of combinatorial sources, Problems of Information Transmission 22 (1986), p. 170–179
254 [Rya93] [Rya94] [Sac71] [Sac80] [Sac90] [Sch1 91] [Sch2 00] [Sch2 00a] [Sch2 01] [Sch2 ∞] [Sch6 71] [Sch6 Sti72] [Sch7 81] [Sco71]
[She84] [She94] [She98] [She00] [She01] [She∞a]
[She∞b]
[Shi02] [Shm0 00]
REFERENCES Boris Ya. Ryabko, Algorithmic approach to the prediction problem, Problems of Information Transmission 29 (1993), p. 186–193 Boris Ya. Ryabko, The complexity and effectiveness of prediction problems, Journal of Complexity 10 (1994), p. 281–295 Gerald Sacks, Forcing with perfect closed sets, in: [Sco71, p. 331–355] Gerald E. Sacks, Post’s problem, Absoluteness and Recursion in Finite types, in: [Bar3 KeiKun80, p. 201–222] Gerald E. Sacks, Higher Recursion Theory, Springer Verlag, Berlin, 1990 [Perspectives in Mathematical Logic] Boris M. Schein, Representations of subreducts of Tarski relation algebras, in: [And1 Mon0 N´em91, p. 621–635] Ralf Schindler, Coding into K by reasonable forcing, Transactions of the American Mathematical Society 353 (2000), p. 479–489 Ralf Schindler, Proper forcing and remarkable cardinals, Bulletin of Symbolic Logic 6 (2000), p. 176–184 Ralf Schindler, Proper forcing and remarkable cardinals II, Journal of the Symbolic Logic 66 (2001), p. 1481–1492 Ralf-Dieter Schindler, P = NP for infinite time Turing machines, to ¨ Mathematik appear in: Monatshefte fur Claus P. Schnorr, A unified approach to the definition of random sequences, Mathematical Systems Theory 5 (1971), p. 246–258 Claus P. Schnorr, H. Stimm, Endliche Automaten und Zufallsfolgen, Acta Informatica 1 (1972), p. 345–359 Arnold Sch¨onhage, Partial and total matrix multiplication, SIAM Journal on Computing 10 (1981), p. 434–455 Dana G. Scott (ed.), Axiomatic set theory, Proceedings of the Symposium in Pure Mathematics of the American Mathematical Society held at the University of California, Los Angeles, Calif., July 10-August 5, 1967, American Mathematical Society, Providence RI, 1971 [Proceedings of Symposia in Pure Mathematics 13(I)] Saharon Shelah, Can you take Solovay’s inaccessible away?, Israel Journal of Mathematics 48 (1984), p. 1–47 Saharon Shelah, How special are random and Cohen forcings, Israel Journal of Mathematics 88 (1994), p. 1022–1031 Saharon Shelah, Proper and improper forcing, Springer Verlag, Berlin, 1998 [Perspectives in Mathematical Logic] Saharon Shelah, On what I do not understand (and have something to say): part I, Fundamenta Mathematicae 166 (2000), p. 1–82 [Sh666] Saharon Shelah, Consistently there is no non-trivial ccc forcing notion with the Sacks or Laver property, Combinatorica 21 (2001), p. 309–319 Saharon Shelah, Historic iteration with ℵε –support, [Sh538], preprint available at http://arXiv.org/abs/math/9607227 Saharon Shelah, Are a and d your cup of tea?, [Sh700], preprint available at http://arXiv.org/abs/math/0012170 Yaoyun Shi, Quantum lower bounds for the collision and the element distinctness problems, in: [Cha02, p. 513–519] David Shmoys (ed.), Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, California, 9–11 January, 2000, ACM Press, New York NY, 2000
REFERENCES [Sho1 97]
[Sil7?] [Sim0 97] [Sim1 93] [Sim2 99] [Soa87]
[Sol70] [Sol71] [Sor97]
[Sta0 93] [Sta0 98]
[Sta2 99] [Ste0 80] [Ste0 82a] [Ste0 82b] [Ste0 Wel0 94] [Ste2 99]
[Str69] [Str73] [Str90] [Sud+01]
255
Peter W. Shor, Polynomial time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM Journal on Computing 26 (1997), p. 1484–1509 Jack H. Silver, How to Eliminate the Fine Structure from the Work of Jensen, handwritten manuscript 197? Daniel R. Simon, On the power of quantum computation, SIAM Journal on Computing 26 (1997), p. 1474–1483 Petr Simon, Sacks forcing collapses c to b, Commentationes Mathematicae Universitatis Carolinae 34 (1993), p. 707–710 Stephen G. Simpson, Subsystems of second order arithmetic, Springer Verlag, Berlin, 1999 [Perspectives in Mathematical Logic] Robert I. Soare, Recursively enumerable sets and degrees. A study of computable functions and computably generated sets, Springer Verlag, Berlin, 1987 [Perspectives in Mathematical Logic] Robert M. Solovay, A model of set theory in which every set of reals is Lebesgue measurable, Annals of Mathematics 92 (1970), p. 1–56 Robert Solovay, Determinacy and type-2 recursion, Journal for Symbolic Logic 36 (1971), p. 374 Andrea Sorbi (ed.), Complexity, Logic and Recursion Theory, Marcel Dekker, New York NY, 1997 [Lecture Notes in Pure and Applied Mathematics 187] Ludwig Staiger, Kolmogorov complexity and Hausdorff dimension, Information and Computation 102 (1993), p. 159–194 Ludwig Staiger, A tight upper bound on Kolmogorov complexity and uniformly optimal prediction, Theory of Computing Systems 31 (1998), p. 215–229 Dietrich Stauffer (ed.), Annual Reviews of Computational Physics VI, World Scientific, River Edge NJ, 1999 John R. Steel, Analytic Sets and Borel Isomorphisms, Fundamenta Mathematicae 108 (1980), p. 83–88 John R. Steel, A classification of jump operators, Journal of Symbolic Logic 47 (1982), p. 347–358 John R. Steel, Determinacy in the Mitchell models, Annals of Mathematical Logic 22 (1982), p. 109–125 John R. Steel, Philip D. Welch, Σ13 -absoluteness and the second uniform indiscernible, Israel Journal of Mathematics 14 (1998), p. 157–190 Juris Stepr¯ans, Decomposing Euclidean space with a small number of smooth sets, Transactions of the American Mathematical Society 351 (1999), p. 1461–1480 Volker Strassen, Gaussian elimination is not optimal, Numerische Mathematik 13 (1969), p. 354–356 ¨ die Reine Volker Strassen, Vermeidung von Divisionen, Journal fur und Angewandte Mathematik 264 (1973), p 184–202 Volker Strassen, Algebraic complexity theory, in: [vLe+90, p. 634–672] Madhu Sudan, Peter Bro Miltersen, Richard Cleve, Uriel Feige, Anna Gal, Toniann Pitassi, Rainer Schuler, Dandapani Sivakumar, Salil Vadhan (eds.), 16th Annual IEEE Conference on Computational Complexity, 18-21 June 2001, Chicago, Illinois, USA, IEEE Computer Society Press, New York NY, 2001
256 [Tar41] [Tar55] [Tho0 01]
[Tho0 02]
[Tho1 54] [Tit00]
[Tit01]
[Tod84] [Tod89] [Una77] [vBe91]
[vDa98] [vLe+90]
[Van78] [Vaz98]
[Vel91] [Vel02] [Vit98]
[VitLarLei0 99]
REFERENCES Alfred Tarski, On the calculus of relations, Journal of Symbolic Logic 6 (1941), p. 73–89 Alfred Tarski, Contribution to the theory of models, Indagationes Mathematicae 17 (1955), p. 56–64 Simon Thomas, On the Complexity of the Classification Problem for Torsion-Free Abelian Groups of finite rank, Bulletin of Symbolic Logic 7 (2001), p. 329–344 Simon Thomas (ed.), Set theory, The Hajnal conference, Proceedings from the Mid-Atlantic Mathematical Logic Seminar (MAMLS) conference held in honor of Andr´as Hajnal at the DIMACS Center, Rutgers University, Piscataway, NJ, USA, October 15-17, 1999, American Mathematical Society, Providence RI, 2002 [DIMACS Series in Discrete Mathematics and Theoretical Computer Science 58] James F. Thomson, Tasks and Super-tasks, Analysis 15 (1954), p. 1–13 Francis Titsworth (ed.), Proceedings of the 15th Annual IEEE Conference on Computational Complexity, 4-7 July 2000, Florence, Italy, IEEE Computer Society Press, New York NY, 2000 Francis Titsworth (ed.), Proceedings of the 16th Annual IEEE Conference on Computational Complexity, 18-21 June 2001, Chicago, Illinois, IEEE Computer Society Press, New York NY, 2001 Stevo Todorˇcevi´c, A note on the proper forcing axiom, in: [BauMar2 She84, p. 209–218] Stevo Todorˇcevi´c, Partition problems in topology, American Mathematical Society, Providence RI, 1989 [Contemporary Mathematics 84] Kh. Ya. Unachev, Algebra i teori qisel, Vypusk 2, Kabardino-Balkarskii Gosudarstvennyi Universitet, Nalchik, 1977 Johan van Benthem, Language in action, Categories, lambdas and dynamic logic, North-Holland, Amsterdam 1991 [Studies in Logic and the Foundations of Mathematics 130] Wim van Dam, Quantum Oracle Interrogation: Getting all information for almost half the price, in: [Mot98, p. 362–367] Jan van Leeuwen, Albert R. Meyer, Maurice Nivat, Matthew Paterson, Dominique Perrin (eds.) , Handbook of Theoretical Computer Science Volume A, Elsevier Science Publishers, Amsterdam, 1990 Robert A. Van Wesep, Wadge degrees and descriptive set theory, in: [KecMos1 78, p. 151–170] Umesh Vazirani, On the power of quantum computation, Philosophical Transactions of the Royal Society London Series A 356 (1998), p. 1759–1768 Boban Veliˇckovi´c, CCC posets of perfect sets, Composito Mathematica 79 (1991), p. 279–294 Boban Veliˇckovi´c, The basis problem for ccc posets, in: [Tho0 02, p. 149–160] Jeffrey Vitter (ed.), Proceedings of the Thirtieth Annual ACM Symposium on the Theory of Computing, Dallas, Texas, USA, May 23-26, 1998, ACM Press, New York NY, 1998 Jeffrey S. Vitter, Lawrence Larmore, Tom Leighton (eds.), Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, May 1-4, 1999, Atlanta, Georgia, USA, ACM Press, New York NY, 1999
REFERENCES [VitSpi1 Yan01]
[Wan01] [Wel0 99] [Wel0 00a]
[Wel0 00b]
[Wil0 99]
[Win0 79] [Woo99]
[Yan94] [Ye91] [Zap04]
[Zem02] [Zha02] [ZvoLev70]
[Zwi98a]
[Zwi98b]
257
Jeffrey S. Vitter, Paul Spirakis, Mihalis Yannakakis (eds.), Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing, Hersonissos, Crete, Greece, 6–8 July 2001, ACM Press, New York NY, 2001 Heinrich Wansing (ed.), Essays on Non-Classical Logic, World Scientific, Singapore 2001 [Advances in Logic 1] Philip D. Welch, Friedman’s trick: Minimality arguments in the infinite time Turing degrees, in: [Coo1 Tru99, p. 425–436] Philip D. Welch, Eventually infinite time Turing machine degrees: Infinite time decidable reals, Journal of Symbolic Logic 65 (2000), p. 1193–1203 Philip D. Welch, The lengths of infinite time Turing machine computations, Bulletin of the London Mathematical Society 32 (2000), p. 129– 136 Colin P. Williams (ed.), Quantum computing and quantum communications, 1st NASA international conference, QCQC ‘98, Palm Springs, CA, USA, February 17-20, 1998, Springer Verlag, Berlin, 1999 [Lecture Notes in Computer Science 1509] Shmuel Winograd, On multiplication in algebraic extension fields, Theoretical Computer Science 8 (1979), p. 359–377 W. Hugh Woodin, The Axiom of Determinacy, Forcing Axioms, and the Nonstationary Ideal, Walter de Gruyter & Co., Berlin, 1999 [de Gruyter Series in Logic and its Applications 1] Mihalis Yannakakis, On the approximation of maximum satisfiability, Journal of Algorithms 17 (1994), p. 475–502 Yinyu Ye, An O(n3 L) potential reduction algorithm for linear programming, Mathematical Programming 50 (1991), p. 239–258 Jindˇrich Zapletal, Descriptive Set Theory and Definable Forcing, American Mathematical Society, Providence RI, to appear 2004 [Memoirs of the American Mathematical Society 167] Martin Zeman, Inner Models and Large Cardinals, Walter de Gruyter, Berlin, 2002 [de Gruyter Series in Logic and Its Applications 5] Yi Zhang (ed.), Logic and algebra, American Mathematical Society, Providence RI, 2002 [Contemporary Mathematics 302] Alexander K. Zvonkin, Leonid A. Levin, The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms, Russian Mathematical Surveys 25 (1970), p. 83–124 Uri Zwick, Approximation algorithms for constraint satisfaction programs involving at most three variables per constraint, in: [Kar1 98, p. 201–210] Uri Zwick, Finding almost satisfying assignments, in: [Vit98, p. 551– 560]
TRENDS IN LOGIC 1.
G. Schurz: The Is-Ought Problem. An Investigation in Philosophical Logic. 1997 ISBN 0-7923-4410-3
2.
E. Ejerhed and S. Lindstr¨om (eds.): Logic, Action and Cognition. Essays in Philosophical Logic. 1997 ISBN 0-7923-4560-6
3.
H. Wansing: Displaying Modal Logic. 1998
ISBN 0-7923-5205-X
4.
P. H´ajek: Metamathematics of Fuzzy Logic. 1998
ISBN 0-7923-5238-6
5.
H.J. Ohlbach and U. Reyle (eds.): Logic, Language and Reasoning. Essays in Honour of Dov Gabbay. 1999 ISBN 0-7923-5687-X
6.
K. Do˘sen: Cut Elimination in Categories. 2000
7.
R.L.O. Cignoli, I.M.L. D’Ottaviano and D. Mundici: Algebraic Foundations of manyvalued Reasoning. 2000 ISBN 0-7923-6009-5
8.
E.P. Klement, R. Mesiar and E. Pap: Triangular Norms. 2000
ISBN 0-7923-5720-5
ISBN 0-7923-6416-3 9.
V.F. Hendricks: The Convergence of Scientific Knowledge. A View From the Limit. 2001 ISBN 0-7923-6929-7
10.
J. Czelakowski: Protoalgebraic Logics. 2001
ISBN 0-7923-6940-8
11.
G. Gerla: Fuzzy Logic. Mathematical Tools for Approximate Reasoning. 2001 ISBN 0-7923-6941-6
12.
M. Fitting: Types, Tableaus, and G¨odel’s God. 2002
ISBN 1-4020-0604-7
13.
F. Paoli: Substructural Logics: A Primer. 2002
ISBN 1-4020-0605-5
14.
S. Ghilardi and M. Zawadowki: Sheaves, Games, and Model Completions. A Categorical Approach to Nonclassical Propositional Logics. 2002 ISBN 1-4020-0660-8
15.
G. Coletti and R. Scozzafava: Probabilistic Logic in a Coherent Setting. 2002 ISBN 1-4020-0917-8; Pb: 1-4020-0970-4
16.
P. Kawalec: Structural Reliabilism. Inductive Logic as a Theory of Justification. 2002 ISBN 1-4020-1013-3
17.
B. L¨owe, W. Malzkorn and T. R¨asch (eds.): Foundations of the Formal Sciences II. Applications of Mathematical Logic in Philosophy and Linguistics, Papers of a conference held in Bonn, November 10-13, 2000. 2003 ISBN 1-4020-1154-7
18.
R.J.G.B. de Queiroz (ed.): Logic for Concurrency and Synchronisation. 2003 ISBN 1-4020-1270-5
19.
A. Marcja and C. Toffalori: A Guide to Classical and Modern Model Theory. 2003 ISBN 1-4020-1330-2; Pb 1-4020-1331-0
20.
S.E. Rodabaugh and E.P. Klement (eds.): Topological and Algebraic Structures in Fuzzy Sets. A Handbook of Recent Developments in the Mathematics of Fuzzy Sets. 2003 ISBN 1-4020-1515-1; Pb 1-4020-1516-X
21.
V.F. Hendricks and J. Malinowski: Trends in Logic. 50 Years Studia Logica. 2003 ISBN 1-4020-1601-8
22.
M. Dalla Chiara, R. Giuntini and R Greechie: Reasoning in Quantum Theory. Sharp and Unsharp Quantum Logics. 2004 ISBN 1-4020-1978-5
23.
B. L¨owe, B. Piwinger and T. R¨asch (eds.): Classical and New Paradigms of Computation and their Complexity Hierarchies. Papers of the conference "Foundations of the Formal Sciences III" held in Vienna, September 21–24, 2001 ISBN 1-4020-2775-3
KLUWER ACADEMIC PUBLISHERS – DORDRECHT / BOSTON / LONDON