OXFORD LOGIC GUIDES
1. Jane Bridge: Beginning model theory: the completeness theorem and some consequences
2. Michael Dummett: Elements of intuitionism (1st edition)
3. A. S. Troelstra: Choice sequences: a chapter of intuitionistic mathematics
4. J. L. Bell: Boolean-valued models and independence proofs in set theory (1st edition)
5. Krister Segerberg: Classical propositional operators: an exercise in the foundation of logic
6. G. C. Smith: The Boole-De Morgan correspondence 1842-1864
7. Alec Fisher: Formal number theory and computability: a work book
8. Anand Pillay: An introduction to stability theory
9. H. E. Rose: Subrecursion: functions and hierarchies
10. Michael Hallett: Cantorian set theory and limitation of size
11. R. Mansfield and G. Weitkamp: Recursive aspects of descriptive set theory
12. J. L. Bell: Boolean-valued models and independence proofs in set theory (2nd edition)
13. Melvin Fitting: Computability theory: semantics and logic programming
14. J. L. Bell: Toposes and local set theories: an introduction
15. R. Kaye: Models of Peano arithmetic
16. J. Chapman and F. Rowbottom: Relative category theory and geometric morphisms: a logical approach
17. Stewart Shapiro: Foundations without foundationalism
18. John P. Cleave: A study of logics
19. R. M. Smullyan: Gödel's incompleteness theorems
20. T. E. Forster: Set theory with a universal set: exploring an untyped universe
21. C. McLarty: Elementary categories, elementary toposes
22. R. M. Smullyan: Recursion theory for metamathematics
23. Peter Clote and Jan Krajíček: Arithmetic, proof theory, and computational complexity
24. A. Tarski: Introduction to logic and to the methodology of deductive sciences
25. G. Malinowski: Many-valued logics
26. Alexandre Borovik and Ali Nesin: Groups of finite Morley rank
27. R. M. Smullyan: Diagonalization and self-reference
28. Dov M. Gabbay, Ian Hodkinson, and Mark Reynolds: Temporal logic: mathematical foundations and computational aspects: Volume 1
29. Saharon Shelah: Cardinal arithmetic
30. Erik Sandewall: Features and fluents: Volume I: a systematic approach to the representation of knowledge about dynamical systems
31. T. E. Forster: Set theory with a universal set: exploring an untyped universe (2nd edition)
32. Anand Pillay: Geometric stability theory
33. Dov M. Gabbay: Labelled deductive systems
34. Raymond M. Smullyan and Melvin Fitting: Set theory and the continuum problem
35. Alexander Chagrov and Michael Zakharyaschev: Modal logic
36. G. Sambin and J. Smith: Twenty-five years of Martin-Löf constructive type theory
37. Maria Manzano: Model theory
38. Dov M. Gabbay: Fibring logics
39. Michael Dummett: Elements of intuitionism (2nd edition)
40. D. M. Gabbay, M. A. Reynolds, and M. Finger: Temporal logic: mathematical foundations and computational aspects: Volume 2
41. J. M. Dunn and G. Hardegree: Algebraic methods in philosophical logic
Algebraic Methods in Philosophical Logic

J. MICHAEL DUNN and GARY M. HARDEGREE
CLARENDON PRESS • OXFORD 2001
OXFORD UNIVERSITY PRESS
Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York Athens Auckland Bangkok Bogota Buenos Aires Calcutta Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul Karachi Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Paris Sao Paulo Singapore Taipei Tokyo Toronto Warsaw, with associated companies in Berlin Ibadan

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

Published in the United States by Oxford University Press Inc., New York

© J. M. Dunn and G. M. Hardegree, 2001

The moral rights of the authors have been asserted

Database right Oxford University Press (maker)

First published 2001

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer

A catalogue record for this title is available from the British Library.

Library of Congress Cataloging in Publication Data
Dunn, J. Michael, 1941-
Algebraic methods in philosophical logic / J. Michael Dunn and Gary Hardegree.
p. cm. (Oxford logic guides; 41)
Includes bibliographical references and index.
1. Algebraic logic. I. Hardegree, Gary. II. Title. III. Series.
QA10.D85 2001 511.3'24-dc21 2001021287

ISBN 0 19 853192 3 (Hbk)

Typeset by the authors in LaTeX
Printed in Great Britain on acid-free paper by T. J. International Ltd, Padstow, Cornwall
Dedicated to our loved, and loving, spouses,
Sally Dunn and Kate Dorfman, who have been with us even longer than this book
PREFACE

This book has been in process over many years. Someone once said "This letter would not be so long if I had more time," and we have somewhat the dual thought regarding this book. The book was begun by JMD in the late 1960s in the form of handouts for an algebraic logic seminar, taught first at Wayne State University and then at Indiana University. Chapters 1 through 3, 8, 10, and 11 date in their essentials from that period. GMH joined the project after taking the seminar in the middle 1970s, but it did not really take off until GMH visited Indiana University in 1982. The bulk of the collaborative writing was done in the academic year 1984-85, especially during the spring semester when JMD visited the University of Massachusetts (Amherst) to work with GMH. JMD wishes to thank the American Council of Learned Societies for a fellowship during his sabbatical year 1984-85. Most of Chapters 4 through 7 were written jointly during that period. Little did we know then that this would be a "book for the new millennium."

We wish to thank Dana Scott for his encouragement at that stage, but also for his critical help in converting our initial work, written in a then popular word processor, to LaTeX. We also thank his then assistants, John Aronis and Stacy Quackenbush, for their skillful and patient work on the conversion and formatting.

Then, for various reasons, the project essentially came to a stop after our joint work of 1984-85. But JMD resumed it, preparing a series of draft manuscripts for seminars. GMH is the principal author of Chapter 9, and JMD is the principal author of the remaining chapters. It is impossible to recall all of the students who provided lists of typos or suggestions, but we especially want to thank Gerry Allwein, Alexandru Baltag, Axel Barcelo, Gordon Beavers, Norm Danner, Eric Hammer, Timothy Herron, Yu-Houng Houng, Albert Layland, Julia Lawall, Jay Mersch, Ed Mares, Michael O'Connor, and Yuko Murakami.

JMD has had a series of excellent research assistants who have been helpful in copy editing and aiding with the LaTeX aspects. Monica Holland systematized the formatting and the handling of the files for the book, Andre Chapuis did most of the diagrams, Chrysafis Hartonas helped particularly with the content of Chapter 13, Steve Crowley helped add some of the last sections, and Katalin Bimbó did an outstanding job in perfecting and polishing the book. She also essentially wrote Section 8.13 and provided significant help with Section 8.3. Kata truly deserves the credit for making this a completed object rather than an incomplete process. We owe all of these our thanks.

We owe thanks to Allen Hazen and Piero D'Altan, who have provided extensive comments, suggesting a range of improvements, from corrections of typos and technical points to stylistic suggestions. We also thank Yaroslav Shramko and Tatsutoshi Tatenaka for corrections. We wish to thank Greg Pavelcak and Katalin Bimbó for preparing the index.

We thank Robert K. Meyer for providing a critical counter-example (cf. Section
6.9), and also for his useful interactions over the years with JMD. The "gaggle theory" in our book is a generalization of the semantics for relevance logic that he developed with Richard Routley in the early 1970s.

The authors owe intellectual debts especially to G. D. Birkhoff, M. H. Stone, B. Jónsson, and A. Tarski. Their work on universal algebra and representation theory permeates the present work. JMD also wants to thank his teacher N. D. Belnap for first stimulating his interest in algebraic methods in logic, and to also acknowledge the influence of P. Halmos' book, Algebraic Logic (1962).

We wish to thank Richard Leigh, the copy editor for Oxford University Press, and Lisa Blake, the development editor, for their keen eyes and friendly and professional manner. We owe many last-minute "catches" to them. Despite the efforts of all of us, there are undoubtedly still typos and maybe more serious errors, for which the authors take full responsibility.

Someone (Aelius Donatus) also said "Pereant, inquit, qui ante nos nostra dixerunt" (Confound those who have voiced our thoughts before us). As the book was written over a considerable period of time, thoughts which were once original with us (or at least we thought they were) have undoubtedly been expressed by others. While we have tried to reference these wherever we could, we may have missed some, and we apologize to any such authors in advance.

We wish to thank the following journals and publishers for permissions. Detailed bibliographic information appears in the references at the end of this volume under the headings given below. Section numbers indicate where in this volume some version of the cited material can be found. Springer-Verlag: Dunn (1991), 12.1-12.9, 12.16. Clarendon Press: Dunn (1993a), 3.17. W. de Gruyter: Dunn (1995a), 12.10-12.15; Dunn (1993b), 3.13. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik: Dunn and Meyer (1971), 11.10.

We wish to thank Indiana University and the University of Massachusetts for support for our research. In particular, JMD wishes to thank Morton Lowengrub, Dean of the College of Arts and Sciences, for his support over the years. We thank our spouses, Sarah J. ("Sally") Dunn and Katherine ("Kate") Dorfman, for their love and support.

Obviously this book tries to represent a reasonable portion of the intersection of algebraic logic and philosophical logic, but still contains only a fraction of the results. Scholars who know our previous publications may find surprising how little is devoted to relevance logic and quantum logic. We knew (between us) too much about these subjects to fit them between two covers. Another notable omission is the algebraic treatment of first-order logic, where perhaps we know too little. There are at least three main treatments for classical logic: cylindric algebras (Henkin, Tarski and Monk (1971)), polyadic algebras (Halmos (1962)), and complete lattices (Rasiowa and Sikorski (1963)), and at a rough calculation, to do justice to them all we would have to multiply the length of the present book by three. We suspect that the reader applauds our decision.

An overriding theme of the book is that standard algebraic-type results, e.g., representation theorems, translate into standard logic-type results, e.g., completeness theorems. A subsidiary theme, stemming from JMD's research, is to identify a class of
algebras most generally appropriate for the study of logics (both classical and non-classical), and this leads to the introduction of gaggles, distributoids, and partial gaggles and tonoids. Another important subtheme is that logic is fundamentally information based. Its main elements are propositions, which can be understood as sets of information states.

This book is suitable as a textbook for graduate and even advanced undergraduate courses, while at the same time, we hope, being of interest to researchers. In terms of the book's target audience, we briefly considered indicating this by expanding its title to "Algebraic Methods in Philosophical Logic for Computer and Information Scientists, and maybe Linguists." We rejected this as too nakedly a marketing ploy. But the serious point behind this joke title is that we do believe that the book has results of interest to mathematicians, philosophers, computer and information scientists, and maybe linguists.

J.M.D.
G.M.H.
CONTENTS

1 Introduction

2 Universal Algebra
   2.1 Introduction
   2.2 Relational and Operational Structures (Algebras)
   2.3 Subrelational Structures and Subalgebras
   2.4 Intersection, Generators, and Induction from Generators
   2.5 Homomorphisms and Isomorphisms
   2.6 Congruence Relations and Quotient Algebras
   2.7 Direct Products
   2.8 Subdirect Products and the Fundamental Theorem of Universal Algebra
   2.9 Word Algebras and Interpretations
   2.10 Varieties and Equational Definability
   2.11 Equational Theories
   2.12 Examples of Free Algebras
   2.13 Freedom and Typicality
   2.14 The Existence of Free Algebras; Freedom in Varieties and Subdirect Classes
   2.15 Birkhoff's Varieties Theorem
   2.16 Quasi-varieties
   2.17 Logic and Algebra: Algebraic Statements of Soundness and Completeness

3 Order, Lattices, and Boolean Algebras
   3.1 Introduction
   3.2 Partially Ordered Sets
   3.3 Strict Orderings
   3.4 Covering and Hasse Diagrams
   3.5 Infima and Suprema
   3.6 Lattices
   3.7 The Lattice of Congruences
   3.8 Lattices as Algebras
   3.9 Ordered Algebras
   3.10 Tonoids
   3.11 Tonoid Varieties
   3.12 Classical Complementation
   3.13 Non-Classical Complementation
   3.14 Classical Distribution
   3.15 Non-Classical Distribution
   3.16 Classical Implication
   3.17 Non-Classical Implication
   3.18 Filters and Ideals

4 Syntax
   4.1 Introduction
   4.2 The Algebra of Strings
   4.3 The Algebra of Sentences
   4.4 Languages as Abstract Structures: Categorial Grammar
   4.5 Substitution Viewed Algebraically (Endomorphisms)
   4.6 Effectivity
   4.7 Enumerating Strings and Sentences

5 Semantics
   5.1 Introduction
   5.2 Categorial Semantics
   5.3 Algebraic Semantics for Sentential Languages
   5.4 Truth-Value Semantics
   5.5 Possible Worlds Semantics
   5.6 Logical Matrices and Logical Atlases
   5.7 Interpretations and Valuations
   5.8 Interpreted and Evaluationally Constrained Languages
   5.9 Substitutions, Interpretations, and Valuations
   5.10 Valuation Spaces
   5.11 Valuations and Logic
   5.12 Equivalence
   5.13 Compactness
   5.14 The Three-Fold Way

6 Logic
   6.1 Motivational Background
   6.2 The Varieties of Logical Experience
   6.3 What Is (a) Logic?
   6.4 Logics and Valuations
   6.5 Binary Consequence in the Context of Pre-ordered Sets
   6.6 Asymmetric Consequence and Valuations (Completeness)
   6.7 Asymmetric Consequence in the Context of Pre-ordered Groupoids
   6.8 Symmetric Consequence and Valuations (Completeness and Absoluteness)
   6.9 Symmetric Consequence in the Context of Hemi-distributoids
   6.10 Structural (Formal) Consequence
   6.11 Lindenbaum Matrices and Compositional Semantics for Assertional Formal Logics
   6.12 Lindenbaum Atlas and Compositional Semantics for Formal Asymmetric Consequence Logics
   6.13 Scott Atlas and Compositional Semantics for Formal Symmetric Consequence Logics
   6.14 Co-consequence as a Congruence
   6.15 Formal Presentations of Logics (Axiomatizations)
   6.16 Effectiveness and Logic

7 Matrices and Atlases
   7.1 Matrices
      7.1.1 Background
      7.1.2 Łukasiewicz matrices/submatrices, isomorphisms
      7.1.3 Gödel matrices/more submatrices
      7.1.4 Sugihara matrices/homomorphisms
      7.1.5 Direct products
      7.1.6 Tautology preservation
      7.1.7 Infinite matrices
      7.1.8 Interpretation
   7.2 Relations Among Matrices: Submatrices, Homomorphic Images, and Direct Products
   7.3 Proto-preservation Theorems
   7.4 Preservation Theorems
   7.5 Varieties Theorem Analogs for Matrices
      7.5.1 Unary assertional logics
      7.5.2 Asymmetric consequence logics
      7.5.3 Symmetric consequence logics
   7.6 Congruences and Quotient Matrices
   7.7 The Structure of Congruences
   7.8 The Cancellation Property
   7.9 Normal Matrices
   7.10 Normal Atlases
   7.11 Normal Characteristic Matrices for Consequence Logics
   7.12 Matrices and Algebras
   7.13 When is a Logic "Algebraizable"?

8 Representation Theorems
   8.1 Partially Ordered Sets with Implication(s)
      8.1.1 Partially ordered sets
      8.1.2 Implication structures
   8.2 Semi-lattices
   8.3 Lattices
   8.4 Finite Distributive Lattices
   8.5 The Problem of a General Representation for Distributive Lattices
   8.6 Stone's Representation Theorem for Distributive Lattices
   8.7 Boolean Algebras
   8.8 Filters and Homomorphisms
   8.9 Maximal Filters and Prime Filters
   8.10 Stone's Representation Theorem for Boolean Algebras
   8.11 Maximal Filters and Two-Valued Homomorphisms
   8.12 Distributive Lattices with Operators
   8.13 Lattices with Operators

9 Classical Propositional Logic
   9.1 Preliminary Notions
   9.2 The Equivalence of (Unital) Boolean Logic and Frege Logic
   9.3 Symmetrical Entailment
   9.4 Compactness Theorems for Classical Propositional Logic
   9.5 A Third Logic
   9.6 Axiomatic Calculi for Classical Propositional Logic
   9.7 Primitive Vocabulary and Definitional Completeness
   9.8 The Calculus BC
   9.9 The Calculus D(BC)
   9.10 Asymmetrical Sequent Calculus for Classical Propositional Logic
   9.11 Fragments of Classical Propositional Logic
   9.12 The Implicative Fragment of Classical Propositional Logic: Semi-Boolean Algebras
   9.13 Axiomatizing the Implicative Fragment of Classical Propositional Logic
   9.14 The Positive Fragment of Classical Propositional Logic

10 Modal Logic and Closure Algebras
   10.1 Modal Logics
   10.2 Boolean Algebras with a Normal Unitary Operator
   10.3 Free Boolean Algebras with a Normal Unitary Operator and Modal Logic
   10.4 The Kripke Semantics for Modal Logic
   10.5 Completeness
   10.6 Topological Representation of Closure Algebras
   10.7 The Absolute Semantics for S5
   10.8 Henle Matrices
   10.9 Alternation Property for S4 and Compactness
   10.10 Algebraic Decision Procedures for Modal Logic
   10.11 S5 and Pretabularity

11 Intuitionistic Logic and Heyting Algebras
   11.1 Intuitionistic Logic
   11.2 Implicative Lattices
   11.3 Heyting Algebras
   11.4 Representation of Heyting Algebras using Quasi-ordered Sets
   11.5 Topological Representation of Heyting Algebras
   11.6 Embedding Heyting Algebras into Closure Algebras
   11.7 Translation of H into S4
   11.8 Alternation Property for H
   11.9 Algebraic Decision Procedures for Intuitionistic Logic
   11.10 LC and Pretabularity

12 Gaggles: General Galois Logics
   12.1 Introduction
   12.2 Residuation and Galois Connections
   12.3 Definitions of Distributoid and Tonoid
   12.4 Representation of Distributoids
   12.5 Partially Ordered Residuated Groupoids
   12.6 Definition of a Gaggle
   12.7 Representation of Gaggles
   12.8 Modifications for Distributoids and Gaggles with Identities and Constants
   12.9 Applications
   12.10 Monadic Modal Operators
   12.11 Dyadic Modal Operators
   12.12 Identity Elements
   12.13 Representation of Positive Binary Gaggles
   12.14 Implication
      12.14.1 Implication in relevance logic
      12.14.2 Implication in intuitionistic logic
      12.14.3 Modal logic
   12.15 Negation
      12.15.1 The gaggle treatment of negation
      12.15.2 Negation in intuitionistic logic
      12.15.3 Negation in relevance logic
      12.15.4 Negation in classical logic
   12.16 Future Directions

13 Representations and Duality
   13.1 Representations and Duality
   13.2 Some Topology
   13.3 Duality for Boolean Algebras
   13.4 Duality for Distributive Lattices
   13.5 Extensions of Stone's and Priestley's Results

References

Index
1 INTRODUCTION
The reader who is completely new to algebraic logic may find this the hardest chapter in the book, since it uses concepts that may not be adequately explained. Such a reader is advised to skim this chapter at first reading and then to read relevant parts again as appropriate concepts are mastered.

In this chapter we shall recall some of the high points in the development of algebraic logic, our aim being to provide a framework of established results with which our subsequent treatment of the algebra of various logics may be compared. Although we shall chiefly be discussing the algebra of the classical propositional calculus, this discussion is intended to have a certain generality. We mean to emphasize the essential features of the relation of the classical propositional calculus to Boolean algebra, remarking from time to time what is special to this relation and what is generalizable to the algebra of other propositional calculi. It should be mentioned that we here restrict ourselves to the algebra of propositional logics, despite the fact that profound results concerning the algebra of the classical predicate calculus have been obtained by Tarski, Halmos, and others. It should also be mentioned that we are not here concerned with setting down the history of algebraic logic, and that, much as in a historical novel, historical figures will be brought in mainly for the sake of dramatic emphasis.

About the middle of the nineteenth century, the two fields of abstract algebra and symbolic logic came into being. Although algebra and logic had been around for some time, abstract algebra and symbolic logic were essentially new developments. Both these fields owe their origins to the insight that formal systems may be investigated without explicit recourse to their intended interpretations. This insight led George Boole, in his Mathematical Analysis of Logic (1847), to formulate at one and the same time perhaps the first example of a non-numerical algebra and the first example of a symbolic logic. He observed that the operation of conjoining two propositions had certain affinities with the operation of multiplying two numbers. Boole tended also to identify propositions with classes of times, or cases, in which they are true (cf. Dipert, 1978); the conjunction of propositions thus corresponded to the intersection of classes. Besides the operation of conjunction on propositions, there are also the operations of negation (−) and disjunction (∨). Historically, Boole and his followers tended to favor exclusive disjunction (either a or b, but not both, is true), which they denoted by a + b, but modern definitions of a Boolean algebra (cf. Chapter 3) tend to feature inclusive disjunction (a and/or b are/is true), which is denoted by a ∨ b. The "material conditional" a ⊃ b can be defined as −a ∨ b. A special element 1 can be defined as a ∨ −a, and a relation of "implication" can be defined so that a ≤ b iff (a ⊃ b) = 1.¹ (If the reader has the natural tendency to want to reverse the inequality sign on the grounds that if a implies b, a ought to be the stronger proposition, think of Boole's identification of propositions with sets of cases in which they are true. Then "a implies b" means that every case in which a is true is a case in which b is true, i.e., a ⊆ b.) Boole's algebra of logic is thus at the same time an algebra of classes, but we shall ignore this aspect of Boole's algebra in the present discussion.

¹ We use the standard abbreviation throughout of "iff" for "if and only if."

He saw that by letting letters like a and b stand for propositions, just as they stand for numbers in ordinary algebra, and by letting juxtaposition of letters stand for the operation of conjunction, just as it stands for multiplication in ordinary algebra, these affinities could be brought to the fore. Thus, for example, ab = ba is a law of this algebra of logic just as it is a law of the ordinary algebra of numbers. At the same time, the algebra of logic has certain differences from the algebra of numbers since, for example, aa = a. The differences are just as important as the similarities, for whereas the similarities suggested a truly symbolic logic, like the "symbolic arithmetic" that comprises ordinary algebra, the differences suggested that algebraic methods could be extended far beyond the ordinary algebra of numbers.

Oddly enough, despite the fact that Boole's algebra was thus connected with the origins of both abstract algebra and symbolic logic, the two fields developed for some time thereafter in comparative isolation from one another. On the one hand, the notion of a Boolean algebra was perfected by Jevons (1864), Schröder (1890-1905), Huntington (1904), and others (until it reached the modern conception used in this book), and developed as a part of the growing field of abstract algebra. On the other hand, the notion of a symbolic logic was developed along subtly different lines from Boole's original algebraic formulation, starting with Frege (1879) and receiving its classic statement in Whitehead and Russell's Principia Mathematica (1910). The divergence of the two fields was partly a matter of attitude. Thus Boole, following in the tradition of Leibniz, wanted to study the mathematics of logic, whereas the aim of Frege, Whitehead, and Russell was to study the logic of mathematics. The modern field of mathematical logic, of course, recognizes both approaches as methodologically legitimate, and indeed embraces both under the very ambiguity of its name, "mathematical logic," but the Frege-Whitehead-Russell aim to reduce mathematics to logic obscured for some time the two-headedness of the mathematical-logical coin.

There is more than a difference in attitude, however, between Boole's algebraic approach to logic and the Frege-Whitehead-Russell approach to logic, which for want of a better word we shall call logistic. We shall attempt to bring out this difference between the two approaches, which was either so profound, or so subtle, that the precise connection between the two ways of looking at logic was not discovered until the middle 1930s. The difference we have in mind is essentially the distinction that Curry (1963, pp. 166-168) makes between a relational (algebraic) system and an assertional (logistic) system, though we shall have to be more informal than Curry since we do not have his nice formalist distinctions at hand.
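Boole's laws can be checked mechanically in the two-element case. The following sketch (ours, in Python, and no part of the original text; the encoding of true as 1 and false as 0, and all function names, are our own illustrative choices) verifies ab = ba, aa = a, and the definition of ≤ from the material conditional:

```python
# The two-element Boolean algebra, with 0 as false and 1 as true.
ELEMENTS = (0, 1)

def neg(a):        # complementation: -a
    return 1 - a

def conj(a, b):    # conjunction written as "multiplication": ab
    return a * b

def disj(a, b):    # inclusive disjunction: a v b
    return max(a, b)

def cond(a, b):    # material conditional: a > b, defined as -a v b
    return disj(neg(a), b)

def leq(a, b):     # the "implication" order: a <= b iff (a > b) = 1
    return cond(a, b) == 1

# ab = ba and aa = a hold in the algebra of logic, unlike the ordinary
# algebra of numbers, where aa = a fails already for a = 2.
assert all(conj(a, b) == conj(b, a) for a in ELEMENTS for b in ELEMENTS)
assert all(conj(a, a) == a for a in ELEMENTS)
# 1 = a v -a sits at the top of the order.
assert all(disj(a, neg(a)) == 1 and leq(a, 1) for a in ELEMENTS)
```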
Let us begin by looking at a logistic presentation of the classical propositional calculus that is essentially the same as in Principia Mathematica, except that we use axiom schemata and thereby do without the rule of substitution, which was tacitly presupposed by Principia. This presentation begins by assuming that we have a certain stock of atomic sentences p, q, r, etc., and then specifies that these are (well-formed) sentences and that further sentences may be constructed from them by the usual inductive insertion of logical connectives (and parentheses). The particular logical connectives assumed in this presentation are those of disjunction (∨) and negation (∼), although conjunction is assumed to be defined in terms of these so that φ & ψ is an abbreviation for ∼(∼φ ∨ ∼ψ), and material implication is also assumed to be defined in terms of these so that φ ⊃ ψ is an abbreviation for ∼φ ∨ ψ. A certain proper subset of these sentences are then singled out as axioms. These axioms are all instances of the following axiom schemata:

(1) (φ ∨ φ) ⊃ φ
(2) ψ ⊃ (φ ∨ ψ)
(3) (φ ∨ ψ) ⊃ (ψ ∨ φ)
(4) (φ ∨ (ψ ∨ χ)) ⊃ (ψ ∨ (φ ∨ χ))
(5) (ψ ⊃ χ) ⊃ ((φ ∨ ψ) ⊃ (φ ∨ χ)).

These axioms are called theorems, and it is further specified that additional sentences are theorems in virtue of the following rule:

Modus ponens: If φ is a theorem, and if φ ⊃ ψ is a theorem, then ψ is a theorem.

The point of this perhaps too tedious but not too careful rehearsal of elementary logic is to give us some common ground for a comparison of the classical propositional calculus with a Boolean algebra. There are certain surface similarities that are misleading. Thus, for example, a Boolean algebra has certain items called elements which are combined by certain operations to give other elements, just as the classical propositional calculus has certain items called sentences which are combined by the operation of inserting logical connectives to give other sentences. They are both then, from this point of view, abstract algebras in the sense of Chapter 2. This fact might lead one to confuse the operation of disjoining two sentences φ and ψ so as to obtain φ ∨ ψ with the operation of joining two elements a and b of a Boolean algebra so as to obtain a ∨ b.

There are essential differences between these two binary operations. Consider, for example, that where φ is a sentence, φ ∨ φ is yet another distinct sentence, since φ ∨ φ contains at least one more occurrence of the disjunction sign ∨ than does φ. Yet in a Boolean algebra, where a is an element, a ∨ a = a. Further, in the algebra of sentences, where φ and ψ are distinct sentences, the sentence φ ∨ ψ is distinct from the sentence ψ ∨ φ, since although the two sentences are composed of the same signs, the signs occur in different orders. Yet in a Boolean algebra, a ∨ b = b ∨ a.
The trouble with the algebra of sentences is that, like the bore at a party, it makes too many distinctions to be interesting. Its detailed study might be of interest to the casual thrill-seeker who is satisfied with "something new every time," but the practiced seeker of identity in difference demands something more than mere newness. To such a seeker as Boole, the "identity" of two such different sentences as φ ∨ φ and φ, or φ ∨ ψ and ψ ∨ φ, lies in the fact that they express the "same proposition," but this was only understood at such an intuitive level until the 1930s, when Lindenbaum and Tarski made their explication of this insight.

Lindenbaum and Tarski observed that the logistic presentation of the classical propositional calculus could be made to reveal a deeper algebra than the algebra of sentences that it wore on its sleeve. Their trick was to introduce a relation of logical equivalence ≡ upon the class of sentences by defining φ ≡ ψ iff both φ ⊃ ψ and ψ ⊃ φ are theorems. It is easy to show that the relation ≡ is a genuine equivalence relation. Thus reflexivity follows because φ ⊃ φ (self-implication) is a theorem, symmetry follows by definition, and transitivity follows from the fact that whenever φ ⊃ ψ and ψ ⊃ χ are theorems, then φ ⊃ χ is a theorem (the rule form of transitivity). It is interesting to observe that since the classical propositional calculus has a "well-behaved" conjunction connective, i.e., φ & ψ is a theorem iff both φ and ψ are theorems, the same effect may be obtained by defining φ ≡ ψ iff (φ ⊃ ψ) & (ψ ⊃ φ) is a theorem.

It is natural to think of the class of all sentences logically equivalent to φ, which we represent by [φ], as one of Boole's "propositions." Operations are then defined upon these equivalence classes, one corresponding to each logical connective, so that ∼[φ] = [∼φ], [φ] ∨ [ψ] = [φ ∨ ψ], [φ] ∧ [ψ] = [φ & ψ], and [φ] ⊃ [ψ] = [φ ⊃ ψ]. (Observe that in the classical propositional calculus, the last two operations may actually be defined in terms of the first two, since conjunction and material implication may be defined in terms of disjunction and negation.) Since the Replacement Theorem holds for the classical propositional calculus, these operations may be shown to be genuine (single-valued) operations. The point of the Replacement Theorem is to ensure that the result of operating upon equivalence classes does not depend upon our choice of representatives for the classes. Thus, for example, if φ ≡ ψ, then [φ] = [ψ]. But then for the unary operation corresponding to negation to be single-valued, we must have ∼[φ] = ∼[ψ], i.e., [∼φ] = [∼ψ], i.e., ∼φ ≡ ∼ψ, which is just what the Replacement Theorem guarantees us. Let us call the algebra so defined the Lindenbaum algebra of the classical propositional calculus. We follow Rasiowa and Sikorski (1963, p. 245n) in calling this device a Lindenbaum algebra, despite the fact that Tarski first used it in print, for essentially the reasons they give.

It is simply a matter of axiom-chopping to see that this is a Boolean algebra. Thus, for example, it is easy to see that [φ] ∨ [φ] = [φ], even though φ ∨ φ and φ are distinct sentences, for (φ ∨ φ) ⊃ φ is an instance of axiom schema 1, and φ ⊃ (φ ∨ φ) is an instance of axiom schema 2. Similarly, [φ] ∨ [ψ] = [ψ] ∨ [φ] follows from two instances of axiom schema 3. The other laws of a Boolean algebra may be established analogously. Let us observe, as might have been expected, that [φ] ≤ [ψ] iff φ ⊃ ψ is a theorem.
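For classical logic over finitely many atoms, logical equivalence is decidable by truth tables (by the completeness results discussed later in this chapter, the semantic test agrees with the proof-theoretic definition of ≡), so the Lindenbaum identification can be simulated directly. A minimal sketch, with our own encoding of sentences:

```python
from itertools import product

ATOMS = ('p', 'q')

def eval_sent(s, row):
    # Sentences are atoms (strings) or tuples ('not', A) / ('or', A, B).
    if isinstance(s, str):
        return row[s]
    if s[0] == 'not':
        return not eval_sent(s[1], row)
    return eval_sent(s[1], row) or eval_sent(s[2], row)

def truth_table(s):
    # The "proposition" a sentence expresses: its value in every case.
    rows = [dict(zip(ATOMS, vals))
            for vals in product((False, True), repeat=len(ATOMS))]
    return tuple(eval_sent(s, r) for r in rows)

def equiv(s1, s2):
    # phi == psi: the same value in every case.
    return truth_table(s1) == truth_table(s2)

p, q = 'p', 'q'
# [p v p] = [p] and [p v q] = [q v p], although the sentences differ.
assert equiv(('or', p, p), p)
assert equiv(('or', p, q), ('or', q, p))
# Well-definedness: equivalent representatives give equivalent disjunctions.
assert equiv(('or', ('or', p, p), q), ('or', p, q))
```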
The essentials of the Lindenbaum-Tarski method of constructing an algebra out of the classical propositional calculus can be applied to most other well-motivated propositional calculi, and because of the intuitive properties of conjunction and disjunction, most of the resulting Lindenbaum algebras are lattices, indeed, distributive lattices, in the sense of Chapter 3. Various logicians, at various times, have, however, questioned the various principles needed for the construction of a Lindenbaum algebra, and some logicians have even developed logical systems that do not have these principles.
For example, Strawson (1952, p. 15) has cast aspersion on the law of self-implication (though he seems prepared to accept it as a "technical" device). Smiley (1959) has worked out a theory of non-transitive "entailment." Fitch's (1952) system apparently does not have the Replacement Theorem. And there are some systems containing logical connectives for "conjunction" and "disjunction" whose Lindenbaum algebras are not lattices. Thus, for example, both Angell's (1962) system of the "subjunctive conditional" and McCall's (1966) system of "connexive implication" are constructed so that φ & ψ does not always imply φ. The suggestion of a non-distributive logic of quantum mechanics may be found in Birkhoff and von Neumann (1936). But most logics that arise in "real life" are lattices. In particular, the Lindenbaum algebra of Lewis's modal logic S4 is a closure algebra (cf. Chapter 10), and the Lindenbaum algebra of Heyting's intuitionistic logic is a pseudo-Boolean algebra (or a Heyting algebra, as we call it in Chapter 11); cf. McKinsey (1941), McKinsey and Tarski (1948), and Birkhoff (1948, pp. 195-196).

One of the most remarkable features of the reunion of logic and algebra that took place in the 1930s was the discovery that certain non-classical propositional calculi that had captured the interest of logicians had such intimate connections with certain structures that had been developed by algebraists in the context of lattice theory, a generalization of the theory of Boolean algebras that by then stood on its own. An even more striking example of the identification of notions and results that were of independent interest to both logicians and algebraists may be found in Tarski's theory of deductive systems, which was later seen to overlap the Boolean ideal theory of Stone (1936). Apparently Tarski did not realize the algebraic significance of his theory until he read Stone, and conversely, Stone did not realize the logical significance of his theory until he read Tarski (cf. Kiss 1961, pp. 5-6). Intuitively, a deductive system is an extension of a logistic presentation of a propositional calculus (assumed not to have a rule of substitution) that has been obtained by adding additional sentences as axioms (however, Tarski explicitly defined the notion only for the classical propositional calculus). Stone defined a (lattice) ideal (cf. Chapter 3), and at the same time showed that Boolean algebras could be identified with idempotent rings (with identity), the so-called Boolean rings, and that upon this identification the (lattice) ideals were the ordinary ring ideals (exclusive disjunction is the ring addition). This identification was of great importance since the value of ideals in ring theory was already well established, the concept of an ideal having first been developed by Dedekind (1872) as an explication of Kummer's "ideal number," which arose in connection with certain rings of numbers (the algebraic integers). It is a tribute to the powers of abstract algebra that the abstract concept of an ideal can be shown to underlie both certain number-theoretical concepts and certain logical concepts.

The connection between deductive systems and ideals becomes transparent upon the Lindenbaum identification of a sentence with its logical equivalents. Then a deductive system is the dual of an ideal, namely, what is called a filter, and conversely, a filter is a deductive system. Without going into the details of this connection, let us simply remark the analogy between a deductive system and a filter.
Let us assume that F is a set of theorems of some extension of the classical propositional calculus, or of almost any well-known, well-motivated propositional calculus. Then both formal and intuitive
considerations demand that if φ, ψ ∈ F, then φ & ψ ∈ F, which corresponds to property (F1) of our definition of a filter (Definition 3.18.2), and that if φ ∈ F, then φ ∨ ψ ∈ F, which corresponds to our property (F2). It is curious to observe that if we consider the set of refutable sentences, i.e., those sentences whose negations are theorems, then we obtain an ideal in the Lindenbaum algebra. The fact that theorems are more customary objects for logical study than refutables, while at the same time ideals are more customary objects for algebraic study than filters, has led Halmos (1962, p. 22) to conjecture that the logician is the dual of the algebraist. By duality we obtain as a corollary that the algebraist is the dual of the logician.

Upon the Lindenbaum identification of logically equivalent sentences, the filter of theorems of the classical propositional calculus has a particularly simple structure, namely, it is the trivial filter that contains just the 1 of the Boolean algebra that so results. This fact depends upon one of the paradoxes of implication, namely, that if ψ is a theorem, then φ ⊃ ψ is a theorem. This means that all theorems are logically equivalent and hence identified with each other in the same equivalence class, and that any theorem is logically implied by any sentence, and hence this equivalence class of theorems ends up at the top of the Boolean algebra. In short, φ is a theorem iff [φ] = 1. This explicates a notion of Boole's that a proposition a is a logical truth iff a = 1. Since the same paradox of implication is shared by many other propositional calculi, e.g., S4 and intuitionistic logic, this algebraically elegant characterization of theoremhood is widely applicable.

But since in the intensional logics that we shall be studying it is not the case that all theorems are logically equivalent, we shall have to use a different algebraic analog of theoremhood. Note that we can always resort to the inelegant characterization that φ is a theorem iff [φ] is in the Lindenbaum analog of the deductive system based on the logic. This means, in the case of the intensional logics that we shall be studying, that the algebraic analog of the class of theorems is the filter generated by the elements that correspond to the axioms. The same characterization actually holds for the Lindenbaum algebra of the classical propositional calculus, its being but a "lucky accident," so to speak, that this filter is the trivial filter that may hence be thought of as identical with the element 1 that is its sole member. The algebra of intensional logics is thus demonstrably "non-trivial."

So far, we have been discussing the algebra of the syntax of a propositional logic, since the notions of sentence, theorem, etc., by which the Lindenbaum algebra is defined, all ultimately depend only upon the syntactic structure of sequences of signs in the system. But there is another side to logic, namely, semantics, which studies the interpretations of logical systems. Thus, to use a well-known example, to say of the sentence p ∨ ∼p that it is a theorem of the classical propositional calculus is to say something syntactical, whereas to say of p ∨ ∼p that it is a tautology is to say something semantical, since it is to say something about the sentence's interpretations in the ordinary two-valued truth tables, namely, that its value is true under every interpretation. Now we have already discussed an algebraic way of expressing the first fact, namely, we can say that [p ∨ ∼p] = 1.
What we now want is an algebraic way of expressing the second fact. It is well known that the ordinary truth tables may be looked at as the two-element Boolean algebra 2 (where true is 1 and false is 0). This allows us to define
an interpretation into 2 (or any Boolean algebra) as a mapping of the sentences into the Boolean algebra that carries negation into complementation, disjunction into join, etc., all in the obvious way. We can then define a sentence φ as valid with respect to a class of Boolean algebras iff, for every interpretation I into a Boolean algebra in the class, I(φ) = 1. We can define the classical propositional calculus as consistent with respect to a class of Boolean algebras iff every theorem is valid with respect to that class, and as complete with respect to the class iff every sentence that is valid with respect to the class is a theorem. Observe that these definitions coincide with the usual definitions with respect to truth tables when the class of Boolean algebras in question consists of just the single Boolean algebra 2. Observe also that similar definitions may be given for non-classical propositional calculi once the appropriate algebraic analog of theoremhood has been picked out.

It may easily be shown that the classical propositional calculus is both consistent and complete with respect to the class of all Boolean algebras. Thus consistency may be shown in the usual inductive fashion, showing first that the axioms are valid, and then that the rule (modus ponens) preserves validity. Completeness is even more trivial, since it may be immediately seen that if a sentence φ is not a theorem, then if we define for every sentence ψ, I(ψ) = [ψ], then under this interpretation I(φ) ≠ 1. Of course, this completeness result is not as satisfying as the more familiar two-valued result since, among other things, it does not immediately lead to a decision procedure (the Lindenbaum algebra of the classical propositional calculus formulated with an infinite number of atomic sentences not being finite). But it does form the basis for an algebraic proof of the two-valued result. We shall see this after a short digression concerning interpretations and homomorphisms.

The notion of a homomorphism is the algebraic analog of an interpretation. From any interpretation I of the classical propositional calculus into a Boolean algebra B we can define a homomorphism h of the Lindenbaum algebra into B as h([φ]) = I(φ); conversely, from any homomorphism h of the Lindenbaum algebra we can define an interpretation I as I(φ) = h([φ]). The second fact is obvious, but the first fact requires a modicum of proof, which is not without intrinsic interest. What needs to be shown is that the function h is well-defined in the sense that its value for a given equivalence class as argument does not depend upon our choice of a sentence as representative of the equivalence class, i.e., that if φ ≡ ψ, then I(φ) = I(ψ). This amounts to a special case of the semantic consistency result, for what must be shown is that if φ ⊃ ψ is a theorem, then I(φ) ≤ I(ψ), i.e., I(φ) ⊃ I(ψ) = I(φ ⊃ ψ) = 1.

The fact that every interpretation thus determines a homomorphism allows us to observe that the Lindenbaum algebra of the classical propositional calculus formulated with n atomic sentences is the free Boolean algebra with n free generators. Note that it is typical of algebraic logic that no artificial restrictions are placed upon the assumed cardinality of the stock of atomic sentences.
Although there may be very good metaphysical or scientific reasons for thinking that the number of actual or possible physical inscriptions of atomic sentences is at most denumerable, still the proof we are about to sketch is not affected by questions of cardinality.
The proof begins by observing that distinct atomic sentences determine distinct equivalence classes. Let us suppose that the atomic sentences are pᵢ, and that we have a mapping f of their equivalence classes [pᵢ] into a Boolean algebra B. We can then define a new function s from the atomic sentences into B by s(pᵢ) = f([pᵢ]). This function s then inductively determines an interpretation z into B, and the interpretation z in turn determines a homomorphism h of the Lindenbaum algebra into B, as we have just seen.

The situation we have described above is typical of the algebra of logic. We take a logic and form its Lindenbaum algebra (if possible). We then abstract the Lindenbaum algebra's logical structure and find a class of algebras such that the Lindenbaum algebra is free in the class. That the Lindenbaum algebra is in the class then amounts to the logic's completeness, and that it is free in the class amounts to the logic's consistency. The trick is to abstract the Lindenbaum algebra's logical structure in an interesting way. Thus, for example, it is interesting that the Lindenbaum algebra of S4 is free in the class of closure algebras, and it is interesting that the Lindenbaum algebra of intuitionistic logic is free in the class of pseudo-Boolean algebras, because these algebras are rich enough in structure and in applications to be interesting in their own right. Let us remark that it is irrelevant whether the logic or the algebra comes first in the actual historical process of investigation.

Having thus picked an appropriate class of algebras with respect to which the logic may be shown consistent and complete, it is, of course, desirable to obtain a sharper completeness result with respect to some interesting subclass of the class of algebras. One perennially interesting subclass consists of the finite algebras, for then a completeness result leads to a decision procedure for the logic. McKinsey (1941) and McKinsey and Tarski (1948) have obtained such finite completeness results for S4 with respect to closure algebras, and for intuitionistic logic with respect to pseudo-Boolean algebras (cf. Chapters 10 and 11).

It might be appropriate to point out that, due to the typical coincidence of interpretations and homomorphisms, algebraic semantics may be looked at as a kind of algebraic representation theory, representation theory being the study of mappings, especially homomorphisms, between algebras. This being the case, one cannot expect to obtain deep completeness results from the mere hookup of a logic with an appropriate class of algebras unless that class of algebras has an already well-developed representation theory. Of course, the mere hookup can be a tremendous stimulus to the development of a representation theory, as we shall find when we begin our study of the algebra of intensional logics.

We close this section with an example of how a well-developed representation theory can lead to deep completeness results. We shall show how certain representation results for Boolean algebras of Stone (1936), dualizing for the sake of convenience from the way we reported them earlier in this chapter to the way Stone actually stated them, lead to an elegant algebraic proof of the completeness of the classical propositional calculus with respect to 2. Of course, in point of fact the completeness result (with respect to truth tables) was first obtained by Post by a non-algebraic proof using cumbersome normal form methods, but this is irrelevant to the point being made.
We shall show that a sentence φ is valid (in 2) only if it is a theorem by proving the contrapositive. We thus suppose that φ is not a theorem, i.e., that [φ] ≠ 1. By a result of Stone's we know that there is a maximal ideal M in the Lindenbaum algebra such that [φ] ∈ M. But also by a result of Stone's we know that there is a homomorphism h of the Lindenbaum algebra that carries all members of M into 0. Thus h([φ]) = 0. Now, through the connection between homomorphisms and interpretations, we can define an interpretation z into 2 by z(φ) = h([φ]), and thus there is an interpretation z such that z(φ) = 0, i.e., z(φ) ≠ 1, which completes the proof.

Even more remarkable connections between Stone's results and the completeness of the classical propositional calculus with respect to 2 have been obtained. Thus, for example, Henkin (1954a) has essentially shown that Stone's representation theorem for Boolean algebras is directly equivalent to the completeness theorem stated in slightly stronger form than we have stated it; cf. Łoś (1957) for critical modifications.

Let us remark several distinctive features of the "algebraic" proof of the completeness theorem that we have given, features that make it algebraic. It uses not only the language of algebra, but also the results of algebra. The only role the axioms and rules of the classical propositional calculus play in the proof is in showing that the Lindenbaum algebra is a Boolean algebra. Further, the proof is wildly transfinite. By this we mean not only that no assumptions have been made regarding the cardinality of the atomic sentences, but also that a detailed examination of Stone's proof regarding the existence of the essential maximal ideal would reveal that he used the Axiom of Choice. The proof is at the same time wildly non-constructive, for we are given no way to construct the crucial interpretation. A Lindenbaum algebra is thus treated by the same methods as any algebra. Let us say that although there may be philosophical objections to such methods of proof, these objections cannot be directed at just algebraic logic, but instead must be directed at almost the whole of modern algebra.
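In the finitely generated case, Stone's maximal ideal amounts to nothing more than a falsifying row of the truth table, so the contrapositive argument can be watched in miniature. The sketch below (ours, and only a finite stand-in: the transfinite, non-constructive content of Stone's theorem is precisely what it omits) finds, for a non-theorem φ, an interpretation z into 2 with z(φ) ≠ 1:

```python
from itertools import product

def eval_sent(s, z):
    # Sentences: atoms are strings; compounds are ('not', A) or ('or', A, B).
    # z maps atoms into the two-element Boolean algebra 2 = {0, 1}; eval_sent
    # extends it to all sentences, as an interpretation does.
    if isinstance(s, str):
        return z[s]
    if s[0] == 'not':
        return 1 - eval_sent(s[1], z)
    return max(eval_sent(s[1], z), eval_sent(s[2], z))

def atoms(s, acc=None):
    acc = set() if acc is None else acc
    if isinstance(s, str):
        acc.add(s)
    else:
        for part in s[1:]:
            atoms(part, acc)
    return acc

def falsifying_interpretation(s):
    # For a non-theorem, some interpretation into 2 gives it a value
    # other than 1; this row plays the role of Stone's maximal ideal.
    names = sorted(atoms(s))
    for vals in product((0, 1), repeat=len(names)):
        z = dict(zip(names, vals))
        if eval_sent(s, z) != 1:
            return z
    return None  # no falsifying interpretation: the sentence is valid in 2

p, q = 'p', 'q'
assert falsifying_interpretation(('or', p, ('not', p))) is None
assert falsifying_interpretation(('or', p, q)) == {'p': 0, 'q': 0}
```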
2 UNIVERSAL ALGEBRA

2.1 Introduction
In this chapter, we present some very general ideas from the field initiated in its modern form by Garrett Birkhoff and called by him "universal algebra." The fundamental notion of universal algebra is that of an algebra. Basically, an algebra is a set together with various operations that take elements of that set and yield elements of that same set. A simple, and familiar, example is the set of natural numbers together with the operations of addition and multiplication. In particular, any two natural numbers can be added (or multiplied), and the result is moreover a natural number. By contrast, the natural numbers together with the operation of subtraction do not form an algebra; for although the difference of any two natural numbers exists, it need not be a natural number. A more dramatic non-example is the set of natural numbers together with the operation of division; in particular, there is no such thing as 1 divided by 0. In the first non-example, the operation can yield a result not in the set; in the second non-example, the operation can yield no result at all. Notwithstanding its name, algebraic logic is not concerned exclusively with algebras in the strict sense of universal algebra, but is rather concerned with more general structures, which we describe in the next section.
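The closure condition is easy to test by machine, at least on samples. In the following sketch (ours; a failed sample refutes closure, though passing samples cannot prove closure for an infinite set), subtraction escapes the natural numbers while addition and multiplication do not:

```python
def is_natural(x):
    return isinstance(x, int) and x >= 0

def closed_on_samples(op, pairs, member):
    # A set forms an algebra under an operation only if it is closed
    # under it; a single escaping pair already refutes closure.
    return all(member(op(a, b)) for a, b in pairs)

pairs = [(a, b) for a in range(10) for b in range(10)]
assert closed_on_samples(lambda a, b: a + b, pairs, is_natural)      # addition stays inside
assert closed_on_samples(lambda a, b: a * b, pairs, is_natural)      # so does multiplication
assert not closed_on_samples(lambda a, b: a - b, pairs, is_natural)  # 0 - 1 = -1 escapes
```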
2.2 Relational and Operational Structures (Algebras)
By a relation on a set A, we mean an n-place relation on A, for some natural number n. An n-place relation on A, also called a relation on A of degree n, is simply a set of n-tuples of elements of A. For example, the less-than relation is a two-place (degree 2) relation on the set of natural numbers.

A relational structure is, by definition, a set A together with a family (Rᵢ) of relations on A. Each relation Rᵢ has a degree, so there is an associated family (dᵢ) of degrees, dᵢ being the degree of relation Rᵢ. The latter family is called the type of the relational structure. For example, a relational structure of type (1, 2) consists of a one-place relation and a two-place relation. (A one-place "relation" corresponds to a property.) Two relational structures are said to be similar if they have the same type. For example, all relational structures that consist solely of a two-place relation are similar to one another. We shall follow the standard convention of using boldface, e.g., A, to denote the structure, and italic, e.g., A, to denote the underlying carrier set.
Having defined relational structures, we now define algebras, which are defined to be operational structures, and we remark that operational structures may be regarded as special sorts of relational structures.

By an operation on a set A, we mean an n-place operation on A for some natural number n. An n-place operation on A is a function that assigns an element of A to every n-tuple of elements of A. An n-place operation is also said to be an operation of degree n. For example, addition is a two-place (degree 2) operation on the set of natural numbers; in particular, it assigns a natural number to every 2-tuple (pair) of natural numbers.

An algebra is defined to be an operational structure. An operational structure is, by definition, a set A together with a family (Oᵢ) of operations on A. As with relational structures, every operational structure has a type (dᵢ), which is the family of degrees of all the operations making up that structure. Likewise, two operational structures are said to be similar if they have the same type. For example, the algebra of numerical multiplication is similar to the algebra of numerical addition, both being algebras of type (2).

At this point, it is useful to remark that, given any operation O on a set A, of degree n, there is a naturally associated relation R on A, of degree n + 1. For example, the relation naturally associated with addition consists of precisely those 3-tuples (x, y, z) for which z = x + y. On the other hand, strictly speaking, the operation of addition consists of ordered pairs ((x, y), z) for which z = x + y. In most actual mathematical applications, the difference between (x, y, z) and ((x, y), z) is practically insignificant. It is accordingly convenient, in these applications, to regard an operation and its associated relation as one and the same mathematical object. This practical simplification allows us to regard operations as special sorts of relations, and operational structures (algebras) as special sorts of relational structures. It also allows us to include mixed structures (those consisting of both relations and operations) under the same general rubric. In fact, as it turns out, the mathematical structures that are central to the "algebraic" investigation of logic are mixed structures.

There is, however, a potential problem in making this simplification. Specifically, according to the above account, an algebra has both an operational type and a relational type, which are distinct. Accordingly, in comparing structures with one another, it is important that we compare the correct types. In general, this can be accomplished simply by making all comparisons using the relational types. On the other hand, it is considerably more convenient to describe and compare algebras using operational types. In general, whenever we refer to the type of an algebra, we mean its operational type.
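The passage from an n-place operation to its associated (n + 1)-place relation is entirely mechanical, as the following sketch (ours; the helper name graph_of is an arbitrary choice) illustrates on a small carrier:

```python
from itertools import product

def graph_of(op, carrier, degree):
    # The (degree + 1)-place relation associated with a degree-n operation:
    # all tuples (x1, ..., xn, z) with z = op(x1, ..., xn).
    return {args + (op(*args),) for args in product(carrier, repeat=degree)}

# Addition modulo 3 (chosen so the small carrier is closed), viewed as a
# 3-place relation.
rel = graph_of(lambda x, y: (x + y) % 3, range(3), 2)
assert (1, 2, 0) in rel and (1, 2, 3) not in rel
```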
2.3 Subrelational Structures and Subalgebras
Relational structures (and algebras) can stand in various relations to one another. We have already described one such relation, similarity. In this section, we describe another simple relation, the structural part-whole relation.

The basic intuition is that one structure is a part of another structure if the carrier set of the first is a subset of the carrier set of the second, and moreover the relations
(operations) on the first are "the same" as those on the second. For example, the natural numbers with the less-than relation are a structural part of the integers with the less-than relation. The same thing can be said about the corresponding additive and multiplicative algebras: the algebra of natural numbers is a structural part of the algebra of integers.

A minor technical difficulty attaches to the notion of "the same" in the above paragraph. For example, the less-than relation on the natural numbers is, strictly speaking, not identical to the less-than relation on the integers, simply because they are different sets of ordered pairs. Accordingly, we must provide a precise rendering of the intuitive concept of sameness that is employed in the above paragraph. This is accomplished by providing a formal definition of the structural part-whole relation.

Definition 2.3.1 Let A = (A, (Rᵢ)) and B = (B, (Sᵢ)) be similar relational structures. Then A is said to be a structural part of B if the following conditions are satisfied:

(1) A ⊆ B.
(2) (a₁, …, aₙ) ∈ Rᵢ iff a₁, …, aₙ ∈ A and (a₁, …, aₙ) ∈ Sᵢ.
Condition (1) states the obvious requirement that every element in the substructure is an element of the superstructure. Condition (2) states the further requirement that an n-tuple of elements stands in the subrelation Rᵢ iff its members are elements of A and they stand in the superrelation Sᵢ. This is customarily described by saying that Rᵢ is Sᵢ restricted to A.

The above definition can be applied to algebras, regarded as special sorts of relational structures, in which case we obtain the notion of subalgebra. However, there is a potential source of confusion. Given a relational structure (A, (Rᵢ)), whether an algebra or not, every subset B of A forms a subrelational structure of A, but not every subset forms a subalgebra of A. In order for a subset to form a subalgebra of A, it must additionally be closed under all the operations of A, which is just to say that it must form an algebra in its own right.

As an example, consider the algebra of integers with negation as the sole operation. If we narrow down to the set N of natural numbers, then we obtain a subrelational structure, but we do not obtain a subalgebra. This is because N is not closed under the operation of negation (the negative of a natural number is not itself a natural number). On the other hand, the set {-2, -1, 0, 1, 2} is closed under negation, and accordingly forms a subalgebra. The following is our official definition of subalgebras.

Definition 2.3.2 Let A and B be similar algebras with operations (Oᵢ) and (Pᵢ) respectively. Then A is said to be a subalgebra of B (in symbols, A « B) if the following conditions are satisfied:
(1) A ⊆ B.
(2) If a₁, ..., aₙ ∈ A, then Oᵢ(a₁, ..., aₙ) = Pᵢ(a₁, ..., aₙ).
Condition (1) states the obvious requirement. Condition (2) says that every suboperation Oᵢ is the restriction to A of the associated superoperation Pᵢ. As we have stated it, there is also an implicit background condition, namely, the requirement that the carrier
set of the subalgebra is closed under the operations. This must, of course, be fulfilled in order for A to be an algebra in its own right. After all, we only say that one structure is a subalgebra of another if they are both algebras. On the other hand, it might be useful to talk about subrelational structures of algebras that are not themselves algebras. There is no generally employed special term for such a structure.
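The closure requirement can be illustrated computationally. The following sketch (ours, with invented names, and assuming a finite test set) checks whether a subset of a carrier is closed under a family of operations:

from itertools import product

def is_closed(subset, operations):
    # operations is a list of (degree, function) pairs
    for degree, op in operations:
        for args in product(subset, repeat=degree):
            if op(*args) not in subset:
                return False
    return True

negation = (1, lambda x: -x)                       # the one-place operation x -> -x
print(is_closed({-2, -1, 0, 1, 2}, [negation]))    # True: a subalgebra of the integers
print(is_closed({0, 1, 2, 3, 4}, [negation]))      # False: -1 is not in the set

This reproduces the example above: {-2, -1, 0, 1, 2} is closed under negation, while an initial segment of the natural numbers is not.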
2.4 Intersection, Generators, and Induction from Generators
As we have seen in the previous section, the set-inclusion relation has an algebraic counterpart, the structural part-whole relation. There is another set-theoretic notion that has a natural algebraic counterpart: intersection. Just as we can form the set-theoretic intersection of a family of sets, we can also form the algebraic intersection of a family of algebras, at least under carefully circumscribed conditions, described in the following definition.

Definition 2.4.1 Let B be any algebra, and let (Aᵢ) be any non-empty family of subalgebras of B. Then the intersection of (Aᵢ) is the unique subalgebra C of B satisfying the following condition: C = ∩Aᵢ.
In other words, one forms the intersection of the family (Aᵢ) of algebras by taking the set-theoretic intersection of the family (Aᵢ) of affiliated carrier sets. This yields the carrier set C of the intersection algebra C; the operations of C are then defined simply to be the restrictions to C of the operations on the original algebra B. One question remains: does the intersection algebra, so defined, always exist? It does, the proof being left as an exercise for the reader.

The fact that one can always form the intersection of any non-empty family of subalgebras enables us to deal with the very important algebraic notion of generation. The intuition goes as follows. Consider an algebra A and a subset S of A. If S is closed under the operations on A, then it forms a subalgebra of A. Otherwise it does not form a subalgebra. If S is not closed, we would like to find the minimal way of closing it. This presumably involves adding to S just as many elements as needed (but no more!) to ensure that the resulting enhanced set S* is closed under the operations of A. What is crucial here is the idea that we add no more elements than we absolutely have to.

By following the intuitive procedure described in the previous paragraph, we construct the subalgebra generated by S. The difficulty with the intuition is that the minimality notion lacks mathematical precision, which we now attempt to correct in the more or less standard manner. First of all, as a small bit of natural terminology, we say that an algebra A contains a set S if the carrier set A set-theoretically contains S.

Example 2.4.2 Consider the simple algebra of integers that has forming the negative as its sole operation. As remarked earlier, the set N of natural numbers does not form a subalgebra, because it is not closed under the negation operation. To form the subalgebra generated by N, one simply adds just enough integers to turn it into a subalgebra. Since 1 ∈ N, the negative of 1 has to be added; likewise with 2, 3, 4, .... In fact, as it turns out, one has to throw in every negative number, so the resulting set contains every integer.
Thus, the subalgebra generated by N is the whole algebra. An example of a smaller subalgebra is the subalgebra generated by the set {1, 2, 3}, which consists of {-3, -2, -1, 1, 2, 3}.
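When the closure is finite, the generated subalgebra can be computed mechanically. The following sketch (ours; it would not terminate if the generated carrier were infinite, as with N under negation) iterates the operations until nothing new appears:

from itertools import product

def generated_carrier(generators, operations):
    closure = set(generators)
    while True:
        new = {op(*args)
               for degree, op in operations
               for args in product(closure, repeat=degree)} - closure
        if not new:
            return closure
        closure |= new

negation = (1, lambda x: -x)
print(sorted(generated_carrier({1, 2, 3}, [negation])))   # [-3, -2, -1, 1, 2, 3]

The output agrees with Example 2.4.2: the set {1, 2, 3} generates {-3, -2, -1, 1, 2, 3}.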
Definition 2.4.3 The subalgebra of A generated by S is the smallest subalgebra of A containing S, which is to say that it is the unique algebra C satisfying the following conditions:
(1) C « A.
(2) S ⊆ C.
(3) For every B, if B « A and S ⊆ B, then C « B.
The subalgebra of A generated by S is sometimes denoted A(S). We have defined A(S), but we have not demonstrated that it in fact exists. However, our worries are quickly allayed, since the existence of A(S) is ensured by the theorem stating that the intersection of any non-empty family of subalgebras of an algebra is itself a subalgebra. In particular, one can show that A(S) is just the intersection of the set of all subalgebras of A containing S. This is left as an exercise for the reader.

Let A be an algebra, and let G be any subset of A. Then G is said to generate A if A(G) = A; alternatively, G is said to be a set of generators for A. Every algebra has at least one set of generators (its carrier set), and most algebras have many different sets of generators. The notion of generation is important because we can use it to prove statements by induction from generators, or simply induction. In particular, if one wants to demonstrate that every element of an algebra A has a particular property P, it is sufficient to find a non-empty set G of generators for A, and show the following:

(BC) For all x ∈ G, x has P.
(IC) For every operation O of A (of degree k), if a₁, ..., aₖ ∈ A all have P, then so does O(a₁, ..., aₖ).

(BC) is customarily called the base case, and (IC) is customarily called the inductive case. To see that this procedure actually works, consider an algebra A with generators G, and suppose that both (BC) and (IC) obtain. Next, consider the set A(P) = {x ∈ A : x has P} of those elements of A that have property P. Since (BC) is true, G is a subset of A(P), and since (IC) is true, A(P) forms a subalgebra of A. So A(P) is a subalgebra of A containing G. But A(G) is the smallest such subalgebra, by definition, so A(G) ⊆ A(P). But A(G) = A, since G is a set of generators for A. Thus A ⊆ A(P), which is to say that every element of A has property P.

As a special case of this general principle of induction, consider the algebra of natural numbers with the successor operator as the sole operation. One set of generators is the singleton {0}; so in order to demonstrate that every number has a property P, it is sufficient to show that 0 has P, and that if a number has P so does its successor. This, of course, is the familiar form of mathematical induction.
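On a finite algebra, the induction principle can even be checked mechanically: verify (BC) on the generators, verify each instance of (IC) as the closure is built, and conclude that every generated element has P. A small sketch (ours, using an arbitrarily chosen example: the even residues of Z6 under addition mod 6):

from itertools import product

def induction_check(generators, operations, P):
    assert all(P(g) for g in generators)            # (BC): every generator has P
    closure = set(generators)
    while True:
        new = set()
        for degree, op in operations:
            for args in product(closure, repeat=degree):
                value = op(*args)
                assert P(value)                     # an instance of (IC)
                new.add(value)
        if new <= closure:
            return closure                          # all generated elements have P
        closure |= new

add_mod_6 = (2, lambda x, y: (x + y) % 6)
evens = induction_check({2}, [add_mod_6], lambda x: x % 2 == 0)
print(sorted(evens))   # [0, 2, 4]: every element of the generated subalgebra is even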
2.5 Homomorphisms and Isomorphisms
The concept of isomorphism is common in philosophy and mathematics alike, the basic intuition being that two objects are isomorphic if they have the same structure (or literally, the same form). For example, the sequence of (positive) numbers is isomorphic to the sequence of annual time segments (years) from AD 1 ad infinitum. In fact, this isomorphism is so natural as to be virtually invisible in the way we speak and write.

Now, the intuitive notion of isomorphism is a special case of a more general notion: the notion of homomorphism and homomorphic image. We can say that one object has the same form as another object, that they are isomorphic. We can also say that one object is an "image" of, or that it is "reflective" of, another object. Although this notion has something to do with form, it is nevertheless distinct from the notion of isomorphism. For example, a photographic image is not isomorphic to its subject, even in the ideal, for at least the following reasons: (1) the image is two-dimensional, whereas the subject is three-dimensional; (2) the image depicts only the surface of the subject, whereas the subject presumably has inner detail, not conveyed by the image; (3) the image may be in black and white, whereas the subject is presumably "in color."

Not wishing to be side-tracked by issues concerning the representational nature of photography, or the exact relation between an image and its subject, let us simply propose photography as an intuitive example of structural reflection. One intuition that we wish to emphasize here is that of compression; going from the subject to its image involves (so to speak) compressing the three-dimensional subject into two dimensions, a technique that is familiar in engineering drawing. Indeed, sometimes the compression process is profound, as entire galaxies might be dots on the photographic plate. However, not just any compression is acceptable; it must somehow transfer the relevant structure of the subject to the corresponding image. In other words, the image must structurally reflect the subject, or to state matters in fairly standard mathematical terms, the image must be a homomorphic image of the subject. We are now in a position to provide formal definitions of homomorphism and isomorphism.
Definition 2.5.1 Let A and B be similar relational structures, with relations (Rᵢ) and (Sᵢ) respectively. A homomorphism from A to B is any function m from A into B satisfying the following condition, for each i:
(ST) If (a₁, ..., aₙ) ∈ Rᵢ, then (m(a₁), ..., m(aₙ)) ∈ Sᵢ.
Definition 2.5.2 A relational structure B is said to be a homomorphic image of A if there exists a homomorphism from A to B that is onto B (in symbols, B = h*(A)). (A function h maps A onto B if for every b ∈ B there is an a ∈ A such that h(a) = b.)

(ST) requires that the structure of the source be transferred to its image. In addition to this requirement, we may require for certain purposes that the image be faithful to the source. As it turns out, there are a number of different notions of fidelity, which we now examine formally. As a general term for a homomorphism of any kind we will sometimes use the term "morphism." The first notion, absolute fidelity, is the strongest.
Definition 2.5.3 A homomorphism m from A to B is said to be absolutely faithful if it satisfies the following condition, for each i:
(AF) If (m(a₁), ..., m(aₙ)) ∈ Sᵢ, then (a₁, ..., aₙ) ∈ Rᵢ.
If m is a homomorphism from A to B, then objects in A stand in relation Rᵢ only if their images in B stand in the counterpart relation Sᵢ. If m is additionally absolutely faithful, then if objects in A do not bear Rᵢ, their images in B do not bear the corresponding relation Sᵢ.

Having described the strongest notion of fidelity, which we call absolute fidelity, we now examine the weakest such notion, which we call minimal fidelity. As remarked before, fidelity involves the idea that the image involves no gratuitous structure, but rather derives its structure from the source. One possible formal expression of this idea is given as follows.
Definition 2.5.4 Let A and B be two similar relational structures with relations (Rᵢ) and (Sᵢ) respectively, and let m be a homomorphism from A to B. Then m is said to be minimally faithful if it satisfies the following condition:
(MF) If b₁, ..., bₙ are in the range of m, then if (b₁, ..., bₙ) ∈ Sᵢ, then there are elements a₁, ..., aₙ ∈ A such that m(a₁) = b₁, ..., m(aₙ) = bₙ, and (a₁, ..., aₙ) ∈ Rᵢ.

In other words, the condition (MF) of minimal fidelity requires that elements in the range of a homomorphism stand in a relation Sᵢ only if they are images of elements in the source that stand in the corresponding relation Rᵢ. This means that the structure of the image is the absolute minimum required in order to satisfy the requirement that the structure of the source be transferred to the image. Beyond the structure required by the structural transfer condition, the image has no further structure.

Between the weakest and strongest notions of fidelity, there are a number of intermediate notions, one of which is important, since it is the notion of fidelity that emerges in the context of homomorphisms of algebras. This condition is not as easily motivated on intuitive grounds; rather, its motivation comes largely from mathematical considerations. The mathematical criterion we consider is this: what is the minimal restriction on general morphisms that ensures that the image of every operational structure is itself an operational structure? As it turns out, the minimal restriction is what we call strong fidelity, which is formally defined as follows.
Definition 2.5.5 Let A and B be two similar relational structures with relations (Rᵢ) and (Sᵢ) respectively, and let m be a homomorphism from A to B. Then m is said to be strongly faithful if it satisfies the following condition:
(SF) If (m(a₁), ..., m(aₙ)) ∈ Sᵢ, then there is an a ∈ A such that (a₁, ..., aₙ₋₁, a) ∈ Rᵢ and m(a) = m(aₙ).

This condition, (SF), is fairly abstract and unintuitive as it is stated. It will, we think, be fairly easy to understand if we describe it for the special case of two simple binary structures. In that case, (SF) reduces to the following.
(SF′) If m(a₁)Sm(a₂), then there is an a ∈ A such that a₁Ra and m(a) = m(a₂).
So suppose that you have elements b₁ and b₂ in the range of m, and suppose they stand in the relation S. Minimal fidelity requires that at least one element whose image is b₁ stands in relation R to at least one element whose image is b₂. Absolute fidelity requires that every element whose image is b₁ bears the relation R to every element whose image is b₂. Strong fidelity goes a little further than minimal fidelity, but it does not go so far as absolute fidelity. It requires that every element whose image is b₁ bears the relation R to at least one element whose image is b₂.

We can also define the notions of dual-strong fidelity and symmetrical-strong fidelity. In the case of the former, we require the following: for all b₁, b₂ in the range of m, if b₁ bears S to b₂, then for every element x whose image is b₂ there is at least one element y whose image is b₁ such that y bears R to x. On the other hand, symmetrical-strong fidelity is the combination of strong and dual-strong fidelity.
Fact 2.5.6 Let A and B be two similar operational structures. An (operational) homomorphism from A onto B is a strong (relational) homomorphism from A to B, if A and B are viewed as relational rather than as operational structures (with n-ary operations represented by (n + 1)-ary relations in the standard way).
The official definition of a homomorphism on algebras is given in 2.5.9 below, but Fact 2.5.6 captures the motivation behind focusing on the notion of strong fidelity. Having defined homomorphism, we now state the formal definition of isomorphism and isomorphic.
Definition 2.5.7 A homomorphism h from A to B is said to be an isomorphism from A to B (between A and B) if it satisfies the following conditions:
(1) h is one-one.
(2) h is onto.
Definition 2.5.8 Two relational structures are said to be isomorphic if there exists an isomorphism between them.
Notice that the isomorphism relation is a special case of the homomorphic image relation: if A is isomorphic to B, then A is a homomorphic image of B, and B is a homomorphic image of A. Now, suppose that A and B are homomorphic images of each other. Does it follow that they are isomorphic? This problem is left as an exercise for the reader; if it does not follow, give a counter-example; if it does, prove it.

Before continuing, it might be useful to survey some standard terminology. A homomorphism that is one-one is often called a monomorphism, and a homomorphism that is onto is often called an epimorphism. Thus, an isomorphism is both a monomorphism and an epimorphism. Another term that is often employed is "embedding," which is simply another name for monomorphism. In the special case that A and B are identical, we have special terminology. In particular, a homomorphism from A to A is called an endomorphism, and an isomorphism from A to A is called an automorphism. ("Interesting" mathematical structures have many different automorphisms, including of course the identity function.) Note carefully that an endomorphism (automorphism) is not simply a homomorphism (isomorphism) whose domain and range are the same. Besides the
carrier set, the relations (i.e., the structure) must be identical. Different structures can be founded on the very same carrier set.

Although the notions of homomorphism and isomorphism are easier to understand intuitively in the context of relational structures, their primary importance arises in the context of algebras. In particular, many of the functions used in logic turn out to be algebraic homomorphisms. It is accordingly important to expand our intuitions to include algebras.

Before giving the most general definition for arbitrary algebras, let us consider simple binary algebras, i.e., algebras of type (2). For example, consider two numerical algebras. The first consists of the set N of natural numbers together with the standard operation of addition. The second consists of just the numbers 0 and 1 together with the slightly non-standard modulus addition operation (defined so that 0 + 0 = 1 + 1 = 0 and 0 + 1 = 1 + 0 = 1; recall that, in order for {0, 1} to form an algebra under the operation +, 1 + 1 must be in the set). Now consider any function h from N into {0, 1}. To say that h is a homomorphism from (N, +) into ({0, 1}, +) is to say that it satisfies either of the following equivalent conditions:
(1) If z = x + y, then h(z) = h(x) + h(y).
(1*) h(x + y) = h(x) + h(y).

Remember that the '+' on the right refers to modulus addition, for which 1 + 1 = 0. Are there any homomorphisms in the above sense? Yes; consider the function h that assigns 0 to every even number and 1 to every odd number. The reader should show that h is a homomorphism.
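A quick machine check (our sketch; a finite sample, not a proof) confirms the homomorphism condition (1*) for the parity function:

h = lambda x: x % 2                       # 0 for even numbers, 1 for odd
add_mod_2 = lambda x, y: (x + y) % 2      # the modulus addition of the image algebra

print(all(h(x + y) == add_mod_2(h(x), h(y))
          for x in range(100) for y in range(100)))   # True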
As another example, consider two numerical algebras, one based on multiplication, the other on addition. To say that a function h from N into N is a homomorphism from (N, ×) to (N, +) is to say that it satisfies either of the following equivalent conditions.
(2) If z = x × y, then h(z) = h(x) + h(y).
(2*) h(x × y) = h(x) + h(y).

It is easy to see that there are not many such functions; in fact, there is only one: h(x) = 0 for all x. Thus, we see that, in this example, there are no interesting homomorphisms. We have now seen how homomorphisms work in the case of simple binary algebras. What remains is to provide the general definition for arbitrary algebras.

Definition 2.5.9 Let A and B be similar algebras with operations (Oᵢ) and (Pᵢ) respectively, and let h be any function from A into B. Then h is said to be a homomorphism from A to B if it satisfies either of the following conditions:
(1) If y = Oᵢ(x₁, ..., xₙ), then h(y) = Pᵢ(h(x₁), ..., h(xₙ)).
(1*) h(Oᵢ(x₁, ..., xₙ)) = Pᵢ(h(x₁), ..., h(xₙ)).

We have defined homomorphisms for relational structures and for operational structures separately, even though, as we have mentioned, every operational structure may be regarded as a special sort of relational structure. The question that naturally arises is whether our definitions are consistent with each other. We begin by noting that when we restrict the notion of morphism to operational structures we obtain operational homomorphisms. On the other hand, there are morphisms from operational structures to relational structures that are not operational. However, the two following exercises emphasize important observations about the relationship of different kinds of homomorphisms.

Exercise 2.5.10 If there is a strong relational homomorphism from A onto B, and A is operational, then B must also be operational. Show that this claim is true.

Exercise 2.5.11 If A is an operational structure, and h is a one-one strong relational homomorphism, then h is automatically absolutely faithful. Prove that this is true.

As we have noted previously, B is a homomorphic image of A if there is a homomorphism from A onto B. If A is an algebra, then so is B. However, suppose that h is a homomorphism but not onto. In this case, we can examine the range of h, {h(x) : x ∈ A}. This is obviously a subset of B. What is, perhaps, more interesting is that this set forms a subalgebra of B; one need merely show that it is closed under the operations on B, which is left as an exercise for the reader. Thus, we have an important result.

Theorem 2.5.12 The image of any algebra under any homomorphism is itself an algebra of the same type. Alternatively stated, any homomorphic image of any algebra is also an algebra.

We conclude this section with a small observation which is frequently used in logic; once again we leave it to the reader to verify the claim.

Exercise 2.5.13 If there is a homomorphism from the algebra A to algebra B, and there is a homomorphism from algebra B to algebra C, then there is a homomorphism from A to C. (Hint: Show that the function composition of two homomorphisms is a homomorphism.)
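Theorem 2.5.12 is easy to watch in action on a finite example (a sketch of ours, with invented names): take the endomorphism h(x) = 4x mod 12 of (Z12, +mod12); its range is a proper subset of the carrier, yet that range is closed under the operation and so forms a subalgebra.

add_mod_12 = lambda x, y: (x + y) % 12
h = lambda x: (4 * x) % 12                 # a homomorphism from (Z12, +) into itself

image = {h(x) for x in range(12)}          # {0, 4, 8}
print(all(add_mod_12(a, b) in image
          for a in image for b in image))  # True: the image is a subalgebra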
2.6 Congruence Relations and Quotient Algebras
Recall that an equivalence relation on a set A is any binary relation ≡ on A satisfying the following conditions, for all a, b, c in A:
(E1) a ≡ a.
(E2) If a ≡ b, then b ≡ a.
(E3) If a ≡ b, and b ≡ c, then a ≡ c.

Alternatively stated, a relation on A is an equivalence relation if and only if it is reflexive (E1), symmetric (E2), and transitive (E3). Associated with this is the notion of a partition. Recall that a partition of a set A is any collection K of non-empty subsets of A satisfying the following conditions:
(P1) If X, Y ∈ K and X ≠ Y, then X ∩ Y = ∅.
(P2) For all a ∈ A, there is an X ∈ K such that a ∈ X.

In other words, a partition of A is a collection of subsets of A that are mutually exclusive (P1) and jointly exhaustive (P2).
There is an intimate relation between equivalence relations and partitions on a set. Every equivalence relation determines an associated partition; conversely, every partition determines an associated equivalence relation. One way to form an equivalence relation, and hence a partition, on a set A is to define a function f from A into some other set. One then defines the associated relation ≡ so that x ≡ y iff f(x) = f(y); it is easy to show that ≡, so defined, is in fact an equivalence relation. (The reader should show this as an exercise!) Not only does every function from A determine an equivalence relation on A, the other direction also holds: given an equivalence relation on A, one can always find at least one function that determines that relation. This may be shown as an exercise.

An interesting way of looking at the correspondence between an equivalence relation and its associated partition comes by considering the classical philosophical problem of the one and the many. Given many objects that are the same, is there one property (universal) that they all share? For example, given a class of similar geometrical figures (say squares), is there one property that they all share (say, squareness)? This is sometimes related to the supposed psychological process of abstraction, and accordingly the following is often called the principle of abstraction. Given a fairly loose construal of properties, the principle is almost disappointingly trivial, since given any equivalence relation ≡ (the relation of sameness) we can always say that the property which two objects a and b share when a ≡ b (when they are the same) is that of being members of the same equivalence class.

The general class-theoretic notions of equivalence relation and partition have algebraic counterparts, being respectively the notions of congruence relation and quotient algebra, which we now examine. As noted above, every function determines an equivalence relation. Since the algebraic counterpart of a function is a homomorphism, it is natural to ask what sort of equivalence relation is determined by a homomorphism. For the sake of illustration, let us consider two simple binary algebras, A and B, where the respective operations are ambiguously denoted by '+'. Let h be a homomorphism from A into B, and let ≡ be the associated equivalence relation on A. Now suppose that elements a, b of A are equivalent modulo h, which is to say that h(a) = h(b), and let us further suppose that elements c, d are equivalent modulo h. Granting this, we wish to show that the elements a + c and b + d are also equivalent modulo h, which is to say that h(a + c) = h(b + d). But h is a homomorphism, by hypothesis, so h(a + c) = h(a) + h(c), and h(b + d) = h(b) + h(d); but h(a) = h(b) and h(c) = h(d), by hypothesis, so h(a + c) = h(b + d), and we are done. In terms of the relation ≡, this result can be stated as follows:

(C) If a ≡ b and c ≡ d, then a + c ≡ b + d.

This can be decomposed into two parts, as follows:
(C1) If a ≡ b, then a + c ≡ b + c.
(C2) If a ≡ b, then c + a ≡ c + b.

These amount to the familiar algebraic principle that substituting "equals" for "equals" yields "equals." (Of course, equality need not be logical identity here.)
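The congruence condition (C) can be spot-checked for the parity homomorphism of the previous section (our sketch; an exhaustive check over a small sample rather than a proof):

h = lambda x: x % 2
equiv = lambda a, b: h(a) == h(b)      # a ≡ b iff h(a) = h(b)

sample = range(10)
print(all(equiv(a + c, b + d)
          for a in sample for b in sample
          for c in sample for d in sample
          if equiv(a, b) and equiv(c, d)))   # True: (C) holds on the sample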
Principle (C) is a special case of the general notion of a congruence relation on an algebra, which is defined as follows.
Definition 2.6.1 Let A be any algebra with operations (Oᵢ) and let ≡ be any equivalence relation on A. Then ≡ is said to be a congruence relation on A if it satisfies the following condition, for all i:
(RP) If a₁ ≡ b₁ and ... and aₙ ≡ bₙ, then Oᵢ(a₁, ..., aₙ) ≡ Oᵢ(b₁, ..., bₙ).

Condition (RP) is often called the replacement property; basically, (RP) says that, from the viewpoint of the operations on the algebra A, ≡ behaves exactly like the identity relation. This definition applies only to algebras, so we still require a definition for general relational structures. This is given as follows.
Definition 2.6.2 Let A be any relational structure with relations (Rᵢ), and let ≡ be any equivalence relation on A. Then ≡ is said to be a congruence relation on A if it satisfies the following condition, for all i:
(RP*) If a₁ ≡ b₁ and ... and aₙ ≡ bₙ, and (a₁, ..., aₙ, x) ∈ Rᵢ, then there exists y such that x ≡ y and (b₁, ..., bₙ, y) ∈ Rᵢ.
We first observe that, if the relational structure in question is in fact an operational structure, then condition (RP*) reduces to condition (RP) of the previous definition. Next, we note what this definition looks like for the special case of a simple binary relational structure with relation R. Euclid endorsed a special case of (RP) with his famous postulate: "if equals are added to equals, the results are equal." If we interpret "equals" as not necessarily "really identical," but rather identical in a range of relevant respects, then we get a special case of (RP):

(RP_Euclid) If a ≡ c and b ≡ d, then a + b ≡ c + d.

Note that (RP) might be called "global replacement," since it allows for replacement of as many variables as one wants, all at the same time. But many "one-at-a-time replacements" have the same effect:
Proposition 2.6.3 (RP) can be rephrased equivalently as a bunch of "one-at-a-time replacements":

(RPₙ) If a ≡ b, then Oᵢ(c₁, c₂, ..., a) ≡ Oᵢ(c₁, c₂, ..., b).
Proof These are each a special case of (RP) (since for each cᵢ, cᵢ ≡ cᵢ by virtue of reflexivity); taken jointly they imply (RP), since if we have all of a₁ ≡ b₁, ..., aₙ ≡ bₙ, then we can replace the terms one at a time, to obtain:

Oᵢ(a₁, a₂, ..., aₙ) ≡ Oᵢ(b₁, a₂, ..., aₙ) ≡ Oᵢ(b₁, b₂, ..., aₙ) ≡ ... ≡ Oᵢ(b₁, b₂, ..., bₙ). □

We shall refer to the one-at-a-time replacement rule as (RP^local). (RP) can be called "atomic replacement" since it only allows for replacement in atomic terms. It can be strengthened without loss so as to involve replacement not just in atomic terms but in all terms. We shall call the first "atomic replacement," and the second "complex replacement," denoting them respectively by (RP_atomic) and (RP_complex). (RP_atomic) is of course just (RP). Phrased in terms of local replacement, this means that if a ≡ b, then for any term t, t(..., a, ...) ≡ t(..., b, ...). There is a little trick of notation here: writing the term as t(..., a, ...) simply displays one occurrence of a, the occurrence at which the replacement takes place.

Throwing in the "global" versus "local" distinction gives us the four alternatives (RP_atomic^local), (RP_atomic^global), (RP_complex^local), and (RP_complex^global). (RP_atomic^global) is of course just (RP).

Theorem 2.6.5 The following four forms of the replacement rules are all equivalent: (RP_atomic^local), (RP_atomic^global), (RP_complex^local), and (RP_complex^global).

Similar facts hold about (RP*), in terms of there being four options in terms of the alternatives local versus global replacement, and atomic versus complex replacement.

Exercise 2.6.6 State formally all of the above replacement rules [(RP*_atomic^local), (RP*_atomic^global), (RP*_complex^local), (RP*_complex^global)] and prove them equivalent.

Remark 2.6.7 It is best to think of the above fact as applying to first-order logic without identity. The reason we exclude identity is that otherwise the following is an instance of (RP*): if a ≡ b and a = x, then b = x. From this we get as an instance: if a ≡ b and a = a, then b = a. From which it can immediately be concluded:

if a ≡ b, then b = a.

So the only congruence is the identity congruence on the algebra (since the identity congruence must be included in every congruence because of reflexivity).

Exercise 2.6.8 Observe that the same problem does not apply to the following:
(RPb) If a ≡ b and aRx, then there exists a y such that x ≡ y and bRy.
Just as every function determines an equivalence relation, every homomorphism determines a congruence relation. (The reader may prove this as an exercise.) Also, just as every equivalence relation is determined by a function, every congruence relation is determined by a homomorphism. Demonstrating this interesting result involves the development of the notions of quotient structure and quotient algebra.

Every congruence relation ≡ on a relational structure A is an equivalence relation, and accordingly partitions A into various equivalence classes. We can take the equivalence classes of A (relative to ≡) and construct a relational structure similar to A; this structure is called the quotient structure determined by ≡ (also, A modulo ≡), and is denoted A/≡. In particular, for each relation Rᵢ on A, we define the derivative relation Qᵢ on the equivalence classes in A/≡ as follows:

(Q) ([a₁], ..., [aₙ], [x]) ∈ Qᵢ iff for some y ∈ [x], (a₁, ..., aₙ, y) ∈ Rᵢ.

Here, [x] is the equivalence class generated by x, relative to ≡; i.e., [x] = {a : a ≡ x}. Condition (Q) is best understood in the special context of binary relations, in which case it may be formulated as follows:

(Qb) [a]Q[b] iff there exists a y ∈ [b] such that aRy.

That the derivative relation Q is well-defined is not obvious; it might involve contradictions. It is at least conceivable that [a] bears Q to [b], but [a′] does not bear Q to [b], and yet [a] is identical to [a′]. If this happens, then (Qb) is an illegitimate definition. But it cannot, because the equivalence relation ≡ is not just any equivalence relation; it is a congruence relation.

Exercise 2.6.9 Show that the well-definedness of the derivative structure requires (and is also ensured by) the fact that the equivalence relation is a congruence relation.

The above definitions apply to general relational structures. When we focus our interest on operational structures, we obtain a certain degree of simplification in the definition of the quotient structure. Specifically, let A be an algebra with operations (Oᵢ), let ≡ be a congruence relation on A, and let A/≡ be the associated collection of equivalence classes determined by ≡. We can construct the quotient algebra determined by ≡, denoted A/≡, by taking the equivalence classes on A determined by ≡, and defining the derivative operations (Qᵢ) as follows:

(Q) Qᵢ([a₁], ..., [aₙ]) = [Oᵢ(a₁, ..., aₙ)].
In other words, to compute the result of applying Qᵢ to the sequence of equivalence classes ([a₁], ..., [aₙ]), we look at the corresponding sequence of representatives (a₁, ..., aₙ), we apply the parent operation Oᵢ to it, and we take the equivalence class of the result. If the operations are two-place, and we use '+' ambiguously to refer to them both, then (Q) may be restated as follows:
(Q+) [a] + [b] = [a + b].

In other words, to add [a] and [b], one adds a and b, and then takes the equivalence class of the result.

Exercise 2.6.10 As with relational quotient structures, one might naturally wonder whether the above definition is illegitimate. For it is conceivable that [a] = [a′] and [b] = [b′], but [a] + [b] ≠ [a′] + [b′]. Show that (Q) is legitimate. (Hint: It is a direct consequence of the fact that the equivalence relation ≡ on which it is based is in fact a congruence relation.)

Having defined quotient structures, we are now in a position to demonstrate that every congruence relation is determined by a homomorphism. Let ≡ be a congruence on A, and let A/≡ be the associated quotient structure, defined as above. Define a function h so that h(a) is [a], i.e., the equivalence class generated by a; this function is customarily referred to as the canonical homomorphism associated with ≡. That h is in fact a homomorphism follows from the way in which the operations on the quotient structure are defined. By way of illustration, consider addition. The question whether h(a + b) = h(a) + h(b) amounts to the question whether [a + b] = [a] + [b]; but this is true simply in virtue of the way we have defined addition on equivalence classes. In a similar manner, we can show that the canonical homomorphism is indeed a homomorphism with respect to all relevant operations and relations. We must also show that the canonical homomorphism determines the original congruence relation ≡, which is to say that a ≡ b iff h(a) = h(b); but this is trivial, since it simply amounts to the question whether a ≡ b iff [a] = [b].

Every homomorphism determines a congruence relation, and every congruence determines a homomorphism, namely, the canonical homomorphism. If we put these two theorems together, we obtain another important theorem, known as the homomorphism theorem, which states the following.

Theorem 2.6.11 Every homomorphic image of A is isomorphic to a quotient of A, and vice versa.

Proof Suppose that h is a homomorphism from A onto B. As noted earlier, h determines a congruence on A, where a ≡ b iff h(a) = h(b). The congruence relation ≡ in turn determines a quotient structure A/≡. Claim: A/≡ is isomorphic to B, where the isomorphism f is defined so that f([a]) = h(a). That f is indeed an isomorphism is left as an exercise. The "vice versa" is trivial given the remarks above about the canonical homomorphism. □
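For a concrete finite instance (a sketch of ours, with invented names), take A = (Z12, +mod12) and the congruence a ≡ b iff a and b leave the same remainder mod 3. The classes are computed as frozensets, (Q+) defines class addition via representatives, and well-definedness is verified exhaustively:

carrier = range(12)
cls = lambda a: frozenset(x for x in carrier if x % 3 == a % 3)   # the class [a]

def class_add(A, B):
    a, b = min(A), min(B)             # pick any representatives
    return cls((a + b) % 12)          # (Q+): [a] + [b] = [a + b]

# well-definedness: the result does not depend on the representatives chosen
print(all(class_add(cls(a), cls(b)) == cls((a2 + b2) % 12)
          for a in carrier for b in carrier
          for a2 in cls(a) for b2 in cls(b)))   # True

The canonical homomorphism is then simply the map cls itself, since cls((a + b) % 12) = class_add(cls(a), cls(b)) by construction.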
The homomorphism theorem arises in algebraic logic in the following sort of situation. We begin with an algebra of sentences; next, we interpret these sentences via a corresponding algebra of propositions, where the interpretation function is a homomorphism. The interpretation function gives rise to a congruence relation on the algebra of sentences, which yields a quotient structure (based on equivalence classes of sentences) isomorphic to the algebra of propositions. According to nominalistically oriented logicians, one should discard the propositional algebra altogether in favor of the corresponding sentential quotient algebra. The realistically oriented logician, however, rejects this proposal since, without the propositional algebra, it is difficult, or even impossible, to understand how the sentential congruence relation arises in the first place.

We close this section with one last definition that will be needed in the sequel.

Definition 2.6.12 An algebra A is said to be simple if it has only the two trivial congruences, i.e., the universal relation on A, A × A, and the identity relation restricted to A, =_A.
It is easy to see that this is equivalent to saying that the algebra has only the two trivial homomorphic images, namely itself and the one-element algebra (of its similarity class).
2.7 Direct Products
In previous sections, we have discussed two ways of constructing new algebras (relational structures) from a given algebra (relational structure). Given an algebra A, we can form various subalgebras of A, and we can form various homomorphic images of A, which in light of the homomorphism theorem is equivalent to taking quotients of A. In addition to these techniques, both of which generally produce "smaller" algebras, there is a construction technique that generally produces larger algebras: the method of forming direct products.

By way of illustration, consider the set ℝ of real numbers. As you may recall, Descartes (alias Cartesius) made a singularly important contribution to mathematics by showing that there is a natural and fruitful correspondence between the points of the Euclidean plane, on the one hand, and ordered pairs of real numbers, on the other. Indeed, this correspondence, which involves assigning to each geometric point its Cartesian coordinates, is the basis of analytic geometry and the associated theory of vector spaces. Not wishing to delve into the question of exactly what a vector space is, let us simply note that the set of ordered pairs of real numbers does form a vector space, where the vector addition operation is defined so that the vector sum of two pairs (a₁, a₂) and (b₁, b₂) is computed by numerically adding the respective components, thus obtaining the pair (a₁ + b₁, a₂ + b₂). So, for example, (1, 2) + (3, 4) = (4, 6). In addition to vector addition, there is also scalar multiplication, but that is unimportant to our example. What is important is the way that the sum of vectors is defined in terms of the sum of numbers.
We know what it means to say that one number is smaller than another; what does it mean to say that one ordered pair (a₁, a₂) is smaller than another ordered pair (b₁, b₂)? One natural way to define this derivative notion of smaller-than is as follows:

(lt) (a₁, a₂) < (b₁, b₂) iff a₁ < b₁ and a₂ < b₂.

If one imagines the Cartesian plane in the usual way, then the positive x-axis goes east, and the positive y-axis goes north. Under this construal, condition (lt) amounts to saying that the point (a₁, a₂) is less than the point (b₁, b₂) if and only if (a₁, a₂) is both west and south of (b₁, b₂). In the previous examples, what we have is a direct product; the intuitive (although imprecise) idea is that in order to apply an operation or relation to an ordered pair one applies that operation or relation to the respective components. By way of attempting to make these intuitions somewhat more precise and more general, let us consider a pair of simple binary relational structures, A and B, where R₁ and R₂ are the respective relations. Next, let us consider the Cartesian product A × B of the respective carrier sets, A and B, which consists of precisely those ordered pairs (a, b) such that a ∈ A and b ∈ B. What we wish to do is, using A × B as the carrier set, construct a relational structure similar to A and B. So we define a binary relation Rp on A × B in the natural way:

(p) (a₁, a₂)Rp(b₁, b₂) iff a₁R₁b₁ and a₂R₂b₂.

In other words, ordered pairs stand in the product relation if and only if their respective components stand in the respective relations. In a similar way, we can define the simple direct product of two algebras. Consider, for example, taking the direct product of the algebra of numerical addition with itself. In this case, the elements are ordered pairs of natural numbers, and addition is defined as

(+p) (w, x) + (y, z) = (w + y, x + z),

which simply says that, in order to add ordered pairs, you add their components in the usual manner (so, for example, (1, 2) + (3, 4) = (4, 6)). If we included multiplication in our algebra, then multiplication would be defined so that (w, x) × (y, z) = (w × y, x × z). There are two ways to generalize the above intuitive definitions. On the one hand, we can consider arbitrary relations and operations, not just binary relations and operations; on the other hand, we can consider taking the product of an arbitrary family of structures, not just pairs of structures. Rather than going straight to the most general case, we shall first consider what happens when we have still only pairs of structures, but have relations of arbitrary degree (Definition 2.7.1), and then operations of arbitrary degree (Definition 2.7.2). In the first case, we obtain the following general definition of the direct product of a pair of structures.

Definition 2.7.1 Let A and B be similar relational structures with relations (Qᵢ) and (Rᵢ) respectively. Then the direct product of A and B, denoted A × B, is defined to be the relational structure (C, (Pᵢ)) satisfying the following conditions:
(1) C = A × B.
(2) ((a₁, b₁), ..., (aₙ, bₙ)) ∈ Pᵢ iff (a₁, ..., aₙ) ∈ Qᵢ and (b₁, ..., bₙ) ∈ Rᵢ.

(Here we assume that the degree of the relations Rᵢ is n.)

Definition 2.7.2 Let A and B be similar algebras with operations (Fᵢ) and (Gᵢ) respectively. Then the direct product of A and B, denoted A × B, is defined to be the algebra (C, (Hᵢ)) satisfying the following conditions:
(1) C = A × B.
(2) Hᵢ[(a₁, b₁), ..., (aₙ, bₙ)] = (Fᵢ[a₁, ..., aₙ], Gᵢ[b₁, ..., bₙ]).

As noted above, we can generalize from pairs of structures to arbitrary families of structures. In order to accomplish this, we need the notion of the Cartesian product of an arbitrary family of sets. Recall that a choice function on a family (Aᵢ) of sets is any family (aᵢ) of elements such that aⱼ ∈ Aⱼ for all j in the (implicit) indexing set. The Cartesian product of the family (Aᵢ) of sets, denoted X(Aᵢ), is defined simply to be the set of all choice functions on (Aᵢ). Now, let (Aᵢ) be a family of simple binary relational structures, and let (Rᵢ) be the respective binary relations. Then, using the Cartesian product X(Aᵢ) of the respective carrier sets, we can form the direct product of (Aᵢ). In particular, we define the product relation Rp on X(Aᵢ) as follows:

(p) (aᵢ)Rp(bᵢ) iff for all j, aⱼRⱼbⱼ.

In other words, a sequence (aᵢ) bears the derivative relation Rp to sequence (bᵢ) if and only if every component aⱼ bears the component relation Rⱼ to its counterpart component bⱼ. In a similar way, we can define the direct product of an arbitrary family of simple binary algebras. Everything goes as before, except that one defines the product operation Op as follows:

(p*) Op((aᵢ), (bᵢ)) = (cᵢ), where for every i ∈ I, cᵢ = Oᵢ(aᵢ, bᵢ).

So, for example, in order to add two sequences, one adds the respective components. We are now in a position to offer the general definition of direct product for arbitrary families of arbitrary structures.

Definition 2.7.3 Let (Aᵢ)_{i∈I} be a family of similar relational structures with relations (Rʲᵢ)_{j∈J} respectively. Then the direct product of (Aᵢ), denoted ∏(Aᵢ), is defined to be the relational structure (C, (Rʲₚ)_{j∈J}) satisfying the following conditions:
(1) C = X(Aᵢ).
(2) For every j ∈ J, ((a¹ᵢ)_{i∈I}, ..., (aⁿᵢ)_{i∈I}) ∈ Rʲₚ iff for every i ∈ I, (a¹ᵢ, ..., aⁿᵢ) ∈ Rʲᵢ.

Definition 2.7.4 Let (Aᵢ)_{i∈I} be a family of similar algebras with operations (Oʲᵢ)_{j∈J} respectively. Then the direct product of (Aᵢ), denoted ∏(Aᵢ), is defined to be the algebra (C, (Oʲₚ)_{j∈J}) satisfying the following conditions:
(1) C = X(Aᵢ).
(2) For every j ∈ J, Oʲₚ[(a¹ᵢ)_{i∈I}, ..., (aⁿᵢ)_{i∈I}] = (Oʲᵢ[a¹ᵢ, ..., aⁿᵢ])_{i∈I}.
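A small sketch (ours) of Definition 2.7.2 for a pair of binary algebras: the product operation acts on ordered pairs componentwise.

def product_op(F, G):
    # H((a1, b1), (a2, b2)) = (F(a1, a2), G(b1, b2))
    return lambda p, q: (F(p[0], q[0]), G(p[1], q[1]))

add_mod_2 = lambda x, y: (x + y) % 2
add_mod_3 = lambda x, y: (x + y) % 3
add_pair = product_op(add_mod_2, add_mod_3)     # addition on Z2 x Z3

print(add_pair((1, 2), (1, 2)))   # (0, 1): each component is computed separately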
There is an important special case of a direct product, called a direct power, that is equivalent (when specialized to an algebra) to what is more commonly called a function space. Let us concentrate for the moment on algebras. Suppose that the family (Aᵢ) of algebras comprehends just one algebra, in the sense that Aᵢ = Aⱼ for all indices i, j; in this case, we have "I-many" repetitions of A. Then the direct product of the family (Aᵢ) is called the I-direct power of A, much as the product of a number x with itself n times is called the nth power of x. Now, the Cartesian product of a set A with itself "I times" (where I is any set whatsoever) is simply the set of all functions from I into A. Given this set Aᴵ of functions from I into A, and given that A is the carrier set of algebra A, we can form the associated function space, which is simply the algebra founded in the natural way on Aᴵ. In particular, the operations acting on the functions are defined according to the corresponding operations on the elements of A. Thus, for example, in a numerical function space, the sum of two functions f and g, herein denoted [f + g], is defined according to the following rule:
(f+) [f + g](x) = f(x) + g(x).

In other words, to compute what [f + g] yields when applied to an element x, one applies f to x and g to x, and adds the results. More generally, one defines each operation O on the function space according to the following scheme:

(fO) [O(f₁, ..., fₙ)](x) = O[f₁(x), ..., fₙ(x)].
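A sketch (ours) of the pointwise lifting that underlies (f+) and (fO):

def pointwise(op):
    # lift an n-ary operation on A to an n-ary operation on functions I -> A
    return lambda *fs: (lambda x: op(*(f(x) for f in fs)))

f = lambda x: x * x
g = lambda x: x + 1
f_plus_g = pointwise(lambda a, b: a + b)(f, g)   # [f + g](x) = f(x) + g(x)

print(f_plus_g(3))   # 13, i.e., f(3) + g(3) = 9 + 4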
By way of concluding this section, we introduce a subsidiary notion, which is employed in subsequent sections. Given any direct product ∏(Aᵢ), and given any element k in the indexing set I, one can define the associated projection map, pₖ, which maps X(Aᵢ) to Aₖ, according to the following rule:

(PROJ) pₖ((aᵢ)_{i∈I}) = aₖ.

In other words, pₖ simply selects the kth component of each sequence in X(Aᵢ). For example, in the case of the 2-direct power of the real numbers, the projection of any point in the plane is simply the coordinate of that point along the appropriate axis.
Theorem 2.7.5 Each projection pₖ is a morphism from ∏(Aᵢ) onto Aₖ; in other words, each Aₖ is a morphic image of the direct product ∏(Aᵢ).

Exercise 2.7.6 Prove the above theorem.

Exercise 2.7.7 Show that in the context of relational structures pₖ is not necessarily strongly faithful, i.e., not necessarily a homomorphism. Show that in the context of operational structures pₖ is a homomorphism.
2.8 Subdirect Products and the Fundamental Theorem of Universal Algebra
As we have seen in the previous section, we can, so to speak, multiply algebras to obtain other algebras, just as we can multiply natural numbers to obtain other natural
numbers. Recall that a natural number is called prime if it has no non-trivial factors. According to the prime factorization theorem, sometimes called the fundamental theorem of arithmetic, every number is identical to a product of prime numbers; for example, 10 = 2 × 5, and 30 = 2 × 3 × 5. Birkhoff, the father of universal algebra, proved the corresponding fundamental theorem of universal algebra (1944), which says (very roughly!) that every algebra is isomorphic to a product of prime algebras. In this section, we describe the content of this theorem.

Let us start with the notion of prime algebra. Intuitively, an algebra is prime if it is not a non-trivial product of other algebras. What is a trivial product? For any similarity type of algebras, there is the trivial algebra of that type, which consists of exactly one element. Now, it is easy to show that the direct product of an algebra A with the trivial algebra is isomorphic to A. (This may be done as an exercise.) We regard such a product as trivial. In general, a product is trivial if at least one of the associated projection maps is an isomorphism, and it is non-trivial if it is not trivial (what else!).

Having the notion of prime algebra under our belt, let us consider the content of the algebraic prime factorization theorem. One is naturally inclined to hope that this theorem says something like the following:

(??) Every algebra is isomorphic to a direct product of prime algebras.
As nice as this may sound, it proves to be unfruitful. Consider cardinality. Every Cartesian, and hence every direct, product of finite algebras is either finite or uncountable; no such direct product is denumerable. On the other hand, many algebras are denumerable, which is a consequence of the famous Lowenheim-Skolem theorem from logic, which depends on the finitary (inductive) nature of the operations. We are accordingly forced either to accept every denumerable algebra as prime, or to devise a different notion of algebraic multiplication. Birkhoff opted for the latter choice, developing what are known as subdirect products. One way to look at a subdirect product is that it is a subalgebra S of a direct product ∏(Aᵢ) that still preserves the relationship of Theorem 2.7.5: each component algebra Aᵢ is a homomorphic image of S. The following is our official definition.
Definition 2.8.1 Let (Aᵢ) be a family of similar algebras, and let B be a subalgebra of the corresponding direct product ∏(Aᵢ). Then B is said to be a subdirect product of (Aᵢ) if it satisfies the following condition:
(1) for each index k, Aₖ is a homomorphic image of B under the projection map pₖ (restricted to B).
B is said to be a trivial subdirect product of (Aᵢ) if there is a k such that B is isomorphic to Aₖ.
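A simple illustration of Definition 2.8.1 (our sketch): the diagonal {(a, a) : a ∈ A} is a subalgebra of A × A whose two projections are both onto, hence a subdirect product of A with itself. (It is a trivial one, in the sense just defined, since each projection is in fact an isomorphism.)

carrier = range(3)                                  # Z3
add = lambda x, y: (x + y) % 3
diagonal = {(a, a) for a in carrier}

# closed under the componentwise operation:
print(all((add(p[0], q[0]), add(p[1], q[1])) in diagonal
          for p in diagonal for q in diagonal))     # True
# both projections are onto the carrier:
print({p[0] for p in diagonal} == set(carrier) ==
      {p[1] for p in diagonal})                     # True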
Adopting this as our notion of product for the purposes of the prime factorization theorem, we now define prime algebra as follows (in the literature, the standard terminology is the mouthful "subdirectly irreducible"):
Definition 2.8.2 An algebra B is said to be prime, or subdirectly irreducible, if for any family F of similar algebras, B is a subdirect product of F only if B is a trivial subdirect product of F.
With these definitions, we can now state Birkhoff's result.

Theorem 2.8.3 (Birkhoff's prime factorization theorem) Every algebra is isomorphic to a subdirect product of prime algebras.

The proof of Birkhoff's prime factorization theorem is somewhat involved, and proceeds by a series of lemmas. We shall first need Zorn's lemma, a well-known equivalent of the Axiom of Choice; but we first recall some terminology necessary for its statement. Where E is a family of sets, C is a chain of E iff (1) C is a subfamily of E and (2) for every pair of sets X, Y ∈ C, either X ⊆ Y or Y ⊆ X. The union ∪C over a family of sets C is that set containing all and only members of members of C. A set P is maximal in a family of sets E iff no member of E is a proper superset of P. Not all families of sets have maximal members, but Zorn's lemma states:
Lemma 2.8.4 (Zorn's lemma) If E is a non-empty family of sets, and if the union ∪C over every non-empty chain C of E is itself a member of E, then E has some maximal member P.
We now develop a series of lemmas and corollaries.

Lemma 2.8.5 For a, b ∈ A, if a ≠ b then there exists a congruence θa,b on A which is maximal among all those congruences θ satisfying the condition: it is not the case that aθb.

Proof For each pair of elements a, b of A with a ≠ b, let Ca,b be the set of all congruence relations θ on A such that not aθb. Ca,b is non-empty since E ∈ Ca,b. Regard each congruence as a relation in the set-theoretical sense, i.e., a set of ordered pairs. Our reason for emphasizing that congruences are sets is so we can talk of such things as one congruence being included in another and the union of a set of congruences. We need to be able to talk this way so as to apply Zorn's lemma so as to find a maximal member of Ca,b. Consider, then, any chain C of Ca,b. Clearly ∪C is a reflexive and symmetric relation. ∪C is also transitive. For suppose (x, y) ∈ ∪C and (y, z) ∈ ∪C. Then there is some θ ∈ C such that (x, y) ∈ θ and some π ∈ C such that (y, z) ∈ π. But either θ ⊆ π or π ⊆ θ. Whichever is the case, both (x, y) and (y, z) are members of one of π or θ. Since both π and θ are transitive, (x, z) is a member of a member of C, i.e., (x, z) ∈ ∪C. Verification of the replacement property is similarly argued (but note that it depends upon the operations of the algebra being finitary). We have thus shown that ∪C is a congruence, and it is obvious that ∪C does not relate a and b since no member of C does. Thus we have in hand all the hypotheses of Zorn's lemma and conclude that Ca,b has a maximal member; let us call it θa,b. □

Corollary 2.8.6 Let A be a non-trivial algebra and let (θᵢ) be an indexed family of all the congruences θᵢ on A for which there is a pair of distinct elements a, b such that θᵢ is maximal with respect to not relating a and b. Then ∩(θᵢ) = E, the equality relation on A.¹

¹ This last is just a fancy way of saying that any pair a, b of distinct elements can be distinguished by a congruence θᵢ, i.e., for some θᵢ it is not the case that aθᵢb. Incidentally, if (θᵢ) is empty, which is never needed in our use of it, we understand ∩(θᵢ) to be the universal relation on A, A × A.

Lemma 2.8.7 If ∩(θᵢ) = E, where each θᵢ is a congruence on A, then A is isomorphic to a subdirect product of (A/θᵢ).

Proof Suppose ∩(θᵢ) = E, where each θᵢ is a congruence relation on A. We argue that A is isomorphic to a subdirect product of (A/θᵢ). For each A/θᵢ let hᵢ be the canonical homomorphism of A onto A/θᵢ. Define the desired isomorphism h so that for a ∈ A, h(a) = ⟨hᵢ(a)⟩. Note that our supposition amounts to this: if a ≠ b then for some θᵢ we have not aθᵢb, which means hᵢ(a) ≠ hᵢ(b) and so h(a) ≠ h(b). So h is one-one. That h preserves operations follows trivially from the componentwise definition of the operations on the direct product and from the fact that each hᵢ, being a homomorphism, preserves the operations as they are computed at the ith component. All that remains then is to show that the homomorphic image of A under h, h*(A), is a subdirect product of the A/θᵢ's. It is obviously a subalgebra of the direct product ∏(A/θᵢ) because of the fact that h is an isomorphism. So we have finally to argue that each ith projection of h*(A) is onto A/θᵢ. Consider [a]_θᵢ ∈ A/θᵢ. Consider h(a) = ⟨hᵢ(a)⟩. Since hᵢ(a) = [a]_θᵢ, clearly [h(a)]ᵢ = hᵢ(a) = [a]_θᵢ. So [a]_θᵢ appears in the ith place of some member of h*(A), as was to be proven. □
Lemma 2.8.8 If A is subdirectly reducible, then ∩(θᵢ) = E, where (θᵢ) is the set of all non-trivial congruences on A.
Proof Suppose A is subdirectly reducible and yet ∩(θᵢ) ≠ E. Then for some a, b ∈ A, a ≠ b and yet aθᵢb for all i ∈ I. Let A be isomorphic to S, a subalgebra of ∏Aᵢ, where S is the subdirect product that shows A subdirectly reducible. Thus no Aᵢ is isomorphic to A. Each pᵢ restricted to S is a homomorphism of S (and hence indirectly of A) onto Aᵢ, and hence determines a congruence θᵢ. Letting the isomorphism of A onto S be h, h(a) ≠ h(b), and hence for some i, [h(a)]ᵢ ≠ [h(b)]ᵢ. Thus pᵢ(h(a)) ≠ pᵢ(h(b)), and so not aθᵢb. □
Corollary 2.8.9 If A is simple, A is subdirectly irreducible.

Proof If A has at least two distinct elements a and b, then there is no non-trivial congruence θᵢ that distinguishes a from b. So the consequent of Lemma 2.8.8 is false, and so by contraposition, A is subdirectly irreducible. In the degenerate case when A is a trivial, one-element algebra, it is easy to see that A can be represented by a subdirect product of algebras only when A is isomorphic to each of the algebras. □
Sublemma 2.8.10 Let θ be a congruence on A. Let π be a congruence on A/θ, and define a(θπ)b as [a]_θ π [b]_θ. Then the relation θπ so defined is a congruence on A.

The proof of the sublemma is left as an exercise.
Lemma 2.8.11 Let θa,b be as in Lemma 2.8.5. Then A/θa,b is subdirectly irreducible.

Proof Case 1. A/θa,b is simple. By Corollary 2.8.9, A/θa,b is subdirectly irreducible.

Case 2. A/θa,b is not simple. Let (πᵢ) be the set of all non-trivial congruences on A/θa,b. By Sublemma 2.8.10 it will follow that [a]_θa,b πᵢ [b]_θa,b for all πᵢ. To see this, first notice that for arbitrary x, y ∈ A, if x θa,b y then [x]_θa,b = [y]_θa,b, and so [x]_θa,b πᵢ [y]_θa,b, i.e., x(θa,b πᵢ)y. Thus θa,b ⊆ (θa,b πᵢ). But further it follows from the non-triviality of πᵢ that θa,b ≠ (θa,b πᵢ), for otherwise πᵢ = E. But since θa,b was maximal with respect to not making a and b congruent, we have [a]_θa,b πᵢ [b]_θa,b as promised. But then ∩(πᵢ) ≠ E, and so by Lemma 2.8.8, A/θa,b cannot be subdirectly reducible. □
We now at last prove Birkhoff's Theorem 2.8.3 proper. Let A be an arbitrary algebra. We may suppose A is non-trivial, for if it had only one element, then it would obviously be simple, and hence by Corollary 2.8.9, A would itself be subdirectly irreducible. Plugging Corollary 2.8.6 into Lemma 2.8.7, we conclude A is isomorphic to a subdirect product of an indexed family of algebras (A/θᵢ), where each θᵢ is as in Lemma 2.8.5. But by Lemma 2.8.11, each A/θᵢ is itself subdirectly irreducible.
The following is a sometimes useful summary of an important part of the reasoning used in showing Birkhoff's prime factorization theorem. It can of course be rephrased with congruences in place of homomorphisms, and quotient algebras in place of homomorphic images.
Theorem 2.8.12 Let K be a similarity class of algebras and let A (not necessarily ∈ K) be a similar algebra. Suppose for each a, b ∈ A with a ≠ b there exists an algebra A′ ∈ K and a homomorphism h of A onto A′ such that h(a) ≠ h(b). Let H be the class of all such homomorphisms. Then A is isomorphic to a subdirect product of the family of algebras (h(A))_{h∈H}.

Proof We define f(a) = ⟨h(a)⟩_{h∈H}. We start by showing that f is one-one and preserves operations. The first is obvious, for if a ≠ b, then there exists h such that h(a) ≠ h(b), and so ⟨h(a)⟩_{h∈H} and ⟨h(b)⟩_{h∈H} differ at the "hth place." As for preserving operations, let us illustrate this with an arbitrary binary operation * (the extension to operations of all degrees being then clear). The following calculation depends upon the fact that both h and the angle brackets can be moved "inside," which is just a visual way of expressing that h is a homomorphism and operations in the direct product are defined componentwise:

f(a * b) = ⟨h(a * b)⟩_{h∈H} = ⟨h(a) * h(b)⟩_{h∈H} = ⟨h(a)⟩_{h∈H} * ⟨h(b)⟩_{h∈H} = f(a) * f(b).

All that remains is to show that the image of A under f is a subdirect product. We know that it is a subalgebra of the direct product, and so all that really remains is to show that the projections are all onto their appropriate algebras in K. The hth projection is the function that assigns to each ⟨h(a)⟩_{h∈H} the element h(a) in the algebra A′ ∈ K. Since h is onto, this means that for every a′ ∈ A′ there is some a ∈ A such that h(a) = a′. Thus every element a′ ∈ A′ shows up in the "hth place" as a component of ⟨h(a)⟩_{h∈H}. □
2.9 Word Algebras and Interpretations
In many of the remaining sections of this chapter, we employ concepts borrowed from the metatheory of first-order logic. So we take this opportunity to describe very briefly some of these concepts, leaving the detailed presentation until a later chapter. One of the things that connects algebra and metalogic is the notion of word algebra, which is crucial both to universal algebra and to algebraic logic. At least two examples of word algebras arise in logic: the algebra of terms of a first-order language, and the algebra of sentences of a zero-order language.
In the construction of the class of terms of a first-order language, one begins with a class V of variable symbols, a class O of operation symbols (where O and V are disjoint), and a function d that assigns a natural number to each operation symbol. For each operation symbol S, d(S) is the syntactic degree of S, which pertains to the rules of term formation, given below. Given the class SYM (= V ∪ O) of symbols, one constructs the associated class of all finite sequences of symbols, and using a set of syntactic rules, one identifies those sequences that are well-formed; in the case of first-order languages, these are called terms. This process applies equally to the formation of sentences in a zero-order language, as in sentential logic. In this case, the variable symbols correspond to sentential variables, the operation symbols correspond to connectives, or sentential operators, and the well-formed strings correspond to sentences.
Both first-order terms and zero-order sentences are concrete instances of the abstract algebraic concept of words. The terminology is natural: if the symbols correspond to letters of the alphabet, then the well-formed sequences of letters correspond to words. Before formally defining algebraic words, we describe a subsidiary notion, that of an operational syntax.
Definition 2.9.1 An operational syntax is a system (V, (Oᵢ), (dᵢ)), where V is any non-empty set, (Oᵢ) is any non-repeating non-empty family disjoint from V, and (dᵢ) is any family (with the same indexing set) of natural numbers.
Definition 2.9.2 Let SYN = (V, O, d) be an operational syntax, as defined above. Then the set of symbols of SYN is V ∪ O, and the set of strings of SYN is the set of all finite sequences of symbols of SYN. The set of words on SYN is a subset of the strings of SYN, denoted W(SYN), inductively defined as follows:
(1) Any sequence consisting solely of a variable symbol is a word (i.e., is an element of W(SYN)).
(2) If w₁, ..., w_k are words, and if Oᵢ is a k-place operation symbol (i.e., dᵢ = k), then the sequence obtained by juxtaposing Oᵢ, w₁, ..., w_k, in that order, is also a word.
(3) Nothing else is a word.
Given the set of words on a syntax SYN, we can construct an associated algebra, called the algebra of words on SYN. Specifically, for each operation symbol O, of syntactic degree n, we define a corresponding operation O*, of algebraic degree n, defined in the natural way, as follows:
O*(w₁, ..., wₙ) = Ow₁···wₙ.
The right-hand expression denotes the string of symbols that results by juxtaposing O with w₁, ..., wₙ, in that order. In other words, to compute the result of applying the algebraic operation O* to a sequence of words, one first juxtaposes those words, and then prefixes the corresponding operation symbol O. Thus, the algebraic operation O* serves as the mathematical portrayal of the syntactic action of prefixing the operation symbol O in front of an appropriate number of strings.
As noted earlier, the algebraic notion of word encompasses both the terms of a first-order language and the sentences of a zero-order language. Accordingly, the notion of word algebra has two concrete instances: the algebra of terms of a first-order language, and the algebra of sentences of a zero-order language. In the remainder of this section, we concentrate on algebras of terms.
Every system of words has a type, given by the family (dᵢ), and hence every algebra of words also has a type, likewise given by (dᵢ). We can accordingly consider homomorphisms from a given algebra of words into similar algebras. In the case of the algebra of terms of a first-order language, the homomorphisms correspond to the interpretations of model theory, which we now briefly describe.
Consider a pure functional first-order language L, that is, a first-order language whose only predicate is identity. An interpretation structure for L is simply an algebra A similar to the algebra of terms of L. An assignment is any function that assigns an element of A to every variable. On the other hand, an interpretation is a (but not just any) function that assigns an element of A to every term; in order for a function to qualify as an interpretation, it must respect the correspondence between the syntactic operations, on the one hand, and the algebraic operations, on the other.
For example, consider a language, L, with only one (two-place) operation symbol P, and consider an interpretation structure (i.e., algebra), A, consisting of a single two-place operation, +, which in this example is intended to be the meaning of the operation symbol P. To say that a function I respects the correspondence between the symbol P and its intended meaning, addition, is simply to say the following: for any terms s and t, if I(s) = x and I(t) = y, then I(Pst) = x + y; equivalently, I(Pst) = I(s) + I(t). But recall that the operation P* on the algebra of terms is defined so that P*(s, t) = Pst, so we can rewrite this condition thus: I[P*(s, t)] = I(s) + I(t). This, of course, is simply the condition that I is a homomorphism from the algebra of terms into the algebra A.
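For concreteness, here is a minimal computational sketch of the example just given (ours, not the text's; the representation of words as nested tuples and all names are merely illustrative): an interpretation is obtained by extending an assignment homomorphically down the structure of each term.

```python
# Words over an operational syntax, as nested tuples: a variable symbol stands
# alone, and a compound word is (operation_symbol, w1, ..., wk).
def interpret(word, assignment, ops):
    """The homomorphic extension of `assignment`: apply the algebraic
    operation matching each operation symbol, recursing through subterms."""
    if isinstance(word, tuple):
        sym, *args = word
        return ops[sym](*(interpret(w, assignment, ops) for w in args))
    return assignment[word]                     # base case: a variable symbol

ops = {"P": lambda x, y: x + y}                 # + as the meaning of the symbol P
assignment = {"x": 3, "y": 4}

term = ("P", "x", ("P", "y", "y"))              # the term Px(Pyy)
print(interpret(term, assignment, ops))         # I(Px(Pyy)) = 3 + (4 + 4) = 11
```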
More generally, given any system of terms of a pure functional first-order language, an interpretation of that language is a homomorphism from the associated algebra of terms into a similar algebra. Every interpretation I gives rise to a unique assignment, which is simply the restriction of I to the class V of variables. The other direction also holds, although it is not trivial: every assignment function on A can be extended to a unique interpretation. The proof of this important, though unglamorous, result depends upon proving that the terms of a first-order language (more generally, the words of an operational syntax) decompose uniquely (cf. Section 4.3). For first-order languages this result was first proven by Church (1956).

Lemma 2.9.3 (Isolation lemma) Let τ = τ(x₁, ..., xₙ) be a term containing no variables other than those displayed. Let I and I′ be two interpretations that agree on each of x₁, ..., xₙ. Then I[τ(x₁, ..., xₙ)] = I′[τ(x₁, ..., xₙ)].

Proof (By induction on the complexity of τ(x₁, ..., xₙ).)
(i) Base case. Let τ(x₁, ..., xₙ) be atomic. Then it is of the form Oᵢx₁···xₙ, and

I[Oᵢx₁···xₙ] = oᵢ(I(x₁), ..., I(xₙ)) = oᵢ(I′(x₁), ..., I′(xₙ)) = I′[Oᵢx₁···xₙ].
(ii) Inductive case. We suppose the lemma true for terms τ₁, ..., τₙ and show that the lemma is preserved under the construction of τ = Oᵢ(τ₁, ..., τₙ). We first observe that each of the terms τⱼ can be relabelled as τⱼ(x₁, ..., xₙ). The requirement on this last notation was only that all variables of the term are included among x₁, ..., xₙ, and so we can just pool all of the variables occurring in τ(x₁, ..., xₙ), thus having a uniform list of variables for each ingredient term in τ. By inductive hypothesis, for 1 ≤ j ≤ n:

I[τⱼ(x₁, ..., xₙ)] = I′[τⱼ(x₁, ..., xₙ)],
and so:

I[Oᵢ(τ₁, ..., τₙ)] = oᵢ(Iτ₁, ..., Iτₙ) = oᵢ(I′τ₁, ..., I′τₙ) = I′[Oᵢ(τ₁, ..., τₙ)]. □
We introduce some quite standard notations which implicitly rely on the fact that in evaluating a formula we have no need to look outside that formula (as the isolation lemma tells us). We write τ(x₁, ..., xₙ) to mean a sentence containing no atomic sentences other than x₁, ..., xₙ. When a₁, ..., aₙ are elements of an algebra A, it is natural to use a substitution notation τ(a₁, ..., aₙ) to indicate that we are thereby restricting the interpretations of the term to those where each displayed variable xᵢ has been assigned the element aᵢ. Note that we are not necessarily assuming that each aᵢ is denoted by a constant in our language (though adding a distinct constant for each distinct element would be another way to go). The notation τ(a₁, ..., aₙ) is neither syntactic fish nor semantic fowl, but a mixture of both. We shall also from time to time employ a related notation I(a₁, ..., aₙ/x₁, ..., xₙ) to mean an interpretation that is exactly like the interpretation I except that it assigns aᵢ to xᵢ (1 ≤ i ≤ n). Another notation that is useful is τ_A(a₁, ..., aₙ) = [I(a₁, ..., aₙ/x₁, ..., xₙ)](τ). Note that this is a semantic notion in
the sense that it computes an element in the algebra using the functions matching the operation symbols.

2.10 Varieties and Equational Definability
The symbolic machinery of first-order logic includes a two-place predicate symbol for identity, or equality, herein denoted E (though we shall quickly come also to use the standard =, both for the object language and metalanguage, context differentiating them). Thus, the atomic formulas of a first-order language include in their ranks formulas involving this predicate; these are called, quite naturally, equations. Standard model theory treats E as a logical predicate, which is to say it assigns to E a fixed interpretation, in particular, the relation of (numerical) identity. What this means may be described as follows.
As remarked in the previous section, in the special case of a pure functional language L, an interpretation is simply a homomorphism from the algebra of terms of L into a similar algebra. Standard model theory also involves the concept of satisfaction, which may be regarded as a relation between interpretations and formulas. For the purposes of the remaining sections of this chapter, we need only be concerned with pure functional languages, and we need only be concerned with equations. For this special class of formulas, satisfaction is simple to define: I satisfies Est if and only if I(s) = I(t). We can then derivatively talk of satisfaction by an assignment, meaning satisfaction by the interpretation that it determines.
Given the concept of satisfaction for equations, we can define a number of derivative notions. To say that an algebra A (of the appropriate type) satisfies Est is to say that every interpretation into A satisfies Est, and to say that a class K of similar algebras satisfies a class Q of equations is to say that every algebra in K satisfies every equation in Q. We also say that A is a model of Q when A satisfies Q. With these pieces of terminology, we can now define equational class.

Definition 2.10.1 A class K of similar algebras is said to be an equational class, or to be equationally definable, if there is a set Q of equations such that K is precisely the class of all models of Q.

Since it refers to linguistic entities (terms, formulas) in addition to algebraic entities, the notion of equational class is model-theoretic, and not purely algebraic. In his Varieties Theorem, Birkhoff (1935) showed that the notion of equational class is coextensive with a purely algebraic notion, namely, the notion of variety, which is defined as follows.

Definition 2.10.2 A class K of similar algebras is said to be a variety if it satisfies the following conditions:
(S) If A ∈ K, and B is a subalgebra of A, then B ∈ K.
(H) If A ∈ K, and B is a homomorphic image of A, then B ∈ K.
(P) If Aᵢ ∈ K, for all i ∈ I, then Π(Aᵢ) ∈ K.
In other words, a variety is a class of similar algebras that is closed under the formation of (S) subalgebras, (H) homomorphic images, and (P) direct products. Note that, in virtue of (S) and (P), every variety is automatically closed under the formation of subdirect products, and in virtue of the homomorphism theorem, every variety is closed under the formation of quotient algebras. Birkhoff's varieties theorem may be stated as follows.

Theorem 2.10.3 (The varieties theorem)
(1) Every equational class is a variety.
(2) Every variety is equationally definable.

In the remainder of this section, we prove the first half of this theorem, leaving the second (harder!) half until we develop additional machinery.
Proof We verify each of the conditions (S), (H), and (P) in Definition 2.10.2.
(S) Proving the contrapositive, if B is a subalgebra of A and fails to satisfy s = t, then there is some interpretation I of the terms in B so that I(s) ≠ I(t). But I is a fortiori an interpretation in A.
(H) Again contrapositively, if B is a homomorphic image of A and fails to satisfy s = t, then there is an interpretation I in B such that I(s) ≠ I(t). Let h be the given homomorphism. Define the interpretation I′ in A so that, for every variable x, I′(x) ∈ h⁻¹(I(x)). Note that I(x) may in general have many "pre-images" in A, and which one you choose is absolutely arbitrary; hence technically you have to use the Axiom of Choice. It may be proven by an easy induction on the length of terms that for every term u, simple or complex, I′(u) ∈ h⁻¹(I(u)). It follows that I′(s) ≠ I′(t), for otherwise h(I′(s)) = h(I′(t)) and thus I(s) = I(t), contrary to our assumption.
(P) Let Π(Aᵢ) be a direct product of algebras each of which satisfies s = t, but suppose it itself does not. Let I be an interpretation in Π(Aᵢ) such that I(s) ≠ I(t). Letting I(s) = ⟨aᵢ⟩ and I(t) = ⟨bᵢ⟩, then for some i ∈ I, aᵢ ≠ bᵢ. Let pᵢ be the ith projection homomorphism. Define Iᵢ(u) = pᵢ(I(u)) on all terms u. This is verified to be an interpretation by an induction on terms, and obviously Iᵢ(s) = aᵢ ≠ bᵢ = Iᵢ(t). □

Exercise 2.10.4 The reader can provide any missing details in the above proof, e.g., the inductions.

Exercise 2.10.5 For each of conditions (S), (H), and (P) the reader can give an example of a kind of postulate, satisfaction of which would not be preserved by the corresponding way of forming new algebras from old. (Hint: For (S) think of existential sentences, for (H) think of non-identities, and for (P) think of disjunctions.)
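Satisfaction of an equation by a finite algebra can be checked by brute force over all assignments. The following sketch is our own illustration (all names are hypothetical); it also exhibits condition (P) on a two-factor direct product with componentwise operations.

```python
from itertools import product

def satisfies(elems, ops, lhs, rhs, variables):
    """An algebra satisfies an equation iff every interpretation does, i.e.
    the two terms evaluate alike under every assignment of the variables."""
    def ev(term, a):
        if isinstance(term, tuple):
            sym, *args = term
            return ops[sym](*(ev(t, a) for t in args))
        return a[term]
    return all(ev(lhs, dict(zip(variables, vals))) == ev(rhs, dict(zip(variables, vals)))
               for vals in product(elems, repeat=len(variables)))

# Z3 = {0, 1, 2} under addition mod 3 satisfies commutativity ...
elems = [0, 1, 2]
ops = {"+": lambda x, y: (x + y) % 3}
print(satisfies(elems, ops, ("+", "x", "y"), ("+", "y", "x"), ["x", "y"]))  # True

# ... and so does its direct product Z3 x Z3, with + defined componentwise,
# illustrating condition (P).
pairs = list(product(elems, elems))
pops = {"+": lambda a, b: (ops["+"](a[0], b[0]), ops["+"](a[1], b[1]))}
print(satisfies(pairs, pops, ("+", "x", "y"), ("+", "y", "x"), ["x", "y"]))  # True
```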
2.11 Equational Theories
The question arises about how to formalize the notion of an equational class definable by a set of equations Q. This may seem a strange question, since the cheap way is to take Q as the set of axioms, with first-order logic with equality (FOLE) as providing the inference mechanism. We know from the completeness theorem for first-order logic that Γ ⊨ φ iff Γ ⊢_FOLE φ.
The completeness theorem for FOLE is of course much more general than what we need. The left-hand side means that every model of Γ is a model of φ, and Γ can be any set of FOLE sentences, including complex sentences built out of terms, using predicates additional to equality, connectives, and first-order quantifiers, and similarly for the sentence φ. We are interested just in the special case where a sentence is an equation:

Q ⊨ s = t iff Q ⊢_FOLE s = t.

In this instance the models are just algebras, and it turns out that one does not need all of the deductive power of FOLE.

Theorem 2.11.1 (Birkhoff) The following axiom and rules suffice to characterize those algebras that are definable by equations:

x = x
From x = y, infer y = x.
From x = y and y = z, infer x = z.
From x₁ = y₁, ..., xₙ = yₙ, infer Oᵢ(x₁, ..., xₙ) = Oᵢ(y₁, ..., yₙ).
From s = s′, infer s(t₁/x₁, ..., tₙ/xₙ) = s′(t₁/x₁, ..., tₙ/xₙ)

(where s(t₁/x₁, ..., tₙ/xₙ) denotes the result of uniform substitution of t₁, ..., tₙ for the variables x₁, ..., xₙ in the term s; similarly for s′). We call this equational logic EL. Thus we have: Q ⊨ s = t iff Q ⊢_EL s = t.

Proof The hard part is to prove that if Q ⊬_EL s = t then there exists an algebra A and an interpretation I in A such that every equation in Q is true with that interpretation, but s = t is not. The idea is to define a congruence relation θ_Q on the word algebra W, using the straightforward idea that:

s θ_Q t iff Q ⊢_EL s = t.

(The algebra A is then the quotient W/θ_Q, under its canonical interpretation.) The postulates above are enough to ensure that θ_Q is a congruence. The last postulate says that we can view them as generalized to all instances, and so the first says that the congruence is reflexive, the second says it is symmetric, and the third says it is transitive. And of course the fourth says it satisfies replacement. We leave to the reader the proof of the converse, which is that the rules are sound in the sense that if an algebra A with interpretation I satisfies Q (i.e., I(s′) = I(t′) for every equation s′ = t′ in Q) and Q ⊢_EL s = t, then A also satisfies s = t (i.e., I(s) = I(t)). □

Remark 2.11.2 Rather than adding the above rules, one could take the corresponding "object-language" conditionals, e.g., (x = y & y = z) → x = z, and add them to first-order logic without equality. This makes no difference in terms of which equations are derivable, because the only way one could prove the antecedent would be to prove both x = y and y = z, and so by universal generalization obtain their closures.

2.12 Examples of Free Algebras

The way to think of "free algebras" of a given kind is that they are as unconstrained ("free") as they can be while still being algebras of the kind in question. In the next section we shall reveal this notion in its full technicality, but in this section we shall first be as concrete as we can be about so abstract a notion.
Let us consider groupoids, which are algebras (A, ∘), where ∘ is a binary operation on A. A language of groupoids consists of a set V of variables, together with a binary operation symbol *. The set V can be of various cardinalities. For the sake of concreteness, let us assume that V = {v₁, v₂, v₃}. There are infinitely many words that can be made up from this alphabet, e.g., v₁, v₂, v₃, v₁ * v₂, v₂ * v₁, v₁ * (v₂ * v₃), (v₂ * v₃) * v₁. Let us call the set of these W. No two of these terms are equivalent in a general groupoid. Thus the word algebra is "free" in the class of groupoids in the sense that it is totally unrestrained. This means that given any groupoid (A, ∘) and any assignment f of the variables to elements of A, f can be extended to an interpretation I of all the words in W, by the inductive definition:
(1) If w ∈ V, then I(w) = f(w).
(2) If w, w′ ∈ W, then I(w * w′) = I(w) ∘ I(w′).
If our word algebra were not totally unrestrained, if for example v₁ * v₂ = v₂ * v₁, then f could not always be extended to a homomorphism, since then

I(v₁) ∘ I(v₂) = I(v₁ * v₂) = I(v₂ * v₁) = I(v₂) ∘ I(v₁),

and hence we would have for arbitrary elements x, y ∈ A:

x ∘ y = y ∘ x    (commutativity).

Thus A would not be an arbitrary groupoid, but would have to be a commutative one.
We can obtain a free groupoid in a "prettier" way. Thus W can be obviously simplified by in effect deleting the *, writing (w, w′) in place of (w * w′). W is then the result of closing V under the operation of forming ordered pairs. To use some computer science lingo, the "data type" is ordered pairs.
How do we produce an algebra with three generators that is free in the class of commutative groupoids? There are two ways, one of which is always guaranteed to
work. We shall discuss that one first. The idea is simple. We simply count words such as v₁ * v₂ and v₂ * v₁ as equivalent, as well as, of course, v₁ * (v₂ * v₃) and (v₂ * v₃) * v₁, etc. More formally, we let ≡ be the smallest congruence such that the basic equivalence w * w′ ≡ w′ * w holds. And we then "divide out" by this equivalence relation, obtaining the algebra W/≡. It is quite clear that this restrains the original algebra of words only as much as is needed in order to ensure commutativity. The generators are the equivalence classes [v₁], [v₂], [v₃]. Given a map f of the generators into a commutative groupoid A, we can extend f to a homomorphism by the definition:

h([w]) = I(w).
This does not depend on representatives, since if w ≡ w′ then it is easy to show that I(w) = I(w′). This can be proven by a kind of induction, the base case being the basic equivalence. Thus I(w * w′) = Iw ∘ Iw′ = Iw′ ∘ Iw = I(w′ * w). We leave to the reader the task of showing that this extends to the result of closing the basic equivalence under reflexivity, symmetry, transitivity, and replacement.
The construction above is often called "the method of equivalence classes," and the reader can easily see that it can be made to give free algebras satisfying whatever equations might be postulated. But there is often a "prettier" way to obtain free algebras. The pretty way to obtain a free commutative groupoid is to replace w * w′ with [w, w′], where [w, w′] is to be understood as a "multiset doubleton." Multisets are just the same as sets except that multisets care about how many times an element occurs in them. Thus [w, w] ≠ [w], even though {w, w} = {w}. The data type is unordered pairs with duplicates.
We now consider groupoids that satisfy the following:

x ∘ (y ∘ z) = (x ∘ y) ∘ z    (associativity).
Such groupoids are called semi-groups. We can obtain a free semi-group by taking as the basic equivalence w * (w′ * w″) ≡ (w * w′) * w″ and then forming the smallest congruence including this equivalence, etc., just as with commutative groupoids. The pretty way this time is to just "erase" the parentheses, taking the words to be strings (finite sequences) of the variables, and understanding * as concatenation of sequences. Note that the generators are the singleton sequences of the variables. The free commutative (sometimes called Abelian) semi-group can be obtained by replacing sequences with multisets of variables, with ∘ being (multiset) union. The generators are the singleton multisets.
Another important law is

x ∘ x = x    (idempotence).

If idempotence is added to associativity and commutativity, we obtain an important class of algebras called semi-lattices. Free semi-lattices may be obtained by taking sets of variables with ∘ being the operation of union. The generators are the singleton (unit) sets.
The reader must by now realize that there are eight possible combinations of associativity, commutativity, and idempotence, and we have not covered all of them. In particular, we have not covered idempotence by itself, or in combination with exactly one of associativity or commutativity. For the case of idempotence by itself, we need some data type of "ordered pairs without duplication": (w, w) = (w). For the case of idempotence with commutativity the appropriate data type is just ordinary doubletons, or "unordered pairs," because of course {w, w′} = {w′, w}. Perhaps the most alien data type has to do with the combination of idempotence with associativity. These are strings where adjacent duplicates do not count.
All of the above data types can either be taken as primitive (as the notion of set is ultimately taken as primitive), or else "implemented" inside set theory. Sets and doubletons are of course notions straight from set theory, and the notion of an ordered pair can be given the standard explication (due to Kuratowski) (x, y) = {{x, y}, {x}}. All the multisets we use above are finite, and hence can be thought of as functions from V into the natural numbers. The number a variable is mapped to indicates the number of times it "occurs" in the multiset. Multiset doubletons are then just multisets where either one variable takes the value 2 and all the others 0, or else two variables take the value 1 and all the others 0. Functions of course are understood in set theory as sets of ordered pairs where no two distinct pairs have the same first component. Functions can also be used in implementing strings. A string is just a function from some proper initial sequence of the positive integers {1, 2, ..., n} into V, saying which variable occurs in the first position, the second, etc. Since the function need not be one-one this allows for duplicates.
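These data types are also close to hand in a programming language. The following sketch is ours (with Python's Counter standing in for finite multisets, tuples for strings, and frozensets for sets); it merely exhibits the correspondence between the data types and the postulated laws.

```python
from collections import Counter

# Free semi-group on V: strings, i.e. tuples, with * as concatenation.
concat = lambda w1, w2: w1 + w2
v1, v2 = ("v1",), ("v2",)                       # singleton strings as generators
print(concat(v1, v2) == concat(v2, v1))         # False: no commutativity imposed

# Free commutative semi-group: finite multisets, with * as multiset union.
m1, m2 = Counter(["v1"]), Counter(["v2"])       # singleton multisets as generators
print(m1 + m2 == m2 + m1)                       # True: commutative ...
print(m1 + m1 == m1)                            # False: ... but not idempotent

# Free semi-lattice: finite sets, with * as union.
s1, s2 = frozenset(["v1"]), frozenset(["v2"])
print((s1 | s2) == (s2 | s1) and (s1 | s1) == s1)   # True: union is ACI
```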
Exercise 2.12.1 We leave to the reader the task of figuring out ways to implement the remaining data types.

It should be reasonably clear to the reader that the equivalence class construction we have applied above to various equationally definable classes of groupoids can just as well be applied to any equationally definable class of algebras. This will be made precise in Remark 2.14.5 below. It turns out though that a much more complicated construction using equivalence classes can be applied directly to varieties, without yet knowing that these are the same as equationally definable classes of algebras. Indeed, this is used in the proof that the two are the same. We shall deal with these topics in Sections 2.14 and 2.15, after first presenting the technical definition of free algebra in Section 2.13. In reading the next section, the reader should bear in mind the intuitive and concrete examples of "freedom" we have just seen.
2.13 Freedom and Typicality
In Section 2.10, we discussed an important correspondence between the model-theoretic notion of equational class and the algebraic notion of variety. In the present section, we examine another notion that has both a model-theoretic and an algebraic formulation: the notion of freedom.
Since it has more intuitive content (though it is less familiar to algebraists), we describe the model-theoretic formulation before the algebraic formulation. Like the notion of an equational class, the model-theoretic version of freedom may be defined by reference to equations. However, the formal definition is not transparent. So we begin by examining the underlying intuition, which pertains to the idea of typicality.
To say that a subset S of an algebra A is typical with respect to a class K of similar algebras is to say that there is nothing peculiar about the elements of S. As a first approximation at least, this means that the elements of S do not bear any peculiar relations to one another, but bear only those relations that are borne by all elements of all K algebras. By way of illustration, suppose that the pair {a, b} is typical; then for any binary relation R, if a bears R to b, then every element in every algebra A in K bears R to every element in A.
If we count among our relations the relation of not being identical (x ≠ y), then we are immediately in trouble; for if non-identity counts as a relation, then no pair can be typical. So, by way of fine-tuning our conception of typicality, we need some way of describing what relations we mean to be referring to in the above sentence. Without arguing for it, let us simply declare that the relevant relations are those that are expressible by equations. So, for example, in the context of arithmetic, admissible (binary) relations include, among others, those associated with the following equations: x = y, x = y + x, x + y = y + y.
Considering the equation x = y indicates the need for further fine-tuning of our concept of typicality. Suppose the set S in question is a singleton {a}; then the elements of S bear the identity relation to one another, yet surely we do not expect all elements to bear this relation to one another. The intuition is that, in order for a singleton to be typical, it cannot satisfy an equation unless every singleton in every K algebra satisfies that equation. The general point is that every set S has some particular cardinality, and that some equations will thus be satisfied when their variables are assigned members of S for that crude reason alone. The intuition is that a set S can still count as typical (for its size anyway) as long as whenever an equation may be made true by assigning its variables elements from S, then in every K algebra, that equation will be true no matter how its variables have been assigned from some set of the same cardinality (or smaller; it does not hurt to add the words "or smaller" here since if an equation is satisfied by a set it is satisfied by any subset).
At this point, matters have become sufficiently complex that we might as well go ahead and present the formal definition of the model-theoretic version of freedom.

Definition 2.13.1 Let K be a class of similar algebras, and let S be a subset of an algebra A. Then S is said to be typical in A relative to K, or K-typical in A, if the following condition is satisfied:
(f1) For any equation E, for any K-algebra B, for any assignment σ into S, and for any function f from S into B, if σ satisfies E, then f ∘ σ satisfies E.²

²(f ∘ σ)(x) = f(σ(x)).
The following notation helps in understanding the intuitive meaning of the above definition. Where τ(x₁, ..., xₙ) is a term (all of whose variables are included among x₁, ..., xₙ), we let τ_A(s₁, ..., sₙ) be the result of first assigning each variable xᵢ the element sᵢ from the set S, and then computing the resulting value in A. More formally, let σ be an assignment of the variables into S such that σ(x₁) = s₁, ..., σ(xₙ) = sₙ. Then

τ_A(s₁, ..., sₙ) = I_σ(τ(x₁, ..., xₙ)),

where I_σ is the interpretation determined by the assignment σ. Let τ(x₁, ..., xₙ) and τ′(x₁, ..., xₙ) be two terms (all of whose variables are included among those displayed). The requirement of the definition is then that if

τ_A(s₁, ..., sₙ) = τ′_A(s₁, ..., sₙ),

then

τ_B(fs₁, ..., fsₙ) = τ′_B(fs₁, ..., fsₙ),

i.e., roughly put, if an equation holds when its variables have been set to be elements in S, then it holds in all K algebras where its variables have been set to be elements in the smaller set f(S).
In addition to the above model-theoretic formulation of the notion of typicality, there is the more customary, purely algebraic, formulation of the equivalent notion of freedom, which we now give.
Definition 2.13.2 Let K be a class of similar algebras, let A be an algebra of the same similarity type, but not necessarily a member of K, and let S be any subset of A. Then we say that S is free in A relative to K, or simply K-free in A, if the following condition is satisfied:
(f2) For every algebra B in K, and for every function f from S into B, there is a homomorphism h from the subalgebra of A generated by S into B that extends f.
(To say that h extends f is simply to say that h includes f.)
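Condition (f2) can be seen at work in a small example of our own devising (the particular semilattices are assumptions for illustration): in the free semi-lattice of non-empty subsets of G under union, any map f of the singleton generators into another semi-lattice extends to a homomorphism by taking joins.

```python
from functools import reduce

G = ["a", "b", "c"]
join = max                        # target semi-lattice B: a chain of integers under max
f = {"a": 2, "b": 7, "c": 5}      # an arbitrary function from the generators into B

def h(subset):
    """Extend f: send a non-empty set of generators to the join of its f-images."""
    return reduce(join, (f[g] for g in subset))

print(h({"a"}) == f["a"])         # True: h extends f on the generators
lhs = h({"a", "b"} | {"c"})
rhs = join(h({"a", "b"}), h({"c"}))
print(lhs == rhs)                 # True: h preserves the semi-lattice operation
```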
Exercise 2.13.3 Show that the definitions of freedom and typicality are equivalent.

Recall that the subalgebra generated by a subset S of an algebra A is the smallest subalgebra of A containing S, and recall that we say that G is a set of generators for A if A is the subalgebra generated by G. When we combine this with the notion of freedom we obtain the more customary algebraic notion of free generators, which is defined as follows.

Definition 2.13.4 (1) If S is K-free in A, and additionally S generates A, then we say that S freely generates A relative to K, or we say that S K-freely generates A, and we say that A is a K-free algebra. (2) If A is a K-free algebra, and additionally A is a K-algebra (i.e., A ∈ K), then we say that A is a free K-algebra.

Theorem 2.13.5 The K-algebra freely generated by n elements is unique up to an isomorphism, i.e., if G K-freely generates A, and G′ K-freely generates A′, and G and G′ have the same cardinality, then A and A′ are isomorphic.
Proof Let A and A′ be two free K-algebras with n free generators, respectively G and G′. Since G and G′ are both of the same cardinality, there exists a one-one function f from G onto G′. Since A is free, there exists a homomorphism h ⊇ f such that h: A → A′. Note that h(A) is a subalgebra of A′, and so we in fact have h(A) = A′, since A′ is the least subalgebra containing G′ = f(G) ⊆ h(A). So h is in fact onto. Exactly parallel reasoning establishes the existence of a homomorphism g ⊇ f⁻¹ of A′ onto A.
Now we show by induction from generators that g ∘ h is just the identity function restricted to A. Base case: Consider x ∈ G. g ∘ h(x) = g(h(x)) = g(f(x)) = f⁻¹(f(x)) = x. Inductive case: Suppose that g ∘ h(x₁) = x₁, ..., g ∘ h(xₙ) = xₙ. We show that g ∘ h(Oᵢ(x₁, ..., xₙ)) = Oᵢ(x₁, ..., xₙ). Since g and h are both homomorphisms,

g ∘ h(Oᵢ(x₁, ..., xₙ)) = Oᵢ(g ∘ h(x₁), ..., g ∘ h(xₙ)).

But by "inductive hypotheses,"

Oᵢ(g ∘ h(x₁), ..., g ∘ h(xₙ)) = Oᵢ(x₁, ..., xₙ),

as desired. By standard set-theoretical considerations, we can see that h is then one-one, and hence the desired isomorphism (if x ≠ y, yet h(x) = h(y), then g ∘ h(x) = g ∘ h(y), and so it is impossible for both g ∘ h(x) = x and g ∘ h(y) = y to be true). □

Remark 2.13.6 In light of Theorem 2.13.5, free K-algebras are unique up to the cardinality of the generators. This justifies the notation F_K(n) for the free K-algebra with n free generators, but in so referring we should not be committed to such a beast actually existing. We show in Section 2.14 though that for varieties, indeed for a generalization of a variety called a subdirect class, F_K(n) always does exist.

2.14 The Existence of Free Algebras; Freedom in Varieties and Subdirect Classes

Having defined what a K-free algebra is, we now address the question whether one exists. As it turns out, there is no shortage of K-free algebras; for any class K of algebras, there is a K-free algebra. Indeed, one can prove something quite a bit stronger.

Theorem 2.14.1 (Theorem on universally free algebras) For any similarity class of algebras, and for any cardinal number n (not necessarily finite), there is an algebra with exactly n generators that is universally free in that class, which is to say that it is free with respect to every algebra in that class.

Proof The desired universally free algebra is just a similar algebra of words defined on n variables. Thus let V be any set of cardinality n. We think of the members of V as "variables" ("symbols"), but really the members of V can be anything at all. Given the customary set-theoretical definition, identifying a cardinal number n with the smallest ordinal of that cardinality, the most convenient choice is just to let V be n itself. Where the similarity type in question is (dᵢ), we need an indexed non-repeating family of "operation symbols" (Oᵢ) disjoint from V. In standard set theory it is easy to see that ⟨(dᵢ, i)⟩ will do the trick (to get the ith "operation symbol," "subscript" (pair) its degree with i). One can then form the algebra of words (V, (Oᵢ), (dᵢ)). V is a set of n generators, and any mapping α of V into a similar algebra is just an assignment, which can be (inductively) extended to an interpretation I defined on all the words. An interpretation is of course a homomorphism (cf. Sections 2.9 and 4.3). □

The theorem on universally free algebras claims that for any class K of similar algebras there is a K-free algebra. Note carefully, however, that it does not claim the existence of a free K-algebra; a K-free algebra need not be a K-algebra. The K-free algebra given in the proof above will rarely be a member of K. In general, one should not expect to find a free K-algebra in every class K of algebras. However, if K happens to be a non-degenerate variety (a degenerate variety being one that contains only one-element algebras), then the prospects for finding free algebras in K are significantly improved.

Theorem 2.14.2 For any non-degenerate variety K, and for any cardinal number n, there is an algebra A in K that has n elements that freely generate A with respect to K, i.e., there exists a free K-algebra F_K(n).
(c) a = b iff for every A in K, and for every homomorphism h from UF(n) into A, h(a) = h(b). This yields a quotient algebra, UF(n)/=, which we denote UFK(n). One first observes that UFK(n) is generated by n elements, being the equivalence classes of the generators of UF(n). One then shows that every identity satisfied by every K-algebra is also satisfied by UFK(n); this simply follows from the way we defined But K is a variety, and hence an equational class, so any algebra that satisfies the defining equations is also in the class; therefore, UF(n) /= is in the class. Still, we do not yet know that UFK(n) is K-freely generated. Well, suppose that f maps {[g] : g E G) into a K~algebra A. This induces a corresponding map f* on G itself, where f*(g) = f([g]). But this function can be extended to a homomorphism on UF(n), since UF(n) is, by hypothesis universally free; let h* be such a homomorphism. Given h*, we define a corresponding function h! on UFK(n) according to the following procedure. To obtain what h! yields when applied to an equivalence class [x], one simply applies h* to any representative of [x]; since the representatives are equivalent modulo K, they will all be assigned the same element by h*. One then shows that h!, so defined, is in fact a homomorphism from UFK(n) into A, and also that it extends the original function f. 0
=.
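The congruence ≡ of this proof can be approximated computationally. In the following sketch (ours; the two-element algebra is an arbitrary stand-in for the whole class K, so a finite sample can only identify more words than ≡ itself), two words are identified iff they evaluate alike under every homomorphism into the sample algebra.

```python
from itertools import product

elems = [0, 1]
op = lambda x, y: min(x, y)                     # one small commutative groupoid in K

def ev(word, a):
    """Evaluate a word (a variable, or a pair of subwords) under assignment a."""
    return op(ev(word[0], a), ev(word[1], a)) if isinstance(word, tuple) else a[word]

V = ["x", "y"]
def profile(word):
    """Values of `word` under every assignment into the sample algebra; two
    words fall together in the quotient iff their profiles agree."""
    return tuple(ev(word, dict(zip(V, vals))) for vals in product(elems, repeat=len(V)))

print(profile(("x", "y")) == profile(("y", "x")))   # True: x * y and y * x merge
print(profile("x") == profile("y"))                 # False: generators stay distinct
```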
Corollary 2.14.3 Every algebra in a variety K is isomorphic to a quotient of a free K-algebra.

Exercise 2.14.4 Prove Corollary 2.14.3. (Hint: If A has cardinality n, then consider the free K-algebra with n generators.)
Remark 2.14.5 Theorem 2.14.2 and its Corollary 2.14.3 can also be proven directly for equationally definable classes of algebras. Of course, the varieties theorem tells us this, but it is nevertheless interesting, particularly since the technique anticipates similar constructions later where quotient algebras (called "Lindenbaum algebras") are formed on the sentences of a logic using the equivalences provable from the logic. Given a set of equations Q, the trick, put quickly, is to define w ≡_Q w′ iff w = w′ is a consequence of Q in the usual model-theoretical sense, i.e., every interpretation, in every algebra of the appropriate similarity type, which satisfies all equations in Q, also satisfies w = w′.

The technique used above in proving Theorem 2.14.2 can actually be used to prove a somewhat more general theorem, which pertains to what we will call subdirect classes. Whereas a variety is required to be closed under the formation of subalgebras, homomorphic images, and direct products, a subdirect class is required only to be closed under subalgebras and direct products (hence the name). Thus, subdirect classes are a more encompassing category of mathematical object. The more general theorem is that every subdirect class K of algebras contains a K-free algebra for each cardinality of generators. The proof that the quotient algebra UF_K(n) is K-free proceeds just like the one above. On the other hand, our method for showing that UF_K(n) is a K-algebra depends heavily upon assuming that K is a variety. Obviously, this does not work for general subdirect classes; as a matter of fact, no method works! Fortunately, however, although we cannot show that UF_K(n) is a K-algebra, we can show that it is isomorphic to a K-algebra, and this is good enough.
Theorem 2.14.6 (Birkhoff) Let K be a non-degenerate class of similar algebras, which class is closed under subalgebras and direct products. Then for any cardinal n, a free K-algebra F_K(n) exists.

Proof Pick a set V of n variables. Form the word algebra W on V of the same similarity type as the algebras in K. For w, w′ ∈ W, define w ≡_K w′ iff for all A ∈ K, for all interpretations I in A, I(w) = I(w′). Then form the quotient algebra W/≡_K (which we denote by W/K; in general in the sequel we suppress ≡). It is easy to see that the [x]_K (for x ∈ V) generate W/K and are all distinct (this last is where the non-degeneracy of K comes in, for if K contains some algebra with at least two elements a and b, then two symbols x and x′ can always be distinguished by some interpretation I with I(x) = a ≠ b = I(x′)). So W/K has the n generators [x]_K. Also the [x]_K are free generators, i.e., any mapping f of them into an algebra in K can be extended to a homomorphism h. Just define an interpretation I so that
(1) I(x) = f([x]_K),
and then let
(2) h([w]_K) = I(w).
Note that this definition does not depend on representatives because the definition of ≡_K above requires that if w ≡_K w′, then I(w) = I(w′).
Thus we have shown that W/K is K-free. The only point remaining would seem to be to show that W/K is a free K-algebra, i.e., that W/K ∈ K. We cannot actually do that, but instead show that some isomorphic copy is a member of K, which clearly suffices (an isomorphic copy of a K-free algebra also being K-free). Note that if K happens to be a variety, then it is closed under isomorphisms, and then we would actually have W/K ∈ K.
The trick is to define, for each interpretation I in an A ∈ K, a congruence ≡_I on W via
(3) w ≡_I w′ iff I(w) = I(w′),
and then observe that where ℐ = {I : there is an A ∈ K such that I is an interpretation in A}, then
(4) h([w]_K) = ⟨[w]_I⟩_{I∈ℐ}
is an isomorphism from W/K into the direct product of the algebras W/I. Thus h is one-one, for if [w]_K ≠ [w′]_K this must be because there exists I ∈ ℐ such that I(w) ≠ I(w′). But then [w]_I ≠ [w′]_I, and so h([w]_K) and h([w′]_K) differ (at some component) as desired. Also h preserves operations. Thus
(5) h(oᵢ([w₁]_K, ..., [wₙ]_K)) = h([Oᵢw₁···wₙ]_K) = ⟨[Oᵢw₁···wₙ]_I⟩ = ⟨oᵢ([w₁]_I, ..., [wₙ]_I)⟩ = oᵢ(⟨[w₁]_I⟩, ..., ⟨[wₙ]_I⟩) = oᵢ(h([w₁]_K), ..., h([wₙ]_K)).
Finally, note that when I is an interpretation in A, W/I is isomorphic to a subalgebra of A. Thus the image of W under I is a subalgebra of A, and the homomorphism theorem says that W/I is isomorphic to this subalgebra. So we have shown that W/K is isomorphic to a subalgebra of a direct product of subalgebras of algebras in K. So that isomorphic image is then itself a member of K, K being a subdirect class. □
Corollary 2.14.7 If K is a subdirect class, then every algebra A in K is isomorphic to a quotient algebra of a free K-algebra, F_K(n)/≡.

Proof Like the proof of the analogous Corollary 2.14.3. □

2.15 Birkhoff's Varieties Theorem
We now go about the business of completing the proof of Birkhoff's Theorem 2.10.3. This time we prove the hard part:
(2) Every variety is equationally definable.
Its converse, the easy part (1), has been proven in Section 2.10.
We begin by introducing some concepts, which are important in their own right and will figure centrally in the proof. Let K be a similarity class of algebras and let T(K) be a class of terms suitable to K. Now define Kᵉ to be the set of all equations in T(K) (expressions of the form t₁ = t₂, with t₁, t₂ ∈ T(K)) that are valid in all members of
K, i.e., such that I(t₁) = I(t₂) for all interpretations I in all A ∈ K. Conversely, given a set Q of equations, we can form the class Qᵃ of all algebras (of appropriate similarity type) in which all members of Q are valid. It is easy to establish the following.

Facts 2.15.1 Let K₁, K₂ be similarity classes of algebras (of the same type) and let Q₁, Q₂ be sets of equations (in the same terms). Then:
(a) K₁ ⊆ K₂ ⇒ K₂ᵉ ⊆ K₁ᵉ;
(b) Q₁ ⊆ Q₂ ⇒ Q₂ᵃ ⊆ Q₁ᵃ;
(c) K₁ ⊆ (K₁ᵉ)ᵃ;
(d) Q₁ ⊆ (Q₁ᵃ)ᵉ.

In general, pairs of mappings like (e, a) defined between two power sets are called "Galois connections," and there is an important body of theory concerning them, but now it is just fact (c) that interests us. For if we can go on to show that, on the hypothesis that K is closed under subalgebras, homomorphic images, and direct products, we also have the converse of (c),
(c⁻¹) (Kᵉ)ᵃ ⊆ K,
then
(c′) (Kᵉ)ᵃ = K,
i.e., K is equationally defined by the set of equations Kᵉ.
So assume the hypothesis of the necessity half of Birkhoff's varieties theorem, i.e., that K is a subdirect class also closed under homomorphic images. We may suppose without loss of generality that K is non-degenerate, since the class of all one-element algebras is trivially axiomatized by the equation x = y. Suppose further, then, that A ∈ (Kᵉ)ᵃ. We show that A ∈ K.
First form an appropriate word algebra W on a set V of variables at least as big as A. Let I map V onto A. Since W is universally free (cf. Section 2.14), I can be extended to an interpretation
(1) I: W → A.
By a theorem of Birkhoff (Theorem 2.14.6), since K is a subdirect class, K has free K-algebras of any cardinality of generators. Let the cardinality of V be n and consider F_K(n). It will simplify matters to suppose that F_K(n) has been constructed as a quotient algebra on W in the manner of the proof of Birkhoff's theorem. We then have the canonical interpretation
(2) [ ]_K: W → F_K(n) (onto).
We show that ≡_K ⊆ ≡_I. (Note that if we already knew that A ∈ K, this would be immediate.) Thus suppose w₁ ≡_K w₂. Then w₁ = w₂ is a valid identity in K. Since our hypothesis is that A ∈ (Kᵉ)ᵃ, then w₁ = w₂ is also valid in A. But then I(w₁) = I(w₂), i.e., w₁ ≡_I w₂.
The next trick is to define a new mapping (I/K) from F_K(n) onto A by "dividing out" the mapping I by the mapping [ ]_K, i.e., we set

(I/K)([w]_K) = I(w).

If (I/K) is a homomorphism from F_K(n) onto A, then the proof is complete, since K contains F_K(n) and is closed under homomorphic images. That (I/K) is such a homomorphism is an immediate consequence of the following more abstract lemma.

Lemma 2.15.2 Let A be an algebra with two congruences θ₁ and θ₂, with θ₁ ⊆ θ₂. Then A/θ₂ is a homomorphic image of A/θ₁ under the mapping

h([a]_{θ₁}) = [a]_{θ₂}.

Proof The first thing to verify is that h is a well-defined (single-valued) mapping. This is where the hypothesis that θ₁ ⊆ θ₂ comes in, for if [a]_{θ₁} = [a′]_{θ₁}, i.e., a θ₁ a′, then (by the hypothesis) a θ₂ a′, i.e., [a]_{θ₂} = [a′]_{θ₂}, and so the definition of h does not depend upon representatives. That h preserves operations falls right out of the representativewise definitions of the operations on both A/θ₁ and A/θ₂. Thus h(oᵢ([a₁]_{θ₁}, ..., [aₙ]_{θ₁})) = h([oᵢ(a₁, ..., aₙ)]_{θ₁}) = [oᵢ(a₁, ..., aₙ)]_{θ₂} = oᵢ([a₁]_{θ₂}, ..., [aₙ]_{θ₂}) = oᵢ(h([a₁]_{θ₁}), ..., h([aₙ]_{θ₁})). And clearly h is onto, each [a]_{θ₂} having as pre-image [a]_{θ₁}. □

Remark 2.15.3 An analysis of the proof shows that taking products, subalgebras, and homomorphic images can be organized in an efficient way. Let Q be a set of equations. Then Qᵃ = HSP(Qᵃ).

2.16 Quasi-varieties

While varieties are important because many commonly studied algebras are equationally definable, they are still a bit restrictive. For example, cancellation is not an equation:

x ∘ z = y ∘ z ⇒ x = y.

We generalize this example to include more than one equation in the antecedent:

Definition 2.16.1 A quasi-equation is a formula of the form

s₁ = t₁ & ... & sₙ = tₙ ⇒ s = t.

Note that we allow the antecedent to be empty (but not the consequent), and so an equation is a special case of a quasi-equation.

Remark 2.16.2 Note that a quasi-equation is indirectly of the form:
""PI V ""P2 V ... ""P/1 V Iff, wher~ .the PiS and Iff ar~ all e~uations. Such a formula, but where the PiS and Iff can be arbItrary sentences (mvolvmg predicates other than identity), is called by model
50
UNIVERSAL ALGEBRA
theorists a "Horn formula." Horn formulas allow that some of the disjuncts may be missing, and so lfI all by itself counts. Also "'PI V"'P2 V ... "'Pn counts as a Horn formula. We have no use for this and we follow other authors in calling a Horn formula "strict" (or "positive") when lfI is present. A Horn clause is often referred to in the literature as "universal" because it is intended that all variables are implicitly universally quantified with the scope being the entire formula. So we are interested in strict universal Horn formulas whose atomic formulas are equations, i.e., sentences of the form (where s = t is required to be present, but the other disjuncts may be missing):
SOUNDNESS AND COMPLETENESS
Remark 2.16.11 The history of Theorem 2.16.8 is complicated, and it has in effect been proven by various authors with various "definitions" of quasi-variety; see Gratzer (1979) and Wechler (1992) for some of these forms. Wechler (1992) has a good presentation of, e.g., K = SPred(K) or LSP(K) (where L is closure under "direct limits"), and also shows what happens if one allows the antecedent of a quasi-equation to be infinite. Wechler calls classes of algebra axiomatized by such formulas implicational classes, and shows that K is an implicational class iff K = SP K.
2.17 Definition 2.16.3 A quasi-equational class of algebras is a similarity class K of algebras definable by a set Q of quasi-equations, i.e., A E K iff every quasi-equation in Q is valid in A. The following replaces "variety" in the theory of equational algebras: Definition 2.16.4 A quasi-variety is a similarity class of algebras closed under subalgebras, isomorphic copies, direct products, and "ultraproducts." We owe the reader a definition of ultraproduct, but first introduce the notion of a reduced product of algebras: Definition 2.16.5 Let I be a set of indices, and let ~ be a filter of sets from its power set )fl(I). The product of algebras reduced by ~ (in symbols ntE! AiEl) is defined as the quotient matrix of the ordinary direct product niEI AiEl, produced by the congruence relation =;Y induced by ~ as follows: (ai >iEI =;Y (bi >iEI iff {i : ai = bi} E ~. Note that the congruence relation can be understood as saying that (ai >iEl and (bi>iEI are "almost everywhere" identical. There are two examples of special interest. First, when ~ is just the power set of I, we obtain the ordinary direct product. Second, when ~ is a maximal filter, we get what is called an ultraproduct (~is usually called an ultrafilter in this context). Definition 2.16.6 An ultraproduct is a product of algebras reduced by a maximal filter. Exercise 2.16.7 Prove that quasi-equations are not closed under homomorphic images, but are closed under subalgebras, isomorphic copies, direct products, and ultraproducts. Theorem 2.16.8 (Mal'cev 1973) Let K be a similarity class of algebras. K is a quasiequational class iff K is closed under subalgebras, isomorphic copies, direct products, and ultraproducts. Remark 2.16.9 It is customary to consider closure under isomorphism as built into K. Sometimes in the literature this is marked by referring to K as an "abstract" class, but we shall just implicitly make this assumption in the rest of the book. So the above theorem can be stated as K = SPPuK. Exercise 2.16.10 Prove Theorem 2.16.8 from left to right.
51
2.17 Logic and Algebra: Algebraic Statements of Soundness and Completeness
Let W be an algebra of words and let Q be a set of equations in W, i.e., a set of expressions of the form w₁ = w₂ where w₁, w₂ ∈ W. There is the notion from logic of Q being sound (or correct) with respect to a class of algebras K (similar to W), or, to focus attention on the algebras K, the notion of K satisfying Q. Informally, this means that every equation in Q holds in every algebra in K, no matter what assignment is made to the symbols that generate W (the symbols x ∈ V are thought of as "variables"). More formally (cf. Remark 2.14.5), this means that whenever an equation w₁ = w₂ is a member of Q, then I(w₁) = I(w₂) for every interpretation I in every algebra A ∈ K. In symbols we write this as ⊨_K Q. We can now establish the following connection between the logical notion of soundness and the algebraic notion of freedom.

Theorem 2.17.1 Let W be an algebra of words and let Q be a non-trivial set of equations ("non-trivial" means that not every equation is a consequence of Q). Then Q is sound in a class of algebras K (similar to W) iff W/≡_Q is free in K, with the free generators being the [x] such that x is a variable (i.e., an atomic word).

The notion W/≡_Q was introduced in Section 2.14, and amounts to identifying two words w₁ and w₂ as "the same" just when w₁ = w₂ is a consequence of Q. Before turning to the proof of the theorem, we first attend to a needed lemma.

Lemma 2.17.2 (Interpretations and homomorphisms) Let W be an algebra of words and let Q be a set of equations in W that is sound for a class of similar algebras K. Let I be an interpretation of W in an algebra A ∈ K. Then the mapping

h([w]) = I(w)
is a homomorphism of W/≡_Q into A.

Proof The point of the soundness assumption is simply to ensure that h is well-defined as a function (single-valued). Thus if [w₁] = [w₂] then w₁ = w₂ is a consequence of Q, and so (by soundness) I(w₁) = I(w₂), i.e., h([w₁]) = h([w₂]). That h preserves operations follows directly from the fact that I preserves operations on the algebra of words and from the representativewise definitions of the operations on W/≡_Q. Thus
h(oᵢ*([w₁], ..., [w_k])) = h([Oᵢw₁···w_k]) = I(Oᵢw₁···w_k) = oᵢ*(I(w₁), ..., I(w_k)) = oᵢ*(h([w₁]), ..., h([w_k])). □
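As a concrete instance of the lemma (our sketch; the choice of Q, the canonical-form trick, and the mod-5 algebra are assumptions for illustration): take Q to consist of commutativity, represent each class of W/≡_Q by recursively sorting subterms, and observe that h([w]) = I(w) gives the same value on Q-equivalent representatives whenever the target algebra satisfies Q.

```python
def canon(w):
    """Canonical representative of [w] when Q is commutativity: sort subtrees."""
    if isinstance(w, tuple):
        return tuple(sorted((canon(w[0]), canon(w[1])), key=repr))
    return w

A_op = lambda x, y: (x + y) % 5                 # addition mod 5 satisfies Q (soundness)
def I(w, a):                                    # an interpretation of W in A
    return A_op(I(w[0], a), I(w[1], a)) if isinstance(w, tuple) else a[w]

a = {"x": 2, "y": 4}
w1, w2 = ("x", ("y", "x")), (("x", "y"), "x")   # Q-equivalent: x*(y*x) and (x*y)*x
print(canon(w1) == canon(w2))                   # True: the same class in W/=_Q
print(I(w1, a) == I(w2, a))                     # True: h([w]) is representative-free
```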
Remark 2.17.3 We have just shown that under the assumption of soundness, interpretations determine homomorphisms on W/≡_Q. With no special assumptions at all, the converse also holds. Thus given a homomorphism h from W/≡_Q into A, we can define an interpretation I_h(w) = h([w]). I_h is obviously an interpretation, i.e., a homomorphism on the algebra of words, since it is a composition of two homomorphisms. Indeed, it is easy to see (under the assumption of soundness) that since homomorphisms on W/≡_Q and interpretations on W codetermine one another, there is a one-one correspondence between them.
Proof Dealing first with the "if" part of Theorem 2.17.1, we suppose that Q is sound for K and that f is any mapping of the set of generators [x] into an algebra A ∈ K. Then define the interpretation I inductively on W so that (1) I(x) = f([x]), (2) I(oᵢw₁ … wₖ) = oᵢ(I(w₁), …, I(wₖ)), and then define h from I using Remark 2.17.3 as in Lemma 2.17.2. The function h is then a homomorphism extending f.

Turning now to the "only if" part of the theorem, we suppose for the sake of contradiction that W/≡_Q is free in K and yet Q is not sound in K. Since Q is not sound in K, we know that some equation u₁ = u₂ is a consequence of Q and yet there is an interpretation I of W in some algebra A in K so that I(u₁) ≠ I(u₂). Define a mapping f([x]) = I(x) for every x ∈ V. The only way this mapping could fail to be well-defined is if Q ⊢ x₁ = x₂ for some distinct x₁, x₂ ∈ V. But then (since x₁ and x₂ are "variables," i.e., consequence is closed under substitution), Q ⊢ w₁ = w₂ for all words w₁, w₂ ∈ W, and so Q is trivial, contradicting the hypothesis of the theorem. Since W/≡_Q is free in K, the mapping f is extendible to a homomorphism h. We show that

(*) h([w]) = I(w) for every word w ∈ W.

We show this by induction on generators in the algebra of words W, the base case being immediate. For the inductive case let us suppose w = oᵢw₁ … wₖ and that h([wⱼ]) = I(wⱼ) for each j = 1, …, k. Then

h([oᵢw₁ … wₖ]) = h(oᵢ([w₁], …, [wₖ])) = oᵢ(h([w₁]), …, h([wₖ])) = oᵢ(I(w₁), …, I(wₖ)) = I(oᵢw₁ … wₖ).

Thus we have established (*). But then h([u₁]) = I(u₁) ≠ I(u₂) = h([u₂]), and so h([u₁]) ≠ h([u₂]), despite the fact that [u₁] = [u₂] (remember that u₁ = u₂ is a consequence of Q). But then h is not a function (since it does not give a unique value for [u₁], alias [u₂]), and so h is not a homomorphism. Yet our assumption that W/≡_Q is free gives us h as a homomorphism. □

Exercise 2.17.4 Show that the assumption that Q is non-trivial is required in the above theorem.

Having established a conceptual connection between the logical notion of soundness and the algebraic notion of freedom, we look for a corresponding algebraic rendering of the logical notion of completeness. A set of equations Q in an algebra of words W is complete with respect to a similar class of algebras K iff whenever an equation w₁ = w₂ is valid in K (i.e., I(w₁) = I(w₂) for every interpretation I in every algebra in K), then w₁ = w₂ is a consequence of Q. We have the following relatively trivial result.

Theorem 2.17.5 If W/≡_Q ∈ K, then Q is complete with respect to K.
Proof Proceeding contrapositively, if w₁ = w₂ is not a consequence of Q then [w₁] ≠ [w₂] in W/≡_Q. Now define the canonical interpretation I_c(w) = [w]. This is just the canonical homomorphism on the algebra of words (cf. Section 2.8). But then I_c(w₁) ≠ I_c(w₂), and so w₁ = w₂ is not valid in K. □

So when the quotient algebra induced by Q is a K-algebra, we know that the set of equations Q is complete for the class of algebras K. It turns out that given the natural (and in practice widely satisfied) requirement that K is a variety, and also the additional assumption (which we wish we could dispense with) that Q is sound in K, then the converse holds as well.
Theorem 2.17.6 Let K be a variety and let Q be a set of equations that is both sound and complete for K. Then W/≡_Q ∈ K.

Proof Let us suppose that Q is a set of equations in some algebra of words W and that Q is complete and sound for some class K of similar algebras. For each interpretation I of W in some A ∈ K, the image I*(W) is a subalgebra of A. This is because an interpretation is just a homomorphism on the algebra of words, and homomorphic images are always subalgebras. Now consider the class of all interpretations of W in all K-algebras, I(K). We use it as a class of indices to form the direct product ×_{I∈I(K)} I*(W). The following can then be seen to be an isomorphism of W/≡_Q into this direct product: h([w]) = (I(w))_{I∈I(K)}.
It is in showing that h is well-defined that we require soundness. Much as in the lemma on interpretations and homomorphisms, we argue that if [w₁] = [w₂], then w₁ = w₂ is a consequence of Q, and so by soundness I(w₁) = I(w₂). So the definition of h given above does not depend on representatives.
That h is one-one follows from the fact that Q is complete for K. Thus if [w₁] ≠ [w₂], then w₁ = w₂ is not a consequence of Q. So by completeness we know there must be some interpretation I₀ in some K-algebra such that I₀(w₁) ≠ I₀(w₂). So (I(w₁))_{I∈I(K)} and (I(w₂))_{I∈I(K)} differ at some component, namely I₀, and are not identical.
That h is a homomorphism follows by the following stepwise obvious calculations:

h(oᵢ([w₁], …, [wₖ])) = h([oᵢ(w₁, …, wₖ)]) = (I(oᵢ(w₁, …, wₖ)))_{I∈I(K)} = (oᵢ(I(w₁), …, I(wₖ)))_{I∈I(K)} = oᵢ((I(w₁))_{I∈I(K)}, …, (I(wₖ))_{I∈I(K)}) = oᵢ(h([w₁]), …, h([wₖ])).
We have now shown that W/≡_Q is isomorphic to a subalgebra of a direct product of subalgebras of algebras in K. Since K is a variety, we know that K is closed under all of these relationships, and so W/≡_Q ∈ K. □

We now combine the theorems of this section into the following theorem, which connects the logical and algebraic notions of a set of equations' adequacy for a class of algebras.

Theorem 2.17.7 Let W be an algebra of words and let Q be a non-trivial set of equations in W. Let K be a variety of algebras similar to W. Then Q is sound and complete (logically adequate) for K iff W/≡_Q is a free K-algebra (algebraically adequate).
Proof Simply combine the preceding results.
□

3
ORDER, LATTICES, AND BOOLEAN ALGEBRAS

3.1 Introduction
In the current chapter, we present a very special and useful class of mathematical structures, which play an important role in the algebraic description of logic. These structures, called lattices, can be characterized both as relational structures (pertaining to the notion of implication), and as algebraic structures (pertaining to the notions of conjunction and disjunction). We first describe lattices as relational structures; in particular, we describe them as (special sorts of) partially ordered sets, which are the topic of the next section.

3.2 Partially Ordered Sets
Fundamental to logic is the concept of implication or entailment. Under one of its guises, implication is a binary (two-place) relation among sentences (alternatively, propositions). Thus, in this guise, implications among sentences in the object language are expressed by sentences in the metalanguage. The following are the basic properties of implication, where the variables range over either sentences or propositions: (1) x implies x.
(2) If x implies y, and y implies z, then x implies z. In other words, implication is a relation that is (1) reflexive and (2) transitive. A relation having these two properties is customarily called a pre-ordering. Another term that is used is 'quasi-ordering,' but we shall use only the former term. The following series of definitions presents the ideas formally. Definition 3.2.1 Let A be any set, and let R be any binary relation. Then R is said to be reflexive on A if the following is satisfied for all a in A: (RE) aRa.
Definition 3.2.2 Let A be any set, and let R be any binary relation. Then R is said to be transitive on A if the following is satisfied for all a, b, c in A: (TR) If aRb and bRc, then aRc.
Definition 3.2.3 Let A be any set, and let R be any binary relation. Then R is said to be a pre-ordering on A if it is (1) reflexive on A, and
(2) transitive on A.
Definition 3.2.4 Let A be any set, and let R be any binary relation on A. Then the relational structure (A, R) is said to be a pre-ordered set if R is a pre-ordering on A.
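For a finite relation represented as a set of ordered pairs, these conditions can be checked mechanically. The following is a minimal computational sketch (not from the text; all function names are ours):

```python
def is_reflexive(A, R):
    # (RE): aRa for every a in A
    return all((a, a) in R for a in A)

def is_transitive(R):
    # (TR): if aRb and bRc, then aRc
    return all((a, d) in R
               for (a, b) in R for (c, d) in R if b == c)

def is_preordering(A, R):
    return is_reflexive(A, R) and is_transitive(R)

# "x is no longer than y" on a set of words is a pre-ordering,
# but not a partial ordering, since it is not anti-symmetric.
words = {"ab", "cd", "xyz"}
R = {(x, y) for x in words for y in words if len(x) <= len(y)}
print(is_preordering(words, R))  # True
```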
A relation of the sort described above is called a pre-ordering because it can be used to define an ordering-more specifically, a partial ordering. More about this in a moment; first, we define a few subsidiary notions. First of all, recall that a relation R is said to be symmetric on A if the following condition is satisfied for every a, b in A: (SY) If aRb then bRa. R is said to be asymmetric on A if it satisfies the following condition for all a, b in A:
(AS) If aRb then not (bRa). It is easy to show that any relation that is asymmetric on A is automatically irreflexive on A, which is to say that it satisfies the following condition:
(IR) not (aRa). Condition (AS) says that b does not bear R to a if a bears R to b, even if a is b. A natural weakening of asymmetry involves adding the proviso that a and b are distinct, which yields the following condition of weak asymmetry: (WAS) If aRb then not (bRa), provided a ≠ b. A little reflection demonstrates that weak asymmetry is logically equivalent to condition (ANS) in the following definition, a condition that is usually called antisymmetry.
(ANS) If aRb and bRa, then a = b.

With the notion of anti-symmetry, we can now define partial ordering.

Definition 3.2.6 Let A be any set, and let R be any binary relation. Then R is said to be a partial ordering on A if
(1) R is reflexive on A;
(2) R is transitive on A;
(3) R is anti-symmetric on A.

Definition 3.2.7 Let A be any set, and let R be any binary relation on A. Then the relational structure (A, R) is said to be a partially ordered set (poset) if R is a partial ordering on A.

As mentioned above, every pre-ordering gives rise to a partial ordering. Here is how that works. We begin with a pre-ordered set (A, R), and we define a relation ≡ on A as follows:
(d1) a ≡ b iff aRb and bRa.
After we show that ≡ is an equivalence relation (a task left as an exercise for the reader), we factor out the relation ≡, thereby obtaining A/≡, the class of equivalence classes of A modulo ≡. Finally, we define a relation R/≡ on A/≡ as follows:
(d2) [a] R/≡ [b] iff aRb.
Here, [a] is the equivalence class of a modulo ≡; i.e., [a] = {x : a ≡ x}. Note that this is not the usual notion of quotient structure, but rather a stronger notion, according to which class A bears the derivative relation to class B iff every element of A bears the original relation to every element of B. Of course, whenever one defines a relation or operation on a collection of equivalence classes, one must demonstrate that the definition is legitimate, which is to say that it does not lead to contradiction. If there is a problem in (d2), it arises in a situation of the following sort: [a₁] = [a₂], [b₁] = [b₂], [a₁] R/≡ [b₁], and hence [a₂] R/≡ [b₂], but not (a₂Rb₂), and hence not ([a₁] R/≡ [b₁]). The envisaged circumstance, however, cannot arise, in virtue of the definitions of ≡ and R/≡; in particular, the transitivity of R and of ≡ precludes the envisaged circumstance. (The reader may prove this as an exercise.) Finally, having defined R/≡, and having shown that it is well-defined, one shows that it is a partial order relation. (This too is left as an exercise.)

The generic symbol for a partial order relation is the inequality sign '≤,' borrowed from arithmetic. Thus, the conditions on a partial order relation may be stated somewhat more suggestively as follows:
(p1) a ≤ a (reflexivity).
(p2) If a ≤ b, and b ≤ c, then a ≤ c (transitivity).
(p3) If a ≤ b, and b ≤ a, then a = b (anti-symmetry).

As suggestive as the inequality sign is, our use of it is not meant to suggest that every partial ordering is structurally just like the numerical ordering. In arithmetic (more generally, in the theory of real numbers) the order relation satisfies an important additional condition, not logically included in the above three conditions, a condition that is sometimes called connectivity, which is formally presented as follows.

Definition 3.2.8 Let A be any set, and let R be any binary relation. Then R is said to be connected on A if, for all a, b in A,
(CON) aRb or bRa.

For example, for any pair of numbers a, b, either a is less than or equal to b or b is less than or equal to a. Thus, the standard numerical ordering is connected on the set of natural numbers. When we add the notion of connectivity to that of partial ordering, we obtain the notion of linear ordering, which is formally defined as follows.

Definition 3.2.9 Let A be any set, and let R be any binary relation. Then R is said to be a linear ordering on A if (1) R is a partial ordering on A, and (2) R is connected on A.
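Returning to the passage from pre-orderings to partial orderings via (d1) and (d2): for finite relations the construction can be carried out mechanically. A sketch under the same conventions as before (finite relations as sets of pairs; the names are ours):

```python
def quotient_poset(A, R):
    """Collapse a pre-ordered set (A, R) into a poset, following (d1)/(d2)."""
    # (d1): a ≡ b iff aRb and bRa
    cell = {a: frozenset(x for x in A if (a, x) in R and (x, a) in R)
            for a in A}
    blocks = set(cell.values())
    # (d2): [a] R/≡ [b] iff aRb (well defined, by the transitivity of R)
    order = {(cell[a], cell[b]) for (a, b) in R}
    return blocks, order

words = {"ab", "cd", "xyz"}
R = {(x, y) for x in words for y in words if len(x) <= len(y)}
blocks, order = quotient_poset(words, R)
print(len(blocks))  # 2: the two-letter words collapse into one class
```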
Definition 3.2.10 Let A be any set, and let R be any binary relation on A. Then the relational structure (A, R) is said to be a linearly ordered set if R is a linear ordering on A.

In this connection, alternative terminology includes 'total ordering,' 'totally ordered set,' and 'chain.' However, the latter term is typically used to describe subrelational structures. Thus, for example, one might talk about chains in a partially ordered set (A, R), which are simply subrelational structures of (A, R) that happen additionally to be linearly ordered. Derivatively, we will speak of a set B ⊆ A as a chain when (B, R′) is a linearly ordered set, where R′ is just R restricted to B.

As noted already, if we take the natural numbers and their usual less-than-or-equal-to ordering, then the resulting relational structure (N, ≤) is a linearly ordered set. There are, of course, other ways to impose order on the natural numbers. For example, one can define a relation of integral division, or simply division; specifically, we say that b (integrally) divides c if the quotient c/b is an integer; thus, for example, 1 divides everything, 2 divides 4, 6, 8, …, and 4 divides 8, 12, 16, …, but 2 does not divide 3, nor does 3 divide 2. One can show that the divides relation on the set of natural numbers is a partial order relation which is not a linear order relation. In other words, one can (and may as an exercise) show the following:
(d1) a divides a.
(d2) If a divides b and b divides c, then a divides c.
(d3) If a divides b and b divides a, then a = b.
(d4) For some a, b, a does not divide b, and b does not divide a.
Another example of a partially ordered set that is not a linearly ordered set is the set of all subsets of (i.e., the power set of) any set A with at least two elements, where the order relation is the relation of set inclusion. The reflexivity and transitivity of inclusion follow directly from its definition in terms of membership; the anti-symmetry of inclusion is simply an alternative way of describing the principle of extensionality, which says that "two" sets are identical if they have precisely the same elements. More generally, let us define an inclusion poset to be a relational structure (A, R), where A is any collection of sets and R is set inclusion (restricted to A). Whereas every inclusion poset is indeed a poset, the converse is not true. On the other hand, as we see in the next chapter, inclusion posets are nonetheless completely typical posets, in the sense that every poset is isomorphic to an inclusion poset. Thus, in trying to fix one's intuitions about posets, it is a good idea to think of inclusion posets, but only so long as one does not fixate on overly specialized ones (e.g., power sets).
3.3 Strict Orderings
Having examined the general notion of a partially ordered set, let us consider some generalizations. First, let us call any transitive relation an ordering; thus, pre-orderings, partial orderings, and linear orderings are all examples of the more general concept of
ordering. As already emphasized, adding the restriction of reflexivity yields the notion of pre-ordering; on the other hand, adding the polar opposite restriction, irreflexivity, yields the notion of strict ordering, which is formally defined as follows.

Definition 3.3.1 Let A be any set, and let R be any binary relation. Then R is said to be a strict ordering on A if (1) R is irreflexive on A, and (2) R is transitive on A.
Definition 3.3.2 Let A be any set, and let R be any binary relation on A. Then the relational structure (A, R) is said to be a strictly ordered set if R is a strict ordering on A.¹

¹The reader is warned that some authors use the words "strict partial order" to denote a partial order that has a least element, and that sometimes this is shortened to just "strict order," as in Wechler (1992).
Although they represent parallel restrictions on general orderings, pre-orderings and strict orderings are not precisely parallel to one another. The reason is that in the presence of transitivity, irreflexivity and asymmetry are equivalent. (This may be proved as an exercise.) Thus, strict orderings are parallel not to pre-orderings in general, but to partial orderings. In particular, every partial ordering R gives rise to an associated strict ordering R*, defined as
(so) aR*b iff aRb and a ≠ b;
and every strict ordering R gives rise to an associated partial ordering R′, defined as
(po) aR′b iff aRb or a = b.
It is routine to show that these two procedures are mutually consistent, in the sense that the strict ordering determined by the partial ordering determined by any strict ordering R is simply R itself, and the partial ordering determined by the strict ordering determined by any partial ordering R is simply R itself. (This is left as an exercise.) Thus, every partial ordering has an alter ego, which is a strict ordering, and vice versa. It is accordingly useful to regard partial orderings and strict orderings as merely different facets of a common concept. Just as partial orderings are intimately connected to strict orderings, linear orderings are intimately connected to strict linear orderings, which are defined as follows.

Definition 3.3.3 Let A be any set, and let R be any binary relation. Then R is said to be weakly connected on A if, for all a, b in A:
(WC) aRb or bRa, provided a ≠ b.
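The alter-ego correspondence (so)/(po) between partial and strict orderings is likewise easy to verify computationally on finite examples; a minimal sketch (names ours):

```python
def strict_from_partial(R):
    # (so): a R* b iff aRb and a != b
    return {(a, b) for (a, b) in R if a != b}

def partial_from_strict(S, A):
    # (po): a R' b iff aSb or a = b
    return S | {(a, a) for a in A}

A = {1, 2, 3}
leq = {(a, b) for a in A for b in A if a <= b}
lt = strict_from_partial(leq)
assert partial_from_strict(lt, A) == leq           # round trip one way
assert strict_from_partial(partial_from_strict(lt, A)) == lt  # and back
print("round trips confirmed")
```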
Notice that (WC) is simply a logical variant of the well-known principle of trichotomy, which says that, for all a, b, aRb or bRa or a = b.

Definition 3.3.4 Let A be any set, and let R be any binary relation. Then R is said to be a strict linear ordering on A if it is
(1) transitive on A,
(2) asymmetric on A (or irreflexive on A), and
(3) weakly connected on A.
Definition 3.3.5 Let A be any set, and let R be any binary relation on A. Then the relational structure (A, R) is said to be a strictly linearly ordered set if R is a strict linear ordering on A.

Since a strict linear ordering is a special sort of strict ordering, it has, as we have seen above, an alter ego, which is a partial ordering. In fact, as the following theorem confirms, the alter ego of a strict linear ordering is a linear ordering, and vice versa.

Theorem 3.3.6 The canonical partial ordering associated with a strict linear ordering is a linear ordering. Similarly, the canonical strict ordering associated with a linear ordering is a strict linear ordering.

Exercise 3.3.7 The proof of Theorem 3.3.6 is left as an exercise.

As we have seen, the notion of ordering naturally divides into the dual notions of partial ordering and strict ordering. Ordering possesses a dual character in another, more commonly discussed, sense. The converse of a relation R is the relation R⁻¹ defined so that xR⁻¹y iff yRx. So for any ordering, we can consider its converse. The following theorem, sometimes referred to as the principle of duality, describes the connection between an ordering and its converse.

Theorem 3.3.8 Let (A, R) and (A, R⁻¹) be relational structures, where R and R⁻¹ are converses of each other. Then if (A, R) is an ordering of any kind (i.e., a plain ordering, a pre-ordering, a partial ordering, a strict ordering, a linear ordering, or a strict linear ordering), then (A, R⁻¹) is also an ordering of that kind.

By way of concluding this section, we introduce the standard notational conventions that pertain to partial orderings, strict orderings, and converses of these. First, just as we use '≤' as the generic symbol for partial orderings, we use the familiar symbol '<' for strict orderings. Second, we use the obvious symbols '≥' and '>' as generic symbols in reference to the converses of partial orderings and strict orderings. In other words, in any specific context, ≤, <, ≥, > are related as follows:
(1) x ≤ y iff x < y or x = y;
(2) x ≥ y iff y ≤ x;
(3) x > y iff y < x;
(4) x ≥ y iff x > y or x = y.

3.4 Covering and Hasse Diagrams
Unlike many mathematical structures, partially ordered sets (at least the finite ones) can be graphically depicted in a natural way. The technique is that of Hasse diagrams, which is based on the notion of covering. Before formally defining the notion of cover, we discuss a number of subsidiary notions, including a general technique for constructing pre-orderings and strict orderings.
Definition 3.4.1 Let R be any relation. Then the transitive closure of R, R*, is defined to be the smallest transitive relation that includes R. Alternatively, R* is defined so that xR*y iff at least one of the following conditions obtains: (1) xRy, or (2) there is a finite sequence of elements c₁, c₂, …, cₙ such that xRc₁, c₁Rc₂, …, cₙRy.

One can show that R*, so defined, always exists. First define the field of R: fld(R) = {x : ∃y(xRy or yRx)}. Then take the intersection of all the transitive relations on the field of R that include R, this latter set being non-empty since it contains at least fld(R) × fld(R).

Definition 3.4.2 Let R be any relation. Then the transitive reflexive closure of R, herein denoted R*, is defined to be the smallest transitive and reflexive relation that includes R. Alternatively, R* is defined so that xR*y iff at least one of the following conditions obtains: (1) xRy, or (2) x = y, or (3) there is a finite sequence of elements c₁, c₂, …, cₙ such that xRc₁, c₁Rc₂, …, cₙRy.

Definition 3.4.3 Let A be any set, and let R be any relation on A. The pre-ordering generated by R, denoted R*, is defined to be the transitive reflexive closure of R.

Just as the transitive closure of a relation always exists, so does the pre-ordering generated by a relation. Thus, every relation gives rise to an associated ordering/pre-ordering. One is naturally led to ask what added conditions ensure that the resulting relation is a partial/strict ordering. This leads to the following definition.

Definition 3.4.4 Let R be any relation. Then R is said to be regular if it satisfies the following infinite series of conditions:
(r1) If aRb, then a ≠ b.
(r2) If aRb, and bRc, then a ≠ c.
(r3) If aRb, and bRc, and cRd, then a ≠ d,
etc. Alternatively stated, R is regular if its transitive closure is irreflexive. (The reader may show this as an exercise.) Intuitively, regular relations admit no closed loops² (e.g., aRb & bRa; aRb & bRc & cRa; etc.). Familiar examples of regular relations include the membership relation of set theory (in virtue of the axiom of regularity), and the relation of strict inclusion. With the notion of regularity, one can prove the following theorem.

Theorem 3.4.5 Let R be any regular relation. Then the transitive closure of R is a strict ordering, and the transitive reflexive closure of R is a partial ordering.

Exercise 3.4.6 Prove Theorem 3.4.5.

²But they do not exclude infinite chains, unlike well-founded relations.
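For a finite relation, the transitive closure of Definition 3.4.1 can be computed by iterating relational composition until nothing new appears, and regularity then checked as irreflexivity of the result. A sketch, with names of our own choosing:

```python
def transitive_closure(R):
    """Smallest transitive relation including R (R given as a set of pairs)."""
    closure = set(R)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

def is_regular(R):
    # Regular iff the transitive closure is irreflexive (no closed loops).
    return all(a != b for (a, b) in transitive_closure(R))

covering = {(1, 2), (2, 3)}          # 1 below 2 below 3
print(transitive_closure(covering))  # {(1, 2), (2, 3), (1, 3)}: a strict ordering
print(is_regular(covering))          # True
print(is_regular({(1, 2), (2, 1)}))  # False: a two-cycle
```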
Before continuing, we introduce one further notion, intransitivity. This is the polar opposite of transitivity; it is considerably stronger than mere non-transitivity. In particular, by an intransitive relation, we mean a relation R satisfying the following condition:
(IT) If aRb and bRc, then not (aRc). Now, every regular relation, even if it is intransitive, generates a partial ordering. Every partial ordering is generated by a regular relation (let the generating relation be the counterpart strict ordering). A more interesting question is whether every partial ordering is generated by an intransitive regular relation. In general, the answer is negative (see Exercise 3.4.9). However, if we restrict our attention to finite posets, then the answer is affirmative. In order to see how this works, we present the following definition.
Definition 3.4.7 Let (A, ≤) be a poset, and let a, b be elements of A. Then b is said to cover a if the following conditions are satisfied: (1) a < b.
(2) There is no x such that a < x and x < b.
In other words, b covers a if and only if b is above a in the ordering, and furthermore, no element lies between them in the ordering. For example, in the numerical ordering of the natural numbers, 2 covers 1. On the other hand, in the numerical ordering of the rational numbers the covers relation is uninstantiated; no rational number covers any rational number, since the set of rationals is dense (there is a rational number between any two distinct rational numbers). One can show that the covering relation is regular and intransitive, so we can consider the partial (strict) ordering generated by the covering relation. In the case of a finite partially (strictly) ordered set, but not in general, the partial (strict) ordering generated by the covering relation is the original partial (strict) ordering.
Exercise 3.4.8 Verify the claims in the preceding paragraph. Exercise 3.4.9 Show that the usual partial ordering on the rational numbers is not generated by an intransitive regular relation. With the notion of cover, we can describe precisely what a Hasse diagram is. A Hasse diagram is a graphical depiction of the covering of a partially (strictly) ordered set. The representational convention is straightforward: one uses points (or other tokens, like name tokens) to represent the elements, and one connects two points to indicate that the corresponding covering relation holds; in particular, in order to indicate that a covers b, one connects "a" and "b" in such a way that "a" is north of "b" in the diagram. One then reads off the diagram by noting that the strict ordering is the transitive closure, and the partial ordering is the transitive reflexive closure, of the covering relation. Figure 3.1 contains some examples of Hasse diagrams. The first diagram depicts the poset consisting of integers 1, 2, 3, ordered by the usual numerical ordering. The second depicts the poset consisting of the subsets of a two-element set {a, b}, ordered by set inclusion. The third depicts the poset consisting of all divisors of 12, ordered by the integral division relation discussed earlier.
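To make the covering idea concrete, the following sketch (names ours) computes the covering relation, i.e., the edges of the Hasse diagram, of a finite poset, using the divisors-of-12 example that appears as (H3) in Figure 3.1:

```python
def covers(A, leq):
    """Edges (a, b) of the Hasse diagram: b covers a in the poset (A, leq)."""
    lt = {(a, b) for (a, b) in leq if a != b}        # the strict ordering
    return {(a, b) for (a, b) in lt
            if not any((a, x) in lt and (x, b) in lt for x in A)}

divisors = {1, 2, 3, 4, 6, 12}
leq = {(a, b) for a in divisors for b in divisors if b % a == 0}
print(sorted(covers(divisors, leq)))
# [(1, 2), (1, 3), (2, 4), (2, 6), (3, 6), (4, 12), (6, 12)]
```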
[Figure 3.1 here: three Hasse diagrams: (H1) the chain 1 < 2 < 3 under the usual numerical ordering; (H2) the four subsets of {a, b} ordered by inclusion; (H3) the divisors 1, 2, 3, 4, 6, 12 of 12 ordered by integral division.]

FIG. 3.1. Examples of Hasse diagrams

As noted in the previous section, ordering is a double-sided notion in at least two senses. This "double duality" is reflected in the method of Hasse diagrams. First, a Hasse diagram is impartial between the partial ordering and the strict ordering it depicts; whether we regard a Hasse diagram as portraying one rather than the other depends only upon whether we take the relation depicted to be reflexive or irreflexive. Thus, the first principle of duality between strict and partial orderings is graphically represented by the impartiality of Hasse diagrams. The second form of duality (which is literally "duality" in the accepted mathematical jargon) has an equally concrete graphical representation. Specifically, the (second) principle of duality amounts to the principle that every Hasse diagram can be turned upside down, and the result is a Hasse diagram, not of the original ordering, but of its converse (dual).
3.5 Infima and Suprema

Thus far in this chapter, we have discussed only one logical concept, implication, which we have treated as a two-place relation among sentences (propositions). Logic seeks to characterize valid reasoning, and so the concept of implication is central to logic. However, the fundamental strategy employed by formal logic is to analyze language in terms of its syntactic structure, and, in part at least, this involves analysis in terms of a few privileged logical connectives, most notably "and," "or," "not," and "if-then." We are accordingly interested in the mathematical description of these concepts, and their relation to implication, as we have described it above. In the next few sections, we concentrate on "and" and "or"; later in the chapter, we concentrate on "not" and "if-then." Although "and" and "or" are probably best understood as anadic connectives in English ("anadic" means having no fixed degree), in formal logic they are typically treated as dyadic connectives. In lattice theory, they can be treated either way, but let us start with the dyadic treatment. What are the general properties of conjunction? Let us suppose that "and" corresponds to a propositional operation, so whenever x and y are
propositions, so is x-and-y. Without saying too much about the nature of propositions and propositional conjunction, we can at least say the following:
(C1) x-and-y implies x; x-and-y implies y.
(C2) If w implies x, and w implies y, then w implies x-and-y.
Note the exact formal parallel in set theory:
(S1) X ∩ Y ⊆ X; X ∩ Y ⊆ Y.
(S2) If W ⊆ X and W ⊆ Y, then W ⊆ X ∩ Y.
Now, just as we can talk about the intersection of a collection of sets, we can talk about the conjunction of a collection X of propositions. This leads to the natural generalization of (C1) and (C2):
(C1*) The conjunction of X implies each x in X.
(C2*) If w implies every x in X, then w implies the conjunction of X.
These also have exact parallels in set theory, the statements of which we leave as an exercise. Next, we describe similar principles for disjunction, which is dual to conjunction and parallel to set union:
(D1) x implies x-or-y; y implies x-or-y.
(D2) If x implies z, and y implies z, then x-or-y implies z.
(D1*) Each x in X implies the disjunction of X.
(D2*) If every x in X implies y, then the disjunction of X implies y.

Definition 3.5.1 Let (A, ≤) be a poset, let S be a subset of A, and let a be an element of A. Then a is said to be an upper bound of S if the following condition is satisfied:
(ub) For all s in S, a ≥ s.
In other words, a is an element of A that is larger than or equal to every element of S. Notice that, in principle, a set S may have any number of upper bounds, including none. The set of upper bounds of S is denoted ub(S). The dual notion of lower bound is defined in a natural way.

Definition 3.5.2 Let (A, ≤) be a poset, let S be a subset of A, and let a be an element of A. Then a is said to be a lower bound of S if the following condition is satisfied:
(lb) For all s in S, a ≤ s.
In other words, a is an element of A that is smaller than or equal to every element of S. As with upper bounds, a set can have any number of lower bounds. The set of lower bounds of S is denoted lb(S).

When we set S = A in the above definitions, we obtain two useful specializations. First of all, A does not have just any number of upper bounds (lower bounds); it has exactly one, or it has none. For suppose that p, q are both upper bounds of A. Then p ≥ x for all x in A, and q ≥ x for all x in A, but p, q are in A, so in particular, p ≥ q and q ≥ p, whence p = q by anti-symmetry. One shows in a similar manner that lower bounds are unique. Applying the concept of lower and upper bound to the entire poset yields the following definitions.

Definition 3.5.3 A poset is said to be upper bounded (or bounded above) if it has at least one (and hence exactly one) upper bound.

Definition 3.5.4 A poset is said to be lower bounded (or bounded below) if it has at least one (and hence exactly one) lower bound.

Definition 3.5.5 A poset is said to be bounded if it is both upper bounded and lower bounded.

It is customary to use the symbol "1" to refer to the upper bound of a poset (supposing it exists), and to use the symbol "0" to refer to the lower bound of a poset (supposing it exists). Thus, in particular, in a bounded poset, every element lies between the zero element, 0, and the unit element, 1. As we saw above, a poset has at most one upper bound and at most one lower bound. This generalizes to all the subsets of a given poset, since every subset of a poset is a poset in its own right. (Bear in mind that a poset is a relational structure, not an algebra.) This leads to the notion of least element and greatest element, which are defined as follows.
Definition 3.5.6 Let (A, ≤) be a poset, and let S be any subset of A. The greatest element of S is defined to be the unique element of S, denoted g(S), which (if it exists) satisfies the following conditions:
(1) g(S) ∈ S.
(2) For all s in S, g(S) ≥ s.
In other words, g(S) is an element of S that is also an upper bound of S. If there is any such element, then it is unique. On the other hand, not every subset S has a greatest element, which is to say that the term "g(S)" need not refer to anything. A succinct mathematical formulation of this idea is that S ∩ ub(S) is either empty or has exactly one element. A weaker notion is that of a maximal element of S. This is an element m of S which is such that there is no x ∈ S with the property that x > m. Clearly g(S) is maximal, but not necessarily vice versa. The dual notion of least element is defined in the obvious dual way.
Definition 3.5.7 Let (A, ≤) be a poset, and let S be any subset of A. The least element of S is defined to be the unique element of S, denoted l(S), which (if it exists) satisfies the following conditions:
(1) l(S) ∈ S.
(2) For all s in S, l(S) ≤ s.
In other words, l(S) is an element of S that is also a lower bound of S. Once again, l(S) need not exist, but if it does, it is unique. Mathematically speaking, S ∩ lb(S) is either empty or contains exactly one element. Again, minimal element can be defined dually.
Combining the notions of least and upper bound yields the notion of least upper bound, and combining the notions of greatest and lower bound yields the notion of greatest lower bound. The idea is quite simple. The set ub(S) of upper bounds of a set S may or may not be empty; if it is not empty, then ub(S) may or may not have a least element. If ub(S) does have a least element (necessarily unique), then that element is called the least upper bound of S, and is denoted lub(S). In a completely parallel manner, the set lb(S) of lower bounds of S may or may not be empty, and if it is not empty, then it may or may not have a greatest element. But if lb(S) does have a greatest element, then that element is called the greatest lower bound of S, and is denoted glb(S). In spite of the perfectly compositional character of the expressions "least upper bound" and "greatest lower bound," a certain amount of confusion seems to surround these ideas, probably because they involve, so to speak, going in two directions at the same time, so that it may not be clear where one is when the process is completed. For this reason, and for reasons of succinctness, alternative terminology is often adopted. Specifically, the greatest lower bound of S is often called the infimum of S (which is clearly below S), and the least upper bound is often called the supremum of S (which is clearly above S). We will use these terms pretty much interchangeably, although we lean toward the less verbose "infimum" and "supremum."

Exercise 3.5.8 Given a bounded poset, show that lub(∅) = 0, and glb(∅) = 1. (Recall 0 and 1 are the least and greatest elements, respectively.)
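For finite posets the definitions of lub and glb translate directly into code. The following sketch (names ours) computes them over the divisors-of-12 poset; note the empty-set case of Exercise 3.5.8:

```python
def lub(A, leq, S):
    """Least upper bound of S in the finite poset (A, leq), or None."""
    ub = {a for a in A if all((s, a) in leq for s in S)}
    least = [a for a in ub if all((a, b) in leq for b in ub)]
    return least[0] if least else None

def glb(A, leq, S):
    lb = {a for a in A if all((a, s) in leq for s in S)}
    greatest = [a for a in lb if all((b, a) in leq for b in lb)]
    return greatest[0] if greatest else None

divisors = {1, 2, 3, 4, 6, 12}
leq = {(a, b) for a in divisors for b in divisors if b % a == 0}
print(lub(divisors, leq, {4, 6}))  # 12
print(glb(divisors, leq, {4, 6}))  # 2
print(lub(divisors, leq, set()))   # 1: the poset's least element, its "0"
```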
Proposition 3.5.9 Let (A, ≤) be a poset, and let (A′, ≤′) be a "subposet" in the sense that A′ ⊆ A and ≤′ is just ≤ restricted to A′. Let S ⊆ A′ have an infimum i in (A, ≤). If i ∈ A′, then i is also an infimum of S in (A′, ≤′) (and similarly for suprema).

Proof It should be reasonably intuitive that if an element i is the greatest lower bound in (A, ≤) of a set S ⊆ A′, then it continues to be a greatest lower bound of S in the subposet (A′, ≤′), so long as i is itself a member of A′. For i is still a lower bound of S in (A′, ≤′); and any lower bound of S in (A′, ≤′) is in particular a lower bound of S in (A, ≤), and hence lies below i. □

Corollary 3.5.10 Let (A′, ≤′) be a subposet of (A, ≤) as in Proposition 3.5.9. If every S ⊆ A has an infimum (supremum) in A, and the infimum (supremum) of each S ⊆ A′ belongs to A′, then every S ⊆ A′ has that same infimum (supremum) in (A′, ≤′).

We conclude this section by discussing an important example of a poset where infima and suprema always exist. Such posets are called lattices and will be more formally introduced in the next section. Let ℘(X) be the "power set" of the set X, i.e., the set of all subsets of X. Then it is easy to see that this forms a lattice in the following way.

Proposition 3.5.11 Given Y, Z ∈ ℘(X), the infimum of {Y, Z} is the intersection of Y and Z, Y ∩ Z = {x : x ∈ Y and x ∈ Z}. The supremum of {Y, Z} is the union of Y and Z, Y ∪ Z = {x : x ∈ Y or x ∈ Z}.

Proof Proposition 3.5.11 is actually a special case of Proposition 3.5.12 to be found below. □
Proposition 3.5.11 can be generalized from binary intersections and unions to arbitrary intersections and unions: ∩C = {x : ∀Y ∈ C, x ∈ Y}, ∪C = {x : ∃Y ∈ C, x ∈ Y}. The following actually includes Proposition 3.5.11 as a special case, since we can define Y ∩ Z = ∩{Y, Z} and Y ∪ Z = ∪{Y, Z}.

Proposition 3.5.12 Given a non-empty C ⊆ ℘(X), the infimum of C is the intersection of C, ∩C, and the supremum of C is the union of C, ∪C.

Proof We show first that ∩C is a lower bound of the sets in C, i.e., ∀Y ∈ C, ∩C ⊆ Y. But this follows by definition of ∩C. We next must show that ∩C is the greatest among lower bounds of the sets in C. Let B be another lower bound. We must show that B ⊆ ∩C. Since B is a lower bound, this means that ∀Y ∈ C, B ⊆ Y. This means that any member of B is also a member of every Y ∈ C. But this means that B ⊆ ∩C. The proof concerning the supremum proceeds symmetrically, as the reader may confirm. □

Corollary 3.5.13 Let C be any collection of sets closed under arbitrary intersections and unions. Then C forms a complete lattice, with inclusion as the partial order, intersection the infimum, and union the supremum.

Proof This follows from Proposition 3.5.12 using Proposition 3.5.9. □

As a special case we have:

Corollary 3.5.14 Let C be any collection of sets closed under binary intersections and unions. Then C forms a lattice with binary intersection and union.

Especially in older literature, a collection C satisfying the conditions of the last corollary is called a "ring of sets," and one satisfying the conditions of the first corollary is called a "complete ring of sets." Even though not every infimum is intersection, and not every supremum is union, nonetheless the notions of infimum and supremum are natural generalizations of the notions of (infinitary) intersection and union. In the next section, we examine notions that are the abstract counterparts of finite intersection and union.

3.6 Lattices
As noted in the previous section, the infimum (supremum) of a subset S of a poset P is the greatest (least) element of P that is below (above) every element of S. As defined, the notions infimum and supremum apply to all subsets, both finite and infinite. In the present section, we discuss these notions as they apply specifically to finite nonempty subsets of P. As we saw in the previous section such structures are called "lattices." Some standard references include: Balbes and Dwinger (1974), Gericke (1963), Rutherford (1965), and Szasz (1963). As noted below, infima (suprema) of finite non-empty sets reduce to infima (suprema) of doubleton sets, so we begin with these. A doubleton set is a set expressible (abstractly) by something of the form {s, t}, where s, t are terms; accordingly, in spite of its name, a doubleton may in fact have only one element (if s = t); in any case, a doubleton has at least one element, and at most two elements.
When the set S is a doubleton {a, b}, the infimum of S is denoted a ∧ b, and the supremum is denoted a ∨ b. This is in complete analogy with set theory, where the infimum of a pair A, B of sets is denoted A ∩ B, and the supremum of A, B is denoted A ∪ B. It is, furthermore, customary to call a ∧ b the meet of a and b, and to call a ∨ b the join of a and b; thus, we may read "a ∧ b" as "a meet b" and "a ∨ b" as "a join b". Sometimes, the infimum (supremum) of an infinite set S may be called the meet (join) of S. We shall, however, reserve these terms for the finite case. Indeed, we tend to reserve the terms "meet" and "join" for the specifically algebraic characterization of lattices (see Section 3.8), whereas we use the terms "infimum" and "supremum" for the characterization in terms of partially ordered sets. As noted in the previous section, the infimum (supremum) of a set S need not exist, irrespective of cardinality. Issues concerning the existence of infima and suprema lead to the following series of definitions.
Definition 3.6.1 Let P = (A, ≤) be a poset. Then P is said to be a meet-semi-lattice (MSL) if every pair a, b of elements of A has an infimum (meet) in A.

Definition 3.6.2 Let P = (A, ≤) be a poset. Then P is said to be a join-semi-lattice (JSL) if every pair a, b of elements of A has a supremum (join) in A.

Definition 3.6.3 Let P = (A, ≤) be a poset. Then P is said to be a lattice if P is both an MSL and a JSL.

The following can be shown (see below) to be an equivalent definition.

Definition 3.6.4 Let P = (A, ≤) be a poset. Then P is said to be a lattice if every non-empty finite subset S of A has both an infimum and a supremum in A.

By taking the latter as our official definition, we are naturally led to propose further, stronger notions, the following being the most commonly used.

Definition 3.6.5 Let P = (A, ≤) be a poset. Then P is said to be a sigma-complete lattice if every non-empty countable subset S of A has both an infimum and a supremum in A.

Definition 3.6.6 Let P = (A, ≤) be a poset. Then P is said to be a complete lattice if every non-empty subset S of A has both an infimum and a supremum in A.

Thus, as a simple matter of definition, every complete lattice is a sigma-complete lattice, and every sigma-complete lattice is a lattice. The following is a list of theorems, some obvious, some less obvious. In each case, a more or less developed suggestion is appended by which it is hoped that the reader can provide a proof.

Theorem 3.6.7 Every finite lattice is complete, and hence sigma-complete.
Hint: Every subset of a finite set is finite.

Theorem 3.6.8 Not every lattice is sigma-complete, and hence not every lattice is complete.
Hint: Consider the set of integers ordered in the usual way.

Theorem 3.6.9 Not every sigma-complete lattice is complete.
Hint: Consider the set of countable subsets of an uncountable set (e.g., the real numbers).

Theorem 3.6.10 Every complete lattice is bounded.
Hint: Consider the infimum and supremum of the whole set.

Theorem 3.6.11 Not every sigma-complete lattice is bounded.
Hint: Consider the set of countable subsets of an uncountable set.

Theorem 3.6.12 Not every lattice is bounded.
Hint: Consider the set of integers.

Theorem 3.6.13 Let P = (A, ≤) be a poset. Suppose that every doubleton {a, b} in A has an infimum in A. Then P is an MSL.
Hint: Proof by induction, where the induction formula is "every subset of A of size n has an infimum."

Theorem 3.6.14 Let P = (A, ≤) be a poset. Suppose that every doubleton {a, b} in A has a supremum in A. Then P is a JSL.
Hint: Proof by induction.

Theorem 3.6.15 The dual of an MSL is a JSL; conversely, the dual of a JSL is an MSL.
Hint: Show that a is the infimum (supremum) of S in P iff a is the supremum (infimum) of S in P^op, where P^op is (A, ≥) iff P is (A, ≤).

Theorem 3.6.16 The dual of a lattice (a sigma-complete lattice, a complete lattice) is also a lattice (a sigma-complete lattice, a complete lattice).
Hint: See preceding hint.

Theorem 3.6.17 Not every JSL is a lattice.
Hint: Consider the Hasse diagram in Figure 3.2.

[Figure 3.2 here: a three-element Hasse diagram with a single top element a covering two incomparable elements b and c.]

FIG. 3.2. Theorem 3.6.17
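A quick computational check of the Figure 3.2 example (a sketch; the helper names are ours): every pair has a supremum, but the pair {b, c} has no infimum, so the poset is a JSL and not a lattice.

```python
def has_sup(A, leq, x, y):
    ub = [a for a in A if (x, a) in leq and (y, a) in leq]
    return any(all((a, b) in leq for b in ub) for a in ub)

def has_inf(A, leq, x, y):
    lb = [a for a in A if (a, x) in leq and (a, y) in leq]
    return any(all((b, a) in leq for b in lb) for a in lb)

A = {"a", "b", "c"}                  # a sits above b and c
leq = {(x, x) for x in A} | {("b", "a"), ("c", "a")}
print(all(has_sup(A, leq, x, y) for x in A for y in A))  # True: a JSL
print(has_inf(A, leq, "b", "c"))                          # False: not an MSL
```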
Theorem 3.6.18 Let P = (A, ≤) be a poset. Suppose that every subset S of A (including the empty set) has a supremum in A. Then P is a complete lattice.
Hint: First notice that the supremum of the empty set must be the least element of A. Define the infimum of a set to be the supremum of its lower bounds.
3.7 The Lattice of Congruences
Given a collection Y of binary relations on a set X, by definition they form a complete lattice iff for any Σ ⊆ Y both the greatest lower bound ⋀Σ and the least upper bound ⋁Σ are members of Y (these bounds taken relative to ⊆ on Y). Under favorable conditions ⋀ is just intersection.
Lemma 3.7.1 If ∩Σ ∈ Y, then ⋀Σ = ∩Σ.

Proof This follows from Corollary 3.5.13. □
This makes it easy to "compute" the meet. Many natural collections Y of binary relations satisfy the antecedent of the lemma for all Σ ⊆ Y. For example, the class of all equivalence relations, or the class of all congruences, is closed under arbitrary intersections. The details are provided through the following:
Lemma 3.7.2 The set E(A) of all equivalence relations on some algebra A is closed under arbitrary (including infinite) intersections. The same holds for the set C(A) of all congruences on A.

Exercise 3.7.3 Prove the above lemma.

Computing ⋁Σ is more difficult. It is very rare that this is the union, since if there are any closure conditions on the classes of relations (e.g., an equivalence relation requires symmetry and transitivity), the least upper bound is larger than the mere union. We must expand the union using the closure conditions. The symmetric and transitive closure of Σ is the intersection of all sets Σ′ ⊇ ∪Σ such that if (a, b) ∈ Σ′ then (b, a) ∈ Σ′, and if (a, b) ∈ Σ′ and (b, c) ∈ Σ′, then (a, c) ∈ Σ′.
Lemma 3.7.4 Let C(A) be the set of all congruences on the algebra A. Then, for Σ ⊆ C(A), ⋁Σ is the symmetric and transitive closure of Σ.

Proof The symmetric and transitive closure of Σ is clearly the smallest equivalence relation including all of the relations in Σ, being reflexive since each of the relations in Σ is reflexive. The replacement property is perhaps not quite so obvious. Suppose (a, b) ∈ ⋁Σ, and let f be a replacement function (one that applies an operation of A with all arguments but one held fixed). If a and b are "directly congruent" in the sense that for some congruence θ ∈ Σ, (a, b) ∈ θ, then obviously the replacement property holds. Otherwise a and b are "indirectly congruent" in the sense that there exist c₁, …, cₙ ∈ A and θ₁, …, θₙ₊₁ ∈ Σ such that:

a θ₁ c₁ & c₁ θ₂ c₂ & … & cₙ θₙ₊₁ b.

Since each θᵢ is a congruence, the replacement can now take place a step at a time:

f(a) θ₁ f(c₁) & f(c₁) θ₂ f(c₂) & … & f(cₙ₋₁) θₙ f(cₙ) & f(cₙ) θₙ₊₁ f(b).

And so ⟨f(a), f(b)⟩ ∈ ⋁Σ, as required. □
Note that the lemma can be strengthened to require just transitive closure, i.e., the intersection of all sets Σ′ ⊇ ∪Σ such that if (a, b) ∈ Σ′ and (b, c) ∈ Σ′, then (a, c) ∈ Σ′:

Corollary 3.7.5 The set C(A) of all congruences on the algebra A forms a complete lattice, where for Σ ⊆ C(A), (a) ⋀Σ = ∩Σ, and (b) ⋁Σ is the transitive closure of Σ.
Proof The corollary is an immediate consequence of the following:

Fact 3.7.6 If Σ is a set of symmetric relations, then the transitive closure of its union, TransCl(∪Σ), is also symmetric.

Proof If (a, b) ∈ TransCl(∪Σ), then it is easy to see that there exist x₁, …, xₖ, xₖ₊₁, …, xₙ and relations ρ₁, …, ρₙ₊₁ ∈ Σ such that

a ρ₁ x₁ & … & xₖ ρₖ₊₁ xₖ₊₁ & … & xₙ ρₙ₊₁ b.

But each of ρ₁, …, ρₙ₊₁ is symmetric, and so we have:

b ρₙ₊₁ xₙ & … & xₖ₊₁ ρₖ₊₁ xₖ & … & x₁ ρ₁ a,

i.e., (b, a) ∈ TransCl(∪Σ). □
Since the class of congruences C(A) forms a complete lattice, it is clear that it must have a smallest and largest congruence. It is easy to see that the smallest congruence is =_A (the identity relation restricted to A). The largest congruence is of course the universal relation A × A.
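The join construction of Corollary 3.7.5 is easy to carry out for finite equivalence relations: close the union under transitivity, symmetry coming for free by Fact 3.7.6. A sketch (names ours, reusing the closure helper from the Section 3.4 sketch):

```python
def transitive_closure(R):
    closure = set(R)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

def join(R1, R2):
    # For equivalence relations, symmetry is preserved (Fact 3.7.6),
    # so the transitive closure of the union is already the join.
    return transitive_closure(R1 | R2)

def equiv_from_blocks(blocks):
    return {(a, b) for B in blocks for a in B for b in B}

R1 = equiv_from_blocks([{1, 2}, {3}, {4}])
R2 = equiv_from_blocks([{1}, {2, 3}, {4}])
print((1, 3) in join(R1, R2))  # True: 1 ~ 2 in R1 and 2 ~ 3 in R2
print((1, 4) in join(R1, R2))  # False: 4 stays in its own block
```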
3.8 Lattices as Algebras
Thus far, we have characterized lattices as special sorts of partially ordered sets, which are relational structures. However, as mentioned at the beginning of this chapter, lattices can also be characterized as algebras, which is the focus of this section. We start by noting that every semi-lattice induces an associated simple binary algebra. In the case of a meet-semi-lattice, the binary operation, called meet, is defined as one would expect:
(M) a ∧ b = inf{a, b}.
Similarly, in the case of a join-semi-lattice, the binary operation, called join, is also defined as one would expect:
(J) a ∨ b = sup{a, b}.
In an MSL (JSL), the infimum (supremum) of any pair of elements exists, so these operations are well-defined. Next, we note that if the poset in question is in fact a lattice, then there is an associated algebra of type (2,2), where the operations, called meet and join, are defined by (M) and (J), respectively.
Finally, in this connection, we note that if the poset in question is bounded above, we can add a zero-place operation, denoted 1, to the associated algebra, and if the poset is bounded below, we can add a zero-place operation, denoted 0, to the associated algebra. Thus, for example, a bounded lattice gives rise to an algebra of type (2,2,0,0). Every lattice gives rise to an associated algebra. What about the other direction; what sort of algebra gives rise to a lattice? We begin by answering a smaller question, concerning the algebraic description of semi-lattices, which is the topic of the following definition.

Definition 3.8.1 Let A be an algebra of type (2), where the sole operation is *. Then A is said to be a semi-lattice algebra if it satisfies the following equations:
(s1) a * (b * c) = (a * b) * c (associativity);
(s2) a * b = b * a (commutativity);
(s3) a * a = a (idempotence).

One might naturally be interested in the relation between semi-lattice algebras, on the one hand, and MSLs and JSLs, on the other. One's curiosity is satisfied, it is hoped, by the four theorems that follow. The reader can easily provide the proofs.

Theorem 3.8.2 Let (A, ≤) be an MSL. Define a binary operation * so that a * b = inf{a, b}. Then the resulting structure (A, *) is a semi-lattice algebra.

Theorem 3.8.3 Let (A, ≤) be a JSL. Define a binary operation * so that a * b = sup{a, b}. Then the resulting structure (A, *) is a semi-lattice algebra.

These theorems assert that every MSL (JSL) generates an associated semi-lattice algebra. More interestingly perhaps, every semi-lattice algebra generates both an MSL and a JSL, which is formally stated in the following theorems.

Theorem 3.8.4 Let (A, *) be a semi-lattice algebra. Define a binary relation ≤ as follows:
(o1) a ≤ b iff a * b = a.
Then the resulting relational structure (A, ≤) is an MSL, where inf{a, b} = a * b.

Theorem 3.8.5 Let (A, *) be a semi-lattice algebra. Define a binary relation ≤ as follows:
(o2) a ≤ b iff a * b = b.
Then the resulting relational structure (A, ≤) is a JSL, where sup{a, b} = a * b.

Exercise 3.8.6 Prove the four preceding theorems.

Notice that the ordering relation defined by (o1) is the converse of the order relation defined by (o2). Thus, the MSL and JSL mentioned in the previous theorems are duals to one another. We sketch the proof that ≤, as defined by (o1), is transitive. Suppose that a ≤ b and b ≤ c; we show that a ≤ c. By (o1), a * b = a, and b * c = b, so by substitution of b * c for b in a * b = a we have a * (b * c) = a, which by associativity yields (a * b) * c = a, but a * b = a, so we have a * c = a, which by (o1) means that a ≤ c, which was to be shown.

Thus we see that semi-lattices can be characterized by a set of equations, specifically (s1)-(s3), which is to say that semi-lattices form a variety. Next, we consider whether lattices can be equationally characterized. First, every lattice is both an MSL and a JSL, so at a minimum, we need two copies of (s1)-(s3), one for meet, one for join:
(L1) a ∧ (b ∧ c) = (a ∧ b) ∧ c;
(L2) a ∧ b = b ∧ a;
(L3) a ∧ a = a;
(L4) a ∨ (b ∨ c) = (a ∨ b) ∨ c;
(L5) a ∨ b = b ∨ a;
(L6) a ∨ a = a.

But a lattice is not merely a pair of semi-lattices. In addition, the semi-lattices are linked by a common partial order relation so that a ≤ b iff a ∧ b = a and a ≤ b iff a ∨ b = b. So our set of equations must ensure that a ∧ b = a iff a ∨ b = b. We could simply add this biconditional as an axiom, but it is not an equation, and we are looking for an equational characterization. The customary equations are the following two:
(L7) a ∧ (a ∨ b) = a;
(L8) a ∨ (a ∧ b) = a.

With these equations listed, we present the following formal definition and attendant theorems.

Definition 3.8.7 Let A be an algebra of type (2,2). Then A is said to be a lattice algebra if it satisfies equations (L1)-(L8).

Theorem 3.8.8 Let (A, ≤) be a lattice. Define two binary operations, ∧ and ∨, so that a ∧ b = inf{a, b}, and a ∨ b = sup{a, b}. Then the resulting algebra (A, ∧, ∨) is a lattice algebra.

Theorem 3.8.9 Let (A, ∧, ∨) be a lattice algebra. Define a relation ≤ so that a ≤ b ⇔ a ∧ b = a. Then the relational structure (A, ≤) is a lattice, and in particular, for every pair a, b in A, inf{a, b} = a ∧ b, and sup{a, b} = a ∨ b.

Exercise 3.8.10 The proofs of these theorems are left as exercises.

As a further detail, we note that bounded lattices (semi-lattices) can be algebraically characterized using the following equations:
(B1) 0 ∧ a = 0;
(B2) 1 ∧ a = a;
(B3) 0 ∨ a = a;
(B4) 1 ∨ a = 1.

In particular, (B1) gives us an MSL bounded below, (B2) gives us an MSL bounded above, (B3) gives us a JSL bounded below, and (B4) gives us a JSL bounded above. (These claims may be verified as exercises.) Thus, for example, by adding (B1)-(B4)
to (L1)-(L8), we obtain a set of equations for bounded lattices. (This too is left as an exercise.) Having seen that lattice algebras are coextensive with lattices regarded as relational structures, we adopt the standard convention of using the term "lattice" ambiguously to refer both to lattices as posets and lattices as algebras. Alternatively, we can be regarded as using the term "lattice" to refer to mixed structures that have the lattice operations as well as an explicit partial order relation. This practice will seldom, if ever, cause any difficulty.
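As an illustration of Theorems 3.8.8 and 3.8.9, the following sketch (names ours) realizes the divisors-of-12 lattice algebraically, with gcd as meet and lcm as join, and checks the absorption equations (L7) and (L8) together with the linkage between the two order definitions:

```python
import math

divisors = [1, 2, 3, 4, 6, 12]
meet = {(a, b): math.gcd(a, b) for a in divisors for b in divisors}
join = {(a, b): a * b // math.gcd(a, b) for a in divisors for b in divisors}

# (L7) and (L8), the absorption laws
assert all(meet[a, join[a, b]] == a for a in divisors for b in divisors)
assert all(join[a, meet[a, b]] == a for a in divisors for b in divisors)

# The two induced orders coincide: a MEET b = a iff a JOIN b = b
assert all((meet[a, b] == a) == (join[a, b] == b)
           for a in divisors for b in divisors)
print("(L7), (L8), and the order linkage all check out")
```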
3.9 Ordered Algebras
Various people have defined ordered algebras as structures (A, ≤, (oᵢ)ᵢ∈I) where ≤ is a partial order on A that is isotonic in each argument place in each of the operations oᵢ:

a ≤ b ⇒ oᵢ(x₁, …, a, …, xₙ) ≤ oᵢ(x₁, …, b, …, xₙ).
Bloom (1976) and Fuchs (1963) are two classic sources regarding ordered algebras. Birkhoff (1948) is an early discussion, focusing on the case where the partial order forms a lattice, and contains (fn 1, p. 200) historical information about seminal work by Dedekind, Ward, Dilworth, and Certaine. Birkhoff (1967) has more information about using the general partial order. Wechler (1992) is an excellent recent source.

Example 3.9.1 Given any algebra A = (A, (oᵢ)ᵢ∈I), it can be considered an ordered algebra A = (A, =_A, (oᵢ)ᵢ∈I), where "≤" is just =_A (the identity relation restricted to A). This is called the discrete ordered algebra.
Example 3.9.2 Lattices: (A, ≤, ∧, ∨).

Example 3.9.3 Partially ordered groupoids: (A, ≤, ∘).

Example 3.9.4 The set of natural numbers N with the usual operations of + and × and the usual ≤. Similarly, the set of rational numbers Q and the set of real numbers R with ≤.

Ordered algebras are best understood on the following analogy with ordinary algebras. Ordinary algebras carry implicitly the logical relation of = since there are no other predicates available. Equations then become a big thing, and equational classes of algebras are happy occasions. We take pleasure in the fact that, say, a commutative semi-group is axiomatized by equations of the forms:

x ∘ (y ∘ z) = (x ∘ y) ∘ z
x ∘ y = y ∘ x.

But suppose some killjoy were to say: "I see those two axioms are equations, but you are overlooking several other axioms which are implicational in form:

x = y ⇒ x ∘ z = y ∘ z
x = y ⇒ z ∘ x = z ∘ y
x = y & y = z ⇒ x = z
x = y ⇒ y = x.

Thus the class of commutative semi-groups is not truly equational." We should surely respond that the axioms regarding equality are special, and are assumed as the background. It is then still an interesting question as to which algebraic structures can be axiomatized by adding simply equations to these background axioms. We can take the same attitude towards ordered algebras, and say that the following implicational axioms regarding ≤ are assumed as coming for free, along with reflexiveness (a ≤ a) as part of the background:

a ≤ b & b ≤ a ⇒ a = b
a ≤ b & b ≤ c ⇒ a ≤ c
a ≤ b ⇒ a ∘ x ≤ b ∘ x
a ≤ b ⇒ x ∘ a ≤ x ∘ b.

With this understanding, the class of lattices is "inequationally definable":

Exercise 3.9.5 Show that a lattice may be defined as an ordered algebra (L, ≤, ∧, ∨) satisfying the inequations a ∧ b ≤ a, a ∧ b ≤ b, a ≤ a ∧ a, a ≤ a ∨ b, b ≤ a ∨ b, a ∨ a ≤ a.

One can formalize the inequational logic (IL) with the following axiom and rules (written here in the form premises / conclusion):

x ≤ x

x ≤ y, y ≤ z / x ≤ z

x ≤ y, y ≤ x / x = y

s ≤ s′ / s(t₁/x₁, …, tₙ/xₙ) ≤ s′(t₁/x₁, …, tₙ/xₙ)

Note that these differ from EL only in the third rule, where we now have anti-symmetry instead of symmetry. Let Q be a set of inequations. We can prove that

Q ⊨ s ≤ t iff Q ⊢_IL s ≤ t.
Again the hard part is to prove that if Q ⊬_IL s ≤ t then there exists an ordered algebra A and an interpretation I in A such that every inequation in Q is true with that interpretation, but s ≤ t is not. We take the word algebra W similar to A, and notice that it is an ordered algebra if we define s ≤ t iff ⊢_IL s ≤ t. The idea is to define a "quasi-congruence" ≼ on W, using the straightforward idea that s ≼_Q t iff Q ⊢_IL s ≤ t. What is a "quasi-congruence"? First, it is a pre-order (often called a "quasi-order"), i.e., reflexive and transitive. Second, a ≤ b ⇒ a ≼ b (so it includes the partial order, just as congruence includes the identity relation on the algebra). Thirdly, all operations of A must be isotonic with respect to ≼. From a quasi-congruence ≼ we can define the corresponding congruence:
s ≈ t iff s ≼ t & t ≼ s.

Now we can go on to form the "ordered quotient algebra," A/≼. The elements are equivalence classes [a] (defined using the above congruence), and as for a normal quotient algebra, operations on the equivalence classes are defined representativewise. What is new is that we define [a] ≤ [b] iff a ≼ b. Clearly this definition does not depend on the choice of representatives, since if a₁, a₂ ∈ [a] and b₁, b₂ ∈ [b], then if a₁ ≼ b₁ then a₂ ≼ a₁ and so by transitivity, a₂ ≼ b₁. But b₁ ≼ b₂, and so again by transitivity we obtain a₂ ≼ b₂, as needed.

The postulates above are enough to ensure that ≼_Q is a quasi-congruence, as the reader can easily verify by inspection. The only part that might be thought to be not immediately obvious is: a ≤ b ⇒ a ≼ b. But it too is immediately obvious (at least in a classical metalogic), since it amounts to ⊢_IL s ≤ t ⇒ Q ⊢_IL s ≤ t, i.e., all theorems are derivable from any set of premises. We leave to the reader the proof of the converse, which is that the rules are sound in the sense that if an ordered algebra A with interpretation I satisfies Q (i.e., I(s′) ≤ I(t′) for every inequation s′ ≤ t′ in Q) and Q ⊢_IL s ≤ t, then A also satisfies s ≤ t (i.e., I(s)
TONOIDS
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
76
5 IU))· There is an analog of Birkhoff's varieties theorem for ordered algebras.
Theorem 3.9.6 (Bloom 1976) Let K be a class of abstract ordered algebras. Then K is definable by in equations iff K is an order variety. We owe the reader a definition of an "order variety." Definition 3.9.7 A class K of simil~r ordered algebras is an order variety iff K is closed under ordered subalgebras, ordered homomorphic images, and ordered direct products. For clarity we make clear the requisite notions, though the reader most likely understands them as the doubling up of the corresponding operational notion for the algebra and the relational notion for the poset. Thus (A', 5', (a~ )iEI) is an ordered subalgebra of (A, 5, (ai)iEi) iff (A', (a~)iEi) is a (strong) subalgebra of (Ay(ai)iEI) and (A', 5') is a subrelational structure of (A, 5). (A', 5', (a~)iEI) is an ordered homomorphic image of (A, 5, (ai) iEi) iff there is a function h : A ---+ B such that h is both an algebraic and relational homomorphism. This last simply requires that h be isotonic in the sense that if a 5 b then h(a) 5 h(b). Consider a collection (Aj)jEJ of similar ordered algebras. Each Aj = (Aj, 5j, (aj, i )iEi), so we can fOlID their direct product by defining A j = (XjEJ A j, 5, (ai) iEi) where (XjEJ A j, 5) is a relational direct product and (XjEJ A j, (ai) iEi) is the operational direct product. This means that the elements are of the form (aj) jEJ, and operations and the inequality relation are defined componentwise. The question naturally arises as to how to define ordered quasi-varieties. These should turn out to be the class of ordered algebras that satisfy a system of implications of the form:
ILo
sl
5 tl & ... &
S/1
5 t/1 ::::}
77
X
5 t.
We call such implications quasi-inequations. The reader can verify the following: Claim 3.9.8 Quasi-inequations are preserved under ordered subalgebra and ordered direct product. A simple example shows that quasi-inequations are not preserved under homomorphic image. Example 3.9.9 Consider the set A = {D, I}. It can be made into a partially ordered set A in the following two ways. 51 is just the identity relation restricted to A. This is the so-called discrete order. 52 is just the usual order of the natural numbers. 3 We display these in Figure 3.3.
D
•
r
•
D
FIG. 3.3. Non-preservation of an inequation It is clear that the identity map (restricted to A) is a homomorphism from the poset (A, 51) onto the poset (A, 52). This is an example of a very general phenomenon with order homomorphisms. The trick is simply to observe that adding further order relationships to the target preserves order homomorphisms. We need to outfit the two posets with operations to make them ordered algebras. Let us take the cheapest route possible, and take the identity operation restricted to A, i.e., for all a E A, a(a) = a. This ensures that the identity map preserves the operation. Note that the following holds in Al = (A,51,a), but not in A2 = (A,52,a):
a 5 b::::} b 5 a. Although strictly we are now through, it seems worth noting that the last may be restated as: a 5 b::::} ab 5 aa. 3.10
Tonoids
Ordered algebras are a step in the right direction towards obtaining the right framework to analyze logics, but are still too restrictive, failing in effect to account for the fact that there is another direction as well. Reading "5" as implies, isotonicity gets right the distribution of necessity across implication (isotonic): a 5 b ::::} Da 5 Db
(Becker's rule).
3There is also a third way, just reversing the usual order, but this is isomorphic to the second.
78
TONOIDS
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
Example 3.10.6 In particular, the discrete ordered algebras are tonoids. Since ordinary algebras may be viewed as discrete ordered algebras (see previous section) this means that ordinary algebras may be regarded as discrete ordered algebras.
However, it wrongly would ascribe the same property to negation:
a~b
=?
..,a ~..,b
(Denying the antecedent).
Example 3.10.7 The set of integers Z with the usual unary operation - for forming the negative of any integer, binary operations of + and x and the usual ~ is a tonoid.
Rather what is wanted is antitonicity: a ~ b =? ..,b ~..,a
(Contraposition).
There are similar facts with binary logical operations. Conjunction and disjunction are naturally isotonic in both of their arguments, whereas the conditional is isotonic in its second argument, a ~ b =? c -'> a ~ C -'> b (Prefixing), but is antitonic in its first argument, a ~ b =? b
-'> C
~ a -'> C
(Suffixing).
This suggests that each operation should come with a "tonic type" saying :vh~ther it is isotonic or anti tonic for each of its argument positions. This is the essentIal Idea of a "tonoid" as found in Dunn (1993a).4 Definition 3.10.1 A tonoid is a structure T = (A,~, (Oi)iEI) where (A,~) is a poset, and each 0i is an operation on A. Where the degree of 0i = n (;?: 1), associated with 0i is a tonic type ((n, ... , (Jm, ... , (In), where each "sign" is either plus (+) or minus (-). If (Jm = +, then 0i must be isotonic in its mth place, i.e., a ~ ~ =? .0i~Xl.' ... ,a, . .. ,X n ~ .(x 1, ... ,,··" b x n, ) and 11 (J m = -' then 0i must be antlfomc zn Its mth place, I.e., 0, a~b =? Oi(Xl, ... ,b, ... ,Xn)~Oi(Xl, ... ,a, ... ,xn).
!
Example 3.10.2 Assuming "colllIl10n sense" knowledge of the behavior of various logical connectives, we have the following associated tonic types: /\: (+, +) V:
(+,+)
-'>:
(-,+)
79
0: (+) 0: (+) ..,: (-)
Example 3.10.3 Let (A,~, 0, -'>, +-) be aresiduated partially ordered groupoid (cf. Section 3.17). Then the tonic types are as follows: 0: (+,+) -'>: (-,+) +-: (+,-). Remark 3.10.4 Residuated partially ordered groupoids are commonly stud~ed as ordered algebras (see Fuchs 1963), but this is really a kind of a "cheat" if the resIduals are indeed "first-class citizens." Example 3.10.5 Any ordered algebra is trivially a tonoid (with the distribution type for each operation being a string of plus signs). 4There is a further requirement that they preserve or co-preserve (invert) some bound, but we omit this here as simply complicating the discussion.
Example 3.10.8 The set Q+ of positive rational numbers, with multiplication (a x b) and division (a/b) is a tonoid. The relation a ~ b is understood as "a (integrally) divides b without remainder," i.e., there exists a natural number n such that a x 11 = b. Multiplication has tonic type (+, +) and division has tonic type (-, +).
One can formalize the tonoid inequationallogic (TIL) with the same axioms and rules for IL, with the obvious exception that the rule expressing isotonicity is replaced with the following rule (given that the tonic type of 0i is ((JI, ... , (J~, ... , (In), and that ~± is ~ if (Jm = +, and ;?: if (Jm = -): Xm ~Ym
We leave to the reader the straightforward task of proving that the set of axioms and rules for TIL is sound and complete. As we have seen from examples above, tonoids arise very naturally in looking at various logical connectives, particularly negation and implication. Definition 3.10.9 An implication tonoid is a tonoid operation on A whose tonic type is (-, +) .
(A,~,
-'»,
where
-'>
is a binary
Implication tonoids are a start on the road to axiomatizing various fragments of various substructural logics, including the implicational fragment of the relevance logic R, by adding various inequations as we shall show in Section 3.17. In addition to inequations, we can form quasi-inequations as inferences that have one or more inequations as premises, and an inequation as the conclusion, for example: (RfP)
a ~ b
-'> C
=? b ~ a
-'>
c
(Rule-form permutation),
which does not hold for all implication tonoids, since it is false in the implication tonoid described by the following table (assign a, b, c the values 1, respectively).
t, t,
T:
-'>
1
1
1+
1
1+ 1+
'2 0
1
'2 0 1+ 1+
0 0 0 1+
Note that the plus indicates when a ~ b holds, so this table does double duty, showing implication both as a "metalinguistic" relation and as an "object language" operation. Note that it is easy to visually check whether a table defines an implication tonoid.
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
80
TONOIDS
81
Implications are antitonic in the antecedent positions, so the values must decrease (more precisely not increase) as one moves along a row from left to right. Implications are isotonic in their consequent positions, so this means that the values must increase (more precisely, not decrease) as one moves along a column from top to bottom. The reader can easily see that 0 S ~ S 1, and so we have a linear order. The motivation of this definition is that a true implication is when we have a S b, and a false implication is when we have a b, and that we "skip the middle person" by taking truth to be the top element 1, and falsity to be the bottom element O. This is a kind of strict, or necessary conditional, of the sort associated with the modal logic S5. We noted above (see Example 3.9.9) that quasi-inequations are not preserved under homomorphism.
However, notice that permutation is really an instance of contraposing in the sense of Section 12.6. Moreover, (RfP) assumes that the two operations which are contrapositives of each other are the same. This is a rather strong assumption, although it is not rare in real case examples. The simplest such example is negation. Recall from Section 3.13 that (ml), which is a quasi-inequation stating contraposition for a single negation, implies (m2) and (m3), another form of contraposition and half of the double negation. We generalize these observations in the following fact stating that contrapositive quasi-inequations are preserved under (tonoid) homomorphism.
Problem 3.10.10 Prove a "vmieties theorem" for "quasi-inequationally definable" tonoids (those that can be axiomatized by implications from a finite number of inequations to an inequation) similar to Theorem 3.11.1 for inequationally definable tonoids, below. We conjecture that one somehow modifies the conditions for inequationally definable tonoids by replacing preservation under homomorphic ordered images with some more subtle requirement about preservation under images.
(cp)
i
One particularly interesting quasi-inequation is rule-form permutation (RfP). This is interesting because it distinguishes those substructural logics which potentially have two implications (distinct left and right residuals, e.g., the Lambek calculus) from those that have just one single implication (the residuals collapse to each other), e.g., linear logic, relevance logic, BCK logic, intuitionistic logic. It is interesting to note that in the context of tonoids, this is in fact inequationally axiomatizable using the single inequation: x S (x --+ y) --+ Y
(assertion).
The following derivation shows that assertion implies rule-form permutation: 1. 2. 3. 4.
xS Y
(assumption) (y --+ z) --+ Z S x --+ Z (1, suffixing) Y S (y --+ z) --+ Z (assertion) Y S x --+ z (3,2, transitivity). --+ Z
Conversely, assertion follows from x --+ Y S x --+ Y by rule-form permutation, so assertion and rule-form permutation are in fact equivalent in the context of tonoids. 5 This phenomenon is quite general for tonoids; (RfP) is a special-though undoubtedly interesting-case. To analyze the problem, first, notice that despite the fact that tonoids might have antitonic operations, the notion of homomorphism is the same as for ordered algebras. The reason for this is that the anti tonicity does not appem" in any inequation per se. The problem with the non-preservation of quasi-inequations in general is that they are conditionals, and nothing excludes the conditional being vacuously true by the antecedent being false. Thus, fine-tuning the notion of homomorphism cannot solve the puzzle. 5We owe this observation and the following generalization of it to Kala Bimb6.
Lemma 3.10.11 Let OJ be anll-my operation which is order-reversing ill the ith place. Then the quasi-inequation XSO}C. .. ,Yi, ... ) =? YiSOj(' .. ,x, ... )
is preserved under homomorphism.
Proof First we prove for any such operation the following: (1) OJ( ... ,Yi, ... )SOJC. .. ,Yi, ... ) (2) Yi S OJ( ... , OJ( ... , Yi, ... ), ... )
(refl.ofS); ((1), by contraposition).
Here (2) is the "law of intuitionistic double negation." (Another way to put this is that the operation satisfies the principle of "extensionality," taking the application of the operation twice as a closure operation. This is quite plausible since OJ is order-reversing in its ith place, i.e., forms with itself a Galois connection in the ith place.) Using this we show that if a tonoid I satisfies the above quasi-inequation, then so does tonoid J where J is a homomorphic image (under h) of I: 1. 2. 3. 4. 5.
hx S oJC. .. , hYi, ... )
(assumption, in J)
OJ( ... , OJ(' .. , hYi, ... ), ... ) S oJC. .. , hx, ... )
(1, by ton. rule, in J)
(by (2), in 1) (3, h ord. hom., in J) (2,4, by transitivity of S, in J).
Yi S OJ( .. . , OJ( .. . , Yi, ... ), ... )
hYi S o}C. .. , OJ( ... , hYi, ... ), ... ) hYi S o}C. .. , hx, ... )
This concludes the proof.
o
Turning back to the "axiomatizability" view, this lemma can be taken as demonstrating that in a tonoid "extensionality," that is, (2) is equivalent to "contraposition," that is, (cp) (where "contraposition is taken with respect to the ith place of an operation antitonic at this place). Note that (2) is a direct analog of "assertion," just as is (cp) an analog of the "rule form of permutation." The first pmt of the proof of the lemma directly derived "extensionality" from "contraposition." The second part of the proof showed the converse. The proof crucially uses the fact that the structure is a tonoid, and that the operation is order reversing in its ith place. Remark 3.10.12 We emphasize that rule-form permutation is inequationally axiomatizable by assertion only because we are requiring suffixing as part of the fundamental framework of a tonoid. It is easy to see that if we do not require this, then no set of
82
TONOID VARIETIES
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
in equations is equivalent to rule-form permutation. The argument is that the set of inequations would be preserved under homomorphism, whereas rule-form permutation is not. The proof goes by modifying Example 3.9.9 as follows. Define an operation on A : x -+ y = x.1t is easy to see that rule-form permutation (and prefixing and suffixing) hold for::;1 (and hence so would the inequations), but that permutation (and suffixing) fail for::;2 (though the supposedly equivalent inequations would be preserved). Permutation after all states that x ::; y -+ Z = Y implies that y ::; x -+ Z = x. Setting x = 0 and y = 1 (z can be either) and using the fact that 0 ::;2 1, we obtain 1 ::;2 0 (which is clearly false).
It is interesting to note that in equational definability may depend upon the presence of "helper" connectives. Thus suppose we add the fusion connective 0 (cf. Section 3.17) to an implication tonoid, forming a right-residuated partially ordered groupoid, subject to the residuation condition: x
(Res)
0
y ::;
z
iff y ::; x -+
z·
Then we can state the rule-form permutation (RfP) as: x
0
y::; yo x.
83
We shall call a class K of tonoids "similar" just when their algebraic parts are similar . as algebras (having the same number of operations of the same degree), and in addition, corresponding operations have the same tonic type. The subsequent discussion is based on Bloom's (1976) similar theorem for order algebras, and it turns out that the proof of this theorem can be readily extended to tonoids. Then the theorem for order algebras turns out just to be a special case (when all of the tonic types are positive in each position). The notions of subtonoid, tonic homomorphic image, and tonic direct product are defined precisely the same way as their corresponding notions for ordered algebras. We prove the theorem by first recognizing that the Galois connection that was established in Section 2.15 for algebras and equations, extends to tonoids and inequations. We write It to stand for the tonoids that satisfy the inequations I, and Ki to stand for the set of inequations that are valid in K.
Fact 3.11.2 The maps above form a Galois connection between the power set of the set of all inequations (in a language appropriate to K) and the power set of the set of tonoids. Let KI and K2 be similarity classes of to noids (of the same type), and let Ir and Iz be sets of equations (in the same terms). In particular, K ~ (Ki)t. We can then show Theorem 3.11.1, if we show the converse (Ki)t ~ K (under the closure hypotheses of the theorem). For then we have (Ki)t = K and so the class K is axiomatized by the set of equations Ki. We first prove a necessary theorem about the existence of free tonoids.
The reader may check this fact by verifying the following chain of equivalences:
a ::; b -+
C
iff boa::; c iff a 0 b ::; c iff b ::; a -+ c.
The discerning reader may worry that the residuation condition is not itself an inequation. But: Exercise 3.10.13 Prove that the residuation condition (Res) is equivalent to the following pair of inequations: a
0
(a -+ b) ::; b a ::; b -+ (b
0
a).
Things get more interesting yet, because in Section 8.1.2 we prove that every implication tonoid is embeddable in a right-residuated partially ordered groupoid. The question naturally arises as to whether there is an analog to Birkhoff's varieties theorem for tonoids. We answer this positively in the next section. In Chapter 12 we shall see that tonoids have nice representations.
3.11
Tonoid Varieties
We state the following, and then say something about the notions it uses.
Theorem 3.11.1 A similarity class K of tonoids is inequationally definable (ff K is closed under subtonoid, tonoid homomOlphic image, and tonoid direct product.
Theorem 3.11.3 Let K be a non-degenerate class of similar tonoids, which is closed under tonoid subalgebras and tonoid direct products. Then for any cardinal 17, a free K -tonoid FK(n) exists.
Proof Pick a set V of 17 variables. Form the word algebra W on V of the same similarity type as the algebras in K. Given a class K of tonoids, we define a quasi-congruence on W as follows: (qc)
W;) K w' iff for every interpretation I of W in an algebra A E K,I(W) ::; I(W').
G = {[xb K : x E V} generates AI ~K and (because of non-degeneracy) if x =f. y then [X]""K =f. [yb K • These are free generators. Let f be any mapping of G into an arbitrary A E K. Define an interpretation 1 so that (1) leX) = f([xb K ), (2) h([wb K) = leW).
We know as in the proof of Theorem 2.14.6 that h is an algebraic homomorphism. We need to show that it also preserves order. Thus suppose that [W]]""K::; [W2]""K' Then WI ;)K W2· We need to show that h([W]""K) ::; h([wb K), i.e., lewd ::; I(W2). But this is just what (qc) above provides. 0 We have shown that W/;)K is free in K, but we have not shown that it is a member of K. Instead we show that some isomorphic copy is a member of K. As a special case we define this relative to a single interpretation: W
;)1 w'
iff
I(W)::; leW').
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
84
CLASSICAL COMPLEMENTATION
Lemma 3.11.4 Let A = (A, =A, (Oi)iEI) and A' = (A', :::;A', (O;)iEI) be two similar tonoids (note that the first is a discrete algebra). Let h be an algebraic homomorphism from Aal g = (A, (Oi)iEI) onto A~lg = (A', (O;EI»' Then h is also a tonoid homomorphismfrom A = (A, =A, (Oi)iEI) onto A' = (A', :::;A', (O;)iEI).
Proof If a =A b, then a order, then a :::;A' b.
= b, and so h(a) = h(b). Because of the reflexivity of partial D
Proposition 3.11.5 The discrete word algebras are universally free in the class of tonoids.
Proof Let K be a class of similar algebras, and let A E K. First form the word algebra FK(n), of the same similarity type, with n is the cardinality of A. We know from Chapter 2 that FK(n) is universally free in K considered as a class of algebras. This means that every mapping f of the generators into A can be extended to an algebraic homomorphism h of FK(n) onto A. By Lemma 3.11.4, h is also a tonoid homomorphism, as is required for freedom in a class of order algebras. D Let K be an abstract class of tonoids, closed under tonoid subalgebra and tonoid direct product. We form a word algebra W on a set of variables V at least as big as A. Let 10 map V onto A. Outfitting W with =A (identity restricted to A) makes it a tonoid. By Proposition 3.l1.5 this tonoid is universally free, and so 10 can be extended to an interpretation: (1) I:W~A.
Since K is a subdirect class, K has free K-tonoids of any cardinality of generators. Let the cardinality of V be n, and consider the free tonoid FK(n). For the sake of simplification we assume that FK(n) has been constructed as a quotient tonoid on W. We then have the canonical interpretation:
8S
Corollary 3.11.7 (Theorem 3.9.6, Bloom 1976) A class K of ordered algebras is inequationally definable iff K is closed under ordered subalgebra, ordered homomorphic image, and ordered direct product. 3.12
Classical Complementation
As we have seen, the logical notions of conjunction and disjunction have lattice-theoretic c.ounterparts, ~e notions of meet and join. In a similar way, the logical notion of negatIOn has a lattIce-theoretic counterpart, the notion of complementation. We must hasten to add, however, that whereas the mathematical traits of meet and join characterize all im~lementations .of logical conjunction and disjunction, there is no corresponding root notIOn of algebraIC negation; rather, there is a whole group of notions falling under the general heading of complementation. In t~e present s~ction, we describe the classical notion of complementation, leaving alternatIve conceptIOns to the succeeding section. We begin with the traditional definition of complementation in lattice theory. Definition 3.12.1 Let L be a bounded lattice with bounds 0 and 1, and let a and b be elements of L. Then a and b are said to be complements of each other (in symbols, aC b) if they satisfy the following conditions:
=
(cl) a 1\ b 0; (c2) a V b = 1.
By way of illustrating this concept, consider the Hasse diagrams in Figure 3.4. In particular, consider in each case the element b. In (1), b has no complements (there is no x such that xCb). On the other hand, in (2) b has exactly one complement, a, and in (3) b has exactly two complements, a and c (which are also complements of each other).
(2) []K: W ~ FK(n)
It is easy to show that;:2,K ~ ;:2,1' We next define a new mapping (Ij K) from FK(n) onto A: (ljK)([W]K)
b
a
b
a
c
= I(W).
That (Ijw) is a homomorphism follows from: Lemma 3.11.6 Let A be a tonoid with two quasi-congruences ;:2,1 ~ ;:2,2. Let :::::i1 and :::::i2 be the corresponding congruences. Then Aj ;:2,1 is an ordered homomorphic image of Aj ;:2,2 under the mapping h([ab l ) = [ab 2 •
Proof The reader should consult the corresponding lemma for "unordered" algebras (Lemma 2.l5.2) for h preserving operations and being onto. We show here that h preserves order, i.e., if [a]~1 :::;1 [b]~i' then [a]~1 :::;2 [b]~2' The antecedent means that a ;:2,1 b and the consequent means that a ;:2,2 b, and so the required implication is just D the hypothesis that ;:2,1 ~ ;:2,2.
o (1)
o
(2)
o (3)
FIG. 3.4. Illustrations of complementation Consideration of these three examples leads to the following definition. Definition 3.12.2 Let L be a bounded lattice. Then L is said to be complemented if evelY element in L has at least one complement in L, and L is said to be uniquely complemented if every element in L has exactly one complement in L. In the Hasse diagrams in Figure 3.4, (1) is not complemented (since b lacks a complement), (2) is uniquely complemented, and (3) is complemented but not uniquely
86
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
CLASSICAL COMPLEMENTATION
complemented (since a, b, c all have each other as complements). Perhaps the most common example of a uniquely complemented lattice is the lattice of all subsets of a set U, where the partial ordering is set inclusion. In this case, the (unique) complement of a subset X of U is the complement of X relative to U (i.e., the set-theoretic U - X). If one has a uniquely complemented lattice, then one can define a unary complementation operation, c, so that c(a) is the unique complement of a. More generally, given simply a complemented lattice, one can define a complementation operation for each choice function that selects exactly one element from each set {x : xC a} of complements. In other words, for each way of choosing one particular complement for each element, one obtains a distinct complementation operation. Among complementation operations in general, there are special ones which are described in the following definitions.
1
1
a
a-
(1)
87
°
(2)
a
b-
b
a-
°
Definition 3.12.3 Let L be a complemented lattice, and let n be any function from L into L. Then n is said to be an orthocomplementation on L if, for all x, y in L: (n1) (n2) (n3) (n4)
n(x) /\ x = 0. n(x) V x = 1. n[n(x)] = x. Ifx ~ y, then n(y)
a
~
n(x).
b
(3)
If we read 11 as a negation function, then the intuitive content of (n1)-(n4) goes as follows. First of all, (n3) is simply double negation, and (n4) is simply a form of contraposition which says that if x implies y then the negation of y implies the negation of x. Then (n1) and (n2) together say that n(x) is a complement of x. Recall that implies every proposition, and 1 is implied by every proposition. Thus, (n1) says that the conjunction of x with its negation implies every proposition, and (n2) says that the disjunction of x with its negation is implied by every proposition. With the notion of orthocomplementation, we can define a special class of algebras as follows.
°
Definition 3.12.4 Let A = (A, /\, V, 0,1, /1) be an algebra of type (2,2,0,0,1). Then A is said to be an orthocomplemented lattice (or simply an ortholattice) if (1) (A, /\, V, 0,1) is a complemented lattice, and (2) n is an orthocomplementation on (A, /\, V, 0,1).
A common example of an ortholattice consists of the power set of any set U, where the orthocomplementation operation is the standard set-complement operation. Figure 3.5 contains Hasse diagrams of ortholattices. Here, x - denotes n(x); a further convention is that 0- = 1, 1- = 0, x -- = x. One can show that the orthocomplementation functions indicated in (1) and (2) are the only ones admitted by those structures. In the case of (1), the lattice is uniquely complemented. In the case of (2), the lattice is not uniquely complemented, so it admits many complementation operations; nevertheless, the lattice in (2) admits only one orthocomplementation operation. In the case of (3), there are three distinct orthocomplementation functions, one of which has been depicted; all the others are isomorphic to this one.
°
FIG. 3.5. Hasse diagrams of ortholattices
~ince orthocomplementation is defined in terms of the partial order relation, it is not whether ortholattices can be equationally characterized. However, as it turns out, they can, the relevant equations being given as follows: (01) aVa-=I; (02)a/\a-=O; (03) (a-)-=a; (04) (a/\bf =a-vb-; (05) (a V bf = a- /\ b-. ObVIOUS
Exercise 3.12.5 It is left as an exercise to show that every ortholattice satisfies these equations, and every lattice satisfying these equations is an ortholattice. We have seen that orthocomplementation constitutes a mathematical characterization of negation, treating it as a unary operation on propositions. This characterization of negation involves the following principles. (1) (2) (3) (4)
Double negation: not(not(x)) = x. Contraposition: if x implies y, then not(y) implies not(x). Contradiction: x and not (x) implies everything. Tautology: everything implies x or not(x).
These particular principles are not universal features of all logics that have been proposed. On the one hand, classical logic, supervaluational logic, and quantum logic
88
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
espouse all four principles. On the other hand, multi-valued logic, intuitionistic logic, and relevance logic dispute one or more of these principles. For this reason, it is useful to present alternative conceptions of complementation. An important thing to notice about orthocomplemented lattices is that since they are not required to be uniquely complemented, the definition of the orthocomplement function can be somewhat arbitrary. For example in Figure 3.4(3), the orthocomplement of a could be anyone of the nodes labeled a-, b- , or b. This is not true if the underlying lattice is "distributive," as we shall see in Section 3.14, since then complementation is unique. 3.13
Non-Classical Complementation
As noted in the previous section, orthocomplementation provides a mathematical representation of classical negation. Since classical negation has features that have been disputed by alternative logics, in the present section we discuss a general concept of complementation that subsumes classical negation as well as a number of well-known alternati ves. The notion of "non-classical complementation" might seem at first glance to be a contradiction in terms. Indeed, it has been customary in the literature to call an element c the "complement" of an element b only when they satisfy the "classical" conditions (cl) and (c2) of the previous section. This has led to a proliferation of pejorative terms for weaker notions 3lising in connection with various non-classical logics; for example, Rasiowa (1974) uses such terms as "pseudo-complementation" and "quasicomplementation." We hope to reverse this trend before the term "quasi-pseudo-complementation" appears in the literature. Unfortunately, there is not much time; Rasiowa (1974) already refers to "quasi-pseudo-Boolean algebras"! In particular, we propose to use the term "complementation" as a generic tem for any unary operation on a lattice (or partially ordered set) that satisfies a certain minimal condition common to most well-known logics. Note, however, that we shall continue to use the expression "complemented lattice" in the traditional sense, to mean a lattice in which every element has a complement in the sense of (c 1) and (c2) of the previous section. The following is our official definition of (generic) complementation.
NON-CLASSICAL COMPLEMENTATION
89
Exercise 3.13.2 Show that (ml) is equivalent to the following pair of conditions: (m2) If a ::; b then -b ::; -a. (m3) a::; - - a. Condition (m2) corresponds to the logical principle of contraposition. We shall call a unary operation satisfying (m2) a subminimal complementation. Condition (m3) corresponds to the weak half of the logical principle of double negation. The remaining (strong) half of double negation (viz., - - a ::; a) does not follow from the minimal principle(s) (ml)-(m3), as can be seen by examining various examples below. What does follow (and what may be verified as an exercise) is the following principle of triple negation: (m4) - - -a = -a.
Definition 3.13.1 Let P be a partially ordered set, and let x H -x be a unary operation on P. Then x H -x is said to be a complementation (operation) on P (f the following condition is satisfied:
~efore discussing the various specific versions of non-classical complementation, we dISCUSS a way of looking at the above minimal principles of complementation (negation). Recall that traditional logic distinguishes between contradictories (literal negations) and contraries. An example of a contrary of "snow is white" is "snow is black"; they are contrary precisely because they cannot both be true. However, "snow is black" is not the weakest proposition contrary to (inconsistent with) "snow is white." For it does not merely deny that snow is white; rather, it goes on to say specifically what other color snow has. The sentence "snow is white" has many contraries, including "snow is red," "snow is green," "snow is puce," etc. To deny that snow is white is to say that snow has some other color, which might be understood as the (infinite) disjunction "snow is red, or snow is green, or snow is puce, or ...." Thus, the negation of a proposition is an infinite disjunction of all of its contraries. Assuming that disjunction behaves like the join operation of lattice theory, another way of expressing the above is to say that the negation of a proposition b is the weakest proposition inconsistent with b. Somewhat more formally, the negation of b is the ~east u?per bo~nd (relative to the implication partial ordering) of all the propositions mconslstent WIth (contrary to) b. Now, of course, the indicated least upper bound may not exist. Under special circumstances, its existence is ensured; for example, its existence is ensured whenever the set of propositions forms a complete lattice with respect to the implication relation. Given the relation I (where "xl y" is read "x is inconsistent with y"), the least upper bound property can be expressed quite concisely by the following principle:
(ml) If a::; -b, then b ::; -a.
(nl) x::; -a iff xla.
Notice that, since (ml) implies its own converse, we could just as easily replace (ml) by the associated biconditional. Also notice that the poset P can, as a special but important case, be a bounded lattice. The minimal condition (ml) con"esponds roughly to the natural assumption that if proposition a is inconsistent with proposition b, then conversely b is inconsistent with a (see below, however).
This simply says that -a is implied by all and only propositions that are inconsistent with a. Ignoring for the moment the natural assumption that the relation I is symmetric, we can write a corresponding principle, pertaining to a second negation operation, as follows: (n2) x::;
~a
iff aIx.
90
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
Starting from (n1) and (n2), it is easy to show the following, which looks very much like (m1): (gl) as-b iff
NON-CLASSICAL COMPLEMENTATION
H3
1
1
bS~a.
a
Indeed, to obtain (m 1) as a special case, all we have to do is postulate that the relation I is symmetric, in which case we can show that -a = ~ a. A pair of functions (-, ~) satisfying (gl) is called a Galois connection. More will be said about Galois connections in due course; at the moment, we simply remark (and ask the reader to prove) that one can show that Galois connections have the following properties reminiscent of negation: (g2) If a S b, then -b S -a. (g3) If a sb, then ~b S ~a. (g4) as-~a. (g5) a S ~ -a.
a (= -a)
o
0(= -a)
H4
1
a
Now, returning to complementation operations, we remark that every orthocomplementation operation is an example of a complementation operation. On the other hand, there are alternative non-classical complementation operations which are intended to model the negation operators of the various non-classical logics. The classical principles of negation can be formulated (somewhat redundantly) lattice-theoretically as follows:
dM4
b
Exercise 3.13.3 Show that (gl) is equivalent to (g2) through (g5).
(pI) (p2) (p3) (p4) (p5)
dM3
91
(= -a) a
- - a = a (strong double negation). a /\ -a = 0 (contradiction). a V -a = 1 (tautology).
The first two principles are unopposed, and accordingly constitute the bare-bones notion of complementation, as we conceive it. On the other hand, the remaining three principles have been disputed by various non-classical logical systems. For example, Heyting's (1930) intuitionistic logic system H rejects (p3) and (p5), although it accepts the non-minimal principle (p4). The minimal logic system of Johansson (1936) goes one step further and rejects (p4) as well, accepting only the minimal principles-(p1) and (p2). On the other hand, the relevance logic systems E and R of Anderson and Belnap (1975) reject both (p4) and (p5), but accept (p3). Interestingly, the multi-valued logic systems of Lukasiewicz (1910, 1913) agree precisely with relevance logic concerning the principles of negation, although for very different philosophical reasons. In light of the different accounts of logical negation, we propose correspondingly different accounts of complementation, which are formally defined as follows. We offer in Figure 3.6 a few examples, using Hasse diagrams. In each diagram, complements are
o
0(= -a = -b)
as - - a (weak double negation). If as b, then -b S -a (contraposition).
b (= -b)
M4 (= -b) a
b(=-a)
o
a
b (= -b)
0(= -a)
FIG. 3.6. Examples of v31ious sorts of complementation indicated parenthetically, except for 0 and 1, in which case the reader is to assume that -1 = 0 and -0 = 1, unless explicitly indicated otherwise.
Definition 3.13.4 Let L be a lattice with 0, and let x 1-+ -x be a complementation on L. Then x 1-+ -x is said to be a Heyting complementation on L if it additionally satisfies the following condition: (p4) a /\ -a = O.
Definition 3.13.5 Let L be a lattice, and let x 1-+ -x be a complementation on L. Then x 1-+ -xis said to be a De Morgan complementation on L if it additionally satisfies the following condition:
CLASSICAL DISTRIBUTION
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
92
93
Minimal complement (p3) - - a = a. Notice that De Morgan complementation is so called because it satisfies the De Morgan laws: (dMl) -(a V b) = -a A -b; (dM2) -(a A b) = -a V -b.
Heyting complement
De Morgan complement
(This may be verified as an exercise.) Indeed, (p3) and either (dMI) or (dM2) provide an equational characterization of De Morgan complementation. (This may be shown as an exercise.) For the sake of completeness, we repeat the definition of orthocomplementation here.
Orthocomplement FIG. 3.7. Logical relationships among four sorts of complementation
Definition 3.13.6 Let L be a lattice with 0 and 1, and let x 1-+ -x be a complementation on L. Then x 1-+ -x is said to be an orthocomplementation on L if it additionally satisfies the following conditions: (p3) - - a = a;
= 0; a V -a = 1.
(p4) a A -a
(pS)
The various kinds of complementation operations are more rigorously associated with the various kinds of logics in the chapters devoted to those particular logics. The purpose of the current section is plimarily to give basic definitions, and to show a little of the lay of the land. In Figure 3.6, H3 and H4, which are Heyting lattices, illustrate Heyting complementation. dM3 and dM4, which are De Morgan lattices, illustrate De Morgan complementation. B4, which is a Boolean lattite, illustrates orthocomplementation. Finally, M4 illustrates minimal complementation, in the sense that the complementation operation satisfies no complementation principle beyond the minimal principles. We have now described four sorts of complementation: orthocomplementation, Heyting complementation, De Morgan complementation, and general (minimal) complementation. The logical relationships among these are depicted in the Hasse diagram in Figure 3.7, where the strict ordering relation is "is a species of." Exercise 3.13.7 As a final exercise for this section, the reader should verify the asymmetry of the above relation. In particular, the reader should show that not every minimal complement is a De Morgan complement (Heyting complement), and not every De Morgan complement (Heyting complement) is an orthocomplement. This may be done by reference to the examples above.
3.14
in the sense that every poset is isomorphic to an inclusion po set. A natural extension of this notion is that of a lattice of sets, which is formally defined as follows. Definition 3.14.1 Let P be an inclusion poset. Then P is said to be a lattice of sets if P is closed under intersection and union, which is to say that for all X, Y, (1) ifX,YEP,thenXnYEP;
(2) if X, YEP, then X U YEP. An alternative piece of terminology in this connection is "ring of sets." Note carefully that a lattice of sets is not merely an inclusion po set that happens also to be a latticewe will call this an inclusion lattice. In order for an inclusion lattice to be a lattice of sets, the meet of two sets must be their intersection, and the join of two sets must be their union. The Hasse diagrams in Figure 3.8 illustrate the difference. Both of these are inclusion po sets that happen to be lattices. However, ILl is not a lattice of sets, because in ILl join does not correspond to union; for example, {a} V {b} = {a, b, c} f:. {a} U {b}. By contrast, IL2 is a lattice of sets. Now, every poset is isomorphic to an inclusion po set, and every lattice is isomorphic to an inclusion lattice. On the other hand, not every lattice is isomorphic to a lattice of sets. In support of this claim, we state and prove the following theorem. {a,b,c}
{a, b, c}
{c}
{a}
{c}
{a,b}
Classical Distribution
Recall that an inclusion poset is a poset (A, :s;) in which A is a collection of sets and :s; is set inclusion. As noted in Section 3.2, inclusion po sets are completely typical po sets
o
o
FIG. 3.8. Distinction between inclusion lattices and lattices of sets
Theorem 3.14.2 Let L be a lattice that is isomorphic to a lattice of sets. Then for all a, b, c, in L, a 1\ (b V c) = (a 1\ b) V (a 1\ c). Proof. Suppose L * is a lattice of sets, and suppose that h is an isomorphism from L into L *. Then hex 1\ y) = hex) n hey), and hex V y) = hex) U hey) for all x, yin L. So, setting A = h(a), B = h(b), and C = h(c), we have h[a 1\ (b V c)] = An (B U C), and h[(a 1\ b) V (a 1\ c)] = (An B) U (A n C). By set theory, An (B U C) = (A n B) U (A n C), so h[a 1\ (b V c)] = h[(a 1\ b) V (a 1\ c)]. But h is one-one, since it is an isomorphism, so this implies that a 1\ (b V c) = (a 1\ b) V (a 1\ c). 0
The sort of reasoning employed in the above proof can be generalized to demonstrate that any lattice that is isomorphic to a lattice of sets satisfies the following equations: (dl) a 1\ (b V c) = (a 1\ b) V (a 1\ c); (d2) a V (b 1\ c) = (a V b) 1\ (a V c); (d3) (a 1\ b) V (a 1\ c) V (b 1\ c) = (a
V
b) 1\ (a
V
c) 1\ (b
V
c).
These three equations are known as the distributive laws; the first two are the common forms, and are dual to one another; the third one is the "self-dual" form. Consideration of these equations leads naturally to the following definition. Definition 3.14.3 Let L be a lattice (not necessarily bounded). Then L is said to be a distributive lattice if it satisfies (dl )-( d3). Now, one can demonstrate that a lattice satisfies all three equations-(dl), (d2), (d3 )-if it satisfies anyone of them. Exercise 3.14.4 Prove this claim. On the other hand, these formulas are not entirely lattice-theoretically equivalent, a fact that is demonstrated in the next section. Before continuing, we observe that one half of each distributive law is satisfied by every lattice. This is formally presented in the following theorem. Theorem 3.14.5 Let L be a lattice. Thenfor all a, b, c in L, (1) (a 1\ b) V (a 1\ c) ~ a 1\ (b V c); (2) a V (b 1\ c) ~ (a V b) 1\ (a V c);
(3) (a 1\ b)
V
(a 1\ c)
V
(b 1\ c) ~ (a
V
b) 1\ (a
V
c) 1\ (b
V
95
CLASSICAL DISTRIBUTION
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
94
1
NDI
1
ND2 a
a
b
c
b
c
0 0 FIG. 3.9. Non-distributive lattices Whereas the inequations of Theorem 3.14.5 are true of every lattice, their converses are not. In other words, not every lattice is distributive; the lattices in Figure 3.9, for example, are non-distributive. Indeed, NDI and ND2 are completely typical examples, as explained in the following theorem. Theorem 3.14.8 Let L be a lattice. Then L is distributive lattice that is isomorphic to NDI or to ND2.
if and only if it has no sub-
Proof. The easy direction is from left to right. If a lattice contains a sublattice isomorphic to NDI or ND2, then it is easy to check that a 1\ (b V c) i (a 1\ b) V c-assuming the labelling as in Figure 3.9. For the other direction, assume that a 1\ (b V c) i (a 1\ b) V c for some a, band c in L. There are five cases to check. If a ~ b, or b ~ a, or a and b are incomparable and a ~ c, then in fact a 1\ (b V c) ~ (a 1\ b) V c contrary to our assumption. (The verification of these cases is left as an exercise.) The two remaining cases are when a and b are incomparable but c ~ a, and when a and b, and also a and c, are incomparable. The first gives rise to a sublattice isomorphic to ND2, the other allows one to construct a sublattice isomorphic to NDI. We sketch the reasoning for ND2, leaving the other construction as an exercise. Since a ~ c, a 1\ c = c, and a V c = a. The diagram on the left in Figure 3.10 illustrates what is known so far. Notice that a 1\ (b V c) ~ a 1\ c, and a V c ~ (a 1\ b) V c, by isotonicity, which gives the diagram on the right. If any two of the five elements, a 1\ (b V c), b V c, b, a 1\ b and (a 1\ b) V c are identified, then a 1\ (b V c) ~ (a 1\ b) V c; thus, these five elements form a sublattice isomorphic to ND2. 0
c). a
a
bvc
bvc
Exercise 3.14.6 Prove Theorem 3.14.5. al\(bvc)
It follows from (1) that (dl) can be simplified to:
(dl') a 1\ (b
V
c) ~ (a 1\ b) V (a 1\ c).
This may, in turn, be simplified to: (dl") a 1\ (b V c)
~ (a 1\ b) V
c.
Exercise 3.14.7 Prove that (dl') and (dl") are equivalent.
b
b (al\b)Vc c al\b c al\b FIG. 3.10. Illustration of the proof of Theorem 3.14.8
96
97
CLASSICAL DISTRIBUTION
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
In this context, by "isomorphic," we mean that the function preserves meet and join, but need not preserve bounds (especially if the lattice has no bounds). Of course, as a special case of the above theorem, if the lattice has a sublattice exactly like ND1 or ND2, including the bounds, then it is not distributive. This fact is intimately connected to the following theorem.
1
1
Theorem 3.14.9 In a distributive lattice, every element has at most one complement; i.e., if aCx and aCy then x = y.
a
b
a
c
d
f
Corollary 3.14.10 Every complemented distributive lattice is uniquely complemented. A direct method of demonstrating this theorem proceeds along the following lines: suppose a is an element with complements c and c'; show c = c' by applying the distributive law together with the principle that x 1\ y = x iff x V y = y iff x :'S y. The details are left as an exercise. An alternative method of demonstrating this theorem involves appealing to the previous theorem as well as showing that every instance of ambiguous complementation yields either a sublattice isomorphic to ND1 or a sublattice isomorphic to ND2. This is also left as an exercise.
o
o
o
CD3
CD2
CD1
FIG. 3.1l. Complemented distributive lattices
1
Theorem 3.14.11 In a complemented distributive lattice, complementation is unique. Proof Suppose that we have two complements of x: -x and ~x. Then -x 1\ x ~x 1\ x, and -x V x = I = ~x V x. By Theorem 3.14.13, ~x = -x.
=0 =
1
b
a
0
Definition 3.14.12 A Boolean algebra (sometimes called a Boolean lattice) is a complemented distributive lattice. By the previous corollary, complementation is in fact unique, and it is customary to denote it by -x or sometimes x. We shall learn more about Boolean algebras in Section 8.7, but for now let us content ourselves with two examples. Example 3.14.13 Consider the power set of some set U: rp(U) = {X: X ~ U}. This is readily seen to be a Boolean algebra, with the lattice order just ~, glb just n, lub just u (all of these restricted to rp(U), and -X = {a E U: a ¢ X}. Example 3.14.14 The Lindenbaum algebra of classical propositional calculus can be shown to be a Boolean algebra. Showing this depends somewhat on the particular formulation, and may require some "axiom chopping." But it is very easy if we take the classical propositional calculus to be defined as truth-table tautologies. Some distributive lattices are complemented, and others are not. Figure 3.11 contains examples of distributive lattices that are complemented; Figure 3.12 contains examples of distributive lattices that are not complemented. The distributivity of CD1 and NCD1 is an instance of a more general theorem. Theorem 3.14.15 EvelY linearly ordered set is a distributive lattice. Proof We note the following. In a linearly ordered set, for all b, c, either b :'S cor c :'S b. In general, if b :'S c, then b 1\ c = band b V c = c. Now let us consider a 1\ (b V c) :'S (a 1\ b) V (a 1\ c). If b :'S c then a 1\ (b V c) = a 1\ b :'S (a 1\ b) V (a 1\ c). The case where c :'S b is similar. 0
a
d
o
d
o
o NCD1
c
NCD2
NCD3
FIG. 3.12. Non-complemented distributive lattices On the other hand, the distributivity of the remaining lattices above follows from the above theorem along with the fact that distributive lattices are equationally defined and hence form a variety. The class of distributive lattices is accordingly closed under the formation of homomorphic images, direct products, and subalgebras. CD2 is isomorphic to the 2-direct power of CD1; CD3 is isomorphic to the 3-direct power of CD1. NCD2 is isomorphic to the direct product of CD1 and NCD1, and NCD3 is a sublattice of NCD2, which happens to be a subdirect product of CD1 and NCD1. Note also that NCD1 is a subdirect product of CD1 with itself. We conclude this section by describing a general class of concrete distributive lattices. Theorem 3.14.16 Let n be a natural number, let D(n) be the set of divisors of n, and let:'S be the relation of integral divisibility among elements of D(n). Then (D(n),:'S) is a distributive lattice. And in particular, a 1\ b is the greatest common divisor of a and b, and a V b is the least common multiple of a and b.
Proof Let us denote the two operations by gcd and lnn. It is easy to verify that both gcd and ICIn are idempotent, commutative, and associative; furthermore, gcd(a, lcm(a, b)) = a and lcm(a, gcd(a, b)) = a. (This is left as an exercise.) To show that the lattice is distributive, one has to show gcd(a, lcm(b, e)) ~ lcm(gcd(a, b), e). To show this, assume that pll is a prime factor in gcd(a, lcm(b, e)), i.e., pll I gcd(a, lcm(b, e)), but pll t lcm(gcd(a, b), e). Then, pili a and pllllcm(b, e), so pili a and (pili b or pili e). From the second assumption, pll gcd(a, b) and pll e. Then, pll a and pll b, which leads to contradiction, since then pll t gcd(a, b). 0
t
t
t
99
NON-CLASSICAL DISTRIBUTION
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
98
I
b e
t
a
Distributive lattices have the following useful property. Theorem 3.14.17 Let L be a distributive lattice. Suppose that there are elements a, x, Y such that x
1\
a
o
= y 1\ a,
FIG. 3.13. ND2
x V a = y V a.
Then x = y. Proof x
=
=
x V (x 1\ a) x V (y 1\ a) (y V x) 1\ (y Va) y V (y 1\ a) y.
3.15
=
(x V y) 1\ (x V a)
= (x V y) 1\ (y V a) = o
=
Non-Classical Distribution
The classical principles of distribution are espoused by virtually every proposed logical system. Nevertheless, there are exceptions-in particular, the various non-distributive logics inspired (principally) by quantum theory. Inasmuch as classical distribution is not a universal logical principle, it is worthwhile for us to examine briefly various proposed weakenings of the classical principles of distribution. We begin by defining the notion of a distributive triple, which is a natural generalization of the notion of a distributive lattice. Definition 3.15.1 Let L be a lattice, and let {a, b, e) be an unordered triple of elements of L. Then {a, b, e) is said to be a distributive triple if it satisfies the following equations, for every assignment of the elements a, b, e to the variables x, y, z: (dl) x 1\ (y V z) = (x 1\ y) V (x 1\ z); (d2) x V (y 1\ z) = (x V y) 1\ (x V z).
if and only if every triple of elements of L
Thus, as the reader can verify, the assignment of a to x, b to y, and e to z satisfies (dl) but not (d2). Note that if we interchange a with b in ND2 the same assignment satisfies (d2) but not (dl). As mentioned in the previous section, ND2 and ND1 together are completely typical non-distributive lattices, in the sense that a lattice is non-distributive if and only if it has a sublattice isomorphic to ND1 or ND2. Thus, classical distributivity corresponds to the absence of sublattices like ND1 and ND2. This way of looking at it suggests an obvious way to generalize classical distribution. Classical distribution rules out both ND1 and ND2. One way to generalize classical distribution involves ruling out only ND1, and another way involves ruling out only ND2. The first generalization does not correspond to any well-known class of lattices, so we will not pursue it any further. The second generalization, however, does correspond to a frequently investigated class of lattices-namely, modular lattices-which are formally defined as follows. Definition 3.15.3 Let L be a lattice. Then L is said to be modular following equation:
We note the following immediate theorem. Theorem 3.15.2 A lattice L is distributive a distributive triple.
a 1\ (b V e) = a 1\ I = a; (a 1\ b) V (a 1\ e) = a V 0 = a; a V (b 1\ e) = a V 0 = a; (a V b) 1\ (a V e) = b 1\ 1 = b.
is
As mentioned in the previous section, a lattice satisfies (dl) if and only if it satisfies (d2), yet (dl) and (d2) are not entirely lattice-theoretically equivalent. This amounts to the fact that a particular assignment of lattice elements to variables can satisfy one without satisfying the other. For example, consider the lattice ND2 in Figure 3.l3. In this particular case:
(ml) x
1\ (y V
(x 1\ z))
if it satisfies
the
= (x 1\ y) V (x 1\ z).
Notice first of all that (ml) is a consequence of (the universal closure of) (dl), obtained simply by substituting 'xl\z' for' z: and noting that xl\(xl\z) = xl\z. On the other hand, (dl) is not a consequence of (ml). This is seen by noting that M1 (= ND1) satisfies (ml) but not (dl). (This may be shown as an exercise.) Note that M1 in Figure 3.14 is the smallest modular lattice that is not distributive.
NON-CLASSICAL DISTRIBUTION
ORDER, LATTICES, AND BOOLEAN ALGEBRAS
100
101
a/\b
b
b
a
b/\(a/\c)
a/\c
c
c
c
b
a
a/\c
(1)
(2)
a/\b
a/\c
o FIG. 3.15. Illustration of the proof of Theorem 3.15.5
FIG. 3.14. M1 Next we observe that, just as there are three distribution equations, there are three modularity equations. In addition to (mI), there is also (m2), which is the dual of (mI), and (m3), which is its own dual: (m2) x
V (y /\ (x V z)) = (x V y) /\ (x V z);
(m3) (x /\ y)
V «x V y) /\ z)
= (x V y) /\ «x /\ y) V z).
One can show that a lattice satisfies all three equations (m1), (m2), (m3) if and only if it satisfies any one of them. On the other hand, these equations are not lattice-theoretically equivalent. (The reader may wish to prove this as an exercise.)

The following series of theorems connects the notion of modularity with the notion of distributive triple, and with the idea that modularity excludes sublattices isomorphic to ND2.

Theorem 3.15.4 Let L be a modular lattice, and let a, b, c be elements of L, and suppose that a ≤ b. Then {a, b, c} is a distributive triple.

Proof To prove the claim notice that half of each equality (d1), (d2) holds in every lattice. Notice also that simultaneously substituting y for z and z for y in (d1) or (d2) returns (d1) and (d2). Thus, it is sufficient to derive six inequations. The two steps which use the fact that the lattice is modular are explicitly indicated; the rest follows from a ≤ b or general lattice properties.

(i) a ∧ (b ∨ c) ≤ a = a ∨ (a ∧ c) = (a ∧ b) ∨ (a ∧ c).
(ii) a ∨ (c ∧ b) = a ∨ (c ∧ (a ∨ b)), and by (m2) a ∨ (c ∧ (a ∨ b)) = (a ∨ c) ∧ (a ∨ b).
(iii) b ∧ (a ∨ c) = b ∧ (c ∨ (b ∧ a)), and by (m1) b ∧ (c ∨ (b ∧ a)) = (b ∧ a) ∨ (b ∧ c).
(iv) b ∨ (a ∧ c) ≤ b = b ∧ (b ∨ c) = (b ∨ a) ∧ (b ∨ c).
(v) c ∧ (a ∨ b) = c ∧ b ≤ (c ∧ a) ∨ (c ∧ b).
(vi) c ∨ (a ∧ b) = c ∨ a ≤ (c ∨ a) ∧ (c ∨ b). □

Theorem 3.15.5 Let L be a lattice having the property that every triple {a, b, c} such that a ≤ b is a distributive triple. Then L is a modular lattice.

Proof The proof consists of showing that all triples satisfy (m1). There are two main cases to consider: first, when there are two elements related in the triple; and second, when all three elements are incomparable.

(1) Assume that a ≤ b. (i) a ∧ c ≤ a ≤ b, so b ∨ (a ∧ c) = b, and further a ∧ (b ∨ (a ∧ c)) = a ∧ b. On the other hand, a ∧ b = a since a ≤ b. But a ∨ (a ∧ c) = a. Thus, a ∧ (b ∨ (a ∧ c)) = a = (a ∧ b) ∨ (a ∧ c). The other cases are very similar, and they are summarized by the following equations. (ii) a ∧ (c ∨ (a ∧ b)) = a ∧ (c ∨ a) = a = (a ∧ c) ∨ (a ∧ b). (iii) b ∧ (a ∨ (b ∧ c)) = a ∨ (b ∧ c) = (b ∧ a) ∨ (b ∧ c). (iv) b ∧ (c ∨ (b ∧ a)) = b ∧ (c ∨ a) = (b ∧ c) ∨ a = (b ∧ c) ∨ (b ∧ a), where the middle step uses the hypothesis that {a, b, c}, having a ≤ b, is a distributive triple. (Figure 3.15(1) illustrates this case when no additional assumptions concerning c are made.)

(2) Assume that a ∥ b ∥ c ∥ a. The following inequations hold due to the isotonicity properties of meet and join: a ∧ (b ∨ (a ∧ c)) ≤ a, (a ∧ b) ∨ (a ∧ c) ≤ a, a ∧ (b ∨ (a ∧ c)) ≤ b ∨ (a ∧ c), (a ∧ b) ∨ (a ∧ c) ≤ b ∨ (a ∧ c), a ∧ (b ∨ (a ∧ c)) ≥ a ∧ b, and (a ∧ b) ∨ (a ∧ c) ≥ a ∧ b. Since in any lattice the inequation x ∧ (y ∨ z) ≥ (x ∧ y) ∨ (x ∧ z) is true, a ∧ (b ∨ (a ∧ c)) ≥ (a ∧ b) ∨ (a ∧ c). Let us denote by x and y the left- and the right-hand side of the last inequation. We want to show that not only y ≤ x, but also x ≤ y. Since y ≤ x, x and y form a distributive triple with any element; take b. Then (d1): x ∧ (y ∨ b) = (x ∧ y) ∨ (x ∧ b). Now y ∨ b = b ∨ (a ∧ b) ∨ (a ∧ c) = b ∨ (a ∧ c), and further, x ∧ (y ∨ b) = a ∧ (b ∨ (a ∧ c)) ∧ (b ∨ (a ∧ c)) = a ∧ (b ∨ (a ∧ c)) = x. On the other hand, x ∧ b = a ∧ b ∧ (b ∨ (a ∧ c)) = a ∧ b, and so (x ∧ y) ∨ (x ∧ b) = y ∨ (a ∧ b) = (a ∧ b) ∨ (a ∧ c) = y. That is, x = y. The incomparability is a symmetric relation; thus, the proof is complete. (Figure 3.15(2) illustrates this case, showing x and y not yet identified.) □

FIG. 3.15. Illustration of the proof of Theorem 3.15.5

Theorem 3.15.6 A lattice L is modular iff it contains no sublattice isomorphic to ND2.

Exercise 3.15.7 Prove this theorem. The tedious part of the proof is similar to the proof of Theorem 3.14.8. (Hint: Construct the free modular lattice over three generators.)
Thus, in a modular lattice, although not every triple need be distributive, every triple in which one element is below another is distributive. Modular lattices are the best-known generalization of distributive lattices. Nonetheless, they are not sufficiently general for the purposes of quantum logic. We must accordingly consider further weakenings of the distributive laws. Now, just as we can generalize distributivity by defining the notion of a distributive triple, we can generalize modularity by defining the notion of a modular pair as follows.
Definition 3.15.8 Let L be a lattice, and let (b, c) be an ordered pair of elements of L. Then (b, c) is said to be a modular pair (written M(b, c)) if the following condition obtains for every a in L:

(mp) If a ≤ c, then a ∨ (b ∧ c) = (a ∨ b) ∧ c.
The following theorem states the expected relation between modularity and modular pairs. (The proof of the theorem is left as an exercise.)
Theorem 3.15.9 A lattice L is modular iff every pair of elements of L is modular.
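Since the proof is left as an exercise, here is a small sketch showing how Definition 3.15.8 can be checked by brute force on a finite lattice; it also anticipates the asymmetry of M noted in the next paragraph. We assume ND2 is the five-element "pentagon" with 0 < a < b < 1 and c incomparable to a and b, labelled so that the text's claim about Figure 3.13 (M(b, c) but not M(c, b)) comes out true; the labelling is our assumption.

```python
# Brute-force modular pairs in the pentagon ND2.
# Assumption: ND2 = {0, a, b, c, 1}, 0 < a < b < 1, and c is
# incomparable to both a and b (cf. Figure 3.13).

ELEMS = ['0', 'a', 'b', 'c', '1']
LEQ = ({(x, x) for x in ELEMS} | {('0', x) for x in ELEMS}
       | {(x, '1') for x in ELEMS} | {('a', 'b')})

def leq(x, y):
    return (x, y) in LEQ

def rank(z):
    return sum(leq(w, z) for w in ELEMS)

def meet(x, y):
    return max((z for z in ELEMS if leq(z, x) and leq(z, y)), key=rank)

def join(x, y):
    return min((z for z in ELEMS if leq(x, z) and leq(y, z)), key=rank)

def modular_pair(b, c):           # Definition 3.15.8, checked over all a <= c
    return all(join(a, meet(b, c)) == meet(join(a, b), c)
               for a in ELEMS if leq(a, c))

print(modular_pair('b', 'c'))     # True:  M(b, c)
print(modular_pair('c', 'b'))     # False: not M(c, b), so ND2 is not modular
```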
Note that the modularity relation M is not symmetric. For example, ND2 (see Figure 3.13) provides a counter-example: M(b, c) holds but M(c, b) does not. This leads to a fairly well-known generalization of modularity, known as semi-modularity, which may be (but usually is not) defined as follows.
FIG. 3.16. WM1: a weakly modular, semi-modular lattice
Definition 3.15.10 Let L be a lattice. Then L is said to be semi-modular if it satisfies the following condition:

(sm) If M(a, b), then M(b, a).

Semi-modular lattices are also called symmetric lattices, since those are the lattices in which the M relation is symmetric. The principle of modularity may be understood as declaring that every pair of elements is modular. This suggests a general scheme for generalizing modularity: rather than declaring every pair to be modular, one declares only certain special pairs to be modular. Under this scheme, we consider two examples, weak modularity and orthomodularity, the former being defined as follows.

Definition 3.15.11 Let L be a lattice with lower bound 0. Then L is said to be weakly modular if it satisfies the following condition:

(wm) If a ∧ b = 0, then M(a, b).

In other words, in a weakly modular lattice, a pair (a, b) is modular provided a ∧ b = 0. Figure 3.16 contains an example of a lattice that is weakly modular and semi-modular, but not modular. Whereas weak modularity applies exclusively to lower bounded lattices, orthomodularity applies exclusively to ortholattices. The most revealing definition of orthomodularity uses the notion of orthogonality on an ortholattice, which is defined as follows, after which the official definition of orthomodular lattices is given.

Definition 3.15.12 Let L be an ortholattice, and let a, b be elements of L. Then a and b are said to be orthogonal (written a ⊥ b) if a ≤ b⁻.

Definition 3.15.13 Let L be an ortholattice. Then L is said to be an orthomodular lattice if it satisfies the following condition:

(om) If a ⊥ b, then M(a, b).

In other words, in an orthomodular lattice, every orthogonal pair is a modular pair; hence its name.

The following theorems state the logical relation between orthomodularity and weak modularity.

Theorem 3.15.14 Every weakly modular ortholattice is an orthomodular lattice.

Theorem 3.15.15 Not every orthomodular lattice is weakly modular.

The first theorem follows from the fact that a ∧ b = 0 if a ⊥ b. The second theorem may be seen by examining the lattice in Figure 3.17; this lattice is orthomodular but not weakly modular (neither is it semi-modular nor modular). OM1 is the smallest ortholattice that is orthomodular but not modular.

Next, we note that, whereas orthomodular lattices form a variety, weakly modular lattices and semi-modular lattices do not. Concerning the former, we observe that adding either of the following (dual) equations to the equations for ortholattices serves to characterize orthomodular lattices:

(OM1) a ∧ (a⁻ ∨ (a ∧ b)) = a ∧ b;
(OM2) a ∨ (a⁻ ∧ (a ∨ b)) = a ∨ b.

Concerning the latter, we appeal to Birkhoff's varieties theorem (see Chapter 2), which states that every variety is closed under the formation of subalgebras. In particular, we note that, whereas WM1 is both weakly modular and semi-modular, it has a sublattice that is neither; specifically, the lattice WM1s in Figure 3.18.
FIG. 3.17. OM1: an orthomodular (but not weakly modular) lattice

FIG. 3.18. WM1s: a sublattice of WM1

We conclude this section by noting that the lattices that are traditionally investigated in quantum logic are lattices of closed subspaces of separable infinite-dimensional complex Hilbert spaces. These lattices are orthomodular and semi-modular, but they are not weakly modular, and hence they are not modular. The condition of semi-modularity is not equational, so it is customarily ignored in purely logical investigations of quantum logic, which tend to concentrate on orthomodular lattices.

3.16 Classical Implication

The logical concepts discussed so far in the mathematical theory of propositions have included implication, conjunction, disjunction, and negation. The astute reader has no doubt noticed the asymmetry between the concept of implication, on the one hand, and the concepts of conjunction, disjunction, and negation, on the other. Whereas implication has been treated as a binary relation on the set of propositions, the remaining concepts have been treated as binary operations on the set of propositions. In one concrete presentation of the theory of propositions, propositions are treated as sets of possible worlds. In this representation, implication corresponds to set inclusion, whereas conjunction (disjunction, negation) corresponds to intersection (union, set complement). Letting ||x|| denote the set of worlds in which proposition x is true, we can write the following pairs of expressions in the theory of propositions:

(E1) p implies q, ||p|| ⊆ ||q||;
(E2) p and q, ||p|| ∩ ||q||;
(E3) p or q, ||p|| ∪ ||q||.
Another analogy is worth remarking at this point. Recall the notion of a division lattice, which is the set of all divisors of a given number n, together with the relation "x divides y." In light of the theory of division, we can append the following to the above three pairs of expressions, thus obtaining three triples of analogous expressions:

(e1) x divides y;
(e2) x plus y;
(e3) x times y.

Notice the crucial grammatical difference between these various expressions. Whereas (e1) is a sentence (more specifically, an open formula) of the language of division theory, (e2) and (e3) are not sentences, but are rather (open) terms. Just as the fundamental predicate in the theory of propositions is "implies," the fundamental predicate in the theory of division is "divides." On the other hand, the theory of division can be enriched to include an additional concept of division, namely, the familiar one from grammar school, which can be written using either of the following pair of expressions:

(e4) x divided by y;
(e5) x divided into y.
Whereas "x divides y" is a formula, "x divided by y" and "x divided into y" are terms. Thus, we have both a division relation (a partial order relation), and a division operation. Many concepts are paired in this way. For example, whereas "is a mother" is a predicate (unary relation), "the mother of" is a term operator (unary operation). What about the concept of implication? The concept of implication is expressed in English in two ways-in the formal mode of speech (the metalanguage), and in the material mode of speech (the object language), as illustrated in the following sentences: (1) "Grass is green" implies "grass is colored."
(2) If grass is green, then grass is colored.
In (1), the sentences "grass is green" and "grass is colored" are mentioned, which is to say that the sentences are the topic (subject) of sentence (1). Thus, the overall grammatical form of (1) is [subject-verb-object]. On the other hand, in (2), the sentences "grass is green" and "grass is colored" are not mentioned, but rather used; they are parts of sentence (2), but they are not the topic of (2). The overall grammatical form of (2) is [sentence-connective-sentence].

From the vantage point of the mathematical theory of propositions, whereas the formal version of implication corresponds to the partial order relation ≤ of lattice theory, the material version of implication corresponds to any of several two-place operations on the lattice of propositions, depending on which particular analysis of material implication one opts for. For example, classical truth-functional logic opts for the simplest, and least interesting, analysis of material implication, according to which p → q is identical to ∼p ∨ q. This particular operation has all the properties that one expects of an implication operation, but unfortunately it also has a number of properties that make it unsatisfactory as a representation of material implication. Dissatisfaction with the shortcomings of the classical material implication has led to the investigation of a large variety of alternative material implications, including the strict implications of C. I. Lewis (1918), the relevant implications of Anderson and Belnap (1975), and the counterfactual implications of D. Lewis (1973) and Stalnaker (1968).

In this chapter, we have identified the minimum properties of formal implication (it is a pre-ordering), the minimum properties of conjunction and disjunction (they are respectively greatest lower bound and least upper bound with respect to implication), and the minimum properties of negation (it is a generic complementation operation). With this in mind, we are naturally led to ask the following fundamental question.

(Q) What are the (minimum) properties of a lattice operation, in virtue of which it is
deemed an implication operation?

The first requirement, of course, is that the operation in question must be a two-place operation, since the corresponding English connective ("if ... then ...") is a two-place connective, and also since the intended formal counterpart (≤) is a two-place relation. This alone cannot be sufficient, however, unless we are willing to countenance both conjunction and disjunction as legitimate implication operations. So what other requirements need be satisfied? First of all, it seems plausible to require of any candidate operation that it be related to the implication relation ≤ in such a way that if a proposition p implies a proposition q, then the proposition p → q is universally true (true in every world), and conversely. Stated lattice-theoretically, we have the following condition:

(c1) p ≤ q iff p → q = 1.
Notice that (c1) immediately eliminates both conjunction and disjunction as candidates for implicationhood. (This may be verified as an exercise.) On the other hand, (c1) is nonetheless quite liberal. Examples of binary operations satisfying (c1) are easy to construct; for example, let L be any bounded lattice with two distinct elements 0 and 1. Define x → y as follows: whenever x ≤ y, set x → y equal to 1; otherwise, set x → y equal to anything other than 1. For a concrete example, consider the lattice B4 in Figure 3.19.

FIG. 3.19. A four-element lattice B4

The three matrices below define three different implication operations, all satisfying (c1), on B4.

[Three 4 × 4 implication tables on B4, each with x → y = 1 exactly when x ≤ y.]
Exercise 3.16.1 There are 4^16 binary operations on B4. Calculate the number of the operations that satisfy condition (c1).

Needless to say, when we look at larger lattices, the number of distinct operations satisfying (c1) becomes combinatorially staggering. Fortunately, (c1) is not the only condition one might plausibly require an operation to satisfy in order to count as a material implication. The next plausible requirement that comes to mind is the law of modus ponens. In its sentential guise, modus ponens sanctions the inference from the sentences S and if-S-then-T to the sentence T. In its propositional guise, the principle of modus ponens may be stated as follows (see Section 3.18 on filters and ideals):

(c2) p ∧ (p → q) ≤ q.
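The following sketch shows how a candidate operation on B4 can be tested mechanically against (c1) and (c2); it does not answer the exercise. We take B4 to be the four-element Boolean lattice {0, a, b, 1} with a and b incomparable, and test the classical material implication x → y = ∼x ∨ y as a sample; all names are our own.

```python
# Testing a candidate implication on B4 against (c1) and (c2).
# Assumption: B4 = {0, a, b, 1}, a and b incomparable; the sample
# arrow is classical material implication.

ELEMS = ['0', 'a', 'b', '1']
LEQ = {(x, x) for x in ELEMS} | {('0', x) for x in ELEMS} | {(x, '1') for x in ELEMS}
COMPL = {'0': '1', 'a': 'b', 'b': 'a', '1': '0'}

def leq(x, y):
    return (x, y) in LEQ

def rank(z):
    return sum(leq(w, z) for w in ELEMS)

def meet(x, y):
    return max((z for z in ELEMS if leq(z, x) and leq(z, y)), key=rank)

def join(x, y):
    return min((z for z in ELEMS if leq(x, z) and leq(y, z)), key=rank)

def arrow(x, y):                  # classical: x -> y = compl(x) v y
    return join(COMPL[x], y)

c1 = all((arrow(x, y) == '1') == leq(x, y) for x in ELEMS for y in ELEMS)
c2 = all(leq(meet(x, arrow(x, y)), y) for x in ELEMS for y in ELEMS)
print(c1, c2)                     # expected: True True
```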
In one concrete representation, propositions are sets of worlds, implication is set inclusion, and meet is set intersection. Accordingly, in this context, (c2) can be rewritten as follows:

(c2*) p ∩ (p → q) ⊆ q.

Now, the latter formula is equivalent set-theoretically to each of the following, where -p is the complement relative to the "set" of all possible worlds:

(c3*) -q ∩ (p → q) ⊆ -p;
(c4*) p ∩ -q ⊆ -(p → q).
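The asserted set-theoretic equivalence can be spot-checked mechanically. The sketch below, using a made-up three-world universe of our own choosing, verifies that for all sets p, q, r (thinking of r as ||p → q||), the conditions p ∩ r ⊆ q, -q ∩ r ⊆ -p, and p ∩ -q ⊆ -r stand or fall together.

```python
# Brute-force check that (c2*), (c3*), (c4*) are equivalent as
# conditions on arbitrary sets p, q, r over a tiny set of "worlds".

from itertools import chain, combinations

WORLDS = set(range(3))

def subsets(s):
    s = list(s)
    return [set(c) for c in chain.from_iterable(
        combinations(s, n) for n in range(len(s) + 1))]

def neg(p):
    return WORLDS - p   # complement relative to the set of all worlds

all_sets = subsets(WORLDS)
print(all(
    (p & r <= q) == (neg(q) & r <= neg(p)) == (p & neg(q) <= neg(r))
    for p in all_sets for q in all_sets for r in all_sets
))   # expected: True
```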
Translating these into lattice-theoretic formulas yields the following, which are not lattice-theoretically equivalent to (c2):

(c3) -q ∧ (p → q) ≤ -p;
(c4) p ∧ -q ≤ -(p → q).

Conditions (c1)-(c4) are collectively referred to as the minimal implicative conditions: every implication operation should satisfy all four conditions on any lattice with complementation. With few exceptions, no material implication that has been proposed violates any of these principles. One apparent exception is the system of Fitch (1952), which has no negative implication introduction rule, and so seems to violate (c4). The many-valued logics of Łukasiewicz violate (c2). The question is whether there are any other conditions that are satisfied by every implication operation. Without answering this question definitively, we simply examine a few candidate conditions, and show that each one is rejected by at least one extant material implication, and accordingly cannot be regarded as minimal. Let us start by considering a very powerful principle, the law of importation-exportation, which may be stated lattice-theoretically as follows:

(c5) p ∧ q ≤ r iff p ≤ q → r.

This condition is quite strong. To begin with, it entails both (c1) and (c2). What is perhaps more surprising is the following theorem.

Theorem 3.16.2 Let L be a lattice, and let → be any two-place operation on L satisfying condition (c5). Then L is distributive.

Proof It suffices to show (x ∨ y) ∧ z ≤ (x ∧ z) ∨ (y ∧ z). Let r = (x ∧ z) ∨ (y ∧ z). Now, clearly both x ∧ z ≤ r and y ∧ z ≤ r, so by (c5) x ≤ z → r and y ≤ z → r, so x ∨ y ≤ z → r, so by (c5) (x ∨ y) ∧ z ≤ r. □

In other words, a lattice admits an operation satisfying (c5) only if it is distributive. On the other hand, (c5) cannot be considered a minimal implicative condition, for although it is satisfied by the classical material implication and the material implication of intuitionistic logic, it is not satisfied by the various strict implications of modal logic, nor is it satisfied by the various counterfactual implication connectives.

Another candidate condition of implicationhood is the law of transitivity, stated lattice-theoretically as follows:

(c6) (p → q) ∧ (q → r) ≤ p → r.

This amounts to the claim that if (p → q) and (q → r) are both true, then (p → r) must also be true. As plausible as (c6) seems, it has the following immediate consequence, the law of weakening:

(c7) q → r ≤ (p ∧ q) → r.

However, (c7), and hence (c6), cannot be considered as a minimal implicative criterion, since in particular it is not satisfied by counterfactual implications. In order to see this, consider the following argument:

(A1) If I were to drop this glass, then it would break; therefore, if I were to drop this glass, and it were shatterproof, then it would break.

Another condition one might consider is the law of contraposition, which may be stated lattice-theoretically as follows:

(c8) p → q = -q → -p.

This condition is true of the classical material implication, and it is true of the strict implications of modal logic, but it is not true of the material implication of intuitionistic logic. Accordingly, (c8) cannot count as a minimal implicative criterion.

3.17 Non-Classical Implication

A partially ordered groupoid is a structure (S, ≤, ∘), where ≤ is a partial order on S, and ∘ is a binary operation on S that is isotonic in each of its positions. When the partial order is a lattice ordering, we speak of a lattice-ordered groupoid when ∘ distributes over ∨ from both directions (in which case isotonicity becomes redundant). The binary operation →L is a left residual iff it satisfies:

(lr) a ∘ b ≤ c iff a ≤ b →L c.

A right residual satisfies:

(rr) a ∘ b ≤ c iff b ≤ a →R c.

We often follow Pratt (1991) in denoting the right residual by the unsubscripted →, and the left residual by ←, noting that the order of the arguments reverses, so z ← y = y →L z. It is easy to see that left and right residuals are uniquely defined by the above properties.

Definition 3.17.1 A residuated groupoid is a structure (S, ≤, ∘, ←, →) where (S, ≤, ∘) is a partially ordered groupoid and →, ← are, respectively, right and left residuals.

Note a similarity between residuals and implication. Thus, thinking of ∘ as a premise grouping operation (we call it "fusion" following the relevance logic literature; the term originates with R. K. Meyer) and thinking of ≤ as deducibility, the law of the right residual is just the deduction theorem and its converse. One could say the same about left residuals, but notice that right and left residuals differ in whether it is the formula to the left of ∘ or to the right of ∘ that is "exported" from the premise side to the conclusion side. We have the following easily derivable facts:

(1) a ∘ (a → b) ≤ b, (b ← a) ∘ a ≤ b (modus ponens).
(2) a ≤ b → (b ∘ a), a ≤ (a ∘ b) ← b (fusing).
(3) Let Γ(b) be any product a₁ ∘ ... ∘ b ∘ ... ∘ aₙ (parentheses ad lib). Then if a ≤ b and Γ(b) ≤ c, then Γ(a) ≤ c (cut).

It is easy to see that one may replace the law of the right residual equivalently with the first halves of (1) and (2), and similarly with the left residual and the second halves. One may also replace transitivity and isotonicity with (3).
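As an illustration of the law of the right residual, the following sketch computes →R on a small example. The choices here are ours: we take fusion to be meet on the four-element Boolean lattice (a commutative, associative, isotonic choice, so the two residuals coincide) and define a →R c as the join of all b with a ∘ b ≤ c.

```python
# Computing the right residual in a finite partially ordered groupoid.
# Assumption: carrier = four-element Boolean lattice, fusion = meet.

ELEMS = ['0', 'a', 'b', '1']
LEQ = {(x, x) for x in ELEMS} | {('0', x) for x in ELEMS} | {(x, '1') for x in ELEMS}

def leq(x, y):
    return (x, y) in LEQ

def rank(z):
    return sum(leq(w, z) for w in ELEMS)

def meet(x, y):
    return max((z for z in ELEMS if leq(z, x) and leq(z, y)), key=rank)

def join(x, y):
    return min((z for z in ELEMS if leq(x, z) and leq(y, z)), key=rank)

fuse = meet                       # our choice of fusion

def right_residual(a, c):
    out = '0'                     # join of all candidates, starting at bottom
    for b in ELEMS:
        if leq(fuse(a, b), c):
            out = join(out, b)
    return out

# Verify the law (rr): a o b <= c iff b <= a ->R c.
print(all(leq(fuse(a, b), c) == leq(b, right_residual(a, c))
          for a in ELEMS for b in ELEMS for c in ELEMS))   # expected: True
```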
Isotonicity of ∘ and transitivity yield that residuals are antitonic in their first arguments and isotonic in their second arguments, as is stated in the following.

Fact 3.17.2 If a ≤ b, then b → c ≤ a → c (rule suffixing) and c → a ≤ c → b (rule prefixing), and the same for the left residual ←.

Proof We do only the proofs for the right residual, the others being analogous:

1. a ≤ b (hypothesis)
2. a ∘ (b → c) ≤ b ∘ (b → c) (isotonicity)
3. b ∘ (b → c) ≤ c (modus ponens)
4. a ∘ (b → c) ≤ c (2, 3, transitivity)
5. b → c ≤ a → c (4, right residual);

1. a ≤ b (hypothesis)
2. c ∘ (c → a) ≤ a (modus ponens)
3. c ∘ (c → a) ≤ b (1, 2, transitivity)
4. c → a ≤ c → b (3, right residual). □

It is customary to assume various familiar requirements on ∘, e.g., associativity, commutation, idempotence, which are even likely to be taken for granted when premises are collected together into sets. In fact, these (or in the absence of associativity, slightly generalized versions) all correspond to various implicational axioms. Thus, consider half of associativity (let us call it "right associativity"⁶):

a ∘ (b ∘ c) ≤ (a ∘ b) ∘ c.

⁶The reason for the word "right" is to keep mnemonic linkage to the right residual. We would not quarrel with anyone who said that it was more natural to call this "left associativity," but below we will use this name for the dual of the above inequation, to keep linkage with the left residual.

Fact 3.17.3 It yields the following property of the right residual, familiar from implicational logics: a → b ≤ (c → a) → (c → b) (prefixing).

Proof
1. c ∘ (c → a) ≤ a (modus ponens)
2. a ∘ (a → b) ≤ b (modus ponens)
3. [c ∘ (c → a)] ∘ (a → b) ≤ b (1, 2, cut)
4. c ∘ [(c → a) ∘ (a → b)] ≤ b (3, right associativity)
5. (c → a) ∘ (a → b) ≤ c → b (4, right residual)
6. (a → b) ≤ (c → a) → (c → b) (5, right residual). □

Before going on let us note the following obvious consequence of prefixing (apply the law of the right residual twice):

c ∘ [(c → a) ∘ (a → b)] ≤ b (imported prefixing).

Fact 3.17.4 We can also go the other way and derive right associativity from prefixing.

Proof
1. b ≤ a → (a ∘ b) (fusing)
2. c ≤ (a ∘ b) → ((a ∘ b) ∘ c) (fusing)
3. a ∘ (b ∘ c) ≤ a ∘ {[a → (a ∘ b)] ∘ [(a ∘ b) → ((a ∘ b) ∘ c)]} (1, 2, isotonicity)
4. a ∘ {[a → (a ∘ b)] ∘ [(a ∘ b) → ((a ∘ b) ∘ c)]} ≤ (a ∘ b) ∘ c (imported prefixing)
5. a ∘ (b ∘ c) ≤ (a ∘ b) ∘ c (3, 4, transitivity). □

It is easy to argue symmetrically that "left associativity",

(a ∘ b) ∘ c ≤ a ∘ (b ∘ c),

is equivalent to prefixing for the left residual.

Let us next consider the relation between the "sequent form" of permutation,

(pm) a → (b → c) ≤ b → (a → c),

and (contextual) right commutation,

(rc) a ∘ (b ∘ c) ≤ b ∘ (a ∘ c).

Fact 3.17.5 In this context (pm) is equivalent to (rc).

Proof First we show that permutation follows from right commutation:

1. a → (b → c) ≤ a → (b → c) (reflexivity)
2. a ∘ [a → (b → c)] ≤ b → c (1, right residual)
3. b ∘ {a ∘ [a → (b → c)]} ≤ c (2, right residual)
4. a ∘ {b ∘ [a → (b → c)]} ≤ c (3, right commutation)
5. b ∘ [a → (b → c)] ≤ a → c (4, right residual)
6. a → (b → c) ≤ b → (a → c) (5, right residual).

Now going the other way we derive right commutation from permutation:

1. b ∘ (a ∘ c) ≤ b ∘ (a ∘ c) (reflexivity)
2. a ∘ c ≤ b → [b ∘ (a ∘ c)] (1, right residual)
3. c ≤ a → {b → [b ∘ (a ∘ c)]} (2, right residual)
4. c ≤ b → {a → [b ∘ (a ∘ c)]} (3, permutation)
5. b ∘ c ≤ a → [b ∘ (a ∘ c)] (4, right residual)
6. a ∘ (b ∘ c) ≤ b ∘ (a ∘ c) (5, right residual). □

Let us remark that it is clear that in the presence of associativity, right commutation can be replaced with simple commutation,

a ∘ b ≤ b ∘ a,

and even in the absence of associativity, the "rule form" of permutation,

a ≤ b → c implies b ≤ a → c,

can be shown equivalent to simple commutation. Also, of course, given a right identity element (a ∘ e = a), the "sequent form" of permutation, (pm), can be shown equivalent to the "theorem form" of permutation,

e ≤ [a → (b → c)] → [b → (a → c)].

It makes ideas simpler to assume the presence of such an identity element (and a left one too), as well as associativity. But we shall not so assume unless we explicitly indicate.

Another familiar implicational law is contraction,

a → (a → b) ≤ a → b.

If we had associativity and a right (left) identity, contraction for the right (left) residual would just amount to square-increasingness,

a ≤ a ∘ a.

But working in their absence, we must consider the more general forms

a ∘ b ≤ a ∘ (a ∘ b) (right square-increasingness),
b ∘ a ≤ (b ∘ a) ∘ a (left square-increasingness).

Fact 3.17.6 Contraction for the right residual is equivalent to right square-increasingness. The corresponding property for the left residual and left square-increasingness follows by a symmetric argument.

Proof Let us first show that contraction follows from right square-increasingness:

1. a ∘ [a → (a → b)] ≤ a → b (modus ponens)
2. a ∘ {a ∘ [a → (a → b)]} ≤ b (1, right residual)
3. a ∘ [a → (a → b)] ≤ a ∘ {a ∘ [a → (a → b)]} (right square-increasingness)
4. a ∘ [a → (a → b)] ≤ b (2, 3, transitivity)
5. a → (a → b) ≤ a → b (4, right residual).

The converse goes as follows:

1. a ∘ (a ∘ b) ≤ a ∘ (a ∘ b) (reflexivity)
2. a ∘ b ≤ a → (a ∘ (a ∘ b)) (1, right residual)
3. b ≤ a → [a → (a ∘ (a ∘ b))] (2, right residual)
4. a → [a → (a ∘ (a ∘ b))] ≤ a → (a ∘ (a ∘ b)) (contraction)
5. b ≤ a → (a ∘ (a ∘ b)) (3, 4, transitivity)
6. a ∘ b ≤ a ∘ (a ∘ b) (5, right residual). □

Besides the rules that correspond to thinking of premises as collected together into sets, there is one more rule that is often taken for granted, namely, dilution or thinning (sometimes called "monotonicity"; it is the absence of this rule that delineates so-called "non-monotonic logics"). Dilution is the rule that says it never hurts to add more premises, and algebraically it amounts to saying that a ∘ b ≤ b ("right lower bound"), and corresponds to the positive paradox for the right residual,

a ≤ b → a.

We are really examining the case of "thinning on the right." There is also "thinning on the left," which algebraically amounts to saying that b ∘ a ≤ b ("left lower bound"), and which corresponds to the positive paradox for the left residual,

a ≤ a ← b.

The proof that these are equivalent is an immediate application of (rr), the law of the right residual.

The various relationships we have discovered between principles of the right residual and principles for fusion all have their obvious duals for the left residual. Besides residuation giving familiar properties of either left or right arrow, it also gives "almost familiar" properties relating the two, at least if the reader squints so as not to be able to distinguish them (here we will use subscripts). Thus both of the following are easy to derive:

(pa) a ≤ (a →L b) →R b, a ≤ (a →R b) →L b (pseudo-assertion);
(rpp) if a ≤ b →L c, then b ≤ a →R c, and if a ≤ b →R c, then b ≤ a →L c (rule pseudo-permutation).

It is easy to see that pseudo-assertion and pseudo-permutation are equivalent to each other. Thus, for example, the first variety of pseudo-assertion follows easily from the first variety of pseudo-permutation as follows:

1. a →L b ≤ a →L b (reflexivity)
2. a ≤ (a →L b) →R b (1, rule pseudo-permutation).

Of course, when ∘ is commutative, the two arrows are identical and hence we obtain ordinary assertion and rule permutation, familiar from relevance logic (cf. Anderson and Belnap 1975). Indeed, the commutativity of ∘, the pseudo-permutation for →R, the pseudo-permutation for →L, ordinary assertion, and the rule form of permutation are all equivalent to each other.

By putting various conditions on fusion, we obtain algebras corresponding to various systems of implication in the logical literature (in Figure 3.20 one is supposed to keep every condition below and add the new condition). Thus with associativity alone one obtains the Lambek calculus. If one adds commutativity, one obtains linear logic (Girard 1990). From here one has two natural choices, adding either square-increasingness to obtain relevant implication (Anderson and Belnap 1975), or the postulate that fusion produces lower bounds (a ∘ b ≤ a) to get BCK implication (Ono and Komori 1985). Finally, one obtains the properties of intuitionistic implication by collecting all these properties together. These relationships are summarized in Figure 3.20, where conditions below are always preserved in adding new properties above.

FIG. 3.20. Implicational fragments and algebraic properties. (From bottom to top: Lambek calculus (associativity); linear implication (commutation); then relevant implication (square-increasingness) and BCK implication (lower bound); intuitionistic implication.)

Remark 3.17.7 The Lambek calculus was first formulated as a Gentzen system, and there are two versions depending on whether one allows an empty left-hand side or not. Algebraically this corresponds to whether one assumes the existence of an identity or not. The same point arises with the other systems, and amounts to whether we are interested purely in the implication relations (a ≤ b), or want theorems as well (e ≤ c). If all theorems were of the form a → b, this might well be thought to be a distinction without a difference, but even when the only operations are ∘ and →, one still gets things like e ≤ (a → a) ∘ (b → b).
Remark 3.17.8 There is a subtlety implicit in the relationship of the algebraic systems to their parent logics that goes beyond the usual routine of seeing the logics as their "Lindenbaum algebras" (identifying provably equivalent sentences). The problem is that almost all of the logics mentioned above had connectives other than fusion and implication in their original formulations, and some of them did not have fusion as an explicit connective. The Lambek calculus is a notable exception, since it had no other connectives than the two arrows, and although it was formulated as a Gentzen system, there is not too much involved in reading comma as fusion and adding associativity to "defuse" the binary nature of the fusion operation. The story with the other systems is more complicated, and it would take us too far afield here to try to work out in detail that when t is a term containing only ∘ and →, we can derive e ≤ t in the algebraic system just when the formula t is a theorem in the logical system (for simplicity using the same symbols for sentential variables in the logic as for variables in the algebra). But by browsing through the literature one can find pure implicational fragments of all of the above logics; try Anderson and Belnap (1975) as a start: careful reading will give the clues needed for the other systems. The only question is then how to add fusion conservatively.
3.18 Filters and Ideals

Recall that, given any algebra A, and given any subset S of A, we can form a subalgebra of A based on S, so long as S is closed under the operations of A. Since a lattice may be regarded as an algebra, we may apply this idea to lattices, as follows.

Definition 3.18.1 Let L be any lattice, and let S be any non-empty subset of L. Then S is said to form a sublattice of L if it satisfies the following conditions:

(S1) If a ∈ S and b ∈ S, then a ∧ b ∈ S.
(S2) If a ∈ S and b ∈ S, then a ∨ b ∈ S.

Notice that conditions (S1) and (S2) simply correspond to the closure of S under the formation of meets and joins. Among sublattices in general, two kinds are especially important, called filters and ideals. We begin with the formal definition of filters.

Definition 3.18.2 Let L be any lattice, and let F be any non-empty subset of L. Then F is said to be a filter on L if it satisfies the following conditions:

(F1) If a ∈ F and b ∈ F, then a ∧ b ∈ F.
(F2) If a ∈ F and a ≤ b, then b ∈ F.
In nearly every definition, there is the problem of dealing with degenerate cases, and the definition of filters is no exception. For example, the empty set satisfies (F1) and (F2) vacuously, yet we have (by fiat) excluded it from counting as a filter. Similarly, the whole lattice L satisfies these conditions, so the dual question is whether to count L as a filter. Here, we are somewhat ambivalent; we often want to exclude the whole set L, while at other times it is more convenient to count L as a filter.

Our solution is officially to allow L as a filter, and introduce the further notion of a proper filter, which is defined simply to be any filter distinct from L. On the other hand, a standard "conversational implicature" throughout this book will be that by "filter" we mean proper filter. Occasionally we shall backslide on these conventions when it is convenient to do so, but we shall always tell the reader when we are doing so and why.

From the viewpoint of logic, a filter corresponds to a collection of propositions that is closed under implication (F2) and the formation of conjunctions (F1). There are two ways of thinking about this. On the one hand, we may think of a filter as a theory, which may be regarded as a logically closed collection of claims. In particular, if a theory T claims p, and T claims q, then T claims the conjunction of p and q; and if T claims p, and p logically implies q, then T claims q. On the other hand, a filter may be thought of as a (partial) possible world, which may be regarded as a closed collection of propositions (namely, the propositions that obtain in that world). In particular, if a proposition p obtains in world w, and proposition q obtains in w, then the conjunction of p and q obtains in w; and if p obtains in w, and p implies q, then q obtains in w. (See below, however.)
There are several alternative ways of characterizing filters that are helpful. For example, (F2) may be replaced by either of the following conditions (which are equivalent in light of the commutativity of the join operation):

(F2') If a ∈ F, then a ∨ b ∈ F.
(F2+) If a ∈ F, or b ∈ F, then a ∨ b ∈ F.

The interchangeability of (F2) with (F2'), or with (F2+), is a consequence of two simple lattice-theoretic facts: a ≤ a ∨ b (and of course b ≤ a ∨ b); and a ≤ b iff b = a ∨ b. Filters can also be characterized by a single condition, which is a strengthening of (F1):

(F1+) a ∈ F and b ∈ F iff a ∧ b ∈ F.
The equivalence of (F1+) with (F1) & (F2) is based on the following lattice-theoretic facts: a ∧ b ≤ a, b; and a ≤ b iff a ∧ b = a. Combining these observations, we obtain a useful (although redundant) characterization of a filter as a set satisfying conditions (F1+) and (F2+), collected as follows:

(F1+) a ∈ F and b ∈ F iff a ∧ b ∈ F.
(F2+) If a ∈ F or b ∈ F, then a ∨ b ∈ F.

From the logical point of view, (F1+) and (F2+) say that a filter corresponds to a set (of propositions) that behaves exactly like a classical truth set (possible world) with respect to conjunction, and halfway like a classical truth set with respect to disjunction. What is missing, which would make a filter exactly like a classical truth set, is the converse of (F2+). Appending the missing half of (F2+) yields the important notion of prime filter, which is formally defined as follows.

Definition 3.18.3 Let L be any lattice, and let P be any non-empty subset of L. Then P is said to be a prime filter on L if it satisfies the following conditions:

(P1) a ∧ b ∈ P iff both a ∈ P and b ∈ P.
(P2) a ∨ b ∈ P iff either a ∈ P or b ∈ P.
Thus, in a prime filter (prime theory, prime world), the conjunction of two propositions is true if and only if both of the propositions are true, and the disjunction of two propositions is true if and only if at least one of the propositions is true.

Having described filters, we now discuss ideals, which are exactly dual to filters. Recall that the dual of a lattice is obtained by taking the converse of the partial order relation. Now, filters on a lattice L correspond exactly to ideals on the dual of L. More formally, we define ideals as follows.

Definition 3.18.4 Let L be any lattice, and let I be any non-empty subset of L. Then I is said to be an ideal on L if it satisfies the following conditions:

(I1) If a ∈ I and b ∈ I, then a ∨ b ∈ I.
(I2) If a ∈ I and a ≥ b, then b ∈ I.
Since ideals are dual to filters, all of the various characterizations of filters can be straightforwardly dualized (switching ∨ and ∧, and ≤ and ≥) to yield corresponding characterizations of ideals. In particular, dualizing the definition of prime filter yields the definition of prime ideal. Similarly, just as a filter can be thought of as a theory (i.e., a logically closed collection of claims), an ideal can be thought of as a counter-theory (i.e., a logically closed collection of disclaimers), and a prime ideal can be thought of as a "false ideal." Not only are prime filters and prime ideals dual concepts, they are also complementary, in the sense that the set-theoretic complement of any prime filter is a prime ideal, and vice versa. (The reader may wish to verify this as an exercise.)

For historical as well as structural reasons, algebraists have favored ideals, whereas logicians have found filters more congenial. However, the class of lattices is self-dual (the dual of a lattice is again a lattice), so there can be no ultimate reason to prefer filters to ideals, or ideals to filters; after all, a filter on a lattice is just an ideal on the dual lattice, and vice versa. To emphasize this, some authors refer to filters as "dual ideals." But we are writing as logicians, and have a natural preference for truth over falsity; accordingly, we concentrate our attention on filters throughout this book, although we make occasional use of ideals (e.g., see below). Given our bias, and given the duality of filters and ideals, we shall not usually provide separate definitions of properties for ideals, being content to define a property for filters and letting the reader dualize as needed.

Next, we note a very important property of filters.

Theorem 3.18.5 Let L be a lattice, and let K be any non-empty collection of filters on L. Then ∩K is also a filter on L.
Corollary 3.18.6 Let L be a lattice, and let S be any subset of L. Then there is a filter P on L satisfying the following:

(s1) S ⊆ P.
(s2) For any filter F on L, if S ⊆ F, then P ⊆ F.
In other words, for any lattice L and for any subset S of L, there is a smallest filter on L that includes S. In particular, the smallest filter on L that includes S is the intersection of the set {F : F is a filter on L, and S ⊆ F}. (This is left as an exercise.) This justifies the following definitions.

Definition 3.18.7 Let L be a lattice, and let S be any subset of L. Then the filter generated by S, denoted [S), is defined to be the smallest filter on L that includes S.

Definition 3.18.8 Let L be a lattice, and let a be any element of L. Then the principal filter generated by a, denoted [a), is defined to be the smallest filter on L containing a; i.e., [a) = [{a}).

Definition 3.18.9 Let L be any lattice, and let P be any non-empty subset of L. Then P is said to be a principal filter on L if P = [a) for some a in L.

Given the above definitions, the following theorems may be verified.

Theorem 3.18.10 Let L be a lattice, and let X be any subset of L. Then

[X) = {x ∈ L : for some a₁, ..., aₙ in X, a₁ ∧ ... ∧ aₙ ≤ x}.
Proof We first observe that F = {y : ∃x₁, ..., xₙ ∈ X such that x₁ ∧ ... ∧ xₙ ≤ y} is a filter. Suppose x₁, ..., xₙ ∈ X and x₁ ∧ ... ∧ xₙ ≤ a. Also suppose y₁, ..., yₘ ∈ X and y₁ ∧ ... ∧ yₘ ≤ b. Then x₁, ..., xₙ, y₁, ..., yₘ ∈ X and x₁ ∧ ... ∧ xₙ ∧ y₁ ∧ ... ∧ yₘ ≤ a ∧ b. Suppose x₁, ..., xₙ ∈ X and x₁ ∧ ... ∧ xₙ ≤ a and a ≤ b. Then x₁ ∧ ... ∧ xₙ ≤ b. We next suppose that G is a filter such that X ⊆ G. Clearly F ⊆ G, for y ∈ F implies ∃x₁, ..., xₙ ∈ X such that x₁ ∧ ... ∧ xₙ ≤ y. But since X ⊆ G, x₁, ..., xₙ ∈ G, and since G is a filter, x₁ ∧ ... ∧ xₙ ∈ G, and again since G is a filter (and x₁ ∧ ... ∧ xₙ ≤ y), then y ∈ G. So since F ⊆ any filter G such that G ⊇ X, F ⊆ ∩{G : G is a filter and G ⊇ X} = [X). And since we showed above that F is itself a filter, and since obviously F ⊇ X, clearly [X) ⊆ F, and hence F = [X). □
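Theorem 3.18.10 suggests a direct computation of [X) on a finite lattice: since only finitely many meets arise, the meet of all the generators already gives the bottom of the filter. The following sketch, using the diamond lattice as a made-up example, generates a filter and re-checks (F1) and (F2); the names are our own.

```python
# Computing [X) on a finite lattice via Theorem 3.18.10.
# Example lattice: the five-element diamond {0, a, b, c, 1}.

ELEMS = ['0', 'a', 'b', 'c', '1']
LEQ = {(x, x) for x in ELEMS} | {('0', x) for x in ELEMS} | {(x, '1') for x in ELEMS}

def leq(x, y):
    return (x, y) in LEQ

def rank(z):
    return sum(leq(w, z) for w in ELEMS)

def meet(x, y):
    return max((z for z in ELEMS if leq(z, x) and leq(z, y)), key=rank)

def generated_filter(xs):
    bottom = xs[0]
    for x in xs[1:]:
        bottom = meet(bottom, x)              # meet of all generators
    return {y for y in ELEMS if leq(bottom, y)}

def is_filter(f):
    f1 = all(meet(x, y) in f for x in f for y in f)           # (F1)
    f2 = all(y in f for x in f for y in ELEMS if leq(x, y))   # (F2)
    return f1 and f2

F = generated_filter(['a', 'b'])
print(sorted(F), is_filter(F))    # ['0', '1', 'a', 'b', 'c'] True
# Here meet(a, b) = 0, so [{a, b}) is the whole (improper) filter.
```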
Theorem 3.18.11 Let L be a lattice, and let a be any element of L. Then [a) = {x ∈ L : a ≤ x}.

Proof Since [a) = [{a}) = {y : ∃x₁, ..., xₙ ∈ {a}, x₁ ∧ ... ∧ xₙ ≤ y}, and since a = a ∧ a = a ∧ a ∧ a, etc., [a) = {y : a ≤ y}. □

Theorem 3.18.12 If G and H are filters, then [G ∪ H) = {z : ∃x ∈ G, ∃y ∈ H such that x ∧ y ≤ z}. And if G is a filter, [G ∪ {a}) = [G ∪ [a)) = {z : ∃x ∈ G such that x ∧ a ≤ z}. [G ∪ {a}) is often denoted by [G, a).

Proof By Theorem 3.18.10, z ∈ [G ∪ H) iff ∃z₁, ..., zₙ ∈ G ∪ H such that z₁ ∧ ... ∧ zₙ ≤ z. But this is true iff either

(i) ∃x₁, ..., xᵢ ∈ G, ∃y₁, ..., yₖ ∈ H such that i + k = n and x₁ ∧ ... ∧ xᵢ ∧ y₁ ∧ ... ∧ yₖ ≤ z, or
(ii) ∃x₁, ..., xₙ ∈ G such that x₁ ∧ ... ∧ xₙ ≤ z, or
(iii) ∃y₁, ..., yₙ ∈ H such that y₁ ∧ ... ∧ yₙ ≤ z.

But (i) is true iff ∃x ∈ G (namely, x₁ ∧ ... ∧ xᵢ), ∃y ∈ H (namely, y₁ ∧ ... ∧ yₖ) such that x ∧ y ≤ z (since both G and H are filters). And (ii) is true iff ∃x ∈ G (namely, x₁ ∧ ... ∧ xₙ), ∃y ∈ H (any y ∈ H) such that x ∧ y ≤ z (since G is a filter). And similarly, (iii) is true iff ∃x ∈ G (any x ∈ G), ∃y ∈ H (namely, y₁ ∧ ... ∧ yₙ) such that x ∧ y ≤ z. As for the second part, if G is a filter, clearly [G ∪ {a}) = [G ∪ [a)), since any x ≥ a must be in [G ∪ {a}). We now show that [G ∪ [a)) = {z : ∃x ∈ G such that x ∧ a ≤ z}. By the above, [G ∪ [a)) = {z : ∃x ∈ G, ∃y ≥ a such that x ∧ y ≤ z}. Clearly then {z : ∃x ∈ G such that x ∧ a ≤ z} ⊆ [G ∪ [a)). Suppose conversely that ∃x ∈ G, ∃y ≥ a such that x ∧ y ≤ z. Then x ∧ a ≤ x ∧ y ≤ z. □

The notions of ideal generated by a subset A, and principal ideal, are defined in a dual manner, which is left as an exercise.

There are two stronger notions, complete filter and completely prime filter, which are especially appropriate to complete lattices, but which are also useful occasionally in more general lattices. These are defined as follows.

Definition 3.18.13 Let L be any lattice, and let F be any non-empty subset of L. Then F is said to be a complete filter on L if it satisfies the following conditions:

(CF1) If A ⊆ F, and inf(A) exists, then inf(A) ∈ F.
(CF2) If a ∈ F and a ≤ b, then b ∈ F.

Definition 3.18.14 Let L be any lattice, and let P be any complete filter on L. Then P is said to be a completely prime filter on L if it satisfies the following additional condition:

(CP) For A ⊆ L, if sup(A) exists, and sup(A) ∈ P, then a ∈ P for some a ∈ A.

Whereas an ordinary filter is closed under binary (and hence finite) conjunction, a complete filter is closed under arbitrary (and hence infinite) conjunction. The following theorems connect these ideas to earlier ones.
Theorem 3.18.15 Every principal filter on a lattice is complete, and conversely, every complete filter on a lattice is principal.

Theorem 3.18.16 Not every principal filter is completely prime.
In examining the former, notice that in a complete filter F, the infimum of every subset of F must be an element of F, so in particular inf(F) must be in F; but if inf(F) ∈ F, then F has a least element, viz., inf(F). In examining the latter, consider the lattice of all rational numbers, ordered in the usual way, and consider the set P = {r : 0 ≤ r} of all non-negative rationals, which is clearly a principal filter. Now, although P contains the supremum of the set N of negative rationals (i.e., 0), it contains no negative rational, and accordingly is not completely prime.

In addition to filters and ideals, we have occasional use for a pair of weaker notions, especially in connection with general partially ordered sets: the notions of positive cone and negative cone, which are dual to each other. The former notion is defined as follows.

Definition 3.18.17 Let P be a partially ordered set, and let C be any subset of P. Then C is said to be a positive cone on P if it satisfies the following:

(PC) If x ∈ C, and x ≤ y, then y ∈ C.
Notice that (PC) is just (F2), from the definition of filter. Next, we note that the intersection of any collection of positive cones on a poset P is itself a positive cone on P; this fact justifies the following definition.
Definition 3.18.18 Let P be a partially ordered set, and let S be any subset of P. Then the positive cone generated by S, denoted [S), is defined to be the smallest positive cone on P that includes S.

Definition 3.18.19 Let P be a partially ordered set, and let a be any element of P. Then the principal positive cone generated by a, denoted [a), is defined to be the smallest positive cone on P that contains a; i.e., [a) = [{a}). A set S is called a principal positive cone on P if S = [a) for some a in P.
Remark 3.18.20 Strictly speaking, the [S), [a) notation is ambiguous between filters and positive cones generated by S and a respectively. It turns out, though, that the principal filter generated by a is always the set {x : a ≤ x}, i.e., the positive cone generated by a, so the ambiguity of "[a)" is harmless. This is not so with "[S)", since the positive cone generated by S need not be closed under meet, whereas of course the filter must be.

Theorem 3.18.21 Let P be a partially ordered set, and let S be any subset of P. Then the following obtain:

(1) [S) = {x ∈ P : for some s ∈ S, s ≤ x};
(2) [a) = {x ∈ P : a ≤ x}.
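Remark 3.18.20 can be seen concretely on a small example. The sketch below, reusing the diamond lattice and helper functions of our earlier sketches, computes the positive cone generated by a set per Theorem 3.18.21(1) and contrasts it with the generated filter; the example is our own.

```python
# Illustrating Remark 3.18.20 on the diamond: the positive cone
# generated by S can differ from the filter generated by S.

ELEMS = ['0', 'a', 'b', 'c', '1']
LEQ = {(x, x) for x in ELEMS} | {('0', x) for x in ELEMS} | {(x, '1') for x in ELEMS}

def leq(x, y):
    return (x, y) in LEQ

def rank(z):
    return sum(leq(w, z) for w in ELEMS)

def meet(x, y):
    return max((z for z in ELEMS if leq(z, x) and leq(z, y)), key=rank)

def positive_cone(S):             # Theorem 3.18.21(1)
    return {x for x in ELEMS if any(leq(s, x) for s in S)}

def generated_filter(S):          # Theorem 3.18.10, as before
    bottom = S[0]
    for s in S[1:]:
        bottom = meet(bottom, s)
    return {x for x in ELEMS if leq(bottom, x)}

print(sorted(positive_cone(['a', 'b'])))     # ['1', 'a', 'b']: not meet-closed
print(sorted(generated_filter(['a', 'b'])))  # ['0', '1', 'a', 'b', 'c']
```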
The notions of a negative cone, negative cone generated by A (denoted (A]), and principal negative cone generated by a, denoted (a], are defined dually. The following theorems are relevant to the historical origin of the term "ideal" in lattice theory.

Theorem 3.18.22 Let L be a linearly ordered set, and let C be a positive cone on L. Then C is in fact a filter on L.

Theorem 3.18.23 Let L be a linearly ordered set, and let C be a negative cone on L. Then C is in fact an ideal on L.
Recall Dedekind's (1872) construction of the real numbers from the rationals using "cuts." Now, a Dedekind lower cut is simply a negative cone (and hence ideal) on the lattice of rational numbers; dually, a Dedekind upper cut is simply a positive cone (and hence filter). What Dedekind did was to identify real numbers with cuts (ideals) in such a way that rational numbers are identified with principal ideals, and irrational numbers are identified with ideals that are not principal. One way of looking at Dedekind's construction is that the rationals are completed by adding certain "ideal" objects which can only be approximated by the rationals, but are otherwise not really there; hence the expression "ideal" in reference to these set-theoretic constructions, which Dedekind used to make sense of Kummer's concept of "ideal number," which had arisen in connection with certain rings of numbers (the algebraic integers). The terminology was carried over, as a special case, to Boolean lattices (which may be viewed as special kinds of rings) and subsequently generalized to lattices as a whole. The Dedekind construction of the reals from the rationals may be viewed as embedding the (non-complete) lattice of rationals into the complete lattice of ideals. This is actually a special case of two more general theorems, stated as follows, but not proved until Chapter 8.

Theorem 3.18.24 Every partially ordered set P can be embedded into the partially ordered set of negative cones on P, where the partial order relation is set inclusion.

Theorem 3.18.25 Every lattice L can be embedded into the lattice of ideals on L, where the partial order relation is set inclusion.
In addition to prime filters, already defined, there is another special, and important, kind of filter, defined in what follows. Recall that, in a partially ordered set P, a maximal element of a subset S of P is any element m satisfying the following conditions:

(m1) m ∈ S.
(m2) For all x ∈ S, m ≤ x only if m = x.
In other words, a maximal element of S is any element of S that is not below any other element of S. Now, the collection F of filters of a given lattice L forms a partially ordered set, where inclusion is the partial order relation, and the collection P of proper filters of L is a subset of F. We can accordingly talk of maximal elements of P relative to this partial ordering. This yields the notion of a maximal filter, which is formally defined as follows.

Definition 3.18.26 Let L be a lattice, and let F be a subset of L. Then F is said to be a maximal filter on L if the following conditions are met:

(m1) F is a proper filter on L.
(m2) For any proper filter F′ on L, F ⊆ F′ only if F = F′.
In other words, a maximal filter on L is any proper filter on L that is not included in any other proper filter on L. Note that the qualification "proper" is crucial, since every filter on L is included in the non-proper filter L. Logically interpreted, a maximal filter corresponds to a maximal theory, or a maximal possible world. A maximal theory is one that claims as much as it can claim, short of claiming everything (which would be inconsistent). Similarly, a maximal world is one that cannot be enriched without producing the "absurd world" (i.e., one in which every proposition is true). The first theorem concerning maximal filters is given as follows.

Theorem 3.18.27 Let L be a lattice, and let F be any filter on L. Then there is a maximal filter F⁺ on L such that F ⊆ F⁺.
In other words, every (proper) filter on a lattice is included in a maximal filter. The proof of this very important theorem is postponed until Chapter 13. Before continuing, we note that another popular term appearing in the literature for maximal filter is "ultrafilter." We, however, stick to the less flashy name "maximal filter." The following very important theorems state the relation between maximal filters and prime filters in the special case of distributive lattices.

Theorem 3.18.28 In a distributive lattice, every maximal filter is prime, although not every prime filter is maximal.
Proof Consider a maximal filter F and suppose that a ∨ b ∈ F, but that neither a ∈ F nor b ∈ F. Then consider [F ∪ {a}) and [F ∪ {b}). Both of these must equal the whole lattice L, i.e., for any x ∈ L, there is some element f₁ ∈ F such that f₁ ∧ a ≤ x and there
is some element f₂ ∈ F such that f₂ ∧ b ≤ x. But then (using (F1)) f = f₁ ∧ f₂ ∈ F, and clearly f ∧ (a ∨ b) = (f ∧ a) ∨ (f ∧ b) ≤ x. But since both f and a ∨ b are in F, then (again using (F1)) f ∧ (a ∨ b) ∈ F, and so (by (F2)) for arbitrary x, x ∈ F, contradicting our assumption that F was proper. In order to see that not every prime filter is maximal, consider the lattice in Figure 3.21. This lattice is distributive, and whereas {1} is a prime filter, it is not maximal, since it is included in {a, 1}, which is a proper filter distinct from {1}. □

FIG. 3.21. Lattice with a non-maximal prime filter

Although prime filters and maximal filters do not coincide in the general class of distributive lattices, they do coincide in the special subclass of complemented distributive lattices, i.e., Boolean lattices. Before showing this, we define some notions appropriate to the more general category of lattices with complementation operations (recall Sections 3.12 and 3.13).

Definition 3.18.29 Let L be a lattice, let x ↦ -x be any complementation operation on L, and let F be any filter on L. Then F is said to be consistent (with respect to x ↦ -x) if the following condition obtains:

(c) If x ∈ F, then -x ∉ F.

Definition 3.18.30 Let L be a lattice, let x ↦ -x be any complementation operation on L, and let F be any filter on L. Then F is said to be complete (with respect to x ↦ -x) if the following condition obtains:

(c) If x ∉ F, then -x ∈ F.

Logically interpreted, consistency says that if a proposition p is true, its negation -p is not true, whereas completeness says that if p is not true, then -p is true. Having presented the general notion, we concentrate on Boolean lattices. We begin with the following theorem.

Theorem 3.18.31 In a Boolean lattice, every proper filter is consistent, and every consistent filter is proper.

Proof There is only one improper filter on L, namely, L itself; so if a filter is improper, it contains every element, and hence is inconsistent. Going the other direction simply uses the fact that a ∧ -a = 0, so if a ∈ F and -a ∈ F, then a ∧ -a ∈ F, so 0 ∈ F; but 0 ≤ a for every a, so a ∈ F for every a. □

Theorem 3.18.32 In a Boolean lattice, a filter is maximal if and only if it is complete (and proper), and it is maximal if and only if it is prime (and proper).

We have already shown that, in any distributive lattice, every maximal filter is prime. So all we need to show is that, in a Boolean lattice, (1) every prime filter is complete, and (2) every complete filter is maximal.

Proof (1) Let F be a prime filter of a Boolean algebra. Since F is non-empty, there is some x ∈ F. But x ≤ 1 = a ∨ -a. So by (F2), a ∨ -a ∈ F, and so by primeness, a ∈ F or -a ∈ F, for an arbitrary element a picked as you like. So F is complement complete. (2) Suppose that F is complete. If F is not maximal then there must exist some other proper filter G that properly includes F, and then there must be some element a ∈ G such that a ∉ F. But since F is complement complete, it must then be the case that -a ∈ F ⊆ G, i.e., both a, -a ∈ G. But then G is inconsistent, and hence improper, contrary to our earlier assumption. □
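Theorem 3.18.32 can also be confirmed by exhaustive search on a small Boolean lattice. The sketch below enumerates the proper filters of the four-element Boolean lattice and checks that maximality, primeness, and (complement-)completeness pick out exactly the same filters; the encoding is ours.

```python
# Exhaustive check on the four-element Boolean lattice:
# among proper filters, maximal = prime = complete.

from itertools import chain, combinations

ELEMS = ['0', 'a', 'b', '1']
LEQ = {(x, x) for x in ELEMS} | {('0', x) for x in ELEMS} | {(x, '1') for x in ELEMS}
COMPL = {'0': '1', 'a': 'b', 'b': 'a', '1': '0'}

def leq(x, y):
    return (x, y) in LEQ

def rank(z):
    return sum(leq(w, z) for w in ELEMS)

def meet(x, y):
    return max((z for z in ELEMS if leq(z, x) and leq(z, y)), key=rank)

def join(x, y):
    return min((z for z in ELEMS if leq(x, z) and leq(y, z)), key=rank)

def is_filter(f):
    return (bool(f)
            and all(meet(x, y) in f for x in f for y in f)
            and all(y in f for x in f for y in ELEMS if leq(x, y)))

subsets = chain.from_iterable(combinations(ELEMS, n) for n in range(1, 5))
proper = [set(s) for s in subsets if is_filter(set(s)) and set(s) != set(ELEMS)]

def is_maximal(f):
    return not any(f < g for g in proper)

def is_prime(f):
    return all((x in f) or (y in f)
               for x in ELEMS for y in ELEMS if join(x, y) in f)

def is_complete(f):
    return all((x in f) or (COMPL[x] in f) for x in ELEMS)

print(all(is_maximal(f) == is_prime(f) == is_complete(f) for f in proper))
# expected: True
```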
By the work of the exercise above, and by the fact that maximal filters (ideals) coincide with prime filters (ideals) in Boolean algebras, we know that in a Boolean algebra the set-theoretic complement of a maximal filter is a maximal ideal. But this is not always the case for an arbitrary lattice. (Look again at the three-element lattice above.)

Having discussed (maximal) filters and ideals separately, we conclude this section by mentioning what we think is the more fundamental notion, namely, the idea of a filter-ideal pair. We introduce two notions: that of a maximal filter-ideal pair and that of a principal filter-ideal pair.

Definition 3.18.33 Let L be a lattice, and let F and I be subsets of L. Then the ordered pair (F, I) is said to be a filter-ideal pair on L if F is a filter, and I is an ideal, on L.

Definition 3.18.34 Let L be a lattice, and let (F, I) be a filter-ideal pair on L. Then (F, I) is said to be disjoint if F ∩ I = ∅, overlapping if F ∩ I ≠ ∅, and exhaustive if F ∪ I = L.

Definition 3.18.35 Let L be a lattice, and let P be a collection of filter-ideal pairs on L. Define a binary relation ≤ on P so that (F, I) ≤ (G, J) iff F ⊆ G and I ⊆ J.

Fact 3.18.36 The relation ≤ defined above is a partial order relation on P.

Definition 3.18.37 A filter-ideal pair on L is said to be a maximal filter-ideal pair if it is a maximal element of P₁ with respect to the ordering ≤, where P₁ is the collection of all disjoint filter-ideal pairs.

Definition 3.18.38 A filter-ideal pair on L is said to be a principal filter-ideal pair if it is a minimal element of P₂ with respect to the ordering ≤, where P₂ is the collection of all overlapping filter-ideal pairs.

In other words, a maximal filter-ideal pair is a disjoint filter-ideal pair that does not bear the relation ≤ to any other disjoint filter-ideal pair; a principal filter-ideal pair is an
overlapping filter-ideal pair that does not bear the relation ≤ to any other overlapping filter-ideal pair. We shall show in Chapter 13 (see Lemma 13.4.4) that every disjoint filter-ideal pair can be extended to a maximal filter-ideal pair. We also show in Section 8.13 that every overlapping filter-ideal pair can be shrunk to a principal filter-ideal pair.

The notion of filter-ideal pair puts the concepts of truth and falsity on equal terms. In particular, a filter-ideal pair corresponds to a theory, not merely in the sense of a collection of claims, but more specifically in the sense of a collection of claims together with a corresponding collection of disclaimers. Thus, under this construal, every theory claims certain propositions, denies others, and is indifferent with regard to still others. Notice carefully the difference between being a disclaimer and failing to be a claim: with respect to certain propositions, a given theory may simply have nothing to say. For example, a theory of celestial mechanics may say nothing about what wines are good with lobster.

We conclude this section by noting that, in the special case of distributive lattices, the notion of maximal filter-ideal pair reduces to the earlier concepts of maximal filter (ideal) and prime filter (ideal). We cannot show a similar result for the principal filter-ideal pairs; however, as will become clear in Chapter 13, they have other nice properties which render them interesting in the context of representation.
Theorem 3.18.39 Let L be a distributive lattice, and let (F, I) be a maximal filter-ideal pair on L. Then F is a prime filter on L, and I is a prime ideal on L.

Corollary 3.18.40 Let L be a Boolean lattice, and let (F, I) be a maximal filter-ideal pair on L. Then F is a maximal filter on L, and I is a maximal ideal on L.

The corollary uses the fact that, in a Boolean lattice, maximal filters (ideals) are prime, and conversely. The proof of the theorem is left as an exercise.
4 SYNTAX

4.1 Introduction
It is customary to think of sentences concretely as utterances stretched out linearly in time, or, even more commonly, as inscriptions stretched out linearly in space, but this very sentence is a counter-example to such over-simplicity (because of the need for line breaks). Such examples (and even the previous sentence when intuitions are sufficiently trained) lend themselves nicely to the construction in most elementary logic texts of sentences as strings of symbols, where, when push comes to shove, these are given the standard set-theoretical rendering as finite sequences. But there is no reason to think that sequences are the most felicitous choice of "data structure" in which to code hieroglyphs or ideograms of various types. It could be that the placement of a pictorial element over or under, to the left or the right of another, might have linguistic significance. Nonetheless there seems nothing wrong with thinking that the pictographic elements of a language are irrelevant from some suitably cold intellectual point of view, and we shall, for the time being, adopt the useful fiction of the logic texts that a sentence is indeed a string of symbols, understood in the standard set-theoretical way as a finite sequence, i.e., a function from some proper initial segment of the natural numbers. For ease of exposition we shall not countenance the null string ( ) (the function defined on the empty set), but we shall eventually get around to discussing it in an exercise.
4.2 The Algebra of Strings
Let us call any finite, non-null sequence of symbols chosen from some given set A a string (in A), and let us call A an alphabet and the members of A symbols. Many authors talk of "expressions" instead of strings, but this neologism leads to the eventual need to distinguish those "expressions" which are well-formed (i.e., grammatical) from those that are not, with the resultant barbarism "well-formed expression." We denote the set of all such sequences as S. There is a natural operation on finite sequences, namely juxtaposition:

(s0, ..., sm) ⌢ (t0, ..., tn) = (s0, ..., sm, t0, ..., tn).

Juxtaposition can be pictured as joining two strings side by side, and is a natural operation on S that allows us to regard it as an algebra. Thus the algebra of strings in the alphabet A is the structure S = (S, ⌢). It is easy to see that it is generated from the singletons of its alphabet, and that it has the following property (associativity):

x ⌢ (y ⌢ z) = (x ⌢ y) ⌢ z.
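For readers who want to experiment, the following minimal sketch (the representation is ours, not the text's) models the algebra of strings in Python, with strings as non-empty tuples of symbols and juxtaposition as tuple concatenation:

    def juxtapose(s, t):
        """Join two strings side by side: (s0, ..., sm) ⌢ (t0, ..., tn)."""
        return s + t  # tuple concatenation

    p, arrow, q = ('p',), ('->',), ('q',)   # singletons of the alphabet
    assert juxtapose(juxtapose(p, arrow), q) == ('p', '->', 'q')
    # The property just displayed (associativity) holds by construction:
    assert juxtapose(juxtapose(p, arrow), q) == juxtapose(p, juxtapose(arrow, q))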
An algebra satisfying this property is called a semi-group. It turns out that in a certain sense this by itself captures all the typical properties of an algebra of strings. Thus we have the following.

Exercise 4.2.1 Prove that all algebras of strings are associative.

Theorem 4.2.2 Up to isomorphism, free semi-groups and algebras of strings are the same.

We shall prove this result in two halves (Subtheorems 4.2.3 and 4.2.7).

Subtheorem 4.2.3 Every algebra of strings is a free semi-group.

It would be possible to prove this directly. Thus if f is a mapping of the set A of symbols into a semi-group S = (S, +), one can define h((s0, ..., sk)) = f(s0) + ... + f(sk) and it is easy to see that h is then a homomorphism. However, we shall proceed somewhat more abstractly, collecting some needed properties for the antecedent of a lemma because these properties are interesting in their own right.

(1) (Pseudo-trichotomy.) Define x < a to mean that ∃m(x ⌢ m = a). Then if x ⌢ y = a ⌢ b, either
(i) x = a and y = b, or
(ii) x < a, or
(iii) a < x.

(2) (Atomic generation.) For every algebra of strings there exists a class G of atomic generators, i.e., no element a in G is of the form x ⌢ y.

Note that the positive integers have these properties, with + as ⌢ and G = {1}. Indeed the integers satisfy the stronger law of trichotomy (x = a or x < a or a < x), which helps motivate our choice of name above. Thus it will turn out that the positive integers form the free semi-group with one free generator, S(1). But more important for our purposes is that every algebra of strings has properties (1) and (2). We leave it for the reader to prove this in the following exercise.

Exercise 4.2.4 Show that an algebra of strings S in an alphabet A is atomically generated (with the singletons of the elements of A as the generators), and that it satisfies pseudo-trichotomy.

Before stating our lemma, we shall state and prove the following sublemma which deals with semi-groups that are not necessarily algebras of strings.

Sublemma 4.2.5 Let S = (S, ⌢) be a semi-group satisfying properties (1) and (2) above. Then it also satisfies

(3) (Left-Cancellation.) If x ⌢ y = x ⌢ b, then y = b.

Proof The proof is by induction on generators. The base case is when x is a generator. Plugging the antecedent of (3) into pseudo-trichotomy, we have either y = b (as desired), or else x < x, i.e., that x = x ⌢ m for some m ∈ S, which violates the atomicity of x. For the inductive step, we assume that for x = x1 ⌢ x2, x1 and x2 each satisfy left-cancellation (no matter what the right-hand term is). Then assuming the hypothesis of (3),

(x1 ⌢ x2) ⌢ y = (x1 ⌢ x2) ⌢ b,

and by associativity we may regroup so as to obtain

x1 ⌢ (x2 ⌢ y) = x1 ⌢ (x2 ⌢ b).

We may now use left-cancellation, first for x1 and then for x2, so as to obtain y = b, as desired. □

We are now in a position to deal with the lemma that will give us Subtheorem 4.2.3.

Lemma 4.2.6 Let S = (S, ⌢) be an atomically generated semi-group satisfying pseudo-trichotomy. Then S is a free semi-group.

Proof Let G be the set of atomic generators, and let f be any mapping of these into the carrier set of some given semi-group with + as its operation. Define h inductively so that

(1) for s ∈ G, h(s) = f(s), and
(2) h(x ⌢ y) = h(x) + h(y).

The only way that this definition could go wrong would be if the above clauses somehow conflicted either with each other, or with themselves, so as to assign different values to some given element. The first kind of conflict is clearly impossible, for no atom s can be of the form x ⌢ y. The second kind of conflict is clearly impossible in the case of clause (1) (since f is a function, and hence single-valued), and associativity will come into play in showing that it is also impossible in the case of clause (2). In somewhat more detail, the proof will proceed by induction on generators, showing that h is "well-defined" (gives a single value when computed according to clauses (1) and (2)). As we said above, clause (1) clearly determines a unique value for h on the generators. For the sake of having a sufficiently strong inductive hypothesis, we shall prove not merely that h is well-defined on each element e, but also that h is well-defined on all "substrings," i.e., on all elements x, y such that e = x ⌢ y. Thus suppose that we have a string x ⌢ y = a ⌢ b. We shall show that h must assign the left-hand side the same value that it assigns the right by way of the calculations of clause (2). We know from pseudo-trichotomy that unless x = a and y = b (in which case, invoking the inductive hypothesis, we are clearly OK), then either x < a or a < x. The two cases being symmetric, we shall treat only the first case. If x is "a proper initial segment" of a, this means that a = x ⌢ m (for some "middle chunk" m), and so

(3) a ⌢ b = (x ⌢ m) ⌢ b.
But then by the associativity of ⌢, it may be seen that

(4) x ⌢ y = x ⌢ (m ⌢ b).

Since by inductive hypothesis we may assume that h is well-defined on "substrings" of a and b, we have by way of the computations of clause (2) that

(5) h(a ⌢ b) = (hx + hm) + hb.

But from (4), using left-cancellation (guaranteed to us by the sublemma), we have that

(6) y = m ⌢ b,

i.e., that m and b are "substrings" of y. This means that again we are justified in applying the computations of clause (2) to obtain

(7) h(x ⌢ y) = hx + (hm + hb).

But then associativity of the semi-group operation + gives us the desired

(8) h(x ⌢ y) = h(a ⌢ b). □

Subtheorem 4.2.3, of course, follows from this lemma and Exercises 4.2.1 and 4.2.4. We still have to prove the other half of Theorem 4.2.2. We do this by proving Subtheorem 4.2.7, the converse of Subtheorem 4.2.3.

Subtheorem 4.2.7 Let S be a free semi-group. Then S is isomorphic to an algebra of strings.

Proof Let us assume that S = (S, +) is a free semi-group with free generators G. We shall show that S is isomorphic to an algebra of strings. Pick A as a set in one-one correspondence f with G (it might as well be G itself). Let S(A) be the algebra of strings in the alphabet A. We know from Subtheorem 4.2.3 that S(A) is itself a free semi-group with free generators A, and we know from a result of Section 2.14 that any two free semi-groups with the same cardinality of free generators are isomorphic. □

Remark 4.2.8 Combining Lemma 4.2.6 and Subtheorem 4.2.7, we obtain a kind of representation theorem for atomically generated semi-groups that satisfy pseudo-trichotomy; that is, we show that structures satisfying a rather abstract description can all be thought of concretely as sets of strings operated on by concatenation.

Exercise 4.2.9 The proof alluded to in the above remark is rather indirect, detouring through talk about free algebras, etc. Give instead a "direct" proof that every atomically generated semi-group satisfying pseudo-trichotomy is isomorphic to an algebra of strings. (Hint: Show that every element in such an algebra can be "factored" into atomic elements in at least one way, and in at most one way, i.e., prove a suitable "unique factorization theorem.")
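As an informal supplement (the names and the target semi-group are our own choices, not the text's), the direct construction mentioned under Subtheorem 4.2.3 can be sketched in Python: a map f on the symbols extends to a homomorphism h by factoring a string into its atoms, which the exercise asks you to show is possible in exactly one way, and combining the values with the target operation.

    def extend(f):
        """Extend f (defined on single symbols) to h on all strings;
        '+' below is the operation of the target semi-group."""
        def h(string):                      # string: a non-empty tuple of symbols
            total = f(string[0])
            for symbol in string[1:]:       # the unique factorization into atoms
                total = total + f(symbol)
            return total
        return h

    f = {'a': 1, 'b': 2}.get                # target semi-group: positive integers under +
    h = extend(f)
    x, y = ('a', 'b'), ('b',)
    assert h(x + y) == h(x) + h(y)          # h preserves juxtaposition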
Exercise 4.2.10 In our description of the algebra of strings we have dropped the null string (the empty sequence ( )) from consideration. We have done this for reasons of simplicity in exposition, but many authors allow it. "Your mission, should you choose to accept it," is to put it back in, and prove analogs to all of the above results. The appropriate algebraic structure is a monoid (M, +, 0), where (M, +) is a semi-group and 0 is a distinguished element satisfying

(Id) x + 0 = 0 + x = x (identity).

Besides the tedium of keeping track of 0 and ( ), which as "null entities" are a bit hard to always see, there is the further conceptual problem of how to treat "distinguished" elements. Our suggestion is that "0" be viewed as a nullary, or, if that is too much, a constant unary operation, always giving the value 0. This way it need not be counted among the generators in the definition of a free monoid.
Exercise 4.2.11 There is often more than one fruitful way to abstract a concrete structure. Thus instead of thinking of strings as constructed by way of concatenation, we can think of them as all constructed from the null string at root, by the operation of extending a sequence by adding one more component at its end. Thus a multiple successor algebra is a structure (N, 0, (σi)i∈I), where 0 ∈ N, each σi is a unary operation on N, and where no special postulates are required. A multiple successor arithmetic (due to Hermes 1938) is a multiple successor algebra in which for all i ∈ I,

(1) for all x ∈ N, σi x ≠ 0, and
(2) if σi x = σi y, then x = y.
Show that (up to isomorphism) free multiple successor algebras and multiple successor algebras of strings are the same. Show further that every multiple successor arithmetic is isomorphic to a multiple successor algebra of strings.

We can give examples of syntactic structures that satisfy the postulates on the algebras corresponding to the Lambek calculus in a couple of its forms.¹

Example 4.2.12 (Associative Lambek calculus of strings). Consider the algebra of strings S = (S, ⌢) in the alphabet A, i.e., the set of all strings of symbols from A. This includes the empty string ( ). The operation ⌢ of concatenation is an associative operation, and ( ) is the identity element. Concatenation is a kind of "addition" of strings, and might be denoted by +. We define a kind of "subtraction" as follows: x ⇀ y is the result of deleting the string x from the beginning of the string y. There clearly is the symmetric operation of deleting the string x from the end of the string y. We denote this as y ↼ x. (Note that in each case, the "harpoon" points to the string from which the other string is being deleted.) An alternative metaphor, which does not seem as natural, is to view concatenation as multiplication ×, and x \ y and y / x as quotients. A metaphor which has closer connections to logic is the following. We view concatenation as a kind of "fusion of premises" ∘, and we view the deletion operations as kinds of implication, writing x → y and y ← x. Note that no matter what the metaphor, we use symbols that "point" so as to distinguish between the dual residuals. Older literature did not do this, instead using unmemorable notations such as x/y, x\y, x//y, x : y, x :: y to make distinctions.

¹ By simply dropping the empty string (pair) one can obtain forms which correspond in the Gentzen system to not allowing empty left-hand sides.
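Before the exercise below, a small sketch may help fix intuitions about the two deletions; the function names are ours, and None stands in for "undefined" where a deletion is impossible.

    def del_front(x, y):
        """x ⇀ y: delete the string x from the beginning of the string y."""
        return y[len(x):] if y[:len(x)] == x else None

    def del_back(y, x):
        """y ↼ x: delete the string x from the end of the string y."""
        return y[:len(y) - len(x)] if y[len(y) - len(x):] == x else None

    y = ('a', 'b', 'c')
    assert del_front(('a',), y) == ('b', 'c')
    assert del_back(y, ('c',)) == ('a', 'b')
    assert del_front(('b',), y) is None      # 'b' is not an initial segment of y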
Exercise 4.2.13 Consider the algebra of strings S = (S, ⌢) in the alphabet A. Let =S be the identity relation restricted to the set S, which is of course a partial order on S. Show that (S, =S, ⌢, ⇀, ↼, ( )) is a residuated monoid.

Example 4.2.14 (Non-associative Lambek calculus of pairs). This is similar to the example above, except the fundamental operation is not concatenation but rather "pairing": x, y ↦ (x, y).
S is now the set that results from closing A under repeated applications of the pairing operation. The "subtraction operations" now delete either the first or the second component. The empty operation that pairs nothing with itself is denoted by ( ).
Exercise 4.2.15 Let S be as in the above example. Prove that this is a residuated groupoid with identity.

4.3 The Algebra of Sentences
Let us look at the various ways that the string (p → q) may be composed by concatenation. Here we adopt the customary informal practice of denoting a sequence by listing its members. Thus (p → q) is our "nickname" for the more formally designated ( (, p, →, q, ) ).

Perhaps we should make one more comment about our practices. Following Curry (1963), we never display the object language, and so, for example, '→' is not the conditional sign, it is rather the name of the conditional sign (the conditional sign itself could be a shoe, a ship, or a piece of sealing wax). Returning to the various ways of generating (p → q), these include first generating ( (, p, →, q ), and then sticking a right parenthesis on the end (this corresponds to the multiple successor arithmetic way of looking at things). But an equally valid mode of generation is to first generate ( (, p ) and then concatenate it with ( →, q, ) ). We leave to the reader the task of writing all the various combinations, but one thing should be clear: none of them corresponds to the intuitions that we all have from logic that (p → q) is generated from p and q. In logic texts, the usual inductive definition of sentences for sentential logic says that sentences are generated from sentences, that is from other (well-formed) expressions, and not, as in the examples above, from nonsensical strings. Thus the typical definition from logic texts starts out by postulating a certain (denumerable) class of atomic sentences (p, q, etc.), and then says things like:

(→) if φ and ψ are sentences, then (φ → ψ) is a sentence.

(Of course typically there would be additional connectives besides →, but this will do for our present purposes.) There are different ways of understanding clause (→). One quite common way is to regard sentences as a special subclass of strings, and so (→) is interpreted as saying that if two strings φ and ψ are sentences, then so is the string ( ( ) ⌢ φ ⌢ ( → ) ⌢ ψ ⌢ ( ) ).
Atomic sentences are then reinterpreted so that strictly speaking they are singletons of the given atomic elements p, q, etc. This rather concrete way of interpreting (→) would require that if we were to use Polish notation, where we write Cpq instead of (p → q) in order to avoid the need for parentheses, the clause would have to be redrawn:

(C) if φ and ψ are sentences, then Cφψ is a sentence.

Another more fruitful approach to the interpretation of clause (→) is to regard (φ → ψ) as denoting some way of composing the sentences φ and ψ so as to form their "conditional," but to be non-committal as to the particular syntactical details. The conditional may be formed by the normal infix notation (as the "icon" (φ → ψ) suggests), but it might be formed by the Polish prefix notation, or the so-called reverse Polish suffix notation popularized in Hewlett-Packard advertisements, or even, as in English, by a mixture of prefix and infix notation ("if ___, then ___"). In this more abstract, algebraic approach, there is not even the need to think that we are dealing with sequences; this point of view nicely accommodates two-dimensional ideographs and tonal languages. This leads to a distinctive way of regarding the composition of sentences (quite different than the juxtapositional way). We thus regard sentences as forming an algebra, where sentences are composed from other sentences by various syntactic operations, e.g., that of the conditional. In general, of course, there are many more such operations (negation, conjunction, and disjunction, to name the most familiar ones). Thus we can view an algebra of sentences S as a structure (S, (Oi)i∈I), where the operations Oi correspond to the various ways of composing sentences from each other. But this is overly general and does not get at the idea that there are certain atomic sentences which serve as the starting points, the generators for the others. We could throw into the structure, then, a certain set A of atomic sentences as the generators of the algebra, but we would still be missing an important feature of the situation, namely uniqueness of composition; no conditional is a conjunction, etc., and if two conjunctions are identical, then their component conjuncts are identical, etc. This is in a way one of the most notoriously difficult of the theorems in Church's (1956) classic Introduction to Mathematical Logic (at least out of all proportion to the seeming obviousness of its content).

Definition 4.3.1 Let A = (A, (Oi)i∈I). We say that the algebra A has the property of unique decomposition iff
(1) there exists a set G of atomic generators of A in the sense that no element s of G is of the form Oi(a1, ..., am);
(2) if Oi(a1, ..., am) = Oj(b1, ..., bn), then
(i) i = j,
(ii) m = n, and
(iii) for each k ≤ m, ak = bk.
In "English," every element can be factored into a composition from the generators in only one way. Unique decomposition comes to take on algebraic clothing in the following result. Theorem 4.3.2 Let A = (A, (Oi) iEI) be an algebra with the property of unique decomposition. Then A is universally free, i.e., free in its similarity class. Proof Let f be a mapping of G into any algebra of the same similarity type. It is reasonably intuitive that f can be extended to a homomorphism h by the following inductive definition: (1) For s E G, h(s) = f(s).
(2) h[Oi(ai, ... , am)] = Oi(hai, ... , ham).
Clearly h so defined preserves the operations. The only thing that could conceivably go wrong would be that the clauses should somehow not determine a unique value hex) for some element x of A. We prove that this does not happen by induction on generators. When x is an atom, clause (1) applies (and clause (2) clearly does not), and since f is a (single-valued) function, clearly it assigns s a unique value. When x is composite, it is of the form Oi(ai, ... , am). The only way that clause (2) could fail to have h assign a unique value to it would be if the same element also had some other form, say OJ(bi, ... , bn ). But this is precisely what unique decomposition says is impossible, and so the proof is complete. D Not only is an algebra with unique decomposition universally free, but the converse is true as well, as we shaII investigate next. It turns out that this is easiest to show by looking at certain concrete examples of universally free algebras. It was claimed in Chapter 2 that universally free algebras exist for every cardinality of generators, and examples were provided by looking <:tt the algebra of words of the same similmity type and with the appropriate number of generators. In point of fact, a significant detail was suppressed, to wit, uniqueness of decomposition. Exercise 4.3.3 Let (W, (Oi)iEI) be an algebra of words. Show that it has the unique decomposition property. (Hint: Assign the number n - 1 to each n-ary operation symbol and the number -1 to each variable. Assign to each suing then the number that results from summing the number assigned to its symbols. Prove that a string is a word iff (1) it sums to -1 and (2) each proper initial substIing's sum is non-negative.) Remark 4.3.4 The above exercise is more difficult than it might first appear, hence the hint. The difficulty is caused by the fact that words are construed simply as stIings. So it turns out that if, say, 0iara2 = Ojbrb2, then clearly the first components of the sequences, Oi and OJ, must be identical, and applying cancellation, we know that the sequence ara2 is identical to the sequence brb2. But from here on, it gets messy since we know that there is nothing impossible about the very same sequence arising by two different concatenations. Things m'e perhaps even messier with the usual practice of using infix notation for binary connectives (cf. Church's Introduction to Mathematical Logic).
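The counting in the hint can also be sketched directly; the arity table below is our own sample, and any symbol not listed in it counts as a variable.

    ARITY = {'C': 2, 'K': 2, 'N': 1}   # sample operation symbols and arities

    def is_word(string):
        """The hint to Exercise 4.3.3: weight each n-ary operation symbol
        n - 1 and each variable -1; a word sums to -1 with no proper
        initial substring going negative."""
        total = 0
        for i, symbol in enumerate(string):
            total += ARITY.get(symbol, 0) - 1
            if total < 0 and i < len(string) - 1:
                return False           # a proper initial substring summed negative
        return total == -1

    assert is_word(('C', 'K', 'p', 'q', 'r'))   # CKpqr
    assert not is_word(('K', 'p'))              # K lacks its second argument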
Having whetted the reader's appetite for the exercise, we now point out that there is a much easier way to construct algebras having the unique decomposition property. Rather than thinking of, say, the formula CKpqr as a sequence of symbols, think of it as a sequence of symbols and formulas. Thus do not think of it as the "flat" sequence (C, K, p, q, r), but instead as the "multi-leveled" sequence (C, (K, p, q), r). The latter displays the mode of composition more naturally than does the former (understanding a formula as a tree showing its composition would work just as well). The unique decomposition property now follows straight away from the fact that identical sequences must have identical components. In any event, we know that algebras with the unique decomposition property exist of any given similarity type, and with any cardinality of generators. This allows us to show the following.

Theorem 4.3.5 Any universally free algebra must have the property of unique decomposition.

Proof Let A be a universally free K-algebra. By the exercise (or remark) above we know that there is an algebra with the unique decomposition property, of the same similarity type and having the same number of atomic generators. By the theorem above, we know that this algebra is universally free as well, and by a result of Chapter 2 we know that any two such algebras are isomorphic. It is clear that the unique decomposition property is preserved under isomorphism, completing the proof. □
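A two-line sketch (the representation is ours) shows how little work the "multi-leveled" representation leaves to do: decomposition is mere indexing into the outermost sequence.

    flat = ('C', 'K', 'p', 'q', 'r')        # mode of composition not displayed
    nested = ('C', ('K', 'p', 'q'), 'r')    # mode of composition displayed directly

    def decompose(formula):
        """Return the outermost operation symbol and its immediate subformulas."""
        return formula[0], formula[1:]

    op, args = decompose(nested)
    assert op == 'C' and args == (('K', 'p', 'q'), 'r')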
Remark 4.3.6 The above results show that there are in effect three different ways of defining a language: (1) explicitly as an algebra of words; (2) as a universally free algebra; and (3) as an algebra having the property of unique decomposition. Methods (1) and (3) make reference only to the language itself, and in that spirit are "syntactic." Method (2) makes reference to homomorphisms to other algebras ("interpretations"), and so has at least a minimal "semantical" component. Method (1) is more concrete ("syntactical" in a grungy sense) than the others, since it involves a specification of an alphabet, symbols for the connectives, and prefix versus suffix (or infix) notation. But since the various methods are in effect equivalent, there is really very little to recommend one over another. We shall adopt (2) in the sequel, largely for the sake of tradition (continuity with the "Polish School") and for the fact that it has a nicely algebraic flavor.

4.4 Languages as Abstract Structures: Categorial Grammar
Reflection on the results and discussion of the previous section shows that at the level of sentential logic, a language L can be viewed simply as a universally free K-algebra, where K is the similarity class containing all algebras of that similarity type. The free generators of L are called atomic sentences. For the more general case of first-order logic, a (formal) language L may be understood to be concerned with the recursive generation of structures called sentences from primitively given items such as terms ("subjects") and predicates. Thus predicates are attached to terms to form sentences, and these sentences are then combined by way of
the connectives and quantifiers to form further sentences. Terms themselves can be built up from simpler terms by way of any function letters that are present. Thus, for any syntax of standard first-order logic, we have two algebras: an algebra of terms, and an algebra of sentences (or more generally, an algebra of formulas, where a formula is just like a sentence except that it can have free variables). As we mentioned in Chapter 2, an algebra of words can be viewed concretely as either one. Abstractly they are the same; as we know from the results of the previous section, they can both be viewed as universally free algebras (although in a concrete case, their similarity classes would most likely differ). Thus we can view two separate fragments of the syntax of first-order logic algebraically, the term-forming fragment and the formula-forming fragment. But can we view the whole syntax algebraically? It is, of course, foreign to the pure idea of an algebra to so divide elements into different categories. And yet such a division does sometimes arise in traditional studies that are labeled "algebraic." Thus in a linear algebra (vector algebra), items are divided up into two categories: vectors and scalars, and, for example, vectors and scalars can be composed ("multiplied") so as to form vectors. There are tricks that allow one to view such a structure as a pure algebra (thus one can replace the one operation of multiplication with the infinitely many operations that correspond to multiplication by one scalar after another). Can we use this as a model for viewing the language of first-order logic algebraically? Let us think of the terms as the scalars, and the formulas as the vectors. Multiplying a vector by a scalar could then correspond to substitution of the term for a variable in a formula, e.g., Fx "times" a gives Fa. But what do we say when the formula is Fxy? Which variable do we substitute for? Even here we can work things out. It is not just that a gives rise to a substitution, it gives rise to an indexed set of substitutions, one for each variable. So we replace a with the operation of substituting it for x, the operation of substituting it for y, etc. (There is still the problem of setting down conditions on the substitution operations so as not to lose track of the original structure of the algebra of terms, but this can be finagled.) If we look at certain ways of extending first-order logic, the project of viewing languages algebraically grows more complicated yet. Thus in the language of set theory, a formula φ(x) can have the abstraction operator {x : ...} applied to it to obtain the term {x : φ(x)}. The objects generated in one syntactic category (e.g., formulas) can feed the objects generated in another syntactic category (terms), which can turn around and feed the objects generated in the first category (formulas). Thus we obtain a formula such as a ∈ {x : φ(x)}. Definite descriptions ((ιx)φ(x) = the unique x such that φ(x)) provide another example. This all gets rather dizzying, but there is still a vector space analogy of sorts: normed vector spaces. The norm of a vector is a scalar (intuitively, its length), just as the abstract of a formula is a term. One idea of a basically algebraic flavor underlies all these examples: composition. Items of various categories are composed to form items of other categories. But it might be the wrong choice of data type to try to fit all of these into the pure idea of an algebra
(just a set with some operations), even if it were possible (which the last examples above make us doubt). Thus we could work with a generalization of an algebra, namely a structure (A, (Oi)i∈I, (Cj)j∈J), where each Oi is a function from Cj1 × ... × Cjn to Ck, for some j1, ..., jn, k ∈ J (each grammatical construction is thus "typed"). A language could then be defined abstractly as a "free" one of these structures, etc. Such a language would be grammatically unambiguous, since freedom prevents the same grammatical construction from being of two different "types." Clearly the precise definition of a formal language is a rather complicated thing. There are other scary examples we could throw up, e.g., the adverbial modifiers that many think standard first-order logic forgets, which form complex predicates out of predicates (e.g., "very hot" from "hot"), or the "lambda operator" from second-order logic, which forms predicates from formulas [λx(Fx)]. These higher-order grammatical constructions do not fit even in the above framework and lead ultimately to the idea of "categorial grammar" with functions from functions to functions, etc., as far up the type hierarchy as wanted. Cresswell (1973) is a good source on categorial grammar, and says that categorial grammar grew from ideas by S. Lesniewski in the early 1930s, as expounded and developed in Ajdukiewicz (1935). As a simple example we have:
Definition 4.4.1 The class of simple syntactic categories is defined inductively as follows: (1) N is a simple syntactic category (names). (2) S is a simple syntactic category (sentences).
(3) If C1 and C2 are simple syntactic categories, then so is (C1, C2).
(4) Nothing else is a simple syntactic category.
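As a possible aid to the exercise that follows, here is a minimal sketch of Definition 4.4.1 as a recursive data type; the class and field names are our own.

    from dataclasses import dataclass
    from typing import Union

    @dataclass(frozen=True)
    class Basic:
        name: str                 # "N" (names) or "S" (sentences)

    @dataclass(frozen=True)
    class Functor:
        dom: "Category"           # (C1, C2): takes a C1 and yields a C2
        cod: "Category"

    Category = Union[Basic, Functor]

    N, S = Basic("N"), Basic("S")
    predicate = Functor(N, S)                  # the category (N, S)
    connective = Functor(S, S)                 # the category (S, S)
    adverb = Functor(predicate, predicate)     # ((N, S), (N, S))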
Exercise 4.4.2 Illustrate the notion of simple syntactic categories using various examples given above.

Fortunately for us, at the level at which we are working we do not have to untangle all of this. We are concerned mainly with the recursive generation of sentences from other sentences, and so we can adopt the relatively simple framework of the algebra of sentences in the sequel. The reader is invited, however, to make whatever adjustments are needed to accommodate a more general view of language. Lambek (1958) made an interesting refinement of categorial grammar which is of interest to propositional logic. He observed that there are two ways in which a predicate might operate upon a noun so as to make it a sentence. The predicate may be either prefixed to the noun, as in "Run Mary," or it may be affixed to the noun, as in "Mary runs." We may distinguish these two types by the notations N ⇀ S and S ↼ N. In general, where we have an algebra of strings (S, ⌢), and B, C ⊆ S, we can classify B ⇀ C as the set of strings which, concatenated from the left to any string in B, always produces a string in C,

x ∈ B ⇀ C iff ∀b(b ∈ B ⇒ x ⌢ b ∈ C).

And we can define C ↼ B dually, using concatenation from the right, i.e.,

x ∈ C ↼ B iff ∀b(b ∈ B ⇒ b ⌢ x ∈ C).
There is a natural partial order where A ⊑ B iff A ⊆ B. The identity I = {( )}.

Exercise 4.4.3 Prove that the set of all sets of strings in an alphabet A is a residuated monoid.

Remark 4.4.4 Note that this example is up a type level from the example in Section 4.2. We can produce examples of the residuated groupoids with identity by replacing the concatenation operation ⌢ with the pairing operation.
4.5 Substitution Viewed Algebraically (Endomorphisms)
We suppose, then, that we have a sentential language, i.e., a universally free algebra (S, (Oi)i∈I) with free generators G. For concreteness we shall suppose that it is the language of classical sentential logic with the connectives ~ and →, and atomic sentences p, q, r, etc. Consider the following theorem:

(1) (p → q) → (~q → ~p).

We know that theorems are closed under "substitution," and so the following are also all theorems:

(2) (r → q) → (~q → ~r)   (r/p);
(3) (r → s) → (~s → ~r)   (r/p, s/q);
(4) (q → p) → (~p → ~q)   (q/p, p/q);
(5) (p → p) → (~p → ~p)   (p/q);
(6) (~p → (q → s)) → (~(q → s) → ~~p)   (~p/p, (q → s)/q).

There are several lessons to be learned from the above examples. Thus (2) results from (1) by the simplest kind of "one-at-a-time" substitution, whereas (3) results from (1) by "simultaneous" substitution. It is clear that (3) also results from (2) by one-at-a-time substitution, and that the simultaneous substitution that results in (3) can be decomposed into two one-at-a-time substitutions. Even the simultaneous substitution that results in (4) can be so decomposed if the language has enough atomic sentences so as to allow "relettering." (Thus in the example above one can always detour through the relettering (3), or even (2).) The point of (5) is that substitution does not have to be "one-one," and the point of (6) is that compound sentences may be substituted for atomic sentences. Even though, in many instances, simultaneous substitution can be decomposed into a "product" of one-at-a-time substitutions, this is not always possible if we allow, as we do, the simultaneous substitution for infinitely many atomic sentences. We learn from elementary texts to think of substitution as an act performed on a single sentence, or at best a small set of sentences (as for an inference rule). But we shall think of substitution as performed on all the sentences of the language. And we shall have need of simultaneous substitution on infinitely many atomic sentences. Indeed, the best way to think of this is that a substitution works on every atomic sentence, but often only substitutes an atomic sentence for itself, so that one hardly notices the difference. Thus in (2) above, not only is r substituted for p, but also q for q, s for s, etc. A substitution can thus be viewed as a mapping σ of the atomic sentences to (possibly compound) sentences that extends itself to all sentences of the language by replacing each atomic sentence p with σ(p). If one thinks this through, one sees that one can first perform the substitution on the smallest components of the sentence (the atomic sentences), and then combine the results by the appropriate connectives. More formally, we can give an inductive definition:

(1) for atomic p, σ(p) is as initially given;
(2) σ[Oi(φ1, ..., φk)] = Oi[σ(φ1), ..., σ(φk)].

What this amounts to, from an abstract (algebraic) point of view, when a language is viewed as a universally free algebra, is that a substitution σ is a mapping from the free generators that can be extended to a homomorphism of the language into itself, i.e., a substitution is just a homomorphism of the language into itself. Recalling that a homomorphism of an algebra into itself is called an endomorphism, a substitution, "pronounced algebraically," is just an endomorphism on a universally free algebra.

Exercise 4.5.1 We shall say that a substitution σ affects only a finite number of free generators (atomic sentences), if σ(p) = p for all but a finite number of free generators p. Assuming that there are infinitely many atomic sentences, show that every such substitution is obtainable by a composition of one-at-a-time substitutions, i.e., substitutions for which σ(p) = p for all but one free generator.
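The following minimal sketch (the representation of sentences as nested tuples is ours) shows a substitution extending itself from the atomic sentences to all sentences, exactly in the manner of the inductive definition above.

    def substitute(sigma, sentence):
        """Extend the map sigma (atomic sentences -> sentences) homomorphically."""
        if isinstance(sentence, str):               # atomic sentence
            return sigma.get(sentence, sentence)    # unmentioned atoms go to themselves
        connective, *parts = sentence
        return (connective, *(substitute(sigma, part) for part in parts))

    # (2) of the text: substitute r for p in theorem (1).
    theorem = ('->', ('->', 'p', 'q'), ('->', ('~', 'q'), ('~', 'p')))
    print(substitute({'p': 'r'}, theorem))
    # ('->', ('->', 'r', 'q'), ('->', ('~', 'q'), ('~', 'r')))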
4.6 Effectivity
We here review certain standard notions having to do with computability. The reader who wants more information is referred to Boolos and Jeffrey (1980). Let us first look at the notion of an effectively enumerable set S. This can be made precise in a number of seemingly different but equivalent ways, for example by writing a computer program that lists all of the members of S (if left to run over infinite time). The notion of "a computer program" is a bit vague, and all of the various definitions of effectively enumerable can be seen as trying to make that precise. By now it is commonly accepted that any intuitively computable function can be computed by a particular idealization of computers called a "Turing machine."² We will ourselves be working only with procedures for which it is intuitively clear that they are the sort that in principle could be implemented on a computer, and so we do not need to get into further technicalities other than to record, for the record:

Definition 4.6.1 A set S is effectively enumerable iff there is a program P (implementable on a Turing machine) so that for each n ∈ Z+ (the set of positive integers), the result P(n) of inputting n to P is in S, and conversely every member s of S is such that for some n ∈ Z+, s = P(n) (in short, P computes a function from the positive integers onto S).

² This is called "Church's thesis," or sometimes "Turing's thesis" when put in this very form. Church actually stated his thesis for an equivalent notion of "lambda-definable" functions.
There is a notion stronger than effective enumerability, namely effective decidability. Intuitively, a set S is effectively decidable if there is a mechanical procedure that will determine whether a given item is, or is not, in the set.³ Slightly more precisely:

Definition 4.6.2 A set S is effectively decidable iff there is a program P (implementable on a Turing machine) so that for each s, the result P(s) of inputting s to P is 1 ("yes") if s ∈ S, and otherwise 0 ("no").

Theorem 4.6.3 If a set S is decidable, it is also effectively enumerable.

Questions of effectivity are always put relative to a "universe" which is clearly effectively enumerable, e.g., the set of positive integers itself, and questions of effective enumerability or decidability are addressed with respect to subsets of this universe. Indeed it is quite standard to address questions of effectivity only for subsets of the natural numbers (or sometimes the positive integers) since clearly any other effectively enumerable set can be put in an effective one-one correspondence with the natural numbers (the only trick is to skip over any repetitions).⁴ This explains the standard language of "recursively enumerable" (= effectively enumerable) and "recursive" (= decidable), defined as they are from the notion of a recursive function on the natural numbers. Given this understanding of an underlying effectively enumerable universe, a decidable S is also effectively enumerable, for one can simply start enumerating the underlying universe and testing as one does so whether each item is in S or not, and only finally list it when the answer is "yes." But not all effectively enumerable sets are decidable (by Church's theorem it is known that the theorems of first-order classical logic are a counter-example). But a kind of converse relationship is established by the following:

Theorem 4.6.4 (Post) If a set and its complement are both effectively enumerable, then the set is decidable.
Proof Suppose one wonders whether s ∈ S. Start one Turing machine enumerating the set S and another enumerating its complement S̄. At some point, the item s will either turn up in the enumeration of S, in which case the answer is "yes" (output a 1), or else s will turn up in the enumeration of S̄, in which case the answer is "no" (output a 0). □
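The dovetailing in this proof is easily sketched; the two generators below stand in for the Turing machines, and the toy universe is our own.

    from itertools import count

    def decide(s, enum_s, enum_comp):
        """Alternate between the two enumerations until s turns up in one."""
        while True:                    # s must eventually appear in one of them
            if next(enum_s) == s:
                return 1               # "yes": s is in S
            if next(enum_comp) == s:
                return 0               # "no": s is in the complement of S

    # Toy universe: S = the even numbers, its complement = the odd numbers.
    assert decide(6, count(0, 2), count(1, 2)) == 1
    assert decide(7, count(0, 2), count(1, 2)) == 0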
4.7 Enumerating Strings and Sentences
Given a countable alphabet A = {s1, s2, ...} it is well known ("Cantor's enumeration theorem") that the set S of strings in A is denumerable. We shall sketch a proof of this, but in a way that makes clear the computational aspects.

³ The modifier "effective" on "effective decidability" is frequently left tacit.
⁴ Of course there are fancier ways to code up items as integers, e.g., the familiar method of "Gödel numbering" applied to strings, in which each symbol s of the alphabet is associated with a unique integer #s (> 1), and the expression s1 ... sn is coded up as p1^(#s1) × ... × pn^(#sn), where pi is the i-th prime.
Theorem 4.7.1 (Cantor) Suppose that the alphabet A is effectively enumerable. Then so is the set S of all strings in A.

Proof Clearly the set A¹ of all strings in A of length 1 is in an effective one-one correspondence with A, and so is effectively enumerable (given that A is). Cantor's famous proof that the set of rationals is in 1-1 correspondence with the set of natural numbers in effect proved that given a denumerable set A, the set A² of all its strings of length 2 is also denumerable. The idea is to first picture the members of A × A as laid out in an infinite table:
         1          2          3         ...
    1    (a1, a1)   (a1, a2)   (a1, a3)  ...
    2    (a2, a1)   (a2, a2)   (a2, a3)  ...
    3    (a3, a1)   (a3, a2)   (a3, a3)  ...
Suppose one were to try to enumerate the pairs in the order

(a1, a1), (a2, a1), (a3, a1), ....

Then one would never get past the first column. Instead we enumerate them in the order given by the "short diagonals":

(a1, a1); (a2, a1), (a1, a2); (a3, a1), (a2, a2), (a1, a3); ...,

i.e., one picks the first item in column 1, then another item in column 1 but this time also an item from column 2, then again an item from column 1 and an item from column 2 but this time also an item from column 3, etc. Let us now dispense with the infinite table, and notice that if A can be enumerated effectively, then so can each of the columns, let us say by Turing machines M1, M2, .... Now we give a mechanical procedure that will enumerate A × A, which one can imagine as implemented in a "meta-machine" whose job it is to control the running of the Turing machines M1, M2, .... The meta-machine runs the first machine M1 one "turn of the crank." If it were to just keep turning the crank of this machine it would simply generate the first column and never get to the second column. So its instructions tell it to now turn the crank of M1 a second time but go on to turn the crank of M2. It then goes back to turn the crank of M1 a third time, turn the crank of M2 a second time, and go on to turn the crank of M3, etc. In this way it enumerates all of the strings of length 2. The basic program of the meta-machine is simply to "count" 1; 1, 2; 1, 2, 3; .... Now a sequence s of length n + 1 is in an effective correspondence with a certain "higher-order" sequence of length 2, namely the sequence whose first item is the sequence of the first n members of s, and whose second item is the last member of s. So by repeating the argument we are able to show each of the sets A³, A⁴, ... is effectively enumerable. This still does not show enumerability of the set of all strings S = ⋃n∈Z+ Aⁿ.
But since we now know that the sets A¹, A², A³, ... are effectively enumerable, we can regard them as being generated by Turing machines MA¹, MA², MA³, .... By repeating the argument above, there is a "meta-meta-machine" that can in one enumeration list all of the members of these sets, listing one from A¹, another one from A¹ but now one from A², etc. □

We now turn to questions of effectivity for sentences. It is convenient here to view the algebra of sentences as an algebra of words (cf. Section 4.3). Thus the elements are thought of "concretely" as strings and not more abstractly as, say, elements of a universally free algebra. Recall that "words" (in this case sentences) are defined recursively, starting with singleton sequences of atomic sentences, and forming new sentences by concatenating the singleton sequence of some k-place operation symbol with the concatenation of k sentences already formed. The set of sentences is a subclass of the set of strings and we know from the above that we can effectively enumerate these. Before we turn to the enumeration of sentences, we note that they have a stronger property:

Theorem 4.7.2 The set of sentences is decidable.

Proof We can decide whether a string φ is a sentence by the following "brute force" method (picture an upside-down "search tree", with φ at the top).
1. If φ starts with an atomic sentence, then φ is a sentence if it is of length 1, and otherwise it is not a sentence.
2. If the first symbol is a k-ary operation symbol, then decompose the rest of the string into k substrings in all possible ways.
3. Examine those substrings to see if they are sentences in the same way as in 1-2, and then their substrings, etc.
4. Finally, the process will terminate (since strings are finite). When this happens, we will be able to determine whether there is a "formation tree" hidden in this search tree. □
Exercise 4.7.3 Suppose that the only connectives are binary and unary. Show that the following is a decision procedure for sentencehood: Assign +1 to each binary connective, 0 to each unary connective, and −1 to each atomic sentence. Add these "weights," starting from the left. φ is a sentence iff (i) the final sum is −1, and (ii) none of the subtotals one gets along the way (before the final one) is negative.

Exercise 4.7.4 Extend the method above to the case of connectives of arbitrary degree.

Corollary 4.7.5 The set of sentences is effectively enumerable.

Proof From Theorems 4.6.3 and 4.7.2.
□
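The chapter's last three results can be tied together in a sketch: dovetail through all strings over an effectively enumerated alphabet (Theorem 4.7.1, here via a bounding trick rather than the meta-machines of the proof) and keep those accepted by a sentencehood decider (Theorem 4.7.2), yielding an effective enumeration of the sentences (Corollary 4.7.5). All names are ours, and the decider is a stub; the weight test of Exercises 4.7.3 and 4.7.4 could serve instead.

    from itertools import count, islice, product

    def atoms():
        """An effective enumeration of the alphabet: p1, p2, p3, ..."""
        return (f"p{i}" for i in count(1))

    def strings():
        """Every non-null string eventually appears: at stage n, emit the
        new strings of length <= n over the first n symbols."""
        seen = set()
        for n in count(1):
            alphabet = list(islice(atoms(), n))
            for length in range(1, n + 1):
                for s in product(alphabet, repeat=length):
                    if s not in seen:
                        seen.add(s)
                        yield s

    def is_sentence(s):
        return len(s) == 1              # stub decider: only atoms are sentences

    sentences = (s for s in strings() if is_sentence(s))
    print(list(islice(sentences, 3)))   # [('p1',), ('p2',), ('p3',)]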
5 SEMANTICS

5.1 Introduction
We have now discussed the mathematical theory of propositions, which we formulated in the framework of lattice theory, and we have discussed the mathematical theory of sentences (formal syntax), which we formulated in the theory of universally free algebras. In the present chapter, we discuss the relation between sentences and propositions, which is the central focus of algebraic formal semantics. A fundamental tenet of algebraic formal semantics is that a sentence expresses (denotes, designates, signifies) a proposition, which in turn is either true or false. In particular, a sentence is true (or false) only in a derivative sense; specifically, it is true (false) purely in virtue of expressing a proposition that is true (false). As in Chapter 3, we adopt a mathematical account of propositions that is analogous to the mathematical account of natural numbers. Specifically, we treat propositions as points in a space (a "logical space"), and we treat these points as bearing external relations to one another, in virtue of which they band together to form an algebra. On the other hand, unlike Chapter 3, in the present chapter we do not assume that propositions form a lattice, or any other particular algebraic structure; we simply assume they form an algebra. Thus, our presentation is somewhat more abstract. Another, perhaps more important, tenet is what might be called algebraic compositionality, which is a particular implementation of the more general principle of compositionality, which traces back to Frege (1892). The latter maintains that the meaning of a complex expression is systematically composed of (i.e., is a function of) the meanings of its constituent expressions. Algebraic compositionality maintains that meanings compose in a mathematically simple, algebraically specifiable, manner. In Section 5.2, we describe the algebraic implementation of compositionality, which in Section 5.3 we apply to sentential languages. In Section 5.4, in order to provide a very simple illustration of algebraic compositionality, we discuss the algebraic treatment of truth functions, both in the context of two-valued semantics, and in the context of multivalued semantics. In Section 5.5, in order to provide a not so simple illustration of algebraic compositionality, we discuss possible worlds semantics. Then, in Section 5.6, we offer a general algebraic approach to semantics, based on the notion of a logical atlas, which subsumes two-valued, multi-valued, and possible worlds semantics under a common framework. Section 5.7 provides the link (interpretations and induced valuations) between Section 5.4, where algebras of truth values are considered, and Sections 5.5 and 5.6, which deal with algebras of propositions. In Sections 5.8 and 5.9 we introduce the notions of interpreted and evaluationally
constrained languages and investigate their relationships. In Section 5.10 we revisit algebraic compositionality in light of the previous sections. In Section 5.11 we use evaluationally constrained languages to define a series of logical notions, including validity and entailment. Section 5.12 explores different notions of equivalence between evaluationally constrained languages. In Section 5.13 we consider four different notions of compactness and demonstrate the link between these logical notions and the notion of compactness in topology. Lastly, in Section 5.14 we offer a brief analysis of some philosophical views on logical validity using the concepts developed in this chapter.

5.2 Categorial Semantics
Having discussed the categorial approach to formal syntax in Section 4.4, we now continue discussing it while at the same time examining the corresponding treatment of formal semantics. Fundamental to the general algebraic approach to formal semantics is the following.
Principle 5.2.1 (The mirror principle) Associated with every syntactic category C is a counterpart semantic category C*, whose mathematical type "mirrors" the grammatical type of C. And, in particular, every expression of syntactic category C is interpreted by an "object" of semantic category C*.
Definition 5.2.2 The class of simple syntactic categories is defined inductively as follows: (1) N is a simple syntactic categO/y (names). (2) S is a simple syntactic categO/y (sentences). (3) IfC] and C2 are simple syntactic categories, then so is (C], C2). (4) Nothing else is a simple syntactic category.
A completely parallel definition can be given for the simple semantic categories.
Definition 5.2.3 The class of simple semantic categories is defined inductively as follows:
(1) I is a simple semantic category (individuals).
(2) P is a simple semantic category (propositions).
(3) If C1 and C2 are simple semantic categories, then so is (C1, C2).
(4) Nothing else is a simple semantic category.
The basic semantic categories, individuals and propositions, consist of mathematically primitive objects. The derivative semantic categories consist of functions; in particular, objects of category (C1, C2) are functions that take objects of category C1 and yield objects of category C2. In this context, by "objects" we mean individuals, propositions, and functions. Given these definitions, we can formulate the precise relation between a syntactic category C and its counterpart semantic category C*.
Definition 5.2.4 The correspondence between syntactic categories and semantic categories is given by the function C ↦ C*, inductively defined as follows:
(c1) N* = I;
(c2) S* = P;
(c3) (SYN1, SYN2)* = (SYN1*, SYN2*).

In other words, names correspond to individuals (c1), sentences correspond to propositions (c2), and functors correspond to functions (c3). More specifically, if SYN1 corresponds to SEM1, and SYN2 corresponds to SEM2, then functors that take expressions of syntactic category SYN1 and produce expressions of syntactic category SYN2 correspond to functions that take objects of semantic category SEM1 and yield objects of semantic category SEM2. The following are simple examples of the correspondence between syntactic and semantic categories:

    functor              Syntactic category      Semantic category
    connectives          (S, S)                  (P, P)
    predicates           (N, S)                  (I, P)
    predicate adverbs    ((N, S), (N, S))        ((I, P), (I, P))
For example, a predicate is a functor that takes names and produces sentences, so its semantic counterpart is a function that takes individuals and produces propositions. Having examined the mirror principle, we now turn to an equally important semantic principle.
Principle 5.2.5 (Principle of compositionality) The meaning of a compound expression E is a function of the meanings of the constituent expressions of E.

Consider, as an example, any one-place functor F of category (C1, C2); in other words, when F is applied to an expression E, of category C1, the resulting expression, which we call F[E], is of category C2. Now, according to the principle of compositionality, the meaning of the compound expression F[E] is composed of (is a function of) the meanings of constituent expressions, F and E (whose respective meanings, in turn, are composed of the meanings of their constituent expressions, and so on). In saying that the meaning of F[E] is a function of the meanings of F and E, the principle of compositionality does not specify the actual character of the function; it
merely says that a function exists. In the context of categorial languages that satisfy the mirror principle, one is naturally led to propose a stronger principle of compositionality, which states that the composition function takes a particularly tidy form, given as follows. (We write I(E) for the interpretation, i.e., the meaning of expression E.)

Principle 5.2.6 (Principle of categorial (algebraic) compositionality) If E = F[E1, ..., Ek], and I(F) = f, I(E1) = o1, ..., I(Ek) = ok, then I(E) = f(o1, ..., ok); in other words, I(F[E1, ..., Ek]) = I(F)[I(E1), ..., I(Ek)].

Suppose that a complex expression E is obtained from smaller expressions E1, ..., Ek, by the application of functor F (in symbols, E = F[E1, ..., Ek]). Further suppose that the meaning of F is the function f, and suppose that the meanings of E1, ..., Ek are o1, ..., ok, respectively. Then, according to Principle 5.2.6, the meaning of E is obtained by applying the function f to the objects o1, ..., ok. In the following section, we illustrate the principle of categorial compositionality in the context of sentential languages, showing in particular that the principle of categorial compositionality amounts to requiring that every interpretation function is a homomorphism.
5.3 Algebraic Semantics for Sentential Languages
For the moment, we concentrate on sentential (fragments of) languages. Recall that a sentential language is about as simple as possible, without being completely trivial. In particular, such a language (or language fragment) has only two syntactic categories, S (sentences) and (S, S) (sentential connectives); accordingly, there are only two associated semantic categories, P (propositions) and (P, P) (propositional connectives). The latter two categories are precisely what we examined in great detail in Chapters 2 and 3. This leads to the following.

Definition 5.3.1 An algebra of propositions is any system (P, F), where P is a non-empty set of propositions, and F is a non-empty family of operations on P (i.e., propositional connectives).

In other words, an algebra of propositions is simply an algebra whose carrier set consists of propositions. Mathematically speaking, of course, there is nothing special about an algebra of propositions; it is just an algebra. Having the notion of an algebra of propositions, we still need the notion of a structural similarity between the structure of propositions and the structure of sentences, in order to implement the mirror principle and the principle of algebraic compositionality. Toward this end, we develop a few subsidiary concepts.

Definition 5.3.2 Let L be a sentential language, let A(L) be the associated algebra of sentences of L, and let P be an algebra of propositions. Then P is said to be appropriate for L if P has the same type as A(L).

Note that we occasionally drop the term "appropriate" and simply say that an algebra is for a given language.
The idea in the above definition is that an algebra of propositions is appropriate for a sentential language if and only if there is an exact (one-one) correspondence between the syntactic operations (sentential connectives) and the semantic operations (propositional connectives). For example, if L consists solely of one two-place connective and one one-place connective (in that order), then an appropriate algebra consists solely of one two-place operation and one one-place operation (in that order); in other words, an appropriate algebra is of type ⟨2, 1⟩. Having the notions of a sentential language and algebra of propositions (appropriate) for a sentential language, we next define interpretations, which provide the crucial link between sentences and propositions.

Definition 5.3.3 Let L be a sentential language, let A(L) be the associated algebra of sentences of L, and let P be an algebra (appropriate) for L. Then an interpretation of L in P is any homomorphism from A(L) into P.
Before continuing, it is important to notice that there are three objects associated with each connective c: (1) the connective itself, which is a symbol (or string of symbols) together with a grammatical role in the language; (2) the operation Oc on the algebra of sentences, which is a mathematical function that represents the grammatical role of c; (3) the operation Dc on the algebra of propositions, which is the propositional counterpart of c. In particular, whereas c is not a mathematical (set-theoretic) object, both Oc and Dc are mathematical objects.

The basic idea is that an interpretation I is a function that assigns a proposition to every sentence of L. But not just any assignment of propositions to sentences will do; in order to be an interpretation, a function must be a homomorphism. This amounts to the claim that the proposition assigned to any compound sentence is algebraically composed out of the propositions assigned to the constituent sentences. In particular, condition (H) states that the proposition assigned to a compound sentence, formed using the connective c, is a function of the propositions respectively assigned to the constituent sentences, where in particular the function is the propositional counterpart of the syntactic connective c.

Exercise 5.3.4 Show that requiring that every interpretation is a homomorphism is just the algebraic way of expressing the principle of categorial compositionality in the special case of sentential languages. (Hint: Let the interpretation of a connective c, I(c), be the algebraic operation (i.e., the propositional connective) Dc.)

In order to illustrate the idea that an interpretation is a homomorphism, let us consider conjunction. Furthermore, let us suppose that the propositions form a lattice, in which case propositional conjunction is the meet operation. In this case, we have the following, as a special case of condition (H):
(H&) I(φ & ψ) = I(φ) ∧ I(ψ).

This can be read in a straightforward manner: the proposition designated by the syntactic conjunction of two sentences φ and ψ is the propositional conjunction of the propositions designated by φ and ψ, respectively. An alternate way of writing (H&), which might be easier to understand, goes as follows:

(H&*) If φ designates p, and ψ designates q, then φ & ψ designates p ∧ q.

Note carefully that & is not the conjunction connective per se; rather, it is a mathematical object, in particular, the algebraic operation associated with syntactic conjunction. Also note that we write the mathematical operation symbol '&' in infix notation irrespective of how syntactic conjunction is in fact concretely implemented. For example, in Polish formatted languages, the conjunction of two sentences φ and ψ is obtained by prefixing K in front of φ in front of ψ; however, in other languages, conjunction is implemented differently. In any case, whatever actual concrete form the syntactic conjunction of φ and ψ takes, we denote it in the same way, by the expression φ & ψ, which we may read as "the conjunction of φ and ψ," whatever that may in fact be in the particular language under scrutiny. In Sections 5.4-5.6, we consider various ways of implementing compositional semantics for sentential languages, starting with the simplest method.
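Before turning to those methods, it may help to see condition (H&) computed. The following is a minimal sketch (ours, in Python; the tuple encoding of sentences and the particular sets standing in for propositions are illustrative assumptions, not anything from the text):

# A minimal computational sketch of condition (H&): atomic sentences are
# strings, a conjunction is a tuple ('&', left, right), and propositions
# are subsets of a toy index set, so that the meet is set intersection.

def interpret(sentence, atoms):
    """A homomorphic interpretation: I(phi & psi) = I(phi) meet I(psi)."""
    if isinstance(sentence, str):        # atomic sentence
        return atoms[sentence]
    connective, left, right = sentence
    if connective == '&':
        return interpret(left, atoms) & interpret(right, atoms)
    raise ValueError(connective)

atoms = {'p': frozenset({0, 1}), 'q': frozenset({1, 2})}
# Condition (H&): the proposition assigned to a conjunction is the meet
# of the propositions assigned to its conjuncts.
assert interpret(('&', 'p', 'q'), atoms) == atoms['p'] & atoms['q']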
5.4 Truth-Value Semantics
Having discussed general algebraic semantics in Section 5.2, and having simplified our discussion to sentential languages in Section 5.3, in the present section we consider a further (rather extreme) simplification of the general theory. Specifically, we consider algebras that are most fruitfully understood as consisting, not of propositions per se, but rather of truth values. No one seems to know or care what truth values "really" are. We know there are at least two of them: Frege (1892) calls these "the true" and "the false." We reify these as the numbers 1 and 0 and often refer to them using the letters 't' and 'f'. We begin with a general definition.
Definition 5.4.1 An algebra of truth values is any algebra (V, F), where V is a non-empty set of truth values and F is a family of functions on V.

If the elements of the algebra are truth values, then of course, the operations on such an algebra are functions that take truth values and yield truth values; in other words, the operations are what are customarily called truth functions. Another way to say this, perhaps, is that if propositions turn out to be truth values, then propositional connectives correspondingly turn out to be truth functions. We begin with a famous example, with which everyone is familiar: classical truth tables. First of all, we all know about addition and multiplication tables, without necessarily knowing that these tables specify an algebra of natural numbers. Similarly, we all know about truth tables, without necessarily knowing that these tables specify an algebra of truth values. We call this algebra the Frege algebra, which is officially
defined as follows.
Definition 5.4.2 The Frege algebra is the algebra (V, F), where V consists of only two elements, 1 ("the true") and 0 ("the false"), and where F consists of the familiar truth functions, formally defined as follows:
(F1) x ∧ y = xy;
(F2) x ∨ y = x + y + xy;
(F3) −x = 1 + x;
(F4) x ⇒ y = 1 + x + xy;
(F5) x ⇔ y = 1 + x + y.
Here, the variables x and y range over truth values (0 and 1), and the connective-like symbols refer to truth functions. Juxtaposition indicates ordinary numerical multiplication, and + indicates modulo-2 addition, which is defined so that 0 + 1 = 1 + 0 = 1 and 0 + 0 = 1 + 1 = 0.

Exercise 5.4.3 (F1)-(F5) constitute a succinct presentation of the five classical truth functions, based on addition and multiplication on the two-element ring. Verify that the functions specified in (F1)-(F5) do in fact correspond exactly to the familiar truth functions of classical sentential logic. For example, show that the conjunction of t and t is t; i.e., 1 ∧ 1 = 1, i.e., 1 × 1 = 1.

Since the Frege algebra is an algebra of truth values, any interpretation of a sentential language L into this algebra is by default an assignment of truth values to sentences of L. Furthermore, since an interpretation must satisfy the requirement of compositionality (the homomorphism requirement), it must satisfy the following:
(I1) I(φ & ψ) = I(φ) ∧ I(ψ);
(I2) I(φ v ψ) = I(φ) ∨ I(ψ);
(I3) I(~φ) = −I(φ);
(I4) I(φ → ψ) = I(φ) ⇒ I(ψ);
(I5) I(φ ↔ ψ) = I(φ) ⇔ I(ψ).
As before, the connective-like symbols on the syntactic side do not refer to the actual syntactic connectives, but rather to their mathematical representations; so, for example, φ → ψ is the conditional sentence formed from φ and ψ, however that is in fact accomplished in the particular language under scrutiny.

Exercise 5.4.4 Verify that (I1)-(I5), in conjunction with (F1)-(F5), yield the usual classical restrictions that apply to the assignment of truth values to compound sentences. For example, if φ and ψ are both interpreted as "the true," then their conjunction is also interpreted as "the true."

Exercise 5.4.5 Show that the Frege algebra can be regarded as a two-element Boolean algebra (lattice), supplemented by a conditional operation (defined so that x ⇒ y = −x ∨ y), and a biconditional operation (defined so that x ⇔ y = (x ∧ y) ∨ (−x ∧ −y)). Show, for example, that Frege conjunction is the same as Boolean meet, and that Frege disjunction is the same as Boolean join.
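The three exercises above can be checked mechanically. Here is a small sketch of ours (in Python; the function names are purely illustrative) of the Frege algebra as mod-2 arithmetic, verified against the familiar truth tables:

# The five Frege operations (F1)-(F5), defined by mod-2 arithmetic and
# checked against classical truth tables (cf. Exercises 5.4.3-5.4.5).

conj  = lambda x, y: x * y                    # (F1) x AND y = xy
disj  = lambda x, y: (x + y + x * y) % 2      # (F2) x OR y  = x + y + xy
neg   = lambda x:    (1 + x) % 2              # (F3) -x      = 1 + x
impl  = lambda x, y: (1 + x + x * y) % 2      # (F4) x => y  = 1 + x + xy
equiv = lambda x, y: (1 + x + y) % 2          # (F5) x <=> y = 1 + x + y

for x in (0, 1):
    for y in (0, 1):
        assert conj(x, y) == min(x, y)        # conjunction = Boolean meet
        assert disj(x, y) == max(x, y)        # disjunction = Boolean join
        assert impl(x, y) == disj(neg(x), y)  # x => y = -x OR y (Exercise 5.4.5)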
Having discussed two-valued algebras, we now consider a natural generalization, which is obtained by enlarging the number of truth values from two to three or more (including possibly infinitely many). In this way we arrive at multi-valued logic, which traces back to Łukasiewicz (1910, 1913). The precise philosophical significance of the additional non-standard truth values is unclear. On the other hand, mathematically speaking, multi-valued (MV) algebras are just like two-valued algebras, only bigger! Bigger indeed: as one adds more and more intermediate truth values, the number of mathematically possible truth functions becomes staggeringly large. The following is a simple (though hardly unique) example of a whole family of MV algebras. In each particular example, 0 corresponds to "the false," 1 corresponds to "the true," and the fractions between 0 and 1 correspond to intermediate truth values.
Definition 5.4.6 An MV algebra is an algebra (V, F) in which V consists of all the fractions 0/n, 1/n, ..., n/n for some fixed n, and in which the operations in F are defined as follows:
(O1) x ∧ y = min(x, y);
(O2) x ∨ y = max(x, y);
(O3) −x = 1 − x;
(O4) x ⇒ y = min(1 − x + y, 1);
(O5) x ⇔ y = (x ⇒ y) ∧ (y ⇒ x).
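A computational sketch of Definition 5.4.6 (ours; the choice n = 4 is an arbitrary illustration), using exact fractions for the truth values:

# The MV-algebra operations (O1)-(O5) on the chain 0/n, 1/n, ..., n/n.

from fractions import Fraction

n = 4
V = [Fraction(k, n) for k in range(n + 1)]             # 0, 1/4, 1/2, 3/4, 1

conj  = lambda x, y: min(x, y)                         # (O1)
disj  = lambda x, y: max(x, y)                         # (O2)
neg   = lambda x:    1 - x                             # (O3)
impl  = lambda x, y: min(1 - x + y, Fraction(1))       # (O4)
equiv = lambda x, y: conj(impl(x, y), impl(y, x))      # (O5)

# Restricted to {0, 1}, these operations are exactly the Frege truth
# functions (Exercise 5.4.7); e.g., (O4) collapses to -x OR y there.
for x in (Fraction(0), Fraction(1)):
    for y in (Fraction(0), Fraction(1)):
        assert impl(x, y) == max(1 - x, y)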
Exercise 5.4.7 Show that the Frege algebra is a special case of an MV algebra, in which V is just {0, 1}.

As with the Frege algebra, it is customary and natural to interpret the elements of an MV algebra as truth values. The difference, of course, is that a (non-trivial) MV algebra has non-classical intermediate truth values. Multi-valued logic was originally motivated by the problem of future contingent statements. Unfortunately, as it turns out, multi-valued logic does not provide a satisfactory solution to this problem, primarily because multi-valued logic is fundamentally truth-functional. Notwithstanding its failure to solve the philosophical problem for which it was originally invented, multi-valued logic has grown into a thriving mathematical discipline with many (non-philosophical) applications. Nevertheless, we do not deal further with multi-valued logics as such, although we certainly deal with semantic algebras containing more than two elements. This is the topic of Sections 5.5 and 5.6.
5.5 Possible Worlds Semantics
In the Frege algebra, there are exactly two "propositions," 1 and 0, which are identified with the truth values "the true" and "the false." In other words, in the Frege algebra, to say that a proposition is (adjectively) true is precisely to say that it is (identical to) "the true."
The Frege algebra is a special case of the more general class of truth-value algebras, which include various MV algebras. In every such algebra, the propositions are simply truth values, and propositional connectives are truth functions. Accordingly, only truth-functional connectives can be interpreted within a truth-value algebra, be it the Frege algebra or an MV algebra. This approach to formal semantics works very well for truth-functional logic, including classical sentential logic and the various multi-valued logics, but it does not work for logics that are not truth-functional, including quantifier logic and modal logic. A more general approach, which we formally present in the next section, distinguishes between propositions and truth values, in analogy to Frege's (1892) distinction between sense and reference. According to this approach, every sentence has a direct interpretation, which is a proposition; every proposition is, in turn, either true or false (adjectively), so every sentence also has an indirect interpretation, which is a truth value.

However, before proceeding to the more general approach, we consider one more method of implementing algebraic compositionality, namely, the method of possible worlds. According to this method, an interpretation function does not assign a truth value simpliciter to each sentence; rather, it assigns a truth value with respect to each possible world. One then gives truth conditions for complex sentences in a systematic manner, analogous to the truth conditions for classical truth-functional logic. The following illustrate this approach, where v(φ, w) is the truth value of φ at world w:
(v1) v(φ & ψ, w) = t iff v(φ, w) = t and v(ψ, w) = t;
(v2) v(φ v ψ, w) = t iff v(φ, w) = t and/or v(ψ, w) = t;
(v3) v(~φ, w) = t iff v(φ, w) = f.

All the connectives defined in (v1)-(v3) are truth-functional. If we confine ourselves to these connectives, we have an intensional semantics for a language that has no syntactic means of articulating the various intensional distinctions that arise in the semantics. The failure of the syntax to adequately reflect the semantics prompts any syntactically oriented logician to introduce further, non-truth-functional, sentential operators, namely, modal operators. The most celebrated modal operators are "necessarily ..." and "possibly ...," which are customarily symbolized by □ and ◇. One characterization of their truth conditions, which traces back to Leibniz, is given as follows:
(v4) v(□φ, w) = t iff v(φ, w′) = t for every possible world w′;
(v5) v(◇φ, w) = t iff v(φ, w′) = t for some possible world w′.

The above truth conditions correspond to absolute modal logic. One obtains weaker modal logics by adding an accessibility relation R to the truth conditions, an idea that traces to Kripke (1963a). Thus, in the Kripke approach, one posits a non-empty set W of possible worlds together with a binary relation R on W. One then characterizes interpretations as follows:
(v4*) v(□φ, w) = t iff v(φ, w′) = t for every w′ such that wRw′;
(v5*) v(◇φ, w) = t iff v(φ, w′) = t for some w′ such that wRw′.
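A minimal sketch of the relational truth conditions (v4*) and (v5*) (ours; the toy frame, the worlds w0-w2, and the atom 'p' are illustrative assumptions):

# Evaluating box and diamond over a small frame (W, R).

W = {'w0', 'w1', 'w2'}
R = {('w0', 'w1'), ('w0', 'w2'), ('w1', 'w1')}     # accessibility relation

atom_true = {('p', 'w1'), ('p', 'w2')}             # 'p' is false elsewhere

def v(formula, w):
    op = formula[0]
    if op == 'atom':
        return (formula[1], w) in atom_true
    if op == 'box':    # (v4*): true at w iff true at every R-successor of w
        return all(v(formula[1], u) for u in W if (w, u) in R)
    if op == 'dia':    # (v5*): true at w iff true at some R-successor of w
        return any(v(formula[1], u) for u in W if (w, u) in R)
    raise ValueError(op)

p = ('atom', 'p')
assert v(('box', p), 'w0')          # both successors of w0 satisfy p
assert v(('box', p), 'w2')          # vacuously true: w2 has no R-successors
assert not v(('dia', p), 'w2')      # and hence no successor satisfying p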
Depending on what properties one ascribes to the accessibility relation R (e.g., reflexivity, symmetry, transitivity, etc.), one obtains various well-known modal systems (e.g., T, S4, B, S5). In addition to the customary presentation of the semantics of modal logic, due to Kripke (1959, 1963a, 1963b, 1965), there is an alternative presentation that renders possible worlds semantics completely compatible with algebraic semantics. Toward this end, we define two special sorts of propositional algebras (cf. Lemmon 1966). The first is more general and is based on Kripke (1963a, 1963b). The second is based in effect on Kripke (1959).

Definition 5.5.1 Let W be a non-empty set (of possible worlds), and let R be any binary relation on W (the accessibility relation). Then the Kripke algebra on (W, R) is the algebra KA(W, R) = (P, F) defined as follows: P is the set ℘(W) of all subsets of W; F is a family of operations, defined by:
(K1) p ∧ q = p ∩ q;
(K2) p ∨ q = p ∪ q;
(K3) −p = W − p;
(K4) p ⇒ q = (W − p) ∪ q;
(K5) p ⇔ q = (p ⇒ q) ∩ (q ⇒ p);
(K6) □p = {x : for all y, if xRy then y ∈ p};
(K7) ◇p = {x : for some y, xRy and y ∈ p}.
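A sketch of Definition 5.5.1 in the same style (ours; the frame is the illustrative one used above), with propositions as subsets of W:

# The Kripke algebra KA(W, R): (K1)-(K7) as set operations on subsets of W.

W = frozenset({'w0', 'w1', 'w2'})
R = {('w0', 'w1'), ('w0', 'w2'), ('w1', 'w1')}

def conj(p, q): return p & q                      # (K1) intersection
def disj(p, q): return p | q                      # (K2) union
def neg(p):     return W - p                      # (K3) complement
def impl(p, q): return (W - p) | q                # (K4)
def box(p):                                       # (K6)
    return frozenset(x for x in W if all(y in p for y in W if (x, y) in R))
def dia(p):                                       # (K7)
    return frozenset(x for x in W if any(y in p for y in W if (x, y) in R))

p = frozenset({'w1', 'w2'})
assert box(p) == W                          # w2 qualifies vacuously
assert dia(p) == frozenset({'w0', 'w1'})    # w2 has no successor in p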
Definition 5.5.2 Let W be a non-empty set (of possible worlds). Then the Leibniz algebra on W is the algebra LA(W), defined to be identical to the Kripke algebra KA(W, R), where R is the universal relation on W.

Here, the variables x and y range over elements of W (i.e., worlds), and p and q range over elements of P (i.e., sets of worlds). Also, the symbols that look like connectives refer to the operations on the Kripke (Leibniz) algebra.

Exercise 5.5.3 Show that in the Leibniz algebra LA(W), conditions (K6) and (K7) reduce to the following:
(L6) □p = W if p = W, and ∅ otherwise.
(L7) ◇p = ∅ if p = ∅, and W otherwise.

In a Leibniz or Kripke algebra, a proposition is a "UCLA proposition."¹ In other words, a proposition is simply identified with the set of worlds in which it is true. This means, of course, that distinct propositions cannot be true in precisely the same worlds. Since UCLA propositions are sets (of worlds), propositional connectives are set-theoretic operations. For example, the negation of a proposition (set of worlds) is its set-theoretic complement (relative to W). Similarly, the conjunction of two propositions is simply their intersection, and the disjunction of two propositions is simply their union. Besides the simple set operations, there are also somewhat more complicated set operations, which are associated with the modal operators. For example, condition (K6) states that a world w is in □p iff every world accessible from w is in p. On the other hand, the Leibnizian condition (L6) states that w is in □p iff every world is in p. In other words, if p is true in every world, then its necessitation □p is true in every world, but if p is false in at least one world, then □p is false in every world. This reflects the fact that the Leibniz algebra corresponds to absolute modal logic.

¹JMD first heard the term "UCLA proposition" from Alan Ross Anderson sometime during the mid-1960s. We do not know if it originates with Anderson, but it was of some currency then and reflects the contributions made by Carnap, Montague, and Kaplan (all at the University of California at Los Angeles) to the semantics of modal logic.

Now, back to algebraic semantics. First of all, in a Kripke or Leibniz algebra, a proposition is just a set of worlds, and a proposition p is true at a world w iff w ∈ p. An algebraic interpretation (homomorphism) I assigns a proposition to every sentence. Accordingly, a sentence φ is true, according to I, at a world w iff I(φ) is true at w, which is to say that w ∈ I(φ). In other words, we have a straightforward correspondence between the customary interpretations of modal logic and algebraic interpretations. This is formally described as follows, where v is a customary interpretation and I is the corresponding algebraic interpretation:

(c1) v(φ, w) = t ⟺ w ∈ I(φ).

An algebraic interpretation does not assign propositions to sentences willy-nilly, but rather in a manner consistent with algebraic compositionality. For example, if (according to I) sentence φ denotes proposition p, and sentence ψ denotes proposition q, then the syntactic conjunction of φ and ψ denotes the propositional conjunction of p and q. In the special case of Kripke/Leibniz algebras, p and q are subsets of W, and their propositional conjunction is simply their intersection. In connection with (c1), this yields the following series of correspondences:

v(φ & ψ, w) = t ⟺ w ∈ I(φ & ψ) ⟺ w ∈ I(φ) ∩ I(ψ) ⟺ w ∈ I(φ) and w ∈ I(ψ) ⟺ v(φ, w) = t and v(ψ, w) = t.
A similar thing can be said about each of the sentential connectives, including necessitation; this we leave as an exercise.

Exercise 5.5.4 Finish up the above account. Show that the semantics based on the Leibniz algebra (Kripke algebra) corresponds to the customary possible worlds semantics.

As a final exercise, we direct the reader's attention back to the Frege algebra.

Exercise 5.5.5 Consider an extremely simple example of a Leibniz algebra LA(W), one in which W = {w}, in which case there are only two propositions (namely, ∅ and {w}). Show that the semantics based on LA({w}) coincides with the semantics based on the Frege algebra. In particular, show that the Frege algebra is isomorphic to the Leibniz algebra LA({w}) in which there is exactly one possible world.
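For Exercise 5.5.5, a one-world check along these lines (ours, purely illustrative) is immediate:

# Over W = {w} there are exactly two propositions, and under the pairing
# empty <-> 0, {w} <-> 1, intersection behaves exactly like Frege
# conjunction (x AND y = xy).

W = frozenset({'w'})
props = (frozenset(), W)          # the two UCLA propositions over {w}

for p in props:
    for q in props:
        assert len(p & q) == len(p) * len(q)   # meet mirrors multiplication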
5.6 Logical Matrices and Logical Atlases
Having examined truth-value semantics in Section 5.4, and possible worlds semantics in Section 5.5, we now turn to a general algebraic semantic scheme that is intended to subsume the former as special cases. Fundamental to our approach to algebraic semantics is the concept of a logical atlas (atlas, for short), which is a natural and useful generalization of the traditional concept of a logical matrix, fundamental to the Polish School. We begin by discussing logical matrices, which are formally defined as follows.
Definition 5.6.1 A logical matrix is a system (P, D), where P is a non-empty algebra, and D is a non-empty proper subset of the carrier set P.

The idea here is that P is an algebra of propositions, and D is a distinguished subset of "designated" (true) propositions. Intuitively, a logical matrix specifies two things: first, the propositions and their connectives; and second, which propositions are in fact true. For various reasons, logical and/or metaphysical, one might not wish to use just one set of propositions. It is accordingly natural to consider collections of similar logical matrices (two matrices being similar iff their propositional algebras are similar). This gives rise to the following formal definition.
Definition 5.6.2 A medley (of matrices) is a non-empty collection M of similar logical matrices.

Of course, one way for all the matrices in M to be similar is for their underlying algebras all to be identical. This specialization of medleys is formally defined as follows.

Definition 5.6.3 A bundle (of matrices) is a collection of matrices, all of which have the same underlying algebra of propositions.

An alternative, but equivalent, mathematical system is defined as follows.

Definition 5.6.4 A logical atlas is a system (P, (Di)), where P is an algebra of propositions, and (Di) is a non-empty family of non-empty proper subsets of the carrier set P.

Remark 5.6.5 Obviously logical matrices may be regarded as a special case of logical atlases. Also, matrix bundles and logical atlases are coextensive, i.e., every bundle gives rise to an associated atlas, and vice versa.

As in the case of logical matrices, a logical atlas specifies two things: first, the propositions and their connectives; and second, which collections of propositions are admissibly combined into truth sets. On the other hand, unlike a logical matrix, a logical atlas does not (in general) specify which truth set is actual. The admissible truth sets (i.e., the various Di) constitute the algebraic analog of possible worlds. The underlying metaphysical intuition is simple: one begins with a pre-existing set of propositions, and one then constructs a world by assembling a whole bunch of propositions, intuitively the propositions true in that world. This view of propositions reflects the dictum of Wittgenstein (1921), "The world is everything that is the case." Indeed, this suggests adopting Wittgenstein's term "logical space" as an alternative to "atlas." Indeed, we treat "atlas" and "logical space" as synonyms, but we adopt the former to the exclusion of the latter purely for the sake of brevity and abstractness; this factor is considerably more important in later investigations into the universal algebraic properties of atlases (see Chapter 7). We hasten to add that we definitely do not presume that a logical space (i.e., an atlas) is constructed in the manner of the Tractatus (i.e., from "atomic" propositions).

Before proceeding to semantics based on logical matrices and logical spaces in the next section, we offer a few examples of matrices, medleys, and atlases. First of all, the Frege algebra can be transformed into the Frege matrix simply by specifying D to be {1}. In other words, in the Frege matrix, a proposition is true iff it is "the true." Similarly, one obtains the Frege atlas by forming the singleton of the Frege matrix.

A somewhat more complex family of examples can be obtained by letting P be any chain (e.g., any interval of fractions between 0 and 1) and letting D be any proper upward interval on P. Figure 5.1 contains a few simple examples, (c1)-(c4); there, xs are designated elements, and os are undesignated elements. We also need some operations, so let us take conjunction as min(x, y) and disjunction as max(x, y), as with an MV algebra.

FIG. 5.1. Proper upward intervals on a chain [figure not reproduced: four chains, (c1)-(c4), with designated elements marked x and undesignated elements marked o]

A still more complex family of examples can be obtained by letting P be any Boolean lattice, and letting D be any maximal filter on P. We call such a matrix a maximal Boolean matrix. Figure 5.2 contains some examples, (b1)-(b6); notice that the first one is simply the Frege matrix. Now, one obtains a medley simply by collecting any bunch of similar matrices, but one obtains an atlas (in effect) by collecting matrices founded on the same algebra of propositions. For example, one obtains an atlas A1 by assembling (b2)-(b3), and another atlas A2 by assembling (b4)-(b6); these are discussed in an exercise below.

FIG. 5.2. Maximal filters on a Boolean lattice [figure not reproduced: six Boolean lattices, (b1)-(b6), with designated elements marked x and undesignated elements marked o]

We conclude this section by showing how the theory of atlases subsumes possible worlds semantics. We begin with the following.

Definition 5.6.6 Let W be any non-empty set (of possible worlds). Then the Leibniz atlas associated with W is the atlas (P, (Dw)w∈W), where
(1) P is the Leibniz algebra LA(W);
(2) Dw = {S ⊆ W : w ∈ S}, for each world w in W.

In other words, the underlying algebra of propositions is the algebra of subsets of W, and for each world w, the associated designated set Dw is simply the set of propositions (subsets of W) that contain w as an element. This is exactly as it should be; a world w makes a UCLA proposition p true iff w is an element of p.

Exercise 5.6.7 Show that the algebraic semantics based on the Leibniz atlas associated with W is equivalent to the customary possible worlds semantics based on W.

Exercise 5.6.8 Modify the above definition to produce a corresponding definition of a Kripke atlas, and show how the semantics based on Kripke atlases is equivalent to the usual Kripke semantics for modal logic.

Exercise 5.6.9 Show that the atlas A1 above is equivalent to the Leibniz bundle in which W = {w1, w2}. Similarly, show that the atlas A2 is equivalent to the Leibniz bundle in which W = {w1, w2, w3}.
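Before moving on, here is a small sketch of ours (names illustrative) of the Leibniz atlas of Definition 5.6.6 on a two-world set, which also exhibits each designated set Dw as a maximal filter (compare Figure 5.2):

# The Leibniz atlas on W: the algebra of all subsets of W, with one
# designated set D_w = {S subset of W : w in S} per world.

from itertools import combinations

W = frozenset({'w1', 'w2'})

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

P = powerset(W)                                        # the UCLA propositions
D = {w: frozenset(S for S in P if w in S) for w in W}  # the designated sets

# At each world, a proposition is designated iff its complement is not,
# as one expects of a maximal filter.
for w in W:
    assert all((S in D[w]) != ((W - S) in D[w]) for S in P)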
5.7 Interpretations and Valuations
Thus far, we have discussed two entirely different kinds of semantic algebras. On the one hand, in Section 5.4, we discussed algebras consisting of (two, possibly more) truth values; in this context, an interpretation assigns truth values to the sentences of a language. On the other hand, in Sections 5.5 and 5.6, we discussed algebras consisting of propositions; in this context, an interpretation assigns propositions, not truth values, to sentences. In the present section, we unify these ideas under a common semantic framework, which traces back to Frege (1892) and Carnap (1947). Fundamental to this semantic framework is the distinction between two kinds of semantic functions, which we call interpretations and valuations.

Interpretations and valuations are very different beasts, both philosophically and mathematically. Philosophically, an interpretation assigns to every sentence a meaning (Fregean sense, Carnapian intension), whereas a valuation assigns to every sentence a truth value (Fregean reference, Carnapian extension). Mathematically, an interpretation is required to be a homomorphism, whereas a valuation is required simply to be a function. This latter fact is made explicit in the following definitions, the first of which is borrowed from Section 5.3.

Definition 5.7.1 Let L be a sentential language, let S be the associated algebra of sentences of L, and let P be an algebra (of propositions) appropriate for L. Then an interpretation of L in P is any homomorphism from S into P.

Definition 5.7.2 An interpretation on L is any interpretation of L in any algebra P appropriate to L.

Definition 5.7.3 Let L be a sentential language, let S be the associated set of sentences of L. Then a valuation on L is any function from S into the set {t, f} of truth values.

Whereas an interpretation is a homomorphism, a valuation is not. This reflects the desire to have a compositional semantics without having a truth-functional semantics. In particular, whereas meanings are required to compose algebraically, truth values are not. Notice that every valuation v partitions the sentences of L into two halves, the true and the false. This leads to the following subsidiary definition, which is followed by a simple consequence.

Definition 5.7.4 Let v be a valuation on L. Then:
(1) Tv = {φ : v(φ) = t};
(2) Fv = {φ : v(φ) = f}.
Fact 5.7.5 Let v be a valuation on L. Then:
(1) Tv ∪ Fv = S(L);
(2) Tv ∩ Fv = ∅.

The latter fact amounts to the principle of bivalence, according to which every sentence has a truth value, and no sentence has more than one truth value. This particular construal of truth and falsity, which is the customary one, is based on a minimal notion of falsity, according to which falsity is simply the absence of truth. Bivalence, of course, has been challenged on both philosophical and linguistic grounds. Various attempts to model natural languages have produced various non-bivalent logics. In addition to traditional multi-valued logics, discussed earlier, there are also logics that countenance truth-value gaps (sentences having no truth value), as well as logics that countenance truth-value gluts (sentences having two truth values). The most common alternative conception of falsity defines "p is false" to mean "~p is true." If it happens that neither p nor ~p is true, then p is neither true nor false, and we obtain a truth-value gap. On the other hand, if it happens that both p and ~p are true, then p is both true and false, and we obtain a truth-value glut. Largely for the sake of convenience, we reject this alternative conception of falsity in favor of the minimal conception. Accordingly, rather than saying that p has no truth value, we say that both p and ~p are false, which means that neither p nor ~p is true. Similarly, rather than saying that p has two truth values, we say that both p and ~p are true. Thus, although we countenance various odd semantical situations, we prefer to describe them differently.

Our principal goal is to provide a general semantic framework for logic. For the purposes of logic, one does not need an autonomous concept of falsity, but only a concept of truth. All the standard notions of logic (validity, consequence, etc.) can be defined exclusively in terms of truth. But, in order to simplify definitions, we allow ourselves a shorthand term "false" to mean "not true." See, for example, Section 5.10.

We now turn to the relation between interpretations and valuations. First of all, every logical matrix and every atlas is based on an algebra of propositions, so the above definition of interpretations can be modified slightly to yield corresponding definitions about logical matrices, atlases, etc. We begin by defining appropriateness of logical matrices, etc.

Definition 5.7.6 Let L be a sentential language, let S be the associated algebra of sentences.
(1) Let M = (P, D) be a logical matrix. Then M is said to be appropriate for L if P has the same type as S.
(2) Let A = (P, (Di)) be an atlas (of logical matrices). Then A is said to be appropriate for L if P has the same type as S.
(3) Let M be a medley (of logical matrices). Then M is said to be appropriate for L if every matrix M in M is appropriate for L.

In other words, a matrix (atlas, medley) is appropriate for a language L iff the underlying algebra (algebras) of propositions is (are) appropriate for L.

Definition 5.7.7 Let L be a sentential language, and let S be the associated algebra of sentences.
(1) Let M = (P, D) be a logical matrix appropriate for L. Then an interpretation of L into M is any homomorphism from S into P.
(2) Let A = (P, (Di)) be an atlas appropriate for L. Then an interpretation of L into A is any homomorphism from S into P.
(3) Let M = (Mi) be a medley appropriate for L. Then an interpretation of L into M is any interpretation of L into Mi for some i.

In other words, an interpretation of a language L in a matrix (atlas, medley) is simply an interpretation of L into (one of) the underlying algebra(s).

We now return to valuations. We begin by noting that every atlas may be regarded as a bundle, which is a special case of a medley, which is a collection of matrices. Thus, every sort of interpretation mentioned above is, at root, a matrix interpretation. Accordingly, for the moment at least, we concentrate on matrix interpretations. Now, every matrix interpretation of a language L assigns a proposition to every sentence of L. Furthermore, every matrix designates certain propositions as true, the remainder being regarded as false (by default). Accordingly, every matrix interpretation induces an associated valuation, formally defined as follows.

Definition 5.7.8 Let L be a sentential language, let M = (P, D) be a logical matrix appropriate for L, and let I be an interpretation of L in M. Then the valuation induced by I is the function vI from S into {t, f} satisfying the following conditions for every sentence φ of L:
(1) vI(φ) = t if I(φ) ∈ D;
(2) vI(φ) = f otherwise.

In other words, a sentence φ is true, according to valuation vI, if φ expresses a true (i.e., designated) proposition, according to I; otherwise, φ is false. This reflects the Fregean dictum that sense is the route to reference. We conclude this section by introducing some useful notation.

Definition 5.7.9 Let L be a sentential language, and let M be a logical matrix appropriate for L. Then:
(1) J(L, M) = {I : I is an interpretation of L into M};
(2) V(L, M) = {vI : I ∈ J(L, M)}.

In other words, J(L, M) consists of precisely those functions that are interpretations of L into M, and V(L, M) consists of the associated valuations on L. The above definition has a natural extension to medleys, and hence bundles, and hence atlases.

Definition 5.7.10 Let L be a sentential language, and let M be a medley appropriate for L. Then:
(1) J(L, M) = ⋃{J(L, M) : M ∈ M};
(2) V(L, M) = {vI : I ∈ J(L, M)}.
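To make Definition 5.7.8 concrete, here is a minimal sketch of ours (the four-element matrix and its designated set are invented purely for illustration) of a valuation induced by a matrix interpretation:

# An interpretation I into a toy matrix (P, D) induces the valuation v_I,
# which makes a sentence true exactly when its proposition is designated.

D = {2, 3}                      # designated propositions
base = {'p': 3, 'q': 1}         # propositions assigned to the atoms

def I(sentence):
    """A toy homomorphism: '&' is interpreted as min (a meet)."""
    if isinstance(sentence, str):
        return base[sentence]
    _, left, right = sentence   # sentence = ('&', left, right)
    return min(I(left), I(right))

def v_I(sentence):
    return 't' if I(sentence) in D else 'f'     # Definition 5.7.8

assert v_I('p') == 't' and v_I('q') == 'f'
assert v_I(('&', 'p', 'q')) == 'f'    # min(3, 1) = 1 is undesignated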
Exercise 5.7.11 Provide corresponding definitions for bundles and atlases.

When the language L is tacitly understood, we simplify our notation by dropping reference to L; for example, we write "J(M)" or "V(M)".
5.8 Interpreted and Evaluationally Constrained Languages
A valuation assigns a truth value (t or f) to every sentence of the relevant language L. However, one cannot simply scatter truth values over the sentences; rather, one must assign truth values in a manner that respects the meanings of the sentences. In other words, some valuations are admissible and some are not. For example, we do not allow a valuation that assigns t to φ & ψ but f to φ, because such a valuation disrespects the meaning of conjunction. Similarly, we do not allow a valuation that assigns t to "John is a bachelor" but f to "John is unmarried," because such a valuation disrespects the meanings of "bachelor" and "unmarried." (The disanalogy between these two examples is discussed in Chapter 6.) At the very minimum, a semantics for a language specifies what valuations are admissible.

The set of admissible valuations of a semantics for L might be regarded as the extensional "traces" of L, which are obtained by bouncing truth values off the intensions of sentences of L and seeing which ones stick and which ones bounce back, much as with the impulses generated by radar. If we disregard the intensional component of a semantics, i.e., the interpretations of sentences via propositions, and concentrate exclusively upon their extensional vestiges, we arrive at the notion of an evaluationally constrained language, which is due to van Fraassen (1971), and which is formally defined as follows.

Definition 5.8.1 An evaluationally constrained language (also sometimes referred to as a semi-interpreted language) is a pair (S, V), where S is the set of sentences of some language L, and V is a non-empty set of valuations on S.

The elements of V are referred to as the admissible valuations of (S, V). However, contrary to the motivation that introduced the idea, evaluationally constrained languages are abstractly defined in such a way that any non-empty set of valuations on S qualifies as a set of admissible valuations. The reason why the epithet 'semi' is appropriate, we believe, is that (oftentimes) an evaluationally constrained language may be viewed as the surface (i.e., extensional) manifestation of a deeper semantic object, namely, an interpreted language. Indeed, we would contend that, without the underlying interpreted language, one cannot understand why some valuations are admissible and others are not. So we are naturally interested in those evaluationally constrained languages that are founded on underlying interpreted languages. (We use the term "interpreted language" occasionally, as here, in a general sense that also includes languages which will more technically be called "interpretationally constrained." Context will clarify whether this more general usage is meant.) Before we discuss the latter notion, we need to present a specialization of evaluationally constrained languages appropriate to our special concerns.
Definition 5.8.2 An evaluationally constrained sentential language (ECSL) is a pair (S, V), where S is the algebra of sentences of some language L, and V is a non-empty set of valuations on S.

The difference between a generic evaluationally constrained language and an evaluationally constrained sentential language is that the latter, but not the former, is required to be based on an algebraically formatted (categorial) language.

Turning to interpreted languages, we first observe that there is no single category of interpreted languages, but rather several, which we discuss in turn. First of all, by way of contrast, an uninterpreted language is simply a language presented syntactically, but not semantically. There are various ways of supplying a semantics, and accordingly various types of interpreted languages. The most thoroughgoing semantics is defined as follows.

Definition 5.8.3 A locally evaluated interpreted language is a pair (S, I), where S is the algebra of sentences of some language L, and I is an interpretation of L in a matrix M appropriate to L.

Thus, in a locally evaluated interpreted language, we are given the meaning of (proposition expressed by) every sentence of L, and furthermore we are given which propositions are true and which are false. Such a semantics might be called saturated, since it leaves nothing open. In regard to traditional mathematical logic, a locally evaluated interpreted language is analogous to a first-order language L together with a model M of L and an assignment function d (which enables one to assign truth values to all formulas, both open and closed).

Locally evaluated interpreted languages, as defined above, constitute an extreme form of semantic implementation, so extreme in fact that they do not apply very well either to artificial languages or to natural languages. It is well known that many artificial languages are semantically open-ended, in virtue of being schematic in nature. But so are natural languages, in virtue of the presence of indexicals; for example, the sentence "it is raining" has a meaning, but not a truth value simpliciter; rather, it only has a truth value relative to each specific context in which it is uttered. It would be naive to expect a semantics to tell us once and for all the truth value of every sentence. Rather, it would be more reasonable to expect a semantics to provide not truth values, but truth conditions. This idea leads to various weakenings of the notion of an interpreted language, defined in what follows.

Definition 5.8.4 A globally evaluated interpreted language is a pair (S, I), where S is the algebra of sentences of a language L, and I is a particular interpretation of L in an atlas A appropriate to L.

A locally evaluated interpreted language provides both a meaning and a truth value for every sentence. By contrast, a globally evaluated interpreted language provides a meaning, but not (necessarily) a truth value, for every sentence; what is missing is the specification of the actual world (situation, context, assignment). In regard to traditional mathematical logic, a globally evaluated interpreted language is analogous to a first-order language L together with a particular model M of L.
Definition 5.8.5 A globally evaluated interpretationally constrained language is a pair (S, I), where S is the algebra of sentences of a language L, and I is the set of all admissible interpretations of L in an atlas A appropriate to L.

Whereas an interpreted language provides a specific meaning, but not necessarily a truth value, for every sentence, a globally evaluated interpretationally constrained language does not even provide a specific meaning for every sentence. Rather, what it provides is a whole collection of (admissible) meanings. In regard to traditional mathematical logic, a partially interpreted language is analogous to a first-order language L together with all of its admissible models.

The transition from partially to locally evaluated interpreted languages proceeds in two steps. First, one fixes a particular meaning for each sentence, but does not fix which possible world is the actual world; this is the move from partial interpretation to simple interpretation. Next, one fixes which world is actual; this is the move from simple interpretation to full interpretation. The order of the above procedure could be reversed. One can fix the actual world first, and then fix the meaning of each sentence. There is accordingly an intermediate stage, at which point the world is fixed, but not the meanings of the sentences. Since we have not yet given a name to this intermediate semantic stage, we do that now.

Definition 5.8.6 A locally evaluated interpretationally constrained language is a pair (S, I), where S is the algebra of sentences of a language L, and I is the set of all admissible interpretations of L in a logical matrix M appropriate to L.

The difference between a globally evaluated interpretationally constrained language and a locally evaluated interpretationally constrained language is that, whereas the former is defined with respect to an atlas, the latter is defined with respect to a single matrix. Of course, a logical matrix may be viewed as a special case of an atlas, in particular, a singular atlas. Accordingly, technically speaking, a locally evaluated interpretationally constrained language is a special case of a globally evaluated interpretationally constrained language. Figure 5.3 shows how the different types of languages relate to each other.

FIG. 5.3. Types of languages [diagram summarized]: a matrix (P, D) or an atlas (P, (Dj)j∈J) underlies, respectively, the locally evaluated interpreted language (S, I, P, D) and the globally evaluated interpreted language (S, I, P, (Dj)j∈J); with a set I of admissible interpretations in place of a single I, they underlie the locally evaluated interpretationally constrained language (S, I, P, D) and the globally evaluated interpretationally constrained language (S, I, P, (Dj)j∈J); dropping the designated sets leaves the interpreted language (S, I, P) and the interpretationally constrained language (S, I, P).

Returning to evaluationally constrained languages, we present a number of theorems.

Theorem 5.8.7 Let (S, I) be a locally evaluated interpreted language, where M = (P, D) is the relevant matrix. Then there is an associated evaluationally constrained language (in fact, an evaluated language) (S, V), in which V = {vI}, where vI(φ) = t iff I(φ) ∈ D.

Theorem 5.8.8 Let (S, I) be a globally evaluated interpreted language, where A = (P, (Dj)) is the relevant atlas. Then there is an associated evaluationally constrained language (S, V), where V = {vj : j ∈ J}, where for each j, vj(φ) = t iff I(φ) ∈ Dj.

Theorem 5.8.9 Let (S, I) be a locally evaluated interpretationally constrained language, where M = (P, D) is the relevant matrix. Then there is an associated evaluationally constrained language (S, V), where V = V(L, M).

Theorem 5.8.10 Let (S, I) be a globally evaluated interpretationally constrained language, where A = (P, (Dj)) is the relevant atlas. Then there is an associated evaluationally constrained language (S, V), where V = V(L, A).

FIG. 5.4. Interpreted languages and induced valuations [table reconstructed from the original layout]:
  (S, I, P, D)        locally evaluated interpreted                       v(s) = t iff I(s) ∈ D
  (S, I, P, (Dj)j∈J)  globally evaluated interpreted                      vj(s) = t iff I(s) ∈ Dj
  (S, I, P, D)        locally evaluated interpretationally constrained    vk(s) = t iff Ik(s) ∈ D
  (S, I, P, (Dj)j∈J)  globally evaluated interpretationally constrained   vj,k(s) = t iff Ik(s) ∈ Dj
  Each yields an evaluated language (S, v) or an evaluationally constrained ("semi-interpreted") language (S, V).
Every sort of interpreted language gives rise to an evaluationally constrained language. (Figure 5.4 shows how evaluationally constrained languages are obtained from interpreted languages.) Our radar metaphor suggests a natural converse, namely, that every evaluationally constrained (sentential) language arises from an underlying interpreted language, in virtue of which we understand why some valuations are admissible and some are not. However, given the abstractness and generality of the definition of evaluationally constrained (sentential) languages, we should not expect this always to happen. Recall that, in the definition of evaluationally constrained languages, there is no requirement on the set V except that it is a non-empty set of valuations on the language. This allows evaluationally constrained languages that are sufficiently bizarre that they cannot be based on an underlying compositional semantics. For example, the class V may be generated by a random process; although it may not seem intuitively admissible, such a class is certainly mathematically admissible. In the following section, we discuss the circumstances under which an evaluationally constrained language may be viewed as arising from a globally evaluated interpretationally constrained, locally evaluated interpretationally constrained, or globally evaluated interpreted language.
5.9 Substitutions, Interpretations, and Valuations
As noted in Section 5.8, every interpreted (locally/globally evaluated interpreted/interpretationally constrained) language gives rise to an associated evaluationally constrained language, but, as we also noted, one should not expect the converses to hold. Accordingly, in the present section, we consider the following question.

Question: Under what circumstances does an evaluationally constrained sentential language arise from an underlying interpreted (locally/globally evaluated interpreted/interpretationally constrained) language?

We begin with a pair of fairly trivial results, one about locally evaluated interpreted languages, the other about globally evaluated interpreted languages.

Theorem 5.9.1 In order for an ECSL (S, V) to arise from an underlying locally evaluated interpreted language, it is both necessary and sufficient that V is a singleton {v}.

Proof The necessity half is trivial; a locally evaluated interpreted language, by definition, supports exactly one valuation. Concerning the sufficiency half, suppose (S, V) is an ECSL, where V = {v}. We wish to construct an underlying locally evaluated interpreted language (S, I). Given the definition, it suffices to construct a matrix (P, D) and an interpretation I with the property that I(φ) ∈ D iff v(φ) = t. Let P simply be the algebra S of sentences, let D be the set {φ : v(φ) = t}, and let I be the identity function. □

Next, a slightly less trivial result about globally evaluated interpreted languages.

Theorem 5.9.2 Every ECSL arises from a globally evaluated interpreted language.

Proof Let (S, V) be an ECSL. We wish to show that there is an underlying interpreted language (S, I), where I is an interpretation of S into an atlas A = (P, (Dj)). Let P = S, and let I be the identity function, I(φ) = φ. Also, for each v in V, define Dv = {φ : v(φ) = t}. Define (Dj) = (Dv)v∈V. Claim: (S, I) gives rise to (S, V). □
We believe that this theorem is completely in keeping with the intuitions motivating the idea of an evaluationally constrained language; the sentences of an evaluationally constrained language L are provided a meaning, which is coded up by the admissible valuations, but are not (necessarily) provided truth values. This is in accordance with the intuition underlying the notion of a globally evaluated interpreted language; the sentences of a globally evaluated interpreted language are assigned meanings, which are coded up by propositions, but again are not (necessarily) provided truth values.

Before proceeding to less trivial results, concerning globally evaluated interpretationally constrained and locally evaluated interpretationally constrained languages, we must introduce a crucial concept.

Definition 5.9.3 Let (S, V) be an ECSL. Then (S, V) is said to be closed under substitution if for any valuation v in V, and for any substitution (i.e., endomorphism) σ on S, the functional composition of v and σ is also a valuation in V. In other words, if v ∈ V, and σ is an endomorphism on S, then vσ ∈ V, where vσ is defined so that vσ(φ) = v(σ(φ)).

The intuitive idea behind closure under substitution may be understood by way of a simple example. Suppose σ substitutes r for q, and leaves all other atomic formulas unaffected; we can accordingly denote σ by (r/q). Then for any valuation v there is another valuation, call it v(r/q), that assigns to q what v assigns to r; i.e., v(r/q)(q) = v(r). In other words, anything that can be accomplished by syntactic substitution can just as easily be accomplished by semantic "substitution" (i.e., a valuation). One might describe this by saying that syntactic substitutions are replaceable by semantic substitutions. Later in this section, we consider the converse property, to wit, the replaceability of semantic substitutions by syntactic substitutions. Later in this chapter, we discuss the relation of these replaceability properties to formal logic.

Insofar as valuations rest on interpretations, the closure of valuations under substitution rests on the corresponding closure of interpretations under substitution. This is made more precise in the following definition and theorems about partially interpreted languages.

Definition 5.9.4 A globally evaluated interpretationally constrained language (S, I) is said to be closed under substitution if for any interpretation I in I, and for any substitution σ on S, the functional composition of I and σ is also an interpretation in I.

Theorem 5.9.5 Every globally evaluated interpretationally constrained language is closed under substitution.

Proof Suppose (S, I) is a globally evaluated interpretationally constrained language; further suppose that I is in I, and σ is a substitution on S. Then I is a homomorphism from S into an algebra P of propositions, and σ is a homomorphism from S into S (i.e., an endomorphism on S). The functional composition of any two homomorphisms is itself a homomorphism. Thus, I ∘ σ is also a homomorphism from S into P, but, by definition, I consists of all such homomorphisms, so I ∘ σ is in I. □
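A computational gloss of ours on Definition 5.9.3 and Theorem 5.9.5 (the sentence encoding and the toy valuation are illustrative assumptions):

# A substitution sigma acts on atoms and extends homomorphically to
# compound sentences; composing any valuation v with sigma yields the
# valuation v_sigma, with v_sigma(phi) = v(sigma(phi)).

def substitute(sentence, sigma):
    """Extend the atom map sigma (a dict) to an endomorphism on sentences."""
    if isinstance(sentence, str):
        return sigma.get(sentence, sentence)
    connective, *parts = sentence
    return (connective,) + tuple(substitute(s, sigma) for s in parts)

def compose(v, sigma):
    """The valuation v composed with the substitution sigma."""
    return lambda sentence: v(substitute(sentence, sigma))

# The (r/q) example from the text: v_(r/q) assigns to q what v assigns to r.
v = lambda sentence: sentence in ('r', ('&', 'r', 'r'))   # an arbitrary toy valuation
v_rq = compose(v, {'q': 'r'})
assert v_rq('q') == v('r')
assert v_rq(('&', 'q', 'q')) == v(('&', 'r', 'r'))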
As an immediate consequence, we have the following theorem, which states that the ECSL based on any globally evaluated interpretationally constrained language is closed under substitution.

Theorem 5.9.6 Let (S, I) be a globally evaluated interpretationally constrained language, and let (S, V) be the associated evaluationally constrained language. Then (S, V) is closed under substitution.

Proof Let v be a valuation in V, and let σ be a substitution on S. We wish to show that v ∘ σ is also a valuation in V. Given the definition of V, there is an admissible interpretation I and designated set D such that, for any φ, v(φ) = t iff I(φ) ∈ D. We wish to show that there is an admissible interpretation I′ and designated set D′ such that, for any φ, (v ∘ σ)(φ) = t iff I′(φ) ∈ D′. Now, by the previous theorem, I is closed under substitution, so I ∘ σ ∈ I. Claim: (v ∘ σ)(φ) = t iff (I ∘ σ)(φ) ∈ D. Thus, let I′ = I ∘ σ, and let D′ = D. □
Corollary 5.9.7 Let (S, I) be a locally evaluated interpreted language. Then (S, V) is closed under substitution.

Proof Immediate, since a locally evaluated interpreted language is a special case of a globally evaluated interpretationally constrained one. □
Corollary 5.9.8 In order for an ECSL (S, V) to arise from an underlying globally evaluated interpretationally constrained language, it is necessary that (S, V) be closed under substitution.

Proof Immediate. □
165
(1) Suppose v E V, to show that it mises from (S, I). It suffices to show that there is an admissible interpretation I and designated set D such that, for any ¢, v(¢) = tiff I(¢) E D. Let I be the identity function, which is a (very trivial) endomorphism on S; let D = Dv; by the definition of Dv, v(¢) = tiff ¢(= I(¢)) E Dv. (2) Conversely, suppose that v arises from (S, I), to show v E V. Then there is an admissible interpretation I in I, and designated set D in {Dv : v E V}, such that, for any ¢, v(¢) = tiff I(¢) ED. We wish to show that v E V. Then I is a substitution, call it a, and D has the form Dv" for some v' in V. So v(¢) = tiff a(¢) E Dv" But this latter statement just means that v'(a(¢)) = t.1t follows that v = v'oa. But, v' E V and, by hypothesis, V is closed under substitution, so v E V. D
Earlier we promised to discuss the converse of the property of closure under substitution. We discharge our obligation now. Definition 5.9.10 Let (S, V) be an ECSL. Then (S, V) is said to be substitutionally determined if it is closed under substitution, and moreover; there exists at least one vo in V, called a root, such that, for every v in V, there exists a substitution a Oil S such that v = vo . a. Definition 5.9.11 Let (S, V) be an ECSL. Then (S, V) is said to be completely substitutionally determined if it is closed under substitution, and moreover, for evelY v, v' in V, there exists a substitution a on S such that v' = v 0 a. In other words, in a substitutionally determined ECSL, there is at least one root valuation vo, from which all valuations can be obtained via substitution. On the other hand, in a completely substitutionally determined ECSL, any valuation can serve as the root valuation from which the others m'e produced by substitution. Closure under substitution says that anything that can be accomplished by a substitution can be accomplished by a valuation. Complete determination says the converse, that anything that can be accomplished by a valuation can be accomplished by a substitution. Recall that locally evaluated interpretationally constrained languages are a special case of globally evaluated interpretation ally constrained languages, where the matrix bundle (logical space) is singular, which is to say that it admits only one designated set. Similarly, substitutional determination is a special case of closure under substitution. There is a parallel, as shown by the following theorem. Theorem 5.9.12 Let (S, V) be substitutionally determined. Then (S, V) arises from a locally evaluated interpretationally constrained language (S, I). Proof Suppose (S, V) is substitutionally determined. Then there is a root valuation va such that every v in V has the form vo . a for some substitution a. We wish to show that there is a logical matIix M = (P, D) that gives rise to V. Let P = S, and let D = {¢ : vo( ¢) = t}. Let I be the set of all endomorphisms from S to S. Claim: (S, I) gives rise to (S, V). To show this, it suffices to show: (1) every v in V arises from (S, I); and (2) every valuation arising from (S, I) is in V.
SEMANTICS
166
(1) Suppose v E V, to show that it arises from (S, 1). It suffices to show there is an admissible interpretation I such that, for any P, v( p) = f iff I( p) ED. Let I be the substitution 0" such that v = va 0 0", where va is the chosen root valuation. Clearly, v(p) = f iff va 0 O"(p) ED, since by definition of D the latter claim simply amounts to saying that va 0 O"(p) = t, and we already have that v = va 0 0". (2) Conversely, suppose that v arises from (S, 1), to show v E V. Then there is an admissible interpretation I in I such that, for any P, v(p) = tiff I(p) ED. We wish to show that v E V. Then I is a substitution, call it 0", and D = {p : vo(p) = f}. Thus, /(p) E D iff vo(O"(p» = f. It follows that v = va 0 0". But, va E V, and by hypothesis, V is closed under substitution, so v E V. 0
We conclude this section with a theorem about syntactic and semantic substitutions. Syntactic substitution was considered algebraically in Section 4.5; recall that φ[ψ/p] denotes the result of the substitution of ψ for p in the formula φ. Now we define the notion of semantic substitution.

Definition 5.9.13 Let I be an interpretation of S in a matrix M. The (semantic) substitution of ψ in the interpretation I for p (denoted I_{ψ/p}) is defined as follows:
(1) If φ = q (where q ≠ p), then I_{ψ/p}(φ) = I(φ).
(2) If φ = p, then I_{ψ/p}(φ) = I(ψ).
(3) If φ = o_C(φ1, ..., φn), then I_{ψ/p}(φ) = O_C(I_{ψ/p}(φ1), ..., I_{ψ/p}(φn)).
Substitution in an interpretation induces a substitution in the valuation determined by that interpretation in the obvious way: v_{ψ/p}(φ) = t if I_{ψ/p}(φ) ∈ D, and v_{ψ/p}(φ) = f otherwise.

Theorem 5.9.14 Let v be the valuation induced by an interpretation I of S in a matrix M. Then v(φ[ψ/p]) = v_{ψ/p}(φ); that is, the result of a syntactic substitution is the same as that of a semantic substitution.
Proof To prove this theorem it is enough to show that I(φ[ψ/p]) = I_{ψ/p}(φ), since there is a one-one correspondence between induced valuations and interpretations. We proceed by induction on the structure of φ. There are two base cases to consider: (i) if φ is p, then I_{ψ/p}(p) = I(ψ) by the above definition; but also ψ = p[ψ/p], and so I_{ψ/p}(p) = I(p[ψ/p]); (ii) if φ is q (q ≠ p), then I_{ψ/p}(q) = I(q) by the definition of I_{ψ/p}, and since q = q[ψ/p], I_{ψ/p}(q) = I(q[ψ/p]), as desired. For the inductive step, assume that φ is of the form o_C(φ1, ..., φn) and that, for each i (1 ≤ i ≤ n), I_{ψ/p}(φi) = I(φi[ψ/p]). Then I_{ψ/p}(o_C(φ1, ..., φn)) = O_C(I_{ψ/p}(φ1), ..., I_{ψ/p}(φn)) by the above definition. This further equals O_C(I(φ1[ψ/p]), ..., I(φn[ψ/p])) by the hypothesis of induction, and since I is a homomorphism, we obtain I(o_C(φ1[ψ/p], ..., φn[ψ/p])), which is just I(φ[ψ/p]). □
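To make the induction concrete, here is a minimal Python sketch of the theorem for a language with a single binary connective *, interpreted in the two-element matrix ({0, 1}, min). The tuple encoding of formulas and all function names are our own invention, not the book's.

```python
# Sketch of Theorem 5.9.14: syntactic substitution phi[psi/p] and the
# semantically substituted interpretation I_{psi/p} agree, for one
# binary connective '*' interpreted as min on {0, 1}.

def subst(phi, psi, p):
    """Syntactic substitution phi[psi/p]."""
    if isinstance(phi, str):
        return psi if phi == p else phi
    op, left, right = phi
    return (op, subst(left, psi, p), subst(right, psi, p))

def interp(phi, I):
    """The homomorphism determined by the atom assignment I."""
    if isinstance(phi, str):
        return I[phi]
    _, left, right = phi
    return min(interp(left, I), interp(right, I))   # o_*(a, b) = min(a, b)

def interp_semsub(phi, I, psi, p):
    """The semantically substituted interpretation I_{psi/p}."""
    if isinstance(phi, str):
        return interp(psi, I) if phi == p else I[phi]
    _, left, right = phi
    return min(interp_semsub(left, I, psi, p),
               interp_semsub(right, I, psi, p))

phi = ('*', 'p', ('*', 'q', 'p'))
psi = ('*', 'q', 'q')
for vp in (0, 1):
    for vq in (0, 1):
        I = {'p': vp, 'q': vq}
        assert interp(subst(phi, psi, 'p'), I) == interp_semsub(phi, I, psi, 'p')
print("syntactic and semantic substitution agree on this example")
```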
5.10 Valuation Spaces
In Sections 5.8 and 5.9, we discussed evaluationally constrained languages and how they are related to interpreted languages. As we remarked, one way to understand an interpreted language (but not the only way) is to regard the sentences as having a fixed
meaning, but not necessarily a fixed truth value. In other words, there is a single fixed interpretation function I, which assigns to each sentence φ a particular proposition I(φ), but whether this proposition is true or false is not determined; rather, there are various possible worlds, each of which gives rise to a corresponding admissible valuation on L. Propositions, as the reader will recall from Section 5.5, are occasionally identified with (or represented by) sets of possible worlds. In the present section, these ideas are combined within the framework of valuation spaces, which are defined in the following.

Definition 5.10.1 Let (S, V) be an evaluationally constrained language, and let φ be any sentence in S. We define V(φ) as follows:

V(φ) = {v ∈ V : v(φ) = t}.

In other words, V(φ) consists precisely of those valuations in V that satisfy φ.

Definition 5.10.2 Let L = (S, V) be an evaluationally constrained language, and let W be any subset of V. Then W is said to be an elementary class on L if there exists a φ in S such that W = V(φ).

Definition 5.10.3 Let L = (S, V) be an evaluationally constrained language. Then the valuation space on L is the system (V, {V(φ) : φ ∈ S}).

In other words, the valuation space on L is the set V of valuations together with the set of all the elementary classes on L. The notions of elementary class and valuation space are useful in mathematically formulating the various formal semantic notions, including validity and entailment.
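As an illustration, the following Python sketch computes the elementary classes of Definition 5.10.1 for an invented toy ECSL with three sentences and four admissible valuations; the concrete data are ours.

```python
# Elementary classes V(phi) for a toy ECSL (Definitions 5.10.1-5.10.3).

S = ['p', 'q', 'p*q']
V = [
    {'p': 't', 'q': 't', 'p*q': 't'},
    {'p': 't', 'q': 'f', 'p*q': 'f'},
    {'p': 'f', 'q': 't', 'p*q': 'f'},
    {'p': 'f', 'q': 'f', 'p*q': 'f'},
]

def elementary_class(phi):
    """V(phi): the set of admissible valuations (here, their indices)
    that satisfy phi."""
    return frozenset(i for i, v in enumerate(V) if v[phi] == 't')

# The valuation space: V together with all the elementary classes.
valuation_space = {phi: elementary_class(phi) for phi in S}
print(valuation_space)   # e.g. the class of 'p*q' is {0}
```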
Next, we consider how the valuation space on L can be used to construct an underlying interpreted language based on propositions viewed as sets of worlds. First, we identify the set V of valuations as the set of possible worlds. Next, we identify the elementary classes on L as the propositions; thus, every proposition corresponds to at least one sentence; so, in particular, not every subset of V is a proposition. Finally, we define the interpretation function I exactly as one would expect, namely, I(φ) = V(φ). In other words, the proposition expressed by φ consists precisely of those valuations (worlds) that make φ true. This is a semantics, but it is not obvious that it is an algebraic (compositional) semantics. Whether it is an algebraic semantics depends upon whether the propositions (i.e., elementary classes) form an algebra appropriate to L, and it depends upon whether the function I is a homomorphism. Since I(φ) = V(φ), the latter question boils down to whether the following holds, for each connective C of L:

(H) V(o_C(φ1, ..., φn)) = O_C(V(φ1), ..., V(φn)).

As usual, o_C is the operation on the syntactic algebra, and O_C is the operation on the semantic algebra, associated with connective C. The nature of o_C is clear, since it is defined in the usual way. But what about O_C? This is precisely the question whether the elementary classes on L form an algebra appropriate to L. For the sake of notational simplicity, let us consider a language with only one connective *, of degree 2. In this case (H) may be rewritten as follows (where, for the sake of further notational simplicity, we use the same symbol '*' to denote not only the connective, but also the syntactic algebraic operation, as well as the semantic algebraic operation):
(H*) V(φ * ψ) = V(φ) * V(ψ).

Now, one seemingly sure-fire way to ensure that (H*) is true is simply to define V(φ) * V(ψ) so that (H*) is true, as follows:

(D*) V(φ) * V(ψ) = V(φ * ψ).

In this case, (H*) follows trivially. Although (D*) certainly has (H*) as an immediate consequence, it may in fact have everything as a consequence, in virtue of being inconsistent! Thus, the question is whether (D*) is a legitimate (i.e., non-contradictory) definition. Consider the directly analogous definition that arises in the construction of a quotient algebra (Chapter 2). The corresponding definition is the following:

(d*) [a] * [b] = [a * b].

Here, [x] is the equivalence class of x, modulo the particular equivalence relation ≡. But recall that the quotient construction (and in particular, (d*)) is legitimate if and only if the equivalence relation employed is in fact a congruence relation, which is to say that it satisfies the replacement property, which for the * operation amounts to the following:

(R*) If a ≡ a', and b ≡ b', then a * b ≡ a' * b'.

Now, every valuation space yields a natural equivalence relation, which may be defined by either of the following equivalent expressions:

(E1) φ ≡ ψ iff for all v in V, v(φ) = v(ψ).
(E2) φ ≡ ψ iff V(φ) = V(ψ).

Exercise 5.10.4 Prove that (E1) and (E2) are equivalent.

Thus, the question whether (D*) is legitimate amounts to the question whether ≡, so defined, is a congruence relation on the algebra of sentences. In the special case that we are considering, this amounts to the following:

(c*) If V(φ) = V(φ'), and V(ψ) = V(ψ'), then V(φ * ψ) = V(φ' * ψ').

This is a special case of a property that we now officially define.

Definition 5.10.5 Let L = (S, V) be an evaluationally constrained sentential language. Then L is said to be compositional if for every connective C, the following obtains:

(c) If V(φi) = V(ψi), i = 1, ..., k, then V(o_C(φ1, ..., φk)) = V(o_C(ψ1, ..., ψk)).

Thus, if an evaluationally constrained language is compositional, in this sense, the interpretation function I (where I(φ) = V(φ)) is a homomorphism. Unfortunately, not every ECSL is compositional, which is to say that definitions like (D*) need not be legitimate. We now consider a notion that is considerably stronger than compositionality, namely truth-functionality, which is defined as follows.

Definition 5.10.6 Let L = (S, V) be an ECSL. Then L is said to be truth-functional if for every connective C, and for every valuation v in V, the following obtains:

(tf) If v(φi) = v(ψi), i = 1, ..., k, then v(o_C(φ1, ..., φk)) = v(o_C(ψ1, ..., ψk)).

In the special case of the * connective, we have the following:

(tf*) If v(φ) = v(φ'), and v(ψ) = v(ψ'), then v(φ * ψ) = v(φ' * ψ').

In other words, formulas with the same truth value are intersubstitutable. This is considerably stronger than compositionality, which says only that if two formulas express the same proposition (i.e., they are true under the same valuations), then they are intersubstitutable. Indeed, we have the following.

Theorem 5.10.7 (1) Every truth-functional ECSL is compositional. (2) Not every compositional ECSL is truth-functional.

Exercise 5.10.8 Prove the above theorem.
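The contrast can be made concrete. In the following Python sketch the valuation tables are invented precisely so that (c*) holds while (tf*) fails, witnessing part (2) of Theorem 5.10.7; the encoding is ours.

```python
# Checking (c*) and (tf*) for one binary connective '*' over a finite
# stock of formulas, with valuations given directly as tables.

atoms = ['p', 'q']
pairs = [(a, b) for a in atoms for b in atoms]

V = [
    {'p': 't', 'q': 'f', ('*','p','p'): 't', ('*','p','q'): 'f',
     ('*','q','p'): 'f', ('*','q','q'): 'f'},
    {'p': 'f', 'q': 't', ('*','p','p'): 'f', ('*','p','q'): 'f',
     ('*','q','p'): 'f', ('*','q','q'): 't'},
    {'p': 't', 'q': 't', ('*','p','p'): 't', ('*','p','q'): 'f',
     ('*','q','p'): 'f', ('*','q','q'): 'f'},
]

def elem(phi):                      # the elementary class V(phi)
    return frozenset(i for i, v in enumerate(V) if v[phi] == 't')

# (c*): same propositions in, same proposition out.
compositional = all(
    elem(('*', a, b)) == elem(('*', c, d))
    for a, b in pairs for c, d in pairs
    if elem(a) == elem(c) and elem(b) == elem(d))

# (tf*): same truth values in, same truth value out, at each valuation.
truth_functional = all(
    v[('*', a, b)] == v[('*', c, d)]
    for v in V for a, b in pairs for c, d in pairs
    if v[a] == v[c] and v[b] == v[d])

print(compositional, truth_functional)   # True False
```

Here p and q express different propositions, so (c*) is not violated; yet the third valuation assigns t to both p and q while assigning different values to p * p and q * q, so truth-functionality fails.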
5.11 Valuations and Logic

Having described evaluationally constrained languages and their associated valuation spaces, in the present section we turn to logical matters. As it turns out, many notions of logic can be formulated exclusively in terms of evaluationally constrained languages, without regard to the manner in which the admissible valuations are constructed (via possible worlds, via logical atlases, etc.). The relevant formal semantic notions are presented in a series of definitions. We begin with matters of terminology.

Definition 5.11.1 Let L = (S, V) be an evaluationally constrained language, let φ be an element of S, let Γ be a subset of S, let v be a valuation in V, and let W be a subset of V.
(1) v satisfies φ iff v(φ) = t.
(2) v falsifies φ iff v(φ) = f.
(3) v satisfies Γ iff v satisfies every sentence in Γ.
(4) v falsifies Γ iff v falsifies every sentence in Γ.
(5) W satisfies Γ iff every valuation in W satisfies Γ.
(6) W falsifies Γ iff every valuation in W falsifies Γ.
As usual, we construe valuations as functions that assign exactly one truth value, t or f, to every sentence in the relevant language. Next, we define a variety of logical concepts, including validity and entailment.
Definition 5.11.2 Let L = (S, V) be an evaluationally constrained language, let φ, ψ be elements of S, and let Γ, Δ be subsets of S.
(1) φ is L-valid iff V satisfies φ.
(2) φ is L-contra-valid iff V falsifies φ.
(3) Γ is L-falsifiable (L-unfalsifiable) iff there is (is not) at least one v in V that falsifies Γ.
(4) Γ is L-satisfiable (L-unsatisfiable) iff there is (is not) at least one v in V that satisfies Γ.
(5) φ L-entails ψ iff for every v in V, if v satisfies φ then v satisfies ψ.
(6) Γ L-entails φ iff for every v in V, if v satisfies Γ then v satisfies φ.
(7) Γ L-entails Δ iff for every v in V, if v satisfies Γ then v satisfies δ for some δ in Δ.
Note that we customarily drop the 'L' when the language is understood; for example, we simply say "φ is valid" rather than "φ is L-valid." In order to symbolize these various predicates, we use a single symbol, the double turnstile ⊨, ambiguously. As shown below, this ambiguity is harmless. In the following definitions, reference to L is suppressed:
(s1) ⊨ φ iff φ is valid.
(s2) φ ⊨ iff φ is contra-valid.
(s3) ⊨ Γ iff Γ is unfalsifiable.
(s4) Γ ⊨ iff Γ is unsatisfiable.
(s5) φ ⊨ ψ iff φ entails ψ.
(s6) Γ ⊨ φ iff Γ entails φ.
(s7) Γ ⊨ Δ iff Γ entails Δ.
The notions of validity and contra-validity correspond to the customary logical notions of logical truth and logical falsehood. For example, to say that a sentence φ is valid is to say that the meaning of φ is such that φ is true no matter what. Similarly, a contra-valid sentence is false no matter what. The notions of unfalsifiability and unsatisfiability are generalizations of validity and contra-validity, respectively. Specifically, to say that a set Γ of sentences is "valid" is to say that the sentences of Γ cannot all be made false. Similarly, to say that Γ is "contra-valid" is to say that the sentences of Γ cannot all be made true. Notice that a sentence φ is valid (contra-valid) iff the singleton {φ} is unfalsifiable (unsatisfiable). In addition to the notions of validity and contra-validity, as well as their generalizations, there are three notions of entailment, which we might call simple entailment, ordinary entailment, and symmetric entailment. Ordinary entailment corresponds to the customary notion of logical consequence: the notion of a conclusion following from (zero or more) premises. Simple entailment is a special case of ordinary entailment in which there is exactly one premise. On the other hand, symmetric entailment is a generalization of ordinary entailment that allows multiple conclusions (including "zero-many" conclusions) in addition to multiple premises (including "zero-many" premises).
In the definition of symmetric entailment, note carefully the quantifiers, which might seem counter-intuitive at first; Γ entails Δ iff every valuation that makes every γ in Γ true makes at least one δ in Δ true. A more symmetrical presentation of this concept is obtained by defining not entailment, but non-entailment:

(s7') Γ does not entail Δ iff there is a valuation v in V that satisfies Γ but falsifies Δ.

In other words, symmetric entailment fails precisely when a valuation v can be found that makes every premise true and every conclusion false. As we will see in Chapter 6, this particular symmetry plays a crucial role in the general completeness theorem. We remarked above that our use of the symbol ⊨ is ambiguous but harmless; this is justified by the following theorem, whose proof is left as an exercise.
Theorem 5.11.3 Let L be any sentential language, let φ, ψ be any sentences of L, let Γ be any set of sentences of L, and let V be a set of valuations on L. In the following, reference to the class V of valuations is suppressed.
(t1) ⊨ φ iff ∅ ⊨ φ.
(t2) φ ⊨ iff φ ⊨ ∅.
(t3) ⊨ Γ iff ∅ ⊨ Γ.
(t4) Γ ⊨ iff Γ ⊨ ∅.
(t5) φ ⊨ ψ iff {φ} ⊨ {ψ}.
(t6) Γ ⊨ φ iff Γ ⊨ {φ}.
Note in the above theorem that, on the left-hand side, the turnstile is used ambiguously; on the right-hand side, the turnstile refers exclusively to the symmetric entailment relation. The conventions seem clear enough: a single formula can stand in place of the corresponding singleton; an empty expression can stand in place of the empty set.
Exercise 5.11.4 Prove the above theorem.

By way of concluding this section, we note that the notions of elementary class and valuation space may be used to formulate the various formal semantic notions. We state the basic theorem, leaving the proof as an exercise.
Theorem 5.11.5 Let L = (S, V) be an evaluationally constrained language, and let (V, {V(φ) : φ ∈ S}) be the associated valuation space. Let φ, ψ be elements of S, and let Γ, Δ be subsets of S.
(1) φ is valid iff V(φ) = V.
(2) φ is contra-valid iff V(φ) = ∅.
(3) Γ is unfalsifiable iff ∪{V(γ) : γ ∈ Γ} = V.
(4) Γ is unsatisfiable iff ∩{V(γ) : γ ∈ Γ} = ∅.
(5) φ entails ψ iff V(φ) ⊆ V(ψ).
(6) Γ entails φ iff ∩{V(γ) : γ ∈ Γ} ⊆ V(φ).
(7) Γ entails Δ iff ∩{V(γ) : γ ∈ Γ} ⊆ ∪{V(δ) : δ ∈ Δ}.
In the above, reference to L is suppressed.
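For a finite valuation space the clauses of the theorem are directly computable. The following Python sketch implements clause (7) and recovers the other notions from it in the manner of Theorem 5.11.3; the three valuations are invented for illustration.

```python
# Theorem 5.11.5 via elementary classes, for a toy finite language.

from functools import reduce

S = ['a', 'b', 'c']
V = {'v1': {'a': 't', 'b': 't', 'c': 'f'},
     'v2': {'a': 't', 'b': 'f', 'c': 't'},
     'v3': {'a': 'f', 'b': 't', 'c': 't'}}

def Vc(phi):
    """The elementary class V(phi)."""
    return {name for name, v in V.items() if v[phi] == 't'}

ALL = set(V)

def entails(gamma, delta):
    """Clause (7): the meet of the premises' elementary classes lies
    inside the join of the conclusions' elementary classes."""
    meet = reduce(set.intersection, (Vc(g) for g in gamma), ALL)
    join = reduce(set.union, (Vc(d) for d in delta), set())
    return meet <= join

def valid(phi):                       # |= phi, i.e. V(phi) = V
    return entails([], [phi])

def unsatisfiable(gamma):             # gamma |=, i.e. the meet is empty
    return entails(gamma, [])

print(entails(['a', 'b'], ['c']))     # False: v1 satisfies a, b but not c
print(entails(['a'], ['b', 'c']))     # True: V(a) lies in V(b) u V(c)
```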
Exercise 5.11.6 Prove the above theorem.

5.12 Equivalence
In virtue of the variety of logical notions that can be defined in terms of evaluationally constrained languages, there is a corresponding variety of ways in which evaluationally constrained languages can be equivalent. In the present section, we examine three forms of equivalence, which we then extend to matrices, medleys, and atlases. We begin with a series of definitions, the first one being very simple.

Definition 5.12.1 Two evaluationally constrained languages (S1, V1) and (S2, V2) are said to be similar if S1 = S2.

In other words, evaluationally constrained languages are similar iff they are founded on the same set of sentences. Henceforth, we are exclusively concerned with similar evaluationally constrained languages.

Definition 5.12.2 Let L1 = (S, V1) and L2 = (S, V2) be similar evaluationally constrained languages.
(1) L1 and L2 are strictly equivalent iff for every pair of subsets Γ, Δ of S, Γ entails Δ in L1 iff Γ entails Δ in L2.
(2) L1 and L2 are strongly equivalent iff for every sentence φ in S, and every subset Γ of S, Γ entails φ in L1 iff Γ entails φ in L2.
(3) L1 and L2 are weakly equivalent iff for every sentence φ in S, φ is valid in L1 iff φ is valid in L2.

In other words, two languages are weakly equivalent if they agree concerning what sentences are valid, they are strongly equivalent if they agree concerning what single-conclusion arguments are valid, and they are strictly equivalent if they agree concerning what multi-conclusion arguments are valid.

Exercise 5.12.3 We have defined only three forms of equivalence. There are others that can be defined, which respectively pertain to contra-validity, unsatisfiability, unfalsifiability, and simple entailment. Provide these additional definitions.

Since the validity of formulas is a special case of the validity of single-conclusion arguments, which in turn is a special case of the validity of multi-conclusion arguments, we have a natural ordering of the above forms of equivalence, given in the following.

Theorem 5.12.4
(1) If two ECSLs are strictly equivalent, then they are also strongly equivalent; the converse does not hold.
(2) If two ECSLs are strongly equivalent, then they are also weakly equivalent; the converse does not hold.

Exercise 5.12.5 Prove the above theorem.

Next, we present an important theorem, which states that the strict equivalence relation among ECSLs is in fact the identity relation.

Theorem 5.12.6 Two (similar) ECSLs L1 and L2 are strictly equivalent if and only if L1 = L2.

Proof The "if" direction is trivial, so we consider the "only if" direction. We proceed contrapositively. Suppose that L1 ≠ L2, in which case V1 ≠ V2 (since L1 and L2 are similar). We wish to show that L1 and L2 are not strictly equivalent, which is to say that there are sets Γ and Δ such that Γ entails Δ in L1 but not in L2, or the other way around. Since V1 ≠ V2, there is a v in one but not the other. Without loss of generality, we may assume that there is some v in V1 but not in V2. Consider T_v = {φ : v(φ) = t} and F_v = {φ : v(φ) = f}. Clearly, T_v does not entail F_v in L1, since there is a valuation in V1 that satisfies T_v but falsifies F_v, namely v itself. On the other hand, T_v does entail F_v in L2. For suppose to the contrary; then there is a valuation v' in V2 that satisfies T_v and falsifies F_v. In this case, v' assigns t to every formula in T_v and f to every formula in F_v, just like v! Functions are extensional objects, so v' must be the same as v, but this contradicts our earlier assumption that v is not in V2. □

Having discussed general evaluationally constrained languages, we now focus our attention on evaluationally constrained (sentential) languages that arise from underlying interpreted languages. This provides corresponding definitions of equivalence for matrices, medleys, and atlases. Recall that every matrix M appropriate to a given sentential language L gives rise to an associated set V(M) of valuations, and hence gives rise to a naturally associated evaluationally constrained (sentential) language. This is formally defined as follows.

Definition 5.12.7 Let L be a sentential language, where S is the associated algebra of sentences of L, and let M be a matrix appropriate for L. Then the associated evaluationally constrained (sentential) language is the system (S, V(M)), where V(M) are the valuations induced by M.

Exercise 5.12.8 The same can be said about medleys and atlases. Provide the corresponding definitions.

Since every matrix appropriate to a language gives rise to an associated evaluationally constrained language, we can use the various equivalence relations on ECSLs to induce corresponding equivalence relations on matrices. This is formally defined as follows.

Definition 5.12.9 Let L be a sentential language, where S is the associated algebra of sentences of L. Let M1 and M2 be logical matrices appropriate for L, and let L1 = (S, V(M1)) and L2 = (S, V(M2)) be the associated evaluationally constrained languages.
(1) M1 and M2 are strictly equivalent if L1 and L2 are strictly equivalent.
(2) M1 and M2 are strongly equivalent if L1 and L2 are strongly equivalent.
(3) M1 and M2 are weakly equivalent if L1 and L2 are weakly equivalent.
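Definition 5.12.7 can be sketched computationally for a finite matrix: interpretations are just atom assignments extended homomorphically, and each induces a valuation. The matrix (a three-element chain with min as its sole operation and the top element designated) and the formula stock below are our invention.

```python
# The valuations V(M) induced by a finite matrix (Definition 5.12.7),
# restricted to a finite stock of formulas.

from itertools import product

ELEMENTS = (0, 1, 2)
DESIGNATED = {2}
ATOMS = ('p', 'q')

def ev(phi, I):
    """Extend an atom assignment I to a homomorphism (o_* = min)."""
    if isinstance(phi, str):
        return I[phi]
    _, left, right = phi
    return min(ev(left, I), ev(right, I))

formulas = list(ATOMS) + [('*', a, b) for a in ATOMS for b in ATOMS]

# One induced valuation per homomorphism, i.e. per atom assignment.
VM = {tuple('t' if ev(f, dict(zip(ATOMS, vals))) in DESIGNATED else 'f'
            for f in formulas)
      for vals in product(ELEMENTS, repeat=len(ATOMS))}
print(len(VM), "distinct induced valuations on", len(formulas), "formulas")
```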
Exercise 5.12.10 Provide the corresponding definitions for medleys and atlases.

An alternative term for strict equivalence is "logical indiscernibility"; the appropriateness of this term pertains to the fact that two matrices that are strictly equivalent agree on all logical questions, at least all questions that can be answered exclusively by reference to valuations. This is because strictly equivalent matrices give rise to the very same class of valuations. So, from the standpoint of notions defined purely in terms of valuations, strictly equivalent matrices are indistinguishable, although of course they may be metaphysically quite different. This idea is more fully developed in Chapter 6.

In order to illustrate these definitions, we offer a variety of examples, deferring detailed discussion, however, until our chapter on matrix and atlas theory. All the examples are standard Boolean matrices; they differ solely in what subsets are counted as designated. As before, xs are designated, os are undesignated. We give these examples without proof. The reader will be invited in Chapter 7, using relevant notions of homomorphic image and submatrix, to supply the proofs.

Example 1: The logical matrices in Figure 5.6, the first of which is the Frege matrix, are all strictly (and hence strongly, and hence weakly) equivalent.

FIG. 5.6. Example 1 [diagrams of Boolean matrices with designated elements marked x and undesignated elements marked o]

Example 2: The two logical matrices in Figure 5.7 are strictly equivalent to each other. On the other hand, whereas they are not strictly equivalent to any of the matrices from Example 1, they are strongly, and hence weakly, equivalent to all of them.

FIG. 5.7. Example 2 [two further Boolean matrices with designated (x) and undesignated (o) elements]

Example 3: The matrices in Figure 5.8 are weakly equivalent, but they are not strongly, and hence they are not strictly, equivalent.

FIG. 5.8. Example 3 [two Boolean matrices with designated (x) and undesignated (o) elements]

We now turn to a major theorem, which says that every medley is strictly equivalent to an atlas. In other words, for logical purposes, what can be done with a multitude of propositional algebras can equally well be done with a single propositional algebra, although it may be very big.

Theorem 5.12.11 Let M be a medley of logical matrices. Then there exists a logical atlas A strictly equivalent to M, in the sense that V(M) = V(A).

Proof Index M as (M_i)_{i∈I}, or (M_i) for short, where each M_i = (A_i, D_i). Define A = (P, (D_i)) as follows: P is the algebraic direct product of (A_i), denoted Π_i A_i. For each j ∈ I, D_j = {(a_i) : a_j ∈ D_j}. In other words, a sequence (a_i) is designated in the jth designated set D_j of the atlas A iff the jth component of (a_i) is designated in the jth matrix M_j of the original medley. Claim: A and M are logically equivalent; i.e., V(A) = V(M). It suffices to show that every v in V(A) is also in V(M), and conversely every v in V(M) is also in V(A).

(1) Assume v ∈ V(A). Then v is induced by I with respect to some D_j, where I is a homomorphism from S into Π_i A_i. By definition, v(φ) = t iff I(φ) ∈ D_j (in A). The projection function π_j is a homomorphism from Π_i A_i onto A_j, and so π_j ∘ I is a homomorphism too, from S into A_j. However, I(φ) ∈ D_j (in A) iff π_j ∘ I(φ) ∈ D_j (in M). Thus, π_j ∘ I induces the same v, hence v ∈ V(M).

(2) Assume v ∈ V(M). Then v is induced by I in some M_j = (A_j, D_j): v(φ) = t iff I(φ) ∈ D_j (in M). Define I' from S into Π_i A_i so that if I(φ) = a_j then I'(φ) = (a_j, ..., a_j, ..., a_j), an I-tuple the elements of which are all a_j's. It is easy to see that I' induces a valuation v' on the atlas such that v'(φ) = t iff I'(φ) ∈ D_j (in A) iff I(φ) ∈ D_j (in M); hence v' = v, and v ∈ V(A). □
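The direct-product construction in the proof is easy to carry out for finite matrices. The following Python sketch builds the product algebra and the family of designated sets for an invented two-matrix medley; it illustrates the construction only, not the full equivalence proof.

```python
# The product atlas of Theorem 5.12.11, for a medley of two finite
# matrices whose algebras carry a single binary operation (min).

from itertools import product

# A medley: (carrier, operation, designated set) triples.
medley = [
    ((0, 1), min, {1}),          # a two-element matrix, meet only
    ((0, 1, 2), min, {2}),       # a three-element chain, top designated
]

carriers = [m[0] for m in medley]
P = list(product(*carriers))     # carrier of the product algebra

def op(x, y):
    """Componentwise operation on the product algebra."""
    return tuple(f(a, b) for (_, f, _), a, b in zip(medley, x, y))

# One designated set D_j per matrix: a tuple is in D_j iff its j-th
# component is designated in the j-th matrix.
designated = [{x for x in P if x[j] in medley[j][2]}
              for j in range(len(medley))]

print(len(P), "elements;", [len(d) for d in designated],
      "designated in each designated set")
```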
5.13 Compactness
A key concept in formal semantics and logic is the concept of compactness, which is borrowed from general (point-set) topology. Compactness in logic is intimately tied to a related notion, finitary entailment, which we begin by briefly discussing. The characterization of entailment (logical consequence) presented in Section 5.11 is semantic, not deductive. Specifically, according to the semantic construal of entailment, to say that φ is a logical consequence of Γ is simply to say that φ is true whenever every γ in Γ is true; it is, in particular, not to say that φ can be deduced from Γ in a formal deductive system. Of course, logicians are generally not content to present a semantic account of entailment and leave it at that. They generally prefer to present a formal deductive (axiomatic) account as well. Deferring a detailed discussion of axiomatics to a later chapter, in the present chapter we simply observe a very important feature of the axiomatic account of entailment. The fundamental notion of axiomatics is the notion of a proof (or derivation), which is customarily defined to be a finite sequence of formulas subject to specified conditions. Furthermore, to say that φ can be deduced from Γ is to say that there is a proof (i.e., a finite sequence) using formulas of Γ that yields φ. But, because of its finite character, a proof of φ from Γ can use only finitely many formulas in Γ; accordingly, any proof of φ from Γ is in fact a proof of φ from a finite subset of Γ. This can be summarized in the following principle.
Principle 5.13.1 (The compactness of deductive entailment) A formula φ can be deduced from a set Γ of formulas only if φ can be deduced from a finite subset Γ' of Γ.

This remarkable feature of deductive systems naturally leads one to query whether semantic systems of entailment have a corresponding property, summarized as follows.
Principle 5.13.2 (The compactness of semantic entailment) A formula φ is (semantically) entailed by a set Γ of formulas only if φ is (semantically) entailed by a finite subset Γ' of Γ.

Alas, there is nothing about the formal semantic definition of entailment that ensures the truth of this principle. The most famous example of the failure of compactness is in (second-order) number theory. Consider the following (infinite) entailment of second-order number theory:

(E) {F(0), F(1), F(2), F(3), ...} ⊨ ∀x F(x).
Whereas this expresses a valid entailment in second-order number theory, its validity depends essentially on the infinitude of the set of premises. In particular, there is no finite subset of premises that entails the conclusion. Of course, (E) does not hold in first-order number theory, for precisely the reason that classical first-order logic is compact! This is in fact the basis of much "mischief" in modern mathematics. We now formally present the concept of compactness as it relates to evaluationally constrained languages. Afterwards, we discuss how semantic compactness can be seen to be a special case of topological compactness, in virtue of which the term "compact" is fully justified. As it turns out, there are actually four distinct notions of compactness in general formal semantics, although under commonly occurring conditions (see below) they all coincide. We refer to these forms of compactness respectively as U-compactness, I-compactness, E-compactness, and S-compactness, which are defined as follows.
Definition 5.13.3 Let L = (S, V) be an evaluationally constrained language. Then L is said to be U-compact if for any subset Γ of S, Γ is unfalsifiable only if there is a finite subset Γ' of Γ that is unfalsifiable.

In brief, every unfalsifiable set has a finite unfalsifiable subset; or, contrapositively stated, if every finite subset of a set is falsifiable, then the set itself is falsifiable. The "U" in "U-compact" refers to the term "union," the relevance of which is explained below.

Definition 5.13.4 Let L = (S, V) be an evaluationally constrained language. Then L is said to be I-compact if for any subset Γ of S, Γ is unsatisfiable only if there is a finite subset Γ' of Γ that is unsatisfiable.

In brief, every unsatisfiable set has a finite unsatisfiable subset; or, contrapositively stated, if every finite subset of a set is satisfiable, then the set itself is satisfiable. The "I" in "I-compact" refers to the term "intersection," the relevance of which is explained below.

Definition 5.13.5 Let L = (S, V) be an evaluationally constrained language. Then L is said to be E-compact if for any subset Γ of S, a formula φ is entailed by Γ only if there is a finite subset Γ' of Γ that entails φ.

The "E" in "E-compact" refers to the term "entailment," which is self-explanatory. E-compactness is the notion referred to at the beginning of the section.

Definition 5.13.6 Let L = (S, V) be an evaluationally constrained language. Then L is said to be S-compact if for any subsets Γ, Δ of S, Γ entails Δ only if there are finite subsets Γ' of Γ and Δ' of Δ such that Γ' entails Δ'.

S-compactness is the natural generalization of E-compactness that applies to symmetric entailment; hence the name. Since unfalsifiability, unsatisfiability, and ordinary entailment are special cases of symmetric entailment, one might expect the corresponding notions of compactness to be special cases of S-compactness. This is indeed the case.
Theorem 5.13.7 Let L be an evaluationally constrained language. If L is S-compact, then L is also U-compact, I-compact, and E-compact.

Exercise 5.13.8 Prove the above theorem.

The above theorem can be read as saying that S-compactness implies U-compactness, I-compactness, and E-compactness. The following theorem is to be understood in relation to this reading.
FIG. 5.9. Four forms of compactness [a diagram showing that S-compactness implies each of U-, I-, and E-compactness]

Theorem 5.13.9
(1) U-compactness does not imply S-compactness, I-compactness, or E-compactness.
(2) I-compactness does not imply S-compactness, U-compactness, or E-compactness.
(3) E-compactness does not imply S-compactness, U-compactness, or I-compactness.

Exercise 5.13.10 Prove the above theorem. (Hint: See van Fraassen (1971), where all three notions are discussed. Indeed, our own discussion of compactness owes much to van Fraassen.)

Thus, the four forms of compactness can be diagrammed as in Figure 5.9. Although the four forms of compactness are in general distinct, under special but common circumstances they all coincide. We state the relevant definitions, after which we state the theorem.

Definition 5.13.11 Let (S, V) be an evaluationally constrained language, and let φ be a sentence in S. A sentence ψ is said to be an exclusion negation of φ if for every v in V, v(ψ) = t iff v(φ) = f.

In other words, an exclusion negation of a sentence φ is any sentence whose truth value is always opposite to φ's truth value. Notice that an exclusion negation of φ need not be recognizable as such by its syntactic form; it can only be recognized by its semantic content, as characterized by the class V of admissible valuations.

Definition 5.13.12 An evaluationally constrained language (S, V) is said to be closed under exclusion negation if every sentence in S has an exclusion negation in S.

Theorem 5.13.13 Let L = (S, V) be an evaluationally constrained language that is closed under exclusion negation. If L is U-compact or I-compact or E-compact, then it is S-compact.

Corollary 5.13.14 Suppose L is closed under exclusion negation.
(1) If L is U-compact, then L is both I-compact and E-compact.
(2) If L is I-compact, then L is both U-compact and E-compact.
(3) If L is E-compact, then L is both U-compact and I-compact.

Exercise 5.13.15 Prove the above theorem. (Hint: See van Fraassen (1971).)
Of course, in classical logic, every sentence φ has an exclusion negation, namely the syntactically produced negation ¬φ. Accordingly, in classical logic, all forms of compactness collapse into a single form of compactness. Having discussed the various forms of semantic compactness, which are the same in classical logic but not in general, we now discuss topological compactness, after which we show that the former is a species of the latter. We begin with the definition of a topological space.

Definition 5.13.16 Let S be a non-empty set, and let O be a non-empty collection of subsets of S. Then O is said to be a topology on S precisely if the following conditions are satisfied:
(t1) ∅ ∈ O.
(t2) S ∈ O.
(t3) If X ∈ O and Y ∈ O, then X ∩ Y ∈ O.
(t4) If C ⊆ O, then ∪C ∈ O.

Definition 5.13.17 A topological space is a system (S, O), where O is a topology on S.

Definition 5.13.18 Let (S, O) be a topological space, and let X be a subset of S. Then X is said to be open in (S, O) if X ∈ O; X is said to be closed in (S, O) if S − X ∈ O; X is said to be clopen in (S, O) if X is both open and closed in (S, O).

Treating the elements of O as open sets, (t1)-(t4) can be read as saying that ∅ and S are open sets, that the intersection of any finite collection of open sets is itself an open set, and that the union of any collection of open sets is itself an open set. Dually, treating the complements of elements of O as closed sets, (t1)-(t4) can be read as saying the dual: ∅ and S are closed sets; the intersection of any collection of closed sets is itself a closed set; the union of any finite collection of closed sets is itself a closed set.
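On a finite carrier, conditions (t1)-(t4) reduce to a check of finitely many binary intersections and unions (closure under arbitrary unions follows from closure under binary unions together with the empty union). A minimal Python sketch, with invented data:

```python
# A finite check of conditions (t1)-(t4) of Definition 5.13.16.

from itertools import combinations

S = frozenset({1, 2, 3})
O = {frozenset(), frozenset({1}), frozenset({2, 3}), S}

def is_topology(S, O):
    if frozenset() not in O or S not in O:          # (t1), (t2)
        return False
    for X, Y in combinations(O, 2):
        if X & Y not in O or X | Y not in O:        # (t3), (t4)
            return False
    return True

print(is_topology(S, O))   # True: O is a topology on S
```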
Definition 5.13.21 A topological space (S, 0) is said to be compact if every open cover has afinite subcove!: In other words, ifC ~ 0, and U C = S, then there is afinite subset c' of C such that U C' = S.
We now tum to the question of how semantic compactness and topological compactness are related. In order to do this, we first discuss how one can convert an evaluationally constrained language into a quasi-topological object, namely a valuation space, which was defined in Section 5.10, together with the notion of an elementmy class.
Now, the collection of elementary classes on L need not form a topology on V; i.e., a valuation space need not be a topological space. On the other hand, the elementary classes can be used to construct a topology on V. This is a special case of a general theorem, stated as follows.

Theorem 5.13.22 Let S be a non-empty set, and let C be any collection of subsets of S. Let int(C) = {∩X : X ⊆ C, and X is finite}; let T(C) = {∪D : D ⊆ int(C)}. Then T(C) is a topology on S.

Exercise 5.13.23 Prove the above theorem. (Hint: Note that ∩∅ = S, and ∪∅ = ∅.)
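The two-step construction of T(C) is directly executable on a finite carrier. A Python sketch, assuming a small invented collection C (the exponential enumeration is harmless at this size):

```python
# Generating the topology T(C) of Theorem 5.13.22: first all finite
# intersections of members of C (the empty intersection is S itself),
# then all unions (the empty union is the empty set).

from itertools import chain, combinations

def powerset(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def generated_topology(S, C):
    ints = set()
    for sub in powerset(C):          # int(C): all finite intersections
        acc = frozenset(S)
        for X in sub:
            acc &= X
        ints.add(acc)
    top = set()
    for sub in powerset(ints):       # T(C): all unions of those
        acc = frozenset()
        for X in sub:
            acc |= X
        top.add(acc)
    return top

S = {1, 2, 3}
C = [frozenset({1, 2}), frozenset({2, 3})]
print(sorted(map(sorted, generated_topology(S, C))))
# [[], [2], [1, 2], [2, 3], [1, 2, 3]]
```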
In other words, to construct a topology from an arbitrary collection C of subsets of S, first one forms all the finite intersections of elements of C, and then one takes these sets and forms arbitrary unions. In this manner, one can construct a topological space from any valuation space. But before we deal with that construction, we discuss the compactness of (S, T(C)).

Definition 5.13.24 Let S be a non-empty set, and let C be any collection of subsets of S. Then C is said to have the finite union property if for any subset D of C, if ∪D = S, then there is a finite subset D' of D such that ∪D' = S.

In other words, the finite union property is simply the compactness property applied to an arbitrary collection C of subsets of a set S, irrespective of whether C forms a topology on S.

Theorem 5.13.25 Let S be a non-empty set, let C be any collection of subsets of S, and let (S, T(C)) be the topological space on S induced by C. Then (S, T(C)) is compact iff C has the finite union property.

Exercise 5.13.26 Prove the above theorem. (Hint: One half ("only if") is trivial; the other half ("if") is proved by extensive appeal to various properties of infinite union.)

The dual of the finite union property is the finite intersection property, which is related to I-compactness, and which is defined as follows.

Definition 5.13.27 Let S be a non-empty set, and let C be any collection of subsets of S. Then C is said to have the finite intersection property if for any subset D of C, if ∩D = ∅, then there is a finite subset D' of D such that ∩D' = ∅.

We now return to evaluationally constrained languages and valuation spaces. First, two simple theorems.

Theorem 5.13.28 An evaluationally constrained language is U-compact iff the associated valuation space has the finite union property.

Theorem 5.13.29 An evaluationally constrained language is I-compact iff the associated valuation space has the finite intersection property.

Exercise 5.13.30 Prove the above theorems.

Next, we define the topology associated with an evaluationally constrained language.
Definition 5.13.31 Let (S, V) be an evaluationally constrained language, and let (V, {V(φ) : φ ∈ S}) be the associated valuation space. The topological space induced by (S, V) is the topological space (V, T({V(φ) : φ ∈ S})).

We conclude this section with the theorem linking semantic and topological compactness.

Theorem 5.13.32 An evaluationally constrained language L is U-compact iff the topological space induced by L is compact.

Exercise 5.13.33 Prove the above theorem.

5.14 The Three-Fold Way
The following remarks should help clarify the role of matrices and atlases in the definition of consequence, as well as the notion(s) of (quasi-, partially) interpreted language. Elementary logic books usually make one of three suggestions regarding the nature of logical validity: (1) It is a matter of "logical form"; all arguments of the same form are valid. (2) It is a matter of "logical necessity"; in every possible world in which the premises are true, the conclusion is also true. (3) All of the above. The usual definition of validity using models fudges the distinction between (1) and (2), since a model may variously be viewed as an interpretation or as a possible world. Although books rarely distinguish the first from the second, they are clearly different. Consider the argument: Snow is white or grass is green. Therefore, grass is green. The first criterion has one changing the meaning of the atomic constituents, and assessing the actual world for truth and falsity. This change of meaning is in practice usually accomplished by a "translation" that substitutes other constituents of the appropriate grammatical type. Thus in the case in point, one can substitute the sentence "grass is purple" for "grass is green," obtaining the following argument "of the same form," in which the premise is actually true but the conclusion false: Snow is white or grass is purple. Therefore, grass is purple. The second test has one performing thought experiments about "science fiction" worlds in which grass is purple, in which case the premise is true but the conclusion is not. The third test has one doing whichever comes quickest to mind, and maybe even a combination of the two. To be somewhat more formal, let us suppose that we have an atlas A, and adopt the useful fiction that A is the set of all propositions, or at least all the propositions within some realm of discourse. Let us suppose further that there is some particular interpretation I0 that assigns to p the proposition that snow is white, and assigns to q the proposition that grass is green. Now, consider the argument p ∨ q ⊢ q.
Criterion (1) amounts to fixing on a particular designated subset D_i, e.g., that designated subset D0 which contains the propositions true in the actual world, and then considering all of the various interpretations I, e.g., an I1 that continues to assign the proposition that snow is white to p but assigns the proposition that grass is purple to q. In fact, as far as criterion (1) is concerned, one really does not need an atlas, but could get by with a matrix instead, since one only looks at a single designated subset. Thus in effect we have a locally evaluated interpretationally constrained language. Criterion (2) uses the other designated subsets, but only a single interpretation, say again I0. This is in effect to consider an interpreted language. One considers then another designated subset D1, say the one that still contains the proposition that snow is white, and hence the proposition that snow is white or grass is green, but which does not contain the proposition that grass is green (containing instead, say, the proposition that grass is purple). Criterion (3) allows both the interpretation and the designated subset to change, and this time the needed apparatus is a globally evaluated interpretationally constrained language. Thus one might reject the validity of the argument above by changing both the interpretation and the designated subset.²

² This is useful from a pedagogical point of view in that changing the interpretation does not always produce premises which are literally true and a conclusion that is literally false, but rather more likely premises that are "almost true" and a conclusion that is "almost false." So one still has to tell some little story about the world to get things to turn out right. In the example above, one says things like: let us suppose that snow never gets splattered with mud, etc., and that grass never gets sprayed with purple paint or whatever.

Incidentally, criterion (1) has a syntactic rendering. In changing the meaning, one can do it by considering all sentences of the same form. This may or may not give the same result as changing the propositions, depending upon the expressive resources of the language. To be more formal, we would say that the argument φ ⊢ ψ is valid iff for every substitution σ, if I0(σ(φ)) ∈ D0, then I0(σ(ψ)) ∈ D0. Let us consider a language for classical sentential logic without negation and which has only the two atomic sentences p and q (the example can trivially be extended to accommodate more). Let us assume further that I0(p) is the true proposition that snow is white, and that I0(q) is the true proposition that grass is green. Then p ∨ q ⊢ q would end up as valid.

All this relates to Quine's famous characterization of a logical truth. In Quine (1961, pp. 22-23), logical truth is characterized as "a statement which is true and remains true under all reinterpretations of its components other than the logical particles." This sounds like criterion (1), specialized to the case of unary assertion, and read in a semantical tone of voice. However, on other occasions Quine has said the same thing in a more syntactical tone, talking of substitutions for components other than the logical particles, as in Philosophy of Logic: "a logical truth is a truth that cannot be turned false by substituting for lexicon. When for its lexical elements we substitute any other strings belonging to the same grammatical categories, the resulting sentence is true" (Quine 1986, p. 58).

A natural question arises as to whether and when the three criteria agree with each other. This question is complicated by the fact that in assessing the validity of an argument, one should be free to quantify over all atlases (matrices), but to start with let us fix on a single atlas. Then clearly criterion (3) implies the other two. We leave as an open problem the investigation of other relationships among the three criteria both abstractly and in more concrete circumstances (say, for the case of classical logic, where the atlases in question are all Boolean algebras, and where the designated sets D_i are all the maximal filters).
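For a finite atlas the three criteria are mechanically checkable. The following Python sketch runs them on the argument p ∨ q ⊢ q over an invented four-element atlas with join interpreting "or"; the designated sets and the intended interpretation I0 are our invention, chosen so that the argument survives criterion (2) while failing (1) and (3).

```python
# The three criteria of validity over a small finite atlas.

from itertools import product

A = (0, 1, 2, 3)                       # "propositions", a four-element chain
designated_sets = [{2, 3}, {1, 3}]     # two designated subsets of the atlas
ATOMS = ('p', 'q')

def ev(phi, I):
    if isinstance(phi, str):
        return I[phi]
    _, left, right = phi
    return max(ev(left, I), ev(right, I))   # join interprets 'or'

premise, conclusion = ('v', 'p', 'q'), 'q'  # the argument  p v q |- q
I0 = {'p': 1, 'q': 3}                       # the fixed intended interpretation
interps = [dict(zip(ATOMS, vals)) for vals in product(A, repeat=2)]

def holds(I, D):
    return ev(premise, I) not in D or ev(conclusion, I) in D

c1 = all(holds(I, designated_sets[0]) for I in interps)   # vary I, fix D0
c2 = all(holds(I0, D) for D in designated_sets)           # fix I0, vary D
c3 = all(holds(I, D) for I in interps for D in designated_sets)
print(c1, c2, c3)   # False True False for this invented atlas
```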
6 LOGIC

6.1 Motivational Background
We take the view that "consequence" (what follows from what) is the central business of logic. Strangely, this theme, clear from the time of Aristotle's syllogistics, has been obscured in modern times, where the emphasis has often been on the laws of logic, where these laws are taken not as patterns of inference (relations between statements) but rather as logical truths (statements themselves). Actually it seems that Aristotle himself was at least partly responsible for starting this perverse view of logic, with his so-called three laws of thought (non-contradiction, excluded middle, and identity), but we lay the major blame on Frege, and the logistic tradition from Peano, Whitehead and Russell, Hilbert, Quine, and others. This tradition views logic along a model adapted from Euclid's geometry, wherein certain logical truths are taken as axioms, and others are then deduced from these by way of a rule or two (paradigmatically, modus ponens). Along the way there were some divergent streams, in particular the tradition of natural deduction developed by Jaśkowski (1934) and Gentzen (1934-35), and promulgated in a variety of beginning logic texts by Quine (1950), Copi (1954), Fitch (1952), Kalish and Montague (1964), Lemmon (1965), and Suppes (1957), to name some of the most influential. That this was indeed an innovative view of logic when measured against the axiomatic approach can be seen in a series of papers by Popper (e.g., 1947) concerning what he viewed as "logic without foundations." The view of logic as centered around consequence has been a major thrust of post-war Polish logic, building on earlier work by Tarski on the "consequence operator." In particular, a paper by Łoś and Suszko (1957) laid the framework for much later Polish work. Our discussion here will also utilize much of this framework, although we will not take the trouble to always tie specific ideas to the literature. There is one more influence that we must acknowledge, and it too started with Gentzen (1934-35). It is well known that in developing his "sequenzen calculus" for classical logic, he found need for "multiple conclusions." Thus he needed to extend the idea of a "singular sequent" Γ ⊢ φ (a set of premises Γ implies the sentence φ) to "multiple sequents" Γ ⊢ Δ (a set of premises Γ implies a set of conclusions Δ). This last is understood semantically as "every way in which all the premises are true is a way in which some of the conclusions are true." Alternatively, it can be explained informally that the premises are understood conjunctively, whereas the conclusions are understood disjunctively. There is a kind of naive symmetry about this that we shall make more precise below, but in the meantime we shall dub the relation we shall be discussing symmetric consequence. Now there seems to us nothing to have been written in the sky that says logic should focus on arguments with multiple conclusions. Indeed, in the work of Gentzen, the use of multiple conclusions appears to be more or less a technical device to accommodate classical logic's symmetric attitude towards truth and falsity (and this seems true of the subsequent work of Kneale (1956) and Carnap (1943, p. 151) regarding what the latter dubbed "involution"). But more recently the work of Scott (1973) and Shoesmith and Smiley (1978) has shown the theoretical utility of considering multiple conclusion arguments in a more general setting, and we shall build on their work below.
6.2 The Varieties of Logical Experience

We shall here make a quick sketch of various ways of presenting logical systems:
(1) unary assertional systems, ⊢ φ;
(2) binary implicational systems, φ ⊢ ψ;
(3) asymmetric consequence systems, Γ ⊢ φ;
(4) symmetric consequence systems, Γ ⊢ Δ.

Since it is understood here that the sets of sentences Γ, Δ can be empty or of course singletons, (1) is a special case of (3), and (3) can be viewed as a special case of (4). Also in the same sense, (2) is a special case of (3) and (4). There are clearly other variants on these notions that will occur to the reader, e.g.,
(5) unary refutational systems, φ ⊢ ("φ is refutable"),
or versions of (3) and (4) where the sets Γ, Δ are required to be finite, etc. Consider a symmetric consequence Γ ⊢ Δ. Either of Γ or Δ can be required to be finite, and having made that choice, one can further choose to restrict Γ or Δ to have a specific number of sentences. 1 or 0 are popular choices, though Aristotle would have restricted Γ to 2 and Δ to 1 for the syllogistic inferences (but both Γ and Δ to 1 for the immediate inferences). Sometimes the specific number is a maximum, as with Gentzen's (1934-35) treatment of intuitionistic logic, where Δ can have either 1 or 0 members. But for simplicity, let us restrict our attention to the two choices of requiring finiteness, or not requiring finiteness, and then go on to supplement the first choice with two specific exact numbers, 1 or 0. This gives us 2 × 2 = 4 choices for each of Γ and Δ, or then 4 × 4 = 16 choices for Γ ⊢ Δ. Logicians do not always bother to formally distinguish all of these variations because in "real life," logics tend to have the following properties: (a) compactness (and dilution), and (b) the presence of connectives that indicate structural features. By "compactness" we mean the property that if Γ ⊢ Δ is valid, then so is some Γ0 ⊢ Δ0, where Γ0 is a finite subset of Γ and Δ0 is a finite subset of Δ. By "dilution" we mean the converse (but where Γ0 and Δ0 do not necessarily have to be finite). Clearly, then, given (a), the restriction to finite sets is otiose.
By (b) we mean to refer to the phenomenon that allows one to replace

φ1, ..., φm ⊢ ψ1, ..., ψn

with first

φ1 ∧ ... ∧ φm ⊢ ψ1 ∨ ... ∨ ψn,

and then

⊢ φ1 ∧ ... ∧ φm → ψ1 ∨ ... ∨ ψn,

or that allows one to replace φ ⊢ with ⊢ ¬φ.
P -11- If/.
Here P -11- If/ is to be understood as saying that If/ is a consequence of P, and vice versa. This way of thinking of logic is not quite on all fours with the others; at least it is true that one cannot think of (6) as a special case of (4) in just the same way that one can with the others. But clearly (6) is not altogether unrelated to (2), and given the emphasis in algebraic studies on identity (equational classes), it is not too surprising that equivalence should raise its head. Before leaving the topic of the vmious ways of presenting logical systems, we make a terminological point or two. For uniformity, in the sequel we can always assume that each variety is a special case of the symmetric consequence (with empty left-hand side, etc., as needed), but we shall not always bother to explicitly respect this point. We shall also in the sequel identify a system with its consequence relation, and speak of the two interchangeably.
6.3 What Is (a) Logic?
We do not in this section presume to decide issues between classical logic and its competitors (intuitionistic logic, etc.) as to which is really logic. We just want to lay down certain natural requirements on any system that is even in the running. We shall do this for at least the main different varieties discussed in Section 6.2. In this section we shall presuppose a universe S of statements. The word "statement" has a somewhat checkered philosophical past, sometimes being used by writers to refer to sentences, sometimes to propositions, sometimes to more subtle things like declarative speech acts, etc. We here take advantage of these ambiguities to appropriate it for a somewhat abstract technical use, allowing it to be any of these things and anything else as well. Quite likely the reader will think of S as a denumerable set of sentences of English or some other natural language, but it is our intention that the elements of S may be any items at all, including ships and shoes and sealing wax, natural numbers or even real numbers. The important thing to emphasize about the elements of S is that at the present stage of investigation, we are considering them as having no internal structure whatsoever. This does not mean that they in fact have no internal structure; they may in fact be the sentences of English with their internal grammatical structure. What it does mean is that any internal structure which they do have is disregarded at the present level of abstraction.

Let us start with the familiar unary systems. Then a natural way to think of a logic L is that it is just a subset of S (the theorems). We shall write ⊢_L φ for φ ∈ L. There are various other things that one might build into a logic, perhaps something about its having axioms and rules of inference. But we take these things as having to do with the particular syntactical presentation of the proof theory of a logic, and so do not want to take these as part of the abstract notion of a logic itself. If the reader has some reservations, thinking that still there ought to be some more structure representing all the valid rules of inference, this is just a reason for the reader to prefer consequence systems of one variety or the other. Before talking about these, we shall first pause to discuss binary systems. There are really two kinds: binary implicational systems and binary equivalential systems. In both cases a logic L is understood as a set of pairs of statements. The difference between the implicational systems and the equivalential systems is brought out by further requirements. Of course we require of both kinds of system: reflexivity, (φ, φ) ∈ L; transitivity, (φ, ψ) ∈ L and (ψ, χ) ∈ L only if (φ, χ) ∈ L. But we require further of an equivalential system: symmetry, (φ, ψ) ∈ L only if (ψ, φ) ∈ L.

Turning now to consequence systems, for the asymmetric versions, a logic will be understood to again be a set of pairs, but this time the first component of each pair is a set of statements and the second component still just a statement. (For symmetric consequence the second component too will be a set of statements.)
For an (asymmetric) consequence system we further require generalized forms of reflexivity and transitivity first set down by Gentzen (1934-35) (as is customary, we shall write "Γ, Δ" in place of the more formal "Γ ∪ Δ", and "Γ, φ" in place of "Γ ∪ {φ}"). Actually we shall need the property that Gentzen called "cut" in a stronger form, so we shall first define an (asymmetric) pre-consequence relation as a relation ⊢ between sets of statements and statements satisfying the following two properties:

overlap, if φ ∈ Γ, then Γ ⊢_L φ;
cut, if Γ ⊢_L φ and Γ, φ ⊢_L ψ, then Γ ⊢_L ψ.

We require further (with Gentzen),

dilution, if Γ ⊢_L φ then Γ, Δ ⊢_L φ.

There is a strengthening of cut that we require in addition for a full-fledged consequence relation:

infinitary cut, if Γ ⊢_L φ for all φ ∈ Δ, and Γ, Δ ⊢_L ψ, then Γ ⊢_L ψ.

Clearly infinitary cut includes plain cut as a special case; and they are equivalent when the consequence relation is compact, but not in general.

Exercise 6.3.1 Show the above claims. (Hint: To show the non-equivalence give an example of a pre-consequence relation that lacks infinitary cut, i.e., is not a consequence relation.)

There is another way of treating consequence, namely as an operation on a set of statements Γ producing yet another set of statements Cn(Γ) (the set of "consequences of Γ"). This was the idea of Tarski and it has become the standard view of logic of the Polish School. Clearly it seems largely a matter of style as to whether one writes Γ ⊢ φ or φ ∈ Cn(Γ), as we pin down in the following.

Exercise 6.3.2 Show that the properties of consequence as a relation listed above (including infinitary cut) are implied by the following properties of the consequence operation:
(i) Γ ⊆ Cn(Γ);
(ii) Cn(Cn(Γ)) = Cn(Γ);
(iii) if Γ ⊆ Δ then Cn(Γ) ⊆ Cn(Δ).
Show conversely that those properties of the consequence relation, with infinitary cut, imply properties (i)-(iii) of the consequence operation.
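The correspondence is easy to exercise on a finite example. In the following Python sketch, Cn is obtained by closing under two invented rules, the relation Γ ⊢ φ is defined as φ ∈ Cn(Γ), and properties (i)-(iii) are verified by brute force; all the data are ours.

```python
# A finite sketch of Exercise 6.3.2: a consequence operation Cn and the
# relation it induces.

from itertools import chain, combinations

S = {'a', 'b', 'c', 'd'}
RULES = [({'a'}, 'b'), ({'b', 'c'}, 'd')]   # premises -> conclusion

def Cn(gamma):
    """Close gamma under the rules (a fixed-point computation)."""
    closed = set(gamma)
    changed = True
    while changed:
        changed = False
        for premises, concl in RULES:
            if premises <= closed and concl not in closed:
                closed.add(concl)
                changed = True
    return closed

def entails(gamma, phi):
    return phi in Cn(gamma)

# (i) inclusion, (ii) idempotence, (iii) monotonicity, checked on all
# subsets of the (tiny) statement set.
subsets = [set(c) for c in chain.from_iterable(
    combinations(sorted(S), r) for r in range(len(S) + 1))]
assert all(g <= Cn(g) for g in subsets)
assert all(Cn(Cn(g)) == Cn(g) for g in subsets)
assert all(Cn(g) <= Cn(d) for g in subsets for d in subsets if g <= d)
print(entails({'a', 'c'}, 'd'))   # True: a gives b, then b, c give d
```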
189
There is another way of treating consequence, namely as an operation on a set of statements Γ producing yet another set of statements Cn(Γ) (the set of "consequences of Γ"). This was the idea of Tarski, and it has become the standard view of logic of the Polish School. Clearly it seems largely a matter of style as to whether one writes Γ ⊢ φ or φ ∈ Cn(Γ), as we pin down in the following.

Exercise 6.3.2 Show that the properties of consequence as a relation listed above (including infinitary cut) are implied by the following properties of the consequence operation:

(i) Γ ⊆ Cn(Γ);
(ii) Cn(Cn(Γ)) = Cn(Γ);
(iii) if Γ ⊆ Δ then Cn(Γ) ⊆ Cn(Δ).

Show conversely that those properties of the consequence relation, with infinitary cut, imply properties (i)-(iii) of the consequence operation.

For symmetric pre-consequence, the properties of overlap, etc. must be slightly generalized. Thus we have (we begin here to omit the subscript 𝓛 on ⊢_𝓛 as understood in context):

overlap, if Γ ∩ Δ ≠ ∅, then Γ ⊢ Δ;
cut, if Γ ⊢ φ, Δ and Γ, φ ⊢ Δ, then Γ ⊢ Δ;
dilution, if Γ ⊢ Δ then Σ, Γ ⊢ Δ, Θ.

For full-fledged symmetric consequence we must again strengthen cut in some appropriate infinitary way. To this end we define the global cut property for symmetric consequence, but first we define a quasi-partition.

Definition 6.3.3 Given any set of statements Σ, let us define a quasi-partition of Σ to be a pair of disjoint sets Σ₁, Σ₂ such that Σ = Σ₁ ∪ Σ₂ (the reason why this is called a quasi-partition is that we allow one of Σ₁ or Σ₂ to be empty).

Definition 6.3.4 We say that ⊢ has the global cut property iff given any set of statements Σ, whenever not (Γ ⊢ Δ) then there exists a quasi-partition Σ₁, Σ₂ of Σ such that not (Σ₁, Γ ⊢ Δ, Σ₂).

The global cut property for symmetric consequence clearly implies the cut property, for, proceeding contrapositively, if not (Γ ⊢ Δ), then choosing Σ = {φ}, we have either not (Γ, φ ⊢ Δ) or not (Γ ⊢ φ, Δ). It can also be shown to imply the infinitary cut property, even in its stronger symmetric form:

symmetric infinitary cut, if Γ ⊢ φ, Θ (for all φ ∈ Δ), and Γ, Δ ⊢ Θ, then Γ ⊢ Θ.
Theorem 6.3.5 Let ⊢ be a symmetric consequence relation. Then ⊢ satisfies symmetric infinitary cut.

Proof Proceeding by indirect proof, let us suppose the hypotheses of symmetric infinitary cut, and yet suppose that not (Γ ⊢ Θ). The global cut property tells us that we must be able to divide up Δ into Δ₁ and Δ₂ so that not (Δ₁, Γ ⊢ Θ, Δ₂). Clearly Δ₂ must be empty, for if some φ ∈ Δ₂, then Δ₁, Γ ⊢ Θ, Δ₂ by virtue of dilution applied to the given hypothesis that Γ ⊢ φ, Θ. So all of Δ must end up on the left-hand side, that is, we have not (Δ, Γ ⊢ Θ). But this is impossible, since its opposite is just the given hypothesis that Γ, Δ ⊢ Θ. □

Note that for the case that Σ = S, the global cut property guarantees that if not (Γ ⊢ Δ), then there is a partition of S into two halves T, F such that Γ ⊆ T, Δ ⊆ F, and not (T ⊢ F) (indeed this "special case" is equivalent to the global cut property).

Exercise 6.3.6 Prove that the global cut property is equivalent to the "special case" when Σ = S (the set of all statements).
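Returning to Exercise 6.3.2, the two-way correspondence between the consequence relation and the consequence operation is finitely checkable. The sketch below is ours; the particular closure operator is an arbitrary toy.

```python
# A sketch (ours) of Exercise 6.3.2's correspondence on a finite example:
# from an operation Cn satisfying (i)-(iii) we get a relation G |- a
# defined as a in Cn(G), and conversely Cn(G) = {a : G |- a}.
from itertools import chain, combinations

S = {"p", "q", "r"}

def Cn(G):
    """Toy closure: q follows from p; r follows from q."""
    out = set(G)
    changed = True
    while changed:
        changed = False
        if "p" in out and "q" not in out:
            out.add("q"); changed = True
        if "q" in out and "r" not in out:
            out.add("r"); changed = True
    return frozenset(out)

all_subsets = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(S), k) for k in range(len(S) + 1))]

# (i) inclusion, (ii) idempotence, (iii) monotonicity:
assert all(G <= Cn(G) for G in all_subsets)
assert all(Cn(Cn(G)) == Cn(G) for G in all_subsets)
assert all(Cn(G) <= Cn(D) for G in all_subsets for D in all_subsets if G <= D)

def proves(G, a):          # G |- a  iff  a in Cn(G)
    return a in Cn(G)

assert proves({"p"}, "r") and not proves({"q"}, "p")
```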
We shall not make any further moves to justify these "Gentzen properties" as desirable features of a "logic," but we hope that they at least strike the reader as natural. Their fruitfulness will be clear from the results of the next section.
6.4 Logics and Valuations
By a valuation of a set of statements S is meant an assignment of the values t or f to the elements of S. This is just an extension of our usage in the previous chapter so as to allow the arguments of the valuation to be statements, which might be sentences (as required in the previous chapter), but might be propositions or something else entirely. It is convenient and customary to think of the assignment, let us call it v, as a function
defined on all the elements of S. This amounts to saying that each statement has a truth value, and no statement has more than one truth value, although for many purposes (reflecting attempts to model the imperfections of natural languages) these restrictions may seem over-idealistic. A valuation clearly partitions the statements in S into two halves, which we denote as T_v and F_v. Recall from the previous chapter the notion of a semi-interpreted language as a pair (S, V), where V is some set of valuations of the algebra S of sentences of the language L, which we call the admissible valuations. For a while at least, it is unimportant that the sentences have any internal structure or that they are sentences at all, so we shall replace L with a set of "statements" S (recall this is just any non-empty set), and talk of a semi-interpreted semi-language (the reader is assured that we will not employ this barbarism often). Note that every semi-interpreted semi-language has a natural symmetric consequence relation:
Γ ⊢ Δ iff for every v ∈ V, if v assigns t to every member of Γ, then v assigns t to some member of Δ.
We write ⊢(V) for this relation (also in accord with the usual conventions about functional notation, we sometimes write ⊢_V).
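For a finite S the relation ⊢(V) can be computed by brute force. The following sketch is ours, with an arbitrary choice of admissible valuations.

```python
# A sketch (ours): the symmetric consequence relation determined by a
# set of valuations, per the definition just given.
from itertools import product

S = ["p", "q", "r"]

def entails(V, gamma, delta):
    """gamma |-(V) delta: every v in V making all of gamma true
    makes some member of delta true."""
    return all(any(v[d] for d in delta)
               for v in V if all(v[g] for g in gamma))

# Admissible valuations: all assignments in which r is true.
V = [dict(zip(S, bits)) for bits in product([True, False], repeat=3)
     if dict(zip(S, bits))["r"]]

assert entails(V, set(), {"r"})          # |- r
assert entails(V, {"p"}, {"p", "q"})     # overlap on the right
assert not entails(V, {"p"}, {"q"})
```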
Exercise 6.4.1 Show that a class of valuations of a set of statements S always gives rise to a symmetric consequence relation on the set S.

Not only does a class of valuations determine a symmetric consequence system, but in a similar fashion it also determines an asymmetric consequence system, a binary implicational system, a unary assertional system, a left-sided unary refutational system, an equivalential system, etc. (All of the explicitly listed systems, save the equivalential, are just special cases of the symmetric consequence obtained by requiring the right-hand set to be a singleton, both left- and right-hand sets to be singletons, the left-hand set to be empty and the right-hand set to be a singleton, etc.) Thus to consider explicitly just one more case that interests us, a unary assertional system can be defined so that ⊢_V φ iff v(φ) = t for every valuation v ∈ V.

The important thing about a consequence relation of any kind is not only that a class of valuations determines the consequence relation, but also that the converse is true. Where ⊢ is a symmetric consequence relation, we shall say that a valuation v respects ⊢ if there is no pair of sets of statements Γ, Δ such that Γ ⊢ Δ, and yet v(Γ) = t and v(Δ) = f. (We write v(Γ) = t to mean that v(γ) = t for all γ ∈ Γ, and we similarly write v(Δ) = f to mean that v(δ) = f for all δ ∈ Δ.) Analogously, when ⊢ is an asymmetric consequence relation, respect amounts to there being no set of statements Γ and statement φ such that Γ ⊢ φ, while v(Γ) = t and v(φ) = f. And when ⊢ is unary assertional consequence, respect just amounts to there being no statement φ such that ⊢ φ and yet v(φ) = f. (We leave it to the interested reader to figure out the appropriate extensions to other varieties of logical presentation.) Given a consequence relation of any kind, we define V(⊢) = {v: v is a valuation respecting ⊢}. (We also write this as V_⊢.)
We speak of a class of valuations V being sound with respect to a consequence relation ⊢ (of any kind) just when every v ∈ V respects ⊢, i.e., ⊢ ⊆ ⊢_V. And we speak of a consequence relation ⊢ being complete with respect to a class of valuations V just when conversely ⊢_V ⊆ ⊢. Intuitively, this amounts to ⊢ being strong enough to capture all of the inferences (of the appropriate kind: unary, asymmetric, symmetric, etc.) valid in V. Soundness and completeness then just mean that ⊢ = ⊢_V. Naturally, then, the question arises as to when a logical system is both sound and complete with respect to some class of valuations. This can be expressed informally as the question of when a logic has a semantics. The next two sections will address this question for asymmetric and symmetric consequence systems, respectively. We shall here and now dispose of the question for unary assertional systems, since this is such an easy and special case. Thus given a logic 𝓛 and its unary consequence relation ⊢, it is easy to show that 𝓛 is always sound and complete with respect to a singleton class of valuations. Thus simply define V = {v}, where v is the "characteristic" function for ⊢, i.e., v assigns t to just those sentences ψ for which ⊢ ψ. Not only is 𝓛 sound and complete with respect to {v}, but clearly then also with respect to V_⊢ (the set of valuations respecting ⊢, of which v is the one that "does all the work").
Theorem 6.4.2 (Completeness for unary assertional consequence) Let ⊢ be a unary assertional consequence relation. Then ⊢ is sound and complete with respect to a class of valuations V = {v}, where v is the characteristic function for ⊢. Thus ⊢ is also sound and complete with respect to V_⊢.

We shall not find things so easy in dealing with asymmetric and symmetric consequence (these are progressively more difficult). We shall find that we need to place a few natural conditions on ⊢ so as to get any result at all, and we shall find that we cannot get by with singletons. We shall practice first on the "toy" case of binary consequence. The reader will find that several main themes will be introduced in this context.
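The "one valuation does all the work" observation is almost a one-liner; the following sketch (ours) just spells it out.

```python
# A sketch (ours) of Theorem 6.4.2's characteristic valuation:
theorems = {"r"}                      # |- r, and nothing else

def v(statement):                     # characteristic function of |-
    return statement in theorems

# Soundness and completeness with respect to V = {v} are immediate:
assert all(v(s) for s in theorems)               # every theorem gets t
assert not any(v(s) for s in {"p", "q"})         # non-theorems get f
```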
6.5 Binary Consequence in the Context of Pre-ordered Sets
Let us fix throughout this section a set S, the elements of which we think of as "statements" (sentences or propositions). There are two ways to think of determining a binary implication relation on S. The first, which we shall call the direct way, is to simply postulate a binary relation of implication ≤ on S. It is natural to give ≤ the properties of a pre-ordering, i.e., reflexivity and transitivity. Thus by an (abstract) binary implication relation we shall simply mean a pre-order ≤. Anti-symmetry (the only ingredient missing from a partial ordering) also springs to mind. But surely when the elements of S are sentences there is no reason to think that two elements which co-imply one another must be identical. Thus consider p and p ∧ p, the latter of which clearly contains a conjunction sign that the former does not (though Martin and Meyer (1982) show that an interesting "minimal" system of pure implicational logic proposed by Belnap has precisely the coincidence of logical equivalence and identity). Even when the elements are propositions, the situation is somewhat dicey, since in the philosophical literature there are some conceptions of propositions that allow for finer-grained distinctions than mere logical equivalence.

There is another (indirect) way to induce a binary implication relation on the set S. Let us suppose that we have a set V of valuations, i.e., mappings of S into {t, f}. We can define the relation a ≤_V b iff for every v ∈ V, if v(a) = t then v(b) = t. Valuations are just characteristic functions picking out subsets of S, so alternatively we could start with a set J of subsets of S (we call T ∈ J a truth set), and define a ≤_J b iff for every T ∈ J, if a ∈ T then b ∈ T. Given a valuation v, we shall denote its truth set ({a: v(a) = t}) by T_v, and given a truth set T, we shall denote the valuation which assigns t to its members (and f to its non-members) by v_T.

The direct way of specifying an implication relation is most familiar in a proof-theoretic context, and the indirect way is certainly reminiscent of a more model-theoretic (semantical) context. The question then naturally arises as to whether and when these two ways agree. This is the subject we now address.

We begin somewhat abstractly. Let J be a collection of subsets of S, and R be a binary relation on S, characterized by the property that

(1) aRb iff for every T ∈ J, if a ∈ T then b ∈ T.

It is easy to check that any relation R so characterized is reflexive and transitive, i.e., a pre-ordering. But what
about the converse? Again let us be abstract. For any binary relation R (possibly not a pre-order), let [a) denote the R-image of a, {x: aRx}. Observe that:

(2) aRb iff b ∈ [a).
Now let R be a pre-ordering, i.e., a reflexive and transitive relation. We shall tease out the consequences of these properties for (2), denoting R now by the more suggestive ≤. Given a set C ⊆ S, we shall say that C is a cone on S just when whenever a ∈ C and a ≤ b, then b ∈ C (C is "closed upward"). Note that since ≤ is transitive, [a) is always a cone (called the principal cone determined by a). Also, since ≤ is reflexive, it is always the case that a ∈ [a). By virtue of (2), the fact that a ∈ [a), and the fact that [a) is a cone, it is clear that the following holds from right to left (the left-to-right direction is immediate from the fact that cones are closed upward). By "C(≤)" we shall mean the set of all cones on S.

(3) a ≤ b iff for every C ∈ C(≤), if a ∈ C then b ∈ C.

Since the right-to-left direction uses only the principal cone [a), we also have the restriction of (3) to principal cones:

(3′) a ≤ b iff for every x ∈ S, if a ∈ [x) then b ∈ [x).

Exercise 6.5.1 (Scott) A collection T of subsets of S is called a topology on S if both ∅ and S are in T, and if T is closed under finite intersections and arbitrary unions. The members of T are called the open sets. Show that C(≤) is always a topology on S. A "T₀ topology" is one having the weakest "axiom of separation," to wit, that if a ≠ b, then there exists an open set that "separates them," i.e., contains one of a and b as member but not the other. Show that partial orderings on S determine T₀ topologies on S, and conversely.
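Both halves of (3) can be confirmed by brute force on a small pre-ordered set; the sketch below is ours, using a three-element chain purely for concreteness.

```python
# A sketch (ours): cones of a pre-order and the equivalence (3).
from itertools import chain, combinations

S = {1, 2, 3}
leq = {(a, b) for a in S for b in S if a <= b}   # a chain, for concreteness

all_subsets = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(S), k) for k in range(len(S) + 1))]

def is_cone(C):
    return all(b in C for a in C for (a2, b) in leq if a2 == a)

cones = [C for C in all_subsets if is_cone(C)]

def principal_cone(a):
    return frozenset(x for x in S if (a, x) in leq)

assert all(is_cone(principal_cone(a)) for a in S)

# (3): a <= b iff every cone containing a contains b.
for a in S:
    for b in S:
        via_cones = all(b in C for C in cones if a in C)
        assert ((a, b) in leq) == via_cones
```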
Note that mathematically, (3) gives us a way of describing a pre-ordering "by indirection." We need make no explicit reference to it, but rather refer to a collection of subsets of S (the cones). Indeed, (3) can be regarded as a kind of representation theorem, showing how the abstract notion of a pre-ordering can be "cashed out" as a concrete relation defined using a collection of subsets. This is made precise in the following exercise.

Exercise 6.5.2 Let L = (S, ≤) be a pre-ordered set. Show that the mapping h(a) = {C ∈ C(≤): a ∈ C} has the property that a ≤ b iff h(a) ⊆ h(b). Show further that when ≤ is a partial ordering, h is one-one, and so h is an embedding of L into ℘(C(≤)) partially ordered by ⊆ (restricted to C(≤)).

Remark 6.5.3 The above exercise is a prelude to one of our main themes, to wit, that representation theorems allow propositions viewed abstractly to be regarded more concretely as "UCLA propositions," i.e., sets of "possible worlds," or more literally, sets of their surrogates, the "truth sets" (the cones in this toy version).

Philosophically (and this is the same theme in a different register), (3) can be regarded as an abstract completeness theorem, showing that for each binary implication relation it is always possible to characterize it by way of a semantics (the truth sets being the cones). By a semantics we shall mean a set of valuations. We shall say that V is sound with respect to ≤ if whenever a ≤ b, then for every v ∈ V, v(a) = t only if v(b) = t. We shall say that ≤ is complete for V just when the converse holds. Note that soundness just amounts to the truth sets determined by the members of V being cones; it is just the left-to-right half of (3). Completeness just amounts to the other half of (3). The following is thus apparent.

Theorem 6.5.4 (Abstract completeness for binary implication) Given a binary implication relation ≤, i.e., a pre-ordering, it is always possible to find a semantics, i.e., a set of valuations V, with respect to which it is both sound and complete.

Let us look at the above result yet another way. By J(≤) let us mean the collection of subsets T of S that respect ≤ in the sense that if a ≤ b, then whenever a ∈ T then b ∈ T (this is obviously just the same as C(≤)). Recall that the "dual" notion ≤(J) was defined via (1). The next exercise is easy.

Exercise 6.5.5 For the following assume that ≤₁ and ≤₂ are two pre-orderings and that J₁ and J₂ are two collections of truth sets (all in the context of the set S). Then the following "Galois properties" hold:

(a) If ≤₁ ⊆ ≤₂, then J(≤₂) ⊆ J(≤₁).
(b) If J₁ ⊆ J₂, then ≤(J₂) ⊆ ≤(J₁).
(c) ≤ᵢ ⊆ ≤(J(≤ᵢ)) (i = 1, 2).
(d) Jᵢ ⊆ J(≤(Jᵢ)) (i = 1, 2).

The abstract completeness theorem then can be viewed as saying that (c) may be strengthened to an identity (i.e., that the pre-ordering relations are "Galois closed").

Corollary 6.5.6 Given a binary implication relation ≤,

(4) ≤ = ≤(J(≤)).
We can now wonder about the dual:

(5) J = J(≤(J)).

But it is easy to see in general that we need not have

(5′) J(≤(J)) ⊆ J.
Thus, to take perhaps the simplest (and most extreme) example, let S be a non-empty set with at least two members, and define J to consist of just the singletons of members of S. It is easy to see that ≤(J) is just the identity relation restricted to S. Yet it is also easy to see that J(≤(J)) = ℘(S). Note that this example works even in the case where S has but one member, since ∅ ∈ ℘(S) but ∅ ∉ J. However, under some definitions one might exclude the empty set as a truth set, so we thought it best to give a more general example.

The moral of the above example is that in general two different collections of sets J₁ and J₂ will determine the same pre-ordering, and at most one of them can be the collection of all truth sets that respect the pre-order. Given (3′), it is clear that the principal cones always suffice to determine the pre-order, and that any additional truth sets are "just gravy."

We think that (5) is an important property of a logic, maybe just as important as its dual, completeness (this is yet another of our main themes). We call this property absoluteness since it amounts to a logic having at most one semantics. Absoluteness is the appropriate analog for logics of the much studied property of theories called "categoricity." One can expect of some theories that they be categorical in the sense of having abstractly only one model. This is an unreasonable expectation of a logic (which might be the logical basis of many different theories), but it still might be the case that abstractly the logic has only one class of models, and this is just absoluteness.

It is clear from the above discussion that a logic viewed just as a binary implication relation will essentially never be absolute. To anticipate, it turns out that this is true also of a logic viewed as an asymmetric consequence relation. But on a still richer conception of a logic (symmetric consequence), we always have absoluteness (as well as completeness). But before this story is told, we shall first examine asymmetric consequence.
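The singleton example just given can be verified mechanically; the sketch below is ours.

```python
# A sketch (ours): two different collections of truth sets determine the
# same pre-order, so (5') fails.
from itertools import chain, combinations

S = {0, 1}
powerset = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(S), k) for k in range(len(S) + 1))]

def order_from(J):                     # <=(J), per (1)
    return {(a, b) for a in S for b in S
            if all(b in T for T in J if a in T)}

def truthsets_from(leq):               # J(<=): all sets respecting <=
    return {T for T in powerset
            if all(b in T for (a, b) in leq if a in T)}

J = {frozenset({0}), frozenset({1})}   # just the singletons
leq = order_from(J)
assert leq == {(0, 0), (1, 1)}         # the identity relation
assert truthsets_from(leq) == set(powerset)   # all of P(S): far more than J
```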
6.6 Asymmetric Consequence and Valuations (Completeness)
Since it gets confusing to talk about both the asymmetric and symmetric consequence relations at the same time, we shall for a while talk only about the asymmetric
consequence relations. We shall prove certain results about them, notice that these results are flawed in one important way, and then (in Section 6.8) see that the same results can be proven more smoothly for the symmetric consequence relations.

Theorem 6.6.1 (Completeness for asymmetric consequence) Let ⊢ be an asymmetric consequence relation. Then there is a set of valuations V such that Γ ⊢ φ iff every v in V is such that if v(Γ) = t then v(φ) = t.

Proof Let V be the set of valuations that respect ⊢. Then the direction from left to right is a matter of definition. We prove from right to left by contraposition. Thus let us suppose that not (Γ ⊢ φ). Define Cn(Γ) to be the set of ψ such that Γ ⊢ ψ. Then define v so that it assigns t to every member of Cn(Γ) and f to all other statements. Clearly φ ∉ Cn(Γ), and so we know that v(Γ) = t while v(φ) = f. The only problem remaining is that we do not know that v respects ⊢. Let us suppose for the sake of reductio that it does not. Then there must exist a set of statements Δ and a statement ψ such that Δ ⊢ ψ, and yet v(Δ) = t while v(ψ) = f. Clearly then Δ ⊆ Cn(Γ), and so, by dilution, Cn(Γ) ⊢ ψ. It is then clear from the infinitary cut property that ψ ∈ Cn(Γ), which means that v(ψ) = t, contrary to the supposition that v was a falsifying valuation for Δ ⊢ ψ. This completes the proof. □
Consider now the following relationships between consequence relations and sets of valuations:

(i) If ⊢₁ ⊆ ⊢₂ then V(⊢₂) ⊆ V(⊢₁).
(ii) If V₁ ⊆ V₂ then ⊢(V₂) ⊆ ⊢(V₁).
(iii) V ⊆ V(⊢(V)).
(iv) ⊢ ⊆ ⊢(V(⊢)).
These are the relationships that characterize a so-called Galois connection. Relationships (i) and (ii) are easy to see, and roughly amount to the fact that if one condition is weaker than a second, then there are fewer things satisfying the second than the first. Relationships (iii) and (iv) are less transparent, even though they really involve no more than unpacking definitions. To prove (iii), let us assume that v ∈ V, and yet that v is not in V(⊢(V)). Then for some Γ and φ, we must have Γ ⊢(V) φ and yet v(Γ) = t and v(φ) = f. But this is a contradiction, since Γ ⊢(V) φ means precisely that this does not happen for any v ∈ V.

The proof of (iv) is analogously straightforward. Thus let us suppose that (Γ, φ) ∈ ⊢, i.e., Γ ⊢ φ, and yet (Γ, φ) is not a member of ⊢(V(⊢)), i.e., some valuation v in V(⊢) assigns t to every member of Γ and yet assigns f to φ. But this contradicts the fact that v respects ⊢, which forbids precisely such an assignment when Γ ⊢ φ. Hence (Γ, φ) must after all be a member of the relation ⊢(V(⊢)).

We now examine the reason why the inclusion (iii) may not be reversed for asymmetric consequence. Let us consider S = {p, q, r}. Let us consider two classes of valuations of S, V₁ and V₂. Any valuation in V₁ which assigns t to p must assign f to q and vice versa, and every valuation in V₁ must assign t to r. (Think of q as ¬p and r as p ∨ ¬p.) V₂ does not care at all about the relationship of the values assigned to p and q, but still requires that r always be assigned the value t. It is easy to see that if V₁ and V₂ are defined by these conditions, then they determine the very same asymmetric consequence relation, and yet they do not determine the very same symmetric consequence relation (hint: for this last consider ⊢ p, q). Clearly V₁ ⊂ V₂. This means that if we look at the relation ⊢(V₁), there are valuations v ∈ V₂ that satisfy it (e.g., those that give p and q the same truth value), but which are not in V₁.

This example may seem rather abstract, but the reader is assured that we shall find a "real-life" example of this phenomenon in Chapter 7 when we contrast the supervaluational semantics of classical logic with the ordinary semantics of classical logic.

Exercise 6.6.3 Fill in any missing details in the above argument.
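The V₁/V₂ example can also be checked by brute force; the following sketch is ours and follows the construction above (with q playing the role of ¬p and r of p ∨ ¬p).

```python
# A sketch (ours): same asymmetric consequence, different symmetric
# consequence, for the two classes of valuations described above.
from itertools import product, chain, combinations

S = ["p", "q", "r"]
all_vals = [dict(zip(S, bits)) for bits in product([True, False], repeat=3)]

V2 = [v for v in all_vals if v["r"]]                 # r always true
V1 = [v for v in V2 if v["p"] != v["q"]]             # q behaves like not-p

def sym(V, gamma, delta):
    return all(any(v[d] for d in delta)
               for v in V if all(v[g] for g in gamma))

def asym(V, gamma, phi):
    return sym(V, gamma, {phi})

subsets = [set(c) for c in chain.from_iterable(
    combinations(S, k) for k in range(len(S) + 1))]

# The asymmetric relations agree on every (premise-set, conclusion) pair:
assert all(asym(V1, G, a) == asym(V2, G, a) for G in subsets for a in S)

# ... but the symmetric relations differ:  |- p, q  holds in V1 only.
assert sym(V1, set(), {"p", "q"}) and not sym(V2, set(), {"p", "q"})
```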
6.7 Asymmetric Consequence in the Context of Pre-ordered Groupoids
A structure L = (S, ≤, ∘) is a pre-ordered groupoid iff (S, ≤) is a pre-ordered set, and ∘ is a binary operation on S that is isotonic in each of its arguments with respect to ≤, i.e.,

(1) if a ≤ b, then x ∘ a ≤ x ∘ b and a ∘ x ≤ b ∘ x.

Logically speaking, pre-ordered groupoids can be thought of as sets of statements ordered by implication and in addition closed under some operation (we shall call it "fusion") which generalizes conjunction.

Given some set of statements Γ, we shall define the fusions of elements of Γ, Fus(Γ), inductively as follows:

(i) If γ ∈ Γ, then γ ∈ Fus(Γ).
(ii) If γ, δ ∈ Fus(Γ), then γ ∘ δ ∈ Fus(Γ).

Given a set Γ ⊆ S and a ∈ S, it is natural to "extend" the implication relation by way of the definition:

(2) Γ ≤ a (in English, a is an explicit consequence of Γ) iff there is some γ ∈ Fus(Γ) such that γ ≤ a.

It is important to realize that this relation is of mixed type level, but we would expect the notational ambiguity to be justified by the equivalence:

(2′) {a} ≤ b iff a ≤ b.

Somewhat unpleasantly, (2′) does not hold from left to right without an additional postulate. The reason is that our definition allows that {a} ≤ b because some "power" of a is less than or equal to b, say a ∘ a or something even more complex like a ∘ (a ∘ a). The quick fix (which we adopt as an additional postulate below when needed) is to assume the principle of square-increasingness:

(3) a ≤ a ∘ a.

This gives enough idempotence so that (with transitivity) we do get (2′). A more elegant solution might be to work with "multisets" instead of sets, as in Meyer and McRobbie (1982). Multisets are just like sets except that they allow for multiple "occurrences" of the same element (the point being that {a} in (2′) could be viewed as a multiset with just one occurrence of a). Incidentally, note that at least initially we do not let Γ be "empty," since we do not know what the fusion of no elements might be.

We shall show that explicit consequence is an asymmetric consequence relation (this holds even without square-increasingness). First observe that compactness is for free, since γ is a finite fusion. So we need only to examine the overlap, dilution, and cut properties (infinitary cut following from compactness). For overlap, we need that if a ∈ Γ, then Γ ≤ a, but this is just a consequence of reflexivity via (2). For dilution, we require nothing at all, since clearly if there is some fusion of elements in Γ that implies a, then a fortiori there is that same fusion in Γ ∪ Δ to do the job. Cut is a bit more complicated. Let us recall that it requires that if Γ ≤ x and Γ ∪ {x} ≤ a, then Γ ≤ a. The problem is caused by the second premise, since the fusion of hypotheses chosen from Γ ∪ {x} may contain no item from Γ, or may fail to contain x (it cannot do both because of our prohibition against empty fusions). The most general case may be displayed as follows (here γ₁, γ₂(x) ∈ Fus(Γ), and γ₂(γ₁) represents replacing one or more occurrences of x in γ₂(x) by γ₁):

(asymmetric) algebraic cut, if γ₁ ≤ x and γ₂(x) ≤ a, then γ₂(γ₁) ≤ a.

This is easy to establish by induction on the construction of γ₂(x), using monotonicity (the case when γ₂(x) is just x is just transitivity, and the case where x does not occur in γ₂ is just to repeat the second premise). Summarizing this discussion, we have the following theorem.
Theorem 6.7.1 Let (S, ≤, ∘) be a pre-ordered groupoid. Then the relation of explicit consequence Γ ≤ a is a compact asymmetric consequence relation.
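For a concrete feel for explicit consequence, here is a sketch of ours in the pre-ordered groupoid of positive integers with fusion as addition (isotonic and square-increasing); since Fus(Γ) is infinite, the search is truncated at a fixed nesting depth.

```python
# A sketch (ours): explicit consequence in a concrete pre-ordered groupoid,
# the positive integers with x o y = x + y and the usual ordering.
def fusions(gamma, depth=3):
    """Values of fusions of members of gamma, up to a fixed nesting depth;
    Fus(gamma) itself is infinite, so we truncate for the demo."""
    fus = set(gamma)
    for _ in range(depth):
        fus |= {x + y for x in fus for y in fus}
    return fus

def explicit(gamma, a, depth=3):
    """gamma <= a iff some fusion of members of gamma is <= a."""
    return any(g <= a for g in fusions(gamma, depth))

assert explicit({5, 7}, 5)        # overlap: 5 is in {5, 7}
assert explicit({5, 7}, 12)       # 5 o 7 = 12 <= 12
assert not explicit({5, 7}, 4)    # every fusion is at least 5
```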
Now we shall examine elliptical consequence, where we in effect remove the prohibition against empty fusions in the definition of Γ ≤ a. Let us pick out some set T of elements which we regard as "axioms." This is in effect to change the inductive definition of Fus(Γ) in its base clause so as to allow T ⊆ Fus_T(Γ). We can then define Γ ≤_T a to mean that γ ≤ a for some γ ∈ Fus_T(Γ) (or equivalently, γ ∈ Fus(Γ ∪ T)). In context, when T is understood, we symbolize elliptical consequence by Γ ≤_E a. Note that elliptical consequence not only allows for ∅ ≤_E a when t ≤ a for some t ∈ Fus(T), but also permits a ≤_E b, when (say) a ∘ t ≤ b for some such t. Explicit consequence can then be thought of as requiring that all of the statements, even the axiomatic ones, used in the "derivation" of a be mentioned, and elliptical consequence can be thought of as allowing their suppression. In special situations (which actually occur quite commonly with "real-life" logics) it may be possible to replace the role of T with a single distinguished element t, thought of as being something like the strongest truth (t ≤ t′, for all t′ ∈ T). The following places elliptical consequence in a more general context, where the consequence relation need not arise "algebraically" by way of fusions.

Lemma 6.7.2 Let ⊢ be a (compact) asymmetric consequence relation, and let T be a set of statements. Then the "elliptical consequence" relation Γ ⊢_E a, defined as Γ ∪ T ⊢ a, is also a (compact) asymmetric consequence relation.

Proof Overlap, dilution (and compactness) are self-evident. Infinitary cut is left as an exercise. □
Corollary 6.7.3 Let (S, ≤, ∘, T) be a pre-ordered groupoid with a distinguished subset T. Then the elliptical consequence relation ≤_E (which in effect allows the empty fusion of premises) is a compact asymmetric consequence relation.

Notice that {a} ≤ b and {a} ≤_E b need not agree. One ingredient obviously needed to make them agree is that the elements of T function as (upper) identity elements:

(4) If t ∈ T, then a ≤ t ∘ a and a ≤ a ∘ t.

Another ingredient is that the elements of T are all top elements:

(5) If t ∈ T, then for all a ∈ S, a ≤ t.

(This last obviously implies that T has only one element when ≤ is a partial ordering.) We record this and related facts, which we leave as exercises.

Fact 6.7.4 If all of the members of T are both top elements and upper identity elements, then for any non-empty set of statements Γ,

Γ ≤ a iff Γ ≤_E a.

In particular, {a} ≤ b iff {a} ≤_E b.

Fact 6.7.5 That ∘ is square-increasing is a necessary and sufficient condition for the equivalence:

a ≤ b iff {a} ≤ b.

Fact 6.7.6 The conjunction of the conditions of Facts 6.7.4 and 6.7.5 is a necessary and sufficient condition for the chain of equivalences:

a ≤ b iff {a} ≤ b iff {a} ≤_E b.

We also leave as an exercise the proof of the following.

Corollary 6.7.7 Let (S, ≤, ∘, T) be a pre-ordered groupoid with a distinguished subset T, and let the elements of T be upper identity elements. Then the relation of mixed consequence Γ ≤_M b, defined as either Γ ≤ b or ∅ ≤_E b (either Γ explicitly entails b or else b is a "theorem"), is a compact asymmetric consequence relation. Indeed, it is the same as elliptical consequence.

A structure L = (S, ≤, ∧) is a (lower) pre-semilattice iff (S, ≤) is a pre-ordered set, and ∧ is a greatest lower bound with respect to ≤ (a pre-semilattice would be a semilattice were ≤ anti-symmetric). Clearly a pre-semilattice is a pre-ordered groupoid, and one that automatically satisfies square-increasingness because of idempotence. Furthermore, if there is a top element 1, then it functions as an identity, and a fortiori as an upper identity. This gives the right result for empty conjunctions, for ∧∅ must be the greatest among the lower bounds of ∅, but vacuously every element is a lower bound of ∅.

Theorem 6.7.8 Let (S, ≤, ∧) be a pre-semilattice. Then the relation of explicit consequence Γ ≤ a is a compact asymmetric consequence relation. Further, if there is a top element 1, then the relation of elliptical consequence ≤_{{1}} and the relation of mixed consequence ≤_M are also compact asymmetric consequence relations. Further, for non-empty sets of premises Γ, the explicit, the elliptical, and the mixed consequences are the same. In particular, {a} ≤ b, {a} ≤_E b, {a} ≤_M b all agree with each other and with a ≤ b.

Exercise 6.7.9 Prove the above theorem.
6.8 Symmetric Consequence and Valuations (Completeness and Absoluteness)
As we just saw in the completeness theorem of Section 6.6, every logic specified by an asymmetric consequence relation has a set of valuations that characterize the logic. In this sense we can talk of every such logic as having a semantics. But as we saw in the example at the end, the embarrassment is that an asymmetric consequence logic can have more than one semantics. We shall see in this section that a symmetric consequence logic has precisely one semantics, i.e., there is a one-one correspondence between a symmetric consequence relation and a class of valuations that characterizes it. We think that this is a definite plus for symmetric consequence, and not just on grounds of mathematical elegance. One way to look at things is that a semantics for a set of statements should determine everything that there is to determine about the logical relationships among statements, whether certain statements imply certain other statements, whether certain statements are consistent with certain other statements, etc. A class of valuations does just those things by saying what distributions of truth values
over the statements are admissible. But one would expect a logic to do all those things too. So when a logic is characterized by two different semantics, in the sense of two different classes of valuations, there is something logical that is underdetermined by the logic, and this should not be. Mere completeness by itself is thus not enough for semantic respectability. It is not enough that a logic have a class of valuations that characterize it; it must have at most one such class of valuations. This last is what we call absoluteness.

We shall again define V(⊢) to be the class of valuations that respect ⊢, but this time we of course mean that for no v ∈ V do we have v(Γ) = t and v(Δ) = f. And we shall define ⊢(V) to be the relation that holds between sets of statements Γ and Δ whenever for no v ∈ V do we have v(Γ) = t and v(Δ) = f. It is again just a matter of unpacking definitions to determine that we still have the Galois properties (i)-(iv) of the previous section. And again we can show that the inclusion of (iv) can be strengthened to the identity

(iv′) ⊢ = ⊢(V(⊢))

by way of proving a completeness theorem. This theorem is analogous to the one that we proved for asymmetric consequence, but requires a whole new bag of technical tricks. But the main philosophical point that interests us is that, given symmetric consequence, the inclusion (iii) can also be strengthened to the identity:

(iii′) V = V(⊢(V)).

Lemma 6.8.1 Let V be a class of valuations. Then V(⊢(V)) ⊆ V.

Proof We proceed contrapositively. Thus let us suppose that v₀ is not in V. Define T = {s ∈ S: v₀(s) = t} and F = {s ∈ S: v₀(s) = f}. Clearly T ⊢(V) F, for given any v ∈ V, v ≠ v₀. This means that either v assigns f to some member of T, or else v assigns t to some member of F. And yet by the definition of T and F, it is clear that v₀ falsifies T ⊢(V) F, i.e., v₀ is not in V(⊢(V)), and we are through. □

Theorem 6.8.2 (Completeness for symmetric consequence) Let ⊢ be a symmetric consequence relation. Then there is a set of valuations V such that Γ ⊢ Δ iff there is no v ∈ V such that v(Γ) = t and v(Δ) = f.

Proof Set V = V(⊢). The "only if" part then holds by definition of V. For the "if" part, we proceed contrapositively and suppose that not (Γ ⊢ Δ). Let Σ be the set of all statements S. By the global cut property and Exercise 6.3.6, we know that there exists a partition of S into two sets T and F such that Γ ⊆ T, Δ ⊆ F, and not (T ⊢ F). Define a valuation v_T that assigns t to every statement in T and f to all statements in F. Clearly this is a valuation that falsifies Γ ⊢ Δ. The question remains though whether v_T respects ⊢. Suppose that it did not, i.e., that there exist sets of statements Π and Θ such that Π ⊢ Θ and yet v_T(Π) = t and v_T(Θ) = f. Clearly Π ⊆ T and Θ ⊆ F, and so by dilution, we would have T ⊢ F, contrary to what was guaranteed us by the global cut property. □

We now examine completeness for compact symmetric consequence. We still have to prove the following.

Theorem 6.8.3 (Completeness for compact symmetric pre-consequence) Let ⊢ be a compact symmetric pre-consequence relation. Then there is a set of valuations V such that Γ ⊢ Δ iff there is no v ∈ V such that v(Γ) = t and v(Δ) = f.

This will follow immediately from the theorem above and the following lemma.

Lemma 6.8.4 Let ⊢ be a compact symmetric pre-consequence relation. Then it has the global cut property.

Proof Let us assume that not (Γ ⊢ Δ). Note by Exercise 6.3.6 we can assume that Σ = S, which we do for this proof. For the sake of concreteness, we give a proof here only for the case where S is countable. (Two different proofs for the general (uncountable) case are suggested later as exercises.) Since we are assuming that S is countable, we can enumerate the elements of S:

s₁, s₂, s₃, ...

(If S is finite, just infinitely repeat the last member.) We then define a series of pairs (Γᵢ, Δᵢ) as follows:

(i) (Γ₀, Δ₀) = (Γ, Δ);
(ii) (Γᵢ₊₁, Δᵢ₊₁) = (Γᵢ ∪ {sᵢ₊₁}, Δᵢ) if not (Γᵢ, sᵢ₊₁ ⊢ Δᵢ); otherwise (Γᵢ₊₁, Δᵢ₊₁) = (Γᵢ, Δᵢ ∪ {sᵢ₊₁}).

In plain English, the construction is that we are to begin with the pair (Γ, Δ), and going through all of the statements of the language one at a time, we are to throw a statement into the left-hand component of the pair if doing so does not lead to the left having the right-hand side as a consequence, and otherwise we are to throw the statement into the right-hand component. We then define as the limit of this process:

(Γ_ω, Δ_ω) = (⋃ᵢ Γᵢ, ⋃ᵢ Δᵢ).

We leave to the reader the proof of the following properties of the construction:

(1) For each stage of the construction, it is not the case that Γᵢ ⊢ Δᵢ. (Use mathematical induction.)
(2) It is not the case that Γ_ω ⊢ Δ_ω. (Use compactness and (1).)

Finishing off the proof, it is clear that we have the desired partition. □

Exercise 6.8.5 (For those readers having some knowledge of ordinals and/or well-orderings.) Extend the proof given above to the general (non-countable) case by "enumerating" the set of statements with ordinals (or well-ordering it) and using ordinal induction (or strong induction).

Exercise 6.8.6 Use Zorn's lemma, providing yet another way to remove the restriction to countable languages. (Hint: Define a class G of pairs of sets (Γᵢ, Δᵢ) such that Γ ⊆ Γᵢ and Δ ⊆ Δᵢ, and partially order G by componentwise inclusion.)
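The countable construction in the proof of Lemma 6.8.4 is easy to animate for a finite S. The sketch below is ours; the particular relation `proves` is an arbitrary toy satisfying the Gentzen properties.

```python
# A sketch (ours) of Lemma 6.8.4's construction, with the consequence
# relation supplied as a (compact) predicate `proves`.
def build_partition(proves, enumeration, gamma, delta, steps):
    """Extend (gamma, delta) by throwing each statement left if that
    does not make the left side prove the right, else right."""
    G, D = set(gamma), set(delta)
    for s in enumeration[:steps]:
        if not proves(G | {s}, D):
            G.add(s)
        else:
            D.add(s)
    return G, D

# Toy relation on S = {p, q, r}: G proves D iff they overlap or p in G.
def proves(G, D):
    return bool(G & D) or "p" in G

S = ["p", "q", "r"]
G, D = build_partition(proves, S, {"q"}, {"p"}, steps=3)
assert G | D == set(S) and not proves(G, D)   # a separating quasi-partition
```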
Remark 6.8.7 The properties (iii′) and (iv′) guarantee that there is a one-one correspondence between symmetric consequence relations and sets of valuations ((iv′) corresponds to completeness and (iii′) corresponds to absoluteness). Indeed, it is easy to see that the map V(⊢) is a "dual" isomorphism between the lattice of symmetric consequence relations and the lattice of sets of valuations (intersection carried into union, and vice versa). This means that the lattice of symmetric consequence relations has an exceptionally simple structure, namely, that of the lattice of all subsets of a set, or in more abstract terms, that of a complete atomic Boolean algebra. This is in sharp contrast to the situation of asymmetric consequence relations. See Wójcicki (1988) for more on the lattice of asymmetric consequence relations.

Historical Remark 6.8.8 Absoluteness is the appropriate analog for a logic of categoricity for a theory. In view of both the naturalness and importance of the property we call "absoluteness," we find puzzling the casualness with which it has been previously dealt with in the literature. Scott (1973) emphasized the importance of symmetric consequence, and also of looking at a semantics as a class of valuations. Indeed, that paper contains the proof of the completeness theorem for compact symmetric logics, and yet Scott does not state its dual, the absoluteness theorem. Regarding Shoesmith and Smiley (1978), we tend to agree with a remark made by Shoesmith, in correspondence, that our asking whether they clearly state absoluteness is "like asking if the New Testament teaches the doctrine of the Trinity; it is written on every page, whether it is explicit or not." But the one explicit reference supplied (p. 73) mentions the property in an offhand way (in the context of establishing that "the single-conclusion part of a calculus is never sufficient to determine the remainder of it"). It is said that "for whereas each multiple-conclusion calculus is characterized by a unique set of partitions, the presence or absence of each pair (T, U) being dictated by the falsity or truth of T ⊢ U, several sets of partitions can characterize the same single-conclusion calculus" (a "partition" is just a division of sentences into two sets, the true T, and the false U, and so functions as a valuation). Not only does absoluteness not appear as a numbered theorem, but the justification provided in the quotation seems hardly to count as a proof. Talking as it does of partitions (T, U), it seems to overlook the more typical consequences Γ ⊢ Δ where the pair (Γ, Δ) does not necessarily constitute a partition. Incidentally, in closing these historical remarks, we observe that viewing absoluteness as the dual of completeness in the context of Galois connections seems to be novel with us.

6.9 Symmetric Consequence in the Context of Hemi-distributoids

By a hemi-distributoid we mean a structure (S, ≤, ∘, +), where each of (S, ≤, ∘) and (S, ≤, +) is a pre-ordered groupoid, and where ∘ distributes over + as follows:

(1) a ∘ (b + c) ≤ (a ∘ b) + c;
(2) a ∘ (b + c) ≤ b + (a ∘ c);
(3) (b + c) ∘ a ≤ b + (c ∘ a);
(4) (b + c) ∘ a ≤ (b ∘ a) + c.

The idea of a hemi-distributoid related to consequence is that, besides providing an operation generalizing conjunction (for the left) and a dual operation generalizing disjunction (for the right), the various "hemi-distributive laws" (1)-(4) postulated above give the effect of the cut property (this is unfortunately not quite true, as we shall ultimately see below). Given a set Δ we shall define the set of fissions of Δ (in symbols Fis(Δ)) in precisely the same way as we defined the fusions of a set, but using + instead of ∘ (and at this stage we do not provide for empty fissions or fusions). Then given a hemi-distributoid we can define explicit symmetric consequence between two sets Γ, Δ ⊆ S as follows:

(5) Γ ≤ Δ iff for some γ ∈ Fus(Γ), δ ∈ Fis(Δ), γ ≤ δ.

In analyzing what properties are needed for this to give a symmetric consequence relation, it is clear once more that overlap, dilution (and compactness) are simple consequences of the definition. Everything then falls on the symmetric cut property. This requires that if Γ ≤ {x} ∪ Δ, and Γ ∪ {x} ≤ Δ, then Γ ≤ Δ. The most general case then would look something like this:

(6) If γ₁ ≤ δ₁(x) and γ₂(x) ≤ δ₂, then γ₂(γ₁) ≤ δ₁(δ₂) ((symmetric) algebraic cut).

Here it is understood that the γ-terms are members of Fus(Γ) and the δ-terms are members of Fis(Δ), and that the displayed substitutions may be made for as many occurrences of x as one likes (as long as at least one is made on each side). The symmetric algebraic cut property is intimately connected with the hemi-distributive laws (1)-(4). We shall derive the hemi-distributive law (2) from it by way of example, leaving the others as exercises. It is a matter of inspection to see that (7) below is a special case of (6). (We bracket the substituted positions to aid the eye.)

(7) If b + x ≤ b + [x] and a ∘ [x] ≤ a ∘ x, then a ∘ (b + x) ≤ b + (a ∘ x).
We would love to proceed to show conversely that symmetric algebraic cut holds in general of hemi-distributoids, but we are unable to do this, as the following shows.

Exercise 6.9.1 R. K. Meyer (personal communication) has supplied us with the following hemi-distributoid which provides a counter-example not only to symmetric algebraic cut, but to symmetric cut in general. Consider the set {1, ½, 0} with the usual ordering, and let fusion and fission be defined by the tables in Figure 6.1. Show that this is a hemi-distributoid, and that whereas {x + x} ≤ {x} and {x} ≤ {x ∘ x}, still {x + x} ≰ {x ∘ x}. (Hint: for this last, assign x the value ½, and note that all fusions of the left-hand side will take the value 1 and that all fissions of the right-hand side will take the value 0.)
Remark 6.9.2 The reader may recognize these tables as being just like those for conjunction and disjunction in the Łukasiewicz three-valued logic, excepting the center entry in each case. Actually, where ¬ and → are the Łukasiewicz operations for negation and implication, x ∘ y = ¬(x → ¬y) and x + y = ¬x → y.
∘ | 1  ½  0
1 | 1  ½  0
½ | ½  0  0
0 | 0  0  0

+ | 1  ½  0
1 | 1  1  1
½ | 1  1  ½
0 | 1  ½  0

FIG. 6.1. Counter-example to symmetric and symmetric algebraic cut
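The claims of Exercise 6.9.1 about Figure 6.1 can be verified by brute force; the sketch below is ours (the values themselves serve as statements, and fusions and fissions are enumerated to a fixed depth, which suffices here since the generated value sets stabilize).

```python
# A sketch (ours) verifying Exercise 6.9.1: Meyer's three-element
# hemi-distributoid, with fusion and fission as in Figure 6.1.
from fractions import Fraction

H = Fraction(1, 2)
E = [1, H, 0]
FUS = {(1, 1): 1, (1, H): H, (1, 0): 0,
       (H, 1): H, (H, H): 0, (H, 0): 0,
       (0, 1): 0, (0, H): 0, (0, 0): 0}
FIS = {(1, 1): 1, (1, H): 1, (1, 0): 1,
       (H, 1): 1, (H, H): 1, (H, 0): H,
       (0, 1): 1, (0, H): H, (0, 0): 0}

# Hemi-distribution (1) and (2); (3) and (4) follow by commutativity.
for a in E:
    for b in E:
        for c in E:
            assert FUS[a, FIS[b, c]] <= FIS[FUS[a, b], c]   # (1)
            assert FUS[a, FIS[b, c]] <= FIS[b, FUS[a, c]]   # (2)

def values(seed, op, depth=4):
    """Values of all fusions (resp. fissions) of the seed set."""
    vals = set(seed)
    for _ in range(depth):
        vals |= {op[x, y] for x in vals for y in vals}
    return vals

def entails(gamma, delta):
    """Explicit symmetric consequence (5): some fusion <= some fission."""
    return any(g <= d for g in values(gamma, FUS)
               for d in values(delta, FIS))

x = H
assert entails({FIS[x, x]}, {x})               # {x + x} <= {x}
assert entails({x}, {FUS[x, x]})               # {x} <= {x o x}
assert not entails({FIS[x, x]}, {FUS[x, x]})   # {x + x} =/= {x o x}: cut fails
```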
Since we thus cannot show symmetric algebraic cut in general, we instead show it for the case where but one occurrence of x is replaced, which we display as follows (the square brackets indicating a specific occurrence of x; this can be understood as shorthand for an infinite number of specific laws using algebraic expressions where we can actually talk about specific occurrences of x, as in (7) above):

(8) If γ₁ ≤ δ₁[x] and γ₂[x] ≤ δ₂, then γ₂[γ₁] ≤ δ₁[δ₂] (single algebraic cut).
We need the following kind of general hemi-distributive law. (If it could be proven allowing substitution in more than one occurrence, we would have been able to prove algebraic cut in general.)

Lemma 6.9.3 Let γ[x] be a fusion, and let δ[x] be a fission. Then γ[δ[x]] ≤ δ[γ[x]].
In proving the lemma, we find it convenient to have at hand what is in effect a special case (Sublemma 6.9.4), which we leave to the reader to prove by induction on the complexity of the construction of δ[x]. (The inductive proof of the main lemma can be structured so that this is not necessary, but it is more confusing.)

Sublemma 6.9.4 Let δ[x] be a fission. Then:

(i) a ∘ δ[x] ≤ δ[a ∘ x];
(ii) δ[x] ∘ a ≤ δ[x ∘ a].

Proof of Lemma 6.9.3 We proceed by induction on the combined complexity of the constructions of γ[x] and δ[x]. If either consists of just x alone, we are back to just monotonicity, so we can assume that both are complex, say that

(9) γ[x] = γ₁ ∘ γ₂[x];
(10) δ[x] = δ₁ + δ₂[x].

There are actually three more cases, depending on whether the distinguished occurrence of x is on the left or right of the fusion and fission (a different one of the hemi-distributions (1)-(4) coming to the rescue each time). We shall explicitly work only on the case above. Thus:

γ[δ[x]] = γ₁ ∘ (γ₂[δ₁ + δ₂[x]])
γ[δ[x]] ≤ γ₁ ∘ (δ₁ + δ₂[γ₂[x]])    by inductive hypothesis
γ[δ[x]] ≤ δ₁ + (γ₁ ∘ δ₂[γ₂[x]])    by hemi-distribution (2)
γ[δ[x]] ≤ δ₁ + δ₂[γ₁ ∘ γ₂[x]]     by sublemma

that is, γ[δ[x]] ≤ δ[γ[x]]. □

It is striking that the problem in generalizing the above all comes down in the end to the tangle of multiple occurrences of the "cut element" x. One way to cut this Gordian knot is just to assume some suitable general forms of idempotence for ∘ and +. It of course crosses the mind that we should have square-increasingness for ∘ and its dual for +:

(11) a + a ≤ a (sum-decreasingness).

But these are not suitably general (unless one adds as well the commutativity and associativity of the hemi-distributoid operations). Given γ, γ′ ∈ Fus(Γ), we shall say that γ′ is a reduct of γ (in symbols γ′ ≺ γ) intuitively when γ′ is just the same as γ except for containing fewer occurrences of one or more members of Γ. We shall say that γ′ is a complete reduct of γ when it is a reduct of γ containing at most one occurrence of each member of Γ (this is the same as saying that it is minimal with respect to ≺). These notions can be made precise by looking at the inductive construction of γ′, and seeing how many times an element x enters in. We shall speak analogously concerning elements δ, δ′ of Fis(Δ).

Remark 6.9.5 All the talk of "occurrences" in fusions and fissions has ultimately to be straightened out by "metalinguistic ascent." What we are really talking about is terms ("polynomials") formed recursively as multiple fusions (or fissions).

Definition 6.9.6 We shall say a hemi-distributoid is general square-increasing if for every non-empty set Γ, for every γ, γ′ ∈ Fus(Γ), if γ′ is a reduct of γ, then γ′ ≤ γ.

Definition 6.9.7 Dually, it is general sum-decreasing if for every non-empty set Δ, for every δ, δ′ ∈ Fis(Δ), if δ′ is a reduct of δ, then δ ≤ δ′.

Definition 6.9.8 We shall say of a hemi-distributoid that it is sufficiently idempotent when it is both general square-increasing and general sum-decreasing.

Lemma 6.9.9 Let (S, ≤, ∘, +) be a sufficiently idempotent hemi-distributoid. Then it satisfies the symmetric algebraic cut property.

Proof Let us suppose that γ₁ ≤ δ₁(a) and γ₂(a) ≤ δ₂. Then let δ′₁[a] be a complete reduct of δ₁(a) and let γ′₂[a] be a complete reduct of γ₂(a). That such complete reducts exist follows from the fact that we are dealing with finite fusions and fissions. We then can apply the single algebraic cut as follows: γ₁ ≤ δ₁(a) ≤ δ′₁[a] (by general sum-decreasingness), and γ′₂[a] ≤ γ₂(a) ≤ δ₂ (by general square-increasingness), whence by single algebraic cut (8), γ′₂[γ₁] ≤ δ′₁[δ₂]. □

Theorem 6.9.10 Let (S, ≤, ∘, +) be a sufficiently idempotent hemi-distributoid. Then the relation of explicit consequence Γ ≤ Δ is a compact symmetric consequence relation.

The proof is immediate from Lemma 6.9.9 and the surrounding discussion.
We shall briefly discuss elliptical consequence in the context of symmetric consequence, omitting proofs since the main ideas are obvious extensions of the corresponding asymmetric notions. First, we need to relativize both sides of the consequence relation with "axioms" T and "counter-axioms" F, as follows:

(LE) Γ ⊢_T Δ iff Γ ∪ T ⊢ Δ (left elliptical consequence);
(RE) Γ ⊢_F Δ iff Γ ⊢ F ∪ Δ (right elliptical consequence);
(SY) Γ ⊢_{T,F} Δ iff Γ ∪ T ⊢ F ∪ Δ (symmetric elliptical consequence).
In context, when T and F are clear, we write, respectively, ⊢_LE, ⊢_RE, and ⊢_SE.

Lemma 6.9.11 Let ⊢ be a (compact) symmetric consequence relation, and let T and F be sets of statements. Then the "elliptical consequences" Γ ⊢_LE Δ, Γ ⊢_RE Δ, and Γ ⊢_SE Δ are all (compact) symmetric consequence relations.
Left elliptical consequence allows the set Γ to be empty; ∅ ⊢_LE Δ is interpreted as "Δ is unassailable, given the truth of the axioms T." Right elliptical consequence allows the set Δ to be empty; Γ ⊢_RE ∅ means "Γ is refutable, given the falsity of the counter-axioms F." Symmetric elliptical consequence then allows either or both of Γ and Δ to be empty; ∅ ⊢_SE ∅ then represents that "the axioms cannot be true while the counter-axioms are false." We shall not examine right elliptical consequence in any detail, since it behaves much as the elliptical consequence defined for asymmetric consequence. We shall also largely disregard left elliptical consequence, on the excuse that it is just the dual, and focus our attention on symmetric elliptical consequence. Given an explicit consequence relation Γ ≤ Δ defined on some underlying sufficiently idempotent hemi-distributoid, symmetric elliptical consequence Γ ≤_{T,F} Δ (written in context as Γ ≤_SE Δ) just amounts to there being some fusion γ of elements of Γ ∪ T and some fission δ of elements of Δ ∪ F such that γ ≤ δ. This provides the following.

Corollary 6.9.12 For a sufficiently idempotent hemi-distributoid, symmetric elliptical consequence is a compact symmetric consequence relation. The same is true of left and right elliptical consequence.
A bottom element of a hemi-distributoid is an element 0 such that 0 ≤ x for every element x. A lower identity element is an element f such that f + x ≤ x and x + f ≤ x. (These dualize the corresponding notions for ∘ introduced in the context of pre-ordered groupoids.)

Fact 6.9.13 If all of the members of T are both top elements and upper identity elements, while all of the elements of F are both bottom elements and lower identity elements, then for all non-empty sets of statements Γ, Δ,

Γ ≤ Δ iff Γ ≤_SE Δ.

In particular, {a} ≤ {b} iff {a} ≤_SE {b}.
Fact 6.9.14 That ∘ is square-increasing, while + is sum-decreasing, is a necessary and sufficient condition for the equivalence:

a ≤ b iff {a} ≤ {b}.

Fact 6.9.15 The conjunction of the conditions of Facts 6.9.13 and 6.9.14 is a necessary and sufficient condition for the chain of equivalences:

a ≤ b iff {a} ≤ {b} iff {a} ≤_SE {b}.

Exercise 6.9.16 Establish analogous facts for ≤_RE.
We also leave as an exercise the proof of the following.

Corollary 6.9.17 Let (S, ≤, ∘, +, T, F) be a hemi-distributoid, with distinguished subsets T and F. Let the members of T all be upper identity elements, and the members of F all lower identity elements. Then the relation of "symmetric mixed consequence" Γ ≤_SM Δ, defined as Γ ≤ Δ or ∅ ≤_LE Δ or Γ ≤_RE ∅ ("Γ explicitly entails Δ, or Δ is unassailable, or Γ is refutable"), is a compact symmetric consequence relation. Indeed, it is symmetric elliptical consequence. The same is true of "left mixed consequence" ≤_LM (drop the third disjunct from the definition) and "right mixed consequence" ≤_RM (drop the second disjunct).
Finally, we apply all of the above generality to the principal concrete case of interest: distributive lattices. The reader will recognize that a distributive lattice is a hemi-distributoid, with ∘ as ∧ and + as ∨. The same is true of a "distributive pre-lattice" (just like a distributive lattice except that it may not satisfy anti-symmetry). Indeed, the distributive law for lattices can be stated just by way of the one hemi-distributive law (1), the others falling out by way of commutativity.

Exercise 6.9.18 Show the above. Show as well that the full distributive law

a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c)

holds in a lattice which is also a hemi-distributoid.

It should be reasonably clear that a distributive pre-lattice is sufficiently idempotent (both ∧ and ∨ are literally idempotent, and since both operations are commutative and associative, one can always "rearrange multiple occurrences of terms so that they abut one another"). Further, if the pre-lattice is bounded, i.e., has a top element 1 and a bottom element 0, then it is clear that these function, respectively, as an upper identity and a lower identity.
Theorem 6.9.19 Let (S, ≤, ∧, ∨) be a distributive pre-lattice. Then the relation of explicit consequence Γ ≤ Δ is a compact symmetric consequence relation. Further, if there is a top element 1 and a bottom element 0, then (letting T = {1} and F = {0}) the relation of symmetric elliptical consequence (as well as the relations of left and right elliptical consequence and the relations of mixed consequence) is also a compact symmetric consequence relation. Further, for non-empty sets of premises Γ, the explicit, all of the elliptical, and all of the mixed consequences are the same. In particular, {a} ≤ {b}, {a} ≤_SE {b}, {a} ≤_LE {b}, {a} ≤_RE {b}, {a} ≤_SM {b}, {a} ≤_LM {b}, {a} ≤_RM {b} all agree with each other and with a ≤ b.
Exercise 6.9.20 Prove the above theorem.
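As a concrete instance of Theorem 6.9.19, here is a sketch of ours in the distributive lattice of positive integers under divisibility (meet = gcd, join = lcm); that a single gcd/lcm per side suffices follows from idempotence, commutativity, and associativity.

```python
# A sketch (ours) of explicit symmetric consequence in a distributive
# lattice: positive integers ordered by divisibility.
from math import gcd
from functools import reduce

def lcm(a, b):
    return a * b // gcd(a, b)

def divides(a, b):
    return b % a == 0

def explicit(gamma, delta):
    """Since meet and join are idempotent, commutative, and associative,
    the least fusion is the gcd and the greatest fission is the lcm."""
    return divides(reduce(gcd, gamma), reduce(lcm, delta))

assert explicit({6, 10}, {2})        # gcd(6, 10) = 2 divides 2
assert explicit({3}, {15, 2})        # 3 divides lcm(15, 2) = 30
assert not explicit({5}, {6})        # 5 does not divide 6
```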
6.10 Structural (Formal) Consequence
In a natural language such as English there are many consequences among sentences that are not, at least obviously, consequences that depend on logical form alone, e.g.,

(1) "John is a husband" implies "John is male,"
(2) "Mary quickly ran" implies "Mary ran."

That there is a prima facie difficulty in saying that these consequences depend on form alone may be seen by considering the following sentences:

(1′) "Mary is a wife" implies "Mary is male,"
(2′) "John allegedly ran" implies "John ran."

It is the Logician's credo that all consequence can be reduced to formal consequence: it is merely a matter of analyzing sentences into their logical form. This is most plausible with a sentence like (1), the antecedent of which clearly means something like "John is married and John is male." This makes the implication (1) simply an instance of the formal logical truth:

(1″) p ∧ q → q.

With sentences like (2) things are more problematic (as the literature attests). Are adverbs to be construed as predicates attaching to names of events, or are they instead to be construed as predicate modifiers? And in any case, how is an adverb like "allegedly" to be analyzed? In the tradition that treats adverbs as predicate modifiers, a distinction has grown up between detachable and undetachable adverbs (with "quickly" in the former class, and naturally "allegedly" in the latter). Issues like the above are not intended by us to be in any way decisive against the Logician's credo (which we would very much like to believe), but are raised merely to say that there is at least a lot of hard work that must be done before a natural language can be viewed as an interpreted formal language.

In all our discussion of consequence above, we have left out any requirement that consequence be a matter of form. Let us first discuss (unary) assertional systems. We must add one requirement, or else we will not have captured the commonplace that being a logical truth is simply a matter of grammatical form. This amounts to the set of theorems being closed under substitution, and, in Polish, as it were, this is pronounced "structural." We will build into our notion of a logic that it is structural (or, as we shall say in English, formal). Thus in the sequel the default assumption when we speak of a "logic" will be that it is closed under substitution, in the following specific ways.

(s1) Given a language L and a set of sentences 𝓛 of L, 𝓛 is a (unary) assertional (formal) logic iff for any endomorphism ("substitution") σ of L, if ⊢_𝓛 φ then ⊢_𝓛 σ(φ).

For a binary system (of either the implicational or equivalential kind), we shall say that it is a (formal) logic if whenever (φ, ψ) ∈ 𝓛 then (σ(φ), σ(ψ)) ∈ 𝓛 for any endomorphism σ of the language. For asymmetric consequence, being a (formal) logic will require that whenever (Γ, φ) ∈ 𝓛, then (σ*(Γ), σ(φ)) ∈ 𝓛, where σ*(Γ) is, of course, just {σ(ψ): ψ ∈ Γ}. For symmetric consequence, it is required that whenever (Γ, Δ) ∈ 𝓛, then (σ*(Γ), σ*(Δ)) ∈ 𝓛.

Let us consider explicitly why the consequence relation determined by a set of valuations need not be structural. Thus suppose we consider the language of sentential logic and let V be the set of all valuations except those that assign t to p and f to q (maybe p is formalizing "John is a husband" and q is formalizing "John is male"). Then although p ⊢ q, we would not have r ⊢ s. Intuitively the problem in the example above is that there are implications that are not formal (or logical) implications. The p and the q are being treated as sentences having internal relations of meaning, and not as mere formal variables devoid of structure.
Definition 6.10.1 We shall call a set V of valuations (and the semi-interpreted language (L, V)) formal if it has the following property, a property we earlier called "closed under substitution":

(Coincidence of semantic and syntactic substitution) For every valuation v ∈ V and for every substitution σ of L, there is a valuation v_σ ∈ V such that v_σ(φ) = v(σ(φ)).

The intuitive idea behind this property is that, for example, given the substitution of r for q (substituting every other sentential variable, in particular p, for itself), and the valuation that assigns t to p and f to r, there should be admitted a valuation that assigns to q the value f. Thus, using obvious notation, there should be an admissible valuation v(r/q) such that v(r/q)(q) = v(r).
Exercise 6.10.2 Show that if a class of valuations is formal, then its symmetric consequence relation is always formal (i.e., structural).
Lindenbaum Matrices and Compositional Semantics for Assertional Formal Logics
The question before us is whether we can prove appropriate completeness theorems for formal (structural) consequence. The problem is that the class of valuations produced by the completeness proofs above is not obviously formal. One could try to show that it is, perhaps by way of appropriate modifications given our now further assumption that the consequence relation is itself formal, but we have not found that route very promising. Instead we detour through Theorem 5.9.14 of the previous chapter, which showed that the class of valuations determined by the interpretations in a matrix always has the property of the coincidence of syntactic and semantic substitution.
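The reason the detour works can be compressed into one displayed computation. The following recapitulation is ours: take $I$ to be an interpretation in a matrix $M$ (a homomorphism from the algebra of sentences into $\mathrm{alg}(M)$) and $\sigma$ a substitution (an endomorphism of the algebra of sentences).

```latex
% I o sigma is a homomorphism composed with an endomorphism, hence
% again an interpretation in M, and
\[
  v_{I\circ\sigma}(\varphi) \;=\; (I\circ\sigma)(\varphi)
  \;=\; I(\sigma(\varphi)) \;=\; v_{I}(\sigma(\varphi)),
\]
% so for v = v_I in V_M the valuation v_sigma required by the
% coincidence property may be taken to be v_{I o sigma}, which is in V_M.
```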
We do this first for unary assertional systems, both for clarity of ideas and for the historical importance of the result, for we are dealing with what has been called the "Lindenbaum–Tarski matrix." Suppose then that ℒ is a formal unary assertional logic, i.e., it is a set of sentences (the "theorems") closed under substitution. We want a class V of formal valuations, so that a sentence φ is valid in V, i.e., v(φ) = t for all v ∈ V, iff φ ∈ ℒ. Now if there were no requirement that the valuations in V be formal, the job would be entirely trivial. We could just use one valuation v, namely, the "characteristic function" for ℒ:

v(φ) = t if φ ∈ ℒ, and v(φ) = f otherwise,

and then let V = {v}. It turns out that by a trick due to Lindenbaum and Tarski (the exact history of their contributions is complicated), we can replace the single valuation v with a class of valuations V that is formal. The trick is to find an underlying matrix in the cheapest possible way, since we know from Corollary 5.9.7 that the class of valuations determined by the interpretations in a matrix is always formal. The cheapest possible way is to simply take the matrix to consist of the algebra of formulas itself, with the designated elements being just the theorems of ℒ. This is the famous "Lindenbaum–Tarski matrix," which (following Rasiowa and Sikorski 1963) we call simply the Lindenbaum matrix of the assertional logic ℒ, and denote by LM(ℒ). We then let V be the set of all valuations in LM(ℒ). Since a valuation is then just a homomorphism from the algebra of sentences into alg(LM(ℒ)), i.e., into the algebra of sentences, and since a substitution is just an endomorphism of the algebra of sentences, it is immediate that V is just the class of all substitutions. It is clear that if not ⊢_ℒ φ, then there is a valuation, namely the identity valuation v(φ) = φ, that assigns φ an undesignated value, i.e., a formula that is not a theorem, namely itself. Thus we have completeness. Soundness is only a slightly more complicated matter. Thus suppose ⊢_ℒ φ. Then any valuation v, being just a substitution, must assign φ some substitution instance v(φ). But since we are assuming that ℒ is formal, we know that v(φ) is also a theorem of ℒ, and so v(φ) is designated.

We shall say that a sentence φ is valid in a matrix M, or that M is a model of φ (in symbols ⊨_M φ), iff for every interpretation I in M, I(φ) ∈ D. It is clear that this is the same as saying that φ is assigned the value t by every valuation v_I determined by an interpretation in M. Let us call the set of all such valuations v_I, V_M. Then saying that φ is valid in M is just another way of saying that φ is valid in V_M. Given a logic ℒ, we shall say that LM(ℒ) is characteristic for ℒ when for all sentences φ, ⊢_ℒ φ iff ⊨_{LM(ℒ)} φ. It is then clear from the antecedent discussion that the following is true (recall that we built into our notion of a logic that it be formal).

Theorem 6.11.1 (Lindenbaum–Tarski) Every unary assertional logic ℒ has a characteristic matrix, namely LM(ℒ).

Remark 6.11.2 Note that if ℒ is based on the sentences of the language L, and L has some infinite cardinal number m of atomic sentences, then it is a simple calculation in cardinal arithmetic to see that there are only m sentences in all in L (the chief point being that each sentence has a finite construction). Thus the characteristic matrix LM(ℒ) need be no larger than the number of atomic sentences of L. In the usual case when L has but a denumerable number of atomic sentences, LM(ℒ) is denumerable.

Returning now to our main quest, to find under what conditions a formal logic has a formal semantics, we can answer the question for unary assertional logics.

Theorem 6.11.3 A necessary and sufficient condition for a unary assertional system to have a characteristic formal semantics is for the system to be formal (structural).

Proof That this is a sufficient condition follows from the Lindenbaum–Tarski theorem and a result of the previous chapter. That it is a necessary condition can be shown as follows. Suppose that ⊢_ℒ φ, but that there is some substitution instance so that not ⊢_ℒ φ(ψ/p). Then since ℒ has a characteristic formal semantics, there must be some class of valuations V_ℒ that is closed under syntactic and semantic substitution and which is characteristic for ℒ. So, V_ℒ being characteristic, there must be some v ∈ V_ℒ such that v(φ(ψ/p)) = f. But because of the property of the coincidence of syntactic and semantic substitution, there must be as well some v(ψ/p) ∈ V_ℒ so that v(ψ/p)(φ) = f. But then, because V_ℒ characterizes ⊢_ℒ, we must have that not ⊢_ℒ φ, contradicting our supposition. □
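As a concrete miniature of the construction (ours; two-valued tautology-hood stands in for the theorems of some formal unary assertional logic ℒ), the sketch below takes sentences themselves as matrix elements, the theorems as the designated set, and substitutions as the valuations. The identity substitution then assigns each non-theorem an undesignated value, namely itself.

```python
# Hypothetical miniature of the Lindenbaum matrix LM(L): sentences are
# nested tuples; 'theorem' means two-valued tautology.
from itertools import product

def ev(phi, val):
    op = phi[0]
    if op == 'atom': return val[phi[1]]
    if op == 'not':  return not ev(phi[1], val)
    if op == 'imp':  return (not ev(phi[1], val)) or ev(phi[2], val)

def atoms(phi):
    if phi[0] == 'atom': return {phi[1]}
    return set().union(*(atoms(a) for a in phi[1:]))

def theorem(phi):                       # the designated set of LM(L)
    avs = sorted(atoms(phi))
    return all(ev(phi, dict(zip(avs, bits)))
               for bits in product([False, True], repeat=len(avs)))

def subst(sigma, phi):                  # valuations of LM(L) = substitutions
    if phi[0] == 'atom': return sigma.get(phi[1], phi)
    return (phi[0],) + tuple(subst(sigma, a) for a in phi[1:])

p, q = ('atom', 'p'), ('atom', 'q')
peirce = ('imp', ('imp', ('imp', p, q), p), p)
print(theorem(peirce))         # True: designated under every valuation
print(theorem(('imp', p, q)))  # False: the identity valuation assigns
                               # this non-theorem itself, an undesignated value
```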
6.12 Lindenbaum Atlas and Compositional Semantics for Formal Asymmetric Consequence Logics

When dealing with formal asymmetric consequence logics, there is a corresponding notion to the Lindenbaum matrix, to wit, the Lindenbaum atlas. Recall that an atlas is just like a matrix, except that it has more than one designated subset, i.e., an atlas is a structure (P, (D_i)_{i∈I}), where each D_i (a designated subset) is a non-empty subset of P. The Lindenbaum atlas defined on a formal asymmetric consequence logic ℒ is then defined just like a Lindenbaum matrix, i.e., its algebraic part is just the algebra of sentences of the language L underlying ℒ, except that instead of taking just one designated subset (the set of theorems), we take many. Indeed, for every set Γ of sentences of L we take Cn(Γ), i.e., {φ: Γ ⊢ φ}, as a D_i. We call this the Lindenbaum atlas of ℒ, and denote it by LA(ℒ). We shall say that an atlas A is characteristic for an asymmetric consequence relation ⊢ just when Γ ⊢ φ iff for every interpretation I in A and for every D_i, if I(γ) ∈ D_i for all γ ∈ Γ, then I(φ) ∈ D_i. This is equivalent to saying that every valuation determined by the atlas respects Γ ⊢ φ. The following is then analogous to the Lindenbaum–Tarski theorem.

Theorem 6.12.1 (Wójcicki 1970) Given a formal asymmetric logic ℒ, the Lindenbaum atlas LA(ℒ) is always characteristic for ℒ.
Proof Soundness is an easy consequence of our assumption that ⊢ is formal, i.e., is closed under substitutions (endomorphisms). Thus if Γ ⊢ φ, then an interpretation in the Lindenbaum atlas just amounts to a substitution σ, and so we have for every interpretation σ that σ*(Γ) ⊢ σ(φ). Now each set D_i ∈ (D_i)_{i∈I} is of the form Cn(Δ). But to say that σ(γ) ∈ Cn(Δ) for all γ ∈ Γ is just to say that σ*(Γ) ⊆ Cn(Δ). But then (as mentioned in Section 6.3) Cn(σ*(Γ)) ⊆ Cn(Cn(Δ)) = Cn(Δ), and so, since σ(φ) ∈ Cn(σ*(Γ)), then σ(φ) ∈ Cn(Δ). Completeness is even easier. Proceeding contrapositively, if not (Γ ⊢ φ), then choose as a designated set Cn(Γ). It is clear that Γ is designated under the identity interpretation, and yet that φ is not. □

Now we know from a result of the previous chapter that the class of valuations determined by the interpretations in an atlas is formal. Also the following is easy.

Exercise 6.12.2 Let V be a class of valuations, and let ⊢_V be the asymmetric consequence relation determined by V. Then ⊢_V is formal (structural).

Theorem 6.12.3 A necessary and sufficient condition for an asymmetric logic to have a characteristic formal semantics is that its consequence relation be formal.

Remark 6.12.4 In this section and the previous one, we produced relatively simple proofs to show that for formal unary assertional logics and for formal asymmetric logics, there is always a class of interpretations in an appropriate "Lindenbaum structure" with respect to which the logic is sound and complete. The appropriate "Lindenbaum structure" was a matrix for assertional logics and an atlas for asymmetric logics, but in each case the "propositions" in the structure could be chosen to be the sentences on which the logic is defined. This is a profound trick, and is a good witness to the fact that simplicity of proof does not necessarily translate into the triviality of the theorem proved. It took tremendous ingenuity to see that these "parasitical interpretations" could go proxy for "genuine interpretations" (reaching outside the language itself), and this of course is paradigmatic of more complicated completeness theorems (say, the completeness theorem for classical first-order logic, with which the reader may have some familiarity). In both cases, the appropriate Lindenbaum structure was in effect the logic itself simply "reconceptualized." This is clearest in the unary case, where the elements of the Lindenbaum matrix are the sentences and the designated elements are just the theorems. The relationship is not quite so clear in the case of asymmetric consequence. The straightforward Lindenbaum structure would be to again regard the sentences as the elements but to have some relation from sets of elements to elements. The atlas codes up the same information because of the familiar correlation between consequence and theories (a theory can be defined as a set closed under consequence, or φ can be said to be a consequence of a set Γ just when φ is a member of every theory including Γ).

Of further interest to us is the fact that these completeness results with respect to the Lindenbaum structures could be used to produce formal classes of valuations (given that the consequence relations are formal (structural)). We now want to show for symmetric consequence too that it has a characteristic formal semantics in the sense of being characterized by a class of valuations that has the property of the coincidence of syntactic and semantic substitution. It might naturally be thought that there would then be the need to consider yet another new kind of "Lindenbaum structure" defined on sentences, one with, in effect, a relation between sets of sentences (perhaps "coded up" in some clever way so as to disguise the transparency of its relationship to the logic, as with the case of the atlas). Somewhat surprisingly, we shall find that there is no need for such a new basic type of Lindenbaum structure; the atlas will do! But we must change the way that the set (D_i) of designated sets is defined, so there is a sense in which we do have a new Lindenbaum structure after all. We shall call this the Scott atlas.

6.13 Scott Atlas and Compositional Semantics for Formal Symmetric Consequence Logics
For the Scott atlas, the algebraic part is the same as for the Lindenbaum atlas (or matrix): the algebra of sentences. The set (D_i) is somewhat different than for the case of asymmetric consequence, when we constructed the Lindenbaum atlas. Then it was the set of all subsets of L which are closed under consequence. This time it is a subset of those sets, namely the sets T such that there is some set F so that T and F quasi-partition L (F is L − T) and not T ⊢ F.

Theorem 6.13.1 Let ⊢ be a formal symmetric consequence relation. Then the Scott atlas is characteristic for ⊢ in the sense that Γ ⊢ Δ iff for all D_i (i.e., for all i ∈ I) and for all interpretations I, if I(γ) ∈ D_i for all γ ∈ Γ, then I(δ) ∈ D_i for some δ ∈ Δ.

Proof For completeness, let us suppose that not Γ ⊢ Δ. Then by the global cut property (Section 6.3), we know that there is a quasi-partition of the underlying language L into two sets T, F so that Γ ⊆ T, Δ ⊆ F, and not T ⊢ F. Thus, fixing I to be the identity map (restricted to L) and letting D_i = T, it is clear that I(γ) ∈ D_i for all γ ∈ Γ, and yet no I(δ) ∈ D_i for δ ∈ Δ. On the side of soundness, let us suppose that Γ ⊢ Δ. Observe (this should by now be routine) that any interpretation in the Scott atlas is just an endomorphism σ of L and hence a substitution. But since ⊢ is formal (closed under substitutions) we then have σ*(Γ) ⊢ σ*(Δ). For the Scott atlas to falsify this, we would need a quasi-partition (T, F) of L with σ*(Γ) ⊆ T and σ*(Δ) ⊆ F, while not T ⊢ F. But then, by dilution from σ*(Γ) ⊢ σ*(Δ), we would have T ⊢ F (a contradiction). □

It is easy to establish the following exercise as a kind of converse to the above theorem.
Exercise 6.13.2 Let V be a formal class of valuations on a language L. Show that the symmetric consequence relation ⊢_V determined by V is formal.

Corollary 6.13.3 Let ℒ be a symmetric consequence system. Then a necessary and sufficient condition that ℒ be formal is that ℒ have a characteristic atlas.

We finally have the following by Corollary 5.9.8:
Corollary 6.13.4 Let ℒ be a symmetric consequence logic. Then ℒ has a formal semantics in the sense of having a characteristic formal set V of valuations.

6.14 Co-consequence as a Congruence

Let χ(φ) be any sentence possibly containing one or more occurrences of φ and let χ(ψ) be the result of possibly replacing one or more of those occurrences by ψ. For equivalential logics it is then very natural to require the well-known principle of logic:

(rep) If φ ⊣⊢ ψ, then χ(φ) ⊣⊢ χ(ψ) (replacement).

We say "very natural" rather than something stronger like "most natural" since there are examples of "real-life" logics (certain formulations of counterfactual conditionals) that lack this principle. But most "real-life" logics do have this principle. It reflects the intuition that logically equivalent sentences express the same proposition, and hence in a compositional semantics make the same contribution to the meaning of the whole. Incidentally, it is important to emphasize that "logical equivalence" here is meant by the lights of the logic in question, and does not necessarily mean as loose a relation as allowed by classical logic. Whatever the final philosophical justification, replacement is a common enough principle to warrant attention, and is particularly interesting algebraically since it clearly amounts to saying that the co-consequence relation ⊣⊢ is a congruence on the algebra of sentences.

Theorem 6.14.1 Let ⊣⊢ be any equivalence relation on an algebra of sentences. Then ⊣⊢ satisfies replacement iff ⊣⊢ is a congruence on the algebra of sentences.

Proof Let us assume that ⊣⊢ satisfies (rep). To say that it is a congruence is to say that, given an n-place connective c, if φ_i ⊣⊢ ψ_i (1 ≤ i ≤ n), then

(1) c(φ_1, ..., φ_i, ..., φ_n) ⊣⊢ c(ψ_1, ..., ψ_i, ..., ψ_n).

But this clearly follows by applying (rep) to the first position of the left-hand side, replacing φ_1 by ψ_1, and then applying replacement to the second position of the result, replacing φ_2 by ψ_2, etc., through n uses of (rep). At the end, each φ_i has been replaced by the equivalent ψ_i, and transitivity tells us that the beginning of the chain (the left-hand side of (1)) is equivalent to the end (the right-hand side of (1)). For the converse, let us suppose that ⊣⊢ is a congruence on the algebra of sentences. It is then easy to show by an induction on the construction of the sentence χ(φ) that (rep) holds. □

Exercise 6.14.2 Make the details in the above proof explicit.

We shall thus say of an equivalential logic that it (directly) supports congruence when it satisfies the replacement principle. We can apply a similar notion to binary implication logics, and asymmetric and symmetric consequence logics. The key idea is that in each of these logics we can understand the relation φ ⊢ ψ, it being explicitly given in the binary case and its being definable in the consequence logics as {φ} ⊢ {ψ}. But once we have the relation φ ⊢ ψ, we can clearly then define the relation φ ⊣⊢ ψ as the conjunction φ ⊢ ψ and ψ ⊢ φ.

Fact 6.14.3 For binary implication logics, and asymmetric and symmetric consequence logics, the relation φ ⊣⊢ ψ defined as φ ⊢ ψ and ψ ⊢ φ is an equivalence relation.

Proof Symmetry rides piggy-back on the symmetry of conjunction. Reflexivity and transitivity were explicitly postulated for binary implication logics, and are special cases of overlap and cut for the consequence logics. □
We can thus speak of a binary implication logic, or a consequence logic of either the asymmetric or symmetric kind, as (directly) supporting congruence when the defined relation ⊣⊢ satisfies (rep). Note that the modifier "directly" still applies, despite the need to define ⊣⊢, the definition being so transparent; we shall turn to more "indirect" support next.

We have at least one more case of interest: unary assertional logics. When can we speak of them as "supporting congruence"? It is well known from the example of classical logic that there may be a connective (primitive or defined) within the object language which "indicates" implication. The so-called "material implication" ⊃ has the deduction theorem property that ⊢ φ ⊃ ψ iff φ ⊢ ψ. This allows us in a unary assertional formulation of classical logic, "throwing the ladder away," to simply define the relation φ ⊢ ψ as ⊢ φ ⊃ ψ. This move is common enough for even non-classical logics that we wish to abstract it. Thus we say of a sentence B(p, q) containing the atomic sentences p and q that it indicates implication iff
(i) ⊢ B(φ, φ) (reflexivity);
(ii) if ⊢ B(φ, ψ) and ⊢ B(ψ, χ), then ⊢ B(φ, χ) (transitivity);
(iii) if ⊢ φ and ⊢ B(φ, ψ), then ⊢ ψ (modus ponens).
It is then natural to abbreviate B(φ, ψ) as φ → ψ, and to define φ ⊢ ψ as ⊢ φ → ψ. Then of course φ ⊣⊢ ψ is defined as the conjunction φ ⊢ ψ and ψ ⊢ φ. Finally, we say that the unary logic (indirectly) supports congruence if there is some sentence B(p, q) that indicates implication, and such that ⊣⊢ satisfies (rep).
Remark 6.14.4 The role of (i) and (ii) above is clear enough in relation to obtaining a congruence, but what is the role of (iii)? Besides being natural on logical grounds, its algebraic role is to ensure that the congruence respects the predicate ⊢ ("is a theorem"). Note that in the case of direct support, the infinitary cut property takes care of not only this, but also the analogous problem of respecting asymmetric and symmetric consequence. Although we have motivated the notion of indirect support of congruence by way of unary assertional logics, there is nothing wrong with speaking of a consequence logic (either asymmetric or symmetric) as indirectly supporting congruence. (i)–(iii) above would just be understood as having the empty set of premises. Most normally we would then expect that if a consequence logic indirectly supported congruence, then it would directly support it (and vice versa). This of course would be ensured if the logic had the deduction theorem property.
But there is at least one real-life example (relevance logic) where under a "first guess" of what it would be like as a consequence logic, it indirectly but not directly supports congruence. In this case we would presumably want to require of indirect support of congruence that it satisfy replacement not just with respect to the theoremhood predicate (what (iii) enforces), but also, when working with consequence versions, that it allow replacement of equivalent sentences in the premise or conclusion sets (as we said, infinitary cut ensures this in the case of direct support). But we will not now worry any more about this rather obscure matter.
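For a logic given by a finite characteristic matrix, conditions (i)–(iii) can be checked by brute force, since ⊢ φ then amounts to φ taking a designated value under every assignment. The sketch below is ours and purely illustrative: it tests whether B(p, q) = ¬p ∨ q indicates implication relative to the two-valued matrix; the pointwise conditions checked here are sufficient for (and stronger than) the theoremhood-level versions of (ii) and (iii).

```python
# Hypothetical brute-force test that B(p,q) "indicates implication"
# relative to a finite matrix; here the two-valued matrix with
# B(x,y) = (not x) or y, i.e., material implication.
from itertools import product

ELEMENTS = [False, True]
DESIGNATED = {True}

def B(x, y):
    return (not x) or y

# (i) reflexivity: B(a, a) is designated for every element a
refl = all(B(a, a) in DESIGNATED for a in ELEMENTS)

# (ii) transitivity, pointwise: designation of B(a,b) and B(b,c)
# forces designation of B(a,c)
trans = all(B(a, c) in DESIGNATED
            for a, b, c in product(ELEMENTS, repeat=3)
            if B(a, b) in DESIGNATED and B(b, c) in DESIGNATED)

# (iii) modus ponens, pointwise: designation of a and B(a,b) forces b
mp = all(b in DESIGNATED
         for a, b in product(ELEMENTS, repeat=2)
         if a in DESIGNATED and B(a, b) in DESIGNATED)

print(refl, trans, mp)   # True True True
```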
6.15 Formal Presentations of Logics (Axiomatizations)
We assume that everyone has the basic idea of an axiomatization. One is presented with certain sentences that count as axioms, and one is given certain rules that allow one to construct proofs of theorems from the axioms. Axiomatizations traditionally have a certain epistemological purpose. The axioms are supposed to be obviously true, and the rules are supposed to obviously carry truths to truths, and so one can discover, by the method of proof, further truths that may not be at all so obvious. The paradigm of this is of course Euclidean geometry. There were epistemological defects in Euclidean geometry. One of these (the lack of "obviousness" of the Axiom of Parallels) will not detain us, but the other is of more immediate logical interest. It had to do with the question as to what rules were allowed in deducing the theorems. Seemingly one was allowed to use logically correct inferences, whatever those were, and also maybe certain constructions. At the beginning of the twentieth century a number of mathematicians, Hilbert chief amongst them, worked at removing the mystery about just what rules were to count. The constructions were uncovered and made explicit in the form of further axioms (e.g., Pasch's law), so that logic alone was needed to derive the theorems. Also the logic itself was axiomatized. Further, the whole enterprise was made "formal" in the sense that what was an axiom or the correct application of a rule was made a question of syntactic form alone. Frege had already done this for the logical part in 1879, but given Hilbert's emphasis upon the importance of presenting logic axiomatically (and perhaps his greater influence), axiomatizations of logic of the most familiar kind are called Hilbert-style axiomatizations.¹

Given a language L, we shall call the following a Hilbert-style presentation: it is a structure P = (S, A, R) where S is the set of all sentences of L, A ⊆ S is called the set of axioms, and R is a set of rules, i.e., each member of R is an n-place relation on S for some natural number n > 1. Note that if we allowed one-place relations (sets), the role of A could be subsumed under R, but given the historical importance of axioms we shall continue to give them a special place in the structure. Given an (n + 1)-place rule R, we say that a sentence ψ (the conclusion) follows from (the premises) φ_1, ..., φ_n by way of the rule iff R(φ_1, ..., φ_n, ψ).

¹ Actually, there is a welcome tendency in the computer science literature on complexity to call such axiomatizations "Frege systems," but we retain here the less accurate but more traditional nomenclature.
Given a Hilbert-style presentation P, we can define a proof as a finite sequence of sentences φ_1, ..., φ_i, ..., φ_n such that each member φ_i is either (i) an axiom or (ii) follows from preceding items by one of the rules. A theorem is then simply the last item in a proof (we write ⊢_P φ to mean that φ is a theorem of P). Now without addressing the question of just what the "correct" axioms and rules should be, there are still two further kinds of requirements that we might make from our abstract perspective. First, given the traditional epistemological purpose, we should require that both the axioms and rules be "effective," i.e., one ought to be able to mechanically determine whether or not a sentence is an axiom, and whether or not a given sentence follows from some finite number of other sentences by an application of a rule. The way this is traditionally achieved (in modern times) is to require that the set of axioms be recursive, and to require that there be only a finite number of recursive rules. The precise meaning of recursive need not detain us here, but intuitively it means simply that one could write a program to determine whether a sentence is an axiom, or whether a finite sequence of sentences was an instance of a given rule. Incidentally, it is customary to apply the term "axiomatization" only when the presentation is recursive in the sense above. This is one reason that we have chosen to speak of "presentations," although we may allow ourselves to use the more familiar term "axiomatization" as a synonym (reserving then the explicit "recursive axiomatization" for the stronger notion).

A second requirement that might occur to one (it is not totally unrelated to the first, as we shall see) is that the axioms and rules be given "schematically." What this means informally is that, for example, certain sentences are presented as "typical" axioms with the understanding that all substitution instances are also axioms. Then, in addition, certain "typical" instances of applications of rules would be presented, with the understanding that all substitution instances would be allowed. Often when axioms and rules are schematically presented, one invokes a new style of sentence variable, e.g., a "schematic variable" P if one thinks of its being added to the object language, and says things like "all instances of the schematic axiom (P ∨ ∼P) are to count as axioms." Or in a more modern style, one uses P as a "metalinguistic variable," and says things like "for all sentences φ, (φ ∨ ∼φ) is an axiom." But there is nothing wrong with avoiding the need for these variables by using the atomic sentence p itself, saying "all substitution instances of (p ∨ ∼p) are axioms." Whether one explicitly uses axiom (and rule) schemes of either kind is a matter of personal choice, but what is important is that axioms and rules be closed under substitution. As argued before, this is what formal logic is all about. Moreover, from an epistemological point of view, schematic presentations are most important since they can allow, in the best of circumstances anyway, the number of axiom and rule schemes to be finite, the determination of their further instances being entirely a matter of "inspection" (very computable syntax). This is one way to be clear that the axioms and rules are effective.
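The definitions of proof and theorem are directly executable. The following sketch is ours, not the book's: a presentation is represented by a decidable axiom predicate together with (here, two-premise) rule predicates, and a candidate sequence is checked line by line; the arrow-string syntax is a toy assumption made for the example.

```python
# Hypothetical sketch: checking a proof in a Hilbert-style presentation.
# Sentences are strings in a toy syntax; 'axiom' and each two-premise
# rule are decidable predicates supplied by the presentation.

def is_proof(seq, axiom, rules):
    """seq: list of sentences; axiom: sentence -> bool;
    rules: predicates on ((premise1, premise2), conclusion)."""
    for i, phi in enumerate(seq):
        if axiom(phi):
            continue
        earlier = seq[:i]
        if any(rule((x, y), phi)
               for rule in rules for x in earlier for y in earlier):
            continue
        return False            # line i is unjustified
    return True

def modus_ponens(premises, conclusion):
    x, y = premises
    return y == x + '->' + conclusion   # y is the sentence x -> conclusion

axiom = {'a', 'a->b'}.__contains__      # a toy decidable axiom set
print(is_proof(['a', 'a->b', 'b'], axiom, [modus_ponens]))   # True
print(is_proof(['b'], axiom, [modus_ponens]))                # False
```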
More technically, then, we shall say of a Hilbert-style presentation that it is schematic if its axioms and rules are schematic, i.e., the axioms A can all be obtained as "substitution instances" from some subset Ā of them (each φ ∈ A is of the form σ(ψ) for some endomorphism σ of L applied to some "axiom scheme" ψ ∈ Ā). Analogously, we say of a rule R that it is schematic if there is some "rule scheme" (φ_1, ..., φ_n, ψ) ∈ R so that every member of R is of the form (σφ_1, ..., σφ_n, σψ) for some endomorphism σ of L. We shall say of a Hilbert-style presentation that it is finite iff there are a finite number of schematic rules and the axioms A can all be obtained as substitution instances from some finite subset Ā of axiom schemes as detailed above.

Clearly each Hilbert-style presentation P gives rise to a unary assertional system ℒ (the set of theorems). Also it is easy to show by induction on the length of proofs that if P is schematic, then ℒ is formal (closed under substitution). What about the converse? Given a unary assertional system ℒ (recall this is just a set of sentences), is there necessarily a Hilbert-style presentation of it? A positive answer is disappointingly easy: just let A = ℒ. The set R can be empty, since there is now no need for rules (every theorem having a one-step proof). Also clearly if ℒ is formal, then A can be schematic (let Ā = A). Surely we had something else in mind? Maybe an effective presentation, or even a finite presentation? To investigate this closely would take us on a detour through recursion theory, which is outside the scope of this book. Let it be said, though, that there is no reason at all to think that even a recursive formal unary assertional logic should be finitely axiomatizable (though it is trivially "recursively axiomatizable").

Hilbert-style presentations are obviously geared towards unary assertional logics, but in many "real-life" cases they can be used to "simulate" other varieties of logic. Thus, for example, one can introduce the concept of a deduction of the sentence φ from hypotheses Γ as a finite sequence whose last member is φ, each member of which is either (i) a member of Γ, or (ii) an axiom, or (iii) a consequence of previous items by a rule. Obviously, this is the same as the definition of proof except for the first clause, which in effect grants the status of "temporary axiom" to the members of Γ. This definition is only appropriate in "real-life" cases where the rules are all "truth-preserving" (like modus ponens and unlike substitution, generalization, necessitation, and other mere "validity-preserving" rules). One can then define Γ ⊢_P φ to mean that there is a deduction of φ from Γ.

Theorem 6.15.1 The relation Γ ⊢_P φ is a compact asymmetric consequence relation. Moreover, if P is schematic, then ⊢_P is formal.

Proof Compactness is an immediate consequence of the definition of a deduction as a finite sequence. Overlap follows from clause (i) of the definition. Dilution follows from the fact that clause (i) merely allows members of the hypothesis set to be used; it does not require that all be used. Because of compactness, it suffices to show just cut (as opposed to infinitary cut). This we do by way of the following lemma. □
Lemma 6.15.2 Let φ_1, ..., φ_m be a deduction of φ from Γ, and let ψ_1, ..., ψ_n be a deduction of ψ from Γ ∪ {φ}. Then their concatenation φ_1, ..., φ_m, ψ_1, ..., ψ_n is a deduction of ψ from Γ.

Proof Let us relabel the concatenation of the two deductions as χ_1, ..., χ_m, χ_{m+1}, ..., χ_{m+n}. Clearly for i such that 1 ≤ i ≤ m, the sequence χ_1, ..., χ_i is a deduction of χ_i from Γ. From m + 1 on, each χ_i is either a member of Γ, or an axiom, or a consequence of preceding items, with the possible exception of an item which is justified as being the sentence φ. But this sentence can be justified in just the same way that χ_m (= φ) was. □
There is another way that a Hilbert-style presentation can simulate asymmetric consequence: it can contain a binary "implication" connective → and a binary "conjunction" connective ∘. (These connectives are subject to certain natural conditions that we tease out below.) Let us define for sentences φ and ψ, φ ≤_P ψ iff ⊢_P φ → ψ. Clearly we would hope that this is a binary implication relation, i.e., that ≤_P is a pre-ordering. We can then go on to define for a set of sentences Γ, Γ ≤_P φ iff there exist sentences φ_1, ..., φ_n ∈ Γ so that the sentence (φ_1 ∘ ... ∘ φ_n) → φ is a theorem of P. Here we understand "the conjunction" φ_1 ∘ ... ∘ φ_n to have parentheses distributed across it such as to make the result a well-formed sentence. Thus we should talk of "a conjunction" of φ_1, ..., φ_n, it being understood that a single sentence φ_1 all by itself counts as "a conjunction." Finally, we want it understood that we do not have in mind an "empty conjunction," i.e., we do not allow the fact that φ is a theorem of P to mean that ∅ ≤_P φ. With all of these understandings we show that the relation Γ ⊢_P φ defined as ⊢_P Γ → φ is an asymmetric consequence relation. Incidentally, we shall label this relation explicit consequence.

It should be clear to the reader that the above notion can be made precise using the notion of Fus(Γ) as in Section 6.7. We can define Γ ≤_P φ to mean that for some γ ∈ Fus(Γ), γ ≤_P φ. If we can somehow connect P to pre-ordered groupoids, then we can take advantage of what we established there about consequence relations on such things. Thus we shall say that a Hilbert-style presentation P on an underlying language L induces a pre-ordered groupoid if (L, ≤_P, ∘) is a pre-ordered groupoid, i.e., all of the following hold:

(1) ⊢_P φ → φ;
(2) ⊢_P φ → ψ, ⊢_P ψ → χ ⇒ ⊢_P φ → χ;
(3) ⊢_P φ → ψ ⇒ ⊢_P (χ ∘ φ) → (χ ∘ ψ);
(4) ⊢_P φ → ψ ⇒ ⊢_P (φ ∘ χ) → (ψ ∘ χ).

If ∘ satisfies in addition the principles below, we shall speak of P inducing a pre-semi-lattice (in which case we shall usually refer to the semi-lattice operation as ∧ rather than ∘):

(5) ⊢_P (φ ∘ ψ) → φ;
(6) ⊢_P (φ ∘ ψ) → ψ;
(7) ⊢_P χ → φ, ⊢_P χ → ψ ⇒ ⊢_P χ → (φ ∘ ψ).
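For orientation, when → is classical material implication and ∘ is ∧, principles (1)–(7) can be confirmed by brute force over the two-valued matrix. The sketch below is ours and purely illustrative; the pointwise conditions it checks are sufficient for the theoremhood-level principles stated above.

```python
# Illustrative check (ours) that material implication and conjunction
# satisfy (1)-(7) read as two-valued validities; the premises of a
# principle are handled as side conditions on the assignments.
from itertools import product

imp = lambda a, b: (not a) or b
con = lambda a, b: a and b
V = [False, True]

checks = {
    '(1)': all(imp(a, a) for a in V),
    '(2)': all(imp(a, c) for a, b, c in product(V, repeat=3)
               if imp(a, b) and imp(b, c)),
    '(3)': all(imp(con(c, a), con(c, b)) for a, b, c in product(V, repeat=3)
               if imp(a, b)),
    '(4)': all(imp(con(a, c), con(b, c)) for a, b, c in product(V, repeat=3)
               if imp(a, b)),
    '(5)': all(imp(con(a, b), a) for a, b in product(V, repeat=2)),
    '(6)': all(imp(con(a, b), b) for a, b in product(V, repeat=2)),
    '(7)': all(imp(c, con(a, b)) for a, b, c in product(V, repeat=3)
               if imp(c, a) and imp(c, b)),
}
print(checks)   # every principle holds
```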
Exercise 6.15.3 Show that (3) and (4) are redundant in the list (1)–(7) above.

The following is now immediate from the results of Section 6.7.

Theorem 6.15.4 Let P be a Hilbert-style presentation, with the underlying language having connectives → and ∘ (primitive or defined) inducing a pre-ordered groupoid
(i.e., satisfying principles (1)–(4)). Then the relation Γ ⊢_P φ of explicit consequence is an asymmetric consequence relation.

Now we shall examine elliptical consequence, but relativizing it not just to any arbitrary set of sentences (as in Section 6.7), but rather to the theorems of P. To mark this fixing of parameters, we shall call this particular elliptical consequence implicit consequence. Let T be the set of theorems of P. We shall define Γ ≤_PI φ to mean that Γ ∪ T ≤_P φ. Also we shall define the mixed consequence Γ ≤_PM φ to mean that either (i) Γ ≤_P φ or (ii) ∅ ≤_PI φ.

Corollary 6.15.5 The relation of implicit consequence defined above is a compact asymmetric consequence relation.

The proof is clear from Corollary 6.7.3, implicit consequence being just a special case of elliptical consequence. Of somewhat greater interest in Hilbert-style settings is the stop-gap consequence relation, which we define as follows: Γ ≤_Psg φ iff either (i) Γ ≤_P φ or (ii) ⊢_P φ ("φ is either an explicit consequence of Γ or else φ is a theorem"). We call this "stop gap" both because we are hard pressed for a permanent name and because of the pun of providing for consequences from the empty set of premises. Stop-gap consequence is quite natural in the context of a Hilbert-style presentation, since one is tempted to say in one breath both that the theorems "follow from" no premises, and that they follow from all premises. It turns out that under a very natural assumption, stop-gap consequence turns out to be the same as mixed consequence, and we know from Section 6.7 that this last, under another quite general assumption, is the same as implicit consequence.

Corollary 6.15.6 Let P be a Hilbert-style presentation that induces a pre-ordered groupoid. Then if it has as a rule

(8) ⊢_P φ, ⊢_P φ → ψ ⇒ ⊢_P ψ (modus ponens),
then stop-gap consequence is identical to what in Section 6.7 we called mixed consequence, and hence, when the theorems are upper identities, to implicit consequence (which is of course a compact asymmetric consequence relation). Further, under all of the assumptions above, all these consequence relations agree with explicit consequence when the premise sets are non-empty.

Proof We first observe that

(9) ⊢_P φ iff there is some theorem t such that t ≤_P φ.

For the left-to-right assertion the t in question can be φ itself. Going from right to left, we simply invoke (8). Thus stop-gap consequence is just a species of mixed consequence. The rest follows from results of Section 6.7. □

The results above apply quite readily to a pre-semi-lattice.

Corollary 6.15.7 Let P be a Hilbert-style presentation, but let it induce a pre-semi-lattice. Then the relation of explicit consequence is a compact asymmetric consequence relation. Further, implicit consequence is also a compact asymmetric consequence relation. Finally, on the additional assumptions that the theorems are top elements and that P satisfies (8), stop-gap and implicit consequence coincide (also with explicit consequence when the premise sets are non-empty).

Proof Immediate from the fact that a pre-semi-lattice is a pre-ordered groupoid, applying Corollary 6.7.7 (note that when ∘ is ∧, a top element is the same as an upper identity). □
Can we simulate symmetric consequence in a Hilbert-style presentation? The use of deduction all by itself does not take us very far, since deductions have single conclusions. But given the presence of a "disjunction" connective +, we can define Γ ⊢_P Δ to mean that there exist ψ_1, ..., ψ_n ∈ Δ so that ψ_1 + ... + ψ_n is deducible from Γ.

Exercise 6.15.8 Tease out the properties of + that are natural to assume in order that the above definition yield a symmetric consequence relation.

We shall examine in some detail the alternative "object-language" definition of Γ ⊢_P Δ as ⊢_P Γ → Δ, meaning by this that there exist γ_1, ..., γ_m ∈ Γ, δ_1, ..., δ_n ∈ Δ such that ⊢_P (γ_1 ∘ ... ∘ γ_m) → (δ_1 + ... + δ_n). This we shall call explicit symmetric consequence. Again it is understood that →, ∘, and + are binary connectives, and that parentheses are to be distributed ad lib so as to make the above well-formed. The more precise way of talking is to invoke the framework of Section 6.9, and require that there exist γ ∈ Fus(Γ), δ ∈ Fis(Δ), such that ⊢_P γ → δ. In analyzing what properties are needed for this to give a symmetric consequence relation, we again revert to the framework of Section 6.9. Thus let us jump in straight away and assume that P induces a sufficiently idempotent hemi-distributoid, i.e., that the structure (L, ≤_P, ∘, +) is a sufficiently idempotent hemi-distributoid (where L is the language underlying P).

Theorem 6.15.9 Let P be a Hilbert-style presentation, with the underlying language having connectives →, ∘, + (primitive or defined) inducing a distributoid. Then the relation Γ ≤_P Δ of explicit consequence is a symmetric consequence relation.

Corollary 6.15.10 Let P be a Hilbert-style presentation inducing a distributive pre-lattice. Then the relation of explicit consequence Γ ≤_P Δ is a compact symmetric consequence relation.

The proof is immediate from Theorem 6.9.19. Let us now briefly examine implicit consequence. It should be clear by now how this allows for empty left-hand sides, but how can we deal with empty right-hand sides? Although an empty left-hand side has a natural interpretation in Hilbert-style systems (⊢ φ just means "φ is a theorem"), the problem is that there is no natural interpretation of its dual φ ⊢ in a Hilbert-style "assertional" system at any rate (in a "refutational" system it would just mean "φ is refutable"). If we, however, only had a negation connective ∼ "indicating" refutation, we could simulate refutation in an ordinary Hilbert-style (assertional) system by ⊢ ∼φ.
The idea in general is that we can set T to be the set of theorems, and F to be the set of "counter-theorems," i.e., those sentences φ such that ∼φ is a theorem. Then following the ideas of Section 6.9 we can define all three species of implicit consequence: left, right, and symmetric implicit (LE, RE, and SE). We can also define the corresponding versions of mixed consequence. Thus, for example, symmetric implicit is defined so that Γ ≤_PSE Δ iff Γ ∪ T ≤_P F ∪ Δ. And the corresponding symmetric mixed implication is defined so that Γ ≤_PSM Δ iff either (i) Γ ≤_P Δ, or (ii) ∅ ≤_PSE Δ, or (iii) Γ ≤_PSE ∅ (left implicit is defined by deleting F, and its mixed version by deleting clause (iii), and right implicit is defined by deleting T, and its mixed version by deleting (ii)). When the theorems are upper identities, we know (from the results of Section 6.9) that left implicit and its mixed version agree, and when the counter-theorems are lower identities, we know the same for right implicit and its mixed version (and when both the theorems and counter-theorems serve as appropriate identities, we know that symmetric implicit consequence is the same as symmetric mixed consequence). To say that the counter-theorems serve as the appropriate lower identities is just to say

(10) ⊢_P ∼φ ⇒ ⊢_P (φ + ψ) → ψ.
We are now in a position to define symmetric stop-gap consequence, saying Γ ≤_PSsg Δ iff either (i) Δ is an explicit consequence of Γ, or (ii) ⊢_P δ for some fission δ of members of Δ, or (iii) ⊢_P ∼γ for some fusion γ of members of Γ. Left stop-gap consequence omits clause (iii), while right stop-gap consequence omits clause (ii). What is needed to turn stop-gap implication into an honest implicit consequence? Just as with the asymmetric case, we need (8) to equate being a theorem with following from some theorem. What is new is that we need the principle below to equate being a counter-theorem with having as a consequence some counter-theorem:

(11) ⊢_P φ → ψ, ⊢_P ∼ψ ⇒ ⊢_P ∼φ (modus tollens).
Corollary 6.15.11 Let P be a Hilbert-style presentation that induces a distributoid. Then if (8) and (11) hold, then stop-gap consequence is identical to what in Section 6.9 we called mixed consequence, and hence, when also the theorems are upper identities and the counter-theorems lower identities, to implicit consequence (which is, of course, a compact symmetric consequence relation).

Corollary 6.15.12 Let P be a Hilbert-style presentation with connectives (primitive or defined) →, ∧, ∨, ∼ that induces a pre-distributive lattice. Then the relations of explicit and implicit consequence are both compact symmetric consequence relations. Let P also satisfy (8) and (11). If the theorems are all top elements and the counter-theorems are all bottom elements, then stop-gap and implicit symmetric consequence are the same, and agree also with explicit consequence when neither side (premises or conclusions) is empty.

Remark 6.15.13 Another, and in some ways more elegant, approach to the characterization of implicit consequence is to add a constant true sentence t, and interpret ⊢ φ as t ⊢ φ. For symmetry, we should add a constant false sentence f for interpreting empty right-hand sides. This interprets φ ⊢ as ⊢ φ → f, and, given the Johansson (1936) definition of ∼φ as φ → f, amounts to what we did above. Of course, one has to give these constant sentences some special properties. It turns out that making them respectively upper and lower identities suffices to ensure that the three species of implicit consequence agree with their mixed counterparts. What is then needed to forge linkage with the three stop-gap consequence relations is

(12) ⊢ φ ⟺ ⊢ t → φ.

Note incidentally that if we were working in the context of a system that allowed refutation as well as proof, we would need to add

(13) φ is refutable ⟺ ⊢ φ → f,

thus restoring what might seem to be some missing duality (in a Hilbert-style system there is no way of stating the dual of (12)).

Exercise 6.15.14 Reprove the various results above concerning "simulated" consequence, with the new definitions of the various species of implicit consequence using t and f. Of course the assumption is that P satisfies (12), in addition to the various hypotheses of the theorems above.

Remark 6.15.15 It might be thought strange that a single assertional logic (with the proper connectives) always gives rise to (at least) four different symmetric consequence relations. Which one is the real consequence relation of the logic? Given t with the properties above, the Hilbert-style system can always be recaptured by the definition of ⊢ φ as t ⊢ φ, no matter which of the four consequence relations we choose to work with. (The proof of this is left as an exercise for the reader.) And yet it is clear from even the case of classical logic that the four consequence relations are distinct. Thus explicit consequence allows no consequences "from or to" the empty set by definition, and both left and right implicit consequence disallow the empty set on one side or the other. It is only symmetric implicit consequence that gives full status to the empty set as both a legitimate set of premises and conclusions. And yet classical logic has valid arguments with the empty set of premises

(14) ∅ ⊢ {p, ∼p},

and also with the empty set of conclusions

(15) {p, ∼p} ⊢ ∅.
So this suggests, for classical logic at least, that symmetric implicit consequence is the "correct" consequence relation.

Exercise 6.15.16 We know by the absoluteness of symmetric consequence that every consequence relation is characterized by a unique semantics (class of valuations). This means that for any usual Hilbert-style axiomatization of classical logic, since there are four consequence relations, there are four different semantics. Thus corresponding to explicit consequence is the class of valuations that consists of the usual truth-functional valuations plus the valuation that makes every sentence true and the valuation that makes every sentence false (this is because we must be able to falsify both (14) and (15)). Left implicit consequence requires only the addition of the valuation that makes every sentence true, and right implicit consequence requires of course only the addition of the valuation that makes every sentence false. Every reader should verify that with these minor changes the semantics is able to falsify, as appropriate, (14) and/or (15). The reader knowing something about how to prove completeness for some standard axiomatization of classical propositional calculus should go on to prove that, with these additions, the semantics does characterize the appropriate consequence relations.
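The role of the two extra valuations can be spot-checked mechanically. The sketch below is ours; it uses the symmetric reading on which Γ ⊢ Δ holds over a class of valuations iff every valuation making all of Γ true makes some member of Δ true, and the one-atom language is an assumption made for brevity.

```python
# Illustrative check (ours): the all-false valuation falsifies (14)
# and the all-true valuation falsifies (15).
def holds(premises, conclusions, valuations):
    return all(any(v(d) for d in conclusions)
               for v in valuations
               if all(v(g) for g in premises))

p    = lambda v: v['p']          # sentences as functions of an assignment
notp = lambda v: not v['p']

def mk(assign):                  # an ordinary two-valued valuation
    return lambda s: s(assign)

V = [mk(a) for a in ({'p': True}, {'p': False})]
all_true  = lambda s: True       # the non-truth-functional extras
all_false = lambda s: False

print(holds([], [p, notp], V))                 # True: (14) is valid
print(holds([p, notp], [], V))                 # True: (15) is valid (vacuously)
print(holds([], [p, notp], V + [all_false]))   # False: all-false falsifies (14)
print(holds([p, notp], [], V + [all_true]))    # False: all-true falsifies (15)
```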
6.16 Effectiveness and Logic
To simplify things, we shall suppose that the logic is given an effective Hilbert-style presentation as discussed in Section 6.15, i.e., the set of axioms is decidable and there are finitely many decidable rules (viewing each n-ary rule as a set of n-tuples of sentences). A proof is a finite sequence of sentences, each of which is either an axiom or follows from previous members of the sequence using one of the rules. Hence a proof is a certain sequence of sequences, and can be understood as a single string by separating its component sentence strings by a new symbol used as a delimiter (think of it as a "space"). Theorem 6.16.1 The set of proofs of an effective Hilbert-style presentation of a logic is decidable.
Proof In turn, check each item of the sequence to see whether it is an axiom (this is given as a decidable question) or whether it follows from preceding members by one of the rules (again a decidable question). If the answer is "yes" for each element, then the sequence is a proof, and otherwise it is not. □

Corollary 6.16.2 The set of theorems is effectively enumerable.
Proof Construct a machine that strips the last line off of a proof. Effectively enumerate the proofs, and as you do feed them into the "stripping machine"; the result will be an enumeration of the theorems. □
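The enumeration argument can be rendered directly: generate all finite sequences of sentences in some effective order, keep those that pass the line-by-line proof check, and emit their last lines. The sketch is ours; is_proof is assumed to be a decidable predicate as in Theorem 6.16.1.

```python
# Hypothetical sketch of the "stripping machine": enumerate candidate
# sequences over an effectively listable set of sentences, keep the
# proofs, and emit last lines (with repetitions, which is harmless).
from itertools import count, product

def enumerate_theorems(sentences, is_proof):
    """sentences: a list (or any sliceable effective listing)."""
    for n in count(1):
        pool = sentences[:n]                    # first n sentences
        for length in range(1, n + 1):          # sequences of length <= n
            for seq in product(pool, repeat=length):
                if is_proof(list(seq)):
                    yield seq[-1]               # strip off the last line

# Toy instance: axioms {'a'}, no rules, so the only theorem is 'a'.
toy_is_proof = lambda seq: all(s == 'a' for s in seq)
gen = enumerate_theorems(['a', 'b', 'c'], toy_is_proof)
print(next(gen))    # 'a'
```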
Remark 6.16.3 It is well known that the set of theorems of classical propositional logic is decidable (being just the set of two-valued tautologies; cf. Chapter 9). Various non-classical propositional logics are also decidable (e.g., intuitionistic logic and many standard modal logics; cf. Chapters 10 and 11). But not all propositional logics, even those that are finitely axiomatizable, are decidable. For many years, all examples were more or less artificial. But recently A. Urquhart has shown the systems R and E of relevance logic to be undecidable (cf. Anderson et al. 1992). And Lincoln et al. (1992) have shown linear logic undecidable. It is clear that any logic that has a finite characteristic matrix is decidable. But Harrop (1958) showed (cf. also Harrop 1965) that if a logic has a finite presentation (finitely many schematic axioms and rules), then the following weaker property suffices:

Definition 6.16.4 Let ℒ be a unary assertional logic with a finite Hilbert-style presentation. Then ℒ has the finite model property iff for every non-theorem φ, there is a finite matrix M in which all axioms of ℒ are valid and in which all of the rules of ℒ preserve designation, but in which φ is not valid.

Theorem 6.16.5 (Harrop 1958) Let ℒ be a unary assertional logic with a finite Hilbert-style presentation. Then if ℒ has the finite model property, then ℒ is decidable.
Proof We know from Corollary 6.16.2 that the theorems are effectively enumerable. Finite models are also effectively enumerable, as we can see from a "brute force" argument. The actual nature of the elements of a matrix makes no difference (since isomorphic matrices verify the same sentences), so a matrix with n elements can always be thought of as composed of the elements {1, 2, ..., n}. Using these elements, one can thus construct (up to isomorphism) all of the two-element matrices, the three-element matrices, etc., and as one does so check to see whether the axioms are valid in the matrices and the rules preserve designation. (Here is where it is important that there be only finitely many axioms and rules; the theorems could still be enumerated even were the axioms an infinite set, as long as it is decidable.) So to decide the theoremhood of a candidate sentence φ, start one machine enumerating the theorems, and start another machine enumerating the models of the logic. As each model is enumerated, pass control over to a third machine that checks whether it is also a model of φ. At some point, either φ will show up in the enumeration of the theorems, or else, if it is a non-theorem, it will be rejected by one of the finite models being enumerated. So at some point, one will have a "yes or no" answer to the question of whether φ is a theorem.² □

Remark 6.16.6 A matrix of the sort used in the above theorem, where the axioms are valid and the rules preserve designation, is called a "strong" matrix for the logic ℒ. A matrix where the axioms are valid but where the rules merely preserve validity is called a "weak" matrix for ℒ. Obviously a strong matrix for a logic is also a weak matrix for that logic, but a weak matrix for a logic may not necessarily be a strong matrix for that logic. However, Harrop (1958, 1965) showed that if there is a finite weak matrix for a logic in which a certain sentence is not valid, then there is also a finite strong matrix for that logic which also invalidates that sentence. This shows the equivalence of what might be called the weak and the strong finite model properties. The idea of the proof is to take a non-theorem φ and consider the algebra of sentences generated just from its atomic sentences. The Lindenbaum matrix restricting the logic to that language is of course infinite. But one can use the given finite matrix to establish a natural congruence on the algebra of sentences, counting two sentences to be equivalent just when for every interpretation of the atomic sentences the sentences evaluate to the same element in the matrix. It is clear that the quotient algebra is finite, since for each positive integer n there are only finitely many n-ary operations on a finite set. We turn this quotient algebra into the required strong matrix by counting an equivalence class [ψ] as designated iff the sentence ψ is a tautology. We leave the details of the proof to the reader.

² Of course, one will not know in advance how many turns of the cranks of the two enumeration machines will be needed, but this is not essential to the notion of decidability. In privileged circumstances, one might have a "speed function" that would tell one how long one must wait for an answer.
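The "brute force" enumeration of finite matrices is easy to make concrete for a small signature. The sketch below is ours and purely illustrative: for a toy language with one binary connective, it enumerates all matrices on {0, ..., n−1} and keeps those validating a given axiom scheme; in a full decision procedure this stream would be dovetailed with the theorem enumerator above.

```python
# Hypothetical sketch of the model-enumeration half of Harrop's
# procedure, for a toy language with one binary connective '->'.
from itertools import product

def matrices(n):
    """All matrices on {0,...,n-1}: an operation table for '->' plus a
    non-empty proper designated subset."""
    elems = range(n)
    for table in product(elems, repeat=n * n):
        imp = lambda x, y, t=table: t[x * n + y]
        for dbits in product([False, True], repeat=n):
            D = {e for e in elems if dbits[e]}
            if D and len(D) < n:
                yield imp, D

def validates_identity_axiom(imp, D, n):
    # the axiom scheme p -> p is valid iff imp(a, a) is designated
    # for every element a
    return all(imp(a, a) in D for a in range(n))

good = [(imp, D) for imp, D in matrices(2)
        if validates_identity_axiom(imp, D, 2)]
print(len(good))   # number of two-element matrices validating p -> p
```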
7 MATRICES AND ATLASES

7.1 Matrices
We have introduced the notion of matrices in Chapter 5. Before introducing abstract notions, such as matrix product and matrix homomorphisms, we devote this section to an overview of several logically important matrices. This also gives us a chance to familiarize the reader with the abstract notions, which will be formally defined later, on concrete examples. After outlining some basics, we present the Łukasiewicz (and Kleene) matrices, the Gödel matrices and the Sugihara matrices. We furnish examples of submatrices, matrix isomorphism and homomorphism, product and preservation of tautologies. We conclude this section with a quick look at infinite generalizations and different philosophical interpretations.

7.1.1 Background

We defined the notion of a logical matrix in Definition 5.6.1. Formally, it is just an algebra with a non-empty proper subset of "designated" elements. The notion was formally introduced by Łukasiewicz and Tarski (1930) (translated into English in Tarski (1956)), and the study of matrices has been one of the characteristic features of the so-called "Polish School" in logic.¹ Many results concerning matrices are to be found in Polish journals such as Studia Logica. The classic treatise on matrices is Łoś (1944). The reader may wish to consult Wójcicki (1988) for many interesting results concerning matrices which go beyond the scope of our discussion here. The aim of the discussion in this section is to develop some relatively informal intuitions concerning matrices, with very few proofs and some simple exercises. We encourage the reader to think of the elements of the underlying algebra of a matrix as "propositions," but a popular alternative is to think of them as "truth values." Of course since every right-thinking person knows that there are only two real truth values (which we have designated as t and f) this would lead to a severely limited number of matrices. Indeed, except for the question of which operations to favor, this would basically lead to only variations on the classical truth tables. A natural response, originating with Łukasiewicz (1920, 1930), is to allow degrees of truth. Thus for three values we might have: true, false, neither.
¹ English translations of some of the important early works of Polish logicians may be found in McCall (1967).
These values can be understood in various ways.² Łukasiewicz himself was motivated by Aristotle's famous problem of future contingents, and thought that most statements about the future were not determined to be either true or false at the present. Thus true for Łukasiewicz means "presently determined to be true," and false means "presently determined to be false," whereas neither means "none of the above." Besides bringing metaphysical issues about determinism into the picture, Łukasiewicz further complicates things by bringing in the notion of tensed truth values. There are other logicians, such as Kleene, who have seen the need for three "truth values" even in the context of arithmetic, where presumably there are no issues of causal determination or tense. Kleene saw that in certain formalisms for recursive functions, it was not effectively decidable whether applying a function f at an argument n terminated, i.e., whether the value f(n) exists or not. Thus Kleene devised a system of truth tables much like Łukasiewicz's for sentences involving terms that might refer to partial recursive functions. There are also logicians³ who have wanted to work with four "truth values," adding the value both to the values true, false, neither (cf. Anderson et al. 1992). See Dunn (1999) for a systematic exploration of various options in the use of these values. One thing to realize is that, so far in our description, there has really been no need to ontologize the extra "truth values." Not to have a value, or to have two values, need not be taken as additional values. (One is reminded of Lewis Carroll's "Nobody passed me on the road.") It is rather that a valuation can go wrong in two different ways: it can fail to be total or it can fail to be uniquely defined. A convenient way to mathematize this intuition is to take a valuation to be a function to the power set of the original two truth values {t, f}.

7.1.2 Łukasiewicz matrices/submatrices, isomorphisms
But what happens if we want more than four values? Let us return to Łukasiewicz, who suggested having systems with 3, 4, 5, ... values. Indeed, for any positive integer n one can have the n + 1 values

{n/n, (n−1)/n, ..., (n−(n−1))/n, 0/n}, i.e., {1, (n−1)/n, ..., 1/n, 0}.

One might interpret, say, {1, 2/3, 1/3, 0} as true, mostly true, mostly false, false. Łukasiewicz extended this to infinitely many values, taking either the rational or real numbers as values. A set of values does not by itself make up an algebra, let alone a matrix. Two items need to be added: an indexed set of operations and a designated subset of the values. Łukasiewicz defined the operations as follows: x ∧ y = min(x, y), x ∨ y = max(x, y), ¬x = 1 − x, and x → y = min(1 − x + y, 1). Łukasiewicz took the set of designated values to be just {1}. By Łm we shall mean the finite Łukasiewicz matrix with m values. The reader can easily compute that the matrix Ł3 is presented by the following "generalized truth tables" (* indicates the designated value):

² Malinowski (1993, Section 2.4) is a historically sensitive discussion of the philosophical difficulties raised by the various interpretations.
³ Dunn (1976) pioneered this.
¬
1*    0
1/2   1/2
0     1

→     1     1/2   0
1*    1     1/2   0
1/2   1     1     1/2
0     1     1     1

∧     1     1/2   0
1*    1     1/2   0
1/2   1/2   1/2   0
0     0     0     0

∨     1     1/2   0
1*    1     1     1
1/2   1     1/2   1/2
0     1     1/2   0
L2 is the two-valued Boolean algebra 2 = {1, 0} with 1 as the only designated element. The operations are defined by the usual two-valued truth tables:
 ¬   |
 1*  | 0
 0   | 1

 →   | 1   0
 1*  | 1   0
 0   | 1   1

 ∧   | 1   0
 1*  | 1   0
 0   | 0   0

 ∨   | 1   0
 1*  | 1   1
 0   | 1   0
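Since the operations are ordinary arithmetic on fractions, the finite matrices Lm are easy to compute with mechanically. The following Python sketch (the function name is our own) generates Lm and reproduces two entries of the L3 implication table displayed above:

    from fractions import Fraction

    # The finite Lukasiewicz matrix L_m: values 0, 1/(m-1), ..., 1,
    # with 1 the sole designated value.
    def lukasiewicz_matrix(m):
        vals = [Fraction(i, m - 1) for i in range(m)]
        ops = {
            'and': min,
            'or': max,
            'not': lambda x: 1 - x,
            'imp': lambda x, y: min(1 - x + y, Fraction(1)),
        }
        return vals, ops, {Fraction(1)}

    vals, ops, D = lukasiewicz_matrix(3)
    assert ops['imp'](Fraction(1, 2), Fraction(1, 2)) == 1   # 1/2 -> 1/2 = 1
    assert ops['imp'](Fraction(1), Fraction(1, 2)) == Fraction(1, 2)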
While the characterization of the elements as numbers in the unit interval [0, 1] is mathematically convenient, particularly in terms of characterizing negation and implication, it is by no means essential. The various finite Lukasiewicz matrices could just as well be characterized as having as elements {0, 1}, {0, 1, 2}, {0, 1, 2, 3}, ..., or say {−1, +1}, {−1, 0, +1}, {−2, −1, +1, +2}, .... The point of course is that only the cardinality of the set of elements matters. On the first characterization the three-valued Lukasiewicz tables can be presented as follows:
 ¬   |
 2*  | 0
 1   | 1
 0   | 2

 →   | 2   1   0
 2*  | 2   1   0
 1   | 2   2   1
 0   | 2   2   2

 ∧   | 2   1   0
 2*  | 2   1   0
 1   | 1   1   0
 0   | 0   0   0

 ∨   | 2   1   0
 2*  | 2   2   2
 1   | 2   1   1
 0   | 2   1   0
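That the two presentations agree can be checked mechanically. The sketch below (our code) verifies that the relabeling h(x) = 2x carries the fractional presentation of L3 onto the integer one, preserving implication and designation in both directions:

    from fractions import Fraction

    # h sends 0, 1/2, 1 (with 1 designated) to 0, 1, 2 (with 2 designated).
    h = {Fraction(0): 0, Fraction(1, 2): 1, Fraction(1): 2}

    imp_frac = lambda x, y: min(1 - x + y, Fraction(1))
    imp_int = lambda x, y: min(2 - x + y, 2)

    for x in h:
        for y in h:
            assert h[imp_frac(x, y)] == imp_int(h[x], h[y])
    assert all((x == 1) == (h[x] == 2) for x in h)  # designation matches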
The two presentations of L3 illustrate the idea of two matrices being isomorphic, which means that there is an isomorphism h in the algebraic sense between the two underlying algebras such that an element a is designated in the one matrix iff h(a) is designated in the other. What this amounts to visually is that the tables characterizing the operations for one matrix are a simple "relabeling" of the tables for the other (including where * appears). We next illustrate the notion of "submatrix." Consider again the table for Lukasiewicz's three-valued implication, with the 1/2 row and column marked off for deletion:

 →   | 1    1/2  0
 1*  | 1    1/2  0
 1/2 | 1    1    1/2
 0   | 1    1    1
The reader can determine by inspection that the two-valued truth table for the classical material conditional can be found "hidden" in the four corners of the tables for L3. We also of course need to consider the tables for ¬, ∧, ∨ before we can conclude that "L2 is a submatrix of L3," a task we leave to the reader.
In the next section we shall be more formal about the definition, but for now it suffices to say that a matrix M is a submatrix of a matrix M' iff the algebraic part of M is a subalgebra of the algebraic part of M', and the designated elements of M are those also designated in M'. We want to take the occasion to make the abstraction visually concrete. Given a table defining an operation, let us call the leftmost column and the uppermost row of values the border, and let us call the remaining values the interior. The values on the border are inputs, and those in the interior are outputs. To begin with, let us consider matrices with only one operation: →. In visual terms, to say that M is a submatrix of M' means that the table for M results from deleting one or more rows and matching columns in the following way. One selects elements k1, ..., kn that one wants to delete. One deletes the rows headed by k1, ..., kn, as well as the columns headed by k1, ..., kn. (In the case pictured above there is just k1 = 1/2.) After this operation, two conditions must be met, which are easy to check visually: (1) the interior of the table must not contain a value that does not occur on the border; and (2) at least one designated value must remain on the border. Extending this to matrices with more than one operation is straightforward. For all of the tables displaying its operations, one must be able to do deletions as described above of the "same" rows and columns (i.e., those headed by the same values).
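The two visual conditions just stated are easy to mechanize. Here is a Python sketch (our helper, under the assumption of a single binary operation) of the deletion test, applied to the deletion of 1/2 from the L3 implication table:

    from fractions import Fraction

    def is_submatrix_after_deletion(border, table, designated, deleted):
        kept = [v for v in border if v not in deleted]
        # (1) the surviving interior uses only border values;
        # (2) at least one designated value survives on the border.
        closed = all(table[(x, y)] in kept for x in kept for y in kept)
        return closed and any(v in designated for v in kept)

    L3 = {(x, y): min(1 - x + y, Fraction(1))
          for x in (Fraction(0), Fraction(1, 2), Fraction(1))
          for y in (Fraction(0), Fraction(1, 2), Fraction(1))}
    print(is_submatrix_after_deletion(
        [Fraction(1), Fraction(1, 2), Fraction(0)], L3,
        {Fraction(1)}, {Fraction(1, 2)}))  # True: L2 hides inside L3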
Exercise 7.1.1 (Lindenbaum) Show that L2 is a submatrix of each matrix Ln, for n ≥ 2.

The reader should not hastily draw the conclusion that Ln is always a submatrix of Ln+1. It is easy to see that L3 is not a submatrix of L4, since negation has a fixed point (1/2) in L3 but not in L4.

Exercise 7.1.2 (Lindenbaum) This generalizes the previous exercise. Show that if m divides n (without remainder) then Lm+1 is isomorphic to a submatrix of Ln+1. (Hint: If m divides n, then ∃k(km = n). The map h(x) = kx is the desired isomorphism.)

Various interpretations can be put on the values of a Lukasiewicz matrix. Lukasiewicz himself favored some kind of interpretation where the values are taken to be probabilities. This does not fit the interpretation he gives to the connectives. For example, a conjunction is given the least value of either of the two conjuncts, whereas the probability calculus would instead take their product (and only then when the two conjuncts are independent). If one looks at only negation, conjunction, and disjunction, one gets Kleene's "strong" three-valued matrix K3. Kleene interprets the intermediate value as indicating "undefined," and he throws away the Lukasiewicz implication, providing in its place φ ⊃ ψ, which can be regarded as just an abbreviation for ¬φ ∨ ψ. Note that 1/2 ⊃ 1/2 = 1/2 and is undesignated in K3, whereas Lukasiewicz set 1/2 → 1/2 = 1. This means that there are no tautologies in Kleene's strong three-valued matrix. Note that there are nonetheless valid consequences, the simplest example being φ ⊨K3 φ, i.e., every interpretation in K3 that gives φ the designated value 1 also assigns φ the value 1. Kleene also provided a "weak" three-valued logic which differs from the strong one only when one of the input values is the intermediate "undefined" value. The idea, put
in computer science jargon, is "garbage in, garbage out," or in American folk wisdom "one rotten apple spoils the barrel." Put quickly, any computation involving 0, 1 is done as with the classical truth tables, but any calculation that involves 1/2 as input results in 1/2 as output. This goes back to Bochvar's logic of meaningfulness. Readers wishing to learn more about the Kleene and Bochvar interpretations can consult Urquhart (1985).
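The contrast between the strong and weak tables can be put in one line each; the following sketch (our encoding) shows conjunction on {0, 1/2, 1} in both styles:

    from fractions import Fraction

    HALF = Fraction(1, 2)

    def strong_and(x, y):
        return min(x, y)      # the strong tables can recover: 1/2 AND 0 = 0

    def weak_and(x, y):
        if HALF in (x, y):
            return HALF       # "garbage in, garbage out"
        return min(x, y)

    assert strong_and(HALF, 0) == 0
    assert weak_and(HALF, 0) == HALF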
7.1.3 Gödel matrices/more submatrices

The operations above are obviously not the only operations definable on three values. Gödel (1933) defined a sequence of finite matrices G1, G2, G3, ... which satisfy the theorems of the intuitionistic propositional calculus (cf. Section 11.10). The n-valued matrix Gn has as its set of values {0, 1, ..., n − 1}. Gödel let 0 be the only designated value, but we follow the usual custom and reverse the order relation, so that n − 1 is the designated value. (Actually, it is more elegant to replace n − 1 with ω, since otherwise the designated value differs from matrix to matrix.) Operations are then defined as follows: a ∧ b = min(a, b), a ∨ b = max(a, b) (just as for Lukasiewicz), but a → b = n − 1 if a ≤ b, and otherwise a → b = b. Finally, ¬a can be defined as a → 0, which means that for a ≠ 0, ¬a = 0, but for a = 0, ¬a = n − 1. G1 is a degenerate one-element matrix whose only element is designated. G2 is the classical two-valued matrix. G3 is characterized by the following three-valued tables (we put a † where they differ from Lukasiewicz; conjunction and disjunction are minimum and maximum, just as for Lukasiewicz):

 ¬   |
 2*  | 0
 1   | 0†
 0   | 2

 →   | 2   1   0
 2*  | 2   1   0
 1   | 2   2   0†
 0   | 2   2   2

 ∧   | 2   1   0
 2*  | 2   1   0
 1   | 1   1   0
 0   | 0   0   0

 ∨   | 2   1   0
 2*  | 2   2   2
 1   | 2   1   1
 0   | 2   1   0

Exercise 7.1.3 Show that each Gödel matrix Gn (n ≥ 2) is isomorphic to a submatrix of Gn+1. (Hint: Define h so that for 0 ≤ m < n − 1, h(m) = m, and h(n − 1) = n.)

Remark 7.1.4 The map above is almost the identity map, and so each Gn (n ≥ 1) is almost a submatrix of Gn+1. We would get precisely a submatrix if we changed the characterization of a Gödel matrix so that the top element was always the same, say ω.
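For experimentation, the Gödel operations admit a very short implementation; the sketch below (our function names) reproduces the two flagged entries of the G3 tables:

    # The n-valued Goedel matrix G_n on {0, ..., n-1}, with n-1 designated.
    def goedel_ops(n):
        top = n - 1
        imp = lambda a, b: top if a <= b else b
        neg = lambda a: imp(a, 0)
        return imp, neg

    imp, neg = goedel_ops(3)
    assert imp(1, 0) == 0 and neg(1) == 0  # the two entries flagged above
    assert imp(0, 0) == 2                  # vacuously top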
7.1.4 Sugihara matrices/homomorphisms

Another interesting sequence of finite matrices, called the Sugihara matrices, satisfies the relevance logic R (and in fact R. K. Meyer has shown that they are jointly characteristic of the system RM (R-Mingle); cf. Anderson and Belnap 1975). The values of the n-valued Sugihara matrix Sn are the natural numbers {0, 1, ..., n − 1}. We define a ∧ b and a ∨ b as minimum and maximum, just as for Lukasiewicz and Gödel, and we define ¬a as the "mirror image" of a, i.e., ¬a = (n − 1) − a (just as for Lukasiewicz). Then a → b = ¬a ∨ b if a ≤ b, and otherwise a → b = ¬a ∧ b. Unlike either Lukasiewicz or Gödel, we can have more than one designated value, taking a as designated whenever ¬a ≤ a. S2 is of course just the usual two-valued matrix and has only one designated value. The three-valued matrix for negation is just as for Lukasiewicz, and the four-valued matrix for negation is obvious: ¬3 = 0, ¬2 = 1, ¬1 = 2, ¬0 = 3. The three- and four-valued tables for the binary connectives are as follows (again † indicates differences from Lukasiewicz):
S3 (designated values 2, 1):

 →   | 2   1   0
 2*  | 2   0†  0
 1*  | 2   1†  0†
 0   | 2   2   2

 ∧   | 2   1   0
 2*  | 2   1   0
 1*  | 1   1   0
 0   | 0   0   0

 ∨   | 2   1   0
 2*  | 2   2   2
 1*  | 2   1   1
 0   | 2   1   0

S4 (designated values 3, 2):

 →   | 3   2   1   0
 3*  | 3   0†  0†  0
 2*  | 3   2†  1†  0†
 1   | 3   2†  2†  0†
 0   | 3   3   3   3

 ∧   | 3   2   1   0
 3*  | 3   2   1   0
 2*  | 2   2   1   0
 1   | 1   1   1   0
 0   | 0   0   0   0

 ∨   | 3   2   1   0
 3*  | 3   3   3   3
 2*  | 3   2   2   2
 1   | 3   2   1   1
 0   | 3   2   1   0
Note that S3 is not a submatrix of S4, since ¬ has a fixed point (¬1 = 1) in S3 but not in S4. Nonetheless, there is an interesting relationship between the two, which we shall express by saying that S3 is isomorphic to a weak homomorphic image of S4. Visually this means we can "box" together 2 and 1, squinting so to speak so that they blur together, and the result can be seen to be isomorphic to the implication table for S3:
The implication table for S4:

 →   | 3   2   1   0
 3*  | 3   0   0   0
 2*  | 3   2   1   0
 1   | 3   2   2   0
 0   | 3   3   3   3

boxing and blurring 2 and 1 together:

 →      | 3   {2,1}  0
 3*     | 3   0      0
 {2,1}* | 3   {2,1}  0
 0      | 3   3      3

and relabeling 3 ↦ 2, {2,1} ↦ 1, 0 ↦ 0:

 →   | 2   1   0
 2*  | 2   0   0
 1*  | 2   1   0
 0   | 2   2   2
Note that designation is preserved from left to right, but not conversely, since {2, 1} is designated in the target, and yet 1 is not designated in the source. We will speak of a strong homomorphism when designation is preserved in both directions.
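The boxing-and-blurring argument can be checked mechanically. The following sketch (our helpers, checking implication only; the other operations are similar) verifies that h(3) = 2, h(2) = h(1) = 1, h(0) = 0 is a weak, but not strong, matrix homomorphism from S4 onto S3:

    def sugihara_imp(n):
        top = n - 1
        return lambda a, b: max(top - a, b) if a <= b else min(top - a, b)

    imp4, imp3 = sugihara_imp(4), sugihara_imp(3)
    h = {3: 2, 2: 1, 1: 1, 0: 0}
    des4, des3 = {2, 3}, {1, 2}

    assert all(h[imp4(a, b)] == imp3(h[a], h[b]) for a in h for b in h)
    assert all(h[a] in des3 for a in des4)  # designation is preserved...
    assert h[1] in des3 and 1 not in des4   # ...but not reflected: weak only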
Exercise 7.1.5 Show that S2 is isomorphic to a submatrix of S3.

Exercise 7.1.6 (Dunn (1970)) Show that when n is odd, Sn is a homomorphic image of Sn+1, and when n is even, Sn is isomorphic to a submatrix of Sn+1.

Remark 7.1.7 We have used initial segments of the natural numbers for the Sugihara matrices, so as to facilitate comparison with the Lukasiewicz and Gödel matrices. But each has its most natural and/or normal presentation. We have already seen that for the Lukasiewicz matrices fractions between 0 and 1 are the most natural (with 1 designated). The Gödel matrices are, perhaps, best understood as an initial segment of the natural numbers, with ω added on top as the designated element. For the Sugihara matrices it is natural to use the integers −n, −n + 1, ..., −1, (0), +1, ..., n − 1, n, the parentheses around 0 indicating that it may be absent. Now we can define ¬a as −a. Then a → b = max(−a, b) if a ≤ b, and otherwise a → b = min(−a, b). And a is designated whenever 0 ≤ a.
7.1.5 Direct products

We illustrate one last construction involving matrices. Suppose we want to take two matrices, say L3 and S3, and "glue them together." This is accomplished using the "direct product" construction, which is obtained by first taking the direct product of the underlying algebras, and then taking an element (a1, a2) as designated just when both a1 and a2 are designated. In the table below we have taken the liberty of striking out the second components, and boxing together those pairs that have the same first component. It is clear from this that L3 is a homomorphic image of Π(L3, S3).
[Table: the implication table for Π(L3, S3), a nine-by-nine array of pairs; with the second components struck out and pairs sharing a first component boxed together, it is visibly a blown-up copy of the implication table for L3.]
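A sketch of the construction itself may be useful (our encoding; the operation shown is implication, and the observation in the comment matches the boxing argument above):

    from itertools import product

    # Direct product of two matrices: operations act componentwise and
    # a pair is designated iff both components are.
    def direct_product(vals1, imp1, des1, vals2, imp2, des2):
        vals = list(product(vals1, vals2))
        imp = lambda p, q: (imp1(p[0], q[0]), imp2(p[1], q[1]))
        des = {p for p in vals if p[0] in des1 and p[1] in des2}
        return vals, imp, des

    # The first projection is then a (weak) matrix homomorphism onto the
    # first factor: it commutes with the operations and carries designated
    # pairs to designated elements, though it may also carry undesignated
    # pairs to designated ones.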
Exercise 7.1.8 Through similar "visual reasoning," show that S3 is also a homomorphic image of the direct product. (Hint: First rearrange the order of the values in the borders so as to allow the order of the second component to dominate.)

7.1.6 Tautology preservation

Given a matrix M, let Taut(M) (the "tautologies" of M) be the set of sentences that take designated values for every interpretation in the underlying algebra. Given a unary assertional logic L, a matrix M is said to be characteristic for L whenever for all sentences φ: φ is a theorem of L iff ⊨M φ.
Thus Taut(L3), Taut(S3) ⊇ Taut(Π(L3, S3)) follows because, as we have seen, both L3 and S3 are homomorphic images of the direct product Π(L3, S3). Going in the other direction, let us suppose that we have an interpretation that assigns a sentence an undesignated pair in Π(L3, S3), say (1/2, 0). Because of the "componentwise" definitions of the operations, if we just look at the first component we have an interpretation in L3 which assigns the formula 1/2, whereas if we look at the second we have an interpretation in S3 that assigns the formula 0. Since an undesignated pair must have at least one undesignated component, the sentence fails to be a tautology in at least one of the two factors, and so Taut(L3) ∩ Taut(S3) ⊆ Taut(Π(L3, S3)).

7.1.7 Infinite matrices
The matrices presented above are all finite. In this section we discuss some interesting infinite analogs, to further develop intuitions, skipping proofs. Each of the series of matrices M1, M2, ... described above has an infinite matrix M∞ whose tautologies are just those sentences which are tautologies in each of the matrices Mi, namely the direct product Πi∈ω Mi. But in each case there is a more easily visualized matrix which is in effect the limit of the series. For the Lukasiewicz series, this limit matrix is defined on the rational numbers between 0 and 1: [0, 1] ∩ Q. We shall denote it by Lω, since it is well known that the cardinality of the rational unit interval is ω (often written ℵ0). There is a bigger infinite matrix defined on all of the real numbers in the interval [0, 1], and we shall denote it by Lℵ1, since (given the continuum hypothesis) ℵ1 is its cardinality. We shall not prove this here, but Lukasiewicz and Tarski (1930) showed that each of Πi∈ω Li, Lω, and Lℵ1 determines the same tautologies, the so-called "infinite Lukasiewicz logic." The interested reader is advised to consult Wójcicki (1988). What about an infinite matrix for the series of Gödel matrices? Again the direct product will do, but the "limit" Gω is nicer. This is obtained by taking the sequence of natural numbers and sticking ω at the "end":

0, 1, 2, ..., n, n + 1, ..., ω.

The tautologies are the same as those of Dummett's logic LC, and are the same as the tautologies shared by all of the Gödel matrices. With the Sugihara matrices we have an embarrassment of riches. It is fairly easy to see that the full series S1, S2, S3, ..., Sn, ... is tautology-equivalent to the sub-series S2, S4, ..., S2n, ..., meaning that a sentence is valid in all of the first iff it is valid in all of the second. The second series is missing the matrices with 0: S1, S3, ..., S2n+1, .... But since each S2n+1 is a homomorphic image of S2n+2, anything valid in all of the even matrices is valid in the odd ones as well.
As noted earlier, Meyer showed that the Sugihara matrices are jointly characteristic of RM; Dunn (1970) actually was written after Meyer's result (despite the date) and provides a more directly algebraic proof of this result, as well as obtaining additional results, the most important of which is the "pretabularity" of RM, which roughly means that every extension is such that its tautologies are characterized by a finite characteristic matrix. A version of it may be found in Anderson and Belnap (1975, Section 29.4). Dunn and Meyer (1971) also showed that LC is pretabular (cf. Section 11.10 below). In this section we have mentioned "tautologyhood" in a matrix from time to time, implicitly focusing on unary assertional logics. It is worth remarking that things can change if one looks at consequence. For example, Dunn (1970) shows that the Sugihara matrix SQ defined on the rationals is strongly characteristic for RM, meaning that φ is deducible from Γ using the axioms and rules of RM just when every interpretation in SQ that assigns every member of Γ a designated value also assigns φ a designated value (in symbols: Γ ⊢RM φ iff Γ ⊨SQ φ). All of SZ, SZ±, SQ± are tautology-equivalent to SQ. But it is shown that none of SZ, SZ±, SQ± is strongly characteristic for RM. It is an interesting question as to when a logic is characterized by a finite matrix. Ulrich (1968, 1986) gives an interesting answer for unary assertional logics. Sometimes, while there is no single finite matrix characterizing a logic, there is nonetheless a system of finite matrices that jointly characterize it. This is true of the Lukasiewicz infinite logic, and of LC and RM. This illustrates the finite model property (Definition 6.16.4).

7.1.8 Interpretation
How should we best interpret all of these different many-valued matrices? Something like degrees of truth might be appropriate given Lukasiewicz's motivation, but there is another possibility, namely that they be interpreted as "propositions." This is most easily illustrated in terms of the Gödel matrices. Thus let U2 = {a0, a1}. Think of a0 and a1 as states of information, and accordingly introduce an "information order" postulating that a0 ⊑ a1. In the present circumstance we can actually identify these information states with "moments of time." Of course we also postulate a0 ⊑ a0 and a1 ⊑ a1. A proposition can then be understood to be a subset of U2 that is closed upward under ⊑. Let us indicate the collection of these as ℘↑(U2) (the mnemonics are ℘(U2) for the power set of U2 and ↑ for upward closed). Thus the propositions are ∅, {a1}, {a0, a1}. The reader will immediately recognize the fortuitous circumstance that these are three in number, and that they are linearly ordered by ⊆. For p, q ∈ ℘↑(U2), define p ∧ q = p ∩ q, p ∨ q = p ∪ q. These clearly agree with the Lukasiewicz definition of conjunction and disjunction as minimum and maximum. We next define implication and negation as follows:
(1) p → q = {α : ∀β ⊒ α, if β ∈ p then β ∈ q};
(2) ¬p = {α : ∀β ⊒ α, β ∉ p}.
The knowledgeable reader will recognize that these definitions correspond to the so-called Kripke-Grzegorczyk semantics for intuitionistic logic. More can be found on this in Chapter 11. The reader can easily check that the following tables are obtained
using the above definitions, and that these tables are isomorphic to the Gödel tables given previously (we omit conjunction and disjunction since their match with min and max has already been discussed):

 →         | {a0,a1}   {a1}      ∅
 {a0,a1}*  | {a0,a1}   {a1}      ∅
 {a1}      | {a0,a1}   {a0,a1}   ∅
 ∅         | {a0,a1}   {a0,a1}   {a0,a1}

 ¬         |
 {a0,a1}*  | ∅
 {a1}      | ∅
 ∅         | {a0,a1}
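The correspondence can be spot-checked by computation. The sketch below (our encoding: states 0 ⊑ 1, propositions as frozensets) verifies that implication on upward-closed sets reproduces the G3 implication table:

    UP = [frozenset(), frozenset({1}), frozenset({0, 1})]  # 0, 1, 2 in G3

    def above(x):
        return {0, 1} if x == 0 else {1}

    def imp(p, q):
        # alpha is in p -> q iff every beta above alpha in p is also in q
        return frozenset(x for x in (0, 1)
                         if all(b in q for b in above(x) if b in p))

    g3_imp = lambda a, b: 2 if a <= b else b
    for a, p in enumerate(UP):
        for b, q in enumerate(UP):
            assert imp(p, q) == UP[g3_imp(a, b)]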
Exercise 7.1.9 Show that this can be extended to an (n + 1)-valued Gödel matrix whose elements are {0, ..., n} by using n + 1 linearly ordered states a0 ⊑ a1, ..., a_{i−1} ⊑ a_i, ..., a_{n−1} ⊑ a_n in Un. It is easy to see that the propositions correspond to the principal cones [ai) plus the empty cone ∅.

Given this one concrete illustration, we shall quickly discuss how something similar can be provided for the Lukasiewicz tables. Again we shall use a linearly ordered set U, whose elements are again thought of as moments of time. For concreteness let us again consider in effect U2, but this time we shall drop the a notation, and just set U2 = {0, 1}, with 0 thought of as "the present." The reader should not be confused by the fact that we are using natural numbers both for the elements of the three-valued matrix and for the states in its interpretation. The propositions on U2 are then the three upward closed cones {0, 1}, {1}, ∅, with {0, 1} being a proposition that is true at present (and so will continue to have been true), {1} being a proposition that will become true at the next (and final) moment, and ∅ being a proposition that is never true. We again take conjunction to be intersection, and disjunction to be union, but we have the following definitions of implication and negation:
(3) p → q = {i : ∀j ∈ U such that i + j ∈ U, if j ∈ p then i + j ∈ q};
(4) ¬p = {i : n − i ∉ p}.
The reader will now see why we chose to use natural numbers for the states, namely because it makes sense to add and subtract them. The idea above comes from Urquhart (1973). There is a different interpretation of essentially the same mathematics that comes from Scott (1974) in terms of "degrees of precision."⁵ We shall not go into either the philosophy or the mathematics needed to make more sense of this idea, but let us point out that we at least have a basis for seeing what Scott has meant by his slogan of "replacing many values by many valuations" (the many valuations being given by relativizing the truth of a proposition to a degree of precision, or, in terms of the Urquhart semantics, a point in time). How do we interpret the Sugihara matrices? Again we think of U as moments of time. But we complicate matters by saying that a proposition can be both true and false at a given moment.

⁵ Cf. Urquhart (1985) for a discussion and comparison of the two interpretations.
This leads to the construal of a proposition p as a pair of sets of moments (p+, p−), where p+ consists of the moments at which p is true, and p− of the moments at which p is false. We require that both p+ and p− be cones in the information order. Negation, conjunction, and disjunction are then defined as follows:

¬(p+, p−) = (p−, p+),
(p+, p−) ∧ (q+, q−) = (p+ ∩ q+, p− ∪ q−),
(p+, p−) ∨ (q+, q−) = (p+ ∪ q+, p− ∩ q−).
The definition of → is slightly complicated. Put intuitively, it is a variation on (1), the definition of intuitionistic implication, that treats falsity independently from truth. We begin by setting (p+, p−) → (q+, q−) = ((p → q)+, (p → q)−). We then define (p → q)+ to be [...], and (p → q)− to be [...].

Readers who wish to learn more about the semantics of RM and how it relates to other semantical approaches to relevance logic can consult Anderson et al. (1992).

Exercise 7.1.10 Show that one gets the three-element Sugihara matrix when one has just two moments of time a0 and a1. Go on to show that one gets the (n + 1)-element Sugihara matrix by having n moments of time.

In the rest of the book we shall not directly discuss Lukasiewicz's approach to many-valued logic, and we shall say little about relevance logic. But we brought these subjects up here to make the point that a matrix understood abstractly as an algebra with some designated elements can often be given a more intuitive interpretation ("representation") where the elements are understood as sets of "states." In the examples above the states are taken to be moments in time, but they can also be taken to be states of information about an evolving system. When matrices are so interpreted, it is best to think of a matrix element as a "proposition," and not as a "truth value." This is because the most natural way to think of its representation is as the set of states in which it is true. This is one way of understanding Scott's (1974) idea of replacing "many values" by (sets of) "many valuations." A valuation is just a mapping of sentences into {t, f}, and such a valuation can be regarded as a state.⁶

⁶ There is a subtlety here. Once each sentence φ has been assigned a proposition ||φ|| in the sense of a set of states, a state α uniquely determines a valuation: v(φ) = t iff α ∈ ||φ||. But the converse does not hold. There can be two states that determine the same valuation. In this sense states are more abstract than valuations, and recognize that there can be more to a state than can be expressed by the particular sentences of the formal language.
7.2 Relations Among Matrices: Submatrices, Homomorphic Images, and Direct Products
There are relations among matrices which correspond to certain relations among algebras discussed in Chapter 2. Since a matrix M = (A, D) is just an algebra with a designated subset, a matrix always has an "algebraic part" A (in symbols, alg(M)). It is thus to be expected that the corresponding relations among matrices will be defined by first requiring the relation among the algebraic parts, and then adding on some requirement about how the designated elements are to be treated. Given two matrices M and M', we say that M is a submatrix of M' iff

(1) alg(M) is a subalgebra of alg(M'), and
(2) D = D' ∩ M.

An example might be to let M' be all the propositions of biology, and let M be all the propositions in some subfield of biology, like cytology. Then D' might be all the true propositions of biology, and D would be all the true propositions of cytology. We would always expect (2) to hold for any example such as the one above. However, there is also the weaker relationship

(2') D ⊆ D',
which should not be confused with (2). We speak of weak submatrices if (2) is replaced with (2'). We get a corresponding strong and weak sense of homomorphism for matrices. In both cases h must, of course, satisfy

(1) h is an (operational) homomorphism of alg(M) into alg(M'),

and in the strong case it must satisfy

(2) a ∈ D only if h(a) ∈ D',

as well as

(2') a ∈ D if h(a) ∈ D',

whereas in the weak case it must satisfy only (2). This last is just the requirement that h be absolutely faithful with respect to preserving the "unary relation" D (cf. Section 2.5). Some alternative terminology that we find useful is to call an operational homomorphism satisfying (2) positive (for it preserves designation), and one satisfying (2') negative (for it preserves non-designation). A (weak) homomorphism is then just a positive homomorphism, and a strong homomorphism is then a homomorphism that is both positive and negative. The literature, somewhat inconsistently with its practice with regard to "submatrix," seems to ascribe the term "homomorphism" simpliciter to the weak case, bringing out the terminology "strong homomorphism" when it wants something satisfying (2'). (A notable exception is to be found in Czelakowski (1980a, p. 16).) We shall follow this practice, but shall avail ourselves of the "strong" and "weak" epithets when we want to be perfectly clear.
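In computational terms the positive/negative terminology comes to this (a sketch; the helper names are ours):

    # h is positive if it preserves designation, negative if it preserves
    # non-designation; a strong homomorphism is one that is both.
    def is_positive(h, D, D_prime):
        return all(h[a] in D_prime for a in D)

    def is_negative(h, D, D_prime, carrier):
        return all(a in D for a in carrier if h[a] in D_prime)

    # Example: the identity map from ({1, 1/2, 0}, D = {1}) to the same
    # algebra with D' = {1, 1/2} is positive but not negative.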
How can one give some intuitive logical force to these two different notions of homomorphism? Let us think about interpretations of a sentential language into an algebra of propositions. These are operational homomorphisms (from a universally free algebra), and as such are not required to preserve anything about designation. A natural way to think of designation of sentences is to think that there is some theory that asserts a certain set of the sentences as theorems. A natural way to think of designation of propositions is to think that there is the set of true propositions. A positive homomorphism is then just a sound interpretation (the sentences asserted are true), and corresponds to the well-known principle of charity in interpretation. A negative homomorphism is then a complete interpretation (all the true sentences are asserted). Let h be a (strong/weak) homomorphism from M into M'. When h is onto we shall call M' a (strong/weak) homomorphic image of M, and, conversely, M an inverse (strong/weak) homomorphic image of M'. Clearly the first relation is many-one, whereas the second, converse relation is one-many. When h is one-one we call it a (strong/weak) isomorphism, and call M' an isomorphic image and M an inverse isomorphic image (of either the weak or strong kind). Note that if h is a strong isomorphism, then its inverse is also a strong isomorphism. Thus if M' is a strong isomorphic image of M, then M is also a strong isomorphic image of M', and symmetrically for inverse strong isomorphic images. But the same does not necessarily hold for weak isomorphisms. Normally, when we speak just of an "isomorphism" (or "isomorphic image" or "inverse isomorphic image") we shall mean these in their strong senses. Finally, we shall define the notions of direct and subdirect products of matrices. A direct product Πi∈I Mi is a structure (Πi∈I alg(Mi), D), where D = ⨯i∈I Di, i.e., an element (di)i∈I of the direct product is designated iff each component di ∈ Di. A subdirect product is just a submatrix of a direct product in which all the projection homomorphisms are onto. Clearly subdirect products admit of strong and weak versions depending on whether the homomorphisms are strong or weak. Our default convention will be that the term refers to the weak version, just as with homomorphisms. Indeed, there are very few instances of the strong version, because, if one thinks about it, one sees that it rules out "mixed elements" of the subdirect product, i.e., elements such as (x1, x2), where x1 ∈ D1 and yet x2 ∉ D2. Since matrices are fundamentally just algebras, it is sometimes useful to speak of subalgebras, homomorphic images, isomorphic images, direct products, etc. with reference to the algebras (ignoring the sets of designated values). Context usually makes clear whether we mean, say, "homomorphism" in the algebraic sense or in the matrix sense (and hopefully in the latter case whether we mean the weak or strong sense). But when context is not so clear we shall speak, say, of an algebraic homomorphism or matrix (strong or weak) homomorphism, and similarly with algebraic direct product, matrix (strong or weak) direct product, etc.
7.3 Proto-preservation Theorems
In a later section, we discuss how various logical properties, such as validity, are preserved by the various relationships among matrices discussed in the previous section. In this section we shall do some preliminary work toward this end, examining to what extent designation and non-designation of sentences are carried up from submatrices, homomorphic images, and (sub)direct products to the original matrices. One of our chief tools will be the invocation of weak and strong versions of these relationships as appropriate. Throughout all of the following lemmas, we shall assume some fixed language L, and M and M' will be matrices of the same similarity type as L. We shall follow our previously established convention of letting D and D' be the respective sets of designated elements, but we shall extend this convention by letting U and U' be the respective sets of undesignated elements. Each lemma is followed by a diagram that depicts it.
Lemma 7.3.1 Let M' be a weak submatrix of M, and let I' be an interpretation in M'. Then I' is also an interpretation in M. Further, if I'(φ) ∈ D', then I'(φ) ∈ D. If M' is a strong submatrix of M, then also if I'(φ) ∈ U', then I'(φ) ∈ U.
Proof It is clear that I' is also an interpretation in M, since it is clear that a homomorphism into a subalgebra is also a homomorphism into the algebra itself. Now the first part of the lemma follows from the fact that weak submatrices require that D' ⊆ D.

FIG. 7.1. First part of Lemma 7.3.1

FIG. 7.2. Second part of Lemma 7.3.1
For the second part, since strong submatrices require that D' = M' ∩ D, it follows for an element a ∈ M' that if a is not in D', then a is not in D, i.e., U' ⊆ U. The lemma is now immediate. □

Lemma 7.3.2 Let h be a (weak) homomorphism from M onto M' and I' be an interpretation in M'. Then for some interpretation I in M, for every sentence φ, if I(φ) ∈ D, then I'(φ) ∈ D'. If h is a strong homomorphism from M onto M', then if I(φ) ∈ U, then I'(φ) ∈ U' (and also if I(φ) ∈ D, then I'(φ) ∈ D').

FIG. 7.3. First part of Lemma 7.3.2

FIG. 7.4. Second part of Lemma 7.3.2

Proof Define I on atomic sentences so that I(p) ∈ h⁻¹(I'(p)). This actually requires the Axiom of Choice, but, speaking intuitively, we know from the fact that h is onto that each element in M', including I'(p), comes via h from possibly many elements in M. It is "merely" the matter of selecting in each case one of them to "play the role" of I'(p). Extend this definition inductively so that

(1) I(Oi(φ1, ..., φm)) = Oi(I(φ1), ..., I(φm)).

Clearly h(I(p)) = I'(p) for each atomic sentence p. It is then a transparent induction to show that

(2) h(I(φ)) = I'(φ).

Then since our hypothesis is that I(φ) ∈ D, we know, by (2) and the fact that h preserves designation, that I'(φ) ∈ D'. For the second part, the desired interpretation is produced just as above, and clearly has the properties mentioned, since a strong homomorphism preserves both designation and undesignation. □

The above lemmas all relate a given interpretation in a homomorphic image to some other interpretation in its source matrix. The next lemma treats the other direction.

Lemma 7.3.3 Let I be an interpretation in M and let h be a (weak) homomorphism from M onto M'. Then the composition I ∘ h, where (I ∘ h)(x) = h(I(x)), is an interpretation of L in M'. Further, for all sentences φ, if I(φ) ∈ D, then (I ∘ h)(φ) ∈ D'. If h is a strong homomorphism from M onto M', then also if I(φ) ∈ U then (I ∘ h)(φ) ∈ U'.

FIG. 7.5. First part of Lemma 7.3.3

FIG. 7.6. Second part of Lemma 7.3.3

Proof The composition of any two homomorphisms is a homomorphism, and an interpretation is just a homomorphism from an algebra of sentences. □
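The choice construction in the proof of Lemma 7.3.2 can be sketched for the finite case as follows (our code; the extension to compound sentences proceeds by clause (1) of that proof):

    # Pull an interpretation I' in M' back to M by choosing, for each atom,
    # some h-preimage of its value (a finite stand-in for the Axiom of Choice).
    def pull_back(atoms, I_prime, h, carrier):
        I = {}
        for p in atoms:
            I[p] = next(a for a in carrier if h[a] == I_prime[p])
        return I  # extend to compound sentences by the operations of M

    # By induction h(I(phi)) = I'(phi), so designation of I(phi) in M forces
    # designation of I'(phi) in M' whenever h preserves designation.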
The next lemmas relate to direct products.

Lemma 7.3.4 Let Πi∈I Mi be a direct product and let Ii be an interpretation of L in one of the component matrices Mi. Then there is an interpretation I in Πi∈I Mi such that for every sentence φ, if I(φ) ∈ D, then Ii(φ) ∈ Di.

FIG. 7.7. Lemma 7.3.4

Proof This is a simple application of Lemma 7.3.2, given the obvious fact that the projection πi((aj)j∈I) = ai is a weak homomorphism. □

Remark 7.3.5 It might be thought that we would next prove the natural strengthening of Lemma 7.3.4, that is, if we knew that the projection πi were a strong homomorphism, we would know that the interpretation Ii carried over from the direct product not just designation, but also undesignation. But a little reflection shows that except in the most degenerate of cases there are no such "strong direct products." The problem is that an undesignated "sequence" (ai)i∈I of the direct product can still have some designated components. We might flirt with the idea of defining a "strong direct product" on the union of the Cartesian product of the designated sets and the Cartesian product of their complements ((⨯i∈I Di) ∪ (⨯i∈I Ui)). Such a construction (though not so labeled) actually occurs in Belnap (1967). But a little reflection shows that these are also rare fauna (although not quite so rare as on the first alternative above). The problem is that an operation applied to two "pure sequences,"

(d0, d1) · (u0, u1) = (d0 · u0, d1 · u1),

can very likely produce a "mixed sequence," since there is very little hope that d0 · u0 and d1 · u1 will be either both designated or both undesignated. To require that these mixed cases do not occur would be in effect to require "quasi-truth-functionality," i.e., to require of all the matrices under consideration that when elements have the same status of designation, then operations on them produce elements all having the same status of designation (in short, that being both designated or both undesignated is a congruence in the sense of the next section). This is normality in the sense of Church (1956), and can correctly be guessed to have too close an affinity to classical logic to be of any general utility.

We can, however, prove a result for direct products related to Lemma 7.3.3 regarding homomorphisms.

Lemma 7.3.6 Let Πi∈I Mi be a direct product of the (indexed set of) matrices (Mi)i∈I, and let I be an interpretation of L in Πi∈I Mi. Then for each i ∈ I, there exists an interpretation Ii = I ∘ πi such that for every sentence φ, if I(φ) ∈ D, then Ii(φ) ∈ Di.

FIG. 7.8. Lemma 7.3.6

Proof Apply Lemma 7.3.3. □

Lemma 7.3.7 Let Πi∈I Mi be a direct product of the (indexed set of) matrices (Mi)i∈I, let I be an interpretation of L in Πi∈I Mi, and let ψ be some particular sentence such that I(ψ) is undesignated in the direct product. Then for some i ∈ I, there exists an interpretation Ii such that for every sentence φ, if I(φ) is designated in Πi∈I Mi, then Ii(φ) ∈ Di, and Ii(ψ) ∉ Di.

Proof We know from Lemma 7.3.6 that Ii = I ∘ πi has all but the last mentioned property with respect to any of the component matrices Mi. But since I(ψ) is undesignated in the direct product, there must be some component at which it is undesignated, namely (I(ψ))i. The matrix Mi is then the desired matrix for the lemma. □
7.4 Preservation Theorems
The "proto-preservation" theorems of the preceding section allow us to quickly establish a number of results concerning the preservation of "matrix validity" in various senses. We shall focus on the preservation of validity in the unary, asymmetric, and symmetric consequence senses. Recall from Chapter 5 that each interpretation I in a matrix M gives rise to a valuation vI, where vI(φ) = t iff I(φ) ∈ D. The set of admissible valuations according to M, VM, is then defined as the set of all such vI. Where K is a class of (similar) matrices, we can then define the set of admissible valuations according to K as ∪{VM : M ∈ K}. Using these sets of admissible valuations, we can define validity in M in various senses. These definitions make explicit notions that were implicit in the previous chapter and the previous section. First we define (unary assertional) validity ("tautologyhood") in a matrix M as v(φ) = t for every v ∈ VM, i.e., I(φ) ∈ D for every interpretation in M. This is customarily denoted by some notation such as ⊨M φ. This can be extended to a class K of (similar) matrices in the obvious way, requiring that ⊨M φ for every M ∈ K, which is the same as requiring that v(φ) = t for each v in the set VK of admissible valuations according to K. We write ⊨K φ.
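For a finite matrix, unary assertional validity is decidable by brute force over interpretations; a sketch (our helper, assuming finitely many atoms):

    from itertools import product
    from fractions import Fraction

    def valid(atoms, evaluate, values, designated):
        # evaluate(assignment) returns the sentence's value under an assignment
        return all(evaluate(dict(zip(atoms, vs))) in designated
                   for vs in product(values, repeat=len(atoms)))

    L3_vals = [Fraction(0), Fraction(1, 2), Fraction(1)]
    lem = lambda a: max(a['p'], 1 - a['p'])               # p v ~p
    assert not valid(['p'], lem, L3_vals, {Fraction(1)})  # fails at p = 1/2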
Asymmetric matrix consequence is defined in a corresponding manner. Γ ⊨M φ iff for every v ∈ VM, if v(γ) = t for every γ ∈ Γ, then v(φ) = t, i.e., iff for every interpretation I in M, if I(γ) ∈ D for all γ ∈ Γ, then I(φ) ∈ D. Again this can be extended to a class of matrices K: Γ ⊨K φ iff for every v ∈ VK, if v(γ) = t for all γ ∈ Γ, then v(φ) = t, i.e., iff for every M ∈ K and interpretation I in M, if I(γ) ∈ D for all γ ∈ Γ, then I(φ) ∈ D. Symmetric matrix consequence is defined analogously. Γ ⊨M Δ iff for every v ∈ VM, if v(γ) = t for every γ ∈ Γ, then v(δ) = t for some δ ∈ Δ, i.e., iff for every interpretation I in M, if I(γ) ∈ D for all γ ∈ Γ, then I(δ) ∈ D for some δ ∈ Δ. Again this can be extended to a class of matrices K: Γ ⊨K Δ iff for every v ∈ VK, if v(γ) = t for all γ ∈ Γ, then v(δ) = t for some δ ∈ Δ, i.e., iff for every M ∈ K and interpretation I in M, if I(γ) ∈ D for all γ ∈ Γ, then I(δ) ∈ D for some δ ∈ Δ. It is easy now to see, for example, that unary assertional validity in a matrix is preserved under strong submatrices. Thus, suppose that M' is a strong submatrix of M and ⊨M φ. Consider an arbitrary interpretation I' in M'. By Lemma 7.3.1, I' is also an interpretation in M, and so I'(φ) ∈ D' since I'(φ) ∈ D. Thus ⊨M' φ. We shall leave to the reader the routine derivation of other preservation theorems from their corresponding "proto-lemmas." The results are summarized in the following table. A check of course means that validity is preserved, and a cross means it is not. A parenthesized check is an immediate consequence of an unparenthesized check in its vicinity, just as a parenthesized cross is an immediate consequence of an unparenthesized one. Note that there are only negative results in nine places, and because of the immediate consequences noted, these boil down to just four, which are labeled in order. We shall address these below.
                          Unary   Asymmetric   Symmetric
 Submatrix        weak     ✗1       (✗)          (✗)
                  strong   (✓)      (✓)           ✓
 Hom. image       weak      ✓        ✗2          (✗)
                  strong   (✓)      (✓)           ✓
 Inverse hom. im. weak     ✗3       (✗)          (✗)
                  strong   (✓)      (✓)           ✓
 Direct product   weak     (✓)       ✓            ✗4
                  strong   (✓)      (✓)           ✓
Counter-example 1. Unary assertional validity is not preserved under weak submatrices. Suppose that M' is a weak submatrix of M and (contrapositively) that ⊭M' φ. Suppose that I' is an interpretation in M' such that I'(φ) ∉ D'. In terms of the relationship of the designated sets, all that we know is D' ⊆ D. For a strong submatrix we would have D ∩ M' = D', and could argue then that I'(φ) ∉ D, as required. But this breaks down in the first step when we only have the inclusion the one way.

We can produce an actual counter-example by letting M' be the three-valued Lukasiewicz matrix and letting M be a "designation expansion," i.e., the same except that we extend the set of designated values to be {1, 1/2} (in effect "designated" means "non-false"). It may seem odd that M' and M are the same on the algebraic component, but an algebra surely counts as a subalgebra of itself. Consider the sentence p ∨ ¬p, and assign p the value 1/2. The interpretation of p ∨ ¬p is itself then 1/2, and thus p ∨ ¬p is rejected in M' by this interpretation. But this interpretation no longer rejects p ∨ ¬p in M, and indeed no interpretation does, since the assignment of either 1 or 0 always gives p ∨ ¬p the value 1.

Counter-example 2. Asymmetric consequence is not preserved under weak homomorphic images. We in fact show this for singleton left- and right-hand sides. Let us see where the argument for preservation breaks down. Suppose h is a matrix homomorphism of M onto M', and that φ ⊭M' ψ. Then there is an interpretation I' in M' such that I'(φ) ∈ D' and I'(ψ) ∉ D'. One can then argue that if one constructs an assignment I0(p) ∈ h⁻¹(I'(p)) for each atomic sentence p, then the resulting interpretation I in M will have the property that I(χ) ∈ h⁻¹(I'(χ)) for all sentences χ. So far, so good. But so far we have just been arguing the algebraic situation; we have not yet turned our attention to the question of the designated values. Consider now I(φ) and I(ψ). We know that I(ψ) is not designated, for if I(ψ) ∈ D, then since h preserves designation h(I(ψ)) = I'(ψ) ∈ D', but this is false. But there is no way to show that I(φ) is designated. We would try to argue that if I(φ) ∉ D, then h(I(φ)) = I'(φ) ∉ D'. But this assumes that h preserves non-designation. Obviously what is required is that h be a strong homomorphism. We now produce an actual counter-example, again by fussing with the three-valued logic. For this purpose we add a new unary connective ∇ with the table:
 ∇   |
 1   | 1/2
 1/2 | 1/2
 0   | 1/2
On a Bochvar reading of the value 1/2 as "garbage," this can be interpreted as "anything in, garbage out." We again use the trick of letting M and M' agree on their algebraic component (identity is an isomorphism and hence a homomorphism), and let this algebra have simply the operation ∇ defined by the above table. Thus M = M' = {1, 1/2, 0}. Again we simply expand the designated set (but in the converse direction). Thus D = {1} and D' = {1, 1/2}. It is clear that ∇p ⊭M' q, since we can assign q the value 0, and no matter what value is assigned to p, ∇p will take the designated value 1/2. But in M this value is not designated, and so there is no invalidating assignment. Incidentally, the reader may be bothered by the "artificiality" of the operation ∇, which is just the constant 1/2 function. Such a reader may take the three-valued Lukasiewicz matrix for → but make 1/2 the only designated value. Call this M. M' is the same except that we make the value 1 designated as well. Clearly p → p ⊨M q, since p → p
always takes the undesignated value 1, and yet p → p ⊭M' q, since in M' 1 is designated and q can be assigned the undesignated value 0.

Counter-example 3. Unary assertional validity is not preserved under weak inverse homomorphic image. This is a simple reinterpretation of the situation of counter-example 2. M has {1} as the set of designated elements, and M' has {1, 1/2}. M' is of course a weak homomorphic image of M under the identity homomorphism, and p ∨ ¬p is valid in M' and yet not in M.

Counter-example 4. Symmetric consequence is not preserved under direct product. Indeed, we can show this for an empty left-hand side. Consider the direct product 2² of the two-element Boolean algebra 2, with 1 as its only designated value. Its elements are {(1, 1), (1, 0), (0, 1), (0, 0)}, and the only designated value is (1, 1). The consequence ⊨ p, ¬p is clearly valid in 2, since ¬p takes the opposite value from p, assuring that always one of p, ¬p will take the value 1. But ⊨ p, ¬p is not valid in 2², since if p is assigned (1, 0) then ¬p is interpreted as (0, 1), and so neither ends up designated.

Exercise 7.4.1 Complete all of the reasoning needed to justify the various checks and crosses in the table above.

Exercise 7.4.2 Prove the claims about the matrices in the three examples in Section 5.12. (Hint: First relate the notions of strict, strong and weak equivalence of matrices to the notions at the beginning of this section.)
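Counter-example 4 is small enough to verify mechanically (our encoding):

    # "|= p, ~p" holds in the two-element Boolean matrix but fails in 2 x 2.
    two = [0, 1]
    neg = lambda x: 1 - x
    assert all(x == 1 or neg(x) == 1 for x in two)   # valid in 2

    square = [(x, y) for x in two for y in two]
    neg2 = lambda p: (neg(p[0]), neg(p[1]))
    des2 = {(1, 1)}
    bad = [(p, neg2(p)) for p in square
           if p not in des2 and neg2(p) not in des2]
    assert ((1, 0), (0, 1)) in bad                   # fails in 2 x 2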
7.5 Varieties Theorem Analogs for Matrices

In Chapter 2 we stated Birkhoff's Theorem 2.10.3 and presented its proof. Recall that this theorem links a proof-theoretical characterization of a class of algebras (equationally definable) with a model-theoretic characterization of the class (closure under subalgebra, homomorphic image, and direct product). This gives a purely algebraic answer to the question: when is a class of algebras axiomatizable? Obviously the same type of question can be asked of classes of matrices. We can ask when a class of matrices is characteristic for a "logic." This question is actually several questions, depending on what one takes a "logic" to be. As we saw in Section 6.2 there are various alternatives. We shall present answers to this question for unary assertional, asymmetric, and symmetric consequence logics. The first was shown by Blok and Pigozzi (1986), and the latter two by Czelakowski (1980a, 1983). Czelakowski (1980b) and Blok and Pigozzi (1992) are also relevant. We omit proofs, which can be found in the works cited. A couple of preservation theorems not proven in Section 7.4 are given as exercises.

7.5.1 Unary assertional logics

We turn first to the question of characterizing the "varieties" of matrices for unary assertional logic. We first provide a needed definition:

Definition 7.5.1 A matrix M' = (A', D') is a designation extension of a matrix M = (A, D) iff (1) A' = A and (2) D ⊆ D'.
Note that designation extension is related to our notion of A being a weak submatrix, except that the two underlying algebras are required to be the same.⁷

Theorem 7.5.2 (Blok and Pigozzi 1986) A class of matrices K is the class of all matrices satisfying a unary assertional logic iff K is closed under (weak) direct products, strong submatrices, inverse strong homomorphic images, designation expansions, and strong homomorphic images. Applying these operations finitely often, and in a certain order, suffices, as is summarized in obvious symbols: Matr(⊢) = Hst D Hst Sst P(Matr(⊢)).

⁷ Blok and Pigozzi actually speak of "filter extensions," but they do not necessarily mean a filter in the lattice sense. They mean any designated set of elements that satisfies the axioms and inference rules of some given underlying logic. (In a lattice this would most naturally form a filter, hence the terminology.) In fact they do not need this for their proof of the varieties theorem analog, and so, for the sake of both simplification and comparison to other results, we do not here impose any such requirement on the designated elements. This is why we prefer the term "designation expansion."
We know from Section 7.4 that unary assertional consequence is closed under inverse strong homomorphic image (expansion) and strong homomorphic image (contraction). We also know from Section 7.4 that, since it is closed under weak submatrix, it is closed under designation expansion. Designation expansion plays a role for matrices similar to that of homomorphic image in equational systems.

Exercise 7.5.3 Show that every weak homomorphic image of a matrix M can be obtained as a strong homomorphic image of a designation extension of M.

Blok and Pigozzi actually state their theorem using notions of "contraction" and "expansion," which are equivalent to strong homomorphic and inverse strong homomorphic images:

Definition 7.5.4 (Blok and Pigozzi, 1986) A matrix M = (A, D) is an expansion of a matrix M' = (A', D') iff there exists an onto algebraic homomorphism h : A → A' such that h⁻¹(D') = D. Conversely, M' is said to be a contraction of M.

Blok and Pigozzi define M' to be a relative of M iff M' can be obtained from M by a finite number of contractions and expansions. This defines an equivalence relation, and, as Blok and Pigozzi observe, plays a role in the model theory of unary assertional logics similar to that played by isomorphism in equational systems. Blok and Pigozzi show that every relative of M can be obtained as an expansion of a contraction of M.

Exercise 7.5.5 Show that if M is a relative of M', then ⊢M = ⊢M'.

7.5.2 Asymmetric consequence logics
We turn now to the case of an asymmetric consequence relation. It turns out that in characterizing the closure conditions on a class of matrices we have to work with a notion more complicated than direct product, namely an m-reduced product of matrices. We first introduce the simpler notion of a reduced product of matrices.
Definition 7.5.6 Let I be a set of indices, and let ℱ be a filter of subsets of I from its power set ℘(I). The product of matrices reduced by ℱ (in symbols Πℱ Mi) is defined as the quotient matrix of the ordinary direct product Πi∈I Mi, produced by the congruence relation ≡ℱ induced by ℱ as follows: (ai)i∈I ≡ℱ (bi)i∈I iff {i : ai = bi} ∈ ℱ. An element [(ai)i∈I]≡ℱ is designated iff {i : ai ∈ Di} ∈ ℱ.

Remark 7.5.7 This is to say that we first form the reduced product of the underlying algebras (as in Section 2.16) and then characterize designation. Note that just as the congruence relation can be understood as saying that (ai)i∈I and (bi)i∈I are "almost everywhere" identical, designation of [(ai)i∈I]≡ℱ can be understood as saying that (ai)i∈I is "almost everywhere" designated in the underlying direct product. There are two examples of special interest. First, when ℱ contains just the set I itself, we obtain the ordinary direct product. Second, when ℱ is a maximal filter, we obtain what is called an ultraproduct (ℱ is usually called an ultrafilter in this context).

Definition 7.5.8 An ultraproduct of a similarity class of matrices is a product of these matrices reduced by a maximal filter.

We also have use for the following abstraction:

Definition 7.5.9 ℱ is said to be an m-filter if, besides being closed under upward inclusion and finite intersections, it is also closed under infinite intersections as long as the collection of sets being intersected has cardinality less than m. An m-reduced product of matrices is just a product of matrices reduced by an m-filter.

We can now state the theorem of Czelakowski, subject to certain technical conditions on the cardinal m, which we shall subsequently explain. Any reader who wants or needs to skip over these technical considerations should go immediately to the corollary, which applies to most "real-life" logics.

Theorem 7.5.10 (Czelakowski, 1980a) A class of matrices K is the class of all matrices satisfying an asymmetric consequence relation ⊢ iff K is closed under (strong) submatrices, strong homomorphic images, inverse strong homomorphic images, and m-reduced products, where m is an infinite regular cardinal weakly bounded by the cardinality of ⊢ and the successor of the cardinal of the set of sentences in the underlying language. In obvious symbols: Matr(⊢) = Hst Hst Sst PRm(Matr(⊢)).
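The designation clause of Definition 7.5.6 can be sketched as follows (our encoding, with the filter given extensionally as a set of frozensets):

    # A sequence is designated in the reduced product iff it is designated
    # "almost everywhere," i.e., at a set of coordinates belonging to F.
    def reduced_designated(seq, designated_sets, F):
        where = frozenset(i for i, a in enumerate(seq)
                          if a in designated_sets[i])
        return where in F

    # With F the trivial filter {I} over I = {0, 1}, only sequences
    # designated at both coordinates count, as in the ordinary direct product.
    F = {frozenset({0, 1})}
    assert reduced_designated((1, 1), [{1}, {1}], F)
    assert not reduced_designated((1, 0), [{1}, {1}], F)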
We now turn to the technical conditions on m. First we explain the notion of the cardinality of an asymmetric consequence relation. It is the least infinite cardinal n such that the consequences of any set of sentences Γ can be obtained by taking the union of the consequences of sets of sentences each of whose cardinality is strictly smaller than n. The cardinality of ⊢ is always less than or equal to the successor cardinal of the set of sentences in the underlying language. Observe that when the language is denumerable and ⊢ is compact, its cardinality is ℵ0.
A regular cardinal m is one such that it "cannot be surpassed from below," i.e., given a family of cardinals {mi}i∈I such that each member mi < m and such that the family itself has cardinality less than m, then Σi∈I mi < m. This is a fairly technical notion, but let us note that the first infinite cardinal ℵ0 is regular. This last observation, together with the observation of the preceding paragraph, leads to the following, much less technical version of the theorem for the ordinary case where the underlying language is denumerable.

Corollary 7.5.11 (Czelakowski, 1980a) Let ⊢ be a compact asymmetric consequence relation on a denumerable sentential language. Then the theorem above can be simplified by replacing m-reduced products with reduced products. The following suffices: Matr(⊢) = Hst Hst Sst PR(Matr(⊢)).
Proof The observations already noted show that ℵ0 satisfies the technical conditions, so we simply add that an ℵ0-reduced product is simply a reduced product, since any filter is closed under finite intersections. □

Exercise 7.5.12 Show that an asymmetric consequence relation is preserved under reduced products.

7.5.3 Symmetric consequence logics
Czelakowski extended his theorem to apply to symmetric consequence relations. The conditions are similar, except closure under strong homomorphic images is dropped, and "ultraproduct" is substituted for "m-reduced product," which means the technical condition disappears. This gives a prettier statement:

Theorem 7.5.13 (Czelakowski, 1983) A class of matrices K is the class of all matrices satisfying a symmetric consequence relation ⊢ iff K is closed under ultraproducts, (strong) submatrices, strong matrix homomorphisms, and inverse strong homomorphic images. The following suffices: Matr(⊢) = Hst Hst Sst PU(Matr(⊢)).
Exercise 7.5.14 Show that any symmetric consequence relation is closed under ultraproducts.

We close this section by simply raising the question of how to extend the results above to a larger class of similar results. There are various ways of presenting logics which we have not considered in this section; for example, unary refutational systems (φ ⊢). Not only are there a lot of ways of presenting logics, but there are also a lot of closure conditions on classes of matrices floating around, and it would be interesting to explore which combinations of them correspond to which "varieties" of logics (thus completing the pun set up by the title of Section 6.2: "The Varieties of Logical Experience").

7.6 Congruences and Quotient Matrices
We recall from Section 2.6 that a congruence on an algebra is just an equivalence relation ≡ that respects the operations of the algebra. In defining a congruence on a matrix
M we obviously want it to be a congruence on the underlying algebra alg(M), but the question is how much the relation should respect designation. The natural requirement is

(1) if a ∈ D and a ≡ b, then b ∈ D.

Because of the symmetry of ≡, this is equivalent to requiring that if a ≡ b, then either both a and b are designated or else both a and b are undesignated. We shall call a congruence satisfying (1) a strong congruence, and one that is merely a congruence on the underlying algebra a weak congruence. We shall single out neither for the epithet congruence simpliciter, the problem being that we feel a certain tension: the strong congruence is certainly the one with the greater claim to naturalness, and yet (as we shall see) it has an affinity with the so-called strong homomorphism. Given a congruence ≡ of either type, we can define a quotient matrix M/≡ as follows: The algebraic part of M/≡ will be the quotient algebra alg(M)/≡ as defined in Section 2.6, i.e., its elements will be the equivalence classes [a] determined by a ∈ M, and operations are defined on them by way of representatives. So, for example, taking a binary operation:

(2) [a] ∗ [b] = [a ∗ b].
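Anticipating the discussion of designation just below, the quotient construction can be sketched as follows (our encoding; cls is assumed to send each element to a canonical representative of its class under a congruence, which is what makes the operation table well defined):

    def quotient_matrix(elements, op, designated, cls):
        classes = {cls[a] for a in elements}
        # operations computed on representatives, as in (2)
        q_op = {(cls[a], cls[b]): cls[op[(a, b)]]
                for a in elements for b in elements}
        # a class is designated when some member is; for a strong congruence,
        # by (1), this is the same as every member being designated
        q_des = {cls[a] for a in elements if a in designated}
        return classes, q_op, q_des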
The important question is which equivalence classes we are to count as designated. For a weak quotient matrix, its set of designated elements, D/≡, will consist of those cliques [a] such that there is some a' ∈ [a] (i.e., a' ≡ a) such that a' ∈ D. (This is in effect a special case of (Q) of Section 2.6, when R is a "unary relation.") For a strong quotient matrix we shall define the set of designated elements exactly the same way, but the difference between the weak and strong notions comes out in our requiring for a strong quotient matrix that the equivalence relation ≡ be strong. It just turns out (because of (1)) that a clique [a] is designated iff all a' ∈ [a] (i.e., all a' ≡ a) are such that a' ∈ D. How can we describe the notion of congruence with some intuitive logical vocabulary? Well, one way to think of congruence is as provable equivalence in some theory, and since theories should be closed under provable equivalence, this quickly motivates the requirement (1) above. The story can go on to give some intuitive force to the notion of a quotient matrix. Many times, propositions that are non-equivalent in some "inner theory" are said to be equivalent in some "outer theory" (proper extension). Thus, for example, propositions that are not logically equivalent might be said to be mathematically equivalent. And sentences that are not mathematically equivalent might be said to be physically (in physics) equivalent. And sentences that are not physically equivalent might be said to be biologically equivalent, etc. And at the very beginning of the chain, we can have sentences that are not identical (say p and p ∧ p) even though they are logically equivalent. Now if one wants, one can just leave the situation at this level of description. But if one is interested in reifying what it is that equivalent propositions have in common (what makes them the "same"), one can go on to say that they express the same proposition "really," i.e., in some more
Thus it is a familiar quasi-nominalistic move to identify propositions with classes of logically equivalent sentences, and it is certainly not too funny a way of talking to say that two logically distinct propositions express mathematically the same proposition, etc. The partitioning into equivalence classes to form quotient matrices can be understood to be just a convenient set-theoretical device to reify what it is that equivalent propositions (or sentences) have in common.
With some story like the above, how could we ever motivate some requirement weaker than (1)? Possibly the designated set could be the theorems of the "inner theory." Maybe D is some set of mathematical theorems (or truths) and ≡ is physical equivalence. Assuming some standard reductionist imagery, we could imagine that a proposition a is equivalent to some mathematical fact a′, and on that account the physical proposition [a] ought to be designated.
It is clear that every strong homomorphism determines a strong congruence. Thus, count two elements as congruent when they are carried into the same element. We know from Section 2.6 that this is a congruence on the algebraic part of the matrix. But since a strong homomorphism can never carry a designated element and an undesignated element to the same value, it is clear that this congruence also respects designation in the sense of (1) above, and that we then have a strong congruence.
Conversely, every strong congruence determines a strong homomorphism. As we know, once again from Section 2.6, the canonical homomorphism, which carries an element a into the class [a] of the quotient matrix, is an algebraic homomorphism. It is also clear that the canonical homomorphism respects designation, since if a is designated, then [a] is designated by virtue of our decision on how to designate cliques in the quotient. And if a is undesignated, then, by (1), all elements congruent to a are undesignated, and thus each element of [a] is undesignated and so [a] is undesignated. Playing with these facts, one can establish the following theorem.
Theorem 7.6.1 (Strong homomorphism theorem for matrices) Every strong homomorphic image of M is isomorphic to a strong quotient matrix of M.
Exercise 7.6.2 Give a detailed proof of the above.
What, then, of weak homomorphisms? Clearly weak congruences determine weak homomorphisms, since the canonical homomorphism again carries a designated element a to the designated clique [a]. (Designation of the clique requires only that one member be designated.) But it is not necessarily true that weak homomorphisms determine even weak congruences. The problem is that a weak homomorphism can carry an undesignated element a to a designated element a′, even though no other designated element is carried to a′. An obvious fix to this problem is to restrict our attention to homomorphisms that are (minimally) faithful in the sense of Section 2.5 (thinking of "designation" as a unary relation). What this means is that in the problem case described above, a′ must also have some designated element b carried to it. Incidentally, notice that the canonical homomorphism onto a weak quotient matrix is always faithful, since [a] is made designated only when there is some member b that is designated. But then b is the desired designated pre-image.
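For finite matrices the quotient construction can be carried out explicitly: check that a given equivalence is a strong congruence, then compute blocks, representative-wise operations, and the designated blocks. Below is a minimal Python sketch under our own illustrative encoding of matrices (a carrier list plus a dictionary mapping operation names to (arity, function) pairs); none of these names come from the text.

```python
from itertools import product

def is_strong_congruence(elements, ops, designated, eq):
    """eq is a set of ordered pairs, assumed to be an equivalence relation.
    Check respect for designation (condition (1)) and for the operations
    (one argument position at a time; transitivity supplies the rest)."""
    for a, b in eq:
        if (a in designated) != (b in designated):          # condition (1)
            return False
        for _name, (arity, f) in ops.items():
            for pos in range(arity):
                for params in product(elements, repeat=arity - 1):
                    xs = params[:pos] + (a,) + params[pos:]
                    ys = params[:pos] + (b,) + params[pos:]
                    if (f(*xs), f(*ys)) not in eq:
                        return False
    return True

def quotient_matrix(elements, ops, designated, eq):
    """Quotient matrix M/eq: blocks as frozensets, operations computed via
    representatives (well defined because eq is a congruence), and a block
    designated iff it meets D (for a strong congruence: iff it lies in D)."""
    block = {a: frozenset(b for b in elements if (a, b) in eq) for a in elements}
    blocks = set(block.values())
    def lift(f):
        return lambda *bs: block[f(*(next(iter(b)) for b in bs))]
    q_ops = {name: (ar, lift(f)) for name, (ar, f) in ops.items()}
    q_des = {b for b in blocks if b & designated}
    return blocks, q_ops, q_des

# Toy example: the direct square of the two-element Boolean matrix, with the
# congruence that glues elements sharing their first coordinate.
E = list(product([0, 1], repeat=2))
ops = {'neg': (1, lambda a: (1 - a[0], 1 - a[1])),
       'conj': (2, lambda a, b: (min(a[0], b[0]), min(a[1], b[1])))}
D = {(1, 0), (1, 1)}
eq = {(a, b) for a in E for b in E if a[0] == b[0]}

print(is_strong_congruence(E, ops, D, eq))      # True
blocks, q_ops, q_des = quotient_matrix(E, ops, D, eq)
print(len(blocks), len(q_des))                  # 2 1: the quotient is just 2
```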
The following is easily proven from the discussion above.
Theorem 7.6.3 (Weak homomorphism theorem for matrices) Every weak faithful homomorphic image of M is isomorphic to a weak quotient matrix of M.
Exercise 7.6.4 Fill in the details.
Consider the set Cst(M) of all strong congruences on a matrix M = (A, D). As with algebras, the smallest congruence is just the identity relation restricted to A. But this time the largest "natural" congruence cannot in general be the universal relation A × A (unlike the case with algebras), for if A has at least two elements and D ≠ A, then the universal relation would identify a designated element with a non-designated element. Still, we have:
Theorem 7.6.5 The set Cst(M) of all strong congruences on a matrix M = (A, D) forms a complete lattice.
Proof We leave this to the reader. In virtue of Corollary 3.7.5, all that needs to be checked is that the intersection of relations θ "compatible" with D (if aθb, then a ∈ D only if b ∈ D; note that when θ is a congruence, it is symmetric, and so this is the same as requiring "two-way respect": if aθb, then a ∈ D iff b ∈ D) is also a relation compatible with D, and similarly with the transitive closure of the union of relations compatible with D. □
Blok and Pigozzi answer the question as to how to characterize the largest strong congruence on a matrix. As they point out, their solution stems from Leibniz's principle of the identity of indiscernibles. It is better rephrased in this context as congruence of indiscernibles, because the idea is to define two elements to be congruent just when they are extrinsically indiscernible in terms of their roles in the matrix. The Leibniz congruence has to do with indiscernibility by way of predicates (by which is meant a relation). There are just two natural atomic predicate symbols in the first-order language used to describe a matrix: one is a unary predicate for membership in the designated set, and the other is the binary predicate for identity. This discussion assumes that we have only the first, which we denote by D[x].
Definition 7.6.6 An n-ary predicate (relation) P is first-order definable over a matrix M = (A, D) iff there is a formula
Φ(x₁, ..., xₙ, y₁, ..., yₖ) of first-order logic containing only the predicate D and function symbols corresponding to the various operations of A, and there are elements c₁, ..., cₖ ∈ A, such that for all a₁, ..., aₙ ∈ A, P(a₁, ..., aₙ) holds iff Φ(a₁, ..., aₙ, c₁, ..., cₖ) is true in M.
Definition 7.6.7 Let A be an algebra and let D ⊆ A. The Leibniz congruence on A over D is defined by
Ω_A D = {(a, b) : P(a) ⇔ P(b) for every definable predicate P}.
Exercise 7.6.8 Prove that Ω_A D is a strong congruence.
Remark 7.6.9 Note that the first-order formula Φ(x₁, ..., xₙ, y₁, ..., yₖ) in Definition 7.6.6 can have all of the usual connectives and quantifiers in it, as well as various occurrences of the predicate D. We signal this by using the capital letter Φ, reserving the lower case φ, as has been our practice, for sentences (really terms) in the sentential language appropriate to M. There is a fussy point to be made about this. Given a sentence of the sentential language, we have been writing it as φ(p₁, ..., pₙ, q₁, ..., qₖ), and we have been writing the corresponding first-order term as φ(x₁, ..., xₙ, y₁, ..., yₖ). As the reader can easily see, the two terms are "isomorphic" except for the choice of the symbols (variables and operation symbols). For convenience, let us assume that the terms are written in the same language, with "x₁" and "p₁" just being two names in our metalanguage for the same symbol, and similarly with the other matching symbols. Let us consider just atomic formulas, i.e., those first-order formulas of the form D[φ(x₁, ..., xₙ, y₁, ..., yₖ)]. Remember that such a formula says that the value of the term φ(x₁, ..., xₙ, y₁, ..., yₖ) is designated.
Definition 7.6.10 We shall say that an n-ary predicate (relation) P is atomically definable over a matrix M = (A, D) iff there is an atomic formula D[φ(x₁, ..., xₙ, y₁, ..., yₖ)] of first-order logic containing only the displayed occurrence of the predicate D and function symbols corresponding to the various operations of A, and there are elements c₁, ..., cₖ ∈ A, such that for all a₁, ..., aₙ ∈ A, P(a₁, ..., aₙ) holds iff φ^A(a₁, ..., aₙ, c₁, ..., cₖ) ∈ D.
Definition 7.6.11 Let A be an algebra and let D ⊆ A. The atomic Leibniz congruence on A over D is defined by
Ω^at_A D = {(a, b) : P(a) ⇔ P(b) for every atomically definable predicate P}.
It turns out that the restriction to atomic formulas in Definition 7.6.11 makes no difference. The predicates in Definition 7.6.7 can without loss be restricted to those definable by atomic formulas, i.e., formulas of the form D[φ(x₁, ..., xₙ, y₁, ..., yₖ)].
Lemma 7.6.12 Ω_A D = Ω^at_A D.
Proof This is an immediate consequence of the fact that atomic replacement is equivalent to complex replacement (cf. Theorem 2.6.5). □
The function Ω_A(D) = Ω_A D is defined on all subsets of A and is called the Leibniz operator on A.
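Since Lemma 7.6.12 lets us restrict attention to atomic predicates, the Leibniz congruence of a finite matrix is effectively computable: Ω_A D turns out to be the largest strong congruence (Theorem 7.6.13 below), and it can be reached by starting from the coarsest equivalence that separates D from its complement and repeatedly discarding pairs that some operation (with parameters) tells apart. The following Python sketch illustrates this idea; the encoding of matrices and the toy example are our own illustrative conventions, not anything from the text.

```python
from itertools import product

def leibniz_congruence(elements, ops, designated):
    """Largest strong congruence (the Leibniz congruence Omega_A D) of a
    finite matrix, by refining the coarsest D-compatible equivalence."""

    def separated(a, b, rel):
        # Does some one-step translation x |-> f(..., x, ...) carry a and b
        # to rel-inequivalent values?
        for _name, (arity, f) in ops.items():
            for pos in range(arity):
                for params in product(elements, repeat=arity - 1):
                    xs = params[:pos] + (a,) + params[pos:]
                    ys = params[:pos] + (b,) + params[pos:]
                    if (f(*xs), f(*ys)) not in rel:
                        return True
        return False

    # Coarsest equivalence compatible with D: a ~ b iff both or neither in D.
    rel = {(a, b) for a in elements for b in elements
           if (a in designated) == (b in designated)}
    changed = True
    while changed:
        changed = False
        for a, b in sorted(rel):
            if separated(a, b, rel):
                rel -= {(a, b), (b, a)}
                changed = True
    return rel

# Toy example: a "doubled" two-element Boolean matrix in which 1 and 1'
# (and 0 and 0') play exactly the same role; the Leibniz congruence
# collapses each primed element with its unprimed twin.
E = ['0', "0'", '1', "1'"]
val = {'0': 0, "0'": 0, '1': 1, "1'": 1}
ops = {'neg': (1, lambda a: '1' if val[a] == 0 else '0'),
       'conj': (2, lambda a, b: '1' if val[a] and val[b] else '0')}
print(sorted(leibniz_congruence(E, ops, designated={'1', "1'"})))
```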
Blok and Pigozzi use Lemma 7.6.12 to prove the following characterization of the Leibniz congruence in terms of other strong congruences:
Theorem 7.6.13 For any matrix M = (A, D), Ω_A D is the largest strong congruence in the lattice of strong congruences on M.
Proof Let θ be any strong congruence on M, and assume (a, b) ∈ θ. Let φ(p₁, q₁, ..., qₖ) be any sentence, and let ι be any interpretation in A. Since θ is a congruence, we have for c₁, ..., cₖ ∈ A,
φ^A(a, c₁, ..., cₖ) θ φ^A(b, c₁, ..., cₖ).
Since θ is a strong congruence, this means:
φ^A(a, c₁, ..., cₖ) ∈ D ⇔ φ^A(b, c₁, ..., cₖ) ∈ D.
Hence (using Lemma 7.6.12) we have (a, b) ∈ Ω_A D. □
Let us consider the special case of a Leibniz congruence on an algebra of sentences. We know from Section 6.12 that theories correspond to the designated subsets, so we can write Ω_S T to obtain a certain congruence on the algebra of sentences (the subscript is often omitted), namely the largest congruence that is compatible with T. Dividing the set S of sentences by this congruence gives the quotient algebra S/ΩT, and then dividing out the theory T as well gives us the matrix (S/ΩT, T/ΩT). We need a name for this matrix. Note that this is neither the Lindenbaum algebra (because it is a matrix, and because ΩT is generated from "above" rather than below), nor is it the Lindenbaum matrix (because the elements of the Lindenbaum matrix are sentences, not equivalence classes of sentences). We shall call it the Blok-Pigozzi matrix (determined by T).
Theorem 7.6.14 φ ΩT ψ iff for all interpretations ι in the Blok-Pigozzi matrix (S/ΩT, T/ΩT), ι(φ) = ι(ψ).
Proof The direction from right to left is a kind of completeness. Instantiate ι to be the canonical interpretation ι(χ) = [χ]_ΩT. Then assuming ι(φ) = ι(ψ), we have [φ]_ΩT = [ψ]_ΩT, and hence φ ΩT ψ.
The direction from left to right is a kind of soundness result. Let ι(φ) = [φ′]_ΩT and ι(ψ) = [ψ′]_ΩT. We first observe that we can choose φ′ and ψ′ to be substitution instances of φ and ψ. The reason is that we can consider just the interpretation of the atomic sentences ι(p₁) = [χ₁]_ΩT, ..., ι(pᵢ) = [χᵢ]_ΩT, ..., and pick a sentence p′ᵢ from each equivalence class, being careful to pick the same sentence for identical equivalence classes. This induces a substitution σ : σ(pᵢ) = p′ᵢ, and in general σ(χ) = χ(p₁/p′₁, ..., pₙ/p′ₙ), where p₁, ..., pₙ are all the atomic sentences occurring in χ. It is easy to prove by induction that ι(χ) = [χ(p₁/p′₁, ..., pₙ/p′ₙ)]_ΩT = [σχ]_ΩT. Now we can complete the proof of soundness. Since φ ΩT ψ, we have φ′ ΩT ψ′, and this means [φ′]_ΩT = [ψ′]_ΩT, i.e., ι(φ) = ι(ψ). □
7.7 The Structure of Congruences
Whether we are talking of weak or strong congruences, they can be regarded as sets of ordered pairs, and ordered by set inclusion. Then ≡₁ ⊆ ≡₂ means intuitively that ≡₁ is a "stronger" (stricter) relation than ≡₂. Somewhat in the face of English usage, the "stronger" relation is the "smaller" of the two, the idea being that fewer pairs satisfy the stronger relation.
The inclusion relation clearly has the following properties, for arbitrary sets x, y, and z:
(1) x ⊆ x (reflexivity);
(2) x ⊆ y and y ⊆ x imply x = y (antisymmetry);
(3) x ⊆ y and y ⊆ z imply x ⊆ z (transitivity).
Any relation with these properties is called a partial order. Partial orders, and some related notions, were introduced more fully in Chapter 3, but we shall review pertinent facts about them as needed. Let ℰ(M) be the set of all equivalence relations on (the carrier set of) M, and similarly let Cw(M) and Cst(M) be, respectively, the sets of all weak and strong congruences on M. It is easy to see that given any non-empty subset E of ℰ(M), the intersection ∩E is also an equivalence relation on M. It is also easy to see that the relation a (∩E) b holds just when aθb for all θ ∈ E. Then, it is straightforward that ∩E is reflexive since each θ ∈ E is reflexive, and similarly for symmetry and transitivity. If the members of E happen to be either all weak or all strong congruences, it is similarly easy to see that ∩E will inherit the respective property. Thus ℰ(M), Cw(M), and Cst(M) are all such that they are closed under non-vacuous intersections.
Whenever U is a set with a partial order ≤, given any subset S, we can ask whether S has a greatest lower bound (glb), i.e., whether there is an element ⋀S which is such that:
(4) ∀x ∈ S, ⋀S ≤ x (lower bound).
(5) Given any element u such that ∀x ∈ S, u ≤ x (i.e., given any lower bound u of S), then u ≤ ⋀S (greatest lower bound).
It is easy to see that for non-empty sets S, ∩S has just these properties. We also have the dual notion of the least upper bound (lub) of S, which is an element ⋁S satisfying:
(6) ∀x ∈ S, x ≤ ⋁S (upper bound).
(7) Given any element u such that ∀x ∈ S, x ≤ u (i.e., given any upper bound u of S), then ⋁S ≤ u (least upper bound).
It is easy to see that ℰ(M) (and also Cw(M) and Cst(M)) always contains the glb of any subset and that this is just intersection. We now address the question of whether ℰ(M) (and also Cw(M) and Cst(M)) always contains all the lubs of its subsets. In answering this question, it is important to note that the union of a bunch of equivalence relations is rarely itself an equivalence relation. This is because we may have a ≡₁ b and b ≡₂ c, and yet have no equivalence relation in the bunch so that a ≡ c. The obvious answer to this problem is to take the transitive closure, i.e., the smallest transitive relation that includes the union. What this amounts to in practical terms is that we shall say that a ≡_E b iff there is some sequence of elements (possibly null) x₁, ..., xᵢ, xᵢ₊₁, ..., xₖ, and of equivalence relations θ₁, ..., θᵢ, θᵢ₊₁, ..., θₖ₊₁ ∈ E, such that
(8) aθ₁x₁, ..., xᵢθᵢ₊₁xᵢ₊₁, ..., xₖθₖ₊₁b.
It is then easy to see that ≡_E is an equivalence relation. It is equally easy to verify that if each θ ∈ E respects D, then ≡_E respects D (respect for the operations or for designation just transmits itself across the chain (8)). Thus it is clear that each of ℰ(M), Cw(M), and Cst(M) is closed with respect to the operation ≡_E on non-empty subsets E. It is clear that ≡_E is the lub of E, ⋁E, in any of ℰ(M), Cw(M), or Cst(M).
So far we have been talking about taking glbs and lubs of non-empty sets. What happens when E = ∅? Then, among the equivalence relations ℰ(M), the lub ⋁∅ must be an equivalence relation that is included in every equivalence relation which is an upper bound of every equivalence relation in ∅. But since there are no equivalence relations in ∅, this means that every equivalence relation is such an upper bound, and so ⋁∅ must be included in every equivalence relation on M. But this is just the identity relation (restricted to M), since each equivalence relation must be reflexive. Similar considerations give the same conclusion for either Cw(M) or Cst(M), the point being that identity clearly respects both operations and designation (indeed, presumably indiscernibility in all respects).
Identity (restricted to the elements of the matrix) is of course the strictest equivalence or congruence, in any sensible sense of equivalence or congruence, and as such is at the very bottom of any of ℰ(M), Cw(M), or Cst(M). But what of the largest element? We can quickly see that in the cases of ℰ(M) and Cw(M) the largest element is just the universal relation on M, M × M (the relation that holds between any two elements of M). But this relation obviously does not respect designation (assuming that the matrix has at least one designated and one undesignated element), and so does not count as a strong congruence. Is there, then, a weakest strong congruence? The answer is clearly yes, since we know that every non-empty subset of Cst(M) has a lub, and so in particular, if we take the lub ⋁Cst(M) of the whole set of strong congruences we obtain our desired weakest strong congruence. Let us denote it by μ. It should now be clear what happens when we take the glb of ∅. In the environment of each of ℰ(M), Cw(M), and Cst(M) we obtain their top element. In ℰ(M) and Cw(M), this is the universal relation, whereas in Cst(M) it is something stronger. A partially ordered set that contains glbs and lubs for all of its subsets is called a complete lattice.
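On a finite carrier both lattice operations are directly computable: glbs are plain intersections, and lubs are transitive closures of unions, exactly as just described. A minimal Python sketch (the function names are ours):

```python
def lub(relations):
    """Least upper bound of equivalence relations: the transitive closure
    of their union (the union is already reflexive and symmetric).  The
    glb, by contrast, is just the set intersection of the relations."""
    rel = set().union(*relations)
    changed = True
    while changed:                    # naive closure; fine for small carriers
        changed = False
        for a, b in list(rel):
            for c, d in list(rel):
                if b == c and (a, d) not in rel:
                    rel.add((a, d))
                    changed = True
    return rel

# Two equivalence relations on {0, 1, 2}: one glues 0 with 1, one glues 1 with 2.
idt = {(x, x) for x in range(3)}
r1 = idt | {(0, 1), (1, 0)}
r2 = idt | {(1, 2), (2, 1)}
print(sorted(lub([r1, r2])))          # glues all three elements together
```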
We summarize all of the above discussion in the following theorem.
Theorem 7.7.1 Given a matrix M, its set of equivalence relations ℰ(M), set of weak congruences Cw(M), and set of strong congruences Cst(M) are all complete lattices (with ⊆ the partial order). In each case, for non-empty subsets, the glb is intersection and the lub is transitive closure. In each case the identity relation (restricted to M) is the bottom element (and the lub of ∅). In the case of ℰ(M) and Cw(M) the top element is the universal relation (restricted to M), M × M.
Remark 7.7.2 Clearly Cst(M) ⊆ Cw(M) ⊆ ℰ(M). The identity map thus gives a kind of embedding of the complete lattice of weak congruences into the complete lattice of equivalence relations, and of the complete lattice of strong congruences into that of weak congruences. Since intersections and transitive closures of non-empty sets depend only on the elements of the sets, and not on other elements in the environment (unlike the general case
of glbs and lubs), we have that these embeddings preserve glbs and lubs of non-empty sets. Also clearly they preserve ⋁∅. Also ⋀∅ is preserved by the first embedding, but not the second.
The reader may have some sense of mystery as to just what the top element "looks like" in the case of Cst(M). Let us introduce the notation ι(a/p) to indicate the "semantic substitution of the element a for the atomic sentence p in the interpretation ι," i.e., the interpretation just like ι except (perhaps) for assigning a to p.
Theorem 7.7.3 (Czelakowski 1980a) Given a matrix M, let L be a language (with infinitely many generators) appropriate to M. Let μ be a relation on M defined so that aμb iff for all sentences φ of L and all atomic sentences p of L and all interpretations ι, ι(a/p)(φ) ∈ D iff ι(b/p)(φ) ∈ D. Then μ is the weakest strong congruence on M.
Exercise 7.7.4 Prove the above theorem.
7.8 The Cancellation Property
Recall from Section 6.11 that a unary assertional logic always has a characteristic matrix, namely its Lindenbaum matrix. We saw in Sections 6.12 and 6.13 that formal asymmetric and symmetric consequence logics also always have a characteristic semantics that can be defined in terms of a given set of "propositions," but our proof had us looking at the Lindenbaum atlas (with many different designated subsets) rather than just at the Lindenbaum matrix (with its single designated subset). In this section we discuss whether, and under what circumstances, we are forced to an atlas instead of just a matrix. We prove a theorem due to Shoesmith and Smiley (1978) giving a necessary and sufficient condition for a broad class of symmetric logics having a characteristic matrix. There is an analogous (and simpler) result of Shoesmith and Smiley (1971), proven for asymmetric logics, that we shall examine after we look at the symmetric version.
Before stating the theorem, we need to explain a key notion called cancellation. Following Shoesmith and Smiley, we say of two sentences φ and ψ that they are disconnected if they share no atomic sentences, and we shall say of two sets of sentences Γ and Δ that they are disconnected if for each φ ∈ Γ and each ψ ∈ Δ, φ and ψ are disconnected. Finally, we shall say of a family of sets of sentences (Γᵢ), i ∈ I, that it is disconnected if for each j, k ∈ I with j ≠ k, Γⱼ is disconnected from Γₖ. We then say of a symmetric logic that it has the cancellation property if and only if whenever (Γᵢ ∪ Δᵢ) is a disconnected family of sets of sentences such that ∪Γᵢ ⊢ ∪Δᵢ, then for some i, Γᵢ ⊢ Δᵢ.
The cancellation property is a quite natural condition for a logic. The quick intuitive idea is that there can be no real logical interaction between formulas in sets Γⱼ and Δₖ with different indices, since the formulas share no content (except for degenerate cases such as when some sentences of Γⱼ are contradictory, or some sentences of Δⱼ are valid), so that all of the "action" can be "localized" at some pair (Γᵢ, Δᵢ) which
presumably share content (or else we are back in one of the degenerate cases mentioned above, in which case it can be degenerately localized). It is easy to prove the following lemma.
Lemma 7.8.1 If a symmetric logic has a characteristic matrix, then it has the cancellation property.
Proof Let L be a symmetric logic characterized by the matrix M, and let (Γᵢ ∪ Δᵢ) be a disconnected family. Suppose for each i, not (Γᵢ ⊢ Δᵢ). Then for each i there is some interpretation ιᵢ that assigns a designated value to each sentence in Γᵢ and yet assigns an undesignated value to each sentence in Δᵢ. Since the value assigned to a sentence depends only on the interpretations of the atomic sentences that occur in it, the valuations ιᵢ can be combined into a single interpretation ι so that ι(φ) = ιᵢ(φ) for φ ∈ Γᵢ ∪ Δᵢ. (If φ is not in any Γᵢ ∪ Δᵢ, define ι on the atomic sentences in φ arbitrarily.) It is clear that ι invalidates ∪Γᵢ ⊢ ∪Δᵢ as desired. □
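The combination step in this proof is entirely mechanical once sentences are finite syntactic objects: disconnected sets of sentences have disjoint sets of atoms, so the relevant partial valuations can simply be unioned. A small Python sketch of the two ingredients (the tuple encoding of sentences and the helper names are our own illustrative choices):

```python
# Sentences as nested tuples: ('atom', 'p'), ('not', s), ('and', s, t), ('or', s, t).

def atoms(s):
    """Set of atomic sentences occurring in a sentence."""
    if s[0] == 'atom':
        return {s[1]}
    return set().union(*(atoms(t) for t in s[1:]))

def disconnected(gamma, delta):
    """Two sets of sentences are disconnected iff they share no atoms."""
    g = set().union(*(atoms(s) for s in gamma)) if gamma else set()
    d = set().union(*(atoms(s) for s in delta)) if delta else set()
    return not (g & d)

def combine(valuations):
    """Merge valuations defined on pairwise disjoint sets of atoms into one.
    Each valuation is a dict from atom names to matrix elements."""
    merged = {}
    for v in valuations:
        assert not (merged.keys() & v.keys()), "atom sets must be disjoint"
        merged.update(v)
    return merged

p, q = ('atom', 'p'), ('atom', 'q')
print(disconnected([('not', p)], [q]))        # True: no shared atoms
print(combine([{'p': 1}, {'q': 0}]))          # {'p': 1, 'q': 0}
```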
It turns out that under a suitable hypothesis about "stability" (the definition of which will be provided in the course of the proof) the converse of Lemma 7.8.1 also holds.
Theorem 7.8.2 (Shoesmith and Smiley) For a stable symmetric (formal) logic, a necessary and sufficient condition for it to have a characteristic matrix is for it to have the cancellation property.
Proof Necessity is of course provided by Lemma 7.8.1. Recall that a symmetric logic is just a formal symmetric consequence relation ⊢. We shall now sketch a strategy for proving sufficiency (somewhat different from that of Shoesmith and Smiley), and in the process uncover the needed definition of "stability." Let us collect together all of the pairs (Γᵢ, Δᵢ) such that not (Γᵢ ⊢ Δᵢ). The key idea is for each index i to make a copy Sᵢ of the set of sentences S (for later convenience when invoking substitutions, we let one of these be S itself). We then let the elements of the matrix be the union of all these sets, and (as a first approximation) let the set of designated elements D be the union of the copies of the Γᵢs. We draw an appropriate picture (for the simple case of two pairs) in Figure 7.9.
[FIG. 7.9. Illustration for i = 2.]
Before proceeding, we must correct the first approximation. In the final construction D will not simply be the union of the Γᵢs, but shall be a somewhat larger set. We shall have to show, reverting to the picture above, that Δ′₁ ∪ Δ′₂ is not a consequence of Γ′₁ ∪ Γ′₂, and then invoke the global cut property to partition S′ (the union of all the copies, in the picture S₁ ∪ S₂) into the desired D and its complement (as the horizontal dotted line suggests). But consequence was defined only on the original set of sentences S, and so we must extend the definition to the set of copies S′. We do this as follows. The first thought is to define a new relation on S′ so that Γ′ ⊢′ Δ′ iff there is a substitution σ so that Γ′ = σ(Γ), Δ′ = σ(Δ), and Γ ⊢ Δ. The problem with this is that it is pretty plain that ⊢′ does not satisfy dilution (the problem is that if we try to dilute before we substitute, the new items then become subject to substitution whether we want them to or not). So we just build into the definition of ⊢′ that it is the closure of ⊢ under substitution and dilution, i.e., Γ′ ⊢′ Δ′ iff there exist Γ and Δ (subsets of S) and a substitution σ (defined on S′) so that Γ′ ⊇ σ(Γ), Δ′ ⊇ σ(Δ), and Γ ⊢ Δ. It is still not necessary that ⊢′ so amended be a consequence relation, but it might be. Clearly it satisfies overlap, but there is still the question of global cut. This brings us to the promised definition: Shoesmith and Smiley call a symmetric consequence logic stable when it has the property that ⊢′ is always a symmetric consequence relation (for an arbitrary extension of the original language by new atomic sentences). We state some simple relationships between ⊢′ and ⊢.
Fact 7.8.3 If Γ and Δ are sets of sentences of S, and Γ′ and Δ′ are the respective results of applying some one-one substitution σ (in S′), then Γ ⊢ Δ iff Γ′ ⊢′ Δ′.
Fact 7.8.4 ⊢′ has the cancellation property (given that ⊢ has).
Proof We leave the straightforward proof of Fact 7.8.3 to the reader. For Fact 7.8.4, we suppose that (Γ′ᵢ ∪ Δ′ᵢ) is a disconnected family of sets of sentences of S′ and also that ∪Γ′ᵢ ⊢′ ∪Δ′ᵢ. By the definition of ⊢′, there exist sets Γ and Δ of sentences of S so that
σ(Γ) ⊆ ∪Γ′ᵢ, σ(Δ) ⊆ ∪Δ′ᵢ, and Γ ⊢ Δ.
Let Γᵢ = {φ : φ ∈ Γ and σ(φ) ∈ Γ′ᵢ}, and let Δᵢ be defined analogously. Then (Γᵢ ∪ Δᵢ) is a disconnected family since (Γ′ᵢ ∪ Δ′ᵢ) is. Note that ∪Γᵢ = Γ and ∪Δᵢ = Δ, and so we can apply cancellation (on Γ ⊢ Δ) to obtain Γᵢ ⊢ Δᵢ (for some i). And so (again by definition of ⊢′) Γ′ᵢ ⊢′ Δ′ᵢ, as required. □
Returning now to the proof of Theorem 7.8.2 and reverting to the picture above, we must show that it is not the case that Γ′₁ ∪ Γ′₂ ⊢′ Δ′₁ ∪ Δ′₂. Suppose to the contrary that
Γ′₁, Γ′₂ ⊢′ Δ′₁, Δ′₂.
Since (Γ′ᵢ ∪ Δ′ᵢ) is clearly a disconnected family, this means that by the cancellation property (and Fact 7.8.4, which entitles us to apply it), we must have either Γ′₁ ⊢′ Δ′₁ or Γ′₂ ⊢′ Δ′₂. But Fact 7.8.3 entitles us to remove the primes, obtaining
Γ₁ ⊢ Δ₁ or Γ₂ ⊢ Δ₂.
But this is contrary to our original choice of the pairs (Γᵢ, Δᵢ) such that not (Γᵢ ⊢ Δᵢ). The argument above, although carried out for the simple case of two pairs, is completely general, and so we know that it is not the case that ∪Γ′ᵢ ⊢′ ∪Δ′ᵢ. The penultimate step of the construction is then to invoke the global cut property so as to partition S′ into two sets, D′ and −D′, so that ∪Γ′ᵢ ⊆ D′, ∪Δ′ᵢ ⊆ −D′, and not (D′ ⊢′ −D′). The final stage of the construction is to consider the algebra of sentences S′ defined on the sentences S′, and outfit it with the designated set D′ so as to obtain "the Shoesmith-Smiley matrix" (S′, D′). (Note that despite the definite article, it is not unique, depending as it does on a choice of D′.) Interpretations in the matrix are just substitutions, so soundness follows as with the constructions from the previous chapter of the Lindenbaum matrix and the Scott atlas. (We leave to the reader to check that ⊢′ is formal, given that ⊢ is.) And completeness is guaranteed by the construction, for if there are sets of sentences Γᵢ and Δᵢ such that not (Γᵢ ⊢ Δᵢ), then we know that there is a substitution σ so that σ(Γᵢ) ⊆ D′ and σ(Δᵢ) ⊆ −D′ (σ just assigns to each sentence its ith copy). □
Remark 7.8.5 Note that the construction above leads to a larger cardinality than does the construction of the Lindenbaum matrix. Since one copy Sᵢ has to be made for each pair of sets (Γᵢ, Δᵢ), it is reasonably clear that even when S is denumerable, most generally one will be constructing "continuum many" copies of a denumerable set, and so the union will be non-countable (indeed, of the power of the continuum).
The following is an easy application of Theorem 7.8.2.
Corollary 7.8.6 (Shoesmith and Smiley) For a compact symmetric logic, a necessary and sufficient condition for it to have a characteristic matrix is for it to have the cancellation property.
Proof It clearly suffices to show that a compact symmetric logic is stable, and this boils down to showing that ⊢′ (as defined in the proof above) satisfies the cut property. We leave the details to the reader. □
We now briefly discuss the case of an asymmetric consequence relation. First, the cancellation property specialized to the asymmetric case comes down to the following:
(ACP) If Γ, ∪Γᵢ ⊢ φ, the family (Γᵢ) is disconnected, and Γ and {φ} are both disconnected from each Γᵢ, then Γ ⊢ φ unless some Γᵢ is "absolutely inconsistent" in the sense that Γᵢ ⊢ ψ for all sentences ψ.
Thus consider the family that has in it the pair (Γ, {φ}) as well as the pairs (Γᵢ, ∅). To say that this family is disconnected is precisely to give the hypotheses about disconnection above in (ACP). And to say that the union of the first components has as a consequence the union of the second components is precisely to give the hypothesis that Γ, ∪Γᵢ ⊢ φ. So applying the symmetric version of the cancellation property, we obtain that either Γ ⊢ φ or else some Γᵢ ⊢ ∅. But this last is the same as Γᵢ ⊢ Δ for all
sets of sentences Δ (dilution). The closest we can come to saying this for the case of an asymmetric logic is that Γᵢ ⊢ ψ for all sentences ψ, but this is just the definition of absolute inconsistency.
We can now state and prove the analog of Theorem 7.8.2 for asymmetric logics.
Theorem 7.8.7 (Shoesmith and Smiley) Given a stable asymmetric (formal) logic, a necessary and sufficient condition for it to have a characteristic matrix is that it have the property (ACP).
Proof Of course an asymmetric logic is just a formal asymmetric consequence relation ⊢, and to say that ⊢ is stable is just to analogize the definition for a symmetric consequence relation. This requires that if we extend the original set of sentences S over which ⊢ is defined by arbitrary new atomic sentences (obtaining a new set of sentences S′ ⊇ S), and define Γ′ ⊢′ φ′ iff there exist Γ and φ such that Γ ⊆ S and φ ∈ S, and a substitution σ (defined on S′) so that Γ′ ⊇ σ(Γ), φ′ = σ(φ), and Γ ⊢ φ, then the relation ⊢′ is an asymmetric consequence relation. The rest of the proof is entirely analogous to that of Theorem 7.8.2, except that the set D can be just the closure under ⊢′ of all the sets Γ′ᵢ. □
Formal asymmetric consequence relations differ markedly from their symmetric cousins, as shown by the following somewhat surprising, but nonetheless trivial fact (due to Shoesmith and Smiley).
Fact 7.8.8 All formal asymmetric consequence relations (compact or otherwise) defined on countable languages are stable.
Proof Let ⊢ be a formal asymmetric consequence relation. Let ⊢′ be defined as in the definition of "stability." We must show that ⊢′ is a formal asymmetric consequence relation. As with symmetric consequence, the properties of overlap, closure under substitution, and dilution are easy. We concentrate our attention then on the infinitary cut. Let us suppose then that Γ ⊢′ δ, for each δ ∈ Δ, and that Δ, φ ⊢′ ψ. By the definition of ⊢′, there exists a countable set of sentences Δ′ ⊆ Δ so that Δ′, φ ⊢′ ψ. (A substitution performed on a countable set leaves a countable set.) Now for each δ′ ∈ Δ′, there exists (similarly) a countable set Γ′ ⊆ Γ so that Γ′ ⊢′ δ′. Considering all the sentences in Δ′ and in the sets Γ′, we are considering a countable union of countable sets, which set theory tells us is again countable. It is clear that only a countable number of atomic sentences occur as generators of these countably many sentences, and so we can find a substitution σ that rewrites these in a one-one fashion as atomic sentences from the originally given countable language, and we can apply the infinitary cut there using ⊢. Reversing the substitution (and possibly diluting) gives us the desired infinitary cut for ⊢′. □
Corollary 7.8.9 For a countable asymmetric logic, a necessary and sufficient condition for it to have a characteristic matrix is for it to have the property (ACP).
7.9 Normal Matrices
A matrix is normal (in the sense of Church 1956) if the set of designated elements forms a "truth set" with respect to the classical logical operations, i.e.,
(i) −a ∈ D iff a ∉ D,
(ii) a ∧ b ∈ D iff a ∈ D and b ∈ D.
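For a finite matrix these two conditions can be checked by brute force. The Python sketch below (with our own encoding of the matrix) confirms, for instance, that the two-element Boolean matrix is normal, while the three-element Łukasiewicz matrix with 1 as the only designated value is not: condition (i) fails at the middle value, whose negation is itself.

```python
def is_normal(elements, neg, conj, designated):
    """Church-style normality: D behaves as a truth set for - and /\\."""
    for a in elements:
        if (neg[a] in designated) != (a not in designated):        # condition (i)
            return False
        for b in elements:
            if (conj[a, b] in designated) != (a in designated and b in designated):
                return False                                        # condition (ii)
    return True

# Two-element Boolean matrix: normal.
two = [0, 1]
print(is_normal(two, {0: 1, 1: 0},
                {(a, b): min(a, b) for a in two for b in two}, {1}))    # True

# Three-element Lukasiewicz matrix (values 0, 1/2, 1; D = {1}): not normal.
three = [0, 0.5, 1]
print(is_normal(three, {0: 1, 0.5: 0.5, 1: 0},
                {(a, b): min(a, b) for a in three for b in three}, {1}))  # False
```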
Clearly this definition presupposes a particular choice of the primitive logical connectives, and would have to be appropriately modified to provide for disjunction, the material conditional, Sheffer stroke, or whatever. But the particular choice above is convenient from our point of view (because of the association with the lattice notation), and of course, is well known for the fact that all of the other classical truth-functional operations can be defined from these. In particular, we shall assume the definitions of φ ∨ ψ = ∼(∼φ & ∼ψ) and φ ⊃ ψ = ∼(φ & ∼ψ). A normal matrix is one that can be viewed as semantically "OK" in the sense that its elements (regarded as propositions) can be divided up into "the true" and "the false" in such a way as to respect the classical logical operations. Algebraically, a normal matrix is such that if we just consider the operations ∧, ∨, and − (that is, if we consider its "reduct" (M, D, ∧, ∨, −)), then there exists a strong homomorphism of it into the two-element Boolean algebra 2 (recall that a strong homomorphism will have to carry all elements of D into 1, and all other elements to 0, and so in this instance is unique). Another way of looking at the situation is that if we "divide out" the reduct by counting two elements "equivalent" if either they are both designated or both undesignated, then this relation is in fact a congruence (on the reduct), and of course the quotient algebra determined by this congruence is just 2.
In this section we prove a result stated in Kripke (1965) giving necessary and sufficient conditions for a unary assertional logic to have a normal characteristic matrix. The proof that we give is based on a reconstruction worked out with Nuel Belnap and Peter Woodruff many years ago. Before stating the theorem, we define some requisite notions. We presuppose some antecedent knowledge of the notion of a classical tautology (a sentence φ is a tautology iff every interpretation of it in the two-element Boolean algebra assigns it the value 1). A set of sentences Γ tautologically implies a set of sentences Δ (in symbols, Γ ⊨ Δ) iff there exist γ₁, ..., γₘ ∈ Γ, and δ₁, ..., δₙ ∈ Δ such that the sentence (γ₁ & ... & γₘ) ⊃ (δ₁ ∨ ... ∨ δₙ) is a tautology.
Here we presuppose the customary definition of φ ⊃ ψ as ∼φ ∨ ψ, parentheses in the conjunction and disjunction are to be associated to the left, and either or both of the conjunction and disjunction may be missing (when the γs are missing the consequent must be a tautology, and when the δs are missing the antecedent must be such that its negation is a tautology). As a special case when Δ = {φ}, we have that Γ tautologically implies the sentence φ iff there exist sentences γ₁, ..., γₙ ∈ Γ such that (γ₁ & ... & γₙ) ⊃ φ is a tautology,
or, when Γ is empty, φ all by itself is a tautology.
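For finite Γ and Δ this relation is decidable by brute-force truth tables: Γ ⊨ Δ iff every Boolean valuation of the atoms that makes all of Γ true makes some member of Δ true, which for finite sets agrees with the official definition above. A minimal Python sketch under our own tuple encoding of sentences:

```python
from itertools import product

def atoms(s):
    return {s[1]} if s[0] == 'atom' else set().union(*(atoms(t) for t in s[1:]))

def value(s, v):
    """Classical truth value of sentence s under valuation v (a dict on atoms)."""
    op = s[0]
    if op == 'atom':
        return v[s[1]]
    if op == 'not':
        return not value(s[1], v)
    if op == 'and':
        return value(s[1], v) and value(s[2], v)
    if op == 'or':
        return value(s[1], v) or value(s[2], v)
    raise ValueError(op)

def taut_implies(gamma, delta):
    """Gamma |= Delta for finite sets, by checking all valuations."""
    voc = sorted(set().union(*(atoms(s) for s in gamma | delta)) or {'p'})
    for bits in product([False, True], repeat=len(voc)):
        v = dict(zip(voc, bits))
        if all(value(g, v) for g in gamma) and not any(value(d, v) for d in delta):
            return False
    return True

p, q = ('atom', 'p'), ('atom', 'q')
print(taut_implies({p, ('or', ('not', p), q)}, {q}))   # True: modus ponens
print(taut_implies({p}, {q}))                          # False
```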
Recall also that it is part of our notion of a logic that its theorems are closed under substitution. A logic is said to be consistent if f is not a theorem of L, where we define f to be φ & ∼φ, for some fixed sentence φ. A set of sentences Γ is said to be tautologically consistent if f is not tautologically implied by Γ. We need the notion of a sentence φ′ being an alphabetic variant of a sentence φ, which means that there is a one-one substitution σ that assigns atomic sentences to atomic sentences and such that σ(φ) = φ′.
A logic is said to be complete in the sense of Halldén (see Halldén 1951) iff whenever φ ∨ ψ is a theorem and φ and ψ are disconnected from each other (share no atomic formulas), then either φ is a theorem or ψ is a theorem. Halldén completeness is intimately related to the cancellation property, and indeed is equivalent to it, given quite usual assumptions. We shall show that Halldén completeness implies the symmetric cancellation property, under natural assumptions which we shall develop in the course of the proof.
Let us suppose the hypothesis of the symmetric cancellation property, that ∪Γᵢ ⊢ ∪Δᵢ. If ⊢ is compact, then there are finitely many finite subsets Γ′ⱼ and Δ′ⱼ (of Γᵢ and Δᵢ respectively), so that ∪Γ′ⱼ ⊢ ∪Δ′ⱼ. Letting γⱼ be the conjunction of all the sentences in Γ′ⱼ, and similarly with δⱼ, if ⊢ supports the Ketonen rules of "conjunction on the left" and "disjunction on the right," we have
γ₁ & ... & γⱼ & ... & γₙ ⊢ δ₁ ∨ ... ∨ δⱼ ∨ ... ∨ δₙ,
and if ⊢ has the deduction theorem property (the Gentzen rule of "conditional on the right"), we have
⊢ (γ₁ & ... & γⱼ & ... & γₙ) ⊃ (δ₁ ∨ ... ∨ δⱼ ∨ ... ∨ δₙ).
Assuming for the moment that we have closure under tautological implication, upon rearranging terms we obtain
⊢ (∼γ₁ ∨ δ₁) ∨ ... ∨ (∼γⱼ ∨ δⱼ) ∨ ... ∨ (∼γₙ ∨ δₙ).
The other hypothesis of the cancellation property tells us that each disjunction (∼γⱼ ∨ δⱼ) is disconnected from every other disjunction in the above. So by repeated applications of Halldén completeness we will obtain for some index j,
⊢ (∼γⱼ ∨ δⱼ),
or upon rewriting,
⊢ γⱼ ⊃ δⱼ.
If we have the natural properties that amount to the converse of the deduction theorem property, and the converse of the Ketonen rules, we can conclude
γⱼ ⊢ δⱼ,
and hence
Γⱼ ⊢ Δⱼ,
as desired for the cancellation property.
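The one move above that is not a primitive structural rule is the rearrangement of (γ₁ & ... & γₙ) ⊃ (δ₁ ∨ ... ∨ δₙ) into (∼γ₁ ∨ δ₁) ∨ ... ∨ (∼γₙ ∨ δₙ). That this is a tautological equivalence can be confirmed mechanically; here is a Python truth-table check for the case n = 2 (the sentence encoding is our own):

```python
from itertools import product

def value(s, v):
    op = s[0]
    if op == 'atom': return v[s[1]]
    if op == 'not':  return not value(s[1], v)
    if op == 'and':  return value(s[1], v) and value(s[2], v)
    if op == 'or':   return value(s[1], v) or value(s[2], v)

def imp(a, b): return ('or', ('not', a), b)

g1, g2, d1, d2 = (('atom', x) for x in 'g1 g2 d1 d2'.split())
lhs = imp(('and', g1, g2), ('or', d1, d2))
rhs = ('or', imp(g1, d1), imp(g2, d2))

names = ['g1', 'g2', 'd1', 'd2']
same = all(value(lhs, dict(zip(names, bits))) == value(rhs, dict(zip(names, bits)))
           for bits in product([False, True], repeat=4))
print(same)   # True: the two forms agree on all sixteen valuations
```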
We have thus seen how, on very usual assumptions, the cancellation property (at least the symmetric version) falls out of Halldén completeness. It turns out, as the reader can easily verify, that, except for compactness, all of the assumptions above can be justified on the basis of one general assumption, to wit, that the relation ⊢ includes the relation of tautological implication. Further, as the reader can verify, on this same assumption, Halldén completeness follows from the cancellation property. Further, the reader can verify that this equivalence holds, on the same assumption, for the asymmetric case. We record these facts in the following theorem.
Theorem 7.9.1 Let ⊢ be a compact symmetric (or asymmetric) consequence relation including the relation of tautological implication (in symbols, ⊨ ⊆ ⊢). Then ⊢ has the cancellation property iff ⊢ satisfies Halldén completeness, i.e., whenever ⊢ φ ∨ ψ, then if φ and ψ are disconnected from each other, then either ⊢ φ or ⊢ ψ.
Having investigated Halldén completeness, we return to the conditions of Kripke for normality:
Theorem 7.9.2 (Kripke) Let L be a unary assertional logic. Then L has a normal characteristic matrix iff all of the following conditions hold:
(i) All truth-functional tautologies (in &, ∨, and ∼) are theorems of L.
(ii) If φ and φ ⊃ ψ (= ∼φ ∨ ψ) are theorems of L, then ψ is a theorem of L.
(iii) L is consistent.
(iv) L is complete in the sense of Halldén.
Proof Let ⊨ represent tautological implication. Then it is easy to check that this relation has the deduction theorem property and its converse, i.e.,
Γ, φ ⊨ ψ iff Γ ⊨ φ ⊃ ψ.
We start with the set L of theorems of L as a basis, and inductively build up a set of sentences T that behaves as a "truth set," i.e.,
(i) ∼φ ∈ T iff φ ∉ T,
(ii) φ & ψ ∈ T iff φ ∈ T and ψ ∈ T.
The desired matrix will be of the "parasitic" variety, i.e., its elements will be sentences and T will be the designated set. We "enumerate" the set of sentences S: φ₁, φ₂, ..., φα+1, .... In the usual case where S is denumerable, the indices will be just the positive integers; in the more general case the indices will be ordinals, and in particular, for convenience, will always be successor ordinals (we skip over the limit ordinals). We define T₀ = L. At each successor stage α + 1, if φα+1 is not a theorem of L, we define Tα+1 to be the result of adding φα+1 to Tα if the result is tautologically consistent, and at the same time, adding ∼φ′α+1 (where φ′α+1 is the first alphabetic variant of φα+1 in the enumeration disconnected from all sentences already in Tα+1 − L). If the result of adding φα+1 is not tautologically consistent, or φα+1 is a theorem of L (in which case it got in at stage 0), we let Tα+1 be just Tα. At limit stages (of course there will be none in the usual case when S is denumerable) we just take unions of all the sets introduced at earlier stages. And finally, we define T to be just the union of all of the stages (T will be a set since we begin with a set of sentences).
The construction defined above is just an interesting variation on the usual Henkin-style proof of the completeness of classical propositional logic.
Lemma 7.9.3 At each stage α in the construction defined above, the set Tα is tautologically consistent. Hence the set T, defined as the union of all stages, must be tautologically consistent.
Proof T₀ is tautologically consistent, since it is just L, which we shall now show is not only consistent (as is given) but also tautologically consistent. It is easy to see that L is closed under tautological implication (because of conditions (i) and (ii)). Thus L is tautologically consistent if it is consistent, and of course this is just condition (iii).
Let us next consider the successor case Tα+1, and suppose, contrary to what we want to show, that it is not tautologically consistent. This means that
Tα+1 ⊨ f.
Since (by inductive hypothesis) Tα is tautologically consistent, a formula φ (and a negated alphabetic variant) must have been added at stage α + 1, and so the inconsistency must be laid at their feet. This means
Tα, φ, ∼φ′ ⊨ f.
By the finitary definition of tautological implication and the deduction theorem property, we know that there exist sentences γ₁, ..., γₙ ∈ Tα so that
(γ₁ & ... & γₙ) ⊃ (φ ⊃ (∼φ′ ⊃ f)) is a tautology.
Some of the γᵢs come from L and others were introduced at later stages. Let us denote the conjunction of the first of these by λ and the conjunction of the latter by τ. Then (rearranging terms and using the fact that ∼ψ ⊃ f is truth-functionally equivalent to ψ)
λ ⊃ (τ ⊃ (φ ⊃ φ′)) is a tautology,
and hence, by condition (i), a theorem of L. Clearly λ is a theorem of L (since, as observed above, L is closed under tautological implication). So by condition (ii),
τ ⊃ (φ ⊃ φ′) is a theorem of L.
Hence (by closure under tautological implication),
(∼τ ∨ ∼φ) ∨ φ′ is a theorem of L.
By the construction, the sentence φ′ is disconnected from (∼τ ∨ ∼φ). By condition (iv) (the Halldén completeness property), at least one of (∼τ ∨ ∼φ) or φ′ is a theorem of L. If it is φ′, then since L is closed under substitution, φ is also a theorem of L. But in that case φ was not added at stage α + 1, contrary to our assumption above that it was this addition (along with ∼φ′) that led to inconsistency. So then τ ⊃ ∼φ must be a theorem of L. This contradicts the supposition that φ could be added with tautological consistency to Tα.
At limit stages Tᵦ, any tautological inconsistency (since it comes from only a finite number of sentences) must already exist at some previous stage Tα, contrary to inductive hypothesis. □
Lemma 7.9.4 T is a truth set.
Proof We first observe as a sublemma that if T ⊨ ψ, then ψ ∈ T. The point is that ψ is some sentence φα+1 in the enumeration, and when its turn came up in the construction of T, it would at that point have been added, since it could in no way interfere with the tautological consistency of T, being already a tautological consequence. We next observe that we do not have ψ, ∼ψ ∈ T, for otherwise T is not tautologically consistent. We next show that we have at least one of ψ, ∼ψ ∈ T. If neither is in T, then we consider T ∪ {ψ} and T ∪ {∼ψ}. In this case, each of ψ and ∼ψ must have led to tautological inconsistency when an attempt was made to add them to the construction when their turn came up. So we have
T, ψ ⊨ f, and T, ∼ψ ⊨ f.
By obvious moves licensed by tautological implication, we then have
T ⊨ f,
contradicting that T is tautologically consistent. We have thus established
(i) ∼ψ ∈ T iff ψ ∉ T.
But it is easy to see as well that
(ii) ψ ∧ χ ∈ T iff ψ ∈ T and χ ∈ T,
using the sublemma that T is closed under tautological implication together with the obvious facts that ψ ∧ χ ⊨ ψ, ψ ∧ χ ⊨ χ, and ψ, χ ⊨ ψ ∧ χ.
Turning back now to completing the main lines of the proof, we finally have to verify that "the Kripke matrix" (S, T) is characteristic for L. It should be clear that if φ is a theorem of L, since L is closed under substitution, any substitution instance σ(φ) will be a member of T. And since interpretations are just substitutions, this establishes soundness, since φ will then always be assigned a designated value in every interpretation. Turning now to completeness, if φ is not a theorem of L, then either φ was never added in the construction of T, or if it was, at the same time a substitution instance ∼φ′ of ∼φ was added. Since T is consistent, this means that φ′ is not in T. In either case it is then possible to find a substitution instance of φ that is not designated, i.e., an interpretation that falsifies φ. □
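The one distinctive syntactic device in this construction is the alphabetic variant: a one-one renaming of atoms that moves a sentence onto fresh atoms, disconnected from everything added so far. A minimal Python sketch of such a renaming (the encoding and names are ours):

```python
from itertools import count

def atoms(s):
    return {s[1]} if s[0] == 'atom' else set().union(*(atoms(t) for t in s[1:]))

def substitute(s, sigma):
    """Apply an atom-to-atom substitution to a sentence."""
    if s[0] == 'atom':
        return ('atom', sigma.get(s[1], s[1]))
    return (s[0],) + tuple(substitute(t, sigma) for t in s[1:])

def alphabetic_variant(s, forbidden):
    """A one-one renaming of the atoms of s to fresh atoms outside
    `forbidden`, yielding a variant disconnected from that vocabulary."""
    fresh = (f'v{i}' for i in count())
    sigma = {}
    for a in sorted(atoms(s)):
        sigma[a] = next(x for x in fresh if x not in forbidden)
    return substitute(s, sigma)

phi = ('or', ('atom', 'p'), ('not', ('atom', 'q')))
print(alphabetic_variant(phi, forbidden={'p', 'q', 'v0'}))
# ('or', ('atom', 'v1'), ('not', ('atom', 'v2')))
```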
7.10 Normal Atlases
Generalizing the idea of a normal matrix from the previous section, an atlas (A, (Dᵢ)) will be said to be normal if for every index i, the matrix (A, Dᵢ) is normal. Talking informally, this means that the atlas can be understood as a collection of possible worlds realized as sets of true propositions. Since we know that it is atlases that are key to the characterization of logics understood as consequence relations (of either the asymmetric or symmetric stripe), it is natural to raise the question as to when such logics have a characteristic normal atlas.
Let us first observe that if an asymmetric logic tolerates inconsistency in the sense that there exist sentences φ, ∼φ, and ψ so that it is not the case that φ, ∼φ ⊢ ψ, then it is clearly impossible that it has a normal characteristic atlas. For to falsify such an implication we would need to designate both an element a and also −a (realizing an "impossible world"). For a symmetric logic, we would talk about the existence of sentences φ, ∼φ, and a set of sentences Γ such that it is not the case that φ, ∼φ ⊢ Γ, and we would also want to bring to attention the dual situation when it is not the case that Γ ⊢ φ, ∼φ (which would require the realization of an "incomplete world"). One quick way of ruling out such situations is to require that the consequence relation ⊢ include tautological implication, i.e., that ⊨ ⊆ ⊢.
Theorem 7.10.1 Any symmetric, or compact asymmetric, logic ⊢ has a characteristic normal atlas iff it satisfies the following conditions:
(i) The consequence relation includes the relation of tautological consequence (in symbols, ⊨ ⊆ ⊢).
(ii) Γ ⊢ φ ⊃ ψ, Δ iff Γ, φ ⊢ ψ, Δ (Δ is of course empty for an asymmetric logic).
Remark 7.10.2 The reader may desire a comparison of the above conditions with those for the corresponding theorem for unary assertional logics of Kripke in Section 7.9. Condition (i) above is the straightforward generalization of condition (i) of the Kripke theorem. Also, it is easy to see (because of the cut property) that (ii) above implies
(ii′) if Γ ⊢ φ, Δ and Γ ⊢ φ ⊃ ψ, Δ, then Γ ⊢ ψ, Δ,
which is the obvious generalization of the corresponding condition (ii) for unary assertional logics given in the statement of the theorem by Kripke in Section 7.9. Looking at the converse direction, it is easy to see (using dilution and cut) that (ii′) implies half of (ii) (the direction from left to right), but we believe that the other direction does not follow. An informal argument for this goes as follows. Notice that (ii′) does not affect the left-hand side of ⊢. Since neither dilution nor overlap can move formulas from the left to the right, these rules cannot lead from Γ, φ ⊢ ψ, Δ to the premises of (ii′), nor to Γ ⊢ φ ⊃ ψ, Δ. The only hope would be to cut φ, but dilution and overlap cannot produce an appropriate premise, since Γ ⊢ Δ is not given.
The theorem of Kripke had two additional conditions: (iii) consistency; and (iv) Halldén completeness. We have no need for a generalization of consistency, (say) that ⊢ ∅ (or ∅ ⊢ f) does not hold, because in the construction below we are always trying to find a counter-example for some consequence. So we are given a set Γ and a sentence φ such that it is not the case that Γ ⊢ φ, and from this we can argue that Γ is a consistent set and then use it as the basis of our construction of a truth set. Thus we can drop (iii) as a condition. This construction is even easier than in the Kripke theorem, because we do not have to find counter-examples for all non-theses in relation to the same set of designated values. This means that we can drop the condition of Halldén completeness.
Proof The proof differs for (1) the asymmetric and (2) the symmetric case.
(1) The proof for a compact asymmetric consequence relation is much like that for the unary assertional case (cf. Remark 7.10.2). Again, necessity is straightforward and is left to the reader as an exercise. For sufficiency, let us suppose that not (Γ ⊢ φ). We are going to inductively expand Γ ∪ {∼φ} to a truth set. Thus we set T₀ = Γ ∪ {∼φ}. We next verify that T₀ is consistent. If not, then Γ ∪ {∼φ} ⊢ f, and so (by (ii)), Γ ⊢ ∼φ ⊃ f. But ∼φ ⊃ f ⊨ φ, and so by (i) and the cut property we have Γ ⊢ φ, contrary to our hypothesis. We then enumerate the new set of sentences S: φ₁, φ₂, ..., φα+1, .... In the usual case where S is denumerable, the indices will be just the positive integers; in the more general case the indices will be ordinals, and in particular, for convenience, will always be successor ordinals. (We skip over the limit ordinals.) At each successor stage α + 1, we define Tα+1 to be the result of adding φα+1 to Tα if the result is consistent, and closing the result under ⊢. Otherwise Tα+1 is defined to be just Tα. At limit stages (of course, there will be none in the usual case when S is denumerable) we just take unions of all the sets introduced at earlier stages. And finally, we define T to be just the union of all of the stages.
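In a decidable propositional setting this stage-by-stage construction can be run mechanically. The sketch below carries it out over a small finite stock of sentences, using classical truth-table consequence as a stand-in oracle for a compact ⊢ satisfying (i) and (ii); the oracle, the sentence encoding, and the finite enumeration are all illustrative assumptions of ours, not part of the theorem.

```python
from itertools import product

def value(s, v):
    op = s[0]
    if op == 'atom': return v[s[1]]
    if op == 'not':  return not value(s[1], v)
    if op == 'and':  return value(s[1], v) and value(s[2], v)
    if op == 'or':   return value(s[1], v) or value(s[2], v)

def entails(gamma, phi, voc):
    """Stand-in consequence oracle: classical truth-table consequence."""
    return all(value(phi, dict(zip(voc, bits)))
               for bits in product([False, True], repeat=len(voc))
               if all(value(g, dict(zip(voc, bits))) for g in gamma))

def truth_stage_construction(gamma, phi, sentences, voc):
    """T0 = gamma + {~phi}; add each enumerated sentence whenever the result
    stays consistent, closing under the oracle at every stage."""
    f = ('and', ('atom', voc[0]), ('not', ('atom', voc[0])))   # falsum
    T = set(gamma) | {('not', phi)}
    for s in sentences:
        if not entails(T | {s}, f, voc):                # adding s is consistent
            T |= {s}
            T |= {t for t in sentences if entails(T, t, voc)}   # close under |-
    return T

voc = ['p', 'q']
p, q = ('atom', 'p'), ('atom', 'q')
sentences = [p, q, ('not', p), ('not', q), ('or', p, q), ('and', p, q)]
T = truth_stage_construction({('or', p, q)}, p, sentences, voc)
print(('not', p) in T and q in T)   # True: the resulting truth set makes q true
```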
Lemma 7.10.3 T is closed under ⊢.
Proof This follows trivially from the fact that each successor stage is closed under ⊢, given that ⊢ is compact. That each successor stage is closed under ⊢ is easily seen to be the case, for if T ⊢ ψ, then when ψ's turn came up in the enumeration, it would have been added, since adding it could not interfere with T's consistency, it being already a consequence of T. (The reader can add the details, using dilution and cut.) □
Lemma 7.10.4 At each stage α in the construction defined above, the set Tα is consistent.
Proof We first give the proof for the case of a compact asymmetric logic. That T₀ is consistent was shown above. Let us suppose that some successor stage Tα+1 is inconsistent. The reader can easily see that Tα then must have been inconsistent, contrary to inductive hypothesis. At limit stages Tᵦ, any inconsistency (since, by compactness, it comes from only a finite number of sentences) must already exist at some previous stage Tα, contrary to inductive hypothesis. □
Corollary 7.10.5 T (defined as the union of all the stages in the construction defined above) is consistent.
Proof Similar to that for limit stages in Lemma 7.10.4. □
Lemma 7.10.6 T is a truth set.
Proof We first observe that we do not have ψ, ∼ψ ∈ T, for otherwise T is not consistent (this uses the fact that ψ, ∼ψ ⊨ f, and (i)). We next show that we have at least one of ψ, ∼ψ ∈ T. If neither is in T, then we consider T ∪ {ψ} and T ∪ {∼ψ}. Each of ψ and ∼ψ must have led to inconsistency when an attempt was made to add it at the appropriate stage in the construction. So we have (using dilution)
T, ψ ⊢ f, and T, ∼ψ ⊢ f.
By obvious moves (using (ii), (i) and the facts that ψ ⊃ f ⊨ ∼ψ and ∼ψ ⊃ f ⊨ ψ), we have
T ⊢ ∼ψ, and T ⊢ ψ.
But (using (ii)) then
T ⊢ ∼ψ ⊃ (ψ ⊃ f),
and so (using (ii′) twice), we have
T ⊢ f,
contradicting that T is consistent. We have thus established
(i) ∼ψ ∈ T iff ψ ∉ T.
But it is easy to see as well that
(ii) ψ ∧ χ ∈ T iff ψ ∈ T and χ ∈ T,
using the lemma that T is closed under ⊢, together with the obvious facts (derived using (i)) that ψ ∧ χ ⊢ ψ, ψ ∧ χ ⊢ χ, and ψ, χ ⊢ ψ ∧ χ. □
We finally construct the desired "Kripke atlas" by so constructing a truth set Tᵢ for each pair (Γ, φ) such that it is not the case that Γ ⊢ φ, and by letting the desired atlas be (S, (Tᵢ)). For soundness, if Δ ⊢ ψ, then σ*(Δ) ⊢ σ(ψ) (for any substitution σ), and thus if σ*(Δ) ⊆ Tᵢ then σ(ψ) ∈ Tᵢ (since each Tᵢ is closed under ⊢). This means that any substitution, i.e., interpretation, that designates all members of Δ also designates ψ, and so we have soundness. Turning now to completeness, if not (Γ ⊢ φ), then it is easy to see from the construction that there exists a truth set Tᵢ such that Γ ⊆ Tᵢ, and yet φ is not in Tᵢ. The identity substitution (interpretation) thus designates all members of Γ in some designated set of the Kripke atlas, but fails to designate φ. (This completes the proof of Theorem 7.10.1 for case (1).)
(2) The proof for a symmetric consequence relation is a bit easier, since the global cut property allows us to drop the hypothesis of compactness, and the construction of a truth set T does not involve "Lindenbaumizing." (Again, we leave the necessity part of the proof to the reader.) If it is not the case that Γ ⊢ Δ, then we can simply invoke the global cut property to obtain a partition of the sentences into two sets of sentences (T, F) such that it is not the case that T ⊢ F. It is easy to verify that the following hold:
(0) Not (T ⊢ F).
(1) For each sentence φ, φ ∈ T or φ ∈ F (exhaustiveness).
(2) For no sentence φ, φ ∈ T and φ ∈ F (exclusiveness).
(3a) For no sentence φ, φ ∈ T and ∼φ ∈ T.
(3b) For no sentence φ, φ ∈ F and ∼φ ∈ F.
(4a) For each sentence φ, φ ∈ T or ∼φ ∈ T.
(4b) For each sentence φ, φ ∈ F or ∼φ ∈ F.
(5a) φ ∧ ψ ∈ T iff φ ∈ T and ψ ∈ T.
(5b) φ ∨ ψ ∈ F iff φ ∈ F and ψ ∈ F.
(0)-(2) restate the global cut property. As for (3a), if φ, ∼φ ∈ T, then since φ, ∼φ ⊨ Δ, by (i), we would have (using dilution) that Γ ⊢ Δ, contrary to hypothesis. ((3b) follows similarly.) (4a) follows from the fact that a sentence has to end up on one side of the partition (T, F) or the other. If φ ends up in T, fine; if it ends up in F, then by (3b) we know that ∼φ cannot also be in F, and so ∼φ ∈ T as needed. ((4b) is argued dually.) As for (5a), let us suppose that φ ∧ ψ ∈ T, but that, say, φ ∉ T. Then by exhaustiveness φ ∈ F, and since φ ∧ ψ ⊨ φ, we have by (i) (and dilution) that T ⊢ F, contrary to (0). As for the converse, if both φ, ψ ∈ T, and yet φ ∧ ψ ∉ T, then (again by exhaustiveness) φ ∧ ψ ∈ F. And since φ, ψ ⊨ φ ∧ ψ, we would have (using (i) and dilution) T ⊢ F, contrary to (0). (Again (5b) is argued dually.)
Using (0)-(5) it is easy to see that T is a truth set. Further, then, for each pair (Γᵢ, Δᵢ) such that not (Γᵢ ⊢ Δᵢ) there exists such a pair (Tᵢ, Fᵢ), and so we can use the "parasitical" atlas defined on the algebra of sentences S, (S, (Tⱼ)), as a normal atlas in which every non-consequence can be falsified (completeness). It is also clear that no correct consequence is falsified this way, for if Π ⊢ Θ, and yet this consequence could be falsified, then there would have to exist a pair (Tᵢ, Fᵢ) and a substitution σ so that σ*(Π) ⊆ Tᵢ and σ*(Θ) ⊆ Fᵢ. But because ⊢ is formal, we know that σ*(Π) ⊢ σ*(Θ), and so by dilution we would have Tᵢ ⊢ Fᵢ, contrary to specifications. □
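As a concrete illustration, any classical valuation induces exactly such a partition (T, F), and properties (3a)-(5b) can then be checked mechanically, as in this small Python sketch (the encoding is ours):

```python
def value(s, v):
    op = s[0]
    if op == 'atom': return v[s[1]]
    if op == 'not':  return not value(s[1], v)
    if op == 'and':  return value(s[1], v) and value(s[2], v)
    if op == 'or':   return value(s[1], v) or value(s[2], v)

p, q = ('atom', 'p'), ('atom', 'q')
sentences = [p, q, ('not', p), ('not', q),
             ('and', p, q), ('or', p, q), ('not', ('and', p, q))]
v = {'p': True, 'q': False}
T = {s for s in sentences if value(s, v)}
F = {s for s in sentences if not value(s, v)}

# (3a)/(3b): no sentence sits with its negation on the same side.
assert all(('not', s) not in T for s in T)
assert all(('not', s) not in F for s in F)
# (5a): a conjunction is in T iff both conjuncts are.
assert (('and', p, q) in T) == (p in T and q in T)
print(sorted(map(str, T)))
```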
Corollary 7.10.7 (Symmetric case) Any compact symmetric pre-consequence relation has a normal characteristic atlas.
Proof Immediate from the second half of the proof of Theorem 7.10.1, and from Lemma 6.8.4 showing that a compact symmetric pre-consequence relation is a symmetric consequence relation. □
7.11 Normal Characteristic Matrices for Consequence Logics
In this section we in effect combine the results of the previous three sections. The background for all of this work is the fact that every unary assertional logic has a characteristic matrix (the Lindenbaum matrix). In Section 7.8 we examined the corresponding property for consequence logics, providing necessary and sufficient conditions (due to Shoesmith and Smiley) for a consequence logic (of either the asymmetric or symmetric flavor) having a characteristic matrix. In Section 7.9 we returned to unary assertional logics, and raised the question of a stronger, normal, characteristic matrix, again providing necessary and sufficient conditions (due to Kripke) for a unary assertional logic to have a normal characteristic matrix. In Section 7.10 we came back to consequence logics, raising the normality issue, but this time in the context of atlases instead of matrices
MATRICES AND ALGEBRAS
271
(just as every unary assertionallogic has a characteristic matrix, so every consequence logic has a characteristic atlas; the interest then extends to the normal ones in each case). Now in the present section we attack the question of when a consequence logic has a normal characteristic matrix, combining all of the issues of strength and generality at once.
Theorem 7.11.1 For a stable symmetric logic I- the following constitute necessmy and sufficient conditions for it to have a normal characteristic matrix: (i) I- has the cancellation property; (ii) F ~ I- (F is tautological implication); (iii) r I- cjJ :J If/, Ll iff r, cjJ I- If/, Ll.
Proof Before we begin, let us recall the observation made in Section 7.9 in discussing the conditions of the theorem of Kripke. There it was remarked that in the presence of condition (ii) above, Hallden completeness and the Cancellation Property amount to the same thing, so we can freely substitute them below in our reasoning. Necessity is straightforward and is left to the reader. As for sufficiency, we begin as with the theorem of Shoesmith and Smiley to collect together all the pairs (n, Lli) such that it is not the case that ri I- Lli' and then for each such pair make a disjoint copy Si of the set of sentences S (we let one of these be S itself). We then union these together to get a new set of sentences S', and define a new consequence relation 1-' on the subsets of S', making 1-' the closure of I- under substitution and dilution. That 1-' is a symmetric consequence relation is just the content of the hypothesis of "stability." We then verify as before that Ur; ¥ U Ll;, and then invoke the global cut property so as to partition S' into two sets T and F so that U r i ~ T and U Lli ~ F. We now have only to verify that T is a truth set, and this velification proceeds exactly like the verification for the symmetric case of the conesponding theorem for atlases of Section 7.10. D We can also prove the following results, by techniques familiar by now (and left to the reader).
Corollary 7.11.2 A compact symmetric pre-consequence relation has a normal characteristic matrix under precisely the same set of necessmy and sufficient conditions as in Theorem 7.1l.l. Theorem 7.11.3 Let I- be a compact asymmetric logic. Then the following are necessmy and sufficient conditions for I- to have a normal characteristic matrix: (i) I- has the asymmetric cancellation property; F ~ I- (F is asymmetric tautological implication); r I- cjJ :J If/ (ff r, cjJ I- If/.
(ii) (iii)
7.12
Matrices and Algebras
Although a matlix is an algebra, it is not just an algebra because of the need of singling out a designated subset. This is unfortunate in that it means in general that we
272
MATRICES AND ATLASES
cannot just cmT)' over results from universal algebra9 and apply them to matrices without thought (although often close analogs can be obtained). But in this section we shall discuss how it is that often "in real life" matrices can be viewed (without loss) as just algebras. Often a matrix M will have an "implication" operation --+, so that if we define an "implication" relation a :S b iff a --+ bED, it turns out that :S is a partial ordering. If in addition D is a (positive) cone under this partial order (a E D and a :S b only if bED), we shall call the matrix standard. There is a slightly weaker notion that is also of interest. If:s turns out only to be a pre-order (not necessarily anti-symmetric), then, given one more condition which we shall next desclibe, we shall call the matrix pre-standard. Thus define a == b iff both a :S band b :S a. The additional requirement is that == must be a congruence (when :S is anti-symmetric == is just identity and we have no need for this requirement). Note that == is a strong congruence given the requirement above that D be a cone. It is easy to see that the quotient matrix of a pre-standard matrix is itself standard. Now for standard matrices a trick for throwing away D is to find some distinguished element e so that D = {x E M: e :S x}. The element e can be understood as just a nullary operation added to the underlying algebra. Unfortunately, this does not by itself suffice to let us consider M as just an algebra, since we still have the partial order with which to contend. But when there is a semi-lattice operation 1\ so that a :S b holds just when a 1\ b = a, then the reduction of a matrix to a plain algebra can be complete. (Clearly it would also suffice to have the dual semi-lattice operation V with a :S b iff a V b = b.) This allows us to give an entirely equational characterization of D as the set of elements satisfying the equation e 1\ x = e. So let us call a standard matrix with an operation that interacts in the required way with :S (and hence ultimately with --+) a completely standard matrix. Another tlick is to find an equation of the form sea) = tea) that holds of exactly the designated elements D. Elok and Pigozzi (1989) observe that this can be done for the relevance logic R using the fact that a is designated iff a --+ a :S a. And, of course, this last can be restated equationally as (a --+ a) 1\ a = a. There are a few things that need to be checked about the coincidence of matrix notions and algebraic notions for completely standard matrices. Since equations are preserved under algebraic homomorphisms, it is easy to see that a (weak) matl'ix homomorphism is just an ordinm'y algebraic homomorphism. Also, it is easy to see that a submatrix is just a subalgebra. First notice that if A' is a subalgebra of A, then (the nullary operation) e' has to be identical to e. But then the cone of elements in A' determined bye' is just the same as the cone of elements in A determined bye, i.e., D' = A' n D, as required. It is left as an exercise for the reader to verify that direct (and subdirect) products of matrices are just direct (and subdirect) products in the algebraic sense. (The verification falls back on the componentwise definitions of designation and 9Matrices can be regarded as many-sorted algebras in a natural way, and we could then appeal to results of many-sorted universal algebra. We have thought it best to leave the latter subject outside the scope of this monograph, however.
WHEN IS A LOGIC "ALGEBRAIZABLE"?
273
:S, once it is noted that the distinguished element of the direct product is just the indexed set of the distinguished elements of the component algebras.) The above ideas are perhaps most familim' in the context of classical logic and Boolean algebras, where the distinguished element e can be picked as the greatest element 1, and so D = {I}. The particular choice of e depends, of course, on the posit that all logical truths co-imply one another (an assumption common to a number of logics other than classical logic, e.g., the usual modal and intuitionistic logics). It turns out that the general idea is of much wider utility, and can be applied even to the usual relevance logics (which certainly have no such posit), although the logics have to be extended conservatively with a constant sentence t with the property that it is a theorem that implies all theorems. 7.13
When is a Logic "Algebraizable"?
We have been examining logics using the tools of algebra. It is natural to ask: how far does this methodology extend? We here give some rough answers to this question. Our answers are rough because we feel that a more precise answer may actually get in the way of further research. We follow Chairman Mao in wanting a hundred flowers to bloom. An algebra is simply a set with some operations. A logic is a set of sentences with some kind of consequence relation. For the sake of clarity we shall focus on asymmetl'ic consequence. The applications to symmetlic consequence and unary assertional systems are often left to the reader. We start by recalling some results of the previous chapter. For a unary assertional logic we showed that its Lindenbaum matrix characterizes its assertions, whereas for an asymmetric consequence logic its Lindenbaum atlas characterizes its consequences, and for a symmetric consequence logic it is the corresponding Scott atlas that does the trick. An atlas is just an algebra with many designated sets, and it is easy to see that an atlas (A, (Di) iEJ) can be "unfolded" into an equivalent indexed set of matrices ((A, Di)) iEJ (equivalent in the sense of validating the same consequences, whether they be unm'y, asymmetric, or symmetric). When a collection of matrices all have the same underlying algebra, then this is sometimes called a bundle. Summarizing, we have shown in Chapter 5 that formal logics, whether they be unary, asymmetric, or symmetric, all have a "matrix semantics" in the following sense.
Theorem 7.13.1 For eVelY unary assertional logic L there is a class of matrices M such that for every sentence cp: I- L cp iff 1= M cp. For evelY asymmetric consequence logic L there is a class of matrices M sLlch that for evelY set of sentences r and sentence cp : r I-L cp (If 1=M cp. For eVelY symmetric consequence logic L there is a class of matrices M sLlch that for every set of sentences rand l::!. : r I- L l::!. (If r 1= M l::!.. A matrix is an algebra with a designated subset, so this result shows that algebras figure centrally in the semantics of logic. On the other hand a matrix is more than an algebra because of its designated subset D. In addressing which logics are "very algebraizable" we must address the question when D can be defined "algebraically."
274
WHEN IS A LOGIC "ALGEBRAIZABLE"?
MATRICES AND ATLASES
275
Czelakowski (1981) defined an algebraic semantics for a logic to be a class of matrices charactelizing the consequence relation of the logic, with each matrix having just one designated element d. Without the restriction to just one designated element, we shall call this a matrix semantics. Let us introduce a predicate T(cjJ) which intuitively means "cjJ is true." This means that we can add a constant ad to the language of the logic, with the interpretation rule that lead) = d. We can now define T(cjJ) as cjJ = ad. The idea is that an inference licensed in classical logic, e.g., {p, p -+ q} f- q, can be translated into a statement about Boolean algebras, namely, if x = 1, and x -+ y = 1, then y = l. Note that the relettering is unnecessary if we use the same variables and operation symbols in the language of the algebra that we use in the language of the logic, and we shall adopt this simplifying convention for this discussion. Having a single designated element works fine for many logics, including classical and intuitionistic logics, where any theorem is implied by any sentence whatsoever, and so all theorems are logically equivalent. For these logics we can then look at the equivalence class of theorems as the greatest element 1 in the natural partial order defined by [cjJ] ~ [vr] iff f- cjJ -+ vr. For logics where a theorem is not implied by any sentence whatsoever, e.g., relevance logic and some other substructural logics, we have to resort to another device, briefly described in the previous section. The device introduced in Dunn (1966) (cf. 1975, Section 28.2 Anderson and Belnap) was to add to the language of the logic R a constant t conceived of as the conjunction of all theorems. Anderson and Belnap had shown that t can be added conservatively to R. In the Lindenbaum algebra [t] ~ [cjJ] is then a way of saying that cjJ is a theorem, and this can be abstracted out by having an "identity" element e and defining T(a) as e ~ a. The idea is that the "true" elements can be viewed as forming a principal cone, and it is just an "accident" that with, say, Boolean algebras this cone is degenerate and contains only 1. Cf. Remark 6.15.13. Note incidentally that e ~ a can be rephrased as an equation:
Remark 7.13.5 Just what would be the the "Blok-and-Piggoziesque" criterion for algebraizability of a symmetric consequence logic? The cliterion for symmemc consequ~nce would be of the same form as their criterion for an asymmetric consequence loglc, but K would be weaker than quasi-equationally definable. It would instead have to do with definability by "symmetric quasi-equations," i.e., formulas of the form: a conjunction of equations implies a disjunction of equations.
e /\ a = a.
Blok and Pigozzi show that theoremhood in the relevance logic R can be characterized without use of t, using the equivalence
But this does involve adding a constant to the language. A less contrived answer is to say that it is when we can find a set of equations of the form sex) = t(x) (containing only the vmiable x) which holds precisely of the elements of D. Blok and Pigozzi (1989) call these defining equations. They introduce this generalization in their definition of what it is for a logic to have an algebraic semantics. 10 It roughly amounts to saying that the logic has a matrix semantics, and the designated set D of each matrix can be uniformly characterized by equations. This is not quite right, since they also require that the set of defining equations be finite. This turns out to be no problem since they only consider logics that satisfy compactness, so an infinite set of premises r can always be traded for a finite subset r'. It would be interesting to investigate logics without this reshiction to compactness. IOThough they also implicitly introduce the requirement that the set of premises r in r I- rjJ is always finite. This obviously does not hurt for a system that satisfies compactness, but otherwise seems arbitrary.
Definition 7.13.2 (Blok and Pigozzi 1989) An asymmetric consequence logic L has an algebraic semantics iff there exists a finite set of equations with a single variable p, Sl (p) = tl (p), ... , sn(P) = tn(P)' such that for all i (1 ~ i ~ n);
r
f- L cjJ iff {Sl(vr/P) = tl(vr/P),· .. ,sn(vr/p) = tn(vr/p):
vr E r}
F=K Si(vr/p) = ti(vr/p).
The definition of when a unary assertional logic has an algebraic semantics is just the sp.eci~l case of this when r is empty. We leave to the reader to state the obvious generahzatlOn for a symmetric consequence logic. Blok and Pigozzi go on to give as their criterion for when an asymmetric consequence logic is "algebraizable": Definition 7.13.3 An asymmetric consequence logic is algebraizable iff there is a quasiequationally definable class of algebras K such that K is an algebraic semantics for the logic. Their cliterion for a unary consequence logic would presumably be essentially the same, except that the class is required to be equationally definable. Definition 7.13.4 A unary assertionallogic is algebraizable iff there is an equationally definable class of algebras K such that K is an algebraic semantics for the logic.
(cjJ /\ (cjJ
cjJ))
-+
(cjJ
¢;>
-+
cjJ),
which is fundamentally based on the R-theorem cjJ
-+
[(cjJ
-+
cjJ)
-+
cjJ]
(Demodalizer),
whose converse also holds. This means that the theorems can be characterized as those sentences cjJ such that (cjJ
-+
cjJ)
-+
cjJ
is a theorem. This may seem to be a circular definition, since "theorem" occurs in both the definiens and the definiendum, but the previous formula can be rephrased algebraicallyas (a -+ a) ~ a,
and obviously this can be reexpressed as the identity that Blok and Pigozzi require: a /\ (a
-+
a)
= (a -+ a).
276
MATRICES AND ATLASES
Whichever is chosen, R has a (characteristic) algebraic semantics. Blok and Pigozzi go on to show that the relevance logic E has no algebraic semantics. As the name "demodalizer" suggests, E lacks it since E is a modal logic in addition to being a relevance logic. In E, the necessity operator D¢ can be defined as (¢ --+ ¢) --+ ¢, and so (the demodalizer) has the effect of making mere truths necessary. Blok and Pigozzi also show that the implicational fragment of R (R... ) has no algebraic semantics. This relies on the fact that R-+ lacks conjunction (and disjunction, since a 1\ (a --+ a) = a --+ a can be dualized to a V (a --+ a) = a). It is well worth noting that this depends upon the lack of appropriate equation, since the inequality a --+ a ::; a would suit the bill admirably if inequations were allowed. Its surrogate e 1\ a = a works just as well for the extension of R with a logical constant t (W). We do not quarrel with the work of Blok and Pigozzi. Indeed, we praise it. Their criterion of algebraizability has a certain "philosophical" naturalness, and generalizes a large class of motivating logics. Also they can prove a number of interesting theorems, one of which we state. Recall that OCT) is the Leibniz congruence determined by a theory T on the underlying algebra of formulas of some given logic. Blok and Pigozzi establish the following interesting relationship between algebraizability and the Leibniz operator. Theorem 7.13.6 (Blok and Pigozzi 1989) An asymmetric consequence logic L is algebraizable ifffor all theories T and S, ifT ~ S then OCT) ~ O(S). But we think the restriction to equalities is too restrictive, and we propose another. Definition 7.13.7 An asymmetric consequence logic L is partially algebraizable iff there is a quasi-inequationally definable class of tonoids K such that K is a (sound and complete) semantics for the logic. Their criterion for a unary consequence logic is essentially the same, except that the class is required to be inequationally definable. Definition 7.13.8 A unary assertional logic is partially algebraizable iff there is an inequationally definable class of tonoids K such that K is a (sound and complete) semantics for the logic.
8 REPRESENTATION THEOREMS 8.1 8.1.1
Partially Ordered Sets with Implication(s) Partially ordered sets
We annunciated a theme in Remark 6.5.3 which we want to reemphasize here. Propositions can be understood as sets of "possible worlds" or, as we now stress, more generally as sets of "information states." The latter is more general in that there can be states of information that are inconsistent, incomplete, or both, and so do not correspond to possible worlds. An "information frame" will always consist of at least a set U whose elements are regarded as "states of information." It may have additional features, for example a binary relation b on U thought of as an "information order." a b p is to be read as "P contains at least the information a." This order is to be understood "qualitatively" and not "quantitatively" and is thus to be contrasted with Shannon's information theory. The information order arises quite naturally in a number of places, but we will content ourselves with assigning its origins to the Kripke-Grzegorczyk semantics for intuitionistic logic (cf. Chapter 11). Other features that an information frame might possess include accessibility relations and/or operations combining pieces of information. The most familiar example of the first is Rap (P is possible relative to a), which comes from the Kripke semantics for modal logic (cf. Chapter 10). As a less familiar example of the first we give Rapy, understood as something like "a and p are compatible from the standpoint of y," and as an example of the second we have something like "the combination of a and p." These can sometimes be parsed in terms of each other, e.g., Rapy can be understood as a • p b y. These last examples arise from the semantics of relevance logic (and more generally substructural logics), as developed by Routley, Meyer, Fine, and Urquhart. See Dunn (1986) or Anderson et al. (1992) for details and history. Routley and Meyer (1972, 1973) are the key references, along with Meyer and Routley (1972), which is even more important in the context of algebraic logic. We refer generally to sets whose elements are regarded to be states as "UCLA propositions." As the reader saw by working through Exercise 6.5.2, binary consequence, understood "mathematically" as a partial order between propositions, can be fully represented as inclusion between sets, and the latter can be understood "philosophically" as consequence between UCLA propositions. We run through the proof of this in slow motion, since grasping its essence is of major importance. Let P = (P, ::;) be a poset. We think of P as a set of propositions and::; as (binary) consequence. These "propositions" are conceived of abstractly. They could be anything;
PARTIALLY ORDERED SETS WITH IMPLICATION(S)
REPRESENTATION THEOREMS
278
we know nothing about their internal structure. Recall that a cone C is a subset of P which is closed upward under S;, i.e., if x E C and x S; y, then y E C. Let C be the set of all cones of P. A cone is a kind of primitive "theory" (at least it is closed under binary consequence) and as such can be regarded as an information state. So it is natural to interpret an "abstract" proposition a E P as the set of information states, i.e., theories (cones) that contain it. Thus, we define (1) h(a) = {CEC:aEC}. This is the desired representation function. We need to show that it preserves preting it as s on C):
S;
to be the residual of some fusion operator. Let us arbitrarily take implication to be the right residual-it will tum out, as we shall see, that there is no way of distinguishing the left residual from the right residual in the absence of fusion. So we assume that we have a po set with a binary operation (S, S;, -+). The only properties of -+ that we assume are that it is antitonic in its first position (antecedent) and isotonic in its second position (consequent), i.e., rule suffixing and rule prefixing. We shall call such a structure an implicational poset, and as we saw in Section 3.10 it is an example of a more general structure, discussed in Dunn (1993a), called a tonoid.
(inter-
s
(2) a S; b iff h(a) h(b), i.e., (3) a S; b iff VC E C, a E C implies b E C.
The left-to-right half follows immediately from the definition of a cone. The rightto-left half follows by instantiating C to be the cone determined by a, i.e., the smallest cone containing a. This involves a small "existence" proof since we have to show that there is indeed such a cone. In this case it is easy, since it explicitly constructed as [a) = {x : a S; x}. As we shall see, there are other cases, as with the representation of distributive lattices and Boolean algebras, when we have to go through some maximalization argument, but here all we need is a cone and it is easy to show that [a) is a cone. Thus suppose that x E [a) and x S; y. The first conjunct just means a S; x and so by transitivity with the second conjunct we obtain as; y, i.e., y E [a) as needed. But what we have is: (4) a E [a) implies b E [a).
Since a E [a) follows from the reflexivity of S;, we have bE [a) by the argument above, i.e., a S; b as desired. We still have to show that h is one-one, but this follows easily from anti-symmetry and (2). Thus if h(a) = h(b), then h(a) s h(b) and h(b) s h(a). But then by (2) a S; b and b S; a, and so by anti-symmetry a = b. This concludes the proof. Before we pass on, let us observe that sets of cones of the form h(a) have a special property. This is confusing as to type level, but not only are they sets of cones, they themselves are cones. Thus if C E h(a), i.e., a E C, and C s C', then a E C', i.e., C' E h(a). This is to say that we can require UCLA propositions to be not just any sets of states, but restrict them to sets of states closed upward under the information order, i.e., if a E p and a b fJ, then fJ E p. This is sometimes useful, and plays a role, for example, in the Kripke-Grzegorczyk semantics for intuitionistic logic (cf. Section 11.4). 8.1.2
279
Implication structures
In the previous section we regarded implication as a relation. We now tum to examining structures with an implication operation. While we take the point of view that useful properties of implication can be sorted out by relating them to properties of fusing premises, it must be acknowledged that fusion is (regrettably) still a relatively arcane notion. Accordingly it is interesting to wonder just what are the minimal properties of implication that are needed to allow it
Representation of Implicational Posets We shall now prove the following, which illustrates a more general result for tonoids. Theorem 8.1.1 EvelY implicational poset (S,
S;, -+) can be embedded in a right residuated partially ordered (p.o.) groupoid (S', S;, 0, -+).
Proof We prove this by way of a representation result. Let U be a non-empty set, and let R be a ternary relation on U. We call the structure (U, R) a ternary frame. We define the following operations on subsets of U: A
0
B
= {X:
3a E A,3fJ E B,RafJx};
A -+ B = {X: Va, VfJ, if RaxfJ & a E A, then fJ E B}; B +- A = {X: Va, VfJ, if RxafJ & a E A, then fJ E B}.
It is easy to verify that ('(.J(U), 0, -+, +-) is a residuated p.o. groupoid. Indeed, any subcollection S' closed under the operations 0, -+, +- is a residuated p.o. groupoid. We call these concrete residuated p.o. groupoids. 0
We can now state and prove the following lemma. Lemma 8.1.2 Every implicational poset (S,
S;, -+) can be embedded in a concrete re-
siduated p.o. groupoid. Proof Let U be the set of all cones C on S. Define the relation R as RCI C2C3
iff Vx, y, if x E CI and x -+ y E C2 then y E C3.
Using R we can now define the operation -+ on subsets of U as in the concrete residuated poset above. Next define the map h(a) = {C : C is a cone and a E C}. It is easy to see that h is one-one, since if a f:. b, then a S; b or b S; a or all b (a and b are unrelated). Without loss of generality we may choose as; band allb to be the cases considered. (1) If as; b then the principal cone [b) = {x : b S; x} is not in h(a), but certainly [b) E h(b) as well as [a) E h(b). (2) If allb, then neither [a) E h(b) nor [b) E h(a). Next we show that h preserves -+, i.e., C E h(a -+ b) iff C E h(a) -+ h(b). To facilitate this we first translate the left- and right-hand sides via their definitions: C E h(a -+ b) iff a -+ b E C; C E h(a) -+ h(b) iff VCI, C2, RCI CC2 & a E CI implies b E C2.
REPRESENTATION THEOREMS
280
It only remains to show then that
But the left-to-right half is immediate, given the canonical definition of R, which simply says that if an implication is in C and its antecedent is in Cl then its consequent is in C2. For the right-to-Ieft half, we proceed contrapositively, assuming that a --+ b ¢ C. We then show that 3CI, C2, RCI CC2 & a E CI yet b ¢ C2. We simply let CI = [a). To obtain C2, we first consider the principal dual cone determined by b, (b] = {x : x :s; b} . We then set C2 = U - (b] (it is easy to check that this is a cone). It is clear that a E C I and b ¢ C2. What needs argument is that RCI CC2. Recalling the canonical definition of R, this amounts to assuming that x E CI, x --+ Y E C and showing y E C2· For reductio let us suppose then that y ¢ C2. Our hypotheses that x E CI and y ¢ C2 become a :s; x, and y :s; b. Using the rule forms of prefixing and suffixing we can easily derive x --+ y :s; a --+ b (y :s; b implies x --+ y :s; x --+ b, a :s; x implies x --+ b :s; a --+ b, and apply transitivity). Then our hypothesis that x --+ y E C gives that a --+ b E C, contrary to our initial supposition. 0 Representation of Donble Implicational Posets Let us now consider the case where we have both "residuals" but no fusion operation to connect them. To this end we will define a double implicational poset to be a structure (S,:S;, --+, ~), where both (S,:S;, --+) and (S,:S;, ~) are implicational posets, and the arrows interact by "pseudo-assertion": (1) a:S; (b ~ a) --+ b;
(2) a:S; b
~
(a --+ b).
The following shows that we have again correctly axiomatized (in this case both) residuals without residuation. Theorem 8.1.3 Every double implicational poset can be embedded in a concrete residuated p.o. groupoid.
Proof From the theorem above we know tlrat every implicational poset is so embeddable, using the canonical relation R defined above, but let us now subscript it:
By a symmetric argument every implicational poset is also so embeddable using the canonical relation:
PARTIALLY ORDERED SETS WITH IMPLICATION(S)
that R-+CI C2C3 says '
We now consider the ternary frame (U, R o ), and the operation above: A
0
B
= {X:
0
0
y E C3.
on subsets U defined
3a E A, 3fJ E B,RoafJx}
Using the same function h as used in tlre representation above (that is, h(a) CD we show h(a
a
0
0
b)
= h(a)
0
= {C : a E
h(b), i.e.,
b E C iff 3Cj,C2,a E Cl,b E C2, & RoCIC2C.
As with the canonical definitions of the relations R-+, R+- above, one direction of this is immediate. Here, however, it is tlre direction from right to left. We tlren only consider the other direction. To this end we assume that a 0 b E C, define CI = [a), C2 = [b), and show RoCIC2C. Thus suppose x E Cl, Y E C2. Then a :s; x, b :s; y, and so by isotonicity we get a 0 b :s; x 0 y. Since tlre first of these is in C, which as a cone is closed upward, then x 0 y E C as needed. We are not yet through, because if we were to just add this representation of 0 to tlre representations of the operations --+, and +- above, we would end up using three different relations R o, R-+, R+-, when we only want one. But just as above we can prove that the three relations in fact coincide, this time in virtue of residuation. We leave details to the reader. Remark 8.1.5 Let us consider how we might in fact have been more subtle in our representations of residuated p.o. groupoids and related structures above. Let (U, R) be a ternary frame and let [;;; be a partial order on U. We shall call tlre structure (U, [;;;, R) an articulated frame if it satisfies the following "monotonicity" conditions: RafJx & X [;;; X'
The only problem is that we seem to have two canonical relations instead of one. But it is easy to see that in fact the two relations coincide. Thus suppose R-+Cj C2 C 3. In order to show R+-CIC2C3 we suppose b ~ a E Cl and a E C2 and show b E C3. But in virtue of a E C2 and pseudo-assertion, we obtain (b +- a) --+ b E C2. But our hypothesis
281
implies RafJx';
Ra fJ X & a' [;;; a implies Ra' fJ X; RafJx & fJ' [;;; fJ implies RafJ' X·
We shall say that a subset of U is hereditary when it satisfies the following:
282
REPRESENTATION THEOREMS
If a
E
fJ then fJ
A & a !:
E
PARTIALLY ORDERED SETS WITH IMPLICATION(S)
A.
(3) e'
Let ~ t (U) be the class of all hereditary subsets. It is easy to see that ~ t (U) is closed under the concrete operations defined above (that the resultant sets are hereditary follows from the "monotonicity" conditions), and so we get another concrete example of a residuated p.o. groupoid, which we shall call an articulated concrete residuated p.o. groupoid. It is easy to see that but a slight modification of the canonical frame of all cones produces an articulated frame. Again U is the set of all cones of S, and Ro, R--+, and R+- are defined as before and are all equivalent (we shall thus denote them ambiguously as R). The difference is that we add as well the natural partial order!: on cones, i.e., set inclusion restricted to U. Thus we get an articulated frame (U,~, R). We verify that the "monotonicity" conditions above hold when R = R o . For the first, let us suppose that RoC] C2C3 & C3 ~ C~, and show R oC1 C2C~, But this is obvious, since our first supposition says that the hypotheses x E C1 and Y E C2 imply x 0 Y E C3, and so a fortiori these same hypotheses imply x 0 Y E C~ (since it includes C3). For the second "isotonicity" condition, suppose R oC1 C2C3 & ~ C1. We need to show that RoCi C2C3. So assume x E and Y E C2. Then afortiori x E C], and since RoC] C2C3 we conclude x 0 Y E C3 as needed. A similar argument takes care of the third "monotonicity" condition. We still have to show that h canies each element a E S into a hereditary subset of the cones of S. But h(a) = {C : a E C} is clearly hereditary, for if a E C and C ~ C' then a E C'.
Ci
Ci
We now consider more explicitly what happens when the residuated p.o. groupoid has a "right identity element" e satisfying a
0
e
= a.
It is the presence of this element that allows us to consider the algebraic analog of
"theorems." Thus by residuation we get e::; a -+ a,
which can be interpreted as saying that "a implies a" is a theorem. The element e can be interpreted as "empty" or "void," but it should not be interpreted as "nothing." It is better to think of it as the conjunction of all theorems. In general, the presence of e allows us to "push the metalinguistic deducibility relation into the object language," giving us for each "deducibility relation" a ::; b a corresponding "object language theorem" a -+ b ~ e. Thus a ::; b iff a
0
e ::; b iff e::; a -+ b.
We shall talk of "pushing" and "popping" for the left and right directions of this equivalence, respectively. Note incidentally, that all we need for pushing is that e is a lower right identity, i.e., a 0 e ::; a. Note that if we assume a left identity e' with the usual law
0
a= a
283
(left identity),
as well as a right identity it can be shown that e = e', since e = e' 0 e = e'. This is pleasant because we do not have two classes of "theorems" (the principal cone of e and also that of e'). But it is not absolutely unavoidable since for many purposes (including the most general representation theorem, as well as dealing with some subtleties in application to certain weak modal logics) we might have just "lower right and left identities," a 0 e ::; a, e' 0 a ::; a, and the uniqueness proof would no longer go through. (As we mentioned, all we need for "pushing" is that e is a lower right identity, but the assumption of full equality is far from unusual, and certainly simplifies things.) Remark 8.1.6 Amplifying the above, recall that our proof above of "push and pop" for the right residual depends on the axiom for the right identity being stated with an equality and not just the inequality, and, of course, similarly for the left residual and the left identity. Indeed, from push and pop for the right residual we can infer that e is a full right identity, and similarly for the left residual and e'. Showing just the first we start with e ::; a -+ (a 0 e) (fusing), and by "popping" we obtain a ::; a 0 e. But we get a 0 e ::; a by residuation from e ::; a -+ a, which comes from a ::; a by "pushing." Again we return to the topic of "residuals without residuation." By an assertional implicational poset we shall mean a structure (S,::;, -+, t), where (S,::;, -+) is an implicational po set and t (thought of as the "conjunction of all logical implications") satisfies a::; b iff
t::;
a -+ b
(push and pop).
The element t functions in the role of the right identity element in a right residuated p.o. groupoid, except for the embarrassment of there being no fusion operator present for it to be the right identity of. But it still allows us to characterize "theorems" ("assertions") by the relationship t ::; a. We next examine what happens when we have both residuals without residuation. By an assertional double implicational poset we shall mean a structure (S,::;, -+, +-, t), where each of (S, ::;, -+, t) and (S,::;, +-, t) is an implicational po set. Note that it is easy to prove that if we had not required the same t, but instead allowed that we have t' in place of t in the second implicational poset, then we could prove t = t' (and thus we have lost no generality by building uniqueness into our definition). Thus we choose to show that t ::; t'. To start, we derive from t' ::; t', by "push" for t, that t ::; t' -+ t'. But next we shall derive t' -+ t' ::; t' (and so t ::; t' results from transitivity). By pseudo-assertion we obtain t' ::; t' +- (t' -+ t'), and by "pop" for t' we obtain t' -+ t' ::; t'. The thing that glues t and t' together is that they "push and pop" the same partial ordering ::;. Remark 8.1.7 "Push" can be easily seen to be equivalent to the following inequality: (4) e::; a -+ a
(self-implication thesis).
This inequality follows immediately from "push" using a ::; a. Conversely, we obtain push by the following sequence of moves:
284
1. 2. 3. 4.
REPRESENTATION THEOREMS
PARTIALLY ORDERED SETS WITH IMPLICATION(S)
a ~ b
(hypothesis) a ...... a ~ a ...... b (1, rule prefixing) e ~ a ...... a (self-implication thesis) e ~ a ...... b (2, 3, transitivity).
It would be nice to find some inequality equivalent to "pop," but the best we have been able to come up with assumes either rule permutation or the presence of the other (left) residual (with the equivalent properties of rule pseudo-permutation or of pseudoassertion). Let us work through the case where we have the addition of the left residual (the case where we have just the right residual but have rule permutation is just a special case, since rule permutation gives that the right residual is also the left residual). In this case one can postulate (5) a +- e ~ a
Proof Recall that in the canonical frame RCI C2C3 iff 'ix, y, if x E CI and x ...... y E C2, then y E C3. Let us suppose that C b C', i.e., 3Ct E T, RCCtC', that is, 'ix, y, if x E C & x ...... Y E C t , then y E C'. Thus for any x E C, since t E C t and t ~ x ...... x, then x ...... x E Ct. And so we have x E C', and so C ~ C'. Conversely, if C ~ C', we show RC[ t )C', i.e., 'ix, y, if x E C & x ...... Y E [t), then y E C'. So we suppose x E C & x ...... Y E [t). This last means t ~ x ...... y, and so by "pop" x ~ y. But since x E C (which is a cone), then y E C ~ C', and so y E C' as needed. D
The above suggests that the way to define a concrete assertional implicational implicational poset is to take a structure (U, b, R, T), where (U, b, R) is an articulated frame (as defined above), T ~ U, and for a, fJ E U, a b fJ iff 3r E T, RarfJ. We then consider the hereditary subsets ~t (U), and on these define ...... , +-, 0 as above. We must close T itself upward under b. Calling the result Z, we have
(specialized pseudo-assertion).
Z~A
From this we derive "pop" as follows: 1. e
~
2. a 3. a
~ ~
~
2. a
+-
Z
(a +- e) ...... a
e
~
a
(pseudo-assertion) (1, pop).
The dual form of specialized pseudo-assertion e' ...... a to "pop" for the left-residual.
~ a
would similarly be equivalent
Representations of assertional implicational posets The element t causes more problems that one might have thought. We shall first work through the representation when we have only one implication ...... , and then work through the case of assertional double implicational posets. Then, we shall add fusion to obtain a full assertional residuated p.o. groupoid. The issue is how to verify "push and pop." As we saw above, t
~
...... A.
Our definition of a concrete assertional implicational poset is not yet complete, for we still need a natural condition that ensures "pop":
a ...... b (hypothesis) b +- e (1, rule pseudo-permutation) b (2, specialized pseudo-assertion, transitivity).
Conversely, from "pop" we can derive specialized pseudo-assertion: 1. e
285
a ...... a
is equivalent to "push," and so we shall start there. In the representation, t and a will be assigned certain sets T, A ~ U, and so we need T ~ A ...... A, i.e.,
A ...... B implies A ~ B.
Thus suppose that Z ~ A ...... B and a E A. Clearly what is needed to ensure that a E B is that 3r E Z, Ram. But this is just to require that b be reflexive. But this is built into our definition, since we required that (U, b, R) be an articulated frame, which then requires b to be a partial order. There is a lot more required of an articulated frame that is strictly needed, but all of this information can be read off of the canonical frame and so there is, certainly, no harm in requiring it. Thus, not only are transitivity and antisymmetry not strictly needed, but not all of the "monotonicity" conditions on Rare needed. Indeed, we strictly need only the "monotonicity" condition that Rxyz & y' ~ y' implies Rxy' z, which guarantees that A ...... B is hereditary given that both A and Bare. However, it is convenient to require all of the "monotonicity" conditions (they clearly hold in the canonical frame) just when we have the operations +- and 0 around, since the other conditions are needed to verify that hereditary subsets are closed under these operations. Thus, let us now consider the case of an assertional double implicational poset (S,~, ...... , +-, t). Unfortunately, since we also want Z ~ A +- A, i.e., forallr E T, a E A, RmfJ impliesfJ E A,
for rET, a E A, RarfJ implies fJ E A. This suggests that we define a relation a b fJ iff 3r E T, RarfJ, and that we work with only the subsets of U that are hereditary with respect to b. In the canonical frame, T = {C : t E C}, and in particular [t) E T. Moreover, in the canonical frame b is just inclusion between sets: Lemma 8.1.8 (Inclusion lemma) In the canonical frame, C b C' (ffC
~
~
C'.
this last suggests that we also define a relation a b' fJ iff 3r E T, RrafJ, and that we work with the subsets of U that are hereditary with respect to b' (as well as of course b). Things are getting a bit complicated, but the good news is that we can show that in the canonical frame band b' are in fact identical. We already know from the Lemma 8.1.8 that b is just ~ on cones. The argument can be run through symmetrically to obtain that b' is also just ~ on cones, and so b = b'.
SEMI-LATTICES
REPRESENTATION THEOREMS
286
The above suggests that the way to define a concrete assertional double implicationa I poset is to take a structure (U, 1:, R, T), where (U, 1:, R) is an articulated frame (as defined above), T ~ U, and require for a, f3 E U, a I: f3 iff 3r
E
T, Rarf3 iff 3r
E
T, Rmf3.
We then consider the set U, define Z as the closure of T under 1:, and define -+ and +on the hereditary subsets as above. For a concrete assertional residuated p.o. groupoid, we proceed as above, but this time we want to have 0 as well as -+ and +-, and we need to verify Z
A
0
= A,
A
Z
0
= A.
Again we use an articulated frame, consider the hereditary subsets g-J t (U), and on these define -+, +-, 0 as above. Again we close T itself upward under 1:, calling the result Z. Canonical frames are defined as before. The term "correspondence" was coined by van Benthem to label the fact that in modal logic certain conditions on the accessibility relation correspond to particular modal laws. The same phenomenon is alive here, in that postulating a certain property of the ternary relation R of a frame forces a certain inequality to hold, and vice versa. In Dunn (1986) one can find a list of such correspondences derived from works of Routley and Meyer, but properties of the relation R are linked to implicational theses. One needs to further link the implicational theses to properties of residuation (as we did above) to obtain correspondences to properties of fusion. Thus, to illustrate with the simplest example, commutativity of 0 (a 0 b = boa) corresponds to the commutativity of R in its first two positions (Raf3 X implies Rf3ax), and Routley and Meyer link this last to assertion. It is easy to see that requiring R to be commutative in its first two places forces 0 as defined on a frame to be commutative, since A
0
B
= {X:
3a
E
A,f3
E
B,Raf3x}
= {X:
3f3
E
B,a
E
A,Rf3ax}
= BoA.
Conversely, if 0 is commutative, then in the canonical frame Ra f3 X implies Rf3a X. (In particular, if we have cones in the canonical frame as we had so far then RCI C2 C 3 implies RC2CI C3.) For if a E a and b E f3, then (since Raf3 X) we have a 0 b E X, and so by commutativity of 0 we have that boa E X. Other correspondences are not quite so straightforward. It is possible to develop a framework where one does not have "pop," but only "push," though things get a bit hard to keep track of since then one has both t and t', I: and 1:', and one must look at sets that are hereditary simultaneously under both relations. When fusion is present this is equivalent to having just e
0
a ::; a,
but not the converse of these relations.
a
0
e' ::; a,
287
There are logics where such a generalization is useful, e.g., "ticket entailment" -+ a ::; a, and "non-alethic" modal logics, lacking Oa ::; a. The relationship between implication and residuation has been known for years (cf. Birkhoff 1940). And one can find many correspondences between laws for fusion and laws for residuation by browsing through, say, Birkhoff's Lattice Theory (especially the 1967 edition), Certaine (1943), Fuchs (1963), Dunn (1966) (relevant portions of which are reprinted in Anderson and Belnap 1975), etc. Perhaps the single best source is Meyer and Routley (1972), who provide a long list of such correspondences. But since they assume fusion is commutative (as it is in most usual logics), they have no distinction between the left and right residual. But the reader should be aware that there are applications where one would not want this assumption, e.g., the Lambek calculus and "quantales" (cf. Vickers 1989). More recent work connecting residuation and implication is that of Dosen (1988, 1989b), which has many affinities with the material presented here. The representation results above are modeled after those of Meyer and Routley (1972), which in tum are based on their semantics of relevance logic. We used cones whereas they use prime filters because they are assuming an underlying distributive lattice. (cf. Anderson and Belnap 1975), which lacks t
8.2
Semi-lattices
Theorem 8.2.1 (Representation for semi-lattices) If (A, /\) is a semi-lattice, then h : A -+ g-J(A) such that h(a) = {x E A : x ::; a} is an isomorphism of A into (g-J(A), n), i.e., every semi-lattice is isomorphic to a semi-lattice of sets. Proof. Note that unlike in the previous section h maps here each element a of the algebra into the principal dual cone generated by this element. We show that h(a/\b) = h(a)nh(b). Assume that r E h(a/\b). Then by the definition of h, r ::; a/\b. But then r ::; a and r ::; b (since a/\b ::; a, b). Then r E h(a) and r E h(b), and so r E h(a) n h(b) as desired. For the converse, note that /\ is glb, and so all the steps are reversible. Also, h is one-one, since each element generates a unique principal dual cone. (In other words, if a b then there is an element a E A, such that a ¢. h(b) (but, of course, a E h(a». 0
i
We cannot hope to simply extend the above result to lattices, that is, we cannot have every lattice isomorphic to a lattice of sets where /\ is n and V is U. (Such a lattice of sets is called a ring of sets.) This is because a ring of sets is such that the distributive laws hold:
= (a /\ b) V (a /\ c); a V (b /\ c) = (a V b) /\ (a V c).
(DLl) a /\ (b V c)
(DL2)
And not every lattice is distributive. Figure 8.1 contains a counter-example.
288
a
b
c
o FIG. 8.1. A non-distributive lattice
8.3
Lattices
The question arises as to how to extend the previous section's representation of semilattices in terms of sets to full lattices. This turns out to be more complicated and/or less intuitive than one might naively think. The problem, put quickly, is that while the meet of a lattice is nicely interpreted as intersection, there is no corresponding way to simultaneously understand lattice join which does not over-interpret it. The most natural way to think of join is as union, but since intersection distributes over union this restricts us to distributive lattices. As we shall see in the next sections, this is in fact the way that Stone represented distributive lattices, but what can be done for lattices generally? We talk above of the simultaneous understanding of meet and join, because there is no problem in interpreting each separately. Indeed, they can both be interpreted as intersection, since there is no intrinsic difference between a meet-semi-lattice and a join-semi-lattice. The representation of Theorem 8.2.1 applies just as well to (A, v) as it does to (A, 1\). The problem arises when we simultaneously attempt to represent a lattice (L, 1\, v). We cannot have 1\ and veach at the same time interpreted simply as n. In Section 13.5 we will briefly discuss two other representations of lattices, one by Urquhart, the other by Hartonas and Dunn. Both of these representations may be regarded as building on the idea that a lattice is somehow the "gluing together" of its meet-semi-lattice and its join-semi-lattice, and that the semi-lattice representations can be similarly glued together to get a representation of the whole lattice. But those representations need further ideas. For the moment we content ourselves with a much simpler representation of lattices due to Dosen, and suggested to him by an earlier representation result of Birkhoff and Frink (1948). We shall say more about the relationship between these results below. The central notion used by Birkhoff and Frink is that of a join-irreducible filter, which is a proper filter that is not an intersection of two other filters. We define in the next section the notion of a join-irreducible element of an algebra; it will be easy to see then
289
LATTICES
REPRESENTATION THEOREMS
that the principal filter generated by a join-irreducible element is an example of a joinirreducible filter. (Join-irreducible filters generalize prime filters - see Definition 3.18.3 - which are used in a representation of distributive lattices.) Mapping each element of the algebra into the set of joint-irreducible filters containing this element gives rise to a lattice. Birkhoff and Frink show that this lattice is isomorphic to the original one, with meet being intersection, and therefore called by them a "meet-representation." Note that they do not provide an explicit definition of the join operation. This becomes important in the comparison to Dosen. To make our presentation a bit more formal we prove here a separation principle for join-irreducible filters similar to the separation principle we proved in the previous section representing semi-lattices.
Lemma 8.3.1 (Join-irreducible filter separation principle) Let L be a lattice, and let a b. There is a join-irreducible filter J containing a but not b.
i
Proof. The principal filter generated by a certainly contains a, but b rf. [a). Now define a set of filters E as E = {F : a E F & b rf. F}. Due to the previous observation, [a) E E. It is routine to show that the union of any non-empty chain of E is a filter, and enjoys the defining property of E, hence it is a member of E. Using Zorn's lemma E has a maximal element, say, J. That is, a E J, b rf. J, and J is maximal with respect to these properties. We still have to show that J is join-irreducible. Assume to the contrary that J is join-reducible, that is, for some Fl and F2 (J i= Fl i= F2 i= J) J = Fl n F2. Then, J is a proper subset of both of these filters (J C Fl, J C F2). Were b E Fl and also b E F2, then one would have that b E Fl n F2 contradicting b rf. J. Thus, let us assume that b rf. Fl (the argument with the assumption b rf. F2 is similar). Since J C Fl there is acE Fl which is not in J. a, c E Fl implies a 1\ c E Fl, since Fl is a filter. Furthermore, a 1\ c b, because b rf. Fl. Clearly this contradicts J's maximality. Thus, J is join-irreducible. 0
i
Theorem 8.3.2 (Meet representation of lattices) Every lattice regarded just as a meet-semi-Iattice is isomOlphic to a meet-semi-lattice of sets (with intersection as the meet). Proof. The crucial insight to this proof is to use sets of join-irreducible filters. Define the map h as h(a) = {F: F is join-irreducible and a E F}.
To show that h is a homomorphism for 1\ assume F E h(a 1\ b). The following series of iffs is justified by the definition of h, by filter properties, and by the common understanding of set intersection: FE h(al\b) iff al\b E F iff a E F&b E F iff FE h(a)&F E h(b) iff FE h(a)nh(b).
The map is one-one, and hence an isomorphism, by the previous lemma.
o
This representation treats a lattice as an ordered algebra with intersection being meet, but without an operation corresponding to join. The set of filters, or of principal
REPRESENTATION THEOREMS
290
LATTICES
filters, or of join-irreducible filters (filters which are not the intersection of two other filters) is suitable for this representation. The canonical map h in each case assigns to each element of the lattice the set of filters (of the particular kind) that contain this element. Ono and Komori (1985), and independently Dosen (1989b), provided Kripke-style semantics for substructural logics with non-distributive conjuction and disjunction. I Dosen references the Birkhoff-Frink representation as his source, but he defines ajoin operation as they do not. As we shall see below, Dosen is overly generous in his citation because one cannot extend the Birkhoff and Frink result to represent join in addition to meet. We have recently found out from Dosen that he explicitly proved the representation of lattices that we give below, and it is forthcoming in as a paper in a volume edited by A. Krapez and published by the Mathematical Institute of Belgrade: A Tribute to S.B. Presic: Papers Celebrating His 65th Birthday. In this paper too he attributed what is really his result to Birkhoff and Frink. Ono, Komori, and Dosen give a Kripke-style semantics for substructural logics based on a semi-lattice-ordered monoid, in which conjuction has the usual truth condition x F cP A lfJ iff X F cP and X F lfJ, and in which disjunction has the seemingly unusual truth condition: X
F cP V lfJ
iff :la, 13, [a n 13 :::; X, and (a
F cP or a F lfJ)
and (13
F cP or 13 F lfJ)].
For our immediate purposes there is no need for the monoid (since it is used to define implication and fusion), and so we consider that we have just a meet-semi-Iattice. Things get a bit confusing as to level, since we are representing a lattice using a semi-lattice. To keep the levels clear we shall introduce separate notations and a philosophical reading. We shall denote a semi-lattice which is a "Kripke frame" as (U, b, n). We think of the elements of U as information states, and denote them by lower-case Greek letters a, 13, X. a b 13 says that 13 contains at least the information of a. So far all is familiar from Section 8.1. What is new is the operation an 13, which is something like the intersection of the two pieces of information. 2 With the operation n, we can in fact define a b 13 in the usual way as an 13 = a. It is familiar from intuitionistic logic that a "proposition" A is not just any arbitrary subset of U, but rather one that satisfies the "hereditary condition," which says: (her) If a E A and a b 13, then 13 E A. This fits with the idea that b is the information order. Ono and Komori, and Dosen require this in the form: If a I Cf.
F cP and a
b 13, then 13
F cp.
also Ono (1993).
2Note well that this is not the "conjunction." The conjunction would contain all of the information in both a and {J, but this contains only the information that is common to the two. We want a n {J k a, {J.
291
They also require a kind of "downward" hereditary condition: If a
F cP and 13 F cP, then a n 13 F cp.
The "algebraic form" of this is: (cl) If a
E A
and 13
E A,
then a n 13
E A.
In standard terminology of lattice theory, (her) and (cl) together are the definition of A being afilter (with the small, technical difference that we allow A to be empty in this section, whereas in our Definition 3.18.2 we excluded the "empty filter"). The upward hereditary condition is much more intuitive than the downward one. Let a and 13 be as independent pieces of information as one can imagine. Then one might think that a n 13 would be empty. Suppose now that a and 13 both make a proposition C true. How can this be since a and 13 are supposed to have so little in common? Well, perhaps C is the disjunction of two propositions A and B, with a making A true and 13 making B true. Why should a n 13 make C true? The answer is, we guess, that a and 13 have more commonality that we might have thought. They at least have in common the (disjunctive) information in C. We are not totally satisfied with this answer, but it is at least a story. There are probably others that do not have closure under n. Returning to the mathematics, we let A, B, C etc. range over filters of U. We think of these as "propositions." A A B = An B, A V B = {X : :la, 13 such that a n 13 b X and a E A u Band 13 E A u B}. The latter corresponds to the "unusual" definition of disjunction, but in fact it is not so unusual. Those familiar with lattice theory will recognize it as just the filter generated by A u B. We observe that this gives a lattice with the lattice ordering being ~. Since it is well known that n gives glbs, it suffices to show that A V B is the lLib of A,B. We first show that A ~ A V B. Suppose that X E A. Set a = 13 = X and the result is clear. The argument that B ~ A V B is symmetric. So all that remains is to show that A V B is the least among upper bounds. Let C 2 A, B. We must show that A V B ~ C. Thus suppose that X E A V B, i.e., :la, 13 such that an 13 b X and a E Au Band 13 E Au B. Clearly, Au B ~ C, and clearly a, 13 E Au B, and hence a, 13 E C. Since C is a filter, then a n 13 E C, and so X E C. Fact 8.3.3 A u B
Proof If x
E
~
A vB.
Au B, then set a = 13 =
x.
Fact 8.3.4 It is not always the case that A V B
D ~
Au B.
Proof Follows from the fact we can represent non-distributive lattices. Here is a "concrete" counter-example, which also proves the claim. D Example 8.3.5 Consider the free meet-semi-lattice generated by {a, b, c} as a frame, b being:::; and n being A. Take A to be [a A b), that is, the principal filter generated by a A b, and B to be [b A c). Clearly, a A c, a A b A c ¢ Au B, but A V B is the whole carrier set. (The minimal (but perhaps less revealing) example can be obtained over {a, b} in a similar way.)
REPRESENTATION THEOREMS
292
We shall denote an abstract lattice which we are trying to represent by the notation (L, /\, v), and denote its elements by a, b, c. We let :S denote its partial order (definable in the usual way as a :S b iff a /\ b = a).
Theorem 8.3.6 Every lattice (L, /\, V) is embeddable in the lattice offilters of a semilattice. Proof. Let (L, /\, V) be an arbitrary lattice, and let F, G, H range over the filters of L. The key to the proof is that the collection of filters of L forms a semi-lattice in a natural way. Define the canonical frame (U L, !:LnL) so that UL is the set of filters of L, and !:L is just inclusion among these filters, and nL is intersection of these filters. We define h(a)
FINITE DISTRIBUTIVE LATTICES
8.4
Finite Distributive Lattices
Let us try now to extend the result for semi-lattices from Section 8.2 to distributive lattices (lattices in which the distributive laws hold). But even if (A, /\, V) is a distributive lattice, h : A -+ \{.:i(A) such that h(a) = {x E A : x :S a} is not necessarily a homomorphism of A into (\{.:i(A), n , U). Figure 8.2 contains a counter-example in which h(a V b) =f. h(a) U h(b). The point is that 1 :S a V b, yet 1 a and 1 b, so 1 E h(a V b), yet 1 ¢ h(a) and 1 ¢ h(b).
i
i
{1,a,b,O}
= {F : a E F}.
We first show that h(a) is a proposition. This is clear, since if a E F and F ~ G, then a E G; and if a E F and a E G, then a E F n G. We show that h is an isomorphism. That h is one-one and preserves /\ is familiar. So it suffices to show that h preserves V, i.e.,
a
F E h(a V b) ¢}
3G, H[G
nH
~
¢}
Going from right to left, there are four cases: (1) a E G, a E H; (2) a E G, b E H; (3) bEG, a E H; (4) bEG, b E H. In each case we have a V bEG n H since a :S a V b and b :S a V b. But since G n H ~ F, then a V b E F as required. Now going from left to right, suppose a V b E F. Set G = [a) (the principal filter determined by a) and set H = [b). Then G n H ~ F, for if x E G n H then a :S x and b :S x. It follows that a V b :S x and so x E F. It is immediate that a E G, and so a E G or bEG. Similarly, it is immediate that bE H, and so a E H or bE H. D The reader is surely asking by now whether we can prove the representation above by using join-irreducible filters in place of arbitrary filters. An ingenious counter-example was produced by Katalin BimbO, and it has been pointed out by Tatsutoshi Takenaka that the essential feature of the counter-example can be found in the lattice of Figure 3.17, as can be seen by the following exercise.
Exercise 8.3.7 Consider the lattice of Figure 3.17. For an element x, let h(x) be the set of join-irreducible filters that contain x. Consider the join-irreducible filter [d). a- vb - E [d) and so it is required that there exist join-irreducible filters G and H with a- E G, b- E H such that G n H ~ [d). Show that this is not the case. It turns out that one can extend the representation of lattices as given by Theorem 8.3.6 to lattice-ordered residuated groupoids of various kinds, interpreting the additional operators 0, -+, and +- by way of a ternary accessibility relation R ~ U 3 as in Section 8.1. This representation theorem is implicit in the Kripke semantics of substructural logics as developed by Ono and Komori (1985) and Dosen (1989b). See also Ono (1993). We do not pursue this here.
{b,O}
{O} FIG. 8.2. Counter-example
F E h(a) V h(b),
F & (a E G or bEG) & (a E H or bE H)].
{a,O}
b
o
h(a V b) = h(a) V h(b),
a V bE F
293
Now what if we send a E A into just those elements x :S a such that for all elements y and z, x :S y V z 9 x :S y or x :S z? We call elements satisfying this property joinprime (or just prime when there is no confusion with meet-primeness, defined dually). The terminology is motivated by the fact that if we consider the lattice (:~~+, /\, V), where Z+ is the set of positive integers, m /\ n is the greatest common divisor of m and n, and m V n is the least common multiple of m and n (then m :S n iff min), then p is join-prime iff it is prime in the usual sense. Every element of a chain is prime. So we define h : A -+ \{.:i(A) so that h(a) = {p E A I p:S a and p is prime}. Clearly h is a homomorphism of A into (\{.:i(A), n, U). We already know from the representation theorems for semi-lattices that h preserves /\. And we know from the way we defined primeness that h preserves V. The question is whether h is one-one. Clearly what we need is the following. Separation principle: Va, bE A, if a b, then 3p E A such that p is prime, p :S a, and pi b. We next investigate when the separation principle holds. We first study prime elements. An element a is join-reducible (or just reducible when there is no confusion with the dual notion) iff for some elements band c, a = b V c and yet a =f. b and a =f. c.
i
Exercise 8.4.1 (1) Show that every prime element is irreducible. (2) Show for a distributive lattice that every irreducible element is prime. (3) Find a non-distributive lattice with an irreducible element that is not prime.
GENERAL REPRESENTATION
REPRESENTATION THEOREMS
294
(4) Recall from the previous section that a filter is said to be join-reducible iff it is proper and is not the intersection of two other filters. Show that a principal filter is join-irreducible iff it is determined by a non-zero (join-)irreducible element.
Theorem 8.4.3 Every finite distributive lattice satisfies the separation principle. Proof The claim follows from the next lemma together with Exercise 8.4.1 (2).
Lemma 8.4.4 In any finite lattice, every (non-zero) element a = are all of the irreducible elements less than or equal to a.
Solution (1) Suppose a is prime, and a = b V c. Then a S b or a S c. But since b, c S a, a = b
or a = c. (2) Suppose a is irreducible. Suppose as b V c. Then a = a A (b V c) = (a A b) V (a A c). Then a = a A b or a = a A c, i.e., a S b or a S c. (3) In Figure 8.3 a is irreducible, but not prime since a S b V c, but a b and a c.
i
i
(4) Left safely to the reader.
295
0
Vi<11 Pi, where the PiS -
Proof If a is irreducible we are through. Otherwise a = al Vaz (a ~ aI, a ~ az). If al and a2 are irreducible, we are through. Otherwise, say al = all Va12 and az = a2l VaZ2. If aJ J, a12, aZJ, a22 are irreducible we are through, otherwise, etc. Now this process 0 cannot go on indefinitely, since the lattice is finite.
Exercise 8.4.5 (1) Remove the "etc." from the above proof. (2) Find a non-distributive lattice in which the separation principle does not hold.
1
Solution
a
b
(1) Let H be the set of elements that cannot be represented as above. If the lemma is false, 3a E H. Since A is finite, there must be a minimal such element a. a = b V c (a ~ b, a ~ c). Since a is minimal, b = ViSJ qi, where the qiS are the irreducible elements S b. And c = Vi
c
i
o FIG. 8.3. A non-distributive lattice with an irreducible, non-prime element
Observe that 0 (if it exists) is always prime (and irreducible), so there is not much point in saying that it is. We henceforth for the sake of convenience exclude 0 as prime (or irreducible). Notice that this does not interfere with the fact that h : A --+ \?J(A) defined above is a homomorphism of (A, A, V) into (\?J(A), n, u). {p E A : p S a V band
=
pisprime(andp~O)} {pEA:Psaandpisprime(andp~O)} U {pEA:PSb and p is prime (and p ~ O)}. Adding p ~ 0 just gives you more information. Further, the separation principle will be true in a lattice just when it is true with p
relativized to non-zero prime elements. Exercise 8.4.2 Establish the above. Solution Suppose the separation principle holds. Then suppose a thatp S a andp b. Thenp ~ O.
i
i
i
Theorem 8.4.3 can be extended to those distributive lattices satisfying the so-called minimum condition: there are no infinite descending chains of the lattice al > az > a3 > ... ai > ai+l ... , but it unfortunately fails generally. Counter-example: Let X be an infinite set. We define an equivalence relation on \?J(X) as follows: for x, y S;;; X, define x == y iff x and y differ in only in a finite number of elements, i.e., if (x - y) U (y - x) is finite. Let [x] = {y S;;; X : x == y}. Define [x]A[Y] = [xny], and [x]v[y] = [xuy]. These are well-defined (single-valued) operations (i.e., == is a congruence), and we obtain a distributive lattice. Now let x be an infinite set, and y a finite set. Then [x] [y]. There are no irreducible elements. [0] is the class of all finite sets, and if z is infinite, then z can be partitioned so that z = ZI U zz where ZI n zz = 0 and ZI and zz are both infinite. Hence [z] = [zIl V [zz] ([z] ~ [zIl, [z] ~ [Z2]), and hence [z] is reducible.
i
Exercise 8.4.6 Verify all details in the above example. b. Then 3p such
In fact, nothing substantial is affected by our consideration of just non-zero prime elements. "Zeros" are always nuisances in math, sometimes we want to include them, sometimes exclude them, and almost always our choices are dictated by prevailing fashions.
8.5
The Prohlem of a General Representation for Distributive Lattices
N ow we appear to be in the soup as far as a general representation for distributive lattices is concerned. But Stone (1937) had the idea of considering not just the elements of a distributive lattice, but also certain "ideal elements."
STONE'S REPRESENTATION THEOREM FOR DISTRIBUTIVE LATTICES
REPRESENTATION THEOREMS
296
297
We now set down the abstract properties which a set of elements P must have for Stone's argument. Stone's conditions are:
{a, b}
(1) x /\ yEP iff x E P and yEP.
a
h
b
--+
{b}
{a}
o
o
FIG. 8.4. A lattice and the result of the application of h In order to understand his method, we consider a prime element p of a distributive lattice A, and let [p) = {x E A: p S x}. We thereby establish a one-one correspondence between the prime elements p and the sets [P). If p = q, then obviously [P) = [q). And if [P) = [q), then since p S p, then p S q, and similarly q S p. Instead of considering the function h defined above (h(a) = {p E A : p S a and p is prime}), we can consider a function h' that associates with each a E A the class h' (a) of all the sets [p) where p is a prime element less than or equal to a. h' is one-one since hand [ ) were one-one. And h' preserves /\ and V since h preserved /\ and V. Consider the example in Figure 8.4. It shows a four-element distributive lattice with h applied to it. Figure 8.5 shows the application of h' to the same lattice. It might be thought that in passing from the prime elements p to the sets [p), nothing is gained. And straightforwardly this is so, because of the one-one correspondence noted above. But Stone saw that the demonstration that h' is a homomorphism could be made so as not to depend upon the primeness of the elements p determining the sets [p), but instead on certain abstract properties of the sets themselves. We should be reminded at this point of Dedekind' s famed completion of the rationals so as to obtain the reals. He identified real numbers with certain sets of rationals called "cuts," and although every rational determined a cut, not every cut was determined by a rational. Indeed, the so-called "upper cut" determined by a rational p is the set of rationals greater than or equal to p, and is often denoted by [P), making the analogy even more closely with what Stone did.
a
b
o
h' --+
{{b, I}}
{{ a, I} }
o
FIG. 8.5. The result of the application of h'
V
yEP iff x
E
P or yEP.
Recall «Fl+) from Section 3.18) that a (non-empty) set satisfying (1) is called a filter (or dual ideal). Recall also (Definition 3.18.3) that a (non-empty) set satisfying both (l) and (2) is called a prime filter. We also noted there that the intersection of a family F of filters (all of which contain some common member) is also a filter, and that for this reason we may speak of the filter generated by a (non-empty) set X, denoted by [X), and mean the least filter which includes the set. That is, [X) = {F : F is a filter and F :2 X}. (This filter need not be proper, it may be the lattice itself.) It is memorable to regard a filter as the set of provable propositions and a prime filter as the set of true propositions (where /\ is and, V is or). We could have dualized all of the above discussion, utilizing ideals. We can think of an ideal as the set of refutable propositions (those whose negations are provable), and a prime ideal as the set of false propositions. The lattice itself furnishes a trivial example of an ideal, but we adopt a convention similar to the one we have for filters so as to exclude it except when we indicate otherwise.
n
Exercise 8.5.1 P is a prime filter iff P is a prime ideal. Solution P is a prime filter iff P satisfies Stone's conditions: (1) x /\ yEP iff x E P and yEP; (2) x V yEP iff x E POI' yEP. Dually, P is a prime ideal iff P satisfies the duals of Stone's conditions: (1') x V yEP iff x E P and yEP;
(2') x /\ yEP iff x
E
P or yEP.
(2') is just the contrapositive of (1). And (1') is just the contrapositive of (2).
8.6
{ {a, 1}, {b, 1} }
1
(2) x
Stone's Representation Theorem for Distributive Lattices
By a ring of sets is meant a non-empty collection R of sets that is closed under (binary) intersection and union, i.e., for X, Y E R, X n Y E R and X U Y E R. The power set of all subsets of some set is an example. Obviously, every ring of sets is a distributive lattice. We have, conversely, the following theorem. Theorem 8.6.1 (Stone's representation for distributive lattices) EvelY distributive lattice is isol170/phic to a ring of sets. In order to establish Stone's representation we shall use Zorn's lemma to establish the following lemma:
REPRESENTATION THEOREMS
298
Lemma 8.6.2 (Prime filter separation principle) Let L be a distributive lattice with a, bEL such that a b. Then there exists a prime filter P of L such that a E P and
i
b
¢ P.
Proof Let L, a, and b be as in the hypothesis. Then [a) is a filter separating a from b. But it may not be prime. So the proof consists of extending [a) until it is prime, but in such a way so as not to capture b. We let E be the family of filters of L which have a as a member but not b. E is nonempty, since it contains [a). Now let C be any non-empty chain of E. Then U C E E. For clearly a E U C. And clearly, b ¢ U C. It remains only to show U C is a filter. So we suppose c, dE U C, but then there are F, F' E C such that c E F and d E F'. But either F ~ F' or F' ~ F, so either both c, d E F' or both c, d E F. But since both F and F' are filters, then either c 1\ d E F' or c 1\ d E F. But then in either case c 1\ d E U C. Next we suppose that c E U C and c ~ d. But then there is F E C such that c E F. And since F is a filter, d E F. So d E U C. We have just established the hypothesis for (Zorn's lemma), so we conclude that E has some maximal member P. Recalling our definition of E, we have that P is a filter such that a E P and b ¢ P. It remains to show that P is prime. So we suppose that for some c, dEL, c V d E P but c ¢ P. By Theorem 3.18.12 [P, c) = {y E L : (::Ix) such that x E P and x 1\ c ~ y}, and [P, d) = {y E L : (::Ix) such that x E P and x 1\ d ~ y}. Since [P, c) is a filter, [P, c) ¢ E, since P is maximal in E and [P, c) is a proper superset of P. Thus P ~ [P, c), for if YEP, then for some x E P (namely y), x 1\ c ~ y. And P :j:. [P, c), since c E [P, c), but by hypothesis c ¢ P. But since [P, c) contains a (since a E P), and is a filter, it can only fail to be in E by containing b. Similarly, we show that [P, d) must contain b. Then for some x E P, x 1\ c ~ b, and for some yEP, Y 1\ d ~ b. But then let z = x 1\ y, Z 1\ c ~ b, and z 1\ d ~ b. Then (z 1\ c) V (z 1\ d) ~ b. But since L is distributive, then z 1\ (c V d) ~ b. But Z E P, since P is a lattice, and c V d E P by hypothesis. So then z 1\ (c V d) E P (P a lattice), and then bE P (again, P a lattice). But this contradicts our proof that b ¢ P, so contrary to hyp.othesis, either c E P or dE P, and so P is plime. D
Exercise 8.6.3 Show that in a distributive lattice a filter is join-irreducible iff it is plime. Conclude that the prime filter separation principle (Lemma 8.6.2) is actually a special case of the join-irreducible separation principle (Lemma 8.3.1). Having established the prime filter separation principle, we proceed to the proof of Stone's representation theorem. Proof Let L be a distributive lattice. For a E L, define h(a) = {P : P is a prime filter of L and a E P}. The function h is obviously one-one in virtue of the prime filter separation principle. We next show that h is a homomorphism onto the ring of the sets h(a) (a E L). We recall that a prime filter P can be characterized by the following two conditions:
STONE'S REPRESENTATION THEOREM FOR DISTRIBUTIVE LATTICES
299
(I) a 1\ b E P iff a E P and b E P;
(2) a V b E P iff a E P or b E P.
ThenP E h(al\b) iff al\b E Piff(by 1) a E P and bE Piff P E h(a) and P E h(b) iff P E h(a) n h(b). So h(a 1\ b) = h(a) n h(b). And P E h(a V b) iff a V b E P iff (by 2) a E P or b E P iff P E h(a) or P E h(b) iff P E h(a) U h(b). So h(a V b) = h(a) U h(b). These two chains of "iffs" establish not only that h is a homomorphism, but also something that might be overlooked in the scuffle, namely, that the sets h(a) are actually closed under (binary) intersection and union. So Stone's representation is complete. D
For distributive lattices, there is a close connection between prime filters and homomorphisms into the two-element distributive lattice 2-they co-determine each other. Proposition 8.6.4 Let A be a distributive lattice and let P be a (proper) prime filter of A. For x E A, define h(x) = 1 if x E P, and hex) = 0 if x ¢ P. Then h is a homomorphism onto 2. Conversely, let h be a homomorphism of A onto 2. Define , P = {x E A : x = I}. Then P is a prime filte!: Proof We leave this as an exercise for the reader. For those who want to cheat, look ahead at the proof of the similar connections for Boolean algebras (Theorem 8.10.1). D
Stone's representation can be somewhat trivially rephrased as: Theorem 8.6.5 EvelY distributive lattice is isomOlphic to a direct product oflliEI 2 i . The trick is simply to realize that each element of XiEI2i is just a characteristic function determining a subset of J, and of course each subset determines a characteristic function. We shall go into this in more detail later with Boolean algebras. But this gives another route to the proof of Stone's representation for distributive lattices using Birkhoff's prime factorization theorem (Theorem 2.8.3). We show: Lemma 8.6.6 A (non-degenerate) distributive lattice A is subdirectly irreducible only if it is isomOlphic to 2. Proof Suppose, contrapositively, that A is not isomorphic to 2. Then there is an element a which is neither the top or the bottom of the lattice. We use a to define two congruences: X -=1'1 Y iff x x -=v y iff x
1\ a V
a
= y 1\ a; = y V a.
We can safely leave to the reader that these are congruences, but we note that distribution enters in when showing that -=1'1 preserves v, and -=v preserves 1\.3 These two congruences have as their intersection the equality relation restricted to A, which we denote by E. We know from Lemma 2.8.7 that this means that A is isomorphic to a subdirect product of A/=.A and A/.=v, and so A is not subdirectly irreducible. D 3Hence =A determines a congruence on a meet-semi-Iattice, and similarly for =v on ajoin-semi-Iattice.
REPRESENTATION THEOREMS
300
BOOLEAN ALGEBRAS
301
x = 0 V X = (x /\ x') V x =
Exercise 8.6.7 Show the converse of the above lemma.
argument, interchanging the roles ofx and x', we have (x V x) /\ (x'V x) = 1 /\ (x'V x) = x' V x, i.e., x' ::; X.
8.7
Complemented lattices do not in general have unique complements.
Boolean Algebras
Recall from Section 3.9 that a lattice A is said to be complemented iff'
E
In the example in Figure 8.6, both band c are complements of a. And in the example in Figure 8.7 both a and b are complements of c.
A, 3X' E A
(1) x /\ x' ::; y, (2) y::;XVX'.
Any complemented lattice A has a least element (x /\ x') and a greatest element (x Vx'), and, of course, these elements are unique. We denote the least element by "0" and the greatest element by "1." We define a Boolean algebra as a complemented distributive lattice. Some good references on Boolean algebras include Halmos (1963) and Sikorski (1964). The motivating' example of a Boolean algebra is a collection of subsets of some set S, where the collection is closed under (binary) intersection and union, and also under (relative) complementation. (So if Y is in the collection, so is S - Y, which we customarily denote by Y.) Such a (non-empty) collection of sets is called afield of sets. A field of sets is thus a ring of sets closed under complementation. Recall that Corollary 3.14.10 ensures that each element of a Boolean algebra has but one complement. In virtue of this we let x be the complement of x (which we shall sometimes denote by variants such as -x). From an operational point of view, we may define a Boolean algebra as a quadruple (B, /\, v, -), where (B, /\, V) is a distributive lattice (operationally defined), and where - is a unary operation on B satisfying the complementation laws:
a
o FIG. 8.6. A non-uniquely complemented lattice
a
c b
o FIG. 8.7. Another non-uniquely complemented lattice
(C1) (x /\ x) V Y = y; (C2) (x V x) /\ Y = y. Notice that (C1) and (C2) are dual to one another, so the duality principle for distributive lattices may be extended to Boolean algebras.
Exercise 8.7.1
(ii) (iii) (iv) (v)
(2)
(i)
ais the complemeEt ofa. But also a is the complement of a, since a/\a = 0 and aV a = 1. Hence a= a, since complements are unique.
(ii) We show that
(1) Verify that every element of a complemented distributive lattice has a unique complement. Is this true of complemented lattices in general? (2) Establish that the following laws are true of Boolean algebras:
=a
(double complementation); a /\ b = a V b, a V b = a /\ b (De Morgan laws); ::::; b ~ b ::; (antitonic law); 0 = 1, 1 = 0; a::; b iff vb = 1 a::; b iff a /\ b = O.
(i) a
c
a
a
Solution (1) Suppose x /\ x = 0 and x V x = I. Then x V x' = 1. x' = 0 V x' = (x /\ x) V x' = (x V x') /\ (x V x') = 1 /\ (x V x') = X V x', i.e., x ::; x'. Running through the same
aV bis the complement of a /\ b:
= (a /\ b /\ a) V (a /\ b /\ b) = 0 V 0 = 0; = (a Va V b) /\ (b V a V b) = 1/\ 1 = 1. By duality, we obtain a V b = a/\ b. (iii) a ::; b =? a /\ b = a =? a /\ b = a =? (by ii) a V b = a =? b V a = a =? b ::; a. (iv) Since 0/\1 = 0 and 0 V 1 = 1,0 = 1 (and]" = 0). (v) a ::; b =? (a /\ b) = a =? a vb = (a V b) V b = 1. Conversely, a Vb = 1 =? a = a /\ 1 = a /\ (a V b) ~ (a /\ a) V (a /\ b) = a /\ b. And a /\ b = a =? a ::; b. By (a /\ b) /\ (a V ~) (a /\ b) V (a V b)
duality, a ::; b ¢} a /\ b = O.
302
8.8
REPRESENTATION THEOREMS
STONE'S REPRESENTATION THEOREM FOR BOOLEAN ALGEBRAS
Filters and Homomorphisms
It is common in algebra to find special subsets of algebras determining congruences, e.g., normal subgroups in groups, ideals in rings, etc. A filter F of a Boolean algebra B determines a congruence =F on B as follows. Define a +-7 b = (Ci V b) /\ (b Va), and then define a F b iff a +-7 b E F. Furthermore, every congruence is determined by a filter F, defining F = {x E B : x I} = [1].
=
=
=
Exercise 8.8.1 Show that =F is a congruence on B. Show also that for every congruence on B, there is a filter F ofB such that for a, bE B, a b iff a =F b.
=
=
Proof Suppose M is maximal. Then M is prime, so since x E B, x vi = 1 EM, M contains at least one of x or i. Further, M cannot contain for some x E B both x and i, for then x /\ i = 0 EM, and M is not proper. If F is a filter such that F :J M, then 3x E F such that x ¢ M. But then i EM and 0 hence i E F, but then x /\ i = 0 E F, and hence F is not proper.
Exercise 8.9.3 (1) Show that the following statements are equivalent without using any form of the
Axiom of Choice. Note that for the purposes of the exercise, a Boolean algebra is a non-degenerate one (i.e., one with more than a single element), and a filter is a proper one (i.e., one not identical with the whole algebra). (Hint: For (iv) :::} (i), consider B/=.r) (i) For any Boolean algebra B, if F is a filter of B, then there is a maximal filter M ofB such that F ~ M. (ii) For any Boolean algebra B, for any a, bE B, if a i b, then there is a maximal filter M of B such that a E M and b ¢ M. (iii) For any Boolean algebra B, for any a E B, if a 1'= 0, then there is a maximal filter M of B such that a EM. (iv) For any Boolean algebra B, there is a maximal filter M of B.
In review of the universal connection between homomorphic images and congruences established in Chapter 2, we have that filters detelmine homomorphic images, and that any homomorphic image (or an isomorphic copy thereof) is determined by some filter. Instead of writing "B /='F'" we often write "B / F." 8.9
Maximal Filters and Prime Filters
A filter M is maximal in a Boolean algebra B iff (1 ) M is a proper filter of B; (2) there is no other proper filter F of B such that M c F. Note well that by definition, a maximal filter is a proper filter. We stick to this convention even when we find it convenient to lapse from our previous convention that by a "filter" we mean a proper filter. For Boolean algebras, we have the following identification of maximal filters and prime filters (but the implication from left to right holds also for distributive lattices). Theorem 8.9.1 M is prime.
If M is a (proper) filter of a Boolean algebra B,
then M is maximal iff
Proof Suppose a V b E M, but a ¢ M and b ¢ M. Consider [M, a) and [M, b). Both properly include M, so M will not be maximal if we can show either [M, a) 1'= B or [M, b) 1'= B. Suppose to the contrary. Then for x E B, 3m] E M such that m] /\ a ::::; x and 3m2 E M such that m2/\b ::::; x. But then (ml /\m2)/\a ::::; x and (ml /\m2)/\b ::::; x. But then ((ml /\m2)/\a)V((ml /\m2)/\b) ::::; x. But then, by distribution, (ml /\m2)/\(aV b) ::::; x, and hence x E M. So x E B, x E M, and hence M is not proper. (Note the similarity of this proof to the proof of the prime filter separation principle.) Suppose M is plime. But then x E B, x vi = 1 E M. So x E B, x E M or i E M. Suppose F is a filter and M C F. Then 3x E F such that x ¢ M. Then i E M, and hence i E F. But then x /\ i = 0 E F, and so F is not proper. 0
We also have the following handy characterization of maximal filters in Boolean algebras. Theorem 8.9.2 If M is a (proper) filter of a Boolean algebra B, then M is maximal iff for any x E B, M contains exactly one of x and i.
303
(2) Devise a proof of Stone's representation for distributive lattice for the special case of denumerable distributive lattices that does not use Zorn's lemma (or any other equivalent of the Axiom of Choice). (Hint: Prove the denumerable case of the prime filter separation principle by induction.) 8.10
Stone's Representation Theorem for Boolean Algebras
Remember that the motivating example for a Boolean algebra was a field of sets, which is a ring of subsets of some set X closed not only under (binary) intersection and union, but also under complementation (relative to X). Just as Stone proved that every distributive lattice is isomorphic to a ring of sets, so also he proved the following representation theorem in Stone (1936). Theorem 8.10.1 Every Boolean algebra is isomOlphic to a field of sets. Proof Let B be a Boolean algebra. By Stone's representation for distributive lattices, we know that the mapping h such that for a E B, h(a) = {P : a E P and P is a (proper) prime filter of B} is a one-one mapping preserving both /\ and V in the sense that h(a /\ b) = h(a) /\ h(b) and h(a V b) = h(a) V h(b). It only remains to show that where P is the set of all (proper) prime filters of B, h(a) = P - h(a). This follows immediately from the already mentioned identification of prime and maximal filters of a Boolean algebra, together with the also already mentioned fact that a maximal filter of a Boolean algebra contains exactly one of every element and its complement. Thus, where P is a (proper) prime filter, P
E
h(a) iff - a
E
P iff a ¢ P iff P ¢ h(a) iff PEP - h(a).
0
MAXIMAL FILTERS
REPRESENTATION THEOREMS
304
It is of some interest to see how we obtain a more trivial representation for finite Boolean algebras (and a certain generalization thereof), comparable to the representation we obtained for finite distributive lattices, where we mapped every element into the set of (non-zero) irreducible elements below it. We first introduce the following definition. An element a of a Boolean algebra B is an atom of B iff (1) a i= 0, and (2) x E B, x ~ a =? x = a or x = O. We say that a covers 0, meaning that a > 0 and that there is no element lying stlictly between a and
O. We have the following connection between atoms and irreducible elements of a Boolean algebra. Theorem 8.10.2 Let B be a Boolean algebra. Then an element a of B is an atom is a (non-zero) join-irreducible element.
iff a
Proof (1) If a is an atom, then clearly a is not join-reducible, for if a = b V c where b, c < a, then either 0 < b < a or 0 < c < a, and in either case a is not an atom. (2) Suppose a is non-zero join-irreducible ele.:nent and yet not an atom. The~ for some element b, a > b > O. a = (a /\ b) V (a /\ b) since a = t.:./\ 1 = a /\ (b V b) = (a /\ b) V (a /\ b). a is join-irreducible, thus a = a /\ b or a = a /\ b. However, a i= a /\ b, since a > b. Assume that a = a /\ b. Then b ~ a, and by transitivity b ~ b, which means b /\ b = b i= 0, a contradiction to B being a Boolean algebra. But then neither a = a /\ b, nor a = a /\ b, in contradiction to a being joint-irreducible, thus there is no such b, and therefore a is an atom. D
Theorem 8.10.3 If a is an atom of a Boolean algebra B, then for all x a
E
B, a ~ x
iff
ix.
x
Proof (1) If a ~ x and a ~ X, then a ~ x /\ = 0, and so a = 0 and is not an atom. (2) Suppose a and a ~ 1 = x V X. Since a is prime (by Theorem 8.10.2 and D Exercise 8.4.1 (2», a ~ x or a ~ X. Hence a ~ x.
i x
Since we already have for a finite distributive lattice that h(a) = {x: x ~ a and x is (non-zero and) irreducible} is an isomorphism onto a ling of sets, the above theorem gives us the following. Theorem 8.10.4 For a finite Boolean algebra B, if h is defined for a E B as h(a) = {a E B : a ~ a and a is an atom}, then h is an isomorphism of B onto a field of sets (which are subsets of the set of atoms). Proof By the previous theorem, a ¢ h(a) iff a ~ a iff a A - h(a) (where A is the set of atoms of B).
i
a iff a
i
h(a) iff a E D
The above theorem may be improved, once we notice that the mapping h is onto Let {al, ... , a,d ~ A. Consider al V ." V all' Clearly (al, ... , a,d ~ h(al V ... van). And heal V ... V all) ~ {al, ... , an}, for suppose a E h(al V ... Van), i.e., a ~ al V ... Vall' But then since a is prime, a ~ aI, or ... ,or a ~ an. But then a = aI, or ... , or a = a'b since if a < ai, then either a or ai are not atoms. So h is onto ~(A). This is captured by the following theorem. ~(A).
305
Theorem 8.10.5 (Finite representation theorem) Let B be a finite Boolean algebra. Then B is isomorphic to the field of all subsets of the set of atoms ofB. The question arises as to whether the general representation theorem for Boolean algebras can be improved so the field of sets in question is the field of all subsets of some set. Clearly if it can, then every Boolean algebra has a lot of atoms, since any unit set {x} would be an atom in the field. Indeed, every Boolean algebra B would be atomic in the sense that for every x E B, if x i= 0, then for some atom a E B, a ~ x. Also, clearly if the general representation theorem can be so improved, then every Boolean algebra is complete in the sense that for every X ~ B, the greatest lower bound of X is an element of B, and the least upper bound of X is an element of B. This is because arbitrary (not just finite) intersections performed on a power set give results in the power set, and similarly for arbitrary unions. In analogy with the finite operations, the greatest lower bound of X is denoted by "/\ X," and the least upper bound of X is denoted by "V x." But not every Boolean algebra is atomic, as may be seen by considering the same example (see Section 8.4) we used to show that not every distlibutive lattice satisfies the prime element separation principle. Remember Z was an infinite set, and for X, Y ~ Z we defined X == Y iff X and Y differ in at most a finite number of elements. Then where [X] = (Y ~ Z : Y == X} we defined [X] /\ [Y] = [X n Y], and [X] V [Y] = [X U Y]. We can also define [X] = [Z - X], and so obtain a Boolean algebra. (Verify that - is a well-defined operation and has the complementation properties.) Zj== has not a single atom, since it had no non-zero irreducible elements (such a Boolean algebra is called atomless). And not every Boolean algebra is complete. Let X be an infinite set, for the sake of concreteness the set of positive integers. Call a subset Y of X cofinite iff X - Y is finite. Let C be the collection of finite and cofinite subsets of X. C is a field of sets and hence a Boolean algebra. But C is not complete. For consider the unit sets of the odd positive integers: {I}, {3}, (5}, .... The union of these is the set of odd positive integers, which is not in C. 8.11
Maximal Filters and Two-Valued Homomorphisms
We let 2 be the two-element Boolean algebra. It is unique (except for isomorphic copies) and has the diagram depicted in Figure 8.8. We have the following connection between maximal filters of Boolean algebras, and homomorphisms onto 2. 1
I o
FIG. 8.8. 2
MAXIMAL FILTERS
REPRESENTATION THEOREMS
306
Theorem 8.11.1 Let M be a maximal filter of a Boolean algebra B. Then h : B -+ 2, defined so that for a E B, h(a) = 1 if a E M, is a homomorphism ofB onto 2. Proof (1) This may be verified directly: h(a 1\ b) = 1 iff a 1\ b E M, iff a, b E M, iff h(a) = 1 and h(b) = 1, iff h(a)l\h(b) = 1. From this it follows that h(al\b) = h(a)l\h(b). Similarly, h(a V b) = 1 iff a V b E M, iff a E M or b E M, iff h(a) = 1 or h(b) = 1, iff h(a) V h(b) = 1. Lastly, h( -a) = 1 iff -a E M iff a ¢. Miff h(a) = 0 iff h(a) = 1. (2) Or more elegantly, the result may be inferred from Exercise 8.8.1. Consider BIM, which is a homomorphic image of B under the natural homomorphism, h(a) = [a]. We next notice that if a E M, then a ~ 1 E M, i.e., Ca V 1) 1\ (0 V a) E M. And if a ¢. M, i.e., Ii E M, i.e., -a E M, then a ~ 0 E M, i.e., (Ii V 0) 1\ (1 Va) E M. Since either [a] = [1] or [a] = [0], BIM is isomorphic to 2, and if a E M, h(a) = [1], and if a¢. M, h(a) = [0]. D
Example 8.11.5 The elements of finite direct products of 2 may be regarded as finite sequences of 0 and 1. IIi<22i is shown in Figure 8.9, and IIi<32i is shown in Figure 8.10.
(1,1)
(1,0)
(0,1)
(0,0) FIG. 8.9. IIi<22i
Exercise 8.11.2 Prove, conversely to the theorem above, that if h is a homomorphism of a Boolean algebra B onto 2, then h- 1(1), i.e., {x E B : hex) = 1}, is a maximal filter ofB. The theorem and the exercise give us a natural one-one correspondence between maximal filters of a Boolean algebra and homomorphisms of the algebra onto 2. We may thus restate our prime filter separation principle in telms of separation by homomorphisms as follows.
(1,1,1)
(1,1,0)
(0, 1, 1)
(1,0,0)
(0,0,1)
Theorem 8.11.3 (Fundamental homomorphism) Let B be a Boolean algebra with a, bE B such that a b. Then there is a hOlll011101phism h ofB onto 2 so that h(a) = 1 and h(b) = 0, i.e., so that h(a) h(b).
i
307
i
We first recall some concepts of universal algebra from Chapter 2, but specialized to the case of Boolean algebras. A direct product of an indexed set of Boolean algebras (Bi )iEI is that algebra whose elements are the indexed sets (ai)iEI, ai E Bi, and whose operations are defined "componentwise" as follows: (ai)iEII\ (bi)iE[ (ai)iEI
V
= (ai 1\ bi)iEI,
(bi)iEI= (ai
(ai)iEI
V
bi)iEI,
= (ai)iEI.
Evidently, by the componentwise definitions of the operations, the direct product of Boolean algebras is again a Boolean algebra. The direct product of (Bi) iEI is written IIiEI Bi. The class of all Boolean algebras is thus closed under the processes of taking subalgebras, homomorphic images, and direct products. Any "equationally definable" class of algebras has this property. Birkhoff showed that conversely, any class of algebras closed under these three processes is equationally definable. Exercise 8.11.4 Verify that the direct product of an indexed set of Boolean algebras is a Boolean algebra.
(0,0,0) FIG. 8.10. Ili<32i
Exercise 8.11.6 Construct a diagram of IIi<4 2i. We may represent every finite Boolean algebra except the so-called degenerate, oneelement Boolean algebra, via a direct product of 2. And more generally, we have the following: Theorem 8.11.7 (Fundamental Embedding) Let B be a (non-degenerate) Boolean algebra. Then there is some direct product of 2, IIiEi 2 i, and some function h : B -+ IIiEi 2i, such that h is an isomorphism ofB into IIiEI 2 i.
REPRESENTATION THEOREMS
308
MAXIMAL FILTERS
Proof Let (hi )iEI be an indexed set of all homomorphisms ofB onto 2. This set is nonempty because of the fundamental homomorphism theorem. Now consider IIiEI2i. Define h : B -+ IIiEI2i so h(a) = (hi(a)iEI. h is obviously a function of B into IIiEI 2i , since each hi is a function ofB into 2. That h is a homomorphism follows from the facts that operations on IIiEI 2i were defined componentwise, and that each hi is a homomorphism onto 2. Thus:
a
o
h(a /\ b) = (hi(a /\ b)iEI = (Ma) /\ Mb)iEI = (Ma)iEI /\ (hi(b)iEI = h(a) /\ h(b); h(a V b)
309
FIG. 8.11. A Boolean algebra
= (hi (a V b»iEI = (hi (a) V Mb)iEI = (Ma»iEI V (hi(b)iEI = h(a) V h(b);
= (hi(a)iEI = (hi(a)iEI = (hi(a)iEI = h(a). h is one-one. For suppose a i b. Then assume without loss of generality h(a)
(1,1)
Finally, that a i b. Then by the fundamental homomorphism theorem there is a homomorphism hi of B onto 2 such that hi (a) i hi(b). But then h(a) differs from h(b) at the "ith" components. 0 We have thus "embedded" B into IIiEI2i, mapping B in an isomorphic fashion onto a certain sub algebra of IIiEI 2i, h(B). Notice that h(B) has stronger properties than a mere subalgebra, namely each of the elements 0 and 1 of 2 is such that Vi E I,3(ai)iEI E h(B) such that ai = 0 and 3(bi)iEI E h(B) such that bi = 1. This is because each hi maps onto 2. So for each i E I, 3a E B such that hi(a) = 0 and 3b E B such that hi(b) = 1. Then h(a) = (Ma)iEI E h(B) and h(b) = (Mb)iEJ E h(B). To put things roughly, neither of the elements 0 and 1 of 2 is "wasted" in the embedding. Each shows up someplace in h(B) at each component i. A subalgebra S of a direct product IIiEI Bi such that Vi E I, Vs E S, 3 (ai) iEI E S such that ai = s is called a subdirect product of the algebras (Bi)iEI (cf. Chapter 2). This gives the following theorem. Theorem 8.11.8 Every (non-degenerate) distributive lattice (in particulm; Boolean algebra) is isomorphic to a subdirect product of 2. Example 8.11.9 Consider the Boolean algebra in Figure 8.11. It has the following two homomorphisms onto 2 (since only [a) and La) are maximal filters): hi : 1 -+ 1 a -+ 1 a-+O 0-+0
h2 : 1 -+ 1 a-+O a-+I 0-+0
Then h(1) h(a) h(a) h(O)
= (hl(1),h2(1) = (hl(a),h2(a) = (hl(7i),h2(a) = (hI (0), h2(0)
= (1,1), = (1,0), = (0,1), = (0,0).
a
(1,0)
o
(0,0)
FIG. 8.12. h applied to the Boolean algebra in Figure 8.11 This situation is pictured in Figure 8.12. Notice, irrelevantly, that the contents of Figure 8.13 are subdirect products of the algebras in Figure 8.12. Exercise 8.11.10 Find all of the two-valued homomorphisms of the Boolean algebra diagrammed in Figure 8.14, and similarly embed it in a direct product of 2 according to the construction contained in the proof of the embedding theorem. Solutiou (1) [a), [b), and [c) are the maximal filters. Thus the three homomorphisms hi, h2 and
h3 are as follows: hl:I-+I a-+O b-+I c-+I a-+I b-+O c-+O 0-+0 h: 1-+ (1,1,1)
h2:1-+1 a -+ 1 b-+O c -+ 1 a-+O b-+l c-+O 0-+0 a-+(1,O,O)
h3: 1 -+1 a-+I b-+I c-+O a-+O b-+O c-+l 0-+0
310
MAXIMAL FILTERS
REPRESENTATION THEOREMS
(1,1)
1
MFP
I
I o
311
*4~
~ ET 3 FIG. 8.15. Implications Between (MFP), (RT), and (ET)
RT
(0,0)
E
FIG. 8.13. Subdirect products use of any other of the implications which have already been established. Thus establishing implication 2 by showing that implications 3 and 6 hold and thus getting 2 by transitivity of implication would not count as a "direct" proof. This requirement makes the project very redundant, but there is valuable conceptual information to be gained by each "direct" connection. Hints:
1
(2) Establish that if F is a field of sets, a X is a maximal filter of F.
U F, and X
= {x E F : a EX}, then
(3), (4) Remember characteristic functions. A characteristic function for a set Y ~ X may be regarded as a function f defined on X and taking values in the twoelement Boolean algebra 2, so that for x E X, if x E Y, then f(x) = 1, and if x ¢ Y, then f(x) = O.
o FIG.8.l4. A Boolean Algebra a -+ (0, 1, 1)
b-+(O,I,O)
b -+ (1,0,1)
C -+ (0,0,1) 0-+(0,0,0)
c-+(I,I,O)
E
(6) Let ITiEI2i be a direct product of 2. Show that if i 1 E J, then the set of elements of the direct product such that the i 1th component is 1 is a maximal filter of the direct product, that is, show that the set of elements (ai)iEI such that for i = iI, ai = 1, is a maximal filter of ITiEI 2i. Solution
Exercise 8.11.11 Consider the following three propositions: (I) Maximal filter principle (MFP): For any (non-degenerate) Boolean algebra B, for any a, b E B, if a i b, then there is a maximal filter M of B such that a E M and b ¢ M. (II) Representation theorem (RT): Any (non-degenerate) Boolean algebra is isomorphic to a field of sets. (III) Emhedding theorem (ET): Any (non-degenerate) Boolean algebra is isomorphic to a subdirect product of the two element Boolean algebra 2. Consider the implications between these propositions as depicted in Figure 8.15. In proving (RT) and (ET), we used (MFP), and have thus effectively established that implications 1 and 5 hold (that is, we have proven implications 1 and 5 without using any equivalent of the Axiom of Choice). The exercise is to establish directly that the remaining implications hold effectively (that is, hold in standard set theory without the Axiom of Choice). By "directly" we mean that each implication is to be established without the
(2) By (RT), B is isomorphic to some field of sets F. Identifying B with F, we see that fora,b E F, a ~ b =? 3x(x E aandx ¢ b). LetP = {c E F: x E c}. Pisafilter, for x E c and xEd iff x E c n d. Further, for every c E F, exactly one of x E c and x E c hold, so P is maximal. (6) Let B be identified by the isomorphism with a subdirect product of ITiEI 2i. For (ai)iEI, (bi)iEJ E B, (ai)iEJ i (bi)iEI then for some i1 E J, ail i bil' i.e., ail = 1 and bil = O. Let P = {(Ci )iEI E B : Cil = 1). P is a filter, for Cil 1\ dil = 1 iff Cil = 1 and dil = 1. Further, for every (q)iEI, either (Ci)iEJ E P or (Ci)iEI E P, for either Cil = 1 or ~ = 1. And not both (Ci)iEJ, (Ci)iEJ E P,for then both Cil = 1 and~ = 1. (3) Let B be identified with a field of sets F. Let J = U F. We map F into ITiEI 2i as follows: for a E F, let hi(a) = I if i E a, and otherwise let hi (a) = O. (Each hi is thus a characteristic function for each a E F.) We then let h(a) = (hi(a»iEJ. Then h is a homomorphism since h(a 1\ b) = (hi(a 1\ b»iEI = (hi(a) 1\ hi(b»iEJ = (hi(a»iEJ 1\ (hi(b»iEI = h(a) 1\ h(b), and similarly for V and -. h is one-one, for
DISTRIBUTIVE LATTICES WITH OPERATORS
REPRESENTATION THEOREMS
312
if a =J b, then either 3i such that i E a and i ¢ b, or vice versa. In the first case hi(a) = 1 and hi(b) = 0, and in the other case it is the other way around. Finally, h(b) is a subdirect product of IIiEI 2i, since each hi is onto 2. This is because if i E a, then i ¢ li. (4) Let B be identified with a subdirect product of IIiEI 2 i. For (ailiEI E B, let h( (ai liEI) = {i E I : ai = I}. h is the desired isomorphism. The equivalences above give us another route to Stone's representation for Boolean algebras. We can show the embedding theorem as an instance of Birkhoff's prime factorization theorem by showing:
Lemma 8.11.12 Every (non-degenerate) irreducible Boolean algebra is isomOlphic to the two-element Boolean algebra 2. Proof We modify the two congruences used in the similar result for distributive lattices (Lemma 8.6.6) by tacking on a clause guaranteed to see that complement is respected: x~Ay x~Vy
iff x /\ a = y /\ a & - x /\ a = -y /\ a; iff xVa=yVa & -xVa=-yVa.
Thus, for example, if x ~A y, then x /\ a = y /\ a & - x /\ a = -y /\ a, and so (using strong double negation), -x /\ a = -y /\ a & - -x /\ a = - - Y /\ a, i.e., -x ~A -yo We leave to the reader the verification of the corresponding fact for x ~v y, as well as the verification that the modified relations ~A and ~v are still equivalence relations and still respect meet and join. The proof is otherwise just as for Lemma 8.6.6. 0
Remark 8.11.13 As the reader can easily see by just dropping the extra parts of the solution to Exercise 8.11.11, there are corresponding "direct" equivalences among the prime filter separation principle, the representation of distributive lattices as rings of sets, and the embedding of distributive lattices as subdirect products of a direct product of the two-element distributive lattice 2. For each positive integer n, it is customary to denote the finite direct product IIi<1l 2i as simply 211 . We extend this to the trivial one-element Boolean algebra by letting n = O. This is motivated by the following considerations. Note that 2° is the set of all mappings from the empty set 0 into 2. But the only mapping whose domain is the empty set is itself the empty map, i.e., 0. Thus 2° = {0} has just one element. It is easy to see that any "componentwise" operation involving 0 leads again to 0.
Theorem 8.11.14 For each positive integer n, 211 is isomorphic to a subalgebra of 211+1. Hencefor i ~ j, 2i is a subalgebra of 2J. Proof For motivation of the following, please consult Figures 8.9-8.11 diagramming 2 1, 2 2 , and 2 3. It should be reasonably clear that the mapping h(al' ... , all I = (a 1, ... , all, all I is the desired embedding, and of course an induction (using the fact that the composition of two embeddings is an embedding) gives the more general result. We leave the details to the reader, but we note that the proof has nothing to do with the
313
algebra being 2 or the operations being Boolean, and in fact represents a law of universal algebra. 0
Exercise 8.11.15 Prove that for every algebra A, the finite direct product All is a subalgebra of A 11+ 1. We close this section by observing: Proposition 8.11.16 Every finite Boolean algebra B is isomorphic to a subdirect product 211 for some integer n. Proof Every finite Boolean algebra B is isomorphic to a subdirect product (2i/iEI. Because the projections are required to be onto, this means that I must be finite. Taking I to be the indices i is just in effect a relabelling, and does not change the isomorphism. There is still the niggling question as to what happens if B is a one-element algebra.
o 8.12
Distributive Lattices with Operators
This section is based on what would appear to be a wrongly neglected paper: "Boolean algebras with operators," by B. Jonsson and A. Tarski. 1951, 1952. Without too much exaggeration it may be said that this remarkable paper contains the Kripke semantics for modal logic (with the dyadic accessibility relation-cf. esp. Kripke 1963). Even more surprisingly, it anticipates the Routley-Meyer generalization of Kripke's semantics so as to provide a semantics for the relevance logics (with the triadic accessibility relationcf. esp. Routley and Meyer 1973). It goes without saying, though we say it anyway to make our views unmistakable, that there is incredibly much that is original and unique in Kripke and Routley-Meyer (and that goes beyond mere "labeling"). But still the extent of the J onsson-Tarski anticipation is not to be understated either. What Jonsson and Tarski do is show how to take any Boolean algebra with additional operations on it of any degree (there are simple conditions on these "operators," mainly that they be "additive"-a condition we shall define in a moment), and represent that Boolean algebra as a field of sets upon which are defined certain extra operations corresponding to each extra operator on the given Boolean algebra. These extra operations are defined as generalized image-forming operations in a certain natural sense we shall make clear, an n-place operation being defined as forming the image of a corresponding (n + 1)-place relation. The connection with the KripkelRoutley-Meyer semantics will eventually be made clear. But in anticipation, if we adopt the familiar idea (already in Boole) that a proposition be identified with a set-the set of cases (possible worlds) in which it is true-then additional (non-classical) connectives can be interpreted as extra operations on sets. And when these operations are "operators" (as they are in many cases), they can be defined as the "image operators" of J onsson-Tarski. This explanation does not work itself out quite so smoothly for the relevance logic semantics of Routley-Meyer, for the Lindenbaum algebra of relevance logics does not straightforwardly yield a Boolean algebra. That is one reason we prefer to develop the Jonsson-Tarski ideas in the more general
DISTRIBUTIVE LATTICES WITH OPERATORS
REPRESENTATION THEOREMS
314
+ I)-place relation
315
R ~ Un+l
setting of distributive lattices. Goldblatt (1989) also considers distributive lattices with operators, calling them "complex algebras.,,4 We now begin to set the framework. An operation 0i is additive just when it distributes over V (in each of its places), that is,
Indeed, following J6nsson and Tarski, given any (n we can define an n-ary operator:
(1) Oi(Xl, ... ,Y V Z, ... ,xn) = Oi(Xl,··· ,y, ... ,xn) VOi(Xl,··· ,Z, .. · ,xn).
This is the natural generalization of the image of a set under a function to the notion of the image of an n-tuple of sets under a relation. We leave to the reader the verification that R* is additive in the field of all subsets of U. This motivates the following definitions. Let (U, (Ri) iEl)' be any relational structure. By the associated distributive lattice with image operators we mean the ring of all subsets of U with the operators (Rt)iEI. The associated Boolean algebra with image operators is the same but with "field" in place of "ring." By a ring (field) of subsets ofU with image operators we mean any ring (field) of subsets of U closed under all the image operators An operation 0i on a lattice L is normal just in the case that if the lattice has a least element 0, and for some i (1 :S: i :S: n) Xi = 0, then
Thus, where
@
is a unary operation we require that
(2) @(Y V z) = @y V @z,
and where (3) x
0
is a binary operation, both
0
= (x x = (y
(y V z)
(4) (y V z)
0
0
y) V (x
0
z) and
0
x) V (z
0
x).
By a distributive lattice with operators we shall mean a structure {L, (Oi)iE[}, where each 0i is an additive operation on the distributive lattice L. A Boolean algebra with operators is the same except a Boolean algebra B replaces L. It is easy to see that an additive operation (henceforth, just operator) is isotonic (in each of its places), i.e., (5) y:S: z
-+
Oi(Xl, ... ,y,···,xn):S: Oi(Xl, .. ·,Z, ... ,xn).
Suppose that Y :S: z. Then z = Y V z. But then Oi(Xl, ... , z,···, xn) = Oi(Xl, ... , Y V Z, . .. , xn) = Oi(Xl, ... , y, . .. , x/J) V Oi(Xl, ... , z, ... , x/J) (the last step is by (1)). So we have the consequent of (5) as desired. A distributive lattice with a single binary operator 0 is called in the literature (cf. Fuchs 1963) a lattice-ordered groupoid, or an I-groupoid. There seems to be no familiar name for the analogous case with a single unary operator @, but this too shall be a very important case for us. Examples of binary operators are furnished by the Cartesian product on sets and relative product on relations (intersection and union are the lattice operations)-also by ordinary multiplication on the natural numbers (greatest common divisor is 1\, least common multiple is v). Examples of unary operators include the converse on relations and closure on sets of points in a topological space (again with intersection and union the lattice operations). Where U is any non-empty set and f : U -+ U, a particularly important motivating example is the f-image operator f*. As is familiar, for any X ~ U, f*(X) = {y : 3x E X such that y = f(x)}, and f*(X U Y)
pi : Un
(IfL has no 0, then every operation on L is trivially normal.) Note that any Rt is normal. (Also all of the examples of operators given above are normal.) Theorem 8.12.1 (Jonsson-Tarski) Every distributive lattice (L, (Oi )iEI) with normal operators is isomOlphic to a field of sets with image operators. Proof. We know by the proof of Stone's representation for distributive lattices that the mapping h(a) = {P : P is a prime filter and a E P}
is an isomorphism between L and a ring R of sets of prime filters of L. We let from now on the Ps range over the set of all prime filters of L. We show how to define a relation Ri for each operator 0i so that (#) h(Oi(Xj, ... , x/J))
= Rt(hxI, ... , hX/J).
Thus define Ri(PI, ... ,Pn,Pn+l) ifandonlyif'v'xI, ... ,Xn(XI E PI & '" & xn E Pn =? Oi(Xi, ... , Xn) E ~l+r). We express this last relationship compactly as Oi(PI, . .. , Pn) ~ Pn+I.
To show (#), it clearly suffices (upon removing definitions) to show (##) Oi(XI, ... ,X n) E ~z+1 P n) ~ Pn+r).
= f*(X) U f*(Y)·
This can clearly be generalized to any n-ary operation n-ary operator:
R7.
-+
U, giving rise to an
4The submission date makes it clear Goldblatt had this representation since at least 1985. We have presented the substance of this section to seminars since the early 1980s.
¢>
3Pl'''''~I(Xl E
PI & ... & xn E Pn & Oi(PI, .. ·,
The implication {= is immediate. The implication =? will, however, occupy us a while. Thus hypothesize O;(XI, . .. , xn) E Pn+l. Consider the principal filters [xr), ... , [xn)' We verify that Oi([Xr), ... ,[xn)) ~ ~1+I. Thus suppose Yl E [XI), ... ,Yn E [xn), i.e., Xl :S: YI,··., xn :S: Yn. By monotonicity, it follows that Oi(XI, ... , xn) :S: Oi(Yl, ... , Yn). Thus by our hypothesis and the fact that Pn+I is a filter, it follows that Oi(Yl, ... , Yn) E Pn+l, which completes the verification.
REPRESENTATION THEOREMS
316
The point is that we now know there exists an n-tuple of filters FI, ... , Fn like the PI, ... , Pn that we want except that they are most likely not prime. So we know that the following set is non-empty (where the Fs are all filters):
LATTICES WITH OPERATORS
and a'j E [Pj, y), i.e., 3p E Pj such that p 1\ x ::; a'j and 3p' E Pj such that p' Further standardizing, set p" = p 1\ p'. It is clear that
317
1\ Y ::;
ai.
(3) p" E p.J' p" 1\ x -< a"J' p" 1\ Y -< a'~j '
But then by the distributive law and lattice properties, Note that as always, we take filters to be proper filters. The careful reader may then worry about what happens if some Xi = O. The answer is that since 0i is normal, then o(X!, ... , x n ) = 0, and so the hypothesis that Oi(XI, ... , x n) E Pn+l is contradictory (Pn+l being proper). Define a partial order on E by componentwise inclusion, i.e., define (Fl, ... , Fn) ::; (GI, ... , G n ) iff Fi ~ Gi for 1 ::; i ::; n. It is easy to verify that each chain in this ordering has an upper bound in E formed by componentwise union of the chain's members. Thus, being more explicit, where e is a chain, let ei be the set of ith components of members of e. Then define Ve = (Uq, ... , U en), which clearly is an upper bound. The question is whether VeE E. It is easy to see that each Uei is a filter, the argument being precisely the same as in the analogous step in Stone's theorem when we needed that the union of a chain of filters is a filter. That Xi E Uei is clear, since Xi is an element of some Fi E ei. What remains to be shown, then, is that 0i(Uel, ... , Ue n ) ~ Pn+l. Suppose that ai E Uei. So there are (Fll, ... ,Fl;), ... ,(F{Z, ... ,F,~) E E such that
a! E FI!"'" an E F,;l. Let (Fl, ... , Fn) be the greatest (in the sense of ::;) of these n-tuples. Consider the n-tuple (Fl, ... , F/z1 ). It need not be in E. However, it is easy to see that (Fl, . .. , F/z ) ::; (F{Il, ... , F/z ), i.e., that Fj ~ FF(1 ::; j ::; n). Then each aj EFt Since (F{n, ... , Fin E E, then we know that oi(F{Il, ... , F,;n) ~ Pn+l, and so oi(al, ... , an) E P'l+l, as desired. Since we have now shown that each chain has an upper bound, we can conclude by Zorn's lemma (in a technically more general form than has been previously introduced since the relation::; is not just ~) that E has a maximal element (PI, . .. , Pn ).5 We now show that each of the PjS is prime, completing the proof of (##). So suppose that X V Y E Pj and yet X ¢ Pj and y ¢ Pj. Let [Pj, x) be the filter generated by Pj U {x}, and similarly for [Pj, y). Since [Pj, x) :J Pj, then (PI, ... , [Pj , x), ... , P'l) > (PI, ... , Pj, ... , Pn), and similarly for [Pj, y). So each of (PI, ... , [Pj , x), ... , Pn ) and (PI, ... , [Pj , y), . .. , Pn) must fail to be in E. Checking the conditions for membership in E quickly reveals that this can only be because there exist elements al,···, aj, ... , an, a'l"'" ai, ... , a~ such that: 1
ll
(1) al E PI, ... , aj E [Pj, x), ... , an E Pn, yet Oi(al,···, aj, ... , an) ¢ Pn+l; (2) a'l E PI, ... ,ai E [Pj,y), ... ,a;l E P,z, yet Oi(a'l"" ,ai,··· ,a;l) ¢ Pn+!. We next go through a series of moves that might be confusing in detail, but the basic spirit of which is to standardize the parameters so as to allow in the end a distribution (or two). Thus for k =f. j, set a~; = ak 1\ a~, but set a'j = aj V ai. It is clear that a'j E [Pj, x) 5This more general, abstract form says that if (E, ::;) is any partially ordered set such that each chain C S;; E has an upper bound, then E has a maximal element.
(4) p" 1\ (x V y)
= (p" 1\ x) V (p" 1\ y) ::; a'j.
Since p" E Pj, and our hypothesis has been that x V y E Pj , clearly p" 1\ (x V y) E Pj. It thus follows from (4) that a'j E Pj . It is further clear that by setting a~; = ak 1\ a~ (for k. =f. j) we have also guaranteed that a% E Pk. So in general (for all k) we have a% E Pk. Smce (PI, ... , Pj, ... , P'l) E E, we have (5) Oi(Pl, ... , Pj , . .. , Pn ) ~ P'HI.
So (6) 0i (" " ... ,a") al' ... ,aj' n E Pn+l.
B ut reca11 mg ' th at a"j
= aj V a"j , th'IS means
(7) Oi(a'{, ... , aj V ai, ... , a;;) E Pn+l.
But by the fact that 0i is additive, we can distribute this element to obtain (8) 0i (" al' ... ,aj, ... ,a") n
VOi (" al,
, ... ,a") .. ·,ap n EPn+l.
Since Pn+ 1 is prime, (9) oi(a'{, ... , aj, ... , a~) E Pn+l, or
(10) oi(a7, ... , ai, ... , a;;) E P'l+!'
Assume (9). Since a% ::; ak for k =f. j, we have by repeated applications of monotoniCity that (11) oi(a7, ... , aj, . .. , a;;) ::; Oi(al, ... , aj, . .. , an). From (9) and (1]) it follows by the filterhood of Pn+l that
(12) oi(al, ... ,aj, ... ,an )
E P'l+l,
contradicting our hypothesis (]). The other case (assuming (10)) can be argued symmetrically to give
(13) oi(a~, ... ,ai, ... ,a~)
E P'l+l,
contradicting our hypothesis (2).
8.13
D
Lattices with Operators
This section gives a representation for lattices with operators, where there is no assumption as to the distributivity of the lattice. Obviously, the representation cannot be a set of sets with the usual set-theoretical operations nand u, since these operations distribute over each other. As we mentioned above in Section 8.3, there are various representations for lattices; two of them were considered in detail there, and some others are mentioned
REPRESENTATION THEOREMS
318
in Chapter 13. All these representations can be extended by operators (cf. Ono 1993; Hartonas 1993b; Allwein and Dunn 1993). We present one of the extension here. This is based on the principal pair lattice representation, which is a modification of the Hartonas-Dunn representation, and looks somewhat as if it is a dual to the Urquhart representation. 6 The basic idea of adding an operator to the lattice representation is essentially the same as in the J 6nsson-Tarski representation, although the frame has to be augmented with extra conditions due to the lack of disttibutivity. First we define the structure we are to represent:
Definition 8.13.1 A lattice with operators is a structure (L, /\, V, (Oi )iEI) where (L, /\, V) is a lattice and each 0i is a normal additive operation.
Just as before, it does follow that 0i is isotonic in each argument place: (1) y:S
z :::}
Oi(XI, ... ,y,···,xn):S Oi(Xl,···,Z, ... ,xn).
As an example of such a structure one might consider the positive additive fragment of linear logic, together with multiplicative conjunction: the additive conjunction and disjunction gives the lattice, and 0 is a lattice ordered binary operation which is isotonic in both places. Before jumping to the technicalities of the principal pair representation, let us give some intuitive motivations. The set of the principal filters (equally, the set of principal ideals) of a lattice is isomorphic to the lattice itself (cf. Birkhoff 1967). This seems to make these two sets somewhat trivial or too easy to use, but simplicity might have advantages when things start to get complicated. Notice also that this mapping (where each element is mapped into the principal filter (or ideal) it determines) can be easily "typelifted." If one takes the power set of all principal filters (or principal ideals), then the map which assigns to each element a set of principal filters (or ideals) each of which contains the particular element, then this provides and equally good representation, though the first is a join representation, and the typelifted one is meet representation. In these cases the lattice is viewed as a relational rather than an operational structure, ~ (or its converse) being the ordering. Thus, if L = (L, :S) and F is the set of all principal filters, then the two maps are: (i) hl(a) (ii) h2(a)
= {x: a:S x}, i.e., hl(a) = [a) E F; = {F : F E F & a E F }, i.e., h2(a) = [[a))
LATTICES WITH OPERATORS
maximal, and not every inconsistent pair is minimal. Maximalization might require an application of Zorn's lemma, so might minimalization. However, one does not need to minimalize on inconsistent pairs in the representation, because minimally inconsistent pairs are always made up from a principal filter and a principal ideal, as follows from the next lemma.
Lemma 8.13.2 Let L be a lattice. If (F, 1) is a principal filter-ideal pair on L, that is, (F,1) is a minimal element in the set of overlapping filter-ideal pairs, then there is an a E L such that F = [a) and I = (a].
The proof of this lemma is easy but somewhat tedious, and left to the interested reader. The converse of the lemma holds as well, that is, a pair formed from a principal filter and a principal ideal, where these two are generated by the same element, is a minimal element among the overlapping pairs. The idea is then to take advantage of the fact that for each element of the lattice there is exactly one principal filter-ideal pair generated by that element? Consider a relational structure (U, C, (Ri )iEI) where C is a transitive relation. Define two functions r and I on subsets of U as follows: reX) = {y : Vx(x EX:::} x C y)} and g(y) = {x : Vy(y E Y :::} Y ex)}. We call a subset X of U hereditary or upward closed (with respect to C) in the same sense as in Section 8.3, i.e., if x E X and x C y implies y E X. We call a subset X of U dual hereditary or downward closed if y E X and x C y implies x EX. The two functions rand g are Galois connections between the upward closed and downward closed subsets of U. A set X is called stable if X = gr X. A stable lattice is a set V of stable subsets of U closed under intersection, as well as satisfying the condition that if X, Y E V then r X n r Y = r Z for some Z E V. (Meet is defined as n, and the join of X, Y is g(r X n rY).) Furthermore, we will call a set V of stable sets fit for R7 if V is closed under Rt and additionally satisfies the condition:
* * * (7) - Ri(XI, ... , Y vZ, ... , X n) ~Ri(XI, ... ,y, ... ,Xn)VRi(XI, ... ,Z, ... ,Xn). Theorem 8.13.3 Every lattice (L, /\, V, (Oi )iEI) with operators is isomorphic to a fit stable lattice of sets with image operators. Proof The canonical map is defined as h(a) = { (F,1) : (F,1) is a principal pair and a
~ F.
To avoid distributivity Urquhart uses maximal filter-ideal pairs (Definition 3.18.37) (and join in the representation is defined from Galois connections). Instead of "maximally disjoint" one might consider "minimally overlapping," and that is what we called a principal filter-ideal pair in Definition 3.18.38. Clearly, not every consistent pair is
319
E
F }.
For the ordering relation we choose ~ on the first members of the pairs, that is, set inclusion on filters. It is straightforward to check that h(a/\b) = h(a)nh(b) and h(aV b) = g(rh(a) n rh(b)). We define the relation R on principal filters as follows: R i ( (FI,II,), ... , (Fn'!Il)' (Fn+l ,!n+l»
iff
Val, ... , an((al E FI & ... & all E F Il ) :::} Oi(al, .. ·, an) E Fn+l). 60 n the one hand, the use of two relations in the frame, as well as using filter-ideal pairs in the canonical model, resembles the Urquhart representation, though there is no translation between the two representations. On the other hand, the frame can be viewed as a special case of the frame used by Hartonas and Dunn, though canonically we use a different set. A further extension of this representation was used in Bimbo (2000) to give a semantics for certain structurally free logics.
The image operator R* which corresponds to 0i is defined as before. 7Note that we do not require in this representation the filters and ideals to be proper.
320
REPRESENTATION THEOREMS
We show that h(oJaj, ... ,an)) = R;' (h(aj), ... ,h(a n)). Going from right to left, the claim amounts to a universal instantiation in the definition of Ri. Going from left to right, let us assume that (F', I') E h(oi(aj, . .. ,an))' By the definition of h this means that oi(aj, ... , an) E F'. Consider then the principal filters determined by aj, ... , an: [aj), ... , [an). Due to the monotonicity of 0i for all a~ ~aj, ... ,a;1 ~an, oj(aj, ... ,an) :S 0i (a~ , ... ,a;I)' Then, Ri( ([aj), (all), ... , ([an), (an]), ([oi(aj, ... , an)), (Oi (aj, ... , an)]») holds, and hence by the definition of the image operator ([oi(aj, ... , an)), (oi(aj, ... , an)]) E Rt(h(aj), ... , h(a n )), as desired. Lastly, h is one-one in view of our remark that each element generates a principal ~ D
9 CLASSICAL PROPOSITIONAL LOGIC 9.1
Preliminary Notions
Classical propositional logic has associated with it a number of different algebraic logics, which we will call Frege logic, Boolean logic, and unital Boolean logic. Although these logics are characterized by different classes of logical matrices, as we will see later, they are all strongly equivalent, which is to say that they validate exactly the same asymmetric consequences. However, they do not validate all the same symmetric consequences. First of all, Frege logic is characterized by a single matrix, the Frege matrix, which is discussed in Chapter 5. Frege logic may be thought of as the algebraic representative of classical truth-functional logic; in particular, insofar as there are only two propositions in the Frege matrix ("the true" and "the false"), each formula is interpreted directly as a truth value. Closely related to Frege logic is Boolean logic, which is characterized by the class of Boolean matrices. A Boolean matrix is a matrix (B, F), where B is any Boolean algebra and F is a proper filter on B. Recall from Chapter 3 that a filter on a lattice L is any non-empty subset F of L satisfying the following conditions for all a, bEL: (fl) a E F and a :S b only if bE F;
(f2) a E F and b E F only if a /\ b E F. Equivalent to (fl) and (f2) is the following single condition: (f3) a E F and bE F if and only if a /\ bE F.
A filter on L is said to be a proper filter if it is a proper subset of L. Recall that in a matrix, F is intended to be the set of true propositions. Now, the partial ordering among propositions is intended to represent the relation of entailment (not to be confused with a -+ b, which represents the conditional connective); in particular, "a :S b" may be read "a entails b." (Note that, where "a :S b" is a statement about the lattice elements, a -+ b is itself a lattice element; this conesponds to the difference between the metalanguage entailment and the object language conditional.) This allows us to interpret the conditions above that define filters: (fl') if a is true and a entails b, then b is true;
(f2') if a is true and b is true, then a /\ b is true. The class F(L) of all filters on a lattice L is partially ordered by set inclusion, with respect to which F(L) forms a complete lattice. In particular, if C is any collection of
CLASSICAL PROPOSITIONAL LOGIC
322
n
filters on L, then C is also a filter. Since F(L) is closed under arbitrary intersection, we can speak of the filter F(S) generated by a subset S of L, sometimes written as [S). Specifically, F(S) is defined as follows: (1) F(S) = n{F E F(L): S ~ F}.
Note also the following general theorem concerning F(S): (T) F(S)
= {x E L : x 2: glb(T) for some finite T ~ S} = {x E L : x 2: SI A S2 A ... A Sn for some SI, S2, ... , Sn E S}.
BOOLEAN LOGIC AND FREGE LOGIC
all Y E r. On the other hand, h(l(a)) not K-valid.
E
A' - D', so vl!o/(a) :f. T. It follows that r I- a is D
Our next few lemmas concern the theory of Boolean lattices. Lemma 9.2.2 Let B be any Boolean lattice, and let F be any jilter on B. Dejine a relation 8 on B as follows: a8b (If a '"7 b E F. Then 8 is a congruence relation on B.
Proof. The reader can easily "calculate" this from the properties of a Boolean algebra. D
In the special case that S is finite, F(S) is given as follows: (T*) F(S)
= {x E L: x 2: glb(S)}.
In the special case that S is a singleton {a}, F( {a}) is called the principal jilter generated by a, and is given as follows: (PF) F ( {a}) = {x E L : x 2: a}
We sometimes write [a) for F ( {a} ). A special subclass of Boolean matrices are the unital Boolean matrices. A Boolean matrix (B, F) is said to be unital if F = {I}. Unital Boolean logic is characterized by the class of unital Boolean matrices. This logic corresponds to the supervaluational logic of van Fraassen (1971). This has been a quick reprisal of the properties of a filter. We will assume additional notions and results from Chapter 8, in particular, those relating to maximal filters and the Stone representation theorem. 9.2
The Equivalence of (Unital) Boolean Logic and Frege Logic
In order to show that Frege logic and Boolean logic are strongly equivalent, that they validate exactly the same asymmetric consequences, we proceed in steps, first showing that Boolean logic is strongly equivalent to unital Boolean logic, and then showing that unital Boolean logic is strongly equivalent to Frege logic. The key general lemma is the following. Lemma 9.2.1 Let J and K be two classes of similar matrices, all appropriate for a given algebra of sentences. Then the following is a s£flficient condition that every Kvalid argument is also J-valid (here M = (A, D), M' = (A', D')). For every M E J, and every a E A - D, there exists an M' E K and a homomorphism h from A into A' such that h(D) ~ D', and such that h(a) E A' - D'.
Proof. Suppose the condition, and suppose that r I- a is not J-valid. Then there is a valuation v E V(J) such that v(y) = T for all Y E r but v(a) :f. T. From this it follows that there is a J-matrix M and an interpretation I such that I(Y) E D for all y E rand I(a) E A-D. According to the hypothesized condition, there is a K-matrix M' and a homomorphism h such that h(D) ~ D', and such that h(l(a)) E A' - D'. Since h is a homomorphism, the composition of h with I yields an M-interpretation, which in tum yields a valuation Vho/. In particular Vho/(Y) = Tiff h(I(Y)) ED', but I(Y) E D, and h maps every true proposition into a true proposition, so h(I(Y)) ED', so Vho/(Y) = T for
323
Corollary 9.2.3 EvelY jilter on a Boolean lattice B induces a quotient algebra which is itself a Boolean lattice, denoted as B I F.
Proof. Follows from Chapter 2, noting that Boolean algebras form a variety.
D
Lemma 9.2.4 Let B be a Boolean lattice, let F be any jilter on B, and let 8 be the congruence relation determined by F. Then we have the following:
a
E
F
¢:=?
Proof. a81 iff a '"7 I E F iff a E F (since a
a8l.
= (a A 1) V (-a A-I)).
D
Lemma 9.2.5 Let B be a Boolean lattice, let F be a maximal jilter on B, and let 8 be the associated congruence relation on B. Then for every element a E B, either a8I or a80.
Proof. Suppose F is a maximal filter on B. Then F is complete, so for all a E B, either a E F or -a E F. So by Lemma 9.2.4, for all a E B, either a81 or -a81. If -a81, then - - a8 - 1, that is, a80. D Corollary 9.2.6 IfF is a maximaljilter on a Boolean lattice B, then the quotient algebra BI F has exactly two-elements, so it is the two-element Boolean lattice, or the Frege algebra.
Proof. Immediate from Lemma 9.2.4 and facts about universal algebra.
D
Lemma 9.2.7 Let B be a Boolean lattice, and let F be any jilter on B. Then there exists a Boolean lattice B* and a homomorphism hfrom B into B* such that F
= h- 1({1}) = {x E B: hex) = l}.
Proof. Let F be a filter on the Boolean lattice B. According to Corollary 9.2.3, BI F is a Boolean algebra, which is a quotient of B. Consider the canonical homomorphism h from B into BI F, defined so that h(a) = [a]F, where [a]F = {x E B : a '"7 x E F}. Therefore, a E h- 1({ I}) iff h(a) = 1, that is, iff [a]F = 1. Now, [a]F consists of all those elements congruent to a (modulo F), and 1 consists of all those elements congruent to 1 (modulo F). If these two congruence classes are the same, then a is congruent to 1 (modulo F), from which it follows that a E F. Conversely, if a E F, then a81, so h(a) = [a]F = [1]F = 1, so a E h- 1( {I}). D
324
SYMMETRICAL ENTAILMENT
CLASSICAL PROPOSITIONAL LOGIC
Lemma 9.2.8 Let B be any Boolean lattice, and let a be any element different from l. Then there exists a homol1101phism hJrom B into the two-element Boolean lattice such that h(a) = o. Proof Let a be a non-unit element of Boolean lattice B. Since a :I I, {I} is a proper filter. So in virtue of Stone's maximal filter separation theorem, I is contained in a maximal filter, call it F*, with a ¢ F*. According to Lemma 9.2.7, there is a homomorphism h into the two-element Boolean lattice such that h- I ( {I}) = F*. Therefore, D since a ¢ F*, h(a) = O.
First we recall some notions we introduced earlier. Remember that the customary relation of consequence holds between sets of formulas and individual formulas; specifically, we say that r entails a relative to a class V of valuations if every valuation v E V that satisfies r also satisfies a. Since the two sides of the relation are categorically different, we might call this consequence asymmetrical consequence. By contrast, a symmetrical consequence relation is a relation holding among sets. Symmetrical entailment is defined as follows (perhaps not in the manner one might expect!): (1) Let V be a class of valuations on language L, with formulas W, and let rand li
Theorem 9.2.9 Boolean logic and unital Boolean logic are strongly equivalent. Proof Since the class of Boolean matrices includes the class of unital Boolean matlices, every Boolean valid argument is automatically unital Boolean-valid. Concerning the converse, we appeal to Lemma 9.2.1. Consider any Boolean matlix M = (B, F). According to Lemma 9.2.7, there is a Boolean lattice B* and a homomorphism h from B into B* such that F = h- I ({ID. It follows that there is a unital Boolean matrixthe one based on B*-and a homomorphism (namely, h) such that h(F) ~ {I}, and such that for any a ¢ F, h(a) ¢ {l}. We thus have the antecedent condition to apply Lemma 9.2.1, so every unital Boolean valid argument is also Boolean valid. (Note that Lemma 9.2.1 says that if for any a ¢ D, there is a homomorphism of the required nature, then J-Iogic subsumes K-Iogic. In this proof, we have shown something stronger-that there is a single homomorphism which does the job for every a ¢ D.) D
be subsets of W; then r is said to entailli relative to V if for every v satisfies r, then there exists an a E li such that v satisfies a.
Proof Since the Frege matrix is a unital Boolean matrix, every unital Boolean valid argument is automatically Frege valid. In order to prove the converse, we appeal to Lemma 9.2.1. Consider a unital Boolean matrix M = (B, D), where D = {I}, and consider any element a different from 1. According to Lemma 9.2.8, there is a homomorphism h from B into the Frege algebra such that h(a) = O. We accordingly have the antecedent condition of Lemma 9.2.1, from which it follows that every Frege valid argument is also unital Boolean valid. D Corollary 9.2.11 Boolean logic and Frege logic are strongly equivalent. D
Corollary 9.2.12 Boolean logic has a finite characteristic matrix. Proof Immediate from Corollary 9.2.6 together with the fact that the Frege matrix is finite (two elements). D
V, if v
We say that two logics LI and L2 are strictly equivalent, or equivalent with respect to symmetrical entailment, if for any sets r, li, r LI-entails li iff r L2-entails li. Note that ifLI and L2 are sttictly equivalent, then they are automatically strongly equivalent, and hence weakly equivalent; for strong equivalence is a special case of strict equivalence obtained by restricting one's attention to pairs of sets r, li where li is a singleton set. In the following theorems, we show that although Boolean logic is strictly equivalent to unital Boolean logic, it is not strictly equivalent to Frege logic. We begin by proving an important lemma that states that, from the point of view of truth-value semantics, Boolean logic is identical to unital Boolean logic.
Proof Since every unital Boolean matrix is a Boolean matrix, every unital Boolean valuation is a Boolean valuation: V(Ku) ~ V(KB). Concerning the converse inclusion, suppose v E V(KB). Then there is a Boolean algebra B, a filter F, and an interpretation I : W -+ B, from the word algebra into B, such that v is the valuation induced by I; that is, v(a) = T if I(a) E F, v(a) = F if otherwise. In virtue of Lemma 9.2.7, there is a homomorphism h from B into some Boolean algebra B* such that F = h- I ( {I D. We can form a unital Boolean matrix simply by designating 1 as the true proposition. Now, the composition c of h and I is an interpretation of L into a unital Boolean mattix, and the associated valuation is defined so that vc(a) = T if c(a) = 1, vc(a) = F otherwise. Claim: Vc = VI(= v); forvl(a) = Tiffz(a) E F iff h(z(a)) = 1 iff c(a) = I iffvc(a) = T. Thus, v = VI E V(Ku). D Theorem 9.3.2 Boolean logic and unital Boolean logic are strictly equivalent. Proof Immediate from Lemma 9.3.1.
9.3
E
Lemma 9.3.1 Where KB is the class oJBoolean matrices, and Ku is the class oJunital Boolean matrices, V(KB) = V(Ku).
Theorem 9.2.10 Unital Boolean logic and Frege logic are strongly equivalent.
Proof Immediate from Theorems 9.2.9 and 9.2.10.
325
Symmetrical Entailment
Having shown that Boolean logic and Frege logic are equivalent in the sense that they validate exactly the same arguments, in the present section we show how tlley are different.
D
Theorem 9.3.3 Boolean logic and Frege logic are not strictly equivalent. Proof It is sufficient to produce two sets r, li such that r F -entails li but r does not Bentailli. The following is a simple example: r = 0, li = {p, ~ p}. To say that 0 entails
326
CLASSICAL PROPOSITIONAL LOGIC
COMPACTNESS THEOREMS
./:::,. is to say that ./:::,. is unassailable, which is to say that every valuation satisfies some formula or other in ./:::,... Thus, the claim is that although {p, ~ p} is unassailable relative to Frege logic, it is not unassailable relative to Boolean logic. Suppose v E V(F), and v does not satisfy p. It follows that there is an interpretation I into the Frege algebra such that l(p) = O. But l(~p) = -l(p) = -0 = 1, so l(~p) = 1, so v(~p) = T. Thus, {p, ~ p} is Frege unassailable. On the other hand, consider the four-element unital Boolean matrix, where the propositions are 1, a, b, 0, and consider any interpretation I such that l(p) = a, so that l(~p) = b. Since neither a nor b is designated true or false, the corresponding valuation VI satisfies neither p nor ~p. Thus, {p, ~ p} is not unassailable relative to Boolean logic. D
9.4
Compactness Theorems for Classical Propositional Logic
Earlier we introduced three different notions of compactness, defined as follows. (1) (L, V) is I-compact iff for every subset r of W, r is unsatisfiable (relative to V) only if X is un satisfiable for some finite subset X of r. (2) (L, V) is U-compact iff for every subset r of W, r is unassailable relative to V only if X is unassailable for some finite subset X of r. (3) (L, V) has finitary entailment iff for every subset ru {a} of W, r V-entails a only if X V -entails a for some finite subset X of r. We begin by noting that the notions of unsatisfiability and unassailability can be succinctly expressed in terms of symmetrical entailment as follows: (a) r is unsatisfiable iff r entails 0 (relative to V). (b) r is unassailable iff 0 entails r (relative to V).
(2) Suppose r is unassailable. Then by (b) above, 0 entails r. So by strong compactness, there are finite subsets X of 0 and Y of r such that X entails Y. But the only subset of 0 is 0, so we have that 0 entails Y for some finite subset Y of r, which is to say that Y is unassailable. Thus, (L, V) is U-compact. (3) Suppose r entails a. Then by (c) above, r entails {a}. So by strong compactness, there are finite subsets X of rand Y of {a} such that X entails Y. Since there are two subsets of {a}, we have two cases to consider. Case 1: Y = {a}; in this case X entails {a}, so X entails a. Case 2: Y = 0; in this case, X entails 0, so X entails every formula, so in particular, X entails a. In either case, we have that X entails a for some finite subset X of r. Thus, (L, V) has finitary entailment. D
Lemma 9.4.2 Suppose (L, V) is I -compact, and suppose that (L, V) has exclusion negation. Then (L, V) is strongly compact. (Recall that (L, V) has exclllsion negation ifffor every a E W, there is a 13 E W such that yea) = V - V(f3).) Proof Suppose (L, V) is I -compact and has exclusion negation. Suppose r entails ./:::,... Then (V (a) : a E r} ~ U (V (a) : a E ./:::,..} . So by set theory X ~ Y ¢=} X - Y = 0,
n
n
and (Yea) : a E r} - U (Yea) : a E ./:::,..} = 0; also by set theory (generalized De Morgan laws), (yea) : a E r} n (V - Yea) : a E'/:::"'} = 0. At this point, we appeal to the exclusion negation property, which tells us that for every a E ./:::,.., there is an a* such that V - yea) = V(a*). (a* is a particular formula 13 such that V(f3) = V - yea); we are now appealing to the Axiom of Choice.) Thus, (V (a) : a E r} n (V (a*) : a E'/:::"'} = 0. Now, let'/:::"'* be the image of./:::,.. under the star function (which is a choice function). This allows us to write n{V(a) : a E r} n n{V(a) : a E ./:::,..*} = 0, from which it follows that (yea) : a E r u./:::,..*} = 0, which is to say that r u./:::,..* is unsatisfiable. So by I -compactness, there is a finite subset of r u./:::,.. * that is unsatisfiable, which may be written X U Y, where X ~ r, and Y ~ ./:::,. *. Since X U Y is unsatisfiable, we have that n{V(a) : a EX U Y} = 0, from which it follows that n{V(a) : a E X}nn{V(a): a E Y} = 0,sobyDeMorganlaws, n{V(a): a E X}-U{V-V(a): a E Y} = 0, so by set theory, n{V(a) : a E X} ~ U (V - yea) : a E Y}. Recall that every a E ./:::,..* is an exclusion negation of some formula 13 or other in ./:::,..; so for every a E Y, there is a 13 E ./:::,.. such that V - yea) = V(f3). For each a E Y, let ail be any such 13. Thus, (V - yea) : a E Y} = (Yea) : a E yiI}, where yil is the image ofYunder the U function. So we have that (yea) : a E X} ~ U (yea) : a E yiI}, which is to say that X entails yiI; here, X is a finite subset of rand yil is a finite subset of ./:::,... Thus, (L, V) is strongly compact. D
n
n
n
n
n
As noted earlier, asymmetrical entailment can be thought of as a special case of symmetrical entailment, where the consequent set is a singleton: (c) r entails a iff r entails {a}. In addition to the above three notions of compactness, there is a stronger notion of compactness which we call strong compactness or finitmy symmetric entailment: (4) (L, V) is strongly compact iff for every pair r,'/:::'" of subsets of W, r entails ./:::,. (relative to V) only if X entails Y for some finite subset X of r and some finite subset Y of'/:::"'. The following lemma states that strong compactness is in fact strong.
Lemma 9.4.1 If (L, V) is strongly compact. then it is I -compact. U -compact. and has finitary entailment. Proof Suppose (L, V) is strongly compact, and suppose r
327
~
W, and a
E
W.
(1) Suppose r is unsatisfiable. Then by (a) above, r entails 0. So by strong compactness, there are finite subsets X of rand Y of 0 such that X entails Y, but the only subset of 0 is 0, so we have that X entails 0 for some finite subset X of r, which is to say that X is unsatisfiable. Thus, (L, V) is I-compact.
n
Theorem 9.4.3 If (L, V) has exclusion negation, then all of the following conditions are equivalent: (1) (2) (3) (4)
(L, V) (L, V) (L, V) (L, V)
is I -compact; is U-compact; has finitary entailment; is strongly compact.
328
CLASSICAL PROPOSITIONAL LOGIC
COMPACTNESS THEOREMS
Proof Strong compactness implies each of the other three, by Lemma 9.4.1. Also, by Lemma 9.4.2, if (L, V) has exclusion negation, then I-compactness implies strong compactness. We leave the interested reader to prove, as an exercise, that with exclusion negation U -compactness implies finitary entailment, and that (with exclusion negation again) finitary entailment implies strong compactness. D
As noted earlier, Frege logic (but not Boolean logic) has exclusion negation, so in order to show that Frege logic is strongly compact, it is sufficient to show, for example, that it is I -compact. In order to do this we first prove a number of lemmas. Lemma 9.4.4 Let K
U {A} be a class of similar algebras; let H (A, K) be the class of homomorphisms that map A into B for some algebra or other B E K. Define a relation e on A as follows:
aeb iff h(a)
= h(b) for all h E H(A, K).
polynomial function, we have P(h(W1), . .. , h(wm)) =f. If/(h (x]), ... , h(xn)). This contradicts our earlier claim that p( ai, ... , am) = If/( bl, ... , bn ) for all Boolean algebras B, and for all elements a1, ... , am, bl, ... , bn E B. Thus, W Ie is a Boolean algebra. D
Remark 9.4.7 Polynomials on an algebra A are the set-theoretic counterparts of terms in the associated first-order language. Let A be any algebra. Then the polynomials on A-denoted poly(A)-are partial functions from UnEOJ An into A. Each polynomial has a degree: P has degree njust when dom(p) = An. poly(A) is defined inductively as follows: (1) The identity function on A is an element of poly(A): i(x)
E
1 1 2 2 n n] 0(P1, ... , Pn)[ a I' ... , ad I ' ai' ... , ad2' ... , a I' ... , ad ll
Proof e is obviously an equivalence relation. Concerning the replacement property, suppose (0) is an n-place operation on A, and suppose alebl, a2e b2, ... , aneb n . Then forallh E H(A,K),h(ai) = h(bi)(i E n).Considerany hE H(A,K). Thenh(0(a1,a2, ... , an)) = o*(h(a]), h(a2),"" h(a n )), and h(o(bl, b2,··., bn )) o*(h(b]), h(b2), ... , h(b n )). Since h(ai) = h(bi) for all i E n, it follows that 0*(h(a1), h(a2), ... , h(a n )) = o*(h(bl), h(b2), ... h(b n )), so h(0(a1, a2,··· ,an)) = h(o(bl, b2, ... , bn )), for all h E H(A, K). Thus, o(al, a2, ... , an )(}0(b1, b2, ... , bn ), and e satisfies the replaceD
Corollary 9.4.5 Let L be the standard sentential language, and let W be the associated algebra offormulas; let B be the class of Boolean algebras. Then the relation e, defined so that aeb iff h(a) = h(b) for all h E H(W, B), is a congruence relation. Proof Immediate.
= x for all x
A. (2) If PI, P2, ... , Pn are polynomials on A, where d(Pi) = d i , and 0 is an n-place operation on A, then 0(P1, P2, . .. , Pn) is a polynomial on A of degree dl + d2 + ... + d n , and is defined as follows:
Then e is a congruence relation on A.
ment property.
329
D
Lemma 9.4.6 Let e be the congruence relation cited in the above corollary. Then the quotient algebra W Ie is a Boolean algebra. Proof Since Boolean algebras form a variety (equational class), if WI e is not a Boolean algebra, then there is at least one equation E satisfied by every Boolean algebra but not satisfied by W Ie. It follows that if W Ie is not a Boolean algebra, then there are two polynomials p and If/ such that p(al, ... , am) = If/(bl, ... , bn ) for all Boolean algebras B, for all elements al, ... , am, bl, ... , bn E B, and such that p([wIJ, ... , [w m]) =f. If/([x]], ... , [x,d) for some [wIJ, ... , [w m], [x]], ... , [xn] E W Ie. It can be shown by induction that every polynomial preserves every homomorphism: h(p(a1, . .. , an)) = p(h(a1), ... , h(an)). Then, it follows that p([w]], ... , [wm]) = [P(W1, ... , wm)], and If/([x]], ... , [xn]) = [If/(XI, ... , xn)]. Thus, [P(W1, .. ·, wm)] =f. [If/(X1, ... , xn)], which is to say that p(WI, ... , wm) is not congruent to If/(XI, ... , xn). So there is a Boolean homomorphism such that h(P(WI, ... , wm)) =f. h(lf/(XI, ... , xn)). Since h preserves every
= 0(p1(ai,· .. ,a~I)'''·'Pn(a7,···,a~)). (3) Nothing is a polynomial on A except in virtue of clauses (1) and (2) . Lemma 9.4.8 There is a natural correspondence between Boolean interpretations on Wand homomorphisms on W Ie. In particular, every interpretation I on W determines a unique homolllOlphism hi on WI e, and every homomorphism h on WI e determines a unique intelpretation Ih on W. The functions h 1-+ Ih and I 1-+ hi are inverses of each othel: Proof Suppose I is an interpretation of W into a Boolean algebra B. Define the function hi : wle -7 B so that h,([a]) = I(a). First of all, hi is well-defined; for suppose [a] = [b]; then aeb, so I(a) = I(b), so h,([a]) = h,([b]). Next, hi is a homomorphism; for h[(oc([a]], ... , [an])) = h[([cal, ... ,an]) = I(cal, ... , an) = fc(l(a1», ... , I (an)) = fc(h([a]]), ... , h([al/])); here, Oc is the operation on W Ie associated with connective c, and fc is the operation on B associated with c. Thus, hi is a homomorphism on W Ie. Now, suppose h is a homomorphism from W Ie into Boolean algebra B. Define Ih : W -7 B so that lh(a) = h[a]. In other words, Ih is simply the composition of h with the canonical homomorphism from W into W Ie; it follows immediately that Ih is a homomorphism from W into B. To see that the map h -7 Ih is the inverse of the map Ih -7 h, consider the interpretation Ih(I); by definition Ih(I)(a) = h,[a], and by definition h[[a] = I(a); thus Ih(l) = I; by similar reasoning, we can show that h,(h) = h.
D
Lemma 9.4.9 Let B alld B* be any Boolean algebras (lattices), let F be any filter on B*, and let h be any homomOlphism from B into B*. Then the pre-image h- I (F) of F under h is a filter on B.
COMPACTNESS THEOREMS
CLASSICAL PROPOSITIONAL LOGIC
330
Proof In order to show that h- I (F) is a filter on B, given the hypotheses, it is sufficient to show: a E h-I(F) and bE h- I (F) only if a /\ bE h- I (F), and a E h-I(F) and a :S b only if bE h-I(F). Suppose a E h-I(F) and bE h-I(F). Then h(a) E F and h(b) E F, so h(a /\ b) = h(a) /\ h(b) E F, so a /\ bE h-I(F). Suppose a E h-I(F) and a :S b; then h(a) E F, and a /\ b = a, so h(a /\ b) E F, so h(a) /\ h(b) E F, so h(b) E F, so bE h-I(F). D
Lemma 9.4.10
h- I (F)
is a proper filter provided F is a proper filter.
331
Proof Suppose F(S) is proper. Then 0 fj. F(S), since 0 :S x for all x E B. In virtue of the definition of F(S), we have that it is not the case that al /\ ... /\ an :S 0 for some al, ... , an E S, which implies that for all al, ... , all E S, al /\ ... /\ an ::f. O. Conversely, suppose F(S) is not proper. Then F(S) = B, so 0 E F(S). Therefore, for some aI, ... , all E S, al /\ ... /\ an :S 0; but 0 :S x for all x E B, so al /\ ... /\ an = O. D
Theorem 9.4.15 Frege logic is strongly compact.
Proof Suppose F is a proper filter on B*. Then F ::f. B*, so 0 fj. F. Since h is a homomorphism, h(O) = 0, from which it follows that 0 fj. 11- 1(F), for 0 E h-I(F) iff 11(0) E F iff 0 E F. D
Corollary 9.4.11 Let h be any homomorphism from W 18 into the Frege algebra, denoted by 2; then 11- 1( {I} ) is a proper filter on WI 8. Proof Immediate from Lemmas 9.4.9 and 9.4.10, noting that {I} is a proper filter on 2. D
Lemma 9.4.12 A subset r of formulas of W is satisfiable (by a Frege intelpretation) iff the corresponding family r I 8 of equivalence classes is contained in a proper filter on W 18; here r/8 = {[a] : a
En.
Proof Suppose r is satisfiable. Then there is an interpretation I : W -+ 2 such that I( a) = I for all a E r. Accordingly, the associated homomorphism h/ : WI 8 -+ 2 has the property that h/([a]) = 1 for all [a] E r/8 or, equivalently, for all a E r. It follows that r 18 is contained in h- I ({1 D, which is a proper filter on W 18, according to Lemma 9.4.10. Conversely, suppose r I 8 is contained in a proper filter F on WI 8. It follows that r 18 is contained in a maximal filter F* on W 18. According to Lemma 9.4.6, the quotient algebra on WI 8 induced by F* is a two-element Boolean lattice, that is, the Frege algebra. Consider the canonical homomorphism c from WI 8 into WI 8 I F*; c maps every element of F* into 1, and every element of WI8 - F* into O. The associated interpretation Ie maps every formula a whose equivalence class [a] is in F* into 1. Therefore, since r I 8 S;;; F S;;; F*, le( a) = 1 for every a E r, which is to say that r is satisfiable. D
Lemma 9.4.13 Let B be any Boolean lattice, and let S be any subset of B. Define F(S) = {x E B: for some al, ... , an E S, al /\ ... /\ an :S x}. Claim: F(S) is a filter on B. Proof Suppose p E F(S) and suppose p :S q. Then al /\ ... /\ an :S p for some aI, ... , an E S; but then al /\ ... /\ all :S q, so q E F(S). Suppose p, q E F(S). Then aI/\ .. . /\ am :S p and bl/\ ... /\ bn :S q for some aI,···, am, bI, ... , b ll E S. So by lattice theory, al /\ ... /\ am /\ bi /\ ... /\ bn :S p /\ q, from which it follows that p /\ q E F(S). D
Lemma 9.4.14 Let B be any Boolean lattice, let S be any subset of B, and let F(S) be the filter described in Lemma 9.4.13. Then F (S) is a proper filter (f and only if for all aI, ... , all E S, al /\ ... /\ all ::f. O.
Proof Since Frege logic has exclusion negation, in virtue of Theorem 9.4.3, it is sufficient to show that Frege logic is I -compact. We proceed contrapositively. Suppose X is satisfiable for every finite subset X of r. Then by Lemma 9.4.12, the corresponding family XI 8 of equivalence classes is contained in a proper filter on WI 8. Since X is finite, we write X = {XI,X2, ... ,xm }, and we write XI8 = {[xIJ, [X2], ... , [x m]}. Since XI8 is contained in a proper filter, it follows that [xIJ /\ [X2] /\ ... /\ [x m] ::f. O. Recall that X is an arbitrary finite subset of r, so we have that for all [Xl], [X2], ... , [Xm] E r 18, [xIJ /\ [X2] /\ ... /\ [xm] ::f. O. So by Lemma 9.4.14, F(r 18) is a proper filter on W 18. Since r 18 S;;; F(r 18), we have that r 18 is contained in a proper filter on W 18. So by Lemma 9.4.12, r is satisfiable. D We have now shown that Frege logic is strongly compact. Since Boolean logic is strongly equivalent to Frege logic, it follows that Boolean logic has finitary entailment. On the other hand, since Boolean logic and Frege logic are not strictly equivalent, we do not automatically know whether Boolean logic is compact in any of the other senses of compactness. As it turns out, Boolean logic is strongly compact, but we proceed to this result in small steps.
Lemma 9.4.16 Boolean logic is I -compact. Proof Suppose that r is Boolean-unsatisfiable. Then there is no valuation v E V(B) such that vCr) = {T}. It immediately follows that r entails every formula, so in particular, r entails (relative to B) the constant falsity f (f can be defined as p/\ '" p). Since Boolean logic is strongly equivalent to Frege logic, we have that r F-entails f. But Frege logic has finitary entailment, so there is a finite subset X of r such that X F-entails f. Appealing to strong equivalence again, we have that X B-entails f, which is to say that every valuation v E V(B) that satisfies X also satisfies f. But no valuation satisfies f, so no valuation satisfies X. Thus, there is a finite subset X of r that is D unsatisfiable.
Lemma 9.4.17 Let S be any set offormulas that is satisfiable. Then there is a Boolean valuation Vs that satisfies S, and moreover has the property that every valuation v that satisfies S is an extension ofvs. Proof Suppose S is satisfiable. Then there is a Boolean interpretation I from W into B such that I(a) = 1 for every a E S; equivalently, the associated homomorphism h/ is such that h/ ([ a]) = 1 for all [a] E S 18. It follows that S 18 is contained in a proper filter on WI 8, so that the filter F (S18) generated by S 18 is proper; call this filter F (S). Let 8
A THIRD LOGIC
CLASSICAL PROPOSITIONAL LOGIC
332
be the congruence relation determined by F(S), and let s be the associated (canonical) homomorphism from W into W lB. Then sex) = 1 iff x E F(S), by Lemma 9.2.4. Consider the valuation Vs determined by s. Consider any valuation v that satisfies S; v is induced by I for some homomorphism I, where v(a) = Tiff I(a) = 1, v(a) = F iff I(a) = O. In particular, v satisfies S only if SIB c h- 1({I}). Suppose a E dom(vs); then either [a] E F(S) or [a] ¢ F(S); in the former case vs(a) = T, in the latter case vs(a) = F. In the first case, since F(S) ~ h;-1 ({ I}), [a] E h;-1 ( {I}), so I(a) = 1, so v(a) = T. (F(S) ~ h;-\ {I}) because h;-1 ({ I}) is a filter containing SIB, and F(S) is the smallest filter containing SIB.) In the latter case, [a] ¢ h;-I({I}), so I(a) = 0, so v(a) = F. In either case, v extends Vs. 0 Lemma 9.4.18 There is a Boolean valuation v such that a Boolean valid or Boolean contra-valid.
E
dom(v) only
if a is
Proof Consider the unital Boolean matrix whose underlying algebra is W IB, and consider the canonical homomorphism c from W into W lB. The associated valuation Vc has the property that a E dom(v c) only if c(a) = [a] = 1, or c(a) = [a] = O. Suppose a E dom(vc). In virtue of Lemma 9.2.5, every Boolean interpretation maps a into 1, since it maps [a] = 1 into 1, or every Boolean interpretation maps a into 0, since it maps [a] = 0 into O. It follows that a is either Boolean valid or Boolean contra-valid.
o Lemma 9.4.19 valid.
r
is Boolean unassailable only
if there is an a
E
r
that is Boolean
Proof Suppose r is unassailable. Then for every v E V(B), there is an a E r such that v(a) = T. Consider the minimal valuation defined in Lemma 9.4.17, Vc; there is an a E r such that vc(a) = T, but vc(a) = T only if a is Boolean valid. Thus, there is an a E r that is Boolean valid. 0
Corollary 9.4.20 Boolean logic is U -compact. Proof Suppose r is unassailable. Then by Lemma 9.4.19, there is an a E r that is Boolean valid. It follows that {a} is unassailable, but {a} is a finite subset of r. 0
Lemma 9.4.21 r Boolean entails A only entails a, provided that A is non-empty.
if there is an a
E
A such that
r
Boolean
Proof Suppose r Boolean entails A. Then for every v E V(B), if v satisfies r, then v satisfies a for some a E A. There are two cases to consider, according to whether r is or is not satisfiable. In the latter case, there is no valuation that satisfies r, so r entails every fOlmula, so it entails at least one formula a E A (assumed to be non-empty). In the former case, we can speak of the valuation Vi as described in Lemma 9.4.17; since Vi satisfies r, there is an a E A such that Vi satisfies a; call it ao. Now, consider any valuation v that satisfies r. By Lemma 9.4.17, v extends Vi, so in particular v(ao) = T. It follows that r entails ao. Thus, in either case, there is an a E A such that r entails a.
o Theorem 9.4.22 Boolean logic is strongly compact.
333
Proof Suppose r entails A. There are two cases to consider, according to whether A is or is not empty. In the former case, r entails the empty set, so r is unsatisfiable. According to Lemma 9.4.16, there is a finite subset X ofr such that X is unsatisfiable, so X entails 0 (= A). In the latter case, since A is non-empty, by Lemma 9.4.21, there is an a E A such that r entails a. Since Boolean logic is strongly equivalent to Frege logic, we have that r Frege entails a. But Frege logic is compact, so there is a finite subset X of r such that X Frege entails a, so by strong equivalence, X Boolean entails a. Thus there are finite subsets X of rand {a} of A such that X entails {a}. 0
Thus, we see that although Frege logic and Boolean logic are not strictly equivalent (so they provide different characterizations of symmetrical entailment), they are nevertheless both strongly compact. 9.5
A Third Logic
We now have two logics, Frege logic and (unital) Boolean logic, that are strongly equivalent, but not strictly equivalent. We now give a simple example of a logic that is weakly equivalent, but not strongly equivalent, to Frege logic. The logic we have in mind is characterized by a single matrix M = (A, F), where A is the four-element Boolean lattice (as in Boolean logic), and F = {l}. Here, a and b complement each other. In order to show that M-Iogic is weakly equivalent to Frege logic, we appeal to the following general lemma, which is similar to Lemma 9.2.1. Lemma 9.5.1 Let J and K be two classes of similar logical matrices, all appropriate for a given algebra of sentences. Then the following is a sufficient condition that every K-validformula is also J-valid. (Here, M = (A, F), N = (B, G).) For every M E J, and every a E A - F, there exists an N E K and a homomorphism hfrom A into B such that h(a) E B - G. (For every non-true J-proposition b, there is a homomorphism that maps b into a non-true K -proposition.) Proof Suppose a is not J-valid. Then there is an interpretation I into some J-mauix M = (A, F) such that I(a) E A-F. According to the hypothesized condition, there is a K-matrixN = (B, G), and a homomorphism h from A intoB, such that h(l(a)) E B-G. It follows that there is an interpretation-namely, the composition c of h and I-into a 0 K-matrix such that c(a) is not a true proposition, so a is not K-valid.
Theorem 9.5.2 Frege logic and M-logic are weakly equivalent, which is to say that for every formula a, a is Frege valid if and only if a is M-valid. Proof Consider the map h : A -+ 2 such that h(l) = h(a) = 1, h(b) = h(O) = O. One can readily show that h is a homomorphism. Moreover, every non-true proposition (there is only one, 0) is mapped into a non-true proposition (namely, 0). Thus, we have the antecedent condition of Lemma 9.5.1. It follows that every Frege valid formula is M-valid. Concerning the converse inclusion, consider the map h : 2 -+ A such that h(l) = 1, h(O) = 0; h is readily shown to be a homomorphism; it also maps each nontrue proposition into a non-u·ue proposition. Thus, applying Lemma 9.5.1, we have that every M-valid formula is also Frege valid. 0
CLASSICAL PROPOSITIONAL LOGIC
334
PRIMITIVE VOCABULARY AND DEFINITIONAL COMPLETENESS
Theorem 9.5.3 Frege logic is not strongly equivalent to M-logic. Proof It is sufficient to provide an argument r I- a that is Frege valid but not M-valid. The following is a Frege valid argument that is not M-valid: {p, q} I- p /\ q; for let /(p) = a, /(q) = b; then /(p /\ q) = 0; thus, v,(p) = v,(q) = T, but v,(p /\ q) = F. On the other hand, this is a standard valid argument of classical (Frege) logic. D
9.6
Axiomatic Calculi for Classical Propositional Logic
We have now provided formal semantic characterizations of Boolean logic and Frege logic, and we have shown that although they are not strictly equivalent, they are strongly equivalent. In other words, although they validate exactly the same asymmetrical arguments, they do not validate exactly the same symmetrical arguments. In addition to formal semantic characterizations, logicians have traditionally been interested in axiomatic characterizations of various logics, and have accordingly constructed various axiomatic calculi. For our purposes, an axiomatic calculus is an inductive definition of a particular subset of syntactic objects, where there are at least four sorts of objects that one might deal with, and hence at least four sorts of axiomatic calculi. These are listed as follows: (1) A unary calculus is an inductive definition of a subset of individual formulas.
(2) A binary calculus is an inductive definition of a subset of ordered pairs of formulas. (3) An asymmetrical sequent calculus is an inductive definition of a subset of ordered
pairs, whose first components are sets of formulas, and whose second components are individual formulas. (4) A symmetrical sequent calculus is an inductive definition of a subset of ordered pairs of sets of formulas. In each case, the set of objects defined by a calculus C are called the theses of C; this set is denoted by T(C). Thus, the theses of a unary calculus are individual formulas, the theses of a binary calculus are ordered pairs of formulas, etc. For the moment, we concentrate on unary calculi. Consider the set W of formulas of some particular sentential language L. An inductively defined subset S of W is specified by giving the base elements of S, and by giving the inductive generating clauses by which "new" elements are "added" to S. In the case of an axiomatic calculus, the base elements are called axioms, and the inductive generating clauses are called transformation (or derivation) rules. However, before discussing axiomatic calculi, it might be useful first to look at an already familiar inductively defined set, namely, the set W of formulas of a sentential language L. Specifically, W is an inductively defined subset of the set S of all strings of symbols in the vocabulary of L. The base elements are the atoms, and the inductive generating clauses are the clauses pertaining to the construction of molecules, the general clause being given as follows: (c) If
C
is an n-place connective, and if is a formula.
CWI W2 ... Wn
WI, W2, ... ,Wn
are formulas, then
335
Now, the relation between this sort of definition and ordinary mathematical induction may be seen as follows. The following is a theorem of set theory: (T) Let X be any set, let a be any element of X, and let g be any function from X into X; then there is a function II : (i) -+ X having the following properties: u(O) = a; lI(n
+ 1) =
g(u(n)).
In order to convert the earlier definition into an inductive definition in the strict sense, we proceed as follows. We begin with the power set \p(S) of the set of strings of symbols of L; this plays the role of the set X. Next, the base element a is the set A of atoms. Finally, our function g is defined as follows. First, for each connective Ci, define a function fi on \p(S) as follows: fi(X) = {CiXjX2 ... Xn : Xj,X2, ... ,Xn E X}; juxtaposition is multiplication (concatenation) in the semi-group of strings; thus, fi takes each n-tuple of strings of X and makes an (n + I)-tuple by prefixing the connective Ci. Next define the function g on \p(S) as follows: g(X) = UiEI fi(X) U X. Now, we have a set \p(S), a base element A, and a generator function g; the induced recursive function II : (i) -+ \p(S) is defined as usual. u(O) consists of just atoms; u(1) consists of atoms, as well as what one can obtain from atoms by applying connectives; lI(2) consists of all of these strings, as well as all those strings obtainable from these by applying the connectives; etc. The set W of formulas is the union of all of these sets: W = U{lI(i) : i E (i)}. So if we want to prove something about every formula in W, it is sufficient to prove it for u(O), and to prove it holds for lI(n + 1) on the condition it holds for lI(n); for then it holds for u(i) for all i E (i). Also, W E W iff W E u(i) for some i E (i).
9.7
Primitive Vocabulary and Definitional Completeness
Just as the well-formed formulas of a sentential language are an inductively defined subset of the set S of all shings of symbols, the theses of a (unary) calculus formulated in a given language form an inductively defined subset of the set W of all formulas. Now, we could construct an axiomatic calculus for classical logic on the basis of the standard sentential language, but since there are seven connectives in this language, this would prove to be too cumbersome. What we do instead is take a pmticular fragment of this language as the basis for our calculus. Afragment of a language L with connectives C is basically the set of formulas of L that are expressed in terms of a proper subset C* of C; that is, where C = {Ci : i E I}, C* = {Ci : i E J}, with J ~ I, but J =1= I. There are two sorts offragments that are considered. On the one hand, there are proper fragments; on the other hand, there are definitionally complete fragments, or simply complete fragments. Whether a particular fragment is complete depends upon the semantics one is considering. Intuitively, a fragment is complete if every connective of the parent language is "definable" in terms of the fragment language, where "definable" is relative to a particular semantics. In order to clarify this intuitive notion, we begin with the algebra W of formulas, and we construct a reduced algebra of formulas W*. The carrier set of Wand W* is the same, the set of all formulas of L; on the other hand, whereas the set F of operations on W is the family {oc : C E C} or {Oi : i E I}, the family F* of operations on W* is the family
CLASSICAL PROPOSITIONAL LOGIC
336
{oe : C E C*} or {Oi : i E J}. Note that although we have removed a number of connective operations, we have not correspondingly removed the connectors themselves. Now, we can define the polynomials on the respective algebras, denoted poly(W) and poly(W*), and we can define a congruence relation e on Wand W* relative to a given semantics K (a class of logical matrices appropriate for L); in particular, we say that two formulas a and p are congruent modulo e if for every K-interpretation h, h(a) = h(P), that is, a and Pare always interpreted as the same proposition. We are now in a position to define definitional completeness:
(DC) Let L be a sententiallanguage, with connectives {Ci : i E J}, let {Ci : i E J} (J c I) be a proper subset of connectives of {Ci : i E J}, and let K be a class of logical matrices appropriate for L. Then {Ci : i E J} forms a definitionally complete set of connectives for L relative to K if for every polynomial ¢ on W (of degree n), there is an n-place polynomial Iff on W*, such that for all a1, a2, ... , an E W, ¢(al, a2,···, an)
e Iff(al, a2,·· ., an).
THE CALCULUS BC
337
where Ie is the Boolean operation corresponding to c. Let us consider Ov as an example. Since ID is join v, condition (a) reduces to the following condition: (aD) h(ov(a, fJ)) = h(a) V h(fJ). But ov(a, fJ) is by definition equal to (a --+ fJ) --+ fJ, so we have (1) h(ov(a, fJ)) = h((a --+ fJ) --+ fJ);
but by hypothesis, h preserves the conditional operation, so (2) h((a --+ fJ) --+ fJ) = (h(a) --+ h(fJ)) --+ h(fJ),
where the arrow operation on a Boolean algebra is defined so that a --+ b = -a the question whether (aD) is true boils down to whether the following is true:
V
b. So
(aD*) a V b = (a --+ b) --+ b for all Boolean algebras A, for all a, bE A.
The particular logic that concerns us is Boolean logic. In this case, there are a number of different definition ally complete subsets of connectives, including the following: {--+,f}, {--+,~}, {V,~}, {/\,~}, {+-+, v,f}, {+-+, V,~}, {+-+,/\,f}, {+-+,/\,~}. The particular fragment we are going to employ uses --+ and I as its connectives; the associated set W of formulas is defined as follows:
9.8 The Calculus BC
(wI) Every atom is a wff. (w2) I is a wff. (w3) If a and pare wffs, then so is (a --+ P). (w4) Nothing is a wffexcept in virtue of clauses (wl)-(w3).
Having specified the language L, with its associated algebra of formulas and with its defined operations, we now describe the axiomatic calculus BC. Specifically, the class T(BC) of theses of BC is the smallest subset of formulas satisfying the following clauses:
The remaining connectives do not appear explicitly in our primitive language. In place of the explicit syntactic symbols, we have their set-theoretic surrogates, defined not in the language itself, but in the associated algebra of formulas. These operations on W are defined as follows:
(Al) (A2) (A3) (A4) (RI)
(D) ov(a, P) (K) oi\(a, fJ)
= (a --+ fJ) --+ fJ· = (a --+ (fJ --+ I)) --+ I·
(N) o~(a) = a --+ I. (B) o_(a, fJ) = [(a --+ fJ) --+ ((fJ --+ a) --+ I)] --+ (t) t
=I
--+
I·
I.
Note carefully the difference between Ov, which is a defined operation, and 0_", which is a primitive operation. Whereas 0-. is a map that sends each ordered pair of formulas (a, fJ) into the formula (a --+ fJ), Ov is a map that sends each ordered pair of formulas (a, fJ) into the formula ((a --+ fJ) --+ fJ). Thus, although we have a set-theoretic counterpart of disjunction, we do not have disjunction itself. In order to show that the above definitions are adequate for Boolean logic, one must show for each connective c, and for each Boolean interpretation h, that the interpretation preserves the corresponding operation 0e, which is to say
But this can be shown routinely in the theory of Boolean algebras. That the remaining definitions are adequate for Boolean logic is shown in a completely similar manner.
For all For all For all For all For all
a, PEW, a --+ (fJ --+ a) E T(BC). a, fJ, yEW, (a --+ (fJ --+ y)) --+ ((a --+ fJ) --+ (a --+ y)) E T(BC). a, fJ E W, ((a --+ fJ) --+ a) --+ a E T(BC).
a
E W,
I
--+
a
E T(BC).
a, fJ E W, if a E T(BC) and a --+ fJ E T(BC), then fJ E T(BC).
Adopting the traditional notation for axiomatic calculi, we write "BC f- a" in place of "a E T(BC)," or, when BC is understood, we simply write "f- a." If we also agree to drop universal quantifiers, we can rewrite the above clauses as follows: (AI) (A2) (A3) (A4) (RI)
f- a --+ (fJ --+ a). f- (a --+ (fJ --+ y)) --+ ((a --+ fJ) --+ (a --+ y)). f- ((a --+ fJ) --+ a) --+ a. f- I --+ a. If f- a and f- a --+ fJ, then f- p.
We now consider the adequacy of the proposed calculus with respect to Boolean Frege logic, which divides into two parts, soundness and completeness. To say that calculus BC is sound for Boolean logic is to say that every thesis ofBC is Boolean valid;
CLASSICAL PROPOSITIONAL LOGIC
338
to say that BC is complete for Boolean logic is to say that every Boolean valid formula is either a thesis of BC or is definitionally equivalent to a thesis of BC. Since every formula is definition ally equivalent to a formula written exclusively in the primitive vocabulary, this amounts to saying that every Boolean valid formula expressed in the primitive vocabulary is a thesis of Be. The following theorem concerns soundness.
Theorem 9.8.1 For all a E W,
if a
E T(BC), then a is Boolean valid.
Proof We proceed by induction, showing that every axiom of BC is Boolean valid, and
that the single rule (Rl) preserves Boolean validity. That each axiom is Boolean valid amounts to the fact that the cOlTesponding polynomial is identically equal to the unit element for every Boolean lattice (thus, for example, a -7 (b -7 a) = 1 is a theorem of Boolean lattices); this is shown routinely. That the rule of generation preserves Boolean validity may be seen as follows. Suppose a and a -7 fl are both Boolean valid. Then for every Boolean interpretation h, h(a) = h(a -7 fl) = 1, but since h is a homomorphism h(a -7 fl) = h(a) -7 h(fl), so h(a) -7 h(fl) = 1. Therefore 1 -7 h(fl) = 1, from which it follows that h(fl) = 1. Thus, fl is Boolean valid. 0 Our strategy in proving the completeness of BC is to construct the Lindenbaum algebra LA(BC) of the calculus BC, and show that it is a Boolean algebra the unit element of which is the equivalence class of theses of BC. LA(BC) is a quotient algebra of the algebra W of formulas, where the congruence relation e is defined as follows: (DI) aefl iff I- a
-7
fl and I- fl
-7
a.
Since e is a congruence relation on W, it sets up a homomorphism c, the canonical homomorphism, from W into W Ie, where cCa) = [a] for all a E W. Since W /e is a Boolean algebra, c is a Boolean interpretation; we can also form the unital Boolean matrix based on W Ie. Also, since the unit element of W /e is the equivalence class [t] of theses of BC, c has the following property: c(a) = 1 iff a E T(BC) (i.e., x E [tD. Thus, if any formula a is not a thesis ofBC, then c(a) f: 1, from which it follows that a is not Boolean valid, which is to say that every Boolean valid formula (in the primitive vocabulary) is a thesis of BC. The details of the completeness proof of BC are somewhat involved. We begin by proving various lemmas that demonstrate that e is indeed a congruence relation on W. (We use dots and colons to mark the (relatively) principal OCCUlTence of the alTOW connective. Thus, for example, "a -7 (a -7 a. -7 a)" is short for "a -7 ((a -7 a) -7 a)." A dot to the left of an alTOW substitutes for a right parenthesis, with the mated right parenthesis to be inserted at the nearest point where it is needed to make the formula well-formed. Similarly, a dot to the right of an anow substitutes for a left parenthesis. Colons are stylistic vmiants used when the alTOW is the main connective.)
Lemma 9.8.2 I- a
-7
a.
Proof
Suppose I- fl. By (AI), I- fl
Lemma 9.8.4 If I- a
-7
fl.
-7
fl and I- fl
(a
-7
fl), so by (RI), I- a
-7
y, then I- a
-7
339
-7
-7
Suppose I- a -7 fl and I- fl -7 y. Then by Lemma 9.8.3, I- a -7 (fl (A2), I- a -7 (fl -7 y). -7 (a -7 fl. -7 a -7 y), so by RI (twice), I- a -7 y.
Lemma 9.8.5 I- (fl
-7
y)
-7
(a
-7
fl.
-7
a
o
fl.
y.
Proof
-7
y), but by
Corollary 9.8.6
If I-
fl
-7
y, then I- (a
fl)
-7
0
y).
-7
Proof By (A2), I- a -7 (fl -7 y). -7 (a -7 fl. -7 a -7 y), but by (AI), I- (fl (a -7 .fl -7 y), so by Lemma 9.8.4, (fl -7 y) -7 (a -7 fl. -7 a -7 y). -7
(a
-7
-7
y)
Lemma 9.8.7 I- (a
-7
fl)
-7
(fl
-7
y.
-7
a
-7
-7
0
y).
Proof Immediate from Lemma 8.5 and (RI).
0
y).
Proof By Lemma 9.8.5, I- (fl -7 y) -7 (a -7 fl. -7 a -7 y), so by (A2) and (RI), I(fl -7 y. -7 a -7 fl) -7 (fl -7 y. -7 a -7 y). By (AI), I- (a -7 fl) -7 (fl -7 y. -7 a -7 fl), so by Lemma 9.8.4, I- (a -7 fl) -7 (fl -7 y. -7 a -7 y). 0
Corollary 9.8.8
If I- a
-7
fl, thell I- (fl
-7
y)
-7
(a
-7
y).
o
Proof Immediate from Lemma 9.8.7 and (RI).
Lemma 9.8.9 e is a congruence relation on W, which is to say for all a, fl, y, 8
E
W:
(1) aea. (2) If aefl, then flea.
(3) If aefl and fley, then aey. (4) If ae fl and ye8, then a -7 ye fl
-7
8.
(1) follows from Lemma 9.8.2, (2) from the definition of e, (3) from Lemma 9.8.4. Concerning (4), suppose aefl and ye8. Then I- a -7 fl, I- fl -7 a, I- y -7 8, I8 -7 y. So by Lemma 9.8.7, I- (fl -7 y) -7 (a -7 y) and I- (a -7 y) -7 (fl -7 y), and by Lemma 9.8.5, I- (fl -7 y) -+ (fl -7 8) and I- (fl -7 8) -+ (fl -+ y). So by Lemma 9.8.4, I- (a -+ y) -7 (fl -+ 8) and I- (fl -7 8) -+ (a -+ y), which is to say a -7 yefl -+ 8. Proof
o Lemma 9.8.10 If I- a, then aefl (ff I- fl. Proof Suppose I- a and aefl. Then I- a -+ fl, so by (RI), I- fl. Suppose I- a and I- fl. Then by Lemma 9.8.3, I- fl -+ a and I- a -+ fl, so aefl. 0
Corollary 9.8.11 The set T(BC) of theses ofBCform a congruence class modulo e.
o
Proof Immediate from Lemma 9.8.10.
Lemma 9.8.12 I- (a
Proof By (A2), I- a -7 (a -7 a. -7 a) -7: (a -7 .a -7 a) -7 (a -7 a), but by (AI), I- a -7 (a -7 a. -7 a) and I- a -7 (a -7 a), so by RI (twice), I- a -7 a. 0
Lemma 9.8.3 Ifl- fl, then I- a
THE CALCULUS BC
-7
.fl
-7
y) -+ (fl
-7
.a
-7
y).
Proof By (Al), I- fl -+ (a -7 fl), so by Lemma 9.8.7, I- (a -+ fl. -7 a -+ y) -+ (fl -+ .a -+ y). By (A2), I- (a -+ .fl -7 y) -+ (a -+ fl. -+ a -+ y), so by Lemma 9.8.4, I- (a -+ .fl -+ y) -+ (fl -+ .a -7 y). 0
THE CALCULUS D(BC)
CLASSICAL PROPOSITIONAL LOGIC
340
Lemma 9.8.13 I- (a -+
/3.
-+
/3)
-+
(/3
-+ a. -+ a).
Proof By Lemma 9.8.5, I- (/3 -+ a) -+: (a -+ /3. -+ /3) -+ (a -+ /3. -+ a), so by Lemma 9.8.12 and (R1), I- (a -+ /3. -+ /3) -+: (/3 -+ a) -+ (a -+ /3. -+ a). By (A3), I- (a -+ /3. -+ a) -+ a, so by Lemma 9.8.5 I- (/3 -+ a) -+ (a -+ /3. -+ a). -+ (/3 -+ a. -+ a), so by Lemma 9.8.4, I- (a -+ /3. -+ /3) -+ (/3 -+ a. -+ a). 0
Lemma 9.8.14 For every a, /3, r E W:
/3) -+ a B a; (/3 -+ r) B /3 -+ (a -+ r); (3) (a -+ /3) -+ /3 B (/3 -+ a) -+ a;
(2) a -+
Lemma 9.8.16 The Lindenbaum algebra LA(BC) of the calculus BCforms a Boolean ortholattice.
One lemma remains to be proved before we can state (and prove) the completeness theorem for calculus BC. We have already shown (Lemma 9.8.10) that the theses T(BC) form an equivalence class modulo B; we next show that T(BC) is in fact the unit element of the Boolean ortholattice cited in Lemma 9.8.16.
Proof
(AI) and (A3). Lemma 9.8.12. Lemma 9.8.13. (A4) and Lemmas 9.8.2 and 9.8.3.
As a consequence of Lemma 9.8.15 and Abbott's results, we have the following lemma.
o
(4) f-+aBa-+a.
Immediate from Immediate from Immediate from Immediate from
(d4) a A b = (a -+ (b -+ 0)) -+ 0; (d5) -a = a -+ O.
Proof Immediate from Lemma 9.8.15 and Abbott's results, noting that the non-primitive operations (conjunction, disjunction, etc.) correspond to the non-primitive operations (meet, join, etc.) on a bounded implication algebra; compare the definitions.
(1) (a -+
(1) (2) (3) (4)
341
Lemma 9.8.17 T(BC) is the unit element 1 ofLA(BC).
(11) (a -+ b) -+ a = a; (12) a -+ (b -+ c) = b -+ (a -+ c);
Proof By Lemma 9.8.10, T(BC) forms an equivalence class, which we denote [t]. To show that [t] is the unit element of LA(BC), in light of (dl) and (d2) above, it is sufficientto show that [a] -+ [t] = [a] -+ [a], for then [a] :s; [t] for all [a] E W lB. Here, t is an arbitrary thesis (for example, f -+ I), so by Lemma 9.8.3 (twice), I- (a -+ a) -+ (a -+ t). Also, by Lemma 9.8.2, I- a -+ a, so by Lemma 9.8.3, I- (a -+ t) -+ (a -+ a). Therefore, a -+ tBa -+ a, from which it follows that [a] -+ [t] = [a] -+ [a]. 0
(I3) (a -+ b) -+ b = (b -+ a) -+ a; (14) 0 -+ a = a -+ a.
Theorem 9.8.18 BC is complete for the class of Boolean valid formulas; that is, every Boolean valid formula (in primitive notation) is a thesis of Be.
o
Lemma 9.8.15 For every a, b, c E WI B:
Proof Each part follows from the corresponding part of Lemma 9.8.14, together with the following facts: for all a E WIB, there is an a E W such that a = [a]; [a] -+ [/3] = [a -+ /3]; 0 = [f]; [a] = [/3] iff aB/3. 0
At this point, we appeal to the work of J. C. Abbott (1967,1969,1976), who defines an implication algebra (IA) to be an algebra of type (2) such that the single binary operation -+ (implication) satisfies conditions (11)-(I3) above. He shows that every IA induces an upper-bounded join-semi-Iattice under the following definitions: (dl) 1 = a -+ a (for any a); (d2) a:S; b iff a -+ b = 1; (d3) a V b = (a -+ b) -+ b. A bounded IA is, by definition, an IA which has a lower bound as well as an upper bound, with respect to the partial ordering given by (d2). Equivalently, a bounded IA (BIA) is an algebra of type (2,0), where the two-place operation -+ and the zeroplace operation 0 satisfy conditions (11)-(14) above. Abbott shows that every BIA induces a Boolean ortholattice under the following definitions, together with (dl)(d3):
Proof We argue contrapositively: suppose a is not a thesis of BC, to show that a is not Boolean valid. It is sufficient to construct a Boqlean matrix M = (A, D) and an M-interpretation h such that h(a) E A-D. Define M so that A is LA(BC), D = ([t]}, and define h to be the canonical homomorphism from W into W IB, which is LA(BC). By Lemma 9.8.16, LA(BC) is a Boolean ortholattice; by Lemma 9.8.17, [t] = T(BC). Thus, M is a Boolean matrix, and h is a Boolean interpretation. Now, suppose a is not a thesis ofBC. Then a is not congruent to t, in virtue of Lemma 9.8.10, so [a] f= [t]. But h(a) = [a], so h(a) f= [t], so h(a) E A - D, from which it follows that a is not Boolean 0 valid.
9.9
The Calculus D(BC)
The calculus BC is a unary calculus, and so provides an axiomatization of the class of Boolean valid formulas. We next turn to the class of Boolean valid arguments (of the asymmetrical sort), and construct two diverse axiomatic calculi for this class. In axiomatizing the class of Boolean valid arguments, we can proceed either directly or indirectly. The direct method specifies an asymmetrical sequent calculus in a completely straightforward manner-by specifying the base elements, and by specifying the inductive generating schemes. This method is completely analogous to the unary calculus
THE CALCULUS D(Be)
CLASSICAL PROPOSITIONAL LOGIC
342
BC, the chief difference being that, whereas the theses of a unary calculus are single for-
mulas, the theses of an asymmetrical sequent calculus are ordered pairs, where the first components are sets of formulas and the second components are individual formulas. On the other hand, the indirect method of constructing the theses is based on the more traditional method offormal derivations (or formal proofs). This technique is indirect because it detours through a unary calculus in order to construct an asymmetrical sequent calculus. This is the method we examine in this section. Let C be a unary calculus based on language L with formulas W, and let r u {a} be any subset of W. Then a (formal) derivation of a from r in C is, by definition, any finite sequence (s 1, s2, ... , S n) of formulas of W having the following properties: (1) Sn = a. (2) Every element (line) of (Sl, S2, ... , sn) satisfies one of the following:
(i)
Si E
r;
(ii) (iii)
Si E
A(C);
Si
follows from (s j) j
Here, A(C) is the set of axioms of C. Recall that the transformation rules of a unary calculus provide the basis for the inductive generating function g; each transformation rule Ri is a function from \p(W) into \p(W); g is defined so that g(X) = U {Ri(X) : i E I} U X. Then the class of these of C is the set T( C) = U {u(i) : i E co}, where u is the inductively defined function from co into \p(W) such that u(O) = A(C), u(n + 1) = g(u(n)). To say that a formula a follows from formulas {Pj : j E J} according to a transformation rule of C is to say that a is an element of the set g( {Pj : j E J}). If there is a derivation d in C of a from r, then we say that a is deducible from r in C, and we write "C : r f- a" or simply 'T f- a" if the calculus C is understood. Given a unary calculus C, the associated calculus of derivations (or derivational calculus, or deduction calculus) is D(C), which may be given a strict inductive definition as follows. D( C) = U{d(i) : i E co}, where d(i) consists of all the ordered pairs (r, a) such that there is a derivation (Sl, S2, ... , sm) of a from r in C, and such that m ~ i; in other words, (r, a) E d(m) iff there is a derivation of a from r in C that has a length less than or equal to m. Also, (r, a) E D(C) iff there is a derivation of a from r in C of any (finite) length. Our notation so far is as follows: C f- a iff a E T(C); C : r f- a iff (r, a) E D(C). This notation is consistent since one can show that a E T(C) iff (0, a) E D(C), which is the content of the following general lemma. Lemma 9.9.1 Let C be any unmy calculus with theses T(C), and let D(C) be the associated derivational calculus. Then a formula a is a thesis of C (C f- a) iff there is a derivation of a from the empty set 0 in C (C : 0 f- a). That is, a E T(C) (ff (0, a) E D(C).
Proof Since T( C) is inductively defined as the set U {u(i) : i E co}, we proceed by induction to show that for all a E T(C), (0, a) E D(C). Base case: to say that a E u(O) is to say that a E A(C); then the sequence (a) consisting simply of a is a derivation of a
343
from 0; thus, (0, a) E D(C). Assume the inductive hypothesis: for all a E u(n), (0, a) E D(C). We wish to show that for all a E u(n + 1), (0, a) E D(C). Suppose a E u(n + l); then by the definition of u, a E g(u(n)), where g(X) = U {Ri(X) : i E I} U X. Thus, either a E u(n) or a E Ri(u(n)) for some i E I. In the former case, (0, a) E D(C) by the inductive hypothesis. In the latter case, there is an i E I such that a is obtained from formulas PI, P2, ... , Pm E u(n) by rule R i . Since the various Pj are in u(n), by the inductive hypothesis, (0, Pj) E D(C), so there are derivations dj of Pj from 0 in C. We now form a new sequence juxtaposing all the dj and then appending the formula a at the end; we claim that dl, d2, ... , dill> a is a derivation of a from 0 in C-every line is either an axiom of C or follows by some rule R i . Concerning the converse direction, we note that D( C) is inductively defined as the set U {d(i) : i E co}, so we proceed by induction. Base case: (0, a) E d(l), which is to say that there is a derivation of length less than or equal to 1 of a from 0; this can be only if a E A(C), from which it immediately follows that a E T(C). Assume the inductive hypothesis: for all formulas a, if (0, a) E den), then a E T(C). We wish to show that for all formulas a, if (0, a) E den + 1), then a E T(C). Suppose (0, a) E den + 1). Then there is a derivation of length m less than or equal to n + 1 of a from 0 in C. In the case that m < n + 1, the inductive hypothesis applies, so we have a E T(C); in the case m = n+ l, there is a derivation (Sl, S2, ... , sm) of a from 0 in C. Now, for all j < m, the sequence (Sl, S2, ... , sj) is a derivation of formula Sj from 0 of length strictly less than n + 1, so for all j < m, (0,sj) E den), so by the inductive hypothesis, for all j < m, Sj E T(C). Next, a (= sm) is either an axiom of C, in which case it is a thesis of C, or it follows from Sl, S2, ... , Sj(j < m) by a transformation rule Ri. In the latter case, since each S j(j < m) is a thesis of C, it follows that a is also a thesis of C. D In addition to Lemma 9.9.1, there are three more general lemmas that we appeal to in proving the completeness of D(BC). Lemma 9.9.2 If (r, a)
E D(C) and
r
~
fl, then (fl, a) E D(C).
Proof Immediate from the definition of formal derivation in calculus C.
Lemma 9.9.3
fr a E T(C), then for all r
~
W,
(r, a)
D
E D(C).
Proof Suppose a E T(C). Then by Lemma 9.9.1, (0, a) E D(C), so by Lemma 9.9.2, (r, a) E D(C). D
Lemma9.9.4 If(r,a;) E D(C) for i D(C), then (r, P) E D(C).
= 1,2, ... ,m, and (ru
{aI,a2, ... ,am}'P) E
Proof Suppose (r, a;) E D( C) for i E 111, and (r u {aI, a2, ... , am}, P) E D( C). Then for each i E m, there is a derivation di = (si, s~, . .. , S!l) of ai from r in C, and there is a derivation d = (Sl, S2, ... , Sk) of P from r u {aI, a2, ... , am} in C. FOlm the sequence d* = (01,02, ... , OJ) obtained from d by replacing each formula ai by the corresponding derivation di. Claim: d* is a derivation of P from r in C. D
The remaining lemmas leading up to the completeness theorem are peculiar to the calculus C; once again, we use a, P, r to range over formulas of W.
THE CALCULUS D(BC)
CLASSICAL PROPOSITIONAL LOGIC
344
Lemma 9.9.5 ({a
(1) if[a] E [deer)]' and [a] ~ [fJ], then [fJ] E [deer)];
-+ fJ, fJ -+ y}, a -+ y) E D(C).
Proof By Lemma 9.8.7, I- (a -+ fJ) -+ (fJ -+ y. -+ a -+ y), so by Lemma 9.9.3, {a -+ fJ,fJ -+ y} I- (a -+ fJ) -+ (fJ -+ y. -+ a -+ y). Also, in virtue of (Rl), {a -+ fJ,fJ -+ y} U {(a -+ fJ) -+ (fJ -+ y. -+ a -+ y)} I- a -+ y, so by Lemma 9.9.4, {a -+ fJ,fJ -+ y} I- a -+ y.
Lemma 9.9.6 I- a Proof
0
-+ (a -+ fJ· -+ fJ).
By (AI), I- a
-+ (fJ -+ a. -+ a),
and by Lemma 9.8.12, I- (fJ
-+ a. -+ a) -+
(a -+ fJ. -+ fJ), so by Lemma 9.8.4, I- a -+ (a -+ fJ· -+ fJ).
Lemma9.9.7 {a,fJ} I- (a -+.fJ
0
-+ y) -+ y.
(2) if[a], [fJ] E [deer)], then [a] A [fJ] E [deer)]. Proof (1) Suppose [a] E [de(r)] and [a] ~ [fJ]. Then by (D2), [a] -+ [fJ] = 1, so [a -+ fJ] [t], which is to say that a -+ fJ E T(BC). Therefore, by Lemma 9.9.9,
=
fJ E deer), so [fJ] E [de(r)]. (2) Suppose [a], [fJ] E [de(r)]. Then by Lemma 9.9.9, a, fJ E deer), which is to say that r I- a and r I- fJ. By Lemma 9.9.7, {a, fJ} I- (a -+ .fJ -+ f) -+ f, so by Lemma 9.9.2, r U {a, fJ} I- (a -+ .fJ -+ f) -+ f, so by Lemma 9.9.4, r I- (a -+ .fJ -+
f) -+ f, that is, (a -+ ·fJ -+ f) -+ f E deer)· Now, by definition, [a] /\ [fJ] [(a -+ .fJ -+ f) -+ fl· Thus, [a] A [fJ] E [de(r)].
Theorem 9.9.13 D(BC) is complete for Boolean logic, i.e., for every r Boolean entails a, then (r, a) E D(BC).
Proof By Lemma 9.9.6, I- a -+: (a -+ ·fJ -+ y) -+ (fJ -+ y), and I- fJ -+ (fJ -+ y. -+ y), so by Lemma 9.9.3, {a, fJ} I- a -+: (a -+ ·fJ -+ y) -+ (fJ -+ y), and
r
{a, fJ} I- fJ -+ (fJ -+ y. -+ y). Also, {a, fJ} U {a -+: (a -+ ·fJ -+ y) -+ (fJ -+ y), fJ -+ (fJ -+ y. -+ y)} I- (a -+·fJ -+ y) -+ (fJ -+ y), and {a,fJ} U {a -+: (a -+·fJ -+ y)-+ (fJ -+ y), fJ -+ (fJ -+ y. -+ y)} I- (fJ -+ y) -+ y, by (Rl) twice. But by Lemma 9.9.5, {(a -+ .fJ -+ y) -+ (fJ -+ y), (fJ -+ y) -+ y} I- (a -+ .fJ -+ y) -+ y, so by Lemma 9.9.2
Proof
{a,fJ} U {(a -+ .fJ -+ y) -+ (fJ -+ y), (fJ -+ y) -+ y} I- (a -+ ·fJ -+ y) -+ y, so by -+ .fJ -+ y) -+ y. 0
Lemma 9.9.4, {a, fJ} I- (a
We next define the set deer) of deductive consequences of the set r, and we prove a number of lemmas pertaining to it; intuitively, deer) consists of precisely those formulas that are deducible from r in BC: (D!) Let r be any subset of formulas in W; then deer) = {a E W :
(D2) Let deer) be defined as in (Dl); then [de(r)] = {[a] Lemma 9.9.8
r
(r, a)
E
U
D(BC)}.
E W/B: a E de(r)}.
~ deer).
Evidently, {a} I- a, so r U {a} I- a, by Lemma 9.9.2, therefore, if a {a} = r, and so r I- a, which is to say a E deer).
Proof
r
Lemma 9.9.9 If a
E deer), and a -+ fJ E
T(BC), then fJ
E
r, 0
E deer)·
Suppose a E deer) and a -+ fJ E T(BC). Then r I- a and I- a -+ fJ, so by Lemma 9.9.3 r I- a -+ fJ. By (RI), {a, a -+ fJ} I- fJ, so by Lemma 9.9.4, r I- fJ, so
Proof
fJ E deer).
Corollary 9.9.10 If a Proof
0 E deer), and aBfJ, then fJ E deer).
o
Immediate from Lemma 9.9.9 and definition of B.
Corollary 9.9.11 [de(r)] is de(r) factored by B. That is, [deer)] Proof
= de(r)/B.
Immediate from Corollary 9.9.10 and the definition of [de(r)].
0
Lemma 9.9.12 For any subset r of W, [de(r)] is a filter on LA(BC) (= W /B); that is,for all [a], [fJ] E W / B,
345
= [a A fJ] = 0
U {a} ~ W, if
We argue contrapositively. We suppose that (r, a) ¢ D(BC) to show that r does not Boolean entail a. It is sufficient to produce a Boolean matrix M = (A, D) and interpretation h such that h(y) ED for all y E r, but h(a) ¢ D (for then the associated valuation viz satisfies r but not a). Define M so that A is LA(BC), and so that D = [de(r)]. Define h to be the canonical homomorphism from W into W /B(= LA(BC» such that for all a E W, h(a) = [a]. As noted in the proof of Theorem 9.8.18, LA(BC) is a Boolean lattice, and h is a homomorphism; also, according to Lemma 9.9.9, [de(r)] is a filter on LA(BC), so the definitions are adequate. Now, suppose that (r, a) ¢ D(BC). Then a ¢ de(r) , so [a] ¢ [de(r)], so h(a) ¢ D. On the other hand, for all y E r, y E deer), so h(y) = [y] E [de(r)] = D. Thus, r does not Boolean entail a. 0 We have now shown that the derivational calculus D(BC) is complete for Boolean logic. That D(BC) is also sound for Boolean logic is the content of the following theorem. Theorem 9.9.14 For every ru {a}
~ W, if
(r, a)
E
D(BC), then r Boolean entails a.
Since D(BC) is inductively defined as the set U{d(i) : i E ill}, we proceed by induction. Base case: (r, a) E d(1); in this case, there is a derivation of a from r that has length less than or equal to 1; accordingly, a E T(BC) or a E r. But every thesis of BC is Boolean valid, and any Boolean valid formula is entailed by any set; also r entails a if a E r. Thus, r Boolean entails a. Inductive hypothesis: for any r U {a} ~ W, if there is a derivation of a from r of length less than or equal to n, then r Boolean entails a. Suppose there is a derivation of length less than or equal to n + 1 of a from r. If the length is strictly less than n + I, then the inductive hypothesis applies and r Boolean entails a. So consider the case in which the length is exactly n + 1. We have a derivation (d 1, d2, ... , d n+ 1) where d n+ 1 = a, and each di is either an axiom of BC, an element of r, or follows from (dj) j
ASYMMETRICAL SEQUENT CALCULUS
CLASSICAL PROPOSITIONAL LOGIC
346
it is entailed by r, or it follows from (dj) j5,/l by the sole transformation rule (RI). This means that, for some i,j, n, di = p, and d j = P -* a, for some formula p. As already noted, r Boolean entails both p and p -* a. So every Boolean filter that contains her) contains both h(P) and h(P -* a) = h(P) -* h(a). But every filter that contains band b -* a contains b A (b -* a), and b A (b -* a) = b A (-b V a) = (b A -b) V (b A a) = bAa. Since b A a ::; a, b A (b -* a) ::; a. It follows that every filter that contains her) also contains h(a), which is to say that r Boolean entails a. D 9.10
Asymmetrical Sequent Calculus for Classical Propositional Logic
Earlier we noted that one can axiomatize the class of valid arguments in one of two ways, directly or indirectly. Having shown an example of the indirect approach, we now consider an example of the direct approach. According to this approach, the theses of any particular axiomatic calculus are not individual formulas, as they are in unary calculi, but are ordered pairs of the form (r, a), where r is a set offormulas, and a is a formula. Because of the asymmetry of the components of the ordered pair, we call such calculi asymmetrical sequent calculi. The calculus we consider is the calculus ASC. This has the following base elements:
(RI) If (r, a) E T(ASC), and r ~ il, then (il, a) E T(ASC). (R2) If (r u {a}, P) E T(ASC), then (r, a -* P) E T(ASC). (R3) If (r,a) E T(ASC), and (ru (aLP) E T(ASC), then (r,P) E T(ASC).
iff a
E
T(BC).
Proof Conceming the "if" direction, since T(BC) is inductively defined, we proceed by induction, showing that every axiom has the property, and that the transformation rule preserves this property. The base case is trivial in virtue of (AI). That the transformation rule preserves this property follows immediately as a special case of Lemma 9.10.2. Thus, every thesis of BC has this property. Conceming the "only if" direction, we first note that ASC is sound, so if 0 f- a, then oBoolean entails a, which is to say that a is Boolean valid. As we have already shown, BC is complete, so a is a thesis ofBC. Thus, if0 f- a, then a E T(BC). D
Lemma 9.10.4 For any formula a, {a} f- a. Proof By Lemma 9.8.2, a -* a E T(BC), so by Lemma 9.10.3, 0 f- a -* a, so by (RI), {a} f-a-*a.Also,by(A2), {a,a-*a} f-a,soby(R3), {a} f-a. D
Lemma 9.10.6 Ijr f- a, and a -* P E T(BC), then r f- p.
Henceforth, in order to render our notation similar to traditional axiomatic notation, we write 'T f- a" instead of "(r, a) E T(ASC)." Thus, we can write the above clauses as follows: 0 f- a for every axiom a ofBC. {a, a -* P} f- p. If r f- a, and r ~ il, then il f- a. If r u {a} f- p, then r f- a -* p. Ifrf-a,andru{a} f-p,thenrf-p.
Proof Suppose r f- a and a -* P E T(BC). Then by Lemma 9.10.3,0 f- a -* p, so by
(RI), r f- a -* p, so by Lemma 9.10.2, r f- p.
D
Corollary 9.10.7 ffr f- a, and aep, then r f- p. Proof Immediate from Lemma 9.10.6 and the definition of e (i.e., aep iff a -* p E T(BC) and p -* a E T(BC)). D
Lemma 9.10.8 (a, P} f- a A p.
We begin by stating the soundness theorem for calculus ASe.
if r f-
a, then r Boolean en-
tails a. Proof This is routinely shown by induction, by showing that the base elements correspond to Boolean valid arguments, and showing that the rules preserve this property. D
Lemma 9.10.2 ffr f- a, and r f- a -* p, then r f- p.
Lemma 9.10.3 For any formula a, 0 f- a
Proof Suppose r f- a -* p and r f- p -* 8. Then by (RI), r u {a} f- a -* p and ru {a} f- p -* 8. Also, by Lemma 9.10.4, {a} f- a, so by (RI), ru {a} f- a. Therefore, by Lemma 9.10.2, r u {a} f- p, so by Lemma 9.10.2 again, r u {a} f- 8, so by (R2), r f- a -* 8. D
Its inductive clauses are as follows:
Theorem 9.10.1 For evelY subset r u {a} of formulas,
Proof Suppose r f- a and r f- a -* p. Then by (RI), ru {a} f- a -* p. Also, by (A2), {a,a -* P} f- p, so by (RI), ru {a,a -* P} f- p, that is, ru {a} U {a -* P} f- p. So by D (R3), r u {a} f- p, and by (R3) again, r f- p.
Lemma 9.10.5 Ijr f- a -* p, and r f- p -* 8, then r f- a -* 8.
(Al) (e, a) E T(ASC) for every axiom a of calculus Be. (A2) ({a, a -* P}, P) E T(ASC).
(Al) (A2) (Rl) (R2) (R3)
347
Proof By Lemma 9.9.6, p -* (P -* y. -* y) E T(BC), and a -* (a -* (P -* y). -* (P -* y)) E T(BC), so by Lemma 9.10.3, 0 f- P -* (P -* y. -* y) and f- a -* (a -* (P -* y). -* (P -* y)), so (P} f- P -* (P -* y. -* y), and {a} f- a -*
o
(a -*
(P -*
y). -*
(P -*
y)), by (RI), so by Lemmas 9.10.4 and 9.10.2, and finally (RI),
{a,p} f- (P -* y) -* y, and {a,p} f- a -* (P -* y). -* (P -* y), so by Lemma 9.10.5, (a, P} f- (a -* .p -* y) -* y, so in particular {a, P} f- (a -* .p -* f) -* f, which is to say {a,p} f- a A p. D
Definition 9.10.9 Let r be any subset offormulas; then deer) is the set offormulas a such that r f- a, i.e., deer) = {a : r f- a}.
348
SEMI-BOOLEAN ALGEBRAS
CLASSICAL PROPOSITIONAL LOGIC
Lemma 9.10.10 Let Wiebe the Lindenbaum algebra of unary calculus BC, alld let [de(n] be the set of equivalence classes of formulas in de(n, for some subset r of formulas. Then [de(n] is afilter on W Ie. Proof Suppose [a] E [de(n] and [a] ::; [Pl. Then by Lemma 9.10.6, r I- a, and by the theorems about W Ie proved previously, a -+ P E T(BC). So by Lemma 9.10.3 and (R1), r I- a -+ P, so by Lemma 9.10.2, r I- P, so [P] E [de(n]. Next, suppose [a], [P] E [de(n]. Then by Lemma 9.10.7, r I- a, and r I- p. By Lemma 9.10.8, {a,p} I- a /\ P, so r U {a,p} I- a /\ P and r U {a} I- a, by (R1), so by (R3) twice, r I- a /\ P, from which it follows that [a] /\ [P] = [a /\ P] E [de(n]. Thus, [deer)] is a ~oo~. D Theorem 9.10.11 ASC is complete for Boolean logic. That is, for any subset r U {a} offonnulas, r Boolean entails a only if (r, a) is a thesis of ASC. Proof Completely analogous to the proof of Theorem 9.9.13.
9.11
D
Fragments of Classical Propositional Logic
We have now considered the class of Boolean valid formulas and the class of Boolean valid arguments. An additional interesting logical question concerns various special subclasses of these, where each subclass is determined according to a proper subset C* of the set C of connectives of the standard sentential language. For example, we might be interested in those Boolean valid formulas (arguments) that exclusively involve the conditional connective, or exclusively involve conjunction and disjunction, or exclusively involve the non-negative connectives (those besides ~ and f). The subclass of valid formulas (arguments) of a logic that involve a proper subset of connectives is sometimes referred to as afragment of the logic. We have, in fact, already discussed a particular fragment of Boolean logic in our discussion of axiomatic calculi, the fragment involving just the conditional (-+) and the falsity constant (f). Also, as we noted, -+ and f form a definitionally complete set of connectives for classical logic, so the (-+, f) fragment of classical logic is, in some sense, not a proper fragment. Where C is a proper subset of connectives, we say that the C-fragment of a logic L is a proper fragment if C does not form a definitionally complete set of connectives for L. Thus, for example, although the (-+, f) fragment of classical propositional logic is not a proper fragment, the (-+, f) fragment of intuitionistic propositional logic is a proper fragment. There are basically two interesting proper fragments of classical logic, respectively called the implicative fragment, which concerns the conditional connective, and the positive fragment, which concerns all connectives except ~ and f. We could be through simply by saying that the implicative fragment of Boolean logic consists of all the Boolean valid formulas (arguments) that exclusively involve implication, and that the positive fragment of Boolean logic consists of all the Boolean valid formulas (arguments) that do not involve negation or falsity. This is not very satisfying, however. Ideally, we would like an independent specification of each of these in algebraic terms, and an independent specification of each of these in axiomatic terms. We do just that.
9.12
349
The Implicative Fragment of Classical Propositional Logic: Semi-Boolean Algebras
Just as the full classical propositional logic is algebraically characterizable in terms of Boolean lattices (Boolean algebras), its implicative fragment is characterizable in terms of semi-Boolean algebras, first studied in detail by Abbott (1967, 1969). Semi-Boolean algebras come in two forms, which are dual to one another. An upper semi-Boolean algebra (USBA) is, by definition, a join-semi-Iattice with a greatest element, which is such that every principal filter is a Boolean lattice (in this context, a distributed complemented lattice). By contrast, a lower semi-Boolean algebra (LSBA) is, by definition, a meet-semi-Iattice with a least element, which is such that every principal ideal is a Boolean lattice. Semi-Boolean algebras (SBAs) are equationally definable, where the characteristic equations are given as follows: (el) (ab)a = a; (e2) a(be) = b(ae); (e3) (ab)b = (ba)a. Just as a semi-group can be written both additively and multiplicatively, axioms (e1)(e3) can be written (or interpreted) in two different ways, either implicatively or subtractively. Briefly, an implication algebra (IA) is an algebra (A, F) of type 2, where the sole binary operation (denoted -+) satisfies the following conditions: (i1) (a -+ b) -+ a = a; (i2) a -+ (b -+ e) = b -+ (a -+ e); (i3) (a -+ b) -+ b = (b -+ a) -+ a. On the other hand, a subtraction algebra (SA) is an algebra (A, F) of type 2, where the sole binary operation (denoted -) satisfies the following conditions: (s1) a - (b - a) = a; (s2) (e - b) - a = (e - a) - b; (s3) b - (b - a) = a - (a - b). Notice that, whereas (i1)-(i3) are written in the same manner as (e1)-(e3), simply by infixing an arrow, (s1)-(s3) are written in reverse order, following the usual convention concerning how to read subtraction. Note that whereas (i1)-(i3) characterize classical implication, (s1)-(s3) characterize set difference (A - B consists of precisely those elements of A not in B). Having noted that set difference is dual to material implication, we leave subtraction algebras aside and concentrate on implication algebras. We begin by connecting lAs with upper semi-Boolean algebras. First of all, every USBA induces an lA, where the implication operation -+ is defined as follows: (dO) a -+ b
= (a V b)lb.
Here, alb is, by definition, the complement of a in the principal filter generated by b. Since by the definition of a USBA, this filter is a Boolean lattice, a Ibis well-defined. On the other hand, every IA induces a USBA according to the following definitions:
350
AXIOMATIZING THE IMPLICATIVE FRAGMENT
CLASSICAL PROPOSITIONAL LOGIC
(dl) 1 = a -+ a (for any a); (d2) a ~ b iff a -+ b = 1; (d3) a V b = (a -+ b) -+ b. Now, every Boolean lattice is a semi-Boolean algebra (both a USBA and an LSBA), so every Boolean lattice is an IA (and an SA). It can be shown in fact that the arrow operation is simply the Boolean arrow (a -+ b = -a V b). Next, since IAs are equationally definable (they are a variety), every implication subalgebra of a Boolean lattice is an IA; an implication subalgebra is a subset closed under the arrow operation. An IA that is a sub algebra of a Boolean lattice is called a Boolean IA. According to the basic structure theorem for IAs, the class of Boolean IAs is canonical, which is to say that every IA is isomorphic to a Boolean IA. To state matters equivalently, every IA can be embedded into a Boolean lattice preserving the arrow operation. This structure theorem for IAs (which can be dualized to SAs) has the following immediate result: every equational theorem of the theory of Boolean ortholattices that is expressible purely in terms of the Boolean arrow is a theorem of the theory of IAs. This can be argued as follows. Suppose P = Q is not a theorem of IA theory, where P and Q are terms in the corresponding language. Then by the completeness of first-order logic, there is an IA which fails to satisfy P = Q; call it A. According to the embedding theorem cited above, A can be embedded into a Boolean ortholattice; call it B, and call the embedding e. Thus, for every a, b E A, e(a -+ b) = e(a) -+ e(b) (= -e(a) V e(b», and e(a) = e(b) only if a = b. Let p,q be the polynomials on A that correspond to P, Q respectively. Since A does not satisfy P = Q, it follows that there are sequences (al, ... , am), (bl, ... , bn ) of elements of A such that p(al, ... , am) :j:: q(bl, ... , bn ). Since every polynomial is recursively defined, e(p(al,"" am» = p*(e(ar), ... , e(am», and e(q(bl,···, bn » = q*(e(br), ... , e(bn », where p*, q* are the polynomials on B corresponding to p, q, respectively. Since e is one-one, and since p(al, ... , am) :j:: q(bl, ... , bn ), we have p*(e(ar), ... , e(am» :j:: q*(e(br), ... , e(bn ». Since p*, q* correspond to P, Q, we have that B does not satisfy P = Q, so by the soundness of first-order logic, P = Q is not a theorem of Boolean lattice theory. We thus see that the theory of IAs provides a complete algebraic characterization of Boolean implication. This is a useful result if we wish to construct an axiomatic calculus for the implicative fragment of Boolean logic, as we see presently.
9.13 Axiomatizing the Implicative Fragment of Classical Propositional Logic We begin by constructing a unary axiomatic calculus BC*, which is based on a sententiallanguage L in which there is exactly one connective, the conditional. In order to show that BC* is adequate, we must show that every thesis of BC* is Boolean valid, and we must show that every Boolean valid implication formula is a thesis of BC*. The calculus BC* is specified as follows: (AI) I- a -+ (P -+ a); (A2) I- [a -+ (P -+ y)] -+ [(a -+ P) -+ (a -+ y)];
351
(A3) I- [(a -+ p) -+ a] -+ a;
(Rl) if I- a and I- a -+
p, then I- p.
Here I- a means that a is a thesis of BC", i.e., a E T(BC"). Notice that BC" is very similar to BC, the only difference being the absence in BC* of axiom schema (A4): (A4) I-
f
-+
a.
That BC* is sound follows from the fact that BC is sound, since every thesis of BC* is a thesis of Be. In order to show that BC* is complete for the implicative fragment of Boolean logic, we appeal to results 9.8.2-9.8.15, none of which appeals to (A4), except for the fourth parts of Lemmas 9.8.14 and 9.8.15, and we appeal to Lemma 9.8.17. As a result of these lemmas, we have that the Lindenbaum algebra of BC* is an implication algebra, the unit element of which is the equivalence class of theses of BC*. In virtue of the embedding theorem for lAs, LA(BC*) can be embedded into a Boolean ortholattice preserving implication. Now, suppose that a is not a thesis ofBC*. Then [a] :j:: [t] = 1. Consider the function h, which is the composition of the canonical homomorphism of the algebra of formulas into LA(BC*) with the embedding function e. Since h is the composition of two homomorphisms, it is a homomorphism. Since [a] :j:: [t], and since e is one-one, we have h(a) = e[a] :j:: e[t] = eel) = 1. Since h(a) :j:: 1, a is not Boolean valid. In order to discuss whether the associated deduction calculus D(BC*) is sound and complete for the class of Boolean valid implicative arguments, we need to discuss the implicational analog of Boolean filters. Specifically, we define an implicational filter on an IA, A, to be any subset F of A satisfying the following conditions: (iF!) 1 E F; (iF2) a E F and a -+ b E F only if b E F. In other words, an implicational filter is a set of propositions containing the tautological proposition 1, and closed under modus ponens. Note that if A is in fact a Boolean lattice, then every implicational filter on A is a lattice filter on A, and conversely. Recall that every Boolean filter F is the inverse image h- l ( {I} ) of a Boolean homomorphism h, since for any Boolean algebra B and any filter F on B, there is a Boolean algebra B* and a homomorphism h from B into B* such that F = h- l ({ I}). As shown by Abbott (1967), the same is true for implicational filters; specifically, he shows that, given any implicational filter F on A, the relation B on A, defined so that aBb iff a -+ b, b -+ a E F, is a congruence relation on A, and moreover, a E F iff aB1. This allows us to prove the completeness of D(BC*), as follows. (The surrounding details are exactly as they are for D(BC).) We begin with the Lindenbaum algebra of BC*, which we have previously shown to be an implication algebra, the unit element of which is the congruence class of theses of BC*. We next consider any subset r of formulas, and we consider its deductive closure deer), as before, and we consider the corresponding equivalence class [de(n], as before. The key lemma is that [de(n] is an implicational filter on LA(BC*). In virtue of the above mentioned theorem, there is an lA, A, and a homomorphism h from LA(BC*) into A such that [de(n] = h- 1 ({ I}). By the embedding theorem for IAs,
we know that A can be embedded into a Boolean ortholattice; call it B, and call the embedding e. Now, suppose that a is not a deductive consequence ofr. Then a ¢ deer), so [a] ¢ [de(r)], so h([a]) t= 1, so e(h[a]) t= 1. Consider the composition of e and h, and the canonical homomorphism, call it I, i.e., I(a) = e(h[a]). Then I is a Boolean interpretation such that I(Y) = 1 for all y E r, and such that I(a) t= 1. It follows that r does not Boolean entail a. The details of this proof are left as an exercise for the reader. Also left as an exercise for the reader is the proof that (A4) may not be derived from (Al)-(A3), (Rl), which simply amounts to showing that not every implication algebra is a Boolean algebra. 9.14
The Positive Fragment of Classical Propositional Logic
By the positive fragment of a propositional logic, we mean the subset of valid formulas (arguments) that involve the so-called positive connectives, which include conjunction, disjunction, conditional, biconditional, and the truth constant. By contrast, the negative connectives are negation and the falsity constant. As we have seen, whereas the full Boolean logic is characterized by Boolean lattices, the implicative fragment of Boolean logic is characterized by semi-Boolean algebras or implication algebras. In the case of the positive fragment of Boolean logic, the associated mathematical structures are Boolean rings, or, more specifically, their duals. This identification constitutes one of those remarkable coincidences in logic and mathematics. Recall that a ring is an algebra having an addition operation +, a multiplication operation x, an additive inverse operation *, and an additive identity element O. A ring must satisfy the following postulates (we write ab in place of ax b): (Rl) (R2) (R3) (R4) (R5) (R6)
THE POSITIVE FRAGMENT
CLASSICAL PROPOSITIONAL LOGIC
352
a + (b
+ e) = (a + b) + e; a + b = b + a; a + 0 = a, 0 + a = a; a + a* = 0, a* + a = 0; a(b + e) = ab + ae; (a + b)e = ae + be; a(be) = (ab)e.
An element b of a ring A is said to be idempotent if b2 = b, where b2 = bb. A Boolean ring is, by definition, a ring in which every element is idempotent. Equivalently, it is a structure satisfying (Rl)-(R6) as well as (R7) aa = a. Given these axioms, one can show first of all that every Boolean ring is a commutative ring, which is to say that for all a, b E A, ab = ba. One can also show that every element b E A is its own additive inverse: b* = b. Perhaps the most common example of a ring is a ring of subsets of some set X. Where X is any non-empty set, a collection S of subsets of X is a ring of subsets of X just when S is closed under the formation of unions and set differences. That is, (RS) if A E S, and B E S, then A - B E S and A U B E S.
353
S is afield of subsets of X if S is a ring of subsets of X and YES for all YES. A ring of sets is an algebraic ring under the following definitions:
(FSl) A + B = (A - B) U (B - A) [= (A U B) - (A n B)]; (FS2) A x B = An B = A - (A - B). It is easy to verify that every ring of sets is an algebraic ling, indeed a Boolean ring. What is the relation between Boolean rings and the positive fragment of Boolean logic? First of all, according to the customary reading of Boolean rings, every such ring induces a lattice with a least element, according to the following definitions: (dl) least element = 0; (d2) a V b = a + b + ab; (d3) a A b = abo Every lattice has a dual, so we can just as easily say that every Boolean ring induces a lattice with a greatest element, according to the following definitions: (dl') greatest element = 0 (the lattice unit equals the ring zero); (d2') a V b = ab; (d3') a A b = a + b + abo Observe that in the former case, the ring + is interpreted as binary exclusive disjunction (aVb = (a V b) A -(a A b) = (a A -b) V (b A -a». On the other hand, in the latter case, things are turned upside down, so that V becomes A and A becomes V; a + b = (a A b) V -(a V b) = a ~ b; that is, + is interpreted as the biconditional. The remaining positive connective operations may be defined as follows: (d4) a -+ b = b + ab; (d5) 1 = a + a (for any a). Recall that in a Boolean ring, every element is its own additive inverse: a + a = O. Just as every implication algebra can be embedded into a Boolean lattice preserving the implication operation, every Boolean ring can be embedded into a Boolean lattice preserving the ring operations (and hence preserving all the positive connective operations). In order to show this, one first shows that every Boolean ring induces an implication algebra under definition (d4). One next shows that this implication algebra is a lattice, not merely a join-semi-Iattice, where the meet operation is given by definition (d3). One next appeals to Abbott's (1967, 1969) work on implication algebras, according to which every IA can be embedded into a Boolean lattice preserving the implication operation, and preserving all existing infima (meets) in the IA. Since every IA that is a lattice is a Boolean ring, under the following definitions, the Abbott embedding preserves the Boolean ring operations: (d6) a + b = (a (d7) a x b = (a
-+ -+
b) b)
A
(b -+ a); b.
-+
In virtue of the embedding theorem for Boolean rings, we have that the theory of Boolean rings provides a complete algebraic characterization of the positive fragment of the theory of Boolean lattices: every equational theorem of Boolean lattice theory that
354
CLASSICAL PROPOSITIONAL LOGIC
is expressible in terms of the positive operations is a theorem of Boolean ring theory. The argument is completely analogous to the argument for implication algebras given earlier. Concerning the axiomatic characterization of the positive fragment of Boolean logic, the unary calculus BC+ is obtained from BC* by adding the conjunction symbol to the language, and by adding the following axiom schemata: (AS) I- (a /\ fJ) ~ a;
(A6) I- (a /\ fJ) (A7) I- (a
~
~
.fJ
fJ;
~ y) ~
(a
~
(fJ
~
.fJ /\ y)).
An alternative to (AS) and (A6) is the following: (AS+6) I- (a
~
·fJ
~ y) ~
(a /\ fJ. ~ y).
That BC+ is sound follows from the fact that BC* is sound (which follows from the fact that BC is sound), together with the fact that axiom schemata (AS)-(A 7) are Boolean valid. The completeness of BC+ is shown by showing that the Lindenbaum algebra of BC+ is a lattice ordered implication algebra, and hence a Boolean ring. That LA(BC+) is an IA has already been shown in connection with BC*. In order to show that it is lattice ordered, it is sufficient to show that [a] /\ [fJ] is the greatest lower bound of [a] and [fJ]. That it is a lower bound follows from (AS) and (A6), noting that [a] /\ [fJ] ~ [a] (for example) if and only if [a /\ fJ] ~ [a] iff [a /\ fJ] ~ [a] = 1 = [t] iff [(a /\ fJ) ~ a] = [t] iff (a /\ fJ) ~ a E T(BC+). That [a] /\ [fJ] is the greatest lower bound follows from (A7). For suppose [a] ~ [fJ] and [a] ~ [y]. Then I- a ~ fJ and I- a ~ y, so by (AI) and (R1), I- fJ ~ (a ~ y), so by Lemma 9.8.12 and (R1), I- a ~ (fJ ~ y), so by (A7) and (R1), I- a ~ (fJ ~ .fJ /\ y), so by (A2) and (Rl), I- (a ~ fJ) ~ (a ~ .fJ /\ y). But I- a ~ fJ, so by (R1), I- a ~ (fJ /\ y), from which it follows that [a] ~ [fJ] /\ [y]. We thus have that LA(BC+) is a Boolean ring. The completeness of BC+ for the positive fragment follows from this together with the fact that every Boolean ring can be embedded into a Boolean lattice. For suppose a is not a thesis of BC+. Then [a] =1= [t], so e([a]) =1= e([t]) = 1. It follows that there is a Boolean interpretation I, defined so that I(a) = e([a]), such that I(a) is not designated, which is to say that a is not Boolean valid (details may be filled in by the reader). Next, we may show that the deductive calculus C(BC+) associated with unary calculus BC+ is both sound and complete as follows. Soundness is routine and may be proved by induction on the length of the derivation. Completeness is shown in three steps. First of all, one shows that the deductive closure deer) of any subset r of formulas, when factored by the congruence relation B, is a lattice filter on LA(BC+). As the second step, one shows that every filter on a Boolean ring (thought of as a lattice ordered implication algebra) is the inverse image h- 1 ( { 1}) of a Boolean ring homomorphism h. This is shown as follows. Consider any filter F on a Boolean ring A. Define a relation B on A so that aBb iff a +-+ b E F. Next, we show that B is a congruence on A (the result is a general ring-theoretic result, when F is a ring ideal), and we also show that a E F
THE POSITIVE FRAGMENT
355
iff aBl. One then defines the quotient algebra AlB and the associated canonical homomorphism from A into AlB; this is the desired Boolean ring homomorphism. In the last step, one appeals to the embedding theorem for Boolean rings. The completeness proof for D(BC+) is then completely analogous to the completeness proof for D(BC+). Once again, the details are left as an exercise for the reader.
MODAL LOGICS
357
At the same time it would be natural to switch from D to 0 as primitive, and we officially do so.
10 MODAL LOGIC AND CLOSURE ALGEBRAS 10.1
Exercise 10.1.1 Show that the formulations K, K' and K" are all equivalent in the sense that they have the same theorems. (Of course, one has to use the translations given by the dual definitions of D into 0 and vice versa in going back and forth to K". (Hint: cf. Hughes and Cresswell 1968, Ch. VII.)
Modal Logics
Other standard modal logics which we will want to discuss are:
The reader who has no previous familiarity with modal logic may wish to consult Hughes and Cresswell (1968, 1996). The classic, but now dated reference, is Lewis and Langford (1932). The modal logics which we shall consider are only a few among many. The reader wanting to know more about the many should consult Chellas (1980), Segerberg (1971) or Gabbay (1976). Incidentally, this section is very much influenced by Lemmon (1966) and Lemmon with Scott (1977). The weakest modal logic which we shall consider is the system K (for Kripke). Its morphology is like that of the system TV of classical sentential calculus except that it has an additional unary connective D (Dp is read as "necessarily p"). In terms of this one can define Op (read "possibly p") as ~D ~p (or alternatively 0 could have been taken as primitive, setting Dp = ~O ~p). As axioms we take all instances of the following:
T (sometimes called M): K + Dp:J P (or T + Dp:J DDp (or OOp:J Op); B: T + p:J DOp (or ODp :J p); S5: S4 + Op:J DOp (or ODp:J Dp).
These logics are related to one another as shown in Figure 10.1 (where -+ means subsystem), as can be easily seen. S5
B
(1) tautologous schemes of TV; (2) D(p:J l/f) :J (Dp :J Dl/f) (Axiom of Kripke) ((p:J l/f) is to be understood as always abbreviating (~
As rules we have: (4) If P and p :J l/f, then l/f (MP). p, then I- Dp (Necessitation).
== (Dp 1\ Dl/f) (in this section, p == l/f is to be understood as abbreviating (p :J l/f) 1\ (l/f :J p», and the
following is to be added as a rule (additional to MP and necessitation): (6') If P == l/f, then Dp == Dl/f
(Necessitation of provable equivalents).
Alternatively we could replace necessitation with necessitation of provable equivalents, but we would have to add the axiom scheme D (p V ~ p) or some such-cf. Hughes and Cresswell (1968, Ch. VII). We formulate K" as a minor and somewhat dual variant of K'. Thus replace (2') with (2") O(p V l/f)
== (Op V Ol/f),
and replace (6') with (6") if
p == l/f, then Op == Ol/f.
"'/ t
S4
K
(5) If I-
(2') D(p 1\ l/f)
/'" T
p V l/f».
There are two other equivalent formulations that are often preferred in algebraic treatments. Thus let K' be formulated just like K except that (2) is replaced with
p :J Op);
S4:
FIG. 10.1. Relations between K, T, B, S4, and S5 In defining the notion of a deduction in K or any of the other modal logics, we must be careful with the rule of necessitation. It is best viewed as a rule that produces theorems from theorems, rather than consequences from premises. The inference (7) pI- Dp
is just plainly invalid. Thus, let p be any contingent truth, e.g., that you are reading this book. Although this is true, it is hardly necessary (close the cover and go get a cup of coffee if you do not believe us). On the other hand, the rule of necessitation is perfectly in order when applied to theorems, which are presumably logically true and hence necessary. Thus letting S be one of the systems K, T, B, S4, S5, we define a proof in S as a finite sequence of sentences of S, each one of which is either an axiom of S or follows from preceding sentences by either MP or necessitation. A theorem of S is then the last line of a proof in S (we write I-s p). A deduction in S of the sentence p from the set of sentences r will then be a finite sequence of sentences of S, each one of which is either
MODAL LOGIC AND CLOSURE ALGEBRAS
358
BOOLEAN ALGEBRAS
a member of r, or a theorem of S, or follows from preceding sentences by MP. We then say that cjJ is deducible (is a deductive consequence) from r in S when cjJ is the last line of a deduction from r in S (we write r I-s cjJ). In the sequel we shall assume that any "modal logic" is an extension of K, shares the same rules (MP and necessitation), and that the class of theorems is closed under substitution. In the literature these are often called "normal" or "classical" modal logics.
10.2
Boolean Algebras with a Normal Unitary Operator
Recalling definitions, a Boolean algebra with a normal unary operator is a structure (B, /\, v, -, e), where (B, /\, v, -) is a Boolean algebra, and (Ic) eO S 0 (2c) e(x V y)
(can be written as eO = 0); = e(x) Ve(y).
Anticipating applications, let us call such a structure a modal algebra or a "K-algebra." If a K-algebra satisfies additionally
359
1 s il (or il = 1); i(x /\ y) = ix /\ iy; ix s x; ix s iix; (Si) cix six; (6i) cix s x.
(Ii) (2i) (3i) (4i)
Because of this duality, and duality for the underlying Boolean algebras, for any equation or inequation that is derivable for a given system, its dual is also derivable (where its dual is defined as switching /\ and V, i and e, and I and 0). Exercise 10.2.3 Prove that each of (li)-(6i) is equivalent to its dual among (Ic)-(6c). (Hint: Use the fact that ix = -e - x. Thus, for example, 1 S il iff -il s -1 iff -i-O s 0 iff eO sO.) Proposition 10.2.4 Both e and i are isotonic, i.e., x S y implies both ex S ey and ix S iy.
(3c) x sex, we call it a "T-algebra" (in the literature such a thing has been called an "extension algebra"). And ifit satisfies (lc)-(3c) and (4c) eex S ex
(can be written as eex
Proof. We argue the first, the second being dual. If x S y then x ex V ey = e(x V y) = ey, and so ex S ey.
it is called an "S4-algebra" or "closure algebra" (which is why we have chosen to denote the operator by "e"). Closure algebras have important connections to topology, which we shall be describing in a moment. But first note that an operator i (the "interior" operator) can be defined as the dual of e: ix = -e-x. This allows us to define neatly the notion of an "SS-algebra" by adding on to (I c )-( 4c) the requirement
(Sc) ex S iex.
= y. But then 0
Proposition 10.2.5 For any modal algebra, i(a ::) b) S ia ::) ib.
Proof. (a::) b) /\ as b, so by isotonicity i[(a ::) b) /\ aJ sib. But using (2i), we obtain i[(a ::) b) /\ aJ = i(a ::) b) /\ ia. By substitution of identicals, we obtain i(a ::) b) /\ ia sib, and by Boolean algebra moves we obtain the proposition. 0 Exercise 10.2.6 Fill in the "Boolean algebra moves" in the above proof. Also, show that in any modal algebra i(a ::) b) S ea ::) eb. Note that the following corresponds to the rule of necessitation (if a proposition is provable, then so is its necessity).
While we are at it we shall define a "B-algebra" as satisfying (lc)-(4c) and the weaker requirement
Proposition 10.2.7 For any modal algebra, a
(6c) x S iex.
Proof.
The definition of the interior operator from the closure operator can be reversed, as the following shows.
Proof. -i-x = - - e - -x = ex.
y
The modal algebraic equivalent of the Kripke axiom is i(a ::) b) S ia ::) ib.
= ex by (3c)),
Proposition 10.2.1 In a modal algebra, ex
V
= -i-x.
= 1 implies ia = 1. Assume that a = 1. Then by isotonicity ia = ii, and by (il) ia =
1 as desired.
o For T we have the converse:
= 1 implies a = 1. a by (3i). However, then a = 1 since always a S
Proposition 10.2.8 In aT-algebra, ia
o
Remark 10.2.2 The axioms for a modal algebra favor "possibility" in their frequent use of the operator e. Of course one can write dual and equivalent axioms favoring "necessity," and we shall make implicit use of this fact by citing whichever form seems easiest for the purpose at hand. The dual axioms are:
Proof. If ia = 1, then 1 S
Define the operation of "strict implication" -< as
a -< b = i(a ::) b),
1.
o
360
MODAL LOGIC AND CLOSURE ALGEBRAS
FREE BOOLEAN ALGEBRAS
where a :J b is the usual "material implication" defined as -a V b. Define the operation of strict equivalence in the obvious way: a>< b = (a -< b)
1\
Proposition 10.2.13 In any S5-algebra the following hold: i(a 1\ cb)
(b -< a).
c(a
Proposition 10.2.9 We can derive as a theorem in any S4-algebra (and in any algebra "above," i.e., satisfying at least (Ji)-( 4i)): (a>< b) ::; (ia >< ib). Proof The key to this is of course to derive
(a -< b) ::; (ia -< ib).
361
V
= ia 1\ cb,
i(a V ib) = ia V ib;
ib) = ca V ib,
c(a 1\ cb)
= ca 1\ cb.
Proof We prove the two equations on the first line. (The two others are duals to the ones above them.) (1) i(a 1\ cb) = ia 1\ icb by (2i), and ia 1\ icb = ia 1\ cb by (3i) and (Sc). (2) Notice that ia ::; a ::; a V ib. So by isotonicity, iia ::; i(a V ib). Since ix ::; iix, this means that ia ::; i(a V ib). We can similarly show that ib ::; i(a V ib) starting with the lattice fact that ib ::; a V ib. Using lattice properties, we see that ia V ib ::; i(a V ib). We next show the inequality in the other direction. Using Proposition 10.2.12, we obtain i(aVib) ::; iaVcib. Since cib = ib, we obtain i(aVib) ::; iaVib as
D
desrred.
The main steps of the derivation are as follows: 1. 2. 3. 4.
i(a:J b) ::; (ia :J ib) (by Proposition 1O.2.S) ii(a:J b) ::; i(ia :J ib) (1, by Proposition 10.2.4) i(a:J b) ::; ii(a:J b) ((4i)) i(a:J b) ::; i(ia:J ib) (2,3, by transitivity of ::;).
10.3 Free Boolean Algebras with a Normal Unitary Operator and Modal Logic
D
Let us form the Lindenbaum algebra of the modal logic K (in symbols (LA(K))) as follows (actually, we will work with K"). For sentences if; and \fI, define if;;::;j \fI iff f- K " if; == \fl. It is easy to see that ;::;j is a congruence on the algebra of sentences, and that given the definitions
Proposition 10.2.10 The following holds in all S5-algebras: (ia >< 1)
V
- [if;] = [if;]
(ia >< 0) = 1.
1\ [\fI]
[~if;],
= [if; 1\ \fI],
[if;] V [\fI] = [if; V \fI],
Proof Note that the above expression is equivalent in effect to saying that a necessary
proposition is either itself necessary or else impossible. In S5 there is no room for contingency when it comes to modal propositions. We leave to the reader the details of the "translation." The heart of the proof is to demonstrate: iia
V
-cia
= 1.
we obtain that the Lindenbaum algebra of K" is a Boolean algebra with operator c. Indeed, it is easy to check that it is free in the class of Boolean algebras with a unary operator by an "induction on proofs" (soundness). Lindenbaum algebras for T, B, S4, and S5 can be constructed analogously. Theorem 10.3.1 Let S be anyone of the systems K, T, B, S4, or S5. valid in all S-algebras.
By Boolean algebra, this is equivalent to cia ::; iia.
The latter follows from the axioms cia::; ia (Si) and ia ::; iia (4i) by transitivity.
c[if;] = [Oif;],
If f-s if; then if; is
Corollary 10.3.2 For each of the logics S listed above, LA(S) is afree S-algebra. D
10.4 Exercise 10.2.11 Fill in the missing steps in the proof above. Proposition 10.2.12 In any modal algebra the following hold: i(x V y) ::; ix V cy; ix 1\ cy ::; c(x 1\ y). Proof We prove only the first, the second having a dual proof: i(x V y) = i( -y :J x) ::; i-y:Jix=-i-yVix=cyVix=ixVcy. D
The Kripke Semantics for Modal Logic
A good topic for gossip is "Who first invented the 'Kripke semantics' for modal logic?" We have already contributed enough of a gossipy nature with our discussion of the anticipations in the J onsson-Tarski representations of Boolean algebras with operators. Thus, here we go on to record as fully convinced that the Kripke semantics really should be called the Kripke semantics by vrrtue of the philosophically vivid and mathematically developed form in which Kripke first expressed these ideas. It rightly caught the imagination. The reader might especially want to consult Kripke (19S9, 1963a, 1963b, 1965) for first-hand knowledge.
For us, a (Kripke) frame shall be a structure (U, R), with U a non-empty set and R a binary relation on U. The elements of U are thought of as "possible worlds" and the relation R is thought of as "relative possibility." The reader should be warned that for reasons having to do with the Jonsson-Tarski representation using image-forming operators, we shall read aR/3 as "a is possible relative to /3," whereas Kripke (and most everyone else, following Kripke) would read it the other way around. This is merely a matter of convention. The reader should also be cautioned that the vivid terminology need not be taken too seriously. One beauty of the Klipke idea is that frames may be variously interpreted. Thus the elements of U may be thought of as moments in time with R the relation of temporal priority, to give just one important example (which is central to "tense logic"). This abstractness is nicely captured by (following Montague) calling the elements of U "reference points" or "indices," perhaps while also calling R by the neutral name "accessibility relation." But for the most part we shall not shrink from the words "possible world" (although in dealing later with the Kripke semantics for intuitionistic logic, we shall follow Kripke in regarding the elements of U as "evidential situations"). Subsets of U are quite naturally thought of as "propositions" (sometimes called "UCLA propositions") on the idea of "identifying" a proposition with the set of worlds in which it is true. In the end this may be a bad thing to do (since, for one thing, all contradictory propositions tum out to be the empty set, and hence identical-cf. Dunn 1976). But there seems to be not too much wrong with it as a heuristic in the context of modal logic, since whether a proposition is necessary or possible depends not upon any "fine-grained content" of the proposition, but merely upon which possible worlds have it true. (It would be a terrible heuristic in the context of relevance logic-again cf. Dunn (1976, 1986).) Accordingly, it makes good sense to view a modelfor (an interpretation of) a modal sentential language as a triple (U, R, I), (taking "', /\, 0 as the primitive connectives): (1) for each sentential variable p, I(p) ~ U; (2) 1(",rjJ) = U - l(rjJ);
(3) l(rjJ /\ lfI) = l(rjJ) (4) l(OrjJ)
n l(lfI);
= R*I(rjJ),
where R* X = {y : 3x E X such that y Rx }. Alternatively, and more standardly, a (world relativized) valuation function v can be taken instead of the interpretation function l. Then, v is a function from pairs (rjJ, a), where rjJ is a sentence and a E U, to truth values 2 = {O, I}. So then (1') v(p, a) E 2;
= 1 ¢;> v( rjJ, a) = 0; v(rjJ /\ lfI, a) = 1 ¢;> v(rjJ, a) = 1 and V(lfI, a) = 1; v(OrjJ, a) = 1 ¢;> 3x E U(xRa and v(rjJ, x) = 1).
(2') v(",rjJ, a) (3') (4')
COMPLETENESS
MODAL LOGIC AND CLOSURE ALGEBRAS
362
(Once again, note that we write xRa, where Kripke uses aRx.) On the first alternative, a sentence rjJ is said to be valid in a Kripke frame (U, R) iff for all interpretations I on (U, R) (in the first sense, of course), I( rjJ) = U. On the second
363
alternative, rjJ is valid in (U, R) iff for all valuations v on (U, R) (in the second sense) and for all x E U, v( rjJ, x) = 1. On either alternative, then, a sentence is said to be valid in a class of Kripke frames iff it is valid on each member of the class. The reader should convince himself that these two alternatives really boil down to the same ideas. Which alternative one prefers depends upon whether one wants to write "a E I( rjJ)" or "v( rjJ, a) = 1," whether one wants to work with a set or its characteristic function. In the present context we have a reason for wanting to work with sets, to wit, it is more "algebraic," i.e., by working with l(rjJ) ~ U, we are evaluating rjJ in the Boolean algebra which is the power set of U. Indeed by the material on Jonsson-Tarski, we are evaluating rjJ in a corresponding Boolean algebra with an operator (\(.I(U), n, -, R*). R* being additive and normal, this is in fact a K-algebra. In the sequel we shall examine various classes of frames appropriate to the various modal logics. (The class of K-frames will simply be the class of all frames.) The frames are characterized via properties of the accessibility relation as follows. T-frames: R reflexive. B-frames: R reflexive, symmetric. S4-frames: R reflexive, transitive. S5-frames: R reflexive, symmetric, transitive (= an equivalence relation). 10.5
Completeness
Theorem 10.5.1 (Weak completeness for Kripke semantics) Let S be anyone of the modal logics K, T, B, S4, or S5. Then I-s rjJ iff rjJ is valid in the class of S-frames. Proof. We work through the case of K by way of example. We leave it to the reader to work through the other cases. If not I- rjJ, then in LA(K), [rjJ] I- 1. But since LA(K) is a K-algebra, and hence a Boolean algebra with an operator, we know by the Jonsson-Tarski representation that each element of LA(K) can be identified with a set of "worlds" in some underlying universe U. It is then easy to see that l(rjJ)
= h[rjJ]
is an interpretation where I( rjJ) I- U. The proof for the other cases is analogous, except it must be verified that the way that R is defined in the proof of the J onsson-Tarski theorem gives rise to the appropriate properties in each case (reflexivity for T, etc.). D We can actually prove a stronger theorem equating deducibility and validity, but we have to be careful about how we define deducibility (r I-s rjJ). In constructing a derivation we must not apply the rule of necessitation to any step which has not been justified as being a theorem of S, for otherwise we get rjJ I-s DrjJ, which we know to be invalid. We must also define r !=s rjJ, which means for all S-frames (U, R), and for all a E U, if for all y E r, v(y, a) = 1, then v(y, rjJ) = 1.
TOPOLOGICAL REPRESENTATION
MODAL LOGIC AND CLOSURE ALGEBRAS
364
365
Theorem 10.5.2 (Strong completeness for Kripke semantics) Let S be anyone of the modal logics K, T, B, S4, or S5. Then r I-s rjJ ijjT Fs cp.
(11) I(X) = X; (12) I(Y) ~ Y; (13) I(I(Y)) = I(Y);
Exercise 10.5.3 Prove the above theorem.
(14) I(Y
10.6
We shall say that Y is a closed set iff Y = CCY), and that Y is an open set iff Y = I(Y). If Y is simultaneously closed and open, we say it is clopen. The following
Topological Representation of Closnre Algebras
In this section we shall show how every closure algebra (S4-algebra) can be represented so that the closure operator is represented as a closure operator on a topological space. McKinsey and Tarski (1944) were the first to make this connection, though we think our own proof here, using the Kripke semantics, is more intuitive for present day readers. We first review some relevant topological notions. A quasi-metric on a set X is a function d defined on the Cartesian product X x X taking non-negative real numbers as values, satisfying (MI) dCa, a) = 0, (M2) dCa, y) ::; dCa, p) + d([3, y)
(Triangle inequality).
A pseudo-metric additionally satisfies
(M3) dCa, [3)
= d([3, a)
(Symmetry).
A full-fledged metric satisfies also
(M4) if dCa, [3)
= 0, then a = [3.
(M4) may be regarded as a converse of (MI), since (MI) says that if a = [3, then dCa, [3) = o. By a (quasi- or pseudo-) metric space we mean an ordered pair (X, d), where X is a non-empty set and d is a (quasi- or pseudo-) metric on X. By a two-valued (quasi- or pseudo-) metric we mean one where dCa, [3) = 0 or 1. We shall also call this a discrete metric. Observe that there is only one two-valued metric on a set X, although there may be many two-valued quasi- or pseudo-metrics on X. Informally, dCa, [3) is to be thought of as the distance from a to [3. By a topological space we mean an ordered pair (X, C) where X is a non-empty set and C is a unary operation on the power set of X satisfying the closure axioms due to Kuratowski (1958): (Cl) (C2) (C3) (C4)
C(0) = 0; Y ~ C(Y); C(CCY)) = CCY);
CCY U Z) = CCY) U CCZ). We read CCY) as "the closure of Y," and think of it as the result of adding to Y all the points in X that are arbitrarily close to points of Y. There is a notion dual to a closure operator, called an interior operator. I(Y) may be defined as X - CCX - Y). We might have taken I as a primitive notion, defining C(Y) = X - I(X - Y), and characterizing I by the following interior axioms, the duals of (Cl)-(C4):
n Z) = I(Y) n I(Z).
exercises should be worked at seriously by anyone previously unfamiliar with topology, as they contain well-known facts of topology that we shall appeal to in the sequel. Exercise 10.6.1 Verify that Y is closed iff its complement (relative to X) is open. Hence Y is open iff its complement is closed. Exercise 10.6.2 Verify that the collection of clopen subsets of X is a field of sets. Exercise 10.6.3 Verify that the closed sets of a topological space are closed under finite unions and arbitrary intersections, and hence dually for the open sets. Exercise 10.6.4 The fact in Exercise 10.6.3 is the basis for an alternative way of characterizing a topological space. Let be a topology on S iff is a collection of subsets of X closed under finite intersections and arbitrary unions. We call the members of the open sets of the topology. (There is obviously a dual approach aimed toward the closed sets.) Define I(Y) to be the largest open set included in Y, i.e., as the union of all so there is at least one member members of that are included in Y. (Note that 0 E of included in Y.) Verify that I satisfies the interior axioms.
r
r
r
r
r
r,
Exercise 10.6.5 By the discrete topology on X is meant the set of all subsets of X. Show that the discrete topology on X is determined by the discrete (i.e., two-valued) metric on X. Given a quasi-metric d on a set X, we can define the closure operator C determined by d:
CCY)
= {a EX:
for every r E jR+, there exists [3 E Y such that dCa, [3) < r}.
We now verify that C so defined satisfies (CI) through (C4). (Note that (MI) is used for (C2), (M2) for (C3), and that (Cl) and (C4) come more or less for free.) (CI) Obvious, since there exists no [3 E 0. (C2) Obvious, since if a E X, then, since d(a,a) = 0 (MI), for every positive real number r there exists a [3, namely a itself, such that dCa, [3) < r. (C3) That CCY) ~ CCCCY)) is actually a special case of (C2), so it needs only to be argued that CCCCY)) ~ CCY). Suppose a E CCCCY)) and yet a (j. CCY). Then by the second there is some positive real number r so that for all [3 E Y, dCa, [3) ~ r. But by the first it must be the case that for some y E CCY), dCa, y) < ~. Further, for any such y E CCY), it must be that there is some [3 E Y so dey, [3) < ~. But by the triangle inequality (M2), dCa, [3) ::; dCa, y) + dey, [3) < ~ + ~ = r. So dCa, [3) < r, but remember r was chosen so that dCa, [3) ~ r for all [3 E Y. Contradiction!
366
MODAL LOGIC AND CLOSURE ALGEBRAS
THE ABSOLUTE SEMANTICS FOR SS
(C4) It is reasonably straightforward that C(Y) ~ C(YU Z) and similarly that C(Z) ~ C(Y U Z). Thus, for example, if a E C(Y) then for each positive r there must be some fJ E Y so dCa, fJ) < r. This same fJ is in Y U Z and so will serve for the same choice of r to show a E C(Y U Z). We argue the more interesting C(Y U Z) ~ C(Y) U C(Z). Suppose a E C(Y U Z) and yet a ~ C(Y) U C(Z). Since a ~ C(Y) there is some positive real r so that for all fJ E Y, dCa, fJ) ~ r. Similarly, there is some positive real s so that for all y E Z, dCa, y) ~ s. Let t = miner, s). Then obviously for all u E Y U Z, dCa, u) ~ t. This contradicts our supposition that a E C(Y U Z). The reason for our special interest in two-valued quasi-metrics stems from our use of quasi-ordered sets (U, <) in representing closure algebras. Define a function don U x U so dCa, fJ) = 0 if a < fJ, and otherwise dCa, fJ) = 1. This is the characteristic function for <. Note that a < a iff dCa, a) = O. Also transitivity of < just amounts to the triangle inequality. Thus if a < fJ and fJ < y, then dCa, y) ~ dCa, fJ) + d(fJ, y) = 0 + 0 = 0, so a < y. Conversely, if dCa, fJ) = lor d(fJ, y) = 1, then clearly dCa, y) ~ dCa, fJ)+d(fJ, y). So suppose dCa, fJ) = d(fJ, y) = 0, i.e., a < fJ and fJ < y. But then by transitivity a < y, i.e., dCa, y) = O. So again dCa, y) ~ dCa, fJ) + d(fJ, r). Thus we can freely interchange the notions of a quasi-ordered set (U, <) and a quasi-metric space (U, d). We shall exploit this fact so as to give a topological representation of closure algebras. Definition 10.6.6 Given a topological space (X, C), we call the structure (\9(X), n, u, -, C) (where - is complement relative to X) the full closure algebra on the space. By the J6nsson-Tarski representation we know that a closure algebra can be represented using a frame (U, R), where U is a non-empty set and R is a binary relation on U. In the canonical frame the elements of U are prime filters and aRfJ is defined as \ta(if a E a then Oa E fJ). It is easy to see in the case of a closure algebra that R so defined is a quasi-ordering. Thus, as the reader can easily verify, reflexivity follows from a ~ Oa, and transitivity follows from OOa ~ Oa. Thus the J6nsson-Tarski theorem provides the following.
Exercise 10.6.9 Show that every S5-algebra can be represented on a pseudo-metric space.
10.7
The Absolute Semantics for S5
Completeness of S5 is relative to a Kripke semantics with an accessibility relation that is an equivalence relation. Let us call this the "relative" semantics because it uses the notion of one world being possible relative to another, and defines necessity in terms of being true at all worlds that are relatively possible. There is a simpler semantics, as established in Kripke (1959), which we shall call the "absolute" semantics. This defines necessity as truth in all worlds period. Of course, what is meant here is in all the worlds of the frame. Thus a relative frame is of the form (U, whereas an absolute frame is just a set U. It is easy to see how an absolute frame can be turned into an equivalent relative frame. Just let the equivalence relation be the universal relation on U. Going the other way requires a bit more subtlety. Suppose ¢ is refuted in a relative frame (U, ==). Then for some w E U, w ¢ [¢]. Consider the equivalence class [w] of worlds w' such that w == w'. An easy induction shows that in evaluating ¢ at w we never need look outside of [w]. The only thing that forces us to look at worlds other than w is when we evaluate necessity (or possibility-but let us assume it has been defined in terms of necessity using negation). The key then is that in evaluating a subsentence DlJf of ¢ we need examine the truth lJf only at worlds w' such that w == w'. SO it metaphorically is as if the partition into the equivalence classes divides the worlds up into isolated universes that have no effect on each other. 10.8
Henle Matrices
Let us abstract out the algebraic structure of the absolute semantics. A Henle algebra is a structure (B, /\, v, -, i), where (B, /\, v, -) is a Boolean algebra with top element 1, and i ("necessity") is defined as follows: 1
=1 ia = 0
ia
Corollary 10.6.7 Every closure algebra can be represented as afield of sets with the closure operator interpreted as an image operator with respect to a quasi-ordering <:
CX
= {fJ : 3a E X, a < fJ}.
367
if a
= 1,
if a =f. l.
A Henle matrix is a Henle algebra with 1 the only designated element. We shall not always bother to distinguish these.
PIVof From Theorem 8.12.1 (cf. also Section 10.2 above to see how closure algebras fall under the J 6nsson-Tarski conditions). D This corollary can be reinterpreted given the discussion above about the relation between quasi-orderings and quasi-metrics. Theorem 10.6.8 Every closure algebra can be represented as a closure subalgebra of the full closure algebra on a topological space, indeed a quasi-metric space.
Proof The only trick is to switch the relation < with the corresponding quasi-metric d. D
Exercise 10.8.1 (Soundness) Show that every Henle algebra is an S5-algebra. Hence each theorem ¢ of S5 is valid in every Henle matrix. A good way to think of Henle matrices is in terms of the (absolute) Kripke semantics for S5, where each proposition is a set of possible worlds. The definition then says, in a two-valued on/off way, that a proposition is necessary if it is true in all possible worlds, and is not necessary otherwise. J Of course,
0 can be defined as Oa = -0 - a, in which case Oa = 1 if a
i' 0, and Oa =
°
if a = 0.
368
ALTERNATION PROPERTY FOR 84
MODAL LOGIC AND CLOSURE ALGEBRAS
The above definitions of "Henle algebra" and "Henle matrix" are all "up to isomorphism." Because every Boolean algebra is isomorphic to a subdirect product of 2, we can always let the Boolean part of a Henle matrix be some product of 2. We shall be particularly interested in the finite Henle matrices, and so (given Proposition 8.11.16) each can be nicely presented as a Boolean algebra of the form 2/l (where n is a positive integer), with the additional operator i. We shall denote this concrete Henle matrix as Hn. For convenience, we include the case of the one element Henle matrix as Ho. The following is an immediate consequence of the similar fact about Boolean algebras (Proposition 8.11.16): Proposition 10.8.2 EVelY finite Henle matrix is isomorphic to some Hn. Lemma 10.8.3 Let A be an S5-algebra and let a t= b. Then there exists a Henle matrix H and a homomorphism h of A onto H such that /1(a) t= h(b).
i
Proof Suppose without loss of generality that a t= b because a b (the argument is symmetric). We know from the prime filter separation principle (Lemma 8.6.2), together with the fact that in a Boolean algebra, prime filters are just maximal filters (Theorem 8.9.1), that there exists a maximal filter M with a E M and b ¢ M. We define a congruence on A as follows: a ~ b iff a
>< b E M.
Note that we use strict equivalence instead of material equivalence because the former supports replacement in necessity contexts (a ~ b implies Da ~ Db) because of Proposition 10.2.9. We now consider the quotient algebra A' = A/~. This is a homomorphic image of A under the canonical homomorphism (see Section 2.6), and it is easy to argue that a 'f!. b, and so [a] t= [b]. Otherwise a >< b E M. It is easy to see that a >< b ~ a :J b, and so a :J b E M. Since we have also that a E M, we conclude that (a :J b) /\ a E M. But (a :J b) /\ a ~ b, and so bE M, contrary to our selection of M as separating a and b. Since S5-algebras are equationally definable, we know from Theorem 2.10.3 that they are also S5-varieties. So we know that A' is an S5-algebra. But is it a Henle matrix? We need to prove for x E A/~ that ix = 1 if x = 1. Proposition 10.2.7 says that this holds of any modal algebra. We also need to prove that ix = 0 if x t= 1. This does not hold of all modal algebras and in fact is the distinguishing characteristic of a Henle matrix. From Proposition 10.2.10 and the fact that 1 EM, it follows that (ia>< 1) V (ia >< 0) EM.
Since M is prime, this means that (ia >< 1)
E
M or (ia >< 0)
E
M, i.e., ia ~ 1 or ia ~ O.
This means that for x E A/~, ix = 1 or ix = O. Now suppose that x t= 1. Then by Proposition 10.2.8, ix t= 1. Thus, ix = 0 is the only alternative remaining. D
369
Theorem 10.8.4 EvelY S5-algebra is isomorphic to a subdirect product of Henle algebras. Proof This is immediate from Lemma 10.8.3 and Theorem 2.8.12.
D
Remark 10.8.5 There are at least two other methods of proving the above theorem. The first is to show that only Henle matrices are irreducible and then apply Birkhoff's prime factorization theorem. If A is irreducible and not a Henle matrix, there is an element of the form ca t= 0,1. We define congruences ';::jfl and ';::jv just as for Lemma 8.11.12, but putting ca in place of a. The only thing then that needs doing is to show that these respect the modal operator, say c. If x ';::jfl y, then x /\ ea = y /\ ea & -x /\ ea = -y /\ ca. From the first we can conclude e(x /\ ea) = e(y /\ ea), and from this we can obtain ex /\ ea = ey /\ ea, using Proposition 10.2.13. This is just half of showing that ex ';::jfl ey. We must also show that -ex /\ ea = -ey /\ ca. We have that -x /\ ea = -y /\ ea, and so i( -x /\ ea) = i( -y /\ ea). But again using Proposition 10.2.13 we obtain i-x /\ ea = i - Y /\ ea, and hence -ex /\ ea = -ey /\ ea as needed. All that really remains to be shown is that ';::jv respects e, where the remaining two modal distribution principles from Proposition 10.2.13 play their role. The rest of the proof is as for Lemma 8.6.6. Another way to prove the embedding theorem is given implicitly by the following exercise. Exercise 10.8.6 We know from Exercise 10.6.8 (cf. also Section 10.7) that every S5algebra A is representable using a frame (U, ==), where == is an equivalence relation on U. We also know that the U is partitioned by == into a number of disjoint equiValence classes {[ w] : w E U}. Look at each power set \p([ w]) as a field of sets. Show that it becomes a Henle matrix, given the definition for X ~ [w], C(X) = {P : 3a E X, a == P}.1t is obvious that if X is empty then C(X) = 0. Show that when X is not empty, then C(X) = [w]. Go on to show that the S5-algebra A is isomorphic to a subdirect product of these Henle matrices. (This proof is somewhat analogous to showing implication 3 in Exercise 8.11.11 (Figure 8.15), and also has antecedents in the informal proof given above of the relative and absolute semantics for S5.) 10.9
Alternation Property for S4 and Compactness
A topological space (U', C') is a subspace of a topological space (U, C) iff (1) U' ~ U and (2) C' is just C restIicted to U'. (The latter is equivalent to the more usual requirement that each open set 0' of the first topology is the intersection 0 n U' for some open set 0 of the second topology.) A topological space U is compact iff for every indexed family of open subsets {Oi} iEi, wh~n~ver ~iEi Oi = U then ~here exists some finite J ~ I such that UjEJ OJ = U. ThIS IS easIly seen to be eqUIvalent to the dual condition that for every indexed family of closed subsets {C} iEi, if for every finite J ~ I, njEJ Cj t= 0, then niEi C:i t= 0. The antecedent of this conditional, i.e., that n'EJ Cj is non-empty for all fimte J, is often called the "finite intersection property," ~nd the consequent, i.e., that niEi C is non-empty is often called the "intersection property." Thus compactness
MODAL LOGIC AND CLOSURE ALGEBRAS
370
ALGEBRAIC DECISION PROCEDURES
371
may be neatly stated as requiring that every family of closed sets which has the finite intersection property also has the intersection property. We shall be interested in a somewhat stronger property, to wit, a topological space U is strongly compact iff for every indexed family of open subsets {Oi} iEI, whenever UiEIOi = U, then::li E I such that Oi = U. This is again easily seen to be equivalent to the dual condition that for every indexed family of closed subsets {Ci } iEI, if for every i E I, C =f. 0, then niEI Ci =f. 0. We want now to prove the one-point compactification theorem.
algebra (where [pJJ ... [Pk] are the generators, and PI, ... , Pk are all the subformulas of lfI). This finitely generated Boolean algebra B is finite (because of normal forms) but may not be closed under c[p] = [Op]. We define a new operation c' on B' that agrees with the old c when it exists in B' (all the theorems are about this). Thus we can now falsify lfI in a finite K-algebra. Indeed I-K lfI ifflfl is valid in all K-algebras of size:::; 22", where 11 is the number of subformulas of lfI. For other modal logics, e.g., K, T, and S4, we can obtain similar results. The details are to be found in the following theorems.
Theorem 10.9.1 Every topological space (U, C) is a subspace of a strongly compact topological space (U+, C+) such that U+ - U is a one-element set and is closed.
Theorem 10.10.1 (McKinsey 1941) Let (B, 1\, V, -, c) be a K-algebra and let (B', 1\',
Proof Let aD be anything not in U, and let U+ = U U {aD}. Define C+(X) = C(X) U {aD} (think of aD as "close" to all points). Clearly, by the definition of C+, (U, C) is a subspace of (U+, C+). 0
Let us say that a closure algebra (B, 1\, v, -, c) is finitely compact iff whenever ia V ib = 1 then ia = 1 or ib = 1 (or dually, ca =f. 0 and cb =f. 0 imply ca 1\ cb =f. 0). This terminology is frankly invented and is the vestigial form of strong compactness that remains when one is working with closure algebras that are not necessarily complete. Thus, it is easily seen to be equivalent to the condition that for every finite indexed family of open elements {Oi} iEI, if AiEl 0i = 1, then there is an i E I such that 0i = 1. Theorem 10.9.2 Every closure algebra (B, 1\, V, -, c) is a subalgebra of afinitely compact closure algebra. Proof By the representation theorem for closure algebras we know that B is isomorphic to a topological field of sets of some topological space (U, C). By the one-point compactification theorem, we know that (U, C) is a subspace of a strongly compact topological space (U+, C+). 0
Theorem 10.9.3 (Alternation property for S4) If 1-84 Dp V DlfI, then 1-s4 Dp or 1-84 DlfI· Proof Consider the Lindenbaum algebra LA(S4). By Theorem 10.9.2 it can be embedded in a finitely compact closure algebra. Let us assume the hypothesis I- Dp V DlfI. Then i[p] V i[lfI] = 1 in LA(S4) and so h(i[p] V i[lfI]) = ih([p]) V ih([lfI]) = 1 as well, where h is the embedding. But then by the definition of finite compactness, ih([p]) = 1 or ih([lfI]) = 1, i.e., h(i[p]) = 1 or h(i[lfI]) = 1. So i[p] = 1 or i[lfI] = 1 back in 0 LA(S4), i.e., 1-s4 p or 1-84 lfI.
10.10
Algehraic Decision Procedures for Modal Logic
We shall present a proof of McKinsey's (1941) theorem that S4 is decidable, which shows that S4 has the finite model property. We shall show the same for K and T. Decidability will then follow using Harrop's theorem (Theorem 6.16.5). The rough idea is that if, for example, 11K lfI then lfI is falsifiable in a K-algebra under the canonical valuation [lfI]. The K-algebra is "almost" a finitely generated Boolean
V', -') be an infinitely distributive complete Boolean subalgebra ofB, i.e.,
(i) (B', 1\', V', -') is a complete Boolean algebra;
(ii) (B', 1\', V', -') is a subalgebra of(B, 1\, V, -) where 1\' and V' are defined for 0, bE
B' as a 1\' b = A' (a, b} and a V' b = V' (a, b} ; (iii) B' is infinitely distributive, that is, a V' A' X = X ~ B'; and (iv) c1 E B'.
A' {a v' x
: x E X} for a E B',
Then there exists a unalY operation c' on B' such that (B', 1\', V', -', c') is a K-algebra, and whenever a, ca E B', then c' 0 = ca. Proof Before we begin, we remark that the principal application of the theorem (as in Corollaries 10.10.2 and 10.10.3) is to the case when B' is finite, in which case conditions
(i)-(iii) collapse to the simple requirement that B' be a Boolean subalgebra of B. The reader may want to keep this case in mind as he goes through the proof to "fix ideas." Also, to simplify notation, we shall drop the primes from the symbols denoting the operations of the sub algebra with the exception of the "new" operation c' which we wish to focus attention on. Now to begin the proof, define for a E B', (1) c'(a)
= A{cx: ex E B' and a :::; x},
which is non-empty due to (iv). We first verify that c' a = co for a, ca E B'. Thus, suppose a, ca E B'; then since 0 :::; a, ca is itself a component in the meet that makes up c' a. So clearly c' a :::; ca. But ca :::; c' a as well, since we can argue that ca :::; any component ex in the meet displayed above. For this, suppose ex E B' and a :::; x. Then by monotonicity of c, ca :::; ex, which is what is wanted. We will now occupy ourselves with showing that B' is a K-algebra. We need to show (2) c'o = 0 and (3) c'(aVb)=c'aVc'b.
It is trivial that (2) holds given (1). Thus, cO = 0 E B', and so c'o = cO = O. The calculations for (3) are somewhat complex. We first obtain a useful form of the right-hand side of (3). Thus, (4) c' a V c' b
= A {ex: ex E B' &
a :::; x} V A {cy : cy E B' & b :::; y}.
But by infinite distribution, this equals
(5)
!\ U\ {cx : cx E B' &
a S x} V cy : cy E B' & b s y}.
And by a further infinite distribution, this equals (6) !\{!\{cxVCy:cxEB'&asx} :cyEB'&bSy}·
!\ {cx V cy : cx, cy E B' & a S x
(9) c'a V c'b
Recalling the definition of c' a as the meet of cxs satisfying a certain condition, it suffices to assume (2) cx E B' and a S x
(3)
& b s y}.
These computations are straightforward but confusing (because of the nested braces, etc.). Asserting that our principal application of the theorem will be to the case where B' is finite, we invite the reader to work through all our computations with c' a = CXj, V ... V cx n . Now going on to show (3), it obviously suffices to show both (8) c'(a V b)
S c'a V c'b and S c'(a V b).
as cx.
But (3) clearly follows from the second conjunct of (2) and x S cx (which is the Talgebra postulate (3c)). 0 Corollary 10.10.3 Let B andB' be as in Corollary 10.10.2, except that B is required to be an S4-algebra. Then there exists a unmy operation c' on B' such that (B', /\', V', -', c') is an S4-algebra and whenever a, ca E B', then c' a = ca. Proof Tn virtue of the proof of Corollary 10.10.2, we need only check that the c' as defined in the proof of the theorem satisfies the characteristic S4-postulate:
To prove (8), given the definition of c'(a V b) and (7), it suffices to show
(1) c'c'a
s c'a.
(10) !\{cz:cZEB'&avbsz}scxvcy
It obviously suffices to suppose that
upon the assumption of
(2) cx E B' and a S x
(11) cx, cy E B' & a S x & b
s
y.
to and show
This we do if we show cx V cy to be a component of the meet that constitutes the lefthand side of (10). The trick is to show that c(x V y) is such a cz. From (10) it follows that c(x V y) = cx V cy E B'. It only remains to show then that a V b S x V y, but this follows by lattice properties from (11). To prove (9), given (7) and the definition of c'(a V b), it suffices to show (12)
!\ {cx V cy : cx, cy E B' &
a S x & b s y}
s cz
upon the assumption of
S z. By moves familiar by now, it suffices to show that such a cz is in fact a component of the meet on the left-hand side of (12). The trick is note that cz = cz V cz. By (13), cz E B' and a S z and b S z, which is all that is needed to put cz in the set being meeted. 0 (13) cz E B' & a vb
Corollary 10.10.2 Let Band B' be as in the theorem except that condition (iv) is dropped and in its place it is required that B be aT-algebra. Then there exists a unary operation c' on B' such that (B', /\', V', -', c') is aT-algebra and whenever a, ca E B', then c' a = ca. Proof Condition (iv) can be dropped since it is trivially satisfied by every T-algebra that c1 = 1 E B'. Since a T-algebra is a K-algebra, we need then only check the definition of c' used in the proof of the theorem to see that it satisfies the additional postulate for a T-algebra, namely, (1)
as c' a.
373
and to show
But by "infinite association," this equals (7)
ALGEBRAIC DECISION PROCEDURES
MODAL LOGIC AND CLOSURE ALGEBRAS
372
(3) c' c' a S cx, i.e.,
(4) !\{cy: cy E B' & c'a S y} S cx.
To show (4) it obviously suffices to show that cx is a component of the left-hand meet, i.e., to find some y such that cx = cy and cy E B' and c' as y. Since ccx = cx (the (4c) condition of an S4-algebra), cx is a possible choice for y (x itself is a very poor choice for y) provided only that the other two conditions on yare met. But c(cx) = cx E B' (from the S4-condition (4c) and (2)). So it remains to show only (5) c'a
s cx.
But this is immediate from (2) and the definition of c'.
o
Remark 10.10.4 For certain modal systems the definition of c' can be varied somewhat. Thus, for example, for S4-algebras (closure algebras) one may use the definition
c~4(a)= /\'{X:XEB' and cx=x/\aSx}. This definition accords with the familiar topological intuition that the closure of a set is the intersection of all its closed supersets (x is "closed" if cx = x). For other modal systems the definition of c' must be varied somewhat. Shukla (1970) gives an ingenious variation needed for some of the weaker modal systems in the neighborhood of SI. Exercise 10.10.5 Show for an S4-algebra that cS4(a)
= c'(a).
374
85 AND PRETABULARITY
MODAL LOGIC AND CLOSURE ALGEBRAS
Exercise 10.10.6 For certain systems stronger than K the definition of e' must be varied somewhat. Thus consider the logic D = K + O(¢ V ~¢). (O(¢ V ~¢) may be replaced by D¢ :J O¢, which has been thought particularly appropriate for a deontic logic with D read as "it is obligatory that" and 0 read as "it is pemlissible that.") The condition on the accessibility relation is that every world have a world possible relative to it. (a) Show that the theorem can be proven for D-algebras (K-algebras with the added postulate that e 1 = 1) with the definition
e~(a) =
I\. {ex: ex
E B' & a:S ex}.
(b) Does e~(a) = e'(a) in aD-algebra? Exercise 10.10.7 Can some analog of the theorem be proved for B and S5? It would seem that the definition of e' must be varied. (Strangely, there seems to be no discussion of this problem in the literature.)
375
VI(OIff) = e'[Iff], the result cannot be guaranteed to be [Olff]' But it can be when Olff is a sub sentence of ¢, and this is good enough. Thus one can prove the following by an easy induction on sentences:
Lemma 10.10.9 Let VI be as defined above. Then VI(Iff) = [Iff].
if Iff
is a subsentence of ¢, then
The finite model property now follows directly. Since we have been assuming that IIH ¢, then [¢] -:j:. 1, and so v I (¢) = [¢] is undesignated. The corresponding results for T and S4 follow using Corollaries 10.10.2 and 10.10.3 in place of Theorem 10.1 0.1. D Corollary 10.10.10 The set of theorems for each of the modal logics K, T, and S4 is decidable. Proof From Theorem 10.10.8 using Harrop's Theorem (Theorem 6.16.5).
D
Theorem 10.10.8 The modal logics K, T, and S4 all have the finite model property. Proof We suppose that the logic in question is K. The proofs for T and S4 follow by obvious modifications. Suppose then that 11K ¢. Then form the Lindenbaum algebra of K, which is a K-algebra. Let Iff], ... , Iffn be all of the subsentences of ¢. Consider the Boolean algebra B' generated by [Iff]], ... , [lffn]. (For D we also need to add [O(lffl V ~Iffr)] to the generators.) We will show that B' is finite. Using De Morgan's laws and double negation, every element can be put into "meetnormal" form (Xl,] V ... V Xl, 1111) /\ ... /\ (X n ,] V ... V x n , 1111/)' where the Xi} are the generators or their complements. Drive complement signs inside meets or joins (switching meets withjoins), removing double complements as they arise, so complement signs flank generators. Then use distribution to drive meets in past joins. Because of associativity, commutativity, and idempotence this form can be defined to be unique (up to an ordering of the generators). So clearly the sublattice L' is finite, since it is easy to see that there are at most 22211 such forms. Indeed, a little fiddling so as to not allow joins in which both a generator and its complement occurs reduces this bound to 221/.2 There is not yet a closure operation defined on B'. We cannot simply use c[1ff] = [Olff], since the result might not be defined when Olff is not a subsentence of ¢ (or even a conjunction of disjunctions of negations of such). And we cannot simply expand B', closing it under e, since the result need not be finite (as it was when we did the corresponding construction with meet, join, and complement). But we do know by Theorem 10.10.1 that a new closure operation e' can be defined on L' that agrees with the original closure operation e when a and ea are both in B'. This turns out to be good enough, and (B', /\, v, -, e') will be our desired finite model. Let us consider the interpretation l(p) = [p], for each atomic sentence p which is a sub sentence of ¢, and otherwise l(p) is arbitrarily defined. This is very much like the canonical valuation in the Lindenbaum algebra except that when one comes to compute 2 Alternatively, the reader can use the fact that Boolean tenns can be identified when they define the same operations in 2, and note that there are only 22" such n-p]ace functions.
10.11
S5 and Pretabularity
The reader may have noticed that in the previous section we did not bother to show that S5 is decidable. This is because S5 has the finite model property in a very strong sense which we shall call "uniform." It is not just that each non-theorem ¢ can be refuted in a finite model, but the size of the model can be calculated in a very simple manner from the size of ¢, indeed from the number of distinct atomic sentences occurring in ¢. Theorem 10.11.1 (Uniform finite model property for S5) If there are exactly n atomic sentences in a sentence ¢, then ¢ is a theorem of S5 iff ¢ is valid in the Henle matrixHIl • Corollary 10.11.2 (Soundness and completeness for S5) A sentence ¢ is a theorem of S5 (fifor all positive integers n, ¢ is valid in Hn. We shall eventually get around to proving the theorem. The corollary clearly follows from the fact (Exercise 10.8.1) that a Henle-matrix is an S5-matrix (provided Theorem 10.11.1 holds). We will prove the even more surprising theorem (Theorem 10.11.7 below) due to Scroggs (1951) that all modal logics extending S5 have a finite characteristic matrix, namely, one of the finite Henle matrices. This, together with the fact that S5 itself has no finite charactelistic matrix (Theorem 10.11.9 below), establishes that S5 is "pretabular" in the sense that while it does not itself have a finite characteristic matrix, every modal extension of it does. Lemma 10.11.3 Let X be a modal extension of S5. is not valid in some Henle matrix H.
If ¢
is not a theorem of X then ¢
Proof We leave to the reader the verification that Henle matrices satisfy the axioms and rules of S5. Consider the Lindenbaum algebra LA(X) fonned using the equivalence relation of provable material equivalences (Iff == X iff (Iff :J X) /\ (X :J Iff) is a theorem
376
MODAL LOGIC AND CLOSURE ALGEBRAS
85 AND PRETABULARITY
of S5), or alternatively, of provable strict equivalences. This is an S5-algebra in which [p] i= 1, since P is not a theorem of X. We can now apply Lemma 10.8.3 to say that there is a Henle matrix H and a homomorphism h from LA(X) onto H so that h([ p]) i= h(l) = 1. In the interpretation z(p) = h([p]), then z(p) i= 1. D Notice that there is no reason to think that H is finite, so we work towards replacing the Henle matrix H with some finite Henle matrix Hn. The following will be of use in that task: Proposition 10.11.4 For natural numbers i and j, i
~ j
= c(h(al,
Theorem 10.11.7 (Scroggs 1951) Let X be a modal extension of S5. Some finite Henle matrix H/1 is characteristic for X in the sense that p is a theorem of X (ff P is valid in H/1'
Proof We then let I be the set of indices such that Hi validates all of the theorems of X. Either I is infinite or finite. If infinite, then because of Proposition 10.11.4 we can take I to be all of :£;+, and so (by Theorem 10.11.5) X = S5. If finite, then (assuming non-emptiness) there is a largest index k. Because of Proposition 10.11.4 and
Theorem 10.11.5, it is easy to see than Hk is the desired characteristic matlix.
iff Hi is a subalgebra of Hj.
Proof The "if" part is obvious on size considerations. For the "only if" part we show that Hn is a subalgebra of Hn+ I, and then the proposition follows by an obvious induction. We know from the similar Theorem 8.11.14 that the Boolean part of Hn is isomorphic to a sub algebra of the Boolean part of Hn+l under the mapping h(al, ... , an) = (aI, ... , all, an). It remains to show that h(c(al, ... ,all))
Remark 10.11.8 Note that in the above proof, if I extension in which all sentences are theorems.
Note that we cannot simply apply the general law of universal algebra stated in Exercise 8.11.15 because (unlike the Boolean operations) c "thinks globally" and is not computed componentwise. Rather c( a I, ... , all) = (1, ... , 1) if some component ai = 1, and c(al, ... ,all) = (0, ... ,0) if every component ai = O. The key trick is that some component of (aI, ... , all) is 1 iff some component of its "expansion" (aI, ... , an, an) is 1, and similarly every component of (aI, ... , an) is 0 iff every component of its expansion (aI, ... , an, all) is O. Note that either h(c(al, ... , an)) is all Is, or else it is all Os. But h(c(al, ... ,all)) is allIs iff c(al, ... ,an ) is allIs iff (aI, ... ,an ) contains a 1 iff (aI, ... , an, an) contains a 1 iff c( (aI, ... , an, an)) is allIs. And h(c(al, ... ,an)) is all Os iff c(al, ... , an) is alIOs iff (aI, ... , an) is alIOs iff (aI, ... , an, an) is all Os iff c(al, ... ,an,an )) is allIs. In either case, h(c(al, ... ,an )) = c(h(al, ... ,an )).) D
Theorem 10.11.5 Let X be a modal extension of S5. If P is not a theorem of X then P is not valid in some finite Henle matrix Hn (where n is the number of atomic sentences with occurrences in p).
P of X is rejectable
in some Henle matrix H. We observe that the Henle submatrix H' of H generated by the elements assigned to atomic sentences in p is finite and no larger than 2n. This is because of normal forms for Boolean algebras, together with the fact that in a Henle matrix the modal operators do not lead to new elements. Because of Proposition 10.8.2 we can take H' to be of the form 2k, with k ~ n. Because of PropositionlO.11.4 this means that 2k is a subalgebra of 2n. Since validity is preserved under subalgebra, this D means that p is rejected in 2/1. Proposition 10.11.6 Setting X to be S5 in the above theorem gives the "completeness" half of Theorem /0.11.1, the discussion of which began the present section. The "soundness" half (that each H/1 validates all of the theorems of S5) is given by Exercise /0.8.1.
= 0 then
D
X is the inconsistent
Theorem 10.11.9 S5 has no finite characteristic matrix. Before we present the proof proper we discuss some background. Using == for material equivalence, the following is a well-know tautology of classical logic:
... ,all))'
Proof We know from the previous lemma that any non-theorem
377
(p ==
lfI) V (p
== X)
V (lfl
== X)·
The reason is that with only two truth values one ends up having to always assign the same truth value to two of the three sentences p, lfI, X. This can be generalized to Henle matrices. By the finitizing equality n we mean a formula of the form: (Xl
>< X2)
>< X3) V ... V (Xl >< Xn+t) (X2 >< X3) V ... V (X2 >< X/1+I)
V (Xl
V V
The displayed formula is a bit opaque, but the idea is that n says of n + 1 variables that some two of them stand for the same proposition. By the corresponding finitizing sentence L/1 we mean a sentence in the modal language of the following form, with p >< lfI (strict equivalence) defined as necessary material equivalence (D[(-.p V lfI) A (-'lfI V p))):
(PI >< P2) V (PI >< P3) V ... V (PI >< p/1+d V (P2 >< P3) V ... V (P2 >< Pn+l) V
V
(Pn >< P/1+I) = 1.
Proof Suppose that some finite matrix M is characteristic for S5 and that M has n elements. Just on size considerations it will thus validate the finitizing sentence <1'>/1. But there is a bigger Henle matrix Hk, and in it we can assign distinct objects to distinct variables. It is easy to show that this refutes n, contradicting that M is characteristic for S5. D
378
MODAL LOGIC AND CLOSURE ALGEBRAS
Exercise 10.11.10 Show that any SS matrix with more than n elements refutes < Xl) is designated, and (2) a disjunction is designated if any disjunct is designated.) We have the following corollary, which shows that modal extensions of SS have particularly simple axiomatizations. Corollary 10.11.11 Every proper modal extension of SS can be axiomatized by adding one of the jinitizing sentences L,n to the axioms of SS. Proof Let X be a proper modal extension, by which we mean that X c SS. We know that X has a characteristic Henle matrix Hn. It is easy to see that adding the finitizing sentence L,2" to the axioms of SS (call the resulting system SSn) forces Hn to be also the characteristic matrix of SSn. Hence X = SSn. 0 It turns out that "SS can be approximated from above" by its finitary counterparts SSn (defined as in the proof just above):
Corollary 10.11.12 The extensions of SSline up as a chain asfollows: SSo :J SSl :J ... :J SSn :J SSn+1 :J ... :J SS. Proof The weak version of these inclusions (replace :J with 2) follows from Proposition 10.11.4 and that fact that submatrices preserve validity. All that remains is to show that the inclusions are proper by showing distinctness of the displayed systems. That no SSn = SS amounts to Theorem 10.11.9. The finitizing sentence L,2" is a sentence valid in SSn but not valid in SSn+ 1. 0
Corollary 10.11.13 SS
= nnEw SS/l.
Classical logic can be viewed as the limiting case of modal logic. Viewed in truthfunctional terms, 0 is simply the identity function, and can be read in English as "it is true that." If one considers the two element Boolean algebra as a Henle matrix, one obtains the same result. We shall say that a modal logic "collapses to classical logic" when DcjJ >< cjJ is a theorem. A logic is said to be Post complete if every proper normal extension of it is Post inconsistent in the sense that every sentence is a theorem. Sometimes writers call these notions absolute completeness and absolute inconsistency, and we sometimes stray into this way of talking. Note that in logics where any contradiction implies every sentence, Post consistency (absolute consistency) and ordinary consistency (not both cjJ and ..,cjJ are theorems) coincide. This last is sometimes called "negation consistency" for emphasis. Corollary 10.11.14 The only consistent and Post complete modal extension of SS collapses to classical logic. Proof The proof can be more or less read off of the approximation of SS given by Corollary 10.11.12. First note that SSl is consistent, for there is a sentence provable in SSo which is not provable in SSl. Further, SSI is Post complete, since its only extension is SSo, which can be easily seen to be the absolutely inconsistent modal logic. And
S5 AND PRETABULARITY
379
for n > 1, SSn is not Post complete, since it can always be properly extended to the 0 consistent system SSn-l. Remark 10.11.1S The above results can be given a "purely algebraic" formulation. For example, the fundamental Theorem 10.11.7 can be restated to say that if any equations are added to those axiomatizing SS-algebras, then the resulting set of equations axiomatizes some finite Henle matrix H/l. Remark 10.11.16 An amazing fact is that SS is one of exactly five pretabular modal extensions of S4, as was shown independently by Maksimova (1975), and Esakia and Meskhi (1977).
IMPLICATIVE LATTICES
381
It is worth remarking that (H5) is given in an exported form, but it can be given in an imported form
11 INTUITIONISTIC LOGIC AND HEYTING ALGEBRAS
(H5') ((cjJ -- If!) /\ (cjJ -- X)) -- (cjJ -- (If! /\ X)) at the cost of adding either the axiom (H12) cjJ -- (If! -- (cjJ /\ If!)) or the adjunction rule:
11.1
Intuitionistic Logic
We here present a Hilbert-style formalism for the sentential calculus H due to Heyting (1930). We assume an infinite set of atomic sentences, binary connectives __ , /\, V for implication, conjunction, and disjunction respectively, and a unary connective ..., for negation. The axioms consist of all sentences of the following forms: (HO) (HI) (H2) (H3)
cjJ -- cjJ; cjJ -- (If!-- cjJ); (cjJ -- (If! -- X)) -- ((cjJ -- If!) -- (cjJ -- X)); (cjJ /\ If!) -- cjJ;
(H4) (cjJ /\ If!) -- If!;
(H5) (H6) (H7) (HS) (H9) (HlO)
(ADJ) If cjJ and If!, then cjJ /\ If!. One can also replace (HS) with its imported form (HS') ((cjJ -- If!) /\ (X -- If!)) -- ((cjJ V X) -- If!). The point of this is to make the axioms more obviously give /\ and V the properties of lattice meet and join. Incidentally, from (HI) and (H2), one can prove the following principle of transitivity: (H13) (cjJ -- If!) -- ((If! -- X) -- (cjJ -- X))·
This, with (HO), shows that f-H cjJ -- If! establishes a pre-order, indeed a pre-lattice. We shall see that it is distributive.
(cjJ -- If!) -- ((cjJ -- X) -- (cjJ -- (If! /\ X))); cjJ -- (cjJ V If!);
11.2 Implicative Lattices
If! -- (cjJ V If!);
We shall call a structure (L, /\, V, :::}) an implicative lattice if (L, /\, V) is a lattice and for all a, b, x E L,
(cjJ -- If!) -- ((X -- If!) -- ((cjJ V X) -- If!)); (cjJ -- ""X) -- (X -- ...,cjJ); cjJ--(...,cjJ--lf!).
As sole rule we take modus ponens: (MP) If cjJ and cjJ -- If!, then If!. We now make a few remarks of an axiom-chopping sort. Axiom (HO) is redundant. (Show this as an exercise if you have never done so before.) Axioms (HI) and (H2) completely characterize the "pure implicational fragment" (those theorems whose only connective is --). Axioms (HI) through (H5) completely characterize the "implicationconjunction fragment" (those theorems whose connectives are only -- and /\). And axioms (HI) through (HS) characterize so-called "positive logic" (those theorems that are negation-free). We do not prove these "separation results," but we mention them for the sake of calling attention to the importance of the various fragments. It turns out that if one has a primitive constant false proposition j, one can define ...,cjJ = cjJ -- j (in the style of Johansson 1936), thus dispensing with the need for a primitive negation connective. One must, though, add the axiom scheme (Hll) j -- cjJ
to get the effect of (HlO). But (H9) follows even without this addition, and so (HI) through (H9) amount to the axioms of what Johansson called "minimal logic" (positive logic supplemented with Johansson's definition of..., but no special axioms about f).
(*) x /\ a ~ b iff x ~ a :::} b.
(The reader may want to verify, as an exercise, that this condition implies that ⇒ is antitonic in the first position and monotonic in the second.) The motive for the name is obvious once it is remarked that the postulate (*) from left to right amounts to the "deduction theorem," and from right to left amounts to modus ponens. Nonetheless, it should be remarked that not all "implications" discussed in the literature satisfy the deduction theorem as it is usually understood. Thus, in particular, the strict implication of C. I. Lewis (1918), the counterfactual implications of Stalnaker and Thomason (1970) and D. Lewis (1973), the Sasaki implication of quantum logic, and the relevant implication of Anderson and Belnap (1975) all reject the deduction theorem in the form

(**) Γ, φ ⊢ ψ only if Γ ⊢ φ → ψ,

where ⊢ stands for ordinary deducibility. The quick reason for this rejection is that by setting Γ to be {ψ}, one obtains the paradox of implication

(***) ψ ⊢ φ → ψ,

which is anathema for strict, counterfactual, and relevant implication. But the deduction theorem in its form (**) is central to the classical and intuitionistic systems, particularly to the intuitionistic system wherein all the pure implicational theorems may be deduced from (**) and its converse (modus ponens). Accordingly, (*) is central to the algebraic treatment of the intuitionistic system. Note that (*) is just a special case of residuation (see, in particular, Sections 3.10, 3.16, and 3.17), which does allow for more general ways of combining premises than simply the ordinary conjunction in (*).

The careful reader may have noticed that our definition of an implicative lattice does not explicitly postulate distributivity. This is because to do so would be redundant, as the following shows.

Theorem 11.2.1 (Skolem–Birkhoff) Every implicative lattice is distributive.

Proof By lattice properties, we have

(1) b ∧ a ≤ (a ∧ b) ∨ (a ∧ c),
(2) c ∧ a ≤ (a ∧ b) ∨ (a ∧ c).

From these we obtain by "exportation," using (*),

(3) b ≤ a ⇒ [(a ∧ b) ∨ (a ∧ c)],
(4) c ≤ a ⇒ [(a ∧ b) ∨ (a ∧ c)].

And from (3), (4) we obtain by lattice properties

(5) b ∨ c ≤ a ⇒ [(a ∧ b) ∨ (a ∧ c)].

But from (5), we obtain by "importation," using (*) (with commutation),

(6) a ∧ (b ∨ c) ≤ (a ∧ b) ∨ (a ∧ c),

which inequality suffices for distribution (cf. Chapter 2). ∎
Remark 11.2.2 The above theorem has been thought important by many writers (including Birkhoff) for showing that "quantum logic" can have no decent implication, since distribution fails. In light of our remarks above about the dangers in the nomenclature "implicative lattice" in the light of many "non-exporting" implications, this moral must be regarded as questionable. Indeed, many recent workers on quantum logic have questioned this once orthodox moral; see especially Hardegree (1975), who relates a quantum implication to the Stalnaker conditional.

Theorem 11.2.3 Let (L, ∧, ∨) be a complete lattice which is infinitely distributive, i.e., a ∧ ∨X = ∨{a ∧ x : x ∈ X}. Then there is a unique operation ⇒ on L such that (L, ∧, ∨, ⇒) is an implicative lattice (with a ∧ b = ∧{a, b}, a ∨ b = ∨{a, b}).

Proof Define a ⇒ b = ∨{x : x ∧ a ≤ b}. Then (*) from left to right is all but immediate, since if x ∧ a ≤ b, then x is in fact a component of the "infinite" join which is a ⇒ b. From right to left is only slightly harder. Thus, assume x ≤ a ⇒ b. Then x ∧ a ≤ (a ⇒ b) ∧ a. So it suffices to show that

a ∧ (a ⇒ b) ≤ b,

i.e., modus ponens holds. By virtue of the definition of a ⇒ b as "infinite join" and "infinite distribution" it then suffices to show that

∨{a ∧ x : x ∧ a ≤ b} ≤ b,

but this is transparent since each component a ∧ x is postulated to be less than or equal to b. Finally, (*) trivially uniquely characterizes ⇒. ∎

Corollary 11.2.4 Let (L, ∧, ∨) be a finite distributive lattice. Then there is a unique implicative lattice (L, ∧, ∨, ⇒).

11.3 Heyting Algebras

A Heyting algebra (sometimes called a Heyting lattice) is an implicative lattice (L, ∧, ∨, ⇒, 0) with least element 0. The Heyting complement, also called pseudo-complement, can be defined as

(1) ¬a = a ⇒ 0,

and can be easily seen to have the properties ascribed to it in Chapter 3.
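Corollary 11.2.4 lends itself to direct computation: in a finite distributive lattice one can simply evaluate the join from the proof of Theorem 11.2.3. The following is a minimal sketch under an encoding of our own choosing (the divisors of 12 ordered by divisibility, a distributive lattice with meet = gcd and join = lcm); nothing in it beyond the defining formula comes from the text.

```python
# Relative pseudo-complement in a finite distributive lattice, computed as
# a => b = join of {x : x meet a <= b} (Theorem 11.2.3 / Corollary 11.2.4).
from math import gcd
from functools import reduce

L = [1, 2, 3, 4, 6, 12]                   # divisors of 12 under divisibility
def meet(a, b): return gcd(a, b)
def join(a, b): return a * b // gcd(a, b)
def leq(a, b): return b % a == 0          # a <= b iff a divides b

def implies(a, b):
    return reduce(join, [x for x in L if leq(meet(x, a), b)], 1)

def neg(a):                               # Heyting complement: a => 0
    return implies(a, 1)                  # the least element here is 1

# The residuation postulate (*): x meet a <= b iff x <= a => b.
assert all(leq(meet(x, a), b) == leq(x, implies(a, b))
           for x in L for a in L for b in L)
print(implies(4, 6), neg(12))             # 6 and 1: e.g. neg(top) = bottom
```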
11.4 Representation of Heyting Algebras using Quasi-ordered Sets

Let (U, ⊑) be a quasi-ordered set, i.e., ⊑ is a reflexive and transitive relation on U. By a proposition p we mean a subset of U "closed upward," i.e., for α, β ∈ U, if α ∈ p and α ⊑ β, then β ∈ p. By a full Heyting algebra of propositions we mean a structure (A, ∧, ∨, ⇒, 0), where A is the set of all propositions on some partially ordered set (U, ⊑), ∧ and ∨ are intersection and union, p ⇒ q = {α ∈ U : for all β ∈ U such that α ⊑ β, β ∉ p or β ∈ q}, and 0 is the empty set. By a Heyting algebra of propositions we mean a subalgebra of a full one. We shall call (U, ⊑) an evidential frame. The idea comes from Kripke (1965). An item α ∈ U is thought of as an "evidential state," and α ⊑ β means that the information at α is contained in that of β. The requirement that p be closed upward corresponds to the idea that if p is established by a piece of information, it is also established by any piece of information that extends it. "Established" is intended in a very strong sense then, i.e., it means "proven." Such an assumption might well be given up for some weaker sense, such as "shown to be highly probable." So-called "non-monotonic logics" might well reject this assumption.
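For a small concrete illustration (the four-point frame and all Python names below are our own choices, not the text's), one can enumerate the propositions on a quasi-ordered set and confirm that p ⇒ q, as just defined, is always itself a proposition and satisfies the residual law:

```python
# Propositions (upward-closed sets) on a quasi-ordered set, with
# p => q = {a : for all b with a <= b, b not in p or b in q}.
from itertools import chain, combinations

U = {0, 1, 2, 3}
order = {(a, a) for a in U} | {(0, 1), (0, 2), (1, 3), (2, 3), (0, 3)}

def up_closed(s):
    return all(b in s for a in s for (x, b) in order if x == a)

subsets = chain.from_iterable(combinations(U, r) for r in range(len(U) + 1))
props = [frozenset(s) for s in subsets if up_closed(frozenset(s))]

def implies(p, q):
    return frozenset(a for a in U
                     if all(b not in p or b in q
                            for (x, b) in order if x == a))

for p in props:
    for q in props:
        assert up_closed(implies(p, q))            # p => q is a proposition
        for r in props:
            assert (p & r <= q) == (r <= implies(p, q))   # residual law
```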
Theorem 11.4.1 A full Heyting algebra of propositions is a Heyting algebra.

Corollary 11.4.2 A Heyting algebra of propositions is a Heyting algebra.

Proof Despite appearances, neither the theorem nor the corollary is of the form "an unmarried male is unmarried." We chose the name "full Heyting algebra of propositions" in anticipation of the theorem, but calling it a "Heyting algebra" does not make it a Heyting algebra. Such patterns of nomenclature are ubiquitous in logic and mathematics. The corollary follows by virtue of the fact that Heyting algebras are equationally definable, and hence, by Birkhoff's varieties theorem, are closed under subalgebras. As for the theorem proper, we first need to verify that the set of propositions A is closed under all of the operations. Because of our requirement that propositions be closed upward, A need not be the power set of U, and so this is not immediate. But if α ∈ p ∩ q and α ⊑ β, then since α ∈ p and α ∈ q and since both p and q are closed upward we have β ∈ p, q, i.e., β ∈ p ∩ q. The case for ∨ is argued similarly. As for ⇒,
suppose α ∈ p ⇒ q and α ⊑ β. To show that β ∈ p ⇒ q we must argue that for arbitrary γ such that β ⊑ γ, γ ∉ p or γ ∈ q. The trick is that since α ⊑ β and β ⊑ γ, it follows that α ⊑ γ, and it was required for α ∈ p ⇒ q that for all γ such that α ⊑ γ, γ ∉ p or γ ∈ q. Finally, note that the empty set is vacuously closed upward. In verifying postulates, the only thing that is not obvious is that ⇒ is a relative pseudo-complement. Suppose p ∩ q ⊆ r. We show p ⊆ q ⇒ r. Let α ∈ p. To show α ∈ q ⇒ r it suffices to consider arbitrary β such that α ⊑ β and show β ∉ q or β ∈ r. If β ∈ q, then, since p is closed upward and α ∈ p, β ∈ p as well, and so β ∈ p ∩ q ⊆ r, i.e., β ∈ r. Conversely, suppose p ⊆ q ⇒ r and α ∈ p ∩ q, i.e., α ∈ p and α ∈ q. Then, obviously α ∈ q ⇒ r and since α ⊑ α, by the definition of q ⇒ r it follows that α ∉ q or α ∈ r. But since α ∈ q, α ∈ r, and thus p ∩ q ⊆ r. ∎
Theorem 11.4.3 Every Heyting algebra is isomorphic to a Heyting algebra of propositions.

Proof Let A = (A, ∧, ∨, ⇒, 0) be an arbitrary Heyting algebra. Let U be the set of prime proper filters on A, and for α, β ∈ U define α ⊑ β iff α ⊆ β. We shall embed A into the full Heyting algebra of propositions on (U, ⊑), using the mapping h(a) = {α ∈ U : a ∈ α}. By Stone's representation for distributive lattices, we know h preserves ∧ and ∨ and is one-one. Also, since it is evident that the only filter containing 0 is the whole lattice, h(0) = 0. We need only argue then that h(a) is a proposition, i.e., is closed upward, and that h preserves ⇒. As to the first, if α ∈ h(a), i.e., a ∈ α, and α ⊆ β ∈ U, then a ∈ β, i.e., β ∈ h(a). As to h preserving ⇒, suppose α ∈ h(a ⇒ b), i.e., a ⇒ b ∈ α ∈ U, and suppose α ⊑ β ∈ U. If we can show β ∉ h(a) or β ∈ h(b), i.e., a ∉ β or b ∈ β, we will have shown α ∈ h(a) ⇒ h(b). Since α ⊑ β means α ⊆ β, we have a ⇒ b ∈ β. So if a ∈ β, then a ∧ (a ⇒ b) ∈ β and, since a ∧ (a ⇒ b) ≤ b, b ∈ β. Arguing the other direction, we assume contrapositively that α ∉ h(a ⇒ b), i.e., α ∈ U but a ⇒ b ∉ α. We argue that the latter means b ∉ [α, a). Recall that if b ∈ [α, a), then ∃c ∈ α so c ∧ a ≤ b. But then c ≤ a ⇒ b and so a ⇒ b ∈ α. Since b ∉ [α, a), [α, a) may be extended to a prime filter β with b ∉ β. So since α ⊑ β and a ∈ β yet b ∉ β, it follows that α ∉ h(a) ⇒ h(b). ∎

We are next going to provide a topological representation for Heyting algebras.

11.5 Topological Representation of Heyting Algebras

Given a topological space (X, C), we can construct a Heyting algebra which we shall call the full Heyting algebra of open sets of the space. Let it be (O, ∩, ∪, ⇒, ∅) where O is the set of all open sets of X and where for A, B ∈ O, A ⇒ B = I((X − A) ∪ B). That O is closed under ∩ and ∪ follows from the well-known topological fact asked for in Exercise 10.6.3. That O is closed under ⇒ follows from the clause (I3) of Section 10.6 together with our definition of an open set as one that equals its own interior. That ∅ is open follows also from that definition, but using (I2) (from the same section). That (O, ∩, ∪) is a distributive lattice with least element ∅ is obvious. We do need, however, to verify that ⇒ satisfies the residual law:

A ∩ Y ⊆ B iff Y ⊆ A ⇒ B.

That A ∩ Y ⊆ B iff Y ⊆ (X − A) ∪ B may be easily verified. But Y ⊆ (X − A) ∪ B implies I(Y) ⊆ I((X − A) ∪ B). This follows from the more general fact that Y ⊆ Z implies I(Y) ⊆ I(Z), which fact comes rather trivially from (I4). But, since Y is open, Y = I(Y), so Y ⊆ A ⇒ B. Arguing the converse direction, suppose Y ⊆ A ⇒ B, i.e., Y ⊆ I((X − A) ∪ B). Then by (I2), Y ⊆ (X − A) ∪ B, and so by the "iff" that starts off this paragraph, A ∩ Y ⊆ B. This gives the following.

Theorem 11.5.1 A full Heyting algebra of open sets is a Heyting algebra.

By a Heyting algebra of open sets we shall mean a subalgebra of a full Heyting algebra of open sets. We then have the following corollary, which follows in the same way the corollary to Theorem 11.4.1 followed.

Corollary 11.5.2 A Heyting algebra of open sets is a Heyting algebra.

Theorem 11.5.3 Every Heyting algebra is isomorphic to a Heyting algebra of open sets.

Proof Instead of giving a direct proof of this theorem, we shall obtain it as a special case of the representation in terms of "propositions" given in Theorem 11.4.3, exploiting the connection we have discovered between quasi-ordered sets and quasi-metrics in the previous chapter. The idea is to take some arbitrary quasi-ordered set (K, ⊑) and then to consider the topological space (K, C) determined by the quasi-metric d which is the characteristic function of ⋢ on K × K. We shall show that the full Heyting algebra of propositions on (K, ⊑) is the same as the full Heyting algebra of open sets of (K, C). Before proceeding we observe that for Y ⊆ K, I(Y) = K − C(K − Y). So for x ∈ K, x ∈ I(Y) iff x ∉ C(K − Y) iff ∃r ∈ ℝ⁺, ∀y ∈ K − Y, d(x, y) ≥ r. But since d is two-valued, this is true iff ∀y ∈ K − Y, d(x, y) = 1, i.e. (since d is the characteristic function for ⋢), iff ∀y ∈ K(y ∉ Y ⇒ x ⋢ y), i.e. (by contraposition), iff ∀y ∈ K(x ⊑ y ⇒ y ∈ Y). The latter gives us a workable characterization of the members x of I(Y).

We first argue that the "propositions," i.e., the subsets of K that are closed upward, are precisely the open sets. Recall that a subset Y of K is a proposition iff

(1) ∀x, y ∈ K(x ∈ Y & x ⊑ y ⇒ y ∈ Y),

and a subset Y of K is open iff

(2) Y ⊆ I(Y).

But, by our characterization of I at the end of the last paragraph, (2) is equivalent to

(3) ∀x(x ∈ Y ⇒ ∀y ∈ K(x ⊑ y ⇒ y ∈ Y)).

But (1) and (3) are essentially just stylistic variants of one another and are trivially equivalent. Since ∧, ∨, and 0 are ∩, ∪, and ∅ in both Heyting algebras of propositions and Heyting algebras of open sets, it remains only to argue that p ⇒ q = I((K − p) ∪ q), where ⇒ is the implication operation defined on propositions. By definition, x ∈ p ⇒ q iff
∀y ∈ K, x ⊑ y implies y ∉ p or y ∈ q. But by our characterization of I, x ∈ I((K − p) ∪ q) iff ∀y ∈ K, x ⊑ y implies y ∈ (K − p) ∪ q, i.e., y ∉ p or y ∈ q. So x ∈ p ⇒ q iff x ∈ I((K − p) ∪ q). ∎
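The equivalence just established is easy to spot-check mechanically. In the sketch below (the three-point frame and the Python names are ours), I(Y) is computed directly from the workable characterization above, and the implication on propositions is compared with I((K − p) ∪ q) on all open sets:

```python
# Opens of the quasi-metric topology = up-sets, and p => q = I((K - p) | q).
K = {0, 1, 2}
leq = {(a, a) for a in K} | {(0, 1), (0, 2)}      # a quasi-order on K

def interior(Y):
    return {x for x in K if all(y in Y for (a, y) in leq if a == x)}

def implies(p, q):
    return {x for x in K if all(y not in p or y in q
                                for (a, y) in leq if a == x)}

powerset = [{x for x in K if i >> x & 1} for i in range(2 ** len(K))]
opens = [Y for Y in powerset if Y == interior(Y)]  # open iff Y = I(Y)
for p in opens:
    for q in opens:
        assert implies(p, q) == interior((K - p) | q)
```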
11.6 Embedding Heyting Algebras into Closure Algebras

We have already established that a Heyting algebra of open sets is indeed a Heyting algebra in Theorem 11.5.1 and its Corollary 11.5.2. This result can be cast a little more generally. Given a closure algebra B, define an element a of B to be open iff ia = a (where i is defined in terms of the given closure operator by ia = −c−a). For a given closure algebra, we define a Heyting algebra of open elements as an abstract version of the notion of a Heyting algebra of open sets. Thus, where (B, ∧, ∨, −, c) is a closure algebra, the Heyting algebra of open elements of B is a structure (A, ∧, ∨, ⇒, 0) where A is the set of open elements of B, ∧ and ∨ are the corresponding operations of the closure algebra restricted to A, a ⇒ b = i(−a ∨ b), and 0 is the least element of B. We leave it as an exercise for the reader to verify that A is closed under the operations and that ⇒ is indeed a relative pseudo-complement (the proof being precisely analogous to the corresponding proof for Heyting algebras of open sets).
Theorem 11.6.1 The Heyting algebra of open elements of a closure algebra is a Heyting algebra.

We can also recast our Theorem 11.5.3, which represented Heyting algebras as Heyting algebras of open sets, more abstractly as follows.

Theorem 11.6.2 Every Heyting algebra is isomorphic to a Heyting algebra of open elements in some closure algebra.

Notice that no proof is needed since (unlike the situation with Theorem 11.6.1) Theorem 11.6.2 is actually a weakening of the original, more concrete theorem.
11.7 Translation of H into S4

McKinsey and Tarski (1948) demonstrated that in a certain sense the intuitionist sentential calculus H may be translated into the modal logic S4. We define the translation * inductively:
(1) p* = □p;
(2) (¬φ)* = □−φ*;
(3) (φ ∧ ψ)* = φ* ∧ ψ*;
(4) (φ ∨ ψ)* = φ* ∨ ψ*;
(5) (φ → ψ)* = □(φ* ⊃ ψ*).
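The translation is immediately executable. In the following sketch, formulas are encoded as nested tuples with string tags of our own choosing; the text fixes only the clauses (1)-(5) themselves.

```python
# The Godel-McKinsey-Tarski translation * of H-formulas into S4-formulas.
def star(phi):
    if isinstance(phi, str):                      # (1) atomic p: p* = box p
        return ('box', phi)
    op = phi[0]
    if op == 'not':                               # (2) (not phi)* = box -(phi*)
        return ('box', ('neg', star(phi[1])))
    if op == 'and':                               # (3)
        return ('and', star(phi[1]), star(phi[2]))
    if op == 'or':                                # (4)
        return ('or', star(phi[1]), star(phi[2]))
    if op == 'imp':                               # (5) box(phi* hook psi*)
        return ('box', ('hook', star(phi[1]), star(phi[2])))
    raise ValueError(op)

print(star(('imp', 'p', ('or', 'p', 'q'))))
# ('box', ('hook', ('box', 'p'), ('or', ('box', 'p'), ('box', 'q'))))
```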
Theorem 11.7.1 For each sentence φ of H, ⊢_H φ iff ⊢_S4 φ*.

Before we prove the theorem we state the following, which may be established by a straightforward induction on formulas.

Lemma 11.7.2 Let (B, ∧, ∨, −, 0) be a closure algebra and let (A, ∧, ∨, ⇒, 0) be the Heyting algebra of open elements of B. Let ι be an interpretation of S4-formulas in B and let ι′ be an interpretation of H-formulas in A which is such that for all atomic sentences p, ι′(p) = ι(p). Then for all H-formulas φ, ι′(φ) = ι(φ*).

Proof Basically the reader should be able to gestalt the lemma by reason of the fact that the definitions of the operations in a Heyting algebra of open elements and the "definitions" of the connectives of H given by the translation * parallel one another, but technically we need an induction on the length of the formula φ. For the base case, where φ is an elementary sentence p, we note that ι′(p) is an open element of B, hence ι′(p) = iι′(p) = iι(p) = ι(p*). We leave the trivial cases when φ is a conjunction or disjunction to the reader and jump to the case when φ is ψ → χ. ι′(ψ → χ) = ι′(ψ) ⇒ ι′(χ), and, by definition of ⇒, this equals i(ι′(ψ) ⊃ ι′(χ)). Then, by inductive hypothesis, this is i(ι(ψ*) ⊃ ι(χ*)), and further ι(□(ψ* ⊃ χ*)) = ι((ψ → χ)*). The case of negation is handled similarly, but is complicated slightly by the fact that we did not take the pseudo-complement operation ¬ as primitive in Heyting algebras but instead defined ¬a = a ⇒ 0. We argue that ι′(¬φ) = ¬ι′(φ) = ι′(φ) ⇒ 0 = i(ι′(φ) ⊃ 0) = i(−ι(φ*)) = iι(−φ*) = ι(□−φ*) = ι((¬φ)*). ∎

Now we turn to the proof of the preceding theorem.

Proof From left to right is rather trivial. It may be proven relatively mechanically by induction on the length of proof of φ in H, producing for each axiom φ of H a proof of φ* in S4, and observing that if φ*, □(φ* ⊃ χ*) are theorems of S4, then so is χ*. Alternatively, we can take our Theorem 11.6.1 as establishing the faithfulness of the translation from left to right, for it really just amounts to a statement of that faithfulness in algebraic language. Thus, if ⊢_H φ then by the generalized soundness theorem for H we have that φ is valid in all Heyting algebras. Hence, by Theorem 11.6.1, φ is valid particularly in the class of all Heyting algebras of open elements in closure algebras. Hence, by Lemma 11.7.2, φ* is valid in all closure algebras and thus, by the generalized completeness theorem for S4, we have ⊢_S4 φ*.

From right to left we proceed contrapositively, showing that if not ⊢_H φ then not ⊢_S4 φ*. Supposing not ⊢_H φ, we have by the general completeness theorem for H that there is a Heyting algebra (A, ∧, ∨, ⇒, 0) and an interpretation ι of H sentences in A such that ι(φ) ≠ 1. By our Theorem 11.6.2 we know that A is isomorphic to a Heyting algebra of open elements in some closure algebra. Let the closure algebra be (B, ∧, ∨, −, c) and let the Heyting algebra of open elements be (B′, ∧, ∨, ⇒, 0). It is clear that φ can be rejected in an isomorphic image of A just as well as in A, so there is an interpretation ι′ in B′ so that ι′(φ) ≠ 1. We now define an interpretation ι″ of S4 sentences in B so that ι″(φ*) ≠ 1. Set ι″(p) = ι′(p) for each elementary sentence p. We know by the previous lemma that ι′(φ) = ι″(φ*). Thus since ι′(φ) ≠ 1, ι″(φ*) ≠ 1. Hence, by the general soundness theorem for S4, not ⊢_S4 φ*. ∎
11.8 Alternation Property for H

Theorem 11.8.1 If ⊢_H φ ∨ ψ, then ⊢_H φ or ⊢_H ψ.
Proof We could give a direct proof analogous to the proof given of the alternation property for S4, but instead we shall actually use the alternation property for S4 by way of the Gödel–McKinsey–Tarski translation of H into S4. We shall argue as follows. Suppose ⊢_H φ ∨ ψ. Then, by the translation, ⊢_S4 φ* ∨ ψ*. But then, by the lemma that we will prove below, ⊢_S4 □φ* ∨ □ψ*. Then, by the alternation property for S4, ⊢_S4 □φ* or ⊢_S4 □ψ*. So by the translation, ⊢_H φ or ⊢_H ψ, as desired. ∎
So we only need the following lemma.

Lemma 11.8.2 Let φ* be the Gödel–McKinsey–Tarski translation of a sentence φ of H. Then ⊢_S4 φ* ↔ □φ*.

Proof We induct on the complexity of φ.

(1) Base case. φ = p, where p is a sentential variable. Then p* = □p, and obviously ⊢_S4 □p ↔ □□p by the Axiom of Necessity and the characteristic S4 axiom.

(2) φ = ¬ψ. Then φ* = □−ψ*, and the reasoning proceeds as in the base case.

(3) φ = ψ ∧ χ. Then (ψ ∧ χ)* = ψ* ∧ χ*. By inductive hypothesis, ⊢_S4 ψ* ↔ □ψ* and ⊢_S4 χ* ↔ □χ*. So by the replacement theorem for S4, ⊢_S4 (ψ* ∧ χ*) ↔ (□ψ* ∧ □χ*). But since it is a well-known fact that ⊢_S4 □(α ∧ β) ↔ (□α ∧ □β) (distribution of necessity over conjunction), we have ⊢_S4 (ψ* ∧ χ*) ↔ □(ψ* ∧ χ*) as desired.

(4) φ = ψ ∨ χ. The proof proceeds as in case (3), but, of course, cannot appeal to simple distribution of necessity over disjunction, since that is a well-known modal fallacy. Instead the trick is to proceed from the step ⊢_S4 ψ* ∨ χ* ↔ □ψ* ∨ □χ* by the S4 axiom (with Axiom of Necessity) to ⊢_S4 ψ* ∨ χ* ↔ □□ψ* ∨ □□χ*. Then use ⊢_S4 □□α ∨ □□β ↔ □(□α ∨ □β) to obtain ⊢_S4 ψ* ∨ χ* ↔ □(□ψ* ∨ □χ*). Inductive hypothesis then strips the inner necessity signs off, giving ⊢_S4 ψ* ∨ χ* ↔ □(ψ* ∨ χ*), as desired.

(5) φ = ψ → χ. Then φ* = □(ψ* ⊃ χ*), and the proof proceeds as in the base case. ∎
11.9 Algebraic Decision Procedures for Intuitionistic Logic

It is possible to show that the theorems of H are decidable by using the translation of H into S4 and the fact that S4 has the finite model property. Indeed, by fussing with this it is possible to show that H itself has the finite model property, but instead we shall sketch a more direct proof. The reader is advised to compare the presentation with that of the corresponding theorems for S4 of Section 10.10, since we shall be more brief here.

Theorem 11.9.1 Let (L, ∧, ∨, ⇒, 0) be a Heyting lattice and let (L′, ∧, ∨, 0) be a complete infinitely distributive sublattice of L (with the same lower bound 0). Then there exists a binary operation ⇒′ on L′ such that (L′, ∧, ∨, ⇒′, 0) is a Heyting algebra and when a, b, a ⇒ b ∈ L′, then a ⇒′ b = a ⇒ b.

Proof We use the same symbols to denote the meet and join operations in the original lattice and in the sublattice. We define
(1) a ⇒′ b = ∨{x ∈ L′ : x ∧ a ≤ b}.
We know from the proof of Theorem 11.2.3 (which uses infinite distributivity) that

(2) a ⇒ b = ∨{x ∈ L : x ∧ a ≤ b},

and so clearly a ⇒′ b ≤ a ⇒ b. We show that the inequality holds as well in the other direction when a, b, a ⇒ b ∈ L′. It suffices to recall (again from the proof of Theorem 11.2.3) that (a ⇒ b) ∧ a ≤ b (modus ponens). Since a ⇒ b ∈ L′, then a ⇒ b is actually one of the components of the join that defines a ⇒′ b; pictorially a ⇒′ b is ... ∨ (a ⇒ b) ∨ .... But then clearly a ⇒ b ≤ a ⇒′ b. ∎
Remark 11.9.2 When L′ is finite the conditions above collapse to the condition that L′ is a sublattice of L (with the same lower bound).

Theorem 11.9.3 The intuitionistic propositional calculus H has the finite model property.

Proof Suppose that ⊬_H φ. Then form the Lindenbaum algebra of H, which is a Heyting lattice L. Let ψ₁, ..., ψₙ be all of the subsentences of φ. Consider the sublattice L′ generated by [ψ₁], ..., [ψₙ], [f]. We will show that L′ is finite. Using distribution, every element can be put into "meet-normal" form (x₁,₁ ∨ ... ∨ x₁,ₘ₁) ∧ ... ∧ (xₙ,₁ ∨ ... ∨ xₙ,ₘₙ), where the xᵢⱼ are the generators. Because of associativity, commutativity, and idempotence this form can be defined to be unique (up to an ordering of the generators). So clearly the sublattice L′ is finite, since it is easy to see that there are at most 2^(2^(n+1)) such forms.

There is not yet an implication operation defined on L′. We cannot simply define [ψ] ⇒ [χ] = [ψ → χ], since the result might not be defined when ψ → χ is not a subsentence of φ (or even a conjunction of disjunctions of such, even throwing in f). And we cannot simply expand L′, closing it under ⇒, since the result need not be finite (as it was when we did the corresponding construction with meet and join). But we do know by Theorem 11.9.1 that a new implication operation ⇒′ can be defined on L′ that agrees with the original implication operation ⇒ when a, b, and a ⇒ b are all in L′. This turns out to be good enough, and (L′, ∧, ∨, ⇒′, [f]) will be our desired finite model.

Let us consider the interpretation ι(p) = [p], for each atomic sentence p which is a subsentence of φ, and otherwise ι(p) is arbitrarily defined. This is very much like the canonical valuation in the Lindenbaum algebra except that when one comes to compute v_ι(ψ₁ → ψ₂) = [ψ₁] ⇒′ [ψ₂], the result cannot be guaranteed to be [ψ₁ → ψ₂]. But it can be when ψ₁ → ψ₂ is a subsentence of φ, and this is good enough. Thus one can prove the following by an easy induction on sentences:

Lemma 11.9.4 Given v_ι as defined above, if ψ is a subsentence of φ, then v_ι(ψ) = [ψ].

The finite model property now follows directly. Since we have been assuming that ⊬_H φ, then [φ] ≠ 1, and v_ι(φ) = [φ] is undesignated. ∎
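The finiteness construction can be watched in miniature. In the sketch below the divisors of 30 stand in, by our own choice, for the Lindenbaum algebra: a set of generators is closed under meet and join to give L′, and ⇒′ is defined by the formula of Theorem 11.9.1. The final loop confirms that ⇒′ agrees with ⇒ whenever a, b, and a ⇒ b all lie in L′.

```python
# Generating a finite sublattice and equipping it with =>' (Theorem 11.9.1).
from math import gcd
from functools import reduce

L = [1, 2, 3, 5, 6, 10, 15, 30]         # divisors of 30, a distributive lattice
meet = gcd
def join(a, b): return a * b // gcd(a, b)
def leq(a, b): return b % a == 0

def sublattice(gens):
    s = set(gens) | {1}                  # keep the same lower bound
    while True:
        new = {f(a, b) for a in s for b in s for f in (meet, join)} - s
        if not new:
            return s
        s |= new

def imp(dom, a, b):                      # join of {x in dom : x meet a <= b}
    return reduce(join, [x for x in dom if leq(meet(x, a), b)], 1)

Lp = sublattice({2, 6, 30})              # {1, 2, 6, 30}
for a in Lp:
    for b in Lp:
        if imp(L, a, b) in Lp:           # hypothesis of Theorem 11.9.1
            assert imp(Lp, a, b) == imp(L, a, b)
print(imp(L, 6, 2), imp(Lp, 6, 2))       # 10 vs 2: here 6 => 2 is not in L'
```

Where a ⇒ b falls outside L′ (as with 6 ⇒ 2 above), the two operations may legitimately disagree, exactly as the theorem allows.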
11.10 LC and Pretabularity
Dummett (1959) presented the sentential calculus LC which is obtained from the intuitionist sentential calculus H by the addition of all sentences of the form

(1) (φ → ψ) ∨ (ψ → φ).

Dummett then gave a completeness proof for LC with respect to the sequence of matrices that Gödel (1933) used in showing that H has no finite characteristic matrix. Dummett proved that although LC too has no finite characteristic matrix, still each (n + 2)-valued Gödel matrix is characteristic for those LC sentences containing but n distinct sentential variables. Ulrich (1970) proved that every extension of LC that is closed under substitution and modus ponens (we call these normal extensions) has the finite model property. In this section we report results of Dunn and Meyer (1971), giving an alternative proof of Dummett's completeness theorem by algebraic means, but more importantly strengthening Ulrich's result by showing that every normal extension of LC has a finite characteristic matrix. Similar results have been obtained for S5 by Scroggs (1951) (presented in Section 10.11) and for RM by Dunn (1970). The proofs we give are exactly parallel to those of Dunn (1970). Maksimova (1972) has set these results about LC into the context of a very pleasant general result, to wit that there are only three normal extensions of the intuitionistic sentential logic H that have the property of pretabularity (that all their normal extensions have a finite characteristic matrix).

Where X is an extension of LC (perhaps LC itself), by an X-algebra we mean a pseudo-Boolean algebra in which all of the theorems of X are valid. In pseudo-Boolean algebras generally, ¬a = a ⇒ 0. So in considering LC-algebras, we need only concern ourselves with ∧, ∨, ⇒, and 0. Certain LC-algebras are especially important. By G_∞ we mean that algebra whose elements are the negative integers and 0 together with −ω (where −ω is the least element), and whose operations are defined as follows:

(i) a ∧ b = min(a, b);
(ii) a ∨ b = max(a, b); and
(iii) a ⇒ b = 1 (i.e., the greatest element 0) if a ≤ b, and a ⇒ b = b if a > b.

By G_n we shall mean that subalgebra of G_∞ whose elements are the negative integers −n to −1 inclusive, together with 0 and −ω. We take G₀ to consist of just −ω and 0. Incidentally, our definitions of these matrices are dualized from those of Gödel (1933) and other references, where conjunction is interpreted as max(a, b), etc. Generalizing, by a Gödel algebra we shall mean any algebra whose elements form a chain with least and greatest elements, and whose operations are defined in an analogous way. All Gödel algebras are LC-algebras.

Where A is a pseudo-Boolean algebra and F is a filter of A, we define the quotient algebra A/F. The elements of A/F are the equivalence classes [a], where b ∈ [a] just in case a ⇒ b, b ⇒ a ∈ F.
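The matrices G_n are easily implemented. In the sketch below (encoding ours), −ω is rendered as −∞ and, following the dualized presentation above, the greatest element, which is the sole designated value, is 0:

```python
# The Godel matrices G_n, with Dummett's axiom valid and Peirce's law not.
NEG_OMEGA = float('-inf')

def godel_algebra(n):
    return [NEG_OMEGA] + list(range(-n, 0)) + [0]

def join(a, b): return max(a, b)
def imp(a, b):  return 0 if a <= b else b      # 0 is the greatest element

G2 = godel_algebra(2)                          # -omega, -2, -1, 0
# (phi -> psi) v (psi -> phi) always takes the designated value 0 ...
assert all(join(imp(a, b), imp(b, a)) == 0 for a in G2 for b in G2)
# ... while ((phi -> psi) -> phi) -> phi does not, so G_2 is not Boolean.
assert any(imp(imp(imp(a, b), a), a) != 0 for a in G2 for b in G2)
```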
Theorem 11.10.1 If X is an extension of LC, A is an X-algebra, and F is a filter of A, then A/F is an X-algebra and is a homomorphic image of A under the natural homomorphism, h(a) = [a].

Proof A/F is a pseudo-Boolean algebra and is a homomorphic image of A. We need then only observe that every theorem of X is valid in A/F. Since A/F is a homomorphic image of A and every theorem of X is valid in A, this follows. ∎
Theorem 11.10.2 Let X, A, and F be as in Theorem 11.10.1, but let F be prime, i.e., a ∨ b ∈ F only if a ∈ F or b ∈ F. Then A/F is a Gödel algebra.

Proof That A/F is a chain is immediate given (1) and the primeness of F. Further, A/F must have least and greatest elements since every pseudo-Boolean algebra does. We need then only check that the operations are defined as on a Gödel algebra. This is obvious for ∧ and ∨, and the following theorem of LC, proved in Dummett (1959), ensures that ⇒ is all right:

(φ → ψ) ∨ ((φ → ψ) → ψ).

Thus since F is prime, either a ⇒ b ∈ F or (a ⇒ b) ⇒ b ∈ F. But a ⇒ b ∈ F iff [a] ≤ [b]. So if [a] ≤ [b], then a ⇒ b ∈ F. But in general for a pseudo-Boolean algebra, x ∈ F only if [x] is the greatest element in the quotient algebra. So [a] ⇒ [b] = [a ⇒ b], which is the greatest element of A/F, as it should be. On the other hand, if [a] ≰ [b], then a ⇒ b ∉ F, and then (a ⇒ b) ⇒ b ∈ F. So then [a ⇒ b] ≤ [b]. And in any pseudo-Boolean algebra, since b ⇒ (a ⇒ b) is the greatest element and hence in F, then [b] ≤ [a ⇒ b]. So [a] ⇒ [b] = [a ⇒ b] = [b], as it should. ∎

Exercise 11.10.3 Prove that (φ → ψ) ∨ ((φ → ψ) → ψ) is a theorem of LC.

Theorem 11.10.4 Let X and A be as in Theorem 11.10.1, and let a ∈ A be such that a ≠ 1. Then there is a homomorphism h of A onto a Gödel algebra which is an X-algebra, such that h(a) ≠ 1.

Proof Immediate from Theorems 11.10.1 and 11.10.2 once we invoke Stone's prime filter separation theorem. ∎

We remark that it easily follows from Theorem 11.10.4, by a familiar construction used by Stone, that every LC-algebra is isomorphic to a subdirect product of Gödel algebras. Since the only Gödel algebra which is a Boolean algebra (excluding the degenerate one-element algebra) is G₀, this result may be regarded as a generalization of the embedding theorem of Stone's for Boolean algebras.

Theorem 11.10.5 Consider the sequence of Gödel algebras G₀, G₁, G₂, .... If a sentence φ is valid in G_i, then φ is valid in G_j for all j ≤ i.

Proof This is immediate since each G_j is a subalgebra of G_i. ∎
Where X is a sentential calculus and A is a set of atomic sentences, let X/A be that sentential calculus like X except that its sentences contain no atomic sentences other than those in A. The following theorem is then obvious.
Theorem 11.10.6 If X is a normal extension of LC, then A(X/A) is an X-algebra, and in fact is characteristic for X/A, since any non-theorem may be falsified under the canonical valuation that sends every sentence φ to ‖φ‖.

The hard part of Dummett's completeness result for LC is showing that if a sentence φ is not a theorem, then there is some Gödel algebra G_n such that φ is not valid in G_n. This is contained in the following theorem, though generalized to arbitrary normal extensions of LC.

Theorem 11.10.7 Let X be a normal extension of LC. Then if a sentence φ is not a theorem of X, then there is some Gödel algebra G_n such that G_n is an X-algebra and φ is not valid in G_n.

Proof It follows quickly from Theorems 11.10.6 and 11.10.4. Thus if φ is not a theorem of X, then by Theorem 11.10.6, φ is falsifiable in the X-algebra A(X/A) where A is the set of atomic sentences occurring in φ. But since ‖φ‖ ≠ ‖ψ ⊃ ψ‖, the greatest element, then by Theorem 11.10.4, there is a homomorphism h of A(X/A) onto a Gödel algebra G such that G is an X-algebra and h(‖φ‖) ≠ 1. We may then falsify φ in G by the interpretation ι(φ) = h(‖φ‖). Note that G is finitely generated since it is the homomorphic image of A(X/A), which itself is finitely generated by the elements ‖p‖ such that p ∈ A. Thus G is finitely generated by the elements h(‖p‖) such that p ∈ A. It is obvious that every finitely generated Gödel algebra is finite, and it is further obvious that every finite Gödel algebra containing at least two elements is isomorphic to some G_n. Thus G is isomorphic to some G_n, which completes the proof. ∎
We now turn to the proof of our principal result.

Theorem 11.10.8 Every consistent proper normal extension of LC has a finite characteristic matrix, namely, some Gödel algebra G_n.

Proof The reasoning mimics that of Scroggs (1951). Let I be the set of indices of those Gödel algebras G_n that are X-algebras, where X is the given consistent proper normal extension of LC. By Theorem 11.10.7, since X is consistent, I is non-empty. If I contains infinitely many indices, then I contains every index because of Theorem 11.10.5. But then it follows from Dummett's completeness result that X is identical to LC, contrary to the assumption that X is a proper extension. So I contains only finitely many indices, and then by Theorem 11.10.5, there must be some index i such that I contains exactly those indices less than or equal to i. By construction, G_i is an X-algebra. Now suppose that a sentence φ is not a theorem of X. Then by Theorem 11.10.7, φ is not valid in some X-algebra G_k, and by our choice of i, k ≤ i. But then by Theorem 11.10.5, φ is not valid in G_i. So G_i is the desired finite characteristic matrix. ∎
We remark that Theorem 11.10.8 has as a corollary that every proper normal extension of LC may be axiomatized by adding as an axiom to LC one of the sentences Gödel (1933) used in showing that H has no finite characteristic matrix, and that from this it easily follows that the only consistent and complete normal extension of LC is the classical sentential logic. (Compare the proof of similar corollaries at the end of Section 10.11.) It should also be remarked that Thomas (1962) contains another interesting way of axiomatizing all of the Gödel matrices G_n, in which each of them is axiomatized by the addition of some appropriate pure implicational sentence as an axiom to LC. We finally allude to the fact that strong completeness results for LC are readily obtainable from Theorem 11.10.7, which are along the lines of strong completeness results for RM in Dunn (1970).
12 GAGGLES: GENERAL GALOIS LOGICS

12.1 Introduction
The aim of this chapter is to provide a uniform semantical approach to a variety of non-classical logics, including intuitionistic logic and modal logic, so as to recover the representation theorems of Chapters 10 and 11 as special cases. The strategy is to adopt the basic framework of the Kripke-style semantics for modal and intuitionistic logic (cf. Chapters 10 and 11), using accessibility relations to give truth conditions for the connectives. We generalize this in line with Jonsson and Tarski (1951, 1952) so that in general an n-place connective will be interpreted using an (n+1)-place accessibility relation (cf. Section 8.12). Besides the Kripke semantics for modal logic, there are motivating precedents with the Routley and Meyer (1973) semantics for relevant implication and the Goldblatt (1974) semantics for orthonegation.

The problem with the Jonsson–Tarski result is that while it shows how Boolean algebras with n-place "operators" can be realized using (n+1)-place relations, the context is more restrictive than one would like. For example, the underlying structure must be a Boolean algebra, and the "operators" must distribute over Boolean disjunction in each of their places. We have already shown in Section 8.12 that Boolean algebras can be replaced with distributive lattices. But in this chapter we shall examine structures that we call "distributoids," which relax the constraints of Jonsson and Tarski. Distributoids are not the full abstraction we are seeking, because there need be no interaction between the various operators. We have noticed that many important logical principles can be seen as involving relationships between pairs of logical operators that may be seen under the algebraic abstractions of residuation and Galois connections.¹ We shall abstract these relationships into an algebraic structure called a "gaggle." Incidentally, we owe the name "gaggle" to Paul Eisenberg (a historian of philosophy, not a logician), who supplied it at our request for a name like a "group," but which suggested a certain amount of complexity and disorder. It is a euphonious accident that "gaggle" is the pronunciation of the acronym for "general galois logics."²

The general approach here is algebraic; thus we will think of a logic in terms of its "Lindenbaum algebra," formed by dividing the sentences into classes of provable equivalents, defining operators on these equivalence classes by means of the connectives applied to representatives. We shall represent the algebras in a way pioneered by Stone (1936, 1937), and extended by Jonsson and Tarski, so that elements are mapped into sets (thought of as "propositions," or sets of states where the sentences are true). This gives completeness results for the various logics. See Chapter 1 for a discussion of the general relation between representation results and completeness theorems.

In their original incarnation (Dunn 1991), gaggles were required to have underlying distributive lattices so that meet ("and") is represented as intersection, and join ("or") is union. Canonically then states are prime filters. However, this condition can be weakened to where the underlying structure is just a partial order (as with the Lambek calculus). Then states can just be principal cones and the complements of principal dual cones, and one does not need Zorn's lemma. In certain cases (as with orthologic and at least the non-exponential part of linear logic) where the logic is a meet-semilattice in any case, and join can be defined from meet using a negation that is a lattice involution, the methods can be extended so that both meet and join are given reasonable interpretations.

At the end of this chapter we shall give some applications. We should caution that in most cases, the general representation theorem leads to something other than the usual semantics known in the literature. Thus, for example, the usual semantics for intuitionistic implication (cf. Chapter 11) uses a two-place accessibility relation, whereas the gaggle approach yields a three-place accessibility relation. It then becomes necessary to examine the details of the general representation, applying specific algebraic properties of the logic in question to see that the usual result falls out as a special case after "fiddling with the representation." A toy example of this is given, showing how the usual Stone representation for Boolean algebras (where Boolean complement becomes set complement) can be obtained from the gaggle representation (where Boolean complement is represented using a two-place accessibility relation).
12.2 Residuation and Galois Connections
Consider two posets A = (A, ≤) and B = (B, ≤′) with functions f : A ↦ B, g : B ↦ A.

The pair (f, g) is called residuated iff

(rp) fa ≤′ b iff a ≤ gb.

The pair (f, g) is called a Galois connection iff

(gc) b ≤′ fa iff a ≤ gb.

A dual Galois connection is a pair (f, g), where

(dgc) fa ≤′ b iff gb ≤ a.

A dual residuated pair (f, g) is a pair (f, g), where

(drp) b ≤′ fa iff gb ≤ a.
¹There has been an anticipation of this in Sections 3.10, 3.17, and 8.1, but where the underlying order structure was only a partial order and not a (distributive) lattice.

²We do not ourselves endorse the alternative pronunciation "giggle."
Remark 12.2.1 We have already defined a Galois connection in Section 3.13 for the special case where A = B. The definitions above differ from one another only in the direction of an inequality here and there. Thus turn around the left inequality in the definition of a residuated pair, and we obtain a Galois connection, and turning around the right inequality gives a dual Galois connection. If both the left and right inequalities are turned around, of course a residuated pair becomes a dual residuated pair, and similarly for a Galois connection and its dual. Incidentally, observe that a dual residuated pair (f, g) is just a residuated pair (g, f), and so for the most part we shall not bother to look separately at dual residuated pairs. The moral is that as long as the two posets A and B are treated as distinct, one is of course free to turn the inequalities around, since the converse of a partial ordering is again a partial ordering. As someone in Australia can testify, one person's "up" is another person's "down." But these are abstractly all the same, and yet can be distinguished if we assume that A and B are the same, as we shall do henceforth. Our next theorem is easy to prove.
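Before the theorem, a concrete instance may help; the pair of maps below is our own choice of example. On the poset (ℤ, ≤), doubling is residuated by floor halving, and the composites behave as the theorem will record in clause (1):

```python
# A residuated pair on the integers: f(x) = 2x, g(y) = floor(y / 2).
def f(x): return 2 * x
def g(y): return y // 2

xs = range(-20, 21)
# (rp): fa <= b iff a <= gb.
assert all((f(a) <= b) == (a <= g(b)) for a in xs for b in xs)
# The composite laws of Theorem 12.2.2(1): fgx <= x and x <= gfx.
assert all(f(g(x)) <= x <= g(f(x)) for x in xs)
```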
Theorem 12.2.2
(1) For a residuated pair, the following is an equivalent definition:
(a) both f and g are monotonic, and fgx ≤ x, x ≤ gfx.
Moreover, if the poset is a lattice, then
(b) f(x ∨ y) = f(x) ∨ f(y) and
(c) g(x ∧ y) = g(x) ∧ g(y).
(2) For a Galois connection, the following is an equivalent definition:
(a) both f and g are antitonic, and x ≤ fgx, x ≤ gfx.
Moreover, if the poset is a lattice, then
(b) f(x ∨ y) = f(x) ∧ f(y) and
(c) g(x ∨ y) = g(x) ∧ g(y).
(3) For a dual Galois connection, the following is an equivalent definition:
(a) both f and g are antitonic, and fgx ≤ x, gfx ≤ x.
Moreover, if the poset is a lattice, then
(b) f(x ∧ y) = f(x) ∨ f(y) and
(c) g(x ∧ y) = g(x) ∨ g(y).

Terminological note. The definition of a Galois connection was first introduced by Birkhoff (1940), and was defined in terms of condition (2a) above. (Birkhoff attributes to J. Schmidt the equivalence stated as (gc) and which constitutes our definition.) Galois connections were extensively studied by Ore (1944) and Everett (1944). The notion of a Galois connection of course abstracts out the correspondence between the subfields of a given separable field extension and the subgroups of the Galois group of transformations that leave that given subfield unmoved. Note the relation F₁ ⊆ F₂ iff H(F₂) ⊆ H(F₁), which gives a clear motivation for turning the partial order around on the right-hand side of a Galois connection. There is an issue about this, as we shall see.

The notion of residuation has most often been discussed in the case of binary operations, in the context of "residuated partially ordered groupoids" (see below), but the definition of a residuated pair is a natural extension of that concept to the unary case (with the further understanding that the function need not be an operation taking values in its domain). Our unary form of a residuated pair can be found (after some decoding) in Blyth and Janowitz (1972), where it is explicitly contrasted with a Galois connection (pp. 18-19). It is also somewhat confusingly to be found in Gierz et al. (1980), where it is called a "Galois connection," with the ironic remark "notice that we have to keep the order straight." Gratzer (1979, p. 51) also follows this usage in an exercise. These usages reinforce the moral above about the essential abstract equivalence of these notions. Finally, let us note that in the language of category theory, a residuated pair can be understood as a pair of "adjoint functors" (cf. MacLane 1971). We believe that the theorem in MacLane "Galois connections are adjoint pairs" (p. 93) has been the driving force in identifying what we are distinguishing as Galois connections and residuated pairs (cf. also Lambek 1981), though MacLane himself is explicit about the fact that Galois connections are antitonic, and that one must take the "opposite" of the right-hand category (the dual of the poset) in order to obtain the desired result.

Given a binary relation R on a set U (a frame), it is easy to construct examples of residuated pairs and Galois connections (also their duals) defined on subsets of U as follows.

Example 12.2.3 Residuated pair (◇†A ⊆ B ⟺ A ⊆ □B):

◇†A = {β : ∃α(αRβ & α ∈ A)},
□A = {β : ∀α(βRα ⇒ α ∈ A)}.

Example 12.2.4 Dual residuated pair (◇A ⊆ B ⟺ A ⊆ □†B):

◇A = {β : ∃α(βRα & α ∈ A)},
□†A = {β : ∀α(αRβ ⇒ α ∈ A)}.

Example 12.2.5 Galois connection (A ⊆ B⊥ ⟺ B ⊆ ⊥A):

⊥A = {β : ∀α(α ∈ A ⇒ βRα)},
A⊥ = {β : ∀α(α ∈ A ⇒ αRβ)}.

Example 12.2.6 Dual Galois connection (There is no customary notation; we use ? for "possibly false." ?A ⊆ B ⟺ ?†B ⊆ A):

?A = {β : ∃α(βRα & α ∉ A)},
?†A = {β : ∃α(αRβ & α ∉ A)}.
398
GAGGLES: GENERAL GALOIS LOGICS
"it will always be the case," OA is FA for "it will (sometimes) be the case," and D~A and 0 ~A become the past tense versions H A and P A, respectively. Note also that in standard modal logic, when the accessibility relation is symmetric, (as for the logics Band S5), the "backward" operators are indistinguishable from the "forward" operators, and we can thus strike out the 'T' in the residuated pair and dual residuated pair laws above. The interesting thing is that these laws always hold if we distinguish backwards from forwards. Finally, we discuss Galois connections. It is interesting to note that these definitions can be found in Birkhoff (1940, 1948, 1967) under the heading of "polarities," and Everett (1944) showed that all Galois connections defined on power sets can be obtained from polarities. (Our results concerning gaggles will obtain, as a very special case, the more general result for Galois connections defined on distributive lattices, or, by trivial modification, Galois connections defined on posets). In interpreting R it is best to think of it as an "inaccessibility relation," or better as "incompatibility." It is customary to denote this relation as..L, and in Birkhoff (1940, p. 125) it is connected to Hilbert spaces as the "orthogonal" or "perp" relation. Goldblatt (1974) gives a completeness theorem for ortholattices (but not orthomodular lattices) in effect using A..L as the definition of negation. In many cases, including Goldblatt's, it is natural to require that the relation be symmetric, in which case A..L = ..LA. But this is not forced. It is easy to establish from our representation results below, that the above examples are "canonical," i.e., all residuated pairs, Galois connections, and their duals are isomorphic to those defined as above on a collection of subsets of some frame. Our representations assume that the po set is a distributive lattice, since we want representations that carry meet and join into intersection and union, respectively. But if one does not care about that one can have an arbitrary poset, and it is left as an exercise for the reader to see how to rework our results so that they apply to this "more general" case. Section 8.1 is a good model.
12.3 Definitions of Distributoid and Tonoid
As a first approximation a distributoid is a structure D = (A, ∧, ∨, (Oᵢ)ᵢ∈I), where (A, ∧, ∨) is a distributive lattice, and each f ∈ (Oᵢ)ᵢ∈I is a (finitary) operation on A that "distributes" in each of its places over at least one of ∧ and ∨, leaving the lattice operation unchanged or switching it with its dual. Note that it is easy to see that for each f ∈ (Oᵢ)ᵢ∈I, f is in each of its argument positions either isotonic or antitonic:

∀b, c ∈ A: b ≤ c ⇒ f(a₁, ..., b, ..., aₙ) ≤ f(a₁, ..., c, ..., aₙ), or
∀b, c ∈ A: b ≤ c ⇒ f(a₁, ..., b, ..., aₙ) ≥ f(a₁, ..., c, ..., aₙ).

A tonoid is roughly just a poset (A, ≤, (Oᵢ)ᵢ∈I) where each f ∈ (Oᵢ)ᵢ∈I is either isotonic or antitonic. We shall first consider distributoids. More explicitly, for a distributoid let T = {∧, ∨}, and let τ (with subscripts) range over T. We associate with each f ∈ (Oᵢ)ᵢ∈I a distribution type

t: (τ₁, ..., τₙ) ↦ τ.

Then, where * is ∨ if τᵢ = ∨, and is ∧ if τᵢ = ∧, and # is ∨ or ∧ depending in the same way on the value of τ,

f(a₁, ..., b * c, ..., aₙ) = f(a₁, ..., b, ..., aₙ) # f(a₁, ..., c, ..., aₙ).

There is one more requirement (which generalizes a requirement of Jonsson and Tarski). For convenience we require that the lattice be bounded, i.e., there is a least element 0 and a greatest element 1 (a lattice can always be trivially extended with two additional elements to serve as the bounds). These bounds are useful, since with them, all finite subsets B of A have both greatest lower bounds (∧B) and least upper bounds (∨B). For a non-empty set B, just form the pairwise meets and joins, and for the empty set ∅, ∧∅ = 1, and ∨∅ = 0, as the reader can easily see. It is natural, and for technical reasons important, to have our operators "distribute" not just over finite non-empty meets or joins, but over the empty ones as well. Let us calculate the result for an operator ◇ that distributes over join, leaving it as join. Then ◇0 = ◇∨∅ = ∨◇∅ = ∨∅ = 0. (Here we naturally understand ◇∅ as the result of applying ◇ to each member of ∅, i.e., as ∅.) Similar arguments can be produced for an operator □ that distributes over ∧, leaving it as ∧, to show that □1 = 1; for an operator ⊥ that distributes over ∨, changing it to ∧, to show that ⊥0 = 1; and for an operator ? that distributes over ∧, changing it to ∨, to show that ?1 = 0. More explicitly, and generalizing to operators of arbitrary finite degree, if the operator f is of distribution type (τ₁, ..., τᵢ, ..., τₙ) ↦ τ, then f(c₁, ..., cᵢ, ..., cₙ) = c, where cᵢ is 0 when τᵢ = ∨, and is 1 otherwise, and the same for c in relation to τ. We shall refer to this as "f respects the bounds."
Remark 12.3.1 Since, for a given operation, one is free to assign to each place as "input type" either ∧ or ∨, there are lots of degrees of freedom in terms of whether the operation f distributes over meet or join in that place. However, the fact that the "output type" τ is always the same for each place forces uniformity in terms of whether a meet or a join results.

Remark 12.3.2 The requirement that the operator respects the bounds can be conditionalized to remove the assumption that the lattice is bounded, but it is convenient to retain it throughout this chapter.

Example 12.3.3 Boolean algebras with operators as in Jonsson–Tarski, and in particular closure algebras and the Lindenbaum algebras of normal modal logics with possibility as the operator, provide examples, as well as the dual of Jonsson–Tarski with necessity as the operator.

Example 12.3.4 Further examples are: Boolean algebras, with complement as the (unary) operator; and various other distributive lattices with "complements" that arise in the algebraic study of logic, including pseudo-complement (intuitionistic logic), De Morgan or quasi-complement (relevance logic, Lukasiewicz logic), impossibility and possibly false (modal logic).
Example 12.3.5 Various (binary) implication operators, including relative pseudo-complement (intuitionistic logic), the implication of relevance logic, modal logic, and Lukasiewicz logic are examples too in virtue of:

(a ∨ b) → c = (a → c) ∧ (b → c),
a → (b ∧ c) = (a → b) ∧ (a → c).

It is possible to generalize the notion of a distributoid so as not to require the existence of meets and joins. We call these structures tonoids (cf. Section 3.10). Recall that a tonoid is a poset (A, ≤, (Oᵢ)ᵢ∈I) where each n-ary operation f ∈ (Oᵢ)ᵢ∈I is in each of its argument positions either isotonic or antitonic. "Mix and match" is allowed, so that it can be isotonic in one position and antitonic in another. For convenience, though, we assume that there always exist least and greatest bounds 0, 1, and each f will have to "respect the bounds" in the sense that for each of its argument positions there is a way of getting a uniform output of one of 0 or 1 by plugging in either 0 or 1 (this time it need not be uniform, one can "mix and match" in the argument positions), independent of the values of the other arguments. Putting all this formally, as for a gaggle, we assume that each n-ary operation f ∈ (Oᵢ)ᵢ∈I has a "distribution type," but since this time preserving the bounds is the only issue, it is more appropriate to formalize this as simply an (n + 1)-tuple of 0s and 1s and to call it a trace.³ We use the notation (c₁, ..., cₙ) ↦ c. The first n signs indicate at each position which of 0, 1 the function is preserving or co-preserving, and the value of the (n + 1)th indicates the output value. From these we can then determine for each position i whether the function is isotonic or antitonic, simply by seeing whether the value cᵢ = c or not.

Let us consider the two modal operators □ and ◇. Both are isotonic, so how can we distinguish them? We do it by virtue of the two equations □1 = 1 and ◇0 = 0. This is recorded by saying that the distribution type of □ is 1 ↦ 1, whereas that of ◇ is 0 ↦ 0. Both of these types indicate isotonicity since the bounds are preserved, rather than being inverted.
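The recipe "position i is isotonic exactly when cᵢ equals the output value c" is one line of code; the encoding of traces as tuples below is our own:

```python
# Reading tonicity off a trace (c_1, ..., c_n) |-> c.
def tonicity(trace):
    *ins, out = trace
    return ['isotonic' if c == out else 'antitonic' for c in ins]

print(tonicity((1, 1)))   # box, 1 |-> 1: ['isotonic']
print(tonicity((0, 0)))   # diamond, 0 |-> 0: ['isotonic']
print(tonicity((0, 1)))   # a perp-like operator, 0 |-> 1: ['antitonic']
```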
12.4 Representation of Distributoids

We start with a definition of the "target" structures that will be used in the representation.

Definition 12.4.1 Given a distributoid D = (A, ∧, ∨, (Oᵢ)ᵢ∈I), by a frame for D is meant a structure (U, (Rᵢ)ᵢ∈I), where there is a one-one correspondence between (Oᵢ) and (Rᵢ), and for each f ∈ (Oᵢ)ᵢ∈I, if the degree of f is n, then the corresponding Rᵢ ⊆ U^(n+1).

³In Dunn (1993a) a somewhat reversed notion was called a "trace" and denoted by ±₁, ..., ±ₙ ↦ ±. The idea was that a + in an argument position indicated isotonicity, whereas a − indicated antitonicity. A + in the output position indicated that 1 was the value, whereas a − indicated 0. From the bound which was output plus the information as to whether the function preserved or inverted the order, one could deduce the information about which bound was preserved or co-preserved.
We state a rough theorem, the exact statement of which will not be given until after we have some motivating examples. Showing how various examples are treated and how this treatment is to be generalized will constitute all of the proof which we will give here of the theorem. A rigorous version of the proof would run along the lines of our proof of the Jonsson–Tarski representation (Theorem 8.12.1), but with an appropriate schematic notation so as to "hide from the user" which operation of meet or join is being distributed or co-distributed over in which places. With such a notation the proof can be made to look just like the proof of Jonsson and Tarski, where they had only to consider distribution over join.
Theorem 12.4.2 (Representation of distributoids) Every distributoid can be represented as a lattice of sets, with ∧ as ∩, ∨ as ∪, and each operator f ∈ (Oᵢ)ᵢ∈I interpreted as an operation F defined on the power set of U, using Rᵢ according to a pattern to be described after motivation by the two examples below.

The first examples which we give are those that relate directly to the Jonsson–Tarski representation.

(1) Possibility (Kripke):

F(A) = ◇A = {β : ∃α ∈ A, R(β, α)}.

(2) Generalized image operator (Jonsson and Tarski):

F(A₁, ..., Aₙ) = Rᵢ*(A₁, ..., Aₙ) = A, where A = {α : ∃α₁ ∈ A₁, ..., ∃αₙ ∈ Aₙ, Rᵢ(α₁, ..., αₙ, α)}.
Notice that it is tempting to say that the Kripke example is just the unary case of the Jonsson–Tarski example. This is right in spirit, but wrong in detail since the direction of the relation R is reversed in the two examples. Clearly nothing essential hangs on this (one person's relation is another person's converse), but it will be a constant theme of this chapter that one must pay as much attention to the "backwards" representations as to the "forwards" ones (witness the relation between menu A and menu C below), and indeed this is crucial to representing interactions between operators that are related by residuation, Galois connection, or their duals.

Before setting out the general pattern for choosing the relation R, we first exhaustively analyze the case where R is binary. Thus let f be a unary operator. There are then only four distribution types, and we list the condition in each case for β ∈ f(A). Each "menu" represents alternative ways of realizing the operators. These examples show that one obtains various patterns of distribution and co-distribution depending on how one defines the operation. We have written in the vicinity some standard or non-standard notations for some operators. We use Ā for the set-theoretic complement of A (relative to U), and R̄ for the complement of R.

MENU A
β ∈ +†A (∧ ↦ ∧): ∀α(α ∈ A or αRβ)
β ∈ ◇†A (∨ ↦ ∨): ∃α(α ∈ A and αRβ)
β ∈ A⊥ (∨ ↦ ∧): ∀α(α ∉ A or αRβ)
β ∈ ?†A (∧ ↦ ∨): ∃α(α ∉ A and αRβ)

MENU B: R̄ in place of R
β ∈ □†A (∧ ↦ ∧): ∀α(α ∈ A or αR̄β), i.e., ∀α(αRβ ⇒ α ∈ A)
(∨ ↦ ∨): ∃α(α ∈ A and αR̄β)
(∨ ↦ ∧): ∀α(α ∉ A or αR̄β)
(∧ ↦ ∨): ∃α(α ∉ A and αR̄β)

MENU C: R⁻¹ in place of R
β ∈ +A (∧ ↦ ∧): ∀α(α ∈ A or βRα)
β ∈ ◇A (∨ ↦ ∨): ∃α(α ∈ A and βRα)
β ∈ ⊥A (∨ ↦ ∧): ∀α(α ∉ A or βRα)
β ∈ ?A (∧ ↦ ∨): ∃α(α ∉ A and βRα)

MENU D: (R̄)⁻¹ in place of R
β ∈ □A (∧ ↦ ∧): ∀α(α ∈ A or βR̄α), i.e., ∀α(βRα ⇒ α ∈ A)
(∨ ↦ ∨): ∃α(α ∈ A and βR̄α)
(∨ ↦ ∧): ∀α(α ∉ A or βR̄α)
(∧ ↦ ∨): ∃α(α ∉ A and βR̄α)

Remark 12.4.3 The above tables show that any menu will do as well as any other menu in realizing operators of various distribution types. One can pick and choose among the sets for different operators as in a Chinese restaurant, one from menu A for f, one from menu B for g. However, if one wants to preserve certain relationships between operators, the choice may not be so free. Thus observe that the customary definition of possibility comes from menu C, while that for necessity comes from D. This is not arbitrary, but reflects the fact that the distributoid in question is a Boolean algebra, that possibility and necessity are related by the equation ◇a = −□−a, and that Boolean complement is to be realized by set complement. More profound relationships will be explored when we come to residuation and Galois connection type phenomena.

We finally set out the general rule, first with reference to menu A, which the reader should consult for motivation from time to time. We shall be treating ∨ as a "positive sign" and ∧ as a "negative sign," and the reader is invited to think of them, respectively, as 1 and 0. Accordingly ∈_∨ is just ∈, whereas ∈_∧ is ∉. Similarly R_∨ is R, and R_∧ is R̄. We also adopt the convention that ∧̄ = ∨ and ∨̄ = ∧. Given an n-ary operator f of distribution type (τ₁, ..., τₙ) ↦ τ:

(i) if τ = ∨, then

β ∈ Fᵢ(A₁, ..., Aₙ) iff ∃α₁, ..., αₙ(α₁ ∈_{τ₁} A₁ ∧ ... ∧ αₙ ∈_{τₙ} Aₙ ∧ Rᵢ(α₁, ..., αₙ, β));

(ii) if τ = ∧, then the realization condition for f is appropriately dual, i.e., the realization condition is a universally closed disjunction and the rules reverse for determining whether a component αᵢ ∈ Aᵢ is negated:

β ∈ Fᵢ(A₁, ..., Aₙ) iff ∀α₁, ..., αₙ(α₁ ∈_{τ̄₁} A₁ ∨ ... ∨ αₙ ∈_{τ̄ₙ} Aₙ ∨ Rᵢ(α₁, ..., αₙ, β)).

We next notice that the second condition can be transformed into a semblance of the first by negating both sides (contraposing):

β ∉ Fᵢ(A₁, ..., Aₙ) iff ∃α₁, ..., αₙ(α₁ ∈_{τ₁} A₁ ∧ ... ∧ αₙ ∈_{τₙ} Aₙ ∧ R̄ᵢ(α₁, ..., αₙ, β)).

The final trick is to stick in a couple more subscripts to indicate "sign," and we thus obtain one general rule that covers both of the cases (when τ = ∨ and τ = ∧):

(1) β ∈_τ Fᵢ(A₁, ..., Aₙ) iff ∃α₁, ..., αₙ(α₁ ∈_{τ₁} A₁ ∧ ... ∧ αₙ ∈_{τₙ} Aₙ ∧ R_{i,τ}(α₁, ..., αₙ, β)).
We get the other three menus by switching R with R̄ (the complement of R), R⁻¹, and (R̄)⁻¹. We shall not need to consider these other menus until Section 12.7, and so the following theorem is stated explicitly only for menu A. However, the reader should recognize that it, of course, applies to the complement of R, or to any relation which is obtained from R by a simple permutation of its terms (as with R⁻¹ in the binary case), since these are also relations of the same degree.

Theorem 12.4.4 (Realizing distribution types) Let (U, (Rᵢ)ᵢ∈I) be a frame. Associate with each Rᵢ of degree n + 1 a distribution type tᵢ: (τ₁, ..., τₙ) ↦ τ. Define the n-ary operation Fᵢ on subsets of U as in (1). Then the operation Fᵢ is of the distribution type tᵢ.

Proof We content ourselves with observing that the reason why the realizing condition is required to be an existentially closed conjunction when τ = ∨ is that both existential quantifiers and conjunction distribute over disjunction, and dually for when τ = ∧. The reason why some atomic sentences αᵢ ∈ Aᵢ are negated is that we at that place want to turn a conjunction into a disjunction, or vice versa, by De Morgan's laws. ∎
Theorem 12.4.5 (Representation of distributoids) Every distributoid (A, ∧, ∨, (Oᵢ)ᵢ∈I) is representable as a distributoid defined as above on a frame (U, (Rᵢ)ᵢ∈I).

Proof The set U is the set of prime filters on A. Using the method of Stone, for each element a ∈ A, we set h(a) to be the set of prime filters P such that a ∈ P. It is well known that h carries meet into intersection, and join into union. So all that remains is to provide for each n-ary operation f in (Oᵢ) (of type t) an (n + 1)-place relation R_f defined on prime filters, such that

h(f(a₁, ..., aₙ)) = F_f(h(a₁), ..., h(aₙ)),

where for any sets of prime filters X₁, ..., Xₙ, and a prime filter Q,

Q ∈_τ F_f(X₁, ..., Xₙ) iff ∃P₁, ..., Pₙ(P₁ ∈_{τ₁} X₁ ∧ ... ∧ Pₙ ∈_{τₙ} Xₙ ∧ R_{f,τ}(P₁, ..., Pₙ, Q)).

Before jumping straight to the definition of the relation R_f, we exhaustively analyze the cases where f is a unary operation, in hopes of shedding some light on the final definition. We shall focus on menu A. First let us suppose that we have an operation of type ∨ ↦ ∨. For mnemonic reasons we will denote it as ◇ (strictly speaking it should be subscripted with † since we are working with menu A, but we omit the subscript). We need then to show for some appropriate relation R, that

h(◇a) = F_◇(h(a)),

where F_◇ is the corresponding operation on sets of prime filters defined using R. This amounts to showing that for each prime filter Q: Q ∈ h(◇a) iff Q ∈ F_◇(h(a)), i.e., ◇a ∈ Q iff ∃P(P ∈_{τ₁} h(a) ∧ P R Q), i.e. (using menu A), iff ∃P(P ∈ h(a) & P R Q), i.e., iff ∃P(a ∈ P & P R Q). We need then to define the relation P R Q so that a ∈ P & P R Q implies ◇a ∈ Q. The obvious definition is to simply define P R Q to hold whenever ∀x(x ∈ P implies ◇x ∈ Q). Of course, this does not guarantee so immediately the other direction of the above "iff," i.e., if ◇a ∈ Q, then ∃P(a ∈ P & P R Q), but it turns out that such a prime filter P can be produced by a standard Stone-style argument using Zorn's lemma, maximalizing on the condition "P′ is a proper filter, such that a ∈ P′, and P′ R Q." It is easy to show that the principal filter determined by a, [a) = {x : a ≤ x}, satisfies this condition. (That [a) is proper is the whole point of the technicality that operations must respect the bounds.)

Now let +, ⊥, and ? be unary operations of types ∧ ↦ ∧, ∨ ↦ ∧, ∧ ↦ ∨, respectively. Again, we omit the subscript †, and working through what is required for these operations to be preserved, we obtain the following "canonical equivalences" (we list the one for ◇ again for the record):

(+) +a ∈ Q iff ∀P(a ∈ P or P R₊ Q);
(⊥) ⊥a ∈ Q iff ∀P(a ∉ P or P R_⊥ Q);
(?) ?a ∈ Q iff ∃P(a ∉ P and P R_? Q);
(◇) ◇a ∈ Q iff ∃P(a ∈ P and P R_◇ Q).

Adopting the strategy used above for ◇, we define the relations so as to immediately guarantee half of each of the canonical equivalences (again the other half will come by a maximalization argument). The obvious definitions for the relations are then as follows (with the first two immediately guaranteeing their canonical equivalences from left to right, and the last two guaranteeing theirs in the other direction):

P R₊ Q iff ∃x(x ∉ P and +x ∈ Q);
P R_⊥ Q iff ∃x(x ∈ P and ⊥x ∈ Q);
P R_? Q iff ∀x(x ∉ P implies ?x ∈ Q) iff ∀x(x ∈ P or ?x ∈ Q);
P R_◇ Q iff ∀x(x ∈ P implies ◇x ∈ Q) iff ∀x(x ∉ P or ◇x ∈ Q).

Briefly examining the halves of the canonical equivalences that are not an immediate consequence of the definition of the relation, we observe that the proof for ⊥ is similar to the proof sketched above for ◇, but we start with the assumption that ⊥a ∉ Q. We maximalize on the condition "P′ is a proper filter, such that a ∈ P′, and P′ R_⊥ Q." The stories for + and ? are more interesting. For ?, we must show that if ?a ∈ Q, then for some prime filter P, a ∉ P and P R_? Q. This time we consider the ideal generated by a, (a] = {x : x ≤ a}, and maximalize on the condition "I is a proper ideal, such that a ∈ I, and Ī R_? Q," noting that (a] is such an ideal. The ideal we obtain can be argued to be prime in a routine manner, and the complement of the prime ideal is our desired prime filter. The argument for +a similarly maximalizes on ideals.

We at last try to figure out the general rule for defining an accessibility relation to use in representing an operation f of type t: (τ₁, ..., τₙ) ↦ τ.⁴ If the output type is ∨ (think of ◇ as an example),

R_f(P₁, ..., Pₙ, Q) iff ∀x₁ ... xₙ(x₁ ∈_{τ̄₁} P₁ ∨ ... ∨ xₙ ∈_{τ̄ₙ} Pₙ ∨ f(x₁, ..., xₙ) ∈_τ Q).
E Q).
Briefly examining the halves of the canonical equivalences that are not an immediate consequence of the definition of the relation, we observe that the proof for ..L is similar to the proof sketched above for 0, but we start with the assumption that ..L a ¢ Q. We maximalize on the condition "pi is a proper filter, such that a E pi, and pi R.J..Q." The stories for + and ? are more interesting. For ?, we must show that if ?a E Q, then for some prime filter P, a ¢ P and P R?Q. This time we consider the ideal generated by a, (a] = {x_: x ~ a}, and maximalize on the condition "I is a proper ideal, such that a E I, ~d ~ R?Q," ~oting that (a] is such an ideal. The ideal we obtain can be argued to be pnme m a routme manner, and the complement of the prime ideal is our desired prime filter. The argument for +a similarly maximalizes on ideals. We at last try to figure out the general rule for defining an accessibility relation to use in representing an operation I of type t: Cr], ... , Tn) l-+ T.4 If the output type is V (think of 0 as an example), Rf (PI, ... , Pn, Q) iff VXI ... xn(Xj ETJ PI V ... V Xn ETII P n V I(XI, ... , xn) Er Q).
And if the output type is /\ (think of + as an example), Rf (PI, . .. , Pn, Q) iff
:lxI ... Xn(XI
ErJ PI /\ ... /\ Xn Ern P n /\ I(XI, ... , Xn) E:r Q).
~gain we can tum this into a form similar to the preceding condition by negating both SIdes: -f
R (PI, ... , Pn, Q) iff
VXj ... xn(Xj ETJ PI V '"
V Xn ETII P n V I(X1, ... , xn) Er Q).
Sticking in one more sign variable gives us our general rule:
(2)
R{ (Pj, ... , Pn, Q) iff VXj '" Xn(X1 ETJ Pj V '"
V Xn ETII P n V I(XI, ... , xn) Er Q).
4This notation is essentially due to Gerard Allwein.
406
GAGGLES: GENERAL GALOIS LOGICS
PARTIALLY ORDERED RESIDUATED GROUPOIDS
We need to show that h(f(a}, ... , an»
= F(h(aJ}, ... , h(an»,
i.e.,
Q E h(f(a}, ... ,an» iff Q E F(h(a}), ... ,h(an», i.e., I(a}, ... ,an) Er Q iff
::IPj, ... , Pn(aj ErJ Pj /\ ... /\ an Erll P n /\ Rr(Pj, ... , P n , Q», i.e., I(aj, ... , an) Er Q
iff ::IP}, ... , Pn(a} ErJ p} /\ ... /\ an Erll Pn/\ \;fXj ... xll(Xj E r ] Pj V ... V Xn Erll P n V I(xj, ... , x l1 ) Er Q)).
It is easily straightforward to see that the right-to-Ieft half of the last "iff" is guaranteed by the definition of R. The conditional the other way can be argued for by a maximalization argument (cf. J 6nsson and Tarski 1951), the details of which are routine but confusing given the level of generality. Note that the right-hand side is equivalent to an existentially quantified conjunction, and we are set up for a maximalization argument, much like the one in the proof of Theorem 8.12.1. But one must then be careful to maximalize on the principal ideal (ai], rather than on the filter [ai), when Ti = /\. D
Exercise 12.4.6 Prove the representation theorem for distributoids by modifying the proof of Theorem 8.12.1. With the right notation the modified proof should look to the eye much like the original, if one is not focusing on the "signs." For example, one useful notation is [ai)r;.
There are two familiar examples of residuation. First, think of 0 as multiplication, think of S as "divides," and think of say a -+ R c as the quotient cia. A more central example to our project is to think of 0 as some way of "conjoining" propositions into "premises," then think of S as logical deducibility, and think of (say) a -+ R c as "if a then c." (Conjoining premises need not be done by the logical operation usually called "conjunction," but instead by some other operation, as for relevance logic in Dunn (1966).) In both of these examples it is natural to think that 0 is commutative, and so there is no reason to distinguish left and right residuals, but in a general setting we wish to do so. Incidentally, given the quotient interpretation, it is common, especially in older literature, to use notations such as alc for one of the residuals and (say) a\c or allc for the other. These are not memorable notations for keeping track which is the left and which is the right residual, and they have frequently been reversed. Given the logical interpretation, and also the felt need for a notation that is more memorable as to its "sidedness," we introduce the following definitions following Pratt (1991).
Definition 12.5.1 x
-+ R Y
=x
-+ y
and x -+ L Y
T(*) = (Tj, T2)
where
T
0
b S c iff a S b -+ L c
(left residual).
Right residuation can be defined symmetrically: (rr) a
1-+
T,
(with or without subscripts) ranges over /\ and V. Thus, (V, V)
1-+
V.
Partially Ordered Residuated Groupoids
We have already discussed the notion of residuation for unary operations. We now want to examine it in the case of binary operations, where the standard jargon has to do with "partially ordered residuated groupoids." These were already introduced in Section 3.7. Good extemal sources are Birkhoff (1967) and Fuchs (1963). The general notion of "residuation," although it has its origins in such nineteenth-century figures as De Morgan and Peirce, seems first to have been abstracted in its modem form by Ward and Dilworth (1939). A partially ordered groupoid is just a structure (A, 0, S), where 0 is a binary operation on A, S is a partial order on A, and 0 is monotonic in each of its places with respect to s. When the partial order is a lattice ordering, we speak of a lattice-ordered groupoid when 0 distributes over V from both directions (in which case monotonicity becomes redundant). Such a groupoid is said to be left residuated when there always exists an element denoted by b -+ L c. This element can be shown to be unique, satisfying (lr) a
= Y +- x.
We want a more abstract way of stating the residuation properties, and these will ultimately help motivate our definition of a gaggle. Thus to each binary operation * of a residuated groupoid (0 and the two residuals) we assign a "type"
T(o):
12.5
407
0
b S c iff b S a -+ R c
(right residual).
The type of +-, a "contrapositive" of T( 0), is T( +-):
(/\, V)
1-+ /\,
and the type of -+, another "contrapositive" of T( 0), is T( -+):
(V, /\)
1-+ /\.
Note that we chose to view the left residual as composed from its arguments in the opposite order from the right residual. This is important, as we shall soon see. We next introduce some abbreviations that are useful for our abstract statement of residuation. · { a *b< c .if T S( *, a, b) ,c abb revlates c S a * b If T
= V' = /\.
The laws of the left residual and of the light residual can now be stated as follows: (SIr)
S(o,av,bv,cv)
iff
S(+- ,cA,bv,aI'J
bS c
iff
a
a
0
S c +- b;
REPRESENTATION OF GAGGLES
GAGGLES: GENERAL GALOIS LOGICS
408
(SIT)
S(o,av,bv,cv)
iff
S(-+,av,clI,brJ
b$ c
iff
b $ a -+ c.
a
0
The join and meet signs as subscripts are intended to aid the reader in discerning the patterns. What happens in each case is that the sign of the operation (which we attach to the third, or "output" variable) determines whether the operation applied to the first two terms (the "input" variables) is put on the left- or the right-hand side of the inequality. And the term c switches with the term with which it "contraposes," i.e., it is switched with the term under which there is indicated a change in input type.
Example 12.6.3 The following are examples of gaggles: 1. A residuated pair, Galois connection, or their duals, on a distributive lattice, with (Oi) chosen to be any of the following families of operations (either I or g can be the head of the family consisting of the two): {f}, {g}, {f, g} . 2. A distributive lattice-ordered residuated groupoid, with (OJ) chosen to be any of the following families of operations (0 is the head of the families of which it is a member, otherwise -+ or +- is the head): {o}, {o, +-}, {o, -+}, {o, +-, -+}, {+-}, {-+}, {-+, +-}.5 The operations are realized on a frame as follows (cf. Section 8.1.2): A
Exercise 12.5.2 Analyze the case of unary operations in a similar manner, assigning types to the operations in residuated pairs, Galois connections, and their duals so that the appropriate laws fall under the general scheme
Definition of a Gaggle
After this motivation, we are at last in a position to define a gaggle. First we define some useful terms. Definition 12.6.1 (1) Let D = (A, /\, V, (Oi )iEI) be a distributoid. Recall that evelY I E (0; );EI has a distribution type T(f). As a notational convention, let -/\ = V and -v = /\.
T(f) = (''1, ... , Ti, ... , Tn)
1-+ 1',
I and g that g is a contrapositive of I (with then T(g) = (1'1, ... , -
1', ... ,
Tn)
1-+
-Ti·
(3) If l' = V we write S(f, al, ... , a'b b) for tal ... an $ b, and if l' = /\ we write S(f, al,···, an, b) for b $ tal .. , an· (4) We say that two operations I and g satisfy the abstract law of residuation (in their ith place) when I and g are contrapositives (with respect to their ith place) and
(5) Two operations I, g E (Oi) are relatives when they satisfy the abstract law ofresiduation in some position. (6) The family of operations (0;) is founded when there is a distinguished operator f E (0;) (the head) such that any other operation g E (Oi) is a relative of I· Definition 12.6.2 A gaggle is a distributoid D = (A, /\, V, a founded family.
B = {y: 3a E A,3jJ E B, RajJy},
A+- B = {y : Va, VjJ( if RyjJa and jJ E B, then a E A)}, A -+ B = {y: Va, VjJ( if RayjJ and a E A, then jJ E B)}.
12.7
where [a, bY has the terms rearranged according to the contraposition of types.
(2) We shall say oltwo n-ary operations respect to the ith place) when if
0
A partial gaggle is defined quite analogously, replacing distributoids with tonoids. 6
S(f, a, b) iff S(g, [a, b]'),
12.6
409
(0; );EI),
in which (Oi )iEI is
Representation of Gaggles
We now have to figure out the realization conditions for gaggles. Concentrating on residuated pairs for concreteness, we know from our results on distributoids how each operation separately can be realized on a frame of prime filters, but now the question is how to "glue the operations together." A clue is gotten by looking at the example given in Section 12.4 of a concrete residuated pair defined using a single binary relation R. Both were realized in the standard distributoid way in terms of a relation, but one realization used the relation R, and the other used its converse R- l . The definitions were chosen from different menus. Before we can generalize we must define the "converse" of a relation with more than two terms. Of course, there is no such thing really; instead there are just all of the different ways of permuting the terms. But for our purposes here it fortunately turns out that we do not need this full generality. The following suffices:
This has to do with the fact that we look at operations whose types are "contrapositives" of each other. We cannot forbear remarking that in the case where R is binary, by a kind of universal harmony, the usual notation R- l results. If we now look at the concrete example of a residuated groupoid realized on a frame with a three-place relation (see Example 12.6.3), we see that the definition of 0 used RajJy, that of +- used R-lajJy = RyjJa, and that of -+ used R- 2 ajJy = RayjJ. 5Note that [->, <- J is literally not a gaggle, but it is very much in spirit. Either adding 0 or allowing the transitive closure of the relation "contrapositive" on operations would make it officially a gaggle. 6The reader should be warned that a subtle mistake was made in this definition in Dunn (l993a), wherein the notion of "contrapositive" was adapted from the definition of a "gaggle" without modification, even though the equivalent but somewhat reversed notion of "trace" was being used in place of "tonic type." This mistake does not really affect any results of the paper.
410
GAGGLES: GENERAL GALOIS LOGICS
The use of R- i does not by itself completely solve the problem of determining the realizing conditions. To make this clear, observe that with the case of a residuated pair, once we have decided to represent fusing O! from menu A, we of course must choose to represent its residual gas D. But 0 appears not on menu C (where R- 1 has been substituted for R), but rather on menu D (which uses (R)-l). We must address the question then as to why R- 1 gets negated. To make matters even more complicated, notice that it does not always get negated. Witness a Galois connection and the definition of A..L (menu A) and the definition of ..LA (menu C), which are precisely the same except for the direction of R. The same holds of dual Galois connections: ?!A and? A. But with a dual residuated pair we have the same phenomenon as with a residuated pair. The relation between D! (menu B) and 0 (menu C) has not simply to do with the direction of the relation, but negation again mysteriously raises its head. The moral is that we cannot determine an appropriate selection of the representing conditions by just knowing the distribution patterns, making an arbitrary choice of one menu for the one operation, and then simply going to "the dual" menu for the other. There are really two "dual" menus, one with the simple converse of the relation, and the other with it complemented. Under what conditions are we determined to go to one or the other? The answer is simple, once we observe that a Galois connection (and its dual) put the operations on the same side of the inequality, whereas residuation (and its dual) put them on opposite sides. It is this that determines whether we choose from the simple dual menu, or rather its negated version. Let us now put this more formally. We assume that we have a founded family of operations (Oi 1, where f is the head of the family. Using the distribution type t of f, let us decide to realize f by an operation F on subsets of U, arbitrarily choosing its representation condition to use R "directly" (as on menu A, rather than some inverse, complement, or inverse complement). So far everything is in accord with the way of realizing distributoids, and so we know that this operation F will have the same distribution pattern as f. Now we consider any other operation g of the family. The only question with respect to g is which "menu" to use in realizing it as an operation G on subsets of U. We know from our earlier work on distributoids how to choose a definition from a given "menu" so that the distribution pattern of G will realize that of g. But now we must choose this "menu" so that f and g satisfy the abstract law of residuation. We have used quotes here for "menu" since with the introduction of R- i , things appear more complicated than with our simple menus A through D. However, recall g is a relative of f. This requires, besides satisfying the abstract law of residuation with respect to f, that g is a contrapositive of f with respect to some position i. Now we observe that if the output type of g matches that of f, then we choose the definition of G using R- i , whereas if the output types clash, we use (R)-i. Thus once we pick our position, there are always four menus after all, and so we can continue to call them menu A, B, C, D, though we must realize that menus C and D depend on the choice of i.
REPRESENTATION OF GAGGLES
411
We leave to the reader to velify that the abstract law of residuation holds with respect to F and G defined as indicated, and record the above discussion as follows: Definition 12.7.1 Given two types t and t' of the same degree, i.e., having the same number of input types, we say that they agree if they have the same output type, and that they clash if they have different output types. Theorem 12.7.2 (Realizing contraposed distribution types) Let T be. a family of distribution types (T], ... , Ti, ... , Tn) 1-+ T (all the same degree), where to E T is the head ofT. Let (U, R) be aframe, where R is a relation on U of degree n + 1. Define the n-ary operation FO on subsets of U as follows:
P Er Fo(A], ... ,An)
iff
3al, ... , an(al ErJ Al /\ ... /\ an Erll An /\ Rr(al, ... , a'b P)). For a given type t' that is a contrapositive of t in the ith place, define the n-ary operation G on subsets of U as follows: PEr G(AI, ... , An)
iff
3a], ... ,an(al ErJ Al /\ ... /\ an Erll An /\ R~(al, ... ,an, P)),
where R' is R- i or (R)-i, depending on whether the output types oft and t' agree or clash. Then the operation FO is of distribution type to, G is of distribution type t', and Fo and G satisfy the abstract law of residuation with respect to each othe!: In short, we have a gaggle of operations with types assigned appropriately.
Theorem 12.7.3 (Representation of gaggles) Every gaggle (A, /\, V, (Oi liEf) can be represented using a frame (U, (Ri liEf), as for a distributoid, but choosing relations for different operations from different (dual) menus. Proof Actually in some ways this is more organized than the distributoid representation. There a different definition of R on the prime filters was needed for each operation. Here we use the same definition throughout, the definition coming from the type of the head of the family. The representation otherwise proceeds as for distributoids, but of course we must have a certain interaction among the relg.tions, which is ensured by the fact that they are defined using operations that all satisfy the abstract law of residuation with respect to some given operation ("the head"). We illustrate this with respect to distributive lattice residuated groupoids, leaving the general case to the reader. We have then the following ways of defining the relation R for each operation:
PL., Q) iff VxVy(x E PI & Y E PL. =? x 0 Y E Q); R-+ (p], PL., Q) iff VxVy(x E PI & x -+ Y E PL. =? Y E Q); R+-(PI, PL., Q) iff VxVy(x +-- y E PI & Y E PL. =? x E Q). Ro(PI,
We shall show only the equivalence of the definitions of Ro and R-+, since the argument for the equivalence of Ro and R+- is symmetric. Let us first suppose, then, that
GAGGLES: GENERAL GALOIS LOGICS
412
Ro(PI, PJ" Q). In order to show R .... (PI, P2, Q), let us suppose that x E PI, (x -+ y) E PJ" and show y E Q. From the definition of Ro(PI, PJ" Q), it follows that x 0 (x -+ y) E
Q. But it is easy to show that x 0 (x -+ y) s y, and since it is part of the definition of a filter that it is closed upward, then y E Q. So R .... (PI, Q, PJ,), as desired. For the converse, let us suppose that R .... (PI, PJ" Q) and that x E PI and y E PJ,. Since it is easy to show that y s x -+ (x 0 y), by upward closure we have that (x -+ (x 0 y)) E PJ,. But now using our hypothesis that R .... (PI, PJ" Q), we obtain (x 0 y) E Q, as desired. 0 Exercise 12.7.4 Prove the general case of the representation theorem for distributoids. We close this section by answering a question V. R. Pratt raised in discussion. Theorem 12.7.5 Distributoids and gaggles are equationally definable. Proof It is clear that the postulates for distributoids can be cast in the form of equations, since it is well-known that distributive lattices can be given such a formulation, and the distribution postulates for each place of 0 are all equations. It is further well known that in a (right) residuated groupoid, the inequalities a 0 (a -+ b) S band
suffice along with isotonicity for
0
b S a -+ (a
0
b),
and right-isotonicity for the residual:
if b s b' then a -+ b S a -+ b' . In a lattice (indeed, in a semi-lattice) the inequalities become equations given the equivalence a S b iff a 1\ b = a. Further, isotonicity follows from the fact that it is postulated that 0 distributes over V from both directions. Right isotonicity for -+ can be established by the inequality a -+ (b 1\ b') S a -+ b'. All these can be generalized to gaggles. It is just a matter of postulating the appropriate inequalities for each place, and seeing that the right form of isotonicity or antitonicity results. 0
GAGGLES WITH IDENTITIES
we need some way to say that a proposition a is "true," and this may be expressed as e S a (cf. Section 8.1.2). Note, for example, e S a -+ a, which follows from a 0 e S a. Of course, in classical logic e can be taken to be the greatest element of the Boolean algebra, but in general, as for relevance logic, this is too strong. Still e can be introduced as the "infinite conjunction" of all of the truths (cf. Dunn 1966). In representing certain distributoids or gaggles with identities (ej)jEJ (J ~ I), we add to a frame for each identity a partial order!: and a set Z ~ U. We require a !: p iff R(~l, ... , ~i-l, a, ~i+l, ... , ~/l' P) for some ~l, ... , ~i-l, ~i+l, ... , ~Il E Z. We also require that the representing sets (the "propositions") be "closed" as follows (where R is the (n + I)-place relation used to represent the n-ary operation fj). Hereditary condition. For each i thus closed we call it hereditary.
E J,
and a, PEA, if a !: p, then pEA. When A is
As a technical complication, we must also assume that each Zj is itself closed with respect to itself under the hereditary condition, but such sets can be "constructed" in the usual way by starting with a set of states and taking the intersection of all sets that include it and are so closed. The point of the hereditary condition is to ensure that the sets Zj function as identity elements. We illustrate this with respect to a left-residuated groupoid defined an a frame using a three-place accessibility relation. In order to verify that Z 0 A ~ A, we assume that X E Z 0 A. Then there exist ~ E Z and a E A such that R~ax. But by the hereditary condition, X E A as needed. One can also verify that A ~ Z -+ A. There is one problem that still must be dealt with. We want a guarantee that the hereditary sets are closed under the gaggle operations. The lattice operations of intersection and union are straightforward, but the only way to ensure closure under the other gaggle operations is to require that the following "monotonicity conditions" on the relations R hold: if R(al, ... , ai, ... , an, p) and a' !:i a, then R(al, ... , a;, ... , a'l> P); if R(al, ... , ai, ... , an, P) and P !:i
12.8
413
p',
then R(al, ... , ai,···, an, p').
Modifications for Distribntoids and Gaggles with Identities and Constants
Consider an n-ary operator f(XI, ... , Xi, .. , Xll) whose associated type is (-£1, ... , 7:i, ... ,7:n) H 7:. We shall say that an element e is an identity element for f with respect to the ith place if 7:i = 7: = V and f(e, ... ,Xi, ... ,e) S xi. or else 7:i = 7: = 1\ and i 7 Xi S fee, ... , Xi, ... , e). We shall sometimes denote such an element as e . In practice it is very common for gaggles to have identity elements. For example, residuated groupoids often have an element e such that eo x = x 0 e = x. In this case the single element e serves as an identity element for both places (and note further that we have =, and not merely s). In logical applications identity elements are important, since
Remark 12.8.1 Although one is accustomed to have closure requirements (such as the hereditary condition) on propositions in the case of intuitionist logic, relevance logic, and orthologic, it may strike one as strange for modal logic, where "propositions" are arbitrary subsets of a frame. And yet, where -+ is strict implication, it is clearly needed in order to have A -+ A tum out to be a logical truth. It turns out that by "fiddling with the representation" it can be discovered that the canonical accessibility relation between prime filters reduces to 0 PI ~ Q & PJ, ~ Q, but since modal logic contains classical logic, we are in a Boolean algebra and all prime filters are maximal, and so P2 = Q. SO closure is just closure under the identity relation, and so all sets are closed.
7This definition was misstated in Dunn (1991), where the requirement was omitted that the types of the input positions and the output position be identical. One could define "identity elements" where the types clash, if one postulates an underlying involution ~ on the lattice. Then if, say, Ti = V and T = II, one would require f(~e, ... , Xi, ... , ~e)::; ~Xi' Thus, for example, TI = V, T = II, and Xl .... , ~e::;, ~XI·
Usually in the Lindenbaum algebra of a logic one has few constants, maybe none. The identity, t, is the most likely constant to appear, perhaps, with its dual, f. However, in a combinatory algebra constants are abundant. The somewhat remote connection
414
MONADIC MODAL OPERATORS
GAGGLES: GENERAL GALOIS LOGICS
between implicational fragments of some logics and combinators is captured by the socalled Curry-Howard isomorphism. A more direct relation has been introduced in Dunn and Meyer (1997) which also shows how to represent such constants. Without going into the details, we give a short example referring the reader to the paper mentioned. Consider the combinator C. Taking the binary operation 0 to be application, C has a combinatory axiom
«C
0
x)
0
y)
0
z
==
(x
0
z)
0
y.
Just as for identity we required a set Z to be present in the representation, for such constants as C we require that an appropriate set, say, C exists in the representation satisfying the condition «C 0 X) 0 Y) 0 Z ~ (X 0 Z) 0 Y. Such a condition can be shown to be satisfied by the set into which a combinatory constant of the algebra is mapped. The same condition can be specified in terms of R (the accessibility relation) on the frame (3c E C)(3x E X)(3y E Y)(3z E Z)(3uj, u2)(Ruj zv & RCXU2 & RU2YUt) :::} (3x E X)(3y E Y)(3z E Z)(3u)(Ruyv & Rxzu).
(This latter is not as obvious as the condition was for identity. Examples of similar combinatory conditions can be found in Dunn and Meyer (1997), and an algorithm generating the appropriate condition for any combinator can be found in Bimbo and Dunn (1998).) We note here only that this observation seems to generalize easily whenever an operation (with respect to which the constant is a constant) is well represented.
12.9 Applications There are several topics that can be further developed. One has to do with applying distributoids or gaggles to various logics. This has already been carried out in some detail for modal, intuitionistic, and relevance logics, as we shall describe. This involves "fiddling with the canonical representation" to get the usual results, which do not always invol ve an accessibility relation of the same degree as that employed in the gaggle representation. We quickly illustrate this with the case of Boolean algebras (classical logic) viewing Boolean complement as a distributoid operation since it distributes over join, changing it to meet (it also does the dual, but one is enough). One can obtain the properties of Boolean complement by postulating that the relation R is irreflexive (A n A.l = 0) and symmetric (so A.l = .lA, and the properties of the negation of minimal logic result), and that the propositions A are "closed" in the sense that if a E A and it is not the case that a..Lx, then X E A (A U A.l = 1), and verify that all of these properties hold in the canonical representation in terms of prime filters. One finally observes that the use of the relation R is otiose, prime (proper) filters in Boolean algebras are maximal filters, and one can show that they are both consistent and complete with respect to Boolean complement, i.e., x E P iff x.l ¢ P. Since PRQ is just 3x(x E P & x.l E Q), it follows that PRQ iff P :f. Q. But then Q E A.l iff PRQ for all PEA, iff P :f. Q for all PEA, iff Q ¢ A. But then A.l becomes ordinary set-theoretic complement (relative to the class of prime filters), and so we obtain the ordinary (Stone)
415
representation for Boolean algebras out of the one delivered by the representation for distributoids.
12.10 Monadic Modal Operators In Chapter 10, we introduced Boolean algebras with normal modal operators as the algebraic way of looking at normal modal logics, and defined various classes of algebras corresponding to various well-known modal logics (K, T, B, S4, S5). We also showed how these algebras can be represented so as to get completeness results using Kripke frames. Also in Chapter 10, we introduced the operator i defined as -c-x corresponding to the necessity operator in modal logic. It is clear that we might just as well have had i as primitive and introduced c by the equation c = -i-x. By being even-handed to both sides, we can consider a normal modal Boolean algebra (or K-algebra) to be a structure (B,::;, 1\, v, -, i, c, 0,1), where (B,::;, 1\, v, -, 0,1) is a Boolean algebra, and (il\) i(a 1\ b)
= ia 1\ ib,
(il) il = 1,
(cv) c(a V b) (cO) cO
= ca V cb,
= 0,
(-i) ca = -i - a, (-c) ia = -c - a.
An interesting question arises, though, if we drop the underlying Boolean algebra in favor of a distributive lattice. Then we can no longer state (-i), and so there is no obvious relationship of duality between i and c. Indeed, in terms of a Kripke model, i could be modeled using one accessibility relation R, and c could be modeled using an entirely different one R,.8 But with the modal logic B, one has a natural connection between the operators c and i that can be expressed as the following (as the reader can easily verify): (b)
ca ::; b iff a::; i b,
which simply says that (c, i) is a "residuated pair." The Kripke semantics for modal logic is customarily presented in terms of a frame (U, R), where U is thought of as a set of possible worlds, and aRfJ is thought of as fJ is possible ("accessible") from a. A sentence is interpreted as expressing a "proposition" A, i.e., a set of worlds (those in which it is true), with the modal operators interpreted as follows: CA
= {X:
3a(xRa & a E A)},
8But Dunn (l995a) shows that if one adds the axioms i(a V b) ::::; ca V ib and ia 1\ cb ::::; c(a 1\ b), then one can (and must) use the same accessibility relation. An interesting question arises about how one might cover the axioms above and the residuation axioms below under a common abstraction.
416
GAGGLES: GENERAL GALOIS LOGICS
IA
= {X: Va(xRa =? a E A)}.
Conjunction, disjunction, and negation are just intersection, union, and complement (relative to U), and I is of course U and is the empty set. This can be viewed algebraically as the fact that a frame (U, R) determines a modal Boolean algebra, whose elements are the subsets of U, and whose operations are defined as above. We shall call this the modal algebra determined by the frame (U, R). As is well known, various conditions can be put on the accessibility relation R that give a sound and complete semantics for the corresponding logic; e.g., for T it is required that R be reflexive. From an algebraic point of view, soundness means that the modal algebra determined by the frame is of the appropriate kind, such as aT-algebra, and completeness means that all algebras of the appropriate kind, such as all T -algebras, may be embedded into the Boolean algebra determined by the frame. 9 Some of the well-1m own conditions on R include reflexivity and transitivity for S4, reflexivity and symmetry for B, and the various combinations of reflexivity, symmetry, and transitivity that correspond to various combinations of postulates on modal algebras. The condition that most interests us is the symmetry condition for B, which forces the interaction between the two modal operators in the condition (b). But the requirement of the symmetry of R is a kind of "red herring." The residuation condition CA ~ B iff A ~ IB
DYADIC MODAL OPERATORS
each of (B,::;;, /\, v, -, i, c, 0,1) and (B,::;;, /\, v, -, i', c' , 0,1) are modal Boolean algebras, and the following residuation properties are satisfied:
°
holds just as well if the direction of R is reversed in the definition of C. Of course, instead the definition of I might be changed to look backward (keeping the forwardlooking C). The point is that the only role the symmetry of R plays in verifying the residuation properties is to make the forward-looking operators and the backwardlooking ones indistinguishable. What really requires the symmetry of R is not the fact that I and C are duals in the sense of the residuation properties, but rather that they are duals in the sense of the De Morgan properties. Let us denote the backward-looking operators with a prime:
= {x:3a(aRx&aEA)}, I' A = {X: Va(aRx =? a E A)}.
C'A
The reader can readily verify the following residuation properties: C' A ~ B iff A ~ IB, CA ~ B iff A ~ l'B.
It gives a smoother theory to assume that all modal Boolean algebras are outfitted with suitable "backward" operators to match their forward operators. Thus by a symmetric modal Boolean algebra we mean a structure (B,::;;, /\, v, -, i, c, i', c' , 0,1), where 9Implicit in the above discussion is that the class of algebras satisfying given formulas is closed under subalgebra (which it always is).
417
c' a ::;; b iff a::;; ib, ca ::;; b iff a::;;
i' b.
Symmetric modal Boolean algebras can be viewed as gaggles in the sense of Dunn (1993a). The only subtlety is that they have an underlying Boolean algebra instead of a distributive lattice, but the representation of gaggles, based as it is fundamentally on Stone's (1937) representation of distributive lattices, includes as a special case Stone's (1936) representation of Boolean lattices wherein lattice complement is carried into set-theoretic relative complement. 10
12.11 Dyadic Modal Operators There are well-known connections (due fundamentally to GOdel) between Heyting algebras and S4 algebras (cf. Chapter 11). The idea is that the "open" elements x = ix of an S4 algebra form a Heyting algebra, with a ::J b = i( -a V b). But since the implication operator in a Heyting algebra is two-place, whereas the modal operators above are one-place, there is a certain lack of parallelism with the standard approach to modal logic. We fix this by first considering a modal algebra in which strict implication (-<) is a primitive, together with its adjoint o. Note that at this point the choice of the right arrow is not supposed to prejudge any relationship to the right residual, but the reader is wise to anticipate. It is standard in the Kripke semantics that ¢ -< lfI is true at a world w iff at every world w' accessible from w, either ¢ is not true at w' or lfI is true at w'. Less standardly, ¢ 0 lfI is true at a world w iff ¢ is true at wand lfI is true at some world w' from which w is accessible.!1 It is easy to verify that (1)
(¢ 0 lfI)
1= X
iff lfI
1= (¢ -<
X),
which motivates our adopting the algebraic analog, right residuation: (rr)
a
0
b ::;; c iff b::;; a -< c,
which is indeed expressed in algebraic jargon by saying that -< is the (right) residual with respect to o. We of course have a precisely parallel law for Heyting algebras, but with the Heyting implication ::J substituted for -< and the lattice meet /\ substituted for 0. 12 Put this JO Another way to look at this is that lattice complement itself is a gaggle operation being a Galois connection. While the initial gaggle representation is then in terms of a two-place relation, it can be fiddled with as in Dunn (1993a) so as to obtain the usual representation in terms of relative complement. I I Thus a 0 b is in effect a II c'b. It forms the basis for Wansing's (1994) Gentzenization of modal logics, and it is also implicit in Belnap's (1982) display logic.
12This corresponds to the well-known fact that the deduction theorem and its converse hold for intuitionistic logic.
1 I
418
GAGGLES: GENERAL GALOIS LOGICS
way, the question arises as to what distinguishes strict implication from intuitionistic implication. It is not the above law, but maybe it is the properties of 07 When 0 is just meet, as it is in a Heyting algebra, it is of course commutative, associative, and idempotent. But when it is the modal operator described above, it does not need to have all of those properties. Let us first examine commutativity. It is clearly a modality-destroying principle, since commutation on the left-hand side of (1) leads to permutation of "antecedents" on the right-hand side of (1). Thus, we get from the modally harmless ¢ 1= (If/ -< If/) to the modally noxious If/ 1= (¢ -< If/). The distinction between these two theses can be viewed as the central difference between strict implication on the one hand, and the classical and intuitionistic implications on the other. In the Kripke semantics the only natural ways to make 0 commutative are to either have just one world (classical logic) or to require the "hereditary condition" that if ¢ is true at w then ¢ is true at all worlds w' accessible from w (as in intuitionistic logic). The connective 0 is clearly always "square-decreasing," ¢ 0 ¢ I- ¢, since indeed it satisfies "left lower bound," ¢ 0 If/ 1= ¢. But the converse (¢ 1= ¢ 0 ¢) does not always hold. Indeed 0 is fully idempotent exactly for alethic modal logics (those where the accessibility relation is reflexive). Thus given a reflexive frame (U, R), it is easy to see that cjJ 1= cjJ 0 ¢ is valid, for if w 1= cjJ then w 1= cjJ and 3w' (namely w) such that w' Rw and w' 1= ¢. So w 1= ¢ 0 ¢. And conversely, if the frame (U, R) is not reflexive, then for some w, wRw. Pick an atomic sentence p and define a model where w 1= p, but for any other state w " w' ~ p. Clearly there then is no state w' such that w ' Rw and w' 1= p, and so w 1= p. We leave to the reader the verification of other such correspondences that we shall describe between properties of 0 and properties of R. Thus the reader can easily verify that 0 is associative exactly when the accessibility relation is dual Euclidean 13 as with K5. S5, of course, results when 0 is idempotent, commutative, and associative, and so these properties alone cannot distinguish intuitionistic from modal logic. Before going on to explore what does then make the difference, let us first try to identify a property of o that does express the simple transitivity of the accessibility relation. The answer can be checked to be the half of associativity that says (i)
a
(b
0
d) :S (a
0
0
b)
0
d,
while the other half, (ii)
(a
0
b)
0
d :S a
0
(b
0
d),
says that the accessibility relation is dual Euclidean. We will not go through the whole proof, but we note that it is useful in checking these correspondences to think of a 0 b as being defined as a 1\ c' b. 14 Then (i) can be expressed as 13That is, aRX and fJRX imply aRfJ, (and fJRa as we11, by the commutativity of conjunction). 14This is a convenience to guide "calculations," but they do not depend on it.
DYADIC MODAL OPERATORS
419
a 1\ c'(b 1\ c'd) :S (a 1\ c'b) 1\ c'd.
It is easy to see that the a term can be dropped, so it reduces to
c' (b 1\ c'd) :S c' b 1\ c'd.
Similarly, (ii) reduces to the converse of the expression above: c' b 1\ c'd :S c' (b 1\ c'd).
We have thus sorted out some of the distinctions between various modal logics in this framework, but still have not solved the mystery of what distinguishes modal from intuitionistic logic in terms of properties of o. There are two ways of looking at this, depending on whether we think of the intuitionistic implication :J as a left or a right residual. Both ways focus on the key intuitionistic fact that the following paradox of material implication holds: (pmi) b:S a :J b. Let us start by assuming that :J is a right residual (we shall accordingly denote it by --+) in order to facilitate its comparison with -<. Then (pmi') b:S a --+ b clearly follows from a 1\ b :S b by the law of the right residual. What distinguishes modal from intuitionistic logic then is the inequality that says that a 0 b is a lower bound of the right argument b: (lbr) a
0
b :S b.
When 0 is given the modal interpretation, this of course fails, but we still have that a a b is a lower bound of the left argument a: (lbl) a 0 b :S a. From this it follows by the law of the right residual that (psi) b:S a -< a, which is a version of the paradox of strict implication, familiar from modal logic. When o is meet, of course, both inequalities hold (because 0 is then commutative, and one can get from one to the other as indicated above), but in the present context it is interesting to ask what happens when only (lbr) holds. Then one would have a logic in which the paradox of material implication holds, but not the paradox of strict implication. There is another, and maybe prettier way to look at the distinction between modal and intuitionistic logic. Let us accept the fact that (lbl) holds, but not (lbr), i.e., let us keep as fixed the modal interpretation of o. And let us continue to regard -< as the right residual. But there is also a left residual :J satisfying (lr)
a
0
b :S c iff a:S b :J c.
From (lr) for :J and (lbl) we obtain the paradox of material implication for :J, but not the paradox of strict implication, and from (rr) and (lbl) for -< we obtain the paradox
GAGGLES: GENERAL GALOIS LOGICS
420
of strict implication for -<, but not the paradox of material implication. Of course, if we postulate that -+ is both a left and a right residual we obtain both paradoxes, and from these we can deduce the commutativity of 0, from which it follows that a 0 b ::; b after all. I like this way of looking at things, for it splits the properties of the intuitionistic implication into those of two different residuals.
REPRESENTATION OF POSITIVE BINARY GAGGLES
421
Let us say a word about how to define identity elements on a frame. As is made clear in Section 8.1.2 the best way is to work with articulated ternary frames (U, R, r:;;), where R ~ U 3 , and r:;; is a partial-order satisfying: Rapy & a' r:;; a imply Ra' py; Rapy & p' r:;; pimply Rap'y;
12.12
Identity Elements
Rapy & y r:;; y imply Rapy'.
While we are tidying up, let us observe that there is a more general way of looking at the inequalities (lbl) and (lbr), which is helpful in generalizing the framework to relevance logic. Let us assume that we have a distinguished element e. The following two inequalities say that e has half of the properties of a left and right identity, respectively: (lid) e 0 a ::; a; (rid) a 0 e ::; a, The postulates (lid) and (rid) are of course just special cases of (lbr) and (lbl) respectively, and they lead by the law of the right residual to
We think of r:;; as an "information order," and define a proposition A (a representing set) to be a hereditary subset of U, i.e., a subset that is closed upward under r:;; (that is, a E A & a r:;; a' imply a' E A). As we have already mentioned, this corresponds to requirements familiar from intuitionistic and relevance logics. It is easy then to see that if Z ~ U satisfies the following condition, then it is a left lower identity with respect to propositions (Z 0 A ~ A): (lli) 3(
E
Z(R(aP) iff a r:;; p.
Also Z is a right lower identity (A
0
Z
~
A) if it satisfies:
iff a r:;; p.
a::; e -+ a;
(rli) 3(
e ::; a
By a left assertionalframe we shall mean a structure (U, R, r:;;, Z), where (U, R, r:;;) is an articulated frame and Z satisfies (lli). A right assertional frame is defined similarly, but with Z satisfying (rli), and in a doubly assertionalframe Z must satisfy both.
-+ a.
Identity elements are important because they give us an algebraic way of saying that a proposition is "true," to wit, e ::; a. Thus we can say not only that a implies a (which is said in "algebraese" as a ::; a), but also with e ::; a -+ a that the proposition "if a then a" is true. When e satisfies the properties (lid) or (rid) we shall talk of "lower" (left or right) identities. But there are sometimes good reasons to require that e be a full identity. Thus, for example, if it is a full right identity (a 0 e = a) then one can derive what we called earlier (see Section 8.1.2) push and pop: (pp) a::;b iff e::;a-+b. The "if" direction of (pp) pushes the implication down into the object language, and the converse pops it out again. With e being only a lower light identity, one gets only the "push" half of this and not the "pop." But there may also be good reasons for sometimes not requiring "pop," and so we do not always require e to be a full right identity. Let us consider what happens when e is the top element in the structure, i.e., a ::; e for every element a (when it is we will often denote it by 1). For the cases of modal and intuitionistic logic we can make that assumption. Then, because of isotonicity, (lid) implies (lbr), and (rid) implies (lbl). We illustrate with the first implication. Since lob ::; b and a ::; 1, it follows (monotonicity) that ao b ::; b. Thus, (lid) and (lbr) are equivalent, when e = 1, and similarly for (rid) and (lbl). So the "fundamental" difference between modal and intuitionistic logic is whether e is a left, or both a left and right, lower identity element. IS 15 It turns out that this difference holds up in considering relevance logics too. amounting roughly to the distinction between the system R of relevant implication and the system E of entailment (which combines
12.13
E Z(Ra(p)
Representation of Positive Binary Gaggles
We codify some of the observations made above in the following framework.
Definition 12.13.1 A binary gaggle (BO) is a distributive lattice-ordered residuated groupoid, i.e., a structure (G, /\, v, 0, -+, e), where (1) (G, /\, V) is a distributive lattice; 0 (y V z) = (x 0 y) V (x 0 z), and (y V z) 0 x = (y 0 x) V (z 0 x) (i.e., 0 distributes over V); (3) x 0 e = x (i.e., e is a right identity); (4) x 0 y ::; Z iffy::; x -+ Z (i.e., -+ is a right residual with respect to 0).
(2) x
In standard "algebraese" this is called a right-residuated distributive-lattice ordered groupoid with right identity. There is the threat of what Whitehead called "the fallacy of misplaced concreteness" here, in that in the abstract framework of Dunn (1991) there could be other binary gaggles. Thus, for example, we might have a left residual and leftidentity instead, or in addition. And we might have a whole different family of operators, founded (say) on + (distributing over /\) instead of o. As can be seen from the discussion above, BOs are of interest for their application to modal logic, interpreting -+ as strict implication. If -+ is not only a right residual but also a left residual (0 is commutative), then it is non-modal and the BO has application necessity with relevance), but the element e is not required to be 1.
422
GAGGLES: GENERAL GALOIS LOGICS
to both intuitionistic and relevance logics. We have no very good name for such a BG, but let us label it a material BG (MBG).16
Definition 12.13.2 Given a right assertionalframe (U,!;;;, R, Z), we define the full binary gaggle (FBG) induced on it (G, /\, v, 0, -+, e) so that G is the collection of all hereditary subsets of U, /\ is n, V is U, e is Z, and if A and Bare hereditmy subsets of U, A 0 B, A -+ B, B +- A are defined as in Example 12.6.3. Theorem 12.13.3 EvelY BG is embeddable in the full BG induced on some right assertionalframe (U,!;;;, R, Z). Proof The proof is a special case of the result for general gaggles of Dunn (1991), but we run through in this special case so that the reader can understand both why the official representation of a binary operation in gaggle theory uses a three-place accessibility relation Rapy, and also how it can be resolved that the ordinary Kripke-style representation uses only a two-place one. The representation of a BG proceeds along the path started by Stone (1936, 1937) for distributive lattices and Boolean algebras, extended by Jonsson and Tarski to Boolean algebras with operators, and uses ideas from the semantics of relevance logic due to Routley and Meyer (1973) and Meyer and Routley (1972). An element a of the BG is mapped into the set of all prime filters P of the BG such that a E P. Meet is thus carried into set intersection, join into set union (as with Stone), the operation 0 is the generalized image operator defined above (as in Jonsson and Tarski), and -+ is also as defined above (as in Routley and Meyer). Of course, those definitions above of 0 and -+ assume we are given a three-place accessibility relation on the prime filters, and we next show how this is canonically defined. Thus in the canonical frame, the set UG is the set of all prime filters of G, ZG is the set of all prime filters of G containing e, and RGPIPJ.Q holds iff for all a, bEG, if a E PI and b E PJ. then a 0 b E Q. We will not here work through the details of the proof that this representation works. The interested reader can work out the details from the sources cited above. 0
Corollary 12.13.4 EvelY MBG is embeddable in the full MBG induced on some right assertional commutative frame. Proof One simply observes that if the accessibility relation is commutative, so is the induced fusion operator defined as in Example 12.6.3, and conversely, if the fusion operator is commutative, then the accessibility relation on the canonical frame is commutative. 0
12.14
Implication
We can now see the problem of relating the representation of gaggles to the usual Kripke semantics by way of a crude "dimension analysis." The gaggle representation of a bi16 As indicated above, this is equivalent to requiring that a be commutative. It should be mentioned that in the general theory of gaggles, it would be permitted that the structure have both a left and a right residual when these might not be the same, but we have chosen to simplify here to be closer to the applications discussed.
IMPLICATION
423
nary operation such as implication uses a three-place accessibility relation, whereas the Kripke semantics for either intuitionistic or strict implication uses a two-place accessibility relation. While, as we shall see, this fits perfectly the analysis of implication in the Routley-Meyer semantics for relevance logic, there are stories to be told in the cases of intuitionistic and strict implication. 17 12.14.1
Implication in relevance logic
The positive fragment of the relevance logic R can be characterized algebraically as a BG (G, /\, V, 0, -+, e), where 0 is associative, commutative, and "square-increasing" (a ::; a 0 a). (Since 0 is commutative this makes it an MBG.) We shall call such an algebra a positive De Morgan monoid. 18 A positive Routley-Meyerframe can be taken to be a right assertional frame (U,!;;;, R, Z).19 Other properties match the various other algebraic properties. Thus for an R+ frame we require the following (labeled to show their correspondences to algebraic properties) : (1) 3x(Rap X & RXyo) iff 3x(Raxo & RPy X)
(associativity);
(2) Rapy implies Rpay (3) Raaa
(commutativity); (square-increasingness).
Indeed, one obtain various relevance logics depending on which properties one adds, but we do not need to go into that here. We simply note that it must be checked that the canonical accessibility relation defined on the space of prime filters has the corresponding properties. 20 The chief point is that Routley and Meyer define X
1= cP -+ If.!
iff Va, p, if RaXP and a
1= cP,
then p
1= If.!.
And this definition (modulo the choice of the first or second position of R as the "fulcrum") perfectly matches the definition of -+ in the representation of a BG. Also the semantical clause for the fusion connective is a perfect match for the definition of o. I7For intuitionistic logic there is a related and even more dramatic problem in that the gaggle representation of a also uses a three-place accessibility relation, whereas the Kripke one does not use an accessibility relation at all (0 is just intersection). 18In various papers Meyer has called these "Dunn monoids," but modesty forbids its use here. 19The reader familiar with the relevance logic literature will notice that we have taken some liberties. First, for Routley and Meyer, Z = {OJ. This is because (cf. Meyer and Routley 1972) they in effect represent only prime De Morgan monoids (those where e ::; a V b implies e ::; a or e ::; b), obtaining a prime De Morgan monoid by dividing out the Lindenbaum algebra of R by a prime filter that excludes the given proposition that one wants to falsify. Second, they do not have an explicit partial order, but instead define one from ROa fJ, giving them in effect a left assertional frame. We have chosen to work with right assertional frames, choosing the second position of the accessibility relation R (and not the first) as our "fulcrum," in order to get the proper match to our use of"" as the right residual. Note, finally, that we have built in certain properties that Routley and Meyer explicitly require of R by connecting it with the partial order J;;;, e.g., RaOa follows from
a J;;; a. 20 Details can be worked out by consulting one or more of Routley and Meyer (1973), Meyer and Routley (1972), Dunn (1986), Anderson et al. (1992), and Dunn (l993a).
424
GAGGLES: GENERAL GALOIS LOGICS
Indeed, a large part of the original inspiration for developing gaggle theory was to tease these out as special cases. 12.14.2
Implication in intuitionistic logic
But, as already indicated, it is not so straightforward with intuitionistic implication. Some of what we say here has already been said in Section 8.1.2, but we repeat it in somewhat different words for the sake of clarity and completeness. Let us consider the case where G is a positive Heyting algebra. Then 0 is just meet, and we shall see that RPI P2 Q iff PI ~ Q and PL. ~ Q. Let us assume then that RPI P2 Q. We show that PI ~ Q, and of course PL. ~ Q follows symmetrically. Suppose, then, that a E PI. Since 1 E PL., a A 1 = a E Q, as desired. Now for the converse let us assume PI ~ Q and PL. ~ Q, and we shall show that RPIPL.Q. To this end let us suppose that a E PI and bE PL.. Since both PI and PL. are subsets of Q, both of a and b are in Q. But since Q is a filter and filters are closed under meet, a AbE Q, as needed.
Definition 12.14.1 For intuitionistic logic R is defined as: Rαβγ iff Rαγ & Rβγ.

We shall ultimately justify this definition, but for the moment let us observe that the gaggle representation employing it reduces to the usual definition in terms of the two-place relation. Thus for the intuitionistic ⊃, the plugging in of Definition 12.14.1 into the definition of → in Example 12.6.3 gives us

x ∈ A ⊃ B iff ∀α, β(if Rxβ & Rαβ & α ∈ A, then β ∈ B).

Here more fiddling is required. Let us assume the right-hand side,

∀α, β(if Rxβ & Rαβ & α ∈ A, then β ∈ B),

and now show that

(†) ∀α(if Rxα & α ∈ A, then α ∈ B).

Given the reflexiveness of R, this follows immediately, instantiating β to α in the previous formula. The converse is only slightly trickier. Let us assume (†) and the antecedent of the previous formula, Rxβ & Rαβ & α ∈ A. We must now show that β ∈ B. Since α ∈ A and Rαβ, it follows from the hereditary condition that β ∈ A. But since also Rxβ, we can apply (†) to obtain β ∈ B as desired.

12.14.3 Modal logic

Let us next examine the case where G is a modal algebra. To make things easy we shall suppose that we have both of the modal operators ◇ and ⥽, with a ∘ b = a ∧ ◇b. It would be interesting to explore in more detail what would happen if, say, we had only ⥽ and ∘ as primitives, but it would take us longer to develop the conceptual point, which is, as we shall see, that RP₁P₂Q iff P₁ = Q and ◇P₂ ⊆ Q. Let us assume, then, that RP₁P₂Q. If x ∈ ◇P₂, then x = ◇a for some a ∈ P₂. Again 1 ∈ P₁ and so 1 ∧ ◇a = ◇a = x ∈ Q, and so ◇P₂ ⊆ Q as desired. We next show that P₁ ⊆ Q as an initial step in showing that P₁ = Q. Thus if b ∈ P₁, then (since 1 ∈ P₂) ◇1 ∧ b ∈ Q. But ◇1 ∧ b ≤ b, and so b ∈ Q. Now since P₁ ⊆ Q, and P₁ is a maximal filter, it must be the case that P₁ = Q. But we can define the three-place relation in terms of the two-place one as follows:

Definition 12.14.2 For modal logic R is defined as: Rαβγ iff Rβγ & α = γ.

Thus plugging in the above definition of the three-place R in terms of the two-place one, we obtain for the modal ⥽:

x ∈ A ⥽ B iff ∀α, β(if Rxβ & α = β & α ∈ A, then β ∈ B),

i.e., after substituting identicals,

x ∈ A ⥽ B iff ∀α(if Rxα & α ∈ A, then α ∈ B),

as desired. We leave to the reader a similar verification for ◇ (showing that in effect a ∘ b = a ∧ ◇b).

12.15 Negation

There are two more puzzles about the gaggle way of representing logical operations, one having to do with intuitionistic negation, and the other having to do with the De Morgan negation of relevance logic. The reader is advised to consult Dunn (1993b) to see some of the issues surrounding the definitions of negation in a more systematic context.

12.15.1 The gaggle treatment of negation

Let us first quickly review the gaggle treatment of negation. The idea is to treat negation as a Galois connection on an underlying (distributive) lattice. Thus we require:

(1) x ≤ ∼y iff y ≤ ¬x.

The first problem, as the reader must have immediately recognized, is that we have two "negations." But if we simply require that ¬ and ∼ be the same operation we obtain all the properties of what is called "minimal negation." To obtain the intuitionistic negation we must also add the property that

(2) a ∧ ∼a ≤ b.

The idea is that to represent negation one uses a binary frame (U, ⊥),²¹ and for A ⊆ U, one can give the following definitions:

∼A = A^⊥ = {x : A ⊥ x} = {x : ∀a ∈ A, a ⊥ x};
¬A = ^⊥A = {x : x ⊥ A} = {x : ∀a ∈ A, x ⊥ a}.

²¹ Actually for intuitionistic and relevance logics, the frame should be articulated with a partial order ⊑, and we should only be considering subsets A that are hereditary with respect to ⊑. But we suppress this level of detail at this moment to facilitate exposition.

The reader can verify that for X, Y ⊆ U,

X ⊆ ∼Y iff Y ⊆ ¬X.

To ensure that ¬A = ∼A, it clearly then suffices to require that ⊥ be a symmetric relation, and to obtain (2) one simply adds the requirement that ⊥ be irreflexive, which ensures that there can be no x ∈ A ∩ ^⊥A.²² In the canonical representation of gaggles, where U is the set of prime filters, the canonical embedding is defined by h(a) = {P ∈ U : a ∈ P}, and P ⊥ Q is defined as ∃x(x ∈ P & ∼x ∈ Q). It is easy to see that h(∼a) = h(a)^⊥, i.e., ∼a ∈ P iff ∀Q(a ∈ Q implies Q ⊥ P). The "if" direction is immediate. For the converse, one proceeds contrapositively, assuming that ∼a ∉ P. We now must show ∃Q(a ∈ Q & Q ⊥̸ P). We consider the principal filter determined by a, [a) = {x : a ≤ x}. It is easy to see that [a) is "compatible" with P in the sense that [a) ⊥̸ P. (Since otherwise ∃q ∈ [a), ∼q ∈ P. But then a ≤ q, and so ∼q ≤ ∼a, and then ∼a ∈ P, contrary to our assumption.) We next consider the set of all filters that contain a and are compatible with P, order it by inclusion, and a standard argument shows that the union of any chain in this set is also a filter containing a and compatible with P. This is the hypothesis of Zorn's lemma, and so we can apply it to show that there is a maximal such filter Q, and by another standard argument (using distribution) it can be shown that Q is prime. One might have chosen instead the definition P ⊥′ Q as ∃x(x ∈ P & ¬x ∈ Q), but, as the reader can easily verify, ⊥′ is just the converse of ⊥ because of (1), and this explains why one represents one negation using ^⊥A and the other using A^⊥. It is also clear that if the two operations ¬ and ∼ are identical, then the relation ⊥ on filters is symmetric. Also observe that if ∼ satisfies (2), then the relation ⊥ on filters is irreflexive, for if a, ∼a ∈ F, then a ∧ ∼a ∈ F, and so every b ∈ F, contradicting the fact that we have (implicitly) taken "filters" to be proper.

12.15.2
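The Galois property (1) and the effect of symmetry and irreflexiveness of ⊥ can be verified mechanically on a small frame. The following sketch is ours (the four-point frame and the relation are arbitrary choices; the names rneg and lneg are hypothetical):

    from itertools import combinations

    # A four-point frame with a symmetric, irreflexive perp relation.
    U = [0, 1, 2, 3]
    perp = {(0, 1), (1, 0), (2, 3), (3, 2), (0, 3), (3, 0)}

    def rneg(A):   # ~A = A-perp = {x : a perp x for all a in A}
        return {x for x in U if all((a, x) in perp for a in A)}

    def lneg(A):   # the other negation, perp-A = {x : x perp a for all a in A}
        return {x for x in U if all((x, a) in perp for a in A)}

    subsets = [set(c) for n in range(len(U) + 1) for c in combinations(U, n)]

    for X in subsets:
        for Y in subsets:
            assert (X <= rneg(Y)) == (Y <= lneg(X))   # the Galois property (1)
    for A in subsets:
        assert rneg(A) == lneg(A)   # symmetry collapses the two negations
        assert not (A & rneg(A))    # irreflexiveness: A and ~A are disjoint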
Negation in intuitionistic logic
Having set the general context, we now examine the intuitionistic case, where the familiar Kripke definition of negation on a binary frame (U, R) is as follows:

x ⊨ ¬p iff ∀α, xRα implies α ⊭ p.
The standard gaggle representation of negation also involves a binary frame (U, ⊥), so this time at least the "dimension analysis" is OK. Negation is a one-place operation, and so the gaggle representation uses a binary relation in representing it. But so does the Kripke semantics. So far there is no problem. The problem is that the gaggle definition uses the binary relation in an apparently different way:²³

x ⊨ ¬p iff ∀α, α ⊨ p implies x ⊥ α.

²² This is essentially the semantics of Goldblatt (1974) for orthologic.
²³ This definition is in line with Birkhoff's definition of a Galois connection using a binary relation, and with Goldblatt's (1974) definition of negation in his representation of ortholattices.

The first thing that occurs to one is to simply contrapose the Kripke definition to make it have the same form as the gaggle definition:

x ⊨ ¬p iff ∀α, α ⊨ p implies x R̄ α.
This is right, as far as it goes, but unfortunately one cannot simply take ⊥ to be R̄. The quick reason is that the binary relation ⊥ used in the definition of negation for intuitionistic logic in "gaggle theory" is, as we have just seen, a symmetric relation. Its complement R would then be symmetric as well, whereas the definition in the standard Kripke semantics requires R to be a partial order. Clearly, not many partial orders are symmetric (indeed, only identity). The solution reveals itself by looking at the well-known definition of negation in terms of implication and a constant false proposition ("the absurd"): ¬p = p ⊃ F. If we apply the gaggle definition using the three-place accessibility relation we obtain:

x ⊨ ¬p iff ∀α, γ(Rxαγ & α ⊨ p ⇒ γ ⊨ F).

Since F is never true, this simplifies to

x ⊨ ¬p iff ∀α, γ(Rxαγ ⇒ α ⊭ p),

i.e., in the canonical model, where α and γ are prime filters,

x ⊨ ¬p iff ∀α, γ(x ⊆ γ & α ⊆ γ ⇒ α ⊭ p).
Clearly if we define xCα ("x is compatible with α") as: for some γ, x ⊆ γ & α ⊆ γ, i.e., they have a common extension, we obtain a symmetric relation, and it turns out to be equivalent to the standard gaggle one. Compatibility is of course the complement of "perp," so x ⊥ α iff not (xCα). Thus the standard gaggle definition of negation becomes: x ⊨ ¬p iff ∀α, α ⊨ p implies x ⊥ α. The equivalence is trivial in one direction and needs Zorn's lemma for the other. What is going on is that the Kripke definition of negation uses inclusion of the prime theory α in the prime theory β, and inclusion is a subrelation of the compatibility relation. But it gives the same results because one can prove that if αCβ, then there is a theory that extends both. Thus if ¬p were false at a prime theory α on the gaggle account, p would show up in a prime theory χ such that αCχ; extending α and χ to a common extension γ, this will be an extension of α that contains p, and so ¬p would be false not only on the gaggle account, but also on Kripke's. The converse is trivial since if p showed up in an extension of α (i.e., ¬p is false on Kripke's account), then a fortiori p would show up in a theory compatible with α (and ¬p would be false on the gaggle account).

12.15.3
Negation in relevance logic
There is also a problem about how to "gaggleize" the standard semantical treatment of De Morgan negation in relevance logic. De Morgan monoids were introduced in Dunn
(1966) as a way of characterizing the relevance logic R. A De Morgan monoid is a structure (G, ∧, ∨, ∘, →, ∼, e), where (G, ∧, ∨, ∘, →, e) is a positive De Morgan monoid (in the sense as we defined in Section 12.14.1), and ∼ is an antitonic operation of period 2 satisfying: a ∘ b ≤ c iff b ∘ ∼c ≤ ∼a iff ∼c ∘ a ≤ ∼b. Rather than saying that ∼ is antitonic and of period 2, we could have equivalently said that (∼, ∼) is a Galois connection on G satisfying ∼∼a ≤ a, i.e., every element of G is "Galois closed." A Routley-Meyer frame can be taken to be a structure (U, ⊑, R, Z, *), where (U, ⊑, R, Z) is a right assertional frame and * is a unary operation on U such that (1) α** = α (period two) and (2) Rαβγ iff Rγ*αβ* (switching). By an R-frame we mean a Routley-Meyer frame whose right assertional frame is an R₊ frame.²⁴ The clause for negation in the Routley-Meyer semantics is:²⁵

x ⊨ ∼φ iff x* ⊭ φ.
This makes clear the problem we must address, because this is simply not the form of a gaggle-theoretic definition of negation, which would rather be the following:

x ⊨ ∼φ iff ∀α(α ⊨ φ ⇒ x ⊥ α).
Not only do the definitions have different forms, they also have different ingredients in that the Routley-Meyer definition uses a unary function, whereas the gaggle-theoretic definition uses a binary relation (not required to be functional in character). It turns out that in relevance logic ∼φ can be defined as φ → f, where f is thought of as the disjunction of all falsehoods. Note that f ≠ F, which last can be thought of as the conjunction of all propositions. Unlike F, f does not imply everything. It turns out that running through the analysis of the semantic conditions for φ → f in R leads to insights. Thus:

x ⊨ φ → f iff ∀α, β(Rxαβ & α ⊨ φ ⇒ β ⊨ f),

i.e.,

x ⊨ ∼φ iff ∀α, β(Rxαβ & β ⊭ f ⇒ α ⊭ φ).
But one can work out that β ⊭ f iff β ≤ 0*, where x ≤ y abbreviates R0xy for Routley and Meyer, and can be thought of as our information ordering ⊑, canonically as x ⊆ y. For β ≤ 0* means that whatever is true at β is true at 0*. But 0* ⊭ f = ∼t, since 0 ⊨ t. So β ⊭ f. And conversely, if β ⊭ f, then β* ⊨ ∼f = t, and so 0 ≤ β*, i.e., β ≤ 0*. So we have:

x ⊨ ∼φ iff ∀α, β(Rxαβ & β ≤ 0* ⇒ α ⊭ φ).

Rearranging quantifiers, we obtain:

²⁴ Then in view of commutativity we can require instead of switching the customary postulate: Rαβγ implies Rαγ*β*.
²⁵ This clause is as in Routley and Routley (1972), though it actually originates in the representation of De Morgan lattices by Bialynicki-Birula and Rasiowa (1957). Cf. Dunn (1976) for a fuller discussion.
x ⊨ ∼φ iff ∀α(∃β[Rxαβ & β ≤ 0*] ⇒ α ⊭ φ).
Read in bastard English and with the canonical model in mind, this says that if x and α are "compatible" from the viewpoint of a consistent β,²⁶ then φ is false at α. So it is not surprising that by contraposition we obtain something having the form of the gaggle definition of negation:

x ⊨ ∼φ iff ∀α(α ⊨ φ ⇒ ∀β ≤ 0*, not Rxαβ),
taking x ⊥ α to mean ∀β ≤ 0*, not Rxαβ. Let us then in the canonical model compare the notion of "compatibility from a consistent viewpoint" with the canonical notion of χCα, or equivalently its negation χ ⊥ α. First notice that if x ∘ ∼x is in any filter G, then f ∈ G (since x ∘ ∼x ≤ f). So let us suppose χ ⊥ α. Then for some x ∈ χ, ∼x ∈ α. Suppose that Rχαβ. Given the canonical definition of R, this means that x ∘ ∼x ∈ β. But then f ∈ β, and so β is not consistent. Arguing the other way around, let us suppose canonically that it is not the case that χ ⊥ α. Set β₀ = [χ ∘ α) = {b : ∃x ∈ χ, ∃a ∈ α, x ∘ a ≤ b}. It is easy to see that β₀ is a filter. Indeed, it is a consistent filter since if f ∈ β₀, then ∃x ∈ χ, ∃a ∈ α, x ∘ a ≤ f. But then a ≤ x → f = ∼x, and so ∼x ∈ α, and so χ ⊥ α, contrary to what we have supposed. So β₀ is a filter not containing f. But now using Zorn's lemma, one can prove by a standard argument (due to Stone) that β₀ can be extended to a prime filter excluding f.

12.15.4
Negation in classical logic
We save the most familiar to last. What properties does one have to put on ⊥ to give A^⊥ the properties of classical negation? As we have already seen in Section 12.15.1, if ⊥ is made symmetric and irreflexive then the following hold:

X ⊆ Y^⊥ iff Y ⊆ X^⊥;
X ∩ X^⊥ = ∅.

If, as is also needed for relevance logic, one restricts the "propositions" X to be the Galois closed sets, i.e., X = X^⊥⊥, one gets all of the properties of orthocomplement, including all of De Morgan's laws. This is the basis of Goldblatt's (1974) semantics for orthologic. We should point out that ∨ is not the same as ∪, but rather, given the De Morgan definition,

X ∨ Y = (X^⊥ ∩ Y^⊥)^⊥.
We also point out that this does not give the whole orthomodular logic used in one standard interpretation of quantum mechanics, since it is missing the orthomodular law (a kind of weak form of distribution).

²⁶ "Consistent" here simply means f ∉ β. In R this implies negation consistency in the sense that ∀a(a ∉ β or ∼a ∉ β).
How does classical logic arise in this context? Obviously we need that all subsets of U are Galois closed. The natural way to ensure this is to take ⊥ to be just the relation of non-identity ≠. Then X^⊥ is the set of things that are distinct from everything in X, i.e., just U − X. If we examine the canonical frame we note that P ⊥ Q iff P ≠ Q. Thus if P ⊥ Q then ∃x(x ∈ P & ∼x ∈ Q). It must be the case then that P ≠ Q, for otherwise ∃x(x ∈ P & ∼x ∈ P). Then 0 = x ∧ ∼x ∈ P, and so P is not proper. Conversely, if P ≠ Q then ∃x(x ∈ P & x ∉ Q), or the other way around. The two cases are symmetric so we shall just consider the first case. But then since, in a Boolean algebra, (proper) prime filters are maximal (Theorem 8.9.1), we know (Theorem 8.9.2) that ∼x ∈ Q. So canonically P ⊥ Q.
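The point that non-identity as perp yields plain set complement, with every subset Galois closed, can be checked directly. A minimal sketch of ours, assuming an arbitrary three-point universe:

    from itertools import combinations

    # Taking perp to be non-identity makes the perp of X plain set complement,
    # so double perp is the identity and every subset is Galois closed.
    U = [0, 1, 2]
    def perp_set(X):
        return {x for x in U if all(x != a for a in X)}

    for n in range(len(U) + 1):
        for c in combinations(U, n):
            X = set(c)
            assert perp_set(X) == set(U) - X
            assert perp_set(perp_set(X)) == X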
12.16 Future Directions
Besides applying distributoids and gaggles as defined in this chapter, there are also three ways to generalize these structures so that they can be applied to more logics. One way has already been suggested, to weaken the distributive lattice ordering. Such a weakening to a semi-lattice ordering or just a partial ordering has been explored in Dunn (1993a). The latter is along the lines of the representation of implicational posets in Section 8.1. The weakening to a lattice ordering has been explored in Allwein and Dunn (1993) and Allwein (1992). Although operations whose degree is 3 or greater are not explicitly considered, we believe that such an extension would be routine (but complicated). Hartonas (1993a, 1993b) does explicitly consider operations of arbitrary degree, but contains a completely different style of representation. A second way to generalize gaggles would be to expand the notion of a "family of operations" so that operations of various degree can live happily together. There is a suggestion of this in the treatment of negation in intuitionistic logic, and in relevance logic, where negation is defined using the same accessibility relation as used for implication (and fusion). A third way to generalize gaggles would be to contemplate logics where various families of operations live happily together, for example intuitionistic or relevance logic with modal operators.
13 REPRESENTATIONS AND DUALITY¹

13.1 Representations and Duality
The study of representations of lattices and related structures has been considerably enriched since Stone's original work, with contributions from various authors. We restrict ourselves here to a brief report for the sake of the interested reader. We begin by bringing attention to an important aspect of the Stone representation, called "duality," that we did not previously discuss. Let us first try to give philosophical meaning to duality issues. Duality gets its ultimate expression in the terms of category theory, but we shall do our best to give the reader some intuitive grasp of duality without resorting to category theory, or at least its diagrams. "Duality" in its most general sense has to do with "reversing." One talks of the "dual" of a partial order ≤ being ≥, or of ∧ being "dual" to ∨. If one is lucky, laws regarding one notion become laws regarding the other, as is indeed the case with the examples given. A somewhat different kind of example is more germane to our concerns. In geometry it is a familiar idea that a line can be taken to be a set of points. But the opposite is true as well: a point can be taken to be the set of lines that intersect it.² So geometry can be formulated just as well in a "dual" fashion, with lines rather than points as primitive. The actual example that concerns us is that we can think of a proposition as a set of possible worlds, or dually, we can think of a possible world as a maximally consistent set (filter) of propositions. Thus given a set U of possible worlds, one can form the set of propositions ℘(U) as just the power set of U (a Boolean algebra), and given a set of propositions B (viewed as having the structure of a Boolean algebra), one can form the set of worlds W(B) as just the set of maximal filters of B.³ This forms the underlying conceptual basis of Stone's representation of Boolean algebras, wherein we start with a Boolean algebra B of abstract propositions, form "possible worlds" as maximal filters of these and so obtain W(B), and then realize the abstract propositions as sets of these

¹ We wish to thank Gerry Allwein and Chrysafis Hartonas for help on this chapter. In describing Priestley's representation and duality results we borrow heavily from Allwein (1992), who in turn borrows from Priestley (1970, 1972), and Davey and Priestley (1990).
² Indeed some sophisticated version of this is the basis of projective geometry, wherein not only do any two distinct points determine a line, but equally any two distinct lines determine a point (perhaps "at infinity" for lines that would otherwise be parallel).
³ That ℘(U) can be viewed as standing for both "power set of U" and "propositions of U" is a lovely mnemonic accident. But what can be done mnemonically with W(B)? The answer of course lies in the fact that W is an upside down M and we are forming the dual set of maximal filters.
possible worlds so as to obtain ℘W(B). Thus we go from propositions to possible worlds and back to propositions again. We of course can also start with possible worlds, construct propositions as sets of these, and then obtain possible worlds as maximally consistent sets of these propositions. So suppose we have a set U which we think of as the set of "all" possible worlds. Let us suppose that the cardinality of U is n. Then when we form the power set ℘(U) we get the set of "all" propositions, and its cardinality is 2ⁿ. So far, so good.⁴ A possible problem arises when we realize that we can now apply the dual construction to ℘(U) so as to form all of its maximally consistent sets and so obtain a set of possible worlds. Let us call this new set W℘(U). How big is W℘(U)? We know that in the power set all maximal filters are principal, and that means that for each maximal filter M there is a world w ∈ U so that A ∈ M iff w ∈ A. So there is a natural one-one correspondence between U and W℘(U), and thus in some appropriate sense we obtain no new worlds in going from U to W℘(U). To make this concrete the reader is invited to consider:

U = {w₁, w₂},
℘(U) = {{w₁, w₂}, {w₁}, {w₂}, ∅},
W℘(U) = {{{w₁}, {w₁, w₂}}, {{w₂}, {w₁, w₂}}}.
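This two-world example generalizes mechanically. A small Python sketch of ours (the helper names are hypothetical) enumerates the maximal filters of a finite power set algebra and exhibits the one-one correspondence with the underlying worlds:

    from itertools import combinations

    def powerset(S):
        S = list(S)
        return [frozenset(c) for n in range(len(S) + 1) for c in combinations(S, n)]

    U = frozenset({'w1', 'w2'})
    props = powerset(U)   # the Boolean algebra of propositions

    def is_maximal_filter(F):
        return (U in F and frozenset() not in F
                and all(A & B in F for A in F for B in F)            # meets
                and all(B in F for A in F for B in props if A <= B)  # up-closed
                and all(A in F or (U - A) in F for A in props))      # maximality

    candidates = [set(c) for n in range(1, len(props) + 1)
                  for c in combinations(props, n)]
    max_filters = [F for F in candidates if is_maximal_filter(F)]

    # Each maximal filter is principal: it is {A : w in A} for exactly one w.
    assert len(max_filters) == len(U)
    for w in U:
        assert {A for A in props if w in A} in max_filters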
Things are not so easy if one starts with an abstract Boolean algebra and forms W(B), ℘W(B), etc. The question, put roughly, is whether we get back to where we started, or whether we have a new set of propositions, and then a new new set of propositions, etc. Where does it stop? If we do not cut this off at the first iteration, there seems no hope of stopping this construction, and it completely undermines any claim we might have had about B being the set of all propositions (say, about some particular subject matter). If the reader considers only finite Boolean algebras, say the four-element Boolean algebra of Figure 8.11 for concreteness, there may be a false conclusion, for here we do get back to where we started. Thus B = {1, a, ā, 0}, W(B) = {{1, a}, {1, ā}}, and ℘W(B) = {{{1, a}, {1, ā}}, {{1, a}}, {{1, ā}}, ∅}. But the problem is caused by Boolean algebras that are not isomorphic to any power set, and for this we need to consider some appropriate infinite Boolean algebra. One that is not too exotic is the Lindenbaum algebra formed from a language with a denumerable set of atomic sentences: At = {p₁, p₂, ...}. Where φ is a sentence of the language, [φ] is the set of sentences provably equivalent (in classical logic) to φ. We know that this Lindenbaum algebra is the same as the free Boolean algebra with denumerably many free generators, with the generators being members of the set G =

⁴ Actually the reader may have noticed that already there is a mathematical problem, since the collection of all possible worlds would seem to be too large to form a set, and would seem to be a proper class. But in certain special circumstances, it may be that we can consider all possibilities and collect them together in a set. Think, for example, of a phase space in Newtonian mechanics, with some fixed finite number of (point) particles, each of which is assigned a position and momentum. Each "possible world" is given by such an assignment.
{[p₁], [p₂], ...}. We know that the cardinality of the set of sentences of the language is ℵ₀, and it is easy to see that the cardinality of the set of elements of the Lindenbaum algebra is also ℵ₀, since it is not any larger and is not finite since always [pᵢ] ≠ [pⱼ] when i ≠ j. We know from Cantor that the size of the power set of G is non-denumerable, indeed it is of cardinality 2^ℵ₀. It is also clear that each subset of G corresponds to a set that has in it just every item in the subset together with the complement of every item not in the subset. Each such set is consistent and can be extended to a maximal filter, but no two such sets can be extended to the same maximal filter since they will differ in that one has, say, [pᵢ] and the other has −[pᵢ] = [¬pᵢ]. So the size of the set of maximal filters is also 2^ℵ₀. Now considering this as a set of worlds, we know when we form the set of its propositions, i.e., its power set, we get a set of the size 2^(2^ℵ₀), and so we do definitely get a new set of propositions, and we are off and running on a kind of infinite regress, since we can now form a new, new set of worlds, and a new, new set of propositions, etc.

Perhaps it is stretching things, but Stone, in his representation of Boolean algebras, can be viewed as cutting off this regress by providing the set of maximal filters W(B) with topological structure so as to determine the field of sets isomorphic to the original Boolean algebra independently of the representation map, in terms of properties of subsets of W(B) arising from this additional structure. Of course, to complete this story we should give a philosophical interpretation of this topological structure, but we do not at this point have such an interpretation and commend this project to some reader.

From this point on we stop being philosophers and turn to being pure mathematicians. We know that Stone's representation cannot always represent a Boolean algebra as a full power set. Thus we seem forced to describe the target representation algebras as "fields of sets" (sometimes "concrete Boolean algebras"), where these are just any collection of subsets of some given "universe" U where these subsets are closed under intersection, union, and complement relative to U. It would be nice to have some more intrinsic characterization of these target algebras, and this is what Stone provides by way of looking at U not just as a mere set, but rather as a set with a topological structure.
13.2 Some Topology

Some of the notions discussed here have appeared in other parts of the book. But we collect them all together here in a unified fashion. The reader who wishes to learn more is advised to consult some standard book on topology such as Kelley (1955) or Simmons (1963) (our own personal favorite). One of the prettiest and most useful notions of pure mathematics is that of a topological space T = (U, {Oᵢ}ᵢ∈I), where U is a non-empty set and {Oᵢ}ᵢ∈I is a collection of subsets of U (called a topology) which is closed under finite intersections and arbitrary unions. Among the finite intersections we include the empty intersection, and so U is open. Also ∅ is open since we include the empty union. Each Oᵢ is called an open set, and its complement U − Oᵢ is called a closed set. Alternatively the closed sets might be taken as primitive, requiring them to be closed under finite unions and arbitrary intersections, with then the open sets characterized as their complements (relative to U). Note that U and ∅ are always closed as well as open.
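On a finite carrier the closure conditions can be verified directly; a minimal sketch of ours (assuming an arbitrary three-point example), which exploits the fact that for finite families closure under binary intersections and unions suffices:

    def is_topology(U, opens):
        opens = {frozenset(O) for O in opens}
        if frozenset() not in opens or frozenset(U) not in opens:
            return False   # the empty union and the empty intersection
        # on a finite carrier, closure under binary meets and joins suffices
        return all(A & B in opens and A | B in opens
                   for A in opens for B in opens)

    U = {1, 2, 3}
    print(is_topology(U, [set(), {1}, {2, 3}, U]))   # True
    print(is_topology(U, [set(), {1}, {2}, U]))      # False: {1, 2} is missing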
A set is clopen if it is both open and closed. Note that if O is clopen, then so is U − O. Since clopen sets are also closed under ∩ and ∪, the clopen sets constitute a field of sets. Sometimes we are given a collection {Bᵢ}ᵢ∈I of subsets of U, where the collection is closed under finite intersections but not under arbitrary unions. Such a collection is called a base for a topology, because we can just close it under arbitrary unions, and the resulting collection of subsets then of course can be viewed as a collection of open sets. There is an even weaker notion called a subbase, where the given collection of sets has no special closure properties. But we can close it under finite intersections (obtaining a base) and then under arbitrary unions to obtain a topology. By a clopen subbase we mean a collection of subsets {Sⱼ}ⱼ∈J closed under relative complement: U − Sⱼ. Given U′ ⊆ U, we can view it as a subspace of T (the relative topology): T′ = (U′, {Oᵢ′}ᵢ∈I), where Oᵢ′ = Oᵢ ∩ U′. (The two-stage generation of a topology from a subbase is sketched below.)

The additional topological structure that is needed for duality is given by two notions, total disconnectedness and compactness. Totally disconnected means that distinct points are separated by a clopen set, i.e., x ≠ y implies that there is a clopen set O such that x ∈ O and y ∉ O. Since U − O is also clopen, this means that the pair is separated by two disjoint clopen sets. A space is Hausdorff if any two distinct points are separated by disjoint open sets, so any totally disconnected space is Hausdorff. Hausdorff spaces play a very important role in topology, and are midway in a hierarchy of separation principles, the weakest of which defines a space to be T₀ if for each pair of distinct points there is an open set that contains one but not the other.

Compactness is a very important property of topological spaces that arises from the real line. It says that if there is a collection of open sets {Oᵢ}ᵢ∈I such that ∪ᵢ∈I Oᵢ = U, then for some finite J ⊆ I, ∪ⱼ∈J Oⱼ = U. An important feature of the real line is the set of open intervals. This is already closed under intersection, and so is a base for the so-called usual topology of the real line, since all one has to do is close it under arbitrary unions and it becomes a collection of open sets. The famous Heine-Borel theorem states that every closed and bounded subspace of the real line is compact. In particular, this means that every closed interval, such as the closed unit interval [0, 1], is compact. This is related to the following more abstract theorem.
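The two-stage generation of a topology from a subbase (close under finite intersections, then under unions) is entirely mechanical in the finite case. A sketch of ours, with an arbitrary subbase on a three-point set:

    from itertools import combinations

    def topology_from_subbase(U, subbase):
        U = frozenset(U)
        sub = [frozenset(S) for S in subbase]
        base = {U}                          # the empty intersection
        for n in range(1, len(sub) + 1):    # close under finite intersections
            for c in combinations(sub, n):
                inter = U
                for S in c:
                    inter &= S
                base.add(inter)
        opens = {frozenset()}               # the empty union
        base = list(base)
        for n in range(1, len(base) + 1):   # close under (all) unions
            for c in combinations(base, n):
                acc = frozenset()
                for B in c:
                    acc |= B
                opens.add(acc)
        return opens

    T = topology_from_subbase({1, 2, 3}, [{1, 2}, {2, 3}])
    print(sorted(map(sorted, T)))   # [[], [1, 2], [1, 2, 3], [2], [2, 3]]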
Theorem 13.2.1 Every closed subspace of a compact space is compact (in the relative topology).

Corollary 13.2.2 Let X be a closed set and let {Oᵢ}ᵢ∈I be a collection of open sets that covers X in the sense that ∪ᵢ∈I Oᵢ ⊇ X. Then there is a finite subcollection {Oⱼ}ⱼ∈J that covers X, again in the sense that ∪ⱼ∈J Oⱼ ⊇ X.

Proof We shall not prove Theorem 13.2.1, which can be found in any standard topological text. The reader should be warned, though, that it is sometimes found stated as our Corollary 13.2.2. This is sometimes disguised by changing the sense of cover so as to replace ⊇ with =, and so we introduce the terminology exact cover for the latter, i.e., when ∪ⱼ∈J Oⱼ = X. It is worth noting how the corollary is related to the theorem. The trick is to notice that we can view X as a topological space in its own right with the relative topology. Then Oᵢ ∩ X is an open set in the relative topology, and X = ∪ᵢ∈I(Oᵢ ∩ X). Since X is closed in the original topology, we can now apply Theorem 13.2.1 to conclude that X = ∪ⱼ∈J(Oⱼ ∩ X) for some finite subunion, and hence X ⊆ ∪ⱼ∈J Oⱼ. □
It is often convenient to investigate compactness by focusing not on the open sets, but on the basic, or even subbasic sets. The following is therefore useful.
Theorem 13.2.3 (Alexander subbase theorem) If S is a subbase for the topology T = (U, 𝒪) such that every cover of U by members of S has a finite subcover, then T is compact.

We need one last notion before continuing our discussion of duality.
Definition 13.2.4 Given two topological spaces T = (U, 𝒪) and T′ = (U′, 𝒪′), we call a function f from U to U′ open if it preserves open sets, i.e., if O ∈ 𝒪, then the image f*(O) = {f(a) : a ∈ O} ∈ 𝒪′. A function f from U to U′ is called continuous if it pulls open sets back to open sets, i.e., if O′ ∈ 𝒪′, then the inverse image f⁻¹(O′) = {a : f(a) ∈ O′} ∈ 𝒪.

Continuous maps play roughly the same role in topology as do homomorphisms in algebra. We need then to introduce the analog of an isomorphism, which (somewhat confusingly given the algebraic analogies) is called a homeomorphism. The reader is correct to think that for a homeomorphism we need to require that f be one-one and onto, as well as, of course, continuous. But it is also necessary to require f to be open in order to meet the requirement that the inverse of a homeomorphism is also a homeomorphism.
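For finite spaces, both conditions of Definition 13.2.4 amount to simple membership tests. A sketch of ours (the two small spaces and the map f are arbitrary choices), illustrating a map that is continuous but not open:

    def image(f, O):
        return frozenset(f[a] for a in O)

    def preimage(f, O):
        return frozenset(a for a in f if f[a] in O)

    def is_continuous(f, opens_src, opens_tgt):
        return all(preimage(f, O) in opens_src for O in opens_tgt)

    def is_open_map(f, opens_src, opens_tgt):
        return all(image(f, O) in opens_tgt for O in opens_src)

    fs = frozenset
    src = {fs(), fs({1}), fs({2}), fs({1, 2}), fs({1, 2, 3})}
    tgt = {fs(), fs({'a'}), fs({'a', 'b'})}
    f = {1: 'a', 2: 'b', 3: 'b'}
    print(is_continuous(f, src, tgt))   # True: every preimage of an open is open
    print(is_open_map(f, src, tgt))     # False: the image of {2} is {'b'}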
Definition 13.2.5 A homeomorphism between two topological spaces T = (U, 𝒪) and T′ = (U′, 𝒪′) is a one-one function f from U onto U′ which is both open and continuous. Two topological spaces will be said to be homeomorphic images of each other just when there is a homeomorphism between them.

13.3
Duality for Boolean Algebras
We are now in a position to prove a version of Stone's representation that builds in the duality aspects. A compact and totally disconnected space has come to be known in the literature as a Stone space (though Stone himself, of course, spoke more modestly of Boolean spaces). We assume knowledge of the proof of the representation theorem for a Boolean algebra B, recalling that the isomorphism is defined as h(a) = {M : M is a maximal filter of B and a ∈ M}. We will show how to add to this proof so as to obtain a topologized version of it which builds in duality considerations.
Theorem 13.3.1 (Stone) Every Boolean algebra B is isomorphic to the Boolean algebra of clopen subsets of a Stone space S(B).
436
DUALITY FOR BOOLEAN ALGEBRAS
REPRESENTATIONS AND DUALITY
be unions of sets of the form h(a). Note that sets of the form h(a) are clopen since there complements are also open by virtue of the fact that U - h(a) = h( -a). Lemma 13.3.2 S(B) is totally disconnected. PlVO! Suppose we have M, M' E U with M -::j:. M'. Then 3a E B such that a E M and a ¢ M', or vice versa (the argument is symmetric). But then M E h(a) and yet M' ¢ h(a). And so h(a) is a clopen set that separates the point M from the point M'. D
Compactness is harder. We find it easiest to work with compactness in a well-known dual form. 5 We shall call the original form of compactness U-compactness. A collection of sets C is said to have the finite intersection property if every finite subcollection has non-empty intersection. A space U is said to be n-compact if every collection of closed sets which has the finite intersection property also has non-empty intersection. Lemma 13.3.3 U-compactness and n-compactness are equivalent. PlVO! Apply set-theoretic complement and De Morgan's laws. We need to show that if UXEX hex) = U, then for some finite Y ~ X, UYEY hey) = U. Note that this is compactness for the (sub)basic sets, but it implies compactness for the open sets by the alexander subbase theorem 13.2.3. D
Lemma 13.3.4 (Maximal Filter Principle) Let F be a plVper filteJ: Then F can be extended to a maximalftlter M'. PlVO! This is just another variation on Stone's .prime/maximal filter separation principles (see Lemma 8.6.2 and Exercise 8.11.11), and is sometimes called the maximal filter theorem. The idea is to set E = {G : G is a filter and F ~ G} and order E by inclusion. All of its chains have upper bounds (their unions), and so by Zorn's lemma E has a maximal member M. It is easy to see that this is a maximal filter simpliciter (and not just a filter maximal among those extending F), for if M c M' then F ~ M c M', and so M would not be a maximal member of E after all. D
Lemma 13.3.5 The space of maximal filters U is n-compact. PlVO! Suppose X ~ B and for no finite Y ~ X does nyEy hey) = 0. Let [X) be the filter determined by X. We first show that [X) is proper. Otherwise 3Yl, ... , YIIl E X such that Yl/\ .. . /\Ym = O. But this is impossible given that some maximal (proper!) filter ME h(Yl) n ... n h(Ym). Because then Yl, ... , YIIl EM, and so Yl/\ ... /\ Ym = 0 E M, meaning that M is not proper after all. Now that we know that [X) is proper we can extend it to a maximal filter M' using the maximal filter principle (Lemma 13.3.4). Clearly for x E X, x EM', i.e., M' E hex). Thus nXEX hex) -::j:. 0. D
Lemma 13.3.6 h is onto the class of clopen sets. SThe various fonus of compactness we discuss are analogous to various semantical notions introduced in Section 9.4.
437
Proof By using sets of the form h(a) as a base, we have trivially ensured that each h(a) is open. We are finally in a position to show the converse. Let O be a clopen set. Then O = ∪ₓ∈X h(x) for some X ⊆ B. But O is also a closed set of a compact Hausdorff space, and so by Theorem 13.2.1 is thus itself compact. But then there are y₁, ..., yₙ ∈ X such that O = h(y₁) ∪ ... ∪ h(yₙ) = h(y₁ ∨ ... ∨ yₙ). □
Let us now assess the duality aspects. We start with a Boolean algebra B, form the Stone space S(B), and then form the Boolean algebra of the clopen sets B(S(B)). Contrary to the famous novel of Thomas Wolfe, we can sometimes get home again (at least in category theory). This is shown by Theorem 13.3.1, which tells us that B is isomorphic to B(S(B)). There is a symmetric question concerning duality. What if we start with a Stone space S, form the Boolean algebra B(S) (the dual algebra of S) of its clopen sets, and then form the Stone space S(B(S))? When we examined this construction with respect to the power set construction we discovered that the answer is yes: simply map an element a ∈ U to the set of subsets of U that contain it as a member. This is a one-one mapping onto the maximal filters of the power set. Let us then try the same trick, mapping an element a to the set f(a) of clopen subsets of U that contain a. It is easy to see that f(a) is a complete and consistent filter, and hence a maximal filter. The map is clearly one-one because of total disconnectedness. Is it onto? Suppose we have a maximal filter M. We need to know that there is some a ∈ ∩M. Note that M is closed under binary intersections since it is a filter, and so under all finite intersections. This means that we have the finite intersection property, and so by compactness ∩M is non-empty and contains some a. We show that f(a) = M. Clearly M ⊆ f(a). But since f(a) is a filter and M is a maximal filter, M = f(a). We are not yet through showing that f is a homeomorphism, but we leave to the reader the completion of this as an exercise.

Exercise 13.3.7 Show that f is both open and continuous.

The duality above is often called "object" duality and is distinguished from "functorial duality," with "full duality" including both. Functorial duality is at the level of morphisms (in this case the isomorphisms and the homeomorphisms), and not just at the level of the objects (in this case the Boolean algebras and the Stone spaces), and says that there is a natural one-one correspondence of isomorphisms between two Boolean algebras and homeomorphisms between their two dual spaces, and that there is a similar one-one correspondence of homeomorphisms between two Stone spaces and the isomorphisms between their dual algebras. Further, these correspondences between isomorphisms and homeomorphisms compose in ways so that "nothing new" arises by repeated compositions. For example, if one were to start with an isomorphism between two Boolean algebras, go to the corresponding homeomorphism between the two dual spaces, and then take the isomorphism between the two dual algebras, there is a natural sense in which one "comes home" to the isomorphism with which one started. And similarly if one were to start with a homeomorphism between two Stone spaces. In this chapter we will focus on object duality, though we shall try to keep track from time to time of functorial duality as well.
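In the finite case (where the topology is discrete and every subset is clopen) the object-level round trip can be run by brute force. A sketch of ours, for the power set algebra on two atoms; the helper names are hypothetical:

    from itertools import combinations

    # The round trip B -> S(B) -> B(S(B)) for the finite case B = P({1, 2}).
    atoms = [1, 2]
    B = [frozenset(c) for n in range(len(atoms) + 1) for c in combinations(atoms, n)]
    top, bot = frozenset(atoms), frozenset()

    def is_maximal_filter(F):
        return (top in F and bot not in F
                and all(a & b in F for a in F for b in F)
                and all(b in F for a in F for b in B if a <= b)
                and all(a in F or (top - a) in F for a in B))

    U = [set(c) for n in range(1, len(B) + 1) for c in combinations(B, n)
         if is_maximal_filter(set(c))]
    h = {a: frozenset(i for i, M in enumerate(U) if a in M) for a in B}

    for a in B:                      # h turns the algebra into a field of sets
        for b in B:
            assert h[a & b] == h[a] & h[b]
            assert h[a | b] == h[a] | h[b]
    assert h[top] == frozenset(range(len(U))) and h[bot] == frozenset()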
13.4 Duality for Distributive Lattices
There are two topological approaches to Stone's representation of distributive lattices. The approach taken by Stone (1937) leads to a representation in terms of spaces which Hochster (1969) calls spectral. These spaces are such that their family of compact open sets is closed under intersection and forms a base for the topology of the space. Further, they are sober in the sense that every irreducible closed subset is the closure of a unique singleton set (a "point"), where a subset C of the space is irreducible closed if, whenever C is the union of two closed sets C₁ ∪ C₂, then C = C₁ or C = C₂. Johnstone (1982) and Vickers (1989) are good "textbook" sources on this approach, though the reader is warned that Johnstone speaks of coherent spaces instead of spectral spaces.⁶

The approach of Priestley (1970) is somewhat simpler, and also (though this is hindsight) sympathetic to the idea from philosophical logic that a "proposition" is a hereditary subset A of an information frame (U, ⊑), i.e., if α ∈ A and α ⊑ β then β ∈ A. Priestley introduced ordered Stone spaces T = (U, ⊑, {Oᵢ}ᵢ∈I), where ⊑ is a partial order on U and (U, {Oᵢ}ᵢ∈I) is a compact topological space which is totally order disconnected, i.e., every two points α ⋢ β can be separated by a hereditary clopen set, i.e., there exists a hereditary clopen set C such that α ∈ C and β ∉ C. We shall call these Priestley spaces. To smooth things out Priestley considers only bounded lattices, that is, those which have a least element 0 and a greatest element 1.

⁶ Vickers (1989, p. 121) says that Johnstone's usage is common in domain theory.

Theorem 13.4.1 (Priestley) Every bounded distributive lattice is isomorphic to the lattice of clopen hereditary subsets of a Priestley space.

Priestley's theorem builds on Stone's representation of distributive lattices (see Theorem 8.6.1) and is essentially the same as for Boolean algebras, except that we let U be the set of proper prime filters P, and let ⊆ restricted to U be the required partial order. We set h(a) = {P : a ∈ P}. We note that this nicely generalizes the Boolean case (since for a Boolean algebra, maximal filters and proper prime filters are the same), and the proof that h is an isomorphism is even slightly easier since we do not have to check that h(−a) = U − h(a). Unfortunately, the absence of Boolean complement makes the duality part of the theorem much less obvious, since we can no longer so easily see that the set-theoretic complement of a basic open set is also a basic open set by simply observing that U − h(a) = h(−a). Accordingly we define a new class of subbasic sets: S = {h(a) : a ∈ L} ∪ {U − h(a) : a ∈ L}. Before showing that the topology given by S is compact, we find it interesting to reformulate compactness.

Definition 13.4.2 A space is symmetrically compact if given any collection of closed sets {Cᵢ}ᵢ∈I and any collection of open sets {Oⱼ}ⱼ∈J such that ∩ᵢ∈I Cᵢ ⊆ ∪ⱼ∈J Oⱼ, then for some finite I′ ⊆ I, J′ ⊆ J, ∩ᵢ∈I′ Cᵢ ⊆ ∪ⱼ∈J′ Oⱼ.
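Before turning to compactness, the Priestley construction itself can be made concrete on the smallest non-Boolean example. A sketch of ours, for the three-element chain (the topology here is discrete, so every subset is clopen and hereditariness does all the work):

    # The chain 0 < e < 1 is bounded and distributive but not Boolean.
    L = [0, 'e', 1]
    P1 = frozenset({1})          # the proper prime filters are the
    P2 = frozenset({'e', 1})     # up-sets of the non-zero elements
    U = [P1, P2]                 # note P1 is included in P2

    h = {a: frozenset(P for P in U if a in P) for a in L}
    assert h[0] == frozenset()
    assert h['e'] == frozenset({P2})
    assert h[1] == frozenset({P1, P2})
    # The hereditary (upward closed under inclusion) subsets of U are exactly
    # {}, {P2}, {P1, P2}: the image of h, as Theorem 13.4.1 predicts.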
Theorem 13.4.3 Symmetric compactness is equivalent to compactness.

Proof First note that symmetric compactness includes both ∪-compactness and ∩-compactness as special cases. We examine the case of ∪-compactness, leaving the case of ∩-compactness to the reader. Suppose that ∪ⱼ∈J Oⱼ = U. Then no matter which collection of closed sets we choose, ∩ᵢ∈I Cᵢ ⊆ ∪ⱼ∈J Oⱼ. Choose the empty collection, setting I = ∅. By symmetric compactness, for some finite I′ ⊆ ∅, J′ ⊆ J, ∩ᵢ∈I′ Cᵢ ⊆ ∪ⱼ∈J′ Oⱼ. But then I′ = ∅, and since the intersection of an empty collection of sets gives U, we have U ⊆ ∪ⱼ∈J′ Oⱼ, and so we have our finite subcover. To see that compactness implies symmetric compactness, we observe that if ∩ᵢ∈I Cᵢ ⊆ ∪ⱼ∈J Oⱼ, then (U − ∩ᵢ∈I Cᵢ) ∪ ∪ⱼ∈J Oⱼ = U, i.e., by De Morgan, ∪ᵢ∈I(U − Cᵢ) ∪ ∪ⱼ∈J Oⱼ = U. Since U − Cᵢ is always an open set, this means that we have a cover using the sets U − Cᵢ and the sets Oⱼ. So by compactness there must be a finite subcover, and so for some finite I′ ⊆ I, J′ ⊆ J, ∪ᵢ∈I′(U − Cᵢ) ∪ ∪ⱼ∈J′ Oⱼ = U. And so ∩ᵢ∈I′ Cᵢ ⊆ ∪ⱼ∈J′ Oⱼ, as needed. □

The following generalize the prime filter separation principle (Lemma 8.6.2) in various ways. The reader may also want to consult the end of Chapter 3. Corollary 13.4.6 greatly simplifies the argument for compactness from the standard arguments using mere prime filter separation.⁷

⁷ The basic construction may be found in a number of places (cf. Dunn 1995b, note 3), but the most pertinent reference is to Urquhart (1979), who used it in his extension of Priestley's results to lattices in general.
Lemma 13.4.4 (Maximal filter-ideal pair principle) Let L be a lattice (not necessarily distributive) and let F be a filter, I an ideal, and assume that F ∩ I = ∅. Then there exists a maximal (disjoint) filter-ideal pair (P, Q) with P ⊇ F and Q ⊇ I.

Proof Let E = {(F′, I′) : F′ is a filter ⊇ F, I′ is an ideal ⊇ I, and F′ ∩ I′ = ∅}. Partially order E by componentwise inclusion: (F′, I′) ≤ (F″, I″) iff F′ ⊆ F″ and I′ ⊆ I″. It is easy to apply Zorn's lemma to obtain a maximal member (P, Q), noting that the union of a chain of filters is a filter, and similarly with the union of a chain of ideals. It is easy to see that this is a maximal filter-ideal pair simpliciter (and not just maximal in E). □
This pair is disjoint in the sense that P ∩ Q = ∅. When a pair is not disjoint, we shall speak of it as inconsistent. We next argue that the pair is exhaustive in the sense that P ∪ Q = L.

Corollary 13.4.5 When L is distributive, then P ∪ Q = L.

Proof Thus suppose we have an element a that cannot be thrown into either P or Q without causing it to be the case that the resultant pair is inconsistent. Thus both ([P, a), Q) and (P, (a, Q]) are inconsistent, where [P, a) is the filter generated from P ∪ {a} and (a, Q] is the ideal generated from Q ∪ {a}. Using properties of filter and ideal generation, we can see that there must be inequalities of the form p₁ ∧ ... ∧ pₘ ∧ a ≤ x ≤ q₁ ∨ ... ∨ qₙ and of the form p′₁ ∧ ... ∧ p′ₘ′ ≤ y ≤ a ∨ q′₁ ∨ ... ∨ q′ₙ′, where x and y are elements that, respectively, make the two pairs inconsistent, and all of the p terms
come from the filter P, and all of the q terms come from the ideal Q. We can make this uniform by throwing all of the p terms together into one big meet, and similarly throwing all of the q terms into one big join, obtaining two inequalities:

p ∧ a ≤ q,   p ≤ a ∨ q.
From these we obtain p ≤ p ∧ (a ∨ q) ≤ (p ∧ a) ∨ q ≤ q, and so q ∈ P. Thus the pair (P, Q) is not disjoint after all. □
Corollary 13.4.6 (Prime filter principle) Let L be a distributive lattice, let F be a filter, I an ideal, and assume that F ∩ I = ∅. Then there exists a prime filter P ⊇ F such that P ∩ I = ∅.
Proof We show that the P from the maximal filter-ideal pair (P, Q) is a prime filter. Suppose a ∨ b ∈ P, but both a, b ∉ P. Then by the previous corollary, both a, b ∈ Q, and since Q is an ideal, then a ∨ b ∈ Q. But then a ∨ b ∈ P ∩ Q, contrary to its construction using Lemma 13.4.4. □
Remark 13.4.7 Corollary 13.4.6 can be understood as saying that every disjoint filter-ideal pair can be extended to a prime filter-ideal pair, since Q can dually be argued to be a prime ideal (or we can recall that L − P = Q is a prime ideal).
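In a finite distributive lattice the prime filter principle needs no Zorn's lemma and can be witnessed by enumeration. A sketch of ours, using the (arbitrarily chosen) divisor lattice of 30:

    from itertools import combinations
    from math import gcd

    # Divisors of 30 under divisibility: a distributive lattice.
    L = [1, 2, 3, 5, 6, 10, 15, 30]
    def meet(a, b): return gcd(a, b)
    def join(a, b): return a * b // gcd(a, b)

    def is_filter(F):
        return (F and set(F) != set(L)
                and all(meet(a, b) in F for a in F for b in F)
                and all(b in F for a in F for b in L if b % a == 0))

    def is_prime(F):
        return all(a in F or b in F for a in L for b in L if join(a, b) in F)

    F, I = {6, 30}, {1, 5}   # the filter [6) and the ideal (5], disjoint
    witnesses = [set(c) for n in range(1, len(L) + 1) for c in combinations(L, n)
                 if is_filter(set(c)) and is_prime(set(c))
                 and F <= set(c) and not (set(c) & I)]
    print(witnesses)         # two witnesses: the up-sets of 2 and of 3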
Lemma 13.4.8 If ∩ₓ∈X h(x) ⊆ ∪ᵧ∈Y h(y), then for some finite X′ ⊆ X, Y′ ⊆ Y, ∩ₓ∈X′ h(x) ⊆ ∪ᵧ∈Y′ h(y).

Proof We proceed contrapositively, supposing that for every finite X′ ⊆ X, Y′ ⊆ Y, ∩ₓ∈X′ h(x) ⊈ ∪ᵧ∈Y′ h(y). We show that ∩ₓ∈X h(x) ⊈ ∪ᵧ∈Y h(y). We first observe that our supposition means that for every finite X′ ⊆ X, Y′ ⊆ Y, ∧X′ ≰ ∨Y′. For suppose to the contrary that ∧X′ ≤ ∨Y′. Then if P ∈ ∩ₓ∈X′ h(x), then ∀x ∈ X′, P ∈ h(x), i.e., x ∈ P. Since P is a filter, first ∧X′ ∈ P, and then ∨Y′ ∈ P. Since P is prime, ∃y ∈ Y′ such that y ∈ P, i.e., P ∈ h(y), and so P ∈ ∪ᵧ∈Y′ h(y). Thus ∩ₓ∈X′ h(x) ⊆ ∪ᵧ∈Y′ h(y), contrary to our hypothesis. The prime filter principle gives us a prime filter P such that X ⊆ P and Y ⊆ L − P; Q = L − P is a prime ideal. Since X ⊆ P, this means ∀x ∈ X, P ∈ h(x), and so P ∈ ∩ₓ∈X h(x). But P ∉ ∪ᵧ∈Y h(y), for otherwise ∃y ∈ Y such that P ∈ h(y), i.e., y ∈ P, contradicting Y ⊆ L − P. Thus we have shown ∩ₓ∈X h(x) ⊈ ∪ᵧ∈Y h(y), as desired. □
Theorem 13.4.9 The topology generated from S as a subbase is compact.

Proof By the Alexander subbase theorem, we know that it suffices to consider just the subbasic sets, i.e., sets of the form h(y) or U − h(x). Suppose then that ∪ₓ∈X(U − h(x)) ∪ ∪ᵧ∈Y h(y) = U. Then ∩ₓ∈X h(x) = U − ∪ₓ∈X(U − h(x)) ⊆ ∪ᵧ∈Y h(y). By the lemma there exist finite X′ ⊆ X, Y′ ⊆ Y such that ∩ₓ∈X′ h(x) ⊆ ∪ᵧ∈Y′ h(y). But then (U − ∩ₓ∈X′ h(x)) ∪ ∪ᵧ∈Y′ h(y) = U, i.e., ∪ₓ∈X′(U − h(x)) ∪ ∪ᵧ∈Y′ h(y) = U. □

Lemma 13.4.10 Sets of the form h(x) are closed under finite unions and intersections. Similarly with sets of the form U − h(x).

Proof h(x) ∩ h(y) = h(x ∧ y), h(x) ∪ h(y) = h(x ∨ y). (U − h(x)) ∩ (U − h(y)) = U − (h(x) ∪ h(y)) = U − h(x ∨ y), (U − h(x)) ∪ (U − h(y)) = U − (h(x) ∩ h(y)) = U − h(x ∧ y). □

Lemma 13.4.11 The clopen sets are precisely those sets which are finite unions of sets of the form h(x) ∩ (U − h(y)).

Proof Let O be a clopen set. Then it is a union of finite intersections of sets of the form h(x) or U − h(y). Since O is a closed subset of a Hausdorff space, we can take this to be a finite union. Collect together the sets of the two separate forms, obtaining h(x₁) ∩ ... ∩ h(xₖ) ∩ (U − h(y₁)) ∩ ... ∩ (U − h(yₗ)). By Lemma 13.4.10 this is of the form h(x₁ ∧ ... ∧ xₖ) ∩ (U − h(y₁ ∨ ... ∨ yₗ)). □

Lemma 13.4.12 Every set of the form h(x) is hereditary, and every set of the form U − h(y) is dual hereditary.

Proof It is obvious that h(x) is hereditary, for suppose that P₁ ⊆ P₂ and P₁ ∈ h(x), i.e., x ∈ P₁. Then x ∈ P₂, i.e., P₂ ∈ h(x). It is equally obvious that U − h(y) is dual hereditary. Thus suppose that P₁ ⊆ P₂ and P₂ ∉ h(y), i.e., y ∉ P₂; then y ∉ P₁, i.e., P₁ ∉ h(y). □

Theorem 13.4.13 Every hereditary clopen set O is of the form h(x).

Proof First note that if O is empty, then O = h(0), where 0 is the least element of the lattice. So pick a particular P ∈ O. Consider U − O. For all prime filters Q ∈ U − O, P ⊈ Q since O is hereditary. Hence for some a ∈ L, a ∈ P and a ∉ Q. This means that P ∈ h(a) and Q ∈ U − h(a). So we have shown that each Q ∈ U − O is a member of U − h(a) for some a ∈ P, a ∉ Q. So then U − O is covered by a collection of open sets of this form. U − O is closed and hence by Corollary 13.2.2 it is covered by a finite subunion: U − O ⊆ ∪ᵢ<ₙ(U − h(aᵢ)), and so, where a = a₁ ∧ ... ∧ aₙ ∈ P, we have P ∈ h(a) = h(a₁) ∩ ... ∩ h(aₙ) ⊆ O. Such an h(a) ⊆ O containing P can be found for each P ∈ O, and so O is the union of these sets h(a). But O is a closed subset of a compact space and hence itself compact, so finitely many suffice, and O = h(a¹) ∪ ... ∪ h(aᵏ) = h(a¹ ∨ ... ∨ aᵏ). □
13.5 Extensions of Stone's and Priestley's Results
This is a largely bibliographic report, and we do not attempt to define all of the terminology used but hope, nonetheless, to convey something of the flavor of the results we report. Johnstone (1982) is a high-level presentation of results relating to Stone's representation in the context of category theory. Priestley's representation for distributive
lattices has been extended in several directions. Urquhart (1978) extended it to Ockham lattices, i.e., bounded distributive lattices with a dual homomorphic operator (a negation operator satisfying the De Morgan laws and switching the bounds), generalizing De Morgan lattices. A second extension was by Martinez (1988) for Wajsberg algebras, which coincide with the bounded commutative BCK-algebras (the algebraic models of BCK logic). Dosen (1989a) gave a duality result for modal algebras. Goldblatt (1989) showed how the Priestley representation of distributive lattices could be extended to distributive lattices with operators, with a duality result similar to Priestley's for distributive lattices. Hansoul (1983) had previously shown how the original Jónsson-Tarski representation of Boolean algebras with operators can be extended to include a duality result of the original type of Stone (using spectral spaces). It is worth noting that Goldblatt extended the Jónsson-Tarski representation, not just in supplying duality, but also in replacing Boolean algebras with distributive lattices, as well as allowing operators that distribute over meet in each of their places, as well as those of the Jónsson-Tarski type that distribute over join in each of their places. But he did not, as with distributoids, allow operators that mix and match, distributing over meet in some places and join in others. We should also mention Sambin and Vaccaro's (1988) work on duality for modal algebras.

Goldblatt (1975) published a representation theorem for ortholattices, largely driven by logical motivation,⁸ which was linked in Goldblatt (1974) to a semantic analysis of orthologic. An ortholattice is a bounded lattice with an orthocomplement operation, where the latter is an antitonic map ¬ with the involution property (¬¬a = a, for any element of the lattice) and satisfying a ∧ ¬a = 0 and a ∨ ¬a = 1. The representation relies on the fact that orthonegation satisfies the De Morgan laws, so that joins are definable by means of meets. Thereby, in Goldblatt's theorem, what is needed is a representation of meets, which is canonically done by intersections, and a representation of the orthocomplement operation. Since orthocomplementation cannot be interpreted as set-theoretic complementation (else joins would be interpreted as unions and the lattice would be distributive), Goldblatt's idea was to interpret the negation operator by means of an irreflexive, symmetric relation ⊥ ⊆ U² of orthogonality on the points of the space (the filters of the lattice), defined by α ⊥ β iff ∃a ∈ α(¬a ∈ β). Where α ⊥ X means that for all β ∈ X, α ⊥ β, the relation ⊥ induces an operation on subsets of the space, defined by ¬X = {α : α ⊥ X}. With the usual Priestley topology imposed on the dual space, Goldblatt concludes by showing that the original ortholattice is isomorphic to the lattice of clopen, ⊥-regular subsets of the space, where a subset U is ⊥-regular iff ¬¬U = U. Dunn (1996) shows how Goldblatt-style representations can be extended to various other logics by playing with the conditions on ⊥. This is based on a representation of Galois connections from Dunn (1993a). We give some examples. Dropping just symmetry gives two "complements": ¬X = ^⊥X = {α : α ⊥ X}, and ∼X = X^⊥ = {α : X ⊥ α}, and these behave as a Galois connection. Dropping just irreflexiveness gives the De Morgan complement of relevance logic.

⁸ Goldblatt's interest in this paper is only with representation, and not full duality, which he does not investigate.
Dunn (1993b) shows the relationship of the
Goldblatt-style semantics using ⊥ to the Routley-Meyer-style semantics using their *-operator.

Urquhart (1979) proposed a topological representation of general lattices. The dual space, in Urquhart's representation, is a doubly ordered space (X, ≤₁, ≤₂), consisting of maximal, disjoint filter-ideal pairs, and where ≤ᵢ is coordinatewise inclusion of pairs. To represent joins, Urquhart defined a pair of maps r and ℓ on sets of disjoint, maximal filter-ideal pairs increasing in the 1-order and 2-order, respectively, by rU = {x : ∀y(x ≤₂ y ⇒ y ∉ U)} and ℓU = {x : ∀y(x ≤₁ y ⇒ y ∉ U)}. With representation defined by a ↦ uₐ = A = {x ∈ X : a ∈ x₁}, where a is an element of the lattice and x₁ is the filter part of the filter-ideal pair x, Urquhart topologized X via the subbase

S = {−uₐ : a ∈ L} ∪ {−r u_b : b ∈ L}

and characterized the image of the representation function u as the collection of all doubly closed (both A and rA closed in the topology) stable sets (stability meaning that ℓrA = A). A doubly ordered space (X, ≤₁, ≤₂), where the ≤ᵢ are pre-orders on X, is called doubly disconnected if for any points x and y, if x ≰₁ y, then ∃U(U is doubly closed, x ∈ U, y ∉ U), and if x ≰₂ y, then ∃U(U is doubly closed, x ∈ rU, y ∉ rU). An L-space is then defined as a doubly ordered set (X, ≤₁, ≤₂) with a topology such that:

(1) X is a doubly disconnected compact space;
(2) if U, V are doubly closed, then r(U ∩ V) and ℓ(rU ∩ rV) are closed; and
(3) the family {−U : U is doubly closed, stable} ∪ {−rV : V doubly closed, stable} forms a subbase for the topology on X.
Urquhart's representation theorem, however, is not functorial (homomorphisms are not preserved by the representation) and cannot be extended to a full duality, as was observed by Allwein (1992), who enlarged the canonical space to include all disjoint filter-ideal pairs (not just the maximal ones) and proved full (functorial) duality in his dissertation. In an as yet unpublished manuscript, Allwein and Hartonas extended the duality to one for the lattice of congruences (strengthening the relevant result in Urquhart) and the lattice of sublattices. They also showed how to represent lattices by sheaf spaces. Allwein and Dunn (1993) and Allwein (1992) used the Urquhart representation of lattices as the basis for adding additional (binary) gaggle-type operators to a lattice that is not necessarily distributive, thereby providing (among other things) a "Kripke-style" semantics for linear logic. Hartonas (1993a, 1993b) does explicitly consider operations of arbitrary degree, but employs a completely different style of representation. Yet another representation is considered in Bimbó (1999, 2000). Instead of maximally consistent pairs, minimally inconsistent ones are used, and implication and fusion are added. While the frame is defined more in the spirit of the Urquhart representation using a doubly ordered set, the whole representation (and especially the canonical model) is more closely related to the Hartonas and Dunn representation mentioned below. (Duality has not been investigated for this representation.)
A different kind of representation of lattices is to be found in Hartonas and Dunn (1993), together with full duality. The idea is based on an observation of Birkhoff (1948). Birkhoff defines a polarity as in effect a triple (U, U′, ⊥), with ⊥ ⊆ U × U′. For X ⊆ U we define X^⊥ = {a′ ∈ U′ : a ⊥ a′ for all a ∈ X}, and for X′ ⊆ U′ we define ^⊥X′ = {a ∈ U : a ⊥ a′ for all a′ ∈ X′}. Birkhoff observes that this establishes a Galois connection between the power set of U and the power set of U′. For X ⊆ U we call X Galois closed if X = ^⊥(X^⊥), and dually for X′ ⊆ U′.⁹ Birkhoff further observes then that the Galois closed subsets of U (there is a dual fact about U′) form a lattice, with X ∧ Y = X ∩ Y and X ∨ Y = ^⊥(X^⊥ ∩ Y^⊥). Hartonas and Dunn show, generalizing the representation of ortholattices to be found in Dunn (1993a), that every lattice can be embedded in such a lattice of Galois closed subsets generated by a polarity.

⁹ Urquhart's notion of "stability" is a special case of Galois closure since the pair (ℓ, r) constitutes a Galois connection.
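Birkhoff's observation is easy to exhibit concretely. A sketch of ours, with an arbitrarily chosen small polarity, computing the Galois closed subsets of U and the induced meet and join:

    from itertools import combinations

    # A small polarity (U, U', perp) and its lattice of Galois closed sets.
    U, Uprime = [1, 2, 3], ['a', 'b']
    perp = {(1, 'a'), (2, 'a'), (2, 'b'), (3, 'b')}

    def right(X):   # X-perp, a subset of U'
        return frozenset(y for y in Uprime if all((x, y) in perp for x in X))

    def left(Y):    # perp-Y, a subset of U
        return frozenset(x for x in U if all((x, y) in perp for y in Y))

    closed = {left(right(frozenset(c)))
              for n in range(len(U) + 1) for c in combinations(U, n)}
    print(sorted(map(sorted, closed)))   # [[1, 2], [1, 2, 3], [2], [2, 3]]

    X, Y = frozenset({1, 2}), frozenset({2, 3})        # two closed sets
    assert X & Y in closed                             # meet is intersection
    assert left(right(X) & right(Y)) == frozenset(U)   # join via the polarity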
REFERENCES

Abbott, J. C. (1967). Semi-boolean algebra. Matematički Vesnik, 4:177-198.
Abbott, J. C. (1969). Sets, Lattices, and Boolean Algebras. Allyn and Bacon, Boston.
Abbott, J. C. (1976). Orthoimplication algebras. Studia Logica, 35:173-177.
Ajdukiewicz, K. (1935). Die syntaktische Konnexität. Studia Logica, 1:1-27. Translated into English in McCall (1967).
Allwein, G. (1992). Duality of Algebraic and Kripke Models for Linear Logic. PhD thesis, Indiana University, Bloomington.
Allwein, G. and Dunn, J. M. (1993). Kripke models for linear logic. Journal of Symbolic Logic, 58:514-545.
Anderson, A. R. and Belnap, Jr., N. D. (1975). Entailment. The Logic of Relevance and Necessity, Volume 1. Princeton University Press, Princeton, NJ.
Anderson, A. R., Belnap, Jr., N. D. and Dunn, J. M. (1992). Entailment. The Logic of Relevance and Necessity, Volume 2. Princeton University Press, Princeton, NJ.
Angell, R. B. (1962). A propositional logic with subjunctive conditionals. Journal of Symbolic Logic, 27:327-343.
Balbes, R. and Dwinger, P. (1974). Distributive Lattices. University of Missouri Press, Columbia, MO.
Belnap, Jr., N. D. (1967). Intensional models for first degree formulas. Journal of Symbolic Logic, 32:1-22.
Belnap, Jr., N. D. (1982). Display logic. Journal of Philosophical Logic, 11:375-417.
Białynicki-Birula, A. and Rasiowa, H. (1957). On the representation of quasi-boolean algebras. Bulletin de l'Académie Polonaise des Sciences, 5:259-261.
Bimbó, K. (1999). Substructural Logics, Combinatory Logic and λ-calculus. PhD thesis, Indiana University, Bloomington.
Bimbó, K. (2000). Semantics for the structurally free logic LC+. Manuscript, 14 pp.
Bimbó, K. and Dunn, J. M. (1998). Two extensions of the structurally free logic LC. Logic Journal of IGPL, 6:403-424.
Birkhoff, G. (1935). On the structure of abstract algebras. Proceedings of the Cambridge Philosophical Society, 31:433-454.
Birkhoff, G. (1940). Lattice Theory. American Mathematical Society, Providence, RI.
Birkhoff, G. (1944). Subdirect unions in universal algebras. Bulletin of the American Mathematical Society, 50:764-768.
Birkhoff, G. (1948). Lattice Theory (2nd edition). American Mathematical Society, Providence, RI.
Birkhoff, G. (1967). Lattice Theory (3rd edition). American Mathematical Society, Providence, RI.
Birkhoff, G. and Frink, O. (1948). Representations of lattices by sets. Transactions of the American Mathematical Society, 64:299-316.
Birkhoff, G. and von Neumann, J. (1936). The logic of quantum mechanics. Annals of Mathematics, 37:823-843.
Blok, W. J. and Pigozzi, D. (1986). Protoalgebraic logics. Studia Logica, 45:337-369.
Blok, W. J. and Pigozzi, D. (1989). Algebraizable Logics. Memoirs of the American Mathematical Society 396. American Mathematical Society, Providence, RI.
Blok, W. J. and Pigozzi, D. (1992). Algebraic semantics for universal Horn logic without equality. In Romanowska, A. and Smith, J. D. H. (eds), Universal Algebra and Quasigroup Theory, pp. 1-56. Heldermann Verlag, Berlin.
Bloom, S. L. (1976). Varieties of ordered algebras. Journal of Computer and System Science, 13:200-212.
Blyth, T. S. and Janowitz, M. F. (1972). Residuation Theory. Pergamon Press, New York.
Boole, G. (1847). Mathematical Analysis of Logic. Being an Essay towards a Calculus of Deductive Reasoning. Macmillan, Barclay & Macmillan, London.
Carnap, R. (1943). Formalization of Logic. Studies in Semantics. Harvard University Press, Cambridge, MA.
Carnap, R. (1947). Meaning and Necessity: A Study in Semantics and Modal Logic. University of Chicago Press, Chicago.
Certaine, J. (1943). Lattice-Ordered Groupoids and Some Related Problems. PhD thesis, Harvard University.
Chellas, B. F. (1980). Modal Logic: An Introduction. Cambridge University Press, Cambridge.
Church, A. (1956). Introduction to Mathematical Logic, Volume 1. Princeton University Press, Princeton, NJ.
Copi, I. M. (1979). Symbolic Logic (5th edition). Macmillan, New York.
Curry, H. B. (1963). Foundations of Mathematical Logic. McGraw-Hill, New York.
Czelakowski, J. (1980a). Model-Theoretic Methods in Methodology of Propositional Calculi. Institute of Philosophy and Sociology of the Polish Academy of Sciences, Warsaw.
Czelakowski, J. (1980b). Reduced products of logical matrices. Studia Logica, 39:19-43.
Czelakowski, J. (1981). Equivalential logics I, II. Studia Logica, 40:227-236, 335-372.
Czelakowski, J. (1983). Some theorems on structural entailment. Studia Logica, 42:417-429.
Davey, B. A. and Priestley, H. A. (1990). Introduction to Lattices and Order. Cambridge University Press, Cambridge.
Dedekind, R. (1872). Continuity and irrational numbers. In Essays on the Theory of Numbers. Open Court, Chicago. W. W. Beman translation of Dedekind's "Stetigkeit und irrationale Zahlen," first published in 1901.
Dipert, R. R. (1978). Development and Crisis in Late Boolean Logic: The Deductive Logics of Peirce, Jevons and Schröder. PhD thesis, Department of Philosophy, Indiana University, Bloomington.
Došen, K. (1988). Sequent systems and groupoid models I. Studia Logica, 47:353-385.
Došen, K. (1989a). Duality between modal algebras and neighborhood frames. Studia Logica, 48:219-234.
Došen, K. (1989b). Sequent systems and groupoid models II. Studia Logica, 48:41-65.
Dummett, M. (1959). A propositional calculus with denumerable matrix. Journal of Symbolic Logic, 24:97-106.
Dunn, J. M. (1966). The Algebra of Intensional Logics. PhD thesis, University of Pittsburgh.
Dunn, J. M. (1970). Algebraic completeness results for R-mingle and its extensions. Journal of Symbolic Logic, 35:1-13.
Dunn, J. M. (1976). Intuitive semantics for first-degree entailment and 'coupled trees'. Philosophical Studies, 29:149-168.
Dunn, J. M. (1986). Relevance logic and entailment. In Gabbay, D. and Guenthner, F. (eds), Handbook of Philosophical Logic, Volume 3, pp. 117-224. D. Reidel, Dordrecht.
Dunn, J. M. (1991). Gaggle theory: An abstraction of Galois connections and residuation with applications to negation and various logical operations. In van Eijck, J. (ed.), Logics in AI. Proceedings of the European Workshop JELIA 1990, Lecture Notes in Computer Science 478, pp. 31-51. Springer-Verlag, Berlin.
Dunn, J. M. (1993a). Partial-gaggles applied to logics with restricted structural rules. In Došen, K. and Schroeder-Heister, P. (eds), Substructural Logics, pp. 63-108. Clarendon Press, Oxford.
Dunn, J. M. (1993b). Perp and star: two treatments of negation. Philosophical Perspectives (Philosophy of Language and Logic), 7:331-357.
Dunn, J. M. (1995a). Gaggle theory applied to intuitionistic, modal and relevance logic. In Max, I. and Stelzner, W. (eds), Logik und Mathematik. Frege-Kolloquium Jena 1993, pp. 335-368. W. de Gruyter, Berlin.
Dunn, J. M. (1996). Generalized ortho-negation. In Wansing, H. (ed.), Negation: A Notion in Focus, pp. 3-26. W. de Gruyter, Berlin.
Dunn, J. M. (1999). A comparative study of various semantical treatments of negation: A history of formal negation. In Gabbay, D. and Wansing, H. (eds), What Is Negation?, pp. 23-61. Kluwer Academic Publishers, Dordrecht.
Dunn, J. M. and Meyer, R. K. (1971). Algebraic completeness results for Dummett's LC and its extensions. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 17:225-230.
Dunn, J. M. and Meyer, R. K. (1997). Combinatory logic and structurally free logic. Journal of the Interest Group in Pure and Applied Logic, 5:505-537.
Esakia, L. and Meskhi, V. (1977). Five critical modal systems. Theoria, 43:52-60.
Everett, C. J. (1944). Closure operators and Galois theory in lattices. Transactions of the American Mathematical Society, 55:514-525.
Fitch, F. B. (1952). Symbolic Logic. Ronald Press, New York.
Font, J. M. and Jansana, R. (1996). A General Algebraic Semantics for Sentential Logics, Lecture Notes in Logic 7. Springer-Verlag, Berlin.
Frege, G. (1879). Begriffsschrift. In van Heijenoort, J. (ed.), From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931, pp. 5-82. Harvard University Press, Cambridge, MA. Trans. S. Bauer-Mengelberg.
Frege, G. (1892). Über Sinn und Bedeutung. In Feigl, H. and Sellars, W. (eds), Readings in Philosophical Analysis, pp. 85-102. Appleton Century Crofts, New York. Trans. Herbert Feigl.
Fuchs, L. (1963). Partially Ordered Algebraic Systems. Pergamon Press, New York.
Gabbay, D. (1976). Investigations in Modal and Tense Logics with Applications to Problems in Philosophy and Linguistics. D. Reidel, Dordrecht.
Gentzen, G. (1934-35). Untersuchungen über das logische Schliessen. Mathematische Zeitschrift, 39:176-210, 405-431. English translation ("Investigations into logical deduction") by M. E. Szabo in Szabo, M. E. (ed.), The Collected Papers of Gerhard Gentzen, pp. 132-213. North-Holland, Amsterdam, 1969.
Gericke, H. (1963). Theorie der Verbände. Bibliographisches Institut, Mannheim.
Gierz, G., Hofmann, K. H., Keimel, K., Lawson, J. D., Mislove, M. and Scott, D. S. (1980). A Compendium of Continuous Lattices. Springer-Verlag, Berlin.
Gödel, K. (1933). Zum intuitionistischen Aussagenkalkül. Ergebnisse eines mathematischen Kolloquiums, 4:40.
Goldblatt, R. I. (1974). Semantic analysis of orthologic. Journal of Philosophical Logic, 3:19-35.
Goldblatt, R. I. (1975). The Stone space of an ortholattice. Bulletin of the London Mathematical Society, 7:45-48.
Goldblatt, R. I. (1989). Varieties of complex algebras. Annals of Pure and Applied Logic, 44:173-242.
Grätzer, G. (1979). Universal Algebra (2nd edition). Springer-Verlag, New York.
Halldén, S. (1951). On the semantic non-completeness of certain Lewis calculi. Journal of Symbolic Logic, 16:127-129.
Halmos, P. R. (1962). Algebraic Logic. Chelsea, New York.
Halmos, P. R. (1963). Lectures on Boolean Algebras. Van Nostrand Mathematical Studies #1. D. Van Nostrand, Princeton, NJ.
Hansoul, G. (1983). A duality for boolean algebras with operators. Algebra Universalis, 17:34-49.
Hardegree, G. M. (1975). Stalnaker conditionals and quantum logic. Journal of Philosophical Logic, 4:399-421.
Harrop, R. (1958). On the existence of finite models and decision procedures for propositional calculi. Proceedings of the Cambridge Philosophical Society, 54:1-13.
Harrop, R. (1965). Some structure results for propositional calculi. Journal of Symbolic Logic, 30:271-292.
Hartonas, C. (1993a). Algebraic and Kripke Semantics for Substructural Logics. PhD thesis, Indiana University, Bloomington.
Hartonas, C. (1993b). Lattices with additional operators. Technical Report IULG-93-27, Indiana University Logic Group.
Hartonas, C. and Dunn, J. M. (1993). Duality theorems for partial orders, semilattices, Galois connections and lattices. Technical Report IULG-93-26, Indiana University Logic Group.
Henkin, L. A. (1954). Boolean representation through propositional calculus. Fundamenta Mathematicae, 41:89-96.
Henkin, L. A., Monk, J. D. and Tarski, A. (1971). Cylindric Algebras, Part 1. Studies in Logic and the Foundations of Mathematics 64. North-Holland, Amsterdam and London.
Hermes, H. (1938). Semiotik. Eine Theorie der Zeichengestalten als Grundlage für Untersuchungen von formalisierten Sprachen. Forschungen zur Logik und zur Grundlegung der exakten Wissenschaften, n.s. 5:22.
Heyting, A. (1930). Die formalen Regeln der intuitionistischen Logik. Sitzungsberichte der preussischen Akademie der Wissenschaften, pp. 42-56, 57-71, 158-169.
Hochster, M. (1969). Prime ideal structure in commutative rings. Transactions of the American Mathematical Society, 142:43-60.
Hughes, G. E. and Cresswell, M. (1968). An Introduction to Modal Logic. Methuen, London.
Hughes, G. E. and Cresswell, M. (1996). A New Introduction to Modal Logic. Routledge, London and New York.
Huntington, E. (1904). Sets of independent postulates for the algebra of logic. Transactions of the American Mathematical Society, 5:288-309.
Jaśkowski, S. (1934). On the rules of suppositions in formal logic. Studia Logica, 1:5-32.
Jevons, W. S. (1864). Pure Logic or the Logic of Quality Apart from Quantity. E. Stanford, London.
Johansson, I. (1936). Der Minimalkalkül, ein reduzierter intuitionistischer Formalismus. Compositio Mathematica, 4:119-136.
Johnstone, P. T. (1982). Stone Spaces. Cambridge University Press, Cambridge.
Jónsson, B. and Tarski, A. (1951). Boolean algebras with operators. American Journal of Mathematics, 73:891-939.
Kalish, D. and Montague, R. (1964). Logic: Techniques of Formal Reasoning (2nd edition). Harcourt Brace Jovanovich, New York.
Kelley, J. L. (1955). General Topology. Van Nostrand, New York.
Kiss, S. (1961). An Introduction to Algebraic Logic. Privately printed, Westport, CT.
Kneale, W. C. (1956). The province of logic. In Lewis, H. (ed.), Contemporary British Philosophers, Third Series, pp. 237-261. George Allen & Unwin, London.
Kripke, S. A. (1959). A completeness theorem in modal logic. Journal of Symbolic Logic, 24:1-15.
Kripke, S. A. (1963a). Semantic considerations on modal logic. Acta Philosophica Fennica, 16:83-94.
Kripke, S. A. (1963b). Semantical analysis of modal logic - I. Zeitschrift für mathematische Logik und Grundlagen der Mathematik, 9:67-96.
Kripke, S. A. (1965). Semantical analysis of modal logic - II. In Addison, J., Henkin, L. and Tarski, A. (eds), The Theory of Models, pp. 206-220. North-Holland, Amsterdam.
Kuratowski, C. (1958). Topologie I (2nd edition). PWN (Polish Scientific Publishers), Warsaw.
Lambek, J. (1958). The mathematics of sentence structure. American Mathematical Monthly, 65:154-170.
Lambek, J. (1981). The influence of Heraclitus on modern mathematics. In Agassi, J. and Cohen, R. S. (eds), Scientific Philosophy Today, pp. 111-121. D. Reidel, Dordrecht.
Lemmon, E. J. (1965). Beginning Logic. Nelson, London.
Lemmon, E. J. (1966). Algebraic semantics for modal logics, I and II. Journal of Symbolic Logic, 31:46-65, 191-218.
Lemmon, E. J. (1977). An Introduction to Modal Logic, American Philosophical Quarterly Monograph Series, No. 11. Basil Blackwell, Oxford. In collaboration with D. Scott, edited by K. Segerberg.
Lewis, C. I. (1918). A Survey of Symbolic Logic. University of California Press, Berkeley and Los Angeles.
Lewis, C. I. and Langford, C. H. (1932). Symbolic Logic. Century, New York.
Lewis, D. K. (1973). Counterfactuals. Library of Philosophy and Logic. Basil Blackwell, Oxford.
Lincoln, P., Mitchell, J., Scedrov, A. and Shankar, N. (1992). Decision problems for propositional linear logic. Annals of Pure and Applied Logic, 56:239-311.
Łoś, J. (1944). On Logical Matrices, Travaux de la Société des Sciences et des Lettres de Wrocław, Series B vol. 19. Wrocław. Translated into English in Ulrich (1968).
Łoś, J. (1957). Remarks on Henkin's papers: Boolean representation through propositional calculus. Fundamenta Mathematicae, 44:82-93.
Łoś, J. and Suszko, R. (1957). Remarks on sentential logics. Indagationes Mathematicae, 20:177-183.
Łukasiewicz, J. (1910). O zasadzie sprzeczności u Arystotelesa. Akademia Umiejętności, Kraków.
Łukasiewicz, J. (1913). Die logischen Grundlagen der Wahrscheinlichkeitsrechnung. Akademie der Wissenschaften Viktor Oslawski'scher Fonds, Kraków.
Łukasiewicz, J. (1920). On three valued logic. Ruch Filozoficzny, 5:169-170. In Polish, translated into English in McCall (1967).
Łukasiewicz, J. (1930). Philosophical remarks on many-valued systems of propositional logic. Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, 23:51-77. In German, translated into English in McCall (1967).
Łukasiewicz, J. and Tarski, A. (1930). Untersuchungen über den Aussagenkalkül. Comptes Rendus des Séances de la Société des Sciences et des Lettres de Varsovie, 23:30-50. In German, translated into English in Tarski (1956).
MacLane, S. (1971). Categories for the Working Mathematician. Springer-Verlag, New York.
Maksimova, L. (1972). Predtablichnye superintuitionistskie logiki. Algebra i Logika, 11:558-570. (See Algebra and Logic, 1974, pp. 308-314, for English translation: Pretabular superintuitionist logic.)
Maksimova, L. (1975). Predtablichnye rasshirenia logiki S4 Lewisa. Algebra i Logika, 14:28-55. (See Algebra and Logic, 1975, pp. 16-33 for English translation: Pretabular extensions of Lewis S4.)
Mal'cev, A. I. (1973). Algebraic Systems. Springer-Verlag, Berlin.
Malinowski, G. (1993). Many-Valued Logics, Oxford Logic Guides 75. Clarendon Press, Oxford.
Martin, E. P. and Meyer, R. K. (1982). Solution to the P-W problem. Journal of Symbolic Logic, 47:869-886.
Martinez, N. G. (1988). The Priestley duality for Wajsberg algebras. Studia Logica, 49:31-46.
McCall, S. (1966). Connexive implication. Journal of Symbolic Logic, 31:415-433.
McCall, S. (1967). Polish Logic: 1920-1939. Clarendon Press, Oxford.
McKinsey, J. C. C. (1941). A solution of the decision problem for the Lewis systems S2 and S4, with an application to topology. Journal of Symbolic Logic, 6:117-134.
McKinsey, J. C. C. and Tarski, A. (1944). The algebra of topology. Annals of Mathematics, 45:141-191.
McKinsey, J. C. C. and Tarski, A. (1948). Some theorems about the sentential calculi of Lewis and Heyting. Journal of Symbolic Logic, 13:1-15.
Meyer, R. K. and McRobbie, M. A. (1982). Multisets and relevant implication. Australasian Journal of Philosophy, 60:265-281.
Meyer, R. K. and Routley, R. (1972). Algebraic analysis of entailment (I). Logique et Analyse, n.s. 15:407-428.
Ono, H. (1993). Semantics for substructural logics. In Došen, K. and Schroeder-Heister, P. (eds), Substructural Logics, pp. 259-291. Clarendon Press, Oxford.
Ono, H. and Komori, Y. (1985). Logics without the contraction rule. Journal of Symbolic Logic, 50:169-201.
Ore, O. (1944). Galois connexions. Transactions of the American Mathematical Society, 55:493-513.
Popper, K. R. (1947). Logic without assumptions. Proceedings of the Aristotelian Society, 47:251-292.
Priestley, H. A. (1970). Representation of distributive lattices by means of ordered Stone spaces. Bulletin of the London Mathematical Society, 2:186-190.
Priestley, H. A. (1972). Ordered topological spaces and the representation of distributive lattices. Proceedings of the London Mathematical Society, 2:507-530.
Quine, W. V. O. (1961). From A Logical Point of View. Harvard University Press, Cambridge, MA.
Quine, W. V. O. (1986). Philosophy of Logic (2nd edition). Harvard University Press, Cambridge and London.
Rasiowa, H. (1974). An Algebraic Approach to Non-Classical Logics. North-Holland, Amsterdam.
Rasiowa, H. and Sikorski, R. (1963). The Mathematics of Metamathematics (2nd edition). Państwowe Wydawnictwo Naukowe, Warsaw.
Routley, R. and Meyer, R. K. (1972). The semantics of entailment - II, III. Journal of Philosophical Logic, 1:53-73, 192-208.
Routley, R. and Meyer, R. K. (1973). The semantics of entailment - I. In Leblanc, H. (ed.), Truth, Syntax and Modality. Proceedings of the Temple University Conference on Alternative Semantics, pp. 199-243. North-Holland, Amsterdam.
Routley, R. and Routley, V. (1972). Semantics of first degree entailment. Noûs, 6:335-359.
Rutherford, D. (1965). Introduction to Lattice Theory. Hafner, New York.
Sambin, G. and Vaccaro, V. (1988). Topology and duality in modal logic. Annals of Pure and Applied Logic, 37:249-296.
Schröder, E. (1890-1905). Vorlesungen über die Algebra der Logik, 3 vols. Teubner, Leipzig.
Scott, D. S. (1973). Background to formalization. In Leblanc, H. (ed.), Truth, Syntax and Modality. Proceedings of the Temple University Conference on Alternative Semantics, pp. 244-273. North-Holland, Amsterdam.
Scott, D. S. (1974). Completeness and axiomatizability in many-valued logic. In L. Henkin et al. (eds), Proceedings of the Tarski Symposium, Proceedings of Symposia in Pure Mathematics, vol. 25, pp. 411-435. American Mathematical Society.
Scroggs, S. J. (1951). Extensions of the Lewis system S5. Journal of Symbolic Logic, 16:112-120.
Segerberg, K. (1971). An Essay in Classical Modal Logic, 3 vols. University of Uppsala Press, Uppsala.
Shoesmith, D. and Smiley, T. J. (1971). Deducibility and many-valuedness. Journal of Symbolic Logic, 36(4):610-622.
Shoesmith, D. and Smiley, T. J. (1978). Multiple-Conclusion Logic. Cambridge University Press, Cambridge.
Shukla, A. (1970). Decision procedures for Lewis system S1 and related modal systems. Notre Dame Journal of Formal Logic, 11(2):141-180.
Sikorski, R. (1964). Boolean Algebras. Springer-Verlag, Berlin.
Simmons, G. F. (1963). Introduction to Topology and Modern Analysis. McGraw-Hill, New York.
Smiley, T. J. (1959). Entailment and deducibility. Proceedings of the Aristotelian Society, n.s. 59:233-254.
Stalnaker, R. C. (1968). A theory of conditionals. In Rescher, N. (ed.), Studies in Logical Theory. Blackwell, Oxford.
Stalnaker, R. C. and Thomason, R. H. (1970). A semantic analysis of conditional logic. Theoria, 36:23-42.
Stone, M. H. (1936). The theory of representations for boolean algebras. Transactions of the American Mathematical Society, 40:37-111.
Stone, M. H. (1937). Topological representations of distributive lattices and Brouwerian logics. Časopis pro Pěstování Matematiky a Fysiky, 67:1-25.
Strawson, P. F. (1952). Introduction to Logical Theory. Methuen, London.
Suppes, P. (1957). Introduction to Logic. Van Nostrand, Princeton, NJ.
Szász, G. (1963). Introduction to Lattice Theory. Academic Press, New York. Trans. B. Balkay and G. Tóth.
Tarski, A. (1956). Logic, Semantics, Metamathematics. Clarendon Press, Oxford. Translated by J. H. Woodger.
Thomas, I. (1962). Finite limitations on Dummett's LC. Notre Dame Journal of Formal Logic, 3:170-174.
Ulrich, D. E. (1968). Matrices for Sentential Calculi. PhD thesis, Wayne State University.
Ulrich, D. E. (1970). Decidability results for some classes of propositional calculi (abstract). Journal of Symbolic Logic, 35:353-354.
Ulrich, D. E. (1986). On the characterization of sentential calculi by finite matrices. Reports on Mathematical Logic, 20:63-86.
Urquhart, A. (1973). An interpretation of many-valued logic. Zeitschrift für mathematische Logik und Grundlagen der Mathematik, 19:111-114.
Urquhart, A. (1978). Distributive lattices with a dual homomorphic operation. Studia Logica, 38:201-209.
Urquhart, A. (1979). A topological representation theorem for lattices. Algebra Universalis, 8:45-58.
Urquhart, A. (1985). Many-valued logic. In Gabbay, D. and Guenthner, F. (eds), Handbook of Philosophical Logic, Volume III, Synthese Library Series vol. 166, pp. 71-116. Reidel, Dordrecht.
Van Fraassen, B. C. (1971). Formal Semantics and Logic. Macmillan, New York.
Vickers, S. (1989). Topology via Logic, Cambridge Tracts in Theoretical Computer Science 5. Cambridge University Press, Cambridge.
Wansing, H. (1994). Sequent calculi for normal modal propositional logics. Journal of Logic and Computation, 4:125-142.
Ward, M. and Dilworth, R. P. (1939). Residuated lattices. Transactions of the American Mathematical Society, 45:335-354.
Wechler, W. (1992). Universal Algebra for Computer Scientists. EATCS Monographs on Theoretical Computer Science. Springer-Verlag, Berlin.
Whitehead, A. N. and Russell, B. (1910). Principia Mathematica to *56. Cambridge University Press, Cambridge.
Wittgenstein, L. (1921). Tractatus Logico-Philosophicus. Routledge & Kegan Paul, London. Translation by D. F. Pears and B. F. McGuinness, published by Routledge & Kegan Paul in 1961.
Wójcicki, R. (1970). Some remarks on the consequence operator in sentential logic. Fundamenta Mathematicae, 68:269-279.
Wójcicki, R. (1988). Theory of Logical Calculi: Basic Theory of Consequence Operations. Kluwer Academic Publishers, Dordrecht.
+,202 -,1 1,2 <,60 =,36 >,60 Cn(f"), 188 [S), 117, 120 [cP]' 4 [a), 117, 120
X(Ai),27 D.A(D),252
n,13 U,30 V3,70 n, 19 0,39,82 =,4 ~, 125 ;::,60 1;1,245 E,5
[(a/p),257 [(cP),7
&,3 <--,109 ~, 129 :S;, 1,57 :S;p,220 «,12 0,149 F,170 /,,129 .l,I03 <,359 ~, 75 j,75 II, 28 ~, 129 (Y, 137 ~, 3 1;;;,235 D,149 ~, 12 :J, 1 ;2,44 ,/, 129 0,30 ->,106
f- (V), 200 f-p,218 V, 1,402 11,67,402 h([cP]),7 v(cP, IV), 149 2,6 A/=,23 A,401 Abbott, J. C., 340, 351, 353 absolute completeness, 378 absolute frame, 367 absolute inconsistency, 260, 378 absolutely faithful, 16 absoluteness, 194, 200-202, 223 abstract algebra, I abstract law of residuation, 281, 408, 411 accessibility relation, 277, 361-363, 401-430 appropriate for operators, 401-406 for intuitionistic logic, 424 symmetry of, 416 adequacy of equations for algebras, 54 adverbs, 208 agreement of types, 411 Ajdukiewicz, K., 135 Alexander subbase theorem, 435 algebra, 10-11 and lattices, 71-74 and sets of generators, 14 definable by equations, 36, 39 free, 39-47 method of equivalence classes, 40 of propositions, 144 and algebra of sentences, 144 of sentences, 33, 130-133, 254 Leibniz congruence on, 254 of a zero-order language, 33 of strings, 125 and free semi-groups, 126 as multiple successor algebra, 129 associativity in, 126 of terms, 33 and interpretations, 34 of truth values, 146-148 of words, 34-36 on an operational syntax, 34 polynomials on, 329 simple, 25
456
algebraic compositionality, 141 algebraic logic and universal algebra, 10 algebraic prime factorization theorem, 29-33 algebraic semantics, 8, 167,273 and representation theory, 8 of modal logic, 151 algebraizability, 273-276 and the Leibniz operator, 276 partial, 276 Allwein, G., 405, 431,443 alphabet, 125 alphabetic variant of a sentence, 263 alternation property for H, 387 for S4, 370, 388 Anderson, A. R., 91, 106, 224, 227, 234, 274, 277, 287, 381 Angell, R. B., 5 anti-symmetry, 56 antitonic, 279, 397 A(S),15 appropriateness of an algebra of propositions for an algebra of sentences, 144 of atlases, 156 of matrices, 156 of medleys, 156 articulated concrete residuated partially ordered groupoid, 282 articulated frame, 281, 285 ternary, 421 ASC, 346 assertion, 80 and rule-form permutation, 80 pseudo-, 280 specialized pseudo-, 284 assertional double implicational poset, 283-285 assertional frame, 423 assertional implicational poset, 283 assignment, 34 and interpretations, 35 associated distributive lattice with image operators, 315 associativity in algebras of strings, 126 asymmetric cancellation property, 271 asymmetric consequence, 195-199,322 and explicit consequence, 197 and normal characteristic atlas, 269 completeness for, 195 Galois connection, 195 asymmetric pre-consequence relation, 188 asymmetric sequent calculus, 334, 341 for Boolean logic, 346-348
INDEX
asymmetric tautological implication, 272 asymmetry, 56 atlas, 141, 152,211-214 and logical space, 153 and possible worlds semantics, 153 Lindenbaum, 211 normal, 266-270 Scott, 260 atom of a Boolean algebra, 304 atomic definability, 253 atomic generation, 126 atomic Leibniz congruence, 253 atomic sentences, 131 automorphism, 17 axiom of necessity, 357, 388 axiom of separation, 193 axiomatic calculi for classical logic, 334 for intuitionistic logic, 380 for modal logics, 356 axiomatization, 216-224 axioms, 380 Balbes, R., 67 base for a topology, 434 BC', 350 BC,337 BCK,80 Belnap, N. D., viii, 91, 106,224,227,234,242, 262,274,277,287,381,417 BG, see binary gaggle 422 Bimb6, K., 80, 318, 414, 443 binary calculus, 334 binary consequence, 187 binary frame, 425 binary gaggle, 421 and intuitionistic logic, 421 and modal logic, 421 and positive fragment ofrelevance logic, 423 and relevance logic, 421 embeddable in full binary gaggle, 422 binary implication relation, 191-194 Birkhoff's prime factorization theorem, 30, 369 varieties theorem, 47, 103,246 for ordered algebras, see Bloom 76 Birkhoff, G., viii, 5, 10, 29, 36, 46, 48, 74, 287, 382,396,398,406,426 Blok, W. J., 246, 247, 252, 253, 274 Blok-Pigozzi matrix, 254 Bloom, S. L., 74, 83 Blyth, T. S., 397 Bochvar, D. A., 230, 245 Boole, G., 1-2
INDEX
Boolean algebra, 2, 4,88,92, 122,278,300-301, 306,328,351,371,394,402,414,434 and classical propositional calculus, 3 and completeness of classical propositional logic,
7 and consistency of classical propositional calculus,7 and homomorphisms onto 2, 306 and implication algebras, 350 and semi-Boolean algebras, 350 and subdirect product of 2, 313 and varieties, 328 as a lattice, 88 embedding theorem, 310 filter in, 351 filters in, 302 and congruences, 302 maximal, 302 fundamental embedding, 307 ideals in, 5 and deductive systems, 5 maximal filter principle, 310 representation theorem, 310 with operators, 313, 361, 399, 422, 442 Boolean interpretation, 329, 331 and homomorphisms, 329 Boolean lattice, see Boolean algebra 153 Boolean logic, 321, 322, 328, 336 and Frege logic, 322, 324 and unital Boolean logic, 325 axiomatic calculus for, 337 compactness of, 331-332 complete fragment, 336 Boolean matrix, 174,321 Boolean ring, 5, 352-354 Boolean spaces, 435 Boolean valid, 332 Boolean valuation, 331, 332 Boolos, G., 137 bundle, 152, 273 calculus of derivations, 342 cancellation property, 257-261 asymmetric version of, 260 in asymmetric logic, 261 in symmetric logic, 257 canonical homomorphism, 24 Cantor's enumeration theorem, 138 Cantor, G., 433 Carnap, R., 155 carrier set, 10 Cartesian product, 26 of a family of sets, 27 categorial grammar, 135 categoricity, 194
457
category theory, 397 Certaine, J., 74, 287 chain, 30, 58, 391 Chellas, B. F., 356 Church, A., 35, 131,262 clashing of types, 411 classical (material) implication, 106 classical logic, 2-9, 321-355 and modal logic, 356, 378 and LC, 393 classical negation, 88 classical propositional calculus, see classical logic, and Boolean logic, 2 clopen set, 179,365,434,441 subbase for a topology, 434 closed set, 179,365,433 closure algebra, 5, 8, 358, 364-367, 370-373, 386, 387,399 finite compactness of, 370 closure of a set, 365 closure operator, 358, 364 determined by a quasi-metric, 365 closure under exclusion negation, 178 closure under gaggle operations, 413 closure under substitution of a globally evaluated interpretationally constrained language, 163 of an evaluationally constrained sentential language, 163 co-consequence, 214-216 combinators algebra of, 413 and implicational fragment, 414 commutation, 72, III commutative ring, 352 compact topological space, 179,438 compactness, 176-181, 197,274,320,369,436, 438 and exclusion negation, 327 and the finite union property, 180 E-, 177 1-, 177, 326 in classical propositional logic, 326 in explicit symmetric consequence, 203 of a topological space, 179 of a topology, 440 of deductive entailment, 176 of elliptical consequence, 198 of semantic entailment, 176 relation of semantic to topological, 179 S-, 177 strong, 326 symmetrical, 438 U-, 186,326
458
compatibility, 429 complementation, 85-93 classical, see complementation in distributive lattices, 88 in distributive lattices, 88 non-classical, 88 complementation laws, 300 complemented lattice, 85 complete filter, 119 complete fragment for Boolean logic, 336 complete lattice, 67, 68, 70, 256, 322, 382 complete reduct, 205 complete ring of sets, 67 completely prime filter, 119 completely standard matrix, 272 completeness, viii, 196 and K -algebras, 53 and finite algebras, 8 and Lindenbaum algebras, 8 and modal logics, 424 and representation theorems, viii, 193 and valuations, 191, 193 definitional, 336 extensions of 85 in Henle matrices, 376 for asymmetric consequence, 195 for binary implication, 193 for compact symmetric pre-consequence, 20 I for ortholattices, 398 in formal symmetric consequence logics, 213 in formal unary assertionallogic, 210 in the sense of Hallden, 263, 287 and the asymmetric cancellation property, 264 and the symmetric cancellation property, 263 of a modal logic, 416 of asymmetric consequence logic relative to a Kripke atlas, 269 of Boolean algebras, 305 of classical logic, 7, 8, 265 of formal asymmetric consequence logics, 212 of intuitionistic logic, 8, 384 of modal logics, 8, 375 of symmetric consequence, 200 of unary assertionallogic, 266 ofLC,390-393 W.r.t. a class of algebras, 53 compositionality algebraic, 141 of a language, 168 compression, 15 concrete assertional double implicational poset, 286 implicational poset, 285 residuated partially ordered groupoid, 286 concrete residuated partially ordered groupoids, 279
INDEX
INDEX
concrete residuated pair, 409 cone, 119, 192,272,277-282 and pre-orderings, 193 principal, 120, 278 congruence, 19-25,250,254-257,323,328,336 and equivalence relations, 20 and homomorphisms, 23 and identity, 256 and lattices, 71 and quasi-congruence, 75 on a matrix, 249-252 strong and weak, 250 on algebras, 21 on relational structures, 21 congruence of indiscernibles, 252 conjunction, 64 connectivity, 57 consequence, 181, 184, 188 asymmetric and stability, 261 cardinality of, 248 asymmetric in a matrix, 244 binary, 277 elliptical, 197, 199 explicit, 197, 199 in a matrix, 234 mixed, 199 relation and operation, 188 stop-gap, 220 symmetric, 184 class of matrices satisfying, 249 symmetric in a matrix, 244 symmetric stop-gap, 244 consistency, 267 and Lindenbaum algebras, 8 of a logic, 263 of classical propositional calculus w.r.t. Boolean algebras, 7 tautological, 263 consistency of a filter, see filter, consistency, 267 continuous function, 435 contra-validity, 170 contraction, 112 contradiction, 87 contraposition, 87, 89 contrapositive, 408, 409 converse of a relation, 60 with more than two terms, 409 Copi, 1. M., 184 correspondence, 286 cover, 179 open, 179 covering relation, 62, 304 Cresswell, M., 135, 356 Cs t(M),255
Curry, H. B., 2, 130 Curry-Howard isomorphism, 414 cut, 188,267 asymmetric algebraic, 197 global, 189 in asymmetric consequence systems, 188 in explicit consequence, 197 in hemi-distributoids, 202 in symmetric consequence systems, 188 infinitary, 188 symmetric algebraic and hemi-distributoids, 203 symmetric infinitary, 189 CIV (M),255 Czelakowski, J., 237, 246, 248, 257, 273 data types, 41 Davey, B. A., 431 D(BC),343 D(C),342 De Morgan complementation, 91 Laws, 374,429,441 monoid, 428 negation, 425, 427 properties, 416 De Morgan, A., 406 decidability,370-375 of intuitionistic logic, 388 Dedekind cut, 120,296 Dedekind, R., 5, 74, 120,296 deducibility relation, 282 deduction, 218, 357 in a modal logic, 357 deduction calculus, 342 deduction theorem, 381, 417 deduction theorem property, 215, 265 deductive consequences, 344 deductive system, 5 defining equations, 274 degree of a relation, 10 of an operation, II derivation, 176, 342 calculus, 342 for classical logic (D(BC)), 341 rules of, 334, 342 designation, 226 preservation of, 237, 239-243, 243-246 designation extension, 246 dilution, 113, 188,259, 267 and elliptical consequence, 198 in explicit consequence, 197 in explicit symmetric consequence, 203 in symmetric consequence systems, 188
459
Dilworth, R. P., 74, 406 direct interpretation, 149 direct product, 26, 306, 307 and power, 28 of algebras, 27 of arbitrary family of structures, 27 of matrices, 232 of ordered algebras, 76 of relational structures, 26 disconnected family of sets of sentences, 258, 259 discrete metric, 364 discrete topology, 365 disjointness of a filter-ideal pair, 123,439 disjunction, 64, 291 display logic, 417 distribution classical, 93-98 non-classical, 98-104 distribution type, 400, 401 and trace, 400 in a gaggle, 403 of modal operators, 400 of operations in a distributoid, 408 distributive lattice, 4, 94, 207,288,293-300,314, 382,398,421,440 and distributive triple, 98 and hemi-distributoids, 207 and varieties, 97 complementation in, 96 with operators, 314, 362, 415 distributive laws, 94 distributive pre-lattice, 207 and consequence relations, 207 distributive triple, 98 and distributive lattice, 98 in modular lattices, 100 distributoid, ix, 394, 398-406, 408, 442 equational definability of, 412 with identities, 413 Dosen, K., 287, 442 double implicational poset, 280, 281 and partially ordered groupoids, 280 double negation, 87 doubly assertional frame, 421 dual Galois connection, 395, 396 dual hereditary set, 441 dual ideal, 5, 297 dual residuated pair, 395 dual space of a Boolean algebra, 435 dual-strong fidelity, 17 duality, 431, 435 and topology, 434 for Boolean algebras, 437 object and functorial, 437
460
of propositions and possible worlds, 431 Dummett, M., 233, 390 Dunn, J. M., 78, 224, 227, 234, 274, 277, 279, 286,288,318,362,390,395,409,412,414, 415,418,421,422,427,439,442,444 Dwinger, P., 67 E-compact, 177 ECSL,159 effectively decidable set, 138 of axioms and rules, 217 of sentences, 140 effectively enumerable set, 137 of strings, 139 elementary class, 167 elliptical consequence, 197, 198,206 £(M),255 embedding, 17 embedding theorem, 310, 369 and Birkhoff' s prime factorization theorem, 312 endomorphism, 17 entailment, 55, 170,321 deductive, 176 compactness in, 176 finitary, 176, 326 ordinary, 170 semantic, 176 compactness in, 176 simple, 170 symmetric, 170 epimorphism, 17 equational class, 36 equivalence of evaluationally constrained languages, 172 of matrices, 173 strict between matrices and atlases, 174 strict, of logics, 325 equivalence relation, 19, 255 and congruence relations, 20 and partitions, 20 respect for designation in, 256 Esakia, L., 379 Euclid, 184 Euclidean geometry, 216 evaluation ally constrained language, 158 Everett, C. J., 396, 398 evidential frame, 383 exact cover, 434 exclusion negation, 178,327,328 closure under, 178 exhaustive filter-ideal pair, 123,439 expansion of a matrix, 247 explicit consequence, 197-199 explicit symmetric consequence, 221
INDEX
falsifiability, 170 FBG, see full binary gaggle, 422 fidelity, 15 absolute, 16 minimal,16 field of sets, 300, 315, 353 filter, 5, 115-123,291,297,321,323,330,424, 431 and Dedekind cut, 120 and deductive systems, 5 and ideals, 5, 117 and possible worlds, 115 and theories, 115 complete, 122 consistent, 122 generated by a set, 117 join-irreducible, 304 maximal, 121,302 of the classical propositional calculus, 6 prime, 116,302 completely, 119 filter-ideal pair, 123 disjoint vs overlapping, 123 exhaustive, 123 maximal vs principal, 123 Fine, K., 277 finite algebras and completeness, 8, 371 finite direct product, 313 finite distributive lattice, 295, 304 finite intersection property, 180,369,436 finite model property, 224, 234, 375, 388 and unary assertionallogics, 224 finite union property and compactness, 180 first-order definability over a matrix, 252 Fis(~), 203 fissions, 203 Fitch, F. B., 4, 108, 184 FOLE, 38 formal derivations, 342 formality of valuations, 209 formula, 134 founded family of operations, 408 fragment of a language, 335 of a logic, 348 frame for a distributoid, 400 for a gaggle, 400 for a modal algebra, 313, 362, 416 for a tonoid, 279 free generators, 43 FK(n),44 free K -algebra, 45 free K -tonoid, 83 free semi-group, 127
INDEX
freedom, 43 and soundness, 51 model-theoretic notion, 41 of discrete word algebras in tonoids, 84 of word algebra in groupoids, 39 universal, 44 Frege algebra, 147,323,330 and Frege matrices and atlases, 153, 324 Frege interpretation, 330 Frege logic, 321, 322, 328, 333 and Boolean logic, 325-333 and M-Iogic, 333 and unital Boolean logic, 322-324 Frege semantics, 146 Frege, G., 2,141,151,184,216 Frink, 0., 288 Fuchs, L., 74,287,406 full binary gaggle, 422 full closure algebra, 366 full Heyting algebra of open sets, 384 of propositions, 383 function space, 28 fundamental embedding, 307 fundamental homomorphism theorem, 308 FusCr). 196 fusion, 109, 196,278 and its properties, 110,282 and residuation, 278 future contingent statements and multi-valued logic, 148 Gabbay, D., 356 gaggle, ix, 281, 394, 395, 408-420, 421-430 and symmetric modal Boolean algebra, 417 definition of, 408 equational definability of, 412 with identities, 413 Galois closed, 194, 429 Galois connection, 48, 89, 195, 395, 396, 401, 409,410,417,425,426,428,442,444 and residuated pairs, 396 and tonoids and inequations, 83 dual, see dual Galois connection, 395 general square-increasingness ofhemi-distributoids, 205 general sum-decreasingness ofhemi-distributoids, 205 generation, 13 generators induction from, 14 Gentzen, G., 114, 184, 188 Gericke, H., 67 Gierz, G., 397 global cut property, 189,260
461
globally evaluated interpreted language, 159 Gil' 230, 390 Godel, K., 230, 390, 393, 417 Godel algebra, 390 and LC-algebras, 390 Godel matrix, 230 Godel numbering, 138 Godel-McKinsey-Tarski translation, 388 Goldblatt's theorem, 442 Goldblatt, R., 394, 398,426,429,442 Gratzer, G., 51, 397 greatest element, 65 glb(S),66 greatest lower bound, 66, 255 groupoid, 39, 279, 406 H, 380 Hallcten completeness, 263 and the cancellation property, 271 Halmos, P., viii, 300 Hansoul, G., 442 Hardegree, G., 382 Hartonas, C., 288, 318, 431, 443 Hasse diagram, 62 Hausdorff space, 434, 441 Heine-Borel theorem, 434 hemi-distributoid, 202 and compact symmetric consequence, 205 bottom element of, 206 general square-increasingness of, 205 general sum-decreasingness of, 205 lower identity element of, 206 sufficient idempotence of, 205 and symmetric algebraic cut, 205 Henkin, L., 9 Henle algebra, 367 Henle matrix, 367-369, 377, 375 hereditary condition, 290,413,424 downward,291 upward,291 hereditary set, 421, 441 Hermes, H., 129 Heyting algebra, 5, 383-386, 417, 424 and S4-algebras, 417 of open elements, 386 of open sets, 385 Heyting complement, 91, 383 Heyting implication, 417 Heyting lattice, see Heyting algebra, 388 Heyting, A., 91,380 Hilbert spaces, 398 Hilbert, D., 184,216 Hilbert-style presentation, 216 and consequence relations, 216-222 of intuitionistic logic, 380
462
of modal logics, 356 ofLe, 390 Hochster, M., 438 Hofmann, K. H., 397 homeomorphic image, 435 homeomorphism, 435 homomorphic image, 15 homomorphism, 7,15 and congruence in matrices, 251 and congruence relations, 23, 24 and interpretations, 7, 34, 51, 329 between matrices, 237 in tonoids, 80 relational and operational, 19 Horn formulas and quasi-equations, 50 Hughes, G. E., 356 Huntington, E., 2 I-compact, 177,326 lA, 340, 349 ideal,S, 116, 115-123 and Dedekind cut, 120 ideal elements, 121, 295 idempotence, 40, 197,352,418 idempotent rings and Boolean algebras,S identity element, 282, 412, 421 and truth of propositions, 282, 412, 420 left, 282,420 right, 282, 420 identity of indiscemibles, 252 image operator, 314 implication, 55 and residuation, 109,279,287 classical, 105-109 in intuitionistic logic, see intuitionistic implication, 424 in quantum logic, 382 non-classical, 109-115, 278 push and pop, 420 implication algebra, 340, 353, 349 and Boolean implication, 350 and Boolean lattices, 350 and upper semi-Boolean algebra, 349 implication operation, 106 implication subalgebra, 350 implication tonoid, 79 and substructural logics, 79 implicational filter, 351 implicational fragment, 115,380,414 implicational po set, 279, 283 and partially ordered groupoids, 279 implicative fragment of classical logic, 348 and semi-Boolean algebras, 349 derivational calculus for, 352 unary axiomatic calculus for, 350
INDEX
implicative lattice, 38 I, 382, 383 inclusion lattice. 93 inclusion lemma, 284 inclusion poset, 58 and complete typicality, 58 inconsistency of filter-ideal pair, 123,439 toleration of, 267 indirect interpretation, 149 induction, 14 IL,75 inequational logic, 75 inequations, 75 infimum, 66 and set intersection, 66, 67 infinitary cut, see cut, infinitary, 188 infinitely distributive. 382 information order on a frame, 277, 383, 421 interior operator, 364, 358 I(L,M),157 interpretation, 6, 34, 51, 145,389 and algebra of terms, 34 and assignments, 35 and homomorphisms, 34, 51,155 and the principle of categorial compositionality, 145 and valuations, 155 in a Scott atlas, 213 into 2, 6 into a Frege algebra, 147 of a language in a matrix, 157 of a language in a medley, 157 of a language in an algebra, 155 of a language in an atlas, 157 of H formulas, 387 of S4 formulas, 387 on a language, 155 truth w.r.t. possible worlds, 149 interpretation structure, 34 interpreted language, 159 intersection of algebras, 13 intersection property, 370 finite, 181,370,436 intransitivity, 62 intuitionistic calculus (H), 380 intuitionistic implication, 395, 418, 424 and residuation, 420 intuitionistic logic,S, 8. 277, 380, 381, 418 and binary gaggles, 421. 424 and modal logic, 386, 419, 420 and non-classical complementation, 91 finite model property for, 389 fragments of, 380 Hilbert style, 380
INDEX
Kripke-Grzegorczyk semantics, 234 intuitionistic negation, 425 irreducible closed subset, 438 irreflexivity, 56 isolation lemma, 35 isomorphic, 17 isomorphism, 15, 17,292 and free K -algebra, 45 between matrices, 228 isotonic, 74, 76, 79, 196,279,282,314.359 and implication, 77 and negation, 77 Janowitz, M. E, 397 J askowski, S., 184 Jeffrey, R. C., 137 Jevons, W. S., 2 Johansson, 1., 91,380 Johnstone, P. T., 438, 441 join, 68 join-irreducible, 304 join-irreducible separation principle, 289 join-reducible, 293 join-semi-Iattice, 68, 349 J6nsson, B., viii, 313, 394, 401, 422 1SL,71 K -free algebra, 43 and free K -algebra, 45
and subdirect classes, 46 existence of, 44 Kalish, D., 184 Kelley, J. L., 433 Kleene, S. C., 35, 227 Komori, Y., 290 Kripke algebra, 150 Kripke atlas, 154, 269 Kripke matrix, 266 Kripke semantics, 290, 313, 361, 383, 417, 418 Kripke, S., 149,262,277, 356, 361, 383, 401 Kummer, E. E., 5 Kuratowski, C., 41 Kuratowski closure axioms, 364 L-space, 443 Lambek calculus, 129,287,395 of pairs (non-associative), 130 of strings (associative), 129 Lambek, J., 135,397 Langford, C. H., 356 language, 133-136 and decidabiIity, 140 and substitution, 136, 163 and unique decomposition, 133 atomic sentences of, 133
463
compositional, 168 evaluated, 162 evaluationally constrained, 158, 162, 163, 173 interpretationally constrained, 159 interpreted, 159 truth functional, 169 types of, 160 uninterpreted, 159 lattice, 4, 67-70, 68, 71-74, 288-297, 396, 398, 431 and congruences, 71 and Lindenbaum algebras, 4 and ordered algebras, 75 as a relational structure, 68 bounded,69,438 complemented, 85-88, 89, 300 complete, 68, 256 distributive, 93-98 equational definability of, 73, 412 modular, 99 power set, 66 with operators, 314, 318,361,398 lattice of sets, 93 lattice-ordered groupoid, 109,314,406 law of contraposition, 78, 90, 109 law of importation-exportation, 108 law of the left residual, 109, 280, 407 right residual, 109, 280, 407, 420 LC,390 algebras and G6del algebras, 390 least element, 65 lub(S),66
least upper bound, 66, 255 left identity, 282, 420 left lower identity, 421 left residual, 109,421-423 positive paradox for, 113 left-cancellation in semi-groups, 126 left-residuated partially ordered groupoid, 413 Leibniz algebra, 150 Leibniz atlas, 153 Leibniz congruence, 252 atomic, 253 Leibniz operator, 252 Leibniz, G. W, 2, 252 Lemmon, E. J., 184,356 Lewis, C. 1., 106, 356, 381 Lewis, D., 106,381 Lincoln, P., 224 Lindenbaum algebra, 4, 46, 274, 338, 361, 370, 374,389,399,413,423,432 Lindenbaum atlas, 257, 273 Lindenbaum matrix, 212, 257, 260, 273 Lindenbaum, A., 3, 210
464
linear ordering, 57 logic, 184-225 absoluteness and categoricity of, 194 algebra of, 8 algebraic and logistic approaches to, 2 algebraic semantics for, 273 and consequence, 184 and lattices, 5 and natural deduction, 184 and ordered algebras, 77 and tonoids, 77 and valuations, 169-172, 189-191 axiomatic characterizations of, 216 bivalent, 156 classical, 321 consistency of, 263 countable asymmetric consequence, 261 equivalential, 186 formal,208 Hilbert-style presentation of, 218 intuition is tic, 380 Lindenbaum algebra of, 8 modal,356 multi-valued, 148 non-truth-functional, 149 of quantum mechanics, 5 relevance, 274 and algebraic semantics, 275 soundness and completeness in, 191 structural, 208 substructural,274 typical properties of, 184 logical equivalence, 214 logical indiscernibility, 174 logical operations distribution type of, 399 tonic type of, 78 trace of, 400 logical validity, 181 Los, J" 184,226 Ib(S),64
lower bound, 64 lower semi-Boolean algebra, 349 LSBA,349 L m,227 L w ,233 LNJ,233 Lukasiewicz, J., 91, 108, 148,203,226,233,399 M,333 m,248 m-filter, 248 M-Iogic, 333 and Frege logic, 333 m-reduced product of matrices, 248
INDEX
MacLane, S., 397 Maksimova, L., 379, 390 Malinowski, G., 226 Martinez, N. G., 442 material binary gaggle (MEG), 422 material implication, 105,360 mathematical induction, 14,335 matrix, 152,226,321 atomic definability over, 253 Blok-Pigozzi, 254 characteristic for a unary assertionallogic, 232 for asymmetric consequence, 234 completely standard, 272 congruence on, 250, 257 strong and weak, 250 consequence relations in, 244 contraction of, 247 designation extension of, 246 direct product of, 232, 238, 243 equivalence relations on, 256 expansion of, 247 first-order definability over, 252 Glidel,230 homomorphic image, 238, 247 homomorphism, 237 and preservation of designation, 231, 240 positive and negative, 237 strong and weak, 237 infinite, 233 isomorphism, 228, 231, 238 strong and weak, 238 Kleene and Lukasiewicz, 229 Kripke, 266 Lindenbaum, 260 Lukasiewicz, 227, 235, 245 normal,262 and homomorphism into 2, 262 characteristic of unary assertionallogics, 264 pre-standard, 272 reduced product, 248 relations between, 173 relative of, 247 semantics, 274 Shoesmith-Smiley, 260 standard, 272 strong homomorphism theorem, 251 subdirect product, 238 Sugihara, 230 tautologies of, 232 weak homomorphism theorem, 252 maximal Boolean matrix, 153 maximal element, 65 maximal filter, 121,323,368,433 maximal filter principle, 310, 436
INDEX
maximal filter-ideal pair principle, 439 McCall, S., 5, 226 McKinsey, J., 5, 8, 364, 370, 371, 386 McRobbie, M., 197 medley, 152 meet, 68 meet-normal form, 389 meet-semi-Iattice,68 Meskhi, v., 379 method of equivalence classes, 40 metric, 364 metric space, 364 Meyer, R. K., 197,203,234,277,286,390,394, 414,422 minimal conditions for implication, 108 minimal element, 65 minimal logic, 91, 380 and non-classical complementation, 91 minimal negation, 425 minimally faithful homomorphism, 16 minimum condition for separation, 295 mirror principle, 142 Mitchell, J., 224 mixed consequence, 199 mixed structures, II modal algebra, 358-361, 370-375, 416, 424, 442 modal logic, 277, 313, 356-379, 394 and binary gaggles, 421 and classical logic, 378 and intuitionistic logic, 386 and residuated pairs, 398 axioms of, 356 B, 150,357,415,416 backward operators, 397,416 closure operator, 358 decidability, 371-375 deducibility and validity. 363 finite model property, 374 frames, 363, 415 interior operator, 358 K, 150,356,415 algebra, 371, 372 various axiomatizations, 356 1(5,418 Kripke semantics, 313, 361, 394, 415 (weak) completeness, 364 completeness, 363 model for, 362 normal, 358,415 relations among. 357 S4, see S4, 357 S5, see S5, 357 soundness, 361 strict implication and intuitionistic implication, 418
465
T,150,357,415,416 algebra, 358 ternary accessibility relation. 425 modal operators, 149 model,36 model theory, 36 modular lattice, 99 and modular pairs, 102 distributive triples in, 100 ortho-, 102, 103 varieties of, 100 weak,102 M(b,c),102
modus ponens, 3, 107,351,380 monoid, 129 monomorphism, 17 monotonic, 113 Montague, R., 184 MSL,71 multi-valued algebra, 148 mUltiple successor algebra, 129 multi set, 40, 197 natural deduction, 184 necessity, 149. 356,402 negation, 85 gaggle treatment, 427 negation consistency, 429 negative cone, 120 non-classical complementation, 91 non-classical distribution, 98 non-classical logics, 5, 113,394 normal atlas, 262-266 normal extensions of Le, 392 normal matrix, 262-266 and homomorphism into 2, 262 normal modal Boolean algebra, 415 normal operation, 315 normality of a matrix, 262 Ockham lattices, 442 one-point compactification theorem, 370 Ono, H., 290 open cover. 179 open function on a topological space, 435 open set, 179,365,433 operation, 11 additive, 314 normal,315 operational homomorphism, 18 operational structure, II operational syntax, 33 operator, 314 order variety. 76 ordered algebra, 74
466
and lattices, 75 discrete, 74 varieties theorem for, 76 ordered quasi-varieties, 76 ordered quotient algebra, 76 ordered Stone spaces, 438 ordering, 58 and the principle of duality, 60 Ore, 0., 396 orthocomplement, 86, 91, 429 and classical negation, 88 orthocomplementation, 442 orthogonal, 102 ortholattice, 86, 398, 442 equational characterization of, 87 orthologic, 426, 429, 442 orthomodular lattice, 103 orthonegation, 86, 442 overlap, 188, 197,267 paradoxes of implication, 6, 113, 381 of material implication, 419 of strict implication, 419 partial gaggle, ix, 409 partial ordering, 56, 74,78,255 and associated strict ordering, 59 and regular relations, 62 partially ordered set, 56 partially ordered groupoid, 109,279,406 partially ordered residuated groupoid, 279, 406 partition, 19 and quotient algebras, 20 Peano, G., 184 Peirce, C. S., 406 permutation, 79, III Pigozzi, D., 246, 247, 252, 253, 274 polarity, 398, 444 polynomials on an algebra, 329, 336 pop, 282-284,286,420 Popper, K., 184 poset,56 bounded,65 positive cone, 119 generated by a set, 120 positive De Morgan monoid, 423 positive fragment of classical logic, 348 and Boolean rings, 353 axiomatic characterization of, 354 of intuitionistic logic, 380 of relevance logic, 423 positive Routley-Meyer frame, 423 possibility, 149, 402 possible worlds semantics, 313, 362, 397
INDEX
and algebraic semantics, 150 Post complete, 378 Post inconsistent, 378 Post, E., 8 IfJ(X),66 Pratt, v., 109,407,412 pre-ordered groupoid, 196 pre-ordering, 55, 191 and implication, 55 and partial orderings, 56 generated by a relation, 61 pre-semi-Iattice, 199 pretabularity, 234, 375, 390 Priestley Spaces, 438 Priestley, H. A., 431, 438, 439, 441, 442 prime algebra, 29 prime element, 296, 293 separation principle, 305 prime filter, 116,297,298,368,414,426,423 prime filter principle, 440 prime filter separation principle, 298, 302, 368, 439 prime ideal, 116 principal cone, 120, 192,274,395 principal dual cone, 120, 280, 395 principal filter, 118,318,319,322,404 principal ideal, 119,349 principal pair, 123 principle of bivalence, 156 principle of categorial compositionality, 144 principle of compositionality, 143 projection, 28 proof,216 proper filter, 115,302,316,321,330 proper fragment, 348 proper modal extension, 378 proper normal extension of LC, 392 proposition, 2, 278, 362 as a hereditary subset, 290, 421 as a set of information states, 278, 383, 438 pseudo-assertion, 113,281, 283 pseudo-Boolean algebra, 8, 91, 383, 390 pseudo-complement, 91, 383 pseudo-metric, 364 pseudo-trichotomy, 126 push,282-284,286,420 quantum logic, 88, 102, 104, 186,381 implication in, 382 quantum mechanics, 5, 429 quasi-complement, 399 quasi-congruence, 75 quasi-equation, 49 quasi-equational class of algebras, 50 quasi-equations and Horn formulas, 50
INDEX
quasi-inequations, 77 and implication to no ids, 79 quasi-metric, 364, 383 quasi-metric space, 366 quasi-ordering, 55, 366, 383, 385 quasi-partition, 189 quasi-variety, 49, 50 Quine, W. V. 0.,182,184 quotient algebra, 20, 23, 323, 328 and partitions, 20 quotient matrix, 250 weak and strong, 250 quotient structure, 23 of operational structures, 23 of relational structures, 23 R,272 Rasiowa, H., 4, 88 realizing contraposed distribution types, 411 recursiveness ofaxiomatization, 217 reduced product of algebras, 50 reduced product of matrices, 248 reduct of a fusion, 205 reflexivity, 55 refutable sentences and ideals, 6 regular cardinal, 249 regular relation, 62 orderings induced by, 62 relational homomorphism, 15 variants of, 16 relational structure, 10 relative frame for S5, 367 relative matrices, 247 relative topology, 434 relatives (in a set of operations), 408 relevance logic, 287,413,420,427,428 and binary gaggles, 421 and non-classical complementation, 91 implication in, 381, 394 replacement, 22, 214, 328 types of, 22, 253 replacement theorem, 4 representation, 8, 277-320 of Boolean algebras, 303-305, 310, 435 with operators, 313 of closure algebras, 367 of distributive lattices, 293-300, 438 with operators, 314 of distributoids, 40 I, 403 of gaggles, 409-412, 422, 426, 429 of Heyting algebras, 383-386 of implicational posets, 279 of lattices, 288-292,431-444 with operators, 317-320 of residuated partially ordered groupoids, 281
467
of semi-lattices, 287 of tonoids, 279 of LC-algebras, 391 representation theorems, viii, 277-320 and completeness theorems, viii representation theory, viii, 8 residuated monoid, 130 residuated partially ordered groupoid, 109, 279, 281,401 residuation, 82, 280, 382, 395, 396, 401, 406, 410 abstract law of, 281, 408 and fusion and implication, 287 and Galois connections, 396 and intuitionistic implication, 420 and modal operators, 416 in lattice ordered groupoids, 406 in partially ordered groupoids, 396 laws of left and right, 407 right identity, 283, 420 right lower identity, 421 right residual, 109,421-423 positive paradox for, 113 right residuated partially ordered groupoid, 279, 413 right-residuated distributive lattice ordered groupoid with right identity, 421 ring algebraic, 352 of sets, 67, 93, 287, 300, 315, 352 RM,230 Routley, R., 277, 286, 394,422 Routley-Meyer frame, 286, 428 R*, 61, 315 rule of necessitation, 356, 359 rule prefixing, 78, 279 rule pseudo-permutation, 113, 284 rule suffixing, 78, 279 rule-form permutation, 80, 284 Russell, B., 2, 184 Rutherford, D., 67 S-compact, 177 S4,5,8,150,357,415 algebra, 358, 360, 364, 373 and intuitionistic logic, 386 S5,80,150,357,359,375,377,390,415,418 absolute semantics, 149,367 algebra, 358, 361, 368, 369 extensions, 378 and Henle matrices, 376 axiomatization of, 378 pretabularity of, 375 uniform finite model property in, 375 SA,349 Sambin, G., 442
Sasaki implication, 381
satisfaction, 36, 169
  by a class of algebras, 36
  by an algebra, 36
satisfiability, 170
  by a Frege interpretation, 330
SBA, 349
Scedrov, A., 224
Schmidt, J., 396
Schröder, E., 2
Scott atlas, 213, 260, 273
Scott, D., 185, 202, 235, 356
Scroggs, S. J., 375, 392
Segerberg, K., 356
self-implication thesis, 283
semantic categories, 142
semantics, 141-183, 193
  algebraic (compositional), 167
  possible worlds, 277
semi-Boolean algebra, 349
  and Boolean lattices, 350
  equational definability, 349
semi-group, 40, 126
  Abelian, 40
  free, 40
  trichotomy and cancellation in, 126
semi-interpreted language, 158, 190
semi-lattice, 40, 290
  as algebra, 72
  meet- and join-, 71
  varieties, 73
semi-lattice-ordered monoid, 290
semi-modular lattice, 102
sentential modal operators, 149
separation principle, 278, 287, 289, 293-295, 298
Shankar, N., 224
Shoesmith, D., 185, 202, 257, 261
Shoesmith-Smiley matrix, 260
Shukla, A., 373
Sikorski, R., 4, 300
similarity
  of operational structures, 11
  of relational structures, 10
  of tonoids, 83
Simmons, G. F., 433
simple entailment, 170
simple semantic categories, 142
Smiley, T. J., 4, 185, 202, 257, 261
soundness, 7
  and asymmetric consequence, 269
  and freedom, 51
  of formal asymmetric consequence logics, 212
  of formal symmetric consequence logics, 213
  of formal unary assertional logics, 210
  of modal logics, 362, 367, 416
  of valuations w.r.t. a consequence relation, 191
  of S5, 375, 376
  w.r.t. a class of algebras, 51
specialized pseudo-assertion, 284
spectral topological spaces, 438
square-decreasingness, 418
square-increasingness, 197, 423
stability and asymmetric consequence, 261
stable consequence relations, 259-261
Stalnaker, R., 106, 381
statement, 187
Stone, M. H., viii, 5, 8, 288, 295, 384, 395, 403, 417, 422, 431, 438, 441
Stone space, 435
Stone's prime filter separation principle, 297, 391, 436
Stone's representation theorem, 297-300, 303-305, 322
  and duality, 435
  for Boolean algebras, 312, 433
  for distributive lattices, 297
stop-gap consequence, 220
Strawson, P. F., 4
strict equivalence of logics, 325
strict implication, 360, 418
strict linear ordering, 59
strict ordering, 59
  and associated partial ordering, 59
string, 125
strong compactness, 326, 370
  of a topological space, 370
strong homomorphism theorem for matrices, 251
subalgebra, 12
  generated, 14
subbase for a topology, 434
subdirect classes, 46
  and K-free algebras, 46
subdirect product, 29
subdirectly irreducible, 30
sublattice, 115
submatrix, 237
  weak and strong, 237-239
subminimal complementation, 89
subspace, 369
substitution, 137
  semantic, 166, 257
  syntactic, 166
substitutionally determined language, 165
substructural logics, 80, 278, 400
  and implication tonoids, 79
substructure, 12
subtonoid, 83
subtraction algebra, 349
sufficient idempotence in a hemi-distributoid, 205
superstructure, 12
supervaluational logic, 88, 186
Suppes, P., 184
supremum, 66
  and set union, 66, 67
Suszko, R., 184
symbolic logic, 1
symbols, 33, 125
symmetric algebraic cut, 205
symmetric consequence, 184, 189, 196, 325
  absoluteness in, 202
  and valuations, 199
  explicit, 203
  logic, 273
symmetric cut, 203
symmetric entailment, 170, 325
symmetric infinitary cut, see cut, symmetric infinitary, 189
symmetric lattice, 102
symmetric modal algebra, 416
symmetric modal Boolean algebra and gaggles, 417
symmetric pre-consequence, 188
symmetric stop-gap consequence, 222
symmetrical sequent calculus, 334
symmetry, 56
syntactic categories, 142
syntactic degree, 33
syntax of standard first-order logic, 134
Szász, G., 454
Tarski, A., viii, 3, 5, 8, 188, 210, 226, 233, 313, 364, 386, 394, 395, 401, 422
tautological consistency, 263, 265, 268
tautological implication, 262, 265
tautology, 87, 262
  of a matrix, 232
  preservation, 232
theorem, 3, 216
theory, 278
thesis
  in a derivational calculus, 342
  of a calculus, 334
Thomas, I., 393
Thomason, R. H., 381
tonic direct product, 83
tonic homomorphic image, 83
tonic type, 78
TIL, 79
tonoid, ix, 78, 279, 398, 400, 409
  inequational logic, 79
  similarity of, 83
  varieties theorem for, 82
top element, 198, 206, 257
topological space, 179, 193, 364, 369, 385, 433
  and duality, 434
  clopen set in, 179
  closed set in, 179
  compactness of, 179
  Hausdorff, 434
  induced by a valuation space, 181
  open set in, 179
  totally disconnected, 434
  totally order disconnected, 438
trace, 400
transitive closure, 61, 255
transitive reflexive closure, 61
transitivity, 55
  and implication, 108
triangle inequality, 364
truth-functional language, 169
truth functions, 146
truth set, 152, 262, 268
truth values, 146-148, 226, 227
Turing machine, 137
TV, 356
two-valued quasi-metric, 364
type
  of a relational structure, 10
  of an operational structure, 11
typicality, 42, 41-44
  and inclusion posets, 58
  and non-identity, 42
U-compact, 177, 326
UCLA proposition, 150, 277
Ulrich, D. E., 234, 396
ultrafilter, 50
ultraproduct, 50, 248
unary assertional logics, 185, 267, 273
unary calculus, 334, 342
unassailable, 326
uninterpreted language, 159
unique decomposition, 131
  and universal freedom, 132
uniquely complemented lattice, 85
unital Boolean logic, 321, 333
  and supervaluational logic, 322
  identity with Boolean logic, 325
  matrix for, 322, 324
  strict equivalence to Boolean logic, 325
  strong equivalence to Boolean logic, 322-324
  strong equivalence to Frege logic, 324
universal algebra, viii, 10
  and algebraic logic, 10
universal freedom, 44, 132
  and unique decomposition, 132
UF(n), 45
UFK(n), 45
unsatisfiable, 170, 326
ub(S), 64
upper bound, 64
upper identity elements, 206
upper semi-Boolean algebra, 349
Urquhart, A., 224, 230, 235, 277, 288, 318, 439, 443
USBA, 349
Vaccaro, V., 442
validity, 7, 170
  in a frame, 363
  in a matrix, 210
valuation, 155-158, 162-172, 189
  admissibility of, 158
  induced by an interpretation, 157
V(⊢), 200
V(L, M), 157
van Benthem, J. F. A. K., 286
van Fraassen, B. C., 158, 178, 322
varieties theorem, 36, 76, 82
variety, 36, 323, 350
  and distributive lattices, 97
  and distributoids, 412
  and gaggles, 412
  and lattices, 73
  and modular lattices, 103
  and semi-lattices, 72
Vickers, S., 438
Wajsberg algebra, 442
Wansing, H., 417
Ward, M., 74, 406
weak asymmetry, 56
weak modularity, 102
weakly connected, 59
Wechler, W., 51, 74
WI=, 40
Whitehead, A. N., 2, 184, 421
WII, 47
Wittgenstein, L., 153
WIK, 46
Wójcicki, R., 226, 237
Woodruff, P., 262
word algebra, 33
  and freedom in groupoids, 40
Zorn's lemma, 30, 297, 316, 395, 404, 426, 429, 436