Journal of Semantics 22: 119–128 doi:10.1093/jos/ffh030
Modality and Temporality CLEO CONDORAVDI Palo Alto Research Center STEFAN KAUFMANN Northwestern University
© The Author 2005. Published by Oxford University Press. All rights reserved.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
The present collection addresses a number of issues in the semantic interpretation of modal and temporal expressions. Despite the variety the papers exhibit both in the selection of topics and the choice of formal frameworks, they are interconnected through several overarching themes that are at the centre of much ongoing research. The purpose of this brief introduction is to put the papers into context and draw the reader’s attention to some of these connections. The topics we will discuss in the remainder are: counterfactuals, causality, partiality, compositionality of conditionals, and context dependence.
COUNTERFACTUALS
Any theory of counterfactuals has to grapple with the fact that judgments about their truth or falsehood cannot be explained in terms of logical relations alone. Invariably, such judgments appear to draw on additional assumptions about non-logical dependencies between facts or propositions. This gives rise to some of the hardest problems in the theory of conditionals: What is the nature of these relations between facts or propositions? Are they all of one kind, or are different relations relevant for different counterfactuals? Can they be analysed without circular reference to counterfactuals? And just how much of this additional information needs to be incorporated into the formal semantic theory? The most prominent semantic approach to counterfactuals is ordering semantics, developed by Stalnaker (1968) and Lewis (1973b). Both authors were concerned with providing a logical theory of the special inferential relationships between counterfactuals, a purpose for which the classical Fregean material conditional is famously inadequate. Ignoring differences in detail, both rely on a model theory in which a notion of similarity between possible worlds plays a central role. For instance, the truth value of (1a) at a world at which the match was not scratched is determined by the truth value of its consequent at the most similar world(s) at which it was.
(1) a. If that match had been scratched, it would have lighted.
    b. If the match had been wet and scratched, it would have lighted.
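As a rough illustration of how a similarity-based evaluation treats the pair in (1), the following sketch evaluates both sentences in a four-world toy model. The atoms, the law linking them, and above all the distance measure (which counts only differences in `scratched` and `wet`) are assumptions made for this example; they are not part of Stalnaker's or Lewis's theories.

```python
from itertools import product

# Hypothetical atoms for the match example.
ATOMS = ("scratched", "wet", "lit")
# Assumption: only these facts count toward similarity; `lit` is fixed by the law.
BASIC = ("scratched", "wet")

# Toy law (an assumption): a match lights iff it is scratched while dry.
WORLDS = [
    dict(zip(ATOMS, values))
    for values in product([False, True], repeat=3)
    if values[2] == (values[0] and not values[1])
]
# The world of evaluation: the match is unscratched, dry and unlit.
ACTUAL = {"scratched": False, "wet": False, "lit": False}


def distance(world):
    """Crude stand-in for similarity: number of basic facts differing from ACTUAL."""
    return sum(world[a] != ACTUAL[a] for a in BASIC)


def would(antecedent, consequent):
    """True iff the consequent holds at all closest antecedent-worlds."""
    candidates = [w for w in WORLDS if antecedent(w)]
    if not candidates:
        return True  # vacuous truth for an impossible antecedent
    closest = min(distance(w) for w in candidates)
    return all(consequent(w) for w in candidates if distance(w) == closest)


def scratched(w):
    return w["scratched"]


def wet_and_scratched(w):
    return w["wet"] and w["scratched"]


def lit(w):
    return w["lit"]


sentence_1a = would(scratched, lit)
sentence_1b = would(wet_and_scratched, lit)
print(sentence_1a, sentence_1b)  # True False
```

On this toy measure (1a) comes out true and (1b) false, so strengthening the antecedent does not preserve truth. Counting the atom `lit` toward distance as well would produce a tie between the dry and the wet scratched-world, which is one way of seeing how much depends on the choice of similarity relation.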
The resulting theory correctly invalidates certain inferences involving counterfactuals, such as that from (1a) to (1b). Under the classical interpretation this inference would be valid; but clearly (1a) can be true while (1b) is false. In accounting for the logical behaviour of counterfactuals, ordering semantics was a significant step forward and must be considered an unqualified success. At the same time, many authors have voiced doubts as to whether, in itself, it really amounts to a semantic analysis, given that it delegates the most difficult questions to the unanalysed similarity relation. Thus van Fraassen (1976) noted: ‘To the question what principles govern deductive reasoning involving conditionals, Stalnaker and Lewis give exact answers. But the validity of an argument does not depend on whether its premises are true; and indeed, Stalnaker and Lewis have not notably increased our ability to decide whether particular conditionals are true or false’ (p. 266). To this day, there is little consensus on the question of how the interpretation of counterfactuals depends on the facts or, more semantically put, the truth values of atomic and truth-functional sentences. An early statement of this question is due to Goodman (1947), who saw no way of giving a non-circular logical explanation for the fact that speakers consistently judge (1a) true and (2) false.
(2) If that match had been scratched, it would have been wet.
The consequent of (1a) follows from the antecedent and the fact that the match was dry (ignoring certain other factors, such as the presence of oxygen). The consequent of (2) follows from the same antecedent and the fact that the match did not light. Why do most speakers consider the fact that the match was dry, but not the fact that the match did not light, relevant to their deliberations of what would have been the case if it had been scratched? Within ordering semantics, the question becomes how exactly the similarity relation ought to be specified, and whether it can be reduced to some more basic notion. The proponents of the theory have made only tentative suggestions in this regard. Lewis (1973a, 1979) proposed to use judgments about counterfactuals as empirical evidence about the way speakers assess similarity, and put forth his well-known hierarchy of ‘miracles’, which he conceived of as more or less drastic deviations from the actual course of events. Stalnaker (1968, 1984) offered his
‘projection strategy’ as an alternative which gives more importance to epistemic considerations, explaining similarity between worlds in terms of the logic of belief update and revision. A third possible strategy would be to take certain dependencies between facts in a possible world as basic and formulate the semantic analysis of counterfactuals in terms of those. One framework in which this latter approach has been explored is premise semantics, originally proposed by Veltman (1976) and Kratzer (1981). Here the semantics of counterfactuals makes reference to premise sets, maximally consistent sets of propositions compatible with the antecedent. Premise semantics provides a way to formulate hypotheses as to which truths are given up in evaluating counterfactuals. Broadly speaking, the idea is that speakers do not view facts as mutually independent: some (but not all) facts are affected by manipulations of other facts in hypothetical reasoning. Formally, this means that some logically possible sets of premises may be irrelevant to the truth or falsehood of a given counterfactual. The question then is what determines the selection of the relevant premise sets. For Veltman (1976), this selection was driven by epistemic preferences. Kratzer (1981, 1989, 2002), on the other hand, has tried to define it in terms of assumptions about the internal structure of the world at which the counterfactual is evaluated. Two papers and one discussion note in this collection address directly the semantics of counterfactuals and the proper construction of premise sets. ‘On the Lumping Semantics of Counterfactuals’, by Makoto Kanazawa, Stefan Kaufmann and Stanley Peters, discusses Kratzer’s (1989) attempt to cast the intuition about the dependence between facts into a precise logical form and relate it to assumptions speakers appear to make about the structure of the world. 
Kratzer’s proposal and the underlying situation-semantic apparatus have been influential in the field, being based on intuitions that many authors find plausible. What Kanazawa et al. show is that despite its appeal, the approach is plagued by certain logical problems in its formal implementation, which are not obvious at first but lead to a number of unwelcome consequences. It is the particular formalization of lumping developed in Kratzer (1989), along with the workings of premise semantics, that leads to the triviality problems discussed in the paper. Kanazawa et al. leave open the question of whether the prima facie plausible idea can be preserved in a modified version of lumping semantics, or whether an altogether different approach is called for. This question is important, and the theory deserves that it be resolved. Kanazawa et al. show where the cracks run in the logical foundation of the most explicit formalization
currently available. Angelika Kratzer, in her reaction paper ‘Constraining Premise Sets for Counterfactuals’, argues that by further developing the theory in directions she has indicated in more recent work, it may ultimately be possible to avoid the problems. In ‘Making Counterfactual Assumptions’, Frank Veltman gives the beginnings of a compositional analysis of counterfactuals and a new version of premise semantics for counterfactuals. Veltman takes seriously the observation, made by Tichý (1976), that local mismatches of facts can make for big differences in the truth of counterfactuals. Like Kratzer, he aims for a semantics of counterfactuals that pins down more precisely which facts count and which do not in evaluating the consequences of a hypothetical assumption. He makes a crucial distinction between particular facts and general laws; the latter do not depend on the particular world of evaluation. The counterfactual assumptions an agent can make are limited to those that are compatible with the laws. In Veltman’s formal system, situations assign truth values to atomic propositions, and laws complete situational bases into worlds. A basis for a world is a situation which contains all and only the basic, mutually independent facts distinguishing that world from others. A situation smaller than a basis, on the other hand, can ‘grow’ into different possible worlds. It is situations of this kind that determine the truth values of counterfactuals. A counterfactual with a false antecedent (the only case Veltman considers) is evaluated at a given world by reducing a basis of that world to a situation which admits the antecedent. This process involves the removal of some propositions, which, Veltman maintains, take others in their train: When a proposition is retracted from a world, all the independent facts that led to its truth, as well as its consequences under the laws, are retracted as well.
Veltman in fact formulates the meaning of counterfactuals in terms of update conditions on belief states, whereas our informal description here is given in truth conditional terms. As Veltman notes, update distributes over worlds in this way only when the laws are fixed. Thus the reader should be alerted that our description here covers only this special case.
CAUSALITY
Kratzer’s appeal to ‘lumping’ and Veltman’s notion of some facts ‘bringing others in their train’ are but two ways of placing constraints on counterfactual inferences. Another solution appeals to causal relations, which in recent years has risen to new prominence in adjacent fields, such as artificial intelligence and psychology (Ortiz, 1999a,b; Pearl, 2000), and was applied in the interpretation of conditionals by Kaufmann (2005) and
others. Here again, we face unresolved foundational questions: What are the relata of causal relations? What are their logical properties? How do they enter into speakers’ reasoning about particular sentences? How are they utilized in the absence of full knowledge about the relevant facts? The importance of an adequate notion of causality is by no means restricted to counterfactuals. Jerry Hobbs, in the paper ‘Toward a Useful Concept of Causality for Lexical Semantics’, starts out by noting that it is required for the analysis of a wide variety of expressions, and moves on to propose an account of its logical properties that is independent of any particular linguistic application. Central to the proposal is the notion of a causal complex, a collection of eventualities which in their totality are responsible for the effect, and none of which is irrelevant to the occurrence of the effect. Sceptical about the possibility of giving a complete definition of the concept, Hobbs’ goal is to identify general conditions on causal complexes that support linguistically relevant inferences. Using techniques from non-monotonic logic, Hobbs weakens inferences which go against the direction of causality to ensure that they do not lead to counterintuitive consequences in causal reasoning. Along the way, he develops a number of auxiliary notions that should prove independently interesting, especially that of a closest world, which is similar but not equivalent to that of the Stalnaker/Lewis theory of counterfactuals. This raises new questions, such as what exactly the relationship is and whether Hobbs’ notion of closeness may offer a causal explication of Stalnaker’s.
FROM EVENT DESCRIPTIONS AND TIME TO WORLDS
Both Veltman and Hobbs make use of partial entities (situations in one case, eventualities in the other) as well as worlds, and largely abstract away from time. Veltman takes worlds and situations to be total and partial functions, respectively, from atomic formulas to truth values. Hobbs appeals to an ontology of eventualities, construed rather broadly as facts that may or may not hold in a particular world. For example, in addition to the eventuality of some basic property holding of an individual, there are negative eventualities (the eventuality of another eventuality not existing) and what we may call modal eventualities, such as the eventuality of another eventuality being hypothetical. It is fair to say that the exact nature of a model-theoretic interpretation for such ontological entities is an open question. Hobbs considers time with regard to temporal order and causal flow, but does not in general
address change through time, i.e. the fact that both an eventuality and its negation can be realized in a world at different times. In the paper ‘Schedules in a Temporal Interpretation of Modals’, Tim Fernando approaches temporal matters constructively, treating eventuality descriptions rather than worlds as primitives. He formulates both worlds and eventualities as relations between time and eventuality descriptions, relations he calls schedules. Schedules ground eventuality descriptions in time, and, as such, amount to temporal realizations of eventuality descriptions. Insofar as eventuality descriptions can be understood as intensional notions and schedules as extensional notions, Fernando’s formation of schedules from eventuality descriptions reverses the Montagovian tradition of deriving intensions from extensions (by abstracting over a world parameter). Of particular interest in the paper is the fact that schedules satisfy not just atomic eventuality descriptions, but also descriptions with temporal and modal operators. Fernando argues that satisfaction need not rest on worlds, even for modal formulas involving epistemic or historical alternatives, and offers a reformulation of the temporal interpretation of modals proposed in Condoravdi (2002). The paper generalizes the notion of eventive, stative and temporal properties by defining a satisfaction predicate (‘forcing’) that is persistent relative to a partial order on schedules. Persistence then allows one to reconstruct worlds from certain so-called generic sets of schedules.
A COMPOSITIONAL SEMANTICS FOR INDICATIVE CONDITIONALS
It has been widely accepted since the work of Lewis (1975) and Kratzer (1979) that the semantic contribution of if-clauses is to restrict the domain of an overt or covert modal operator with scope over the consequent clause. The question remains, however, how this restriction comes about in the process of compositional interpretation. Von Fintel (1994) proposed that if-clauses may act as modifiers of consequent clauses, but left open the details of a compositional analysis. Another largely unaddressed issue regarding the meaning of indicative conditionals is the interpretation of Present and Past tenses in their antecedents and consequents (a notable exception is Crouch 1993). The variety of temporal readings and the semantic interdependence between the tenses in antecedent and consequent are illustrated by (3) and (4).
(3) a. If he comes out smiling, the interview went well.
    b. If he came out smiling, the interview went well.
    c. If he went in smiling, the interview will go well.
(4) a. If he is at the interview (now/when we call him), he will be late for the meeting.
    b. If he is at the interview, the interviews are on schedule.
Note, for instance, that the Past tense in the consequent of (3a) indicates backshifting from a future time, whereas those in the antecedents of (3b,c) and in the consequent of (3b), on their most natural interpretations, indicate backshifting from the time of utterance. Similarly, the Present tense in the antecedent of (3a) calls for a forwardshifted interpretation, while that in the antecedent and consequent of (4b) indicates overlap with the time of utterance and that in the antecedent of (4a) is compatible with either interpretation. In ‘Conditional Truth and Future Reference’, Stefan Kaufmann proposes a compositional semantics for indicative conditionals which brings together the modal and temporal elements of their interpretation. He treats if-clauses as modifiers of the consequent, with the desired effect of restricting the modal base associated with the latter. Regarding temporal interpretation, he makes the (at first sight striking) claim that the tenses in the antecedent and consequent of indicative conditionals receive the same interpretation as in isolation, and demonstrates that this assumption helps explain a number of otherwise puzzling facts. He shows how the same basic meaning can give rise to predictive and nonpredictive, metaphysical and doxastic readings, depending on contextual parameters and the choice of modal base for the consequent, and explains why non-predictive readings tend to be associated with doxastic modality. The main technical innovation of Kaufmann’s paper is to take apart the various parameters involved in the interpretation of a modal and then have them enter the process of compositional interpretation separately at different stages, rather than being fixed once at the top level by the context. In this way, a modal can be transformed into a modal-temporal operator by an if-clause.
CONTEXT DEPENDENCE AND DYNAMIC INTERPRETATION
Moving above the level of individual sentences, we face the inextricable context dependence of modal and conditional expressions. Not only is their interpretation determined and constrained by a variety of contextual parameters, but they in turn operate on the context, affecting the interpretation of subsequent utterances. The cross-sentential dependencies that result from such interactions are
subsumed under the label ‘modal subordination’. Some examples are given in (5) through (7).
(5) a. A thief might come in.
    b. He would take the silver.
(6) a. Mary may come to the party.
    b. Sue may come, too.
(7) a. I don’t have a car.
    b. It would be parked outside.
In each of these mini-sequences, the (b)-sentence carries certain presuppositions (triggered by he in (5b), too in (6b), it in (7b)). One question that such examples raise is why the second sentence is felicitous in the given context, even though on the face of it, its presuppositions are not satisfied: For instance, the existence of a thief is not asserted in (5a), nor does (6a) assert that Mary (or anyone else) is coming to the party. Furthermore, (5b) and (7b) are intuitively interpreted as the consequents of conditionals whose antecedents are (5a) and (7a), respectively. A variety of proposals for dealing with this phenomenon have been put forth in the literature. Common to all of them is a dynamic perspective on the interaction between sentences and their contexts of interpretation, but regarding the nature of this interaction, we can discern several different approaches. Roberts (1989) appealed to complex inferences and accommodation to explain how sentences like (5b) are interpreted. Others view the dependency as essentially anaphoric (Frank 1996; Geurts 1998; Kibble 1998). Such approaches have been criticized for being too unconstrained and unable to account for the fact that the dependency is, with very few exceptions, limited to the immediately preceding discourse context. Still others draw a crucial distinction between those contexts which result from an update with a licensing expression (such as 5a–7a) and give rise to modal subordination for that reason, and those contexts which do not. Kaufmann (2000) made one such proposal, arguing that this approach is better suited to account for the locality of modal subordination. In the paper ‘A Modal Analysis of Presupposition and Modal Subordination’, Robert van Rooij offers a novel approach in the same conceptual vein, addressing a wider range of data and employing a leaner formal framework. Contexts are represented as modal accessibility relations. 
An update with a modally subordinating expression results in a context with special properties (rather than a stack of multiple contexts, as Kaufmann would have it). The evaluation of presupposition-carrying sentences in such contexts is spelled out in a two-dimensional framework which is inspired by Karttunen and Peters’ (1979) account of conventional implicature, but not subject to certain well-documented problems afflicting the latter. The proposal should therefore be of independent interest.
Acknowledgments
We would like to express our thanks to the eight contributors, twelve anonymous reviewers, and above all to Peter Bosch, the former managing editor of this journal, for his support and patience in seeing the project through. We would also like to thank his successor, Bart Geurts, who oversaw the last stages of this project.

CLEO CONDORAVDI Palo Alto Research Center 3333 Coyote Hill Drive Palo Alto CA 94304, USA e-mail: [email protected]
STEFAN KAUFMANN Department of Linguistics Northwestern University 2016 Sheridan Road Evanston IL 60208, USA e-mail: [email protected]
Submitted and accepted: 20.02.05 Final version received: 22.02.05

REFERENCES
Condoravdi, C. (2002) ‘Temporal interpretation of modals: Modals for the present and for the past’. In D. I. Beaver, L. Casillas, B. Clark, & S. Kaufmann (eds), The Construction of Meaning. CSLI Publications, Stanford, CA, 59–88.
Crouch, R. (1993) The temporal properties of English conditionals and modals. PhD thesis, University of Cambridge.
Faller, M., Pauly, M., & Kaufmann, S. (eds) (2000) Formalizing the Dynamics of Information. CSLI Publications, Stanford, CA.
von Fintel, K. (1994) Restrictions on quantifier domains. PhD thesis, University of Massachusetts.
van Fraassen, B. C. (1976) ‘Probabilities of conditionals’. In W. L. Harper, R. Stalnaker, & G. Pearce (eds), Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science, volume 1 of The University of Western Ontario Series in Philosophy of Science. D. Reidel, 261–308.
Frank, A. (1996) Context dependence in modal constructions. PhD thesis, Institut für maschinelle Sprachverarbeitung, Stuttgart.
Geurts, B. (1998) ‘Presuppositions and anaphora in attitude contexts’. Linguistics and Philosophy 21:545–601.
Goodman, N. (1947) ‘The problem of counterfactual conditionals’. The Journal of Philosophy 44:113–128.
Karttunen, L. & Peters, S. (1979) ‘Conventional implicature’. In C.-K. Oh & D. Dinneen (eds), Presupposition, volume 11 of Syntax and Semantics. Academic Press, New York, NY, 1–56.
Kaufmann, S. (2000) ‘Dynamic discourse management’. In M. Faller, M. Pauly, & S. Kaufmann (eds), Formalizing the Dynamics of Information. CSLI Publications, Stanford, CA, 171–188.
Kaufmann, S. (2005) ‘Conditional predictions: A probabilistic account’. Linguistics and Philosophy. To appear.
Kibble, R. (1998) ‘Modal subordination, focus and complement anaphora’. In J. Ginzburg, Z. Khasidashvili, C. Vogel, J.-J. Lévi, & E. Vallduví (eds), The Tbilisi Symposium on Language, Logic and Computation: Selected Papers. CSLI Publications, Stanford, CA, 71–84.
Kratzer, A. (1979) ‘Conditional necessity and possibility’. In U. Egli, R. Bäuerle, & A. von Stechow (eds), Semantics from Different Points of View. Springer, Berlin/New York, NY, 117–147.
Kratzer, A. (1981) ‘Partition and revision: The semantics of counterfactuals’. Journal of Philosophical Logic 10:201–216.
Kratzer, A. (1989) ‘An investigation of the lumps of thought’. Linguistics and Philosophy 12:607–653.
Kratzer, A. (2002) ‘Facts: Particulars or information units?’ Linguistics and Philosophy 25:655–670.
Lewis, D. (1973a) ‘Causation’. Journal of Philosophy 70:556–567.
Lewis, D. (1973b) Counterfactuals. Harvard University Press, Cambridge, MA.
Lewis, D. (1975) ‘Adverbs of quantification’. In E. Keenan (ed.), Formal Semantics of Natural Language. Cambridge University Press, Cambridge/New York, NY, 3–15.
Lewis, D. (1979) ‘Counterfactual dependence and time’s arrow’. Noûs 13:455–476.
Ortiz, C. (1999a) ‘A commonsense language for reasoning about causation and rational action’. Artificial Intelligence 111:73–130.
Ortiz, C. (1999b) ‘Explanatory update theory: Applications of counterfactual reasoning to causation’. Artificial Intelligence 108:125–178.
Pearl, J. (2000) Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge/New York, NY.
Roberts, C. (1989) ‘Modal subordination and pronominal anaphora in discourse’. Linguistics and Philosophy 12:683–721.
Stalnaker, R. (1968) ‘A theory of conditionals’. In J. W. Cornman (ed.), Studies in Logical Theory, American Philosophical Quarterly, Monograph: 2. Blackwell, Oxford, 98–112.
Tichý, P. (1976) ‘A counterexample to the Lewis-Stalnaker analysis of counterfactuals’. Philosophical Studies 29:271–273.
Veltman, F. (1976) ‘Prejudices, presuppositions and the theory of conditionals’. In J. Groenendijk & M. Stokhof (eds), Proceedings of the First Amsterdam Colloquium [=Amsterdam Papers in Formal Grammar, Vol. 1], Centrale Interfaculteit, Universiteit van Amsterdam, 248–281.
Journal of Semantics 22: 129–151 doi:10.1093/jos/ffh027 Advance Access publication March 29, 2005
On the Lumping Semantics of Counterfactuals
MAKOTO KANAZAWA National Institute of Informatics STEFAN KAUFMANN Northwestern University STANLEY PETERS Stanford University
© The Author 2005. Published by Oxford University Press. All rights reserved.
Abstract
Kratzer (1981) discussed a naïve premise semantics of counterfactual conditionals, pointed to an empirical inadequacy of this interpretation, and presented a modification, partition semantics, which Lewis (1981) proved equivalent to Pollock’s (1976) version of his ordering semantics. Subsequently, Kratzer (1989) proposed lumping semantics, a different modification of premise semantics, and argued it remedies empirical failings of ordering semantics as well as of naïve premise semantics. We show that lumping semantics yields truth conditions for counterfactuals that are not only different from what she claims they are, but also inferior to those of the earlier versions of premise semantics.
1 INTRODUCTION
Counterfactuals pose some of the most recalcitrant problems for truth-conditional semantic analysis. The long and rich tradition of writings on this topic, despite substantial advances in many directions, has so far failed to deliver a formally explicit and intuitively accurate account of how their truth conditions depend on those of their constituents and other non-conditional sentences. Among the most influential writings in this area are those of Kratzer (1981, 1989), the latest of which puts forward a theory centred around the novel notion of lumping, which, she argues, solves a number of problems with previous accounts. Given the initial appeal of the use of lumping in Kratzer’s (1989) semantics and its wide influence in linguistics, it is both surprising and worth pointing out that it seems to be in fundamental conflict with other features of her semantics, depriving the theory of much of its predictive power. In this paper, we carefully examine the logical
consequences of the lumping semantics, and show that the predictions that it makes about counterfactuals are quite different from the ones Kratzer ascribes to it. Although we do not offer a counterproposal of our own, we hope our analysis proves useful for any future attempts to develop a viable theory of counterfactuals that makes crucial use of a notion like lumping. We can best explain Kratzer’s motivations for her 1989 theory as well as present our formal analysis of it by contrasting it with the two earlier theories of counterfactuals discussed in Kratzer 1981. The three theories are all closely related and belong to the class of premise semantics. Each of the three interpretations recognizes dual counterfactual connectives, the would-conditional and the might-conditional, for which we introduce corresponding pairs of binary connectives $\Box\!\!\rightarrow$ and $\Diamond\!\!\rightarrow$, respectively. We let the six connectives
(1) a. $\Box\!\!\rightarrow^{n}$, $\Diamond\!\!\rightarrow^{n}$
    b. $\Box\!\!\rightarrow^{p}$, $\Diamond\!\!\rightarrow^{p}$
    c. $\Box\!\!\rightarrow^{l}$, $\Diamond\!\!\rightarrow^{l}$
denote the paired conditionals under the three semantic interpretations. Our main result is that, under certain conditions, the lumping semantics of Kratzer (1989) is truth-functional. Specifically, $\varphi \mathrel{\Box\!\!\rightarrow^{l}} \psi$ is equivalent to the material conditional $\varphi \rightarrow \psi$; and $\varphi \mathrel{\Diamond\!\!\rightarrow^{l}} \psi$ is equivalent to the conjunction $\varphi \wedge \psi$. It suffices to describe the propositional language since the critical problem with the lumping semantics for conditionals already arises in this case. Models $M$ will be ordered pairs $\langle W, V \rangle$ of a non-empty set $W$ of possible worlds and a function $V$ mapping propositional variables to subsets of $W$. For each $w \in W$ and propositional variable $p$, $[\![p]\!]^M_w = 1$ if $w \in V(p)$ and $[\![p]\!]^M_w = 0$ if $w \notin V(p)$. Below we will refer to propositions by variable names, writing ‘p’ instead of ‘V(p)’. The semantics of truth-functional connectives is as usual: For all formulas $\varphi$, $\psi$ and $w \in W$, we set
(2) $[\![\varphi \wedge \psi]\!]^M_w = 1 \Leftrightarrow [\![\varphi]\!]^M_w = [\![\psi]\!]^M_w = 1$
(3) $[\![\varphi \vee \psi]\!]^M_w = 0 \Leftrightarrow [\![\varphi]\!]^M_w = [\![\psi]\!]^M_w = 0$
(4) $[\![\varphi \rightarrow \psi]\!]^M_w = 0 \Leftrightarrow [\![\varphi]\!]^M_w = 1$ and $[\![\psi]\!]^M_w = 0$
(5) $[\![\neg\varphi]\!]^M_w = 1 \Leftrightarrow [\![\varphi]\!]^M_w = 0$
We suppress the superscript henceforth because no confusion can arise.
2 PRELIMINARIES
2.1 Background
Most current theories of conditionals are based on a simple intuition: A conditional asserts that its consequent follows when its antecedent is added to a certain body of premises. This idea was first made explicit in Ramsey’s (1929) influential statement about indicative conditionals, which inspired much subsequent work (cf. Stalnaker, 1968). It is also at the center of Goodman’s (1947) theory of counterfactuals, a close predecessor of Kratzer’s premise semantics. Goodman noted about examples like (6) that while they generally assert that some connection holds between the propositions expressed by their constituent clauses, it is rarely the case that the second follows from the first.
(6) If that match had been scratched, it would have lighted.
The scratching of the match does not in itself guarantee its lighting: In addition, oxygen has to be present, the match has to be dry, etc. ‘The first problem’ in the interpretation of counterfactuals, Goodman writes, ‘is . . . to specify what sentences are meant to be taken in conjunction with the antecedent as a basis for inferring the consequent’.1 Clearly, for instance, sentences which contradict the antecedent should be excluded, since otherwise many false counterfactuals would come out vacuously true. Less obvious, but far more vexing to Goodman, is the fact that speakers consistently exclude other sentences for non-logical reasons. Why, for instance, is it easy to believe that (6) is true, but unnatural to conclude (7) from the fact that the match did not light?
(7) If the match had been scratched, it would have been wet.
Goodman was unable to offer an answer to this and related questions that would not make circular reference to counterfactuals: His rule bluntly calls for the selection of those true sentences that would not be false if the antecedent were true. However, his suggestions inspired much subsequent work by authors who continued to grapple with the problem (Rescher 1964; Veltman 1976; Pollock 1981, and others). Kratzer’s writings on premise semantics contribute to this line of research.
1 Goodman’s second problem, that of defining ‘natural or physical or causal laws’, will not concern us in this paper.
132 On the Lumping Semantics of Counterfactuals
2.2 Basic apparatus

Central to Kratzer’s theory is the notion of a premise set. Intuitively, the premise sets associated with a counterfactual at a possible world $w$ represent ways of adding sentences that are true at $w$ to the antecedent, maintaining consistency. We write $\mathrm{Prem}_{w}(\varphi)$ for the set of premise sets associated with $\varphi$ at world $w$. $\mathrm{Prem}_{w}(\varphi)$ determines the truth values at $w$ of both would-counterfactuals and might-counterfactuals with antecedent $\varphi$. Kratzer’s truth conditions can be reproduced as follows.2

Definition 1 (would-counterfactual)
$[\![\varphi \Box\!\!\to \psi]\!]^{w} = 1$ iff $\forall X \in \mathrm{Prem}_{w}(\varphi)\, \exists Y \in \mathrm{Prem}_{w}(\varphi)\, [X \subseteq Y \wedge \bigcap Y \cap W \subseteq \psi]$

‘The would-counterfactual $\varphi \Box\!\!\to \psi$ is true at $w$ if and only if every set in $\mathrm{Prem}_{w}(\varphi)$ has a superset in $\mathrm{Prem}_{w}(\varphi)$ which entails $\psi$.’

Definition 2 (might-counterfactual)
$[\![\varphi \Diamond\!\!\to \psi]\!]^{w} = 1$ iff $\exists X \in \mathrm{Prem}_{w}(\varphi)\, \forall Y \in \mathrm{Prem}_{w}(\varphi)\, [X \subseteq Y \rightarrow \bigcap Y \cap \psi \cap W \neq \emptyset]$

‘The might-conditional $\varphi \Diamond\!\!\to \psi$ is true at $w$ if and only if there is a set in $\mathrm{Prem}_{w}(\varphi)$ all of whose supersets in $\mathrm{Prem}_{w}(\varphi)$ are consistent with $\psi$.’

Remark 1
$\varphi \Box\!\!\to \psi$ iff $\neg(\varphi \Diamond\!\!\to \neg\psi)$, as intended.

All versions of Kratzer’s theory follow this schema. The difference lies in the definition of $\mathrm{Prem}_{w}$. In all three versions, $\mathrm{Prem}_{w}$ depends on a parameter $f(w)$ which identifies the set of propositions relevant to the truth of counterfactuals at $w$. Kratzer showed that the most naïve implementation of the account is empirically inadequate and sought to improve on it by imposing further conditions on membership in $\mathrm{Prem}_{w}(\varphi)$. We will discuss three versions of the theory below, distinguishing between them using superscripts: $\mathrm{Prem}^{n}$, $\mathrm{Prem}^{p}$ and $\mathrm{Prem}^{l}$ give rise to $\Box\!\!\to^{n}$, $\Box\!\!\to^{p}$ and $\Box\!\!\to^{l}$, respectively.

2 The intersection with $W$ is redundant as long as the universe of the model consists only of worlds. We include it here for the sake of generality because the definitions for lumping semantics below will employ a richer ontology.
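As a rough illustration of Definitions 1 and 2, the following Python sketch evaluates both conditionals over an explicitly given collection of premise sets. The encoding is an assumption made here and is not part of Kratzer’s formulation: worlds are arbitrary hashable values, propositions are frozensets of worlds, and the helper names `meet`, `would` and `might`, as well as the toy data, are hypothetical.

```python
def meet(props, W):
    """Intersect a set of propositions; the empty set of propositions yields W."""
    out = frozenset(W)
    for p in props:
        out &= p
    return out


def would(prem, psi, W):
    """Definition 1: every X in prem has a superset Y in prem that entails psi."""
    return all(any(X <= Y and meet(Y, W) <= psi for Y in prem) for X in prem)


def might(prem, psi, W):
    """Definition 2: some X in prem has only supersets in prem consistent with psi."""
    return any(all(meet(Y, W) & psi for Y in prem if X <= Y) for X in prem)


# Toy data (hypothetical): three worlds, antecedent phi = {1, 2}.
W = frozenset({0, 1, 2})
phi = frozenset({1, 2})
prem = [frozenset({phi}), frozenset({phi, frozenset({0, 1})})]
psi = frozenset({1})
not_psi = W - psi

# Remark 1 (duality) checked on this one example.
duality_holds = would(prem, psi, W) == (not might(prem, not_psi, W))
```

The functions take the collection of premise sets as an argument, so the same two definitions can be reused with any of the three ways of constructing premise sets discussed in the paper.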
3 NAÏVE PREMISE SEMANTICS

In the simplest version of the account, the set $\mathrm{Prem}^{n}_{w}(\varphi)$ of premise sets associated with antecedent $\varphi$ at $w$ represents all possible ways of adding true sentences to the antecedent, maintaining consistency. Thus the only conditions imposed on each member $X$ of $\mathrm{Prem}^{n}_{w}(\varphi)$ are that (i) all propositions in $X$ other than $\varphi$ be true at $w$; (ii) $X$ be consistent; and (iii) $\varphi$ be in $X$. More concisely:

Definition 3 (Naïve premise set)
$\mathrm{Prem}^{n}_{w}(\varphi) = \{X \subseteq f(w) \cup \{\varphi\} \mid \bigcap X \neq \emptyset \text{ and } \varphi \in X\}$, where $f(w) = \{p \in \mathcal{P}(W) \mid w \in p\}$.3

The truth conditions for the connectives $\Box\!\!\to^{n}$ and $\Diamond\!\!\to^{n}$ are as given by Definitions 1 and 2, respectively, where $\mathrm{Prem}^{n}_{w}(\varphi)$ is substituted for $\mathrm{Prem}_{w}(\varphi)$.4 Kratzer (1981) discusses at some length the implications of this definition, in particular the predictions it makes about the truth values of would-counterfactuals. It turns out that naïve premise semantics, which she considers the ‘most intuitive’ analysis of counterfactuals, is deeply flawed. Suppose $w \in W$ is like the actual world in that the Atlantic Ocean is not drying up, and suppose further that Paula is buying a pound of apples. Then the analysis predicts that (8a), interpreted as (8b), is true at $w$. Intuitively, however, sentence (8a) seems to be false in $w$.

(8) a. If Paula weren’t buying a pound of apples, the Atlantic Ocean might be drying up.
b. (Paula isn’t buying a pound of apples) $\Diamond\!\!\to^{n}$ (the Atlantic Ocean is drying up)

This unwelcome consequence is part of a much larger problem which is deeply entrenched in naïve premise semantics: For any false sentence $\psi$ that is consistent with the negation of a true sentence $\varphi$, $\neg\varphi \Diamond\!\!\to^{n} \psi$ is true. This fact follows from the following equivalences, which were first shown by Veltman (1976):

Proposition 1
(9) $\varphi \Box\!\!\to^{n} \psi \Leftrightarrow (\varphi \rightarrow \psi) \wedge (\neg\varphi \rightarrow \Box(\varphi \rightarrow \psi))$
(10) $\varphi \Diamond\!\!\to^{n} \psi \Leftrightarrow (\varphi \wedge \psi) \vee (\neg\varphi \wedge \Diamond(\varphi \wedge \psi))$

3 $\mathcal{P}(W)$ denotes the power set of $W$.
4 For $\Diamond\!\!\to^{n}$, see Kratzer (1979). Only $\Box\!\!\to^{n}$ is discussed in Kratzer (1981).
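The naïve construction in Definition 3 is small enough to be enumerated by brute force on finite models. The sketch below is only an illustration under assumptions made here: worlds are Python values, propositions are frozensets of worlds, the four-world model for example (8) and all function names are hypothetical, and necessity and possibility in (9) and (10) are read as truth in all, respectively some, worlds of the model.

```python
from itertools import combinations, product


def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def meet(props, W):
    """Intersect a set of propositions; the empty set of propositions yields W."""
    out = frozenset(W)
    for p in props:
        out &= p
    return out


def naive_prem(phi, w, W):
    """Definition 3: consistent subsets of f(w) ∪ {phi} that contain phi."""
    f_w = [p for p in powerset(W) if w in p]   # every proposition true at w
    others = [p for p in f_w if p != phi]
    return [X | {phi} for X in powerset(others) if meet(X | {phi}, W)]


def would(prem, psi, W):
    return all(any(X <= Y and meet(Y, W) <= psi for Y in prem) for X in prem)


def might(prem, psi, W):
    return any(all(meet(Y, W) & psi for Y in prem if X <= Y) for X in prem)


# Example (8): worlds are pairs (Paula buys apples, the Atlantic is drying up).
W4 = frozenset(product([True, False], repeat=2))
actual = (True, False)
no_apples = frozenset(v for v in W4 if not v[0])
drying = frozenset(v for v in W4 if v[1])
paula_might = might(naive_prem(no_apples, actual, W4), drying, W4)


def proposition_1_holds(W):
    """Check (9) and (10) for every world and every pair of propositions over W."""
    for w in W:
        for phi in powerset(W):
            prem = naive_prem(phi, w, W)
            for psi in powerset(W):
                rhs9 = (w not in phi or w in psi) and (w in phi or phi <= psi)
                rhs10 = (w in phi and w in psi) or (w not in phi and bool(phi & psi))
                if would(prem, psi, W) != rhs9 or might(prem, psi, W) != rhs10:
                    return False
    return True


veltman_ok = proposition_1_holds(frozenset({0, 1, 2}))
```

On this toy model `paula_might` comes out true, which is the unwelcome prediction for (8), and the exhaustive check over a three-world model agrees with both equivalences. A check on one small model is of course no substitute for the general proof.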
Thus at a world at which $\varphi$ is true, $\varphi \Box\!\!\to^{n} \psi$ and $\varphi \Diamond\!\!\to^{n} \psi$ are both materially equivalent to $\psi$, the consequent. This part seems reasonable and is shared with many other logics of conditionals. More problematic is that at a world at which $\varphi$ is false, $\varphi \Box\!\!\to^{n} \psi$ comes down to strict implication, and $\varphi \Diamond\!\!\to^{n} \psi$ to the statement that $\varphi$ and $\psi$ are logically consistent. The problem with (8), discussed above, follows from (10). Kratzer (1981) discusses a different but related problem which arises from the equivalence in (9). Naïve premise semantics is, alas, very naïve indeed.

4 PARTITION SEMANTICS
To address the above difficulties, Kratzer (1981) proposed a repair for naïve premise semantics. Rather than treating all true sentences equally for purposes of constructing premise sets, she argued, one has to take into account the fact that speakers, in interpreting counterfactuals, entertain a more coarse-grained conception of the world, analyzing it into agglomerations of facts rather than atomic truths. Formally, Kratzer assumes that only some of the propositions that are true at the world $w$ of evaluation are relevant to the truth of counterfactuals. These relevant propositions are determined by a partition function $f$. The only condition imposed on $f$ is that the propositions it selects, taken together, uniquely identify $w$.

Definition 4 (Partition function)
A function $f : W \rightarrow \mathcal{P}(\mathcal{P}(W))$ is a partition function if and only if for every $w \in W$, $\bigcap f(w) = \{w\}$.

The set of premise sets for partition semantics is defined in terms of $f$ in the same way as that for naïve semantics. In partition semantics, $f$ is supposed to be indeterminate and allowed to vary from context to context, so it constitutes a new parameter in the definition of the premise sets.

Definition 5 (Partition premise set)
Let $f$ be a partition function. Then $\mathrm{Prem}^{p}_{w}(\varphi) = \{X \subseteq f(w) \cup \{\varphi\} \mid \bigcap X \neq \emptyset \text{ and } \varphi \in X\}$.

The truth definitions of counterfactuals remain the same as in Definitions 1 and 2. The resulting truth values now depend, via $\mathrm{Prem}^{p}_{w}$, on the partition function. Naïve premise semantics is a special case of partition semantics. As before, at worlds at which $\varphi$ is true, the conditionals $\varphi \Box\!\!\to^{p} \psi$ and $\varphi \Diamond\!\!\to^{p} \psi$ are both materially equivalent to $\psi$.5 However, where the antecedent is false, the choice of $f$ determines whether they are equivalent to $\Box(\varphi \rightarrow \psi)$ and $\Diamond(\varphi \wedge \psi)$ (if $f(w) = \{p \in \mathcal{P}(W) \mid w \in p\}$ or $f(w) = \{\{w\}\}$ for all $w$) or to some other propositions. Kratzer suggests that, in practice, the range of possible partitions may be further restricted by our ‘modes of cognition’ (p. 211).

Lewis (1981) showed that this version of Kratzer’s semantics is equivalent to a version of his own ordering semantics in terms of similarity between possible worlds, as formulated by Pollock (1976).6 While this result attests to the significant expressive power of Kratzer’s theory, it also shows that the latter shares with ordering semantics a number of unwelcome features. Consider the following illustration, discussed in Kratzer (1989):

(11) Let a world $w$ be such that
a. a zebra escaped;
b. it was caged with another zebra;
c. a giraffe was also in the same cage.

In such a world, the sentence in (12a), interpreted as (12b), is predicted to be false, given the intuitive understanding of similarity between worlds.

(12) a. If a different animal had escaped, it might have been a giraffe.
b. (a different animal escaped) $\Diamond\!\!\to^{p}$ (it was a giraffe)

The reason behind this prediction is not hard to understand. Given that a zebra escaped in $w$, among all the possible worlds in which a different animal escaped, the ones where the other zebra escaped are more similar to $w$ than any where the escaped animal was of a different species. Intuitively, however, sentence (12a) seems true in $w$. The lesson from examples like this is that the relation of ‘similarity’ between worlds that yields the right truth conditions in ordering semantics does not always correspond to the most intuitive notion of similarity. But if the former is simply a theoretical construct, then ordering semantics cannot make concrete predictions about the truth values of particular counterfactuals about which our intuitions are relatively sharp (see Kratzer 1989; 626).

5 This is due to the requirement that $\bigcap f(w) = \{w\}$.
6 There are minor differences with Lewis’ original (1973) formulation, which he argues are immaterial for the resulting semantic theory.
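The dependence of partition semantics on the choice of $f$ when the antecedent is false can be seen on a very small model. The following sketch is hypothetical throughout: the four numbered worlds, the two candidate values for $f(w)$ and the function names are assumptions chosen for illustration, with propositions again encoded as frozensets of worlds.

```python
from itertools import combinations


def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def meet(props, W):
    """Intersect a set of propositions; the empty set of propositions yields W."""
    out = frozenset(W)
    for p in props:
        out &= p
    return out


def partition_prem(phi, f_w, W):
    """Definition 5: consistent subsets of f(w) ∪ {phi} that contain phi."""
    others = [p for p in f_w if p != phi]
    return [X | {phi} for X in powerset(others) if meet(X | {phi}, W)]


def would(prem, psi, W):
    return all(any(X <= Y and meet(Y, W) <= psi for Y in prem) for X in prem)


W = frozenset({0, 1, 2, 3})
w = 0
phi = frozenset({1, 2, 3})   # antecedent, false at w
psi = frozenset({1, 2})      # consequent; phi does not strictly imply psi

fine = [frozenset({0})]                            # f(w) = {{w}}
coarse = [frozenset({0, 1}), frozenset({0, 2})]    # a different f(w)

# Both candidates satisfy Definition 4 at w: their intersection is {w}.
both_are_partitions = all(meet(f_w, W) == frozenset({w}) for f_w in (fine, coarse))

would_fine = would(partition_prem(phi, fine, W), psi, W)
would_coarse = would(partition_prem(phi, coarse, W), psi, W)
```

With `fine` the would-conditional reduces to strict implication and is false here, whereas with `coarse` every premise set can be extended to one that entails the consequent, so the same conditional is true.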
5 LUMPING SEMANTICS

Examples like (12) above pose challenges to both premise semantics and ordering semantics. Kratzer (1989) set out to find further ways of refining the theory in order to solve these problems while preserving the advantages of partition semantics over naïve premise semantics. Her proffered solution is lumping semantics. This time, she changes both the set $f(w)$ of propositions relevant to the truth of counterfactuals and the way the set of premise sets is defined in terms of $f(w)$. The main point of her new strategy is to require premise sets to be closed under certain conditions, providing fewer opportunities for premise sets to be consistent with the consequent of a might-conditional, which addresses the problem of (8), while eliminating the bias toward a different zebra escaping rather than a giraffe, which addresses the problem of (12). To implement a suitable closure condition on premise sets, she introduces the concept of lumping, a relation between propositions relative to a possible world and fully determined by its internal structure. To represent this structure, Kratzer takes up the concept of a situation, introduced by Barwise and Perry (1983). Though she conceives of situations as partial worlds, she models them with total models (unlike Barwise and Perry), borrowing from the ontological inventory of David Armstrong’s theory of states of affairs (see Armstrong 1978, 1997). We will reproduce the details of Kratzer’s proposal only to the extent that they are needed for our discussion below. We start with the definition of a situation model.

Definition 6 (Situation Model)
A situation model is a triple $M = \langle S, \leq, V \rangle$, where $S$ is a non-empty set (of situations); $\leq$ is a partial order on $S$ satisfying the following condition: For all $s \in S$ there is a unique $s' \in S$ such that $s \leq s'$ and for all $s'' \in S$, if $s \leq s''$ then $s'' \leq s'$; $V$ is a function mapping propositional variables to subsets of $S$.

We will continue to use propositional variables to refer to their denotations, writing ‘$p$’ instead of ‘$V(p)$’. No confusion is likely to arise from this. Situations are the carriers of truth:

Definition 7 (Truth)
A proposition $p$ is true in a situation $s \in S$ if and only if $s \in p$.

Definition 6 ensures that for each situation $s$ there is a unique maximal situation $s'$ such that $s \leq s'$. We follow Kratzer in calling these maximal situations ‘worlds’.

Definition 8 (Worlds)
For each $s \in S$, let $w_s \in S$ be the maximal situation such that $s \leq w_s$. The set of worlds in $M$ is the set $W = \{w_s \mid s \in S\}$.

As Kratzer points out (p. 615), truth is the only logical property in whose definition the partiality of situations comes into play. Other notions are defined solely in terms of worlds. (In the following definitions, if $A = \emptyset$, we let $\bigcap A = S$.)

Definition 9 (Consistency)
A set of propositions $A \subseteq \mathcal{P}(S)$ is consistent if and only if $\bigcap A \cap W \neq \emptyset$.

Definition 10 (Logical consequence)
A proposition $p \in \mathcal{P}(S)$ logically follows from a set of propositions $A \subseteq \mathcal{P}(S)$ if and only if $\bigcap A \cap W \subseteq p$.

Not all propositions may be expressed by sentences of natural languages. Kratzer tentatively assumes that propositions that are expressible in natural language must be persistent. This property is defined as follows.

Definition 11 (Persistence)
A proposition $p \subseteq S$ is persistent if and only if for all $s, s' \in S$, if $s \in p$ and $s \leq s'$, then $s' \in p$.

Conjunction and disjunction are defined as usual, but now relative to $S$ rather than $W$.7

Definition 12
$[\![\varphi \wedge \psi]\!]^{s} = 1 \Leftrightarrow [\![\varphi]\!]^{s} = 1$ and $[\![\psi]\!]^{s} = 1$
$[\![\varphi \vee \psi]\!]^{s} = 1 \Leftrightarrow [\![\varphi]\!]^{s} = 1$ or $[\![\psi]\!]^{s} = 1$

7 Kratzer in fact considers two different definitions of disjunction. The definition here is the one she used in her discussion of the Atlantic Ocean example (example (8) of this paper).

Kratzer’s discussion of negation is somewhat more complex, and not all of the details are relevant here. However, the following properties will be needed below: For all persistent propositions $\varphi$,

(13) a. $\neg\varphi$ is persistent.
b. $\{\varphi, \neg\varphi\}$ is inconsistent; hence, by persistence, for all $w \in W$, if $w \in \varphi$ then there is no $s \leq w$ such that $s \in \neg\varphi$.
c. $W \subseteq \varphi \cup \neg\varphi$.

These conditions ensure that both Excluded Middle and Non-Contradiction are valid at the world-level. With this background, we can turn to how Kratzer characterizes $f(w)$ and $\mathrm{Prem}^{l}_{w}$: Kratzer assumes that all propositions that are relevant to the truth of counterfactuals are persistent. In addition to persistence, she considers an open-ended list of other general properties that propositions ought to have if they are to be relevant for the truth of counterfactuals. Since this list is left incomplete, we consider a range of different choices that appear to be consistent with what Kratzer says in her paper.

Definition 13 (Relevance function)
A function $f : W \rightarrow \mathcal{P}(\mathcal{P}(S))$ is a relevance function iff for all $w \in W$, $f(w) \subseteq \{p \in \mathcal{P}(S) \mid w \in p \text{ and } p \text{ is persistent}\}$.

In addition to the two conditions in Definition 13, Kratzer requires any proposition in $f(w)$ to be graspable by humans, but does not explicate this notion formally. We will consider this requirement later. Kratzer also assumes that the set of propositions relevant to the truth of counterfactuals is further affected by the individual properties of the counterfactual considered and the context of use.8 Unlike in partition semantics, where $f$ ranges over a precisely characterized set of functions, Kratzer’s lumping semantics does not completely characterize the range of variation of $f$. It is certainly not her intention that any subset of the set of true propositions that are persistent (and satisfy the other conditions that she mentions in her paper) can be a candidate for $f(w)$; there must also be some kind of lower bound for $f(w)$. In fact, her paper seems to suggest that $f(w)$ should include all propositions that are not excluded by general conditions like persistence and human graspability and factors coming from context and the linguistic structure of the given counterfactual. She is not explicit about this, however, and we will consider all relevance functions that are not clearly excluded by considerations in her paper. Now we turn to the definition of the crucial notion of lumping.

8 She mentions that propositions that do not match the focus structure of the antecedent may be excluded, as well as ‘generic’ propositions (in the technical sense of her situation semantics) that are not law-like.
Definition 14 (Lumping)
For all $p, q \subseteq S$ and $w \in W$, $p$ lumps $q$ in $w$ if and only if $w \in p$ and $p \cap \{s \mid s \leq w\} \subseteq q$.

We will use the notation ‘$p \rhd_w q$’ for ‘$p$ lumps $q$ in $w$’. The following property of lumping will be used repeatedly in later sections.

Remark 2
Let $p$ and $q$ be persistent propositions. If $w \notin p$ and $w \in q$, then $p \vee q \rhd_w q$, where $\vee$ is as defined in Definition 12.
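Definition 14 and Remark 2 can be tried out on a toy situation model. The model below is an assumption made purely for illustration (two worlds, each with a single proper part), as are the names `PARTS`, `persistent`, `lumps` and `remark_2_holds`; propositions are frozensets of situations, and disjunction is set union, as in Definition 12.

```python
from itertools import combinations

# Toy situation model (hypothetical): s0 <= w0 and s1 <= w1, plus reflexivity.
S = ("s0", "w0", "s1", "w1")
PARTS = {"w0": frozenset({"s0", "w0"}), "w1": frozenset({"s1", "w1"})}  # {s | s <= w}


def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def persistent(p):
    """Definition 11 on the toy model: p contains w whenever it contains a part of w."""
    return all(w in p for w, parts in PARTS.items() if p & parts)


def lumps(p, q, w):
    """Definition 14: p lumps q in w iff w is in p and p ∩ {s | s <= w} ⊆ q."""
    return w in p and (p & PARTS[w]) <= q


def remark_2_holds():
    """Check Remark 2 for all persistent p, q and all worlds of the toy model."""
    props = [p for p in powerset(S) if persistent(p)]
    return all(
        lumps(p | q, q, w)
        for w in PARTS
        for p in props
        for q in props
        if w not in p and w in q
    )


# Without persistence the remark can fail: p = {s0} is not persistent.
non_persistent_case = lumps(frozenset({"s0"}) | frozenset({"w0"}), frozenset({"w0"}), "w0")
```

The last line shows why persistence matters: a non-persistent $p$ can contain a proper part of $w$ without containing $w$, and that part then blocks the inclusion required by Definition 14.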
Although Kratzer (1989) does not state the above property explicitly, it is made crucial use of in her solution to the problem exemplified by (8). As we shall see, this property of lumping is responsible for some undesirable consequences (see Propositions 4 and 5). Lumping enters the interpretation of counterfactuals as a closure condition on premise sets.

Definition 15 (Closure under lumping)
A set of propositions $A \subseteq \mathcal{P}(S)$ is closed under lumping in $w$ (relative to $f(w)$) if and only if for all $p \in A$ and all $q \in f(w)$, if $p$ lumps $q$ in $w$, then $q \in A$.

Definition 16 (Closure under logical consequence)
A set of propositions $A \subseteq \mathcal{P}(S)$ is closed under logical consequence (relative to $f(w)$) if and only if for all $p \in f(w)$, if $p$ logically follows from $A$, then $p \in A$.9

In addition to the requirement that a premise set $X$ be a consistent subset of $f(w) \cup \{\varphi\}$ that contains $\varphi$, which is shared with Definitions 3 and 5, Definition 17 requires that (i) $X$ be closed under lumping in $w$ and (ii) $X \cap f(w)$ be closed under logical consequence.10

9 Kratzer speaks of ‘strong closure under logical consequence’, which is equivalent to closure under logical consequence. The weak notion of closure under logical consequence she has in mind is defined as follows:
$\forall p \in A\, \forall q \in f(w)\, [p \cap W \subseteq q \rightarrow q \in A]$
According to this definition, the empty set of propositions, for example, is weakly closed under logical consequence, although it is not strongly so closed.
10 The condition of strong closure under logical consequence is motivated by examples like our (6) and (7) above, originally due to Goodman. Kratzer (1989, Section 5.2, pages 640–642) discusses a variant of the example and points out an undesirable premise set which would not be ruled out by closure under lumping alone, even in combination with weak closure under logical consequence.

Definition 17 (Lumping premise set)
Let $f$ be a relevance function.
$\mathrm{Prem}^{l}_{w}(\varphi) = \{X \subseteq f(w) \cup \{\varphi\} \mid \bigcap X \cap W \neq \emptyset \wedge \varphi \in X \wedge \forall p \in X\, \forall q \in f(w)\, [p \rhd_w q \rightarrow q \in X] \wedge \forall p \in f(w)\, [\bigcap (X \cap f(w)) \cap W \subseteq p \rightarrow p \in X \cap f(w)]\}$

The counterfactual connectives $\Box\!\!\to^{l}$ and $\Diamond\!\!\to^{l}$ are again defined as in Definitions 1 and 2 above. Similarly to the partition semantics for conditionals, the connectives $\Box\!\!\to^{l}$, $\Diamond\!\!\to^{l}$ inherit via $\mathrm{Prem}^{l}_{w}$ a dependence on the relevance function $f$. Kratzer (1989) argued that the definitions reproduced in this section solve problems of the sort exemplified in (12) as well as (8).

5.1 Lumping semantics and closure under logical consequence

A little excursion is in order here to remark on the last line in the characterization of $\mathrm{Prem}^{l}_{w}(\varphi)$, which requires closure under logical consequence to hold of $X \cap f(w)$. Given the informal characterization of the lumping semantics in Kratzer 1989, it seems to us that Definition 17 is what Kratzer intends, but the actual formalization that she gives is slightly different:

Definition 18 (Lumping premise set, Kratzer’s version)
Let $f$ be a relevance function.
$\mathrm{Prem}^{l}_{w}(\varphi) = \{X \subseteq f(w) \cup \{\varphi\} \mid \bigcap X \cap W \neq \emptyset \wedge \varphi \in X \wedge \forall p \in X\, \forall q \in f(w)\, [p \rhd_w q \rightarrow q \in X] \wedge \forall p \in f(w)\, [\bigcap (X - \{\varphi\}) \cap W \subseteq p \rightarrow p \in X - \{\varphi\}]\}$

Instead of $X \cap f(w)$, Kratzer requires $X - \{\varphi\}$ to be closed under logical consequence. If $\varphi$ is false at $w$, $\varphi \notin f(w)$, so for any $X \subseteq f(w) \cup \{\varphi\}$, $X \cap f(w) = X - \{\varphi\}$, and the two definitions are equivalent. However, if $\varphi$ is true at $w$, in all likelihood it is a member of $f(w)$, so $X \cap f(w) = X$ and $X - \{\varphi\}$ differ. Suppose both $\varphi$ and $\varphi \wedge \chi$, for some proposition $\chi$, are true at $w$. Clearly $\varphi \wedge \chi$ can be added to the antecedent consistently. One would therefore expect, given the intuitive truth definitions Kratzer states informally, that the set $X = \{\varphi, \varphi \wedge \chi\}$ should be able to be extended to a premise set. However, all supersets of $X$ fail the test for closure under logical consequence under Definition 18. It seems to us that this is not what Kratzer intends. Surely, $\mathrm{Prem}^{l}_{w}(\varphi)$, defined in this way, fails to capture all ways to add true propositions to the antecedent, maintaining consistency. If this were indeed what Kratzer intends, the conceptual difference from the intuitive paraphrase would have semantic repercussions, contrary to Kratzer’s claim (p. 635) that the condition of closure under lumping is the only substantive change from the earlier versions in Kratzer (1989). Suppose Paula is buying a pound of Golden Delicious and nothing else. Then the counterfactual in (14a), interpreted as in (14b), is predicted by Kratzer’s definition to be false if there are worlds like ours in which she is buying some other variety of apples instead.

(14) a. If Paula were buying a pound of apples, she would be buying a pound of Golden Delicious.
b. (Paula is buying a pound of apples) $\Box\!\!\to^{l}$ (Paula is buying a pound of Golden Delicious)

The counterfactual is false because the true sentence Paula is buying a pound of Golden Delicious, as well as all others which entail the antecedent of (14a), is barred from membership in any premise set by the condition of closure under logical consequence as defined in Definition 18. The same is not true of either naïve premise semantics or partition semantics, both of which, as we saw, make the counterfactual equivalent to its consequent at worlds in which its antecedent is true. Whether one finds this outcome agreeable or not, it is not the end of the story. Notice that closure under lumping makes the situation even worse. One could maintain, as Kratzer would, that Paula is buying a pound of Golden Delicious is lumped by the antecedent of (14a) and should therefore be included in all premise sets. But this means that no premise set at all can be closed under both lumping and logical consequence in the way Definition 18 requires, so that the might-counterfactual in (15a) is false as well.

(15) a. If Paula were buying a pound of apples, she might be buying a pound of Golden Delicious.
b. (Paula is buying a pound of apples) $\Diamond\!\!\to^{l}$ (Paula is buying a pound of Golden Delicious)

Below, we will proceed with Definition 17 instead of 18, assuming that this is what Kratzer had in mind. The exact formulation of our results depends on this small change, but since the two definitions are equivalent for counterfactuals with false antecedents, it does not affect what the results say about such counterfactuals, which are the more interesting case.
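The premise sets of Definition 17 can be enumerated by brute force on a toy situation model. Everything in the sketch below is an assumption made for illustration: the two-world model, the choice of $f(w)$ as the set of all persistent propositions true at $w$ (one candidate relevance function among many), and all function names. On this particular choice, $f(w)$ contains both $W$ and $\{w\}$, and the two lumping conditionals come out truth-functional, in line with the triviality results the paper establishes in Section 6.

```python
from itertools import combinations

# Toy situation model (hypothetical): s0 <= w0 and s1 <= w1, plus reflexivity.
S = ("s0", "w0", "s1", "w1")
PARTS = {"w0": frozenset({"s0", "w0"}), "w1": frozenset({"s1", "w1"})}  # {s | s <= w}
W = frozenset(PARTS)  # the set of worlds, itself a proposition


def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def persistent(p):
    return all(w in p for w, parts in PARTS.items() if p & parts)


def meet(props):
    """Intersect a set of propositions; the empty set of propositions yields S."""
    out = frozenset(S)
    for p in props:
        out &= p
    return out


def lumps(p, q, w):
    return w in p and (p & PARTS[w]) <= q


def lumping_prem(phi, w, f_w):
    """Definition 17: consistent, contains phi, closed under lumping and consequence."""
    result = []
    for extra in powerset(p for p in f_w if p != phi):
        X = extra | {phi}
        core = X & f_w                                   # X ∩ f(w)
        consistent = bool(meet(X) & W)
        lump_closed = all(q in X for p in X for q in f_w if lumps(p, q, w))
        logic_closed = all(p in core for p in f_w if meet(core) & W <= p)
        if consistent and lump_closed and logic_closed:
            result.append(X)
    return result


def would(prem, psi):
    return all(any(X <= Y and meet(Y) & W <= psi for Y in prem) for X in prem)


def might(prem, psi):
    return any(all(meet(Y) & psi & W for Y in prem if X <= Y) for X in prem)


def truth_functional_everywhere():
    """Compare both conditionals with phi ∧ psi and phi → psi at every world."""
    props = [p for p in powerset(S) if persistent(p)]
    for w in W:
        f_w = frozenset(p for p in props if w in p)
        for phi in props:
            prem = lumping_prem(phi, w, f_w)
            for psi in props:
                if might(prem, psi) != (w in phi and w in psi):
                    return False
                if would(prem, psi) != (w not in phi or w in psi):
                    return False
    return True
```

A quick way to see the effect is to call `lumping_prem` with an antecedent that is false at the world of evaluation: on this model it returns no premise sets at all, so the might-conditional is false and the would-conditional vacuously true.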
5.2 Some properties of lumping semantics

We noted earlier that both naïve premise semantics and partition semantics make the might- and would-conditionals materially equivalent to the consequent when the antecedent is true. Although this property seems desirable, it is not shared by lumping semantics. In lumping semantics, when the antecedent is true, the truth of the consequent implies the truth of the might-conditional, but not of the would-conditional. The dual of this fact is that when the antecedent is true, the falsity of the consequent implies the falsity of the would-conditional, but not of the might-conditional.

Lemma 1
Suppose $\varphi$ is true at $w$. Then $\varphi \Diamond\!\!\to^{l} \psi$ is true at $w$ iff $f(w) \cup \{\varphi\}$ is consistent with $\psi$.

Proof. Let $Z_w = f(w) \cup \{\varphi\}$. Then $Z_w \in \mathrm{Prem}^{l}_{w}(\varphi)$. To see this, note that $Z_w \subseteq f(w) \cup \{\varphi\}$; since $w \in \bigcap f(w)$ and $w \in \varphi$, $w \in \bigcap Z_w \cap W$, so $\bigcap Z_w \cap W \neq \emptyset$; $\varphi \in Z_w$; every proposition in $f(w)$ lumped in $w$ by any member of $Z_w$ is in $f(w)$ and thus in $Z_w$; every proposition in $f(w)$ that logically follows from $Z_w \cap f(w)$ is in $f(w)$ and hence in $Z_w \cap f(w) = f(w)$.

Clearly, for any $X \in \mathrm{Prem}^{l}_{w}(\varphi)$: $X \subseteq Z_w$, and for any $Y \subseteq f(w) \cup \{\varphi\}$ such that $Z_w \subseteq Y$, $Y = Z_w$. This means that the truth definition of $\varphi \Diamond\!\!\to^{l} \psi$ given in Definition 2:

$\exists X \in \mathrm{Prem}^{l}_{w}(\varphi)\, \forall Y \in \mathrm{Prem}^{l}_{w}(\varphi)\, [X \subseteq Y \rightarrow \bigcap Y \cap \psi \cap W \neq \emptyset]$

is equivalent to

$\bigcap Z_w \cap \psi \cap W \neq \emptyset$,

which says that $f(w) \cup \{\varphi\}$ is consistent with $\psi$. $\Box$
Lemma 2
If $\varphi$ and $\psi$ are true at $w$, then $\varphi \Diamond\!\!\to^{l} \psi$ is true at $w$.

Proof. Suppose $\varphi$ and $\psi$ are true at $w$. Since all propositions in $f(w)$ are true at $w$, $w \in \bigcap (f(w) \cup \{\varphi\}) \cap \psi$, so $f(w) \cup \{\varphi\}$ is consistent with $\psi$. Therefore, by Lemma 1, $\varphi \Diamond\!\!\to^{l} \psi$ is true at $w$. $\Box$

In case $\varphi$ and $\neg\psi$ are true at $w$, $\neg(\varphi \Diamond\!\!\to^{l} \psi)$ is true at $w$ if and only if $\neg\psi$ logically follows from $f(w) \cup \{\varphi\}$. As a special case, we have

Lemma 3
Suppose $\varphi$ and $\neg\psi$ are true at $w$, and $\neg\psi \in f(w)$. Then $\neg(\varphi \Diamond\!\!\to^{l} \psi)$ is true at $w$.

The following is a direct consequence of Lemmas 2 and 3.

Lemma 4
Suppose $\{\neg\psi\} \cap \{p \in \mathcal{P}(S) \mid w \in p\} \subseteq f(w)$. If $\varphi$ is true at $w$, $\varphi \Diamond\!\!\to^{l} \psi$ is true at $w$ iff $\psi$ is true at $w$.

The above lemmas will be useful in our analysis of lumping semantics.

6 TRIVIALITY OF LUMPING SEMANTICS

In this section, we show that lumping semantics becomes trivial for a large class of relevance functions that are not clearly ruled out; for these relevance functions, lumping semantics makes both types of counterfactuals truth-functional.

Proposition 2
Suppose that $\{W, \{w\}\} \subseteq f(w)$. Then $\varphi \Diamond\!\!\to^{l} \psi$ is true at $w$ iff $\varphi \wedge \psi$ is true at $w$.

Proof. Case 1. $\varphi$ is true at $w$. If $\psi$ is true at $w$, then $\varphi \Diamond\!\!\to^{l} \psi$ is true by Lemma 2. If $\psi$ is false at $w$, $\psi$ is inconsistent with $\{W, \{w\}\}$ and hence with $f(w) \cup \{\varphi\}$. By Lemma 1, $\varphi \Diamond\!\!\to^{l} \psi$ is false at $w$.

Case 2. $\varphi$ is false at $w$. Suppose $X \in \mathrm{Prem}^{l}_{w}(\varphi)$. Then we have: $X$ is consistent; $\varphi$ is in $X$; for all propositions $p$ in $X$, all propositions in $f(w)$ that are lumped by $p$ in $w$ are in $X$; all propositions in $f(w)$ that logically follow from $X \cap f(w)$ are in $X \cap f(w)$. Since $W$ is in $f(w)$ and logically follows from $X \cap f(w)$, $W$ must be in $X$. Since $\{w\}$ is in $f(w)$ and $W \rhd_w \{w\}$, $\{w\}$ must also be in $X$. But since $\varphi \in X$ and $w \notin \varphi$, we have $\bigcap X = \emptyset$, contradicting the consistency of $X$. Therefore, $\mathrm{Prem}^{l}_{w}(\varphi) = \emptyset$. This means that $\varphi \Diamond\!\!\to^{l} \psi$ is false at $w$.

In both cases we have shown that $\varphi \Diamond\!\!\to^{l} \psi$ is true at $w$ iff $\varphi \wedge \psi$ is true at $w$. $\Box$

Proposition 3
Under the same assumption, $\varphi \Box\!\!\to^{l} \psi$ is true at $w$ iff $\varphi \rightarrow \psi$ is true at $w$.

Proof. By definition, $\varphi \Box\!\!\to^{l} \psi \Leftrightarrow \neg(\varphi \Diamond\!\!\to^{l} \neg\psi)$. By Proposition 2, $\neg(\varphi \Diamond\!\!\to^{l} \neg\psi) \Leftrightarrow \neg(\varphi \wedge \neg\psi)$. By propositional calculus, $\neg(\varphi \wedge \neg\psi) \Leftrightarrow (\varphi \rightarrow \psi)$. $\Box$

Note that the condition on $f(w)$ in Propositions 2 and 3 can be relaxed substantially. In place of $W$ and $\{w\}$, one can use any propositions $p$ and $q$ such that $W \subseteq p$ and $p \cap \{s \mid s \leq w\} \subseteq q \subseteq \{s \mid s \leq w\}$ and the proof goes through in the exact same way. One possible objection to Propositions 2 and 3 and the above generalization of them is that propositions like $\{w\}$ that are true only in one world should be excluded from the values of $f$ by the condition of human graspability. According to this objection, such propositions are too specific to be a possible object of belief and thus not graspable by humans.11 Independent of the plausibility of this objection, we can show that it has little merit, as the following formulation demonstrates.

11 Kratzer voices this objection in her posting titled ‘Lumps of thought: A reply’ at http://semanticsarchive.net. The notion of human graspability seems to be related to the notion of naturalness employed by Kratzer 2002. See the Addendum below for some discussion.

Proposition 4
Suppose that $\{\varphi \vee \neg\varphi, \neg\varphi, \neg\psi\} \cap \{p \in \mathcal{P}(S) \mid w \in p\} \subseteq f(w)$, where $\vee$ is defined in Definition 12 and $\neg$ satisfies (13). Then $\varphi \Diamond\!\!\to^{l} \psi$ is true at $w$ iff $\varphi \wedge \psi$ is true at $w$, and $\varphi \Box\!\!\to^{l} \psi$ is true at $w$ iff $\varphi \rightarrow \psi$ is true at $w$.

Proof. Clearly, the second half follows from the first, so it suffices to show that
(i) If $\varphi$ is true at $w$, $\varphi \Diamond\!\!\to^{l} \psi$ is true at $w$ iff $\psi$ is true at $w$.
(ii) If $\varphi$ is false at $w$, $\varphi \Diamond\!\!\to^{l} \psi$ is false at $w$.
Part (i) follows from the assumption that $\{\neg\psi\} \cap \{p \in \mathcal{P}(S) \mid w \in p\} \subseteq f(w)$ by Lemma 4.
To prove (ii), we use the assumption that $\{\varphi \vee \neg\varphi, \neg\varphi\} \cap \{p \in \mathcal{P}(S) \mid w \in p\} \subseteq f(w)$. Suppose $w \in \neg\varphi$. For any $X \in \mathrm{Prem}^{l}_{w}(\varphi)$, $\varphi \vee \neg\varphi \in X$ by closure under logical consequence. Since $\varphi \vee \neg\varphi \rhd_w \neg\varphi$, $\neg\varphi \in X$, which contradicts $\varphi \in X$ and $\bigcap X \cap W \neq \emptyset$. This shows that $\mathrm{Prem}^{l}_{w}(\varphi) = \emptyset$ and $\varphi \Diamond\!\!\to^{l} \psi$ is false at $w$. $\Box$

Note that the assumption $\{\varphi \vee \neg\varphi, \neg\varphi\} \cap \{p \in \mathcal{P}(S) \mid w \in p\} \subseteq f(w)$ alone leads to the counterintuitive prediction that $\neg\varphi$ implies $\neg(\varphi \Diamond\!\!\to^{l} \psi)$. Propositions like $\varphi \vee \neg\varphi$, $\neg\varphi$, $\neg\psi$ are certainly graspable by humans if $\varphi$ and $\psi$ are, so the condition of human graspability cannot save lumping semantics from triviality. Incidentally, Propositions 2 and 3, on the one hand, and Proposition 4, on the other, make different kinds of claim. Propositions 2 and 3 say that all counterfactuals are truth-functional with respect to a certain broad class of relevance functions. Proposition 4 makes a weaker kind of claim, that if $f(w)$ contains certain propositions, one particular counterfactual is truth-functional with respect to $f(w)$. Nevertheless, Proposition 4 reveals a surprising (and in our opinion undesirable) feature of lumping semantics in that at least one true proposition in $\{\varphi \vee \neg\varphi, \neg\varphi, \neg\psi\}$ must be ruled out as irrelevant to the truth of $\varphi \Diamond\!\!\to^{l} \psi$ and $\varphi \Box\!\!\to^{l} \psi$ so as not to give them counterintuitive truth values. Propositions 2–4 show that there is a strong tension between closure under lumping and closure under logical consequence, which can very easily make lumping semantics break down.

7 DISCUSSION

The results in the preceding section show that various claims that Kratzer makes about the predictions of lumping semantics cannot be taken at face value. The propositions relevant to the truth of counterfactuals must be restricted to a very small set, much smaller than Kratzer’s paper suggests, in order for lumping semantics to have any chance of assigning reasonable truth conditions to counterfactuals. We think that the general nature of our argument calls into question the existence of any reasonable restriction on the set of propositions relevant to the truth of counterfactuals that saves lumping semantics from counterintuitive predictions; the possibility remains, however, that some clever restriction may solve the problems.
Kratzer leaves many important details of her semantics to be spelled out. In particular, she does not explicitly provide any condition on $f(w)$ to the effect that some propositions must be in it. There is one passage in her paper, however, that suggests that she has in mind a condition requiring certain propositions to be in $f(w)$, which we can show leads to a disastrous result similar to Propositions 2–4. In discussing example (8), she seems to assume that $\varphi \vee \psi$ and $\varphi$ are relevant to the truth of $\neg\varphi \Diamond\!\!\to^{l} \psi$, where $\vee$ is as defined in Definition 12. In her discussion, $\varphi$ and $\psi$ are unrelated propositions like Paula is buying a pound of apples and the Atlantic Ocean is drying up, but let us see what happens when the assumption just stated is applied to the case $\psi = \neg\varphi$. Analogously to part (ii) of the proof of Proposition 4, we can show that whenever $\varphi$ is true, $\neg\varphi \Diamond\!\!\to^{l} \neg\varphi$ must be false. This runs counter to intuition when $\neg\varphi$ is true in some other possible world, as can be seen in the following example.

(16) a. If Paula were not buying a pound of apples, she might not be buying a pound of apples.
b. $\neg$(Paula is buying a pound of apples) $\Diamond\!\!\to^{l}$ $\neg$(Paula is buying a pound of apples)

If there is a world like ours in which Paula is not buying a pound of apples, (16a) is intuitively true.

It might be instructive at this point to consider what happens if one decides to exclude from $f(w)$ propositions like $W$ and $\varphi \vee \neg\varphi$ which are true in all possible worlds, even though none of the considerations in Kratzer’s paper point in this direction. Although one might think it is not unreasonable to suppose that tautologies, being uninformative, should be excluded from premise sets, the following proposition is an indication that not much would be gained by this move.

Proposition 5
Suppose that (i) $f(w)$ contains no propositions that are true in all possible worlds; (ii) whenever $p$ is in $f(w)$, $(p \wedge \varphi) \vee (p \wedge \neg\varphi)$ is in $f(w)$12 and moreover, if $\neg\varphi$ is true at $w$, $p \wedge \neg\varphi$ is also in $f(w)$; and (iii) if $\neg\psi$ is true at $w$, $\neg\psi$ is in $f(w)$. Then, $\varphi \Diamond\!\!\to^{l} \psi$ is true at $w$ iff $\varphi \Diamond\!\!\to^{n} \psi$ is true at $w$, and $\varphi \Box\!\!\to^{l} \psi$ is true at $w$ iff $\varphi \Box\!\!\to^{n} \psi$ is true at $w$.

Proof. Suppose that $\varphi$ is true at $w$. Then by condition (iii) and Lemma 4, $\varphi \Diamond\!\!\to^{l} \psi$ is true at $w$ if and only if $\psi$ is true at $w$.

12 Note that $p$ and $(p \wedge \varphi) \vee (p \wedge \neg\varphi)$ do not necessarily stand for the same proposition in a situation model.
M. Kanazawa et al. 147
8 CONCLUSION We have shown that Kratzer’s (1989) lumping semantics fails to achieve the expressed aim of providing ‘a theory of counterfactuals that is able to make more concrete predictions with respect to particular examples’ (p. 626) than earlier theories. We have to conclude that either her theory makes wrong predictions because not enough propositions are excluded from the set of relevant propositions, or it fails to make any concrete predictions because we have no good idea how to restrict that set. We can also conclude that lumping semantics is a significant step backward compared to partition semantics. As Kratzer (1981) stresses, the latter theory provides a reasonable ‘logic’ for counterfactuals; it predicts the validity of certain intuitively acceptable forms of inference involving counterfactuals, while correctly predicting the invalidity of other forms of inference. In contrast, lumping semantics fails to validate simple laws like )u 5 (u )/ u) and ) u ^ (u h/ w) 0 u )/ w. What is striking is how little value Kratzer’s incorporation of lumping actually brings to premise semantics. The requirement of closure under lumping, together with closure under logical consequence, introduces a host of new problems that even naı¨ve premise semantics
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Suppose now that u is false at w. Firstly, we show that Premwl (u) 4 ffugg. Suppose to the contrary that for some X 2 Premwl (u), X 6¼ fug. Let p 2 X – fug. Then (p ^ u) _ (p ^ :u) 2 X by condition (ii) and closure under logical consequence. Since u is false at w, closure under lumping implies that p ^ :u 2 X, making X inconsistent. This contradicts X 2 Premwl (u). Secondly, we show that fug 2 Premwl (u) iff u is consistent. To see this, note that fug is closed under lumping in w because u is false at w, and fug \ f(w) ¼ ˘ is closed under logical consequence relative to f(w) because f(w) contains no propositions that l are true in all possible worlds. Finally, we show that u )/ w is true at w iff ) (u ^ w) is true at w. In case u is consistent, Premwl (u) ¼ ffugg l and u )/ w is true at w iff u \ w 6¼ ˘, that is, iff ) (u ^ w) is true at l w. In case u is inconsistent, Premwl (u) ¼ ˘ and both u )/ w and ) (u ^ w) are false at w. l By Proposition 1, we have shown that u )/ w is true at w iff u n )/ w is true at w. h
148 On the Lumping Semantics of Counterfactuals did not face.13 The parameter of a relevance function f, which is intended to be context-dependent and indeterminate, is then burdened with the dual task of keeping those new problems at bay and yielding better predictions than the pre-lumping versions of the theory. Our analysis has cast into serious doubt whether there exists any choice of f that can meet this demand, but we have not settled this question. Even if the answer turns out—to our surprise—to be positive, it remains to be seen whether the resulting theory retains the initial appeal of the introduction of lumping to the premise semantics of counterfactuals.
ADDENDUM
Recently, Kratzer (2002) offered an entirely different approach to the same kinds of problems that motivated lumping semantics. In this addendum, we very briefly discuss some aspects of this paper that are relevant to our analysis of lumping semantics.

Kratzer (2002) suggests that the facts that are relevant for the truth of counterfactuals ‘may very well be propositional facts’. A propositional fact is the closure of a singleton proposition $\{s\}$, where $s$ is some actual situation, under two closure conditions, namely, (i) persistence and (ii) closure under maximal similarity. Two situations are said to be maximally similar if they are qualitatively the same and preserve counterpart relationships between individuals. The requirement that a relevant proposition be closed under maximal similarity serves to rule out overly specific propositions like $\{w\}$ or $\{s \mid s' \leq s\}$, and the requirement that it be generated by a singleton set serves to rule out overly general propositions like $W$ or $\varphi \lor \neg\varphi$.

Kratzer (2002) outlines how this version of premise semantics can handle examples similar to (8) and (12), which motivated the lumping semantics, without the use of lumping. Briefly, the offending proposition that cannot be in any premise set under the lumping semantics because it lumps a proposition incompatible with the antecedent is now ruled out as irrelevant under the new semantics because it does not ‘correspond to a worldly fact’ (i.e., it is not the closure of $\{s\}$ for some actual situation $s$). The ‘lumpee’ proposition is relevant because it is the closure of $\{s\}$, where $s$ is the situation exemplifying the ‘lumper’ proposition.

Although the new semantics is not meant to be a special kind of lumping semantics, it can be understood to be such, since it can be shown that a propositional fact lumps another propositional fact only if the former is a subset of, and hence logically implies, the latter. So if $f(w)$ in the lumping semantics is taken to be (a subset of) the set of propositional facts, closure under lumping becomes redundant, and the resulting specialized version of lumping semantics becomes equivalent to the new semantics. (Note that closure under logical consequence alone is always redundant.) Of course, there is no point in having both closure under lumping and restriction to propositional facts, then.

The requirement of closure under maximal similarity seems to be related to the requirement of human graspability, although Kratzer (2002) does not explicitly mention the connection. Both notions are intended to rule out overly specific propositions, that is, propositions that make distinctions among situations that humans supposedly cannot make. This suggests a specialization of the lumping semantics in which members of $f(w)$ are restricted to natural propositions (propositions that are both persistent and closed under maximal similarity) that are true at $w$. (The class of natural propositions is much broader than the class of propositional facts.) We can show that such a move will not solve the fundamental problems with the lumping semantics which we have pointed out in this paper. On the assumption that $\varphi$ and $\psi$ express natural propositions, we can replace $\{w\}$ by the closure thereof in Proposition 2 and the proof goes through as before. Propositions 4 and 5 are not affected, assuming a suitable definition of $\neg$ that preserves naturalness.

13 We noted at the end of Section 6 that the triviality results (Propositions 2–4) stem from the tension between the two requirements of closure under lumping and closure under logical consequence. If $\varphi$ is false, any true proposition lumps a proposition that is inconsistent with $\varphi$, and any set of propositions logically implies some true proposition. But closure under logical consequence is not important for most of the examples in Kratzer’s paper. What happens if we change her definition of $\mathrm{Prem}^{l}_{w}$ and either (a) drop the requirement of closure under logical consequence altogether or (b) replace it by a weaker closure condition, like the weak closure under logical consequence mentioned in footnote 9? Both (a) and (b) have the effect of making $\{\varphi\}$ a premise set when $\varphi$ is false but consistent. ($\{\varphi\}$ is closed under lumping because $\varphi$ is false, and $\{\varphi\} \cap f(w) = \emptyset$ is weakly closed under logical consequence.) We can show that assuming $\{s \mid s \leq w\} \in f(w)$ makes both modifications of lumping semantics collapse to naïve premise semantics. Also, the modification by (b) collapses to naïve premise semantics under the assumption in Proposition 2 or under the assumption that conditions (ii) and (iii) in Proposition 5 hold.

Acknowledgments
We would like to thank the audience at the 2002 Stanford Workshop on Mood and Modality, Johan van Benthem, Cleo Condoravdi, Angelika Kratzer, Jeff Pelletier, the editor, and two anonymous reviewers for comments and helpful suggestions on this work.
MAKOTO KANAZAWA National Institute of Informatics 2–1–2 Hitotsubashi, Chiyoda-ku Tokyo 101–8430 Japan e-mail: [email protected]
STEFAN KAUFMANN Department of Linguistics Northwestern University 2016 Sheridan Road Evanston, IL 60208 USA e-mail: [email protected]
STANLEY PETERS Department of Linguistics Margaret Jacks Hall, Building 460 Stanford, CA 94305–2150 USA e-mail: [email protected]

Received: 18.02.03 Final version received: 02.09.04 Advance Access publication: 29.03.05

REFERENCES
Armstrong, D. M. (1978) Nominalism and Realism: Universals and Scientific Realism. Cambridge University Press. Cambridge.
Armstrong, D. M. (1997) A World of States of Affairs. Cambridge University Press. Cambridge.
Barwise, J. & Perry, J. (1983) Situations and Attitudes. MIT Press. Cambridge, MA.
Goodman, N. (1947) ‘The problem of counterfactual conditionals’. The Journal of Philosophy 44:113–128.
Kratzer, A. (1979) ‘Conditional necessity and possibility’. In U. Egli, R. Bäuerle & A. von Stechow (eds), Semantics from Different Points of View. Springer, Berlin, 117–147.
Kratzer, A. (1981) ‘Partition and revision: The semantics of counterfactuals’. Journal of Philosophical Logic 10:201–216.
Kratzer, A. (1989) ‘An investigation of the lumps of thought’. Linguistics and Philosophy 12:607–653.
Kratzer, A. (2002) ‘Facts: Particulars or information units?’ Linguistics and Philosophy 25:655–670.
Lewis, D. (1973) Counterfactuals. Harvard University Press. Cambridge, MA.
Lewis, D. (1981) ‘Ordering semantics and premise semantics for counterfactuals’. Journal of Philosophical Logic 10:217–234.
Pollock, J. L. (1976) ‘The ‘‘possible worlds’’ analysis of counterfactuals’. Philosophical Studies 29:469–476.
Pollock, J. L. (1981) ‘A refined theory of counterfactuals’. Journal of Philosophical Logic 10:239–266.
Ramsey, F. P. (1929) ‘General propositions and causality’. In R. B. Braithwaite (ed.), 1931. The Foundations of Mathematics and other Logical Essays, Routledge & Kegan Paul, London, 237–255.
Rescher, N. (1964) Hypothetical Reasoning. North-Holland. Amsterdam.
Stalnaker, R. (1968) ‘A theory of conditionals’. In Studies in Logical Theory, American Philosophical Quarterly, Monograph: 2, Blackwell, Oxford, 98–112.
Veltman, F. (1976) ‘Prejudices, presuppositions and the theory of conditionals’. In J. Groenendijk & M. Stokhof (eds), Proceedings of the First Amsterdam Colloquium [=Amsterdam Papers in Formal Grammar, Vol. 1], Centrale Interfaculteit, Universiteit van Amsterdam, 248–281.
Journal of Semantics 22: 153–158 doi:10.1093/jos/ffh020 Advance Access publication March 14, 2005
Constraining Premise Sets for Counterfactuals ANGELIKA KRATZER University of Massachusetts at Amherst
Abstract
This note is a reply to ‘On the Lumping Semantics of Counterfactuals’ by Makoto Kanazawa, Stefan Kaufmann and Stanley Peters. It shows first that the first triviality result obtained by Kanazawa, Kaufmann, and Peters is already ruled out by the constraints on admissible premise sets listed in Kratzer (1989). Second, and more importantly, it points out that the results obtained by Kanazawa, Kaufmann, and Peters are obsolete in view of the revised analysis of counterfactuals in Kratzer (1990, 2002).

A unifying theme of my papers on premise semantics, starting with the earliest one in 1976¹ and leading up to the latest one in 2002, was the search for suitable constraints on premise sets for modals and conditionals. In the case of counterfactuals, the task is to find constraints that are exactly right for modelling the vagueness and context dependence that those constructions are known to exhibit. I have tried various avenues towards this goal, but I think it is fair to say that a completely explicit and satisfying characterization of the relevant constraints has not yet been given. I agree that the analysis of Kratzer (1989) failed. But it didn’t fail because it was committed to problematic premise sets. The paper made clear that the list of constraints it explicitly discussed was not meant to be exhaustive. The usual fate of a linguistic analysis is to be superseded by another one. I abandoned the approach of Kratzer (1989) as early as 1990 because I had found a better way of characterizing the constraints for admissible premise sets.

In their contribution to this issue, Kanazawa, Kaufmann and Peters (henceforth KKP) identify certain premise sets that should not be permissible in the evaluation of counterfactuals. Their discussion is for the most part framed as a reply to Kratzer (1989), though, which became obsolete after the publication of Kratzer (1990) and Kratzer (2002). For editorial reasons, my reply has to be short, so I will only address two issues: the first one is that under reasonable assumptions, the first triviality result obtained by KKP is already ruled out by the tentative constraints mentioned in Kratzer (1989). The second and more important issue is that none of the potential problems discussed in KKP are problems for the more recent method of constructing premise sets proposed in Kratzer (1990) and (2002). In fact, the new method did away with closure under lumping, was developed precisely so as to explicitly rule out premise sets of the kind discussed in KKP, and seems independently supported by the semantics of knowledge ascriptions.

1 Kratzer (1976) is the German predecessor of Kratzer (1977). Frank Veltman independently developed a premise semantics for conditionals around the same time: Veltman 1976.

The triviality results presented in Propositions 2 and 3 of KKP require the singleton proposition $\{w\}$ to be relevant for the interpretation of counterfactuals. This violates the condition of Kratzer (1989) that the relevant propositions be ‘graspable by humans’. A plausible necessary condition for a proposition to be graspable by a human mind seems to be that it should be possible for a person to believe that proposition. For the singleton $\{w\}$ to be believed by a person, then, that person’s set of doxastic alternatives has to be a subset of $\{w\}$. Assuming that the person’s beliefs are consistent, it follows that she has to be omniscient in a rather strong sense. Her beliefs have to be so specific that they are able to distinguish the actual world from all other possible worlds, including all of its perfect duplicates! It seems, then, that given a fairly obvious necessary condition for propositions to be graspable by humans, premise sets of the kind assumed in KKP’s Proposition 2 are already ruled out by the constraints mentioned in Kratzer (1989).

The main idea of Kratzer (1990) and (2002) was to construct premise sets for counterfactuals in such a way that whenever they contain propositions $p$ and $q$ such that $p$ lumps $q$, $p$ also logically implies $q$. The requirement that premise sets be closed under lumping became superfluous, then.
From the present perspective, the work reported in the 1989 paper was a useful intermediate step that eventually helped me find a way of getting rid of closure under lumping, while nevertheless acknowledging the crucial role of lumping relations for the premise sets needed for counterfactuals. On the (1990, 2002) analysis, lumping relations are paid attention to by ‘hard-wiring’ them into the construction of premise sets, rather than by stipulating a condition of closure under lumping. The unpleasant results of KKP are all brought about by applying the closure conditions of Kratzer (1989) to premise sets containing propositions like singletons, tautologies, negations, and disjunctions. The new approach excluded the dangerous kinds of premises on general grounds, and at the same time eliminated the need for closure under lumping, too.
Kratzer (1990) and (2002) construct premise sets for, say, the actual world, by expanding singleton sets containing actual situations. Apart from satisfaction of the persistence requirement, this expansion is driven by relations of maximal similarity between situations. Assuming strictest standards of similarity, only situations that are perfect duplicates are maximally similar to each other. Lowering those standards somewhat allows situations to be maximally similar without being perfect duplicates. The method of projecting ‘natural’ propositions from particular situations can be illustrated as follows: Suppose there is exactly one situation $s$ exemplifying the proposition that Thomas picked a rose in the actual world. Then $\{s\}$ is a proposition, albeit a very specific one. Its smallest persistent extension is $p = \{s' : s \leq s'\}$, the set of all actual situations in which Thomas picked that particular rose. The proposition $p$ is persistent, but not yet natural. To extend $p$ into a natural proposition, we must add all situations that are maximally similar to some situation in $p$. Assuming persistence then forces us to add all situations that contain any one of the recently added situations as parts. Naturalness requires us to add all situations that are maximally similar to the situations we just added, and so on. We eventually end up with a set of situations in which a maximally similar counterpart of Thomas picks a maximally similar counterpart of the rose in a way that is maximally similar to the way he picked the rose in the actual world. There is a substantial body of philosophical work on the connection between similarity and natural properties (e.g. Lewis 1986). That work is very relevant since propositions are properties, too: properties of situations, in our case.
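The projection procedure just described alternates two closure steps until nothing new is added, which makes it a fixpoint computation. The following is only a toy sketch under strong simplifying assumptions: the model is finite, situations are plain labels, and the part-of and maximal-similarity relations are stipulated by hand. All names (`SITUATIONS`, `PARTS`, `SIMILAR`, `natural_projection`) are hypothetical and chosen for illustration; nothing here is taken from Kratzer (1990, 2002) beyond the two closure conditions.

```python
# Toy model (hypothetical): "s" is the actual rose-picking situation, part of
# the actual world "w". "s2" is a counterpart situation in another world "v".
# "t" is an unrelated situation in a third world "u".
SITUATIONS = ["s", "w", "s2", "v", "t", "u"]
PARTS = {("s", "w"), ("s2", "v"), ("t", "u")}   # (a, b) means a is a proper part of b
SIMILAR = {frozenset(("s", "s2"))}              # stipulated maximal similarity


def part_of(a, b):
    """Reflexive part-of relation over the toy situations."""
    return a == b or (a, b) in PARTS


def max_similar(a, b):
    """Symmetric, reflexive maximal-similarity relation (stipulated)."""
    return a == b or frozenset((a, b)) in SIMILAR


def natural_projection(start, situations, part_of, max_similar):
    """Close {start} under persistence and maximal similarity.

    Persistence: if t is in the proposition and t is part of s, add s.
    Similarity:  if t is in the proposition and s is maximally similar to t, add s.
    The loop repeats until a full pass adds nothing (a fixpoint).
    """
    prop = {start}
    changed = True
    while changed:
        changed = False
        for s in situations:
            if s in prop:
                continue
            if any(part_of(t, s) or max_similar(t, s) for t in prop):
                prop.add(s)
                changed = True
    return prop


projection = natural_projection("s", SITUATIONS, part_of, max_similar)
print(sorted(projection))
```

In this toy model the projection from `"s"` picks up the actual world, the counterpart situation, and the counterpart's world, while the unrelated situation `"t"` and its world `"u"` stay out. How permissive the stipulated similarity relation is determines how far the proposition spreads, which is the source of indeterminacy the text goes on to discuss.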
The project of Kratzer (1990, 2002) was to exclude from premise sets for counterfactuals true propositions that are too ‘gruesomely gerrymandered’ or too ‘miscellaneously disjunctive’ (Lewis 1986: 59). The technique was to ban premises that could not be projected as natural propositions from actual situations. On the proposed analysis, there are two possible sources for the notorious indeterminacy of counterfactuals. One is underspecification of the similarity relation, and the other one is underspecification of the set of relevant actual situations, that is, the set of starter situations. In the spirit of Kratzer (1981), one desirable constraint might be that the mereological sum of all starter situations be identical to the actual world. Non-accidental connections between facts of the kind discussed in Tichy (1976) and Kratzer (1981) might be reflected by the way worlds are partitioned into situations. Two facts that stand and fall together, for example, might not contribute two separate starter situations. They would have to be ‘lumped’ together. I would expect that independent justification for constraints on admissible ways of partitioning worlds into sub-situations could be found outside the area of counterfactual reasoning by exploring the highly underdetermined principles for individuating places and events. Those explorations might then feed the development of formal theories of suitable parthood structures of the kind discussed in Casati and Varzi (1999), for example. I would also hope that constraints on the range of admissible similarity relations might eventually be grounded in general theories of category formation, and hence would not have to be stipulated specifically for the analysis of counterfactuals.

Kratzer (2002) does not address the role of laws in the proposed account of counterfactual reasoning. Since laws seem to play an important role in category projection, too, we would expect laws to impact counterfactual reasoning via the similarity relation, as in David Lewis’ work (Lewis 1979).² Two situations would not count as maximally similar, if they are part of worlds with very different laws. Laws could play an important role in counterfactual reasoning, then, without ever being premises. Crucially, propositions expressing law-like generalizations would not have to be natural projections from actual situations.

2 The role played by similarity in Kratzer’s (1990, 2002) premise semantics for counterfactuals is quite different, however, from that in Lewis’ (1973) analysis. In Kratzer’s semantics, similarity is only used for projecting natural propositions from actual situations.

Clearly, more research is needed to flesh out the account of Kratzer (1990, 2002). Yet even in its present form, certain conclusions can be drawn. Neither the set of all possible worlds $W$, nor singleton sets like $\{w\}$ (where $w$ is the actual world) will come out as natural projections of actual situations. $W$ could be a natural projection of an actual situation only if $w$ was a relevant situation, and all other possible worlds were maximally similar to $w$. This could only be if our standard of similarity was absurdly permissive. For $\{w\}$ to be a natural projection of an actual situation, $w$ would have to be a relevant situation, too.
But this time round, no possible worlds whatsoever, not even perfect duplicates, could be allowed to be maximally similar to it. Our standards of similarity would have to be absurdly restrictive.

‘Negative’ propositions do not fare any better. Let $p$ be the proposition that there are ghosts. One possible persistent negation of $p$ would be a proposition that is only true in worlds (as opposed to proper situations, that is, proper parts of worlds). It would be true in precisely those worlds in which there are no ghosts. The verdict on that proposition is similar to the one for $W$. To obtain it as a natural projection of some actual situation, we would have to assume that $w$ is a relevant situation, and all possible worlds in which there are no ghosts would have to be maximally similar to it. Again, this would require a highly implausible similarity relation. The set of possible worlds in which there are no ghosts is a rather diverse bunch. Alternatively, consider a negation of $p$ that is true in any possible situation that is part of a situation in which there are no ghosts. If this proposition is true in a world, it is true in all of its sub-situations.³ What would it take for this negation to come out as a natural projection of some actual situation, according to the method I described? I can’t see how to pull this off in any reasonable way. If we take $w$ as our starter situation, for example, we would have to look for a similarity relation that not only makes all sub-situations of $w$ maximally similar to $w$, but also all other possible worlds in which there are no ghosts, and all of their sub-situations. No way. Maybe we should start with some small sub-situation $s$ of the actual world. In the next step, we would then have to add all super-situations of $s$. Next, all possible situations are added that are maximally similar to one of the situations in the set we have so far. One of the problems here is that to get the intended proposition, we would have to find a way of excluding all those possible situations that are part of worlds that do have ghosts, even if those situations themselves were extremely similar to $s$ or one of its super-situations. Bad luck.

3 On the approach of Kratzer (1989), this kind of negation would correspond to a law-like generalization. Since laws no longer have to be premises on the new approach, propositions expressing law-like generalizations do not have to be projected as natural propositions from actual situations.

Finally let us try to produce the true disjunctive proposition that I am in Brazil or cats have wings. Again, I see no good way of generating this proposition from some actual situation using the prescribed method. Since cats have no wings in the actual world, our starter situation $s$ would have to be one in which I am in Brazil. None of the situations that are maximally similar to $s$ or any of its super-situations are situations in which cats have wings.
I think, then, that the method proposed in Kratzer (1990, 2002) provides at least a promising starting point for characterizing suitable premise sets for counterfactuals. It is a method that is worth thinking about, even though in its current state of development, it cannot be considered more than a beginning. As with all empirical research, it will take many more years to fully explore the consequences of what I have already been thinking about for so long. This kind of work is extremely difficult and can only be done well in cooperation with others.
Acknowledgements
I thank Peter Bosch and Bart Geurts, as well as Cleo Condoravdi and Stefan Kaufmann, for making it possible to include this note in this special issue. I am grateful for having been given the opportunity to clarify some issues discussed in Kratzer (1990, 2002) on this occasion.

ANGELIKA KRATZER Department of Linguistics South College University of Massachusetts at Amherst Amherst, MA 01003. e-mail: [email protected]

Received: 20.01.05 Final version received: 23.01.05 Advance Access publication: 14.03.05

REFERENCES
Casati, R. & Varzi, A. (1999) Parts and Places. The Structures of Spatial Representation. MIT Press. Cambridge, MA.
Kanazawa, M., Kaufmann, S. & Peters, S. (2005) ‘On the lumping semantics of counterfactuals’. Journal of Semantics, this issue.
Kratzer, A. (1976) ‘Was ‘‘können’’ und ‘‘müssen’’ bedeuten können müssen’. Linguistische Berichte 42:128–160.
Kratzer, A. (1977) ‘What ‘‘must’’ and ‘‘can’’ must and can mean’. Linguistics and Philosophy 1:337–355.
Kratzer, A. (1981) ‘Partition and revision: The semantics of counterfactuals’. Journal of Philosophical Logic 10:201–216.
Kratzer, A. (1989) ‘An investigation of the lumps of thought’. Linguistics and Philosophy 12:607–653.
Kratzer, A. (1990) ‘How specific is a fact?’ In Proceedings of the 1990 Conference on Theories of Partial Information. Center for Cognitive Science and College of Liberal Arts at the University of Texas at Austin.
Kratzer, A. (2002) ‘Facts: Particulars or information units?’ Linguistics and Philosophy 25:655–670.
Lewis, D. K. (1973) Counterfactuals. Blackwell. Oxford.
Lewis, D. K. (1979) ‘Counterfactual dependence and time’s arrow’. Noûs 13:418–446.
Lewis, D. K. (1986) On the Plurality of Worlds. Blackwell. Oxford.
Tichy, P. (1976) ‘A counterexample to the Stalnaker-Lewis analysis of counterfactuals’. Philosophical Studies 29:271–273.
Veltman, F. (1976) ‘Prejudices, presuppositions and the theory of conditionals’. In J. Groenendijk and M. Stokhof (eds) Amsterdam Papers in Formal Grammar, Vol. 1. Centrale Interfaculteit, Universiteit van Amsterdam.
Journal of Semantics 22: 159–180 doi:10.1093/jos/ffh022
Making Counterfactual Assumptions FRANK VELTMAN Institute for Logic, Language and Computation, University of Amsterdam
Abstract
This paper provides an update semantics for counterfactual conditionals. It does so by giving a dynamic twist to the ‘Premise Semantics’ for counterfactuals developed in Veltman (1976) and Kratzer (1981). It also offers an alternative solution to the problems with naive Premise Semantics discussed by Angelika Kratzer in ‘Lumps of Thought’ (Kratzer, 1989). Such an alternative is called for given the triviality results presented in Kanazawa et al. (2005, this issue).

1 INTRODUCTION
Syntactically, counterfactual conditionals are quite complex. First, there is the antecedent starting with ‘if’ and consisting of a sentence in which a past perfect is used, and then there is the consequent with a verb phrase built from ‘would’, ‘have’, and a past participle, a so-called modal perfect, presumably¹ formed by taking the past tense of a future perfect. In semantics this complexity has been neglected. The usual practice is to put all these modal, temporal, and aspectual modifications together in one single special arrow and to represent a counterfactual by a formula of the form ⌜$\varphi \leadsto \psi$⌝, where $\varphi$ and $\psi$ stand for arbitrary sentences. The meaning of ⌜$\varphi \leadsto \psi$⌝ is then defined in one go from the meanings of $\varphi$ and $\psi$.

It would be nice to have a more stepwise analysis. This paper makes a start at this by decomposing counterfactuals into two pieces: the antecedent ⌜If it had been the case that $\varphi$⌝, and the consequent ⌜it would have been the case that $\psi$⌝.² Such a decomposition is called for because the modal perfect is not only used in counterfactual conditionals. Consider for instance the second sentence in the following text.

1 Things are changing. Joyce Tang Boyland convincingly argues in Boyland (1996) that in present day English ‘have’ is becoming affixed to the preceding ‘would’, which makes ‘would have’ a single syntactic unit which is combined with a past participle.
2 Condoravdi (2002) starts at the other end, giving a decompositional analysis of phrases like ⌜it might have been the case that . . .⌝ and ⌜it would have been the case that . . .⌝
160 Making Counterfactual Assumptions (i) John did not drink any wine. He would have become sick. Sentences with a verb phrase consisting of ‘would have’ + past participle make no sense if they are presented without context. The first sentence in (i) provides a proper context for the second. Together they convey roughly3 the same information as the one sentence (ii) If John had drunk any wine, he would have become sick. It would be interesting to know in exactly which contexts ‘would have’ + past participle can be used. This is by no means a trivial question, as is illustrated by (iii). Why does (iii) make no sense? In particular, why do we not understand (iii) as (iv) unless something like ‘otherwise’ is inserted in front of the second sentence?4 (iv) If John had not drunk too much wine, he would not have become sick. The dynamic outlook on semantics5 offers a way to come to grips with questions like this, as it is designed to study meaning ‘in context’. On the dynamic view, knowing the meaning of a sentence is knowing the change it brings about in the cognitive state of anyone who wants to incorporate the information conveyed by it. Formally, this amounts to this.
The meaning $[\varphi]$ of a sentence $\varphi$ is an operation on cognitive states.
In the following ‘$S[\varphi]$’ denotes the result of applying the operation $[\varphi]$ to state $S$; it is the result of updating $S$ with $\varphi$. An important notion in this framework is the notion of support. A cognitive state $S$ supports a sentence $\varphi$ when updating $S$ with $\varphi$ adds no information over and above what is already in $S$. Instead of ‘$S$ supports $\varphi$’, I will often say ‘$\varphi$ is accepted in $S$’.
$S$ supports a sentence $\varphi$, $S \models \varphi$, iff $S[\varphi] = S$.
3 I do not want to get into the question whether it is a pragmatic or a semantic consequence of sentence (ii) that John did not drink any wine. If you believe that counterfactuals presuppose the falsity of their antecedent and that presupposition is a semantic notion, you can omit the qualification ‘roughly’. 4 For readers who think that it is crucial that there be a negation in the first sentence, it will be worthwhile to look at (v) We asked Mary to taste the wine. John would have become sick. If ‘Mary’ is stressed in the first sentence and ‘John’ in the second, the second sentence makes perfect sense. 5 This paper utilizes the framework presented in Veltman (1996).
(iii) John drank too much wine. He would not have become sick.
Frank Veltman 161
Logical validity is defined in terms of this notion:
$\varphi_1, \ldots, \varphi_n \models \psi$ iff for any state $S$, $S[\varphi_1] \ldots [\varphi_n] \models \psi$.
In other words, a sequence of premises $\varphi_1, \ldots, \varphi_n$ entails a conclusion $\psi$ if updating a state with that sequence invariably leads to a state that supports the conclusion. It is now possible to outline the general setup. Consider a sentence of the form ⌜If it had been the case that $\varphi$, it would have been the case that $\psi$⌝. By interpreting the antecedent in state $S$ one gets to state
This state $S'$ is stored in memory as a state subordinate to $S$, so that the consequent can be interpreted in the right context. The modal perfect ‘it would have been the case that’ indicates that the message given in $\psi$ pertains to this subordinate state $S'$ rather than to the state $S$. This can be generalized. Apparently, after processing the first sentence of (i) the stage is set for the interpretation of the subsequent sentence, whereas in (iii) this is not the case. When you have to interpret a negative sentence, such as the first sentence in (i), the interpretation process starts with an update with the positive subsentence, and then continues with some operation on this intermediate result. This intermediate result is kept in memory as an auxiliary state subordinate to the main state, and differing from it mainly in that it supports a statement that is rejected in the main state. This subordinate state is the state the modal perfect in the second sentence of (i) is looking for. In interpreting the first sentence of (iii) no such subordinate state is created. Therefore the second sentence of (iii) finds nothing to pertain to. But if the phrase ‘otherwise’ is inserted in front of the second sentence, a subordinate state will be created and the interpretation of the second sentence runs smoothly.⁶,⁷
6 To deal with the example of footnote 4 one needs to invoke a theory of focus like the one presented in Rooth (1985). According to this theory the general function of focus is evoking alternatives. (In this case, alternatives to Mary: which other people could we have asked to taste the wine?) These alternatives will give rise to subordinate states and are available for sentences in which the modal perfect is used to be interpreted in. (Apparently, we are ready to accommodate the idea that John is one of the alternatives.)
7 Stefan Kaufmann (Kaufmann 2000) uses stacks of states in a formalism to describe discourse phenomena like the one discussed here.
See also van Rooy (2005, this issue) for a discussion of modal subordination.
$S' = S[\text{If it had been the case that } \varphi]$.
In this paper my main concern is not the interpretation of the consequent of a counterfactual, but the interpretation of the antecedent. What is it to make a counterfactual assumption? Given a state $S$ and a sentence $\varphi$, what does $S[\text{If it had been the case that } \varphi]$
2 RAMSEY’S TEST AND TICHY’S PUZZLE The starting point for our analysis is the informal recipe for evaluating counterfactual conditionals named after Frank Ramsey. It plays a fundamental role in many theories of counterfactuals, notably in the theories presenting some variant of Premise Semantics (Rescher 1964; Veltman 1976; Kratzer 1981). Ramsey test: ‘This is how to evaluate a conditional: first, add the antecedent (hypothetically) to your stock of beliefs; second, make whatever adjustments are required to maintain consistency (without modifying the hypothetical belief in the antecedent); finally, consider whether or not the consequent is then true.’ The quotation is taken from Stalnaker (1968; 106). Ramsey’s original suggestion only covered the case in which the antecedent is consistent with one’s stock of beliefs. In that case no adjustments are required. In the above, Stalnaker generalizes this to the case in which the antecedent cannot be added to ‘your’ stock of beliefs without introducing a contradiction. In this case, which is typical of counterfactuals, adjustments are required. The Ramsey test is in need of amendments. Making a counterfactual assumption does not boil down to a minimal belief revision, as is illustrated by the counterexample devised by Pavel Tichy: ‘Consider a man, call him Jones, who is possessed of the following dispositions as regards wearing his hat. Bad weather invariably induces him to wear a hat. Fine weather, on the other hand, affects
look like? In the next section I will discuss some problems with the standard answer to this question. In sections 3 and 4 I will develop an alternative and show that it solves the problems discussed in section 2. In section 5 I will discuss the repercussions of the resulting theory of counterfactual conditionals for the treatment of indicative conditionals. Finally, section 6 is devoted to a comparison of the theory proposed here with its nearest neighbour, the theory proposed in Kratzer (1989).
him neither way: on fine days he puts his hat on or leaves it on the peg, completely at random. Suppose moreover that actually the weather is bad, so Jones is wearing his hat.’ Tichy (1976; 271)
A sentence of the form ⌜If it had been the case that $\varphi$, it would have been the case that $\psi$⌝ is true in the actual world $w$ iff the consequent $\psi$ is true in every world⁹ in which the antecedent $\varphi$ is true, and which in other respects differs minimally from $w$. Tichy claims that the counterfactual ‘If the weather had been fine, Jones would have been wearing his hat’, asserted in the context described above, meets this condition. In the actual world, it is raining and Jones is wearing his hat. Given that it is a matter of chance whether or not Jones wears his hat when the weather is fine, it would seem that for any sunny world in which Jones is not wearing his hat there is an equally sunny world in which he is, and which, because of this, is less different from the actual world. Lewis and Stalnaker are ready to admit that Tichy’s example shows that the relevant conception of minimal difference ‘needs to be spelled out with care’ (Stalnaker (1984; 129)), but they do not think the example shows that the idea of minimal difference is wrong. Perhaps contingencies such as whether or not Jones is wearing his hat do not matter when the differences and similarities of possible worlds have to be assessed. This is at least what Lewis suggests in Lewis (1979), where he formulates a system of weights that governs the notion of similarity
8 If you like the sentence better if there is a ‘still’ between ‘would’ and ‘have’ in the consequent, then please read it that way. 9 According to Stalnaker there is at most one such world, according to Lewis there may be more than one.
The question is: would you accept the sentence ‘If the weather had been fine, Jones would have been wearing his hat’?8 Presumably, your answer is ‘no’, but Ramsey’s recipe yields ‘yes’. We know (i) that Jones is wearing his hat, we know (ii) that it is raining. Now we must add to this the proposition (iii) that the weather is fine, thereby making the adjustments minimally required to maintain consistency. Clearly, this can be done without Jones having to take his hat off. Tichy’s criticism was not directed directly against the Ramsey test but against the analysis of counterfactuals developed by Robert Stalnaker and David Lewis. (Stalnaker (1968), Lewis (1973)). They proposed the following truth condition for counterfactual conditionals.
involved. After some remarks on the important role of ‘general’ laws in this matter,¹⁰ he says the following about the role of ‘particular’ fact. ‘It is of little or no importance to secure approximate similarity of particular fact.’ Lewis (1979; 472) Here is a variant¹¹ of Tichy’s puzzle which shows that this is not quite right.
And again, the question is whether you would accept the sentence ‘If the weather had been fine, Jones would have been wearing his hat’. This time, your answer will be ‘yes’. Lewis, too, would want to say ‘yes’, I guess. But can he? If similarity of particular fact did not matter in the first version of the puzzle, why would it now? What really matters is this: In both cases Jones is wearing his hat because the weather is bad. In both cases we have to give up the proposition that the weather is bad—the very reason why Jones is wearing his hat. So, why should we want to keep assuming that he has his hat on? In the first case there is no special reason to do so; hence, we do not. In the second case there is a special reason. We will keep assuming that Jones is wearing his hat because we do not want to give up the independent information that the coin came down heads. And this, together with the counterfactual assumption that the weather is fine, brings in its train that Jones would have been wearing his hat. In other words, similarity of particular fact is important, but only for facts that do not depend on other facts. Facts stand and fall together.12 In making a counterfactual assumption, we are prepared to give up everything that depends on something that we must give up to maintain consistency. But we want to keep in as many independent 10 As the first and the third criterion he mentions the following: It is of the first importance to avoid big, widespread, diverse violations of law. . . . It is of the third importance to avoid even small, localized, simple violations of law. 11 The example was suggested to me years ago by my former student Frank Mulkens. 12 This is also the idea behind Kratzer’s lumping semantics in Kratzer (1989).
Suppose that Jones always flips a coin before he opens the curtains to see what the weather is like. Heads means he is going to wear his hat in case the weather is fine, whereas tails means he is not going to wear his hat in that case. Like above, bad weather invariably makes him wear his hat. Now suppose that today heads came up when he flipped the coin, and that it is raining. So, again, Jones is wearing his hat.
facts as we can. In the next section I will develop this idea more precisely. 3 STATES, AND WHAT ASSUMPTIONS DO TO THEM In what follows I will assume that the reader is acquainted with the basic apparatus of possible worlds semantics. Definition 1 (Worlds and states) Fix a finite set A of atomic sentences.
In this definition a possible world is identified with the valuation that assigns the value 1 to the atomic sentences true in it, and 0 to the atomic sentences false in it. I am using ‘p’, ‘q’, and ‘r’ to refer to atomic sentences. I will often write ‘$\langle p, 1\rangle \in w$’ rather than ‘$w(p) = 1$’, and use a similar notation when situations are at stake. So, ‘$\langle q, 0\rangle \in s$’ means that the atom $q$ is false in the situation $s$. Pairs like $\langle p, 1\rangle$ and $\langle q, 0\rangle$ will sometimes be referred to as (positive and negative) facts constituting the situations they are elements of. I will write ‘$[\![\varphi]\!]$’ for the proposition expressed by $\varphi$, and assume the reader is acquainted with the fact that for formulas of propositional logic by definition the following holds:
$[\![p]\!] = \{w \in W \mid w(p) = 1\}$;
$[\![\neg\varphi]\!] = W \setminus [\![\varphi]\!]$;
$[\![\varphi \land \psi]\!] = [\![\varphi]\!] \cap [\![\psi]\!]$;
$[\![\varphi \lor \psi]\!] = [\![\varphi]\!] \cup [\![\psi]\!]$;
$[\![\varphi \to \psi]\!] = (W \setminus [\![\varphi]\!]) \cup [\![\psi]\!]$.
An agent’s cognitive state $S$ is given with two sets of possible worlds, $F_S$ and $U_S$, the former a subset of the latter. A world $w$ is supposed to be an element of $F_S$ if, for all the agent in state $S$ knows, $w$ might be the actual world. The set $U_S$ is called the universe of the state $S$. A possible world belongs to $U_S$ if all the propositions that an agent in state $S$ considers to be general laws hold in it.
(i) A world is a function with domain $A$ and range $\{0, 1\}$; a situation is a partial such function; a proposition is a set of worlds.
(ii) Let $W$ be the set of possible worlds. A cognitive state $S$ is a pair $\langle U_S, F_S\rangle$, where either (a) $\emptyset \neq F_S \subseteq U_S \subseteq W$; or (b) $F_S = U_S = \emptyset$.
It has often¹³ been noted that general laws play a special role in the interpretation of counterfactuals. Consider:
If John’s boat had been made of wood, it would not have sunk.
Imagine that John’s boat, an iron rowing boat, has sunk. It has been raining a lot lately, and John forgot to bail the water out. Probably, in this context you would prefer the above counterfactual to the next one.
If John’s boat had been made of wood, it would (still) have sunk.
If White had played 14.Nd5, Black would have lost
we are not prepared to consider worlds where chess is not played by the rules, or played by different rules. In any state $S$, $F_S \subseteq U_S$: the general laws set a limit to the factual information one can have. If the agent has no specific information about the actual world, then $F_S = U_S$: any world in which the general laws hold might be the actual world. In the minimal state, given by $\mathbf{1} = \langle W, W\rangle$, the agent neither has any factual information, nor is he acquainted with any law. For a state to be coherent it is required that $F_S \neq \emptyset$. A state $S$ in which $F_S = \emptyset$ is absurd: given the available information, there is no possible world left that might be the real one. In the mathematical setup we identify all absurd states and allow only one: $\langle\emptyset, \emptyset\rangle$, also known as $\mathbf{0}$, is the absurd state. Agents will avoid getting into this state. In our formal language $\Box$ will be used to express ‘it is a law that . . .’. Here, the dots have to be filled by a formula of propositional logic, $\Box$
13 A prominent example is John Pollock, who has stressed the point in all his writings on counterfactuals since Pollock (1976).
In making a counterfactual assumption we are not prepared to give up propositions we consider to be general laws. We will stick to a law of nature like Wood floats on water, at the cost of a contingent fact like John’s boat sank. It’s not just natural laws that are at stake here. Take for instance the proposition that bad weather invariably induces Jones to wear a hat, and think about the role this proposition plays in the scenarios sketched above. It is not a law of nature, of course, but it’s a law. We will not give it up when making counterfactual assumptions. When the weather is fine, we will assume that if the weather had been bad, Jones would have been wearing his hat, even if we have just seen him without it. Or take conventional laws like the rules of chess. In evaluating a statement like
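can only occur as the outermost operator of a formula. Part (ii) of the next definition explains what an update with $\Box\varphi$ amounts to.
Definition 2 (Interpretation) Let $\varphi$ be a formula of propositional logic.
(i) (a) $S[\varphi] = \langle U_S, F_S \cap [\![\varphi]\!]\rangle$ if $F_S \cap [\![\varphi]\!] \neq \emptyset$; (b) $S[\varphi] = \mathbf{0}$, otherwise.
(ii) (a) $S[\Box\varphi] = \langle U_S \cap [\![\varphi]\!], F_S \cap [\![\varphi]\!]\rangle$ if $F_S \cap [\![\varphi]\!] \neq \emptyset$; (b) $S[\Box\varphi] = \mathbf{0}$, otherwise.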
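The two update clauses of Definition 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's own implementation: the encoding of worlds as frozensets of (atom, value) pairs and the names `prop`, `update`, `law`, `supports`, `MINIMAL`, and `ABSURD` are introduced here for the example.

```python
from itertools import product

ATOMS = ("p", "q", "r")  # assumed three-atom language
# A world is a frozenset of (atom, value) facts; W is the set of all worlds.
W = frozenset(frozenset(zip(ATOMS, bits))
              for bits in product((0, 1), repeat=len(ATOMS)))
MINIMAL = (W, W)                      # the minimal state 1 = <W, W>
ABSURD = (frozenset(), frozenset())   # the absurd state 0

def prop(pred):
    """Proposition (set of worlds) on which pred(valuation dict) is true."""
    return frozenset(w for w in W if pred(dict(w)))

def update(S, P):
    """Definition 2 (i): S[phi], where P = [[phi]]."""
    U, F = S
    return (U, F & P) if F & P else ABSURD

def law(S, P):
    """Definition 2 (ii): S[box phi], where P = [[phi]]."""
    U, F = S
    return (U & P, F & P) if F & P else ABSURD

def supports(S, P):
    """S supports phi iff updating S with phi changes nothing."""
    return update(S, P) == S

not_q = prop(lambda v: v["q"] == 0)
r_implies_p = prop(lambda v: v["r"] == 0 or v["p"] == 1)

# The state 1[not q][box(r -> p)].
S = law(update(MINIMAL, not_q), r_implies_p)
U, F = S
print(len(U), len(F))       # sizes of the universe and of the live worlds
print(supports(S, not_q))   # not q is accepted in S
```

Under this encoding the universe keeps the six worlds satisfying r → p, and three of them remain candidates for the actual world.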
Figure 1
Definition 3 (Basis) Let $S = \langle U_S, F_S\rangle$ be a state.
(i) The situation $s$ forces the proposition $P$ within $U_S$ iff for every $w \in U_S$ such that $s \subseteq w$ it holds that $w \in P$.¹⁵
14 Note that this definition would not work if we had allowed stacking of $\Box$’s etc.
15 If there is no world $w \in U_S$ such that $s \subseteq w$, then, according to this definition, the situation $s$ forces every proposition.
Updating with a propositional formula $\varphi$ eliminates from $F_S$ all possible worlds in which $\varphi$ is false. Hence, only worlds in which $\varphi$ is true are left as worlds that might be the actual world. If there are no such worlds left, one gets into the absurd state. Similarly, an update with $\Box\varphi$ eliminates from $U_S$ all worlds in which $\varphi$ is false. So, only worlds in which $\varphi$ is true are left as worlds that might have been the actual world. The other ones are so outlandish, you do not have to reckon with them, not even in making the wildest counterfactual assumption.¹⁴ Below on the left a table is drawn representing the minimal state for a language with three atoms. Every row represents a world. The table in the middle represents the state that results when the minimal state is updated with $\neg q$, and on the right you see $\mathbf{1}[\neg q][\Box(r \to p)]$. For a given state $S$, worlds belonging to $F_S$ are printed in boldface, and worlds that do not belong to $U_S$ are struck through.
(ii) The situation $s$ determines the world $w$ iff $s$ forces $\{w\}$ within $U_S$.
(iii) The situation $s$ is a basis for the world $w$ iff $s$ is a minimal situation determining $w$ within $U_S$.
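Definition 3 lends itself to a brute-force check on small languages. The sketch below enumerates all sub-situations of a world and keeps the minimal ones that determine it. The encoding of worlds as frozensets of (atom, value) pairs and the helper names `world`, `forces`, `subsets`, and `bases` are assumptions made for this illustration; the universe used is the one of the state $\mathbf{1}[\neg q][\Box(r \to p)]$, that is, all worlds in which r → p holds.

```python
from itertools import combinations, product

ATOMS = ("p", "q", "r")  # assumed three-atom language
W = frozenset(frozenset(zip(ATOMS, bits))
              for bits in product((0, 1), repeat=len(ATOMS)))

def world(p, q, r):
    """Hypothetical helper: the world with the given truth values."""
    return frozenset(zip(ATOMS, (p, q, r)))

def forces(s, P, U):
    """Definition 3 (i): every world in U extending s lies in P."""
    return all(w in P for w in U if s <= w)

def subsets(facts):
    """All sub-situations of a set of facts."""
    facts = sorted(facts)
    return [frozenset(c) for k in range(len(facts) + 1)
            for c in combinations(facts, k)]

def bases(w, U):
    """Definition 3 (ii)-(iii): minimal situations determining w within U."""
    determining = [s for s in subsets(w) if forces(s, {w}, U)]
    return [s for s in determining if not any(t < s for t in determining)]

# Universe in which the law r -> p holds.
U = frozenset(w for w in W if dict(w)["r"] == 0 or dict(w)["p"] == 1)

w0 = world(0, 0, 0)
print(bases(w0, U) == [frozenset({("p", 0), ("q", 0)})])  # True
print(all(len(bases(w, U)) == 1 for w in U))              # True
```

The exhaustive search is exponential in the number of atoms, which is acceptable here because the examples in the text use at most three.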
Definition 4 (Retraction) Let $S = \langle U_S, F_S\rangle$ be a state.
(i) Suppose $w \in U_S$, and $P \subseteq W$. The set $w{\downarrow}P$ is determined as follows: $s \in w{\downarrow}P$ iff $s \subseteq w$ and there is a basis $s'$ for $w$ such that $s$ is a maximal subset of $s'$ not forcing $P$.
(ii) $S{\downarrow}P$, the retraction of $P$ from $S$, is the state $\langle U_{S\downarrow P}, F_{S\downarrow P}\rangle$ determined as follows: (a) $w \in U_{S\downarrow P}$ iff $w \in U_S$; (b) $w \in F_{S\downarrow P}$ iff $w \in U_S$ and there are $w' \in F_S$ and $s \in w'{\downarrow}P$ such that $s \subseteq w$.
(iii) The state $S[\text{if it had been the case that } \varphi]$ is given by $(S{\downarrow}[\![\neg\varphi]\!])[\varphi]$.
A counterfactual ⌜If it had been the case that $\varphi$, . . .⌝ is usually asserted in a context in which $\varphi$ is known to be false. Let’s concentrate on such contexts, and see what $S[\text{if it had been the case that } \varphi]$ amounts to.¹⁶ According to definition 4 (iii) we have to retract $[\![\neg\varphi]\!]$ from $S$ first and then update the result $S{\downarrow}[\![\neg\varphi]\!]$ with $\varphi$. Definition 4 (ii) and 4 (i) add that to retract $[\![\neg\varphi]\!]$ from $S$ the following has to be done for every world $w$ in $F_S$, and every basis $s'$ for $w$: given that the basis $s'$ forces the proposition $[\![\neg\varphi]\!]$, make minimal adjustments to $s'$ to the effect that $[\![\neg\varphi]\!]$ is no longer forced. It is very well possible that there are various ways to do so. Let $s$ be one of the results. The worlds in $U_S$ extending $s$ all belong to $S{\downarrow}[\![\neg\varphi]\!]$.
16 Again, in this paper I do not want to get into a discussion of the question whether counterfactuals presuppose the falsity of their antecedent. I am perfectly happy if this definition only works for cases in which both the speaker and the hearer believe that the antecedent is false. Maybe amendments are in order for the other cases.
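The retraction procedure of Definition 4 can be sketched as follows. This is a brute-force illustration rather than a definitive implementation: the encoding of worlds as frozensets of (atom, value) pairs and all function names (`retract`, `assume`, and the helpers) are assumptions of the example. The state used, $\mathbf{1}[\neg q][\Box(r \to p)]$, and the assumption ‘if it had been the case that q’ were chosen here for illustration only.

```python
from itertools import combinations, product

ATOMS = ("p", "q", "r")  # assumed three-atom language
W = frozenset(frozenset(zip(ATOMS, bits))
              for bits in product((0, 1), repeat=len(ATOMS)))
MINIMAL, ABSURD = (W, W), (frozenset(), frozenset())

def world(p, q, r):
    return frozenset(zip(ATOMS, (p, q, r)))

def prop(pred):
    return frozenset(w for w in W if pred(dict(w)))

def update(S, P):                      # Definition 2 (i)
    U, F = S
    return (U, F & P) if F & P else ABSURD

def law(S, P):                         # Definition 2 (ii)
    U, F = S
    return (U & P, F & P) if F & P else ABSURD

def forces(s, P, U):                   # Definition 3 (i)
    return all(w in P for w in U if s <= w)

def subsets(facts):
    facts = sorted(facts)
    return [frozenset(c) for k in range(len(facts) + 1)
            for c in combinations(facts, k)]

def bases(w, U):                       # Definition 3 (iii)
    det = [s for s in subsets(w) if forces(s, {w}, U)]
    return [s for s in det if not any(t < s for t in det)]

def retract(S, P):
    """Definition 4 (i)-(ii): retract P from S at the level of bases."""
    U, F = S
    kept = set()
    for w in F:
        for b in bases(w, U):
            cands = [s for s in subsets(b) if not forces(s, P, U)]
            kept |= {s for s in cands if not any(s < t for t in cands)}
    return (U, frozenset(w for w in U if any(s <= w for s in kept)))

def assume(S, P):
    """Definition 4 (iii): retract the negation of phi, then update with phi."""
    return update(retract(S, W - P), P)

not_q = prop(lambda v: v["q"] == 0)
r_implies_p = prop(lambda v: v["r"] == 0 or v["p"] == 1)
S = law(update(MINIMAL, not_q), r_implies_p)   # 1[not q][box(r -> p)]
S_if_q = assume(S, W - not_q)                  # S[if it had been the case that q]
# In each surviving world q is now true, while p and r keep their old values.
print(sorted(sorted(w) for w in S_if_q[1]))
```

In this example the values of p and r in each world of $F_S$ are carried over unchanged, which is the intended effect of retracting at the level of bases: independent facts are kept.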
A basis for a world $w \in U_S$ is a part of $w$ consisting of mutually independent facts which, given the general laws, bring the other facts constituting $w$ in their train. It is easy to check that every world in the universe of the state pictured on the right above has exactly one basis. (For instance, the one basis for $w_0$ is $\{\langle p, 0\rangle, \langle q, 0\rangle\}$.) However, generally speaking, it may very well be that a world has more than one basis. Making a counterfactual assumption ⌜If it had been the case that $\varphi$⌝ in state $S$ takes two steps. In the first step any information to the effect that $\varphi$ is in fact false is withdrawn from $S$, and in the second step the result is updated with the assumption that the antecedent $\varphi$ is true. Definition 4 (ii) describes the first step, and definition 4 (iii) the second.
$S \models$ if had been $\varphi$, would have been $\psi$ iff $S[\text{if had been } \varphi] \models \psi$.
In other words, a state $S$ supports a counterfactual conditional ⌜if had been $\varphi$, would have been $\psi$⌝ iff the subordinate state $S[\text{if had been } \varphi]$ supports the consequent $\psi$. (The reader will have noticed that I changed notation. For reasons of economy, I will henceforth write ‘if had been’, and ‘would have been’ rather than ‘If it had been the case that’, and ‘it would have been the case that’.)
Tichy 1 Let $p$ be short for ‘The weather is bad’, and $q$ for ‘Jones is wearing his hat’. We are interested in the state $S = \mathbf{1}[\Box(p \to q)][p][q]$, which is pictured in the left table below.
Figure 2
Readers acquainted with Premise Semantics will have recognized the melody. Just like in other versions of Premise Semantics we are interested in the maximal subsets of the ‘premise set’ that are consistent with the antecedent of the counterfactual. However, in this version the premise set for a world w is not given by the set of propositions that hold in w, as naive Premise Semantics would have it. In this version a premise set is given by a set of facts constituting a basis of the world w. And now, by ‘consistent’ we don’t just mean ‘logically consistent’, but ‘compatible with the general laws’. The crucial trick is that actual retraction takes place at the level of the bases of the worlds. This is because we want to keep in as many independent facts as we can, but don’t bother about facts that depend on other facts. This way we ensure that when a particular fact is retracted all the facts it takes in its train are retracted with it. To give a formal analysis of the Tichy cases, we do not need an exact definition of what an update with a counterfactual amounts to. All we need to agree upon is that this definition must satisfy the following constraint:
World $w_6$ has one basis, $\{\langle p, 1\rangle, \langle r, 0\rangle\}$. The one basis for $w_7$ is $\{\langle p, 1\rangle, \langle r, 1\rangle\}$. Applying definition 4, we see that $w_6{\downarrow}[\![p]\!] = \{\{\langle r, 0\rangle\}\}$, and $w_7{\downarrow}[\![p]\!] = \{\{\langle r, 1\rangle\}\}$. This means that $S{\downarrow}[\![p]\!] = \langle U_S, U_S\rangle$. Hence, the state $S[\text{if had been } \neg p]$ is the state given by the right table above. Clearly, $S[\text{if had been } \neg p] \not\models q$. Therefore, $S \not\models$ if had been $\neg p$, would have been $q$; in other words, the theory says that an agent in state $S$ should not accept the sentence ‘If the weather had been fine, Jones would have been wearing his hat’.
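The first Tichy case can be replayed mechanically. The sketch below is a brute-force illustration under the assumption that worlds are encoded as frozensets of (atom, value) pairs over the atoms p, q, r, with $w_6$ and $w_7$ the two worlds in which p and q are true; the function names are hypothetical helpers, not notation from the paper.

```python
from itertools import combinations, product

ATOMS = ("p", "q", "r")  # p: bad weather, q: Jones wears his hat, r: idle atom
W = frozenset(frozenset(zip(ATOMS, bits))
              for bits in product((0, 1), repeat=len(ATOMS)))
MINIMAL, ABSURD = (W, W), (frozenset(), frozenset())

def prop(pred):
    return frozenset(w for w in W if pred(dict(w)))

def update(S, P):                      # Definition 2 (i)
    U, F = S
    return (U, F & P) if F & P else ABSURD

def law(S, P):                         # Definition 2 (ii)
    U, F = S
    return (U & P, F & P) if F & P else ABSURD

def supports(S, P):
    return update(S, P) == S

def forces(s, P, U):                   # Definition 3 (i)
    return all(w in P for w in U if s <= w)

def subsets(facts):
    facts = sorted(facts)
    return [frozenset(c) for k in range(len(facts) + 1)
            for c in combinations(facts, k)]

def bases(w, U):                       # Definition 3 (iii)
    det = [s for s in subsets(w) if forces(s, {w}, U)]
    return [s for s in det if not any(t < s for t in det)]

def retract(S, P):                     # Definition 4 (i)-(ii)
    U, F = S
    kept = set()
    for w in F:
        for b in bases(w, U):
            cands = [s for s in subsets(b) if not forces(s, P, U)]
            kept |= {s for s in cands if not any(s < t for t in cands)}
    return (U, frozenset(w for w in U if any(s <= w for s in kept)))

def assume(S, P):                      # Definition 4 (iii)
    return update(retract(S, W - P), P)

p = prop(lambda v: v["p"] == 1)
q = prop(lambda v: v["q"] == 1)
p_implies_q = prop(lambda v: v["p"] == 0 or v["q"] == 1)

S = update(update(law(MINIMAL, p_implies_q), p), q)   # 1[box(p -> q)][p][q]
U, F = S
print(retract(S, p) == (U, U))           # True: retracting p frees every world in U
print(supports(assume(S, W - p), q))     # False: the counterfactual is not accepted
```

Both printed values agree with the analysis in the text: after retraction nothing is known beyond the law, so fine weather leaves the hat question open.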
Figure 3
The (only) basis for $w_7$ is $\{\langle p, 1\rangle, \langle r, 1\rangle\}$. Furthermore, $w_7{\downarrow}[\![p]\!] = \{\{\langle r, 1\rangle\}\}$, which means that $S{\downarrow}[\![p]\!] = \langle U_S, \{w_3, w_7\}\rangle$. Hence, the subordinate state $S[\text{if had been } \neg p]$ is the state pictured by the table on the right above. Notice that $S[\text{if had been } \neg p] \models q$. Therefore, $S \models$ if had been $\neg p$, would have been $q$. In other words, the theory says that we were right when we accepted ‘If the weather had been fine, Jones would have been wearing his hat’.
4 COUNTERFACTUALS AS TESTS
The next update condition for counterfactuals is the simplest condition in line with the constraint we formulated above:
Definition 5 (Counterfactuals as tests)
$S[\text{if had been } \varphi, \text{would have been } \psi] = S$, if $S[\text{if had been } \varphi] \models \psi$;
$S[\text{if had been } \varphi, \text{would have been } \psi] = \mathbf{0}$, otherwise.
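Definition 5 and the second Tichy scenario can be checked together with a small brute-force sketch. The encoding of worlds as frozensets of (atom, value) pairs and all function names, including `counterfactual_test`, are assumptions made for this illustration; the law is taken to be the biconditional (p ∨ r) ↔ q, matching the coin-flip story.

```python
from itertools import combinations, product

ATOMS = ("p", "q", "r")  # p: bad weather, q: hat on, r: coin came up heads
W = frozenset(frozenset(zip(ATOMS, bits))
              for bits in product((0, 1), repeat=len(ATOMS)))
MINIMAL, ABSURD = (W, W), (frozenset(), frozenset())

def world(p, q, r):
    return frozenset(zip(ATOMS, (p, q, r)))

def prop(pred):
    return frozenset(w for w in W if pred(dict(w)))

def update(S, P):                      # Definition 2 (i)
    U, F = S
    return (U, F & P) if F & P else ABSURD

def law(S, P):                         # Definition 2 (ii)
    U, F = S
    return (U & P, F & P) if F & P else ABSURD

def supports(S, P):
    return update(S, P) == S

def forces(s, P, U):                   # Definition 3 (i)
    return all(w in P for w in U if s <= w)

def subsets(facts):
    facts = sorted(facts)
    return [frozenset(c) for k in range(len(facts) + 1)
            for c in combinations(facts, k)]

def bases(w, U):                       # Definition 3 (iii)
    det = [s for s in subsets(w) if forces(s, {w}, U)]
    return [s for s in det if not any(t < s for t in det)]

def retract(S, P):                     # Definition 4 (i)-(ii)
    U, F = S
    kept = set()
    for w in F:
        for b in bases(w, U):
            cands = [s for s in subsets(b) if not forces(s, P, U)]
            kept |= {s for s in cands if not any(s < t for t in cands)}
    return (U, frozenset(w for w in U if any(s <= w for s in kept)))

def assume(S, P):                      # Definition 4 (iii)
    return update(retract(S, W - P), P)

def counterfactual_test(S, P, Q):
    """Definition 5: return S if S[if had been phi] supports psi, else 0."""
    return S if supports(assume(S, P), Q) else ABSURD

p = prop(lambda v: v["p"] == 1)
q = prop(lambda v: v["q"] == 1)
r = prop(lambda v: v["r"] == 1)
hat_law = prop(lambda v: (v["p"] == 1 or v["r"] == 1) == (v["q"] == 1))

S = update(update(law(MINIMAL, hat_law), r), p)   # 1[box((p or r) <-> q)][r][p]
print(retract(S, p)[1] == {world(0, 1, 1), world(1, 1, 1)})   # True: {w3, w7}
print(counterfactual_test(S, W - p, q) == S)                  # True: test passed
```

Because the coin toss is an independent fact, it survives the retraction of the bad weather, and the test of Definition 5 returns the state unchanged.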
Tichy 2 Turning to the variant of Tichy’s example, again, let $p$ be short for ‘The weather is bad’, and $q$ for ‘Jones is wearing his hat’. The atom $r$ stands for ‘The coin comes up heads’. This time we are interested in the state $S = \mathbf{1}[\Box((p \lor r) \leftrightarrow q)][r][p]$, which is given by the left table below.
Proposition Let $S$ be a state. $F_{S[\text{if had been } \varphi]} = \{w \in U_S \mid w \in F_{\langle U_S, \{t\}\rangle[\text{if had been } \varphi]} \text{ for some } t \in F_S\}$.
Given this definition, sentences of the form ⌜if had been $\varphi$, would have been $\psi$⌝ do not convey new information, not directly at least. They provide an invitation to perform a test. By asserting ⌜if had been $\varphi$, would have been $\psi$⌝, a speaker makes a kind of comment: ‘Given the general laws and the facts I am acquainted with, the sentence $\psi$ is supported by the state I get in when I assume that $\varphi$ had been the case’. The addressee is supposed to determine whether the same holds on account of his or her own information. If not, a discussion will arise, and in the course of this discussion both the speaker and the hearer may learn some new laws and facts, which could affect the outcome of the test. Such things sometimes really happen. Consider the Tichy case once more. Imagine that someone with the information of the variant of Tichy’s example says ‘If the weather had been fine, Jones would have been wearing his hat’ to someone who only has the information available in the original example. The addressee will not accept the statement. But then, when he or she hears about the coin etc., this will change. So, ultimately the addressee gets some new information about the actual world, but in a very indirect way. The reason why we cannot give a direct update rule for counterfactuals that works in all cases, is that it is not always clear which part of the new information is due to some hitherto unknown laws and which part to some hitherto unknown facts. More formally, in many cases it is not uniquely determined which worlds should be removed from the universe $U_S$ and which worlds from $F_S$. It could be that you should accept ⌜if had been $\varphi$, would have been $\psi$⌝ because it is a general law that whenever $\varphi$ is the case, $\psi$ is the case as well, or it could be that you should accept it because it happens to be the case that $\chi$, and it is a general law that you cannot have $\varphi$ and $\chi$ without having $\psi$. And these are just two possibilities.
If you don’t know beforehand which laws are involved, there are various ways to decompose the new information. However, in a context where the laws are fixed – like when we are discussing a chess game, or when we are solving problems in classical mechanics, we can give a direct update rule. The key to this update rule is supplied by the next proposition.
Definition 6
(a) If there is some $t \in F_S$ such that $\langle U_S, \{t\}\rangle[\text{if had been } \varphi] \models \psi$, then $S[\text{if had been } \varphi, \text{would have been } \psi] = \langle U_S, \{t \in F_S \mid \langle U_S, \{t\}\rangle[\text{if had been } \varphi] \models \psi\}\rangle$.
(b) Otherwise, $S[\text{if had been } \varphi, \text{would have been } \psi] = \mathbf{0}$.
As I already noted, this clause only works in cases in which no new laws can arise. Only when the universe is fixed do counterfactuals express a fixed proposition:
$[\![\text{if had been } \varphi, \text{would have been } \psi]\!] = \{w \in W \mid \langle U, \{w\}\rangle[\text{if had been } \varphi] \models \psi\}$
and only in those cases can we think of an update with a counterfactual as a propositional update:
$S[\text{if had been } \varphi, \text{would have been } \psi] = \langle U_S, F_S \cap [\![\text{if had been } \varphi, \text{would have been } \psi]\!]\rangle$
When the universe can change, counterfactuals get rather capricious. In that case they are not even persistent.
Definition 7 Let $S$ and $S'$ be states.
(i) $S$ is at least as strong as $S'$ iff $U_S \subseteq U_{S'}$ and $F_S \subseteq F_{S'}$.
(ii) A sentence $\varphi$ is persistent iff the following holds: if $S$ is at least as strong as $S'$, and $S' \models \varphi$, then $S \models \varphi$.
This proposition says that the operation of making a counterfactual assumption is distributive: we can think of $F_{S[\text{if had been } \varphi]}$ as the result of taking the union, for all $t \in F_S$, of all the sets $F_{\langle U_S, \{t\}\rangle[\text{if had been } \varphi]}$. Call $w \in F_{\langle U_S, \{t\}\rangle[\text{if had been } \varphi]}$ a $\varphi$-alternative to $t$. Using this terminology, we can reformulate the proposition and say that $F_{S[\text{if had been } \varphi]}$ consists of the $\varphi$-alternatives of all the worlds in $F_S$. Notice that a state $S$ supports the sentence ⌜if had been $\varphi$, would have been $\psi$⌝ iff the consequent $\psi$ holds in all $\varphi$-alternatives of all worlds in $F_S$. And if a state $S$ does not support ⌜if had been $\varphi$, would have been $\psi$⌝, then we can turn it into one that does by removing from $F_S$ all the worlds $t$ that have some $\varphi$-alternative in which $\psi$ does not hold. Thus we arrive at the following update clause.
$\mathbf{1}[p][q] \models$ if had been $\neg p$, would have been $q$, but $\mathbf{1}[p][q][\Box(\neg p \to \neg q)] \not\models$ if had been $\neg p$, would have been $q$. It is crucial that a law is learnt here.
5 ARE COUNTERFACTUALS AMBIGUOUS?
The test condition for counterfactuals provided above fits in nicely with the theory of indicative conditionals proposed in Gillies (2004). Stated in our format, Gillies suggests the following:
$S[\text{if } \varphi, \psi] = S$, if $S[\varphi] \models \psi$;
$S[\text{if } \varphi, \psi] = \mathbf{0}$, otherwise.
Some philosophers advocate a unified account of indicative and subjunctive conditionals. They believe that the only difference between indicatives and counterfactuals is that each is used in different circumstances. Indicatives are typically used in circumstances in which the speaker is ignorant about the truth value of the antecedent and counterfactuals in circumstances in which the agent thinks that the antecedent is false, but both express the same ‘connection’ between the antecedent and the consequent. It would seem that anyone subscribing to this position is committed to the following. An agent who is ignorant about the truth value of $\varphi$, but entitled to entertain the indicative conditional ⌜If $\varphi$, $\psi$⌝, will later, after learning that $\varphi$ is in fact false, be entitled to entertain the
If $S$ is stronger than $S'$, you know more laws and/or more facts in $S$ than you know in $S'$. However, this does not necessarily mean that in $S$ you will accept every sentence you accept in $S'$. Of course, intuitively this should hold for sentences that describe the facts, or that exemplify laws, but not all sentences do so. Well-known examples of sentences that are not persistent are sentences in which the epistemic modal ‘might’ occurs. With ⌜It might be the case that $\varphi$⌝ a speaker expresses that $\varphi$ is consistent with the information available. Obviously, as more information becomes available this consistency might get lost. (See Veltman 1996 for details.) The question is: are counterfactuals persistent? Here is a counterexample:
counterfactual ⌜If it had been the case that $\varphi$, it would have been the case that $\psi$⌝. At first sight, this looks quite plausible, but it is false, as the next scenario shows. The duchess has been murdered, and you are supposed to find the murderer. At some point only the butler and the gardener are left as suspects. At this point you believe
(i) If the butler did not kill her, the gardener did.
(ii) If the butler had not killed her, the gardener would have.
Actually, quite a few people believe that counterfactuals have two readings, an ‘epistemic’ reading and an ‘ontic’ one. They will maintain that on the epistemic reading sentence (ii) is true. In the epistemic case implicit reference is made to some previous epistemic state, in this example the state you were in when only two suspects were left. Thinking back, one can say that if it had not been the butler, it would have to have been the gardener.¹⁷ Notice that only people who have gone through the same epistemic process as you did in your role of detective, will be able to appreciate this epistemic reading. People who have never been in a state in which the butler and the gardener were the only suspects left, and who just wonder which course history would have taken if the butler had not killed the duchess, will rightly think that in that case the duchess might still have been alive. So, on this second, ‘ontic’ reading the sentence is plainly false. I myself doubt that (ii), or any other counterfactual for that matter, has an epistemic reading. There are other means to express what the epistemic reading is supposed to express. In any case, the theory proposed here only covers the ‘ontic’ reading as the next formal picture shows. Set $p :=$ ‘The butler killed the duchess’, $q :=$ ‘The gardener killed the duchess’, and $r :=$ ‘The duchess was killed’. Consider first the state
17 See Morreau (1992) for an epistemic semantics for counterfactuals.
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Still, somewhat later — after you found convincing evidence showing that the butler did it, and that the gardener had nothing to do with it — you get in a state in which you will reject the sentence
Frank Veltman 175
S ¼ 1½hððp _ qÞ/rÞ½r½p _ q; which comprises what you know after you found out that it must have been the butler or the gardener.
Notice that $S[\neg p] \models q$. Gillies' theory says, as any decent theory of indicative conditionals would do, that under these circumstances the state $S$ supports 'if $\neg p$, $q$'. Next, consider the state $S' = \mathbf{1}[\Box((p \lor q) \to r)][r][p \lor q][p][\neg q]$, pictured below on the left hand side. The (only) basis for $w_5$ is given by $\{\langle p, 1\rangle, \langle q, 0\rangle\}$. The state $S'\downarrow[\![p]\!]$ is $\langle U_S, \{w_0, w_1, w_5\}\rangle$. The state $S'[\text{if had been } \neg p]$ is the state pictured below on the right hand side.

Figure 5

Figure 6

$S'[\text{if it had been that } \neg p] \not\models q$. In other words, $S' \not\models \text{if had been } \neg p, \text{ would have been } q$. The above illustrates that there is a huge difference between making a counterfactual assumption and revising one's beliefs. When you believe that $\varphi$ is true and you imagine that $\varphi$ had been false, you have to change your cognitive state, but it is not the kind of change you
would have to make if you were to discover that $\varphi$ is in fact false. It is not a correction. Notice that $w_0 \in S'[\text{if it had been that } \neg p]$, which means so much as that if the butler had not killed her, the duchess might still be alive. However, if at some point you were to discover that your belief that the butler did it is in fact wrong, you would not automatically give up your belief that the duchess was killed. It is likely that you would reopen the investigations.18

6 A PROBLEMATIC CASE

The theory presented in this paper offers an alternative solution to the problems dealt with by Angelika Kratzer in Kratzer (1989). Such an alternative is called for given the defects in the formal set up of lumping semantics discussed in Kanazawa et al. (2005, this issue). I have tried to remedy these defects, and I have tried to do so keeping in as many of the informal ideas behind the lumping set up as I could. The result is a modification of naive Premise Semantics, just like Kratzer's theory, and just like hers it is a theory that makes concrete predictions in concrete cases. The predictions are in many cases the same. For example, given the account above, there is no reason why the sentence

If a different animal had escaped from the zoo, it would have been a zebra.

should be accepted by an agent with the following information:

Last year, a zebra escaped from the Hamburg zoo. The escape was made possible by a forgetful keeper who forgot to close the door of a compound containing zebras, giraffes, and gazelles. A zebra felt like escaping and took off. The other animals preferred to stay in captivity. (Kratzer 1989, p. 625)

This example poses a problem for an account which just follows Ramsey's recipe. After all, it is possible to accommodate the counterfactual assumption that a different animal escaped from the zoo without giving up the idea that it was a zebra. The reason why this example poses no problem for the theory presented here is that the information that a zebra escaped is not represented as an independent fact in the bases of the worlds that constitute the state $S$ supporting this information. Every basis of every world in $F_S$ will contain some object that is a zebra (fact 1) and that escaped (fact 2). This is all that is needed to enforce the proposition that a zebra escaped. In accommodating the assumption that a different animal escaped, in every basis of every world fact 2 will have to be replaced by a fact consisting of a different object that escaped. There is no reason why this object should be a zebra. Any other kind of animal will do.19 As far as I can see, there is just one case where our theory does not give the outcome wanted by Kratzer. Here is the story.

King Ludwig of Bavaria likes to spend his weekends at Leoni Castle. Whenever the Royal Bavarian flag is up and the lights are on, the King is in the Castle. At the moment the lights are on, the flag is down, and the King is away. Suppose now counterfactually that the flag were up. Well, then the King would be in the castle and the lights would still be on. But why wouldn't the lights be out and the King still be away? (Kratzer 1989, p. 640)

Kratzer needs all the lumping machinery to exclude the latter possibility. The theory presented here cannot exclude it. Here is a formal sketch of the situation: Let $p$ be short for 'The flag is up', $q$ for 'The lights are on', and $r$ for 'The king is out'. Consider the state $S = \mathbf{1}[\Box((p \land q) \to \neg r)][\neg p][q][r]$,20 given by the left table below.

Figure 7

The world $w_3$ has one basis, $\{\langle q, 1\rangle, \langle r, 1\rangle\}$, and to accommodate the counterfactual assumption $p$, one could give up either $\langle q, 1\rangle$ or $\langle r, 1\rangle$. Hence, $S[\text{if had been } p]$ is the state pictured on the right. Notice that $S[\text{if had been } p] \not\models \neg r$. Therefore, $S \not\models \text{if had been } p, \text{ would have been } \neg r$. According to the theory presented here, it is not the case that if the flag were up, the King would be in. Indeed, the lights might be out and the King might still be away.

18 See Rott (1999) for an insightful discussion of these points.
19 This can only be made more precise in a predicate logic version of the theory presented here.
20 Nothing much changes if one strengthens the law to $\Box((p \land q) \leftrightarrow \neg r)$.
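The two formal pictures (butler/gardener and King Ludwig) can be replayed mechanically. The following is only a minimal sketch under stated assumptions, not the paper's formal definitions: it assumes that $w_i$ is the world whose $(p, q, r)$ values spell $i$ in binary, that a basis is a minimal part of a world that singles it out among the worlds obeying the law, and that retraction keeps the maximal parts of a basis that do not force the retracted proposition. All function names (`bases`, `retract`, `if_had_been`) are hypothetical helpers introduced for illustration.

```python
from itertools import combinations, product

ATOMS = ("p", "q", "r")
# Assumed numbering: w_i is the world whose (p, q, r) values spell i in binary.
WORLDS = [dict(zip(ATOMS, bits)) for bits in product((0, 1), repeat=3)]

def index(w):
    return WORLDS.index(w)

def extends(w, s):
    """World w agrees with every <atom, value> pair in situation s."""
    return all(w[a] == v for a, v in s)

def forces(s, prop, universe):
    """s forces prop: every world in the universe extending s satisfies prop."""
    return all(prop(w) for w in universe if extends(w, s))

def subsets(pairs):
    pairs = list(pairs)
    return [frozenset(c) for k in range(len(pairs) + 1)
            for c in combinations(pairs, k)]

def bases(w, universe):
    """Minimal parts of w that single out w within the universe (assumed reading)."""
    fixing = [s for s in subsets(w.items())
              if [v for v in universe if extends(v, s)] == [w]]
    return [s for s in fixing if not any(t < s for t in fixing)]

def retract(facts, prop, universe):
    """Indices of worlds extending a maximal part of some basis that does not force prop."""
    kept = set()
    for w in facts:
        for b in bases(w, universe):
            free = [s for s in subsets(b) if not forces(s, prop, universe)]
            maximal = [s for s in free if not any(s < t for t in free)]
            kept |= {index(v) for s in maximal for v in universe if extends(v, s)}
    return kept

def if_had_been(law, facts, antecedent):
    """Retract the negation of the antecedent, then keep the antecedent worlds."""
    universe = [w for w in WORLDS if law(w)]
    fact_worlds = [w for w in universe if facts(w)]
    kept = retract(fact_worlds, lambda w: not antecedent(w), universe)
    return sorted(i for i in kept if antecedent(WORLDS[i]))

# Butler/gardener: law (p or q) -> r, facts p and not q, assumption not p.
butler = if_had_been(lambda w: not (w["p"] or w["q"]) or w["r"],
                     lambda w: w["p"] and not w["q"],
                     lambda w: not w["p"])
# King Ludwig: law (p and q) -> not r, facts not p, q, r, assumption p.
king = if_had_been(lambda w: not (w["p"] and w["q"] and w["r"]),
                   lambda w: not w["p"] and w["q"] and w["r"],
                   lambda w: w["p"])
print(butler, king)
```

Under these assumptions the butler case leaves the worlds with indices 0 and 1, in neither of which $q$ holds (and in $w_0$ the duchess is alive), while the King Ludwig case leaves indices 5 and 6, and $r$ still holds in $w_5$, so $\neg r$ is not supported.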
For those who share Kratzer's intuitions, this will be a drawback. However, there are examples with the same logical structure as the one above for which one wouldn't want that $S \models \text{if had been } p, \text{ would have been } \neg r$. Consider the case of three sisters who own just one bed, large enough for two of them but too small for all three. Every night at least one of them has to sleep on the floor. Whenever Ann sleeps in the bed and Billie sleeps in the bed, Carol sleeps on the floor. At the moment Billie is sleeping in bed, Ann is sleeping on the floor, and Carol is sleeping in bed. Suppose now counterfactually that Ann had been in bed...

I am pretty sure that this time you are not prepared to say: 'Well, in that case Carol would be sleeping on the floor'. Indeed, why wouldn't Billie be on the floor? Still, this example has the same logical structure as the King Ludwig example. Let $p$ stand for 'Ann sleeps in the bed', $q$ for 'Billie sleeps in the bed', and $r$ for 'Carol sleeps in the bed'. The question we are interested in is whether $\mathbf{1}[\Box((p \land q) \to \neg r)][\neg p][q][r] \models \text{if had been } p, \text{ would have been } \neg r$. We saw already that the answer is 'no'. If Kratzer's intuitions are right, there must be some crucial factor that the theory presented here does not take into account. I do not know what factor that would be. Clearly, there are important differences between the two examples: the three atoms figuring in the second example refer to facts with an equal 'epistemic status', whereas in the first example there is an important difference between, on the one hand, the king's presence and, on the other hand, the lights being on and the flag being up; the latter serve as external signs for the otherwise invisible occurrence of the former. I can imagine that an explanation of the difference between the two examples starts with this observation21, but I have no idea how the explanation would continue, let alone how to model it formally. This is not the only issue I have to leave for another occasion. I have just taken a first step in getting a decompositional analysis of counterfactual conditionals. Further steps are called for. Most urgent: in the above I have neglected all matters having to do with the interplay of tense and mood in the would+have+past participle construction. This means there is a range of problems I have nothing sensible to say about.

21 I owe this observation to one of the referees, who illustrates the point by a variant on the Ann/Billie/Carol puzzle: 'Suppose Carol is invisible. Suppose further that you are a proud parent of Ann, Billie and Carol, and before you go to bed you go in and check on the kids. As described in the original version, Ann is on the floor, Billie is in bed and Carol (obviously) is also in bed. Now you turn to your spouse and comment: if Ann had been in bed, Carol would have been on the floor.'
Let me give one example. For an indicative conditional to make sense, it is not necessary that the event described in the antecedent precede the event in the consequent. There is nothing wrong with a sentence like:

If he left the interview smiling, it went well.

However, in the counterfactual mood, this cannot be done.

If the interview had gone well, he would have left smiling.

sounds perfect. But it is hard, if not impossible, to get a reading of

If he had left the interview smiling, it would have gone well.22

in which the event described in the consequent precedes the event described in the antecedent. One wonders if this phenomenon is due to the peculiar way in which tense and mood are combined in the English modal perfect, or if there is a deeper, semantic or cognitive reason for it, which also affects the counterfactual mood in other languages. I hope it is possible to shed some light on this by combining some of the ideas put forward here with the event based semantics put forward in Condoravdi (2002).

22 There should at least be a comprehensible epistemic reading of this sentence, if at least counterfactuals have such readings.

Acknowledgment
I thank the referees for their comments and suggestions, and the editors for their help and patience.

Received: 23.05.04 Final version received: 14.09.04

FRANK VELTMAN
ILLC/Department of Philosophy
University of Amsterdam
Nieuwe Doelenstraat 15
1012 CP Amsterdam
e-mail: [email protected]

REFERENCES
Boyland, J. T. (1996) Morphosyntactic Change in Progress: A Psycholinguistic Treatment. Ph.D. thesis, Michigan State University.
Condoravdi, C. (2002) 'Temporal interpretation of modals'. In D. Beaver, L. Martinez, B. Clark & S. Kaufmann (eds), The Construction of Meaning. CSLI Publications. Palo Alto, 59–88.
Gillies, A. (2004) 'Epistemic conditionals and conditional epistemics'. Noûs 38:585–616.
Kanazawa, M., Kaufmann, S., and Peters, S. (2005) 'On the lumping semantics of counterfactuals'. Journal of Semantics (this issue).
Kaufmann, S. (2000) 'Dynamic discourse management'. In M. Faller, M. Pauly & S. Kaufmann (eds), Formalizing the Dynamics of Information. CSLI Publications. Palo Alto, 171–188.
Kratzer, A. (1981) 'Partition and revision: the semantics of counterfactuals'. Journal of Philosophical Logic 10:242–258.
Kratzer, A. (1989) 'An investigation of the lumps of thought'. Linguistics and Philosophy 12:607–653.
Lewis, D. (1973) Counterfactuals. Basil Blackwell. Oxford.
Lewis, D. (1979) 'Counterfactual dependence and time's arrow'. Noûs 13:455–476.
Morreau, M. (1992) 'Epistemic semantics for counterfactuals'. Journal of Philosophical Logic 21:33–62.
Pollock, J. (1976) Subjunctive Reasoning. Reidel. Dordrecht.
Rescher, N. (1964) Hypothetical Reasoning. North Holland Publishing Company. Amsterdam.
Rooth, M. (1985) Association with Focus. Ph.D. thesis, University of Massachusetts, Amherst, MA.
Rott, H. (1999) 'Moody conditionals: Hamburgers, switches, and the tragic death of an American president'. In J. Gerbrandy, M. Marx, M. de Rijke & Y. Venema (eds), JFAK. Essays dedicated to Johan van Benthem on the occasion of his 50th birthday. Amsterdam University Press. Amsterdam, 98–112.
Stalnaker, R. (1968) 'A theory of conditionals'. In N. Rescher (ed.), Studies in Logical Theory. Basil Blackwell. Oxford, 98–112.
Stalnaker, R. (1984) Inquiry. A Bradford Book. MIT Press. Cambridge, MA.
Tichy, P. (1976) 'A counterexample to the Stalnaker-Lewis analysis of counterfactuals'. Philosophical Studies 29:271–273.
van Rooy, R. (2005) 'A modal analysis of presupposition and modal subordination'. Journal of Semantics (this issue).
Veltman, F. (1976) 'Prejudices, presuppositions, and the theory of counterfactuals'. In J. Groenendijk & M. Stokhof (eds), Amsterdam Papers in Formal Grammar. Proceedings of the 1st Amsterdam Colloquium. University of Amsterdam, 248–281.
Veltman, F. (1996) 'Defaults in update semantics'. Journal of Philosophical Logic 25:221–261.
Journal of Semantics 22: 181–209 doi:10.1093/jos/ffh024 Advance Access publication April 11, 2005
Toward a Useful Concept of Causality for Lexical Semantics JERRY R. HOBBS Information Sciences Institute, University of Southern California
Abstract

We do things in the world by exploiting our knowledge of what causes what. But in trying to reason formally about causality, there is a difficulty: to reason with certainty we need complete knowledge of all the relevant events and circumstances, whereas in everyday reasoning tasks we need a more serviceable but looser notion that does not make such demands on our knowledge. In this work the notion of 'causal complex' is introduced for a complete set of events and conditions necessary for the causal consequent to occur, and the term 'cause' is used for the makeshift, nonmonotonic notion we require for everyday tasks such as planning and language understanding. Like all interesting concepts, neither of these can be defined with necessary and sufficient conditions, but they can be more or less tightly constrained by necessary conditions or sufficient conditions. The issue of how to distinguish what is in a causal complex from what is outside it is discussed, as is the issue of how, within a causal complex, to distinguish the eventualities that deserve to be called 'causes' from those that do not, in particular circumstances. One particular modal, the word 'would', is examined from the standpoint of its underlying causal content, as a linguistic motivation for this enterprise.

The Author 2005. Published by Oxford University Press. All rights reserved.

1 INTRODUCTION

It is natural to say that when you flip a light switch, you cause the light to go on. But it would not happen if a whole large system of other conditions were not in place. The wiring has to connect the switch to the socket, and be intact. The light bulb has to be in good working order. The switch has to be connected to a system for supplying electricity. The power plant in that system has to be operational, and so on. Flipping the light switch is only the last small move in a large-scale system of actions and conditions required for the light to go on. I will take as my starting point that people are able to recognize that a particular effect is caused by some 'causal complex'. By 'causal complex' I mean some collection of eventualities (events or states) whose holding or happening entails that the effect will happen. People may not know a priori what events or eventualities are in the causal complex or what constraints or laws the world is operating under. But they are able to reason to some extent about what may or may not be part of that causal complex. The first step in coming to a clear account of causality is deciding how to talk about such causal complexes and what criteria there are for deciding what eventualities are in or out of the causal complex for a particular effect. The second step is determining what we should mean by the predicate cause, as it appears in commonsense reasoning and lexical semantics. Splitting the inquiry like this, into an investigation of causal complexes and an investigation of the predicate cause, leads us to see two principal questions about causality that have been addressed in the literature:

1. How do we distinguish what eventualities are in a causal complex from those that are outside it?
2. Within a causal complex, how do we distinguish the eventualities that deserve to be called 'causes' from those that do not?

Lewis (1973), Ortiz (1999b), Simon (1952, 1991), and Pearl (2000) are primarily concerned with the first question. Mackie (1993) and Shoham (1990) are primarily concerned with the second. The first question leads one to examine counterfactuals. The second leads one to introduce nonmonotonicity. It is because Simon deals with the first and Shoham with the second that in their exchange (Shoham 1990, 1991; Simon 1991) they largely talk past one another. The first question is dealt with in section 3, and the second in section 5.

It should be noted at the outset that one of the aims of this article is the development of a theory of causality that will work equally well for physical causality and other types of causality, such as social, political, and economic causality, and the causality of folk psychology. It should work in any domain where we attribute the occurrence of events to underlying causal principles. Moreover, possible causes should be permitted to be not just actions, but also agentless events and states, such as tornadoes, the slipperiness of the floor, and a signature not being present on a document. We would like to be able to say that the lack of a signature causes a contract to be invalid.

In this article I first motivate the development of a coherent concept of causality in linguistics with an example involving modality: the word 'would'. But causal relations pervade discourse, and any number of other examples could be given. Much research in AI begins with simple intuitions about a phenomenon, but then problems are encountered, and by the time they are overcome, the formal treatment is quite complex. My aim in
this work is to frontload the complexity, so that the axiomatizations that result at the end preserve the original simplicity of the intuitions.

Before beginning the formal treatment of these questions, I will describe the notational conventions used in this article. The notation and ontology of Hobbs (1985a) will be employed. Briefly, corresponding to every predication $p(x)$ there is an eventuality $e$ which is the eventuality of $p$ being true of $x$. The expression $p'(e, x)$ says that $e$ is the eventuality of $p$ being true of $x$. Thus, $\mathit{tall}'(e, \mathit{John})$ says that $e$ is the eventuality or state of John's being tall. An eventuality may or may not exist in a particular possible world. The predication $\mathit{holds}(e, w)$ says that eventuality $e$ exists in world $w$. The predication $\mathit{Rexists}(e)$ says that eventuality $e$ exists in the real world. This ontological manoeuvre goes under the name of 'reification'; we have made things out of events. A possible world can be thought of as consisting of a set of eventualities that does not contain both an eventuality and its negation. A possible world may or may not be restricted to a particular moment in time. Since eventualities correspond to predications, it makes sense to talk about the conjunctions and negations of eventualities, and they are eventualities too. Thus, $\mathit{and}'(e, e_1, e_2)$ says that $e$ is an eventuality that exists when the eventualities $e_1$ and $e_2$ exist, and $\mathit{not}'(e, e_1)$ says that $e$ is an eventuality that exists when the eventuality $e_1$ does not exist. (In the following development, when $\mathit{not}'(e, e_1)$ holds, $e$ will normally be abbreviated to $\neg e_1$.) For a set $s$ of eventualities to hold in a world $w$, the conjunction of the eventualities in $s$ must hold in $w$, and thus each of the eventualities in $s$ must hold in $w$. In Hobbs (2003) the predicates $\mathit{and}'$ and $\mathit{not}'$ are axiomatized in a way that yields the right relations with the logical operators $\land$ and $\neg$. The use of this notation allows us to work entirely in first-order logic.

When one's primary focus is a particular phenomenon, like causality, a special-purpose logic that highlights the special features of the phenomenon may be justified, and indeed most formal explorations of causality have taken place in such special-purpose logics. But when, as in this research, the effort is part of the larger enterprise of developing an account of natural language understanding and/or reasoning in everyday life, it is better to have a simple and uniform logic for all phenomena, and that is what the introduction of eventualities gives us.

2 MODALITY: THE CASE OF 'WOULD'

In this section I assume an 'Interpretation as Abduction' framework (Hobbs et al. 1993). One uses a knowledge base of defeasible axioms to arrive at interpretations of discourse. Essentially, one seeks the best proof of the explicit content of the text, where 'best' is related in part to the reliability of the defeasible axioms used in the proof. Proofs are also better when they make use of redundant information conveyed in different parts of the text; this encourages the linking of that information, as we will see in the following example. Consider the pair of sentences from Frank and Kamp (1997).

I don't own a TV set. I would watch it all the time.

For the sake of exposition, let us simplify this to the slightly less idiomatic

(1) I don't own a TV. I would watch it.

The modal 'would', like all modals, is relative to a set of constraints $c$. We can build this into the predicate would by giving it a second argument: $\mathit{would}(e_4, c)$. In 'I would watch it', my watching it is $e_4$ and $c$ is the set of constraints that would result in my watching it. We can then reify the 'would' situation and write $\mathit{would}'(e_3, e_4, c)$. This says that $e_3$ is the 'would-ness' of situation $e_4$; we can think of $e_3$ as the hypotheticality of $e_4$ with respect to constraints $c$, that is, that $e_4$ obtains if the hypothesis $c$ obtains. Then the causal content of 'would' can be described by the axiom

$$\mathit{cause}'(e_3, c, e_4) \supset \mathit{would}'(e_3, e_4, c)$$

That is, if $e_3$ is the causal relation between some causing situation $c$ and another, caused situation $e_4$, then $e_3$ is the 'would-ness' property of $e_4$ with respect to the constraint $c$. To interpret text (1) we need to assume our knowledge base has two specific rules involving causality.

$$\mathit{cause}'(e_3, e_2, e_4) \land \textit{bad-for}(e_4, i) \land p'(e_4, i) \supset \mathit{cause}(e_3, e_1) \land \mathit{not}'(e_1, e_2)$$

$$\mathit{own}'(e_2, i, t) \supset \mathit{cause}'(e_3, e_2, e_4) \land \mathit{use}'(e_4, i, t)$$

The first is an axiom schema that says that if a situation ($e_2$) causes you to do an action ($e_4$) that is bad for you, that causal relationship ($e_3$) causes you not to bring about that situation. Don't do the cause if you don't want the effect. This will be instantiated below with watch instantiating the predicate variable $p$. The second rule says owning something causes you to use it.
Two more specific axioms are required. The first says that watching TV is bad for you.

$$\mathit{watch}'(e_4, i, t) \land \mathit{tv}(t) \supset \textit{bad-for}(e_4, i)$$

This is a belief that many have, including any sincere speaker of (1). The next axiom says that to use a TV set is to watch it.

$$\mathit{tv}(t) \land \mathit{use}'(e_4, i, t) \supset \mathit{watch}'(e_4, i, t)$$

One final axiom also involves causality. It says that one kind of information the adjacency of two sentences in discourse can convey is a causal relation between the eventualities described. The second sentence functions as an explanation of the first; the sentences are related by the 'coherence relation' of Explanation (Hobbs 1985b). If $\mathit{CoherenceRel}(e_1, e_3)$ means that there is some relation between a segment describing $e_1$ and a segment describing $e_3$, then the axiom

$$\mathit{cause}(e_3, e_1) \supset \mathit{CoherenceRel}(e_1, e_3)$$

says that the causal relation from $e_3$ to $e_1$ is one such possible relation. The fact that I would watch TV if I had one causes my not having one, and this causal relation is what is conveyed by the adjacency of the two sentences in the example.

In the 'Interpretation as Abduction' framework, one interprets a text by finding the best abductive proof of the logical form of the text. A proof is abductive if it allows assumptions. A proof is better insofar as it is shorter, makes fewer assumptions, uses more reliable axioms, and exploits redundancy in the text. The logical form of the text is the conjunction of the logical forms of the sentences, conjoined with a CoherenceRel predication representing the information conveyed by their adjacency. The logical form of text (1) is

$$\mathit{Rexists}(e_1) \land \mathit{not}'(e_1, e_2) \land \mathit{own}'(e_2, i, t) \land \mathit{tv}(t) \land \mathit{CoherenceRel}(e_1, e_3) \land \mathit{Rexists}(e_3) \land \mathit{would}'(e_3, e_4, c) \land \mathit{watch}'(e_4, i, x)$$

That is, there exists in the real world the negation ($e_1$) of my owning ($e_2$) a TV set $t$, and that is related to the 'would-ness' $e_3$ of my watching
($e_4$) something referred to as 'it' ($x$), a very tortured paraphrase of (1). At this point, the constraints $c$ have not yet been identified with $e_2$, the owning of a TV set, and 'it' ($x$) has not yet been identified with the TV set. The proof of this logical form given the above axioms is illustrated in Figure 1, where the arrows represent implication in the above axioms. The arrows converge from multiple conjuncts in antecedents of axioms and fan out to multiple consequents. Where use of an axiom results in identifying two variables in the logical form (i.e. resolving a coreference), there is an equality statement next to that axiom's arrow. We have to assume a hypothetical owning, the non-existence of that owning, and the existence of the 'would' property. Everything else can be proved from these assumptions. In the course of the proof, in order to get the best proof, the constraint $c$ is identified with the hypothetical owning, and 'it' is identified with the hypothetical TV set.

Figure 1 Interpretation of 'I don't own a TV. I would watch it.'

For the purposes of this article, what is most interesting about this example is the important role played by cause in the interpretation, and in particular, the role the causal content of 'would' plays in recognizing the coherence of the text, i.e. the relation between the two sentences, and in determining the identity of the constraints associated with 'would'. This example is also interesting because it has been posed as a challenge by Frank and Kamp (1997) for Discourse Representation Theory (DRT) approaches to co-reference via accessibility conditions. Here the hypothetical 'it' of the second sentence happily resolves to the nonexistent TV set embedded in negation in the first sentence. In the 'Interpretation as Abduction' approach, the supposed accessibility conditions on pronouns are really ways of computing whether there are contradictory statements about the existence of entities in the real world, for purposes of ruling out certain co-reference relations. Here the resolution is possible because we have identified the nonexistent owning as the hypothetical cause implicit in the modal 'would', and there is no contradiction.

In this example, the causal constraint that is implicit in the use of 'would' can be found explicitly in the immediately preceding sentence: the owning of a TV set, which would cause the watching. This is a made-up example, but real discourse is typically highly redundant. It therefore should not be surprising for the modal 'would' to convey a causal relation that is explicit in the surrounding text as well. I examined 120 examples of the modal 'would' in naturally occurring discourse from several genres, and in 101 of them this is true. For the listener or reader comprehending the discourse, making the connection between this cause and the complement of 'would' is part of the job of comprehension. For this reason, we can say that understanding the underlying causal nature of the modal 'would' is the beginning of understanding how it functions in discourse. Other modals can similarly be analysed from the perspective of causality.
For example, possibility is possibility with respect to a set of constraints. For something to be possible is for that set of constraints not to cause it not to occur. Other modals, such as ‘must’ and ‘should’, have their own implicit sets of causal constraints. One could argue that to understand a particular instance of a modal in discourse is to recover that set of causal constraints. The chief objection to basing a treatment of modality on causality is that causality presents such a quagmire of difficulties in philosophy. In this article, I propose not so much to solve these difficulties as to show that it is possible to work around them to create a coherent and usable theory of causality.
3 CAUSAL COMPLEXES
3.1 Change relevance
ð"w2 ; w1 ; e1 ; CÞ½closest-worldðw2 ; w1 ; e1 ; CÞ [½:holdsðe1 ; w1 Þ ^ holdsðe1 ; w2 Þ ^ ð"e2 2 w2 w1 Þ½½ðw1 \ w2 Þ [ C [ fe1 g e2 ^ :½ðw1 \ w2 Þ [ C e2 That is, eventuality e1 doesn’t hold in w1 but it does hold in w2, and for every eventuality e2 in the difference between w2 and w1, e2 follows from the common core of w1 and w2 and the constraints C together
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
With this background, let us now perform a thought experiment. Think of the world at any given instant as made up of a very very large set of eventualities which obtain at that instant. (In Pearl 2000 random variables taking on specific values correspond to the eventualities in this article.) Suppose the eventuality e is one of the eventualities that obtain, and we wish to understand the causal complex of which e is an effect. Now suppose we can reach into this world and cause an eventuality e1 to hold. That is, we toggle :e1 into e1. This change will propagate along what after the fact we think of as ‘causal chains’, changing other eventualities into their negations. But not everything will change. The effects of any single change tend to be quite local, in some sense of ‘local’. Suppose e is changed in this process. Then we know that making e1 true is relevant to e. In attempting to identify a causal complex that causes e, we have learned that there is one that has e1 in it. Let S be the set of possible worlds. Let Con(S, C) be the subset of S containing the worlds that respect some set C of constraints. The constraints may be thought of as background knowledge, or if the thought experiments are real, then simply the laws that would operate to bring about the consequences of a change. I will be as silent as possible about the structure of C. In particular, C may contain both what may be thought of as causal constraints and what may be thought of as noncausal, but I will not distinguish these a priori, since knowledge of the constraints is not given to us through intuition but is the hard-won result of careful investigation. We first need to define a predicate closest-world. We want to say that a world w2 is a closest world to world w1 with respect to an eventuality e1 and a set of constraints C if everything in w2 different from w1 is a consequence of adding e1. Formally,
Jerry R. Hobbs 189
with the eventuality e1, but does not follow from the common core of w1 and w2 and the constraints C alone. There is not necessarily a unique closest world w2. Then we can define change-relevant as follows, where the set of possible worlds S and the set of constraints C are fixed: ð"e1 ; eÞ½change-relevantðe1 ; eÞ [ ðdw1 ; w2 2 ConðS; CÞÞ½closest-worldðw2 ; w1 ; e1 ; CÞ ^ :½holdsðe; w1 Þ [ holdðe; w2 Þ
That is, to show that an eventuality e1 is change-relevant to an effect e, find two possible worlds w1 and w2 such that e1 doesn’t hold in w1 and does hold in w2 but where the two worlds are otherwise as close as possible, given the constraints C, and where the effect e holds in one world and not in the other. In other words, there is some situation in which turning e1 on will toggle e.

It is not necessarily the case that if w2 is a closest world to w1 when e1 is turned on, then w1 is a closest world to w2 when e1 is turned off. The constraints in C may cause the changes to propagate more in one direction than in the other. As a result, it does not follow from the definitions that if e1 is change-relevant to e then so is ¬e1. Events have consequences, and sometimes it is not possible to fix things by merely undoing what triggered the damage. If I drop a vase and it shatters, I can’t fix it just by lifting it up again.

This axiom may be thought of as instructions for how to carry out an experiment. We want to know if a certain factor e1 is relevant to a certain phenomenon e. We try to find situations in which e is absent and when we add the factor e1, e is present, or in which e is present and when we add the factor e1, e is absent.

Now we can propose the axiom

$$(\forall s, e)[\text{causal-complex}(s, e) \supset (\forall e_1 \in s)[\text{change-relevant}(e_1, e)]]$$

That is, if a set s of eventualities is a causal complex for an effect e, then all of the eventualities in s are change-relevant to e. Toggling them can change e under the right circumstances. This axiom does not define the notion of a causal complex, but it does constrain it. Change relevance is only a necessary condition for an element of a causal complex; it is not sufficient.
3.2 Examples
Consider two simple models for this set of axioms. In the first, there is a set of light switches on a table, and a light bulb. When the right combination of switches is toggled in the right ways, the light is on. Here the effect e is the light’s being on, ¬e its being off. There is an eventuality ei for each switch si which is the condition of its being on; ¬ei is the condition of its being off. A possible world is some combination of the switches being on or off. The condition e1 of a switch s1 being on is change-relevant to e, the light’s being on, if and only if there is an arrangement w1 of switches in which switch s1 is off, and we can turn it on and change the state e of the light. The possible world or arrangement w2 of switches, in this case, is the arrangement in which s1 is on and all other switches are as in w1. The proposition holds(e1, w2) is obviously true. The expression ¬[holds(e, w1) ≡ holds(e, w2)] means the light changes when the switch is toggled. There are no constraints in C that mean that one switch can affect the state of another switch, so the only consequence for the switches of turning s1 on is that s1 is on. Thus, the only e2 in w2 − w1 is e1 itself, and e1 follows trivially from (w1 ∩ w2) ∪ C ∪ {e1}, while e1 does not follow from (w1 ∩ w2) ∪ C since switches cannot influence each other. Thus the eventuality e1 is change-relevant to e, and is therefore not ruled out as a part of some causal complex for e.

For the second example, consider a line of dominos all of which are close enough to their neighbors to knock them down. Assume that the constraints C enforce this. The two possible states for each domino di will be being upright, ei (that is, upright′(ei, di)), and being knocked over to the right, ¬ei. Let the effect e be the eventuality of the domino d at the right end being upright. First let us ask if the eventuality ¬e1 of the leftmost domino d1 being down is change-relevant to e.
To show that it is, we need to find closest worlds w1 and w2 such that e holds in one and ¬e in the other. Let w1 be the world in which all the dominos are upright. In particular, d and d1 are upright, so e and e1 hold. Now let us introduce ¬e1 into w1; that is, we knock d1 down to the right. Because of the constraints, all the other dominos will go down. Thus the closest world to w1 after introducing ¬e1 under constraints C is the w2 in which all the dominos are down. In particular, d is down, so ¬e holds in w2, and the definition of change-relevant is satisfied. Now let us ask if the eventuality e1 is change-relevant to e. Let w1 be any world in which d1 is down to the right; that is, ¬e1 holds. Because of the constraints, all the other dominos will be down, and in particular,
¬e will hold. Now introduce e1; that is, set d1 upright. The constraints as stated do not entail that any other domino will thereby become upright, and thus the closest world to w1 is the one in which only d1 is upright. The eventuality ¬e still holds, and thus e1 is change-irrelevant to e. If however we augment the constraints C with ‘frame axioms’ that say that the only way a domino can be down is if its neighbor knocked it down, then e1 is change-relevant to e. Ortiz (1999a) builds a solution to the frame problem into his treatment of causality, so that frame axioms do not have to be explicitly stated. That has not been done here.
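The two models can be checked mechanically. The following is a minimal sketch, not part of the original development: it approximates closest-world by forward propagation of the constraints from the toggled eventuality, and it does not test the clause that the changes fail to follow from the common core alone. The names `propagate`, `consistent_worlds` and `change_relevant`, the wiring of the light (on exactly when s1 and s2 are on), the unwired switch s3, and the choice of four dominos are all assumptions made for illustration.

```python
from itertools import product

def propagate(world, rules):
    """Apply constraint rules (premise -> conclusion) until nothing changes."""
    world = dict(world)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if all(world[k] == v for k, v in premise.items()):
                for k, v in conclusion.items():
                    if world[k] != v:
                        world[k] = v
                        changed = True
    return world

def consistent_worlds(variables, rules):
    """All truth assignments that already respect the constraints."""
    worlds = []
    for values in product([False, True], repeat=len(variables)):
        w = dict(zip(variables, values))
        if propagate(w, rules) == w:
            worlds.append(w)
    return worlds

def change_relevant(var, value, effect, variables, rules):
    """True if setting var to value in some consistent world changes effect."""
    for w1 in consistent_worlds(variables, rules):
        if w1[var] == value:
            continue  # the toggled eventuality must not already hold in w1
        w2 = propagate({**w1, var: value}, rules)
        if w2[var] == value and w1[effect] != w2[effect]:
            return True
    return False

# Switch model (hypothetical wiring): the light is on iff s1 and s2 are on.
SWITCH_VARS = ["s1", "s2", "s3", "light"]
SWITCH_RULES = [
    ({"s1": True, "s2": True}, {"light": True}),
    ({"s1": False}, {"light": False}),
    ({"s2": False}, {"light": False}),
]

# Domino model: True means upright; a fallen domino knocks down its right neighbor.
DOMINOS = ["d1", "d2", "d3", "d4"]
KNOCK = [({a: False}, {b: False}) for a, b in zip(DOMINOS, DOMINOS[1:])]
# Frame axioms: a domino is down only if its left neighbor knocked it down.
FRAME = [({a: True}, {b: True}) for a, b in zip(DOMINOS, DOMINOS[1:])]

print(change_relevant("s1", True, "light", SWITCH_VARS, SWITCH_RULES))
print(change_relevant("d1", False, "d4", DOMINOS, KNOCK))
print(change_relevant("d1", True, "d4", DOMINOS, KNOCK))
print(change_relevant("d1", True, "d4", DOMINOS, KNOCK + FRAME))
```

Under these assumptions, knocking the leftmost domino down is change-relevant to the rightmost one, setting it upright is not, and adding the frame axioms restores change-relevance in that direction, which mirrors the asymmetry discussed above.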
3.3 Temporal order and causal priority
We can say that if an eventuality is a member of a causal complex for an effect, then it is ‘causally involved’ in the effect.

$$(\forall e, e_1)[\text{causally-involved}(e_1, e) \equiv (\exists s)[\text{causal-complex}(s, e) \wedge e_1 \in s]]$$

A further constraint on causes and effects is that the cause cannot happen after the effect.

$$(\forall e_1, e)[\text{causally-involved}(e_1, e) \supset \neg\text{before}(e, e_1)]$$

However, the facts about causal flow are not entirely determined by knowing the times of events. There are cases where events occur simultaneously, but we have clear intuitions about what caused what. For example, flipping the switch and the light coming on are perceptibly simultaneous (except on airplanes). Yet clearly it is flipping the switch that causes the light to go on. Or consider a person hammering a nail. The person’s arm reaching the end of its trajectory, the head of the hammer striking the head of the nail, and the nail beginning its motion into the surface are all simultaneous but have a clear causal order. It may be that the criteria for causal flow for many such cases can be spelled out in various domain theories with varying specificity.

At a general level, we can sometimes make judgments of causal flow between eventualities within a larger causal complex. We certainly want it to be the case that the eventualities in a causal complex for an event are causally prior to the effect, unless there is a feedback loop. Thus,

$$(\forall e_1, e_2)[\text{causally-involved}(e_1, e_2) \wedge \neg\text{causally-involved}(e_2, e_1) \supset \text{causally-prior}(e_1, e_2)]$$
Given a causal complex s for an effect e, consider two eventualities e1 and e2 in s. There may be cases where we know that e1 is itself in a causal complex s1 for e2 and not vice versa. In this case, we would know that e1 is causally prior to e2, independent of information about time. Causal priority is related to temporal order in that if e1 is causally involved in e and occurs before e, then it is causally prior to e:

$$(\forall e_1, e)[\text{causally-involved}(e_1, e) \wedge \text{before}(e_1, e) \supset \text{causally-prior}(e_1, e)]$$
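As a rough illustration of how these axioms interact, the sketch below derives causal priority from a small table of causal complexes and temporal facts. It is only a sketch under stated assumptions: the table `COMPLEXES`, the set `BEFORE`, the eventuality names (including the artificial feedback pair `loop_a` and `loop_b`), and the function names are hypothetical, and `causally_prior` reports only what the two sufficient conditions license, since the text gives no full definition of causal priority.

```python
# Hypothetical causal complexes: effect -> list of complexes (sets of eventualities).
COMPLEXES = {
    "hammer_strikes_nail": [{"arm_reaches_end"}],
    "nail_moves": [{"hammer_strikes_nail"}],
    "shatter": [{"let_go", "fall"}],
    "loop_a": [{"loop_b"}],  # feedback loop: each is involved in the other
    "loop_b": [{"loop_a"}],
}

# Hypothetical temporal facts: (a, b) means a occurs before b.
# The hammering eventualities are simultaneous, so none of them appears here.
BEFORE = {("let_go", "shatter")}

def causally_involved(e1, e):
    """e1 is a member of some causal complex for e."""
    return any(e1 in s for s in COMPLEXES.get(e, []))

def temporally_consistent():
    """Check that no causally involved eventuality occurs after its effect."""
    return all(
        (e, e1) not in BEFORE
        for e, complexes in COMPLEXES.items()
        for s in complexes
        for e1 in s
    )

def causally_prior(e1, e2):
    """True when one of the two sufficient conditions licenses the conclusion."""
    if not causally_involved(e1, e2):
        return False
    return (e1, e2) in BEFORE or not causally_involved(e2, e1)

print(causally_prior("arm_reaches_end", "hammer_strikes_nail"))
print(causally_prior("loop_a", "loop_b"))
```

In the hammering case priority is derived without any temporal information, while in the feedback pair neither eventuality is derived to be prior to the other.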
Pearl (2000) faces the problem of causal flow in his treatment of counterfactuals and causality. He models his causal complexes as Bayesian networks. The links in these networks have an intrinsic directionality. There is nothing in the definition of Bayesian networks that requires this directionality to respect the direction of causal flow, but in the examples one sees, they generally do. Because of this, when he looks for the closest world, he can simply excise the links into the node whose value he wants to change, and the backwards propagation of the change is prevented.

Ortiz (1999b) has also dealt with the problem of causal flow by stipulation. He divides his constraints into two sets, those used for prediction, LP, and those used for explanation, LE. In the former, inference follows the direction of causal flow; in the latter, it goes against causal flow. When constructing the nearest counterfactual world to the real world, he favours suspending the laws in LE over those in LP, thereby preventing, or at least discouraging, propagation of changes against causal flow. (In my view, explanation is not a process of deduction but of abduction. Thus, the same forward rules would be used for both prediction and explanation, but in explanation they would be used abductively by back-chaining over them.)

None of the development here precludes circular causation, or feedback loops. Suppose two books are leaning against each other. The first book’s leaning against the second is in the causal complex causing the second to be upright, and vice versa.

Now we can see that the definitions of closest-world and change-relevant are not as tight as our intuitions will allow. Suppose e1 causes e and is the only possible cause of e. That is, in the constraints C there is a constraint that whenever one occurs, the other does too, but we nevertheless know that e1 is causally prior to e. Then e1 and e occur in all the same worlds, and each is change-relevant to the other, by our
current definitions, and there is no distinguishing which is in the causal complex for the other. To deal with this problem, we first need a way of loosening the constraints on possible worlds. Then we need to stipulate that when we toggle an eventuality, the changes cannot propagate against causal flow.

Consider, for example, the situation in which someone fires a gun, a loud bang happens, and someone dies. We want to know if the loud bang is causally implicated in the death. The constraints that the set of possible worlds must respect are that the firing occurs if and only if the bang occurs, and that the firing occurs if and only if the death occurs. If the bang occurs, then by the first constraint, so does the firing, and thus by the second constraint so does the death. Therefore, the closest world to the world in which nothing happens is the world in which everything happens, and the bang is change-relevant to the death. Thus it is not ruled out as part of the causal complex that causes the death. Yet we know the firing is causally prior to the bang, and that when we infer the firing from the bang, we are propagating changes against causal flow.

We would like to stipulate in the definition of closest-world that in selecting w2 we only change eventualities that are not causally prior to e1. But this will normally involve breaking some of the constraints (e.g. bang implies firing), and thus there is no such possible world in the set of possible worlds that respect the constraints C. We need to alter the set of constraints to a weaker set C′.

One way to modify C into C′ is suggested by nonmonotonic logic. If an axiom or constraint in C allows us to draw a conclusion from e1 to something causally prior to it, then we must be able to disable that axiom somehow, or we could reason backwards about causally prior conditions. One way to do this is to consider the axiom to be defeasible. In nonmonotonic logic this is done by including ¬abi predications in the antecedent of a rule: P ∧ ¬abi ⊃ Q (McCarthy 1980). This says that if P is true and a specific abnormality condition abi does not hold, then Q is true. Equivalently, in “Interpretation as Abduction” (Hobbs et al. 1993), most axioms are assumed to be of the form P ∧ etci ⊃ Q.

Thus, we can change ¬e1 to e1 and prevent back-propagation of its effects by changing C into C′ the following way: Suppose the constraints are expressed in disjunctive normal form. Then for every constraint that contains a negative occurrence of e1 (or holds(¬e1, w)) and a positive occurrence of a causally prior eventuality, disjoin an abnormality predication abi(...) to it. For example, we change the rule

$$\text{bang} \supset \text{fire} \quad (\text{i.e., } \neg\text{bang} \vee \text{fire})$$

into

$$[\text{bang} \wedge \neg ab_i(\ldots)] \supset \text{fire} \quad (\text{i.e., } \neg\text{bang} \vee ab_i \vee \text{fire})$$
Then when we change ¬e1 to e1, it will be possible to maintain consistency between the altered constraints C′ and the condition e1 by assuming the abnormality predications. Thus, from a set of constraints C and an eventuality e1, we can define a new set of constraints in which all the constraints that would allow us to draw conclusions about causally prior eventualities are made defeasible. It is assumed that the constraints are expressed in disjunctive normal form, and disjunct-in(e, a) means that e is a top level disjunct in the expression a.

$$\begin{aligned}
\text{Defeas}(C, e_1) = {} & \{a \vee ab_a \mid a \in C \wedge \text{disjunct-in}(\neg e_1, a) \wedge (\exists e')[\text{disjunct-in}(e', a) \wedge \text{causally-prior}(e', e_1)]\} \\
{} \cup {} & \{a \mid a \in C \wedge \neg[\text{disjunct-in}(\neg e_1, a) \wedge (\exists e')[\text{disjunct-in}(e', a) \wedge \text{causally-prior}(e', e_1)]]\}
\end{aligned}$$

By allowing possible worlds consistent with this modified set of constraints, we can eliminate inferences, for example, from the bang to the firing, and consequently from the bang to the death. We can now modify the definition for closest-world as follows:

$$\begin{aligned}
(\forall w_2, w_1, e_1, C)[&\text{closest-world}(w_2, w_1, e_1, C) \equiv [w_1 \in \text{Con}(S, C) \wedge w_2 \in \text{Con}(S, \text{Defeas}(C, e_1)) \\
&\wedge \neg\text{holds}(e_1, w_1) \wedge \text{holds}(e_1, w_2) \wedge (\forall e' \in w_2)[\text{causally-prior}(e', e_1) \supset e' \in w_1] \\
&\wedge (\forall e_2 \in w_2 - w_1)[[(w_1 \cap w_2) \cup \text{Defeas}(C, e_1) \cup \{e_1\} \supset e_2] \wedge \neg[(w_1 \cap w_2) \cup \text{Defeas}(C, e_1) \supset e_2]]]]
\end{aligned}$$

The definition of change-relevant has to be modified as well, since now world w2 need only respect the weakened constraints:

$$(\forall e_1, e)[\text{change-relevant}(e_1, e) \equiv (\exists w_1 \in \text{Con}(S, C), w_2 \in \text{Con}(S, \text{Defeas}(C, e_1)))[\text{closest-world}(w_2, w_1, e_1, C) \wedge \neg[\text{holds}(e, w_1) \equiv \text{holds}(e, w_2)]]]$$

Now the closest world to the world in which nothing happens is the world in which only the bang happens, and the bang is not change-relevant to the death.
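The gun example can be worked through by brute force. The sketch below is illustrative only and rests on several assumptions: constraints are encoded as clauses of (eventuality, polarity) literals, a defeasible clause is treated as always satisfiable by assuming its abnormality, and the closest world is approximated as the admissible world that differs from w1 in the fewest eventualities rather than by the entailment condition in the definition. The names `defeas`, `satisfies`, `closest_worlds`, `change_relevant`, and the flag `protect_prior` are hypothetical.

```python
from itertools import product

EVENTS = ["fire", "bang", "death"]

# Constraints as clauses (disjunctions of literals): fire iff bang, fire iff death.
C = [
    [("bang", False), ("fire", True)],   # bang implies fire
    [("fire", False), ("bang", True)],   # fire implies bang
    [("fire", False), ("death", True)],  # fire implies death
    [("death", False), ("fire", True)],  # death implies fire
]

# (a, b) means a is causally prior to b.
CAUSALLY_PRIOR = {("fire", "bang"), ("fire", "death")}

def defeas(constraints, e1):
    """Pair each clause with a flag: defeasible if it contains the negation of
    e1 together with a positive occurrence of a causally prior eventuality."""
    marked = []
    for clause in constraints:
        backward = (e1, False) in clause and any(
            pol and (ev, e1) in CAUSALLY_PRIOR for ev, pol in clause
        )
        marked.append((clause, backward))
    return marked

def satisfies(world, marked):
    """A defeasible clause can always be met by assuming its abnormality."""
    return all(
        defeasible or any(world[ev] == pol for ev, pol in clause)
        for clause, defeasible in marked
    )

def closest_worlds(w1, e1, marked, protect_prior):
    """Admissible worlds containing e1 that differ least from w1."""
    candidates = []
    for values in product([False, True], repeat=len(EVENTS)):
        w2 = dict(zip(EVENTS, values))
        if not w2[e1] or not satisfies(w2, marked):
            continue
        # Causally prior eventualities in w2 must already hold in w1.
        if protect_prior and any(
            w2[ev] and not w1[ev] for ev in EVENTS if (ev, e1) in CAUSALLY_PRIOR
        ):
            continue
        candidates.append(w2)

    def distance(w2):
        return sum(w1[ev] != w2[ev] for ev in EVENTS)

    if not candidates:
        return []
    best = min(distance(w2) for w2 in candidates)
    return [w2 for w2 in candidates if distance(w2) == best]

def change_relevant(e1, e, w1, marked, protect_prior):
    return any(w2[e] != w1[e] for w2 in closest_worlds(w1, e1, marked, protect_prior))

nothing_happens = {ev: False for ev in EVENTS}
original = [(clause, False) for clause in C]

print(change_relevant("bang", "death", nothing_happens, original, False))
print(change_relevant("bang", "death", nothing_happens, defeas(C, "bang"), True))
```

With the original constraints the only admissible world containing the bang is the one in which everything happens; with the weakened constraints the only admissible world is the one in which the bang alone happens.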
3.4 Structure in causal complexes
The two key features of causal complexes are that if all the eventualities in the causal complex obtain, then the effect will occur, and there is nothing in the causal complex that is irrelevant to the effect. This characterization allows for internal causal structure in causal complexes. For example, if someone lets go of a vase, the vase will fall, and when it hits the floor it will shatter. Suppose only the letting go and the falling are relevant eventualities to the effect of shattering. The falling alone constitutes a causal complex for the shattering. It’s relevant to the shattering, and if it happens, the shattering happens. Similarly, the letting go alone constitutes a causal complex for the shattering. It’s relevant to the shattering, and if it happens, the shattering happens. Finally, the letting go and the falling together constitute a causal complex for the shattering. They are both relevant to the shattering, and if they both happen, the shattering happens.

In a causal complex s for e, it may be the case that there is a subset s1 of s and an eventuality e1 in s and not in s1 such that s1 is a causal complex of e1, in which case s − {e1} is also a causal complex for e. For example, since the letting go causes the falling, the letting go is a sufficient causal complex for the shattering. The letting go, in a sense, is more “ultimate” than the falling.

Causal complexes can sometimes be composed. If s1 is a causal complex for e and contains the eventuality e1 and s2 is a causal complex for e1 and s1 and s2 are consistent, then s1 ∪ s2 is a causal complex for e.
3.5 Causality and implication
There is a problem that Pearl does not address at all and that Ortiz addresses but bypasses by stipulation, that I also do not have a solution to: What principled ways are there to distinguish between causal connections and mere implicational connections? Clyde’s being an elephant implies that Clyde is a mammal, but does not cause it. A stapler’s being on a piece of paper causes the paper not to blow away, but it only implies that the paper is under the stapler. John’s flipping a switch causes a light to go on, but John’s flipping a switch only implies John turned the light on; they are two different descriptions of the same event. Nevertheless, the two notions are closely related, as evinced by the fact that “because” is used to convey either of them. My view is that implication is a kind of “washed-out” causality. It is causality applied to the informational domain. Another take on the relation is that it is a variety of metonymy. If P implies Q, then for someone to think P causes them to think Q.
4 COUNTERFACTUALS
A counterfactual statement is a conditional whose antecedent is counter to the truth, as in

If John had read the driver’s manual, he would have passed the exam.
If John were a millionaire, he would retire.

It is distinguished in English by the use of the otherwise rare subjunctive mood in the antecedent and the modal “would” in the consequent.

In the philosophical and more recently in the AI literature (e.g. Lewis 1973; Ortiz 1999b; Pearl 2000), counterfactuals are taken to give us reliable insights into the facts about causality. Thus, John’s reading the driver’s manual would cause him to pass the exam, and John’s being a millionaire would cause him to retire.

Counterfactuals are certainly related to causality. In the definitions of closest-world and change-relevant, suppose e holds in w2. The world w2 results from the occurrence of e1 in w1. If this had not occurred, then we would still be in w1, in which e does not hold. That is, if e1 had not occurred, then e would not have occurred, which is the counterfactual. Pearl observes that a counterfactual is a natural language way of saying that one eventuality should be changed and everything else should remain the same, insofar as possible. This is what the definition of closest-world attempts to capture. Ortiz develops a rich notion of counterfactual reasoning, including an impressive account of how to minimize the changes triggered by the counterfactual, and defines causality in terms of that.

My position in this article however is that, aside from the hints they give us for formalizing causality, counterfactuals in English are just a particular kind of English expression, with no special or privileged status. I have not adopted the position that we have clear intuitions about the use of counterfactuals that give us special access to facts about causality. Rather I am seeking to develop a clear and coherent theory of causality, where one of the ultimate aims of the theory is to provide the predicates and axioms required for characterizing and relating lexical items with a causal flavor, such as the conjunctions “if”, “because”, and “so”, causative verbs, the subjunctive mood, and modal
auxiliaries such as “would”, within a framework of interpreting discourse by abduction. The issue is what counts as evidence. My view is that a challenging counterfactual is not direct evidence against a particular theory of causality, but rather a challenge for characterizing “if”, “would”, and the subjunctive mood in terms of that theory.

However useful a tool for discerning causality the counterfactual is, it does not help us with the problem of distinguishing between causality and mere implication. It is a true, implicational fact that chimpanzees are not monkeys. The following counterfactual sentence is perfectly fine, and it is based on this implicational relation:

If Bonzo were a monkey, he wouldn’t be a chimpanzee.

A classic conundrum in philosophical treatments of causality, especially those based on counterfactuals, is the problem of preemption. Suppose Adam and Ben both want to murder Chuck. Before Chuck walks off into the desert, Adam secretly drills a tiny hole (H) in Chuck’s canteen so it will be empty (E) by the time he needs a drink and he will die (D) of thirst. Independently Ben poisons the water in the canteen (P) so when Chuck takes a drink he will die (D). Chuck walks off into the desert and dies of thirst. Did the hole H cause the death D? This is a problem for counterfactual approaches to causality because if Adam hadn’t drilled the hole, Chuck still would have died, of poisoning by Ben. The hole pre-empted the poisoning.

In the framework of causal complexes, this situation presents no particular problems. There is a causal complex including H in which E is the effect. There is a causal complex including H and E in which D is the effect. There is a causal complex including P and ¬E in which D is the effect. Since H occurred, the conditions for P’s causal complex did not obtain, and that causal complex was thus not what resulted in D. Adam murdered Chuck. Ben only attempted to murder Chuck. (The movie Gosford Park is based on exactly this premise.)
5 PROBABILITY AND CAUSALITY
When you flip a coin, you say that there is a 50% probability that it will come up heads. But you say this because you are ignorant of all the conditions that cause the outcome to be what it is, such as the distribution of mass inside the coin, the air currents, the force with which the coin is flipped, the distance it falls, and so on. If we knew all of
these, and knew all the relevant physical laws, we could predict the result of flipping the coin with certainty. The reason there is a 50% probability of the coin coming up heads is that there is a 50% probability of those hidden conditions being such that a heads will result.

Now consider a causal complex s for an effect e, and consider a subset s1 of s. For simplicity, suppose s is the only causal complex that will cause e. We can talk of the probability of s1 causing e. It is simply the probability that all the conditions in s − s1 will be true. Thus, the notion of a set s1 of events causing an effect e with some probability can be reduced to the joint probability of the eventualities in the set s − s1 occurring, within the development of causality already presented.

None of this should be taken to deny the utility of probabilistic approaches to causality. On the contrary, such approaches provide a means of reasoning about causality at a granularity intermediate between the full rigor of causal complexes and the defeasible reasoning in terms of the predicate cause.
6 WHAT THE PREDICATE CAUSE MEANS
We could use predications of the form causal-complex(s, e) for encoding our causal knowledge, where we then spell out the nature and interaction of the elements of s. The problem with this is that we rarely know or need to know the entire causal complex for many of the effects in our lives. Very few people really know how a car works, but they know to turn the key in the ignition to start it. Very few people really know all that goes into making an electric light work, although they know that flipping the switch will generally turn it on. The most common situation is one in which nearly all of the causal complex is in place for the effect to occur, and we must only figure out the one or two last steps to complete it. Or in seeking to explain an effect, nearly all of the causal complex normally holds, and only one or two eventualities are in doubt, and they thus constitute the explanation for the effect. Or a causal statement is made in discourse, and to verify its plausibility we don’t need to verify the truth of the entire causal complex but only that part of it that is not true normally.

A causal complex will contain a large number of eventualities that are defeasibly true, or assumably true if there is no evidence to the contrary, or normally true, or true with high probability. We will say that in all these cases the eventualities are “presumable” or “presumably true”. Only the remainder of the eventualities will be exercised in most reasoning, explanation, interpretation, and planning. The predicate
cause should be reserved for these latter eventualities. The axioms in which the predicate cause occurs will be central in the use of any commonsense knowledge base, whereas appeals to axioms for the predicate causal-complex will be relatively rare.

Which eventualities are presumable is very much dependent upon the task that is being performed, the situation or context, and/or the knowledge base being used. Shoham (1990) points out that what we specify as causal laws are just those causal relationships that prove to be useful in everyday reasoning. In part this depends on probabilities; we can ignore as presumable those factors that hold with high probability and focus just on those factors that are in doubt; it is the latter that get expressed in terms of cause. In addition, the choice of causal laws depends on utility; even if a factor normally holds, if its not holding would result in catastrophic consequences, we would usually want to reason about it as well and thus would express its role in terms of cause. Shoham gives the example of firing a cartridge that is probably a blank.

An eventuality is generally a cause if it is manipulable by agents, and thus of use in planning, although there are certainly agentless causes such as a bare wire causing a fire. If an event is the final, triggering event that completes the causal complex and precipitates the effect, it is often identified as a cause. Actions required at significantly lower frequency than other actions are often taken to be presumable; to drive a car, we have to both fill the tank and turn the key in the ignition, but generally we take the former to be presumable.

An extreme example of these criteria is when we say that a certain virus causes influenza. Perhaps no more than one out of a million viruses actually causes damage. That is, the other conditions that make up the causal complex resulting in influenza, such as the failure of the lymphocytes to destroy the virus, are highly improbable. Yet that one virus’s invasion of the cell is the highly consequential and potentially manipulable triggering causal element of the causal complex that results in influenza. On the other hand, if a person’s immune system is compromised, we may call that the cause of an infection that is normally countered.

The predicate cause thus implies but is stronger than causally-involved:

$$(\forall e_1, e)[\text{cause}(e_1, e) \supset \text{causally-involved}(e_1, e)]$$

A cause is a member of a causal complex, but not just any member. Moreover, if we have a predicate presumable, meaning that its argument
is presumably true, or presumably really exists, then we can state the following:

$$(\forall e_1, e)[\text{cause}(e_1, e) \supset (\exists s)[\text{causal-complex}(s, e) \wedge e_1 \in s \wedge (\forall e_2 \in s - \{e_1\})[\text{presumable}(e_2)]]] \tag{2}$$

If e1 “causes” e, then e1 is in a causal complex for e, the rest of which is presumably true. Recall that conjunctions of eventualities are eventualities too, so that this use of cause covers the case of multiple causes as well. For example, to start a bonfire, we first pour starter fluid on the wood and then strike a match. The cause is the conjunction of these two actions and their temporal order.

The above axiom is not quite right. Consider again the example of the causal complex for the vase shattering that consists of two eventualities, letting go of the vase and the vase falling. Neither the letting go nor the falling is presumable. But we would want to call the letting go a cause, all by itself. Once it happens, the shattering will happen. Thus, we would like to eliminate as causes those eventualities in a causal complex that will occur anyway when all the causes and presumable eventualities occur. We will say those eventualities are not themselves causes, but are “triggered” by the causes within the causal complex, where we define trigger as follows:

$$(\forall e_1, s, e_2)[\text{trigger}(e_1, s, e_2) \equiv (\exists s_1)[s_1 \subset s \wedge e_2 \notin s_1 \wedge \text{causal-complex}(s_1, e_2) \wedge (\forall e' \in s_1)[\text{presumable}(e') \vee e' = e_1 \vee \text{trigger}(e_1, s, e')]]]$$

That is, eventuality e1 in causal complex s triggers eventuality e2 if and only if there is a proper subset s1 of s not including e2, s1 is a causal complex for e2, and every eventuality in s1 is either presumable, is e1 itself, or is triggered by e1. This definition is recursive rather than circular since s1 is smaller than s. Then axiom (2) becomes

$$(\forall e_1, e)[\text{cause}(e_1, e) \supset (\exists s)[\text{causal-complex}(s, e) \wedge e_1 \in s \wedge (\forall e_2 \in s - \{e_1\})[\text{presumable}(e_2) \vee \text{trigger}(e_1, s, e_2)]]]$$
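The recursive definition of trigger and the condition that a cause must satisfy can be sketched on the vase example. This is a minimal sketch under stated assumptions, not a definitive implementation: the table `COMPLEXES`, the presumable eventuality `floor_is_hard`, and the function names are hypothetical, and a `_pending` guard is added so that the recursion stops on cyclic tables, which the definition itself does not mention. Since the axiom gives only a necessary condition on causes, `passes_cause_condition` tests that condition and nothing more.

```python
# Hypothetical table: effect -> list of causal complexes for it.
COMPLEXES = {
    "shatter": [
        frozenset({"let_go", "fall", "floor_is_hard"}),
        frozenset({"let_go", "floor_is_hard"}),
        frozenset({"fall", "floor_is_hard"}),
    ],
    "fall": [frozenset({"let_go"})],
}

# Hypothetical presumable eventualities.
PRESUMABLE = {"floor_is_hard"}

def causal_complex(s, e):
    return frozenset(s) in COMPLEXES.get(e, [])

def trigger(e1, s, e2, _pending=frozenset()):
    """e1 triggers e2 within s: some proper subset s1 of s that excludes e2 is a
    causal complex for e2, and each member is presumable, e1, or triggered by e1."""
    s = frozenset(s)
    if e2 in _pending:  # guard against cyclic recursion (an added assumption)
        return False
    pending = _pending | {e2}
    for s1 in COMPLEXES.get(e2, []):
        if s1 < s and e2 not in s1 and all(
            ev in PRESUMABLE or ev == e1 or trigger(e1, s, ev, pending)
            for ev in s1
        ):
            return True
    return False

def passes_cause_condition(e1, s, e):
    """Necessary condition on cause(e1, e), relative to the complex s."""
    s = frozenset(s)
    return (
        causal_complex(s, e)
        and e1 in s
        and all(ev in PRESUMABLE or trigger(e1, s, ev) for ev in s - {e1})
    )

full = {"let_go", "fall", "floor_is_hard"}
print(passes_cause_condition("let_go", full, "shatter"))
print(passes_cause_condition("fall", full, "shatter"))
print(passes_cause_condition("fall", {"fall", "floor_is_hard"}, "shatter"))
```

Relative to the full complex, the letting go passes because the falling is triggered by it, while the falling fails because the letting go is neither presumable nor triggered by the falling; the falling passes only relative to the smaller complex that omits the letting go.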
If e1 is a cause of e, then e1 is in a causal complex s whose other members e2 are either presumable or triggered by e1. I won’t explicate the predicate presumable except to say that if an eventuality is presumable, its negation is not.

$$(\forall e)[\text{presumable}(e) \supset \neg\text{presumable}(\neg e)]$$

Our causal knowledge could be stated in a form using the predicate causal-complex:

$$(\forall s)[\ldots \text{ some long characterization of } s \ldots \supset (\exists e)[q'(e, \ldots) \wedge \text{causal-complex}(s, e)]]$$

If we were to do so, we could state our knowledge with certainty. The axioms would be monotonic. However, we are more likely to learn and use facts about especially useful or manipulable elements of causal complexes, for which the predicate cause is appropriate, and the form of our axioms will be as follows:

$$(\forall e_1, x)[p'(e_1, x) \supset (\exists e)[q'(e, x) \wedge \text{cause}(e_1, e)]] \tag{3}$$

That is, p-type things normally cause q-type things. Since we have not made the entire causal complex explicit, this axiom will only be defeasible (as will be most axioms in a commonsense knowledge base).

Axiom schema (3) is the typical form for general causal knowledge. For specific instances of one eventuality causing another, the typical form is the following:

$$(\exists e_1, e, x)[p'(e_1, x) \wedge q'(e, x) \wedge \text{cause}(e_1, e) \wedge \text{Rexists}(e_1)]$$

That is, e1 is the eventuality of p’s being true of x, e is the eventuality of q’s being true of x, e1 causes e, and e1 really exists.

Some philosophers have argued for the existence of “singular causation”, that is, a specific instance of one event causing another without it being in any way an instance of a general causal principle. In Shoham’s formulation (1990), there can be no causation without the presence of a general rule, since he lacks an explicit predicate for cause. My approach admits singular causation, although my feeling is that it does not occur; to recognize causality is to recognize a causal regularity.
Shoham lists as one of his criteria for a notion of causality that it be nonmonotonic and that the cause not necessarily imply the effect. Separating out the notions of causal-complex and cause as distinct and deriving cause, as we have, from causal-complex makes these properties of cause fall out. More precisely, we can restate axiom schema (3) as follows:

$$(\forall e_1, x)[p'(e_1, x) \wedge \neg ab_i(e_1, x) \supset (\exists e)[q'(e, x) \wedge \text{cause}(e_1, e)]] \tag{3$'$}$$

where

$$(\forall e_1, x)[ab_i(e_1, x) \equiv (\forall e, s)[\text{causal-complex}(s, e) \wedge e_1 \in s \wedge q'(e, x) \wedge \text{presumable}(s - \{e_1\}) \supset (\exists e_2)[e_2 \in s - \{e_1\} \wedge \neg\text{Rexists}(e_2)]]]$$

That is, causality from p to q will fail when in any causal complex s for q, some presumable eventuality e2 in s does not obtain; otherwise, p causes q. The form of (3′) exactly matches the nonmonotonic causal rules that Shoham uses.

Mackie (1993) proposes his “at least INUS” definition for causality. In the INUS condition, C is a cause for E just in case there are an X and a Y such that

neither C nor X entails E
C ∧ X does entail E
E does not entail Y
E does entail C ∨ Y

That is, C is an Insufficient but Necessary condition of an Unnecessary but Sufficient condition for E. In Mackie’s “at least INUS condition”, X and/or Y can be empty. We thus have (C ∧ X) ∨ Y ⊃ E. In the present development, C corresponds to the cause, X to the remainder of the causal complex, and Y to the disjunction of the other causal complexes for E. Mackie also discusses the ‘causal field’, those things in the causal complex that are assumed to be true, and thus do not count as a cause. For us, this corresponds to the presumable portion of the causal complex. Shoham similarly distinguishes between the foreground and the background in what I would call the causal complex.

Suppes (1970) proposes a probabilistic account of causation. Briefly, C is a “prima facie” cause of E if P(C) > 0 and P(E | C) > P(E). This is
Jerry R. Hobbs 203
consistent with the present account. The probability of E is the probability of one of its entire causal complexes holding. C is part of one of the causal complexes. If C has been isolated as a cause, then its occurrence is not entirely predicted by the rest of the causal complex. If C holds, then that increases the probability that the entire causal complex holds. Suppes further goes on to eliminate ‘spurious’ causes, that is, those causes that are in fact themselves caused by deeper causes. The two relevant conditions for B to be a spurious cause of E are

$P(E \mid B, C) = P(E \mid C)$
$P(E \mid B, C) > P(E \mid B)$

This eliminates causal chains, just as I do within causal complexes by ruling out triggered events as causes. However, causal chains play a very important role in commonsense reasoning. Letting go of a vase is the cause of the falling of the vase, which is the cause of its shattering. But I would not want to eliminate the possibility of calling the falling a cause of the shattering, just because something caused it. In my framework this is handled by relativizing the notion of triggering to causal complexes. Since the more proximate cause is itself in a causal complex that does not include the more ultimate cause, it can be a cause by virtue of that smaller causal complex. Thus, the letting go is a cause by virtue of the causal complex consisting of only the letting go, or by virtue of the causal complex consisting of the letting go and the falling, but the falling is a cause only by virtue of the causal complex consisting only of the falling.

I have said that most causal knowledge used in planning, explanation, prediction, and interpreting causal statements is expressed in terms of the predicate cause. By contrast, the causal knowledge used in diagnosis would more likely be knowledge about causal complexes, since we do diagnosis when the normal or usual or presumable operation of things breaks down. Although we don’t know everything that is in a causal complex, we do know specific things that are, and this type of knowledge is expressed in axioms of the following form:

$(\forall e, x)[q'(e, x) \supset (\exists s, e_1)[\textit{causal-complex}(s, e) \wedge p'(e_1, x) \wedge e_1 \in s]]$

That is, when a q-type event occurs, there is a p-type event in its causal complex.
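Suppes's two tests discussed earlier in this section (prima facie cause, and spurious cause) can be checked numerically on a toy distribution. The sketch below is only an illustration under stated assumptions: the three binary events B, C, E, the probabilities, and the helper names `joint`, `prob`, `cond`, `prima_facie` and `spurious` are all hypothetical and do not come from Suppes or from the present paper. C is set up as the deeper cause, B depends on C, and E depends on C alone.

```python
from itertools import product
from math import isclose

# Hypothetical joint distribution over binary events (B, C, E).
# Assumption for illustration: C is the deeper cause, B depends only on C,
# and E depends only on C.
def joint(b, c, e):
    pc = 0.5
    pb = (0.9 if c else 0.1) if b else (0.1 if c else 0.9)
    pe = (0.8 if c else 0.2) if e else (0.2 if c else 0.8)
    return pc * pb * pe

def prob(pred):
    """Probability of the set of outcomes (b, c, e) satisfying pred."""
    return sum(joint(b, c, e)
               for b, c, e in product((0, 1), repeat=3) if pred(b, c, e))

def cond(pred, given):
    """Conditional probability P(pred | given)."""
    return prob(lambda b, c, e: pred(b, c, e) and given(b, c, e)) / prob(given)

B = lambda b, c, e: b == 1
C = lambda b, c, e: c == 1
E = lambda b, c, e: e == 1

def prima_facie(cause, effect):
    # P(cause) > 0 and P(effect | cause) > P(effect)
    return prob(cause) > 0 and cond(effect, cause) > prob(effect)

def spurious(b, c, e):
    # P(E | B, C) = P(E | C)  and  P(E | B, C) > P(E | B)
    both = lambda *o: b(*o) and c(*o)
    return isclose(cond(e, both), cond(e, c)) and cond(e, both) > cond(e, b)

b_prima_facie = prima_facie(B, E)
b_spurious_given_c = spurious(B, C, E)
c_spurious_given_b = spurious(C, B, E)
```

On this distribution B raises the probability of E and so counts as a prima facie cause, yet it is screened off by C and comes out spurious, whereas C is not spurious relative to B. This matches the asymmetry the text describes between a deeper cause and an intermediate one.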
7 GENERAL PROPERTIES OF CAUSALITY

Domain knowledge about what kinds of eventualities cause what other kinds of eventualities is encoded in axioms of form (3). These are usually very specific to domains, e.g. flipping switches causes lights to go on. These are the most common sufficient conditions for causality. The predicate cause appears in the consequent. A candidate for a general sufficient condition is the idea that every eventuality has a cause. The axiom would be stated as follows:

$(\forall e_2)[\mathit{Rexists}(e_2) \wedge \mathit{eventuality}(e_2) \supset (\exists e_1)[\mathit{Rexists}(e_1) \wedge \mathit{cause}(e_1, e_2)]]$

It is not uncontroversial that we would want this axiom. Certainly, very often we have no idea of what the cause of something is. For most of human history, people did not know what caused the wind, although they may have had theories about it. There is in commonsense reasoning, one can argue, the scientifically erroneous notion of an ‘agent’, an entity capable of initiating causal chains. Either agents could appear as the first argument of the predicate cause, or some primitive action on the part of agents, such as will(a), would initiate the causal chain and these actions would be exempt from the axiom.

There is not very much that can be concluded from mere causality, without any further details. That is, there seem to be very few axioms stating general necessary conditions for causality, in which cause is in the antecedent. I will mention two. The first relates causality and existence in the real world. If we were to state this in its strongest, monotonic form, we would use the predicate causal-complex:

$(\forall s, e)[\textit{causal-complex}(s, e) \wedge \mathit{Rexists}(s) \supset \mathit{Rexists}(e)]$

If s is a causal complex for an effect e and s really exists, then e really exists. When we state this using the predicate cause,

(4) $(\forall e_1, e)[\mathit{cause}(e_1, e) \wedge \mathit{Rexists}(e_1) \supset \mathit{Rexists}(e)]$

the axiom is only defeasible, because it requires the rest of e’s causal complex, the presumably true part, to be actually true. Axiom (4) can be used with axiom (3) to show that specific causes occurring will cause their effects to occur.
Another general necessary condition for causality is its relationship to time. Effects can’t happen before their causes:

$(\forall e_1, e)[\mathit{cause}(e_1, e) \supset \neg \mathit{before}(e, e_1)]$

Similarly,

$(\forall e_1, e)[\mathit{cause}(e_1, e) \supset \neg \textit{causally-prior}(e, e_1)]$

Now we come to the question of whether cause should be transitive:

$(\forall e_1, e_2, e_3)[\mathit{cause}(e_1, e_2) \wedge \mathit{cause}(e_2, e_3) \supset \mathit{cause}(e_1, e_3)]$

Let us analyze the question in terms of causal complexes. Suppose we know that an eventuality $e_1$ is a member of a set $s_1$, which is a causal complex for eventuality $e_2$, which is in a causal complex $s_2$ for eventuality $e_3$. It is possible that $s_1 \cup s_2$ is inconsistent. Shoham’s example is that taking the engine out of a car ($e_1$) makes it lighter ($e_2$) and making a car lighter makes it go faster ($e_3$), so taking the engine out of the car makes it go faster. The problem with this example is that the union of the two causal complexes is inconsistent. A presumable eventuality in $s_2$ is that the car has a working engine. When it is consistent, then we can say that $s_1 \cup s_2$ is a causal complex for $e_3$.

(5) $(\forall s_1, s_2, e_2, e_3)[\textit{causal-complex}(s_1, e_2) \wedge e_2 \in s_2 \wedge \textit{causal-complex}(s_2, e_3) \wedge \mathit{consistent}(s_1 \cup s_2) \supset \textit{causal-complex}(s_1 \cup s_2, e_3)]$

Since

$(\forall e_1, e_2)[\mathit{cause}(e_1, e_2) \supset (\exists s_1)[\textit{causal-complex}(s_1, e_2) \wedge e_1 \in s_1]]$

we can define the function

$\mathit{ccf}(e_1, e_2) = s_1$

That is, $\mathit{ccf}(e_1, e_2)$ is a causal complex by virtue of which $e_1$ causes $e_2$.
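Rule (5) can be pictured as a guarded set union. The following is a minimal sketch, assuming that eventualities are plain strings, that a causal complex is a `frozenset` of them, and that consistency is decided by a hand-written table of incompatible pairs. The names `consistent` and `compose` and the `incompatible` table are hypothetical illustrations and not part of the axiomatization.

```python
def consistent(s, incompatible):
    """A set of eventualities is consistent if it holds no incompatible pair."""
    return not any(frozenset((a, b)) in incompatible
                   for a in s for b in s if a != b)

def compose(s1, e2, s2, incompatible):
    """Sketch of rule (5): s1 is assumed to be a causal complex for e2 and
    s2 a causal complex for e3, with e2 a member of s2. Returns the union
    s1 | s2 as a causal complex for e3 when it is consistent, else None."""
    if e2 not in s2:
        raise ValueError("e2 must be a member of the second causal complex")
    union = s1 | s2
    return union if consistent(union, incompatible) else None

# Assumed incompatibility table for the examples below.
incompatible = {frozenset(("engine removed", "engine works"))}

# Shoham's example: removing the engine makes the car lighter; a lighter
# car goes faster only on the presumption that the engine works.
engine = compose(frozenset({"engine removed"}), "car lighter",
                 frozenset({"car lighter", "engine works"}), incompatible)

# The vase example: letting go causes falling, falling causes shattering.
vase = compose(frozenset({"letting go of vase"}), "vase falls",
               frozenset({"vase falls", "floor is hard"}), incompatible)
```

In the engine case the union contains both "engine removed" and "engine works", so no combined causal complex is produced and the chain is blocked. In the vase case the union is consistent and is returned as a causal complex for the shattering.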
Now suppose $\mathit{cause}(e_1, e_2)$ and $\mathit{cause}(e_2, e_3)$ are true. We can conclude that $\textit{causal-complex}(s_1, e_2)$, $e_2 \in s_2$, and $\textit{causal-complex}(s_2, e_3)$ are all true for some $s_1$ and $s_2$. If $s_1 \cup s_2$ is consistent, then by (5) we can conclude

$\textit{causal-complex}(s_1 \cup s_2, e_3)$

By the definition of cause, all the eventualities in $s_1 - \{e_1\}$ and $s_2 - \{e_2\}$ are presumably true, and $e_2$ is triggered by $e_1$ in the causal complex $s_1 \cup s_2$. Thus, we can identify $e_1$ as the cause in $s_1 \cup s_2$ for $e_3$. This means that $\mathit{cause}(e_1, e_3)$ holds. We have established the rule

$(\forall e_1, e_2, e_3)[\mathit{cause}(e_1, e_2) \wedge \mathit{cause}(e_2, e_3) \wedge \mathit{consistent}(\mathit{ccf}(e_1, e_2) \cup \mathit{ccf}(e_2, e_3)) \supset \mathit{cause}(e_1, e_3)]$

If we take $\neg \mathit{consistent}(\mathit{ccf}(e_1, e_2) \cup \mathit{ccf}(e_2, e_3))$ to be the abnormality condition for the axiom, then we can state the defeasible rule

$(\forall e_1, e_2, e_3)[\mathit{cause}(e_1, e_2) \wedge \mathit{cause}(e_2, e_3) \wedge \neg \mathit{ab}_1(e_1, e_2, e_3) \supset \mathit{cause}(e_1, e_3)]$

That is, causality is defeasibly transitive. This rule is used heavily in commonsense reasoning for deducing causal chains between an effect and its ultimate cause.

Several writers have argued against the transitivity of causality on the basis of examples like

The cold caused the road to ice over.
The icy road caused the accident.
* The cold caused the accident.
(Hart and Honoré 1985; Ortiz 1999b)

and

John’s leaving caused Sue to cry.
Sue’s crying caused her mother to be upset.
* John’s leaving caused Sue’s mother to be upset.
(Moens and Steedman 1988; Ortiz 1999b)

Neither of these examples is very compelling. Certainly the starred sentences are not about direct causes, but they are about indirect causes. Very frequently, newspapers attribute some number of deaths to a heat wave, even though the direct causes might be a variety of medical and other conditions. And we can imagine Sue’s mother complaining about the wide repercussions of John’s actions: ‘Look what he did to me!’ Direct causality is, of course, not transitive.

Shoham (1990) believes that cause should be antisymmetric and antireflexive. Two eventualities cannot cause each other, and an eventuality cannot cause itself. I am not sure of this. If two books are leaning against each other and keeping each other in an upright position, it seems quite reasonable to say that the one book’s condition of leaning toward the other is causing the other’s condition of leaning toward the first. If this instance of symmetry is allowed, then reflexivity follows. Each book’s position is causing its own position, though not directly. It is possible to view this as a reasonable statement, in spite of its initial implausibility.

There is a strong temptation in writing about causality to confine oneself to events, that is, changes of state. This is surely not adequate, since we would like to be able to say, for example, that the slipperiness of the floor caused John to fall, and that someone spilling vegetable oil on the floor caused the floor to be slippery. A state like slipperiness can be both a cause and an effect. Nevertheless, there is something to the temptation. Whether a state is a cause or effect will not normally become an issue unless there is the possibility of a change into or out of that state. That requires that both the state and its negation be possible. Thus, the focus on events could be seen to result from this requirement. With the proper notion of ‘possible’, we could state the following axiom:

$(\forall e_1, e_2)[\mathit{cause}(e_1, e_2) \supset \mathit{possible}(e_1) \wedge \mathit{possible}(\neg e_1) \wedge \mathit{possible}(e_2) \wedge \mathit{possible}(\neg e_2)]$

8 SUMMARY

The key move in the present development has been to distinguish between the notion of ‘causal complex’ and ‘cause’. Causal complexes can be reasoned about monotonically but can rarely be completely explicated. Causes constitute the bulk of our causal knowledge but must be reasoned about defeasibly. A precise picture of how cause and causal-complex are related was described. This has led to more precise characterizations of some of the properties of causality, such as transitivity. These concepts cannot be defined in terms of necessary and sufficient conditions. But it has been possible to specify a number of necessary conditions and a number of sufficient conditions, thereby constraining what a causal complex and a cause can be.

All of this puts us in a good position to study a number of linguistic phenomena in terms of their underlying causal content. Section 2 gives the example of modals such as ‘would’. Other such phenomena are the meanings of explicitly causal words like ‘because’ and ‘so’, and causal uses of prepositions like ‘from’, ‘before’, ‘after’, ‘for’, and ‘at’; causative lexical decompositions, e.g., ‘kill’ as ‘cause to become not alive’ and ‘teach’ as ‘cause to learn’ or ‘cause to come to know’; coherence relations in discourse such as explanation; and many others. One often senses an uneasiness in accounts of linguistic phenomena in terms of causality, as though one were building on very shaky foundations. What I have tried to do in this paper is put the concept of causality on a firm enough basis that we no longer have to be embarrassed by an appeal to causality in linguistic analyses.

Acknowledgements
I have profited from discussions with Lauren Aaronson, Cleo Condoravdi, Cynthia Hagstron, Pat Hayes, David Israel, Srini Narayanan, and Charlie Ortiz about this work, and from the comments of the anonymous reviewers. The research was funded in part by the Defense Advanced Research Projects Agency under Air Force Research Laboratory contract F30602-00-C-0168 and under the Department of the Interior, NBC, Acquisition Services Division, under Contract No. NBCHD030010, and in part by the National Science Foundation under Grant Number IRI-9619126 (Multimodal Access to Spatial Data). The U.S. Government is authorized to reproduce and distribute reports for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the author and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of any of the above organizations or any person connected with them.

JERRY R. HOBBS
Information Sciences Institute
4676 Admiralty Way
Marina del Rey, California 90292 USA
e-mail: [email protected]

Received: 30.06.03
Final version received: 07.09.04
Advance Access publication: 11.04.05
REFERENCES
Frank, A. & Kamp, H. (1997). ‘On context dependence in modal constructions’. Proceedings, SALT 7, Stanford University. March.
Hart, H. L. A. & Honoré, T. (1985). Causation in the Law. Clarendon Press.
Hobbs, J. R. (1985a). ‘Ontological promiscuity’. Proceedings, 23rd Annual Meeting of the Association for Computational Linguistics, Chicago, Illinois, July, 61–69.
Hobbs, J. R. (1985b). ‘On the coherence and structure of discourse’. Report No. CSLI-85-37, Center for the Study of Language and Information. Stanford University.
Hobbs, J. R., Stickel, M., Appelt, D. & Martin, P. (1993). ‘Interpretation as abduction’. Artificial Intelligence 63(1–2):69–142.
Hobbs, J. R. (2003). ‘The logical notation: Ontological promiscuity’. Available at http://www.isi.edu/~hobbs/disinf-tc.html
Lewis, D. K. (1973). Counterfactuals. Harvard University Press. Cambridge, MA.
Mackie, J. L. (1993). ‘Causes and conditions’. In E. Sosa and M. Tooley (eds), Causation. Oxford University Press, 33–55.
McCarthy, J. (1980). ‘Circumscription: A form of nonmonotonic reasoning’. Artificial Intelligence 13:27–39. (Reprinted in M. Ginsberg (ed.) Readings in Nonmonotonic Reasoning, 145–152, Morgan Kaufmann Publishers, Inc. Los Altos, California.)
Moens, M. & Steedman, M. (1988). ‘Temporal ontology and temporal reference’. Computational Linguistics 14(2):15–28.
Ortiz, C. L. (1999a). ‘A commonsense language for reasoning about causation and rational action’. Artificial Intelligence 111(2):73–130.
Ortiz, C. L. (1999b). ‘Explanatory update theory: Applications of counterfactual reasoning to causation’. Artificial Intelligence 108(1–2):125–178.
Pearl, J. (2000). Causality. Cambridge University Press.
Shoham, Y. (1990). ‘Nonmonotonic reasoning and causation’. Cognitive Science 14:213–252.
Shoham, Y. M. (1991). ‘Remarks on Simon’s comments’. Cognitive Science 15:301–303.
Simon, H. A. (1952). ‘On the definition of the causal relation’. The Journal of Philosophy 49:517–528.
Simon, H. A. (1991). ‘Nonmonotonic reasoning and causation: Comment’. Cognitive Science 15:293–300.
Suppes, P. (1970). A Probabilistic Theory of Causation. North Holland Press.
Journal of Semantics 22: 211–229 doi:10.1093/jos/ffh023
Schedules in a Temporal Interpretation of Modals TIM FERNANDO Trinity College Dublin
Abstract
Eventualities and worlds are analysed uniformly as schedules of certain descriptions of eventuality-types (reversing the reduction of eventuality-types to eventualities). The temporal interpretation of modals in Condoravdi 2002 is reformulated to bring out what it is about eventualities and worlds that is essential to the account. What is essential, it is claimed, can be recovered from schedules that may or may not include worlds.

1 INTRODUCTION
Just as semantic accounts of modality commonly invoke possible worlds, theories of temporality (concerning, for instance, aspect) often appeal to eventualities. But what are eventualities? And what are worlds? The present work analyses eventualities and worlds uniformly as certain relations $s \subseteq \mathrm{TI} \times \mathrm{ED}$ between a set TI of times t and a set ED of eventuality-descriptions $\varphi$, with $s(t, \varphi)$ pronounced ‘s schedules $\varphi$ at t’.

Insofar as eventuality-descriptions apply to eventuality-types, we may call s a schedule of eventuality-types. Exactly what eventuality-descriptions are and how they pick out eventuality-types depend on the application at hand: the fragment of English to be analysed, and the bit of reality that is conceptualised (to serve that end). In particular, we may derive ED from certain words and phrases under consideration, while basing eventuality-types on additional conceptualisations of, for instance, time. A concrete and illuminating illustration is provided by the temporal interpretation of modals in Condoravdi 2002, henceforth CON2. Some sentences with which CON2 is concerned are listed in (1).

(1) a. He might be here right now.
b. He might be here any day now.
c. He might be here next week.
d. He might be here yesterday.
The oddness of (1d), marked by , is broadly compatible with the idea of historical necessity (e.g. Thomason 1984), according to which the past is settled, and only the present and future are open to branching (whence the acceptability of (1a–c)). But if there are no might’s about the past, then how do we explain (2)?

(2) She might have won.

What (2) has, which each sentence in (1) lacks, is the perfect (have -en), an analysis of which leads CON2 to two readings of (2), given in (3).

(3) a. For all we know now, she might have won.
b. She might have at an earlier point won.

CON2 defines operators for the perfect and for might, deriving (3a) from the scoping (4a), and (3b) from (4b).

(4) a. MIGHT (PERF (she-win))
b. PERF (MIGHT (she-win))

Flying against the surface form (4a) of (2), the scoping of the perfect over might in (4b) is not uncontroversial. It is, however, crucial in CON2 for imposing historical necessity on (3b)/(4b) relative to a notion of history shifted towards an earlier point in the past. But is (4b) based on a flawed interpretation of the perfect?

To understand this question, let us examine some of the assumptions underlying CON2. CON2 draws on a generous inventory of worlds, states, events and times to form, on the one hand, eventive and stative properties, and, on the other hand, temporal properties. As made precise in section 3 below, eventive and stative properties serve as interpretations of eventuality-descriptions. To interpret the modals and the perfect, CON2 steps up to temporal properties, replacing the specific states and events in stative and eventive properties by times, alongside worlds that figure in all properties. Now, it is easy enough to convert a stative or eventive property to a temporal property by mapping states or events to their temporal trace. Going back from a temporal property to a stative or eventive one, however, runs into the problem that too many states and events may have the same temporal trace. For instance, does an interpretation of (5) as a temporal property allow us to extract the consequent state of Pat being away (left out from CON2, which focuses on the so-called existential perfect)?

(5) Pat has left.

And if we were to sharpen the temporal property interpreting PERF(A) to a stative property, could PERF still scope over MIGHT as in (4b)? Not if PERF were to require states or events (in its inputs), whereas MIGHT returns times (in its outputs). Under the reformulation below, the opposition between temporal properties and stative/eventive properties evaporates. The world-time pairs in temporal properties become schedules, encompassing worlds and eventualities alike. (4b) is kept viable, and so the reader wishing to rule out (4b) must seek other grounds for doing so.

The reformulation of CON2 below is intended as a first step at pinning down the semantic entities CON2’s modals and perfect characterise; a first step, that is, to isolating what Schubert (2000) calls characterised situations. The main thrust is to strip world-time pairs down to the essentials, or at least to schedules, from which, it is claimed, the essentials can be extracted. This is carried out below in three steps, outlined in Table 1.

Table 1. From temporal properties to schedules in 3 steps.

| Section | Step | Notation |
| Section 2 | Given: temporal property (from CON2) | $\varphi(w)(t)$ |
| Section 3 | Step 1: turn world w into schedule $s_w$ (satisfies $\models$) | $s_w, t \models \varphi$ |
| Section 4 | Step 2: generalize $s_w$ to smaller schedules s (forces $\Vdash$) | $s \Vdash t, \varphi$ |
| Section 5 | Step 3: reconstruct $s_w$ from (generic) set G of schedules | $s_w = \bigcup G$ |

We proceed in the next section, section 2, from the semantic set-up in CON2, converting worlds into schedules in section 3. We introduce schedules other than those induced by worlds in section 4, before making do without world-induced schedules in section 5. Precisely what the symbols in Table 1 mean will be explained in due course. That said, let us note at the outset that appeals to forcing $\Vdash$ (as in sections 4 and 5) are not new in philosophical semantics, stretching at least as far back as van Fraassen (1969). Forcing lurks at the background of the data semantics of Veltman (1984), where its impact is diminished by the failure of (what is termed there) ‘stability.’ Stability relates to (3) above roughly as follows. (3b) is stable insofar as it is tenable even if we accept that she did not win.

(6) She might have at an earlier point won (had she followed my advice . . .), but she didn’t.

By contrast, accepting she did not win makes (3a) untenable and, in that sense, unstable.

(7) a. She didn’t win. But for all we know now, she might have.
b. She didn’t win. But she might have (had she . . .).
More formally, stability coincides in section 4 below with the persistence of $\Vdash$ relative to the subset relation $\subseteq$ on schedules s, s′:

$s \Vdash t, \varphi$ and $s \subseteq s'$ implies $s' \Vdash t, \varphi$.

Persistence is indispensable to the application we shall make of forcing (e.g. Proposition 3, section 5). Now, while (3b)/(6), (7b) may pose no problem for persistence, (3a)/(7a) is a different matter. Suppose s forced (3a) and s′ encoded she lost, whereas s did not. Then surely s′ could not force (3a)? In fact, it could, provided we analyze epistemic might not as in Veltman 1984’s data semantics, but more along the lines of Veltman’s (1996) update semantics. Dropping t for the sake of clarity (at the cost of correctness) and writing $\langle e \rangle \varphi$ for ‘might epistemically $\varphi$,’ let

$s \Vdash \langle e \rangle \varphi$ iff $(\exists s' \in R)\; s' \Vdash \varphi$

for some set R of schedules specifying the epistemic possibilities of $\Vdash$ (without regard to s). As s appears only in the left side (not the right) of the biconditional, we can restore t to get (8),¹ making persistence with respect to $\langle e \rangle \varphi$ unproblematic.

(8) For all schedules s, s′ in the domain of $\Vdash$, $s \Vdash t, \langle e \rangle \varphi$ iff $s' \Vdash t, \langle e \rangle \varphi$.

What then becomes of the instability in (7a)? Rather than analysing (7a) in terms of a single non-persistent forcing relation $\Vdash$, we appeal to context change of the kind advocated in Veltman (1996). The first sentence of (7a), she didn’t win, changes the epistemic base R to R′, effectively inducing a new forcing relation $\Vdash'$, relative to which (3a) fails (whether or not it holds for the initial relation $\Vdash$).² Notice that if we are to make sense of discourses such as (7b), the first sentence in which rules out possibilities entertained in the second, we must keep the epistemic base for (3a)/(4a) separate from the modal base for (3b)/(4b), called metaphysical in CON2.³ Accordingly, we shall assume $\Vdash$ comes with two sets $R_e$ and $R_m$ of schedules specifying the epistemic and metaphysical possibilities, respectively. To avoid cluttering the notation, we will refrain from hanging the sets $R_e$, $R_m$ as subscripts on $\Vdash$. Such a practice would be useful were we to encode a dynamic interpretation of conjunction involving changes to $R_e$ (and possibly also to $R_m$, s and t). But the present paper stops short of that, keeping $R_e$ and $R_m$ frozen.⁴ Holding $R_e$, $R_m$ constant, we will have enough to do sorting out complications involving time, the perfect and metaphysical might (omitted in Veltman 1984, 1996).

¹ This would suggest that the situation characterised by $\langle e \rangle \varphi$ is not so much a schedule s that forces $\langle e \rangle \varphi$ but rather the set $\{s' \in R \mid s' \Vdash \varphi\}$ of schedules in R that force $\varphi$.
² My apologies for the notational clash with Veltman 1996, where might $\phi$ is described as nonpersistent relative to a predicate $\Vdash$ that takes on its left side not s, but rather a state $\sigma$ corresponding (here) to a set such as R of s’s. My $\Vdash$ is just a slice of Veltman’s $\Vdash$, fixed by a choice of $\sigma$/R in the background. (In saying this, I am putting aside times t that appear to the right of my $\Vdash$, but not Veltman’s. Variations in t should, of course, not be confused with Veltman’s updates of $\sigma$/R.)
³ We must, so to speak, immunise metaphysical might from updates that infect epistemic might.

2 TEMPORAL PROPERTIES IN CON2
The semantic set-up in CON2 takes the following ingredients for granted:

(i) a set PT of temporal points/moments/instants linearly ordered by $\prec$, and a set $\mathrm{TI} \subseteq \mathrm{Pow}(\mathrm{PT}) - \{\emptyset\}$ of times consisting of non-empty subsets t of PT such that for every $z \in \mathrm{PT}$, $z \in t$ whenever $x \prec z \prec y$ for some $x, y \in t$ (that is, time is a non-empty $\prec$-interval)
(ii) sets WO, EV and ST of worlds, of events and of states, respectively, along with a function $\tau : (\mathrm{EV} \cup \mathrm{ST}) \times \mathrm{WO} \to (\mathrm{TI} \cup \{\emptyset\})$ that specifies the temporal trace $\tau(e, w) \in \mathrm{TI} \cup \{\emptyset\}$ of an event or state e in world w, where $\tau(e, w) = \emptyset$ iff e is not realized in w; the intuition behind $\tau(e, w) \in \mathrm{TI}$ being that e is a single token/occurrence in w (as opposed to a type that recurs in w).

CON2 calls a function P from worlds
(i) eventive if for every world w, P(w) is a unary predicate on events (so P(w)(e) is either true or false for every event e)
(ii) stative if for every world w, P(w) is a unary predicate on states (so P(w)(e) is either true or false for every state e)
(iii) temporal if for every world w, P(w) is a unary predicate on times (so P(w)(t) is either true or false for every time t)
(iv) a property if P is eventive or stative or temporal.

⁴ Thus, other instances of instability noted in Veltman (1984) need not concern $\Vdash$, which in its static form, is not designed to cover all formulas that arise in natural language interpretation.
To turn any property to a temporal property, a world-time pair w, t is assigned sets EV(w, t) and ST(w, t) of events and states as follows. An event is located at w, t if its temporal trace in w is contained in t

$\mathrm{EV}(w, t) = \{e \in \mathrm{EV} \mid \emptyset \neq \tau(e, w) \subseteq t\}$

whereas a state is located in w, t if its temporal trace in w overlaps with t

$\mathrm{ST}(w, t) = \{e \in \mathrm{ST} \mid \tau(e, w) \cap t \neq \emptyset\}.$

(Viewed from outside t, events give the impression of being bounded while states do not. Events occur, states hold.) A property P is then mapped to the temporal property $\lambda w \lambda t\, \mathrm{AT}(t, w, P)$ by existentially quantifying over the events and states located in w, t

$\mathrm{AT}(t, w, P) = \begin{cases} (\exists e \in \mathrm{EV}(w, t))\; P(w)(e) & \text{if } P \text{ is eventive} \\ (\exists e \in \mathrm{ST}(w, t))\; P(w)(e) & \text{if } P \text{ is stative} \\ P(w)(t) & \text{if } P \text{ is temporal.} \end{cases}$

AT is used to formalize both the perfect and the modals. A function PERF mapping properties P to temporal properties is defined by

$(\mathrm{PERF}\, P)(w)(t) = (\exists t' \prec t)\; \mathrm{AT}(t', w, P)$

where the linear order $\prec$ on PT is extended to a relation on TI by quantifying universally over the points

$t \prec t'$ iff $(\forall x \in t)(\forall x' \in t')\; x \prec x'.$

To analyse modals, a modal base function MB is assumed that maps a world-time pair (w, t) to a set of worlds, relative to which a function $\mathrm{MIGHT}_{MB}$ maps a property P to the temporal property satisfying

$(\mathrm{MIGHT}_{MB}\, P)(w)(t) = (\exists w' \in MB(w, t))\; \mathrm{AT}(t_\infty, w', P)$

where (expanding time forward, as in Abusch 1998)⁵ $t_\infty$ is the indefinite extension of t to the future

$\{x \in \mathrm{TI} \mid (\exists y \in t)\; y \preceq x\}$

(with $y \preceq x$ abbreviating ‘$y \prec x$ or $x = y$’).

⁵ Gennari (2003) makes a claim related to the idea that modals expand time forward.

To capture historical necessity, worlds are bundled at each time t by an equivalence relation $\simeq_t$ (on WO) satisfying (9).

(9) For every temporal property $\hat{P}$ of interest, $\hat{P}(w)(t)$ iff $(\forall w' \simeq_t w)\; \hat{P}(w')(t)$.

The qualification ‘of interest’ in (9) is necessary to allow for branching in the future (i.e. beyond t); otherwise, any $\simeq_t$ satisfying (9) must be equality, in view of uninteresting temporal properties such as those given, for every world w, by

$(\forall w' \in \mathrm{WO})(\forall t \in \mathrm{TI})\; \hat{P}(w')(t)$ iff $w' = w.$

The metaphysical alternatives to w at t are restricted to worlds that share the same t-history as w.

(10) $(\forall w' \in MB(w, t))(\forall t' \prec t)\; w \simeq_{t'} w'$ for metaphysical MB.

Consequently, if the ‘present perspective’

$(\mathrm{MIGHT}_{MB}(\mathrm{PERF}\, P))(w)(t) = (\exists w' \in MB(w, t))(\exists t' \prec t)\; \mathrm{AT}(t', w', P)$

is to differ from $(\mathrm{PERF}\, P)(w)(t)$, then MB had better not be metaphysical. Not so for the ‘past perspective’

$(\mathrm{PERF}(\mathrm{MIGHT}_{MB}\, P))(w)(t) = (\exists t' \prec t)(\exists w' \in MB(w, t'))\; \mathrm{AT}(t'_\infty, w', P)$

as PERF pushes t back to t′, which MIGHT then expands forward. Hence, CON2 disambiguates (2) by deriving the epistemic reading from (4a) and the metaphysical reading from (4b).

3 A REFORMULATION IN TERMS OF SCHEDULES
We are at Step 1 of Table 1, the point of which is to reformulate the temporal properties $\varphi(w)(t)$ from section 2 in terms of satisfaction $s, t \models \varphi$ where s is a schedule capturing w. To speak properly about schedules, we need to specify a set ED of eventuality-descriptions, relative to
which schedules are subsets of $\mathrm{TI} \times \mathrm{ED}$. Given a set EP of eventive and stative properties (as in section 2), let ED be the set of names $\dot{P}$ of $P \in EP$,

$\mathrm{ED} = \{\dot{P} \mid P \in EP\}$

with $\dot{P} \neq \dot{P}'$ for $P \neq P'$. Then every world w induces the schedule

$s_{w,EP,\tau} = \{(t, \dot{P}) \mid P \in EP \text{ and } \exists e\,[P(w)(e) \text{ and } \tau(e, w) = t]\}.$

As is common in the literature (e.g. Dowty 1979), let us assume that states are divisible in the sense of (11).

(11) For all $e \in \mathrm{ST}$, $P \in EP$, $w \in \mathrm{WO}$ and $t \in \mathrm{TI}$, if P(w)(e) and $t \subseteq \tau(e, w)$ then there is an $e' \in \mathrm{ST}$ such that P(w)(e′) and $t = \tau(e', w)$.

Given (11), it is easy to prove

Proposition 1. For all $P \in EP$, $w \in \mathrm{WO}$ and $t \in \mathrm{TI}$,

$\mathrm{AT}(t, w, P)$ iff $(\exists t' \subseteq t)\; s_{w,EP,\tau}(t', \dot{P}).$

Let us treat ED as the set of atomic sentences in a language $\Phi \supseteq \mathrm{ED}$ generated by the clause

whenever $\varphi \in \Phi$, so are $\varphi^{\subseteq}$, $\mathrm{Perf}(\varphi)$, $\langle m \rangle \varphi$, $[m]\varphi$, $\langle e \rangle \varphi$, $[e]\varphi$.

The intent is that $\varphi^{\subseteq}$ express the step from $\tau$ to AT in section 2, and m and e label metaphysical and epistemic modals (with diamond/$\exists$ and box/$\forall$⁶ forms $\langle\,\rangle$ and [ ]) respectively. To be more precise, let us agree that for $s \subseteq \mathrm{TI} \times \mathrm{ED}$, $t \in \mathrm{TI}$ and $\varphi \in \Phi$,

(i) eventuality-descriptions are satisfied exactly if they are scheduled: $s, t \models \varphi$ iff $s(t, \varphi)$, for $\varphi \in \mathrm{ED}$
(ii) $\varphi^{\subseteq}$ allows for temporal slack: $s, t \models \varphi^{\subseteq}$ iff $(\exists t' \subseteq t)\; s, t' \models \varphi$

⁶ CON2 defines a universal variant of MIGHT, called WOLL, reformulated here in terms of square brackets [m], [e].
Tim Fernando 219
(iii) Perf pushes time back ðdt# a tÞ s; t# ~ u:
s; t ~ PerfðuÞ iff
Proposition 1 can then be restated as

\[ \mathrm{AT}(t, w, P) \;\text{ iff }\; s_{w,\mathrm{EP},\tau}, t \models \dot{P}_{\subseteq} \]

from which it follows that

\[ (\mathrm{PERF}\, P)(w)(t) \;\text{ iff }\; s_{w,\mathrm{EP},\tau}, t \models \mathrm{Perf}(\dot{P}_{\subseteq}). \]

As for the modals, let us attend first to the time expansion $t_{\infty}$ from section 2, defining $t' \succeq_{\circ} t$ (pronounced ‘$t'$ expands $t$ forward’) as the conjunction

\[ (\forall x' \in t')(\exists x \in t)\; x \leq x' \quad\text{and}\quad (\forall x \in t)(\exists x' \in t')\; x \leq x'. \]

Known as the Plotkin $\leq$-preorder (Plotkin 1983), $\succeq_{\circ}$ compares both end points of open intervals $(l, r)$ and $(l', r')$

\[ (l', r') \succeq_{\circ} (l, r) \;\text{ iff }\; l \leq l' \text{ and } r \leq r' \]

so that for example, $(2, 3) \succeq_{\circ} (0, 1)$ and $(0, 2) \succeq_{\circ} (0, 1)$. It is easy to see that (12) holds.$^{7}$

(12) $(\mathrm{MIGHT}_{MB}\, P)(w)(t)$ iff $(\exists w' \in MB(w, t))(\exists t' \succeq_{\circ} t)\; \mathrm{AT}(t', w', P)$

7 Observe that if $t$ has left end point $l$, and $\{l\} \in \mathrm{TI}$, then we can make do in (12) with $t' \succeq_{\circ} \{l\}$ in place of $t' \succeq_{\circ} t$. But, as we cannot, in general, reduce $t$ to a point, and as it is not inconceivable that we may wish to eliminate the slack in AT (analysed in Proposition 1 via $\dot{P}_{\subseteq}$), I have kept the Plotkin pre-order above.

Next, let us spell out the modal base functions $MB(w, t)$ above, given two sets $W_m, W_e \subseteq \mathrm{Pow}(\mathrm{TI} \times \mathrm{ED})$ of schedules for the metaphysical and epistemic possibilities respectively. To impose historical necessity on the metaphysical alternatives, let $\simeq_t$ hold between schedules that are the same up to times $\prec t$

\[ s' \simeq_t s \;\text{ iff }\; (\forall t' \prec t)(\forall \varphi \in \mathrm{ED})\; s(t', \varphi) \text{ iff } s'(t', \varphi). \]
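The forward-expansion order on times lends itself to a quick mechanical check. The sketch below is illustrative only: it assumes open intervals are encoded as pairs `(l, r)`, and it adds a second helper that applies the two-quantifier definition directly to finite sets of points. Both function names are hypothetical.

```python
# Sketch of the Plotkin-style "expands forward" order on times.
# ASSUMPTION: an open interval (l, r) is encoded as a pair of numbers.

def expands_forward(t_new, t_old):
    """End-point characterisation: (l', r') expands (l, r) forward
    iff l <= l' and r <= r'."""
    (l1, r1), (l, r) = t_new, t_old
    return l <= l1 and r <= r1

def expands_forward_points(t_new, t_old):
    """The defining conjunction, applied to finite sets of points:
    every x' in t_new has some x in t_old with x <= x', and
    every x in t_old has some x' in t_new with x <= x'."""
    return (all(any(x <= x1 for x in t_old) for x1 in t_new)
            and all(any(x <= x1 for x1 in t_new) for x in t_old))

print(expands_forward((2, 3), (0, 1)))   # the first example from the text
print(expands_forward((0, 2), (0, 1)))   # the second example from the text
print(expands_forward((0, 1), (2, 3)))   # the order is not symmetric
```

The point-set variant is only a finite stand-in for the definition over dense intervals; it is included to make the quantifier pattern visible.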
The metaphysical alternatives are then defined from $\simeq_t$ and $W_m$ by strengthening (10) as follows

\[ \mathrm{mb}_m(s, t) = \{s' \in W_m \mid s' \simeq_t s\}. \]

Reducing the epistemic alternatives to $W_e$

\[ \mathrm{mb}_e(s, t) = W_e, \]

we can, for $a \in \{e, m\}$, uniformly interpret $\langle a\rangle$ as quantifying existentially over modal alternatives

\[ s, t \models \langle a\rangle\varphi \;\text{ iff }\; (\exists s' \in \mathrm{mb}_a(s, t))(\exists t' \succeq_{\circ} t)\; s', t' \models \varphi \]

and $[a]$ as quantifying universally

\[ s, t \models [a]\varphi \;\text{ iff }\; (\forall s' \in \mathrm{mb}_a(s, t))(\exists t' \succeq_{\circ} t)\; s', t' \models \varphi. \]

Henceforth, we may assume that ${\models} \subseteq (\mathrm{Pow}(\mathrm{TI} \times \mathrm{ED}) \times \mathrm{TI}) \times \Phi$ is determined by a choice $W_m, W_e$ of a pair of sets of schedules. Let us write $P$ for an element of $\mathrm{ED}$ (dropping the dot on $\dot{P}$, for $P \in \mathrm{EP}$) and write $w$ instead of $s$, construing worlds (from here on) as schedules. As hinted in the introduction above,$^{8}$ a pair $(\varphi, t) \in \Phi \times \mathrm{TI}$ induces a change in the epistemic modal base from $W_e$ to

\[ \{w \in W_e \mid w, t \models \varphi\}. \]

Whether the set $W_m$ of metaphysical possibilities should be updated, I am less confident. It seems to me quite reasonable to equate $W_m$ with the initial set of epistemic possibilities, and to assume that while $W_e$ shrinks, $W_m$ stays fixed, so that $W_e \subseteq W_m$. But I will not insist on that below.

8 We write $W_m$, $W_e$ instead of $R_m$, $R_e$ to distinguish satisfaction $\models$ (with which the W’s are associated) from forcing $\Vdash$ (with which the R’s are associated).

4 A PERSISTENT GENERALIZATION

We are at Step 2 of Table 1, in which we consider schedules other than those induced by worlds, by adding schedules smaller than world-schedules. More precisely, given a set $W \subseteq \mathrm{Pow}(\mathrm{TI} \times \mathrm{ED})$ of schedules
(say, $W_m \cup W_e$), let ${\downarrow}W$ be the set $\bigcup_{w \in W} \mathrm{Pow}(w)$ of all subsets of schedules in $W$

\[ {\downarrow}W = \{s \subseteq \mathrm{TI} \times \mathrm{ED} \mid (\exists w \in W)\; s \subseteq w\}. \]

What schedules are induced by worlds? Let us call a set $S$ of schedules an anti-chain if no two distinct elements in $S$ are related by $\subseteq$, that is, for all $s, s' \in S$, if $s \subseteq s'$ then $s = s'$. We may assume that a set of worlds (with worlds construed as schedules) is an anti-chain, by arranging, for example, the eventuality descriptions $\mathrm{ED}$ to come with a ‘negation’ map $\overline{\,\cdot\,} : \mathrm{ED} \to \mathrm{ED}$ and excluding from schedules relations $s$ such that for some $t$ and $P$, both $s(t, P)$ and $s(t, \overline{P})$.$^{9}$

9 More specifically, we might pair each eventuality description $P$ with $+$ (for truth) or $-$ (for falsehood), defining $\overline{(P, +)} = (P, -)$ and $\overline{(P, -)} = (P, +)$. The doubling of $\mathrm{ED}$ here corresponds in data semantics (Veltman 1984) to flipping $\Vdash$ for a notion of falsehood (complementing truth). Notice that there is nothing anomalous about both $w(t, P_{\subseteq})$ and $w(t, \overline{P}_{\subseteq})$ holding. This explains the elimination in section 3 of the slack in AT, before its re-introduction via the mapping $\varphi \mapsto \varphi_{\subseteq}$.

Now, the ‘persistent generalisation’ from which the present section gets its title is the definition of a forcing relation

\[ {\Vdash} \subseteq \mathrm{Pow}(\mathrm{TI} \times \mathrm{ED}) \times (\mathrm{TI} \times \Phi) \]

from a pair $W_m, W_e$ of sets of schedules (determining a satisfaction relation $\models$ according to the previous section) such that

Proposition 2.
(a) For all $s, s' \in {\downarrow}(W_m \cup W_e)$, $t \in \mathrm{TI}$ and $\varphi \in \Phi$, $s \Vdash t, \varphi$ and $s \subseteq s'$ implies $s' \Vdash t, \varphi$.
(b) For all $w \in W_m \cup W_e$, $t \in \mathrm{TI}$ and $\varphi \in \Phi$,
\[ w, t \models \varphi \;\text{ iff }\; w \Vdash t, \varphi \]
assuming $W_m \cup W_e$ is an anti-chain.

Part (a) of Proposition 2 is (as indicated in the introduction) what we mean by $\Vdash$ being persistent, while part (b) is the sense in which we get a generalization of $\models$. The shift in $t$ from the left of $\models$ to the right of $\Vdash$ is designed to isolate (and thereby highlight) the partial order $\subseteq$ on schedules (to the left of $\Vdash$). The idea is to reduce the schedules $s$ to the left of $\Vdash$ so that they have just enough weight to force the pairs $t, \varphi$ to the right of $\Vdash$. That is, having blown $W_m \cup W_e$ apart into ${\downarrow}(W_m \cup W_e)$, we might seek a subset $B \subseteq {\downarrow}(W_m \cup W_e)$ of ‘basic’ schedules smaller than those in $W_m \cup W_e$, from which to reconstruct $\models$ according to (13).
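For finite sets of finite schedules, the downward closure and the anti-chain condition can be computed directly. The sketch below is only an illustration under that finiteness assumption; times are opaque labels here, and the names `down_closure`, `is_antichain` and `forces_atom` are hypothetical. It also checks persistence for the atomic case, where forcing is plain membership.

```python
# Sketch: downward closure of a set W of schedules, and the anti-chain test.
# ASSUMPTION: W is a finite collection of finite schedules (frozensets of
# (time, description) pairs); times are treated as opaque labels.
from itertools import combinations

def down_closure(W):
    """All subsets of schedules in W."""
    out = set()
    for w in W:
        items = list(w)
        for k in range(len(items) + 1):
            for combo in combinations(items, k):
                out.add(frozenset(combo))
    return out

def is_antichain(S):
    """No two distinct schedules in S are related by inclusion."""
    return all(s == s1 or not s.issubset(s1) for s in S for s1 in S)

def forces_atom(s, t, p):
    """Atomic forcing: the pair (t, p) is scheduled in s."""
    return (t, p) in s

W = [frozenset({(0, "p"), (1, "q")}), frozenset({(0, "p"), (1, "r")})]
Y = down_closure(W)
print(len(Y))            # number of distinct sub-schedules
print(is_antichain(W))   # the two "worlds" are incomparable
print(is_antichain(Y))   # the closure contains comparable schedules

# Persistence in the atomic case: growing a schedule never loses a forced atom.
persistent = all(forces_atom(s2, t, p)
                 for s in Y for s2 in Y if s.issubset(s2)
                 for (t, p) in s)
print(persistent)
```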
(13) For all $w \in W_m \cup W_e$, $t \in \mathrm{TI}$ and $\varphi \in \Phi$,
\[ w, t \models \varphi \;\text{ iff }\; (\exists s \in B)(s \subseteq w \text{ and } s \Vdash t, \varphi). \]

More on (13) in the next section. In the meantime, let us attend to the definition of $\Vdash$. The nonmodal clauses are as in (i)–(iii) of section 3

\[ s \Vdash t, P \;\text{ iff }\; s(t, P) \quad \text{for } P \in \mathrm{ED} \]
\[ s \Vdash t, \varphi_{\subseteq} \;\text{ iff }\; (\exists t' \subseteq t)\; s \Vdash t', \varphi \]
\[ s \Vdash t, \mathrm{Perf}(\varphi) \;\text{ iff }\; (\exists t' \prec t)\; s \Vdash t', \varphi. \]

For the modalities, we need to fix sets $R_m$ and $R_e$ of schedules, just as we did for $\models$ with $W_m$ and $W_e$. To establish Proposition 2, we must choose $R_m = {\downarrow}W_m$ and $R_e = {\downarrow}W_e$. But it will be useful to define $\Vdash$ independently of $W_m$, $W_e$, assuming only that we have fixed some sets $R_m$ and $R_e$ of schedules. The modal base functions $\mathrm{mb}_a$ (for $a \in \{m, e\}$) are as in §3, with $W_a$ replaced by $R_a$

\[ \mathrm{mb}_m(s, t) = \{s' \in R_m \mid s' \simeq_t s\} \qquad \mathrm{mb}_e(s, t) = R_e. \]

However, the inclusion of schedules that are not $\subseteq$-maximal leads to a couple of complications. First, there is the question of persistence. While this is no problem for epistemic modalities, we need to be careful about metaphysical modalities. To hardwire persistence, let us write $\sqsupseteq$ for the restriction of $\supseteq$ to $R_m \cup R_e$

\[ s \sqsupseteq s' \;\text{ iff }\; s' \subseteq s \text{ and } s, s' \in R_m \cup R_e \]

and add the quantification $\forall r \sqsupseteq s$ in$^{10}$

\[ s \Vdash t, \langle a\rangle\varphi \;\text{ iff }\; (\forall r \sqsupseteq s)(\exists s' \in \mathrm{mb}_a(r, t))(\exists t' \succeq_{\circ} t)\; s' \Vdash t', \varphi. \]

(For $a = e$ or for world-schedules $s$, the prefix $(\forall r \sqsupseteq s)$ makes no difference; we add it above for the sake of uniformity.)

10 This trick (of prefixing $\forall r \sqsupseteq s$) is borrowed from a common treatment of negation in $\Vdash$, recalled in the definition of generic sets in the next section.

A second complication arises with interpreting the universal modality

\[ s \Vdash t, [a]\varphi \;\text{ iff }\; (\forall r \sqsupseteq s)(\forall s' \in \mathrm{mb}_a(r, t))(\exists t' \succeq_{\circ} t)\; s' \Vdash_t t', \varphi \]
where the modification $\Vdash_t$ on the right hand side takes the partiality of $s'$ into account, allowing $s'$ to be substituted by an $s'' \sqsupseteq s'$ with $s'' \simeq_t s'$ to ensure that time is extended sufficiently far beyond $t$

\[ s' \Vdash_t t', \varphi \;\text{ iff }\; (\exists s'' \sqsupseteq s')(s'' \simeq_t s' \text{ and } s'' \Vdash t', \varphi). \]

This complication does not arise in $\langle a\rangle\varphi$, where $s'$ is quantified existentially (rather than universally, as in $[a]\varphi$) and can absorb the choice of $s''$. With the appropriate definitions in place, we can prove Proposition 2 by a routine induction on $\varphi$ (for $R_m = {\downarrow}W_m$ and $R_e = {\downarrow}W_e$).

5 RECONSTRUCTING WORLDS

We have arrived at Step 3 of Table 1. Having defined $\Vdash$ from any two sets $R_m$, $R_e$ of schedules (while suggesting choices of these in Proposition 2 given by $\models$), we may ask how to extract $\models$ from $\Vdash$ without assuming $\Vdash$ has been formed from $\models$. Since $\Vdash$ and $\models$ are determined by $R_m$, $R_e$ and $W_e$, $W_m$ respectively, the question becomes how to define $W_e$, $W_m$ from $R_m$, $R_e$. Let us record our goal as Proposition 3, leaving open for now just what assumption (Q) and the sets ${\uparrow}R_m$ and ${\uparrow}R_e$ are.

Proposition 3. Given $\Vdash$ with $R_m$, $R_e$ satisfying (Q), let $\models$ be the satisfaction relation given by $W_m = {\uparrow}R_m$ and $W_e = {\uparrow}R_e$. Then
\[ w, t \models \varphi \;\text{ iff }\; (\exists s \in R_m \cup R_e)\; s \subseteq w \text{ and } s \Vdash t, \varphi \]
for all $w \in W_m \cup W_e$, $t \in \mathrm{TI}$, and $\varphi \in \Phi$.

Very roughly, (Q) is the counterpart of the assumption in Proposition 2(b) that $W_m \cup W_e$ is an anti-chain, while $\uparrow$ reverses $\downarrow$. Proposition 3 follows easily if we adopt (14).

(14) Let ${\uparrow}R$ be the set of $\subseteq$-maximal elements of $R$,
\[ {\uparrow}R = \{s \in R \mid (\forall s' \in R)\; s \subseteq s' \text{ implies } s' = s\} \]
and (Q) stipulate that $\uparrow$ covers $R_m$ and $R_e$ in that for all $a \in \{m, e\}$ and $s \in R_a$, there exists $s' \in {\uparrow}R_a$ such that $s \subseteq s'$.

But (Q) in (14) is a very strong assumption that excludes interesting choices of $R$.
Indeed, suppose every schedule in $R$ were finite, but that each one were properly contained in another one. Then ${\uparrow}R$ would be empty, leaving out infinite schedules $\bigcup_{i \geq 0} s_i$ formed from chains $s_0 \subset s_1 \subset s_2 \subset \cdots$ in $R$. Such objects $\bigcup_{i \geq 0} s_i$ are captured in the usual completeness theorems in logic as maximal consistent extensions. For reasons to be explained shortly, forcing arguments refine the notion of a maximal consistent extension to that of a generic set (e.g. Keisler 1973); generic not in the sense of ‘lions have manes’ but rather in connection with $\Vdash$. To spell out this connection, it is useful to extend $\Vdash$ to formulas $\varphi \in \Phi$ with negation $\neg$

\[ s \Vdash t, \neg\varphi \;\text{ iff }\; (\forall r \sqsupseteq s) \text{ not } r \Vdash t, \varphi \]

(recalling that $\sqsupseteq$ is the restriction of $\supseteq$ to $R_m \cup R_e$). Now, $G$ is generic if

(i) for all $s \in G$ and $s' \in R_m \cup R_e$, $s' \subseteq s$ implies $s' \in G$
(ii) every pair $s, s' \in G$ has a common extension $s'' \in G$: $s'' \supseteq s \cup s'$
(iii) for all $\varphi \in \Phi$ and $t \in \mathrm{TI}$, there is an $s \in G$ such that either $s \Vdash t, \varphi$ or $s \Vdash t, \neg\varphi$.

Conditions (i) and (ii) essentially make $G$ an ideal. What differentiates generic sets from plain maximal consistent extensions is condition (iii), as illustrated by the following example. Over the time intervals

\[ \mathrm{TI} = \{(0, 1), (1, 2), (2, 3), \ldots\} \]

and eventuality descriptions

\[ \mathrm{ED} = \{\mathrm{die(Socrates)}, \mathrm{alive(Socrates)}\}, \]

let us define the schedules $s_0 = \emptyset$ and for $i \geq 0$,

\[ s_{i+1} = s_i \cup \{((i, i+1), \mathrm{alive(Socrates)})\} \]
\[ s'_i = s_i \cup \{((i, i+1), \mathrm{die(Socrates)})\}. \]

Suppose these schedules constituted $R_m$ and $R_e$

\[ R_m = \{s_i \mid i \geq 0\} \cup \{s'_i \mid i \geq 0\} = R_e. \]

Clearly, Socrates is immortal in the maximal consistent extension $\bigcup_{i \geq 0} s_i$

\[ (\forall t \in \mathrm{TI})(\exists i \geq 0)\; s_i(t, \mathrm{alive(Socrates)}). \]
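The Socrates schedules can be generated for a finite cut-off to see the chain structure concretely. This is a sketch under a strong simplifying assumption: the family is truncated at an index `N`, whereas the family in the text is infinite, so nothing here bears on generic sets themselves. The names `s`, `s_prime` and `N` are hypothetical.

```python
# Sketch of the schedules s_i and s'_i from the Socrates example.
# ASSUMPTION: the infinite family is cut off at index N for illustration only.
ALIVE, DIE = "alive(Socrates)", "die(Socrates)"

def s(i):
    """s_i: Socrates is alive on (0,1), ..., (i-1,i); s(0) is empty."""
    return frozenset(((k, k + 1), ALIVE) for k in range(i))

def s_prime(i):
    """s'_i: s_i extended with Socrates dying on (i, i+1)."""
    return s(i) | {((i, i + 1), DIE)}

N = 5
R = {s(i) for i in range(N + 1)} | {s_prime(i) for i in range(N + 1)}
times = [(k, k + 1) for k in range(N + 1)]

# The s_i form a strictly increasing chain.
is_chain = all(s(i) < s(i + 1) for i in range(N))

# In the union of the chain, Socrates is alive at every time covered so far.
union = frozenset().union(*(s(i) for i in range(N + 1)))
always_alive = all(((k, k + 1), ALIVE) in union for k in range(N))

# Every schedule in the truncated R can have a death added and stay in R.
death_always_addable = all(any(x | {(t, DIE)} in R for t in times) for x in R)

print(is_chain, always_alive, death_always_addable)
```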
On the other hand, applying condition (iii) to $\varphi = [m]\mathrm{die(Socrates)}$ and $t = (0, 1)$, it follows from

(z) $(\forall s \in R_m)(\exists t' \in \mathrm{TI})\; s \cup \{(t', \mathrm{die(Socrates)})\} \in R_m$

that Socrates must die in every generic set $G$

\[ (\exists t' \in \mathrm{TI})(\exists s \in G)\; s(t', \mathrm{die(Socrates)}). \]

This is not to say that generic sets commit us to the mortality of Socrates. Only that if we want to entertain the metaphysical possibility that Socrates is immortal in a generic set, then we had better choose $R_m$ so that (z) fails. (This is easy enough to arrange, say, by including infinite schedules in $R_m$, or by introducing an eventuality description immortal(Socrates) with a suitable constraint on its schedules in $R_m$.) Armed with the notion of a generic set, we can devise more useful choices of $\uparrow$ and (Q) for Proposition 3 than (14). For all $R \subseteq R_m \cup R_e$, let ${\uparrow}R$ form the unions of generic sets contained in $R$

\[ {\uparrow}R = \{\textstyle\bigcup G \mid G \text{ is generic, and } G \subseteq R\}, \]

the intuition being that for $a \in \{m, e\}$, a generic set $G$ contained in $R_a$ induces the $a$-world $\bigcup G$. As for (Q), we take it to say

\[ {\uparrow}(R_m \cup R_e) = ({\uparrow}R_m) \cup ({\uparrow}R_e) \text{ and } \mathrm{ED} \text{ and } \mathrm{TI} \text{ are finite or countable.} \]

Under these assumptions, Proposition 3 can be proved along standard lines in forcing, using the persistence of $\Vdash$ (Proposition 2(a) above) and the fact that every $s \in R_m \cup R_e$ belongs to a generic set, provided $\mathrm{TI} \times \mathrm{ED}$ is finite or countable (see e.g. Lemma 1.4 in Keisler 1973, page 101). Two other facts, (15) and (16), are worth recording.

(15) The forcing of doubly negated formulas
\[ s \Vdash t, \neg\neg\varphi \;\text{ iff }\; (\forall r \sqsupseteq s)(\exists p \sqsupseteq r)\; p \Vdash t, \varphi \]
reduces (as usual) to generic sets
\[ s \Vdash t, \neg\neg\varphi \;\text{ iff }\; (\forall \text{ generic } G \text{ such that } s \in G)\; \textstyle\bigcup G, t \models \varphi. \]

(16) If $W_m \cup W_e$ is an anti-chain, then
\[ {\uparrow}{\downarrow}W = W \quad \text{for } W \in \{W_m, W_e, W_m \cup W_e\} \]
where $\uparrow$ is defined relative to $R_m = {\downarrow}W_m$ and $R_e = {\downarrow}W_e$. Concerning (15), note that

\[ s \Vdash t, \varphi \;\text{ implies }\; s \Vdash t, \neg\neg\varphi \]

but that in general, the converse fails (and hence so would the last equivalence in (15), if we were to drop $\neg\neg$ from its left hand side). As for (16), what it says essentially is that if we feed our machinery a set of worlds (that is, an anti-chain $W_m \cup W_e$), then we get it back. No more, no less. But of course, what makes $\Vdash$ and genericity interesting is that worlds are not, at the outset, required. Notice that in the Socrates example of this section, no schedule in $R_m = R_e$ is a world (i.e. $\subseteq$-maximal).

6 CONCLUSION

We have carried out Steps 1–3 of Table 1, reformulating the temporal properties of CON2 in terms of a forcing relation $s \Vdash t, \varphi$ involving schedules $s$ that may or may not be worlds. But why bother with a schedule $s'$ forcing $t, \varphi$ that is $\subseteq$-smaller than a world $w$? Because $s'$ picks out more schedules that force $t, \varphi$ than $w$. By the persistence of $\Vdash$, any schedule bigger than $s'$ must force $t, \varphi$, including $w$, all schedules bigger than $w$ (of which there are none, if $w$ is a world), and other worlds bigger than $s'$ (of which there may be any number). In other words, small is beautiful. We should try to make a schedule $s$ that forces $t, \varphi$ smaller (and not only bigger, as we do when forming worlds via generic sets). With this in mind, let us close by considering the prospects of truncating schedules. More precisely, given a schedule $s$ and time $t$, let $s_t$ be the restriction of $s$ to times $\prec$ or $\subseteq t$

\[ s_t = \{(t', P) \in s \mid t' \prec t \text{ or } t' \subseteq t\}. \]

The question is: for which $\varphi \in \Phi$ can we count on (17)?

(17) $s \Vdash t, \varphi$ iff $s_t \Vdash t, \varphi$

(17) goes through without a hitch for nearly all the clauses of $\Vdash$. The only problematic cases are the metaphysical modalities $\langle m\rangle\varphi$ and $[m]\varphi$,
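which can be saved if we strengthen the prefix $\forall r \sqsupseteq s$ to $\forall r \sqsupseteq s_t$, yielding

\[ s \Vdash t, \langle m\rangle\varphi \;\text{ iff }\; (\forall r \sqsupseteq s_t)(\exists s' \in \mathrm{mb}_m(r, t))(\exists t' \succeq_{\circ} t)\; s' \Vdash t', \varphi \]

and

\[ s \Vdash t, [m]\varphi \;\text{ iff }\; (\forall r \sqsupseteq s_t)(\forall s' \in \mathrm{mb}_m(r, t))(\exists t' \succeq_{\circ} t)\; s' \Vdash_t t', \varphi. \]

With this modification,$^{11}$ we secure (17) for all $\varphi \in \Phi$, without (remarkably) losing any of our previous results (including Propositions 2 and 3, and (15), (16)). Moreover, suppose we introduce a fresh symbol $R \notin \mathrm{ED}$ (for Reichenbach’s reference time) and define an R-sked to be a relation of the form

\[ s[t] = s_t \cup \{(t, R)\} \]

for some schedule $s$ and time $t$. Then we can encode $\Vdash$ in statements $\mathbf{s} \mathrel{\mathrm{pins}} \varphi$ (pronounced ‘‘$\mathbf{s}$ pins $\varphi$’’) between R-skeds $\mathbf{s}$ and formulas $\varphi$ satisfying (18).

(18) $s[t] \mathrel{\mathrm{pins}} \varphi$ iff $s \Vdash t, \varphi$

11 As before, the prefix $\forall r \sqsupseteq s_t$ has no effect on the epistemic modalities, so we can replace $m$ above by $a \in \{m, e\}$ for the sake of uniformity.

To construe (18) as a definition of the pins relation, we need to show that the choice there of $s$ and $t$ for the sked $\mathbf{s} = s[t]$ is immaterial, that is, whenever $s[t] = s'[t']$,

\[ s \Vdash t, \varphi \;\text{ iff }\; s' \Vdash t', \varphi. \]

But that is immediate from (17) and $R \notin \mathrm{ED}$. Alternatively (instead of deriving the pins relation from $\Vdash$), we might define it from scratch$^{12}$ and read (18) from right to left as a definition of $\Vdash$ from the pins relation. This route to $\Vdash$, however, depends on (17), which may well fail for $\varphi$ incorporating the progressive.$^{13}$

12 This is easy, albeit tedious. Given an R-sked $\mathbf{s}$, let us write $\mathrm{last}(\mathbf{s})$ for the unique time $t$ such that $\mathbf{s}(t, R)$, and write $\mathbf{s}_{-R}$ for the schedule $\mathbf{s} - \{(\mathrm{last}(\mathbf{s}), R)\}$ obtained from $\mathbf{s}$ by removing $R$. Then
\[ \mathbf{s} \mathrel{\mathrm{pins}} P \;\text{ iff }\; \mathbf{s}(\mathrm{last}(\mathbf{s}), P) \quad \text{for } P \in \mathrm{ED} \]
\[ \mathbf{s} \mathrel{\mathrm{pins}} \varphi_{\subseteq} \;\text{ iff }\; (\exists t \in \mathrm{domain}(\mathbf{s}))\; t \subseteq \mathrm{last}(\mathbf{s}) \text{ and } \mathbf{s}_{-R}[t] \mathrel{\mathrm{pins}} \varphi \]
\[ \mathbf{s} \mathrel{\mathrm{pins}} \mathrm{Perf}(\varphi) \;\text{ iff }\; (\exists t \in \mathrm{domain}(\mathbf{s}))\; t \prec \mathrm{last}(\mathbf{s}) \text{ and } \mathbf{s}_{-R}[t] \mathrel{\mathrm{pins}} \varphi \]
\[ \mathbf{s} \mathrel{\mathrm{pins}} \langle a\rangle\varphi \;\text{ iff }\; (\forall r \sqsupseteq \mathbf{s})(\exists \mathbf{s}' \in \mathrm{mb}^{\circ}_a(r, \mathrm{last}(\mathbf{s})))\; \mathbf{s}' \mathrel{\mathrm{pins}} \varphi \]
\[ \mathbf{s} \mathrel{\mathrm{pins}} [a]\varphi \;\text{ iff }\; (\forall r \sqsupseteq \mathbf{s})(\forall \mathbf{s}' \in \mathrm{mb}^{\circ}_a(r, \mathrm{last}(\mathbf{s})))(\exists \mathbf{s}'' \simeq_{\mathrm{last}(\mathbf{s})} \mathbf{s}')\; \mathrm{last}(\mathbf{s}'') \succeq_{\circ} \mathrm{last}(\mathbf{s}) \text{ and } \mathbf{s}'' \mathrel{\mathrm{pins}} \varphi \]
where $\mathrm{mb}^{\circ}_a(\mathbf{s}, t) = \{s'[t'] \mid s' \in \mathrm{mb}_a(\mathbf{s}_{-R}, t) \text{ and } t' \succeq_{\circ} t\}$.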
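Truncation and R-skeds are simple set operations, which the following sketch makes concrete. It is illustrative only: intervals are assumed to be pairs `(l, r)`, precedence is read as "ends no later than the other begins", the reference marker is the string `"R"`, and the names `truncate`, `r_sked` and `last` are hypothetical helpers.

```python
# Sketch of the truncation s_t and the R-sked s[t] = s_t plus {(t, R)}.
# ASSUMPTIONS: intervals are pairs (l, r); "before" is an assumed reading of
# the precedence relation; "R" is a fresh marker that is not a description.

def within(t1, t):
    return t[0] <= t1[0] and t1[1] <= t[1]

def before(t1, t):
    return t1[1] <= t[0]

def truncate(s, t):
    """Keep only the pairs of s whose time precedes t or lies within t."""
    return frozenset((t1, p) for (t1, p) in s if before(t1, t) or within(t1, t))

def r_sked(s, t, marker="R"):
    """The R-sked built from schedule s and time t."""
    return truncate(s, t) | {(t, marker)}

def last(sked, marker="R"):
    """The unique time at which the marker is scheduled."""
    (t,) = [t1 for (t1, p) in sked if p == marker]
    return t

s1 = frozenset({((0, 1), "a"), ((1, 2), "b"), ((3, 4), "c")})
s2 = frozenset({((0, 1), "a"), ((1, 2), "b"), ((5, 6), "d")})

# Two schedules that differ only after (1, 2) yield the same R-sked there.
print(r_sked(s1, (1, 2)) == r_sked(s2, (1, 2)))
print(last(r_sked(s1, (1, 2))))
```

The equality of the two R-skeds illustrates why the choice of underlying schedule must be shown immaterial before (18) can serve as a definition.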
13 Details in Fernando (2003), where schedules are reduced further to regular languages over the alphabet $\mathrm{Pow}(\mathrm{ED} \cup \{R, S\})$, with $\mathrm{ED}$ assumed finite, towards a finite-state formulation of ideas described in Steedman 2000.

Acknowledgments
I am indebted to Cleo Condoravdi, Stefan Kaufmann and Frank Veltman for helpful comments, and to two anonymous referees for thoughtful criticisms. My thanks also to the organizers and participants of the 7th Symposium on Logic and Language in Pecs, Hungary (August 2002), where an early version of this paper (here superseded) was presented under the shameless title ‘Between events and worlds under historical necessity.’

TIM FERNANDO
Computer Science Department
Trinity College, Dublin 2
Ireland
e-mail: [email protected]

Received: 02.03.03
Final version received: 28.08.04

REFERENCES

Abusch, D. (1998) ‘Generalizing tense semantics for future contexts’. In S. Rothstein, (ed.), Events and Grammar. Kluwer. Dordrecht, 13–33.
Condoravdi, C. (2002) ‘Temporal interpretation of modals: Modals for the present and for the past’. In D. Beaver, S. Kaufmann, B. Clark, & L. Casillas, (eds), The Construction of Meaning. CSLI. Stanford, 59–88.
Dowty, D. R. (1979) Word Meaning and Montague Grammar. Reidel. Dordrecht.
Fernando, T. (2003) ‘Reichenbach’s E, R and S in a finite-state setting’. Sinn und Bedeutung 8 (Frankfurt), (www.cs.tcd.ie/Tim.Fernando/wien.pdf)
van Fraassen, B. C. (1969) ‘Facts and tautological entailments’. Journal of Philosophy, 66(15):477–487.
Gennari, S. P. (2003) ‘Tense meanings and temporal interpretation’. Journal of Semantics, 20(1):37–71.
Keisler, H. J. (1973) ‘Forcing and the omitting types theorem’. In M. Morley, (ed.), Studies in Model Theory. The Mathematical Association of America, 96–133.
Plotkin, G. (1983) Domains (‘Pisa notes’). Department of Computer Science, University of Edinburgh.
Schubert, L. (2000) ‘The situations we talk about’. In J. Minker, (ed.), Logic-Based Artificial Intelligence. Kluwer. Dordrecht, 407–439.
Steedman, M. The Productions of Time. Draft, ftp://ftp.cogsci.ed.ac.uk/pub/steedman/temporality/temporality.ps.gz, July 2000. (Subsumes ‘Temporality,’ in J. van Benthem & A. ter Meulen, (eds), Handbook of Logic and Language. Elsevier North Holland, 895–935, 1997.)
Thomason, R. (1984) ‘Combinations of tense and modality’. In D. Gabbay & F. Guenthner, (eds), Handbook of Philosophical Logic. Reidel, 135–165.
Veltman, F. (1984) ‘Data semantics’. In J. Groenendijk, T.M.V. Janssen & M. Stokhof, (eds), Truth, Interpretation and Information. Foris, 43–63.
Veltman, F. (1996) ‘Defaults in update semantics’. Journal of Philosophical Logic 25:221–261.