is well formed)] To say that ⟨Q, Q'⟩ is obligatory is to say: (x) [Px/Q ⊃ (Px+1/Q' ≡ ⟨Px, Px+1⟩ is well formed)] (Lakoff 1971, 233)
The first thing to notice about these remarks is that, as they stand, they are confused. What they seem to assert, taken in conjunction with (5) (or with my reformulation of (5)), is

(17) A sequence of phrase markers is a member of K only if every pair of successive phrase markers ⟨Pi, Pi+1⟩ in it is well formed.

(18) A pair of phrase markers ⟨Pi, Pi+1⟩ is well formed iff either (a) there is some optional transformation ⟨Q, Q'⟩ such that (x) [Px/Q ⊃ (Px+1/Q' ⊃ ⟨Px, Px+1⟩ is well formed)], or (b) there is some obligatory transformation ⟨Q, Q'⟩ such that (x) [Px/Q ⊃ (Px+1/Q' ≡ ⟨Px, Px+1⟩ is well formed)].10

However, this clearly won't do. First, to adopt (17) and (18) would be to fail to formally define the notion of well formedness as it applies to successive pairs of phrase markers; for what looks like a definition, namely (18), mentions the term to be defined on both sides of the biconditional. Second, to say that a specific pair of phrase markers ⟨Pi, Pi+1⟩ is well formed iff some universal statement of the form mentioned in (a) or (b) is true of the whole sequence of phrase markers of which it is a part is just bizarre. Finally, any pair of phrase markers ⟨Pi, Pi+1⟩ which is a member of a sequence none of whose members satisfies the structural description of some one transformation will automatically be well formed by falsity of antecedent of the statement corresponding to that transformation. Surely these results are undesirable. Thus, if we are to make sense of Lakoff's remarks, some other interpretation is needed.
10 Although Lakoff doesn't tell us what the bound variables in (18) are supposed to range over, it seems fair to take them as ranging over the subscripts of phrase markers in an ordered sequence of phrase markers. First, it is only in such sequences that phrase markers have subscripts (and, of course, they have different subscripts in different sequences). Second, Lakoff uses the same notation in stating rule ordering constraints, where the bound variables must be interpreted in this way.
Perhaps, then, we ought not to interpret Lakoff as offering a definition of the notion of well formedness as it applies to pairs of phrase markers. Instead, we may be able to construe his remarks as giving axioms regarding the use of this notion. According to this interpretation, we can keep (17) above. However, in place of (18) we will have a list of statements of the forms mentioned in (a) and (b), one for each transformation. Then, to find out whether a given pair of phrase markers ⟨Pi, Pi+1⟩ is well formed, one instantiates each of the universally quantified formulas, substituting the constant 'i' for the variable 'x'. If one of the resulting formulas of the form mentioned in (b) has a true antecedent, then we can derive by modus ponens a biconditional that states necessary and sufficient conditions for the well formedness of ⟨Pi, Pi+1⟩. Similarly, if an instantiation of one of the formulas of the form mentioned in (a) has a true antecedent, then modus ponens gives us a sufficient, but not necessary, condition for the well formedness of ⟨Pi, Pi+1⟩. These claims are comprehensible. The question now is "Are they adequate?" As they stand, they are not. The reason for this is that nothing has been said to assure us that our axioms regarding well formedness are consistent. To see this, consider the situation in which a phrase marker Pi satisfies the structural descriptions of two transformations ⟨Q, Q'⟩ and ⟨R, R'⟩, only one of which need be obligatory. Suppose, for purposes of illustration, that ⟨Q, Q'⟩ is obligatory. Suppose further that Pi+1/R' & ~Pi+1/Q'. Then, clearly, ⟨Pi, Pi+1⟩ is both well formed and not well formed. Since this is a contradiction, any set of axioms that allows this possibility is inconsistent and must be rejected.

It might be thought that on our present interpretation Lakoff could be saved from inconsistency by adopting constraints on the class of allowable transformations. That is, it might be thought that we could constrain the class of transformations so that the possibility considered above could never arise. However, this is not the case. For consider what constraints would have to be imposed. To avoid contradiction we would have to demand that every obligatory transformation T be such that, for all other transformations T', either the structural descriptions of T and T' are incompatible or the conditions imposed on the outputs of T and T' are identical. If a grammar didn't meet this demand, then for some obligatory ⟨Q, Q'⟩ there would be a transformation ⟨R, R'⟩ and a pair of phrase markers Pi and Pj such that Pi/Q & Pi/R, and Pj/R' & ~Pj/Q'. However, from this it would follow that ⟨Pi, Pj⟩ (i.e. letting j = i + 1) is both well formed and not well formed. Further, on Lakoff's theory no amount of rule ordering can save us from this result. This is the case since rule orderings are constraints defined on members of K, whereas what we are worried about now is specifying the class of well formed pairs of phrase markers as a means of defining K itself. Finally, it is worth noting that the problem that Lakoff encounters here is independent of his misrepresentation of the nature of transformations. That
is, if it is granted that transformations are binary relations in the restrictive sense defined above, then we could just as well take Lakoff as claiming that for each obligatory transformation T there is a statement that (i) [Pi ∈ domain of T ⊃ (⟨Pi, Pi+1⟩ ∈ T ≡ ⟨Pi, Pi+1⟩ is well formed)] and that for each optional transformation T' there is a statement that (j) [Pj ∈ domain of T' ⊃ (⟨Pj, Pj+1⟩ ∈ T' ⊃ ⟨Pj, Pj+1⟩ is well formed)]. (Where 'i' and 'j' range over numerical subscripts of phrase markers in derivations.) Then, to avoid contradiction, we would have to restrict the class of transformations in a grammar by demanding that every obligatory transformation T be such that for every other transformation T'' and every phrase marker Pi, if Pi ∈ domain of both T and T'', then for all Pj, ⟨Pi, Pj⟩ ∈ T iff ⟨Pi, Pj⟩ ∈ T''. In practice what this restriction would amount to is the absurdly strong demand that the structural description of every obligatory transformation be incompatible with (not just distinct from) the structural description of every other transformation in the grammar. It would amount to this because otherwise we would have to imagine two distinct transformations yielding identical outputs from the same input. Anyone familiar with the actual work of generative grammarians will recognize that none of the transformations that have been empirically motivated have this property.
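The shape of the inconsistency argument can be checked mechanically. The following is a minimal sketch, in Python, of the axiom schemata on this second interpretation; the transformations, phrase markers, and the encoding of structural descriptions and output conditions as predicates are our own illustrative assumptions, not part of the text:

    # Each transformation <Q, Q'> is modeled as (name, SD test, output test,
    # obligatory?). Verdicts for a pair <Pi, Pi+1> come from instantiating
    # the (a)/(b) schemata at i.
    def sd(p):    return p == "Pi"            # both rules share one SD
    def out_Q(p): return p == "Q-output"      # obligatory rule's output condition
    def out_R(p): return p == "R-output"      # optional rule's output condition

    transformations = [("T", sd, out_Q, True), ("T'", sd, out_R, False)]

    def verdicts(p_i, p_next):
        v = []
        for name, sd_test, out, obligatory in transformations:
            if not sd_test(p_i):
                continue                       # false antecedent: axiom is silent
            if obligatory:                     # (b): biconditional
                v.append((name, "well formed" if out(p_next) else "not well formed"))
            elif out(p_next):                  # (a): sufficient condition only
                v.append((name, "well formed"))
        return v

    # Pi meets both SDs; Pi+1 meets only the optional rule's output condition:
    print(verdicts("Pi", "R-output"))
    # [('T', 'not well formed'), ("T'", 'well formed')] -- the axioms contradict
    # each other, which is exactly the inconsistency argued for above.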
V.
Having shown that both Lakoff's characterization of transformations and his distinction between optional and obligatory rules are inadequate, I now turn to what he says about rule ordering. One of the facts that Postal takes to be central to Lakoff's characterization is that rule ordering constraints are to be represented in grammars by independent theoretical statements. Specifically, such statements are supposed to define well formedness conditions on members of K. About this, Lakoff says

Rule orderings, for example, are given by global derivational constraints, since they specify where in a given derivation two local derivational constraints can hold relative to one another. Suppose ⟨Q1, Q1'⟩ and ⟨Q2, Q2'⟩ define local derivational constraints. To say that ⟨Q1, Q1'⟩ is ordered before ⟨Q2, Q2'⟩ is to state a global constraint of the form: (i) (j) [(Pi/Q1 & Pi+1/Q1' & Pj/Q2 & Pj+1/Q2') ⊃ i < j] (Lakoff 1971, 234)
Adopting the characterization of transformations that I suggested earlier, we can take Lakoff as asserting that, if T1 and T2 are transformations, then the statement that T1 is ordered before T2 imposes the following constraint on members of K: (i) (j) [(⟨Pi, Pi+1⟩ ∈ T1 & ⟨Pj, Pj+1⟩ ∈ T2) ⊃ i < j] (where 'i' and 'j' range over subscripts of phrase markers in a derivation).
The intent of such constraints is to define a subset of K by eliminating all members of K that, intuitively, are produced by applying rules in the wrong order. There are two major points to notice regarding this proposal. First, if it is accepted, then there is every reason to believe that grammars will be allowed to vary in the extent to which they impose rule ordering, just as there is every reason to believe that grammars vary in the number of language particular transformations that they allow. Second, this characterization does not require that the ordering relation holding between transformations be transitive. In fact, it leads one to expect that it is not. This second point can be readily grasped by considering a grammar that contains three transformations: T, T*, and T#. Suppose further that the grammar orders T before T* and T* before T#. Given Lakoff's characterization of rule ordering, to say this is just to say that the grammar contains the following two derivational constraints.

(19) (i) (j) [(⟨Pi, Pi+1⟩ ∈ T & ⟨Pj, Pj+1⟩ ∈ T*) ⊃ i < j]
(20) (i) (j) [(⟨Pi, Pi+1⟩ ∈ T* & ⟨Pj, Pj+1⟩ ∈ T#) ⊃ i < j]
However, from (19) and (20) one cannot deduce (21).
(21) (i) (j) [(⟨Pi, Pi+1⟩ ∈ T & ⟨Pj, Pj+1⟩ ∈ T#) ⊃ i < j]
Suppose, for example, that W is a member of K which is such that for all ⟨Pi, Pi+1⟩ in W, ⟨Pi, Pi+1⟩ ∉ T*. In a case like this it is not inconsistent to suppose both that (19) and (20) are true of W (by falsity of antecedent) and that for some Pi and Pj in W, ⟨Pi, Pi+1⟩ ∈ T and ⟨Pj, Pj+1⟩ ∈ T# and i > j. What this means is that according to Lakoff's characterization of rule ordering, the fact that T is ordered before T* and T* is ordered before T# does not guarantee that T is ordered before T#.11 Of course, it is open to Lakoff to accept (21) as a separate grammatical rule in the case just described. If he were to do so, then T, T* and T# would be linearly ordered in the usual sense. However, if it is the case that whenever grammars impose orderings, the orderings imposed are linear, then we want a theory that is not just compatible with this result, but which predicts it. Therefore, if we want to preserve the idea that all orderings are linear, we must reject Lakoff's characterization of such devices. This point can be brought out more clearly by comparing Lakoff's characterization of rule ordering with an alternative characterization which requires all rules that are ordered to be linearly ordered. According to this characterization, universal grammar contains a single rule ordering statement (22).
11 It doesn't even guarantee that T# is not ordered before T.
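The point about vacuous satisfaction can also be checked mechanically. The sketch below is our own illustration (the encoding of a derivation as the list of rule labels producing each step is assumed, not given in the text); it evaluates constraints of the form (19)-(21) over a derivation in which T* never applies:

    # ordered_before(W, A, B) checks (i)(j)[(<Pi,Pi+1> in A & <Pj,Pj+1> in B) => i < j]
    # over a derivation W represented by the rule producing each step.
    def ordered_before(W, A, B):
        pos = lambda T: [i for i, step in enumerate(W) if step == T]
        return all(i < j for i in pos(A) for j in pos(B))

    W = ["T#", "T"]                        # T# applies first, T later; T* never applies

    print(ordered_before(W, "T", "T*"))    # True  -- (19) holds by falsity of antecedent
    print(ordered_before(W, "T*", "T#"))   # True  -- (20) holds by falsity of antecedent
    print(ordered_before(W, "T", "T#"))    # False -- (21) fails: ordering is not transitive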
(22) For every generative grammar G, (i) if T' and T'' are transformations of G to which G assigns numerical subscripts m and n respectively, (ii) and if m < n, then (iii) for all sequences of phrase markers W, ⟨P1, ..., Pk⟩, such that W ∈ the class K defined by G, and for all Pi, Pj ∈ W, if ⟨Pi, Pi+1⟩ ∈ T' and ⟨Pj, Pj+1⟩ ∈ T'', then W is a well formed derivation in G only if Pi precedes Pj in W.

I will use the following formula, considered as a part of universal grammar, as an abbreviation for this constraint.

(22') (m) (n) [m < n ⊃ [(i) (j) [(⟨Pi, Pi+1⟩ ∈ T'm & ⟨Pj, Pj+1⟩ ∈ T''n) ⊃ i < j]]]

(Where the variables 'i' and 'j' range over subscripts of phrase markers in derivations and the variables 'n' and 'm' range over subscripts of transformations in the grammar of a language.) Individual grammars can impose rule ordering constraints by assigning positive integers to transformations that need to be ordered. It is clear that this proposal requires rule orderings to be linear, since the indices of transformations are drawn from a linearly ordered system, namely the sequence of natural numbers under successor. However, there are two important features of Lakoff's original proposal that are shared by this one. First, rule ordering constraints are global constraints defined on members of K (even though they are not language particular rules). Second, grammars may vary in the amount of rule ordering that they impose.

In evaluating this characterization of rule ordering against Lakoff's, there is one important consideration to bear in mind. That is, whereas Lakoff's proposal is compatible with the possibility that all, some, or no transformational orderings are linear, the alternative just constructed is compatible only with the first of these three possibilities. Thus, the characterization that incorporates (22) is more restrictive than Lakoff's in that it makes a stronger prediction about the nature of natural languages. Unless there is empirical evidence that contradicts this prediction, Lakoff's characterization of rule ordering must be rejected in favour of the alternative just constructed. In light of this, it would be natural to expect Lakoff to attempt to provide some empirical justification for his proposal. However, not only does he make no such attempt, he seems not to be aware that any empirical justification is necessary. Rather, he seems to regard his characterization of rule ordering as uncontroversial. For example, he says that the framework that he constructs is one which "Most of the work in generative semantics since 1967 has assumed." (Lakoff 1971, 236). In addition, he says that if we restrict the theory that he presents by limiting "global derivational constraints to those which specify rule order" (and if we make certain other restrictions which are irrelevant to our
discussion), then "The resulting restricted version of the basic theory is what Chomsky describes as a version of the standard theory." (ibid. 268). All of this suggests that Lakoff believes that his characterization of rule ordering captures assumptions that, in Postal's words, have "been implicitly part of transformational theory since the beginning." (Postal 1972, 140). However, since Lakoff's characterization of rule ordering can be accepted only if some grammars impose nonlinear orderings, and since generative grammarians have traditionally assumed that all orderings are linear, it seems fair to conclude that neither Lakoff nor Postal is aware of the consequences of Lakoff's proposal. Because of this and because no empirical evidence has been brought to bear against the methodologically superior proposal that employs (22), I will assume in what follows that all transformational orderings are linear.12
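Before turning to obligatoriness, it is worth noting that a constraint of the form (22) has a simple procedural picture: if the ordered transformations carry integer subscripts, a derivation satisfies (22') exactly when the subscripts of the rules it applies, read off in order, never decrease. A sketch under that assumed encoding (ours, not Soames's):

    # (22'): (m)(n)[m < n => (i)(j)[(<Pi,Pi+1> in T'm & <Pj,Pj+1> in T''n) => i < j]]
    # Over one derivation this amounts to: the applied subscripts are
    # non-decreasing, i.e. a single linear pass through the ordering.
    def satisfies_22(applied_subscripts):
        return all(a <= b for a, b in zip(applied_subscripts, applied_subscripts[1:]))

    print(satisfies_22([1, 3, 3, 7]))   # True: rules applied in order of their indices
    print(satisfies_22([2, 1]))         # False: T1 applied after T2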
VI.

We are now in a position to define the notion of an obligatory transformation. To do this we must determine exactly what we take obligatory transformations to be. The standard intuitive characterization of these rules is that they are transformations that cannot be allowed not to apply. That is, they are rules that cannot be "passed over" in their turn in the ordering if their structural descriptions are met. This remark, though far from precise, tells us one very important thing, namely, that in the standard view the notion of an obligatory transformation is defined partly in terms of rule ordering considerations. Given this much, we can determine precisely which members of K violate the constraints defined by obligatory transformations. The offending derivations are those that contain a phrase marker Pi such that

(i) Pi ∈ domain of some obligatory transformation Tn
(ii) ⟨Pi, Pi+1⟩ ∉ Tn
(iii) ¬∃m [⟨Pi, Pi+1⟩ ∈ Tm & m < n]
(iv) ¬∃r ∃j [⟨Pj, Pj+1⟩ ∈ Tr & j ≤ i & r > n]
(v) (T) [T is unordered ⊃ ⟨Pi, Pi+1⟩ ∉ T]13

(Where the variables 'n', 'm' and 'r' range over numerical subscripts of transformations in a grammar, the variables 'i' and 'j' range over numerical subscripts of phrase markers in a derivation, and 'T' ranges over transformations in a grammar.)
12 Though I intend to leave open for the moment whether or not all grammars completely order their transformations.
13 To say that a transformation is unordered is to say that there are no constraints affecting when it can apply. This is reflected in our formalism by not assigning a numerical subscript to it.
Intuitively, what these conditions state is that if a derivation D defined by a grammar G is such that
(i) it contains a phrase marker Pi that satisfies the structural description of some obligatory transformation Tn,
(ii) in constructing D, Tn did not apply to Pi,
(iii) failure to apply Tn to Pi was not the result of the fact that Tn had not yet been reached in the ordering of transformations at the time that Pi was an available input to transformations,
(iv) failure to apply Tn to Pi was not the result of the fact that Tn had already been passed in the ordering of transformations at the time that Pi was available, and
(v) ⟨Pi, Pi+1⟩ was not produced by a rule upon which no ordering constraints are stated,
then D is not a well formed derivation in G. The reason that these conditions are adequate to define the standard notion of an obligatory transformation is that to say that a derivation satisfies them (and hence is not well formed) is just to say that the derivation was produced by allowing an obligatory transformation to be passed over in its turn in the ordering even though its structural description was met. Finally, by collapsing these conditions we can formally define the notion of an obligatory transformation. Thus, a transformation Tn of a grammar G is obligatory in G iff G contains a constraint of the following form.

(23) (i) (m) [[⟨Pi, Pi+1⟩ ∉ Tm & (T) (T is unordered ⊃ ⟨Pi, Pi+1⟩ ∉ T) & (r) (j) (⟨Pj, Pj+1⟩ ∈ Tr ⊃ j > i)] ⊃ Pi ∉ domain of Tn] (where m ≤ n and r > n)

Only those members of K that satisfy all constraints of the form (23) are well formed derivations in G. (Note, to say that a derivation W fails to satisfy (23) is logically equivalent to saying it satisfies conditions (i)-(v) above.) I claim that this characterization captures what we standardly take obligatory transformations to be. However, there is one caveat that must be added. Since the above characterization makes use of rule ordering considerations, the extent to which some subset of the rules of a grammar corresponds to what we standardly take obligatory transformations to be depends upon the extent to which the grammar orders its rules. There are three cases to consider. First, if a grammar orders all of its rules, then the obligatory transformations of that grammar will correspond perfectly with what we standardly take obligatory transformations to be. Second, if a grammar orders none of its rules, then, according to the characterization just given, it can have no obligatory transformations. Finally, if a grammar stands somewhere between these two extremes, then (i) none of its unordered rules can be obligatory and (ii) the application of an unordered rule can keep
an obligatory transformation from applying (if it destroys the relevant environment).14 Of course, for one who thinks that grammars can vary in the amount of rule ordering that they impose, considerations like these might lead one to try to define the notion of an obligatory transformation independently of rule ordering considerations. Two obvious ways of doing this would be to require (a) that every obligatory transformation apply to every phrase marker in a derivation that meets its structural description or (b) that if some phrase marker Pi in a derivation satisfies the structural description of an obligatory transformation T, then that derivation must also contain some pair ⟨Pj, Pj+1⟩ such that ⟨Pj, Pj+1⟩ ∈ T.
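Collapsed into procedure, conditions (i)-(v) can be checked over a derivation directly. The sketch below is our illustration only; the representation (a list of phrase markers plus, for each step, the subscript of the ordered rule applied, the tag "unordered", or None when the derivation ends) is an assumption of ours:

    # Does derivation D pass over the obligatory transformation T_n in its turn?
    def offends(D, n, in_domain, applied):
        # D: phrase markers P_0 ... P_k.
        # in_domain(P): P satisfies T_n's structural description.
        # applied(i): subscript of the ordered rule producing <P_i, P_i+1>,
        #             "unordered" for an unordered rule, None if P_i is final.
        for i, P in enumerate(D):
            if not in_domain(P):                                   # (i)
                continue
            a = applied(i)
            if a == n:                                             # (ii): T_n applied
                continue
            if isinstance(a, int) and a < n:                       # (iii): not yet reached
                continue
            if any(isinstance(applied(j), int) and applied(j) > n
                   for j in range(i + 1)):                         # (iv): already passed
                continue
            if a == "unordered":                                   # (v): unordered rule
                continue
            return True        # T_n was passed over although its SD was met
        return False

    # Example: the derivation ends at P_1 although P_1 still satisfies the
    # structural description of the obligatory rule with subscript 2:
    D = ["P0", "P1"]
    print(offends(D, 2, lambda P: P == "P1", {0: 1, 1: None}.get))   # True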
VII.

In this paper I have attempted to clarify the nature of a number of key grammatical notions, to construct proposals for formalizing these notions, and
14 There are several possibilities regarding unordered rules. The reason for my condition (v) is to permit, as well formed, derivations which are such that, for some obligatory Tn and phrase marker Pi meeting its structural description, there is an unordered transformation that applies to Pi prior to the time in the ordering at which Tn has a chance to apply. It may also be worth noting that by complicating my conditions one could retain this feature of my characterization while stipulating that no unordered rule could keep an obligatory rule from operating by applying at the time at which the obligatory transformation is applicable. Assuming that there are unordered rules, the decision of whether or not to accept conditions like these is purely empirical.
15 For example, the claim that all grammars impose maximal rule ordering.
to spell out the empirical consequences of accepting one or another of these proposals. In addition, I have tried to explicate and criticize Lakoff's abstract characterization of grammars. However, there are two questions regarding this characterization that remain. First, considerations of transitivity aside, is it the case that Lakoff's treatment of rule ordering "has been implicitly part of transformational theory since the beginning"? (Postal 1972, 140). Second, is there any reason to accept a theory in which the justification of rule ordering constraints in a grammar is fully on a par with the justification of language particular transformations? It seems to me that the answer to both of these questions is 'no'. This follows from what is probably the most significant feature of Lakoff's treatment of rule ordering, namely, the claim that grammars can vary in the amount of rule ordering that they employ.16 Only if this claim is accepted can one maintain
16 The fact that Lakoff regards rule ordering constraints as global constraints defined on members of K is of no theoretical consequence, since we can define an equivalent theory that lacks this feature. For example, consider the following. (Recall that in this discussion we are following Lakoff in ignoring the principle of the cycle.) (I) Let the transformations in a grammar be given by means of an ordered list T1, T2, T3 ... Tn. Further, let each transformation be marked either "obligatory" or "optional." (II) Let the notion of a derivation be defined as a finite sequence of phrase markers in which each pair of successive phrase markers is well formed. (III) Then, for all sequences W: (i) The initial pair of phrase markers ⟨P1, P2⟩ is well formed in W iff P1 is generated by the base component and ∃n [⟨P1, P2⟩ ∈ Tn & (m) [m < n ⊃ (Tm is optional ∨ P1 ∉ domain of Tm)]]; intuitively, (a) P1 is generated by the base, (b) P2 is produced from P1 by applying some transformation T, and (c) in arriving at T in the ordering no obligatory transformation whose structural description is met is allowed not to apply. (ii) A pair of phrase markers ⟨Pi, Pi+1⟩ that is non-initial in a derivation is well formed iff (a) Pi+1 is produced from Pi by applying a transformation T, (b) T has not been previously passed in the ordering, and (c) all transformations ordered before T have either already been passed in the ordering, are optional, and hence can be allowed not to apply, or are not applicable to Pi because their structural descriptions are not met.
that rule ordering constraints are no more freely available than language particular transformations. Suppose, for example, that in order to account for some phenomena of a particular language we have to choose between postulating a new language particular transformation and ordering two independently motivated transformations. What sort of theoretical considerations are relevant to making this decision? If one's theory of grammar does not require that all grammatical transformations be ordered, then it might be plausible to suppose that there are no theoretical grounds for selecting one alternative over the other. That is, it might be plausible to suppose that rule orderings don't come any more freely than do language particular transformations. If, on the other hand, one's theory of grammar requires that every transformation be ordered with respect to every other transformation, then, since the two independently motivated transformations would have to be ordered with respect to each other anyway, selection of the rule ordering solution would be preferred because it gives us a chance to save ourselves the postulation of an extra grammatical device. From this result two conclusions immediately follow. First, any reason to accept a theory in which rule ordering constraints are no more freely available than language particular transformations must also be a reason to reject the claim that all grammars impose maximal rule ordering. Second, the extent to which rule orderings have traditionally been considered to be more freely available than language particular transformations gives us some reason to think that Lakoff's treatment of rule ordering has not "been implicitly part of transformational theory since the beginning" (Postal 1972, 140), a conclusion that seems obvious anyway from an examination of much of the work of generative grammarians. Finally, we must decide whether or not to accept a theory that allows grammars to vary in the amount of rule ordering that they employ. Clearly, there is at least one good reason not to do so, namely, to adopt such a theory is to expand the class of possible grammars.17 What this means is that instead of being a proposal for which there is a priori justification, the claim that grammars can vary in the amount of rule ordering that they impose is, from an a priori point of view, inferior to the standard account. Of course, factual considerations might ultimately force us to this less restrictive position. However, in order to determine whether or not they do, it is necessary first, to distinguish methodological considerations
17 My argument assumes that any adequate linguistic theory will allow grammars to order at least some of their rules. If this assumption is accepted, then any constraint on rule ordering that is common to all languages further restricts the class of possible grammars. The requirement that all grammars impose maximal ordering is an obvious choice for such a constraint. Of course, the assumption that an adequate linguistic theory will allow grammars to order at least some of their rules is an empirical hypothesis which itself needs justification. However, if it is rejected, then Postal's claim that there is no asymmetry between the availability of rule ordering constraints and other grammatical apparatus is automatically false. It is false because to reject such an assumption is just to say that rule ordering constraints are not available at all.
(like restricting or expanding the class of possible grammars) from empirical considerations and, second, to evaluate alternative formal characterizations in terms of both their restrictiveness and their empirical adequacy. A persistent problem with Lakoff's theoretical discussions is that these tasks are not adequately performed. With respect to rule ordering, the result is that the issue of restrictiveness is ignored. Hence, the need to empirically justify the claim that grammars vary in the amount of rule ordering that they impose is not recognized. However, once formal, methodological and empirical concerns are distinguished, this oversight can be corrected, and the nature of different hypotheses can be clearly recognized.
References

CHOMSKY, N. (1961), On the notion "rule of grammar," pp. 119-136 in: Fodor, J.A. and J.J. Katz (Eds.), The Structure of Language, Englewood Cliffs, N.J.: Prentice Hall
CHOMSKY, N. (1967), The formal nature of language, pp. 397-442 in: Lenneberg, E.H., Biological Foundations of Language, appendix, New York, N.Y.: John Wiley
CHOMSKY, N. (1971), Deep structure, surface structure, and semantic interpretation, pp. 183-216 in: Jakobovits, L.A. and D.D. Steinberg (Eds.), Semantics, Cambridge, Engl.: The University Press
CHOMSKY, N. (1972), Some empirical issues in the theory of transformational grammar, pp. 63-130 in: Peters, S. (Ed.), Goals of Linguistic Theory, Englewood Cliffs, N.J.: Prentice Hall
EMONDS, J. (1969), Root and structure preserving transformations, Ph.D. dissertation, Cambridge, Mass.: MIT
LAKOFF, G. (1971), On generative semantics, pp. 232-296 in: Jakobovits, L.A. and D.D. Steinberg (Eds.), Semantics, Cambridge, Engl.: The University Press
POSTAL, P. (1972), The best theory, pp. 131-170 in: Peters, S. (Ed.), Goals of Linguistic Theory, Englewood Cliffs, N.J.: Prentice Hall
DOV M. GABBAY AND J. M. E. MORAVCSIK
BRANCHING QUANTIFIERS, ENGLISH, AND MONTAGUE-GRAMMAR
In this paper we distinguish branching quantifier constructions from linear ones, and illustrate these in English. We then call attention to the interesting properties of systems that include branching quantifiers, and raise the issue of what their inclusion would show for English. In part I we show certain correlations between grammatical and logical order within different syntactic constructions, and show also the syntactic devices in English that allow us to build branching quantifier constructions of a wide variety. In part II we illustrate how a Montague-type grammar can accommodate branching quantifiers. In doing this we use a simplified version of Montague-grammar for the sake of facilitating exposition. Our work shows that Montague-semantics is stronger than the semantics for the 1st order predicate calculus.
This paper is an attempt to shed some light on quantificational structures in English. As such, it is a part of a larger enterprise, namely, that of presenting a rigorous semantics and syntax for certain important parts of English. Some parts of this enterprise have already appeared (see Gabbay 1973, Gabbay and Moravcsik 1973, and Gabbay forthcoming). Below we shall present a brief sketch of the general project. Our aim is to link an empirically motivated rigorous syntax to a model-theoretic semantics so as to capture what is involved in the understanding of a natural language like English. We assume that the syntax and semantics of English can be represented in a formal way. This amounts to the assumption that the class of well-formed sentences of English, together with their interpretations, can be generated by a set of rules. We do not feel that well known facts, such as that grammaticality is a matter of degree, that some terms in natural languages are vague, and that some limits of what certain terms in natural languages denote can be understood only against certain background assumptions, create insurmountable obstacles for our project. Arguments like the ones in Ryle (1957) against the possibility of projects like ours do not seem to us convincing.
Essential to our work is the assumption that the notions of truth and denotation are fundamental to the understanding of the semantics of a natural language. We do not claim that no other notions are needed for a full theory of meaning; nor do we claim that there are no other dimensions of understanding in addition to the dimension of veracity. But we expect that other aspects of meaning will turn out to be additions, and that their full understanding presupposes an explicit semantics in terms of satisfaction relations, such as the one we are trying to formulate. There are further arguments in support of the claim that the notions of truth and reference are fundamental for the semantics of natural languages in Moravcsik (1972). To be more specific, the grammars that we envisage are context-free phrase structure grammars with transformations, while the semantics that we wish to attach to such a syntax is a set-theoretic semantics that assigns to each non-syncategorematic term a set-theoretic entity as the semantic object representing the denotation of that term. Ideally, these assignments should be governed by the following conditions: i) Terms that would normally be interpreted as having different extensions should have different semantic objects assigned to them. ii) Terms belonging to the same syntactic category should have the same type of semantic object assigned to them. iii) Different syntactic categories should have different types of semantic objects associated with them. iv) Since we deal with natural languages, the construction of the syntactic categories should depend partly on empirical arguments independent of semantic considerations. Our project has its origins in a similar project that was left unfinished by the late Professor Montague (see Montague 1970 and 1973). Montague's work incorporated some, but not all, of the conditions enumerated above. Our work is both an extension and a modification of Montague's work, in ways that are indicated separately in the various parts of our project. In view of some of the work done on the formal structures of the grammars of natural languages, due mostly to Chomsky, we are in a position today to formulate and partially answer certain clearly formulated questions regarding the complexity of the mind, or machine, that is required to interpret natural languages. As of now, no analogous question can be formulated with regard to the semantic component. Hopefully, work such as ours will lead to a change in this undesirable situation.
I. BRANCHING CONSTRUCTIONS AND ENGLISH SYNTAX

We shall assume the usual treatment of existential and universal quantifiers, as presented in modern symbolic logic. In recent years there have been attempts to relate this treatment to quantificational structures in English. These attempts, however, have been restricted to structures in which within the sequence of
quantifiers each quantifier depends for its interpretation on its predecessor. In other words, in these structures we find a linear dependency among the quantifiers. E.g., in the much used example:

(1) Every man loves some woman
the interpretation of the second quantified noun phrase depends on the first one. There is no non-arbitrary restriction on the number of quantifiers that can occur in a linear sequence of this sort within a sentence of English. E.g., in:

(2) Every man loves some woman some time
we have a sequence with three quantified phrases, and it is easy to imagine the addition of further, dependent, modifiers containing quantifiers. The attempts to account for these structures have focused so far mostly on the kinds and number of ambiguities that the relevant English sentences exhibit, and the investigation of how the syntactic analysis of these sentences can be related to the structure demanded by logical analysis. Montague 1973 contains, in addition, also a set-theoretic semantics for such sentences. Logicians have considered for some time configurations other than the linearly dependent ones in quantificational logic. One can construct configurations within which the linear dependency breaks, and thus branches of dependent quantifiers arise. But it is only in recent work by Hintikka (forthcoming) that attention has been called to the existence of such structures in English. E.g., in contrast with (1) and (2):

(3) All products of some subdivisions of every Japanese company, and all products of some subdivisions of every Korean company are shipped to the U.S.A. on the same boat
contains two sequences of quantified phrases, linked by a conjunction, such that though within each sequence there is linear dependency, the two sequences themselves are not connected by such a dependency. Nevertheless, the two sequences are tied to the phrase 'are shipped to the U.S.A. on the same boat'; this phrase functions as a logical predicate. Thus (3), unlike (1) and (2), exemplifies the structure of branching quantifiers in English. These structures are of interest by themselves. E.g., it is not clear that various proposals within the framework of generative grammars, such as generative semantics and Chomsky's extended standard theory, are equally adequate in representing the syntax of these structures. Furthermore, the representation of the semantics presents an interesting challenge. There are, however, other issues of theoretical interest that rest partly on the analysis of these structures. It has been shown by logicians that the class of logical
truths that contains all of the constructions involving branching quantifiers is not recursively enumerable, and is equivalent to some fragment of second order logic. (For references, see Hintikka [forthcoming] and Enderton 1970.) In view of these considerations, our work is an attempt to shed light on the following questions:

(a) Can a Montague-grammar handle branching quantifiers?
(b) Does the existence and adequate treatment of branching quantifiers in English show anything about the need to go beyond 1st order predicate calculus to present an adequate semantics for English?
(c) Does the investigation of these phenomena give support to the claim that the class of analytic sentences in English is not recursively enumerable? [I.e., if English includes all logical truths expressible by branching quantifiers, and that set is not r.e.]
If the answer to the third question is affirmative, and we assume that there is an effective procedure for generating all and only the grammatically well-formed sentences of a natural language, then we will be able to establish an interesting asymmetry between the semantic and syntactic components, namely, the set of valid sentences is not r.e. while the set of well-formed sentences is recursive. Before we proceed to deal with these issues, let us consider the order of occurrences of quantified phrases in various types of English sentences and the ordering of quantifiers required by the corresponding paraphrases in any standard logical analysis. Even a cursory look at (3) shows discrepancies. The logical paraphrase would start: "for every Japanese company there are some subdivisions such that ..."; in short, the quantifiers would occur in the REVERSE order from the order in which the quantified phrases in (3) occur. Thus we should look for some regularities relating logical and grammatical orderings. An alternative to this would be to start with a structure in which the quantifiers occur as required by any reasonable interpretation in 1st order predicate calculus, and assuming this to be "deep structure", proceed to construct a derivation via transformations that arrives eventually at the English sentences. However, there seems to be no empirical syntactic motivation for this, and it is not clear to what extent the types of transformations needed would obey a suitable set of constraints. Sequences of quantified phrases in English can be constructed via the following syntactic patterns:

i) with prepositional phrases (such as in [3])
ii) within subject-verb-object sequences (with modifiers, such as in [2])
iii) with relative clauses
iv) with embedded structures (e.g. sentences containing 'believes that')

We shall try to show that there are regularities of interpretation linked to these four classes.
In the case of structures like (3) the logical predicate linking the branches, 'are shipped to the U.S.A. on the same boat', is dyadic. The variables linked to it in logical paraphrase will be the ones tied to the quantifiers appearing last in the branches. I.e. "for every x s.t. x is a Japanese company, there are some y s.t. y is a subdivision of x, and every z s.t. z is a product of y ..., and for all u (Korean company), v (subdivision), and w (product), z and w are shipped ...". We shall see later that more complex branching can also be expressed in English. While the respective products make up the classes to which the predicate in (3) applies and thus occupy the last place in the logical sequences of quantifiers, they occupy the initial positions in the sequences as expressed by the grammatical construction of English. This suggests the hypothesis that when we deal with sequences of quantifiers that are constructed out of prepositional phrases, the grammatical order is the reverse of the logical order. As the examples below indicate, this hypothesis seems to work for the prepositions 'of', 'to', 'in' (and other locationals), and 'for'.
(4) Some gift to every girl and some gift to every boy are bought by the same Santa Claus
(5) Every deer in some forest and every moose in some meadow drink from the same brook
(6) Every sacrifice for some good cause and every prayer for some blessing please the gods equally
The same point is illustrated by sentences in which different prepositions are combined, such as:

(7) Some entrances to every freeway to some city of every country are badly constructed

Though this does not involve branching, the same point is exemplified; the logical order will start with the last quantified phrase ('every country') and work in reverse order, ending with 'some entrances', which is linked to the predicate. The hypothesis stands up, even though in this project we do not distinguish between different senses of the prepositions, such as the 'of' of possession, origin, and object (the portrait of David), as well as the 'to' of direction in contrast with the true dative ('he gave it to me'), and the 'for' of purpose as distinct from the 'for' in 'doing something for someone'.
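The branching reading at issue can be made fully explicit for (3). Using predicate letters that we supply for illustration (J: Japanese company, K: Korean company, S: subdivision of, P: product of, B: shipped to the U.S.A. on the same boat), (3) has the Henkin-style two-row prefix:

    (∀x) (∃y) (∀z)
                    [(J(x) & S(y, x) & P(z, y) & K(u) & S(v, u) & P(w, v)) ⊃ B(z, w)]
    (∀u) (∃v) (∀w)

Neither row's existential quantifier depends on the universal quantifiers of the other row; it is this independence that the linear notation of 1st order predicate calculus cannot express.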
In one such type of case, the phrase functions adjectivally; e.g.: Every man of some intelligence (with some sense?) smokes cigars.
In another type of case, the preposition links events with their aspects, and thus denotes once more not a genuine relation. E.g.: (8 a) Some aspects of recent linguistic studies and some aspects of recent logical studies are equally depressing.
The other exception is the preposition 'with' and its relatives; as of now we have no explanation for this phenomenon. But the point becomes clear from such examples as

(9) Every man without some woman is like every ship without some sail
(10) Every man with a large income and every woman with a large appetite suit each other

In sentences like these, the logical order and the grammatical order coincide, and the dyadic predicate does not apply to the variables bound by the last quantifiers in the sequences. The same coincidence of logical and grammatical order can be seen in sentences with subject-verb-object structure, such as:

(11) Every farmer has some sons and every banker has some daughters who belong to the same club

But in the case of this syntactic configuration the dyadic predicate is applied to the variables bound by the last quantified phrases in the logical and grammatical order, i.e., those denoting the sons and daughters respectively. Sentences of this structure admit ambiguities, and thus we cannot assume that all readings of all such sentences will preserve the coincidence of grammatical and logical order. Let us now consider the third type of construction mentioned initially, namely relative clauses. Sentences of the following sort are illustrative:

(12) Some truths that were rejected by every ancient sage in some civilization and some falsehoods that are accepted by every modern scientist in some country resemble each other

The main predicate 'resemble each other' applies clearly to the truths and falsehoods. These are clearly not the things bound by the logically last quantifiers in the sequences. Within the relative clauses, however, the regularities observed so far apply. This shows that we have to determine the logical sequences within each relative clause first, and then attach the whole clause as a complex predicate to the head NP's; the main predicate of the sentence then applies to these. There are sentences where a predicate applies both to something within the clause and outside, such as:

(13) Some men who pursue every woman are rejected by them

In these cases the same considerations apply as to constructions built around 'with'.
Similar considerations apply to clauses built with 'where', such as:

(14) Every place where some lawyers lived and every house where some doctors lived comes under the same zoning law
The specificity-quantifiers of English, such as 'a certain', always come first in logical order, and this rule is prior to all other regularities mentioned here.
Finally, we turn to embedded sentences. One type of construction is exemplified by sentences with 'believe' as the main verb; e.g.

(15) Some men believe that every woman hates them

This sentence shows that there can be co-reference between noun-phrases and pronouns across the embedding construction. Needless to say, we can also have branching constructions in which one branch is also outside the belief-context, e.g.

(15a) Some men of some countries believe that they resemble some animals of every species.

But note the impossibility of a branching construction in which the branches would be held together by a predicate denoted by the verb which creates the embedding construction:

(15c) Some x and all y such that ... and some z and all w such that ... and y believes that w.

Let us now review the whole range of syntactic devices that allow us to express a wide variety of branching configurations. So far we dealt only with cases where distinct branches are held together by a predicate. But there are cases in which the branches are preceded by a common node, as in:
(16) Men who make a deal with a certain chisler and women who keep company with him deserve the same fate.
Here the expression referring to the chisler picks out an element that is necessary for the interpretation of both branches, giving us the structure:

    There is a chisler s.t.
        men who ...
                            deserve the same fate
        women who ...

Little reflection on (16) shows that one should be able to attach such an expression, thus forming the diamond-like structure, in front of any arbitrary number of quantified phrases and with any number of branches; the guarantee comes from the fact that these structures are equivalent to very complex relative clauses, as the schematic representation above indicates. Furthermore, these same syntactic devices make it possible to add one diamond-like structure after another; e.g. to go on with (16) in some such form as:

    ... deserve the same fate s.t.
            some angels of all religions
                                           equally abhor some devils of all superstitions
Further complexities of this sort are also expressible in English. Another relevant question is: how many branches can we have? The answer is: any number of these; one can link them with conjunctions, and predicates like the one in (11) can apply to any number of branches.
Still another dimension of complexity is revealed when we see that the main predicate can apply to more than one class denoted within any one of the branches. E.g.,

(17) Some lie of every politician and some weakness of every voter make the voters hate the politicians

Here all four NP's are tied to the main predicate; and further complexity in this dimension (involving more quantified NP's) is imaginable. Examples used in logic textbooks usually treat only verbs that function as monadic or dyadic predicates in English. But the syntactic device of adding cases via prepositional phrases shows that verbs can function as predicates with a much larger number of elements related. E.g., consider the schema: x brings y with z to w in v for u. We already saw above that more complex verb phrases can apply to any arbitrary finite number of elements. Furthermore, grammatically as well as semantically, prepositional phrases such as 'to Mars', 'to London', 'to California' have common structure that must be brought out by analyzing them into further constituents. Thus the treatment of prepositional phrases as adverbial operators is unsatisfactory, since under such treatment (an extra primitive for each phrase) we would lose internal structure and thus miss important generalizations.
Here all four NP's are tied to the main predicate; and further complexity in this dimension (involving more quantified NP's) is imaginable. Examples used in logic textbooks usually treat only verbs that function as monadic or dyadic predicates in English. But the syntactic device of adding cases via prepositional phrases shows that verbs can function as predicates with a much larger number of elements related. E.g., consider the schema: χ brings j with ^ to w in ν for u. We already saw above that more complex verb phrases can apply to any arbitrary finite number of elements. Furthermore, grammatically as well as semantically, prepositional phrases such as 'to Mars', eto London', 'to California' have common structure that must be brought out by analyzing them into further constituents. Thus the treatment of prepositional phrases as adverbial operators is necessities. We therefore give in §2 a simplified account of Montague grammar. In unsatisfactory, since under such treatment (an extra primitive for each phrase) we would lose internal structure and thus miss important generalizations. Thus we have shown how one can build up in English sentences with any number of branches, and predicates tying the branches together, applying to any finite number of variables, and how the language allows us to form the diamondtype branching, with relative clauses. Can we form branches with arbitrary length? Again, there is a syntactic device that guarantees this. For some prepositions, such as Of* allow of an arbitrary number of iterations. Thus we can always form branches of the form: "Quantified NP of quantified NP of..." of arbitrary length. In this section we illustrated a number of regularities linking the order of quantifiers within standard logical paraphrases and the order of quantified noun phrases in a variety of English syntactic constructions. We also indicated several syntactic devices of English that allow the construction of branching structures with arbitrary number of branches, with arbitrary number of NP's within any of the branches, and with predicates that can apply to an arbitrary number of elements. We also illustrated diamond-like structures of arbitrary complexity, and sentences in which the common main predicate is applied to more than one quantified NP's from each branch. The regularities and the variety of sentences expressible in English indicate the extent to which branching structures are part of a natural language like English. This sets the background for the development of the rigorous semantics.
II. BRANCHING QUANTIFIERS AND MONTAGUE SEMANTICS
§1. Introduction
Montague (1973) presented a grammar and a semantical interpretation for a certain fragment of English. The fragment is small, but "accommodates all of the more puzzling cases of quantification and reference" known to him. In part I we have shown that branching quantifiers behave according to certain rules. We now show that branching quantifiers can be expressed in Montague grammar and that the respective semantics is correct for them. Thus the system of Montague (1973) allows for the construction of sentences with branching quantification. Our plan is as follows. In §2 we give an introduction to Montague type semantics. As explained in Gabbay (1973), Montague's paper (1973) is extremely elegant and some of its features are only technical options and not conceptual necessities. We therefore give in §2 a simplified account of Montague grammar. In §3 we show how the simplified grammar of §2 can accommodate branching quantification and in §4 we show how Montague grammar (1973) accommodates branching quantification.
§2. Simplified Montague Semantics
Let us begin with a certain fragment of English, for example the fragment containing words like John, Mary, run, fall, and sentences like 'John runs', 'Mary kills John'. Our first step is to divide the words into categories, e.g., in this case into the four categories: NP (containing John), IV (containing run), TV (containing kill), S (containing 'John runs'). We supply rules that allow us to construct the sentences we have in mind. E.g., in our case:

(R1) S → NP + IV, or in diagram:

        S
       /  \
     NP    IV
(R2) IV → TV + NP, or in diagram:

        IV
       /  \
     TV    NP
(The choice of rules and categories depends on what tasks we set for ourselves: which sentences do we want to construct? What ambiguities do we want to account for? Can we give a simple semantics for this grammar? etc.) Given a grammar, that is, given a set of categories and rules of construction, we can supply this grammar with a semantics. A semantics consists of the following. (a) With each properly constructed phrase x of the language we associate a semantical object ||x||. Phrases belonging to the same category obtain the same kind of semantical object. (b) With each rule R we associate a semantical rule SR. If the rule R allows you to construct phrase z from phrases x1, ..., xn (e.g., S → NP + IV) then SR tells you how to construct ||z|| from ||x1||, ..., ||xn||. These assignments must be natural. That is, the semantics keeps close to the meaning of the English phrases and reflects correctly conditions of truth, makes distinctions in case English makes them, etc. As an example let us construct a semantics for the fragment given above. We start with a set of objects U (which can be thought of, e.g., as a set of people, etc.). NP's get elements of the set, e.g. ||John|| ∈ U. IV's get subsets of U, e.g. ||run|| ⊆ U (intuitively, the set of those who run). TV's get binary relations on U, e.g., ||kill|| ⊆ U × U (i.e., a list of who kills whom). Sentences get truth values (true or false). We now have to specify the rules SR1, SR2. SR1 tells you, given x ∈ NP and y ∈ IV, to check whether the element ||x|| belongs to the set ||y||, and gives a truth value. So ||John runs|| is true if ||John|| ∈ ||run||, i.e., the element associated with John is in the set ||run|| (those who run). The rule SR2 says: take the relation ||TV|| and construct all those elements that are related to the ||NP||; e.g., ||kill Mary|| is obtained from ||kill|| and ||Mary|| by collecting all those elements that the relation ||kill|| relates to the object ||Mary||. Thus ||kill Mary|| is the set of all those who kill Mary. Thus semantics for the above grammar are obtained by taking sets U and assignments || ||. There are many possible semantical interpretations. The nature of the interpretation depends on the richness of the fragment, the rules and categories chosen, the various distinctions required, etc. We give two examples: 1. A fragment containing the verb seek cannot be handled like we handled kill, because you kill objects of the universe but you may seek non-existent objects. 2. But even without new kinds of words, how about adding the simple rule (R3) NP → NP + NP (i.e., we want to construct John loves John (and) Mary)? What will ||NP + NP|| (i.e., ||John (and) Mary||) be? John (and) Mary is a member of NP and therefore must be assigned the same type of element as 'John'. A simple way out is to assign to NP's
finite subsets of U. So ||John|| = the set containing one element (John himself). ||John (and) Mary|| is the set containing 2 elements. If x, y are NP's and z = x + y then ||z|| = ||x|| ∪ ||y||. SR3 tells you to take unions. All the other rules remain the same except that we replace U by U* = the set of all finite subsets of U. If you look at the fragment and rules of Montague (1973), you will begin to see why the semantics of Montague (1973) is complicated. We now describe how we can introduce quantifiers into the language. Let us, for simplicity, confine ourselves to the fragment with rules R1, R2, and categories NP, IV, TV. First we add to the category NP the variable names he1, he2, he3, ... We can now form phrases that depend on unspecified names, e.g., kill he (or kill him). We don't know who him is; for different choices of he we get different sentences. We also introduce common nouns like women, men. This is a new category CN. The semantical objects for them are subsets of U: ||men|| ⊆ U (i.e., the set of all men), etc. We add the category Q of quantifiers containing every and some and add the rule NP → Q + CN. Now we can form: Every man runs or every man kills some woman or he kills every man or he kills her, etc. This is a very simple fragment. However, we need one more rule to allow us to express branching quantification! Luckily this rule was introduced by Montague to treat relative clauses. Recall that we can construct sentences with variables in them, like he runs. We indicate that a sentence S contains a variable name he by writing S(he) (a sentence with he). We want to allow the construction of CN from CN + S(he) by the use of such that, e.g., man such that he runs. We can therefore form "every woman kills some man such that he (the man) runs". In the next section we present this grammar formally and show how it can accommodate branching quantifiers.
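Before moving to the formal grammar, here is a small illustration of the simplified semantics for the basic fragment (rules R1, R2); the universe and the particular assignments below are invented for the example:

    # U is the universe; sem assigns semantical objects by category:
    # NP -> element of U, IV -> subset of U, TV -> binary relation on U.
    U = {"John", "Mary", "Bill"}
    sem = {
        "John": "John", "Mary": "Mary", "Bill": "Bill",   # NP's: elements of U
        "run":  {"John"},                                  # ||run||: those who run
        "kill": {("Bill", "Mary")},                        # ||kill||: who kills whom
    }

    def SR1(np, iv):
        # S -> NP + IV: true iff ||NP|| belongs to ||IV||
        return sem[np] in sem[iv]

    def SR2(tv, np):
        # IV -> TV + NP: the set of elements that ||TV|| relates to ||NP||
        return {a for (a, b) in sem[tv] if b == sem[np]}

    print(SR1("John", "run"))                 # True: ||John|| is in ||run||
    sem["kill Mary"] = SR2("kill", "Mary")    # ||kill Mary|| = {'Bill'}
    print(SR1("Bill", "kill Mary"))           # True: Bill is among those who kill Mary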
§3. A Grammar For Branching Quantifiers1

1. We have the following basic categories: NP (noun phrases). Contains two sets of basic phrases: N1 = {John, Mary, etc.} and N2 = {x, y, z, he1, he2, she1, she2, etc.}. This is the set of name variables. Besides these basic NP's we can construct more using the rules below. Morphology is neglected, i.e., run → runs, love → loves, he → him, she → her, he → she. In our examples we use the correct English form only for reasons of style. It doesn't affect the representation of branching quantifiers.
2. IV (basic ones are for example run, walk).
3. TV (basic ones are e.g., love, kill).
4. Q (contains every and some).
5. CN (basic common nouns are man, woman, sheep, etc.)

We now define some derived categories and give some rules of grammar. To do this we shall simultaneously define, for any phrase P of any category, the notion: "The name variable x appears free in P". We denote this by writing P(x). The following are the clauses defining the rules and the notion.
6. If P is a basic element of any category (see (1)-(5) for a list of basic elements, such as run, kill, etc.), then x is free in P if P is x itself (as a name in NP).
7.
Æ is an element of the category S of sentences if it is of the form X+Y where X is an NP and Υ is an IV (i.e., the rule S—»NP + IV). A variable name ÷ is free in Æ if ÷ is free in either X or Y.
8.
Ζ is in IV if Z*=X+ Υ with X in TV and Υ in NP. A variable name is free in Æ if it is free in either Xot Υ (i.e., the rule IV—»TV+NP).
9.
Ζ is in NP if Ζ — X+ Υ and X is in Q and Υ is in CN. A variable name is free in Ζ if it is free in either X or Y.
10.
If X is in CN and F(x) is in IV with ÷ free in Y and not appearing in X then the following phrase Æ is in CN. Z= X such that Y(hen), where ben is the first new name variable of this type not appearing in X or Y. u is a free variable of Z if u? 6 x, u ^ hen and u is free in Xot in Y.
11.
If F(x) is in S with ÷ free and X is in NP then the following Æ is in S : (a) If X is a variable name u then take Z(u) to be Y(u) and a variable v is free in Z iff v = u or í φ ÷ but í is free in Y. (b) If X is not a name variable then Æ is obtained by replacing in K(x) the first occurence of ÷ by ¢" and every other occurence of ÷ by ben (or optionally shen depending on gender), where hen is the first name of this form not occurring
Using the notion of construction tree, Montague (1970, 1973), we can give some examples. Near the node of the tree we indicate the free variables of the phrase of that node and the rule (if in doubt) used to construct it.
Branching quantifiers, english, and montague-grammar
151
every man loves some woman
(18)
every man
every
loves some woman
some woman
man some
woman
Meaning: For every man there is a woman (depending on the man) whom the man loves. every man loves some woman
every man loves ÷
every
man
love
÷
Meaning: There is some woman such that every man-loves her. (The woman is the same for all men.)
152
Dov M. Gabbay and J. M. E. Moravcsik
(20)2 Every man loves some woman that shej kills every sheep that he, runs. every man loves some woman such that she, kills every sheep that hex runs love some woman such that she, kills every sheep that he é runs some woman such that shei kills every sheep that he, runs woman such that shei kills every sheep that he, runs y kills every sheep that hei runs v kills
every sheep that hej runs v every
sheep such that he} runs sheep that he é runs
The meaning is that the woman depends on the man (i.e., for every man ÷ there is a woman w = w(x) such that w(x) kills every sheep that ÷ runs).
2
We are not concerned with the deletions that take us from the sentence of this example to the fine surface form.
Branching quantifiers, english, and montague-grammar (21)
153
(Branching quantification): Every man loves some woman (and) every sheep befriends some girl that belongs to the same club (i.e., the woman and the girl). Z(every man, every sheep)
every sheep
Z (every man, u) Z(x, u) = some woman such that ÷ loves she é belongs to the same club as some girl such that u befriends she2
belong to the same club as some girl such that u befriends she2
some woman such that ÷ loves shei woman such that ÷ loves shei
some
woman
÷ loves y
ftat u befriends she2
girl such that u befriends she2
u befriends í
Once we know how to express branching like (21), we can also express branching like (16) in part I. Simply form a statement Z(u) where u is a name variable (i.e., u replaces the "chiseler") and now quantify over the Uj. Clearly any kind of lattice can be created in this way. (We regard "belong to the same club (as)" as one unit here. We abbreviate it by "belong-". In the top nodes (with Z(x,u)) we used rule 11.) The semantics will show that the meaning of the quantifiers is branching.
154
Dov M. Gabbay and J. M. E. Moravcsik
Now suppose we look at the sentence "every man loves some woman and every sheep befriends some girl that belongs to the same club owned by the man". This sentence has the same tree as in (21) except that the node "belong-" should be replaced by the node "belong to the same club owned by x". This phrase has to be constructed separately (i.e., the node is really replaced by a branch constructing the above). This shows we can express branching quantification where the main verb depends on more than the last of the branching quantifiers. We now turn to the semantics for the grammar of this section. Since our fragment is smaller than Montague's (not including intensional objects), our semantics is less complicated in the sense that the semantical objects associated with elements of the basic categories need not be sets of too high a type. As we remarked in section 2, the natural simple semantics of §2 has to be changed a little to accomodate various technical difficulties. So in order to make the present semantics more transparent, we use the original semantics as our starting point. Let H be a function assigning semantical objects to the elements of the categories, //is defined as follows: Let õ be a set (our universe of object). Let Η (÷) åõ, for any name variable ÷ (by name variable we mean variable for names). Let //(ç) å õ for any basic name such as John or Mary. Let //(ç) £Î õ, for any basic IV such as run or basic CN such as man. Let //(ç) £Î õ2 for any basic TV such as love. Given such an assignment Η we define our semantical interpretation. We define a semantical object || Ρ ||H associated with any phrase Ρ of any category, constructive in the language. The definition is given by induction. || \\H is defined first for the basic elements of the categories and then with each syntactical rule of grammar that allows us to obtain new phrases from old. We associate a semantical rule that allows us to obtain the sematical object associated with the newly constructed phrase. The numbers here follow the numbers of the definition of the grammar. (51)
If ç is a basic NP then ||n||H = the function f such that for any subset A £ õ, f(A) is a truth value and f(A) = true exactly when //(ç) å A.
(52)
If ç is a basic IV then ||n ||H = //(n).
(53)
If ç is a basic TV then || ç ||H is the function f such that for any function F (giving truth values to subset of õ), yields the set f(F) = {a å õ|F({b å õ| (a,b) å //(ç)}) = true}.
(54)
.1140/0* ||j? == the function that associates with every subset AS õ the function fA on subsets of õ with ,the property that fA(B) = true exactly when Á ç  φ Ï \every \\Η = the function that associates with every subset Á £Î l) the function GA such that for any B £ õ, GA(B) = true iff  2 ¢.
(55)
If ç is a basic CN, then || n ||H = //(n).
Branching quantifiers, english, and montague-grammar
155
To continue we need a definition. For a name variable x, let Η = ΚΗ* if Hl is like //except possibly for giving a different value on x, i.e., Vy(y Φ x—»//(y) = (S7-S9) Each of the rules (7-9) has the form Z= X+ Y, where Ζ is the new phrase obtained from X and Y. The corresponding semantical rules say \\Z\\jj — = É|^||Ç(||*º|Á)> i- e -> aPpty tne semantical function ||.ËÃ||Ç to the argument imiHandthevalueis||Z||H. (S10) If 7= A'such that 7(x) then ||Z[|H = || AT||Hn {a å õ | || r(x)||fli = /r«*, where (Sll) If Æ is obtained by applying the rule 11 to the NP X and sentence F( then || Z\\H = || *||H({a| II ^W II*1 = /^ where //* = x// and #*(x) = a}). Lemma 1 : Let P(XI , . . . , x,,) be a phrase with the only free name variables x t , . . ., x„. Let //, T/1 be two semantic functions that agree on x lr . . ., x,, and all the basic phrases appearing in P (such as run, kill, etc.) then ||Ñ||Ç^ Il-Plta 1 · Proof: clear, by induction. Lemma 2. Let Ρ be a phrase not containing the free name variable u, then no existential quantifier in Ρ can be dependent on u. Proof: Follows from Lemma 1, since by changing the value //, assigns to u || Ρ ||H does not change. Corollary. The quantifier of the tree (21) is branching since "some girl" does not depend on "man" and "some woman" does not depend on "sheep". They are both constructed as different phrases (with variables x, u) in different parts of the tree. (22)
(Branching Quantifier) : Every son of some king and every daughter of some queen are friends. First construct : then construct : now construct : now construct : and you can now and finally :
y is son of x u is daughter of í man such that he is son of x woman such that she is daughter of í every woman such that she is daughter of í every man such that he is son of x. S(x, v) = every woman such »that she is daughter of í and every man such that he is son of x are friends say : S (some king, v) S(some king, some queen)
assuming that every king is a man and every queen is a woman. Also note that we are not concerned with rules yielding the final surface form.
156
Dov M. Gabbay and J.M.E. Moravcsik
4 BRANCHING QUANTIFIER IN MONTAGUE (1973) Our grammar of §3 is a subgrammar of Montague (1973) and therefore this system can express branching quantification. The reader should note that we did not give a rigorous proof that every sentence with branching quantifiers can be expressed in our grammar. We simply gave several examples to convince the reader that this can be done. §5 DEGREE OF COMPLEXITY OF THE MONTAGUE LANGUAGE The language with branching quantification is stronger than the 1st order predicate calculus. In fact (due to F. Galvin) there exists a sentence of the form (23)
Vx3y| > A(x,y,u, v) Vu3v J
that cannot be expressed in 1st order predicate calculus as it is true in all models with infinite domains. The set of valid sentences of branching quantification is not recursively enumerable. (See Enderton 1970, p. 393.) The fact that we can express, e.g., the above sentence in Montague grammar, shows that the Montague language is stronger than 1st order predicate calculus. The reader may wonder, since on the face of it, whatever we can express in Montague grammar can be expressed also in the predicate calculus. The difference, however, is in the semantic interpretation. Take (23). Montague grammar rewrites this essentially as A(x, F(x), u, G(u)) which is expressible in predicate logic. However, semantically, Montague semantics gives it the interpretation 3 F 3 G A, which cannot be done in predicate logic.
References ENDERTON, H.B. (1970) Finite partially ordered quantifiers, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 16, 535-555. GABBAY, D.M. (1973) Representation of the Montague-semantics as a form of the Suppessemantics, pp. 395—412 in: Hintikka J., Moravcsik, J., and Suppes, P., (Ed.) Approaches to Natural Languages, Reidel: Dordrecht. GABBAY, D.M. (forthcoming) Tense logics and the tenses of English, in: Moravcsik, J. (Ed.) Logic and philosophy for linguists, Mouton: the Hague. GABBAY, D. M. and J. MORAVCSIK (1973) Sameness and individuation, Journal of Philosophy 70, 513-525.
Branching quantifiers, english, and montague-grammar
157
HINTIKKA, J. (forthcoming) Branching quantifiers, Linguistic Inquiry. MONTAGUE, R. (1970) English as a formal language, pp. 189-223 in: Visentini et al. (Ed.) Linguaggi nella societä e nella tecnica, Edizioni di communitä: Milano MONTAGUE, R. (1973) The proper treatment of quantification in ordinary English, pp. 221 -242 in: Hintikka, J. Moravcsik, J. Suppes, P. (Ed.) Approaches to Natural Language, Reidel: Dordrecht. MORAVCSIK, J. (1972) review of G. Leech's Towards a semantic description of English, Language 48,445^54. RYLE, G. (1957) The theory of meaning, pp. 239-264 in: Mace, C. A. (Ed.) British Philosophy in the Mid-Century, George Allen & Unwin: London.
J. PH. HOEPELMAN
TENSE-LOGIC AND THE SEMANTICS OF THE RUSSIAN ASPECTS1
We consider the applicability of J. A. W.Kamp's system for S(ince) and U(ntil) in the formalization of the supposed deep-structure of Russian sentences in which the aspects occur. We will see that, assuming certain expressions for the representation of the perfective and the imperfective, the consequences that are generally felt to be implied by these aspects in spoken Russian, can be inferred, assuming the axioms for linear and dense time. The semantical relations between the imperfective and the perfective aspecjft become more clear.
Introduction
If a "natural logic" exists (Lakoff 1970), it is to be expected that a tenselogical fragment will occur in it. Even in advanced treatments as (Montague 1973), the tense-operators are those of the propositional tense-logical system Kj and its extensions. These operators, however, cannot give a proper account of the logical form of all tense-phenomena that occur in natural language. In the following we consider the drawbacks of the aforementioned operators in the treatment of the logical form of Russian sentences in which the so-called "aspects" are found. Then we will make some proposals concerning the representation of these forms by means of Kamp's system for the operators S(ince) and U(ntil). We will limit our tense-logical analysis to one standard example, the verb "zakryt'—zykryvat'", "to close". This is not due to a limitation of Kamp's system, but to difficulties in the analysis of "unbalanced" word-pairs, like "to close" and "to open", by means of Potts' operator "Δ" (Cooper 1966). To expose this would take too much room for the purpose of the present article.
1
This article is part of research-pro jekt "The tense-logical fragment of natural logic", supported by The Netherlands Organization for the Advancement of Pure Research. I am indebted to Prof. S. C.Dik. J. A. W. Kamp, G. Berger and E. Krabbe for their help.
Tense-logic and the semantics of the russian aspects
159
I.
Until recently the study of tenses in linguistics has been more or less primitive. Most linguists treat the tenses in ways similar to those of Russell (1903, 458—476) or Jespersen (1924) and Reichenbach (1947). Prior (1967, 12) however, shows how Russell's analysis leads to a paradox in the treatment of compound tenses, and we in turn can show how Reichenbach's analysis leads to a similar paradox. In his treatment of the tenses, Reichenbach—following Jespersen—uses diagrams like figure 1.
Jti
Κ
I
ο
I
S
I
E
I
R
I
R,E ι
S ι
E
R, S
> I had seen John * I shall have seen John + I saw John -* I have seen John
S: point of speech R: point of reference E: point of event. Fig.l
Let us now assume that the sentence "Once all speech will have come to an end" is true (cf. Prior 1967, 12). Then a finite number of utterances will have taken place. Assuming further that each utterance has a finite number of reference points, there will be a finite number of reference points. At least one of these is the last one. But Reichenbach's analysis of the sentence "there will have been a last reference point" gives a reference point that is later than the last one. A similar pardox can be constructed when the expression "now" is analysed as "contemporaneous with this utterance" (Krabbe 1972). If the analysis of tenses is related to utterances, one is forced to assume that there always were and always will be utterances, in order to avoid these problems.
160
J.Ph. Hoepelman II.
Of the different forms of tense-logic, the non-metric prepositional ones with proposition-forming operators on propositions seem to bear the greatest formal resemblance to the tensed sentences of natural languages. J. A. W. Kamp (Kamp 1968) studies in detail the advantages of non-metric tense-logics for the treatment of tensed expressions in natural languages. We shall enumerate the axioms of standard propositional tense-logic and briefly mention the properties of the related models. The basis we choose is a system for standard propositional logic. The set of well-formed formulas is extended with those well-formed formulas which are preceded by: F P G H
"it will be the case that" "it was the case that" "it will always be the case that" "it always was the case that",
plus the usual truth-functional combinations of these. "F" and "P" are undefined. "A", "B", ... are metavariables for well-formed formulas Def.G. Def.H.
HA= d -.PiA
The rules of inference of propositional logic are extended with: RG. hA->h-iF-.A RH. hA-^KP-iA RM ("Mirror-image" rule). If in a thesis we replace all occurrences of F by P, and of P by F, the resulting formula is a thesis. Ax. 1. -ι Ft (A z> B) z> (FA ^ FB) Ax.2. PiF-iAz>A
Ax. 1. and 2. together give the system K t , the theses of which are valid in every model for the axioms given below. Extensions of K t : Ax.3.FFA=>FA (transitivity) Ax. 4. PFA ID (A v FA v PA) (linearity) Ax. 5. τ FT A => FA (non-ending, non-beginning time) Ax. 6. FA r> FFA (denseness) Ax.7. Π(τΡ Ί A -5 ρ Ί ρπ A) ..=> (τΡτ Ao τΡτ A) (completeness) Def. D : D A = d A & GA & HA
Tense-logic and the semantics of the russian aspects
161
III.
Russell offers the following definition of "change": "Change is the difference in respect to truth or falsehood, between a proposition concerning an entity and a time T, and a proposition concerning the same entity and another time T', provided that the two propositions differ only by the fact that T occurs in the one, where T' occurs in the other. Change is continuous when the propositions of the above kind form a continuous series, correlated with a continuous series of moments, ... Mere existence on some, but not all moments constitutes change on this definition" (Russell, 1903,469—470). This definition can, with due modifications, equally well be applied to non-metric systems. Von Wright (1963), (1965) has developed a system with a dyadic proposition-forming operator on propositions, T, by means of which four elementary transformations can be described. Clifford (1966) has pointed out, that von Wright's system-goes together with a discrete series of moments. Tp,q means "p now, and q the next moment". If p represents the proposition "the window is open", then Τρ,τρ describes the transformation of a world in which a window is open into a world in which it is closed. Ττ ρ, ρ describes the reverse, Tp, ρ describes the staying open of the window and Ττρ,τρ its staying closed. Agreeing with Russell's definition we can say, that only Tp, τ ρ and Ττρ, ρ describe changes. Anscombe has given an operator Ta, that Prior (1967, 70) has defined as follows: Def. Ta: Ta(A,B) =dP(PA & Β) ν (PA & B) Ta(p, q) may be called "p and then q". Def. Ta can be given for any of the axiom systems given above, and so does not presuppose discrete time. Ta(p, np) and Ta(np, p) describe changes as well, but do not preclude the possibility of there having been more changes in between.
IV.
In Russian there are in general two verbal forms corresponding to one English verb. So for instance the verb "to close", which in Russian is represented by the two forms "zakryt"' and "zakryvat'". These two forms are referred to as "perfective" and "imperfective" respectively (if necessary we will indicate the perfectivity or imperfectivity of a form by the superscripts p and '). It has for a long time been thought, that the aspects are a feature characterising only the slavic languages, but recent studies show that they can be assumed in the basis of other languages as well, e.g. in Dutch; cf. (Verkuyl 1971). Aspectual differences, however, are expressed very systematically in the morphology of the slavic languages. 11 TLIl/2
162
J.Ph. Hocpelman
There is considerable disagreement among linguists as to the meaning of the two Russian aspects and their different functions. The great "Grammatika Russkogo Jazyka" tries to cover all their basic meanings (Forsyth 1970, 3): "The category of aspect indicates that the action expressed by the verb is presented: (a) in its course, in process of its performance, consequently in its duration or repetition, e.g. zit', pet', rabotat', chodit', citat', ... (imperfective); (b) as something restricted, concentrated at some limit of its performance, be it the moment of origin or beginning of the action, or the moment of its completion or result, e.g. zapet', koncit', pobezat', propet', prijti, uznat', ujti, ... (perfective)". Forsyth tries to define the difference between the perfective and the imperfective by means of Jakobson's concept of "privative opposition": "A perfective verb expresses the action as a total event, summed up with reference to a single specific juncture. The imperfective does not inherently express the action as a total event summed up with reference to a single specific juncture" (Forsyth 1970,7). The Dutch slavist Barentsen too uses the term "limit" to define the meaning of the perfective and says further: "The perfective points out that in the Narrated Period two contrasting parts are distinguished, of which one contains the Contrast Orientation Period. The element NONPERF points out that in the Narrated Period no two contrasting parts are distinguished". Furthermore, analysing the meaning of the perfective and imperfective forms "otkryt/p—otkryvat'1"— "to open"—he states: "The notion of contrast ... asks for explanation. Let us consider the following example: There exists a situation "the window is closed". After some time a change is observed: the window is now open. This means that a transition has taken place from one situation into another" (Barentsen 1971, 10; translation mine, J.H.). The similarity of this definition (and many others could be adduced) to Russell's definition of change is easily seen. Both the imperfective and the perfective forms can describe a change, but whereas the imperfective past form "zakryvalas'"—"closed" in the sentence "dver' zakryvalas'"—"the door closed/the door was closing" does not necessarily mean that the door was ever closed, the perfective past form "zakrylas'"—"closed" in "dver' zakrylas'"—"the door closed"/"the door is (was) closed"—does mean that the "limit" of closing (which in a complete (i. e. continuous) series of moments is the first moment of the door being closed—the first moment at which the sentence "the door is closed" is true) is attained. In other words, the imperfective form may describe a change like "becoming more and more closed", while the door is open, whereas the perfective form describes not only this change, but also what may be called the result of this change: the fact that the door was really closed for the first time. The attainment of this result is an event without duration, which may be called an "instantaneous event", cf. (Anscombe 1964,17).
Tense-logic and the semantics of the russian aspects
163
V.
Let us now assume that we have attached a system for predicate logic to the systems of propositional tense-logic given above. We express predicate constants by "nV, "m2", ..., predicate variables by "jj", "f2", · - ·, individual constants by "a", "b", ..., individual variables by "x", 'V, 'V» - · - In this article we will only consider one-place predicates. Now we can express that an individual, a, gets a quality (e.g. "closed") for the first time: 1.
Hinija & n^a
Clearly (1) is too strong to represent the meaning of "dver' zakrylas/p"— "the door closed"—. Neither the Russian, nor the English sentence imply that the door has never been closed before. What we want to express is, that for some period the door became less and less open and was closed finally. If we try to express this in the propositional tense-logic given above, the best we can get is: 2.
HF-imja & n^a
In dense time HF-in^a is true if -11%a was true during some interval until now. But it is also possible that HF-in^a is true because in^a is true in the future—as can easily be inferred from ax. 2 and ax. 5. Even if we therefore stipulate that Gm1a, HF-imja can in a dense series be verified by a "fuzz": if between any past moment of -in^a's truth and the present, however close, there is a moment of -i mj a's falsehood and conversely, cf. (Prior 1967,108). A second difficulty is, that the standard predicate logical systems do not enable us to relate the result of an event to the event itself, so that we cannot distinguish between an event that stops because its result is attained and an event that stops without its result being attained: 3.
is true when a in the past gradually became mx and finally was (or is) 1%, as well as when a "jumped" to mj. On the other hand, if we have an expression "Φ(πΐι a)" to represent the imperfective verb "zakryvat'"—"to close (gradually)",
4. would be true if a stopped closing gradually without finally being closed, as well as when this result was indeed attained.
164
J.Ph. Hoepelman VI.
To express the concept of gradual becoming Potts (1969) has devised a system of rules of natural deduction. His main rule we use here as an axiom2. If p stands for "x is mj", than Δρ stands for "x becomes m^'. Ax. PL Δ/ίχ=>τ/[χ If we substitute "a is closed" for^x in PI. we get: "if a becomes closed, a is not closed". Contraposition gives "if a is closed, it doesn't become closed". We attach Potts' operator an axiom to the system of predicate logic we choose.
VII. We still cannot express that a proposition, p, was true during a certain interval of time. To express intervals in non-metric systems Kamp (1968) has developed the dyadic proposition-forming operators on propositions S(ince) and U(ntil). S(p, q) means: "Since a time at which p was true q was true until now", and U(p, q) means: "From now q will be true until a time at which p is true". Kamp has proved the functional completeness of S and U (Kamp 1968). We give some expressions defined in S and U (Prior 1967,106f.): PA FA ΗΆ G'A P'A F'A
2
= d S(A, A o A): "A i3 A has been true since A was the case". =d U(A, A => A): "A => A will be true until A is the case". = d S(A=>A, A): "A has been the case since A ID A, i.e. A has been true for some period until now". = d U(Az> A, A): "A will be true until A=> A will be true, i.e. A will be true during some period from now". =d-iHSA: "There can be found no interval, however short, stretching from the past until now, during which π A is uninterruptedly true". =
Potts' system: A, B for /„x,,,
a.
ΔΑΒΞΑ ΔΒ ΔΑ
, ΔΑ&ΔΒ b. Δ(Α & B) , -,ΔΑ d. -
ΔΔΑ
Ad b.: provided that "B" depends on no premisses other than "A", and that "A" depends on no premisses other than"B". I do not think that Potts' system is suited to express all kinds of becoming, e.g. not the concept of becoming more and more open.
Tense-logic and the semantics of the russian aspects
165
H'A and G Ά are the expressions we have been looking for. H'A is not verified by a "fuzz". Kamp (personal communication) has divised an axiom-system for S and U: Axioms : I. II . 1. 2. 3.
4. 5. 6. 7. 8. 9. 10.
All axioms of standard propositional logic. All formulas of the following forms : nS(A&-iA,B) S(A, B) & S(C, D) ΞΞ (S(A & C, B & D) v S (A & D & S(C, D), B & D) v S(C&B&S(A,B),B&D) S(A & U(C, D), B) = (S(C & B & S(A, B & D), B) v (C & S(A, B & D)) v (U(C, D) & D & S(A, B & D)) (iS(A v -ιΑ, -ιΑ) & S(C,D)) = (S(A & D & S(C, D), D) & nS(A v πΑ, πΑ)) S(A v B, C) = S(A, C) v S(B, C) HA = -.S(-iA,Av-iA) S(AvnA,AvnA) U(A v πΑ, Α ν -ιΑ) -iS(A v -iA, A & -iA) nS(A, B) = iS(A, A v πΑ) ν -iS(B v -iB, B) v S(-iA & (-iB v nS(B v -,Β, Β), nB)
Rules of inference: MP.
ΗΑ,ΗΑ=>Β hB
Eq.
HA = A' hB^B'
If A is a subformula of Β and B' results from replacing an occurrence of A in Β by A'. RM. Mirror-image rule for S and U (cp. Sect. II). Def. P, H', P7, F, G', F', as given above. The axioms 1 —6 correspond to linear time, the axioms 1 —7, 8 to linear nonbeginning, non-ending time, axiom 9 to dense time, axiom 10 to complete time. Furthermore we assume that we have attached a system for predicate-logic, extended with Ax. PL (cp. Sect. VI), to I and II. VIII. 6.
H'An^a &
is true if a is closed now for the first time, after becoming more and more closed during some period. We can prove3 that 1
Throughout proofs of theorems and lemma's can be found in the Appendix.
166
7.
J.Ph. Hoepelman
H'p=> Pp
and thus 8.
Η'Δ/iX &/ lX z> Ρ-,/ιΧ &/iX
can be inferred from axioms 1 — 6, 9, for dense time (cp. Sect. VII). From P^rrt! χ & mjx "x wasn't closed and is now closed" we can, by means of PL., infer T^-imjX, mjx), i.e. the contrast that, according to Forsyth, Barentsen and other grammarians is implied by a sentence like "dver' zykrylas'p"—"the door closed", and that, according to Russell can be used to define change. But because 9.
Η'Δη^χ & m t x
is stronger than P-imjX & nijx, we are now able to express formally the difference between the proposition that a door, a, was closing during some period until now and is now, indeed, closed for the first time, and the proposition that a was closing during some period until now, still being open at the present moment. The former was expressed by (9), the latter we can express by 10.
H'Amja & -in^a.
Although it is possible that a Russian sentence with a perfective-past verb form refers to present time, as in the following examples, this is not often the case : 11.
Umer — vskriknul Kukuskin, brosajas' na koleni u ego krovati — Umer. He is dead/he died — shouted Kukuskin falling on his knees at his bed — He is dead/he died. (B.N. Polevoj, Ex. Russ. Synt. II, 300.)
12.
My pogibli We are lost (Bondarko, 1967, 99)
13.
I teper', poborov otorop', ja resil . . . And now, having fought my shyness, I decided to ... (A. Terc. Pchenc.)
Most Russian sentences with perfective-past verb forms, however, refer to past time. There are, moreover, examples of sentences in which the perfective-past verb is ambiguous in respect to time : 1 4.
Kogda my prisli, oni usli When we arrived, they had (already) gone/When we arrived, they just went away. (Forsyth 1970, 68.)
To (14) we can ascribe the following structure : 15.
P(O(/ixn)&(PO(/jxm)vO(y5xin)), where Φ^χ«) represents a formula in which /^ occurs.
Tense-logic and the semantics of the russian aspects
167
So we can assume that 16. 17.
Η'Δ/ιΧ&/ιΧ, Ρ(Η'Δ/ιΧ&/ιΧ) as well as
18.
(Η'Δ/ιΧ &/ lX ) ν Ρ(Η'Δ/ιΧ &/lX)
are represented by forms like "zakrylas' " in surface structure. It is equally possible to assume that only (18) is represented by forms like "zakrylas'" in surfacestructure, because (18) is implied by (16) as well as by (17), which occur on some supposed deeper level, without direct representation in the surface-structure. By means of the λ-operator (cf. Carnap 1959, 82ff, 129ff.) we define a predicateforming operator on predicates, p, such that ffi κ is true if and only if Η'Δ^ίΧ & jJX is true : Def. p.: PA =<1(λχ)(Η'Δ/1χ &/lX). Let us for the time being assume that 19.
in the deep structure of Russian sentences is represented by perfective-past forms like "zakrylas'". From 19. we can easily infer 20.
ΡΗ'Δ/ ιΧ νΗ'Δ/ ιΧ ,
while on the other hand from (20) it is impossible to infer (19). If we assume furthermore that (20) occurs in the deep structure of russian sentences with forms of imperfective-past verbs, like the imperfective-past correlate of "zakrylas"', i.e. "zakryvalas'", this corresponds to the situation that in Russian 21. *Dver' zakrylas/p, no ne zakryvalas'1 The door was closed, but didn't close is unacceptable, while 22.
Dver' zakryvalas'1, no ne zakrylas/p. The door was closing, but wasn't closed finally.
is a normal sentence. We may therefore assume, that (21) is unacceptable for a logical reason, but not unwellformed (in the sense of "not generated by the formation-rules"). 23.
Jesli dver' zakrylas'1, ona zakryvalas/p If the door was closed, it was closing
can therefore be considered as an instance of a tense-logical postulate for Russian.
168
J.Ph. Hoepelman IX.
Russian perfective verb-forms of verbs like "zakryt's'a"—"to close" with present-tense endings refer to the future, either denoting an event that is already taking place in the present and has its result in the (near) future, or an event that starts in the future and has its result in a more distant future. Examples: 24.
Ja risuju1 portret moej staroj n'ani; kupite1 li vy etot portret, kogda ja jego narisujup? I am drawing the portrait of my old nanny, will you buy it when it is finished?
25.
Vse ze oni dumali1, cto Kutuzov dozivetp do rassveta. Nevertheless they thought that Kutuzov would live until dawn.
26.
Kogda vy im napisete1* ? My im napisemp cerez tri nedeli. When will you write them? We will write them in three weekstime,(compare also Rassudova, 1968, 93—94)
This corresponds to a consequence of 27.
FP/lX
in linear time. We first prove: 28.
F(H'p & q) => U(q, ρ) ν FU(q, p), and then:
29.
Ff/lX z> U(/lX, Δ/ιΧ) ν FU(/lX, Δ/χχ).
Proof of (29): Subst. Δ/ίχ/ρ JJx/q ω· (28): Def. |>. If we now define a dyadic proposition-forming operator on propositions, Qa, "and then", "and after that", corresponding to Anscombe's Ta in the following way Def. Q.: Q a (A,B)= d F(A & FB) ν (A & FB) We can infer from axioms 1—6, 9, for linear, dense time (cp. Sect. VII): 30.
(U(p, q) ν FU(p, q)) ID Qa(q, p)
Substitution of Δ^χ/q and^x/p in (30) gives 31.
(U(/lX> Δ/,χ) ν FU(/lX) Δ/ιΧ)) ^
(^(AfrJri.
From Ο,(Δ^χ,^χ) we infer by PI, Lemma 9 and PL. 32.
Ο,ίπ/,χ,^χ),
and so from (29) and (31) by Syll.: 33.
Tense-logic and the semantics of the russian aspects
169
So, if we assume that Fp/jX occurs in the deep structure of Russian sentences with perfective-present verb-forms, the contrast of Barentsen, Forsyth and Russell, mentioned previously, can be inferred for linear, dense time. Furthermore, as we saw, U0£x, A/[x) v F(U(/jx, Δ£χ)) can be inferred from Ff^x. From U(/1x,A/1x)vFU(/1x,A^x) we can infer 34.
G'A/ix ν
by Lemma 4 and Lemma 9. Conversely, from (34) we cannot infer FJ^x. Assuming that (34) occurs in the deep-structure of Russian sentences with the imperfective correlate of perfective verbs with present-tense endings, e.g. "budet zakryvat's'a" — "will be closing" — then the situation described above again corresponds to that of Russian: 35. *Dver' zakroets'ap, no ne budet zakryvat's'a1 The door will close, but it will not be closing is unacceptable — this, as previously pointed out, for a logical reason —, while 36.
Dver' budet zakryvat's'a1, no ne zakroets'ap The door will be closing, but it will not be closed
is perfectly acceptable.
X.
A negated perfective verb can often be replaced by the negated corresponding imperfective verb (Forsyth 1970,102f.). This possibility is also accounted for by our assumed deep-structure for perfective verbs. 37.
ρ/ιΧ=Η'Δ/ιΧ&/ιΧ (byDef.p.)
If one of the conjuncts of the right member of (37) is negated, than p/[x is not true. Thus, if we assume Η'Δ/χΧ to occur in the deep-structure of imperfective verbs, the negation of Η'Δ^χ suffices for the negation of £/Jx. On the other hand, as we saw, the negation of a perfective verb can mean that the result of the event described by the verb, i.e. jix, is not attained, while nevertheless Η'Δ/Jx was the case: 38.
Ja dolgo ubezdal1 prepodavaternicu, cto v etom net nikakogo anarchizma, naoborot—Ubezdal1, no ne ubediP. I tried for a long time to convince the teacher that this was not a manifestation of anarchism, on the contrary. I tried to convince her, but I didn't succeed. (Erenburg. L'udi, gody, zizn'. From Forsyth 1970,104r)
170
J.Ph. Hoepelman XL
We have already assumed that expressions in which Η'Δ^χ occurs play a role in the deep structure of Russian sentences with imperfective-past verb forms. We have seen that these forms are implied by the postulated expressions for the deep-structure of perfective forms, so that it is impossible to state the perfective form, but to deny the imperfective one. It is, however, not possible to replace the perfective form by the imperfective one in all contexts. Perfective forms are required when the verb has the function of a perfect and when a series of successive events is described (Forsyth 1970, 92f.). We will try to find a formal expression for these contexts. The perfect meaning of a Russian perfective verb form expresses that a situation, of which the beginning is described by the perfective-past verb, has been existing up to and including now. 39. 40.
41. 42.
On pol'uhiP ee He fell in love with her (and still is in love with her)/He loves her. Moroz snova krepkij—podulp severnyj veter It's hard frost again (because) the north wind has got up. (Erenburg. OttepeF) Ja zabylp, gde on zivet I forget where he lives On s uma soselp He is mad (examples from Forsyth 1970, loc. c.)
As a formal expression of this perfect (for the group of verbs considered here) we propose: 43.
/,x & S(P/1X,/1X)
e.g.: "the door is closed now, and has been closed since it became closed". When perfective verbs are used to describe a series of successive events, each perfective verb describes an event that takes place in a new situation, the beginning of which is described by the preceding perfective verb. This situation can continue to exist after the new event has started, but it is equally well possible that it ceases therewith. 44.
D'akon vstalp, odels'ap, vz'alp svoju tolstuju sukovatuju palku i tycho vyselp iz domu The deacon got up, dressed, took his thick, rough stick and quietly left the house (Cechov, Duel'. Forsyth 1970, 65).
45.
On otkrylp dver', vyselp, i zaper1* ee op'at' He opened the door, went out, and closed it again (Forsyth 1970, 9).
Tense-logic and the semantics of the russian aspects
171
As a general formal expression of such a sequence we propose: 46.
P(0(x„+1) & S(|>/A & S(p/k_lXn_, & S(. . . & S(P/lXl ,fa\ . . .),
Λ-Λ-ι)./Α v (Φ(*.+ι) & SQP/,Α, & S(?/k_lXn_i & S(. . . &
Xj may be identical with x^/j For the future we replace in (46) all occurrences of Ρ by F, and of S by U in accordance with RM. The formulation of (46) as a disjunction allows us to consider sequences of events of which the last one took place in the past, as well as sequences of events of which the last one takes place in the present (and which eventually goes on in the future). The presence of the expression Φ(χ,ι+ι) allows for the possibility of an interruption or termination of the sequence of perfectives by an imperfective expression, as is often the case in Russian: 47.
Cto i govorit', eto bylo ne samoe obrazcovoe otdelenie. Proderzalip nas tarn minut sorok — kuda-to zvonili1, vyjasn'ali1, trebovali1 fotoplenku —i tol'ko posle aktivnych nasich ubezdenij . . . i dopol'nitel'nych zvonkov nas otpustilip i daze izvinilis/p. I must say it wasn't a model police-station. They held us there for about forty minutes while they made phone-calls, asked questions, demanded the film from the camera. And it was only after our active persuasions . . . and further phone-calls that they let us go, and even apologised. (V. Nekrasov. Po obe storony okeana. Forsyth 1970, 65.)
We see now, that (43) is a special case of the second member of the disjunction (46).
XII.
Except for the expression of a gradual change (and certain other functions), the imperfective forms of Russian verbs can have two functions that stand in a relationship to one another, and to the perfect meaning of perfective verbs. The first one is the expression of a "two-way-action", the other that of a repeated action. The imperfective verb describing a two-way-action stipulates that the situation which came into being by the action described by the verb does not exist any more. This function of the imperfective thus contrasts to the perfect meaning of perfective-past verbs.
172
J.Ph. Hoepelman
48.
Vojd'a v komnatu on skazal tovariscu — Kak zdes' dusno! Ty by chot' otkrylp okno. — Da ja ego nedavno otkryval1. When he entered the room he said to his friend: "How stuffy it is in here! You might at least have opened the window". "But I did open it (have it open) not long ago". (Forsyth 1970, 78.)
49.
Prochod'a mimo nee on snimal1 sl'apu. As he passed her he raised (i.e. took off and put on again) his hat.
Compare (49) to (50) in which the corresponding perfective form of snim P, sn'alp, occurs: 50.
Vstretiv ee, on sn'alp sTapu i skazal . . . When he met her he took off his hat and said (still with his hat off) . . . (Forsyth 1970, loc. c.)
In our formalism, and for the group of verbs considered here, we can express this meaning of the imperfective as follows : 51.
(Pp/lX & H S/i*) v P(Pf>/lX & Η'-,/ι*)
Η'τ/[Χ implies iSff^f^ 52.
in dense, linear time:
HS/ lX ^nSp/ lX ,/ lX
From τ5£/ίχ, j|X we infer by PL. 53.
-,(/lX & SJ>/lX,/lX)
So we can infer from (51) by PL., Lemma 9 and RM : 54.
P(Pp/lX & -,(/lX & Sp/lX,/lX)) v (Pp/lX & -,(/lX & Sp/lX,/lX)).
(53) is the negation of (43), which we proposed as the formal expression of the perfect meaning of the perfective-past. A repetition of a proposition, p, being true at different moments can, in Anscombe's formalism, be eXpressed as follows : or
T a (T a (p,-,p),p),...etC.
T a (T a (T a (np,p),-,p),...),...etc.
Pp/JX & Η'τ/[Χ implies Ta(Ta(-ijiX,^X), n^X), i.e. a repetition of n^X. 55.
Pp/lX & H VlX ^ T.CT.Cvix, /ix),
This means that, if we assume that (51) occurs in the deep-structure of Russian sentences with imperfective-past forms, as of "zakryvatVa" — "to close" — which denote a "two-way-action", then we can infer the repetition that, as may appear from the eXamples, is implied by this function of the imperfective, given the axioms for linear, dense time.
Tense-logic and the semantics of the russian aspects
173
XIII.
The other function of imperfect! ve verb forms we mentioned was the expression of repeated action (the iterative). An imperfect!ve verb that expresses an iterative can be considered as a repetition of perfectives : 56.
Kazdyj den' on vypival1 pered obedom r'umku vodki Every day before lunch he drank a glass of vodka i.e.: v ponedel'nik vypilp, vo vtornik vypilp, ... on Monday he drank one, on Tuesday he drank one . . . (Forsyth 1970, 164).
57.
Kazdyj den' on zakryval1 okno He closed the window every day i.e.: v ponedel'nik zakrylp, vo vtornik zakryP ... he closed it on Monday, on Tuesday . . . etc.
We can infer the repetition Ta(Ta(/[x, n^x),^x) without any new axioms if we express this function of the imperfective by a repetition of a perfective verb: 58.
P(P/lX & PP/lX) v (P/lX & PP/,x) o Ta(T.(/lX)
Forms of the future tense of imperfective verbs can also have an iterative meaning: 59.
Kazdyj den' on budet vypivat'1 pered obedom r'umku vodki Every day before lunch he shall drink a glass of vodka i.e. : ν ponedel'nik vyp'etp, vo vtornik vyp'etp ... on Monday he shall drink one, on Tuesday he shall drink one, ...
60.
Kazdyj den' on budet zakryvat'1 okno Every day he shall close the window i.e.: v. ponedel'nik zakroetp, vo vtornik zakroetp, . . .he shall close it on Monday, on Tuesday, . . .
We can infer that Qa(QaOix> ~l/ix)> Jix)> # we represent this meaning of the imperfective future by a future repetition of perfectives, i.e. if a perfective will twice or more be the case. 61.
F(P/ 1 x&FP/ 1 x)=Q a (Q a (/ 1 x > Vix),/ix)
As to the semantical relationship between the perfective and the imperfective (of the group of verbs considered) these results mean, that we can suppose that in surface structure an imperfective verb form occurs when in the deep-structure an expression occurs that implies τ/Jx, Pn^x or F-ijix, but not 'S^Jx and then^x", or a repetition of ^x, or of τ/Jx, whereas the perfective occurs in the surfacestructure when in the deep structure an expression occurs that implies "i^x and then^x", but not a repetition of jix, or of -i/ix.
174
J.Ph. Hoepelman APPENDIX
Proofs of theorems. Proof of 8. Η'Δ/ιΧ & / x x => P-i/lX & / lX : Lemma 1. S(p, q) 12 S(p, π q ν q) Proof of Lemma 1.: 1.
2. 3. 4. 5. 6. 7.
n(S(p, q) & S(p, q viq)) s -,(S(p & p, q & (q v -,q)) v S(p & (q v iq) & S(p, q vnq), q & (q vnq))) v S(p & q & S(p, q), q & (q v-iq))) (PL, ax. 2, subst.) -,(S(p, q) & S(p, q v -, q)) = -,(S(p, q) v S(p & S(p, q v -, q), q) v S(p&q&S(p,q),q)) (1. PL. EQ) -,(S(p, q) & S(p, q v-.q))==-,S(p, q) & -,S(p & S(p, q v-,q), q) & -,S(p&q&S(p,q),q)) (2. DeM.) -,(S(p, q) & S(p, q v -, q)) ID S(p, q) (3. PL, Sep.) S(p, q) ID S(p, q) & S(p, q ν τ q) (4. Contrap.) S(p, q) & S(p, q) =D S(p, q ν -, q) (5. PL) S(p,q)=>S(p,qv-,q) (6. PL)
Lemma 2. H(p ^ q) ID (H'p ID H'q) Proof of Lemma 2.: 1.
nS((pvnp),-nq)& S(p ν-φ, p) ID S(nq & ρ & S(p ν ι ρ, ρ), ρ) (ax.4,Sep.)
2.
S(iq & p & S(p V -ip, p), p) ^ S(iq & p, p) &
3. 4. 5. 6. 7.
S(S(pv-.p,p),p) (1., ax. 2, PL) -, S(p v -»p, -nq) & S(p v -ip, p) z> S(iq & p, p) (1, 2, Syll, Sep.) S(n q & p, p) -3 S(T q & ρ, τ ρ ν ρ) (Lemma 1., subs.) π S(p ν np, m q) & S(p ν τ p, p) z> S(n q & p, p v τ p) (3, 4, Syll.) n ST (n p v q, p v n p) ID τ S(p v np, p) v S(p v n p, q) (5., Contr. DeM.) H(p =5 q) ID (Hrp i5 H'q) (6., Def. H, Def. H', Def. =5)
Lemma 3. H'p "=> Pp. Proof of Lemma 3.: 1. 2. 3. 4. 5. 6. 7. 8. 9.
(·ι$(ρν-,ρ,-,(ρν-,ρ))&5(ρν-.ρ,ρ))Ξ(5((ρν-,ρ)& p & S(p v n p, p), p) & -i S((p v -ι ρ), -ι(ρ ν -ι ρ)) S(p v -,ρ, p) = S((p v -ip) & p & S(p v -,ρ, p), p) S(p v -ι ρ, ρ) Ξ S((p & S(p v -,ρ, p), p) S((p & S(p v i p, p), p) => S(p, p) & S(S(p v -, p, p), p) S(p v -, p, p) :D S(p, p) & S(S(p v -. p, p), p) S(p v n p, p) => S(p, p) S(p, p) ID S(p, ρ ν τ p) S(p v -, p, p) => S(p, p v i p) H'piDPp
(Αχ. 4.) (1. Ax. 9, Eq. PL) (2. PL, Eq.) (ax. 2, PL) (3, 4, Syll.) (6. Sep) (Lemma l.) (6, 7, Syll.) (8. Def. H', P)
Tense-logic and the semantics of the russian aspects
175
Proof of 8. Η'Δ/[χ &^x=> Ρτ/[χ
1. 2. 3. 4. 5. 6.
> -ι/ιχ) Η'Δ/ϊχ => Η'Τ/ΪΧ Η'Δ/ιΧ => Ρ-,/ix H f A/ 1 x&/ 1 x^PVi x &/ix
(PL) 0· Ax· l-> ax· 60 (2. Lemma 2., MP.) (Lemma 3.) (3, 4, Syll.) (5. PL) Q.E.D.
Proof of 28. F(q & H'p) z> FU(q, p) v U(q, p) :
2. 3.
4. 5.
U(q & S(p^p, P ), P =D P )s(U((p^ p ) & (pup) & P=> Ρ) ν ((p=> p) & U(q, (p=> p) & ρ)) ν (S(p=> p, p)& p&U(q,( P ^ P )& P )) (ax. 3) U(q & S(p => p, p), p o p) = (U(Uq, p), p o p) v U(q, p) v (S(P => P, P) & P & U(q, p)) (1. PL. Eq.) U(q&S(p=>p,p),p=>p)=> (U(U(q, ρ), ρ => p) v U(q, p) v U(q, p) (2. PL. Eq.) U(q & S(p ID ρ, ρ), ρ 13 p) o (U(U(q, p), p ^ p) v U(q, p) (3.PL.) F(q & H'p) 3 FU(q, p) v U(q, p) Q. E. D.
Proof of 30. (U(p, q) ν FU(p, q)) =5 Qaq, p: Lemma 4. U(p, q) => U(p ν τ p, q) Proof of Lemma 4. : 1. 2. 3-
-iU(Ov-i P ,q) = iU( P ,q)&-iU(-i P ,q) -.U(pvip,q)=>-.U(p,q) (1. PL., Sep.) U(p, q) 13 U(p ν τP, q) (2. ContraP.)
(Ax. 5. DeM.) (1. PL., Sep.) (2. Contrap».)
Lemma 5. G'P & G'q ^ G'(P & q) Proof of Lemma 5 .
1. 2.
3.
4. 5.
& U(p v-ip, p) & U(p ντ Ρ , q) = (U(p v-ip, P& q) ν U((p v q & U(P ν τ Ρ , q), P & q) ν U((P ν τ Ρ ) & (ax. 2, Eq.) ρ & U(p ν -, ρ, ρ), ρ & q)) & U(P ντ Ρ , Ρ) & U(p ντ Ρ , q)=> (U(P vi p , P & q) ν U((P q & U(P ν -ip, q)), p & q) ν U((p ν -ιΡ) & P & U(p ν τ Ρ , ρ) ν ι((ρ ν τρ) & (1 . Lemma 4. PL.) p & U(p ν π ρ, ρ)), ρ & q)) U(P VTJ>, P) & U(P ντ Ρ , q)=>U( P vn p , P & q) vU( P ν Ρ, Ρ & q) (2.Eq.) vU(Pv-iP,P&q) (3. PL.) U(p ν -,ρ, ρ) & U(p ν -,ρ, q) z> U(p ν -,ρ, ρ & q) (4. Def. G;, Eq.) G'p&G'q=>G'(p&q)
176
J.Ph. Hoepclman
Lemma 6. Fpn>G'Fp Proof of Lemma 6.: 1.
2. 3. 4. 5. 6. 7. 8.
(-iU(p V-ip,-nU(p,p V-ip)) & U(p,p V-ip)) = (U(-iU(p,pVnp) & (p V n p )
&υ(ρ,ρντρ), ρ ν -,p) & -»U(p ν τρ, -nU(p, p v τ p))) (Ax. 4. Subst.)(RM.) (-,U(p v-,p,-r,U(p, ρ ν-,ρ)) & U(p, p vip))z> (U(-,U(p, p ν-,ρ) & (p vip) & U(p, p v-.p), p v -.p)) (1. PL., Sep.) (-iU(p v i p, -n U(p, p v -· p)) & U(p, ρ v -i p)) ID (U(iU(p, p v -i p) & U(p, p v -, p), p v -, p)) (2. PL., Eq.) -i U(-, U(p, p v -i p) & U(p, p v τ p), p v τ p) (ax. l., Subst.) i(-iU(p v -ip, -iiU(p, ρ ν -ip)) & τΙΙ(ρ, ρ ν -ιp) (3, 4. Contrap. MP.) U(p v -ip, -nU(p, ρ ν τ p)) v iU(p, ρ ν τ p) (5. DeM.) U(p, p vip)=> U(p vnp, U(p, p ν-,ρ)) (6. Def. ID) Fpi^G'p (7. Def. F, G')
Lemma 7. FFp =2 Fp Proof of Lemma 7.:
1. 2. 3. 4.
U((p v -ip) & (p v -ip) & U(p, p v ip), (p v -ip) & (p v -·ρ)) ^ U(p ν-,ρ, ρ ν-,ρ) & U(p, p ν-,ρ) (ax. 2, PL.)(RM) U((np v p) & U(p, -,ρ v p), ip v p) ID U(p, -ip v p) (2. Sep.) U(U(p, ρ v -, p), -, p v p) ZD U(p, p v i p) (3. Eq.) FFp->Fp (4. Def. F.)
Lemma 8. U(p, q) ID F(q & Fp) Proof of Lemma 8.: 1. 2. 3. 4. 5. 6. 7.
U(p, q) r> Fp G'(q & Fp) ID F(q & Fp) U(p,q)iDG'q Fp=)G'Fp U(p,q)rDG'q>p (G'q & G'Fp) z> G'(q & Fp) U(p,q)^F(q&Fp)
(Lemma l, RM) (Lemma 3, RM) (Lemma 4, RM) (Lemma 6) (3,4, PL) (Lemma 5) (7,2 Syll.)
Tense-logic and the semantics of the russian aspects
177
Lemma 9. G(p ^ q) => (Fp z> Fq) Proof of Lemma 9.: 1. 2. 3.
i(U(-iOp v q), p v -ιρ) ν U(q, p v τ ρ)) Ξ i(U((-.(-i p v q) v q), p v -. p) (ax. 5., Eq.)(RM) -i(U((-i(-ip v q ) v q ) , p v - , p ) = i(U((p & nq) v q), p v -i p) (DeM. Eq.) i(U((p & -. q) ν q), ρ ν -. ρ) = -,(U((p ν q) & (q ν -ι q)), ρ ν ι ρ) (Distr. Eq.)
4.
-iU((p ν q) & (q ν nq), ρ ν -ιρ) = -iU(p v q, p v -φ)
5. 6. 7. 8. 9. 10.
(PL. Eq.)
-iU(p vq, ρ VTp) = -iU(p, ρ ντρ) & iU(q, ρ ντρ) (ax. 5, DeM.) -iU(p v q, p v ip) => -iU(p, ρ ν τ ρ) (5. Sep.) -.(U(-.(-ip ν q), ρ ν -ιρ) ν U(q, ρ ν -,ρ)) => -.U(p, ρ ν -ip) (l, 6, Syll.) -i(nU(-i(Tpvq),pv-ip)z>U(q,pv-ip))iD-iU(p,pv-ip) (7, Def. =>) i(G(p =>q) =5 Fq) ζ? π Fp (8. Def. G, F, Def. =>) G(p 35 q) Z5 (Fp z> Fq) (9. PL.)
Proof of 30. 1. 2. 3. 4. 5. 6.
G(U(p, q) => F(q & Fp)) (Lemma 8, ax. 1, RM, Def. G.) FU(p, q) ^ FF(q & Fp) (Lemma 9,1., MP.) FU(p, q) => F(q & Fp) (Lemma 7.) U(p, q) ν FU(p, q) => F(q & Fp) (3., Lemma 8, PL.) U(p, q) ν FU(p, q) ^ F(q & Fp) ν (q & Fp) (4. PL.) U(p,q)vFU(p,q)^Qaq,p Q.E.D.
Proof of 52. HVlX ID -, Lemma 10. H'q =5 τ S((p =5 ρ), τ q) Proof of Lemma 10.: 1.
S(p v-ip,q) & S(p Vip,nq)=(S((p v n p ) & (p Vnp), q & nq) ν S((pVTp)&iq& S(pVTp,nq),q&Tq)v
2.
S(p νnp, q) & S(p ντρ, nq) ID S(p ν np, q & -iq) ν S(p ν np, q & -iq) ν
3. 4. 5. 6. 7.
S(p ν τ ρ, q & π q) (1. Lemma 4, Eq.) S(pv-.p,q)&S(pv-,p,-,q):z>S(pvip,q&-,q) (2. PL.) τ S(p ν τ p, q & τ q) =5 i(S(p ν τ ρ, q) & S(p ν τ ρ, η q)) (3. Contrap.) τ S(p ν τ ρ, ρ & τ ρ) (ax. 9.) -,(S(pv-,p,q)&S(pv-ip,-,q) (4,5, MP.) H'q => ·, S(p => ρ, π q) (DeM. Def. =>, Def. H'.)
S((p v-ip) & q & S(p ν np, q), q & -.q)
12 TLI1/2
(ax. 2)
178
J.Ph. Hoepelman
Proof of 52. 1. 2. 3. 4. 5.
S(p, q) ID H'q H'q =3 P'q S(p,q)=>P'q -i P'q ID π S(p, q) HS q ^ -ι S(p, q)
6.
HViX^Sfltfx,/^)
(Lemma 4.) (Lemma 10, Def. P'.) (l,2,Syll.) (3, Contrap.) (4, Def. H'.)
(5. Subst.) Q.E.D.
Proof of 55. HS/iX & PftJxiD Ta(Ta(Vix,/ix), Lemma 11. (H'p & Pq) => P(p & Pq) Proof of Lemma 11.: 1. 2. 3. 4.
H'p & Pq => H'p & H'Pq H'p & H'Pq ID H'(p & Pq) H'(p & Pq) =5 P(p & Pq) H'p & Pq ID P(p & Pq)
(PL. RM. Lemma 6.) (Lemma 5, RM.) (Lemma 3.) (1, 3, Syll.)
(Proof of 55.:) 1. 2. 3. 4.
H'-i/iX & PP/iX =5 P(Vix & ρΡ/ιχ) (Lemma 11, Subst.) P(n/ix & PP/ix) = P(Vix & P^x & Η'Δ/ιΧ) (Def. f.) P(-,/ix & P(/iX & Η'Δ/ιΧ) => P(^x & P(/tx & P-i/ix) (PL, Lemma 3, Lemma 2, Lemma 9, RM.) H S/ix & Pf/lX ο Ta(Ta(Vix, /ix), Vix (1> 3, Syll, Def. Ta) Q.E.D.
Proof of 58. (P/1X & Pf/lX) ν P(f/lX & Pftfx) ^ Ta(Ta(/lX, Vix Lemma 12. P(q & r) :=> (Pq & Pr) Proof of Lemma 12.: 1. 2. 3. 4.
S(p, r) & S(q, r) Ξ (S(p & q, r) v S(p & t & S(q, r), r) v S(q & r & S(p, r), r)) S(p & q, r) = S(p, r) & S(q, r) S(p & q, p v n p) o S(p, ρ ν τ ρ) & S(q, ρ ν π ρ) Ρ(ρ & q) ζ> (Ρρ & Pq)
(Ax. 2.) (1. PL.) (2. Subst.) (3. Def, P.)
Tense-logic and the semantics of the russian aspects
179
(Proof of 58.)
1. 2. 3. 4. 5. 6. 7. 8.
p/lX & Pp/lX = (Η'Δ/1Χ &/lX) & Ρ(Η'Δ/ιΧ &/lX) (Def. P) (Η'Δ/ιΧ &/lX) & Ρ(Η'Δ/ιΧ &/lX) => (HS/lX &/lX) & P(HS/iX &/ιχ) (pl· Lemma 2, Lemma 9, RM.) χ HS^X &/ lX & Ρ(Η'-ι/ιΧ &/ι ) ^ (Lemma 12, Sep) H'-i/lX & P^x &/ lX (Lemma 12, Sep.) H'-i/iX & P/lX &/ lX iD P(-./iX & P/ix) &/ix (Lemma 11.) /lX & P(ViX & P/ix) =>/ix & P(Vix & PViX & P(/lX & P(./ix & P(Vix & P/1X & PP/1X ^ Ta(Ta(/lX, Vi P(|>/lX & Pp/lX) is Ta(Ta(/lX, -i/ix)>/ix) (Lemma 9, RM, Lemma 7, PL.) (P/1X & PJ>/lX) ν P(f/lX & Pp/lX) z, Ta(Ta(/lX, ^X),/1X) Q.E.D.
Proof of 61. F(P/1X & Fp/lX) => Qa(Qa(/lX, Lemma 13. F(q & H'p) => F(p & Fq) Proof of Lemma 13.: 1.
F(H'p&q) = U(q,p)vFU(q ) p)
2. 3. 4.
U(q, ρ) ν FU(q, p) => (F(p & Fq) ν FF(p & Fq)) F(p & Fq) ν FF(p & Fq) => F(p & Fq) F(H'p & q) = F(p & Fq)
(28.)
(Lemma 8., Lemma 9.) (Lemma 7., PL.) (1, 3, Syll.)
Proof of 61.: 1. 2. 3. 4. 5.
F(P/tx & FP/lX) = F((H'A/lX &/lX) & F(H'A/lX &/,X)) (Def. P.) F((H'A/lX & /lX) & Ρ(Η'Δ/,Χ &/,X)) r> F((H'-,/lX &/lX) & F(H' n/ix&/ix)) (PI. Lemma 9.) F((HS/lX &/lX) & F(HS/lX &/lX)) = F0ix & F(n^x & F^X)) (Lemma 12, RM., Lemma 7, Lemma 13.) F(/lX & F(-,/IX & F/lX)) = F(/,X & F(-,/lX & F/lX)) ν Οίχ & F(ViX & F/lX)) (PL.) F(P/lX & FP/,X) ^ Qa(Qa(/lX, Vix)./ix) (L *> SylL, Def. Qa.) Q.E.D.
180
J.Ph. Hoepelman
References ANSCOMBE,G.E.M.(1964) 'Before and After*. Philosophical Review, 73:1, 3—24 BARENTSEN A. (1971) 'K opisaniju semantiki kategorii "vid" i "vrem'a" (Na materiale sovremennogo russkogo literaturnogo jazyka).* Amsterdam. Unpubl. BONDARKO, A.V. BULANIN, L.L. (1967) 'Russkij glagol. Posobije dl*a studentov i ucitelej; pod red. JU. S. Maslova. Leningrad. CARNAP, R. (1958) 'Introduction to Symbolic Logic and its Applications.* New York: Dover CLIFFORD, J. (1966) 'Tense logic and the logic of change.* Logique et analyse no. 34,219—230. COOPER, N. (1966) Scale-words. Analysis, vol. 27,1966—1967, pp. 153—159. Cambridge University Press FORSYTH, J.: Ά Grammar of Aspect. Usage and Meaning of the Russian Verb.* Cambridge: JESPERSEN, O. (1924) 'The Philosophy of Grammar.* London: Allen and Unwin KAMP, J.A.W. (1968) 'Tense logic and the theory of linear order.* Diss. Univ. of Calif. KRABBE, E. (1972) 'Propositionele tijdslogica.* Amsterdam. Unpubl. LAKOFF, G. (1970) 'Linguistics and natural logic.* Synthese 22,151—271. POTTS, T. (1969) 'The logical description of changes which take time.* (Abstract) Journal of Symbolic Logic 34, 537. PRIOR, A. (1967) 'Past, Present and Future.* Oxford: Clarendon Press RASSUDOVA, O.P. (1968) 'Upotreblenie vidov glagola v russkom jazyke.* Moskva. REICHENBACH, H. (1947) 'Elements of Symbolic Logic.* London: Collier-Macmillan RUSSELL, B. (1903) 'Principles of Mathematics. Cambridge, Engl.: At the University Press VERKUYL, H. (1971) On the compositional nature of the aspects.* Diss. Utrecht: Routledge and Kegan Paul VON WRIGHT, G. (1963) 'Norm and Action, a Logical Inquiry.* London—(1965). 'And Next', Acta Philosophica Fennica, Fasc. 18.
LAURI KARTTUNEN
PRESUPPOSITION AND LINGUISTIC CONTEXT*
According to a pragmatic view, the presuppositions of a sentence detrmine the class of contexts in which the sentence could be felicitously uttered. Complex sentences present a difficult problem in this framework. No simple "projection method" has been found by which we could compute their presuppositions from those of their constituent clauses. This paper presents a way to eliminate the projection problem. A recursive definition of "satisfaction of presuppositions" is proposed that makes it unnecessary to have any explicit method for assigning presuppositions to compound sentences. A theory of presuppositions becomes a theory of contraints on successive contexts in a fully explicit discourse.
What I present here is a sequel to a couple of my earlier studies on presuppositions. The first one is the paper "Presuppositions of Compound Sentences" (Karttunen 1973a), the other is called "Remarks on Presuppositions" (Karttunen 1973b). I won't review these papers here, but I will start by giving some idea of the backgro nd for the present paper. Earlier I was concerned about two things. First, I wanted to show that there was no adequate notion of presupposition that could be defined in purely semantic terms, that is, in terms of truth conditions. What was needed was a pragmatic notion, something along the lines Stalnaker (1972) had suggested, but not a notion of the speaker's presupposition. I had in mind some definition like the one given under (1). (1) Surface sentence A pragmatically presupposes a logical form L, if and only if it is the case that A can be felicitously uttered only in contexts which entail L. * Presented at the 1973 Winter Meeting of the Linguistic Society of America in San Diego. This work was supported in part by the 1973 Research Workshop on Formal Pragmatics of Natural Language, sponsored by the Mathematical Social Science Board. I acknowledge with special gratitude the contributions of Stanley Peters to my understanding of the problems in this paper. Any remaining confusions are my own.
182
Lauri Karttunen
The main point about (1) is that presupposition is viewed as a relation between sentences, or more accurately, as a relation between a surface sentence and the logical form of another.1 By "surface sentence" I mean expressions of a natural language as opposed to sentences of a formal language which the former are in some manner associated with. "Logical forms" are expressions of the latter kind. "Context" in (1) means a set of logical forms that describe the set of background assumptions, that is, whatever the speaker chooses to regard as being shared by him and his intended audience. According to (1), a sentence can be felicitously uttered only in contexts that entail all of its presuppositions. Secondly, I argued that, if we look at things in a certain way, presupposition turns out to be a relative notion for compound sentences. The same sentence may have different presuppositions depending on the context in which it is uttered. To see what means, let us use "X" as a variable for contexts (sets of logical forms), "A" and "B" stand for (surface) sentences, and "PA" and "PB" denote the set of logical forms presupposed by A and B, respectively. Let us assume that A and B in this instance are simple sentences that contain no quantifiers and no sentential connectives. Furthermore, let us assume that we know already what A and B presuppose, that is, we know the elements of PA and PB. Given all that, what can we say about presuppositions of complex sentences formed from A and B by means of embedding and sentential connectives? This is the notorious "projection problem" for presuppositions (Morgan 1969, Langendoen & Savin 1971). For instance, what are the presuppositions of "If A then B"? Intuitively it would seem that sentential connectives such as if... then do not introduce any new presuppositions. Therefore, the set Py A then B should be either identical to or at least some proper subset of the combined presuppositions of A and B. This initially simple idea is presented in (2). (2) PIT A U « B S P A ^ P B However, I found that when one pursues this line of inquiry further, things become very complicated. Consider the examples in (3). (3) (a) If Dean told the truth, Nixon is guilty too. (b) If Haldeman is guilty, Nixon is guilty too. (c) If Miss Woods destroyed the missing tapes, Nixon is guilty too. In all of these cases, let us assume that the consequent clause "Nixon is guilty too" is interpreted in the sense in which it presupposes the guilt of someone else. The question is: does the compound sentence as a whole carry that presupposition? In the case of (3a), the answer seems to be definitely jes, in the case 1
1 There is some question over whether this notion of presupposition is properly labeled "pragmatic". For Stalnaker (1972, 1973), pragmatic presupposing is a propositional attitude of the speaker. However, I will follow Thomason (1973) and others who would like to reserve the term "presupposes" for relations (semantic or pragmatic) between sentences. The idea that it is important to distinguish in this connection between surface sentences and their logical forms is due to Lakoff (1972, 1973).
of (3b) definitely no, and in the case of (3c) a maybe, depending on the context in which the sentence is used. For example, if the destruction of the tapes is considered a crime, then Miss Woods would be guilty in case she did it, and (3c) could be a conditional assertion that Nixon was an accomplice. In this context the sentence does not presuppose that anyone is guilty. But in contexts where the destruction of the tapes in itself would not constitute a crime, (3c) apparently does presuppose the guilt of someone other than Nixon. These examples show that if we try to determine the presuppositions of "If A then B" as a particular subset of the joint presuppositions of A and B, the initial simplicity of that idea turns out to be deceptive. In reality it is a very complicated enterprise. The kind of recursive principle that seems to be required is given in (4a) in the form it appears in Karttunen (1973b). (4b) says the same in ordinary English.
(4) (a) $P_{\text{If A then B}}/X = P_A/X \cup (P_B/(X \cup A) - (E_{X \cup A} - E_X))$
where $E_X$ is the set of logical forms entailed (in the standard sense) by X, and $X \cup A$ is the result of adding the logical form of A to X.

(b) The presuppositions of "If A then B" (with respect to context X) consist of (i) all of the presuppositions of A (with respect to X) and (ii) all of the presuppositions of B (with respect to $X \cup A$) except for those entailed by the set $X \cup A$ and not entailed by X alone.

One would like to find a better way to express this, but I am not sure there is one.2 It really is a complicated question. So much for the background. What I want to show now is that there is another way to think about these matters, and about presuppositions of complex sentences in particular. Let us go back for a moment to the attempted pragmatic definition in (1). The point of that definition is that the presuppositions of a sentence determine in what contexts the sentence could be felicitously used.
2 Peters has pointed out to me that, under certain conditions, (4a) is equivalent to the following projection principle: $P_{\text{If A then B}} = P_A \cup \{A \supset C \mid C \in P_B\}$.
Peters' principle has the advantage that it assigns the same set of presuppositions to "If A then B" irrespective of any context. Note that this set is not a subset of $P_A \cup P_B$, as required by my initial assumption in (2). Peters' principle says that, for each presupposition of B, "If A then B" presupposes a conditional with that presupposition as the consequent and the logical form of A as the antecedent. In addition, "If A then B" has all of the presuppositions of A. I realize now that some of the complexity in (4a) comes from trying to state the principle in such a way that (2) holds. If this is not worth doing, Peters' way of formulating the rule is superior to mine. However, in the following I will argue that we can just as well do without any explicit projection method at all, hence the choice is not crucial.
A projection method, such as (4a), associates a complex sentence with a class of such contexts by compiling a set of logical forms that must be entailed in any context where it is proper to use the sentence. Thus we say that the sentence "If A then B" can be felicitously uttered in context X only if X entails all of the logical forms in the set $P_{\text{If A then B}}/X$ defined in (4a). There is another, much simpler, way to associate complex sentences with proper contexts of use. Instead of characterizing these contexts by compiling the presuppositions of the sentence, we ask what a context would have to be like in order to satisfy those presuppositions. Of course, it is exactly the same problem but, by turning it upside down, we get a surprisingly simple answer. The reason is that we can answer the latter question directly, without having to compute what the presuppositions actually are. The way we go about this is the following. We start by defining, not presupposition, but a notion of satisfaction of presuppositions. This definition is based on the assumption that we can give a finite list of basic presuppositions for each simple sentence of English. For all cases where A is a simple, non-compound sentence, satisfaction is defined as in (5).
(5) Context X satisfies-the-presuppositions-of A just in case X entails all of the basic presuppositions of A (that is, $P_A \subseteq E_X$).
The basic presuppositions of a simple sentence presumably can be determined from the lexical items in the sentence and from its form and derivational history, say, the application of certain transformations such as Pseudo-Clefting. To give a somewhat oversimplified example, consider the word too that occurs in the examples under (3). As a first approximation to the meaning of too we could give a condition like the one in (6), which is based on Green (1968).
(6) Context X satisfies-the-presuppositions-of "a is P too" only if either (i) X entails "b is P" for some b (≠ a), or (ii) X entails "a is Q" for some Q (≠ P).

This in turn is equivalent to saying that a simple sentence like "Nixon is guilty too" either has a presupposition that someone else is guilty or that Nixon has some other property.3 One or the other must be entailed in context. For compound sentences we define satisfaction recursively by associating each part of the sentence with a different context. The basic idea behind this
3 It appears to me that the only contribution too makes to the meaning of a sentence is that it introduces a presupposition whose form depends on the sentence as a whole and the particular constituent too focuses on. If this is so, there is no reason to assume that too is represented in the logical form of the sentence. As far as the truth conditions are concerned, "Nixon is guilty too" seems equivalent to "Nixon is guilty"; therefore, it is possible to assign the same logical form to them. The same point has been raised in Lakoff & Railton (1971) with regard to two-way implicative verbs, such as manage, whose only function also seems to be to bring in a presupposition.
was independently suggested in both Stalnaker (1973) and Karttunen (1973b). For conditionals, satisfaction is defined in (7).
(7) Context X satisfies-the-presuppositions-of "If A then B" just in case (i) X satisfies-the-presuppositions-of A, and (ii) $X \cup A$ satisfies-the-presuppositions-of B.
As before, the expression "$X \cup A$" denotes the set that results from incrementing X with the logical form of A.4 For conjunctions, that is, sentences of the form "A and B", satisfaction is defined just as in (7). For disjunctions, sentences of the form "A or B", we have "~A" instead of "A" in part (ii). Examples that illustrate and support these principles can be found in my earlier papers.5 Note that satisfies-the-presuppositions-of is a relation between contexts and sentences. As I have tried to indicate orthographically, we are defining it here as a primitive, irreducible locution. Eventually it would be better to replace this clumsy phrase with some simple verb such as "admits", which has the right pragmatic connotations. I keep the former term only to bring out the connection between (4) and (7) more clearly. At the end, of course, it comes down to having for each simple sentence a set of logical forms that are to be entailed (in the standard logical sense) by a certain context. What is important is that we define satisfaction for complex sentences directly without computing their presuppositions explicitly. There is no need for a projection method. Secondly, in case a sentence occurs as part of a larger compound, its presuppositions need not always be satisfied by the actual conversational context, as long as they are satisfied by a certain local extension of it. For example, in order to admit "If A then B" a context need only satisfy-the-presuppositions-of A, provided that the presuppositions of B are satisfied by the context as incremented with the logical form of A. It can be shown that the new way of doing things and the old way are equivalent. They sanction the use of any sentence in the same class of contexts. Although it may not be obvious at first, the statement in (8) is true just in case (9) holds, and vice versa.
4 In simple cases, incrementing a context consists of adding one more logical form to it. If the context entails the negation of what is to be added to it, as in counterfactual conditionals, other changes are needed as well to keep the resulting set consistent. This is a difficult problem; see Lewis (1973) for a general discussion of counterfactuals.
5 It is possible that the principle for disjunctions, and perhaps that for conjunctions as well, should be symmetric. This depends on how we want to deal with sentences like "Either all of Jack's letters have been held up, or he has not written any" (see Karttunen 1973a, ftn. 11). A symmetric condition for "or" would read as follows: X satisfies-the-presuppositions-of "A or B" iff $X \cup \{\sim A\}$ satisfies-the-presuppositions-of B and $X \cup \{\sim B\}$ satisfies-the-presuppositions-of A. For "and", substitute "A" for "~A" and "B" for "~B".
(8)
X satisfies-the-presuppositions-of "If A then B".
(9)
$P_{\text{If A then B}}/X \subseteq E_X$
The proof is straightforward and will not be presented in detail. Here it suffices to note that, by (4a), (9) is equivalent to the conjunction of (10) and (11).

(10) $P_A \subseteq E_X$

(11) $P_B - (E_{X \cup A} - E_X) \subseteq E_X$
Similarly, by (7), (8) is equivalent to the conjunction of (12) and (13).

(12) X satisfies-the-presuppositions-of A.

(13) $X \cup A$ satisfies-the-presuppositions-of B.
Given our basic definition of satisfaction in (5) and that A and B are simple sentences, it follows that (10) and (12) are equivalent. So it remains to be shown that (11) and (13) also amount to the same thing. This can be done with simple set-theoretic means by proving the equivalence of (11) and (14). (Note that $E_X \subseteq E_{X \cup A}$.)
(14) $P_B \subseteq E_{X \cup A}$
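The omitted verification is short; in outline (a sketch, using only the noted inclusion $E_X \subseteq E_{X \cup A}$): from (14) to (11), if $C \in P_B - (E_{X \cup A} - E_X)$, then $C \in E_{X \cup A}$ by (14), and since $C \notin E_{X \cup A} - E_X$, it follows that $C \in E_X$. From (11) to (14), suppose $C \in P_B$ but $C \notin E_{X \cup A}$; then $C \notin E_{X \cup A} - E_X$, so $C \in P_B - (E_{X \cup A} - E_X)$, whence by (11) $C \in E_X \subseteq E_{X \cup A}$, which is a contradiction.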
(14) in turn says the same thing as (13) provided that B is a simple sentence, as we have assumed here. In short, (8) and (9) are equivalent by virtue of the fact that (10) is equivalent to (12) and (11) is equivalent to (13). Consequently, the class of contexts that satisfy-the-presuppositions-of "If A then B" by principle (7) is the same class of contexts that entail all of the presuppositions assigned to this sentence by (4a).6 As we move on to more complicated sentences, the advantages of (7) over (4) become more and more clear. For example, consider sentences of the form (15).
(15) If (A and B) then (C or D).
It is a very cumbersome undertaking to compute the set of logical forms presupposed by (15) by means of rules like (4a). But it is a simple matter to tell by principles like (7) what is required of a context in which (15) is used. This is shown in (16). Note that (16) is not a new definition but a statement that directly follows from (7) and the corresponding principles for conjunctions and disjunctions.
(16) Context X satisfies-the-presuppositions-of "If (A and B) then (C or D)" just in case
6 The same holds in case we choose Peters' principle (see ftn. 2) over (4a). In demonstrating this, what we prove equivalent to (14) is not (11), of course, but $\{A \supset C \mid C \in P_B\} \subseteq E_X$. This equivalence follows straightforwardly from the fact that $A \supset C \in E_X$ just in case $C \in E_{X \cup A}$.
(i) X satisfies-the-presuppositions-of A, (ii) $X \cup A$ satisfies-the-presuppositions-of B, (iii) $X \cup (A \& B)$ satisfies-the-presuppositions-of C, and (iv) $X \cup (A \& B) \cup \sim C$ satisfies-the-presuppositions-of D.

As we study complex cases such as this one, we see that we could look at satisfaction of presuppositions in an even more general way. As illustrated in (16), by our definition a given initial context satisfies-the-presuppositions-of a complex sentence just in case the presuppositions of each of the constituent sentences are satisfied by a certain specific extension of that initial context. For example, the presuppositions of D in (15) must be satisfied by a set of logical forms that consists of the current conversational context as incremented with the logical forms of "A and B" and the negation of C. In compound sentences, the initial context is incremented in a left-to-right fashion, giving for each constituent sentence a local context that must satisfy its presuppositions.7 We could easily define a notion of local context separately and give the following general definition of satisfaction for all compound sentences.
(17) Context X satisfies-the-presuppositions-of S just in case the presuppositions of each of the constituent sentences in S are satisfied by the corresponding local context.
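To make the recursion concrete, here is a minimal sketch of (5), (7), and (17) in Python. Everything in it is an illustrative assumption rather than part of the theory: logical forms are plain strings, the entailment relation is approximated by bare set membership, and all names (BASIC, entails, satisfies, and so on) are invented for the example.

# Contexts are sets of logical forms (plain strings). BASIC lists the
# stipulated basic presuppositions of each simple sentence, as in (5).
BASIC = {
    "Dean told the truth": set(),
    "Nixon is guilty too": {"someone other than Nixon is guilty"},
}

def entails(context, form):
    # Toy stand-in for standard entailment: membership in the context.
    # A serious implementation would need deductive closure.
    return form in context

def form_of(sentence):
    # Flatten a sentence into a single logical form (illustrative).
    if sentence[0] == "simple":
        return sentence[1]
    return "({} {} {})".format(sentence[0], form_of(sentence[1]), form_of(sentence[2]))

def satisfies(context, sentence):
    """Does `context` satisfy-the-presuppositions-of `sentence`?"""
    tag = sentence[0]
    if tag == "simple":                       # definition (5)
        return all(entails(context, p) for p in BASIC.get(sentence[1], set()))
    a, b = sentence[1], sentence[2]
    if tag in ("if", "and"):                  # definition (7); "and" is parallel
        return satisfies(context, a) and satisfies(context | {form_of(a)}, b)
    if tag == "or":                           # like (7), but increment with ~A
        return satisfies(context, a) and satisfies(context | {"~" + form_of(a)}, b)
    raise ValueError("unknown sentence tag: " + tag)

# (3a): admitted only by contexts that already entail the presupposition
# that the word `too` contributes to the consequent.
s3a = ("if", ("simple", "Dean told the truth"), ("simple", "Nixon is guilty too"))
assert not satisfies(set(), s3a)
assert satisfies({"someone other than Nixon is guilty"}, s3a)

Note that no presuppositions are ever compiled for the compound: the recursion merely threads the growing local context from left to right, which is exactly the point of (17). With a genuine entailment relation in place of membership, the same code would also admit (3b) in any context, since there the antecedent itself supplies what the consequent presupposes.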
Note that in this new framework the earlier question of how it comes about that presupposition is a relative notion for compound sentences does not arise at all. Also, the distinction between cases like (3a) and (3b) is of no particular importance. What is required in both cases is that the presupposition of the consequent clause contributed by the word too be entailed by the current conversational context as incremented with the logical form of the antecedent. In the case of (3b), we recognize that this condition is met, no matter what the initial context is like, by virtue of the particular antecedent. In (3a) it appears that the antecedent does not contribute anything towards satisfying the presuppositions of the consequent, at least not in contexts that immediately come to mind. Hence we can be sure that the presuppositions of the consequent are satisfied in the incremented context just in case they are already satisfied initially. It seems to me now that this is a much better way of putting it than to talk about a presupposition being "shared" by the compound in (3a) and being "cancelled" or "filtered away" in (3b), as I did in the earlier papers. Such locutions can be thrown out with the projection method that gave rise to them.
7 Lakoff has pointed out to me that a notion of local context is also needed for transderivational constraints that make the well-formedness of derivations in which a certain transformation has applied dependent on the context. In compound sentences, it is the local context these constraints must refer to, not the overall conversational context.
So far I have only discussed complex sentences that are formed with sentential connectives. However, satisfaction of presuppositions can easily be defined for all kinds of complex sentences. Without going into any great detail, I will try to outline how this is done for sentences with sentential subjects or objects. Let us represent such sentences with the expression "v(... A ...)", where "v" stands for a complementizable verb and "A" for an embedded subject or object clause. Sentences with verbs like believe and want that require non-sentential subjects are represented with "v(a, A)", where "a" stands for the underlying subject. In this connection we have to distinguish three kinds of complementizable verbs, as shown in (18).
(18) I. Verbs of saying: say, ask, tell, announce, etc. (including external negation).
II. Verbs of propositional attitude: believe, fear, think, want, etc.
III. All other kinds of complementizable verbs: factives, semi-factives, modals, one- and two-way implicatives, aspectual verbs, internal negation.
Essentially this amounts to a distinction between verbs that are "transparent" with respect to presuppositions of their complements (type III) and verbs that are "opaque" to one degree or another (types I and II).8 These distinctions of course are not arbitrary but presumably follow from the semantics of verb complementation in some manner yet to be explained. For sentences where the main verb is of the last type, we need the condition in (19).
(19) If v is of type III, context X satisfies-the-presuppositions-of "v(... A ...)" only if X satisfies-the-presuppositions-of A.
Thus in a case such as (20), where may, force, and stop all are of type III, a context satisfies-the-presuppositions-of the whole sentence only if it satisfies those of all the nested complements.9
(20) The courts may force Nixon to stop protecting his aides.
For example, a context for (20) ought to entail that Nixon has or will have been protecting his aides.

8 One of the mistakes in Karttunen (1973a) was the claim that verbs of saying and propositional attitude verbs are all "plugs".
9 Since ordinary negation is a sentential operator of type III, it also follows from (19) that a context satisfies-the-presuppositions-of "Nixon won't stop protecting his aides" just in case it satisfies-the-presuppositions-of "Nixon will stop protecting his aides". This is an important fact, but there is no need to make it part of the definition of pragmatic presupposition, as Thomason (1973) does, presumably for historical reasons, because the semantic notion of presupposition is traditionally defined in that way.
For verbs of propositional attitude we need a condition such as (21), where the expression "$B_a(X)$" stands for the set of beliefs attributed to a in X.

(21) If v is of type II, context X satisfies-the-presuppositions-of "v(a, A)" only if $B_a(X)$ satisfies-the-presuppositions-of A.10

The condition says that sentences such as (22) require that the subject of the main sentence be understood to have a set of beliefs that satisfy-the-presuppositions-of the complement.

(22) John fears that Nixon will stop protecting his aides.

To satisfy the presuppositions of (22), a context must ascribe to John a set of beliefs that satisfy-the-presuppositions-of "Nixon will stop protecting his aides". Finally, with verbs of type I a complex sentence does not necessarily require that the presuppositions of the complement be satisfied, as we can observe by contemplating examples such as (23).

(23) Ziegler announced that Nixon will stop protecting his aides.

(23) can be spoken felicitously, perhaps even truly, no matter what the facts are understood to be or whether anyone is supposed to hold a set of beliefs that satisfy the presuppositions of the complement. As a final example of complementation, consider the sentence in (24).

(24) John thinks that, if Rosemary believes that Nixon has been protecting his aides, she is afraid that Nixon will stop protecting them.

By applying the principles in (21) and (7) recursively, we arrive at the conclusion that, if a given context, X, satisfies the presuppositions of (24), then the presuppositions of the last clause in (24), "Nixon will stop protecting his aides", are satisfied by the set (25).

(25) $B_{\text{Rosemary}}(B_{\text{John}}(X) \cup \{\text{Rosemary believes that Nixon has been protecting his aides}\})$

This set contains all of the beliefs attributed to Rosemary in a context that consists of all of the beliefs attributed to John in X and the logical form of the given sentence. By virtue of its last-mentioned ingredient, the set in (25) is guaranteed to entail that Nixon has been protecting his aides. Therefore, (24) does not require that this particular presupposition of the last clause be entailed in contexts where (24) is used, or by the set of beliefs that in those contexts are attributed to John or to Rosemary. As far as I am able to tell, this is the correct result.
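Continuing the earlier sketch after (17) — again with invented names and encodings — conditions (19) and (21) amount to two further clauses plus a toy version of $B_a(X)$; belief-forms are assumed, purely for illustration, to be strings of the shape "a believes that ...".

def beliefs(context, agent):
    # Toy B_a(X): the logical forms attributed to `agent` as beliefs in X.
    prefix = agent + " believes that "
    return {f[len(prefix):] for f in context if f.startswith(prefix)}

def satisfies_comp(context, sentence):
    # Clauses for complementizable verbs; in a full model these would be
    # merged into the single recursion of `satisfies`.
    tag = sentence[0]
    if tag == "type3":                   # (19): factives, implicatives, modals, negation
        _, verb, comp = sentence
        return satisfies_comp(context, comp)
    if tag == "type2":                   # (21): believe, fear, want, ...
        _, verb, subj, comp = sentence
        return satisfies_comp(beliefs(context, subj), comp)
    if tag == "type1":                   # say, announce, ...: no condition imposed
        return True
    return satisfies(context, sentence)  # connectives and simple sentences as before

# (22): the context must ascribe to John beliefs satisfying the complement.
BASIC["Nixon will stop protecting his aides"] = {"Nixon has been protecting his aides"}
s22 = ("type2", "fears", "John", ("simple", "Nixon will stop protecting his aides"))
assert satisfies_comp({"John believes that Nixon has been protecting his aides"}, s22)
assert not satisfies_comp(set(), s22)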
This concludes what I have to say about satisfaction of presuppositions. What we are interested in is associating sentences with proper contexts of use. We can achieve this goal directly by defining a notion of satisfaction as a relation between contexts and sentences. In this way we avoid the many complications
10 It is implicit in this treatment that every individual's beliefs are considered to be closed under entailment. I am not sure whether this is a defect.
that have to be built into a projection method that does the same by associating each sentence with a set of presuppositions. The efforts by Langendoen and Savin (1971), Morgan (1969, 1973), Keenan (1973), Lakoff and Railton (1971), Herzberger (1973), myself (1973a, 1973b), and many others to find such a method now seem misplaced to me. The best solution to the projection problem is to do away with it. The moral of this paper is: do not ask what the presuppositions of a complex sentence are, ask what it takes to satisfy them. I will conclude with a few comments about the notion of context. It is implicit in what I have said about satisfaction that a conversational context, a set of logical forms, specifies what can be taken for granted in making the next speech act. What this common set of background assumptions contains depends on what has been said previously and other aspects of the communicative situation. In a fully explicit discourse, the presuppositions of the next sentence uttered are satisfied by the current context. This guarantees that they are true in every possible world consistent with the context. Of course, it is possible that the actual world is not one of them, since people may be talking under various misapprehensions. Satisfaction of presuppositions is not a matter of what the facts really are, just what the conversational context is. Once the new sentence has been uttered, the context will be incremented to include the new shared information. Viewed in this light, a theory of presuppositions amounts to a theory of a rational order of contexts from smaller to larger sets of shared information. At each step along the way that a fully explicit discourse proceeds, the current context satisfies the presuppositions of the next sentence, which in turn increments it to a new context. There are definitions of pragmatic presupposition, such as (1), which suggest that there is something amiss in a discourse that does not proceed in this ideal, orderly fashion. Those definitions make it infelicitous to utter sentences whose
11 Many things can of course go wrong. First of all, the listener may refuse to go along with the tacit extension that the speaker appears to be suggesting. In the case of the classical example "Have you already stopped beating your wife?" he may have a good reason to balk. The listener may also be unable to comprehend what tacit extension of the current context the speaker has in mind. Some types of presupposition are especially unsuited for conveying anything indirectly. For example, "Nixon is guilty too" is not a good vehicle for suggesting that Agnew is guilty, although the presuppositions of the sentence are satisfied in all contexts where the latter is the case. Finally, the listener may extend the context in some way other than what was intended by the speaker. To what extent we actually can and do make use of such shortcuts depends on pragmatic considerations that go beyond the presuppositions themselves. Note also that there are certain expressions in current American English that are almost exclusively used to convey matters indirectly, hence it is a moot question whether there is anything indirect about them any more. One is likely never to hear "Don't you realize it's past your bedtime" in a context entailing that the addressee ought to be in bed.
presuppositions are not satisfied by the current conversational context. They outlaw any leaps and shortcuts. All things considered, this is an unreasonable view. Consider the examples in (26).
(26) (a) We regret that children cannot accompany their parents to commencement exercises.
(b) There are almost no misprints in this book.
(c) I would like to introduce you to my wife.
(d) John lives in the third brick house down the street from the post office.
(e) It has been pointed out that there are counterexamples to my theory.

The underlined items in these sentences bring in a certain presupposition. Thus (26a) presupposes that its complement is true. Yet the sentence could readily be used in a conversational context that does not satisfy this presupposition. Perhaps the whole point of uttering (26a) is to let it be known that parents should not bring their kids along. Similarly, (26d) might be used to give directions to a person who up to that point had no idea that there are at least three brick houses down the street from the post office, which is a presupposition for the sentence by virtue of the underlined definite description. The same goes for the other examples in (26). What do we say here? I am not at all sure we want to say that, in these cases, a sentence has been used infelicitously. I am sure that there is no advantage in saying that sentences like (26a) sometimes do and sometimes do not presuppose their complements. A notion of "part-time presupposition" is not going to help; on the contrary. Had we defined presupposition as a relation between a sentence and its speaker, we would be tempted to talk about some presuppositions being optional. I think the best way to look at this problem is to recognize that ordinary conversation does not always proceed in the ideal orderly fashion described earlier. People do make leaps and shortcuts by using sentences whose presuppositions are not satisfied in the conversational context. This is the rule rather than the exception, and we should not base our notion of presupposition on the false premiss that it does not or should not happen. But granting that ordinary discourse is not always fully explicit in the above sense, I think we can maintain that a sentence is always taken to be an increment to a context that satisfies its presuppositions. If the current conversational context does not suffice, the listener is entitled and expected to extend it as required. He must determine for himself what context he is supposed to be in on the basis of what was said and, if he is willing to go along with it, make the same tacit extension that his interlocutor appears to have made.11 This is one way in which we communicate indirectly, convey matters without discussing them. When we hear a sentence such as (26a), we recognize that it increments contexts which entail that children are not permitted at commencement exercises. These are the only contexts that satisfy the presuppositions of (26a). So if we
have not realized already that we are supposed to be in that kind of context, the sentence lets us know that indirectly. Perhaps the whole point of uttering (26a) was to make us conclude this for ourselves so that we would not have to be told directly.12 One must be careful not to confuse presuppositions with features of contexts that satisfy those presuppositions. Consider a sentence such as (27), which is a modified version of an example discussed by Lakoff (1971).

(27) John called Mary a Republican and then she insulted him back.

Because of the word back, the second conjunct of (27) presupposes that John has insulted Mary. The principle in (17) tells us that this presupposition ought to be satisfied by the corresponding local context. In this case, the local context consists of the initial context for (27) incremented with the logical form of "John called Mary a Republican". Let us suppose that this context in fact satisfies the presupposition that John has insulted Mary, and that the initial context by itself would not satisfy it. This state of affairs could come about in several ways. The most obvious one is that the initial context entails that calling someone a Republican constitutes an insult. Note that there is nothing in (27) which presupposes that "Republican" is a dirty word. It is not a necessary feature of every context that satisfies the presuppositions of (27). But there are some contexts in which the presuppositions of (27) are satisfied only because of it. Sometimes we can exploit this fact by uttering (27) in a context which does not satisfy its presuppositions. In that case we expect the listener to notice what extension we have in mind. This is similar to what can be done with the examples in (26), except that here the piece of information that is passed along under the counter is neither presupposed nor entailed by any part of (27). As a final example, consider a case of the kind first discussed in Liberman (1973).

(28) Bill has met either the King or the President of Slobovia.

The two disjuncts that constitute (28) have conflicting presuppositions: Slobovia is a monarchy/Slobovia is a republic. Yet (28) as a whole is not contradictory. It seems to assert that Bill has met the Slobovian Head of State and indicates that the speaker does not know much about Slobovia. What sort of context does it take to satisfy-the-presuppositions-of (28)? Assuming that the condition for "or" is symmetric (see ftn. 5 above), we find that, according to our principles, (28) can be admissible at least in contexts which entail the logical forms of the three sentences in (29).

(29) (a) Slobovia is either a monarchy or a republic.
(b) If Slobovia is a monarchy, Bill has met the King of Slobovia.
(c) If Slobovia is a republic, Bill has met the President of Slobovia.

Such a context can satisfy the presuppositions of (28) for the following reason. By
12 I owe this example to an official MIT bulletin about the spring 1973 commencement.
incrementing it with the negation of the first disjunct, "Bill has not met the King of Slobovia", we get a context which entails that Slobovia is a republic, which is what the second disjunct presupposes. By incrementing the original context with the negation of the second disjunct, we get a context which entails that Slobovia is a monarchy, which is a presupposition for the first disjunct. Given that both constituent sentences in (28) are admissible in their respective local contexts, (28) as a whole is admissible. If our way of looking at presuppositions is correct, it should be in principle possible to utter (28) to someone who has never even heard of Slobovia and leave it up to him to conclude that the speaker assumes (29). It seems to me that this is a desirable result.

In this paper I have argued that a theory of presuppositions is best looked upon as a theory of constraints on successive contexts in a fully explicit discourse in which the current conversational context satisfies-the-presuppositions-of, or, let us say from now on, admits the next sentence that increments it. I have outlined a recursive definition of admittance, based on the assumption that we can give a finite list of presuppositions for each simple sentence. In this approach we do not need an explicit projection method for assigning presuppositions to complex sentences. A theory of presuppositions of the kind advocated here attempts to achieve both less and more than has been expected of such a theory: less in the sense that it is not a theory of how ordinary discourse does or ought to proceed; more in the sense that it tries to explain some of the principles that we make use of in communicating indirectly and in inferring what someone is committed to, although he did not exactly say it.

References

DAVIDSON, D. and G. HARMAN (Eds.) (1972), Semantics of Natural Language, Dordrecht: D. Reidel.
FILLMORE, C. J. and D. T. LANGENDOEN (Eds.) (1971), Studies in Linguistic Semantics, New York, N.Y.: Holt, Rinehart, and Winston.
GREEN, G. (1968), On too and either, and not just too and either, either, in: Darden, B. et al. (Eds.), Papers from the Fourth Regional Meeting of the Chicago Linguistic Society, University of Chicago, Chicago, Illinois.
HERZBERGER, H. G. (1973), Dimensions of Truth, Journal of Philosophical Logic 2, 535—556.
KARTTUNEN, L. (1973a), Presuppositions of Compound Sentences, Linguistic Inquiry IV:2, 169—193.
KARTTUNEN, L. (1973b), Remarks on Presuppositions, in: Murphy, J., A. Rogers, and R. Wall (Eds.).
KEENAN, E. (1973), Presupposition in Natural Logic, The Monist 57:3, 344—370.
LAKOFF, G. (1971), The Role of Deduction in Grammar, in: Fillmore, C. J. and D. T. Langendoen (Eds.).
LAKOFF, G. (1972), Linguistics and Natural Logic, in: Davidson, D. and G. Harman (Eds.).
LAKOFF, G. (1973), Pragmatics and Natural Logic, in: Murphy, J., A. Rogers, and R. Wall (Eds.).
LAKOFF, G. and P. RAILTON (1971), Some Types of Presupposition and Entailment in Natural Language, unpublished manuscript.
LANGENDOEN, D. T. and H. B. SAVIN (1971), The Projection Problem for Presuppositions, in: Fillmore, C. J. and D. T. Langendoen (Eds.).
LEWIS, D. (1973), Counterfactuals, Cambridge, Mass.: Harvard University Press.
LIBERMAN, M. (1973), Alternatives, in: Papers from the Ninth Regional Meeting of the Chicago Linguistic Society, University of Chicago, Chicago, Illinois.
MORGAN, J. L. (1969), On the Treatment of Presupposition in Transformational Grammar, in: Binnick, R. et al. (Eds.), Papers from the Fifth Regional Meeting of the Chicago Linguistic Society, University of Chicago, Chicago, Illinois.
MORGAN, J. L. (1973), Presupposition and the Representation of Meaning, unpublished Doctoral dissertation, University of Chicago, Chicago, Illinois.
MURPHY, J., A. ROGERS, and R. WALL (Eds.) (forthcoming), Proceedings of the Texas Conference on Performatives, Presuppositions, and Conversational Implicatures, Center for Applied Linguistics, Washington, D.C.
STALNAKER, R. C. (1972), Pragmatics, in: Davidson, D. and G. Harman (Eds.).
STALNAKER, R. C. (1973), Presuppositions, Journal of Philosophical Logic 2, 447—457.
THOMASON, R. H. (1973), Semantics, Pragmatics, Conversation, and Presupposition, in: Murphy, J., A. Rogers, and R. Wall (Eds.).
DISCUSSIONS AND EXPOSITIONS

MARCELO DASCAL AND AVISHAI MARGALIT

A NEW 'REVOLUTION' IN LINGUISTICS? 'TEXT-GRAMMARS' VS. 'SENTENCE-GRAMMARS'
Some of the arguments presented in favor of a replacement of the existing 'sentence-grammars' by a 'text-grammar' (a grammar whose rules would generate a set of well-formed 'texts', and not merely a set of well-formed 'sentences') are discussed and evaluated. Three main arguments in favor of T-grammar are put forward in one of the most comprehensive expositions of the subject (van Dijk, 1972): a methodological, a grammatical and a psycholinguistic argument. The paper examines in detail only the first two, showing that none of them provides satisfactory support for the replacement of S-grammar by T-grammar.
1. Introduction
In the past few years, the amount of publications devoted to 'text-grammar' has been steadily increasing, particularly in continental Europe.1 The advocates of the new product tend to interpret such an increase as a sign of a new 'revolutionary' development taking place now in the field of linguistics: the replacement of the old and limited 'sentence-grammars' by the new, more ambitious and powerful, 'text-grammars'. In spite of the great amount of work recently devoted to it, research on text-grammar is still in an embryonic state; the term 'text-grammar', therefore, should be taken as a label for a 'research program', rather than for a full-fledged linguistic 'theory'. Nevertheless, it is time to attempt at least a preliminary evaluation of the claims and achievements of text-grammar. Firstly, because enough has been said about it in order to see what it roughly purports to be, and where it might lead. Secondly, because although embryonic in nature, research programs are able to mobilize considerable resources—both human and economic—which might eventually be employed elsewhere, in the light of the results of a careful evaluation.
1 For example: van Dijk, 1972 (see bibliography); van Dijk et al., 1971; Ihwe et al., 1971; Petöfi, 1971; Stempel, 1971; and many more announced papers, books, symposia, etc.
And thirdly, because criticism may have a positive effect on the research program itself, leading to sharper formulations, shifts in its aims, etc. Before embarking on the analysis of the text-grammar research program, it is necessary to point out some limitations of our study. Instead of surveying the whole literature on text-grammar, we will concentrate our attention on one book (van Dijk, 1972), which contains a systematic, detailed and comprehensive presentation of the basic assumptions, aims and current achievements of the text-grammar research program. One of these aims is the establishment of new and more precise foundations for the theory of literature, as a result of the advocated reform in linguistic theory. However, whatever its effects on the theory of literature, the reform in linguistics is claimed to be independently motivated by shortcomings within linguistic theory itself.2 We shall deal here only with this allegedly 'internal' motivation for the proposed reform in linguistics, disregarding 'external' considerations such as the possible applications of 'reformed' linguistics to the theory of literature. To be sure, the possibility of such applications might turn out to be a crucial factor in choosing between two theories equally adequate on purely linguistic grounds. But arguments concerning their respective linguistic adequacy (both descriptive and explanatory) naturally have more weight than arguments concerning their applications to other domains, and should, therefore, be considered first.3

Van Dijk's main thesis is that the existing 'sentence-grammars' ("structural and generative-transformational grammars ... limited to the formal enumeration and structural description of the sentences of a language", henceforth 'S-grammars') are inadequate for the account of the phenomena of natural language, and should be replaced by a grammar which would account also for the formal structure of 'texts' or 'discourse', i.e., by a 'text-grammar' (henceforth, 'T-grammar') (p. 1).4 For him, a T-grammar is significantly different from an S-grammar, and cannot be construed as a trivial extension of the latter. He gives three types of arguments—which he considers "decisive"—in support of his thesis: a methodological one, a grammatical one, and a psycholinguistic one. The first is the claim that the proper object of linguistics is discourse (or 'texts') and not sentences. The second consists in pointing out that S-grammar is unable to explain
2 Thus Petöfi claims that a satisfactory description of the linguistic facts pertaining to the sentence level can only be achieved within the framework of a text-grammar: („Eine solche Beschreibung scheint uns—selbst in bezug auf die satzgrammatische Ebene—nur im Rahmen einer in bestimmter Weise aufgebauten Textgrammatik durchführbar zu sein") (Petöfi, 1971, p. 16).
3 This argument is somewhat question-begging, since one of the main contentions of the text-grammarians is that the 'domain' of linguistic theory should be enlarged far beyond its traditional scope. We will return to this question in what follows. Let us point out, however, that the text-grammarians do not deny that the facts within the scope of traditional linguistic theory should be adequately accounted for by any linguistic theory.
4 Henceforth unspecified page references will be to van Dijk, 1972.
certain grammatical phenomena, like pronominalization, definitivization, etc., which can only be accounted for, according to Van Dijk, by a grammar which postulates texts as its basic units. The third is an appeal to the intuition that a 'well-formed' piece of discourse must be 'coherent' in some sense, and that we are able to express its coherent structure in summaries, outlines, etc. This intuition is interpreted as suggesting that a 'plan' or 'coherent structure' underlies every piece of discourse in much the same way as sentences have underlying 'deep structures'. We will examine in this paper mainly the first and second types of arguments (Sections 2 and 3, respectively), since the third hinges on considerations of a more 'external' nature.5
2. The 'natural domain' of linguistic theory
According to Van Dijk, "discourses are the only justifiable 'natural domain' of an empirically adequate theory of language" (p. 7). From this he goes on to claim that what a grammar of a language should generate is texts—abstract entities which underlie discourses—and not merely sentences. There is nothing new about mentioning 'text' and 'discourse' in connection with linguistic inquiries. However, one should avoid lumping together two distinct types of contexts in which these terms have been used: the context of discovery and the context of justification. Those who use 'discourse' in the context of discovery mean that given discourses (corpora) are to be considered the only legitimate data from which significant linguistic generalizations can be extracted, through the use of controlled discovery procedures (cf. Harris, 1951). Those who, like van Dijk, use 'discourse' or 'text' in the context of justification mean that the adequacy of a grammar of a language can only be evaluated with respect to its ability to generate the correct 'texts' of this language. This approach is characteristic of the recent T-grammar research program. The two approaches are clearly different, but they are not incompatible. Neither does the use of texts as a part of discovery procedures imply that texts are to be considered the proper objects of linguistic explanation, nor does this last claim, characteristic of the more recent use of the notion of text in linguistics, imply anything whatsoever concerning the existence and nature of discovery procedures. For those who use 'text' or 'discourse' in the context of discovery, the main linguistic object to be explained and described is the set of sentences of a language and not the set of its texts, whereas for van Dijk, who certainly rejects—following Chomsky—the idea that there are discovery procedures in linguistics, the set of texts is the basic explanandum of linguistics.
5 We discuss the third type of argument more extensively in Dascal and Margalit, 1973. The present paper is partly based on that earlier criticism of the idea of a text-grammar.
What is apparently common to both approaches is the assumption that the 'natural domain' of linguistic theory is a set of (possible or given) discourses. In one sense, they really share this assumption, namely, insofar as 'discourses' constitute for both approaches the 'observational basis' or set of empirical data that linguistic theory must account for. Nevertheless, from the fact that certain data constitute the 'natural domain' of observation for a certain theory, nothing follows as to the 'naturalness' of the 'proper' theoretical entities that should be postulated in the theory. A good example is the case of syllables and phonemes. We are definitely more familiar with syllables than with phonemes, in the sense that syllables are closer to easily observable units than phonemes. And yet, in phonology, phonemes are the basic theoretical units employed, and not syllables, since in using phonemes we are able to capture more generalizations (Bever, 1971, pp. 176—9). Van Dijk, on the contrary, seems to believe it necessary or fruitful to postulate abstract linguistic entities, 'texts', corresponding as closely as possible to the observational entities, 'discourses'.

Van Dijk distinguishes between the 'naturalness' of a theory and that of its domain. According to him, a theory $T_1$ is more natural than another, $T_2$, with respect to a specified natural domain of objects D, if all the statements of $T_2$ are derivable from $T_1$ and the basic theoretical entities postulated in $T_1$ are somehow 'closer' to the objects of D than the basic theoretical entities postulated in $T_2$. Since van Dijk bases his definition of the naturalness of a theory on certain methodological concepts derived from Nagel's framework, it is useful to recall that a theory, in the broad sense, has two components: an uninterpreted set of statements (sometimes called the 'theory' in the narrow sense) and rules of correspondence (Nagel, 1967, pp. 97—105). According to van Dijk's definition, all you have to do in order to compare the 'naturalness' of theories is to look at their correspondence rules. The more straightforward the rules of correspondence, the more natural the theory. Therefore, van Dijk's claim that T-grammars are more natural than S-grammars amounts to the claim that the rules of correspondence from texts (as abstract linguistic entities) to discourses (as natural observational units) are more straightforward than the rules of correspondence of a theory in which the highest-level abstract entities are sentences. Nothing is said, however, in van Dijk's account, about the nature of the other part of the theories, namely, their theoretical statements. It may turn out, therefore, that a theory which is more natural in van Dijk's sense, because it has simpler correspondence rules, will have to pay for this feature a high price in terms of the complexity of its purely theoretical part. This is in fact the case in van Dijk's model of T-grammar, where 'macro-structures' of texts replace simple 'deep-structures' of sentences and very complex 'transformational' rules are needed in order to map such macro-structures onto the deep structures of the sentences composing the text. An idea of the complexity of the proposed T-grammar can be formed by inspecting figure 1. The specific contribution of the T-grammar (marked T) consists of the components $R_1$, $R_2$, and $R_3$, with their respective outputs. The other
components (marked S) are essentially those of an S-grammar of the type proposed by generative semantics. The output of $R_3$ is the 'surface structure' of a discourse or text, i.e., the structure corresponding closely to that of the sequence of sentences in the text. It is obtained by the operation of two sets of transformation rules on the 'macro-structures' generated by $R_1$. These are conceived not as syntactic, but as semantic 'deep' structures, in the manner of generative semantics. That is, they should look rather like huge logical formulae, expressed by means of a hybrid set of logical devices (drawn from predicate calculus of different orders, modal logic, epistemic logic, etc.) as well as by extra categories such as 'actant', 'text-qualifier', etc. (pp. 140—1). Not much is said about the formation rules in $R_1$, and still less is said about the precise nature of the transformations in $R_2$ and $R_3$.6 If compared with S-grammar, T-grammar presents not only an enormous amount of extra complexity in its theoretical framework, but also a considerable loss in precision which casts doubts on the very possibility of achieving a formalization of the theory. If so, in shifting from S-grammar to T-grammar, linguistics would be abandoning one of the major achievements of the Chomskyan revolution: the establishment of the requirement of explicitness (i.e., formalization) as a conditio sine qua non for any serious linguistic theory. In any case, it is clear that in order to be 'natural' in van Dijk's sense, i.e. in order to have relatively simple correspondence rules, a theory must make use of a highly complex and logically powerful set of theoretical constructs. Given this trade-off relationship, it is by no means obvious—as most text-grammarians assume—that a 'natural' theory should be preferred over a 'non-natural' one.
3. The grammatical argument
According to van Dijk, "one of the basic properties of natural language is not only the possibility of constructing complex sentences by recursive embedding or coordinating other sentences (sentoids), but also the possibility of producing sequences of syntactically independent sentences possessing certain relations between each other" (p. 39).

3.1. Our position in the face of this is that no evidence has been produced by van Dijk supporting the claim that the description of the relations between independent sentences is not equivalent to the description of the conditions for constructing complex sentences by recursively embedding or coordinating other sentences. If our view is correct, then such a task belongs properly to S-grammar and does not
6 It is even doubtful whether such rules might be properly called 'transformations', in the technical sense. But in order to pass judgement on this point, one must wait for at least some examples of such 'transformations'.
require postulation of 'texts' as additional theoretical entities. Certainly, in order to cope with the problem indicated by van Dijk in the preceding quotation, the available system of rules for embedding and coordination (in the framework of the existing S-grammars) is not yet sufficient. Thus, the discussion of such problems by text-grammarians may be beneficial, in the sense of pointing out difficulties that must be solved in the course of extending and completing S-grammars. Nevertheless, the main point is one of principle, as recognized by van Dijk himself: "It is clear that we must not reject S-grammars as inadequate for the description of intersentential relation(s) on the ground of their hitherto rather restricted attention to such problems. We should first be sure that an S-grammar cannot IN PRINCIPLE account for such relations. It might be the case that the conditions determining the wellformedness of complex and compound sentences are similar, if not identical, to those which should be formulated for sequences" (pp. 15—6). As we will try to show in the following sections, none of van Dijk's grammatical arguments provides grounds for his being sure (as he in fact is) that S-grammar cannot account for intersentential relations. On the contrary, we suggest that, once a (complete) S-grammar becomes able to explain satisfactorily coordination and embedding, all the problems related to "producing sequences of syntactically independent sentences possessing certain relations between each other" will be ipso facto solved, without need of any special additions falling outside the scope of S-grammar.7 To illustrate these claims, consider one of van Dijk's examples:

(1) We will have guests to lunch. Calderon was a great Spanish writer.

According to him, this 'text' is "definitely ungrammatical" and "any native speaker of English will consider this sequence, when presented in one utterance as nonsense" (p. 40). Admittedly, this is indeed, in normal conditions, a bizarre sequence
7 Fodor and Katz (1964, pp. 490—1) defended, almost ten years ago, a very similar thesis. For them, "except for a few types of cases, discourse can be treated as a single sentence in isolation by regarding sentence boundaries as sentential connectives". They acknowledge, however, that in many cases (e.g., pieces of discourse containing questions and answers and, we would add, dialogs of any type) there is no straightforward technique for reducing discourse to complex sentences. In our opinion, these are the cases par excellence of prima facie evidence for including the notion of 'text' in the theoretical apparatus of a grammar. Therefore, such cases should be thoroughly explored by van Dijk as a source of support for T-grammar. But he, on the contrary, simply relegates their treatment to the (merely sketched) 'pragmatic' component of his T-grammar (cf. p. 14 and chap. 9). However, it seems by now clear that, in spite of Katz's continuous claims to the contrary, there is no escape from supplementing S-grammars with a 'pragmatic' component as well (cf. Kasher, 1970). Our main line of argumentation can, therefore, be adopted also in the present case. That is, one can argue that the conditions which define a (pragmatically) well-formed sequence of sentences must, ultimately, be accounted for by the 'pragmatic' component of any adequate S-grammar.
of sentences. As for its 'ungrammaticality', let us assume that van Dijk's claim is correct. We suggest that whatever has to be said in a grammar about the grammaticality or ungrammaticality of such a sequence has to be said also of either (2) or (3).

(2) We will have guests to lunch and Calderon was a great Spanish writer.

(3) We will have guests to lunch but Calderon was a great Spanish writer.
That is, if a grammar is inadequate to solve the problems raised by (1), it is also inadequate to solve problems which an S-grammar should solve. If it doesn't, it is inadequate as an S-grammar. If, on the other hand, the S-grammar is adequate, it is also able to solve the problems raised by (1), and need not be supplemented by a T-grammar. Van Dijk himself eventually acknowledges this fact: "In this case, the conditions for the combination of two sentences in a sequence seem to be parallel with those for combining 'sentences' (sentoids) in a complex sentence" (p. 40). To be sure, his observations and examples support the conclusion that "at least some conditions for the combination of sentences in a sequence are similar to those for combining sentences (sentoids) in a complex sentence" (p. 41). But what he must produce, in order to substantiate the need of a T-grammar, are examples in which the conditions are not parallel. He should prove that at least some conditions for the combination of sentences in a sequence are radically different from those for combining sentences in a complex sentence. He believes, indeed, that "there are a great number of theoretical and empirical arguments against an identical treatment of (compound) sentences and sequences" (p. 14). What are, then, these arguments?

3.2. The problem of characterizing the conditions of use of definite and indefinite articles provides the basis for the most elaborated grammatical argument put forward by van Dijk. We shall pay it, therefore, due respect, and try to show that the attempt to treat definitivization (and, later on, also pronominalization) by means of various logical operators proves the contrary of what van Dijk expects to prove. For this attempt shows that to treat texts adequately means to treat them ultimately as complex sentences. One of the conditions for definitivization, as is well known, is the uniqueness of the referent of the 'definite description'. Consider the following text:
(4) (i) Yesterday John bought a book. (ii) The book is about theoretical syntax.
As a possible representation of (4), van Dijk suggests the following formulae:
(5) (i) $(\exists x)(b(x) \wedge c(z_1, x))$ (ii) $s((\iota x)(b(x) \wedge c(z_1, x)))$
Where '$\iota$' is the IOTA-operator, '$z_1$' is a constant for 'John', and the interpretation
of predicate letters is obvious. Nevertheless, van Dijk considers (5) (ii) not as a formalization of (4) (ii), but rather of the complex sentence (6)
(6) The book John bought yesterday is about theoretical syntax.
He then argues that (5) (ii) is redundant with respect to (4) (ii) and proposes, as a better approximation for text (4), the formula (7)
(7) $(\exists x)(((b(x) \wedge c(z_1, x))) \wedge s((\iota y)((y = x) \wedge b(y))))$
But this version is also considered defective by van Dijk. Partly, because it does not take into account the fact that the order of the conjuncts may be significant, a fact which is not expressed by the use of the logical sign $\wedge$. But the main reason is that the existential quantifier used in it "merely specifies that the property holds for at least one individual, not for one individual" (p. 49). This is the reason for introducing a new operator, ETA ($\eta$), which identifies a particular but unspecified individual. After this remark, van Dijk proposes another formula:
(8) $s((\eta x)(b(x) \wedge c(z_1, x)))$.
Unfortunately, he does not say what sentence (or text) this formula is supposed to formalize, but it seems that it can only be taken to formalize the entire text, like (5) (ii). We shall return to this problem soon. Coming back now to the main point: we agree with van Dijk's claim that (5) (ii) is the formalization of (6). But, unlike van Dijk, we consider it also a suitable formalization of the entire text (4). In fact, if the IOTA-operator is eliminated according to standard procedures, it is easily seen that (5) (i) becomes a part of the disabbreviated version of (5) (ii). Therefore, (5) (i) is redundant relative to (5) (ii), which, alone, suffices as a formalization of (4). If we are correct, then it is clear that the same logical form (or semantic representation) is assigned to a 'text', i.e. (4), and a complex sentence, i.e. (6). Therefore, examples of this type do not support the claim that a distinction between texts and complex sentences must be drawn. Let us disregard, for the sake of the argument, what has just been proved, and let us examine (7) on its own merits, in the light of van Dijk's 'linguistic' criteria. Apparently, what he has in mind when affirming the 'linguistic' superiority of (7) is the fact that this formula has, within the scope of the existential quantifier, the form of a conjunction of two formulae, each of which somehow represents one of the two sentences of text (4). It seems that van Dijk's requirement is based on a sort of Wittgensteinian picture theory of language. It could be formulated as follows: a semantic representation is better if it corresponds more closely ideographically to the surface structure represented. Such an 'ideographic' interpretation of van Dijk's requirement is suggested by the explanation he gives of the advantage of adopting (7) as a formalization of (4), namely, that (7) "shows the linguistically relevant fact that the individuals in the subsequent sentoids/sentences are identical and described both as books" (p. 49).
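For reference, the standard (Russellian) elimination appealed to above can be written out; the expansion below is our gloss, not a formula of van Dijk's, using the same predicate letters:

$s((\iota x)(b(x) \wedge c(z_1, x))) \equiv (\exists x)(b(x) \wedge c(z_1, x) \wedge (\forall y)((b(y) \wedge c(z_1, y)) \supset y = x) \wedge s(x))$

The first two conjuncts under the quantifier immediately yield (5) (i), $(\exists x)(b(x) \wedge c(z_1, x))$, which is why (5) (i) is redundant relative to (5) (ii).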
This notion of 'linguistic perspicuity', which seems to be connected with the previously discussed concept of 'naturalness' of a theory, may have its merits and be worth considering. In a sense, it is reasonable to claim that '~ ~p' is a better representation of an English sentence containing a double negation than just 'p', although both are logically equivalent. But van Dijk's formalization ((7)), although probably better on these marginal grounds, fails to support his main claim, viz., that texts and complex sentences are to be differently treated in the grammar. For, obviously, (7) itself is a complex sentence. Moreover, from such formulae as (7) separate representations for (4) (i) and (4) (ii) can be easily derived by standard inference rules, whereas the opposite is not true, i. e., from a conjunction of the form (9)
(∃x)(b(x) ∧ c(z₁,x)) ∧ (∃x)s((ιy)((y = x) ∧ b(y)))
it is impossible to derive (7). As for the 'linguistic perspicuity', it may be said that (7) is indeed in some respects more perspicuous but, in other, perhaps more important, respects, less perspicuous than (5) (ii), when both are viewed as alternative formalizations of (4). Expression (7) is more perspicuous than (5) (ii) in the sense we just pointed out: it contains, as it were, two parts which closely correspond to the parts of (4).8 But it is less perspicuous than (5) (ii) insofar as (a) it is essentially the representation of a complex sentence and not of two independent sentences and (b) it does not have a representation for the definite description which appears in (4) (ii), whereas (5) (ii) contains IOTA which represents this definite description quite perspicuously. That is to say, ideographic perspicuity in one respect trades off with perspicuity in another respect, so that van Dijk cannot accumulate much capital out of his ideographic requirements.
Let us come back now to formula (8) and to its alleged superiority over the other formalizations discussed. As we have seen, the fundamental reason adduced for the rejection of (7) in favor of (8) is that the existential quantifier of (7) does not represent the fact that only one book was bought by John, but allows also for an interpretation according to which some books were bought by John. That is, (7) does not offer an adequate representation of the first sentence of (4), namely, (4) (i). In other words, what is claimed is that
(10)
(∃x)(b(x) ∧ c(z₁,x))
is not a correct rendering of the logical force of (4) (i) taken in isolation. The question, then, is what would count as a correct representation of the logical
8
Incidentally, this effect is obtained through the use of an unnecessary pair of parentheses around b(x) ∧ c(z₁, x).
form of (4) (i). The ETA-operator is certainly useful for this purpose, and a natural answer would be: (11)
c(z₁, (ηx)(b(x))),
but certainly not (8), since it contains the predicate s ("...is about syntax"), which does not appear in (4) (i). Formula (8), therefore, although preventing the undesirable interpretation, is not a 'perspicuous' representation of (4) (i). What sentence, then, is formalized by (8)? It cannot be (4) (ii), since this sentence contains a definite article and not an indefinite one. Its formalization should, therefore, contain the IOTA- but not the ETA-operator. The only possibility left, then, is to consider (8) as a formalization of the entire text (4). But in that case, since the text as a whole contains both an indefinite and a definite article, why select for formalization the former and not the latter? If van Dijk can accept that one of the articles has to be 'sacrificed', so to speak, in the formalization of the text, then (5) (ii) would be as good a candidate as (8). We have already pointed out that van Dijk's idea is that there must be a sort of 'ideographic' perspicuity in the formalization. In that case, it would be reasonable to expect that both articles, which constitute important formal elements in the two sentences, should be somehow represented in the formalization. In order to obtain such a result, the following conjunction could be envisaged: (12)
c(z₁, (ηx)(b(x))) ∧ s((ιx)(b(x) ∧ c(z₁,x)))
But it is clear that such a formula does not capture what van Dijk considers the main linguistic insight to be captured by the formalization of text (4), namely, the fact that the individual referred to in the second sentence is the same as the one referred to in the first. This is so because the two occurrences of the variable x are not within the scope of the same quantifier. Moreover, the addition of a clause specifying this identity is not possible without using a quantifier whose scope comprises the whole conjunction. And the only quantifier able to do that job is the existential quantifier, so that, at best, we would have to return to a formula like (7). This, it has been claimed, is inadequate because it does not reproduce the specific characteristics of (4) (i). Alternative solutions would be (5) (ii), which is inadequate for the same reason, or (8), which does not reproduce the specificity of (4) (ii). Furthermore, this inadequacy stems from the fact that all three alternative formalizations of (4) are, in an essential way, complex sentences, which cannot be split into separate elements, each standing for one of the sentences of text (4).
At this point, we can envisage the situation faced by van Dijk as a straightforward dilemma. Either something like (12), namely, two independent sentences combined by means of conjunction, or a complex single formula like (7), (5) (ii) or (8), is the adequate formalization of text (4). If the former, then the two-sentence text (4) has as its semantic representation a conjunction of two independent semantic representations, each corresponding to one of the sentences of the text. In that case,
it becomes obvious that the full description of a 'text' does not require anything that is not provided by an S-grammar. If the second alternative is chosen, then the semantic structures represent both the 'text' and a sentence which is in a paraphrase relation to it. And, again, the mechanisms of an S-grammar would be entirely sufficient to account for the structure of the 'text'.
Let us compare now van Dijk's type of argumentation with the 'traditional' way transformationalists present deep structures. Usually, a text like (4) would not be taken as an example whose deep structure is to be exhibited, for only sentences in isolation are considered. The novelty in van Dijk's approach is that he brings to the fore, as 'surface' starting points of linguistic analysis, sequences of sentences or 'texts'. But it would be wrong to think that 'texts' (in van Dijk's sense) do not play any role whatsoever in transformational analysis. As a matter of fact, in most grammatical arguments in which a certain 'deep structure' is proposed for a given sentence, such a deep structure is not presented explicitly employing the full formal apparatus of the theory. Instead, a set of kernel sentences which 'roughly' stands for the deep structure is supplied. In many respects such sets of sentences are like van Dijk's 'texts': the individual sentences are simple; they are separated by full stops. The only difference seems to be that their order is not considered essential to the description, whereas in van Dijk's 'texts' order is a factor of major importance. This way of arguing derives either from a desire to facilitate communication with readers who do not master the formal apparatus, or—and perhaps more frequently so—because the writers have no precise idea of what would be the full formal representation. Whatever the reason, it is clear that 'texts' have somehow been taken into account—at least as a heuristic device—in the literature.
When facing examples which seem to endanger his own preferred explanations, van Dijk reverts to the sort of argumentation we called above 'traditional'. Thus, in (13) The girl I told you about yesterday will visit us next week. a definite article is used before the corresponding referent has been 'introduced'. This example of 'retrospecification' of the referent apparently violates van Dijk's main rule for definitivization, namely, that "only previous sentoids... may identify discourse referents" (p. 57). His solution to the problem consists in postulating the 'text' (14) (i) I told you about a girl yesterday. (ii) The girl will visit us next week. as the structure underlying (13). But the informal presentation of (14) is nothing but the 'traditional' way of arguing on behalf of a particular deep structure for the complex sentence (13). From this point of view, there is absolutely nothing new in van Dijk's treatment of such a sentence.
Let us return once more to definitivization. The chief result of van Dijk's analysis of definitivization is the rule that (15) all definite articles in a text are derived from preceding indefinite articles. This rule, if true, provides an argument for introducing the notion of a text into
the grammar, for only by taking into account 'preceding' sentences/sentoids would it be possible to explain the use of the definite article in a given sentence. Van Dijk admits that an explanation for an occurrence of a definite article need not be found in a preceding sentence (or sentoid), but may also be provided by a 'preceding' situation or context. In that case, however, it cannot be simply said that the definite article is preceded by an indefinite article, for an article is a linguistic item, and not a feature of a situation. The rule would have to be reformulated in order to take care of this. It would have to say that an individual or class of individuals, or whatever may be referred to by the definite article, was previously 'introduced' in the discourse through the situation. This solution seems to be effective for the referential uses of the definite article, but it is useless for the non-referential uses.
We turn now to the application of rule (15) to these different uses of the articles. Indefinite NPs may be referential, as in (16), or non-referential as in (17): (16) James bought a car yesterday (17) James cannot buy a car. The term 'derive' in rule (15) is to be interpreted as meaning that, in a text, the characteristic features of the preceding indefinite article are transferred to the corresponding definite article which follows. The rule is a sort of law of conservation of features. It can be split into: (15a) If the preceding 'a' is non-referential, so is the following 'the'. (15b) If the preceding 'a' is referential, so is the following 'the'.
There are, however, counter-examples both to the rule (15) and to its specifications (15a) and (15b). Against the general rule that every 'the' must somehow be preceded by an 'a', we can mention the sentence (18) Sally would surely fall in love with the man who is able to teach her Swahili. Here, the 'the' is not, and need not be, preceded by an 'a'. Van Dijk would claim that this is so only because 'the man' is retro-specified by the restrictive relative clause 'who is able to teach her Swahili'. And, according to him, postnominal restrictive relative clauses are always derived from 'preceding' sentoids in the deep structure. Thus, for him, there would be in the deep structure of (18) a sentoid in which the noun 'man' would have the feature [-DEF]9: (19) (i) A man is able to teach Sally Swahili. (ii) Sally would surely fall in love with him. But (19) says much more than (18), as can be shown by means of standard logic notation. Thus, (18) can be formalized as (20) (x)((Mx ∧ Txz₂) → Lz₂x) (Mx = x is a man; Txy = x is able to teach Swahili to y; z₂ = Sally; Lxy = x will fall in love with y).
9
Since van Dijk's treatment of pronominalization is essentially parallel to his treatment of definitivization, the fact that we have in (19) 'him' and not 'the man' is immaterial for the discussion.
Now, (19) (i) seems to mean: (21)
(∃x)(Mx ∧ Txz₂)
But, whereas (21) affirms the existence of a man who is able to teach Sally Swahili, no such assertion is to be found in (20) and in the original (18). The reason for that is simply that the 'the' in (18) is a generic 'the', which does not imply existence, as, for instance, in (22)
The unicorn is a mythical animal.
As for (19) (ii), it could be formalized as: (23)
Lz₂((ιx)(Mx ∧ Txz₂)).
But since, according to van Dijk, every occurrence of an IOTA-operator must be 'preceded', at least in deep structure, by an occurrence of an ETA-operator (or, at least, an existential quantifier), the fact that (23) can be accepted as a rendering of (19) (ii) is of no consequence, if no suitable formulation for (19) (i) can be found. Moreover, even (23) is not exempt from certain difficulties. As we have shown above (see our discussion of (5) (ii)), sentences like (23) can be taken as formalizations both of complex sentences and of texts, without the need of a separate representation of the 'other part of the text'. In that sense, (23) should be taken as a formalization of the sentence (18) and not of one of the parts of (19). But then none of the definitions of the IOTA-operator would be adequate: neither Russell's definition, used by van Dijk, from which a separate assertion of the existence of the individual described by the definite description does not follow; nor the more usual definition, from which such an assertion does follow (cf. Reichenbach, 1947, p. 261). The reason is that in both definitions there is a commitment to the uniqueness of the individual in question, whereas in (18), the 'the' being generic, there is no such commitment.10
10
The distinction between the two definitions of IOTA becomes crucial once we replace 'fall in love' by 'marry', as in (10-1) Sally would surely marry the man who is able to teach her Swahili. This is so because 'marry' (in our culture) implies uniqueness. Russell's definition of IOTA could be used to capture successfully both the fact that Sally would marry exactly one man who has the ability to teach her Swahili and the fact that it is not certain that such a man exists. On the other hand, from (23), with Russell's interpretation, (21) does not follow. That means that (23) (interpreting 'L' as 'marry') would be a happy formalization of (10-1) but not of the text which corresponds to it, namely: (10-2) (i) A man is able to teach Sally Swahili. (ii) Sally would surely marry him. In order to make (23) represent the entire text (10-2), it would be necessary to use the usual definition of IOTA. But in that case (21) follows from (23) (both with 'L' interpreted as 'marry', of course), and the proposed formalization is no longer adequate for (10-1).
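The contrast drawn in this note can be stated schematically. The formulation below is ours, a sketch rather than van Dijk's or Reichenbach's exact notation:

```latex
% Schematic contrast of the two definitions of IOTA at issue in note 10
% (the formulation is ours). Russell's contextual definition eliminates
% the description within a chosen scope \psi:
\[
\psi((\iota x)\,\varphi x) \;\equiv_{df}\;
(\exists x)(\varphi x \land (\forall y)(\varphi y \rightarrow y = x)
\land \psi x)
\]
% Existence and uniqueness here are conjuncts inside the scope; if \psi
% is a modal context ('would surely marry'), no categorical existence
% claim is asserted, which is why (21) need not follow from (23).
%
% The definition Reichenbach discusses treats (\iota x)\varphi x as a
% genuine singular term, so that any use of it licenses detaching:
\[
(\exists x)(\varphi x \land (\forall y)(\varphi y \rightarrow y = x))
\]
% On this reading (21) does follow from (23).
```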
The preceding discussion shows that the text-grammar program is misguided not only in its general lines, but also in at least some of its few detailed analyses of grammatical facts. Let us turn now to a less detailed analysis of the remaining grammatical arguments, in order to show that they suffer from similar deficiencies.
3.3. Van Dijk's treatment of pronominalization is essentially parallel to that of definitivization. The basic rule is:11 "pronominalization may take place if the antecedent NP precedes, and 'backward' pronominalization is possible if the following 'postcedent' occurs in a sentence which immediately dominates the S-symbol under which the pronominalizable NP occurs" (p. 61). This rule was formulated by Langacker within the framework of transformational S-grammar. Van Dijk characterizes it as 'roughly correct'. One could ask, then, in what sense S-grammars are inadequate for dealing with pronominalization and, in particular, how van Dijk justifies the conclusion that "a textual treatment seems to impose itself in the case of pronominalization" (p. 63). In fact, nothing in van Dijk's elaborate discussion of the problem offers an answer to these questions. In his discussion, he merely recapitulates the history of pronominalization treatment in S-grammars, trying to create the impression that something is wrong with such a treatment. After formulating the above rule, he discusses a well-known apparent counterexample to it, the so-called Bach-Peters paradox. This 'paradox' hinges on the fact that if the rule is applied to a sentence like (24), then it will be impossible to show how the sentence is recursively generated. (24)
The girl who fell in love with him kissed the boy who insulted her.
Van Dijk mentions a series of authors who have tried to show that the paradox is only apparent and to provide solutions for it. He criticizes one of these attempts, McCawley's, for not being able to capture the difference in meaning between two sentences closely related to (24), namely (25) and (26)
(25) The girl who fell in love with the boy who insulted her kissed him. (26) The boy who insulted the girl who fell in love with him was kissed by her.
The difference becomes clear if we consider the possibility of 'him' referring not to the individual referred to by 'the boy', but to another individual i. In that case, according to (25) the girl kissed i and fell in love with the boy, whereas, according to (26), the girl fell in love with i and kissed the boy. But if McCawley failed in this case, Kuroda didn't, and van Dijk himself uses Kuroda's predicate logic formalization of the two sentences as a means for showing their difference in meaning. However,
11
Compare this rule with van Dijk's rule for definitivization (p. ), which was discussed towards the end of the preceding section.
this solution is also rejected by van Dijk on the grounds that it is "cumbersome and not easy to handle" (p. 63), which can only mean—since no further explanation is given—that he considers it of excessive formal complexity, but not incorrect. Well, but then why is Karttunen's informal account, which "gives an immediate insight into who fell in love with whom and who kissed whom" (id.), also rejected? Certainly because it is too informal. It is clear, therefore, that van Dijk is looking for a sort of golden mean which combines rigour in formalization with intuitive clearness. Before getting to his proposal, let us point out that it is always possible to offer an ad hoc solution to a particular problem which is 'simpler' and 'more intuitive' than other theoretical solutions. But the measure of 'simplicity', as a methodological criterion, cannot be applied 'locally', but only 'globally'. And since van Dijk does not develop a global system of rules, even for the limited domain of pronominalization12, it is pointless to argue that his proposal is better on the grounds that it is simpler. But is it really simpler and formally satisfactory? Van Dijk's solution consists in postulating 'texts' (27) and (28) underlying, respectively, (25) and (26): (27)
(i) A boy insulted a girl. (ii) The girl fell in love with the boy. (iii) The girl kissed the boy.
(28) (i) A girl fell in love with a boy. (ii) The boy insulted the girl. (iii) The girl kissed the boy.
He then gives a quasi-formalization of the embedding relations between the sentences of such 'texts', and of the transformational chain which leads to (25) and (26). Nevertheless, this procedure is nothing but using the old trick again, that is to say, scoring points in 'simplicity' just by not being explicit enough. Had he presented explicitly the rules connecting his postulated texts and sentences (25) and (26), he would have realized that his solution is either as 'cumbersome' as Kuroda's or else altogether inadequate. Unlike van Dijk, who refuses to draw general conclusions from his discussion of pronominalization processes (p. 74), we draw the following one: nothing in van Dijk's discussion has substantiated the claim that in order to deal with pronominalization S-grammars are inadequate and therefore a shift to T-grammars is needed.
12
"It is impossible at this moment to draw general conclusions from our discussion of pronominalization processes, let alone to formulate sufficiently general and simple rules" (p. 74).
3.4. Definitivization and pronominalization are examples of grammatical facts that can only be dealt with, according to van Dijk, at the level of 'textual surface structure', because they involve relations between sentences which form a sequence. The semantic relations which explain definitivization and pronominalization do not involve entire sentences (or their semantic representations), but only some of their elements, especially their noun phrases. Examples of such semantic relations are "referential identity, semantic identity, lexical identity, ..." (p. 91). But there are also semantic relations between the whole sentences of a textual sequence, which must also be accounted for, in van Dijk's model, at the level of 'textual surface structure'. One of these relations is the relation of presupposition, which we shall now consider briefly. Van Dijk's definition of presupposition is an informal counterpart of some definitions that can be found in the literature (e.g., van Fraassen, 1968). Since it is not intended to be a significant contribution to the clarification of the linguistic notion of presupposition, it is pointless to discuss here its shortcomings.13 Leaving aside the question of a satisfactory definition of presupposition, we shall try to see in what sense the addition of the notion of 'text' to a grammar contributes to the explanation of the semantic relation of presupposition between sentences, as suggested by van Dijk. Van Dijk's basic idea for a 'textual' treatment of presuppositions is that the presuppositions of a sentence should be equated with the 'preceding sentoids' in the semantic representation of the 'text' underlying the sequence of sentences to which the sentence belongs (p. 100). That means that, for each sentence involving presuppositions, there is an underlying 'textual surface structure' of which these presuppositions, as well as the semantic representation of the sentence in question, are parts, and in which the former 'precede' the latter. Consider the following example: (29)
Peter realizes that Johnᵢ pretends that heᵢ is ill.
This sentence has presuppositions of the first and second order. The first order presupposition is the embedded sentence, "Johnᵢ pretends that heᵢ is ill", which, in turn, presupposes that John is not ill. Since, according to van Dijk, presupposition is a transitive relation,14 the second order presupposition (that John is not ill) is also a presupposition of the whole sentence (29). All this information is made explicit by van Dijk in the underlying 'text': (30)
(i) John is not ill. (ii) John pretends Sᵢᵢᵢ. (iii) John is ill. (iv) Peter realizes Sᵢᵢ.
13
Some discussion of this topic can be found in Dascal and Margalit (1973).
14
That this is not a general rule is clearly shown by the existence of contexts that 'cancel out' the presuppositions of the embedded sentences (cf. Karttunen, 1973).
Having in mind the fact that one of the major purposes of text grammars is to explain the notion of 'coherence' of a text, one would wonder in what sense (30) is a coherent text. For (30), as far as we understand, involves a straightforward contradiction, namely, (i) and (iii). The presentation of a sentence in a separate line strongly suggests indeed that its truth is simply asserted. Van Dijk could, of course, argue that (iii) is 'dominated' by (ii), and therefore, that its truth is not independently asserted. But then, what is gained by separating (iii) from (ii), except for the typographical illusion of having a 'text' and not a complex sentence underlying (29)? Moreover, it is clear that (30) or something of the like is nothing but an informal presentation of the set of kernel sentences that belong to the 'deep structure' of (29). The difference between T-grammarians and S-grammarians is that the latter, but not the former, offer also an explicit formal account of the structural relationships connecting (i)—(iv). We see, then, that the introduction of 'texts', even assuming that this can be done without introducing contradictions into the semantic representations, is not likely to lead us beyond the limits of an S-grammar, but rather leaves us far behind it in what concerns explicitness and rigour. To conclude this section, we may say that the only statement in van Dijk's treatment of presupposition with which we entirely agree is the following one: "From the observations made above, the precise theoretical status of presupposition ... has not become fully clear" (p. 100).
* * *
It is clear, thus, that neither definitivization, nor pronominalization, nor presupposition, nor any of the other grammatical facts discussed by van Dijk require the replacement of S-grammar by T-grammar. Ultimately, van Dijk himself is aware of this fact. After spending about a hundred pages in trying to prove that a T-grammar is necessary to account for these grammatical facts, he withdraws from this position, by declaring that the additional rules of grammar required to handle the combination of subsequent sentences in a text "could be introduced into a sentence-grammar without profound changes in the form of the grammar" (p. 132). This simply means that the grammatical arguments which allegedly support T-grammar vs. S-grammar do not in fact offer any evidence whatsoever for this claim. The same is true for the methodological arguments discussed in section 2. Hence, the remaining set of arguments, relying on our intuitions about the 'macrocoherence' of texts as reflected in the fact that we normally process texts by discerning in them structured 'plans' or plots, should provide not further evidence (p. 130) for T-grammars, but rather the only evidence, if any, that van Dijk is able to produce. A close examination of these arguments (see Dascal and Margalit, 1973), however, shows that the intuitions on which they depend are rather vague, and by no means comparable to our intuitions about the acceptability of sentences. Therefore,
the attempt to characterize the 'well-formedness' of texts by a set of recursive rules, analogous to those which characterize the well-formedness of sentences, seems to be doomed and does not warrant the conclusion that there is such a thing as a 'grammar' of texts, and still less the conclusion that this alleged 'grammar' should replace the existing grammars of sentences.
Figure 1. Schematic representation of the structure of van Dijk's T-grammar. [Diagram not reproduced. Its recoverable components, in order: semantic formation rules, macro-structures, macro-transformation rules (R₁), transformed macro-structures, transformation rules (R₃), textual surface structure, micro-semantic formation rules, transformation rules, lexicalization rules (R₄), syntactic representation of sequences, morphophonological representations.]
The inevitable conclusion, then, is that even the most elaborate results of the T-grammar research program have not, so far, provided the necessary justification for its central claims. The program has not produced a convincing 'refutation' of S-grammar, nor presented a viable alternative to it. Far from being the beginning of a promising 'revolution' in the field of linguistics, the T-grammar research program looks, so far, more like a doomed 'coup d'état'. In the light of the available evidence, therefore, no linguist could be suspected of 'conservatism', should he decide to stick to the 'old'—but still fruitful—research program of transformational S-grammar.
* * *
REFERENCES
BEVER, T. G. (1971), The Integrated Study of Language Behavior, pp. 158—209 in: Morton, John (ed.), Biological and Social Factors in Psycholinguistics, London: Logos Press.
DASCAL, MARCELO and MARGALIT, AVISHAI (1973), Text-grammars—a critical view, in: Ihwe, J., Rieser, H., and Petöfi, J. (eds.), Probleme und Perspektiven der neueren textgrammatischen Forschung I, Papers in Textlinguistics, vol. 5, Hamburg: Buske.
VAN DIJK, TEUN A. (1972), Some Aspects of Text-Grammars, The Hague: Mouton.
VAN DIJK, T. A., JENS IHWE, JANOS PETÖFI, HANNES RIESER (1971), Textgrammatische Grundlagen für eine Theorie narrativer Strukturen, Linguistische Berichte 16, pp. 1—38.
FODOR, J. and KATZ, J. J. (1964), The structure of a semantic theory, pp. 479—518 in: Fodor, J. and Katz, J. J. (eds.), The Structure of Language, Englewood Cliffs: Prentice Hall.
HARRIS, ZELLIG S. (1951), Structural Linguistics, Chicago: The University of Chicago Press.
FRAASSEN, BAS C. van (1968), Presupposition, Implication and Self-Reference, The Journal of Philosophy 65, pp. 136—152.
IHWE, JENS; HANNES RIESER; WOLFRAM KÖCK; MARTIN RÜRTENAUER (1971), Informationen über das Konstanzer Projekt »Textlinguistik«, Linguistische Berichte 13, pp. 105—106.
KARTTUNEN, LAURI (1973), Presuppositions of Compound Sentences, Linguistic Inquiry IV, pp. 169—193.
KASHER, ASA (1970), The Logical Status of Indexical Expressions, unpublished Ph.D. thesis, Jerusalem.
NAGEL, ERNST (1961), The Structure of Science. Problems in the Logic of Scientific Explanation, London: Routledge and Kegan Paul.
PETÖFI, JANOS S. (1971), Transformationsgrammatiken und eine ko-textuelle Texttheorie. Grundfragen und Konzeptionen, Frankfurt am Main: Athenäum.
REICHENBACH, HANS (1947), Elements of Symbolic Logic, New York: Macmillan.
STEMPEL, WOLF-DIETER (ed.) (1971), Beiträge zur Textlinguistik, München: Fink.
ANNOUNCEMENT
The Department of Philosophical Studies of Southern Illinois University at Edwardsville, in joint sponsorship with the Association for Symbolic Logic, takes great pleasure in announcing an International Conference on Relevance Logic, to be held in St. Louis, Missouri, from September 26 through September 28, 1974. The conference will be devoted to an examination of both the logical and philosophical aspects of relevance logics. There will be planned sessions on the following topics: I. The Pros and Cons of Relevance Logics; II. Philosophical Applications and Implications; III. Neighbors and Relatives of E and R; IV. Semantics and Proof Theory; V. Negation in Relevance Logics. In addition there will be one session for contributed papers. Among those presently scheduled to read papers are the following: Alan Ross Anderson, John A. Barker, Nuel D. Belnap, Jr., J. Michael Dunn, Kit Fine, Dov M. Gabbay, Louis F. Goble, Robert K. Meyer, Garrel Pottinger, R. Zane Parks, William T. Parry, Richard Routley and Alasdair Urquhart. Papers submitted for the open session may be on any aspect, logical or philosophical, of relevance logics and may reflect any point of view toward such topics. The papers to be read at the open session will be selected by the steering committee composed of Alan Ross Anderson, Nuel D. Belnap Jr., Kenneth W. Collier, Robert K. Meyer and Robert G. Wolf. It is planned to publish the proceedings of the conference. Individuals interested in further information about the conference should write to Professor Robert G. Wolf, Department of Philosophical Studies, Southern Illinois University, Edwardsville, Illinois 62025. Individuals interested in submitting papers to the open session should write to Professor Kenneth W. Collier at the same address. The deadline for submitted papers is 1 April 1974.
IRENA BELLERT
ON INFERENCES AND INTERPRETATION OF NATURAL LANGUAGE SENTENCES
The paper discusses the calculus of inferences of a complex sentence from the inferences attributable to its constituent parts. It is argued that the concept of presupposition is not well defined in a number of linguistic papers. Presuppositions could be defined in the same way as all other inferences, and the calculus of presuppositions would constitute only part of the general calculus of inferences, which is a crucial problem in natural language semantics.
The problem of inferences in a natural language has recently received much attention in the literature on semantics. It may, therefore, be worthwhile to recall briefly the different ways of approaching this problem. Papers concerned with the relation of consequence, presupposition, entailment, implication, meaning postulates or meaning components are in fact all coping with the same problem, namely, the problem of the semantic relations holding between a sentence and the set of propositions it expresses, or in other words, the problem of what can be correctly said to follow from a sentence. It is evident that the semantics of any language (natural or formal) is inseparably connected with the rules that account for the relation of inference holding between the sentences of that language. Every speaker of a language can draw conclusions from the sentences of that language and from his implicit knowledge of the rules of that language, and the problem, in fact the crucial one in semantics, is how to account for such conclusions in an adequate way, that is, how to describe explicitly both the sentence structure and the rules of language which allow us to make the correct inferences, those we are capable of making in an intuitive way when we argue, get persuaded by someone's argument and draw proper conclusions from what we hear. In several papers (Bellert, 1968, 1970, 1972) I have proposed to identify the semantic interpretation of a sentence with the set of conclusions that can be drawn from that sentence and from a set of implicational rules pertinent to the syntactic and lexical characterization of that sentence. Such implicational rules correspond to Carnap's meaning postulates, and this concept has a tradition in philosophy; but such a conception of semantics had not been proposed before in linguistics. As I argued, it seems a possible and promising approach to
semantics. The proposal has empirical support, for the capability of drawing conclusions can be taken as evidence of understanding, and on the other hand we can expect that if someone understands what has been said, he is capable of drawing the correct conclusions. We may, therefore, say that: (1)
X can understand (is able to interpret) S ≡ X can draw conclusions from S.
There are two problems which need some clarification here. Firstly, it is obvious that only some of the conclusions we can draw are made on the grounds of just the knowledge of language; other conclusions are made on the grounds of some additional premises pertaining to the factual knowledge of the world. The difference between the two is evident in clearcut cases, but there is a large number of boundary cases for which only an arbitrary decision can be made as to whether we do or do not want to include the information in the description of our lexicon (our descriptive terms). This is a question that constantly poses itself to lexicographers: should such and such information be included in the description of a lexical item, or should it, rather, be left aside as pertaining to the encyclopaedic knowledge? In any case, once such a (partly) arbitrary decision is made, we need to distinguish between the two types of conclusions: the ones which we make on additional grounds of the factual knowledge of the world, and which will not be obtainable from our description, and those which will be accounted for by our description of the given language and general rules of inference. I used to call the latter "consequences" in order to distinguish them from the former; in the present paper no confusion may arise, for I will be concerned with only those conclusions or inferences that, clearly, should be accounted for by an explicit description of language, and I will thus leave the problem of boundary cases aside.
The second problem that I want to clarify is the following. There are obviously some inferences of complex sentences which correspond to the propositions included in and represented by the deep structure description of those sentences (obviously, there are also some which are even represented in the surface structure). The problem is whether it is possible to represent all propositions corresponding to the inferences we would have to account for in the deep (or logical) structure. I will argue below that the answer is in the negative. But even if it were possible to represent all such propositions in the deep (or logical) structure, it would still be necessary to add to our description additional rules for the calculus of inferences of complex sentences from the representation of their deep structure in terms of the inferences of the particular propositions included in the description. That is to say, the deep (or logical) structure representation alone would not be a solution to this problem. Earlier proponents of generative semantics suggested, however, that all semantic information be explicitly included in the deep (logical) structure. The
extreme claim was that the deep, logical structure is the semantic interpretation of the corresponding sentence, a claim which is untenable. In more recent papers, generative semanticists do not maintain such an extreme position1. What seems to me important to realize is that no matter whether we accept the lexical decomposition hypothesis (as proposed by generative semantics) or not, whether we accept the deep structure of the interpretivists or just a grammar generating surface structures alone, we are always faced with the problem of the calculus of inferences of a complex sentence from its component parts; we cannot escape this problem in a full description of a language including semantics, and therefore the inclusion of the propositions corresponding to inferences in the representation of a sentence will never do the job; we still have to know in which syntactic positions such propositions constitute inferences of the entire sentence, and in which other positions they do not. The acceptance of the lexical hypothesis has, however, the following disadvantage for this purpose, as compared with the meaning postulates method. Each time a given lexical item occurs, the tree structure from which it is said to be derived must also occur in the description, thus making the representation of sentences unnecessarily complex. But if we add just one meaning postulate with the equivalent information, it will hold true of all occurrences of the lexical item with which it is associated, and we achieve the same goal by simpler means. In either case the rules for the calculus of inferences of complex sentences are indispensable. It is clear from the recent literature on the subject of inferences that the set of inferences that can be made from a complex sentence neither includes nor is included in, but in most cases overlaps with, the sum total of the inferences that would be made from the component clauses, if the latter were to be interpreted as sentences standing by themselves. There are inferences of the whole sentence that do not hold of any of its components, and on the other hand, there are inferences that could be made of a component (if it were a sentence by itself) and which do not hold of the entire sentence. The former can be exemplified, say, by counterfactual conditionals, from which we can infer the negation of the antecedent and the negation of the consequent clause2, whereas the latter can be shown in the case
1
For instance, Lakoff (1970) says: "We want a logic which is capable of accounting for all correct inferences in natural language and which rules out incorrect ones" and: "Meaning postulates will have to be accepted for a full description of natural language". He also adds: "I think it is clear that there is a range of cases where lexical decomposition is necessary. In addition, it is also clear that certain meaning postulates are necessary. The question is where to draw the line".
2
Consider, as an example, the following sentence: "If John had not been in Boston yesterday, he would not have lost his suitcase", from which we may infer: "John was in Boston yesterday" and "John lost his suitcase". In general, we need, however, additional restrictions for the negation of the second clause.
of so-called verba dicendi and sentiendi, the complements of which cannot be inferred from the entire sentence. Thus, to use Karttunen's terminology (Karttunen, 1971), certain verbs behave like 'holes' (let the inferences through), others behave like 'plugs' (block off the inferences), still others behave like 'filters' (cancel the inferences under certain conditions)3. In view of what we already know about the inferences of complex sentences, we have to admit that the calculus of inferences is a problem to be dealt with independently of how much information is represented in the deep structure description. I would even venture to claim that the more we include in the deep structure, the more complex becomes the calculus of inferences. The proposal for representing auxiliaries, quantifiers, sentential connectives, abstract performatives, etc. as higher verbs (or 'abstract predicates') in the deep structure obliterates the semantic differences which obviously exist between these abstract entities, so that the apparent advantage gained from a uniform syntactic representation (all such entities have the same syntactic category) would be obtained at the expense of an extremely complex calculus of inferences, in which we would have to differentiate among higher verbs that are quantifiers, those that are sentential connectives, those that are modals, auxiliaries, etc. For we cannot simply assume that once we have such a uniform syntactic representation of the various 'abstract predicates', we can use standard logic to account for the different semantic properties of those 'predicates'. Moreover, linguistic quantifiers and connectives have additional semantic properties which are ignored in standard logic, and the remaining 'abstract predicates', such as auxiliaries, modals or tenses, need to be additionally defined. For instance, the linguistic quantifiers all or some give grounds for additional inferences concerning the embedded propositions (see Bellert, 1969,
'
The filtering conditions are roughly the following:
Let S stand for any sentence of the form 'If A then B'. (a) If A presupposes C, then S presupposes C. (b) If B presupposes C, then S presupposes C unless A semantically entails C. Let S stand for any sentence of the form ¢ and B' (a) If A presupposes C, then S presupposes C. (b) If B presupposes C, then S presupposes C unless A semantically entails C. Let S stand for any sentence of the form ¢ or B'. (a) If A presupposes C, then S presupposes C. (b) If B presupposes C, then S presupposes C unless the negation of A semantically entails C. The filtering conditions described above have been slightly revised by both Karttunen (1971) and Lightfoot (1971), but this is irrelevant to my argument, as in any case the filtering conditions play a significant role in the calculus of inferences. I will only argue for the extension of the filtering conditions so that they cover not only presuppositions but all inferences.
On inferences and interpretation of natural language sentences
219
1972)4. Besides, we have more quantifiers in any natural language than we have in logic, and those that do share the semantic properties of the quantifiers defined in logic in some syntactic positions, display some additional properties in other positions. The latter has been demonstrated by Zeno Vendler (1967) for the case of the distributive and non-distributive interpretation of quantifiers depending on the predicate in the quantified sentence. There is in addition one argument for the necessity of adding meaning postulates (or some equivalent device) to any possible description of a natural language that would account for the correct inferences, and this is the following. There is no denial of the fact that there are simple (non-complex) sentences from which it is possible to infer an unlimited number of sentences. This alone forces us to admit that we need some rules, such as meaning postulates, which will be finite in number and will account for the possibility of an unlimited number of inferences. To give a trivial example, consider the following sentences: (2)
This statue has stood in the city of Quebec for two hundred years.
(2 a) This statue was standing in the city of Quebec fifty minutes ago. (2b) This statue was standing in the city of Quebec five years ago. (2c)
This statue was not standing in the city of Montreal (New York, Warsaw, etc.) a year (fifty years, seven months, etc.) ago.
The sentences (2 a) and (2b), and many analogical sentences, are correct inferences which can be accounted for by one meaning postulate, in which reference is made to all moments of time within the period denoted by the Time Adverbial preceded by the preposition for, the moment of speaking and the present perfect marker. Sentences of the form (2c), and an unlimited number of analogical sentences, can be accounted for, if we add one additional meaning postulate, which roughly will say that: for any physical entity, if it is at a given time in a given pkce, it is not at the same time in any other place. What I have said so far may be summarized as follows: Meaning postulates (or some equivalent formal rules which would permit us to draw inferences from sentences) are indispensable for a full description of a natural language, that is, for a theory of language with semantics (a theory relating 'sounds to meanings'), independently of whether we accept one or another version of grammar. Evidently there is a separate problem worth consideration: what kind of information is really
4
This matter has been discussed in more detail in my other papers but a well known example is the one of an inference, which does not hold in case of a universal quantifier, but holds in case of the linguistic ^//-quantifier. From the sentence "All the students sitting in Jim's room are reading*' we may infer that there are students in Jim's room.
220
Irena Bellert
necessary for defining recursively all sentences of a knguage, and what can otherwise be 'left over' to be accounted for by meaning postulates.5 What I now want to submit for discussion is the following. If the problem of inferences of natural language sentences is indeed a crucial problem of semantics—which fact can hardly be denied—and if a number of linguists are already concerned with the problem—why then so little concern has been given to an explicit description and the formal properties of the relation of presupposition, inference or entailment in natural language? Let me give some examples. George Lakoff, in his very interesting paper (Lakoff, 1972, Chapter V) discusses the problem of transitivity of the relation of presupposition, but it is only in a footnote that he gives a definition of this relation, that is, he explains what he means by 'presupposition' in his discussion (Footnote 2, Chapter V): "This notation" (i.e. the implicational sign) "is introduced purely as a device to keep track of what is going on. It is not meant to have any theoretical significance. I take the term 'presupposition' as meaning what must be true in order for the sentence to be either true or false." If we rephrase LakofPs definition formally, it is equivalent to Keenan's definition of presupposition (which will be discussed below), and then it can easily be proved by a few trivial steps that the relation of presupposition is necessarily transitive. Thus the puzzling problem of the transitivity of the relation of presupposition disappears immediately. Yet, Lakoff concludes by leaving the problem of transitivity of presuppositions an open question. What he discusses in his paper, and what is of real interest, is obviously not the problem of transitivity of presupposition, but the general problem of the calculus of inferences in complex sentences. One could argue that the value of a discussion is independent of the fact that the problem is somewhat confusingly posed—and indeed the discussion and evidence presented by Lakoff is of great interest for further research in the field. However, the problem is ill-posed, and as often happens, this brings about some confusion. On the other hand, a properly posed problem, in non-ambiguous terms, already constitutes already an achievement, as it may contribute to a better understanding of the subject matter. Similar observations can be made on some other papers which deal with the problem of presupposition, consequence and inference in natural language description. If I am examining some statements in more detail, it is because they appear in papers which are of great significance in the field. I also believe that there is a good deal of confusion on the matter in question and a great need for adequately describing the relations which are of crucial importance for natural language semantics. 5
Instead of deriving a lexical item from a tree structure corresponding to its meaning components, we can give the equivalent information in a meaning postulate associated with that lexical item. Many transformations could also be reformulated in terms of implicational rules.
On inferencex and interpretation of natural language sentences
221
Let me now make a few remarks on Karttunen's definitions concerning inferences in natural language, as Karttunen papers have advanced to a great extent the linguistic description of presuppositions and inferences. Again what is missing in his description and what, therefore, brings about a slight confusion and logically puzzling problems, is the lack of explicitness in the definitions of the relations being discussed. Let me take an example from his classification of verbs into implicatives, negative implicatives, if-verbs, etc. (Karttunen, 1971). The meaning postulates for implicative verbs (such as, 'manage to', 'succeed in', etc.) are roughly described as: (3) (a) v(S) => S (b) ~ v(S) => ~ S
'v(S) is a sufficient condition for S' 'v(S) is a necessary condition for S'
Let me quote what Karttunen has said on the formal properties of the implicative sign => used in his definition: "In saying that p implies q, I am not using the term 'imply' in the sense of 'logically implies* or 'entails'. The relation is somewhat weaker as indicated by the definition (4)
P implies Q, iff whenever P is asserted, the speaker ought to believe thatQ."
Karttunen mentions, then, that this relation is closely related to Van Fraassen's notion of necessitation: (5)
P implies Q, iff whenever P is true Q is also true.
There are some problems, however, with the meaning postulates which contain the implicative sign in question, as for example (3). The statement in (3 a) says that 'v(S) is a sufficient condition of S' (which is obvious), while the statement in (3b) says that 'v(S) is a necessary condition of S', and the latter is not justified by the author. If such were the case, Karttunen's meaning postulates for implicative verbs would constitute an equivalence, (v(S) would be truthfunctionally equivalent to S, on the grounds that v(S) is said to be a sufficient and necessary condition of S). And this is something that Karttunen did not mean; on the contrary, he claims earlier in the same paper that the two are not logically equivalent, the implication holds in one direction only; the sentence with an implicative verb 'John managed to kiss Mary' is not equivalent to 'John kissed Mary', as the verb 'manage'carries along an extra assumption that is not shared by the latter sentence. The inconsistency stems from the fact that the statement 'v(S) is a necessary condition of S' holds true in this case, only if the negation is conceived of as a logical, sentential negation, and then v(S) and S are indeed logically equivalent, as the law of contraposition applies to ~ v(S) => ~ S, and we obtain S ^ v(S). However, if the negation in ~v(S) is conceived of as a verbal, internal negation, then it does not follow that v(S) is also a necessary condition of S. This is not to say that the law of contraposition (or modus tollens) does not apply to the second meaning postulate in (3); it does, but then
222
Irena Bellert
the consequent is a negation of the internal negation of S, rather than v(S), and it is the former, rather than the latter, which can be correctly said to be a necessary condition of S. I wish to emphasize here that my critical comments concern the formal aspect of Karttunen's definitions only, something that can easily be corrected6 but I have learnt a lot from the discussion presented in the paper. And it seems that once we agree on specifying meaning postulates in terms of common criteria, it will be easier to push forward our description which would account adequately for inferences in natural language. Keenan's notion of presupposition (Keenan, 1970) is explicitly defined, and can therefore be discussed easily. From a formal point of view there can be no confusion as to the properties of the relation of presupposition: (6)
If a sentence S presupposes a sentence S1 (or S1 is a presupposition of S), then both S and its negation ~S logically imply S1 (or S1 is a consequence of S and ~ S).
I will discuss this definition in detail, as it constitutes an explicit version of the definitions of presupposition accepted implicitly in several linguistic papers (the definition in Lakoff (1970) discussed above is an example). Such a definition would be improper in a two-valued logic. A presupposition S1 is a logical consequence of both S and ~S, that is to say, S1 is implied by S independently of the truth value of S (in all possible worlds). However, the presuppositions are not logically true sentences, which would make it possible for them to be true in all possible worlds (in all possible state of affairs). In order to solve this problem, Keenan assigns a third or zero value to sentences whose presuppositions do not hold. Thus a presupposition may be false, but then the corresponding sentence will be said to be neither true nor false, it will have a zero value.
6
The same problem arises in the case of some other meaning postulates in Karttunen's paper. For instance the negative only-if verbs (e.g. 'to hesitate') are associated with the following meaning postulate: ~ v(S) 13 S
f
v(S) is a necessary condition of ~ S'
I think that v(S) can by no means be said to be a necessary condition for ~ S. The first part of the meaning postulate correctly accounts for the fact that the sentence: *Bill did not hesitate to call Jim a liar* implies that Bill called Jim a liar, but the statement: ev(S) is a necessary condition for ~ S' does not hold true, as it would mean that the sentence: ¸ßÀÀ did not call Jim a liar* implies that Bill hesitated to call him a liar. And it is obvious from what Karttunen says in the same paper that he did not mean this.
On inferences and interpretation of natural language sentences
223
There is nothing formally wrong in Keenan's proposal. However, we wouldn't really want to use a non-bivalent logic for many reasons. We would have then to dismiss with the rules of standard logic, and work out new inference rules based on a three-valued model. In addition to all the disadvantages of such an approach, we would not gain anything that is of interest to linguistics. For it is worth while to realize that linguists are not interested in the question of whether a sentence is actually true, false or neither true nor false (has a zero value), because this cannot be decided by linguistic or logical methods; this can be decided only by the knowledge of the factual state of affairs and of the time at which a (declarative) sentence has been used. What linguists are interested in is, rather, the question of what can be correctly inferred from a sentence on the grounds of its component parts. What we need to know, then, is how to formulate rules which would account for the inferences that can be made whenever a given sentence (not necessarily a declarative one) is used, independently of the speaker, of the time of speaking, independently of whether it happens to be true, false or neither true nor false; independently of whether the speaker believes or not in what he claims to be the case, and whether he is sincere or not. All such considerations do not belong to linguistics and should not bother us at all, for the rules of a knguage remain the same, no matter whether a speaker says the truth or not, whether he is sincere or not; if he lies then the inferences will not be true either, and the listener will be misinformed—but the interpretation of the sentence constituting a lie will be correct. Now in order to give rules which would account for the inferences we make whenever a given sentence is uttered, it is necessary to extend the rules of standard logic, rather than to reject the two-valued logic by accepting a threevalued logic. In any case, we need much more than standard logic rules for truth conditions for our purposes.7 And this can be achieved by meaning postulates or other equivalent devices, without rejecting a two-valued logic. In addition to the objection against a three-valued logic, involved in the discussed definition, I doubt if the definition is empirically adequate or that it covers all cases referred to as presuppositions in the linguistic literature. Is it really 7
7 For instance, a speaker who uses a sentence of the form 'S1 and S2' or 'S1 but S2', expresses more (only by the meaning of the connectives used) than just 'S1 is true and S2 is true'; a speaker who uses a sentence of the form 'If S1 then S2', also expresses more than just 'If S1 is true, then S2 is true'. This is to say that even the logical connectives have some additional semantic properties which can be formulated by appropriate meaning postulates. The sentence 'John returned home and had dinner' gives grounds to different inferences (has different truth conditions) than the sentence 'John had dinner and returned home.' For the meaning of the connective if ... then, see footnote 12.
a logical negation that is in question here? Is it not the case that only an internal (verbal) negation of S implies S1? Many linguists (including Keenan himself) have pointed to the semantic difference between the internal negation: (7)
John did not realize that Mary liked him.
and the weak (sentential) negation: (8)
It is not true (the case) that John realized that Mary liked him.
The sentence 'Mary liked John' can be said to follow from (7), whereas most people would not agree that it follows from (8). As a test, we may add the negation of 'Mary liked John' to (8), without making the statement contradictory: (9)
It is not the case (true) that John realized that Mary liked him; she didn't even like him at all (in fact she didn't like him).
The logical negation in the definition of presupposition would, however, cover both types of negation (if we want to retain a two-valued logic), so that the definiens would not constitute a necessary condition. Let me now sketch briefly the main point of my proposal concerning meaning postulates or implicational rules that would account for inferences in natural language. Let me discuss it first intuitively. Presuppositions of a sentence obviously constitute a subset of the inferences of that sentence, and in the calculus of inferences we should take both into account equally. If we want to distinguish the particular inferences, let us call them presuppositional inferences or p-inferences, it is on the grounds that p-inferences are those (inferences) which are more strongly associated with certain linguistic elements than are other inferences; that is, a p-inference holds not only of a simple sentence containing the corresponding "source-element" of that inference, but it will hold also of a complex sentence under some conditions (in some syntactic positions of the "source-element") in which other inferences do not hold or should be modified. For instance, a p-inference will hold true if the "source-element" is within the scope of a verbal negation, of verbs expressing the propositional attitude of the speaker (to want, doubt, imagine, etc.) or in the scope of if. Hence the problem of distinguishing a p-inference from other inferences boils down to being only a particular problem of the calculus of inferences, namely, the problem of specifying the conditions in which the inferences due to some components of a sentence remain the same or not in other syntactic positions8. And this is the basic
8 David Lightfoot (1971), for instance, has proposed conditions under which the inferences holding true of a subjunctive conditional are cancelled if the subject of the antecedent clause is indefinite and co-referential with the subject of the consequent clause in deep structure. These conditions will account, e.g., for the fact that the sentence 'If anybody had gone to Athens, they would have seen Socrates' does not imply that nobody went to Athens.
problem of the calculus of inferences of complex sentences. Moreover, once we have the rules for the calculus of inferences, and we do obtain the inferences of complex sentences by those rules, there is no need of distinguishing the p-inferences from other inferences. For what does it matter if a subset of the inferences we can make from a complex sentence would have been also inferences of a different sentence, in which the "source-element" of the said inferences would have been in the scope of negation or in the scope of a verb expressing a propositional attitude, etc.? We could ask similar questions about other inferences and other syntactic positions as well. What remains to be shown, and what I will try to show, is that the rules of the calculus of p-inferences should then be extended to cover other inferences as well; in fact, those rules in some cases hold of p-inferences and other inferences alike, and in other cases they can be modified so as to cover all the inferences. It seems thus unreasonable to restrict the study of inferences to a particular case only. I will go into a more detailed examination of this problem after I present definitions which seem to me adequate for the relation of inference that holds in natural language. From a formal point of view, it seems to me that the concept of strict implication, originating from C.I. Lewis, is adequate for the linguistic concept of inference, as it applies to all cases discussed in the linguistic papers in the field. We will say that p → q (p strictly implies q) only if it is necessary that if p then q. Equivalently, we may say that p strictly implies q if it is not possible that p and not q. Thus we have:
(10) p → q = df □(p ⊃ q)

or equivalently

(11) p → q = df ~◇(p & ~q)
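Read against a possible-worlds semantics, (10) and (11) both say that q holds in every world in which p holds. A minimal Python sketch of that reading (added for illustration; the three-world model is invented):

# A toy possible-worlds model: each world assigns truth values to atoms.
worlds = [
    {"p": True,  "q": True},
    {"p": False, "q": True},
    {"p": False, "q": False},
]

def strictly_implies(a, b, worlds):
    """p strictly implies q iff no possible world makes p true and q false."""
    return not any(w[a] and not w[b] for w in worlds)

print(strictly_implies("p", "q", worlds))   # True: q holds wherever p does
print(strictly_implies("q", "p", worlds))   # False: a world has q without p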
The intuitive understanding of the strict implication, proposed for meaning postulates that should account for the linguistic inferences, is the following. As we have to account for inferences that would hold true independently of the factual conditions in which sentences may be uttered (independently of who is the speaker, what he actually believes, what is the actual state of affairs), the implication is necessarily true; that is, its truth, by definition, is guaranteed by the meaning of p and q alone. This is exactly what we intuitively require of the relation called 'presupposition' in the literature, and what we require of all other inferences as well. The corresponding pragmatic definition, which follows from the formal properties of the strict implication, as defined above, is: (12)
p → q = df for any speaker, it is necessary that if he uses (utters) p, then he purports to believe that q
or equivalently (13) p → q = df for any speaker, it is not possible to use (utter) p and at the same time to purport to believe that q is not the case. If I say that the pragmatic definition follows from the definition of strict implication, it is for the following reasons. If the implication p → q is necessarily true (that is, q is a necessary condition of p by the meaning alone), then whenever we utter p, we cannot but express our purported belief that q. In other words, it is impossible to utter p and at the same time to express a belief that q is not the case. This holds independently of all extralinguistic factors pertaining to the time, place and circumstances in which the sentence is used; the inferences will thus be independent of the factual state of affairs or the state of mind of the speaker, of whether he is lying or telling the truth; if he lies, that is, if the sentence is false, then the conclusions will be false. But this is what happens when we make correct inferences from statements that are lies: we are deceived by the speaker, but our interpretation is correct. Finally, the pragmatic definition applies to questions and commands as well, since we do not apply the truth value to a sentence "p", but to the sentence "a speaker uses (utters) p", which is true whenever a speaker uses (utters) p. This way our inferences are independent of the actual states of affairs. The states of affairs change constantly from speaker to speaker, from one moment of time to another for the same speaker, etc. This is why a referential semantics for all possible discourses is out of the question. Our proper names and definite descriptions have unique referents only for a given discourse at a given time (with a few exceptions)9. But in establishing meaning postulates we are not concerned with this problem. Let me now discuss in some detail the rules for the calculus of inferences of complex sentences in relation to the calculus of p-inferences. Let me first discuss those rules that have been established for p-inferences but which hold of all other inferences as well. Consider the following examples. (14)
(A) John awoke Mary at noon
John killed Mary at noon
John pretended to be sick
John met Mary in Boston

(B) Mary was asleep immediately before noon
Mary was not alive after noon
John was not sick
John was in Boston
The sentences of (B) would not be called p-inferences of the respective sentences of (A), as it is possible to use a negation of the latter together with the negation of
9 This is one of the crucial points concerning and limiting the possible semantic approaches to natural language. I have discussed some aspects of this matter in (Bellert, 1968).
the former without contradiction (and this would not be the case for p-inferences). (15)
John did not awake Mary at noon; she was not asleep immediately before noon.
John did not kill Mary at noon; she was alive after noon.
John did not pretend to be sick; he was sick.
John did not meet Mary in Boston; he was not in Boston.
The relation between sentences of (A) and those of (B) is of the same interest to semantics as the relation of p-inference, since sentences of (B) can clearly be inferred from the respective sentences of (A) on the grounds of the meaning of the respective verbs. Now if we test such inferences against the rules defined by Karttunen for the calculus of presuppositions (Karttunen, 1972), we will see that it is possible to extend the calculus of p-inferences to other inferences and thus cover a broader area of the semantics of natural language. Verbs classified as "holes", that is, those that let through all the p-inferences of their complements, will let through all the other inferences as well. Consider, as an example, the verbs 'to force', 'to regret' and 'to realize': (16)
Jim forced John to awake Mary at noon → Mary was asleep immediately before noon.
Jim forced John to kill Mary at noon → Mary was not alive after noon.
John regretted that he pretended to be sick → John was not sick.
John realized that he met Mary in Boston → John was in Boston.

Let us now examine the case of verbs classified as 'plugs', that is, those which cancel (to use Karttunen's terminology) the p-inferences of their complements. It could be expected that if such verbs cancel those inferences that remain 'untouched' in the scope of negation, then they should cancel other inferences as well. However, as it appears, in the calculus of inferences, in which we pose the problem with regard to all the inferences rather than with regard to a particular subset of inferences, we can extend the rule by stating that the inferences (whether p-inferences or not) remain but are modified: we have to refer to the beliefs of the individual denoted by the subject of the verb. Consider some examples: (17)
John has told me that Jim's children are sick.
John asked George to stop beating his wife.
John ordered his wife to awake the children at noon.
John said that he met Mary in Boston.
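The sentences of (17) are taken up in the next paragraph; as a preview of how the two verb classes could behave in a calculus of inferences, here is a minimal Python sketch (my own illustration, not a formalism proposed by Bellert or Karttunen; the verb lexicon is an invented fragment):

# Toy classification of matrix verbs (an invented fragment).
VERB_CLASS = {"force": "hole", "regret": "hole", "realize": "hole",
              "tell": "plug", "ask": "plug", "order": "plug", "say": "plug"}

def propagate(verb, subject, complement_inferences):
    """'Holes' pass the complement's inferences through unchanged;
    'plugs' re-embed them as reported beliefs of the subject."""
    if VERB_CLASS[verb] == "hole":
        return list(complement_inferences)
    return [f"{subject} believes that {inf}" for inf in complement_inferences]

print(propagate("force", "Jim", ["Mary was asleep immediately before noon"]))
# -> ['Mary was asleep immediately before noon']
print(propagate("tell", "John", ["Jim has children"]))
# -> ['John believes that Jim has children']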
Now the sentences 'Jim has children', 'George used to beat his wife', 'The children will be asleep at noon' and 'John was in Boston' can be inferred, respectively, from the sentences in (17), but only as the reported beliefs of John. An interesting fact, which should also be accounted for in the calculus of inferences, is that when the verbs called 'plugs' become negated, the denial 'cancels' the embedded proposition with its inferences (such is the semantic function of a denial in general), with the exception of those inferences which are not sensitive to propositional attitudes (and therefore to the denial), and which thus remain unchanged. Both sentences: (18)
John has not told me that Jim's children are sick.
John did not ask George to stop beating his wife.
imply the purported belief of the speaker that Jim has children and that George used to beat his wife. Another interesting fact is that 'plugs' which correspond to verbs of propositional attitude, and which obviously have an effect on the asserted proposition (instead of being asserted it acquires the corresponding propositional attitude), have no effect on the remaining inferences. They are unchanged if the speaker is at the same time the subject of the verb in question10, or they become modified as the reported beliefs of the individual denoted by the subject (as in the case of the verba dicendi in (17)). In order to make clear in our examples which propositions are asserted, I will indicate the main stress. (19)
I doubt if John's children went to Boston yesterday.
The speaker's purported beliefs are that John has children and that John's children went to Boston. (20)
Mary doubts if John's children went to Boston yesterday.
If Mary is reported to have doubts as to the date of John's children's trip to Boston, she must have assumed that John has children and that his children did go to Boston. This is something that we can infer from such a sentence. There is obviously a lot more to investigate in this respect, but what I want to show is that 'plugs' do not always cancel the inferences of their complements. The main argument for cancelling the inferences in the case of 'plugs' was that the individual denoted by the subject could be misinformed. However, the same argument applies mutatis mutandis to the speaker and his purported beliefs—he can be misinformed as well. And according to our interpretation we take all sentences, with all their inferences, as relative truths in any case. Let me now say a few words on the interpretation of if ... then and (either) ... or sentences, and the filtering conditions established for such
10 Karttunen (1973) has observed this with respect to presuppositions, but the conditions become still more satisfying when we deal with the other inferences as well.
sentences (Karttunen, 1971, Lightfoot, 1971)11. Generally speaking, the filtering conditions tell us which inferences hold true of a given state of affairs, that is, which inferences are not filtered out by the scope of the element if or or, which indicate that reference is made to a possible state of affairs. The rules are very important for the calculus of inferences in general, but in my understanding of the semantic interpretation of sentences, they leave a very important aspect of the semantic interpretation of such sentences aside. We cannot ignore the inferences that hold true of the state of affairs that is claimed to be possible (or counterfactual) by the speaker, as this is what he intends to convey; this is his semantic message. The inferences of sentences, which can be accounted for by meaning postulates based on the notion of strict implication (or the corresponding pragmatic definition), are correct inferences independently of the actual state of affairs. For they depend solely on the meaning of linguistic expressions. The proposed definitions are, then, suitable even in the case of counterfactuals or sentences of the type 'Imagine (suppose) that S'. The inferences are then valid in the possible state of affairs as described by the speaker by the contents of the if-clause. And this is exactly the intended meaning of such sentences12. The interesting thing is that the interpretation is twofold. Some inferences, which are not sensitive to the scope of if, will be said to hold true of the actual state of affairs; the remaining inferences will be said to hold true of the state of affairs claimed by the speaker to be possible.
11 See footnote 3.
12 Notice that the semantic interpretation of the relation holding between the two clauses in sentences of the form 'If A then B' is not an inferential relation which we have to account for by linguistic rules. If a speaker says 'If A then B', it would be incorrect to say that B is inferred, entailed or implied, in any sense, by A. It is the speaker who claims that there is a dependency of B on A, and that, if A is the case, then B is (or will be) the case. A speaker may say 'If it rains I will go to the movies' as well as 'If it rains I will not go to the movies'. The relation between A and B, then, cannot, and does not need to, receive linguistic explanation. The connection between A and B is part of the claim of the speaker, and may depend on various considerations which may be absolutely unpredictable by the hearer. The speaker, in fact, introduces a new premise in the if-clause, which shifts the actual state of affairs the speaker was referring to into a possible (or sometimes counterfactual) state of affairs which is then referred to. The speaker then argues what would follow (or what would have followed) in such a state of affairs. Suppose someone says: 'If René Lévesque wins the elections, then Quebec will be independent, French will be the official language', etc. The consequent clauses are probably correct inferences, but they are not of linguistic relevance, for they are based on additional premises that pertain to the factual knowledge of the world (for the above example); the consequent clauses may be based on just what the speaker intends to do in the case described by the if-clause (as was shown in the former example). From a linguistic standpoint, it seems to me that it would be sufficient to add a meaning postulate associated with all sentences of the form 'If A then B', to the effect that the speaker asserts that B is dependent on A (where 'asserts' is an abbreviation for 'intends to inform the addressee' (Bellert, 1972)).
Consider the sentence: (21)
If Bill remains in Boston, John's son will kill him in May.
The inferences constituting the speaker's purported beliefs concerning the actual state of affairs are the following: 'Bill is in Boston at the present time' and 'John has a son'. The inferences constituting the speaker's purported beliefs concerning a possible state of affairs are: 'Bill continues to stay in Boston', 'Bill will not be alive after May'. The example has not been analysed in precise terms, but it is used only to serve the point. The distinction between the two states of affairs is necessary, and both types of inferences are equally important for the semantic interpretation of such sentences—the semantic interpretation of a sentence being conceived of as a set of inferences or conclusions that can be drawn from that sentence and from a set of implicational rules (meaning postulates) pertinent to the syntactic and lexical characterization of that sentence.13

Bibliography

BELLERT, IRENA (1968), On a condition of the coherence of texts. International Symposium on Semiotics, Warsaw. (Reprinted in Semiotica 2.4, 1970)
— (1969), Arguments and predicates in the logico-semantic structure of utterances. Dordrecht, Holland: Reidel.
— (1970), On the use of linguistic quantifying operators. COLING, Sanga Saby, 1969. Reprinted in Poetics.
— (1972), On the logico-semantic structure of utterances. Wroclaw, Poland: Ossolineum.
CARNAP, RUDOLF (1947), Meaning and necessity. Chicago, Ill.: University of Chicago Press.
KARTTUNEN, LAURI (1971), The logic of English predicate complement constructions. IULC mimeographed.
— (1972), Presuppositions of compound sentences. Linguistic Inquiry 4.2.
13 The present text is an extended version of my paper "On the Application of Meaning Postulates to Linguistic Description", presented at the LSA Meeting in San Diego, 1973. I wish to express my indebtedness to David Lightfoot for his critical comments, owing to which I was able to clarify some of my points and eliminate some of the defects of my text. I also feel indebted to several other people who have discussed the paper and made comments. It is difficult, however, to realize to what extent the ideas and the improvements on my earlier concepts have been affected by the criticism I have received. In any case I would like to acknowledge the fact that some of Lieb's critical remarks (H. Lieb, "Grammars as Theories", Theoretical Linguistics, No. 1-2, 1974) have helped me to realize where I was wrong in my formulations (in "Theory of Language as an Interpreted Formal Theory", Proceedings of the 11th International Congress of Linguists, Bologna, 1972). Although I still disagree with Lieb on some points, I would like to express my indebtedness to him for his patience in going through the detailed and sometimes incorrect formulations of my earlier paper, which will certainly help me in further developing my concepts.
KEENAN, EDWARD (1970), A logical base for a transformational grammar of English. Philadelphia: TDAP 82.
LAKOFF, GEORGE (1970), Linguistics and natural logic. Mimeographed. Reprinted in Synthese 22, No. 1-2.
LIGHTFOOT, DAVID (1971), Notes on entailment and universal quantifiers. Papers in Linguistics 5, No. 2.
VAN FRAASSEN, B.C. (1968), Presupposition, implication and self-reference. Journal of Philosophy 65.
VENDLER, ZENO (1967), Linguistics in philosophy. Ithaca, N.Y.: Cornell University Press.
S. D. ISARD

WHAT WOULD YOU HAVE DONE IF... ?
In this paper I formulate principles to account for some of the ways in which tense, mood, aspect and modal verbs are used in English, and describe a computer program which operates according to these principles. The program is capable of playing a game of tic-tac-toe (noughts and crosses) and answering questions about the course of the game. In particular, it is able to discuss hypothetical situations, both past and future, and to answer questions about possible, as well as actual, events. One of the main ideas on which the program is based is a "pronominal" account of both tense and mood as forms of definite reference to previously mentioned situations.
1. Introduction
In this paper I shall put forth some ideas on the use of tense, mood, aspect and modal verbs in English, and describe a computer program which embodies these ideas. The program is capable of playing a game of tic-tac-toe (noughts and crosses) and answering questions about the course of the game, in particular questions having the form of the paper's title. Before plunging into details, I would like to discuss briefly the point of writing such a computer program. To begin with, it is intended as an expository device, rather than a useful, or even potentially useful, piece of technology. It is a small working model, meant to illustrate some abstract proposals. As such, it is supposed not only to perform its task, but to do so in a way that is comprehensible and, hopefully, illuminating to an observer. A computer program cannot constitute a theory in itself, but it is often easier to grasp a theory through consideration of a detailed example than by starting with abstractions. Unfortunately, most readers of this paper will not have direct access to the program itself, for one reason or another, and will find themselves in a position where they must accept on faith that it actually exists and does what is claimed of it. Such readers are, in fact, being presented with whatever I may have to offer through the traditional medium of the ordinary language essay. However, even without the intention of actually writing a program, one can see the temptation to resort to computing terminology in discussing any process as complex as language use.
Our everyday vocabularies just don't seem to be very rich in the right sorts of words, and computational concepts, "calling a subroutine", for example, or "assigning a temporary value to a variable", provide us with the best, if still inadequate, source of process metaphors that we have. What the program itself then contributes is a demonstration that one's ideas about the process being modelled are at least coherent, that they can be filled in in detail, even if they ultimately turn out to be wrong. In the circumstances, this sort of demonstration is really of more direct benefit to the producer of ideas than to the consumer, but it is to be hoped that he will profit too in finding the wares in the marketplace better developed and less woolly than they might otherwise have been. To give a somewhat better idea of the range of English, and subject matter, that the program is concerned with, I would like to introduce a sample of its performance. Before I can do this, however, I must make the admission that it does not actually play tic-tac-toe in that game's classic form. Instead, it plays a variant in which, rather than alternately putting marks (0 and X) on a three-by-three board

[fig. 1: a three-by-three board with an X and a 0]

the players choose numbers between one and nine. If we consider these numbers arrayed as an appropriate "magic square"

2 7 6
9 5 1
4 3 8

fig. 2

the game's original objective, to make three marks in a straight line, translates itself into the goal of taking three numbers which add up to fifteen. The program and its opponent communicate by means of a teletype. The game is played in the players' heads, so to speak, but after each of its own moves, and at other times if requested, the program signals its understanding of the current state of play by typing out the magic square with the numbers it has taken replaced by "0"s and those that its opponent has taken replaced by "X"s. Here is a sample conversation. It exhibits all the important syntactic constructions and most of the vocabulary with which the program can cope. The user's remarks are prefaced by U and the program's by P. These symbols do not actually appear on the teletype, nor do the parenthesized numbers, which are for future reference. Our teletype does not possess a question mark, but they are inserted here for ease of reading, as are commas, which the program does not use.
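The equivalence the square encodes is easy to check mechanically; the following minimal Python sketch (mine, not part of Isard's POP-2 program) tests whether a player's numbers contain a winning triple:

from itertools import combinations

def has_won(taken):
    """True iff some three of the player's numbers sum to fifteen,
    i.e. form a line of the magic square."""
    return any(sum(triple) == 15 for triple in combinations(taken, 3))

print(has_won({2, 5, 8}))      # True: 2 + 5 + 8 = 15, a diagonal of the square
print(has_won({2, 7, 9, 1}))   # False: no triple sums to fifteen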
The conversation begins before either player has made a move.

U: Display the board.

P: 2 7 6
9 5 1
4 3 8

U: What have you taken?
P: Nothing.

U: Move.

P: 0 7 6
9 5 1
4 3 8

U: I take five.

U: You move.

P: 0 7 6
9 X 1
4 3 0

(1) U: Had you taken eight when I took five?
P: No.

(2) U: If I had taken six when I took five, would you have taken what you did?
P: No.

(3) U: What would you have taken?
P: 4

(4) U: Have you taken two?
P: Yes.

(5) U: Could you have taken two when I took five?
P: No.

(6) U: Did you take seven?
P: No.

(7) U: What did you take?
P: Eight.

U: If I had taken four when I took five, what would you have done?
P: Take six.

U: Could you have won?
P: Yes.

U: Will you win?
P: Dunno.

U: Move.
P: I can't move now.

U: I take eight.
P: I took eight.

U: Had you taken eight when I took two?
P: You haven't taken 2.

U: Had you taken two?
P: Under what circumstances?

U: When you took eight.
P: Yes.

U: Can I win?
P: No.
2. Syntax

2.1. Translating vs. obeying
In describing how the system goes about responding to an English sentence, it will be helpful to distinguish between two modes of operation—a "translation" mode, in which the English is turned into computer instructions, and an "obeying" mode, in which the operations specified by these instructions are actually carried out. In computer terminology, this is essentially the distinction between compiling a program and running it. In this model it is supposed to correspond approximately to the distinction between a hearer's understanding what a sentence calls upon him to do—answer a question, say, or believe a statement—and his actually doing it. This is not, in fact, an altogether clear-cut distinction in the use of natural language. Consider, for instance, the process of establishing the referent of "that man over there in the corner" in a question such as "Is that man over there in the corner the Professor of Paleontology?" Identifying the man allows you to reduce the question to whether he is the Professor of Paleontology (and once he has been identified, it won't make any difference if he moves out of the corner and wanders about the room while you ponder the matter). In this sense, making the identification is part of finding out what the question is, and so to be viewed as part of the translation, or compiling, process. On the other hand, there is a sense in which it is an "obeying" action. For one thing, you could refuse to do it. You could simply not deign to figure out who was specified by the phrase "that man over there in the corner"—you could avert your gaze—and you could do this while understanding the words perfectly and knowing exactly how you would go about discovering who was indicated if you felt like it.
The moral, then, is that although there is a distinction to be made between comprehending instructions and carrying them out, there will be times when both sorts of process are going on simultaneously, when one instruction is being carried out in order to make clear the content of another. One consequence of this in the operation of our model is that for many, perhaps most, of the sentences that it treats, there never appears anything that could properly be called a translation of the entire sentence. That is, if it were to treat sentences about suspected paleontologists, which it does not, it would at some stage cook up instructions to itself to find out who was in the corner, and it would carry these out, with the result, let us say, that he is identified as Jones. It would next pose itself the question whether Jones is Professor of Paleontology, and, if it happens that he is not, there could be a final instruction to say "no". Of these three instructions, the one which discovers whether Jones is Professor of Paleontology would seem to come closest to capturing the sense of the question, but it misses out the information that the original sentence referred to him as the man in the corner. The identical instruction might arise in the process of answering a question which called him "that chap with the extraordinarily thick spectacles". One might, in analysing the system's operation, try to display all of the instructions at various levels on a tree, and let the tree as a whole represent the system's understanding of the sentence. Whatever the possible merits of such a notation, the system itself does not employ it because, as a model of a naive language user, it is not attempting to analyse its own operation.
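In programming terms, the distinction can be mimicked by having a 'translator' return a parameterless procedure which a separate step then calls. A minimal Python sketch (mine; POP-2 is not widely available, and the one-sentence 'grammar' is invented):

def translate(sentence):
    """'Compile' a sentence into a parameterless procedure.
    Nothing is carried out at this stage."""
    if sentence == "Display the board.":
        return lambda: print("2 7 6\n9 5 1\n4 3 8")
    raise ValueError("untranslatable sentence")

instruction = translate("Display the board.")  # understanding the request
instruction()                                  # obeying it: a separate act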
2.2. Closure functions
I would like now to discuss the form of the instructions which the program derives for itself from the English sentences it is given. To do this, it is necessary to introduce the notion of a closure function. In fact I shall go back one step further and begin by saying that (plain, old) function is the name given to the fundamental unit of program in the programming language POP-2 (Burstall et al., 1971), in which the system is written. If you want to accomplish something with POP-2, you apply (or, equivalently, run or call) a function. The programming language comes equipped with a set of basic functions, including, for instance, addition and multiplication, and the activity of programming consists of combining, in various ways, functions you already have, to make new and more complicated ones. In general, functions require arguments on which to work. We can't call the addition function without supplying it with two numbers to add together. The number of arguments required will vary from function to function, and there are some, like the one which turns off the machine, which require none at all. Now, given a function, PLUS, say, which requires two arguments, we can, in POP-2, construct a new function by "freezing" one of the arguments of PLUS at some
specific number, say 1. That is, the new function, written PLUS(%1%), will require only one argument, and what it will do is to add 1 to this argument. Or, to put it another way, it is just like PLUS except that it supplies its own second argument, i.e. 1. A function constructed by "freezing" arguments in this manner is called a closure function. To choose an example more pertinent to the matter at hand, the tic-tac-toe playing system has a function TAKE, which requires two arguments, a player and a number, and whose effect is to transfer the number from the list of those not yet taken to the list of those taken by the player (or to print a rude remark if the number is no longer available to be taken). The closure function TAKE(%7%) then only requires a player as argument and has the effect of transferring 7 to his list. Where TAKE is used to translate the (two-argument) English verb "take", TAKE(%7%) corresponds to a (one-argument) verb phrase. Let me stress here the fact that TAKE(%7%) is a function, the same sort of beast as TAKE itself, the percentage signs being introduced so as to avoid a possible confusion arising from the fact that in ordinary mathematical notation, a term of the form F(x) represents the result of applying the function F to the argument x, and not, in general, another function. Sin(0) is a number, 0 in fact, whereas sin(%0%) is a function—a rather silly one to be sure, but a function nonetheless. One consequence of this is that we can, if we wish, repeat the process of freezing arguments and from TAKE(%7%) construct, e.g., TAKE(%7%)(%I%), which fixes both player and number and, when applied, wants no arguments at all, but simply awards 7 to me.
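A minimal Python rendering of these closure functions may help (mine, for illustration; Python's functools.partial freezes leading arguments, whereas POP-2 closures supply trailing ones, so a small helper is used instead):

def freeze(func, *frozen):
    """Mimic a POP-2 closure f(%x%): supply the *last* arguments of func.
    The original function and the frozen arguments remain inspectable,
    as the text says POP-2 closures do."""
    def closure(*args):
        return func(*(args + frozen))
    closure.func, closure.frozen = func, frozen
    return closure

def take(player, number):
    print(f"{player} takes {number}")

take7 = freeze(take, 7)                   # like TAKE(%7%): wants only a player
take7("I")                                # prints: I takes 7
print(take7.func is take, take7.frozen)   # True (7,)

award7_to_me = freeze(take7, "I")         # like TAKE(%7%)(%I%): wants no arguments
award7_to_me()                            # prints: I takes 7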
2.3. Translation of clauses
Now, if TAKE(%7%) is to be compared to a verb phrase, and we further add a subject, to form TAKE(%7%)(%I%), we get something which might serve as the translation of a phrase like "my taking of seven", "that I take seven" or "for me to take seven". It falls short of representing a full clause since it does not contain, within itself, any indication of the circumstances to which it is meant to apply, past, present, or future, real or imaginary. As a result, in the translation of a sentence like "Have I taken seven?" it finds itself placed as an argument in a larger structure HAVE(%TAKE(%7%)(%I%)%), where the role of HAVE is to search (the machine's memory of) the past for an event of the sort specified by its argument, in this case an occasion on which I took seven. Without going too far into the details of the way in which HAVE is programmed, we can note here that in trying to decide whether such an event has occurred, it is able to look at the component parts of the closure function given to it as argument, and so seek out the agent, action, etc. of the desired event. This is because there are basic functions, supplied with the language, which will identify the original function and the frozen arguments out of which a closure
function has been constructed. In fact, given a closure function, we are not restricted to just looking at its components, but we can also alter the function by replacing them with new values. This ability turns out to be useful in the translation procedure. It is by iterating the process of embedding one closure function as the argument of another that the program builds up its translations of English clauses. Thus the function corresponding to "would you have taken seven" is finally SUBJUNCT(%WILL(%HAVE(%TAKE(%7%)(%YOU%)%)%)%), where each of SUBJUNCT, WILL and HAVE is a function taking a single closure function as argument, and the expression as a whole therefore represents a function of no argument. In fact, every clause is given a translation of this general form, which is to say a matrix of verb and one or two arguments embedded successively in functions which are classified as an aspect, a modal verb and a tense. This is true even for sentences such as "I take four", which do not exhibit any overt signs of aspect or modal verb. The modal and aspect functions employed in such cases are dummy ones, which do no real work when called, but simply call the next function in line. Thus the translation of "I take seven" is actually PRES(%NOMODAL(%NOASPECT(%TAKE(%7%)(%I%)%)%)%), where the effect of calling NOMODAL(%NOASPECT(%TAKE(%7%)(%I%)%)%) is to call NOASPECT(%TAKE(%7%)(%I%)%), which in turn calls TAKE(%7%)(%I%). These dummy functions were introduced in the interest of making the translation program, if not its output, somewhat simpler and easier to read, and they are not meant to have any theoretical significance. It is just that it is convenient to be able to furnish each clause with a fixed number of slots, and to know that, say, the main verb belongs in the fourth one from the top, both when deciding what to do when the main verb is encountered and also when it is necessary in making some later decision to ask what the main verb is. In what follows, I shall normally omit dummies when giving examples of the translator's output.
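The slot structure can be sketched in Python as follows (my illustration of the layering, not Isard's POP-2; the print statement stands in for the real work of the matrix verb):

# Each clause is a chain tense(modal(aspect(matrix))); dummies pass through.
def NOMODAL(body):  return lambda: body()
def NOASPECT(body): return lambda: body()
def PRES(body):     return lambda: body()   # present tense does no extra work here

def TAKE(number):
    return lambda player: (lambda: print(f"{player} takes {number}"))

matrix = TAKE(7)("I")                       # like TAKE(%7%)(%I%)
clause = PRES(NOMODAL(NOASPECT(matrix)))    # translation of "I take seven"
clause()                                    # running it prints: I takes 7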
2.4. Syntactic trees
We are now in a position to discuss how the translation procedure works. What it produces, in fact, is translations of clauses, rather than of whole sentences. This distinction does not, of course, make itself felt until one begins to consider sentences with more than one clause. What happens in such cases is that when the translation of a clause is complete, it is run—the instructions into which it has been translated are obeyed. The fact that one clause is embedded in another is reflected not in the translation of either—as we have already noted:— but rather in the fact that the translation and running of a subordinate clause are treated as a subprocess in the translation of a higher clause. That is, for a sentence such as "I took what you would have taken", one can represent not the translation procedure's output, but its operation with a tree diagram such as
[fig. 3: a tree diagram in which a CLAUSE node dominates NP, V and NP; the object NP dominates REL and an embedded CLAUSE, which in turn dominates MODAL, ASPECT and V]
where each node stands for a function which is called within the program, and the nodes lower down the tree are invoked during the operation of the ones higher up. While running the clause translator, we are called upon to run the nounphrase translator, the verb translator and the nounphrase translator again, and this second time the nounphrase translator itself calls the clause translator, and so on. There is, of course, nothing novel about relating the syntactic analysis of a sentence to the operation of an automaton in this way. In fact standard proofs that pushdown store automata can parse context free languages (see, e.g. Hopcroft and Ullman (1969)) exploit a correspondence between the parse tree of a string and the sequence of states through which the store of the automaton passes in processing it. Now the sort of tree whose absence is explained away in this fashion corresponds to what would normally be called surface structure. One can also find a correlate for a deeper level of analysis in the sequence of operations that are performed as the compiled programs are run. For it is at this level that sentences which would normally be said to have the same underlying representation receive similar treatment. For example, when faced with the sentence (8)
When you took seven, had I taken two?
the system first translates the words "when you took seven", then executes the resulting instructions, before translating the remaining words. In the case of (9)
Had I taken two when you took seven?
the translation of "had I taken two" is performed first, but execution of the result is postponed until after both translation and execution of the subordinate clause. In both cases, then, we execute the translation of "when you took seven" before the translation of "had I taken two", and in running the translations of
clauses in an order different from the one in which they are formed, the system might be viewed as performing grammatical transformations. However, the structures being transformed will have their existence only in the eye of the beholder, and not in the machine itself. This strikes me as a relatively happy situation from the point of view both of the linguist who would like his structures and transformations to have some sort of psychological reality, and the psychologist who would like to make use of the linguist's insights, but who is reluctant to postulate operations on trees going on in the head. Unfortunately, the sample of English understood by this program is small and far from exhibiting all of the complexities of English syntax, to say nothing of meaning, and so it can only be used as an illustration of the proposed relation between syntactic structures and the process of comprehension, and not as evidence for it.
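The difference in the order of translation and execution for (8) and (9) can be made explicit with a schematic trace (my illustration; the strings merely name the steps):

def trace(sentence):
    """Order of operations for the two clause orders discussed above."""
    if sentence == "When you took seven, had I taken two?":
        return ["translate 'when you took seven'",
                "run it (this sets the past referent)",
                "translate 'had I taken two'",
                "run 'had I taken two'"]
    if sentence == "Had I taken two when you took seven?":
        return ["translate 'had I taken two'",            # translated first ...
                "translate 'when you took seven'",
                "run it (this sets the past referent)",
                "run 'had I taken two'"]                  # ... but executed last

for step in trace("Had I taken two when you took seven?"):
    print(step)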
2.5. The translation procedure
Translation is actually carried out by associating with each word a list of procedures to be run whenever the word is recognized. These procedures correspond, in the main, to the syntactic category of the word. They operate on closure functions, initially filled with dummies as described above, and in the most straightforward sort of case, simply replace one of the dummies with a "real" function. For example the verb "take" is assigned just the procedure V. When "take" is encountered this has the effect of replacing the dummy verb of the closure function with the function TAKE. In general, each verb has an associated "semantic" function, usually with the verb itself as its name, and the V procedure puts the semantic function of whatever word has just been recognized into the verb position of the closure function. A verb can also have other procedures as well as V on its list. "Win", for example, has both V and INTRANS, with the effect that after V inserts the WIN function in the verb position, INTRANS removes the object position from the closure function. There are similar procedures which replace the dummies in the tense, modal, aspect, subject and object positions when appropriate words are encountered. These procedures are also able to terminate the clause being translated and begin a fresh one if the position that they are trying to fill is already occupied. The tense, modal and aspect procedures also check positions which should not be filled until after theirs in a given clause. For instance, in the translation of "If I take six, will you win?" the modal procedure activated by "will" notes that TAKE has already replaced the dummy verb of the "if" clause and, since a modal cannot follow the main verb, it ends the translation of the "if" clause, runs it, and puts WILL into the modal position of a new clause. (The decision to terminate the "if" clause
cannot be taken before "will" is encountered because of the possibility of a subordinate "when" clause—as in "If I take six when you take five, will you win?"—which would have to be dealt with before the "if" clause could be considered finished.)
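A fragment of the word-to-procedures dispatch might look as follows in Python (mine; the dictionary of slots is an invented stand-in for the closure functions):

# Slot-filling procedures triggered by recognized words (illustrative only).
def V(slots, word):        slots["verb"] = word.upper()
def INTRANS(slots, word):  slots.pop("object", None)   # drop the object slot

PROCEDURES = {"take": [V], "win": [V, INTRANS]}

slots = {"tense": "PRES", "verb": "DUMMY", "object": "DUMMY"}
for proc in PROCEDURES["win"]:
    proc(slots, "win")
print(slots)   # {'tense': 'PRES', 'verb': 'WIN'}: object position removed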
2.6. Tense and mood markers
The tense position is capable of being occupied by functions which correspond to mood as well as tense. Its initial situation is different from that of the other positions in that it starts off with a "default" value of PRES rather than a dummy value. This corresponds to the common claim by linguists (see, e.g., Lyons (1968)) that the present tense is an unmarked case in English, and means in effect that a clause is assumed to be present tense and non-subjunctive unless we encounter explicit evidence to the contrary. Verbs carrying markers of other tenses and moods set off both the procedures associated with the verb itself, and those associated with the marker. Actually, there is only one function that can supplant PRES at this stage, and that is REMOTE. That is to say that a verb form like "took" is not considered to be ambiguous between a past form which appears in "I took seven" and a subjunctive form which appears in "If I took seven, what would you do?". The system knows only a single "remote" form. However, the interpretation placed on this remote form differs with the circumstances in which it finds itself, and so the function REMOTE has two essentially different modes of behaviour. It can behave like a past tense and it can behave like a subjunctive. I shall spell out what it means to "behave like" a past tense or subjunctive in greater detail later on, but the distinction I have in mind is the ordinary one, expressed in informal terms by saying that the past tense is used to refer to actual past situations, while the subjunctive is used for hypothetical situations. In "if" clauses, the remote form is behaving as a subjunctive if replacing it by "were to" plus the infinitive produces a paraphrase. That is, "became" in (10)
If he became Prime Minister, there would be a revolution,
serves as a subjunctive, and the sentence is equivalent to (11)
If he were to become Prime Minister, there would be a revolution.
On the other hand, in (12)
If he became Prime Minister, then my history book is wrong,
"became" serves as a past tense form and there is no paraphrase (13)
If he were to become Prime Minister, then my history book is wrong.
The decision as to which way a given instance of REMOTE should behave is made by the procedures attached to other parts of the sentence. In the program's
present form, REMOTE is initially set to behave like a past tense, but when a sentence appears which contains both "if" and the remote form of a modal verb, i.e. "would", "could", "might", REMOTE begins to behave like a subjunctive when the translation of the "if" clause is run. The idea behind this initially complicated-looking condition is that the "if" clause conjures up a hypothetical situation to which subsequent remote modal forms refer. Any clauses whose translations are run before that of the "if" clause cannot refer to the hypothetical situation because it hasn't been created yet. The remote form continues to be taken as subjunctive in succeeding sentences as long as they contain remote modal forms. REMOTE reverts to being a past tense when a sentence is encountered which does not contain such a form. Thus in (1) of the sample dialogue the remote form is interpreted as a past tense. (2) contains the words "if" and "would", so that REMOTE behaves like a subjunctive in the "if" clause, and in the main clause, whose translation is run after that of its subordinate clauses. However, the "when" clause is subordinate to the "if" clause, and so its translation is run first, with the result that REMOTE is still a past tense in the "when" clause. (3) contains the remote modal form "would", so the subjunctive interpretation persists, but (4) does not contain a remote modal form, so that REMOTE reverts to being a past tense in (5). Note that if (5) had followed (3) directly, then it would have been natural to take it as a subjunctive, and a continued reference to the hypothetical case raised in (2). The "cancelling" effect of change of tense, which in this example operates through (4) to interrupt reference to the hypothetical case and make (5) a reference to the actual past, was pointed out to me by Christopher Longuet-Higgins (as was the example about the Prime Minister). There is one sort of clause which the program treats as an exception to the principles just set out and in which REMOTE acts as a past tense under all circumstances, but without interrupting the subjunctive interpretation for further clauses. This is the case of "what" clauses, either question or relative, such as "what you did" or "what did you do", which contain a remote tense marker, but no modal. Thus, in a sentence like "If I had taken four, would you have taken what you did?", the "if" clause is translated first, and its translation run, with REMOTE acting as a subjunctive. The sub-clause "what you did" of the main clause is the next to have its translation run, and here the remote marker acts as a reference to the actual past. But in the main clause, REMOTE acts as a subjunctive once again. In this way we get the desired effect of comparing what actually happened with what would have happened under different conditions. Unfortunately, it is not always correct to take the remote form in such "what" clauses as a reference to the actual past. The "what we said" in "If he became Prime Minister he would ignore what we said" can perfectly well be used to refer to what we would say in the hypothetical circumstance of his becoming Prime Minister, as well as to an actual past utterance.
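The switching regime just described amounts to a small state machine over successive sentences. A hedged Python sketch (mine; real parsing is replaced by word lists):

REMOTE_MODALS = {"would", "could", "might"}

def remote_modes(sentences):
    """Yield, per sentence, how REMOTE is read under the rules above."""
    mode = "past"
    for words in sentences:
        if "if" in words and REMOTE_MODALS & set(words):
            mode = "subjunctive"              # an "if" plus a remote modal
        elif not REMOTE_MODALS & set(words):
            mode = "past"                     # no remote modal: revert
        yield mode                            # otherwise the mode persists

dialogue = [
    "had you taken eight when I took five".split(),           # (1)
    "if I had taken six when I took five would you".split(),  # (2)
    "what would you have taken".split(),                      # (3)
    "have you taken two".split(),                             # (4)
    "could you have taken two when I took five".split(),      # (5)
]
print(list(remote_modes(dialogue)))
# ['past', 'subjunctive', 'subjunctive', 'past', 'past']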
2.7. Binders
It still remains to discuss the syntactic procedures associated with the words "if", "when" and "what", which the program classes together as "binders". These procedures are quite similar in that they all bring about the translation of a sub-clause—the clause translator is called recursively—and then they apply a semantic function to the result. Thus in the translation of (2), the occurrence of "when" causes the clause translator to turn the succeeding words "I took five" into the function REMOTE(%TAKE(%5%)(%I%)%), to which a function WHEN is then applied. The procedure associated with "what" is slightly different in that the word is effectively decomposed into a word DAT, which gets inserted into the subject or object position in the sub-clause, and a function WH which is applied to the translation of the sub-clause. "What did you take?" then causes an application of WH to REMOTE(%TAKE(%DAT%)(%YOU%)%). Other differences among the binder procedures arise from the fact that in this context, an "if" clause can have a subordinate "when" clause, but not vice versa, and that relative clauses, but not questions, beginning with "what" can appear inside the other two kinds of clause. (The program does not deal with "when" questions.)
3. Semantics

3.1. Situations
We now turn to what the programs into which the English has been translated actually do. Very informally, we might say that they are programs for obtaining information about the machine's state of mind, and for altering this state of mind. Since the machine never thinks about anything but tic-tac-toe, and even about that with no great profundity, its states of mind can be described in relatively simple fashion. The central concept in its world view is that of a situation, which is basically a state of the game. For most purposes, situations are determined by the values of six variables in the program. These variables can be thought of as pigeonholes whose contents can be looked at, to determine the answers to questions, or replaced, by programs which alter the situations, or, rather, the machine's mental image of them. The particular variables at stake here are:

(1) MINE, which contains the list of numbers which have been taken by the machine.
(2) HIS, which contains the list of numbers taken by the opponent.
(3) REST, holding the list of numbers not yet taken.
(4) MEMORY, a list of the moves made so far, in the order in which they were made.
(5) TURN, which specifies whose turn it is.
(6) INTRAIN, which can specify that a particular move is in progress, or can be undefined if the situation to be represented lies between moves.
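In a modern notation the six variables amount to a record type; a minimal Python sketch (mine; Isard's program keeps them as separate POP-2 variables rather than as a record):

from dataclasses import dataclass, field
from typing import Optional, Tuple, List

@dataclass
class Situation:
    """A state of the game, as determined by the six variables above."""
    mine:    List[int] = field(default_factory=list)   # numbers taken by the machine
    his:     List[int] = field(default_factory=list)   # numbers taken by the opponent
    rest:    List[int] = field(default_factory=lambda: list(range(1, 10)))
    memory:  List[Tuple[str, int]] = field(default_factory=list)  # moves so far, in order
    turn:    str = "opponent"                           # whose turn it is
    intrain: Optional[Tuple[str, int]] = None           # a move in progress, if any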
It is evident that there is considerable redundancy in this form of representation, but it is convenient to have such information as whose turn it is ready to hand, and not have to work it out each time it is wanted. There is also one further variable, FOCUSSED, which is part of the machine's concept of a situation, but it is of interest only in special cases, and I shall postpone discussion of it until later. At any given moment, the system can have only one situation under active consideration, in the sense that the crucial variables have one set of values, and no others. However, it can also have other situations "in mind", to be turned to, or returned to, when the occasion demands. In particular, we want to be able to remember the actual board position while considering hypothetical cases. There are two ways of keeping situations "in mind". The first is by means of a function called SPOSE. The effect of SPOSE is to copy down the values of the variables in the situation currently under consideration so that they can be reinstated at a prearranged point later on. Within the scope of a SPOSE, it is safe to tamper with the situation without losing track of reality. SPOSE appears typically in contexts like SPOSE TAKE(I, 7), which would be part of the program corresponding to the clause "if I take seven". The other way to store a situation is to create a function which, when called, will set up the situation by giving the crucial variables the right values. REMOTE uses a function of this sort when it behaves like a past tense. Its effect is to set up the past situation currently under discussion, so that the rest of the sentence can be considered with respect to it. Thus, when it comes time to run the translation of (6), which, omitting dummies, amounts to REMOTE(%TAKE(%7%)(%YOU%)%), the REMOTE in the tense position recreates the board position
0 7 6
9 5 1
4 3 8

fig. 4

referred to in (5) by the clause "when I took five."
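SPOSE, in particular, is naturally rendered as a snapshot-and-restore construct; a minimal Python sketch (mine, not Isard's POP-2; it assumes the Situation record sketched above, and the commented call to a take procedure is hypothetical):

import copy
from contextlib import contextmanager

@contextmanager
def spose(situation):
    """Like SPOSE: copy down the current values, let the body tamper
    with them, and reinstate them at the prearranged point (on exit)."""
    saved = copy.deepcopy(vars(situation))
    try:
        yield situation
    finally:
        vars(situation).update(saved)

# with spose(s):          # "if I take seven" ...
#     take(s, "I", 7)     # hypothetical move, undone when the block ends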
3.2. Tenses as pronouns
It is this sort of behaviour on the part of the past tense marker that makes McCawley (1971) refer to it as a kind of pronoun. It acts as a form of definite reference to a past situation on which the attention of the conversants has recently been focussed, usually by a previous mention. If you ask me "What did you do after your lecture this morning?" and I reply "I ate lunch", I mean that at the particular past time we are talking about I ate lunch, in just the same way that if you ask me "Is your brother right or left-handed?" and I reply "He is left-handed", I mean the particular man you asked me about. By way of contrast, the reply "I have eaten lunch" seems irrelevant or evasive, like "Someone is left-handed", because it doesn't make reference to the particular time in question. We might note in this regard that those operators in formal tense logics which are meant to be read as "at some time in the past it was true that" (see, e.g. Prior (1967)) do a job which is usually performed in English by the perfective aspect, "have" plus past participle, rather than the past tense. The program distinguishes between its HAVE and PAST functions. HAVE appears in the translation of a sentence like "Have I taken two?" and searches the memory for any occasion on which I took two, while PAST appears in the translation of "Did I take two?" and goes to a specific past situation to see whether I took two then. Both functions come into play in the program's translations of past perfect sentences—"Had I taken two?"—where we first move back to a point in the past, and then search the range of moves before it. However, it is also possible to use past perfect sentences in such a way that the right translation would seem to involve two "pronominal" past tense functions. Consider "What had I done on the previous move? Had I taken two?".
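The contrast between the two functions can be sketched in Python (mine; the representation of events and of the referent situation is an invented stand-in):

def HAVE(memory, event):
    """Perfective aspect: has the event occurred at any time so far?"""
    return event in memory

def PAST(referent_memory, event):
    """Past tense: did the event occur in the particular past situation
    currently serving as the tense's referent?"""
    return event in referent_memory

memory = [("I", 5), ("P", 2), ("P", 8)]   # the whole game so far
print(HAVE(memory, ("I", 5)))             # True: some past occasion will do
print(PAST(memory[:1], ("P", 8)))         # False: not at *that* particular point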
3.3. Setting referents
An interesting difference between the case of pronouns and that of tenses is that we have few, if any, devices in English whose function is just to call attention to some item so that it can be referred to elsewhere by a pronoun. A possible candidate for such a device would be the sort of topicalization that produces constructions such as "And your father, how is he?" where "And your father" does nothing but call attention to its referent. However, this attention-calling need not necessarily be in the service of a pronoun, as we can see from "And your father, now there's a man I admire" or "And your appointment, did you arrive in time?". We do, however, have devices whose role is just to provide referents for tenses. "When" clauses, in particular, serve this function. The situation referred to by a "when" clause is always the referent of the tense in a higher clause, and, as in (5)-(7), it can serve as the referent of the tense in subsequent sentences as well.
In the cases dealt with by the program I have written, a past tense "when" clause—"when I took seven"—establishes a referent for the past tense which stands either until the appearance of another past tense "when" clause, or, as mentioned earlier, until there is a change of tense. A new time clause will, of course, establish a new referent, but a change of tense leaves the past tense with no referent at all. In such a context, a return to the past tense with a sentence unadorned by any time clause or time adverbial will appear odd. Consider, for example, the effect of replacing sentence (5) of the sample dialogue by sentence (7). The pronominal behaviour of the present tense is slightly more complicated. To begin with, it can never find itself entirely without a referent, because it can always adopt its "default value", the time of speaking, in the absence of any other referent. The role of present tense "when" clauses is to provide other referents for the present tense. These are usually future situations, as in "When I take seven, what will you do?", or generic ones, as in "When I take aspirin, the spots grow even larger" or "When I take my sugar to tea, I'm as happy as I can be". The program does not attempt to cope with generic situations and I shall not discuss them here. A peculiarity of future situations is that further references to them will normally be made in conjunction with modal verbs, as in the example just cited or "When we get there, they may be gone" and "When you have typed that letter, you can go home". I am adopting here the position put forward by Boyd and Thorne (1969) that these modal verbs are not in themselves markers of a future tense, and that English, in contrast to, say, French or Italian, does not in fact have a syntactically distinguishable future tense. This is not to say that we cannot refer to future times, but just that when we do, we use the same tense that we use for talking about the present. In particular, then, I would not regard a sentence like "You can go home" as syntactically ambiguous, in spite of the fact that it can be used either in reply to "What shall I do when I have finished typing this letter?", making reference to a time in the future, or in reply to "What shall I do now?", making reference to the present moment. Boyd and Thorne, in arguing against the existence of a future tense, concentrate their fire on the claim that "will" serves as the marker for one. They give examples both of sentences about the future which lack a "will", like "He goes to London tomorrow", and sentences containing "will" which are not about the future, as in "My cousin is downstairs. He will be wondering what has happened to me." I think it is also worth remarking on the absence of "will" from time clauses which refer to the future. A language like French, which has a distinct future tense, will use it in such clauses, producing "quand il viendra" or "après qu'il sera venu" in contrast to the English "when he comes" or "after he has come". This fits in well with the contention of Boyd and Thorne that "will" is a marker of prediction, rather than of futurity as such. The prediction in "When he becomes Prime Minister, he will shed his lofty principles" is made by the main clause. His becoming Prime Minister is presupposed.
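The lifetime of these referents can be pictured as a pair of reassignable bindings; a minimal Python sketch (mine; the "Under what circumstances?" fallback anticipates the behaviour described in the next section):

# Tense referents as reassignable bindings (illustrative only).
referents = {"PRES": "time of speaking",   # the default is always available
             "PAST": None}                 # no referent until one is set

def when_clause(past_situation):
    referents["PAST"] = past_situation     # "when I took seven" sets it

def change_of_tense():
    referents["PAST"] = None               # leaves the past tense referent-less

def use_past_tense():
    if referents["PAST"] is None:
        return "Under what circumstances?" # ask for the missing referent
    return referents["PAST"]

when_clause("the situation just after I took seven")
print(use_past_tense())    # the situation just after I took seven
change_of_tense()
print(use_past_tense())    # Under what circumstances?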
3.4. WHEN
These considerations should make clear the motivation behind the way that the program's WHEN function works. WHEN operates upon the sort of closure function that the program produces as the translation of a clause. In the case of "when I took seven", for example, it gets applied to REMOTE(%TAKE(%7%)(%I%)%). What it does, in essence, is to search out the situation described by the clause, in this case my taking of seven, and make the tense of the sentence, in this case past, refer to it. In slightly more detail, it first notes the tense, in order to find out where to look for a situation of the sort described. If the tense is past, it searches the memory. If this search were to fail, it would indicate a disagreement with the presupposition of the "when" clause, and a message to this effect would be printed out. In the present example, this would be "You haven't taken seven". If the search succeeds, the program notes the values of the variables which define the situation that it finds, and creates a function which, when called, will conjure up this situation by assigning the variables these values. The newly created function is then given the label PAST. When it is not in use, which is to say before any past tense references have been made in a conversation, or after a change of tense, this PAST label is hung on a function whose effect is to temporarily abandon the attempt to understand the sentence it is working on, print out "Under what circumstances?", and interpret the response, in the hope that it will provide the missing referent. It then has another try at the original sentence. It is the function currently labelled PAST that REMOTE calls upon when it is acting as a past tense. Thus the REMOTE in the translation of (6) of the sample dialogue looks to see what function is currently labelled PAST, and finds the one created during sentence (5). It then runs this function, recreating the scene "when I took five", and answers, with respect to it, a question which amounts to "are you just about to take seven, or in the process of doing so?". The reason for this formulation is that "when I took five" does not, even in the restricted universe under consideration, specify a precise point in time. We can compare, for instance,
(14) Did I win when I took five?

with

(15) Did you take six when I took five?
In the game of the sample dialogue, "when I took five" is equivalent to "on my first move" for the purposes of (14), but to "on your second move" for (15). The reasons for this do not lie in the syntax of (14) and (15), or even in their semantics, in the sense that the opposite interpretations, "your second move" for (14) and "my first move" for (15), would not be meaningless. It is not in general out of order to speak of two people doing things at the same time. It is, however, impossible for two moves to take place at once in a game played
according to the rules of tic-tac-toe, and the program makes the (pragmatic) assumption that it will only be asked to discuss possibilities that might arise during a well-formed game. It is in order to give it room to apply this assumption that the variable INTRAIN is introduced. The value of this variable is supposed to represent a move in progress, and the program can take a question about whether an event occurred in a given situation as asking either whether the move in progress constituted such an event, the interpretation for (14), or whether the event took place just after the move in progress, the interpretation for (15). When the body of a "when" clause specifies an event, as it does in "when I took five", the WHEN function makes this event the value of INTRAIN in the situation that it constructs and gives the memory and board position variables the values they held just before the event. In answering a question such as (14) or (15), the system first looks to see if the event INTRAIN and the event being asked about involve the same agent. If they do, as in (14), it then goes on to see whether the event INTRAIN meets the description given in the question. In the case of (14) it would have to ask whether my taking of five was a winning move. If the agents do not match, it is not the event INTRAIN that is of interest. In this case, the board position variables are altered by letting the move INTRAIN take place, and the question is taken as applying to the following move. This altered situation then carries on as the referent of the tense for the purpose of further sentences. Thus "Did I take four?" following a "Yes" answer to (15) would not be taken as equivalent to "Did I take four when I took five?", whereas following (14) it would be.

In trying to seek out the situation referred to by a present tense "when" clause—"when I take six"—the program must look to the future. This raises the difficulty, which the nature of the game allowed us to avoid in the case of the past tense, that there may be several different possible future situations in which the event could occur. The program does not answer "Well, that depends" in such circumstances, but instead looks to see whether the event might happen as one of the next pair of moves, and, if not, indicates that it doesn't know what situation is referred to. Furthermore, a present tense "when" clause carries the presupposition that the event it mentions, my taking of six, say, will actually happen. The program objects to this supposition if the event is impossible, for instance if six has already been taken, or if its own strategy is such as to prevent it happening. For example, if it is the program's turn to move, and it is about to take six, it will not countenance a "when I take six" from the user. Similarly, it will object to "when you take four" if its strategy dictates that it will not. If the event description manages to clear all these hurdles, things proceed as with the past tense. The program constructs a function which, when called, will create the situation referred to. In this case, of course, the function is given the label PRES, rather than PAST, and it can be called into use by further present tense sentences with modal verbs as discussed above. As soon as a sentence appears whose
main clause does not have this form, PRES reverts to the "actual present", the current game position. It is important to note that when the program encounters a present tense "when" clause, the point of departure for its search into the future is the current referent of PRES, and not necessarily the actual position in the game. Thus we might have an exchange like

U: What will you do when I take six?
P: Take 4.
U: What will you do when I take two?
P: Take 3.

where the second question-answer pair refers to the situation after six and four have been taken. We can also have sentences like "If you take two, what will you do when I take three?" where the referent of the present tense is given by the "if" clause, and the "when" clause follows on from the hypothetical situation established there.
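The referent-labelling machinery just described can be summarized in a few lines of code. The sketch below is mine, in Python rather than the POP-2 the program was written in, and every identifier is an illustrative assumption rather than a name from the actual implementation; it shows only the central idea that a tense label names whichever closure was most recently hung on it.

    # Sketch of the "pronominal tense" mechanism: WHEN files a situation
    # under a tense label; REMOTE later evokes whatever is filed there.
    class TenseRegistry:
        def __init__(self):
            # With no referent yet, the past tense asks the user for one.
            self.labels = {"PAST": self.no_referent,
                           "PRES": lambda: {"time": "now"}}

        def no_referent(self):
            raise LookupError("Under what circumstances?")

        def set_referent(self, label, situation):
            # Hang a freshly made closure on the label; calling it later
            # conjures the stored situation back up.
            self.labels[label] = lambda: dict(situation)

        def evoke(self, label):
            return self.labels[label]()

    def when_past(registry, memory, event):
        # Search memory for the described event; failure would contradict
        # the presupposition of the "when" clause.
        for situation in memory:
            if situation.get("in_train") == event:
                registry.set_referent("PAST", situation)
                return True
        return False   # the program would print e.g. "You haven't taken seven"

    memory = [{"in_train": "I take 5", "board": "X at 5 only"}]
    reg = TenseRegistry()
    when_past(reg, memory, "I take 5")   # "when I took five"
    print(reg.evoke("PAST"))             # a later past tense re-evokes it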
3.5. IF
The function corresponding to the word "if" operates on present tense clauses in a manner quite similar to that just described for the function WHEN. It seeks out a future situation to serve as the referent of PRES. The main difference between the two arises from the lack of any presupposition, in "if" clauses, that the event described by the clause will actually take place. The IF function does not share the WHEN function's concern with what the program's strategy would lead it to do in a given situation, and so the program is willing to speculate on what might happen "if you take six" in circumstances that would lead it to reject "when you take six". There is a considerable divergence between IF and WHEN when we come to clauses bearing the remote tense marker. As mentioned earlier, "if" clauses with this marker are always interpreted by the program as being subjunctive, rather than past tense. The subjunctive mood shares a syntactic marker with the past tense, and exhibits the same sort of "pronominal" behaviour that we have attributed to tenses, but the sorts of things to which it "refers" are somewhat different. In the usages we are concerned with here, it does not stand for a particular situation, even a hypothetical one, but rather for an entire alternative time line. Consider the useful example (contributed by Christopher Longuet-Higgins) "If my father's name had been Smith, my name would be Smith". Speaking informally, we can say that the "if" clause asks us to consider a parallel universe in whose past my father's name was Smith. In the main clause, we go on to an assertion about the present of this parallel universe, namely that my name is Smith. The way that I have attempted to capture this intuition in the program is to let the referents of SUBJUNCT be functions which temporarily assign values to PRES and PAST which make them refer to situations different from their "real"
values. Thus applying PAST within the scope of a SUBJUNCT evokes the subjunctive past, the past of the alternative universe, rather than a point in the actual past. Now, it is "if" clauses that tell SUBJUNCT which alternative universe to consider. Taking first remote clauses without perfective aspect—"if I took seven"—we see that they refer to possible future events in the same way as present tense "if" clauses without perfective aspect—"if I take seven". I believe that there is a preference in ordinary usage to employ the subjunctive form in cases where we deem the event unlikely, but I have not taken account of this in the program. The search for a referent situation is therefore carried out in the same way. However, this situation, when found, is not now given directly to PRES, but rather a function which gives it to PRES is created, and made the value of SUBJUNCT. The situation can then be re-evoked only by first entering the subjunctive. Our account of the difference between sequences like
(16) What will you do if I take five? Will you take four?
and
(17) What would you do if I took five? Would you take four?
is not then in terms of a difference in meaning between them, but just that the hypothetical situation is filed under the label "present" in one case, and under "present subjunctive" in the other. A possible reason for wanting to have such alternative filing systems is that, as we have already noted, the syntactic labels can be used in a variety of different ways, and one of them might be in use at any given moment. Consider, for instance,
U: What will you do if I take five?
P: Take four.
U: If I take six, will I win?
In U's second question, the present tense can continue the reference to the hypothetical case established in his first. However, in
U: What would you do if I took five?
P: Take 4.
U: If I take six, will I win?
The change of tense shifts the topic away from the hypothetical situation, and back to the current board position.
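The two filing systems can be pictured as follows. This is again my own Python sketch with invented names, not the program's code: a present tense "if" clause files the hypothetical situation directly under PRES, while a remote one files, under SUBJUNCT, a function that will install it in PRES only once the subjunctive is entered.

    def file_hypothetical(registry, situation, remote):
        if not remote:
            # "if I take five": the hypothetical case simply becomes
            # the current referent of the present tense.
            registry["PRES"] = lambda: dict(situation)
        else:
            # "if I took five": file it under "present subjunctive";
            # entering the subjunctive temporarily repoints PRES.
            def enter_subjunctive():
                registry["PRES"] = lambda: dict(situation)
            registry["SUBJUNCT"] = enter_subjunctive

    registry = {"PRES": lambda: {"board": "the current position"}}
    file_hypothetical(registry, {"board": "after I take 5"}, remote=True)

    registry["SUBJUNCT"]()     # "Would you take four?" enters the subjunctive
    print(registry["PRES"]())  # ...and only then consults PRES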
3.6. The subjunctive past
Since the subjunctive manifests itself in the same syntactic marker that might otherwise signal the past tense, the language must press some other device into service to form the past of the subjunctive. What it uses is the perfective aspect marker "have", and in subjunctive environments the program always translates this marker as PAST, rather than the function HAVE discussed earlier. (This can happen in other constructions as well. We have seen an example involving the past perfect in 3.2, and "He may have" can also serve as either "Maybe he has" or "Maybe he did", depending on whether there is a particular past situation under discussion.) The role of IF in a construction like "If I had taken six when I took two" is then to seek out the hypothetical situation which has been specified, and then declare that when we are in the subjunctive mood, this hypothetical situation counts as the past.

In somewhat greater detail, the program's operations on this clause go as follows: The translation procedure produces a closure function of the form IF(%SUBJUNCT(%PAST(%TAKE(%6%)(%I%)%)%)%). In doing this it has produced, and run, a translation of the sub-clause "when I took two", with the result that the (ordinary, non-subjunctive) value of PAST has been set to a specific past situation that has my taking of two in progress. IF then looks down into its frozen argument, finds PAST, and applies it, setting up the past situation. The next step is to alter this situation to one in which I take six. It does this by changing the move in progress, the value of the variable INTRAIN, to a move in which I take six. (If the "when" clause were "when you took two", it would be the following move that was altered.) A function is then constructed whose effect is to set the value of PAST to this altered situation, and this function is made the value of SUBJUNCT. The hypothetical situation is now ready to be consulted by succeeding past subjunctive main clauses. Suppose, for instance, that we get the clause "would you have taken three?", which translates to SUBJUNCT(%PAST(%WILL(%TAKE(%3%)(%YOU%)%)%)%). SUBJUNCT gives PAST the value established by the "if" clause. PAST is then applied, actually setting up the hypothetical situation, with respect to which a question amounting to "now will you take three?" is asked.
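The sequence of operations for "If I had taken six when I took two" can be sketched like this, a Python paraphrase under the same caveats as before: the names and data layout are assumptions of mine, not the original POP-2.

    def if_had(registry, new_event):
        # IF applies the current PAST, setting up the real past situation...
        past = registry["PAST"]()
        # ...then alters the move in progress to the counterfactual event.
        altered = dict(past)
        altered["in_train"] = new_event
        # What counts as PAST inside the subjunctive is the altered situation.
        def enter_subjunctive():
            registry["PAST"] = lambda: dict(altered)
        registry["SUBJUNCT"] = enter_subjunctive

    # "when I took two" has already set the ordinary value of PAST:
    registry = {"PAST": lambda: {"in_train": "I take 2", "board": "..."}}
    if_had(registry, "I take 6")      # "If I had taken six ..."

    # "would you have taken three?" = SUBJUNCT(PAST(WILL(TAKE 3 YOU))):
    registry["SUBJUNCT"]()            # enter the alternative time line
    print(registry["PAST"]())         # its past has me taking six, not two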
3.7. Modal verbs
The program treats the three modal verbs "may", "will" and "can", together with their associated remote forms "might", "would" and "could". The senses in which these words are taken are the "may" of possibility (as opposed to permission), "can" in the sense of "be able" and the "will" of prediction.
Unfortunately for these purposes, the "permission" sense of "may" is the preferred one when the word appears in questions, and "might" is often used in a non-remote sense to replace it. "Might you take four?" is a much more natural question than "May you take four?". The program is therefore written to allow "might" to be either remote or not, whichever seems appropriate in the context. There are also cases in which "might" and "could" can take their remote forms without an explicit "if" clause having gone before to define the subjunctive, as in "When I took six, might (or could) you have taken two?". Such sentences seem to carry an implicit "if things had been different" or "if you had felt like it". For the program's purposes, it is possible to gain the effect of such an implicit clause by simply letting the actual past act as the subjunctive past in these sentences. Thus in (5) of the sample dialogue the "have" in the main clause is translated as PAST, because the clause is subjunctive, and this PAST is used to refer to the situation described in the "when" clause, with respect to which "can you take two?" is asked. The presupposition that two was not actually taken is overlooked.

The functions MAY, WILL, and CAN operate by considering possible continuations from the board position under consideration at the time they are run. All three take functions corresponding to events, e.g. TAKE(%4%)(%YOU%), as arguments. MAY is the simplest and just hunts for any possible continuation that includes an event corresponding to its argument. WILL ignores those continuations which involve the program's making a move that would be contrary to its strategy. It can confidently predict that none of these will happen. It reports back on whether all, none, or some but not all of the remaining possible games lead to an occurrence of the specified event. Thus, suppose that the program is asked "Will you win?". If every continuation played according to the program's strategy leads to a win, it answers "Yes", if none do, it answers "No", and if some do, but not all, it answers "Dunno".

The function CAN notes the agent of the event it is given as argument and checks to see whether he can, by making a suitable move at each stage, force the event to occur. The program ignores its actual strategy in these calculations, so that it is able to report that there are things which it can do, but will not. CAN is also quite different from MAY, because the program will not claim that it can win when the opponent is able to prevent it, but it will still say that it may win if it is possible for it to do so through a mistake on the opponent's part. The program answers "No" to "can" questions in situations where the agent has no sure-fire strategy for bringing the event about. In fact it would often be better to say "Not necessarily", because "No" seems to express impossibility, rather than lack of certainty. That is, a reply of "No" to "Can you win?" looks equivalent to "I cannot win", which would be wrong in a situation where a win was still possible if the opponent slipped up.
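As a rough indication of the logic involved, the three modal functions might be rendered as follows. This is an illustrative Python sketch, not the actual implementation: a "continuation" is here simply a list of future events, strategy_ok stands in for the program's strategy filter, and the game-tree plumbing for CAN is reduced to three assumed helper functions (moves, mover, happened).

    def may(continuations, event):
        # Possibility: some continuation of the game contains the event.
        return any(event in c for c in continuations)

    def will(continuations, strategy_ok, event):
        # Prediction: drop continuations the program's strategy excludes,
        # then report whether all, none, or only some contain the event.
        live = [c for c in continuations if strategy_ok(c)]
        hits = sum(1 for c in live if event in c)
        if hits == len(live):
            return "Yes"
        return "No" if hits == 0 else "Dunno"

    def can(state, agent, event, moves, mover, happened):
        # Ability: can the agent force the event whatever the opponent does?
        if happened(state, event):
            return True
        successors = [can(s, agent, event, moves, mover, happened)
                      for s in moves(state)]
        if not successors:
            return False
        # One good move suffices on the agent's turn; on the opponent's
        # turn every reply must still leave the event forceable.
        return any(successors) if mover(state) == agent else all(successors)

    conts = [["I take 6", "you take 4"], ["I take 6", "you take 2"]]
    print(may(conts, "you take 2"))                                    # True
    print(will(conts, lambda c: "you take 2" not in c, "you take 4"))  # Yes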
3.8. Focus
The discussion of the modal functions has so far proceeded as if they always searched possible continuations from the given board position all the way to the end of the game. But a question like "If I take six, will you take two?" seems to be more naturally taken as "will you take two on your next move?" than as "will you take two during the rest of the game?". Questions involving "win", on the other hand—"If I take six, will you win?"—do not seem to be focussed on the next move. I have not produced any principled account of this phenomenon, but have provided the variable FOCUSSED, which can have the value 0 or 1, as one of the elements in a situation. When operating in a situation where the value of the variable is 1, the modal functions only look ahead to the agent's next move. There are two main ways in which situations become focussed. The program sets the variable to 1 before asking questions with the verbs "take" or "do", and the situations supplied by the function WHEN are always focussed. Note that the question "When I take six, will you win?" does mean "Will you win on your next move?". This device serves its purpose in the present restricted context, but it is really just an indication that I have noticed a problem, not that I have solved it.
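In code, the effect of the flag might look like this (again only a sketch with assumed names):

    def search_space(situation, continuations):
        # With FOCUSSED = 1 the modal functions look no further than the
        # next move; otherwise they search to the end of the game.
        if situation.get("FOCUSSED") == 1:
            return [c[:1] for c in continuations]
        return continuations

    conts = [["you take 2", "I take 9"], ["you take 4", "I take 9"]]
    print(search_space({"FOCUSSED": 1}, conts))   # only the next moves
    print(search_space({"FOCUSSED": 0}, conts))   # whole continuations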
3.9. Other functions
The program contains a number of further functions whose workings I will not go into here. They do things which have to be done somehow or other in order to make the program work, but the way in which they do their jobs is not meant to say anything about the way in which people use English. Two such functions that have been mentioned in passing are the WHAT function, which searches the universe for an object, or action, that fits a given description, and the function that decides whether two event descriptions amount to the same thing in a given context, e.g. whether my taking of seven constitutes a winning move. I have tried to keep such functions to a minimum, sometimes at the cost of failing to make the program do things that it clearly could be made to do without much difficulty. The reason comes back to the program's original purpose as an expository device. With this in mind, I thought it best to keep it as simple as possible, so as to help the reader see more easily what the limits of my proposals are, and what phenomena are meant to fall within their scope. Ad hoc patches, to make the basic principles appear to cover more cases than they really do, would only confuse matters.

Acknowledgements

This work was supported by a grant from the Science Research Council. As should be evident from the text, my thinking on these topics has been greatly influenced by conversations with Christopher Longuet-Higgins. This does
not, of course, mean that he necessarily subscribes to my conclusions. I am also grateful to Julian Davies, Anthony Davey and Graeme Ritchie for many helpful discussions about both English and programming.
References

BOYD, JULIAN and THORNE, J. P. (1969), The semantics of modal verbs. Journal of Linguistics 5, 57–74.
BURSTALL, R. M., COLLINS, J. S. & POPPLESTONE, R. J. (1971), Programming in POP-2. Edinburgh: Edinburgh University Press.
HOPCROFT, J. E. & ULLMAN, J. D. (1969), Formal Languages and Their Relation to Automata. Reading, Mass.: Addison-Wesley.
MCCAWLEY, JAMES D. (1971), Tense and Time Reference in English. In CHARLES J. FILLMORE and D. TERENCE LANGENDOEN (eds.), Studies in Linguistic Semantics. New York, N.Y.: Holt, Rinehart and Winston.
PRIOR, A. N. (1967), Past, Present and Future. Oxford: Oxford University Press.
FRANZ VON KUTSCHERA
INDICATIVE CONDITIONALS
In this paper a semantics and logic of conditional necessity is developed as the basis of a logic of indicative and subjunctive conditionals and of causal sentences. It is argued, against E. Adams and D. Lewis, that these three types of statements differ only in presupposition.
In "Counterfactuals" (1973) David Lewis has developed a logical system VC for counterfactuals. For the normal cases of counterfactuals, in which the antecedent is false, VC can be replaced by VW, a system based on weak instead of strong centering. In this paper I shall try to show that this system can also be applied to indicative conditionals and can generally be used for a comprehensive and unified treatment of the logic of conditionals. The main obstacles in generalizing Lewis' analysis of counterfactuals to include indicative conditionals are, first, that the intuitive background he provides for his semantics favours strong centering, and, second, an argument by Ernest W. Adams in (1970) to the effect that "subjunctive and indicative conditionals are... logically distinct species" so that the truth-conditions for the former cannot be derived from those for the latter by adding suitable presuppositions. Our first step will be a reconsideration of that argument. 1.
1. Adams' argument
If counterfactuals derive from indicative conditionals, or both from a basic type of conditionals, then it should be true that:
(1) an indicative conditional 'If it is the case that A, then it is the case that B' (shortly: 'If A, then B') has for non-A the same truth conditions as the counterfactual 'If it were the case that A, then it would be the case that B'.
According to Adams the two following sentences form a counter-example to (1), since we consider (2) true and its antecedent false, but (3) false:
(2) If Oswald didn't shoot Kennedy, then someone else did.
(3) If Oswald hadn't shot Kennedy, then someone else would have.
Now Lewis, and before him N. Goodman, N. Rescher and R. Stalnaker, have analyzed the truth conditions of conditionals as dependent upon ceteris-paribus conditions C, not explicitly mentioned in the conditional, that are compatible with the antecedent. If we change our assumptions as to the truth or compatibility of C with A, then we also change our assessment of the truth of the conditional. In the example we are only prepared to accept (2) as true if we know that Kennedy was indeed shot in Dallas, and if we consider that compatible with Oswald's innocence, although this may be very unlikely for us as things stand. If, on the other hand, we consider (3) to be false, we take it that Oswald did indeed shoot Kennedy and that there was no one else around with intentions or means to assassinate the president. We therefore do not consider Kennedy's being shot in Dallas compatible with Oswald's innocence.1 So we have changed our assessment of the ways things might have been, if Oswald hadn't shot Kennedy. But (1) presupposes that this assessment remain the same for the indicative and the counterfactual conditional. The difference in our assessment of the truth of (2) and (3) seems to be a consequence of the fact that, while (2) speaks about the author of the killing in Dallas, (3) implies that Kennedy was somehow fated to be killed anyhow, which is not implied in the logical representation of (3). There are indeed quite a lot of differences of meaning in natural language, for instance by a change in topic and comment as in Goodman's example of Georgia and New York,2 that are not accounted for in the usual straight-forward logical representation. So Adams' example is not a conclusive argument against (1).
2. Types of conditionals
In traditional grammar three types of conditionals are distinguished: those using the indicative in the antecedent and consequent (indicative conditionals) and two forms of subjunctive conditionals. These two forms can be distinguished morphologically in some languages, as in Latin, by their use of present or past tense (Si hoc credas, erres vs. Si hoc crederes, errares), but generally they have to be determined by the fact that one type (the counterfactual) carries the presupposition that the antecedent (and also normally the succedent) is false, while the other type (in Latin potentialis) carries no such presupposition but expresses the speaker's opinion that the antecedent is improbable or uncertain. We shall also include causal statements of the form "Since it is the case that A, it is the case that B" in our investigation of conditionals. Such sentences presuppose that A (and hence B) is true. The grammatical subdivision of conditionals is of little logical interest since it mixes syntactical criteria (mood) with semantical (presupposition) and pragmatical
1 Cf. also Lewis (1973), p. 71.
2 Cf. Goodman (1965), pp. 14 seq.
ones (beliefs of the speaker). As we are here only after truth conditions and not after expressive meaning components, the difference between indicative conditional and potentialis is not relevant for us. And since we shall not consider partial interpretations3 we shall take no account of presuppositions. We want to argue that we can get along then with only one type of conditional, which we write A ⇒ B. We say that A ⇒ B is used as an indicative conditional if it is undecided (for the speaker) whether the antecedent A holds or not. A ⇒ B is used as a counterfactual if (for the speaker) it is a fact that ¬A. And it is used as a causal statement "Since it is the case that A, it is the case that B" if (for the speaker) it is a fact that A.4 We think, therefore, that the difference between the indicative, counterfactual, and causal conditional is not a difference of truth-conditions but only a difference in presupposition. If we assert, for instance,
(1) If Jack believes that John is married, then he is wrong,
and are told that Jack does not believe John to be married, then we are committed to the statement
(2) If Jack were to believe that John is married, he would be wrong.
The reason for asserting (1), viz. that John is not married, is the same as that for asserting (2). And if we assert (2) we are committed to (1) if we hear that it is really uncertain whether Jack believes John to be married or not. And if I assert (1), then if I learn that Jack really does believe John to be married, I am committed to the statement
(3) Since Jack believes that John is married, he is wrong.
And conversely, if I assert (3) and then learn that it is not sure that Jack believes John to be married, I shall say that (1) is true. One or a few examples are not conclusive evidence for our thesis, of course. They just serve to give it a certain intuitive plausibility. Our main argument has to be that the semantic analysis of A ⇒ B is such that the thesis is intuitively adequate.
3. Similarity of worlds
D. Lewis gives several types of semantics for the language of conditionals. The intuitively fundamental one is that of comparative similarity systems.5 Such systems are based on relations ≤ᵢ.
3 Cf. for instance Kutschera (1974a).
4 We shall not discuss the difference between subjective and objective presuppositions here.
5 Cf. Lewis (1973), pp. 48 seq.
j ≤ᵢ k says that the world k is at least as similar to i as j is. ≤ᵢ is to be a weak ordering for which several conditions hold, among them
j < j i for alii, j å Iandj=i=i.
This is the condition of strong centering, which says that every world is more similar to itself than any other world. Lewis' truth condition for A ⇒ B is

(2) A ⇒ B is true in i iff A is impossible or there is an A-world j so that all A-worlds that are at least as similar to i as j are B-worlds.

From (1) we obtain then
ÁËÂ:3(Á=>Â).
This is harmless for counterfactuals, which normally are used only under the presupposition that ¬A. It is unacceptable, however, if we want to interpret A ⇒ B as the basic form of conditionals, since every causal conditional would then be true. If we replace (1) by the condition for weak centering
j<jiforaUi,j8l,
then (3) is not valid anymore, but the assumption that there is a world, different from i, which is to i just as similar as i itself, is counterintuitive. Similarity of j and i, according to Lewis, is to be overall-similarity, so that j is the more similar to i the more details they have in common and the more important these common details are. Since for j ≠ i j must in some details, however few and unimportant, be different from i, i itself must certainly be more similar to i than j. To obtain an adequate semantics for our A ⇒ B there remain then only two possibilities: change (2) or change the whole intuitive background of the semantics. We might, for instance, change (2) to
(2') A ⇒ B is true in i iff A is impossible or A ∧ B necessary or there is an A-world j and a ¬A-world k so that all A-worlds at least as similar to i as j or k are B-worlds.
In case A is false in i this coincides with (2), i.e. nothing is changed for counterfactuals. (2') expresses the fact that A ⇒ B holds if we can infer B from A together with a suitable ceteris-paribus condition compatible both with A and ¬A, and looks, therefore, like a good candidate for indicative conditionals. The trouble with (2'), however, is that the logic we obtain from this condition is too weak. Conditionals are a type of inference-relation, and though many of the fundamental principles valid, for instance, for logical entailment or material or strict implication (like the laws of contraposition, strengthening of the premiss or transitivity) are not valid for conditionals,6 they are valid in the normal cases. From (2'), however, we do not obtain sufficiently strong restrictions of these laws.

6 Cf. Lewis (1973), 1.8.
We shall therefore follow the second course and abandon the use of world-similarities for the interpretation of A ⇒ B altogether.
4. Conditional necessity
We shall interpret A ⇒ B as a statement about conditional necessity and read it as "On condition that A, it is necessary that B". The notion of conditional necessity is a generalization of the usual notion of (unconditional) necessity, as conditional probability or conditional obligation are generalisations of the notions of (unconditional) probability and obligation. Under different conditions different propositions may be necessary. From conditional necessity we obtain two concepts of unconditional necessity: proposition p is weakly necessary if it is necessary on a tautologous condition, and p is strongly necessary if it is necessary under all conditions. p is weakly necessary if under the given circumstances p is normally the case. Therefore A ⇒ B expresses a notion of weak necessity: on condition that A, it is normally the case that B. Conditional possibility can then be defined by

D4.1. A ⇢ B := ¬(A ⇒ ¬B)

We read A ⇢ B as "On condition that A, it is (weakly) possible that B". A proposition p is unconditionally weakly possible if under the given circumstances it would not be abnormal if p were the case. And p is strongly possible if there is a condition under which p is (weakly) possible. Before we discuss these intuitive concepts further let me give the formal definitions: Let 𝔏 be the language obtained from that of Predicate Logic by stipulating that (A ⇒ B) be a sentence if A and B are. To economize on brackets, ¬, ∧, ∨ are to bind stronger and ⊃, = weaker than ⇒, so that we may write A ∧ B ⇒ B ∨ C ⊃ C ⇒ ¬A instead of ((A ∧ B) ⇒ (B ∨ C)) ⊃ (C ⇒ ¬A).

D4.2. An interpretation of 𝔏 is a quadruple ⟨U, I, f, Φ⟩ so that:
(1) U is a non-empty set of (possible) objects.
(2) I is a non-empty set of (possible) worlds.
(3) f(i, X) is a function on I × P(I) (P(I) being the power set of I) so that for all i ∈ I and X ⊆ I:
(a) f(i, X) ⊆ X
(b) X ⊆ Y ∧ f(i, X) ≠ ∅ ⊃ f(i, Y) ≠ ∅7
(c) X ⊆ Y ∧ f(i, Y) ∩ X ≠ ∅ ⊃ f(i, X) = f(i, Y) ∩ X
(d) i ∈ f(i, I)

7 For the sake of brevity we use the logical operators of 𝔏 also as metatheoretical symbols.
(4) For all i ∈ I, Φᵢ is a function from the set of sentences of 𝔏 into the set {t, f} of truth values so that
(a) Φᵢ(a) = …
In modal logic we set Φᵢ(NA) = t iff Sᵢ ⊆ [A], where for all i ∈ I Sᵢ is a non-empty subset of I. Sᵢ is the set of worlds possible from the standpoint of i. (4c) is the straight-forward generalization for conditional necessity: f(i, A) is the set of worlds (weakly) possible, under the condition that A, from the standpoint of i. If we set Sᵢ = ⋃{f(i, X) : X ⊆ I}, then Sᵢ is the set of worlds strongly possible from the standpoint of i, i.e. the proposition X is strongly necessary iff Sᵢ ⊆ X, and X is strongly impossible iff Sᵢ ⊆ ¬X. We construe f so that
(α) f(i, X) = ∅ iff Sᵢ ⊆ ¬X,
i.e. f(i, X) is empty iff X is strongly impossible. This follows from the conditions of D4.2(3) and the definition of Sᵢ. If Sᵢ ⊆ ¬X then f(i, X) = ∅ according to (a). If Sᵢ ∩ X ≠ ∅ then there is a Y with f(i, Y) ∩ X ≠ ∅, so according to (c) f(i, X ∩ Y) = f(i, Y) ∩ X ≠ ∅, and according to (b) f(i, X) ≠ ∅. From the definition of Sᵢ we obtain

(β) Sᵢ ⊆ X iff for all Y ⊆ I, f(i, Y) ⊆ X.

And (a) together with (α) implies
(γ) Sᵢ ⊆ X iff f(i, ¬X) ⊆ X.
We can, therefore, define strong necessity and possibility by

D4.3. (a) NA := ¬A ⇒ A
      (b) MA := ¬N¬A,
while weak necessity and weak possibility are defined by

D4.4. (a) LA := T ⇒ A, where T is a tautology,
      (b) PA := ¬L¬A.
Now condition D4.2(3a) says that all worlds (weakly) possible on condition that X are X-worlds. Condition (b)—always of D4.2(3)—says that if X is strongly possible and X ⊆ Y, then Y is strongly possible. This is the law A ⊃ B ⊢ MA ⊃ MB of modal logic. Condition (c) (in view of (a) and (b)) is equivalent to f(i, Y) ∩ X ≠ ∅ ⊃ f(i, X ∩ Y) = f(i, Y) ∩ X; i.e. if among the worlds (weakly) possible on condition that Y there are some X-worlds, then these are the worlds (weakly) possible on condition that X and Y. This implies the law of im- and exportation of premisses (A ∧ B ⇒ C = A ⇒ (B ⊃ C)).
Condition (d) finally says that i is weakly possible from the standpoint of i. (d) together with (c) implies the law of modus ponens for conditional necessity: A ∧ (A ⇒ B) ⊃ B. A word, perhaps, is also in order on condition D4.2(4a): all individual constants are interpreted as standard names. S. Kripke has given good reasons for such a procedure in (1972). Since we are not interested in existence here we have not introduced sets Uᵢ of objects existing in i. If E is a one-place predicate constant of 𝔏 we could set Φᵢ(E) = Uᵢ and define quantification over existing instead of possible objects by Λ'xA[x] := Λx(Ex ⊃ A[x]).

D4.5. An interpretation 𝔐 = ⟨U, I, f, Φ⟩ satisfies a sentence A in i iff Φᵢ(A) = t. A is valid in 𝔐 iff 𝔐 satisfies A for all i ∈ I. And A is 𝔏-valid iff A is valid in all interpretations of 𝔏.

Our concept of interpretation appears in Lewis (1973), 2.7 as that of a model based on a weakly centered selection function. His selection functions, however, are introduced on the basis of comparative similarity concepts for which only weak centering is counterintuitive, as we have seen. To arrive at such functions—f(i, A) being interpreted as the set of A-worlds most similar to i—the Limit Assumption has to be assumed, that for all i and A there is an A-world most similar to i. Though this makes no difference for the resulting logical system it is intuitively not well-founded, as Lewis points out, since the similarity of worlds may depend on the values of real valued parameters like places, times, masses etc. in them. Our approach avoids these difficulties in giving another interpretation to the selection functions. If we want to consider iterated applications of modal operators the principles
(δ) NA ⊃ NNA of C. I. Lewis' system S4, and
(ε) ¬NA ⊃ N¬NA of S5
suggest themselves. As S5 seems to be intuitively most adequate, we may incorporate the condition
(e) j ∈ Sᵢ ⊃ Sⱼ = Sᵢ for all i, j ∈ I
into D4.2(3). If we want to obtain principles for iterated applications of ⇒ it seems best to generalize (δ) and (ε). The following two conditions are the likeliest candidates:
A=>B=>L(A=>B) i(A=>B)iDL-i(A=>B).
These two conditions are equivalent with postulating in D4.2(3) also
j å f(i, I) => f(j, X) = f(i, X) for all )å! and X c I.
If we assume

(ζ') A ⇒ B ⊃ N(A ⇒ B) and
(η') ¬(A ⇒ B) ⊃ N¬(A ⇒ B)
instead of (ζ) and (η), A ⇒ B holds iff it holds on all conditions, and does not hold iff it holds on no conditions; all conditionals would be necessarily true or false, and we would have LA = NA and N(A ⊃ B) = A ⇒ B, i.e. conditional necessity would coincide with strict implication. This is not adequate since it may be true that

If my barometer goes up, then the atmospheric pressure rises

but, since my barometer does not necessarily function correctly, it is false that

The going up of my barometer strictly implies that the atmospheric pressure rises.

If, on the other hand, we postulate
Á=>Â^>Á=>(Á=>Â) and -é(Á=>Â) =D A=*-i(A=>B)
this would not be intuitively correct, since, if A ⇒ B holds, then A is a reason for B, but not a reason for A ⇒ B.
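Before turning to the axiomatics, the selection-function semantics can be made concrete with a small finite model. The following is my own illustration in Python, not part of von Kutschera's text: worlds carry a "normality" rank, f(i, X) picks the most normal X-worlds, and A ⇒ B is checked as f(i, [A]) ⊆ [B]. A rank-minimizing f of this kind satisfies conditions (a)-(c) of D4.2(3); condition (d) holds at a world i just in case i is of minimal rank, as world 0 is below.

    I = {0, 1, 2, 3}                    # a four-world toy model
    rank = {0: 0, 1: 1, 2: 1, 3: 2}     # lower rank = more normal

    def f(i, X):
        # The worlds (weakly) possible on condition X: the most normal
        # X-worlds. (In this toy model f does not otherwise depend on i.)
        if not X:
            return set()
        m = min(rank[j] for j in X)
        return {j for j in X if rank[j] == m}

    def cond(i, A, B):
        # A => B holds at i iff every world selected on condition A
        # is a B-world.
        return f(i, A) <= B

    A, B = {1, 2, 3}, {1, 2}
    print(cond(0, A, B))       # True: the most normal A-worlds are B-worlds
    print(cond(0, A, {3}))     # False
    print(cond(0, A, B) and not cond(0, A, I - B))   # so not (A => not-B)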
5. The logic of conditional necessity

Let ℭ₀ be Predicate Logic plus the following rule and axioms:
CR: A ⊢ NA
C1: A ⇒ A
C2: NA ⊃ (B ⇒ A)
C3: N(A ⊃ B) ∧ (C ⇒ A) ⊃ (C ⇒ B)
C4: (A ⇒ B) ∧ (A ⇒ C) ⊃ (A ⇒ B ∧ C)
C5: A ⇒ B ⊃ (A ∧ B ⇒ C = A ⇒ (B ⊃ C))
C6: A ⇒ B ⊃ (A ⊃ B)
C7: Λx(A ⇒ B[x]) ⊃ (A ⇒ ΛxB[x])

ℭ₁ is to be ℭ₀ plus
C8: NA ⊃ NNA
C9: ¬NA ⊃ N¬NA

ℭ₂ is to be ℭ₁ plus
C10: A ⇒ B ⊃ L(A ⇒ B)
C11: ¬(A ⇒ B) ⊃ L¬(A ⇒ B).

The propositional part of ℭ₀, i.e. ℭ₀ minus C7, is equivalent with D. Lewis' system VW in (1973), pp. 132 seq. ℭ₀ contains the basic modal system M of von Wright (or T of R. Feys) together with the Barcan formula ΛxNA[x] ⊃ NΛxA[x], and therefore ℭ₁ contains S5. For inference relations > like logical entailment and material or strict implication the following principles are fundamental:
A> A (A > Â) Ë ( > C) => (A > C) (Á>Â)=>(ÁËèÂ)
(4) (A > B) ⊃ (A > B ∨ C)
(5) (A > B ∧ C) = (A > B) ∧ (A > C)
(6) (A ∨ B > C) = (A > C) ∧ (B > C)
(7) (A > B) ⊃ (¬B > ¬A)
(8) A ∧ (A > B) ⊃ B
(9) (A ∧ B > C) = (A > (B ⊃ C))
For => in place of > only (1), (4), (5) and (8) hold. In place of (2) we have ( =>Á) Ë (Á=»Â) Ë (B=>C) ^ A=>C, in place of (3) (A =>C) Ë (Á=>Â) z> Á Ë C=>B, in place of (6) (Á í  =>A) Ë (Á í B=>B) ID (A v B=>C = (A=>C) Ë (B=>C)), in place of (7) nLB Ë (A=>B) =3 iB=>nA and in place of (9) we have C5.
6. Conditional necessity and conditionals
We have to show now that conditionals can be adequately analyzed in terms of statements A ⇒ B about conditional necessity.

(A) Indicative Conditionals
A statement A=>B may be used as an indicative conditional if PA Ë Ñ ô A, i.e. if under the given circumstances A and -\ A are both weakly possible (it may very well be the case that A, but also that ô Á). Í. Goodman's analysis of counterfactuals in (1965) can in part be carried over to indicative conditionals. Then a sentence (1) "If A, then B" is not only true if N(A => B) but also if there is a relevant condition C, not mentioned in "If A, then B" so that N(A Ë C ^ B). C cannot be a free parameter for then (1) would have no definite truth value. C cannot be the conjunction of all true statements, for if A is false (1) would always be true since N(A Ë ô A z>B). C has to be at least consistent with A. C cannot always be true since, on condition that A, C might be an implausible assumption if ð A. As Goodman has shown, the truth condition "If A, then B" is true iff there is a C so that ôÍ(Á => ô C) and N(A Ë C => B) violates the principle MB Ë (Á=>Â)=>-é(Á=ß>ðÂ). So we will have to choose a stronger relation than ôÍ(Á ^ nC) between A and C which Goodman calls cotenability. It seems natural to take A=>C in place of such a relation. If A ^> C then C is (weakly) necessary or the normal case on condition that A, so that assuming A C goes without saying. This accounts 18b TLI 3
for C not being mentioned in (1). We then have: "If A, then B" iff there is a condition C such that A ⇒ C and N(A ∧ C ⊃ B). From this it follows that "If A, then B" holds iff A ⇒ B does. For if A ⇒ B is true we can take A ⊃ B as our C; then N(A ∧ C ⊃ B) and (in view of C3) A ⇒ C. And if we have A ⇒ C and N(A ∧ C ⊃ B), then we have A ⇒ A ∧ C (C1 and C4) and therefore A ⇒ B, in view of C3. This equivalence A ⇒ B = VC((A ⇒ C) ∧ N(A ∧ C ⊃ B)) also holds for counterfactuals and causal conditionals and so is a valuable argument for the correctness of our analysis.

In the normal case of an indicative conditional "If A, then B", B is not weakly necessary, i.e. we have ¬LB. We would, for instance, not normally say "If Nixon is still president next year, then he will be over sixty". (1) normally expresses that there is a connection between the facts expressed by A and B, so that it may very well be that ¬B if ¬A. The case LB can be excluded by using the strong conditional defined by

D6.1. A ⇛ B := (A ⇒ B) ∧ ¬(¬A ⇒ B),

which implies that ¬B is (weakly) possible under condition that ¬A. For ¬LB, A ⇒ B and A ⇛ B are equivalent. A ⇢ B can be read as "If it is the case that A, then it may be the case that B". As Lewis points out, such "may"-conditionals (speaking of counterfactuals he had his eye on "might"-conditionals) also play a role in ordinary discourse. They come quite naturally from the standpoint of conditional necessity.
(B) Counterfactuals
A =s> Â may be used as counterfactual if ô A. Then we also have ÑôÁ. We shall not stipulate nPA, i.e. LnA however, since under the given circumstances, though A is false, A might very well be possible. So on our definition for PA A=>B may be used both as an indicative and a counterfactual conditional, depending on the speakers knowledge, of which we have taken no account in our semantics. The logic of counterfactuals then coincides with that given by D. Lewis. In the normal case of a counterfactual we again have -iLB. If we want to imply that B is in fact true, we say "If it were the case that A, then it would still be the case that B". By use of => instead of => we can exclude this case LB, while for /iLB A =>B is equivalent again to A => B. (C)
(C) Causal conditionals
A=>B may be used as a causal statement (2) "Since it is the case that A, it is the case that B", if A is true. Then we have PA, but we don't stipulate that ðÑôÁ, i.e. that A is normally the case. In fact from A=>B and LA we obtain LB,
so that B is normally the case too; from LA (or PA) and LB, A ⇒ B already follows, so that this statement is quite uninformative in case of LA. In ordinary discourse, we usually only give reasons for phenomena that are unexpected, strange or unusual, as for instance J. König, H. Hart and H. Honoré, J. Passmore and E. Scheibe have pointed out in their discussions of the notion of explanation. This is not always so, but ¬LB (and therefore ¬LA) is certainly an important case in the use of causal statements. Taking A ⇛ B instead of A ⇒ B (equivalent with A ⇒ B again in case of ¬LB) we do not exclude the case LB, but only ¬LA ∧ LB, and give an informative sense to a causal statement in case of LA ∧ LB, since ¬(¬A ⇒ B) does not follow from LA and LB. A ⇛ B then states that A is necessary for the weak necessity of B. It should finally be emphasized, first, that a causal conditional "Since A, B" does not state that A is a cause of B. As in

Since my barometer is going up, the atmospheric pressure rises

or
Since the period of the pendulum is t, its length is g(t/2π)²,
A may be an effect of or a symptom for B, something that is a reason for believing that B. Moreover, A may not be the only possible or actual reason for B. But in A ⇛ B, if A were not to be the case then at least B might be false. The question whether C10 and C11 are adequate is very hard to decide, since we lack reliable truth criteria for ordinary language sentences with iterated "if-then"s. Take the following examples of sentences of the form A ⇒ (B ⇒ C), (A ⇒ B) ⇒ C, and A ∧ B ⇒ C:
(3) If John will come, then if Jack will come too, it will be a nice party.
(4) If Jack will come in case John comes, it will be a nice party.
(5) If John and Jack will come, it will be a nice party.
If under the given circumstances it is possible that John will come (PA), then (3) according to C10 and C11 is equivalent to
(6) If Jack will come, then it will be a nice party.
And if under the given circumstances it is possible that Jack will come in case John comes (P(A ⇒ B)), (4) is equivalent to
(7) Under the given circumstances it will be a nice party.

Under the condition that P(A ⇒ B), (5) follows from (3), but is not equivalent to (3).
All this is hardly convincing. But I doubt that we can really spell out the difference of meaning between (3), (4), and (5). Such constructions are very rare, so that we have only a narrow basis for a test of the adequacy of C10 and C11.
If we do not want to exclude such iterated "if-then"s altogether—and we wouldn't lose much for ordinary language analyses thereby—then it seems best to adopt strong principles that permit us to reduce many such sentences to simple "if-then"s, as C8 and C9 do in Modal Logic.

7. Conclusion
An explication of a concept is adequate if the explicatum is coextensional with the explicandum for the great mass of normal instances, and if the explicatum is simple and fruitful. I have tried to show that our explication of conditionals captures the main ideas that we express by them. The simplicity of the explicatum is achieved only by leaving open the question of how to give a precise sense to the notion of relative necessity8 and by passing over a lot of problems connected with natural language analyses. As for fruitfulness, just one example: Many conditional obligations have to be analyzed in the form 'If it is the case that A, then it is obligatory that B', for which a rule of detachment holds, so that we can infer from A that B is obligatory.9 If we take the 'if-then' here as a material or a strict implication, this conditional obligation admits of no exceptions. As we can think, with a little imagination, for almost every current conditional obligation of situations in which it would not hold, i.e. of conditions C so that, if A ∧ C, then not O(B), we would have inconsistency in almost all our normative systems.10 But if we analyze 'If A, then O(B)' as A ⇒ O(B), then O(B) is only said to be the normal case on condition that A, not that O(B) holds in A-worlds in which extraordinary circumstances and strange coincidences obtain. It may then very well be the case that A ⇒ O(B), but ¬(A ∧ C ⇒ O(B)). Such restrictions to normal cases are implied, I think, in most every-day statements of conditional obligation.

8 In another paper I define an epistemic interpretation of conditional necessity and try to show that a purely "objective" interpretation is, as in the cases of unconditional necessity or the similarity of worlds, impossible.
9 For other types of conditional obligations cf. Lewis (1973), 5.1, and Kutschera (1974b).
10 For a way out of this problem that comes close to deontic suicide cf. B. van Fraassen (1973).

References

ADAMS, E. W. (1970), Subjunctive and Indicative Conditionals. Foundations of Language 6, pp. 89–94.
FRAASSEN, B. VAN (1973), Values and the Heart's Command. The Journal of Philosophy 70, pp. 5–19.
GOODMAN, N. (1965), Fact, Fiction, Forecast. 2nd ed. Indianapolis.
KRIPKE, S. (1972), Naming and Necessity. In G. HARMAN and D. DAVIDSON (eds.), Semantics of Natural Language, pp. 253–355, 763–769. Dordrecht: Reidel.
KUTSCHERA, F. VON (1974a), Partial Interpretations. To appear in E. KEENAN (ed.), Formal Semantics for Natural Language, Cambridge.
KUTSCHERA, F. VON (1974b), Normative Präferenzen und bedingte Obligationen. In H. LENK (ed.), Normlogik. München-Pullach.
LEWIS, D. (1973), Counterfactuals. Cambridge, Mass.: Harvard University Press.
STALNAKER, R. C. (1968), A Theory of Conditionals. In N. RESCHER (ed.), Studies in Logical Theory, pp. 98–112. Oxford.
DISCUSSIONS AND EXPOSITIONS

H. A. LEWIS

MODEL THEORY AND SEMANTICS1
In this paper some basic concepts of model theory are introduced. The structure of a definition of truth for a formal language is illustrated and the extension and alteration required for model theory proper is explained. The acceptability of a model-theoretic account of truth in a natural language is discussed briefly.
I. Introduction
In recent years several writers have proposed that formal logic has a role to play in linguistics. One suggestion has been that a semantic theory for a natural language might take the same form as the semantic accounts that are usual for the artificial languages of formal logic. Without presupposing a knowledge of formal logic on the part of the reader, I attempt here to sketch the arguments for this suggestion in the forms it has been given by Donald Davidson and by Richard Montague. Readers already familiar with their views will find nothing here that is new, but I hope also nothing that is dangerously misleading. One good reason for seeking to say nothing that presupposes knowledge of formal logic or of model theory is that questions of importance for my subject have to be settled, or begged, before formal semantics can be developed at all. I shall therefore be dealing with some basic concepts of model theory rather than with any detailed formal developments. It may help to identify my topic more clearly if I explain why it seems to me to survive two criticisms that might be brought—two views about the role of formal logic in the semantics of natural languages that imply (from opposite directions) that no question about the applicability of formal semantics to natural languages arises.
1 This paper is a revised version of a paper presented at the meeting of the Semantics Section of the Linguistics Association of Great Britain at the University of York on 5 April 1972. My presentation of the issues owes much to Donald Davidson, in particular to Davidson (1967). I am grateful to the referee for many improvements to an earlier version.
One school of thought is very hospitable to formal logic: allowing a distinction between deep and surface structures in a grammar, it claims that in a correct grammar deep structures will be nothing other than sentences of formal logic, and that such deep structures are necessarily bearers of clear semantic information. The only serious semantic questions that arise for natural languages would then be questions about the derivation of surface structures from deep structures. This view, a caricature to be sure, seems to me too hospitable to formal logic. If formulas of logic are usable as deep structures in a generative grammar, and the principle that meaning does not change in proceeding from deep to surface structure is espoused, semantic questions are simply thrown back onto the deep structures. The semantics of first order predicate logic (to mention the most familiar logical system) is well established for the purposes of logic textbooks but not without its limitations if it is taken as accounting for meaning in natural languages. A linguist who chooses logical formulas for his deep structures enters the same line of business as the philosophers who have puzzled over the philosophically correct account of the semantics of the logical formulas themselves. Some of their puzzles depend on the acceptance of a standard way of translating to or from logical symbolism, but others arise from the usual semantic accounts for first-order logic2. Even if it were legitimate to take the semantics of standard first-order logic for granted, this logic notoriously cannot deal in a straightforward way with many aspects of natural languages, such as tenses, modalities and intentional verbs, and indexicals. But the semantics of the more complex logical systems designed to deal with such notions is sufficiently controversial among logicians that no one can safely take it for granted.

Another school of thought, viewing my subject from the opposite direction, holds that we know that a semantic account appropriate to an artificial language could not be appropriate to a natural language just because the former is artificial. The formulas of an artificial language are stipulated at their creation to have the meaning that they have, whereas a natural language, although it is a human creation, must be investigated empirically by the linguist before he can hope to erect a semantic theory3. It seems to me that the bare charge of artificiality is a pointless one: there is no reason why a semantic account that fits a language we have invented should not also fit another language that we have not. (Just as there is no reason why a human artifact should not be exactly the same shape as an object found in nature.)
2 See below, p. 276/7.
3 In many presentations of first-order logic, only the logical constants (the connectives and quantifiers and perhaps the identity sign) are regarded as having a fixed meaning. In such an approach no formula (except one consisting entirely of logical constants) has a meaning until an interpretation is assigned to the non-logical constants: so the stipulation is a two-stage process.
The idea that the semantics of natural languages is subject to empirical constraints that do not operate for artificial languages is worthy of greater respect, however. Whereas I may, it seems, decree what my artificial symbols are to mean, I must take the natural language as I find it—we can talk of 'getting the semantics right' for a natural language but not for an artificial one. As a matter of fact there is such a thing as getting the semantics wrong, indeed provably wrong, for an artificial language, because of the requirements of consistency and completeness that formal semantic accounts are intended to meet: but this remark, although it may make formal semantics sound more interesting, does not meet the difficulty about natural languages. Let us imagine that we have a proposed semantic theory for English before us, and that it gives an account of the meaning of the words, phrases, and sentences of the language. An empirical linguist must then inspect this theory to see if it squares with the facts. But what facts? The answer to this question depends to some extent on the nature of the theory: it may be a theory with obviously testable consequences. If, for example, it alleges that speakers of the language will assent to certain sentences, then provided you know how to recognize speakers of the language, and their assentings, such consequences can be tested in the field. Alternatively, the theory may provide a translation of sentences of English into another language. If the other language is Chinese, and you speak Chinese, you can check the translations for accuracy. If the other language is a language no one speaks, such as semantic markerese, whose existence is asserted only by the theory, then no such check is possible4.

The problem of empirical adequacy is a central one for semantics. A semantic theory must provide an account of the meaning of the sentences of the language it purports to describe. If the language is our own language, we should be able to tell without difficulty whether the account is correct. It is a minimal requirement of a semantic theory that it offer a translation, paraphrase or representation of each sentence of the language. Any translations it offers should be synonymous with the sentences of which they are translations. If the translations talk about abstract entities of a kind of whose existence we were unaware, we shall need to be persuaded that we really were talking about them all the time, although we did not realize it. (It is an even more basic requirement that the translation of a declarative sentence should be a declarative sentence, rather than (for example) a verbless string.)

It is not obvious that these platitudes bring us any closer to an understanding of the empirical constraints on a semantic theory. Certainly, if we think of translation as a simple pairing of sentences, it does not5. But if we think of
4 I owe the expression 'semantic markerese' to David Lewis. See Lewis, D. (1972).
5 For the possibility of giving semantics by pairing synonymous expressions, cf. Hiz (1968) and Hiz (1969).
274
Harry A. Lewis
translation as the explaining in one language of the sentences of another, we may find a way out. Compare:

(1) 'Pierre attend' means the same as 'Pierre is waiting';
(2) 'Pierre attend' means that Pierre is waiting.
(1) informs us of a relation between two sentences: (2) tells us what a particular French sentence means. A semantic theory should not simply pair sentences, it should tell us what they mean. (1) and (2) are both contingent truths, but the same cannot be said of both (3) and (4):

(3) 'John is tall' means the same as 'John is tall';
(4) 'John is tall' means that John is tall.
We know that (3) is true in virtue of our understanding of 'means the same as', and so it is a necessary truth. We also know that (4) is true, but (4) is a contingent truth about the sentence 'John is tall'6. Moreover a semantic theory about English in English, worthy of the name, should have (4) as a consequence, as well as (5):

(5) 'Four is the square of two' means that four is the square of two,

and in general should have as consequences all sentences like (A):

(A) S means that p,
where 'S' is replaced by a syntactic description of a sentence and 'p' is replaced by that sentence or a recognizable paraphrase of it. Such a theory does not simply pair sentences: it tells us what they mean. A minimal requirement on a semantic theory for a natural language is that it have as consequences sentences of form (A). The fact that the A-sentences are contingent truths rather than stipulations or necessary truths proves to be no block to providing for a natural language a semantic account similar to some that can be given for formal languages: indeed the founding father of formal semantics, Alfred Tarski, made it one of his basic requirements for a formal semantic theory that it yield something very like the A-sentences7.

It may seem that the requirement that a semantic theory yield the A-sentences is a weak one, and that the production of such a theory would be a trivial matter. It will be part of my purpose in what follows to show that this is not the case.
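To see how weak a mere pairing is, consider a small sketch (my own illustration in Python; the sample pairs are invented, not drawn from the text): a finite list of sentence descriptions and paraphrases yields its A-sentences trivially, but only finitely many of them, whereas a natural language has infinitely many sentences to account for.

```python
# A toy "theory" given as a finite pairing: a description of a sentence
# together with what it is said to mean.  (Illustrative only.)
finite_theory = [
    ("'John is tall'", "John is tall"),
    ("'Four is the square of two'", "four is the square of two"),
]

def a_sentences(pairs):
    """Spell out consequences of form (A): S means that p."""
    return [f"{s} means that {p}" for s, p in pairs]

for line in a_sentences(finite_theory):
    print(line)
# Output:
#   'John is tall' means that John is tall
#   'Four is the square of two' means that four is the square of two
# A list like this yields only finitely many A-sentences; since a natural
# language has infinitely many sentences, an adequate theory must generate
# its A-sentences recursively rather than enumerate them.
```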
6 This point can easily be misunderstood because any (true) statement about meanings might be thought to be true 'in virtue of meaning' and so necessarily true. But we do not need to know what 'John is tall' means to recognise (3) as true: all we need to know is that the same expression occurs before and after 'means the same as'.
7 See Tarski, A. (1956), in particular section 3 (pp. 186 sqq.). The idea that Tarski's approach may yet be appropriate for natural language is due to Donald Davidson. See in particular Davidson, D. (1967), (1970) and (1973).
II. Simple semantics
The semantic theories that will now be described have the form of definitions of truth for a language: they set out to give the truth-conditions of its sentences. The classical account of the definition of truth is Tarski's paper 'The concept of truth in formalized languages'. It seems to me to be essential to present briefly the main themes of that important paper8.

A definition of truth is given in a language for a language. There are thus typically two languages involved, the one for which truth is defined, which plays the role of 'object-language', and the one in which truth is defined, playing the role of 'metalanguage'. The metalanguage must be rich enough to talk about the object-language: in particular it must contain names of the symbols or words of the object-language and the means to describe the phrases and sentences of the object-language as they are built up from the words; it must also contain translations of the sentences of the object-language. Tarski lays down, in his Convention T, requirements for an adequate definition in the metalanguage of a truth-predicate: that is, requirements that a definition must fulfil if the predicate so defined is to mean 'is true'. The convention demands that it should follow from the definition that only sentences are true; and that the definition should have as consequences all strings of the form

(B) S is true if and only if p
where 'S' is replaced by a structure-revealing description of a sentence of the object-language and 'p' is replaced by a translation of the sentence S in the metalanguage. Convention T resembles the requirement that an adequate semantic theory must have the A-sentences as consequences. If we require that the definition of truth be finitely statable, then for an object-language with infinitely many sentences it is not possible to take as our definition of truth simply the conjunction of all the infinitely many B-sentences. It is in the attempt to compass all the B-strings in a finite definition of truth that the interest of the Tarski-type truth-definition lies. It is perhaps still not widely appreciated even among philosophers that the production of a definition of truth that fulfils Convention T for an interesting but infinite object-language is far from a trivial matter. Tarski himself showed that one superficially attractive trivialising move does not work. We might be tempted to use the axiom

(6) (x) ('x' is true if and only if x)
but this does not succeed in doing what was intended, since the expression immediately to the left of the 'is true' is but a name of the letter 'x'.

8 cf. note 7 above. A simple introduction to Tarski's ideas is given in Quine (1970), chapter 3: Truth.
In his paper, Tarski showed how Convention T's requirements could be met for one logical language. In order to illustrate his method I shall use a tiny fragment of first-order logic, with the following syntactic description:
(7) Symbols:
variables: w, x, y, z
predicates: F, G
the existential quantifier: E
the negation sign: n
Sentences:
(i) 'F' followed by a single variable is a sentence;
(ii) 'G' followed by two variables is a sentence;
(iii) if S is a sentence, 'n' followed by S is a sentence, called the negation of S;
(iv) if S is a sentence containing a variable other than 'w' or 'x', the result of writing 'E' followed by the variable followed by S is also a sentence, called (if the variable is v_i) the existential quantification of S with respect to v_i.
Thus the following are sentences of the fragment:

(8) Fx, Gwx, nFw, EyFy, EznGxz, nEzFz, Fy
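The inductive clauses of (7) can be mirrored directly in a short recognizer. The sketch below is my own illustration, not part of the paper; in particular, representing sentences as plain strings leaves concatenation implicit, just as the paper itself does.

```python
# A sketch of the fragment's syntax, clauses (i)-(iv) of (7).
VARIABLES = "wxyz"      # v_1 = 'w', v_2 = 'x', v_3 = 'y', v_4 = 'z'
QUANTIFIABLE = "yz"     # only variables other than 'w' and 'x' may be bound

def is_sentence(s: str) -> bool:
    """Decide whether a string is a sentence of the fragment."""
    if len(s) == 2 and s[0] == "F" and s[1] in VARIABLES:
        return True                                   # clause (i)
    if len(s) == 3 and s[0] == "G" and s[1] in VARIABLES and s[2] in VARIABLES:
        return True                                   # clause (ii)
    if s[:1] == "n":                                  # clause (iii): negation
        return is_sentence(s[1:])
    if s[:1] == "E" and len(s) > 2 and s[1] in QUANTIFIABLE:
        rest = s[2:]                                  # clause (iv): S must
        return s[1] in rest and is_sentence(rest)     # contain the variable
    return False

for example in ["Fx", "Gwx", "nFw", "EyFy", "EznGxz"]:
    assert is_sentence(example)
assert not is_sentence("EwFw")    # 'w' may not be quantified
```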
A definition9 of truth will be offered for this fragment, using a small part of ordinary English as metalanguage. This part should at least contain the predicates 'smokes' and 'loves' and the names 'John' and 'Mary'. In order to explain the truth-definition, I need to introduce the notion of satisfaction, since the fundamental semantic notion that we use is not truth but truth of, a relation between an individual and a predicate (or verb-phrase). We say that 'smokes' is true of John just in case John smokes; 'is red' is true of this poppy just in case this poppy is red. We have to complicate the notion in a natural way to fit a sentence like 'John loves Mary'. We can already say: 'loves Mary' is true of John if and only if John loves Mary, and 'John loves' is true of Mary if and only if John loves Mary. But we cannot say that 'loves' is true of John and Mary, for that would also be to say that 'loves' is true of Mary and John, but 'John loves Mary' means something different from 'Mary loves John'. We have to say in what order John and Mary are taken: so we use the notion of an ordered pair, John then Mary, of a sequence with two members, John and Mary (in that order).
9 Since a definition of truth in Tarski's style proceeds by way of a recursive definition of satisfaction, it may more appropriately be styled a theory of truth. If the metalanguage is powerful enough, such a theory may be converted into an explicit definition of truth.

Then we say that a sequence satisfies a predicate, by which we mean that the objects in the sequence,
ordered as they are, fit the predicate, ordered as it is. We must use some notational device to keep track of the places in the predicate and to correlate them with the places in the sequence. (Note however that the places in the sequence are occupied by objects, the places in the predicate by names or other noun-phrases.) In the examples just given, both object-language and metalanguage are English. In our logical language, 'F' does duty for 'smokes' and 'G' for 'loves': so 'F' is true of John if and only if John smokes, and 'G' is true of John and Mary (in that order) if and only if John loves Mary. It is now possible to give the recursive definition of satisfaction for the fragment:10

(9) For any sequence q of persons whose first member is John and whose second member is Mary11, and all i and j:
(i) q satisfies Fv_i if and only if the i'th member of q smokes;
(ii) q satisfies Gv_iv_j if and only if the i'th member of q loves the j'th member of q;
(iii) q satisfies the negation of S if and only if q does not satisfy S;
(iv) q satisfies the existential quantification of S with respect to the i'th variable if and only if at least one sequence differing from q in at most the i'th place satisfies S.
and the definition of truth:

(10) A sentence is true if and only if it is satisfied by all sequences of persons whose first member is John and whose second member is Mary.
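Definitions (9) and (10) can be run as they stand once a universe and some facts are supplied. In the sketch below the third person and the extensions of 'smokes' and 'loves' are invented assumptions; only John and Mary are fixed, as (10) requires, as the first two members of every sequence. The final assertion anticipates the sentence 'EyGyw' discussed in (11) below.

```python
# Minimal sketch of clauses (9) and definition (10), with made-up facts.
UNIVERSE = ["John", "Mary", "Peter"]
SMOKES = {"Peter"}
LOVES = {("Peter", "John"), ("John", "Mary")}

VARS = "wxyz"    # 'w' is the 1st variable, 'x' the 2nd, 'y' the 3rd, 'z' the 4th

def satisfies(q, s):
    """q is a sequence (tuple) of persons; s a sentence of the fragment."""
    if s[0] == "F":                                     # (9)(i)
        return q[VARS.index(s[1])] in SMOKES
    if s[0] == "G":                                     # (9)(ii)
        return (q[VARS.index(s[1])], q[VARS.index(s[2])]) in LOVES
    if s[0] == "n":                                     # (9)(iii)
        return not satisfies(q, s[1:])
    if s[0] == "E":                                     # (9)(iv)
        i = VARS.index(s[1])
        variants = (q[:i] + (p,) + q[i + 1:] for p in UNIVERSE)
        return any(satisfies(v, s[2:]) for v in variants)
    raise ValueError(f"not a sentence: {s}")

def is_true(s):
    """Definition (10): truth as satisfaction by all John-Mary sequences."""
    seqs = [("John", "Mary", a, b) for a in UNIVERSE for b in UNIVERSE]
    return all(satisfies(q, s) for q in seqs)

# Cf. (11): 'EyGyw' is true iff someone loves John -- here Peter does.
assert is_true("EyGyw")
```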
If the implications of this definition are unravelled, we find out for example that

(11) 'EyGyw' is true if and only if someone loves John.
(It is common to draw a veil, as I have done, over the process of abbreviation that yields this result: mention of sequences has been obliterated, but they are the most important piece of machinery required for the definition.) The definition of truth just stated tells us what sentences of the fragmentary language are to mean rather than what they do, as a matter of fact, mean: but this is only because the language that served as object-language was not one whose sentences already had a definite meaning. The procedure for defining truth could equally be followed for English sentences where both the metalanguage and the object-language were English. If it were followed, the role of the semantic definition of truth would be to articulate the structure of the language, to show how the meanings of sentences of arbitrary complexity depend on the meanings of their parts, rather than to give any very helpful information about the meanings of the parts.
10 The italicised 'F' and 'G' do duty for names of the predicates 'F' and 'G'. Concatenation, the writing of one symbol next to another, has been left to be understood, although it is usual in formal approaches to make it explicit. 'v_i' means 'the i'th variable'; e.g. 'v_3' means 'y'.
11 I owe this device for handling proper names to Donald Davidson.

A definition of truth of the simple type that I have presented does
all its serious work in the recursive clauses, and we look to it in vain for more than obvious information about the meanings of the simples (in this language, the elementary predicates and names).

The attempt to extend a definition of truth according to Convention T to a more useful fragment of English is a much more difficult task than it at first appears, however. Although we can leave many questions aside in pursuing this objective, it is still necessary to decide how sentences are built up and to determine the precise semantic role of the parts. I may illustrate the difficulties by a suggestive example familiar to philosophers. Students of modal logic (so-called) interest themselves in the notion of necessity, and concern themselves in particular with 'necessarily' as an adverb modifying whole sentences. The consequences required by Convention T of the truth definition would include sentences such as

(12) 'Necessarily John is tall' is true if and only if necessarily John is tall.
An obvious way to accommodate necessity in the recursive definition of satisfaction would be this:

(13) q satisfies the necessitation of S (i.e. the result of writing 'necessarily' then S) if and only if necessarily q satisfies S.
Such a clause implies that the sequence whose first member is the number one satisfies 'necessarily v_1 is odd' if and only if it is necessary that the sequence satisfies 'v_1 is odd'; but the notion of a necessary link between a sequence and an open sentence is surely not present in the given sentence-form, and thus (13) is false. Intentional notions resist straightforward treatment in the definition of truth.

A warning about the idea of a simple definition of truth in English for English that satisfies Convention T is appropriate here. There are formal reasons why a definition of truth in a language for the same language is not possible: unless the language concerned is very weak, a version of the Epimenides paradox will emerge:

(14) 'Is not true of itself' is not true of itself.12

12 See Tarski (1956), and cf. Martin (1970).
III. Model theory
Model theory is the investigation of the relationship between languages that can be formally described and the structures, interpretations or models for which their sentences are true. The sentences of such a formal language are, typically, true only in certain models, so that, given a collection of sentences, it is often possible to say what features any model in which all the sentences hold must have. Starting from the other end, with a model, it is usual to find that only certain sentences
hold in it. Thus, given some sentences, we can investigate their possible models: given a model, we can investigate which sentences it makes true. The connection between a language and a model is set up by means of a definition of truth: by a definition that explains under what conditions a sentence holds, or is true, in the model. The simplest definitions for first-order logic go by way of a recursive characterization of satisfaction, as described above, or directly to a recursive characterization of truth13. The sole difference from the semantics of Section II is that truth is defined as relative to a given model. In contrast, the simple semantics of Section II offers a definition of truth as absolute. For present purposes I take it as a defining characteristic of model theory that it studies relative definitions of truth.

Relative definitions of truth are quite standard in logic textbooks, both for elementary logic (where the notion of model or interpretation is needed to define logical truth and logical consequence) and for modal and intentional logics (where truth is often defined as relative to a 'possible world'14). In applications to natural languages, there is a measure of agreement that truth of sentences is relative to the context of utterance where indexical elements (e.g. tense, location) occur: the philosophical divide comes between those who hold that 'semantic' primitive notions are admissible in the definition of truth and those who do not15.

An influential recent writer who supported the claims of model theory was Richard Montague. Here are two typical statements of faith:

I reject the contention that an important theoretical difference exists between formal and natural languages. On the other hand, I do not regard as successful the formal treatments of natural languages attempted by certain contemporary linguists. Like Donald Davidson, I regard the construction of a theory of truth—or rather, of the more general notion of truth under an arbitrary interpretation—as the basic goal of serious syntax and semantics; and the developments emanating from the Massachusetts Institute of Technology offer little promise towards that end16.
13 For a recursive definition of truth-in-an-interpretation, see Mates (1972) p. 60, and for a definition of truth-in-an-interpretation by way of a recursive definition of satisfaction, see Mendelson (1964) pp. 50–51.
14 Thus for example the necessitation of S is said to be true just in case S is true in all possible worlds. For formal purposes, possible worlds function as do models: they are abstract entities in which certain sentences hold.
15 This is not the place to argue at length that this is the philosophically interesting divide. Davidson favours the absolute definition of truth, Montague the relative (see Davidson 1973, and Wallace, 1972). It is Davidson who has taught us the importance of this contrast, and I follow his usage in using 'absolute' to cover accounts of truth that do not use semantic relata (e.g. interpretations, possible worlds) even if they use a notion of truth as relative to such things as place and time of utterance.
16 Montague (1970a) p. 189. (Emphasis added.) It will be clear from the last note that 'like Donald Davidson' is here misleading.
There is in my opinion no important theoretical difference between natural languages and the artificial languages of logicians; indeed, I consider it possible to comprehend the syntax and semantics of both kinds of languages within a single natural and mathematically precise theory. On this point I differ from a number of philosophers, but agree, I believe, with Chomsky and his associates. It is clear, however, that no adequate and comprehensive semantical theory has yet been constructed, and arguable that no comprehensive and semantically significant syntactical theory yet exists. The aim of the present work is to fill this gap, that is, to develop a universal syntax and semantics17.
Both papers from which I have quoted offer a syntax and semantics for fragments of English. These accounts are technically complex and even a summary is out of the question. What I shall try to do is to characterise in general terms the model-theoretic approach and to indicate the special contributions made by Montague.

For first-order predicate logic the standard models are collections of sets, one of them, the domain, containing all the objects that can be talked about; predicates have subsets of the domain or relations on the domain assigned to them, and names have members of the domain assigned to them. An interpretation can thus be construed as a domain together with a function that assigns to each predicate or name an appropriate object taken from the domain. A sentence is then said to be true for an interpretation if it is satisfied by every sequence of objects from the domain, given the interpretation. In this approach, meaning is determined in two stages. The meanings of the logical constants (the connectives 'and', 'if... then...', etc. and the quantifiers 'all', 'some'), which also provide the recursive elements in the definition of satisfaction, are stated in advance for all interpretations. An interpretation then gives the meanings of the non-logical constants (the predicates and proper names of the language). One advantage is that the method permits of the definition of a notion of logical truth as truth in all interpretations. Moreover, truth-in-an-interpretation can be defined in advance of knowing a particular interpretation. In the simple semantics that I sketched earlier, this was not so: to give a definition of truth, we needed to have all the basic clauses for the recursive definition to hand, and we had no general way of characterizing the interpretation of, for example, a one-place predicate. The new facility could be seen as an advantage: you may have felt that the basic clauses in the simple semantics, such as (9)(i), were disappointingly trivial. The corresponding clause in a relative definition might look like this:
(15) q satisfies Fv_i in I if and only if the i'th member of q is a member of the set assigned by I to F.
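Clause (15) and its companions can be pictured concretely. In the following sketch (my own, not Montague's or Tarski's machinery) the Interpretation class and both sample models are invented for illustration; the recursive clauses are exactly those of the earlier absolute definition, but the lexical assignments now come from the interpretation I, so the same language can be read in quite different models.

```python
# A hedged sketch of satisfaction relative to an interpretation I.
class Interpretation:
    def __init__(self, domain, one_place, two_place):
        self.domain = domain          # all the objects that can be talked about
        self.one_place = one_place    # predicate letter -> subset of the domain
        self.two_place = two_place    # predicate letter -> relation on the domain

VARS = "wxyz"

def satisfies_in(q, s, I):
    """The recursive clauses again, but with lexical meanings drawn from I."""
    if s[0] == "F":                                   # cf. (15)
        return q[VARS.index(s[1])] in I.one_place["F"]
    if s[0] == "G":
        return (q[VARS.index(s[1])], q[VARS.index(s[2])]) in I.two_place["G"]
    if s[0] == "n":
        return not satisfies_in(q, s[1:], I)
    if s[0] == "E":
        i = VARS.index(s[1])
        return any(satisfies_in(q[:i] + (d,) + q[i + 1:], s[2:], I)
                   for d in I.domain)
    raise ValueError(f"not a sentence: {s}")

# Two interpretations of one language: 'F' as 'smokes' over persons, or
# 'F' as 'is even' over numbers.  The recursive clauses never change.
I1 = Interpretation(["John", "Mary"], {"F": {"John"}}, {"G": {("John", "Mary")}})
I2 = Interpretation([1, 2, 3], {"F": {2}}, {"G": {(1, 2), (2, 3)}})
```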
This appears to open up the possibility of discussing alternative interpretations of the basic elements in the language, a possibility that was not evident for the absolute definition. An interpretation can be thought of as a dictionary18 for a language whose syntax is known and about which we have semantic information up to a point: we know the meanings of the logical constants (we are not allowed to vary these from one dictionary to another) and we know the kind of interpretation that is allowed for any lexical item whatsoever, since the kind is determined by the syntactic category. What a particular dictionary, or interpretation, does is to specify which meaning of the appropriate kind each lexical item possesses. If we could approach a natural language in a similar way, we could hope to describe its syntax precisely and to determine what the appropriate meaning or interpretation for each lexical item would be. We could expect to discover some logical constants, in particular among the devices for constructing compound or complex sentences out of simpler parts.

It is plain, however, that shifting our attention from the absolute to the relative definition of truth has not at all changed the problems that must be met. The standard mode of interpreting predicate logic, together with the rough-and-ready methods we have for translating between English and the logical symbolism, fails to deal with a host of constructions and aspects of English: intentional contexts in general, in particular tense, modality and intentional verbs such as 'believe' and 'seek'; and indexical or token-reflexive elements such as personal pronouns. Moreover the lack of a serious syntactic theory linking the logical notation and English is an embarrassment. A defect that is likely to strike the linguist as particularly important is the lack of syntactic ambiguity in the logical notation. The list of obstacles could be prolonged, and they are serious. Montague and others have sought to overcome them all by providing a single theory of language, embracing syntax and semantics, which provides a framework within which a complete formal grammar for any particular language, natural or artificial, could be fitted19.

Although Montague's framework is complex, I believe that a large part of it can be understood by the use of two simple ideas20. The first involves framing all our semantical talk in terms of functions (with truth as the ultimate value). The second is the idea of analysing the syntax of a language into functional relationships. If the functions are matched in a suitable way, an elegant and powerful semantic theory appears to fall out naturally. I shall try to explain first how the notion of function can be used in semantics.

17 Montague (1970b) p. 373.
18 Strictly, the part of the interpretation that assigns meaning to expressions, but not the domain itself.
19 See Montague (1970a), (1970b) and (1973), and Lewis, D. (1972).
20 Here I am indebted to Lewis, D. (1972). I recommend this article as a preliminary to any readers who would understand Montague's writings.

The familiar requirement that a semantic theory determine under what conditions a declarative sentence is true could be stated in a more abstract way by asking that a
semantic theory define a function from declarative sentences to truth-values so that every sentence has a truth-value. The abstract way of talking, in terms of functions, appears quite gratuitous at this point, but it is indispensable for the steps that follow. Consider a simple declarative sentence:
(16) John is running.
We want a semantic theory to assign a truth-value to it in a model M: in particular, we should now like it to entail sentences like
(17) val('John is running') = T in M if and only if John is running in M.
According to the standard approach to the model theory of predicate logic, the name 'John' would receive a member of the domain of an interpretation, and the predicate 'is running' would be given a subset of the domain. (Of course, predicate logic is an artificial language: I continue to use English expressions for illustration only.) The resources of this mode of interpretation do not allow us to say that the subset assigned to 'is running' varies: but this leads to a difficulty, for of course the extension of 'is running' (the set of people who are running) varies from moment to moment, although the meaning of the expression does not. How can we assign a single meaning to 'is running' within a formal semantic theory which allows for this complexity? The answer is, we assign to the predicate a function from times to subsets of the domain; a function, it could be argued, that we already know to exist, for we know that at any time some people are running and some not. The resulting account of the truth-conditions of 'John is running' might look like this:
(18) val('John is running', t_k) = T in M if and only if val('x is running')(val('John'), t_k) = T in M.

This can be read: the valuation function yields truth in M for the arguments 'John is running' and t_k (the k-th time) if and only if the result of applying the interpretation of the predicate 'x is running' (which is a function of two arguments) to the arguments (a) the interpretation of 'John' and (b) t_k is truth in M. The standard interpretation of predicates by subsets of the domain can be progressively complicated to deal with any features of the occasion of utterance that are relevant to truth-value. A particular valuation, or model, will then ascribe to each predicate an appropriate function.

The other simple idea involved is the extension to the semantic realm of Ajdukiewicz's syntactic ideas, which derive in turn from Frege. Ajdukiewicz showed how, given two fundamental syntactic categories, it was possible to assign syntactic categories to other parts of speech21.
21 See Ajdukiewicz (1967).

The categories other than the basic sentence
and name are all functions of more and less complexity, so that the criterion of sentencehood is that the syntactic categories of the component expressions should when applied to one another yield 's' (sentence). I shall illustrate the idea and its semantic analogue by the case of a simple verb-phrase. If we know that

(19) Arthur walks.

is a sentence, and 'Arthur' a name, we know that the syntactic category of 'walks' is s/n, i.e. the function that takes a name into a sentence. A semantic analogue (much simpler however than anything in Montague) would be this: if we know that the interpretation of the whole sentence is to be a truth-value, and the interpretation of the name 'Arthur' is to be a person, we can infer that the interpretation of 'walks' is to be a function from people to truth-values.

Montague gives a far more complex account of the interpretation even of simple predicates, as he wishes to allow for the occasion of utterance and further factors. But the principle by which the appropriate meaning for items of a certain syntactic category is discovered is the same. The case of adverbs that modify verb-phrases is in point. Syntactically speaking, such adverbs can be seen as turning verb-phrases into verb-phrases (e.g. 'quickly', 'clumsily'): so semantically speaking, they turn verb-phrase meanings into verb-phrase meanings. We therefore know the type of function that such adverbs require as interpretations: they are functions from verb-phrase interpretations to verb-phrase interpretations. Adjectives that attach themselves to common nouns are treated in the same way.

The syntax of the fragment of English described in Montague's 'Universal Grammar' (Montague, 1970b) is sufficiently sophisticated to allow that

(20) Every man such that he loves a woman is a man

is a logical truth, whereas

(21) Every alleged murderer is a murderer

is not. The assignment to syntactic categories, given the semantic principle I have just presented, is not a trivial matter, and it seems to me, although I claim no expertise, that the syntactic descriptions given of fragments of English in 'Universal Grammar' and 'English as a formal language' are ingenious and interesting22. Both fragments allow syntactic ambiguities, and in the latter paper Montague suggests a way of dealing with ambiguity by relativizing his semantics to analyses of sentences, where an ambiguous sentence receives two distinct analyses.

22 See also Montague (1973).
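The matching of syntactic categories with semantic types described above can be illustrated in a few lines. Everything in the sketch below is invented for illustration and is far simpler than anything in Montague: a verb-phrase meaning is a function from a time and a person to a truth-value (building in the time-dependence of (18)), and an adverb meaning is a function from verb-phrase meanings to verb-phrase meanings.

```python
from typing import Callable

Person = str
Time = int
# The semantic type matching syntactic category s/n: from (a time and)
# a person to a truth-value.
VP = Callable[[Time, Person], bool]

FAST = {"John"}      # a hypothetical extension used by the adverb below

def is_running(t: Time, p: Person) -> bool:
    """The extension of 'is running' varies with the time, cf. (18)."""
    return p == "John" and t < 5

def quickly(vp: VP) -> VP:
    """A toy VP-adverb: a function from VP meanings to VP meanings."""
    def modified(t: Time, p: Person) -> bool:
        return vp(t, p) and p in FAST
    return modified

def sentence_value(vp: VP, bearer: Person, t: Time) -> bool:
    """Apply a VP meaning to a name's bearer at a time: a truth-value."""
    return vp(t, bearer)

assert sentence_value(is_running, "John", 3)           # 'John is running'
assert sentence_value(quickly(is_running), "John", 3)  # 'John is running quickly'
```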
IV. Conclusion
My aim in this paper has been to present the basic ideas of two approaches to the semantics of natural languages, those associated with Donald Davidson and with Richard Montague. A long critical discussion would be out of place, and it would have to draw on writings and detail not expounded here.
The main themes have been these. A semantic theory can have empirical content even if it is built on the pattern of the theories of truth usually offered for formal languages. Such a theory of truth may represent truth as an absolute notion, or as a relative notion, where the relativity may be to context of utterance (time, place, person) or to 'possible worlds'. Such notions as 'interpretation' or 'possible world', used as undefined terms in a theory of truth, rest truth on a prior notion that is 'semantic' in that it involves essential use of the notion of truth or a related concept such as reference23. A great deal of philosophy is condensed into the contrast between those accounts of truth that use a semantic primitive and those that do not; for example, the question of the intelligibility of concepts of necessity or logical truth can be phrased as the question of the acceptability of certain semantic primitives. In the present context, the contrast is that between theories of meaning for natural languages that make reference to possible worlds, models and interpretations and those that do not.

The reader new to this subject may be tempted to suggest that this contrast is unimportant, and perhaps that allegiance to the truth-definition as the criterion of adequacy in semantics is the sole interesting test. Possible worlds (some but not all models) are theoretical entities, it would seem, whose existence is postulated to help with a smoothly-running account of language. If the question is whether possible worlds are disreputable epicycles or respectable ellipses, surely time alone will tell? To be sure, the understanding of language is in us, not in the heavens: but we now readily concede that the ability to produce grammatical sentences is not the same as the ability to produce a grammar that will model the former ability. Why should semantics not trade in notions as obscure to the lay mind as are 'phrase-structure grammar' and 'generalised transformation', provided that they aid theory? Surely we can expect our theory of meaning to be at least as complicated as our syntax?

One obstacle to such a generous view lies in the criterion of adequacy built into this approach to semantics: Tarski's Convention T. It is a powerful constraint on a theory that it generate the theorems required by the Convention. Such theorems have on one side a translation or paraphrase of the sentence whose truth-conditions they thus state. Proponents of the relative definition of truth as the semantic paradigm have to persuade us that talk of possible worlds does paraphrase everyday English. If they are right, we should all be easy to persuade, for it strains credulity that native speakers cannot recognize synonymous pairs of sentences when they see them24. The welcome feature of this approach to semantics is that a lay audience can easily test the plausibility of particular proposals by asking that the B-sentences be exhibited: they may then inspect the two sides to see if one is a plausible paraphrase of the other.
23 cf. Davidson (1973).
24 This is the argument of Wallace (1972), VI.

In other words, such a proposal has empirical
consequences, and like respectable theories in other fields, it is falsifiable: unlike theories in some other fields, these can be tested by any native speakers of the language in question.
References
AJDUKIEWICZ, K. (1967), On syntactical coherence (translated from the Polish by P. T. Geach), Review of Metaphysics 20, 635–647.
DAVIDSON, D. (1967), Truth and meaning, Synthese 17, 304–323.
DAVIDSON, D. (1970), Semantics for natural languages, pp. 177–188 in: Linguaggi nella società e nella tecnica, Milan: Edizioni di Comunità.
DAVIDSON, D. (1973), In defense of Convention T, pp. 76–86 in: Leblanc, H. (Ed.), Truth, Syntax and Modality, Amsterdam: North-Holland.
HIZ, H. (1968), Computable and uncomputable elements of syntax, pp. 239–254 in: van Rootselaar, B. and J. F. Staal (Eds.), Logic, Methodology and Philosophy of Science III, Amsterdam: North-Holland.
HIZ, H. (1969), Aletheic semantic theory, The Philosophical Forum 1 (New Series), 438–451.
LEWIS, D. (1972), General semantics, pp. 169–218 in: Davidson, D. and G. Harman (Eds.), Semantics of natural language, Dordrecht: Reidel.
MARTIN, R. L. (1970), The paradox of the liar, New Haven: Yale University Press.
MATES, B. (1972), Elementary Logic (second edition), London: Oxford University Press.
MENDELSON, E. (1964), Introduction to Mathematical Logic, Princeton, N.J.: D. Van Nostrand Company.
MONTAGUE, R. (1970a), English as a formal language, pp. 189–223 in: Linguaggi nella società e nella tecnica, Milan: Edizioni di Comunità.
MONTAGUE, R. (1970b), Universal grammar, Theoria 36, 374–398.
MONTAGUE, R. (1973), The proper treatment of quantification in ordinary English, pp. 221–242 in: Hintikka, K. J. J., J. M. E. Moravcsik and P. Suppes (Eds.), Approaches to natural languages, Dordrecht: Reidel.
QUINE, W. V. O. (1970), Philosophy of Logic, Englewood Cliffs: Prentice-Hall.
TARSKI, A. (1956), The concept of truth in formalised languages, pp. 152–278 in: Logic, semantics, metamathematics (translated by J. H. Woodger), Oxford: Clarendon Press.
WALLACE, J. (1972), On the frame of reference, pp. 212–252 in: Davidson, D. and G. Harman (Eds.), Semantics of natural language, Dordrecht: Reidel.
IRENA BELLERT
REPLY TO H. H. LIEB
I wish to take this opportunity to reply to Lieb's comments (in: Grammars as Theories, Theoretical Linguistics, Vol. I [1974], pp. 39–115) on my proposal (I. Bellert, Theory of Language as an Interpreted Formal Theory, Proceedings of the 11th International Congress of Linguists, Bologna, 1972). I will discuss only some critical remarks which, if valid, would make my proposal untenable. The first is due to a misinterpretation of one part of my text, which in fact was carelessly formulated and hence misleading. I am indebted to Lieb for his observations. The others, which I found objectionable, give me the opportunity to clarify my statements.

On page 103 Lieb says: "The separation of the 'axiomatic implications' from the 'meta-rule' is untenable. To make sense of the conception, we have to take "A" in an axiomatic implication as a free variable (...) Because of the free variable(s) these axioms are neither true nor false and they have no acceptable interpretation (...)". Of course the separation of the axiomatic implications from the meta-rule is untenable, and by no means was it intended so in the proposal. But it is quite evident that the conception would not make any sense at all if we took "A" as a free variable. As I said on page 291: "Notice that in the above implicational scheme the expressions A, R and (S, D) (addresser, receiver and sentence with its structural description, respectively) are all bound by universal quantifiers" (the stress is added). It is obvious then that I did not mean "A" to be a free variable. The meta-rule cannot be separated from the expression "C → A PROPOSITIONAL ATTITUDE S1", which constitutes only part of it, and thus, if taken separately, could by no means be said to be true or false, nor even to constitute a well-formed formula. However, what evidently misled Lieb was that part of my paper in which I used the latter expression in referring to axiomatic implications, for the sake of brevity. The reason for my careless formulation was that the expressions "C" in the antecedent and "PROPOSITIONAL ATTITUDE" and "S1" in the consequent are the only ones which are specific to each implication and essential for analyticity, whereas the remaining expressions are exactly the same in all implications, and the variables are bound by universal quantifiers. Therefore, when establishing axiomatic implications for any language or a fragment of a
language, we would have to specify only the mentioned expressions, while the entire meta-rule would always be presupposed as part of each implication. Perhaps the term 'axiomatic schema' would be more appropriate than 'meta-rule'. In conclusion, I should then have said that the interpretative component will consist of a finite set of axiomatic implications of the form given by the axiomatic schema (meta-rule), the essential and language-specific expressions of which are "C", "PROPOSITIONAL ATTITUDE" and "S1".

Lieb objects to my statement that "the consequents can be said to follow formally from the antecedents". He says: "But even in analytic sentences material implication does not mean deducibility." (Footnote 131, page 104.) I cannot agree with his objections. Material implication, clearly, does not mean deducibility, but this is not what my statement says. What I say here is in agreement with the terminology established by Tarski and widely accepted in the literature. Let me recall Tarski's definitions of the terms in question. He considers a finite class of sentences K from which a given sentence X follows. He denotes by the symbol Z the implication whose antecedent is the conjunction of the sentences in the class K and whose consequent is the sentence X. He then gives the following equivalences: "The sentence X is (logically) derivable from the sentences of the class K if and only if Z is logically provable. The sentence X follows formally from the sentences of the class K if and only if Z is analytical. The sentence X follows materially from the sentences of the class K if and only if the sentence Z is true" (Logic, Semantics, Metamathematics, Oxford: At the Clarendon Press, 1956, p. 419). In my proposal I can correspondingly say that the sentence in the consequent of an axiomatic implication follows formally from the class of sentences (or conjunction of sentences) in the antecedent, as the implication is analytical.

Furthermore, Lieb questions the analyticity of the axiomatic implications (Footnote 133, page 104). The question of properly distinguishing analytic statements from synthetic (contingent) statements has been widely discussed in the literature and there is no complete agreement as to the status of some statements. However, when discussing analyticity the authors agree that a statement is said to be analytical if its truth is based on meaning alone, independently of extra-linguistic facts. Carnap's meaning postulates have been proposed as an intended explication of the concept of analyticity. He defines meaning postulates as L-true implications. L-truth is an explicatum for what Leibniz called necessary and Kant analytic truth. In Carnap's formulation the antecedent L-implies the consequent. His example is: "If Jack is a bachelor, then he is not married" (Meaning and Necessity: Meaning Postulates, Phoenix Books, 1967, pp. 10 and 222). In spite of the controversies involved, undoubtedly there is a difference made in logic
between unconditionally true statements and contingent, factual statements: to the former class belong logically true statements and those that are not theorems in standard logic but whose truth is independent of extra-linguistic facts. Those are usually called analytical. Now, since my implications are intended to be constructed so that their truth depends solely on the meanings of the words and the structures involved, I presume that they can correctly be called analytical. Moreover, if they are taken as axioms of the theory, their truth cannot, without contradiction, be considered contingent, and they have to be taken as unconditionally true statements in the theory. Marian Przełęcki has discussed in detail the status of meaning postulates in axiomatized empirical theories (The Logic of Empirical Theories, Routledge & Kegan Paul, New York, 1969). As he observed, it is a usual practice in axiomatizing empirical theories to explicate the meaning of extra-logical terms by meaning postulates, which are then considered to be analytical sentences of that language.

Lieb questions, however, the empirical content of such a theory or grammar (Footnote 133, p. 104). The class of sentences that follow from a given sentence S and some pertinent meaning postulates obviously adds nothing to the meaning of S. Meaning postulates, or axiomatic implications (in my terminology), are established for explicating the non-logical terms and structures contained in S. The axiomatic implications in my proposal are intended to explicate more complex predicates used in specific structural conditions in terms of other predicates, in a way which, in principle, should reflect the native speakers' understanding of the language; that is, in particular, they should account for the conclusions speakers generally can draw from the corresponding utterances in any fixed universe of discourse for which the meaning and denotation of the terms involved are clearly understood.

In order to test the empirical adequacy of axiomatic implications, it is necessary to look for a possible state of affairs in which the antecedent holds true but the consequent does not. If such a case is found, the implication in question should be rejected, or the conditions C in the antecedent should be modified in such a way that they become necessary conditions (as they are intended to be). But this can be done only by determining a universe of discourse, as well as the denotation of some predicates (those that are not further explicated by axiomatic implications, but occur in the consequents only), by establishing in some way (other than the verbal way of specifying axiomatic implications) the sets of individuals in the universe of discourse of which the given predicates hold true. Otherwise the theory will indeed have no empirical content. It is clear, however, that such tests are based ultimately on speakers' judgements only.
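The test just described can be pictured mechanically: fix a universe of discourse and the denotations of the unexplicated predicates, state the implication, and search for a case where the antecedent holds but the consequent fails. The sketch below is my own illustration, borrowing Carnap's bachelor postulate; the universe and the extensions are invented.

```python
# Fix a universe of discourse and denotations for the predicates that are
# not further explicated (all of this is invented for illustration).
UNIVERSE = {"Jack", "Jill", "Joe"}
BACHELOR = {"Jack", "Joe"}
MARRIED = {"Jill", "Joe"}        # 'Joe' is deliberately anomalous

def postulate(x):
    """Carnap's example: if x is a bachelor, then x is not married."""
    antecedent = x in BACHELOR
    consequent = x not in MARRIED
    return antecedent, consequent

def counterexamples(universe, post):
    """Individuals for which the antecedent holds but the consequent fails."""
    return [x for x in universe if post(x)[0] and not post(x)[1]]

print(counterexamples(UNIVERSE, postulate))   # ['Joe']
# Finding such a case means the implication must be rejected, or the
# conditions in its antecedent strengthened, before it can count as analytic.
```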
Finally, I wish to add that, being a linguist with only some knowledge of logic, I did not aim at a rigorous formalization of the proposal but, rather, did what is common practice for non-logicians interested in the possibility of formalizing some aspects of their empirical field: namely, I submitted for discussion a rough outline of a theory which would account for the empirical fact that speakers are capable of drawing a number of conclusions from a single utterance by virtue of only the meanings of the words and structures involved, independently of extra-linguistic facts. And I am indebted for all critical observations which may help me in further clarifying my proposal, as has been the case with Lieb's comments.