Semantics & Pragmatics Volume 3, Article 1: 1–72, 2010 doi: 10.3765/sp.3.1
Quantifiers in than-clauses∗ Sigrid Beck University of Tübingen
Received 2009-01-13 / First Decision 2009-03-17 / Revised 2009-06-17 / Second Decision 2009-07-06 / Revised 2009-07-27 / Accepted 2009-07-27 / Published 2010-01-25
Abstract The paper reexamines the interpretations that quantifiers in than-clauses give rise to. It develops an analysis that combines an interval semantics for the than-clause with a standard semantics for the comparative operator. In order to mediate between the two, interpretive mechanisms like maximality and maximal informativity determine selection of a point from an interval. The interval semantics allows local interpretation of the quantifier. Selection predicts which interpretation this leads to. Cases in which the prediction appears not to be met are explained via recourse to independently attested external factors (e.g. the interpretive possibilities of indefinites). The goal of the paper is to achieve coverage of the relevant data while maintaining a simple semantics for the comparative. A secondary objective is to reexamine, restructure and extend the set of data considered in connection with the problem of quantifiers in than-clauses.
Keywords: comparatives, degrees, intervals, quantifiers, indefinites, plurals, scope

∗ Versions of this paper were presented at the workshop on covert variables in Tübingen 2006, at two Semantic Network meetings (in Barcelona 2006 and Oslo 2007), at the 2009 Topics in Semantics seminar at MIT, and at the Universität Frankfurt 2009. I would like to thank the organizers Frank Richter and Uli Sauerland and the audiences at these presentations for important feedback. Robert van Rooij and Jon Gajewski have exchanged ideas with me. The B17 project of the SFB 441 has accompanied the work presented here (Remus Gergel, Stefan Hofstetter, Sveta Krasikova, John Vanderelst), as have Arnim von Stechow and Irene Heim. Several anonymous reviewers and Danny Fox have given feedback on earlier versions, and David Beaver and Kai von Fintel have commented on the prefinal version. I am very grateful to them all.
©2010 Sigrid Beck This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).
1 Introduction The problem of quantifiers in than-clauses has been puzzling linguists for a long time, beginning with von Stechow 1984, via Schwarzschild & Wilkinson 2002, Schwarzschild 2004, and Heim 2006b, to very recent approaches in Gajewski 2008, van Rooij 2008 and Schwarzschild 2008. It can be illustrated with the examples below. (1)
John ran faster than every girl did.
(1′)
a. For all x, x is a girl: John ran faster than x. b. #The degree of speed that John reached exceeds the degree of speed that every girl reached. i.e. “John’s speed exceeds the speed of the slowest girl.”
(2)
John ran faster than he had to.
(2′)
a. #For all w, w is a permissible world: John ran faster in @ than he ran in w. b. The degree of speed that John reached in @ exceeds the degree of speed that he has in every permissible world w. i.e. “John’s actual speed exceeds the slowest permissible speed.” (@ stands for the real world)
Example (1) intuitively only has a reading that appears to give the universal NP scope over the comparison, namely (1′a): all the girls were slower than John. The reading in which the universal NP takes narrow scope relative to the comparison is paraphrased in (1′b). Here we must look at degrees of speed reached by all girls; depending on the precise semantics of the than-clause (see below), this could mean the maximal speed that they all reached, i.e. the speed of the slowest girl. Example (1) has no reading that compares John’s speed to the speed of the slowest girl. Sentence (2), on the other hand, only has a reading that gives the modal universal quantifier narrow scope relative to the comparison, (2′b). That is, we consider the degrees of speed that John reaches in all worlds compatible with the rules imposed by the modal base of have to. This will yield the slowest permissible speed, and (2) intuitively says that John’s actual speed exceeded this minimum requirement. The sentence is not¹ understood to mean that John did something that was against the rules; that is, reading (2′a), in which the modal takes scope over the comparison, is not available.

¹ Heim (2006b) and Krasikova (2008) include a discussion of when readings like (2′a) are available. The reading can be made more plausible with a suitable context, depending on the modal chosen. For the moment I will stick to the simpler picture presented in the text. See Section 3 for more discussion.

We must ask ourselves how a quantifier contained in the than-clause can have wide scope at all, why it cannot get narrow scope in (1), and why (2) is the opposite. Since, as we will see in more detail below, these questions look unanswerable under the standard analysis of comparatives, the researchers cited above have been led to a revision of the semantic analysis of comparison. Schwarzschild & Wilkinson (2002) employ an interval semantics for the than-clause and give the comparative itself an interval semantics. Heim (2006b) adopts intervals, but ultimately reduces the semantics of the comparison back to a degree semantics through semantic reconstruction. This allows her to retain a simple meaning of the comparative operator. A than-clause internal operator derives the different readings that quantifiers in than-clauses give rise to. The line of research in Gajewski 2008, van Rooij 2008 and Schwarzschild 2008 in turn adopts the idea of a than-clause internal operator but not the intervals.

In this paper, I pursue a strategy that can be seen as an attempt to simplify Schwarzschild & Wilkinson’s proposal. Like them, I derive a meaning for the than-clause without a than-clause internal operator, and that meaning is based on an interval semantics. But I combine this with a standard semantics of the comparative in the spirit of von Stechow 1984. This means that the end result of interpreting the than-clause must be a degree. Everything will hinge on selecting the right degree, so that each of the relevant examples receives the right interpretation.

In Section 2, I present the current state of our knowledge in this domain. The analysis of than-clauses is presented in Section 3. Section 4 ends the paper with a summary and some discussion of consequences of the proposed analysis.

2 State of affairs

I first present a sample of data that I take to be representative of the interpretational possibilities that arise with quantifiers in than-clauses. Then I sketch Schwarzschild & Wilkinson’s (2002) and Heim’s (2006b) analyses in Section 2.2, and in Section 2.3 a summary of the proposals in Gajewski 2008, van Rooij 2008 and Schwarzschild 2008.
2.1 The empirical picture

2.1.1 A classical analysis of the comparative

The basis of our present perception of the problem presented by (1) and (2) is the analysis of the comparative construction, because the data are understood in terms of whether the quantifier appears to take wide scope over the comparison according to a classical analysis of the comparative, or whether it would have to be seen as taking narrow scope relative to the comparison. My presentation assumes a general theoretical framework like Heim & Kratzer 1998 and begins specifically with Heim’s (2001) version of the theory of comparison promoted in von Stechow 1984 (see also Klein 1991 and Beck 2009 for an exposition and Cresswell 1977; Hellan 1981; Hoeksema 1983; Seuren 1978 for theoretical predecessors). This theory is what I will refer to as a classical analysis of the comparative. For illustration, I discuss the simple example (3a) below. In (3b) I provide the Logical Form and in (3c) the truth conditions derived by compositional interpretation of that Logical Form, plus paraphrase. Interpretation relies on the lexical entries of the comparative morpheme and gradable adjectives as given in (4).

(3) a. Paule is older than Knut is.
    b. [-er [⟨d,t⟩ than 2 [Knut is t2 old]] [⟨d,t⟩ 2 [Paule is t2 old]]]
    c. max(λd. Paule is d-old) > max(λd. Knut is d-old) = Age(Paule) > Age(Knut)
       “The largest degree of age that Paule reaches exceeds the largest degree of age that Knut reaches.”
       “Paule’s age exceeds Knut’s age.”

(4) a. ⟦-er⟧ = λD_⟨d,t⟩. λD′_⟨d,t⟩. max(D′) > max(D)
    b. ⟦old⟧_⟨d,⟨e,t⟩⟩ = [λd. λx. x is d-old] = [λd. λx. Age(x) ≥ d]
    c. Let S be a set ordered by R. Then max_R(S) = ιs[s ∈ S & ∀s′ ∈ S[sRs′]]
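The entries in (3) and (4) can be made concrete with a small executable sketch. It is only an illustration under simplifying assumptions that are not part of the paper: degrees are modelled as integers on a finite scale (`SCALE`), and the ages are invented. The names `at_least`, `max_degree` and `er` are hypothetical helpers standing in for (4b), (4c) and (4a) respectively.

```python
# Hypothetical finite degree scale (e.g. years or centimetres); an assumption
# made only so that max can be computed by enumeration.
SCALE = range(0, 301)

def at_least(measure):
    """Monotone degree predicate [λd. measure ≥ d], in the style of (4b)."""
    return lambda d: measure >= d

def max_degree(pred):
    """max as in (4c), restricted to SCALE: the largest degree satisfying pred."""
    degrees = [d for d in SCALE if pred(d)]
    if not degrees:
        raise ValueError("max is undefined for an empty set of degrees")
    return max(degrees)

def er(than_pred, matrix_pred):
    """Comparative morpheme as in (4a): max(D') > max(D)."""
    return max_degree(matrix_pred) > max_degree(than_pred)

# Illustrative ages, not taken from the paper.
age = {"Paule": 40, "Knut": 35}

# (3c): max(λd. Paule is d-old) > max(λd. Knut is d-old)
paule_older = er(at_least(age["Knut"]), at_least(age["Paule"]))
```

Because the predicates are monotone, `max_degree(at_least(n))` simply returns `n`, which mirrors the step from the max-expressions to Age(Paule) > Age(Knut) in (3c).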
Importantly, the role of the comparative operator is ultimately to relate the maximal degree provided by the than-clause to some matrix clause degree. The than-clause provides degrees through abstraction over the degree argument slot of the adjective. Different versions of such a classical analysis are available (for instance von Stechow’s (1984) own or Kennedy’s (1997)), but the problem of quantifiers in than-clauses presents itself in a parallel fashion in all of them.

I will make one small revision to the above version of the classical analysis: I will suppose that what is written into the lexical entry of the comparative morpheme as the maximality operator in (4a) is not actually part of the meaning of the comparative itself. Rather, it is a general mechanism that allows us to go from a description of a set to a particular object, for example also in the case of free relative clauses in (5) (Jacobson 1995); see also Beck 2009. I represent maximality in the Logical Form, as indicated in (4′b). The meaning of the comparative is then simply (4′a), the ‘larger than’ relation. It is basically this meaning of the comparative that I will try to defend below. The resulting interpretation remains of course the same.

(5) a. We bought [what we liked].
    b. max(λx. we liked x)

(4′) a. ⟦-er⟧ = λd_d. λd′_d. d′ > d
     b. [-er [d than max 2 [Knut is t2 old]] [d max 2 [Paule is t2 old]]]
     c. max(λd. Paule is d-old) > max(λd. Knut is d-old)

2.1.2 Apparent wide scope quantifiers
Universal NPs are a standard example of an apparent wide scope quantifier (see e.g. Heim 2006b). The sentence in (6) below only permits the reading in (6′a), not the one in (6′b). This can be seen from the fact that the sentence would be judged false in the situation depicted below.

(6) John is taller than every girl is.

(6′) a. ∀x[girl(x) → max(λd. John is d-tall) > max(λd. x is d-tall)]
        “For every girl x: John’s height exceeds x’s height.”
     b. #max(λd. John is d-tall) > max(λd. ∀x[girl(x) → x is d-tall])
        “John’s height exceeds the largest degree to which every girl is tall.”
        “John is taller than the shortest girl.”

[Figure: height scale with four points, from lowest to highest: g1’s height, J’s height, g2’s height, g3’s height]
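The contrast between (6′a) and (6′b) in the depicted situation can be checked mechanically. The following is a minimal sketch, assuming a finite integer scale and invented heights that merely respect the ordering in the figure (g1 below John, g2 and g3 above him); none of the numbers come from the paper.

```python
SCALE = range(0, 301)  # hypothetical finite scale of heights in cm

def max_degree(pred):
    """Largest degree on SCALE that satisfies pred."""
    return max(d for d in SCALE if pred(d))

# Invented heights matching the ordering g1 < John < g2 < g3.
height = {"John": 170, "g1": 160, "g2": 175, "g3": 180}
girls = ["g1", "g2", "g3"]

def tall(x):
    """[λd. x is d-tall], i.e. Height(x) ≥ d."""
    return lambda d: height[x] >= d

# (6'a): the universal outscopes the comparison.
wide = all(max_degree(tall("John")) > max_degree(tall(g)) for g in girls)

# (6'b): the universal sits inside the degree predicate of the than-clause,
# so max picks out the height of the shortest girl.
narrow = max_degree(tall("John")) > max_degree(
    lambda d: all(tall(g)(d) for g in girls)
)
```

In this scenario `wide` comes out false and `narrow` true, which matches the observation that (6) is judged false here and therefore cannot have the reading (6′b).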
The classical semantics of comparatives makes this look as if the NP had to take scope over the comparative. The LF given in (6″a) can straightforwardly be interpreted to yield (6′a); analogously for (6″b) and (6′b). Thus, strangely, the sentence appears to permit with (6″a) only an LF which violates constraints on Quantifier Raising (QR): QR is normally confined to a simple finite clause (May 1985 and much subsequent work). The LF in (6″b), which would be unproblematic syntactically, is not possible.

(6″) a. [[every girl] [1 [-er [d than max 2 [t1 is t2 tall]] [d max 2 [John is t2 tall]]]]]
     b. [-er [d than max 2 [[every girl] [1 [t1 is t2 tall]]]] [d max 2 [John is t2 tall]]]
The example with the differential in (7) shows the same behaviour (it uses a version of the comparative that accommodates a difference degree, (7c)).

(7) a. John is 2″ taller than every girl is.
    b. ∀x[girl(x) → max(λd. John is d-tall) ≥ max(λd. x is d-tall) + 2″]
       = For every girl x: John’s height exceeds x’s height by 2″.
    c. ⟦-er_diff⟧ = λd. λd′. λd″. d″ ≥ d + d′
The problem posed by (6) and (7) is exacerbated in (8), as Schwarzschild & Wilkinson (2002) observe. We have once more a universal quantifier, but this time it is one that is taken to be immobile at LF: the intensional verb predict. Still, the interpretation that is intuitively available looks to be one in which the universal outscopes the comparison, (8′a). The interpretation in which comparison takes scope over predict, (8′b), is not possible. This is problematic because the LF we would expect (8) to have is (10), and (10) is straightforwardly interpreted to yield (8′b).

(8) John is taller than I had predicted (that he would be).

(9) My prediction: John will be between 1.70 m and 1.80 m.
    Claim made by (8): John is taller than 1.80 m.

(8′) a. ∀w[wR@ → max(λd. John is d-tall in @) > max(λd. John is d-tall in w)]
        “For every world compatible with my predictions: John’s actual height exceeds John’s height in that world.”
     b. #max(λd. John is d-tall in @) > max(λd. ∀w[wR@ → John is d-tall in w])
        “John’s actual height exceeds the degree of tallness which he has in all worlds compatible with my predictions.”
        “John’s actual height exceeds the shortest prediction, 1.70 m.”
     (where R is the relevant accessibility relation, compare e.g. Kratzer 1991)

(10) [-er [⟨d,t⟩ than max 2 [I had predicted that [John be t2 tall]]] [⟨d,t⟩ max 2 [John is t2 tall]]]
This is the interpretive behaviour of many quantified NPs, plural NPs like the girls, quantificational adverbs, verbs of propositional attitude and some modals (e.g. should, ought to, might). See Schwarzschild & Wilkinson 2002 and Heim 2006b for a more thorough empirical discussion.

2.1.3 Apparent narrow scope quantifiers

Not all quantificational elements show this behaviour. A universal quantifier that does not is the modal have to, along with some others (be required, be necessary, need). This is illustrated below.

(11) Mary is taller than she has to be.

(12) Mary wants to play basketball. The school rules require all players to be at least 1.70 m.
     Claim made by (11): Mary is taller than 1.70 m.

(11′) a. ?#∀w[wR@ → max(λd. Mary is d-tall in @) > max(λd. Mary is d-tall in w)]
         = For every world compatible with the school rules: Mary’s actual height exceeds Mary’s height in that world; i.e. Mary is too tall.
      b. max(λd. Mary is d-tall in @) > max(λd. ∀w[wR@ → Mary is d-tall in w])
         = Mary’s actual height exceeds the degree of tallness which she has in all worlds compatible with the school rules; i.e. Mary’s actual height exceeds the required minimum, 1.70 m.

These modals permit what appears to be a narrow scope interpretation relative to the comparison. Example (11) does not favour an apparent wide scope interpretation. Krasikova (2008) argues though that some examples with have to–type modals may have both readings, depending on context. (13) is one of her examples favouring a reading analogous to (11′a), an apparent wide scope reading of have to (see Section 3 for more discussion).

(13)
He was coming through later than he had to if he were going to retain the overall lead. (from Google, cited from Krasikova 2008) = He was coming through too late.
Existential modals like be allowed also appear to take narrow scope: (14)
Mary is taller than she is allowed to be.
(15)
a. #∃w[wR@ & max(λd. Mary is d-tall in @) > max(λd .Mary is d-tall in w)] = It would be allowed for Mary to be shorter than she actually is. b. max(λd. Mary is d-tall in @) > max(λd. ∃w[wR@ & Mary is d-tall in w]) = Mary’s actual height exceeds the largest degree of tallness that she reaches in some permissible world; i.e. Mary’s actual height exceeds the permitted maximum.
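The readings in (11′) and (15) differ only in where the modal quantifier over worlds sits relative to max. The sketch below spells this out under assumptions of my own: a finite integer scale, an invented actual height for Mary, and a small invented list of heights she has in the accessible worlds. It is meant as an illustration of the formulas, not as a claim about which reading a given modal selects.

```python
SCALE = range(0, 301)  # hypothetical finite scale of heights in cm

def max_degree(pred):
    """Largest degree on SCALE that satisfies pred."""
    return max(d for d in SCALE if pred(d))

actual = 180                 # Mary's height in @ (illustrative)
accessible = [170, 175, 190] # Mary's height in each accessible world (illustrative)

mary_actual = max_degree(lambda d: actual >= d)

# Modal outscopes the comparison, as in (11'a) and (15a).
wide_universal = all(
    mary_actual > max_degree(lambda d, h=h: h >= d) for h in accessible
)
wide_existential = any(
    mary_actual > max_degree(lambda d, h=h: h >= d) for h in accessible
)

# Modal inside the degree predicate, as in (11'b) and (15b).
# ∀w under max yields the lowest accessible height (the required minimum);
# ∃w under max yields the highest accessible height (the permitted maximum).
required_minimum = max_degree(lambda d: all(h >= d for h in accessible))
permitted_maximum = max_degree(lambda d: any(h >= d for h in accessible))

narrow_universal = mary_actual > required_minimum
narrow_existential = mary_actual > permitted_maximum
```

With these numbers the narrow scope universal reading is true (180 exceeds the minimum 170) while the narrow scope existential reading is false (180 does not exceed the maximum 190), so the two placements of the quantifier are clearly distinguishable.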
And so do some other existential quantifiers and disjunction: (16)
Mary is taller than anyone else is.
(17)
a. #There is someone that Mary is taller than. b. Mary’s height exceeds the largest degree of tallness reached by one of the others.
(18)
Mary is taller than John or Fred are.
(19)
a. ?#For either John or Fred: Mary is taller than that person. b. Mary’s height exceeds the maximum height reached by John or Fred.
This is the interpretive behaviour of some modals (e.g. need, have to, be allowed, be required), some indefinites (especially NPIs) and disjunction (compare once more Heim 2006b). It is also the behaviour of negation and negative quantifiers, with the added observation that the apparent narrow scope reading is one which often gives rise to undefinedness, hence unacceptability (von Stechow 1984; Rullmann 1995). (That this is not invariably the case is shown by (22), illustrating that we are concerned with a constraint on meaning rather than form.) (20)
*John is taller than no girl is.
(21) a. John’s height exceeds the maximum height reached by no girl. The maximum height reached by no girl is undefined, hence: unacceptability of this reading.
     b. #There is no girl who John is taller than.
(22)
I haven’t been to the hairdresser longer than I haven’t been to the dentist.
Here is how the empirical picture presents itself from the point of view of a classical analysis of comparatives. It appears that there are two different scope readings possible for quantifiers embedded inside the than-clause, wide or narrow scope relative to the comparison. But there is usually no ambiguity. Each individual quantifier favours at most one reading (negation frequently permits none). Apparent narrow scope readings are straightforwardly captured by the classical analysis. It is unclear how apparent wide scope readings are to be derived at all. As Schwarzschild & Wilkinson argue, they are beyond the reach of an LF analysis. It is also unclear what creates the pattern in the readings that we have observed.

Before we examine modern approaches to this problem, a final comment on the data. I have presented them the way they are presented in the literature on the subject, as if they were all impeccable and their interpretations clear. But I would like to use this opportunity to point out that I find some of them fairly difficult and perhaps not even entirely acceptable. This concerns example (6), for which I would much prefer a version with a definite plural (the girls instead of every girl). The NP the girls is, if anything, more problematic under the classical analysis, as Schwarzschild & Wilkinson (2002) point out (having less of an inclination towards wide scope); but see Section 4 for a comment on how this issue may be relevant for the analysis developed in this paper.

(6‴) a. ?John is taller than every girl is.
     b. John is taller than the girls are.
        ∀x[x ∈ the girls → John is taller than x]
Other instances are examples with intensional verbs like predict or expect; when a genuine range is predicted or expected, intuitions regarding when sentences with differentials like (8″) would be true vs. false are not very firm. This seems to me an area in which a proper empirical study might be helpful. The issue is taken up in Section 3.4.

(8″) a. John is two inches taller than I had predicted (that he would be).
     b. John arrived at most 10 minutes later than I had expected.
2.2 New analyses I

Since it is very hard to see how the data can be derived under the classical theory, the two theories summarized below (Schwarzschild & Wilkinson 2002 and Heim 2006b) both change the semantics of the comparative construction in ways that reanalyse scope. The quantificational element inside the than-clause can take scope there even under the apparent wide scope reading. The two theories differ with respect to the semantics they attribute to the comparison itself. They also differ in their empirical coverage.

2.2.1 Schwarzschild & Wilkinson 2002
Schwarzschild & Wilkinson (2002) are inspired by the scope puzzle to a complete revision of the semantics of comparison. The feature of the classical analysis that they perceive as the crux of our problem is that the than-clause provides a degree via abstraction over degrees. According to them, the quantifier data show that the than-clause instead must provide us with an interval on the degree scale — in (23) below an interval into which the height of everyone other than Caroline falls. (23)
Caroline is taller than everyone else is. ‘Everyone else is shorter than Caroline.’

[Figure: degree scale with points x1, x2, x3 and C in increasing order; the stretch containing x1, x2 and x3 is marked “interval that covers everyone else’s height”; the interval is related to Caroline’s height by the comparative]

(24) ⟦than everyone else is⟧ = λD. everyone else’s height falls within D (where D is of type ⟨d,t⟩)²

To simplify, I will suppose that it is somehow ensured that we pick the right matrix clause interval (Caroline’s height in (23), Joe’s height in the example below).

² I present the discussion here in terms of the classical theory’s ontology, where degrees (type d, elements of D_d) are points on the degree scale and what I call an interval is a set of points, type ⟨d,t⟩.

(25)
Joe is taller than exactly 5 people are.
Here is a rough sketch of Schwarzschild & Wilkinson’s analysis of this example. (26)
Subord: [λD. exactly 5 people’s height falls within D]
Matrix + Comp: max D′: [Joe’s height − D′] ≠ 0
               the largest interval some distance below Joe’s height
Whole clause: the largest interval some distance below Joe’s height is an interval into which exactly 5 people’s height falls.
Note that the quantifier is not given wide scope over the comparison at all under this analysis. The interval idea allows us to interpret it within the than-clause. While solving the puzzle of apparent wide scope operators, the analysis makes wrong predictions for apparent narrow scope quantifiers (cf. example (27)). The available reading cannot be accounted for ((28a) is the semantics predicted by the classical analysis, corresponding to the intuitively available reading; (28b) is the semantics that the Schwarzschild & Wilkinson analysis predicts). (27)
John is taller than anyone else is.
(28)
a. John’s height > max(λd. ∃x[x ≠ John & x is d-tall])
b. #The largest interval some distance below John’s height is an interval into which someone else’s height falls = Someone is shorter than John.
The breakthrough achieved by this analysis is that we can assign to the than-clause a useful semantics while interpreting the quantifier inside that clause. For this reason, the interval idea is to my mind a very important innovation. The analysis still has a crucial problem in that it does not extend to the apparent narrow scope quantifiers. That is, it fails in precisely those cases that were unproblematic for the classical analysis. I will also mention that the semantics of comparison becomes rather complex under this analysis, since the comparative itself compares intervals. This is not in line with the plot I outlined above of maintaining as the semantics of the comparative operator the plain ‘larger than’-relation.

2.2.2 Heim 2006b
Heim (2006b) adopts the interval analysis, but combines it with a scope mechanism that derives ultimately a wide and a narrow scope reading of a quantifier relative to a comparison. Her analysis extends proposals by Larson (1988). Larson’s own analysis is only applicable to than-clauses with an adjective phrase gap denoting a property of individuals — a limitation remedied by Heim. Let us consider her analysis of apparent wide scope of quantifier data, like (29), first. Heim’s LF for the sentence is given in (30). She employs an operator Pi (Point to Interval, credited to Schwarzschild (2004)), whose semantics is specified in (31). Compositional interpretation (once more somewhat simplified for the matrix clause, for convenience) is given in (32). (29)
John is taller than every girl is.
(30)
[ IP [ CP than [1 [every girl [2 [ AP [Pi t1 ] [3 [ AP t2 is t3 tall]]]]]]] [ IP 4 [[-er t4 ] [5 [John is t5 tall]]]]
(31)
⟦Pi⟧ = λD. λP. max(P) ∈ D

(32) a. main clause:
        ⟦[4 [[-er t4] [5 [John is t5 tall]]]]⟧ = λd. John is taller than d
     b. than-clause:
        ⟦[than [1 [every girl [2 [AP [Pi t1] [3 [AP t2 is t3 tall]]]]]]]⟧ =
        λD′. ⟦[every girl [2 [AP [Pi t1] [3 [AP t2 is t3 tall]]]]]⟧^g[D′/1] =
        λD′. ∀x[girl(x) → ⟦[AP [Pi t1] [3 [AP t2 is t3 tall]]]⟧^g[D′/1, x/2]] =
        λD′. ∀x[girl(x) → [λD. λP. max(P) ∈ D](D′)(⟦[3 [t2 is t3 tall]]⟧^g[x/2])] =
        λD′. ∀x[girl(x) → [λD. λP. max(P) ∈ D](D′)(λd. Height(x) ≥ d)] =
        λD′. ∀x[girl(x) → max(λd. Height(x) ≥ d) ∈ D′] =
        λD′. ∀x[girl(x) → Height(x) ∈ D′]
        intervals into which the height of every girl falls
     c. main clause + than-clause:
        ⟦(29)⟧ = [λD′. ∀x[girl(x) → Height(x) ∈ D′]](λd. John is taller than d) =
        ∀x[girl(x) → Height(x) ∈ (λd. John is taller than d)] =
        for every girl x: John is taller than x
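The derivation in (32) can be replayed step by step in code. This is a minimal sketch under assumptions that go beyond the text: degrees are integers on a finite scale, intervals are modelled as Python sets of degrees, and the heights are invented. The function names (`pi`, `than_every_girl`, `main_clause`, `sentence_29`) are hypothetical labels for the pieces of (31) and (32), not anything defined in Heim 2006b.

```python
SCALE = range(0, 301)  # hypothetical finite scale of heights in cm

def max_degree(pred):
    """Largest degree on SCALE that satisfies pred."""
    return max(d for d in SCALE if pred(d))

def pi(interval):
    """(31): Pi = λD. λP. max(P) ∈ D, with D modelled as a set of degrees."""
    return lambda pred: max_degree(pred) in interval

def than_every_girl(height, girls):
    """(32b): λD'. ∀x[girl(x) → Height(x) ∈ D'], built up via Pi."""
    return lambda interval: all(
        pi(interval)(lambda d, g=g: height[g] >= d) for g in girls
    )

def main_clause(height, subject):
    """(32a): the set of degrees d such that the subject is taller than d."""
    return {d for d in SCALE if height[subject] > d}

def sentence_29(height, girls):
    """(32c): apply the than-clause meaning to the main-clause interval."""
    return than_every_girl(height, girls)(main_clause(height, "John"))

girls = ["g1", "g2", "g3"]
john_tallest = {"John": 185, "g1": 160, "g2": 175, "g3": 180}  # invented
john_middle = {"John": 170, "g1": 160, "g2": 175, "g3": 180}   # invented
```

Under these assumptions `sentence_29(john_tallest, girls)` is true and `sentence_29(john_middle, girls)` is false: the quantifier is evaluated inside the than-clause, yet the result is the apparent wide scope reading.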
The than-clause provides intervals into which the height of every girl falls. The whole sentence says that the set of degrees exceeded by John’s height is such an interval. Semantic reconstruction (i.e. lambda conversion) simplifies the whole to the claim intuitively made, that every girl is shorter than John. The analysis assumes that the denotation domain D_d is a set of degree ‘points’, and that intervals are elements of D_⟨d,t⟩. The analysis is a way of interpreting the quantifier inside the than-clause, and deriving the apparent wide scope reading over the comparison through giving the quantifier scope over the shift from degrees to intervals (the Pi operator). It is applicable to other kinds of quantificational elements like intensional verbs in the same way. Our example with predict is analysed below; the intuitively plausible reading can now be derived straightforwardly from the LF in (34).

(33) a. John is taller than I had predicted (that he would be).
     b. ∀w[wR@ → max(λd. John is d-tall in @) > max(λd. John is d-tall in w)]
        = For every world compatible with my predictions: John’s actual height exceeds John’s height in that world.

(34) [IP [CP than [1 [I had predicted [CP [Pi t1] [2 [AP John t2 tall]]]]]]] [IP 3 [John is taller than t3]]]

(35) a. main clause:
        ⟦[3 [John is taller than t3]]⟧ = (λd. John is taller than d in @)
     b. than-clause:
        ⟦[than [1 [I had predicted [CP [Pi t1] [2 [AP John t2 tall]]]]]]⟧ =
        [λD′. ∀w[wR@ → ⟦[CP [Pi t1] [2 [AP John t2 tall]]]⟧^g[D′/1]]] =
        [λD′. ∀w[wR@ → max(λd. Height(John)(w) ≥ d) ∈ D′]] =
        [λD′. ∀w[wR@ → Height(John)(w) ∈ D′]]
        intervals into which John’s height falls in all my predictions
     c. main clause + than-clause:
        ⟦(34)⟧ = [λD′. ∀w[wR@ → Height(John)(w) ∈ D′]](λd. J is taller than d in @) =
        for every w compatible with my predictions: John’s actual height exceeds John’s height in w.
The effect of the Pi operator on the predicate of degrees it combines with is sketched below for the AP tall. As long as a than-clause quantifier takes scope over the Pi operator, the resulting meaning of the whole sentence will be one that lets the quantifier take scope over the comparison, even though it is interpreted syntactically below the comparative operator and inside the than-clause.

(36)
Pi shifts from degrees to intervals: [λd. Height(x) ≥ d] =⇒ [λD. Height(x) ∈ D]
In contrast to Schwarzschild & Wilkinson’s original interval analysis, Heim is able to derive apparently narrow scope readings of an operator relative to the comparison as well. The sentence in (37a) is associated with the LF in (38). Note that here, the shifter takes scope over the operator have to. This makes have to combine with the degree semantics in the original, desired way, giving us the minimum compliance height (just like it did before, without the intervals). The shift is essentially harmless. (37)
a. Mary is taller than she has to be.
b. max(λd. Mary is d-tall in @) > max(λd. ∀w[wR@ → Mary is d-tall in w])
   = Mary’s actual height exceeds the degree of tallness which she has in all worlds compatible with the school rules; i.e. Mary’s actual height exceeds the required minimum, 1.70 m.

(38) [IP [CP than [1 [[[Pi t1] [2 [has-to [Mary t2 tall]]]]]]] [IP 3 [Mary is taller than t3]]]

(39) a. main clause:
        ⟦[3 [Mary is taller than t3]]⟧ = (λd. Mary is taller than d in @)
     b. than-clause:
        ⟦[than [1 [[[Pi t1] [2 [has-to [Mary t2 tall]]]]]]]⟧ =
        λD′. ⟦[[Pi t1] [2 [has-to [Mary t2 tall]]]]⟧^g[D′/1] =
        λD′. max(λd. ⟦has-to [Mary t2 tall]⟧^g[d/2]) ∈ D′ =
        λD′. max(λd. ∀w[wR@ → Mary is d-tall in w]) ∈ D′
        intervals into which the required minimum falls
     c. main clause + than-clause:
        ⟦(38)⟧ = [λD′. max(λd. ∀w[wR@ → Mary is d-tall in w]) ∈ D′](λd. Mary is taller than d in @) =
        Mary is taller than the required minimum.
Other apparent narrow scope operators receive a parallel analysis. The crucial ingredient to this analysis is that the Pi operator is a scope bearing element, able to take local or non-local scope. Pi-phrase scope interaction is summarized below: (40)
Pi takes narrow scope relative to quantifier =⇒ apparent wide scope reading of quantifier over comparison Pi takes wide scope relative to quantifier =⇒ apparent narrow scope reading of quantifier relative to comparison
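The two scope configurations in (40) can be compared directly for a modal such as has-to. The sketch below again assumes a finite integer scale, intervals as sets of degrees, and invented heights for Mary in the actual world and in the accessible worlds; `modal_over_pi` and `pi_over_modal` are hypothetical names for the two LF configurations.

```python
SCALE = range(0, 301)  # hypothetical finite scale of heights in cm

def max_degree(pred):
    """Largest degree on SCALE that satisfies pred."""
    return max(d for d in SCALE if pred(d))

def pi(interval):
    """Pi = λD. λP. max(P) ∈ D, with D modelled as a set of degrees."""
    return lambda pred: max_degree(pred) in interval

actual = 180                  # Mary's height in @ (illustrative)
accessible = [170, 175, 190]  # Mary's height in each accessible world (illustrative)

# Main clause: the set of degrees that Mary's actual height exceeds.
main_clause = {d for d in SCALE if actual > d}

def modal_over_pi(interval):
    """Pi takes narrow scope: λD'. ∀w[wR@ → Height(Mary)(w) ∈ D']."""
    return all(pi(interval)(lambda d, h=h: h >= d) for h in accessible)

def pi_over_modal(interval):
    """Pi takes wide scope, as in (38): λD'. max(λd. ∀w[...]) ∈ D'."""
    return pi(interval)(lambda d: all(h >= d for h in accessible))

apparent_wide = modal_over_pi(main_clause)    # Mary exceeds every accessible height
apparent_narrow = pi_over_modal(main_clause)  # Mary exceeds the required minimum
```

With these numbers `apparent_wide` is false and `apparent_narrow` is true, so the choice of Pi's scope position alone decides which truth conditions result.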
Thus than-clauses include a shift from degrees to intervals, which allows us to assign a denotation to the than-clause with the quantifier. The shift amounts to a form of type raising. Through semantic reconstruction, the matrix clause is interpreted in the scope of a than-clause operator when that operator has scope over the shifter. In contrast to Schwarzschild & Wilkinson, comparison is ultimately between degrees, not intervals. Heim’s analysis is able to derive both wide and narrow scope readings of operators in than-clauses. It does so without violating syntactic constraints. There is, however, an unresolved question: when do we get which reading? How could one constrain Pi-phrase/operator interaction in the desired way? One place where this problem surfaces is once more negation, where we expect an LF that would generate an acceptable wide scope of negation reading. That is, the LF in (41b) should be grammatical and hence (41a) should be acceptable on the reading derived from this LF in (42). (41)
a. *John is taller than no girl is. b. [ IP [ CP than [1 [no girl [2 [[Pi t1 ] [3 [ t2 is t3 tall]]]]]]] [ IP 4 [[-er t4 ] [5 [John is t5 tall]]]]
(42)
a. main clause:
   ⟦[4 [[-er t4] [5 [John is t5 tall]]]]⟧ = λd. John is taller than d
b. than-clause:
   ⟦[than [1 [no girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]]⟧ =
   λD′. for no girl x: max(λd. x is d-tall) ∈ D′
   intervals into which the height of no girl falls
c. main clause + than-clause:
   ⟦(41b)⟧ = [λD′. for no girl x: max(λd. x is d-tall) ∈ D′](λd. J is taller than d) =
   for no girl x: John is taller than x
Adopting the interval analysis, but combining it with a scope mechanism and semantic reconstruction, allows Heim to derive both types of readings (apparent narrow and apparent wide scope), and to reduce the comparison ultimately back to a comparison between degrees. Thus her empirical coverage is greater and the semantics of comparison simpler than Schwarzschild & Wilkinson’s analysis. The problem that this analysis faces is overgeneration. We do not have an obvious way of predicting when we get which reading. The fact that in general, only one scope possibility is available makes one doubt that this is really a case of systematic scope ambiguity. 2.3 Alternative new analyses: Gajewski, van Rooij, Schwarzschild There is a group of new proposals — Gajewski 2008, van Rooij 2008 and Schwarzschild 2008 — for how to deal with quantifiers in than-clauses whose approach seems to be inspired by Heim’s (2006b) analysis. I present below a simplified version of this family of approaches that is not entirely faithful to any of them. I call this the NOT-theory. It can be summarized in relation to the previous subsection as ‘keep the than-clause internal operator, but not the intervals’. It adopts the idea that there is an operator — like Heim’s Pi — that can take wide or narrow scope relative to a than-clause quantifier, dictating what kind of reading the comparative sentence receives. It does not adopt an interval analysis, and thus the operator is not Pi and the semantics of the comparative is not the classical one. Instead, the operator is negation and the proposed semantics is basically Seuren’s (1978). 2.3.1 Seuren’s semantics for the comparative (operator: NOT) Seuren (1978) suggests (43b) as the interpretation of (43a). The than-clause provides the set of degrees of tallness that Bill does not reach. It does so by virtue of containing a negation, as illustrated in the LF in (44). 
This meaning could be combined intersectively with the main clause and the degree existentially bound, as represented in (45).

(43) a. John is taller than Bill is.
     b. ∃d[Height(J) ≥ d & ¬ Height(B) ≥ d]
     c. ‘There is a degree of tallness that John reaches and Bill doesn’t reach.’

(44) a. than λd[NOT Bill is d-tall]
     b. λd[¬ Height(B) ≥ d] = λd[Height(B) < d]

(45) [∃ [λd [John is d-tall] [than λd [NOT Bill is d-tall]]]]
The authors mentioned above note that this semantics gives us an easy way to derive the intuitively correct interpretation for apparent wide scope quantifiers. This is illustrated below for the universal NP. In (46) I show that the desired meaning is easily described in this analysis and in (47) I provide the LF for the than-clause that derives it. (48) illustrates that some, another apparent wide scope quantifier, is equally unproblematic.

(46) a. John is taller than every girl is.
     b. ∃d[Height(J) ≥ d & ∀x[girl(x) → Height(x) < d]]
     c. ‘Every girl is shorter than John.’

(47) a. than every girl is
     b. than λd [every girl [1 [NOT [t1 is d tall]]]]
     c. than λd.∀x[girl(x) → Height(x) < d]
     [Figure: height scale marking g1, g2, g3, g4, ...]

(48) a. John is taller than some girl is.
     b. ∃d[Height(J) ≥ d & ∃x[girl(x) & Height(x) < d]]
     c. ‘There is a girl who is shorter than John.’
An interesting application is negation, here illustrated with the negative quantifier no. Proceeding in the now familiar way, we derive (49b). Rephrasing this in terms of (49c) makes it clear that the resulting semantics is very weak. Whenever the girls have any measurable height at all (that is, whenever the than-clause can be appropriately used) there will be a height degree that John reaches and that all the girls reach as well. The smallest degree on the scale will be such a degree. The NOT-theory proposes that the sentence is unacceptable because it is necessarily uninformative.

(49) a. *John is taller than no girl is.
     b. ∃d[Height(J) ≥ d & for no girl x: Height(x) < d]
     c. ∃d[Height(J) ≥ d & for every girl x: Height(x) ≥ d]
        uninformative! (The lowest degree on the height scale makes this true.)
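The truth conditions in (46) to (49) can be checked mechanically on a toy model. The following is only a sketch under stated assumptions: heights are whole centimetres, the degree scale is the finite range `SCALE` with 0 as its lowest degree, and the helper names (`not_theory`, `every_girl`, `some_girl`, `no_girl`) are hypothetical labels introduced here, not part of any of the cited proposals.

```python
from typing import Callable, Iterable, Sequence

# Assumption: a finite, discrete height scale in centimetres whose lowest degree is 0.
SCALE = range(0, 251)

def not_theory(john: int, than_clause: Callable[[int], bool],
               scale: Iterable[int] = SCALE) -> bool:
    """∃d[Height(J) ≥ d & than_clause(d)], the schema in (45)."""
    return any(john >= d and than_clause(d) for d in scale)

def every_girl(girls: Sequence[int]) -> Callable[[int], bool]:
    # (47c): λd. ∀x[girl(x) → Height(x) < d]
    return lambda d: all(h < d for h in girls)

def some_girl(girls: Sequence[int]) -> Callable[[int], bool]:
    # (48b): λd. ∃x[girl(x) & Height(x) < d]
    return lambda d: any(h < d for h in girls)

def no_girl(girls: Sequence[int]) -> Callable[[int], bool]:
    # (49b): λd. for no girl x: Height(x) < d
    return lambda d: not any(h < d for h in girls)

girls = [150, 160, 170]
print(not_theory(175, every_girl(girls)))  # True: John exceeds the tallest girl
print(not_theory(165, every_girl(girls)))  # False
print(not_theory(165, some_girl(girls)))   # True: John exceeds the shortest girl
print(not_theory(120, no_girl(girls)))     # True: the lowest degree verifies (49b)
```

In this toy model the `no_girl` case comes out true for any John with a measurable height, which is the sense in which (49) is uninformative.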
2.3.2 NOT has to take varying scope
The NOT-theory needs another important ingredient: Just like the Pi-operator above, other than-clause internal operators have to take flexible scope relative to NOT in order to create the different readings we observe. This is illustrated below with the familiar have to example, and with allowed.

(50) a. Mary is taller than she has to be.
     b. #∃d[Height(M)(@) ≥ d & ∀w[wR@ → NOT Height(M)(w) ≥ d]]
        ‘Mary should have been shorter than she is.’
     c. ∃d[Height(M)(@) ≥ d & NOT∀w[wR@ → Height(M)(w) ≥ d]]
        ‘Mary is taller than the minimally required height.’

(51) a. John is taller than he is allowed to be.
     b. #∃d[Height(J)(@) ≥ d & ∃w[wR@ & NOT Height(J)(w) ≥ d]]
        = ∃d[Height(J)(@) ≥ d & ∃w[wR@ & Height(J)(w) < d]]
        ‘John would have been allowed to be shorter than he is.’
     c. ∃d[Height(J)(@) ≥ d & NOT∃w[wR@ & Height(J)(w) ≥ d]]
        ‘John is taller than the tallest permissible height.’

(52) a. #than λd [allowed [λw [NOT [John is d tall in w]]]]
     b. than λd [NOT [allowed [λw [John is d tall in w]]]]
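The four scope configurations in (50) and (51) can be compared on a small model. This is a sketch only, assuming that the accessible worlds are represented simply by the subject's height in each of them, that the scale is the finite range `SCALE`, and that `wide_not` and `narrow_not` are hypothetical helper names; Python's built-in `all` and `any` stand in for the universal and the existential modal.

```python
from typing import Callable, Iterable, Sequence

SCALE = range(0, 251)  # assumption: whole centimetres

Modal = Callable[[Iterable[bool]], bool]  # all = have to, any = allowed

def wide_not(actual: int, accessible: Sequence[int], modal: Modal) -> bool:
    """∃d[Height(@) ≥ d & NOT modal_w[Height(w) ≥ d]]: NOT scopes over the modal."""
    return any(actual >= d and not modal(h >= d for h in accessible) for d in SCALE)

def narrow_not(actual: int, accessible: Sequence[int], modal: Modal) -> bool:
    """∃d[Height(@) ≥ d & modal_w[NOT Height(w) ≥ d]]: the modal scopes over NOT."""
    return any(actual >= d and modal(not h >= d for h in accessible) for d in SCALE)

heights_in_accessible_worlds = [170, 175, 180]

# (50c): true as soon as the actual height exceeds the minimum requirement (170).
print(wide_not(172, heights_in_accessible_worlds, all))    # True
# (50b): would require exceeding the height in every accessible world.
print(narrow_not(172, heights_in_accessible_worlds, all))  # False
# (51c): true only above the tallest permissible height (180).
print(wide_not(172, heights_in_accessible_worlds, any))    # False
print(wide_not(185, heights_in_accessible_worlds, any))    # True
# (51b): true as soon as some permitted height is lower.
print(narrow_not(172, heights_in_accessible_worlds, any))  # True
```

The sketch generates all four readings equally easily, which is the overgeneration issue: nothing in it distinguishes the attested readings (50c) and (51c) from the unattested (50b) and (51b).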
Just like the Pi-theory, then, the NOT-theory is able to generate the range of readings we observe for operators in than-clauses. It seems somewhat simpler than the Pi-theory in that it does not take recourse to intervals in addition to a scopally flexible than-clause internal operator. But as in the case of the Pi-theory, we must next ask ourselves what prevents the unavailable readings, e.g. what excludes the LF in (52a).

2.3.3 Which reading?
The NOT-theory would have an empirical advantage over the Pi-theory if constraints on scope could be found to deal with the overgeneration problem we noted above. A first successful application concerns polarity items. Example (53a) can only have the LF in (54b), not the one in (54a), according to constraints on the distribution of NPIs. Thus we only derive the appropriate interpretation. Note though that the Pi-theory has the same success since the scope of Pi is a downward entailing environment, but the rest of the than-clause isn’t (compare Heim 2006b). (55) is the mirror image.
(53) a. John is taller than any girl is.
     b. #∃d[Height(J) ≥ d & ∃x[girl(x) & Height(x) < d]]
        ‘There is a girl who is shorter than John.’
     c. ∃d[Height(J) ≥ d & NOT ∃x[girl(x) & Height(x) ≥ d]]
        ‘John reaches a height degree that no girl reaches.’
        = John is taller than every girl.

(54) a. *than λd [any girl [1 [NOT [t1 is d tall]]]]
     b. than λd [NOT [any girl [1 [t1 is d tall]]]]

(55) John is taller than some girl is.
Let us next reexamine negation. Two interpretations need to be considered. The one in (56b) was already rejected above as uninformative. It turns out that the alternative interpretation is equally uninformative. The ungrammaticality of negation in than-clauses is thus captured elegantly by this theory. Here it has an advantage over the Pi-theory.

(56) a. *John is taller than no girl is.
     b. ∃d[Height(J) ≥ d & for no girl x: Height(x) < d]   uninformative
     c. ∃d[Height(J) ≥ d & NOT for no girl x: Height(x) ≥ d]   uninformative
        = ∃d[Height(J) ≥ d & some girl x: Height(x) ≥ d]
Among the proponents of the NOT-theory, Schwarzschild (2008) examines modals. He argues that the NOT-theory predicts that modals in than-clauses should give rise to the same reading that they have with ordinary clause-mate negation. This prediction is borne out, as the examples below illustrate.

(57) a. John is not allowed to be that tall.     NOT allowed
     b. than he is allowed to be.

(58) a. John might not be that tall.             might NOT
     b. than he might be.

(59) a. John is not supposed to be that tall.    supposed NOT
     b. than he is supposed to be.

(60) a. John is not required to be that tall.    NOT required
     b. than he is required to be.
While this is helpful with modals, it stops short of explaining the interpretation associated with intensional full verbs like predict.
(61) a. John was not predicted to be that tall.
     b. than he was predicted to be.
NOT predict — #
Two further possible constraints are discussed. Van Rooij (2008) examines universal DPs and Gajewski (2008) investigates numeral DPs. Let us consider both in turn. Note first that a universal DP is ambiguous relative to clause mate negation. In particular it allows a reading in which the universal takes narrow scope relative to negation. Thus there are no inherent scope constraints that would help us to exclude (63′b) as an LF of (63a). But exclude it we must, since it gives rise to the unavailable reading (63c).

(62) a. Every girl isn’t that tall.    ambiguous
     b. than every girl is.

(63) a. John is taller than every girl is.
     b. ∃d[Height(J) ≥ d & ∀x[girl(x) → Height(x) < d]]
        ‘Every girl is shorter than John.’
     c. #∃d[Height(J) ≥ d & NOT ∀x[girl(x) → Height(x) ≥ d]]
        ‘John reaches a height that some girl doesn’t.’
        = John is taller than the shortest girl.

(63′) a. than λd [every girl [1 [NOT [t1 is d tall]]]]
      b. *than λd [NOT [every girl [1 [t1 is d tall]]]]
Van Rooij observes that (63′a) yields stronger truth conditions than (63′b). He proposes that if no independent constraint excludes one of the LFs, you have to pick the one that results in the stronger truth conditions. This amounts to the suggestion that than-clauses fall within the realm of application of the Strongest Meaning Hypothesis (SMH; Dalrymple, Kanazawa, Kim, Mchombo & Peters 1998). If they do, the NOT-theory can make the desired predictions about every DPs (and some other relevant examples). So could the Pi-theory, though, so this does not distinguish between the two scope based theories of quantifiers in than-clauses.

While I am sympathetic to the idea of extending application of the SMH, I see some open questions for doing so in the case of than-clauses. Dalrymple et al. originally proposed the SMH to deal with the interpretation of reciprocals. (64a) receives a stronger interpretation than (64b), for example, because the predicate to stare at makes it factually impossible for the reading of (64a) to ever be true. Similarly for (64c) vs. (64a,b). But (64a) only has one interpretation, the strongest one, and (64b) also cannot have a reading parallel to (64c). The SMH says, very roughly, that out of the set of theoretically possible interpretations you choose the strongest one that has a chance of resulting in a true statement, i.e. that is conceptually possible.³

(64) a. These three people know each other.
        = everyone knows everyone else.
     b. These three people were staring at each other.
        = everyone was staring at someone else.
     c. These three people followed each other into the elevator.
        = everyone followed, or was followed by, someone else.
There is a theoretical question as to when the SMH applies. We would not wish it to apply in (62) for instance because it would predict that there is no ambiguity. When there is ambiguity, the data in question must not be subject to the SMH. Are than-clauses in the domain of application of the SMH? Prima facie, this seems very plausible, because, just like reciprocals, they are (almost always) unambiguous, while semantic theory provides several potential interpretations. What strikes me as problematic is that there is no way to make the weaker reading emerge, even if the stronger reading is conceptually impossible. The following sentences are necessarily false, rather than having the interpretations indicated.

(64′) a. (about a 100 m race:) The next to last finalist was faster than every other finalist.
         ≠ the next to last finalist was faster than the slowest other finalist.
      b. (in an elevator:) The second button from the bottom is higher than every other button.
         ≠ the second button from the bottom is higher than the lowest other button.
      c. Friday is earlier than every other day of the week.
         ≠ Friday is earlier than the latest other day of the week.

3 Below I provide the formulation of the SMH given in Beck 2001. If we extend the domain of application of the SMH to than-clauses, we need to strike out those phrases that make explicit reference to reciprocals, as indicated. The relevant point is that the SMH makes reference to interpretations compatible with non-linguistic information I, which in the examples in (64′) would be knowledge about the order of finalists, elevator buttons and weekdays, parallel to knowledge about processions of people and possibilities for staring in (64).
  (i) Strongest Meaning Hypothesis (SMH)
      Let Sr be the set of theoretically possible reciprocal interpretations for a sentence S. Then, S can be uttered felicitously in a context c, which supplies non-linguistic information I relevant to the reciprocal’s interpretation, provided that the set Sc has a member that entails every other one.
      Sc = {p: p is consistent with I and p ∈ Sr}
      In that case, the use of S in c expresses the logically strongest proposition in Sc.
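The selection step that the SMH describes can be sketched as a small procedure. The sketch rests on assumptions that go beyond the formulation in footnote 3: propositions are modelled as sets of possible worlds, the non-linguistic information I is itself a set of worlds, "consistent with I" is taken to mean a non-empty intersection with I, and entailment is the subset relation. The names `strongest_meaning`, `strong`, `weak` and the world labels are hypothetical.

```python
from typing import FrozenSet, Optional, Sequence

Proposition = FrozenSet[str]  # a proposition = the set of worlds where it is true

def entails(p: Proposition, q: Proposition) -> bool:
    return p <= q

def strongest_meaning(candidates: Sequence[Proposition],
                      info: Proposition) -> Optional[Proposition]:
    """Return the member of Sc that entails every other member, or None if there is none."""
    # Sc: the candidate interpretations that are consistent with I.
    consistent = [p for p in candidates if p & info]
    for p in consistent:
        if all(entails(p, q) for q in consistent):
            return p
    return None  # no felicitous use according to the SMH

strong = frozenset({"w1"})              # e.g. the 'every other' reading
weak = frozenset({"w1", "w2", "w3"})    # e.g. the 'least other' reading

# Nothing rules out the strong reading: it is selected.
print(strongest_meaning([strong, weak], frozenset({"w1", "w2", "w3", "w4"})) == strong)  # True
# World knowledge excludes w1: the SMH now selects the weak reading.
print(strongest_meaning([strong, weak], frozenset({"w2", "w3"})) == weak)                # True
```

The second call shows what an SMH account would lead one to expect for (64′): once the strong reading is excluded by world knowledge, the weak one should surface. The judgments reported for (64′) indicate that it does not.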
Thus than-clauses do not seem parallel to reciprocals. It would be better if an LF that gives rise to the ‘the least . . . other’ reading for universal DPs simply did not exist.

Turning now to numeral DPs, note first that it is not immediately obvious how the NOT-theory predicts a plausible meaning for them at all. Gajewski (2008) points out that the following analysis of exactly-DPs gives rise to truth conditions that are too weak. (65′) would be true in a situation in which more than three girls stay below John’s height.

(65) John is taller than exactly three girls are.

(65′) ∃d[Height(J) ≥ d & for exactly 3 girls x: Height(x) < d]
      ‘At least three girls are below John’s height.’
      [Figure: height scale marking g1, g2, g3, g4, g5, d and H(J)]

Reversing the scope of NOT and the exactly-DP doesn’t help:

(65″) ∃d[Height(J) ≥ d & NOT for exactly 3 girls x: Height(x) ≥ d]
      ‘There is a degree of height that John reaches that is not reached by exactly 3 girls, i.e. fewer or more girls reach that degree.’
      true e.g. if John is taller than every one of five girls
Gajewski develops an analysis that relies on Krifka’s (1999) work on exactly, at least and at most, according to which these elements take effect at the level of the utterance, far away from their surface position. I present this analysis in simplified terms below, using (66) to illustrate. The semantic effect of exactly is due to an operator I call EXACT, which applies at the utterance level and operates on the basis of the ordinary as well as the focus semantic
value of its argument. The operator’s semantics is given in (67). The truth conditions derived for the example are the right ones, as shown in (68) ((68) uses Link’s (1983) operator ∗ for pluralization of the noun).

(66) a. Exactly three girls weigh 50 lb.
     b. [EXACT [XP (exactly) three_F girls weigh 50 lb.]]

(66′) ⟦three_F girls weigh 50 lb.⟧_o = ∃X[∗girl(X) & card(X) = 3 & ∗weigh.50.lb(X)]
      ⟦three_F girls weigh 50 lb.⟧_f = {∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)] : n ∈ N}

(67) ⟦EXACT⟧(⟦XP⟧_f)(⟦XP⟧_o) = 1 iff ⟦XP⟧_o = 1 & ∀q ∈ ⟦XP⟧_f: ¬(⟦XP⟧_o → q) → ¬q
     ‘Out of all the alternatives of XP, the most informative true one is the ordinary semantics of XP.’

(68) ⟦(66b)⟧ = 1
     iff ∃X[∗girl(X) & card(X) = 3 & ∗weigh.50.lb(X)] & ∀n[n > 3 → ¬∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)]]
     iff max(λn.∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)]) = 3
Krifka’s analysis of exactly allows us to assign the problematic example (65) the LF in (69), which captures the right meaning, namely the interpretation in (69′).

(69) EXACT [∃ [λd [John is d-tall] [than λd [three_F girls [λx [NOT x is d-tall]]]]]]

(69′) max(λn. ∃d[Height(J) ≥ d & for n girls x: Height(x) < d])
      ‘The largest number n such that John reaches a height that n girls don’t is 3.’
      = exactly three girls are shorter than John.
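The effect of applying EXACT at the utterance level, as in (68) and (69′), can be illustrated on a toy model. This is a sketch under assumptions of my own choosing: heights are whole centimetres on the finite range `SCALE`, ‘n girls’ in the prejacent is read existentially (there are at least n such girls, as in (66′)), and the alternatives are the numerals up to one more than the number of girls. The names `prejacent` and `exact` are hypothetical.

```python
from typing import Sequence

SCALE = range(0, 251)  # assumption: whole centimetres

def prejacent(john: int, girls: Sequence[int], n: int) -> bool:
    """∃d[Height(J) ≥ d & for n girls x: Height(x) < d], with 'n girls' read as 'at least n'."""
    return any(john >= d and sum(h < d for h in girls) >= n for d in SCALE)

def exact(john: int, girls: Sequence[int], n: int) -> bool:
    """EXACT as unpacked in (68): the prejacent holds for n and fails for every larger numeral."""
    alternatives = range(len(girls) + 2)
    return prejacent(john, girls, n) and not any(
        prejacent(john, girls, m) for m in alternatives if m > n)

girls = [150, 155, 160, 180, 185]
print(prejacent(190, girls, 3))  # True: without EXACT the truth conditions are too weak
print(exact(190, girls, 3))      # False: five girls are shorter than John
print(exact(175, girls, 3))      # True: exactly three girls are shorter than John
```

The contrast between the first two calls corresponds to the difference between the overly weak (65′) and the intended reading (69′).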
Thus independently motivated assumptions about numerals allow the NOT-theory to derive the desired interpretation. However, there is still the question of the other LF, (70), in which NOT takes scope over the DP. This gives rise to interpretation (70′).

(70) EXACT [∃ [λd [John is d-tall] [than λd [NOT three_F girls [λx [x is d-tall]]]]]]

(70′) max(λn. ∃d[Height(J) ≥ d & NOT for n girls x: Height(x) ≥ d])
      ‘The largest number n such that there is a height John reaches and it’s not the case that n girls do is 3.’
      = exactly two girls are shorter than John.

The reasoning in (71) makes it clear that this reading leads to truth conditions that do not correspond to an available reading; they would make the sentence true in the situation depicted, where there are two girls shorter than John.

(71) a. ∃d[Height(J) ≥ d & NOT for n girls x: Height(x) ≥ d]
        = ∃d[Height(J) ≥ d & fewer than n girls reach d]
     b. ∃d[Height(J) ≥ d & fewer than 3 girls reach d]
     [Figure: height scale marking g1, g2, g3, g4, g5 and H(J)]

The NOT-theory would have to come up with an explanation for why this reading is unavailable. I am not aware that there is at present such an explanation. Note that even if we didn’t have the reservations about the SMH pointed out above, it would not apply here, as the two interpretations don’t stand in an entailment relation.

To summarize: just like the Pi-theory, the NOT-theory faces an overgeneration problem. Both the Pi-theory and the NOT-theory solve this easily regarding NPIs. The NOT-theory also has a simple story about modals and negative quantifiers. It does not have an explanation for intensional full verbs and numeral DPs, and I argue it does not have a story about universal DPs (or other prospective applications of the SMH) either. Thus I see some progress compared to the Pi-theory, but not a complete analysis. A conceptual advantage seems to be the NOT-theory’s simplicity. But we will need to reexamine that in the next subsection.

2.3.4 Reference to degrees: differentials

One of the strengths of the classical analysis of comparatives is the way in which it deals with explicit reference to degrees. For example differentials in comparatives, illustrated in (72) and (73), receive an easy and natural analysis.

(72) a. Bill is 1.70 m tall.
     b. John is 2″ taller than that.
     c. Height(J) ≥ 2″ + 1.70 m
(73) a. John is 2″ taller than Bill is.
     b. Height(J) ≥ 2″ + max(λd. Height(B) ≥ d) = Height(J) ≥ 2″ + Height(B)
It is not obvious how to incorporate differentials into the NOT-theory, whose semantics of a simple example is repeated in (74). That is because the than-clause does not refer to a degree.

(74) a. John is taller than Bill is.
     b. ∃d[Height(J) ≥ d & NOT Height(B) ≥ d]
Among the proponents of the NOT-theory, Schwarzschild (2008) discusses this problem. He proposes to understand (75a) in terms of (75b); I simplify this to (75c) for the purposes of discussion.

(75) a. John is 2″ taller than Bill is.
     b. ∃d[Height(J) ≥ d & 2″(λd′. d′ ≤ d & Height(B) < d′)]
     c. 2″(λd′. d′ ≤ Height(J) & Height(B) < d′)
        ‘the degrees between Bill’s height and John’s are a 2″ interval’
     [Figure: height scale marking H(B) and H(J), with the 2″ interval between them]
The question is how to derive this interpretation. Schwarzschild proposes to replace NOT in the than-clause with an operator FALL-SHORT. The resulting LF of our example is given in (75′a) and the semantics of FALL-SHORT in (75′b). Diff is a variable that is the first argument of FALL-SHORT, to be bound outside the than-clause and identified with the differential in the matrix clause (as if the differential was raised out of the embedded clause to its main clause position).

(75′) a. than [[FALL-SHORT Diff] λd [Bill is d-tall]]
      b. ⟦FALL-SHORT⟧ = λDiff.λD_⟨d,t⟩.λd. Diff(λd′. d′ ≤ d & D(d′) = 0)
      c. Diff(λd′. d′ ≤ d & Height(B) < d′)
         ‘Bill’s height is a Diff-large distance below d’
We combine with the differential next, as shown in (76). Then, the degree d is bound and the usual semantic mechanisms combine this with the rest of the main clause in (77). This derives (75).

(76) a. [2″ er] [λDiff [than Bill is tall]]
     b. λd. 2″(λd′. d′ ≤ d & Height(B) < d′)
        ‘Bill’s height is a 2″ distance below d’

(77) [∃ [λd [John is d-tall] [2″ er] [λDiff [than Bill is tall]]]]
It seems to me that this is a rather substantial modification of the original NOT-theory. The basic points about than-clause scope interaction remain the same (as the reader may verify), but some of the explanation is less obvious. In particular, I don’t see that scopal behaviour of a modal with same clause negation necessarily predicts scopal behaviour relative to FALL-SHORT, any more than it predicts scopal behaviour relative to Pi. I also believe that we lose the explanation of the unacceptability of negative quantifiers. Neither of the readings associated with the two possible LFs below is necessarily uninformative. Finally, I no longer see that the FALL-SHORT-theory is simpler than the Pi-theory.

(78) a. John is 2″ taller than no girl is.
     b. [∃ [λd [J. is d-tall] [2″ er] [λDiff [than [[FALL-SHORT Diff] λd [no girl is d-tall]]]]]]
     c. [∃ [λd [J. is d-tall] [2″ er] [λDiff [than [no girl λx [FALL-SHORT Diff] λd [x is d-tall]]]]]]

(78′) ∃d[Height(J) ≥ d & 2″(λd′. d′ ≤ d & [λd. no girl is d-tall](d′) = 0)]
      = ∃d[Height(J) ≥ d & 2″(λd′. d′ ≤ d & some girl is d′-tall)]
      = John and some girl are at least two inches tall.

(78″) ∃d[Height(J) ≥ d & no girl x: 2″(λd′. d′ ≤ d & Height(x) < d′)]
      = no girl is 2″ shorter than John.
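The composition with FALL-SHORT in (75′) to (78′) can be traced on a discrete model. The sketch below makes several assumptions that are not in the text: degrees are whole inches on the finite range `SCALE` starting at 1, a set of degrees counts as ‘a 2″ interval’ iff it consists of exactly two adjacent degrees, and the names `two_inches`, `fall_short`, `sentence`, `d_tall` and `no_girl_is_d_tall` are hypothetical.

```python
from typing import Callable, FrozenSet, Sequence

SCALE = range(1, 101)  # assumption: whole inches, lowest degree 1

Degrees = FrozenSet[int]
DegreePred = Callable[[int], bool]

def two_inches(s: Degrees) -> bool:
    # Assumption: on this scale a 2″ interval is a set of exactly two adjacent degrees.
    return len(s) == 2 and max(s) - min(s) == 1

def fall_short(diff: Callable[[Degrees], bool], D: DegreePred) -> DegreePred:
    """(75′b): λd. Diff(λd′. d′ ≤ d & D(d′) = 0)"""
    return lambda d: diff(frozenset(x for x in SCALE if x <= d and not D(x)))

def sentence(john: int, than_clause: DegreePred) -> bool:
    """(77): ∃d[John is d-tall & than_clause(d)]"""
    return any(john >= d and than_clause(d) for d in SCALE)

def d_tall(height: int) -> DegreePred:
    return lambda d: height >= d                        # λd. x is d-tall

def no_girl_is_d_tall(girls: Sequence[int]) -> DegreePred:
    return lambda d: not any(h >= d for h in girls)     # λd. no girl is d-tall

# (75): Bill is 67″ tall.
print(sentence(69, fall_short(two_inches, d_tall(67))))  # True
print(sentence(68, fall_short(two_inches, d_tall(67))))  # False
# (78′): true as soon as John and some girl are at least two inches tall.
print(sentence(70, fall_short(two_inches, no_girl_is_d_tall([60, 64]))))  # True
```

On these assumptions the simple case matches (73b), Height(J) ≥ 2″ + Height(B), while the negative quantifier case (78′) is contingent rather than trivially true, in line with the remark that the uninformativity explanation is lost.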
I conclude that while the type of analysis discussed in this section (what one might call scopal theories of quantifiers in than-clauses) has brought forth some very interesting ideas, there are also unanswered questions. It may be worthwhile to pursue a scopeless alternative, which is what I will do in the next section.

3 Analysis: Selection
The strategy I propose in this section is inspired by both Schwarzschild & Wilkinson and Heim. Schwarzschild & Wilkinson’s use of intervals is retained in order to be able to interpret a quantifier inside a than-clause. But like Heim,
I attempt to make this move compatible with a simple, standard semantics of the comparative. The novel aspect of the analysis below concerns how this is done. I do not adopt a than-clause internal operator Pi and I do not rely on semantic reconstruction. I propose instead that there is a mechanism that derives a particular degree from an interval provided by the than-clause. This degree is compared in the normal way with a matrix clause degree. The trick will be to ensure that the degree chosen is the right one, i.e. that the comparison ultimately made reflects the intuitively accessible reading of the comparative sentence in question. The same selection mechanism will account for both apparent wide scope and apparent narrow scope readings. The analysis will not employ a scoping mechanism that is specific to comparatives. Its relation to the earlier work discussed above can be simply stated as ‘keep the intervals, but not the operator’. Two rationales guide me in pursuing this approach. The first is that a scoping mechanism inside the than-clause overgenerates in ways that we have yet to find the means of constraining. Therefore it would be an advantage to make do without such an extra scopal element. The second is that it remains a strength of the classical analysis that degree operators combine directly with expressions referring to degrees, and that differentials in particular can be accounted for in a direct and straightforward way. Therefore I want to come out of the calculation of the semantics of the than-clause holding in my hand the degree we will be comparing things to. The combination of these two lines of reasoning persuades me to attempt a simplification of Schwarzschild & Wilkinson, which should of course also cover the apparent narrow scope data that were problematic for them. Section 3.1 presents the idea behind the selection analysis and applies it to straightforward cases. 
Apparent narrow scope universals are not straightforward and are addressed in Section 3.2. Apparent wide scope existentials similarly seem problematic and are the topic of Section 3.3. In Section 3.4 I reexamine comparatives that combine a differential with a quantifier in the than-clause and propose a refinement of the analysis of the comparative to capture the data.

3.1 Basic idea and simple cases
I illustrate the idea behind the selection analysis with example (79), which would not in fact require intervals at all of course. But, suppose that we in general compositionally derive as the meaning of the than-clause a set of
intervals, as suggested in the Schwarzschild & Wilkinson and Heim theories. Suppose furthermore that this comes from the basic lexical entry of the adjective, as indicated in (80). This is what I will assume in this section, for the sake of uniformity (see Section 4 for more discussion). It amounts to (79′) in the present case. How do I propose to derive the truth conditions of (79a), namely (79b), from that?

(79) a. John is taller than Bill is.
     b. Height(John) > Height(Bill)

(79′) a. ⟦than Bill is tall⟧ = λD′. Height(Bill) ∈ D′
      b. [Figure: height scale marking H(B), with the intervals containing Bill’s height shown below it]

(80) ⟦tall⟧ = λD. λx. Height(x) ∈ D
I suggest that general mechanisms available in such situations enable us (in fact, force us) to pick from the set of intervals something that is suitable as the input to the comparative operator repeated in (81). I represent this selection mechanism as ‘the’ in (79″) for the moment. This subsection asks what the appropriate meaning for ‘the’ is. (Note that the term ‘selection’ is not intended to imply that there is a genuine choice; I intend to provide one semantics for ‘the’.)

(81) ⟦-er⟧ = λd_d. λd′_d. d′ > d

(79″) John is taller than ‘the’ than-clause
In the present case, ‘the’ could be an operator selecting the shortest interval from the set, i.e. Bill’s height, cf. (82). This seems a natural choice, given that all other intervals contain extraneous material and that the point that really ‘counts’ is just Bill’s height.

(82) min(p_⟨⟨d,t⟩,t⟩) = ιD. p(D) & ¬∃D′. D′ ⊂ D & p(D′)   (shortest p interval)
Irene Heim and Danny Fox (p.c.) point out to me that the sense in which choosing the minimal interval is ‘natural’ is informativity. (83) below states what the maximally informative propositions out of a set of true propositions (say, a question meaning) are.

(83) a. m_inf(w)(Q_⟨s,⟨⟨s,t⟩,t⟩⟩) = λq. Q(w)(q) & ¬∃q′[Q(w)(q′) & q ≠ q′ & [Q(w)(q′) → Q(w)(q)]]
     b. The maximally informative answers to a question Q(w) (Q(w) being the set of true answers to Q in w) are the propositions q in Q(w) such that there is no other proposition q′ in Q(w) such that Q(w)(q′) entails Q(w)(q) (i.e. if q′ is in Q(w) then so is q).
Informativity allows us to capture the fact that an appropriate answer to (84a) is the true answer that entails all the other true answers, i.e. John’s maximal speed (for example the proposition that he drove 50 mph), and in a parallel way the minimum amount of flour that suffices in (84b) (see Heim 1994; Beck & Rullmann 1999).

(84) a. How fast did John drive?
        λw. λp. ∃d[p(w) & p = λw′. John drove d-fast in w′]
        {that John drove 50 mph, that John drove 49 mph, that John drove 48 mph, . . . }
     b. How much flour is sufficient?
        λw. λp. ∃d[p(w) & p = λw′. d-much flour is sufficient in w′]
        {that 500 g is sufficient, that 501 g is sufficient, that 502 g is sufficient, . . . }
The definition can be extended to (intensions of) arbitrary sets in the following way:

(85) m_inf(w)(p_⟨s,⟨α,t⟩⟩) = λq. p(w)(q) & ¬∃q′[p(w)(q′) & q ≠ q′ & [p(w)(q′) → p(w)(q)]]

The instance of this generalization that we will be interested in is (86).

(86) a. m_inf(w)(p_⟨s,⟨⟨d,t⟩,t⟩⟩) = λD. p(w)(D) & ¬∃D′[p(w)(D′) & D ≠ D′ & [p(w)(D′) → p(w)(D)]]
     b. The maximally informative intervals out of a set of intervals p(w) are the intervals D such that there is no other interval D′ in p(w) such that p(w)(D′) entails p(w)(D) (i.e. if D′ is in p(w) then so is D).
Fox & Hackl (2006) argue that we want to extend the definition from the question case to others in order to capture the similarity between (84a,b) above and (87a), (88a). (87a) refers to the maximum speed John reached and (88a) refers to the minimum amount that suffices, both maximally informative in the sense of (85). The instance in (86) extends the analogy from (84a,b) and (87a), (88a) to (87b), (88b).

(87) a. the speed that John drove
     b. than John drove

(88) a. the amount of flour that is sufficient
     b. than is sufficient
Hence, ‘the’ in (79″) is m_inf, which yields a singleton, combined with taking from a set its only member (here represented with max). We can understand these operators as semantic ‘glue’ (a term introduced by Partee 1984, see also von Stechow 1995): operations that have to enter into composition, in addition to what the syntax strictly speaking provides, in order to make the sentence parts combinable. Their presence is required by the need for interpretability.

(79‴) John is taller than max(m_inf(than-clause))
The simple example allows me to emphasize another aspect of what I call the selection analysis: there is no choice in ‘selecting’ a point from a set of intervals. Only one interpretation is possible for (79). The ‘glue’ we have here is entirely semantic (and not, say, subject to pragmatic variability). Although we will see in a moment that quantifiers in than-clauses require some more elaboration, this will be preserved. Selection means, basically, taking from the minimal interval(s) the maximal element.

3.1.1 Apparent wide scope universals

Let’s return to the now familiar example (89). We take the than-clause to have the denotation in (89′).

(89) a. John is taller than every girl is.
     b. For every girl x: John’s height exceeds x’s height.

(89′) ⟦than every girl is tall⟧ = λD′. ∀x[girl(x) → Height(x) ∈ D′]
      ‘intervals into which the height of every girl falls’
      [Figure: height scale marking x1, x2, x3 and J]
The intuitive truth conditions of (89) can be described as making a comparison between John’s height and the end point of the interval into which all the girls’ heights fall. If John is taller than the tallest girl, he is taller than all of them. Thus I propose that from the denotation of the than-clause that is given in (89′), we first choose the shortest = maximally informative interval that fits the description (i.e. that covers all the girls’ heights) and then select the maximal point of that interval.⁴

(90) John is taller than Max>(m_inf(than-clause)) = John is taller than the height of the tallest girl
(91) and (92) below provide the relevant definitions. We extend the notion of the ordering relation underlying our degree scale from degrees to intervals, (91). We can then define the maximal element of a set of intervals, and finally the end point of an interval, (92).

(91) a. ordering of degree points: d > d′   ‘d is larger than d′’
     b. ordering of intervals: I > J iff ∃d[d ∈ I & ∀d′[d′ ∈ J → d > d′]]   ‘I extends beyond J’

(92) a. max> := the max relative to the > relation on intervals or degrees
     b. Max>(p) := max>(max>(p)) = the end ‘point’ of the interval that extends furthest
We straightforwardly derive the desired meaning. Other universal quantifiers can be treated in the exact same way. This is illustrated below with the familiar example containing predict. If my prediction was that John would be between 1.70 m and 1.80 m tall, then the interval [1.70–1.80] is the unique shortest interval described by the than-clause. The end point of that interval is 1.80 m, and the example is correctly predicted to be true if John is taller than 1.80 m.

(93) a. John is taller than I had predicted (that he would be).
     b. For every world compatible with my predictions: John’s actual height exceeds John’s height in that world.

(93′) ⟦than I had predicted (that he would be tall)⟧ = λD′. ∀w[wR@ → John’s height in w ∈ D′]
      ‘intervals into which John’s height falls in all my predictions’

(93″) John is taller than Max>(m_inf(than-clause)) = John is taller than the height according to the tallest prediction

4 Fox & Hackl propose to replace maximality with maximal informativity. I have not been able to develop an analysis that incorporates that proposal. The reason is lack of entailment among the degrees in the minimal than-clause interval: If I know of a degree d that it falls in between the height of the smallest girl and the height of the tallest girl, I cannot infer that a degree d′ larger than d also falls within that interval (d′ might be beyond the height of the tallest girl) and I cannot infer that a smaller degree d″ also falls within that interval (d″ might be below the height of the shortest girl). Therefore I will use both maximal informativity and ordinary maximality.
What I call selection yields the maximum relative to the ordering relation linguistically given, namely ‘larger than’ on the size scale in the case of taller. This follows from more general interpretive mechanisms suggested independently (compare Jacobson 1995; Fox & Hackl 2006). Application of these mechanisms is required by the need for the than-clause to serve as input to the comparative operator.

3.1.2 Apparent narrow scope existentials

We can apply the same strategy to narrow scope existentials. This is illustrated with (94) below. In contrast to Heim’s analysis and like Schwarzschild & Wilkinson’s, I assume that the than-clause denotes the set of intervals in (94′) (once more via the shifted lexical entry for the adjective, (80)). Importantly, remember that I assume that the shift to intervals must take place locally, i.e. within the adjective phrase. I do not assume a genuine mobile operator Pi like Heim (2006b) does (whose LF for (94a) would give Pi wide scope relative to anyone). We dispense with the interpretations for than-clauses that were attributed to wide scope of the Pi operator.

(94) a. Mary is taller than anyone else is.
     b. Mary’s height exceeds the largest degree of tallness reached by one of the others.

(94′) ⟦than anyone else is tall⟧ = λD′. ∃x[x ≠ Mary & Height(x) ∈ D′]
      ‘intervals into which the height of someone other than Mary falls’
The shortest (= maximally informative) than-clause intervals will be the heights of the other relevant people. (Thus we get rid of the intervals immediately.) Out of these, we choose the maximum. This results in the same meaning as under the classical analysis. Thus the same selection strategy that we used above will predict the right truth conditions. The analysis extends to other apparent narrow scope existentials like be allowed etc.

(95) ----•----•----•-------•------->
         x1   x2   x3      M

(94″) Mary is taller than Max>(m_inf(than-clause)) = Mary is taller than the height of the tallest other person.
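The way the existential than-clause collapses to points before the maximum is chosen can be traced in a small Python sketch. It is a hedged illustration under the same simplifying assumptions as any finite model: a discretised scale in centimetres, m_inf read as "no proper sub-interval in the denotation", and Max> read as the largest end point. All names and heights are hypothetical.

```python
def intervals(scale):
    """All closed intervals (lo, hi) over a finite, sorted scale of degrees."""
    return [(lo, hi) for i, lo in enumerate(scale) for hi in scale[i:]]

def m_inf(than_clause, scale):
    """Assumed reading of maximal informativity: minimal intervals only."""
    members = [D for D in intervals(scale) if than_clause(D)]
    def proper_sub(a, b):
        return a != b and b[0] <= a[0] and a[1] <= b[1]
    return [D for D in members if not any(proper_sub(E, D) for E in members)]

def max_gt(selected):
    """Assumed reading of Max>: the largest end point among the selected intervals."""
    return max(hi for _, hi in selected)

scale = list(range(150, 201))                 # heights in cm (hypothetical)
others = {"x1": 160, "x2": 172, "x3": 178}    # everyone other than Mary (hypothetical)
mary = 181

# Intervals into which the height of someone other than Mary falls
than_clause = lambda D: any(D[0] <= h <= D[1] for h in others.values())

points = m_inf(than_clause, scale)   # one degenerate interval per person
standard = max_gt(points)            # height of the tallest other person
print(points, standard, mary > standard)
```

In this toy model the minimal intervals are exactly the three degenerate intervals at 160, 172 and 178, so selecting the maximum amounts to comparing Mary with the tallest other person.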
The selection strategy predicts the right truth conditions for these ‘apparent narrow scope’ and ‘apparent wide scope’ quantifier data without changing scope. This allows us to predict ungrammaticality of negation straightforwardly, as illustrated below.

3.1.3 Negation

Remember that the unacceptability of (96) could be understood in terms of an undefined contribution of the than-clause (von Stechow 1984; Rullmann 1995). The selection analysis presented here can retain this desirable prediction. The meaning of the than-clause is (96′), in accordance with what is said above. This is the only meaning possible for the than-clause.

(96) *John is taller than no girl is.

(96′) ⟦than no girl is tall⟧ = λD′. for no girl x: Height(x) ∈ D′
      intervals into which the height of no girl falls
(96′) will not yield a well-defined meaning for the comparative. Just as in the original analysis of these data, the than-clause will not provide us with a maximum, since there is no largest interval containing no girl’s height. Max> is undefined; hence negation in the than-clause leads to undefinedness of the comparative as a whole. Since there is no other option, we no longer face the problem of ruling out the apparent wide scope reading of the negative quantifier.

The simple data discussed in this subsection highlight the potential attraction of the selection analysis. We keep a simple semantics for the comparative and don’t double interpretive possibilities with a scoping mechanism. Next, we turn to all the complications.

3.2 Refinement I: Have to–type modals
This subsection concerns universal quantifiers that do not behave like every girl, predict and other apparent wide scope universals. Remember from Section 2 that modals like have to appear to favour a narrow scope interpretation rather than the apparent wide scope interpretation described and derived above for other universals.

(97) Mary wants to play basketball. The school rules require all players to be at least 1.70 m.

(97′) a. Mary is taller than she has to be.
      b. Mary’s actual height exceeds the degree of tallness which she has in all worlds compatible with the school rules; i.e. Mary’s actual height exceeds the required minimum, 1.70 m.
Keeping our assumptions about the meaning of than-clauses stable, we will assume (98) for this example. Selecting the maximum of the shortest than-clause interval will not yield the desired truth conditions this time, though: that would amount to the claim that Mary’s height exceeds the maximum height permitted. The sentence intuitively says that Mary is above the required minimum. Contrasts like the one between have to and predict are of course what motivates the scope analysis (apparent wide scope for predict, apparent narrow scope for have to). A different description of the facts is that the example with predict (and similar examples with every girl, should, etc.) has a ‘more than maximum’ interpretation while have to can have a ‘more than minimum’ interpretation. I see the task for my approach as having to explain how factors independent of comparative semantics may result in a ‘more than minimum’ interpretation rather than the expected ‘more than maximum’ reading.

(98) ⟦than she has to be tall⟧ = λD′. ∀w[wR@ → Mary’s height in w ∈ D′]
     intervals into which Mary’s height falls in all worlds compatible with the rules
     the beginning of this interval is below Mary’s actual height, i.e. Mary’s height exceeds the minimal element of the shortest than-clause interval
There are two analyses, as far as I am aware, that propose to reduce the variation in the interpretation of than-clauses with universal modals between maximum and minimum interpretation to independent factors, such that the readings collapse into one. Meier (2002) proposes that the ordering source that modal semantics uses is responsible for a contextually guided determination of the interpretation, explaining away apparent maxima and minima both. Krasikova (2008) examines the problem of have to–type modals in comparatives in particular and employs covert exhaustification to explain away apparent ‘more than minimum’ interpretations. While both approaches solve the problem at hand equally well for my purposes, I describe Krasikova’s suggestions below because they seem to me to offer more promise for identifying which modal operators give rise to which reading(s).

Krasikova (2008) points out that whether we get a ‘more than minimum’ reading like the one illustrated above for this type of modal or a ‘more than maximum’ reading parallel to the reading illustrated for predict depends on the context an individual example is put into. Remember example (99) from above, which shows that have to–type modals may also give rise to a ‘more than maximum’ reading, the reading we expect under the present analysis.5 Thus what distinguishes have to–type modals from others is the availability of an apparent narrow scope reading (a ‘more than minimum’ reading under the present perspective).

(99) He was coming through later than he had to if he were going to retain the overall lead. (from Google, cited from Krasikova 2008)
Krasikova further observes that the universal modals that can give rise to the ‘more than minimum’/apparent narrow scope reading are just the ones that occur in sufficiency modal constructions (SMC). An example of an SMC is given below (von Fintel & Iatridou 2005).

(100) You only have to go to the North End (to get good cheese).

(100′) Truth conditions: You do not have to do anything more difficult than to go to the North End (to get good cheese).
       Implicature: You have to go to the North End or do something at least as difficult (to get good cheese).

5 It is not at present clear to me under what circumstances a have to–type modal seems to permit a more-than-maximum interpretation. Relevant factors may be the choice of a negative polar adjective and a subjunctive-like interpretation (Danny Fox and Irene Heim, p.c.). Personally, I find this interpretation very hard to get.
The combination of only and a modal in the SMC considers alternatives to the proposition that is the complement of have to, and ranks those alternatives on a scale. Plausible alternatives for our example and their ranking are given in (101). They provide the domain of quantification, C in (102); (102a) sketches a structure for the example, (102b) a meaning for ‘only have to’ and (102c) the outcome, which corresponds to the desired truth conditions (100′). Note that the SMC reading is one that identifies the point on a scale that is the minimum sufficiency point, as illustrated in (103).

(101) a. that you go to the nearest supermarket, that you go to the North End, that you go to New York, that you go to Italy
      b. SUPER < NE < NY < Italy (where ‘<’ means: is easier than)

(102) a. [[only have to]C,< [you go to the North End]]
      b. ⟦only have to⟧C,<(p)(w) = 1 iff ∀q[q ∈ g(C) & ¬(q < p) → ¬have to(q)(w)]
      c. For all q such that q is in g(C) and ¬(q < NE): ¬have to(q)

(103) necessary                 not necessary
      ------------•------------>
                  NE

My sketch leaves unaddressed all the thorny problems of the SMC construction like the composition of only and have to with the rest of the clause, and the problem of only’s presupposition; compare in particular von Fintel & Iatridou 2005 and Krasikova & Zhechev 2006. What is important for present purposes is Krasikova’s observation that the interpretation that have to–type modals give rise to in than-clauses can be seen as an SMC interpretation. The ‘more than minimum’ interpretation, just like the SMC, identifies the point on a scale that is the minimum sufficiency point. Whatever is a plausible analysis of the SMC should be extendable to the problem at hand.

(104) necessary                 not necessary
      ------------•------------>
                1.70 m
Krasikova suggests that have to–type modals can use Fox’s (2007) covert exhaustivity operator EXH instead of only; the two have basically the same meaning. This is what happens in our comparatives, and this is responsible for the ‘more than minimum’ interpretation.6 A structure for the than-clause of (97′a) is given in (105a). Its interpretation using (102b) is (105b). Suppose now that the relevant alternatives are the propositions in (106a), which place Mary’s height in varying intervals. Our context is such that difficulties arise with respect to reaching a certain height. Being short is not hard, being tall is difficult. Thus the ordering of the alternatives in (106a) is one that ranks them according to the height of the interval on the tallness scale into which Mary’s height falls. The requirement easiest to meet is the minimal compliance height. Given this, (105b) can be paraphrased as (105c).

(105) a. (than) [1 [[EXH has to]C,< Mary be t1 tall]]
      b. [λD′. ∀q[q ∈ g(C) & ¬(q < λw. M’s height in w ∈ D′) → ¬have to(q)]]
      c. [λD′. nothing more difficult is required than for Mary’s height to fall within D′]

(106) a. {λw. Mary’s height in w ∈ D1, λw. Mary’s height in w ∈ D2, λw. Mary’s height in w ∈ D3, . . . }
      b. If the ordering in terms of height is D1 < D2 < D3 . . . then:
         λw. M’s height in w ∈ D1 < λw. M’s height in w ∈ D2 < λw. M’s height in w ∈ D3 < . . .
         (where ‘<’ means: is easier; in our context, being shorter is easier than taller.)
Applying maximal informativity as usual yields the meaning below for the subordinate clause, the minimum ‘point’ as desired. Selection with Max> is trivial; the resulting meaning is that Mary’s actual height exceeds the minimum compliance height.
(107) m_inf([λD. nothing more difficult is required than for Mary’s height to fall within D]) = {the minimum compliance height} = {[1.70–1.70]}

6 As an anonymous reviewer points out, this raises the question of why we cannot have an overt only in such sentences, cf. the ungrammaticality of (ia). The editors point out that extraction of the associate of only is not good, cf. (ib). This would have to be different for EXH than for only in order to answer the reviewer’s question.
(i) a. *Mary is taller than she only has to be.
    b. *WhoF did Mary only _ call?
SMC readings of have to–type modals explain the ‘more than minimum’ reading that they can give rise to in comparative than-clauses with the single assumption that EXH takes the place of only. Internal to the subordinate clause, exhaustification occurs. Exhaustification of the than-clause reduces the than-clause interval to a point. The ‘point’ that exhaustification yields is the minimum compliance height. I follow Krasikova in making the connection between SMC use and ‘more than minimum’ readings and in her analysis in terms of exhaustification. This allows me to maintain the selection analysis from the previous subsection.

According to this analysis, have to–type modals don’t require any revision of the semantics of comparative constructions. We need to take into account the special semantics of SMC modals instead. Contrary to appearances, we uniformly select a degree from an interval via Max>; with have to, we may apply Max> after exhaustification. This gives rise to a ‘more than minimum’/apparent narrow scope reading. If exhaustification does not apply, we get the regular ‘more than maximum’ = apparent wide scope reading (cf. example (99) above). Modals that do not permit an SMC reading do not permit a ‘more than minimum’ reading either, because the ‘more than minimum’ reading is an SMC reading. I refer the reader to Krasikova 2008 for further discussion. Crucially for present purposes, the correlation with SMC use provides an independent criterion for when to expect which reading. The contrast between the different kinds of universal quantifiers is not analysed as a scope effect. The analysis argued for here makes the interpretation of have to–type modals a property of those particular lexical items. They are the only apparent narrow scope items requiring special attention since, in contrast to the scope analysis’ procedure, apparent narrow scope existentials have already been taken care of.

3.3 Refinement II: Indefinites, numeral NPs and the like
This section concerns existential quantifiers that do not behave like NPI any and other apparent narrow scope existentials. The problem for the selection strategy can be illustrated by the example below.

(108) John is taller than exactly five of his classmates are.

(108′) a. Exactly five of John’s classmates are shorter than he is.
       b. #John is taller than the tallest of his 5 or more classmates.
The intuitively available interpretation (108′a) looks once more like a straightforward wide scope reading of the numeral quantifier. Application of the selection strategy predicts an interpretation that is unavailable, (108′b), as illustrated below.

(109) λD′. for exactly 5 x: max(λd. x is d-tall) ∈ D′
      intervals into which the height of exactly 5 classmates falls
      Max>(m_inf([λD′. for exactly 5 x: max(λd. x is d-tall) ∈ D′])) = the height of John’s tallest classmate, as long as there are at least 5

      --•---•---•----•---•---•---•---•------>
        c1  c2  c3   c4  c5  c6  c7  c8 (= Max>)
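The mismatch between the predicted and the attested reading of the ‘exactly five’ example can be checked in a finite model. The Python sketch below is only an illustration: it assumes a discretised height scale, eight classmates with made-up heights, m_inf read as "no proper sub-interval in the denotation", and Max> read as the largest end point among the selected intervals.

```python
def intervals(scale):
    """All closed intervals (lo, hi) over a finite, sorted scale of degrees."""
    return [(lo, hi) for i, lo in enumerate(scale) for hi in scale[i:]]

def m_inf(than_clause, scale):
    """Assumed reading of maximal informativity: minimal intervals only."""
    members = [D for D in intervals(scale) if than_clause(D)]
    def proper_sub(a, b):
        return a != b and b[0] <= a[0] and a[1] <= b[1]
    return [D for D in members if not any(proper_sub(E, D) for E in members)]

def max_gt(selected):
    """Assumed reading of Max>: the largest end point among the selected intervals."""
    return max(hi for _, hi in selected)

scale = list(range(140, 201))                        # heights in cm (hypothetical)
heights = [150, 155, 160, 166, 170, 175, 180, 184]   # classmates c1..c8 (hypothetical)
john = 172

# Intervals into which the height of exactly 5 classmates falls
than_clause = lambda D: sum(D[0] <= h <= D[1] for h in heights) == 5

selected = m_inf(than_clause, scale)
standard = max_gt(selected)                  # ends up at the tallest classmate

predicted_reading = john > standard                        # unavailable reading
attested_reading = sum(h < john for h in heights) == 5     # intuitive reading
print(selected, standard, predicted_reading, attested_reading)
```

Here the minimal intervals are the four tight windows around five adjacent classmates, and the largest end point is the height of the tallest classmate (184). A John of 172 cm therefore makes the attested reading true while the reading derived by plain selection is false.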
We face the combined challenge of (i) predicting the right interpretation and (ii) not predicting the non-existing one. I propose to tackle this problem through a more thorough analysis of numeral NPs. We will first consider indefinite NPs in the context of than-clauses and then move on to numerals and example (108).

3.3.1 Singular and plural indefinites

Singular indefinites allow in principle two interpretations in than-clauses: an apparent wide scope and an apparent narrow scope reading. Which reading(s) is/are possible depends on the indefinite as well as the sentence context. We have seen examples with NPIs in which only the narrow scope reading is available. An example that has a wide scope reading is given in (110). (111) and (112) provide two examples which I take to be genuinely ambiguous (the English version of (111) probably is too, although native speakers seem to have some difficulty judging the example).

(110) a. John is taller than one of the girls is.
      b. There is a girl x such that John is taller than x.

(111) Annett hat lauter gesungen als eine Sopranistin. (German)
      Annett has louder sung than a soprano
      ‘Annett sang more loudly than a soprano did.’

(111′) a. There is a soprano x such that Annett sang more loudly than x.
       b. Annett sang more loudly than any soprano did.

(112) Sveta could solve this problem faster than some undergrad could.

(112′) a. There is an undergrad x such that Sveta could solve this problem faster than x could.
       b. Sveta could solve this problem faster than any undergrad could.
For examples with apparent narrow scope existentials it was demonstrated above (with an NPI indefinite, anyone else) how the selection analysis can derive an appropriate interpretation corresponding to the apparent narrow scope reading. What about the apparent wide scope reading? One option open to us is to acknowledge that indefinites quite often give rise to apparent wide scope readings (so-called specific readings) and to adopt whatever mechanism is appropriate for the analysis of specific readings in general for apparent wide scope indefinites in than-clauses. This is what I will do, and I use the choice function mechanism as probably the best known analysis of specific indefinites (e.g. Reinhart 1992; Kratzer 1998; but see Endriss 2009 for a different analysis). I illustrate with example (113a) from Heim 1982, where a friend of mine can have apparent scope over the conditional.

(113) a. If a cat likes a friend of mine, I always give it to him.
         There is a friend of mine such that if a cat likes him, I give it to him.
      b. ∃f: CH(f) & [if a cat likes f(friend of mine), I give it to him]
         If a cat likes the friend of mine selected by f (f a choice function), I give it to him.
Furthermore, I will assume that indefinite NPs, e.g. with German ein (‘a’), are ambiguous between the ‘normal’ interpretation ‘∃x’ (existential quantification over individuals) and the ‘specific’ interpretation ‘∃f’ (existential quantification over choice functions). Below I provide a selection analysis of the two readings of (111) under those assumptions.7 On this analysis, the apparent narrow scope reading amounts to a ‘∃x’ interpretation and the apparent wide scope reading amounts to a ‘∃f’ interpretation for the indefinite.

(114) a. ⟦als [1 [eine_x Sopranistin t1 laut gesungen hat]]⟧ = [λD′. ∃x[soprano(x) & max(λd. x sang d-loudly) ∈ D′]]
         intervals that cover the loudness of soprano singers
         Annett sang more loudly than Max>(m_inf([λD′. ∃x[soprano(x) & max(λd. x sang d-loudly) ∈ D′]]))
         = Annett sang more loudly than the loudest soprano.
         = Annett sang more loudly than any soprano did.
      b. ⟦als [1 [eine_f Sopranistin t1 laut gesungen hat]]⟧ = [λD′. max(λd. f(soprano) sang d-loudly) ∈ D′]
         intervals that include the loudness of the soprano selected by f
         ∃f: CH(f) & Annett sang more loudly than Max>(m_inf([λD′. max(λd. f(soprano) sang d-loudly) ∈ D′]))
         = Annett sang more loudly than the soprano selected by f (f a choice function).
         = There is a soprano x such that Annett sang more loudly than x.

7 I use the German example because the larger English inventory of indefinites makes it hard for me to determine which examples are genuinely ambiguous.
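The contrast between the ‘∃x’ and the ‘∃f’ interpretation of the soprano example can be traced with the same selection machinery. The Python sketch below is a hedged illustration: the loudness values are invented, the scale is discretised, m_inf and Max> are given the assumed readings "minimal intervals" and "largest end point", and a choice function over the single set of sopranos is modelled simply as the soprano it picks.

```python
def intervals(scale):
    """All closed intervals (lo, hi) over a finite, sorted scale of degrees."""
    return [(lo, hi) for i, lo in enumerate(scale) for hi in scale[i:]]

def m_inf(than_clause, scale):
    """Assumed reading of maximal informativity: minimal intervals only."""
    members = [D for D in intervals(scale) if than_clause(D)]
    def proper_sub(a, b):
        return a != b and b[0] <= a[0] and a[1] <= b[1]
    return [D for D in members if not any(proper_sub(E, D) for E in members)]

def max_gt(selected):
    """Assumed reading of Max>: the largest end point among the selected intervals."""
    return max(hi for _, hi in selected)

def narrow_reading(annett, loud, scale):
    """'∃x' interpretation: intervals covering the loudness of some soprano."""
    clause = lambda D: any(D[0] <= l <= D[1] for l in loud.values())
    return annett > max_gt(m_inf(clause, scale))

def specific_reading(annett, loud, scale):
    """'∃f' interpretation: some choice of a soprano makes the comparison true."""
    for s in loud:  # each soprano stands in for one choice function f
        clause = lambda D, l=loud[s]: D[0] <= l <= D[1]
        if annett > max_gt(m_inf(clause, scale)):
            return True
    return False

scale = list(range(60, 101))             # loudness scale (hypothetical units)
loud = {"s1": 70, "s2": 78, "s3": 85}    # loudness of each soprano (hypothetical)
annett = 80
print(narrow_reading(annett, loud, scale), specific_reading(annett, loud, scale))
```

With Annett at 80, the ‘∃x’ reading is false (she is not louder than the loudest soprano) while the ‘∃f’ reading is true (there is a soprano she is louder than), which is the ambiguity the two derivations are meant to capture.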
I further assume that the usual factors (in particular, the nature of the indefinite and what readings the sentence context permits) decide when we can get which reading(s) of a singular indefinite. I have nothing illuminating to say about the particulars of this; note, however, that I do assume that apparent narrow scope readings are possible with indefinites/existentials other than NPIs. My intuitions regarding German indefinites like jemand (someone) + anders/sonst (other/else), wh-word + other/else convince me of this in particular, because these indefinites are not, I believe, plausibly analysed as polarity items, nor are they plausibly analysed as generic (hence not existential). Other languages’ inventory of indefinites may make my view of what the interpretive possibilities of existentials in than-clauses are appear less obvious. I am grateful in particular to Sveta Krasikova for discussion of this point.
(115) a. Hier ist es schöner als anderswo.
         here is it nicer than elsewhere
         ‘It’s nicer here than it is elsewhere.’
      b. possible reading: It is nicer here than it is anywhere else.

(116) a. Sam ist schneller als jemand anderes/sonstwer.
         Sam is faster than someone other/someone else
         ‘Sam is faster than another person.’
      b. possible reading: Sam is faster than anyone else is.
Also, the data in (117) (in addition to (111) above) provide an indefinite, ein anderer (‘another’), that is ambiguous. Both (117a) and (117b) were collected informally from the web. Context makes it clear that (117a) is intended to mean ‘faster than everyone else’ and (117b) is intended to mean that someone was slower.

(117) a. Wir denken 7-mal schneller, als ein anderer reden kann.
         we think 7 times faster than an other talk can
         ‘We think seven times faster than anyone else can talk.’
      b. Die meisten überholten mich, aber ab und zu war ich auch mal schneller als ein anderer.
         the most passed me but now and then was I also once faster than an other
         ‘Most people passed me, but now and then I was faster than someone.’
Matters look somewhat different when we consider plural indefinites. Beginning with bare plurals, note that many examples sound strange (thank you to Irene Heim for example (118)).

(118) a. John is taller than a giraffe.
      b. ??John is taller than giraffes.

(119) a. Prof. Shimoyama hat einen längeren Beitrag geschrieben als eine Doktorandin.
         Prof. Shimoyama has a longer contribution written than a Ph.D. student
         ‘Prof. Shimoyama wrote a longer contribution than a Ph.D. student.’ (ok: ∃x, ok: ∃f)
      b. ??Prof. Shimoyama hat einen längeren Beitrag geschrieben als Doktorandinnen.
         Prof. Shimoyama has a longer contribution written than Ph.D. students
         ‘Prof. Shimoyama wrote a longer contribution than Ph.D. students.’

(120) a. Hans ist schneller gelaufen als eine Schwester von Greg.
         Hans ran faster than a sister of Greg’s. (ok: ∃x, ok: ∃f)
      b. ??Hans ist schneller gelaufen als Schwestern von Greg.
         Hans ran faster than sisters of Greg’s.
      c. Hans ist schneller gelaufen als einige Schwestern von Greg.
         Hans ran faster than several sisters of Greg’s. (ok: ∃f)
The version with the singular indefinite can have an apparent narrow scope or an apparent wide scope interpretation (with some speaker variation regarding which interpretation is favoured). It is known that bare plurals prefer narrow scope interpretations; let’s say this implies that the choice function ‘∃f’ interpretation is dispreferred. What the oddness of the plural data tells us, then, is that there is something unexpectedly wrong with the non-specific ‘∃X’ interpretation of the plural indefinite (I write capital ‘X’ to indicate plurality, in contrast to ‘x’ for singular). Note that the data (118)–(120) improve when some or several/einige is added to the plural indefinite. They then have an apparent wide scope or ‘∃f’ interpretation. The following generalization emerges:

(121) Max>(m_inf(λD.∃X[. . . ])) is dispreferred relative to Max>(m_inf(λD.∃x[. . . ])).
      A plural indefinite ambiguous between ‘∃X’ and ‘∃f’ will yield ‘∃f’.
      A plural indefinite that prefers the ‘∃X’ interpretation will sound strange.
Why should a plural indefinite sound odd unless it can easily receive a specific interpretation? The generalization is intuitively unsurprising once we examine the ‘∃X’ interpretation more closely. Careful consideration as to what it would mean in the case of (120), provided in (122a), reveals that (given that there is more than one sister of Greg’s) it would be true iff the sentence with the singular ‘∃x’ (‘any sister of Greg’s’) would be true. I suggest that this makes the interpretation (122a) somehow inappropriate for the example. Perhaps this can be seen as a matter of economy: the plural has no purpose, hence cannot be used gratuitously.

(122) a. #Hans ran faster than Max>(m_inf([λD′. ∃X[∗sister(X) & ∀x ∈ X: x’s speed ∈ D′]]))
         = Hans ran faster than any sister of Greg’s.
      b. ∃f: CH(f) & Hans ran faster than Max>(m_inf([λD′. ∀x ∈ f(∗sister): max(λd. x ran d-fast) ∈ D′]))
         = Hans ran faster than each of the sisters selected by f (f a choice function).
         (dispreferred with bare plural, ok with some/several)
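That the non-specific plural ‘∃X’ interpretation adds nothing to the singular ‘∃x’ interpretation can be verified mechanically in a small model. The following Python sketch is an illustration under several assumptions that are not in the text: a discretised speed scale, three sisters with invented speeds, pluralities restricted to sets of at least two individuals, m_inf read as "minimal intervals" and Max> read as "largest end point".

```python
from itertools import combinations

def intervals(scale):
    """All closed intervals (lo, hi) over a finite, sorted scale of degrees."""
    return [(lo, hi) for i, lo in enumerate(scale) for hi in scale[i:]]

def m_inf(than_clause, scale):
    """Assumed reading of maximal informativity: minimal intervals only."""
    members = [D for D in intervals(scale) if than_clause(D)]
    def proper_sub(a, b):
        return a != b and b[0] <= a[0] and a[1] <= b[1]
    return [D for D in members if not any(proper_sub(E, D) for E in members)]

def max_gt(selected):
    """Assumed reading of Max>: the largest end point among the selected intervals."""
    return max(hi for _, hi in selected)

scale = list(range(5, 21))              # speeds (hypothetical units)
speed = {"a": 10, "b": 12, "c": 15}     # Greg's sisters (hypothetical)

# Singular: intervals containing the speed of some sister
singular = lambda D: any(D[0] <= s <= D[1] for s in speed.values())

# Plural (assumption: at least two members): intervals containing
# the speeds of all members of some plurality of sisters
plural = lambda D: any(
    all(D[0] <= speed[x] <= D[1] for x in X)
    for n in range(2, len(speed) + 1)
    for X in combinations(speed, n)
)

standard_sg = max_gt(m_inf(singular, scale))
standard_pl = max_gt(m_inf(plural, scale))
print(m_inf(plural, scale), standard_sg, standard_pl)
```

In this model the minimal plural intervals are (10, 12) and (12, 15), so both interpretations select 15, the speed of the fastest sister. The comparative then has the same truth value for every possible speed of Hans, which is the sense in which the plural serves no purpose.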
(123) is a first shot at what the relevant constraint might effect. The reading that survives, (122b), is one in which, compared to the corresponding singular indefinite, the plural serves a purpose.

(123) Ban on Unmotivated Pluralization (BUMP):
      Do not quantify over a plurality if quantification over a singularity lets you infer the same reference.
It would be good to be able to reduce this phenomenon to other cases with a similar semantics.8 Below I relate than-clauses to definite descriptions and embedded questions (I am once more inspired by Danny Fox (p.c.) in making this connection). The idea is that all three constructions share some sense of maximality and/or maximal informativity (Fox & Hackl 2006 and the above considerations). So (124a) refers to the maximal, and in the sense of (85) above, the maximally informative speed that John ran; (124b) will require the maximally informative answer, i.e. the maximal speed John reached; and according to the analysis developed here, (124c) is of course analogous.

(124) a. the speed that John ran
      b. how fast John ran
      c. than John ran

8 An anonymous reviewer and Danny Fox pointed out to me that a plural is not generally dispreferred when a singular yields the same interpretation, contrary to a claim I made in an earlier version of this paper. Negation and other downward monotone environments allow plural indefinites, as the example in (i) illustrates. I thank them for pointing out this flaw to me.
(i) We don’t sell apples (??an apple) in this store. There were no women present.
The following three sets of data replace the proper name in (124) with various kinds of indefinites in the three constructions. The plain singular indefinite is fine and picks out the fastest speed in the definite description and the question as well as in the than-clause, in addition to a possible specific reading. The bare plurals are somewhat odd, which we can explain if a constraint like the BUMP above is operative (and the ‘∃f’ interpretation is dispreferred). The last set, with plural some indefinites, is fine and has the specific reading. Plural indefinites with some are different from bare plurals in easily allowing an ‘∃f’ interpretation.

(125) a. the speed that a sister of Greg’s ran
      b. how fast a sister of Greg’s ran
      c. than a sister of Greg’s ran

(126) a. ??the speed that sisters of Greg’s ran
      b. ??how fast sisters of Greg’s ran
      c. ??than sisters of Greg’s ran

(127) a. the speed that some sisters of Greg’s ran
      b. how fast some sisters of Greg’s ran
      c. than some sisters of Greg’s ran
These data share the problem of having to determine unique reference from a set via maximality/informativity. They motivate the way that the BUMP is phrased above. Perhaps it is the nature of maximality/informativity as ‘glue’ that makes it sensitive to such a constraint: the step of postulating such operators is an inference one draws to have things make sense, and such inferences are subject to ‘making sense’-type requirements like the BUMP. But I hasten to add that I am by no means confident that I understand what is at stake and that more work ought to be done in figuring out what the BUMP is really about.

I conclude this subsection with a couple of comments on further kinds of indefinites. The first data point confirms the perspective on the data developed so far with the German example (128), where the obligatorily weak lauter (several/many) sounds very strange. Only einige (several) is acceptable, under an apparent wide scope reading.

(128) Annett hat lauter gesungen als einige/??lauter Sopranistinnen.
      Annett has louder sung than several sopranos
      ‘Annett sang more loudly than several sopranos.’
This can be understood if lauter disprefers a choice function analysis, permitting only the BUMP violating reading (128′a), while einige yields an acceptable interpretation in terms of (128′b). Our assumption about lauter vs. einige is confirmed by (129), where only the version with einige allows the specific interpretation of the NP ‘relatives of mine’.

(128′) a. #Annett sang more loudly than Max>(m_inf([λD′. ∃X[∗soprano(X) & ∀x ∈ X: max(λd. x sang d-loudly) ∈ D′]]))
          = Annett sang more loudly than any soprano.
       b. ∃f: CH(f) & Annett sang more loudly than Max>(m_inf([λD′. ∀x ∈ f(∗soprano): max(λd. x sang d-loudly) ∈ D′]))
          = Annett sang more loudly than each of the sopranos selected by f (f a choice function)

(129) a. Wenn einige Verwandte von mir sterben, erbe ich einen Bauernhof.
      b. Wenn lauter Verwandte von mir sterben, erbe ich einen Bauernhof.
         ‘If several relatives of mine die, I will inherit a farm.’
Similarly, we might expect that NPIs in than-clauses will only be licensed on the apparent narrow scope reading ‘∃x’ (perhaps they have no ‘∃f’ interpretation, or perhaps that interpretation would fail to satisfy the licensing requirements on their context). This predicts that singular NPIs only have an apparent narrow scope reading. It also makes the interesting prediction that plural NPIs should be odd in than-clauses. (130b) is judged degraded compared to (130a) and (130c) by some speakers, but not by all.

(130) a. John solved this problem faster than any girl did.
      b. ??John solved this problem faster than any girls did.
      c. John solved this problem faster than any of the girls did.
I don’t understand why some people judge (130b) to be fine; I wonder whether a Free Choice interpretation of any girls is possible for those who accept the sentence. A final remark: it is not the case that plural indefinites in than-clauses are generally bad, not even narrow scope ones. The data in (131) embed the indefinite beneath another operator, and the BUMP does not apply.
(131) a. More people bought books than read magazines.
      b. I buy books more often than I buy magazines.
To sum up: indefinites are semantically ambiguous, and this shows up in than-clauses just like it does elsewhere. Apparent wide scope of indefinites is analysed as pseudoscope: a specific reading. Sometimes one interpretation is excluded by independent factors. In particular an economy constraint BUMP can rule out ‘∃X’ for plural indefinites in than-clauses.9 The analysis rests on how the semantic glue interacts with intervals, and on how the interpretation is derived. I assume that the semantic glue is sensitive to BUMPy constraints, i.e. that it is a natural place for their application. 3.3.2
Numerals
With these results regarding indefinites in place, let us next be somewhat more precise in our semantic analysis of ‘exactly n’. Like Gajewski (2008), we employ a more elaborate analysis of these numerals (compare Hackl 2001a,b; 9 It is not clear to me that competing analyses of quantifiers in than-clauses can easily explain the pattern of singular vs. plural indefinites. To give an example, the Pi analysis (supposing it goes along with my assumptions about the semantics of plural indefinites) predicts for (ia) a narrow scope reading (ic) in addition to the wide scope reading (ib). (i)
a. b. c.
d. e.
John was faster than (some) sisters of Greg’s were. ∃X[∗ sister(X) & ∀x ∈ X : Speed(John) > Speed(x)] ‘Some sisters of Greg’s were slower than John.’ Speed(John)] > max(λd. ∃X[∗ sister(X) & ∀x ∈ X : Speed(x) ≥ d]) ‘John’s speed exceeds the speed reached by the slowest member of a plurality of sisters of Greg’s’ = John was faster than the second fastest sister of Greg’s. ∃d[Speed(John) ≥ d & NOT ∃X[∗ sister(X) & ∀x ∈ X : Speed(x) ≥ d]] Suppose Greg has three sisters:
_ _•_ _ _ _ _ _ _ _ •_ _ _ _ _ _ _ _ •_ _ _ _ _ _ _/ x1
x3
x2 largest speed reached by every member of a plurality of sisters of Greg’s
An interpretation corresponding to (ic) is not available and would have to be excluded — in the plural case, but not in the singular. The reading predicted by the NOT-theory, (id), is parallel. Depending on how hard it is to do so, an argument might be gained for the selection analysis from the pattern of singular vs. plural indefinites in than-clauses.
1:47
Sigrid Beck
Krifka 1999 on the semantics of such NPs). Remember the simple example (66) and its analysis. (66)
a. b.
Exactly three girls weigh 50 lb. [EXACT [XP (exactly) threeF girls weigh 50 lb.]]
(660 )
threeF girls weigh 50 lb.o = ∃X[∗ girl(X)&card(X) = 3&∗ weigh. 50. lb(X)] threeF girls weigh 50 lb.f = {∃X[∗ girl(X) & card(X) = n & ∗ weigh. 50. lb(X)] : n ∈ N}
(67) ⟦EXACT⟧(⟦XP⟧f)(⟦XP⟧o) = 1 iff ⟦XP⟧o = 1 & ∀q ∈ ⟦XP⟧f : ¬(⟦XP⟧o → q) → ¬q
     ‘Out of all the alternatives of XP, the most informative true one is the ordinary semantics of XP.’
(68) ⟦(66b)⟧ = 1
     iff ∃X[∗girl(X) & card(X) = 3 & ∗weigh.50.lb(X)] & ∀n[n > 3 → ¬∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)]]
     iff max(λn. ∃X[∗girl(X) & card(X) = n & ∗weigh.50.lb(X)]) = 3
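The equivalence in (68) can be checked on a small finite model. The following Python sketch is only an illustration and not part of the analysis itself: the function names, the toy domain of four girls and the dictionary `weighs_50` are hypothetical. It assumes that the alternatives not entailed by the ordinary value are exactly those with a larger numeral.

```python
from itertools import combinations

def exists_group(girls, weighs_50, n):
    """∃X[*girl(X) & card(X) = n & *weigh.50.lb(X)]: some n girls each weigh 50 lb."""
    return any(all(weighs_50[g] for g in group)
               for group in combinations(girls, n))

def exact(girls, weighs_50, n):
    """EXACT as in (67): the ordinary value is true and every
    non-entailed alternative (assumed here: every m > n) is false."""
    ordinary = exists_group(girls, weighs_50, n)
    stronger_false = all(not exists_group(girls, weighs_50, m)
                         for m in range(n + 1, len(girls) + 1))
    return ordinary and stronger_false

def max_n(girls, weighs_50):
    """max(λn. ∃X[... card(X) = n ...]), the last line of (68)."""
    return max(n for n in range(len(girls) + 1)
               if exists_group(girls, weighs_50, n))

# Hypothetical model: three of four girls weigh 50 lb.
girls = ["a", "b", "c", "d"]
weighs_50 = {"a": True, "b": True, "c": True, "d": False}
print(exact(girls, weighs_50, 3), max_n(girls, weighs_50))
```

In this model the EXACT condition holds for the numeral three and for no other numeral, which matches the maximality formulation.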
This step does not immediately solve our problem. If we give the than-clause in (108) the semantics in (132), nothing changes: we still compare with the tallest of John’s classmates, as long as there are at least five. Notice, however, that this interpretation is just as strange as the plain plural indefinite ‘∃X’ interpretation above, since the number information serves no real purpose for the truth conditions.
(108) John is taller than exactly five classmates of his are.
(132) λD′. max(λn. ∃X[∗classmate(X) & card(X) = n & ∗Height(X) ∈ D′]) = 5
      Intervals into which the height of exactly five of John’s classmates falls
(133) John is taller than Max>(m_inf(λD′. max(λn. ∃X[∗classmate(X) & card(X) = n & ∗Height(X) ∈ D′]) = 5))
(133′) Presupposition: John has at least five classmates.
       Assertion: He is taller than any of them.
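To see why (133) amounts to (133′), one can compute the than-clause intervals in a toy model. The sketch below is a simplification under stated assumptions: heights are distinct integers (hypothetical values in cm), intervals are closed and have attested heights as endpoints, m_inf is modeled as minimality under inclusion, and Max> returns the upper end of the interval that extends furthest. All function names are illustrative.

```python
def intervals_with_exactly(heights, k):
    """Closed intervals (lo, hi), endpoints at attested heights,
    into which exactly k of the heights fall (cf. (132))."""
    pts = sorted(set(heights))
    return [(lo, hi) for lo in pts for hi in pts
            if lo <= hi and sum(lo <= h <= hi for h in heights) == k]

def m_inf(intervals):
    """Most informative intervals, modeled as those that contain no other
    interval of the set as a proper part."""
    def properly_inside(a, b):
        return b[0] <= a[0] and a[1] <= b[1] and a != b
    return [i for i in intervals
            if not any(properly_inside(j, i) for j in intervals)]

def than_clause_point(heights, k):
    """Max>(m_inf(...)); None models presupposition failure."""
    minimal = m_inf(intervals_with_exactly(heights, k))
    if not minimal:
        return None
    return max(hi for _, hi in minimal)

# Hypothetical model: seven classmates, John is 185 cm tall.
classmates = [160, 165, 170, 172, 175, 180, 183]
john = 185
point = than_clause_point(classmates, 5)
print(point, john > point)
```

With seven classmates, the interval that extends furthest ends at the tallest classmate, so the numeral contributes only the requirement that there be at least five classmates; this is the reading in (133′).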
This reading is thus ruled out by the same constraint BUMP. We should then alternatively consider a choice function analysis of the indefinite ‘n classmates’. I combine this below with the assumption that exactly is evaluated in the matrix clause. In (134), we derive the desired interpretation.
(134) max(λn. ∃f[CH(f) & John is taller than Max>(m_inf(λD′. ∀x ∈ f(λX. ∗classmate(X) & card(X) = n) : Height(x) ∈ D′))]) = 5
      ‘the largest number n such that John is taller than the tallest of the n classmates of his selected by some choice function f is 5.’
An LF of example (108) representing a version of Krifka’s analysis looks as in (135).
(135) a. [EXACT [John is taller [than Max> m_inf [(exactly) 5F of his classmates are tall]]]]
      b. Out of all the alternatives of the form ‘John is taller than n of his classmates are’, the most informative true one is ‘John is taller than 5 of his classmates are’.
The applicability of the constraint BUMP to numeral indefinites is empirically supported by the data below, which behave in a parallel way to plural indefinites with some, for example.
(136) a. the speed that two finalists drove
      b. how fast two finalists drove
      c. than two finalists drove
Thus I suggest that a proper semantic analysis of numeral NPs makes the facts compatible with a selection solution after all.
3.3.3 Further relevant cases
The analysis developed here for indefinite NPs in than-clauses needs to be extended to NPs with many and most, which show the same apparent wide scope interpretations we observed for numerals.
(137) a. John is taller than many of his classmates are.
      b. There are many classmates of John’s such that he is taller than they are.
(138) a. John is taller than most of his classmates are.
      b. For most x, x a classmate of John’s: John is taller than x.
I will make further use of the semantics developed by Hackl (2001a,b, 2009) for these NPs, according to which ‘many N’ is an indefinite NP including a gradable adjective in the positive form, and ‘most N’ is correspondingly a superlative. This makes feasible analyses that can be paraphrased in the following way:10
(137′) John is taller than the tallest of the many-membered group of classmates of his selected by f (f a choice function).
(138′) John is taller than the tallest of the group selected by f, which comprises a majority of his classmates (f a choice function).
More detailed analyses are given below ((139) provides the two potential readings of (137), and (140)–(142) analyse (138)). Besides being able to predict the existing readings, the BUMP constraint in (123) will rule out the ones that are intuitively unavailable.
(139) a. #John is taller than Max>(m_inf([λD′. ∃X[∗classm(X) & many(X) & ∀x ∈ X : Height(x) ∈ D′]]))
         = John is taller than any classmate (as long as there are many).
      b. ∃f : CH(f) & John is taller than Max>(m_inf([λD′. ∀x ∈ f(λX. ∗classm(X) & many(X)) : Height(x) ∈ D′]))
         = John is taller than each of the many classmates selected by f (f a choice function)
(140) ⟦than [1 [X most of his classmates are t1 tall]]⟧
      = [λD′. ∃X∃d[∗classm(X) & d-many(X) & ∀Y ∈ C[Y ≠ X & ∗classm(Y) → ¬d-many(Y)] & ∀x ∈ X : Height(x) ∈ D′]]
      intervals that contain the heights of a majority of John’s classmates
(141) ⟦than [1 [f most of his classmates are t1 tall]]⟧
      = [λD′. ∀x ∈ f(λX. ∃d[∗classm(X) & d-many(X) & ∀Y ∈ C[Y ≠ X & ∗classm(Y) → ¬d-many(Y)]]) : Height(x) ∈ D′]
      intervals that contain the heights of the majority of John’s classmates selected by f
(142) a. #John is taller than Max>(m_inf([λD′. ∃X∃d[∗classm(X) & d-many(X) & ∀Y ∈ C[Y ≠ X & ∗classm(Y) → ¬d-many(Y)] & ∀x ∈ X : Height(x) ∈ D′]]))
         = John is taller than the tallest of any majority of his classmates.
         = John is taller than any of his classmates.
      b. ∃f : CH(f) & John is taller than Max>(m_inf([λD′. ∀x ∈ f(λX. ∃d[∗classm(X) & d-many(X) & ∀Y ∈ C[Y ≠ X & ∗classm(Y) → ¬d-many(Y)]]) : Height(x) ∈ D′]))
         = John is taller than the tallest of the majority of John’s classmates selected by f (f a choice function)
         = For most x, x a classmate of John’s: John is taller than x.
10 An anonymous reviewer points out that this predicts that these NPs can have the same specific readings we know from indefinites. I concur, but would like to point out that this prediction arises from an analysis of these quantifiers as indefinites, not from the application of that analysis to than-clauses. The empirical test cases include data like (i) below.
(i) a. If many relatives of mine die, I will inherit a farm.
    b. If most relatives of mine die, I will inherit a farm.
To sum up: this subsection has analysed the available vs. unavailable readings of indefinite NPs in than-clauses using a choice function mechanism plus a constraint on unmotivated pluralization. The formulation of the BUMP in (123) is offered as a first version of the constraint we need; what we want to derive is that it is strange to say ‘John is taller than exactly three girls are’ if we meant, and might as well have said, ‘John is taller than any girl is’. Since this seems eminently reasonable, I am hopeful that a good way of stating the relevant constraint exists. Given this, the present section has extended the selection analysis to apparent wide scope indefinite NPs of various kinds (including numerals, many and most), using a pseudoscope mechanism argued for extensively for indefinites independently of comparatives. The comparative semantics itself remains simple.
3.4 Refinement III: Differentials
The final kind of data that does not immediately fall out from the selection analysis is represented by example (143) below: a than-clause containing a universal quantifier in combination with a differential.
(143) a. John is exactly 2″ taller than every girl is.
      b. For every girl x: John is exactly 2″ taller than x.
Compared to Heim, and also Schwarzschild & Wilkinson, we seem to have a problem. Heim’s analysis can derive the intuitive interpretation as shown below.
(144) [[than every girl is tall] [5 [John is exactly 2″ taller t5]]]
(144′) ⟦than [1 [every girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]⟧ = λD′. ∀x[girl(x) → Height(x) ∈ D′]
       intervals into which the height of every girl falls
(145) ⟦(144)⟧ = [λD′. ∀x[girl(x) → Height(x) ∈ D′]](λd. John is exactly 2″ taller than d)
      = for every girl x: John is exactly 2″ taller than x
Choice of Max> on the other hand predicts a different interpretation, which does not seem right for (143):
(146) John is exactly 2″ taller than Max>(m_inf(than-clause))
      = John is exactly 2″ taller than the tallest girl.
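The difference between the two sets of truth conditions can be made concrete with a small Python sketch. It is an illustration only: heights are hypothetical integers in inches, and the function names are not part of the analysis. The selection reading assumes that the most informative than-clause interval runs from the shortest to the tallest girl, so that Max> returns the height of the tallest girl.

```python
def wide_scope_exactly(john, girls, diff):
    """(145): for every girl x, John is exactly diff taller than x."""
    return all(john - g == diff for g in girls)

def selection_exactly(john, girls, diff):
    """(146): John is exactly diff taller than the tallest girl."""
    return john - max(girls) == diff

# Girls of different heights: the two readings come apart.
spread = [64, 66, 68]
print(wide_scope_exactly(70, spread, 2), selection_exactly(70, spread, 2))

# Girls of equal height: the two readings coincide.
equal = [68, 68, 68]
print(wide_scope_exactly(70, equal, 2), selection_exactly(70, equal, 2))
```

Whenever the wide scope condition holds, the selection condition holds as well; the converse fails as soon as the girls differ in height.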
The intuitively available reading of (143a) can be described as one in which we assume that all the girls reach the same height. I call this an assumption of equality among the individuals universally quantified over, EQ for short. The EQ appears to speak in favor of a scope solution since it is entailed by the truth conditions resulting from giving the universal wide scope over the comparison. It is not entailed by the truth conditions according to the selection analysis, although it is of course compatible with the truth conditions in (146) that the girls all have the same height. Sentence (143a) exemplifies a problem that arises when a than-clause containing a universal quantifier is combined with a differential that includes exactly, at most or almost. A differential including at least does not distinguish between the two sets of truth conditions.
(147) a. John is at most/almost 2″ taller than every girl is.
      b. For every girl x: John is no more than 2″ taller than x
      c. #John is no more than 2″ taller than the tallest girl.
(148) a. John is at least 2″ taller than every girl is.
      b. For every girl x: John is at least 2″ taller than x
      c. John is at least 2″ taller than the tallest girl.
An unmodified differential does not constitute evidence as strong as an exactly/at most-type differential, because, while it gives rise to the usual strengthening implicature that amounts to an exactly reading, this implicature can be canceled. If we suppose the implicature to be present, the unmodified differential is parallel to exactly.
(149) a. John is 2″ taller than every girl is.
      b. Implicature: John is no more than 2″ taller than every girl is.
      c. John is 2″ taller than every girl is, perhaps more.
To sum up the picture so far, differentials with exactly and at most, and perhaps simple differentials, seem to be problematic for the selection analysis as opposed to the scope analysis. However, there is more to say about this issue empirically and theoretically. Beginning with the theoretical side, note that the interpretation of the matrix clause in (144) was simplified in terms of not giving the differential quantifier exactly 2″ independent scope.11 Data like (150) show that such expressions do take scope, however:
(150) You are allowed to be exactly 6′ tall.
(151) ⟦exactly 6′⟧ = λD. max(D) = 6′
(150′) a. max(λd. ∃w[wAcc@ & you are d-tall in w]) = 6′
          The largest permitted height for you is 6′.
       b. ∃w[wAcc@ & max(λd. you are d-tall in w) = 6′]
          It is permitted that you be exactly 6′ tall.
Hence, in addition to (a more elaborate version of) (144) above, the LF and interpretation in (152) become possible. For the Pi theory, this leads to availability of the analysis in (153).
(152) [[exactly 2″] [4 [[than every girl is tall] [5 [John is t4 taller t5]]]]]
(144′) ⟦than [1 [every girl [2 [[Pi t1] [3 [t2 is t3 tall]]]]]]⟧ = λD′. ∀x[girl(x) → Height(x) ∈ D′]
       intervals into which the height of every girl falls
(153) ⟦(152)⟧ = ⟦exactly 2″⟧(λd′. [λD′. ∀x[girl(x) → Height(x) ∈ D′]](λd. John is d′ taller than d))
      = ⟦exactly 2″⟧(λd′. for every girl x: John is d′ taller than x)
      = max(λd′. for every girl x: John is d′ taller than x) = 2″
      ‘The largest amount that John is taller than every girl is 2″.’
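The last step of (153) can be replayed numerically. The following sketch is an illustration under assumptions that go beyond the text: ‘John is d′ taller than x’ is read monotonically (John's height exceeds x's by at least d′), differentials range over non-negative integers, John is at least as tall as every girl, and the heights (in inches) are hypothetical.

```python
def max_common_differential(john, girls):
    """max(λd'. for every girl x: John is d' taller than x),
    with 'd' taller' read as 'taller by at least d''."""
    return max(d for d in range(john + 1)
               if all(john - g >= d for g in girls))

girls = [64, 66, 68]
john = 70
print(max_common_differential(john, girls))
```

On this reading the largest differential that holds of every girl is the distance to the tallest girl, which is why the LF in (152) does not require the girls to be equally tall.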
11 Thanks to Danny Fox for drawing my attention to this point.
Note that this LF no longer predicts all the girls to have the same height. It says that John is exactly 2″ taller than the tallest girl, just like the selection analysis. It is thus not clear that the predictions of the scope analysis are really different from, and superior to, the selection analysis.
Next, let’s take a closer look at the data. Above, we identified as a problem that EQ is not predicted, the assumption that all individuals universally quantified over have the same height (or whatever the gradable predicate measures). However, the data are quite difficult. While I agree with the perception in the literature that in (143a) the EQ is plausible, it is clear that it does not always arise. Below are some examples where it doesn’t; (154)–(156) are collected from the internet.12 The reader can convince her/himself that further relevant data can easily be found. The difficulty in determining the interpretation of data with nominal universal quantifiers is related to the point mentioned in Section 2 about differentials and intensional verbs. I mention in (156′) a suggestive example also collected from the web.
(154) Aden had the camera for $100 less than everyone else in town was charging.
(155) WOW! Almost 4 seconds faster than everyone else, and a 9 second gap on Lance.
(156) Jones was almost an inch taller than the both of them. (the both of them = John Lennon and Paul McCartney, Jones = Tom Jones. The author thinks that Jones was 5′11″ and that Paul McCartney was about 5′10″. John Lennon is reported to be shorter than McCartney by about an inch.)
(156′) I finished 30 seconds faster than I expected. [...] I know my 300 yard time more accurately now. (the continuation suggests that the speaker’s expectation was a range rather than a precise point in time.)
The examples are straightforwardly analysed using Max> to determine the relevant ‘point’ provided by the than-clause.13 The differential measures the distance between that and the main clause degree. This is demonstrated for (155) below.
(155′) a. #For all x, x ≠ Z: (Z was) almost 4 seconds faster than x (wide scope)
       b. (Z was) almost 4 seconds faster than Max>(m_inf(λD′. for all x ≠ Z : Speed(x) ∈ D′))
          = Z was almost 4 seconds faster than the next fastest person. (selection Max>)
12 A naive Google search has not unearthed a clearly relevant example with an exactly-differential.
13 A different type of example illustrated below is difficult for both a scope and a selection analysis. I find it hard to decide what such examples mean precisely. It seems plausible to me that we select some kind of ‘point’ from the meaning of the than-clause, but not in the way described in the text.
(i) a. Ben was almost a year older than everyone else in his class (because he had just missed the deadline for the previous school year).
    b. #For all x ≠ Ben: Ben was almost a year older than x.
    c. #Ben was almost a year older than the next oldest in his class.
    d. ?The others’ ages center around a point almost a year younger than Ben.
We face the task of figuring out what distinguishes (143) from (154)–(156), i.e. why EQ arises in some data but not all. I would like to ask this question in terms of how the selection analysis might predict not only (154)–(156), but also (143). To this effect, let’s take a closer look at the combination of a differential with a comparative. Note that we understand a claim like (157a) relative to a plausible level of granularity. For us to judge (157a) to be true, it is in most contexts sufficient to be precise up to the level of a few millimeters. Suppose on the other hand that (157b) is about a sensitive piece of machinery. A one millimeter margin could very well not be acceptable. This means that what we call John’s height, or that rod’s length, is actually somewhat fuzzy: it is a ‘blob’ or an interval on the relevant scale whose size depends on context. The sensitivity to a level of precision is not represented in the standard truth conditions of the two examples given in (157′).
(157) a. Mary is exactly 2 cm taller than John is.
      b. This rod is exactly 2 cm longer than that rod is.
(157′) a. Height(Mary) = Height(John) + 2 cm
       b. Length(this rod) = Length(that rod) + 2 cm
To capture this, I follow Krifka (2007) in assuming that a scale can be divided into different units. A unit on the scale then has to be identified that can count as a ‘point’ at the contextually relevant level of granularity. Which division we assume depends on context. Talking about a length of 1.80 m for example could then refer to a very short or a somewhat larger stretch of the scale, depending on the relevant standard of precision/unit size. I talk about unit size as granularity.
(158) [Scale diagram: a point marked 1.80 m on a scale, shown under successively finer divisions into units of 5 cm, 2 cm and 1 cm]
I make use of Schwarzschild’s (1996) notion of a cover as a division of an entity into its contextually relevant parts, and apply it to scales in (159). Covers provide the relevant granularity.
(159) Let ⟨S, >⟩ be a scale. Then Cov is a cover of S if Cov is a set of subsets of S such that each d in S is in some set in Cov, each set in Cov is contiguous and no two sets in Cov overlap. Assume Cov to be the set of intervals that are of the contextually relevant size.
I furthermore revise the definition of an end “point” from (160) to (161) ((161b) is the informal version, (161c) the more precise version employing covers). Note that the distinction between points and intervals dissolves under this view because what we usually call a point is an interval on the scale whose size depends on context.
(160) a. Max>(p⟨⟨d,t⟩,t⟩) := max>(max>(p))
         = the end point of the interval that extends furthest
      b. Let S be a set ordered by R. Then maxR(S) = ιs[s ∈ S & ∀s′ ∈ S[sRs′]]
(161) a. Max>(p⟨⟨d,t⟩,t⟩) := end>(max>(p))
         = the end ‘blob’ of the interval that extends furthest
      b. end>(D) := ιd. d ⊆ D & ¬∃d′[d′ ⊆ D & d′ > d] & d counts as a point at the relevant level of granularity
      c. Let Cov be the set of intervals that are of the contextually relevant size.
         end>,Cov(D) := ιd. d ⊆ D & d ∈ Cov & ¬∃d′[d′ ⊆ D & d ≠ d′ & d′ ∈ Cov & d′ > d]
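The cover-based definition in (161c) can be sketched computationally. The following Python fragment is a toy model under assumptions of my own choosing: degrees are integers (millimeters), intervals and cover cells are half-open pairs `(lo, hi)`, and covers are built from equal-sized cells by the hypothetical helper `make_cover`.

```python
def make_cover(start, stop, unit):
    """A cover of [start, stop) by contiguous, non-overlapping cells of size unit."""
    return [(a, a + unit) for a in range(start, stop, unit)]

def end_cov(D, cov):
    """end_{>,Cov}(D): the cover cell inside D that has no other
    cover cell inside D above it (cf. (161c))."""
    lo, hi = D
    inside = [(a, b) for (a, b) in cov if lo <= a and b <= hi]
    if not inside:
        return None
    # Cells do not overlap, so tuple order tracks the order on the scale.
    return max(inside)

def max_cov(p, cov):
    """Max_{>,Cov}(p): the end 'blob' of the interval in p that extends furthest."""
    furthest = max(p, key=lambda D: D[1])
    return end_cov(furthest, cov)

coarse = make_cover(1500, 2000, 50)   # 5 cm units
fine = make_cover(1500, 2000, 10)     # 1 cm units
D = (1700, 1800)
print(end_cov(D, coarse), end_cov(D, fine))
```

The same interval thus yields a larger or smaller end ‘blob’ depending on the cover, which is the sense in which the distinction between points and intervals becomes a matter of context.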
Supposing that we talk about what we roughly call 1.80 m, the meanings of our two than-clauses could (depending on context, i.e. the relevant cover) come out as in (162). It is thus in the nature of scales that they have a part/whole structure whose units are determined in a context dependent manner.
(162) a. Max>,Cov1(than John is tall) = [1.798–1.803] (a 0.5 cm unit)
      b. Max>,Cov2(than that rod is long) = [1.7998–1.8002] (a 0.4 mm unit)
Let’s consider differentials under this refined understanding of scales. A differential measures the distance from the “point” referred to in the matrix to the “point” referred to in the than-clause, “point” being determined by the relevant unit size. Note that a plausible granularity for the than-clause has to match the granularity level suggested by the differential. If the two do not match, an odd sentence results. I call this a granularity clash. In the example below, we know that it is impossible to determine to the second the amount of time that it took John to learn French. The than-clause comes inherently with a coarse granularity, which clashes with the granularity of the differential in (163b).
(163) a. Mary learned arithmetic faster than John learned French.
      b. ?Mary learned arithmetic faster than John learned French by 7 minutes 23 seconds.
      c. Mary learned arithmetic faster than John learned French by several months.
We can generalize from the example as follows. In a comparative of the form (164a), it must at least be given that the cover of the relevant interval that the than-clause provides (via informativity) furnishes units that are smaller than the differential; i.e. (164b) is a requirement for the comparative to make sense. If that is the case, then the unit picked out as a “point” by Max> will also be smaller than the differential (164c). The comparative can then measure the gap between the main clause degree and the maximum of the than-clause with the differential ((164d)). If the maximum itself is larger, this will be impossible. In our example, suppose that we can with exceptional precision determine to the day how long it took Mary to learn arithmetic and John to learn French. We cannot reasonably measure the gap between two days in terms of the differential ‘7 minutes 23 seconds’. The level of granularity relevant for the than-clause has to make sense in relation to the differential.
(164) a. Main Clause Differential than D
      b. for all U ∈ Cov: U < Diff
      c. Since Max>,Cov(D) ∈ Cov : Max>,Cov(D) < Diff
      d. Max>,Cov(Main Clause) = Diff + Max>,Cov(D)
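The requirement in (164b) lends itself to a simple check. The sketch below is illustrative only: durations are measured in seconds, cover cells are half-open pairs `(lo, hi)`, the one-day cover follows the supposition made in the text, and the value chosen for ‘several months’ (90 days) is a hypothetical stand-in.

```python
DAY = 24 * 60 * 60  # seconds

def granularity_clash(cov, diff):
    """True if some cover unit is not smaller than the differential,
    i.e. if the requirement in (164b) is violated."""
    return any(hi - lo >= diff for (lo, hi) in cov)

# Hypothetical cover: learning times determined to the day, over 400 days.
day_cover = [(d * DAY, (d + 1) * DAY) for d in range(400)]

seven_min_23 = 7 * 60 + 23      # the differential in (163b)
several_months = 90 * DAY       # stand-in for the differential in (163c)

print(granularity_clash(day_cover, seven_min_23))
print(granularity_clash(day_cover, several_months))
```

Under these assumptions the differential in (163b) clashes with the cover while the one in (163c) does not.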
The reasoning works out given that the cover, and therefore the unit that counts as ‘maximal point’, is determined locally, i.e. than-clause internally, independently of the differential which will then either fit or clash.14
I think that granularity offers an explanation for the interpretive effect I call EQ. Consider the situation depicted below for (165). If we have no further information regarding the situation, the girls’ sizes can be far apart. This would indicate a large interval. The idea is that the semantics of the than-clause itself indicates possible Covers. There is then a danger that we have a coarse-grained cover. A reasonable division of the interval x1–x5 would be into relatively long units, hence Max> is long. This would be incompatible with the differential: a granularity clash. That is, a sentence in which the than-clause indicates a real spread (e.g. because of a universal quantifier) brings with it the danger of a granularity mismatch with a differential.
(165) John is exactly 2″ taller than every girl is.
(166) [Scale diagram: six points on the height scale, labeled x1–x5 for the girls and J for John]
      m_inf((than) every girl is tall) = { x1–x5 }
14 A similar effect can be observed with Covers in the plural domain in examples like (i) below.
(i) a. The women and the men love their child.
    b. The Smiths and the Johnsons love their child.
Suppose we are talking about Angelina and Reginald Johnson and Mary and John Smith. Then the two subjects in (ia) and (ib) refer to the same group, but make different covers salient (Schwarzschild 1996). By virtue of the cover suggested by the subject, (ia) tends to be understood as ‘the women love their child and the men love their child’, which is unexpected. (ib) amounts to ‘the Smiths love their child and the Johnsons love their child’, which is more expected. The point is that the subject group autonomously makes salient a cover, whether this leads to a plausible interpretation of the whole or not.
The Cover indicated by the than-clause may agree with the differential only under an additional assumption of closeness of the individual “points” covered by the than-clause interval. My suggestion is that if a potential granularity clash could only be avoided under an additional assumption of closeness, one tends to assume equality and a default Cover of the than-clause interval D in terms of the singleton set {D}. This is the EQ. In short, without an informative context, there is a danger of a granularity clash. The danger is avoided by the EQ. The EQ would under this analysis be an extra assumption speakers make in order to ensure that a sentence is meaningful. (Note that the EQ is not the weakest assumption one could make to ensure that; perhaps it is the simplest assumption.)
The data above for which the selection analysis automatically makes good predictions with Max>, (154)–(156), are such that we have a rather clear expectation about the kind of interval denoted by the than-clause: the range within which the individual degrees fall is fixed. The context is rich, and no problems with granularity arise. Thus a genuine Max> interpretation (i.e. one in which we pick out the maximum from a genuine spread) is possible without further assumptions. This distinguishes those data from our original example (165). I suggest that danger of a granularity clash leads to EQ: to supposing that the ‘points’ that are in danger of being spread over too large an interval in fact collapse into one.
We expect that it should depend on the amount of information available on the interval covered by the than-clause whether we get an EQ interpretation or a genuine Max> interpretation. Additional information to the effect that the points are not the same, but close enough together for the purposes of the differential, may make the EQ unnecessary and thus make a genuine Max> interpretation possible for our EQ data. This appears to me to be correct:
(167) Background: we are running an experiment in which we vary the growth conditions of seedlings. In particular, we test different fertilizing agents (ViagraFlor, Dung™, ComposFix and GuanoPlus) and their effect on how fast our seedlings grow. After two weeks, it is reported that:
(168) The ComposFix seedlings are exactly 2″ taller than all the others. (Max> possible)
Danger of granularity clash arises in uninformative contexts and triggers EQ. I should be able to take the same than-clauses that occurred in Max> examples and place them into a less fortunate context, and trigger EQ. Again, this seems the right prediction.
(169) a. This pot dries out exactly 40 min faster than all the others. (EQ likely)
      b. This T-Shirt dries exactly 20 min faster than all the others. (EQ likely)
We see that minimal pairs can be found that have essentially the same comparative (differential plus comparative adjective plus than-clause) but differ as to informativity of background context regarding the than-clause interval. An uninformative context makes us assume that the interval is point-like, so that Max> will be well defined and suitable (EQ). If we have enough background information to be sure that the Max> unit in the than-clause interval is suitable, we do not panic, make no extra assumptions, and can get a genuine Max> interpretation as expected.
Things are different with an existential quantifier. Consider (170) against the same background as before. The minimal than-clause intervals will be the heights of the individual girls. Max> will be well defined and suitable without any additional assumptions, and will make this a comparison between John’s height and the height of the tallest girl, as desired.
(170) John is exactly 2 cm taller than any girl is.
      Max>(m_inf((than) any girl is tall))
(171) [Scale diagram: six points on the height scale, labeled x1, x2, x3, x4, x5 for the girls and J for John]
I conclude that the selection strategy provides a reasonable perspective on differential comparatives. It depends on context whether we get an EQ interpretation or a genuine Max> interpretation, and the selection strategy can explain this. I will not investigate here what a scope strategy could say about the data.
A more general remark: At this point in the analysis, a pragmatic element has entered the picture. The ‘glue’ I have been talking about so far is genuinely semantic and seems fully determined (as far as I can see) given the requirement of interpretability. But scales (following the insights represented by Krifka’s work) require reference to context and include a pragmatic element in the shape of the cover. In addition to the maximality/informativity operators themselves, we need the contextually relevant part/whole structure of the scale to interpret a particular example. Properties of the cover become relevant in particular in the presence of differentials, and speakers may be led to make extra assumptions (EQ). The fuzzy nature of the data, in my opinion, speaks in favour of the idea that some kind of pragmatic glue is required to make things work out. Depending on the context, speakers may or may not have an easy time figuring out what the necessary glue is. That said, a remaining caveat is a more thorough empirical understanding of the data with differentials.
4 Summary and conclusions
4.1 Summary
Building on work primarily by Schwarzschild & Wilkinson and Heim, I propose an analysis of quantifiers in than-clauses in which the quantifier is interpreted inside the than-clause. A shift from degrees to intervals of degrees makes this possible. Despite appearances, there is no scope interaction between quantifier and shifter or quantifier and comparison operator. Instead, there is uniformly selection of a point from the subordinate clause interval. The analysis takes from Schwarzschild & Wilkinson the step to intervals. It shares with Heim that comparison is ultimately reduced to comparison of points. Intervals are not directly compared. In contrast to Heim and the subsequent NOT-theory, apparent scope effects like the interpretation of have to–type modals and exactly n NPs have been explained away via recourse to alternative interpretational mechanisms, which have been argued for independently of than-clauses (in these two examples: exhaustification and an alternative semantics for exactly-numerals). My strategy is motivated by the lack of clear scope interaction in than-clauses. One feature of the proposal is that the semantics of the comparative operator is very simple.
It is the same semantics that one needs for data like (172a), namely one in which the first argument of the comparative operator is a degree, (172c). Maximality is still used in clausal comparatives like the ones we have discussed, but it is independent of the comparative operator.
(172) a. John is taller than 1.70 m.
      b. [[-er [than 1.70 m]] [2 [John is t2 tall]]]
      c. ⟦-er⟧ = λd1. λd2. d2 > d1
It is in this sense that the analysis developed here is in my opinion ‘simpler’ than Schwarzschild & Wilkinson’s. The complexity that is no doubt there in the present analysis consists in the assumption that general interpretive strategies like informativity and maximality are involved (plus in independent complications like the availability of specific readings for indefinites and the like). Also, the semantics is no longer completely determined by compositional semantics. Data with differentials could only be analysed by enriching the classical semantics with pragmatic notions (covers, contextual background). However, this aspect of the proposal is supported by contextual variability of the judgements and thus has to be part of a successful analysis.
In order to ultimately evaluate the success of my proposals, the whole approach needs to also be extended to adverbials. I will not attempt to do so now. Other considerations concern a more detailed analysis of the various modals (including might) and an investigation of the interaction of several scope bearing elements inside a than-clause. I give some representative data below and acknowledge the need for further work on the subject (compare Schwarzschild & Wilkinson 2002, Heim 2006b, Schwarzschild 2008). Finally, I admit that I have no analysis for Sauerland’s (2008) example (174), for which he provides a solution in terms of Heim’s theory.
(173) a. It is hotter here today than it often is in New Brunswick.
      b. It is hotter today than it might be tomorrow.
      c. Sveta solved this problem faster than someone else could have.
(174) Ekaterina is an odd number of centimeters taller than each of her teammates.
These issues are left for future work.
4.2 Where do the intervals come from?
There is one important theoretical question left for the intervals-plus-selection analysis to answer: where do the intervals come from? In Section 3 I made the assumption that basic adjective meanings already contained intervals:
(175) ⟦tall⟧ = [λD. λx. Height(x) ∈ D]
I could alternatively have assumed that the operator Pi from Heim 2006b shifts the standard adjective meaning to (175).
(176) Pi shifts from degrees to intervals:
      [1 [AP [Pi t1] [3 [AP t2 is t3 tall]]]]
(177) a. ⟦tall⟧ = [λd. λx. Height(x) ≥ d]
      b. ⟦Pi⟧ = [λD. λP. max(P) ∈ D]
      c. [λD. Pi(D)(tall(x))] = [λD. Height(x) ∈ D]
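The reduction in (177c) can be verified on a finite scale. The sketch below is an illustration with assumptions of its own: degrees are integers from 0 to 300 (a hypothetical scale in cm), an individual is represented by its height, and tall has the monotone ‘at least’ meaning, on which x is d-tall iff x's height reaches d. Only on that meaning is max(P) defined and equal to Height(x).

```python
DEGREES = range(0, 301)  # hypothetical finite scale in cm

def tall(d, x_height):
    """Monotone degree meaning: x is d-tall iff Height(x) >= d."""
    return x_height >= d

def pi(D, P):
    """Pi = λD. λP. max(P) ∈ D, for P a predicate of degrees and D a set of degrees."""
    satisfied = [d for d in DEGREES if P(d)]
    return max(satisfied) in D

def tall_interval(D, x_height):
    """λD. Pi(D)(tall(x)): should reduce to Height(x) ∈ D."""
    return pi(D, lambda d: tall(d, x_height))

D = range(170, 181)
print(tall_interval(D, 175), tall_interval(D, 185))
```

For every height on this scale the shifted meaning agrees with plain membership of the height in the interval, as (177c) states.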
Since Pi on the analysis pursued here always takes scope immediately next to the adjective, this would have served no particular purpose and I simplified to (175). But a problem for assuming (175) as the basic meaning of a gradable adjective is that it is very weak. This creates problems for example for the negation theory of antonymy (compare e.g. Heim 2006a). (178a) analyses the negative polar adjective short as the negation of tall. I fail to be able to imagine how a parallel strategy for the interval based meaning (178b) could be successful.
(178) a. ⟦short⟧ = [λd. λx. ¬Height(x) ≥ d] = [λd. λx. Height(x) < d]
      b. ⟦short⟧ = λD. λx. Height(x) ∉ D
So if the intervals do not come into the semantics via a motivated, independent (since mobile) operator Pi, and are not plausibly basic either, how do they come in? It would be attractive to say that intervals enter the semantics because, that is, if and only if, they are needed. That is what I would like to think, and (175) really was a simplification for the sake of uniformity that I think of as preliminary. An idea for how to bring intervals into the semantics when needed, due to Heim (2009), is given below. We begin by observing that a relation can be expressed between a plurality and a part of a scale, a degree ‘blob’.

(179) a. (You have to be 5′ tall to enter.) Our children are that tall.
      b. (Bill’s GPA is 3.75.) Sam’s grades are that good, too.
We see a parallel to expressing a relation between a plurality and a mass noun. The example (180a) can be represented as in (180b) with the meaning in (180c) in mind for the relation between the two objects of drink: a cumulative interpretation (see e.g. Beck & Sauerland 2000 and the earlier work cited there).

(180) a. Our children drank the milk.
      b. ∗∗drank(M)(C)
      c. ∀x ≤ C : ∃y ≤ M : drank(y)(x) & ∀y ≤ M : ∃x ≤ C : drank(y)(x)
         All children participated in drinking the milk, and all parts of the milk were drunk by one of the children.
Transferring the analysis to our degree example yields (181).

(181) a. Our children are that tall.
      b. ∗∗tall(D)(C)
      c. ∀x ≤ C : ∃d ≤ D : tall(d)(x) & ∀d ≤ D : ∃x ≤ C : tall(d)(x)
         All the children’s heights fall into D, and all parts of D contain the height of a child.
It is easy to apply the same analysis to a than-clause containing a definite plural, and it yields the set of intervals that we need according to the analysis in Section 3. Comparison will be with the maximum point in that set and the sentence is predicted to mean that our children are shorter than John.

(182) a. (John is taller) than our children are.
      b. λD. ∗∗tall(D)(C)
      c. λD. ∀x ≤ C : ∃d ≤ D : tall(d)(x) & ∀d ≤ D : ∃x ≤ C : tall(d)(x)
         intervals that contain the heights of all our children (and nothing else)
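The cumulative truth conditions in (182c) can be made concrete with a small computation. The following Python sketch is a toy model under loudly stated assumptions: the children, their heights, the 1 cm scale from 150 to 160 and John’s height are all invented for illustration, and the atomic relation is taken to hold between a child and the single unit degree that is the child’s height (one possible choice of ‘relevant parts’ of D). It enumerates the degree pluralities D that satisfy the two conjuncts and then selects the maximum point for comparison.

```python
from itertools import chain, combinations

HEIGHT = {"Ann": 152, "Ben": 158, "Cleo": 155}  # hypothetical children (cm)
DEGREES = range(150, 161)                        # hypothetical scale, 1 cm units


def tall(d, x):
    """Atomic relation (an assumption): unit degree d is x's height."""
    return HEIGHT[x] == d


def cumulate(rel, D, C):
    """** operator: every x in C is related to some d in D, and
    every d in D is related to some x in C."""
    return (all(any(rel(d, x) for d in D) for x in C)
            and all(any(rel(d, x) for x in C) for d in D))


def nonempty_subsets(items):
    """All non-empty sub-pluralities of a finite collection."""
    items = list(items)
    return chain.from_iterable(combinations(items, n)
                               for n in range(1, len(items) + 1))


children = set(HEIGHT)

# The than-clause meaning: all D such that **tall(D)(C) holds.
than_clause = [set(D) for D in nonempty_subsets(DEGREES)
               if cumulate(tall, D, children)]

# Selection: compare with the maximum point found in that set.
selected = max(max(D) for D in than_clause)

john = 160  # hypothetical height of John
john_is_taller = john > selected
print(than_clause, selected, john_is_taller)
```

In this toy model exactly one degree plurality survives, the one made up of the three children’s heights ‘and nothing else’, which is the sense in which the second conjunct keeps the than-clause meaning small.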
Note that the notion of degree ‘blobs’ that have a part/whole structure is anticipated by the reference to covers in Section 3. A cover provides us with the relevant parts of the degree scale. We are consistently assuming a mass-like structure of the degree scale. To make the connection clear, (182′) provides a more complete formalisation of (182) which includes covers (compare Beck 2001 for this kind of use of covers).

(182′) a. λD. [∗∗λd. λx. d ∈ Cov & x ∈ Cov & tall(d)(x)](D)(C)
       b. λD. ∀x[x ≤ C & x ∈ Cov → ∃d[d ≤ D & d ∈ Cov & tall(d)(x)]] &
              ∀d[d ≤ D & d ∈ Cov → ∃x[x ≤ C & x ∈ Cov & tall(d)(x)]]
          (suppose that the relevant parts of ‘the children’ are the individual children, and that the relevant parts of the cover are the units according to granularity)
Example (182)/(182′) derives a set of intervals, as pluralities of degrees, as the meaning of a than-clause via plural predication. What would we need to do in order for this idea to apply to the range of data examined in this paper? I briefly discuss three issues for which this change in perspective is relevant: (i) universal quantifiers, (ii) singular quantifiers, and (iii) maximal informativity.

First, regarding universal quantifiers: the introduction of intervals analogously to (182) would have to happen with universal quantifiers of various kinds, in particular universal nominals and intensional verbs (cf. our two representative examples every girl and predict). Regarding intensional verbs, there is a proposal by Bošković & Gajewski (2008) that instead of universal quantification over worlds (183a) they (or at least some of them) involve sum formation (183b).

(183) a. ⟦believe_x⟧ = λp. ∀w[w ∈ BEL_x → p(w)]
      b. ⟦believe_x⟧ = max(λW. W ∈ ∗BEL_x)
This makes possible the following analysis of a than-clause with an intensional verb (in the simpler version without covers):

(184) a. (John is taller) than you believe.
      b. λD. [∗∗λw. λd. John is d-tall in w](max(λW. W ∈ ∗BEL_you))(D)
      c. λD. ∀w ≤ max(λW. W ∈ ∗BEL_you) : ∃d ≤ D : tall(w)(d)(John) &
             ∀d ≤ D : ∃w ≤ max(λW. W ∈ ∗BEL_you) : tall(w)(d)(John)
         intervals that contain John’s height in all your belief worlds (and nothing else)
Nominal universal quantifiers, it has been observed, can sometimes be used to introduce a plurality, although this is not always easily possible. Perhaps (185) involves a reinterpretation as a plural definite NP. The same reinterpretation would be responsible for the interpretation of the than-clause in (186) in case the girls are of varying heights. This might make sense of my above-mentioned intuition that a definite plural is more acceptable than a universal NP.

(185) a. Everyone gathered in the hallway.
      b. ?Every student gathered in the hallway.

(186) a. John is taller than every girl is.
      b. ‘every girl’ → G (the plurality of girls)
      c. λD. ∗∗tall(D)(G)
      d. λD. [∀x ≤ G : ∃d ≤ D : tall(d)(x)] & [∀d ≤ D : ∃x ≤ G : tall(d)(x)]
         intervals that contain the heights of all the girls (and nothing else)
Thus it can be argued that a plural analysis of intervals can capture these data.¹⁵ The discussion from Section 3 is (almost, see below) unchanged; what changes is what happens below the level of AP, so to speak (the predication ‘x is d-tall’): what we assumed to be basic in (175) is now compositionally derived via pluralization mechanisms.

Next, let’s reconsider data with singular quantificational elements:

(187) a. Mary is taller than anyone else is.
      b. *John is taller than no girl is.

(187′) a. John is taller than some girls are.
       b. λD. ∃X : [∀x ≤ X : ∃d ≤ D : tall(d)(x)] & [∀d ≤ D : ∃x ≤ X : tall(d)(x)]
       c. λD. [∀x ≤ f(∗girl) : ∃d ≤ D : tall(d)(x)] & [∀d ≤ D : ∃x ≤ f(∗girl) : tall(d)(x)]
There would be no reason to introduce intervals in the data with singular indefinites and negative quantifiers. Remember from Section 3.1 that in these cases, we got rid of the intervals immediately anyway (maximal informativity reduced the contribution of the than-clause to the set of individual heights). Now, we could just revert to the classical analysis for those data. This is not an unwelcome result, since the classical analysis offers a successful solution for them. Pluralization as the trigger for the introduction of intervals will continue to play a role for plural indefinites (see example (187′)); the discussion in Section 3.3 is thus also in important respects unchanged.

Finally, we need to think once more about the role of maximal informativity. Plural semantics keeps intervals small. The truth conditions of cumulation are such that the pluralised relation holds between the plurality and the smallest interval that covers all the individuals in the plurality (cf. the second conjunct in (181c) and the following analyses). This may make m_inf unnecessary, leaving us with iterated maximality. Again this can be seen as a welcome result.

The attraction of this approach is, as said above, that intervals enter the picture only when there is a real need for them. The idea is entirely compatible with the selection analysis and in my view very desirable. Why did I not set out in this fashion in Section 3? I am not quite confident enough of the story in (185), (186), and too many details remain to be worked out, plus the data need to be examined more carefully. As things stand, readers sceptical of the ideas sketched in this subsection may take Section 3 as it is, while others have the beginnings of an analysis of how and why intervals come into play at all.

¹⁵ I am not sure at this point what to say about the have to–type modals. Perhaps (as non-neg-raising verbs) they do not have a plural analysis. We then revert to the classical analysis. If they do have a plural semantics, the story in Section 3.1 is maintained. The first version relates the behavior of a modal to neg-raising, the second to SMC use.

4.3 Outlook
Let’s take a step back and think about what an analysis of quantifiers in than-clauses in terms of selection achieves, beyond the empirical coverage of the mostly well-known set of data that I have been concerned with above. Compared to its theoretical competitors, it primarily removes quantifiers in than-clauses from the realm of scope interaction phenomena. For example, the interpretive behaviour of quantifiers in than-clauses cannot be seen as an instance of the Heim/Kennedy generalization (Kennedy 1997; Heim 2001). The analysis I’ve given in Section 3 violates this generalization, as (189) shows.

(188) Heim/Kennedy generalization:
      *[DegP . . . [QP [. . . tDegP . . . ] . . . ]]

(189) a. than [1 [every girl is t1 tall]]
      b. λD. for every girl x : Height(x) ∈ D
The Heim/Kennedy generalization is motivated in particular by quantifiers in the matrix clause of comparatives. Suppose that the behaviour of quantifiers in the matrix clause relative to degree operators is regulated by a scope constraint deriving the Heim/Kennedy generalization. Then there would be no theoretical connection between this and than-clause quantifiers. We would accordingly expect empirical differences between quantifiers in main clause vs. than-clause. On the other hand, if one were to extend the requirement of finding a definite degree from the than-clause to the main clause (a good way of ensuring applicability of the lexical entry in (172c), note), a parallel analysis could still be pursued. (See once more Heim 2009 for a sketch of such an analysis.) There are some striking similarities between main clause and than-clause quantifiers that motivate such a step, in particular (190), (191) below: both sentences in (190) have an interpretation that talks about the minimum requirement length of the paper, and neither sentence in (191) does.

(190) a. The paper is longer than it is required to be.
      b. The paper is required to be less long than that.

(191) a. The paper is longer than it is supposed to be.
      b. The paper is supposed to be less long than that.
But there are also apparent mismatches:

(192) a. Anderswo ist es weniger schön als hier.
         elsewhere is it less nice than here
         ‘It is less nice elsewhere than it is here.’
      b. ??The most beautiful other place is less nice than it is here.

(193) a. Hier ist es schöner als anderswo.
         here is it nicer than elsewhere
         ‘It is nicer here than it is elsewhere.’
      b. ok: It is nicer here than it is in the most beautiful other place.

(194) a. Sam war schneller als jemand anderes.
         Sam was faster than someone other
         ‘Sam was faster than another person.’
      b. ok: Sam was faster than the fastest other person.

(195) a. Jemand anderes war weniger schnell als Sam.
         someone other was less fast than Sam
         ‘Another person was less fast than Sam.’
      b. ??The fastest other person was less fast than Sam.
At this point, I do acknowledge interesting empirical parallels, but I am also worried about apparent differences. I would not wish to be committed at present to claiming that quantifiers in the main clause behave in the same way as quantifiers in the than-clause, or that they don’t, and will remain neutral as to whether the analysis developed here should be extended to cover matrix clause quantifiers as well. Instead of making a connection to scope interaction phenomena, the present analysis is based on a plural/mass-semantics related vagueness plus semantic and pragmatic glue. It makes the interpretation of quantifiers in than-clauses more of a coercion-like phenomenon. Perhaps the variable and partly messy nature of the data can motivate the nature of the analysis.
References

Beck, Sigrid. 2001. Reciprocals are definites. Natural Language Semantics 9(1). 69–138. doi:10.1023/A:1012203407127.
Beck, Sigrid. 2009. Comparatives and superlatives. To appear in Klaus von Heusinger, Claudia Maienborn & Paul Portner (eds.), Handbook of semantics: An international handbook of natural language meaning. Berlin: Mouton de Gruyter.
Beck, Sigrid & Hotze Rullmann. 1999. A flexible approach to exhaustivity in questions. Natural Language Semantics 7(3). 249–298. doi:10.1023/A:1008373224343.
Beck, Sigrid & Uli Sauerland. 2000. Cumulation is needed: A reply to Winter 2000. Natural Language Semantics 8(4). 349–371. doi:10.1023/A:1011240827230.
Bošković, Željko & Jon Gajewski. 2008. Semantic correlates of the NP/DP parameter. Proceedings of the North East Linguistics Society 39. URL http://gajewski.uconn.edu/papers/NELS39paper.pdf.
Cresswell, Max J. 1977. The semantics of degree. In Barbara H. Partee (ed.), Montague grammar, 261–292. Academic Press.
Dalrymple, Mary, Makoto Kanazawa, Yookyung Kim, Sam Mchombo & Stanley Peters. 1998. Reciprocal expression and the concept of reciprocity. Linguistics and Philosophy 21(2). 159–210. doi:10.1023/A:1005330227480.
Endriss, Cornelia. 2009. Quantificational topics: A scopal treatment of exceptional wide scope phenomena (Studies in Linguistics and Philosophy (SLAP) 86). Springer. doi:10.1007/978-90-481-2303-2.
von Fintel, Kai & Sabine Iatridou. 2005. What to do if you want to go to Harlem: Anankastic conditionals and related matters. URL http://mit.edu/fintel/fintel-iatridou-2005-harlem.pdf. Ms, MIT.
Fox, Danny. 2007. Free choice and the theory of scalar implicatures. In Uli Sauerland & Penka Stateva (eds.), Presupposition and implicature in compositional semantics, 71–120. New York: Palgrave Macmillan.
Fox, Danny & Martin Hackl. 2006. The universal density of measurement. Linguistics and Philosophy 29(5). 537–586. doi:10.1007/s10988-006-9004-4.
Gajewski, Jon. 2008. More on quantifiers in comparative clauses. Proceedings of Semantics and Linguistic Theory 18. doi:1813/13043.
Hackl, Martin. 2001a. Comparative quantifiers. Ph.D. thesis, Massachusetts Institute of Technology. URL http://hdl.handle.net/1721.1/8765.
Hackl, Martin. 2001b. A comparative syntax for comparative quantifiers. Proceedings of the North East Linguistics Society 31.
Hackl, Martin. 2009. On the grammar and processing of proportional quantifiers: most versus more than half. Natural Language Semantics 17(1). 63–98. doi:10.1007/s11050-008-9039-x.
Heim, Irene. 1982. The semantics of definite and indefinite noun phrases. Ph.D. thesis, University of Massachusetts at Amherst. URL http://semanticsarchive.net/Archive/Tk0ZmYyY.
Heim, Irene. 1994. Interrogative semantics and Karttunen’s semantics for know. In Rhonna Buchalla & Anita Mittwoch (eds.), The proceedings of the conference of the Israel Association for Theoretical Linguistics (IATL 1), 128–144. Hebrew University of Jerusalem. URL http://semanticsarchive.net/Archive/jUzYjk1O.
Heim, Irene. 2001. Degree operators and scope. In Caroline Féry & Wolfgang Sternefeld (eds.), Audiatur vox sapientiae: A festschrift for Arnim von Stechow, 214–239. Berlin: Akademie Verlag.
Heim, Irene. 2006a. Little. Proceedings of Semantics and Linguistic Theory 16. doi:1813/7579.
Heim, Irene. 2006b. Remarks on comparative clauses as generalized quantifiers. URL http://semanticsarchive.net/Archive/mJiMDBlN. Ms, MIT.
Heim, Irene. 2009. A unified account? Handout for ‘Topics in Semantics’, MIT.
Heim, Irene & Angelika Kratzer. 1998. Semantics in generative grammar. Oxford: Blackwell.
Hellan, Lars. 1981. Towards an integrated analysis of comparatives (Ergebnisse und Methoden moderner Sprachwissenschaft 11). Tübingen: Narr.
Hoeksema, Jack. 1983. Negative polarity and the comparative. Natural Language and Linguistic Theory 1(3). 403–434. doi:10.1007/BF00142472.
Jacobson, Pauline. 1995. On the quantificational force of English free relatives. In Emmon Bach, Eloise Jelinek, Angelika Kratzer & Barbara H. Partee (eds.), Quantification in natural languages (Studies in Linguistics and Philosophy (SLAP) 54), 451–486. Dordrecht: Kluwer.
Kennedy, Chris. 1997. Projecting the adjective: The syntax and semantics of gradability and comparison. Ph.D. thesis, University of California, Santa Cruz.
Klein, Ewan. 1991. Comparatives. In von Stechow & Wunderlich (1991), chap. 32, 673–691.
Krasikova, Sveta. 2008. Quantifiers in comparatives. Proceedings of Sinn und Bedeutung 12. 337–352. URL http://www.hf.uio.no/ilos/forskning/konferanser/SuB12/proceedings/krasikova_337-352.pdf.
Krasikova, Sveta & Ventsislav Zhechev. 2006. You only need a scalar only. Proceedings of Sinn und Bedeutung 10. URL http://www.sfb441.uni-tuebingen.de/b10/Pubs/KrasikovaZhechev_SuB05.pdf.
Kratzer, Angelika. 1991. Modality. In von Stechow & Wunderlich (1991), 639–650.
Kratzer, Angelika. 1998. Scope or pseudoscope? Are there wide-scope indefinites? In Susan Rothstein (ed.), Events and grammar. Dordrecht: Kluwer.
Krifka, Manfred. 1999. At least some determiners aren’t determiners. In Ken Turner (ed.), The semantics/pragmatics interface from different points of view (Current Research in the Semantics/Pragmatics Interface 1), 257–291. Elsevier.
Krifka, Manfred. 2007. Approximate interpretation of number words: A case for strategic communication. In Gerlof Bouma, Irene Maria Krämer & Joost Zwarts (eds.), Cognitive foundations of interpretation (Verhandelingen der Koninklijke Nederlandse Akademie van Wetenschappen, Afd. Letterkunde 190), 111–126. Amsterdam: Royal Netherlands Academy of Arts and Sciences.
Larson, Richard K. 1988. Scope and comparatives. Linguistics and Philosophy 11(1). 1–26. doi:10.1007/BF00635755.
Link, Godehard. 1983. The logical analysis of plurals and mass terms: A lattice-theoretical approach. In Rainer Bäuerle, Christoph Schwarze & Arnim von Stechow (eds.), Meaning, use, and interpretation of language, Grundlagen der Kommunikation und Kognition, 302–323. de Gruyter.
May, Robert. 1985. Logical form: Its structure and derivation (Linguistic Inquiry Monographs 12). Cambridge, MA: MIT Press.
Meier, Cécile. 2002. Maximality and minimality in comparatives. Sinn und Bedeutung 6. 275–287. URL http://www.phil-fak.uni-duesseldorf.de/asw/gfs/common/procSuB6/pdf/articles/MeierSuB6.pdf.
Partee, Barbara H. 1984. Compositionality. In Fred Landman & Frank Veltman (eds.), Varieties of formal semantics (Groningen-Amsterdam Studies in Semantics (GRASS) 3), 281–311. Dordrecht: Foris.
Reinhart, Tanya. 1992. Wh-in-situ: An apparent paradox. Proceedings of the Amsterdam Colloquium 8. 483–492.
van Rooij, Robert. 2008. Comparatives and quantifiers. Empirical Issues in Syntax and Semantics 7. 423–444. URL http://www.cssp.cnrs.fr/eiss7/van-rooij-eiss7.pdf.
Rullmann, Hotze. 1995. Maximality in the semantics of wh-constructions. Ph.D. thesis, University of Massachusetts at Amherst. URL http://scholarworks.umass.edu/dissertations/AAI9524743/.
Sauerland, Uli. 2008. Intervals have holes: A note on comparatives with differentials. Ms, ZAS Berlin.
Schwarzschild, Roger. 1996. Pluralities (Studies in Linguistics and Philosophy (SLAP) 61). Kluwer.
Schwarzschild, Roger. 2004. Scope splitting in the comparative. URL http://www.rci.rutgers.edu/~tapuz/MIT04.pdf. Handout from a colloquium talk at MIT.
Schwarzschild, Roger. 2008. The semantics of comparatives and other degree constructions. Language and Linguistics Compass 2(2). 308–331. doi:10.1111/j.1749-818X.2007.00049.x.
Schwarzschild, Roger & Karina Wilkinson. 2002. Quantifiers in comparatives: A semantics of degree based on intervals. Natural Language Semantics 10(1). 1–41. doi:10.1023/A:1015545424775.
Seuren, Pieter A.M. 1978. The structure and selection of positive and negative gradable adjectives. In Donka Farkas, Wesley M. Jacobsen & Karol W. Todrys (eds.), Papers from the Parasession on the Lexicon, Chicago Linguistic Society, April 14–15, 1978 (CLS 14), 336–346.
von Stechow, Arnim. 1984. Comparing semantic theories of comparison. Journal of Semantics 3(1-2). 1–77. doi:10.1093/jos/3.1-2.1.
von Stechow, Arnim. 1995. Lexical decomposition in syntax. In Urs Egli, Peter E. Pause, Christoph Schwarze, Arnim von Stechow & Götz Wienold (eds.), Lexical knowledge in the organization of language (Current Issues in Linguistic Theory 114), 81–118. John Benjamins.
von Stechow, Arnim & Dieter Wunderlich (eds.). 1991. Semantics: An international handbook of contemporary research. Berlin: de Gruyter.
Prof. Dr. Sigrid Beck
Chair of Descriptive and Theoretical Linguistics
Englisches Seminar
Universität Tübingen
Wilhelmstr. 50
72074 Tübingen
Germany
[email protected]
Semantics & Pragmatics Volume 3, Article 3: 1–41, 2010 doi: 10.3765/sp.3.3
Two kinds of modified numerals∗ Rick Nouwen Utrecht University Received 2009-03-27 / First Decision 2009-07-19 / Revised 2009-08-18 / Second Decision 2009-09-08 / Revised 2009-09-29 / Accepted 2009-10-14 / Final Version Received 2009-10-15 / Published 2010-01-26
Abstract In this article, I show that there are two kinds of numeral modifiers: (Class A) those that express the comparison of a certain cardinality with the value expressed by the numeral and (Class B) those that express a bound on a degree property. The goal is, first of all, to provide empirical evidence for this claim and second to account for these data within a framework that treats modified numerals as degree quantifiers.
Keywords: modified numerals, scalar quantification, modality
1 Introduction

Modified numerals are most commonly exemplified by combinations of a numeral and a comparative, as in more than 100. Following Hackl (2001), I will refer to such expressions as comparative quantifiers. As (1) shows, however, apart from modification by a comparative, numerals combine with a striking diversity of expressions.

(1) more/fewer/less than 100                comparative quantifiers
    no more than 100, many more than 100    differential quantifiers
    at least/most 100                       superlative quantifiers
    100 or more/fewer/less                  disjunctive quantifiers
    under/over 100, between 100 and 200     locative quantifiers
    from/up to 100, from 100 to 200         directional quantifiers
    minimally/maximally 100, 100 tops       other

∗ I would like to thank two anonymous reviewers for their helpful comments. Many thanks, moreover, to S&P editors Kai von Fintel and, especially, David Beaver, for their painstaking efforts to point out ways in which to improve the article. A concise presentation of the main points of this article appeared under the same title in the proceedings of the thirteenth Sinn und Bedeutung conference (Nouwen 2009). Earlier ideas on this subject were presented at Semantics and Linguistic Theory 13 in Amherst (2008) and the Journées Sémantique et Modélisation in Toulouse (2008). I am grateful to the audiences of these events for useful discussion. Special thanks to Min Que and Luisa Meroni for some help with data. This work was supported by a grant from the Netherlands Organisation for Scientific Research NWO, which I hereby gratefully acknowledge.

©2010 R.W.F. Nouwen. This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).
For a long time, there seemed to be agreement in the formal semantic literature that there was little to be gained from a thorough investigation of these expressions. An especially dominant view, originating from generalised quantifier theory (Barwise & Cooper 1981), was that there was not much more to the semantics of such quantifiers than the expression of the numerical relations >, <, ≤ and ≥. In the past decade, however, several studies have shown that this is an overly simplistic assumption. Examples are Hackl 2001, Krifka 1999 and Takahashi 2006 on comparative quantifiers, Nouwen 2008b on negative comparative quantifiers, Solt 2007 on differential quantifiers, Geurts & Nouwen 2007, Umbach 2006, Corblin 2007, Büring 2008 and Krifka 2007b on superlative quantifiers, Corver & Zwarts 2006 on locative quantifiers and Nouwen 2008a on directional quantifiers.¹ Such investigations usually concern the specific quirks of a certain type of modified numeral.

While I believe that it is important to have a semantic analysis of modified numerals on a case by case basis, I also believe that what is lacking from the literature so far is a view of the extent to which the various modified numerals in (1) involve the same semantic structures. In this paper, I will attempt to reach a generalisation along these lines by claiming that there are two kinds of modified numerals: (A) those that relate the numeral to some specific cardinality and (B) those that place a bound on the cardinality of some property. The difference will be made clear below. The main examples of (A) are comparative quantifiers like more/fewer than 100. Most other kinds of modified numerals fall in the second class.

I will start by making clear what distinguishes the two classes of modified numerals by presenting a body of data that sets them apart. Then, in section 3, I introduce a well-founded decompositional treatment of comparative quantifiers, proposed by Hackl (2001), which I take to represent the proper treatment of class A modifiers. In section 4, I propose that class B modifiers are operators that indicate maxima/minima. I will then account for the distribution of these quantifiers by arguing that they are often blocked by unmodified numerals, which are capable of expressing equivalent meanings. Section 5 discusses a particular problem that occurs with the interaction of B-type quantifiers with modal operators. In section 6, I provide some more details on the empirical basis for the A/B distinction. Section 7 concludes.

¹ See also Nouwen 2010b for an overview.

2 Class A and class B modified numerals
It is a striking feature of comparative quantifiers that they can be used to assert extremely weak propositions. For instance, (2) is acceptable, even though it expresses a rather under-informative truth.

(2) A hexagon has fewer than 11 sides.            A

This example contrasts strongly with the examples in (3), which are all unacceptable. (Or, alternatively, one might have the intuition that they are false.)

(3) a. #A hexagon has at most 10 sides.           B
    b. #A hexagon has maximally 10 sides.         B
    c. #A hexagon has up to 10 sides.             B
Why is this so? A naive theory might have it that (2) states that the number of sides in a hexagon is strictly smaller than 11 (i.e. < 11), and that the only difference with (3) is that, there, it is stated that this number is smaller than or equal to 10 (i.e. ≤ 10). Clearly, 6 is both < 11 and ≤ 10. So why are not both kinds of examples under-informative but true? On the naive view, having at most 10 sides is expected to be equivalent to having fewer than 11 sides. That is, both these properties pick out objects with n ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10} sides. Semantically, no contrast is to be expected. Given this semantic equivalence, a pragmatic explanation of the contrast between (2) and (3) seems equally unlikely.²

Let us call quantifiers that are acceptable in such examples class A quantifiers and those that are like (3) class B quantifiers. As the contrast between (4) and (5) shows, the distinction is also visible with lower bound quantifiers. That is, (4) is under-informative, yet true and acceptable, while the examples in (5) are unacceptable/false.

(4) A hexagon has more than 3 sides.                      A
(5) #A hexagon has {at least / minimally} 3 sides.        B

² A reviewer wondered whether the naive view could not be maintained if we assume that there is a pragmatic effect associated with the fact that ≤ n includes the possibility of n while < n excludes it. It is very much unclear what kind of effect that would be, however. One could, for instance, base a pragmatic inference on the fact that, in (3a), the speaker seems to signal the possibility that a hexagon has 10 sides by using at most 10. However, one could equally argue that the same signal is given by the speaker of (2), simply by using fewer than 11 instead of fewer than 10.
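The naive view discussed above can be spelled out in a few lines. The Python sketch below is purely illustrative and assumes what the naive theory assumes, namely that the modifiers denote nothing more than the relations <, ≤, > and ≥ over natural numbers; the finite domain 0..100 and all names are hypothetical. It confirms that, on this view, fewer than 11 and at most 10 have the same extension, and that all of (2) to (5) come out true of a hexagon, so the naive view cannot by itself distinguish class A from class B.

```python
SIDE_COUNTS = range(0, 101)  # hypothetical finite domain of side counts


def fewer_than(n):
    """Naive meaning: k < n."""
    return lambda k: k < n


def at_most(n):
    """Naive meaning: k <= n."""
    return lambda k: k <= n


def more_than(n):
    """Naive meaning: k > n."""
    return lambda k: k > n


def at_least(n):
    """Naive meaning: k >= n."""
    return lambda k: k >= n


def extension(pred):
    """All side counts in the domain that satisfy the predicate."""
    return {k for k in SIDE_COUNTS if pred(k)}


HEXAGON_SIDES = 6

# 'fewer than 11' and 'at most 10' pick out the same side counts.
same_extension = extension(fewer_than(11)) == extension(at_most(10))

# Naive truth values for the hexagon sentences.
naive_truth = {
    "(2) fewer than 11": fewer_than(11)(HEXAGON_SIDES),
    "(3a) at most 10": at_most(10)(HEXAGON_SIDES),
    "(4) more than 3": more_than(3)(HEXAGON_SIDES),
    "(5) at least 3": at_least(3)(HEXAGON_SIDES),
}
print(same_extension, naive_truth)
```

Every entry comes out true, which is exactly the prediction that the acceptability judgements contradict for the class B cases.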
What I think is the underlying problem of examples involving class B expressions is that such quantifiers are incapable of expressing relations to definite amounts. Class A expressions, on the other hand, excel at doing so. Imagine, for instance, that we are talking about my new laptop and that we are concerned with how much internal memory it has. Say that it has 1GB of memory (and that I know that it has that much memory). I can then assert (6) in a context where you, for instance, just told me that your laptop has 2GB of memory.

(6) My laptop has less than 2GB of memory.

Or, if your computer has a mere 512MB of memory, I can boast that:

(7) My laptop has more than 512MB of memory.
In these examples, I am comparing the definite amount of 1GB, i.e. the precise amount of memory I know my laptop has, to some given contrasting amount 2GB (512MB) by means of less than (more than). This is something class A quantifiers can do very well, but something that is unavailable for class B modified numerals:

(8) I know exactly how much memory my laptop has. . .
    a. . . . and it is {#at most / #maximally / #up to} 2GB.
    b. . . . and it is {#at least / #minimally} 512MB.
In contrast to (8), class B quantifiers are acceptable when what is ‘under discussion’ is not a definite amount, but rather a range of amounts, as in (9).

(9) a. Computers of this kind have {at most / maximally / up to} 2GB of memory.
    b. Computers of this kind have {at least / minimally} 512MB of memory.
In other words, it appears that class B quantifiers relate to ranges of values, rather than to a single specific cardinality.³ This intuition is supported by (10).

(10) Jasper invited maximally 50 people to his party.

We normally interpret (10) to indicate that the speaker does not know how many people Jasper invited. That is, it is unacceptable for a speaker to utter (10) if s/he has a definite amount in mind, which is why the addition of 43, to be precise in (11) is infelicitous.⁴

(11) Jasper invited maximally 50 people to his party. #43, to be precise.
By assuming that the speaker does not know the exact amount, (10) is interpreted as being about the range of values possible from the speaker’s perspective. The speaker thus states that there is a bound on that range. The same intuition occurs if we substitute maximally 50 by any other class B quantifier.

In sum, I have shown that the landscape of modified numerals can be divided into two separate classes of expressions. What distinguishes class B quantifiers from other modified numerals is that they are incompatible with definite amounts and are always interpreted with respect to a range of values. Below, I will present a semantics of class B expressions that makes this intuition precise. Before I can do so, however, I will need to discuss the semantics of A-type numeral modifiers.

³ In his comments on this article, David Beaver pointed out examples like (i), where the number appears to be a variable quantified over.

   (i) There were maximally 50 people there at any one time.

   Although I will not attempt a compositional analysis of cases like (i), such examples do appear to support the main intuition that class B quantifiers express relations between amounts and ranges. An example like (i) states that 50 is the maximum of the range formed by the different number of people present at different times. This is different from (ii), which states that at any time the number of people present did not exceed 50. (This is true, for instance, in case from start to finish there were always 20 people present.) So while (i) expresses a maximum on a range of values created by quantification, (ii) quantifies over different times and compares the number of people present at that time with 50.

   (ii) There were fewer than 50 people there at any one time.

⁴ Compare this to (i), which forms a minimal pair with (11).

   (i) Jasper invited fewer than 50 people to his party. 43, to be precise.

3 Hackl’s semantics for comparative modifiers
In this section, I discuss the semantics for comparative modified numerals as developed in Hackl 2001. I will assume that this represents the proper treatment of class A numeral modifiers. I also extend the framework slightly by adding a way to account for the ambiguity of non-modified numerals.
3.1 Class A modifiers as degree quantifiers
What is the semantics of a class A quantifier? It is tempting to think that class A quantifiers correspond to the well-known generalised quantifier-style determiner denotations such as the ones in (12).5 (12)
⟦more than 10⟧ = λP.λQ. ∃x[#x > 10 & P(x) & Q(x)]
⟦fewer than 10⟧ = λP.λQ. ¬∃x[#x ≥ 10 & P(x) & Q(x)]
In the past decade it has become clear that these modified numerals deserve a closer look (Krifka 1999; Hackl 2001). In what follows, I will assume the following semantics for more than and fewer than, which is based on the arguments in Hackl 2001. (13)
⟦more than 10⟧ = λM. max_n(M(n)) > 10
⟦fewer than 10⟧ = λM. max_n(M(n)) < 10
The workings of this definition will become clear below, but one of the main motivations for an analysis along this line can be pointed out immediately. The semantics in (13) is simply that of a comparative construction, where cardinalities are seen as a special kind of degrees. That is, like the comparative, it involves a degree predicate M and a maximality operator that applies to this predicate (Heim 2000). In other words, (13) is completely parallel to other comparatives, like (14). While in (13), M is a predicate like being a number n such that Jasper invited n people to his party, in (14) M could, for instance, be filled in with something like being a degree d such that Jasper is tall to degree d.
(14) ⟦-er than d⟧ = λM. max_d′(M(d′)) > d
5 In a set-theoretic approach (12) would correspond to the perhaps more familiar (i). I discuss (12) rather than (i) since, in what follows, I will assume a framework that makes use of sum individuals. It is easy to see that, within their own respective frameworks, (12) and (i) ultimately yield the same truth-conditions.
(i) ⟦more than 10⟧ = λX.λY. |X ∩ Y| > 10
⟦fewer than 10⟧ = λX.λY. |X ∩ Y| < 10
Hackl assumes that argument DPs containing a (modified) numeral always contain a silent counting quantifier many: (15)
⟦many⟧ = λn.λP.λQ. ∃x[#x = n & P(x) & Q(x)]
(16)
10 sushis
[ DP [ 10 many ] sushis ]
In this framework, the numeral (of type d, of degrees) is an argument of the silent quantifier many (of type ⟨d, ⟨⟨e, t⟩, ⟨⟨e, t⟩, t⟩⟩⟩, of generalised quantifier-style determiners parameterised for degrees). By applying [ 10 many ] to the noun (phrase), the standard generalised quantifier denotation of 10 sushis is derived: λQ.∃x[#x = 10 & sushi(x) & Q(x)]. The structure of a DP containing a modified numeral does not differ essentially. Modified numerals are also the argument of a counting quantifier, as illustrated in (17). (17)
fewer than 10 sushis
[ DP [ [ fewer than 10 ] many ] sushis ]
As was stated above, many is parametrised for cardinalities, which we take to be degrees. Fewer than 10, however, denotes a degree quantifier, not a degree constant. Thus, to avoid a type clash, the modified numeral in (17) has to move, leaving a degree trace and creating a degree property. (18)
Jasper ate fewer than 10 sushis.
[ [fewer than 10] [ λn [ Jasper ate [ [ n many ] sushis ] ] ] ]
This leads to the following interpretation, which results in the desired simple truth-conditions. (19)
[λM. max_n(M(n)) < 10](λn. ∃x[#x = n & sushi(x) & ate(j, x)])
=β max_n(∃x[#x = n & sushi(x) & ate(j, x)]) < 10
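The derivation in (19) can be replayed in a small extensional model. The following is only a sketch under simplifying assumptions that are mine rather than Hackl’s: properties are modelled as sets of atoms, a sum with n atoms is taken to exist whenever at least n atoms satisfy both properties, and an empty degree predicate is assigned the maximum 0. All function and variable names are hypothetical.

```python
# Toy extensional model of (13), (15) and (19). All names are illustrative.

def many(n, P, Q):
    """(15), simplified: is there a group of exactly n atoms that are both P and Q?

    P and Q are sets of atoms. Such a group exists iff 1 <= n <= |P & Q|,
    so the sums themselves need not be constructed.
    """
    return 1 <= n <= len(P & Q)


def max_degree(M, bound=100):
    """max_n(M(n)) over the degrees 1..bound.

    Assumption of this sketch: an empty degree predicate gets maximum 0.
    """
    return max((n for n in range(1, bound + 1) if M(n)), default=0)


def more_than(k):
    """(13): the degree quantifier that holds of M iff max_n(M(n)) > k."""
    return lambda M: max_degree(M) > k


def fewer_than(k):
    """(13): the degree quantifier that holds of M iff max_n(M(n)) < k."""
    return lambda M: max_degree(M) < k


sushi = {f"s{i}" for i in range(1, 21)}           # 20 sushis in the model
eaten_by_jasper = {f"s{i}" for i in range(1, 8)}  # Jasper ate 7 of them


def jasper_ate_n_sushis(n):
    """The degree property created by moving the modified numeral, as in (18)."""
    return many(n, sushi, eaten_by_jasper)


print(fewer_than(10)(jasper_ate_n_sushis))  # True
print(more_than(10)(jasper_ate_n_sushis))   # False
```

In this model the degree property holds of 1 through 7, so its maximum is 7 and the comparison with 10 comes out as expected.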
This might seem like a rather elaborate way of deriving the truth-conditions for such simple sentences. Using (12), we would have derived as truth-conditions ¬∃x[#x ≥ 10 & sushi(x) & ate(j, x)], which is equivalent to (19), but which does not require resorting to (moving) degree quantifiers and silent counting quantifiers. Importantly, however, Hackl’s theory makes some crucial predictions which are not made by theories assuming a semantics as in (12). If, like degree operators, modified numeral operators can take scope, we expect to find scope alternations that resemble those found with degree operators (Heim 2000). As Hackl observed, this prediction is borne out. For reasons explained in Heim 2000, structural ambiguity arising from degree quantifiers and intensional operators like modals is only visible with non-upward entailing quantifiers, which is why all the following examples involve upper-bounded modified numerals. The example in (20), for instance, is ambiguous, with (20a) and (20b) as its two readings. (20)
(Bill has to read 6 books.) John is required to read fewer than 6 books.
a. ‘John shouldn’t read more than 5 books’
b. ‘The minimal number of books John should read is fewer than 6’
One of the readings of (20) states that there is an upper bound on what John is allowed to read. The more natural interpretation, however, is a minimality reading, which is about the minimal number of books John is required to read. (That is, (20) would, for instance, be true if John meets the requirements as soon as he reads 3 or more books.) Following Heim (2000), Hackl analyses this ambiguity as resulting from alternative scope orderings of the modal and the comparative quantifier. The upper bound reading, (20a), corresponds to a logical form where the modal takes wide scope. The minimality reading involves the maximality operator intrinsic to the comparative construction taking wide scope over the modal (Heim 2000). (21)
□[max_n(∃x[#x = n & book(x) & read(j, x)]) < 6]
[ require [ [fewer than 6] [ λn [John read n-many books] ] ] ]
(22)
max_n(□∃x[#x = n & book(x) & read(j, x)]) < 6
[ [fewer than 6] [ λn [ require [John read n-many books] ] ] ]
A similar structural ambiguity can be observed with existential modals. The two readings of (23) are an upper bound interpretation as well as a reading
which is very weak, stating simply that values below the numeral are within what is permitted, without stating anything about the permissions for higher values. (That is, the reading intended in (23b) is, for instance, verified by a situation where there are no restrictions whatsoever on what John is allowed to read. Clearly, (23a) would be false in such a situation.) (23)
John is allowed to bring fewer than 10 friends.
a. ‘John shouldn’t bring more than 9 friends’
b. ‘It’s OK if John brings 9 or fewer friends (and it might also be OK if he brings more)’
As before, these readings can be predicted to exist on the basis of the relative scope of modal and comparative quantifiers. (24)
max_n(♦∃x[#x = n & friend(x) & bring(j, x)]) < 10
[ [fewer than 10] [ λn [ allow [John bring n-many friends] ] ] ]
(25)
♦[max_n(∃x[#x = n & friend(x) & bring(j, x)]) < 10]
[ allow [ [fewer than 10] [ λn [John bring n-many friends] ] ] ]
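The two scope orders can be checked mechanically in a toy modal model. The sketch below is not part of Hackl’s proposal; it assumes that each accessible world can be identified with the number of books read (or friends brought) in it, that a group of n items exists in a world w iff 1 ≤ n ≤ w, and that degrees are capped at a hypothetical bound. The two scenarios are the ones described informally above: a requirement to read 3 or more books, and a situation without any restriction.

```python
BOUND = 50  # largest degree considered in this toy model (an assumption)


def has_group(n, w):
    """many1-style predicate: world w contains a group of exactly n items."""
    return 1 <= n <= w


def max_degree(pred):
    """max_n(pred(n)) over 1..BOUND; 0 if no degree satisfies pred."""
    return max((n for n in range(1, BOUND + 1) if pred(n)), default=0)


def box(worlds, prop):
    """Universal modal force: prop holds in all accessible worlds."""
    return all(prop(w) for w in worlds)


def diamond(worlds, prop):
    """Existential modal force: prop holds in some accessible world."""
    return any(prop(w) for w in worlds)


def modal_over_comparative(modal, worlds, k):
    """Logical forms (21)/(25): MODAL [ max_n(...) < k ]."""
    return modal(worlds, lambda w: max_degree(lambda n: has_group(n, w)) < k)


def comparative_over_modal(modal, worlds, k):
    """Logical forms (22)/(24): max_n( MODAL ... ) < k."""
    return max_degree(lambda n: modal(worlds, lambda w: has_group(n, w))) < k


at_least_three = range(3, 21)  # John must read 3 or more books
unrestricted = range(0, 31)    # no restriction on the number of friends

print(modal_over_comparative(box, at_least_three, 6))     # False: upper bound reading (20a)
print(comparative_over_modal(box, at_least_three, 6))     # True: minimality reading (20b)
print(comparative_over_modal(diamond, unrestricted, 10))  # False: upper bound reading (23a)
print(modal_over_comparative(diamond, unrestricted, 10))  # True: weak reading (23b)
```

The outcomes match the informal descriptions: the minimality reading of (20) is verified by a requirement of 3 or more books, and only the weak reading of (23) survives in a situation without restrictions.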
The reader may check that Hackl’s predicted readings in (24) and (25) are indeed the attested ones.
3.2 Class B modifiers are different
These analyses are strongly supportive of an approach which treats comparative quantifiers as comparative constructions. The question now is whether class B quantifiers should be given a similar treatment. In other words, will the semantics in (26) do? (26)
⟦up to / maximally / at most / etc. 10⟧ =? λM. max_n(M(n)) ≤ 10
Choosing a semantics that is parallel to that of fewer than is partly unintuitive since the class B quantifiers are not comparative constructions. Yet, cases like maximally 10 suggest that the crucial ingredient of the semantics is the same, namely a maximality operator. The unsuitability of the analysis in (26) becomes immediately apparent, however, if we investigate examples with class B modified numerals embedded under an existential modal: these turn out not to be ambiguous (cf. Geurts & Nouwen 2007). Class B modifiers like maximally, up to and at most always yield an upper bound on what is allowed and resist the weaker reading that was found with comparative modifiers, as
the contrast between (27) and (28) makes clear. (27)
John is allowed to bring fewer than 10 friends. But more is fine too.
(28)
John is allowed to bring {up to / at most / maximally} 10 friends. #But more is fine too.
A further interesting property of the interaction of class B modified numeral quantifiers and modals is that existential modals interfere with the inferences about speaker knowledge that we found for simple sentences. Above, I observed that (29) licenses the inference that the speaker does not know how many friends Jasper invited. In contrast, (30) does not license any such inference; it is compatible with the speaker knowing exactly what is and what is not allowed. (29)
Jasper invited maximally 50 friends.
(30)
Jasper is allowed to invite maximally 50 friends.
These observations add to the data separating class A from class B quantifiers. Summarising, the distinctions are then as follows. First of all, class B quantifiers, but not class A quantifiers, resist definite amounts, except when embedded under an existential modal. Second, class B quantifiers, but not class A quantifiers, resist weak readings when embedded under an existential modal. In the next section I will argue that the peculiarities of class B quantifiers can be explained if we assume that they are quite simply maxima and minima indicators. Basically, what I propose is that the semantics of maximally (minimally) is simply the operator max_d (min_d). This might be perceived as stating the obvious. What is not obvious, however, is how such a proposal accounts for the difference between class A and class B quantifiers. I will argue that the limited distribution of class B modifiers is due to the fact that they give rise to readings that are in competition with readings available for non-modified structures. I will show that, in many circumstances, the application of a class B modifier to a numeral yields an interpretation which is equivalent to one that was already available for the bare numeral. Before I can explain the proposal in detail, I therefore need to include an account of bare numerals in the framework.
3.3 The semantics of numerals
Above, I adopted the semantics of Hackl 2001 for comparative modified numerals. An important part in that framework is played by the counting quantifier many. I will re-name this operator many1 , for, in what follows, I assume that for any numeral there are two counting quantifiers available. These two options are to account for the two meanings of numerals that may be observed: on the one hand the existential / weak / lower-bounded meaning and, on the other hand, the doubly bound / strong meaning. An example like (31), for instance, is ambiguous between (31a) and (31b). (31)
Jasper read 10 books.
a. the number of books read by Jasper ≥ 10
b. the number of books read by Jasper = 10
I assume that, like the meaning in (31a), the meaning in (31b) is semantic and not the result of a scalar implicature derived from (31a). See e.g. Geurts 2006 for a detailed ambiguity account, and for some compelling arguments in favour of it.6 In the current framework, that of Hackl 2001, the weak reading in (31a) is due to a weak semantics for the counting quantifier, i.e. many1. I propose that the strong reading, (31b), is accounted for by an alternative quantifier many2 (taking inspiration from Geurts 2006).7 (32)
⟦many1⟧ = λn.λP.λQ. ∃x[#x = n & P(x) & Q(x)]
⟦many2⟧ = λn.λP.λQ. ∃!x[#x = n & P(x) & Q(x)]
Here, ∃!x[ϕ] abbreviates ∃x[ϕ & ∀x′[x′ ≠ x → ¬ϕ[x/x′]]].8 In other words, ∃!x stands for ‘exactly one . . . ’. When x ranges over groups of individuals, ∃!x[#x = n & P(x)] is verified by assigning to x the maximal group of individuals with property P, where n is the cardinality of that group. This is because any smaller group will not be the unique group with property P of its cardinality. For instance, if our domain is {a, b, c, d}, all of which satisfy P, then ∃!x[#x = 3 & P(x)] is false, since several groups have three atoms and property P, among which a ⊕ b ⊕ c and a ⊕ c ⊕ d. However, ∃!x[#x = 4 & P(x)] is true, since apart from a ⊕ b ⊕ c ⊕ d there is no other group that has 4 atoms while satisfying P. Consequently, ∃!x[#x = n . . .] stands for ‘exactly n. . . ’. For instance, the doubly bound reading of Jasper read 10 books is (33). The truth-conditions of (33) are such that it is false if Jasper read fewer than 10 books (for then there would not be 10 books he read), but also false if Jasper read more than 10 books (for then there would be many groups of 10 books he read).
(33) ∃!x[#x = 10 & book(x) & read(j, x)]
6 But see Breheny 2008 for a dissenting view.
7 Here is a mnemonic. The 1 in many1 represents the fact that this operator is unilaterally bound, namely lower-bounded only. Many2 on the other hand is bilaterally bound.
8 Here, ϕ[x/x′] is the formula that is exactly like ϕ except that free occurrences of x have been replaced by x′. Moreover, it is assumed that ϕ contains no free occurrences of x′.
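The uniqueness reasoning behind many2 can be illustrated by enumerating sums explicitly. This is a minimal sketch, assuming that sums can be modelled as sets of atoms and that a sum has property P exactly when all of its atoms do; the helper names are hypothetical.

```python
from itertools import combinations


def groups(atoms, n):
    """All sums (modelled as frozensets) of exactly n atoms drawn from `atoms`."""
    if n < 1:
        return []
    return [frozenset(c) for c in combinations(sorted(atoms), n)]


def exists(n, atoms):
    """Weak quantifier: some group of cardinality n, where `atoms` satisfy P."""
    return len(groups(atoms, n)) >= 1


def exists_unique(n, atoms):
    """Strong quantifier: exactly one group of cardinality n satisfies P."""
    return len(groups(atoms, n)) == 1


P = {"a", "b", "c", "d"}
print(exists_unique(3, P))  # False: several groups of three atoms
print(exists_unique(4, P))  # True: only the maximal group has four atoms

books_read = {f"b{i}" for i in range(12)}  # hypothetical: Jasper read 12 books
print(exists(10, books_read))         # True: weak reading (31a)
print(exists_unique(10, books_read))  # False: strong reading (31b)
```

With 12 books read there are many groups of 10 books, so the strong reading fails, just as it would fail with fewer than 10 books because no such group exists at all.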
Not only does the option of two counting quantifiers, many1 and many2 , suffice to account for the ambiguity of bare numerals, it is moreover harmless with respect to the semantics of comparative quantifiers. A sentence like Jasper read more than 10 books is not ambiguous. It is important to show that the availability of two distinct counting quantifiers does not predict ambiguities in such examples. It will be instructive to see in somewhat more detail why this is indeed the case. The structure in (34) is exemplary of a simple sentence with a modified numeral object. As explained earlier, the modified numeral applies to the degree predicate that is created by moving the quantifier out of the DP. (34)
[ MOD n [ λd [ Jasper read d many1/2 books ] ] ]
Now that there is a choice between two counting quantifiers, the denotation of the degree predicate depends on which of many1 and many2 is chosen. The predicate in (35) is the result of a structure containing many1 ; the predicate in (36) is based on many2 . If, in the actual world, Jasper read 10 books, then (35) denotes {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. When, however, the predicate contains the many2 quantifier, the denotation is a singleton set: {10} if Jasper reads 10 books. This is because only the maximal group of books read by Jasper is such that it is the unique group of that kind of a certain cardinality. In general, the many2 -based degree predicate extension is a singleton set containing the maximum of the values in the denotation of the many1 -based degree predicate. (35)
λd.∃x[#x = d & book(x) & read(j, x)]
(36)
λd.∃!x[#x = d & book(x) & read(j, x)]
As discussed above, comparative quantifiers involve maximality operators. However, the maximal values for degree predicates like (35) and (36) are
always equivalent. In simple sentences based on a structure like (34), the option of having two distinct counting quantifiers does therefore not result in any ambiguity. When we turn to cases where the degree predicate is formed by moving the modified numerals over a modal operator with universal force, something similar can be observed. If Jasper is required to read (exactly) 10 books, then the structure in (37) yields, again, the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Once more, the structure which contains the bilateral counting quantifier, the one in (38), yields the set containing the maximum of its weaker counterpart. (37)
[ λd [ require [ Jasper read d many1 books ] ] ]
λd.□∃x[#x = d & book(x) & read(j, x)]
(38)
[ λd [ require [ Jasper read d many2 books ] ] ]
λd.□∃!x[#x = d & book(x) & read(j, x)]
Given that the relation between (38) and (37) is once again one of a set and its maximal value, no ambiguities can be expected to arise when comparative quantifiers are applied to these two predicates. This is as is desired. Of course, it could be that the actual situation is not one containing a specific requirement, but one with for instance a minimality requirement. Say, for instance, Jasper has to read at least 4 books. In that case, (37) denotes the set {1, 2, 3, 4}. The extension of (38), however, is the empty set. (In such a context, there is no specific n such that Jasper has to read exactly n books.) Clearly, the maximal value for the predicate is undefined in such a case. This means that the logical form based on many2 will not lead to a sensible interpretation and, so, we again do not expect to find ambiguity. The case of predicates that are formed by abstracting over an existential modal operator is illustrated in (39) and (40). If Jasper is allowed to read a maximum of 10 books, then the two predicates are equivalent, both denoting the set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.9 (39)
λd.♦∃x[#x = d & book(x) & read(j, x)]
(40)
λd.♦∃!x[#x = d & book(x) & read(j, x)]
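The degree sets discussed for (35) to (40) can be computed directly. The sketch below rests on assumptions of my own: a world is identified with the number of books Jasper reads in it, many1 holds of n in w iff 1 ≤ n ≤ w, many2 holds iff n = w, and a modal is quantification over a list of such worlds. The names are hypothetical.

```python
def many1(n, w):
    """Weak counting quantifier: world w contains a group of n books read."""
    return 1 <= n <= w


def many2(n, w):
    """Strong counting quantifier: the unique group of n books read is the maximal one."""
    return n >= 1 and n == w


def degree_set(pred, worlds, force=all, bound=30):
    """Degrees n in 1..bound such that pred(n, w) holds in all (force=all)
    or some (force=any) of the given worlds."""
    return {n for n in range(1, bound + 1) if force(pred(n, w) for w in worlds)}


actual = [10]                 # no modal: Jasper read 10 books
at_least_four = range(4, 21)  # universal modal: at least 4 books required
up_to_ten = range(0, 11)      # existential modal: at most 10 books allowed
four_to_ten = range(4, 11)    # existential modal with an additional lower bound

print(degree_set(many1, actual))               # the set 1..10, as for (35)
print(degree_set(many2, actual))               # {10}, as for (36)
print(degree_set(many1, at_least_four))        # {1, 2, 3, 4}, as for (37)
print(degree_set(many2, at_least_four))        # set(), as for (38)
print(degree_set(many1, up_to_ten, force=any)) # the set 1..10, as for (39)
print(degree_set(many2, up_to_ten, force=any)) # the set 1..10, as for (40)
```

Whenever both sets are non-empty, their maxima coincide, which is why a maximality-based operator cannot distinguish the two counting quantifiers. The `four_to_ten` scenario yields different sets with the same maximum.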
In sum, the option of two counting quantifiers many1 and many2 is irrelevant when combined with a comparative quantifier. This is because the comparative quantifier is based on maximality and the degree predicates containing the different counting quantifiers do not differ in their maximum value.
9 If there is in addition a lower bound, the two predicates are no longer equivalent, but their maximum will be.
4 The semantics of class B quantifiers
I now turn to the main proposal: class B quantifiers are maxima/minima indicators. I start with the upper-bounded modifiers.
4.1 Upper bound class B modifiers
In the formula in (41), MOD↓B generalises over any of the class B modifiers at most, maximally, up to, etc.10 (41)
⟦MOD↓B⟧ = λd.λM. max_n(M(n)) = d
If the semantics of upper bound class B quantifiers is as in (41), then why is their distribution so limited? I think that many examples with class B quantifiers are awkward because, in many cases, (41) is a vacuous operator. To be precise, the two propositions in (42) are equivalent whenever the cardinality predicate M denotes a singleton set. In such a case, a bare numeral form is to be preferred over a numeral modified by a class B modifier, since the latter derives the same meaning from a much more complex linguistic form. (42)
a. max_n(M(n)) = d
b. M(d)
What I have in mind exactly is the kind of reasoning underlying Horn’s division of pragmatic labour (Horn 1984). The idea is that a maxim of brevity, part of Grice’s maxim of Manner (Grice 1975), steers toward minimising the form used to express something. This causes simple (unmarked) meanings to be typically expressed by means of simple (unmarked) forms. Marked forms which by convention could be given the same unmarked meaning as some unmarked form are instead given a more marked interpretation. There are many variations and implementations of this idea (McCawley 1978; Atlas & Levinson 1981; Blutner 2000; van Rooij 2004),11 but what is most relevant for this paper is the general idea that an unmarked meaning is blocked as an interpretation for the marked form. With this in mind, the equivalence of (42a) and (42b) whenever M denotes a singleton set has profound consequences for when it actually makes sense to state that the maximum of a degree predicate equals a certain value. That is, in cases where (42a) equals (42b), we expect that the use of maximally does not lead to an interpretation based solely on (42a), since the use of the bare numeral form would result in the same meaning. To illustrate this in some more detail let us carefully go through the following examples. We know from the discussion above that one of the interpretations available for (43) is (44).
(43) Jasper invited 10 people.
(44) ∃!x[#x = 10 & people(x) & invite(j, x)]
10 For modifiers like at most and maximally, one might wonder whether (41) is not too restricted, given that they are capable of modifying DPs more generally. However, it appears that there is a common mechanism to all uses of such modifiers. For instance, (i) could be assigned its intuitive meaning if we assume that at most has the semantics in (ii), where the operator ‘max’ compares properties on the rank order [assistant professor < associate professor < full professor]:
(i) Jasper is at most an associate professor.
(ii) ⟦at most⟧ = λP.λx. max_P′(P′(x)) = P
It goes beyond the scope of this article to implement a formal connection between (ii) and (41), but it should be clear that the underlying mechanism is the same.
Now consider (45), which is interpreted either as (46) or as (47). (45)
Jasper invited maximally 10 people.
(46)
[ maximally 10 [ λd [ Jasper invited d many1 people ] ] ]
max_n(∃x[#x = n & people(x) & invite(j, x)]) = 10
(47)
[ maximally 10 [ λd [ Jasper invited d many2 people ] ] ]
max_n(∃!x[#x = n & people(x) & invite(j, x)]) = 10
The interpretations in (46) and (47) are equivalent. In fact, just like we do not expect ambiguities to arise with comparative quantifiers on the basis of the many1/many2 choice, we do not expect any ambiguities to arise with MOD↓B quantifiers, for the simple reason that both such operators involve a maximality operator and that the maximal values of predicates based on many1 are always those of predicates based on many2. In what follows, we will therefore gloss over the two equivalent options by representing the semantics following the general scheme in (48).
(48) [ maximally 10 [ λd [ Jasper invited d many1/2 people ] ] ]
max_n(∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
11 In fact, there is a close resemblance between this prevalent idea in pragmatics and blocking principles in other parts of linguistics. The commonality is that two different expressions cannot have identical meanings. See, for instance, the Elsewhere Condition (Kiparsky 1973) in phonology or the Avoid Synonymy principle (Kiparsky 1983) in morphology.
Importantly, the single reading of (45) is equivalent to (44), the strong reading of (43). The example in (43), however, reaches this interpretation by means of a much simpler linguistic form, one which does not involve a numeral modifier. I propose that this is why the reading in (48) of (45) does not surface: it is blocked by (43).12 As observed above, we can nevertheless make sense of (45) once we interpret the sentence to be about what the speaker holds possible. So, a further possible reading for (45) is that in (49). (49)
max_n(♦∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
Crucially, this interpretation is not equivalent to (50), which is the result of interpreting (43) from the perspective of speaker possibility. (50)
♦∃!x[#x = 10 & people(x) & invite(j, x)]
12 An anonymous reviewer notes two complications with the proposed blocking mechanism. First of all, s/he wonders why exactly 10 is not blocked in a similar way to minimally 10, since the same reasoning seems to apply. I acknowledge that this is something that needs to be explained. Interestingly, this is something any theory that believes in the existence of an ‘exactly’ sense for numerals has to explain. One promising route has been proposed by Geurts (2006), who suggests that exactly is semantically empty and that its only function is “to reduce pragmatic slack” (p. 320). That is, whereas bare 100 allows for an imprecise rough construal (Krifka 2007a), exactly 100 enforces precision. If Geurts is on the right track, then there is no reason to expect that exactly 100 is blocked by 100. A further complication noted by the same anonymous reviewer is that if we assume that the ‘max’ operator is presuppositional, we might come to expect that maximally 100 blocks 100 instead of the other way around. This prediction appears to be made when at the same time we assume the Maximize Presupposition principle (Heim 1991). Since maximally 100 and 100 share the same meaning, but the former triggers a presupposition, the use of 100 would be blocked. This is a very interesting scenario, but since I have little to say about the kind of presuppositions (if any) expressions like maximally trigger and I furthermore have no thoughts on how maximize presupposition would interact with a brevity maxim, I will leave this issue to further research.
In other words, the meaning in (49) for (45) is not blocked by the bare numeral form in (43) since (43) lacks this reading. To be sure, I do not claim that (50) would be an available reading for (43). That is, the particular kind of interpretation that examples like (45) receive is available only as a last resort strategy. Underlying this analysis is the assumption that there exist silent modal operators. I can offer no independent evidence for this assumption, but stress that the intuitions regarding examples like (45) quite clearly point into the direction of some sort of speaker modality. In work on superlative quantifiers, we find some alternatives to the present account. Such approaches are meant to deal with at most and at least only, but if my arguments above are on the right track, then we could reinterpret these proposals for the semantics of superlative quantifiers as applying to the whole of class B. For instance, the analysis of class B expressions presented here differs from that of superlative modifiers in Geurts & Nouwen 2007. According to the present proposal, the modal flavour of (45) is due to a silent existential modal operator. In Geurts & Nouwen, however, the modal was taken to be part of the lexical content of superlative quantifiers. Another alternative, proposed for superlative modifiers in Krifka 2007b and which is closer to the present proposal, is to analyse examples like (45) not as involving a modal operator, but rather a speech act predicate, like assert. In that framework, the analysis of (45) would say that n=10 is the maximal value for which ∃(!)x[#x = n & people(x) & invite(j, x)] is assertable, rather than possible.13 That is, according to Krifka, (45) is interpreted by assigning the modified numeral scope over an illocutionary force operator, rather than over a modal operator. I will return to a comparison of these approaches below. 
I would like to point out immediately, however, what I think are the major disadvantages of both alternatives. The main problem is with examples like (51), which contain an overt existential modal. (51)
Jasper is allowed to invite maximally/at most 10 people.
13 In his comments on the first version of this paper, David Beaver observed that it is not necessarily the speaker’s knowledge that matters, as can be seen from (his) example (i). (i)
I know how many people were at the party, but I’ve been told not to reveal that number to the press. However, there were maximally 50 there.
It would be interesting to see if data like these help in reaching a synthesis of Krifka’s account and the present proposal.
Its most salient reading is one in which 10 is said to be the maximum number of people Jasper is allowed to invite. That is, it places an upper bound on what is allowed. For Krifka, this is problematic since, here, the modified numeral is quite obviously not a speech act operator. For the proposal in Geurts & Nouwen 2007, such examples are problematic since the modal lexical semantics of at most predicts a reading with a double modal operator, one originating from the verb and one from the numeral modifier. To remedy this, Geurts and Nouwen provide an essentially non-compositional analysis of such examples as modal concord.14 In contrast, the current proposal deals effortlessly with examples such as (51). What was crucial to my explanation of how (45) gets to be interpreted is that degree predicates based on modals with existential force denote non-singleton sets even when the counting quantifier associated with the numeral is many2. This entails that saying that the maximum value for such a predicate is n is not equivalent to saying that the predicate holds for n. More formally, there is a contrast between (52a) and (52b). (52)
a. max_n(∃(!)x[#x = n & people(x) & invite(j, x)]) = 10 ⇔ ∃!x[#x = 10 & people(x) & invite(j, x)]
b. max_n(♦∃(!)x[#x = n & people(x) & invite(j, x)]) = 10 ⇎ ♦∃!x[#x = 10 & people(x) & invite(j, x)]
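The contrast in (52) can be verified in a toy model. As a sketch only, I assume that a world is identified with the number of people Jasper invites in it, that the non-modal case corresponds to a single world, and that ♦ is existential quantification over a list of epistemically possible worlds. The names are hypothetical.

```python
def many1(n, w):
    """Weak counting quantifier: world w contains a group of n invitees."""
    return 1 <= n <= w


def many2(n, w):
    """Strong counting quantifier: the invitees in w number exactly n."""
    return n >= 1 and n == w


def max_is(d, pred, worlds, bound=30):
    """Does the maximal n with pred(n, w) in some world of `worlds` equal d?
    With a single world this is the non-modal case."""
    degrees = [n for n in range(1, bound + 1) if any(pred(n, w) for w in worlds)]
    return bool(degrees) and max(degrees) == d


# (52a): without a modal, 'the maximum is 10' and 'exactly 10' coincide
# in every world, whichever counting quantifier is used.
plain_equivalent = all(
    max_is(10, pred, [w]) == many2(10, w)
    for w in range(0, 31)
    for pred in (many1, many2)
)
print(plain_equivalent)  # True

# (52b): under speaker possibility the two come apart.
epistemic = [10, 15]  # the speaker considers 10 and 15 invitees possible
print(max_is(10, many1, epistemic))             # False: the possible maximum is 15
print(any(many2(10, w) for w in epistemic))     # True: exactly 10 is possible
```

The second scenario is one where the bare numeral under ♦ is true while the maximum under ♦ is not 10, so the modified form expresses something the bare form does not.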
As a result, whenever an upper bound class B modifier scopes over an existential modal, no blocking from the simpler bare numeral form will be able to take place. The application of an upper bound class B quantifier to a degree predicate is only felicitous if the resulting readings are not readings that can be expressed just as well by omitting the class B modifier. This is the case when a modal with existential force has scope inside the degree predicate.15 Treating upper bound class B quantifiers as maxima indicators thus also predicts the absence of weak readings for examples like (51). Given the flexible scope of the numeral modifier we expect this sentence to have two corresponding logical forms, (54a) and (54b). (From here on, ◈ indicates deontic modality, to distinguish it from the (epistemic) speaker possibility ♦.)
(53) Jasper is allowed to invite maximally/at most 10 people.
(54) a. max_n(◈∃(!)x[#x = n & people(x) & invite(j, x)]) = 10
b. ◈[max_n(∃(!)x[#x = n & people(x) & invite(j, x)]) = 10]
14 A further problem I see with the proposal in Krifka 2007b is that the analysis does not appear to extend straightforwardly to illocutionary forces other than assertion, although in fairness this might be because (at the time of writing) no detailed exposition of this theory exists. For instance, nothing suggests that superlative modified numerals can scope over a question operator in questions. An additional disadvantage for the proposal of Geurts and Nouwen is that it does not yield an explanation of the lexical form of class B modifiers. Whereas the current proposal assigns to a modifier like maximally the semantics of a maximality operator, an extension of Geurts and Nouwen’s approach would have to take it to be a modal, thereby disassociating it from the intuitive meaning of maximal.
If maximally 10 is taken to have wide scope over the modal, then we arrive at (54a), the reading that says that the maximum number of people Jasper is allowed to invite equals 10. This is not a semantic interpretation that is available for (55). Its many2 reading, for instance, says that inviting exactly 10 people is something that Jasper is allowed to do. This is much weaker than (54a). (The only way we can arrive at an equally strong reading for (55) is by means of implicature.) (55)
Jasper is allowed to invite 10 people.
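The relation between the two logical forms in (54) and the many2 reading of (55) can be checked by brute force over small deontic scenarios. The following sketch assumes that a scenario is a non-empty set of permitted worlds, that each world is identified with the number of people Jasper invites in it, and that numbers are capped at 12; the names are hypothetical.

```python
from itertools import combinations

NUMBERS = list(range(0, 13))  # candidate numbers of invitees (an assumed cap)


def scenarios():
    """Every non-empty set of deontically accessible worlds."""
    for size in range(1, len(NUMBERS) + 1):
        for combo in combinations(NUMBERS, size):
            yield combo


def max_in_world(w):
    """Maximal n such that a group of n invitees exists in world w (0 if none)."""
    return max((n for n in range(1, 13) if n <= w), default=0)


def reading_54a(worlds):
    """Modifier over modal: the largest permitted number of invitees is 10."""
    degrees = [n for n in range(1, 13) if any(n <= w for w in worlds)]
    return bool(degrees) and max(degrees) == 10


def reading_54b(worlds):
    """Modal over modifier: some permitted world has maximum 10."""
    return any(max_in_world(w) == 10 for w in worlds)


def reading_55(worlds):
    """Bare numeral, many2: inviting exactly 10 people is permitted."""
    return any(w == 10 for w in worlds)


all_scenarios = list(scenarios())
same_as_bare = all(reading_54b(s) == reading_55(s) for s in all_scenarios)
stronger = all(reading_55(s) for s in all_scenarios if reading_54a(s))
not_equivalent = any(reading_55(s) and not reading_54a(s) for s in all_scenarios)

print(same_as_bare)    # True: the narrow-scope form never differs from the bare numeral
print(stronger)        # True: the wide-scope form entails the bare numeral reading
print(not_equivalent)  # True: but the bare numeral reading does not entail it
```

In this model the narrow-scope form is indistinguishable from the bare numeral, whereas the wide-scope form is strictly stronger, so only the latter adds anything to what the simpler sentence already expresses.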
If we take the modal in (51) to have widest scope, as in (54b), the resulting interpretation is one in which inviting exactly 10 people is allowed for Jasper. This is the reading for (55) discussed above, and so it is blocked. As a result, (54a) is the only interpretation available. An interesting side to the account presented here is that the upper bound class B quantifiers do not encode the ≤ relation. As maxima indicators, their application only makes sense if what they apply to denotes a range of values. Otherwise, using the strong reading of the bare numeral form will do just as well. Interestingly, the approach also predicts that some of the examples I discussed above do not only result in a blocking effect, but could moreover be predicted to be false. For instance, according to the approach set out above, the meaning of (56a) is that in (56b). 15 As far as I can see, assertability would have the same (crucially weak) properties as possibility. So, should a silent speech act predicate seem more plausible than a silent modal operator, then ♦ can just as well be interpreted as expressing assertability. It appears that such a move would be largely compatible with the proposal of Krifka 2007b.
(56)
a. #A triangle has maximally 10 sides.
b. ‘the maximum number of sides in a triangle is 10’
The reading in (56b) is not only blocked by A triangle has 10 sides, but it is moreover plainly false. I believe that this predicts that (56a) should be expected to have a somewhat different status from (57), which strictly speaking has a true interpretation, but one that can be expressed by simpler means.

(57) #A triangle has maximally 3 sides.
It is difficult to establish whether this difference in status is borne out, or even how this difference can be recognised. However, my own intuition tells me that while (56) is never acceptable, (57) could be used in a joking fashion. Native speakers inform me that (58) is marginally acceptable:

(58) ?A triangle has minimally and maximally 3 sides.

4.2 Lower-bound class B modifiers

Lower-bound class B modifiers correspond to minimality operators. Let MOD↑B correspond to any of the class B expressions at least, from, minimally, etc.

(59) MOD↑B = λd.λM. min_n(M(n)) = d
Note first that minimality operators are sensitive to the many1 / many2 distinction. Consider the degree predicate [λd. John read d many1/2 books] and suppose that John read 10 books. In the many1 version of the logical form, the minimal degree equals 1. In fact, independent of how many books John read, as long as he read books, the minimal degree will always be 1. In the many2 version of the logical form, the predicate denotes a singleton set, {10} if John read 10 books. The minimal degree in that case is, of course, 10. These observations already straightforwardly account for our intuitions for an example like (60).

(60) John read minimally 10 books.
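The contrast between the two degree predicates can be made concrete with a small sketch. It is an illustration only: the function names, the finite degree range and the toy fact that John read exactly 10 books are assumptions, not part of the proposal.

```python
# Toy model: John read exactly `actual` books (an assumption for illustration).

def many1(actual, max_n=20):
    """Degrees n >= 1 for which SOME group of n books read by John exists."""
    return {n for n in range(1, max_n + 1) if n <= actual}

def many2(actual, max_n=20):
    """Degrees n >= 1 for which John read EXACTLY n books (a singleton set)."""
    return {n for n in range(1, max_n + 1) if n == actual}

def mod_lower_b(d, degrees):
    """Sketch of MOD↑B in (59): true iff the minimum of the degree set is d."""
    return bool(degrees) and min(degrees) == d

print(min(many1(10)))              # 1: the many1 minimum is always 1
print(many2(10))                   # {10}: many2 yields a singleton
print(mod_lower_b(10, many1(10)))  # False: the many1 reading of (60)
print(mod_lower_b(10, many2(10)))  # True: same truth conditions as the bare numeral
```

On this sketch the many1 reading of (60) comes out false whatever the actual count is, while the many2 reading is true only when the count is exactly 10, which is why it competes with the bare numeral.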
The many1 interpretation of (60) will be rejected, for it will always be false. The minimal value for any simple many1 -based degree predicate is always 1. The many2 interpretation of (60) will be rejected too, for it will correspond to an interpretation saying that John read (exactly) 10 books. This reading is
blocked by the bare numeral. (In fact, (60) in the many2 variant is equivalent to John read maximally 10 books, which, as was explained above, is blocked for the same reasons.) We can save (60) by interpreting it with respect to an existential modal operator. This yields two readings:

(61) a. min_d(♦∃x[#x = d & read(j, x) & book(x)]) = 10
     b. min_d(♦∃!x[#x = d & read(j, x) & book(x)]) = 10
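The two readings in (61) can be checked on a toy epistemic model. The sketch below is hedged: the set of possible book counts and all function names are hypothetical, chosen only to show how the two minima differ.

```python
# Hypothetical epistemic state: the numbers of books John read in the worlds
# the speaker considers possible (illustrative values only).
possible_counts = {10, 11, 13}

def possibly_many1(n, counts):
    """Sketch of ♦∃x[#x = n ...]: some possible world has at least n books read."""
    return any(n <= c for c in counts)

def possibly_many2(n, counts):
    """Sketch of ♦∃!x[#x = n ...]: some possible world has exactly n books read."""
    return any(n == c for c in counts)

def min_degree(pred, counts, max_n=50):
    """Smallest n >= 1 satisfying pred, or None if there is none."""
    return next((n for n in range(1, max_n + 1) if pred(n, counts)), None)

print(min_degree(possibly_many1, possible_counts))  # 1
print(min_degree(possibly_many2, possible_counts))  # 10
```

Under the existential (many1) predicate the minimum is 1 for any non-empty state, whereas under the exact (many2) predicate it is the lowest count the speaker holds possible.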
The form in (61a) is once more a contradiction: the minimal degree for which it is deemed possible that John read d-many1 books is always 1. The reading in (61b) is much more informative. It says that the minimal number for which it is thought possible that John read exactly so many books is 10. In other words, it is regarded as impossible that John read fewer than 10 books. This is exactly the reading that is available.

4.3 Beyond modals
Some words are in order on the interaction of numeral modifiers with nonmodal operators. Given the current proposal, any property that involves existential quantification would license the use of a class B modifier. However, it is known that degree operators (which we take modified numerals to be) cannot move to take scope over nominal quantifiers (cf. Kennedy 1997; Heim 2000).16 This explains why (62) does not have the reading in (63).

(62) Someone is allowed to invite maximally 50 friends.

(63) the person who is allowed to invite most friends is allowed to invite 50 friends
As observed above, however, bare plurals do interact with class B quantifiers, as in for instance example (9). This would suggest that some intensional/modal analysis of the readings involved in such examples is in order. (Thanks to Maribel Romero for pointing this out to me.) I will leave a detailed analysis of these cases for further research.

16 In Heim’s formulation: If the scope of a quantificational DP contains the trace of a degree phrase, it also contains that degree phrase itself. See Heim 2000 for details.
5 Maximal and minimal requirements

As Hackl (2001) observed, there is an interesting interaction between modified numerals and modals. I have extended these observations by showing how existential modals have a tight connection to class B modifiers in that they license their (otherwise blocked) existence. What I have not discussed so far is how class B modifiers interact with universal modals. It turns out that this part of the story is not straightforward at all. Given my proposal in the previous section, we expect that there are in principle four logical forms that correspond to (64).17

(64) Jasper should read minimally 10 books.
(65) □ > min: The minimum n such that Jasper will read n books should be 10
     a. □[min_n(∃x[#x = n & book(x) & read(j, x)]) = 10]     many1
     b. □[min_n(∃!x[#x = n & book(x) & read(j, x)]) = 10]    many2

(66) min > □: The minimum n such that Jasper should read n books is 10
     a. min_n(□∃x[#x = n & book(x) & read(j, x)]) = 10       many1
     b. min_n(□∃!x[#x = n & book(x) & read(j, x)]) = 10      many2
It turns out that none of these logical forms provides a reading that is in accordance with our intuitions regarding (64). First of all, notice that min_n(∃x[#x = n & book(x) & read(j, x)]) = 10 is a contradiction. If there are 10 books that Jasper read, then there is also a singleton group containing a book Jasper read. The minimum number of books Jasper read is therefore either 1 (in case he read something) or 0 (in case he did not read anything). It could never be 10. Consequently, (65a) is a contradiction. For a similar reason, (66a) is a contradiction too. If there needs to be a group of 10 books read by Jasper, then there also need to exist groups containing just a single book read by Jasper. Once again, the minimum number referred to in (66a) is either 0 or 1, never 10.

17 In this paper, I ignore readings which (for the case of at least) Büring (2008) calls speaker insecurity readings and which Geurts & Nouwen (2007) discuss extensively. Basically, this reading amounts to interpreting the modal statement with respect to speaker’s knowledge. Such readings are especially prominent with superlative quantifiers. For instance, the speaker insecurity reading of Jasper should read at least 10 books is: the speaker knows that there is a lower bound on the number of books that Jasper should read, s/he does not know what that lower bound is, but s/he does know that it exceeds 9. Furthermore, I also ignore a reading of (64) in which 10 books is construed as a specific indefinite. In that reading, (64) states that there are 10 specific books such that only if Jasper reads these books will he comply with what is minimally required.

Turning to (65b), notice that the min_n-operator is vacuous here, since there is just a single n such that Jasper read exactly n books. This renders (65b) equivalent to the many2 reading of Jasper should read 10 books, and so we predict it to be blocked. The interpretation in (66b) does not fare any better. In fact, the min_n-operator is vacuous here as well. This means that (65b) is equivalent to (66b) and that it is consequently also blocked. Even if no blocking were to take place, (65b)/(66b) offer the wrong interpretation anyway. They state that Jasper must read exactly 10 books (no more, no fewer), which is not what (64) means. One might think that the problems with (65b) and (66b) can be remedied by abandoning quantification over sums and instead using reference to (maximal) sums. For instance, (67) represents the truth-conditions we are after. (Here σx returns the maximal sum that when assigned to x verifies the scope of σ.)

(67) min_n(□[#σx(book(x) & read(j, x)) ≥ n]) = 10
Still, here too the application of min_n is not meaningful, since there is only a single n such that □[#σx(book(x) & read(j, x)) ≥ n] holds, which is 10 if (64) is true. As a consequence, it would not matter whether we applied a maximality or a minimality operator. We then wrongly predict that (68) should share a reading with (64). (Note that (65b) and (66b) suffer from the same odd prediction, given that the operator min_n has no semantic impact there either.)

(68) Jasper should read maximally 10 books.
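The status of the four logical forms in (65) and (66) can be verified mechanically on a toy deontic model. This is a sketch under stated assumptions: the list of book counts in the required worlds and every function name are hypothetical, and degrees are restricted to a finite range starting at 1.

```python
# Hypothetical deontic model: number of books Jasper reads in each world
# compatible with what is required (illustrative values, not from the paper).
ideal_counts = [10, 12, 15]
DEGREES = range(1, 51)

def many1(n, count):
    """There is a group of n books that Jasper read."""
    return n <= count

def many2(n, count):
    """Jasper read exactly n books."""
    return n == count

def min_or_none(ns):
    ns = list(ns)
    return min(ns) if ns else None

def box_over_min(many, target, counts):
    """(65): in every world, the minimal n with many(n) equals target."""
    return all(min_or_none(n for n in DEGREES if many(n, c)) == target
               for c in counts)

def min_over_box(many, target, counts):
    """(66): the minimal n such that many(n) holds in every world equals target."""
    return min_or_none(n for n in DEGREES
                       if all(many(n, c) for c in counts)) == target

for name, form, many in [("65a", box_over_min, many1), ("65b", box_over_min, many2),
                         ("66a", min_over_box, many1), ("66b", min_over_box, many2)]:
    print(name, form(many, 10, ideal_counts))  # all four print False here
```

In a model where every required world has exactly 10 books read, such as `[10, 10]`, the two many2 forms become true, which corresponds to the blocked 'exactly 10' reading.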
It appears then that the proposal defended in this article fails hopelessly on sentences like (64). As I will show, however, things are not as dire as they appear. In fact, I will argue that what we stumble upon here is a general, but poorly understood property of modals, which could be summarised as follows:

(69) Generalisation: universal modal operators are interpreted as operators with existential modal force when minimality is at stake
An illustration of (69) is (70), which is a satisfactory paraphrase for (64).
What is striking is that this paraphrase contains allow instead of should.

(70) 10 is the smallest number of books John is allowed to read
I will not offer an explanation for this generalisation (but see Nouwen 2010a for an attempt). I will simply show that if we look a bit closer at the interpretation of modal operators, then we come to understand that my theory actually yields a welcome analysis.

5.1 Previous analyses

There is a precedent. In an earlier theory of at least, Geurts & Nouwen 2007, the correct predictions regarding its relation to universal modals are arrived at by an essentially non-compositional mechanism. A central claim made in that paper is that superlative quantifiers are modal expressions themselves. For instance, (71a) was proposed to correspond to (71b).18 Furthermore, it was assumed that there may be a non-compositional interaction between the modal that is implicitly contributed by a modified numeral and an explicit modal operator. For instance, (72a) is interpreted as an instance of modal concord, as in (72b), where the two modals fuse and the modal takes on the deontic flavour of need.19

(71) a. John read at least 10 books.
     b. ∃x[#x = 10 & book(x) & read(j, x)]

(72) a. John needs to read at least 10 books.
     b. □∃x[#x = 10 & book(x) & read(j, x)]
18 This is how I see the theoretical landscape: Although not immediately obvious, the proposal by Geurts and Nouwen already carries in it the idea that superlative quantifiers are minimality and maximality operators. For instance, (71b) is equivalent to stating that 10 is the minimal number of books John is allowed to read. Given the basic idea of treating class B operators as min/max-operators, one has a range of options to account for the distribution of such quantifiers and for their behaviour in intensional contexts. Geurts and Nouwen represent one extreme, where the lexicon specifies the exact behaviour of such quantifiers (together with the rule of modal concord). The present proposal puts forward the other extreme, where the lexical entry for superlative (and other class B) quantifiers is rather minimal, and where pragmatic mechanisms account for distribution and behaviour in intensional contexts.

19 I am simplifying the analysis here a little bit. Geurts & Nouwen (2007) propose that there is an additional conjunct to the meaning of sentences containing superlative quantifiers, for which they leave implicit whether it is entailed or implicated. For (71), for instance, there would be an additional condition in the truth-conditions saying: ¬∃x[#x > 10 & book(x) & read(j, x)]. Similarly for (72).
The approach of Geurts and Nouwen is the most broadly applicable approach to superlative quantifiers in the (admittedly small body of) literature on that topic. There are alternatives on the market, but they do not handle examples like these very well. As I mentioned above, Krifka (2007b) takes at least to be a speech act modifier. Basically, an example like (71) is analysed by Krifka in terms of what the speaker finds assertable and is paraphrased as follows: the lowest n such that it is assertable that John read n books is 10. When at least is embedded in an intensional context, however, it does not modify the strength of assertability, but rather the intensional operator. So, taking Krifka’s analysis as suitable not just for superlative, but rather for all class B quantifiers, (72a) would be paraphrased as (73).

(73) 10 is the smallest value for n such that John should read n books
In such cases, Krifka’s analysis is identical to the one I have set out above and it runs into exactly the same problem: (73) is not the reading we are after. Rather, (72a) means that 10 is the smallest number of books John is allowed to read.

5.2 Minimal requirements
Geurts & Nouwen (2007) and Krifka (2007b) say nothing about the distinction between class A and class B expressions. However, if we extend their proposals for superlative quantifiers to cover all B-type quantifiers, then we have an interesting trio of competing characterisations of such expressions. At face value, the observations made so far in this section would appear to speak in favour of the modal concord proposal of Geurts & Nouwen (2007) (generalised to all class B quantifiers) and against the account defended here or in Krifka 2007b. As I will argue now, however, there are reasons to believe that the problematic predictions made by the latter two theories are not due to the semantics of the modified numeral, but are actually the result of an overly simplistic understanding of requirements. What I will do is discuss in some detail examples like (74).

(74) The minimum number of books John needs to read to please his mother is 10.

Notice, first of all, that on an intuitive level, (74) is equivalent to (75).

(75) John needs to read minimally 10 books to please his mother.
Note, secondly, that (74) spells out the semantics I have proposed for (75). What I will show now is that when we look into the semantic details of (74), we will run into exactly the same problems as we did for (75). What this shows is that rather than thinking that my account of class B quantifiers is on the wrong track, there are actually reasons to believe that the proposal lays bare a hitherto unexplored problem for the semantics of modals like need, require, etc.

Let us consider the semantics of (74). Say that, in fact, the minimal requirements for pleasing John’s mother are indeed John reading 10 books. That is, if John reads 10 or more books, she is happy. If he reads fewer, she will not be pleased. Standard accounts of goal-directed modality (von Fintel & Iatridou 2005) assume that statements of the form to q, need to p are true if and only if p holds in all worlds in which the goal q holds. Below, I refer to the worlds in which John pleases his mother as the goal worlds. It is instructive to see what we know about the propositions that are true in such worlds. The following is consistent with the context described above.

(76) a. In all goal worlds: ∃x[#x = 10 & book(x) & read(j, x)]
     b. In all goal worlds: ∃x[#x = 9 & book(x) & read(j, x)]
     c. In all goal worlds: ∃x[#x = 1 & book(x) & read(j, x)]
     d. In some (not all) goal worlds: ∃x[#x = 11 & book(x) & read(j, x)]
     e. In some (not all) goal worlds: ∃x[#x = 12 & book(x) & read(j, x)]
     f. In no goal world: ¬∃x[book(x) & read(j, x)]
Let us now analyse some examples. First of all, (77a) and (77b) are intuitively true and are also predicted to be true ((77a) by virtue of (76a) and (77b) by virtue of (76c)).

(77) a. To please his mother, John needs to read 10 books.
     b. To please his mother, John needs to read a book.
The example in (78) is intuitively false, and is also predicted to be false, for the context is such that there are goal worlds in which John reads only 10, and not 11, books.

(78) To please his mother, John needs to read 11 books.
So far, so good. If we turn to examples that place a bound on what is required, however, then the theory makes a wrong prediction. The example in (79) is intuitively false. If interpreted as (80), however, it is predicted to be true (by virtue of (76c)).

(79) The minimum number of books John needs to read, to please his mother, is 1.

(80) min_n [In all goal worlds: ∃x[#x = n & book(x) & read(j, x)]] = 1
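The predictions for (77) to (80) can be reproduced with a minimal sketch of goal-directed necessity as universal quantification over goal worlds. The three goal worlds and the helper names below are hypothetical; they merely instantiate the context in (76).

```python
# Hypothetical goal worlds, identified by how many books John reads in them.
# In the context described, John's mother is pleased iff he reads 10 or more.
goal_worlds = [10, 11, 12]

def reads_at_least(n):
    """Existential (many1) proposition: there is a group of n books John read."""
    return lambda count: count >= n

def need(prop, worlds=goal_worlds):
    """Goal-directed necessity: prop holds in all goal worlds."""
    return all(prop(w) for w in worlds)

print(need(reads_at_least(10)))  # True, as for (77a)
print(need(reads_at_least(1)))   # True, as for (77b)
print(need(reads_at_least(11)))  # False, as for (78)

# (80): the smallest n such that reading n books is needed
print(min(n for n in range(1, 51) if need(reads_at_least(n))))  # 1
```

The last line shows the unwelcome result: with existential propositions under a universal modal, the computed minimum requirement is 1 rather than 10.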
In general, theories such as that of von Fintel & Iatridou (2005) have the following consequence. If S is an entailment scale of propositions, and p is a proposition on this scale that is a minimal requirement for some goal proposition q, then a statement of the form “the minimum requirement to q is p” is predicted to be false unless p is the minimal proposition of S. This is a devastating prediction, namely that minimal requirements could never be expressed, since they would always correspond to the absolute minimum. One might think that what is going wrong in the example above is that I assume that when we talk about how many books John read we should be talking about existential sentences, that is about at least how many books John read. The alternative would be to describe the number of books John read by means of the counting quantifier many2, that is, how many books John read exactly. I’m afraid this only makes the problem worse. Here is a description of the relevant context in terms of the exact number of books that were read by John.

(81) a. In some but not all goal worlds: John read exactly 10 books.
     b. In no goal world: John read exactly 9 books.
     c. In no goal world: John read exactly 1 book.
     d. In some but not all goal worlds: John read exactly 11 books.
     e. In some but not all goal worlds: John read exactly 12 books.
Now, there is no number n such that John read exactly n books in all goal worlds. So, the smallest number of books John needs to read does not refer. The upshot is that there is no satisfactory analysis of examples like (74) under the assumptions made here. In general, it seems that, under standard assumptions, there is no satisfactory analysis of minimal requirements. Whatever way we find to fix the semantics of cases like (74), however, this fix will work to save the account of class B quantifiers too, for (74) was a literal spell-out of the proposed interpretation of similar sentences with at least, minimally, etc. It goes beyond the scope of this article to provide such a fix. The overview in (81), however, can help to indicate where we should look for a solution.20 Given that there is no goal world in which John read exactly n books for n’s smaller than 10, it follows that 10 is the minimal number of books John could read to please his mother. In other words, examples like (74) show that, in the scope of a minimality operator, modals that are lexically universal quantifiers get a weaker interpretation. That said, it is time to revisit example (64), repeated here as (82).

(82) Jasper should read minimally 10 books.
My proposal generated four logical forms, two of which were contradictory and two of which were blocked by a non-modified form. Let us revisit one of these logical forms, namely the one with a narrow scope modal and a doubly bound counting quantifier, represented in (83). The resulting truth-conditions were presented above as (84).

(83) [ minimally 10 λn [ should [ Jasper read [ n-many2 books ] ] ] ]

(84) min_n(□∃!x[#x = n & book(x) & read(j, x)]) = 10
What the discussion in the current section suggests is that it is a misunderstanding to assume that (83) is interpreted as (84), and that it looks like there is a mapping to a form like (85), instead.

(85) min_n(♦∃!x[#x = n & book(x) & read(j, x)]) = 10
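The difference between a universal and an existential construal of the modal over exact (many2) propositions can be sketched as follows. The goal worlds and function names are hypothetical and simply instantiate the context summarised in (81); the existential construal is the one the surrounding discussion points to, not an independently motivated rule.

```python
# Hypothetical goal worlds, identified by the exact number of books read.
goal_worlds = [10, 11, 12]

def reads_exactly(n):
    """Counting (many2) proposition: exactly n books were read."""
    return lambda count: count == n

def need(prop, worlds):
    """Universal construal: prop holds in all goal worlds."""
    return all(prop(w) for w in worlds)

def can(prop, worlds):
    """Existential construal: prop holds in some goal world."""
    return any(prop(w) for w in worlds)

needed = [n for n in range(1, 51) if need(reads_exactly(n), goal_worlds)]
allowed = [n for n in range(1, 51) if can(reads_exactly(n), goal_worlds)]

print(needed)        # []: no n is read exactly in all goal worlds
print(min(allowed))  # 10: the intuitive minimal requirement
```

With the universal construal the minimum is undefined, whereas the existential construal returns 10, matching the intuitive meaning of (82).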
This captures the intuitive meaning of (82). At this point I do not have anything to offer which provides the mechanism behind the generalisation that the combination of a universal modal and a minimality operator leads to a semantics which is existential in nature. What is relevant for the present purposes is that this is a general phenomenon. Interestingly, this means there are noteworthy connections to other areas where the semantics of a modal statement appear mysterious. Schwager (2005), for instance, notices that certain imperatives, which are standardly considered to have universal modal force, require a weaker semantics. Her key examples are German imperatives containing for example.

(86) Q: How can I save money?
     A: Kauf zum Beispiel keine Zigaretten!
        Buy for instance no cigarettes
        “For example, don’t buy any cigarettes!”
20 See Nouwen 2010a for a proposal along these lines.
In the context of the question asked in (86), the imperative does not convey that to comply with the advice, the hearer has to stop buying cigarettes. Instead, it is interpreted as stating that one of the things one could do to save money is to stop buying cigarettes. Thus, examples like these display a mechanism that is similar to the interaction of numeral modifiers and modality. The mysterious interaction of modified numerals and modals is moreover reminiscent of the interaction of modals and disjunction (Zimmermann 2000; Geurts 2005; Aloni 2007), especially since, on an intuitive level at least, a class B modified numeral like minimally 10 (and, quite obviously, 10 or more) appears to correspond to a disjunction of alternative cardinalities, with 10 as the minimal disjunct.21 A central issue in the literature on modals and disjunction is that classical semantic assumptions fail to capture the entailments of sentences where a disjunctive statement is embedded under a modal operator (Kamp 1973). A detailed comparison of this complex issue with the discussion of minimal requirements that I presented here, however, will be left to further research.

21 See Nilsen 2007 and Büring 2008 for suggestions along this line for the modifier at least only.

6 More about the A/B distinction

In this section, I will attempt to give some initial answers to three empirical questions concerning the distinction between class A and B modified numerals that is central to this article. First of all, I turn to the issue of which expressions go with which class. So far, I have restricted my attention mostly to, on the one hand, comparative quantifiers (as proto-typical class A expressions) and, on the other hand, superlative, minimality/maximality and up to-modified numerals (as representatives of class B). What about expressions like the prepositional over n or under n or the double bound between n and m or from n to m? Below, I will turn briefly to such expressions. A second empirical question concerns the validity of the examples used so far. Although I believe that the intuitions concerning the constructed examples in this article are rather clear, my plea for two kinds of modified numerals would still benefit from some independent objective support. Below, I present the results of a small corpus study that clearly reflects the distinction argued for in this article. Finally, this section will turn to the cross-linguistic generality of the proposal. I will provide data from a more or less random set of languages that suggest that the class A/B distinction is not a quirk of English or Germanic, or even Indo-European, but is, in fact, quite general.

6.1 Filling in class A and B

I will leave it an open question exactly which quantifiers belong to which class. Nevertheless, I can already offer some speculations on several quantifiers that I have so far not discussed. To start with disjunctive quantifiers, it appears that these are clear cases of class B expressions.

(87) a. #A triangle has 3 or more sides.
     b. #A triangle has 3 or fewer sides.
With disjunctive quantifiers in class B, one might wonder whether there are any examples of class A expressions which are not the familiar comparative quantifiers more/fewer/less than n. I think that locative prepositional modifiers are a likely candidate for class A membership, however. In fact, I believe that the locative/directional distinction in spatial prepositions corresponds to the class A/B distinction when these prepositions are used as numeral modifiers. Roughly, locative prepositions express the location of an object and are compatible with the absence of directionality or motion. Directional prepositions, on the other hand, cannot be used as mere indicators of location.

(88) Locative:
     a. John was standing under a tree.
     b. That cloud is hanging over San Francisco.
     c. Breukelen is located between Utrecht and Amsterdam.

(89) Directional:
     a. #John was standing up to here.
     b. #John was standing from here.
     c. #Breukelen is located from Utrecht to Amsterdam.
Now, compare (90a) and (90b).

(90) a. You can get a car for under €1000.
     b. You can get a car for maximally €1000.
The example in (90b) is somewhat strange, since it claims that the most expensive car you can buy is €1000. The example in (90a), in contrast, makes no such claim. It clearly has a weak reading: there are cars that are cheaper than €1000 and there might be more expensive ones too. As explained above, such weak readings are typical for class A quantifiers and do not occur with class B quantifiers.22 Furthermore, under seems perfectly compatible with definite amounts, such as in (91).

(91) The total number of guests is under 100. To be precise, it’s 87.
Class A is then not restricted to comparative constructions only. In fact, other locative prepositions seem to behave similarly to under.

(92) The total number of guests is between 100 and 150. It’s 122.
The locative complex preposition between . . . and . . . contrasts with its directional counterpart from . . . (up) to . . . , which behaves like a class B modifier: it is incompatible with definite amounts, as in (93), but felicitous if it relates to a range of values.

(93) #The ticket to the Stevie Wonder concert that I bought yesterday cost from €100 to €800.

(94) Tickets to the Stevie Wonder concert cost from €100 to €800.
It appears then that locative prepositions turn into class A modifiers, while directional ones turn into class B modifiers. A potential counterexample, however, is over, which apart from a (relatively rarely used) locative sense, as in (88b), has a directional sense, such as exemplified in (95).

(95) The bird flew over the bridge.
As a numeral modifier, however, over looks like a class A element. In (96), over 100 is clearly relating the precise weight 104kg with 100kg. Note in (97) how this contrasts with the directional 100 . . . and up, which is made felicitous by embedding it under an existential modal.

(96) He weighs over 100 kg. To be precise, he weighs 104 kg.

(97) a. #He weighs 100 kg and up.
     b. He is allowed to weigh 100 kg and up.

22 An anonymous reviewer notes a complication. It appears that under cannot take wide scope with respect to a modal. That is, it fails to display scope ambiguities such as the one in (20) above. For instance, (i) (which is an example given by the reviewer) is odd, since it misses an interpretation where the modified numeral has scope over require.
(i) #John is required to come up with under 6 brilliant ideas.
A potential explanation for why the numeral modifier over lacks a directional/class B sense23 is that the use of prepositions in numeral quantifiers is restricted to prepositions that are vertically oriented. This is connected to the observation of Lakoff & Johnson 1980 that cardinality is metaphorically vertical: more is higher (as in a high number), less is lower (as in a low number). Prepositions in modified numerals follow this metaphor.24 What is interesting about over, however, is that only its locative sense is vertical. Its directional sense, as in (95), rather expresses a mainly horizontal motion. This could explain why there is no class B sense numeral modifier over. Further clues that this analysis is on the right track come from Dutch, where the preposition over lacks a locative sense.

(98) #De wolk hangt over San Francisco.
     The cloud hangs over San Francisco.

(99) De vogel vloog over de brug.
     The bird flew over the bridge.
Instead of over in (98), boven (above) should be used for locative meanings.

(100) De wolk hangt boven San Francisco.
      The cloud hangs above San Francisco.
      ‘The cloud hangs over San Francisco.’
In Dutch, only boven can modify numerals. Over, which lacks a vertical sense, is unacceptable in modified numerals.

(101) Inflatie kan {boven / #over} de 10% zijn.
      Inflation can {above / over} the 10% be.
      ‘Inflation can be over 10%’

23 Thanks to Joost Zwarts for discussing this matter with me.

24 Up (to) and under are clearly vertical. Between and from . . . to are compatible with all possible axes.
I will refrain from attempting to offer further evidence for my suggestion that there is a correspondence between the locative/directional and the A/B distinction. In any case, it should be clear that the set of prepositional quantifiers offers an interesting range of contrasts that support the existence of two classes of modified numerals. To summarise this subsection, I tentatively put forward the following classification for English modified numerals.

(102) Class A
      (Positive:) more than n, over n
      (Negative:) fewer than n, less than n, under n
      (Neutral:) between n and m

(103) Class B
      (Positive:) at least n, minimally n, from n (up), n or more
      (Negative:) at most n, maximally n, up to n, n or fewer, n or less
      (Neutral:) from n to m
Missing from this classification are the negative comparative quantifiers like no more/fewer than 10. The reason for this is that the occurrence of negation complicates the comparison with other quantifiers. In fact, I think that such quantifiers are best treated as the compositional combination of a class A comparative modifier with a negative differential no. See Nouwen 2008b for the consequences of such a move and for more details on the interpretations available for sentences containing such quantifiers.

6.2 Support for the A/B distinction from a corpus study

I now turn to a small corpus study I conducted which supports the division between class A and class B modifiers. Recall that one of the central observations in favour of the distinction was connected to contrasts such as (104). Whereas (104a) can be interpreted with respect to a definite actual number of people invited by Jasper, (104b) does not allow such an interpretation and instead is evaluated in relation to what the speaker holds possible.

(104) a. Jasper invited fewer than 100 people. 87, to be precise.
      b. Jasper invited maximally 100 people. #87, to be precise.
I explained this contrast by proposing that upper bound class B quantifiers are indicators of maxima. The indication of the maximum of a single value leads to infelicity. Existential modals, however, introduce a range of (possible) values, which thereby license the application of the maxima indicator. For examples like (104b), where no overt modal is present, the hearer will have to accommodate an interpretation with respect to speaker possibility. Given that ♦-modals license the application of an upper bound class B modifier, one would expect, however, that class B modifiers co-occur with an overt modal operator relatively often. I conducted a corpus study to find out whether this expectation is fulfilled.

6.2.1 Method
I used the free service for searching the Corpus of Contemporary American English (COCA, 385 million words, a mix of fiction, science, newspaper and entertainment texts and spoken word transcripts) at americancorpus.org (Davies 2008). For each numeral modifier I took 100 quasi-random25 occurrences of the modifier with a numeral. For each of these cases I examined whether the modified numeral was in the scope of an explicit existential modal operator (such as can, could, might, possibly, allow, etc.). In other words, I only looked at the surface form and only counted the number of cases where a modal expression has a scope relation with a modified numeral. Given the theory presented in this article, the prediction is that this number is significantly higher with class B numerals than with class A expressions. I compared five modifiers: fewer than, under, between, at most and up to. Not all occurrences of these modifiers with a numeral in the corpus were taken into consideration. For instance, (105) was ignored because in this example up to is probably not a constituent.26 That is, this example contains the particle verb to lift up, rather than the verb to lift.

(105) Periodically we’d lift up to 60 kilometers where the temperatures and pressures are more like Earth’s.

I similarly disregarded occurrences of under n where under is a regular preposition rather than a preposition in the role of a numeral modifier. (For instance, examples resembling He was known under 2 different names.)

25 ‘Quasi’, since the results are given in chronological order and I would just take the earliest hits.

26 From: “To boldly go. . . ”, Donald Robertson (1994), Astronomy, Vol. 22, Iss. 12; pg. 34, 8 pgs.
Two kinds of modified numerals
6.2.2 Results

The results, summarised in the table in (106), support the proposal in this article. Here, P is the percentage of occurrences within an existential modal context, within a sample of 100 occurrences of that modifier.27

(106)  Class   Modifier     P
       A       fewer than   4%
       A       under        3%
       A       between      4%
       B       at most      23%
       B       up to        21%
The corpus thus shows a clear preference for combining class B quantifiers with existential modal operators, as was predicted.28 Whether the data are as clear as (106) for other expressions too remains to be seen. It will be difficult to extend this type of study to other modifiers. Maximally and from ... to, for instance, were included in the present corpus search, but did not yield enough occurrences to make a meaningful comparison.

6.3 The cross-linguistic generality of the distinction

The class A/B distinction is not a peculiarity of the English language. I will suggest in this subsection that, in fact, the distinction is quite general and that languages seem to fill in the two classes in roughly the same way. Dutch, for instance, mirrors the English data perfectly. To illustrate, (107) and (108) show the A/B distinction in a contrast between comparative and superlative quantifiers.

(107) Een driehoek heeft meer dan 1 zijde.
      A triangle has more than 1 side.

(108) #Een driehoek heeft minstens 2 zijdes.
      A triangle has at least 2 sides.

There are similar contrasts for other numeral modifiers. In a nutshell, the Dutch data suggest the two classes in (109) and (110), which parallel those of English.
27 I also counted the number of occurrences in a universal modal context. As would be predicted, this yielded no significant difference between class A and class B modifiers. For all modified numerals, this number was between 1 and 5.
28 The contrast between the Class A and Class B data is significant ($\chi^2 = 41.2$, df $= 1$, $p = 1.375 \times 10^{-10}$).
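The significance test reported in footnote 28 can be checked from the percentages in (106). The following is a minimal sketch, not the author's own procedure: it assumes exactly 100 sampled occurrences per modifier (as the Method section describes), pools the counts within each class into a 2x2 table, and applies a Pearson chi-square test without continuity correction. The pooling and the choice of test are assumptions of this sketch, and all names in the code are illustrative.

```python
import math

# Counts read off table (106), assuming 100 sampled occurrences per modifier.
# Pooling the modifiers within each class is an assumption of this sketch.
samples = {
    "A": {"fewer than": 4, "under": 3, "between": 4},
    "B": {"at most": 23, "up to": 21},
}
SAMPLE_SIZE = 100  # occurrences examined per modifier


def pooled_counts(per_modifier):
    """Return (modal, non_modal) counts pooled over one class."""
    modal = sum(per_modifier.values())
    total = SAMPLE_SIZE * len(per_modifier)
    return modal, total - modal


def pearson_chi2(table):
    """Pearson chi-square statistic for a 2x2 table, no continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


table = [list(pooled_counts(samples["A"])), list(pooled_counts(samples["B"]))]
chi2 = pearson_chi2(table)
p = p_value_df1(chi2)
print(f"chi2 = {chi2:.1f}, df = 1, p = {p:.3e}")
```

Under these assumptions the pooled table is 11 modal versus 289 non-modal occurrences for class A and 44 versus 156 for class B, and the statistic comes out at about 41.2, in line with the value the footnote reports.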
(109) Dutch Class A
      (Positive:) meer dan n (more than), boven de n (above the)
      (Negative:) minder dan n (fewer/less than), onder de n (under the)
      (Neutral:) tussen de n en de m (between the ... and ...)

(110) Dutch Class B
      (Positive:) ten minste n, minstens n, op z’n minst n (at least), vanaf n (from off), zeker n (certain), minimaal n (minimal)
      (Negative:) ten hoogste n, hoogstens n, op z’n hoogst n (at most), tot n (up to), maximaal n (maximal)
      (Neutral:) van n tot m (from ... to ...)
In other languages, we find similar data. For instance, the division between comparative and superlative modifiers appears to be cross-linguistically quite general. In Italian, for instance, the following contrast exists.

(111) Un triangolo ha più di 1 lato.
      A triangle has more than 1 side.

(112) #Un triangolo ha almeno 2 lati.
      A triangle has at least 2 sides.

In Chinese, there also exists a superlative form that behaves like a class B modifier.

(113) #Sanjiaoxing zui-shao you liang-tiao bian.
      triangle most-little have 2-CL side

On the other hand, there also exists an alternative form resembling English at least, which behaves differently. The form zhi-shao can be used in a similar way to English at least in sentences like At least it doesn’t rain!. Despite this parallel to the English superlative modifiers, the example in (114) appears to be fine, which suggests zhi-shao is of type A.

(114) Sanjiaoxing zhi-shao you liang-tiao bian.
      triangles to-little have 2-CL side
I leave a more detailed investigation of such data for further research. Whatever the outcome, however, the data first and foremost reveal that the type of contrasts that have been the central focus of this paper occur in Chinese and that, thereby, Chinese also appears to have the class A/B distinction.
Above, I suggested that prepositional numeral modifiers are to be divided in two classes in accordance with the locative/directional distinction that exists for their spatial meanings. The clearest case of a class B directional prepositional modifier in English is up to. In many other languages, one and the same particle is used for indicating spatial, numerical and temporal extremes. (In English, up to cannot be used as a temporal operator, for which until exists.) In Dutch, for instance, the preposition tot has these three functions. Crucially, in all these three domains tot displays class B characteristics.

(115) #Een driehoek heeft tot 10 zijdes.
      A triangle has up to/until 10 sides.

(116) #Je auto stond tot hier geparkeerd.
      Your car stood up to/until here parked.
      ‘#Your car was parked up to here’

(117) Je auto mag tot hier geparkeerd worden.
      Your car may up to/until here parked be.
      ‘You may park your car up to here’

(118) #Jasper kwam tot middernacht de kamer binnengelopen.
      J. came up to/until midnight the room inside-walked.
      ‘#J. entered the room until midnight’

(119) Jasper mag tot middernacht de kamer binnen komen lopen.
      J. may up to/until midnight the room inside come walk.
      ‘J. is allowed to enter the room until midnight’
Similar data exist for German bis (zu), Hebrew ’ad, Catalan fins a, Spanish hasta and Italian fino a. In fact, in Italian it appears that (120) is generally awkward, resisting a reading that connects to speaker’s possibility. However, it becomes acceptable if an overt modal verb is inserted.

(120) ??John ha invitato {al massimo / fino a} 50 amici.
      John has invited {at most / until} 50 friends.

(121) John può invitare {al massimo / fino a} 50 amici.
      John can invite {at most / until} 50 friends.
7 Conclusion
The central aim of this article has been to put forward the empirical observation that numeral modifiers come in two classes: those that relate to definite amounts (class A) and those that resist association with definite cardinality (class B). Theoretically, I proposed that underlying this distinction is a difference in the kind of relations numeral modifiers encode: either a simple comparison relation between numbers (class A) or a relation between a range of values and its minimum or maximum (class B). I furthermore showed how this theory can be implemented in a framework where numeral modifiers are treated as degree quantifiers. While there already existed analyses of both type A and type B modifiers, the class difference that was the central focus of this article has not yet been discussed. For the treatment of class A quantifiers in this article I adopted the proposal of Hackl 2001. My account of class B modifiers, on the other hand, is original. It can be compared to two closely related proposals on the semantics of superlative modifiers: Geurts & Nouwen 2007, where superlative modified numerals are proposed to lexically specify modal operators, and Krifka 2007b, where superlative quantifiers are proposed to be speech act modifiers. Neither work discusses the class A/B distinction, but I take it that both proposals, in view of the main observations of this article, can be viewed as accounts not just of superlative quantifiers, but of class B members in general. As suggested in section 5, my proposal is in certain respects quite close to Krifka’s. It differs greatly, however, from Geurts & Nouwen 2007 in the way the interaction between modified numerals and modality is accounted for. In a way, the current article as well as Krifka 2007b represent a position where quantifiers lexically specify quite minimal functions, which consequently leads to much of the work being done by pragmatic mechanisms (such as blocking).
For the proposal in Geurts & Nouwen 2007, on the other hand, the balance is different in that a much greater burden is placed on semantics. An in-depth comparison of these accounts of class B quantifiers, however, is left for further research.

References

Aloni, Maria. 2007. Free choice, modals, and imperatives. Natural Language Semantics 15(1). 65–94. doi:10.1007/s11050-007-9010-2.
Atlas, Jay David & Stephen C. Levinson. 1981. It-clefts, informativeness, and logical form: Radical pragmatics (revised standard version). In Peter Cole (ed.), Radical pragmatics, 1–61. New York: Academic Press.
Barwise, John & Robin Cooper. 1981. Generalized quantifiers and natural language. Linguistics and Philosophy 4(2). 159–219. doi:10.1007/BF00350139.
Blutner, Reinhard. 2000. Some aspects of optimality in natural language interpretation. Journal of Semantics 17(3). 189–216. doi:10.1093/jos/17.3.189.
Breheny, Richard. 2008. A new look at the semantics and pragmatics of numerically quantified noun phrases. Journal of Semantics 25(2). 93–140. doi:10.1093/jos/ffm016.
Büring, Daniel. 2008. The least at least can do. In Charles B. Chang & Hannah J. Haynie (eds.), Proceedings of WCCFL 26, 114–120. Somerville, Massachusetts: Cascadilla Press.
Corblin, Francis. 2007. Existence, maximality and the semantics of numeral modifiers. In Ileana Comorovski & Klaus von Heusinger (eds.), Existence: Semantics and syntax (Studies in Linguistics and Philosophy 84), Springer.
Corver, Norbert & Joost Zwarts. 2006. Prepositional numerals. Lingua 116(6). 811–836. doi:10.1016/j.lingua.2005.03.008.
Davies, Mark. 2008. The corpus of contemporary American English (COCA): 385 million words, 1990-present. Available online at http://www.americancorpus.org.
von Fintel, Kai & Sabine Iatridou. 2005. What to do if you want to go to Harlem: Anankastic conditionals and related matters. Ms. MIT, available on http://mit.edu/fintel/www/harlem-rutgers.pdf.
Geurts, Bart. 2005. Entertaining alternatives: disjunctions as modals. Natural Language Semantics 13(4). 383–410. doi:10.1007/s11050-005-2052-4.
Geurts, Bart. 2006. Take five: the meaning and use of a number word. In Svetlana Vogeleer & Liliane Tasmowski (eds.), Non-definiteness and plurality, 311–329. Amsterdam/Philadelphia: Benjamins. Pre-published version available at http://ncs.ruhosting.nl/bart/papers/five.pdf.
Geurts, Bart & Rick Nouwen. 2007. At least et al.: the semantics of scalar modifiers. Language 83(3). 533–559.
Grice, Paul. 1975. Logic and conversation. In Peter Cole & Jerry L. Morgan (eds.), Syntax and semantics 3: Speech acts, 41–58. New York: Academic Press.
Hackl, Martin. 2001. Comparative quantifiers: Department of Linguistics and Philosophy, Massachusetts Institute of Technology dissertation. doi:1721.1/8765.
Heim, Irene. 1991. Artikel und Definitheit. In Arnim von Stechow & Dieter Wunderlich (eds.), Semantik: Ein internationales Handbuch der zeitgenössischen Forschung, Berlin: de Gruyter.
Heim, Irene. 2000. Degree operators and scope. In Proceedings of SALT 10, Ithaca, NY: CLC Publications.
Horn, Laurence R. 1984. Toward a new taxonomy for pragmatic inference: Q-based and R-based implicature. In Deborah Schiffrin (ed.), Meaning, form and use in context, 11–42. Washington: Georgetown University Press.
Kamp, Hans. 1973. Free choice permission. Proceedings of the Aristotelian Society 74. 57–74.
Kennedy, Christopher. 1997. Projecting the adjective: the syntax and semantics of gradability and comparison: UCSD PhD. Thesis.
Kiparsky, Paul. 1973. "Elsewhere" in phonology. In Stephen R. Anderson & Paul Kiparsky (eds.), A festschrift for Morris Halle, 93–106. New York: Holt, Rinehart, & Winston.
Kiparsky, Paul. 1983. Word formation and the lexicon. In Proceedings of the 1982 Mid-America Linguistics Conference, 47–78. Lawrence, Kansas: University of Kansas.
Krifka, Manfred. 1999. At least some determiners aren’t determiners. In Ken Turner (ed.), The semantics/pragmatics interface from different points of view vol. 1, 257–291. Elsevier.
Krifka, Manfred. 2007a. Approximate interpretation of number words: A case for strategic communication. In Irene Vogel & Joost Zwarts (eds.), Cognitive foundations of communication, Amsterdam: Koninklijke Nederlandse Akademie van Wetenschappen.
Krifka, Manfred. 2007b. More on the difference between more than two and at least three. Paper presented at University of California at Santa Cruz, available at http://amor.rz.hu-berlin.de/~h2816i3x/Talks/SantaCruz2007.pdf.
Lakoff, George & Mark Johnson. 1980. Metaphors we live by. University of Chicago Press.
McCawley, James. 1978. Conversational implicature and the lexicon. In Peter Cole (ed.), Syntax and semantics 9: Pragmatics, New York: Academic Press.
Nilsen, Øystein. 2007. At least: Free choice and lowest utility. Paper presented at ESSLLI workshop on quantifier modification.
Nouwen, Rick. 2008a. Directionality in modified numerals: the case of up to. Semantics and Linguistic Theory 18. doi:1813/13056.
Nouwen, Rick. 2008b. Upper-bounded no more: the implicatures of negative comparison. Natural Language Semantics 16(4). 271–295. doi:10.1007/s11050-008-9034-2.
Nouwen, Rick. 2009. Two kinds of modified numerals. In T. Solstad & A. Riester (eds.), Proceedings of Sinn und Bedeutung 13, Available at http://www.let.uu.nl/~Rick.Nouwen/personal/papers/sub09.pdf, 15 pages.
Nouwen, Rick. 2010a. Two puzzles of requirement. In Maria Aloni & Katrin Schulz (eds.), The Amsterdam Colloquium 2009, Springer. http://www.hum.uu.nl/medewerkers/r.w.f.nouwen/papers/neccsuff.pdf.
Nouwen, Rick. 2010b. What’s in a quantifier? In Martin Everaert, Tom Lentz, Hannah de Mulder, Øystein Nilsen & Arjen Zondervan (eds.), The linguistic enterprise: From knowledge of language to knowledge in linguistics (Linguistik Aktuell/Linguistics Today 150), John Benjamins. Pre-published version available at http://www.hum.uu.nl/medewerkers/r.w.f.nouwen/papers/wiaq.pdf.
van Rooij, Robert. 2004. Signalling games select Horn strategies. Linguistics and Philosophy 27(4). 493–527. doi:10.1023/B:LING.0000024403.88733.3f.
Schwager, Magdalena. 2005. Exhaustive imperatives. In Paul Dekker & Michael Franke (eds.), Proceedings of the 15th Amsterdam Colloquium, Universiteit van Amsterdam.
Solt, Stephanie. 2007. Few more and many fewer: complex quantifiers based on many and few. In Rick Nouwen & Jakub Dotlacil (eds.), Proceedings of the ESSLLI2007 Workshop on Quantifier Modification.
Takahashi, Shoichi. 2006. More than two quantifiers. Natural Language Semantics 14(1). 57–101. doi:10.1007/s11050-005-4534-9.
Umbach, Carla. 2006. Why do modified numerals resist a referential interpretation? In Proceedings of SALT 15, 258–275. Cornell University Press.
Zimmermann, Thomas Ede. 2000. Free choice disjunction and epistemic possibility. Natural Language Semantics 8(4). 255–290. doi:10.1023/A:1011255819284.
Dr. R.W.F. Nouwen Utrecht Institute for Linguistics OTS Janskerkhof 13, NL-3512 BL Utrecht, the Netherlands
Semantics & Pragmatics Volume 3, Article 4: 1–42, 2010 doi: 10.3765/sp.3.4
Iffiness∗ Anthony S. Gillies Rutgers University
Received 2009-06-24 / First Decision 2009-08-07 / Revised 2009-09-13 / Second Decision 2009-09-21 / Revised 2009-10-14 / Accepted 2009-11-18 / Final Version Received 2010-01-17 / Published 2010-02-01
Abstract How do ordinary indicative conditionals manage to convey conditional information, information about what might or must be if such-and-such is or turns out to be the case? An old school thesis is that they do this by expressing something iffy: ordinary indicatives express a two-place conditional operator and that is how they convey conditional information. How indicatives interact with epistemic modals seems to be an argument against iffiness and for the new school thesis that if -clauses are merely devices for restricting the domains of other operators. I will make the trouble both clear and general, and then explore a way out for fans of iffiness.
Keywords: indicative conditionals, epistemic modality, if-clauses, conditionals, strict conditionals, dynamic semantics
1 An iffy thesis

One thing language is good for is imparting plain and simple information: there is an extra chair at our table or we are all out of beer. But, happily, we do not only exchange plain information about tables, chairs, and beer mugs. We also exchange conditional information thereof: if we are all out of beer, it is time for you to buy another round. That is very useful indeed. Conditional information is information about what might or must be, if such-and-such is or turns out to be the case. My target here has to do with how such conditional information manages to get expressed by indicative conditionals (not so called because anyone thinks that’s a great name but because no one can do any better). Some examples:

(1) a. If the goat is behind door #1, then the new car is behind door #2.
    b. If the No. 9 shirt regains his form, then Barça might advance.
    c. If Carl is at the party, then Lenny must also be at the party.

∗ This paper has been around awhile, versions of it circulating since 05.2006 and accruing a lot of debts of gratitude along the way. Chris Kennedy, Jim Joyce, Craige Roberts, Josef Stern, Rich Thomason, audiences at the Rutgers Semantics Workshop (October 2007), the Michigan L&P Workshop (Lite Version, November 2007), the Arché Contextualism & Relativism Workshop (May 2008), the University of Chicago Semantics & Philosophy Language Workshop (March 2009), and, especially (actually, especially∗), Josh Dever, David Beaver, Kai von Fintel, Brian Weatherson, and the anonymous S&P referees have all done their best trying to save me from making too many howlers. But too many is surely context dependent, so caveat emptor. This research was supported in part by the National Science Foundation under Grant No. BCS-0547814.
©2010 A. S. Gillies. This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).
Each of these is an ordinary indicative, two of them have epistemic modals in the consequent clause, and all of them express a bit of ordinary conditional information.1 What I am interested in is how well the indicatives play with the epistemic modals. What these examples say is plain. Take (1b). This says that, within the set of possibilities compatible with the information at hand, among those in which the star striker regains his form, some are possibilities in which Barça advance. Or take (1c). It says something about the occurrence of Lenny-is-at-the-party possibilities within the set of Carl-is-at-the-party possibilities: that, given the information at hand, every possibility of the latter stripe is also of the former stripe. So what sentences like these say is plain. How they say it isn’t. That’s my target here: How is it that the ifs in our examples manage to express conditional information and do so in a way compatible with how they play with epistemic modals?

The simplest story about how the ifs in our examples manage to express conditional information is that each of them expresses the information of a conditional. Which is to say: what these conditional sentences mean can be read-off the fact that if expresses a conditional operator. Let’s say that a story about if is iffy iff it takes if to express a bona fide operator, a bona fide iffy operator (that is, a conditional operator properly so called), and the same bona fide iffy operator in each of the sentences in (1). We will have to sharpen that up by saying what it means for an operator to be a conditional operator properly so called. But that is the gist: iffiness (a.k.a. the operator view) is the thesis that ordinary indicative conditionals manage to express conditional information because if expresses a conditional operator.

Depending on your upbringing, the operator view of if may well seem either obvious or obviously wrongheaded. More on that below. Either way, it is a hard line to maintain: how conditional sentences play with epistemic modals seems to refute it. A seeming refutation isn’t quite the same as an actual one, though. I will show that the refutation isn’t quite right by showing how fans of iffiness can account for what needs accounting for. But before showing how the operator view can be made to account for how ifs and modals interact I want to make it look for all the world like it can’t be done.

1 We ought to be careful to distinguish between conditional sentences (sentences of natural language), conditional connectives (two-place sentential connectives in some regimented language that may serve to represent the logical forms of conditional sentences), and conditional operators (relations that may serve as the denotations of conditional connectives).

2 Doom and how to avoid it (sketches thereof)
The operator view is an old school story about indicatives. It says that if expresses some relation between the (semantic value of the) antecedent and consequent. So if takes its place alongside other connectives and expresses an operator (the same operator) on the semantic values of the sentences it takes as arguments.2 To tell a story like this we have to say exactly what that operator is. But not just any telling will do. I want to show how our simple examples cause what looks like insurmountable trouble (doom, even) for any version of the operator view. Here’s an informal sketch of the trouble, what rides on it, and how, eventually, we can and ought to get out of the mess. Take this sketch as a promissory note that a formally precise version of all that can be given; the rest of the paper makes good on that.

Suppose if expresses the limit case conditional operator of material implication. Iffiness requires that in sentences like (1b) and (1c) either the epistemic modals outscope the conditionals or the conditionals outscope the modals. Neither choice gets the truth conditions right if the conditional operator is the horseshoe. That’s easy to see (and well known).3 Linguists grow up on arguments like that. That is one reason why even though the operator view is the first thing a logician thinks of, it is the last thing a linguist does.

2 If is a little word with a big history, a big history that we can’t adequately tour here. But there are guides for hire: for instance, Bennett (2003) and von Fintel (2009).
3 The material conditional analysis of ordinary indicatives is defended (in somewhat different ways) by, for example, Grice (1989), Jackson (1987), and Lewis (1976). A textbook version of this “no-scope” argument that has the horseshoe analysis as its target appears in von Fintel & Heim 2007.
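The horseshoe problem for the might-conditional (1b) can be made concrete on a toy model. The sketch below is only an illustration under stated assumptions: it takes might to be existential quantification over the live possibilities, uses a hypothetical three-world context in which every world is live, and takes as the target the reading the text assigns to (1b), namely that some live antecedent-worlds are consequent-worlds. The world names and helper functions are invented for the example.

```python
from typing import Callable, Dict, Set

World = str
Prop = Callable[[World], bool]

# Hypothetical toy model: three worlds, all of them live at every world.
WORLDS = ("w1", "w2", "w3")
LIVE: Dict[World, Set[World]] = {w: set(WORLDS) for w in WORLDS}


def atom(extension: Set[World]) -> Prop:
    """Proposition true at exactly the worlds in `extension`."""
    return lambda w: w in extension


def horseshoe(a: Prop, b: Prop) -> Prop:
    """Material implication, evaluated world by world."""
    return lambda w: (not a(w)) or b(w)


def might(a: Prop) -> Prop:
    """Assumed semantics: true at w iff some live possibility at w verifies a."""
    return lambda w: any(a(v) for v in LIVE[w])


def some_live_p_worlds_are_q(p: Prop, q: Prop) -> Prop:
    """The reading the text assigns to (1b): some live p-worlds are q-worlds."""
    return lambda w: any(p(v) and q(v) for v in LIVE[w])


p = atom({"w2"})   # the antecedent holds only at w2
q = atom(set())    # the consequent holds nowhere

wide = might(horseshoe(p, q))      # modal outscopes the conditional
narrow = horseshoe(p, might(q))    # conditional outscopes the modal
target = some_live_p_worlds_are_q(p, q)

for name, prop in [("wide", wide), ("narrow", narrow), ("target", target)]:
    print(name, prop("w1"))
```

Evaluated at w1, both horseshoe scopings come out true even though no live antecedent-world is a consequent-world, so the target reading is false. The sketch covers only the might case; it is not meant to reproduce the general argument the paper goes on to give.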
But (as I’ll show) this very same trouble holds no matter what conditional operator an iffy story says if expresses. To see that requires two things. First, we need to say in a precise way what counts as a conditional operator (Section 4). Given some pretty weak assumptions iffiness requires that if means all (well, all relevant). Second, there are some characteristic Facts about how indicatives and epistemic modals interact (Section 5). These neatly divide: there are some consistency facts and there are some intuitive entailment facts. The operator view requires that either the conditionals outscope the modals or the modals outscope the conditionals. Something general then follows: no matter what conditional operator we say if expresses, one scope choice is ruled out by the consistency facts, the other by the entailments (Section 6). That seems to be bad news for any fan of any version of the old school operator view.

And there seems to be more bad news in the offing since the operator view isn’t the only game in town (in some circles, it’s a game played only on the outskirts of town). The anti-iffiness rival (a.k.a. the restrictor view) is a new school approach. It embraces Kratzer’s thesis that if is not a connective at all: it doesn’t express an operator, a fortiori not an iffy operator, and a fortiori not the same iffy operator in each of our example sentences it figures in.4 Instead, says the restrictor analysis, if simply restricts other operators. In the cases we will care about, it restricts (possibly covert) epistemic modals. The restrictor view makes embarrassingly quick work of the data that spells such trouble for the operator view (Section 7).

But the success of the restrictor analysis is no argument against Chuck Taylors and skyhooks tout court. That’s because there are old school stories that say that if expresses a strict conditional operator over possibilities compatible with the context, and that it can do all the restricting that needs doing (Section 8). Once we see just how, we can look back and see more clearly what is at stake in the difference between new school and old, why iffiness is worth pursuing (Section 9), and how this version of the old school story relates to recent dynamic semantic treatments (Section 10).

4 The restrictor view gets its inspiration from Lewis’s (1975) argument that certain ifs (under adverbs of quantification) cannot be understood as expressing some conditional but rather serve to mark an argument place in a polyadic construction. Kratzer’s thesis is that this holds for if across the board. The classic references are Kratzer 1981, 1986. There is another rival, too: some take if to be an operator, but an operator that does not (when given arguments) express a proposition (Adams 1975; Gibbard 1981; Edgington 1995, 2008). Instead, they say, ifs express but do not report conditional beliefs on the part of their speakers. I will ignore this view here: it doesn’t really start off as the most plausible candidate, the trouble I make here about how ifs and modals interact makes it less plausible not more, and it will just take us too far afield.

3 Ground rules
Let’s simplify. Assume that meanings get associated with sentences by getting associated with formulas in an intermediate language that represents the relevant logical forms (lfs) of them. Thus a story, old school or otherwise, has to first say what the relevant lfs are and then assign those lfs semantic values. We will begin with an intermediate language L that has a conditional connective that will serve to represent the lfs of ordinary indicatives. So let L be generated from a stock of atomic sentence letters, negation (¬), and conjunction (∧) in the usual way. But L also has the connective (if ·)(·), and the modals must and might. What I have to say can be said about an intermediate language that allows that the modals mix freely with the formulas of the non-modal fragment of L but restricts (if ·)(·) so that it takes only non-modal sentences in its first argument. So assume that L is such an intermediate language. When these restrictions outlive their utility, we can exchange them for others.5

5 Conventions: p, q, r, ... range over sentences of L (subject to our constraints on L); i, j, k, ... range over worlds; and P, Q, R, ... range over sets of worlds. And let’s not fuss over whether what is at stake is the ‘if’ of English or the ‘if’ of L; context will disambiguate.

Iffiness requires that the if of English expresses something properly iffy. That leaves open just which conditional operator we say that the if of English means. But our choices here are not completely free, and some ground rules will impose some order on what we may say. These will constrain our choice by saying what must be true for a conditional operator to be rightfully so called. But before getting to that, I’ll start with what I will assume about contexts.

First, a general constraint: assume that truth-values, for the ifs and the modals (when we come to that) as well as for the boolean fragment of L, are assigned at an index (world) i with respect to a context. I will assume that W, the space of possible worlds, is finite. Nothing important turns on this, and it simplifies things. For the fragment of L with no modals and no ifs, contexts are idle. It will be the job of the modals to quantify over sets of live possibilities and the job of contexts to select these sets of worlds over which the modals do their job. What I want to say can be said in a way that is agnostic about just what kinds of things contexts are: all I insist is that, given a world, they determine a set of possibilities that modals at that world quantify over.6 The functions doing the determining need to be well-behaved. Given a context c (replete with whatever things contexts are replete with), an epistemic modal base C determined by it is just what we need:

Definition 3.1 (modal bases). Given a context c, C is a modal base (for c) only if:
$C = \lambda i.\,\{\, j : j \text{ is compatible with the } c\text{-relevant information at } i \,\}$

Since the only context dependence at stake here will be dependence on such bases, we can get by just as well by taking them to go proxy for bona fide contexts, granting them the honorific “contexts”, and relativizing the assignment of truth-values to index–modal base pairs directly. So we’ll be saying just which function $\llbracket\cdot\rrbracket^{C,i} : L \to \{0, 1\}$ is, where C represents the relevant contextual information. No harm comes from that, and it makes for a prettier view.7

6 The problems and prospects for iffiness are independent of just whose information in a context (speaker, speaker plus hearer, just the hearer, just the hearer’s picture of what the speaker intends, and so on) counts for selecting the domains for the modals to do their job, and whether or not that information is information-at-a-context at all. So let’s keep things simple here. If you’d rather be reading a paper which has these (and other) complexities at the forefront, see von Fintel & Gillies 2007, 2008a,b and the references therein.
7 Three comments. First: take $\llbracket\cdot\rrbracket^{C}$ to be shorthand for $\{\, i : \llbracket\cdot\rrbracket^{C,i} = 1 \,\}$. If p’s denotation is invariant across contexts (if $\llbracket p\rrbracket^{C} = \llbracket p\rrbracket^{C'}$ no matter the choice for $C$ and $C'$), let’s agree to conserve a bit of (virtual) ink and sometimes omit the superscript: so, e.g., the ifs I am focusing on here have non-modal antecedents, and so those antecedents will be context-invariant. Second: it’s a little misleading to say that the only context dependence is dependence on modal bases since we will want to allow the possibility that what worlds are relevant to an if at a world can vary across contexts. But, in fact, we can (and will) still leave room for that possibility by constraining how contexts and the sets of if-relevant possibilities relate. Third: if I had different ambitions, we couldn’t simplify quite like this. If the interaction at center stage were how ifs and quantifiers interact, or if the modals in the if/modal interaction were deontic, then we’d want our contexts to rightly characterize the kind of information at stake and taking them to determine sets of possibilities compatible with what is known would not do. But my ambitions here aren’t different from what they are.

But not just any function from indices to sets of indices will do as a (proxy) context. So we constrain C’s accordingly, requiring that they are well-behaved, that is, reflexive and euclidean:
Definition 3.2 (well-behavedness). C is well-behaved iff:
  i. $i \in C_i$ (reflexiveness)
  ii. if $j \in C_i$ then $C_i \subseteq C_j$ (euclideanness)
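Because W is assumed finite, Definition 3.2 and the closure property stated in Observation 3.1 can be checked by brute force on a small case. The following is a minimal sketch under that assumption: it enumerates every function from a three-world W to sets of worlds, keeps the reflexive and euclidean ones, and looks for a well-behaved base that fails to be closed. The three-world W and all function names are illustrative choices, and the check is a sanity test on one small model, not a substitute for the proof.

```python
from itertools import combinations, product

WORLDS = (0, 1, 2)  # a small finite W, chosen only for illustration


def powerset(worlds):
    """All subsets of `worlds`, as frozensets."""
    return [frozenset(c)
            for size in range(len(worlds) + 1)
            for c in combinations(worlds, size)]


def all_bases(worlds):
    """Yield every function C from worlds to sets of worlds, as a dict."""
    subsets = powerset(worlds)
    for choice in product(subsets, repeat=len(worlds)):
        yield dict(zip(worlds, choice))


def is_reflexive(C):
    # i is in C_i for every world i
    return all(i in C[i] for i in C)


def is_euclidean(C):
    # if j is in C_i then C_i is a subset of C_j
    return all(C[i] <= C[j] for i in C for j in C[i])


def is_well_behaved(C):
    return is_reflexive(C) and is_euclidean(C)


def is_closed(C):
    # if j is in C_i then C_j equals C_i
    return all(C[j] == C[i] for i in C for j in C[i])


well_behaved = [C for C in all_bases(WORLDS) if is_well_behaved(C)]
counterexamples = [C for C in well_behaved if not is_closed(C)]

# A reflexive but non-euclidean base, to show that closure is not automatic.
lopsided = {0: frozenset({0, 1}), 1: frozenset({1}), 2: frozenset({2})}

print(len(well_behaved), "well-behaved bases,", len(counterexamples), "counterexamples")
print("lopsided closed?", is_closed(lopsided))
```

On this model the well-behaved bases are exactly those that partition W into cells, every one of them is closed, and the lopsided base shows that reflexiveness alone does not guarantee closure.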
C represents a (proper) context only if it is well-behaved.

Observation 3.1. If C is well-behaved then $C_i$ is closed: well-behavedness implies that if $j \in C_i$, then $C_j = C_i$.

Proof. Suppose $j \in C_i$. Consider any $k \in C_j$. Since C is euclidean and $j \in C_i$, $C_i \subseteq C_j$. Since C is reflexive, $i \in C_i$ and thus $i \in C_j$. Appeal to euclideanness again: since $k \in C_j$, $C_j \subseteq C_k$; but $i \in C_j$ and so $i \in C_k$. And once more: since $i \in C_k$, $C_k \subseteq C_i$. And now reflexiveness: $k \in C_k$ and so $k \in C_i$. (The inclusion in the other direction just is euclideanness.)

Gloss $C_i$ as the set of live possibilities at i in C. That $C_i$ is closed means that the live possibilities in $C_i$ do not vary across worlds compatible with C.8

8 Given euclideanness, we could get by with different assumptions on C to the same effect. But reflexiveness is a constraint it makes sense to want since, when we come to them, epistemic modals (what might or must be in virtue of what is known) in a given context will quantify over the set of possibilities compatible with that context.

4 Conditional operators

By saying something about what must be true of an operator for it to be a conditional operator properly so called we thereby say something about what must be true for a story to be iffy. Taking if to express a bona fide conditional operator requires, minimally, two things.

Thing one: it requires, in the cases we’ll care about, that if such-and-such, then thus-and-so doesn’t take a stand on whether such-and-such is the case and so conditionals like that are typically happiest being uttered in circumstances in which such-and-such is compatible with the context as it stands when the conditional is issued. I will take this as a definedness condition on the semantics for our conditional connective.

Definition 4.1 (definedness). $\llbracket(\textit{if } p)(q)\rrbracket^{C,i}$ is defined only if p is compatible with $C_i$.

This is a weak constraint.9
9 The motivating idea isn’t novel (see, e.g., Stalnaker 1975): if it’s ruled out that p in C, and you want to say something conditional on p in C, then you should be reaching for a counterfactual, not an indicative. That can be implemented in any number of ways, including making it a presupposition of if-clauses (see, e.g., von Fintel 1998a).

Thing two: it requires that if expresses a relation between antecedent and consequent. Whether if such-and-such, then thus-and-so is true depends on whether the relevant worlds at which such-and-such is true bear the right relationship to the worlds where thus-and-so is true. Take an arbitrary conditional like (if p)(q) at i, in C. And let P and Q be the sets of antecedent and consequent possibilities so related by the if. Now we need to zoom in on the relevant worlds in P. So let Di be the set of if-relevant worlds at i. For if to express a conditional operator properly so called, its denotation must be a relation R between P-together-with-the-relevant-possibilities-Di and Q.

Di is the set of possibilities relevant for the if at i. Since Di is a function of i, different worlds may be relevant for one and the same if when evaluated at different worlds. But, depending on your favorite theory, Di may be a function of more than just i: it may be a function of i, of C, of p, of q, or of your kitchen sink. We will return to that shortly.

No matter your favorite theory, we can still ex ante agree to this much: i is always among the possibilities relevant for an if at i, and only possibilities compatible with the context are relevant for an if at i. That is: Di is the set of if-relevant worlds at i only if i ∈ Di and Di ⊆ Ci. The first requirement is a platitude: the facts at a world are always relevant to whether an indicative at that world is true. The second means that an indicative in a context is supposed to say something about the possibilities compatible with that context. Beyond this, what your favorite theory implementing the operator view says about Di may vary because what stories say counts as an if-relevant possibility varies. But what does not vary is that all such stories determine Di in a pretty straightforward way and so the denotation they assign to if can be put as a relation between the relevant antecedent possibilities and the consequent possibilities. Three examples:

Example 1 (variably strict conditional). Suppose your favorite story takes if to be a variably strict conditional based on some underlying ordering of possibilities (Stalnaker 1968; Lewis 1973). For every world i, let ≼i be an ordering of worlds, a relation of comparative similarity (at least) weakly centered on i. Given a conditional (if p)(q) at i in C, you will want to identify Di with the set of possibilities no more dissimilar than the most similar p-world to i, restricted by Ci.

Example 2 (strict conditional). Suppose your favorite Lewis-inspired story
comes not from D.K. but from C.I. You thus take if to be strict implication (restricted to C). But that, too, can be put in terms of orderings: your ordering ≼i is universal, treating all worlds the same. Whence it follows that, since the nearest p-world is the same distance from i as is every world, taking Di to be the set of possibilities no further from i than the nearest p-world amounts to taking Di to be the set of all worlds W, restricted by Ci.

Example 3 (material conditional). Suppose you are smitten by truth-tables, and your favorite incarnation of the operator view is the material conditional story. Equivalently: you will have a maximally discerning ordering (every world an island) and take Di to be the set of closest worlds to i simpliciter according to that ordering. For an if at i you will thus take Di to be {i}. (For an if at some other world j, even an if with the same antecedent and consequent as the one at i, take Dj to be {j}.)

Summing this all up: even before taking a stand on just what relation between relevant antecedent possibilities and consequent possibilities if must express in order to express a conditional operator properly so called, we know that it must still express such a relation. So let’s insist that we can put things that way, parametric on just how Di gets picked out and so parametric on what counts as “relevant” antecedent possibilities and so parametric on the details of your favorite theory:

Definition 4.2 (relationality). (if ·)(·) expresses a conditional only if its truth conditions can be put this way: if defined, ⟦(if p)(q)⟧^{C,i} = 1 iff R(Di ∩ P, Q) for some set of possibilities Di and relation R, where i ∈ Di and Di ⊆ Ci.

But not just any relation between Di ∩ P and Q counts as a conditional relation properly so called.
I insist on three minimal constraints on R, for any P and Q: (i) that Di ∩ P imposes some order on the set of Q’s so related; (ii) that Q matters to whether the relation holds; and (iii) that — plus or minus just a bit — only the relationship between the possibilities in Di ∩ P and the possibilities in Q matter to whether the relation holds. These are not controversial, but do bear some unpacking.10 First, the order imposed by the antecedent: 10 This general way of characterizing conditionality is not new: both the assumptions and the results here are inspired by van Benthem’s (1986: §4) investigation of conditionals as generalized quantifiers. There are, however, differences between his versions and mine.
Definition 4.3 (order). R is orderly iff:
i. R(Di ∩ P, P)
ii. R(Di ∩ P, Q) and Q ⊆ S imply R(Di ∩ P, S)
iii. R(Di ∩ P, Q) and R(Di ∩ P, S) imply R(Di ∩ P, Q ∩ S)
R is something (if ·)(·) at i could mean only if it is orderly.

Such R’s are precisely those for which the set of Q’s a Di ∩ P bears it to forms a filter that contains P.11 That is an aesthetic reason for constraining R this way. Such R’s also jointly characterize the basic conditional logic.12 The relational properties correspond to reflexivity, right upward monotonicity, and conjunction. That is another, only partly aesthetic, reason for constraining them this way.

Second, R must care about consequents. This is just the requirement that conditional relations, like quantifiers, be active:

Definition 4.4 (activity). R is active iff: if Di ∩ P ≠ ∅ then there is a Q and Q′ such that: R(Di ∩ P, Q) but not R(Di ∩ P, Q′)
R is something (if ·)(·) at i could mean only if it is active.

This means that R cares about how Di ∩ P relates to Q. So long as there are some relevant P-possibilities, there have to be some Q’s for which the relation holds and some for which it doesn’t.

And finally: R is a relation between the sets of possibilities. Thus if R holds at all between P-plus-the-relevant-possibilities-Di and the consequent-possibilities Q, R will hold between any two sets of things that play the right possibility role. Intrinsic properties of worlds don’t count for or against the relation holding. The idea is simple, the execution harder. That is because I have allowed you to choose your favorite iffy theory, and what goes into determining Di depends on your choice. What is important is this: suppose your favorite story posits some additional structure to modal space to find just the right worlds which, when combined with P, gives the set of worlds relevant for evaluating Q.
11 It follows straightaway that orderly R’s are fully reflexive in the sense that R(Di ∩ P, Di ∩ P).
12 See Veltman 1985 for a proof.

That means that your favorite story cares about how P relates to Q but also about the distribution of the worlds in P compared to the distribution in Q, for
example, perhaps insisting that it is the closest worlds in P to i that must bear R to Q. If we systematically swap possibilities for possibilities in a way that preserves the relevant structure, then the conditional relation ought to hold pre-swapping iff it holds post-swapping. And mutatis mutandis for Di: once the posited structure does its job determining Di, any systematic swapping of possibilities that leaves the domain untouched should also leave the conditional relation untouched.13 Where π is such a mapping and P a set of worlds, let π(P) be the set of worlds i such that π(j) = i for some j ∈ P. Then:

Definition 4.5 (quality). R is qualitative iff: R(Di ∩ P, Q) implies R(π(Di ∩ P), π(Q))
R is something (if ·)(·) at i could mean only if it is qualitative.

This does generalize the familiar constraint on quantifiers: it allows conditional operators to care about both the relationship between P and Q and also where the satisfying worlds are. If ≼i is the universal ordering then this requirement reduces to the more familiar quantitative one (restricted to Ci). And if Di = {i}, it trivializes.

I am insisting that a story is iffy only if the truth conditions for an indicative (if p)(q) at i in Ci can be put as a relation R between Di ∩ P and Q. And we have insisted that the relation be constrained in sensible ways: it must impose some order on sets of consequent possibilities, it must care about consequents, and it must not care about the intrinsic properties of possibilities. Each example of an instance of the operator view above (variably strict, strict, and material conditionals) lives up to these constraints. Still, it seems like for all we have said it is possible to take the conditional to be true just in case most/many/several/some/just the right possibilities in Di ∩ P are in Q.
But that is not so: given our constraints, if must mean all.14 13 This is the natural extension of the familiar requirement that quantifiers be quantitative: for Q to be a quantifier (with domain E) it must be that QE (A, B) iff QE (f (A), f (B)) where f is an isomorphism of E. Once we have structure to our domain, this will not do. The more general constraint is then to require that Q be invariant under O-automorphisms of the domain, where O is the ordering that imposes the posited structure. We can get by with slightly less: namely, stability under Di -invariant automorphisms. 14 Well, all relevant. This was first proved by van Benthem — see, e.g., van Benthem 1986. The version I give is simpler (we’re ignoring the infinite case) and a bit more general (slightly weaker assumptions); the proof is based on one in Veltman 1985, but generalizes it slightly.
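The claim that candidate meanings like most or some fall short can be illustrated by brute force on a small model. This is a sketch under stated assumptions rather than a reconstruction of any proof: the three-world universe, the function names, and the convention that `r_most` and `r_some` hold vacuously when Di ∩ P is empty are all choices made here for illustration. Only order (Definition 4.3) and activity (Definition 4.4) are checked; quality is left out.

```python
from itertools import combinations

W = frozenset({0, 1, 2})  # a toy universe of worlds (assumption)

def subsets(ws):
    ws = sorted(ws)
    return [frozenset(c) for r in range(len(ws) + 1) for c in combinations(ws, r)]

SUBSETS = subsets(W)

# Candidate relations R(X, Q), where X plays the role of Di ∩ P.
def r_all(X, Q):
    return X <= Q

def r_most(X, Q):
    return not X or 2 * len(X & Q) > len(X)

def r_some(X, Q):
    return not X or bool(X & Q)

def is_orderly(R):
    for D in SUBSETS:
        for P in SUBSETS:
            X = D & P
            if not R(X, P):                      # clause i
                return False
            for Q in SUBSETS:
                for S in SUBSETS:
                    if R(X, Q) and Q <= S and not R(X, S):         # clause ii
                        return False
                    if R(X, Q) and R(X, S) and not R(X, Q & S):    # clause iii
                        return False
    return True

def is_active(R):
    # for non-empty X the relation must hold of some Q and fail of another
    return all({R(X, Q) for Q in SUBSETS} == {True, False}
               for X in SUBSETS if X)
```

On this toy model inclusion passes both checks, while the most and some candidates fail clause iii of order: two consequents can each cover most (or some) of the relevant antecedent worlds without their intersection doing so.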
Observation 4.1. Assume R is a conditional relation properly so called. Then R(Di ∩ P, Q) iff Di ∩ P ⊆ Q.

Proof. I care about the left-to-right direction. Suppose, for reductio, that R(Di ∩ P, Q) but Di ∩ P ⊈ Q. What we’ll see is: (i) R(Di ∩ P, P ∩ Q); (ii) the world that witnesses that Di ∩ P ⊈ Q can be exploited (by quality) to show that no world in P ∩ Q plays a role in R(Di ∩ P, P ∩ Q) holding, from which it follows that R(Di ∩ P, ∅); (iii) from which it follows that Di ∩ P must be empty, a contradiction.

(i): By hypothesis R(Di ∩ P, Q). By order it follows that R(Di ∩ P, P) and hence that R(Di ∩ P, P ∩ Q).

(iia): Claim: Di ∩ P ∩ Q ≠ ∅. Proof of Claim: Assume otherwise. order guarantees that R(Di ∩ P, Di ∩ P). By hypothesis R(Di ∩ P, Q), and so by order R(Di ∩ P, Di ∩ P ∩ Q). Applying the assumption that Di ∩ P ∩ Q = ∅: R(Di ∩ P, ∅). Appeal to order again and we have that R(Di ∩ P, S) for any S. But then Di ∩ P must be empty (activity), contradicting the assumption that Di ∩ P ⊈ Q and proving the Claim.

(iib): Let j be a witness to Di ∩ P ⊈ Q. So j ∈ Di ∩ P but j ∉ Q. Now pick any confirming instance k (that is, any k ∈ Di ∩ P ∩ Q) and let π be the mapping that swaps k and j and leaves all else untouched:
• π(j) = k
• π(k) = j
• π(i) = i for every i ∉ {j, k}
By (i) R(Di ∩ P, P ∩ Q). Hence, by quality, R(π(Di ∩ P), π(P ∩ Q)). But π doesn’t affect Di ∩ P. So: R(Di ∩ P, π(P ∩ Q)). That is: R holds between Di ∩ P and both P ∩ Q and π(P ∩ Q). Hence, by order, it holds also between Di ∩ P and their intersection: R(Di ∩ P, (P ∩ Q) ∩ π(P ∩ Q)). But π(P ∩ Q) = ((P ∩ Q) \ {k}) ∪ {j}, so their intersection is (P ∩ Q) \ {k}. So: R(Di ∩ P, (P ∩ Q) \ {k}). Which is to say that k is irrelevant for R’s holding. But k was any world in Di ∩ P ∩ Q, so finiteness plus order implies R(Di ∩ P, ∅).

(iii): Appeal to order again: since R(Di ∩ P, ∅), it holds that for any S whatever R(Di ∩ P, S). Whence, by activity, it follows that Di ∩ P = ∅. And that contradicts the assumption that Di ∩ P ⊈ Q.

The intuitive version is just this: if R holds between Di ∩ P and Q then the former must be included in the latter. That is because if things didn’t go that way then the witnessing counterexample world could play the role of any one of the confirming worlds. But that would mean that confirming worlds
play no role. Nothing like that could be something a conditional properly so called could mean. So Di ∩ P must be included in Q after all.

5 Three facts

Iffiness requires that if is a conditional connective that expresses a conditional operator, and that pretty much means that if has to mean all. It requires that no matter what other operators we might find in its neighborhood. That spells trouble because of three simple Facts about how indicative conditionals and epistemic modals play together.15

I have lost my marbles. I know that just one of them (Red or Yellow) is in the box. But I don’t know which. I find myself saying things like: (2)
Red might be in the box and Yellow might be in the box. So, if Yellow isn’t in the box, then Red must be. And if Red isn’t in the box, then Yellow must be.
Conjunctions of epistemic modals like Red might be in the box and Yellow might be in the box are especially useful when the bare prejacents partition the possibilities compatible with the context. The first fact is simply that if s are consistent with such conjunctions of modals. Fact 1 (consistency). Suppose S1 and S2 partition the possibilities compatible with the context. Then the following are consistent: i. might S1 and might S2 ii. if not S1 , then must S2 ; and if not S2 , then must S1 15 Three notes about the Facts. First: “Facts” may be laying it on a little thick. The judgments are robust, and the costs high for denying the generalizations as I put them. That’s all true even if what we may say about them is a matter for disputing. But it does not much matter: what I really care about is three characteristic seeming facts about if s, mights, and musts that at first blush look like the kind of thing our best story ought to answer to. So let’s agree to take them at face value and see where that leads. Later, if your English breaks with mine or if your old school pride overwhelms, you can deny the Facts or explain them away as your preferences dictate. Second: the Facts may seem eerily familiar. They are not far removed from the sorts of examples of the interplay between adverbs of quantification and if -clauses in Lewis 1975 and Kratzer 1986. That is no coincidence, as we’ll see (briefly) in Section 7. Third: since the operator view isn’t the only game in town and since predicting the Facts is something any story (old school or otherwise) must do, we should state the Facts in a way that is agnostic on the iffy thesis. So the Facts characterize what is true of sentences in (quasi-)English, not necessarily what is true of their lfs in our regimented intermediate language.
I do not know whether Carl made it to the party. But wherever Carl goes, Lenny is sure to follow. So if Carl is at the party, Lenny must be — Lenny is at the party, if Carl is. We just glossed an if with a commingling epistemic must by a bare if with no (overt) modal at all. Thus: (3)
a. If Carl is at the party, then Lenny must be at the party.
b. ≈ If Carl is at the party, then Lenny is at the party.
This pair has the ring of (truth-conditional) equivalence. Fact 2 below records that. But there are also arguments for thinking that the truth-value of (3a) should stand and fall with the truth-value of (3b). For suppose that such if s validate a deduction theorem and modus ponens, and that must is factive.16 The left-to-right direction: assume that (3a) is true. And consider the argument: (4)
If Carl is at the party, then Lenny must be at the party. Carl is at the party. So: Lenny is at the party.
The first two sentences — intuitively speaking — entail the third. And that is pushed on us by the assumptions: from the first two sentences we have (by modus ponens) that Lenny must be at the party, which by factivity entails Lenny is at the party. Apply the deduction theorem and we have that If Carl is at the party, then Lenny must be at the party entails If Carl is at the party, then Lenny is at the party. Since we have assumed that (3a) is true, it follows that (3b) must be. There are spots to get off this bus to be sure — by denying either modus ponens or by denying the factivity of must — but those costs are high.17 The right-to-left direction: assume that (3b) is true and consider: 16 Remember that, for now, we are dealing with properties of sentences of (quasi-)English not properties of those sentences’ lfs in some regimented language. The argument here isn’t meant to convince you of Fact 2, it is meant to make some of the costs of denying the data vivid. Geurts (2005) also notes that bare conditionals and their must-enriched counterparts are “more or less equivalent”. 17 You have to troll some pretty dark corners of logical space for deniers of modus ponens, but that’s not true for deniers of the factivity of must. That view has something of mantra status among linguists (philosophers are surprised to hear that). Mantra or not, it is wrong. For an all-out attack on it see von Fintel & Gillies 2010. Here is just one sort of consideration: if must p didn’t entail p (because must is located somewhere below the top of the scale of epistemic strength), then you’d expect must to combine with only in straightforward ways the way might can:
(5)
If Carl is at the party, then Lenny is at the party. Carl is at the party. So: Lenny must be at the party.
This is as intuitive an entailment as we are likely to find. Whence it follows by the deduction theorem that If Carl is at the party, then Lenny is at the party on its own entails If Carl is at the party, then Lenny must be at the party. So if (3b) is true so must be (3a): that’s why the former seems to gloss the latter.

Fact 2 (if/must). Conditional sentences like these are true in exactly the same scenarios:
i. if S1, then must S2
ii. if S1, then S2

The glossing that this pattern permits is a nifty trick. But that is only half the story since if can also co-occur with epistemic might. The interaction between if and might is different and underwrites a different glossing.

Alas, my team are not likely to win it all this year. It is late in the season and they have made too many miscues. But they are not quite out of it. If they win their remaining three games, and the team at the top lose theirs, my team will be champions. But our last three are against strong teams and their last three are against cellar dwellers. Still, my spirits are high: if we win out, we might win it all. Put another way, within the (relevant) my-team-wins-out possibilities (of which there are some) lies a my-team-wins-it-all possibility; there is a my-team-wins-out possibility that is a my-team-wins-it-all possibility. But that is just to say that there are (relevant) my-team-wins-out-and-wins-it-all possibilities. Maybe not very many, and maybe not so close, but some.18 Apart from keeping hope alive, the example also illustrates that we can gloss an indicative with a co-occurring epistemic might by a conjunction under the scope of might: (6)
a. If my team wins out, they might win it all.
b. ≈ It might turn out that my team wins out and wins it all.

(i) a. I didn’t say it is raining, I only said it might be raining.
b. #I didn’t say it is raining, I only said it must be raining.
But it doesn’t.
18 For the record: the Cubs. Please don’t bring it up.
That gloss sounds pretty good. And for good reason: conjunctions that you would expect to be happy if the truth of (6a) and (6b) could come apart are not happy at all: (7)
a. #If my team wins out, they might win it all; moreover, they can’t win out and win it all. b. #It might turn out that my team wins out and wins it all, and, in addition there’s no way that if they win out, they might win it all.
That gives us the third Fact about how ifs play with modals.19

Fact 3 (if/might). Sentences like these are true in exactly the same scenarios:
i. if S1, then might S2
ii. it might be that [S1 and S2]

It’s now a matter of telling some story, iffy or otherwise, that answers to these Facts. Old school operator views will have trouble with them; the new school restrictor view predicts them trivially.

6 Scope matters
The operator view takes if to express an operator, an iffy operator, and the same iffy operator no matter whether we have a co-occurring epistemic modal or not and no matter whether the modal is must or might. In cases where there is a modal, scope issues have to be sorted out. Take a sentence of the form (8)
If S1 then modal S2
and let S1′ (S2′) be the L-representation for sentence S1 (S2), and modal the L-representation for modal. We have a short menu of options for the relevant lf for such a sentence: either the narrowscoped (9a) or the widescoped (9b):

(9) a. (if S1′)(modal S2′)
b. modal (if S1′)(S2′)

19 There is a wrinkle: Fact 3 implies that if S1, then might S2 is true in just the same spots as if S2, then might S1. Seems odd:
(i) a. If I jump out the window, I might break a leg.
b. If I break a leg, I might jump out the window.
The first is true, the second an overreaction. I intend, for now, to sweep this under the same rug that we sweep the odd way in which Some smoke and get cancer/Some get cancer and smoke don’t feel exactly equivalent even though Some is a symmetric quantifier if ever there was one. (The rug in question seems to be the tense/aspect rug; similar considerations drive von Fintel’s (1997) discussion of contraposition of bare conditionals.)
If you want to put your lfs in tree form, be my guest: opting for narrowscoping means opting for sisterhood between modal and S2; opting for widescoping means opting for sisterhood between modal and if S1 then S2.

The trouble for the operator view is that, since if has to express inclusion, neither choice will do. One choice for scope relations seems ruled out by consistency (Fact 1), the other by if/must (Fact 2) and if/might (Fact 3). To put the trouble precisely, we need one more ground rule. Contexts, we said, have the job of determining the domains the modals quantify over. Modals, I’ll assume, do their job in the usual way by expressing their usual quantificational oomph over those domains: must (at i, with respect to C) acts as a universal quantifier, and might as an existential quantifier, over Ci.

Definition 6.1 (modal force).
i. ⟦might p⟧^{C,i} = 1 iff Ci ∩ ⟦p⟧^C ≠ ∅
ii. ⟦must p⟧^{C,i} = 1 iff Ci ⊆ ⟦p⟧^C

Now suppose we plump for narrowscoping. Then, given the ground rules, we cannot predict the consistency of the likes of (2) and that means that we cannot square iffiness with Fact 1. That’s true no matter how you fill in the particulars of the iffy story. Here is the narrowscoped analysis of my lost marbles. We have a modal and two indicatives:

(10) a. Red might be in the box and Yellow might be in the box.
might p ∧ might q
b. If Yellow isn’t in the box, then Red must be.
(if ¬q)(must p)
c. If Red isn’t in the box, then Yellow must be.
(if ¬p)(must q)
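The narrowscoped readings in (10) can be tested exhaustively on a small model. What follows is a sketch under assumptions introduced here for illustration: four worlds (two where Red is in the box, two where Yellow is), a context in which Ci is the whole set at every world, modal force as in Definition 6.1, and if read as inclusion over Di ∩ P. The names `candidate_domains`, `narrow_if` and `all_true_somewhere` are hypothetical. The search tries every Di with i ∈ Di ⊆ Ci.

```python
from itertools import combinations

WORLDS = frozenset({"r1", "r2", "y1", "y2"})
P = frozenset({"r1", "r2"})      # worlds where Red is in the box
Q = WORLDS - P                   # worlds where Yellow is in the box
C = {i: WORLDS for i in WORLDS}  # assumed context: every world is live everywhere

def might(prop):
    # worlds i at which C[i] overlaps the proposition
    return frozenset(i for i in WORLDS if C[i] & prop)

def must(prop):
    # worlds i at which C[i] is included in the proposition
    return frozenset(i for i in WORLDS if C[i] <= prop)

def candidate_domains(i):
    # every D_i with i in D_i and D_i a subset of C[i]
    rest = sorted(C[i] - {i})
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            yield frozenset(extra) | {i}

def narrow_if(D_i, antecedent, consequent):
    # if as inclusion: the relevant antecedent worlds are consequent worlds
    return D_i & antecedent <= consequent

def all_true_somewhere():
    for i in WORLDS:
        if i not in might(P) or i not in might(Q):   # (10a)
            continue
        for D_i in candidate_domains(i):
            b = narrow_if(D_i, WORLDS - Q, must(P))  # (10b)
            c = narrow_if(D_i, WORLDS - P, must(Q))  # (10c)
            if b and c:
                return True
    return False
```

In this model the search comes back empty: since i itself is always in Di, one of the two conditionals would need the modal consequent to be true at i, and (10a) rules that out.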
Any good story has to allow that the bundle of if s in (10b) and (10c) is consistent with the conjunction in (10a). But, assuming narrowscoping,
this (even without taking a stand on how we choose Di and so without taking a stand on what counts as the set of if-relevant worlds) seems to be beyond what can be delivered by any version of the operator view.

Observation 6.1. Suppose p and q partition the possibilities in C and that (10a) is true. Then the (narrowscoped) sentences in (10) can’t all be true.

Proof. Suppose otherwise: that the regimented formulas in L are all true at a live possibility, say i, with respect to C. Just one of my marbles is in the box. So any world in Ci is either a p-world or a q-world, but not both; C is well-behaved, so i ∈ Ci. That leaves two cases.

case 1: i ∈ ⟦¬q⟧^C. By hypothesis ⟦(if ¬q)(must p)⟧^{C,i} = 1, and so Di ∩ ⟦¬q⟧^C ⊆ ⟦must p⟧^C. Since i ∈ Di, it then follows that i ∈ ⟦must p⟧^C, which is to say ⟦must p⟧^{C,i} = 1. Thus Ci has only p-worlds in it. But that is at odds with the second conjunct of (10a): that might q is true at i guarantees a q-world, hence a ¬p-world, in Ci.

case 2: i ∈ ⟦¬p⟧^C. By hypothesis ⟦(if ¬p)(must q)⟧^{C,i} = 1, and so Di ∩ ⟦¬p⟧^C ⊆ ⟦must q⟧^C. Since i ∈ Di, it then follows that i ∈ ⟦must q⟧^C, which is to say ⟦must q⟧^{C,i} = 1. Thus Ci has only q-worlds in it. But that is at odds with the first conjunct of (10a): that might p is true at i guarantees a p-world, hence a ¬q-world, in Ci.

Narrowscoping has the virtue of taking plain and simple lfs to represent indicatives with apparently epistemic modalized consequents. But it has the vice of not squaring with consistency. This is true no matter the particulars of your favorite version of the operator view.20

So suppose instead that co-occurring modals scope over the if-constructions in which they occur. Now it is the generalizations if/must and if/might that cause trouble. Again, that’s true no matter how Di is chosen and so no matter what counts as an if-relevant possibility and so no matter what conditional operator we say if expresses. Here is a widescope analysis of the key examples (3) and (6):

(11) a. If Carl is at the party, then Lenny must be at the party.
must (if p)(q)
b. If Carl is at the party, then Lenny is at the party.
(if p)(q)
20 Thus by supplying how your favorite version of the operator view says Di is determined, you can use this proof to show how that story (assuming narrowscoping) departs from Fact 1.
(12) a. If my team wins out, they might win it all.
might (if p)(q)
b. It might turn out that my team wins out and wins it all.
might (p ∧ q)
The facts are that must (if p)(q) ≈ (if p)(q) and that might (if p)(q) ≈ might (p ∧ q). What we need is a semantics for the conditional connective (if ·)(·) that can predict both patterns. But paths that might lead to one pretty reliably lead away from the other.

So far I have insisted that i is always among the relevant worlds to an if at i (i ∈ Di) and also that only worlds compatible with the context are relevant (Di ⊆ Ci). Here I am in good company. But perhaps there is even more interaction between domains of if-relevant worlds and contexts. Some theories say that there can be no difference in domains for conditionals between worlds compatible with the context, others disagree:

Definition 6.2 (egalitarianism & chauvinism).
i. A semantics is egalitarian iff whenever j ∈ Ci then Dj = Di.
ii. A semantics is chauvinistic iff it is not egalitarian.

egalitarianism requires domains to be invariant across worlds compatible with a context. That means that distinctions between worlds made by D’s (this world is relevant, that one isn’t) are unaffected when those distinctions are made from behind the veil of ignorance (we don’t know which world compatible with C is the actual world). Chauvinistic theories allow differences from behind the veil to matter to what possibilities get selected for domainhood, and thus allow that a possibility j ∈ Ci may determine a different set of relevant possibilities than does i. Once we have agreed that, for any i, Di selects from the worlds compatible with C and must include i, it is a further question whether we want to be egalitarians or chauvinists.21

21 The history of the conditional is littered with chauvinists. The material conditional analysis is chauvinistic. It says that the only possibility relevant for the truth of an if at i in C is i itself. And similarly for an if at j: only j matters there.
Thus, except in the odd case where the context rules out uncertainty altogether, we will have that Dj ≠ Di, for any choice of i and j compatible with C. A variably strict conditional analysis, based on a family of orderings (one for each world), is chauvinistic if we do not impose an “absoluteness” condition: the requirement that orderings around any two worlds be the same. (Lewis (1973: §6) discusses absoluteness in the process of characterizing the V-logics.) What to say about absoluteness is optional and so there is room for agnosticism about chauvinism. Stalnaker’s (1975) treatment of indicatives is not officially agnostic about chauvinism, but that is only because he requires that ≼i induce a total order that is centered pointwise on i, and that rules against absoluteness. But the pragmatic mechanisms he develops there are agnostic on the chauvinism question: what he says about how the context constrains selection functions is compatible with both egalitarianism and chauvinism. I myself see little reason to go for chauvinism.

It is hard to be a chauvinist. That is because, assuming the particulars of the chauvinistic theory are compatible with there being a (p ∧ ¬q)-world in Ci but not in Di, no such story will predict if/must. The data say that bare indicatives and their must-enriched counterparts are true in the same scenarios. But chauvinism plus widescoping guarantees that the domain the if quantifies over is properly included in the domain its must-enriched counterpart quantifies over. Thus the former says something strictly weaker than (true in strictly more spots than) the latter. That is at odds with Fact 2:

Observation 6.2. Suppose that Di ⊂ Ci. There are scenarios in which the widescoped (11b) is true but (11a) isn’t. Thus chauvinism plus widescoping can’t explain Fact 2.

Proof. Consider a (p ∧ ¬q)-world, call it j, and suppose that Ci does, but Di does not, contain j. Then every possibility in Di ∩ p is in q and the plain if is true (at i, in C): ⟦(if p)(q)⟧^{C,i} = 1. But not the widescoped must-enriched if. That is because there is a world in Ci, namely j, such that not every possibility in Dj ∩ p is a possibility in q. Thus ⟦(if p)(q)⟧^{C,j} = 0 and so it is not true that the plain if is true at every world in Ci and so ⟦must (if p)(q)⟧^{C,i} = 0.

Again, this is true no matter how we fill in the particulars of the operator view. If we widescope the modals, and the story is chauvinistic, it will not square with Fact 2.

Given widescoping, egalitarianism fares no better. But here it is if/might (Fact 3) that causes trouble. This time the issue is triviality: must-enriched ifs are true iff their might-enriched counterparts are. Here is why. First, egalitarianism implies that Di covers Ci:

Observation 6.3. egalitarianism implies that Di = Ci.

Proof. Assume otherwise. Di ⊆ Ci, so there must be a j ∈ Ci such that j ∉ Di. By egalitarianism, Dj = Di. But we know that j ∈ Dj. Contradiction.
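The countermodel behind Observation 6.2 can be made concrete with two worlds. This is only a sketch under assumptions chosen here: world i is a (p ∧ q)-world, world j is a (p ∧ ¬q)-world, both are live in C, the chauvinistic domains are the material-conditional ones (Di = {i}, Dj = {j}), and the egalitarian domains simply equal Ci. All function and variable names are hypothetical.

```python
WORLDS = frozenset({"i", "j"})
P = frozenset({"i", "j"})        # antecedent worlds
Q = frozenset({"i"})             # consequent worlds; j is a (p and not-q)-world
C = {w: WORLDS for w in WORLDS}  # both worlds are live at both worlds

def plain_if(D, w):
    # if as inclusion over the relevant antecedent worlds at w
    return D[w] & P <= Q

def must_if(D, w):
    # widescoped must: the plain if holds at every world in C[w]
    return all(plain_if(D, v) for v in C[w])

def might_if(D, w):
    # widescoped might: the plain if holds at some world in C[w]
    return any(plain_if(D, v) for v in C[w])

chauvinist = {w: frozenset({w}) for w in WORLDS}  # D_w = {w}
egalitarian = dict(C)                             # D_w = C_w

# Chauvinism plus widescoping: the bare if is true at i,
# its must-enriched counterpart is not.
bare_at_i = plain_if(chauvinist, "i")
must_at_i = must_if(chauvinist, "i")

# With D_w = C_w the plain, must-enriched and might-enriched ifs
# all receive the same value at every world of this model.
collapse = all(
    plain_if(egalitarian, w) == must_if(egalitarian, w) == might_if(egalitarian, w)
    for w in WORLDS
)
```

The chauvinistic assignment separates the bare if from its must-enriched counterpart, which is the mismatch with Fact 2; the egalitarian assignment keeps them together but, in this model at least, makes the might-enriched if stand and fall with them as well.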
Thus if $D_i$ reflects some measure of proximity to $i$, egalitarianism implies that the underlying ordering is centered not pointwise on $i$ but setwise on the worlds compatible with $C$. So egalitarianism implies that if is really a strict conditional. That’s true whether $D_i$ is derived from some underlying ordering or not: if, might and must quantify over the same domain of possibilities, and an if is true at $i$ iff all of the antecedent worlds in that domain are consequent worlds.22 That means that an if at $i$ (in $C$) is true iff the corresponding material conditional is true at every possibility compatible with $C$. And that means that such an if is true at $i$ iff the material conditional, widescoped by must, is true at $i$.23 But from this degree of fit between $D_i$ and $C_i$ it follows straightaway that no two possibilities compatible with $C$ can differ over an if issued in $C$. There is solidarity among ifs; they stand and fall together:

Observation 6.4. egalitarianism implies $[\![\textit{if}\ p\ q]\!]^{C,i} = 1$ iff for every $j \in C_i$: $[\![\textit{if}\ p\ q]\!]^{C,j} = 1$

Proof. $[\![\textit{if}\ p\ q]\!]^{C,i} = 1$ iff $D_i \cap p \subseteq q$. By egalitarianism: iff, for any $j \in C_i$, $D_j \cap p \subseteq q$. Equivalently: iff, for any $j \in C_i$, $C_j \cap p \subseteq q$; that is, iff for every such $j$, $[\![\textit{if}\ p\ q]\!]^{C,j} = 1$.

Given widescoping, any story with this equivalence will have a hard time saying why conditionals like (12a) seem to be true iff modalized conjunctions like (12b) are and so will have trouble with if/might. That is because, given the usual story for the modals (Definition 6.1), we get triviality:

Observation 6.5. egalitarianism implies: $[\![\textit{might if}\ p\ q]\!]^{C,i} = 1$ iff $[\![\textit{must if}\ p\ q]\!]^{C,i} = 1$

Thus widescoping plus egalitarianism implies that must if p q is true iff might(p ∧ q) is. Not even Cubs fans fall for that.

22 Strictness makes it easy to understand why negating a bare conditional sounds so much like saying the counterexample might obtain. For more on context-dependent strictness (of different flavors) see, e.g., Veltman 1985, von Fintel 1998a, 2001, and Gillies 2004, 2007, 2009.

23 Thus, given well-behavedness (Definition 3.2), explaining Fact 2 is easy for widescoping egalitarians: if p q is equivalent to must (p ⊃ q) which, given well-behavedness, is equivalent to must must (p ⊃ q). And that, in turn, is equivalent to must if p q.
A. S. Gillies
Proof. Note that $[\![\textit{might if}\ p\ q]\!]^{C,i} = 1$ iff the plain conditional if p q is true somewhere in $C_i$. But by Observation 6.4 the plain if is true somewhere in $C_i$ iff it is true everywhere in $C_i$. And it is true everywhere in $C_i$ just in case $[\![\textit{must if}\ p\ q]\!]^{C,i} = 1$.

That trivializes rather than explains Fact 3. No matter the particulars, widescoping plus egalitarianism can’t predict Fact 3.

Iffiness requires conditionals to have a structure that does not play nice with modals. That’s because no way of resolving the relative scopes will work.24 What causes the trouble is that the operator view requires if to mean all. But the Facts don’t seem to allow that. If we widescope, then sometimes that seems all right, namely if the modal in question happens to have universal quantificational force. But when the modal is existential, if looks more like conjunction than inclusion. And narrowscoping seems no better, rendering all manner of coherent bits of discourse inconsistent.

That is pretty bad news for the operator view. True, we could save iffiness by denying some Fact or other. (With defenders like that who needs detractors?) Adding insult to injury: the Facts were chosen not at random but with an eye to the competition. They are Facts that the new school restrictor view predicts so easily hardly anyone has noticed.

24 Could we go for widescoping must-enriched indicatives and narrowscoping might-enriched indicatives? For all we’ve said so far: yes. But that strategy faces an uphill battle. It is ad hoc, three times over. First because there is no good reason to think we should settle for anything less than a uniform story. Second because it is not obvious what it says we should do when we consider ways in which the modal might be embedded. What if the modal is can’t (a possibility modal scoped under negation) or needn’t (a universal under negation)?

(i) a. If my team doesn’t win out, they can’t win it all.
    b. If the gardener didn’t do it, the culprit needn’t be the butler.

Do we widescope or narrowscope these? What principled story is there that predicts, rather than stipulates, that the first is widescoped and the second narrowscoped? Third because as soon as we consider epistemic modals that lie between the existential might and the universal must (like probably and unlikely) it is doomed to failure anyway.

7 Iffiness lost

Lewis (1975) famously argued that ifs appearing in certain quantificational constructions (under adverbs of quantification) are not properly iffy, that the if in

(13) {Always / Sometimes / Never}, if a man owns a donkey, he beats it.

is not a conditional connective with a conditional operator as its meaning but instead acts as a non-connective whose only job is to mark an argument-place for the adverb of quantification. The relevant structure is not some Q-adverb scoped over a conditional nor some conditional with a Q-adverb in its consequent, he said, but instead something like

(14) Q-adverb + if-clause + then-clause
The job of the if-clause in (13) is merely to restrict the domain over which the adverb (unselectively) quantifies, and allegedly that restricting job is a job that cannot be done by treating if as a conditional connective with a conditional operator as its meaning. If Q-adverb is universal, maybe an iffy if will work; but if it is existential, then conjunction does better.

I want to set the issue about adverbial (and adnominal, for that matter) quantifiers aside for two reasons. First because I doubt the allegation sticks. But that is another argument for another day.25 And second because it will do us good to focus on simple cases.

Still, the trouble for the operator view that is center stage here does look quite a lot like the problem Lewis pointed out. We have to make room for interaction between if-clauses and the domains our modals quantify over. But that interaction is tricky. That is because it looks impossible to assign if the same conditional meaning (thereby taking its contribution to be an iffy one) in all of our examples. Indeed, when the modal is universal a conditional relation looks good; but when the modal is existential, conjunction looks better. This is pretty much the same trouble Lewis saw for ifs occurring under adverbs of quantification, and led him to conclude that such ifs do not express operators at all (and a fortiori not conditional operators).26

Just as with adverbial quantifiers, there is a fast and easy solution to the problem if we get rid of the old school idea that if is a conditional connective and plump instead for anti-iffiness. The most forceful way of putting the anti-iffy thesis is Kratzer’s (1986: 11):

The history of the conditional is the history of a syntactic mistake. There is no two-place “if. . . then” connective in the logical forms for natural languages. “If”-clauses are devices for restricting the domains of various operators.

The thesis is that the relevant structure for the conditionals at issue here is not some modal scoped over a conditional nor some conditional with a modal in its consequent, but is instead something like

(15) modal + if-clause + then-clause

Or, closer to the way we’ve been putting things:

(16) modal(if-clause)(then-clause)

25 There are ways to get the restricting job done after all. The operator-based stories in, e.g., Belnap 1970, Dekker 2001, and von Fintel & Iatridou 2003 all manage.

26 For recent and more thorough-going defenses of ifs-as-quantifier-restrictors see, e.g., Kratzer 1981, 1986 and von Fintel 1998b. But see Higginbotham 2003 for a dissenting view.
The job of the if-clause is to restrict the domain over which the modal quantifies. So instead of searching for a conditional operator properly so called that if contributes whether it commingles with a modal or not, we search for an operator for if to restrict. And, for indicative conditionals, we do not have to search far: the operators are (possibly covert) epistemic modals.27 So it is the modals, not the ifs, that take center stage. They have logical forms along the lines of modal(p)(q), with the usual quantificational force:

Definition 7.1 (modal force, amended).
i. if defined, $[\![\textit{might}\,(p)(q)]\!]^{C,i} = 1$ iff $(C_i \cap p) \cap [\![q]\!]^{C} \neq \emptyset$
ii. if defined, $[\![\textit{must}\,(p)(q)]\!]^{C,i} = 1$ iff $(C_i \cap p) \subseteq [\![q]\!]^{C}$

This plus two assumptions gets us the now-standard and familiar restrictor view. It easily accounts for consistency (Fact 1), if/must (Fact 2), and if/might (Fact 3).

First assumption: assume that when there is no if-clause and so no restrictor is explicit (as in Blue might be in the box or Yellow must be in the box) the first argument in the lf of the modal is filled by your favorite tautology (⊤). In those cases there is nothing to choose between an analysis that follows our earlier Definition 6.1 and an analysis that follows Definition 7.1, and so the latter generalizes the former. Second assumption: assume that the job of if-clauses is to make a (nontrivial) restrictor explicit. If there is no overt modal, as in a bare conditional, the if restricts a covert must. Collecting the pieces:

Definition 7.2 (anti-iffiness). For any sentence $S$, let $S'$ be its lf in our intermediate language. Then:
i. A sentence of the form if $S_1$ then $S_2$ has lf:
   a. modal$(S_1')(R')$ if $S_2'$ = modal $R'$
   b. must$(S_1')(S_2')$ otherwise
ii. Truth conditions as in Definition 7.1

Return to the case of my missing marbles. Taking the if-clauses to be restrictors in the example:

(17) a. Red might be in the box and Yellow might be in the box.
        might(⊤)(p) ∧ might(⊤)(q)
     b. If Yellow isn’t in the box, then Red must be.
        must(¬q)(p)
     c. If Red isn’t in the box, then Yellow must be.
        must(¬p)(q)

27 Officially, our intermediate language now also goes in for a change. L had one-place modals might and must and a two-place connective (if ·)(·). That won’t do to represent the restrictor view. Instead, we need the two-place modals might (·)(·) and must (·)(·) and have no need for a special conditional connective that expresses a conditional operator.
It’s modals all the way down. And the modals can all be true together.

Observation 7.1 (anti-iffiness & consistency). Assume anti-iffiness (Definition 7.2). And suppose, in $C$, that (17a) is a partitioning modal. Then the sentences in (17) can all be true together.

Proof. I am in $i$ and there are just two worlds compatible with the facts I have, $i$ and $j$. The first is a (p ∧ ¬q)-world, the second a (q ∧ ¬p)-world. The restrictors in (17a) are trivial, so it is true at $i$ iff $C_i$ has a p-world in it and a q-world in it; $i$ witnesses the first conjunct, $j$ the second. The restricting if-clause of (17b) makes sure that the must ends up quantifying only over the ¬q-worlds compatible with $C$: (17b) is true at $i$ iff all of the worlds in $C_i \cap \neg q$ are p-worlds. And the only one, $i$, is. Similarly for the must in (17c): it quantifies over the ¬p-worlds in $C_i$, checking to see that they are all q-worlds.

It is just as easy to square this picture with if/must (Fact 2) and if/might (Fact 3). Here are the examples with their new school lfs:
(18) a. If Carl is at the party, then Lenny must be at the party.
        must(p)(q)
     b. If Carl is at the party, then Lenny is at the party.
        must(p)(q)

(19) a. If my team wins out, they might win it all.
        might(p)(q)
     b. It might turn out that my team wins out and wins it all.
        might(⊤)(p ∧ q)
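The restrictor truth conditions of Definition 7.1 are simple enough to check mechanically. The following is only a sketch under stated assumptions: propositions are modeled as Python sets of worlds, the "if defined" proviso is ignored, and the names `might`, `must`, `W`, `U`, and `subsets` are illustrative choices rather than anything in the intermediate language L. It verifies that the lfs in (17) are jointly true in the two-world marbles context and that the lfs in (19a) and (19b) agree on a small space of domains and propositions.

```python
from itertools import combinations

def might(Ci, p, q):
    """Two-place might: some p-world in Ci is a q-world."""
    return bool(Ci & p & q)

def must(Ci, p, q):
    """Two-place must: every p-world in Ci is a q-world."""
    return (Ci & p) <= q

# Marbles scenario (illustrative encoding): at world "i" Red is in the box,
# at world "j" Yellow is. W doubles as the tautology.
W = frozenset({"i", "j"})
Ci = W
red, yellow = frozenset({"i"}), frozenset({"j"})

s17a = might(Ci, W, red) and might(Ci, W, yellow)
s17b = must(Ci, W - yellow, red)    # must(not-q)(p)
s17c = must(Ci, W - red, yellow)    # must(not-p)(q)

def subsets(xs):
    xs = list(xs)
    return [frozenset(c) for n in range(len(xs) + 1) for c in combinations(xs, n)]

# if/might, as in (19): might(p)(q) agrees with might(T)(p and q)
# for every domain D and every pair of propositions over a 3-world space.
U = frozenset(range(3))
if_might_ok = all(might(D, p, q) == might(D, U, p & q)
                  for D in subsets(U) for p in subsets(U) for q in subsets(U))
```

The exhaustive check is over a toy space of three worlds, so it illustrates the pattern rather than proving anything general.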
Observation 7.2 (anti-iffiness, if/must, & if/might). Assume anti-iffiness (Definition 7.2). Then:
i. If $S_1$, then $S_2$ ≈ If $S_1$, then must $S_2$
ii. If $S_1$, then might $S_2$ ≈ might [$S_1$ and $S_2$]

Proof. anti-iffiness assigns the same lf to a bare conditional like (18b) and its must-enriched counterpart (18a): must(p)(q). It would thus be hard, and pretty undesirable, for their truth conditions to come apart. That explains if/must.

Now consider the if-as-restrictor analysis of the sort of examples behind if/might in (19). If (19b) is true at $i$ in $C$ then $C_i$ has a (p ∧ q)-world in it. But then that same world must be in $C_i \cap p$. It is a q-world, and that will witness the truth of (19a) at $i$. Going the other direction: if (19a) is true at $i$ in $C$, then there are some q-worlds in $C_i \cap p$. Any one of those will do as a (p ∧ q)-world in $C_i$, and that is sufficient for (19b) to be true at $i$. That explains if/might.

These explanations are easy. And, given the trouble for the operator view, it looks like the only game in town is to say that if doesn’t express an operator and so not an iffy operator. That stings.

8 Iffiness regained
The problem for iffiness is that there is an interaction between if-clauses and the domains our modals quantify over. That is an interaction that seems hard to square with the thesis that if is a binary connective with a conditional meaning if we assume that it has the same meaning in each of the cases we care about here.
But we have overlooked a possibility. We insisted that for a story to be iffy it must say that if p q at $i$ in $C$ expresses some relation R between $D_i \cap P$ and $Q$, where $D_i \cap P$ is the set of (relevant) worlds where the antecedent is true and $Q$ the set of worlds where the consequent is true. That is all right. But we unthinkingly assumed that the context relevant for figuring out what these sets of worlds are must always be $C$ just because that was the context as it stood when the if was issued. That was a mistake. Setting it straight sets the record straight for old school iffiness.

The Ramsey test (the schoolyard version, anyway) is a test for when an indicative conditional is acceptable given your beliefs. It says that if p q is acceptable in belief state $B$ iff q is acceptable in the derived or subordinate state $B$-plus-the-information-that-p. You zoom in on the portion of $B$ where p is true and see whether q throughout that region. But our job is to say something about the linguistically encoded meanings of indicatives not to dole out epistemic advice.

Still, the Ramsey test (plus or minus just a bit) can be turned into a strict conditional story about truth-conditions. Here’s how (in three easy steps). Step one: sentences get truth-values at worlds in contexts. So swap $C$’s for $B$’s. Step two: embrace egalitarianism. The worlds compatible with the context are the if-relevant worlds. These first two steps give us a strict conditional analysis of indicatives, requiring that if p q is true at $i$ in $C$ iff all the p-possibilities in $C_i$ are possibilities at which q is true. But truth depends on both index and context. Question: What context is relevant for checking to see whether q is true at these p-possibilities? Answer: The Ramseyan derived or subordinate context $C$-plus-the-information-that-p, or $C + p$ for short. That’s step three.
The Ramsey test invites us to add the information carried by the antecedent to the contextually relevant stock of information $C$ and check the fate of the consequent. What we fans of iffiness overlooked was that this assigns two jobs to if-clauses, and we only paid attention to one of them. One job is the index-shifting job. The if-clause tells us to shift to various alternative indices (the antecedent-possibilities compatible with $C$) to see whether the consequent is true at them. This job is familiar and most versions of the operator view do a fine job tending to it. But there is another job. When we add the information carried by the antecedent to $C$ we also add to the context relevant for figuring out whether the consequent is true. That is the context-shifting job. The if-clause tells us to shift to an alternative derived or subordinate state to see whether the consequent is true. We fans of old school iffiness made the mistake of only making sure that the first job
got done.

So far this isn’t a story about the meaning of if (much less an iffy one). It is a blueprint for how to construct a semantics that gives a uniform and iffy meaning to ifs whether or not those ifs mix and mingle with other operators. To construct a story using it we need to take a stand on what it means to add the information carried by an antecedent to the contextually relevant stock of information. Taking that stand depends on the aspirations of the theory since different constructions may depend on different sorts of contextually available information and there is every reason to think that augmenting information of different sorts goes by different rules. But our aspirations are pretty modest here: how indicatives interact with epistemic modals. So we can opt for an equally simple stand on what it means to add information to a context.

Even before getting all the details laid out, we can see how the doubly shifty behavior of if-clauses will be able to predict what needs predicting about how indicatives and epistemic modals interact. The difference between interpreting q against the backdrop of the prior context $C$ and against the backdrop of $C + p$ is a difference that makes no difference if q has no context sensitive bits in it. No wonder we missed it! But if q does have context sensitive bits in it (like might or must, whose semantic value depends non-trivially on $C$) then this is a difference that makes all the difference. For example: consider a modal like must q. The contexts $C$ and $C + p$ may well determine different sets of possibilities. Since must q depends exactly on whether that set of possibilities has only q-worlds in it, we then get a difference. Thus if must q is the consequent of an indicative, context-shiftiness matters.

Here is the simplest way of constructing a semantics around the blueprint:

Definition 8.1 (iffiness + shiftiness).
i. if defined, $[\![\textit{if}\ p\ q]\!]^{C,i} = 1$ iff $C_i \cap [\![p]\!]^{C} \subseteq [\![q]\!]^{C+p}$
ii. $C + p = \lambda i.\, C_i \cap [\![p]\!]^{C}$

Such a story about if is iffy: if expresses a relation between relevant antecedent and consequent worlds and that relation lives up to all the constraints we insisted on earlier. Hence if means all. And it expresses that no matter whether it scopes over a universal modal or an existential modal or no modal at all in the consequent. It is also doubly shifty. It is index-shifty since the truth of if p q at $i$ depends on the truth of the constituent q
at worlds other than $i$. It is context-shifty since the truth of if p q in $C$ depends on the truth of the constituent q in contexts other than $C$.

The if/modal interactions that were such trouble were only trouble because we forgot to keep track of the context-shifting job of if-clauses. And doing that, even in the simple context-shifting in Definition 8.1, is enough to make iffiness sit better with the Facts.

I know that just one of my marbles is in the box (either Red or Yellow) but do not know which it is. Narrowscope the modals. Then all of these can be true together:

(20) a. Red might be in the box and Yellow might be in the box.
        might p ∧ might q
     b. If Yellow isn’t in the box, then Red must be.
        if ¬q must p
     c. If Red isn’t in the box, then Yellow must be.
        if ¬p must q
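A small evaluator makes the role of the context shift in (20) concrete. This is a minimal sketch, not the paper's official formalism: formulas are encoded as nested tuples, worlds as frozensets of the atoms true at them, a context as a dict from worlds to sets of worlds, and the "if defined" proviso is ignored. The operator `"if_flat"` is a hypothetical foil that evaluates the consequent in the unshifted context; it is introduced here only for contrast.

```python
def ext(phi, C, S):
    """Worlds in S at which phi is true in context C."""
    return frozenset(j for j in S if holds(phi, C, j))

def holds(phi, C, i):
    """Truth of phi at world i in context C (dict: world -> set of worlds)."""
    op, *args = phi
    if op == "atom":
        return args[0] in i
    if op == "not":
        return not holds(args[0], C, i)
    if op == "and":
        return all(holds(a, C, i) for a in args)
    if op == "might":
        return bool(ext(args[0], C, C[i]))
    if op == "must":
        return ext(args[0], C, C[i]) == C[i]
    if op in ("if", "if_flat"):
        p, q = args
        dom = ext(p, C, C[i])                      # C_i ∩ [[p]]^C
        # "if" shifts to C + p (Definition 8.1); "if_flat" keeps C.
        Cq = {w: ext(p, C, Cw) for w, Cw in C.items()} if op == "if" else C
        return ext(q, Cq, dom) == dom
    raise ValueError(op)

# Marbles context: exactly one of Red, Yellow is in the box.
i, j = frozenset({"red"}), frozenset({"yellow"})
S = frozenset({i, j})
C = {i: S, j: S}
p, q = ("atom", "red"), ("atom", "yellow")

s20a = ("and", ("might", p), ("might", q))
s20b = ("if", ("not", q), ("must", p))
s20c = ("if", ("not", p), ("must", q))
flat20b = ("if_flat", ("not", q), ("must", p))

all_true = all(holds(s, C, i) for s in (s20a, s20b, s20c))
flat_true = holds(flat20b, C, i)
```

With the shift, all three sentences come out true at `i`; with the foil, (20b) fails because must p is checked against the whole of C, which still contains the Yellow world.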
Observation 8.1 (iffiness & consistency). Assume iffiness + shiftiness (Definition 8.1). Suppose p and q partition the possibilities in $C$. The (narrowscoped) sentences in (20) can all be true together in $C$.

Proof. Here is why. Suppose, for concreteness and without loss of generality, that $C$ contains just two worlds: $i$, a (p ∧ ¬q)-world and $j$, a (q ∧ ¬p)-world. So (20a) is true at $i$. Now take (20b). It is true at $i$ in $C$, given iffiness + shiftiness, iff all the possibilities in $C_i \cap \neg q$ are possibilities that $[\![\textit{must}\ p]\!]^{C+\neg q}$ maps to true. Thus we have to see whether the following holds:

if $k \in C_i \cap \neg q$ then $[\![\textit{must}\ p]\!]^{C+\neg q,\,k} = 1$

Iff this is so is (20b) true at $i$ in $C$. But $C_i \cap \neg q = \{i\}$, so we have to see whether or not $[\![\textit{must}\ p]\!]^{C+\neg q,\,i} = 1$. Equivalently: the if is true at $i$ iff $(C + \neg q)_i \subseteq p$. And since $i$ is in fact a p-world the if is true at $i$ in $C$. And mutatis mutandis for (20c).

The operator view isn’t at odds with consistency after all. It is also easy to predict if/must (Fact 2) and if/might (Fact 3). Here are the narrowscoped analyses of the motivating examples:

(21) a. If Carl is at the party, then Lenny must be at the party.
        if p must q
     b. If Carl is at the party, then Lenny is at the party.
        if p q

(22) a. If my team wins out, they might win it all.
        if p might q
     b. It might turn out that my team wins out and wins it all.
        might (p ∧ q)
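The equivalences behind (21) and (22) can also be checked by brute force on a toy model. The sketch below rests on several assumptions that are not in the text: well-behaved contexts are modeled as constant on their live set, the "if defined" proviso is taken to require that the antecedent be compatible with $C_i$ (so such cases are skipped), and the tuple encoding and all function names are illustrative.

```python
from itertools import combinations, product

def ext(phi, C, S):
    """Worlds in S at which phi is true in context C."""
    return frozenset(j for j in S if holds(phi, C, j))

def holds(phi, C, i):
    op, *args = phi
    if op == "atom":
        return args[0] in i
    if op == "and":
        return all(holds(a, C, i) for a in args)
    if op == "might":
        return bool(ext(args[0], C, C[i]))
    if op == "must":
        return ext(args[0], C, C[i]) == C[i]
    if op == "if":                                  # Definition 8.1
        p, q = args
        Cp = {w: ext(p, C, Cw) for w, Cw in C.items()}   # C + p
        dom = ext(p, C, C[i])
        return ext(q, Cp, dom) == dom
    raise ValueError(op)

atoms = ("p", "q")
W = [frozenset(a for a, v in zip(atoms, vals) if v)
     for vals in product([True, False], repeat=len(atoms))]
P, Q = ("atom", "p"), ("atom", "q")

def contexts():
    """Contexts that are constant on a nonempty live set (an assumption)."""
    for n in range(1, len(W) + 1):
        for live in combinations(W, n):
            S = frozenset(live)
            yield {w: S for w in S}

if_must_ok = True
if_might_ok = True
for C in contexts():
    for i in C:
        if not ext(P, C, C[i]):
            continue  # antecedent incompatible with C_i: treated as undefined
        if_must_ok &= holds(("if", P, Q), C, i) == holds(("if", P, ("must", Q)), C, i)
        if_might_ok &= (holds(("if", P, ("might", Q)), C, i)
                        == holds(("might", ("and", P, Q)), C, i))
```

Dropping the compatibility check would let the vacuous case through, where if p might q holds trivially while might (p ∧ q) does not; that is why the sketch treats those contexts as undefined.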
Observation 8.2 (iffiness, if/must, & if/might). Assume iffiness + shiftiness (Definition 8.1). Then:
i. If $S_1$, then $S_2$ ≈ If $S_1$, then must $S_2$
ii. If $S_1$, then might $S_2$ ≈ might [$S_1$ and $S_2$]

Proof. If must q is true then so is q, no matter the world and context. So it’s easy to see that when (21a) is true so is (21b). Now suppose (21b) is true at $i$ (with respect to $C$). Then all of the p-worlds in $C_i$ are q-worlds ($C_i \cap p \subseteq [\![q]\!]^{C+p}$). But if they are all worlds at which q is true, then $i$ (and so, given well-behavedness, every world in $C_i$) is equally a world at which must q is true (with respect to $C + p$). And so (21a) is true, at $i$ in $C$, if (21b) is. That’s just what if/must requires.

if/might is no different. The noteworthy part is seeing how iffiness + shiftiness predicts that when (22a) is true then so is (22b). Note that (22a) is true at $i$ (with respect to $C$) just in case all of the p-worlds in $C_i$ are worlds where might q, evaluated in $C + p$, is true. By well-behavedness we have that:

if $j, k \in C_i \cap p$ then $(C + p)_j = (C + p)_k = C_i \cap p$

If there is a q-world in $(C + p)_j$, then might q is true throughout this set. Since might q is an existential modal, if it is true with respect to $C + p$ it must also be true with respect to $C$. (Updating contexts with + is monotone.) Whence it follows that the if with a commingling might is true at $i$ iff among the p-worlds in $C_i$ lies a q-world. And any such q-world will do to witness the truth of might (p ∧ q) at $i$ in $C$. That’s just what if/might requires.

Indicatives play well with epistemic modals. That interaction seemed hard to square with old school views that take if to express a conditional operator. No way of sorting out the relative scopes between the modals and the conditional seemed right. But that is because we mistakenly thought that antecedents of conditionals only have one job to do. They shift the index at which we check to see if the consequent is true. But they also contribute to the
context that is relevant when we do that checking. Once we let antecedents do both their index-shifting and context-shifting jobs we can safely narrowscope and there is no special problem posed for old school iffiness. The if in if p modal q means the same iffy thing (inclusion!), saying that all the (relevant) worlds where p is true are worlds where modal q is true. That’s so whether the oomph of modal is universal or existential or null and does nothing to get in the way of explaining the Facts. That is something we fans of iffiness ought to dig.28

28 Before I said that I wanted to ignore issues about how this version of the operator view can meet Lewis’s challenge about the ways if-clauses and adverbs of quantification interact, saving that argument for another day. I want to stick to that (it really is an argument for another day), but the general idea is straightforward. First, adjust the kinds of information represented by a context so that we can sensibly quantify over individuals and the events they participate in. Second, allow that quantificational domains can be restricted by material in if-clauses; those domains play the role of the subordinate or derived context. Adverbs of quantification appear under the conditional and have their usual denotations.

9 What is at stake

Given the success of anti-iffiness why bother with iffiness at all? A fair question. Given the context-shifting I’m advocating for fans of iffiness, what’s the difference between old school and new school? Another fair question. I owe some answers.

I make three (not wholly unrelated) claims. First, even if the shifty version of the operator view and the basic version of the restrictor view covered the same ground, there is still reason to explore the operator view. Second, the views have different conceptual roots and different allegiances. Third, the views don’t cover the same ground. I need to argue for each of these.

Suppose that, at least when it comes to accounting for data about the sorts of constructions at issue here, there’s nothing to choose between iffiness + shiftiness and anti-iffiness. Even under that assumption there is reason to take this version of the operator view seriously. That is because it is important to set the record straight. Maybe you don’t like skyhooks, Chuck Taylors, and conditional connectives expressing iffy operators in your lfs. It is important to know that whatever your reasons, it can’t be because iffiness can’t be squared with the Facts about how ifs and modals interact. The Ramsey test intuition leads naturally to a story according to which if expresses a bona fide conditional operator that captures the restricting behavior of if-clauses. Thus the restricting behavior of if-clauses can be a part of, rather than an obstacle to, their expressing something iffy. That is cool.

But what’s the real difference between the views? One view says we have no conditional operator, just a complicated modal with a slot for a restrictor. The other says we have a conditional operator but that its antecedent shifts the context thereby acting like a restrictor. Tomato/tomăto, right? Wrong! Here is one way of seeing that. Consider three indicatives:

(23) a. If Scorpio succeeds, then the end must be near.
     b. If Scorpio succeeds, then the end is near.
     c. If Jimbo is in detention, then Nelson might be.
Compare (23a) and (23c). The restrictor view says these have different modals and different arguments for each of the slots in those modals. So, apart from the fact that each is a modal expression of some flavor or other, there is nothing much in common between the two. They are as different as Some students smoke and All dogs bark: each is a quantificational expression of some flavor or other. The operator view says something different. It says that, despite their different antecedents and different consequents, they still share a common iffy core: there is a conditional connective in common between them and it contributes the same thing to each of the sentences it occurs in. Or compare the must-enriched (23a) with its bare counterpart (23b). The restrictor view says the bare indicative just is the must-enriched version in disguise. That is how it predicts if/must (Fact 2). It thus treats bare indicatives as a special case, dealt with by positing a covert and inaudible necessity modal. Maybe there is reason to posit such an operator, and an independent and principled reason to posit the necessity modal instead of an existential one or some different modal with different quantificational force, and maybe those reasons outweigh the cost of the positing. The operator view adopts a very different stance here and that is what I want to point out. It says that bare indicatives like (23b) are ordinary conditionals and their counterparts with must-ed consequents like (23a) are ordinary conditionals that happen to have must in their consequents. No special cases, no positing of inaudible operators, and if/must comes out as a prediction not as a stipulation. None of this is a knock-down argument for or against either of the views — it’s not meant to be — but it does highlight their difference in worldview. All of this has been under the assumption that both the doubly shifty iffy view and the anti-iffy restrictor view cover the same ground about how if s
and modals interact. But that’s not quite right.29 So far we have only worried about how it is that a conditional sentence manages to express what might be if such-and-such or how it manages to express what must be if such-and-such. But conditional information can be more economically expressed than that. We can just as well have a single conditional sentence that expresses what must be and what might be if such-and-such.

A case in point: although I have lost my marbles, I know that some of them (at least one of Red, Yellow, and Blue) are in the box. In fact I know a bit more. I know that Yellow and Blue are in the same spot and so that Red can’t be elsewhere if Yellow isn’t in the box. Another example: arriving at the party, I’m not sure who’s there and who isn’t. I do know that Lenny goes wherever Carl goes (but sometimes Lenny goes alone), but Monty never goes where Lenny goes.

(24) a. If Yellow is in the box, then Red might be and Blue must be.
     b. If Lenny is at the party, then Carl might be but Monty isn’t.
These are not exotic, each conditional is a true thing to say in the circumstances, and there is space for the iffy view and incarnations of the anti-iffy restrictor view to differ on the truth conditions they assign to conditionals like these, and so the two views can’t be stylistic variants. Here is the issue: (24a) and (24b) have glosses:

(25) a. If Yellow is in the box, then Red might be and if Yellow is in the box, then Blue must be.
     b. If Lenny is at the party, then Carl might be but if Lenny is at the party, then Monty isn’t.

29 There are reasons independent of interaction with epistemic modals to think that anti-iffiness, in its purest if-only-restricts form, can’t be the whole story. If it were, and if-clauses and when-clauses have the same restricting behavior, then we wouldn’t expect differences in cases like this:

(i) a. If the Cubs get good pitching and timely hitting after the break, they might win it all.
    b. When the Cubs get good pitching and timely hitting after the break, they might win it all.

But we do detect a difference. I can say something true-if-hopeful with (ia). But (ib) passes optimistic and heads straight for delusional. It’s hard to see where to locate the difference (whether it’s semantic or pragmatic) if the semantic contribution of if and when is purely to mark the restrictor slot for the common operator might. (Lewis (1975) noticed that sometimes a restricting if is odd when its corresponding restricting when is fine. But he labeled these differences “stylistic variations”.) Some arguments along these lines are pushed by von Fintel & Iatridou (2003).
These swap a single conditional with a complicated consequent for a conjunction of simple conditionals. The simple incarnation of the anti-iffy restrictor view in Definition 7.2 says we do one thing when a conditional consequent has an overt modal, and do another when there isn’t. But we didn’t say how out in the open a modal must be to count as overt. Depending on what we say, we can get divergence between the operator view and the restrictor view for cases like these.

Assume, for now, that a modal is overt in a sentence iff it is the connective featured in (the lf of) that sentence.30 Under that assumption, it is then easy to see that the two stories come apart: iffiness + shiftiness predicts that (24a) is equivalent to (25a) and so true (in the relevant context) and anti-iffiness does not. That is because the consequent of (24a) isn’t decorated with a leading modal (it’s a conjunction of modals), and so we have to posit one. So (24a) gets an L-representation like

(26) must(p)(might(⊤)(q) ∧ must(⊤)(r))
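The divergence can be made concrete by evaluating (24a), (25a), and (26) in one model. What follows is a sketch under assumptions of my own: the marbles context is taken to contain exactly three worlds (Yellow, Blue, and Red all in; Yellow and Blue in with Red out; Red alone in), which is one way of satisfying the description given for (24a). The tuple encoding, the operator tags `"might2"` and `"must2"` for the two-place restrictor modals of Definition 7.1, and all variable names are illustrative, and the "if defined" proviso is ignored.

```python
def ext(phi, C, S):
    """Worlds in S at which phi is true in context C."""
    return frozenset(j for j in S if holds(phi, C, j))

def holds(phi, C, i):
    op, *args = phi
    if op == "top":
        return True
    if op == "atom":
        return args[0] in i
    if op == "and":
        return all(holds(a, C, i) for a in args)
    if op == "might":                 # one-place modal
        return bool(ext(args[0], C, C[i]))
    if op == "must":
        return ext(args[0], C, C[i]) == C[i]
    if op == "might2":                # restrictor modal, Definition 7.1 style
        p, q = args
        return bool(ext(q, C, ext(p, C, C[i])))
    if op == "must2":
        p, q = args
        dom = ext(p, C, C[i])
        return ext(q, C, dom) == dom
    if op == "if":                    # Definition 8.1: consequent in C + p
        p, q = args
        Cp = {w: ext(p, C, Cw) for w, Cw in C.items()}
        dom = ext(p, C, C[i])
        return ext(q, Cp, dom) == dom
    raise ValueError(op)

# Assumed three-world marbles context.
ybr, yb, r = frozenset("YBR"), frozenset("YB"), frozenset("R")
S = frozenset({ybr, yb, r})
C = {w: S for w in S}
Y, Rd, Bl, T = ("atom", "Y"), ("atom", "R"), ("atom", "B"), ("top",)

iffy_24a = holds(("if", Y, ("and", ("might", Rd), ("must", Bl))), C, ybr)
iffy_25a = holds(("and", ("if", Y, ("might", Rd)), ("if", Y, ("must", Bl))), C, ybr)
restrictor_25a = holds(("and", ("might2", Y, Rd), ("must2", Y, Bl)), C, ybr)
restrictor_26 = holds(("must2", Y, ("and", ("might2", T, Rd), ("must2", T, Bl))), C, ybr)
```

On this model the shifty conditional and both renderings of the gloss come out true, while the lf in (26) comes out false: its embedded must(⊤)(r) is evaluated against the whole context, which includes the world where only Red is in the box.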
But the truth conditions of (26) do not match the truth conditions of (25a) and so do not match the truth conditions of the original (24a): (26) is false in the context as we set it up even though both (24a) and (25a) are true. Now assume, instead, that a modal is overt iff it is pronounced — no matter how arbitrarily deeply embedded. Then (26) isn’t the right anti-iffy lf for (24a). Instead, we get something more sensible: (24a) and (25a) have the same lf. There’s no in-principle problem with that.31 But what about conditionals like (24b)? We don’t want to posit a must that outscopes the pronounced might. So we have to posit a narrowscoped one. In order to get the posited modal appropriately restricted — so that (24b) comes out equivalent to (25b) — we have two obvious options. Option (i): Argue that conditionals like those in (24) are not single conditionals at all, that they are really conjunctions of two simple modals. That way there is no difference at all between the conditionals in (24) and the glosses in (25). Option (ii): Enrich our intermediate language to allow for explicit domain-restricting variables, and provide a mechanism for the inheriting of those restrictions across intervening operators like conjunction. Both options are open, and party line proponents of anti-iffiness are free to pursue them. But they do require work. Option (i) posits movement we’d not like to have to posit, treats conditionals with apparent conjoined consequents as yet another special case, and describes rather than explains why the conditionals in (24) are glossable by those in (25). Option (ii) requires more expressive resources for L than we thought necessary and requires something over and above the anti-iffy story as it stands to say when and how domain restriction gets inherited over distance and across intervening operators. That’s not an argument against this option but a description of it.32
But none of that really matters: my point was that iffiness + shiftiness and anti-iffiness aren’t notational variants. And they are not: the iffy story takes conditionals like (24) in perfect stride. No special cases, no positing of inaudible operators, no stress on the parser in assigning formulas of L to serve as the lfs of conditional sentences, no movement. We get the right truth conditions, and we get as a prediction not a stipulation that the conditionals in (24) are equivalent to those in (25).
30 In this sense, a modal is any (non-equivalent) stack of musts, mights, and negations.
31 Though it doesn’t come free: it puts strain on the process of assigning formulas of L to serve as the lfs of sentences of natural language.
10 Context and dynamics
Not every fan of old school iffiness will want to follow me this far. But there is a cost to cutting their trip short since they must then deny or explain away one of the Facts. Iffiness, they’ll no doubt point out, is not without its own costs: the price of iffiness is shiftiness twice over. I reply that there are costs and then there are costs. Embracing context-shiftiness may be a cost, but I want to point out that it is not a new cost: it makes the analysis here a broadly dynamic semantic account of indicatives.33 So shiftiness is a cost you may already be willing to bear. I want to (briefly) point out how it is that this shiftiness amounts to a four-fold dynamic perspective on modals and conditionals.
32 Something in the neighborhood of Option (ii) is developed (though not with an eye to conjoined consequents) in von Fintel (1994). For a recent discussion see Rawlins 2008.
33 The general idea that consequents are evaluated in a subordinate or derived context is standard in dynamic semantics — see, e.g., dynamic treatments of donkey anaphora (Groenendijk & Stokhof 1991) or dynamic treatments of presupposition projection in conditional antecedents and consequents (Heim 1992; Beaver 1999) or dynamic treatments of counterfactuals (Veltman 2005; von Fintel 2001; Gillies 2007). But exploiting a derived context isn’t quite a litmus test for dynamics since that is something shared by a lot of Ramsey-inspired accounts, whether or not they count as ‘dynamic’.
The version of the operator view I’m advocating for fans of iffiness takes the truth of an indicative (at an index, in a context) to be doubly shifty. That doubly shifty behavior makes the semantics dynamic in the sense that interpretation both affects and is affected by the values of contextually filled parameters. Whether if p q is true at i in C depends on C; the indicative can be true at i for some choices of C and false at i for others. So interpretation is context-dependent. Whether if p q is true at i in C also depends on the subordinate context C + p. Interpreting the indicative in C affects — temporarily — the context for interpreting some subparts of it. So interpretation is also context-affecting.
This analysis is also dynamic in a second sense. It makes certain sentences unstable — the truth-value a sentence gets in a context C is not a stable or persistent property since it can have a different truth-value in a context $C'$ that contains properly more information.
Definition 10.1 (persistence).
i. p is t-persistent iff $\llbracket p \rrbracket^{C,i} = 1$ and $C' \subseteq C$ imply $\llbracket p \rrbracket^{C',i} = 1$
ii. p is f-persistent iff $\llbracket p \rrbracket^{C,i} = 0$ and $C' \subseteq C$ imply $\llbracket p \rrbracket^{C',i} = 0$
p is persistent iff it is both t- and f-persistent.
The boolean bits are, of course, both t- and f-persistent and so persistent fullstop. But not the modals: might, being existential, is f- but not t-persistent; must goes the other way. And since if is a strict conditional, equivalent to a necessity modal scoped over a material conditional, its pattern of persistence is just like that for must.34
These two senses in which the story is dynamic are two sides of the same coin. Together they explain how it is that the narrowscoped conditionals if ¬p must q and if ¬q must p are consistent with the partitioning modals in might p ∧ might q. From the fact that $i \in \llbracket \textit{if } \neg p \textit{ must } q \rrbracket^{C}$ and $i \in \llbracket \neg p \rrbracket^{C}$ it does not follow that $i \in \llbracket \textit{must } q \rrbracket^{C}$. Indeed, with my marbles lost, this is sure to be false at i in C since might p is true. What is true at i is that — in the subordinate or derived context C + ¬p — must q is true. That is allowed because must isn’t f-persistent. But that is not at odds with the might claim. And mutatis mutandis for the other if.
34 This pattern makes the treatment of indicatives here similar in some respects to Veltman’s (1985) data semantic treatment of indicatives. But there are important differences between the two stories. Here’s one: if p might q is data semantically equivalent to if p q. That won’t do given Fact 3.
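The persistence pattern of Definition 10.1 can be checked by brute force on a toy model. The sketch below is only illustrative and rests on simplifying assumptions that are not part of the text: a context is modelled as a plain non-empty set of live possibilities, might p counts as true iff some live possibility is a p-world, and must p iff all of them are. The names `WORLDS`, `P_WORLDS` and `is_persistent` are hypothetical.

```python
from itertools import combinations

# Toy model (an assumption for illustration): four worlds, p holds at 0 and 1.
WORLDS = frozenset({0, 1, 2, 3})
P_WORLDS = frozenset({0, 1})


def atom_p(context, index):
    """Boolean atom: its truth depends on the index only."""
    return index in P_WORLDS


def might_p(context, index):
    """Existential modal: some live possibility is a p-world."""
    return bool(context & P_WORLDS)


def must_p(context, index):
    """Universal modal: every live possibility is a p-world."""
    return context <= P_WORLDS


def nonempty_subsets(worlds):
    """All non-empty subsets of a set of worlds, as frozensets."""
    items = sorted(worlds)
    return [frozenset(chosen)
            for size in range(1, len(items) + 1)
            for chosen in combinations(items, size)]


def is_persistent(sentence, value):
    """Check Definition 10.1 by enumeration.

    value=True tests t-persistence, value=False tests f-persistence:
    having `value` at (C, i) must imply having it at (C', i) for all C' ⊆ C.
    """
    for context in nonempty_subsets(WORLDS):
        for index in WORLDS:
            if sentence(context, index) != value:
                continue
            for smaller in nonempty_subsets(context):
                if sentence(smaller, index) != value:
                    return False
    return True


for name, sentence in [("p", atom_p), ("might p", might_p), ("must p", must_p)]:
    print(name,
          "| t-persistent:", is_persistent(sentence, True),
          "| f-persistent:", is_persistent(sentence, False))
```

On this toy model the atom comes out persistent in both directions, might p only f-persistent, and must p only t-persistent, which is the pattern described above. The enumeration is exponential in the number of worlds, so it is meant for tiny models only.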
So we have dynamics twice over. But so far none of this looks quite like what is usually called “dynamic semantics”. In that sense of dynamics meaning isn’t associated with truth conditions or propositions but with context change potentials, effects on relevant states of information. Take an information state s to be a set of worlds, and say that what a sentence means is how its lf updates information states. That assigns to sentences the semantic type usually reserved for programs and recipes; they express relations between states — intuitively, the set of pairs of states such that executing the program in the first state terminates in the second. We can think of all sentences in this way, thereby treating them as instructions for changing information states. Thus: the meaning of a sentence p is how it changes an arbitrary information state. We might put that by saying the denotation [p] applied to s results in state $s'$; in post-fix notation $s[p] = s'$.35 Now say that p is true in s iff s[p] = s, for then the information p carries is already present in s.36 Having gone this far, we can make good on the Ramsey test this way:
Definition 10.2 (Dynamic Iffiness). $s[\textit{if } p\ q] = \{i \in s : q \text{ is true in } s[p]\}$
Some programs have as their main point to make such-and-such the case; others to see whether such-and-such. Programs of the latter type are tests and they either return their input state (if such-and-such) or fail (otherwise). That is the kind of program Definition 10.2 says if is.37 It says an if tests s to see whether the consequent is true in s[p]. But — in good Ramseyian spirit — s[p] is just the subordinate context got by hypothetically adding p to s. Truth isn’t persistent here, either. That is because a state may pass a test posed by an existential (Are there p-possibilities?) and yet have some narrower, less uncertain state fail it (No more p-possibilities!). And dually for the universal must and if.
35 For the fragment without ifs the updates are as you would expect (Veltman 1996). For the if-free fragment of L, define [·] as follows:
i. $s[p_{\text{atomic}}] = \{i \in s : i(p_{\text{atomic}}) = 1\}$
ii. $s[\neg p] = s \setminus s[p]$
iii. $s[p \wedge q] = s[p][q]$
iv. $s[\textit{might } p] = \{i \in s : s[p] \neq \emptyset\}$
It then follows straightaway that — for the if- and modal-free fragment — $s[p] = s \cap \llbracket p \rrbracket$.
36 This generalizes the plain vanilla story about satisfaction we were taught when first learning propositional logic: as the story usually goes, a boolean p is true relative to a set of possibilities s iff all the possibilities in s are in $\llbracket p \rrbracket$. But that is equivalent to saying that adding p to the information in s produces no change: $s \cap \llbracket p \rrbracket = s$ iff $s \subseteq \llbracket p \rrbracket$.
37 See, e.g., Gillies 2004.
An iffy account like the one in Definition 10.2 is dynamic in this third sense. But the doubly-shifty operator view iffiness + shiftiness doesn’t look much like a dynamic semantics in that sense. That analysis looks static, assigning truth-conditions to indicatives at a world in a context. And we can recover propositions if the mood strikes us. But the two stories are in fact the same: lack of persistence plus the global behavior of the modals and ifs in the doubly shifty story make it equivalent to a dynamic story of the indicative that dispenses with the assignment of propositions of the normal sort from the beginning.38 Even though I told the story about truth-values assigned at contexts and indices, it is equivalent to a story about changing information states. So we have dynamics thrice over.
We have gotten this far, and found ways to predict the Facts about how indicatives and epistemic modals interact, without taking a stand on when one sentence entails another. (Having said nothing about entailment we couldn’t have said anything about modus ponens either.) Entailment is usually taken to be preservation of truth at a point of evaluation: iff q is true at a point if $p_1, \ldots, p_n$ are all true at that point do the latter entail the former. Not necessarily so in a dynamic semantics. Often enough, what is important and what an entailment relation ought to capture is not preservation of truth but preservation of information flow — what must be true after adding the information carried by the premises. That is an update-to-test entailment relation.39 Similarly, since the story as I have told it turns out to be a dynamic one, we ought to expect a larger menu of options for what it takes for a collection of premises to entail a conclusion.
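The update clauses, the test in Definition 10.2 and update-to-test entailment can be put together in a small executable sketch. It is an illustration under stated assumptions rather than the paper's own formalization: worlds are represented as sets of the atoms true at them, formulas as nested tuples, must is assumed to be the dual of might, and `entails` only quantifies over the states that can be built from the two atoms p and q. All function names are hypothetical.

```python
from itertools import combinations

ATOMS = ("p", "q")


def atom(name):
    return ("atom", name)


def neg(phi):
    return ("not", phi)


def conj(phi, psi):
    return ("and", phi, psi)


def might(phi):
    return ("might", phi)


def must(phi):
    # Assumption: must is treated as the dual of might (not-might-not).
    return neg(might(neg(phi)))


def cond(antecedent, consequent):
    return ("if", antecedent, consequent)


def update(state, phi):
    """Return state[phi]; a state is a frozenset of worlds."""
    op = phi[0]
    if op == "atom":
        return frozenset(w for w in state if phi[1] in w)
    if op == "not":
        return state - update(state, phi[1])
    if op == "and":
        return update(update(state, phi[1]), phi[2])
    if op == "might":
        # Test: pass (return the input state) iff some phi-world survives.
        return state if update(state, phi[1]) else frozenset()
    if op == "if":
        # Test: pass iff the consequent is true in the hypothetical state.
        hypothetical = update(state, phi[1])
        return state if is_true(hypothetical, phi[2]) else frozenset()
    raise ValueError(f"unknown operator: {op!r}")


def is_true(state, phi):
    """phi is true in a state iff updating with it changes nothing."""
    return update(state, phi) == state


def all_states(atoms=ATOMS):
    """Every set of worlds over the given atoms."""
    worlds = [frozenset(c) for n in range(len(atoms) + 1)
              for c in combinations(atoms, n)]
    return [frozenset(c) for n in range(len(worlds) + 1)
            for c in combinations(worlds, n)]


def entails(premises, conclusion):
    """Update-to-test: after updating any state with the premises in order,
    the conclusion must be true in the resulting state."""
    for state in all_states():
        for premise in premises:
            state = update(state, premise)
        if not is_true(state, conclusion):
            return False
    return True


p, q = atom("p"), atom("q")
# A state in which exactly one of p, q holds at each world.
s = frozenset({frozenset({"p"}), frozenset({"q"})})
print(is_true(s, conj(might(p), might(q))))
print(is_true(s, cond(neg(p), must(q))), is_true(s, must(q)))
print(entails([cond(p, might(q)), p], might(q)))
print(entails([might(p), neg(p)], might(p)))
```

In the state `s`, both might claims and the conditional with the must consequent are true even though must q on its own is not. Modus ponens with a modal consequent comes out valid, whereas might p does not survive a later update with ¬p, so the last check fails: the conclusion is evaluated in the updated state, not in the original one.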
That is because truth is sensitive to both context and index and contexts can shift about as we move from the $p_i$’s to q. To make sure entailment is sensitive to those shifts, we shouldn’t merely require preservation of truth-at-a-point. Instead, just as in a more explicitly dynamic set-up, we want to augment the context with the information of the premises, evaluating q not in C but in $(C + p_1) + \cdots + p_n$. And that corresponds exactly to the dynamic update-to-test entailment relation over our language L. That is the fourth way in which the semantics here is dynamic.
So the doubly shifty behavior of indicatives reflects this four-fold dynamic perspective. That is useful to know for two reasons. First because it makes clear what the costs of iffiness are and it makes clear that some of those costs are not completely new. Second because it makes clear that the dynamic perspective on modals and conditionals is broader than we may have thought. The senses in which the story here reflects a dynamic perspective are familiar senses, but the mechanisms of that iffy story aren’t the usual mechanisms in a dynamic semantics. The semantics traffics in things like truth conditions and propositions, not in things like support or programs or context change potentials. So nothing in the dynamic perspective on modals and conditionals requires the latter sort of semantic trafficking at the expense of the former sort. It’s broader than that.
38 The standard benchmark for dynamics is whether the interpretation function [·] is either non-introspective (Can it be that $s[p] \not\subseteq s$?) or non-continuous (Can it be that $s[p] \neq \bigcup_{i \in s} \{i\}[p]$?). In set-ups like the one in Definition 10.2, the behavior of indicatives is not continuous. See Gillies 2009 for the details on how the iffy story as I have put it is equivalent to a more directly dynamically iffy semantics, and how the right notions of entailment coincide in the two set-ups.
39 For more about the space of options for entailment relations in dynamic semantics see van Benthem 1996 and Veltman 1996. Update-to-test entailment is a lot like Stalnaker’s (1975) notion of reasonable inference.
11 An iffy upshot
My preferred version of the operator view says that an indicative is a doubly-shifty strict conditional over sets of live possibilities. It assigns two jobs to if-clauses. They have the index-shifting job of shifting the point at which we check for a consequent’s truth, but they also have the context-shifting job of shifting the context relevant for deciding at such a point whether a consequent is true. That is how if can mean the same iffy thing no matter whether the consequent is modal, and no matter the quantificational force of that modal, without running afoul of the Facts. We began with the iffy thesis that conditional information is information of a conditional.
Then we showed that — given some broad constraints for what counts as a conditional operator properly so called — apparently no operator view could be squared with the Facts since no way of sorting out the scopes would work. But all of that assumed that antecedents have no context-shifting role. So if you want to plump for an incarnation of the operator view, and you want to square your story with the Facts, you had better allow for context-shifting. It’s easy to get the idea that how if s and operators like epistemic modals interact is an argument for anti-iffiness. But since some iffy stories — this one! — can account for that data, that’s not right. Nothing about shiftiness
rules out anti-iffiness, of course. And so it’s open to go for a restrictor view that co-opts context-shifting to account for the way that conditionals with conjoined consequents turn out equivalent to conjunctions of simpler conditionals. So if you want to toe the anti-iffy line, you might want to allow for context-shifting anyway. Of course, that makes toeing the line a bit like not toeing the line.
References
Adams, Ernest W. 1975. The logic of conditionals. Dordrecht: Reidel.
Beaver, David. 1999. Presupposition accommodation: A plea for common sense. In Larry Moss, Jonathan Ginzburg & Martin de Rijk (eds.), Logic, language, and information vol. 2, 21–44. Stanford, CA: CSLI Publications. https://webspace.utexas.edu/dib97/itallc.pdf.
Belnap, Nuel D. 1970. Conditional assertion and restricted quantification. Noûs 4(1). 1–12. doi:10.2307/2214285.
Bennett, Jonathon. 2003. A philosophical guide to conditionals. Oxford University Press.
van Benthem, Johan. 1986. Essays in logical semantics (Studies in Linguistics and Philosophy 29). Dordrecht: Reidel.
van Benthem, Johan. 1996. Exploring logical dynamics. Stanford, CA: CSLI Publications.
Dekker, Paul. 2001. On if and only. Semantics and Linguistics Theory [SALT] 11. 114–133. http://staff.science.uva.nl/~pdekker/Papers/OIAO.pdf.
Edgington, Dorothy. 1995. Conditionals. Mind 104(414). 235–329. doi:10.1093/mind/104.414.235.
Edgington, Dorothy. 2008. Conditionals. In Edward N. Zalta (ed.), The Stanford encyclopedia of philosophy, Winter 2008 edn. http://plato.stanford.edu/archives/win2008/entries/conditionals/.
von Fintel, Kai. 1994. Restrictions on quantifier domains. Amherst, MA: University of Massachusetts dissertation. http://semanticsarchive.net/Archive/jA3N2IwN/fintel-1994-thesis.pdf.
von Fintel, Kai. 1997. Bare plurals, bare conditionals, and only. Journal of Semantics 14(1). 1–56. doi:10.1093/jos/14.1.1.
von Fintel, Kai. 1998a. The presupposition of subjunctive conditionals. In Uli Sauerland & Orin Percus (eds.), The interpretive tract (MIT Working Papers in Linguistics 25), 29–44. http://mit.edu/fintel/fintel-1998-subjunctive.pdf.
von Fintel, Kai. 1998b. Quantifiers and if-clauses. Philosophical Quarterly 48(191). 209–214. doi:10.1111/1467-9213.00095.
von Fintel, Kai. 2001. Counterfactuals in a dynamic context. In Michael Kenstowicz (ed.), Ken Hale: A life in language, 123–152. Cambridge, MA: MIT Press.
von Fintel, Kai. 2009. Conditionals. Ms, to appear in Semantics: An international handbook of meaning, edited by Klaus von Heusinger, Claudia Maienborn, and Paul Portner. http://mit.edu/fintel/fintel-2009-hsk-conditionals.pdf.
von Fintel, Kai & Anthony S. Gillies. 2007. An opinionated guide to epistemic modality. In Tamar Szabó Gendler & John Hawthorne (eds.), Oxford studies in epistemology: Volume 2, 32–62. Oxford University Press.
von Fintel, Kai & Anthony S. Gillies. 2008a. CIA leaks. The Philosophical Review 117(1). 77–98. doi:10.1215/00318108-2007-025.
von Fintel, Kai & Anthony S. Gillies. 2008b. Might made right. In Brian Weatherson & Andy Egan (eds.), Epistemic modals, Oxford University Press (to appear). http://rci.rutgers.edu/~thony/fintel-gillies-2008-mmr.pdf.
von Fintel, Kai & Anthony S. Gillies. 2010. Must... stay... strong! Natural Language Semantics to appear. http://mit.edu/fintel/fintel-gillies-2010-mss.pdf.
von Fintel, Kai & Irene Heim. 2007. Intensional semantics. Lecture Notes, MIT. http://tinyurl.com/intensional.
von Fintel, Kai & Sabine Iatridou. 2003. If and when if-clauses can restrict quantifiers. Manuscript, MIT. http://web.mit.edu/fintel/www/lpw.mich.pdf.
Geurts, Bart. 2005. Entertaining alternatives: Disjunctions as modals. Natural Language Semantics 13(4). 383–410. doi:10.1007/s11050-005-2052-7.
Gibbard, Allan. 1981. Two recent theories of conditionals. In William L. Harper, Robert Stalnaker & Glenn Pearce (eds.), Ifs, 211–248. Dordrecht: Reidel.
Gillies, Anthony S. 2004. Epistemic conditionals and conditional epistemics. Noûs 38(4). 585–616. doi:10.1111/j.0029-4624.2004.00485.x.
Gillies, Anthony S. 2007. Counterfactual scorekeeping. Linguistics and Philosophy 30(3). 329–360. doi:10.1007/s10988-007-9018-6.
Gillies, Anthony S. 2009. On truth-conditions for if (but not quite only if). The Philosophical Review 118(3). 325–349. doi:10.1215/00318108-2009-00.
Grice, Paul. 1989. Indicative conditionals. In Studies in the way of words, 58–85. Cambridge, MA: Harvard University Press.
Groenendijk, Jeroen & Martin Stokhof. 1991. Dynamic predicate logic. Linguistics and Philosophy 14(1). 39–100. doi:10.1007/BF00628304.
Heim, Irene. 1992. Presupposition projection and the semantics of attitude verbs. Journal of Semantics 9(3). 183–221. doi:10.1093/jos/9.3.183.
Higginbotham, James. 2003. Conditionals and compositionality. Philosophical Perspectives 17(1). 181–194. doi:10.1111/j.1520-8583.2003.00008.x.
Jackson, Frank. 1987. Conditionals. Oxford University Press.
Kratzer, Angelika. 1981. The notional category of modality. In Hans-Jurgen Eikmeyer & Hannes Rieser (eds.), Words, worlds, and contexts: New approaches in word semantics (Research in Text Theory 6), 38–74. Berlin: de Gruyter.
Kratzer, Angelika. 1986. Conditionals. Proceedings of the Chicago Linguistics Society [CLS] 22(2). 1–15.
Lewis, David. 1973. Counterfactuals. Cambridge, MA: Harvard University Press.
Lewis, David. 1975. Adverbs of quantification. In Edward Keenan (ed.), Formal semantics of natural language, 3–15. Cambridge University Press.
Lewis, David. 1976. Probabilities of conditionals and conditional probability. The Philosophical Review 85(3). 297–315. doi:10.2307/2184045.
Rawlins, Kyle. 2008. (Un)Conditionals. Santa Cruz, CA: UC Santa Cruz dissertation.
Stalnaker, Robert. 1968. A theory of conditionals. In Nicholas Rescher (ed.), Studies in logical theory (American Philosophical Quarterly Monograph Series 2), 98–112. Blackwell.
Stalnaker, Robert. 1975. Indicative conditionals. Philosophia 5(3). 269–286. doi:10.1007/BF02379021.
Veltman, Frank. 1985. Logics for conditionals. Amsterdam: University of Amsterdam dissertation.
Veltman, Frank. 1996. Defaults in update semantics. Journal of Philosophical Logic 25(3). 221–261. doi:10.1007/BF00248150.
Veltman, Frank. 2005. Making counterfactual assumptions. Journal of Semantics 22(2). 159–180. doi:10.1093/jos/ffh022.
Anthony S. Gillies Department of Philosophy Rutgers University
Semantics & Pragmatics Volume 3, Article 9: 1–74, 2010 doi: 10.3765/sp.3.9
Cross-linguistic variation in modality systems: The role of mood∗ Lisa Matthewson University of British Columbia
Received 2009-07-14 / First Decision 2009-08-20 / Revision Received 2010-02-01 / Accepted 2010-03-25 / Final Version Received 2010-05-31 / Published 2010-08-06
Abstract The St’át’imcets (Lillooet Salish) subjunctive mood appears in nine distinct environments, with a range of semantic effects, including weakening an imperative to a polite request, turning a question into an uncertainty statement, and creating an ignorance free relative. The St’át’imcets subjunctive also differs from Indo-European subjunctives in that it is not selected by attitude verbs. In this paper I account for the St’át’imcets subjunctive using Portner’s (1997) proposal that moods restrict the conversational background of a governing modal. I argue that the St’át’imcets subjunctive restricts the conversational background of a governing modal, but in a way which obligatorily weakens the modal’s force. This obligatory modal weakening — not found with Indo-European non-indicative moods — correlates with the fact that St’át’imcets modals differ from Indo-European modals along the same dimension. While Indo-European modals typically lexically encode quantificational force, but leave conversational background to context, St’át’imcets modals encode conversational background, but leave quantificational force to context (Matthewson, Rullmann & Davis 2007, Rullmann, Matthewson & Davis 2008).
Keywords: Subjunctive, mood, irrealis, modals, imperatives, evidentials, questions, free relatives, attitude verbs, Salish ∗ I am very grateful to St’át’imcets consultants Carl Alexander, Gertrude Ned, Laura Thevarge, Rose Agnes Whitley and the late Beverley Frank. Thanks to David Beaver, Henry Davis, Peter Jacobs, the members of the UBC Pragmatics Research Group (Patrick Littell, Meagan Louie, Scott Mackie, Tyler Peterson, Amélia Reis Silva, Hotze Rullmann and Ryan Waldie), three anonymous reviewers, and audiences at New York University, the University of British Columbia and the 44th International Conference on Salish and Neighbouring Languages for helpful feedback and discussion. Thanks to Tyler Peterson for helping prepare the manuscript for publication. This research is supported by SSHRC grants #410-2005-0875 and #410-2007-1046. ©2010 Lisa Matthewson This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).
1 Introduction
Many Indo-European languages possess both modals, lexical items which quantify over possible worlds, and subjunctive moods, agreement paradigms which usually require a licensing modal element. The contrast is illustrated for Italian in (1)–(2). (1) contains modal auxiliaries; (2) contains subjunctive mood agreement which is licensed by the matrix attitude verb.
(1) a. deve essere nell’ ufficio
       must+3sg+pres+ind be in.the office
       ‘He must be in the office.’ (Italian; Palmer 2006: 102)
    b. puo essere nell’ ufficio
       may+3sg+pres+ind be in.the office
       ‘He may be in the office.’ (Italian; Palmer 2006: 102)
(2) dubito che impari
    I.doubt that learn+3sg+pres+sbjn
    ‘I doubt that he’s learning.’ (Italian; Palmer 2006: 117)
Previous work on the Salish language St’át’imcets (a.k.a. Lillooet; see Matthewson et al. 2007, Rullmann et al. 2008, and Davis, Matthewson & Rullmann 2009) has established the existence of a set of modals in this language, which differ in their semantics from those of Indo-European. Indo-European modals typically lexically encode distinctions of quantificational force, but leave conversational background (in the sense of Kratzer 1981, 1991) up to context. (1a), for example, unambiguously expresses necessity, while (1b) unambiguously expresses possibility. However, both modals allow either epistemic or deontic interpretations, depending on context. In contrast, modals in St’át’imcets lexically encode conversational background, but leave quantificational force up to context. (3a), for example, is unambiguously epistemic, but is compatible with either a necessity or a possibility interpretation, depending on context. (3b) is unambiguously deontic, but similarly allows differing quantificational strengths. See Matthewson et al. 2007, Rullmann et al. 2008, and Davis et al. 2009 for extensive discussion.1
1 All St’át’imcets data are from primary fieldwork unless otherwise noted. Data are presented in the practical orthography of the language developed by Jan van Eijk; see van Eijk & Williams 1981. Abbreviations: adhort: adhortative, caus: causative, circ: circumstantial modal, col: collective, comp: complementizer, cond: conditional, conj: conjunctive, counter: counter to expectations, deic: deictic, deon: deontic, demon: demonstrative, det: determiner, dir: directive transitivizer, ds: different subject, epis: epistemic, erg: ergative, exis: assertion of existence, foc: focus, fut: future, impf: imperfective, inch: inchoative, indic: indicative, infer: inferential evidential, irr: irrealis, loc: locative, mid: middle intransitive, nom: nominalizer, obj: object, prt: particle, pass: passive, perc.evid: perceived evidence, pl: plural, poss: possessive, prep: preposition, real: realis, red: redirective applicative, rem.past: remote past, sbjn: subjunctive, sg: singular, sim: simultaneous, stat: stative, temp.deic: temporal deictic, ynq: yes-no question. The symbol - marks an affix boundary and = marks a clitic boundary.
(3) a. wá7=k’a s-t’al l=ti=tsítcw-s=a s=Philomena
       be=epis stat-stop in=det=house-3sg.poss=exis nom=Philomena
       ‘Philomena must / might be in her house.’ (only epistemic)
    b. lán=lhkacw=ka áts’x-en ti=kwtámts-sw=a
       already=2sg.subj=deon see-dir det=husband-2sg.poss=exis
       ‘You must / can / may see your husband now.’ (only deontic)
A simplified table representing the difference between the two types of modal system is given in Table 1:
                 quantificational force    conversational background
Indo-European    lexical                   context
St’át’imcets     context                   lexical
Table 1   Indo-European vs. St’át’imcets modal systems
In this paper I extend the cross-linguistic comparison to the realm of mood. I argue that St’át’imcets possesses a subjunctive mood, and show that it induces a range of apparently disparate semantic effects, depending on the construction in which it appears. One example of the use of the subjunctive is given in (4): it weakens the force of a deontic modal proposition (in a sense to be made precise below). Other uses include turning imperatives into polite requests, and turning questions into statements of uncertainty (cf. van Eijk 1997 and Davis 2006).
(4) a. gúy’t=Ø=ka ti=sk’úk’wm’it=a
       sleep=3indic=deon det=child=exis
       ‘The child should sleep.’
    b. guy’t=ás=ka ti=sk’úk’wm’it=a
       sleep=3sbjn=deon det=child=exis
       ‘I hope the child sleeps.’
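The glossing convention used in the examples (a hyphen marks an affix boundary, an equals sign a clitic boundary) lends itself to a mechanical check that every morpheme in a form line has a gloss. The following is a minimal sketch, not a tool used in the paper; the function name `align` and the decision to treat only `-` and `=` as boundaries are assumptions made for illustration.

```python
import re

# Assumption: only "-" (affix) and "=" (clitic) count as morpheme boundaries.
BOUNDARY = re.compile(r"[-=]")


def align(form_line, gloss_line):
    """Pair each morpheme of a form line with its gloss label.

    Returns one list of (morpheme, label) pairs per word and raises
    ValueError when the two lines do not line up.
    """
    words = form_line.split()
    glosses = gloss_line.split()
    if len(words) != len(glosses):
        raise ValueError("form and gloss lines have different numbers of words")
    aligned = []
    for word, gloss in zip(words, glosses):
        morphs = BOUNDARY.split(word)
        labels = BOUNDARY.split(gloss)
        if len(morphs) != len(labels):
            raise ValueError(f"morpheme count mismatch in {word!r} / {gloss!r}")
        aligned.append(list(zip(morphs, labels)))
    return aligned


# Example (4b): the subjunctive clitic is the second morpheme of the first word.
example_4b = align("guy’t=ás=ka ti=sk’úk’wm’it=a",
                   "sleep=3sbjn=deon det=child=exis")
for word in example_4b:
    print(word)
```

Glosses that bundle several categories with a period, such as 2sg.poss, are deliberately left as a single label, since the period does not mark a morpheme boundary in the form line.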
I will show that the St’át’imcets subjunctive differs markedly from Indo-European subjunctives, both in the environments in which it is licensed, and in its semantic effects. I propose an analysis of the St’át’imcets subjunctive which adopts insights put forward by Portner (1997, 2003). For Portner, moods in various Indo-European languages place restrictions on the conversational background of a governing modal. I argue that the St’át’imcets subjunctive mood can be analyzed within exactly this framework, with the twist that in St’át’imcets, the restriction the subjunctive places on the governing modal obligatorily weakens the force of the proposition expressed. This has an interesting consequence. While we can account for the St’át’imcets subjunctive using the same theoretical tools as for Indo-European, at a functional level the two languages are using their mood systems to achieve quite different effects. In particular, St’át’imcets uses its mood system to restrict modal force — precisely what this language does not restrict via its lexical modals. At a functional level, then, we find the same kind of cross-linguistic variation in the domain of mood as we do with modals. This idea is illustrated in the simplified typology in Table 2:
                 lexically restrict quant. force    lexically restrict convers. background
Indo-European    modals                             moods
St’át’imcets     moods                              modals
Table 2   Modal and mood systems
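The modal half of this typology, that is, how a lexical item and the context divide the labour between quantificational force and conversational background, can be made concrete with a small sketch. It is a deliberate simplification and not the formal analysis of the works cited: a conversational background is reduced to a set of accessible worlds, force to a contextual parameter, and the names `force_encoding_modal`, `background_encoding_modal` and the toy worlds are hypothetical.

```python
def quantify(force, worlds, proposition):
    """Necessity quantifies universally, possibility existentially."""
    if force == "necessity":
        return all(proposition(w) for w in worlds)
    if force == "possibility":
        return any(proposition(w) for w in worlds)
    raise ValueError(f"unknown force: {force!r}")


def force_encoding_modal(lexical_force):
    """Indo-European type: force is lexical, background comes from context."""
    def modal(context, proposition):
        worlds = context["backgrounds"][context["salient_background"]]
        return quantify(lexical_force, worlds, proposition)
    return modal


def background_encoding_modal(lexical_background):
    """St'at'imcets type: background is lexical, force comes from context."""
    def modal(context, proposition):
        worlds = context["backgrounds"][lexical_background]
        return quantify(context["salient_force"], worlds, proposition)
    return modal


# Toy backgrounds (assumptions): sets of worlds compatible with what is
# known (epistemic) and with what is required (deontic).
BACKGROUNDS = {"epistemic": {"w1", "w2"}, "deontic": {"w2", "w3"}}


def make_context(background, force):
    return {"backgrounds": BACKGROUNDS,
            "salient_background": background,
            "salient_force": force}


def in_house(world):
    return world in {"w2", "w3"}


must = force_encoding_modal("necessity")        # like (1a)
k_a = background_encoding_modal("epistemic")    # like =k'a in (3a)

# The Indo-European type varies with the contextual background ...
print(must(make_context("epistemic", "possibility"), in_house))
print(must(make_context("deontic", "possibility"), in_house))
# ... while the St'at'imcets type varies with the contextual force.
print(k_a(make_context("deontic", "necessity"), in_house))
print(k_a(make_context("deontic", "possibility"), in_house))
```

Each modal ignores exactly the contextual parameter that it fixes lexically, which is the division of labour the typology describes for modals; the mood half of the table is not modelled here.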
These results suggest that while individual items in the realm of mood and modality lexically encode different aspects of meaning, the systems as a whole have very similar expressive power. The structure of the paper: Section 2 introduces the St’át’imcets subjunctive data. I first illustrate the nine different uses of the relevant agreement paradigm, and then argue that this agreement paradigm is a subjunctive, rather than an irrealis mood. Section 3 shows that the St’át’imcets subjunctive is not amenable to existing analyses of more familiar languages.
Section 4 reviews the basic framework adopted, that of Portner (1997), and Section 5 provides initial arguments for adopting a Portner-style approach for St’át’imcets. Section 6 presents the formal analysis, and Section 7 applies the analysis to a range of uses of the subjunctive. Section 8 concludes and raises some issues for future research.
2 St’át’imcets subjunctive data
St’át’imcets possesses a complex system of subject and object agreement. There are different subject agreement paradigms for transitive vs. intransitive predicates. For intransitive predicates, there are three distinct subject paradigms, one of which is glossed as ‘subjunctive’ by van Eijk (1997) and Davis (2006).2
        indicative                        subjunctive
        indicative      nominalized
1sg     tsút=kan        n=s=tsut          tsút=an
2sg     tsút=kacw       s=tsút=su         tsút=acw
3sg     tsut=Ø          s=tsút=s          tsút=as
1pl     tsút=kalh       s=tsút=kalh       tsút=at
2pl     tsút=kal’ap     s=tsút=lap        tsút=al’ap
3pl     tsút=wit        s=tsút=i          tsút=wit=as
Table 3   Subject agreement paradigms for the intransitive predicate tsut ‘to say’ (adapted from van Eijk 1997: 146)
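For readers who want to look forms up while working through the examples, the intransitive paradigms of Table 3 can be stored as a small lookup structure. This is only a convenience sketch: it assumes that the forms in =kan, =kacw and so on are the plain indicative set, the forms in s= the nominalized set, and the forms in =an, =acw, =as the subjunctive set, and the names `inflect` and `analyses` are hypothetical.

```python
PERSONS = ("1sg", "2sg", "3sg", "1pl", "2pl", "3pl")

# Forms of tsut 'to say', one tuple per paradigm, ordered as in PERSONS.
PARADIGMS = {
    "indicative": ("tsút=kan", "tsút=kacw", "tsut=Ø",
                   "tsút=kalh", "tsút=kal’ap", "tsút=wit"),
    "nominalized": ("n=s=tsut", "s=tsút=su", "s=tsút=s",
                    "s=tsút=kalh", "s=tsút=lap", "s=tsút=i"),
    "subjunctive": ("tsút=an", "tsút=acw", "tsút=as",
                    "tsút=at", "tsút=al’ap", "tsút=wit=as"),
}


def inflect(person, paradigm):
    """Return the form of tsut for a person/number value and a paradigm."""
    return PARADIGMS[paradigm][PERSONS.index(person)]


def analyses(form):
    """Return every (person, paradigm) pair that a given form realizes."""
    return [(person, paradigm)
            for paradigm, forms in PARADIGMS.items()
            for person, candidate in zip(PERSONS, forms)
            if candidate == form]


print(inflect("3sg", "subjunctive"))
print(analyses("tsút=acw"))
```

The table covers a single intransitive predicate only, so the lookup says nothing about transitive predicates or about the stress alternations visible in the forms.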
With transitive predicates, the situation is similar, except that there are four separate paradigms, one of which is subjunctive.3,4
2 The cognate forms are often called ‘conjunctive’ in other Salish languages, primarily in order to disambiguate the abbreviations for ‘subject’ and ‘subjunctive’. See for example Kroeber 1999.
3 The traditional terms for the first two columns are ‘indicative’ and ‘nominalized’ respectively. The nominalized endings are identical to nominal possessive endings, and are glossed as ‘poss’ in the data. The choice between these first two paradigms is syntactically governed: the so-called ‘indicative’ surfaces in matrix clauses and relative clauses, while the nominalized paradigm appears in subordinate clauses. Both these sets contrast semantically, in all syntactic environments, with the subjunctive, hence my overall categorization of the first two paradigms as ‘indicative’.
4 See Kroeber 1999 and Davis 2000 for justification of the analysis of subject inflection assumed here. I do not provide the transitive paradigms, as subject markers vary based on the person and number of the object and the table is excessively large. See van Eijk 1997 and Davis 2006 for details.
In subsection 2.1 I illustrate the uses of the paradigms glossed as subjunctive, and in subsection 2.2 I argue that these paradigms more closely approximate familiar subjunctives, rather than irrealis moods.
2.1 Uses of the St’át’imcets subjunctive
The mood I am glossing as ‘subjunctive’ has a wide range of uses, which at first glance are not easily unifiable. I illustrate all of them here. First, the subjunctive functions to turn a plain assertion into a wish (Davis 2006: chapter 24).5
5 The determiner alternation between (5a) and (5b) (ti=. . . =a vs. ku=) is predictable, but irrelevant for current concerns. See Matthewson 1998, 1999 for discussion.
(5) a. nilh s=Lémya7 ti=kél7=a
       foc nom=Lémya7 det=first=exis
       ‘Lémya7 is first.’
    b. nílh=as s=Lémya7 ku=kéla7
       foc=3sbjn nom=Lémya7 det=first
       ‘May Lémya7 be first.’
(6) a. ámh=as ku=scwétpcen-su!
       good=3sbjn det=birthday=2sg.poss
       ‘May your birthday be good!’
    b. ámh=as ku=s=wá7=su!
       good=3sbjn det=nom=be=2sg.poss
       ‘Best wishes!’ [‘May your being be good.’] (Davis 2006: ch. 24)
This use of the subjunctive is very restricted (see van Eijk 1997: 147). Minimal pairs cannot usually be constructed for ordinary assertions, as shown in (7)–(9).
(7) a. kwis lhkúnsa
       rain today
       ‘It’s raining today.’
    b. *kwís=as lhkúnsa
       rain=3sbjn today
       intended: ‘May it rain today.’
(8) a. áma ti=sq’ít=a
       good det=day=exis
       ‘It is a good day.’
    b. *ámh=as ti=sq’ít=a
       good=3sbjn det=day=exis
       intended: ‘May it be a good day.’
(9) a. guy’t ti=sk’úk’wm’ita
       sleep det=child=exis
       ‘The child is sleeping.’
    b. *guy’t=ás ti=sk’úk’wm’ita
       sleep=3sbjn det=child=exis
       intended: ‘I hope the child sleeps.’
In general, the subjunctive seems only to add to a plain assertion either in a cleft structure, as in (5), or in conventionalized wishes, as in (6). I return to this issue below. The more usual case of the subjunctive creating a wish-statement is when it co-occurs with the deontic modal ka, as in (10)–(11).
(10) a. plan=ka=tí7=t’u7 wa7 máys-n-as
        already=deon=demon=prt impf fix-dir-3erg
        ‘He should have fixed that already.’
     b. plan=as=ká=tí7=t’u7 wa7 máys-n-as
        already=3sbjn=deon=demon=prt impf fix-dir-3erg
        ‘I wish he had fixed that already.’
(11) a. gúy’t=ka ti=sk’úk’wm’it=a
        sleep=deon det=child=exis
        ‘The child should sleep.’
     b. gúy’t=ás=ka ti=sk’úk’wm’it=a
        sleep=3sbjn=deon det=child=exis
        ‘I hope the child sleeps.’
When used with the deontic modal ka, in addition to the ‘wish’ interpretation shown in (10)–(11), the subjunctive can also render a ‘pretend to be ...’ interpretation.6 6 The data in (12) are from the Upper St’át’imcets dialect; in Lower St’át’imcets, (12a) is corrected to (i), which has the subjunctive but lacks the deontic modal. This independent
(12)
a.
skalúl7=acw=ka: saq’w knáti7 múta7 em7ímn-em owl=2sg.sbjn=deon fly deic and animal.noise-mid ‘Pretend to be an owl: fly around and hoot.’ (Davis 2006: chapter 24)
b.
snu=hás=ka ku=skícza7 2sg.emph=3sbjn=deon det=mother ‘Pretend to be the mother.’ (Whitley, Davis, Matthewson & Frank (editors) no date)
The fourth construction which licenses the subjunctive is the imperative; the subjunctive weakens an imperative to a polite request (Davis 2006: chapter 24). In each of (13)–(15), the subjunctive imperative in (b) is construed as ‘more polite’ than the plain imperative in (a). The subjunctive is particularly common in negative requests, as in (15). (13)
a.
lts7á=malh lh=kits-in’=ál’ap! deic=adhort comp=put.down-dir=2pl.sbjn ‘Just put it over here!’
b.
lts7á=has=malh lh=kits-in’=ál’ap deic=3sbjn=adhort comp=put.down-dir=2pl.sbjn ‘Could you put it down here?’/‘You may as well put it down over here.’7 (adapted from Davis 2006: chapter 24)
(14)
a.
nás=malh áku7 pankúph=a go=adhort deic Vancouver=exis ‘You’d better go to Vancouver.’
b.
nás=acw=malh áku7 pankúph=a go=2sg.sbjn=adhort deic Vancouver=exis ‘You could go to Vancouver.’
pronoun construction is argued by Thoma (2007) to be a concealed cleft. I return to this issue below. (i) nu=hás ku=kalúla7 2sg.emph=3sbjn det=owl ‘Pretend to be an owl.’ 7 The third person subjunctive ending appears here because the structure is bi-clausal, involving a third-person impersonal main predicate: ‘It is here that you could put it down.’
(15)
a.
cw7aoz kw=s=sek’w-en-ácw ta=nk’wanústen’=a neg det=nom=break-dir-2sg.erg det=window=exis ‘Don’t break the window.’
b.
cw7áoz=as kw=s=sek’w-en-ácw ta=nk’wanústen’=a neg=3sbjn det=nom=break-dir-2sg.erg det=window=exis ‘Don’t break the window.’
Fifth, in combination with an evidential or a future modal, the subjunctive helps to turn wh-questions into statements of uncertainty or wondering. (16)
a.
kanem=lhkán=k’a do.what=1sg.indic=infer ‘What happened to me?’
b.
kanem=án=k’a do.what=1sg.sbjn=infer ‘I don’t know what happened to me.’ / ‘I wonder what I’m doing.’8
(17)
a.
kanem=lhkácw=kelh múta7 do.what=2sg.indic=fut again ‘What are you going to be doing later?’
b.
kanem=ácw=kelh múta7 do.what=2sg.sbjn=fut again ‘I wonder what you are going to do again.’ (van Eijk 1997: 215)
(18)
a.
nká7=kelh lh=cúz’=acw nas where=fut comp=going.to=2sg.sbjn go ‘Where will you go?’
b.
nká7=as=kelh lh=cúz’=acw nas where=3sbjn=fut comp=going.to=2sg.sbjn go ‘Wherever will you go?’ / ‘I wonder where you are going to go now.’ (adapted from Davis 2006: chapter 24)
The same effect arises with yes-no questions. In combination with the evidential k’a or a future modal, the subjunctive also turns these into statements of uncertainty which are often translated using ‘maybe’ or ‘I wonder’. 8 For expository reasons, k’a was glossed as ‘epistemic’ in (3a) above, but from now on will be glossed as ‘inferential’. Matthewson et al. (2007) analyze k’a as an epistemic modal which carries a presupposition that there is inferential evidence for the claim.
(19)
a.
lán=ha kwán-ens-as already=ynq take-dir-3erg ni=n-s-mets-cál=a det.abs=1sg.poss-nom-write-act=exis ‘Has she already got my letter?’
b.
lan=as=há=k’a kwán-ens-as already=3sbjn=ynq=infer take-dir-3erg ni=n-s-mets-cál=a det.abs=1sg.poss-nom-write-act=exis ‘I wonder if she’s already got my letter.’ / ‘I don’t know if she got my letter or not.’
(20)
wa7=as=há=k’a tsicw impf=3sbjn=ynq=infer get.there i=n-sésq’wez’=a, cw7aoz kw=en det.pl=1sg.poss-younger.sibling=exis neg det=1sg.poss zwát-en know-dir ‘Perhaps my younger siblings went along, I don’t know.’ (Matthewson 2005: 265)
In combination with a wh-indefinite and the evidential k’a, the subjunctive creates free relatives with an ‘ignorance/free choice’ reading; see Davis 2006 for discussion. (21)
a.
qwatsáts=t’u7 múta7 súxwast áku7, t’ak aylh áku7, leave=prt again go.downhill deic go then deic nílh=k’a s=npzán-as foc=infer nom=meet(dir)-3erg k’a=lh=swát=as=k’a káti7 ku=npzán-as infer=comp=who=3sbjn=infer deic det=meet(dir)-3erg ‘So he set off downhill again, went down, and then he met whoever he met.’ (van Eijk & Williams 1981: 66, cited in Davis 2009)
b.
o, púpen’=lhkan [ta=stam’=as=á=k’a] oh find=1sg.indic [det=what=3sbjn=exis=infer] ‘Oh, I’ve found something or other.’ (Unpublished story by “Bill” Edwards, cited in Davis 2009)
When used in combination with the scalar particle t’u7, the subjunctive creates a statement translated as ‘might as well’ or ‘may as well’.
(22)
a.
wá7=lhkan=t’u7 wa7 k’wzús-em impf=1sg.indic=prt impf work-mid ‘I am just working.’
b.
wá7=an=t’u7 wa7 k’wzús-em impf=1sg.sbjn=prt impf work-mid ‘I might as well stay and work.’
(23)
a.
wá7=lhkacw=t’u7 lts7a lhkúnsa ku=sgáp be=2sg.indic=prt deic now det=evening ‘You are staying here for the night.’
b.
wá7=acw=t’u7 lts7a lhkúnsa ku=sgáp be=2sg.sbjn=prt deic now det=evening ‘You may as well stay here for the night.’
And finally, in combination with a wh-word and the scalar particle t’u7, the subjunctive creates free relatives with a universal / indifference reading. (24)
a.
wa7 táw-em ki=smán’c=a, ns7á7z’-em impf sell-mid det.col=tobacco=exis trade-mid ku=stám’=as=t’u7 det=what=3sbjn=prt ‘He was selling tobacco, trading it for whatever . . . ’ (van Eijk & Williams 1981: 74, cited in Davis 2009)
b.
wa7 kwám=wit ku=káopi, ku=súkwa, ku=saplín, impf take(mid)=3pl det=coffee det=sugar det=flour [stám’=as=t’u7 cw7aoz [what=3sbjn=prt neg kw=s=ka-ríp-s-tum’-a det=nom=circ-grow-caus-1pl.erg-circ l=ti=tmícw-lhkalh=a] on=det=land-1pl.poss=exis] ‘They got coffee, sugar, flour, whatever we couldn’t grow on our land. . . ’ (Matthewson 2005: 105, cited in Davis 2009)
c.
[stám’=as=t’u7 káti7 i=wá7 [what=3sbjn=prt deic det.pl=impf ka-k’ac-s-twítas-a i=n-slalíl’tem=a] circ-dry-caus-3pl.erg-circ det.pl=1sg.poss-parents=exis] wa7 ts’áqw-an’-em lh=as sútik impf eat-dir-1pl.erg comp(impf)=3sbjn winter
‘Whatever my parents could dry, we ate in wintertime.’ (Matthewson 2005: 141, cited in Davis 2009)
The nine uses of the St’át’imcets subjunctive are summarized in Table 4:

environment                            indicative meaning               subjunctive meaning
plain assertion                        assertion                        wish
deontic modal                          deontic necessity/possibility    wish
deontic modal                          deontic necessity/possibility    ‘pretend’
imperative                             command                          polite request
wh-question + evidential/future        question                         uncertainty/wondering
yes-no question + evidential/future    question                         uncertainty/wondering
wh-word + evidential                   question                         ignorance free relative
scalar particle t’u7                   ‘just/still’                     ‘might as well’
wh-word + scalar particle t’u7         N/A                              indifference free relative

Table 4 Uses of the St’át’imcets subjunctive
These are all the cases where the subjunctive has a semantic effect; in the next sub-section we will also see some cases where the subjunctive is obligatory and semantically redundant. I will not aim to account for the entire panoply of subjunctive effects in one paper. However, the analysis I offer will explain the first seven uses, setting aside for future research only the two uses which involve the particle t’u7. See Section 8 for some speculative comments about the subjunctive in combination with t’u7.
2.2 This is a subjunctive mood
In this sub-section I justify the use of the term ‘subjunctive’ for the subject agreements being investigated. The choice of terminology is intended to reflect the fact that the St’át’imcets mood patterns with Indo-European subjunctives, rather than with Amerindian irrealis moods, in several respects. However, we will see below that the St’át’imcets subjunctive also differs
semantically in important ways from Indo-European subjunctives.9 Palmer (2006) observes that there is a broad geographical typology, such that European languages often encode an indicative/subjunctive distinction, while Amerindian and Papuan languages often encode a realis/irrealis distinction. A typical irrealis-marking system is illustrated in (25). (25)
a.
ho bu-busal-en age qo-in pig sim-run.out-3sg+ds+real 3pl hit-3pl+rem.past ‘They killed the pig as it ran out.’ (Amele; Palmer 2006: 5)
b.
ho bu-busal-eb age qo-qag-an pig sim-run.out-3sg+ds+irr 3pl hit-3pl-fut ‘They will kill the pig as it runs out.’ (Amele; Palmer 2006: 5)
According to Palmer (2006: 145), the indicative/subjunctive distinction and the realis/irrealis distinction are ‘basically the same’. The core function of both a subjunctive and an irrealis is to encode ‘non-assertion’.10 However, there are differences in distribution and in syntactic functions. First, Palmer observes that subjunctive is not marked independently of other inflectional categories such as person and number. Instead, there is typically a full subjunctive paradigm. On the other hand, irrealis is often marked by a single element. In this respect, the St’át’imcets mood patterns like a subjunctive; see Table 3 above. Second, in main clauses, irrealis marking is often used for questions, futures and denials; this is not the case for main clause subjunctives. In this respect also, the St’át’imcets mood patterns like a subjunctive. It is not used to mark questions, futures or denials. (26)–(28) all have indicative marking. 9 This raises a terminological issue which arises in many areas of grammar. Should we apply terms which were invented for European languages to similar — but not identical — categories in other languages? For example, should we say ‘The perfect / definite determiner / subjunctive in language X differs semantically from its English counterpart’, or should we say ‘Language X lacks a perfect / definite determiner / subjunctive’, because it lacks an element with the exact semantics of the English categories? I adopt the former approach here, as I think it leads to productive cross-linguistic comparison, and because it suggests that the traditional terms do not represent primitive sets of properties, but rather potentially decomposable ones. 10 Palmer does not provide a definition of ‘non-assertion’. He observes that common reasons why a proposition is not asserted are because the speaker doubts its veracity, because the proposition is unrealized, or because it is presupposed (Palmer 2006: 3). See Section 3 below for discussion.
(26)
t’íq=Ø=ha kw=s=Josie? arrive=3indic=ynq det=nom=Josie ‘Did Josie arrive?’
(27)
t’íq=Ø=kelh kw=s=Josie arrive=3indic=fut det=nom=Josie ‘Josie will arrive.’
(28)
cw7aoz kw=s=t’iq=s s=Josie neg det=nom=arrive=3poss nom=Josie ‘Josie didn’t arrive.’
Third, Palmer notes that subjunctive marking is obligatory and redundant only in subordinate clauses, while irrealis marking is often obligatory and redundant in main clauses. Here again, the St’át’imcets mood patterns like a subjunctive. It is obligatory and redundant only in three cases. The first is when it is embedded under the complementizer lh=. lh= is glossed by van Eijk (1997) as ‘hypothetical’, and analyzed by Davis (2006) as a complementizer which introduces subjunctive clauses, including if-clauses, as in (29a) and (29b), temporal adjuncts (29b), locative adjuncts (29c), and complements to the evidential k’a when this is used as a (focused) adverb (29d). (29)
a.
lh=cw7áoz*(=as)=ka kw=s=gúy’t=su, comp=neg*(=3sbjn)=irr det=nom=sleep=2sg.poss lán=ka=tu7 wa7 xzum i=n’wt’ústen-sw=a already=irr=then impf big det.pl=eye-2sg.poss=exis ‘If you hadn’t slept, your eyes would have been big already.’ (van Eijk & Williams 1981: 12)
b.
xwáyt=wit=ka lh=wa7=wit*(=ás)=t’u7 qyax many.people.die=3pl=irr comp=be=3pl*(=3sbjn)=prt drunk múta7 tqálk’-em lh=w*(=as) qyáx=wit and drive-mid comp=impf*(=3sbjn) drunk=3pl ‘They would die if they got drunk and drove when they were drunk.’ (Matthewson 2005: 367)
c.
lts7a lh=wa7*(=as) qwál’qwel’t deic comp=impf*(=3sbjn) hurt ‘It is here that it is hurting.’
d.
k’a lh=7án’was*(=as) sq’it, maybe comp=two*(=3sbjn) day ka-láx-s-as-a n-skícez7=a circ-remember-caus-3erg-circ 1sg.poss-mother=exis na=s-7ílacw-em-s=a det=nom-soak-mid-3poss=exis ta=n-qéqtsek=a det=1sg.poss-older.brother=exis ‘Maybe two days later, my mother remembered the fish my brother had been soaking.’ (Matthewson 2005: 152; cited in Davis 2006: chapter 23)11
The second case where the St’át’imcets subjunctive is obligatory and redundant is when embedded under the complementizer i= ‘when’, as in (30). i= has a similar distribution to lh=, but is restricted to past-time contexts. See van Eijk 1997: 235-6 and Davis 2006: chapter 27 for discussion. (30)
a.
i=kél7=at tsicw, áts’x-en-em when.past=first=1pl.sbjn get.there see-dir-1pl.erg i=cw7ít=a tsitcw det=many=exis house ‘When we first got there, we saw lots of houses.’ (Matthewson 2005: 74)
b.
wá7=lhkan lexláx-s i=kwís*(=as) impf=1sg.indic remember-caus when.past=fall*(=3sbjn) na=n-sésq’wez’=a, s=Harold Peter det.abs=1sg.poss-younger.sibling=exis nom=Harold Peter ‘I remember when my little brother was born, Harold Peter.’ (Matthewson 2005: 354-5)
11 Incidentally, Davis (2006: chapter 23) observes that ‘two or more k’a lh= clauses strung together form the closest equivalent in [St’át’imcets] of [English] “either...or”.’ An example is given in (i). (i)
k’a lh=xw7utsin-qín’=as, k’a lh=tsilkst-qín’=as=kelh maybe comp=four-animal=3sbjn maybe comp=five-animal=3sbjn=fut ‘It’ll either be a four point or a five point buck.’
(Davis 2006: chapter 23)
As Davis implies, St’át’imcets lacks any lexical item which renders logical disjunction, and constructions like (i), although used to translate English ‘or’, are literally two ‘maybe’-clauses strung together.
Finally, the subjunctive is obligatory when it appears in combination with the perceived-evidence evidential =an’. =an’ is analyzed by Matthewson et al. (2007) as an epistemic modal which is defined only if the speaker has perceived indirect evidence for the prejacent proposition. (31)
a.
*táyt=kacw=an’ hungry=2sg.indic=perc.evid ‘You must be hungry.’
b.
táyt=acw=an’ hungry=2sg.sbjn=perc.evid ‘You must be hungry.’
(32)
a.
*nílh=Ø=an’ s=Sylvia ku=xílh-tal’i foc=3indic=perc.evid nom=Sylvia det=do(caus)-top ‘Apparently it was Sylvia who did it.’
b.
nílh=as=an’ s=Sylvia ku=xílh-tal’i foc=3sbjn=perc.evid nom=Sylvia det=do(caus)-top ‘Apparently it was Sylvia who did it.’ (Matthewson et al. 2007: 208)
The perceived-evidence evidential is the only environment in the language where the subjunctive is obligatory in a matrix clause. I assume that the subjunctive lacks semantic import here, as an otherwise very similar evidential lákw7a does not allow the subjunctive in cases parallel to (31)–(32) (Matthewson 2010, to appear). The conclusion is that St’át’imcets, in spite of being an Amerindian language, has a mood which patterns, at least morpho-syntactically, like a subjunctive rather than an irrealis. This fits with how van Eijk (1997) and Davis (2000, 2006) gloss the relevant forms. However, we will see in the next section that the St’át’imcets subjunctive differs semantically in interesting ways from European subjunctives.
3 Why previous analyses do not work for St’át’imcets
The vast majority of formal research on the subjunctive deals with Indo-European. In languages such as the Romance languages, the subjunctive mood is used for wishes, fears, speculations, doubts, obligations, reports, unrealized events, or presupposed propositions. Some examples are provided in (33)–(34).
(33)
a.
creo que aprende I.believe that learn+3sg+pres+indic ‘I believe that he is learning.’ (Spanish; Palmer 2006: 5)
b.
dudo que aprenda I.doubt that learn+3sg+pres+sbjn ‘I doubt that he’s learning.’ (Spanish; Palmer 2006: 5)
(34)
potessi venire anch’ io can+1sg+pres+sbjn come also I ‘If only I could come too.’
(Italian; Palmer 2006: 109)
In this section I briefly discuss some of the main approaches to the subjunctive. I cannot do justice to the full array of proposals in the literature; the goal is to provide enough background to establish that the St’át’imcets subjunctive is not amenable to a range of existing approaches. One pervasive line of thought is that subjunctive encodes a general semantic contribution of ‘non-assertion’ (Bolinger 1968, Terrell & Hooper 1974, Hooper 1975, Klein 1975, Farkas 1992, Lunn 1995, Palmer 2006, Haverkate 2002, Panzeri 2003, among others). One recent formal proposal in this line is that of Farkas (2003). Farkas argues that there is a correlation between indicative mood and complements which have assertive context change potential relative to the embedded environment. Assertive context change for a matrix clause is defined as in (35); the context set of worlds Wc is narrowed. (35)
Assertive context change: $c + \varphi$ is assertive iff $W_{c'} = W_c \cap p$, where $c'$ is the output context. (Farkas 2003: 5)
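As a rough illustration of (35), the context set can be modelled as a finite set of worlds and a proposition as the set of worlds in which it holds; an assertive update is then plain set intersection. The world labels and the function name below are hypothetical, chosen only for illustration, and are not part of Farkas's formalism.

```python
# Minimal sketch of assertive context change, assuming worlds are
# represented by arbitrary labels and propositions by sets of worlds.
# All names here are illustrative assumptions, not Farkas's notation.

def assertive_update(context_set: frozenset, proposition: frozenset) -> frozenset:
    """Return the output context set W_c' = W_c ∩ p."""
    return context_set & proposition


# Hypothetical toy model: a context set of four worlds and a
# proposition p that holds in w1, w2 and (outside the context) w5.
W_c = frozenset({"w1", "w2", "w3", "w4"})
p = frozenset({"w1", "w2", "w5"})

W_c_prime = assertive_update(W_c, p)
print(sorted(W_c_prime))  # the worlds of the input context where p holds
```

On this sketch the output context is always a subset of the input context, which is the sense in which the context set is narrowed.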
Farkas provides an analysis of assertion in embedded contexts which predicts that positive epistemic predicates like believe or know take indicative complements, as these complements are asserted relative to the matrix subject’s epistemic state.12 Predicates of assertion (‘say’, ‘assert’) and of fiction (‘dream’, ‘imagine’) similarly introduce complements which are assertively added to the embedded speech context, and also take indicative complements. On the other hand, complements to desideratives (‘want’, ‘wish’, ‘desire’) and directives (‘command’, ‘direct’, ‘request’) are not assertive. Rather than eliminating 12 Predicates like believe take subjunctive complements in Italian; see Giorgi & Pianesi 1997, among many others, for discussion.
worlds in the context set where the complement is false, these predicates eliminate worlds in the context set which are low on an evaluative ranking.13 Thus, these predicates take the subjunctive: (36)
Maria vrea să-i răspundă Maria wants subj-cl answer.sbjn ‘Maria wants to answer him.’
(Romanian; Farkas 2003: 2)
Giannakidou (1997, 1998, 2009) offers an alternative characterization of the distribution of the subjunctive, according to which it appears in nonveridical contexts, while indicative appears in veridical contexts. The relevant definition is given in (37): (37)
A propositional operator F is veridical iff from the truth of F p we can infer that p is true relative to some individual x (i.e., in some individual x’s epistemic model) . . . If inference to the truth of p under F is not possible, F is nonveridical. (Giannakidou 2009: 1889)
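The definition in (37) can be checked by brute force in a small finite model. The sketch below makes several simplifying assumptions that are not in Giannakidou's text: propositions are sets of worlds, each operator is evaluated at a single fixed world, and "p is true in x's epistemic model" is read as "p holds throughout the set of worlds making up that model". The model, the operator definitions and all names are hypothetical.

```python
from itertools import chain, combinations

WORLDS = ("w1", "w2", "w3")

# Hypothetical toy model for one attitude holder x.
DOX = frozenset({"w1", "w2"})  # worlds compatible with x's beliefs
BUL = frozenset({"w3"})        # worlds that best satisfy x's desires


def believes(p: frozenset) -> bool:
    # 'x believes p' is true iff p holds in all of x's belief worlds.
    return DOX <= p


def wants(p: frozenset) -> bool:
    # 'x wants p' is true iff p holds in all of x's most desired worlds.
    return BUL <= p


def all_propositions(worlds):
    # Every subset of the world set counts as a proposition.
    items = list(worlds)
    subsets = chain.from_iterable(
        combinations(items, n) for n in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def is_veridical(operator, epistemic_models) -> bool:
    """F is veridical iff whenever F(p) is true, p holds throughout
    at least one individual's epistemic model."""
    return all(
        any(model <= p for model in epistemic_models)
        for p in all_propositions(WORLDS)
        if operator(p)
    )


print(is_veridical(believes, [DOX]))
print(is_veridical(wants, [DOX]))
```

In this toy model the belief operator comes out veridical and the desire operator nonveridical, in line with the division between indicative-taking and subjunctive-taking predicates described in the text.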
According to this analysis, the division between indicative-taking and subjunctive-taking predicates relies on whether at least one epistemic agent is committed to the truth of the embedded proposition. Giannakidou’s approach predicts a similar division between indicative- and subjunctive-taking predicates to Farkas’s. In Modern Greek, the indicative is found in complements to predicates of assertion or fiction, epistemics, factives and semi-factives. The subjunctive is found in complements to volitionals, directives, modals, permissives, negatives, and verbs of fear (Giannakidou 2009: 9).14 An approach which aims to derive mood selection directly from the semantics of subordinating predicates is that of Villalta (2009). Villalta argues 13 The complements of desideratives are also not ‘decided’ relative to their context set, which is what is actually crucial here for Farkas (2003). Farkas proposes an Optimality Theory account involving the two constraints in (i): (i) *SUBJ/+Decided
*IND/-Assert
Different rankings of these two constraints give rise to different mood choices in Romanian vs. French for emotive factive predicates like ‘be sorry/happy’, ‘regret’. Emotive factives are +Decided but -Assertive, and take the indicative in Romanian and the subjunctive in French. 14 Giannakidou (2009) proposes that the Modern Greek subjunctive complementizer na contributes temporal semantics (introducing a ‘now’ variable). The generalization is still that subjunctive appears in non-veridical contexts; see Giannakidou 2009 for details.
that subjunctive-selecting predicates are those whose embedded propositions are compared to contextual alternatives on a scale encoded by the predicate. The contribution of the subjunctive is to evaluate the contextual alternatives. Quer (1998, 2001), looking mainly at Catalan and Spanish, argues that the subjunctive signals a shift in the model of the evaluation of the truth of the proposition. For unembedded assertions, the anchor is the Speaker and the model is the epistemic model of the Speaker. Operators which introduce subjunctive introduce buletic models, or other models which create comparative relations among worlds. This predicts we will find subjunctive in purpose clauses, and predicts indicative/subjunctive alternations in restrictive relative clauses, concessives, and free relatives. Quer (2009) also discusses indicative/subjunctive alternations in conditionals, claiming that indicative appears in protases that are ‘realistic in the sense that they quantify over worlds which are close enough to the actual one’ (2009: 1780). Subjunctive is used when the worlds are further away from the actual one or even disjoint from it. An approach to mood which draws on notions from noun phrase semantics is offered by Baker & Travis (1997). Baker and Travis argue that in Mohawk, mood marks a division between ‘verbal specificity’ (‘factive’ mood) and Kamp/Heim-style indefiniteness (two variants of non-factive mood, previously called the ‘future’ and the ‘optative’). Indefinite/non-factive mood appears in future contexts, in past habituals, in negative clauses, under the verbs ‘promise’ and ‘want’, and in free relatives with a non-specific reading. What links all these indefinite-mood environments, according to Baker and Travis, is the same feature that characterizes indefinite noun phrases in the Kamp/Heim system: a free variable (in the Mohawk case, an event variable) which undergoes existential closure in the scope of various operators. 
This ends our brief tour through some major formal approaches to the subjunctive.15 The reader is referred to Portner (2003) for further overview and discussion. In the next sub-section I show that the St’át’imcets subjunctive does not behave like the Indo-European or Mohawk subjunctives, and that a new approach is required. 15 I defer discussion of Portner’s (1997) analysis to Section 5, since I will be adapting Portner’s approach for St’át’imcets.
3.1 The St’át’imcets subjunctive is not amenable to existing approaches
The St’át’imcets subjunctive differs from familiar subjunctives in both its distribution and semantic effects. Although there are some initial similarities, such as the fact that both St’át’imcets and Indo-European subjunctives can be used to express wishes and hopes, St’át’imcets mood displays no sensitivity to the choice of matrix predicate. Thus, unlike in Romance or Greek, predicates of assertion, belief and fiction are not differentiated from desideratives or directives. All attitude verbs in St’át’imcets take the indicative, as illustrated for a representative range in (38).16,17 (38)
a.
tsut k=Laura kw=s=t’iq=Ø k=John say det=Laura det=nom=arrive=3indic det=John ‘Laura said that John came.’
b.
tsut-ánwas k=Laura kw=s=t’iq=Ø k=John say-inside det=Laura det=nom=arrive=3indic det=John ‘Laura thought that John came.’
c.
zwát-en-as k=Laura kw=s=t’iq=Ø k=John know-dir-3erg det=Laura det=nom=arrive=3indic det=John ‘Laura knew that John came.’
16 Interestingly, the same is not true of the related language Skwxwú7mesh (Squamish). In Skwxwú7mesh, the subjunctive (glossed as ‘conjunctive’; see fn. 2) is obligatory under ‘tell someone to do something’ (as in (i)), but is optional under ‘I think’, depending on whether the speaker knows that the event did not take place (ii-iii) (all data from Peter Jacobs, p.c.). (i) chen tsu-n-Ø-Ø mi as uys I tell-dir-dat-3obj come 3conj come.inside ‘I told him to come inside.’ (ii) chen ta7aw’n kwi s-Ø-s mi uys I think det nom-real-3poss come come.inside ‘I think he came inside.’ (iii) chen ta7aw’n k’-as mi uys I think irr-3conj come come.inside ‘I thought he came inside (but then I found out that he’s still outside playing).’ Jacobs (1992) analyzes the mood distinction in Skwxwú7mesh as encoding speaker certainty, which suggests that it differs from the St’át’imcets mood system. 17 The expected subject inflection in the embedded clauses in (38) would actually be possessive =s; see van Eijk 1997 and Davis 2006. However, many modern speakers prefer to omit the possessive ending and to use matrix indicative =Ø in these contexts. This does not affect the point at hand, as the variation is between two forms of indicative marking.
d.
kw7íkwl’acw k=Laura kw=s=t’iq=Ø k=John dream det=Laura det=nom=arrive=3indic det=John ‘Laura dreamt that John came.’
e.
xát’-min’-as k=Laura kw=s=t’iq=Ø k=John want-red-3erg det=Laura det=nom=arrive=3indic det=John ‘Laura wanted John to come.’
f.
tsa7cw k=Laura kw=s=t’iq=Ø k=John glad det=Laura det=nom=arrive=3indic det=John ‘Laura was happy that John came.’
g.
tsún-as k=Laura k=John kw=s=ts7as=Ø say(dir)-3erg det=Laura det=John det=nom=come=3indic ‘Laura told John to come.’18
The St’át’imcets subjunctive is also not used under negated verbs of belief or report, as it is in many European languages (cf. Palmer 2006: 116). Compare Spanish (39a) with St’át’imcets (39b) and (39c). (39)
a.
no creo que aprenda not I.think that learn+3sg+pres+sbjn ‘I don’t think that he is learning.’ (Spanish; Palmer 2006: 117)
b.
cw7aoz kw=en=tsut-ánwas kw=s=zwátet-cal=s neg det=1sg.poss=say-inside det=nom=know-act=3poss ‘I don’t think that he is learning.’
c.
cw7aoz kw=s=tsut=s kw=s=Aggie neg det=nom=say=3poss det=nom=Aggie kw=s=t’cum=s i=gáp=as det=nom=win=3poss when.past=evening=3sbjn ‘Aggie didn’t say she won last night.’
Nor does the St’át’imcets subjunctive give rise to interpretive differences inside relative clauses. In some Indo-European languages, an indicative/subjunctive contrast in restrictive relatives gives rise to a distinction which has variously been analyzed as referential/attributive, specific/nonspecific, or wide-scope/narrow-scope (see Rivero 1975, Farkas 1992, Giannakidou 1997, Beghelli 1998, Quer 2001, among many others). This is illustrated in 18 The predicate in (38g) differs from that in (38a)–(38f) because the ‘ordering’ environment in (38g) requires an unergative embedded verb.
(40) for Catalan. Quer’s analysis of these examples involves a shifting of the model in which the descriptive condition in the relative clause is interpreted; the effect is one of apparent ‘wide-scope’ for the descriptive condition in the indicative (40a), as opposed to in the subjunctive (40b). (40)
a.
necessiten un alcalde [que fa grans need.3pl a mayor that make.indic.prs.3sg big inversions] investments ‘They need a mayor that makes big investments.’ (Catalan; Quer 2001: 90)
b.
necessiten un alcalde [que faci grans need.3pl a mayor that make.sbjn.prs.3sg big inversions] investments ‘They need a mayor that makes big investments.’ (Catalan; Quer 2001: 90)
In St’át’imcets, nominal restrictive relatives uniformly take indicative marking, as shown in (41). The distinction which in Catalan is encoded by mood is achieved in St’át’imcets by means of determiner choice (see Matthewson 1998, 1999 for analysis). (41)
a.
wa7 xat’-min’-ítas ti=kúkwpi7=a wa7 impf want-red-3pl.erg det=chief=exis impf ka-nuk’wa7-s-tanemwít-a k=wa=s mays circ-help-caus-3pl.pass-circ det=impf=3poss fix ku=tsetsítcw det=houses ‘They need a (particular) chief who can help them build houses.’ [wide-scope indefinite]
b.
wa7 xat’-min’-ítas ku=kúkwpi7 wa7 impf want-red-3pl.erg det=chief impf ka-nuk’wa7-s-tanemwít-a k=wa=s mays circ-help-caus-3pl.pass-circ det=impf=3poss fix ku=tsetsítcw det=houses ‘They need a(ny) chief who can help them build houses.’ [narrow-scope indefinite]
The mood effects seen in conditionals in some Indo-European languages are also absent in St’át’imcets. The antecedents of both notionally indicative and subjunctive conditionals are obligatorily marked with the subjunctive, as shown in (42), a paradigm borrowed from Quer 2009: 1780. Although there are ways to distinguish the different types of conditionals, they do not involve an indicative-subjunctive mood alternation. (42)
a.
Context: I’m looking for John. You say: lh=7áts’x-en=an, nílh=t’u7 s=qwál’-en-tsin comp=see-dir=1sg.sbjn foc=prt nom=tell-dir-2sg.obj ‘If I see him, I’ll tell you.’
b.
Context: I’m looking for John, and I suspect you know where he is but you haven’t been telling me. You say: lh=7ats’x-en=án=ka, sqwal’-en-tsín=lhkan=kelh comp=see-dir=1sg.sbjn=irr tell-dir-2sg.obj=1sg.indic=fut ‘If I saw him, I would tell you.’
c.
Context: I was looking for John, but he left town before I could find him. You say: lh=7ats’x-en=án=ka=tu7 comp=see-dir=1sg.sbjn=irr=then qwal’-en-tsín=lhkan=ka tell-dir-2sg.obj=1sg.indic=irr ‘If I had seen him, I would have told you.’
The St’át’imcets subjunctive is also not like the Mohawk one. Unlike in Mohawk, St’át’imcets futures take the indicative, as shown in (43); so do past habituals, as shown in (44), and plain negatives, as in (45). (43)
a.
ats’x-en-tsí=lhkan=kelh lh=nátcw=as see-dir-2sg.obj=1sg.indic=fut comp=one.day.away=3sbjn ‘I’ll see you tomorrow.’
b.
*ats’x-en-tsín=an=kelh lh=nátcw=as see-dir-2sg.obj=1sg.sbjn=fut comp=one.day.away=3sbjn ‘I’ll see you tomorrow.’
(44)
a.
wa7=lhkalh=wí7=tu7 n-záw’-em ku=qú7 impf=1pl.indic=emph=then loc-get.water-mid det=water lhel=ta=qú7qu7=a múta7 lhel=ta=tswáw’cw=a from=det=water(pl)=exis and from=det=creek=exis ‘We used to fetch water from the spring and the creek.’ (Matthewson 2005: 370)
b.
*wa7=at=wí7=tu7 n-záw’-em ku=qú7 impf=1pl.sbjn=emph=then loc-get.water-mid det=water lhel=ta=qú7qu7=a múta7 lhel=ta=tswáw’cw=a from=det=water(pl)=exis and from=det=creek=exis ‘We used to fetch water from the spring and the creek.’
(45)
a.
áy=t’u7 kw=en=gúy’t ku=pála7 sgap neg=prt det=1sg.poss=sleep det=one evening ‘I didn’t sleep one night.’ (Matthewson 2005: 267)
b.
*áy=t’u7 kw=s=gúy’t=an ku=pála7 sgap neg=prt det=nom=sleep=1sg.sbjn det=one evening ‘I didn’t sleep one night.’
Finally, there are the cases where the St’át’imcets subjunctive does appear, with a predictable meaning difference, which are not attested in other languages. These include the use of the St’át’imcets subjunctive to weaken an imperative to a polite request, or to help turn a question into a statement of uncertainty (see examples in (13)–(15) and (16)–(20) above). I will argue below that in spite of these major empirical differences between the St’át’imcets subjunctive and that of familiar languages, the basic framework for mood semantics advanced by Portner (1997) can be adapted to capture all the St’át’imcets facts. This will support Portner’s proposal that moods are dependent on modals and place restrictions on the modal environments in which they appear.
4 Basic framework: Portner 1997
Portner’s (1997) leading idea is that moods place presuppositions on the modal environment in which they appear. More precisely, moods typically restrict properties of the accessibility relation associated with a governing modal operator (see also Portner 2003: 64). The modal operator may be
Cross-linguistic variation in modality systems: The role of mood
provided by a higher attitude verb or modal; it may also, in unembedded situations, be provided by context. For illustration, let us first see how Portner analyzes English ‘mood-indicating may’. In each of the examples in (46), the may is not the ordinary modal may; it is not asserting possibility. (46b), for example, does not mean ‘it is possible that it is possible that Sue wins the race.’ (46)
a.
Jack wishes that you may be happy.
b.
It is possible that Sue may win the race.
c.
May you have a pleasant journey!
(Portner 1997: 190)
Portner argues that mood-indicating may presupposes that p is doxastically possible (possible according to someone’s beliefs). For example, (46a) presupposes that Jack believes it is possible for you to be happy. He provides the analysis in (47). (47)
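For any reference situation $r$, modal force $F$, and modal context $R$, $\llbracket \textit{may}_{dep}(\varphi) \rrbracket^{r,F,R}$ is only defined if $\varphi$ is possible with respect to $\mathrm{Dox}_{\alpha}(r)$, where $\alpha$ is the denotation of the matrix subject.
When defined, $\llbracket \textit{may}_{dep}(\varphi) \rrbracket^{r,F,R} = \llbracket \varphi \rrbracket^{r,F,R}$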
(Portner 1997: 201)
Portner further argues that there are actually two mood-indicating may’s, with slightly different properties. Mood-indicating may under wish, pray, etc. (as in (46a)) or in unembedded clauses (as in (46c)) has an extra requirement: it presupposes that the accessibility relation R is buletic (deals with somebody’s wishes or desires). The discussion of mood-indicating may illustrates an important aspect of Portner’s analysis, namely that moods place presuppositions on the modal accessibility relation (a type of conversational background). With English mood-indicating may, there is a doxastic and sometimes a buletic restriction. For the English mandative subjunctive, which appears in imperatives as well as in embedded contexts as in (48), R must be deontic, as shown in (49). (48)
Mary demands that you join us downstairs at 3pm. (Portner 1997: 202)
(49)
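For any reference situation $r$, modal force $F$, and modal context $R$, $\llbracket \textit{m-subj}(\varphi) \rrbracket^{r,F,R}$ is only defined if $R$ is a deontic accessibility relation.
When defined, $\llbracket \textit{m-subj}(\varphi) \rrbracket^{r,F,R} = \llbracket \varphi \rrbracket^{r,F,R}$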
(Portner 1997: 202)
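The shape shared by (47) and (49), an identity function with a definedness condition, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: propositions are modeled as sets of worlds, presupposition failure is encoded as `None`, and the names `may_dep`, `m_subj`, `dox` and `relation_kind` are hypothetical labels introduced here, not part of Portner's formalism.

```python
from typing import FrozenSet, Optional

World = str
Proposition = FrozenSet[World]  # the set of worlds in which the proposition is true


def may_dep(phi: Proposition, dox: FrozenSet[World]) -> Optional[Proposition]:
    """Mood-indicating 'may', following (47): defined only if phi is possible
    with respect to the matrix subject's belief worlds `dox`.
    None encodes presupposition failure (an assumption of this sketch)."""
    if not (phi & dox):
        return None  # phi holds in no belief world: undefined
    return phi  # when defined, the mood adds nothing to the content


def m_subj(phi: Proposition, relation_kind: str) -> Optional[Proposition]:
    """Mandative subjunctive, following (49): defined only if the accessibility
    relation is deontic. `relation_kind` is a hypothetical string label."""
    if relation_kind != "deontic":
        return None
    return phi


# Toy data: 'you are happy' is true in w1 and w2; Jack's belief worlds are w2 and w3.
happy: Proposition = frozenset({"w1", "w2"})
jack_beliefs: FrozenSet[World] = frozenset({"w2", "w3"})
```

With these toy values, `may_dep(happy, jack_beliefs)` is defined because `w2` is a belief world in which the proposition holds, whereas `m_subj(happy, "epistemic")` comes out undefined.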
For Italian moods, Portner claims that R is restricted to being (non-)factive.19 The idea that moods restrict modal conversational backgrounds is common to several other modal-based analyses of mood (e.g., Farkas 1992 and Giorgi & Pianesi 1997),20 and is also found in James 1986. What James calls ‘manners of representation’ are root vs. epistemic conversational backgrounds:
The ambiguity of the modal auxiliaries . . . supports the hypothesis that there are two separate manners of representation. Moods . . . signify manners of representation. They are not ambiguous, however; they signify one modality or the other (James 1986: 15).
In the analysis to follow, I will adopt Portner’s idea that moods place restrictions on a governing modal operator. I will argue that the empirical differences between the St’át’imcets subjunctive and Indo-European subjunctives derive from the fact that the former restricts the conversational background of the modal operator in such a way that the modal force is weakened.
5 Adapting Portner’s approach for the St’át’imcets subjunctive
I deal here only with the constructions where the subjunctive has a semantic effect; I will not address the cases of obligatory subjunctive agreement which were presented in subsection 2.2.21 My analysis will account for all meaningful uses of the St’át’imcets subjunctive except the two uses which contain the particle t’u7. See Section 8 for some discussion of the t’u7-constructions.
19 Interestingly, the Italian indicative imposes a modal force restriction as well as a conversational background restriction; it is only used with a force of necessity (Portner 1997: 197).
20 According to Giorgi and Pianesi, the subjunctive indicates that the ordering source is nonempty; this is a restriction on a conversational background.
21 The analysis presented below is actually compatible with the obligatory presence of the subjunctive in if -clauses introduced by lh=, and may even help to explain why lh= obligatorily selects the subjunctive when it means ‘if’, but selects indicative when it means ‘before’. Thanks to Henry Davis for discussion of this point, and see Davis 2006: chapter 26. (See also van Eijk 1997: 217, although van Eijk analyzes the subjunctive-inducing lh= as distinct from (e)lh= ‘before’.) As for the other obligatory cases of subjunctive, these may be grammaticized, semantically bleached relics of original meaningful uses, aided by the fact that subjunctive marking is intertwined with person agreement.
5.1 The St’át’imcets subjunctive presupposes rather than asserts a modal semantics
The first thing to establish is that like Portner’s moods, the St’át’imcets subjunctive does not itself assert a modal semantics, but is dependent on a governing modal operator. One piece of evidence for this is that the St’át’imcets subjunctive must co-occur with an overt modal in almost all its uses. Of the seven uses of the subjunctive being analyzed here, five of them have an overt modal (the deontics, ‘pretend’, wh-questions, yes-no questions, ignorance free relatives), one of them is plausibly analyzed as containing a covert modal (imperatives), and only one is non-modal (plain assertions). As noted above, the addition of the subjunctive to plain assertions is extremely restricted and at least semi-conventionalized. If the subjunctive were itself independently modal, it would be difficult to explain the minimal contrasts in (50)–(51).22 (50)
a.
*gúy’t=as ti=sk’úk’wm’it=a sleep=3sbjn det=child=exis Attempted: ‘I hope the child sleeps.’
b.
gúy’t=as=ka ti=sk’úk’wm’it=a sleep=3sbjn=deon det=child=exis ‘I hope the child sleeps.’
(51)
a.
*skalúl7=acw: saq’w knáti7 múta7 em7ímnem owl=2sg.sbjn fly deic and make.animal.noise ‘Pretend to be an owl: fly around and hoot.’
b.
skalúl7=acw=ka: saq’w knáti7 múta7 em7ímnem owl=2sg.sbjn=deon fly deic and make.animal.noise ‘Pretend to be an owl: fly around and hoot.’
Furthermore, just like with English mood-indicating may, the interpretation of St’át’imcets subjunctive clauses indicates that the mood does not itself contribute modal semantics. For example, (50b) does not mean ‘It should be the case that the child should sleep’. The St’át’imcets subjunctive also patterns morphosyntactically like a mood rather than like real modals in the language. As shown above, the subjunctive is obligatorily selected by some complementizers, unlike modals. The subjunctive is also fused with subject marking into a full paradigm, unlike the modals, which are independent second-position clitics.23 I therefore conclude that the St’át’imcets subjunctive does not itself introduce a modal operator, but requires one in its environment.
22 As noted above, Portner’s analysis does allow for unembedded uses of non-indicative moods, with the modal accessibility relation being provided by context. So there is no problem with the cases where the St’át’imcets subjunctive can appear without a c-commanding modal (as in (5)–(6)). Of course, we would eventually like to explain when these unembedded subjunctives can and cannot appear. Portner (1997: 201) notes for mood-indicating may and the mandative subjunctive that ‘Neither of these have a completely predictable distribution, in that neither occurs in every context in which a purely semantic account would predict that it could . . . it must be admitted that lexical and syntactic idiosyncracies come into play.’
5.2 The St’át’imcets subjunctive does not presuppose a particular conversational background
The St’át’imcets subjunctive differs from most Indo-European moods in that it cannot be analyzed as being restricted to a certain type of conversational background. This is illustrated by the fact that it allows deontic, buletic or epistemic uses. Deontic conversational backgrounds arise with imperatives, as in (52) or (14b), repeated here in (53): (52)
ets7á=has=(malh) lh=xílh-ts=al’ap deic=3sbjn=(adhort) comp=do-caus=2pl.sbjn ‘Could you do it like this, you folks?’
(53)
nás=acw=malh áku7 pankúph=a go=2sg.sbjn=adhort deic Vancouver=exis ‘You could go to Vancouver.’
Buletic conversational backgrounds arise with the modal ka: (54)
plan=as=ká=ti7=t’u7 wa7 máys-n-as already=3sbjn=deon=demon=prt impf fix-dir-3erg ‘I wish he had fixed that already.’
(55)
guy’t=ás=ka ti=sk’úk’wm’it=a sleep=3sbjn=deon det=child=exis ‘I hope the child sleeps.’
23 Or in one case, a circumfix on the verb; see Davis et al. 2009.
And epistemic conversational backgrounds arise with questions. (56)
nká7=as=kelh lh=cúz’=acw nas where=3sbjn=fut comp=going.to=2sg.sbjn go ‘Wherever will you go?’ / ‘I wonder where you are going to go now.’ (adapted from Davis 2006: chapter 24)
(57)
lan=as=há=k’a kwán-ens-as already=3sbjn=ynq=infer take-dir-3erg ni=n-s-mets-cál=a det.abs=1sg.poss-nom=write-act=exis ‘I wonder if she’s already got my letter.’ / ‘I don’t know if she got my letter or not.’
These data suggest that the St’át’imcets subjunctive is not analyzable in the same way as the European moods discussed by Portner (1997), which hardwire a restriction to a particular type of conversational background.
5.3 Instead, the St’át’imcets subjunctive functions to weaken the modal force
The core idea of my proposal is that the St’át’imcets subjunctive restricts its governing modal only in such a way as to weaken the force of the proposition expressed. The intuition that the St’át’imcets subjunctive weakens the proposition it adds to was already expressed by Davis (2006: chapter 24):
The best way to characterize this meaning difference is in terms of the ‘force’ of a sentence. With ordinary indicative subjects, a sentence expresses a straightforward assertion, question or command; but with subjunctive subjects, the effect is to weaken the force of the sentence, so that an assertion becomes a wish, a question becomes a conjecture, and a command becomes a request.
The important question is what exactly is meant by ‘weakening’ in this context, and how to derive the various effects of the subjunctive in a unified way. I will claim that the St’át’imcets subjunctive restricts the conversational background of a governing modal in such a way that the modal imparts a force no stronger than weak necessity. Since there are no modals which lexically encode quantificational force in St’át’imcets, this will mean that the subjunctive must appear in the scope of a variable-force modal, and will restrict it to a weakened interpretation.
6 Analysis
The idea to be pursued is that the St’át’imcets subjunctive restricts the domain of quantification of a c-commanding modal, so that the interpretation which obtains is weaker than pure necessity.24 Rullmann et al. (2008) argue that St’át’imcets possesses no modals which are lexically restricted for a pure necessity reading (see also Matthewson et al. 2007 and Davis et al. 2009). Instead, all St’át’imcets modals seem to allow both weak and strong interpretations (see (3) above, and see the references cited for many more examples). So, what we need to say is that the subjunctive forces an already potentially weak c-commanding modal to have a weak reading. In order to see how this will work, I first very briefly review the basics of a Kratzerian analysis of modals, and then outline how modals in St’át’imcets are analyzed. We will then add the subjunctive. Modals in a standard analysis introduce quantifiers over possible worlds. The set of worlds quantified over is narrowed down by two conversational backgrounds. First, it is narrowed down by the modal base, and then it is ordered and further narrowed down by the ordering source. The modal base and the ordering source are both usually provided by context in English, although there are systematic contributions of tense and aspect to the conversational background (see e.g., Condoravdi 2002 for discussion). A simple example is given in (58). (58)
Chris must do his homework.
Modal base (circumstantial): The set of worlds in which the relevant facts are the same as in the actual world (e.g., we ignore worlds where Chris is not in school).
Ordering source (normative): Orders worlds in the modal base so that the best worlds are those which come closest to the ideal represented by the school’s homework regulations.
Universal quantification: In all the best worlds, Chris does his homework.
24 I would like to thank David Beaver and three anonymous reviewers for helping me clarify aspects of the analysis and its presentation.
Rullmann et al. (2008) argue that there are two differences between English universal modals like must and St’át’imcets modals. First, the St’át’imcets modals place presuppositions on the conversational backgrounds. Second, the set of best worlds is further narrowed down by a choice function which picks out a potentially proper subset of the best worlds to be quantified over. This can lead to a weaker reading, depending on context. The idea is illustrated informally in (59).25 (59)
gúy’t=ka ti=sk’úk’wm’it=a
sleep=deon det=child=exis
‘The child must/should/can sleep.’
Modal base (presupposed to be circumstantial): Worlds in which the relevant facts about our family are the same as in the actual world.
Ordering source (presupposed to be normative): The best worlds are those in which my desire for an early night is fulfilled.
Choice function: Picks out a potentially proper subset of the best worlds.
Universal quantification: In all worlds in the subset of the best worlds picked out by the choice function, the child sleeps.
Since the quantification is over a potentially proper subset of the best worlds, sentences like (59) can be interpreted with any strength ranging from a pure possibility (‘The child can/may sleep’) to a strong necessity (‘The child must sleep’). The apparent variable quantificational force of St’át’imcets modals is thus derived not by ambiguity in the quantifier itself, but by restricting the size of the set of worlds quantified over by the universal quantifier. The larger the subset of the best worlds selected by the choice function, the stronger the proposition expressed. As a limiting case, the choice function may be the identity function. This results in a reading that is equivalent to the standard analysis of strong modals like must in English. Now we turn to the subjunctive. In order to capture the idea that the subjunctive weakens the c-commanding modal, I analyze the subjunctive as presupposing that at least one world in the set of best worlds is a world in which the embedded proposition is false. This will prevent the choice function from being the identity function.26 This is illustrated informally for a deontic case in (60).
25 A very sensible suggestion that we should replace Rullmann et al.’s choice function with an(other) ordering source has been made independently by Kratzer (2009), Portner (2009), and Peterson (2009, 2010). I will in fact do this below when I compare the current analysis to that of von Fintel & Iatridou (2008).
(60)
guy’t=ás=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’
Modal base (presupposed to be circumstantial): Worlds in which the relevant facts about our family are the same as in the actual world.
Ordering source (presupposed to be normative): The best worlds are those in which my desire for an early night is fulfilled.
Choice function (must pick out a proper subset of the best worlds, to avoid a contradiction with the presupposition of the subjunctive): The very best worlds are those in which my spouse’s desire for an early night is also fulfilled.
Universal quantification: All the very best worlds are worlds in which the child sleeps.
(59) allows a strong interpretation which (60) disallows. If the choice function in (59) is the identity function, the speaker will be satisfied only if the child sleeps (‘in all the worlds where my desire for an early night is fulfilled, the child sleeps’). In (60), the speaker will certainly be satisfied if the child sleeps, but there are also other ways to make him/her happy. (60) asserts only that ‘in all the worlds where my and my spouse’s desires for an early night are fulfilled, the child sleeps’, so the speaker’s desires may be satisfied if the speaker’s spouse looks after the child while the speaker goes to sleep. The requirement that (60) places on the child is thus weaker than a strong necessity. In the remainder of this section I provide a more formal implementation of this idea, and in Section 7 I show how the analysis accounts for a wide range of uses of the St’át’imcets subjunctive, including imperative-weakening, question-weakening, and ignorance free relatives.
26 Thanks to Hotze Rullmann (p.c.) for discussion of this point. The requirement that p be false in at least one of the best worlds appears reminiscent of a nonveridicality-style analysis, and there may be some deep significance to this. However, the analyses are different. For Giannakidou, the issue is always epistemic, as veridicality is defined in terms of a truth entailment in an individual’s epistemic model; see (37). Thus, subjunctive is predicted under verbs like ‘want’, as propositions under ‘want’ are not entailed to be true in any individual’s epistemic model. Under my analysis, the subjunctive has an anaphoric modal base and ordering source. I will show in subsection 7.5 that my analysis correctly predicts the indicative under verbs like ‘want’ in St’át’imcets.
I adopt the following basic definitions from von Fintel & Heim 2007. (61) shows the ordering of worlds according to how well they satisfy the set of propositions in the ordering source, and (62) shows how the best worlds are selected. (61)
Given a set of worlds X and a set of propositions P , define the strict partial order
(62)
For a given strict partial order
The best worlds are those for which there are no worlds closer to the ideal than they are. The analysis of English must is given in (63). must takes as arguments a modal base, an ordering source and a proposition, and asserts that in all the best worlds in the modal base, as defined by the ordering source, the proposition is true.27,28 (63)
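$\llbracket \textit{must} \rrbracket^{c,w} = \lambda h_{\langle s,\langle st,t\rangle\rangle}.\lambda g_{\langle s,\langle st,t\rangle\rangle}.\lambda q_{\langle s,t\rangle}.\forall w' \in \max_{g(w)}(\cap h(w)) : q(w') = 1$ (von Fintel & Heim 2007: 55)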
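The ingredients of (63) can be made concrete with a small Python model. The sketch below is illustrative only: it assumes a finite set of worlds, treats propositions as sets of worlds, takes the modal base and ordering source as already evaluated at the world of evaluation, and spells out the ordering as "satisfies a proper superset of the ordering-source propositions", which is one common way of defining such an order and is an assumption here. The world names and the helper names (`satisfied`, `strictly_better`, `best_worlds`, `intersect`, `must`) are hypothetical.

```python
from typing import FrozenSet, Iterable, Set

World = str
Proposition = FrozenSet[World]  # the set of worlds in which the proposition is true


def satisfied(w: World, ordering_source: Iterable[Proposition]) -> Set[Proposition]:
    """The ordering-source propositions that are true in world w."""
    return {p for p in ordering_source if w in p}


def strictly_better(w1: World, w2: World, ordering_source: Iterable[Proposition]) -> bool:
    """w1 is closer to the ideal than w2 (assumed definition: w1 satisfies a
    proper superset of the ordering-source propositions that w2 satisfies)."""
    source = list(ordering_source)
    return satisfied(w2, source) < satisfied(w1, source)


def best_worlds(worlds: Iterable[World], ordering_source: Iterable[Proposition]) -> Set[World]:
    """Worlds in the domain for which no world in the domain is strictly better."""
    domain, source = set(worlds), list(ordering_source)
    return {w for w in domain if not any(strictly_better(v, w, source) for v in domain)}


def intersect(universe: Iterable[World], props: Iterable[Proposition]) -> Set[World]:
    """Intersection of a set of propositions, relative to a universe of worlds."""
    result = set(universe)
    for p in props:
        result &= p
    return result


def must(universe, modal_base, ordering_source, q: Proposition) -> bool:
    """Universal quantification over the best worlds in the modal base."""
    return all(w in q for w in best_worlds(intersect(universe, modal_base), ordering_source))


# Toy version of (58): Chris is in school in w1-w3 and does his homework in w1, w2, w4.
universe = {"w1", "w2", "w3", "w4"}
in_school: Proposition = frozenset({"w1", "w2", "w3"})
homework: Proposition = frozenset({"w1", "w2", "w4"})
best = best_worlds(intersect(universe, [in_school]), [homework])
```

In this toy model the best worlds are `w1` and `w2`, so `must(universe, [in_school], [homework], homework)` holds, while a proposition true only in `w1` would not be a necessity.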
The analysis of St’át’imcets normative ka is given in (64). ka takes as arguments a modal base, an ordering source, and a proposition. $f_c$ represents the contextually given choice function.
27 Nothing crucial hinges on having the conversational backgrounds present in the syntax (as in von Fintel & Heim 2007) rather than being parameters of interpretation (as in Portner 1997). However, the syntactic version may have a potential advantage in enforcing the required anaphoricity of the conversational backgrounds once we bring in the subjunctive. In Rullmann et al.’s (2008) analysis of St’át’imcets modals, the choice function is also a syntactic argument of the modal. Following the suggestion of an anonymous reviewer, I have changed this here, but again, nothing crucial hinges on the decision.
28 As an anonymous reviewer reminded me, English must also encodes restrictions on its modal base and ordering source, parallel to (but obviously different from) those defined for ka in (64). See for example von Fintel & Gillies 2010 and Matthewson 2010, to appear for discussion.
(64)
$\llbracket \textit{ka}(h)(g) \rrbracket^{c,w}$ is only defined if $h$ is a circumstantial modal base and $g$ is a normative ordering source.
If defined, $\llbracket \textit{ka}(h)(g) \rrbracket^{c,w} = \lambda q_{\langle s,t\rangle}.\forall w' \in f_c(\max_{g(w)}(\cap h(w))) : q(w') = 1$ (adapted from Rullmann et al. 2008: 340)
Now for the subjunctive. As shown in (65), the subjunctive does not affect truth conditions but merely enforces a weaker-than-necessity reading of a modal in the environment. The subjunctive does not itself introduce any conversational backgrounds; $h$ and $g$ in (65) are free variables. I assume that this enforces anaphoricity: the mood must be c-commanded by a modal which introduces $h$ and $g$.29
(65)
$\llbracket \textit{sbjn}(\varphi) \rrbracket^{c,w}$ is only defined if $\exists w' \in \max_{g(w)}(\cap h(w))[\varphi(w') = 0]$.
When defined, $\llbracket \textit{sbjn}(\varphi) \rrbracket^{c,w} = \lambda w'.\llbracket \varphi \rrbracket^{c,w'}$
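The interaction between the choice function in (64) and the presupposition in (65) can be checked on a toy model. The following Python sketch makes several simplifying assumptions: the set of best worlds is taken as given, propositions are sets of worlds, presupposition failure is encoded as `None`, and the definedness conditions of ka on the modal base and ordering source are left out. The function names `ka`, `sbjn_defined` and `ka_sbjn`, as well as the world labels, are hypothetical.

```python
from typing import Callable, Optional, Set

World = str
ChoiceFunction = Callable[[Set[World]], Set[World]]


def ka(best: Set[World], choice: ChoiceFunction, q: Set[World]) -> bool:
    """Universal quantification over the subset of the best worlds that the
    contextually given choice function selects."""
    selected = choice(best)
    if not selected <= best:
        raise ValueError("a choice function must return a subset of its argument")
    return all(w in q for w in selected)


def sbjn_defined(best: Set[World], q: Set[World]) -> bool:
    """Presupposition of the subjunctive: q is false in at least one best world."""
    return any(w not in q for w in best)


def ka_sbjn(best: Set[World], choice: ChoiceFunction, q: Set[World]) -> Optional[bool]:
    """Modal plus subjunctive; None encodes presupposition failure."""
    if not sbjn_defined(best, q):
        return None
    return ka(best, choice, q)


# Toy model: three best worlds; the child sleeps in w1 and w2 only.
best = {"w1", "w2", "w3"}
sleeps = {"w1", "w2"}


def identity(worlds: Set[World]) -> Set[World]:
    return set(worlds)


def very_best(worlds: Set[World]) -> Set[World]:
    # Hypothetical secondary restriction that keeps only w1 and w2.
    return worlds & {"w1", "w2"}
```

In this model the subjunctive is defined because the child does not sleep in `w3`. With the identity choice function the modal claim is then false, and it becomes true only when the choice function selects a proper subset of the best worlds, which is the weakening effect described in the text.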
According to (65), the subjunctive is only defined if there is at least one world $w'$ in the set of best worlds in the modal base, as defined by the ordering source, such that $\varphi$ is false in $w'$. The analysis is applied to a normative subjunctive case in (66). (66)
guy’t=ás=ka ti=sk’úk’wm’it=a
sleep=3sbjn=deon det=child=exis
‘I hope the child sleeps.’
$\llbracket \textit{ka}(h)(g)(\textit{as}(\text{guy’t ti sk’úk’wm’ita})) \rrbracket^{c,w}$ is only defined if
i. $h$ is a circumstantial modal base and $g$ is a normative ordering source
ii. $\exists w' \in \max_{g(w)}(\cap h(w))$ [the child doesn’t sleep in $w'$]
When defined, $\llbracket \textit{ka}(h)(g)(\textit{as}(\text{guy’t ti sk’úk’wm’ita})) \rrbracket^{c,w} = 1$ iff $\forall w' \in f_c(\max_{g(w)}(\cap h(w)))$ [the child sleeps in $w'$]
As above, $\max_{g(w)}(\cap h(w))$ picks out the best worlds in the modal base, as defined by the normative ordering source. The contextually determined choice function $f_c$ picks out a subset of $\max_{g(w)}(\cap h(w))$, and the modal universally quantifies over the set picked out by the choice function. Because the subjunctive mood presupposes that there is at least one world in $\max_{g(w)}(\cap h(w))$ in which the proposition is false, the choice function must pick out a proper subset of the worlds provided by the modal base and ordering source. This forces a weaker-than-universal reading. We in fact predict gradient readings with the subjunctive: anything from pure possibility to weak necessity. This seems to fit with the facts about when the subjunctive is felicitous.
29 Thanks to an anonymous reviewer for pointing out an inconsistency in an earlier version of (65).
I have so far been simply following Portner (1997) in modeling the mood restriction as a presupposition, rather than as ordinary asserted content, or some other kind of inference. The question arises of whether there is any St’át’imcets-internal justification for the assumption that presupposition is involved.30 If the subjunctive contributed ordinary asserted content, we would predict that it would fail to project through presupposition holes such as negation or conditionals, and that it could be directly affirmed or denied by the hearer. The issue of projection through presupposition holes is not testable for most of the relevant constructions in St’át’imcets. For example, negation in St’át’imcets is a predicate which embeds an obligatorily nominalized (i.e., indicative) subordinate clause. When a subjunctive clitic does co-occur with negation, it attaches to the negation itself, as shown in (67). Thus, while (67) is not interpretable in a way which would show that the subjunctive contributed asserted content, the results are not conclusive because the subjunctive is probably not scoping under negation syntactically.
(67)
cw7aoz=as=ká=t’u7 kw=s=nas=ts neg=3sbjn=deon=prt det=nom=go=3poss ‘I wish he wouldn’t go.’
(van Eijk 1997: 214)
≠ ‘It is not the case that [in at least one of the best worlds in the modal base, he doesn’t go, and in all of the set of worlds selected by the choice function, he goes].’
i.e., ≠ ‘It is not the case that [it’s good if he goes, and I can still be happy if he doesn’t].’
Nor can we test projection through ‘if’, as ‘if’-clauses obligatorily and redundantly select the subjunctive in St’át’imcets (see subsection 2.2). However, questions provide evidence that the subjunctive does not contribute ordinary asserted content. Recall that the subjunctive plus an inferential evidential when added to a question results in a statement of uncertainty (16)–(20). The question in (68) cannot be interpreted as if the subjunctive contributed asserted content which scopes below the question. (See subsection 7.2 for analysis of questions like (68).)
30 Thanks to David Beaver and an anonymous reviewer for asking for clarification of this issue.
(68)
nilh=as=há=k’a s=Lémya7 ku=kúkwpi7
foc=3sbjn=ynq=infer nom=Lémya7 det=chief
‘I think maybe Lémya7 is the chief / I wonder if Lémya7 is the chief.’
≠ ‘Is it the case that [in at least one of the best worlds compatible with the inferential evidence, Lémya7 is not the chief, and in all of the set of worlds selected by the choice function, Lémya7 is the chief]?’
i.e., ≠ ‘Is it the case that [Lémya7 is possibly but not necessarily the chief]?’
Further evidence that the subjunctive does not contribute ordinary asserted content comes from the impossibility of directly affirming or denying its contribution. This is shown in (69), where B and B’ try to deny A’s subjunctive claim that in at least one world compatible with A’s knowledge and desires, the children don’t sleep. The consultant absolutely rejects the replies in B and B’. (69)
A
guy’t=ás=ka i=sk’wemk’úk’wm’it=a sleep=3sbjn=deon det.pl=child(pl)=exis ‘I hope the children sleep.’
B
#cw7aoz kw=s=wenácw. plán=lhkacw zewát-en neg det=nom=true already=2sg.subj know-dir kw=s=cuz’ gúy’t=wit det=nom=going.to sleep=3pl ‘That’s not true. You already know they will sleep.’
B’
#cw7aoz kw=s=wenácw. lh=cw7áoz=as neg det=nom=true comp=neg=3sbjn kw=s=gúy’t=wit i=sk’wemk’úk’wm’it=a, áoz=kelh det=nom=sleep=3pl det.pl=child(pl)=exis neg=fut kw=a=s áma ta=scwákwekw-sw=a det=impf=3poss good det=heart-2sg.poss=exis ‘That’s not true. If the children don’t sleep, you won’t be happy.’
Having established that the weakening contribution of the subjunctive is not ordinary asserted content, the question now is whether it contributes a
presupposition per se, or some other not-at-issue content, such as a Potts (2005)-style conventional implicature. One major empirical difference between a traditional understanding of presuppositions (e.g., Stalnaker 1974) and conventional implicatures is that only the former impose constraints on the state of the common ground. Conventional implicatures, in contrast, standardly contribute information which is new to the hearer (Potts 2005). I have argued elsewhere (Matthewson 2006, 2008b) that St’át’imcets entirely lacks presuppositions of the common ground type; all not-at-issue content in this language is treated as potentially new to the hearer.31 In those earlier works I argued that the St’át’imcets facts necessitate an alternative analysis of presupposition (for example that of Gauker 1998). However, another way to look at things is to say that out of the class of not-at-issue meanings, St’át’imcets lacks one sub-type, namely common ground presuppositions. What I have modeled as a presupposition of the St’át’imcets subjunctive would then be some other kind of not-at-issue content, perhaps a conventional implicature. However, these issues go beyond the scope of the present paper and do not affect the main points being made here, so with these caveats I will continue to model the subjunctive as introducing a presupposition. Before turning to more complex constructions involving the subjunctive, it is interesting to consider the similarity between the analysis of the St’át’imcets subjunctive provided here and von Fintel & Iatridou’s (2008) ideas about weak necessity modals. von Fintel and Iatridou are concerned with the difference in quantificational strength between ought and have to/must. In (70), we see that the restriction on employees is stronger than that on everyone else. (70)
After using the bathroom, everybody ought to wash their hands; employees have to. (von Fintel & Iatridou 2008: 116)
(71) also illustrates the contrast between the different modal strengths. In (71a), taking Route 2 is the only option, if you want to get to Ashfield: all the worlds in which you get to Ashfield are Route 2-worlds. In (71b), there are other getting-to-Ashfield worlds apart from only Route 2-worlds. But the Route-2 worlds are the best, taking into consideration some other factors (such as a scenic route).
31 For example, attempts to elicit ‘Hey, wait a minute!’ responses to presupposition failures for a wide range of standard presupposition triggers have all failed (Matthewson 2006, 2008b). We are therefore unable to decide the presupposition issue for the subjunctive by using the ‘Hey, wait a minute!’ test (as was suggested by an anonymous reviewer).
(71)
a.
To go to Ashfield, you have to / must take Route 2.
b.
To go to Ashfield, you ought to take Route 2. (von Fintel & Iatridou 2008: 118)
von Fintel and Iatridou argue that ought is a weak necessity modal, and that weak necessity modals signal the existence of a secondary ordering source. This is illustrated informally in (72)–(73). (72) contains a strong necessity modal, and gives a strong reading, as usual. In (73), a secondary ordering source further restricts the set of worlds which are universally quantified over, leading to a weaker reading. (72)
To go to Ashfield, you have to / must take Route 2.
Modal base: Restricts worlds considered to those in which the same facts about roads hold as in the actual world.
Ordering source: Orders worlds in the modal base so that the best worlds are those in which you attain your goal of getting to Ashfield.
Universal quantification: In all the best worlds, you take Route 2.
(73)
To go to Ashfield, you ought to take Route 2.
Modal base: Restricts worlds considered to those in which the same facts about roads hold as in the actual world.
Ordering source 1: Orders worlds in the modal base so that the best worlds are those in which you attain your goal of getting to Ashfield.
Ordering source 2: Further orders the best worlds picked out by ordering source 1, so that the very best worlds are those in which you not only attain your goal of getting to Ashfield, but also attain an additional goal of going via a scenic route.
Universal quantification: In all the very best worlds, you take Route 2.
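The informal contrast between (72) and (73) can be reproduced in a toy model with two ordering sources applied in sequence. This Python sketch is an illustration under assumptions, not a formalization taken from von Fintel & Iatridou: worlds are plain labels, propositions are sets of worlds, the ordering is the "proper superset of satisfied propositions" comparison, and the alternative road `route9` is a hypothetical addition that stands for any non-scenic way of reaching Ashfield.

```python
from typing import FrozenSet, Iterable, Set

World = str
Proposition = FrozenSet[World]


def best_worlds(worlds: Iterable[World], ordering_source: Iterable[Proposition]) -> Set[World]:
    """Worlds for which no other world in the domain satisfies a proper superset
    of the ordering-source propositions (assumed definition of 'best')."""
    domain, source = set(worlds), list(ordering_source)

    def sat(w: World) -> Set[Proposition]:
        return {p for p in source if w in p}

    return {w for w in domain if not any(sat(w) < sat(v) for v in domain)}


# Modal base: three ways the trip could go, given the actual facts about roads.
modal_base = {"route2", "route9", "stay_home"}
arrive: Proposition = frozenset({"route2", "route9"})  # primary goal: get to Ashfield
scenic: Proposition = frozenset({"route2"})            # secondary goal: scenic drive
takes_route2: Proposition = frozenset({"route2"})

# Strong necessity: quantify over the best worlds by the primary ordering source.
best = best_worlds(modal_base, [arrive])
have_to = all(w in takes_route2 for w in best)

# Weak necessity: the secondary ordering source narrows the best worlds further.
very_best = best_worlds(best, [scenic])
ought_to = all(w in takes_route2 for w in very_best)
```

Here `have_to` comes out false, since `route9` also gets you to Ashfield, while `ought_to` comes out true because only the `route2` world survives the secondary ordering. A different secondary ordering source would select a different subset, which is the source of the gradience noted in the text.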
As von Fintel & Iatridou (2008: 137) put it: ‘The idea is that saying that to go to Ashfield you ought to take Route 2, because it’s the most scenic way, is the same as saying that to go to Ashfield in the most scenic way, you have to take Route 2.’ This is very parallel in spirit to Rullmann et al.’s (2008) analysis of St’át’imcets modals, where a weak reading is obtained by a universal quantifier with a restriction provided by a choice function. And just like Rullmann et al.’s analysis, von Fintel and Iatridou’s actually predicts gradience: how ‘weak’ a weak necessity modal is can vary, depending on
which secondary ordering source you pick. In fact, given that the motivation for using a choice function rather than an ordering source was unconvincing anyway (cf. Kratzer 2009, Peterson 2009, 2010, and Portner 2009), the Rullmann et al.-style analysis is better implemented using a double ordering source, exactly as in von Fintel & Iatridou 2008.32 So what is the difference between English and St’át’imcets? Simply that in English, we lexically encode the weak necessity (ought vs. have to/must). In St’át’imcets, no differences in modal force are lexically encoded by modals, but what English modals do, St’át’imcets does via mood. Another way of describing the analysis offered here would be to say that the St’át’imcets subjunctive enforces weak necessity (via domain restriction): it forces there to be two (non-vacuous) restrictions on the set of worlds in the modal base. While further cross-linguistic investigation goes beyond the scope of this paper, it is worth pointing out a connection to another intriguing observation of von Fintel and Iatridou’s, namely that in many languages, weak necessity modals are created transparently from a strong necessity modal plus counterfactual morphology. This is illustrated in (74) for French, where the modal appears in the conditional mood, the one which occurs in counterfactual conditionals. (74)
tout le monde devrait se laver les mains mais les serveurs sont obligés
everybody must/cond refl wash the hands but the waiters are obliged
‘Everybody ought to wash their hands but the waiters have to.’ (von Fintel & Iatridou 2008: 121)
This is very reminiscent of St’át’imcets, where a modal which introduces universal quantification gives rise to weak necessity interpretations in the presence of the subjunctive. In St’át’imcets, I have analyzed the weakening effect as the sole contribution of the subjunctive mood. Of course, ‘counterfactual’ and ‘subjunctive’ are not the same thing, and I am not in a position to claim that the current analysis of the subjunctive can extend to counterfactual morphology in the languages discussed by von Fintel and Iatridou. However, the present analysis at the very least supports von Fintel and Iatridou’s cross-linguistic generalization that mood morphology can derive weak 32 Like von Fintel and Iatridou, I omit a formal definition of a modal with a double ordering source; see von Fintel & Iatridou 2008: 138 for some suggestions on how to do this.
Lisa Matthewson
necessity interpretations, and may offer a potential new avenue for looking at languages like French.

7 Applying the analysis to other subjunctive constructions
In the previous section I presented an analysis of the St’át’imcets subjunctive and applied it to cases involving a normative modal. In this section I aim to establish that the analysis of the subjunctive as restricting the conversational background of a co-occurring modal can extend to the other uses of the subjunctive. I deal in turn with imperatives (subsection 7.1), questions (subsection 7.2), ignorance free relatives (subsection 7.3), the ‘pretend’ cases (subsection 7.4), and finally I return to the fact that in St’át’imcets, the subjunctive is not licensed by any attitude verbs (subsection 7.5).

7.1 Imperatives

Recall that the subjunctive, when added to an imperative, makes the command more polite. An example is repeated here:
(75)
a. lts7á=malh lh=kits-in’=ál’ap!
   deic=adhort comp=put.down-dir=2pl.sbjn
   ‘Just put it over here!’
b. lts7á=has=malh lh=kits-in’=ál’ap
   deic=3sbjn=adhort comp=put.down-dir=2pl.sbjn
   ‘Could you put it down here?’/‘You may as well put it down over here.’
The easiest way to analyze the imperatives would be as sub-cases of the deontic cases already analyzed above. We could say that the imperative introduces a deontic necessity modal, and the subjunctive weakens the proposition expressed. That is what I will in fact say, adopting Schwager’s (2005, 2006) analysis of imperatives. Schwager (2005, 2006) claims that imperatives introduce a modal operator, which is a more restricted version of a deontic necessity modal.33 Normally, the imperative modal expresses necessity, with the Common Ground 33 See Han 1997, 1999 for an earlier proposal of a similar idea. Han’s modal analysis shares many of the advantages for St’át’imcets of Schwager’s approach. However, since Han models the modal claim of the imperative as a presupposition rather than part of the assertion, extra assumptions would be required to apply it to St’át’imcets subjunctive imperatives.
serving as the modal base, and a contextually given set of preferences giving the ordering source. In addition, imperatives carry presuppositions, as shown in (76). The presuppositions restrict an imperative to situations where a performative use of a deontic modal would be possible, namely those in which the speaker is an authority on the matter.34 (76)
Presuppositions of an imperative:
1. The speaker is an authority on the parameters. [modal base and ordering source]
2. The ordering source is preference-related.35
3. The speaker affirms the ordering source as a good maxim for acting in the given scenario.
(Schwager 2006: 248-249)
A simple case is illustrated in (77). (77)
Get up!
Modal base: What the speaker and hearer jointly take to be possible
Ordering source: The speaker’s commands
(77) is true iff all worlds in the Common Ground that make true as much as possible of what the speaker commands at the world and time of utterance make it true that the addressee gets up within a certain event frame t (Schwager 2005: chapter 6). The difference between (77) and the plain modal statement ‘You must get up’ is that with the imperative, the speaker is presupposed to be an authority. This has the consequence that whenever an imperative is defined, it is necessarily true.
Adopting Schwager’s analysis enables us to treat the St’át’imcets subjunctive imperatives the same way we treated the weakened normative ka-statements above. We have to assume that the deontic modal in a St’át’imcets imperative is, like the overt ka, a universal modal which introduces a choice function or secondary ordering source. While a normal imperative roughly says that in all the best worlds (the worlds where you obey my commands),
34 The descriptive vs. performative use of a deontic modal is shown in (i), from Schwager (2008: 26). (i)
a. Peter may come tomorrow. (The hostess said it was no problem.)    descriptive
b. Okay, you may come at 11. (Are you content now?)    performative
35 The preferences may relate to the addressee’s wishes, as in the case of advice or suggestions.
you do P, a subjunctive imperative presupposes that at least one world in which you obey my commands is a world in which you do not do P. This predicts that a weakened imperative means that in the very best worlds, you do P, but there are other ways to satisfy me. The requirement on the addressee becomes weaker, just as the requirement on the child to sleep becomes weaker in the examples discussed above. An advantage of Schwager’s analysis for St’át’imcets is that it makes the correct predictions for ‘permission imperatives’ like ‘Have a cookie!’ These do not perform a speech act of ordering, but rather of invitation. It might be natural to think that permission imperatives involve a possibility modal, but Schwager argues that imperatives always introduce a necessity operator. For Schwager, the permission effect arises due to the contextual parameters; this is shown in (78). (78)
Take an apple if you like!
Given what we know the world to be like and given what you want, it is necessary that you take an apple. (cf. Schwager 2008: 49)
Under Schwager’s analysis, then, the difference between an order and an invitation consists not in a difference in quantificational force, but in ordering source. This correctly predicts that in St’át’imcets, permission imperatives do not have to take the subjunctive:36,37 (79)
Context: Your friend comes over and is visiting with you. You hear her stomach rumbling. You give her a plate and say ‘Have some cake!’
a. wá7=malh kiks-tsín-em
   be=adhort cake-eat-mid
   ‘Have some cake!’
b. #wá7=acw=malh kiks-tsín-em
   be=2sg.sbjn=adhort cake-eat-mid
   ‘You may as well have some cake.’
36 (79b) is marked as infelicitous in this context, which is how the consultant judges it. (80b) appears to be ungrammatical. The difference possibly relates to the presence in (79b) of the adhortative particle malh, an interesting element whose analysis must await future research. 37 An anonymous reviewer points out that permission imperatives should be able to take the subjunctive in certain circumstances, meaning something like ‘the very best way to achieve your desires is p, though there are other ways’. Future research is required to see whether this prediction is upheld once the right discourse contexts are provided.
(80)
Context: You are at a gathering and they are almost running out of food. You take the last piece of fish and then you see an elder is behind you and is looking disappointed and has no fish on her plate. You say ‘Take mine!’
a. kwan ts7a ti=n-tsúw7=a
   take(dir) deic det=1sg.poss-own=exis
   ‘Take mine!’
b. *kwán=acw ts7a ti=n-tsúw7=a
   take(dir)=2sg.sbjn deic det=1sg.poss-own=exis
   intended: ‘Take mine!’
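The modal treatment of imperatives discussed in this subsection can be made concrete in a small toy model. The sketch below is only an illustration under stated assumptions: worlds are pairs of truth values, propositions are sets of worlds, and the names `best_worlds`, `necessity`, `weak_necessity`, `subjunctive_presupposition`, `get_up`, and `active` are hypothetical and not part of Schwager's or von Fintel and Iatridou's formal systems. The ordering is a simplified Kratzer-style one, and the weak reading is implemented with a secondary ordering source.

```python
from itertools import product

# Toy worlds: (addressee_gets_up, addressee_makes_coffee). Illustrative only.
WORLDS = frozenset(product([True, False], repeat=2))


def best_worlds(modal_base, ordering_source):
    """Worlds in modal_base that no other world strictly outranks.

    World u outranks v when the ordering-source propositions true in v
    form a proper subset of those true in u.
    """
    def satisfied(w):
        return frozenset(i for i, p in enumerate(ordering_source) if w in p)

    return frozenset(
        v for v in modal_base
        if not any(satisfied(v) < satisfied(u) for u in modal_base)
    )


def necessity(modal_base, ordering_source, prejacent):
    """Plain imperative: the prejacent holds in all best worlds."""
    return best_worlds(modal_base, ordering_source) <= prejacent


def weak_necessity(modal_base, primary, secondary, prejacent):
    """Double ordering source: rank the primary-best worlds by the secondary source."""
    first_cut = best_worlds(modal_base, primary)
    return best_worlds(first_cut, secondary) <= prejacent


def subjunctive_presupposition(modal_base, primary, prejacent):
    """At least one primary-best world makes the prejacent false."""
    return not best_worlds(modal_base, primary) <= prejacent


# Assumed scenario: the speaker's commands only require that the addressee
# be active; the speaker's own (ignorable) preference is that she gets up.
get_up = frozenset(w for w in WORLDS if w[0])
active = frozenset(w for w in WORLDS if w[0] or w[1])
```

In this scenario the plain necessity claim fails for `get_up` relative to the commands `[active]`, the subjunctive presupposition is met, and the weakened claim holds once the speaker's preference is added as the secondary ordering source. This mirrors the paraphrase ‘in the very best worlds, you do P, but there are other ways to satisfy me’.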
We have seen that an analysis of imperatives as containing a concealed necessity modal works for St’át’imcets. In the remainder of this section I briefly discuss the alternative analysis of Portner (2004, 2007). Portner’s (2004, 2007) analysis of imperatives relies on the notion of a ‘To-Do List’. The idea is that each participant in a conversation has a To-Do List, a set of properties which they are committed to satisfying. The To-Do list Function (which maps each participant to their own To-Do List) is a component of the Discourse Context (along with the Common Ground and the Question Set). An imperative, as in (81), denotes a property whose subject is the addressee. This causes the property to be added to the addressee’s To-Do List. (81)
$[\![\text{Leave!}]\!]^{w^*,c} = [\lambda w \lambda x : x = \text{addressee}(c) \,.\, x \text{ leaves in } w]$
Similarly to in Schwager’s analysis, ‘permission’ imperatives are dealt with in Portner’s analysis by the counterpart of the ordering source, namely different sub-sets of the To-Do List. The To-Do List is divided into deontic, bouletic and teleological sub-parts, corresponding to orders, invitations, and suggestions respectively. The addressee can therefore keep track of actions she is supposed to take to satisfy someone’s orders, her own wishes, or her own goals. An important feature of this analysis is that under the To-Do List approach, imperatives do not contain modal operators. While for Portner, imperatives and root modals are closely linked — for example, the successful utterance of an imperative leads to the truth of a corresponding sentence containing a root modal — imperatives do not themselves contain modals.38 38 See Portner 2007: 363ff for arguments against Han’s (1999) and Schwager’s (2005, 2006) analysis of imperatives as containing concealed modals.
My analysis of the St’át’imcets subjunctive, however, seems to require the presence of a modal, whose force is functionally weakened via a restriction on the conversational background. A unified analysis of the St’át’imcets subjunctive across all its uses would therefore seem to require a modal in the imperative. However, as pointed out by an anonymous reviewer, Portner’s analysis of imperatives will work for St’át’imcets. The lexical entry for the subjunctive given above in (65) does not literally require the presence of a governing modal; it merely requires the presence of contextually available conversational backgrounds. These are provided within Portner’s analysis, given that the Common Ground corresponds to (at least a subset of) a circumstantial modal base, while a To-Do List corresponds to (at least a subset of) a deontic, bouletic or teleological ordering source.

To apply Portner’s analysis to St’át’imcets, we only need to assume that the imperative morpheme can take the Common Ground plus two To-Do Lists as arguments. The subjunctive will presuppose that there is a world among the best worlds in the Common Ground, according to To-Do List 1, in which the imperative is not satisfied. Assuming that the second To-Do List is ‘more ignorable’ than the first (cf. also von Fintel and Iatridou 2008 on the primacy of the first ordering source), a hearer can decide to be bound either by both To-Do Lists, or only by the first. If the speaker has set up her own desires as the secondary To-Do List, we obtain the politeness reading typical of a St’át’imcets subjunctive imperative.

In summary, we have seen that our analysis of the St’át’imcets subjunctive extends to the weakened imperatives, as long as we assume that imperatives are concealed normative modal statements, or at least provide the same conversational backgrounds as a normative modal. This idea can be implemented within the approach of either Schwager (2005, 2006, 2008) or Portner (2004, 2007).

7.2 Questions
The subjunctive appears, in combination with an evidential or future modal, in both yes-no and wh questions in St’át’imcets, in each case turning the question into a statement of uncertainty. Some examples are repeated here. Following Littell, Matthewson & Peterson (2009), I use the term ‘conjectural question’ for this construction.
(82)
a. lán=ha kwán-ens-as ni=n-s-mets-cál=a
   already=ynq take-dir-3.erg det.abs=1sg.poss-nom-write-act=exis
   ‘Has she already got my letter?’
b. lan=as=há=k’a kwán-ens-as ni=n-s-mets-cál=a
   already=3.sbjn=ynq=infer take-dir-3.erg det.abs=1sg.poss-nom-write-act=exis
   ‘I wonder if she’s already got my letter.’ / ‘I don’t know if she got my letter or not.’
(83)
a. nká7=kelh lh=cúz’=acw nas
   where=fut comp=going.to=2sg.sbjn go
   ‘Where will you go?’
b. nká7=as=kelh lh=cúz’=acw nas
   where=3sbjn=fut comp=going.to=2sg.sbjn go
   ‘Wherever will you go?’ / ‘I wonder where you are going to go now.’
(adapted from Davis 2006: chapter 24)
Previous discussion of conjectural questions in Salish includes Matthewson 2008a, Littell et al. 2009 and Littell 2009.39 The analysis given here will essentially be that of Littell (2009), with the addition of an account of the role of the subjunctive (which Littell does not discuss), and an extension to cases where the subjunctive in a conjectural question is licensed by a future modal, rather than an evidential. The paradigms in (84) and (85) illustrate the distributional facts for conjectural questions which contain an evidential (as opposed to a future modal). We see that the evidential is obligatory (the (b) examples), but the subjunctive — while strongly preferred — is not quite obligatory (the (c) examples).40 (84)
a. t’íq=Ø=ha k=Bill
   arrive=indic=ynq det=Bill
   ‘Did Bill arrive?’    indic
b. *t’íq=as=ha k=Bill
   arrive=3sbjn=ynq det=Bill    sbjn
c. ?t’íq=ha=k’a k=Bill
   arrive=ynq=infer det=Bill
   ‘I wonder if Bill arrived.’    evid + indic
d. t’iq=as=há=k’a k=Bill
   arrive=3sbjn=ynq=infer det=Bill
   ‘I wonder if Bill arrived.’    evid + sbjn
(85)
a. ínwat=wit
   say.what=3pl
   ‘What did they say?’    indic
b. *inwat=wít=as
   say.what=3pl=3sbjn    sbjn
c. ??inwat=wít=k’a
   say.what=3pl=infer
   ‘I wonder what they said.’    evid + indic
d. inwat=wít=as=k’a
   say.what=3pl=3sbjn=infer
   ‘I wonder what they said.’    evid + sbjn
39 Littell et al. (2009) investigate conjectural questions in three languages: St’át’imcets, Nɬeʔkepmxcín (Thompson Salish) and Gitksan, while Littell (2009) focuses mainly on Nɬeʔkepmxcín.
40 While subjunctive evidential questions (as in (84d), (85d)) are obligatorily interpreted as statements of uncertainty rather than questions, indicative evidential questions (as in (84c), (85c)) can optionally be interpreted as ordinary questions. I return to this below.
As argued in the above-mentioned references, conjectural questions have the syntax and the semantics of a question, but the pragmatics of an assertion (as they do not require an answer in discourse). With respect to syntax, conjectural questions clearly pattern with ordinary questions. Littell et al. (2009) point out that not only do conjectural questions contain the normal yes-no question particle or sentence-initial wh-phrase plus extraction morphology, they embed under the same predicates as ordinary questions do. This is shown in (86). (86)
aoz kw=s=zwát-en-as k=Lisa lh=wa7=as=há=k’a áma-s-as k=Rose ku=tíh
neg det=nom=know-dir-3erg det=Lisa comp=impf=3sbjn=ynq=infer good-caus-3erg det=Rose det=tea
‘Lisa doesn’t know whether Rose likes tea.’
The ability to embed under question-taking predicates is prima facie evidence that conjectural questions have the same semantic type as ordinary questions. Pragmatically, however, conjectural questions do not behave like ordinary questions, because conjectural questions do not require an answer from the addressee. In fact, conjectural questions are infelicitous in any situation where the hearer can be assumed to know the answer. This is illustrated in (87).41 (87)
a. ??lan=acw=há=k’a q’a7
   already=2sg.sbjn=ynq=infer eat
   ‘I wonder if you’ve already eaten.’
b. Context: You see your friend wearing a watch and you say:
   ??zwat-en=ácw=ha=k’a lh=k’wín=as=t’elh
   know-dir=2sg.sbjn=ynq=infer comp=how.many=3sbjn=now
   ‘Would you know what the time was?’
   Consultant’s comment: “You wouldn’t have seen the watch if you say this.”
Nor are conjectural questions a type of rhetorical question. Han (2002) argues that rhetorical questions have the force of a negative assertion, as in (88). (88)
Did I tell you it would be easy? ≈ I didn’t tell you it would be easy.
But this is not the meaning we get in St’át’imcets for conjectural questions. In order to express a true rhetorical question, St’át’imcets speakers use something which is string-identical to an ordinary question, just as in English. This is illustrated in (89)–(90). (90b) shows that adding a subjunctive plus an evidential to a rhetorical question results in rejection of the utterance. (89)
Context: Your daughter is complaining that learning how to cut fish is hard. You say: a.
tsun-tsi=lhkán=ha k=wa=s lil’q
say(dir)-2sg.obj=1sg.indic=ynq det=impf=3poss easy
‘Did I tell you it would be easy?’
41 See Rocci 2007: 147 for the same claim for an Italian construction with similar semantics to St’át’imcets conjectural questions.
b. swat ku=tsút k=wa=s lil’q
   who det=say det=impf=3poss easy
   ‘Who said it would be easy?’
(90)
Context: You are at the PNE (a fair) and there is this very scary ride which looks really dangerous. Your friend asks you if you are going to go on it. You say: a.
tsut-anwas=kácw=ha kw=en=klíisi
say-inside=2sg.indic=ynq det=1sg.poss=crazy
‘Do you think I’m crazy?’
b. *tsut-anwas=ácw=ha=k’a kw=en=klíisi
   say-inside=2sg.sbjn=ynq=infer det=1sg.poss=crazy
   ‘Do you think I’m crazy?’
The status of speaker and addressee knowledge also differs between rhetorical questions and conjectural questions. In rhetorical questions, the speaker knows the true answer to the question, and typically assumes that the hearer does as well (e.g., Caponigro & Sprouse 2007). Subjunctive questions are the exact opposite: neither the speaker nor the addressee typically knows the answer. In the remainder of this section I will first present the analysis of conjectural questions which contain evidentials, and then explain an interesting difference between the evidential and the future with respect to subjunctive licensing. First, we need an analysis of questions. I adopt a fairly standard approach, according to which a question denotes a set of propositions, each of which is a (partial, true or false) answer to the question (Hamblin 1973).42 This is illustrated in (91)–(92). (91)
$[\![\text{does Hotze smoke}]\!]^{w}$ = {that Hotze smokes, that Hotze does not smoke}
(92)
$[\![\text{who left me this fish}]\!]^{w}$ = {that Ryan left me this fish, that Meagan left me this fish, that Ileana left me this fish, ...} = $\{p : \exists x\,[p = \text{that } x \text{ left me this fish}]\}$
42 As far as I am aware, this choice is not critical and a different approach to questions would work just as well.
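The Hamblin-style denotations in (91)–(92) can be illustrated with a small set-based sketch. It is only a toy under stated assumptions: a world is a pair recording whether Hotze smokes and who left the fish, propositions are sets of such worlds, and the helper names `prop`, `polar_question`, and `wh_question` are hypothetical labels introduced here for illustration.

```python
from itertools import product

PEOPLE = ("Ryan", "Meagan", "Ileana")

# A toy world is (hotze_smokes, fish_leaver). Illustrative assumption only.
WORLDS = frozenset(product([True, False], PEOPLE))


def prop(predicate):
    """A proposition: the set of worlds in which the predicate holds."""
    return frozenset(w for w in WORLDS if predicate(w))


def polar_question(p):
    """Yes-no question: the set containing p and its negation."""
    return frozenset({p, WORLDS - p})


def wh_question(domain, open_prop):
    """Wh-question: one proposition open_prop(x) for each x in the domain."""
    return frozenset(open_prop(x) for x in domain)


hotze_smokes = prop(lambda w: w[0])
does_hotze_smoke = polar_question(hotze_smokes)
who_left_fish = wh_question(PEOPLE, lambda x: prop(lambda w: w[1] == x))
```

Here `does_hotze_smoke` contains exactly two propositions, as in (91), and `who_left_fish` contains one proposition per individual in the assumed domain, as in (92).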
Next, we need an analysis for the inferential evidential k’a. I adopt Matthewson et al.’s (2007) and Rullmann et al.’s (2008) analysis of k’a as an epistemic modal with a presupposition about evidence source. (93)
$[\![\textit{k'a}(h)(g)]\!]^{c,w}$ is only defined if $h$ is an epistemic modal base, $g$ is a stereotypical ordering source, and for all worlds $w'$, $\cap h(w')$ is the set of worlds in which the inferential evidence in $w$ holds.
If defined, $[\![\textit{k'a}(h)(g)]\!]^{c,w} = \lambda q_{\langle s,t\rangle} .\, \forall w' \in f_c(\max_{g(w)}(\cap h(w)))\,[q(w') = 1]$
(adapted from Matthewson et al. 2007: 245)

I assume that the evidential modal scopes under the question operator, so that each proposition in the question denotation contains the evidential. A conjectural question thus bears some similarity to an English question containing a possibility modal (e.g., ‘Could Bill have (possibly) arrived?’), with the additional factor that the evidential introduces a presupposition about evidence source. Following Guerzoni (2003), I assume that when a question contains a presupposition trigger, each proposition in the alternative set carries the relevant presupposition. The question therefore denotes a set of alternative partial propositions. This is illustrated in (94).43 (94)
a. t’iq=as=há=k’a k=Bill
   arrive=3sbjn=ynq=infer det=Bill
   ‘I wonder if Bill arrived.’
b. Alternatives introduced by (94a): {that Bill possibly arrived [presupposing there is inferential evidence that Bill arrived], that Bill possibly did not arrive [presupposing there is inferential evidence that Bill did not arrive]}
Notice that the evidence presuppositions of the two propositions in (94b) conflict with each other — there is presupposed to be evidence both that Bill did arrive, and that Bill did not arrive. As Guerzoni (2003) has shown for the presuppositions of English even, questions whose alternative propositions introduce different presuppositions end up presupposing the conjunction of all the individual presuppositions. Take, for example, the question in (95). (95)
Guess who even solved Problem 2?
43 Recall that although (94a) is translated into English using wonder, the meaning of (94a) does not include an attitude verb. The claim is that (94a) denotes a set of alternative propositions.
This question introduces ‘a set of alternative partial propositions that for each relevant person x contains an answer asserting that x solved Problem 2 and presupposing that solving problem 2 was less likely for x than solving any other relevant problem’ (Guerzoni 2003: 127). Guerzoni then observes that a speaker who utters (95) knows that for any arbitrary individual in the restrictor of who, if the addressee answers that that individual solved the problem, he will automatically presuppose that the problem was difficult for that person. Moreover, if the speaker is unbiased, she doesn’t know in advance (and has no expectations regarding) which propositions will be chosen by the addressee as the true answer to her question. Given this, it must be the case that she is taking for granted that the problem was hard for every arbitrary x in the restrictor of who. Since the addressee will be able to infer this much, the question is a presupposition failure unless this condition is indeed satisfied in the context of the conversation (Guerzoni 2003: 128). Applying this idea to the St’át’imcets conjectural questions, we obtain the result that an utterance of (94a) commits the speaker to the presupposition that there is evidence both that Bill did arrive, and that he did not. This is illustrated in (96). (96)
Alternatives introduced by (94a): {that Bill possibly arrived, that Bill possibly did not arrive}
Presupposition of (94a): There is inferential evidence both that Bill arrived and that Bill did not arrive
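The projection step in (96), where the question ends up presupposing the conjunction of its alternatives' presuppositions, can be sketched as set intersection. This is a toy illustration, not Guerzoni's formal system: contexts are assumed to be pairs recording whether there is inferential evidence for and against Bill's arrival, and `Alternative` and `question_presupposition` are hypothetical names.

```python
from dataclasses import dataclass
from functools import reduce
from itertools import product

# A toy context is (evidence_for_arrival, evidence_against_arrival).
CONTEXTS = frozenset(product([True, False], repeat=2))


@dataclass(frozen=True)
class Alternative:
    label: str
    presupposition: frozenset  # contexts in which the alternative is defined


def question_presupposition(alternatives):
    """Conjoin the definedness conditions of all alternatives by intersection."""
    return reduce(lambda acc, a: acc & a.presupposition, alternatives, CONTEXTS)


arrived = Alternative(
    "that Bill possibly arrived",
    frozenset(c for c in CONTEXTS if c[0]),
)
not_arrived = Alternative(
    "that Bill possibly did not arrive",
    frozenset(c for c in CONTEXTS if c[1]),
)

# Only contexts with evidence in both directions survive.
mixed_evidence = question_presupposition([arrived, not_arrived])
```

The only context left in `mixed_evidence` is the one with evidence in both directions, which corresponds to the mixed-evidence presupposition stated in (96).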
In previous work (Matthewson 2008a, Littell et al. 2009), I assumed that the mixed-evidence presuppositions which result when we conjoin the presuppositions of all the propositions in the question set could derive the reduced interrogative force of conjectural questions. The idea was that a speaker who utters a question while presupposing that there is mixed or even contradictory evidence about the true answer cannot be taken to be requiring that the hearer provide the true answer to the question. That is, the mixed presuppositions about evidence signal that the speaker does not
believe the question is easily answerable, and this lets the hearer off the hook with respect to providing an answer.44 However, there are various problems with this analysis, as pointed out by Littell (2009). One is that the evidence presuppositions are not always contradictory. For example, a conjectural question such as ‘Who likes ice cream?’ would presuppose for each contextually salient individual x that there is inferential evidence that x likes ice cream. But it is perfectly possible that everyone likes ice cream, and the evidence presuppositions in this case do not rule out the possibility that the hearer knows the true answer. A second problem is seemingly incorrect predictions about questions which contain other evidentials, such as reportative or direct evidentials. Littell argues that an analysis of conjectural questions which relies on conjoined evidence presuppositions should predict reduced interrogative force for any evidential question — yet cross-linguistically it is overwhelmingly only inferential or conjectural evidentials which result in reduced interrogative force. This is certainly true of St’át’imcets, as shown in the minimal pair in (97).45 (97)
a. stám’=as=k’a ts7a
   what=3sbjn=infer here
   ‘I wonder what these are.’
b. *stám’=as=ku7 ts7a
   what=3sbjn=report here
For these reasons, I instead adopt and extend an analysis proposed by Littell (2009). Two assumptions are required. First, the evidence source 44 Rocci (2007) analyzes a construction in Italian with strikingly similar semantics and pragmatics: the che-subjunctive construction. According to Rocci, che-subjunctives, which are formed from questions, are interpreted as statements of doubt. He argues that they involve epistemic modality and inferential evidentiality, and induce the following presuppositions: (i) p is not in the Common Ground and ¬p is not in the Common Ground (ii) There is no sign that either Speaker or Hearer knows whether p or ¬p (iii) There is some set of facts E in CG, such that E is non-conclusive evidence in favor of p These are very similar to the effects of the St’át’imcets conjectural questions. However, Rocci does not give a compositional analysis, perhaps partly because the che-subjunctives have no overt evidentials or epistemic modals in the structure. 45 Cheyenne is an exception; reportatives in questions in Cheyenne allow non-interrogative readings under certain circumstances (Murray to appear).
requirement of an evidential in a question can or must undergo ‘interrogative flip’ (or ‘origo shift’; Garrett 2001, Faller 2002, 2006, Aikhenvald 2006, Tenny & Speas 2004, Tenny 2006, Davis, Potts & Speas 2007, Murray to appear, among others). Thus, a question containing an evidential expects that the hearer, rather than the speaker, has the relevant type of evidence for the answer. For example, (98) is not appropriate if directed to your mother, if she is the one who always cooks dinner. However, it is acceptable when directed to a third person, who might have heard from your mother what you are going to eat. (98)
stám’=ku7 ku=cuz’=s-q’á7-lhkalh
what=report det=going.to=nom-eat-1pl.poss
‘What are we going to eat?’
The second assumption is that a speaker who uses an evidential which is low on a hierarchy of evidence strength implicates that there is no available evidence of a stronger type (Faller 2002, among others). This also seems to be correct in St’át’imcets; the use of an inferential evidential, for example, leads a hearer to infer that the speaker did not have reportative or direct evidence.46 These two assumptions lead to the following result: a question containing an evidential which is low on the scale of evidence strength will lead to an implicature that the hearer does not have evidence of any stronger type. This is illustrated in (99). (99)
a. man’c-em=há=k’a k=Hotze
   smoke-mid=ynq=infer det=Hotze
   ‘I wonder if Hotze smokes.’
b. Alternatives introduced by (99a): {that Hotze might smoke, that Hotze might not smoke}
c. Presupposition of (99a): The hearer has inferential evidence both that Hotze smokes and that Hotze does not smoke
46 Evidential hierarchies are a topic of some debate and there are many interesting questions to be investigated (see Faller 2002 for an overview). It is also an interesting question how evidence-type hierarchies interact with the variable interpretations of all evidentials in St’át’imcets (Matthewson et al. 2007, Rullmann et al. 2008). Although all strengths are possible for all evidentials in St’át’imcets, inferential k’a is more likely to be weaker (i.e., to have a more restricted domain of worlds to quantify over), while the reportative ku7 and the perceived-evidence =an’ are much more likely to give rise to stronger interpretations.
d. Implicature: The hearer does not have any stronger type of evidence than inferential about the correct answer
According to Littell (2009), this analysis accounts for the reduced interrogative force of conjectural questions. The idea is that inferential evidence is a fairly weak type of evidence, and a speaker who asks a question while implicating that the hearer only has inferential evidence about the true answer is letting the hearer off the hook with respect to answering. This is intended to account for (a) the judgments of St’át’imcets consultants that conjectural questions do not require an answer, (b) the fact that conjectural questions are infelicitous when the addressee is likely to know the answer (cf. (87)), and (c) the fact that conjectural questions are translated as ‘I wonder’ or ‘maybe’-statements (although they do not literally have the semantics of ‘wonder’). ‘I wonder’ is simply a typical method in English of raising a question without demanding an answer. However, this account does not seem to predict a complete absence of interrogative force. After all, the inferential evidence the hearer is assumed to possess is better than no evidence at all. In line with this, an English question like ‘According to the weak evidence you have, could Hotze smoke?’ still functions pragmatically as an interrogative. I conclude, therefore, that interrogative flip plus implicatures about the absence of stronger evidence are not sufficient in and of themselves to completely let the hearer off the hook with respect to answering. This is actually a welcome result, since questions containing k’a in the indicative mood are sometimes translated by speakers into English using ordinary questions (rather than as statements of doubt; see footnote 40). However, conjectural questions containing the subjunctive are never translated as ordinary questions. I therefore assume that while a question containing an evidential is already somewhat ‘weakened’ in terms of its interrogative force, the subjunctive performs a further weakening. 
The task now is to see whether this falls out from the analysis of the subjunctive proposed above. Recall that in the context of a governing modal, the subjunctive adds the presupposition that in at least one of the best worlds in the modal base, the proposition is false. The best worlds here (as the modal is epistemic) are those which conform to the propositions known to be true, and in which things happen as normal. Since the evidential has undergone interrogative flip, the epistemically accessible worlds must also be flipped to be the worlds
compatible with the hearer’s knowledge. The results are shown in (100).47 (100)
a. cuz’=as=há=k’a ts7as s=Bill
   going.to=3sbjn=ynq=infer come nom=Bill
   ‘I wonder if Bill is going to come.’
b. Alternatives introduced by (100a): {that Bill is possibly going to come, that Bill is possibly not going to come}
c. Presuppositions of (100a): The hearer has inferential evidence both that Bill is going to come and that Bill is not going to come; Bill doesn’t come in at least one normal world compatible with the hearer’s knowledge, and Bill comes in at least one normal world compatible with the hearer’s knowledge
d. Implicature: The hearer does not have any stronger type of evidence than inferential about the correct answer
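The contribution of the subjunctive in (100c) can be checked in a toy model. The sketch assumes that a world is a pair recording whether Bill comes and whether the world is normal, that the hearer's knowledge is a set of such worlds, and that the ‘best’ worlds are simply the normal worlds in that set. The function names are hypothetical, and the choice function discussed in footnote 47 is not modelled.

```python
from itertools import product

# A toy world is (bill_comes, normal). Illustrative assumption only.
WORLDS = frozenset(product([True, False], repeat=2))
COMES = frozenset(w for w in WORLDS if w[0])
NORMAL = frozenset(w for w in WORLDS if w[1])


def subjunctive_ok(hearer_state, prejacent):
    """Some normal world compatible with the hearer's knowledge falsifies the prejacent."""
    best = hearer_state & NORMAL
    return bool(best - prejacent)


def question_defined(hearer_state):
    """Both alternatives of the polar question must meet the subjunctive presupposition."""
    return (subjunctive_ok(hearer_state, COMES)
            and subjunctive_ok(hearer_state, WORLDS - COMES))


def hearer_knows_whether(hearer_state):
    """The hearer's knowledge settles whether Bill comes."""
    return hearer_state <= COMES or hearer_state <= WORLDS - COMES
```

In this model, whenever `question_defined` holds for a hearer state, `hearer_knows_whether` is false for that state, which is the hearer-ignorance effect attributed to the subjunctive. The converse does not hold: a hearer state with no normal world in which Bill fails to come leaves the question undefined even if the hearer is ignorant.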
As before, the implicature that the hearer does not have strong evidence about the true answer, combined with the mixed-evidence effect of the evidential presuppositions, will partially reduce the expectation that the hearer is able to answer the question. In addition, thanks to the subjunctive, the question now presupposes not only that the evidence about Bill’s possible arrival is mixed, but also that there are worlds compatible with the hearer’s knowledge in which Bill does come, and worlds compatible with the hearer’s knowledge in which he does not come. In other words, the hearer does not know whether he will come or not. The result is that a subjunctive conjectural question has a significantly reduced expectation on the hearer to provide an answer.48

The account just given, which incorporates the analysis of the St’át’imcets subjunctive as weakening a modal proposition via domain restriction, successfully accounts for the distributional and interpretive facts illustrated in (84)–(85) above. The fact that the subjunctive requires a modal licenser in a question follows from the analysis of the subjunctive as requiring a governing modal. The fact that an evidential in a question always licenses at least slightly reduced interrogative force, regardless of mood, falls out from the fact that the evidential plays a part in reducing interrogative force. However, the added contribution of the subjunctive accounts for the preferred presence of the subjunctive in conjectural questions, as well as for the fact that questions containing an evidential plus the subjunctive, in contrast to indicative evidential questions, can only be interpreted with reduced interrogative force.

In the final part of this section I extend the discussion to conjectural questions which contain a future morpheme rather than an evidential. We have already seen some examples of this ((17b)–(18b) above). In contrast to the evidential k’a, the future modal obligatorily requires the subjunctive mood if it is to be interpreted as a statement of doubt. This is shown in (101)–(102), where the (a) examples are only interpretable as ordinary questions which expect an answer.

(101)
a. t’íq=ha=kelh k=Bill    [fut + indic]
   arrive=ynq=fut det=Bill
   ‘Is Bill going to come?’
b. t’iq=as=há=kelh k=Bill    [fut + sbjn]
   arrive=3sbjn=ynq=fut det=Bill
   ‘I wonder if Bill will come.’

47 An anonymous reviewer raises a potentially significant issue with the choice function required for these cases. With the deontic and imperative cases discussed above, the choice function had intuitive content (e.g., the ‘very best way to achieve some end’), but here the role of the subjunctive is purely to make sure there are some ‘best worlds’ where the prejacent is false. It is thus not clear which proper subset of the best worlds the function picks out.

48 As noted above, conjectural questions also imply that the speaker does not know the answer. I assume that this follows, by Gricean reasoning, from the fact that the speaker uttered a question, rather than having simply asserted the true answer. However, there is a bit more to be said here, since plain questions in St’át’imcets allow a ‘display question’ use, in which a teacher can ask (i):
(i) k’win ku=án’was múta7 án’was
    how.many det=two and two
    ‘What is two plus two?’
As an anonymous reviewer points out, this display use should technically remain even when the subjunctive is added. However, consultants judge the subjunctive version of (i) to no longer be a teacher’s question, but a student’s reply:
(ii) k’wín=as=k’a ku=án’was múta7 án’was
     how.many=3sbjn=infer det=two and two
     ‘I don’t know how much two plus two is.’
Perhaps conjectural questions like (ii) simply do not make good questions for a teacher to ask because they encode addressee ignorance.
(102)
a. inwat=wít=kelh    [fut + indic]
   say.what=3pl=fut
   ‘What will they say?’
b. inwat=wít=as=kelh    [fut + sbjn]
   say.what=3pl=3sbjn=fut
   ‘I wonder what they will say.’
The contrast between the evidential and the future with respect to whether the subjunctive is required to create a conjectural question is striking. So far, I have argued that the evidential k’a contributes to reduced interrogative force by means of an implicature that the hearer has no better than inferential evidence for the true answer, and that the subjunctive contributes to further reduced interrogative force by presupposing that it is compatible with the hearer’s knowledge state that each possible answer is false. Now unlike k’a, the future modal kelh has not been analyzed as an epistemic modal, and it does not introduce any evidence presuppositions. The denotation for kelh is given in (103). (103)
$[\![\textit{kelh}(h)(g)]\!]^{c,w,t}$ is only defined if $h$ is a circumstantial modal base and $g$ is a stereotypical ordering source.
If defined, $[\![\textit{kelh}(h)(g)]\!]^{c,w,t} = \lambda q_{\langle s,\langle i,t\rangle\rangle}.\ \forall w' \in f_c(\max_{g(w)}(\cap h(w,t)))\,[\exists t'\,[t < t' \wedge q(w')(t') = 1]]$ (adapted from Rullmann et al. 2008)49

Applying this analysis of kelh to questions containing a subjunctive gives (104). (104)
a. nká7=as=kelh lh=cúz’=as nas k=Gloria
   where=3sbjn=fut comp=going.to=2sg.sbjn go det=Gloria
   ‘I wonder where Gloria will go.’
b. Alternatives introduced by (104a): {that Gloria will go home, that Gloria will go to her mother’s house, . . . }
c. Presuppositions of (104a): The future claim is made on the basis of the facts; Gloria won’t go home in at least one stereotypical world compatible with the facts, Gloria will not go to her mother’s house in at least one stereotypical world compatible with the facts, . . .

49 I have altered Rullmann et al.’s formula to incorporate the ordering source and to make the format parallel to that of other formulas above. The modal base in (103) is a function from world-time pairs to sets of propositions.
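The presuppositions in (104c) can be illustrated on a toy model. The following Python sketch is an illustration only, under simplifying assumptions: times and the choice function are abstracted away, so the set of best stereotypical worlds compatible with the facts is supplied directly, and all names (`subjunctive_presupposition`, `settled_answers`, the world labels) are hypothetical rather than part of the analysis.

```python
# Toy model of (104c). Assumption: a proposition is the set of worlds in
# which it is true, and the best worlds are supplied directly.

def subjunctive_presupposition(best_worlds, alternatives):
    """True iff every alternative is false in at least one best world."""
    return all(
        any(w not in prop for w in best_worlds)
        for prop in alternatives.values()
    )


def settled_answers(best_worlds, alternatives):
    """Alternatives true in all best worlds, i.e. answers fixed by the facts."""
    return [
        name for name, prop in alternatives.items()
        if all(w in prop for w in best_worlds)
    ]


# Hypothetical alternatives for 'Where will Gloria go?'
alternatives = {
    "home": {"w_home"},
    "mother's house": {"w_mother"},
}

open_facts = {"w_home", "w_mother"}   # facts leave both destinations open
closed_facts = {"w_home"}             # facts settle that she goes home

print(subjunctive_presupposition(open_facts, alternatives))    # True
print(settled_answers(open_facts, alternatives))               # []
print(subjunctive_presupposition(closed_facts, alternatives))  # False
print(settled_answers(closed_facts, alternatives))             # ['home']
```

In this sketch, whenever the presupposition holds for a non-empty set of best worlds, no alternative is settled, which is one way of picturing the claim that the facts underdetermine the answer.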
There are no implicatures about evidence types this time, but interestingly, we still predict reduced interrogative force. And this time, the contribution of the subjunctive is absolutely critical to deriving the effect. Due to the subjunctive, the question as a whole presupposes for each contextually salient place that Gloria might go, that there is at least one stereotypical world compatible with the facts in which she doesn’t go there. This means that the facts underdetermine where she might go, and thus that the addressee may not know where she will go. Given that the subjunctive is crucial in deriving the reduced interrogative force, we correctly predict that the subjunctive is obligatory in conjectural questions like (102).

7.3 Ignorance free relatives

Ignorance free relatives in St’át’imcets are formed by the combination of a wh-word, the subjunctive, and the inferential evidential k’a. Some examples are repeated here.50 (105)
a. qwatsáts=t’u7 múta7 súxwast áku7, t’ak aylh áku7,
   leave=prt again go.downhill deic go then deic
   nílh=k’a s=npzán-as
   foc=infer nom=meet(dir)-3erg
   k’a=lh=swát=as=k’a káti7 ku=npzán-as
   infer=comp=who=3sbjn=infer deic det=meet(dir)-3erg
   ‘So he set off downhill again, went down, and then he met whoever he met.’
   (van Eijk & Williams 1981: 66, cited in Davis 2009)
b. o, púpen’=lhkan [ta=stam’=as=á=k’a]
   oh find=1sg.indic [det=what=3sbjn=exis=infer]
   ‘Oh, I’ve found something or other.’
   (Unpublished story by “Bill” Edwards, cited in Davis 2009)
50 Thanks to Henry Davis for helpful discussions of free relatives in St’át’imcets.

There is a large literature on free relatives, concentrating mainly on English (although see Dayal 1997 for discussion of Hindi and Davis 2009 for discussion of St’át’imcets). Here I adopt von Fintel’s (2000) analysis; as far as I know, nothing crucial hinges on the differences between von Fintel’s analysis and those of, for example, Jacobson (1995) or Dayal (1997). I will argue that the St’át’imcets ignorance free relatives are compatible with von Fintel’s proposals, and that their interpretation relies on the independently-attested semantics of the subjunctive and the evidential.

According to von Fintel, both ignorance and indifference free relatives presuppose that there is variation among the worlds in the modal base with respect to the identity of the referent. The free relative denotes a definite description, and the sentence as a whole asserts that the definite description satisfies the relevant property. (106)
$\textit{whatever}(w)(F)(P)(Q)$
a. presupposes: $\forall w' \in \min_w[F \cap (\lambda w'.\ \iota x.P(w')(x) \neq \iota x.P(w)(x))] : Q(w')(\iota x.P(w')(x)) = Q(w)(\iota x.P(w')(x))$
b. asserts: $Q(w)(\iota x.P(w)(x))$
(von Fintel 2000: 34)
With ignorance free relatives, the modal base F is the epistemic alternatives of the speaker.51 Consider (107), for example. (107)
There’s a lot of garlic in whatever (it is that) Arlo is cooking. (von Fintel 2000: 27)
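A toy rendering of what von Fintel’s analysis predicts for (107) may help. The Python sketch below is an illustration only. It ignores the minimality operator by treating every epistemic alternative as minimally different from the actual world, and it reads the presupposition as requiring that the dish cooked in each such world be as garlicky as the dish actually cooked. The `World` record and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    dish: str        # what Arlo is cooking in this world
    garlicky: bool   # whether that dish has a lot of garlic in it


def ignorance_presupposition(actual, epistemic):
    """Some epistemic alternative differs from the actual world in the dish."""
    return any(w.dish != actual.dish for w in epistemic)


def uniformity_presupposition(actual, epistemic):
    """Where the dish differs, the garlic verdict matches the actual one."""
    return all(
        w.garlicky == actual.garlicky
        for w in epistemic
        if w.dish != actual.dish
    )


def assertion(actual):
    """The unique thing Arlo is actually cooking has a lot of garlic in it."""
    return actual.garlicky


actual = World("stew", True)
ignorant = [actual, World("curry", True)]   # speaker cannot tell stew from curry
informed = [actual]                         # speaker knows it is stew

print(ignorance_presupposition(actual, ignorant))   # True
print(uniformity_presupposition(actual, ignorant))  # True
print(assertion(actual))                            # True
print(ignorance_presupposition(actual, informed))   # False
```

On this simplified picture, the free relative is felicitous only for the `ignorant` speaker, whose epistemic alternatives disagree about the identity of the dish.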
(107) presupposes that in all the speaker’s epistemically accessible worlds which are minimally different from the actual world and in which Arlo is cooking something different from what he is actually cooking, there is the same amount of garlic in what he is cooking. As the min-operator introduces an existential presupposition, (107) presupposes that there are epistemically accessible worlds in which Arlo is cooking something different from what he is actually cooking. This amounts to a presupposition that the speaker is ignorant about the identity of what Arlo is cooking. (107) then asserts that the unique thing which Arlo is cooking has a lot of garlic in it.

51 With indifference free relatives, the modal base includes counterfactual alternatives.

Turning to St’át’imcets, we see that von Fintel’s semantics captures the required meanings accurately. (105a) presupposes that the speaker does not know who ‘he’ (the man being talked about) met, and asserts that he met whoever he met. Moreover, it seems that we can account for the presence of the subjunctive in free relatives, and also for the presence of the inferential evidential. In particular, I would like to suggest that the presupposition of speaker ignorance about the denotation of the free relative actually derives from the evidential k’a and the subjunctive. The basic idea is that an ignorance free relative is formed from a conjectural question (see Davis 2009 for this insight, although Davis does not word it in this way). The free relative in (105a), for example, is formed from the conjectural question in (108). (108)
swát=as=k’a káti7 ku=npzán-as
who=3sbjn=infer deic det=meet(dir)-3erg
‘I wonder who he met.’
Following the analysis of conjectural questions given in subsection 7.2, (108) denotes the set of propositions of the form ‘he met x’. The evidential in (108) would normally undergo interrogative flip, giving rise to the inference that the hearer is not in a position to answer the question of who he met. When (108) is embedded in a non-matrix environment as in (105a), however, I assume that interrogative flip does not take place. The free relative based on (108) will therefore carry a conjoined presupposition that the speaker has inferential evidence for each alternative, and an implicature that the speaker has no stronger evidence about who he met. And due to the subjunctive, it will presuppose that for each alternative, there is at least one best world in the modal base in which that alternative is false. Thus, the free relative formed from (108) will presuppose that there is mixed evidence about who he met, and that for each person x, it’s compatible with the speaker’s knowledge that he did not meet x. This derives the desired ‘speaker ignorance’ presupposition.

Moreover, we can regard the subjunctive as an overt spell-out of the existential presupposition of the min-operator, namely that there are epistemically accessible worlds in which the person he met is not who he met in the actual world. A final advantage of this approach is that we correctly capture the fact that the modal base contains epistemic alternatives, as k’a lexically encodes an epistemic conversational background. This accounts for the fact that only ignorance free relatives, and not indifference free relatives, contain k’a in St’át’imcets (Davis 2009).52

52 Free relatives in St’át’imcets are far from solved. For example, Davis (2009) points out a problem with free relatives which surface as DPs, as in (105b) above. Davis shows that syntactically, this wh-word acts like the head noun of a relative clause. This poses a challenge for the claim that (105b) is formed from a conjectural question. Moreover, if the wh-word is functioning as a head noun in (105b), the evidential k’a should not be able to attach to it, as k’a attaches only to predicates. This is a peculiarity of k’a; Davis shows that other second-position evidentials, such as reportative ku7 or perceived-evidence =an’, are ungrammatical in free relatives. Further research is required.

7.4 ‘Pretend’

There are two patterns to account for with the ‘pretend’ cases, depending on the dialect. In Upper St’át’imcets, the subjunctive plus the normative modal ka frequently renders a ‘pretend to be ...’ interpretation. In Whitley et al. no date, a native-speaker-produced St’át’imcets teaching manual, the standard construction when the teacher is asking the students to pretend something is that in (109).

(109)
a. skalúl7=acw=ka: saq’w knáti7 múta7 em7ímn-em
   owl=2sg.sbjn=deon fly deic and animal.noise-mid
   ‘Pretend to be an owl: fly around and hoot.’ (Davis 2006: chapter 24)
b. snu=hás=ka ku-skícza7
   2sg.emph=3sbjn=deon det=mother
   ‘Pretend to be the mother.’ (Whitley et al. no date)

In Lower St’át’imcets, however, examples like the ones in (109) are rejected in ‘pretend’ contexts. Lower St’át’imcets uses either an emphatic pronoun in a cleft, as in (110a), or the adhortative particle malh, as in (110b). In each case, the subjunctive is present, but ka is absent.

(110)
a. nu=hás ku=skalúla7: sáq’w=kacw knáti7
   2sg.emph=3sbjn det=owl fly=2sg.indic deic
   ‘Pretend to be an owl.’
b. skalúl7=acw=malh: sáq’w=kacw knáti7
   owl=2sg.sbjn=adhort fly=2sg.indic deic
   ‘Pretend to be an owl: fly around.’

In each of the dialectal variants, the apparent ‘pretend’ construction seems to reduce to another usage, rather than really meaning ‘pretend’. The examples in (109) are merely instances of the subjunctive adding to a normative modal assertion. (109a) thus really means something like ‘I wish you were an owl’, and (109b) means ‘I wish you were the mother.’ In (110a), the subjunctive adds to a plain assertion to create a wish, something which is possible with clefts; cf. (5) above. As for (110b), the consultant spontaneously translates this into English as ‘You may as well be an owl’. The presence of adhortative malh here is a matter for future research; see comments in Section 8 below.

Support for the idea that (109) and (110) are not really ‘pretend’ constructions comes from the fact that exactly parallel structures are used when the wish is not that someone pretend to be something, but rather is a wish which has a chance of coming true. This is shown in (111). While the consultant accepts a ‘pretend’ translation for the sentences in (111), she spontaneously translates them into English using simply ‘you be . . . ’. She judges that the St’át’imcets sentences do not really mean ‘pretend’. (111)
a. nu=hás ku=kúkwpi7
   2sg.emph=3sbjn det=chief
   ‘Pretend to be the chief.’    [accepted]
   ‘You be the chief.’    [spontaneously given]
b. nu=hás ku=kúkw
   2sg.emph=3sbjn det=cook
   ‘Pretend to cook.’    [accepted]
   ‘You be the cook.’    [spontaneously given]
7.5 Why St’át’imcets is not like Romance

In this final sub-section I return to a major cross-linguistic difference between the St’át’imcets subjunctive and more familiar, Indo-European subjunctives, namely that in St’át’imcets the subjunctive is never selected by a matrix predicate, and in fact is ungrammatical under all attitude verbs (as shown in (38) above). It turns out that this falls out from the current analysis. The St’át’imcets subjunctive is parasitic on a modal, and introduces the presupposition that in at least one of the best worlds in the modal base according to the ordering source, the embedded proposition is false. This presupposition is incompatible with the semantics of attitude verbs, which are standardly analyzed as introducing universal quantification over a set of worlds. This is illustrated in (112) for English believe. (112)
$[\![\textit{believe}]\!]^{w,g} = \lambda p_{\langle s,t\rangle}.\lambda x.\forall w'$ compatible with what $x$ believes in $w$: $p(w') = 1$ (von Fintel & Heim 2007: 18)
There is no reason to assume that attitude verbs like ‘believe’ have a different semantics in St’át’imcets than in English. On the contrary, the St’át’imcets verb tsutánwas ‘think, believe’ must involve universal quantification over belief-worlds, without the possibility of domain restriction (in other words, there is no choice function or second ordering source). Thus, (113), just like its English gloss, requires that in all Laura’s belief-worlds, John has left. It cannot mean that Laura’s beliefs allow, but do not require, that John has left. (113)
tsut-ánwas k=Laura kw=s=qwatsáts=s k=John
say-inside det=Laura det=nom=leave=3poss det=John
‘Laura thinks that John left.’
Given this, adding the subjunctive under the verb ‘believe’ in St’át’imcets leads to the following contradictory result. (114)
*tsut-ánwas k=Laura kw=s=qwatsáts=as k=John
say-inside det=Laura det=nom=leave=3sbjn det=John
‘Laura thinks that John left.’

$[\![(114)]\!]^{w}$ is only defined if $\exists w'$ compatible with Laura’s beliefs in $w$: John didn’t leave in $w'$.
If defined, $[\![(114)]\!]^{w} = 1$ iff $\forall w'$ compatible with Laura’s beliefs in $w$: John left in $w'$.

The presupposition of the subjunctive contradicts the assertion. This explains why the subjunctive is not used under verbs like ‘believe’ in St’át’imcets, unlike in Romance.

We need to separately discuss the absence of subjunctive under desire verbs in St’át’imcets. An example was given in (38e), repeated here.53 (115)
xát’-min’-as k=Laura kw=s=t’iq=Ø k=John
want-red-3erg det=Laura det=nom=arrive=3indic det=John
‘Laura wanted John to come.’
53 Thanks to an anonymous reviewer for discussion of this issue.

Desire verbs are often treated as involving comparison between alternative worlds (e.g., Stalnaker 1984, Heim 1992 and much subsequent work). The intuition is that ‘John wants you to leave means that John thinks that if you leave he will be in a more desirable world than if you don’t leave’ (Heim 1992: 193). Here I adopt Portner’s (1997) analysis of desire verbs, and in particular we will see that the St’át’imcets verb xát’min’ is better analyzed as similar to English hope (which according to Portner is similar to believe, and therefore is not intrinsically comparative) than to English want. Portner analyzes hope in terms of a buletic accessibility relation $\mathrm{Bul}_{\alpha}(s, b)$. For any situation $s$ and belief situation $b$ of an agent $\alpha$, $\mathrm{Bul}_{\alpha}(s, b)$ is the set of buletic alternatives for $\alpha$ in $s$, i.e., ‘the worlds in which the most of α’s plans in s (relative to his or her beliefs in b) are carried out’ (Portner 1997: 178). The sentence in (116) receives the interpretation shown: it is true just in case in all of James’s buletic alternatives, Joan arrives in Richmond soon. (116)
James hopes that Joan arrives in Richmond soon.
$\{s : \mathrm{Bul}_{\mathrm{James}}(s, b) \subseteq [\![\text{Joan arrives in Richmond soon}]\!]^{s}\}$ (Portner 1997: 188)
Portner’s analysis of hope differs from that of want, and is parallel to that of believe, in crucial respects (which explain the different embedding possibilities for hope/believe vs. want). In particular, while hope and believe are defined directly in terms of (doxastic or buletic) alternatives, want is defined in terms of the agent’s plans. Portner argues that the difference between hope and want is ‘an idiosyncratic lexical one’ (Portner 1997: 189). If this is correct, it would not be unexpected that a language could contain only the hope-type of desire predicate. If we apply Portner’s analysis of hope to St’át’imcets xát’min’, and attempt to use the subjunctive in the embedded clause, we get the result in (117). (117)
*xát’-min’-as k=Laura kw=s=t’íq=as k=John
want-red-3erg det=Laura det=nom=arrive=3sbjn det=John
‘Laura wanted John to come.’

$[\![(117)]\!]^{s}$ is only defined if $\exists s \in \mathrm{Bul}_{\mathrm{Laura}}(s, b)$: John does not come in $s$.
If defined, $[\![(117)]\!]^{s} = 1$ iff $\{s : \mathrm{Bul}_{\mathrm{Laura}}(s, b) \subseteq [\![\text{John comes}]\!]^{s}\}$

(117) is defined only if there is at least one situation in Laura’s buletic alternatives in which John does not come, but it asserts that in all Laura’s buletic alternatives, John comes. The contradiction between the presupposition and the assertion leads to the unacceptability of the sentence.
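The clash described for (114) and (117) can be checked by brute force on a small model. The Python sketch below is an illustration under stated assumptions: attitude alternatives and the prejacent are plain sets of world labels, the attitude verb contributes universal quantification, and the subjunctive contributes an existential presupposition over the same set of alternatives. The names `WORLDS`, `attitude_assertion`, `subjunctive_presupposition`, and `consistent_models` are hypothetical.

```python
from itertools import chain, combinations

WORLDS = ("w1", "w2", "w3")  # a small, hypothetical space of worlds


def powerset(items):
    """All subsets of items, as tuples."""
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )


def attitude_assertion(alternatives, prejacent):
    """Universal force: the prejacent holds in every attitude alternative."""
    return all(w in prejacent for w in alternatives)


def subjunctive_presupposition(alternatives, prejacent):
    """Existential presupposition: the prejacent fails in some alternative."""
    return any(w not in prejacent for w in alternatives)


def consistent_models():
    """Pairs (alternatives, prejacent) that are both defined and true."""
    found = []
    for alts in powerset(WORLDS):
        for prej in powerset(WORLDS):
            prejacent = set(prej)
            if (subjunctive_presupposition(alts, prejacent)
                    and attitude_assertion(alts, prejacent)):
                found.append((alts, prej))
    return found


print(consistent_models())  # []
```

No choice of alternatives and prejacent satisfies both conditions, which mirrors the contradiction between presupposition and assertion. The result depends on the assumption that both quantify over exactly the same set of worlds, with no domain restriction.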
The idea that St’át’imcets xát’min’ is parallel to English hope or believe rather than to English want leads to the following cross-linguistic comparison. While Indo-European has two kinds of attitude verbs (those involving universal quantification over alternative worlds, and those which are intrinsically comparative), St’át’imcets has only the former kind. This explains why St’át’imcets lacks subjunctives under attitude verbs, and even allows us to draw the broader generalization that St’át’imcets only allows universal quantification over worlds. This language lacks both true possibility modals and comparative subjunctive-embedding predicates.54

8 Conclusions and questions for future research
The goal of this paper was to extend the formal cross-linguistic study of modality to the related domain of mood. Prior work on St’át’imcets has proposed that languages vary in whether their modals encode quantificational force (as in English), or conversational background (as in St’át’imcets) (Matthewson et al. 2007, Rullmann et al. 2008, Davis et al. 2009). Here, I have argued that languages vary in their mood systems along the same dimension, at least functionally. While some languages use moods to encode distinctions of conversational background (buletic, deontic, etc.), St’át’imcets uses mood to functionally achieve a restriction on modal quantificational force. (Of course technically, both modals and moods in St’át’imcets restrict conversational backgrounds: the modal force is always universal.) If this view is correct, then each language-type draws on its moods and its modals together to allow the full range of specifications. In other words, what modals don’t encode, moods do. The simplified typological table is repeated here.
                 lexically encode quant. force    lexically encode conv. background
Indo-European    modals                           moods
St’át’imcets     moods                            modals

Table 5  Modal and mood systems
54 Thanks to an anonymous reviewer for discussion of this point.

The analysis presented here raises some questions for future research. One outstanding issue is the status of subjunctives with no overt licenser at all, as in (5)–(6). As noted earlier, these appear to be productive only in clefts. It is not immediately obvious that a cleft contains a modal operator which would license the subjunctive, so further investigation is required (although see fn. 22).

A second interesting puzzle relates to subjunctive imperatives (see subsection 7.1). These seem to strongly prefer the presence of the adhortative particle malh, which is normally optional in imperatives. Perhaps malh (which has not previously been analyzed) is a modal, and perhaps its obligatoriness reflects the licensing requirement of the subjunctive. But what consequence would this have for the analysis provided above, which assumes that even imperatives with no adhortative particle contain a concealed deontic modal? This question cannot be answered without a real investigation of malh, something which goes beyond the bounds of the current paper.

An even trickier element is the particle t’u7. t’u7 is the culprit in the two uses of the subjunctive I have declined to analyze here, the ‘might as well’ cases and the indifference free relatives. Like malh, t’u7 has not yet been formally analyzed, but for t’u7 there are not even any clear descriptive generalizations about its usage. It is often translated as ‘just’ or ‘still’, but also occurs where there is no obvious English translation, or even any detectable semantic contribution. t’u7 frequently appears with strong quantifiers, as in (118a), is almost obligatory if one wants to express ‘only’, as in (118b), and is also the St’át’imcets way to express ‘but’, as in (118c) (although here, unlike in its other uses, it is not a second-position enclitic, and this may therefore be a case of homophony). (118)
a. tákem=t’u7 swat áolsvm l=ti=tsítcw=a
   all=prt who sick in=det=house=exis
   ‘Everyone in the house was sick.’ (Matthewson 2005: 311)
b. tsúkw=t’u7 snilh ti=tsícw=a aolsvm-áolhcw
   finish=prt 3sg.emph det=get.there=exis sick-house
   ‘It was only him who went to the hospital.’ (Matthewson 2005: 324)
c. plan aylh láku7 wa7 cw7it i=tsetsítcw=a, t’u7
   already then deic impf many det.pl=houses=exis but
   pináni7 cw7aoz láti7 ku=wá7 tsitcw
   temp.deic neg deic det=impf house
   ‘Now there are lots of houses there, but then there were no houses.’ (Matthewson 2005: 54)

As noted above, t’u7 is present in the ‘might as well’ uses of the subjunctive, and in indifference free relatives. Examples are repeated here. (119)
a. wá7=lhkacw=t’u7 lts7a lhkúnsa ku=sgáp
   be=2sg.indic=prt deic now det=evening
   ‘You are staying here for the night.’
b. wá7=acw=t’u7 lts7a lhkúnsa ku=sgáp
   be=2sg.sbjn=prt deic now det=evening
   ‘You may as well stay here for the night.’
(120)
[stám’=as=t’u7 káti7 i=wá7 ka-k’ac-s-twítas-a
[what=3sbjn=prt deic det.pl=impf circ-dry-caus-3pl.erg-circ
i=n-slalíl’tem=a] wa7 ts’áqw-an’-em
det.pl=1sg.poss-parents=exis] impf eat-dir-1pl.erg
lh=as sútik
comp(impf)=3sbjn winter
‘Whatever my parents could dry, we ate in wintertime.’
(Matthewson 2005: 141, cited in Davis 2009)
Given the analysis above, we expect there to be a modal — or at least a modal base and an ordering source — present in any structure where the subjunctive is licensed. The interpretation of subjunctive + t’u7 in (119b) is plausibly modal — the consultants are remarkably consistent with the ‘might as well’ translation. There is also a certain similarity between the ‘might as well’ construction and the Sufficiency Modal Construction (Krasikova & Zchechev 2005, von Fintel & Iatridou 2008), illustrated in (121). (121)
To get good cheese, you only have to go to the North End! (von Fintel & Iatridou 2008: 445)
The crucial elements of the Sufficiency Modal Construction are (a) a necessity modal and (b) an exclusive operator such as ‘only’.55 The possible connection between (119) and (121) may be fruitful to investigate in future work.56

55 For von Fintel and Iatridou, the ‘only’ is decomposed into ‘neg . . . except’ (and shows up overtly as this in some languages).

56 See also Mitchell 2003 on ‘might as well’ in English.
As for indifference free relatives as in (120), these also very plausibly contain a covert modal, presumably a necessity one. The important question will be whether the subjunctive can be analyzed as a weakener in the indifference free relatives. Ideally, the future analysis of (119)–(120) will also elucidate the semantic connection between the two t’u7-subjunctives, both of which somehow express the notion of ‘indifference’ (although perhaps in different senses of the word). (119b), for example, conveys that you can stay here for the night or not, I don’t really care.

In spite of these outstanding questions, I believe that the empirical coverage of the analysis presented here is encouraging. Out of the nine meaningful uses of the St’át’imcets subjunctive, we set aside two which rely on the poorly understood particle t’u7, but have managed to unify the remaining seven. The analysis accounts for such seemingly disparate effects as the weakening of imperatives, the reduction in interrogative force of questions, and the non-appearance of the subjunctive under any attitude verb. The analysis, if correct, supports the modal approach to mood advocated by Portner (1997), and suggests that languages have a certain amount of freedom in how they divide up the various functional tasks required of moods and modals.

Finally, the research reported on here opens up broader questions about the nature of mood cross-linguistically, for example about the relation between subjunctive and irrealis. In Section 2, I showed that the St’át’imcets subjunctive patterns morpho-syntactically, as well as in some of its semantic properties, like a subjunctive rather than an irrealis. However, we also saw that the St’át’imcets subjunctive differs semantically from Indo-European subjunctives. I argued above (see fn. 9) that the use of the term ‘subjunctive’ was justified, even in the face of such non-trivial cross-linguistic variation. Still, there is much more work to be done on the formal semantics of mood cross-linguistically. Once a wider range of systems has been investigated in depth, we may find that the traditional terminology does not correlate with the cross-linguistically interesting divisions. Topics for future inquiry include whether there is a minimal semantic change which would turn a subjunctive morpheme into an irrealis one, or vice versa, and in general what the semantic building blocks are from which moods are composed.
References

Aikhenvald, Alexandra. 2006. Evidentiality. New York: Oxford University Press.
Baker, Mark & Lisa Travis. 1997. Mood as verbal definiteness in a “tenseless” language. Natural Language Semantics 5(3). 213–269. doi:10.1023/A:1008262802401.
Beghelli, Filippo. 1998. Mood and the interpretation of indefinites. The Linguistic Review 15(2-3). 277–300. doi:10.1515/tlir.1998.15.2-3.277.
Bolinger, Dwight. 1968. Postposed main phrases: an English rule for the Romance subjunctive. Canadian Journal of Linguistics 14. 3–33.
Caponigro, Ivano & Jon Sprouse. 2007. Rhetorical questions as questions. In Proceedings of Sinn und Bedeutung 11, 121–133. http://idiom.ucsd.edu/~ivano/Papers/2007_Rhetorical-Qs_SuB.pdf.
Condoravdi, Cleo. 2002. Temporal interpretation of modals: Modals for the present and the past. In David Beaver, Stefan Kaufmann, Brady Clark & Luis Casillas (eds.), Stanford Papers on Semantics, vol. 7, 59–88. Stanford: CSLI Publications. http://semanticsarchive.net/Archive/2JmZTIwO/.
Davis, Christopher, Christopher Potts & Margaret Speas. 2007. The pragmatic values of evidential sentences. In Masayuki Gibson & Tova Friedman (eds.), Proceedings of the 17th Conference on Semantics and Linguistic Theory, 71–88. Ithaca, NY: CLC Publications. doi:1813/11294.
Davis, Henry. 2000. Remarks on Proto-Salish subject inflection. International Journal of American Linguistics 66(4). 499–520. doi:10.1086/466439.
Davis, Henry. 2006. A grammar of Upper St’át’imcets. Ms., University of British Columbia.
Davis, Henry. 2009. Free relatives in St’át’imcets (Lillooet Salish). Ms., University of British Columbia.
Davis, Henry, Lisa Matthewson & Hotze Rullmann. 2009. ‘Out of control’ marking as circumstantial modality in St’át’imcets. In Lotte Hogeweg, Helen de Hoop & Andrey Malchukov (eds.), Cross-linguistic semantics of tense, aspect and modality, 205–244. Oxford: John Benjamins. http://www.linguistics.ubc.ca/sites/default/files/TamTam_final_11-08-08.pdf.
Dayal, Veneeta. 1997. Free relatives and ever: Identity and free choice readings. In Proceedings of SALT VII, 99–116. http://www.rci.rutgers.edu/~dayal/ever.pdf.
van Eijk, Jan. 1997. The Lillooet language: Phonology, morphology, syntax. Vancouver, BC: UBC Press.
van Eijk, Jan & Lorna Williams. 1981. Lillooet legends and stories. Mt. Currie, BC: Ts’zil Publishing House.
Faller, Martina. 2002. Semantics and pragmatics of evidentials in Cuzco Quechua: Stanford dissertation.
Faller, Martina. 2006. Evidentiality and epistemic modality at the semantics/pragmatics interface. http://www.eecs.umich.edu/~rthomaso/lpw06/fallerpaper.pdf.
Farkas, Donka. 1992. On the semantics of subjunctive complements. In Paul Hirschbühler & Konrad Koerner (eds.), Romance languages and modern linguistic theory: Papers from the 20th linguistic symposium on Romance languages, 69–104. Amsterdam and Philadelphia: Benjamins.
Farkas, Donka. 2003. Assertion, belief and mood choice. Paper presented at the Workshop on Conditional and Unconditional Modality, ESSLLI, Vienna. http://people.ucsc.edu/~farkas/papers/mood.pdf.
von Fintel, Kai. 2000. Whatever. In Proceedings of SALT X, 27–40. http://web.mit.edu/fintel/www/whatever.pdf.
von Fintel, Kai & Anthony Gillies. 2010. Must . . . stay . . . strong! Natural Language Semantics. doi:10.1007/s11050-010-9058-2.
von Fintel, Kai & Irene Heim. 2007. Intensional semantics lecture notes. Ms., MIT. http://mit.edu/fintel/IntensionalSemantics.pdf.
von Fintel, Kai & Sabine Iatridou. 2008. How to say ought in foreign: The composition of weak necessity modals. In Jacqueline Guéron & Jacqueline Lecarme (eds.), Time and modality, 115–141. Dordrecht: Springer. http://mit.edu/fintel/fintel-iatridou-2006-ought.pdf.
Garrett, Edward. 2001. Evidentiality and assertion in Tibetan. Los Angeles, CA: UCLA dissertation.
Gauker, Christopher. 1998. What is a context of utterance? Philosophical Studies 91(2). 149–172. doi:10.1023/A:1004247202476.
Giannakidou, Anastasia. 1997. The landscape of polarity items. Groningen: University of Groningen dissertation.
Giannakidou, Anastasia. 1998. Polarity sensitivity as (non)veridical dependency. Amsterdam and Philadelphia: John Benjamins.
Giannakidou, Anastasia. 2009. The dependency of the subjunctive revisited: Temporal semantics and polarity. Lingua 119(12). 1883–1908. doi:10.1016/j.lingua.2008.11.007.
Giorgi, Alessandra & Fabio Pianesi. 1997. Tense and aspect: From semantics to morpho-syntax. Oxford: Oxford University Press.
Guerzoni, Elena. 2003. Why ‘even’ ask? on the pragmatics of questions and the semantics of answers: MIT dissertation. http://hdl.handle.net/1721.1/17646.
Hamblin, C. L. 1973. Questions in Montague English. Foundations of Language 10(1). 45–53. http://www.jstor.org/stable/25000703.
Han, Chung-hye. 1997. Deontic modality of imperatives. Language and Information 1. 107–136.
Han, Chung-hye. 1999. Deontic modality, lexical aspect and the semantics of imperatives. In Linguistics in the morning calm 4, Seoul: Hanshin Publications. http://www.sfu.ca/~chunghye/papers/morningcalm.pdf.
Han, Chung-hye. 2002. Interpreting interrogatives as rhetorical questions. Lingua 112(3). 201–229. doi:10.1016/S0024-3841(01)00044-4.
Haverkate, Henk. 2002. The syntax, semantics and pragmatics of Spanish mood. Amsterdam and Philadelphia: John Benjamins.
Heim, Irene. 1992. Presupposition projection and the semantics of attitude verbs. Journal of Semantics 9(3). 183–221. doi:10.1093/jos/9.3.183.
Hooper, Joan B. 1975. On assertive predicates. In John Kimball (ed.), Syntax and semantics 4, 91–124. New York: Academic Press.
Jacobs, Peter. 1992. Subordinate clauses in Squamish: A Coast Salish language. MA thesis, University of Oregon.
Jacobson, Pauline. 1995. On the quantificational force of English free relatives. In Emmon Bach, Eloise Jelinek, Angelika Kratzer & Barbara Partee (eds.), Quantification in natural language, 451–486. Dordrecht: Kluwer.
James, Frances. 1986. Semantics of the English subjunctive. Vancouver, BC: UBC Press.
Klein, Flora. 1975. Pragmatic constraints in distribution: the Spanish subjunctive. In Papers from the 11th CLS, 353–365.
Krasikova, Sveta & Ventsislave Zchechev. 2005. Scalar uses of only in conditionals. In Proceedings of the fifteenth Amsterdam Colloquium, 137–142. University of Amsterdam. http://www.ventsislavzhechev.eu/Home/Publications_files/.
Kratzer, Angelika. 1981. The notional category of modality. 
In Hans-Jürgen Eikmeyer & Hannes Rieser (eds.), Words, worlds, and contexts: New approaches in word semantics (Research in Text Theory 6), 38–74. Berlin: de Gruyter. Kratzer, Angelika. 1991. Modality. In Dieter Wunderlich & Arnim von Stechow (eds.), Semantics: An international handbook of contemporary research, 639–650. Berlin: de Gruyter.
9:70
Cross-linguistic variation in modality systems: The role of mood
Kratzer, Angelika. 2009. Modals and conditionals again, chapter 3. To be published by Oxford University Press. Kroeber, Paul. 1999. The Salish language family: Reconstructing syntax. Lincoln, NE: The University of Nebraska Press. doi:10.1017/S0022226702231928. Littell, Patrick. 2009. Conjectural questions and the wonder effect or: What could conjectural questions possibly be? Ms, University of British Columbia. Littell, Patrick, Lisa Matthewson & Tyler Peterson. 2009. On the semantics of conjectural questions. Paper presented at the MOSAIC Workshop (Meeting of Semanticists Active in Canada), Ottawa. Lunn, Patricia. 1995. The evaluative function of the Spanish subjunctive. In Joan Bybee & Suzanne Fleischman (eds.), Modality and grammar in discourse, 419–449. Amsterdam and Philadelphia: Benjamins. Matthewson, Lisa. 1998. Determiner systems and quantificational strategies: Evidence from Salish. The Hague: Holland Academic Graphics. Matthewson, Lisa. 1999. On the interpretation of wide-scope indefinites. Natural Language Semantics 7(1). 79–134. doi:10.1023/A:1008376601708. Matthewson, Lisa. 2005. When I was small – i wan kwikws: Grammatical analysis of St’át’imcets oral narratives. Vancouver, BC: UBC Press. Matthewson, Lisa. 2006. Presuppositions and cross-linguistic variation. In Proceedings of NELS 36, Amherst, Mass: GLSA Publications. Matthewson, Lisa. 2008a. Moods vs. modals in St’át’imcets and beyond. Paper presented at New York University. Matthewson, Lisa. 2008b. Pronouns, presuppositions and semantic variation. In Proceedings of SALT XVIII, 527–550. Cornell University: CLC Publications. http://www.linguistics.ubc.ca/sites/default/files/ MatthewsonSALTpronouns.pdf. Matthewson, Lisa. 2010. Evidence about evidentials: Where fieldwork meets theory. Paper presented at Linguistic Evidence 2010, University of Tübingen. http://www.linguistics.ubc.ca/sites/default/files/ MatthewsonLE2010.pdf. Matthewson, Lisa. to appear. On apparently non-modal evidentials. 
To appear in Proceedings of CSSP 2009 (EISS8). Matthewson, Lisa, Hotze Rullmann & Henry Davis. 2007. Evidentials as epistemic modals: Evidence from St’át’imcets. In J.V. Craenenbroeck (ed.), Linguistic Variation Yearbook, vol. 7, 201–254. John Benjamins Publishing Company.
9:71
Lisa Matthewson
Mitchell, Keith. 2003. Had better and might as well: On the margins of modality? In M. Krug R. Facchinetti & F. Palmer (eds.), Modality in contemporary english, 129–149. Berlin: Mouton de Gruyter. Murray, Sarah. to appear. Evidentiality and questions in Cheyenne. In Suzi Lima (ed.), Proceedings of SULA 5: Semantics of under-represented languages in the Americas, Amherst, MA: GLSA Publications. Palmer, Frank. 2006. Mood and modality. Cambridge: Cambridge University Press 2nd edn. doi:10.2277/0521804795. Panzeri, Francesca. 2003. In the (indicative or subjunctive) mood. In Proceedings of Sinn und Bedeutung 7, http://ling.uni-konstanz.de/pages/ conferences/sub7/proceedings/download/sub7_panzeri.pdf. Peterson, Tyler. 2009. The ordering source and graded modality in Gitskan epistemic modals. Ms., University of British Columbia. http://www. linguistics.ubc.ca/sites/default/files/Peterson(SuB).pdf. Peterson, Tyler. 2010. Epistemic modality and evidentiality in Gitksan at the semantics-pragmatics interface: University of British Columbia dissertation. http://hdl.handle.net/2429/23596. Portner, Paul. 1997. The semantics of mood, complementation and conversational force. Natural Language Semantics 5(2). 167–212. doi:10.1023/A:1008280630142. Portner, Paul. 2003. The semantics of mood. In Lisa Cheng & Rint Sybesma (eds.), The second Glot international state-of-the-article book, 47–77. Berlin: Mouton de Gruyter. Portner, Paul. 2004. The semantics of imperatives within a theory of clause types. In Proceedings of SALT XIV, Cornell University: CLC Publications. http://semanticsarchive.net/Archive/mJlZGQ4N/PortnerSALT04.pdf. Portner, Paul. 2007. Imperatives and modals. Natural Language Semantics 15(4). 351–383. doi:10.1007/s11050-007-9022-y. Portner, Paul. 2009. Modality Oxford Surverys in Semantics and Pragmatics. Oxford: Oxford University Press. Potts, Christopher. 2005. The logic of conventional implicatures. Oxford: Oxford University Press. Quer, Josep. 1998. 
Mood at the interface. The Hague: Holland Academic Graphics. Quer, Josep. 2001. Interpreting mood. Probus 13(1). 81–111. doi:10.1515/prbs.13.1.81. Quer, Josep. 2009. Twists of mood: The distribution and interpretation of indicative and subjunctive. Lingua 119(12). 1779–1787.
9:72
Cross-linguistic variation in modality systems: The role of mood
doi:10.1016/j.lingua.2008.12.003. Rivero, María. 1975. Referential properties of Spanish noun phrases. Language 51(1). 32–48. doi:10.2307/413149. Rocci, Andrea. 2007. Epistemic modality and questions in dialogue. the case of Italian interrogative constructions in the subjunctive mood. In L. de Saussure, J. Moeschler & G. Puska (eds.), Tense, mood and aspect: Theoretical and descriptive issues, 129–153. Amsterdam and New York: Rodopi. Rullmann, Hotze, Lisa Matthewson & Henry Davis. 2008. Modals as distributive indefinites. Natural Language Semantics 16(4). 317–357. doi:10.1007/s11050-008-9036-0. Schwager, Magdalena. 2005. Interpreting imperatives: University of Frankfurt/Main dissertation. Schwager, Magdalena. 2006. Conditionalized imperatives. In Proceedings of SALT XVI, Cornell University: CLC Publications. http://ecommons.library. cornell.edu/bitstream/1813/7591/1/salt16_schwager_241_258.pdf. Schwager, Magdalena. 2008. Optimizing the future - imperatives between form and function. Course notes, ESLLI 2008. http://zis.uni-goettingen. de/mschwager/esslli08/ms_schwager_esslli08.pdf. Stalnaker, Robert. 1974. Pragmatic presuppositions. In Milton Munitz & Peter Unger (eds.), Semantics and Philosophy, 197–214. New York University Press. Stalnaker, Robert. 1984. Inquiry. Cambridge, MA: MIT Press. Tenny, Carol. 2006. Evidentiality, experiencers and the syntax of sentience in Japanese. Journal of East Asian Linguistics 15(3). 245–288. doi:10.1007/s10831-006-0002-x. Tenny, Carol & Peggy Speas. 2004. The interaction of clausal syntax, discourse roles and information structure in questions. Paper presented at the Workshop on Syntax, Semantics and Pragmatics of Questions. ESLLI, Université Henri Poincaré, Nancy. http://www.linguist.org/ESSLI-Questions-hd.pdf. Terrell, Tracy & Joan Hooper. 1974. A semantically based analysis of mood in Spanish. Hispania 57(3). 484–494. doi:10.2307/339187. Thoma, Sonja. 2007. The categorical status of independent pronouns in St’át’imcets. 
Ms., University of British Columbia. Villalta, Elisabeth. 2009. Mood and gradability: an investigation of the subjunctive mood in Spanish. Linguistics and Philosophy 31(4). 467–522. doi:10.1007/s10988-008-9046-x. Whitley, Rose (translator), Henry Davis, Lisa Matthewson & Beveley Frank
9:73
Lisa Matthewson
(editors). no date. Teaching St’át’imcets Through Action. Translation of Bertha Segal Cook Teaching English Through Action. Upper St’át’imcets Language, Culture and Education Society.
Lisa Matthewson UBC Department of Linguistics Totem Field Studios 2613 West Mall Vancouver, BC, V6T 1Z4, Canada
[email protected]
Semantics & Pragmatics Volume 3, Article 10: 1–38, 2010 doi: 10.3765/sp.3.10
Free choice permission as resource-sensitive reasoning∗ Chris Barker New York University
Received 2009-10-14 / First Decision 2009-11-24 / Revised 2010-07-04 / Accepted 2010-08-14 / Final Version Received 2010-08-31 / Published 2010-09-01
Abstract Free choice permission is a long-standing puzzle in deontic logic and in natural language semantics. It involves what appears to be a conjunctive use of or: from You may eat an apple or a pear, we can infer that You may eat an apple and that You may eat a pear — though not that You may eat an apple and a pear. Following Lokhorst (1997), I argue that because permission is a limited resource, a resource-sensitive logic such as Girard’s Linear Logic is better suited to modeling permission talk than, say, classical logic. A resource-sensitive approach enables the semantics to track not only that permission has been granted and what sort of permission it is (i.e., permission to eat apples versus permission to eat pears), but also how much permission has been granted, i.e., whether there is enough permission to eat two pieces of fruit or only one. The account here is primarily semantic (as opposed to pragmatic), with no special modes of composition or special pragmatic rules. The paper includes an introduction to Linear Logic.
Keywords: Free choice, permission, linear logic, deontic, implicature, resource-sensitive, substructural
∗ Thanks to Simon Charlow, Emmanuel Chemla, Cleo Condoravdi, Judith Degen, Nicholas Fleisher, Sven Lauer, Koji Mineshima, Paul Portner, Daniel Rothschild, Philippe Schlenker, Chung-chieh Shan, Seth Yalcin, and my anonymous referees. ©2010 Chris Barker This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).
1 The resource-sensitivity of permission talk

Since Ross 1941, it has been clear that the logic of obligation and permission behaves dramatically differently than other sorts of ordinary reasoning:

(1) a. You may eat an apple or a pear.
    b. You may eat an apple.
    c. You may eat a pear.
If (1a) is true, then it is certainly true that you may eat an apple. Likewise, it is equally true that you have it within your power to safely eat a pear. So an adequate account of the meaning of (1a) must explain how it comes to imply (1b) and (1c). This pattern is by no means the usual case. Consider a variation on (1) in which the permissive modal may is omitted:

(2) a. You ate an apple or a pear.
    b. You ate an apple.
    c. You ate a pear.
In this case, (2a) certainly does not imply either (2b) or (2c). So something about permission talk correlates with the unusual implications we are concerned with here. The puzzle posed by the facts in (1) is known as the free choice permission problem (Kamp (1973) attributes the choice of name to von Wright). Since (1a) implies both (1b) and (1c), (1b) and (1c) are therefore both equally true. Thus in many discussions, (1a) is said to imply (3a), since (3a) is merely the conjunction of (1b) and (1c):

(3) a. You may eat an apple and you may (also) eat a pear.
    b. You may eat an apple or you may (*also) eat a pear.
Crucially, however, (3a) has an interpretation on which it furnishes permission to eat more than one piece of fruit. This interpretation is the one compatible with adding also in the second conjunct. Now, although (1a) may be consistent with a situation in which the addressee is allowed to eat more than one piece of fruit (as we will see below), the truth of (1a) alone is never sufficient to guarantee that more than one piece of fruit may be eaten. As a result, (3b) is a better candidate for a paraphrase of (1a): it, too (surprisingly!) implies (1b) and (1c), but, like (1a), it does not ever justify eating more than one piece of
fruit. This is why also is never appropriate in the second disjunct in (3b) on the intended reading.

What I am suggesting is that a complete characterization of permission sentences must not only tell us whether permission exists and what type of permission it is (i.e., permission to eat an apple versus permission to eat a pear), it must also characterize how much permission has been granted. Thus it must predict that (1a) and (3b) guarantee permission only to eat one piece of fruit, but that (3a) can be used to provide permission to eat two pieces of fruit.

The key insight that I would like to develop in this paper first appears, as far as I know, in unpublished work of Lokhorst (1997): that permission and obligation form a resource-sensitive domain, so that logics based on (resource-insensitive) classical logic are not appropriate. Lokhorst suggests using Girard’s (1987) Linear Logic instead, and I will follow the technical details of his proposal closely. The contribution of this paper will be to introduce Lokhorst’s work to a linguistic audience, to evaluate it with respect to competing linguistic analyses, and to investigate the implications of adapting Lokhorst’s proposal for the theory of natural language semantics and pragmatics.

Resource-sensitive (‘substructural’) logics are already familiar in linguistics as tools for building syntax/semantics interfaces (e.g., Moortgat 1997 or Dalrymple 2001). As far as I know, however, no one has yet suggested that natural language connectives such as or or and can have uses in which they behave semantically like connectives in a substructural logic, as I am suggesting here.

Kamp (1973, 1978) discusses free choice permission not just as a puzzle for modeling reasoning about obligation (deontic logic), but as a puzzle for the composition of natural language expressions. From the point of view of natural language semantics, the interesting thing about the free choice permission problem is that it appears to require not only making assumptions about the meaning of certain uses of modal expressions such as may, but about the meaning of the corresponding uses of the coordinating conjunctions and and or. This will be true of the solution I offer below.

Many solutions to the free choice permission problem rely on pragmatic mechanisms for much of the heavy lifting, including Kamp 1978, Zimmermann 2000, Fox 2007, and others. The arguments that free choice implications are pragmatic, and more specifically are scalar implicatures, stem from discussions of indefinites in Kratzer and Shimoyama 2002, as developed by
Alonso-Ovalle (2006) and Fox (2007). The main evidence that free choice implications may be scalar implicatures turns on the behavior of negated permission sentences (You may not eat an apple or a pear); I show how the analysis here can explain the behavior of such sentences in section 5. In contrast to the pragmatic approaches, I will argue that the main free choice implications, including especially the implications from (1a) to (1b) and to (1c), are matters of entailment. To the extent that the analysis here is viable, it calls into question whether free choice implications are indeed implicatures. I discuss other entailment approaches (e.g., Aloni 2007) in section 6.2.

2 Classical logic versus Linear Logic
The account of free choice given below will depend on understanding the basics of Linear Logic at a fairly deep level. Since Linear Logic is unfamiliar to most semanticists, this section will present the basics of Linear Logic.

2.1 Classical logic

I will only introduce the elements of classical logic that will be relevant for comparison with Linear Logic in the discussion below. This will include conjunction, disjunction, negation, and Weakening, but not, for example, quantification.

Formulas. There is a set of atomic formulas a, b, c, . . . , and a set of variables over formulas A, B, C, . . . . Assume A and B are formulas. Then the classical negation of A, written ¬A, is a formula; the classical conjunction of A and B, written A ∧ B, is a formula; and the classical disjunction of A and B, written A ∨ B, is a formula. In addition, the classical implication of A and B, written A → B, is defined as an abbreviation of (¬A) ∨ B.

Sequents. A sequent A, B, . . . , M ⊢ N, O, . . . , Z consists of two multisets of formulas joined by a turnstile (‘⊢’). Classical sequents are interpreted as asserting that whenever all of the formulas in the leftmost multiset hold, then at least one of the formulas in the rightmost multiset must also hold. Saying that a sequent contains multisets rather than lists of formulas means that the order in which formulas are written is immaterial. Thus A, B and B, A represent the same multiset, but A, B is a different multiset than A, A, B, since the second multiset contains two instances of the formula A.
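The multiset behavior just described can be illustrated with a short sketch. It assumes that formulas are represented as plain strings and uses Python's standard `collections.Counter` as the multiset; the helper name `multiset` is introduced here only for illustration.

```python
from collections import Counter

# A multiset records how many times each formula occurs, but not the order.
def multiset(*formulas):
    return Counter(formulas)

# A, B and B, A are the same multiset: order is immaterial.
same_order_free = multiset("A", "B") == multiset("B", "A")

# A, B and A, A, B are different multisets: the number of occurrences matters.
differs_by_count = multiset("A", "B") != multiset("A", "A", "B")

# The second multiset contains two instances of the formula A.
occurrences_of_A = multiset("A", "A", "B")["A"]

print(same_order_free, differs_by_count, occurrences_of_A)
```

A list would distinguish A, B from B, A, and a set would identify A, B with A, A, B, so neither matches the notion of context used in the sequents.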
Capital Greek letters (∆, Γ, . . . ) schematize over (possibly empty) multisets of formulas. The turnstile can occur in any position, and there can be more than one formula on the right hand side, so that the expression ‘∆ ⊢ A, B’, the expression ‘∆ ⊢’, and the expression ‘⊢ ∆’ are all legitimate sequents.

Negation. The following pair of inference rules characterize classical negation:

\[
\frac{\Delta, A \vdash \Gamma}{\Delta \vdash \neg A, \Gamma}\;\neg_1
\qquad
\frac{\Delta \vdash A, \Gamma}{\Delta, \neg A \vdash \Gamma}\;\neg_2
\]

Beginning with ¬1, the inference rule on the left: if Γ follows from the formulas in ∆ along with A (this is what the sequent above the horizontal line expresses), then from ∆ alone we can conclude that either some member of Γ is still true, or else A must be false (the sequent below the horizontal line). Similar reasoning applies for the inference rule on the right, ¬2.

Proofs. A proof that a sequent is valid begins with trivial tautologies, here, that A ⊢ A:

\[
\dfrac{\dfrac{A \vdash A}{\vdash \neg A, A}\;\neg_1}{\neg\neg A \vdash A}\;\neg_2
\]

As long as each subsequent inference step instantiates a valid inference rule, the proof guarantees that the final sequent will also be valid. A sequent at the bottom of such a proof is called a theorem of the logic. Reading from top to bottom, the first step of the proof here is an instantiation of the inference rule ¬1. This step concludes that either A or its negation must be true (a version of the law of excluded middle); the second step (labeled ¬2) proves that two adjacent negations cancel out (the law of double negation). Proving that A ⊢ ¬¬A is equally easy.

Conjunction. The inference characterizing classical conjunction has two premises:

\[
\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \wedge B}\;\wedge
\]

If the assumptions in ∆ allow you to prove that A is true (i.e., if ∆ ⊢ A), and the very same set of assumptions also allows you to prove that B is true, then you are certainly in a position to assert that the classical conjunction of A and B must be true.

Disjunction. For disjunction, we have a matched pair of inferences:

\[
\frac{\Delta \vdash A}{\Delta \vdash A \vee B}\;\vee_1
\qquad
\frac{\Delta \vdash B}{\Delta \vdash A \vee B}\;\vee_2
\]
If the assumptions in ∆ allow you to prove that some proposition A is true, you can conclude that the classical disjunction of A and B is true. After all, if you know that Ann arrived, then you know that either Ann arrived or Bill arrived. The reason we need a pair of rules is that disjunction is symmetric, i.e., we are free to add the new disjunct either on the left or on the right.

The classical duality of conjunction and disjunction. The following equivalences hold:

(4) a. ¬¬A ≡ A
    b. ¬(A ∧ B) ≡ ¬A ∨ ¬B
    c. ¬(A ∨ B) ≡ ¬A ∧ ¬B
The last two (DeMorgan’s laws) express the logical interrelationship between disjunction and conjunction. These equivalences can be thought of as bidirectional inference rules. In any case, I will freely replace formulas with forms deemed equivalent by (4).

Weakening. Weakening allows assumptions to be discarded.

\[
\frac{\Delta \vdash \Gamma}{\Delta, A \vdash \Gamma}\;\text{Weak}
\]

If Γ follows from ∆, then Γ certainly still follows if A also happens to be true, no matter what A happens to express. The assumption A is gratuitous, but harmless. Weakening allows us to pick and choose among evidence as we focus on different parts of an argument.

Implication as a form of disjunction. Recall that in the definitions of well-formed formulas, we defined classical implication A → B as an abbreviation of ¬A ∨ B. The inference rule that characterizes implication is Modus Ponens, which says that A, A → B ⊢ B is valid. We can prove Modus Ponens as follows. The main aspect of the proof that is relevant for comparison with Linear Logic is the role of Weakening.

\[
\dfrac{\dfrac{\dfrac{\dfrac{A \vdash A}{A, \neg B \vdash A}\;\text{Weak} \qquad \dfrac{\neg B \vdash \neg B}{\neg B, A \vdash \neg B}\;\text{Weak}}{\neg B, A \vdash A \wedge \neg B}\;\wedge}{A, \neg(A \wedge \neg B) \vdash \neg\neg B}\;\neg_1, \neg_2}{A, A \to B \vdash B}\;\equiv
\]
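The classical reading of sequents (whenever everything on the left holds, something on the right holds) can be checked by brute force over truth assignments. The following is a minimal sketch under that assumption: it tests truth-table validity rather than constructing sequent proofs, and the tuple encoding and the function names (`valid`, `holds`, and so on) are introduced here only for illustration.

```python
from itertools import product

# Formulas are nested tuples: ("atom", name), ("not", f), ("and", f, g), ("or", f, g).
def atom(name): return ("atom", name)
def neg(f): return ("not", f)
def conj(f, g): return ("and", f, g)
def disj(f, g): return ("or", f, g)
def implies(f, g): return disj(neg(f), g)   # A → B abbreviates (¬A) ∨ B

def atoms(f):
    """Collect the names of the atomic formulas occurring in f."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(sub) for sub in f[1:]))

def holds(f, v):
    """Evaluate formula f under the truth assignment v (a dict from names to bools)."""
    tag = f[0]
    if tag == "atom":
        return v[f[1]]
    if tag == "not":
        return not holds(f[1], v)
    if tag == "and":
        return holds(f[1], v) and holds(f[2], v)
    return holds(f[1], v) or holds(f[2], v)

def valid(left, right):
    """left ⊢ right: whenever all of left hold, at least one member of right holds."""
    names = sorted(set().union(*(atoms(f) for f in left + right)))
    for values in product([False, True], repeat=len(names)):
        v = dict(zip(names, values))
        if all(holds(f, v) for f in left) and not any(holds(f, v) for f in right):
            return False
    return True

A, B, C = atom("A"), atom("B"), atom("C")
modus_ponens = valid([A, implies(A, B)], [B])        # True
excluded_middle = valid([], [neg(A), A])             # True
weakened = valid([A, implies(A, B), C], [B])         # True: the extra C is harmless
ate_apple = valid([disj(A, B)], [A])                 # False, as with (2a) and (2b)
print(modus_ponens, excluded_middle, weakened, ate_apple)
```

The `weakened` case shows why Weakening is harmless classically: adding an assumption can only shrink the set of assignments that need to be checked.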
Wadler (1993) uses classical modus ponens in the following proof to emphasize the differences between classical logic and Linear Logic:

\[
\dfrac{\dfrac{A \vdash A}{A, A \to B \vdash A}\;\text{Weak} \qquad \dfrac{\text{[see previous proof]}}{A, A \to B \vdash B}}{A, A \to B \vdash A \wedge B}\;\wedge
\]

Weakening allows us to make use of assumption A twice: once to justify the left conjunct of the conclusion, and once to support modus ponens in order to derive the right conjunct of the conclusion. We will see that Linear Logic requires careful accounting: each assumption can be used exactly once, so this proof will not go through.

Finally, completing the ¬, ∧, ∨ fragment of classical logic requires Contraction: from ∆ ⊢ A, A, infer ∆ ⊢ A. In Linear Logic, Contraction is also rejected, but Contraction does not play a role in the exposition here.

2.2 Linear Logic

Formulas. Once again there is a set of atomic formulas a, b, c, . . . , and a set of variables over formulas A, B, C, . . . . However, since none of the Linear Logic connectives mean what their classical counterparts mean, Linear Logic uses a completely distinct set of connective symbols. Assume A and B are formulas. Then the linear negation of A, written A⊥, is a formula; the additive conjunction of A and B, written A & B (pronounced “A with B”) is a formula; the multiplicative conjunction of A and B, written A ⊗ B (pronounced “A times B”) is a formula; the additive disjunction of A and B, written A ⊕ B (pronounced “A plus B”) is a formula; and the multiplicative disjunction of A and B, written A ⅋ B (pronounced “A par B”) is a formula. (Many things in natural language semantics are called ‘additive’. The Linear Logic notions of ‘additive’ and ‘multiplicative’ do not line up with any of them.) In parallel with the definition of classical implication above, linear implication, written A ⊸ B (pronounced “A lollipop B”), is defined as an abbreviation for A⊥ ⅋ B.

Sequents. A sequent ∆ ⊢ Γ says that whenever the multiplicative conjunction of ∆ holds, then the multiplicative disjunction of Γ must hold.

Fragment of Linear Logic for the free choice permission problem. Figure 1 displays the complete set of rules of Linear Logic that we will use in the discussion of the free choice permission problem.
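\[
\frac{}{A \vdash A}\;\text{Axiom}
\qquad
\frac{\Delta, A \vdash \Gamma}{\Delta \vdash A^{\perp}, \Gamma}\;\perp_1
\qquad
\frac{\Delta \vdash A, \Gamma}{\Delta, A^{\perp} \vdash \Gamma}\;\perp_2
\]

\[
A \multimap B \equiv A^{\perp} \parr B
\qquad
A^{\perp\perp} \equiv A
\]

\[
(A \otimes B)^{\perp} \equiv A^{\perp} \parr B^{\perp}
\qquad
(A \parr B)^{\perp} \equiv A^{\perp} \otimes B^{\perp}
\]

\[
(A \mathbin{\&} B)^{\perp} \equiv A^{\perp} \oplus B^{\perp}
\qquad
(A \oplus B)^{\perp} \equiv A^{\perp} \mathbin{\&} B^{\perp}
\]

\[
\frac{\Delta \vdash A \qquad \Gamma \vdash B}{\Delta, \Gamma \vdash A \otimes B}\;\otimes
\qquad
\frac{\Delta \vdash A, B}{\Delta \vdash A \parr B}\;\parr
\]

\[
\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \mathbin{\&} B}\;\&
\qquad
\frac{\Delta \vdash A}{\Delta \vdash A \oplus B}\;\oplus_1
\qquad
\frac{\Delta \vdash B}{\Delta \vdash A \oplus B}\;\oplus_2
\]

Figure 1   Fragment of Linear Logic for FCP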
Linear conjunction and disjunction. The rules for & and ⊕ (the ‘additive’ connectives) look exactly like the classical rules for ∧ and ∨, except for the substitution of & for ∧ and of ⊕ for ∨. However, as a result of how they interact with the rest of the logic, the linear logic additives behave differently from their classical counterparts. For instance, the law of the excluded middle is valid for classical disjunction: ⊢ (¬A) ∨ A. In Linear Logic, the law of excluded middle is not valid for additive disjunction, despite the fact that the inference rule for additive disjunction has the same form as the inference rule for classical disjunction: ⊬ A⊥ ⊕ A. However, the excluded middle is valid for multiplicative disjunction (⊢ A⊥ ⅋ A).

Linear negation. We have direct analogs to the classical rules for pushing a formula across the turnstile, namely, ⊥1 and ⊥2. Since we now have two kinds of conjunctions and two kinds of disjunctions, there are more duality equivalences; however, each conjunction is still dual to a disjunction, and vice-versa.

Linear implication. Once again, we have defined implication in terms of disjunction. Now, interestingly, we can prove the linear version of Modus Ponens without using Weakening (which is a good thing, since Weakening is not allowed in Linear Logic):

\[
\dfrac{\dfrac{\dfrac{A \vdash A \qquad B^{\perp} \vdash B^{\perp}}{A, B^{\perp} \vdash A \otimes B^{\perp}}\;\otimes}{A, (A \otimes B^{\perp})^{\perp} \vdash B^{\perp\perp}}\;\perp_1, \perp_2}{A, A \multimap B \vdash B}\;\equiv
\]

Because the inference rule for ⊗ splits up the resources (that is, the formulas) into those used to prove A and those used to prove B, there is no need to ignore gratuitous assumptions via Weakening. If we try to reproduce Wadler’s classical proof from the previous section, we’re out of luck:

\[
\frac{??\; \vdash A \qquad ??\; \vdash B}{A, A \multimap B \vdash A \otimes B}\;\otimes
\]

We could take some of the resources to the left of the turnstile to prove A, and we could take some (actually, we would need all) of the resources to prove B, but no matter how we divide up the left-hand formulas, we’ll fall short of proving one or the other of the conjuncts. Linear Logic requires strict accounting of assumptions, and we can’t make use of A twice, the way we could in the classical proof.
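The resource splitting performed by ⊗ can be made concrete with a small backward proof search. This is a minimal sketch under a simplifying assumption: it uses a single-conclusion (intuitionistic-style) presentation of the ⊗/⊸ fragment rather than the two-sided sequents with linear negation used in the text. The encoding and the names `provable` and `splits` are introduced here only for illustration.

```python
from itertools import combinations

# A string is an atom; ("tensor", A, B) encodes A ⊗ B; ("lolli", A, B) encodes A ⊸ B.
def tensor(a, b): return ("tensor", a, b)
def lolli(a, b): return ("lolli", a, b)

def splits(ctx):
    """Yield every way of dividing the list ctx into two sub-lists."""
    idx = range(len(ctx))
    for r in range(len(ctx) + 1):
        for chosen in combinations(idx, r):
            left = [ctx[i] for i in chosen]
            right = [ctx[i] for i in idx if i not in chosen]
            yield left, right

def provable(ctx, goal):
    """Search for a proof of ctx ⊢ goal in which every assumption is used exactly once."""
    # Axiom: A ⊢ A, with nothing left over (there is no Weakening).
    if len(ctx) == 1 and ctx[0] == goal:
        return True
    # ⊸ on the right: assume the antecedent and prove the consequent.
    if isinstance(goal, tuple) and goal[0] == "lolli":
        if provable(ctx + [goal[1]], goal[2]):
            return True
    # ⊗ on the right: split the resources between the two conjuncts.
    if isinstance(goal, tuple) and goal[0] == "tensor":
        for left, right in splits(ctx):
            if provable(left, goal[1]) and provable(right, goal[2]):
                return True
    for i, f in enumerate(ctx):
        rest = ctx[:i] + ctx[i + 1:]
        # ⊗ on the left: A ⊗ B supplies both A and B.
        if isinstance(f, tuple) and f[0] == "tensor":
            if provable(rest + [f[1], f[2]], goal):
                return True
        # ⊸ on the left: spend some resources on the antecedent, then use the consequent.
        if isinstance(f, tuple) and f[0] == "lolli":
            for left, right in splits(rest):
                if provable(left, f[1]) and provable(right + [f[2]], goal):
                    return True
    return False

linear_modus_ponens = provable(["A", lolli("A", "B")], "B")               # True
wadler_attempt = provable(["A", lolli("A", "B")], tensor("A", "B"))       # False
with_second_copy = provable(["A", "A", lolli("A", "B")], tensor("A", "B"))  # True
print(linear_modus_ponens, wadler_attempt, with_second_copy)
```

The third call succeeds only because a second copy of A is supplied explicitly, which is the strict accounting the text describes: one copy is spent on the implication and the other on the left conjunct.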
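2.3 Choice

Since free choice permission is about making choices, what does Linear Logic have to say about choice? The critical connectives will be the additive conjunction ‘&’ and its (also additive) disjunctive dual, ‘⊕’. The relevant inference rules are repeated here:

\[
\frac{\Delta \vdash A \qquad \Delta \vdash B}{\Delta \vdash A \mathbin{\&} B}\;\&
\qquad
\frac{\Delta \vdash A}{\Delta \vdash A \oplus B}\;\oplus_1
\qquad
\frac{\Delta \vdash B}{\Delta \vdash A \oplus B}\;\oplus_2
\]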
Imagine yourself in the role of the prover. Then the assumptions on the left of the turnstile are what your environment gives you to work with, and the conclusion on the right of the turnstile is what you return as the result of your labors (perhaps to be used as an assumption in a larger proof). So here is what the & inference says: if the resources in ∆ allow you to provide A, and if the same resources allow you to provide B, then you can certainly offer to provide either A or B. Furthermore, since you are prepared to provide either alternative, you can leave the choice up to whoever might be interested in making use of the conclusion. Thus & conjoins two equally viable alternatives. Though both alternatives are equally viable, the consumer is forced to choose between them.

For instance, imagine that ∆ contains a certain amount of sugar and a certain number of eggs. Using the resources provided, you can construct either a meringue or else an angel food cake, but you don’t have enough ingredients to cook both. Being as flexible and gracious as possible, you offer “meringue & cake” for dessert, and you let your guest choose. Tellingly, “meringue & cake” is pronounced “meringue or cake” in idiomatic English (this is a point that we will return to in section 7.3). In the context of granting permission, the consumer is the entity to which permission has been granted: we shall see that (unembedded) & corresponds to free choice on the part of the entity given permission.

Continuing with our investigation of choice in Linear Logic, turning to the ⊕1 inference rule, if the resources in ∆ allow you to provide A, then you can certainly offer to provide A ⊕ B, as long as you remain in control of which of the alternatives is chosen. You may only know how to make one dessert, perhaps. You can truthfully promise that dessert will either be meringue or else Baked Alaska, although you know in advance that it will have to be meringue. (Analogously with the roles reversed for ⊕2.) In the context of granting permission, offering A ⊕ B does not give the grantee free choice.
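The dessert scenario can be turned into a small resource-counting sketch. Everything specific in it is an assumption made for illustration: the recipes, the quantities, and the function names are hypothetical, and the check only asks whether the pantry covers a recipe, ignoring the stricter linear requirement that nothing be left over.

```python
from collections import Counter

# Hypothetical recipes and quantities, chosen only for illustration.
RECIPES = {
    "meringue": Counter(sugar=1, eggs=3),
    "cake": Counter(sugar=1, eggs=3),
    "baked_alaska": Counter(sugar=2, eggs=4, ice_cream=1),
}

def can_make(pantry, dishes):
    """True if the pantry covers the combined ingredients of all the dishes."""
    needed = Counter()
    for dish in dishes:
        needed += RECIPES[dish]
    return all(pantry[item] >= amount for item, amount in needed.items())

def can_offer_with(pantry, a, b):
    """A & B: the same pantry suffices for either dish, so the guest may choose."""
    return can_make(pantry, [a]) and can_make(pantry, [b])

def can_offer_plus(pantry, a, b):
    """A ⊕ B: at least one dish is makeable, and the cook decides which."""
    return can_make(pantry, [a]) or can_make(pantry, [b])

def can_offer_times(pantry, a, b):
    """A ⊗ B: both dishes at once, so the pantry must be divided between them."""
    return can_make(pantry, [a, b])

pantry = Counter(sugar=1, eggs=3)
guest_chooses = can_offer_with(pantry, "meringue", "cake")          # True
both_at_once = can_offer_times(pantry, "meringue", "cake")          # False
cook_chooses = can_offer_plus(pantry, "meringue", "baked_alaska")   # True
guest_cannot = can_offer_with(pantry, "meringue", "baked_alaska")   # False
print(guest_chooses, both_at_once, cook_chooses, guest_cannot)
```

The last two lines mirror the contrast in the text: the cook can truthfully promise meringue ⊕ Baked Alaska while knowing it will be meringue, but cannot leave that choice to the guest.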
In order to complete the picture of the dualities of & and ⊕, we must consider what happens on the other side of the turnstile. Hopping across the turnstile involves negation, which exchanges & for ⊕ (and vice versa).

\[
\frac{A \vdash \Delta}{A \mathbin{\&} B \vdash \Delta}
\qquad
\frac{B \vdash \Delta}{A \mathbin{\&} B \vdash \Delta}
\qquad
\frac{A \vdash \Delta \qquad B \vdash \Delta}{A \oplus B \vdash \Delta}
\]

These rules follow from the official inference rules by applications of ⊥1 and ⊥2. If A alone is enough to enable you to provide ∆, then if someone promises you A & B, you can certainly commit to providing ∆: just select A when they give you your choice. (Similarly for the other rule introducing & on the left of the turnstile.) Finally, if having A is enough for you to be able to offer ∆, and if having B is likewise enough for you to be able to offer ∆, then you’re in a position to promise ∆ even if all you can count on is A ⊕ B. All you know is that you’ll get either an A or a B, and that which one you get will be someone else’s choice. However, since you are prepared to cope with either possibility, you can commit to providing ∆.

The bottom line is that & and ⊕ are two perspectives on a single choice, differing only in who has the power to make the selection: & provides two equally legitimate alternatives, but forces an unconstrained (free) choice between them; ⊕ also provides two alternatives, but reserves the choice for whoever is providing the resource.

3 Strong permission versus weak permission
Standard deontic logics introduce unary modalities representing obligation (□) and permission (♦), and add axioms that characterize an appropriate set of entailments, usually including at least K and D, though there is considerable variation; see McNamara 2006 or Portner 2009a for an introduction to deontic logic. Lokhorst (1997) chooses instead a strategy attributed independently to Anderson and to Kanger called deontic reduction. Deontic reduction depends on a special proposition δ (pronounced “yay”), glossed as ‘the good thing’, or ‘all things are as required’. Thus δ is roughly analogous to Kratzer’s (e.g., 1991) notion of an ordering source, that is, the set of propositions that characterize how things ought to be. Then A is obligatory iff δ ⊸ A: if A follows from the state where all things are as required, then A is required. Dually, a weak version of permission
is often defined as (δ ( A⊥ )⊥ : if the negation of A is not obligatory, then A is at least not forbidden. However, there is a difference between weak permission, which is the absence of prohibition, and strong permission, i.e., a permissive norm (as discussed in, e.g., Hansen et al. 2007), which is the assertion that some action is explicitly ok. Lokhorst (1997) renders strong permission as A ( δ. Viewed from the linguistics tradition, it is not so easy to make sense out of this as a statement of permission (as discussed in Portner 2009a:60). It is important to bear in mind that the ‘strong’ part of ‘strong permission’ does not mean that merely eating an apple will guarantee that everything is ok, no matter what else happens. If only permission could be that strong! Rather, the difference between ‘weak’ and ‘strong’ here is the difference between a system in which we have only obligation and its negation (in which everything that is not forbidden is permitted), and a more articulated system in which some things are permitted (A ( δ), some things are forbidden ((A ( δ)⊥ ), and some things are neither permitted nor forbidden. If I explicitly give you permission to eat an apple, and I explicitly forbid you to eat a pear, what about eating a banana? Is it permitted or forbidden? Maybe yes, maybe no. There is not much discussion of weak permission versus strong permission in the linguistics literature, but at least Asher and Bonevac (2005) conclude that free choice permission involves strong permission. Certainly if we want to distinguish between explicit permission and the absence of prohibition, then we need a logic that can express strong permission. Since I have claimed that You may eat an apple or a pear crucially neither permits nor forbids eating both an apple and a pear, we must use strong permission here. But what exactly does A ( δ assert, if not that eating an apple will guarantee the good thing? The key is to consider when A ( δ will be true. 
We will be in a situation in which A ⊸ δ just in case eating an apple in that situation is compatible (‘cotenable’ in the terminology of Relevant Logic) with all obligations being fulfilled. There are two kinds of such situations: situations in which eating an apple happens to be obligatory, in which case we can only conform to obligations by eating the apple (after all, everything that is obligatory is at least permitted); and situations in which we’re already in compliance, but eating an apple is optional and does not disturb our happy state. But if we are otherwise in compliance, and we decide to eat an apple (A), and we decide to simultaneously kill the postman (K), the fact that apple eating is permitted will not save us: because of the resource-sensitivity of
Free choice permission as resource-sensitive reasoning
linear logic, in particular, the absence of Weakening, we can’t ignore the dead postman. As a result, the combination of eating an apple and killing the postman will land us in a situation that is far from ok: A, K, A ⊸ δ ⊬ δ. A fuller understanding of linear implication, and therefore of strong permission, will emerge from the model theory developed in section 8.
One major expository advantage of the reduction strategy is that it enables us to talk about permission without complicating the logic with inference rules for □ and ♦. Note that we do not necessarily give up anything by omitting the unary connectives: McNamara (2006) and Lokhorst (2006) show that under appropriate additional assumptions, deontic reduction characterizes all the theorems of standard deontic modal logics. Not that replicating standard deontic logic should be our goal; after all, standard deontic logic has □A → ¬□¬A as a tautology, which imposes a kind of consistency on the set of deontic obligations. In the linguistics tradition, a number of people (notably Kratzer (1991)) have argued that this is not appropriate for describing natural language modality, and that we should instead allow for inconsistent laws. However, I’m not aware of any reason why deontic reduction is incompatible with Kratzer’s characterization of deontic modality.
I should note that deontic reduction is not an innocent choice for the empirical phenomena under consideration here. As I will explain shortly, because linear implication is defined as A ⊸ B ≡ A⊥ ⅋ B, the formula for which permission is granted (i.e., A) occurs in a downward-entailing position. This will be crucial in deriving the desired entailments. For all I know, however, it is possible that if a suitable notion of strong permission were defined in a standard deontic framework (i.e., one based on unary operators like □), similar entailments would go through.
I intend for deontic reduction to be a convenient expository choice, and not an essential feature of a resource-sensitive approach to free choice permission. Nevertheless, there may be some empirical support for the naturalness of deontic reduction. After all, in addition to being able to use a modal verb to express permission and obligation, English can also deploy a conditional: It’s ok if you eat ‘You may eat’. In fact, in Japanese there is no modal verb that expresses permission, and permission normally can only be conveyed by means of a conditional construction (Clancy 1985, Akatsuka 1992): tabe-temo ii ‘eat-even.if good’, ‘It’s ok if you eat’.
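The postman example from this section (A, K, A ⊸ δ ⊬ δ) can be mimicked with a toy multiset of resources. The sketch below is only an illustration under simplifying assumptions: resources are atomic labels, the single implication A ⊸ δ is applied by hand, and the helper names `apply_lollipop` and `all_ok` are hypothetical, not taken from the paper or from any library. It is not a proof search.

```python
from collections import Counter

def apply_lollipop(resources, antecedent, consequent):
    """Use one copy of `antecedent -o consequent`:
    consume one antecedent and produce one consequent."""
    if resources[antecedent] < 1:
        raise ValueError(f"no {antecedent!r} available to consume")
    out = resources.copy()
    out[antecedent] -= 1
    out[consequent] += 1
    return +out  # unary plus drops entries whose count is zero

def all_ok(resources):
    """Without Weakening, delta is concluded only if delta is all that is left."""
    return resources == Counter({"delta": 1})

# Eating an apple when apple eating is permitted: A, A -o delta
apple_only = apply_lollipop(Counter({"A": 1}), "A", "delta")

# Eating an apple and killing the postman: A, K, A -o delta
apple_and_postman = apply_lollipop(Counter({"A": 1, "K": 1}), "A", "delta")

print(all_ok(apple_only))         # True
print(all_ok(apple_and_postman))  # False: K cannot be discarded
```

In the second case the multiset still contains K after the permission has been used, which is the toy counterpart of the observation that the dead postman cannot be ignored.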
4 Free choice permission
We can now suppose that or has among its meanings ⊕, so that You may eat an apple or⊕ a pear translates as (a ⊕ p) ⊸ δ: the additive disjunction of a and p is explicitly permitted. Then the desired free-choice implication follows directly from simple linear reasoning. Generalizing slightly by using variables over formulas (A, B) instead of atomic formulas (a, p), we have:
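Left branch (A):
  1. ⊢ A, A⊥                                      axiom
  2. ⊢ A ⊕ B, A⊥                                  ⊕1, from 1
  3. ⊢ δ⊥, δ                                      axiom
  4. ⊢ (A ⊕ B) ⊗ δ⊥, A⊥, δ                        ⊗, from 2 and 3
  5. ⊢ (A ⊕ B) ⊗ δ⊥, A⊥ ⅋ δ                       ⅋, from 4
Right branch (B):
  6. ⊢ B, B⊥                                      axiom
  7. ⊢ A ⊕ B, B⊥                                  ⊕2, from 6
  8. ⊢ δ⊥, δ                                      axiom
  9. ⊢ (A ⊕ B) ⊗ δ⊥, B⊥, δ                        ⊗, from 7 and 8
 10. ⊢ (A ⊕ B) ⊗ δ⊥, B⊥ ⅋ δ                       ⅋, from 9
Conclusion:
 11. ⊢ (A ⊕ B) ⊗ δ⊥, (A⊥ ⅋ δ) & (B⊥ ⅋ δ)          &, from 5 and 10
 12. (A ⊕ B) ⊸ δ ⊢ (A ⊸ δ) & (B ⊸ δ)              ⊥2, ≡, from 11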
This theorem is noted in Lokhorst 1997:6.¹ What the speaker provides when she utters You may eat an apple or⊕ a pear is justification for assuming either that eating an apple is permitted, or that eating a pear is permitted. She is not providing enough resources to prove both, so if her utterance is to provide the justification for action, a choice must be made. However, since the resources allow proof of either alternative, the consumer is free to choose whichever of the alternatives he prefers. That is how the addressee has permission to eat an apple, or else permission to eat a pear, but normally (and certainly not by virtue of the utterance of (1a)) does not have permission to eat two pieces of fruit.
This result depends on only two assumptions: that or can express additive disjunction, and that it is reasonable to represent strong permission using the deontic reduction strategy. The assumption that or can express additive disjunction is essential, and is the heart of the explanation offered here. Deontic reduction is a well-established approach to deontic logic motivated entirely independently of any concern with the free choice permission problem. Whether it can be replaced with a modal system more familiar to linguists (if desired) remains for future work.
¹ Strictly speaking, since the inference rules given above in section 2.2 are written with a single formula on the right-hand side, many of the steps given in this proof (for example, the ⊕1 inference) require shuffling extra formulas across the turnstile, applying the inference rule of interest, then shuffling them all back.
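The free choice theorem, and the fact that permission for both fruits does not follow, can be checked mechanically. The following is a minimal sketch of a brute-force proof search, under loudly stated assumptions: it covers only a two-sided, single-conclusion fragment with ⊸, ⊕, & and ⊗ (no negation, ⅋, or exponentials), whereas the paper works in classical Linear Logic; formulas are encoded as nested tuples; and all function names (`lolli`, `plus`, `with_`, `tensor`, `splits`, `prove`) are hypothetical. For the sequents tested here, the verdicts agree with those stated in the text.

```python
from itertools import product

# Formulas: atoms are strings; compound formulas are tagged triples.
def lolli(a, b):  return ('-o', a, b)   # linear implication
def plus(a, b):   return ('+', a, b)    # additive disjunction
def with_(a, b):  return ('&', a, b)    # additive conjunction
def tensor(a, b): return ('*', a, b)    # multiplicative conjunction

def splits(ctx):
    """Yield every way of dividing the multiset ctx between two premises."""
    for mask in product((False, True), repeat=len(ctx)):
        left = tuple(f for f, m in zip(ctx, mask) if m)
        right = tuple(f for f, m in zip(ctx, mask) if not m)
        yield left, right

def prove(ctx, goal):
    """Cut-free backward search for ctx |- goal. There is no Weakening and
    no Contraction: every hypothesis must be used exactly once. Each rule
    removes one connective, so the search terminates."""
    ctx = tuple(ctx)
    if ctx == (goal,):                       # identity
        return True
    if isinstance(goal, tuple):              # right rules
        op, a, b = goal
        if op == '-o' and prove(ctx + (a,), b):
            return True
        if op == '&' and prove(ctx, a) and prove(ctx, b):
            return True
        if op == '+' and (prove(ctx, a) or prove(ctx, b)):
            return True
        if op == '*' and any(prove(l, a) and prove(r, b)
                             for l, r in splits(ctx)):
            return True
    for i, f in enumerate(ctx):              # left rules
        if not isinstance(f, tuple):
            continue
        rest = ctx[:i] + ctx[i + 1:]
        op, a, b = f
        if op == '*' and prove(rest + (a, b), goal):
            return True
        if op == '+' and prove(rest + (a,), goal) and prove(rest + (b,), goal):
            return True
        if op == '&' and (prove(rest + (a,), goal) or prove(rest + (b,), goal)):
            return True
        if op == '-o' and any(prove(l, a) and prove(r + (b,), goal)
                              for l, r in splits(rest)):
            return True
    return False

permission = lolli(plus('A', 'B'), 'delta')          # (A + B) -o delta
free_choice = prove([permission],
                    with_(lolli('A', 'delta'), lolli('B', 'delta')))
both_fruits = prove([permission], lolli(tensor('A', 'B'), 'delta'))
print(free_choice)   # True
print(both_fruits)   # False
```

The search is exponential in the size of the context, which is harmless for sequents this small but would not scale.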
It is worth emphasizing that the basic free choice meaning is purely semantic, without requiring any silent pragmatically-triggered type shifting operators (as in, e.g., Fox 2007), or other pragmatic enrichment.
5 Prohibition
The behavior of permission under negation plays an important role in recent discussions. As mentioned above, Alonso-Ovalle (2006) and Fox (2007) argue that the fact that free-choice implications seem to disappear under negation shows that free choice implications are likely to be implicatures. Since I am claiming that the relevant free choice implications are entailments, it is important to carefully examine negated cases. Whatever is not permitted is forbidden: just as in English, Lokhorst renders (strong) prohibition as negated (strong) permission. Thus if (A ⊸ δ)⊥, then A is prohibited. (It is a well-known property of English that may not is always construed with negation taking scope over may.)
(5) a. You may not eat this apple or this pear.
    b. You may not eat this apple.
    c. You may not eat this pear.
The main fact to be explained is that (5a) implies (perhaps entails) (5b) and (5c). Unlike positive free choice implications, we can usually infer that (5b) and (5c) hold simultaneously. That is, you cannot comply with (5a) by merely refraining from eating apples. Apparently, permission is a scarce resource, but prohibition is all too abundant. I will call this construal of (5a) the double-prohibition reading, and I will suggest that it arises as a standard Gricean implicature. As with most stories about scalar implicatures, we will be concerned with the epistemic state of the discourse participants.
(6) a. You may not eat this apple or this pear.
    b. You may not eat this apple or you may not eat this pear.
    c. ((A ⊕ B) ⊸ δ)⊥ ⊢ (A ⊸ δ)⊥ ⊕ (B ⊸ δ)⊥
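One way to see why the sequent in (6c) holds is to prove the converse of the free choice theorem and then contrapose. The derivation below is a sketch: the two-sided rule names (⊸L, ⊸R, &L, ⊕L) are the usual textbook labels and are an assumption here, since they may not match the labels used for the paper's own rules; the block assumes the `amsmath` and `amssymb` packages.

```latex
\begin{align*}
&A \vdash A \qquad \delta \vdash \delta
  && \text{axioms}\\
&A,\; A \multimap \delta \vdash \delta
  && \multimap\text{L}\\
&A,\; (A \multimap \delta) \mathbin{\&} (B \multimap \delta) \vdash \delta
  && \&\text{L}_1\\
&B,\; (A \multimap \delta) \mathbin{\&} (B \multimap \delta) \vdash \delta
  && \text{likewise, with } \&\text{L}_2\\
&A \oplus B,\; (A \multimap \delta) \mathbin{\&} (B \multimap \delta) \vdash \delta
  && \oplus\text{L}\\
&(A \multimap \delta) \mathbin{\&} (B \multimap \delta) \vdash (A \oplus B) \multimap \delta
  && \multimap\text{R}\\
&((A \oplus B) \multimap \delta)^{\perp} \vdash
  ((A \multimap \delta) \mathbin{\&} (B \multimap \delta))^{\perp}
  && \text{contraposition}\\
&((A \oplus B) \multimap \delta)^{\perp} \vdash
  (A \multimap \delta)^{\perp} \oplus (B \multimap \delta)^{\perp}
  && \text{De Morgan: } (X \mathbin{\&} Y)^{\perp} \equiv X^{\perp} \oplus Y^{\perp}
\end{align*}
```

The last line guarantees only that at least one of the two prohibitions holds, which is why the double-prohibition reading has to come from somewhere other than the logic.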
The translation of (6a) entails the translation of (6b) (that is, (6c) is a theorem), so we predict that (6a) ought to have an interpretation on which it guarantees that (6b) is true. Such an interpretation is widely attested in the literature, and usually is described as favoring the continuation . . . but I don’t know
which. I’ll call this the ignorance reading. Note, by the way, if a forgetful babysitter utters (6) to the child she is babysitting, if the child behaves rationally, he will not eat either piece of fruit, since he can’t be sure which action is safe — exactly the same behavior as if both actions had been explicitly forbidden. So far, so good. Next, consider a situation in which the speaker is not ignorant. Exactly one of the alternatives is prohibited, and this time the speaker knows which one it is. Let’s say that apple-eating is forbidden, but pear eating is fine. If the speaker were being fully cooperative, then she would normally choose to simply say (5b), and certainly would not choose to say (5a). In Gricean terms, adding a superfluous disjunct would violate either the maxim of Quantity, or the maxim of Manner, or both. There are nevertheless situations in which this kind of uncooperative statement might be used. For instance, if a father tells an older sister the rules (“apples forbidden, pears ok”), she might later uncooperatively tell her younger brother (7)
You may not eat this apple or this pear . . . but I won’t tell you which.
Once again, the rational course of action on the part of the younger sibling will be to refrain from eating either piece of fruit. Presumably this is exactly the outcome the unkind sister is aiming for. (I’m indebted to Sven Lauer for this scenario; see also Simons 2005:273n.4.) In both the ignorance scenario and the uncooperative scenario, at least one of the disjuncts holds, but the choice of which fruit is prohibited belongs to the master, not the slave. The subject of the prohibition must plan for the worst, and therefore can’t safely commit to either alternative. Finally, imagine that the speaker is neither ignorant nor uncooperative. She may be an expert (perhaps she just received full instructions from the parents) or she may be herself the source from which permission flows; in any case, she is fully opinionated about what is forbidden. Crucially, although (6) guarantees only one disjunct, it is consistent with situations in which both disjuncts hold. As just argued, if exactly one disjunct held, the speaker would simply have said so. We can deduce, therefore, that both disjuncts must hold. There is one more step to complete the Gricean explanation. If the speaker intends to convey double prohibition, why not use and? (8)
You may not eat an apple and a pear.
Although this sentence may have the desired double-prohibition reading, it certainly also has a reading on which it prohibits (only) complex events that involve eating both an apple and a pear. Uttering (8), then, leaves in play the possibility that eating a single piece of fruit may be permitted. The speaker uses a weak form in (6) to express a stronger meaning in order to avoid misinterpretation. Thus the assumption that the speaker is opinionated and cooperative derives the implicature that both disjuncts are prohibited via ordinary Gricean reasoning, without the need to stipulate any special uniformity or distributivity axioms (as in Alonso-Ovalle 2006) or Zimmermann’s (2000:286) Authority Principle.
6 Comparisons with other accounts
6.1 Implicature accounts
A number of authors, including Schulz (2005) and Fox (2007), suggest that free choice implications are implicatures that arise in contexts in which the speaker is opinionated about which options are permitted and which are not. Fox (2007) reasons as follows: if a speaker utters a disjunction when she could have made a stronger statement, this could naturally lead to a Quantity implicature that she did not have sufficient evidence to assert the stronger statement. If those ignorance implicatures are implausible, as when the speaker is describing permissions in a situation in which her judgment is authoritative, the implausibility can trigger a repair strategy under which the disjunction is pragmatically enriched by the application of a predicate exh (for “exhaustive”). For instance, if an authoritative speaker says You may eat an apple or a pear, it may be implausible that she doesn’t know whether you may eat an apple, or whether you may eat a pear. Therefore the statement ♦(A ∨ P) can be strengthened (given a number of additional assumptions) to an exhaustive meaning equivalent to the proposition ♦A ∧ ♦P ∧ ¬(♦(A ∧ P)). This asserts that you may have an apple, and you may have a pear, but you may not both have an apple and a pear.
I will discuss three potential problems with these accounts. The first problem is that the free-choice reading can survive even in the presence of manifest ignorance on the part of the speaker:
(9)
I don’t know whether you may have an apple or a pear.
Since exhaustivity is supposed to be triggered by contexts that are incompatible with ignorance, (9) should only have a reading on which it means ‘I don’t know whether you may have an apple or whether you may have a pear’. But (9) robustly also has a free-choice reading on which it means ‘I don’t know whether you may eat a piece of fruit, where the fruit is your choice between an apple or a pear’. (10)
If it turns out that John may have an apple or a pear, he’ll choose the pear.
Likewise, as Kamp (1978:279) notes, free choice interpretations remain available for the antecedent of a conditional, where it is far from clear how assumptions about complete knowledge of the alternatives could enter in. The second problem is that if free choice implications were implicatures, we should expect them to be generally cancelable: (11)
You may eat an apple or a pear, although in fact you may not eat an apple.
Probably (11) has a non-free choice reading on which it is at least logically consistent. If this were the basic semantic meaning of (11), then we would expect it to emerge whenever the free-choice implication is cancelled. The puzzling thing is that if we assume the speaker is opinionated, (11) gives a strong impression of contradiction rather than of a cancelled implicature. Chemla (2009a, 2009b) proposes a pragmatic principle that he calls symmetry, which says that the epistemic attitude of the speaker must be uniform across disjuncts. Symmetry correctly predicts that (11) should be infelicitous, since it implies that the speaker holds a different attitude towards one disjunct than towards the other. However, symmetry alone cannot explain why (11) sounds contradictory. One possibility is that performativity is interfering. Portner (2009b) suggests that performative uses (see section 7.2 below) force, or at least strongly promote, a free choice interpretation. If so, then what (11) shows is that at least when an utterance is performative, free choice implications cannot be cancelled. The third problem applies to Fox’s account, though not to Schulz’s: as Fox himself notes, the proposed implicatures for the free-choice reading do not match intuitions about the meanings of the sentences in question. Fox’s exh-enhanced truth conditions assert that eating an apple is permitted, and
eating a pear is permitted, but eating an apple and a pear is forbidden. But as Simons (2005) and others observe, free choice is compatible with joint permission. For instance, (12)
[You may eat as much fruit as you want, so] You may (certainly) eat an apple or a pear.
On Fox’s account, (12) should be contradictory on a free-choice reading of the final clause. However, although (12) may be mildly redundant, there is no hint of contradiction. Franke (2009:8) and van Rooij (2010:18) derive results similar to Fox’s by using a particular game-theoretic technique (“Iterated Best Response”) to compute implicatures. One advantage of their approach is that the proposition that eating both an apple and a pear is forbidden arises as an implicature only when certain alternatives are salient, correctly predicting that (12) need not be a contradiction.
On the account here, of course, the explanation for the fact that (12) is not a contradiction is particularly simple and direct: You may eat an apple or a pear entails that you may eat an apple, and that you may eat a pear, but refrains from saying anything about whether it’s ok to eat both an apple and a pear. It neither grants permission to eat two pieces of fruit, nor forbids it.
Van Rooij frames the comparison between exhaustivity and game theory as part of the debate about embedded implicatures: if free choice implications can be handled using iterated best response, then free choice no longer provides an argument that implicatures must be calculated locally (i.e., in embedded contexts). The resource-sensitive approach here weakens the argument that free choice motivates embedded implicatures even further, by calling into question whether free choice implications are implicatures in the first place.
6.2 Alternative set semantics
Zimmermann (2000) proposes that disjunction contributes a set of exhaustive epistemic alternatives, so that You may eat an apple or you may eat a pear expresses the claim that it is possible that you may eat an apple and it is possible that you may eat a pear. Novel pragmatic principles (notably his Authority Principle) strengthen this conjunction into an assertion that you may eat an apple and you may eat a pear.
Geurts (2005) elaborates on Zimmermann’s analysis, arguing that disjunctive alternatives should not always be epistemic. Rather, disjunction “fuses” with nearby modal operators, so that You may eat an apple or a pear means that you may eat an apple and you may eat a pear without needing to invoke any special pragmatic principle. Neither Zimmermann’s nor Geurts’ analysis explains why the free-choice or differs from an overt and (i.e., You may eat an apple and you may eat a pear) in failing to guarantee that two pieces of fruit may be eaten. In addition, as Geurts (2005:406) briefly discusses, it is not clear how either analysis accounts for negated free choice (discussed above in section 5).
Zimmermann’s idea that disjunction introduces a set of alternatives has been implemented in a variety of ways. I will mention three here. Kratzer and Shimoyama (2002) propose that indefinites contribute a set of alternatives, one for each way of resolving the indefinite. This requires in turn a modification of the basic compositional semantics, since it is necessary to allow for composition with sets of meanings instead of single meanings. This is done pointwise using “Hamblin semantics”, so that an embedded indefinite can give rise to a set of alternatives at higher compositional levels (see Shan 2004 for discussion of the complexities of pointwise composition). Alonso-Ovalle (2006) extends this strategy from indefinites to disjunction, explicitly addressing the free choice problem. Aloni’s (2007) approach manages disjunction-alternatives within a dynamic semantics based on Dekker 2002, supplemented with structured propositions. Van Rooij (2008:309) sketches yet a third implementation, on which alternatives are built into the definition of a minimal extension of a world. Then a world in which you eat only an apple might qualify as a minimal extension of the world we are in, but not a world in which you eat both an apple and a pear.
In order to deliver free choice implications, it is necessary for the propositions expressed by a disjunction to always be among those used for articulating minimal extensions, though this requirement is not guaranteed by the formal analysis. In these approaches, free choice effects arise when certain operators explicitly manipulate alternative sets. For instance, Aloni stipulates that may(Φ) is true (where Φ is a set of alternatives) just in case the ordinary meaning of may is true of each alternative. Thus You may eat an apple or a pear involves applying may to the set of alternatives corresponding to the addressee eating an apple and the addressee eating a pear. The sentence will
be true, then, just in case You may eat an apple is true and You may eat a pear is true. The account here resembles Aloni’s alternatives account in two important respects. First, free choice implications are entailments rather than implicatures. As we saw in section 6.1, the fact that free choice implications do not always seem to be cancelable argues in favor of theories on which they are treated as entailments. Second, because alternative-taking may requires that ordinary may must be true of every alternative, it is a downward-entailing operator with respect to the disjunction that gives rise to the alternatives. Aloni points out that this explains why (so-called free choice) any is licensed (e.g., You may eat anything), and since the antecedent of linear implication is likewise a downward-entailing position (as noted above), the same explanation carries over here. (Of course, there is more to free choice than placing an indefinite in a downward entailing context. For instance, a referee observes that in some Romance languages, some free-choice indefinites are licensed under permission, but not in the antecedent of conditionals or in other downward entailing contexts.)
One important difference between the approach here and alternative-based analyses, including Aloni’s, is the integration with the larger compositional system. The alternative-set approach in effect creates unbounded dependencies in the semantics: or introduces alternatives which the compositional system must track until an alternative-aware operator collapses the alternatives back into a single proposition. The account here adjusts only the denotations of the logical connectives, leaving the compositional system entirely undisturbed. (Not that I have provided a compositional analysis, though I trust that appropriate details can easily be supplied.)
7 Issues
7.1 Free choice effects apart from permission
It is widely assumed that whatever explains free choice implications for deontic modals should be the same thing that explains the similar behavior of epistemic modals:
(13) a. John might be in Aarhaus or in Boston.
     b. John might be in Aarhaus.
     c. John might be in Boston.
In parallel with the permission cases, the disjunction in (13a) entails (13b) and (13c). The simplest way to extend the account here to epistemic cases would be to add to our logic a new atomic formula ε, which is true just in case everything that is epistemically known holds. Then John might be in Aarhaus would translate as A ⊸ ε, and the desired entailments follow as a matter of logic.
Adding an epsilon to the logic is more than a superficial change. It is important to keep track of what the logic claims to be modeling. Classical logic promises to preserve truth: if the assumptions are true, the conclusion will be true. Since truth is not resource sensitive (if something is true once, it is true again and again), it is legitimate to duplicate and discard assumptions. Linear Logic promises to preserve resources: whatever resources the assumptions provide, that is exactly what resources will appear in the conclusion. In our deontic application, the critical resource is permission: if the assumptions provide enough permission to eat exactly one piece of fruit, then the conclusion will provide the same amount of permission. In the epistemic case, the critical resource is epistemic commitment: whatever commitments are made by the assumptions, the conclusion will make exactly the same commitments.
There are other important differences between deontic logic and epistemic logic. For instance, it is generally considered desirable for an epistemic logic to guarantee that if you know that A is true, then A is true (□A ⊢ A). But deontically, you would not want to conclude from the fact that A is obligatory that A must hold, since obligations are all too often not fulfilled. More relevantly, there are empirical dis-analogies between the free choice behavior of deontic uses of modals versus epistemic modals. For instance, Kamp (1978), Zimmermann (2000), and Aloni (2007) note that it is significantly more difficult to construe epistemic modals as having a . . .
but I don’t know which interpretation (though it is still possible — see especially Simons 2005:274). I’m not aware of any reason why a reduction strategy could not be part of a more complete analysis of epistemic modality; nevertheless, it would be prudent to be cautious about assuming that any deontic analysis should automatically extend to epistemic cases. In addition to the possibility that free choice effects may occur in other modalities, Fox (2007) argues that free choice effects can be discerned in non-modal contexts that involve existential quantifiers.
(14)
There’s beer in the fridge or in the cooler out back.
Especially when (14) is heard as an implicit permissive, (14) entails both that there is beer in the fridge and that there is beer in the cooler out back. Both alternatives are guaranteed to be true, and the consumer of the information has free choice of which one is relevant for forming a plan of action. Klinedinst (2007) suggests that free choice effects are present with some existential quantifiers, but only when the quantificational DP is plural: (15)
a. Some passengers got sick or had difficulty breathing.
b. A passenger got sick or had difficulty breathing.
In (15a), there is a reading on which some passengers got sick, and some had difficulty breathing. On such a reading, at least some of the passengers must have gotten sick, and at least some of the passengers must have had difficulty breathing. But in (15b), there is no guarantee that both of the properties must be instantiated. Having mentioned these facts, I will not attempt a discussion here of the interaction of free choice with quantifiers or with plurals. See Chemla 2009a for experimental evidence and relevant discussion.
7.2 Performativity
Kamp (1978) draws a distinction between granting permission and describing permission, where granting permission is a performative action. When a parent says You may eat an apple or a pear in the right circumstances, fruit-eating options may come into being that were not present before the utterance. But when a sibling comments later Apparently, you may eat an apple or a pear, they are merely describing the current situation, and no new options come into being. Van Rooij (2008) and Portner (2009b) develop a dynamic semantics for permission on which a permission sentence performatively changes the set of what is allowed. One of the main arguments that performativity is important relies on correlations between performative uses and the availability of free choice interpretations. Certainly descriptive uses (such as the sibling’s comment) can have a free choice interpretation or not. Performatives, however, strongly prefer a free choice interpretation. Yet it may still be possible for a performative to have a non-free choice interpretation:
(16)
You may pillage city X or city Y. But first take counsel with my secretary.
Kamp (1973:67; see also Kamp 1978:279) says of this example that “[t]he second part of this statement makes it clear that the vassal should not infer from the first part that he may make his own choice of city. Which one he may loot ultimately depends on the secretary’s advice, the tenor of which — we may assume — is at this point unknown to king and vassal alike.” To be sure, nothing specific has been permitted, and the vassal cannot form a complete plan of action. If we conceive of a performative as something that enlarges what an agent may safely do, we might therefore suppose that (16) is a merely descriptive use, since it does not by itself allow the vassal to act. Yet something must have been permitted: where does the disjunctive permission that the sentence describes come from, if not from the performance of (16)?
As far as the current paper is concerned, it is enough for permission sentences to characterize what is allowed. Then whether an utterance expands the sphere of permissibility depends on the interaction of the truth conditions with the normal range of factors that influence how a discourse participant decides to react to an utterance. Whether this minimalist strategy is viable, or whether it will ultimately be necessary to provide a special role for performativity, remains to be seen. (See Kamp 1978 for extensive, but ultimately inconclusive, discussion.)
7.3 Is there a conjunctive use of or after all?
Geurts (2005) and Simons (2005) emphasize the importance of explaining how free choice implications arise when or takes scope over the permission modal.
(17)
a. You may eat an apple or a pear.
b. You may eat an apple or you may eat a pear.
The account of free choice given so far does not explain why (17b) also has a free choice interpretation. Simons proposes an across-the-board LF movement operation on which the sentence with unembedded or is predicted to be logically equivalent to You may [eat an apple or eat a pear]. That approach is compatible with the account of free choice here.
However, there is an alternative explanation that may be worth some consideration: perhaps resource-sensitive or is ambiguous between ⊕ (the translation we’ve given it so far) and &. After all, there is no other lexical item that is a candidate for expressing &. For instance, as mentioned above, if you have ingredients for either meringue or angel food cake, but only enough to make one recipe, and someone asks ‘What’s for dessert?’, the answer is meringue or& cake, never meringue and& cake.
A second intriguing clue comes from conditionals. In Linear Logic, strengthening of the antecedent is valid for & but not for ⊗. That is, we have A ⊸ C ⊢ (A & B) ⊸ C but A ⊸ C ⊬ (A ⊗ B) ⊸ C. The observation that and never expresses & explains why trying to strengthen an antecedent using and in English does not work: If John left, we could all play bridge does not entail If John left and Mary left, we could all play bridge. But if or has a conjunctive use, then we could explain why the inference does seem valid if we use or: If John left or& Mary left, we could all play bridge.
If or can express &, then the ability of (17b) to serve as a paraphrase of (17a) is immediately explained: it translates directly as (A ⊸ δ) & (B ⊸ δ), and it is easy to prove that (A ⊸ δ) & (B ⊸ δ) ⊢ (A ⊕ B) ⊸ δ. Of course, if or had such a conjunctive use, we would expect it to occur in embedded position too, for example, You may eat an apple or& a pear. But this is harmless, and merely gives a different route to the . . . but I don’t know which reading, which we derived above by giving (disjunctive) or wide scope. More problematically, we would also expect a conjunctive or to be available in non-modal sentences. Then saying that John left or& Mary left would offer the addressee free choice of which disjunct to believe, yet would license belief in at most one of the disjuncts. Such a meaning does not appear to be available.
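The contrast between & and ⊗ in antecedent strengthening can be spelled out as a short derivation. This is a sketch only: the two-sided rule names are the usual textbook labels and are an assumption here, A, B and C are taken to be distinct atoms, and the block assumes the `amsmath` and `amssymb` packages.

```latex
\begin{align*}
&A \vdash A
  && \text{axiom}\\
&A \mathbin{\&} B \vdash A
  && \&\text{L}_1\\
&A \mathbin{\&} B,\; A \multimap C \vdash C
  && \multimap\text{L, with } C \vdash C\\
&A \multimap C \vdash (A \mathbin{\&} B) \multimap C
  && \multimap\text{R}
\end{align*}
% The tensor case gets stuck. After \multimap R and \otimes L the goal is
%   A, B, A \multimap C \vdash C.
% \multimap L must split {A, B} between its two premises. The left premise
% \Gamma_1 \vdash A succeeds only for \Gamma_1 = \{A\}, which leaves
%   B, C \vdash C
% as the right premise, and without Weakening the unused B cannot be dropped.
```

With & the unwanted conjunct B is set aside by the choice of projection, whereas with ⊗ both conjuncts are handed over as resources and each must be consumed.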
Put another way, non-modal uses of or appear to always be classical disjunction (this is hardly surprising). One notable feature of Linear Logic is that the classical connectives are easily expressible, given the addition of the so-called exponential operators, ! (pronounced ‘of course’) and ? (‘why not?’): from ∆ ⊢ !A infer ∆ ⊢ A, !A); from ?A ⊢ ∆ infer ⊢ ∆. These operators allow a richer control over resources in which assumptions can be used repeatedly, as in contraction, or ignored, as in weakening. Given Linear Logic with exponentials, we can choose a more relaxed classical resource management regime, or a more fussy pure Linear Logic regime, as needed. For instance, the classical disjunction of A and B can be expressed as !A ⊕ !B.
So there is no problem allowing Linear reasoning to peacefully coexist with classical reasoning, as long as we can reliably tell which kind of resource management to use in any given context. To a first approximation in English, linear resource management appears to be relevant only for untensed clauses with bare verb forms, as in You may eat an apple or eat a pear, in which or takes scope over the untensed bare verb phrases eat an apple and eat a pear. Then we could suppose the reason that John left or Mary left does not have a conjunctive interpretation is because the tensed clauses trigger (only) a classical interpretation of or. Figuring out how to regulate the distribution of an ambiguous or would be a major undertaking, so I leave this issue unresolved for now.
8 Semantics for linear logic
The discussion so far has been conducted entirely in terms of inference rules and proofs. It is unusual these days, though not unheard of, to express the meaning of natural language using proof theory without giving a model theory. More often, of course, we have the opposite situation, in which semantic analyses provide models without any proof theory. The most complete picture, however, emerges when proof theory and model theory complement each other. Therefore I will discuss models for Linear Logic here, with a detailed illustration of a free choice example.
There are a number of semantic approaches to Linear Logic. Girard’s (1987, 1995) original semantics in terms of coherence spaces and in terms of phase spaces would not be directly helpful here. There are other semantic approaches, however, that have tantalizing associations with the granting and denying of permission. I will mention three. First, Petri nets describe the movement of tokens through a network. Lokhorst (1997) uses Petri nets as models of his Linear Logic treatment of deontic reasoning. (Think of the tokens as lumps of permission moving from one location to another.) Second, in game semantics a Proponent and an Opponent take turns making choices, and I have argued that tracking choice is central to understanding permission talk. See, e.g., Accorsi and van Benthem 1999 for a discussion of game semantics for Linear Logic. Third, there are computational models of Linear Logic that make an explicit connection between the additives and choice. For example, Abramsky’s (1993) computational semantics for intuitionistic Linear Logic interprets A ⊗ B as an ordered pair ⟨A, B⟩ both of whose elements will be used in further computation (eager evaluation); A & B, on the other hand, denotes an ordered pair only one of whose elements will ever be used (lazy evaluation), and of course A ⊕ B delivers a projection function that chooses one or the other of the elements in a & pair. Unfortunately for our purposes here, Abramsky’s computational interpretation of classical Linear Logic involves parallel distributed processing, which would take us too far afield.²
Most reassuringly familiar for linguists, Allwein and Dunn (1993) provide a kosher Kripke-style possible worlds semantics, and that is the approach that I will present here. Following Allwein and Dunn, the expository strategy will be to begin with an algebraic model that is faithful to the inference rules, then show how to reconstruct that algebra in terms of worlds.
8.1 An algebraic semantics
The algebraic model contains three main components: a lattice for modeling the additive connectives, a unary operation for modeling negation, and a binary operation for modeling the multiplicative connectives. Additives: let A, ∧, and ∨ form a bounded lattice with partial order ≤ and top and bottom elements. The lattice can be finite or non-finite, and it can be distributive or non-distributive. Negation: now let ∼ be a DeMorgan negation on that lattice. This means that ∼ must be order-reversing (for all x, y in A, x ≤ ∼y iff y ≤ ∼x), and it must be involutive (for all x in A, ∼∼x ≤ x). Multiplicatives: we add a commutative, associative binary operation ◦ with identity element t (that is, t ◦ a = a = a ◦ t for all a in A). Thus A, ◦, and t form a commutative monoid. Note that t may be distinct from the top of the lattice. The monoid operation must distribute over the join operation, that is, for all a, b, c ∈ A : a ◦ (b ∨ c) = (a ◦ b) ∨ (a ◦ c). It must also be compatible with negation in the sense that for all a, b, c ∈ A : a ◦ b ≤ c iff a ◦ ∼c ≤ ∼b (“antilogism”).
2 Though it is intriguing to think that the meaning of some natural language expressions might be appropriately modeled by a distributed process. Perhaps some permission sentences denote programs which the recipient can execute in various environments in order to produce whichever certificate of permission is required. Then a free choice permission sentence denotes a program whose execution is blocked until it receives an external choice (a selection of which alternative to deploy).
The points in the lattice model formulas. Given a valuation v mapping atomic formulas onto elements of A, we extend v to complex formulas as follows: v(A⊥) = ∼v(A); v(A & B) = v(A) ∧ v(B); v(A ⊕ B) = v(A) ∨ v(B); v(A ⊗ B) = v(A) ◦ v(B); v(A ⅋ B) = ∼(∼v(A) ◦ ∼v(B)); and v(A ⊸ B) = ∼(v(A) ◦ ∼v(B)). As an example, I will present a six-element, non-distributive lattice:
[Hasse diagram: 0 at the bottom and 5 at the top, connected by two chains, 0 < 1 < 3 < 5 and 0 < 2 < 4 < 5]
x  | 0 1 2 3 4 5
∼x | 5 3 4 1 2 0
◦ | 0 1 2 3 4 5
0 | 0 0 0 0 0 0
1 | 0 1 2 1 2 5
2 | 0 2 1 2 1 5
3 | 0 1 2 3 4 5
4 | 0 2 1 4 3 5
5 | 0 5 5 5 5 5
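The requirements placed on the algebra can be checked by brute force over the six elements. The following is a minimal sketch, not part of the original analysis: it transcribes the lattice order and the two tables, and the helper names (`UP`, `leq`, `meet`, `join`, `check_algebra`) are illustrative choices of mine. It also confirms that the lattice itself is non-distributive.

```python
from itertools import product

ELEMS = range(6)
# Lattice order: two chains, 0 < 1 < 3 < 5 and 0 < 2 < 4 < 5.
# UP[x] lists every y with x <= y.
UP = {0: {0, 1, 2, 3, 4, 5}, 1: {1, 3, 5}, 2: {2, 4, 5},
      3: {3, 5}, 4: {4, 5}, 5: {5}}
NEG = [5, 3, 4, 1, 2, 0]        # the negation table
DOT = [[0, 0, 0, 0, 0, 0],      # the monoid table: DOT[x][y] is x o y
       [0, 1, 2, 1, 2, 5],
       [0, 2, 1, 2, 1, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 2, 1, 4, 3, 5],
       [0, 5, 5, 5, 5, 5]]


def leq(x, y):
    return y in UP[x]


def meet(x, y):
    """Greatest lower bound of x and y."""
    lower = [z for z in ELEMS if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))


def join(x, y):
    """Least upper bound of x and y."""
    upper = [z for z in ELEMS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))


def check_algebra():
    """Map each requirement on the algebra to True or False."""
    pairs = list(product(ELEMS, repeat=2))
    triples = list(product(ELEMS, repeat=3))
    return {
        "involutive": all(NEG[NEG[x]] == x for x in ELEMS),
        "order_reversing": all(leq(x, NEG[y]) == leq(y, NEG[x])
                               for x, y in pairs),
        "identity_is_3": all(DOT[3][x] == x == DOT[x][3] for x in ELEMS),
        "commutative": all(DOT[x][y] == DOT[y][x] for x, y in pairs),
        "associative": all(DOT[DOT[x][y]][z] == DOT[x][DOT[y][z]]
                           for x, y, z in triples),
        "distributes_over_join": all(
            DOT[x][join(y, z)] == join(DOT[x][y], DOT[x][z])
            for x, y, z in triples),
        "antilogism": all(leq(DOT[x][y], z) == leq(DOT[x][NEG[z]], NEG[y])
                          for x, y, z in triples),
        # Distributivity of meet over join; expected to fail here.
        "distributive_lattice": all(
            meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
            for x, y, z in triples),
    }


if __name__ == "__main__":
    for name, holds in check_algebra().items():
        print(name, holds)
```

Every entry except `distributive_lattice` comes out true; one witness for non-distributivity is 3 ∧ (1 ∨ 2) = 3 versus (3 ∧ 1) ∨ (3 ∧ 2) = 1.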
The Hasse diagram gives the lattice order in the usual way, so that 0 ≤ 1, 1 ≤ 3, and so on. In addition, since ≤ is reflexive and transitive, we also have 0 ≤ 0, 0 ≤ 3, etc. Since meet (∧) in a lattice is the unique greatest lower bound, it can be read off the Hasse diagram, e.g., 5 ∧ 5 = 5, 4 ∧ 5 = 4, 4 ∧ 3 = 0, and so on (dually for the join operation ∨). It is easy to see by inspection that the negation relation ∼ is involutive (e.g., ∼∼3 = 3) and order reversing (e.g., along with 0 ≤ ∼3 we have 3 ≤ ∼0). Note that 3 serves as the identity element t of the monoid. Since the monoid operation is commutative, the matrix is symmetric across the top-left to bottom-right diagonal (e.g., 4 ◦ 2 = 2 ◦ 4). Furthermore, mechanical checking will confirm that the monoid operation is associative (e.g., (4 ◦ 2) ◦ 1 = 4 ◦ (2 ◦ 1)), that it distributes over the join operation (e.g., 3 ◦ (1 ∨ 4) = (3 ◦ 1) ∨ (3 ◦ 4)), and that it respects the antilogism requirement (e.g., 4 ◦ 2 ≤ 3 ≡ 4 ◦ ∼3 ≤ ∼2).
A sequent Γ semantically entails ∆ (written ‘Γ ⊨ ∆’) just in case the valuation of the multiplicative conjunction of the formulas in Γ is dominated by the valuation of the multiplicative disjunction of the formulas in ∆. For instance, since x ∧ y ≤ x for all x, y in A by the definition of meet in a lattice, we have that A & B ⊨ A.
To illustrate how these tables provide a model of the logic, recall that we have the following three theorems discussed in previous sections and one non-theorem:
(18) a. (A ⊸ δ) & (B ⊸ δ) ⊢ (A ⊕ B) ⊸ δ
     b. (A ⊕ B) ⊸ δ ⊢ (A ⊸ δ) & (B ⊸ δ)
     c. (A ⊸ δ) ⊕ (B ⊸ δ) ⊢ (A & B) ⊸ δ
     d. (A & B) ⊸ δ ⊬ (A ⊸ δ) ⊕ (B ⊸ δ)
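One way to test the four sequents in (18) against the six-element algebra is to enumerate all 216 valuations of A, B, and δ and collect those where the left side is not dominated by the right side. The sketch below assumes the transcription of the tables given above; `imp`, `SEQUENTS`, and `countermodels` are names introduced here for illustration. Note that an empty result for one finite model only shows that this model does not refute the sequent; theoremhood concerns every model.

```python
from itertools import product

ELEMS = range(6)
UP = {0: {0, 1, 2, 3, 4, 5}, 1: {1, 3, 5}, 2: {2, 4, 5},
      3: {3, 5}, 4: {4, 5}, 5: {5}}
NEG = [5, 3, 4, 1, 2, 0]
DOT = [[0, 0, 0, 0, 0, 0],
       [0, 1, 2, 1, 2, 5],
       [0, 2, 1, 2, 1, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 2, 1, 4, 3, 5],
       [0, 5, 5, 5, 5, 5]]


def leq(x, y):
    return y in UP[x]


def meet(x, y):
    lower = [z for z in ELEMS if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))


def join(x, y):
    upper = [z for z in ELEMS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))


def imp(x, y):
    """Linear implication: v(A -o B) = ~(v(A) o ~v(B))."""
    return NEG[DOT[x][NEG[y]]]


# Each sequent as (left side, right side), both functions of the values
# a = v(A), b = v(B), d = v(delta).
SEQUENTS = {
    "18a": (lambda a, b, d: meet(imp(a, d), imp(b, d)),
            lambda a, b, d: imp(join(a, b), d)),
    "18b": (lambda a, b, d: imp(join(a, b), d),
            lambda a, b, d: meet(imp(a, d), imp(b, d))),
    "18c": (lambda a, b, d: join(imp(a, d), imp(b, d)),
            lambda a, b, d: imp(meet(a, b), d)),
    "18d": (lambda a, b, d: imp(meet(a, b), d),
            lambda a, b, d: join(imp(a, d), imp(b, d))),
}


def countermodels(label):
    """Valuations (v(A), v(B), v(delta)) where left is not below right."""
    lhs, rhs = SEQUENTS[label]
    return [(a, b, d) for a, b, d in product(ELEMS, repeat=3)
            if not leq(lhs(a, b, d), rhs(a, b, d))]


if __name__ == "__main__":
    for label in SEQUENTS:
        print(label, len(countermodels(label)), "countermodel(s)")
```

In this model (18a) through (18c) have no countermodels, while (18d) has some, among them v(A) = 1, v(B) = 2, v(δ) = 0.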
If the given algebra is a faithful model of Linear Logic, we expect that for every valuation v assigning a lattice element to the propositional symbols δ, A, and B, the valuation of the left hand side of any theorem will be dominated (in the sense of the lattice order ≤) by the valuation of the right hand side. This is the case for (18a) through (18c), but we have a countermodel for (18d): if v(δ) = 0, v(A) = 1, and v(B) = 2, then v((A & B) ⊸ δ) = v(((A & B) ⊗ δ⊥)⊥) = ∼((v(A) ∧ v(B)) ◦ ∼v(δ)) = ∼((1 ∧ 2) ◦ ∼0) = 5. But v((A ⊸ δ) ⊕ (B ⊸ δ)) = 0, and 5 ≰ 0.
There are (infinitely) many other possible choices for a lattice, and for any given lattice, there may be many choices for a suitable negation and for a suitable monoid operation. For instance, Restall (2000:170) gives an even simpler (but still instructive) model of (distributive) Linear Logic based on a four-element lattice. Since Linear Logic is sound and complete with respect to the class of algebraic models given here, a sequent is a theorem iff its left hand side semantically entails its right hand side for every valuation in every model.
8.2 A possible-worlds semantics
The algebraic semantics is simple and straightforward, in part because it merely recapitulates the inference rules; for the same reason, it may not add any insight beyond what is already evident from the inference rules themselves. Constructing a Kripke-style possible worlds semantics is a bit more complicated, but may allow natural language semanticists to transfer some of their intuitions from more familiar sorts of semantics for natural languages. We shall see that one particularly intriguing feature of the Kripke semantics for Linear Logic is that there will be three possibilities for the status of a formula at a world: it may be true, false, or neither true nor false, which is exactly what makes Linear Logic suitable for modeling actions that may be permitted, forbidden, or neither permitted nor forbidden.
Allwein and Dunn associate each element in A with a particular set of worlds. The construction goes as follows. Consider pairs of the form ⟨F, I⟩, where F and I are sets of points in the lattice. We require ⟨F, I⟩ to satisfy the following four requirements: first, (w1): F and I must be disjoint. Second, (w2): F must be closed upward under ≤, so that for all a ∈ F and for all b ∈ A : (a ≤ b) implies b ∈ F. Dually, I must be closed downward under ≤, so that for all a ∈ A and for all b ∈ I : (a ≤ b) implies a ∈ I. In particular, F always contains the top element, and I always contains the bottom element of the lattice. Third, F and I must be closed under meets and joins, respectively. That is: (w3): for all a, b ∈ F : a ∧ b ∈ F; and for all a, b ∈ I : a ∨ b ∈ I. In other words, conditions (w2) and (w3) require that F must be a filter, and that I must be an ideal. Finally, there is a maximality condition:
Maximality: A filter/ideal pair ⟨F, I⟩ satisfying (w1), (w2), and (w3) satisfies maximality only if there is no other distinct pair of sets ⟨F′, I′⟩ also satisfying (w1), (w2) and (w3) that properly includes the first, i.e., such that F ⊆ F′ and I ⊆ I′.
Here are a few of the possible pairs of subsets that fail to satisfy the requirements:
⟨{1, 2}, {1, 3}⟩    violates w1
⟨{3}, {0}⟩          violates w2
⟨{4, 3, 5}, {0}⟩    violates w3
⟨{4, 5}, {0}⟩       violates Maximality
In fact, in this model there are exactly four maximal disjoint filter/ideal pairs:
World a: ⟨{4, 5}, {0, 2}⟩
World b: ⟨{3, 5}, {0, 1}⟩
World c: ⟨{2, 4, 5}, {0, 1, 3}⟩
World d: ⟨{1, 3, 5}, {0, 2, 4}⟩
These pairs will stand in one to one correspondence with our possible worlds. For each world w = ⟨F, I⟩, we will interpret F as the set of points that are true at w, and I as the set of points that are false at w. For worlds c and d, every point in the lattice is either true or false. But for world a, points 1 and 3 are neither true nor false. Similarly, for world b, points 2 and 4 are neither true nor false. In terms of permission talk, there may be situations in which some things are permitted, some things are forbidden, and some things are neither permitted nor forbidden.
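The four worlds can be recovered by exhaustive search over pairs of subsets of the lattice. The sketch below is an illustration under stated assumptions rather than Allwein and Dunn's own construction: it reads (w2) as requiring F and I to be non-empty (following the remark that F always contains the top element and I the bottom element), and the function names are mine.

```python
from itertools import combinations

ELEMS = list(range(6))
# Lattice order: two chains, 0 < 1 < 3 < 5 and 0 < 2 < 4 < 5.
UP = {0: {0, 1, 2, 3, 4, 5}, 1: {1, 3, 5}, 2: {2, 4, 5},
      3: {3, 5}, 4: {4, 5}, 5: {5}}


def leq(x, y):
    return y in UP[x]


def meet(x, y):
    lower = [z for z in ELEMS if leq(z, x) and leq(z, y)]
    return next(z for z in lower if all(leq(w, z) for w in lower))


def join(x, y):
    upper = [z for z in ELEMS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, w) for w in upper))


def subsets(xs):
    return [frozenset(c) for n in range(len(xs) + 1)
            for c in combinations(xs, n)]


def is_filter(F):
    """Non-empty, closed upward (w2), closed under meets (w3)."""
    return (bool(F)
            and all(b in F for a in F for b in ELEMS if leq(a, b))
            and all(meet(a, b) in F for a in F for b in F))


def is_ideal(I):
    """Non-empty, closed downward (w2), closed under joins (w3)."""
    return (bool(I)
            and all(a in I for b in I for a in ELEMS if leq(a, b))
            and all(join(a, b) in I for a in I for b in I))


def candidate_pairs():
    """All disjoint (w1) filter/ideal pairs."""
    filters = [F for F in subsets(ELEMS) if is_filter(F)]
    ideals = [I for I in subsets(ELEMS) if is_ideal(I)]
    return [(F, I) for F in filters for I in ideals if not (F & I)]


def maximal_pairs():
    """Candidates not properly included in another candidate."""
    cands = candidate_pairs()
    return [(F, I) for F, I in cands
            if not any((G, J) != (F, I) and F <= G and I <= J
                       for G, J in cands)]


if __name__ == "__main__":
    for F, I in maximal_pairs():
        print(sorted(F), sorted(I))
```

Under these assumptions the search returns exactly the four pairs listed as worlds a through d, and it classifies ⟨{4, 5}, {0}⟩ as a candidate that fails maximality.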
The next step is to associate each point in the lattice with a set of worlds. If w is a world associated with the pair of sets of points ⟨F, I⟩, let w₁ indicate F and w₂ indicate I. Then we can define a map β that takes each point p in the lattice onto the set of worlds w such that p ∈ w₁:
β(0) = {}
β(1) = {d}
β(2) = {c}
β(3) = {b, d}
β(4) = {a, c}
β(5) = {a, b, c, d}
In other words, we map each point in the lattice to the set of worlds that make it true. We now need to define relations over sets of worlds that will allow us to reconstruct the logical operations we want to model: ∧, ∨, ∼, and ◦.
The meet operation is straightforward. We extend β in the following way: β(p ∧ q) = β(p) ∩ β(q). So meet corresponds to simple set intersection. Thus 4 ∧ 2 = 2, and β(4 ∧ 2) = β(4) ∩ β(2) = {a, c} ∩ {c} = {c} = β(2).
The join operation is not quite so straightforward. We cannot represent join as set union. To see why, note that 3 ∨ 2 = 5, but β(3) ∪ β(2) = {b, d} ∪ {c} = {b, c, d} ≠ β(5). The solution is to exploit the information present in the second element in the pair of sets that define the worlds. To do this, we define two operations on sets of worlds. Let W be our set of worlds, and let C be any subset of W:
l(C) = {x | for all y ∈ W, x₁ ⊆ y₁ implies y ∉ C}
r(C) = {x | for all y ∈ W, x₂ ⊆ y₂ implies y ∉ C}
Although l and r are defined over all subsets of W, we will only need to apply them in the following cases:
r(β(0)) = r({}) = {a, b, c, d}
r(β(1)) = r({d}) = {b, c}
r(β(2)) = r({c}) = {a, d}
r(β(3)) = r({b, d}) = {c}
r(β(4)) = r({a, c}) = {d}
r(β(5)) = r({a, b, c, d}) = {}
For instance, the reason a is not in r(β(1)) is because a₂ ⊆ d₂, but d ∈ β(1). Allwein and Dunn show that for all points p in the lattice, l(r(β(p))) = β(p).
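The map β and the two shifting operations are small enough to compute directly. A minimal sketch, assuming the four worlds listed above; `WORLDS`, `beta`, `l`, and `r` are my transcriptions of the definitions, not code from Allwein and Dunn.

```python
# Each world is a pair (F, I): the points true at it and the points false at it.
WORLDS = {
    "a": (frozenset({4, 5}), frozenset({0, 2})),
    "b": (frozenset({3, 5}), frozenset({0, 1})),
    "c": (frozenset({2, 4, 5}), frozenset({0, 1, 3})),
    "d": (frozenset({1, 3, 5}), frozenset({0, 2, 4})),
}


def beta(p):
    """Worlds at which lattice point p is true."""
    return frozenset(w for w, (F, _) in WORLDS.items() if p in F)


def l(C):
    """Worlds x such that no world y with x1 included in y1 is in C."""
    return frozenset(x for x in WORLDS
                     if all(y not in C for y in WORLDS
                            if WORLDS[x][0] <= WORLDS[y][0]))


def r(C):
    """Worlds x such that no world y with x2 included in y2 is in C."""
    return frozenset(x for x in WORLDS
                     if all(y not in C for y in WORLDS
                            if WORLDS[x][1] <= WORLDS[y][1]))


if __name__ == "__main__":
    for p in range(6):
        print(p, sorted(beta(p)), sorted(r(beta(p))), sorted(l(r(beta(p)))))
```

Running it reproduces the β and r tables and confirms the round trip l(r(β(p))) = β(p) for all six points. It also shows why union will not do for join: `beta(3) | beta(2)` lacks world a, so it differs from `beta(5)`.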
We can now define join by shifting the conjuncts using r, then taking their intersection, then shifting back using l: β(p ∨ q) = l(r(β(p)) ∩ r(β(q))). For instance, we have β(1 ∨ 3) = l(r(β(1)) ∩ r(β(3))) = l({b, c} ∩ {c}) = l({c}) = β(3). Trying the problematic case given above, β(3 ∨ 2) = l(r(β(3)) ∩ r(β(2))) = l({c} ∩ {a, d}) = l({}) = {a, b, c, d} = β(5), as desired. At this point, β, l, and r allow us to fully simulate the structure of the lattice in terms of sets of worlds.
Representing negation: β(∼p) = {x | ⟨∼x₂, ∼x₁⟩ ∈ r(β(p))} (where applying ∼ to a set of points returns the set resulting from applying ∼ to each member of the original set). For instance, we have β(∼1) = {⟨∼{0, 1}, ∼{3, 5}⟩, ⟨∼{0, 1, 3}, ∼{2, 4, 5}⟩} = {b, d} = β(3). Note that linear negation expresses something about provability, not about falsity. One way to see this is to observe that in this model, 3 and its negation ∼3 = 1 are both true at world d.
Representing the tensor relation ◦ proceeds in two steps. In the usual Kripke semantics, unary modal operators are characterized by an accessibility relation, a two-place relation over worlds. Because the multiplicatives are two-place connectives, we will need a three-place relation.³
Sxyz iff ∀p, q : (p ◦ q ∈ z₂ and q ∈ y₁) implies p ∈ x₂
The strategy here is a generalization of the Routley-Meyer semantics for Relevant Logic. The goal is for the relation S to capture all of the information present in the monoid operation ◦. In order to do this, it needs to take advantage of both sets of points that define the worlds: the set of propositions that are true at a world as well as those that are false at that world. Conceptually, S models modus ponens, in which x plays the role of antecedent, y plays the role of the implication, and z plays the role of the consequent. If the implication is true at y, and the consequent is false at z, S guarantees that the antecedent must be false at x.
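The world-based definitions of join and negation, and the relation S, can be computed from the same ingredients. The sketch below is illustrative only: it transcribes the four worlds and the two tables, and `beta_join`, `beta_neg`, `star`, and `S_TRIPLES` are names of my own. The helper `star` maps a world ⟨F, I⟩ to the world ⟨∼I, ∼F⟩, which in this model always exists.

```python
ELEMS = range(6)
UP = {0: {0, 1, 2, 3, 4, 5}, 1: {1, 3, 5}, 2: {2, 4, 5},
      3: {3, 5}, 4: {4, 5}, 5: {5}}
NEG = [5, 3, 4, 1, 2, 0]
DOT = [[0, 0, 0, 0, 0, 0],
       [0, 1, 2, 1, 2, 5],
       [0, 2, 1, 2, 1, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 2, 1, 4, 3, 5],
       [0, 5, 5, 5, 5, 5]]
WORLDS = {
    "a": (frozenset({4, 5}), frozenset({0, 2})),
    "b": (frozenset({3, 5}), frozenset({0, 1})),
    "c": (frozenset({2, 4, 5}), frozenset({0, 1, 3})),
    "d": (frozenset({1, 3, 5}), frozenset({0, 2, 4})),
}


def join(x, y):
    """Least upper bound in the lattice."""
    upper = [z for z in ELEMS if z in UP[x] and z in UP[y]]
    return next(z for z in upper if all(w in UP[z] for w in upper))


def beta(p):
    return frozenset(w for w, (F, _) in WORLDS.items() if p in F)


def l(C):
    return frozenset(x for x in WORLDS
                     if all(y not in C for y in WORLDS
                            if WORLDS[x][0] <= WORLDS[y][0]))


def r(C):
    return frozenset(x for x in WORLDS
                     if all(y not in C for y in WORLDS
                            if WORLDS[x][1] <= WORLDS[y][1]))


def beta_join(p, q):
    """Join: shift with r, intersect, shift back with l."""
    return l(r(beta(p)) & r(beta(q)))


def star(x):
    """The world whose pair is (~x2, ~x1)."""
    F, I = WORLDS[x]
    flipped = (frozenset(NEG[i] for i in I), frozenset(NEG[f] for f in F))
    return next(w for w, pair in WORLDS.items() if pair == flipped)


def beta_neg(p):
    """Negation: x is in the result iff star(x) is in r(beta(p))."""
    return frozenset(x for x in WORLDS if star(x) in r(beta(p)))


def S(x, y, z):
    """Sxyz iff (p o q false at z and q true at y) implies p false at x."""
    false_x, true_y, false_z = WORLDS[x][1], WORLDS[y][0], WORLDS[z][1]
    return all(p in false_x
               for p in ELEMS for q in true_y if DOT[p][q] in false_z)


S_TRIPLES = sorted(x + y + z
                   for x in WORLDS for y in WORLDS for z in WORLDS
                   if S(x, y, z))

if __name__ == "__main__":
    print(len(S_TRIPLES), " ".join(S_TRIPLES))
```

In this model `beta_join(p, q)` agrees with `beta(join(p, q))` for every pair of points, `beta_neg(p)` agrees with `beta(NEG[p])`, and S holds of twenty triples, including aba but not abc.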
For instance, since 3 (role: the implication) is true at b and 1 ◦ 3 (the consequent) is false at c, but 1 (the antecedent) is not false at a, S does not hold of a, b, and c. We do have Saba, however. The complete relation is aab, aba, baa, bbb, caa, cad, cbb, cbc, cca, ccd, cdb, cdc, dab, dac, dba, dbd, dcb, dcc, dda, ddd.
Once we have constructed S as a function of ◦, we can define multiplicative conjunction purely in terms of relations over worlds:
β(p ◦ q) = l({z | ∀x, y : Sxyz and y ∈ β(q) implies x ∈ r(β(p))})
This definition unpacks S in order to reconstruct the original relation ◦.
3 Lambek grammars (e.g., Moortgat 1997) also use a three place relation to give a semantics for a multiplicative conjunction, where the conjunction is used to model concatenation of linguistic expressions. For an example of modus ponens in type-logical grammar, DP ⊗ DP\S ⊢ S.
8.3 Understanding linear implication
What does the multiplicative conjunction of two formulas mean? Since we now have both an algebraic and a possible worlds semantics in correspondence, we can move back and forth between the two semantics in search of insight. Begin with the algebra. We can keep track of the state of our reasoning process by picking out a point in the lattice. Assume that I have good reason to believe we are located at lattice position 1. This is a highly specific situation: I know that we are located on world d, since that is the only world at which 1 is true. Now assume that I learn something: that you have eaten a pear. Call this fact B, and associate it with lattice point 4 (i.e., let v(B) = 4). To find out where we are now, I compute 1 ◦ 4 = 2. Since β(1 ◦ 4) = β(2) = {c}, we are now on world c. Learning that you have eaten a pear changes our location from world d to world c.
This may initially seem somewhat distressing. In the usual Stalnakerian system, adding information is typically a monotonic process of eliminating possible worlds. If we’ve already narrowed the set of live options to a single world d, there is no way to end up on a distinct world c. Because ◦ is nonmonotonic in this sense, it may be better to think of what we have been calling worlds as classes of worlds. Sometimes the term ‘set-up’ is used instead of ‘world’. I will use the term ‘situation’. Then learning that you have eaten a pear changes the current situation into a different situation, one in which the consequences of having eaten a pear obtain.
Let’s continue to reason. We pick a point in the lattice to serve as A, the situation in which you eat an apple, and a separate point to serve as δ, the situation in which all obligations are fulfilled. Say that v(A) = 2, v(δ) = 3, and v(B) is still 4. Now consider the proposition that eating an apple is permitted: A ⊸ δ. Then v(A ⊸ δ) = v((A ⊗ δ⊥)⊥) = ∼(v(A) ◦ ∼v(δ)) = ∼(2 ◦ ∼3) = ∼(2 ◦ 1) = ∼2 = 4.
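The two computations so far, updating the current situation with a new fact and evaluating a permission statement, can be replayed mechanically. This is a small sketch under the valuation just given (v(A) = 2, v(B) = 4, v(δ) = 3); `update`, `imp`, and the constant names are mine, and `TRUE_AT` records only the first member of each world pair.

```python
NEG = [5, 3, 4, 1, 2, 0]
DOT = [[0, 0, 0, 0, 0, 0],
       [0, 1, 2, 1, 2, 5],
       [0, 2, 1, 2, 1, 5],
       [0, 1, 2, 3, 4, 5],
       [0, 2, 1, 4, 3, 5],
       [0, 5, 5, 5, 5, 5]]
# Points true at each world (the filter half of each world pair).
TRUE_AT = {"a": {4, 5}, "b": {3, 5}, "c": {2, 4, 5}, "d": {1, 3, 5}}

APPLE, PEAR, DELTA = 2, 4, 3     # v(A), v(B), v(delta)


def beta(p):
    """Worlds (situations) at which point p is true."""
    return {w for w, true_points in TRUE_AT.items() if p in true_points}


def update(state, *events):
    """Combine the current point with each learned fact, left to right."""
    for event in events:
        state = DOT[state][event]
    return state


def imp(x, y):
    """Linear implication: v(A -o B) = ~(v(A) o ~v(B))."""
    return NEG[DOT[x][NEG[y]]]


if __name__ == "__main__":
    start = 1
    after_pear = update(start, PEAR)
    print(beta(start), "->", beta(after_pear))
    print("v(A -o delta) =", imp(APPLE, DELTA))
```

Starting from point 1 and learning B moves the computation to point 2, that is, from situation d to situation c, and the permission statement A ⊸ δ receives the value 4. Because `update` accepts any number of events, the same helper can be applied to longer sequences of facts.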
Apparently, in this model, the situation in which you eat a pear is modeled by the same point as the situation in which you are permitted to eat an apple. (This sort of coincidence is unavoidable in such a tiny model, in the same way that a valuation for classical logic will be forced to map very different formulas to the same truth value.)
So let’s say that I know we’re in a situation in which you are permitted to eat an apple (say, point 4), and then I learn that you have eaten an apple. Perhaps I watch you eat it. This changes things: I compute 4 ◦ 2 = 1. Thanks to your eating an apple, we’re now in situation 1. And since 1 ≤ 3, things are as they are supposed to be. In terms of worlds, δ is modeled by worlds (situations) b and d; and since point 1 corresponds to (a singleton set containing only) world d, we must be in a δ-world.
So, what if you are permitted to eat an apple or a pear? That’s ∼((2 ∨ 4) ◦ ∼3) = 4. We just saw that if we start at 4 and you eat an apple, we land on a δ-world. And indeed, if we’re at point 4 and you eat a pear instead, 4 ◦ 4 = 3, and once again we’re in a δ-situation. But what if you eat an apple and you eat a pear? 4 ◦ 4 ◦ 2 = 2. Situation 2 is not a δ-situation, so things are not ok. Having permission to eat an apple or a pear is not the same thing as having permission to eat an apple and a pear. Likewise, if killing the postman is modeled by situation 4 (i.e., v(K) = 4), then eating an apple and killing the postman will definitely not leave us in a δ-situation. (This small model is somewhat unrealistic, however, in that there are situations in which eating an apple, killing the postman, and then eating another apple is perfectly permissible.)
However, as emphasized above, having permission to eat an apple or a pear is compatible with also having permission to eat both. Making use of the same model, if we have v(A) = v(δ) = v(B) = 3, then v((A & B) ⊸ δ) = v((A ⊗ B) ⊸ δ) = 3. With this valuation, eating apples and pears is truly optional: you can eat an apple and stop, or you can eat a pear and stop, or you can eat an apple and you can eat a pear, and in all three cases you’ll end up in a δ-situation.
9 Conclusions
On the view presented here, understanding free choice hinges on recognizing that permission is a scarce resource, and so permission talk requires a resource-sensitive semantics. Following Lokhorst (1997), I propose Linear Logic as a way of tracking permission: not only what kind of permission has been granted, but how much. Then primary free choice implications (given You may eat an apple or a pear, infer You may eat an apple & You may eat a pear) follow merely from expressing permission using the (independently-motivated) Anderson/Kanger deontic reduction strategy. Double prohibition (from You may not eat an apple or a pear infer You may not eat an apple and You may not eat a pear) follows from standard Gricean reasoning, without any need to postulate special pragmatic mechanisms.
The implications of this view are fairly dramatic. The claim is that natural language expressions can differ in the resource management schemes they impose. At the least, alethic modes impose classical resource management, and deontic modes impose linear resource management (and quite likely, other modes as well). Linear Logic is one of the better known resource-sensitive logics. Other logics may be worth considering instead. Similarly, the Anderson/Kanger deontic reduction strategy was adopted in part for ease of exposition, and work remains to integrate the account here within a more general framework of modality in natural language. But apart from the advantages of Linear Logic specifically or the deontic reduction, I would like to suggest a more general conclusion: that we may be able to gain new and valuable insights into long-standing puzzles in natural language semantics if we allow ourselves to consider richer logical approaches than standard classical logic.
References
Abramsky, Samson. 1993. Computational interpretations of Linear Logic. Theoretical Computer Science 111(1–2). 3–57. doi:10.1016/0304-3975(93)90181-R.
Accorsi, Rafael & Johan van Benthem. 1999. Lorenzen’s games and Linear Logic. University of Amsterdam manuscript. http://www.informatik.uni-freiburg.de/~accorsi/papers/games.pdf.
Akatsuka, Noriko. 1992. Japanese modals are conditionals. In Diane Brentari, Gary Larson & Lynn MacLeod (eds.). The joy of grammar: A festschrift in honor of James D. McCawley. Amsterdam: John Benjamins. 1–10.
Allwein, Gerard & J. Michael Dunn. 1993. Kripke models for Linear Logic. The Journal of Symbolic Logic 58(2). 514–545. doi:10.2307/2275217.
Aloni, Maria. 2007. Free choice, modals, and imperatives. Natural Language Semantics 15(1). 65–94. doi:10.1007/s11050-007-9010-2.
Alonso-Ovalle, Luis. 2006. Disjunction in alternative semantics. UMass Amherst: PhD dissertation.
Asher, Nicholas & Daniel Bonevac. 2005. Free choice permission is strong permission. Synthese 145(3). 303–323. doi:10.1007/s11229-005-6196-z.
Brown, Mark. 1996. Doing as we ought: Towards a logic of simply dischargeable obligations. In Mark Brown & José Carmo (eds.). Deontic logic, agency and normative systems (Third International Workshop on Deontic Logic in Computer Science). Berlin: Springer. 47–65.
Chemla, Emmanuel. 2009a. Universal implicatures and free choice effects: experimental data. Semantics and Pragmatics 2(2). 1–33. doi:10.3765/sp.2.2.
Chemla, Emmanuel. 2009b. Similarity: Towards a unified account of scalar implicatures, free choice permission and presupposition projection. Manuscript. http://www.emmanuel.chemla.free.fr/Material/Chemla-SIandPres.pdf.
Clancy, Patricia M. 1985. The acquisition of Japanese. In Dan Slobin (ed.). The crosslinguistic study of language acquisition: The data (Volume 1). Hillsdale, NJ: Lawrence Erlbaum Associates. 373–524.
Dalrymple, Mary. 2001. Lexical Functional Grammar (Syntax and Semantics volume 34). New York: Academic Press.
Dekker, Paul. 2002. Meaning and use of indefinite expressions. Journal of Logic, Language and Information 11(2). 141–194. doi:10.1023/A:1017575313451.
Fox, Danny. 2007. Free choice disjunction and the theory of scalar implicature. In Uli Sauerland & Penka Stateva (eds.). Presupposition and implicature in compositional semantics. New York: Palgrave Macmillan. 71–120.
Franke, Michael. 2009. Free choice from iterated best response. In Maria Aloni, Harald Bastiaanse, Tikitu de Jager, Peter van Ormondt & Katrin Schulz (eds.). Pre-proceedings of the seventeenth Amsterdam Colloquium. Amsterdam: ILLC/Department of Philosophy. 267–276.
Geurts, Bart. 2005. Entertaining alternatives: disjunctions as modals. Natural Language Semantics 13(4). 383–410. doi:10.1007/s11050-005-2052-4.
Girard, Jean-Yves. 1987. Linear Logic. Theoretical Computer Science 50(1). 1–102. doi:10.1016/0304-3975(87)90045-4.
Girard, Jean-Yves. 1995. Linear Logic: its syntax and semantics. In Jean-Yves Girard, Yves Lafont & Laurent Regnier (eds.). Advances in Linear Logic. Lecture Note Series 222. Cambridge, UK: Cambridge University Press. 1–42.
Hansen, Jörg, Gabriella Pigozzi & Leendert van der Torre. 2007. Ten philosophical problems in deontic logic. Dagstuhl Seminar Proceedings 07122. http://drops.dagstuhl.de/opus/volltexte/2007/941.
Kamp, Hans. 1973. Free choice permission. Proceedings of the Aristotelian Society 74. 57–74.
Kamp, Hans. 1978. Semantics versus pragmatics. In Franz Guenthner & Siegfried J. Schmidt (eds.). Formal semantics and pragmatics for natural languages. Dordrecht, Holland: Reidel. 255–287.
Klinedinst, Nathan. 2007. Plurality and possibility. UCLA, CA: PhD dissertation.
Kratzer, Angelika. 1991. Modality. In Arnim von Stechow & Dieter Wunderlich (eds.). Semantik: Ein internationales Handbuch der zeitgenössischen Forschung. Berlin: De Gruyter. 639–650.
Kratzer, Angelika & Junko Shimoyama. 2002. Indeterminate pronouns: The view from Japanese. In Yukio Otsu (ed.). The proceedings of the third Tokyo conference on psycholinguistics. Tokyo: Hituzi Syobo. 1–25.
Lokhorst, Gert-Jan C. 1997. Deontic linear logic with Petri net semantics. Technical report, FICT (Center for the Philosophy of Information and Communication Technology). Rotterdam. http://homepages.ipact.nl/~lokhorst/deopetri.pdf.
Lokhorst, Gert-Jan C. 2006. Andersonian deontic logic, propositional quantification, and Mally. Notre Dame Journal of Formal Logic 47(3). 385–395. doi:10.1305/ndjfl/1163775445.
McNamara, Paul. 2006. Deontic Logic. In Dov M. Gabbay & John Woods (eds.). Handbook of the history of logic, volume 7: Logic and the modalities in the twentieth century. Amsterdam: Elsevier. 197–288. A version is also available in the Stanford encyclopedia of philosophy. http://plato.stanford.edu/entries/logic-deontic/.
Moortgat, Michael. 1997. Categorial Type Logics. In Johan van Benthem & Alice ter Meulen (eds.). Handbook of logic and language. Cambridge, MA: MIT Press. 93–177.
Portner, Paul. 2009a. Modality. Oxford, UK: Oxford University Press.
Portner, Paul. 2009b. Permission and choice. Georgetown University: Manuscript.
Restall, Greg. 2000. An introduction to substructural logics. London: Routledge.
van Rooij, Robert. 2008. Towards a uniform analysis of any. Natural Language Semantics 16(4). 297–315. doi:10.1007/s11050-008-9035-1.
van Rooij, Robert. 2010. Conjunctive interpretation of disjunction. Semantics and Pragmatics 3(11). doi:10.3765/sp.3.11.
Ross, Alf. 1941. Imperatives and logic. Theoria 7(1). 53–71.
Schulz, Katrin. 2005. A pragmatic solution for the paradox of free choice permission. Synthese 147(2). 343–377. doi:10.1007/1-4020-4631-6_10.
Shan, Chung-chieh. 2004. Binding alongside Hamblin alternatives calls for variable-free semantics. In Kazuha Watanabe & Robert B. Young (eds.). Proceedings from Semantics and Linguistic Theory XIV. Cornell University Press. 289–304.
Simons, Mandy. 2005. Dividing things up: the semantics of or and the modal/or interaction. Natural Language Semantics 13(3). 271–316. doi:10.1007/s11050-004-2900-7.
Wadler, Phil. 1993. A taste of Linear Logic. In Andrzej Borzyszkowski & Stefan Sokolowski (eds.). Proceedings of the 18th international symposium on mathematical foundations of computer science (Lecture Notes in Computer Science Volume 711). Heidelberg: Springer. 185–210. doi:10.1007/3-540-57182-5_12.
Zimmermann, Ede. 2000. Free choice disjunction and epistemic possibility. Natural Language Semantics 8(4). 255–290. doi:10.1023/A:1011255819284.
Chris Barker
10 Washington Place
New York, NY 10003, USA
http://homepages.nyu.edu/~cb125
Semantics & Pragmatics Volume 3, Article 6: 1–54, 2010 doi: 10.3765/sp.3.6
The semantics and pragmatics of plurals Donka F. Farkas Department of Linguistics, University of California at Santa Cruz
Henriëtte E. de Swart Department of Modern Languages, Utrecht University
Received 2008-11-14 / First Decision 2009-01-23 / Revised 2009-04-28 / Second Decision 2009-07-07 / Revised 2009-09-18 / Third Decision 2009-10-07 / Revised 2009-10-30 / Accepted 2009-11-21 / Final Version Received 2010-01-06 / Published 2010-03-30
Abstract This paper addresses the semantics and pragmatics of singular and plural nominals in languages that manifest a binary morphological number distinction within this category. We review the main challenges such an account has to meet, and develop an analysis which treats the plural morpheme as semantically relevant, and the singular form as not contributing any number restriction on its own but acquiring one when in competition with the plural form. The competition between singular and plural nominals is grounded in bidirectional optimization over form-meaning pairs. The main conceptual advantage our proposal has over recent alternative accounts is that it respects Horn’s ‘division of pragmatic labor’, in that it treats morphologically marked forms as semantically marked, and morphologically unmarked forms as semantically unmarked. In our account, plural forms are polysemous between an exclusive plural sense, which enforces sum reference, and an inclusive sense, which allows both atoms and sums as possible witnesses. The analysis predicts that a plural form is pragmatically appropriate only in case sum values are among the intended referents. To account for the choice between these two senses in context we invoke the Strongest Meaning Hypothesis, an independently motivated pragmatic principle. Finally, we show how the approach we develop explains some puzzling contrasts in number marking between English three/more children and Hungarian három/több gyerek (‘three/more child’), a problem that has not been properly accounted for in the literature so far.
Keywords: singular, plural, morphology, markedness, optimality theory, strongest meaning hypothesis, Hungarian
©2010 Farkas and de Swart
This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).
Farkas and de Swart
1 Atoms, sums and the inclusive/exclusive sum interpretation
1.1 Inclusive and exclusive interpretations of the plural
The question addressed in this paper is a simple one: What is the difference in meaning between singular and plural nominals in languages such as English, where this distinction is morphologically marked?¹ The issue then is to characterize the semantic difference between the pair in (1), as it pertains to information conveyed by the contrast in number.
(1) a. Mary saw a horse.
    b. Mary saw horses.
A disarmingly simple answer would be to say that singular nominals (such as a horse above) refer to one entity while plural nominals (such as horses) refer to more than one entity. Recast in more technical terms based on Link (1983), this answer is formulated in (2):
(2) a. Singular nominals refer within the domain of atoms.
    b. Plural nominals refer within the domain of sums.
In Link’s proposal, the domain of entities from which nominals take values has the structure of a join-semilattice whose atoms are ordinary individuals (in this case individual horses) and whose non-atomic elements are all the possible sums of more than one atom (in this case groups of more than one horse). Under the simple view, when a nominal is singular, the domain from which its referent is chosen is the set of atoms in the semilattice denoted by its head noun, while in case it is plural, its reference domain is the set of sums in that semilattice.
1 We use nominal here as a cover term for DPs and NPs. We limit the discussion to nominals in regular argument position, and ignore special uses in predication, incorporation, etc. (cf. de Swart & Zwarts 2009 and references therein for discussion of such constructions). Among the languages that manifest a singular/plural morphological distinction are the languages within the Germanic and Romance families as well as Finno-Ugric languages such as Hungarian and Finnish. We will not deal here with languages that make more fine-grained distinctions in number, involving duals or paucals (see Corbett 2000). Languages such as Mandarin Chinese that lack morphological number distinctions are briefly taken into consideration below but do not receive a full-fledged analysis in this paper but see Krifka (1995) and Rullmann (2003) for relevant proposals. Nor do we go into issues concerning non-morphological encoding of number information of the type discussed for Korean by Kwon & Zribi-Hertz (2006) or for Papiamentu and Brazilian Portuguese by Kester & Schmitt (2007).
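The simple view in (2) can be made concrete with a toy domain. The sketch below is one common set-theoretic rendering, offered as an assumption rather than as Link's own formulation: atoms are modeled as singleton sets, sums as sets with more than one member, and the join operation as set union. The three horses `h1`, `h2`, `h3` and all function names are hypothetical.

```python
from itertools import combinations


def link_domain(atoms):
    """All non-empty subsets of the atoms: singletons stand for atoms,
    larger sets stand for sums of more than one atom."""
    atoms = sorted(atoms)
    return [frozenset(c) for n in range(1, len(atoms) + 1)
            for c in combinations(atoms, n)]


def join(x, y):
    """Sum formation, modeled here as set union."""
    return x | y


def is_atom(x):
    return len(x) == 1


HORSES = {"h1", "h2", "h3"}          # hypothetical individual horses
DOMAIN = link_domain(HORSES)

# (2a): singular nominals range over atoms.
SINGULAR_DOMAIN = [x for x in DOMAIN if is_atom(x)]
# (2b): plural nominals range over sums.
PLURAL_DOMAIN = [x for x in DOMAIN if not is_atom(x)]

if __name__ == "__main__":
    print(len(DOMAIN), len(SINGULAR_DOMAIN), len(PLURAL_DOMAIN))
```

With three horses the domain has seven elements, three atoms and four sums, and the join of two distinct atoms is always a sum. On this rendering the simple view assigns a horse a referent from `SINGULAR_DOMAIN` and horses a referent from `PLURAL_DOMAIN`.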
Singulars and Plurals
The interpretation of the plural nominal in (1b) is labelled exclusive because its reference is restricted to sums, excluding atoms: (1b) is interpreted as claiming that Mary saw more than one horse. The classical challenge for the naïve view of the semantics of the plural is the existence of so-called inclusive plurals, exemplified in (3a-c). These are plural forms whose interpretation appears to be indifferent to the atom/sum divide in that the plural nominal is allowed to range over both atoms and sums. (3)
a. Have you ever seen horses in this meadow?
b. If you have ever seen horses in this meadow, you should call us.
c. Sam has never seen horses in this meadow.
Thus, a yes answer to (3a) normally commits the speaker to having seen one or more horses; in (3b), the addressee is expected to call even if she has seen a single horse in the meadow, and (3c) is judged false in case Sam has seen a single horse in the meadow. The existence of inclusive readings comes as an unpleasant surprise to the naïve view, which predicts that the plural forms in (3a)-(3c) are interpreted exclusively, just like the plural in (1b). Note next that even though plurals may receive an inclusive interpretation in questions and within the scope of negation, as shown in (3a-c), the distinction between singulars and plurals is not fully obliterated in these environments. This is illustrated by (4a) and (4b) taken from Farkas (2006) and Spector (2007) respectively, who note that the plural is distinctly odd in these examples because normally people have only one nose and only one father. (4)
a. Does Sam have a Roman nose/#Roman noses?
b. Jack doesn’t have a father/#fathers.
The contrast between (3a-c) and (4a-b) shows that a plural form remains sensitive to the atom/sum distinction, even in environments where it can be interpreted inclusively. A plural is always odd when sum values are pragmatically excluded from its domain of reference. Ideally, this property should follow from the account of the semantics and pragmatics of number interpretation without any specific stipulations. So far then we have established that an account of number interpretation has to explain why plural forms are susceptible to both exclusive and inclusive readings, and furthermore, one has to understand why particular linguistic environments favor one or the other shade of meaning, while at the same time predicting the sensitivity of plural forms to sum reference in all contexts. In
Farkas and de Swart
the rest of this section we establish some further conceptual and empirical challenges an adequate account of number must meet and discuss some of the most influential previous ways of dealing with them. In Section 2, which contains the core of our proposal, we give a semantics for the singular/plural contrast. In keeping with facts about overt morphology in the languages under consideration, we do not make use of a singular morpheme and therefore do not assign singular forms any inherent ‘singular’ semantics. The plural morpheme on the other hand is treated as contributing a polysemous meaning, with the inclusive and exclusive interpretations being its two related senses. The atomic reference of the singular comes about in our account as a result of the competition between singular and plural forms in the spirit of previous analyses but starting from opposite assumptions. This competition is modelled in bidirectional Optimality Theory. In Section 3 we account for the inclusive/exclusive interplay exemplified by the contrast between (1b) and (3a-c) by exploiting the Strongest Meaning Hypothesis, an independently motivated pragmatic principle. We also show that the analysis we propose predicts that a plural form always requires the possibility of sum witnesses, thus explaining the contrast in (4a-b) without any extra stipulation. Section 4 shows how the analysis of languages like English extends to an apparently puzzling use of singular forms with sum reference in Hungarian, while Section 5 sums up the results of the paper.
1.2 The strong singular/weak plural view
An immediate solution to the inclusive plural problem illustrated in (3a-c) is sketched in Krifka 1989. Plural forms, he suggests, are semantically indifferent to the atom/sum distinction while singular forms involve number semantics that imposes atomic reference. In this view, the plural is semantically “weak” in that it has no semantic contribution. The singular on the other hand, is semantically “strong” in that it imposes an atomic reference requirement. The ‘exclusive’ interpretation of the plural in sentences like (1b) is due, in Krifka’s view, to a pragmatic blocking effect. The existence of the semantically strong singular form blocks the use of the semantically weak plural when atomic reference is intended because of a pragmatic rule that forces the choice of a more specific form over a less specific one when the two are equally complex. Since a singular nominal is more specific than its plural counterpart, the singular has to be chosen whenever atomic reference is meant, thus excluding an atomic interpretation for plural forms.
This idea is worked out in detail in Sauerland 2003 and Sauerland, Anderssen & Yatsushiro 2005. In Sauerland et al. 2005, there are two number features, SG and PL, located syntactically in the head of a φP node, as in Figure 1, where *boy is a number-neutral predicate, insensitive to the atom/sum distinction:
[φP [φ SG/PL] [DP [D the] [NP *boy]]]
Figure 1 Number features in Sauerland et al. 2005
The proposed semantic contribution of the two number features is given in (5): (5)
Semantics of the singular/plural in Sauerland et al. 2005:
a. SING(x) is defined only if #x = 1
   SING(x) = x wherever it is defined
b. PLUR(x) is always defined
   PLUR(x) = x wherever it is defined
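The two definitions in (5) amount to a partial and a total identity function. A minimal Python sketch makes this concrete, under the assumption (ours, for illustration) that entities are encoded as frozensets of atoms so that #x is the size of the set; the exception class and the helper is_defined are hypothetical names, not part of Sauerland et al.'s proposal.

```python
class PresuppositionFailure(Exception):
    """Raised when a number feature is applied outside its domain."""


def sing(x):
    """SING(x): defined only if #x = 1; returns x wherever it is defined."""
    if len(x) != 1:
        raise PresuppositionFailure("SING requires #x = 1")
    return x


def plur(x):
    """PLUR(x): always defined; returns x."""
    return x


def is_defined(feature, x):
    """True if applying the feature to x does not fail its presupposition."""
    try:
        feature(x)
        return True
    except PresuppositionFailure:
        return False


# Hypothetical entities: one boy (an atom) and the sum of two boys.
one_boy = frozenset({"b1"})
two_boys = frozenset({"b1", "b2"})

print(is_defined(sing, one_boy), is_defined(sing, two_boys))  # True False
print(is_defined(plur, one_boy), is_defined(plur, two_boys))  # True True
```

The sketch shows why the plural counts as semantically weak on this view: PLUR is defined for atoms as well as sums, so any restriction of plurals to sums has to come from elsewhere.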
In this approach, the plural feature is semantically weak because it contributes nothing to the interpretation of the phrase it occurs in. The singular feature, on the other hand, is semantically strong because it contributes a presupposition of singleton (or atomic) reference. The exclusive reading of plurals exemplified in (1b) is derived with the help of the principle of Maximize Presupposition originally proposed in Heim 1991 to account for the non-uniqueness inference of indefinite DPs. If there is a choice between two alternative morphemes that differ only in that one has more presuppositions than the other, this principle requires speakers to choose the morpheme that has the most presuppositions satisfied in the context. Given the semantics in (5), Maximize Presupposition predicts that the plural form in (1b) is interpreted as exclusive since if the atomic presupposition of the singular had been met, Maximize Presupposition would have mandated the use of the singular form.
In order to account for the inclusive interpretation of plural forms in sentences such as (3a-c), Sauerland et al. weaken Maximize Presupposition as in (6): (6)
Maximize Presupposition applies to the scope of an existential if this strengthens the entire sentence.
Sauerland et al. treat indefinites as generalized quantifiers with existential force, and decompose no syntactically into an indefinite and negation. Presupposition maximization applied to the scope of the existential adds a condition that would make the entire utterance logically weaker when the existential occurs in a downward entailing environment. The generalization in (6) blocks this process, and thus predicts an inclusive reading for plural indefinites within the scope of negation and more generally, in downward entailing environments. There are, however, problems concerning the precise details of when and how Maximize Presupposition is suspended, for a discussion of which we refer the reader to Spector (2007: 267–271).
A different account is argued for in Spector 2007. Spector also posits two number features, a singular and a plural, but in this approach each has its own semantic contribution. The semantics of the singular feature imposes atomic reference while the semantics of the plural is inclusive. The exclusive plural interpretation of (1b) comes about as the result of a second-order scalar implicature denying the ‘exactly one’ reading of (1b).2 For details and contrasts in predictions between Sauerland’s position and Spector’s with respect to bare plurals in non-monotonic and universally quantified contexts, see Spector 2007.
The analysis we develop here shares with these earlier approaches the insight that the competition between singular and plural forms drives their interpretation in a process that intertwines semantics and pragmatics. The crucial difference between these earlier approaches and ours is that we treat the singular as semantically weak and the plural as semantically strong. Krifka (1989) and Sauerland et al. (2005) use blocking to derive the interpretation of the semantically weak plural given the existence of a semantically strong singular, whereas we posit a semantically strong plural, and use blocking to derive the interpretation of the semantically weak singular. This reversal is worth striving for because it reconciles the semantics of number with Horn’s division of pragmatic labor, an issue we turn to next.
2 Note that this implicature is independent of the suitability implicature that Cohen (2005) proposes to distinguish bare plurals from plurals with an overt indefinite determiner.
1.3 Reconciling number semantics with the Horn pattern
Any account in which the singular feature makes a semantic contribution while the plural does not forces one to distinguish between semantic and morphological markedness. It has long been known that there is a strong tendency for languages that have a singular/plural contrast in nominals to morphologically mark plural forms and leave singular forms morphologically unmarked (Greenberg 1966; Corbett 2000).3 Such languages have a morpheme used in plural nominals but no special singular morpheme, and therefore plural forms are morphologically marked while singulars are not. But under a strong singular/weak plural view, it is the singular that makes a semantic contribution while the plural is semantically vacuous. Thus, in Sauerland et al. 2005 the singular is morphologically unmarked, but semantically marked (cf. 5a), while the plural is morphologically marked and semantically unmarked (cf. 5b).4 It is, in fact, this very tension between semantic and morphological markedness that made the existence of inclusive plurals interesting in the first place. McCawley (1981) raises the question of how to reconcile the morphology and the semantics of number given the general tendency of language to pair morphologically unmarked forms with semantically unmarked meanings. The morphological asymmetry between singular and plural forms is also unexplained under the analysis in Spector 2007 because in that account both singular and plural features are semantically potent. Following van Rooij (2004) and others, we call the fundamental connection between semantic and morphological markedness Horn’s division of pragmatic labor or the Horn pattern, but note that this generalization has a long history that reaches back long before Horn’s work. Citing structuralist and Prague school views on markedness, Horn (2001: 155) describes it as follows: “. . . one member of an opposed pair is literally marked (overtly signaled) while the other is unmarked (signaled via the absence of an overt signal). Semantically, the marked category is characterized by the presence of some property P, while the corresponding unmarked category entails nothing about the presence or absence of P but is used chiefly (although not exclusively) to indicate the absence of P (Jakobson 1939)”.
3 Exceptions to this generalization exist, and are discussed in the literature. For instance, in some ‘singulative’ languages such as Welsh, a singular morpheme is used in instances where unmarked reference is to groups, or to unindividuated mass. Some instances of reversed markedness are addressed in de Swart & Zwarts (2010). We will be concerned here with the typologically most frequent pattern, where the plural is morphologically marked, but the singular is not.
4 For further discussion of the special status of number within nominal φ features with respect to semantic and morphological markedness within the assumptions of Sauerland (2003) and Sauerland et al. (2005), see Sauerland (2008).
Any strong singular/weak plural analysis involves an anti-Horn pattern because in such an approach the singular forms are assigned a strong semantics (requiring atomic reference, which plays the role of P above), while plural forms are given a weak interpretation, neutral with respect to whether values are chosen among atoms or sums. Recently, Bale, Gagnon & Khanjian (in press) have explicitly defended the anti-Horn pattern for number, claiming that the empirical data are only reconcilable with a negative correlation between morphological and semantic markedness. A central goal of the present paper is to challenge the anti-Horn view, and achieve a reconciliation of the semantics and the morphology of number, formulated in A: A.
Plural forms should be semantically marked relative to singular forms so as to preserve the correspondence between morphological and semantic markedness seen elsewhere in language.
Analyses that are in line with the Horn pattern in the sense that they treat the plural feature as making a semantic contribution while treating the singular as semantically vacuous are called here weak singular/strong plural approaches. They are preferable on theoretical grounds to their competitors because they explain the asymmetry in number morphology in languages that have a plural but no singular morpheme and thus reconcile morphological and semantic markedness. Endowing the plural morpheme with a semantic contribution and deriving the interpretation of singular forms from the absence of the plural morpheme makes sense of the systematic morphological asymmetry between singular and plural forms. The existence of inclusive plural readings constitutes the main empirical challenge for the weak singular/strong plural approach. Before we address this problem and offer a solution, we present data from Hungarian that appears puzzling for a strong singular/weak plural account but not for a weak singular/strong plural view.
1.4 Cross-linguistic challenges
In this subsection we review two sets of facts that add further challenges to any account of number interpretation. The first comes from Hungarian, a language that displays a pattern of number marking that raises an empirical challenge to approaches that treat singular forms as requiring atomic reference. Just like English and other Indo-European languages, Hungarian has a singular/plural distinction:
(7) [Hungarian]
a. Mari látott egy lovat.
   Mari saw a horse.acc
   ‘Mari saw a horse.’
b. Mari látott lovakat.
   Mari saw horse.pl.acc
   ‘Mari saw horses.’
There is no special morphology marking singular forms, while the plural feature is realized by the morpheme -(a)k.5 Example (8b) shows that in Hungarian, just like in English, verbs must agree with their subjects in number, and that -(a)k realizes the plural feature on verbs as well: (8)
a. A gyerek elment.
   the child leave.past
   ‘The child left.’
b. A gyerekek elmentek / *elment.
   the child.pl leave.past.pl / leave.past
   ‘The children left.’
Hungarian is like English also in that plural nominals may have inclusive uses. In (9a) for instance, the addressee is expected to give a positive answer even if she saw a single horse, (9b) claims that Anna has not seen one or more horses, and (9c) asks the addressee to say something if she saw one or more horses.6
(9)
a. Láttál valaha lovakat?
   see.past.II ever horse.pl.acc
   ‘Have you ever seen horses?’
b. Anna nem látott lovakat.
   Anna not see.past horse.pl.acc
   ‘Anna hasn’t seen horses.’
c. Ha láttál valaha lovakat, szólj.
   if see.past.II ever horse.pl.acc say.imp
   ‘If you have ever seen horses, say so.’
5 The vowel a is in parentheses here because in many phonological analyses it is treated as epenthetic. The quality of this vowel is determined by vowel harmony as well as by morphological considerations that are irrelevant for our purposes.
6 The contrast between inclusive and exclusive plurals in Hungarian is complicated by the fact that in this language bare nominals (whether singular or plural) can incorporate, an issue discussed at length in Farkas & de Swart 2003. Incorporated singulars are number neutral, while incorporated plurals have sum reference. We will not be concerned with incorporated nominals in this paper, but only note that our proposals are compatible with the analysis of incorporation proposed by Farkas & de Swart (2003).
The problematic data concern DPs whose determiner entails reference to sums, including but not limited to cardinals bigger than one, exemplified in (10): 7 (10)
a. három gyerek / *három gyerekek
   three child / three child.pl
   ‘three children’
b. sok gyerek / *sok gyerekek
   many child / many child.pl
   ‘many children’
c. mindenféle gyerek / *mindenféle gyerekek
   all.kind child / all.kind child.pl
   ‘all kinds of children’
d. több gyerek / *több gyerekek
   more child / more child.pl
   ‘more children’
e. egy pár gyerek / *egy pár gyerekek
   a couple child / a couple child.pl
   ‘a couple of/some children’
As one can see from these examples, such DPs must be morphologically singular. Note that these cases involve not only cardinal numerals but other types of Ds as well. Therefore, no analysis specific to cardinals, such as the one proposed in Ionin & Matushansky 2006, can cover all the relevant examples. That these DPs are semantically plural can be seen from the fact that they may occur as subjects of verbs like összegyűlni ‘to gather’, as seen in (11a). The fact that they are not necessarily distributive, and therefore that they are, or at least, can be, referential is shown in (11b-d).8 The data are the same for all the D types exemplified in (10). (11)
a. Sok gyerek gyűlt össze a téren.
   many child gather.past part the square.on
   ‘Many children gathered in the square.’
b. Három gyerek felemelt egy zongorát.
   three child lift.past a piano.acc
   ‘Three children lifted a piano.’
c. A három gyerek elérte a plafont.
   the three child reach.past the ceiling.acc
   ‘The three children reached the ceiling.’
d. *Három gyerek_i azt hiszi, hogy ő_i a legjobb.
   three child it.acc believe that III the best
   ‘Three children_i think that he_i is the best’
7 Hungarian is not the only language that displays this pattern of number marking, but it is sufficient to work out the data for one particular language to make the relevant theoretical point.
8 We are grateful to an anonymous reviewer for drawing our attention to the data in (11b-d).
Example (11b) is most naturally interpreted as involving a single lifting of a single piano in which three children participated together. (11c) can be interpreted as involving the reaching of the ceiling by one child helped along by her two teammates. In (11d) we see that just like in the English equivalent, *Three children_i think that he_i is the best, these DPs cannot bind singular pronouns. Note that these properties distinguish the DPs in (10) and (11) from necessarily quantificational, non-referential DPs such as those headed by each/mindegyik in English and Hungarian respectively. The data in (10) are compatible with Rullmann & You’s (2003) semantics of number, but their analysis of the Hungarian singular as number neutral associates exclusive sum reference with the plural morpheme, and therefore does not account for the inclusive interpretation of the plural in (9). The analyses in Sauerland et al. 2005 and Spector 2007 have no problem with the inclusive plural interpretation in (9) but the singular form of the DPs in (10) is problematic, given that the atomic reference semantics that the singular is crucially supposed to have is violated here. Next, note that we cannot simply treat these forms as involving the presence of a [Pl] feature within the nominal that happens not to be realized on the head N. Such an analysis would predict that these DPs trigger plural verb agreement when in subject position, a prediction that is not borne out, as shown in (12): (12)
a. Három gyerek elment / *elmentek
   three child leave.past / leave.past.pl
   ‘Three children left.’
b. Mindenféle gyerek jelentkezett / *jelentkeztek
   all.kind child apply.past / apply.past.pl
   ‘All kinds of children applied.’
Finally, note that the DPs in (10) are similar to plural DPs in that discourse pronouns referring back to them must be plural. If (13) is the continuation of (12a) and if the three children are to be the antecedent of the direct object pronoun, that pronoun must be plural.
(13) Mari nem látta őket / *őt.
   Mari not see.past III.pl.acc / III.acc
   ‘Mari didn’t see them.’
These observations show that the DPs in (10) are semantically plural in that they refer to sums but that morphologically, they are singular. This characterization accounts for the data under the assumption that Subject-Verb agreement is sensitive to the morphological feature of the DP, while the form of a discourse pronoun is sensitive to the semantics of its antecedent (see Farkas & Zec 1995 for discussion). The morphology explains the intrasentential agreement pattern these DPs trigger while their semantics explains the form of the discourse pronouns for which they serve as antecedents. We see then that in Hungarian, singular forms must be used in certain cases of sum reference, a situation that is problematic for any strong singular view. The challenge raised by the Hungarian data reviewed here is formulated in B. B.
There are languages with a morphological singular/plural distinction in nominals, where singular forms may have sum reference in case sum reference is entailed by the determiner.
The account of the contrast between English and Hungarian we offer in Section 4 below differs in empirical coverage from that found in Sauerland et al. 2005 and Rullmann & You 2003 in that we capture the similarities between the two languages when it comes to the interpretation of ordinary plural forms in (7, 8, 9) as well as the differences between them when it comes to the DP types exemplified in (10). The second cross-linguistic empirical problem we consider is raised by languages such as Mandarin Chinese that do not have a morphological contrast between singular and plural forms. Nominals unmarked for number in such languages get a number neutral interpretation, as emphasised by Krifka (1995), on the basis of examples such as (14): (14)
Wò kànjiàn xióng le. [Mandarin Chinese]
I see bear asp
‘I saw a bear/some bears.’
The empirical generalization we draw from contrasting the interpretation of non-plural forms in English-type languages and Chinese-type languages is formulated in C. C.
In languages which lack morphological number marking on nominals, unmarked forms are number neutral.
We capture this generalization below but will not work out the semantics of Chinese nominals since we focus here on languages that have a morphological number distinction. Our approach to these languages is compatible with Rullmann & You’s semantics of Mandarin.
We have seen in this section that the naïve (and attractive) view of number interpretation we started with, according to which singular nominals refer to atoms and plural ones refer to sums, faces two stumbling blocks: (i) the existence of cases in which plural nominals are interpreted inclusively; (ii) the number marking system in languages like Hungarian, where certain singular nominals must receive a non-atomic interpretation. Retreating to a view according to which the singular is semantically potent while the plural is semantically empty runs against the Horn pattern of markedness, and has difficulty with the Hungarian data as well. In the remainder of this paper, we work out an account of number interpretation which:
i. accounts for the existence of the inclusive as well as the exclusive interpretation of plurals;
ii. respects the Horn pattern and is in line with the morphological markedness facts (generalization A);
iii. predicts when inclusive interpretations are possible (1b vs. 3 and 7b vs. 9);
iv. predicts the possibility of certain singular forms referring to sums in languages like Hungarian (generalization B, examples 10–13);
v. is compatible with the number neutral interpretation of nominals in languages that do not have a morphological mark for plurals, such as Mandarin Chinese (generalization C, example 14).
We focus here on the semantics of number interpretation on nominals in regular argument position (cf. footnote 1), and do not discuss issues concerning the feature [Pl] when it occurs as an agreement feature on verbs and VPs for instance. We concentrate on N-headed nominals, and leave detailed discussion of pronouns and coordinate DPs for future work.
2 The semantics of singular and plural nominals
In this section we give our account of the interpretation of the plural feature and its associated morpheme in the languages under consideration and derive the interpretation of singular forms based on it. We start from what we consider the null hypothesis, according to which, in languages with a binary number distinction, there is a single, privative morphological feature [Pl] in nominals and no singular feature. We assume that this feature is generated in NumP, a node that is dominated by DP. We give the feature [Pl] a polysemous semantics and derive the restriction of singular nominals (i.e., nominals that lack the feature [Pl]) to atomic reference under bidirectional optimization. The bidirectional OT model we use is based on Mattausch 2005, 2007, a set-up that captures the harmonization of unmarked forms with unmarked meanings, and of marked forms with marked meanings.9
2.1 Bi-directional optimization over form-meaning pairs
Our analysis is cast in the framework of Optimality Theory (OT), a theory that defines well-formedness in terms of optimization over a set of output candidates for a particular input. OT syntax, for instance, defines grammaticality as the optimal form that conveys a particular meaning, and thus represents the speaker orientation (production). OT semantics picks the optimal interpretation of a given form as the meaning construed by the hearer for that form (comprehension). Bidirectional OT deals with the syntax-semantics interface by combining the two directions in an optimization process over form-meaning pairs (Hendriks, de Hoop, de Swart & Zwarts 2010). This framework is appropriate for the problem at hand because it allows us to treat the interpretation of singular and plural nominals in tandem, as a matter of competition between the two forms. As was made clear in the previous section, we treat as fundamental to the enterprise the fact that plural forms are morphologically marked and singular forms are not. 
Bidirectional OT is particularly useful to us because Mattausch (2005, 2007) has already worked out in this framework an abstract way of modeling the association of forms and meanings as an optimal communication strategy that captures the Horn pattern. His proposal can be applied to our problem in a straightforward way.
9 Mattausch’s work goes back to ideas developed by Jäger (2003) and Blutner (1998, 2000, 2004). For a slightly different bidirectional OT set-up, see Beaver 2002. For a comparison of different bidirectional OT models, see Beaver & Lee 2004.
The gist of Mattausch’s system is the following. Suppose there are two forms, one overtly marked (m), the other unmarked (u) and suppose there are two meanings, an unmarked (more frequent, or simpler) meaning α and a marked (less frequent, or more complex) meaning β. Their combination leads to four possible form-meaning pairs: ⟨u, α⟩, ⟨m, α⟩, ⟨u, β⟩, ⟨m, β⟩. How do we determine which pairs are the optimal, most harmonic ones? As a starting point, Mattausch posits four bias constraints, one for each of the possible form-meaning pairings: (15)
Bias constraints
*m, α: the (marked) form m is not related to the (unmarked) meaning α.
*m, β: the (marked) form m is not related to the (marked) meaning β.
*u, α: the (unmarked) form u is not related to the (unmarked) meaning α.
*u, β: the (unmarked) form u is not related to the (marked) meaning β.
These constraints penalize all possible form-meaning combinations. They become operative only because they are differentially ranked relative to a general markedness constraint, *Mark, a constraint that penalizes the use of the marked form. This constraint models a notion of economy that prefers simpler forms over more complex ones. All constraints are soft, but the ease with which they can be violated depends on their relative strength. The ranking of *Mark with respect to the bias constraints reflects the balance of economy considerations relative to faithful correspondence relations between forms and meanings in the process of optimal communication. Mattausch derives the ranking of the bias constraints relative to the markedness constraint from iterated learning over several generations in a computational learning model based on frequency distributions (cf. Kirby & Hurford 1997). Comparisons of forms and meanings trigger the promotion or demotion of constraints. If the marked meaning is less frequent, iterated learning over several generations with the four bias constraints leads to a stochastic OT grammar in which the ranking of the bias constraints mirrors the frequency distribution of the meanings α and β. The central idea we take over from Mattausch’s work is that the relative ordering of the bias constraints and the markedness constraint is such as to result in an absolute preference for the association of the unmarked meaning
with the unmarked form and the association of marked meaning with the marked form, thereby capturing Horn’s division of pragmatic labor. The universal constraint ranking Mattausch derives is given in (16): (16)
{*u, β; *m, α} ≫ *Mark ≫ {*u, α; *m, β}.
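How a ranking of this shape picks out the Horn pattern can be sketched in a few lines of Python. The sketch rests on simplifying assumptions that are ours rather than Mattausch's: the ranking is treated as categorical instead of stochastic, the constraints grouped inside one pair of braces are treated as equally ranked, and a pair counts as optimal when no competitor sharing its form or its meaning has a better violation profile (strong bidirectional optimality). The names alpha and beta stand in for α and β.

```python
FORMS = ("u", "m")            # unmarked / marked form
MEANINGS = ("alpha", "beta")  # unmarked / marked meaning

# Ranking (16) as strata, strongest first. Constraints inside one stratum
# are treated as equally ranked (an assumption of this sketch).
STRATA = [
    {("u", "beta"), ("m", "alpha")},  # high-ranked bias constraints
    {"*Mark"},
    {("u", "alpha"), ("m", "beta")},  # low-ranked bias constraints
]


def violations(form, meaning):
    """Violation profile of a form-meaning pair: one count per stratum."""
    incurred = {(form, meaning)}  # the bias constraint against this pair
    if form == "m":
        incurred.add("*Mark")     # marked forms always violate *Mark
    return tuple(len(incurred & stratum) for stratum in STRATA)


def bidirectionally_optimal():
    """Pairs that no competitor with the same form or same meaning beats."""
    winners = []
    for f in FORMS:
        for m in MEANINGS:
            profile = violations(f, m)
            # Tuples compare lexicographically, so a smaller profile means
            # fewer violations of higher-ranked strata.
            best_form = all(profile <= violations(f2, m) for f2 in FORMS)
            best_meaning = all(profile <= violations(f, m2) for m2 in MEANINGS)
            if best_form and best_meaning:
                winners.append((f, m))
    return winners


print(bidirectionally_optimal())  # [('u', 'alpha'), ('m', 'beta')]
```

Only the pairing of the unmarked form with the unmarked meaning and of the marked form with the marked meaning survives in both directions.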
Marked forms always violate *Mark, so under the ranking in (16), they only appear with the marked meaning β. Mattausch (2005, 2007) derives the emergence of Horn’s division of pragmatic labor as the optimal communication strategy that arises under evolutionary pressure.
2.2 Morphological and semantic markedness in the domain of number
Before we can apply Mattausch’s abstract model to the interpretation of singular and plural nominals, we have to establish which forms and which meanings correspond to u, m, α and β in the domain of number, and we have to establish what the relevant markedness constraint is. Concerning formal markedness, recall that we are concerned with the typologically frequent pattern in which the plural is morphologically marked, and the singular remains unmarked. Because of this asymmetry, the singular is the unmarked form u, and the plural is the marked form m. On this point the literature is in agreement: see Sauerland 2008 and Bale et al. (in press) for recent discussion. We differ from previous approaches, however, in adopting the null hypothesis and taking plural morphology to mark the presence of the privative feature [Pl], and not positing a singular feature or a null singular morpheme. Establishing semantic markedness is a more delicate matter because there are several distinct parameters along which it can be defined, besides frequency. We mention here some major contenders. Denotational markedness involves the subset relation between the denotation of an item i and that of an item i’. For instance, the lexical item dog is denotationally unmarked relative to the lexical item bitch. Conceptual markedness concerns the nature of the denotation of two items: the denotation of the unmarked item i is conceptually simpler than the denotation of the marked item i’. 
In temporal semantics, for instance, the present is perceived as conceptually less marked than the past and the future since the latter two are defined in terms of the former (Lakoff 2000: 44). Finally, we distinguish a third type of semantic markedness, semantic complexity according to which an item i is less marked than an item i’ iff i’ is associated with a semantic requirement that is lacking
6:16
Singulars and Plurals
in i. For example, the definite article is semantically more complex than the indefinite one under analyses where the definite article is associated with a uniqueness requirement while the indefinite article is neutral in this respect (see Heim 1991 and Farkas 2006). In discussions of semantic markedness in the domain of number, the notion of denotational markedness has dominated. As we have seen already, Sauerland et al. (2005) take the plural to denote within the entire domain of the nominal (= *N, cf. Figure 1 above) while Bale et al. (in press) assign the plural feature an augmentative semantics, which takes the join of all atoms and sums in the semi-lattice of N. As a result, the denotation of the singular (which has to have atomic reference) is a strict subset of the denotation of the plural in both proposals. The same is true for the account in Spector 2007. Approaches which rely on denotational markedness alone then lead to an anti-Horn analysis. However, this is not the only way one can go in relating number interpretation to the Horn pattern. Our analysis of number is grounded in a notion of markedness in terms of semantic complexity. No singular feature is posited for a singular nominal, and no inherent number semantics is assigned to this form. Plural nominals are assumed to involve an overt plural feature [Pl] realized by a plural morpheme whose semantic contribution concerns the atom/sum distinction. If a singular nominal is not inherently associated with any number semantics while a plural nominal comes with such a constraint, the singular form qualifies as semantically unmarked relative to the plural with respect to semantic complexity. In addition, in terms of conceptual complexity, we take atomic reference to be less marked than sum reference. 
We follow Link (1983) in taking the domain of interpretation from which variables are assigned values to consist of atoms and sums, where the latter are built from the former by means of the join operation ⊕. Given that atoms may exist independently of sums, but not the other way around, a nominal that denotes within the domain of sums is conceptually more marked than one that denotes within the domain of atoms only. Support for this conceptual markedness view is found in psychological research that points to the special nature of sum reference. Recent psychological research suggests that non-human primates and children under two represent small sets of objects as object-files, and do not establish a singular-plural distinction based on atoms vs. sums (Hauser, Carey & Hauser 2000, Feigenson, Carey & Hauser 2002, Feigenson & Carey 2003, 2005). The
Farkas and de Swart
evidence comes from a variety of non-linguistic tasks.¹⁰ Wood, Kouider & Carey (2004) and Kouider, Halberda, Wood & Carey (2006) assume that by the time children learn the meanings of linguistic markers for the singular-plural distinction, they must have distinguished between singletons and sets (or sums, in our terms). They do indeed find that children over two who have started to produce the plural marker understand that it signals reference to multiple objects.¹¹ Whereas representations of individual objects and object arrays are available from ten months onward, the representation of multiple objects as sums is paired up with the acquisition of linguistic markers of plurality (around 24 months). The psycholinguistic evidence supports the view that sum reference is conceptually marked as opposed to reference to atoms.
¹⁰ For example, infants watch while sets of crackers are placed into two different buckets. When encouraged to crawl to one of the buckets, infants reliably choose the bucket with more crackers with numbers up to three. When one set of crackers exceeds three (in four vs. two, six vs. three or even four vs. one comparisons), infants up to 20 months are at chance. All they would have to do to succeed on a one vs. four comparison is to represent one as a singular individual and four as a plurality, but they fail to do so, and do not show a preference for the bucket containing more crackers. The three-item limit is expected when infants’ representation is object-based, as the object-file system is assumed to be subject to the working memory limit of three to four items.
¹¹ The experiments use a preferential looking paradigm, and test sensitivity to number expressed on the verb (is/are), on the noun (using nonsense words, e.g. the blicket/the blickets) and with quantifiers (a blicket/some blickets). The results suggest that learning the force of number marking on linguistic expressions strongly correlates with the conceptual distinction between sets (sums) and individuals (atoms).
In our view, the interpretation of nominal number concerns restricting the domain from which witnesses of a nominal can be chosen in terms of the atom/sum distinction. The conceptually marked reference is one that includes sums, i.e., one that allows possible sum witnesses. Allowing sums within the domain of reference is then the crucial markedness parameter when it comes to number interpretation in the languages under consideration. Consequently, the denotational space of nominals is divided into two subdomains, one that includes sums and one that excludes them. Nominals that refer in the latter domain have exclusive atom reference, while nominals that refer within the former have sum reference. Next, note that there are two ways in which a nominal can have sum reference: (i) its reference may be restricted to sums (excluding atoms), a case we call exclusive sum reference, and (ii) its reference may include sums but not exclude atoms, a case we call inclusive sum reference. Thus, the formally marked plural form may fulfil
the requirement of being associated with the conceptually marked number interpretation by being associated with either inclusive or with exclusive sum reference. In the latter case all witnesses must be sums while in the former case some witnesses must be sums. The formally unmarked singular form is conceptually unmarked when its denotation excludes sum reference, i.e., when it denotes exclusively in the realm of atoms. When it comes to denotational markedness, matters are complicated because the domain of atoms is, of course, a subset of the domain of atoms and sums. Note, however, that in our account, singular nominals do not have a number feature of their own and thus do not involve inherent number semantics. Plural forms on the other hand are treated as having a number feature whose semantics forces them to include sums within their denotation and thus a plural form cannot denote exclusively in the realm of atoms. We suggest that because of the competition with plural forms, the denotation of singular nominals ends up being restricted to the complement of the denotation of plurals, i.e., singular nominals end up being interpreted as denoting in the exclusive atom realm. More specifically, the account we propose involves the formally and conceptually marked plural form blocking the formally and conceptually unmarked singular form from being interpreted as having sums within its domain. When the competition between the singular and the plural is inoperative, however, singular forms are number neutral, and can, in principle, denote in any subdomain of the lattice. Hungarian sum denoting singulars are a case in point. What is impossible, according to our account, is for a plural form to have exclusive atomic reference, a situation that is indeed unattested as far as we know. In the next subsection we make these proposals concrete by implementing Mattausch’s abstract system of the pairing of form and meaning in the domain of number. 
The core operative concept for number interpretation in our system is conceptual markedness according to which atomic reference is the unmarked meaning α, whereas sum reference (whether inclusive or exclusive) is the marked meaning β.
2.3 Distribution of atomic/sum reference over singular/plural nominals
The forms we are dealing with in the bidirectional optimization process are morphologically singular and plural nominals, which we denote by sg and pl. Sg here is short for a DP that has no number feature in its NumP, while pl is short for a nominal that has the feature [Pl] in NumP. The interpretations
associated with these forms are atomic reference and inclusive or exclusive sum reference respectively, which we denote by at and i/e sum. The bias constraints for number are given in (17):
(17) Bias constraints for number:
*pl, at: a plural nominal does not have atomic reference.
*pl, i/e sum: a plural nominal does not have inclusive/exclusive sum reference.
*sg, at: a singular nominal does not have atomic reference.
*sg, i/e sum: a singular form does not have inclusive/exclusive sum reference.
The markedness constraint on forms is *functN, the constraint proposed by de Swart & Zwarts (2008, 2009, 2010) as the central economy constraint in the nominal domain:
(18) *functN: avoid functional structure in the nominal domain
*functN prefers ‘bare’ nominals without articles, number morphology, classifiers, etc. over nominals involving the functional features or projections that host these expressions. The elaborate structures we find in the nominal domain support the view that *functN is a soft, violable constraint that can be overruled by faithfulness constraints driving the expression of sum reference, discourse referentiality, definiteness, etc. (cf. de Swart & Zwarts 2008, Hendriks et al. 2010: chapter 7). However, its influence is pervasive, even in languages in which such faithfulness constraints are ranked high, as argued by de Swart & Zwarts (2009) in relation to a range of bare nominal constructions in Germanic and Romance languages. The crucial ordering that emerges under the assumption that i/e sum reference is semantically more marked than atomic reference is in (19):
(19) The Horn pattern for number
{*sg, i/e sum; *pl, at} ≫ *functN ≫ {*pl, i/e sum; *sg, at}
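The effect of a ranking like (19) can be checked mechanically. The sketch below is a minimal illustration, not the authors' implementation. It assumes that constraints within one stratum are unranked and pools their violations, that only the plural form violates *functN, and that the three candidate meanings can be encoded as at, at+sum (inclusive sum reference) and sum (exclusive sum reference). A form-meaning pair counts as optimal when no competing form for the same meaning and no competing meaning for the same form has a strictly better violation profile. All function names are hypothetical.

```python
from itertools import product

FORMS = ("sg", "pl")
MEANINGS = ("at", "at+sum", "sum")  # atomic, inclusive sum, exclusive sum


def profile(form, meaning):
    """Violation profile of a pair under ranking (19), one count per stratum."""
    sum_ref = meaning != "at"  # inclusive or exclusive sum reference
    # Top stratum: *sg, i/e sum and *pl, at (assumed mutually unranked).
    top = int(form == "sg" and sum_ref) + int(form == "pl" and not sum_ref)
    # Middle stratum: *functN, assumed to be violated by [Pl] only.
    funct_n = int(form == "pl")
    # Bottom stratum: *pl, i/e sum and *sg, at (assumed mutually unranked).
    low = int(form == "pl" and sum_ref) + int(form == "sg" and not sum_ref)
    return (top, funct_n, low)  # tuples compare stratum by stratum


def optimal_pairs():
    """Pairs that no rival form (same meaning) or rival meaning (same form) beats."""
    winners = []
    for form, meaning in product(FORMS, MEANINGS):
        own = profile(form, meaning)
        beaten_by_form = any(profile(f, meaning) < own for f in FORMS)
        beaten_by_meaning = any(profile(form, m) < own for m in MEANINGS)
        if not (beaten_by_form or beaten_by_meaning):
            winners.append((form, meaning))
    return winners


print(optimal_pairs())
# [('sg', 'at'), ('pl', 'at+sum'), ('pl', 'sum')]
```

Under these assumptions the singular pairs up with atomic reference and the plural with inclusive or exclusive sum reference, while the pairing of a plural form with atomic reference loses to the singular.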
The ranking in (19) captures the insight that nominals marked with the feature [Pl] must include sums within their domain of reference, and that the interpretation of a singular form (when in competition with a plural) is atomic reference. In line with Horn’s division of pragmatic labor, the ranking in (19) pairs up marked plural forms with marked sum reference meanings, and unmarked singular forms with unmarked atomic meanings. The optimization over form-meaning pairs under this ranking is spelled out in the bidirectional
                   *sg, i/e sum : *pl, at | *functN | *pl, i/e sum : *sg, at
✌ ⟨sg, at⟩                      :         |         |              :    ∗
  ⟨sg, at ∪ sum⟩         ∗      :         |         |              :
  ⟨sg, sum⟩              ∗      :         |         |              :
  ⟨pl, at⟩                      :    ∗    |    ∗    |              :
✌ ⟨pl, at ∪ sum⟩                :         |    ∗    |      ∗       :
✌ ⟨pl, sum⟩                     :         |    ∗    |      ∗       :
Tableau 1 Optimization over singular/plural form-meaning pairs
Tableau 1, where singular and plural forms are paired up with their respective domain of interpretation in the lattice. All possible form-meaning combinations are listed in the first column, and constitute the input to the bidirectional optimization process. The interpretations that particular forms are paired up with restrict the possible witnesses of the nominal. Atomic reference, represented as at, limits possible witnesses to atoms only. Exclusive sum reference, represented as sum, limits possible witnesses to sums only. Inclusive sum reference, represented as at ∪ sum, allows witnesses to be chosen both from the domain of atoms and that of sums. The four bias constraints, plus the markedness constraint *functN, are ranked across the top, where the left-right order reflects a decreasing order of strength, and follows the ranking in (19). The two bias constraints *sg, i/e sum and *pl, at are ranked above the markedness constraint *functN, but their mutual order is irrelevant, which is reflected in the dotted line between the two columns. Similarly, (19) requires the two constraints *pl, i/e sum and *sg, at to be both ranked below *functN, but their mutual order is irrelevant, as marked by the dotted line. Because of the set-up with the bias constraints, all form-meaning combinations incur one or more violations, marked by an asterisk ∗ in the relevant cell. The schema in (19) ranks the bias constraints penalizing the combination of singular forms with (inclusive or exclusive) sum reference and the combination of plural forms with atomic reference above the markedness constraint *functN, which is what drives the optimization over form-meaning pairs in Tableau 1. The constraints militating against the combination of plural forms with (inclusive or exclusive) sum reference, or the combination of singular
forms with atomic reference are ranked below *functN, and are de facto inactive in the optimization process.¹² Tableau 1 shows that we assign the (unmarked) singular form the (unmarked) meaning of atomic reference under strong bidirectional optimization, because ⟨sg, at⟩ constitutes a bidirectionally optimal pair (✌): there is no better form to convey atomic reference, and there is no better meaning to associate with a singular form. The expression of sum reference calls for the use of a plural form. Both sum and at ∪ sum qualify as sum reference, so plural forms have exclusive or inclusive sum reference. Accordingly, both ⟨pl, sum⟩ and ⟨pl, at ∪ sum⟩ qualify as bidirectionally optimal pairs (✌). Crucially, however, a plural form cannot be used in case sums are not part of the meaning to be expressed, because ⟨pl, at⟩ is suboptimal. In line with Horn’s division of pragmatic labor then, unmarked forms pair up with unmarked meanings, and marked forms pair up with marked meanings. Given this analysis, singular nominals have exclusive atomic reference when in competition with the plural, while plural nominals have (inclusive or exclusive) sum reference.¹³
¹² Technically, either the set of four bias constraints or the combination of *functN with the bias constraints ranked above it (i.e. *sg, i/e sum and *pl, at) is sufficient to obtain three bidirectionally optimal pairs in the ordinal Tableau 1. That is, leaving out either *pl, i/e sum and *sg, at or *functN would not change the outcome of the optimization process. However, in Mattausch’s system, we need a markedness constraint in the learning system in order to derive a 100% form-meaning distribution in the stochastic grammar. Note also that *functN plays a key role in the unidirectional optimization in Section 4.
¹³ The crucial difference between λx [x ∈ Sum ∪ Atom & *P(x)] and λx *P(x) is precisely the fact that the former is semantically plural, necessitating the possibility of sum reference while the latter is number neutral and thus truly insensitive to the atom/sum divide. We have seen in examples (4a-b) that the plural is indeed not insensitive to the atom/sum divide, and will work out in Section 3.3 an analysis of choice of form that brings out the relevance of sum reference for plurals.
We are proposing here a weak singular/strong plural account in which plurals are formally marked with a feature that is interpreted in compositional semantics, as spelled out in (20), while singular nominals have no explicit number feature and are restricted to atomic reference only as a result of the competition with the plural form. We capture this asymmetry by assuming that the interpretation of the feature [Pl] is as given in (20), where *P is the number neutral property denoted by the head noun and its complement (cf. Section 1.2 above). For any given occurrence of a plural form, either (20a) or
(20b) holds:
(20) a. Pl = λx λ*P [x ∈ Sum ∪ Atom & *P(x)]
     b. Pl = λx λ*P [x ∈ Sum & *P(x)]
This interpretation ensures that the denotational space of plural nominals will always include sums, whether inclusively, as in (20a), or exclusively, as in (20b). It is this property that makes the plural forms marked relative to singulars, in terms of Horn’s characterization of markedness in Section 1. Crucially, the two interpretations in (20) are semantically related since (20b) asymmetrically entails (20a): whenever a witness meets the condition in (20b), it also meets the condition in (20a) but not the other way around. This, therefore, is a case of polysemy rather than one of arbitrary ambiguity. The semantics in (20) leads to the truth conditions of sentences like (1b) and a simplified version of (3a) (repeated here as 21a and 22a) as in (21b) and (22b), respectively:
(21) a. Mary saw horses. [exclusive plural]
     b. ∃x : [x ∈ Sum & *Horse(x)] [See(m, x)]¹⁴
(22) a. Have you seen horses? [inclusive plural]
     b. ?[∃x : [x ∈ Sum ∪ Atom & *Horse(x)] [See(addressee, x)]]
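The two senses in (20) and the entailment between them can be illustrated on a toy model. The sketch below rests on modelling assumptions that are not part of the text: individuals are non-empty sets of atoms (a singleton is an atom, a larger set is a sum), *P is the closure of a set of atoms under join, and seeing a sum is taken to mean seeing each of its atomic parts. The function names and the sample situation are hypothetical.

```python
from itertools import combinations


def star(atoms):
    """*P: all non-empty joins of the given atoms, modelled as frozensets."""
    atoms = sorted(atoms)
    return {frozenset(c)
            for size in range(1, len(atoms) + 1)
            for c in combinations(atoms, size)}


def is_atom(x):
    return len(x) == 1


def is_sum(x):
    return len(x) > 1


def pl_inclusive(x, star_p):
    """(20a): x is an atom or a sum, and *P(x)."""
    return (is_sum(x) or is_atom(x)) and x in star_p


def pl_exclusive(x, star_p):
    """(20b): x is a sum, and *P(x)."""
    return is_sum(x) and x in star_p


def exists(restrictor, scope, domain):
    """Restricted existential quantification: some x satisfies both parts."""
    return any(restrictor(x) and scope(x) for x in domain)


horses = star({"h1", "h2", "h3"})
seen_by_mary = {"h1"}  # hypothetical situation: Mary saw exactly one horse


def see_mary(x):
    # Assumption: a sum is seen iff all of its atomic parts are seen.
    return x <= seen_by_mary


exclusive_true = exists(lambda x: pl_exclusive(x, horses), see_mary, horses)
inclusive_true = exists(lambda x: pl_inclusive(x, horses), see_mary, horses)
print(exclusive_true, inclusive_true)  # False True
```

In this situation the exclusive reading is false and the inclusive one is true, and every individual that satisfies (20b) also satisfies (20a) but not vice versa, which is the asymmetric entailment described for (20).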
Thus, nominals with the feature [Pl] are incompatible with the conceptually unmarked meaning, namely exclusive atomic reference. The reference of such forms is restricted by the contribution of the feature [Pl] to a domain that includes sums either to the exclusion of atoms or not. In Section 3 we exploit the entailment relation between these two senses to account for the pragmatic factors that play a role in choosing one sense over the other. Since our analysis appeals to the feature [Pl] but has not implicated particular determiners, it carries over straightforwardly to the definite plural in (23a):
(23) a. Mary touched the horses.
     b. ∃!x : [x ∈ Sum & *Horse(x)] [Touch(m, x)]
¹⁴ We use here and below First Order Predicate Logic formulas with restricted quantification. We put square brackets around the Restrictor and the Nuclear Scope, and use capitals to distinguish logical predicate constants from their natural language counterparts. We disregard matters that are not directly relevant to us, such as tense interpretation.
We assume that in English the definite article, as well as ‘definite’ possessive determiners such as your horse/your horses have no number restrictions of
their own since they combine with both singular and plural nominals. In the case of the latter, the feature [Pl] is present in the NumP and brings its contribution to the semantic interpretation of the DP. Given that we assume NumP to be dominated by DP, we also assume that the feature [Pl], like other agreement features, percolates to the DP in order to trigger plural agreement outside the DP, as in the case of Subject-Verb agreement. Our account leads us to expect inclusive plural possessive or definite DPs alongside inclusive plural indefinites. Example (24) shows that this expectation is met:
(24) [Instruction for parents picking up their kids from day care after an outing in different groups]: If your children are back late, you have to wait.
Your children in (24) is interpreted inclusively, for the instruction is assumed to be relevant both to parents with a single child and to parents with more than one child in day care. The inclusive interpretation of the plural possessive in (24) is parallel to the inclusive interpretation of indefinite plurals in the restrictor of conditionals, exemplified in (3b) and repeated here as (25):
(25) If you have ever seen horses in this meadow you should call us.
The plural definite in (23a), on the other hand, gets an exclusive plural interpretation on a par with that of the bare plural in (21a).
Singular nominals do not involve a singular feature in NumP and therefore they do not have an inherent denotation restriction concerning the atom/sum divide imposed by any of their subparts. The denotation of the singular nominal horse is the number neutral property λx[*Horse(x)], an interpretation that is insensitive to the atom/sum divide. Crucially, however, we assume that the interpretation of count nominals in argument position in languages with morphological number has to involve information concerning the atomic vs. sum nature of their referent. In other words, a nominal that introduces a discourse referent (i.e., a nominal of type e) in these languages has to be interpreted as giving information concerning the atom/sum nature of its possible witnesses. Under standard assumptions in Discourse Representation Theory, discourse referents are introduced at the point when the D combines with its sister (cf. Kamp & Reyle 1993; Kamp & van Eijck 1996). Because number restrictions target the possible values of discourse referents, we assume that it is at this point that the presence of a number restriction becomes
relevant. In the case of plural nominals, the interpretation of the feature [Pl] in NumP contributes the required number restriction. In the case of singular nominals, however, there is no explicit number feature that can contribute the required number information. When a singular nominal combines with a determiner that itself is not specified for number, such as the definite article the, number specification is contributed via the optimization mechanism given above. Such a singular DP denotes exclusively within the set of atoms because allowing reference to sums has to involve the presence of [Pl] in NumP according to Tableau 1. Thus, at the point when a number neutral D such as the combines with a morphologically singular sister nominal that has no inherent number specification either, such as horse, the system of constraints in Tableau 1 enriches the interpretation of the DP with the constraint x ∈ Atom imposing exclusive atomic reference on the DP because this is the optimal number interpretation for a DP that is not marked with the feature [Pl]. The compositional semantics yields no number requirement on its own but in the absence of plural morphology, the DP will be interpreted as having atomic reference.¹⁵
Note that our account of the interpretation of singular DPs is similar in spirit to Krifka’s account of number interpretation for plural nominals. For Krifka, singular DPs are marked for atomic reference and plural nominals denote in the complement of the singular forms. For us, plural nominals are marked for including sums in their reference domain, and singular DPs, when in competition with plurals, denote in the complement of the plural form, i.e., they are interpreted as having exclusive atom reference. The truth conditions of sentences like (1a), repeated here as (26a), involving a singular form in competition with a plural one, are then as given in (26b).
(26) a. Mary saw a horse.
     b. ∃x : [x ∈ Atom & *Horse(x)] [See(m, x)]
¹⁵ We are assuming here that the OT system works hand in hand with composition rather than applying at a particular point in the derivation of the interpretation of an expression. The type of enrichment we use here is different from the ‘pragmatic enrichment’ proposed most recently in Chierchia 2004, 2006; Chierchia, Fox & Spector 2008, which relies on a covert exhaustification operator. Note also that our account of the singular/plural contrast is different from that assumed in Chierchia 2004, 2006; Chierchia et al. 2008 in that in our account singular forms do not involve a feature that imposes atomic reference.
The condition x ∈ Atom is present because nothing in the inherent semantics of a horse specifies that sum reference is a possibility, and therefore the
restriction to atomic reference is imposed by the constraints in Tableau 1. This condition then is not contributed by the presence of a particular piece of morphology in (26a) but rather, by the absence of the feature [Pl].¹⁶ The analysis extends in a straightforward way to the definite singular nominal in (27):
(27) a. Mary touched the horse.
     b. ∃!x : [x ∈ Atom & *Horse(x)] [Touch(m, x)]
Note that the definite article requires uniqueness, whereas the indefinite article just contributes existential quantification. We exploit this difference in Section 3.2 below. The account we proposed above allows explicit semantic information to be contributed by an unmarked form that has no inherent semantics on the basis of the competition with a marked form with a specific semantics. The bidirectional OT system spells out the details of a blocking account in the spirit of Krifka (1989) and Sauerland et al. (2005) with the important difference that in our system the existence of the semantically and morphologically marked plural form affects the interpretation of the semantically and morphologically unmarked singular form rather than the reverse.
The most important challenge for a weak singular/strong plural view is the existence of inclusive readings of plural forms. The polysemous semantics of plural nominals that we adopted in (20) as the outcome of the bidirectional optimization process meets this challenge as it leaves room for both inclusive and exclusive sum reference. Following the spirit though not the letter of Sauerland et al. (2005) and Spector (2007), we rely on pragmatics to determine the choice between these two senses in context and give, in the next section, a pragmatic account of the contrast between (1b), (3a-c) and (4a, b).
¹⁶ We assume here that the atom condition is part of the semantics of the relevant DPs. Alternatively, one could treat it as an implicature whose generation would rely on the constraints in Tableau 1. Under both views singular DPs are taken to denote within the realm of atoms because languages that have a plural form have to use it, other things being equal, in case sums are among the possible referents of the DP and thus, the existence of the plural blocks the singular from being interpreted as having sum reference. The implicature analysis sketched here differs sharply from the use of implicature in Spector (2007) summarized in Section 1.2 above.
3 The pragmatics of the plural
So far we have worked out a weak singular/strong plural analysis of number interpretation in which plural nominals are interpretable as having either inclusive or exclusive sum reference. In Section 1 we saw that both interpretations are indeed available for such nominals. We also saw, however, that the choice between them is not free: exclusive sum reference is the rule in upward entailing environments, as exemplified in (21) and (23), whereas inclusive readings are typically found in downward entailing environments such as the scope of negation, in the restrictor of a universal or the antecedent of a conditional, as well as in questions (cf. 22, 24 and 25). In Section 3.1 we turn to the problem of explaining this contrast. We suggest that a crucial factor regulating the choice between the two senses of the plural is the independently motivated S(trongest) M(eaning) H(ypothesis), a pragmatic principle that can be constrained under contextual pressure. We discuss, in Section 3.2, the predictions this hypothesis makes for the interpretation of plural forms in quantificational contexts. The interpretation of singular and plural forms comes closest in environments where the plural is interpreted inclusively because in such cases both forms are compatible with atomic witnesses. In Section 3.3 we investigate factors that regulate the choice between singular and plural forms in downward entailing environments and questions, environments where a plural is most likely to receive an inclusive interpretation.
3.1 Strongest meaning hypothesis for number
If plurals can have either an inclusive or an exclusive interpretation, along the lines of (20), the question of how one chooses between these two possibilities arises immediately. We have noted above that the exclusive interpretation asymmetrically entails the inclusive one. This, we claim, makes the choice between the two interpretations sensitive to the Strongest Meaning Hypothesis. In this section we make this connection explicit and discuss its predictions. Recall that Dalrymple, Kanazawa, Kim, Mchombo & Peters (1998) propose the Strongest Meaning Hypothesis (SMH) to account for the contextual choice between a range of interpretations for reciprocals. Winter (2001) extends the principle to instances of Boolean conjunction and quantification. Zwarts (2004) exploits the SMH as part of his interpretation procedure for the preposition round. We exploit here the same idea in claiming that the SMH
is one of the factors that govern the choice between the inclusive and the exclusive sum interpretation of plural nominals. The Strongest Meaning Hypothesis applies when an expression is assigned a set of interpretations ordered by entailment and chooses the strongest element of this set that is compatible with the context.¹⁷ The two senses of the feature [Pl] in our account, given in (20), are ordered by (truth-conditional) strength: an existentially closed proposition involving the exclusive sense asymmetrically entails the same proposition involving the inclusive sense. Because of this relationship the choice between interpretations of [Pl] falls under the jurisdiction of the SMH. Our hypothesis is formulated as smh_pl (the Strongest Meaning Hypothesis for Plurals):
smh_pl: the Strongest Meaning Hypothesis for Plurals: for a sentence involving a plural nominal, prefer that interpretation of [Pl] which leads to the stronger overall interpretation for the sentence as a whole, unless this interpretation conflicts with the context of utterance.
In upward entailing environments exemplified in (21a) and (23a), the sentence under the exclusive interpretation of the plural entails the sentence under the inclusive interpretation, and therefore the smh_pl favors the exclusive interpretation of horses and the horses over the inclusive one. In other words, the interpretation that Mary saw ‘more than one’ horse is stronger than the claim that Mary saw ‘one or more’ horses, and therefore the smh_pl favors the exclusive plural interpretation in (21b). Similarly, the statement that Mary touched the maximal sum of horses in the context entails the proposition that Mary touched the maximal set of one or more horses, so the smh_pl also favors the exclusive plural interpretation in (23b).
In downward entailing environments on the other hand, the smh_pl leads to the inclusive interpretation because of scale reversal under monotonicity reversal (see Fauconnier 1979 and much subsequent work).¹⁸ The weaker, inclusive reading of the plural in such contexts leads to a stronger claim for the sentence as a whole. This indeed is the case for (3a-c, 22, 24, 25).
¹⁷ Note that Sauerland et al. (2005) also make reference to strength when suspending Maximize Presupposition in cases where disobeying this principle would lead to a stronger overall claim, cf. (6) in Section 1.2.
¹⁸ Whenever relevant, the notion of downward entailment can be refined to ‘Strawson entailment’, e.g. in conditionals (cf. von Fintel 1999).
With respect to (3c), for instance, the proposition that Mary never saw ‘one or more’ horses (inclusive plural) entails the proposition that Mary never saw
more than one horse (exclusive plural). Given that the inclusive interpretation of the plural in (3c) leads to a stronger claim for the negative sentence than the exclusive interpretation, the former interpretation is preferred under the smh_pl. We assume here that the smh_pl is relevant to bringing about the inclusive interpretation of plurals in questions as well, though the details of how to compute the strength of questions must remain an open issue for the time being, despite the fact that the affinity between downward entailing contexts and questions has been noted for a long time.¹⁹ Other things being equal then, the smh_pl predicts that a plural nominal is interpreted inclusively in downward entailing contexts and questions, and exclusively in upward entailing ones. This is indeed the situation we find in (21)-(24).
¹⁹ Obviously, questions are not generally perceived as downward entailing, but they are subject to the same principle of scale reversal, as evidenced by the well-known fact that NPIs are often licensed in all these environments (cf. Guerzoni & Sharvit 2007 for a fine-grained discussion of NPI licensing in questions, and Ladusaw 1996 for a general overview of NPI licensing).
Note that the SMH as advanced by Dalrymple et al. (1998), Winter (2001) and Zwarts (2004) is a pragmatic principle, and as such it can be overridden by contextual pressure. If the smh_pl is indeed responsible for the choice of interpretation for plural nominals, we expect pragmatic pressure to render it inoperative, and make inclusive interpretations available even in upward entailing environments. We argue below that this is indeed the case. Under the assumption that the speaker knows the facts, the plural form in sentences such as (21a) and (23a) will receive an exclusive interpretation which is informationally stronger than the inclusive one. Furthermore, in these cases there is a single relevant witness for the plural nominal. Under the assumption that the speaker is in full possession of the facts, she should know whether this witness is an atom or a sum. In the first case she should use a singular form because that is the best expression for conveying atomic reference, given the high ranking of the constraint *pl, at (cf. Tableau 1). In the latter case, she should use the plural form, given the equally high ranking of *sg, i/e sum. Under the assumption that the speaker knows what Mary saw/touched then, there is no possibility to weaken (21a) or (23a) to an inclusive plural interpretation under the bidirectional optimization process spelled out in Section 2. But in contexts where the speaker is assumed to in fact lack information concerning the atomic/sum nature of the relevant witness, the smh_pl no longer requires the exclusive reading and thus inclusive readings of plurals become possible even in upward entailing contexts.
Farkas and de Swart
When not in possession of the relevant information, the speaker may be assumed to choose an inclusive plural form precisely because this relatively weak statement (which allows both atoms and sums as possible witnesses) is the strongest one compatible with the incomplete evidence she has. The examples in (28) illustrate just such a case:20
(28) a. [Speaker walks into basement, and notices mouse droppings]: Arghh, we have mice!
     b. [Speaker walks into unknown house, and notices toys littering the floor]: There are children in this house.
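The reasoning behind (28) can be made concrete with a minimal sketch. It assumes a toy model in which a world is just the number of mice present, and it encodes the two senses of the plural as cardinality thresholds; the names `WORLDS`, `THRESHOLD`, `truth_set` and `strongest_supported` are hypothetical and serve illustration only. The sketch picks the strongest reading that is true in every world compatible with what the speaker knows.

```python
# Toy model (an assumption for illustration): a world is identified with the
# number of mice in the basement, from 0 to 3.
WORLDS = frozenset(range(4))

# Hypothetical encoding of the two senses of the plural as cardinality thresholds.
THRESHOLD = {"inclusive": 1, "exclusive": 2}


def truth_set(reading, negated=False):
    """Worlds in which 'we have mice' (or its negation) is true on a reading."""
    positive = frozenset(n for n in WORLDS if n >= THRESHOLD[reading])
    return WORLDS - positive if negated else positive


def strongest_supported(knowledge, negated=False):
    """Return the strongest reading that holds in every world the speaker
    considers possible, or None if no reading is supported."""
    supported = [r for r in THRESHOLD if knowledge <= truth_set(r, negated)]
    # The truth sets are nested here, so the smaller set is the stronger claim.
    return min(supported, key=lambda r: len(truth_set(r, negated)), default=None)


full_knowledge = frozenset({3})    # speaker knows there are three mice
ignorance = frozenset({1, 2, 3})   # droppings: at least one mouse, number unknown
no_mice = frozenset({0})           # speaker knows there are no mice

print(strongest_supported(full_knowledge))         # exclusive
print(strongest_supported(ignorance))              # inclusive
print(strongest_supported(no_mice, negated=True))  # inclusive
```

With full knowledge in an upward entailing context the exclusive reading wins. Under speaker ignorance only the inclusive reading is supported, and under negation the inclusive reading is the stronger one.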
Crucially, the utterances in (28) are felicitous with an inclusive interpretation only in situations in which the speaker finds positive indirect evidence for the presence of mice and children, but has no way of telling how many there are. Although the inclusive interpretation is weaker in upward entailing contexts, it is the strongest possible interpretation of the sentence in a situation of speaker ignorance, where both atomic and sum reference are compatible with the information the speaker is assumed to have. The stronger, exclusive, interpretation in this case is not supported by assumptions about the speaker’s state of knowledge, and one prefers to assume that the speaker is obeying the maxim of quality over assuming that she makes the strongest claim her utterance is compatible with. Note that were the speaker to utter (28b) in her own house, and thus be assumed to be in full possession of the facts, the interpretation of the plural is correctly predicted to be exclusive again. Analogous cases are discussed in Zwarts 2004 in terms of a constraint fit (determining which interpretation fits the context) outranking the constraint strength. We will not spell out the interpretative tableaux here, but refer the reader to Zwarts 2004 for a way to do so.
3.2 Plurals in quantificational contexts
So far, we have proposed that the choice between the two senses of the plural is influenced by monotonicity. In upward entailing contexts, a plural form is normally interpreted as exclusive whereas in downward entailing contexts and questions, scale reversal leads to an inclusive interpretation. This raises the question of what happens in quantificational contexts.21
20 We thank one of the participants in ‘A bare workshop 2’ (LUSH, June 2008) for suggesting the example in (28a).
21 We are grateful to an anonymous reviewer for suggesting to us to discuss the implications of our analysis for plurals in quantificational contexts. We only discuss bare plurals and definite plurals here, not plural some DPs; plural indefinite determiners will be discussed in Section 4 below.
If the smh_pl is indeed involved in the choice between the two senses of the plural morpheme we expect, other things being equal, a difference in interpretation of plurals depending on whether they are in the Restrictor or the Nuclear Scope of a distributive universal quantifier because the Restrictor of such a quantifier is downward entailing and the Nuclear Scope is upward entailing. We therefore expect the plural in (29) to favor an exclusive reading:22
(29) Each sportsman is wearing gloves.
In order to test this prediction, we carried out a small-scale pilot experiment. We set up a picture-matching task, in which participants were requested to evaluate the Dutch counterpart of (29) in a ‘mixed’ situation in which some sportsmen were wearing two gloves and others were wearing a single glove. In order to neutralize the effect of expectations, each person in the picture was wearing the correct number of gloves required by their respective sport, so the boxer and the cyclist were wearing two gloves, and the baseball player was wearing a single glove.23 Participants strongly rejected (29) as a correct description of such a mixed situation (23 out of 24 said ‘no’), confirming the prediction that a plural in the Nuclear Scope of a distributive universal favors an exclusive interpretation. As expected, the control item All children were sitting on small chairs, where the quantifier all gives rise to a cumulative interpretation of the bare plural small chairs, was widely accepted as the description of a picture with a group of children each sitting on their own small chair (22 out of 24 participants said ‘yes’). This preliminary result appears to support the hypothesis that smh_pl is indeed relevant in choosing between the two senses of [Pl]. Further experimental work is needed in order to conclusively establish this point.
22 Note that our predictions here differ from those of Spector (2007), where the distinction between the Restrictor and the Nuclear Scope of universal quantifiers is not assumed to be relevant to the choice between inclusive and exclusive plurals. Note also that in order to rule out cumulative and dependent plurals, which are possible in the case of (i) and (ii), we focus on cases involving distributive each.
   i. All children were sitting on small chairs.
   ii. Unicycles have wheels.
See Zweig (2008) for relevant discussion of dependent plurals.
23 We thank Bert Le Bruyn for his help in designing and carrying out the experiment. The experiment was carried out in Dutch. The singular/plural system in Dutch is parallel to the one in English, and the contrasts between inclusive/exclusive interpretations are easily reproduced in this language. Dutch iedere (‘each’) proved to be a good universal quantifier to use because it is strongly distributive. One of the control items involved alle (‘all’), which easily allows dependent/cumulative interpretations, just like its English counterpart. Twenty-four native speakers served as subjects of the experiment. They were first-year BA students who had just completed an introduction to linguistics, in which the semantics of plurals was not discussed. The test was administered electronically. The participants were presented with a picture and a sentence below it. They were asked to judge whether the sentence gave a correct description of the situation (yes/no).
There is a further problem that arises in connection with plurals in the Nuclear Scope of distributive universal quantifiers. It has been noted in the literature that in (30), the definite plural gets an inclusive interpretation (cf. Sauerland et al. 2005):
(30) Each boy invited his sisters
Sentence (30) can be used to describe a ‘mixed’ situation, in which each boy invited all the sisters he has, which for some boys means inviting just one sister while for others, it means inviting several. The question that arises is how to account for the difference in interpretation between the bare plural in (29), which seems to favor an exclusive interpretation, and the definite plural in (30), which seems to allow an inclusive reading more readily. Section 2 developed a unified analysis of plural morphology so if bare plurals and definite plurals behave differently here, the difference in our account can only be due to the definite/indefinite contrast. Here we sketch a possible explanation of the contrast in number interpretation based on the contrast in definiteness. The crucial difference, in our account, between (30) and (29) relates to the contrast between (31) and (32): (31)
Each boy invited his sister
(32)
Each boy invited a friend of his
The possessive singular in (31) is interpreted as definite and therefore as referring to the maximal entity that is a sister of the relevant boy. Because of the maximality requirement that is part of the semantics of definite possessives, (31) is false24 in a situation in which some boys have one sister and they invited her, while others have more than one sister and they invited all of their sisters. The predicate invited his sister is true only of boys such 24 Or lacks a truth-value, if one prefers to state the maximality requirement as a presupposition.
that the maximal entity that is a sister of theirs is atomic because reference to sum values is not allowed for a singular DP. A mixed situation in which each boy invited the maximal entity that encompasses his sister(s) can be described with a definite plural interpreted inclusively because in that case the maximality requirement of the definite is met as long as for each boy in question there is no sister that remains uninvited. The inclusive plural requirement is met because although some witnesses are atoms there are others which are sums. Note that, as expected, (30) cannot be used in case all boys have a single sister whom they invited (cf. Section 3.3 below). The truth conditions of the definite singular are incompatible with a mixed situation where no sister is left uninvited and some boys have one sister while others have more than one, while the truth conditions of the definite plural, under the inclusive interpretation, are compatible with such a situation. The indefinite singular on the other hand is truth conditionally compatible with a mixed situation in which some boy invited one friend of his while others invited several. This is because the predicate invited a friend of his can be true of a boy that has several friends and invited only one of them precisely because the indefinite, unlike the definite, has no maximality requirement. If maximality is not part of the semantics of the sentence, the truth conditions of a singular form are compatible with a mixed situation, where some boys invited one of their friends and others invited several. The contrast between (29) and (30) then is due to the fact that a ‘mixed situation’ is incompatible with the truth conditions of (31) but compatible with those of (32). 
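The difference in truth conditions between the definite singular in (31), the inclusive definite plural in (30) and the indefinite singular in (32) can be checked against a ‘mixed’ situation in a small sketch. The encoding is an assumption made for illustration: each boy is a record with hypothetical fields `sisters`, `friends` and `invited`, and the maximality requirement of the definite is rendered as "every sister is invited".

```python
def invited_his_sister(boy):
    """Definite singular, as in (31): the maximal sister-entity must be an
    atom, and it must have been invited."""
    return len(boy["sisters"]) == 1 and boy["sisters"] <= boy["invited"]


def invited_his_sisters(boy):
    """Definite plural on the inclusive reading, as in (30): the maximal
    sister-entity (an atom or a sum) was invited, so no sister is left out."""
    return len(boy["sisters"]) >= 1 and boy["sisters"] <= boy["invited"]


def invited_a_friend(boy):
    """Indefinite singular, as in (32): at least one friend was invited;
    there is no maximality requirement."""
    return bool(boy["friends"] & boy["invited"])


def each(boys, predicate):
    """Distributive universal quantification over the boys."""
    return all(predicate(boy) for boy in boys)


# A hypothetical mixed situation: one boy has one sister and one friend,
# the other has two of each, and everybody invited all of them.
mixed = [
    {"sisters": {"Ann"}, "friends": {"Dan"}, "invited": {"Ann", "Dan"}},
    {"sisters": {"Bea", "Cleo"}, "friends": {"Ed", "Flo"},
     "invited": {"Bea", "Cleo", "Ed", "Flo"}},
]

print(each(mixed, invited_his_sister))   # False: (31) fails in the mixed situation
print(each(mixed, invited_his_sisters))  # True: inclusive plural in (30)
print(each(mixed, invited_a_friend))     # True: (32) has no maximality
```

The sketch covers truth conditions only. The further requirement that some witness be a sum, which blocks (30) when every boy has a single sister, is a matter of form choice and is not modelled here.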
Thus, the contextual pressure to override the smh_pl and give the plural an inclusive interpretation when in the Nuclear Scope of a distributive quantifier is stronger in the case of definites than in the case of indefinites because for definites the singular form is truth conditionally incompatible with a mixed situation while for indefinites this is not so. We have claimed in this subsection that the choice between the inclusive and exclusive senses of the plural is sensitive to the smh_pl, which favors the exclusive interpretation of plural forms in ordinary upward entailing environments and the inclusive interpretation in ordinary downward entailing ones. Since the smh_pl is a pragmatic principle it can be overridden by contextual factors involving cases where the speaker is assumed to describe a ‘mixed situation’, one where some relevant witnesses are atoms and others are sums. This may arise either because of speaker ignorance of the nature of the relevant witness (as in 28) or because the speaker knows that both types of witnesses are involved and using the plural form is the best way to
convey this information (as in 30).
3.3 Implications of the bidirectional analysis for choice of form25
According to the account developed so far, the semantic contrast between singular and plural forms is smallest in downward entailing contexts and questions. In these environments a singular has atomic reference while a plural has inclusive sum reference. Both forms therefore are compatible with atom witnesses. The approach we worked out predicts that a plural form will be appropriate in such contexts only if sum witnesses are relevant because the OT system we set up predicts that the unmarked singular form is optimal in case the witness domain does not include sums and therefore the use of the marked plural is appropriate only in this latter case. In this section we show that this prediction is confirmed and discuss some subtle pragmatic factors that determine whether sum witnesses are relevant in the context or not. Our approach predicts that even in environments that lead to inclusive interpretations, plural forms are sensitive to the presence of sums among relevant witnesses. We have already seen that this prediction is borne out in cases such as (4a,b) from Farkas 2006 and Spector 2007 respectively, repeated here as (33a) and (33b): (33)
a. Does Sam have a Roman nose/#Roman noses? b. Jack doesn’t have a father/#fathers.
In order to account for the contrast between singular and plural forms in examples like (33), Spector assumes an additional modal presupposition associated with (indefinite) plurals that explicitly requires the possibility of a sum witness. The bidirectional OT analysis developed in Section 2.2 and summed up in Tableau 1 accounts for this contrast without any additional stipulations. The OT analysis does two things simultaneously: it pairs up the singular and plural forms with their optimal interpretation, and at the same time it spells out which forms are optimal to express reference to atoms, to sums, and to a mixture of atoms and sums. Thus, according to Tableau 1, the optimal expression of exclusive atomic reference is a singular form while in case reference to sums only or to sums as well as atoms is intended the optimal form is a plural. Therefore, we predict that in case the witness domain includes atoms only the plural form will be excluded since ⟨sg, at⟩ is optimal while ⟨pl, at⟩ is not. The plural form has to be chosen when sums are the only values in the domain of reference (exclusive plural), or when sums and atoms are included in that domain (inclusive plural), but cannot be used in case the domain of reference excludes sums. Thus, we predict that even in environments in which the inclusive plural interpretation is preferred by the smh_pl, the singular form will be chosen when the nominal is assumed to take values exclusively from the set of atoms because in such a case the use of the plural form is suboptimal.
How do we know whether sums are relevant in the context? Given that the bidirectional OT analysis spells out the syntax-semantics interface for number, we cannot expect it to determine when reference to sums is intended by the speaker. That requires knowledge of the world and contextual knowledge, and is therefore a matter of pragmatics.26 The remainder of this section discusses some of the pragmatic factors that come into play to determine whether reference to sums is assumed to be relevant or not. Let us consider first the clear cases.
25 Spector (2007) raises a further empirical issue, namely the interpretation of the plural in examples such as (i).
   i. Exactly one student brought wine bottles to the party.
This sentence is interpreted as claiming that one student brought more than one bottle of wine to the party and no other student brought any bottles of wine to the party. This is a problematic example for us because one and the same plural nominal appears to be interpreted both exclusively (in the positive part of the interpretation) and inclusively (in its negative part). This is a problem we leave open for the time being, noting that a full discussion would have to involve both the optimal interpretation of exactly and the way closely related senses of polysemous items interact with it.
As just mentioned, if sums are pragmatically excluded from being possible witnesses for reasons of general world knowledge, as in (33), our analysis predicts that the speaker will choose a singular form and therefore we predict that the use of the plural form in (33a, b) is inappropriate. If, on the other hand, atomic witnesses are excluded for reasons of world knowledge, we predict that a plural form will be appropriate and the singular will not, as confirmed by (34). The singular form in (34b) is infelicitous because a singular form cannot be used if the pragmatically restricted domain of reference consists of sums alone.
(34) a. Does a dog have eyes?
     b. #Does a dog have an eye?
26 Although Spector (2007) doesn’t work this out in his paper, we take it that he also appeals to contextual knowledge and knowledge of the world in order to explain why the modal presupposition introduced by plural indefinites under his analysis is violated in cases like (33a, b), but not in (3a-c).
These are clear cases since knowledge of the world tells us that people have one nose and that eyes come in pairs. The relevance of sums, inherent to the optimization over forms, accounts for the contrasts in (33) and (34). There are, however, less clear cases, in which the issue of whether sums are relevant is a more subtle pragmatic matter. Following Farkas (2006), we adopt the hypothesis that in some cases there are default expectations with respect to the atom vs. sum nature of relevant witnesses and that these expectations affect the choice of a singular vs. a plural in environments that are otherwise friendly to inclusive plural interpretations. The account developed here accounts for this effect. To exemplify, note that when it comes to a person having an MA degree, it is simply a default expectation that if they have such a degree, they will have only one. Nothing stops people from piling up multiple MA degrees in their academic career, so sum witnesses in this case are not absolutely excluded. But normally, a person obtains just a single MA degree, so sum witnesses are not among the expected, default witnesses. Under the analysis developed above, we expect that the unmarked way of inquiring whether a person has an MA degree is (35a), with a singular. The question with the plural form is unusual because it explicitly requires one to include sums among possible witnesses. Indeed (35b) suggests that the speaker is inquiring after the possibility of having multiple MA degrees. The use of the plural here signals deviation from default expectations. (35)
a. Do you have an MA degree? b. Do you have MA degrees?
By contrast, when it comes to a department that has an MA program, the default expectation is that there will be more than one MA student in it. Since sums are now the default witnesses, we expect that the speaker will use a plural in (36a), when inquiring whether the department has an MA program. The choice of the singular in (36b) will be highly unusual, since it signals that sum values are not among the expected witnesses, a situation that is highly unexpected. (36)
a. Are there MA students in your department? b. #Is there an MA student in your MA department?
The contrast between (35) and (36) is due to the difference in whether one expects sums to be among the relevant witnesses or not. Since the choice of a plural form always requires sum witnesses to be relevant, such a form is natural in (36a) but is unusual in (35b). The pragmatic relevance of sums also plays a role in (37a), the example most frequently cited as support for the existence of an inclusive reading of the plural. (37)
a. Do you have children? b. Do you have a child on our baseball team?
The domain from which the nominal chooses witnesses in (37a) is a mixed one since there is no default expectation with regard to how many children a person has. In this case then sums are part of the pragmatically relevant domain and therefore the choice of a plural form is predicted to be appropriate on a tax form, for instance. In (37b) on the other hand, we changed the example so that now the presence of sum values among the default witnesses is removed and, as expected, a singular form is the natural one in a questionnaire in this case. In the examples discussed so far, common world knowledge shared between speaker and hearer is sufficient to account for the optimal choice of the singular or plural form. The two questions in (38) illustrate that the choice of form may also depend on the context of use. (38)
a. Do you have a broom? (asked in your kitchen after I spilled peas on your floor) b. Do you have brooms? (asked in a store)
As far as we can see, there are no special expectations about people having one or more than one broom in their house, if they have any. In addition, given the context of use sketched for (38a), the speaker is not expected to need more than one broom. The choice of a singular form is therefore expected, since sum witnesses are not relevant to the situation of use nor is the relevance of sums imposed by common world knowledge. In a store, on the other hand, the relevant witness is by default a sum, since stores normally sell more than one item of a particular type, if they sell that type of item at all. A plural form then is the natural choice in (38b) not because the speaker is interested in buying more than one broom but because of the default sum value expectation associated with the positive answer to her question.
Further examples that support the generalization that a plural form is used in case sum values are among the default values of a nominal in a particular context, and that singular forms are used when this is not the case, are given in (39)-(42):
(39) a. Is Sarah wearing shoes?
     b. Is Sarah wearing a hat?
(40) a. Do you have pictures from your wedding?
     b. Do you have a picture of Sarah in your wallet?
(41) a. Is there a sauna in this house?
     b. Are there nice plants in the garden?
(42) a. Have you bought your Christmas presents already?
     b. Have you bought a Christmas present for Aunt Sarah?
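The generalization behind (33)-(42) can be sketched as a simple mapping from the pragmatically relevant witness domain to the optimal form-meaning pair. This is only an illustrative rendering of the outcome described in the text: it does not implement the bidirectional optimization itself, and the function name `optimal_pair`, the domain labels and the way the examples are classified are assumptions.

```python
def optimal_pair(witness_domain):
    """Map the relevant witness types ('atom', 'sum') to the form and
    interpretation predicted to be optimal."""
    domain = frozenset(witness_domain)
    if not domain or not domain <= {"atom", "sum"}:
        raise ValueError("witness domain must be a non-empty subset of {'atom', 'sum'}")
    if domain == {"atom"}:
        return ("sg", "atomic")          # plural excluded: no sums are relevant
    if domain == {"sum"}:
        return ("pl", "exclusive sum")   # only sums are relevant
    return ("pl", "inclusive sum")       # atoms and sums are both relevant


# Hypothetical classification of some examples by their default witnesses.
examples = {
    "(33a) a Roman nose": {"atom"},
    "(34a) eyes (of a dog)": {"sum"},
    "(37a) children": {"atom", "sum"},
    "(38a) a broom (in a kitchen)": {"atom"},
    "(38b) brooms (in a store)": {"sum"},
}

for label, domain in examples.items():
    form, meaning = optimal_pair(domain)
    print(f"{label}: {form}, {meaning}")
```

Which witness types count as relevant is a pragmatic matter; the sketch takes that classification as its input and does not derive it.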
Under the bidirectional OT analysis developed in Section 2.2, the singular form is the optimal choice when the domain of reference includes atomic values only (cf. Tableau 1). The inclusive plural tolerates atoms in its domain of reference, but the pair ⟨pl, at⟩ is suboptimal, because of the high ranking of *pl, at. So even in questions and under negation, the choice of a plural form requires that sums be included in the domain of the nominal. Sum witnesses must be relevant in whichever way the context supports this (general world knowledge or specific situational knowledge). Therefore, the use of a plural in downward entailing contexts and questions will be natural just in case intended sum reference can be pragmatically justified. The choice between a singular and a plural form in contexts where the interpretations of the singular and the plural overlap thus falls out naturally from our account.
3.4 Taking stock
In section 2, we took Horn’s division of pragmatic labor to heart, and developed an analysis of the singular/plural contrast in line with the view that unmarked forms are paired up with unmarked meanings, and marked forms with marked meanings. We made no use of a singular feature or morpheme and assigned the plural a polysemous semantics (inclusive and exclusive sum reference). In Section 3 we invoked the Strongest Meaning Hypothesis to account for the fact that plural forms in ordinary upward entailing environments are normally interpreted exclusively while the best cases of inclusive plurals are found in downward entailing environments and questions. In our analysis
the use of a plural form requires sum witnesses to be relevant, a property that we have argued guides the choice between singular and plural forms even in contexts where a plural is interpreted inclusively. The analysis set up so far meets the desiderata (i)-(iii), formulated in Section 1.4. What remains to be investigated is its cross-linguistic validation (desiderata iv and v). A full-fledged analysis of languages such as Chinese, which lack morphological number altogether, goes beyond the scope of this paper, but note that our set-up is in line with a semantics of Chinese nominals in terms of general number (Rullmann & You 2003). In contrast with Farkas & de Swart (2003), we do not take atomic reference to be the default interpretation for argument nominals in general and therefore the current analysis is subtler than our earlier proposal. Crucially, in the current account, the mechanism that associates atomic reference with non-plural forms requires a morphological opposition between singular (unmarked) and plural (marked) nominals in the language. In the next section we turn to the contrast between English and Hungarian DPs in cases where the D lexically entails sum reference. We argue that plural morphology can be absent in such cases precisely because, given the semantics of the D, the contribution of [Pl] is redundant. The difference between Hungarian and English then is a matter of whether redundant plural morphology is required (English) or prohibited (Hungarian). We work out a full-fledged account of this contrast in Section 4.
4 Plural determiners: a cross-linguistic perspective
So far we have concentrated on the interpretive contribution of the morphological number feature [Pl] when it occurs in nominals in argument position in languages like English, where a morphological distinction between singular and plural nominals is operative. In principle, determiners may also encode information concerning the atom/sum divide.
In English, the definite determiner the combines with both singular and plural nouns, so the restriction to atomic or sum reference in the case of definite DPs is solely encoded in morphological information located in NumP. Within the category of indefinite DPs, just like in the case of definites, number interpretation is primarily driven by the morphological singular/plural contrast realized in NumP, though this contrast may be reinforced by determiner choice. A DP headed by several must refer exclusively to sums, while a DP headed by a(n) can only refer exclusively to atoms. The indefinite determiner some, on
the other hand, is like the definite article in that it has no inherent lexical restrictions pertaining to number interpretation. The core analysis set up in Sections 2 and 3, illustrated with English, extends to other languages that have a morphological number distinction such as Germanic or Romance languages as well as to non-Indo-European languages such as Hungarian. In DPs whose determiner does not contribute number information, we expect the effect of the feature [Pl] on the nominal to be the same as in English. We now turn to the data noted in Section 1, where we saw that English and Hungarian contrast in case the determiner is lexically marked for sum reference.
4.1 A contrast between English and Hungarian
As outlined in Section 1.4, Hungarian is like English in distinguishing between singular and plural nominals, with the singular remaining unmarked and the plural being marked by the presence of the morpheme -(a)k. The facts of number interpretation in English that we discussed so far are parallel in Hungarian and therefore the analysis proposed for English extends to Hungarian as well. We have seen, however, that there is a crucial difference between the two languages when it comes to DPs whose determiner is lexically marked for sum reference. In English, such DPs are morphologically plural, while in Hungarian they are morphologically singular. We repeat the key relevant Hungarian facts in (43) (see examples in 9 above): (43)
a. három gyerek
   three child
   ‘three children’   [Hungarian]
b. sok gyerek
   many child
   ‘many children’
These DPs are singular in form (and trigger singular agreement with the V when in subject position), and yet they have exclusive sum reference. This then is an environment where the semantic contrast between singular and plural forms is neutralized in Hungarian. What needs an explanation now is why, in languages that have a morphological number contrast, we find two options if the D is marked for sum reference: (i) the language may require the number contrast to be morphologically expressed by the presence of the feature [Pl], as in the English three children, many children, or (ii) the language may require the number contrast to stay morphologically unexpressed, as in the Hungarian három gyerek, sok gyerek. Note that the difference between these two languages is purely morphological since the semantic interpretation of the relevant DPs is identical. We turn to an account of these facts after we review their significance for competing analyses of number.
4.2 Implications for the weak/strong singular debate
If we analyze a determiner such as three as a generalized quantifier expressing existential quantification over sums with cardinality of at least three, the semantics of the DP three children / három gyerek is as in (44). We use [three NP]sg as shorthand for the Hungarian case, where the DP is singular and there is no plural feature in NumP and thus no plural suffix on N, and [three NP]pl as shorthand for the English case, where NumP contains the feature [Pl] overtly realized as a suffix on N.27 (44)
a. [three NP]sg = λP ∃x : [*N(x) & P(x) & |x| ≥ 3]   (✓weak singular)
b. [three NP]sg = λP ∃x : [*N(x) & P(x) & x ∈ Atom & |x| ≥ 3]   (*strong singular)
c. [three NP]pl = λP ∃x : [*N(x) & P(x) & |x| ≥ 3]   (✓weak plural)
d. [three NP]pl = λP ∃x : [*N(x) & P(x) & x ∈ Sum & |x| ≥ 3]   (✓strong plural)
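The four candidate denotations in (44) can be evaluated over a toy model, as in the following sketch. The modelling choices are assumptions made for illustration: entities are non-empty sets of atomic individuals (size one for atoms, larger for sums), *N(x) holds when x is built from individuals in N, and P is read distributively. The names `entities` and `three_np` are hypothetical.

```python
from itertools import combinations


def entities(atoms):
    """All atoms and sums built from a set of atomic individuals, modelled as
    non-empty frozensets (size 1 = atom, size > 1 = sum)."""
    atoms = sorted(atoms)
    return [frozenset(c)
            for k in range(1, len(atoms) + 1)
            for c in combinations(atoms, k)]


def three_np(noun, pred, number=None):
    """Truth value of '[three NP] P' under the schemas in (44).

    number=None   : no number restriction, as in (44a) and (44c)
    number='atom' : additional conjunct x ∈ Atom, as in (44b)
    number='sum'  : additional conjunct x ∈ Sum, as in (44d)
    """
    def number_ok(x):
        if number == "atom":
            return len(x) == 1
        if number == "sum":
            return len(x) >= 2
        return True

    # *N(x): x ranges over entities built from N; P(x): distributive (assumption).
    return any(x <= pred and len(x) >= 3 and number_ok(x)
               for x in entities(noun))


children = {"a", "b", "c", "d"}
arrived = {"a", "b", "c", "e"}   # three of the children arrived

print(three_np(children, arrived))                 # True:  (44a) and (44c)
print(three_np(children, arrived, number="atom"))  # False: (44b) is contradictory
print(three_np(children, arrived, number="sum"))   # True:  (44d)
```

The atom restriction in (44b) can never be met together with |x| ≥ 3, which is why the strong singular semantics fails for the Hungarian DP, while the sum restriction in (44d) adds nothing beyond what the cardinality requirement already entails.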
Note that in order to obtain the intended meaning, the fact that the nominal is singular should have no interpretive consequence here. The weak interpretation of singular nominals in (44a) yields the desired interpretation, but the strong singular semantics in (44b) does not, which is why the Hungarian facts are problematic for accounts in which singular forms are semantically potent while being at least compatible with a ‘weak singular’ approach such 27 The semantics spelled out in (44) may be an oversimplification, given the more fine-grained analyses of the differences in meaning between three children and at least three children that have been offered in the recent literature (cf. Nouwen & Geurts 2007 and references therein). However, the observations made in these works are tangential to the issues at stake in this paper, because they focus on the role of the determiner, not the singular/plural distinction on the noun. So we ignore these complications here.
as the one we propose. In our account a singular DP has no inherent atomic reference requirement contributed by a singular feature. It only acquires atomic reference when a number restriction is required and is not provided otherwise. When the nominal is plural, on the other hand, both the weak and the strong plural analyses yield the right interpretations. Under a weak plural account, the plural feature does not contribute any number information, while the determiner requires sum reference given the cardinality requirement, as in (44c). Under the strong plural analysis advocated here, the plural morphology on the noun conveys sum reference in (44d). In this case the semantic contribution of the feature [Pl] is redundant given that exclusive sum reference is entailed by the semantic contribution of the determiner. Thus plural morphology in the DP is redundant when the determiner conveys sum reference, but it is not harmful. The particular weak singular/strong plural analysis developed in Section 2 derives atomic reference for singular nominals under bidirectional optimization. We crucially need this mechanism in Hungarian as well in order to assign the correct interpretation of ordinary Hungarian singular and plural definite and indefinite DPs such as a gyerek ‘the child’ and a gyerekek ‘the children’, which behave just like their English counterparts. But precisely in case the D entails sum reference, the semantic difference between singular and plural forms is neutralized under the assumption that singulars have no semantic import. Due to the semantics of the D, the semantic contribution of the feature [Pl] is redundant. Because there is no crucial interpretive difference between [three NP]sg and [three NP]pl the bidirectional optimization over form-meaning pairs spelled out in Section 2 above does not apply to these cases. 
In view of the semantic equivalence between singular and plural nominals in DPs headed by ‘semantically plural’ determiners, both English [three NP]pl and Hungarian [három NP]sg are compatible with the analysis developed so far in this paper. The competition between singular and plural forms is inoperative precisely when there is no meaning contrast that could be encoded by these two forms. A singular nominal can be associated with sum reference when the possibility of atomic reference is excluded on independent grounds and when the requirement that argument nominals be specified for number reference is satisfied by the D. Note, however, that we predict that the reverse is not possible. Since the plural has a semantic contribution to make, there can be no language just like English or just like Hungarian except that a plural form will be used in case the DP entails atomic reference. If the D excludes sum
reference, the use of the marked plural form is predicted to be impossible and thus DPs like *one/a single children are ruled out in both English and Hungarian. So far we have explained how our approach accounts for the possibility of a singular DP in case the D entails sum reference. What remains to be explained is what dictates the choice between a singular and a plural form in cases where the difference is semantically neutralized. In the languages under consideration the choice between the two forms is not free: English requires the use of the plural in such cases (*three child), while Hungarian requires the use of a singular (*három gyerekek ‘three child.pl’). The question we address next is what drives the choice between these two forms in the grammar. We discuss it in some detail because, as far as we know, this issue has not been addressed in the literature.
4.3 A unidirectional OT analysis
The contrast between English and Hungarian nominals headed by a semantically plural determiner instantiates a shallow syntactic difference that arises when two forms exist in the language but their semantic difference is neutralized. We view the presence of the feature [Pl] in the English three children, many children as number agreement, resulting from a requirement that imposes the presence of the feature [Pl] on sum denoting nominals. Its absence in the corresponding Hungarian három gyerek/sok gyerek is seen as a choice dictated by economy considerations that militate against the use of marked forms when redundant. Given that we posit the same semantics for English and Hungarian plural indefinites, such a situation calls for a unidirectional syntactic OT analysis that establishes a more fine-grained distinction within the set of languages with a morphological number distinction. We embed our analysis in an OT typology of number based on classical markedness and faithfulness constraints. Recall that in Section 2 we exploited the economy constraint *FunctN that favours the least number of functional layers on top of the NP. If the plural feature [Pl] lives in the functional projection of NumP and cardinals and indefinite determiners live in D, the presence of such expressions constitutes a violation of *FunctN. Such violations are motivated by the need to satisfy faithfulness constraints that are ranked above *FunctN. One of these is the constraint Fpl, favouring the expression of sum reference in a functional layer above NP.28
(45) Fpl: Sum reference must be encoded in the functional structure of the nominal.
∃!x : [x ∈ Sum & *Child(x)]     | Fpl | *FunctN
a gyerek ‘the child’            | ∗   | ∗
☞ a gyerekek ‘the child.pl’     |     | ∗∗
Tableau 2 expressive optimization for definite plurals (Hungarian)
Languages that do not have a morphological singular/plural distinction in nominals (such as Mandarin Chinese, cf. Section 1.4 above) rank Fpl below *FunctN (see de Swart & Zwarts 2008, 2010). The morphological singular/plural distinction in both English and Hungarian is the result of a grammar in which Fpl outranks *FunctN. But the formulation of Fpl is more general, and allows the expression of number distinctions by other elements in the functional layer above the NP in addition to [Pl] in NumP. If we take the Hungarian determiners három, sok, and the other determiners in (10), to satisfy Fpl, there is no reason to use the [Pl] feature in NumP. In fact, the markedness constraint *FunctN forbids its realization given that in the presence of a determiner that entails sum reference, the feature [Pl] is redundant.
Tableaux 2 and 3 illustrate the optimization process for the plural definite a gyerekek (‘the children’) and the cardinal három gyerek (‘three child’). In both the definite plural and the cardinal plural, we find a violation of *FunctN because of the presence of an expression in the functional projection of the nominal.29 Given that the input meaning involves sum reference, the high ranking of Fpl in the grammar of Hungarian requires satisfaction of this constraint at the expense of the economy constraint *FunctN. The definite article a in Hungarian is similar to English the in that it does not convey number information, so the optimal plural nominal form incurs a second violation of *FunctN in Tableau 2. In Tableau 3, there is no reason to use a plural form of the nominal, given that the lexical semantics of the cardinal D három entails sum reference, and may therefore be taken to satisfy Fpl. A singular form of the noun is more economical, and is therefore preferred. In OT terms, the use of a singular nominal in combination with a determiner that entails sum reference exemplifies the emergence of the unmarked.
The ranking Fpl ≫ *FunctN is sufficient to account for Hungarian, but does not yet capture the cross-linguistic contrast between Hungarian három gyerek (‘three child’) and English three children. To capture the intuition that the use of a plural nominal in English in these cases is motivated by agreement in number between the plural determiner and the noun, we posit an additional constraint Maxpl.
(46) Maxpl: Mark with [Pl] nominals that have sum reference.
∃!x : [x ∈ Sum & *Child(x)]        | Fpl | *FunctN
☞ három gyerek ‘three child’       |     | ∗
három gyerekek ‘three child.pl’    |     | ∗∗
Tableau 3 expressive optimization for plural cardinals (Hungarian)
28 Note that the formulation of the constraint Fpl here is slightly different from that in de Swart & Zwarts 2008, 2010, who did not deal with the complexities of cardinals and indefinite plural determiners, but focused on ‘plain’ definites and indefinites.
29 The presence of the definite determiner a is licensed by a high ranking of the faithfulness constraint Fdef governing the expression of definiteness (see Hendriks et al. 2010: chapter 7, de Swart & Zwarts 2008, 2010).
Unlike Fpl, Maxpl favours redundant marking of plural morphology within the nominal, at the expense of extra violations of *FunctN. The advantage of this multiplication of plural marking is the emphasis on sum reference. Maxpl is inspired by de Swart’s (2006, 2010) analysis of negative concord in terms of semantic agreement.30 We suggest that the use of plural nominals in contexts in which the determiner already conveys sum reference and thereby satisfies Fpl is governed by a high ranking of Maxpl. Under this analysis, the grammar of Hungarian has the ranking Fpl ≫ *FunctN ≫ Maxpl, whereas English exemplifies the grammar {Fpl, Maxpl} ≫ *FunctN. For Hungarian, the introduction of the new constraint does not affect the optimization patterns spelled out in Tableaux 2 and 3, because Maxpl is ranked too low to have an effect. For English, the new ranking leads to the optimal form three children for the expression of cardinality information over children, as illustrated in Tableau 4. Sum reference is entailed by the cardinal determiner three, so Fpl is satisfied, just like in Hungarian. However, the constraint Maxpl maximizes the expression of plurality by forcing it to appear in NumP as well. The high ranking of this constraint in English leads to a preference for agreement between the determiner and the nominal over a more economical form. In Hungarian, the constraint Maxpl is ranked below *FunctN, where it is inoperative.
∃x : [*Child(x) & |x| ≥ 3]   | Fpl | Maxpl | *FunctN
three child                  |     | ∗     | ∗
☞ three children             |     |       | ∗∗
Tableau 4 expressive optimization for plural cardinal meaning (English)
30 de Swart posits a constraint Fneg requiring faithfulness to the expression of negation, and Maxneg requiring a reflection of negation on an indefinite argument within the scope of negation on the form of the nominal. The high ranking of Maxneg in negative concord languages leads to a multiplication of negative forms even in contexts in which they are not needed to satisfy Fneg, and thus convey semantic negation.
Independent support in favor of our analysis comes from L1 acquisition. Children acquiring a double negation language such as standard English sometimes go through a phase in which they multiply negation as if they were speaking a negative concord language. Along similar lines, Hungarian children sometimes mistakenly use the form *három gyerekek (‘three children’) before they acquire the grammatical három gyerek (‘three child’). Even though anecdotal, this evidence suggests that child grammar favours agreement both for negation and number marking.
The analysis of the contrast between Hungarian and English is not in conflict with the bidirectional optimization process developed in Section 2, but rather, it covers a niche where the competition in form evades the competition in meaning. With inherently plural determiners, the determiner entails sum reference for the nominal as a whole, and thus makes irrelevant the semantic competition between singular and plural forms.
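The two rankings discussed above can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the violation counts are copied from Tableaux 2 to 4, the Maxpl marks on the Hungarian singular forms are an assumption made by analogy with Tableau 4, and the treatment of the unranked pair {Fpl, Maxpl} as a single stratum whose violations are summed is one simple way of modelling ties. The names `optimal`, `HUNGARIAN` and `ENGLISH` are hypothetical.

```python
from typing import Dict, List

# Violation counts per candidate; a constraint that is not listed counts as 0.
Candidates = Dict[str, Dict[str, int]]
# A ranking is a list of strata, highest first. Constraints sharing a stratum
# are treated as tied and their violations are summed (modelling assumption).
Ranking = List[List[str]]

HUNGARIAN: Ranking = [["Fpl"], ["*FunctN"], ["Maxpl"]]
ENGLISH: Ranking = [["Fpl", "Maxpl"], ["*FunctN"]]

# Tableau 2. The Maxpl mark on the singular form is an assumption made by
# analogy with Tableau 4; Tableau 2 itself does not list Maxpl.
HUNGARIAN_DEFINITE: Candidates = {
    "a gyerek": {"Fpl": 1, "*FunctN": 1, "Maxpl": 1},
    "a gyerekek": {"*FunctN": 2},
}
# Tableau 3, with the same assumed Maxpl mark on the singular form.
HUNGARIAN_CARDINAL: Candidates = {
    "három gyerek": {"*FunctN": 1, "Maxpl": 1},
    "három gyerekek": {"*FunctN": 2},
}
# Tableau 4, copied as given.
ENGLISH_CARDINAL: Candidates = {
    "three child": {"Maxpl": 1, "*FunctN": 1},
    "three children": {"*FunctN": 2},
}


def optimal(candidates: Candidates, ranking: Ranking) -> List[str]:
    """Return the candidate(s) whose violation profile is lexicographically smallest."""

    def profile(form: str) -> tuple:
        marks = candidates[form]
        return tuple(sum(marks.get(c, 0) for c in stratum) for stratum in ranking)

    best = min(profile(form) for form in candidates)
    return [form for form in candidates if profile(form) == best]


print(optimal(HUNGARIAN_DEFINITE, HUNGARIAN))
print(optimal(HUNGARIAN_CARDINAL, HUNGARIAN))
print(optimal(ENGLISH_CARDINAL, ENGLISH))
# Re-ranking alone changes the winner for the Hungarian cardinal candidates.
print(optimal(HUNGARIAN_CARDINAL, ENGLISH))
```

With these profiles the sketch selects a gyerekek and három gyerek under the Hungarian ranking and three children under the English one. Applying the English ranking to the Hungarian cardinal candidates selects három gyerekek instead, which illustrates that the two grammars differ only in where Maxpl is ranked.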
In languages with a morphological singular/plural distinction, this creates room for a new competition between unmarked (singular) and marked (plural) forms. In the absence of a difference in meaning, the optimal expression is selected
on purely formal grounds. The competition here is between economy of form (exemplified by Hungarian), and agreement between D and its sister (exemplified by English).31 We conclude that Hungarian and English are both members of the class of languages with a full-fledged morphological singular/plural distinction in nominals, and a grammar in which Fpl is ranked above *FunctN. However, there are subclasses within this general class that exploit contrasts in form for purposes other than expressing a distinction between atomic and sum reference. Given that agreement in number between determiner and its sister is available only in languages with a morphological singular/plural distinction (instantiating Fpl ≫ *FunctN), we predict such subtleties not to occur in languages lacking number morphology.
5 Conclusion
The semantics and pragmatics of the plural in languages with a morphological number distinction have been a problem on the semantics agenda since McCawley (1981) raised the question of how to reconcile the morphological markedness of the plural with its seemingly unmarked semantics. The main point of this paper is to propose a way of resolving this tension, and maintain Horn’s division of pragmatic labor for number in natural language. Recent accounts of number interpretation, stemming from Krifka (1989), accept this tension, and attempt to explain it (Bale et al. in press). Recall that Sauerland et al. (2005) rests on the assumption that singular forms are marked with a singular feature that requires atomic reference, while plural forms involve a feature with no semantic contribution. In Spector 2007, singulars have the same ‘strong’ semantics, while the plural feature is assigned a weak semantics equivalent to ‘at least one’. In Sauerland et al. 2005 plural forms have sum reference because the existence of the semantically more specific singular form blocks their use in case of atomic reference.
31 We use the term ‘agreement’ here to cover not only cases where morphological features are shared but also cases where the presence of a morphological feature on one node is connected to the presence of a semantic constraint on another. Our account therefore is compatible with a morphological treatment of English which does not use [Pl] as a feature on Ds.
In Spector 2007 a similar result is achieved using higher order implicatures. The approach we developed here shares with these previous proposals the insight that number interpretation requires a competition-based account
and involves the blocking of one form by the existence of the other. We couch it in terms of bidirectional Optimality Theory because this framework is particularly suitable for capturing the phenomenon of blocking. In Bidirectional OT, the syntax-semantics interface is defined in terms of optimization over form-meaning pairs, making use of a mechanism that selects the optimal meaning for a particular form, and the optimal form for a particular meaning. The crucial novelty of this paper is that it reverses the direction of blocking. We have worked out a weak singular/strong plural account of number interpretation for the languages under consideration, in which there is no singular feature and no special semantics associated with singular forms while plural forms are assumed to involve a semantically potent plural feature. The main conceptual advantage of such an approach is that it reconciles semantic and formal markedness when it comes to number interpretation and explains why in the languages under consideration there is a plural morpheme but no special singular marking. The main empirical advantage of our approach is that it predicts the possibility of using singular forms with sum reference in case the semantic distinction between singular and plural forms is neutralized, a possibility that is realized in Hungarian. We have adopted the abstract system developed in Mattausch 2005, 2007 and adapted it to the morphology and semantics of number. Crucially, we have suggested that the relevant semantic markedness parameter for the languages under consideration is the distinction between the conceptually unmarked atom reference and the conceptually marked inclusive or exclusive sum reference. The system we propose associates marked plural forms with marked sum reference interpretation and unmarked singular forms with unmarked atomic reference. 
The marked plural form is associated with the requirement that sums be included among possible witnesses of the nominal, a requirement that is realized by giving the feature [Pl] a polysemous semantics, with one sense reserved for the exclusive interpretation and the other for the inclusive interpretation. The unmarked singular form has no inherent semantics, but under bidirectional optimization, it takes the complementary meaning of the marked plural, namely atomic reference. We have proposed a weak singular/strong plural approach in which formal and interpretational markedness are parallel, a pattern we find elsewhere in natural language. At the same time, our proposal meets the challenge posed by the existence of plural forms interpreted inclusively. In fact, once we adopt the view that having sum reference is the conceptually marked
interpretation, the account makes us expect plural forms to be used both for inclusive and exclusive sum reference. What the system rules out, however, is a plural form used when the existence of a sum witness is excluded. This is a welcome result. The relevance of sum values to all uses of plurals in the languages under consideration follows from our analysis without having to assume a strong semantics for singulars (as in Sauerland et al. 2005) or having to add a special modal presupposition for plurals (as in Spector 2007). In our approach, just as in previous proposals, the competition between the inclusive and the exclusive interpretation of plural forms is decided by pragmatic rather than semantic factors. We have relied on applying the Strongest Meaning Hypothesis to the interpretation of the plural, which correctly predicts that plural forms will be interpreted exclusively in ordinary upward entailing contexts and inclusively when under the scope of negation or in the Restrictor of conditionals or distributive universals. But even in contexts in which inclusive readings are permitted, sum reference must be relevant. We have seen that subtle pragmatic factors determine in which contexts and situations sum reference is relevant. The main theoretical contribution of the account we developed here is that it respects the Horn pattern while at the same time accounting for the existence of inclusive plurals as well as for the main dividing line between inclusive and exclusive plurals. We have shown here that such an account is both possible and desirable. On the empirical side, our approach has the advantage of accounting for the relevance of sum reference with plural forms, as well as predicting the possibility of singular nominals with sum reference just in case sum reference is imposed by D independently of what is found in NumP. This is indeed the case of Hungarian singular DPs such as sok gyerek ‘many child’. 
We have presented an account of these facts that treats the singular form of these DPs as the result of the language valuing functional economy over the pressure to mark sum reference uniformly with the feature [Pl]. The obligatory plural form of such DPs in English is due to this language valuing uniform [Pl] marking of sum denoting DPs over functional economy. The account we propose thus meets what we take to be the main challenges number semantics faces without having to rely on any tools that are not independently motivated.
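The role that the Strongest Meaning Hypothesis plays in choosing between inclusive and exclusive plurals can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: worlds are reduced to the number of horses Mary saw, the example sentences in the comments are hypothetical and not taken from the paper, and ‘strongest’ is read simply as ‘entails every competing reading’. It leaves aside the further requirement that sum reference be relevant.

```python
# Toy model: a "world" is just the number of horses Mary saw (0 to 3).
WORLDS = range(0, 4)


def exclusive(n):
    """Exclusive plural: only sums are witnesses, i.e. two or more."""
    return n >= 2


def inclusive(n):
    """Inclusive plural: atoms and sums are witnesses, i.e. one or more."""
    return n >= 1


def negate(reading):
    """Embed a reading under negation."""
    return lambda n: not reading(n)


def entails(p, q):
    """p entails q if q holds in every world where p holds."""
    return all(q(w) for w in WORLDS if p(w))


def strongest(readings):
    """Return the names of readings that entail every competitor."""
    return [
        name
        for name, p in readings.items()
        if all(entails(p, q) for q in readings.values())
    ]


# Hypothetical upward entailing sentence: "Mary saw horses."
upward = {"exclusive": exclusive, "inclusive": inclusive}
# Hypothetical negated sentence: "Mary didn't see horses."
negated = {"exclusive": negate(exclusive), "inclusive": negate(inclusive)}

print(strongest(upward))
print(strongest(negated))
```

In this model the exclusive reading is the stronger one in the positive sentence, while under negation the inclusive reading is stronger, which matches the distribution summarized in the conclusion.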
References
Bale, Alan, Michaël Gagnon & Hrayr Khanjian. in press. On the relationship between morphological and semantic markedness: the case of plural morphology. Journal of Morphology http://linguistics.concordia.ca/bale/pdfs/Morphology%20paper.pdf.
Beaver, David. 2002. The optimization of discourse anaphora. Linguistics and Philosophy 27(1). 3–56. doi:10.1023/B:LING.0000010796.76522.7a.
Beaver, David & Hanjung Lee. 2004. Input-output mismatches in OT. In Reinhard Blutner & Henk Zeevat (eds.), Optimality theory and pragmatics, 112–153. Palgrave/MacMillan. https://webspace.utexas.edu/dib97/publications.html.
Blutner, Reinhard. 1998. Lexical pragmatics. Journal of Semantics 15(2). 115–162. doi:10.1093/jos/15.2.115.
Blutner, Reinhard. 2000. Some aspects of optimality in natural language interpretation. Journal of Semantics 17(3). 189–216. doi:10.1093/jos/17.3.189.
Blutner, Reinhard. 2004. Pragmatics and the lexicon. In Laurence Horn & Gregory Ward (eds.), Handbook of pragmatics, 488–514. Oxford: Blackwell.
Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena and the syntax/pragmatics interface. In A. Belletti (ed.), Structures and beyond, 39–103. Oxford: Oxford University Press.
Chierchia, Gennaro. 2006. Broaden your views: implicatures of domain widening and the "logicality" of natural language. Linguistic Inquiry 37(4). 535–590. doi:10.1162/ling.2006.37.4.535.
Chierchia, Gennaro, Danny Fox & Benjamin Spector. 2008. The grammatical view of scalar implicatures and the relation between semantics and pragmatics. In Klaus von Heusinger, Claudia Maienborn & Paul Portner (eds.), Semantics. An international handbook of natural language meaning, Mouton de Gruyter, New York, NY.
Cohen, Ariel. 2005. More than bare existence: An implicature of existential bare plurals. Journal of Semantics 22(4). 389–400. doi:10.1093/jos/ffh031.
Corbett, Greville G. 2000. Number. Cambridge University Press, Cambridge. doi:10.2277/0521640164.
Dalrymple, Mary, Makoto Kanazawa, Yookyung Kim, Sam Mchombo & Stanley Peters. 1998. Reciprocal expressions and the concept of reciprocity. Linguistics and Philosophy 21(2). 159–210. doi:10.1023/A:1005330227480.
Farkas, Donka F. 2006. The unmarked determiner. In Svetlana Vogeleer & Liliane Tasmowski de Rijk (eds.), Non-definiteness and plurality, 81–106. John Benjamins, Amsterdam.
Farkas, Donka F. & Henriëtte E. de Swart. 2003. The semantics of incorporation: from argument structure to discourse transparency. CSLI Publications, Stanford: CA.
Farkas, Donka F. & Draga Zec. 1995. Agreement and pronominal reference. In Guglielmo Cinque & Giuliana Giusti (eds.), Advances in Roumanian linguistics, 83–101. John Benjamins, Amsterdam.
Fauconnier, Gilles. 1979. Implication reversal in natural language. In Franz Guenthner & Siegfried J. Schmidt (eds.), Formal semantics and pragmatics for natural languages, Reidel.
Feigenson, Lisa & Susan Carey. 2003. Tracking individuals via object files: evidence from infants’ manual search. Developmental Science 6(5). 568–584. doi:10.1111/1467-7687.00313.
Feigenson, Lisa & Susan Carey. 2005. On the limits of infants’ quantification of small object arrays. Cognition 97(3). 295–313. doi:10.1016/j.cognition.2004.09.010.
Feigenson, Lisa, Susan Carey & Marc Hauser. 2002. The representations underlying infants’ choice of more: object files versus analog magnitudes. Psychological Science 13(2). 150–156. doi:10.1111/1467-9280.00427.
von Fintel, Kai. 1999. NPI-licensing, Strawson entailment and context-dependency. Journal of Semantics 16(2). 97–148. doi:10.1093/jos/16.2.97.
Greenberg, Joseph. 1966. Language universals. Mouton, the Hague.
Guerzoni, Elena & Yael Sharvit. 2007. A question of strength: on NPIs in interrogative clauses. Linguistics and Philosophy 30(3). 361–391. doi:10.1007/s10988-007-9014-x.
Hauser, Marc, Susan Carey & L.B. Hauser. 2000. Spontaneous number representation in semi-free-ranging rhesus monkeys. In Royal Society of London: Biological sciences, vol. 267, 829–833. doi:10.1098/rspb.2000.1078.
Heim, Irene. 1991. Artikel und Definitheit. In Arnim von Stechow & Dieter Wunderlich (eds.), Handbuch der Semantik, Berlin: de Gruyter.
Hendriks, Petra, Helen de Hoop, Henriëtte de Swart & Joost Zwarts. 2010. Conflicts in interpretation. Equinox Publishing (in press), preprint http://www.let.rug.nl/~hendriks/conflict.htm.
Horn, Laurence. 2001. A natural history of negation. CSLI Publications, Stanford: CA.
Ionin, Tania & Ora Matushansky. 2006. The composition of complex cardinals. Journal of Semantics 23(4). 315–360. doi:10.1093/jos/ffl006.
Jäger, Gerhard. 2003. Learning constraint sub-hierarchies. In Reinhard Blutner & Henk Zeevat (eds.), Pragmatics and optimality theory, 251–287. Houndmills: Palgrave MacMillan.
Jakobson, Roman. 1939. Signe zéro. In Mélanges de linguistique offerts à Charles Bally, Genève (also in Selected Writings II).
Kamp, Hans & Jan van Eijck. 1996. Representing discourse in context. In Johan van Benthem & Alice ter Meulen (eds.), Handbook of logic and linguistics, 179–237. Amsterdam: Elsevier.
Kamp, Hans & Uwe Reyle. 1993. From discourse to logic. Dordrecht: Kluwer Academic Publishers.
Kester, Ellen-Petra & Christina Schmitt. 2007. Papiamentu and Brazilian Portuguese: a comparative study of bare nominals. In Marlyse Baptista & Jacqueline Guéron (eds.), Noun phrases in creole languages: a multifaceted approach, Amsterdam: Benjamins.
Kirby, Simon & Jim Hurford. 1997. The evolution of incremental learning: language, development and critical periods. In Antonella Sorace, Caroline Heycock & Richard Shillcock (eds.), Gala ’97 conference on language acquisition, HCRC, Edinburgh University.
Kouider, Sid, Justin Halberda, Justin Wood & Susan Carey. 2006. Acquisition of English number marking: the singular-plural distinction. Language Learning and Development 2. 1–25. doi:10.1207/s15473341lld0201_1.
Krifka, Manfred. 1989. Nominal reference, temporal constitution and quantification in event semantics. In Renate Bartsch, Johan van Benthem & Peter van Emde Boas (eds.), Semantics and contextual expression, Dordrecht: Foris publication.
Krifka, Manfred. 1995. Common nouns: a contrastive analysis of English and Chinese. In Greg N. Carlson & Francis Jeffry Pelletier (eds.), The generic book, 398–411. Chicago University Press. http://amor.rz.hu-berlin.de/~h2816i3x/.
Kwon, Song-Nim & Anne Zribi-Hertz. 2006. Bare objects in Korean: (pseudo)incorporation and (in)definiteness. In Svetlana Vogeleer & Liliane Tasmowski de Rijk (eds.), Non-definiteness and plurality, 107–132. John Benjamins, Amsterdam.
Ladusaw, William. 1996. Negation and polarity items. In Shalom Lappin (ed.), The handbook of contemporary semantic theory, 321–341. Oxford: Blackwell.
Lakoff, Robin. 2000. The language war. University of California Press, Berkeley.
Link, Godehard. 1983. The logical analysis of plural and mass nouns: a lattice-theoretic approach. In Rainer Bäuerle, Christoph Schwarze & Arnim von Stechow (eds.), Meaning, use and interpretation of language, 302–323. Berlin: de Gruyter.
Mattausch, Jason. 2005. On the optimization and grammaticalization of anaphora: Humboldt University Berlin, published as ZAS Papers in Linguistics 38 dissertation. http://www.zas.gwz-berlin.de/mitarb/homepage/mattausch/Dissertation-Mattausch.pdf.
Mattausch, Jason. 2007. Optimality, bidirectionality and the evolution of binding phenomena. Research on Language and Computation 5(1). 103–131. doi:10.1007/s11168-006-9018-7.
McCawley, Jim. 1981. Everything that linguists have always wanted to know about logic (but were ashamed to ask). Chicago: University of Chicago Press (2nd edition 1993).
Nouwen, Rick & Bart Geurts. 2007. At least et al.: the semantics of scalar modifiers. Language 83. 533–559. http://ncs.ruhosting.nl/bart/.
van Rooij, Robert. 2004. Signalling games select Horn strategies. Linguistics and Philosophy 27(4). 493–527. doi:10.1023/B:LING.0000024403.88733.3f.
Rullmann, Hotze. 2003. Bound-variable pronouns and the semantics of number. In Western conference on linguistics, vol. 14, 243–254. WECOL 2002. http://semanticsarchive.net/Archive/DM3ODk0N/.
Rullmann, Hotze & Aili You. 2003. General number and the semantics and pragmatics of indefinite bare nouns in Mandarin Chinese. ms. UBC. http://semanticsarchive.net/Archive/jhlZTY3Y/.
Sauerland, Uli. 2003. A new semantics for number. In Rob Young & Yuping Zou (eds.), Salt 13, vol. 13, 258–275. CLC publications. http://semanticsarchive.net/Archive/TM0YjdiO/salt03paper.pdf.
Sauerland, Uli. 2008. On the semantic markedness of phi-features. In Harbour, D. et al. (ed.), Phi theory, 57–83. Oxford: Oxford University Press. http://www.zas.gwz-berlin.de/home/sauerland/downloads.html.
Sauerland, Uli, J. Anderssen & J. Yatsushiro. 2005. The plural is semantically unmarked. In Stephan Kepser & Marga Reis (eds.), Linguistic evidence, de Gruyter. doi:10.1515/9783110197549.413.
Spector, Benjamin. 2007. Aspects of the pragmatics of plural morphology: On higher-order implicatures. In Uli Sauerland & Penka Stateva (eds.), Presuppositions and implicatures in compositional semantics, 243–281. Palgrave/MacMillan. http://lumiere.ens.fr/~bspector/.
de Swart, Henriëtte. 2006. Marking and interpretation of negation: a bidirectional OT approach. In Raffaella Zanuttini, Héctor Campos, Elena Herburger & Paul Portner (eds.), Negation, tense and clausal architecture: Cross-linguistic investigations, 199–218. Georgetown: Georgetown University Press. http://www.let.uu.nl/~Henriette.deSwart/personal/negot.pdf.
de Swart, Henriëtte. 2010. Expression and interpretation of negation: an OT typology. Dordrecht: Springer (in press).
de Swart, Henriëtte & Joost Zwarts. 2008. Article use across languages: an OT typology. In Atle Grønn (ed.), Sinn und Bedeutung, vol. 12, 628–644. University of Oslo.
de Swart, Henriëtte & Joost Zwarts. 2009. Less form, more meaning: why bare nominals are special. Lingua 119(2). 280–295. doi:10.1016/j.lingua.2007.10.015.
de Swart, Henriëtte & Joost Zwarts. 2010. Optimization principles in the typology of number and articles. In Bernd Heine & Heiko Narrog (eds.), Handbook of linguistic analysis, Oxford: Oxford University Press. http://www.let.uu.nl/~Henriette.deSwart/personal/oupdeSwartZwartsmay08.pdf.
Winter, Yoad. 2001. Plural predication and the strongest meaning hypothesis. Journal of Semantics 18(4). 333–365. doi:10.1093/jos/18.4.333.
Wood, Justin, Sid Kouider & Susan Carey. 2004. The emergence of singular/plural distinction. Poster presented at the biennial International Conference on Infant Studies. http://www.wjh.harvard.edu/~lds/index.html?carey.html.
Zwarts, Joost. 2004. Competition between word meanings: the polysemy of around. In Sinn und Bedeutung, 349–360. Konstanz. http://www.let.uu.nl/users/Joost.Zwarts/personal/.
Zweig, Eytan. 2008. Dependent plurals and plural meaning: NYU dissertation. http://www-users.york.ac.uk/~ez506/.
Donka Farkas, Department of Linguistics, University of California at Santa Cruz, Stevenson College, 1156 High Street, Santa Cruz, CA 95064, USA
Henriëtte de Swart, Department of Modern Languages, Utrecht University, Trans 10, 3512 HD Utrecht, The Netherlands
Semantics & Pragmatics Volume 3, Article 2: 1–13, 2010 doi: 10.3765/sp.3.2
Embedded Implicatures and Experimental Constraints: A Reply to Geurts & Pouscoulous and Chemla∗ Uli Sauerland Zentrum für Allgemeine Sprachwissenschaft, Berlin
Received 2009-11-13 / First Decision 2009-11-22 / Revised 2009-12-11 / Accepted 2009-12-08 / Published 2010-01-25
Abstract Experimental evidence on embedded implicatures by Chemla (2009b) and Geurts & Pouscoulous (2009a) has fewer theoretical consequences than assumed: On the one hand, the evidence successfully argues against obligatory local implicature computation, which has however already been discredited. On the other hand, the data are fully consistent with optional local implicature computation.
Keywords: conversational implicature, embedded implicature, experimental pragmatics, free-choice permission, truth dominance
Both Chemla (2009b) (C in the following) and Geurts & Pouscoulous (2009a) (G&P in the following) provide welcome new experimental evidence on embedded implicatures in recent papers in this journal. However, while their work takes us a couple of steps closer to a full understanding of the issue, I will argue that in both papers the theoretical implications of the new data are overstated and much work remains to be done. Intuitively clear cases of embedded implicatures are examples like (1).
(1) a. If you ate some of the cookies and no one else ate any, then there must still be some left. (Levinson 2000: 205)
b. Mary solved the first problem or the second problem or both problems. (Chierchia et al. 2008: (31))
∗ I thank Nicole Gotzner, Lisa Hartmann and the editors of this journal for their help with this paper, and the German Research Foundation (DFG grant SA 925/1 in the Emmy Noether Programm) for financial support. ©2010 Uli Sauerland This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).
Here the implicatures of some and or are part of the truth-conditional content of an embedded sentence: the antecedent of the conditional in (1a), which is understood as if you ate some and not all of the cookies, and a disjunct in (1b), which is understood as Mary solved either the first problem or the second problem and not both. In these examples, the sentence without the embedded implicature would be either contradictory (If you ate some or all of the cookies, then there must still be some left) or a violation of a pragmatic constraint (#Mary solved at least one of the problems or both problems, see Singh 2008). The question theorists of all stripes are faced with is whether and how to integrate these phenomena into a general theory of implicatures, or at least of quantity or scalar implicatures. Some narrower directions that have been pursued to address the general question raised by embedded implicatures are listed in (2).
(2) I. How frequently and under what conditions do embedded implicatures arise?
II. Are embedded implicatures a uniform phenomenon? Or more specifically: How many mechanisms can give rise to embedded implicatures?
III. Are embedded implicatures really implicatures? Or more specifically: When is the mechanism giving rise to embedded implicatures the same one as the mechanism giving rise to global implicatures?
G&P and C both focus on the first of these directions, and take this discussion as far as it can be taken presently. However, I argue that pursuing the first direction alone is insufficient to fully resolve the issues embedded implicatures raise. In particular, I show that independent pragmatic constraints, above all the constraint of Truth Dominance (Meyer & Sauerland 2009), predict conditions on when embedded implicatures can be detected that are largely independent of the account of embedded implicatures assumed. Therefore, the observations on the presence and absence of (embedded) implicatures by G&P and C are consistent with a much wider range of theories of implicatures than the original papers claim. In the second section of the paper, I address a finding by C on the embedding of free-choice effects that speaks to directions II and III of (2). I argue that this finding is more significant for the account of embedded implicatures and speculate on two theoretical ideas that would account for it. I conclude that C’s second result is the most important one for the theory of implicatures from these two papers.
Embedded Implicatures and Experimental Constraints
1 Embedded Implicatures are Still Repairs
G&P are exclusively concerned with the first issue of (2). The primary target of G&P is an extreme view of localism espoused by Levinson (2000) and Chierchia (2004).1 The Levinson/Chierchia view predicts that implicatures should always be fully local unless a cancellation mechanism applies. A number of people have noted that this prediction seems to be intuitively wrong in many cases. Specifically, this holds in case the embedded implicature is not needed to make the sentence coherent (see for instance Geurts 2009; Russell 2006; Sauerland 2004b). Compare (3) with (1): Intuitively, (3a) does not seem to mean the same as If you ate some but not all of the cookies, then you must have liked them. And for the multiple disjunction in (3b), the paraphrase Mary solved either exactly one or all three of the problems, which local computation of implicatures predicts, is clearly off the mark. (3)
a. If you ate some of the cookies, then you must have liked them. b. Mary solved the first problem or the second problem or the third problem.
In these two examples, the addition of a local implicature to the truth conditions results in weaker truth conditions for the entire sentence: in (3a) the implicature trigger occurs in a downward entailing environment, but not in (3b). Such examples show that local implicatures cannot be obligatory.2 Examples where local implicatures would cause truth conditions that are stronger overall are the focus of G&P. The data presented by G&P show to my full satisfaction that the prediction of the proposals of Levinson and Chierchia is wrong also for these cases. Their result is also consistent with other experimental results presented by Chemla (2009a), Schwarz, Clifton & Frazier (2008) and Bezuidenhout, Morris & Widman (2009), who look at different environments: mostly downward entailing cases like negative attitude verbs, the restrictor of a universal quantifier and conditional, but Bezuidenhout et al. (p. 139) also look at the scope of conditionals and present some findings similar to G&P, though less striking.
1 I am not sure whether any researcher active in this area still holds this extreme view: The unpublished paper by Chierchia et al. (2008) that G&P cite seems not fully consistent to me in this regard (Sauerland submitted): initially it adopts the view of Chierchia (2004); later, however, the quite different view of Fox (2007) is assumed without any comment on the shift.
2 Acknowledging this problem, Chierchia (2004) proposes that cancellation of implicatures is obligatory in downward entailing environments. Still, (3b) remains a problem for Chierchia’s proposal.
In sum, the proposals of
Levinson (2000) and Chierchia (2004) are falsified by the experimental data to the extent possible.3 G&P claim their results also argue against another view, which they call Minimal Conventionalism. However, I will show that G&P are mistaken: Actually, their result says nothing about Minimal Conventionalism once we take into account general pragmatic constraints on how ambiguous sentences are judged. To show this, I consider the view of Fox (2007) as a concrete example of G&P’s Minimal Conventionalism. I motivate the general pragmatic principle of Truth Dominance and then argue that G&P’s data are fully consistent with Fox’s (2007) account once Truth Dominance is taken into account. Fox’s (2007) account is non-committal on the locality of implicature computation, allowing it to apply locally, but also globally. He assumes that implicatures can be contributed to the meaning of a sentence by the grammatical operator Exh.4 Fox (2007: 79, 97) defines the Exh operator via the three statements in (4) through (6) (with minor notational adjustments). The operator depends on a contextually provided set of alternative propositions C, which can be taken to be the scalar alternatives of the argument of Exh in the examples in the following. (4)
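$NW_C(p) = \{q \in C \mid p \text{ does not entail } q\}$
(5)
$q$ is innocently excludable given $C$ if and only if $\neg\exists q' \in NW_C(p)\ [p \wedge \neg q] \rightarrow q'$
(6)
$Exh_C(p)(w) \Leftrightarrow p(w) \mathbin{\&} \forall q \in NW_C(p)\ [q \text{ is innocently excludable given } C] \rightarrow \neg q(w)$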
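The three statements in (4) through (6) can be tried out on small finite models. The following is only a minimal sketch, not Fox's own formulation: it assumes that propositions are modelled as sets of worlds, that entailment is the subset relation, and that the alternative sets are the ones listed by hand below. All function and variable names are hypothetical.

```python
from itertools import combinations

def non_weaker(p, alts):
    """(4): the alternatives in C that p does not entail (p is not a subset of q)."""
    return [q for q in alts if not p <= q]

def innocently_excludable(q, p, alts):
    """(5): no q2 in NW_C(p) is entailed by (p and not q)."""
    return not any((p - q) <= q2 for q2 in non_weaker(p, alts))

def exh(p, alts):
    """(6): the p-worlds in which every innocently excludable alternative is false."""
    excluded = [q for q in non_weaker(p, alts) if innocently_excludable(q, p, alts)]
    return frozenset(w for w in p if not any(w in q for q in excluded))

# Assumed model 1: three worlds for "some" and its scalar alternative "all".
NONE, SOME_NOT_ALL, ALL = "none", "some but not all", "all"
some = frozenset({SOME_NOT_ALL, ALL})
all_ = frozenset({ALL})
strengthened_some = exh(some, [some, all_])

# Assumed model 2: a world is the set of problems (1, 2, 3) that Mary solved.
PROBLEMS = (1, 2, 3)
WORLDS = [frozenset(c) for n in range(len(PROBLEMS) + 1)
          for c in combinations(PROBLEMS, n)]

def solved(k):
    """The proposition that Mary solved problem k."""
    return frozenset(w for w in WORLDS if k in w)

A, B, C3 = solved(1), solved(2), solved(3)

# Exh applied to the inner disjunction, then again to the whole disjunction.
inner = exh(A | B, [A | B, A, B, A & B])
nested = exh(inner | C3, [inner | C3, inner, C3, inner & C3])

print(sorted(strengthened_some))
print(sorted(sorted(w) for w in nested))
```

On these assumptions, Exh strengthens some to some but not all, and nesting Exh in a three-way disjunction yields the exactly one or all three paraphrase mentioned for (3b). The choice of alternatives for the nested case is an assumption of the sketch, and other choices would give other results.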
Consider now Fox’s (2007) account for example (7). The account predicts an ambiguity between a local+global reading, which corresponds to structure (8a), and a global-only reading, which corresponds to structure (8b). (7)
All the squares are connected with some of the circles.
(G&P: (26a))
3 Of course, there are always ways to save any scientific theory by adding additional assumptions, but nothing short of almost obligatory local cancellation of the proposed obligatory local implicatures would seem to do the trick in this case. 4 There are two major differences between Fox’s account and that of Chierchia (2004): First, Fox does not require local application of implicature computation. And second, his Exh operator is different from Chierchia’s due to the appeal to innocent excludability. The second difference does not matter for the following, but is important for the analysis of disjunctions (Sauerland submitted).
(8)
a. Exh All the squares λx Exh x be connected with some of the circles. b. Exh All the squares λx x be connected with some of the circles.
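The logical relation between the two structures in (8) can be checked by brute force on a small model. The sketch below rests on assumptions that are not part of the original text: a model with two squares and two circles, reading (8a) paraphrased as every square is connected with some but not all of the circles (ignoring any contribution of the outer Exh), and reading (8b) paraphrased as every square is connected with some of the circles, and not every square is connected with all of them. All names are hypothetical.

```python
from itertools import product

SQUARES = ("s1", "s2")
CIRCLES = ("c1", "c2")
PAIRS = [(s, c) for s in SQUARES for c in CIRCLES]

def situations():
    """Yield every possible connection relation between squares and circles."""
    for bits in product((False, True), repeat=len(PAIRS)):
        yield frozenset(pair for pair, on in zip(PAIRS, bits) if on)

def links(sit, square):
    """Number of circles the given square is connected with."""
    return sum((square, c) in sit for c in CIRCLES)

def reading_8a(sit):
    """Assumed paraphrase of (8a): every square has some but not all circles."""
    return all(0 < links(sit, s) < len(CIRCLES) for s in SQUARES)

def reading_8b(sit):
    """Assumed paraphrase of (8b): every square has some circle, not every square has all."""
    every_some = all(links(sit, s) > 0 for s in SQUARES)
    every_all = all(links(sit, s) == len(CIRCLES) for s in SQUARES)
    return every_some and not every_all

ALL_SITUATIONS = list(situations())
a_entails_b = all(reading_8b(s) for s in ALL_SITUATIONS if reading_8a(s))
b_entails_a = all(reading_8a(s) for s in ALL_SITUATIONS if reading_8b(s))
distinguishing = [s for s in ALL_SITUATIONS if reading_8a(s) != reading_8b(s)]

print(a_entails_b, b_entails_a, len(distinguishing))
```

In this model, every situation that distinguishes the two readings makes the paraphrase of (8b) true and the paraphrase of (8a) false, which is the configuration that matters for the argument.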
The two readings stand in a special logical relationship: the local-global reading logically entails the weaker, global-only reading. From work on scope ambiguity resolution, it is independently known that speakers’ intuitions are affected by the entailment relation between the two readings. I adopt the principle Truth Dominance from Meyer & Sauerland (2009) to account for this effect because one case they consider is exactly analogous to (8),5 namely, the German example (9). Most theories of quantifier scope in German predict (9) to be ambiguous between two structural representations that should give rise to the two readings given below (9). However, previous researchers (Büring & Hartmann 2001; Reis 2005) have noted that (9) seems to lack the second one of these readings: the reading where the postverbal subject takes scope over the sentence-initial object (the every only reading in (9)). (9)
Nur Maria liebt jeder.
only Mary[acc] loves everyone.nom
only every: ∀y (y = Mary ↔ ∀x love(x, y))
[every only: ∀x ∀y (y = Mary ↔ love(x, y))]
Meyer & Sauerland (2009) explain the lack of evidence for the every only reading by arguing that this reading cannot be detected for pragmatic reasons. Specifically, the Truth Dominance principle in (10) predicts it to be undetectable: Because the strong, every only reading entails the weak, only every reading, any situation where the truth values of the two readings differ is one where the strong reading is false while the weak one is true. But Truth Dominance predicts that in such a situation, speakers will judge the sentence to be true, as the weak reading predicts it to be. The strong reading therefore remains undetectable in the truth conditions of (9). (10)
Truth Dominance: Whenever an ambiguous sentence S is true in a situation on its most accessible reading, we must judge sentence S to be true in that situation. (Meyer & Sauerland 2009: (1))
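The effect of (10) on the two formulas given for (9) can be illustrated on a finite model. This is a sketch under several assumptions that go beyond the text: a domain of three individuals, the surface-scope (only every) formula taken as the most accessible reading, and a sentence that is false on all of its readings judged false. Principle (10) itself only says when a sentence must be judged true, so the function returns None where it is silent. All names are hypothetical.

```python
from itertools import product

PEOPLE = ("Mary", "Anna", "Ben")
PAIRS = [(x, y) for x in PEOPLE for y in PEOPLE]  # (lover, loved)

def situations():
    """Yield every possible love relation over the domain."""
    for bits in product((False, True), repeat=len(PAIRS)):
        yield frozenset(pair for pair, on in zip(PAIRS, bits) if on)

def only_every(loves):
    """For all y: y = Mary iff everyone loves y."""
    return all((y == "Mary") == all((x, y) in loves for x in PEOPLE)
               for y in PEOPLE)

def every_only(loves):
    """For all x, y: y = Mary iff x loves y."""
    return all((y == "Mary") == ((x, y) in loves)
               for x in PEOPLE for y in PEOPLE)

def judge(loves, readings):
    """Predicted judgment; readings are ordered from most to least accessible."""
    if readings[0](loves):
        return True   # Truth Dominance (10)
    if not any(r(loves) for r in readings):
        return False  # assumption: false on every reading means judged false
    return None       # (10) makes no prediction here

SITUATIONS = list(situations())
READINGS = [only_every, every_only]

strong_entails_weak = all(only_every(s) for s in SITUATIONS if every_only(s))
weak_entails_strong = all(every_only(s) for s in SITUATIONS if only_every(s))
judgments_track_weak = all(judge(s, READINGS) == only_every(s) for s in SITUATIONS)

print(strong_entails_weak, weak_entails_strong, judgments_track_weak)
```

In this model the every only formula entails the only every formula but not the other way around, and the predicted judgments coincide with the truth values of the only every formula in every situation. No situation therefore provides evidence for the stronger reading.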
5 The principle can be traced back at least to work on wide scope indefinites by Abusch (1994). Gualmini, Hulsey, Hacquard & Fox (2008) call a similar principle Charity. The differences between Charity and Truth Dominance are not relevant to the discussion in this paper. In fact, Charity would make exactly the same predictions as Truth Dominance for the examples in the following.
Principle (10) is a well-supported pragmatic principle: As already mentioned, work by Abusch (1994) and Gualmini et al. (2008) provides further support for a principle like (10) and in addition, principle (10) makes pragmatic sense as a principle of cooperative behavior in discourse. Principle (10) is directly relevant for determining the predictions of Fox’s (2007) analysis of implicatures in the following way. As discussed above, Fox’s account predicts (7) to be ambiguous between the two readings represented in (8). However, reading (8a) entails (8b), so the same situation obtains as with the two readings of (9). Principle (10) entails for (7) that only reading (8b) can be empirically detected for (7).6 Indeed, G&P argue that only reading (8b) is empirically supported by the judgments of the subjects in their experiments, which is the judgment that Fox’s ambiguity account together with Truth Dominance predict. The experimental results of G&P are therefore fully consistent with Fox’s proposal and what G&P call Minimal Conventionalism more generally. The preceding discussion does not entail that Fox’s (2007) account is without problems: Fox’s account makes the wrong prediction for cases like (3) because the local application of implicature computation leads to a weaker interpretation than the one actually attested. In this case, the local implicature would be detectable, but is actually not attested. Fox briefly entertains two suggestions that would address this shortcoming (Fox 2007: page 82), but both fall short. One suggestion is to only compute local implicatures if they strengthen the sentence meaning. The other suggestion is to only permit global application of his Exh operator. Both of these suggestions solve the problem of (3), but leave Fox with no account for (1). So, Fox’s (2007) account would need to be amended further. 
For example, the empirical problems would be solved by stipulating that embedded Exh is blocked unless an inconsistency or pragmatic violation results otherwise.7 The important point for our present purposes, though, is that G&P’s data do not bear on Fox’s (2007) account and probably others that G&P would characterize as Minimal Conventionalism. The above discussion also shows how difficult it is to address the puzzle posed by embedded implicatures by just looking at the distribution of embedded implicatures.
6 To be more precise, the application of Truth Dominance here assumes that (8b) represents a more accessible syntactic parse than (8a). Since (8b) contains fewer silent operators, this assumption is independently justified.
7 This is essentially a more specific statement of the view of Sauerland (2004a) that embedded implicatures are a repair strategy.
I conclude
therefore that a comparison of the properties of embedded implicatures with non-embedded ones may be a more promising direction to pursue than focusing solely on the distribution of embedded implicatures. Such attention to the properties of implicatures would address both II and III of the questions in (2). In the next section, I focus on one aspect of the data reported by C that points in this direction.
2 Are Embedded Implicatures Implicatures?
The results of C (= Chemla 2009b) add one new aspect, but are otherwise consistent with the picture already summarized: The results show that obligatory localism is false, but don’t distinguish between other views. In particular, C’s discussion of examples like (11a) and (11b) is limited in the same way as the discussion of (7) by G&P: the only theory Chemla’s result argues against is the extreme localism of Levinson (2000) and Chierchia (2004), which is already known to have numerous problems. More viable views of localism, where embedded implicatures are an option, but not required, make exactly the right predictions for both examples in (11), namely the same predictions as a global account. (11)
a. Every student read some of the books. b. No student read all the books.
The most interesting result of C’s study is the embedded free choice effect in examples like (12). He shows experimentally that subjects judge (12) to entail that every student is allowed to have an apple and also that every student is allowed to have a banana. (12)
Every student is allowed to have an apple or a banana.
(C: (12b))
Chemla’s observation is interesting because it shows a difference between free choice effects and scalar implicatures, which are not as frequently locally computed in the same environment. This difference may bear on the second and third of the questions in (2). Unfortunately, Chemla’s theoretical discussion is limited to the account of Fox (2007) and on one point even mistaken. Chemla compares the two versions of Fox’s (2007) proposal I already mentioned above: either permitting embedded occurrences of Exh or restricting Exh to one occurrence with clausal scope per utterance. The non-deterministic former view predicts (12) to be ambiguous between the two
representations in (13), while the latter globalist view permits only representation (13a). (13)
a. Exh Every student λx x is allowed to have an apple or a banana. b. Exh Every student λx Exh x is allowed to have an apple or a banana.
Chemla focuses on the fact that Fox’s globalist view incorrectly predicts that (12) should be restricted to scenarios where not all students make the same choice, since (13a) entails that neither every student is allowed to have an apple nor every student is allowed to have a banana. What Chemla fails to note, though, is that the optionally local version of Fox’s proposal also predicts (13a) as a possible reading for (12). In particular, neither does (13a) entail (13b), nor vice versa, and therefore both readings should be detectable. But this doesn’t seem to be the case and therefore (12) is also a problem for the non-deterministic version of Fox’s proposal, not just for the global one. Since I argued above that both versions are independently problematic, Chemla’s new evidence just strengthens the point against both proposals. The main conclusions I draw from Chemla’s paper concern a) the status of free choice inferences, and b) the relation of embedded to global implicatures. Chemla’s data only speak to my questions in (2) if we assume that free choice inferences are indeed implicatures. Chemla’s data actually cast this relationship in doubt. There is not that much empirical evidence in favor of the relationship in the first place: the main direct piece of evidence in favor of an implicature account of free choice inferences is the observation by Kratzer & Shimoyama (2002) that the inferences disappear in the scope of negation just like implicatures.8 However, C shows two differences between free choice inferences and implicatures: First, only free choice inferences are locally present in the scope of a universal quantifier as I already referenced above. Second, C shows that negated modalized statements like (14a) don’t trigger free choice inferences. Since (14b) is logically equivalent to (14a) the absence of a free choice inference in (14b) shows that free-choice inferences are not detachable in the sense of Grice (1989). 
Usually implicatures are detachable as, for example, Grice already discusses. (14)
a. John is allowed to not do A or not do B. b. John is not required to do A and B.
(Chemla 2009b: (15a))
8 Furthermore, free choice inferences can also be cancelled like other implicatures as in You may have an apple or a banana, but I don’t know which.
C’s results are intuitively plausible and very interesting for the theory of free-choice inferences. As far as I can see, there are two possible directions to pursue. On the one hand, one could seek to treat free choice inferences not as implicatures. Specifically, C’s result could be seen to support non-implicature accounts of free choice such as Zimmermann (2000). On the other hand, it may be that matrix free-choice effects are still implicatures, but embedded free choice effects may be due to a special free-choice inference generating operator. The latter position should be attractive to those who believe that there is a satisfying analysis of free-choice inferences as an implicature (Fox 2007; Schulz 2005).
3 Conclusions
In sum, the recent experimental work by G&P and C has confirmed the views of those who have argued against the obligatory localism of Levinson (2000) and Chierchia (2004), e.g. Geurts (2009), Russell (2006), and Sauerland (2004b). Beyond that, the account of embedded implicatures and their relation to global implicatures are still unclear. Solely testing for the presence of embedded implicatures as G&P and C mostly do may be insufficient for understanding embedded implicatures. Rather it may be more promising to investigate whether the content of embedded implicatures is exactly the same as that of implicatures at the matrix level. In this direction, the difference C observes between embedded and matrix implicatures is interesting and most likely helpful in sorting out the puzzle of embedded implicatures. While I have no complete account to offer myself, I close with some arguments to be skeptical of Geurts & Pouscoulous’s (2009b) analysis of Chemla’s example (12): Geurts & Pouscoulous (2009b) suggest accounting for (12) as an instance of an embedded speech act. This account, I argue now, is plausible for some cases, but most likely cannot cover all cases of embedded implicatures: The possibility of embedded speech acts has been acknowledged at least since Huddleston (1973) and embedded speech acts can certainly be a source of embedded implicatures.9
9 The idea of a metalinguistic negation of Horn (1985) is closely related to the idea of embedded implicatures, but more specific since it assumes a restriction to negation.
Example (15) illustrates that embedded speech acts must trigger embedded implicatures: the modal particle wohl (‘well’) requires an embedded speech act interpretation for the complement of glaubt (‘believes’) and furthermore triggers an inference that
the speaker also believes the complement clause. (15)
#Bill glaubt, dass einige der Kinder wohl krank sind. Aber alle Kinder sind krank.
Bill believes that some of the children wohl sick are but all children are sick.
However, this alone doesn’t predict correctly that (15) is odd. The oddness of (15) is only predicted if there is also an embedded implicature. The embedded implicature is the reason that the stronger belief, that some, but not all children are sick, is attributed to the speaker. Then (15) is predicted to be odd because the second sentence explicitly contradicts this attribution of the embedded implicature to the speaker. This example indicates that embedded speech acts trigger embedded implicatures as all theories of speech acts would predict.10 However, I do not believe that the reverse entailment also holds, namely that an embedded implicature is always triggered by an embedded speech act. One problem for this entailment is the following: Krifka (2001) argues that most does not allow embedding of speech acts in its scope. Hence, (16) should not allow embedded free choice inferences unlike (12). However, this doesn’t accord with my intuitions: (16) suggests that most students can choose freely. For instance, consider (16) in the following scenario: the majority of students can freely choose between A and B and the few other students, who cannot freely choose, must do option A. In this scenario (16) seems acceptable to me, even though it may happen that not a single student chooses option B. The acceptability of (16) in such a scenario is only expected if the free choice inference is embedded in the scope of most. (16)
Most students are allowed to do A or B.
The embedded free choice inferences in (16) couldn’t be due to an embedded speech act if Krifka (2001) is correct that most blocks embedded speech acts. Therefore, (16) presents a problem for the proposal to derive all embedded implicatures from embedding of speech acts. Some further data that are problematic for the idea of deriving all embedded implicatures from embedded speech acts are discussed by Sauerland (2004a).
10 This prediction, of course, arises to the extent that theories of speech acts permit embedding of speech acts in the first place.
Therefore, I conclude that contrary to Geurts & Pouscoulous’s (2009b) opinion, Chemla’s
data in (12) are still in need of an account. And the search for such an account may finally really lead us to a better understanding of embedded implicatures.
References
Abusch, Dorit. 1994. The scope of indefinites. Natural Language Semantics 2(2). 83–135. doi:10.1007/BF01250400.
Bezuidenhout, Anne, Robin Morris & Cintia Widman. 2009. The DE-blocking hypothesis: The role of grammar in scalar reasoning. In Sauerland & Yatsushiro (2009), 124–144.
Büring, Daniel & Katharina Hartmann. 2001. The syntax and semantics of focus-sensitive particles in German. Natural Language & Linguistic Theory 19(2). 229–281. doi:10.1023/A:1010653115493.
Chemla, Emmanuel. 2009a. An experimental approach to adverbial modification. In Sauerland & Yatsushiro (2009), 249–263.
Chemla, Emmanuel. 2009b. Universal implicatures and free choice effects: Experimental data. Semantics and Pragmatics 2(2). 1–33. doi:10.3765/sp.2.2.
Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena, and the syntax/pragmatics interface. In Adriana Belletti (ed.), Structures and beyond, 39–103. Oxford, UK: Oxford University Press.
Chierchia, Gennaro, Benjamin Spector & Danny Fox. 2008. The grammatical view of scalar implicatures and the relationship between semantics and pragmatics. Ms, to appear in Maienborn, Claudia, Klaus von Heusinger, and Paul Portner (eds.) Handbook of Semantics.
Fox, Danny. 2007. Too many alternatives: Density, symmetry and other predicaments. Semantics and Linguistic Theory 17. doi:1813/11295.
Geurts, Bart. 2009. Scalar implicature and local pragmatics. Mind and Language 24(1). 51–79. doi:10.1111/j.1468-0017.2008.01353.x.
Geurts, Bart & Nausicaa Pouscoulous. 2009a. Embedded implicatures?!? Semantics and Pragmatics 2(4). 1–34. doi:10.3765/sp.2.4.
Geurts, Bart & Nausicaa Pouscoulous. 2009b. Free choice for all: a response to Emmanuel Chemla. Semantics and Pragmatics 2(5). 1–10. doi:10.3765/sp.2.5.
Grice, Paul. 1989. Studies in the way of words. Cambridge, MA: Harvard University Press.
Gualmini, Andrea, Sarah Hulsey, Valentine Hacquard & Danny Fox. 2008. The question-answer requirement for scope assignment. Natural Language Semantics 16(3). 205–237. doi:10.1007/s11050-008-9029-z.
Horn, Laurence R. 1985. Metalinguistic negation and pragmatic ambiguity. Language 61(1). 121–174.
Huddleston, Rodney. 1973. Embedded performatives. Linguistic Inquiry 4(4). 539–541.
Kratzer, Angelika & Junko Shimoyama. 2002. Indeterminate pronouns: The view from Japanese. In Yukio Otsu (ed.), Proceedings of the Third Tokyo Conference on Psycholinguistics, 1–25. Tokyo: Hituzi Syobo.
Krifka, Manfred. 2001. Quantifying into question acts. Natural Language Semantics 9(1). 1–40. doi:10.1023/A:1017903702063.
Levinson, Stephen C. 2000. Presumptive meanings. Cambridge, Mass.: MIT Press.
Meyer, Marie-Christine & Uli Sauerland. 2009. A pragmatic constraint on ambiguity detection: A rejoinder to Büring and Hartmann and to Reis. Natural Language & Linguistic Theory 27(1). 139–150. doi:10.1007/s11049-008-9060-2.
Reis, Marga. 2005. On the syntax of so-called focus particles in German: a reply to Büring and Hartmann 2001. Natural Language & Linguistic Theory 23(2). 459–483. doi:10.1007/s11049-004-0766-5.
Russell, Ben. 2006. Against grammatical computation of scalar implicatures. Journal of Semantics 23(4). 361–382. doi:10.1093/jos/ffl008.
Sauerland, Uli. 2004a. On embedded implicatures. Journal of Cognitive Science 5(1). 107–137.
Sauerland, Uli. 2004b. Scalar implicatures in complex sentences. Linguistics and Philosophy 27(3). 367–391. doi:10.1023/B:LING.0000023378.71748.db.
Sauerland, Uli. submitted. Disjunction and implicatures: Some notes on recent developments. In Chungmin Lee (ed.), Proceedings of the CIL18 workshop on contrastiveness in information structure and/or scalar implicatures.
Sauerland, Uli & Kazuko Yatsushiro (eds.). 2009. Semantics and pragmatics: From experiment to theory. Basingstoke, UK: Palgrave Macmillan.
Schulz, Katrin. 2005. A pragmatic solution for the paradox of free choice permission. Synthese 147(2). 343–377. doi:10.1007/s11229-005-1353-y.
Schwarz, Florian, Charles Jr. Clifton & Lyn Frazier. 2008. Strengthening ‘or’: Effects of focus and downward entailing contexts on scalar implicatures. To appear in UMOP 37.
Singh, Raj. 2008. On the interpretation of disjunction: Asymmetric, incremental, and eager for inconsistency. Linguistics and Philosophy 31(2). 245–260. doi:10.1007/s10988-008-9038-x.
Zimmermann, Thomas Ede. 2000. Free choice disjunction and epistemic possibility. Natural Language Semantics 8(4). 255–290. doi:10.1023/A:1011255819284.
Uli Sauerland
Zentrum für Allgemeine Sprachwissenschaft
Schützenstr. 18
D-10117 Berlin
Semantics & Pragmatics Volume 3, Article 8: 1–57, 2010 doi: 10.3765/sp.3.8
Varieties of conventional implicature∗ Eric McCready Aoyama Gakuin University
Received 2009-11-30 / First Decision 2010-03-08 / Revised 2010-04-12 / Accepted 2010-05-08 / Final Version Received 2010-05-24 / Published 2010-07-29
Abstract This paper provides a system capable of analyzing the combinatorics of a wide range of conventionally implicated and expressive constructions in natural language via an extension of Potts’s (2005) $\mathcal{L}_{CI}$ logic for supplementary conventional implicatures. In particular, the system is capable of analyzing objects of mixed conventionally implicated/expressive and at-issue type, and objects with conventionally implicated or expressive meanings which provide the main content of their utterances. The logic is applied to a range of constructions and lexical items in several languages.
Keywords: conventional implicature, mixed content, type logic, resource sensitivity, expressive content
1 Introduction
The nature of conventional implicatures has been under debate since their existence was proposed by Grice (1975). Some philosophers deny that there are such things at all (Bach 1999). In linguistic semantics, however, there has been a recent surge of interest in their analysis, starting with the work of Potts (2005). The work of Potts in this area has centered on conventional implicatures that provide content which supplements the main, at-issue content of the sentence in which they are used.
∗ Thanks to Daniel Gutzmann, Yurie Hara, Makoto Kanazawa, Stefan Kaufmann, Chris Potts, Magdalena Schwager, Yasutada Sudo, Wataru Uegaki, Ede Zimmermann, and audiences at NII, Kyoto University and the University of Göttingen for helpful discussion, and in particular to three anonymous reviewers for Semantics and Pragmatics, as well as David Beaver and Kai von Fintel, for extremely useful and insightful comments.
©2010 Eric McCready This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).
Eric McCready
(1)
a. John, a banker I know, played golf with Bernie yesterday.
b. Frankly speaking, I don’t know what you’re talking about.
Here, the content of the nominal appositive in (1a) and that of the speaker-oriented adverbial in (1b) add content to the utterance, but in a way intuitively independent of the claim the speaker intends to make by her utterance. Notice also that the appositive and adverbial only introduce conventionally implicated content; they add nothing to the ‘at-issue’ content. This is characteristic of all the elements studied by Potts.1 A number of authors (e.g. Bach 2006; Williamson 2009) have noted that not all lexical items (or constructions) are associated exclusively with at-issue content, or with conventionally implicated or expressive (CIE) content;2 instead, some expressions seem to introduce both. Pejoratives are the most widely cited example. Williamson discusses an example from Dummett (1973), the (extinct) pejorative Boche, which according to Williamson was in use in Britain and France in the initial stages of WW1 in anti-German propaganda. This choice is presumably made to avoid other expressions that are more obviously offensive to the modern reader. However, the obsolete nature of Boche makes it difficult to have clear intuitions about sentences in which it is used. I will therefore make use of the pejorative Kraut instead, as an example of a pejorative that, while still attested, is probably milder and less offensive than some other possible choices.3 In any case, all instances of pejoratives in this paper are data; they are mentioned, not used. (2)
He is a Kraut.
Pejoratives plainly introduce what I will call mixed content: they are predicative of at-issue content, yet introduce a conventional implicature.
1 It is still possible that these expressions could be presuppositional in nature, rather than part of a separate class of conventionally implicated meanings, as suggested by a reviewer. I find the arguments of Potts on this issue (2005, 2007a, and 2007b) convincing, but I will return to the point below.
2 ‘CIE content’ is intended as a neutral term for conventionally implicated and expressive content. In this paper, the assumption is made that, to a first approximation, both conventional implicatures and expressives make use of roughly the same combinatoric system. Where the distinction matters, I will not use the cover term.
3 I thank David Beaver and Kai von Fintel for helping me with the difficult choice of which pejorative would have the desired qualities of being both relatively common and relatively inoffensive. Hom (2008), faced with a similar decision, makes use of Chink, which is perhaps fairly similar in quality.
I will
provide a more detailed characterization of mixed content expressions in the next section. Potts’s core logic is not able to handle examples of this sort of mixed content (due to limitations imposed by the type system) without additional, costly assumptions about semantic decomposition.4 The first goal of this paper is to provide a system capable of analyzing mixed content without such assumptions; formally this corresponds to an extension $\mathcal{L}^{+}_{CI}$ of Potts’s original (2005) system $\mathcal{L}_{CI}$, which is the most explicit theory of CIE content presently available, and the one that is best understood. This is done in section 2, where I also discuss and analyze other cases of mixed content. The research of Potts and others suggests that conventionally implicated content is supplementary by nature, a conclusion embodied in the original system $\mathcal{L}_{CI}$, as will be shown in section 2.2. The cases of mixed content to be discussed do not significantly alter this picture: although mixed content elements introduce content in both at-issue and CIE dimensions, there is clearly a sense in which the CIE content remains supplementary to the at-issue content. $\mathcal{L}_{CI}$ does allow for non-supplementary content of propositional type, which is one way to view purely expressive single-expression utterances, as I will show in section 3. In cases where combinatorics come into play, however, only supplementary interpretations are available. The extended system $\mathcal{L}^{+}_{CI}$ enables non-supplementary interpretations for reasons explained in detail in sections 2.3 and 3.1: briefly, a new set of types turns out to be necessary for the mixed content cases, given the combinatoric rules of $\mathcal{L}_{CI}$ (which I will argue should be kept intact). These cases cannot be analyzed in $\mathcal{L}_{CI}$ at all. Section 3, after discussing some instances of single-expression utterances, argues that there are reasons that one might want the additional possibilities given by these new types. 
Two main reasons are discussed: first, cases of elements that are able to modify certain kinds of CIE elements but not others (a possibility already disallowed by LCI ) and, second, cases of multiexpression utterances that seem to lack at-issue content. The main cases focused on are stand-alone particles and the Japanese adverbial yokumo (cf. McCready 2004). Section 4 presents an analysis of Quechua evidentials (following the basic picture presented by Faller 2002) that treats parts of their content as CIE; this analysis makes use of the full system introduced here. The analyses proposed here, if correct, have substantial implications for our understanding of CIE elements, and possibly for other semantic elements as well. Section 5, the conclusion to the paper, discusses some of 4 Williamson comes to the same conclusion about pejorative items, indeed noting that Potts must allow for mixed content to analyze them (his note 16).
Eric McCready
these implications, as well as summing up the paper and mentioning some directions for future research.
2 Mixed Content
This section focuses on mixed content. I begin by providing criteria for an expression introducing mixed content, in section 2.1, continuing with a detailed look at the case of pejoratives in section 2.2. There it becomes clear that there are two parts to the meaning of a pejorative expression: an ‘ordinary’ predication of an individual as part of some group, and a negative attitude expressed by the speaker with regard to that individual by virtue of being part of that group. As has been argued in the literature, this first content must be at-issue, while the second must be CIE. I review these arguments and add some additional ones. Section 2.3 introduces Potts’s (2005) logic LCI and shows that it has no way of producing single lexical entries for linguistic objects that introduce mixed content. As I will show, however, this does not mean that LCI has no way of analyzing such expressions: it can decompose them into multiple morphemes at some level of representation, some introducing at-issue content, and some CIE content. I will evaluate this way of doing things in section 2.3 as well, concluding that it is undesirable as a general method. 2.4 extends the logic to a system that can analyze mixed content without decomposition: this is done by allowing the construction of additional types via the recursive type definition, and (crucially) introducing new combinatoric operations over these types. The resulting system, L+ CI , is used to analyze pejoratives in 2.5. Section 2.6 examines other kinds of mixed content elements: formal and informal pronouns, benefactive expressions, and certain honorifics, among others. 2.1 Mixed Content: Criteria Before considering particular examples of expressions introducing what I will be calling mixed content, it will be useful to first make it clear exactly what is meant by this term.5 I will take an expression to introduce mixed content if it fulfills the following two criteria. 
First, it should introduce content in both at-issue and CIE dimensions. The pejorative case above fills the bill: it is predicative, and so introduces content in the at-issue dimension, but at the same time introduces an attitude 5 Thanks to several reviewers for suggesting that this exposition be made.
of the speaker toward some individual or group of individuals, which is CIE content, as I will show in detail in the next section. Introducing content in both dimensions is the essential criterion. The second criterion is that it should be monomorphemic. Exactly what counts as monomorphemic is, in part, a theory-dependent notion; the amount of decomposition licensed by a particular theory will influence what can count as introducing mixed content. For example, if one were to take pejoratives like Kraut to introduce multiple morphemes at some level of semantic composition, then such pejoratives would no longer introduce mixed content, at that level; rather, each bit of the word meaning would introduce unmixed content of either purely at-issue or purely CIE type. This criterion means that the first criterion is in fact strengthened: not only must at-issue and CIE content both be introduced, but they must be introduced simultaneously, at the same point in semantic composition. A word about my own methodology. Here I will be working mostly with a naive view of word structures which admits little to no semantic decomposition. This will lead to taking certain expressions to introduce mixed content which on other approaches might not do so; it will also lead to a particular analysis which allows such introductions. I will discuss some issues raised by this view, as well as alternate possible accounts, after introducing the analysis itself. In any case, I believe that I will be able to present some examples of mixed content that are monomorphemic on most anyone’s view of the lexicon. 2.2 Pejoratives Let us take as our main example of pejoratives Kraut, mentioned already in the previous section. This choice is made for reasons of delicacy: many current pejoratives sting quite a bit more than this one does, thanks to the fact that it is (I believe) not used very much these days: in this sense it resembles Boche, though it is much more current. 
This allows for a more objective consideration. If the reader wishes to sharpen intuitions, she is welcome to substitute her favorite pejorative; also, if she finds the particular pejorative I have chosen excessively offensive, she is welcome to substitute another one.6 Kraut is a pejorative term for German people on its nominal use. By 6 Again, just to make things absolutely clear, I have no attachment to the word Kraut, and I would not want to be associated with the attitude it expresses.
saying (2), repeated as (3), I assert that the referent of he is German, and express that I have negative feelings about him. (3)
He is a Kraut.
Here, ‘Kraut’ obviously must contribute to at-issue content: if it does not, the sentence cannot form a proposition, for the pejorative is the main predicate of the sentence. The same can be seen when pejoratives serve as subjects. (4)
Every Kraut is not evil.
Here, the pejorative term is serving as the first argument to the determiner (on a standard semantics). Pejoratives thus clearly form part of at-issue content. The expression of negative feeling that the word introduces, though, is not part of at-issue content. This can be seen by considering the characteristics of conventionally implicated and expressive content as discussed in Potts 2005 and Potts 2007a. Potts lists a number of properties that these kinds of content are meant to have, some of which have been called into question by various authors (e.g. Wang, Reese & McCready 2005; Wang, McCready & Reese 2006; Geurts 2007; Amaral, Roberts & Smith 2008). In this paper I will primarily consider two tests for conventional implicature/expressiveness (CIEness). The first is scopelessness. The second is the behavior of CIE items under denial. CIE items, by definition, do not participate in at-issue semantic processes.7 In particular, they are not affected by semantic operators. Consider the following examples. (5)
a. It is false that John, the swimmer, is a good dancer.
b. If John, the swimmer, comes to the party, everyone will have a good time.
(6) a. That damn John didn’t come to the party.
b. If that damn John comes to the party, no one will have a good time.
7 I do not consider here counterexamples to this claim which have been raised by Wang et al. 2005, 2006 and Amaral et al. 2008. These authors’ focus is on indefinite appositives in the first case and on the interaction of attitude verbs and CIE content in the second. In my discussion, I will use only examples that have not been controversial. I think it is clear that the Potts generalizations about scope independence and denial hold for at least the areas of CIE content and operators that I will be concerned with.
In these examples, it is clear that the content of the nominal appositives is not affected by the negation or by the conditional, and similarly for the expressive adjective damn. In this respect, CIE content is similar to presupposition. It differs in that it cannot be bound (cf. van der Sandt 1992). ‘Binding’ refers to the situation in which a conditional antecedent (or other universal construction) entails the content of a presupposition which appears in the consequent. In this situation, no presupposition is projected. (7)
If John has a daughter, John’s daughter must be pretty.
Such binding does not happen for CIE content. (8)
a.
# If John is a swimmer, then John, a swimmer, came to the party.
b.
# If I hate John, then that damn John came to the party.
In these sentences, the content of the appositive, that John is a swimmer, and of the expressive adjective, that John is in some way bad, is indeed projected. The infelicity of the examples can be taken to follow from this projection behavior: in (8a), for instance, since the speaker indicates that John is a swimmer, it is odd conversational behavior to conditionalize over this content, producing a sense of redundancy. The second test relates to the first. CIE content does not participate in denials. In ordinary denial, the truth of any at-issue part of a sentence can be called into question. B’s denial in (9a) has the interpretations in (9b).8 (9)
a.
A: John came to the party last night. B: That’s not true/That’s false.
b.
‘John didn’t come to the party.’ ‘John didn’t come to the party last night.’ etc.
Consider what happens when one denies a sentence containing CIE content. As the following examples show, the CIE content cannot be the target of denial. (10)
a.
A: John, a swimmer, came to the party last night. B: That’s not true/That’s false.
b.
≠ ‘John is not a swimmer.’
8 Exactly which interpretation is selected will depend on focus, discourse topic, and other aspects of information structure.
(11)
a.
A: That damn John came to the party last night. B: That’s not true/That’s false.
b.
≠ ‘There’s nothing wrong with John.’
Insofar as we take denial to be at least partly a semantic operation (cf. van Leusen 2004), the result of this second test is a direct corollary of the first. Now we can apply our first test to the cases of present concern: what happens when one attempts to embed pejoratives? Clearly, the negative attitudes they express are projected in that situation, so the content must at the very least be presuppositional. (12)
a.
He is not a Kraut.
b.
He might be a Kraut.
c.
Is he a Kraut?
However, if it is presuppositional we would expect that it can be ‘bound’ in the usual way, so that if a conditional antecedent entails the non-assertive content of Kraut this content will not be projected. In order to check whether this is possible, we must determine what exactly the content of Kraut is. Discussing Boche, Williamson takes the expressed content to be that the individual picked out by the subject he is cruel, noting that it is not clear that this really captures the non-asserted part of the meaning. Here he is abstracting from Dummett, who writes ‘barbarous and more prone to cruelty than other Europeans’ (Dummett 1973:454). I do not think Williamson’s paraphrase is correct (indeed, he himself is not satisfied with it). It is certainly not correct for the modern pejoratives that I know; while it may be correct for Boche, it seems that pejoratives behave more or less alike in terms of their basic meanings, differing only in the degree of approbation assigned to the individual or group under discussion.9 Richard (2008) describes the expressive part of the content of pejoratives as that an individual is bad by virtue of membership in a particular group; in this case, the individual picked out by the pronoun is bad by virtue of being a German. This is closer, but still cannot be correct; note that in the examples in (12) there is no implication that the subject individual is bad in 9 I say ‘individual or group’ so as not to prejudge the issue. Also, it is possible that there may be real differences between pejoratives in semantic terms, and that there may be different semantic classes of pejoratives. These issues are larger than I can take on in the present paper.
any way at all.10 Instead, what is expressed by the sentences in (12) is that the speaker takes German people to be bad.11 Presumably the sense that the subject individual is negatively characterized that Williamson picks up on is derived via an inference: since it is asserted that he is German, and expressed that German people are bad, it is also expressed, though indirectly, that he is bad. But this does not seem to be a part of literal content, either at-issue or CIE. Supposing then that the expressed content of Kraut is roughly that German people are bad, we can test its bindability via a conditional in the usual way. (13)
If (I think) Germans are bad, then he is a Kraut.
This sentence is rather odd, in part because the expressed content of Kraut does indeed appear to project from the conditional.12 On the assumption that the proposed paraphrase is the right one, and generalizing from this case, we can conclude that the expressed content of pejoratives is CIE rather than presupposed. I will assume so in the following. It should be noted, however, that the significance of the result of the binding test depends on the accuracy of the paraphrase. If the paraphrase given is incorrect, or, even worse, if the expressive portion of pejoratives is such that it does not admit a linguistic paraphrase at all, then the test is invalidated. This is worrisome given the analysis of Potts (2007a), according to which expressives have the property of ‘ineffability,’ meaning that they literally cannot be paraphrased in ways not involving other expressives.13 Even in this case, though, an expressive paraphrase is possible:14 10 Unless one takes it to be a bad thing that one is not, or might be (etc.), a German; I will ignore this notion in the following. 11 This may well be what Dummett had in mind. 12 A reviewer suggests that the oddity is due to the speaker apparently expressing uncertainty about his own attitudes, which should be pragmatically inappropriate. However, even if the speaker is an amnesiac who in fact does not know what his attitudes are (in some sense), the oddity remains, suggesting that this is not the right explanation. 13 Geurts (2007) notes that something similar holds for other, non-expressive words like green, though: they are not easily given satisfying paraphrases either. See also Fodor 2002. However, the degree of difficulty seems to be different for the cases of green and (e.g.) damn. A paraphrase of the latter cannot even be attempted without using expressives, whereas one can (for instance) try to give exemplars of greenness for the former. I think Potts is right in distinguishing the two types. 
I will have more to say about this issue in the conclusion. 14 A reviewer notes that the projection behavior may not be very surprising, given that we also have expressive content in the antecedent, which has nothing to bind it. The fact that it
(14)
If I hate the {damn|fucking} Germans, then he is a Kraut.
Here, if one accepts the Richard analysis, the expressive content of ‘Kraut’ is pretty clearly entailed15 by the content of ‘(I) hate the damn/fucking Germans.’ The conclusion is that this part of the content of Kraut is not presupposed, which indicates that it is highly likely to be CIE content, given its other behavior.16 Let us now consider the second test. What happens when one tries to deny the content of a pejorative? (15)
a.
A: Juan is a Kraut. B: That’s not true/That’s false.
b.
≠ ‘German people are not bad.’
The result of this test also supports the conclusion that the negative part of the meaning of Kraut, and, by extension, pejoratives in general is CIE content, and not part of the at-issue meaning. To sum up, we have reached the conclusion that pejoratives play a dual semantic role: they act as ordinary nominals for predication or as arguments of determiners, etc., but carry CIE content as well. They also appear to be monomorphemic, at least in many cases. One might argue (as has Chris Potts, p.c.) that in fact pejoratives are polymorphemic. An argument for such a view comes from pejoratives like Jap, which could be viewed as composed of is necessary to use expressives to paraphrase other expressives (given Potts’s ineffability condition) may be one reason that binding of CIE content is impossible. 15 Or some expressive equivalent for the Potts 2007 system. Since according to that analysis the function of (emotive) expressives is to narrow down a subinterval of R used as a model of a range of emotion displayed with respect to some object, one can define a notion of emotive entailment according to which P x emotively entails Qx iff the interval assigned to x by P is a subset of that assigned to x by Q. Since I will not make use of this system in this paper, I will not work out the details. 16 A reviewer suggests an analysis in terms of indexical presuppositions (Schlenker 2007), with the following lexical entry: (i) Krautc,w = λx : speaker(c) has a negative attitude toward German people in w(c). German(x) But this suggests (as far as I can see) that such a presupposition should be bindable in examples like (ii). (ii) If I have a negative attitude toward German people, then he is a Kraut. Again, in amnesia contexts, this should be felicitous; and here the content certainly projects.
a root word (Japanese) and a truncating suffix with an expressive meaning. I think this is at least reasonably plausible for cases like Jap, but certainly not for all pejoratives. Expressions like Frog, Yankee17 or the Japanese sangokujin ‘third country person’ — or indeed Kraut — pretty clearly lack a truncation of the relevant type. At the very least, it is not clear that all pejoratives contain multiple morphemes. Since the proposed conditions are met, then, they introduce mixed content.18 In the next section, I will introduce the compositional system of Potts 2005, which was designed for the analysis of conventional implicature; as we will see, it is not, as it stands, able to analyze mixed content qua mixed content. But this is not the end of the story yet. 2.3 LCI Potts (2005) proposes a pair of logics called LCI and LU for the analysis of conventional implicature.19 These two logics interact in sometimes complex ways. The parts of the system that concern us here involve a) what kinds of expressions are semantically well-formed, b) how these expressions are combined in the logical syntax, and c) how the resulting expressions are interpreted. These issues all relate to LCI , which is a higher-order lambda calculus. The first corresponds to a definition of admissible types in LCI and the second to rules for how the admissible types are combined. The third issue corresponds to a rule for the interpretation of conventionally implicated expressions: effectively a mapping between expressions of LCI , the type theory used for the combinatorics, to logical forms intended for model-theoretic evaluation. I examine each in turn. As we will see, the system as set up in Potts’s work cannot be used to model the behavior of mixed content expressions, which will prompt modifications to it in section 2.3. First, the types themselves. Potts defines a system of types. Here, as in 17 As in ‘Yankee Go Home’ — I make no claims about the historical development of the term. 
18 It is still debatable whether the precise content I have proposed (following Richard) is right. Hom (2008) gives an interesting analysis in which pejorative content is not expressive at all, but instead is a social construct varying across speaker groups. I will not argue in detail against this proposal here — I am sympathetic to the notion of social construction of meaning, at least in these sorts of cases — but I doubt that all the content of pejoratives is truth-conditional. Hom considers and rejects the sort of evidence (denials and operator scope arguments) I have made use of here. In my opinion he is too hasty in doing so, but fully responding to his arguments would take us too far afield. 19 I will not review the full motivations for these logics here, or all the details of how they work. I will focus only on the parts that will be necessary for the proposal in this paper.
the type theories standardly used in linguistic semantics (cf. Heim & Kratzer 1998), basic types are e, t, s, which are used to produce an infinite set of types via the usual kind of recursive definition. (The details of the definition are provided in Appendix A.) However, Potts’s logic differs in that it makes crucial use of a distinction between at-issue types and CI types (‘CI’ indicating conventional implicature). The distinction is indicated via a superscript ‘a’ or ‘c’ on the type name. At-issue types are freely produced in the usual way. CI types are distinct: they are always of the form $\langle\sigma^{a},\tau^{c}\rangle$, functions taking at-issue typed objects as input and outputting CI-typed objects. There is no mechanism for producing types that take CI-typed objects as input. This, according to Potts, is the reason that conventionally implicated content is independent of at-issue operators: there simply are no operators over CI content. How are these objects combined? $\mathcal{L}_{CI}$ has the derivation rules for type combination shown in Figure 1. Potts couches them as ‘tree admissibility conditions’ but this comes out to more or less the same thing as a derivation rule if one understands his trees as proof trees: the Figure 1 notation is more compact, so I will use it in what follows. As far as I am concerned this is a notational variant. It should, however, be noted that the logic behaves in ways that are odd from the standpoint of many logics familiar to linguists such as categorial grammar; notably, unlike the categorial grammars implemented for standard at-issue semantic combination, it is not resource-sensitive for CI types, as detailed below. The essential point is that a resource-sensitive logic is one that consumes resources as they are used in proofs. This is a property of the combinatorics of at-issue content: combining sleeps with John yields sleeps(john), but the meanings of noun and verb are consumed and no longer available for further composition. 
As we will see, this is a property that LCI rightly lacks. The rules in Figure 1 are meant to model the combinatorics in conjunction with a syntactic structure, just as in the work of Potts, meaning that they should retain the constituency-driven character of the original LCI rules.20 (R1) is just a reflexivity axiom. (R2) is ordinary application for at-issue 20 I also diverge from Potts on my treatment of CI propositions introduced low in a tree. In Potts’s formulation, the possible presence of such additional CI conditions warrant sometimes thinking of these rules as shorthand for a larger rule set. See Potts 2005: 222 for details. Instead of this route I will consistently make use of R5 to eliminate all elements of type t c from derivations immediately after they are derived, which means that there will not be extra free-floating CI content. Thanks to a reviewer for inspiring this strategy.
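(R1) $\dfrac{\alpha : \sigma}{\alpha : \sigma}$
(R2) $\dfrac{\alpha : \langle\sigma^{a},\tau^{a}\rangle, \quad \beta : \sigma^{a}}{\alpha(\beta) : \tau^{a}}$
(R3) $\dfrac{\alpha : \langle\sigma^{a},\tau^{a}\rangle, \quad \beta : \langle\sigma^{a},\tau^{a}\rangle}{\lambda X.\,\alpha(X) \wedge \beta(X) : \langle\sigma^{a},\tau^{a}\rangle}$
(R4) $\dfrac{\alpha : \langle\sigma^{a},\tau^{c}\rangle, \quad \beta : \sigma^{a}}{\beta : \sigma^{a} \;\bullet\; \alpha(\beta) : \tau^{c}}$
(R5) $\dfrac{\beta : \tau^{a} \;\bullet\; \alpha : t^{c}}{\beta : \tau^{a}}$
(R6) $\dfrac{\alpha : \sigma}{\beta(\alpha) : \tau}$ (where $\beta$ is a designated feature term)
Figure 1: Rules of proof in $\mathcal{L}_{CI}$.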
elements; this is completely standard in formal semantics. (R3) is a rule for intersection, where we abstract over the input type of two elements. (R4) and (R5) are the rules mainly of interest to us. Given an expression of some at-issue type and another expression mapping that type to some conventionally implicated type, use of (R4) yields the resulting conventional implicature paired with the original at-issue expression, where the ‘•’ operator (henceforth referred to as ‘bullet’) simply indicates this pairing. The bullet is used only to conjoin at-issue and CI type objects. This means that any given node in the proof tree can be decorated with both at-issue and conventionally implicated content.21 (R5) strips CI objects of propositional type away from a premise set (by shunting them away to another meaning dimension, as we will see shortly). What is absolutely crucial in rule (R4) is that the at-issue content is duplicated in the output of the derivation. This means that the logic allows, indeed requires, duplication of resources when conventional implicatures are involved. Given that $\mathcal{L}_{CI}$ is designed for the interpretation of supplementary elements like appositives and (some) speaker-oriented adverbials, this makes perfect sense. This observation, though, highlights a difference with standard categorial logics: since such logics are meant exclusively to model at-issue semantic composition (via the Curry-Howard isomorphism, cf. Carpenter 1998; Sørensen & Urzyczyn 2006), they are always resource-sensitive. This difference can be taken as a significant generalization about supplementary CI(E)s. The final rule, (R6), allows introduction of content via ‘designated features’; such features can be associated with constructions, as in the case of appositives, or (in principle at least) with lexical items. 
After the semantic computation is complete, the proof tree itself is then interpreted as a semantic object via the following rule.22 21 •-terms have some affinities to the dot objects of Pustejovsky (1995), and not only in form. I will say a bit more about this in footnote 29. 22 As noted by Chris Potts (p.c.), this rule is potentially odd from the perspective of proof interpretation. In proofs, objects of type t are often introduced in the course of logical derivations but left out of truth evaluation (e.g. in the context of a conditional proof); (16) has such objects contributing to evaluation just in case they are of type t c . This is just to say that it is necessary to collect type t c objects from the entire proof, so in a sense the proof becomes a first class citizen of the interpretation mechanism and not merely a means for deriving a sentential interpretation. This may well be out of line with what is commonly assumed in e.g. the literature on direct interpretation (see Barker & Jacobson 2007). For this reason Potts uses derivation trees, which he takes to be a necessary intermediate step in interpretation of CI elements, a point stressed in both Potts 2005 and Amaral et al. 2008. He suggests that my use of proof trees here is misleading. He may be right, but I do not think the problem is so serious. In essence, defining a rule
(16) Proof tree interpretation (after Potts). Let T be a proof tree with at-issue term $\alpha : \sigma^{a}$ on its root node, and distinct terms $\beta_{1} : t^{c}, \ldots, \beta_{n} : t^{c}$ on nodes in it. Then the interpretation of T is $\langle \alpha : \sigma^{a}, \{\beta_{1} : t^{c}, \ldots, \beta_{n} : t^{c}\} \rangle$.
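The collection step in (16) can be illustrated with a minimal Python sketch. The encoding is an assumption made purely for illustration and is not part of Potts's system: terms and types are plain strings, `Node` and `interpret` are hypothetical names, and the example tree is a simplified rendering of "John, a swimmer, came", with the appositive content already attached to its node by (R4).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# (lambda term as text, type as text), e.g. ("came(john)", "t^a")
Term = Tuple[str, str]


@dataclass
class Node:
    """One node of a proof tree (hypothetical encoding for illustration)."""
    term: Term                     # the term decorating this node
    ci: Optional[Term] = None      # CI term paired with it by a bullet, if any
    children: List["Node"] = field(default_factory=list)


def interpret(root: Node):
    """Pair the root term with every distinct t^c term found in the tree."""
    ci_terms = set()
    stack = [root]
    while stack:
        node = stack.pop()
        # Only propositional CI content (type t^c) is collected.
        if node.ci is not None and node.ci[1] == "t^c":
            ci_terms.add(node.ci)
        stack.extend(node.children)
    return root.term, ci_terms


# Simplified tree for "John, a swimmer, came" (assumed analysis).
john = Node(("john", "e^a"))
swimmer = Node(("swimmer", "<e^a,t^c>"))
john_appos = Node(("john", "e^a"), ci=("swimmer(john)", "t^c"),
                  children=[john, swimmer])
came = Node(("came", "<e^a,t^a>"))
root = Node(("came(john)", "t^a"), children=[john_appos, came])

at_issue, ci_terms = interpret(root)
print(at_issue, ci_terms)
```

The result pairs the at-issue proposition on the root with the set of CI propositions found anywhere in the tree, which is the sense in which the whole proof, and not only its conclusion, feeds interpretation.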
In (16), α and β are variables over lambda terms, and $\sigma^{a}$ is a variable over semantic types. The superscripts distinguish the types as either at-issue (superscript a) or CI (superscript c). Effectively, conventionally implicated content is shunted into a separate dimension of meaning. The bullet therefore functions as a bookkeeping device in the proof. The action of these three elements of the Potts logic, then, is as follows. First, types for conventional implicature are defined; crucially, there are no types that take conventionally implicated content as input. Second, these types are combined via the rules in (R1-6). With respect to conventional implicatures, the effect is to isolate conventionally implicated content from at-issue content with a bullet, by rules (R4) and (R5). Third, •-terms are separated into distinct dimensions of meaning, by the schema in (16). Let us consider how this logic can be used for the analysis of mixed content objects. It is easy to see that it cannot be so used in its current form, given the assumption that the at-issue and CIE content are introduced by the lexical item simultaneously. The type construction rules (again, see Appendix A for details) provide for types of the form $\langle\sigma,\tau\rangle^{a}$, purely at-issue types, and $\langle\sigma,\tau\rangle^{c}$, purely CI types. Intuitively, in the case of pejoratives we require an object with the type of an ordinary predicate in the at-issue dimension, and one of propositional type which is CIE.23 What we need is a typing for objects that are of mixed type, but this cannot be produced in $\mathcal{L}_{CI}$. As far as I can see, the only way to model mixed content in $\mathcal{L}_{CI}$ would be to assume that content can be introduced in two distinct stages. This on semantic derivation trees and semantic derivation proofs should yield the same results, given that the mechanisms of derivation are equivalent. I do not see a substantial difference in giving derivation trees citizen status and giving the same kind of status to proof trees. 
In any case, the proof-based rule is less odd in the context of derivations proceeding in concert with a syntax, and problems that could arise with e.g. λ-abstraction will not arise in the context of CIE content, where (as far as is known presently) abstraction does not occur. Still, if the reader feels happier with using trees, she is welcome to perform the translation, which is technically trivial. 23 If one follows e.g. Williamson and takes pejoratives to introduce predicates in the CI dimension as well, the situation changes somewhat, but the basic problem is the same. We will see cases of this type in section 2.6.
idea can be implemented by assuming that pejoratives introduce an at-issue object, which is then predicated in some way by a CI object via R4. The result will be a CI proposition and an at-issue predicate. In the case of Kraut, we would have the following. (Here ‘∩’ is the kind formation operator used by e.g. Chierchia 1998.)
$$\frac{\lambda x.\,\mathrm{German}(x) : \langle e,t\rangle^{a} \qquad \lambda P.\,\mathrm{bad}({}^{\cap}P) : \langle\langle e,t\rangle^{a}, t^{c}\rangle}{\lambda x.\,\mathrm{German}(x) : \langle e,t\rangle^{a} \;\bullet\; \mathrm{bad}({}^{\cap}\mathrm{German}) : t^{c}}\;\text{R4}$$
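The same R4 step can be replayed mechanically. The following Python sketch is an illustration under stated assumptions, not an implementation of Potts's logic: the class names `Basic` and `Func`, the helpers `dim`, `well_formed` and `r4`, and the representation of lambda terms as strings and Python callables are all hypothetical. It checks that no functional type takes CI-typed input and that (R4) returns its at-issue argument unchanged alongside the new CI proposition.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Basic:
    name: str  # "e", "t" or "s"
    dim: str   # "a" = at-issue, "c" = CI


@dataclass(frozen=True)
class Func:
    arg: "Type"
    res: "Type"


Type = Union[Basic, Func]


def dim(ty):
    """Dimension of a type, read off its final result type."""
    return ty.dim if isinstance(ty, Basic) else dim(ty.res)


def well_formed(ty):
    """Sketch of admissibility: no functional type takes CI-typed input."""
    if isinstance(ty, Basic):
        return True
    return dim(ty.arg) == "a" and well_formed(ty.arg) and well_formed(ty.res)


def r4(alpha, beta):
    """(R4): from alpha : <s^a, t^c> and beta : s^a derive beta • alpha(beta)."""
    (a_fn, a_ty), (b_term, b_ty) = alpha, beta
    if not (isinstance(a_ty, Func) and dim(a_ty) == "c" and a_ty.arg == b_ty):
        raise TypeError("(R4) does not apply to these types")
    # beta is returned unchanged: the at-issue resource is not consumed.
    return beta, (a_fn(b_term), a_ty.res)


E, T_A, T_C = Basic("e", "a"), Basic("t", "a"), Basic("t", "c")
ET = Func(E, T_A)                                    # <e,t>^a

german = ("German", ET)                              # λx.German(x)
bad_kind = (lambda p: f"bad(∩{p})", Func(ET, T_C))   # λP.bad(∩P)

at_issue, ci = r4(bad_kind, german)
print(at_issue[0], "•", ci[0])  # prints: German • bad(∩German)
```

Note that the two premises are separate objects here, exactly as in the two-stage analysis: the at-issue predicate and the CI function must be supplied independently before (R4) can pair them.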
The R4 step above yields the desired logical form. But this kind of approach requires allowing mixed content objects to separately introduce multiple pieces of content. This analysis seems to destroy the intuition that pejoratives and other instances of mixed content are singular semantic objects with a dual character. It indeed strikes me as highly unnatural to have a lexical entry realized in terms of multiple, fully separate entities.24,25 I therefore take it to be truer to intuitions to modify the logic in such a way that mixed content can be modelled directly. This is done in the following section.
2.4 $\mathcal{L}^{+}_{CI}$
This section of the paper proposes $\mathcal{L}^{+}_{CI}$, an extension of $\mathcal{L}_{CI}$ that can handle mixed content. In the process, we will also define a sublogic of $\mathcal{L}^{+}_{CI}$, $\mathcal{L}^{+S}_{CI}$, which introduces a set of types for CIE objects that have resource-sensitive properties. The first necessary step involves adding resource-sensitive CIE types to $\mathcal{L}_{CI}$. The reason is that there are mixed content items which are predicative in both dimensions. Pejoratives introduce mixed content, but only part of this content, the at-issue portion, is predicative (or so I have argued). The CIE content is propositional. Because it is propositional, there is no special 24 The case of presupposition may seem formally similar on a superficial level, but it is rather different in that presuppositions (on some perspectives at least) simply indicate definedness conditions for the at-issue content, whereas here the two bits of content are entirely separate and represent fully distinct discourse contributions. 25 Note also that the proposed analysis is different from analyzing single lexical items as consisting of a single complex condition; the two types of decomposition are entirely different in quality. Assigning a word a meaning of the form λx[P(x) ∧ Q(x)] seems rather different from giving it a pair of meanings λx[P(x)] and λx[Q(x)] which are meant to apply to the input at different points in the derivation. 
The latter seems appropriate in only special situations, e.g. when a word makes two distinct contributions that can be traced back to specific distinct parts of the word. We will return to such examples in section 2.6, where I will discuss the general merits of the decompositional strategy.
8:16
Varieties of conventional implicature
need for resource-sensitive types here; but in cases where there is a dual predication, a lack of resource sensitivity will cause serious problems in the meaning composition, as I will detail shortly. It is not hard to find cases of mixed content where both the at-issue content and the CIE content are predicative. An instance can be found in the Japanese honorific system. Certain honorifics in Japanese come with special morphology which clearly carries the honorific load; these sorts of expressions are analyzed by Potts & Kawahara (2004) as introducing a kind of expressive content. In such cases, it is easily possible to analyze the morphemes as introducing supplementary expressive content exclusively. However, there are other lexical items which simultaneously honor some individual and predicate something of her. An example is irassharu ‘come[Hon]’. (17)
sensei-ga irasshaimasi-ta teacher-Nom came.Hon-Pst ‘The teacher came’ (the teacher is being honored)
Here, the verb simultaneously says of the teacher that she came, and indicates that she is deserving of honor.26 This verb satisfies both the criteria for mixed content: it introduces both an at-issue predication and expresses honorification at the CIE level.27 Further, the verb is (at the surface at least) monomorphemic. It cannot be separated into morphemes introducing at-issue and expressive content separately, unlike (for instance) the honorifics studied by Potts & Kawahara (2004), which clearly contain morphemes which separately provide honorific meanings. This does not of course preclude a decompositional analysis, on which more below. But, barring independent (synchronic) reasons for such an analysis, it seems desirable to analyze this expression as simultaneously introducing two types of meaning, and so as a bearer of mixed content.

The upshot is that honorifics like irassharu are instances of mixed content which are predicative in both dimensions of meaning. How could such examples be analyzed in $\mathcal{L}_{CI}$? Note what will happen if we make the obvious move, and analyze this expression as involving an object of at-issue predicative type, and a CIE object of similar type, conjoined by a bullet as usual:

(18)
\[
\textit{irassharu} = \lambda x.\,\mathrm{come}(x) : \langle e,t\rangle^{a} \;\bullet\; \lambda x.\,\mathrm{honor}(s,x) : \langle e,t\rangle^{c}
\]

26 Or however one wishes to paraphrase the honorific relation; I will not address this question here in detail. See section 2.4 for some brief discussion.

27 For arguments that honorific content is expressive, see Potts 2005, Potts & Kawahara 2004, and Kim & Sells 2007.
Applying this object to the referent of sensei ‘the teacher’ (which I will treat as a referring expression for simplicity) yields the following by R4, or would if R4 was defined for expressions conjoined by the bullet operator, which it actually is not. If we wanted to extend R4 to cases of •-conjoined objects, we would actually need to define a new rule. Let us see what such a rule would be for purposes of discussion. This rule simply assumes that we perform pointwise application of every element conjoined by a bullet according to the proper rules, which will be R2 for the at-issue side of the bullet and R4 for the CIE side. The use of R4 of course means that the content of the input to the CI type will be duplicated in the output, yielding the results of the two applications, and an unmodified input as well. (19)
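\[
\frac{\alpha : \langle\sigma^{a},\rho^{a}\rangle \;\bullet\; \beta : \langle\sigma^{a},\tau^{c}\rangle \qquad \gamma : \sigma^{a}}{\alpha(\gamma) : \rho^{a} \;\bullet\; \beta(\gamma) : \tau^{c} \;\bullet\; \gamma : \sigma^{a}}
\]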
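With this rule we can attempt a derivation of (17), which will go as follows.

\[
\frac{\lambda x.\,\mathrm{come}(x) : \langle e,t\rangle^{a} \;\bullet\; \lambda x.\,\mathrm{honor}(s,x) : \langle e,t\rangle^{c} \qquad t : e^{a}}{\mathrm{come}(t) : t^{a} \;\bullet\; \mathrm{honor}(s,t) : t^{c} \;\bullet\; t : e^{a}}
\]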
Since the CIE content is not, by R4, resource-sensitive, the predication by the right conjunct of the • in the premises will yield the result of the application, as desired, but also will return the original at-issue input to the functional application. But this is undesirable: the result is not semantically interpretable. In Potts’s work, where CIE expressions are restricted to those introducing supplementary content, the CI types were required to have a resource-insensitive nature. But, as we can see, in cases of mixed content it yields the wrong results. We therefore need to add a new sort of content which is both CIE and resource-sensitive.

The result of adding types for resource-sensitive CIE content to $\mathcal{L}_{CI}$ is called $\mathcal{L}^{+S}_{CI}$. I will use a superscript s to distinguish what I will call shunting types, types for those semantic objects that ‘shunt’ information from one dimension to another, without leaving anything behind for further modification. The type system obtained by adding these types to $\mathcal{L}_{CI}$ is defined in Appendix B.1. With this type classification, it becomes possible to define a rule specific to nonsupplementary conventional implicatures.

(R7)
\[
\frac{\alpha : \langle\sigma^{a},\tau^{s}\rangle \qquad \beta : \sigma^{a}}{\alpha(\beta) : \tau^{s}}
\]
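The difference in resource sensitivity between (R4) and (R7) can be made concrete with a small sketch. This is an illustration only, not part of the logic: terms are encoded as plain strings, types as text labels, and the names `Term`, `apply_r4`, `apply_r7` and `honor` are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Term:
    """A saturated term with a textual type label (hypothetical encoding)."""
    expr: str
    type_: str


def apply_r4(functor: Callable[[str], str], arg: Term) -> Tuple[Term, ...]:
    """(R4)-style CI application: the at-issue argument is passed through
    unchanged alongside the new CI proposition (no resource sensitivity)."""
    return (arg, Term(functor(arg.expr), "t^c"))


def apply_r7(functor: Callable[[str], str], arg: Term) -> Tuple[Term, ...]:
    """(R7)-style shunting application: the argument is consumed and only
    the shunting-typed result remains."""
    return (Term(functor(arg.expr), "t^s"),)


def honor(x: str) -> str:
    """Assumed string rendering of the honorific relation honor(s, x)."""
    return f"honor(s, {x})"


teacher = Term("t", "e^a")
print(apply_r4(honor, teacher))  # two terms: the input and the CI proposition
print(apply_r7(honor, teacher))  # one term: the shunted proposition only
```

Under this encoding, the extra `t : e^a` returned by `apply_r4` corresponds to the leftover term that made the attempted derivation of (17) uninterpretable, while `apply_r7` leaves nothing behind.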
We can then modify the rule in (16) to handle information from shunting types as well. $\sigma^{\{x,y\}}$ indicates that $\sigma$ is a type of sort x or sort y.28 We will see a number of examples of the application of this rule in what follows.

(20) Generalized Interpretation (first attempt). Let $T$ be a proof tree with at-issue term $\alpha : \sigma^{a}$ on its root node, and distinct terms $\beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}$ on nodes in it. Then the interpretation of $T$ is $\langle \alpha : \sigma^{a}, \{\beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}\}\rangle$.
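The interpretation rule in (20) amounts to a traversal of the proof tree. The following is a minimal sketch under assumptions of my own: nodes carry lists of (term, type label) pairs to model bullet-conjoined terms, type labels are strings such as `"t^s"`, and `Node` and `interpret` are hypothetical names. The toy tree is likewise only illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

TypedTerm = Tuple[str, str]  # (term, type label), e.g. ("German(j)", "t^a")


@dataclass
class Node:
    """A proof-tree node carrying one or more bullet-conjoined typed terms."""
    terms: List[TypedTerm]
    children: List["Node"] = field(default_factory=list)


def interpret(root: Node) -> Tuple[TypedTerm, Set[TypedTerm]]:
    """Pair the root's at-issue term with every propositional CI or
    shunting-typed term found on any node of the tree."""
    at_issue = next(t for t in root.terms if t[1].endswith("^a"))
    side: Set[TypedTerm] = set()

    def walk(node: Node) -> None:
        side.update(t for t in node.terms if t[1] in ("t^c", "t^s"))
        for child in node.children:
            walk(child)

    walk(root)
    return at_issue, side


# Toy tree: root 'German(j) : t^a' over a node that also hosts a shunted term.
pred = Node([("λx.German(x)", "<e,t>^a"), ("bad(∩German)", "t^s")])
subj = Node([("j", "e^a")])
root = Node([("German(j)", "t^a")], [pred, subj])
print(interpret(root))
```

Collecting the side terms into a set reflects the requirement in (20) that the terms be distinct; nothing in the sketch depends on the order in which nodes are visited.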
The combination of (R7) and the new interpretation rule in (20) serves to maintain the original generalizations about supplementary meanings provided by $\mathcal{L}_{CI}$ while expanding the system’s coverage to conventional implicatures that introduce the primary meaning of the sentence they appear in. In section 3, I will show that the possibilities made available by the existence of these types are exploited by natural language, even outside the domain of mixed content.

The resources to create the needed kind of objects to model mixed content are obviously already present in $\mathcal{L}^{+S}_{CI}$. We already have what we need: at-issue types and CI types. We need only a way to produce product types across the two dimensions, and then an application rule telling us what to do with such types when we have them. I will now provide these tools; the resulting type system is called $\mathcal{L}^{+}_{CI}$.

It is rather simple to add the relevant types. We need only a single typing rule producing mixed types. This rule is provided in Appendix B.2. It produces types of the following form:

\[
\langle\sigma,\tau\rangle^{a} \times \langle\zeta,\upsilon\rangle^{s}
\]

This object is a product type where the conjoined types are an at-issue type and a shunting type.29 Note that the input to the at-issue type and the shunting type need not be of the same semantic type; this means that it is in principle possible that the situation arises where the two will have incompatible inputs. Such typings will not work in composition though, as they will not be interpreted by any rule, which will rule them out in practice. Mixed types like these are paired with λ-terms of the form $\alpha \diamond \beta$: ‘$\diamond$’ (hereafter ‘diamond’) signifies a semantic object of mixed type. We now need rules for interpreting these types. I propose the following two.

28 I thank Yasutada Sudo for helping me to correct an infelicity in an earlier version of this definition.

29 These objects are rather similar to the dot objects of Pustejovsky (1995), as already mentioned in footnote 21. The difference is that, in Generative Lexicon theory, trying to make use of both ‘sides’ of the dot object generally results in zeugmatic infelicity as in (i), so there is no rule like (R8) even in the extended system (Asher & Pustejovsky 2005).
   (i) ?? John hung a poster on and walked through the door.

(R8)
\[
\frac{\alpha \diamond \beta : \langle\sigma^{a},\tau^{a}\rangle \times \langle\sigma^{a},\upsilon^{s}\rangle \qquad \gamma : \sigma^{a}}{\alpha(\gamma) \diamond \beta(\gamma) : \tau^{a} \times \upsilon^{s}}
\]
Given as input a mixed type and an object of the at-issue type that is input to both conjoined elements in the mixed type, (R8) outputs the result of applying each element of the mixed type to the input, where both objects are conjoined with ‘$\diamond$’ as before. An example of this is precisely the derivation of mixed content terms, where both CIE content and at-issue content look for objects of the same type as input; we will see many examples in the coming sections. We will need one further rule telling us what to do with mixed terms when the CIE part of the derivation is complete: this is provided as R9.

(R9)
\[
\frac{\alpha \diamond \beta : \sigma^{a} \times t^{s}}{\alpha : \sigma^{a} \;\bullet\; \beta : t^{s}}
\]
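The interplay of (R8) and (R9) can be sketched as follows. This is an illustrative encoding under my own assumptions, not a definition of the logic: a mixed term is a pair of Python values, functors are callables over strings, a propositional term is a plain string, and `Mixed`, `r8` and `r9` are hypothetical names. The example entry reuses the two predicates from (18), now joined by the diamond.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Mixed:
    """A term of mixed type: at_issue ♦ shunt (hypothetical field names)."""
    at_issue: Any
    shunt: Any


def r8(mixed: Mixed, arg: str) -> Mixed:
    """(R8): apply both sides of the diamond to the same at-issue argument."""
    return Mixed(mixed.at_issue(arg), mixed.shunt(arg))


def r9(mixed: Mixed) -> Tuple[Any, str]:
    """(R9): once the shunting side is propositional, trade the diamond
    for a bullet, modeled here as a plain pair."""
    if callable(mixed.shunt):
        raise TypeError("(R9) needs a saturated (type t) shunting side")
    return (mixed.at_issue, mixed.shunt)


# Assumed encoding of a verb that is predicative in both dimensions.
irassharu = Mixed(lambda x: f"come({x})", lambda x: f"honor(s, {x})")
step = r8(irassharu, "t")
result = r9(step)
print(result)
```

The check inside `r9` mirrors the typing condition on (R9): the rule is simply undefined while the shunting side still awaits an argument, so the diamond stays in place until (R8) has saturated it.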
This rule instructs us to replace mixed type terms involving the conjunction ‘$\diamond$’ with terms conjoined by a ‘•’ when the CIE object is propositional (of type t). Roughly, we have a change in bookkeeping device corresponding to a change in typing: the diamond indicates that the two terms it conjoins are still ‘active’ in the derivation, but the bullet indicates that the CIE side has already gotten all its arguments and is ready for interpretation. R9 thus, in a sense, moves shunting-typed terms out of active use. Doing so allows for interpretation via the rule in (20). Again, we will see examples in the following sections.

At this point, it is possible to abstract away from the honorific example provided earlier to make clear the general need to use shunting types on the CI side of the mixed type. Recall that the CI types in $\mathcal{L}_{CI}$ are not resource sensitive; they always return their at-issue input as well as the result of applying the CI type to this input. (R4) yields an object of the type $\sigma^{a} \bullet \tau^{c}$ when a functional CI type $\langle\sigma^{a},\tau^{c}\rangle$ is applied to something of type $\sigma^{a}$. But this means that, if we use CI types, then in the terms typed as $\alpha(\gamma) \diamond \beta(\gamma) : \tau^{a} \times \upsilon^{c}$ yielded by a variant of (R8) which uses CI types, the object to the right of the diamond will be of the form $\gamma : \sigma^{a} \bullet \beta(\gamma) : \upsilon^{c}$ itself due to (R4), as we have seen. This means that the result of the application is of the form $\alpha(\gamma) \diamond \gamma : \sigma^{a} \bullet \beta(\gamma) : \upsilon^{c}$. We have seen an instance of this with the attempted (and failed) derivation of (17) above. This means that there is an ‘unused’ term of type $\sigma^{a}$ floating around in the derivation, which will result in ill-formedness. We do not want this, and we can avoid it by using shunting types on the right-hand side instead. Such types remove the terms they apply to from the at-issue dimension completely, which clearly is what is needed in this case.30

With this rule and the type system in Appendix B.2, we are able to provide an adequate semantics for lexical items that introduce simultaneously at-issue and conventionally implicated content, by defining objects of mixed at-issue and CI types.31 The next section shows in detail how this can be done for pejoratives, and the following section, 2.6, how it applies to other parts of natural language in which we find mixed content.

2.5 Analyzing Pejoratives

It is straightforward to give an analysis of pejoratives in $\mathcal{L}^{+}_{CI}$. Recall that we needed a way to provide at-issue content and CIE content in a single lexical entry. We now have the means to do so. We need only make use of the mixed types defined in the previous section. As discussed in section 2.2, I will take the at-issue content of pejoratives to be predicative, and the CIE content to be propositional. We end up with the following kind of lexical entry: again, I use Kraut as a representative example.

(21)
\[
\textit{Kraut} = \lambda x.\,\mathrm{German}(x) \diamond \mathrm{bad}({}^{\cap}\mathrm{German}) : \langle e,t\rangle^{a} \times t^{s}
\]
30 If one takes the intuitive interpretation of shunting types to be ‘main conventionally implicated content,’ then the definition of mixed types indicates that there are two kinds of ‘main content’ in mixed-type sentences. I myself do not find this very counterintuitive.

31 A reviewer asks whether we need CI types at all anymore, given the new system. The suggestion is that one could make all types for CIE objects use the format of mixed types, but just provide a tautological component on the at-issue side, for instance the identity $\lambda X.X$ for polymorphic types. I do not see any technical reason this could not be done, though there might be reasons one would want to make a clear distinction between mixed and unmixed types in the type system. In any case, the comment shows that $\mathcal{L}^{+}_{CI}$ is in fact a genuine extension of $\mathcal{L}_{CI}$. Thanks to the reviewer for picking up on this point.

The composition will work as follows.

(22) a. Juan is a Kraut.
     b.
\[
\begin{array}{ll}
\dfrac{\lambda x.\,\mathrm{German}(x) \diamond \mathrm{bad}({}^{\cap}\mathrm{German}) : \langle e,t\rangle^{a} \times t^{s}}{\lambda x.\,\mathrm{German}(x) : \langle e,t\rangle^{a} \;\bullet\; \mathrm{bad}({}^{\cap}\mathrm{German}) : t^{s}} & \text{R9} \\[3ex]
\dfrac{\lambda x.\,\mathrm{German}(x) : \langle e,t\rangle^{a} \;\bullet\; \mathrm{bad}({}^{\cap}\mathrm{German}) : t^{s}}{\lambda x.\,\mathrm{German}(x) : \langle e,t\rangle^{a}} & \text{R5} \\[3ex]
\dfrac{\lambda x.\,\mathrm{German}(x) : \langle e,t\rangle^{a} \qquad j : e^{a}}{\mathrm{German}(j) : t^{a}} & \text{R2}
\end{array}
\]
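The derivation in (22b), together with the two-dimensional evaluation it feeds, can be sketched in a few lines. The sketch makes several simplifying assumptions of my own: the mixed entry is a pair of an at-issue predicate (a callable) and an already propositional shunted term (a string), truth is membership in a set of facts, expressive appropriateness is membership in a set of speaker attitudes, and `kraut`, `derive` and `evaluate` are hypothetical names.

```python
from typing import Callable, Set, Tuple

# Assumed encoding of (21): at-issue predicate paired with a shunted
# term that is already propositional.
kraut: Tuple[Callable[[str], str], str] = (
    lambda x: f"German({x})",
    "bad(∩German)",
)


def derive(mixed: Tuple[Callable[[str], str], str],
           subject: str) -> Tuple[str, Set[str]]:
    """Follow the rule order of (22b) and return the interpretation pair."""
    at_issue_pred, shunted = mixed        # R9: the diamond becomes a bullet
    # R5: only the at-issue predicate is passed up; 'shunted' stays behind
    proposition = at_issue_pred(subject)  # R2: ordinary at-issue application
    return proposition, {shunted}         # interpretation in the style of (20)


def evaluate(interpretation: Tuple[str, Set[str]],
             facts: Set[str],
             attitudes: Set[str]) -> Tuple[bool, bool]:
    """Return (true?, expressively appropriate?) in a toy model."""
    proposition, cies = interpretation
    return proposition in facts, cies <= attitudes


interp = derive(kraut, "j")
print(interp)
print(evaluate(interp, facts={"German(j)"}, attitudes=set()))
```

Because the shunting side of the entry is propositional from the start, no step corresponding to (R8) is needed here; the two values returned by `evaluate` are independent of one another, matching the separation of truth from expressive appropriateness.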
Given the rule (20), this will yield $\langle \mathrm{German}(j), \{\mathrm{bad}({}^{\cap}\mathrm{German})\}\rangle$ as its interpretation, which will be evaluated as usual in the Potts system. Roughly, the sentence will be true iff Juan is a German, and expressively appropriate if the speaker feels that Germans are bad. Use of (22a) intuitively indicates that the speaker thinks that Juan is bad himself; I showed in 2.2 that this is not a part of the CIE content of the sentence (via embedding tests), but one can see why it follows in this system. Since the speaker asserts that Juan is German, and expresses a negative attitude toward German people in general, it is natural to conclude that the speaker holds a negative attitude toward Juan as well. It is also natural to conclude that the speaker intends, as part of the reason for his utterance, to indicate this attitude. The content that Juan is bad, then, is communicated, probably intentionally, but is not, strictly speaking, a part of the semantic content of the sentence.32

32 A reviewer questions the analysis on the basis of examples like (i) and (ii).
   (i) He’s German but at least he’s not a Kraut.
   (ii) He’s a Boche but at least he isn’t a Kraut as well.
The reviewer finds these grammatical and suggests that they are problematic, because only the CIE content distinguishes the two categories in each case. This is an interesting observation, but speakers I have consulted (including myself) find the examples infelicitous. I myself feel they are contradictory, especially (ii). I therefore will not modify the theory to address them. But one suggestion might be that, for those that find such examples OK, there is some content present in the pejoratives in addition to the CIE content which distinguishes the two properties; perhaps it is even the case that some of the CIE content has been reanalyzed as at-issue. I will not speculate further.

2.6 Other Mixed Elements

It is easy to find examples of mixed content in the languages of the world. It suffices to consider the characteristics of mixed expressions. They are associated with conventional implicatures, but, since they also denote at-issue content, they can serve as main predicates and are affected (in part) by various semantic operators. It does not seem at all difficult to find such expressions; in fact, many examples are noted in the literature.

Let us begin by returning to the Japanese mixed content honorifics discussed in section 2.3. There I discussed the honorific irassharu, which has the at-issue content of an ordinary motion verb and the CIE content that the speaker honors the individual denoted by the sentential subject. In $\mathcal{L}^{+}_{CI}$, this can easily be given an analysis.33

(23)
\[
\textit{irassharu} = \lambda x.\,\mathrm{come}(x) \diamond \lambda x.\,\mathrm{honor}(s,x) : \langle e,t\rangle^{a} \times \langle e,t\rangle^{s}
\]
Given this lexical entry, we can see that the honorific will participate in composition in much the same way that (predicative instances of) pejoratives do. The difference will, of course, be that predication takes place in both at-issue and CIE dimensions. An example is the following. (24)
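a. Yamada-sensei-ga irasshaimasi-ta
   Y-teacher-Nom came.Hon-Pst
   ‘Teacher Yamada came. (and I honor him)’
b.
\[
\begin{array}{ll}
\dfrac{\lambda x.\,\mathrm{come}(x) \diamond \lambda x.\,\mathrm{honor}(s,x) : \langle e,t\rangle^{a} \times \langle e,t\rangle^{s} \qquad ty : e^{a}}{\mathrm{come}(ty) \diamond \mathrm{honor}(s,ty) : t^{a} \times t^{s}} & \text{R8} \\[3ex]
\dfrac{\mathrm{come}(ty) \diamond \mathrm{honor}(s,ty) : t^{a} \times t^{s}}{\mathrm{come}(ty) : t^{a} \;\bullet\; \mathrm{honor}(s,ty) : t^{s}} & \text{R9}
\end{array}
\]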
Other examples of this type include meshiagaru ‘eat.Hon’ and goranninaru ‘see.Hon’, which will receive an analysis similar (in terms of typing) to the above irassharu, except that they will take two arguments, as the verbs are transitive.

33 It is worth asking what the behavior of expressions like these is with respect to the tests proposed by Potts, Alonso-Ovalle, Asudeh, Bhatt, Cable, Davis, Hara, Kratzer, McCready, Roeper & Walkow (2009). These authors argue that expressive content does not participate in a number of grammatical operations that intuitively involve identity, such as anaphora. Indeed, the behavior of irassharu is as expected given this test.
   (i) Sensei-ga irasshaimasita. Ano kojiki mo soo-shita
       teacher-Nom came.Hon. that bum also so-did
       ‘The teacher came. (The teacher is honored.) That bum did too.’
No inconsistency is felt here, despite the epithet in the second sentence; and the second person who came is not honored, consistent with the conclusions of the squib.
We can now consider the details of what one would have to do to analyze these examples with only the type resources of at-issue and CI types. This makes the need for shunting types even more obvious than before. I can see two ways to allow for this in principle in $\mathcal{L}_{CI}$, only one of which involves modifying the logic at all. The first, as with the propositional part of pejorative meanings, involves letting mixed content elements introduce separate pieces of content. Then we could simply stipulate that CI application takes place before at-issue application, yielding a two-step composition process for mixed type objects. This ordering must be introduced to exploit the non-resource-sensitivity of CI types. We would get roughly the following, supposing that both at-issue and CI content is of type $\langle e,t\rangle$.

\[
\begin{array}{ll}
\dfrac{\lambda x.Px : \langle e,t\rangle^{c} \qquad a : e^{a}}{a : e^{a} \;\bullet\; Pa : t^{c}} & \text{R4} \\[3ex]
\dfrac{a : e^{a} \;\bullet\; Pa : t^{c}}{a : e^{a}} & \text{R5} \\[3ex]
\dfrac{a : e^{a} \qquad \lambda x.Qx : \langle e,t\rangle^{a}}{Qa : t^{a}} & \text{R2}
\end{array}
\]
which in turn yields the meaning $\langle Qa, \{Pa\}\rangle$ by the interpretation rule in (16). Effectively, this idea amounts to analyzing mixed content terms as two completely separate lexical objects, one at-issue and one CI, as can be seen from the fact that in the semantic derivation this application would have to take place on two distinct nodes. Notice also that the two parts of the content must be separated in the combinatorics for things to work out. I take it that this option is entirely undesirable, just as in the case of pejoratives. However, there may be arguments for this style of analysis in certain cases; I will discuss some below, and also evaluate the whole style of this approach as a possibility for the general analysis of mixed content bearers.

A second option would be to add a new composition rule to $\mathcal{L}_{CI}$ and add a means of producing mixed types, but not to introduce shunting types, instead making use of only the standard Pottsian CI types, $\sigma^{c}$.34 Together with this, we would require a composition rule for ‘mixed bullet types,’ necessary in order to avoid the unwanted duplication of content that would result from allowing the application of R4, as discussed in section 2.2. This rule would have to look roughly like the following. This can be viewed as an attempt to solve the problems introduced by the rule (19), which of course caused difficulties stemming from lack of resource sensitivity.

34 The rule for producing such types is the obvious analogue of B.2.1.i in which ‘•’ is substituted for ‘$\diamond$’ and all instances of shunting types are replaced with CI types.

(25)
\[
\frac{\alpha \bullet \beta : \langle\sigma^{a},\tau^{a}\rangle \times \langle\sigma^{a},\upsilon^{c}\rangle \qquad \gamma : \sigma^{a}}{\alpha(\gamma) : \tau^{a} \;\bullet\; \beta(\gamma) : \upsilon^{c}}
\]
The result of (25) is to allow application to occur in • types, but without duplication of content. This is just what is required for cases of mixed content. However, it comes with obvious problems. Its function is precisely to make R4 not apply in the relevant cases. But this has bad consequences for the typing system: it becomes inconsistent in the sense that the behavior of types is now situation-specific. One might even wonder if objects behaving in this way are types in the usual sense at all. Further, consider one major purpose of allowing CI types in the first place in $\mathcal{L}_{CI}$. This was to model the work done by supplementary CI content, which always seems to show non-resource-sensitive behavior. If we allow for rules like (25), this behavior is no longer a direct consequence of the system. Concretely, suppose that, unlike the instances of supplementary content discovered so far, instances of supplementary content that take more than one argument are discovered, but which are still resource-insensitive. In such circumstances, conflicts may develop between R4 and (25), which the type system would have no way to resolve without use of ad hoc constraints external to the formal system. All these problems are avoided by the use of shunting types.

It is not hard to find other examples of mixed content in recent work in the semantics-pragmatics literature. Kubota & Uegaki (2009) analyze the Japanese benefactive, which simultaneously indicates that the subject has caused the dative argument to do some action and conventionally implicates that the action was beneficial for the nominative argument.35

(26) Taroo-ga Hanako-ni piano-o hii-te morat-ta.
     Taro-Nom Hanako-Dat piano-Acc play Benef-Pst
     at-issue: ‘Taro made Hanako play the piano.’
     CI: ‘Hanako’s playing the piano was beneficial to Taro.’ (K&U; their glosses)

The crucial point here is that the benefactive introduces both a causative at-issue meaning and a conventional implicature to the effect that the caused event benefited the causer. Again, this expression satisfies both criteria for mixed content bearing: it is both monomorphemic and introduces content along two dimensions. This is plainly an instance of mixed content.

35 I follow Kubota and Uegaki’s glosses and morphological analysis.
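In our system $\mathcal{L}^{+}_{CI}$, we can represent the benefactive morau36 with the semantics in (27a), which is of the type in (27b):

(27) a. $\lambda P\lambda x\lambda y.\,\mathrm{cause}(y, P(x)) \diamond \lambda P\lambda x\lambda y.\,\mathrm{good}(y, P(x))$
     b. $\langle\langle e,t\rangle,\langle e,\langle e,t\rangle\rangle\rangle^{a} \times \langle\langle e,t\rangle,\langle e,\langle e,t\rangle\rangle\rangle^{s}$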
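This lexical entry is of mixed type; derivations with it will proceed via the rules (R8), for the combinatoric steps, and (R9), for the final step which shifts the mixed content to something interpretable via (20). Here is the derivation, with types and rules of proof only.37

\[
\begin{array}{ll}
\dfrac{\mathrm{piano} : e^{a} \qquad \mathrm{hiite} : \langle e,\langle e,t\rangle\rangle^{a}}{\mathrm{hiite}(\mathrm{piano}) : \langle e,t\rangle^{a}} & \text{R2} \\[3ex]
\dfrac{\mathrm{moratta} : \langle\langle e,t\rangle,\langle e,\langle e,t\rangle\rangle\rangle^{a} \times \langle\langle e,t\rangle,\langle e,\langle e,t\rangle\rangle\rangle^{s} \qquad \mathrm{hiite}(\mathrm{piano}) : \langle e,t\rangle^{a}}{\mathrm{moratta}(\mathrm{hiite}(\mathrm{piano})) : \langle e,\langle e,t\rangle\rangle^{a} \times \langle e,\langle e,t\rangle\rangle^{s}} & \text{R8} \\[3ex]
\dfrac{\mathrm{moratta}(\mathrm{hiite}(\mathrm{piano})) : \langle e,\langle e,t\rangle\rangle^{a} \times \langle e,\langle e,t\rangle\rangle^{s} \qquad h : e^{a}}{\mathrm{moratta}(\mathrm{hiite}(\mathrm{piano}))(h) : \langle e,t\rangle^{a} \times \langle e,t\rangle^{s}} & \text{R8} \\[3ex]
\dfrac{\mathrm{moratta}(\mathrm{hiite}(\mathrm{piano}))(h) : \langle e,t\rangle^{a} \times \langle e,t\rangle^{s} \qquad t : e^{a}}{\mathrm{moratta}(\mathrm{hiite}(\mathrm{piano}))(h)(t) : t^{a} \times t^{s}} & \text{R8} \\[3ex]
\dfrac{\mathrm{moratta}(\mathrm{hiite}(\mathrm{piano}))(h)(t) : t^{a} \times t^{s}}{\pi_1(\mathrm{moratta}(\mathrm{hiite}(\mathrm{piano}))(h)(t)) : t^{a} \;\bullet\; \pi_2(\mathrm{moratta}(\mathrm{hiite}(\mathrm{piano}))(h)(t)) : t^{s}} & \text{R9}
\end{array}
\]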
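The repeated use of (R8) followed by a single (R9) step in this derivation can be sketched as a loop. The encoding is an assumption made for illustration: both sides of the mixed entry are curried Python functions over strings, the pair `(at-issue, shunting)` stands in for the diamond, and `curried`, `r8`, `r9` and `moratta` are hypothetical names. The string `"hiite(piano)"` stands for the result of the initial R2 step.

```python
from typing import Any, Callable, Tuple

Mixed = Tuple[Any, Any]  # (at-issue side, shunting side)


def curried(template: str) -> Callable:
    """Build a curried three-place functor taking P, then x, then y."""
    return lambda p: lambda x: lambda y: template.format(y=y, px=f"{p}({x})")


def r8(mixed: Mixed, arg: str) -> Mixed:
    """(R8): feed the same at-issue argument to both sides."""
    return (mixed[0](arg), mixed[1](arg))


def r9(mixed: Mixed) -> Tuple[Any, str]:
    """(R9): applicable only once the shunting side is propositional."""
    if callable(mixed[1]):
        raise TypeError("shunting side is not yet of type t")
    return mixed


# Assumed rendering of the lexical entry in (27a).
moratta: Mixed = (curried("cause({y}, {px})"), curried("good({y}, {px})"))

term: Mixed = moratta
for argument in ("hiite(piano)", "h", "t"):  # three successive (R8) steps
    term = r8(term, argument)

at_issue, cie = r9(term)
print(at_issue)
print(cie)
```

Unpacking the final pair plays the role of the projections in the last line of the derivation: the first component is the at-issue proposition and the second the shunted one.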
Formal and informal pronouns in various European languages such as tu/vous in French or tu/usted in Spanish also carry mixed content, as discussed by Horn (2007). These objects carry the conventional implicature that the speaker feels (as if he should be) formal (informal) toward the addressee, while having the at-issue indexical denotation of a normal second person pronoun, on which they pick out the addressee of the context (Kaplan 1989). Again, they are (at the surface) monomorphemic, and they plainly introduce both at-issue and CIE content, making them mixed content bearers by the proposed criteria. This means the formal versions can be assigned the following denotation, where $s_c$ denotes the speaker of the context and $h_c$ its hearer:

(28)
\[
h_c \diamond \mathrm{honor}(s_c, h_c) : e^{a} \times t^{s}
\]
36 The term morat-ta ‘Ben-Pst’ is derived from mora-u ‘Ben-Npst’ via morphological operations that are of no concern to us here.

37 $\pi_1$ and $\pi_2$ here are the usual projection functions/pullbacks on product types, which work to pick out the first or the second element of the product type, respectively.

I make use of just an honorific relation here, following Potts & Kawahara (2004). I do not want to take a position on its content here because mere use of a pronoun need not indicate that the addressee is actually honored. It is difficult to decide exactly what should be made of insincere uses of such pronouns. Potts & Kawahara (2004) analyze Japanese subject honorifics as performative, so their use already causes the ‘honoring’ relation to hold; it is not so clear to me that this is the right analysis, for there is a merely normative or polite use. Perhaps we should understand honor(x, y) in this way. The same of course holds for the honorifics discussed earlier. I put these delicate issues aside here.

This is the place to discuss the alternative decompositional analysis in detail. Potts (2007a) provides an analysis of formal pronouns in terms of an honorific feature applying to the pronoun meaning. The idea is that a pronoun consists of a feature bundle which introduces certain kinds of content via the features themselves. Kratzer (2009) elaborates this sort of view. This is certainly another possible route for the pronoun case; the correct answer depends on what the real nature of pronouns is, and on how much of this should be implemented at the level of interpretation rather than, say, morphology. I cannot address these large questions in this paper. My work here merely implements the picture suggested by Horn’s (2007) work. I am not ultimately certain what the right analysis of pronouns should be. However, I am skeptical about the prospects of extending this sort of view to the general case of mixed content.38

38 Thanks to several anonymous reviewers and to Chris Potts (p.c.) for discussion of this point.

The question ultimately is whether we need a separate system of types for mixed content at all. Generalizing from the above, one might wish to maintain the simpler system of $\mathcal{L}_{CI}$ and analyze all mixed content expressions as morphologically complex at the level of type combination: in other words, to decompose all mixed content bearers into at-issue parts and CIE parts, and let these parts operate on one another to yield the right meanings. Could this strategy work? Not without further elaboration, because cases like the Japanese benefactive above require multiple operations at the CIE level, which we have seen cannot be handled by using CI types. I do not see any easy way to get around this problem, even if one admits shunting types into the system (so adopting $\mathcal{L}^{+S}_{CI}$; see Appendix B.1), while rejecting mixed types. But this is largely a technical problem. It is possible that it might have a solution within the system, though I cannot see how it would be done.39

39 One possibility would be to perform an extreme decomposition and separate out a ‘morpheme’ from the benefactive of type $\langle t^{a}, t^{c}\rangle$ which would provide a conventionally implicated modification of the whole sentence. For this to work out, one would need a way to predicate properties (e.g. deriving benefit) of individuals occupying roles in the sentence without doing so directly, which might be done by using neo-Davidsonian event semantics, or a system providing ‘tags’ for grammatical roles in the way that e.g. LFG does. But allowing the multiple morphemes introduced by lexical items in decompositional analyses to take distinct scope positions and to be of different types opens the door to many impossible readings and unattested possibilities; the costs of the story seem to far outweigh the benefits here.

More worrisome, in my view, is the idea of necessarily decomposing all mixed content terms. One can justify this move in the case of pronouns, which have independently motivated analyses as feature bundles already. It may also be justifiable for some pejoratives, like Jap, which is truncated; I noted previously that one might take the truncation to introduce expressive content as a separate morpheme. Perhaps it is even possible to decompose honorifics like irassharu as something like [V COME Hon ], a motion verb with a separate honorific morpheme. But giving a multimorphemic analysis to epithets like bum or asshole, pejoratives like Frog or Boche,40 or (especially) the so-called colored terms that will close the discussion in this section seems to be a stretch. In at least some of these cases, a decompositional analysis seems very unnatural. I do not think that a knockdown argument is available against such analyses; one could always decompose, after all. But in at least these cases, there is no obvious motivation for decomposition, other than the limitations imposed by the analytical resources made available in $\mathcal{L}_{CI}$. Without independent motivation, it seems much more natural just to analyze them as mixed content bearers. At the very least, one would not want to be forced to a decompositional analysis by the type system underlying the work.

40 Again, these pejoratives are selected for their lack of real sting. It is not hard to find other pejoratives that are clearly monomorphemic in my sense, but most of them are sensitive enough that I will avoid even their mention, much less their use.

As a final example of mixed content terms discussed in the literature, and perhaps the example least amenable to decomposition, let us consider pairs like Frege’s steed and nag, where the extensions are identical but the attitudes conveyed distinct (Horn 2007). Terms of this kind initially appear similar to pejoratives, but they are semantically distinct. While pejoratives express negative attitudes toward all members of some particular group, steed, nag and other terms that merely add ‘color’ to an at-issue description (Neale 1999) express positivity and negativity which is directed only at the individual being described, in the case of predicative uses. Again, these expressions are monomorphemic and introduce both at-issue and CIE content; they are therefore mixed content bearers, which do not seem to be decomposable in any natural way.
(29) a. Get my steed from the stable.
        at-issue: ‘Get my horse from the stable.’
        CIE: ‘My horse is a noble animal.’
     b. Get my nag from the stable.
        at-issue: ‘Get my horse from the stable.’
        CIE: ‘My horse is a useless animal.’
This generalization can be taken to mean that colored terms have denotations of a similar type to the subject honorifics discussed earlier. We can give them lexical entries as follows. (30)
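a. $\textit{steed} = \lambda x.\,\mathrm{horse}(x) \diamond \lambda x.\,\mathrm{noble}(x) : \langle e,t\rangle^{a} \times \langle e,t\rangle^{s}$
b. $\textit{nag} = \lambda x.\,\mathrm{horse}(x) \diamond \lambda x.\,\mathrm{useless}(x) : \langle e,t\rangle^{a} \times \langle e,t\rangle^{s}$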
The behavior of these items in semantic derivations should be obvious by now; I omit showing details of any derivations. Let me briefly mention another case provided by McCready & Schwager (2009), who discuss the Viennese German intensifier ur in this system. One use of ur is to intensify the meaning of a noun or adjective: (31)
a. Das ist ur interessant.
   that is ur interesting
   ‘That is totally interesting.’
b. Er ist ein ur Idiot.
   he is a ur idiot
   ‘He is a total idiot.’
The meaning of this modifier has two parts. First, it performs intensification in the at-issue dimension, so (31a) means that the referent of that is extremely (or ‘totally’) interesting; but the speaker also indicates that she holds some emotive attitude toward the sentential content. This latter part is expressive or conventionally implicated, and indeed bears the usual hallmarks of emotive expressive meanings: for example, it is highly context dependent with respect to positivity and negativity.41 McCready and Schwager further provide a formal semantics for the intensifier in $\mathcal{L}^{+}_{CI}$. The analysis is complex, and I will not review it here; but it is at least clear that ur passes the tests I have proposed for mixed content bearers.

41 Footnote 50 discusses the issue of context dependence of emotive meanings further.
I suppose that there are many other kinds of mixed content, but most have not come to the attention of researchers yet. The previous discussion should at least show the usefulness of the notion. There is plainly much more work to be done on the range of conventionally implicating and expressive items in the world’s languages, but I hope that the small sample given here and in the previous section shows that the type-theoretic tools proposed here have useful application in their analysis.

3 Main CIEs
The logic proposed in the previous section, $L^{+}_{CI}$, does more than allow for the analysis of mixed content. The introduction of shunting types that was shown to be necessary for that purpose also makes available another possibility for semantic denotation. As we have seen, the result of composition with mixed terms is similar in the end to the addition of supplementary information via conventional implicatures: this similarity is modeled by letting both sorts of CIE content be conjoined to at-issue content via the bullet. Shunting types, though, because of their resource sensitivity, allow for a situation where there is no at-issue content at all. The aim of this section is to show that this feature of the logic should not be taken as a negative one.
The existence of shunting types implies that a particular sentence (or utterance) may convey only CIE content. We will examine several cases where this situation appears to be realized. In general, this situation is somewhat special; the uses of language most often analyzed in linguistic and philosophical work serve to convey information about the world, rather than to express aspects of the speaker’s mental state or meta-information about the conversation, which (arguably) is the function of conventional implicature. Information about the world is thus conveyed mostly by default here, or in ways other than via the conventional implicature itself, e.g. when the ‘primary’ content is present in the context, or entered into it by other means. This observation suggests a division in content type which we will find to be borne out, at least at the level of inspection that I can provide in the present context.
The discussion is structured as follows. In section 3.1, I briefly show why shunting types imply that CIE content can be primary. Section 3.2 examines a first case: single-word utterances of particles of the kind introduced in Kaplan 1999. There it is also shown that these cases exhibit unexpected behavior from the perspective of $L_{CI}$ in that they can fall in the scope of certain semantic operators. As it turns out, the existence of shunting types makes it possible to allow for these cases while simultaneously retaining Potts’s generalizations about the interaction of semantic operators and CIE content. Section 3.3 discusses the Japanese adverbial yokumo, which exhibits a different kind of behavior: while the denial test supports an analysis of the content of sentences containing this adverbial as CIE, there is composition within the adverbial scope, unlike what is found with Kaplan’s particles (as noted by Kratzer 1999). It is shown that analyzing yokumo as being of shunting type provides an explanation of its behavior with respect to denials. Section 3.4 concludes with some suggestions about possible related phenomena.
3.1 Why Main Content?
The reason that shunting types allow for utterances with only CIE content is the resource-sensitivity of these types. The function of shunting types is to ‘shunt’ at-issue content into the CIE dimension of meaning; because of the resource-sensitivity of these types, no at-issue content remains. Any successful derivation will result in an object of type $t^{s}$. Here is a sample, with two applications:
$$\dfrac{\dfrac{\alpha : \sigma^{a} \qquad \gamma : \langle \sigma, \langle \tau, \upsilon \rangle\rangle^{s}}{\gamma(\alpha) : \langle \tau, \upsilon \rangle^{s}}\,\mathrm{R7} \qquad \beta : \tau^{a}}{\gamma(\alpha)(\beta) : \upsilon^{s}}\,\mathrm{R7}$$
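As a rough illustration of the resource sensitivity at work in this derivation, the following Python sketch contrasts application for CI-typed and shunting-typed functors. It models only the behavior described in the text: a CI functor passes its at-issue argument along (yielding the bullet-conjoined pair), while a shunting functor consumes it. The names `Basic`, `Fn`, `Term`, `dim` and `apply_term` are hypothetical, introduced for this illustration, and are not part of the logic itself; types are simplified so that a functional type inherits the dimension of its final output.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Basic:
    name: str  # schematic type letter, e.g. "t" or "sigma"
    dim: str   # "a" = at-issue, "c" = CI, "s" = shunting


@dataclass(frozen=True)
class Fn:
    arg: "Type"
    res: "Type"


Type = Union[Basic, Fn]


@dataclass(frozen=True)
class Term:
    expr: str
    type: Type


def dim(ty: Type) -> str:
    """Dimension of a type (simplifying assumption: that of its final output)."""
    return ty.dim if isinstance(ty, Basic) else dim(ty.res)


def apply_term(fn: Term, arg: Term) -> List[Term]:
    """Apply fn to arg and return the terms that are present after the step."""
    if not isinstance(fn.type, Fn) or fn.type.arg != arg.type:
        raise TypeError(f"cannot apply {fn.expr} to {arg.expr}")
    result = Term(f"{fn.expr}({arg.expr})", fn.type.res)
    if dim(fn.type) == "c":
        # CI application is resource-insensitive: the argument survives.
        return [arg, result]
    # At-issue and shunting application consume the argument.
    return [result]


# The sample derivation: gamma(alpha)(beta) ends up in the shunting dimension.
sigma_a, tau_a = Basic("sigma", "a"), Basic("tau", "a")
alpha, beta = Term("alpha", sigma_a), Term("beta", tau_a)
gamma = Term("gamma", Fn(sigma_a, Fn(tau_a, Basic("upsilon", "s"))))
step1 = apply_term(gamma, alpha)
step2 = apply_term(step1[0], beta)
at_issue_left = [t for t in step2 if dim(t.type) == "a"]

# Contrast: the same functor typed as CI versus as shunting.
phi = Term("phi", Basic("t", "a"))
man_c = Term("man", Fn(Basic("t", "a"), Basic("t", "c")))
man_s = Term("man", Fn(Basic("t", "a"), Basic("t", "s")))
ci_result = apply_term(man_c, phi)     # phi survives next to man(phi)
shunt_result = apply_term(man_s, phi)  # only man(phi) remains
```

In this sketch, `at_issue_left` is empty after the two shunting applications, whereas the CI-typed functor leaves its argument in place.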
Plainly, no at-issue content remains. We have seen that shunting types are needed in the analysis of mixed content. But their existence implies that there could be expressions that are exclusively of shunting type. The rest of this section indicates some instances of such expressions in various natural languages.
Before the empirical facts, though, two theoretical issues must be addressed: one relatively simple, and one difficult. The first issue is that the definition of proof tree interpretation in (20) cannot be used when an utterance lacks asserted content. The reason is that the definition assumes the existence of an object of type $t^{a}$ on the root node, but when there is no asserted content, there is no such object.42 It is therefore necessary to modify the definition to allow for this case. Note that it also seems necessary to modify the original definition provided by Potts (2005), for precisely the same reasons; I therefore modify (20) to cover the case where the utterance contains only content of type $t^{c}$ as well. I will simply stipulate that in cases where a sentence lacks asserted content it is still interpreted as a 2-tuple, but one with a first (left) element which is always satisfiable. I will denote this trivial assertion by $T$. The result of all this is a definition with two distinct cases, one which applies when there is an asserted proposition, and one which applies when there is not.
42 Thanks to Kai von Fintel (p.c.) for bringing this issue to my attention.
(32) Generalized Interpretation (final).
     i. Let $\mathcal{T}$ be a proof tree with at-issue term $\alpha : \sigma^{a}$ on its root node, and distinct terms $\beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}$ on nodes in it. Then the interpretation of $\mathcal{T}$ is $\langle \alpha : \sigma^{a}, \{\beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}\}\rangle$.
     ii. Let $\mathcal{T}$ be a proof tree with term $\alpha : \sigma^{\{c,s\}}$ on its root node, and distinct terms $\beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}$ on nodes in it. Then the interpretation of $\mathcal{T}$ is $\langle T, \{\alpha : t^{\{c,s\}}, \beta_1 : t^{\{c,s\}}, \ldots, \beta_n : t^{\{c,s\}}\}\rangle$.
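The two clauses of (32) can be sketched as a small Python function. This is only an illustration under simplifying assumptions: terms are represented as (expression, type, dimension) triples, the trivial assertion is represented by the placeholder string "T", and the names `Term`, `TRIVIAL` and `interpret` are hypothetical. The example expressions are illustrative labels, not analyses.

```python
from typing import FrozenSet, NamedTuple, Sequence, Tuple, Union


class Term(NamedTuple):
    expr: str
    type: str  # e.g. "t"
    dim: str   # "a" = at-issue, "c" = CI, "s" = shunting


TRIVIAL = "T"  # placeholder for the always-satisfiable left element


def interpret(
    root: Term, side: Sequence[Term]
) -> Tuple[Union[Term, str], FrozenSet[Term]]:
    """Return the 2-tuple interpretation of a proof tree, following (32)."""
    # The collected terms must be propositional and of dimension c or s.
    for b in side:
        if b.type != "t" or b.dim not in ("c", "s"):
            raise ValueError(f"{b.expr} is not a propositional CIE term")
    if root.dim == "a":
        # Clause (i): an at-issue root is the asserted element.
        return root, frozenset(side)
    # Clause (ii): no asserted content, so the left element is trivial
    # and the root joins the set of CIE terms.
    return TRIVIAL, frozenset((root, *side))


# An utterance with asserted content plus one supplementary CI term.
with_assertion = interpret(
    Term("rich(bill)", "t", "a"),
    [Term("philanthropist(bill)", "t", "c")],
)

# An utterance whose only content is of shunting type.
only_cie = interpret(Term("ouch", "t", "s"), [])
```

Here `only_cie` has the trivial element in its left position, so the interpretation is still a 2-tuple even though nothing is asserted.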
The second issue is less easily resolved. We have a fairly good idea of what conditions there are on assertion and what norms govern this speech act. But these norms do not necessarily apply when there is no asserted content present in an utterance. What then are the norms of the use of sentences which have CIE content as their primary content?43 This is a difficult question and one which might be asked about all uses of CIE content. It is not really clear at this point exactly what the normative conditions are on the use of supplementary CIEs, for example. A full answer is therefore far beyond the scope of this paper. I can only suggest a path toward an answer here. It seems that what the ‘norms of expression’ are depends on what kind of act is at issue. In assertion we are, roughly, concerned with the transmission of true information. If a sentence is false, then a norm has been violated. With respect to CIE content, one can think of a notion of ‘expressive correctness,’ following Kaplan; the question then becomes what exactly it takes for something to be expressively correct. The answer to this turns on what one takes the function of CIEs to be. It is not clear to me that we have the necessary understanding of their function yet. Once we do, we will be in a better position to articulate the norms of expressive use. Let us now turn to some empirical facts, focusing on particles and adverbials.
43 Thanks to Kai von Fintel (p.c.) for raising this question.
3.2 Particles
Sentence-modifying particles introduce several interesting issues. First, we can consider the case of particles that do not modify any sentences, such as man.
(33) Man!
This kind of case is discussed briefly by McCready (2008b). There man was taken to be a conventional implicature-introducing propositional modifier that applies to a proposition made available by context. If one agrees with this analysis (and if one follows the analysis of proposition-modifying sentence-initial man offered in that paper), one ends up with an undesirable situation where both $\mathit{man}(\phi)$ and $\phi$ are directly communicated. The reason is that man would end up being analyzed as of type $\langle t, t\rangle^{c}$, which means that one ends up with the denotation $\phi : t^{a} \bullet \mathit{man}(\phi) : t^{c}$ for the sentence. Intuitively, though, this is not correct: $\phi$ is not asserted by sentences like the above. To see this, consider cases where a question is answered with the particle:
(34) a. A: What’s the weather like outside?
     b. B: Man!
B’s response is understood roughly as follows: B has some sort of strong feeling about the weather outside. It is not clear what the weather outside is actually like. In this kind of case, A is likely to infer that the weather is extreme in some way, but exactly what way this is depends entirely on A’s prior knowledge about the weather. We can therefore see clearly that the proposition man modifies is not asserted by B’s utterance: if it were, it should be recoverable, but it is not. Still, we should not take this to mean that nothing about this proposition is communicated, only that this communication cannot be ‘literal.’
Of course, there is another possibility for analysis. The above discussion is relevant only if stand-alone man is in fact modifying a proposition. It is also possible that it is a simple exclamation of the type discussed immediately below: if this is right, then (33) indicates only that the speaker is in an excited state. If so, then the conclusion that B’s response in (34) indicates something about the weather follows completely from inference: given that A has asked a question about the weather and B is indicating that he is in a heightened emotional state, it is natural (though defeasible) to conclude that he is excited about the weather. It is not easy to see which of these options is correct, for it’s not clear that there are empirical tests to distinguish between the two positions.44 However, as we’ll see, either approach supports an analysis of particles that takes them to denote objects of shunting type. Clearly, on either analysis, stand-alone particles provide another case where the conventionally implicated content is the primary content of the utterance.
If we assume that a proposition is being directly modified, man can be typed as $\lambda p.\,\mathit{man}(p) : \langle t^{a}, t^{s}\rangle$, ignoring the actual content of the particle, which is roughly that the speaker has some kind of emotional reaction toward p (that it is good or bad).45 This analysis disallows the assertion of p itself, as desired. The question of how extensively we should take particle meanings to be analyzable in terms of shunting types is left for another occasion; it turns on the empirical question of whether or not the propositional content of sentences modified by particles can serve as answers to questions. In many cases it is clear that they can, in others, perhaps not.
Another kind of even more obvious case is that of expressives that do not perform any modification, such as salutations or fully expressive exclamations (cf. Kaplan 1999; Kratzer 1999). On the second analysis of stand-alone particles like man, they too will fall into this category.
(35) a. Thanks!
     b. Good morning.
     c. Ouch!
Expressions like these lack truth conditions, though they can be expressively correct (appropriate) or not. They plainly do not assert anything.46 They can be analyzed as objects of type $t^{c}$ (or $t^{s}$), which simply express something about the speaker’s mental states or what she takes the situation to be like. Here the extension to $L^{+S}_{CI}$ does not at first appear necessary, as type $t^{c}$ is sufficient, given that no combinatorics are taking place; but it is clear that, in cases like these, the expressive (or conventionally implicated) content is the main content of the utterance. We thus have a division between cases of ‘primary’ CIs: one, modeled via shunting types, where the CI content is functional, and another, apparently modellable either via shunting types or CI types, where the content is not functional and expresses a constant.
44 We cannot, for instance, make use of the kind of binding tests that proponents of ‘unarticulated constituents’ have taken as evidence for their approach (cf. Stanley 2000 for a use of these tests, and Cappelen & Lepore 2005 for critical discussion).
45 The semantics of man is discussed in detail in McCready 2008b.
46 As the editors point out, this is so only if one does not accept relevant aspects of the performative hypothesis, according to which (35c), for example, would assert something like ‘I hereby express “ouch!”’ Discussion of the hypothesis with arguments for and against it can be found in Levinson 1983.
However, it turns out that there are reasons to take type $t^{c}$ to be inappropriate for these contexts. The reason is that, by definition, there are no functional types taking CI types as input. As discussed in detail above, this is by design: the content of e.g. appositives never seems to fall in the scope of semantic operators. But certain operators are able to act on expressive particles such as those discussed by Kaplan: namely, other particles.
(36) a. Ouch, man!
     b. Man, ouch!
If man is to modify ouch in these cases, it must be either of type $\langle t^{c}, t\rangle$ or $\langle t^{s}, t\rangle$ (where the output type is also either $t^{c}$ or $t^{s}$). But if it takes an object of type $t^{c}$ as input, the generalization about the semantic independence of e.g. appositives is lost: we must admit functional types taking CI types as input. If we assume that ouch denotes something of type $t^{s}$, though, we can avoid this situation.
One might think that the two particles are merely adjacent, so that neither needs to be analyzed as functional. To see that there is genuine interaction between the two particles, consider the following two situations.
(37) a. Situation 1: You stub your toe on the curb while walking down the street with your friend Curly.
     b. Situation 2: Your friend Curly suddenly pokes you in the eye with a fork.
(38) a. Ouch!
     b. Ouch, man!
(38a) is an appropriate utterance in either Situation 1 or Situation 2. (38b) gives an impression of blame: ‘it’s your fault that I am in a position to say this appropriately!’47 This kind of accusation is obviously appropriate in Situation 2. If uttered in Situation 1, it is somewhat odd: why is it Curly’s fault that you’ve stubbed your toe? These considerations are enough to make it clear that man is in fact doing something to the meaning of ouch in (38b), and so some kind of composition is at work.
Another kind of example comes from the intensifiers discussed by McCready & Schwager (2009). One use of these expressions is as propositional modifiers, which intensify along the expressive dimension, as in (39).
(39) a. John totally came to the party.
     b. He fully wiped out, dude.
McCready & Schwager (2009) analyze uses like these as expressing that the speaker has maximal epistemic commitment to her justification for her use of the modified proposition, so (39a) would express that the speaker is maximally committed to her justification (evidence) that John came to the party. It turns out that these modifiers can also modify purely expressive items in some dialects of English.
(40) Totally ouch(, dude).
On the McCready and Schwager analysis, this would express that the speaker has maximal commitment to her justification for uttering ouch, itself an expressive item. Presumably such justification would be a pain felt by the speaker or something similar. But the main point for our purposes here is that ouch is a bearer of purely expressive content. A proper analysis of cases like these therefore will, again, require modification of expressive content.
We have now seen that there are instances in which purely expressive content is modified. This means that we must add to the system a provision for operators that take CIE content as input. But what type of content should this be? The worry is that, if we allow operators over CI types ($\sigma^{c}$), the generalizations made by Potts (i.a.) about modification of conventional implicatures such as the content of appositives are lost. The natural way to avoid this problem is to analyze man and totally in (39) as operators over shunting typed objects, that is, to make them of type $\langle t^{s}, t^{s}\rangle$.48 Such types are easily added to the system (via clause (i) of B.1.1). With this move the Potts generalizations are maintained in the type system.
I believe that the particles, and particularly the expressives like (35), are the clearest instances of sentences which lack at-issue content, and, perhaps as a consequence, are the instances which have received the most attention in the literature. Let us now turn to another kind of sentence that does not appear to have at-issue content.
47 I believe this follows from the analysis of sentence-final man given in McCready 2008b, on which it performs a dynamic strengthening of speech acts, though I will not provide details here.
48 Of course, there is also a need for a typing for these operators that allows them to modify at-issue content as well: $\langle t^{a}, t^{s}\rangle$. Depending on the facts about modification of CIE content, it may be that these two typings are consistently available for particles and other such modifiers. Much more empirical investigation is needed before this question can be answered definitively.
3.3 Yokumo
The second example we will consider is that of sentences modified by the Japanese adverbial yokumo. In line with McCready 2004, I will argue that yokumo introduces three pieces of content: a) a statement of the speaker’s emotional attitude toward the modified proposition $\phi$, b) a statement regarding the prior probability the speaker assigned to $\phi$, and c) a condition on mutual knowledge of $\phi$. Unlike McCready 2004, however, I will analyze conditions (a) and (b) as conventionally implicated rather than asserted, for reasons which will become clear. The question of the status of (c) is more difficult to resolve, but in the end I will conclude that it is presuppositional. The meaning of yokumo is complex, as may already be clear from the brief discussion above. Here are some representative examples, with somewhat rough translations.49
(41) a. Yokumo koko ni kita (na)!
        yokumo here to came (PT)
        ‘You have a lot of guts to come here!’
     b. Yokumo ore o damashita (na)!
        yokumo me Acc tricked (PT)
        ‘I can’t believe you had the gall to trick me.’
The most obvious approximation of the meaning of the adverbial is a simple negative statement about the propositional content.50
(42) $[\![\text{yokumo}]\!] = \lambda p.\,\mathrm{bad}(p)$
The second component of yokumo’s meaning involves likelihood. Yokumo indicates that the speaker did not expect the event described by the modified sentence to occur, and that she is surprised that it actually did. There are a variety of ways to model this situation. I will simply make use of a predicate surprise, which can be given a semantics in terms of probabilities in ways that are more or less obvious.51 Adding this to the denotation of yokumo yields
(43) $[\![\text{yokumo}]\!] = \lambda p.\,\mathrm{bad}(p) \wedge \mathrm{surprise}(p)$
One element of this adverbial’s meaning remains to be analyzed. It was also discussed by McCready (2004): the proposition modified by yokumo must be (believed by the speaker to be) common ground. To see that this proposition must indeed be common ground, note that sentences modified by yokumo are not felicitous as answers to questions.
49 Most examples in this section come from McCready 2004.
50 This is the simplest version of the adverbial meaning. For many speakers, yokumo can also be used with a positive meaning.
(i) omae yokumo konna ii sakuhin kaketa na
    you yokumo this-kind-of good artwork write.able-Pst PT
    ‘I can’t believe you were able to make a piece this good!’
Whether the attitude expressed by yokumo is positive or negative appears to depend on several factors. First, the content of the sentence: in (41b), the modified proposition describes an event that (we can assume) was negative for the speaker, while (i) is clearly positive. Other facts about the world also must play a role, though. Suppose that it is the speaker’s birthday, and he comes home to find a surprise party. The hearer had told him earlier that everyone had forgotten his birthday. Here, the tricking lacks a negative character. The identity of the speaker also obviously plays a role. These facts are reminiscent of what we find with modification by the particle man (McCready 2008b), which has the introduction of emotional attitudes as one of its functions. There I introduced a function $E$ which maps Kaplanian contexts and propositions to emotive predicates; the relevant features of the context, and the content of the proposition, determine an emotive predicate, which is then applied to the proposition itself. In these more permissive dialects, the statement $\mathrm{bad}(p)$ in the semantics below should be replaced with $E(c)(p)(p)$, which is interpreted, after application of $E$ to the context and the proposition, either as $\mathrm{bad}(p)$ or $\mathrm{good}(p)$. The issue of how the emotive import of expressives arises is an important one in the context of the study of expressive meaning and one I hope to return to in later work, but is orthogonal to the purposes of the present paper, which is mostly concerned with combinatorics.
51 The operator should be defined in terms of probabilities prior to learning that the ‘surprising’ proposition is true, which requires a notion of dynamic changes in probabilities. For discussion, see Jeffrey 1983, Kooi 2003, or McCready & Ogata 2007.
(44) a. Context: A asks B ‘Who did Austin marry?’ (McCready 2004)
     b. #Yokumo Dallas to kekkon sita na!
        yokumo Dallas with marry did PT
        ‘He did an amazingly stupid and shocking thing by marrying Dallas!’
This example can be taken to indicate that yokumo cannot provide new information. In my earlier work I modeled this knowledge requirement via a condition on update: update is only defined if both hearer and speaker already know the content of the proposition, in conjunction with an assumption of common knowledge. There are several options regarding how this condition should be stated. On the one hand, it is possible to simply presuppose $CG_{\{s,h\}}(\phi)$, i.e. that $\phi$ is common ground for speaker and hearer;52 on the other hand, taking a less interactive approach to the dynamics of information, we can simply stipulate that an update with $\mathit{yokumo}(p)$ is only defined if update with $p$ does not alter the information state of speaker or hearer. These two conditions amount to the same thing for present purposes.53 I will make use of the former method in this paper.54 We arrive at the following lexical entry.55
(45) $[\![\text{yokumo}]\!]^{c} = \lambda p : CG_{\{s,h\}}(p).\ \mathrm{bad}(p) \wedge \mathrm{surprise}(p)$
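The entry in (45) can be read as a partial function: it is defined only when the modified proposition is already common ground, and otherwise yields no value. The Python sketch below illustrates this reading. It makes several assumptions that are not part of the analysis: propositions are plain strings, the common ground is a set of such strings, bad is a lookup in a speaker-specific set, and surprise is approximated by a hypothetical threshold on the speaker's prior probability (the text leaves the probabilistic semantics of surprise open). The function name `yokumo` and all parameter names are chosen for illustration only.

```python
from typing import Dict, Optional, Set


def yokumo(
    p: str,
    common_ground: Set[str],
    bad: Set[str],
    prior: Dict[str, float],
    threshold: float = 0.2,  # hypothetical cut-off for "unexpected"
) -> Optional[bool]:
    """Sketch of (45): presupposes CG(p); content is bad(p) and surprise(p)."""
    if p not in common_ground:
        # Presupposition failure: the entry is undefined for this input.
        return None
    # Assumption: p counts as surprising if its prior probability was low.
    surprising = prior[p] < threshold
    return (p in bad) and surprising


marriage = "marry(austin, dallas)"
speaker_prior = {marriage: 0.05}

# As in (44): the marriage is new information, so the entry is undefined.
as_answer = yokumo(marriage, set(), {marriage}, speaker_prior)

# As in (46): the marriage is already common ground.
as_comment = yokumo(marriage, {marriage}, {marriage}, speaker_prior)
```

The value returned in the defined case stands for the content contributed by the adverbial; the sketch says nothing about the dimension of meaning in which that content ends up.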
52 See van Ditmarsch, van der Hoek & Kooi (2007) for the semantics of this operator.
53 We do not need to concern ourselves with deep questions about the difference between knowledge and belief here, for instance.
54 In McCready 2004, I took the second route. This decision was partly motivated by the fact that the particle na can induce felicity, which I took to mean that it can help introduce content into the common ground. Since I will not consider the action of this particle in this paper, we can avoid detailed discussion of common ground and update. In any case, it may well turn out that na has a different function that makes sentences modified by it compatible with yokumo (McCready, in preparation).
55 One might think that all this is unnecessary, given that $\mathrm{surprise}(\phi)$ is factive, if we assume that the logical predicate has the same interpretation as the natural language surprise, which I see no reason to do. But even if it is presupposed that $\phi$, must we take $\phi$ to be common knowledge? The answer is yes. First, note that what is presupposed by $\mathrm{surprise}(\phi)$ is not $\phi$ but that the speaker (believes herself to have) learned $\phi$ at some past time, which is already the wrong interpretation. Further, this presupposition should be accommodatable; but it is not. This is surprising given the results of Kaufmann (2009), who shows that such presuppositions should be readily accommodatable, unlike presuppositions about the common ground. I take this to indicate that the presupposition of common ground is needed.
This essentially restates the lexical content originally provided in McCready 2004. However, there is more to the story, as discussed in that paper. In (45) I have, without argument, taken the common ground condition to be presupposed, and the other two parts of the meaning to be asserted. But if they are indeed asserted, it should be possible for a hearer to deny them directly. However, the content of $\mathit{yokumo}(p)$ cannot be directly denied. Consider the following example.56
(46) Yokumo Dallas to kekkon shita na!
     yokumo Dallas with marry did PT
     ‘He did an amazingly stupid and shocking thing by marrying Dallas!’
     a. # sore-wa hontoo janai
          that-Top truth Cop.Neg
          ‘That’s not true.’
     b. # uso da
          lie Cop
          ‘That’s a lie!’
Each of the possible denials in (46) is infelicitous. One might try to explain this in terms of ‘privileged content’ or speaker relativity; it is known that it is difficult to make claims about the truth or falsity of claims that depend (in part) on the speaker’s preferences (cf. Lasersohn 2005; Stephenson 2007). It makes some sense, given this, that the emotive content of the adverbial is hard to deny. But this argument does not go through for the probability statement.57
The analysis starts with the observation that it is not actually impossible to deny the content of the adverbial; it just cannot be done with the responses in (46). Less direct expressions are needed.
(47) Yokumo Dallas to kekkon sita na!
     yokumo Dallas with marry did PT
     ‘He did an amazingly stupid and shocking thing by marrying Dallas!’
     a. Chigau yo!
        wrong PT
        ‘That’s wrong!’
     b. Sonna koto nai yo!
        that-kind-of thing Cop.Neg PT
        ‘That’s not right.’
56 Here we suppose that it is known that the referent of ‘he’ is marrying Dallas.
57 If probabilities are understood as subjective, the basis for assertion may indeed be hard to deny. But it seems clear that statements about likelihood become part of the public domain once made, so denial of the surprise clause in the denotation of yokumo is surely possible.
These facts are reminiscent of facts noted by Potts (2005) about conventional implicatures. How can one call the content of a nominal appositive into question, given that it cannot be denied directly?
(48) Bill, the philanthropist, is very rich.
     a. That’s not true. (= Bill is not very rich.)
     b. Well, yeah, he is, but that’s not really right . . . (= casts doubt on the appositive content)
What I will call truth-directed denials like those in (46) cannot target conventionally implicated content, but only asserted content. Denials like (47) can target either type of content. If we assume that the content of yokumo is conventionally implicated, the facts in (46) are therefore immediately explained. Note that the fact that truth-directed denial can target the asserted content in (48) and not in (46) has an immediate explanation: (48) asserts that Bill is rich, but (46) asserts nothing at all, for it is already common ground that Dallas and Austin got married.58
In previous work, I analyzed these facts in Segmented Discourse Representation Theory (SDRT; Asher & Lascarides 2003), in a way related to the analysis of parentheticals of Asher (2000). Here I will explore a different approach.59
One may wonder if the above facts about denial are really sufficient evidence to justify treating the content of yokumo as conventionally implicated. This is legitimate; but, for independent reasons, it is difficult to apply the other standard test for conventional implicature. It is known that conventional implicatures are scopeless with respect to semantic operators over asserted content, such as negation, conditionals and the various modalities. Ordinarily, one would test the behavior of the putative conventional implicature item in operator contexts, and then draw conclusions about whether or not it is actually asserted. Unfortunately, this proves to be impossible with yokumo. Yokumo is resistant to appearing in nonveridical contexts, as shown by McCready (2004).60 Because yokumo is ungrammatical in these contexts, it is impossible to test its scope behavior, and, as a result, the operator test for conventional implicature cannot be applied. The same goes for the binding test. Since yokumo can’t appear in conditional consequents, it is hard to tell whether or not its content would be bindable. But a conceptual argument is available. Intuitively, sentences modified by yokumo serve to introduce new information about the speaker’s mental states and attitudes. If this content were presupposed, then (on a standard picture of presupposition) the speaker would be assuming it to be in the common ground. But, intuitively, the speaker is communicating her attitudes, so the presupposition picture simply does not seem to be correct.61 Here I will take the results of the denial test to be conclusive, and therefore treat the content of yokumo as conventionally implicated in what follows (excluding the presupposition of common ground).62
The question now is what type to assign it. As with stand-alone man, there are two options: $\langle t^{a}, t^{c}\rangle$ and $\langle t^{a}, t^{s}\rangle$. Just as with man, there are obvious problems with the first option. Given the resource-insensitivity of CI types, applying a denotation of the first type to a proposition $\phi$ will yield $\phi : t^{a} \bullet \mathit{yokumo}(\phi) : t^{c}$. But this means that $\phi$ is asserted, and so it should be deniable. But it is not. The first option, therefore, cannot be right. Assuming yokumo to be of type $\langle t^{a}, t^{s}\rangle$, however, means that the result of combining the adverbial with a proposition will be only $\mathit{yokumo}(\phi) : t^{s}$; nothing is asserted, so the denial facts are predicted. The result is that sentences modified by yokumo carry only CIE content.
58 Another commonality can be found with denials. Note that there are two parts to the ‘deniable’ content of yokumo sentences, given that the proposition modified is already part of the common ground: the emotive content and the statement of surprise. For many (but not all) speakers, the denials of yokumo-modified sentences in (47) can only target one of these, meaning that they can deny the good/badness of the marriage, or its surprisingness, but not both. The same seems to hold for sentences in English where multiple conventional implicatures are tied to the same host NP, as in (ia). Here, the denial in (ib) seems to indicate that either a) John is not a banker, or b) that he does not own a large house. It is difficult to understand (ib) as denying both together. If this data is correct, the identification of the content introduced by yokumo as conventional implicature receives additional support.
(i) a. A: John, a banker, who owns a large house, is going bankrupt.
    b. B: Well, yeah, true, but . . .
However, none of this follows from the analysis I am going to provide in terms of $L^{+}_{CI}$, where the adverbial simply introduces a conjunction; unless it is assumed that only a single conjunct can be targeted by a denial in the case of conventionally implicated content. Formally, we might take the adverbial to introduce several distinct conditions, for example in the form of a set of propositions. Before taking this kind of step, though, it is worth checking to see how stable the denial facts are with respect to ‘multiple denials.’
59 The SDRT analysis involved assuming that each part of the lexical content of the adverbial introduced distinct speech act discourse referents which were then connected by discourse relations. This analysis has three problems, as I now see it. First, there is no clear reason why the denials in (46) are different from those in (47). There is no independently motivated reason to distinguish between these kinds of denial at the level of discourse structure (to my knowledge). Second, I had to make an assumption about possible attachment points for the denials to work out right, which also lacks independent motivation. Third, on my analysis there, $\mathit{yokumo}(p)$ also was taken to assert $p$, despite the presence of $p$ in the common ground already (as shown by the facts in (44)). This strikes me as highly problematic in view of the norms of assertion: one should not assert things that are already common ground (or even cannot, if this is taken to be a precondition on assertions). I therefore take the new analysis presented in the main text to be preferable.
60 The reason for this may relate to evidential behavior: it seems possible that yokumo requires that the speaker have a certain kind of relation with the proposition it modifies, in a way related to what is found with sentence-initial man (McCready 2008b). I will not consider this behavior in detail here.
3.4 Conclusion
In this section we have seen several areas in which natural language appears to make use of the possibilities afforded by shunting types, and have also had occasion to slightly extend L+ CI to allow for modification of shunting typed objects. I hope the reader has been convinced of their usefulness. I do not think that this discussion exhausts the utility of shunting types: for example, one other area where I think they could be useful is in the analysis of exclamatives, which have the combinatory properties one would expect from shunting-typed objects in terms of further combinatorics, given certain 61 This argument seems reasonable, but the presupposition that the modified proposition is in the common ground is less simple to get clear about. How can we be sure that presuppositions of this sort, that have no real equivalent in non-technical natural language, are not actually conventionally implicated? I do not know of a really good way. The issue is general, and has received a bit of recent discussion by Schlenker (2008), who raises worries for his theory of presupposition involving complex presuppositions that cannot be articulated easily or at all in natural language. This is an interesting issue but a difficult one, and I will not be able to do it full justice in this paper. 62 Another way to interpret these results is to conclude that yokumo introduces a different kind of content, that behaves in some ways similarly to CIE content (cf. the comments of a reviewer). This seems possible; but it also seems that, even in this case, it behaves like CIE content where it can appear. I think this justifies using the present system to analyze it.
8:43
Eric McCready
assumptions.63 They also exhibit semantic similarities with yokumo and even the modifications done by particles, which suggest a larger correspondence. The topic is large enough that I cannot do justice to it here. Another area is expressive small clauses, sentential phrases like (49), discussed by Potts & Roeper (2006). (49)
You damn fool!
Utterances like this one do not exhibit any at-issue content; there is nothing for truth-directed denials to target, for example. This fact makes it look like shunting types should be involved. As Potts and Roeper state, though, it is not completely clear how the details of the composition should work, and I cannot improve on their observations here. In a sense, the conventional implicatures introduced by shunting-typed content remain supplementary, at least in the cases examined here; the difference with ‘ordinary’ conventional implicatures of CI type is that shuntingtyped objects supplement content that is already present, and not asserted by the sentence providing the supplementary information. In the case of yokumo, this content must be introduced via accommodation, if it is not already present; but this presents no special difficulties, unlike presuppositions of some kinds of expressive content (e.g. Kaufmann 2009). For some other instances of CIE content in contexts where no assertion is made, the situation can be different, for instance in the analysis of the Japanese modal particle daroo provided by Hara (2008). According to this analysis, daroo(ϕ) conventionally implicates that µ(ϕ) > 50%, but does not assert anything. Hara notes that LCI is not appropriate for analyzing this case, in that, given that this type system returns ϕ itself in the at-issue dimension, Gricean maxims would be violated by any use of daroo to modify a proposition. L+S CI , however, makes the right predictions (assuming that the Hara analysis is correct.) What these cases have in common is that the conventionally implicated content is, in some sense, primary to the intent behind the utterance. 63 For instance, one must say something about ‘embedded exclamatives.’ One possible route is to note that embedded instances of exclamatives show very different behavior from non-embedded instances, a fact already noted by Rett (2008), who draws a sharp distinction between the two types.
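The contrast drawn in this section between CI types and shunting types can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions: propositions are plain strings, and the names `Meaning`, `apply_ci`, `apply_shunting` and `deniable` are hypothetical helpers introduced here, not part of $\mathcal{L}_{CI}$ or $\mathcal{L}^{+S}_{CI}$. It models just the one property at stake, namely whether the argument proposition survives as asserted content.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Meaning:
    """Two-dimensional meaning: asserted content plus implicated content."""
    at_issue: Optional[str]      # None models "nothing is asserted"
    implicated: FrozenSet[str]


def apply_ci(op: str, prop: str) -> Meaning:
    # CI types are resource-insensitive: the argument is passed on as
    # at-issue content and is also fed to the operator.
    return Meaning(prop, frozenset({f"{op}({prop})"}))


def apply_shunting(op: str, prop: str) -> Meaning:
    # Shunting types consume their argument: only implicated content remains.
    return Meaning(None, frozenset({f"{op}({prop})"}))


def deniable(meaning: Meaning) -> bool:
    # A truth-directed denial needs asserted content to target.
    return meaning.at_issue is not None


# Hypothetical comparison of the two candidate typings for yokumo.
ci_reading = apply_ci("yokumo", "p")
shunting_reading = apply_shunting("yokumo", "p")
print(deniable(ci_reading), deniable(shunting_reading))
```

On these assumptions the CI-typed candidate leaves the modified proposition available for denial, while the shunting-typed candidate leaves nothing for a denial to target, which is the pattern reported for yokumo.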
4 Quechua Evidentials: a Case Study
Let us now examine a single phenomenon (or group of phenomena) that seems to make use of all the types of content discussed here. This is the system of Quechua evidentials, for which $\mathcal{L}^{+}_{CI}$ can provide an alternate analysis to the proposal of Faller (2002), on which these evidentials modify speech acts. I will begin by giving the basic background and facts that a theory of the evidentials should explain. I then briefly present Faller’s speech act-based analysis and show (following McCready 2008a) that, despite the conventional implicature-like behavior of the evidentials, an adequate analysis cannot be given in $\mathcal{L}_{CI}$. I then show that such an analysis is available in $\mathcal{L}^{+}_{CI}$. The intent is to duplicate the basics of Faller’s analysis as closely as possible in a conventional implicature-based system which does not make use of speech acts.
I should make two caveats before embarking on this project. First, the proposal I make here does not account for many of the subtle issues that arise in the Quechua evidential system, only the most basic, brute facts about the way in which composition seems to work for the different evidentials in the language.64 Second, the analysis of Faller (2002) is by no means the last word on this subject. More recent work by Faller (2003, 2006, 2007) introduces additional complexities, which I will also leave aside. This section should therefore be taken as only a sketch of an alternate analysis, in which we see how one can ensure some kinds of scope behavior without making anything other than lexical stipulations about types of content.
64 I also restrict attention to assertions; complex issues arise with questioning evidentials in this language, which I am not sure how best to address.
Cuzco Quechua has several enclitic suffixes that mark evidentiality: roughly, the nature of the speaker’s justification for the claim made by the utterance. Faller analyzes three suffixes in detail. The first is the direct evidential -mi, which indicates that the speaker has the best available grounds for the claim made, which generally amounts to perceptual evidence. The second, -si, is a hearsay evidential which indicates that the speaker heard the information expressed in the claim from someone else. Finally, -chá, an inferential evidential, indicates that the speaker’s background knowledge, plus inferencing, provides evidence for the proposition the modified sentence denotes, and asserts that the sentence might be true.
(50) a. Para-sha-n-mi
        rain-Prog-3-mi
        ‘It is raining. + speaker sees that it is raining’
     b. para-sha-n-si
        rain-Prog-3-si
        ‘It is raining. + speaker was told that it is raining’
     c. para-sha-n-chá
        rain-Prog-3-chá
        ‘It may be raining. + speaker conjectures that it is raining based on some sort of inferential evidence’
Cuzco Quechua evidentials do not embed semantically; even when they appear in the surface scope of semantic operators, they always take widest scope (or are scopeless with respect to such operators). The negation in the following example cannot take scope over the evidential, for instance.
(51) Ines-qa   mana-n/-chá/-s  qaynunchaw  ñaña-n-ta-chu     watuku-rqa-n
     Ines-Top  not-mi/chá/si   yesterday   sister-3-Acc-chu  visit-Pst1-3
     ‘Ines didn’t visit her sister yesterday.’ (and speaker has evidence for this)
     NOT ‘Ines visited her sister yesterday’ (and speaker doesn’t have evidence for this)
A final basic fact that a theory of evidentials in this language must explain is that use of the hearsay evidential with a sentence does not commit the speaker to the content of the sentence. For instance, the first clause of the following sentence does not commit the speaker to the proposition that a lot of money was left for the speaker, as the continuation shows.
(52) Pay-kuna-s   ñoqa-man-qa  qulqi-ta   muntu-ntin-pi  saqiy-wa-n,
     (s)he-PL-si  I-Illa-Top   money-Acc  lot-Incl-Loc   leave-1o-3
     mana-má   riki   riku-sqa-yki  i    un   sol-ta   centavo-ta-pis  saqi-sha-wa-n-chu
     not-Surp  right  see-PP-2      not  one  Sol-Acc  cent-Acc-Add    leave-Prog-1o-3-Neg
     ‘They left me a lot of money (they said/it was said), but as you have seen, they didn’t leave me one sol, not one cent.’ (Faller 2002:191)
Thus, roughly, what is needed is the following result, where the evidential content is not asserted:
(53) a. $\mathit{mi}(\varphi) \leadsto \varphi \wedge \text{speaker has direct evidence for } \varphi$
     b. $\mathit{si}(\varphi) \leadsto \text{speaker has hearsay evidence for } \varphi$
     c. $\mathit{cha}(\varphi) \leadsto \Diamond\varphi \wedge \text{speaker has inferential evidence for } \varphi$
Faller uses Vanderveken’s (1990) speech act theory for her analysis. This theory, like other theories of speech acts, assigns them preconditions for successful performance. Faller takes evidentials to introduce additional content into the set of preconditions. For the cases under consideration, we need only be concerned with one kind of precondition: sincerity conditions on successful performance of the speech act. For assertions, Vanderveken takes it to be necessary that $\mathrm{Bel}(s, p)$ holds, i.e. that the speaker believes the content of the assertion.65 Most of the action in Faller’s analysis of -mi and -chá is in the sincerity conditions for the assertion. On her analysis, -mi adds an additional sincerity condition to the assertion, namely $\mathrm{Bpg}(s, \varphi)$. The formula $\mathrm{Bpg}(s, \varphi)$ means that the speaker has the best possible grounds for believing $\varphi$. It is very difficult to make this condition precise. Faller notes that what counts as best possible grounds is dependent on the content in the scope of -mi: for externally visible events Bpg will ordinarily be sensory evidence, while for reports of people’s intentions or attitudes even hearsay evidence will often be enough. Faller analyzes -chá as being simultaneously modal and evidential. The asserted content is therefore $\Diamond\varphi$ when $\varphi$ is modified by -chá; the corresponding sincerity condition also involves $\Diamond\varphi$ instead of $\varphi$. A sincerity condition indicating that the speaker’s reasoning has led him to believe that $\varphi$ might be possible is also introduced. The hearsay evidential -si is also complex; the propositional content p is not asserted when this hearsay evidential is used, as we saw, which means that the propositional content of the utterance cannot be asserted. Faller posits a special speech act, ‘present’, for this situation, on which the speaker simply presents a proposition without making claims about its truth. In addition, the sincerity condition requiring that the speaker believe $\varphi$ is eliminated, and a condition stating that the speaker learned $\varphi$ by hearsay is added.
65 This is only a very rough approximation of the normative conditions on assertion. See e.g. Searle 1969 and Siebel 2003 for discussion.
While considering the degree to which the semantics of evidentials can be viewed as homogeneous, McCready (2008a) attempted to provide a conventional implicature-based analysis of the Quechua system. It seems plain that the evidentials of this language behave in a way similar to conventional implicatures: they are scopeless, do not participate in denial,66 and so on. However, an adequate semantics cannot be provided in $\mathcal{L}_{CI}$. To see this, it suffices to consider -si: although $\mathit{si}(\varphi)$ does not entail $\varphi$, taking si to introduce a conventional implicature causes $\varphi$ to be asserted, given an $\mathcal{L}_{CI}$ analysis where si is an object of type $\langle t^a, t^c \rangle$. As we have already seen, the combinatorics, together with (16), yield $\langle \varphi, \{\mathit{si}(\varphi)\} \rangle$ in this situation; this means that $\varphi$ is asserted, so the analysis fails.
However, with the extension of $\mathcal{L}_{CI}$ to $\mathcal{L}^{+}_{CI}$, we have more options available. In fact, when one examines the conditions in (53), it can be seen that they correspond to the three kinds of content we have discussed. The direct evidential appears to provide the ‘ordinary’ supplementary content of Pottsian conventional implicatures; the hearsay evidential, given that it makes no claims about the truth of the content it applies to, acts to provide the conventionally implicated main content of its utterance, and the inferential evidential, given that it has effects in both the at-issue and CI dimensions, is of mixed type. With this observation, an analysis becomes available. Here I do not delve deeply into the content of the evidentials, instead making use of predicates Bpg ‘there are best possible grounds for’, Hearsay ‘there is an event of hearsay of’, and Inf, a relation between individuals and propositions indicating that the first element has inferential evidence for the second.67
(54) a. $\mathit{mi} = \lambda p.\, \mathrm{Bpg}(p) : \langle t^a, t^c \rangle$
     b. $\mathit{si} = \lambda p.\, \mathrm{Hearsay}(p) : \langle t^a, t^s \rangle$
     c. $\mathit{cha} = \lambda p.\, \Diamond p \;\blacklozenge\; \lambda p.\, \mathrm{Inf}(s, p) : \langle t^a, t^a \rangle \times \langle t^a, t^s \rangle$
66 See Faller 2002 for details.
67 It is possible to spell at least some of this out in McCready & Ogata’s (2007) evidential logic. This logic is dynamic and makes use of discourse referents for evidence sources, sorted according to the type of evidence they provide (hearsay, visual, etc.). Quinean occasion sentences are associated with a predicate E and are associated with an agent a, the evidence holder, and a source i, the source of the content. McCready (2008a) gives a first attempt at using this logic for the Quechua system. The idea is that $\mathrm{Hearsay}(p)$ can be defined by making use of a test over $E^{i}_{a}\,p$-events where Sort(i)=hearsay and $\mathrm{Inf}(s, p)$ can be defined via a test over $E^{i}_{a}\,p$-events where Sort(i)=judgemental. It is a bit harder to define $\mathrm{Bpg}(s, p)$, because its satisfaction conditions are dependent on the content of p itself; but it should be possible. I will not go further into this issue here.
Applied to a proposition $\varphi$, these lexical entries will, respectively, yield the following:
(55) a. $\langle \varphi, \{\mathrm{Bpg}(\varphi)\} \rangle$
     b. $\langle T, \{\mathrm{Hearsay}(\varphi)\} \rangle$
     c. $\langle \Diamond\varphi, \{\mathrm{Inf}(s, \varphi)\} \rangle$
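As a rough illustration of how the three entries in (54) compose with a proposition, the following Python sketch computes pairs of at-issue content and implicated content. It is a toy under stated assumptions: propositions are strings, the trivial at-issue value is written `"T"`, the possibility modal is written `Poss(...)`, and the names `Entry`, `interpret`, `MI`, `SI` and `CHA` are hypothetical. It is not an implementation of the composition rules of the logic itself.

```python
from typing import Callable, FrozenSet, NamedTuple, Optional, Tuple

TRIVIAL = "T"  # stands in for the trivial at-issue value in (55b)


class Entry(NamedTuple):
    kind: str                                        # "ci", "shunting" or "mixed"
    implicate: Callable[[str], str]                  # non-at-issue component
    assert_: Optional[Callable[[str], str]] = None   # at-issue component (mixed only)


def interpret(entry: Entry, prop: str) -> Tuple[str, FrozenSet[str]]:
    """Return (at-issue content, set of implicated content)."""
    side = frozenset({entry.implicate(prop)})
    if entry.kind == "ci":         # argument is passed through unchanged
        return prop, side
    if entry.kind == "shunting":   # argument is consumed, nothing is asserted
        return TRIVIAL, side
    if entry.kind == "mixed" and entry.assert_ is not None:
        return entry.assert_(prop), side
    raise ValueError(f"unsupported entry: {entry!r}")


# Hypothetical encodings of the three evidentials.
MI = Entry("ci", lambda p: f"Bpg({p})")
SI = Entry("shunting", lambda p: f"Hearsay({p})")
CHA = Entry("mixed", lambda p: f"Inf(s, {p})", lambda p: f"Poss({p})")

for name, entry in [("mi", MI), ("si", SI), ("cha", CHA)]:
    print(name, interpret(entry, "phi"))
```

The three pairs printed for `phi` correspond to (55a), (55b) and (55c) respectively.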
These are precisely the desired results. This sketch of an analysis for the Quechua evidential case thus provides an example of a situation in which the full power of $\mathcal{L}^{+}_{CI}$ is needed to analyze a single linguistic phenomenon. Of course, the question of whether this analysis or Faller’s speech act-based one is to be preferred for this case is separate, and depends on working out the details of the conventional implicature story in connection with looking at a wider array of more complex data. Still, at minimum, the discussion here shows that a speech act analysis is not the only possibility for the phenomena in question.
5 Conclusion
This paper has made two major contributions. It has distinguished and provided a logical system for the analysis of three distinct types of conventional implicature: supplementary CIEs as modeled in Potts 2005, CIEs that provide main content, analyzed in $\mathcal{L}^{+S}_{CI}$ as being of shunting type, and mixed CIEs, analyzed in $\mathcal{L}^{+}_{CI}$. This typology is novel and is one that I think helps significantly in understanding CIE phenomena. I doubt it is exhaustive, however. It seems possible that the three categories analyzed need further subdivision, even in terms of their typing (there is obvious need for subdivision in terms of content). I believe that these systems will be useful for researchers working to understand the range of conventional implicature in the world’s languages; I hope the above discussion has provided some support for this belief. In the process, the paper has analyzed a number of phenomena involving CIE content, mostly of mixed or shunting type: these analyses are the second contribution of the paper.
One question that has not been addressed in any detail is the nature of the distinction between conventional implicature and expressive content, or even if there is any empirical distinction. I think that, in terms of their combinatorics, there might well not be any difference. The two show a similar lack of interaction with most kinds of semantic operators (embedding under attitudes being a significant exception), which suggests that they act similarly in terms of compositional semantics. At the present moment, there has not been sufficient empirical investigation for this point to be really clear.
My suspicion is that the difference between expressive and CI content lies in the type of meanings that are carried rather than how those meanings behave in composition, and so that the distinction is one that cross-cuts the distinctions embodied in $\mathcal{L}^{+}_{CI}$.
Another issue that arose several times in this paper is the nature of the divide between presupposition and conventional implicature. I suggested that (in part at least) it comes down to a difference in function. Presuppositions aim to ‘match’ old information with new; conventional implicatures instead work to introduce new information, but information that is not ‘open to question’ in the way that asserted content is, instead serving to indicate the speaker’s attitudes and commitments. This distinction is useful in cases where the standard tests break down due to the complexity of a given piece of content or the lack of a way to express it in a given (formal or natural) language, as we saw.
The particular examples provided here also raise questions about the degree of translatability one can find for non-at-issue domains in natural languages. It seems likely to me that Katz (1978) was right in his thesis that any piece of content in a natural language $L$ can be translated into any other language $L'$, if one restricts attention to at-issue content. Whether this thesis holds for presupposition or for conventional implicature strikes me as more problematic (and not me alone: see Keenan 1974 and von Fintel & Matthewson 2008). The data in this paper suggests that in certain complex cases, translation of these kinds of non-truth-conditional content might be difficult or impossible, if there is no term in the target language with the same semantics. For example, it is not at all obvious how one might translate a sentence containing honorifics, or (certain) evidentials, or particles of the kind discussed in this paper, into a language without similar constructions, in a way that preserves meaning.68 It is my hope that the work described in the present paper will contribute to solving questions like these, and, in general, to the theory of natural language meaning.
68 This task is difficult even in the most basic sense of content-level equivalence. If one specifies a translation that also preserves pragmatic and discourse-level behavior, it is even harder.
A Formal System of Potts (2005)
Here is the type system of $\mathcal{L}_{CI}$.
i. The type system itself is as follows.
   a. $e^a$, $t^a$, $s^a$ are basic at-issue types for $\mathcal{L}_{CI}$.
   b. $e^c$, $t^c$, $s^c$ are basic CI types for $\mathcal{L}_{CI}$.
   c. If $\sigma$ and $\tau$ are at-issue types for $\mathcal{L}_{CI}$, then $\langle \sigma, \tau \rangle$ is an at-issue type for $\mathcal{L}_{CI}$.
   d. If $\sigma$ is an at-issue type for $\mathcal{L}_{CI}$ and $\tau$ is a CI type for $\mathcal{L}_{CI}$, then $\langle \sigma, \tau \rangle$ is a CI type for $\mathcal{L}_{CI}$.
   e. If $\sigma$ and $\tau$ are at-issue types for $\mathcal{L}_{CI}$, then $\langle \sigma \times \tau \rangle$ is a product type for $\mathcal{L}_{CI}$.
   f. The full set of types for $\mathcal{L}_{CI}$ is the union of the at-issue types and CI types for $\mathcal{L}_{CI}$.
ii. Further, let $x$ serve as a variable over $\{e, t, s\}$ and let $\sigma$ and $\tau$ serve as variables over well-formed types with their superscripts stripped off. The type-superscript abbreviator is defined as follows:
   $x^a \leadsto x^a$
   $x^c \leadsto x^c$
   $\langle \sigma^a, \tau^a \rangle \leadsto \langle \sigma, \tau \rangle^a$
   $\langle \sigma^a, \tau^c \rangle \leadsto \langle \sigma, \tau \rangle^c$
B Modified Type System: $\mathcal{L}^{+}_{CI}$
I define two type systems here. The first, $\mathcal{L}^{+S}_{CI}$, introduces shunting types. The second, $\mathcal{L}^{+}_{CI}$, builds on $\mathcal{L}^{+S}_{CI}$ to allow for the use of mixed content terms as well. The reason for defining the two systems independently is that the full power of the extended system will not be needed for all applications, and it may be convenient for users of the types proposed here to have a subsystem at hand that fits their needs.
B.1 Shunting types: $\mathcal{L}^{+S}_{CI}$
Here is the type system of $\mathcal{L}^{+S}_{CI}$, which is just that of $\mathcal{L}_{CI}$ supplemented with additional shunting types. I follow Potts in my definition, which means that many shunting types are produced that do not get used (just as with the CI types of $\mathcal{L}_{CI}$).
• The type system itself is identical to that of $\mathcal{L}_{CI}$ except that:
  i. The following clauses are added to the $\mathcal{L}_{CI}$ type specification:
     (g) $e^s$, $t^s$, $s^s$ are basic shunting types for $\mathcal{L}^{+S}_{CI}$.
     (h) If $\sigma$ is an at-issue type for $\mathcal{L}^{+S}_{CI}$ and $\tau$ is a shunting type for $\mathcal{L}^{+S}_{CI}$, then $\langle \sigma, \tau \rangle$ is a shunting type for $\mathcal{L}^{+S}_{CI}$.
     (i) If $\sigma$ is a shunting type for $\mathcal{L}^{+S}_{CI}$ and $\tau$ is a shunting type for $\mathcal{L}^{+S}_{CI}$, then $\langle \sigma, \tau \rangle$ is a shunting type for $\mathcal{L}^{+S}_{CI}$.
  ii. Clause (f) of the $\mathcal{L}_{CI}$ type specification is replaced with
     f′. The full set of types for $\mathcal{L}^{+S}_{CI}$ is the union of the at-issue types, the CI types and the shunting types for $\mathcal{L}^{+S}_{CI}$.
  iii. All instances of ‘$\mathcal{L}_{CI}$’ in the $\mathcal{L}_{CI}$ type specification are replaced with ‘$\mathcal{L}^{+S}_{CI}$’.
  iv. The following two clauses are added to the definition of the type-superscript abbreviator:
     $x^s \leadsto x^s$
     $\langle \sigma^a, \tau^s \rangle \leadsto \langle \sigma, \tau \rangle^s$
• This type definition, bundled with the $\mathcal{L}_{CI}$ rules (R1-6), the newly defined rule (R7), and the revised interpretation mechanism in (32), comprises $\mathcal{L}^{+S}_{CI}$.
B.2 The full system: $\mathcal{L}^{+}_{CI}$
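Before the mixed types are added, the type-formation clauses given so far, (a)-(d) of Appendix A together with (g)-(i) of B.1, can be read as a recursive well-formedness check. The sketch below is one hedged way to write that reading in Python. The names `Basic`, `Fn` and `kind` are hypothetical, product types (clause e) are left out, and nothing here is part of the official definition.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Basic:
    name: str  # one of "e", "t", "s"
    dim: str   # "a" (at-issue), "c" (CI) or "s" (shunting)


@dataclass(frozen=True)
class Fn:
    arg: "Type"  # input type
    res: "Type"  # output type


Type = Union[Basic, Fn]


def kind(ty: Type) -> Optional[str]:
    """Return "a", "c" or "s" for a well-formed type, else None."""
    if isinstance(ty, Basic):
        ok = ty.name in {"e", "t", "s"} and ty.dim in {"a", "c", "s"}
        return ty.dim if ok else None       # clauses (a), (b), (g)
    arg, res = kind(ty.arg), kind(ty.res)
    if arg == "a" and res in {"a", "c", "s"}:
        return res                          # clauses (c), (d), (h)
    if arg == "s" and res == "s":
        return "s"                          # clause (i)
    return None                             # e.g. no functions from CI types


t_a, t_c, t_s = Basic("t", "a"), Basic("t", "c"), Basic("t", "s")
print(kind(Fn(t_a, t_c)), kind(Fn(t_a, t_s)), kind(Fn(t_c, t_a)))
```

The last call illustrates that, on this reading, a functional type whose input is a CI type is not generated by any clause.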
The full system adds some rules to $\mathcal{L}^{+S}_{CI}$.
• The type system is identical to that of $\mathcal{L}^{+S}_{CI}$ except that:69
  i. The following clauses are added to the $\mathcal{L}^{+S}_{CI}$ type specification.
     (i) If $\sigma$ and $\tau$ are at-issue types for $\mathcal{L}^{+}_{CI}$, and $\zeta$ and $\upsilon$ are shunting types for $\mathcal{L}^{+}_{CI}$, then $\sigma \times \zeta$, $\langle \sigma, \tau \rangle \times \zeta$, $\sigma \times \langle \tau, \zeta \rangle$ and $\sigma \times \langle \zeta, \upsilon \rangle$ are mixed types for $\mathcal{L}^{+}_{CI}$.
     (ii) If $\sigma$, $\tau$ and $\zeta$ are at-issue types for $\mathcal{L}^{+}_{CI}$ and $\upsilon$ is a shunting type for $\mathcal{L}^{+}_{CI}$, then $\langle \sigma, \tau \rangle \times \langle \zeta, \upsilon \rangle$ is a mixed type for $\mathcal{L}^{+}_{CI}$.
  ii. All instances of ‘$\mathcal{L}^{+S}_{CI}$’ in the $\mathcal{L}^{+S}_{CI}$ type specification are replaced with ‘$\mathcal{L}^{+}_{CI}$’.
69 Comment: It is not necessary to use most of the types produced by clause (i) for the analyses made in the present paper. However, I will make such types available in the logic: I do not think it wise to restrict the type system too much in view of our limited current knowledge of the range of mixed type expressions in natural language. Here I in effect follow the practice of $\mathcal{L}_{CI}$, where a wide range of CI types is made available, although in practice only a narrow range of them ends up being used.
• This type definition, together with the $\mathcal{L}_{CI}$ rules (R1-7) and the new rules (R8, 9) and the interpretation rule (32), comprise $\mathcal{L}^{+}_{CI}$.
References
Amaral, Patricia, Craige Roberts & E. Allyn Smith. 2008. Review of ‘The logic of conventional implicatures’ by Christopher Potts. Linguistics and Philosophy 30(6). 707–749. doi:10.1007/s10988-008-9025-2.
Asher, Nicholas. 2000. Truth conditional discourse semantics for parentheticals. Journal of Semantics 17(1). 31–50. doi:10.1093/jos/17.1.31.
Asher, Nicholas & Alex Lascarides. 2003. Logics of conversation. Cambridge: Cambridge University Press.
Asher, Nicholas & James Pustejovsky. 2005. Word meaning and commonsense metaphysics. Ms., University of Texas Austin and Brandeis University. http://semanticsarchive.net/Archive/TgxMDNkM/asher-pustejovsky-wordmeaning.pdf.
Bach, Kent. 1999. The myth of conventional implicature. Linguistics and Philosophy 22(4). 327–366. doi:10.1023/A:1005466020243.
Bach, Kent. 2006. Review of Christopher Potts, ‘The logic of conventional implicatures’. Journal of Linguistics 42(2). 490–495. doi:10.1017/S0022226706304094.
Barker, Chris & Pauline Jacobson. 2007. Direct compositionality. Oxford: Oxford University Press.
Cappelen, Herman & Ernest Lepore. 2005. Insensitive semantics. Oxford: Blackwell.
Carpenter, Bob. 1998. Type-logical semantics. Cambridge, MA: MIT Press.
Chierchia, Gennaro. 1998. Reference to kinds across language. Natural Language Semantics 6(4). 339–405. doi:10.1023/A:1008324218506.
van Ditmarsch, Hans, Wiebe van der Hoek & Barteld Kooi. 2007. Dynamic epistemic logic. Berlin: Springer.
Dummett, Michael. 1973. Frege: Philosophy of language. London: Duckworth.
Faller, Martina. 2002. Semantics and pragmatics of evidentials in Cuzco Quechua. Stanford, CA: Stanford University dissertation.
Faller, Martina. 2003. Propositional- and illocutionary-level evidentiality in Cuzco Quechua. In Jan Anderssen, Paula Menendez-Benito & Adam Werle (eds.), The proceedings of the second conference on the semantics of underrepresented languages in the Americas [SULA 2], 19–34. Amherst: GLSA.
Faller, Martina. 2006. Evidentiality above and below speech acts. Unpublished ms. http://personalpages.manchester.ac.uk/staff/Martina.T.Faller/documents/Evidentiality.Above.Below.pdf.
Faller, Martina. 2007. The Cuzco Quechua reportative evidential and rhetorical relations. In Andrew Simpson & Peter Austin (eds.), Endangered languages (Linguistische Berichte Sonderheft 14), 223–252. Hamburg: Helmut Buske Verlag.
von Fintel, Kai & Lisa Matthewson. 2008. Universals in semantics. The Linguistic Review 25(1-2). 139–201. doi:10.1515/TLIR.2008.004.
Fodor, Jerry. 2002. Concepts. Oxford: Oxford University Press.
Geurts, Bart. 2007. Really fucking brilliant. Theoretical Linguistics 33(2). 209–214. doi:10.1515/TL.2007.013.
Grice, H. Paul. 1975. Logic and conversation. In Peter Cole & Jerry Morgan (eds.), Syntax and semantics III: Speech acts, 41–58. New York: Academic Press.
Hara, Yurie. 2008. Non-propositional modal meaning. Manuscript, Kyoto University. http://www.semanticsarchive.net/Archive/WUxZjFiM/darou_hara.pdf.
Heim, Irene & Angelika Kratzer. 1998. Semantics in generative grammar (Blackwell Textbooks in Linguistics 13). Oxford, England: Blackwell.
Hom, Christopher. 2008. The semantics of racial epithets. The Journal of Philosophy 105(8). 416–440.
Horn, Laurence. 2007. Toward a Fregean pragmatics: Voraussetzung, Nebengedanke, Andeutung. In Istvan Kecskes & Laurence Horn (eds.), Explorations in pragmatics, 39–69. Berlin: Mouton de Gruyter.
Jeffrey, Richard. 1983. The logic of decision. Chicago: University of Chicago Press.
Kaplan, David. 1989. Demonstratives. In Joseph Almog, John Perry & Howard Wettstein (eds.), Themes from Kaplan, 481–566. Oxford University Press. Manuscript version from 1977.
Kaplan, David. 1999. The meaning of ouch and oops: Explorations in the theory of meaning as use. Manuscript, UCLA.
Katz, Jerrold. 1978. Effability and translation. In Franz Guenthner & Monica Guenthner-Reutter (eds.), Meaning and translation: Philosophical and linguistic approaches, 191–234. London: Duckworth.
Kaufmann, Stefan. 2009. On the projection of expressive presuppositions. Paper presented at Workshop on Non-truth-conditional Meaning, DGfS, Osnabrück.
Keenan, Edward L. 1974. Logic and language. In Morton Bloomfield & Einar Haugen (eds.), Language as a human problem, 187–196. New York: W.W. Norton and Company.
Kim, Jong-Bok & Peter Sells. 2007. Korean honorification: A kind of expressive meaning. Journal of East Asian Linguistics 16(4). 303–336. doi:10.1007/s10831-007-9014-4.
Kooi, Barteld Pieter. 2003. Probabilistic dynamic epistemic logic. Journal of Logic, Language and Information 12(4). 381–408. doi:10.1023/A:1025050800836.
Kratzer, Angelika. 1999. Beyond ouch and oops: How descriptive and expressive meaning interact. Handout of a talk given at Cornell Conference on Theories of Context Dependency. http://semanticsarchive.net/Archive/WEwNGUyO/.
Kratzer, Angelika. 2009. Making a pronoun. Linguistic Inquiry 40(2). 187–237. doi:10.1162/ling.2009.40.2.187.
Kubota, Yusuke & Wataru Uegaki. 2009. Continuation-based semantics for conventional implicatures and the Japanese benefactive. Poster presented at SALT 19. http://www.ling.ohio-state.edu/~kubota/papers/ci-salt.pdf.
Lasersohn, Peter. 2005. Context dependence, disagreement, and predicates of personal taste. Linguistics and Philosophy 28(6). 643–686. doi:10.1007/s10988-005-0596-x.
van Leusen, Noor. 2004. Incompatibility in context: a diagnosis of correction. Journal of Semantics 21(4). 415–441. doi:10.1093/jos/21.4.415.
Levinson, Stephen. 1983. Pragmatics. Cambridge: Cambridge University Press.
McCready, Eric. 2004. Two Japanese adverbials and expressive content. In Kazuha Watanabe & Robert B. Young (eds.), Proceedings of SALT 14, 163–178. http://semanticsarchive.net/Archive/2Y3YjAxM/.
McCready, Eric. 2008a. Semantic heterogeneity in evidentials. In Ken Satoh, Akihiro Inokuchi, Katashi Nagao & Takahiro Kawamura (eds.), New frontiers in artificial intelligence: JSAI 2007 conference and workshops revised selected papers (Lecture Notes in Computer Science 4914), 81–94. Berlin: Springer. doi:10.1007/978-3-540-78197-4_10.
McCready, Eric. 2008b. What man does. Linguistics and Philosophy 31(6). 671–724. doi:10.1007/s10988-009-9052-7.
McCready, Eric & Norry Ogata. 2007. Evidentiality, modality, and probability. Linguistics and Philosophy 30(2). 147–206. doi:10.1007/s10988-007-9017-7.
McCready, Eric & Magdalena Schwager. 2009. Intensifiers. Paper presented at Workshop on Non-truth-conditional Meaning, DGfS, Osnabrück.
Neale, Stephen. 1999. Coloring and composition. In Robert Stainton (ed.), Philosophy and linguistics, 35–82. Boulder, CO: Westview Press.
Potts, Christopher. 2005. The logic of conventional implicatures. Oxford University Press. Revised version of 2003 UCSC dissertation.
Potts, Christopher. 2007a. The expressive dimension. Theoretical Linguistics 33(2). 165–198. doi:10.1515/TL.2007.011.
Potts, Christopher. 2007b. The centrality of expressive indices: Reply to the commentaries. Theoretical Linguistics 33(2). 255–268. doi:10.1515/TL.2007.019.
Potts, Christopher, Luis Alonso-Ovalle, Ash Asudeh, Rajesh Bhatt, Seth Cable, Christopher Davis, Yurie Hara, Angelika Kratzer, Eric McCready, Tom Roeper & Martin Walkow. 2009. Expressives and identity conditions. Linguistic Inquiry 40(2). 356–366. doi:10.1162/ling.2009.40.2.356.
Potts, Christopher & Shigeto Kawahara. 2004. Japanese honorifics as emotive definite descriptions. In Kazuha Watanabe & Robert B. Young (eds.), Proceedings of SALT 14, 235–254. http://semanticsarchive.net/Archive/WZhMmY3N/.
Potts, Christopher & Tom Roeper. 2006. The narrowing acquisition path: From expressive small clauses to declaratives. In Ljiljana Progovac, Kate Paesani, Eugenia Casielles & Ellen Barton (eds.), The syntax of nonsententials, 183–201. John Benjamins.
Pustejovsky, James. 1995. The generative lexicon. Cambridge, MA: MIT Press.
Rett, Jessica. 2008. Degree modification in natural language. New Brunswick, NJ: Rutgers dissertation.
Richard, Mark. 2008. When truth gives out. Oxford: Oxford University Press.
van der Sandt, Rob. 1992. Presupposition projection as anaphora resolution. Journal of Semantics 9(4). 333–377. doi:10.1093/jos/9.4.333.
Schlenker, Philippe. 2007. Expressive presuppositions. Theoretical Linguistics 33(2). 237–245. doi:10.1515/TL.2007.017.
Schlenker, Philippe. 2008. Presupposition projection: Explanatory strategies. Theoretical Linguistics 34(3). 287–316. doi:10.1515/THLI.2008.021.
Searle, John. 1969. Speech acts. Cambridge: Cambridge University Press.
Siebel, Mark. 2003. Illocutionary acts and attitude expression. Linguistics and Philosophy 26(3). 351–366. doi:10.1023/A:1024110814662.
Sørensen, Morten Heine & Pawel Urzyczyn. 2006. Lectures on the Curry-Howard isomorphism (Studies in Logic and the Foundations of Mathematics 149). Amsterdam: Elsevier Science.
Stanley, Jason. 2000. Context and logical form. Linguistics and Philosophy 23(4). 391–434. doi:10.1023/A:1005599312747.
Stephenson, Tamina. 2007. Judge dependence, epistemic modals, and predicates of personal taste. Linguistics and Philosophy 30(4). 487–525. doi:10.1007/s10988-008-9023-4.
Vanderveken, Daniel. 1990. Meaning and speech acts. Cambridge: Cambridge University Press.
Wang, Linton, Eric McCready & Brian Reese. 2006. Nominal appositives in context. In Michael Temkin Martínez, Asier Alcázar & Roberto Mayoral Hernández (eds.), Proceedings of the thirty-third Western Conference on Linguistics [WECOL 33], 411–423.
Wang, Linton, Brian Reese & Eric McCready. 2005. The projection problem of nominal appositives. Snippets 11. 13–14. http://www.ledonline.it/snippets/allegati/snippets10005.pdf.
Williamson, Timothy. 2009. Reference, inference and the semantics of pejoratives. In Joseph Almog & Paolo Leonardi (eds.), The philosophy of David Kaplan, 137–159. Oxford: Oxford University Press.
Eric McCready Department of English Aoyama Gakuin University 4-4-25 Shibuya Shibuya-ku, Tokyo 150-8366 [email protected]
Semantics & Pragmatics Volume 3, Article 5: 1–15, 2010 doi: 10.3765/sp.3.5
Embedded Implicatures? Remarks on the debate between globalist and localist theories Michela Ippolito University of Toronto
Received 2009-10-24 / Revised 2009-12-28 / Published 2010-03-24
Abstract Geurts & Pouscoulous (2009) present experimental evidence that embedded implicatures are not systematically available and conclude that localist theories of implicatures cannot be maintained. I argue that this conclusion can be strengthened by showing that their findings cannot be reconciled with a localist theory even when the latter is supplemented with a formal way to predict when an embedded implicature will be preferred, as suggested in Chierchia, Fox & Spector 2008.
Keywords: implicatures, scalar implicatures, embedded implicatures, conversational implicature, local implicature, experimental pragmatics, neg-raising verbs
1 Introduction
Geurts & Pouscoulous (2009) present some interesting experimental data pointing against a localist view of scalar implicatures according to which scalar implicatures are systematically generated in embedded as well as in non-embedded positions. One case that typically is said to trigger an embedded implicature is the case of a clause embedded under an attitude verb such as think or believe, as in (1).
(1) John thinks that Fred heard some of Verdi’s operas.
The implicature that (1) generates, the localist maintains, is that John thinks that Fred heard some but not all of Verdi’s operas. Assuming a localist view according to which implicatures are triggered by means of a silent exhaustive operator O as in Chierchia et al. 2008, the embedded implicature in (1) is triggered when O adjoins the embedded clause as shown in (2).¹ This gives rise to the meaning in (3).
(2) John thinks that O(Fred heard some of Verdi’s operas)
(3) John thinks that Fred heard some of Verdi’s operas and John thinks that Fred didn’t hear all of Verdi’s operas
©2010 Michela Ippolito. This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).
According to the globalist view, embedded implicatures such as (3) cannot be generated and (1) can only conversationally implicate that John doesn’t think that Fred heard all of Verdi’s operas. Suppose that John is knowledgeable about whether Fred heard all of Verdi’s operas or not. Then, from the weak implicature that John doesn’t believe that Fred heard all of Verdi’s operas we can infer that John believes Fred didn’t hear all of Verdi’s operas. This is what Sauerland (2004) calls the “epistemic step”. The result looks like the embedded implicature that the localist generates by embedding the exhaustive operator O, but in fact is just a global implicature strengthened by means of some assumptions about John’s epistemic state.²
Geurts and Pouscoulous conclude that, on the basis of their findings, the localist position is untenable. Here I will discuss whether their findings can in principle be reconciled with a localist theory once the latter is supplemented with a mechanism for predicting when a given reading will be preferred or dispreferred. As a paradigmatic case, I will consider the localist theory defended by Chierchia et al. (2008). My conclusion will support Geurts and Pouscoulous’s: even when supplemented with a mechanism for determining when an embedded implicature will be preferred, the localist predictions are incompatible with Geurts and Pouscoulous’s findings. Furthermore, I will consider the class of Neg-raising (NR) verbs (Horn 1978) and argue that a localist theory makes predictions which again are incompatible with Geurts and Pouscoulous’s experimental findings. I will then sketch a way in which a globalist theory might in principle be able to explain the experimental differences we find among the NR predicates included in Geurts and Pouscoulous’s questionnaires. However, further experimental research is needed in order to ascertain whether the globalist line of argument suggested here works when extended to other NR predicates.
1 That local implicatures exist has been advocated by several people, even though the idea has been implemented differently in different proposals. See for example Bach 1994, Carston 1988, Chierchia 2004, Fox 2006, Levinson 1983, Levinson 2000, Recanati 2003, among others.
2 Following Grice (1975), advocates of a globalist theory of implicatures include Gazdar (1979), Geurts (2009), Horn (1972, 1989), King & Stanley (2006), among many others.
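The relation between the global implicature, the epistemic step and the local reading in (3) can be checked in a small possible-worlds model. The following is only an illustrative sketch under assumptions that are not part of the discussion above: three world types (Fred heard none, some but not all, or all of the operas), John's belief state modelled as a non-empty set of such worlds, and hypothetical function names chosen for readability.

```python
from itertools import combinations

# Toy model (an assumption of this sketch): a world records how many
# of Verdi's operas Fred heard.
WORLDS = ("none", "some_not_all", "all")
SOME = {"some_not_all", "all"}      # "Fred heard some of the operas"
ALL = {"all"}                       # "Fred heard all of the operas"
NOT_ALL = {"none", "some_not_all"}  # "Fred did not hear all of the operas"


def belief_states():
    """Yield every non-empty set of worlds John might consider possible."""
    for size in range(1, len(WORLDS) + 1):
        for combo in combinations(WORLDS, size):
            yield frozenset(combo)


def believes(state, prop):
    """John believes prop iff prop holds in all of his belief worlds."""
    return state <= prop


def global_reading(state):
    """Assertion of (1) plus the weak implicature: not believes(all)."""
    return believes(state, SOME) and not believes(state, ALL)


def opinionated(state):
    """John has an opinion on whether Fred heard all of the operas."""
    return believes(state, ALL) or believes(state, NOT_ALL)


def local_reading(state):
    """The meaning in (3): believes(some) and believes(not all)."""
    return believes(state, SOME) and believes(state, NOT_ALL)


# Belief states where the global reading holds but the local one does not.
weak_only = [s for s in belief_states()
             if global_reading(s) and not local_reading(s)]

# With the opinionatedness assumption added, the two readings coincide.
step_matches_local = all(
    (global_reading(s) and opinionated(s)) == local_reading(s)
    for s in belief_states()
)
```

In this model the only belief state that separates the two readings is the one in which John leaves open whether Fred heard all of the operas; once opinionatedness is assumed, the strengthened global reading and the local reading pick out the same belief states.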
2 Experiments and results
To test whether the predictions made by the local view of implicatures are correct, Geurts and Pouscoulous looked at different types of embeddings. In their first experiment, they considered complex sentences where the scalar item some is embedded in the nuclear scope of the universal quantifier all; under a modal verb with a universal force; in the complement of think; and finally in the complement of want. They compared the results they obtained in these cases with the rate of implicatures drawn in unembedded clauses and found that, while scalar implicatures were accepted in the majority of simple (unembedded) cases, the acceptance rate was much lower in the complex conditions (with differences among conditions; see section 3.1 below). The experiment used an inference task in which participants were shown a sentence containing a scalar expression (e.g. some) and were asked whether they would infer that the corresponding sentence with the stronger scalar expression (e.g. all) was false. In a subsequent experiment, the authors compared the rate of local implicatures found using the inference task with the rate of local implicatures found using a verification task in which participants were shown a sentence containing a scalar expression and were asked to decide whether that sentence correctly described a picture that they were shown. The result of the latter experiment when applied to unembedded clauses showed that the inference paradigm yields higher rates of scalar implicatures than the verification paradigm, and therefore that the verification task is a more reliable way to find out the rate at which people actually draw scalar implicatures. When applied to the question of whether local implicatures are drawn in embedded clauses, the verification task performed by Geurts and Pouscoulous “completely failed to yield the local SIs predicted by mainstream conventionalism” (Geurts & Pouscoulous 2009).
In particular, the authors tested scalar items (here, some) embedded in downward-entailing (DE) contexts (i.e. Not all the squares are connected with some of the circles); scalar items embedded in upward-entailing (UE) contexts (i.e. All the squares are connected with some of the circles); and finally, scalar items embedded in non-monotonic (NM) contexts (i.e. There are exactly two squares that are connected with some of the circles). In the next section, taking Chierchia et al. 2008 to be a paradigmatic example of a localist theory, I will spell out in more detail how this theory works and I will consider the consequences of Geurts and Pouscoulous’s experimental results, particularly with respect to the issue of the frequency with which embedded implicatures are drawn.
3 Discussion
Geurts and Pouscoulous’s experimental results are not per se a knockdown argument against embedded implicatures. The localist might object that Geurts and Pouscoulous’s experimental results do not show that embedded (or local) implicatures are impossible but only that they are not generally available, and this is at least consistent with one possible localist view, namely that embedded implicatures must be possible, since they seem to be triggered in special circumstances, even though they are not generally available. Chierchia et al. (2008) have recently discussed some of the circumstances where embedded implicatures are triggered. The examples in (4) through (6) illustrate some of these circumstances.
(4) If you take salad or dessert, you pay $20; but if you take both there is a surcharge.
(5) Exactly two students wrote a paper or ran an experiment. The others either did both or made a class presentation.
(6) Mary solved some or all of the problems.
Take (4). Chierchia et al. (2008) argue that, while implicatures are not normally triggered in the antecedent of conditionals (a DE environment), the continuation in (4) forces an exclusive interpretation of or in the antecedent (that is, an interpretation of the antecedent strengthened with the scalar implicature “but not both”) as the only way to guarantee a coherent interpretation for the discourse. Embedding the exhaustive operator in the antecedent guarantees that such an interpretation is generated.³
3 Similarly for (5) and (6). In (5), the continuation is argued to force the embedded implicature giving rise to the interpretation according to which ‘exactly two students wrote a paper or ran an experiment but didn’t do both’. The continuation in (6) is also argued to force the embedded implicature so that as a result the interpretation of the sentence will be that either Mary solved some but not all of the problems or she solved all of them.
Someone might initially object to Chierchia et al.’s (2008) argument that, if embedded and non-embedded implicatures are generated by the same mechanism (in this case the exhaustive operator O), then you would not expect local implicatures to be confined to this very special set of cases. The fact that local implicatures seem to be confined to a very narrow set of cases, and that occurrences of scalar items (such as some or or) in embedded positions do not normally trigger local implicatures, raises the suspicion that the “effect” of local implicatures is actually due to a different mechanism, and that these are not implicatures after all.⁴
To address the issue of the frequency of embedded implicatures (why embedded implicatures are much less frequent than global implicatures), Chierchia et al. (2008) have suggested that there is a preference for the strongest possible interpretation among the possible readings of a sentence, and that this might account for why having the exhaustive operator O in the scope of a DE operator is a dispreferred option since it gives rise to an interpretation weaker than the one obtained without O. The authors consider two versions of the “strongest meaning hypothesis” (SMH), as shown in (7) and (8), both from Chierchia et al. 2008.
(7) SMH1: Let ϕ be a certain logical form. Let ϕ’s competitors be all the LFs that differ from ϕ only with respect to where the exhaustivity operator occurs. Then, everything else being equal, ϕ is dispreferred if one of its competitors is stronger than ϕ.
(8) SMH2: Let S be a sentence of the form [S … O(X) …]. Let S′ be the sentence of the form [S′ … X …], i.e. the one that is derived from S by replacing O(X) by X, i.e. by eliminating this particular occurrence of O. Then, everything else being equal, S′ is preferred to S if S′ is logically stronger than S.
According to SMH1, given a certain logical form, all LFs differing in where the exhaustivity operator occurs will compete with each other and the strongest LF will be preferred. According to SMH2, alternative LFs differing in the placement of the exhaustive operator do not compete with each other but only with the LF without the operator. Taking Chierchia et al.’s (2008) theory as the paradigmatic localist theory, the question that arises is whether the localist theory sketched above supplemented with either SMH1 or SMH2 can be reconciled with Geurts and Pouscoulous’s experimental results.
4 This might explain why, while focal stress is often needed to bring out the embedded implicature interpretation, focal stress is not needed to bring out the non-local implicature interpretation. Chierchia et al. (2008) attribute the fact that focal stress helps the embedded implicature reading of the sentences they consider to the nature of the mechanism they appeal to, i.e. covert exhaustification, which is triggered by focus. However, covert exhaustification is also supposed to be responsible for the non-local implicature, raising the question why focal stress is a relevant factor in the explanation of one type of implicature but not in the explanation of the other.
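The way the two hypotheses rank logical forms can be illustrated with propositions modelled as sets of worlds, where a proposition is stronger than another if it is a proper subset of it. The sketch below makes several assumptions that go beyond the text: a three-world model, a deliberately simplified exhaustivity operator `exh` that only negates strictly stronger alternatives, a hypothetical test sentence of the form "not (Mary solved some of the problems)" as the DE environment, and the function names `smh1_dispreferred` and `smh2_dispreferred`. SMH1 is applied here to the set of LFs that contain O in different positions.

```python
# Toy model (an assumption of this sketch): a world records how many
# problems Mary solved.
WORLDS = frozenset({"none", "some_not_all", "all"})
SOME = frozenset({"some_not_all", "all"})
ALL = frozenset({"all"})


def neg(p):
    """Negation as set complement."""
    return WORLDS - p


def exh(p, alternatives):
    """Simplified O: assert p and negate every strictly stronger alternative."""
    result = p
    for alt in alternatives:
        if alt < p:  # alt is a proper subset of p, i.e. strictly stronger
            result = result - alt
    return result


def smh1_dispreferred(lfs):
    """SMH1: an LF is dispreferred if some competitor is strictly stronger."""
    return {name for name, p in lfs.items()
            if any(q < p for q in lfs.values())}


def smh2_dispreferred(lfs_with_o, without_o):
    """SMH2: an LF with O is dispreferred if the O-less LF is strictly stronger."""
    return {name for name, p in lfs_with_o.items() if without_o < p}


plain = neg(SOME)                    # not(some)
embedded = neg(exh(SOME, [ALL]))     # not(O(some))
matrix = exh(neg(SOME), [neg(ALL)])  # O(not(some))

lfs = {"embedded O": embedded, "matrix O": matrix}
smh1 = smh1_dispreferred(lfs)
smh2 = smh2_dispreferred(lfs, plain)
```

In this DE configuration both versions single out the LF with embedded O as dispreferred, because embedding O under negation yields a weaker meaning than the alternatives. The two hypotheses come apart only when several placements of O all strengthen the O-less meaning.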
Consider the predictions made by either version of SMH for sentences where the scalar item is embedded under the epistemic predicate be certain.
(9) John is certain that Fred heard some of Verdi’s operas.
a. John is certain that O(Fred heard some of Verdi’s operas)
b. O(John is certain that Fred heard some of Verdi’s operas)
The configuration of the operator O in (9a) triggers the local implicature that John is certain that Fred did not hear all of Verdi’s operas. (9b), on the other hand, triggers the implicature that John is not certain that Fred heard all of Verdi’s operas. Consider SMH1 first. Let us assume that α is certain that ϕ means that α has a justified belief that ϕ is true.⁵ The assertoric content of (9) strengthened by the implicature in (9a) gives rise to a meaning stronger than the meaning obtained by strengthening (9)’s assertion with the implicature in (9b). If in all of John’s doxastic worlds it is true that Fred heard some but not all of Verdi’s operas (and if John is justified in having this belief), then it is not the case that in all of John’s doxastic worlds Fred heard all of Verdi’s operas (and it is not the case that John is justified in believing that Fred heard them all). This entailment is asymmetric. Therefore, SMH1 predicts that the interpretation in (9a) should be the preferred one. However, assuming that Geurts and Pouscoulous’s findings can be extended to predicates such as be certain, they show that the embedded implicature in (9a) is clearly not the preferred interpretation.
Suppose we assume SMH2 instead of SMH1. Because both LFs with the exhaustive operator convey interpretations stronger than the one conveyed by the LF without the operator, the proposal predicts that both (9a) and (9b) should be equally available. But we have already seen that Geurts and Pouscoulous’s results show that this is not the case: (9a) is dispreferred. Appealing to independent considerations, like the lack of plausibility for the reading in (9a), in order to explain why it is rare is a dubious move. In Geurts and Pouscoulous’s experiments, the context plays no role. Therefore, we expect that the most salient reading (the reading preferred by the participants in the experiment) will be the one selected by the SMH, but we saw that this is not the case.
5 I am not claiming that this is all there is to say about what be certain means. All I am assuming here is that saying that α is certain that ϕ entails that α believes ϕ and has some justification for believing ϕ.
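The asymmetric entailment between the two strengthened readings of (9) can be spelled out as a short derivation. This is a sketch under simplifying assumptions that are not in the text above: $B_j$ stands for the set of John’s doxastic worlds and is assumed to be non-empty, the justification component of be certain is set aside, and $s(w)$ and $a(w)$ abbreviate “Fred heard some of Verdi’s operas in $w$” and “Fred heard all of Verdi’s operas in $w$”.

```latex
\begin{align*}
\text{(9a)} &:\quad \forall w \in B_j\,[\,s(w) \wedge \neg a(w)\,]\\
\text{(9b)} &:\quad \forall w \in B_j\,[\,s(w)\,] \;\wedge\; \neg\,\forall w \in B_j\,[\,a(w)\,]\\[4pt]
\text{(9a)} \Rightarrow \text{(9b)} &:\quad
  \text{if } B_j \neq \emptyset \text{ and } \neg a(w) \text{ for every } w \in B_j,
  \text{ then } \neg a(w) \text{ for some } w \in B_j\\
\text{(9b)} \not\Rightarrow \text{(9a)} &:\quad
  B_j = \{w_1, w_2\},\quad s(w_1) \wedge \neg a(w_1),\quad s(w_2) \wedge a(w_2)
\end{align*}
```

In the counter-model on the last line John leaves open whether Fred heard all of the operas, so (9b) holds while (9a) fails. This is the sense in which the reading in (9a) is strictly stronger and is therefore the one selected by SMH1.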
Similar considerations apply to modal verbs like wish:
(10) John wishes that Fred would try some of the cookies.
a. John wishes that O(Fred would try some of the cookies)
b. O(John wishes that Fred would try some of the cookies)
The configuration in (10a) triggers the embedded implicature that John wishes that Fred would not try all of the cookies. In (10b), on the other hand, the implicature is that John doesn’t wish that Fred would try all of the cookies. Consider first the prediction made by SMH1. Just as in the previous example, (10)’s assertion supplemented with the embedded implicature in (10a) gives rise to a meaning stronger than the meaning obtained by incrementing the same assertion with the implicature in (10b): if John’s desire-worlds are all worlds where Fred tries some but not all of the cookies, then it is not the case that all of John’s desire-worlds are worlds where Fred tries all of the cookies. However, the reverse does not hold: the assertion together with (10b) is compatible with a state of affairs where in some of John’s desire-worlds Fred tries all of the cookies, a possibility ruled out by the implicature in (10a). Therefore, (10a) is predicted to be the preferred reading of the sentence in (10) by SMH1. One of the conditions that Geurts and Pouscoulous tested in one of their experiments was embedding of a scalar item under want and they found that the embedded implicature reading was not the preferred interpretation of the sentence. If their results can be extended to any volitional verb, including wish, they show that the prediction made by SMH1 is not correct. Similarly for SMH2: in this case, both (10a) and (10b) are predicted to deliver meanings stronger than the meaning obtained without O and so the two strengthened interpretations are incorrectly predicted to be equally available. This is so unless some independent contextual consideration rules out (10a), but as we observed above the context plays no role in Geurts and Pouscoulous’s experiment and therefore we do not expect it to be a factor affecting the subjects’ judgments.
In conclusion, even when supplemented with a formal mechanism for predicting when an embedded implicature will be preferred or dispreferred, Chierchia et al.’s (2008) localist theory fails to account for the fact that embedded implicatures are systematically dispreferred. Appealing to contextual/plausibility considerations in order to override the outcome of the theory is problematic since in Geurts and Pouscoulous’s experiments judgments were elicited out-of-context.
In the next section, I will look at the exceptional behavior of the verb believe, for which Geurts and Pouscoulous found a higher acceptance rate for the embedded implicature than in any other complex condition. Even though the exceptional behavior of believe initially appears to support a localist theory, I will conclude that it actually constitutes another challenge for it.
3.1 Believe and other Neg-raising verbs
Consider (11), a variant of Geurts and Pouscoulous’s original sentence.⁶ (11a) and (11b) give rise to the embedded implicature reading and the global implicature reading, respectively.
(11) John believes that Fred tried some of the cookies.
a. John believes that O(Fred tried some of the cookies)
b. O(John believes that Fred tried some of the cookies)
According to Horn (1978) and others, believe is a Neg-raising (NR) verb: a normal utterance of John doesn’t believe Mary lied implies that John believes that Mary didn’t lie. Similarly, (11b) will imply that John believes that Fred did not try all of the cookies. Therefore, both configurations in (11a) and (11b) give rise to the same implicature and, according to both SMH1 and SMH2, since both available interpretations are equivalent and are stronger than the LF without O, they should be equally available. Indeed, Geurts and Pouscoulous found a relatively high rate of acceptance of the embedded implicature in the believe/think condition (even though, as we saw, it wasn’t the preferred interpretation), and they acknowledge the possibility that this “elevated level of positive responses (57.5%) wasn’t merely an artifact” of the inference model (Geurts & Pouscoulous 2009). The problem is that a similar prediction is made by Chierchia et al. (2008) with respect to want.
(12) John wants Fred to try some of the cookies.
a. John wants O(Fred to try some of the cookies)
b. O(John wants Fred to try some of the cookies)
According to the classification in Horn 1978, want is also NR. It follows that Chierchia et al.’s (2008) localist theory predicts that both (12a) and (12b) should be equally available. However, the rate of acceptance of the embedded implicature with the modal verb want was low (32%), lower than what they found in the believe case. The embedded implicature reading is dispreferred, and nothing in how the exhaustive operator O or SMH work seems to explain why the embedded implicature is more frequently accepted with believe than with want.
6 Geurts and Pouscoulous’s sentence was given in (1). (1) is a translation of the French sentence actually used in the experiment.
Geurts and Pouscoulous, following the lines of van Rooij & Schulz 2004 and Russell 2006, sketch a globalist account for why believe shows a higher acceptance of the embedded implicature: (i) the sentence Bob believes that Anna ate some of the cookies generates the global implicature that Bob doesn’t believe that Anna ate all of the cookies; (ii) assuming that Bob has an opinion about whether Anna ate all of the cookies or not, it follows that Bob believes that Anna did not eat all of the cookies. Now, in their paper defending a localist view of implicatures, Gajewski & Sharvit (2009) criticize this type of globalist account by arguing that appealing to the disjunctive proposition “either Bob believes that Anna ate all of the cookies or he believes that she didn’t” in the reasoning above is only plausible because believe is a NR verb and as such it carries the presupposition that either α believes that ϕ or α believes that it is not the case that ϕ (as argued in Gajewski 2005). In other words, according to Gajewski and Sharvit, the globalist account only appears to work because the predicate is NR and the disjunctive proposition crucial to the globalist explanation is actually presupposed by the verb. But if this were correct, then all NR verbs would trigger an embedded implicature since they all presuppose the relevant disjunctive proposition. But we just saw that this is not so: the experimental results reported in Geurts and Pouscoulous show that local implicatures with want are relatively rare, despite want being a NR verb. A short digression on NR verbs is in order here.
I have assumed with Horn (1978) that want, like believe but unlike wish, is NR based on the observation that in (13) but not in (14) the first sentence implies the second.
(13) a. I don’t want Mary to leave.
b. I want Mary not to leave.
(14) a. I don’t wish to meet Mary.
b. I wish not to meet Mary.
However, Rooryck (1991) cites the following pair from Horn 1978 against the view that want/vouloir are NR verbs: while (15) supports the NR hypothesis, (16) does not.
(15) a. Je ne veux pas que vous sortiez. “I don’t want you to leave”
b. Je veux que vous ne sortiez pas. “I want you not to leave”
(16) a. Je ne voudrais pas être Dieu. “I wouldn’t want to be God”
b. Je voudrais ne pas être Dieu. “I would want not to be God”
Rooryck concludes that volitional verbs only appear to be NR but in fact they are not. An exhaustive discussion of this issue is beyond the scope of this paper. However, what is important in the context of the current discussion about embedded implicatures is to notice that even if volitional verbs are not NR, it is still true that in cases such as (15) vouloir behaves like a NR verb in that the two sentences are judged to be synonymous, just as originally observed by Horn. Just as in (15), the English rendition of the implicature in (12b) (i.e. John doesn’t want Fred to try all of the cookies) is also judged to have a NR interpretation, and so does the French translation with vouloir.⁷ Therefore, since the logical form in (12b) receives a NR interpretation, it is expected to pattern like non-volitional NR verbs such as believe with respect to the computation of the embedded implicatures, and the experimental results show it does not.
Going back to the main discussion, obviously the globalist needs to explain the asymmetry between want and believe too. In principle we should be able to run the reasoning sketched by Geurts and Pouscoulous for believe with want: (i) (12) generates the implicature that John doesn’t want Fred to try all of the cookies; (ii) let us assume that John has a definite desire about Fred’s trying all of the cookies, that is, that either John wants Fred to try all of the cookies or he wants Fred not to try all of the cookies; (iii) it follows that John wants Fred not to try all of the cookies. The crucial step is (ii). What “blocks” (ii) in the want case but not in the believe case? We saw that dismissing the globalist account by appealing to the presuppositional nature of this disjunctive proposition is not going to work.
7 Thanks to Annick Morin for providing the French sentence Je (ne) veux pas que Marie mange tous les biscuits and for her judgment.
According to the Russellian line followed by Geurts and Pouscoulous, an assumption like “either John wants Fred to eat all of the cookies or John wants Fred not to eat all of the cookies” is purely contextual and as such it will be part of the common ground in some contexts but not in others. Whenever the context grants this assumption, the strengthening of the global implicature happens, giving rise to an embedded implicature effect without an actual embedded implicature. If this is correct, then the reason why subjects assented to the local implicature less frequently in the want case than in the believe case must have to do with how likely they felt they could make the relevant disjunctive assumption. In particular, it must be the case that, in the absence of any context, subjects felt that the assumption in (17a) was less likely to be true than the assumption in (17b).
(17) a. Either John wants Fred to try all of the cookies or John wants Fred not to try all of the cookies.
b. Either Bob believes that Anna ate all of the cookies or he believes that she didn’t.
If it is the case that, out of context, people are less likely to make the assumption in (17a) than the one in (17b), we expect that it should be much easier to trigger the apparent local implicature with want if the context allows one to do so. In the globalist theory, then, plausibility considerations such as the ones outlined above might be expected to distinguish among other NR verbs which the localist theory would predict pattern alike with respect to embedded implicatures.⁸
The localist too can appeal to the context (and Chierchia et al. (2008) leave this door open explicitly in their paper), but appealing to the context in this case is needed to systematically “correct” the predictions of the theory which are not supported by the experimental findings (recall that the preferred interpretation according to Chierchia et al.’s (2008) localist theory augmented with either version of the SMH is not the interpretation preferred by Geurts and Pouscoulous’s subjects). Finally, we noticed that appealing to the context in order to override the outcome of the theory does not seem right in Geurts and Pouscoulous’s experiments since in both the inferential and the verification tasks the subjects had to make their choice out-of-context. Since the context plays no role in Geurts and Pouscoulous’s experiments, we expect the reading selected by the theory to surface undisturbed in people’s judgments. The fact that the subjects’ judgments did not agree with the predictions of the localist is therefore problematic.
8 One pair of predicates that might be interesting to test experimentally is the pair expect/ought to, as in John expects Mary to try some of the cookies and Mary ought to try some of the cookies. According to Horn 1978, both predicates are NR. The localist theory predicts that both should give rise to a high rate of acceptance of the embedded implicature (“John expects Mary not to try all of the cookies” and “Mary ought to not try all of the cookies”, respectively), at least out of context. The globalist theory, on the other hand, would have to appeal to two different disjunctive propositions in order to strengthen the global implicature giving rise to an embedded implicature effect. (18) a. Either John expects Mary to try all of the cookies or John expects Mary not to try all of the cookies. b. Either Mary ought to try all of the cookies or Mary ought to not try all of the cookies. At least out-of-context, it seems that (18a) would be easier to assume. If indeed (18a) is more plausible than (18b), then the globalist theory predicts that the acceptance rate for the embedded implicature should be higher in the expect case than in the ought case.
4 Conclusion
The experimental results presented by Geurts and Pouscoulous are at odds with the predictions of the localist, in particular the localist theory advocated by Chierchia et al. (2008). The challenge for the localist is to explain why embedded implicatures are so infrequent. We focused on the believe and want conditions in Geurts and Pouscoulous’s experiments. They found that the acceptance rate for the embedded implicature in the believe condition was not negligible (even though the embedded implicature reading was still not the preferred one). Since believe is NR, we observed that the localist view advocated in Chierchia et al. 2008, together with either version of the strongest meaning hypothesis (what we called SMH1 and SMH2), seems to account for the elevated rate of positive responses with believe. However, we also observed that the theory is unable to account for the very low acceptance rate with want, since want is also a NR verb.
When we considered non-NR verbs such as be certain and wish, which belong to the same category as believe and want respectively, we saw that Chierchia et al.’s (2008) localist view augmented with either SMH1 or SMH2 makes incorrect predictions about what should be the preferred interpretations when a scalar item occurs in an embedded clause.⁹
9 Whether be certain patterns like believe or not, the localist faces a problem. In the former case, the localist view faces a problem since be certain is not NR and the embedded implicature reading is predicted by his theory to be the preferred one. In the latter case (i.e. if be certain does not show the higher rate of acceptance of the embedded implicature found with believe), the problem is that the localist expects non-NR verbs like be certain to show high acceptance rate for the embedded implicature.
Furthermore, we noticed that the difference between want and believe reported in the experiment we are considering also undermines Gajewski and Sharvit’s criticism of the globalist view. Gajewski and Sharvit have recently suggested that the globalist account in Russell (2006) seems to work for believe only because believe is NR (see above for details). But if that were true, since want is also NR we would expect the two to pattern in the same way, but they don’t.
It seems hard to reconcile the rates of acceptance of the embedded implicatures found with epistemic and volitional verbs with the grammatical theory of embedded implicatures proposed by the localist. Appealing to the context in order to override the predictions of the theory is problematic since the context plays no role in Geurts and Pouscoulous’s experiments. On the other hand, according to the global theory of implicatures, the appearance of an embedded implicature is due to the strengthening of the global implicature in a context where specific assumptions are taken to be part of the common ground: therefore, if the context plays no role (as in Geurts and Pouscoulous’s experiments) or the relevant assumptions are not made, there will be no embedded implicature effect. In general, both globalist and localist theories must appeal to contextual considerations to make the correct predictions but, unlike in a localist theory, the role played by the context in a globalist theory is an essential component of the globalist theory itself. Indeed, we noted that plausibility considerations might be able to account for the contrast between want and believe reported by Geurts and Pouscoulous. However, more experimental evidence of the type provided by Geurts & Pouscoulous (2009) must be collected in order to establish whether these considerations do play the explanatory role they are expected to play in a global theory of implicatures.
References
Bach, Kent. 1994. Conversational impliciture. Mind and Language 9(2). 124–162. doi:10.1111/j.1468-0017.1994.tb00220.x.
Carston, Robyn. 1988. Implicature, explicature and truth theoretic semantics. In Ruth Kempson (ed.), Mental representations: the interface between language and reality, 155–181. Cambridge & New York: Cambridge University Press.
Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena, and the syntax/pragmatics interface. In Structures and beyond, 39–103. Oxford: Oxford University Press.
Chierchia, Gennaro, Danny Fox & Benjamin Spector. 2008. The grammatical view of scalar implicatures and the relationship between semantics and pragmatics. In Claudia Maienborn, Klaus von Heusinger & Paul Portner (eds.), Handbook of semantics, Mouton de Gruyter.
Fox, Danny. 2006. Free choice disjunction and the theory of scalar implicatures. http://web.mit.edu/linguistics/people/faculty/fox/free_choice.pdf.
Gajewski, Jon. 2005. Neg-raising: Polarity and presupposition: MIT dissertation. doi:1721.1/33696.
Gajewski, Jon & Yael Sharvit. 2009. In defense of the grammatical approach to local implicatures. http://web2.uconn.edu/sharvit/Gajewski-Sharvit-23nov2009.pdf.
Gazdar, Gerald. 1979. Pragmatics: Implicatures, presuppositions and logical form. New York: Academic Press.
Geurts, Bart. 2009. Scalar implicature and local pragmatics. Mind and Language 24(1). 51–79. doi:10.1111/j.1468-0017.2008.01353.x.
Geurts, Bart & Nausicaa Pouscoulous. 2009. Embedded implicatures?!? Semantics and Pragmatics 2(4). 1–34. doi:10.3765/sp.2.4.
Grice, Paul. 1975. Logic and conversation. In Peter Cole & Jerry Morgan (eds.), Syntax and semantics 3: Speech acts, 41–58. New York: Academic Press.
Horn, Laurence. 1972. On the semantic properties of logical operators in English: UCLA dissertation.
Horn, Laurence. 1978. Remarks on neg-raising. In Peter Cole (ed.), Syntax and semantics 9: Pragmatics, 129–220. New York: Academic Press.
Horn, Laurence. 1989. A natural history of negation. Chicago: The University of Chicago Press.
King, Jeff & Jason Stanley. 2006. Semantics, pragmatics, and the role of semantic content. In Zoltan Szabó (ed.), Semantics vs. pragmatics, 111–164. Oxford: Oxford University Press.
Levinson, Stephen. 1983. Pragmatics. Cambridge & New York: Cambridge University Press.
Levinson, Stephen. 2000. Presumptive meanings: The theory of generalized conversational implicatures. Cambridge, MA: MIT Press.
Recanati, François. 2003. Embedded implicatures. Philosophical Perspectives 17(1). 299–332. doi:10.1111/j.1520-8583.2003.00012.x.
van Rooij, Robert & Katrin Schulz. 2004. Exhaustive interpretation of complex sentences. Journal of Logic, Language and Information 13(4). 491–519. doi:10.1007/s10849-004-2118-6.
Rooryck, Johan. 1991. Negative and factive islands revisited. Journal of Linguistics 28(2). 343–374. doi:10.1017/S0022226700015255.
Russell, Benjamin. 2006. Against grammatical computation of scalar implicatures. Journal of Semantics 23(4). 361–382. doi:10.1093/jos/ffl008.
Sauerland, Uli. 2004. Scalar implicatures in complex sentences. Linguistics and Philosophy 27(3). 367–391. doi:10.1023/B:LING.0000023378.71748.db.
Michela Ippolito Department of Linguistics University of Toronto 130 St. George Street Toronto, ON M5S 3H1 Canada [email protected]
Semantics & Pragmatics Volume 3, Article 7: 1–13, 2010 doi: 10.3765/sp.3.7
Embedded implicatures observed: A comment on Geurts and Pouscoulous (2009)∗
Charles Clifton, Jr., University of Massachusetts Amherst
Chad Dube, University of Massachusetts Amherst
Received 2010-05-16 / First Decision 2010-06-11 / Revised 2010-06-30 / Accepted 2010-07-08 / Published 2010-07-28
Abstract Conventionalist theories of scalar implicature differ from other accounts in that they predict strengthening of embedded scalar terms. Geurts & Pouscoulous (2009a) argue that experimental support for this prediction is largely based on sentence comprehension tasks that inflate the frequency with which terms like some are strengthened. Using a picture verification task, they observed no strengthening of embedded scalars. We present data from a multiple-choice picture verification task that is more sensitive to interpretation preferences, and find that readers do show a preference for strengthened interpretations even in embedded phrases. These data cast doubt on Geurts and Pouscoulous’s empirical arguments against the existence of embedded implicatures.
Keywords: implicatures, scalar terms, interpretation, psycholinguistics
∗ Acknowledgements: We thank Lyn Frazier for comments on an earlier version of our manuscript. We thank Maria Bonilla and Morgan Mendes for their assistance in this research. This project was supported in part by Grant Number HD18708 from NICHD to the University of Massachusetts. The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of NICHD or NIH.
©2010 Clifton & Dube. This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).

1 Introduction
Geurts & Pouscoulous (2009a)1 present data arguing against what they call “mainstream conventionalist” and “minimal conventionalist” accounts of the strengthening of scalar terms like some. Both positions (see Chierchia, Fox & Spector 2008 for a survey; see Geurts & Pouscoulous 2009a for additional references) claim that an “exclusivity” or O-operator is freely prefixed to any S node with the result that a proposition containing some X, X or Y, etc. is strengthened to ‘some but not all,’ exclusive ‘or’, etc. Mainstream conventionalism claims that the strengthened interpretation is the preferred interpretation, unless it occurs in a context (e.g. a downward-entailing context) which results in a logically weaker global interpretation of the sentence in which it occurs. Minimal conventionalism merely claims that the strengthened interpretation is possible, but says nothing about preference.

1 See Chemla 2009, and Geurts & Pouscoulous 2009b, for more discussion.

One way to evaluate conventionalist approaches is to examine ‘embedded implicatures’ (or, following Geurts & Pouscoulous, ‘local scalar implicatures’). Consider a sentence like (1) (Geurts & Pouscoulous’s (7a)):
(1)
All students read some of Chierchia’s papers.
Insertion of the exclusivity operator under the scope of all students entails that all students read some but not all of Chierchia’s papers and thus that no students read all of Chierchia’s papers. This should be the preferred reading according to mainstream conventionalism, because it is a stronger (more limited) claim than the non-strengthened claim. It is also a possible reading according to minimal conventionalism. However, it is not a pragmatically justified reading from a Gricean perspective. The author of the statement presumably did not believe that all students read all of Chierchia’s papers (else he would have said that). Thus, the pragmatically justified implication of (1) is (2a). It is not (2b), which is entailed if the exclusivity operator is inserted. (2)
a. It is not the case that all students read all of Chierchia’s papers.
b. All students read not all of Chierchia’s papers.
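The contrast between the two strengthenings of (1) can be made explicit in a rough first-order sketch. The predicate names are notational assumptions introduced here for illustration, with the exclusivity operator rendered simply as the added conjunct "not all"; this is not the authors' own formalization.

```latex
% S(x): x is a student; some(x) / all(x): x read some / all of Chierchia's papers
\begin{align*}
\text{(1), basic:} \quad
  & \forall x\,[S(x) \rightarrow \mathit{some}(x)] \\
\text{local strengthening, yielding (2b):} \quad
  & \forall x\,[S(x) \rightarrow (\mathit{some}(x) \wedge \neg\mathit{all}(x))] \\
\text{global strengthening, yielding (2a):} \quad
  & \forall x\,[S(x) \rightarrow \mathit{some}(x)] \;\wedge\; \neg\forall x\,[S(x) \rightarrow \mathit{all}(x)]
\end{align*}
```

On this rendering the local reading entails the global one whenever there is at least one student, but not conversely: a situation in which one student read every paper and the others read only some of them verifies (2a) and falsifies (2b).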
Geurts & Pouscoulous (2009a) argue that introspective evidence is not adequate to decide what people usually do take sentences with scalar terms to mean (an argument that is particularly persuasive when the theorist is doing the introspecting). They present some very interesting ‘verification’ experiments which they claim disconfirm both flavors of conventionalism (but are consistent with a construal of Gricean pragmatics). In these experiments, a subject is shown a picture and asked whether a sentence containing a scalar term ‘correctly describes’ the picture. Their subjects nearly universally accepted sentences as correctly describing pictures that a strengthened
interpretation of the sentence was not true of. For instance, 100% of Geurts and Pouscoulous’s subjects accepted the sentence in Figure 1 (from Geurts & Pouscoulous 2009a) as correctly describing the arrangement shown in the figure, even though the locally-strengthened interpretation (‘all of the squares are connected to some but not all of the circles’ and thus ‘none of the squares are connected to all of the circles’) is false of the figure. They concluded, on the basis of data like these, that “the conventionalist approach to scalar implicatures has little to recommend it” (Geurts & Pouscoulous 2009a, p 431).
Figure 1: Verification item pairing the sentence ‘All the squares are connected with some of the circles.’ with a diagram and the response options ‘true’ and ‘false’. (From Geurts & Pouscoulous 2009a)
Geurts & Pouscoulous (2009a) acknowledged that data they obtained in verbal “inference” tasks (in which subjects are asked whether a sentence like All the squares are connected with some of the circles implies All the squares are connected with some but not all of the circles) exhibited a fair proportion (on the order of 50%) of strengthened interpretations. However, they state that such data are suspect. They argue that the proportion of acceptances of strengthened interpretations is inflated, perhaps because subjects’ attention is called to the putative implication, so that subjects confuse it with the legitimate non-embedded Gricean implicature (The square is connected with some of the circles pragmatically implicates The square is connected with some but not all of the circles). We were concerned that the verification task used by Geurts & Pouscoulous (2009a) has its own bias. Displays like that in Figure 1 can be
correctly described in many ways: There are squares and circles; Squares and circles are connected to each other; Some squares are connected to some circles; etc. A pragmatic perspective does not require that only the strongest interpretation is a correct description, even if it is the preferred description. Similarly, while a mainstream conventionalist perspective claims that the preferred (strengthened) interpretation is not strictly true of the display, the existence of various weaker but legitimate descriptions of the display suggests that the non-strengthened interpretation may be acceptable. It may be that the locally-strengthened interpretation is considered to be the best interpretation of the sentence, as long as it is the globally-strongest interpretation. However, Geurts and Pouscoulous’s subjects were not asked whether the display was the best possible depiction of the target sentence. They were only asked whether the sentence correctly described the display. A variety of weaker statements and interpretations can still be considered to be correct descriptions of the display. From this perspective, it is tempting to consider what would happen if the subject were given a choice between two displays, one of which honors the locally-strengthened interpretation and the other of which violates it. If the locally-strengthened interpretation is the preferred one (as claimed by the mainstream conventionalist position), subjects should choose the display that honors it rather than the one that does not. If minimal conventionalism is on the right path, then subjects should be equally happy choosing either display. And the same should be true if Gricean pragmatics rules the day: the proper interpretation should be ‘All the squares are connected to some and possibly all of the circles.’ We conducted two experiments, modeled on Geurts & Pouscoulous’s (2009a) Experiments 2 and 3. In each case, we shifted from a verification format to a choice format. 
Subjects were shown a sentence and two figures (generally one honoring a locally-strengthened interpretation, one honoring only a basic interpretation; see below for details), and asked to choose which picture was best described by the sentence: the ‘strengthened’ picture, the ‘basic’ picture, “both,” or “neither.” Both experiments were conducted in a single session, with randomly intermixed presentation of items including filler items, as described below.
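The predictions just described can be sketched as a small model checker. This is a minimal illustration under stated assumptions: a display is encoded as a mapping from each square to the set of circles it is connected to, the function names are hypothetical, and the two displays are made up for the example rather than taken from the experimental materials.

```python
# Toy model checker for the three readings of
# "All the squares are connected to some of the circles".
# A display maps each square to the set of circles it is connected to.
# All names and displays here are hypothetical illustrations.

def basic(display, circles):
    """'Some and possibly all': every square is linked to at least one circle."""
    return all(len(linked) > 0 for linked in display.values())

def local_strengthened(display, circles):
    """Embedded strengthening: every square is linked to some but not all circles."""
    return all(0 < len(linked) < len(circles) for linked in display.values())

def global_strengthened(display, circles):
    """Basic reading plus the denial of 'all ... all'."""
    every_square_linked_to_all = all(linked == circles for linked in display.values())
    return basic(display, circles) and not every_square_linked_to_all

def predicted_choice(reading, display_a, display_b, circles):
    """Response option expected from a reader who insists on `reading`."""
    a_ok = reading(display_a, circles)
    b_ok = reading(display_b, circles)
    if a_ok and b_ok:
        return "both"
    if a_ok:
        return "A"
    if b_ok:
        return "B"
    return "neither"

CIRCLES = frozenset({"c1", "c2", "c3"})
# Display A: one square is linked to every circle, so local strengthening fails.
DISPLAY_A = {"s1": CIRCLES, "s2": frozenset({"c1"}), "s3": frozenset({"c2"})}
# Display B: each square is linked to some but not all of the circles.
DISPLAY_B = {
    "s1": frozenset({"c1"}),
    "s2": frozenset({"c2"}),
    "s3": frozenset({"c1", "c3"}),
}

for reading in (basic, local_strengthened, global_strengthened):
    print(reading.__name__, "->", predicted_choice(reading, DISPLAY_A, DISPLAY_B, CIRCLES))
# basic -> both
# local_strengthened -> B
# global_strengthened -> both
```

With these two displays, only a reader who insists on the locally strengthened reading is expected to single out display B; the basic and globally strengthened readings are true of both displays.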
2 Experiment 1
The first experiment was based on Geurts and Pouscoulous’s Experiment 2, in which subjects were given a (Dutch) sentence like ‘Some of the B’s are in the box on the left’ and a picture containing the letters A, B, and C, and asked “to decide whether [the sentence] correctly describes [the picture]” (page 16). The left box had all the B’s and all the A’s, and the right box had all the C’s. Geurts and Pouscoulous present this experiment not as a test of whether embedded implicatures are made (the sentences evaluated are simple sentences, presumably supporting the Gricean implicature that ‘not all of the B’s are in the box on the left’) but simply as a check on the verification technique. They assumed that a subject who made the strengthened interpretation of ‘Some of the B’s are in the box on the left’ would reject that sentence as being a correct description of a picture where all the B’s are in the box on the left. In addition to having their subjects verify whether such sentences correctly described the pictures, they had their subjects perform a written inference task. Subjects were asked to decide whether a sentence like ‘Some of the B’s are in the box on the left’ implies that not all the B’s are in that box. 62% of their subjects accepted the truth of such a strengthened inference. However, a substantially smaller 34% of their subjects denied that the sentence correctly described the picture, as they should have done had they insisted on the strengthened interpretation. The only claim that Geurts and Pouscoulous made for these data is that the inference technique yields inflated rates of scalar implicatures. We conducted Experiment 1 to shed light on whether this is the right claim, or whether the picture verification technique used by Geurts and Pouscoulous underestimated the incidence of scalar implicatures.

2.1 Materials
Four some sentences were constructed, as illustrated in (3). One pair of pictures was made for each sentence, as illustrated in Figure 2.
(3)
Some of the stars are in the box on the left.
An additional 84 items (6 practice items plus 78 items from other experiments, including Experiment 2, presented below) were constructed. These were a mixture of picture verification items and written inference acceptance items, and tested both the scalar term some and the term or. We
present only the some verification data here, for comparability with Geurts and Pouscoulous.

Figure 2: Illustration of figures used in Experiment 1. The example item shows the instruction ‘Please indicate which shape is best described by the sentence below’ above the sentence ‘Some of the stars are in the box on the left.’
2.2 Subjects and Procedures
Thirty-six undergraduates at the University of Massachusetts participated; they received extra credit in their psychology courses in exchange for their participation. All subjects were tested individually. They viewed all the items on a computer monitor, and made their responses on a computer keyboard. The general instructions for all experiments were as follows:

In this experiment, you will be shown several short sentences. Following each sentence, there will be a question about the meaning of the sentence. On some trials, you will also be shown simple diagrams along with the sentences, and you will be asked to choose the diagram that is best described by the sentence. Please read the sentences carefully and answer each question to the best of your ability.

Subjects then advanced through 6 practice trials containing 3 simple verification and 3 inference items, followed by the individually-randomized presentation of a total of 82 experimental trials, including the 4 critical trials for Experiment 1. The verification instructions for all trials in all experiments simply asked subjects to ‘Please indicate which shape is best described by the sentence below.’ The sentence to be evaluated was presented below the verification instruction, and below the sentence was the diagram. The response options ‘A’, ‘B’, ‘C (Both)’ and ‘D (Neither)’ were indicated below the
diagram (see Figure 2). Subjects made the verification response via key-press. No time constraint was imposed on the subjects, and participation in the study took approximately 20 minutes.

2.3 Results and Discussion
Table 1 contains the percentages of choices of each of the four options. The results are very clear. There was a preponderance of choices of the ‘B’ pair of boxes, in which some but not all of the named items (e.g., stars) were on the left; there were more choices of B than A: t(35) = 9.8, p < .001, 95% CI of difference: (.54, .82). This, of course, is the choice that is consistent with a strengthened interpretation. Choices of ‘both,’ consistent with a nonstrengthened ‘some and possibly all,’ were fairly infrequent and failed to rise above the arguable chance level of .25 choices of a given option, t(35) = .13, p = .90, 95% CI : (.13, .35). Choices of the A picture (which Geurts & Pouscoulous’s subjects accepted 66% of the time) and the ‘neither’ item were essentially non-existent.

Table 1: Percentages of choices of each option (standard errors in parentheses), Experiment 1. “Strengthened” alternative indicated by *

Choice Option    A        B*        C (“both”)    D (“neither”)
                 3 (2)    71 (6)    24 (5)        2 (2)
The methodological implication is clear: The verification task as used by Geurts & Pouscoulous (2009a) gives a much smaller estimate of the extent to which readers arrive at a strengthened interpretation of some in a non-embedded context than does the choice task we used. Geurts and Pouscoulous apparently assume that subjects will reject a sentence as a correct description of a picture if the most preferred interpretation of the sentence is not true of the picture. However, alternative interpretations of a sentence are possible; it is possible to cancel a scalar implicature. Under such an interpretation, the quantified sentence seems to be a possible description of the picture, permitting Geurts and Pouscoulous’s subjects to accept it as such. However, our choice task permitted our subjects to let us know what their preferred interpretation of the quantified sentences is. They apparently took
this opportunity to tell us, contrary to Geurts and Pouscoulous’s conclusions, that they preferred the strengthened interpretation. This methodological conclusion justifies re-examining Geurts and Pouscoulous’s verification results about the (non-) strengthening of embedded implicatures.

3 Experiment 2
The second experiment examined strengthening in embedded implicatures, using a task like that in Experiment 1. The critical items gave subjects a quantified sentence containing the scalar term some and asked them to indicate which of two displays it more accurately described, where one display pictured the ‘some but not all’ interpretation and the other pictured the ‘all’ possibility (see Figure 3, version 1; version 2 is a second type of test, described below). The basic predictions are as follows: If mainstream conventionalism is correct in a very strict sense, only the display that honors the strengthened (‘some but not all’) interpretations should be chosen. If minimal conventionalism is strictly correct, the “both” option should be chosen (and to the extent that a specific display is chosen, each should be chosen equally often). The interpretation of a sentence strengthened by a conversational implicature is the denial of “all...all” (e.g., for the sentence All the squares are connected to some of the circles, it is ‘It is not the case that all the squares are connected to all of the circles’). Since this interpretation is true of both the displays in the version 1 portion of Experiment 2, the pragmatic perspective predicts the same pattern of choices as minimal conventionalism does.

3.1 Materials
Four sentences were constructed that contained the scalar some. They were written in two versions each, as illustrated in (4), one with the universal quantifier all and the other with each.2 Both forms involve embedded implicatures, and do not support scalar implicatures from a Gricean perspective. Each of the four items referred to a different triple of shapes.
2 This manipulation was included based on the intuition – which proved to be incorrect – that the more individuating nature of each compared to all would discourage a ‘group’ interpretation of the predicate and encourage strengthening.
(4)
a. All of the squares are connected to some of the circles.
b. Each of the squares is connected to some of the circles.
Two different figures, each with two designs, were made up for each of the four items. An illustration appears in Figure 3. One figure (top panel in Figure 3, Version 1) contained one design that honored the strengthened interpretation (the B item) and one design that honored the unstrengthened ‘all’ interpretation. The predictions for these items were laid out earlier. The other figure (bottom panel, Version 2) was designed so that the strengthened interpretation was true of neither design. For these items, a reader who arrived at that interpretation (i.e., a reader who made a local or embedded implicature) should choose Option D, ‘neither.’ A reader who did not take the strengthened interpretation should find either display acceptable and ideally choose Option C, ‘both.’

3.2 Subjects and Procedures
Since they were conducted together, details regarding the subjects and procedures for Experiment 2 are identical to those of Experiment 1, with the exception that each subject received 8 critical trials. Each subject saw all four sentences twice, once where one figure honored the strengthened interpretation (Figure 3, Version 1) and once where neither figure did (Figure 3, Version 2). Two of each of these had the quantifier all and two, each, counterbalanced over subjects so that each item was tested with each quantifier equally often. Apart from this variation, trials differed only in the particular forms used (circles, triangles, stars, moons, hearts, etc.).

3.3 Results and Discussion
Table 2 contains the percentages of choices of each option. Trials on which subjects were presented with a design that honored the strengthened interpretation (‘Version 1’) provided evidence that they frequently arrived at the strengthened interpretation: There were substantial numbers of choices of the design that honored that interpretation, but essentially none of just the design that was inconsistent with it. t tests comparing the probability of a strengthened response to .25 indicated significant strengthening for Version 1, t(71) = 2.59, p < .05, 95% CI : (.28, .48). However, the most frequent choice was option ‘C,’ “both,” which is the answer that is consistent with the non-strengthened, ‘logical,’ interpretation. Indeed, this option was chosen significantly more often than option B, t(71) = 2.13, p < .05, 95% CI of difference: (.01, .42).

Figure 3: Illustration of figures used in Experiment 2. The instruction read ‘Please indicate which shape is best described by the sentence below’, followed by the sentence ‘All/Each of the squares are connected to some of the circles.’ Version 1: figure used where the B option illustrated the strengthened interpretation. Version 2: figure used where neither option illustrated the strengthened interpretation.

Trials on which neither design honored the strengthened interpretation received a substantially increased number of option ‘D’ (“neither”) interpretations, which are consistent with a strengthened interpretation of the scalar, t(71) = 4.39, p < .001, 95% CI of the difference: (.10, .26). Version 2 also produced substantially more choices of option ‘A’ than option ‘B,’ t(71) = 4.15, p < .001, 95% CI of the difference (.11, .31), which is further reflected in a significant increase in the probability of choosing the A figure from Version 1 to Version 2, t(71) = 5.13, p < .001, 95% CI of the difference: (.15, .34). However, the most-frequent choice was option ‘C,’ “both,” the interpretation that is consistent with the non-strengthened interpretation (vs. option A: t(71) = 2.74, p < .01, 95% CI : (.07, .43)).
Table 2: Percentages of choices of each option (standard errors in parentheses), Experiment 2. “Strengthened” alternative indicated by *

Version 1: B alternative strengthened
Quantifier    A         B*         C (“both”)    D (“neither”)
all           3 (2)     39* (7)    57 (8)        1 (1)
each          0 (0)     38* (7)    63 (7)        0 (0)

Version 2: Neither alternative strengthened
Quantifier    A         B          C (“both”)    D (“neither”)
all           28 (7)    6 (3)      50 (8)        17* (6)
each          24 (6)    4 (2)      51 (8)        21* (6)
The greater frequency of choices of ‘A’ than of ‘B’ is of some interest. It has two apparent possible interpretations. From a Gricean perspective, a writer who wanted to describe the B picture would have written Each of the squares is connected to all of the circles. Since this is not what the sentence said, the sentence should not be taken to refer to the B picture. From a local strengthening perspective, the (strengthened) interpretation ‘Each of the squares is connected to some but not all of the circles’ is falsified by each of the squares in the B picture, but only by one square in the A picture. This could have encouraged choice of A as the ‘less-wrong’ alternative.

4 Conclusions
Methodologically, the conclusion is clear: While Geurts & Pouscoulous (2009a) may be correct in their concern that an inference judgment test yields an inflated number of instances of apparent strengthening of scalar terms, their alternative – the picture verification task, as they used it – apparently underestimates strengthening. When subjects were given a choice between two figures, only one of which honored the strengthened interpretation, they showed a distinct preference for choosing that figure. Geurts & Pouscoulous (2009a) took their verification data to show that subjects never, or almost never, rejected figures that violated strengthening of an embedded scalar term. Our data show that our subjects nonetheless showed a substantial preference for a figure that honored strengthening when given a choice
between the two types of figures (and further, that they showed a smaller but still substantial frequency of rejecting both figures when neither honored strengthening). We submit that Geurts and Pouscoulous’s conclusion that readers do not make embedded implicatures is based on suspect data, and hence is at best premature. Theoretically, though, the cup may be only half full. While our data show that readers who make the choice between the strengthened and the unstrengthened interpretation of an embedded scalar strongly prefer the former, they also show that the most common response is not to choose between the interpretations but to accept both. Such ecumenism is not a given; Experiment 1, which tested non-embedded scalar terms, found that “both” choices were fairly infrequent. The choice of “both” in Experiment 2 presumably reflects the absence of strengthening. Perhaps the right conclusion is that an apparently strengthened interpretation of an embedded scalar term like some is possible, but not obligatory and not even preferred. This conclusion may present some difficulty to one who holds a pragmatic Gricean perspective. As Geurts & Pouscoulous (2009a) make clear, Gricean accounts of strengthening of scalar terms under the scope of (e.g.) think and believe (Geurts 2009) do not readily generalize to scalar terms under the scope of all or each. In the absence of a Gricean account of pragmatic strengthening under the scope of such terms, our results call Gricean accounts generally into question. Similarly, our findings may present some difficulty for a mainstream conventionalist perspective: It is not clear from such a perspective why the strengthened interpretation is apparently taken less frequently than the basic interpretation. 
The minimal conventionalist perspective discussed by Geurts & Pouscoulous (2009a) can accommodate our data, as can a perspective that says that terms like some are simply ambiguous, but these perspectives are so unconstraining that one would hope to adopt them only as a last resort. We can conclude only that the evidence presented by Geurts & Pouscoulous (2009a) has not made a solid case against the existence of local, embedded implicatures. We trust that additional experimental research will clarify the conditions under which such implicatures are made, and hope that additional linguistic analysis will shed light on why these conditions encourage strengthening.
References
Chemla, Emmanuel. 2009. Universal implicatures and free choice effects: Experimental data. Semantics and Pragmatics 2(2). 1–33. doi:10.3765/sp.2.2.
Chierchia, Gennaro, Danny Fox & Benjamin Spector. 2008. The grammatical view of scalar implicatures and the relationship between semantics and pragmatics. In Claudia Maienborn, Klaus von Heusinger & Paul Portner (eds.), Semantics: An international handbook of natural language meaning, Berlin: Mouton de Gruyter. http://semanticsarchive.net/Archive/WMzY2ZmY/CFS_EmbeddedSIs.pdf. To appear.
Geurts, Bart. 2009. Scalar implicature and local pragmatics. Mind and Language 24(1). 51–79. doi:10.1111/j.1468-0017.2008.01353.x.
Geurts, Bart & Nausicaa Pouscoulous. 2009a. Embedded implicatures?!? Semantics and Pragmatics 2(4). 1–34. doi:10.3765/sp.2.4.
Geurts, Bart & Nausicaa Pouscoulous. 2009b. Free choice for all: a response to Emmanuel Chemla. Semantics and Pragmatics 2(5). 1–10. doi:10.3765/sp.2.5.
Charles Clifton, Jr. Tobin Hall 135 Hicks Way University of Massachusetts Amherst, MA 01003 USA [email protected]
Chad Dube Tobin Hall 135 Hicks Way University of Massachusetts Amherst, MA 01003 USA [email protected]
Semantics & Pragmatics Volume 3, Article 11: 1–28, 2010 doi: 10.3765/sp.3.11
Conjunctive interpretation of disjunction∗
Robert van Rooij, ILLC, University of Amsterdam
Received 2010-02-02 / First Decision 2010-03-21 / Revision Received 2010-04-19 / Second Decision 2010-04-20 / Revision Received 2010-05-12 / Third Decision 2010-06-10 / Revision Received 2010-07-13 / Accepted 2010-08-18 / Published 2010-09-15
Abstract In this extended commentary I discuss the problem of how to account for “conjunctive” readings of some sentences with embedded disjunctions for globalist analyses of conversational implicatures. Following Franke (2010, 2009), I suggest that earlier proposals failed, because they did not take into account the interactive reasoning of what else the speaker could have said, and how else the hearer could have interpreted the (alternative) sentence(s). I show how Franke’s idea relates to more traditional pragmatic interpretation strategies.
Keywords: embedded implicatures, optimal interpretation, free choice permission
∗ The content of this paper was crucially inspired by Michael Franke’s dissertation, and earlier work done on free choice permission by Katrin Schulz. Besides them, I would also like to thank the reviewer of this paper and the editors of this journal (David Beaver in my case) for their useful and precise comments on an earlier version of this paper.
©2010 Robert van Rooij. This is an open-access article distributed under the terms of a Creative Commons NonCommercial License (creativecommons.org/licenses/by-nc/3.0).

1 Introduction
Neo-Gricean explanations of what is meant but not explicitly said are very appealing. They start with what is explicitly expressed by an utterance, and then seek to account for what is meant in a global way by comparing what the speaker actually said with what he could have said. Recently, some researchers (e.g., Levinson (2000), Chierchia (2006), Fox (2007)) have argued that it is wrong to start with what is explicitly expressed by an utterance. Instead, or so it is argued, implicatures should be calculated locally at linguistic clauses. For what it is worth, I find the traditional globalist analysis of implicatures more appealing, and all other things equal, I prefer the global
analysis to a localist one. But, of course, not all things are equal. Localists have provided two types of arguments in favor of their view: experimental evidence and linguistic data. I believe that the ultimate “decision” on which line to take should, in the end, depend only on experimental evidence. I do not have much to say about this, but I admit to being happy with the experimental results reported by Chemla (2009) and Geurts & Pouscoulous (2009a), which mostly seem to favor a neo-Gricean explanation. But localists provided linguistic examples as well, examples that according to them could not be explained by standard “globalist” analyses. Impossibility proofs in pragmatics, however, are hard to give. Many examples involve triggers of scalar implicatures like or or some embedded under other operators. Some early examples include φ ∨ (ψ ∨ χ) and □(φ ∨ ψ). Localist theories of implicatures were originally developed to account for examples of this form. As for the first type of example, globalists soon pointed out that these are actually unproblematic to account for. As for the second type, Geurts & Pouscoulous (2009a) provide experimental evidence that implicature triggers like or and some used under the scope of an operator like believe or want do not necessarily give rise to local implicatures. That is, many more participants in their experiments infer the implicature (1-b) from (1-a) than infer (2-b) and (3-b) from (2-a) and (3-a), respectively. Moreover, they show that there is little evidence that people in fact infer (3-b) from (3-a).
(1) a. Anna ate some of the cookies.
    b. Anna didn’t eat all of the cookies.
(2) a. Bob believes that Anna ate some of the cookies.
    b. Bob believes that Anna did not eat all of the cookies.
(3) a. Bob wants Anna to hear some of the Verdi operas.
    b. Bob wants Anna not to hear all of the Verdi operas.
These data are surprising for localist theories of implicatures according to which scalar inferences occur systematically and freely in embedded positions. The same data are accounted for rather easily, however, on a global analysis.1 Thus, Geurts & Pouscoulous (2009a) argue that localist theories of embedded implicatures tend to over-generate, and that global neo-Gricean theories predict much better.
1 See Geurts & Pouscoulous 2009a and Geurts & Pouscoulous 2009b for discussion, and footnote 18.
It is well-known, however, that globalist theories have serious problems with other examples involving triggers used in embedded contexts as well. Problematic examples include conditionals with disjunctive antecedents like (φ ∨ ψ) > χ and free choice permissions like ♦(φ ∨ ψ). Both examples seem to give rise to “conjunctive” interpretations: from ♦(φ ∨ ψ), for example, we infer ♦φ ∧ ♦ψ. Standard neo-Gricean analyses like those of Sauerland (2004) and van Rooij & Schulz (2004), however, do not predict this. Fox (2007) has shown that this conjunctive interpretation follows once we make use of recursive exhaustification, and Chemla (2009) has defined a new operator that can be applied globally to the formula ♦(φ ∨ ψ) and still gives rise to the desired conjunctive reading. This is certainly appealing, but it is not so clear that Chemla’s analysis is truly neo-Gricean. In the words of Geurts & Pouscoulous (2009b), “Defining an operator is one thing; providing a principled pragmatic explanation is quite another”. Franke (2010, 2009) provided such a principled pragmatic explanation of these data making use of game theory.2 The purpose of this paper is to show how this analysis relates to more traditional pragmatic interpretation strategies. As we will see, this reformulation also involves multiple uses of exhaustive interpretation. I will explain how the analysis still differs from the analysis of Fox (2007), and suggest that it is more Gricean in spirit. The experimental data of Chemla (2009) are mostly problematic for localist analyses of implicatures. He found, for instance, that sentences of the form ∀x(P x ∨ Qx) do not routinely give rise to the expected “local” implicature that ∀x¬(P x ∧ Qx).3 Still, there is at least one type of experimental result that, he claims, favors a localist analysis. 
Chemla (2009) found that, just as for sentences of the form ♦(φ ∨ ψ), sentences of the form ∀x♦(P x ∨ Qx) also give rise to a “conjunctive” interpretation: they license the inference to ∀x♦P x ∧ ∀x♦Qx. Chemla claims that this inference is predicted by a localist analysis, but not by a globalist one. In section 4.3 we will come back to this issue.

2 For a rather different pragmatic explanation of these data, see Chemla 2008.
3 Geurts & Pouscoulous (2009a) found something similar, and claim that on the basis of their data one should conclude that this inference simply never takes place. I am not sure, though, whether they also tested whether the inference is absent when a sentence like Everybody likes bananas or apples is given as an answer to the explicit question What does everybody like?
2 In need of pragmatic explanation
2.1 Conditionals with disjunctive antecedents
It seems reasonable that any adequate theory of conditionals must account for the fact that, at least most of the time, instantiations of the following formula (Simplification of Disjunctive Antecedents, SDA) are true:
(4) (SDA) [(φ ∨ ψ) > χ] → [(φ > χ) ∧ (ψ > χ)]
For instance, intuitively we infer from (5-a) that both (5-b) and (5-c) are true:
(5) a. If Spain had fought on either the Allied side or the Nazi side, it would have made Spain bankrupt.
    b. If Spain had fought on the Allied side, it would have made Spain bankrupt.
    c. If Spain had fought on the Nazi side, it would have made Spain bankrupt.
Of course, if the conditional is analyzed as material or strict implication, this comes out immediately. Many researchers, however, don’t think these analyses are appropriate, and prefer an analysis along the lines of Lewis and Stalnaker. Adopting the limit assumption,4 one can formulate their analyses in terms of a selection function, f, that selects for each world w and sentence/proposition φ the closest φ-worlds to w. A conditional represented as φ > χ is now true in w iff f_w(φ) ⊆ χ. This analysis, however, does not make (SDA) valid. The problem is that if we were to make this principle valid, e.g., by saying that f_w(φ ∨ ψ) = f_w(φ) ∪ f_w(ψ), then the theory would lose one of its most central features, its non-monotonicity. The principle of monotonicity,
(6) (MON) [φ > χ] → [(φ ∧ ψ) > χ],
becomes valid. That is, by accepting SDA, we can derive MON on the assumption that the connectives are interpreted in a Boolean way,5 and we end up with a strict conditional account. We have seen already that the strict conditional account (or the material conditional account) predicts SDA, but perhaps for the wrong reasons. The Lewis/Stalnaker account does not validate MON because SDA is not a theorem of their logic. Although there are well-known counterexamples to SDA,6 we would still like to explain why it holds in “normal” contexts. A simple “explanation” would be to say that a conditional of the form (φ ∨ ψ) > χ can only be used appropriately in case the best φ-worlds and the best ψ-worlds are equally similar to the actual world. Though this suggestion gives the correct predictions, it is rather ad hoc. We would like to have a “deeper” explanation of this desired result in terms of a general theory of pragmatic interpretation.

2.2 Free choice
The free choice problem is a problem about permission sentences. Intuitively, from the (stated) permission You may take an apple or a pear one can conclude that you can take an apple and that you can take a pear (though perhaps not both). This intuition is hard to account for, however, on any standard analysis of permission sentences. There is still no general agreement on how to interpret such sentences. In standard deontic logic (e.g., Kanger 1981, though basically due to Leibniz (1930)) it is assumed that permission sentences denote propositions that are true or false in a world, and that deontic operators (like ought and permit) apply to propositions. The permission ♦φ is considered to be true in w just in case there exists a world deontically accessible from w in which φ is true. Obviously, such an analysis predicts that ♦φ ⊨ ♦(φ ∨ ψ).7 This analysis does not predict, however, that ♦(φ ∨ ψ) ⊨ ♦φ ∧ ♦ψ. According to other traditions (e.g., von Wright 1950, Lewis 1979), we should look at permission sentences from a more dynamic perspective. But there are still (at least) two ways of doing this. According to the performative analysis (cf. Lewis 1979), the main point of giving a permission is to change a prior permissibility set to a posterior one. This analysis might still be consistent with the deontic logic approach in that it assumes that what is permitted denotes a proposition. Another tradition (going back to von Wright 1950) is based on the assumption that deontic concepts are usually applied to actions rather than propositions. Although permissions are now said to apply to actions, a permission sentence by itself is still taken to denote a proposition, and is true or false in a world.8

4 The assumption that for any world there is at least one closest φ-world for any consistent φ; see Lewis 1973 for classic discussion.
5 From φ > χ and the assumption that connectives are interpreted in a Boolean way, we can derive ((φ ∧ ψ) ∨ (φ ∧ ¬ψ)) > χ. By SDA we can then derive (φ ∧ ψ) > χ.
6 See Fine 1975.
7 In the philosophical literature, this is sometimes called the paradox of free choice permission, because it is taken to be problematic.

2.2.1 A conditional analysis with dynamic logic
Let us first look at the latter approach, according to which deontic operators are construed as action modalities. Dynamic logic (Harel 1984) makes a distinction between actions (and action expressions) and propositions. Propositions hold at states of affairs, whereas actions produce a change of state. Actions may be nondeterministic, having different ways in which they can be executed. The primary logical construct of standard dynamic logic is the modality ⟨α⟩φ, expressing that φ holds after α is performed. This modality operates on an action α and a proposition φ, and is true in world w if some execution of the action α in w results in a state/world satisfying the proposition φ. Dynamic logic starts with two disjoint sets; one denoting atomic propositions, the other denoting atomic actions. The set of action expressions is then defined to be the smallest set A containing the atomic actions such that if α, β ∈ A, then α ∨ β ∈ A and α; β ∈ A.9 The set of propositions is defined as usual, with the addition that if α is an action expression and φ a proposition, then ⟨α⟩φ is a proposition as well. To account for permission sentences we will assume that in that case also Per(α) is a proposition. Propositions are just true or false in a world. To interpret the action expressions, it is easiest to let them denote sets of pairs of worlds. The mapping τ gives the interpretation of atomic actions. The mapping τ is extended to give interpretations to all action expressions by τ(α; β) = τ(α); τ(β) and τ(α ∨ β) = τ(α) ∪ τ(β). The action α; β consists of executing first α, and then β. The action α ∨ β can be performed by executing either α or β. We write τ_w(α) for the set {v ∈ W | ⟨w, v⟩ ∈ τ(α)}. Thus, τ_w(α) is the set of all worlds you might end up in after performing α in w.
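The relational interpretation of action expressions just described can be made concrete in a few lines. The following is only a minimal sketch under assumptions of my own: the toy worlds `w0`–`w3`, the atomic actions `a` and `b`, and the tuple encoding of α ∨ β and α; β are all hypothetical and serve illustration only.

```python
# Hypothetical toy model: atomic actions denote relations, i.e. sets of
# (source world, target world) pairs. All names are illustrative assumptions.
TAU_ATOMIC = {
    "a": {("w0", "w1")},
    "b": {("w0", "w2"), ("w1", "w3")},
}

def tau(action):
    """Interpret an action expression as a set of world pairs.

    Atomic actions are strings; ("or", x, y) encodes choice and
    ("seq", x, y) encodes sequential composition.
    """
    if isinstance(action, str):
        return set(TAU_ATOMIC[action])
    op, left, right = action
    rel_left, rel_right = tau(left), tau(right)
    if op == "or":    # tau(x or y) = tau(x) union tau(y)
        return rel_left | rel_right
    if op == "seq":   # relational composition: first x, then y
        return {(w, v) for (w, m1) in rel_left
                for (m2, v) in rel_right if m1 == m2}
    raise ValueError(f"unknown operator: {op}")

def tau_w(action, w):
    """All worlds one might end up in after performing the action in w."""
    return {v for (source, v) in tau(action) if source == w}

def diamond(action, prop, w):
    """<action>prop holds at w iff some execution from w ends in a prop-world."""
    return bool(tau_w(action, w) & prop)

choice_outcomes = tau_w(("or", "a", "b"), "w0")
sequence_outcomes = tau_w(("seq", "a", "b"), "w0")
```

In this toy model the choice action can lead from `w0` to either `w1` or `w2`, while the sequence can only lead to `w3`, which mirrors the union and composition clauses for τ.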
We will say that Per(α) is true in w, w ⊨ Per(α),10 just in case τ_w(α) ⊆ P_w, where P_w is the set of permissible worlds in w. Notice that this way of interpreting permissions gives them a conditional flavor: Per(α) really means that it is acceptable to perform α.11 Given the interpretation of disjunctive actions, it immediately follows that we can account for free choice permission: from the truth of Per(α ∨ β) we can infer the truth of Per(α) and Per(β).12 Although free choice permission follows, one wonders whether it should be built into the semantics: if I allow you to do α this doesn’t mean that I allow you to do α in any way you want. I only allow you to do α in the best way. To account for this latter rider, we can add to our models a selection function, f, that picks out the best elements of any set of possible worlds X for every world w. Then we say that ♦α is true in w iff f_w(τ_w(α)) ⊆ P_w. But even if f_w(X ∪ Y) ⊆ f_w(X) ∪ f_w(Y), it is still not guaranteed that f_w(X ∪ Y) = f_w(X) ∪ f_w(Y), and thus the free choice permission inference isn’t guaranteed either. Of course, the inference follows in case f_w(τ_w(α ∨ β)) = f_w(τ_w(α)) ∪ f_w(τ_w(β)), but we would like to have a pragmatic explanation of why this should be the case if an assertion of the form ♦(α ∨ β) is given.

8 There is yet another way to go, which recently became popular as well (e.g., Portner 2007): assume that permission applies to an action, but assume that a permission statement also changes what is permitted. I won’t go into this story here. Another story I won’t go into here is the resource-sensitive logic approach to free choice permission proposed in Barker 2010, a paper I became aware of just as the current paper was going to press.
9 I will ignore iteration here.
10 Strictly speaking the definition of ⊨ should be relativized to a model, but the model remains implicit here as throughout the paper.

2.2.2 A performative analysis
Lewis (1979) and Kamp (1973, 1979) have proposed a performative analysis of command and permission sentences involving a master and his slave. On their analysis, such sentences are not primarily used to make true assertions about the world, but rather to change what the slave is obliged/permitted to do.13 But how will permission sentences govern the change from the prior permissibility set, Π, to the posterior one, Π′? Kamp (1979) proposes that this change depends on a reprehensibility ordering, ≤, on possible worlds. The effect of allowing φ is that the best φ-worlds are added to the old permissibility set to figure in the new permissibility set. This set will be denoted as Π*_φ and is defined in terms of the relation ≤ as follows:
(7) Π*_φ =def {u ∈ φ | ∀v ∈ φ : u ≤ v}
Thus, the change induced by the permission You may do φ is that the new permission set, Π′, is just Π ∪ Π*_φ. Note that according to this performative account it does not follow that for a permission sentence of the form You may do φ or ψ the slave can infer that according to the new permissibility set he is allowed to do any of the disjuncts. Still, in terms of Kamp’s analysis we can give a pragmatic explanation of why disjuncts are normally interpreted in this “free choice” way. To explain this, let me first define a deontic preference relation between propositions, ≺, in terms of our reprehensibility relation between worlds, <. We can say that although both φ and ψ are incompatible with the set of ideal worlds, φ is still preferred to ψ, φ ≺ ψ, iff the best φ-worlds are better than the best ψ-worlds: ∃v ∈ φ such that ∀u ∈ ψ : v < u. Then we can say that with respect to ≺, φ and ψ are equally reprehensible, φ ≈ ψ, iff φ ⊀ ψ and ψ ⊀ φ. It is easily seen that Π*_{φ∨ψ} = Π*_φ ∪ Π*_ψ iff φ ≈ ψ. How can we now explain the free choice effect? According to a straightforward suggestion, a disjunctive permission can only be made appropriately in case the disjuncts are equally reprehensible.14 This suggestion, of course, exactly parallels the earlier suggestions of when conditionals with disjunctive antecedents can be used appropriately, or disjunctive permissions according to the dynamic logic approach. Like these earlier suggestions, however, this new suggestion by itself is rather ad hoc, and one would like to provide a “deeper” explanation in terms of more general principles of pragmatic reasoning.

11 See Asher & Bonevac 2005 for a conditional analysis of permission sentences.
12 Notice also that another paradox of standard deontic logic is avoided now: from the permission of α, Per(α), the permission of α ∨ β, Per(α ∨ β), doesn’t follow.
13 For further discussion of this model, see e.g., van Rooij 2000.

3 Pragmatic interpretation
3.1 The standard received view
Implicatures come in many varieties, but scalar implicatures have received the most attention from linguists. A standard way to account for the scalar implicatures of ‘φ’ is to assume that φ is associated with a set of alternatives, A(φ), and that the assertion of φ implicates that all its stronger alternatives are false.
(8) Prag(φ) = {w ∈ φ | ¬∃ψ ∈ A(φ) : w ∈ ψ & ψ ⊂ φ}.
If the alternative of Some of the students passed is All of the students passed, the desired scalar implicature is indeed accounted for. McCawley (1993) noticed, however, that if one scalar item is embedded under another one, as in (9),15 an interpretation rule like Prag does not give rise to the desired prediction that only one student passed if the alternatives are defined in the traditional way.
(9) Alice passed or (Bob passed or Cindy passed).
This observation can be straightforwardly accounted for if we adopt a different pragmatic interpretation rule and a different way to determine alternatives. First, we assume that the set of alternatives includes { Alice passed, Bob passed, Cindy passed } (which should perhaps be closed under conjunction and disjunction). According to the new pragmatic interpretation rule Exh, w is compatible with the pragmatic interpretation of φ iff (i) φ is true in w, and (ii) there is no other world v in which φ is true where fewer alternatives in A(φ) are true than are true in w, see (10). In the following, we abbreviate the condition ∀ψ ∈ A(φ) : v ∈ ψ ⇒ w ∈ ψ by v ≤_A(φ) w, and define v <_A(φ) w as: v ≤_A(φ) w and not w ≤_A(φ) v.
(10) Exh(φ) = {w ∈ φ | ¬∃v ∈ φ : v <_A(φ) w}
The pragmatic interpretation rule Exh correctly predicts that from (9) we can pragmatically infer that only one of Alice, Bob, and Cindy passed. In fact, this pragmatic interpretation rule is better known as the exhaustive interpretation of a sentence (e.g., Groenendijk & Stokhof 1984, van Rooij & Schulz 2004, Schulz & van Rooij 2006, Spector 2003, 2006). By interpreting sentences exhaustively one can account for many conversational implicatures. But from a purely Gricean point of view, the rule is too strong. All that the Gricean maxims seem to allow us to conclude from a sentence like Some of the students passed is that the speaker does not know that All of the students passed is true; not the stronger proposition that the latter sentence is false. To account for this intuition, the following weaker interpretation rule, Grice, can be stated, which talks about knowledge rather than facts (where Kφ means that the speaker knows φ):16
(11) Grice(φ) = {w ∈ Kφ | ∀v ∈ Kφ, ∀ψ ∈ A(φ) : w ⊨ Kψ → v ⊨ Kψ}

14 For an alternative proposal using this framework, see van Rooij 2006.
15 Landman (2000) and Chierchia (2004) discuss structurally similar examples like Mary is either working at her paper or seeing some of her students.
16 A similar weaker interpretation is given by Sauerland (2004).
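The rules Prag in (8) and Exh in (10) can be checked mechanically on small models. The sketch below is illustrative only: propositions are modeled as frozensets of worlds, and the world labels, the helper `verified`, and the encoding of example (9) as sets of passing students are assumptions of mine, not part of the analyses cited above.

```python
from itertools import combinations

def prag(phi, alternatives):
    """Rule (8): worlds of phi in which no strictly stronger alternative is true."""
    return {w for w in phi
            if not any(w in psi and psi < phi for psi in alternatives)}

def verified(w, alternatives):
    """Indices of the alternatives that are true in world w."""
    return {i for i, psi in enumerate(alternatives) if w in psi}

def exh(phi, alternatives):
    """Rule (10): worlds of phi that verify a minimal set of alternatives."""
    return {w for w in phi
            if not any(verified(v, alternatives) < verified(w, alternatives)
                       for v in phi)}

# Scalar case: 'Some' with the single alternative 'All'.
some = frozenset({"some_not_all", "all"})
all_ = frozenset({"all"})
scalar_reading = prag(some, [all_])

# Example (9): a world is modeled as the set of students who passed.
PEOPLE = ("alice", "bob", "cindy")
WORLDS = [frozenset(c) for n in range(len(PEOPLE) + 1)
          for c in combinations(PEOPLE, n)]
passed = {p: frozenset(w for w in WORLDS if p in w) for p in PEOPLE}
sentence9 = passed["alice"] | passed["bob"] | passed["cindy"]
exactly_one = exh(sentence9, list(passed.values()))
```

Under these assumptions `scalar_reading` contains only the world where some but not all students passed, and `exactly_one` contains just the three worlds in which a single student passed.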
As shown by Spector (2003) and van Rooij & Schulz (2004), exhaustive interpretation follows from the Grice rule if we additionally assume that the speaker is as competent as possible, insofar as this is compatible with Grice.

3.2 The problem
Although these interpretation rules account for many conversational implicatures, they give rise to the wrong predictions for more complex statements involving disjunction. Two prime examples are (i) free choice permissions of the form ♦(φ ∨ ψ), and (ii) conditionals with disjunctive antecedents like (φ ∨ ψ) > χ. It is widely held that the alternatives of these sentences are respectively ♦φ, ♦ψ and ♦(φ ∧ ψ), and φ > χ, ψ > χ and (φ ∧ ψ) > χ. Before we can discuss the possible pragmatic interpretations, let us first note that according to standard deontic logic ♦(φ ∨ ψ) ⊨ ♦φ ∨ ♦ψ,17 and that adopting the Lewis/Stalnaker analysis of conditionals, it holds that (φ ∨ ψ) > χ ⊨ (φ > χ) ∨ (ψ > χ). Let us first look at the standard pragmatic interpretation rule Prag. Given that ♦φ and ♦ψ express stronger propositions than ♦(φ ∨ ψ), it immediately follows that Prag(♦(φ ∨ ψ)) = ∅, which is obviously wrong. Let’s turn then to exhaustive interpretation. We take the only relevantly different worlds in which ♦(φ ∨ ψ) is true to be {u, v, w}, where ♦φ is true in u and w, and ♦ψ is true in v and w. Recall that Exh(φ) holds in worlds in which as few as possible alternatives to φ are true. But this means that Exh(♦(φ ∨ ψ)) = {u, v}, from which we can wrongly conclude that only one of the permissions is true. The desired conclusion that both permissions are true is incompatible with this pragmatic interpretation. A similar story holds for conditionals with disjunctive antecedents. Let us turn now to the weaker Gricean interpretation Grice. This weaker Gricean rule indeed predicts an interpretation that the sentences in fact have. For the disjunctive permission ♦(φ ∨ ψ) it is predicted that neither ♦φ nor ♦ψ is known to be true, but that they both are possibly true, perhaps even together.
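The two failed predictions for Prag and Exh can be reproduced on the three-world model {u, v, w}. This is a minimal sketch with assumptions of my own: propositions are frozensets of world labels, and the extension chosen for ♦(φ ∧ ψ) is hypothetical (the text does not fix it); with the other alternatives in place it does not affect either result.

```python
def prag(phi, alternatives):
    """Rule (8): worlds of phi in which no strictly stronger alternative is true."""
    return {w for w in phi
            if not any(w in psi and psi < phi for psi in alternatives)}

def exh(phi, alternatives):
    """Rule (10): worlds of phi that verify a minimal set of alternatives."""
    def verified(w):
        return {i for i, psi in enumerate(alternatives) if w in psi}
    return {w for w in phi if not any(verified(v) < verified(w) for v in phi)}

may_phi = frozenset({"u", "w"})            # permission for phi holds in u and w
may_psi = frozenset({"v", "w"})            # permission for psi holds in v and w
may_phi_or_psi = frozenset({"u", "v", "w"})
may_phi_and_psi = frozenset({"w"})         # assumption: only w could permit both

alternatives = [may_phi, may_psi, may_phi_and_psi]
prag_result = prag(may_phi_or_psi, alternatives)
exh_result = exh(may_phi_or_psi, alternatives)
```

Here `prag_result` is empty and `exh_result` is {u, v}; the world w, in which both permissions hold, is excluded by both rules.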
The Gricean prediction is appealing, but strengthening it by assuming our earlier form of competence doesn’t give rise to the desired conclusion: the resulting exhaustive interpretation gives rise to the wrong prediction. Perhaps this just means that the set of alternatives is chosen wrongly, or that the competence assumption is formalized in the wrong way. Indeed, this was proposed by Schulz (2003, 2005) to account for free choice permissions. As for the latter, she took a speaker to be competent in case she knows of each alternative whether it is true. As for the former, she took the set of alternatives of ♦φ to be the set {□ψ : ψ ∈ A(φ)} ∪ {□¬ψ : ψ ∈ A(φ)}.18 Notice first that by applying Grice to a sentence of the form ♦(φ ∨ ψ) it immediately follows that the speaker knows neither □¬φ nor □¬ψ, in formulas, ¬K□¬φ and ¬K□¬ψ. What we would like is that from here we derive the free choice reading: ♦φ and ♦ψ, which would follow from K¬□¬φ and K¬□¬ψ. Of course, this doesn’t follow yet, because it might be that the speaker does not know what the agent may or must do.19 But now assume that the speaker is competent on this in Schulz’ sense. Intuitively, this means that P□φ ≡ K□φ and P♦φ ≡ K♦φ (with P the dual of K, so that Pφ abbreviates ¬K¬φ). Remember that after applying Grice, it is predicted that neither K□¬φ nor K□¬ψ holds, which means that P¬□¬φ and P¬□¬ψ have to be true. The latter, in turn, are equivalent to P♦φ and P♦ψ. By competence we can now immediately conclude K♦φ and K♦ψ, from which we can derive ♦φ and ♦ψ as desired, because knowledge implies truth.20 Although I find this analysis appealing, it is controversial, mainly because of her choice of alternatives. This also holds for other proposed pragmatic analyses to account for free choice permissions, such as, for example, that of Kratzer & Shimoyama (2002). In section 4 I will discuss some other possible analyses that explain the desired free choice inference and that assume that the alternatives of (φ ∨ ψ) > χ and ♦(φ ∨ ψ) are φ > χ and ψ > χ, and ♦φ and ♦ψ, respectively.

17 Similarly, f_w(τ_w(α ∨ β)) ⊆ f_w(τ_w(α)) ∪ f_w(τ_w(β)) and Π*_{φ∨ψ} ⊆ Π*_φ ∪ Π*_ψ.
18 Taking □φ as an alternative makes it natural to infer from ♦φ the falsity of this necessity statement.
19 Notice, though, that this inference does follow if ‘□’ and ‘♦’ stand for epistemic must and epistemic might. This is so, because for the epistemic case we can safely assume that the speaker knows what he believes, which can be modeled by taking the epistemic accessibility relation to be fully introspective. This gives the correct predictions, because from Katrin might be at home or at work, it intuitively follows that, according to the speaker, Katrin might be at home, and that she might be at work (cf. Zimmermann 2000).
20 Notice that it is also Schulz’ reasoning and notion of competence for Anna ate all of the cookies that is used to explain why from (2-a) we conclude (2-b).
4 Taking both directions into account
4.1 The intuition21
Suppose we adopt a Stalnaker/Lewis style analysis of conditional sentences. In that case we have to assume a selection function f to evaluate the truth value of the sentence. Take now the set of worlds in which (φ ∨ ψ) > χ is true to be {u, v, w}, such that (i) f_u(φ ∨ ψ) = f_u(φ) ⊆ χ and f_u(ψ) ⊈ χ, (ii) f_v(φ) ⊈ χ and f_v(φ ∨ ψ) = f_v(ψ) ⊆ χ, and (iii) f_w(φ ∨ ψ) = f_w(φ) ∪ f_w(ψ) ⊆ χ.22 We would like to conclude via pragmatic reasoning that the speaker who asserted (φ ∨ ψ) > χ implicated that we are in world w. In that case both φ > χ and ψ > χ are true as well, and we derived the “conjunctive” interpretation of the conditional with a disjunctive antecedent.
The reasoning will go as follows. First, we are going to assume that the speaker is competent: she knows in which world she is. It seems unreasonable that she is in u, because otherwise the speaker could have used an alternative expression, φ > χ, which (limiting ourselves to worlds in which (φ ∨ ψ) > χ is true) more accurately singles out {u} than (φ ∨ ψ) > χ does. For the same reason we can conclude that the speaker is not in world v. In the only other case, w, f_w(φ ∨ ψ) = f_w(φ) ∪ f_w(ψ) ⊆ χ, and thus both φ > χ and ψ > χ are true. Of course, one might wonder whether this state, too, could be expressed more economically by an alternative expression. But the answer to this will be negative, because we have already assumed that (φ > χ) ∧ (ψ > χ) is not an alternative to (φ ∨ ψ) > χ. Thus, (φ ∨ ψ) > χ pragmatically entails φ > χ and ψ > χ, because if not, the speaker could have used an alternative expression which more accurately singled out the actual world.
Intuitive solutions are fine, but to test them, we have to make them precise. In the following I will suggest two ways to implement the above intuition. Both implementations are based on the idea that to account for the desired “conjunctive” inferences of the disjunctive sentences, alternative expressions and alternative worlds/interpretations must play a very similar role in pragmatic interpretation. Thinking of it in somewhat different terms, we should take seriously both the speaker’s and the hearer’s perspective. Fortunately, there are two well-known theories on the market that look at pragmatic interpretation from such a point of view: Bi-directional Optimality Theory (e.g., Blutner 2000), and Game Theory (e.g., Benz, Jäger & van Rooij 2005). In the following I will discuss two possible ways to proceed, but they have something crucial in common: both ways make use of different levels of interpretation. The first proposal is game-theoretic in nature, and due to Franke (2010, 2009). The second suggestion is a less radical departure from the “received view” in pragmatics, and is more in the spirit of Bi-OT. It makes crucial use of exhaustive interpretation and of different levels of interpretation, but like in Bi-OT, alternative worlds and expressions that initially played a role in interpretation need not play a role anymore at higher levels.23

21 The intuition of the following solution I owe to Franke (2009). One way of working out this intuition will be somewhat different, though, from what Franke proposed. This way makes use of bidirectional optimality theory. Earlier accounts making use of Bi-OT include Sæbø 2004 and Aloni 2007. What I always found problematic about such earlier Bi-OT solutions (I was a co-author of an earlier version of Aloni 2007) is that complexity of alternative expressions was taken to play a crucial role. But explanations based on complexity are not always equally convincing. Following Franke 2009, I believe that making use of complexity is not required. At a 2009 conference in Leuven where I presented Bi-OT and game-theoretic “solutions” of the problem of free choice inferences, Bart Geurts presented a solution that was based on a similar intuition (I am not sure to what extent complexity played a crucial role there), to be presented in Geurts 2010. I believe that Edgar Onea also suggested a solution very much in the same spirit. Perhaps this should be taken as an indication of how natural a solution in this spirit is.
22 It might seem that I wrongly assume that φ > χ ⊨ (φ ∨ ψ) > χ and ψ > χ ⊨ (φ ∨ ψ) > χ. This is not, and should not, be the case. It might well be, for instance, that φ > χ is true in w, but (φ ∨ ψ) > χ is not. However, our reasoning will not depend on such worlds, because we will only consider worlds in which (φ ∨ ψ) > χ is true.
23 For the exact relation between Bi-OT and the game-theoretical best-response dynamics Franke makes use of, see Franke 2009.

4.2 Franke’s game-theoretic solution
Game-theoretic and optimality-theoretic analyses of conversational implicatures seek to account in one systematic way for both scalar implicatures and for implicatures involving marked and unmarked meanings/interpretations, inspired by Horn’s division of pragmatic labor. In order to do so, they associate with an expression not just a semantic meaning, but also assign probabilities. According to the most straightforward proposal, the probability of w given φ is P(w | φ) = 1/card(φ) if w ∈ φ, and 0 otherwise. Recall that according to one standard approach pragmatic interpretation works as follows:
(8) Prag(φ) = {w ∈ φ | ¬∃ψ ∈ A(φ) : w ∈ ψ & ψ ⊂ φ}.
Figure 1   P(w | φ) for standard scale

P(w | φ)   u     v     w     Σ_w P(w | φ)
Some       1/3   1/3   1/3   = 1
Most       0     1/2   1/2   = 1
All        0     0     1     = 1

Figure 2   P(w | f) for counterfactual

P(w | f)       uφ≺ψ   vψ≺φ   wφ≈ψ   Σ_w P(w | f)
φ > χ          1/2    0      1/2    = 1
ψ > χ          0      1/2    1/2    = 1
(φ ∨ ψ) > χ    1/3    1/3    1/3    = 1
On the assumption that all worlds are equally likely, here is a straightforward way to reformulate (8) making use of probabilities:
(12) Prag′(φ) = {w ∈ φ | ¬∃ψ ∈ A(φ) : P(w | ψ) > P(w | φ)}
Look now at a standard example with All = {w} ⊂ Most = {v, w} ⊂ Some = {u, v, w}.24 From the assertion Some the desired implicature immediately follows, as can be seen from figure 1. The idea is that, for instance, Most is pragmatically interpreted as {v}, because (i) there is no world in which Most gets a higher value, and (ii) in v it is best to utter Most, because for all alternatives ψ, P(v | ψ) < P(v | Most). Let us now do the same for the sentence (φ ∨ ψ) > χ, together with its alternatives. Suppose that we have in the columns the alternative worlds (with uφ≺ψ standing for the world where the best φ-worlds are closer to u than any ψ-world), and that we assume that χ is true in the most similar worlds (but not in others). In that case we get figure 2 (where f is an arbitrary form, or expression).

24 With All abbreviating All Ps are Qs.
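The probabilistic rule Prag′ in (12) and the entries of figures 1 and 2 can be recomputed directly. The following sketch rests on assumptions of my own: each form is mapped to its set of worlds in a dictionary, the alternatives of a form are taken to be all other forms in that dictionary, and the ASCII form names are hypothetical labels.

```python
from fractions import Fraction

def p_world_given_form(meaning):
    """Naive hearer: uniform probability over the worlds where the form is true."""
    return {w: Fraction(1, len(meaning)) for w in meaning}

def prag_prime(form, meanings):
    """Rule (12): worlds of `form` where no alternative gives that world
    a strictly higher probability. Alternatives = all other listed forms."""
    table = {f: p_world_given_form(m) for f, m in meanings.items()}
    own = table[form]
    return {w for w in meanings[form]
            if not any(table[alt].get(w, 0) > own[w]
                       for alt in meanings if alt != form)}

SCALE = {"Some": {"u", "v", "w"}, "Most": {"v", "w"}, "All": {"w"}}
COUNTERFACTUAL = {
    "phi>chi": {"u", "w"},
    "psi>chi": {"v", "w"},
    "(phi or psi)>chi": {"u", "v", "w"},
}

some_reading = prag_prime("Some", SCALE)
most_reading = prag_prime("Most", SCALE)
disjunctive_reading = prag_prime("(phi or psi)>chi", COUNTERFACTUAL)
```

On the scale, Some is narrowed to {u} and Most to {v}. For the counterfactual, the disjunctive form receives the empty interpretation, because in every world some alternative has a higher value than 1/3.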
A number of things are worth remarking. First of all, all sentences are true in wφ≈ψ . As a result of this, a (naive) hearer will interpret, for instance, φ > χ as equally likely true in uφ≺ψ as in wφ≈ψ . Now take the speaker’s perspective. Which statement would, or should, she make given that she is in a particular situation, or world? Naturally, that statement that gives her the highest chance that the (naive) hearer will interpret the message correctly. Thus, she should utter that sentence which gives the highest number in the column. But this means that in uφ≺ψ she should (and rationally would) utter φ > χ, in vψ≺φ she should utter ψ > χ, and in wφ≈ψ it doesn’t matter what she utters, both are equally good. The boxed entries model this speaker’s choice. The important thing to note is that according to this reasoning, no speaker (a speaker in no world) would ever utter (φ ∨ ψ) > χ. Still, this is exactly the message that was uttered and should be interpreted, so we obviously missed something. Franke (2009) proposes that our reasoning didn’t go far enough. We should now take the hearer’s perspective again, taking into account the optimal speaker’s message choice given a naive semantic interpretation of the hearer.25 This can best be represented by modeling the probabilities of the messages sent according to the previous reasoning, given the situation/world that the speaker is in.26 How should the hearer now interpret the messages? Well, because the speaker would always send φ > χ in uφ≺ψ , while the chance that she sends φ > χ in wφ≈ψ is lower (and taking the a priori probabilities of the worlds to be equal), there is a higher chance that the speaker of φ > χ is in world uφ≺ψ than in wφ≈ψ , and thus the hearer will choose accordingly. This is represented by the boxed entry in figure 3 (in which P (f | w) stands for the probability with which the speaker would say f if she were in w). Something similar holds for ψ > χ. 
P(f | w)        u_{φ≺ψ}   v_{ψ≺φ}   w_{φ≈ψ}
φ > χ           1         0         1/2
ψ > χ           0         1         1/2
(φ ∨ ψ) > χ     0         0         0
Σ_f             1         1         1

Figure 3   P(f | w) for 1st-level hearer

As for (φ ∨ ψ) > χ, it is clear that all worlds are equally likely now, given that a previous speaker would not make this utterance in any of those worlds. Having specified how such a more sophisticated hearer would interpret the alternative utterances, we turn back to the speaker, but now assume that the speaker takes such a more sophisticated hearer into account. First we fill in the probabilities of the worlds, given the previous reasoning. Notice that these probabilities are crucially different from the earlier P(w | f). The speaker now chooses optimally given these probabilities: i.e., the speaker chooses (one of) the rows with the highest value in the column for her world. In u_{φ≺ψ} and v_{ψ≺φ} she would choose as before, but in w_{φ≈ψ} she now chooses (φ ∨ ψ) > χ instead of either of the others. This is again represented by boxed entries in figure 4. If we take the hearer’s perspective again, the iteration finally reaches a fixed point. As illustrated by figure 5, (φ ∨ ψ) > χ is now interpreted by the even more sophisticated hearer in the desired way. From the truth of (φ ∨ ψ) > χ, both φ > χ and ψ > χ pragmatically follow. Franke (2010, 2009) shows that by exactly the same reasoning free choice permissions are accounted for as well.²⁷ What is more, using exactly the same machinery he can even explain (by making use of global reasoning) why we infer from (φ ∨ ψ) > χ and ♦(φ ∨ ψ) that the alternatives (φ ∧ ψ) > χ and ♦(φ ∧ ψ) are not true, inferences that are sometimes taken to point to a local analysis of implicature calculation.

P(w | f)        u_{φ≺ψ}   v_{ψ≺φ}   w_{φ≈ψ}   Σ_w
φ > χ           1         0         0         = 1
ψ > χ           0         1         0         = 1
(φ ∨ ψ) > χ     1/3       1/3       1/3       = 1

Figure 4   P(w | f) for 1st-level speaker

25 Jäger & Ebert (2009) make a similar move. Both models are instances of Iterated Best Response (IBR) models.
26 For a more precise description, the reader should consult Franke 2009.
27 Franke uses standard deontic logic, but that doesn’t seem essential. Starting with one of the two more dynamic approaches, he could explain the free choice inference as well, using very similar reasoning.
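The back-and-forth reasoning just described can be sketched in a few lines of Python. This is only an illustrative reconstruction under stated assumptions, not Franke's own formalization: the message names, the truth sets in `MEANING` (in particular that (φ ∨ ψ) > χ is true in all three worlds), the restriction of speakers to true messages, and the treatment of never-sent messages as uniform over their literal meaning are all assumptions made here so that the tables of figures 3 to 5 come out.

```python
from fractions import Fraction

WORLDS = ["u", "v", "w"]  # u: phi closer, v: psi closer, w: equally close

# Assumed truth sets (hypothetical encoding chosen to match the tables).
MEANING = {
    "phi>chi": {"u", "w"},
    "psi>chi": {"v", "w"},
    "(phi or psi)>chi": {"u", "v", "w"},
}


def uniform_over(best, domain):
    """Spread probability 1 evenly over `best`, 0 elsewhere in `domain`."""
    share = Fraction(1, len(best))
    return {x: (share if x in best else Fraction(0)) for x in domain}


def naive_hearer():
    """P(w | f): each message is read as uniform over the worlds where it is true."""
    return {f: uniform_over(ws, WORLDS) for f, ws in MEANING.items()}


def speaker_reply(hearer):
    """P(f | w): in each world, send the true message(s) the hearer reads best."""
    table = {}
    for w in WORLDS:
        true_msgs = [f for f in MEANING if w in MEANING[f]]
        top = max(hearer[f][w] for f in true_msgs)
        best = {f for f in true_msgs if hearer[f][w] == top}
        table[w] = uniform_over(best, MEANING)
    return table


def hearer_reply(speaker):
    """P(w | f): pick the world(s) most likely to have produced f (equal priors)."""
    table = {}
    for f in MEANING:
        top = max(speaker[w][f] for w in WORLDS)
        # Assumption: a message that nobody sends is read by its literal meaning.
        best = MEANING[f] if top == 0 else {w for w in WORLDS if speaker[w][f] == top}
        table[f] = uniform_over(best, WORLDS)
    return table


s1 = speaker_reply(naive_hearer())  # speaker behaviour tabulated in figure 3
h1 = hearer_reply(s1)               # hearer behaviour tabulated in figure 4
s2 = speaker_reply(h1)              # speaker behaviour tabulated in figure 5
h2 = hearer_reply(s2)               # the hearer who faces the figure 5 speaker
print(h2["(phi or psi)>chi"])
```

Under these assumptions the last hearer assigns (φ ∨ ψ) > χ to the world w_{φ≈ψ} only, and applying `speaker_reply` to `h2` returns `s2` again, which is the fixed point mentioned in the text.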
P(f | w)        u_{φ≺ψ}   v_{ψ≺φ}   w_{φ≈ψ}
φ > χ           1         0         0
ψ > χ           0         1         0
(φ ∨ ψ) > χ     0         0         1
Σ_f             1         1         1

Figure 5   P(f | w) for 2nd-level hearer

4.3 A Bidirectional-like solution
Is there any relation between the above game-theoretic reasoning and the “received” analysis making use of pragmatic interpretation rule (8) or that of exhaustive interpretation, (10)? I will suggest that a “bidirectional” received view is at least very similar to Franke’s proposal sketched above, and does the desired work as well. In the above explanation, we started by looking at the semantic interpretation from the hearer’s point of view. This way of starting things was motivated by pragmatic interpretation rule (8):

(8) Prag(φ) = {w ∈ φ | ¬∃ψ ∈ A(φ) : w ∈ ψ & ψ ⊂ φ}.

But we could have started with the pragmatic interpretation rule (10) as well.

(10) Exh(φ) = {w ∈ φ | ¬∃v ∈ φ : v <_{A(φ)} w}.

In that case we wouldn’t have started from the hearer’s, but rather from the speaker’s point of view. This, too, would have given rise to a reformulation and a table, but now the probability function, P(ψ | w), gives the probabilities with which the speaker would have used the alternative expression ψ given the world w she is in. The naive assumption now is that P(ψ | w) is simply 1/card({χ ∈ A(φ) : w ⊨ χ}), if w ⊨ ψ, and 0 otherwise. The reformulation now looks as follows:

(13) Exh′(φ) = {w ∈ φ | ¬∃v : P(φ | v) > P(φ | w)}.
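The naive speaker probabilities and rule (13) can be computed directly. The following is a minimal sketch; the function names and the two small models (a three-point scale and the ♦(φ ∨ ψ) case, each with worlds u, v, w) are assumptions introduced here for illustration, with truth sets chosen to agree with the values in figures 6 and 7. The asserted sentence is counted among the competing expressions.

```python
from fractions import Fraction


def naive_speaker(meanings, worlds):
    """P(f | w) = 1 / (number of expressions true in w) if f is true in w, else 0."""
    table = {}
    for f, true_in in meanings.items():
        table[f] = {}
        for w in worlds:
            n_true = sum(1 for g in meanings.values() if w in g)
            table[f][w] = Fraction(1, n_true) if w in true_in else Fraction(0)
    return table


def exh_prime(f, meanings, worlds):
    """Rule (13): the worlds of f in which the naive speaker is most likely to use f."""
    row = naive_speaker(meanings, worlds)[f]
    best = max(row.values())
    return {w for w in meanings[f] if row[w] == best}


WORLDS = ["u", "v", "w"]

# Hypothetical models (assumed truth sets).
SCALE = {"some": {"u", "v", "w"}, "most": {"v", "w"}, "all": {"w"}}
MODAL = {
    "may phi": {"u", "w"},
    "may psi": {"v", "w"},
    "may (phi or psi)": {"u", "v", "w"},
}

print(sorted(exh_prime("some", SCALE, WORLDS)))
print(sorted(exh_prime("may (phi or psi)", MODAL, WORLDS)))
```

On these assumptions, "some" is interpreted as the world u in which neither "most" nor "all" holds, while ♦(φ ∨ ψ) comes out as {u, v}, the prediction that the following discussion sets out to improve on.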
For the simple scalar implicature, the table to start with from a naive speaker’s point of view is given in figure 6. Though the way of choosing would be different (it is the hearer now who chooses the column with the highest number), the result would be exactly the same. What would be the beginning table for our problematic sentence ♦(φ ∨ ψ)? It is given in figure 7.

P(φ | w)     u      v      w
Some         1      1/2    1/3
Most         0      1/2    1/3
All          0      0      1/3
Σ_f          1      1      1

Figure 6   P(φ | w) for standard scale

P(f | w)     u      v      w
♦φ           1/2    0      1/3
♦ψ           0      1/2    1/3
♦(φ ∨ ψ)     1/2    1/2    1/3
Σ_f          1      1      1

Figure 7   P(f | w) for 0th-level speaker

Just as we derived using the rule of exhaustive interpretation, the first prediction would be that ♦(φ ∨ ψ) is interpreted as {u, v}. To improve things, we have to look again at the hearer’s perspective. And, in fact, this could be done in Franke’s framework, and we end up with exactly the same desired solution. What this suggests is two things: (i) adopting the speaker’s and the hearer’s point of view closely corresponds with pragmatic interpretation rules (10) and (8), respectively; (ii) to correctly predict the pragmatic interpretation of ♦(φ ∨ ψ) we have to take both types of interpretation rules into account.

Recall the intuition as expressed in the previous subsection. That reasoning corresponded very closely to the following pragmatic interpretation rule:

Prag∗(φ) = {w ∈ Exh(φ) | ¬∃ψ ∈ A(φ) : w ∈ ψ_φ ∧ ψ_φ ⊂ φ},

where ψ_φ denotes ψ ∩ φ and ψ is taken not to be an element of A(φ).²⁸ Notice that this rule is close to interpretation rule (8), with the important difference that exhaustive interpretation (the speaker’s point of view) plays an important role. Unfortunately, just like the earlier Prag, this rule wrongly predicts that a sentence like ♦(φ ∨ ψ) doesn’t have a pragmatic interpretation (Prag∗(♦(φ ∨ ψ)) = ∅). For this reason we have to iterate,²⁹ although the intuition behind this new rule will remain the same: ♦(φ ∨ ψ) pragmatically entails ♦φ and ♦ψ, because if not, the speaker could have used an alternative expression which more accurately singled out the actual world.

In the following we will abbreviate the condition that ∀ψ ∈ A(φ) : v ∈ ψ ⇒ w ∈ ψ by v ≤_{A(φ)} w as before. If K is the set of worlds in which the sentence under consideration is true, I will also abbreviate ψ ∩ K by ψ_K. Moreover, ψ ≺_n φ will be an abbreviation for the condition ψ_{K_n} ⊂ φ_{K_n}, if n = 0, and ψ_{K_n} ⊆ φ_{K_n}, otherwise. Intuitively, ψ ≺_n φ expresses the fact that at least some worlds of the nth-level interpretation of φ could be expressed more precisely by alternative expression ψ. We will make use of the following definitions:³⁰
(14) Exh_n^{K_n}(φ) =def {w ∈ φ_{K_n} | ¬∃v ∈ φ_{K_n} : v <_{A_n(φ)} w}.

(15) Prag_n^{K_n}(φ) =def {w ∈ Exh_n^{K_n}(φ) | ¬∃ψ ∈ A_n(φ), w ∈ ψ_{K_n} & ψ ≺_n φ}.

(16) K_{n+1} =def {w ∈ φ_{K_n} | w ∉ Exh_n^{K_n}(φ)}.

(17) A_{n+1}(φ) =def {ψ ∈ A_n(φ) | ¬∃w ∈ Exh_n^{K_n}(φ), w ∈ ψ_{K_n} & ψ ≺_n φ}.
The pragmatic interpretation of φ with respect to a set of worlds K and alternative expressions A(φ), Prag^K(φ), will now be Prag_n^{K_n}(φ) for the first n such that Prag_n^{K_n}(φ) ≠ ∅. If there is no such n, Prag^K(φ) = Exh_0^{K_0}(φ), where K_0 = K and A_0(φ) = A(φ).

28 Notice that if w ∈ Prag∗(φ), one can think of the pair ⟨φ, w⟩ as, using bidirectional OT terminology, a strong optimal form-meaning pair.
29 In OT terminology, we have to look at the notion of weak optimality.
30 I won’t try to prove this here, but I believe that the analysis would be almost equivalent to Franke’s game-theoretic approach, if we redefined the definitions of the orderings ‘v
Notice that (14) and (15) are just the straightforward generalizations, with respect to a set of worlds K, of the standard exhaustive interpretation rule (10) and pragmatic interpretation rule (8), respectively.

(10) Exh^K(φ) = {w ∈ φ_K | ¬∃v ∈ φ_K : v <_{A(φ)} w}.

(8) Prag(φ) = {w ∈ φ | ¬∃ψ ∈ A(φ) : w ∈ ψ & ψ ⊂ φ}.
The only difference between (14) and (10) is that the relevant set of worlds and the relevant set of alternatives might depend on earlier stages in the interpretation. If we limit ourselves to the first interpretation (i.e., level 0), the two interpretation rules are identical. Similarly for the difference between (15) and (8): the relevant alternatives depend on earlier stages, and the set of worlds with respect to which the entailment relation between ψ and φ must be determined depends on earlier stages as well. Indeed, if we look at the first interpretation, the only important difference is that (15) takes as input the exhaustive interpretation of φ, while this is not the case for (8). This difference implements the view that speaker’s and hearer’s perspective are both required. The definitions (16) and (17) determine which worlds and alternative expressions are relevant for the interpretation at the (n + 1)th level of interpretation. We start with interpretation 0 (the first interpretation). Notice first that level 1 is only reached in case Prag_0^{K_0}(φ) = ∅, i.e., in case for each world v in the exhaustive interpretation of φ there is an alternative expression ψ that is true in v and which is stronger than φ. Thus, in that case there is no world v ∈ Exh(φ) such that φ is at least as specific as any other alternative that is true in v. For the interpretation of φ at level 1 we will not consider worlds in the 0th-level exhaustive interpretation of φ anymore. This is what (16) implements. The new set of alternatives determined by (17) consists of those elements of the original set of alternatives A_0 that did not help to eliminate worlds in Exh(φ) at the 0th level of interpretation.

Let us see how things work out for some particular examples. Let us first look at ♦(φ ∨ ψ) with A(♦(φ ∨ ψ)) = {♦φ, ♦ψ, ♦(φ ∧ ψ)}, and assume that K = {u, v, w, x}, ♦(φ ∨ ψ) = {u, v, w, x}, ♦φ = {u, w, x}, ♦ψ = {v, w, x}, and ♦(φ ∧ ψ) = {x}. Observe that Exh_0^{K_0}(♦(φ ∨ ψ)) = {u, v}. But neither u nor v can be an element of Prag_0^{K_0}(♦(φ ∨ ψ)), because ♦φ_{K_0} ⊂ ♦(φ ∨ ψ) and ♦ψ_{K_0} ⊂ ♦(φ ∨ ψ). It follows that Prag_0^{K_0}(♦(φ ∨ ψ)) = ∅. We continue, and calculate K_1 and A_1(♦(φ ∨ ψ)). The new set of worlds we have to consider, K_1, is just K − Exh_0^{K_0}(♦(φ ∨ ψ)) = {w, x}. The new
set of alternatives, A_1(♦(φ ∨ ψ)), is just {♦(φ ∧ ψ)}. Now, we have to determine Exh_1^{K_1}(♦(φ ∨ ψ)) and Prag_1^{K_1}(♦(φ ∨ ψ)). Because K_1 = {w, x} and ♦(φ ∧ ψ) is only true in x, both will be {w}. But this means that also Prag^K(♦(φ ∨ ψ)) = {w}, and thus that we can pragmatically infer both ♦φ and ♦ψ from the assertion that ♦(φ ∨ ψ), as desired. A very similar calculation shows that we can pragmatically infer both φ > χ and ψ > χ from the assertion that (φ ∨ ψ) > χ. What’s more, we have even explained why we can pragmatically infer from ♦(φ ∨ ψ) that the alternative ♦(φ ∧ ψ) is not true, just as Franke (2009) could.

These predictions are exactly as desired, but how does our machinery work for simpler examples, like φ ∨ ψ? Fortunately, it predicts correctly here as well. First, assume that φ ∨ ψ = {u, v, w} = K, φ = {u, w}, ψ = {v, w}, and φ ∧ ψ = {w}. Observe that Exh_0^{K_0}(φ ∨ ψ) = {u, v}. On the basis of these facts, we can conclude that Prag_0^{K_0}(φ ∨ ψ) = ∅. This is just the same reasoning as before. The difference shows up when we go to the next level and determine Prag_1^{K_1}(φ ∨ ψ), because now there will be an alternative left over which plays a crucial role. But first calculate K_1 and A_1(φ ∨ ψ): K_1 = {w} and A_1(φ ∨ ψ) = {φ ∧ ψ}. Obviously, Exh_1^{K_1}(φ ∨ ψ) = {w}, but because w ∈ φ ∧ ψ, it follows that (φ ∧ ψ) ≺_1 (φ ∨ ψ), and thus Prag_1^{K_1}(φ ∨ ψ) = ∅. It follows that K_2 = ∅, from which we can conclude that Prag^K(φ ∨ ψ) = Exh_0^{K_0}(φ ∨ ψ) = {u, v}, as desired.

Let us now see what happens if we look at multiple occurrences of disjunctions: examples like φ ∨ ψ ∨ χ, ♦(φ ∨ ψ ∨ χ), and (φ ∨ ψ ∨ χ) > ξ. First look at φ ∨ ψ ∨ χ and assume that φ = {w_1, w_4, w_5, w_7}, ψ = {w_2, w_4, w_6, w_7}, and χ = {w_3, w_5, w_6, w_7}. Observe that Exh_0^{K_0}(φ ∨ ψ ∨ χ) = {w_1, w_2, w_3}. On the basis of these facts, we can conclude that Prag_0^{K_0}(φ ∨ ψ ∨ χ) = ∅.
If only the separate disjuncts were alternatives of φ ∨ ψ ∨ χ, it would result that K_1 = {w_4, w_5, w_6, w_7}, which would then also be the inferred pragmatic interpretation. We thus have to conclude that we need other alternatives as well. It is only natural to assume that φ ∧ ψ, φ ∧ χ, ψ ∧ χ, and φ ∧ ψ ∧ χ are alternatives too. In that case K_1 is still {w_4, w_5, w_6, w_7}, but now the new set of alternatives is {φ ∧ ψ, φ ∧ χ, ψ ∧ χ, φ ∧ ψ ∧ χ}, and the resulting pragmatic meaning will be different. In particular, Exh_1^{K_1}(φ ∨ ψ ∨ χ) = {w_4, w_5, w_6}. However, none of these worlds remains in Prag_1^{K_1}(φ ∨ ψ ∨ χ), because w_4 ∈ φ ∧ ψ, which is a stronger expression than φ ∨ ψ ∨ χ, and similarly for w_5 and w_6. This means we have to go to the next level, where K_2 = {w_7}. But w_7 won’t be in Prag_2^{K_2}(φ ∨ ψ ∨ χ), because w_7 ∈ φ ∧ ψ ∧ χ = {w_7}. As a result, Prag^K(φ ∨ ψ ∨ χ) = Exh_0^{K_0}(φ ∨ ψ ∨ χ) = {w_1, w_2, w_3}, just as desired.
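The calculations above follow definitions (14) to (17) mechanically, so they can be checked with a short program. The sketch below is one possible reading of those definitions, assuming finite sets of worlds; the function names, the string labels for alternatives, and the encoding of w_1 to w_7 as the integers 1 to 7 are conventions introduced here, not part of the analysis itself.

```python
def leq(v, w, alts):
    """v <=_A w: every alternative true in v is also true in w."""
    return all(w in a for a in alts.values() if v in a)


def exh(phi_k, alts):
    """Worlds of phi_k that are minimal with respect to the strict order <_A."""
    return {w for w in phi_k
            if not any(leq(v, w, alts) and not leq(w, v, alts) for v in phi_k)}


def precedes(a, phi_k, K_n, n):
    """psi precedes phi at level n: proper subset at level 0, subset afterwards."""
    a_k = a & K_n
    return a_k < phi_k if n == 0 else a_k <= phi_k


def prag(phi, alts, K):
    """Iterate (14)-(17) until some level yields a non-empty interpretation."""
    K_n, A_n, n = set(K), dict(alts), 0
    exh_0 = exh(phi & K_n, A_n)          # fallback if no level succeeds
    while phi & K_n:
        phi_k = phi & K_n
        e = exh(phi_k, A_n)              # definition (14)
        blockers = [a for a in A_n.values() if precedes(a, phi_k, K_n, n)]
        p = {w for w in e if not any(w in a for a in blockers)}  # definition (15)
        if p:
            return p
        # definition (17): drop alternatives that eliminated a world of e
        A_n = {name: a for name, a in A_n.items()
               if not (precedes(a, phi_k, K_n, n) and e & a)}
        K_n = phi_k - e                  # definition (16)
        n += 1
    return exh_0


# Free choice permission: may(phi or psi) over K = {u, v, w, x}.
K4 = {"u", "v", "w", "x"}
MAY = {"may phi": {"u", "w", "x"}, "may psi": {"v", "w", "x"}, "may both": {"x"}}

# Plain disjunction: phi or psi over K = {u, v, w}.
K3 = {"u", "v", "w"}
OR = {"phi": {"u", "w"}, "psi": {"v", "w"}, "both": {"w"}}

# Three disjuncts, worlds w_1..w_7 encoded as 1..7.
K7 = set(range(1, 8))
DISJ = {"phi": {1, 4, 5, 7}, "psi": {2, 4, 6, 7}, "chi": {3, 5, 6, 7}}
CONJ = {"phi&psi": {4, 7}, "phi&chi": {5, 7}, "psi&chi": {6, 7}, "all": {7}}

print(sorted(prag(K4, MAY, K4)))
print(sorted(prag(K3, OR, K3)))
print(sorted(prag(K7, {**DISJ, **CONJ}, K7)))
```

On this reading the three calls return {w}, {u, v} and {w_1, w_2, w_3}, matching the results derived in the text, and calling `prag(K7, DISJ, K7)` with only the separate disjuncts as alternatives gives {w_4, w_5, w_6, w_7}, the unwanted interpretation that motivates adding the conjunctive alternatives.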
What about ♦(φ ∨ ψ ∨ χ), for instance? Once again we have to make a closure assumption concerning the alternatives. As it turns out, the correct way to go is also the most natural one: first, A(♦φ) = {♦ψ : ψ ∈ A(φ)}, and second, A(φ ∨ ψ ∨ χ) = {φ, ψ, χ, φ ∧ ψ, φ ∧ χ, ψ ∧ χ, φ ∧ ψ ∧ χ, φ ∨ ψ, φ ∨ χ, ψ ∨ χ}. Thus, at the “local” level, the alternatives are closed under disjunction as well. Let us now assume that ♦φ = {w_1, w_4, w_5, w_7}, ♦ψ = {w_2, w_4, w_6, w_7}, and ♦χ = {w_3, w_5, w_6, w_7}. Let’s assume for simplicity that in none of these worlds any conjunctive permission like ♦(φ ∧ ψ) is true. Observe that Exh_0^{K_0}(♦(φ ∨ ψ ∨ χ)) = {w_1, w_2, w_3}. It follows that K_1 = {w_4, w_5, w_6, w_7} and the new set of alternatives is the earlier set minus {♦φ, ♦ψ, ♦χ}. The new exhaustive interpretation will be Exh_1^{K_1}(♦(φ ∨ ψ ∨ χ)) = {w_4, w_5, w_6}, but all these worlds are ruled out for Prag_1^{K_1}(♦(φ ∨ ψ ∨ χ)) because of our disjunctive alternatives. This means that we have to go to the next level. At level 2, the new set of worlds is just {w_7}, which is thus also Exh_2^{K_2}(♦(φ ∨ ψ ∨ χ)). World w_7 cannot be eliminated by a more precise alternative, which means that also Prag_2^{K_2}(♦(φ ∨ ψ ∨ χ)) = {w_7}, which is what Prag^K(♦(φ ∨ ψ ∨ χ)) will then denote as well. Notice that in w_7 all of ♦φ, ♦ψ, and ♦χ are true: the desired free choice inference. Similar reasoning applies to (φ ∨ ψ ∨ χ) > ξ.

These calculations have made clear that to account for free choice permission, we have to make use of exhaustive interpretation several times. In this sense our account is similar to the analysis proposed by Fox (2007). Still, there are some important differences. One major difference is that Fox (2007) exhaustifies not only the sentence that is asserted, but also the relevant alternatives. Moreover, Fox uses exhaustification to turn alternatives into other alternatives, thereby “syntacticising” the process. We don’t do anything like this, and therefore feel that what we do is more in line with the Gricean approach. Exhaustification always means looking at “minimal” worlds: we don’t change the alternatives. The worst that can happen to them is that they are declared not to be relevant anymore to determine the pragmatic interpretation. Notice that our analysis also immediately explains why it is appropriate to use “any” under ♦, but not under □: whereas ♦(φ ∨ ψ ∨ χ) pragmatically entails ♦(φ ∨ χ), □(φ ∨ ψ ∨ χ) does not pragmatically entail □(φ ∨ χ).

It is easy to see that our analysis can account for the “free choice” inference of the existential sentence as well: that from Several of my cousins had cherries or strawberries we naturally infer that some of the cousins had cherries and some had
strawberries.³¹ In formulas, from ∃x(Px ∧ (Qx ∨ Rx)) we can pragmatically infer that both ∃x(Px ∧ Qx) and ∃x(Px ∧ Rx) are true. But this shows that yet another “paradoxical” conjunctive reading of disjunctive sentences can be accounted for as well.³² If we analyze comparatives as proposed by Larson (1988), for instance, it is predicted that John is taller than Mary or Sue should be represented as something like ∃d[d(T)(j) ∧ (¬d(T)(m) ∨ ¬d(T)(s))], with d a measure function from (denotations of) adjectives to sets of individuals. Pragmatically we can infer from this that John is taller than Mary and that John is taller than Sue.

Chemla (2009) argued that sentences of the form ∀x♦(Px ∨ Qx) give rise to inferences that are more problematic to account for by globalist approaches towards conversational implicatures than by localist approaches. He found that people inferred from Everybody is allowed to take Algebra or Literature that everybody can choose which of the two they will take. This suggests that in general we infer from ∀x♦(Px ∨ Qx) both ∀x♦Px and ∀x♦Qx. In their commentary article, Geurts & Pouscoulous (2009b) suggested, however, that the observed “conjunctive” inference might very well depend on the particular construction being used, and thus be less general than predicted by a localist approach. Moreover, they suggest that universal permission sentences are just summaries of permissions of the form ♦(φ ∨ ψ) made to multiple addressees, in which case the data can be explained by any global analysis that can explain standard free choice permissions. I don’t know what the appropriate analysis of these inferences is. I can point out, however, what we would have to add to our analysis to account for the conjunctive interpretation. If this conjunctive interpretation really depends on the particular construction being used (as suggested by Geurts and Pouscoulous), then it would be wise not to make use of this addition.
As it turns out, our approach predicts the conjunctive interpretation if we include ∃x♦(Px ∨ Qx) among the alternatives, and if in the definition of ψ ≺_n φ we replace the notion ψ_{K_n} by the pragmatic interpretation of ψ, Prag^K(ψ).³³ The crucial step in this case is the one in which a minimal world where ∃x♦Px and ∃x♦Qx are true but both ∀x♦Px and ∀x♦Qx false is eliminated, because such a world could be more accurately expressed (given the truth of ∀x♦(Px ∨ Qx)) by the alternative ∃x♦(Px ∨ Qx). While the inclusion of ∃x♦(Px ∨ Qx) among the alternatives of ∀x♦(Px ∨ Qx) is not a significant change to our framework, it has to be admitted that replacing the notion ψ_{K_n} by the pragmatic interpretation of ψ is significant. From an intuitive point of view, the effect of this replacement would be that we do not only look at the exhaustive interpretation of φ, the sentence asserted, but also at the exhaustive interpretations of the alternatives. As a result, our analysis would become much closer to the proposal of Fox (2007). But, as mentioned above, if we were to adopt the suggestion of Geurts & Pouscoulous (2009b), this would, in fact, not be the way to go.

31 I believe that Nathan Klinedinst and Regine Eckhardt were the first to observe that these inferences should go through. Perhaps it should be pointed out that Schulz (2003) could straightforwardly account for these inferences as well.
32 This observation is due to Krasikova (2007), though she uses Fox’s analysis of free choice inferences.
33 Thus, ψ ≺_n φ will be an abbreviation for the condition ψ_{K_n} ⊂ φ_{K_n}, if n = 0, and Prag^K(ψ) ⊆ φ_{K_n}, otherwise, where Prag^K(ψ) is, as before, Prag_n^{K_n}(ψ) for the first n such that Prag_n^{K_n}(ψ) ≠ ∅.

5 Conclusion

The papers of Geurts & Pouscoulous (2009a) and Chemla (2009) provide strong empirical evidence that sentences in which a trigger of a scalar implicature occurs under a universal do not in general give rise to an embedded implicature. This evidence favors a globalist analysis of conversational implicatures over its localist alternative. As far as I know, it is uncontroversial that triggers occurring under an existential do give rise to implicatures. In this paper, and following Franke (2010, 2009), I discussed some ways in which these challenging examples for a “globalist” analysis of conversational implicatures could be given a principled global pragmatic explanation after all. I suggested how potentially problematic examples for our global pragmatic analysis of the form ∀x♦(Px ∨ Qx), as discussed by Chemla (2009), could be treated as well. At least two things have to be admitted, though. First, our global analysis still demands that the alternatives are calculated locally. I don’t think this is a major concession to localists.
Second, according to Zimmermann (2000), even a disjunctive permission of the form You may do φ or you may do ψ gives rise to the free choice inference, and according to Merin (1992) a conjunctive permission of the form You may do φ and ψ allows the addressee to perform only φ. I have no idea how to pragmatically account for those intuitions without reinterpreting the semantics of conjunction as well as disjunction. If our analysis is acceptable, it points to the direction in which richer pragmatic theories have to go: (i) we have to take both the speaker’s and the hearer’s perspective into account, and (ii) one-step inferences (or strong Bi-OT) are not enough, more reasoning has to
be taken into account (i.e., weak Bi-OT, or iteration). These are what I take to be the main messages of this paper.

References

Aloni, Maria. 2007. Expressing ignorance or indifference: Modal implicatures in Bi-directional OT. In Balder ten Cate & Henk Zeevat (eds.), Logic, Language, and Computation: 6th International Tbilisi Symposium on Logic, Language, and Computation (TbiLLC 2005) (Lecture Notes in Computer Science 4363), 1–20. Berlin & Heidelberg: Springer. doi:10.1007/978-3-54075144-1.
Asher, Nicholas & Daniel Bonevac. 2005. Free choice permission is strong permission. Synthese 145(3). 303–323. doi:10.1007/s11229-005-6196-z.
Barker, Richard. 2010. Free choice permission as resource-sensitive reasoning. Semantics and Pragmatics 3(10). 1–38. doi:10.3765/sp.3.10.
Benz, Anton, Gerhard Jäger & Robert van Rooij. 2005. Games and pragmatics (Palgrave Studies in Pragmatics, Language and Cognition). Houndmills, Basingstoke & Hampshire: Palgrave Macmillan.
Blutner, Reinhard. 2000. Some aspects of optimality in natural language interpretation. Journal of Semantics 17(3). 189–216. doi:10.1093/jos/17.3.189.
Chemla, Emmanuel. 2008. Similarity: Towards a unified account of scalar implicatures, free choice permission and presupposition projection. Ms, Ecole Normale Supérieure & MIT. http://www.semanticsarchive.net/Archive/WI1ZTU3N/Chemla-SIandPres.html.
Chemla, Emmanuel. 2009. Universal implicatures and free choice effects: Experimental data. Semantics and Pragmatics 2(2). 1–33. doi:10.3765/sp.2.2.
Chierchia, Gennaro. 2004. Scalar implicatures, polarity phenomena, and the syntax/pragmatics interface. In Adriana Belletti (ed.), Structures and beyond (Oxford Studies in Comparative Syntax: The Cartography of Syntactic Structures 3), 39–103. Oxford: Oxford University Press.
Chierchia, Gennaro. 2006. Broadening your views: Implicatures of domain widening and the “logicality” of language. Linguistic Inquiry 37(4). 535–590. doi:10.1162/ling.2006.37.4.535.
Fine, Kit. 1975. Critical notice of Lewis 1973. Mind, New Series 84(335). 451–458. doi:10.1093/mind/LXXXIV.1.451.
Fox, Danny. 2007. Free choice and the theory of scalar implicatures. In Uli Sauerland & Penka Stateva (eds.), Presupposition and implicature in compositional semantics (Palgrave Studies in Pragmatics, Language and Cognition), 71–120. Houndmills, Basingstoke & Hampshire: Palgrave Macmillan.
Franke, Michael. 2009. Signal to act: Game theory in pragmatics. Amsterdam: University of Amsterdam dissertation. http://www.illc.uva.nl/Publications/Dissertations/DS-2009-11.text.pdf.
Franke, Michael. 2010. Free choice from iterated best response. In Maria Aloni, Katrin Schulz, Harald Bastiaanse & Tikitu de Jager (eds.), Logic, Language and Meaning: 17th Amsterdam Colloquium (Lecture Notes in Computer Science 6042), 267–276. Berlin and Heidelberg: Springer. To appear.
Geurts, Bart. 2010. Quantity implicatures. Cambridge: Cambridge University Press. To appear.
Geurts, Bart & Nausicaa Pouscoulous. 2009a. Embedded implicatures?!? Semantics and Pragmatics 2(4). 1–34. doi:10.3765/sp.2.4.
Geurts, Bart & Nausicaa Pouscoulous. 2009b. Free choice for all: A response to Emmanuel Chemla. Semantics and Pragmatics 2(5). 1–10. doi:10.3765/sp.2.5.
Groenendijk, Jeroen & Martin Stokhof. 1984. Studies in the semantics of questions and the pragmatics of answers. Amsterdam: University of Amsterdam dissertation. http://dare.uva.nl/record/123669.
Harel, David. 1984. Dynamic logic. In Dov Gabbay & Franz Guenthner (eds.), Handbook of philosophical logic, vol. 2, 497–604. Dordrecht: D. Reidel.
Jäger, Gerhard & Christian Ebert. 2009. Pragmatic rationalizability. In Arndt Riester & Torgrim Solstad (eds.), Sinn und Bedeutung (SuB13) (SinSpecC 5), 1–15. Stuttgart. http://www.ims.uni-stuttgart.de/projekte/sfb-732/sinspec/sub13/jaegerEbert.pdf.
Kamp, Hans. 1973. Free choice permission. Proceedings of the Aristotelian Society, New Series 74. 57–74.
Kamp, Hans. 1979. Semantics versus pragmatics. In Franz Guenther & Siegfried Schmidt (eds.), Formal semantics and pragmatics for natural languages (Studies in Linguistics and Philosophy 4), 255–287. Berlin & Heidelberg: Springer.
Kanger, Stig. 1981. New foundations for ethical theory. In Risto Hilpinen (ed.), New studies in deontic logic: Norms, actions and the foundations of ethics (Synthese Library 152), 36–58. Berlin & Heidelberg: Springer.
Krasikova, Sveta. 2007. Quantification in than-clauses. In Maria Aloni, Paul Dekker & Floris Roelofsen (eds.), Sixteenth Amsterdam Colloquium, 133–138. Amsterdam: ILLC. doi:10.1.1.156.7902.
Kratzer, Angelika & Junko Shimoyama. 2002. Indeterminate pronouns: The view from Japanese. In Yukio Otsu (ed.), The Third Tokyo Conference on Psycholinguistics (TCP3), 1–25. Tokyo: Hituzi Syobo.
Landman, Fred. 2000. Events and plurality: The Jerusalem lectures. Dordrecht: Kluwer.
Larson, Richard. 1988. Scope and comparatives. Linguistics and Philosophy 11(1). 1–26. doi:10.1007/BF00635755.
Leibniz, Gottfried. 1930. Elementa iuris naturalis. In Preussische Akademie der Wissenschaften (ed.), Gottfried Wilhelm Leibniz: Sämtliche Schriften und Briefe. Sechste Reihe: Philosophische Schriften, vol. 1, 431–485. Darmstadt: Otto Reichl Verlag.
Levinson, Stephen. 2000. Presumptive meanings: The theory of generalized conversational implicature. Cambridge: MIT Press.
Lewis, David. 1973. Counterfactuals. Oxford: Blackwell.
Lewis, David. 1979. A problem about permission. In Esa Saarinen, Risto Hilpinen, Ilkka Niiniluoto & Merril Provence Hintikka (eds.), Essays in honor of Jaakko Hintikka: On the occasion of his fiftieth birthday on January 12, 1979, 163–175. Dordrecht: D. Reidel.
McCawley, James. 1993. Everything that linguists always wanted to know about logic∗. Chicago: The University of Chicago Press 2nd edn.
Merin, Arthur. 1992. Permission sentences stand in the way of Boolean and other lattice-theoretic semantics. Journal of Semantics 9(2). 95–152. doi:10.1093/jos/9.2.95.
Portner, Paul. 2007. Imperatives and modals. Natural Language Semantics 14(4). 351–383. doi:10.1007/211050-070-9022-y.
van Rooij, Robert. 2000. Permission to change. Journal of Semantics 17(2). 119–143. doi:10.1093/jos/17.2.119.
van Rooij, Robert. 2006. Free choice counterfactual donkeys. Journal of Semantics 23(4). 383–402. doi:10.1093/jos/ffl004.
van Rooij, Robert & Katrin Schulz. 2004. Exhaustive interpretation of complex sentences. Journal of Logic, Language, and Information 13(4). 491–519. doi:10.1007/s10849-004-2118-6.
Sæbø, Kjell Johan. 2004. Optimal interpretations of permission sentences. In Rusudan Asatiani, Kata Balogh, Dick de Jongh, George Chikoize & Paul Dekker (eds.), The Fifth Tbilisi Symposium on Language, Logic and Computation (TbiLLC 2003), 137–144. Amsterdam and Tbilisi: ILLC/CLLS.
Sauerland, Uli. 2004. Scalar implicatures of complex sentences. Linguistics and Philosophy 27(3). 367–391. doi:10.1023/B:LING.0000023378.71748.db.
Schulz, Katrin. 2003. You may read it now or later: A case study on the paradox of free choice permission. Amsterdam: University of Amsterdam MA thesis. http://www.illc.uva.nl/Publications/ResearchReports/MoL-2004-01.text.pdf.
Schulz, Katrin. 2005. A pragmatic solution for the paradox of free choice permission. Synthese: Knowledge, Rationality and Action 147(2). 343–377. doi:10.1007/s11229-005-1353-y.
Schulz, Katrin & Robert van Rooij. 2006. Pragmatic meaning and nonmonotonic reasoning: The case of exhaustive interpretation. Linguistics and Philosophy 29(2). 205–250. doi:10.1007/s10988-005-3760-4.
Spector, Benjamin. 2003. Scalar implicatures: Exhaustivity and Gricean reasoning. In Balder ten Cate (ed.), Eighth ESSLLI Student Session (European Summer School in Logic, Language and Information), 277–288. Vienna. http://www.cs.ucsc.edu/~btencate/esslli03/stus2003proc.pdf.
Spector, Benjamin. 2006. Aspects de la pragmatique des operateurs logiques. Paris: University of Paris VII dissertation. http://cognition.ens.fr/~bspector/THESE_SPECTOR/THESE_SPECTOR_AVEC_ANNEXE2.pdf.
von Wright, G. H. 1950. Deontic logic. Mind 60(237). 1–15. doi:10.1093/mind/LX.237.1.
Zimmermann, Thomas Ede. 2000. Free choice disjunction and epistemic possibility. Natural Language Semantics 8(4). 255–290. doi:10.1023/A:1011255819284.
Robert van Rooij Nieuwe Doelenstraat 15 1015 CP Amsterdam Amsterdam the Netherlands [email protected]