The Grammar of Polarity: Pragmatics, Sensitivity, and the Logic of Scales (Cambridge Studies in Linguistics)


Author: Michael Israel


The Grammar of Polarity

Many, and perhaps all, languages include constructions which are sensitive to the expression of polarity: that is, negative polarity items, which cannot occur in affirmative clauses, and positive polarity items, which cannot occur in negatives. Although relatively unknown outside of linguistics, the phenomenon of polarity sensitivity has been an important source of evidence for theories about the mental architecture of grammar over the last fifty years, and to many the oddly dysfunctional sensitivities of polarity items have seemed to support a view of grammar as an encapsulated mental module fundamentally unrelated to other aspects of human cognition or communicative behavior. This book draws on insights from cognitive/functional linguistics and formal semantics to argue that, on the contrary, the grammar of sensitivity is grounded in a very general human cognitive ability to form categories and draw inferences based on scalar alternatives, and in the ways this ability is deployed for rhetorical effects in ordinary interpersonal communication. The book surveys a wide variety of polarity items, both negative and positive, commonly found in English and other languages and shows that grammatical sensitivities arise regularly and only in semantic domains which are inherently scalar.

Michael Israel is Associate Professor of English Language at the University of Maryland, College Park.

In this series

81. Roger Lass: Historical linguistics and language change
82. John M. Anderson: A notional theory of syntactic categories
83. Bernd Heine: Possession: cognitive sources, forces and grammaticalization
84. Nomi Erteschik-Shir: The dynamics of focus structure
85. John Coleman: Phonological representations: their names, forms and powers
86. Christina Y. Bethin: Slavic prosody: language change and phonological theory
87. Barbara Dancygier: Conditionals and prediction
88. Claire Lefebvre: Creole genesis and the acquisition of grammar: the case of Haitian creole
89. Heinz Giegerich: Lexical strata in English
90. Keren Rice: Morpheme order and semantic scope
91. April McMahon: Lexical phonology and the history of English
92. Matthew Y. Chen: Tone Sandhi: patterns across Chinese dialects
93. Gregory T. Stump: Inflectional morphology: a theory of paradigm structure
94. Joan Bybee: Phonology and language use
95. Laurie Bauer: Morphological productivity
96. Thomas Ernst: The syntax of adjuncts
97. Elizabeth Closs Traugott and Richard B. Dasher: Regularity in semantic change
98. Maya Hickmann: Children’s discourse: person, space and time across languages
99. Diane Blakemore: Relevance and linguistic meaning: the semantics and pragmatics of discourse markers
100. Ian Roberts and Anna Roussou: Syntactic change: a minimalist approach to grammaticalization
101. Donka Minkova: Alliteration and sound change in early English
102. Mark C. Baker: Lexical categories: verbs, nouns and adjectives
103. Carlota S. Smith: Modes of discourse: the local structure of texts
104. Rochelle Lieber: Morphology and lexical semantics
105. Holger Diessel: The acquisition of complex sentences
106. Sharon Inkelas and Cheryl Zoll: Reduplication: doubling in morphology
107. Susan Edwards: Fluent aphasia
108. Barbara Dancygier and Eve Sweetser: Mental spaces in grammar: conditional constructions
109. Matthew Baerman, Dunstan Brown and Greville G. Corbett: The syntax-morphology interface: a study of syncretism
110. Marcus Tomalin: Linguistics and the formal sciences: the origins of generative grammar
111. Samuel D. Epstein and T. Daniel Seely: Derivations in minimalism
112. Paul de Lacy: Markedness: reduction and preservation in phonology
113. Yehuda N. Falk: Subjects and their properties
114. P. H. Matthews: Syntactic relations: a critical survey
115. Mark C. Baker: The syntax of agreement and concord
116. Gillian Catriona Ramchand: Verb meaning and the lexicon: a first phase syntax
117. Pieter Muysken: Functional categories
118. Juan Uriagereka: Syntactic anchors: on semantic structuring
119. D. Robert Ladd: Intonational phonology, second edition
120. Leonard H. Babby: The syntax of argument structure
121. B. Elan Dresher: The contrastive hierarchy in phonology
122. David Adger, Daniel Harbour and Laurel J. Watkins: Mirrors and microparameters: phrase structure beyond free word order
123. Niina Ning Zhang: Coordination in syntax
124. Neil Smith: Acquiring phonology
125. Nina Topintzi: Onsets: suprasegmental and prosodic behaviour
126. Cedric Boeckx, Norbert Hornstein and Jairo Nunes: Control as movement
127. Michael Israel: The grammar of polarity: pragmatics, sensitivity, and the logic of scales

Earlier issues not listed are also available

CAMBRIDGE STUDIES IN LINGUISTICS

General Editors: P. Austin, J. Bresnan, B. Comrie, S. Crain, W. Dressler, C. J. Ewen, R. Lass, D. Lightfoot, K. Rice, I. Roberts, S. Romaine, N. V. Smith

The Grammar of Polarity

THE GRAMMAR OF POLARITY

PRAGMATICS, SENSITIVITY, AND THE LOGIC OF SCALES

Michael Israel
University of Maryland, College Park

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Tokyo, Mexico City

Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK

Published in the United States of America by Cambridge University Press, New York

www.cambridge.org
Information on this title: www.cambridge.org/9780521792400

© Cambridge University Press 2011

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2011

Printed in the United Kingdom at the University Press, Cambridge

A catalogue record for this publication is available from the British Library

Library of Congress Cataloguing in Publication data
Israel, Michael, 1965–
The grammar of polarity : pragmatics, sensitivity, and the logic of scales / Michael Israel.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-521-79240-0
1. Polarity (Linguistics) 2. Grammar, Comparative and general–Negatives. 3. Grammar, Comparative and general–Syntax. 4. Semantics. I. Title.
P299.N4.I87 2011
415–dc22 2010051866

ISBN 978-0-521-79240-0 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

The more that I philosophize
The more and more I realize
That little things which I despise,
Like peanut shells and grains of sand,
Are very hard, hard to understand.

Delmer Israel, To Harry F. Harlow

Contents

List of figures
List of tables
Acknowledgments
List of abbreviations

1 Trivium pursuits
  1.1 As above, so below
  1.2 A quirk of grammar or a trick of thought?
  1.3 The hypothesis: sensitivity as lexical pragmatics
  1.4 Putting pragmatics in its place
  1.5 Pragmatics in a usage-based grammar

2 Ex nihilo: the grammar of polarity
  2.1 The simplicity of negation
  2.2 The complexity of polarity
  2.3 The phenomenon of polarity sensitivity
  2.4 Basic mysteries: three problems of polarity sensitivity
  2.5 Varieties of polarity sensitivity
  2.6 The Scalar Model of polarity sensitivity

3 Licensing and the logic of scalar models
  3.1 What is a polarity context?
  3.2 Fauconnier’s insight
  3.3 The natural logic of scalar models
  3.4 Affectivity as a mode of scalar construal
  3.5 Syntactic constraints on scalar construals
  3.6 Polarity contexts are mental spaces

4 Sensitivity as inherent scalar semantics
  4.1 Scalar operators
  4.2 Two scalar properties
  4.3 Four sorts of polarity items
  4.4 Sensitivity and the square of oppositions
  4.5 The conspiracy theory of polarity licensing
  4.6 The anomaly of inverted polarity items

5 The elements of sensitivity
  5.1 The Informativity Hypothesis
  5.2 Quantitative semantics
  5.3 The pragmatics of informativity
  5.4 Assessing informativity
  5.5 Rhetorical coherence in polarity contexts
  5.6 Compositional sensitivities

6 The scalar lexicon
  6.1 Paradigmatic predictions of the Scalar Model
  6.2 Modal polarity items
  6.3 Connective polarity items
  6.4 Aspectual polarity items
  6.5 The limits of diversity

7 The family of English indefinite polarity items
  7.1 The many splendors of any
  7.2 Indefinite family resemblances
  7.3 Emphatic construals of indefinite any
  7.4 The effects of phantom reference
  7.5 Some uses of some
  7.6 The limits of free choice
  7.7 Indefinite conclusions

8 Polarity and the architecture of grammar
  8.1 High stakes grammar
  8.2 Terms of the debate
  8.3 The syntactic approach
  8.4 Semantic approaches
  8.5 Toward a more pragmatic approach

9 The pragmatics of polarity licensing
  9.1 Affectivity reconsidered
  9.2 Scalar construal
  9.3 Logical conditions are not sufficient
  9.4 Logical conditions are not necessary
  9.5 Rhetorical coherence
  9.6 Affectivity reclaimed

10 Visions and revisions

Appendix: A catalogue of English polarity items
Notes
References
General index
Person index

Figures

2.1 Haspelmath’s semantic map of indefinite functions
3.1 A scalar model of puzzles
3.2 A two-dimensional scalar model
4.1 Four sorts of polarity items
4.2 Polarity items and the square of opposition
4.3 Canonical and inverted polarity items
6.1 A connective lattice
6.2 Durative until
6.3 Punctual until
8.1 The monotonicity hierarchy

Tables

1 English quantifiers and indefinite constructions
2 Distributions of three polarity items

Acknowledgments

This book began with an epiphany in a stairwell by the sea near San Diego. The idea that two rhetorical tropes, exaggeration (emphasis) and understatement (attenuation), might explain the entire grammar of polarity sensitivity (NPIs and PPIs), seemed in an instant so neat, obvious, and simple, I was sure it must be obviously wrong or else already widely assumed, or perhaps both. Now I think the idea was both less obvious and more correct than I first suspected. That idea became the basis for a qualifying paper in 1994, a paper in Linguistics and Philosophy in 1996, and a dissertation in 1998, as well as a handful of shorter works (Israel 1997, 1999, 2001, 2006), and now, finally, for this book. Even now I wonder if I have done justice to this one little idea, but I know that what justice I have done, I could never have done alone. While I am entirely responsible for the inadequacies which remain in this work, I am deeply in the debt of others for what virtues I have managed to include.

Probably I never could have had the idea at all were it not for the extraordinary scholars and teachers who inspired me on my way. It was Chuck Fillmore who first introduced me to polarity items and Eve Sweetser who first taught me to see the rhetoric in lexical semantics, and neither seems ever to have tired of encouraging me since. Adele Goldberg, Suzanne Kemmer, and George Lakoff, each in their different ways, taught me to seek the connections between grammar and meaning, and to appreciate the importance of doing so. Ron Langacker was always generous to me with his thoughts and patient and kind as he encouraged me to develop my own. I am deeply grateful for the thoughtful advice and meticulous readings he has given to me and this work over the years. Gilles Fauconnier, whose old ideas are at the heart of this work, was unstinting in his willingness to revisit old issues here and to help me as I worked through them again. And special thanks are due to Larry Horn, who has been generously reading and responding to drafts of this work almost from the beginning. His unflagging enthusiasm has sustained me throughout and his insights have greatly improved the final product.

Many others have read and commented on drafts of this work and contributed to its final form. I am thus grateful to Raùl Aranovich, Chris Barker, Christine Bartels, Jack Hoeksema, Bill Ladusaw, Haj Ross, Neil Smith, and Yuki Kuroda. Very special thanks are also due to Peter Mallios and Tess Wood for their generosity as readers and editors in the long last stages of this writing, and to my editorial team at Cambridge – Jacqueline French, Sarah Green, Tom O’Reilly, and Andrew Winnard – for all their hard work.

Many others have aided, abetted, encouraged, and inspired me in this work. Scholarship is nurtured in friendship – it is hard to say where one begins and the other ends – and I am grateful for the friendly feedback and lively conversations I have enjoyed with, among others, Michel Achard, Noah Baum, Patty Brooks, Bill Byrne, Claudia Brugman, Kathleen Carey, Linda Coleman, Bill Croft, Seana Coulson, Adrian Cussins, Michelle Cutrer, Rich Epstein, Jeanne Fahnestock, Anastasia Giannakidou, David Gil, Joe Grady, Peter Harder, Martin Haspelmath, Dennis Hilton, Paul Hirschbühler, Chris Johnson, Paul Kay, Henny Klein, Margaret Langdon, Pierre Larrivée, Chungmin Lee, Phil Lesourd, Jeff Lidz, Louise McNally, Laura Michaelis, Bill Morris, Karin Pizer, Hotze Rullmann, Scott Schwenter, Ron Sheffer, Vera Tobin, Michael Tomasello, Elizabeth Traugott, Mark Turner, Karen van Hoek, Arie Verhagen, Gregory Ward, Paul Weinstein, Deirdre Wilson, and Ton van der Wouden.

Finally, I am grateful to my family and friends for their love and support over the years, and especially to Tess Wood and Zev Israel who have had to live with me, and sometimes without me, as I wrote this. Words cannot express my gratitude.

Abbreviations

ACC    Accusative
ADJ    Adjective
API    Affective polarity item
BNC    British National Corpus
CN     Common Noun
DAT    Dative
DE     Downward entailing
DEC    Declarative
DET    Determiner
DISJ   Disjunctive
ERG    Ergative
FC     Free choice
FP     Focus particle
FUT    Future
IC     Implication Constraint
IMPF   Imperfective
INDEF  Indefinite
INF    Infinitive
LF     Logical form
LM     Landmark
MOD    Modal
N      Noun
NEG    Negative
NOM    Nominal (i.e. N′, the complement of a determiner in an NP)
NP     Noun Phrase
NPI    Negative polarity item
OED    Oxford English Dictionary
P      Preposition/particle
PFV    Perfective
PL     Plural
PPI    Positive polarity item
PRO    Pronoun
PS     Polarity sensitive
S      Finite clause
SG     Singular
SUBJ   Subjunctive
UE     Upward entailing
TR     Trajector
V      Verb
VP     Verb Phrase
WSJ    Wall Street Journal

1 Trivium pursuits

But the truth is, they be not the highest instances that give the securest information, as may be well expressed in the tale so common of the philosopher that while he gazed upwards into the stars fell into the water; for if he had looked down he might have seen the stars in the water, but looking aloft he could not see the water in the stars. So it cometh often to pass that mean and small things discover great, better than great can discover the small.

Bacon, The Advancement of Learning, Book II, 1.v. (1605)

1.1 As above, so below

Bacon’s philosopher might be forgiven for looking too much upwards and not enough down. We look “up” not just to the stars and the sky, but to those we admire and to our highest ideals. We look “down,” as often as not, on things we despise, things beneath us, which are low, mean, and base. Familiarity breeds contempt, and it is easy to forget that what lies beneath may also run deep.

Figuratively speaking, up is where it’s at. Up is above, on top of, superior to, beyond; it is higher than, taller than, farther than, and more. It can be a location or a direction. It is defined within a larger frame, the vertical scale, which it shares with down – normally, the physical dimension parallel to an upright person standing erect on an even surface. The basic experience of bodily uprightness motivates the common metaphorical associations of being “up” with wakefulness, alertness, strength, reason, and virtue, and being “down” with sleep, weakness, folly, and vice. This massive alignment of evaluative metaphors along a vertical scale is not just some whim of imaginative fancy, nor is it unique to English. Indeed, it is a normal way for conceptual contents to be imaginatively structured across semantic domains – a reflection in grammar of the workings of the mind.

The basic opposition between ‘up’ and ‘down,’ and the many metaphorical oppositions it engenders, are themselves symptoms of a much more general tendency for human concepts to be structured in terms of contraries. All languages, it seems, have metaphors in which abstract notions like ‘truth’ and ‘goodness’ are fleshed out in terms of more basic bodily experiences, and one of the most basic experiences featured in such metaphors is the sense of opposition one may feel between contrary concepts like ‘up’ and ‘down,’ ‘light’ and ‘dark,’ or ‘hot’ and ‘cold.’ Contrariety itself is a quintessentially abstract concept, but it is immanent in our most down-to-earth experiences. The human mind thrives on the logic of contraries, and this is everywhere reflected in the structure of language, from the most basic phonemic oppositions and antonymic lexical pairings to the elementary rules for predicate affirmation and denial.

Keeping with Bacon’s advice, this book looks mainly down at little things in order to glimpse therein the image of something great. The little things of concern here are matters of grammar – ordinary constructions of everyday talk and their attendant bits of form and meaning. The greater things to be discovered are the elements and principles of thought itself: the commonsense imaginative abilities which allow us, the speaking ape, to entertain concepts and to share them with one another.

1.2 A quirk of grammar or a trick of thought?

This book is concerned with a single, intricate, and easily overlooked grammatical phenomenon going by the awkward name of polarity sensitivity. Many, and perhaps all, human languages include a class of constructions which are somehow sensitive to the expression of polarity – forms whose acceptability in a sentence can depend on whether that sentence is grammatically negative or affirmative. Such polarity items arise in many semantic domains and come in many morphosyntactic flavors; but, since polarity itself is a binary relation, all polarity items divide into two basic classes: positive polarity items (PPIs), which are unacceptable in the scope of negation, and negative polarity items (NPIs), which are unacceptable in simple affirmative contexts. Both NPIs and PPIs can be found side by side in semantic domains they share with semantically similar but grammatically insensitive (or neutral) constructions. The data in (1–4), for example, reveal four sets of sensitivity triplets – items with similar semantics but different sensitivities – taken from four basic semantic domains: (1) agentive effort, (2) epistemic possibility, (3) propositional conjunction, and (4) event frequency. For each domain, the examples in (i) illustrate neutral items, those in (ii) illustrate PPIs, and those in (iii) illustrate NPIs. The unacceptable sentences in (ii–iiib) give some impression of what happens when a polarity item occurs in the wrong sort of context.

(1) EFFORT: (i) make an effort to V, (ii) take a stab at V-ing, and (iii) even bother to V.
    i)   a. He made an effort to solve the puzzle.
         b. He didn’t make an effort to solve the puzzle.
    ii)  a. He took a stab at solving the puzzle.
         b. *He didn’t take a stab at solving the puzzle.
    iii) a. *He even bothered to solve the puzzle.
         b. He didn’t even bother to solve the puzzle.

(2) POSSIBILITY: (i) be likely to V, (ii) could well V, and (iii) can possibly V.
    i)   a. She is likely to win the race.
         b. She is not likely to win the race.
    ii)  a. She could well win the race.
         b. *She couldn’t well win the race.
    iii) a. *She can possibly win the race.
         b. She can’t possibly win the race.

(3) CONJUNCTION: (i) and, (ii) as well as, and (iii) let alone.
    i)   a. Chris has read the Aeneid and the Georgics.
         b. Chris hasn’t read the Aeneid and the Georgics.
    ii)  a. Sally has read the Aeneid as well as the Georgics.
         b. *Sally hasn’t read the Aeneid as well as the Georgics.
    iii) a. *Glynda has read the Aeneid, let alone the Georgics.
         b. Glynda hasn’t read the Aeneid, let alone the Georgics.

(4) FREQUENCY: (i) to V X a lot, (ii) be always V-ing X, (iii) to V X much.
    i)   a. Ann listens to the Grateful Dead a lot.
         b. Ann doesn’t listen to the Grateful Dead a lot.
    ii)  a. Hugh is always listening to the Grateful Dead.
         b. *Hugh isn’t always listening to the Grateful Dead.
    iii) a. *Jeff listens to the Grateful Dead much.
         b. Jeff doesn’t listen to the Grateful Dead much.
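The pattern of judgments in (1–4) can be restated as a small lookup, offered here only as an illustrative sketch. It assumes a deliberately simplified picture with just two contexts (a simple affirmative clause and the scope of clausal negation) and so ignores questions, conditionals, comparatives, and every other licensing environment. The names `Sensitivity`, `Context`, `LEXICON`, `is_acceptable`, and `star` are hypothetical labels introduced for this sketch; they are not part of any theory or library.

```python
from enum import Enum


class Sensitivity(Enum):
    NEUTRAL = "neutral"
    PPI = "positive polarity item"
    NPI = "negative polarity item"


class Context(Enum):
    AFFIRMATIVE = "simple affirmative clause"
    NEGATIVE = "scope of clausal negation"


# The twelve items from (1)-(4), classified as in the text:
# (i) neutral, (ii) PPI, (iii) NPI for each semantic domain.
LEXICON = {
    # (1) agentive effort
    "make an effort to V": Sensitivity.NEUTRAL,
    "take a stab at V-ing": Sensitivity.PPI,
    "even bother to V": Sensitivity.NPI,
    # (2) epistemic possibility
    "be likely to V": Sensitivity.NEUTRAL,
    "could well V": Sensitivity.PPI,
    "can possibly V": Sensitivity.NPI,
    # (3) propositional conjunction
    "and": Sensitivity.NEUTRAL,
    "as well as": Sensitivity.PPI,
    "let alone": Sensitivity.NPI,
    # (4) event frequency
    "V X a lot": Sensitivity.NEUTRAL,
    "be always V-ing X": Sensitivity.PPI,
    "V X much": Sensitivity.NPI,
}


def is_acceptable(item: str, context: Context) -> bool:
    """Simplified two-context rule: PPIs are out under negation,
    NPIs are out in simple affirmatives, neutral items are fine in both."""
    sensitivity = LEXICON[item]
    if sensitivity is Sensitivity.PPI:
        return context is Context.AFFIRMATIVE
    if sensitivity is Sensitivity.NPI:
        return context is Context.NEGATIVE
    return True


def star(item: str, context: Context) -> str:
    """Return the conventional '*' prefix for an unacceptable combination."""
    return "" if is_acceptable(item, context) else "*"


if __name__ == "__main__":
    for item in LEXICON:
        for context in Context:
            print(f"{star(item, context):1} {item!r} in {context.value}")
```

A table like this records the distributions but explains nothing about them: it stipulates each item's class outright, which is exactly the kind of case-by-case listing that a theory of sensitivity has to go beyond.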

The proper way to account for this little phenomenon has been a subject of long-standing and at times rather intense controversy in theoretical linguistics. These are not the sorts of facts one is likely to notice about a language, but they are remarkable nonetheless. One would expect that anything one could affirm, one could also deny, and that anything one could deny, one could also affirm. But polarity items are subject to special constraints, the violation of which results in unexpectedly unacceptable sentences. These constraints are more complicated than the examples here suggest since NPIs can be licensed, and PPIs blocked, in a variety of contexts beside clausal negation – among others, in questions, and in conditional (if) and comparative (than) clauses (see below §2.3.2). Still, the fundamentally striking observation here is that a simple switch in polarity can make an otherwise unobjectionable sentence not just unacceptable, but apparently ungrammatical. The problem with these sentences is not just one of semantic anomaly (since it is clear what they should mean) nor of any obvious pragmatic infelicity (for it is easy to see how they might be used). Rather, something about these sentences seems to make them intrinsically incoherent. The question is, what is the nature of this incoherence? What, precisely, is wrong with these sentences? How should this wrongness be represented in a theory of grammar? And crucially, what is it about the way speakers understand such sentences that makes them feel so wrong? To answer these questions, one must confront fundamental questions about the nature of grammar and meaning.

Almost from the start of generative linguistics, polarity items have been a battleground in debates about the nature of grammatical representation (Lees 1960; Bolinger 1960; Klima 1964; Baker 1970), and as theories have evolved, polarity items have remained a flashpoint. Polarity sensitivity neatly straddles the realms of syntax, semantics and pragmatics, so that a theory of polarity necessarily raises questions not just about the interfaces between these components, but ultimately about the architecture of grammar itself and the grammar’s relation to extra-linguistic aspects of cognition (Fauconnier 1975a, 1976; Ladusaw 1979, 1983; Linebarger 1980, 1987; Israel 1996, 1998a, 2004; Chierchia 2004; Giannakidou 2006). For the most part, these debates have turned on the question of what sorts of entities are needed in a theory of grammatical representations in order to account for the constraints on polarity items. The distributions of polarity items have thus served as evidence that the grammaticality of a sentence may depend on its entailments (Baker 1970) or on its implicatures (Linebarger 1980, 1987, 1991), and as such they have played a central role in debates about the nature of logical form as a level in grammatical representations. Most famously, perhaps, Ladusaw (1979, 1983) has argued that the grammar of polarity items depends on a fully interpreted level of logical form where negative polarity items are constrained to appear in the immediate scope of a downward entailing (DE) operator. According to this proposal, the model-theoretic representation of a sentence’s literal truth conditions is itself a part of grammar – a level where constraints on well-formedness are defined – and not merely the product of more general cognitive abilities operating on the output of a generative grammar.

However one chooses to formulate the constraints on polarity items, one must also confront the problem of how language users manage to learn these constraints. Polarity items epitomize a classic quandary of language acquisition: the absence of negative evidence (Braine 1971; Bowerman 1988; Pinker 1989). Somehow speakers learn the grammar of polarity items without hearing the ways these forms cannot be used. But what speakers have to learn about polarity items is precisely the ways they are not used. The obvious way one could learn such a thing – the way linguists in fact learn it – is to find an instance of a polarity item in a context where it cannot be used and to observe – whether by introspection or controlled elicitation – the oddness of its usage. But of course ordinary speakers can never make such an observation since the oddness, or “ungrammaticality,” of such uses normally prevents their occurring at all. In fact, one of the few places such uses do occur (though even here they are rare) is in the spontaneous speech of very young children. The examples below, from the CHILDES database (MacWhinney 1995), illustrate the sort of uncertainty typical of young children’s early uses of polarity items. In (5–6), Abe is just under 33 months old (2;8.22), when he uses the idiomatic NPI in my life in a conversation with his father about an orange fish (Kuczaj 1976):

(5) *FAT: I bet if you used one of those orange fish # you could catch something what do you think?
    *ABE: what orange fish?
    *ABE: what orange fish?
    *ABE: I never heard of that my life.
    *FAT: you never heard of that in your life?
    *ABE: I wan(t) (t)a go catch a corn fish.
    (File 032 – lines 47–53)

In this first use, the NPI (or something close to it, since Abe actually omits the preposition in) is licensed by the negative never. Most likely Abe has learned the NPI here [in pro’s life] as part of a larger idiom – something like never heard of X in my life. But whatever the details, Abe’s usage here is clearly flexible and creative, as moments later he produces the same item in a simple affirmative sentence, without never or any other negative licensor.

(6) *FAT: what kind do you want to catch?
    *ABE: a [/] a [/] a [/] a stair fish.
    *FAT: a stair fish?
    *ABE: uhhuh I heard of that in my life.
    *FAT: you heard of that is [sic] your life?
    *ABE: uhhuh I can’t fish like that.
    (File 032 – lines 118–23)

Apparently, Abe at this age was not yet aware of the constraints which limit expressions like in my life to negative contexts, or if he was, he did not yet realize that this particular expression is subject to such constraints. A similar pattern of confusion appears in Nina’s corpus (Suppes 1974), where, on one occasion at 36 months (3;0.16), the child seemed to vacillate between any more and some more in several repetitions of the same clause.

(7) *NIN: have to close (th)em # (be)cause it’s not raining any more.
    *NIN: when it’s raining some more.
    *NIN: it’s not raining some more now.
    *NIN: it’s not raining any more # so we have to close this one.
    (File 44 – lines 640, 652, 653, and 687)

If these sorts of anecdotal observations are at all representative (and they are certainly not uncommon), it appears that whatever children might know about the theoretical constraints on polarity items, it is not enough to keep them from using such items in some very unconstrained ways. Even if one assumes that speakers come equipped with some innate knowledge of the constraints which govern polarity items, speakers still must learn the particular constructions in their language that are sensitive, which sensitivities they have, and just how strongly sensitive they are. This is a formidable problem since languages vary widely both in the polarity items they include and in the details of their distributions. Moreover, as the data in (1–4) show, near synonyms can and do vary sharply in their sensitivities. Somehow, it seems, speakers must master these subtleties on a case-by-case basis. It thus seems reasonable to follow van der Wouden’s suggestion (1997: 80), that while “the mechanisms underlying the behaviour of polarity items are part of grammar; the specific behaviour of individual polarity items is part of the lexicon.” Still, the question is, just how do these grammatical mechanisms find their way into the individual polarity items?

This book seeks answers to this and other questions about the grammar of sensitivity by viewing polarity items, and sensitive items in general, in terms of the semantic and pragmatic contents they encode in observable discourse (whether “real” or in some way experimentally contrived). I assume, in other words, that polarity items are polarity sensitive because of the meanings they encode, so that speakers effectively learn “the grammar” of these constructions (i.e. their particular sensitivities) the same way they learn the meaning and use of any other linguistic construction. This does not mean that “the grammar” here is not in some sense “innate” or “universal.” There are universal constraints on what a human mind may imagine, and on what sorts of imaginings can be encoded by a linguistic construction. But such constraints might take a variety of forms, and it is far from clear which, if any, of our innately human predispositions consists precisely in a constraint on linguistic representations. I will argue here that the distributions of polarity items, at least, are not determined by constraints on linguistic representations per se, but rather reflect the operation of general cognitive abilities in ordinary communicative interactions.

My goal is to explain not just why polarity items have the peculiar distributions they do or how speakers manage to learn these distributions, but also why it is that polarity items should exist in the first place. I argue that polarity sensitivity in general arises as a grammatical consequence of the ways language users regularly exploit a basic conceptual ability for rhetorical purposes. The conceptual ability here is the ability to reason in terms of scales – the ability, that is, to construe an entity within a particular sort of semantic frame, a scalar model, and to make inferences based on this construal.

1.3 The hypothesis: sensitivity as lexical pragmatics

The basic theory – what I call the Scalar Model of Polarity – is simple. The claim is that polarity contexts are defined by their effects on scalar inferences and that polarity items encode semantic properties which make them sensitive to such inferences. Polarity items are thus a special class of what Fillmore, Kay, and O’Connor (1988) and Kay (1990) term “scalar operators” – forms which must be interpreted with respect to an appropriately structured scalar model. In particular, I claim, sensitivity arises from the interaction of two sorts of scalar semantic properties – quantitative (q-) value and informative (i-) value – each of which functions independently of polarity sensitivity, but which together constitute the necessary and sufficient conditions for a construction to be polarity sensitive. A form’s q-value depends on its relative position (either high or low) in a scalar ordering. A form’s i-value reflects the informative strength (either emphatic or attenuating) of the proposition to which the form contributes its meaning. Both features are grounded in the logic of scalar reasoning and the rhetoric of interpersonal communication. Their combination within a single form effectively limits that form to contexts which allow the scalar inferences needed to make both values felicitous. The theory makes clear predictions about where polarity items might be found in a language and what forms they can take. Most generally, the theory predicts the existence of four broad classes of polarity items: NPIs divide into emphatic forms with low q-value and attenuating forms with high q-value; PPIs divide into attenuating forms with low q-value and emphatic forms with high q-value. All four sorts are well attested in English and other languages,

and the theory predicts that all sorts of polarity items from all sorts of domains fit this broad taxonomy.

The idea that sensitivity might be related to scalar semantics is not new: it has been advanced in one way or another by an impressive set of theorists (e.g. Schmerling 1971; Horn 1972, 1989, 2005; Fauconnier 1975a, 1976; Fillmore, Kay & O’Connor 1988; Kadmon & Landman 1993; Lee & Horn 1994; Krifka 1995; Haspelmath 1997; Lahiri 1998; van Rooy 2003; Zepter 2003), and disputed in one way or another by an equally impressive set (Linebarger 1980; Progovac 1992, 1994; Rullmann 1996; Giannakidou 1998, 1999; Chierchia 2004; Szabolcsi 2004). The present work, however, makes the unusual claim (though see Verhagen 2005 for a similar view) that polarity items are not just scalar in their propositional semantics, but also in their pragmatics. Polarity items are, I contend, argumentative operators¹ which conventionally index an argumentative attitude – an attitude, that is, toward the expressed content of an utterance; or, in Gricean terms, toward what is (baldly and explicitly) said. For my purposes here, it will suffice to distinguish just two major types of argumentative attitude, emphasis and attenuation, each of which may attach to either a positive or a negative proposition. Constructions which express an emphatic attitude – for example, the English [really Adj] and [(not) at all Adj] constructions in (8) and (9) – present an expressed proposition (what is said) as somehow stronger and more significant than an alternative proposition which might have been said. Conversely, constructions expressing an attenuating attitude – like [sort of Adj] and [(not) such a Adj] in (10) and (11) – hedge what is said, and present a proposition as weaker and less exciting than it might have been.

(8)  a. That’s true.                    p
     b. That’s really true.             p (> n)

(9)  a. That’s not true.                ~p
     b. That’s not true at all.         ~p (> n)

(10) a. That’s a good idea.             q
     b. That’s sort of a good idea.     q (< n)

(11) a. That’s not a good idea.         ~q
     b. That’s not such a good idea.    ~q (< n)
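The four classes of polarity items predicted by the Scalar Model earlier in this section can be written out as a small lookup table. The Python sketch below is only an illustration: the names `QValue`, `IValue`, `TAXONOMY`, and `sensitivity` are assumptions introduced here, not notation from the theory, and the table does nothing more than restate the pairing of q-value and i-value with NPI or PPI status given above.

```python
from enum import Enum


class QValue(Enum):
    """Quantitative value: relative position in a scalar ordering."""
    LOW = "low"
    HIGH = "high"


class IValue(Enum):
    """Informative value: rhetorical strength of the proposition."""
    EMPHATIC = "emphatic"
    ATTENUATING = "attenuating"


# Hypothetical lookup restating the four predicted classes.
TAXONOMY = {
    (QValue.LOW, IValue.EMPHATIC): "NPI",
    (QValue.HIGH, IValue.ATTENUATING): "NPI",
    (QValue.LOW, IValue.ATTENUATING): "PPI",
    (QValue.HIGH, IValue.EMPHATIC): "PPI",
}


def sensitivity(q: QValue, i: IValue) -> str:
    """Return the predicted polarity class for a q-value/i-value pair."""
    return TAXONOMY[(q, i)]
```

The table makes one feature of the prediction easy to see: neither value alone decides the class, since each q-value and each i-value appears once under NPI and once under PPI.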

The constructions here illustrate the four basic sorts of argumentative meanings. These are very general sorts of meaning, and as such can be (and typically are) coded by a great many constructions within a single language. The notations on the right reflect the status of these sentences as neutral, emphatic,

or attenuating: “p” and “q” here stand for expressed propositions, “(n)” for a salient alternative proposition (the parenthesis indicates its status as implicit or backgrounded), and the “more than” (“>”) and “less than” (“<”) signs show the strength of an expressed proposition relative to n, and thus its status as either emphatic or attenuating. There are a variety of ways one might understand “strength” as a property of propositions – as, for example, its likelihood of being true (Karttunen and Peters 1979), its noteworthiness (Herburger 2000), its relevance (van Rooy 2003), or its force as an argument for some conclusion (Ducrot 1973, 1980; Anscombre and Ducrot 1983). I follow Kay (1990, 1997) in defining the strength of a proposition directly in terms of its entailments: a proposition p is stronger than a proposition n if and only if p unilaterally entails n. I take it that while emphasis and attenuation are fundamentally rhetorical aspects of meaning, they are in fact grounded in this simple propositional logic. Marking an expressed proposition as either emphatic or attenuating is basically just a way of calling attention to its logical status with respect to background assumptions. But the act of calling attention itself is always rhetorically loaded. An argumentative operator thus does not add to the logical content of what is said but expresses an attitude about that content and so situates it in a larger context. The key idea in this book is that such argumentative content is an irreducible part of the meanings of certain linguistic constructions, and that the encoding of such content has systematic grammatical consequences. This idea seems to go against the grain of much contemporary theorizing. The problem is that emphasis and attenuation are fundamentally pragmatic aspects of meaning, and so the claim that sensitivity depends on such features means that polarity licensing must be, at least in part, pragmatic in nature.
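The entailment-based definition of strength can be made concrete with a toy model. The Python sketch below is an illustration under stated assumptions, not part of the theory's own apparatus: it treats a proposition as the set of possible worlds in which it is true (an idealization assumed here), so that p entails n just in case every p-world is also an n-world. The world labels and the function names `entails`, `stronger`, and `attitude` are hypothetical, and returning "neutral" when neither proposition is stronger is a simplifying choice made for the example.

```python
from typing import FrozenSet

Proposition = FrozenSet[str]  # assumed: a proposition is a set of worlds


def entails(p: Proposition, n: Proposition) -> bool:
    """p entails n if n is true in every world where p is true."""
    return p <= n


def stronger(p: Proposition, n: Proposition) -> bool:
    """p is stronger than n iff p unilaterally entails n."""
    return entails(p, n) and not entails(n, p)


def attitude(p: Proposition, n: Proposition) -> str:
    """Label the expressed p relative to the backgrounded alternative n."""
    if stronger(p, n):
        return "emphatic"      # corresponds to the notation p (> n)
    if stronger(n, p):
        return "attenuating"   # corresponds to the notation p (< n)
    return "neutral"           # assumption: no strength contrast


# Hypothetical worlds, labelled by how true a claim is in each.
somewhat_true = frozenset({"w_somewhat", "w_really"})
really_true = frozenset({"w_really"})

print(attitude(really_true, somewhat_true))  # emphatic
print(attitude(somewhat_true, really_true))  # attenuating
```

Note that the sketch captures only the logical relation; the rhetorical load of calling attention to that relation is exactly what a set-theoretic model leaves out.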
The “grammaticality” of a polarity item in a linguistic context is thus a function not just of the sentence in which it occurs, but also of the utterance in which it is used. But if this is true, and if sensitivity really is a grammatical phenomenon (which, I maintain, it is), then grammar itself cannot be limited to the generation of well-formed sentences but must also regulate their uses in discourse. Whether or not this claim really makes sense depends in part on how one imagines linguistic knowledge (or grammar) is mentally encoded and how it relates to communicative competence in general. Basically, it makes sense if, as I contend, pragmatics is a part of grammar, and linguistic constructions regularly encode pragmatic constraints as an irreducible part of their conventional meanings. It makes less sense if one assumes that grammars are strictly a matter of linguistic representations, that pragmatics merely affects the ways such representations can be used, and that pragmatic effects are, as a rule,

dependent on objective semantic contents of linguistic constructions. Both sorts of assumptions have much to recommend them, so it may be useful to consider some of the reasons why the latter view is so widely assumed, and why it thus seems so odd to so many that grammar might be in some measure a matter of pragmatics.

1.4 Putting pragmatics in its place

It was Charles Morris who in 1938 first distinguished pragmatics as a branch of semiotics distinct from and parallel to the studies of syntax and semantics. In some ways, Morris’s trichotomy of syntax, semantics, and pragmatics is reminiscent of the original trivial pursuits, the Trivium of grammar, logic, and rhetoric – the first three of the seven liberal arts. Carnap famously conceived of this trichotomy as a series of abstractions:

If in an investigation explicit reference is made to the speaker, or, to put it in more general terms, the user of a language, then we assign it to the field of pragmatics… If we abstract from the user of the language and analyze only the expressions and their designata, we are in the field of semantics. And if, finally, we abstract from the designata also and analyze only the relations between the expressions, we are in (logical) syntax. (Carnap 1942: 9)

Thus syntax – or at least “(logical) syntax” – studies relations between logical or linguistic expressions; semantics is what you get when you add meanings to those expressions; and pragmatics is what you get when you place those meanings in contexts with speakers, hearers, and communicative intentions. Framed this way, syntax – the combinatorics of signs – is at the (logical) core of the enterprise. One advantage of this is that it immunizes the study of linguistic forms from the obvious subjectivity which infects so much of language use. At the same time, by isolating syntax from both meaning and usage, it presupposes that the forms of syntax are, in a deep sense, independent from the communicative concerns of the people who use them. This way of framing things – in particular the idea that grammar, meaning, and use belong to separate domains in the study of language – has been a cornerstone of generative linguistic theorizing. Generative grammar starts from the assumption that a language is a kind of formal object – “a set (finite or infinite) of sentences finite in length and constructed out of a finite set of elements” (Chomsky 1957: 13) – and that a grammar is “a device that generates all of the grammatical sequences of [a language] L and none of the ungrammatical sequences” (ibid.). As Partee, ter Meulen, and Wall put it (1990: 437),

“a formal grammar (or, simply, grammar) is essentially a deductive system of axioms and rules of inference, which generates the sentences of a language as its theorems.” A grammar effectively determines – and in precisely this sense “explains” – what sentences do and do not count as part of a language. This sort of theory draws a clear and impermeable boundary between grammar and usage. Grammars provide structural constraints on grammatical representations; what people do with such representations is their own business, and perhaps the business of psychologists, sociologists, and others interested in human behavior, but it is not the business of a linguist. The linguist’s job is to discover the rules which specify the possible sentences in a language, not to say how or why or under what circumstances any of those sentences might actually occur. Thus, the grammaticality of a sentence – whether or not it is included in a language – is quite different from its potential acceptability in use (Chomsky 1965: 11). Many perfectly grammatical sentences may be unusable for one reason or another (e.g. processing difficulty, uninterpretability, inappropriateness, etc.), and even some ungrammatical sentences may be quite useful in their way. As Newmeyer puts it, “knowledge of grammatical structure is only one of many systems that underlie usage” (2003: 692). Of course, a sharp distinction between grammar and usage does pose a practical problem for the theorist who must somehow decide which acceptable sentences really are grammatical and which unacceptable sentences are not. And unfortunately, this is not something that can safely be left to linguists’ intuitions since linguists and non-linguists actually differ quite sharply in their judgments of acceptability (Spencer 1973; Labov 1975).
But while it is difficult in practice to draw a clear line between grammaticality and acceptability, there may still be a real difference between what a speaker knows about a language (her competence) and the ways a speaker actually uses what she knows (her performance).² In principle, grammaticality is a function of competence, and acceptability a function of performance. The hard question, however, is not whether competence and performance should ever be distinguished, but just how they should be related. In generative theories the distinction is typically held to be absolute, so that grammar and usage are subject to very different sorts of theories:

[Grammars] are theories about the structure of sentence types … Pragmatic theories, in contrast, do nothing to explicate the structure of linguistic constructions or grammatical properties and relations. They explicate the reasoning of speakers and hearers in working out the correlation in a context of a sentence token with a proposition. In this respect, a pragmatic theory is part of performance. (Katz 1977: 19, quoted in Levinson 1983: 8)

Such an approach eliminates any consideration of language users from the study of grammar and establishes pragmatics as a sort of theoretical ghetto for the unruly phenomena of usage. The assumption is that matters of usage are always neatly separable from facts of structure, and that one can identify the well-formed combinations which make up a language without reference to the ways these forms are actually used. Little room is left for a theory of pragmatics to help explain why languages take the forms that they do. Rather, linguistic structures are seen as subject to a variety of formal principles – including constraints on phonological, morphological, and syntactic representations – all essentially divorced from usage and (in most modular theories) all unrelated to any other aspect of cognition. Ordinary faculties of memory, perception, attention, inference, and imagination, among others, all play important roles in actual communication, but their operation is supposed to be independent from the core principles of grammar: they are parts of performance and not of competence. Indeed, the standard view has been and remains that language and communication are fundamentally distinct phenomena governed by distinct modules of the mind. Relevance Theory is admirably explicit on this point. Sperber and Wilson (1986/1995) view communication in general as basically any process whereby “information-processing devices” can intentionally share information, and they develop Relevance Theory to account for the complex inferential processes humans rely on to produce and interpret communicative acts.
But for these authors, the fact that humans use language as the primary medium of communication is completely incidental to the nature of language itself, which they view much as Chomsky does, as a symbol system – a set of well-formed expressions generated by a grammar and semantically interpreted only in the sense that the expressions regularly correspond to objects of some other kind (Sperber & Wilson 1995: 172–3). Language is thus primarily a kind of representational device, one which is only secondarily adapted for communication. For Sperber and Wilson “it is as strange for humans to conclude that the essential purpose of language is for communication as it would be for elephants to conclude that the essential purpose of noses is for picking things up” (1995: 173–4). The strangeness of this conclusion might itself seem strange, but if language is just an abstract symbol processing system, then its use as a medium for communication must be a derivative and non-essential property. While such a view effectively precludes functional explanations for linguistic structure, it has hardly been dysfunctional as a research program. Since Saussure first segregated the structures of langue from the vagaries of parole, linguists have

developed an ever-increasing array of analytic tools for distinguishing new sorts of linguistic units and levels of organization. Whole fields of inquiry have opened up as syntax, semantics, morphology, and phonology emerged as distinct subfields, and each of these has contributed a dazzling array of generalizations and phenomena. Whatever one thinks of the generative enterprise as a whole, there is no denying that the fields of generative linguistics have produced a rich harvest of phenomena to explain. There are, of course, real practical and theoretical advantages to seeking explanations in which language and grammar function as symbolic systems more or less independently from communicative practice. Most importantly the separation of form and function makes it possible to view human languages in general as formal objects which may be studied and modeled with some mathematical precision. With the domain of inquiry narrowed to the structural properties of linguistic forms, hypotheses can be formulated and tested with admirable explicitness, and the structures themselves may become clearer as they are observed on their own terms. But the segregation of linguistic form and function has its disadvantages as well. For one thing, it limits the explanatory power of any theory since it eliminates in principle the possibility of explaining linguistic forms in terms of their functions. And while it immunizes linguistic theory from the messy details of language use, it also makes it hard for linguistics to contribute meaningfully to theories of language processing, language use, or sociolinguistic variation. Finally, and perhaps most dramatically, the theoretical segregation of linguistic form from communicative function makes language learning appear to be impossible in principle.
Grammar must be in some sense innate in the human mind, for if the structures of language really are unrelated to its uses, then presumably no amount of experience could suffice to teach them. But, of course, those who accept the autonomy of grammar are generally happy to embrace these conclusions. Ultimately, the assumption that the forms and functions of language are clearly separated may prevent one from seeing the ways in which they might be interdependent. Langacker (1987: 1) notes that what one finds in language depends in large part on what one looks for. And there is a danger that by removing language too far from its communicative contexts, one may end up with a radically distorted view of what language is. As Fauconnier (1997: 7–8) puts it:

language data suffers when it is limited to language, for the simple reason that the interesting cognitive constructions underlying language use have to do with complete situations that include highly structured background knowledge, various kinds of reasoning, on-line meaning construction, and negotiation of meaning.

To understand the ways language is adapted for communication, one must understand how it works in context. In fact, this point was not lost on Charles Morris himself. The man who first distinguished pragmatics from syntax and semantics did not assume that linguistic phenomena would always respect the boundaries between these domains:

it is legitimate and often convenient to speak of a particular semiotical investigation as falling within pragmatics, semantics, or syntactics. Nevertheless, in general it is more important to keep in mind the field of semiotic as a whole, and to bring to bear upon specific problems all that is relevant to their solution. (Morris [1946] 1998: 8)

In this perspective, grammar and usage may be seen as mutually dependent and deeply connected pieces of a larger cognitive puzzle – that is, “the field of semiotic as a whole,” including both the forms languages take and the functions they serve. If this is the puzzle we wish to solve, it might help to start out with a vision not, as Carnap would have it, of syntax purged of its users, but rather of pragmatics at the heart of grammar. This, in any case, is the vision I propose is needed to explain the grammar of polarity sensitivity.

1.5 Pragmatics in a usage-based grammar

What might such a vision consist of? While the main traditions in the cognitive sciences still see pragmatics as an essentially extra-grammatical set of phenomena, a growing chorus of researchers has been suggesting that grammar is in crucial ways based on usage, and that the pragmatics of language use is thus in some sense built into the structure of language itself (Bybee 1985, 2001; Langacker 1988, 2000; Barlow & Kemmer 2000; Traugott & Dasher 2002; Tomasello 1999, 2003; Goldberg 2006). In this section I briefly sketch out some of the basic assumptions about language and meaning which make a usage-based view of grammar plausible in general. While these assumptions are largely incidental to the arguments and analyses I develop in the rest of this work, the work as a whole is, at least in part, an argument for the plausibility and the fruitfulness of a usage-based approach. First, as the phrase usage-based grammar itself implies, grammar and usage are distinct things (Newmeyer 2003). Usage is what people do with language in all its messy social, phonetic and pragmatic detail. Grammar, on the other hand, is what allows people to use language, and consists of the internal representations at work in the production and interpretation of utterances. No theory which hopes to explain the cognitive foundations of linguistic behavior can get

around this distinction. The issue which separates generative and usage-based theories is how these two things are related, and to what degree, if any, abstract mental representations of grammar are informed by people’s experience of language use. I begin with the uncontroversial assumption that a grammar is a kind of device in the minds of language users (or persons) linking phonological representations with semantic contents. With Langacker (1987: 57) I view the grammar of a language as “a structured inventory of conventional linguistic units,” which come in three basic sorts: phonological (forms), semantic (meanings), and symbolic (pairings of forms and meanings). The view of grammar as a richly articulated collection of signs is actually shared by a number of frameworks, including, among others, Cognitive Grammar (Langacker 1987, 1991), Construction Grammar (Fillmore, Kay & O’Connor 1988; Goldberg 1995; Kay & Fillmore 1999), Radical Construction Grammar (Croft 2001), Embodied Construction Grammar (Bergen & Chang 2005), and Head-Driven Phrase Structure Grammar (Pollard & Sag 1994). These frameworks differ to some extent in the categories of signs they assume, and in the types of information they associate with different signs, but they are united by a number of key assumptions (Koenig 1999). In all of these approaches, the grammar of a language is essentially just a structured collection of signs. There is no essential distinction between the lexicon and other parts of grammar: “lexicon, morphology, and syntax form a continuum of symbolic units serving to structure conceptual content for expressive purposes” (Langacker 1987: 35). Meanings are directly associated with surface structures – as such, syntactic and morphological derivations operate in tandem with, and indeed as part of, semantic composition.
Language use in general is held to involve a dynamic process of parallel constraint satisfaction: linguistic units of all sorts are activated in usage in order to categorize, and so to sanction, usage events. Usage events (or aspects thereof) which either cannot be categorized, or which diverge too sharply from the linguistic units they activate, are experienced as either ungrammatical or uninterpretable. In this respect there is no distinction between the grammar which represents a speaker’s competence, and the grammar which guides performance: a speaker’s knowledge of her language is based on her experience of linguistic usage, and is precisely what allows a speaker to use her language. I use the term construction to refer to symbolic units of all sorts (Goldberg 1995: 2), including both simplex signs (e.g. words and morphemes) and complex symbolic assemblies (e.g. abstract constructions like subject–auxiliary inversion). Furthermore, I assume that language use in general is essentially

just the use of constructions for the purpose of coordinating conceptual contents among communicative agents (allowing, of course, for the limiting case in which the communicative agents correspond to a single person speaking or thinking to herself). The question now is where and how does pragmatics fit into this picture of grammar? And more specifically, how can it be – as I argue throughout this work – that the conventional content of a construction can include constraints on its use? The short answer, of course, is that there is really no reason why it shouldn’t. As Kay explains:

The constructional approach to grammar … specifically countenances direct pragmatic import of linguistic forms, not necessarily mediated by a truth-conditional level of semantics concerned only with the content of sentences. In Grice’s terms it is not only ‘what is said’ that has pragmatic import but also how it is said. (Kay [1990] 1997: 74, fn. 24)

Basically, the meaning of a construction is not and cannot be limited to the contribution it makes to a sentence’s truth conditions, but regularly encompasses the contribution it makes to utterance interpretation more generally. My own views here blend elements of traditional, Gricean (or “post-Gricean”) pragmatics (e.g. Levinson 1983, 2000; Horn 1984, 1989; Sperber & Wilson 1995; Carston 2002) with the more radical assumptions of cognitive semantics (e.g. Fillmore 1982, 1985; Fauconnier 1985, 1997; Lakoff 1987; Langacker 1987, 1991; Coulson 2001; Fauconnier & Turner 2002; Croft & Cruse 2004). This will strike some as an unlikely combination. Cognitive linguists habitually deny the distinction between semantics and pragmatics, while Grice’s work and the tradition it inspired offer some of the most compelling theoretical and empirical motivations for taking that distinction seriously. The two traditions are united, however, by their concern for the details of utterance interpretation, and their commitment to explaining these details in terms of general cognitive abilities. Of course, there are important differences as well. Crudely, where cognitive semantics views meaning as a fundamentally subjective phenomenon, Gricean pragmatics sees it in essentially rational terms. But rationality and subjectivity are not antithetical notions; in fact, both are crucial to an understanding of what language is and how it works. Grice’s basic insight is that the coded meaning of a sentence is always but the starting point for a larger interpretive process. The standard assumption in pragmatic theory is that the output of this process, utterance meaning, is a different sort of theoretical object from the input – sentence meaning. While sentence meaning is a product of grammar, utterance meaning reflects the

operation of general principles of reasoning and inference. The point here is not just that there are distinct levels of meaning representation, but also that there are regular and even rational processes of pragmatic enrichment which contribute to utterance interpretation. It is this insight which explains why speakers can regularly mean more than they say and still expect to be understood. And it also provides a necessary foundation for explaining such mysteries as how it is that words can change their meanings over time (Traugott & Dasher 2002) and how it is that children are able to learn a language in the first place (Tomasello 2003; Goldberg 2006). The idea that there are general principles of pragmatic interpretation means, among other things, that the meanings which constructions can convey are typically underspecified in the grammar and systematically enriched in usage (Horn 1984, 1989; Sperber & Wilson 1995; Fauconnier 1997; Levinson 2000). For those who assume that grammar is an encapsulated mental module, the fact that these principles are informed by general cognitive constraints and abilities (e.g. considerations of processing simplicity and goal-directed, cooperative behavior in general) means that they are essentially extra-grammatical. But if one assumes that grammar itself is based on usage, then it makes little sense to treat the principles of usage – no matter how general they might be – as essentially unrelated to the grammar. And if pragmatic principles really are based on deep cognitive and social-cognitive abilities – abilities, that is, which both predated and paved the way for the emergence of language in human societies – then perhaps it does make sense to think these principles have always informed not just the use but also the form of grammatical systems. The point is not that there is no distinction to be drawn between semantics and pragmatics. Indeed, there are many.
It is both legitimate and important to distinguish meanings which are explicitly coded from those only implicitly conveyed, meanings which a speaker actually asserts from those which are merely presupposed, and objective propositional contents in general from other affective and interpersonal sorts of meaning. The problem is not with the distinctions per se, but with the assumption that these distinctions divide into two neatly ordered levels of representation, one of which is somehow ontologically prior to the other. Cognitive semantics begins with the assumption that linguistic meanings are essentially just special sorts of conceptual structures which speakers are in the habit of sharing. As Langacker puts it, “semantic structure is conceptualization tailored to the specifications of linguistic convention” (1987: 99). Because semantic structures are conventional, they are also always to some

degree language specific – they are conventional habits of thought and so cannot be equated with a general purpose, universally available “language of thought.” But because these structures are themselves patterns of conceptualization, they are in fact precisely the phenomenological stuff of which all usage events and utterance interpretations are made. In cognitive semantics, the meaning of a sentence cannot be reduced to its propositional content or “logical form” but also includes the embodied, phenomenological experience of that content. All content must be somehow subjectively construed, and the construal a form imposes is as much a part of its meaning as is its objective content. Clearly, any theory which accepts such a view of meaning cannot include a level of semantic structure that represents the meanings of sentence-types independently of the ways these meanings would be experienced in use. A meaning, on this view, is basically just an instruction to construct and experience a conceptualization. This is not to deny that there may be, and often are, important differences between the “literal” (or coded) content of a sentence and its actual interpretation in context. Indeed, the coded content of an utterance is almost always vastly underspecified with respect to the meaning it expresses, but it does not follow from this that the two involve distinct levels of representation. What is coded can be either semantic (in that it can put constraints on the expressed propositional content of an utterance) or pragmatic (in that it can put constraints on the construal of an expressed proposition, or, equivalently, on the kinds of contexts in which an expression can be used). But both the coded content of a sentence and the inferential processes which guide its interpretation are part of the same, single (though complex and multilayered) phenomenon of meaning construction.
This view of meaning is in many ways deeply at odds with standard conceptions of semantics in the linguistic and philosophical literature. Indeed, the divide between “cognitive semantics” and “model-theoretic semantics” often seems unbridgeable, with researchers on either side of the divide typically showing little interest in (or knowledge of) work done on the other side. This, however, is both unfortunate and unnecessary. While there are important and substantive philosophical issues dividing them, many of the differences are really matters of emphasis rather than fundamental disagreements about the phenomena to be explained. Ultimately, any cognitively adequate theory of meaning in language must somehow come to terms with both the phenomenological and the referential aspects of meaning. Meaning is a cognitive phenomenon: it takes place in the mind of an individual who either means something or understands a meaning. As such, meaning

is an inherently subjective phenomenon, involving the perceptual and imaginative capacities of a thinking, speaking subject (Benveniste 1966; Lyons 1982; Langacker 1987, 1990). But meanings are also necessarily things which can be communicated from speaker to speaker, and as such they must be publicly available so that speakers can have common access to them. Meanings are not merely subjective, but intersubjective and conventional as well (Clark 1996; Tomasello 1999; Verhagen 2005). Finally, since speakers need to communicate not just about themselves and their subjective feelings, but also about the world around them, meanings must stand in reliable relations to the world and so must express propositions which can be objectively true or false. A prominent strain of thinking in twentieth-century linguistics and philosophy has perhaps been too dogmatic in viewing truth as the basic unit of semantic theory. But the point is well taken nonetheless: for all its subjectively and intersubjectively constituted nature, human language provides a remarkably robust medium for the expression of objectively verifiable propositions. Indeed, there can be little doubt but that the success of human language as an evolutionary adaptation (or as the product of one or more smaller adaptations) reflects its usefulness for things like sharing information by telling the truth, and deceiving conspecifics by telling lies. While many formal semanticists have perhaps paid too little attention to the subjective and intersubjective aspects of meaning, it is equally true that some cognitive linguists have tended to scant its objective and referential aspects. Both perspectives may benefit from the insights of the other. In any case, both perspectives are clearly called for if one hopes to deal definitively with the grammar of polarity sensitivity.

2 Ex nihilo: the grammar of polarity

It’s hard to imagine that nothing at all could be so exciting could be so much fun.

Talking Heads, from the song Heaven (1979)

2.1 The simplicity of negation

Nothing, it seems, could be simpler than negation. Nor could anything be more indispensable. No language could go without some way of saying “no,” nor could any speaker. A basic logical primitive, negation is one of the fundamental building blocks of cognition. It is hardly an exaggeration to say that without such a basic conceptual ability, objective thought itself would be impossible. We cannot know how the world is, unless we also know how the world is not. Even the simplest affirmation is only meaningful in so far as it contrasts with its own negation: there is no way of knowing what it is for the sky to be blue without at the same time also knowing what it would be for the sky to be something else – for it to be not blue. In this sense, negation is itself the cognitive guarantor of positive knowledge.

And it is precisely in this sense that negation seems so simple. One expects that for every positive proposition there should be a unique negative proposition, and that in any situation either one or the other, but not both, should be true. Either the sky is blue, or it is not blue; either the moon is made of green cheese, or it is not made of green cheese; either death is the mother of beauty, or it is not the mother of beauty. The classical view, going back at least to the Greek Stoics, is that negation is a contradictory operator on propositions, mapping any given proposition on to its truth-conditional opposite. Given this notion, and given the simplistic but appealing assumption that sentences represent propositions, one expects that every meaningful positive sentence will also have a unique negative counterpart.


Indeed, there are productive patterns in every human language relating simple sentences with negative counterparts. The examples in (1–3) illustrate one such pattern in English.

(1)

a. The sky is blue. b. The sky is not blue.

(2)

a. The moon is made of green cheese. b. The moon is not made of green cheese.

(3)

a. Death is the mother of beauty. b. Death is not the mother of beauty.

Such patterns seem to suggest there might be some simple rule mapping every positive sentence in English onto a corresponding negative sentence, and that the expression of negation in natural language might be as simple an operation as it is in propositional logic.

2.2 The complexity of polarity

As it turns out, the formation of negative sentences in English is anything but simple, and whether or not every affirmative sentence does have a unique negative counterpart, it is unlikely that any language could be content with a single mechanism for the expression of negation. English contains a variety of overtly negative morphemes beyond the basic sentence negator not. A brief list would have to include negative pronouns like none, nothing, and nobody, negative adverbs like never, the negative determiner no, negative connectives like neither and nor, and suppletive negative auxiliaries like won’t and the non-standard ain’t.

Beyond these obvious expressions of negation, English provides a variety of other devices which can mark the denial of a proposition. The standard negation of sentence (4a) may well be (4b), but the sentences in (5) are all plausible vehicles by which a speaker might reject the idea expressed in (4a).

(4)

a. The philosophy students smoke French cigarettes. b. The philosophy students don’t smoke French cigarettes.

(5)

a. I doubt they smoke French cigarettes. b. Few of them smoke French cigarettes. c. They rarely smoke French cigarettes. d. They have yet to smoke any French cigarettes. e. I’d be surprised if they ever smoked Dutch cigarettes either.

The variety of negative constructions available within a single language shows that there is more going on here than a simple pairing of propositions with their logical contradictions. Rather, and not all that surprisingly, English and other languages are equipped with a variety of pragmatically differentiated and semantically nuanced resources which speakers can use in different contexts where it is useful, for one reason or another, to express the negation of a proposition.

Negation is one of those special things, like anaphora, deixis, predication, and recursion, which feature prominently in all human languages. All human languages have some way of expressing a logical negation, and probably all have many. It is thus particularly striking, as Horn (1989) points out, that apparently no system of animal communication has anything quite like a negative operator. Of course, cats and dogs and other animals will respond to words like no, don’t, and stop, but from the animals’ point of view, these constructions appear to be neither operators nor even strictly negative. They are simply prohibitive holophrases – command words which threaten and demand submission. Animals in general are sensitive to nuances of positive and negative affect, they can express displeasure without ambiguity, and they are often adept at avoidance and refusal; but they have trouble with even the simplest logical inferences. And apparently no animal can contradict an assertion. In this sense, logical denial appears to be both necessary for and impossible without a language. And human languages in particular (as opposed, say, to artificial logical languages) can be profligate in the resources they devote to the expression of negation.

And there is a more surprising complication. Not only do languages provide lavish resources for expressing negation, the expression of negation itself often has significant consequences for the structure of a sentence as a whole. Negation can change the way a proposition is expressed, and it can even affect the acceptability of certain forms within a sentence. The examples in (6–8) illustrate three famous suppletive relations in which the forms some, already, and once are replaced in a negative sentence by any, yet, and ever, respectively.

(6)

a. Edna wants some spinach. b. Edna doesn’t want any spinach.

(7)

a. Ernest has already finished the whiskey. b. Ernest hasn’t finished the whiskey yet.

(8)

a. Martin was once able to understand his mother. b. Martin wasn’t ever able to understand his mother.

The alternations here are not stylistic variations, but obligatory grammatical rules: any, yet, and ever cannot appear in the positive contexts of the (a) sentences, nor can some, already, or once be easily used in the negative contexts of the (b) sentences. More striking still, it turns out that many negative sentences actually lack any direct positive counterpart.

(9)

a. Clarissa didn’t sleep a wink that night. b. *Clarissa slept a wink that night.

(10)

a. She wouldn’t so much as give him the time of day. b. *She would so much as give him the time of day.

(11)

a. She can’t possibly expect that he will forgive her. b. *She can possibly expect that he will forgive her.

By the same token, and no less surprisingly, many positive sentences seem to lack any direct negative counterpart.

(12)

a. That guy Winthrop is some mathematician. b. *That guy Winthrop isn’t some mathematician.

(13)

a. He’s a regular Einstein. b. *He’s not a regular Einstein.

(14)

a. He can calculate an eigenvector in the blink of an eye. b. *He can’t calculate an eigenvector in the blink of an eye.

The sentences in (6–14) are special because they contain elements which are somehow sensitive to the expression of negation and affirmation. The phenomenon is known as polarity sensitivity and the elements which exhibit this sensitivity are polarity sensitive items, or simply polarity items. They are linguistic constructions whose acceptability or interpretation depends somehow on the positive or negative status of the sentences in which they occur.

The sensitivity of these forms is puzzling in many ways. For one, it is by no means obvious how one could predict which constructions in a given language will count as polarity items. For another, it is unclear why any item in any language would have such a sensitivity. Still, polarity items are not especially unusual expressions. On the contrary, the examples above show that polarity items appear in even the most ordinary and unassuming of English sentences.

Polarity items are in fact commonplace throughout the world, and cross-linguistically certain types of expressions seem especially prone to polarity sensitivities. Indefinite determiners and pronouns, like English any and anyone, are frequently, though not always, polarity sensitive. Haspelmath (1997) describes a range of areally and genetically diverse languages which include comparable forms with similar, though by no means identical, sensitivities. Thus, we find polarity items more or less like English anyone in languages like Irish (aon duine), Hindi (koii bhii), Persian (hic-kasi), Turkish (hiç kimse), Finnish (kuka-an), literary Hebrew (ish), Swahili (mtu ye yote), Japanese (dare-mo), and Basque (inor), among many others.

The most well-known and widely attested sort of polarity item, however, is probably the minimal unit, or minimizer NPI. These forms consist minimally of a singular indefinite NP used to denote a minimal unit or degree of some sort (Bolinger 1972: 17). Typical examples in English include an iota, a jot, a thing, a red cent, a plugged nickel, a thin dime, a pin, a (living) soul, a stick (of furniture), a stitch (of clothing), an inkling, and a shred (of evidence), among many others. Usually such minimizing indefinites are limited to occurring as a direct object in just one or a few idiomatic VP constructions: e.g. drink a drop, sleep a wink, lift a finger, give a damn, spend a red cent, budge an inch, bat an eyelash, hold a candle to, miss a beat, show a spark of decency, and hurt a fly. In such constructions, the indefinite NP serves as an incremental theme of some sort, though often with a highly idiomatic sense: thus, for example, the fly in hurt a fly seems to denote a minimal unit of harm, while the candle in hold a candle to represents a minimal degree of comparative worth – the degree, that is, to which something shines. Horn (1989: 452–3) notes that at least since Pott (1857) such forms have been recognized as instantiations of a general strategy whereby an expression denoting “a small or negligible quantity” – the minimizing indefinite – may be used to strengthen the expression of a negated proposition. The same point is emphasized by Schmerling (1971), who points out that minimizer formation is a systematic and highly productive feature of English, as it appears to be in many languages.
The strategy is particularly common in Indo-European languages, where it often plays an important role in the grammaticalization of negation itself (Bernini 1987; Croft 1991). Minimizers are found in Sanskrit, Greek, Latin, Persian, Irish, and apparently all Romance, Slavic, and Germanic languages. Indeed, where philologists have looked systematically, novel minimizer constructions are often found in remarkably creative abundance (see especially Möhren 1980 on Old French and Wagenaar 1930 on Old Spanish). Observations from outside Indo-European are more sparse, but the phenomenon has also been noted in Basque, Estonian, Hebrew, Hungarian, Korean, Japanese, Lezgian, and Maltese, among others (Haspelmath 1997: 227). The process is probably used more in some languages (and in some contexts) than in others, but it may well be available and at work in all human languages.

Indeed, some of the most colorful and idiomatic sorts of polarity items have close equivalents in a wide variety of languages. Many languages include polarity items in which the movement of a finger stands metonymically for any minimal effort. Thus, as van der Wal (1996) notes, parallel to the English NPI lift a finger, one finds, among others, the French lever le petit doigt, the Portuguese mover um dedo (Evandro Nobrega, p.c.), the Dutch een vinger uitsteken, the German einen Finger krummachen, the Norwegian å røre en finger, and the Hungarian a kisujját mozdítani. The examples below illustrate comparable polarity items in Russian (Asya Pereltsvaig, p.c.) and Korean (Yoon-Suk Lee, p.c.).

(15)

Ona ne udarit palets o palets chtob pomoch svoei sestre.
she not will-strike finger at finger for-to help her-own sister
‘She wouldn’t lift a finger to help her own sister.’

(16)

ku-nun caki yedongsaeng-ul top-nun-dey sonkarak hana kkattak ha-ci anh-ass-ta.
he-TOP own sister-ACC help-MOD-business finger one lift do-NEG not-PAST-DECL
‘He wouldn’t lift a finger to help his own sister.’

Polarity items are commonplace, but polarity sensitivity itself is an extraordinary phenomenon. It is tempting to dismiss it as a peculiar quirk of grammar, a curiosity in the structure of English (and, as it turns out, in the structure of a great many other languages as well). While it is striking that languages should include polarity items, their existence hardly seems to be an important or even a useful feature of human language. But then this is precisely why polarity sensitivity is so extraordinary, for if polarity items are essentially useless, the question naturally arises as to why languages should contain so many of them, or really why languages should tolerate such peculiar entities at all. This, in fact, is the fundamental mystery about polarity items: why do they exist in the first place? What possible justification can there be for such a phenomenon?

The existence of polarity items challenges common assumptions about language and cognition. Chief among these are the twin principles of compositionality and generality (e.g. Strawson 1970). The principle of compositionality holds that the meaning of any complex expression is a function of the meanings of its parts. The principle of generality requires that the meanings of the parts do not depend on the complex whole to which they contribute. In the simplest terms, these principles simply ensure that the meanings of sentences are systematically related to the meanings of the words they contain. This of course just seems like common sense. As Davidson suggests, if these principles were not in effect “there would be no explaining the fact that we can learn [a] language” (1967: 304). Be that as it may, polarity items seem to be a clear violation of these principles, since not only their particular meanings but often their very meaningfulness in a sentence depends on the properties of the sentence as a whole. The point is not that the existence of polarity sensitivity in itself requires us to abandon these basic assumptions. Still, given such reasonable assumptions, polarity sensitivity is a totally unexpected phenomenon. At the very least, the existence of such a phenomenon raises fundamental questions about the nature of language and cognition.

Polarity sensitivity also raises fundamental questions about the nature of negation. It seems natural to think the expression of negation should be a symmetrical function of the expression of affirmation. The fact that it is not suggests that a view of natural language negation as a simple propositional operator may be inadequate. Negative sentences are not formed simply by adding a negation to a positive sentence. Rather, languages provide a range of constructions which, though they do not themselves actually express negation or affirmation, are specialized to occur in sentences of a given polarity, either negative or positive. It thus appears that the resources a language provides for the expression of negative and affirmative propositions are surprisingly independent of each other.

2.3 The phenomenon of polarity sensitivity

As with many grammatical phenomena, polarity sensitivity involves a dependency between two elements: a polarity item and a polarity trigger. Polarity items are linguistic forms whose distribution is sensitive to polarity. Polarity triggers are linguistic constructions which trigger polarity sensitivities. There are two basic sorts of polarity sensitivity and two sorts of polarity item: negative polarity items occur only in contexts where they are licensed by a polarity trigger; positive polarity items occur only in contexts where they are not blocked by a polarity trigger. Negative polarity items can be thought of as dependent because their felicitous use depends on the availability of a suitable trigger; positive polarity items on the other hand are “anti-dependent” (Progovac 1994) because their felicitous use requires the absence of some class of trigger.

In this section I present a broad outline of the basic phenomenon of polarity sensitivity, beginning with a basic definition for polarity items and proceeding to a broad survey of the various types of polarity items and polarity triggers.

2.3.1 Polarity items

Polarity sensitivity is a distributional phenomenon. Polarity items are forms or expressions whose interpretation or acceptability depends on the polarity of the contexts in which they occur. The simplest way to characterize these forms is with a negative definition: they are remarkable precisely for what they do not do. Negative polarity items (NPIs) can occur in a negative sentence but not in its affirmative counterpart; positive polarity items (PPIs) can occur in an affirmative sentence but not (normally) in its negative counterpart. The examples in (17–22) show that the English expressions at all, in ages, crack a book, ever, all that, and unstressed much are negative polarity items: while the negative (a) sentences sound idiomatic, the affirmative (b) counterparts are distinctly, even ludicrously, jarring.

(17)

a. Hillary isn’t at all interested in continental philosophy. b. *Hillary is at all interested in continental philosophy.

(18)

a. He hasn’t read Spinoza in ages. b. *He has read Spinoza in ages.

(19)

a. Gladys hasn’t cracked a book this quarter. b. *Gladys has cracked a book this quarter.

(20)

a. She hasn’t ever really enjoyed reading Lacan. b. *She has ever really enjoyed reading Lacan.

(21)

a. Lola isn’t all that interested in the human genome project. b. *Lola is all that interested in the human genome project.

(22)

a. She hasn’t been eating much since she started reading La Nausée. b. *She’s been eating much since she started reading La Nausée.

Many positive polarity items show roughly the opposite distribution. The examples in (23–28) illustrate the sensitivity of the PPIs tons of, sort of (sorta), rather, begin to wonder, could just as well, and a fair chance. Note that while these forms are much more natural in the affirmative (a) sentences, they are not all positively hopeless in the negative (b) sentences. In particular, PPIs appear freely with metalinguistic and echoic negations, that is, in rejoinders to their positive uses, and more generally in quoted or precompiled phrases. The following judgments thus hold only for utterances “with normal intonation and no special context” (Baker 1970: 169).

(23)

a. Lola has tons of work to do before Saturday. b. *Lola doesn’t have tons of work to do before Saturday.

(24)

a. The secretary was sort of rude to Gladys. b. *The secretary wasn’t sort of rude to Gladys.

(25)

a. The committee was rather hard on poor Gladys. b. *The committee wasn’t rather hard on poor Gladys.

(26)

a. I’m beginning to wonder why she puts up with it. b. *I’m not beginning to wonder why she puts up with it.

(27)

a. Hugo could just as well have bought a Ferrari. b. *Hugo couldn’t just as well have bought a Ferrari.

(28)

a. Bob has a fair chance of winning the election. b. *Bob doesn’t have a fair chance of winning the election.

In general, PPIs often seem less sensitive than NPIs: their behavior tends to be less constrained, and judgments about them less robust than is the case with NPIs. Horn (1978: 154) notes that PPIs are less common cross-linguistically, less abundant within languages, and generally less regular in their formation than are NPIs. He suggests that this asymmetry reflects the fact that while the licensing conditions for NPIs are overt, those for PPIs are not: NPIs must be triggered; PPIs need only avoid being anti-triggered. Put simply, it may be easier to see that something is present than to notice that something is absent, and so the positive constraints on negative polarity items are in general more robust than the negative constraints on positive polarity items.

The examples above, and those in (6–14) earlier, only hint at the diversity of polarity items, which is much greater than most studies care to admit or can reasonably hope to address. Polar sensitivities of one sort or another can be found in constructions of almost every syntactic sort. Among the better-known English NPIs we find determiners (any, much), conjunctions (let alone, never mind), adverbs (yet, at all, in the slightest), adjectives and adjective phrases (equaled, believable, as bad as all that), at least one preposition (punctual until), nouns and noun phrases (iota, squat, wild horses), auxiliary and catenative verb constructions (need, dare, bother, care), lexical verbs (budge, amount to, matter, shirk), and a wide variety of partially lexically-filled constructions and phrasal idioms (bear comment, grow on trees, think small beer of, set the Thames on fire, be trifled with).

Polarity sensitivity does not seem to discriminate on the basis of grammatical class, semantic affiliation, or even social register. Still, it is hardly a random phenomenon either: there are real patterns to be found in the general rabble of polarity items. Indefinite pronouns and determiners, degree modifiers, and expressions of minimal degree (see Haspelmath 1997; Klein 1998; Schmerling 1971 – respectively) seem particularly prone to polarity sensitivity. Common nouns denoting basic-level categories (i.e. birds, dogs, chairs, books) seem largely immune from the condition. Auxiliaries, adverbs, and particles which mark aspect or modality are frequently polarity sensitive. Color terms, personal pronouns, and spatial prepositions are never polarity items (at least in their most basic senses). While it may be impossible to predict in advance which particular forms in a language will be polarity sensitive, there is clearly some principled force at work here.

2.3.2 Polarity contexts

Polarity contexts – the broad class of contexts which license NPIs – like polarity items themselves, come in many shapes and sizes, and the linguistic constructions which form these contexts – the polarity triggers – can sometimes be hard to spot. While negation is in some sense the prototypical trigger, many triggers have little, if anything, to do with negative constructions. Outside the scope of negation, polarity items also occur in a variety of other non-canonical clauses – in, among others, interrogatives, comparatives, and conditional antecedents (29–31); content clause complements of “adversative” predicates like doubt, deny, regret, and be amazed (32); relative clauses headed by a generic or universal quantifier (33); and in the scope of an exclusive particle like only (34). The contrasts in these invented examples should give some sense of the general pattern: the (a) sentences show the effect of polarity triggers (in bold) on the NPI at all and the PPI rather, and the (b) sentences illustrate the very different effects of some comparable non-triggers.

(29)

a. Are you {at all / *rather} interested in what I’m saying? b. I am {*at all / rather} interested in what you’re doing.

(30)

a. If Gladys is {at all / *rather} late, there may be trouble. b. Since Gladys was {*at all / rather} late, there may be trouble.

(31)

a. She’d sooner die than appear {at all / *rather} drunk in public. b. She is happy to go about {*at all / rather} drunk in public.

(32)

a. I’m amazed that Elly is {at all / *rather} interested in birdwatching. b. I expect that Elly is {*at all / rather} interested in birdwatching.

(33)

a. Everyone who was {at all / *rather} interested in the truth read the report. b. Some folks who were {*at all / rather} interested in the truth read the report.

(34)

a. Only Hugo was {at all / *rather} impressed by her arguments. b. Even Hugo was {*at all / rather} impressed by her arguments.
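The pattern in (29–34) can be restated as a small lookup, offered here only as a toy sketch and not as an analysis. The context labels, the `CONTEXTS` table, and the helper `acceptable` are hypothetical names introduced for illustration; the table does nothing more than record the judgments given in the examples, and it treats acceptability as a yes/no matter, which ignores the metalinguistic and echoic uses of PPIs noted earlier.

```python
# Toy restatement of the judgments in (29-34). All names here are
# illustrative assumptions; nothing below explains WHY a context is a trigger.

# Context label -> does the context contain a polarity trigger?
CONTEXTS = {
    "interrogative (29a)": True,
    "plain declarative (29b)": False,
    "conditional antecedent (30a)": True,
    "since-clause (30b)": False,
    "comparative (31a)": True,
    "complement of 'happy' (31b)": False,
    "complement of 'amazed' (32a)": True,
    "complement of 'expect' (32b)": False,
    "relative clause under 'everyone' (33a)": True,
    "relative clause under 'some' (33b)": False,
    "scope of 'only' (34a)": True,
    "scope of 'even' (34b)": False,
}

# The two items tested in the examples and their sensitivity type.
ITEMS = {"at all": "NPI", "rather": "PPI"}


def acceptable(item, context):
    """NPIs require a trigger; PPIs require the absence of one."""
    triggered = CONTEXTS[context]
    if ITEMS[item] == "NPI":
        return triggered
    return not triggered


for ctx in CONTEXTS:
    print(f"{ctx:40} at all: {acceptable('at all', ctx)!s:5} "
          f"rather: {acceptable('rather', ctx)}")
```

In every row the two items receive opposite values, which is just the complementary distribution the examples display. The table itself is stipulated by hand: it lists the triggers without saying what they have in common.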

This sort of data clearly suggests that whatever it is polarity items are sensitive to, it’s not just negation. Or, if it is negation, then it’s not really clear just what it is negation is supposed to be. Given the diversity of polarity triggers, it seems unlikely that any pretheoretical notion of negation or negative contexts will help much in explaining the distribution of polarity items. Indeed, if anything, polarity items seem more likely to serve themselves as diagnostics of negativity. As Horn (1978: 151) puts it:

If NPIs are “les satellites de la négation” (Gaatone 1971), the sun around which they revolve is often obscured, partially eclipsed, or just strongly imagined, as the negative elements which function as their “triggers” may be incorporated, distant, or implied. NPIs thus serve to define the notion of “negative environment.”

But naturally, if we want to explain the distributions of polarity items themselves, this sort of approach simply will not do. We cannot base our notion of negative environment on the distribution of NPIs, and then explain the distribution of NPIs in terms of our notion of negative environment. That would be circular.

The first problem of polarity sensitivity, then, is to find some way of characterizing this strange assortment of grammatical environments as a natural class. Klima (1964: 313) first addressed this problem by suggesting that polarity contexts share a “grammatico-semantic” feature, affective, which non-licensing environments lack. This stipulation, of course, did not solve the problem so much as just give it a name. Still, by naming this problem, Klima did effectively set the terms of debate. Since then, the nature of “affectivity” – whatever it is, the thing that polarity sensitive items are sensitive to – has been at the heart of a literature in which the central bone of contention has always been the division of labor between grammar (i.e. syntactic structures) and semantics (i.e. logical forms) in determining the distribution of polarity items.

2.4 Basic mysteries: three problems of polarity sensitivity

Polarity sensitivity is not a simple phenomenon. English alone contains hundreds, perhaps thousands, of distinct polarity items, and even within English different polarity items vary widely in the precise nature of their sensitivities. Naturally, the situation only gets worse when one considers other languages. The bewildering variety of polarity items both within and across languages raises the question whether one should think of polarity sensitivity as a unified phenomenon in the first place. And the intricacy of the distributional facts associated with even a single polarity item seems to ensure that any theory of polarity sensitivity will almost have to be an oversimplification.

Still, one mustn’t despair prematurely. Polarity sensitivity may not be a simple phenomenon, but it is not an incoherent one either. This much, at least, is clear – that there exists a class of constructions, polarity items, which are, or tend to be, sensitive to a broad class of affective contexts. At this level of generality, anyway, there is something which needs to be explained.

In order to count as an adequate explanation, any theory of polarity sensitivity should address at least three basic problems (Israel 1996; Ladusaw 1996):

• Licensing: How do polarity contexts license polarity items? What makes something a polarity context? What, if anything, makes these contexts a natural class?
• Sensitivity: What makes a construction polarity sensitive? What do polarity items have in common that makes them sensitive to polarity contexts?
• Diversity: Why are there so many polarity items and what can account for the differences between them? What makes polarity items a natural class and what are the ways in which polarity items can differ from one another?

Since polarity items are defined in terms of their distributions, the Licensing Problem is in some sense the fundamental problem of polarity. Any account of polarity sensitivity must explain what it is that makes these distributions somehow coherent. The very concept of polarity sensitivity depends on there being some principled solution to this problem. If the peculiar distributions of polarity items were merely the arbitrary product of chance, there would be no reason to think of polarity sensitivity as an explainable phenomenon in the first place. In essence, licensing is a combinatorial problem: the question is when can a polarity item felicitously combine with a given grammatical context.

I follow Ladusaw (1996: 326) in distinguishing two basic problems within the Licensing Problem. First, there is the Licensor Question: this is the problem of how to define the class of triggers which can license a polarity item. Assuming the class of triggers can be clearly delimited, the next problem is the Licensing Relation Question: how to specify the precise relation which must obtain between a licensor and a licensed polarity item. Obviously, one can’t just stick a licensor anywhere in a sentence and expect it to license a polarity item anywhere else. The NPI at all is not licensed in a sentence like *That is not the woman who I love at all. Presumably, in order for an NPI to be licensed, it must somehow count as being in the scope of the licensor. Precisely how the relevant notion of scope should be formulated, whether it is essentially a syntactic or a semantic property, and indeed, whether or not the scope relation is in itself a sufficient constraint on polarity licensing, are questions at the heart of the Licensing Problem.
While the Licensing Problem is, in a broad sense, a problem of syntax, concerned with the syntagmatic principles which govern sentence formation, the Sensitivity Problem is, in turn, a problem of lexis, concerned with the paradigmatic relations which hold between polarity items in the lexicon. The most basic problem of sensitivity is the issue of lexical marking, what Ladusaw (1996: 329) calls the Licensee Marking Question: what marks polarity items as polarity sensitive, and what distinguishes them from similar, polarity-insensitive items within the lexicon? In this sense, sensitivity is effectively the mirror image of licensing, asking what it is that speakers may know about polarity items that could make them so peculiarly sensitive to whatever it is about polarity contexts that makes them polarity licensors.

But the problem is bigger than that. Ideally, a solution to the Sensitivity Problem would allow us to predict the precise distribution of any polarity item simply on the basis of the item’s conventional meaning. Ultimately, such perfect predictability of patterning is probably unobtainable. Even so, the explanation of sensitivity should do more. If, as I have suggested, the basic mystery of polarity sensitivity is the very fact that polarity items exist at all, then the real question is not just what sorts of features mark polarity sensitivity, but also why such features happen to exist in the first place. What is it that makes languages include these features that make for such peculiar expressions? Where do these features come from, and how do they come to be associated with individual polarity items? The answers to these questions should help us understand why certain types of expressions are particularly prone to polarity sensitivity while others seem resistant to, or even immune from, the phenomenon. And they should explain why whatever it is that polarity items do have in common should apply so widely, and have such uniform effects, across such a heterogeneous collection of expressions.

While the Licensing and Sensitivity Problems comprise the core issues for a theory of polarity sensitivity, any fully adequate theory must also confront the significant vexations posed by the Diversity Problem. Polarity sensitivity is not a homogeneous phenomenon.
Both within and across languages, polarity items can and do vary widely in terms of their syntactic class, semantic content, prag matic function, and, perhaps most disturbingly, in terms of their distributions across polarity contexts. In order to accommodate such diversity, a theory must therefore be general enough to explain the broad patterns of polarity sensitiv ity, and yet flexible enough to handle a considerable degree of variation among individual polarity items. While numerous typological studies have focused on the expression of neg ation itself, there has been little explicitly comparative work on polarity items or polarity sensitivity per se. Still, one can get a sense of just how varied such sensitivities can be from Haspelmath’s (1997) typological study of indefinite pronouns, which drew on a small sample of forty languages, supplemented by data from reference grammars for a genetically and areally balanced sample of a hundred languages.


[Semantic map with nine function nodes: (1) specific known; (2) specific unknown; (3) irrealis non-specific; (4) question; (5) conditional; (6) indirect negation; (7) direct negation; (8) comparative; (9) free choice]

Figure 2.1 Haspelmath’s semantic map of indefinite functions

Haspelmath was interested in the ways languages mark indefinite nominals, and in the ways such marking can vary. What he found is that most languages contain several series of indefinite pronouns which vary in the types of reference they allow. A series generally has (at least) one member for each of the major ontological categories – person, thing, place, time, manner, and amount. Although a very few languages may have one or no grammaticalized indefinite pronouns (Lango, Boumaa Fijian, Somali, Yimas), most have at least three distinct series of indefinites (Italian, Irish, Lezgian), and some have as many as five (French, Latin, Lithuanian), six (Korean), or even seven (Russian). By my count English has at least five indefinite series: a some- series (someone, something, someplace, sometimes, somehow, etc.), an any- series (anyone, anything, anyplace, etc.), a no- series (nobody, nothing, no place, etc.), a wh- series (who, what, how), and an -ever series (whoever, whatever, whenever, etc.).

Different kinds of indefinites vary in the kinds of semantic functions they can fulfill. Haspelmath (1997: 64) distinguishes nine major functions for indefinites based on the types of contexts in which they can be used. These are: (1) specific – known to speaker, (2) specific – unknown to speaker, (3) nonspecific – irrealis, (4) questions, (5) conditional antecedents, (6) indirect negation, (7) direct negation, (8) the standard of comparison, (9) free choice reference. Haspelmath proposes that these nine functions form a semantic map of indefinite reference, as in Figure 2.1, and that this map places clear constraints on the functions which any given form may serve.

In effect, the map provides a geometric representation for a proposed implicational universal: the claim is that if an indefinite form fulfills more than one function, then all its functions must be connected in an unbroken chain.
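The "unbroken chain" requirement can be read as a connectivity condition on a graph, and a minimal sketch may make that reading concrete. The node labels below are the nine functions listed in the text. The list of links between them is an assumption: it follows the way the map is commonly drawn and cannot be recovered from this text, so it should be checked against Haspelmath (1997). The names `EDGES`, `ADJACENT`, and `is_contiguous` are purely illustrative.

```python
from collections import deque

# The nine functions, numbered as in the text (Haspelmath 1997: 64).
FUNCTIONS = {
    1: "specific known",
    2: "specific unknown",
    3: "irrealis non-specific",
    4: "question",
    5: "conditional",
    6: "indirect negation",
    7: "direct negation",
    8: "comparative",
    9: "free choice",
}

# ASSUMPTION: these links reflect how the map is usually drawn.
# They are not stated in the text and should be verified.
EDGES = [
    (1, 2), (2, 3), (3, 4), (3, 5), (4, 5),
    (4, 6), (5, 8), (6, 7), (6, 8), (8, 9),
]

# Build an undirected adjacency table from the edge list.
ADJACENT = {node: set() for node in FUNCTIONS}
for a, b in EDGES:
    ADJACENT[a].add(b)
    ADJACENT[b].add(a)


def is_contiguous(functions):
    """Return True if the given functions form a connected region of the map."""
    wanted = set(functions)
    if not wanted:
        return True
    start = min(wanted)
    seen = {start}
    queue = deque([start])
    # Breadth-first search that never leaves the set of wanted functions.
    while queue:
        current = queue.popleft()
        for neighbor in ADJACENT[current] & wanted:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == wanted
```

Under the assumed links, `is_contiguous(range(4, 10))` holds for a form covering functions (4) through (9), whereas `is_contiguous({4, 7})` fails, because questions and direct negation are joined only by way of indirect negation (6).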
The map predicts, for example, that there will be no language with an indefinite that can be used in direct negations (I didn’t kiss anyone) and in questions (Did you kiss anyone?) but not with indirect negations (I don’t think she kissed anyone). And it predicts that any sort of indefinite construction which does not violate these constraints might be found in at least some human language.

The map also provides a useful (if rather crude) way of categorizing indefinite constructions according to the functions in which they occur. The prototypical English NPI any, for example, is found in contexts (4–9). Many languages (e.g. Arabic, Russian, Serbo-Croatian, and Spanish) have indefinites which occur only with an explicit, tautoclausal sentence negation (context 7). And many languages have indefinites which occur only in contexts (6–7), that is, with tautoclausal negation, and with more indirect or long-distance negations (with so-called bridging negations or in the complements of inherently negative predicates like doubt or deny), but not in questions, conditionals, or comparatives (Korean, Japanese, Serbo-Croatian).

Still other languages include forms which are sensitive, but yet not so sensitive as even the liberal English any. Forms like German irgend- (as in irgendwer ‘anyone’ and irgendwas ‘anything’) and Modern Greek kanénas ‘anyone’ or típota ‘anything’ are blocked in certain affirmative sentences but do occur in some contexts hostile to even the most liberal of English NPIs – among others, in disjunctions, in the scope of a modal, in future or habitual clauses, in commands, and in the scope of intensional verbs like hope or want. Giannakidou (1998, 1999, 2006) proposes that such forms are sensitive to the semantic property of non-veridicality, and so (roughly) can appear only in contexts where their use does not entail the existence of a referent (see §8.4.3). If nothing else, constructions like these show that sensitivity in general is not a monolithic grammatical phenomenon.

Haspelmath’s semantic map offers a radically different way of conceptualizing the nature of sensitivity from that found in standard theories based on necessary and sufficient syntactic or semantic conditions.
The semantic map emphasizes local relations and family resemblances among licensors, and suggests how certain sets of constructions may form more or less coherent natural categories. In this light, polarity sensitivity is not, and should not be expected to be, a single, discrete phenomenon; rather, constructions may exhibit a range of possible sensitivities, with sensitivity to negation being one of several salient sorts.

In any case, the diversity problem is not just a problem for typologists. Even a single language can contain a dizzying array of sensitivities. The complexity of the problem is apparent when one considers even a small number of English polarity items. Certain NPIs appear to be very liberal in their licensing requirements, allowing a significant range of licensors which do not work nearly as well for most other polarity items. The examples in (35) give a small sample of contexts which can license the NPI ever. As the examples in (36) demonstrate, many of these contexts do not so easily allow the less liberal NPI would dream of.

(35) a. Glinda has *(hardly) ever robbed a liquor store.
     b. I didn’t realize that Glinda had ever robbed a liquor store.
     c. At most three of my friends have ever robbed a liquor store.
     d. I would be impressed if she ever robbed a liquor store.
     e. Only people who have ever robbed a liquor store can understand the thrill.

(36) a. Glinda would *(hardly) dream of robbing a liquor store.
     b. *I didn’t realize that Glinda would dream of robbing a liquor store.
     c. *At most three of my friends would dream of robbing a liquor store.
     d. *I would be impressed if she would dream of robbing a liquor store.
     e. *Only people who have dreamed of robbing a liquor store can understand the incomparable thrill it offers.

The diversity of polarity items’ polarity sensitivities is an endlessly nettlesome problem, and the worst of it is that the facts themselves are by no means easy to establish. Some of the examples in (36) are, at least potentially, salvageable, but paradoxically, one of the things which helps them most is the addition of more NPIs: all of these examples improve at least a little if the liberal ever is inserted before the word dream. Apparently then, certain forms which require a weak polarity trigger themselves can also strengthen the licensing potential of the contexts in which they occur. This phenomenon is sometimes referred to as secondary, or parasitic, triggering (Horn 2001).

Moreover, as it turns out, some NPIs are sometimes licensed in contexts where it is difficult, at best, to discern any clear trigger. On the one hand, minimizer constructions like the shred of evidence in (37) normally require a negative license of some sort, but as the attested examples in (c–e) show, they are also sometimes used in positive contexts to denote a literally minimal amount.

(37) a. There’s not a shred of evidence to suggest that he’s a pedophile.
     b. *There is a shred of evidence to suggest that he’s a pedophile.
     c. We’ll take a shred of evidence and try to turn that into a story. (from Frontline, “Tabloid Truth: The Michael Jackson Story”)
     d. Go ahead. Get her on the witness stand and try her with your shreds of evidence. (Mr. Liedecker in the movie Laura)
     e. There is a shred of substance in these claims. (British National Corpus [BNC])

In some of these examples, the apparently unlicensed minimizer might be acceptable because of an ironic implicature: that is, the reference to a ‘shred of evidence’ suggests that the evidence referred to is scant or nonexistent. But not all unlicensed NPIs come with such negative connotations. The examples in (38) illustrate a variety of other NPIs (at all, ever, any, give a thought, and bother to) in apparently unlicensed contexts.

(38) a. It’s nice to sit at a table with a candle at all. (E. Lowson, conversation, 1997)
     b. The tone [of Germaine Greer’s attack on manufacturers of vaginal deodorants] wasn’t light-hearted, which might have justified touching the subject at all. (C. McCabe, S.F. Chronicle; cited in Horn 1978: 153)
     c. Because of what he did with the tiny fleet over the next month, Perry became one of the few military men of his time who has retained much fame. Yet, in retrospect, he had already done his greatest work: building the Eerie fleet and getting it onto the lake at all. (Smithsonian Magazine, Jan. 1995: 31. Article on Captain Oliver Perry)
     d. Barbara Biewener impressed Public Eye by knowing any of California’s state song, even if she did think “And I know when I die / I will breathe my last sigh / for my sunny California” was “And I know when I die / I will leave my glass eye …” (San Diego Union Tribune, C-2, May 18, 1993, italics in original)
     e. Sensitive Man as portrayed in popular culture was always a caricature, of course. But the signs of his discrediting have been building, along with male confusion. (We speak of those heterosexual men, mainly in their 30’s, 40’s, and 50’s, who ever gave a thought to any of this.) (New York Times, May 8, 1994; cited in Horn 2001: 7)
     f. The reason one ever bothers to decant a wine is to leave the sediment … behind in the bottle. (SouthWest Airlines Spirit Magazine, August 1994: 47)

Horn refers to the licensors in such cases as “Flaubert triggers” since, like “God in the deist universe and the author in the Flaubertian novel, so is negation [in these examples]: everywhere present yet nowhere visible” (2001: 176–7). Such examples do occur, but they are difficult to find. In examining over a thousand tokens of the NPI ever from the Wall Street Journal (WSJ), I could find no more than one or two comparable examples without some sort of overt trigger.

While the liberal NPIs in these examples may seem excessively tolerant of some very dubious triggers, more conservative NPIs can have much more exacting standards, often resisting, or flatly refusing, licensing in many standard polarity contexts. The use of until with punctual verbs, in (39), and the aspectual operator yet, in (40), are typical examples of such hyper-sensitive polarity items. As the examples show, neither form is licensed in a comparative or in the restriction of a universal quantifier, punctual until is unlicensed in questions, and yet is unlicensed with the weakly negative adverb rarely.

(39) a. I doubt any of them will come home until midnight.
     b. *They’d rather dance all night than come home until dawn.
     c. *Everyone who came home until dawn managed to get some sleep.
     d. *Did Zelda come home until dawn?
     e. Zelda rarely comes home until dawn.

(40) a. I doubt any of them will come home yet.
     b. ?They’d rather dance all night than come home yet.
     c. *Everyone who came home yet is probably watching T.V.
     d. Did Zelda come home yet?
     e. ??Zelda rarely comes home yet.
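One way to see what the judgments in (39) and (40) imply is to ask whether the sets of contexts licensing each item are nested. If contexts could be ranked on a single scale of licensing strength, then of any two items, one item's licensing contexts should include the other's. The sketch below encodes the marks exactly as given in (39) and (40) and tests that nesting condition. It rests on one simplifying assumption, flagged in the code: only unmarked examples count as licensed, so "?" and "??" are treated as failures. The context labels and function names are illustrative, not terms from the literature.

```python
# Judgment marks copied from examples (39) and (40); "" means fully acceptable.
JUDGMENTS = {
    "punctual until": {
        "doubt": "",
        "comparative": "*",
        "universal restriction": "*",
        "question": "*",
        "rarely": "",
    },
    "yet": {
        "doubt": "",
        "comparative": "?",
        "universal restriction": "*",
        "question": "",
        "rarely": "??",
    },
}


def licensed_contexts(marks):
    """Contexts in which an item is licensed.

    ASSUMPTION: only unmarked examples count as licensed, so marginal
    judgments ("?", "??") are grouped with "*".
    """
    return {context for context, mark in marks.items() if mark == ""}


def fits_single_hierarchy(judgments):
    """True if every pair of items has nested sets of licensing contexts."""
    sets = [licensed_contexts(marks) for marks in judgments.values()]
    return all(
        a <= b or b <= a
        for i, a in enumerate(sets)
        for b in sets[i + 1:]
    )
```

With these judgments, punctual until is licensed by doubt and rarely, while yet is licensed by doubt and in questions. Neither set contains the other, so `fits_single_hierarchy(JUDGMENTS)` is false: no single ranking of these five contexts by strength can accommodate both items.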

If the judgments here count for anything, they suggest that the Diversity Problem cannot be solved just by positing a hierarchy of negative contexts ranked in terms of licensing strength. The idea that there should be such a hierarchy remains a surprisingly common assumption in the study of negative polarity items (Horn 1970; Edmondson 1981, 1983; Krifka 1994, 1995; Zwarts 1996, 1998; van der Wouden 1997); however, these facts suggest that the distributions of different polarity items might have less to do with abstract levels of licensing strength than with the particular combinatorial properties of individual polarity items.

2.5 Varieties of polarity sensitivity

While the twin problems of licensing and sensitivity are, in some sense, the essential issues for an account of polarity items, in the end it is the diversity problem which prevents any single, definitive analysis from emerging for the phenomenon of polarity sensitivity. The diversity of these forms obviously makes it difficult for any single explanation to apply to all polarity items. But worse than that, it raises the question of whether or not there is even a coherent topic here to explain in the first place. It is thus crucial that we should be clear from the beginning as to what we can and cannot explain.

Polarity sensitivity is a coherent phenomenon, but it is not a particularly discrete one. As a group, polarity items do not form a simple, well-defined class of constructions but look rather more like an extended (and often unruly) family, where everyone is related to someone but not everyone is related to each other.

In section 2.3.1, above, I began with a simple and deliberately broad definition: any form which can occur in a sentence of one polarity, but not in the corresponding sentence of the opposite polarity, counts as a polarity item. This definition casts a wide net in order to identify as fully as possible the ways in which polarity may affect the use and distribution of linguistic constructions. In this section I will explore some of the more marginal phenomena included under this definition in order to highlight, by way of a few negative examples, some of the characteristics of more prototypical polarity items.

In the broadest terms, polarity sensitivity simply reflects the general fact that linguistic forms are not evenly distributed across polarity contexts. Certain forms, for whatever reason, tend to be used disproportionately in negative contexts, while others tend to occur almost exclusively in affirmative contexts. And, for whatever reason, certain grammatical constructions – questions, conditionals, comparatives, adversatives, etc. – form environments which pattern with negation in their effects on polarity items. In this work I am concerned primarily with the broad class of polarity items which are sensitive not just to polarity proper – the contrast between negative and affirmative sentences – but also to this wider class of “polarity contexts.” Ultimately, however, we must recognize that these forms, for all their diversity, may reflect but one of the many ways in which polarity skews the distributions of linguistic constructions.

2.5.1 Semi-polarity items, sometime polarity items

The asymmetrical distribution of forms across positive and negative contexts manifests itself in a variety of ways. At one end of the spectrum one finds forms which can occur in either context but which in actual usage show a marked preference for one polarity or the other. Hoeksema (1994), for example, discusses “semi-NPIs” like the English psychological verbs care about, matter, and bother. Since these forms do occasionally occur in affirmative contexts (e.g. this fact matters and I do care about it), they are not, strictly speaking, NPIs. They are, however, sensitive to polarity. Hoeksema offers corpus data showing that these verbs occur much more frequently in negative contexts than they do in positive contexts, despite the fact that negative contexts themselves are much less frequent than positive contexts in his corpora.

Interestingly, as Hoeksema points out, there are significant patterns among the class of semi-NPIs: certain grammatico-semantic fields seem to be fertile breeding grounds. In English and Dutch, for example, one finds an abundance of NPIs and semi-NPIs among verbs of indifference (care, mind, bother), verbs of intolerance (stand, bear, abide), and verbs of minimal degree (budge, breathe a word, bat an eyelash). These and similar lexicalization patterns turn out to be very robust cross-linguistically. The consistency of these patterns clearly suggests that polarity items do not arise arbitrarily. As Hoeksema suggests, certain expressions may, for purely pragmatic reasons, occur disproportionately in negative contexts, and over time such asymmetric patterns of occurrence may grammaticalize into strict constraints.

Typically, the constraints on polarity items are not properties of lexical items per se, but apply rather to particular senses of a given lexical item as it is used in particular contexts. Many words and constructions seem to develop specialized senses in negative contexts, and it is these specialized senses which ultimately lead the way to full polarity sensitivity. In contrast with the semi-polarity items, which show skewed but not fully polarized distributions on a given lexical sense, these “sometime” polarity items may display an absolute preference for a certain polarity within the limits of a certain narrowly defined usage. The word exactly, for instance, can indicate the precision of a quantitative measurement in any sentence, as in (41). But the use in (42), where it qualifies a predicate NP, is limited to negative contexts in which it forms a superficially weak, understating denial.

(41) a. Winthrop is exactly half as old as his sister.
     b. Circe isn’t exactly half as old as her brother.

(42) a. Winthrop isn’t exactly a saint.
     b. *Winthrop is exactly a saint.

One finds the opposite situation with the semantically quite similar quite. In (43) quite works in both negative and positive contexts to qualify adjectives like sober and ready, which include in their basic meaning some notion of a limit or threshold of applicability. As Bolinger (1972: 119) puts it, quite in these examples is “oriented toward a beginning point” for the modified predicate. In positive sentences, however, quite can work as an intensifier for virtually any gradable predicate, whether or not it incorporates an inherent threshold. As the examples in (44) illustrate, this usage is barred from negative contexts.

(43) a. Circe was quite sober, but she wasn’t quite ready.
     b. Winthrop wasn’t quite sober, but he was quite ready.

(44) a. Circe was quite disappointed by his bad manners.
     b. *Circe wasn’t quite disappointed by his bad manners.

In other cases an expression which has apparently the same semantic content in several different constructions may be polarity sensitive in only one type of construction. A particularly clear example is the semi-auxiliary use of need in English. The examples in (45) show that as a main verb taking an infinitival to complement, need occurs freely in both negative and positive sentences; however, in (46) as an auxiliary verb, shunning inflection and taking a bare stem verbal complement, need requires a polarity trigger and so is ungrammatical in the simple affirmative (46d).

(45) a. He doesn’t need to talk to you.
     b. He needs to talk to you.

(46) a. He needn’t ask your permission.
     b. Need he ask your permission?
     c. I doubt you need ask his permission.
     d. *You need ask his permission.

Finally, it is important to recognize that polarity sensitivity, broadly construed, need not involve an absolute aversion to a given polarity, but may appear more subtly as a preferential pattern of occurrence across polarity contexts. This can be seen in the use of particularly as a degree modifier. The examples below show that this form is normally not polarity sensitive, occurring freely both in positive assertions and directly in the scope of negation.

(47) a. Circe is particularly charming.
     b. I am particularly interested in the grammar of Tongan.

(48) a. Natasha is not particularly charming.
     b. Paul is not particularly interested in the grammar of Tongan.

When we consider more examples, however, a subtle asymmetry (at least for some speakers) begins to appear. In the negative examples of (49) particularly is unexceptional in modifying a predicative adjective applied to a quantified subject. But in the affirmative examples of (50) the use of particularly is less easily accommodated.

(49) a. None of her friends are particularly interesting.
     b. Few of her friends are particularly interesting.
     c. I doubt that many of her friends are particularly interesting.

(50) a. ??Many of her friends are particularly interesting.
     b. ??Several of her friends are particularly interesting.
     c. ??All of her friends are particularly interesting.

In order to work in these contexts, particularly itself must contribute focal information and must be understood in terms of an implied contrast with other friends who might also be interesting, but not particularly interesting. The judgments here are, admittedly, rather subtle, but the facts do seem to point to a basic asymmetry in the distribution of particularly across polarity contexts. I will not attempt to explain this particular asymmetry but will content myself with the simple conclusion that sensitivities to polarity may be more pervasive and less conspicuous than is often assumed.

While some might be inclined to disregard the subtle sensitivities of forms like particularly, quite, and exactly, the general pattern here is actually typical of what one finds with almost all polarity items. Even the most robustly polarity sensitive forms tend to have usages which belie their status as polarity items. The NPI any, for example, occurs happily in affirmative contexts as a free choice item, as in Circe will dance with anyone. And the NPI ever occurs in a variety of constructions without a polarity trigger, for example in She’s getting ever more reckless or The party was ever so much fun. If we were to exclude forms which are only partially polarized from consideration as polarity items, we would virtually define the phenomenon out of existence. Polarity sensitivity in general applies to particular expressions as they are used in particular, and sometimes extremely specific, constructions.

2.5.2 Polarity sensitive morphology

Normally, polarity sensitivity is understood as a problem of sentence grammar. The present work is no exception. Polarity items are defined in terms of their distributions in sentences: they are constructions whose felicitous use depends on the properties of the sentences in which they occur. Ultimately, however, the phenomenon may not be so limited. As van der Wouden (1997) points out, polarity sensitivity may manifest itself in the morphological structure of words, as a constraint on the derivations of a polarity sensitive morpheme.

The obvious examples in this respect are lexical roots which can only occur with overt negative morphology. One can be unkempt, but not merely *kempt; uncouth, but not *couth; disheveled, but not *sheveled; nonchalant, but never *chalant; unruly and unrivaled, but never merely *ruly or *rivaled. These roots might be called morphological NPIs since it appears that they can only be licensed in words with negative derivational morphology (see Appendix for further examples in English). There is also no shortage of forms which resist such morphology, and which might therefore be thought of as morphological PPIs. One can be either happy or unhappy, but never *unsad, *unecstatic, or *unmiserable; and while one might like something or dislike it, one can’t really *dislove or *dishate anything (Horn 1989: 274–5).

While morphological polarity sensitivity is clearly a real phenomenon, its significance and its relation to other sorts of polarity sensitivity are less than obvious. Crucially, a polarity item must be the sort of linguistic item that one would expect to combine freely in novel constructions, otherwise its failure to occur in constructions with the wrong polarity would not seem remarkable. But the adjectives *couth, *kempt, and *chalant simply do not exist as independent units of contemporary English, even if the existence of forms like uncouth, unkempt, and nonchalant makes it look as if they should.

It is not necessary that these roots are somehow grammatically constrained to occur only with the un- prefix. Presumably, they are simply learned that way and so listed in the lexicon as subparts of the complex un-prefixed forms in which they appear. Assuming there is some division of labor between grammar and lexicon, there is no particular reason to hold the grammar responsible for gaps in the lexicon.

Still, it is not unlikely that some of the same forces which lead to the development of syntactic NPIs and PPIs may be at work in the evolution of morphological polarity items as well. This is an open question, and as far as I know one which has not been systematically addressed in the literature so far. For the purposes of this work, I am sorry to say, it will remain an open question.

2.5.3 Inherently negative idioms

In order for something to be a polarity item, it must first be an “item,” and if the item in question does not occur with, but itself contains, negation, then it is not a polarity item but simply a negative idiom. Just as certain morphological roots come to be frozen under a negative affix, one might expect to find a class of larger phrasal idioms in which a negative word of some sort – no or not or never – has become a frozen constituent. Such an idiom might appear to contain a negative polarity item, but in fact it would not.

Actually, it is rather difficult to find examples of phrasal idioms with a rigidly fixed negation. Elsewhere (Israel 1998a) I have suggested that the English expressions not half bad (as in The show wasn’t half bad) and can’t seem to (as in I can’t seem to get the hang of it) only occur with an overt, tautoclausal sentence negation. But this was wrong. In fact, the can seem to construction occurs with a wide range of licensors (Langendoen 1970; Jacobson 2006), as can be seen in (51) – (51a, b = Jacobson’s 9–10). And while half bad does require overt negation of some sort, the attested examples in (52) show that it need not be adjacent to or even in the same clause as its licensor, and so cannot be treated as simply part of a not half bad idiom.

(51) a. Few can seem to fathom how he could be so popular.
     b. Only John can seem to stomach watching reruns of the 6th game of the 1986 Series.
     c. That’s all I can seem to think of. (BNC)

(52) a. Not that he looked half bad himself, he thought … (BNC)
     b. Considering how old I’ll be on my next birthday, I don’t think it’s at all halfbad. (found using Google, May 2008)
     c. I didn’t think he was half bad, though his character seemed a little over-the-top. (www.dvdverdict.com/printer/speed2.php)

It thus seems clear that both half bad and can seem to are fairly ordinary NPIs – sensitive constructions which require a negative licensor of some sort, and not fixed expressions which just happen to include a negation.

It is not, of course, that such negative idioms do not exist. For example, Palacios Martinez’s (1999) analysis of 550 “negative polarity idioms” includes many expressions with a near perfectly frozen negation; however, these are not regularly recombinable constructions at all, but fixed sayings – e.g. Rome was not built in a day; One swallow does not a summer make; No pain, no gain; There’s no place like home; Dead men tell no tales; and Don’t do anything I wouldn’t do. Presumably the expression of negation is fixed in these idioms because the idioms themselves are stereotyped ways of forming an utterance. In a sense, then, they are the syntactic equivalent of the morphological NPIs (§2.5.2): they do not consist of a sensitive item in construction with a licensor, but are rather learned as complex units which just happen to include a negative morpheme.

2.5.4 Negative concord and the Jespersen cycle

There is one important class of constructions in the languages of the world in which something that does look very much like a polarity item is constrained to a very limited range of syntactic positions in the scope of sentential negation. The problem here is that it is difficult, if not impossible, to distinguish the polarity item from the negation which licenses it.

The term negative concord refers to any construction in which multiple, apparently negative expressions – n-words – combine in a single clause but together express only a single logical negation. While such constructions are often considered illogical, they are not only widespread but actually appear to be the preferred pattern for negatively quantified sentences cross-linguistically. The examples below, from Progovac (1994: 40–1), illustrate the phenomenon as it appears in Serbo-Croatian.

(53) a. Milan *(ne) vidi nista.
        Milan not sees nothing
        Milan cannot see anything.
     b. Ni(t)ko *(ne) vidi Milan-a.
        no one not sees Milan
        Nobody can see Milan.
     c. Milan to ni-kako *(ne) odobrava.
        Milan that no-way not approves
        Milan does not approve of that at all.

As the examples show, the n-words nista, ni(t)ko, and ni-kako are ungrammatical without an overt expression of negation (ne) to license them. In this respect, they are just like polarity items everywhere; but unlike ordinary polarity items, these forms are themselves morphologically negative, incorporating the negative prefix ni. They are thus not merely dependent on negation but are themselves inherently negative. But then if they are negative, it is not obvious how a compositional semantics should explain why the two occurrences of negation in these sentences do not logically cancel each other.

The ambiguous negativity of n-words is particularly clear in a language like Spanish. As the examples below illustrate, the Spanish n-word nadie fluctuates between uses in which it looks like a polarity item and others where it just seems to mean no one (Vallduví 1994; Aranovich 1996).

(54) a. María *(no) ha visto a nadie.
        Maria hasn’t seen anybody.
     b. Nadie (*no) ha visto María.
        No one has seen Maria.
     c. ¿A quién has visto? A nadie.
        Who have you seen? Nobody.

In (54a), where it occurs in object position, nadie has to be licensed by the negative clitic no. Here nadie functions like the English polarity item anyone and is glossed as such. But in (54b), where nadie occurs in subject position, and in (54c), as a response to a question, nadie seems to be inherently negative, and so is glossed as no one.

Forms like nadie seem somehow puzzlingly midway between an NPI like English any and a negated quantifier like nobody. On the one hand, like a strong NPI, they can require an overt negative licensor, as in (54a). On the other hand, in subject position (54b) and in elliptical responses (54c), rather than depending on a negative licensor, they seem to incorporate and express negation themselves.

Different approaches to negative concord tend to emphasize the different sides of n-words’ ambiguous nature. One line of reasoning, pursued in Laka (1990), treats all n-words as NPIs and formulates special treatments to explain their occurrence in contexts like (54b, c) without an overt licensor. Another line of reasoning, pursued, for example, by Zanuttini (1991) and Vallduví (1994), holds that all n-words are actually negative quantifiers, like the English quantifiers nothing and no one. The problem for this approach is to explain how multiple n-words in a single sentence can effectively express a single semantic negation. A third strategy, pursued in van der Wouden and Zwarts (1993), van der Wouden (1997), and Dowty (1994), steers a middle course by allowing a context-sensitive meaning assignment for n-words, in effect claiming that n-words are sometimes NPIs and sometimes negative quantifiers.

Ultimately, the key to a theory of negative concord lies in finding a way to capture the commonalities between n-words and polarity items while still respecting their differences. One promising approach, developed in Ladusaw (1992), Aranovich (1996), and Giannakidou (1998), is to treat n-words as indefinites which, like NPIs, are semantically dependent on negation, but which are subject to much more stringent constraints on their distribution. Aranovich in particular stresses the differences between English NPIs and Spanish n-words, noting that the latter allow a much narrower range of licensors and are subject to much tighter locality constraints. He concludes that while the distribution of n-words involves tight syntactic constraints similar to those governing wh-movement, the constraints on polarity licensing are looser and are more strongly influenced by pragmatic factors (Aranovich 1996: 6). The advantage of such an approach is that it permits a view of negative polarity and negative concord as situated at opposite ends on a cline of grammaticalization and syntacticization. As I will argue throughout this work, polarity sensitivity is fundamentally a pragmatic phenomenon. The distorted and asymmetrical distributions of polarity items reflect the fact that the pragmatic potential and the rhetorical force of these forms depend on the polarity of their environment. Negative concord, on the other hand, is essentially a syntactic phenomenon having to do specifically with the resources and construction types a language may include for the expression of negation. Negative polarity and negative concord are, however, closely related, and ultimately there may be no way of drawing a hard-and-fast line between them. The relationship between them is perhaps most dramatically illustrated in the process known as Jespersen’s cycle:

The history of negative expressions in various languages makes us witness the following curious fluctuation: the original negative adverb is first weakened, then found insufficient and therefore strengthened, generally through some additional word, and this in its turn may be felt as the negative proper and may then in course of time be subject to the same development as the original word. (Jespersen 1917: 4)

The words used to strengthen negation start off as innocent, affirmative forms which gradually come to be so closely associated with the expression of negation that they begin to express it themselves. Bréal (1900: 200) calls this process in which a word takes on the meaning of its surroundings semantic contagion. The Spanish n-words nada and nadie are a case in point, nada coming from the Latin (res) nata ‘insignificant [lit., born] thing,’ and nadie from the Latin (homo) natus ‘a born person, a living soul’ (Jespersen 1917: 21; Horn 1989: 454). The process has been widely noted in the histories of Indo-European languages (e.g. Meillet 1948), and, as Haspelmath (1997) shows, is also well attested in a broad range of typologically and geographically diverse languages. Jespersen’s cycle makes it clear that the distinction between polarity items and n-words is often a matter of degree and not of kind. The point is illustrated in the history of the French word pas, which started out as a common noun meaning ‘step,’ later acquired a special use as a strengthener of negation, and now, in modern spoken French, is itself a negative adverb and the primary signal of sentence negation. The examples in (55) illustrate the main stages of this evolution.

(55) a. Jeo ne dis.
     b. Je ne dis (pas).
     c. Je ne dis pas.
     d. Je (ne) dis pas.
     e. Je dis pas.
     ‘I don’t say.’

In the first stage of the cycle, negation is expressed by the preverbal clitic ne. In stage (b) the clitic negation is optionally supplemented by a negative reinforcer, pas. Pas began as a morphologically positive form meaning ‘step’ and was originally used as a negative reinforcer with verbs of movement, e.g. Je n’avance pas ‘I don’t move a step.’ By stage (b), however, pas has been reanalyzed as a general intensifier of negation and so functions as a negative polarity item. In stage (c), the optional pas becomes obligatory and thus shifts from being an expression of negative emphasis to an expression of negation itself. In stage (d), which is the stage of Modern (spoken) French, the original negative ne becomes optional. And in the final stage (e) ne would drop out of the language altogether, and negation would be expressed by pas alone. Perhaps some registers of colloquial modern French have already reached this stage, though the conservative tendencies of written and formal spoken usage may yet prevent ne’s absolute disappearance. The cycle as a whole demonstrates that whether a form counts as a polarity item, an n-word, or a genuine expression of negation is not just a matter of synchronic grammar but may also depend on a form’s position within a larger evolutionary process. I cannot in this work do justice to the dynamic, evolutionary processes which unite negative concord with looser sorts of polarity sensitivity, but it seems fair to say that the two phenomena are distinct yet profoundly linked. N-words exist at the extreme end of sensitivity, the products of a grammaticalization process that leads forms frequently occurring in negative contexts to end up as expressions of negation themselves. My focus here, however, is on the opposite end of this process and on why it is that so many constructions are sometimes sensitive to the expression of negation.

2.6 The Scalar Model of polarity sensitivity

In what follows I present a general theory of polarity sensitivity grounded in the lexical semantics of polarity items themselves. My goal is to explain not just where polarity items can appear in sentences, but crucially, where polarity sensitivities arise in the lexicon. If successful, I will raise at least as many questions as I answer, and leave many more unaddressed or underexplored. Still, I hope to provide a compelling solution to one fundamental puzzle, namely why it is that polarity items should exist in the first place. The answer is that polarity items exist because they are useful, because the distinctions they encode and which make them sensitive serve important semantic and pragmatic functions. The answer may sound like an anticlimax, but of course that is the risk one takes in trying to solve any puzzle. Once the puzzle is solved, the mystery may well disappear.

3 Licensing and the logic of scalar models

At the heart of social and emotional expression is the linguistic feature of intensity. William Labov (1984: 43)

3.1 What is a polarity context?

Polarity items are defined by their distributions – by their patterns of occurrence or non-occurrence in particular sorts of sentences and utterance contexts. But it is not obvious what it is that makes the loose conglomeration of contexts that affect polarity licensing a natural class. In English the contexts which license NPIs and inhibit PPIs include clauses and constituents not just in the scope of negation, but also in the scope of a question, the premise of a conditional, the standard of a comparison, or in the complements of certain intensional predicates (e.g. doubt, deny, sad, sorry, surprised, and amazed). Presumably all these contexts share some common property which explains their common behavior as “affective,” NPI-licensing contexts.

Broadly speaking, linguists since Klima have followed two basic strategies for explaining affectivity. The traditional assumption was that affective contexts are defined by their relation to negation (Klima 1964; Baker 1970; Linebarger 1980; Progovac 1994). The idea is implicit in the term negative polarity item itself, and in the loose but common use of negative contexts for NPI licensors generally (Buyssens 1959; van der Wouden 1997). Although many licensors are neither semantically nor grammatically negative, there is nonetheless a common intuition that “negation is the most central of them, while the others have various kinds of semantic or pragmatic connection with negation” (Huddleston & Pullum 2002: 834). A long line of research, from Lees (1960) to Laka (1990) and beyond, has pursued this intuition by viewing NPI-licensing primarily as a syntactic relation between a polarity item and some (usually abstract) negative element.

Another tradition, going back to the work of Fauconnier (1975a, b, 1976) and Ladusaw (1979, 1983), rejects the idea that affective contexts are defined by their family resemblances to negation and instead has sought the essence of affectivity in a single abstract semantic property. In this spirit I argue here that polarity contexts constitute a natural semantic category defined by their common effects on scalar inferencing. Ultimately, I propose that polarity items themselves are a special class of scalar operators, and that their sensitivity reflects their conventional scalar semantics. But before we can understand their special semantics, we must first come to terms with the contexts that define them.

3.2 Fauconnier’s insight

Polarity items are not the only sorts of constructions which are sensitive to polarity. In addition to “syntactically” polarized NPIs like any, at all, and all that, Fauconnier (1975a, b) showed that certain superlative expressions can also be semantically or pragmatically sensitive to polarity contexts. But these forms do not in themselves seem to be negative in any obvious sense. The facts are simple. Under the right circumstances, as in (1), expressions like the most disgusting food and the simplest puzzle allow quantificational readings in which the superlative functions as a sort of covert universal quantifier.

(1) a. Felicity will eat the most disgusting food. (= Felicity will eat any food.)
    b. Norm can’t solve the simplest puzzle. (= Norm can’t solve any puzzle.)

When the context changes, as in (2), these quantificational readings disappear and so make the superlatives sound peculiarly trivial.

(2) a. Felicity won’t eat the most disgusting food. (But she likes grasshoppers.)
    b. Norm can solve the simplest puzzle. (But not the Sunday crossword.)

The contrast between the examples in (1) and (2) shows that the quantificational readings of superlatives are sensitive to polarity. Such superlative NPs are thus semantically polarized: their interpretation, though not their grammaticality, depends on a context’s polarity. In other cases a switch in polarity can lead to pragmatic complications. Paragon-denoting NPs like even Bill Gates and the Pope himself in (3) (adapted from Fauconnier 1975b: 190) function as pragmatic superlatives and generate universal implicatures, though their contextual appropriateness depends on certain background assumptions.

(3) a. Even Bill Gates couldn’t afford that apartment.
    b. The Pope himself might be tempted to use viagra.

(3a) presupposes that Bill Gates is among those least likely not to be able to afford an apartment, and so if he couldn’t afford it, presumably no one could; (3b) presupposes that the Pope is, of all men, among the least likely to use viagra, so if even he is tempted, probably everyone else is too. In (4), where the background assumptions are held constant but the sentences’ polarity is reversed, these inferences vanish and the sentences sound peculiar.

(4) a. #Even Bill Gates could afford that apartment.
    b. #The Pope himself isn’t tempted to use viagra.

Expressions like these are pragmatically polarized: their contextual appropriateness depends on a context’s polarity. Like their syntactically polarized counterparts, the semantically and pragmatically polarized expressions in these examples are also sensitive to a broad range of polarity contexts. As Fauconnier (1975a, 1976, 1978) demonstrates, it’s not just negation that triggers these superlative quantificational effects. Superlative NPs like the simplest puzzle can get a quantificational reading when they appear in the focus of a question (5a), in the complement of an adversative predicate (5b), in the nuclear scope of a quantifier like few (5c), in the restriction of a quantifier like every (5d), or in a conditional premiss (5e).

(5) a. Can Norm solve the simplest puzzle?
    b. I’m surprised that Norm can solve the simplest puzzle.
    c. Few of the professors could solve the simplest puzzle.
    d. Everyone who could solve the simplest puzzle got a prize.
    e. If Norm can solve the simplest puzzle, he’ll win a prize.

In all these examples, the superlative allows an interpretation in which it is effectively equivalent to the determiner any. (5a), for example, seems to ask not just whether Norm can solve the one puzzle which happens to be the easiest, but whether he can solve any puzzle at all. (5b) suggests that the speaker will be surprised if Norm can solve any puzzle, and (5c) suggests that few of the professors could solve any puzzle. And for (5d) and (5e), if people are rewarded for solving even the simplest puzzle, presumably they also get prizes for solving any of the harder ones. The parallelism between grammatical polarity items and the quantificational properties of superlative constructions suggests that perhaps the ‘polarity’ to which polarity items are sensitive might not be a matter of negation at all; that the contexts which license polarity items might rather be united by a more general semantic property. The key to this general property may be found in the semantics of superlatives and in the inferential mechanisms which support their quantificational interpretations.

Intuitively, a superlative designates an extreme value in a scalar ordering: the simplest puzzle, for example, marks an endpoint on a scale in which puzzles are ordered in terms of their difficulty. Scales in general are associated with a sort of commonsense logic. If someone can solve a difficult puzzle, we assume that they can also solve any easier puzzle. Conversely, if someone cannot solve a simple puzzle, we assume that they can’t solve any harder ones either. And if someone can’t solve even the simplest puzzle, then presumably they can’t solve any puzzle at all. Apparently then, the quantificational readings in (5) depend on the fact that these sentences support a kind of scalar inferencing allowing one to draw conclusions about harder puzzles based on what is understood about the simplest puzzle. If this is correct – if this is indeed the property which unites the diverse class of polarity contexts – then polarity items may be sensitive not to negation per se, but rather to a kind of scalar inferencing which, though prototypically linked to negation, is in fact a much more general property of polarity contexts as a whole. This was Fauconnier’s intuition, and it is the hypothesis I pursue here.

3.3 The natural logic of scalar models

Scalar inferencing is the prime suspect in our search for the essence of polarity licensing, but to catch this suspect we need a clear description of its distinguishing characteristics. Building on Fauconnier’s work and work in Construction Grammar (Fillmore, Kay & O’Connor 1988; Kay 1990, 1997; Michaelis 1993), I distinguish two basic constructs, conceptual scales and scalar models, which make scalar inferencing possible. Conceptual scales are partial orderings defined on conceptual contents, and they form the basis for scalar reasoning of all kinds. A scalar model is a matrix of propositions with different dimensions supplied by different conceptual scales. Given these rudiments, scalar inferencing in general depends on the way a focused proposition is construed against the background of an ordered set of alternative propositions in a scalar model. I argue below (§3.4) that the contexts which license grammatical polarity items are defined precisely in terms of the sorts of scalar inferences they support. First, however, it may be useful briefly to consider the basic phenomena associated with scales and scalar reasoning in language and cognition more generally.

3.3.1 Scalar reasoning and scalar implicature

Scalar reasoning is a very general phenomenon and its importance in language extends well beyond the behavior of quantificational superlatives and polarity items. The linguistic study of scales dates back to the work of Horn (1972), and, in a parallel tradition, to Anscombre and Ducrot’s Theory of Argumentation in Language (Ducrot 1972, 1973, 1980; Anscombre and Ducrot 1983). Horn originally conceived of scales as orderings of linguistic expressions based on a relationship of semantic entailment: given two expressions Ej and Ek, Ej is held to outrank Ek on a given scale if and only if “a statement containing an instance of the former unilaterally entails the corresponding statement containing the latter” (1989: 231). Scales of this sort are known as quantitative scales or simply Horn scales and are typically represented as ordered n-tuples of expressions, 〈…, Ej, Ek, …〉, with each element understood as outranking all elements to its right. Typical examples of Horn scales are given below (adapted from Horn 1989: 232).

(6) 〈all, most, many, some〉        〈always, usually, often, sometimes〉
    〈and, or〉                      〈…, 6, 5, 4, 3, 2, 1〉
    〈must, should, may〉            〈necessary, (logically) possible〉
    〈boiling, hot, warm〉           〈freezing, cold, cool〉
    〈adore, love, like〉            〈loathe, hate, dislike〉
    〈excellent, good, OK〉          〈{terrible/awful}, bad, mediocre〉

Horn developed the notion of a quantitative scale as part of a general scheme to account for the mechanics of scalar implicature. Roughly, scalar implicature is a pragmatic process whereby a speaker S, in making a weak assertion Pk, may effectively communicate that she does not believe, or at least cannot confirm, any stronger assertion Pj (j > k). The general phenomenon was first brought to the attention of linguists in Grice’s 1967 William James lectures at Harvard (later appearing as Grice 1975). The examples in (7–9) are typical: in each case, a speaker’s utterance of the (a) sentence may conversationally implicate the conclusion expressed in the corresponding (b) sentence, and in each case, the generation of this implicature depends on the italicized expressions featured in Horn scales.

(7) a. Zelda sometimes drinks her whiskey neat.
    b. (For all S knows) Zelda doesn’t always drink her whiskey neat.

(8) a. It is possible that the USA will invade Costa Rica.
    b. (For all S knows) it is not necessary that the USA will invade Costa Rica.

(9) a. Michael has seen the movie or read the book.
    b. (For all S knows) Michael has not seen the movie and read the book.
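The pattern in (7–9) can be sketched in a few lines of Python. This is an illustrative sketch only, not Horn’s own formalism: the names HORN_SCALES, outranks, and implicated_against are hypothetical, each scale is assumed to be a plain tuple ordered from strongest to weakest, and 〈necessary, (logically) possible〉 is simplified to the bare word possible.

```python
from typing import List, Sequence, Tuple

# Assumption: a Horn scale is modelled as a tuple in which each item
# outranks every item to its right (a subset of the scales in (6)).
HORN_SCALES: List[Tuple[str, ...]] = [
    ("all", "most", "many", "some"),
    ("always", "usually", "often", "sometimes"),
    ("and", "or"),
    ("necessary", "possible"),
]


def outranks(scale: Sequence[str], a: str, b: str) -> bool:
    """True if a stands to the left of b on the scale, i.e. a outranks b."""
    return scale.index(a) < scale.index(b)


def implicated_against(term: str,
                       scales: Sequence[Sequence[str]] = HORN_SCALES) -> List[str]:
    """Stronger scale-mates of term: the alternatives a speaker using
    term may implicate she does not believe or cannot confirm."""
    stronger: List[str] = []
    for scale in scales:
        if term in scale:
            stronger.extend(scale[: scale.index(term)])
    return stronger


# The weak terms of (7a), (8a), and (9a), with their stronger alternatives.
for weak in ("sometimes", "possible", "or"):
    print(weak, "->", implicated_against(weak))
```

On this toy representation the strongest item of a scale has no stronger scale-mates and so yields no implicature. The sketch also treats the implicature as automatic, whereas the text presents it as something an utterance may convey in context.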

The conclusions in the (b) sentences do not follow logically from the truth of the (a) sentences, but they are natural inferences which can be, and regularly are, exploited in everyday communication. Crucially, as Grice argued, the ability to make such inferences is not specifically linguistic in nature but rather reflects a general cognitive ability and a general presumption that speakers will act in a rational manner. The key to these and other sorts of conversational implicatures can be located in a general principle, Grice’s Cooperative Principle, that, all things being equal, the participants in a communicative exchange can be assumed to be cooperating with each other. More specifically, scalar implicature exploits Grice’s first submaxim of quantity, which states that in order to be cooperative a speaker should, as Grice puts it (1975: 46), “make [her] contribution as informative as is required (for the current purposes of the exchange).” So, if a speaker says something less informative than something else she could have said just as easily, one may draw the inference that she doesn’t believe the more informative thing, for otherwise she should have just said it.

This thumbnail sketch of scalar implicature does justice neither to the complexity of the phenomenon nor to the sophistication of its feature-length treatments. (Those seeking such justice might look in Horn (1972, 1989), Gazdar (1979), Levinson (1983, 2000), Hirschberg (1985), Carston (1995, 1998), Matsumoto (1995), or Schwenter (1999a), among others.) The important point for our purposes is just that people do seem to possess a general and very basic ability for scalar reasoning, and that this basic ability has significant linguistic consequences. Among other things, as I have already hinted, this general cognitive ability holds the key to the grammar of polarity sensitivity.

The question is, what exactly is scalar reasoning? As many have pointed out, and as Horn himself freely concedes (1989: 231–42), the original conception of quantitative scales being ordered by semantic entailment may be too narrow. As Fauconnier (1975a, b, 1976, 1978) and Hirschberg (1985) demonstrate, actual entailments are not required for scalar implicatures. Scalar inferences are often not logically valid, but rather depend on general and contingent pragmatic knowledge about how the world normally seems to work. This is clearly the case, for example, in the quantificational interpretations of examples (3–4) with their conjectures about a famous tycoon’s purchasing power and the Pope’s potential temptation to use viagra. Hirschberg (1985), in particular, shows that the types of relation which support scalar implicatures (and, by extension, scalar reasoning in general) go well beyond the logical orderings of Horn scales to include a variety of linear orderings, hierarchical rankings, whole/part relations, entity/attribute relations, set/subset relations, type/instance relations, and orderings defined by step-by-step processes. Hirschberg argues that the relations which can define a scale and thus support scalar implicature in fact include all and only those relations which define partially ordered sets (or posets). As far as I can tell, this is a robust result, and given this result, we are now in position to define the crucial mechanisms which support scalar reasoning.

3.3.2 Cognitive foundations: conceptual scales

Scalar reasoning is a general cognitive ability. As such, it seems reasonable to assume that it is based on very general cognitive constructs. The constructs which I will take as the foundation for scalar reasoning are conceptual scales and scalar models. A conceptual scale is simply a partially ordered set of conceptual entities. In the spirit of Hirschberg (1985), I assume that the ordering of elements in a conceptual scale is determined by an ordering metric, understood as a relation which is irreflexive, asymmetric, and transitive. These three notions are defined as follows:

(10)

Given a set Q with elements {…, qi, qj, qk, …}, and a relation R defined on Q:
R is irreflexive iff, for all qi ∈ Q, ¬ qi R qi.
R is asymmetric iff, for all qi, qj ∈ Q, qi R qj ⇒ ¬ qj R qi.
R is transitive iff, for all qi, qj, qk ∈ Q, (qi R qj & qj R qk) ⇒ qi R qk.
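The three conditions in (10) can be checked mechanically for any finite relation. The following Python sketch assumes that R is given extensionally as a set of ordered pairs over Q; the function names and the three-puzzle example are hypothetical illustrations, not part of the original definitions.

```python
from typing import Hashable, Set, Tuple

Pair = Tuple[Hashable, Hashable]


def is_irreflexive(Q: Set[Hashable], R: Set[Pair]) -> bool:
    """No element of Q bears R to itself."""
    return all((q, q) not in R for q in Q)


def is_asymmetric(R: Set[Pair]) -> bool:
    """Whenever a R b holds, b R a does not."""
    return all((b, a) not in R for (a, b) in R)


def is_transitive(R: Set[Pair]) -> bool:
    """Whenever a R b and b R d hold, a R d holds as well."""
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


def is_ordering_metric(Q: Set[Hashable], R: Set[Pair]) -> bool:
    """R satisfies all three conditions of (10) on Q."""
    return is_irreflexive(Q, R) and is_asymmetric(R) and is_transitive(R)


# Hypothetical example: three puzzles ordered by 'is harder than'.
PUZZLES = {"easy", "medium", "hard"}
HARDER_THAN = {("hard", "medium"), ("medium", "easy"), ("hard", "easy")}

# Two relations that fail: one adds self-pairs, one omits ('hard', 'easy').
AT_LEAST_AS_HARD = HARDER_THAN | {(q, q) for q in PUZZLES}
GAPPY = {("hard", "medium"), ("medium", "easy")}

print(is_ordering_metric(PUZZLES, HARDER_THAN))
print(is_ordering_metric(PUZZLES, AT_LEAST_AS_HARD))
print(is_ordering_metric(PUZZLES, GAPPY))
```

The converse of a relation that passes these checks (here, ‘is easier than’) passes them too, so a single domain can support two scales with opposite orderings.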

The formal nature of these definitions should not distract one from the basic simplicity of the idea behind them. The relations which satisfy these definitions are, for the most part, the sort of thing one would expect to define a scale. Prominent among these are comparative relations defined on gradable predicates – be taller than, be harder than, be more likely than, be less intelligent than, be less interesting to read than – and absolute comparatives like be more than and be less than. But the class of ordering metrics for conceptual scales is much broader than our ordinary, pretheoretical notion of a scale. Conceptual scales are not simply orderings of amounts or degrees but include a variety of other relations, like the kind of relation in a taxonomic hierarchy and the inclusion relation of set membership, which, though they might not look prototypically scalar, have similar logical structures and support similar patterns of inferencing as do other scales. Hirschberg, like Horn before her, defines her scales as orderings of linguistic expressions; however, since scalar reasoning appears to be a general cognitive ability rather than a specifically linguistic one, I assume that the entities ordered by a conceptual scale are in fact conceptual structures. That is, the scalar relation between a set of words like freezing, cold, and cool reflects the fact that our experience of cold things is itself fundamentally scalar. Scalar reasoning is a way of thinking about these sorts of scalar experiences, and it works whether or not the particular experiences are conventionally associated with a particular linguistic expression. Of course, as Levinson (2000) emphasizes, languages do regularly include sets of expressions which are paradigmatically arranged as profiling distinct scalar values – i.e. the Horn scales discussed above – and speakers do exploit such paradigmatic oppositions to generate scalar implicatures. My point is just that these sorts of linguistic oppositions themselves depend on conceptual scales. Technically, a conceptual scale is an ordered pair 〈Q, R〉, where Q is a set of conceptual structures and R is a relation which defines a partial order on the elements of Q. Effectively, then, R is a set of ordered pairs 〈qi, qj〉. Generally, the elements of Q will include all and only those conceptual structures which can be ordered by the relation R. Stretching the mathematical notion of a domain, we can say that the elements of Q constitute a semantic domain, defined in Langacker (1987: 488) as “a coherent area of conceptualization relative to which semantic units may be characterized.” This kind of semantic domain is much broader than what is needed just to define a conceptual scale – it includes, for example, such “coherent areas of conceptualization” as the geometric structure of space, the rules of baseball, and the nature of the speech act situation. Nonetheless, and rather trivially, the ordered elements on a conceptual scale necessarily constitute a semantic domain relative to which an ordering relation is characterized. Some simple examples may serve to illustrate. Time, or, more precisely, the set of points in time, functions as the semantic domain for relations like be earlier than and be later than.
These two relations, then, define two distinct conceptual scales with converse orderings on the domain of time: in one, times are ranked from the latest to the earliest; in the other, times are ranked from the earliest to the latest. Similarly, the (extremely complex) semantic domain of private property involves, among other things, the set of things which can, in some sense, be possessed: elements of this set can be ordered on conceptual scales by relations like be more valuable than or be less valuable than. In general, conceptual scales simply reflect the fact that much of our real-world knowledge depends on the way we experience different entities in terms of orderings of one sort or another. The technical notion of a conceptual scale is thus not all that different from the pretheoretical, commonsense notion, though it is strikingly more general. A great many of our most basic experiences of the world, and particularly of the physical world, are scalar in nature: the dimensions of space (length, width, depth, height), color (brightness, saturation, hue), sound (pitch, amplitude), and temperature (warmth, cold), to name some of the most basic perceptual domains, are all essentially scalar. And they are similar in that they all feature objective, and indeed measurable, properties of the world. But the elements on a scale need not be measurable in any objective sense: they only need to be ordered, and this can be done in basically any way conceivable. In addition to prototypical scales based on measures and quantities, conceptual scales may be defined by rank orderings, hierarchies, taxonomies, or sequences. Hierarchies – including the orders of poker hands, military ranks, and social classes – consist of an ordered set of ranks based on dominance relations, and taxonomies consist of sets of categories ordered by inclusion (i.e. the isa relation). Thus, in Linnaean taxonomics, a phylum includes classes, classes include orders, orders include families, and families include genera. In this sense, biological classification in general is a scalar semantic domain. Similarly, many complex processes require an intrinsically ordered series of steps: to make an omelette one must break the eggs, beat them, and cook them, in just that order, and so “omelette-making” is an inherently scalar domain. Finally, even the basic experience of moving along a path is fundamentally scalar in nature, since every point on a path is intrinsically ordered with respect to every other point. All these sorts of orderings are regularly used to draw scalar inferences. If we observe someone on a path, we use their position and direction of movement to infer where they have been and where they are going. If we observe someone cooking a meal or building a house – or in any telic process, for that matter – we infer from where that person is in the process what sorts of things they must already have done, and what they may be doing next.
If we observe someone succeed at a difficult activity – lifting a heavy object, reading a difficult text, performing a complicated acrobatic routine – we assume this person will also succeed at any comparable but less difficult activity. Inferencing of this sort is essential to the most ordinary activities of everyday life. Ultimately, the structure of conceptual scales and their broad significance for reasoning in general stem from a basic cognitive ability to compare and contrast different events in our mental experience. As Langacker (1987: 101) has argued, this ability is fundamental to virtually all aspects of cognition:

Fundamental to cognitive processing and the structuring of experience is our ability to compare events and register any contrast or discrepancy between them. Such comparison is at work when we perceive a spot of light against a dark background, for example, or when we catch a spelling error. I assume that this ability to compare two events is both generalized and ubiquitous: acts of comparison continually occur in all active cognitive domains, and at various levels of abstraction and complexity; regardless of domain and level, moreover, they are manifestations of the same basic capacity (or at least are functionally parallel).

Conceptual scales are thus a manifestation of a much more general feature of cognitive processing. One should thus not mistake their apparent simplicity for a sign of triviality. Scales seem so simple only because they are such a fundamental part of the way we conceive the world.

3.3.3 Inferential mechanisms: scalar models

Conceptual scales form the foundation for scalar reasoning, but there is more to scalar reasoning than just conceptual scales. As defined above, a conceptual scale can consist of elements of any semantic type. However, we do not in fact reason about structures of any semantic type; rather we reason about things like propositions, possible worlds, and communicative intentions, all of which regularly depend on complex combinations of simple scales. If, for example, all I know is that a new Ferrari costs more than a candy bar, there is little I can conclude beyond the mere corollary that a candy bar costs less than a new Ferrari. On its own, the ordering of elements on a conceptual scale is not so informative, but in combination with other simple scales in a matrix of imaginable propositions, its real usefulness emerges. The ordering of Ferraris and candy bars on a scale of cost is really only of interest where (or to the extent that) such items are considered as potential items of purchase. Thus, if I know my friend Paul can afford a Ferrari, then I also know he can afford a candy bar. And given the logic of scalar implicature, if, when asked for a loan, Paul says that he can lend me enough for a candy bar, I will reasonably conclude that he won’t lend me enough for a Ferrari. To explain scalar reasoning, we need to understand how reasoners use their knowledge of conceptual scales to draw inferences about propositions in discourse. The way they do this, I suggest, is that they use the complex kind of conceptual structure which Fillmore, Kay, and O’Connor (1988) dubbed a scalar model.
Loosely, a scalar model is a structured set of propositions ordered in terms of one or more conceptual scales. A scalar model consists of a propositional function (or schema), P, with one or more open variables, each ranging over the ordered elements of a conceptual scale. Scalar models are thus (potentially) multidimensional structures: each variable in P defines a unique dimension and each dimension corresponds to a distinct conceptual scale. Fillmore, Kay, and O’Connor (1988: 535–6) call the set of dimensions, D, in a scalar model an argument space, since it is the values in D which define the set of potential arguments for the function P.

[Figure 3.1 A scalar model of puzzles. A scale of puzzles running from y1 (easiest) to y5 (hardest), with the propositional function P: ‘Norm can solve y’.]

Figure 3.1 presents a one-dimensional scalar model in which a propositional function, P[y], stated informally as ‘Norm can solve y,’ combines with a conceptual scale of puzzles ordered in terms of their difficulty. The scale specifies a range of values, 〈y1, y2, y3, …〉, as possible arguments for the function P. Within a scalar model, whenever the propositional function P holds for some point yn, then P can be assumed to hold for all points lower than yn on the scale. In other words, for any two propositions P[yi] and P[yj], where yi > yj, P[yi] → P[yj]. Fauconnier (1975a: 193) calls this basic principle of commonsense logic the Scale Principle: here the solid arrow pointing down from y4 thus represents inferences following from the truth of P[y4].

Expanding the argument space only slightly, Figure 3.2 depicts a two-dimensional model pairing puzzles ranked in terms of their difficulty with puzzlers ranked in terms of their acuity. Elements on both dimensions are ordered in a way that supports inferences in terms of their potential to satisfy the propositional function Q, ‘x can solve y’. The ordering of puzzles from the easiest to the hardest reflects a default assumption that if someone can solve a hard puzzle, they can also solve an easier one. Similarly, the ordering of puzzlers from the brightest to the dullest reflects the default assumption that if a dimwitted puzzler can solve a particular problem, then any more clever puzzler will succeed as well. Once again, the scalar model defines a pattern of pragmatic entailments. Given the truth of a proposition p within the model (i.e. where p has a value of T), one can infer that any distinct proposition q which is lower than p on at least one dimension and no higher than p on any other dimension will also be true. Conversely, given the falsity of p (i.e. where p has a value of F), one can conclude that any proposition q higher than p on at least one dimension and no lower than p on any other dimension will also be false. The ordering of elements on a conceptual scale may reflect cultural assumptions or context-specific expectations, so entailments within a scalar

[Figure 3.2 A two-dimensional scalar model. One axis ranks puzzles from easy to hard; the other ranks puzzlers, with Stella, Norm, and Dim marked; both axes run from 0 to ∞. The propositional function is Q: ‘x can solve y’, and points in the model are marked T or F.]

model need not be, even if they usually are, logically valid. Very clever people can be confused by things which should be obvious, and very simple problems can sometimes baffle a brilliant mind. Still, an assertion that one can solve the most difficult puzzle normally invites the inference that one can in fact solve any puzzle. As Fillmore, Kay, and O’Connor (1988: 537) put it, such entailments hold relative to a scalar model. For Fauconnier, such context-sensitive inferences are “pragmatic entailments.” Pragmatic entailments assume a sort of ceteris paribus condition: they are inferences which do not necessarily hold in all the possible worlds, but just in all the worlds one might reasonably consider on any given occasion. They are thus practically, if not logically, valid. The structure of scalar models, together with the practical wisdom of Fauconnier’s Scale Principle, trivially explains how superlative expressions sometimes allow quantificational interpretations. Given the rather innocent assumption that superlatives designate the endpoint of a conceptual scale, it follows from the Scale Principle that, with the right sort of propositional function, predication of a superlative will pragmatically entail all values lower on the scale. The Scale Principle thus effectively turns superlatives into universal quantifiers. The question is, what, exactly, makes something the right sort of propositional function? As outlined above, a scalar model consists of a propositional function and an argument space defined by a set of conceptual scales. Significantly, different propositional functions may operate on the same argument space to define different scalar models. Thus, the conceptual scale of puzzles above might combine with a variety of schemas – for example, ‘I doubt that Norm can solve y,’ ‘I

expect that Norm can solve y,’ ‘If Norm can solve y, I’ll give him a candy,’ ‘Stella will be pleased if Norm can solve y’ – to define a class of related scalar models. The inferences available in a scalar model depend on the schematic proposition that defines that model, but for our purposes here, all such propositional schemas divide into two basic sorts depending on the direction of the inferences they support. With a schema like P, ‘Norm can solve y,’ in Figure 3.1, inferences canonically flow from high to low values for y. With the contradictory schema ¬P, ‘Norm cannot solve y,’ and with other negative and affective contexts, the inferences are reversed, and so run from low to high scalar values, from easier to harder propositions; the validity of any proposition ¬P[yn] low in the model pragmatically entails the validity of all propositions in the model higher than yn. In semantics, as in photography, everything is backwards in the negative, and so for any two propositions ¬P[yi] and ¬P[yj], if yi < yj, then ¬P[yi] → ¬P[yj]. Beyond negation, I follow Fauconnier (1976, 1978) in drawing a general distinction between propositional schemas which, like simple affirmatives, license inferences from high values to low values for a thematic argument in a scalar model, and those which, like negation, reverse these entailments and license inferences from low values to high values for the same argument. Since affirmative assertions constitute the unmarked context, I will refer to schemas which license inferences from high values to low values as scale preserving, and schemas which license inferences from low values to high values I will call scale reversing. Additionally, there may be propositions which, for whatever reason, are not construed with respect to a scalar model or which simply do not license inferences within a scalar model. Such propositions are non-scalar.
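The two directions of inference just described can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the integer ranks and the names PUZZLERS, PUZZLES, POINTS, and entailed are hypothetical conveniences, not part of Fauconnier's or Fillmore, Kay, and O'Connor's formulations. Each point pairs a puzzler rank (0 = brightest) with a puzzle rank (1 = easiest), so that a lower rank on either dimension always makes the schema ‘x can solve y’ easier to satisfy.

```python
import operator
from itertools import product

# Hypothetical ranks, loosely modeled on the two-dimensional model of puzzles
# and puzzlers (an assumption of this sketch, not data from the text):
# puzzlers run from brightest (rank 0) to dullest, puzzles from easiest (1) to hardest (5).
PUZZLERS = ["Stella", "Norm", "Dim"]
PUZZLES = [1, 2, 3, 4, 5]

# Every point (x, y) in the argument space stands for one proposition about
# puzzler x and puzzle y.
POINTS = set(product(range(len(PUZZLERS)), PUZZLES))


def entailed(p, points, direction):
    """Return the propositions pragmatically entailed by accepting a schema at p.

    "preserving": inferences run to points no higher than p on any dimension.
    "reversing":  inferences run to points no lower than p on any dimension.
    """
    compare = {"preserving": operator.le, "reversing": operator.ge}[direction]
    # q must be distinct from p, so it differs from p on at least one dimension.
    return {
        q for q in points
        if q != p and all(compare(qi, pi) for qi, pi in zip(q, p))
    }


# 'Norm can solve puzzle 4' (a scale-preserving schema).
down = entailed((1, 4), POINTS, "preserving")
# 'Norm cannot solve puzzle 2' (a scale-reversing schema).
up = entailed((1, 2), POINTS, "reversing")

print(sorted(down))
print(sorted(up))
```

Under these assumptions, accepting ‘Norm can solve puzzle 4’ carries over to Stella and to every easier puzzle, but says nothing about Dim; accepting ‘Norm cannot solve puzzle 2’ carries over to Dim and to every harder puzzle, but says nothing about Stella.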
The crucial notions of scale reversal and scale preservation provide the foundation for a definition of polarity contexts, and ultimately, for an explanation of polarity sensitivity. As noted above (§3.2), and as Fauconnier (1976) showed in detail, scale reversal appears to be a general property of the contexts which license canonical polarity items like English ever and French jamais. This suggests that polarity items really are in some sense like quantificational superlatives, and that polarity sensitivity in general is a sensitivity to scalar inferencing. In what follows, I propose that polarity items are a special class of scalar operators: their distribution thus depends on the availability, in context, of an appropriately structured scalar model, and on the way an expressed proposition is construed in relation to its scalar alternatives. NPIs need a model with reversed scalar inferences; PPIs need one with preserved scalar inferences.

3.4 Affectivity as a mode of scalar construal

If it really is an act of inferencing that licenses NPIs, then strictly speaking licensing itself is neither a syntactic relation between constituents in a sentence, nor even a semantic relation between propositional contents, but fundamentally a pragmatic relation – a mode of construal – which holds between a conceptualizer and the expressed proposition to which a polarity item contributes its meaning. An act of inferencing is always a matter of judgment. To draw an inference in a scalar model, one must make a judgment about the factual status in some discourse context of a single proposition in the model. Given the scalar background, the way any proposition is judged – e.g. whether it is taken as affirmed, denied, suggested, hypothesized, or questioned – has automatic consequences for the way other propositions in the model can be judged. It is the mode of judgment – intuitively, the way a proposition is brought to mind – that determines whether a propositional function is scale-reversing or scale-preserving, and so, by hypothesis, what sorts of polarity items it can license or tolerate. Of course, there is nothing new to the idea that polarity licensing is somehow linked to scalar inferencing, nor even to the idea that this link is somehow mediated by pragmatics. Since Ladusaw (1979) first proposed that licensing is a matter of logical semantics, it has been widely noted that the relevant inferences are subject to pragmatic constraints: they can be triggered by certain conversational implicatures (Linebarger 1980, 1987, 1991; Israel 1996), and they may be blocked if certain presuppositions are not held constant (Heim 1984; Kadmon & Landman 1993; von Fintel 1999; Horn 2002). But while all may admit a role for pragmatics in polarity licensing, the phenomenon is widely seen as in essence a semantic constraint on grammatical well-formedness – a constraint that is defined on a model-theoretic representation of an expression’s objective truth conditions. 
This is both because in general, as Ladusaw noted, “whether an expression (lexical or phrasal) is a trigger is predictable from its meaning” (1979: 3), and especially because the relevant sort of meaning here consists in an expression’s contribution to the truth conditions of an expressed proposition. Since most linguists will agree that sentences like *I ever kissed her and *There was anyone at the party are not merely semantically anomalous but somehow truly “ungrammatical,” the default assumption has been that the mechanism responsible for polarity licensing must apply at some level of grammatical representation. There is, however, a substantial question here as to whether the inferences involved in polarity licensing are best understood as properties of linguistic representations per se, or as elements of conceptual

structure more generally. In other words, are the constraints on polarity items really a matter of sentence grammar, or do they apply rather at the level of utterance interpretation itself, as constraints on coherent conceptualizations? Since my basic argument here is that polarity licensing depends on the way an expressed proposition is construed within a scalar model, and since I have already defined scalar models as schematic conceptual structures which facilitate general patterns of inferencing, it follows that polarity licensing itself must be essentially a conceptual phenomenon. In general, I take it, any linguistic construction, of any arbitrary complexity (whether a word, a phrase, a clause, or a complex schema), is associated with some semantic content which determines (or at least constrains) the contribution it makes to an expressed proposition. But an expressed proposition is more than just a reflection of linguistic semantics: it is the culmination of a process of meaning construction in which a variety of complex cognitive abilities and structured features of background knowledge are integrated in a mental space (Fauconnier 1985, 1997; cf. Sperber & Wilson 1986). In this light, I contend that polarity contexts are defined not just by the logical, semantic, or syntactic properties of constructions, but also, and crucially, by the pragmatic properties which determine how an expressed proposition is construed in context. The relevant level of analysis here is thus neither purely linguistic nor purely conceptual, but a little of both: it is the level at which a speech act is conceptualized in an act of communication, the mental space in which a proposition is expressed and interpreted. Since polarity items, as opposed to polarity contexts, are necessarily linguistic constructions, they must occur in a linguistic context of some sort.
But if licensing depends on the construal of a constructional meaning within a scalar model, then polarity contexts are properly defined not by their grammatical structure per se, but directly in terms of their semantic-pragmatic effects. The formulae below thus define three ways a focused construction C can be construed in a context (or mental space) M with respect to a scalar model SM built on a conceptual scale S = 〈Q, R〉, with Q a set of conceptual entities and R an ordering on Q. Let c′ be the conventional interpretation of c in M, assume that c′ ∈ Q, and for any element x ∈ Q, let M[x / c′] be the context M where c′ = x. Then:

(11) a. M is scale preserving with respect to c iff, for all xi, xj ∈ Q, xi < xj → (M[xj / c′] → M[xi / c′])
     b. M is scale reversing with respect to c iff, for all xi, xj ∈ Q, xi < xj → (M[xi / c′] → M[xj / c′])
     c. M is non-scalar if it is neither scale preserving, nor scale reversing.
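The definitions in (11) can be read as a test one could run over a small finite model. The sketch below is a toy rendering rather than the author's formalism: the inner arrow between M[xi / c′] and M[xj / c′] is approximated as "true in every situation under consideration where the antecedent is true", and the names classify, SCALE, SITUATIONS, and the three example contexts are all hypothetical. A situation is simply a number t giving the rank of the hardest puzzle Norm can solve.

```python
from itertools import combinations

SCALE = [1, 2, 3, 4, 5]     # Q ordered by R: puzzles from easiest to hardest (assumed)
SITUATIONS = range(0, 6)    # t = rank of the hardest puzzle Norm can solve (assumed)


def classify(context, scale, situations):
    """Classify a context per (11); context(x, t) is the truth of M[x / c'] in situation t."""

    def entails(a, b):
        # Approximation: b holds in every considered situation in which a holds.
        return all(context(b, t) for t in situations if context(a, t))

    pairs = list(combinations(scale, 2))                    # all (xi, xj) with xi < xj
    preserving = all(entails(hi, lo) for lo, hi in pairs)   # (11a)
    reversing = all(entails(lo, hi) for lo, hi in pairs)    # (11b)
    if preserving and reversing:
        return "degenerate"     # both hold, e.g. for a context that ignores x
    if preserving:
        return "scale preserving"
    if reversing:
        return "scale reversing"
    return "non-scalar"         # (11c)


def can_solve(y, t):
    """'Norm can solve y'."""
    return y <= t


def cannot_solve(y, t):
    """'Norm cannot solve y'."""
    return y > t


def hardest_solved(y, t):
    """'y is the hardest puzzle Norm can solve'."""
    return y == t


for ctx in (can_solve, cannot_solve, hardest_solved):
    print(ctx.__name__, "->", classify(ctx, SCALE, SITUATIONS))
```

The choice to quantify over an explicit list of situations is a design convenience: it keeps the ceteris paribus character of pragmatic entailment visible, since changing which situations are "under consideration" can change the classification.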

The examples below (12–16) illustrate pairs of semantically similar grammatical contexts which differ in the scalar construals they allow for the models of puzzles and puzzlers discussed above (§3.3.3). The constituents in square brackets in the (a) examples support scale-preserving interpretations, as demonstrated by the apparent entailments from (high-scalar) hard problems to (low-scalar) easy problems; in the minimally different (b) examples, the bracketed constituents are scale reversing, giving inferences from (low-scalar) “easy problem” propositions to (high-scalar) “hard problem” propositions. The overt triggers most responsible for these scale reversals are shown here in boldface.

(12) a. Someone [who could solve the hard problems] got a prize.
        → Someone who could solve the easy problems got a prize.
     b. Everyone [who could solve the easy problems] got a prize.
        → Everyone who could solve the hard problems got a prize.

(13) a. Norm must [be able to solve the hard problems] to get a prize.
        → Norm must be able to solve the easy problems to get a prize.
     b. If [Norm can solve the easy problems], he’ll get a prize.
        → If Norm can solve the hard problems, he’ll get a prize.

(14) a. Norm quit after [he figured out how to solve the hard problems].
        → Norm quit after he figured out how to solve the easy problems.
     b. Norm quit before [he figured out how to solve the easy problems].
        → Norm quit before he figured out how to solve the hard problems.

(15) a. I expected [Norm could solve the hard problems].
        → I expected Norm could solve the easy problems.
     b. I’d be surprised if [Norm could solve the easy problems].
        → I’d be surprised if Norm could solve the hard problems.

(16) a. I’m sure [Norm could solve the hard problems].
        → I’m sure that Norm could solve the easy problems.
     b. I doubt [Norm could solve the easy problems].
        → I doubt Norm could solve the hard problems.

In all these cases the context M is a clause which supports the construal of problem-solving situations in terms of the ease or difficulty of the problems involved. The inferences here thus depend on a scalar model like that in Figure 3.2, with a conceptual scale S including a set Q of things that can be solved (puzzles, riddles, problems, mysteries, or whatever) and an ordering relation R ranking members of Q in terms of their difficulty. Each sentence includes an instance of the English noun phrase (NP) construction: thus, C is the NP construction, c is the particular NP, either the easy problems or the hard problems, and c′ is the scalar interpretation of the NP in M as contrasting with other members of Q.

The definitions in (11) are, it should be noted, strictly analogous to standard definitions of downward and upward entailing (UE) operators (e.g. Ladusaw 1979: 145–6; van der Wouden 1997: 90; §8.4 below). The real question about any such definition lies in the definition of the arrow “→” here: in just how, that is, one understands the notion of “entailment” or “informativity” which holds between propositions in a scalar model. The basic claim here is that affective contexts are defined precisely by their effects on scalar inferencing, and in particular by their effects on things like the interpretation of superlative NPs and on the use of scalar focus particles like English even, French même, and German sogar and auch nur. The examples in (12–16) support this hypothesis, since it appears that the constructions here which license NPIs are just those triggers in the (b) examples that license inferences from ‘easy’ to ‘hard’ problems. These examples also suggest that the inferences relevant for licensing can be identified precisely with a sentence’s logical semantic entailments: in each case it seems that any reasonable person who accepted one of these premisses would also have to accept the truth of the consequent proposition. But scalar inferencing is not always so straightforward. Natural language sentences typically have many entailments, and sometimes they can seem to pull in opposite directions. Consider “approximate” (Huddleston & Pullum 2002: 815) or “quasi-negative” (Klein 1998) negators like few, rarely, scant, or hardly. These are well-known NPI triggers, but they also invite some clearly positive sorts of inferences: to say few people came to the show suggests that at least some people did come; to claim that one rarely smokes is in some sense to admit that one might smoke occasionally. This poses a problem for inferences like those in (17–18).

(17) a. Few primates [can solve the easy problems].
     b. → Few primates can solve the hard problems.

(18) a. Norm can rarely [solve the easy problems].
     b. → Norm can rarely solve the hard problems.

The thing is, in a situation where some, but not many (i.e. “few”) primates can solve the easy problems, it is perfectly conceivable that no primates could solve the hard ones. Thus, there are situations where (17a) is true, but where it would be misleading, at best, to utter (17b). But while the use of a weak negative like few usually does license a positive inference – i.e. few Xs Y suggests that ‘at least some Xs Y’ – it is unclear whether this inference is a semantic entailment. As many have noted (Horn 1969, 1972, 1989; Ducrot 1973; Carston 1998; Ladusaw 1979: 153), while positive quantifiers like many and often give

lower-bounded ‘at least’ readings compatible with universal statements (e.g. Many – if not all – of my friends like chocolate), weakly negative quantifiers like few, rarely, seldom, hardly, and (arguably) only yield upper-bounded ‘at most’ readings which are compatible with negation (e.g. Few – if any – of my friends rob liquor stores). Since it is suspendible, the positive inference here seems to be a kind of implicature: the inferences in (17–18) are thus valid, and so the relevant contexts count as scale reversing. In fact, there is good psycholinguistic evidence that speakers draw very different sorts of inferences from quantifiers like few and not many than they do from their counterparts a few and many. Moxey and Sanford (1993, 1994) report on a series of studies in which subjects were asked to complete a short discourse in which a pronoun refers back to a quantifier of some sort (e.g. {Few / A few} MPs were at the meeting. They …). The results show that with a positive quantifier, subjects overwhelmingly treat the pronoun as referring to what Moxey and Sanford call “the reference set” – the subset of a quantifier’s restriction which intersects with its nuclear scope (i.e. the MPs who were at the meeting); however, with negative quantifiers there is a strong tendency to interpret the pronoun as referring to the “complement set” – the subset of the quantifier’s restriction which are in the complement of the nuclear scope (i.e. the MPs who missed the meeting). Given this tendency, it seems clear that while quantifiers like few can support a positive inference, they typically serve to highlight a negative proposition and so function as scale reversers. But while the positive inferences of the approximatives may be dismissed as mere implicatures, other polarity triggers really do seem to entail a positive proposition.
Consider, for example, a sentence like Only Bill had any fun, where the exclusively focused NP only Bill licenses the NPI any. This sentence clearly entails that ‘no one besides Bill had any fun’ (Horn 1969, 1996), but as Atlas (1993, 1996) has vigorously argued, it defies common sense to think such a sentence could be true if Bill himself did not have any fun. Since, as Horn (2002) himself concedes, a sentence like Only Bill had any fun, and even he didn’t is irreparably self-contradictory, the positive proposition associated with the only NP (i.e. here ‘that Bill had fun’) must be an entailment and not just an implicature. But this poses a problem for the putative reversed entailment in (19), since (19a) clearly does not share the entailment in (19b) that Norm can solve the hard problems, even if both sentences do entail that no one besides Norm can solve them.

(19) a. Only Norm can solve the easy problems.
     b. ? → Only Norm can solve the hard problems.

In fact, a number of polarity contexts come with similar positive entailments which surprisingly seem not to interfere with their potential as licensors. Factive adversatives like regret and be surprised that in (20–21) are another notorious example (Ladusaw 1979; Linebarger 1980, 1987, 1991; Heim 1984; Kadmon & Landman 1993; von Fintel 1999). These forms both license NPIs in their complements and seem to presuppose the truth of the propositions to which these NPIs would contribute. This combination poses a problem for the putative entailments in (20b–c) and (21b–c).

(20) a. I’m surprised that Norm solved any of the problems.
     b. I’m surprised that Norm solved the easy problems.
     c. ? → I’m surprised that Norm solved the hard problems.

(21) a. I regret having tried to solve any of the problems.
     b. I regret having tried to solve the easy problems.
     c. ? → I regret having tried to solve the hard problems.

The problem is that one could perfectly well be surprised that Norm solved some easy problems without ever believing, let alone being surprised, that he also solved any hard problems. Of course, these triggers are scale reversing in the limited sense that, given a “constant perspective” on what counts as surprising (Kadmon & Landman 1993: 381), if I am surprised that Norm solved the simplest puzzle, I will be even more surprised if he turns out to have solved some harder puzzle. Thus, von Fintel (1999) calls the relation from the (b) to the (c) sentences in (20–21) “Strawson Entailment” because the inference only works if one can ignore the sentences’ presuppositions. Contexts like these are scale reversing with respect to their assertions, but not with respect to their presuppositions. This suggests that for purposes of NPI-licensing, what really matters is not so much what a sentence entails, but crucially, what it says – the ostensive contribution it makes to a discourse context. Nowhere is this more dramatically illustrated than in the behavior of forms like almost and barely. Surprisingly, while almost in (22) seems to entail a negative proposition and barely in (23) seems to entail a positive proposition, it is actually barely, and not almost, which licenses an NPI in (24).

(22) a. Yesterday, Stella almost proved Fermat’s Last Theorem.
     b. → Stella didn’t prove Fermat’s Last Theorem.

(23) a. Dim barely knows how to balance his checkbook.
     b. → Dim does know how to balance his checkbook.

(24) a. *Stella almost proved anything.
     b. Dim barely knows anything.

These examples seem puzzling if one assumes that NPIs are somehow triggered by the expression of a negative proposition, since almost with its negative entailments clearly seems more negative than barely with its positive entailments. But not all entailments are created equal. In the spirit of Ducrot (1973, 1980: 20–2; see also Verhagen 2005: 41ff.), I suggest that almost and barely are argumentative operators which, when added to a monoclausal sentence p with a single entailment P, yield a sentence with two entailments: [almost p] entails both (i) that the situation is, in some salient way, very much like one in which P is true, and (ii) that P is not true; and [barely p] entails both (i) that the situation is somehow very like one in which P is false, and (ii) that P is true. There has been a great deal of debate about the status of these dual entailments, if that is indeed what they are (e.g. Sadock 1981; Atlas 1984; Fillmore, Kay & O’Connor 1988; Horn 1991, 1996; Klein 1998 – see Horn 2002 for an overview), but for our purposes here, the important point is just that the two propositions are not on the same pragmatic footing: for both forms, it is the first which is communicatively salient (i.e. asserted), while the second is presented as relatively uncontroversial (or presupposed). In effect, (22a) presents a situation in which Stella has failed to accomplish something very impressive, but it asserts that what she did do is quite impressive in itself. Similarly, (23a) evokes a situation in which Dim can do something very unimpressive, but it asserts that he cannot do anything more. Almost is thus scale preserving: if Stella can accomplish a lot with a difficult problem (even without solving it), then she will succeed all the more with an easier problem. Barely is scale reversing: if Dim has only meager success with easy problems, then clearly he will not have any greater success with harder problems.
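The contrast between almost and barely can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not an analysis taken from the text: success is modeled as a numeric margin (ability minus difficulty), "very much like" is cashed out as lying within a hypothetical tolerance EPS of the success threshold, and every name in the code is invented for the example. It compares the scalar direction of each operator's asserted component (i) with that of its full truth conditions, (i) and (ii) together.

```python
from itertools import combinations

SCALE = [1, 2, 3, 4, 5]     # problem difficulty, easiest to hardest (assumed)
ABILITIES = range(0, 8)     # situations: how able the solver is (assumed)
EPS = 1                     # how close counts as 'almost' or 'barely' (assumed)


def direction(context):
    """Scalar direction of context(y, a), checked over all pairs of scale points."""

    def entails(y1, y2):
        # Approximation: y2 holds in every considered situation in which y1 holds.
        return all(context(y2, a) for a in ABILITIES if context(y1, a))

    pairs = list(combinations(SCALE, 2))    # all (lo, hi) with lo < hi
    preserving = all(entails(hi, lo) for lo, hi in pairs)
    reversing = all(entails(lo, hi) for lo, hi in pairs)
    if preserving and reversing:
        return "degenerate"
    if preserving:
        return "scale preserving"
    if reversing:
        return "scale reversing"
    return "non-scalar"


def almost_asserted(y, a):
    """(i) only: the solver comes close to success on y."""
    return a - y >= -EPS


def almost_full(y, a):
    """(i) and (ii): close to success, but no actual success."""
    return -EPS <= a - y < 0


def barely_asserted(y, a):
    """(i) only: the solver comes close to failure on y."""
    return a - y <= EPS


def barely_full(y, a):
    """(i) and (ii): close to failure, but actual success."""
    return 0 <= a - y <= EPS


for ctx in (almost_asserted, almost_full, barely_asserted, barely_full):
    print(ctx.__name__, "->", direction(ctx))
```

In this toy setting only the asserted components behave as the discussion predicts (almost preserving, barely reversing); once the second entailment is conjoined, neither operator supports scalar inferences in either direction.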
While polarity licensing may depend on scalar inferencing, polarity items are not, it seems, sensitive to just any inference a context might support. As Horn (2002) suggests, the important thing is not what a sentence entails, but rather what it actually asserts:

Semantically entailed material that is outside the scope of the asserted, and hence potentially controversial, aspect of utterance meaning counts as assertorically inert and hence as effectively transparent to NPI-licensing and related diagnostics of scalar orientation. (Horn 2002: 63)

The existence of such assertorically inert entailments clarifies a fundamental fact about the nature of polarity sensitivity. The inferences which can license or block a polarity item are not just a matter of a sentence’s objective truth conditions, but rather are features of the ways a sentence can be used: they are,

specifically, the inferences which a sentence is meant to communicate or taken to mean in the event of an illocutionary act. And in this sense the mechanism of polarity licensing is essentially a matter of pragmatics: really, it all depends on what is said and how it’s taken. The idea that licensing in some sense takes place at the level of the speech act or utterance rather than at some more autonomous level of grammatical structure may also help with one of the great old mysteries of polarity sensitivity – that is, why both NPIs and PPIs are commonly licensed in interrogative clauses. The basic facts here have never fit easily with standard accounts of licensing, since questions are neither essentially negative nor obviously scale reversing (van Rooy 2003). NPIs like any and ever are licensed in all sorts of interrogative clauses: in yes–no questions in (25a), information questions in (25b), and indirect questions in (25c).

(25) a. Will Norm ever solve any of these problems?
     b. When has Norm ever solved anything?
     c. I wonder if Norm will ever solve any of these problems.

Since the effect of the NPIs here, particularly with ever and any together in the same clause, is a strong bias towards a negative response like “no” or “never,” these sentences do have a negative flavor. But the negativity seems to come directly from the NPIs rather than the interrogative contexts, and NPIs normally cannot license themselves. And in any case the same contexts also license PPIs like some, already, long since, and every single one in (26).

(26) a. Has Norm already solved some of these problems?
     b. Who has already solved some of these problems?
     c. I wonder if Stella has long since solved every single one of these problems.

Presumably, if NPIs really do require scale-reversing contexts and PPIs scale-preserving ones, then interrogative clauses must somehow provide both. But do they provide either? Is there any entailment either way with the pairs of questions in (27–29)?

(27) a. Will Norm be able to solve the easy problems?
     b. ? → Will Norm be able to solve the hard problems?

(28) a. When has Norm solved the easy problems?
     b. ? → When has Norm solved the hard problems?

(29) a. I wonder if Norm will be able to solve the easy problems.
     b. ? → I wonder if Norm will be able to solve the hard problems.

The first problem is just to explain what it means for one question to entail another. A standard approach is to define the meaning of a question in terms of

its possible answers, so that one question is said to entail another just in case any true and complete answer to the former is also a true and complete answer to the latter. Or, more generally, we can say that a sentence P pragmatically entails another sentence Q just in case the utterance of P normally commits a speaker to everything (at least) that the utterance of Q commits her to: thus a question P entails another question Q only if any speaker, in asking P, also asks in effect whatever would be asked by asking Q. But this is clearly not the case for the questions here. The putative entailment in (27) fails since a “yes” answer might easily be true for (27a) but false for (27b); and similarly for (29), one can perfectly well wonder whether Norm can solve an easy problem without thereby wondering whether he can solve any harder problem. Interrogative clauses are thus not scale reversing, and they are not scale preserving either, since one could easily wonder about a man’s ability with some hard problems without thereby questioning his ability with the easier problems. According to these criteria then, interrogative clauses appear to be inherently non-scalar. This makes sense if interrogatives really are neutral between affirmation and denial; still, the act of posing a question is rarely rhetorically neutral.3 While the primary inference licensed by a question may be just that the questioning speaker is unsure of the truth of a questioned proposition, still, as Fauconnier (1980: 63–4) points out, the use of a question like (27a) can say something about a speaker’s attitude toward questions like (27b). If a speaker wonders about the easy problems, then she must either believe that Norm cannot solve the harder ones or else be uncertain about his ability, since if she believed that he could solve the hard ones, she would have to conclude, by the logic of the scale principle, that he could solve the easy ones too.
Thus, while questioning a proposition in a scalar model does not entail any questioning of propositions higher in the model, it does presuppose that one either doubts or disbelieves those higher propositions. So the (a) questions in (27) really are stronger than the (b) questions since they express a greater degree of doubt and wonder. And in this sense, interrogatives really can be scale reversing, but the relevant inferences are not between the questions themselves, but the levels of doubt the questions express. Questions on their own seem to be inherently non-scalar, but the act of posing a question is itself enough to express a scale-reversing doubt. Or at least it can be. The important point is that only where such doubts are available do NPIs get licensed in interrogative clauses. And interrogatives do differ in this respect. For closed interrogatives like the yes–no questions in (27), the logic is fairly simple: just by posing a question about Norm’s ability with easy problems, a speaker will normally implicate her doubt that Norm

could solve the harder ones. But with the open interrogatives in (28), the questions turn not on problem-solving abilities per se, but on the particular times such abilities have been demonstrated. Taken literally, such questions are not scale reversing even in the extended sense described above: one can perfectly well wonder when Norm solved an easy problem without being in any doubt as to whether or when he solved the harder ones. In order to get the inference from easy problems to hard ones here, the questions must be taken as purely rhetorical, not as requests for information about an event, but as expressions of doubt that an event took place at all. And taken this way, a sentence like (28a) is not really a question at all but a sort of indirect denial. Since information questions are only scale reversing on a rhetorical reading, it follows that only rhetorical information questions should license NPIs. And indeed, this is the case. As the contrasts in (30–31) suggest, while any and ever can occur with minimal or no bias in yes–no questions, they come with a note of skepticism or disbelief when they occur in information questions.

(30) a. Do you want any bourbon?
     b. Who wants any bourbon?

(31) a. Have you ever hunted wild boar?
     b. When have you ever hunted wild boar?

These facts suggest that interrogative clauses license polarity items not because of what they mean (their semantic entailments) but rather because of the ways they can be used (their illocutionary effects). In other words, it’s not the question itself that licenses a polarity item, but the way it is posed. While approximatives like few, exclusives like only, and adversatives like regret all come with entailments or implicatures which might be expected to block reversed scalar inferences, interrogatives on their own seem to lack the entailments needed to trigger scale reversing. These constructions are not united by their contributions to truth-conditional meaning but rather by their effects on the presentation of such meanings – that is, by the ways they frame a proposition within a scalar model. The evidence reviewed in this section thus suggests that polarity contexts are defined not by their logical meanings alone, but rather and crucially by the ways their meanings affect pragmatic inferencing in a communicative context.

3.5 Syntactic constraints on scalar construals

Beyond the basic problem of finding an eligible trigger, negative polarity items must negotiate a number of minor constraints and petty regulations in order to receive a license. Prominent among these are locality conditions prohibiting certain operators from intervening between an item and its licensor (Linebarger 1980, 1987) and the precedence condition which requires a licensor to precede the items it licenses (Ladusaw 1979; Hoeksema 2000). The behavior of NPIs in contexts with multiple licensors is also relevant here: given the logic of scalar reversal and the law of double negation, one might expect any two scale-reversing triggers to form a scale-preserving context, so that NPIs would require an odd number of triggers to be licensed, but this is not always the case (Baker 1970; Chierchia 2004). These constraints show that there is more to licensing than the mere presence of an eligible licensor; however, their effects apply not just to polarity licensing but to the availability of scalar construals more generally. As a group they show just how delicate scalar construals can be, and they further support the claim that such construals are precisely what licensing depends on.

3.5.1 The precedence condition

All human languages feature utterances structured as a linear progression of symbolic units, and many languages have rules which govern the ordering of sentence constituents. As (32–34) suggest, NPIs like English any normally appear only after an overtly preceding trigger.

(32) a. The dancers didn’t eat anything for breakfast.
     b. *Anything wasn’t eaten by the dancers.

(33) a. None of the dancers ate anything for breakfast.
     b. *Any of the dancers ate nothing for breakfast.

(34) a. Nobody came to visit at any time.
     b. *At any time nobody came to visit.
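As a rough illustration of the precedence condition, and not a claim about how licensing is actually computed, the following Python sketch checks whether every NPI in a single clause is preceded by some trigger. The word lists TRIGGERS and NPIS and the function names are assumptions made for this example only; they cover just the forms that appear in (32–34).

```python
import re

# Hypothetical word lists for this sketch; they are not an inventory of
# English triggers or NPIs, only the forms used in examples (32-34).
TRIGGERS = {"didn't", "wasn't", "none", "nobody", "nothing"}
NPIS = {"any", "anything"}


def tokenize(sentence):
    """Lowercase the sentence and split it into simple word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def satisfies_precedence(sentence):
    """Return True if every NPI is preceded by at least one trigger.

    The check is purely linear: it treats the sentence as a single
    clause and ignores structure and scalar construal altogether.
    """
    tokens = tokenize(sentence)
    trigger_positions = [i for i, t in enumerate(tokens) if t in TRIGGERS]
    npi_positions = [i for i, t in enumerate(tokens) if t in NPIS]
    if not npi_positions:
        return True  # nothing to license
    if not trigger_positions:
        return False  # an NPI with no trigger at all
    first_trigger = min(trigger_positions)
    return all(first_trigger < i for i in npi_positions)


if __name__ == "__main__":
    for s in ["The dancers didn't eat anything for breakfast.",
              "Anything wasn't eaten by the dancers."]:
        print(satisfies_precedence(s), s)
```

A linear check of this kind reproduces the judgments in (32–34), but only because those examples are single clauses; it says nothing about why precedence should matter.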

The precedence condition seems to apply only within a single clause, since NPIs can occur before their licensor when embedded in the clausal complement of an adversative predicate, as in (35), though only if the NPI-containing clause is itself preceded by some overt indication of its subordinate status.

(35) a. *(The idea that) she would so much as think of betraying you sickens me.
     b. *(The notion) he would budge an inch to save her is pure fantasy.
     c. *(That) she would lift a finger to help him comes as a complete surprise.

As Ladusaw (1979) acknowledged, the precedence condition greatly undermines the idea that polarity licensing is a matter of logical semantics: since the logical operators which trigger polarity items take scope over whole propositions, their surface word order should not affect licensing. Basically, the (a) and (b) sentences in contrasts like (32–34) above are, or should be, logically identical. This suggests that it is just a hard syntactic fact of life that NPIs must follow their licensors, at least when both occur in the same clause. But of course the same proposition can be construed and constructed in many different ways, and so the precedence condition may reflect something about the ways NPIs contribute to the construal of a proposition. The problem seems to be that where NPIs occur before their licensors, they are liable to be interpreted as denoting particular, individuated referents, rather than virtual scalar endpoints. Thus, (36a) suggests a specific finger that was lifted to help no one, (36b) a specific occasion of Hillary batting just one eye, and (36c) a particular expression of interest slighter than all others.

(36) a. *Bill lifted a finger to help none of his friends.
     b. *Hillary batted an eye at none of his outrageous antics.
     c. *Monica has shown the slightest interest in none of these issues.

The important intuition is perhaps clearest when NPIs occur in subject position, and so are more likely to be construed as topical. The sentences in (37) are heavily biased toward an interpretation in which their respective fingers, interests, and eyes are construed as referential, and so the effect of ungrammaticality is markedly enhanced.

(37) a. **A finger was lifted to help no one.
     b. **An eye was batted at none of his antics.
     c. **The slightest interest was not expressed.

It appears, then, that for at least some NPIs, the scalar properties of a licensing context must be established on-line before the item itself can be introduced. In fact, it is difficult, though not quite impossible, to construe any end-of-scale expression within the scope of a following tautoclausal scale reverser. Thus, in (38), while the superlative NP the simplest problem may allow a quantificational interpretation with the following negative (particularly if construed as the focus of even), the construction of the sentence as a whole is awkward in a way that seems analogous to the ill-formedness in violations of the precedence condition.

(38) a. ??The simplest problem couldn’t be solved by those fools.
     b. ??Oscar solved the simplest problem on none of his exams.

The awkwardness of these examples does not, of course, explain the precedence condition, but it does suggest that the condition may be a constraint on scalar construals in general rather than, as often assumed (Crain & Pietroski 2002), a purely syntactic rule.

In any case, as Hoeksema (2000) has shown, the condition does not apply equally to all polarity items. English NPIs like as of yet, auxiliary need, and can stand are among those which can and often do precede their licensors.

(39) a. As of yet, there has been no answer from the Klingons.
     b. You need not trouble yourself with the details.
     c. I can stand the anticipation not a second longer.

Other NPIs, or NPI-like constructions, actually require their licensor to follow them. For example, the likes of which construction, illustrated with examples from Google in (40–41) below, features a polarity trigger inside the relative clause headed by which and thus after (or perhaps inside of) the NPI it licenses.

(40) a. Reality has a taste the likes of which fiction can rarely match.
     b. It was a manhunt the likes of which we will never see again.
     c. Saddam believes that he is a great natural leader, the likes of which his world has not seen in thirteen centuries.

(41) a. *Reality has a taste the likes of which fiction can sometimes match.
     b. *It was a manhunt the likes of which we will see again.
     c. *He is a natural leader, the likes of which his world has often seen.

Still, in a language like English, where negation canonically occurs at the start of a predicate, most NPIs do mostly occur after their licensors. The important point for my purposes here is just that where NPIs are blocked by a failure of precedence, the scalar inferences which license NPIs also appear to be systematically blocked.

3.5.2 Intervention effects

Sentences involving multiple operators are often multiply ambiguous. Negation is especially notorious for creating ambiguities which depend on what falls within its scope and what, precisely, is being negated. Example (42) illustrates a scope ambiguity arising from the interaction of negation with the quantifier every. On the reading in (a), the quantified NP every child is unaffected by the negation, which therefore is said to take narrow scope; on the reading in (b), every child is affected by the negation, which in this case is said to take wide scope. (Note that other readings are possible too: for instance, where a story takes wide scope with respect to negation.)

(42) Margaret didn’t tell every child a story.
     a. There is no story that Margaret told to every child.
     b. Not every child was told a story by Margaret.

At this point an interesting fact emerges. If we substitute the NPI any for the indefinite article a in this example, the ambiguity disappears. The sentences below are only grammatical where the quantifier every takes wide scope with respect to negation and the NPI any is interpreted in the immediate scope of negation.

(43) a. Margaret didn’t tell every child any story.
     b. Margaret didn’t tell any stories to every child.

The sentences in (43) are acceptable only on a narrow scope reading for negation. For this reason, given normal background assumptions, the (b) sentences in (44–46) sound odd or ungrammatical. In each case, the reading in which the any NP takes wide scope over the every NP is blocked or at least made unlikely for pragmatic reasons: one would not expect anyone to give each of his wives the same painting or to put the same egg in every basket consecutively.

(44) a. Pablo didn’t give all of his wives a painting.
     b. *Pablo didn’t give all of his wives any paintings.

(45) a. Gwyneth didn’t fill most of the baskets with eggs.
     b. *Gwyneth didn’t fill most of the baskets with any eggs.

(46) a. Hillary didn’t make a donation to every charity.
     b. *Hillary didn’t give a red cent to every charity.

Horn (1998) makes a similar point when he argues that intervention effects explain the failure of absolutely and just to occur with polarity sensitive any (e.g. Alf won’t eat (*absolutely) any squid). The problem, according to Horn, is that degree adverbs like absolutely incorporate the semantics of a universal quantifier (see also Klein 1998 on the semantics of degree adverbs). In general, certain quantificational expressions – in English, for example, all, every, most, several, and always, among others – seem to absorb the force of a negative operator and block the licensing of NPIs in their scope (Linebarger 1980, 1987; Chierchia 2004), though other logical expressions – in particular, indefinite determiners like many and modals like necessarily and can – do not have this property. While the Scalar Model of Polarity cannot, on its own, explain these facts, it does at least predict them. The prediction is that NPIs should be acceptable in just those sentential contexts which reverse entailments in a scalar model. As it turns out, one basic effect of an intervening quantificational operator is to block scale reversal. The point is illustrated by the examples below where a single doughnut and the easiest problem cannot receive a scalar construal when they are interpreted in the immediate scope of every. (47)

a. Parker didn’t eat a single doughnut in every diner she visited.
b. Bruno can’t solve the easiest problem on every test.

The preferred reading for (47a) is the one on which it entails that Parker ate no doughnuts, that is, the reading in which every takes wide scope over negation and a single takes narrow scope. Another possible reading entails that in at least some restaurants the cardinality of the doughnuts consumed by Parker was other than one. The reading which is, as far as I can tell, perfectly impossible, is the one where Parker ate no doughnuts in only some of the diners – that is, where negation takes wide scope over every, every takes wide scope over a single, and a single gets a scalar construal. Similarly, (47b) cannot be construed to mean that it is not on every test that Bruno fails to solve any problem (only perhaps on some tests). For whatever reason, every seems to absorb the scale-reversing properties of the negation and so blocks the quantificational reading of the superlative NP. Again, the constraints on polarity licensing appear to be part of a larger pattern affecting the availability of scalar construals in general.

3.5.3 The paradox of double negation

One of the oldest truths about negation is the law of double negation, that the negation of a negative proposition yields a positive proposition. More generally, as has been known since the work of the medieval scholastics (see Horn 1989, 1996; Sánchez Valencia 1991 for the history), in any complex expression containing multiple scale-reversing operators, the context as a whole will be scale preserving if the number of reversers is even, and scale reversing if the number of reversers is odd. This fact leads one to expect that if polarity items really are sensitive to scale reversal, polarity licensing should be sensitive to the ways multiple scale reversers can combine in a single complex context. But the facts here are more complicated than one might expect.
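The parity rule just stated can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sign encoding (+1 for a scale-preserving operator, -1 for a scale-reversing one) and the function names are introduced here for convenience and are not part of the original analysis.

```python
from functools import reduce

# Illustrative encoding (an assumption of this sketch):
# +1 marks a scale-preserving operator, -1 a scale-reversing one.
PRESERVING = 1
REVERSING = -1


def net_direction(operators):
    """Return the direction of a context built from a sequence of operators.

    Multiplying the signs implements the parity rule: an even number of
    reversers yields PRESERVING, an odd number yields REVERSING.
    """
    return reduce(lambda acc, op: acc * op, operators, PRESERVING)


def naive_global_prediction(operators):
    """Naive prediction: an NPI is licensed iff the whole context reverses."""
    return net_direction(operators) == REVERSING


if __name__ == "__main__":
    print(net_direction([REVERSING]))             # one reverser
    print(net_direction([REVERSING, REVERSING]))  # two reversers
```

The function naive_global_prediction encodes only the expectation that licensing tracks the parity of the whole sentence; it is a baseline against which the actual facts can be compared, not a theory of licensing.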
Baker (1970) was the first to address the problem of double negation in a systematic account of polarity phenomena. As he pointed out, while the PPIs would rather and still are blocked by a single negation in sentences like (48a and 49a), the same forms are licensed when they appear embedded under two negations, as in (48b and 49b) (1970: 171–2).

(48) a. *Karin wouldn’t rather be in Montpellier.
     b. There isn’t anyone here who wouldn’t rather be in Montpellier.

(49) a. *Someone isn’t still holed up in that cave.
     b. You can’t convince me that someone isn’t still holed up in that cave.

In much the same way, NPIs in affirmative clauses are licensed when that clause appears embedded in an appropriately scale-reversing context. Thus, in the (b) sentence below, the NPIs anything and at all are licensed because the sentence as a whole provides a scale-reversing context, even if the local clauses in which they appear do not.

(50) a. *The doctor is doing anything at the moment.
     b. I’m not sure that the doctor is doing anything at the moment.

(51) a. *Ellen is all that interested in your sordid tales.
     b. I find it hard to believe that Ellen is all that interested in your sordid tales.

If NPIs really are sensitive to scalar inferences, then they should also be blocked where an even number of scale reversers combine to form a scale-preserving context. In certain cases this prediction is borne out. Hoeksema (1986: 37–8) offers the examples below as a case in point.

(52) a. Every student who knows anything about logic should know Modus Ponens.
     b. *Not every student who knows anything about logic should know Modus Ponens.

(53) a. Only her husband was ever allowed to dance with her.
     b. *Not only her husband was ever allowed to dance with her.

In (52b) the combination of negation with the quantifier every creates a complex scale-preserving quantifier not every and so blocks polarity licensing. In (53b) the combination of negation with the normally scale-reversing focus particle only creates a complex operator not only which also blocks licensing. Usually, however, when multiple scale reversers interact across a sentence, their combinations will not affect the licensing of an NPI, so long as the NPI is appropriately licensed by one of the scale reversers at some level of sentence organization. Thus, when an NPI is licensed within a given clause, embedding that clause within a negative environment does not affect its acceptability. The examples in (54), from Baker (1970: 177–8), and (55), from Horn (1996), show that doubling negations do not necessarily cancel the licensing power of a well-placed local negation, and so most multiply-negated clauses will welcome either NPIs or PPIs – though perhaps not both at once (54c).4 (54)

a. There isn’t anyone here who wouldn’t care to do anything down town.
b. There isn’t anyone here who wouldn’t rather do something down town.
c. *There isn’t anyone here who wouldn’t rather do anything down town.

(55) a. None of the guests who had seen any of the suspects were excused.
     b. None of the guests who hadn’t seen any of the suspects were questioned.

These facts show that the scalar inferences needed for licensing need not be computed globally over a complete sentence, so long as they are locally available within some well-formed subpart. In (54a), for example, the NPIs care to and anything are licensed by the negated proposition expressed in the relative clause despite the fact that the matrix negation makes the sentence as a whole scale preserving and in fact yields the global entailment that everyone wants to do something in town. The licensing of NPIs in doubly negated contexts shows that if polarity items are licensed by scale reversal, the domain in which they are licensed cannot be that of the sentence as a whole but must be flexibly defined in a way that will include both simple propositions expressed by a single clause and the multi-propositional complexes formed by multi-clausal sentences. As Krifka puts it, “the semantic contribution of a polarity item can be exploited at various levels of a complex semantic expression, not just at the uppermost level of the sentence” (1995: 244). On the basis of such observations, Baker (1970: 178) offers what is surely the most colorful theory of cross-clausal polarity licensing in the literature:

We can think metaphorically of a presentational negative element as giving off paint, which spreads through any structure within the scope of that negative element. The flow of paint can, however, be stopped at any S, so that each S represents a sort of valve which, if shut, stops the flow of paint. However, if a valve is left open, the flow of paint cannot be stopped again except by some lower S.

Baker’s flowing paint theory of polarity captures the insight that for purposes of polarity licensing, a clause must be construed either as positive or as negative but never as both. There is in fact a good semantic reason for the apparently syntactic fact that the valves controlling the flow of polarity are located at S – that is, the level of the clause. The reason is that polarity licensing depends on the way a proposition is construed within a scalar model, and the clause is the minimal grammatical level which encodes a complete proposition. A clause which occurs with two scale-reversing operators can be given a scalar construal with respect to either one or both of these operators just so long as each operator can be understood as contributing to the expression of its own distinct proposition which contrasts with its own ordered set of alternatives. Problems arise only when two or more triggers either combine in a sort of complex operator (e.g. as in not every and not only in (52–53), above, and in other non-licensors like not unaware and not unlikely), or for some other reason are not easily construed as contributing to the expression of separate propositions. (56)

a. I doubt that anyone lifted a finger to help him.
b. I doubt that no one lifted a finger to help him.

(57) a. There wasn’t anyone who lifted a finger to help him.
     b. ??There wasn’t anyone who didn’t lift a finger to help him.

(58) a. It was never the case that Wilbur didn’t lift a finger to help you.
     b. ??Wilbur never didn’t lift a finger to help you.
     c. ???Wilbur never failed to lift a finger to help you.

Thus, while lift a finger is fine in (56b), where it occurs in the negated complement of the negative verb doubt, it is awkward in (57b), where it occurs in a negated relative clause modifying the negated indefinite pronoun anyone. The difference is that in (56b) the embedded negative clause denotes a negative fact which may be considered doubtful or not; but in (57b) the interpretation of the relative clause depends on the interpretation of its indefinite antecedent, and so the proposition denoted – in effect, ‘that not anyone did not lift a finger’ – is composed of two negatives which are construed together, and which therefore do cancel each other out. So it seems there may be something to the old law of double negation after all, but it really only counts where the doublings come so quickly they can’t be kept apart.

3.6 Polarity contexts are mental spaces

I have argued here that Klima’s (1964) old grammatical feature, ±Affective, is not a property of grammatical representations, but a matter of imaginable conceptualizations. The contexts which license polarity items are defined not so much by their syntactic structures or logical form, but rather, and precisely, by their effects on scalar inferencing, and so by the ways they are judged and construed. The relevant modes of judgment are not matters of individual fancy or imagination but depend on the sorts of conceptual structures which make communication possible. Affectivity is thus not properly a syntactic relation between symbolic representations, nor a matter of entailments between objective propositions, nor even just a property of the ways a proposition can be subjectively entertained. It is a matter of meaning and communication. Polarity contexts are mental spaces in which conceptual contents can be jointly imagined and considered in a meaningful discourse. They are defined not just by the denotations of linguistic constructions, but by the very acts in which those constructions are used. Licensing indeed depends on meaning – not just sentence meaning, but speaker meaning as well; not just what is entailed, but also and crucially what is said.

4 Sensitivity as inherent scalar semantics

Nothing that actually occurs is of the smallest importance.
Oscar Wilde (1894)

4.1 Scalar operators

Why should polarity items be sensitive to scalar inferencing? In some cases the answer is simple. Just like the quantificational superlatives, many NPIs literally designate a scalar endpoint. Some are themselves superlatives indicating minimal degrees: the foggiest notion, the least bit, in the slightest. Others, like sleep a wink, lift a finger, and a shred of evidence, feature a stereotypical minimal unit on some scale. These minimizer NPIs are like superlatives which only allow a quantificational reading: they have no inherent referential value, and so they cannot refer to a specific minimal unit, but they can be used emphatically, as a way of triggering reference to the ordered set of elements on a conceptual scale. And of course this is only possible in scale-reversing contexts, where pragmatic inferences are licensed from lower to higher scalar values. So, at least for the minimizer NPIs, the sensitivity to scalar inferencing seems intuitively well motivated. But how should this sensitivity be represented in the lexicon? And more importantly, will this sort of intuitive explanation extend to other polarity items with similar sensitivities but with very different scalar semantic properties? This chapter seeks answers to these questions by exploring the hypothesis that polarity items in general constitute a broad but well-defined class of scalar operators. Fillmore, Kay, and O’Connor (1988) introduced the notion of a scalar operator to model the complex semantics and pragmatics of the idiomatic (and polarity sensitive) conjunction let alone. Scalar operators are themselves a special class of what Kay (1989) calls contextual operators – expressions whose meanings involve, usually in addition to constraints on the situations

they can appropriately describe, constraints on the contexts where they can appropriately be used. More precisely, contextual operators are lexical items or grammatical constructions whose semantic value consists, at least in part, of instructions to find in, or impute to, the context a certain kind of information structure and to locate the information presented by the sentence within that information structure in a specified way. (Kay 1989: 181)

Naturally, a contextual operator will not be acceptable if it occurs in a context where the information structures it requires can neither be found nor constructed.1 Not all scalar operators are polarity items, nor are all contextual operators scalar in nature. Contextual operators in general are forms which situate the expressed meaning of a sentence within some larger conceptual structure which must be pragmatically available in (i.e. which can be found in or imputed to) the context. Such conceptual structures need not be scalar. Forms like respective, respectively, and vice versa, for example, are non-scalar contextual operators whose basic function is to direct traffic among multiple sets of denotata as they are mapped onto their intended propositional roles (Kay 1989). Other contextual operators include hedges like technically, strictly speaking, loosely speaking, a regular X (Lakoff 1972; Kay 1983), and discourse particles such as as a matter of fact, as it turns out, and of course but. Scalar operators in particular are forms which must be construed with respect to a scalar model: they presuppose a scalar model available in the context, and they require the information they express to be integrated with that scalar model in a particular way (Fillmore, Kay & O’Connor 1988; Kay 1990, 1997). The focus particle even exemplifies one of the most common types of scalar operator, the scalar focus particle. There have been many proposals concerning the peculiar contribution words like even make to a sentence (Horn 1969; Fauconnier 1976; Kay 1990; König 1991; Francescotti 1995; among others), but there is a broad agreement, at least to a first approximation, that a sentence containing even will express a proposition which is somehow less expected or more informative than some other contextually supplied proposition.
So in a sentence like Even Ezra failed the exam, the particle even does not affect the truth conditions of the expressed proposition but rather introduces a presupposition that the expression of this proposition is somehow very surprising. More precisely, even presupposes that the element in its focus, in this case Ezra, represents the least, or one of the least, likely values one would expect to satisfy the proposition over which even has scope – in this case, x failed the exam. In

effect, even here presupposes a scale in which individuals are ranked in terms of their propensity to failure, and it presupposes that the focus element, Ezra, occupies a position at or near the bottom of this scale. I suggest that polarity items as a class are like even in that they impose a scalar construal on an expressed proposition. As Heim (1984) noted, many polarity items – and especially, the minimizers – seem to incorporate the semantics of even itself, requiring that the expressed content of a proposition be construed as the least likely in a scale of alternative propositions. But as Heim also noted, not all polarity items can be so analyzed: in particular, indefinite NPIs like any and ever differ from their minimizer cousins in several critical respects (Lee & Horn 1994; Rullmann 1996; §7.4 below). And polarity items like much and all that (as in she doesn’t cheat much or he rarely gets all that drunk) differ from both indefinite and minimizer NPIs in their scalar effects (Linebarger 1980: 236), making for weakly informative or understated propositions instead of strong, emphatic propositions. But while polarity items are not entirely uniform in their scalar effects, they are united at least in that they all have scalar effects. I therefore propose that all polarity items conventionally encode two sorts of semantic properties inherent in the construction of a scalar model.

4.2 Two scalar properties

Every proposition within a scalar model is distinguished by two basic properties. Quantitative value (q-value) refers to a proposition’s position within a scalar model: the higher a proposition is along a scale, the higher its quantitative value. Informative value (i-value) refers to a proposition’s relative informativity within a model: the more entailments a proposition has within a model, the higher its i-value. Q-value and i-value are essentially properties of propositions within a scalar model, but they also find work as lexical features. Many lexical and grammatical constructions are conventionally specified as encoding either a particular q-value or a particular i-value or both. Such forms are, by definition, scalar operators. Since scalar models are built on conceptual scales, the idea of q-value seems relatively straightforward. For many expressions, and for most polarity items, q-value is a salient, and even transparent, feature of the construction’s semantic content. Quantifiers and degree modifiers, for example, typically designate an abstract scalar extent or degree, often without reference to any particular dimension. Thus, a PPI like utterly (as in she was utterly amazed) signals that the predicate it modifies holds to a high degree, while the NPI the least bit (as in she wasn’t the least bit impressed) indicates a minimal degree. The precise

position these forms designate within a scalar ordering is vague and may vary with context, but their fundamentally quantificational nature is hardly open to doubt. In fact, quantitative value is somewhat less straightforward than it might seem at first. The temptation, naturally, is to think of a q-value as a kind of fixed, objective quantity or amount. But problems arise when one considers the analysis of gradable antonyms like fast/slow, easy/difficult, and clever/dull. There are good reasons to assume that the two terms in each of these pairs do not simply pick out different regions in a single ordering but actually define two distinct scales with inverse orderings of identical elements. One reason is that degree adverbs like very and extremely, forms which themselves seem to denote high q-values, apply equally well to both members of each pair. Presumably a phrase like very fast applies to elements “high” in speed, while very slow applies to elements “high” in slowness. But then the very same objective entity will have a high q-value with respect to one conceptual scale and a low q-value with respect to another. This seems reasonable enough – what it means is just that quantitative value is not itself an inherent property of things in the world but is always defined relative to some scale. It is, in effect, a matter of construal. The crucial question is, how can one determine which scales are operating in any given scalar model? Consider again the model of puzzles and puzzlers discussed above (§3.2.3). Here the argument space consists of two conceptual scales corresponding to each of the two participants in the propositional function ‘x can solve y’.
In theory, both of these scales could be ordered in either of two possible ways: the puzzlers on the x-axis can be arranged either from the least to the most clever or from the least to the most dim; and the puzzles on the y-axis can be ordered either from the least to the most difficult or from the least to most easy. The choice, however, is not arbitrary. In order for the model to support the right pragmatic inferences, the two scales need to be correctly coordinated. In effect this means that elements for each dimension are ordered in terms of their potential to satisfy the propositional schema which defines the model. Puzzles are ordered from the least to the most difficult, because if someone can solve a difficult puzzle, then presumably she can also solve any easier puzzle. Similarly, puzzlers are ordered from the least to the most dim, since it is the dim puzzlers who are least likely to solve any puzzles. There is something counterintuitive about this. It would seem more natural to order the puzzlers in terms of their cleverness, and to think of clever puzzlers as having more of something which less clever puzzlers lack. This is an important intuition, and it appears to be the normal way people have of

Sensitivity as inherent scalar semantics 83 thinking about scalar phenomena: in general, given any two inverse orderings for a given domain (i.e. fast vs. slow, sharp vs. dull, bright vs. dim, etc.) one ordering tends to be the default. Thus, we can ask how fast something is without assuming that it really is fast, but if we ask how slow it is, we must have some idea that it really isn’t fast. But scalar models do not necessarily use such unmarked orderings. In general, the ordering of elements on any conceptual scale within a scalar model depends on the role that particular scale plays within a larger proposition. Elements are ordered not in terms of their inherent amounts, nor even in terms of default assumptions about normal orderings, but rather in terms of their significance within a scalar model. As we will see below (§4.5), this fact has important consequences for the ways polarity items are lexicalized, and more generally, for the types of scalar reasoning which underlie polarity sensitivity. For the moment, we can think of quantitative value simply in terms of an element’s position within a scalar ordering. For a form to encode a q-value, it simply has to designate some relative or absolute position within such an ordering. In principle, this allows for an infinite number of distinct q-values, but languages are rather stingy about lexicalizing such distinctions. Among degree adverbs, the locus classicus for scalar distinctions, we tend to find no more than eight basic degrees which are lexically encoded, ranging from the absolute to the absolutely negative (Bolinger 1972; Hübler 1983; Paradis 1997; Klein 1998). For the purpose of understanding polarity sensitivity, I suggest we need only recognize two: high q-value and low q-value, both of which are defined relative to some contextual norm associated with a scale. 
In general, a coded quantitative value will not denote a precise or objectively fixed position on a scale; more often, q-values consist in a contextually determined range of scalar values. What counts as high or low on a scale depends on background assumptions and implicit norms: what’s big for a mouse tends to be small for a house. In context the construal of a conceptual scale, and hence the use of any scalar predicate, always evokes some scalar norm as an implicit standard of comparison (Sapir 1944). The point is trivial in the case of gradable predicates like tall, fast, beautiful, and intelligent. For something to count as tall, it must exceed some normal expectation about height: it must be construed with respect to a conceptual scale ordering elements in terms of their height and it must be judged as exceeding some scalar norm associated with that scale. The particular value of a scalar norm varies with the expectations and assumptions of speech act participants, but in general it simply reflects a default understanding of the entity under discussion. The scalar norm allows us to view the gradient notion of q-value as a binary opposition: propositions

84 The Grammar of Polarity above the scalar norm associated with a conceptual scale have high q-values; propositions below the scalar norm have low q-values. The need for a scalar norm is also apparent in the case of informative value. I-value depends on an expressed proposition’s inferential relation to other propositions in a model. The question is, how is this relation determined and with respect to which other propositions? If scalar norms constitute an essential, if unspoken, aspect of any scalar model, then i-value can be understood directly in terms of an expressed proposition’s inferential relation to the norm. The norm effectively represents an expectation about what proposition within a model would, in some default context, be most likely to hold. What it means for an assertion to be construed with respect to a scalar model is that it is implicitly contrasted with some alternative default proposition. Kay (1990) thus distinguishes between the expressed proposition overtly encoded by a sentence – the text proposition – and a presupposed proposition in a scalar model with respect to which the text proposition is evaluated – the context proposition. In what follows, I make a parallel distinction between the manifest content expressed by a sentence in context, which I call simply the expressed proposition, and the scalar norm, understood as a proposition within a scalar model with respect to which an expressed proposition may be understood as implicitly contrasting. In general, if an expressed proposition entails the scalar norm, then it is more informative than one might have expected and so has a high i-value; if the expressed proposition is (or would be) itself entailed by the scalar norm, then it is less informative than one might have expected and so has a low i-value. 
By defining i-value relative to an implicit scalar norm, we again reduce a gradient phenomenon to a binary opposition: propositions entailing the scalar norm have a high i-value; propositions entailed by the norm have a low i-value. This should be intuitive. In general, if a proposition entails the norm, its assertion is informative because it exceeds what one would normally expect to be asserted. I call such relatively informative propositions emphatic. On the other hand, if an expressed proposition is itself entailed by the norm, then its assertion is uninformative, or at least under-informative, because it fails to say whether the default expectation of the norm is met as well. Such under informative propositions I call attenuating. Given this basic distinction between q-value and i-value, we can now consider how these features are encoded in polarity items and how together they can create polarity sensitivities. These features are special because the content they contribute to an expressed proposition is not in fact an inherent property of that proposition but rather depends on its position within a structured set of alternatives: q-value determines an expressed proposition’s position within a

scalar model; i-value determines an expressed proposition’s inferential value with respect to other propositions in the model. Effectively, what it means for a lexical form to encode one of these properties is that the proposition to which it contributes must be construed relative to a scalar model: i-value and q-value do not simply add information to a proposition; rather, they situate a proposition within a sort of informational matrix. In this sense, both of these features have more to do with construal than they do with the objective content of an expressed proposition. In general, if a form conventionally encodes either an i-value or a q-value, it counts as a scalar operator and must be interpreted relative to a scalar model. But if a form encodes both an i-value and a q-value, it will also be a polarity item: the combination of a fixed scalar location (q-value) with a fixed inferential relation to the scalar norm (i-value) constrains a form to occurring in just those contexts where the direction of scalar inferencing is compatible with both of its scalar values. Thus, the minimizer NPIs discussed above combine a low (in fact, minimal) q-value with a high (or emphatic) i-value, and this combination effectively makes them polarity sensitive, limiting their distribution to the scale-reversing contexts in which their low q-values can support their emphatic i-values.
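The interaction of the two features can be sketched in a few lines of Python. This is only an illustrative toy, not a formalization proposed in the text: the names `Q`, `I`, `Context`, `i_value_in`, `licensed`, and `sensitivity` are hypothetical, and the sketch assumes that both features are binary and that contexts are either scale-preserving or scale-reversing.

```python
from enum import Enum

class Q(Enum):              # quantitative value relative to the scalar norm
    LOW = "low"
    HIGH = "high"

class I(Enum):              # informative value relative to the scalar norm
    ATTENUATING = "attenuating"   # low i-value
    EMPHATIC = "emphatic"         # high i-value

class Context(Enum):
    PRESERVING = "scale-preserving"
    REVERSING = "scale-reversing"

def i_value_in(q, context):
    """I-value that a proposition with q-value q ends up with in a context.

    Assumption: in scale-preserving contexts high values entail the norm,
    and in scale-reversing contexts low values entail the norm.
    """
    if context is Context.PRESERVING:
        return I.EMPHATIC if q is Q.HIGH else I.ATTENUATING
    return I.EMPHATIC if q is Q.LOW else I.ATTENUATING

def licensed(q, i, context):
    """A form is licensed where the context delivers the i-value it encodes."""
    return i_value_in(q, context) is i

def sensitivity(q, i):
    """Classify a form encoding both features as a PPI or an NPI."""
    return "PPI" if licensed(q, i, Context.PRESERVING) else "NPI"

# A minimizer: low q-value, emphatic i-value.
print(sensitivity(Q.LOW, I.EMPHATIC))
```

On these assumptions a form with a low q-value and an emphatic i-value is licensed only in scale-reversing contexts, which is the situation described above for the minimizers.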

4.3

Four sorts of polarity items

Since q-value and i-value are both, effectively, binary features, their potential combination yields four theoretically possible classes of scalar operators. The minimizers offer a clear example of one of these basic types, combining low q-values with high i-values. NPIs like much and all that illustrate a second group, in which high q-values combine with low i-values. As NPIs, these forms show roughly the same distributional constraints as the minimizers, though their pragmatic purpose in life is quite different, being used not to strengthen but rather to mitigate the force of a negative utterance. The contrast in (1) between the NPIs much and a wink illustrates the difference. (1)

a. Margo did *(not) sleep a wink before her big test.
b. Margo did *(not) sleep much before her big test.

Intuitively, (1a) makes a strong claim by denying that Margo slept even the smallest amount, while (1b) makes a weak claim by denying only that Margo slept for a long time. In (1a), a wink expresses a minimal q-value and produces an emphatic sentence; in (1b), much marks a relatively high q-value and produces an understatement.
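The contrast in (1) can be made concrete with a toy scalar model. The following is a minimal sketch under stated assumptions: the three-point scale, the choice of 'a normal amount' as the scalar norm, and the function names `entails` and `i_value` are all hypothetical simplifications introduced for illustration.

```python
# Hypothetical amounts of sleep, ordered from least to most.
SCALE = ["a wink", "a normal amount", "much"]
NORM = "a normal amount"   # assumed scalar norm

def entails(p, q, negated):
    """Does 'M slept (at least) p' entail 'M slept (at least) q'?

    With negated=True the question is asked of the two negated
    propositions, where the direction of entailment is reversed.
    """
    rank_p, rank_q = SCALE.index(p), SCALE.index(q)
    if negated:
        # Not sleeping even p entails not sleeping any larger amount q.
        return rank_p <= rank_q
    # Sleeping at least p entails sleeping at least any smaller amount q.
    return rank_p >= rank_q

def i_value(amount, negated):
    """Informative value of the expressed proposition relative to the norm."""
    if amount == NORM:
        raise ValueError("the expressed amount must differ from the norm")
    if entails(amount, NORM, negated):
        return "emphatic"       # expressed proposition entails the norm
    return "attenuating"        # expressed proposition is entailed by the norm

print(i_value("a wink", negated=True))   # (1a)
print(i_value("much", negated=True))     # (1b)
```

In this toy model the negated sentence with a wink entails the negated norm, while the negated sentence with much is entailed by it, matching the intuition that (1a) is a strong claim and (1b) an understatement.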

86 The Grammar of Polarity Similar examples abound. As noted above (§2.2), one of the most common sorts of NPI is the minimizer – an expression which denotes a minimal quantity or a scalar endpoint and which serves a stereotypically emphatic function (Borkin 1971; Schmerling 1971; Fauconnier 1975a; Horn 1989). Examples in English include drink a drop, (spend) a red cent, budge an inch, lift a finger, and have a snowball’s chance in hell, and similar examples are found in many (and perhaps all) other languages. Another common class of emphatic NPIs includes degree adverbs like English at all, in the slightest, and the least bit, and cross-linguistic counterparts like French le moindre ‘the slightest,’ Hindi zaraa-(bhii) ‘(even) a little,’ and kataii-(bhii) ‘at all’ (Vasishth 1998), and Japanese ikkoo-ni and kaimoku, both meaning roughly ‘at all’ (McGloin 1972). Other emphatic NPIs include scalar conjunctions like let alone and much less, modal constructions like can possibly, and a variety of verbs and verbal idioms such as budge, can stand, can stomach, can fathom, and would dream of. Also in this class are the classic indefinite polarity items any and ever, which in most, though not all (pace Kadmon & Landman 1993), of their uses are clearly emphatic (Heim 1984; Krifka 1995; Rullmann 1996; Israel 1995a, 1998a; §7 below). Attenuating NPIs patterning like the construction with much in (1b) have attracted less attention than their emphatic counterparts, but they are very common both in English and cross-linguistically. Other obvious English items include the temporal adverbial long (e.g. he won’t last long); the degree adverb all that (e.g. he’s not all that clever); and certain uses of many, which in informal usage tends to be replaced by a lot of in positive contexts. 
Similar NPIs from other languages include French grand chose ‘a whole lot,’ grand monde ‘many people,’ grand choix ‘much choice,’ pour autant ‘for all that,’ and de sitôt ‘so very soon’ (Gaatone 1971; Bouvier 2002: 229); German sonderlich ‘particularly’; and Dutch bijster ‘very,’ pluis ‘plush,’ or ‘easy,’ and mals ‘tender, gentle’ (van der Wouden 1997; Klein 1998); Japanese sonna-ni ‘that much,’ anmari ‘too very,’ rokuni ‘much,’ and betu-ni ‘particularly’ (McGloin 1972: 82); and Persian cœndan ‘much’ and un-qœdrha ‘that much’ (Raghibdoust 1994). Appropriately enough, everything is backwards when polarity is reversed: the neat division of NPIs into low-scalar emphatics and high-scalar attenuators is mirrored by a division of PPIs into high-scalar emphatics and low-scalar attenuators. The abundant emphatic PPIs include quantificational idioms like heaps of, scads of, and the whole shebang, and degree modifiers like horribly, utterly, and amazingly. These forms encode high-scalar q-values in bold, expressive, high i-value assertions, and their use tends to signal a high degree of speaker confidence in the content of an expressed proposition. Attenuating

PPIs, the last of the four types, also include a wide variety of quantifiers (some, several, few, scant), quantificational idioms (a dab, a tad, a trifle, a soupçon), and degree modifiers (pretty, fairly, kinda). These forms encode (relatively) low-scalar q-values in hedged, low i-value assertions: their use tends to signal either a certain tentativeness or, at least, a desire not to insist too strongly on one’s point. Consider the contrast between the low-scalar PPI a little bit and the high-scalar scads. (The status of these expressions as PPIs is demonstrated by their unacceptability in the polarity context formed by rarely.) (2)

a. Belinda (*rarely) won scads of money at the races.
b. Belinda (*rarely) won a little bit of money at the races.

Again, the difference is intuitively straightforward: (2a) makes an emphatic assertion that Belinda won a very large quantity of money, while (2b) modestly asserts the winning of only a small quantity. Once again, there is a correlation between a polarity item’s informative and quantitative values, only here the correlation is the mirror image of that found with the NPIs in (1): scads designates a high quantity and produces an emphatic sentence; a little bit designates a small quantity and produces an understatement. Similar examples of both low-scalar attenuating and high-scalar emphatic PPIs are readily multiplied. Indeed, if anything, it seems that both sorts of PPIs may be far more abundant than NPIs. High-scalar PPIs, among them what Hinds (1974) calls “doubleplusgood polarity items,” include comparative and superlative expressions such as far Xer, way Xer, and by far the Xest; intensifiers such as utterly, damnably, intensely, and as hell; quantifying NPs such as heaps, mountains, and tons; universalizing idioms like all the time in the world, all smiles, every jot and tittle, and the whole kit and caboodle; and a large class of slangy and unstable evaluative adjectives such as (in some registers of my own idiolect) bitchin, awesome, radical, gnarly, and way cool. There is in fact an overwhelming cross-linguistic tendency for degree words encoding an extremely high quantitative value to function as PPIs. McGloin (1972: 82) cites Japanese PPIs with meanings like ‘everything’ (nan-demo), ‘extremely’ (zuibun, hidoku), ‘very’ (taihen, totemo), ‘considerably’ (hizyoo-ni), ‘quite’ (kanari, naka-naka), and ‘all the more’ (issoo). In her comprehensive study of Dutch degree adverbs, Klein (1998: 208–9) lists just over a hundred Dutch forms expressing a very high degree, among which she identifies eighty-six PPIs (among others, enorm, flagrant, idioot, kolossaal, and vervloekt). 
And indeed, van der Wouden ventures that in any language most, if not all, “inherently intensified” lexical items are PPIs (1997: 80).

88 The Grammar of Polarity In addition to the quantificational and degree modifier constructions noted above, the class of low-scalar PPIs in English includes frequency adverbs like occasionally and at times; modal auxiliary and catenative verb constructions like would rather, could well, might as well, and might consider; and a good many verbal idioms like have a go at, give X a shot, take a stab at, do pro’s bit for, get by, make do, take a dim view of, and put in a word for. Examples from other languages include French un peu ‘a few’ and plutôt ‘rather’; German etwas ‘somewhat’ and ziemlich ‘rather’; Dutch een beetje ‘a little bit’ and nogal ‘rather’ (Klein 1998); Persian forms like qœdri ‘a bit,’ kœm kœm ‘little by little,’ and the idiomatic VP ye qolop xordœn ‘to drink a gulp’ (Raghibdoust 1994); and Japanese forms like dare-ka ‘somebody,’ sukosi ‘a bit,’ ikubun ‘to some extent,’ and tasyoo ‘somewhat’ (McGloin 1972: 81–2). Having distinguished these four classes of polarity items, it is important to acknowledge that they do not form neat, homogenous groups. The features which define them, q-value and i-value, are schematic properties and allow for a wide range of variation in the ways they apply to lexical items. The crude distinction between low and high q-value, for example, flattens fine-grained distinctions between scalar degrees that could be made in the analysis of degree modifiers. Klein (1998) thus distinguishes between expressions marking absolute, extremely high, high, moderate, and minimal degrees. There are, of course, other ways to divide up the scale, and other distinctions to be made among degree modifiers (Bolinger 1972; Hübler 1983; Paradis 1997, 2001), but these variations appear not to be relevant to the narrow question of what it is that causes polarity sensitivity. One variation which deserves special mention, however, involves the distinction between simple attenuation and actual understatement (Israel 2006; see also Margerie 2007). 
Some of the forms identified here as attenuating PPIs, for example, sometimes seem to work more like intensifiers. This is particularly true of degree words like rather and pretty and their cross-linguistic counterparts. The basic problem is inherent in the nature of attenuation. An attenuated proposition is one which says less than could have been said – which is less informative than some other proposition that could have been, but was not, expressed. As it turns out, there are two very different sorts of circumstances in which such uninformative propositions get expressed. In cases of true attenuation, one says little because that is all one wants to say; but in other cases, attenuation shades into understatement, where one says little but means much more. The distinction is evident in the different possible uses of a litotic expression like not bad (Horn 1991). One might say of a party, for example, that it was

Sensitivity as inherent scalar semantics 89 not bad. As a simple attenuation, the expression indicates merely that the party could not be characterized as unpleasant. The assertion is attenuating because it says less than one might want to know – it does not say whether the party was actually any good. But as an understatement, the same expression may convey that the party was not only ‘not bad,’ but indeed extraordinarily good. Saying little always raises the question of what is left unsaid, and in some cases, saying less may be a way of meaning more. This is the essence of understatement (i.e. minus dicimus, plus significamus). Certain attenuating forms like rather and pretty seem to be more or less conventionalized expressions of understatement as opposed to mere attenuators. But understatements are not emphatic assertions, and understating attenuators like rather and pretty are not the same as emphatics or intensifiers. In particular, a form like rather displays a certain vagueness, which makes it weaker than a true intensifier like very. To say that someone is rather beautiful is, in effect, a nuanced form of praise, and may well be perceived as less generous than the unqualified compliment without rather. Where rather does express a genuinely high-scalar value, it does so with some delicacy, as if the speaker were reluctant to express the full force of her opinion. Thus, in the examples below, the understatements with rather allow for some latitude as to the precise degree which is meant. This contrasts with the less nuanced expression of forms like very, quite, and awfully, which unambiguously commit the speaker to a high, or very high, degree. (3)

a. I was rather disappointed by your behavior last night.
b. Your performance at the party was rather impressive.
c. She is, indeed, a rather remarkable young woman.

Considerations of this sort suggest that while rather can be used to express a high degree, such uses are consistent with its basic value as a low-scalar attenuator. Similar points apply to pretty, as in a pretty good dissertation, which, although it may express a relatively high-scalar degree, also allows a certain equivocation absent from a true intensifier like very.2 It is in the nature of attenuation that more may be conveyed than is actually said. Forms which are by their very nature uninformative (that is, which encode a low i-value) naturally tend to be interpreted as coy or oblique ways of expressing something more. In the case of rather and pretty this coy effect of understatement has become so conventional that the forms function almost like intensifiers themselves. With other forms, like sort of, kind of, and fairly, the obliqueness, if present at all, remains more obvious: to call a thesis fairly brilliant could (depending on the speaker) express the highest praise, but what is

Figure 4.1 Four sorts of polarity items
[Diagram: a two-by-two grid with q-value running from low to high, the point n marked on the scale, and PPIs set against NPIs.
high, PPIs, emphatic: a heap, a ton, utterly, the whole shebang
high, NPIs, attenuating: a whole hell of a lot, much, all that much, any too
low, PPIs, attenuating: a little bit, sort of, rather, somewhat
low, NPIs, emphatic: a damn thing, an inch, at all, the least bit]

actually said is still much more attenuated. In any case, not all attenuating PPIs allow such interpretations: expressions like somewhat, moderately, and slightly do not so easily admit of obliquely emphatic, “understating,” readings. The lexicalization pattern of PPIs mirrors that of NPIs. While low-scalar attenuators are PPIs, low-scalar emphatics are NPIs, and conversely, while high-scalar attenuators are NPIs, high-scalar emphatics are PPIs. This situation is depicted schematically in Figure 4.1, which presents the four sorts of polarity items arranged in terms of their quantitative and informative values. As depicted here, emphatic – heaping- and damn-type – polarity items tend to hug the scalar extremes, while attenuating – somewhat- and much-type – polarity items hover more around the middle. But the difference between these classes has less to do with their precise positions on a scale than with how those positions are construed. In ordinary use, emphatic items express a value which is felt as somehow more than some alternative, while attenuating items express a value felt to be somehow less. The taxonomy here was first proposed in Israel (1996). Each of the four classes of polarity items had by then already been individually recognized in the literature, though they had not yet been considered as parts of a larger whole. And for the most part, they still aren’t. Instead, one class, the low-scalar emphatic NPIs, overshadows the others in most theoretical attempts to explain the causes of polarity sensitivity. As Ladusaw (1996: 336) notes: It is a theme running through the history of the investigation of this topic that negative polarity items strengthen negative statements, that they are useable

precisely where they make strong statements, and hence when the polarity items are not licensed, the sentence makes such a weak statement that it is in effect unuseable.

The conventional wisdom remains that polarity items are inherently emphatic, and that the constraints on negative polarity items (NPIs) reflect a need to be maximally informative. In their influential analysis of any, Kadmon and Landman suggest that it is “a very prominent characteristic of any as well as other NPIs that they make the statement they are in stronger” (1993: 369). The implication seems to be that it is a prominent characteristic of all other NPIs. The same intuition has motivated a succession of otherwise quite different analyses (inter alia, Krifka 1992, 1994, 1995; Jackson 1994; Lahiri 1998; van Rooy 2003; Zepter 2003; Chierchia 2004), which take some notion of informative strength or strengthening as the essence of polarity sensitivity. As Zepter puts it, “The actual licensing condition of NPIs is the requirement to be contained in a particularly strong statement” (2003: 235). This view seems unequivocally to predict that all polarity items should occur only where they are especially informative and that no polarity items could be conventionally attenuating or understating. Thus, while Krifka (1992) recognizes a connection between high-scalar PPIs and low-scalar NPIs suggesting that both are somehow inherently emphatic, he argues that some cannot be a true PPI because it is not obligatorily construed against a set of contrasting alternatives (1995: 241). But of course there are many polarity items, both NPIs and PPIs, which, like some and all that, appear only where they are less than fully informative. Many early works on polarity (Klima 1964; Baker 1970; McGloin 1972) discuss both NPIs and PPIs which are clearly attenuating as opposed to emphatic, and these forms have featured prominently in work by Horn (1989), von Bergen and von Bergen (1993), and van der Wouden (1997). 
Similarly, the formation of understatements via the denial of high-scalar expressions has been widely discussed in the pragmatics literature (Spitzbardt 1963; Bolinger 1972; Hübler 1983; Horn 1989, 1991; Margerie 2008), but rarely with reference to the phenomenon of sensitivity. Linebarger was perhaps the first to recognize “scalar endpoint” NPIs and “understater” NPIs as distinct classes (1980: 236–7), but even she denied any natural connection between them, claiming that each has its own distinct pragmatic motivation (1980: 248). While this is certainly true, it effectively ignores the shared foundation for both types of NPI in the cognitive bedrock of scalar reasoning. The proposed taxonomy is thus neither daring nor entirely original, but it does unite a set of facts which clearly belong together. Each of the four pieces

has already, in one way or another, been independently identified and discussed in the literature; but this just makes it all the more remarkable that they have so rarely been put together, for together they provide new insight into the mystery of polarity sensitivity.

4.4

Sensitivity and the square of opposition

Interestingly, the proposed taxonomy for polarity items maps rather neatly onto the classical square of opposition. The square, which goes back, at least implicitly, to Aristotle, charts a four-way contrast between elements defined in terms of quantity (universal vs. particular) and quality (affirmation vs. negation). The corners of the square are related to each other in terms of contradictory, contrary, and subcontrary opposition. The four corners are labeled with the vowels of the Latin verbs affirmo and nego: the A and I corners represent universal and particular affirmation, respectively; the E and O corners represent universal and particular negation, respectively. These are illustrated with the sentences in (4): (4)

A: All men are foolish.
I: Some men are foolish.
E: No man is foolish. (≈All men are not-foolish.)
O: Not all men are foolish. (≈Some men are not-foolish.)

By minimally redefining the quantity axis in terms of informative value and the quality axis in terms of polarity sensitivity, we can superimpose the taxonomy of polarity items directly onto the square, as illustrated in Figure 4.2 and by the sentences in (5). (5)

A: Stella is awfully clever.
I: Stella is sort of clever.
E: Stella is not at all clever.
O: Stella is not all that clever.
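The oppositions among the sentences in (5) can also be checked mechanically in a toy model. This is a rough sketch, not an analysis from the text: it assumes that cleverness comes in eleven arbitrary degrees, that sort of and awfully impose lower bounds at hypothetical thresholds, and that E and O are simply the negations of I and A. The contradictory diagonals are therefore built in by assumption; what the sketch verifies is that the contrary and subcontrary relations follow.

```python
# Hypothetical cleverness degrees and thresholds; the numbers are arbitrary.
DEGREES = range(11)
SORT_OF, AWFULLY = 3, 8

def A(d):
    return d >= AWFULLY     # Stella is awfully clever.

def I(d):
    return d >= SORT_OF     # Stella is (at least) sort of clever.

def E(d):
    return not I(d)         # Stella is not at all clever.

def O(d):
    return not A(d)         # Stella is not all that clever.

def contraries(p, q):
    """Never both true, but both false for at least one degree."""
    return (not any(p(d) and q(d) for d in DEGREES)
            and any(not p(d) and not q(d) for d in DEGREES))

def subcontraries(p, q):
    """Never both false, but both true for at least one degree."""
    return (not any(not p(d) and not q(d) for d in DEGREES)
            and any(p(d) and q(d) for d in DEGREES))

def contradictories(p, q):
    """Exactly one member of the pair is true at every degree."""
    return all(p(d) != q(d) for d in DEGREES)

print(contraries(A, E), subcontraries(I, O))
print(contradictories(A, O), contradictories(I, E))
```

The moderate degrees between the two thresholds are what make A and E jointly false and I and O jointly true, which corresponds to the case of Stella being just moderately clever.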

A moment’s introspection confirms that the oppositions between these sentences conform to those required by the square. A and E are contraries since they cannot both be simultaneously true, though they can both be false: if Stella is awfully clever, then it cannot be that Stella is not clever at all, and vice versa; although she could be just moderately clever, in which case both A and E would be false. I and O are subcontraries since they cannot both be false, though they can both be true: if it’s not true that Stella is at least sort of clever, then she must not be all that clever either; though again, if Stella is just moderately clever, it will be true both that she is sort of clever and that she is not all that clever. And I

Figure 4.2 Polarity items and the square of opposition
[Diagram: a square with the emphatics on the top edge and the attenuators on the bottom edge, PPIs on one side and NPIs on the other. A (awfully) and E ((not) at all) are contraries; I (sorta) and O ((not) all that) are subcontraries; the diagonals are contradictories.]

and E, and A and O, are contradictories since in all possible worlds, one member of each pair must be true and the other false: thus, if Stella is in fact awfully clever, it cannot be that she is not all that clever, and if it is not the case that she is awfully clever, then it must be true that she is not all that clever. The relationship between the four sorts of polarity items and the classical square of opposition is intriguing, for it suggests an important parallel between polarity items and other sorts of quantificational elements. Klein (1998: 115–26) pursues and refines these parallels in her study of degree adverbs in Dutch. As she notes, the real parallel here likely holds among degree adverbs in general, whether polarity sensitive or not, and among other sorts of quantifying expressions. Either way the evidence does suggest one general pattern to the ways in which polarity items may be lexicalized, and this general pattern seems to conform in broad outline to the patterns which hold among quantifying expressions generally.

4.5

The conspiracy theory of polarity licensing

The analysis of polarity items as scalar operators helps explain both what it is that makes polarity items so sensitive and what it is that polarity items are sensitive to. Polarity items are sensitive to scalar inferencing, and they are sensitive because of the interaction of their scalar semantic properties. In forms which are conventionally specified for both q-value and i-value, the two properties will conspire to create polarity sensitivity. The NPI sleep a wink in (6) provides a simple illustration. (6)

a. Marianne didn’t sleep a wink that night.
b. *Marianne slept a wink that night.

Here, the emphatic NPI marks a minimal scalar value in an expressed proposition and requires that the expressed proposition be construed as more informative than what would be expressed by a proposition based on the scalar norm. In (6a), since the expressed proposition, ‘M didn’t sleep the smallest amount,’ entails the norm, ‘M didn’t sleep a normal amount,’ the requirement is met: the NPI counts as emphatic and the sentence is well formed. In (6b), however, the NPI cannot properly express its emphatic i-value: here the expressed proposition, ‘M slept the smallest amount,’ is itself entailed by the scalar norm, ‘M slept a normal amount.’ This produces a relatively weak assertion, which clashes with the emphatic nature of the NPI. Similar considerations apply to attenuating PPIs like some and a smidge. (7)

a. Brandon had a smidge of jelly on his collar.
b. *Brandon didn’t have a smidge of jelly on his collar.

As a scalar operator, a smidge denotes a quantity which contrasts with the set of alternative scalar values underlying a scalar model. The expressed proposition in (7a), ‘that B had a slight amount of jelly on his collar,’ is construed relative to an implicit scalar norm in the evoked model, something like ‘B had a moderate amount of jelly on his collar.’ (Note that the scalar norm does not require any particular expectation that B should have jelly on him; rather, given that he is so besmirched, the scalar norm just reflects what might be expected to constitute a normal degree of besmirchment in such circumstances.) Since having a moderate amount of jelly on one’s collar entails having a slight amount there too, the implicit norm is more informative than the expressed proposition. A smidge happily expresses its low i-value, and so the sentence is attenuating and grammatical. In (7b), however, the implication reversal triggered by negation makes the diminutive a smidge unacceptably emphatic. Here the expressed proposition, that ‘B didn’t have a slight amount …,’ entails the implicit norm, that ‘B didn’t have a moderate amount …’ The emphatic effect is at odds with the attenuating i-value of the PPI, and the sentence is consequently odd, at best.

As these examples suggest, the particular combinations of q-value and i-value in PPIs are such that they are only compatible with contexts where inferences run from high to low q-values, that is, in scale-preserving contexts; contrariwise, the particular combinations in NPIs are such that they are compatible only with scale-reversing contexts, where inferences run from low to high q-values. This, in essence, is why polarity items are sensitive to polarity. Polarity in general is a matter of scalar inferencing and polarity items are just scalar operators: the proper expression of their lexical semantics depends on the availability of a properly constructed scalar model.

Thus far, then, we have addressed the sensitivity problem by defining polarity items as a special class of scalar operators which encode both a proposition’s position within a scalar model and a proposition’s rhetorical informativity. Given this, and given an understanding of informative value in terms of scalar inferencing, the licensing problem effectively solves itself. Polarity items require a scalar model which can support the expression of their conventional scalar semantics, and so affectivity turns out to be nothing more than the property of being construed with respect to an appropriately structured scalar model: [+Affective] contexts are scale reversing; [-Affective] environments are scale preserving.

4.6 The anomaly of inverted polarity items

As outlined above, the “conspiracy theory” of polarity licensing predicts that there should be four and only four types of polarity items. Although one never knows in advance whether a given form will encode the relevant scalar features – the association of a semantic feature with a lexical form is always essentially arbitrary – one can at least expect that these features should not interact randomly. Rather, the precise sensitivities of any given polarity item should be a direct function of the scalar features it encodes. Basically, this means that certain sorts of polarity item should not exist. For example, one should never find an NPI combining a high-scalar value with an emphatic rhetorical force – such a combination should only yield PPIs. Similarly, there should be no PPIs combining a low-scalar value with an emphatic informative value – such a combination should always create an NPI.

At this point, it seems, we have a problem. Both types of putatively impossible polarity item, or things very much like them, not only exist but are in fact rather common. Von Bergen and von Bergen (1993: 155–7), for example, discuss a variety of “maximizing” NPIs – forms which emphatically strengthen negation precisely by virtue of their high quantitative values. Typical instances from English include wild horses, all the tea in China, and a ten-foot pole, as in (8).

(8) a. Wild horses could *(-n’t) keep me away.
    b. I would *(-n’t) do it for all the tea in China.
    c. I wouldn’t touch it with a ten-foot pole.

Intuitively, it seems that the wild horses in (8a) stands for something like the most irresistible force imaginable. Similarly, in (8b) all the tea in China represents an unusually valuable reward, one high on a scale of monetary worth. And the ten-foot pole in (8c) is an unusually large instrument, one which maximizes the distance between the toucher and the touched. Such maximizing NPIs are not peculiar to English. Larrivée (1996) notes parallel constructions from French including pour tout l’or du monde ‘for all the gold in the world,’ de mémoire d’homme, roughly ‘in living memory’ (see also Gaatone 1971: 190), and de (toute) sa vie ‘in his (whole) life.’ Similarly van der Wal (1996) and Hoeksema and Rullmann (2001) note ‘maximum quantity’ NPIs in Dutch such as voor goud ‘for gold’ and in de verste verte ‘in the farthest distance.’

Parallel to these troublesome maximizing NPIs is a set of equally troubling minimizing PPIs like those in (9) – constructions with low-scalar q-values which produce emphatic effects in affirmative contexts.

(9) a. Godfrey is (*not) scared of his own shadow.
    b. She would (*not) betray us at the drop of a hat.
    c. You could have knocked me over with a feather.

Clearly, the shadow in (9a) is a minimally frightful sort of thing, the dropped hat in (9b) a stereotypically minimal provocation, and the feather in (9c) a minimally forceful tool for knocking things over, but all three contribute to the expression of maximally emphatic positive propositions. The Scalar Model would seem to make these minimalistic emphatic PPIs just as impossible as the maximalist emphatic NPIs noted above.

While inverted polarity items may be less common than their canonical counterparts, they are not particularly rare either, and in some semantic fields are fairly abundant. Both English and Dutch (Hoeksema p.c.) feature an open class of inverted NPIs denoting large time spans, as in (10). Similarly, many inverted PPIs denote minimal time spans, as in (11).

(10) a. We have*(n’t) heard from you in a coon’s age!
     b. You’ll *(never) in a million years guess who I saw last night.

(11) a. We will (*not) be back in a jiffy.
     b. In a New York minute, everything can (*not) change.

Comparable examples include PPIs like in a flash, in a second, in a trice, and in a heartbeat, and NPIs like in days, in weeks, in years, in ages, and in a blue moon.

It gets worse. Among polarity items denoting monetary values, there are emphatic constructions at both ends of the same scale: unambiguously emphatic NPIs denoting things of little value (canonical: a red cent, a plugged nickel, a thin dime, a brass farthing) and others referring to things of extreme value (inverted: for all the tea in China, for all the money in the world, for love or money, for the life of me); and both unambiguously emphatic PPIs referring to things of the greatest value (canonical: a king’s ransom, an arm and a leg) and others referring to things of negligible value (inverted: for peanuts, for a song, for a pittance, for next to nothing). This is the pecuniary paradox of polarity sensitivity.

(12) a. He won’t spend a red cent on your wedding.
     b. She wouldn’t kiss him for all the tea in China.

(13) a. Julio spent a king’s ransom on the party.
     b. But he somehow got Madonna to play for peanuts.

        CANONICAL                                  INVERTED
high    Emphatic PPIs:                             Emphatic NPIs:
        tons of, utterly, insanely, way, a heap    wild horses, in ages, all the tea in China
low     Emphatic NPIs:                             Emphatic PPIs:
        a wink, an inch, at all, the least bit     the drop of a hat, a jiffy, a pittance

Figure 4.3 Canonical and inverted polarity items

Apparently, the simple correlation between scalar semantics and polarity sensitivity cannot be as simple as one might have hoped, or as the Scalar Model would seem to predict. Maximizing NPIs and minimizing PPIs invert the normal correlations observed among the more canonical polarity items. As Figure 4.3 suggests, the existence of both canonical and inverted polarity items would seem to preclude the possibility of there being any regular correlation between scalar semantics and polarity sensitivity. On the other hand, all of these apparent counterexamples do share a clearly scalar semantics with their canonical counterparts: their rhetorical effects depend on the scalar values they encode. The distribution of emphatic inverted polarity items obeys the same scalar logic which rules the distribution of canonical emphatic items, and as with the canonical items, this scalar logic is driven by the pragmatics of informativity: such forms are acceptable only where their different q-values support the same sorts of emphatic scalar inferences.

In (12b), for example, all the tea in China denotes a highly valuable reward, and under negation triggers inferences about all less valuable rewards: presumably, if a girl isn’t tempted by all the tea in China, nothing less could tempt her either. Similarly, in (9a), if Godfrey fears such minimally fearsome things as his own shadow, he will presumably be scared of anything more fearsome – that is, in effect, of anything.

But how can these forms both invert the normal scalar semantics of canonical polarity items, and still obey the same scalar logic? The answer can be found in the different kinds of scales associated with different polarity items. As it turns out, there is a consistent correlation between the sorts of syntactic and semantic roles a polarity item plays within a proposition and its status as inverted or canonical. Prototypical minimizers – canonical NPIs like crack a book, hurt a fly, lift a finger, or breathe a word – feature indefinite direct objects which measure out the action of a predicate. Inverted polarity items, on the other hand, tend to have idiomatic NPs anywhere but direct object position. The wild horses idiom, for example, is perhaps the only English NPI with an idiomatic subject NP, while other inverted items feature indefinites governed by prepositions such as for (for love or money, for a song), in (in a flash, in a million years), with (touch with a ten-foot pole, knock down with a feather), and at (at the drop of a hat, at a moment’s notice).

Underlying this superficial syntactic distinction is a deeper semantic generalization concerning the thematic roles typically associated with canonical and inverted polarity items. Canonical polarity items tend to refer to a patient (crack a book, hurt a fly), a theme (lift a finger, move a muscle, bat an eye), or some sort of increment (sleep a wink, drink a drop, budge an inch, breathe a word). All these forms involve entities which are somehow affected by the action of the verb: they are low in the thematic hierarchy, near the bottom of the action chain (Langacker 1987). Inverted polarity items, however, feature participants at the top of the thematic hierarchy – entities which somehow facilitate the realization of an eventuality. The idiomatic use of wild horses, for example, denotes a stereotypically irresistible force which can affect an agent’s behavior. Thematically, the wild horses idiom fits into a more general class of inverted polarity items which depict a stimulus or causal trigger: for example, at the drop of a hat profiles a small event which provokes a big response, and scared of one’s own shadow profiles a minor threat that triggers major fears. Pushing the generalization a bit, forms like all the tea in China and for a song also profile stimuli of a sort – rewards which might motivate one to act. Finally, polarity items involving reference to an instrument (touch with a ten-foot pole, knock down with a feather) are always inverted: the use of a bigger or more powerful instrument tends to facilitate the performance of an act.

It appears, then, that the division between canonical and inverted polarity items reflects a deeper distinction in the ways scalar reasoning applies to different propositional roles. Certain types of participants function effectively as obstacles to the occurrence of an event; others, on the contrary, act as stimuli. A theme or patient, for example, is an entity which must be affected for an event to take place: the bigger it is, the more resistance it offers, the less likely the event will be. An agent or a stimulus, on the other hand, is an entity which must itself be effective for an event to take place: in this case, the bigger the agent, the more powerful it is, the more likely the event will be.

In this light, the pecuniary paradox noted above simply reflects the fact that valuables in an exchange are split between two very different sorts of participant roles. As a rule, any participant in a commercial event must both give something up and gain something in return: otherwise, the exchange won’t happen. We may thus distinguish between the valuables given and the valuables gained. The logic of self-interest treats these two types of valuable very differently. All things being equal, a rationally self-interested participant will strive to give up the smallest amount necessary, and to gain the greatest amount possible. The logic of commercial exchange thus depends on whether a given valuable is understood as a Resource – what one stands to lose – or a Reward – what one stands to gain. The greater the demands on one’s Resources, the less likely one will be to accept an exchange; conversely, the greater the potential Reward, the more likely one will be to accept.
Given this, canonical polarity items – emphatic PPIs like a king’s ransom and emphatic NPIs like a red cent – can be understood as expressions denoting Resources (things one can own or spend), while inverted polarity items – emphatic PPIs like for a song and emphatic NPIs like for all the tea in China – denote Rewards (things one can gain). Fundamentally, then, canonical and inverted polarity items do obey the same scalar principles. Emphatic NPIs, whether canonical or inverted, always pick out a class of participants – in this case, big rewards and small expenses – which facilitates the realization of an event. Emphatic PPIs, on the other hand, always denote the sorts of participants which militate against the realization of an event – in this case, small rewards and large expenses.

Similar considerations apply to the logic of temporal polarity items like those in (10–11), where emphatic NPIs denote large time spans (in weeks, in years, etc.) and emphatic PPIs denote minimal spans (in a sec, in a jiffy, etc.). Of course, there is nothing about the domain of time itself which makes long time spans emphatic in negative contexts and short ones emphatic in positive contexts. As it turns out, the briefest moments can also be emphatic in negative contexts, as in (14), and the longest periods can be emphatic in affirmative contexts, as in (15).

(14) a. I {won’t/??will} be half a minute.
     b. I can *(not) for a second believe she would do that.

(15) a. It has always been so, time out of mind.
     b. They’ve been married for {ages / an eternity / donkey’s years}.

The key to this apparent chaos is, again, the realization that the same type of entity – in this case, a temporal interval – may be associated with very different sorts of roles within a proposition. In the case of time spans, the crucial difference depends on the aspectual character of an expressed proposition. Basically, whether or not a long time span makes a given eventuality more or less likely depends on the durativity of that eventuality. Punctual events culminate in an instant within a temporal interval: the longer the interval, the more likely it is that the event will actually happen. Durative situations, on the other hand, must hold for every instant of a time span: the more time that passes the more likely it is that the situation will no longer obtain. So in (14) and (15), where the temporal expressions indicate how long a situation will or will not last, the expression of emphasis involves a canonical scale: brief durations are emphatic under negation; long durations are emphatic in affirmations. Inverted forms like those in (10) and (11), however, invariably designate the bounded interval within which an event takes place. Their inverted scales – with short intervals emphatic in affirmation – simply reflect the logic of punctuality: the shorter the interval, the less likely it is to include the moment where a punctual event takes place. Again, the apparent anomaly of these forms turns out to be a regular feature of the roles they play within an expressed proposition. The question remains, though, why certain propositional roles are associated with canonical scales and others with inverted scales. As I noted above, there are suggestive correlations here with traditional thematic roles like agent, patient, and instrument. But thematic structure alone is an unwieldy instrument for sorting out canonical and inverted polarity items. 
Aside from the fact that there is no consensus on the inventory of thematic roles, or even on their status within linguistic theory, it is unclear, at best, how the multiplicity of thematic roles should map onto a binary distinction between inverted and canonical scalar semantics. Put bluntly, the question is what do Agents, Stimuli, Instruments, Rewards, and Temporal Intervals all share that distinguishes them from Themes, Patients, Expenses, and Durations?

One might think of the distinction broadly as a force dynamic division between “antagonistic” participants (agents, stimuli, etc.) which facilitate the realization of an eventuality, and “agonistic” participants (patients, themes, etc.) which act against the force of an antagonist to impede the realization of an eventuality (Talmy 1985). The explanation seems appealing with a contrast like the one between the affected fly in hurt a fly and the forceful horses in wild horses, but other polarity items do not lend themselves so naturally to an analysis in terms of force dynamics. In an expression like have a clue, for example, it is hard to see how the clue acts against or impedes the ‘having’ relation. Similar considerations apply to the paper in be worth the paper it’s written on and the ghost in stand a ghost of a chance. And it seems a stretch to think of temporal intervals like a jiffy or a million years as forces compelling or impeding the occurrence of an event.

Another possibility might be to appeal to Dowty’s (1991) notion of proto-agent and proto-patient to predict when a polarity item can be inverted. One might thus propose that polarity items are inverted if and only if they occur in argument positions where they have more proto-agent properties than at least one other argument. Dowty’s proto-agent properties – that is, (1) volitional involvement, (2) sentience or perception, (3) causing an event or change of state in another participant, (4) movement relative to another participant, (5) independent existence from the event denoted by the verb – seem like a good place to start, at least. Such a proposal should make the right predictions for at least those canonical polarity items involving a direct object.
It also seems to work for the canonical NPI be worth the paper it’s written on: since both arguments of the predicate be worth entail just one of Dowty’s proto-agent properties (i.e. neither one is more agentive than the other), the NPI is canonical. But it is unclear how this approach could handle temporal adjuncts like a coon’s age and in a jiffy, which do not seem to entail any proto-agent properties. And even if the general framework does extend to such cases, it is unclear what one gains thereby, unless one can also somehow explain what it is about proto-agentivity that makes it invert polarity items.

Ultimately, to understand the division between canonical and inverted polarity items we must first understand the roles such forms play within the structure of a scalar model. A scalar model is basically a conceptual tool for thinking about the relations between different possible eventualities. The structure of the model is such that if one knows the status of a given eventuality (i.e. whether it does or does not hold), one may automatically infer the status of other, related eventualities within the model. This in fact is the key to the problem of inverted polarity. Elements on any scale within a scalar model are always ranked in the same way, that is, in terms of the entailments they yield for a given propositional schema. Those elements which in scale-preserving contexts form the propositions with the most entailments are ranked at the top of the scale; those elements which under the same conditions form the propositions with the fewest entailments are ranked at the bottom. The ranking thus does not depend on the objective properties of the scalar elements alone but is crucially determined by the way these properties interact with a given propositional schema.

Normally, of course, one thinks of scales more concretely as ordered in terms of amounts or degrees. Canonically, these orderings run from lesser to greater amounts. Prototypical scales measuring things like size, weight, or intelligence regularly conform to this pattern, and the pervasive scalar metaphor, more is up, whereby an increase in amount is conceptualized as a rise in elevation (Lakoff & Johnson 1980), similarly presupposes a canonical ordering of elements from lesser to greater quantities. But the canonical scale is in fact just a special case (albeit the default case), and in this case, as with all others, the ordering depends on the scale’s role within a larger propositional frame. The frame here involves nothing more than the attribution of a scalar property: x has property to extent-y. Such scales are always canonical, running from smaller to larger extents. Their logic reflects the fact that if some entity instantiates a property to some high degree, then it must also instantiate that property to all lesser degrees as well.
As Hoeksema and Rullmann (2001) note, this sort of logic makes canonical scales useful in reasoning about the possible existence of different entities. For example, while things may vary infinitely in weight, everything with any weight will weigh at least a minimal amount, but relatively few things will weigh as much as a ton: so, all things being equal, things with a minimal weight will be more likely to exist, and the canonical order for weights will run from light things to heavy things. The distinction between canonical and inverted polarity items depends on the fact that for every canonical scale, there exists a corresponding inverted scale, and which of these two scales appears in a scalar model depends on its role there. So, again speaking of weights, a propositional frame like this camel can carry X has a very different logic from a frame like X will break this camel’s back. The former calls for a canonical scale, the latter for an inverted one. And the choice in both cases depends on the way the variable affects the possibility of the proposition as a whole being true.
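The camel contrast can be made concrete with a small sketch. It is only an illustration under stated assumptions: the set of "worlds" (possible carrying capacities), the candidate weights, and the function names `can_carry`, `breaks_back`, `entails`, and `rank` are all hypothetical and do not come from the text. The sketch ranks the same weights by how many alternatives each one entails within a given frame, which is the ranking principle described above.

```python
from typing import Callable, List

# Hypothetical toy model: each "world" is one possible carrying capacity in kg.
WORLDS = [5, 20, 60, 120]
WEIGHTS = [10, 50, 100]  # candidate values for X, in kg

Frame = Callable[[int, int], bool]  # (weight, capacity) -> truth value


def can_carry(weight: int, capacity: int) -> bool:
    """Frame: 'this camel can carry X'."""
    return weight <= capacity


def breaks_back(weight: int, capacity: int) -> bool:
    """Frame: 'X will break this camel's back'."""
    return weight > capacity


def entails(frame: Frame, x: int, y: int) -> bool:
    """frame(x) entails frame(y) if frame(y) holds in every world where frame(x) holds."""
    return all(frame(y, w) for w in WORLDS if frame(x, w))


def rank(frame: Frame) -> List[int]:
    """Order the weights from the bottom to the top of the scale,
    where the top element is the one that entails the most alternatives."""
    def strength(x: int) -> int:
        return sum(entails(frame, x, y) for y in WEIGHTS if y != x)
    return sorted(WEIGHTS, key=strength)


print(rank(can_carry))    # heaviest weight ends up at the top (canonical)
print(rank(breaks_back))  # lightest weight ends up at the top (inverted)
```

The weights themselves never change; only the frame does, and with it the direction in which entailments run.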

In short, it is an irreducible fact about scalar logic that different roles within a proposition relate differently to the probability of the proposition’s truth. Some roles involve entities which may facilitate the realization of a proposition; others involve entities which militate against its realization. Canonical polarity items always involve roles of the latter sort; inverted polarity items always involve roles of the former sort. The scalar logic in both cases is identical, and in both cases reflects the way propositions in a model are ordered in terms of their pragmatic entailments. Inverted polarity items thus do not undermine the theory that polarity items are scalar operators; rather, they confirm it. These peculiar polarity items do, however, raise questions about the structure of scalar models and, more particularly, about the nature of quantitative value. The basic generalization remains that polarity items encode a fixed position within a scalar model, and that quantitative value is the feature which expresses this positioning. Emphatic NPIs, whether canonical or inverted, consistently encode a low-scalar q-value, and emphatic PPIs, equally consistently, encode a high-scalar q-value. Only the relevant notion of quantitative value is not a matter of size or amount per se but rather reflects the role a profiled participant plays in the realization of a proposition. Loosely speaking, one may think of quantitative value as reflecting a default expectation of how likely it is that a given element on some scale within a scalar model will yield a true proposition. Scalar models themselves constitute complex presuppositions about the way the world usually works, and the orderings of elements within a model simply reflect a default understanding of how those elements will contribute to the realization of a given situation type.
To conclude, polarity items are defined not just with respect to the contexts which license them, but also in terms of the roles they fulfill in a larger propositional context. Different propositional roles are associated with different scalar orderings. More precisely, the ordering for any given propositional role in a scalar model depends on the way that role affects the possibility of the proposition as a whole being true. While the simultaneous existence of both canonical and inverted polarity items shows that the relation between sensitivity and scalar semantics is more complicated than one might have thought, the intricate interaction between argument structure and quantitative value in the lexicalization of sensitive constructions itself confirms the central importance of scalar reasoning in the grammar of polarity.
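As a rough illustration of this conclusion, the sketch below derives a q-value from two pieces of information: the literal magnitude an item denotes, and whether a bigger participant in its role makes the event more likely. The class `EmphaticItem`, the field `facilitates`, and the functions `q_value` and `sensitivity` are hypothetical names introduced here, and the role assignments (Resources as impeding, Rewards and intervals for punctual events as facilitating) are one reading of the discussion above, not a formalization given in the text. Only emphatic items are covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmphaticItem:
    form: str
    magnitude: str     # "small" or "large": the size or amount literally denoted
    facilitates: bool  # True if a bigger participant in this role makes the event more likely


def q_value(item: EmphaticItem) -> str:
    """Position in the scalar model: the literal magnitude for impeding roles,
    the flipped magnitude for facilitating roles."""
    if not item.facilitates:
        return "low" if item.magnitude == "small" else "high"
    return "high" if item.magnitude == "small" else "low"


def sensitivity(item: EmphaticItem) -> str:
    """An emphatic item with a low q-value needs scale reversal (NPI);
    an emphatic item with a high q-value needs scale preservation (PPI)."""
    return "NPI" if q_value(item) == "low" else "PPI"


ITEMS = [
    EmphaticItem("a red cent", "small", facilitates=False),               # Resource
    EmphaticItem("a king's ransom", "large", facilitates=False),          # Resource
    EmphaticItem("for all the tea in China", "large", facilitates=True),  # Reward
    EmphaticItem("for peanuts", "small", facilitates=True),               # Reward
    EmphaticItem("in a million years", "large", facilitates=True),        # interval, punctual event
    EmphaticItem("in a jiffy", "small", facilitates=True),                # interval, punctual event
]

for item in ITEMS:
    print(f"{item.form}: q-value {q_value(item)}, {sensitivity(item)}")
```

On this reading, canonical and inverted items differ only in the `facilitates` flag; the rule that maps q-value to sensitivity is the same for both.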

5 The elements of sensitivity

5.1 The Informativity Hypothesis

This book began with a mystery. Polarity items seem like such peculiar constructions, their patterns of distribution so apparently unmotivated, one wonders why such forms should exist in any language. But polarity items might not be so mysterious after all. Thus far I have argued that polarity sensitivity is a regular function of the meanings of polarity items. The theory is that polarity items are scalar operators and that polarity sensitivity is a sensitivity to scalar inferencing. This is the “Scalar Model of Polarity,” or simply, the Scalar Model.

The key to the Scalar Model is the idea that polarity items are defined by their rhetorical functions, and particularly by the argumentative force which they contribute to the expression of a proposition. Certain polarity items are associated with the expression of emphatic propositions. Others are associated with the expression of attenuating propositions. The hypothesis is that these associations are not just incidental facts about the uses of polarity items, but essential facts about the nature of grammatical sensitivity. Polarity items are sensitive precisely because they are conventionally associated with these rhetorical functions. The Informativity Hypothesis makes two claims about the lexical semantics of polarity items:

i) Polarity items profile an element with a fixed q-value, either high or low in an ordered set of semantic alternatives.
ii) Polarity items are conventionally construed with a fixed i-value, either emphatic or attenuating with respect to their ordered alternatives.
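The way these two values jointly fix an item's sensitivity can be sketched as a small function. This is a minimal illustration, assuming the picture developed in Chapter 4: in a scale-preserving context a high q-value yields a proposition stronger than the scalar norm (emphatic) and a low q-value a weaker one (attenuating), while a scale-reversing context flips the two. The names `QValue`, `IValue`, `informativity`, and `predicted_sensitivity` are hypothetical.

```python
from enum import Enum


class QValue(Enum):
    HIGH = "high"
    LOW = "low"


class IValue(Enum):
    EMPHATIC = "emphatic"
    ATTENUATING = "attenuating"


def informativity(q: QValue, scale_reversing: bool) -> IValue:
    """Rhetorical force a given q-value yields in a given kind of context."""
    # In a scale-preserving context inferences run from high to low values,
    # so a high value is stronger than the norm; scale reversal flips this.
    stronger_than_norm = (q is QValue.HIGH) != scale_reversing
    return IValue.EMPHATIC if stronger_than_norm else IValue.ATTENUATING


def predicted_sensitivity(q: QValue, i: IValue) -> str:
    """An item is licensed only where the context delivers its conventional i-value."""
    if informativity(q, scale_reversing=False) is i:
        return "PPI"
    return "NPI"


print(predicted_sensitivity(QValue.LOW, IValue.EMPHATIC))     # e.g. a wink
print(predicted_sensitivity(QValue.HIGH, IValue.EMPHATIC))    # e.g. tons of
print(predicted_sensitivity(QValue.LOW, IValue.ATTENUATING))  # e.g. a smidge
print(predicted_sensitivity(QValue.HIGH, IValue.ATTENUATING))
```

The four combinations correspond to the four predicted types of polarity item; no combination is licensed in both kinds of context.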

It is the interaction of these properties which is held to cause polarity sensitivity. If this is correct, then all polarity items should exhibit both of these properties, and all constructions which exhibit them should be polarity sensitive. The strange case of the inverted polarity items (§4.6) shows that which values count as high or low within a scalar ordering depends in part on how those values are construed within a proposition. But with the inverted polarity items, even if they seem to have the wrong sorts of quantitative values, the values they have are at least clearly quantitative. The question now is, are all polarity items really scalar operators? And if so, is that really such a remarkable thing? In order to test the Informativity Hypothesis, we need some general way (or ways) of assessing when and to what degree particular constructions can be said to encode quantitative and informative values.

5.2 Quantitative semantics

What makes a construction susceptible to the grammaticalization of polarity sensitivity? This is as much a question about the nature of the lexicon as it is about syntax or semantics. Part of the problem is that forms with very similar meanings may differ in their sensitivities, which suggests that sensitivity is at least in part an arbitrary property of constructions. But the fact that something is not fully predictable does not make it unprincipled. In fact, a relatively small number of semantic domains account for the vast majority of polarity items both within English and cross-linguistically (I distinguish twenty-six such domains in the partial catalogue of English polarity items given in the Appendix, though the list is hardly exhaustive). My main claim here is that not only are all of these domains in some sense inherently scalar, but also that all inherently scalar semantic domains – all domains, that is, in which elements are distinguished by quantitative values – can and do support the formation of polarity sensitive constructions. Quantitative value seems like an easy notion, but it is also an easy notion to misunderstand. The most obvious confusion is to assume that in order to encode a q-value, a construction must somehow denote a quantity, but q-value is a feature of many expressions which are not so obviously quantitative in nature. Basically, a quantitative value is just a location in an ordering. In the most general sense, any construction which saliently evokes an entity construed as located on a conceptual scale encodes a q-value. The precise scalar location is typically not specified; more often, q-value includes a range of values relative to a scalar norm. 
For example, the gradable predicates tall, handsome, and polite profile vague regions on scales of height, beauty, and courtesy, respectively; all three, however, encode a high q-value since in each case the specified range is construed as relatively high with respect to some scalar norm (Sapir 1944; Klein 1998).

The idea that many polarity items – or even most – are scalar operators is hardly controversial. Minimizers, for example, clearly encode both quantitative and informative values: they are conventionally emphatic, and their emphatic force depends on their transparent expression of a minimal quantitative value in an appropriately scale-reversing context. And many other polarity items are just as transparently scalar in meaning: indeed, quantifiers and degree modifiers appear to be among the constructions most prone to polarity sensitivities in English (Bolinger 1972), Japanese (McGloin 1972), Dutch, and many other languages (Klein 1998; Hoeksema & Rullmann 2001). But to focus only on the most obviously scalar sorts of constructions would be to miss the bigger picture. In fact, one finds polarity items in abundance among a wide variety of modal verbs, conjunctions, and temporal and aspectual adverbs, all of which exhibit a demonstrably scalar semantics, even if their scalar properties are in some ways less glaringly obvious than those of a quantifier or a degree modifier. One reason why minimizers and degree modifiers may seem more obviously scalar than some other constructions is that they tend to feature, more or less transparently, one element which designates a gradable relation and another which profiles the extent to which that relation is instantiated: for example, in sleep a wink the verb sleep designates a process of variable duration while the measure NP a wink designates a minimal unit of duration. This clear division of labor between the expression of a gradable relation and the expression of a scalar extent effectively highlights the act of scalar construal in the compositional semantics of the construction as a whole. Many constructions, however, do not so neatly distinguish in the meanings of their parts the relation between a gradable relation and a scalar extent. 
More often a single word will simultaneously evoke a conceptual scale and designate a quantitative value on that scale. This is particularly clear with implicitly superlative adjectives like fantastic, marvelous, or wonderful (each of which can mean just ‘very good’), but it is true of many lexical constructions: verbs like crawl, amble, and run, for example, both designate a manner of bodily locomotion and saliently express something about the relative rates of motion involved; similarly, like, love, and adore denote emotional attitudes and saliently express something about those attitudes’ intensity. All these expressions simultaneously evoke a conceptual scale and highlight a range of values on that scale. Expressions like these are inherently scalar, but their inherent scalarity may be overshadowed by their broader lexical content.

Of course, almost any construction can be given a scalar construal. Even the most robustly non-scalar predicates sometimes combine with degree modifiers: a woman in her ninth month of pregnancy can be said to be very pregnant, and a man who has been shot, stabbed, chopped into pieces, and scattered at sea will certainly be very dead. Similarly, even an apparently innocent predicate like the reciprocal hold hands may allow a scalar construal in a sentence like all we did was hold hands! where it may contrast with other predicates – like kiss, make out, pet, fondle, etc. – representing different degrees of physical intimacy. Nonetheless, these expressions are not inherently scalar: their scalar construals depend on the contexts in which they are used and do not, as such, constitute an essential and indefeasible feature of their meanings. If the claim that polarity items are scalar operators is to have any empirical teeth, this sort of contingent scalarity must be excluded. For a construction to count as encoding a q-value, it must be inherently scalar – it must evoke some conceptual scale as a necessary and conventional aspect of its meaning. The question is, what exactly does this mean?

Within Cognitive Grammar (Langacker 1987, 1991) the semantic content of any linguistic expression (including expressions of any arbitrary complexity) involves the imposition of a semantic profile on some base of construal. The distinction can be thought of as a particular linguistic manifestation of the much more general phenomenon of figure/ground organization. Roughly, the profile of an expression E is the cognitive structure (an entity of any semantic sort) to which E conceptually refers: within the overall conceptualization evoked by E, the profile is that subpart which receives focal prominence. The base of an expression, then, is the semantic domain within which an entity is profiled. A standard example concerns the meaning of the word hypotenuse, where the conceptualization of a right triangle provides the base in which a particular line segment is profiled. 
Given this basic distinction, we may say that a construction counts as inherently scalar to the extent that its profiled content must be construed against a scalar base, that is an ordered set of alternatives. A construction thus encodes a q-value if and only if it includes a conceptual scale as (part of) its base, and profiles, or includes within its profile, an entity located on that scale. From this it follows that many constructions which can support a scalar construal nonetheless do not count as inherently scalar. Thus, while a word like dance contrasts with words like stand, jump, and slither, and while silk contrasts with cotton, wool, and satin, these oppositions are not scalar but absolute. Words like these evoke alternatives, but the alternatives are not ordinarily construed as ordered in any particular way. Not so with items like love, care for, mind, or matter, each of which profiles a kind of gradable relation which can only be construed against a range of more or less

intense alternatives: for these forms, the scale is an inalienable aspect of their meaning. Furthermore, while many expressions denote elements or relations in domains that are inherently scalar (e.g. ability, cost, reward, significance, comprehension, desire, etc.), many others are in fact inherently unscalar – either because their meaning precludes a construal in terms of alternative values, or because the alternatives they evoke are inherently unordered. The construal of a referring expression as definite, for example, cannot depend on the availability of an ordered set of alternatives. While it is possible to construe a specific individual against a set of scalar alternatives (e.g. even the dean was amused), the construal of an entity as fully individuated and specific cannot be a matter of degree: a proper noun like Glynda or George refers to a particular individual and not just to someone who is more specific or individuated than some set of alternatives. Their reference is absolute rather than scalar. Similarly, words denoting basic-level categories of all sorts – colors, shapes, artifacts, and natural kinds – are probably never inherently scalar, at least in their most basic senses. Such words do, of course, present their profiled content against a range of contrasting alternatives (i.e. something counts as red only to the extent that it is not blue, yellow, or orange), but their evoked alternatives are not ordinarily ordered in any particular way. Most constructions and most semantic domains are thus not inherently scalar. Probably any situation can be given a scalar construal, but few expressions actually encode such a construal as an indefeasible part of their meanings, even when one or more conceptual scales is salient in their encyclopedic semantics. Although one can talk loudly, quickly, fluently, angrily, or lovingly to different degrees, the verb talk itself is neutral with respect to all these parameters.
Similarly, the verbs eat and drink are neutral as to the quantity or quality of what is consumed, and the verbs buy, sell, charge, and pay say nothing about the quantity or value of what is exchanged. All of these expressions are defined largely in terms of the participant roles they evoke and the relations they profile between these roles, rather than in terms of any scalar contrast between one sort of participant and another. The claim that all polarity items are inherently scalar is thus by no means trivial. Indeed, given the diversity of forms and functions observed across polarity items, the fact that they appear to be drawn exclusively from scalar semantic domains seems highly significant. It also seems to fit with a parallel observation – one which is perhaps equally remarkable, though by and large much less remarked on – that certain sorts of constructions appear never to be polarity sensitive. Thus, it would be surprising to find a language in which the

word for a particular kind of fruit or insect or a variety of clothing could only be used in negative sentences. Such words can, of course, function as stereotypical minimal units in idiomatic NPIs like hurt a fly or care a fig, but where they do so, they are more like measure terms than the names of natural kinds.

The fact that many sorts of constructions are not scalar means that the Informativity Hypothesis could easily be falsified by a single example of a clearly non-scalar polarity sensitive item. Although it has often been claimed that such items do exist (e.g. Linebarger 1980, 1987; van der Wal 1997; Szabolcsi 2002, 2004), the sorts of forms most commonly cited in support of this claim – e.g. modal NPIs like English auxiliary need, Dutch hoeven, German brauchen, and French être besoin de; phasal adverbs like yet and already; and basic expressions of disjunction and conjunction in Hungarian and Japanese – actually support the Scalar Model. All of these constructions come from semantic domains which are rather trivially scalar in nature, and each of these domains actually hosts a variety of polarity sensitive constructions of precisely the four sorts predicted by the Scalar Model.

Ultimately it will be impossible to prove either that all polarity items are scalar operators or that all inherently scalar domains give rise to polarity sensitive constructions. Still, either of these propositions could in principle easily be falsified: the fact that no such falsification appears to be at hand stands in favor of the Scalar Model.

5.3 The pragmatics of informativity

Informativity seems like a rather mysterious property for a construction. Why should speakers need to mark the informative value of the propositions they express? In context, every sentence is just as informative as it is: any extra signaling of informativity would seem, at best, needlessly redundant. A sentence like Kristen was not the least bit impressed counts as emphatic because under negation the minimal q-value of the least bit entails all higher values on the conceptual scale associated with the predicate impressed. The fact that the least bit also encodes a high informative value does not, in itself, make the sentence any more informative. Paradoxically, informative value appears to be a particularly uninformative property. Informativity is fundamentally an expression of speaker affect. In producing an emphatic or an attenuating utterance, a speaker does more than simply assert a proposition – she expresses an attitude toward that proposition, toward its significance and its informative strength. Most importantly, she expresses an attitude toward her audience, saying “you are the sort of person with whom

I would express myself in this way.” Emphasis and attenuation are, in essence, rhetorical strategies for the presentation of self in discourse. Viewed in this light, the notion of informativity as a linguistic property may not seem so mysterious after all. In this section, I argue that informativity in general, and the linguistic encoding of informative value in particular, reflect general strategies for the negotiation of social interaction. They are, in effect, devices for the expression of speaker involvement. The notion of involvement has a rich and rather heterogeneous history of applications in the study of emotive meaning (Caffi & Janney 1994: 343–8), but for our purposes, the basic idea may be usefully cast in terms of a general theory of politeness. Roughly, understatement and attenuation serve to give a hearer more options in responding to a speech act. As such, they may be seen as expressions of a speaker’s detachment and deference to the hearer. Emphasis, on the other hand, is a sign of intensity and speaker involvement. As such, it serves as a marker of camaraderie and solidarity with the hearer. There is more to communication than just the efficient exchange of information. Social interaction in general, and communicative interaction in particular, are hazardous undertakings. There is almost always more to worry about than just making oneself understood. One must consider the feelings of others and the potential danger to one’s own feelings in any social encounter. Broadly construed, such considerations are aspects of politeness. As many have noted (Lakoff 1973; Brown & Levinson 1978; Leech 1980, 1983; Hübler 1983; Geis 1995), the expression of politeness has important consequences for the ways language is used and, ultimately, for the structure of language itself. Brown and Levinson (1978) conceive of politeness in terms of the work undertaken by a speaker and a hearer to maintain each other’s face.
As defined by Goffman, face is “the positive social value a person effectively claims for himself … an image of self, delineated in terms of approved social attributes” (1967: 5). For Brown and Levinson, face consists specifically of “two particular wants – roughly, the want to be unimpeded and the want to be approved of in certain respects” (1978: 63): the first of these they call “negative face,” the second “positive face.” Politeness, in this conception, involves the strategies a speaker may use to satisfy an addressee’s face wants. In general, it is in the speaker’s interest to provide such satisfaction since, all things being equal, this is the best way to ensure that the hearer will in turn work to satisfy her face wants. The basic face wants define two basic strategies for the expression of politeness. According to Brown and Levinson, positive politeness consists in the strategies and devices a speaker may use to reassure the hearer that his wants, and more generally his positive conception of himself, are desirable to the speaker

(1978: 106). Basically, a speaker does this by expressing her interest and approval for her hearer, and by generally observing Lakoff’s third politeness maxim, “Make [H] feel good – be friendly” (1973: 298). While positive politeness emphasizes a speaker’s solidarity with the hearer, negative politeness focuses on the need to show deference. Negative politeness consists in the strategies a speaker may use to satisfy a hearer’s desire that his actions should be unimpeded (Brown & Levinson 1978: 134) and corresponds roughly to Lakoff’s first two maxims of politeness: “Give options” and “Don’t impose” (1973: 298). For the most part, strategies of negative politeness are a matter of avoidance. They seek to avoid the negative consequences to a hearer’s face which might arise from some action of the speaker’s. As such, they are tailored to mitigate the imposition associated with specific sorts of face-threatening acts. The prototypical strategy of negative politeness is perhaps the indirect speech act, in which a speaker seeks to avoid imposing by superficially disguising a potentially face-threatening speech act as something more innocent. Thus, as in Gordon and Lakoff’s (1971) example, a speaker may avoid expressing a potentially face-threatening opinion that it’s silly to paint one’s house purple by asking the apparently innocent question Why are you painting your house purple?

Emphasis and attenuation are usefully understood as strategies of positive and negative politeness, respectively. Indeed, Brown and Levinson all but explicitly list them as such. The positive politeness strategy which exhorts a speaker to “exaggerate (interest, approval, sympathy with H)” (1978: 109) depends on a judicious use of emphasis. More generally, the strategy “Intensify interest to H” requires a speaker to make her contributions as vivid and intense as possible in order to convey the pleasure of “a good story” (1978: 110).
Similarly, understatement and attenuation figure prominently among the strategies Brown and Levinson list for negative politeness. In general, the surest way to minimize the threat associated with any given speech act is to minimize the expressed content of that act. Thus, instead of asking for a piece of cake, one might ask for just a taste, even if a piece is really what one wants; or when seeking the privilege of taking up somebody’s time, one might innocently ask Do you have a minute?, even if what one really wants is much more. Not surprisingly, then, many of the forms which Brown and Levinson list as useful for the mitigation of face threats are in fact attenuating PPIs: among them a sip, a taste, a smidgen, and a little bit (1978: 182). It might be rash to reduce the notions of emphasis and attenuation to simple politeness strategies. At best, such a move would seem to underestimate the full extent of their usefulness. Emphasis is not just a matter of solidarity: emphatic utterances can convey a sense of urgency (e.g. Don’t waste a second!), and

they can express insults as easily as intimacy (e.g. You wouldn’t know your ass from a hole in the wall). Similarly, the low informativity of an attenuated construction is just as useful for protecting a speaker’s positive face as it is for protecting a hearer’s negative face: hedging, for instance, as Brown and Levinson themselves point out (1978: 151), may serve the selfish function of protecting a speaker from criminal liability when giving testimony in a court of law. Clearly, the rhetorical motivations for emphasis and attenuation extend beyond their usefulness in politely protecting the face wants of others. For this reason, it might help to think of informativity in general as a way of expressing interpersonal involvement with an audience or even just with an utterance. The point is that emphasis and attenuation are useful forms of expression, and their usefulness is systematically related to the ways in which they are (or are not) informative. It is this general usefulness which makes informative value a natural sort of lexical semantic feature, and which motivates its conventional association with particular constructions. Beyond politeness, the pragmatics of informativity plays a fundamental role in the generation of non-logical inferences in the interpretation of texts and conversational interaction. In this respect, emphasis and attenuation basically reflect different strategies a speaker might follow in the formulation of an utterance based on the different strategies a hearer might use in interpreting it. Horn (1984, 1989) develops a model of non-logical inference based on two antithetical principles of conversational interaction.
As Horn conceives them, these principles reflect a general principle of least effort governing all linguistic interactions: from the hearer’s perspective, least effort requires that utterances should be as easy as possible to understand; from the speaker’s perspective, least effort requires that utterances should be as easy as possible to produce. As Horn notes, this general tension between what is easy for the speaker and what is easy for the hearer has deep roots in the linguistic literature, going back to the work of Hermann Paul, André Martinet, and George Zipf, and to Atlas and Levinson (1981). In Horn’s (1989: 194) formulation, the basic tension is cashed out in terms of two principles of conversational inference: the Q and R Principles (with Q and R alluding to Grice’s Maxims of Quantity and Relation, respectively):

Q Principle (Hearer Economy): Make your contribution SUFFICIENT: Say as much as you can.
R Principle (Speaker Economy): Make your contribution NECESSARY: Say no more than you must.

Basically, the Q Principle ensures that a hearer will be given enough information to understand a speaker’s meaning, while the R Principle ensures that the expression of this information will be as simple as possible.1 From the hearer’s point of view, the two principles define distinct strategies for the interpretation of any utterance. Basically, if one assumes that the Q Principle is in effect, it follows that the speaker has given as much information as she could. Consequently, a hearer can infer that any stronger utterance the speaker might have made would either be false, or at least would not be something the speaker could confidently vouch for. This, of course, is the logic which underlies scalar implicature. On the other hand, if one assumes that the R Principle is in effect, it follows that the speaker may in fact mean more than she says, and a hearer may infer that the explicit, expressed content of an utterance does not exhaust what the speaker hopes to convey. This is the logic which underlies irony, allusion, understatement, and indirectness of all kinds.

Given Horn’s formulation here, it seems natural to treat these as principles not just for the pragmatic interpretation of an utterance, but actually as defining two distinct strategies for communicative interaction, roughly analogous to the strategies of positive and negative politeness outlined above. The Q Principle thus encourages a speaker to produce the strongest and most informative contribution she can honestly maintain, and so favors the formulation of emphatic utterances. The R Principle, on the other hand, encourages a speaker to say as little as she can without compromising clarity, and as such, it favors strategies of understatement and attenuation.
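The Q-based reasoning described above can be made concrete with a small sketch. It is only an illustration, not part of Horn's formulation: the quantifier scale and the function name `q_implicatures` are assumptions introduced here, and the sketch covers only the Q-based step (inferring that stronger alternatives are not being vouched for), not R-based enrichment.

```python
from typing import List

# Assumed example scale, ordered from weakest to strongest.
# It is a textbook-style illustration, not a scale taken from this chapter.
QUANTIFIER_SCALE: List[str] = ["some", "many", "most", "all"]


def q_implicatures(asserted: str, scale: List[str]) -> List[str]:
    """Return the stronger alternatives a hearer may take the speaker
    NOT to be vouching for, assuming the Q Principle is in effect.

    If the speaker said as much as she could, every value above the
    asserted one is inferred to be false or at least not known to hold.
    """
    position = scale.index(asserted)
    return scale[position + 1:]


if __name__ == "__main__":
    # Asserting "some" invites the inference "not many / most / all".
    print(q_implicatures("some", QUANTIFIER_SCALE))
    # Asserting the top of the scale leaves nothing stronger to exclude.
    print(q_implicatures("all", QUANTIFIER_SCALE))
```

On this toy model an emphatic utterance sits at the strong end of its scale and leaves few or no stronger alternatives to exclude, while an attenuated one sits lower and leaves more for the hearer to work out.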
Of course, in Horn’s formulation, the R Principle serves primarily as a constraint on the form of an utterance rather than on its inherent informativity: if the point is to do no more work than is absolutely necessary, the important thing would seem to be to minimize the amount one utters, not the amount one actually communicates. But communication itself can be a risky business. The more information one conveys with one’s utterance, the more one imposes on the credulity of one’s audience, and the more one risks being exposed to disagreement and rejection. It thus seems reasonable to view the principle of Speaker Economy in broad terms as a force which militates not just against excess articulatory effort but more generally against taking any excessive communicative risks. These considerations may help clarify why informative value can be, and often is, a meaningful part of what a speaker is doing in an utterance. As suggested above, informativity is essentially an index of speaker involvement, but it is also a way of involving an audience in the act of communication. The use of low i-value, informatively attenuated utterances puts a light touch on

communication – they give the hearer options, protect his negative face wants, and afford him the pleasure of working out for himself the full significance of what is said. High i-value, emphatic utterances, on the other hand, leave less to chance in the interpretive process, but they also leave a hearer with fewer options for responding. Ultimately, the fact that linguistic constructions can be conventionally tied one way or the other to the expression of emphasis or attenuation allows speakers to signal their attitude toward their own speech acts, and allows hearers a simple way of knowing where they stand. Figures like emphasis and attenuation turn out to be useful in a variety of rhetorical contexts, and through frequent use, they may come to be conventionally associated with particular lexical items or expressions. I have stressed here the role of politeness in explaining this usefulness because it illustrates just how important, and how pervasive, such rhetorical properties can be in actual usage. At the heart of almost any canonical social encounter, there is the desire to make sure that the interaction will come off well, and that all participants (speaker, hearer, ratified and even non-ratified audience) will leave with their feelings and their face intact. For this reason, it is not only useful, but in fact critical that speakers should have at their disposal the linguistic devices they need to express their good intentions. The conventional encoding of properties like emphasis and attenuation provides a speaker with just such devices, allowing her to unambiguously signal the rhetorical nature of her conversational moves. Seen in this light, it would in fact be strange if languages failed to encode such properties.
The conventionalization of informative value as a property of particular lexical items is consistent with, and indeed exemplary of, the general tendency noted by Traugott for “meanings to become increasingly situated in a speaker’s subjective … attitude” toward what is said (1988: 411). In general, i-values get associated with particular lexical items because the rhetorical properties of emphasis and attenuation are salient features of the utterances in which they occur. In essence, informativity is a property of sentences used in context. Emphatic sentences convey more or somehow make a stronger claim than might have been expected; attenuating sentences say less or make a weaker claim than might have been expected. I-value, the sentential property, becomes a feature of lexical semantics when particular words or constructions are regularly used in emphatic or attenuating contexts. If a given form occurs frequently and systematically in such contexts, the rhetorical properties of the utterance as a whole may come to be associated with the use of the form itself. This sort of metonymy is in fact a common source of semantic change. A typical example is the tendency for connectives expressing temporal overlap

to develop concessive meanings, as with English while, still, and yet (Traugott & König 1991: 199): often the point of saying that two things occur together is to draw attention to their normal incompatibility (e.g. She’s seven and she’s studying modal logic), and so this notion of controverted expectation may become associated with a marker of simultaneity. In general, if the use of a word frequently involves the expression of an attitude of some sort, that attitude may become an important part of the word’s meaning.

How this actually happens with polarity items is a complex problem, but the idea that constructions can over time acquire an informative value, and so gradually become polarity sensitive, explains one fundamental mystery about polarity items, which is why different constructions with similar or even identical referential properties often exhibit very different sensitivities. In general, it seems that lexical semantics plays an important role in determining what sorts of constructions can become polarity sensitive (Hoeksema 1994, 1998; van der Wal 1997; van der Wouden 1996a, b; 1997). The role it plays, however, is far from determinative. Hoeksema (1994), for example, mentions two Dutch verbs, klikken and boteren, both of which idiomatically mean ‘to get along/be compatible,’ but which differ wildly in their affinity for negation: while boteren occurs with negation 98 percent of the time, klikken does so only 40 percent of the time. Similarly, Hoeksema offers corpus data on English verbs of indifference showing that the verb mind, as in (1), occurs in affirmative contexts just 1 percent of the time (with n=341), while the verb care, as in (2), does so 20 percent of the time (with n=792).

(1) a. I really don’t mind waiting.
    b. Would you mind waiting a little while?
    c. I don’t mind waiting, but I will mind if you don’t show up.

(2) a. I don’t care for kippered herring.
    b. Would you care for some kippered herring?
    c. I don’t care for kippered herring, but I do care for you!

As the affirmative (c) examples here suggest, these forms are in fact more likely to occur in affirmative contexts where they function echoically in the rejection of a contextually relevant negative proposition. Still, the fact that they occur in such contexts at all qualifies them in Hoeksema’s estimation as quasi-polarity items rather than true NPIs.

Hoeksema suggests that such quasi-polarity items may become true polarity items by a process of grammaticalization. He notes that many NPIs exhibit three major properties typical of grammaticalized constructions: they are semantically bleached relative to their lexical counterparts, like wink in sleep a wink or finger in lift a finger; they encode relatively subjective meanings, for example indexing a speaker’s attitude toward what is said; and they exhibit constructional layering, with one form having several uses, some of which are sensitive while others are not. Indeed, even with the most lexical of sensitive constructions, it makes sense to view the property of sensitivity itself as essentially grammatical. Typical examples of grammaticalization involve a loss of lexical independence and a fusing of elements across word boundaries – as in the evolution of tense markers from periphrastic verbal constructions, agreement markers from pronouns, or case markers from erstwhile adpositions. And this is precisely what happens in the development of NPIs, as constructions become increasingly dependent on and fused with the expression of negation.

But grammaticalization in general is not just a matter of syntactic erosion and semantic bleaching; typically, it also involves a process of pragmatic strengthening and subjectification (Sweetser 1988; Traugott 1988; Traugott & Dasher 2002). This is where the Informativity Hypothesis seems particularly helpful, as it offers some insight into what it is that allows some constructions to show up first disproportionately and then exclusively in contexts of one polarity or another. Verbs like care, mind, matter, and bother can grammaticize as polarity items because they profile relations in domains – like desire, displeasure, significance, and effort – which are themselves intrinsically scalar. The degree to which such forms count as true polarity items depends on the degree to which they are conventionally associated with a particular informative value: in as much as verbs like care and mind can still function in neutral or emphatic contexts, their association with an attenuating i-value remains more a pragmatic preference than a strict semantic requirement. The attenuating force of such forms is thus essentially a defeasible conversational implicature which may over time grow more difficult to override, until it becomes an indissociable feature of a form’s conventional meaning.

5.4 Assessing informativity

If polarity items really are conventionally associated with rhetorical functions like emphasis and attenuation, these associations should have consequences for their distributions. For a polarity item to be felicitous, it must felicitously express its informative value in an appropriately attenuating or emphatic way. The right scalar inferences alone will not license an item, if its conventional i-value is inconsistent with the rhetorical force of its use. Where such clashes arise, the effects may range from mild semantic anomaly to outright ungrammaticality. This section briefly reviews a variety of constructions which bias an expressed proposition either toward an emphatic or an attenuating rhetorical force, and which may therefore be used to assess the inherent i-value of the constructions with which they combine.

5.4.1 Diagnostics of emphasis

Literally. The English adverb literally is widely used as an expression of emphasis, indicating that an utterance is to be taken in the strongest possible sense (Powell 1992; Israel 2002). Emphatic literally is compatible with emphatic polarity items like sleep a wink and scads but is ungrammatical with attenuators like much or a little bit in its focus.

(3) Margo literally didn’t sleep {a wink / *much} before her big test.
(4) Belinda literally won {scads / *a little bit} of money playing blackjack.

Similarly, in (5–6), where the emphatics sleep a wink and a ton can be felicitously introduced by a breathless You’ll never believe it!, the attenuators much and a couple of sound awkwardly off.

(5) You’ll never believe it! I didn’t sleep {a wink / ?much} last week.
(6) You’ll never believe it! Belinda won {a ton of money / ?a couple of dollars} at the races.

The anomalies here are mitigated when the attenuators occur without focal stress, which suggests that the anomaly arises from the use of these constructions as expressions of emphatic or controversial new information.

Absolute modification. In general, emphatic polarity items can occur in the scope of degree modifiers like absolute and absolutely. Attenuating polarity items cannot.

(7) Gilda absolutely would not {lift a finger to help / *help much}.
(8) They’ve been in there for absolutely {hours / *a while}.
(9) She brought an absolute {ton / *touch} of energy to the performance.
(10) a. I absolutely do not give a damn what you say.
     b. *I absolutely do not care to comment.

Modification by even. In general, emphatic polarity items can occur in the focus of even. Most attenuating polarity items are ungrammatical in this context.

(11) a. Laura didn’t give even so much as a word of explanation.
     b. Even wild horses couldn’t get me to write another dissertation.
     c. She didn’t even bother to return his desperate phone calls.

(12) a. ??Brandon didn’t even eat much.
     b. ??Are you even entirely sure that’s wise?
     c. ??Huguette doesn’t even care for asparagus.
        cf. Huguette doesn’t even like asparagus (let alone love it).

Focal prominence. In general, emphatic polarity items allow, and even welcome, emphatic focus. Because these forms serve to signal the unusual significance of an expressed proposition, they naturally tend to serve as the informational and intonational focus of a sentence. Attenuating polarity items, on the other hand, usually prefer not to draw attention to themselves and so may be awkward where they occur with focal stress, as in (13–14).

(13) The news left me {thoroughly / ??somewhat} confused.
(14) Lily didn’t {lift a finger / ??make much effort} to help us.

The anomaly here reflects the incoherence of emphatically strengthening an element whose very purpose is to weaken an expressed proposition. But attenuators can occur with focal stress where the stress is used contrastively, for example to attenuate a prior assertion, as in (15–16).

(15) That novel confused me. At least, it sort of confused me.
(16) Jude didn’t help with the moving. At least, he didn’t help much.

In such cases, the use of stress is basically metalinguistic rather than truly emphatic: the point is not to strengthen an expressed proposition, but just to highlight the way the proposition is expressed, and so it is not inconsistent with the expression of an attenuated proposition. The point is, while emphatic forms welcome, and even crave, focal prominence, attenuating forms tolerate it only under specific discourse conditions. Horn scales. The distinction between the emphatic and attenuating sentences is nicely illustrated in the syntactic tests developed by Horn (1972, 1989) to define quantitative scales. These tests establish paradigmatic relations between forms ranged on a scale. Thus, the connective construction or at least, as in (17), links two clauses, the first of which must be stronger than the second, while in fact in (18) links clauses where the second is stronger than the first. Negated restrictive particles (like not only/just), with or without a following adversative (like but also/actually) work similarly, as in (19), requiring their second conjunct to be stronger than the first. (17)

a. Margo didn’t sleep a wink, or at least she didn’t sleep much. b. *Margo didn’t sleep much, or at least she didn’t sleep a wink.

(18)

a. Jerry didn’t help much, in fact he didn’t lift a finger. b. *Jerry didn’t lift a finger, in fact he didn’t help much.

(19)

a. Belinda didn’t just win a little bit of money, she won scads. b. *Belinda didn’t just win scads of money, she won a little bit.

The contrasts here clearly show that emphatic and attenuating polarity items differ precisely in terms of their informative strength, and that these differences have real distributional consequences.

5.4.2 Diagnostics of attenuation

At least. Attenuating polarity items can occur in the focus of at least, while emphatic polarity items are comparatively awkward in this context.

(20) The evening was a disaster, but at least we didn’t spend {all that much money / *a red cent}.
(21) Her taste is expensive, but at least it’s not {too / *the least bit} garish.
(22) She is a cruel woman, but at least she’s {sort of / *awfully} cute.
(23) It’s very baroque, but at least {I’m beginning to / *I thoroughly} understand it.

Hedged concession. Attenuating polarity items can be used to qualify a concession. Given a strong expectation of some sort, and confronted with evidence that the facts do not justify this expectation, a hedged concession allows one to acknowledge the evidence without abandoning the more general expectation. (24)

a. Well, I guess he’s not here yet, (but I still think he will come). b. Well, I guess he didn’t get all that drunk, (but I still think he drank too much). c. Well, I guess she has some interest in tax policy, (but I still think she’d rather go dancing). d. Well, I guess the movie was just a tad pornographic, (but I still think that it was tastefully done).

Anti-concessives. Attenuating polarity items, but not emphatic polarity items, may be used to qualify a concessive construction in order to reestablish an argumentative conclusion. I call such uses “anti-concessive” because while the first clause superficially concedes a point, the second effectively denies that it matters. The formula I might be stupid, but I’m not that stupid is a classic illustration. Because anti-concessives necessarily express a qualified judgment rather than an absolute one, they are an ideal environment for attenuators but are inhospitable to emphatic constructions. The examples below thus sound fine with attenuators like the NPI much and the PPIs a good bit and fairly, but are basically incoherent with the emphatic counterparts of these constructions, the NPI at all and the PPIs a ton and insanely.

(25) He may have danced, but he didn’t dance {much / *at all}.
(26) She might not be rich, but she does have a {good bit / *ton} of money.
(27) She may not be brilliant, but she is {fairly / *insanely} clever.

This test does not work unless the concessive and the anti-concessive clauses involve values on the same scale. Paired scalar constructions, as below, allow emphatics in the anti-concessive.

(28) She may be beautiful, but she can be awfully selfish.
(29) He may have showed up, but he didn’t say a word to either of us.

The patterns of acceptability in these sentences lend support to the claim that polarity items are divided between emphatic and attenuating forms.

5.5 Rhetorical coherence in polarity contexts

If i-value in general is a rhetorical property of constructions, then forms which encode i-values should be sensitive to the rhetorical properties of any potential licensor. Certain polarity contexts, for example, seem by their very nature to express an emphatic proposition, and these contexts, therefore, should be uncomfortable with attenuating polarity items. Consider (30). (30)

Jasmine kept pestering the coach long after a. she had a hope in hell of getting on the team. b. ?she had much hope of getting on the team.

The use of long after here depends on a felt contrast between the intensity of Jasmine’s efforts and the likelihood of her success: the minimizer a hope in hell reinforces this contrast, while the attenuating much undermines it. The construction as a whole, particularly with the modifier long, sets up an expectation for something emphatically surprising, which effectively blocks the use of an attenuating construction. Comparative constructions in general are similarly inhospitable to attenuators, since typically they serve to emphasize the contrast between compared propositions, and the use of an attenuator may undermine this contrast. At least, this is what seems to go wrong with the use of much in (31). (31) I’d rather be trapped in an elevator with a lecherous Martian than spend {a minute / ??much time} with that Murray.

The comparison here serves to emphasize the speaker’s distaste for Murray: the minimizer a minute effectively reinforces this judgment, while the weaker much diminishes it. Similar considerations apply below, where the attenuators all that much and terribly lead to rhetorical anomaly, if not outright ungrammaticality. (32) Jasmine is more likely to chase chimpanzees through the forest than she is to study {at all / *all that much}.

(33) Taylor visits the moon more often than she gets {the least bit / *terribly} excited about her work.

The significance of i-value is also evident in its effects on the uses of polarity items in questions. As noted above, emphatic polarity items tend to bias a question toward either a negative or a positive response (Lakoff 1969; Borkin 1971; Hinds 1974; Linebarger 1980; Guerzoni 2004). But, as the examples below suggest, attenuating polarity items are more open-ended and less likely to introduce bias one way or another. (34)

a. Did you eat a bite of the cake? (biased: expects a negative response) b. Did you eat much of the cake? (unbiased)

(35)

a. Wasn’t she awfully pretty? (biased: expects a positive response) b. Wasn’t she sorta pretty? (less biased)

In (35) the negated form of the question itself signals the expectation of a positive response; but the choice between awfully and sorta is significant nonetheless. The emphatic (a) sentence is hardly even a question so much as an expression of amazement coupled with a request for agreement. The (b) sentence, with its tentative sorta, leaves room for disagreement and some latitude concerning the degree of expected pulchritude.

Rhetorical questions constitute a sort of indirect speech act in which a speaker, by superficially and insincerely requesting information, actually conveys a very definite opinion. Normally, if a question is a sincere request for information, the speaker will not want to excessively prejudice any possible response; however, that is exactly what an emphatic polarity item will do. By posing a question with reference to an extreme value, the speaker renders one possible response extremely informative and the other extremely uninformative: if the answer to (34a) is “no,” we learn precisely how much cake was eaten (none); if it’s “yes,” we know only that at least the smallest amount possible was eaten. Such a prejudicial posing generates the implicature that the speaker in fact has a very definite idea about the answer, and so the question is rhetorical. The attenuating polarity items, on the other hand, allow more room for negotiation, and so can be used to form simple information questions.

5.6 Compositional sensitivities

Thus far I have argued that q-value and i-value are jointly responsible for a wide array of polarity effects. But if these really are independent lexical semantic features, they should occur independently of each other in constructions which are in themselves not polarity sensitive but which are systematically sensitive in combination with other scalar expressions. Obviously, there are many constructions which encode a q-value of some sort but are not polarity sensitive. Sometimes, indeed, a single q-value can be conventionally associated with distinct constructions which differ in their sensitivities. The degree modifiers below are a case in point. All encode low q-values, but they vary with respect to i-value: only a bit, unlike its near synonyms the least bit (NPI) and a tad (PPI), can occur equally in emphatic and attenuating contexts. (36)

a. Harry is a bit overweight. b. Harry is a tad overweight. c. *Harry is the least bit overweight.

(37)

a. Harry isn’t a bit overweight. b. *Harry isn’t a tad overweight. c. Harry isn’t the least bit overweight.

The positive sentences in (36) all make weak claims and so can function only as understatements or hedged assertions: the emphatic NPI the least bit cannot be accommodated. In (37), where the same q-value yields a strong scalar claim, the sentences can only count as emphatic denials: here, the attenuating PPI a tad is ruled out. But the versatile a bit is fine in both situations. This shows that quantitative value alone does not determine a form’s sensitivity. A similar contrast is found among the high q-value constructions in (38–39), where the insensitive item very contrasts with the PPI awfully and the NPI all that. Here in (39) negation produces a set of contrary propositions, unlike the contradictory ones in (37). (38)

a. Lewis is very clever. b. Lewis is awfully clever. c. *Lewis is all that clever.

(39)

a. Lewis isn’t very clever. b. *Lewis isn’t awfully clever. c. Lewis isn’t all that clever.

In (38a), very marks a high degree of cleverness in an emphatic assertion; in (39a), very marks a high degree of cleverness in a hedged denial. The (b) and (c) sentences show that awfully and all that, while notionally similar to very, are not so flexible. The notion of i-value provides a simple explanation: forms specified for a particular i-value are limited to contexts supporting that value; forms not so specified are free to occur in emphatic, attenuating, or neutral contexts. Forms like a bit and very, while sharing a q-value with their apparent synonyms, differ in that they do not encode a conventional i-value. They are therefore not polarity sensitive and their distributions are consequently less constrained. At this point one may object that the argument has turned circular.2 While I’ve claimed that polarity sensitivity is predictable on the basis of i-value and q-value, it seems that in (38–39) the determination of i-value itself depends on a form’s polarity sensitive behavior. If there were no other evidence than this for informative value, it would be just a clever diacritic. But as we have seen, i-value does have significant grammatical reflexes. The point is that i-value turns out to be independent from q-value. More generally, i-value cannot be predicted from lexical semantics because i-value is itself a part of lexical semantics, and so its association with any given form is arbitrary. In this respect, i-value is no different from any other lexical semantic feature. Still, it is worth pointing out that the situation here is at least somewhat more complex. The behavior of degree words in polarity contexts clearly demonstrates that there is more to their meanings than the simple specification of a quantitative value. But as it turns out, there is also more to informative value than these examples might suggest. When one compares the behavior of the insensitive a bit with the superficially similar a little strange things happen (Bolinger 1972; Horn 1989: 401). (40)

a. I’m {a bit / a little} worried about the situation. b. I’m not {a bit / a little} worried about the situation.

While both constructions are similarly attenuating in positive contexts, in the black light of negation (Horn 2000a: 147) their true characters come out: a bit in (40b) contributes to an emphatic denial of concern, while a little only denies that the concern is negligible and implicates that it is actually considerable. The effect of not a little here illustrates the classic figures of litotes and understatement: it is litotic in its application of negation to a low-scalar value, and it is truly understating (Israel 2006) in its effective expression of an obliquely emphatic positive proposition. If i-value is a fully functioning lexical-semantic feature, we should be able to find it at work in constructions with or without any accompanying q-value; and where it occurs without any co-encoded q-value, presumably it will not be polarity sensitive either. The obvious example is the focus particle even. Even is not polarity sensitive, occurring freely in both negative and affirmative sentences, and even is not linked to any fixed q-value, since both low- and high-scalar expressions can occur in its focus. But even is sensitive to the interaction of polarity with the scalar semantics of its focus. While both even the lowest and even the highest are perfectly well-formed phrases, generally only one of the two can occur in any given context, and which one that is depends on the context’s polarity. (41)

a. Dolly can jump over even the highest fence. b. #Dolly can jump over even the lowest fence.

(42)

a. Dolly can’t jump over even the lowest fence. b. #Dolly can’t jump over even the highest fence.

These expressions are not generally polarity sensitive; rather they are sensitive only with respect to a given propositional context. Thus, a change of predicate as in (43) reverses the pattern of acceptability. (43)

a. Dolly has trouble with even the lowest fence. b. #Dolly has trouble with even the highest fence.

Superlatives like those in the (a) sentences here are tantamount to universal quantifiers. As such, these sentences represent remarkable claims and so welcome even as a marker of their unusual informativity; but replacing these superlatives with their polar contraries, as in the (b) sentences, renders the claims trivial and makes even sound bizarre. Because even is not itself tied to any particular point on a scale, it can occur in both positive and negative sentences and still retain its emphatic force; but to do so, its focus must encode a q-value which fits the polarity of the sentence. Concessive conditionals are subject to a similar effect (Sweetser 1990: 134). Either a positive or a negative apodosis may allow a concessive, even if, reading in (44). (44)

a. Dolly wouldn’t marry you (even) if you were the last man on earth. b. Dolly would marry you (even) if you were a monster from Mars.

But given normal background assumptions about marriage and attractiveness, reversing the polarity of these examples, as in (45), blocks the concessive readings. (45)

a. Dolly would marry you (*even) if you were the last man on earth. b. Dolly wouldn’t marry you (*even) if you were a monster from Mars.

The behavior of even, very, and a bit has significant implications for a theory of polarity. First, quantitative value and informative value are autonomous lexical features. Either one can be conventionally associated with a lexical item independently of the other: even encodes an emphatic i-value but is neutral as to q-value; very and a bit encode high and low q-values, respectively, but are unspecified for i-value. And the behavior of these three forms as a group demonstrates that the three parameters relevant to polarity sensitivity – polarity, q-value, and i-value – all interact independently of polarity items themselves. Most importantly, the systematic interaction of these features shows that the grammar of polarity sensitivity itself involves a regular process of semantic composition. This point is brought home by the fact that i-value and q-value produce the same grammatical effects whether they co-occur within a single lexical item (e.g. in polarity items like the least bit), or come together as the result of syntactic composition, as in (40–44). Thus, although neither even nor a bit is strictly speaking polarity sensitive, their conventional scalar semantics ensures a regular interaction with polarity contexts. In (41–43), where even expresses an emphatic i-value, a change in sentence polarity necessitates a change in the q-value of the focus. Similarly, in (36–40), while a bit and very mark a constant q-value, a change in polarity brings with it a change in the sentence’s i-value. The implications for a theory of sensitivity are simple and profound. If an expression is such that it conventionally holds constant both quantitative and informative value, that expression will be acceptable only in contexts where both its quantitative and informative values can be compatibly expressed. Polarity items are scalar operators and polarity sensitivity is a sensitivity to scalar inferencing.
The account developed here has three major virtues: first, by recognizing polarity items in general as a semantically coherent class of expressions, it explains their distributions directly in terms of the meanings they encode; second, the proposed classification provides a unified account for a wide range of both NPIs and PPIs; and finally, by distinguishing emphatic from attenuating polarity items, the account provides the beginning of a principled explanation for distributional differences between two broad classes of polarity item.
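
The compositional claim summarized here can be made concrete with a short sketch in Python. It is an illustration only, not a formal statement of the account: the names `QValue`, `IValue`, `sentence_i_value`, and `is_acceptable` are hypothetical, and the sketch assumes a deliberately coarse picture in which q-values are simply low or high and a context is simply scale-preserving or scale-reversing.

```python
from enum import Enum
from typing import Optional


class QValue(Enum):
    LOW = "low"
    HIGH = "high"


class IValue(Enum):
    EMPHATIC = "emphatic"
    ATTENUATING = "attenuating"


def sentence_i_value(q: QValue, scale_reversing: bool) -> IValue:
    """Informative value that a scalar form yields in a given context.

    Assumption of this sketch: a high q-value gives a strong claim in a
    scale-preserving context and a weak claim in a scale-reversing one;
    a low q-value does the opposite.
    """
    strong = (q is QValue.HIGH) != scale_reversing
    return IValue.EMPHATIC if strong else IValue.ATTENUATING


def is_acceptable(q: QValue, lexical_i: Optional[IValue],
                  scale_reversing: bool) -> bool:
    """A form clashes only if it fixes an i-value the context cannot supply."""
    if lexical_i is None:  # unspecified for i-value, like 'a bit' or 'very'
        return True
    return sentence_i_value(q, scale_reversing) is lexical_i
```

On these assumptions, the least bit (low q-value, emphatic) comes out acceptable under negation but not in a plain affirmative, as in (36c) and (37c), while a bit, which has no fixed i-value, is acceptable in both. The same function covers even with a superlative focus: treating even as contributing an emphatic i-value and the superlative as contributing the q-value reproduces the pattern in (41–42).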

6 The scalar lexicon

Negation seals strange friendships.
Dwight Bolinger (1960: 380)

6.1 Paradigmatic predictions of the Scalar Model

The Scalar Model aims to explain what polarity items are and why they should exist. It also makes clear predictions about what sorts of constructions can be polarity sensitive and where they might be found in the lexicon. If all polarity items are scalar operators, it follows that polar sensitivity can only arise in scalar semantic domains and that all polarity items must profile an entity against a scalar base. Given the diversity of polarity items in both form and meaning and both within and across languages, this might seem an unlikely, or even quixotic, hypothesis. But scalar reasoning is itself a broad and abstract phenomenon, and given its pervasiveness in language and cognition and the ease with which almost anything can be construed against a set of alternatives, this is really a rather weak claim. However, if polarity items really are all conventional expressions of rhetorical affect – of emphatic and attenuating i-values – then they should arise in contexts where there are good pragmatic reasons for expressing such a rhetorical stance. Furthermore, if sensitivity really is an effect of frequently felt pragmatic needs, then there should be regular patterns in the types of expressions which become polarity sensitive and the types of functions these serve, both within and across languages. Thus, polarity sensitive constructions are likely to have sensitive synonyms (or near synonyms) within a language and sensitive counterparts in other languages. Indeed, the Scalar Model predicts that all polarity items should divide into just four basic sorts: (i) Low-scalar emphatic <−q,+i> NPIs; (ii) Low-scalar attenuating <−q,−i> PPIs; (iii) High-scalar attenuating <+q,−i> NPIs; and (iv) High-scalar emphatic <+q,+i> PPIs. 
And since both q-value and i-value are assumed to operate independently of any particular semantic domain, it follows that any domain which includes polarity items of any one sort is liable to include NPIs and PPIs of all other three sorts as well: domains in which NPIs occur should host PPIs too; and domains with emphatic operators should also typically include attenuators. Of course, the Scalar Model is founded on the observation that all four types of polarity item are abundant in the lexicons of English and many other languages. But the model is confirmed only to the degree that all sorts of polarity items can be assimilated to just these four semantic types. Given the lack of any consensus as to what the complete class of observable polarity items is in any language, this may be a tricky proposition to assess. Most practical and theoretical work on sensitivity has focused on the syntagmatic distributions of polarity items in sentence grammar, and has been much less concerned with their paradigmatic distributions in the lexicon. But there is no reason to assume that the latter should be any less orderly than the former, and the Scalar Model makes very specific predictions as to just what sorts of order to expect. The evidence from English strongly supports these predictions. My partial catalogue of English polarity items (see Appendix) includes constructions from some twenty-six scalar semantic domains, of which at least nineteen include multiple instances of each of the four construction types predicted by the Scalar Model. NPIs and PPIs regularly arise in precisely the same scalar semantic domains, some very abstract (e.g. quantity, degree, frequency, potential), others more lexically contentful (e.g. similarity, significance, effort, affection). The items in these domains are themselves quite heterogeneous: they are not all sensitive to the same degree or in the same ways in all dialects and registers of English. But their overall distribution in the lexicon does form a pattern, and it is precisely the pattern the Scalar Model predicts.
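
The four-way typology can be restated as a small lookup, sketched here in Python. This is a hedged illustration rather than part of the model itself: the function name `predicted_sensitivity`, the encoding of the two features as +1 and -1 (mirroring the <±q, ±i> notation), and the use of `None` for an unspecified feature are all assumptions of the sketch. The sample items and their feature values are the degree words and the particle even as they are classified in the preceding chapter.

```python
from typing import Optional

# Feature encoding assumed for this sketch: +1 / -1 mirror the <±q, ±i>
# notation, and None marks a feature the form leaves unspecified.
SAMPLE_ITEMS = {
    "the least bit": (-1, +1),   # low-scalar emphatic
    "a tad": (-1, -1),           # low-scalar attenuating
    "all that": (+1, -1),        # high-scalar attenuating
    "awfully": (+1, +1),         # high-scalar emphatic
    "a bit": (-1, None),         # q-value only
    "very": (+1, None),          # q-value only
    "even": (None, +1),          # i-value only
}


def predicted_sensitivity(q: Optional[int], i: Optional[int]) -> str:
    """Return 'NPI', 'PPI', or 'insensitive' for a <q, i> pair."""
    if q is None or i is None:
        return "insensitive"     # only forms fixing both values are sensitive
    # Opposite signs (<-q,+i>, <+q,-i>) give NPIs; matching signs give PPIs.
    return "NPI" if q * i < 0 else "PPI"


for form, (q, i) in SAMPLE_ITEMS.items():
    print(f"{form}: {predicted_sensitivity(q, i)}")
```

Stated this way, the prediction is that a form fixing both features is an NPI when the two values have opposite signs and a PPI when they match, while a form that leaves either feature open is not polarity sensitive in itself.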
This chapter focuses on three of the less obviously scalar classes of polarity items – modals, connectives, and aspectual operators – to make the case that indeed, all three classes are grounded in inherently scalar semantic domains, and, as predicted by the Informativity Hypothesis, sensitive constructions in these domains conventionally encode emphatic or attenuating scalar pragmatic meanings.

6.2 Modal polarity items

The semantics of modality is notoriously complex, and the complexity begins with the problem of just what modality is supposed to be. In general a modal operator is any construction which profiles the status of an expressed proposition with respect to a speaker’s conception of reality. The most basic sorts of modal operators express notions of ‘possibility’ or ‘necessity’ in relation to everyday reasoning (epistemic modality), social obligation (deontic modality), or mental and physical ability (dynamic modality). These modalities are of special interest because they are cross-linguistically prone to grammaticalization (Bybee, Perkins & Pagliuca 1994; van der Auwera & Plungian 1998), and because within languages they often exhibit close structural and historical relations (Traugott 1989; Palmer 1990; Sweetser 1990). Syntactically, modal constructions occur in various guises with either a verb phrase or clausal complement: there are modal adjectives (e.g. possible, impossible, easy, hard), modal verbs (e.g. can, may, must, need, want), modal nouns (e.g. chance, hope, or prayer in have a ___ of V-ing), and modal adverbs (e.g. maybe, certainly). Such constructions are often constrained in their combinations with other logical operators, and particularly with negation, both in English (Palmer 1990; Cormack & Smith 2002) and cross-linguistically (Palmer 1995; de Haan 1997; van der Auwera 2001). A modally unmarked proposition presents a situation as actual with respect to some conceived reality – as occurring in a mental space with the status of fact in the past or present (Cutrer 1994; Fauconnier 1997) – while modally marked propositions (i.e. things which are possible or necessary or preferable or imaginable) present a situation as a kind of potential with respect to a real or imagined world (i.e. in a mental space not construed as fact). I use the term potential here generally to include such relations as the ease with which a situation might be realized, the evaluation of a situation as desirable or distasteful, and the status of a situation as logically, morally, or physically possible. I assume here, as is common in cognitive linguistics, that while these different sorts of potentials are logically and ontologically very different sorts of creatures, they are all commonly conceptualized in terms of basic embodied, force dynamic image schemas (Talmy 1985; Sweetser 1990; Langacker 1991 – cf.
Portner 2009). Thus, the relation between the subject NP of a modal verb and the process profiled in its complement VP is understood as a force – a compulsion, attraction, enablement, or blockage – which can push a proposition into or out of construal as fact. Different modal constructions contrast paradigmatically both in the kinds of potentials they encode (i.e. dynamic, deontic, epistemic, bouletic, etc.) and in the strength of their profiled potencies: strong modals like need and must have high q-values, while weak modals like can and may have low q-values. Modality provides a fertile field for the growth of polarity items because modality itself is a scalar phenomenon. Modal operators have traditionally been understood as the propositional analogues of nominal quantifiers (Jespersen 1917, 1924; von Wright 1952; Horn 1972, 1989: 259ff.; Kratzer 1991; von Fintel 2006). Thus, in possible world semantics, necessity operators are basically universal quantifiers over sets of worlds, while possibility operators are existential quantifiers.1 The analogy between quantifiers and modals need not be exact – the important point is that like quantificational operators, modal operators support scalar inferences. Just as all entails some, so necessity entails possibility, and obligation entails permission. The scalar structure of epistemic (1) and root (2) modality is illustrated below. (1)

It’s after midnight, so the game ___ be over by now. a. must b. should c. may

(2)

You ____ eat your Brussels sprouts before you have dessert. a. must b. should c. may

The strong (a) modals express necessity and encode the sort of extreme q-value found in a quantifier like all or every; the mid-scalar (b) modals express likelihood or obligation and feature high q-values analogous to that of most; and the weak (c) modals express possibility or permission and have low q-values like those of some or any. Probably the best-known polarity sensitive modals are expressions of necessity like the semi-auxiliary use of need as in you needn’t worry. In its most basic use, the lexical verb need denotes a social or physical necessity: for something to need to happen, there must be some positive force, either a social obligation or a physical requirement, favoring its occurrence. In this sense, need encodes a high q-value and contrasts with weaker modals like can or could: normally, if one needs to do something, it follows that one could do it as well. But since, as an NPI, auxiliary need occurs only in scale-reversing contexts, its high q-value contributes to the expression of weak, attenuating propositions. The attenuating force of need makes it particularly suitable for indirect requests as in (3), where the denial that something is necessary (you needn’t) gently implicates that it is also undesirable (please don’t). (3)

a. You needn’t be so coy. b. You needn’t leave yet.

These sentences are conspicuously uninformative if one wants to know what it is one should or should not do, and so they sound more like muted requests than simple statements of fact. They exhibit negative politeness by giving options instead of dictating a desired course of action (Lakoff 1973).
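
The scalar reasoning behind the modal examples above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not an analysis of English modals: the three-point scale `MODAL_SCALE`, its labels, and the function `entails` are hypothetical names, and the sketch sets aside scope interactions between particular modals and negation.

```python
# Hypothetical three-point scale, weakest to strongest, following the
# (c) < (b) < (a) ordering of the modals illustrated in (1-2).
MODAL_SCALE = ["possible", "likely", "necessary"]


def strength(notion: str) -> int:
    """Position of a modal notion on the assumed scale."""
    return MODAL_SCALE.index(notion)


def entails(premise: str, conclusion: str,
            scale_reversing: bool = False) -> bool:
    """Does asserting `premise` license `conclusion` on this scale?

    In a scale-preserving context inferences run from strong to weak
    (necessity entails possibility). In a scale-reversing context such
    as negation they run from weak to strong ('not possible' entails
    'not necessary').
    """
    if scale_reversing:
        return strength(premise) <= strength(conclusion)
    return strength(premise) >= strength(conclusion)
```

On this sketch, denying a high value such as necessity licenses no inference about the weaker values, which is one way of seeing why you needn’t makes such a weak, attenuated claim.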

Similarly sensitive needful constructions are found in many languages. These include German brauchen+VP, Norwegian å behøve (Johannessen 2003), Dutch hoeven (Hoeksema 1994; van der Wal 1997), French être besoin de, and Mandarin yòng (Edmondson 1983; van der Wouden 1996a, b). Despite the evidence of a clear cross-linguistic pattern here, these forms remain marginal in most studies of polarity, and as Duffley (1994) notes, where they are discussed, they are mostly seen as an interesting anomaly – either an odd sort of a modal (Coates 1983; Palmer 1990) or an unusual kind of NPI (Jackson 1995; van der Wal 1997). But such constructions may be more ordinary than many have supposed. In fact, modal expressions of all sorts are prone to the four polarity sensitivities predicted by the scalar model. The well-documented ‘necessary’ NPIs are complemented by an even larger class of ‘possible’ NPIs, in which the notion of ‘possibility’ serves an emphatic function. And both the ‘possible’ and the ‘necessary’ NPIs are complemented by robust classes of attenuating ‘possible’ PPIs and emphatic ‘necessary’ PPIs. English ‘necessary’ PPIs include auxiliary and catenative verb constructions with must, should, have got to (gotta), and (had) better, all of which, like some, obligatorily take wide scope over negation. While needn’t expresses the absence of a requirement to do something, shouldn’t and mustn’t profile a requirement to not do something: mustn’t means ‘necessary not’; needn’t can only mean ‘not necessary.’ (4)

a. You {shouldn’t/mustn’t} be so coy. b. You {shouldn’t/mustn’t} leave yet.

These constructions are a rather weak breed of PPI, since though they cannot be interpreted in the scope of negation, both allow negative contraction (in many dialects, at least – mustn’t is uncommon in American English). The constraints on the periphrastic modals (have) got to and (had) better are stronger: these constructions cannot combine directly with negation at all. While both can take a negative VP-complement, as in (5b), neither allows negation in the higher auxiliary phrase, as in (5c). (5)

a. You {have got to / had better} finish the report by Tuesday. b. You {have got to / had better} not worry so much. c. *You {haven’t got to / hadn’t better} finish the report by tomorrow.

Many expressions of ‘necessity’ are similarly uncomfortable in the scope of negation – among others, the complex auxiliary constructions be bound to, be compelled to (at least when used epistemically), and the epistemic modal adverbs surely and certainly, which can emphasize a negative assertion (e.g. I certainly did not eat the last cookie!) but cannot occur to the right of or be interpreted inside the scope of negation (e.g. *I did not certainly eat the cake). Nor is it just ‘necessary’ modals that are subject to such sensitivities. One also finds polarity items in abundance at the low end of the modality scale, where ‘possibility’ is the modal analogue to low q-value quantifiers and indefinites. Probably the best-known example of a ‘possible’ NPI is the epistemic use of the auxiliary can (Horn 1972), which, unlike epistemic could, occurs only in negative (6) and interrogative (7) clauses. (6)

a. They actually pay you to make up words? You can’t be serious. b. He says they pay him to make up words. He could/*can be serious.

(7)

a. Can this really be happening to me? b. *This can really be happening to me.

Epistemic can is essentially an expression of shocked disbelief. It profiles a minimal degree on a scale of possibility and is restricted to use in emphatic propositions where every possibility is effectively excluded. Other ‘possible’ NPIs are even more obviously emphatic. For example, modal nouns and nominals like chance, prayer, ghost of a chance, and snowball’s chance in hell all express emphatically minimal likelihoods where they occur as values of X in the have an X of V-ing construction. Also worth mentioning here are the idiomatic uses of seem, manage, and begin with can and could as in I {can’t/*can} seem to get her attention; We {couldn’t/*could} manage to fix it; and You {can’t/*can} begin to imagine what it was like. Each of these complex NPIs expresses a minimal ability of some sort: as such they are all emphatic NPIs profiling a low q-value on a dynamic modality scale.2 Just as the attenuating NPI need finds an emphatic PPI counterpart in must, so the low-scalar epistemic can is complemented by epistemic may, which is an attenuating PPI. In (8), where may indicates permission, it allows either narrow or wide scope with negation, though the narrow scope reading is generally preferred; but where may indicates logical possibility, as in (9), it obligatorily scopes over negation. (8)

You may not leave the table. a. Narrow: You do not have permission to leave. b. Wide: You have permission to not leave (= you can stay).

(9)

We may not get a chance to talk again. a. *Narrow: It’s not possible that we will talk again. b. Wide: It’s possible that we will not talk again.

The exclusion of epistemic may from questions, as in (10), strengthens the parallel between the PPI may and the NPI can. (10)

a. *May this really be happening to me? b. Her light is on. Can/Might/*May she be home already?

Epistemic may thus appears to be blocked in just those contexts – questions and negatives – where epistemic can is licensed. Together the two divide up the expression of logical possibility in much the same way that need and must divide the expression of necessity and obligation: can is an emphatic NPI, with low q-value and high i-value; may is an attenuating PPI, with low q-value and low i-value. Other modals patterning with may as low-scalar attenuating PPIs include might, might well, just might, could well, and could just as well, all of which allow negated VP complements (e.g. He could well not be there or She just might not know) but require any tautoclausal negative operator to occur to their right and to be interpreted in their scope. The modal adverbs maybe and perhaps are similarly constrained, as illustrated below. (11)

a. Maybe/Perhaps, it will/won’t rain tonight. b. *It won’t {maybe/perhaps} rain tonight.

The evidence from English suggests that modal constructions of all sorts are prone to polarity sensitivity, and that the sensitivities they display are a function of their basic modal meanings: low q-value ‘possible’ modals grammaticalize as emphatic NPIs and attenuating PPIs, high q-value ‘necessary’ modals as emphatic PPIs or attenuating NPIs (see Appendix, section 8). And if such constructions are common, then presumably they must be somehow useful. Indeed, the Scalar Model predicts that scalar constructions will be polarity sensitive only to the extent that they are conventional expressions of rhetorical affect, encoding either an emphatic or an attenuating informative value. Consider the ‘necessary’ polarity items. Because hearers in general prefer utterances which are highly informative, and because scalar extremes are such salient conceptual reference points, speakers in general are inclined to make emphatic claims. This leads to the grammaticalization of modal PPIs like gotta and better in just those contexts where their ‘necessary’ meanings are effectively emphatic. But emphatic claims are by their nature more likely to be false than more moderate claims, and so are more frequently subject to questioning or denial; and this favors the grammaticalization of ‘necessary’ operators in negative contexts where they are used to rebut their positive counterparts.

The modal adverb necessarily, for example, appears in the BNC about three times as often in negative contexts, as in (12), as it does in simple affirmatives, as in (13). (12)

a. I tried to make her understand that kids weren’t necessarily the key to happiness. b. To be a layman, even to be anticlerical, is not necessarily to be irreligious. c. “Life isn’t necessarily fair, Miss Levington,” he rapped.

(13)

a. This could obviously cause problems when information is recorded electronically, since any print out will necessarily be a copy of the original. b. And we can go on keeping it ad lib. whereas Byrd concerts will necessarily be few and far between. c. It is necessarily selective and undoubtedly subjective in choice of material, and the author apologizes where appropriate.

But while necessarily can appear without negation, where it is negated, and particularly when it is used epistemically or metalinguistically to express disagreement, the negation is often indispensable. In positive uses like those in (13) necessarily presents a proposition objectively, as a consequence of the way the world is: thus the necessity in (13a) is inherent in the nature of electronic records, while the paucity of Byrd concerts in (13b) is due to the inherent difficulty of arranging them. But in (12) the relevant notion of ‘necessity’ concerns the degree to which an inference is justified: thus in (12c) the point is not so much to deny that life is always of necessity fair (which probably no one believes anyway), but to counsel against the not uncommon expectation that it will be. Similarly, in (14), where necessarily works as a conversational particle expressing “a non-committal response to a question or suggestion” (OED II), it must occur with negation. (14) “I’ve just had the most boring night of my life with Bucky Leo and his one amazing brain cell.”… “A no-go, then?” “Not necessarily… He has one mighty fine body.” (S. Stewart, Sharking, xiv. 234)

Here, as in (12), the negated necessarily provides a way of delicately demurring from a contextually salient conclusion without having to fully disagree with one’s interlocutor. It is thus classically attenuating. It leaves options open. The same general tendency which drives (not) necessarily to its frequent use in gentle denials is evident also in the frequent use of auxiliary need with predicates denoting speech acts and anxious emotions. Thus, of the forty-five instances of need hardly in the BNC, thirty-eight (84 percent) feature auxiliary need complemented by a speech act verb and a first person actor, as in (15).

(15)

a. I need hardly say that my wife’s first impression of Lewis differed somewhat from my own. b. “I need hardly tell you,” he continued in his dry voice, “what a blow you dealt to she who cared so much for your welfare.” c. It need hardly be pointed out that the provision of additional health/leisure facilities would also justify an increase in room rates.

The effect in these examples is distinctly mitigating. In each case, what is said is just that an informative act is unnecessary, presumably because the information it would provide is too obvious to mention. The use of need hardly is thus attenuating in the trivial sense that the proposition it expresses is conspicuously uninformative and literally non-informing. But the construction here is also understating in the stronger sense that it allows the speaker to mention a proposition without actually taking responsibility for saying it. Thus, the most common use of auxiliary need in this sample is as a device for coyly expressing something which is explicitly left unsaid. The effects are similar where need is used with predicates of worry and distress, as in (16), to discourage an addressee from some presupposed potential discomfort. (16)

a. And you need not worry about whether I was safe or not. b. You need not be ashamed of any degree from Glasgow. c. “You need not trouble yourself, Doctor Sparrow, you need have no anxiety about the question of a wheelchair.”

This use only occurs where an addressee is presumed to be susceptible to some negative emotion, and it serves as a way of both indirectly acknowledging this feeling and gently dismissing it. Pragmatically, what auxiliary need and NPI necessarily share in these examples is a tendency to qualify a highly topical proposition – either one recently posed in a discourse, or one likely to be entertained anyway by the hearer. These are thus dialogical and inherently argumentative constructions – they are used to respond to old propositions in a discourse rather than to construct new ones. But they are not exact synonyms either. Auxiliary need seems especially well suited to and common in the formulation of indirect requests (3), oblique assertions (15), and gentle reassurances (16). NPI necessarily serves primarily as a way of objecting to a conclusion that one’s addressee may have jumped to. A better synonym than need for necessarily is perhaps the use of just because which Bender and Kathol (2001) call the “JB-X DM-Y construction.” The examples in (17–18) suggest that this construction is an NPI and that it is licensed in rhetorical questions, conditional antecedents, and indirectly negated clauses.

(17)

a. Just because we live in Berkeley doesn’t mean we’re left-wing radicals. b. *Just because we live in Berkeley means we’re left-wing radicals.

(18)

a. “Just because a guy has bleached hair, winter tan, speaks slowly and is pleasant to the point of being vacuous … does that mean he’s a surfer.” b. If just because we live in Berkeley means we’re left-wing radicals, you have some serious misconceptions about our city. c. Don’t assume that just because we live in Berkeley means we’re left-wing radicals.

This construction features a preposed just because clause expressing the grounds (X) for a possible conclusion (Y), and a main clause proposition denying the inference from X to Y. Typically, this denial is lexicalized by the words doesn’t mean, but other predicates are possible if they effectively convey that X is not a reason to conclude Y. In effect, like NPI necessarily, the JB-X DM-Y construction presupposes a context in which some topical proposition X is considered a strong argument for a conclusion, and it serves to deny that the conclusion is warranted. As with necessarily, there is some question as to whether this construction really is polarity sensitive. Based on examples like those in (19), Bender and Kathol conclude that it is not an NPI but rather is licensed “by any environment that distances the speaker from the belief that X in fact implies Y.” (19)

a. Kim seems to believe that just because we live in Berkeley means we’re left-wing radicals. b. So what you’re saying is just because we live in Berkeley means we’re left-wing radicals.

I think this is right, but what it shows is just that, like many other NPIs, this construction can be licensed by an appropriate negative implicature (Linebarger 1980; Horn 2001; Israel 2004, 2006). In (19) these implicatures are triggered by the representational predicates seems to believe and what you’re saying, which suggest a contrast between the speaker’s beliefs and those of some other person. In fact, many polarity items can be licensed by just this sort of quasi-ironic distancing from a proposition. The expression big deal, for example, occurs in negative understatements only about 70 percent of the time in the BNC, and as such seems not to be a true NPI. But where it occurs without negation, it is either in ironic exclamations (big deal!) or in contexts which place some epistemic distance between a speaker and an expressed proposition, as in the examples below where the big deal is embedded under verbs like think, feel, and seem.

(20)

a. That’s because parents in Sylhet seem to think it is a big deal, a real status symbol to get a Biliti Bor (a bridegroom from England). b. The extent to which they are felt to be a big deal for the pupils will mirror the extent to which they are felt to be a big deal by their teachers. c. “It always seemed like a pretty big deal to me,” he said.

While the constraints on the big deal and JB-X DM-Y constructions may be looser than those on some NPIs, they do not seem to be different in kind. In fact the two constructions seem to be fairly representative of two larger classes of polarity items: thus big deal is, like the end of the world and something to write home about, an attenuating NPI which operates on a scale of significance and contrasts with emphatic NPIs like matter and make a difference (see Appendix, section 9). Similarly, the JB-X DM-Y construction is not so different from modal polarity items like the emphatic NPI epistemic can or attenuating ‘necessary’ NPIs like English need and Dutch hoeven. Of course, these sorts of connections across polarity items may be difficult to discern if one lacks a proper theory of what it means to be a polarity item. The Scalar Model predicts that modal operators in general should grammaticalize as polarity items precisely where their use is most pragmatically loaded, and the behavior of the ‘necessary’ NPIs reviewed here suggests that one of the more useful functions a polarity item can perform is the attenuated expression of a contrary opinion. But while the ‘necessary’ NPIs are all similarly argumentative in their meanings, they are not identical in the pragmatic problems they solve. ‘Necessity’ itself is a very abstract sort of concept, and as such it plays a role in the conceptualization of more concrete semantic domains and may be useful in a wide variety of communicative contexts. Indeed, modal operators appear to be magnets for polarity sensitivity precisely because modality is such a quintessentially abstract scalar semantic domain. Thus, beyond the traditional realms of epistemic, deontic, and dynamic modality, many more contentful verbal polarity items incorporate notions of ‘possibility’ or ‘necessity’ in the broad sense that they denote relations which can affect the likelihood of a situation obtaining.
Many of these are psychological in nature, for example states like desire, aversion, tolerance, and antipathy, which can be thought of as forces which compel a subject either to seek out or to avoid a situation of some sort. The Scalar Model is thus further confirmed to the degree that these more contentful sorts of modal domains also include polarity items of all four predicted types. A large class of English verbal polarity items profile different states of desire, like the care to V and the dream of V-ing constructions illustrated below (all unasterisked examples are from the BNC).

(21)

a. It’s not a decision I would care to have to make. b. No one here, I trust, would care to disagree. c. Would you care to spend Christmas with me? d. *It’s a decision I would care to make.

(22)

a. I would never dream of marrying for anything less than love. b. If you think I’d dream of sharing so much as a blanket with you after that you’re crazy! c. Who, he asked, would dream of privatising the Royal Navy? d. *I would dream of sharing a blanket with you.

I take it that the care to V construction profiles a moderate-to-strong desire and is thus attenuating, while the would dream of V-ing construction denotes a minimal inclination, and so is emphatic. The pragmatic differences between the two are particularly clear in questions, where the use of would dream of, as in (22c), strongly anticipates a negative response, while the milder care to, as in (21c), makes an indirect invitation with a real hope of a positive response. Indeed, in formulae like would you care…, if you’d care…, you might care…, and perhaps you’d care…, all of which are abundant in the BNC, the care to V construction has arguably grammaticalized as a kind of illocutionary force-indicating device for invitations and polite directives. Like the would dream of construction, the semi-auxiliary use of dare in (23) is a low-scalar emphatic NPI which profiles a subject’s minimal inclination to act in some way. (23)

a. Of what followed I cannot tell in detail – I dare not put it into words. b. I was wearing dresses that showed more than she ever would dare, before she was born. c. But who would dare approach the aloof Lady Eleanor? d. *I will dare put it into words.

Both of these constructions mark a low degree of willingness to act, and so, in the scale-reversing contexts to which they are limited, they form highly informative propositions emphasizing the unlikelihood of an event. The analysis of dare as encoding a low q-value may seem counterintuitive. The problem is that ‘daring’ itself is a scalar notion, and the verb dare seems to indicate a high degree on a scale of daring or audacity. But audacity itself is a relatively weak indicator of future action: one’s having the audacity to do something does not entail that one will do it, or even that one would want to; and if one lacks the audacity to do something, then in that respect one must prefer not to do it. Parallel to NPIs like dare and would dream of, many PPIs also operate on a scale of inclinations – among others, the attenuating would rather construction illustrated in (24) and the emphatic would love to construction in (25). Examples in (24–25) are from the BNC. (24)

a. I would rather have been down at the villa making figgy hedgehogs for Tony, but a promise is a promise. b. I would rather do anything than have to fly.

(25)

a. I would love to hear from you again, if you can spare the time. b. I would love to have a chauffeured Cadillac, but I can’t afford it.

One striking piece of evidence for the status of these constructions as PPIs comes, ironically enough, from the ways they combine with negation. PPIs are usually acceptable only in the scope of negation where the negation itself is construed in the scope of another polarity trigger (Baker 1970; Chierchia 2004; Szabolcsi 2004; §3.6.3, above). Assuming that simple questions and denials are more frequent than negated questions and denials, if a construction commonly occurs with negation only when it is also in the scope of a question, a conditional, or a negative, then that construction is very likely a PPI. There are just two instances of the string would not rather in the entire BNC, both in (26), compared to 548 for would rather, and no instances of would not (or wouldn’t) love to. Where instances of the latter can be found on the web, they are almost always in rhetorical questions, as in (27). (26)

a. Who would not rather own to theft and deception within the Church’s writ, rather than put his neck into the sheriff’s noose for murder? b. I wonder whether Christ would not rather go to Calvary again than to suffer the unfaithfulness of some of his friends.

(27)

a. What man on all of Terra would not love to have private nude dancers running around a room of eight foot tube lights suspended from the ceiling. b. What guy wouldn’t love to claim bragging rights as Wendy Shalit’s first lover?

While desiderative polarity items profile an experiencer’s positive desire for or approval of a potential situation, another large class of polarity items operates on precisely the opposite sort of scale – a scale of aversions, where affective attitudes range from a minimal ability to keep away from something to absolute loathing for it. At the low end of the scale, we find predicates which denote an ability to resist or abstain from engaging in some act, or else which indicate simple indifference (e.g. the attenuating PPI can take it or leave it). Falkenberg’s class of “abstentive” NPIs (2001: 81) fits in here as emphatic expressions of minimal aversion. Her examples from German include anstehen ‘hesitate,’ erwarten können ‘can wait,’ sich enthalten können ‘to refrain from,’ sich entblöden ‘to be ashamed to,’ sich entmutigen lassen ‘grow discouraged,’ ermüden ‘tire of,’ verfehlen ‘fail,’ and sich zurückhalten können ‘can hold back from.’ The examples in (28) illustrate a few similar NPIs in English. (28)

a. She {couldn’t/*could} help laughing. b. He {couldn’t/*could} resist asking her about her date. c. I {couldn’t/*could} wait to see him.

The BNC examples in (29–30) illustrate two constructions which denote relations at the high end of the aversion scale: the attenuating NPI mind and the emphatic PPI hate to. (29)

a. Oh well I wouldn’t mind being entertained by her! b. Do you mind if I join you?

(30)

a. I hate to sound sappy, but anything can make me cry… b. I hate to burden you with this.

Attenuating aversive NPIs like mind reflect a coy strategy common in some cultures (Hübler 1984) for expressing desires indirectly. Other English constructions with similar uses include be averse to, have qualms about, and a broad family of expressions which literally denote disdain but are typically used to express approval – among others, would sniff at, would turn up one’s nose at, and would kick X out of bed, which, as Horn (p.c.) notes, typically functions as an oblique way of expressing sexual attraction by denying an inclination toward rejection. The difference between aversive and desiderative polarity items is particularly clear in the contrast between the mind V-ing and care to V constructions. As attenuators, both constructions tend to be used in politely uninformative speech acts, commonly occurring in more or less formulaic questions where they effectively make it easier for an addressee to answer negatively. In general, one is less likely to have either a strong desire or a strong aversion than to have a weaker one, and so to ask of someone do you mind? (roughly, ‘do you strongly object?’) or would you care? (roughly, ‘do you strongly desire?’) is, at least formulaically, a way of giving an addressee more options. But since the two forms here are associated with the opposite sorts of scales, the pragmatic functions which they serve are similarly opposed. Thus, as noted above, would you care to V (or would you care for X) typically expresses an offer or an invitation – because it politely allows one to decline gracefully – while an expression like would you mind V-ing is typically used for requests – because saying “no” in this case implicates a willingness to do what is requested.

The expression of modality also plays an important role in the semantics of many more complex verbal NPIs. Epistemic can is perhaps the purest example of what Horn (1972: 187ff.) calls an “impossible polarity item” – a form which can be used only to express what cannot be. Impossible polarity items themselves are a subclass of “possible polarity items” (ibid.) – forms like cope and afford, which must occur in the scope of a ‘possibility’ operator of some sort (e.g. can, be hard to, be enough to, help, etc.). ‘Impossible’ polarity items are just ‘possible’ polarity items for which the notion of ‘possibility’ must itself be interpreted inside the scope of negation or some other polarity trigger. The complex sensitivities of this last group are nicely illustrated in Horn’s multilayered example below (1972: 190). (31)

I {can’t / ?can / *didn’t} {abide / bear / stand / take / stomach} {linguistics / writing dissertations}.

Not all of these forms are strict NPIs. Hoeksema (1994) finds instances of stand, take, and bear in affirmative contexts, though only rarely in his corpus (<10%). These forms are, however, very strict in their need for a licensing possibility operator, though as with negation this can be expressed in any number of ways (e.g. it’s {hard / tough / impossible / not easy} to stomach such nonsense). The frequent incorporation of ‘possibility’ operators in verbal polarity items highlights a significant parallel between the NPI uses of can and polarity sensitive any. The difference between the possibility of an event and its actual manifestation seems analogous to the distinction between an arbitrary indefinite instance of a nominal type and its actual instantiation. In this sense, the can in a verbal NPI like can fathom or can stomach is the verbal analogue of the indefinite article in minimizers like sleep a wink or lift a finger: in both cases the effect is to preclude specific reference and so to reinforce the irrealis nature of the expressed proposition. Tolerative polarity items like those in (31) presuppose a scale of situations ordered in terms of their unpleasantness (Appendix, section 14). NPIs similar to those in (31) are abundant in at least several languages: among them Dutch – e.g. (kunnen) {uitstaan/velen/verdragen/verkroppen/zetten/ergens tegen} ‘(can) stand / stand / bear / stomach / swallow / bear something’ (Hoeksema 1994; van der Wal 1997); German – e.g. (können) {ausstehen/leiden/vertragen} ‘(can) abide / suffer / tolerate,’ dulden ‘stand for,’ and zulassen ‘allow’ (Falkenberg 2001); and French – e.g. (pouvoir) {avaler/boire/blairer/piffer/puer/digérer/souffrir} literally ‘(can) swallow, drink, smell, smell, smell, digest, suffer’ (Bouvier 2002; Tovena, Deprez & Jayez et al. 2004). Other sorts of tolerative constructions include attenuating NPIs like take lying down and can resign oneself to and emphatic PPIs like be dying to V and would give {anything / one’s right arm} to V, all of which profile a great willingness to suffer passively. At the low end of the scale, English includes several PPIs denoting a weak acceptance of an unwelcome situation – e.g. can live with, as in I guess I can live with that, and bite the bullet, as in I guess we’ll just have to bite the bullet – or which simply suggest an ability to endure or survive some difficulty – e.g. figurative uses of weather the storm and survive. Parallel to the contrast between desiderative and aversive polarity items, tolerative polarity items contrast with the irritative (or troublesome) polarity items (Appendix, section 15). These are mostly impersonal constructions denoting a relation between a potentially bothersome stimulus of some sort (usually coded as subject) and a potentially discomfited experiencer (usually coded as direct object). They thus rank situations from the least to the most irritating on a scale of troublesomeness, with emphatic NPIs like be any bother and attenuating PPIs like can be a hassle at the low end, and attenuating NPIs like (be) skin off one’s nose and the end of the world at the high end. Irritative NPIs in other languages include French constructions with déranger and gêner (as in ça ne me gêne pas ‘it’s not a problem’) and German verbs like jucken ‘itch,’ kratzen ‘scratch,’ kümmern ‘bother,’ and verschlagen ‘matter’ (Falkenberg 2001). The classes of sensitive verbs distinguished here do not exhaust the varieties to be found even in English, but they may be enough to show that such constructions are both strikingly diverse and remarkably consistent in the meanings they encode.
Previous work on verbal NPIs by Hoeksema (1994) and Falkenberg (2001) has shown that these forms regularly recur in a few fairly well-defined semantic domains. Hoeksema distinguishes three major groups of verbal NPIs in Dutch and English based on their lexical semantics: verbs of “minimal degree” like English budge and touch (as in she wouldn’t touch her food) and Dutch reppen ‘to mention, speak of’ and kunnen tippen aan ‘to match, hold a candle to’; verbs of “indifference” including English care, matter, mind, and bother and Dutch kunnen schelen and kunnen bommen ‘to matter, make a difference’; and verbs of intolerance like English (can) stand, stomach, bear, or abide and Dutch kunnen uitstaan and kunnen verkroppen ‘to stand, tolerate.’ Falkenberg distinguishes four main groups of verbal NPIs in German: (i) predicates of privation, which express a lack of some sort, like English lack for or do without; (ii) abstentive predicates with meanings like ‘hesitate,’ ‘can wait,’ or ‘can keep from’; (iii) predicates of attraction (equivalent to Hoeksema’s “verbs of intolerance”); and (iv) impersonal verbs of care, with meanings like ‘bother’ or ‘trouble’ (basically a subset of Hoeksema’s “verbs of indifference”). Similar sorts of constructions are also found in French (Bouvier 2002; Tovena, Deprez & Jayez et al. 2004) and Norwegian (Johannessen 2003). While all these languages are culturally, historically, and geographically closely related, the similarities among their verbal polarity items do not seem to be a matter of shared inheritance or of borrowing, and so must have some broader pragmatic basis. The fact that so many nearly synonymous polarity items can be found both within and across languages shows that sensitivity must be at least partly semantically motivated. And the fact that such synonym clusters occur in so many distinct domains shows that these motivations cannot be reduced to a single cause. As Falkenberg puts it, “NPI-verbs do not share one single property that makes them susceptible to negative polarity … they fall into natural subclasses whose semantic and syntactic behavior is to be explained separately” (2001: 80). And this is surely correct – different semantic domains provide different pragmatic motivations for the grammaticalization of NPIs, and these differences should not be ignored. But acknowledging such differences should not prevent us from recognizing that all these sorts of sensitive verbs are built on inherently scalar semantic domains and that all of them are pragmatically specialized for the expression of propositional strength.
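The four-way typology that organizes this section can be restated as a small decision procedure. What follows is only an illustrative sketch in Python, not part of the Scalar Model's own apparatus: the names QValue, IValue, predicted_sensitivity, and MODALS are hypothetical, and the four sample entries simply restate the classifications given above for epistemic can, epistemic may, must, and auxiliary need. The sketch assumes that a low-scalar form yields an emphatic proposition only in scale-reversing contexts, and a high-scalar form only in scale-preserving ones.

```python
from enum import Enum


class QValue(Enum):
    """Scalar position of a form (its q-value)."""
    LOW = "low"
    HIGH = "high"


class IValue(Enum):
    """Rhetorical force a form conventionally encodes (its i-value)."""
    EMPHATIC = "emphatic"
    ATTENUATING = "attenuating"


def predicted_sensitivity(q: QValue, i: IValue) -> str:
    """Return 'NPI' or 'PPI' for a form with the given q-value and i-value."""
    # Assumption: low-scalar forms are emphatic only where the scale is
    # reversed (e.g. under negation); high-scalar forms only where it is not.
    emphatic_under_reversal = q is QValue.LOW
    # An emphatic form needs the context that makes it emphatic;
    # an attenuating form needs the opposite context.
    needs_reversal = emphatic_under_reversal == (i is IValue.EMPHATIC)
    return "NPI" if needs_reversal else "PPI"


# Classifications as stated in the text for four English modals.
MODALS = {
    "epistemic can": (QValue.LOW, IValue.EMPHATIC),
    "epistemic may": (QValue.LOW, IValue.ATTENUATING),
    "must": (QValue.HIGH, IValue.EMPHATIC),
    "auxiliary need": (QValue.HIGH, IValue.ATTENUATING),
}

if __name__ == "__main__":
    for form, (q, i) in MODALS.items():
        print(f"{form}: {i.value} {predicted_sensitivity(q, i)}")
```

On this sketch, epistemic can and auxiliary need come out as NPIs, and epistemic may and must as PPIs, which matches the pairings described above.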

6.3

Connective polarity items

I take a connective to be any construction which profiles a link between two or more coordinate propositions in a discourse. Connectives in this broad sense encompass both traditional conjunctions (e.g. and, or, but, although, because, etc.) and focus particles (e.g. also, either, even, too, as well) which mark the relation between a profiled text proposition and a background presupposition or context proposition. Connectives in general can encode a variety of different relations between propositions – among others, causality, concession, and temporal order – but the most fundamental connective relations are conjunction and disjunction, and it is on these I focus here. Connective polarity items are actually not uncommon. The best-known exemplars are probably the NPI conjunction let alone (Fillmore, Kay & O’Connor 1988; Verhagen 2005) and the focus particles either and too (Klima 1964; Rullmann 2002, 2003). Parallel constructions in other languages include, for let alone, Dutch laat staan and German geschweige denn, and for either/too, French non plus / aussi and Spanish tampoco / también. Within English, other similar constructions include the negatively inclined conjunctions much less and never mind, and the positively inclined what’s more and no less. Finally, recent work by Szabolcsi and her colleagues (Szabolcsi 2002, 2004; Szabolcsi & Haddican 2004) shows that in many languages – among others, Hungarian, Russian, Polish, Italian, and Japanese – even the most basic expressions of disjunction act like PPIs, being obligatorily construed as taking wide scope over a tautoclausal negation. While connective polarity items may seem rather exotic, their sensitivities are, I suggest, as with other polarity items, a function of their scalar semantics. Conjunction and disjunction are themselves in fact basic scalar operations,3 and so, like other scalar operators, disjunctive and conjunctive connectives are liable to grammaticalize the sorts of scalar rhetorical functions which give rise to sensitivities. The existence of such polarity items is thus, again, both consistent with and predicted by the Scalar Model of Polarity. There are two basic ways in which a connective can be construed within a scalar model and so function as a scalar operator. First, a connective c may operate directly on a conceptual scale by requiring that the propositions it connects – the coordinates X and Y – be somehow intrinsically ranked with respect to each other. I will refer to connectives of this sort – including English let alone, much less, and never mind – as intrinsically scalar. But even where the coordinate propositions are not intrinsically ranked, the effect of coordination itself may create a scale in which a complex proposition [X c Y] is ranked with respect to its atomic elements [X] and [Y]. Connectives of this sort – including both, and, or, and either – are usually considered non-scalar (König 1991; Rullmann 2003), but since they can and do give rise to scalar effects, I will refer to them here as implicitly scalar.
The Informativity Hypothesis predicts that intrinsically scalar connectives will be polarity sensitive whenever they require their coordinates to express propositions which have both (i) fixed relative q-values on a dimension in a scalar model, and (ii) fixed relative i-values, or argumentative strengths. To see how this works, consider the let alone construction, which Fillmore, Kay, and O’Connor (henceforth FKO) analyze as a pragmatically rich, polarity sensitive coordinator as follows (1988: 512). (32)

a. F 〈X A Y let alone B〉 ‘I doubt you could get FRED to eat squid, let alone LOUISE.’ b. F 〈X A let alone B Y〉 ‘I doubt you could get FRED, let alone LOUISE to eat squid.’

Here F is a polarity trigger with scope over the whole sentence; A and B are coordinated and prosodically focused constituents which denote contrasting

elements on a contextually inferable scale; and X and Y are the non-contrasting parts of the clause in which the coordination occurs. In the accompanying examples, F is the adversative matrix construction I doubt; A and B are Fred and Louise; and X is you could get and Y is to eat squid. The effect of the construction as a whole is that two propositions are expressed – F〈X A Y〉 and F〈X B Y〉 – in such a way that the second (B) proposition is more contextually relevant, while the first (A) proposition is more informative, either unilaterally entailing proposition B, or else counting as a stronger argument for some salient conclusion. FKO treat the polarity sensitivity of let alone as a syntactic fact which needs to be stipulated independently of its semantic content, but this sort of stipulation may not be necessary. I suggest that let alone expresses a relationship between two propositions, P(A) and P(B), where: (i) A and B denote elements in a scalar model SM, (ii) B ranks higher than A on a salient dimension of SM, and (iii) P(A) is more informative than P(B). Points (i) and (iii) are central features of FKO’s own analysis, and together they basically amount to a constraint on the relative informative strength of the propositions which let alone connects. However, FKO do not include anything like point (ii), which is really just a constraint on the relative q-values A and B can have within a scalar model. Together, requirements (ii) and (iii) mean that let alone can be used only in contexts where low q-value propositions entail (or make stronger arguments than) higher-ranked propositions: they thus effectively limit the construction to scale-reversing contexts, typically with an overt polarity trigger. Their effects can be seen in the examples below from the BNC. (33)

a. It isn’t fit for a latrine, let alone a resting place for the dead. b. There is something rather disturbing about the thought of having your pet stuck with needles, let alone seeing smoke rise from him as well. c. The chance that a random conglomeration of whale cells would swim, let alone swim as fast and efficiently as a whale actually does swim, is negligible.

The relevant scale in each of these examples is inferable from the relation between the A/B coordinates and the rest of the expressed proposition: thus in (33a) be fit for ranks values for A/B in terms of their relative sanctity; in (33b) be disturbing ranks values in terms of their shock value; and in (33c) the chance that X would ranks values in terms of the accomplishments they represent. In all three cases, B denotes the higher-ranked entity on the relevant scale, but P(A) expresses the more informative proposition. Much like let alone, the connective much less presents its coordinates as ordered on a scale with B outranking A, and as contributing to two propositions,

P(A) and P(B), such that P(A) makes a stronger argument for some conclusion than does P(B). Again, the examples below are from the BNC. (34)

a. How could any Catholic – much less a priest – argue against that? b. Best of all, he is not a proselytiser, much less a saint. c. Nothing in my upbringing had prepared me for the weather, much less the absurd notion of hitchhiking.

Here I take it that being a priest outranks being a Catholic on a scale of presumed faith, that a saint outranks a proselytiser on a scale of religious devotion, and that hitchhiking in bad weather is worse than just the bad weather itself on a scale of hardships. The sensitivity of a PPI like what’s more can be explained in a similar fashion. All three constructions (let alone, much less, and what’s more) involve the coordination of two propositions, P(A) and P(B), where B is construed as outranking A on some salient scale. The difference is that with what’s more it is the second expressed proposition, P(B), which has to be more informative in context than the first, P(A). This is why what’s more is blocked in the scale-reversing contexts that license let alone. The scalar, rhetorical force of the what’s more construction is evident in the passage below, from a 1986 interview with Raymond Carver: (35) Until a decade ago, publishers would risk bringing out a collection of stories only when the author had already proved himself as a novelist. But these kids aren’t Updikes; they’ve burst from obscurity like “literary phenomenons” precisely with story collections. That’s what’s new: publishers have discovered the short story has a market and, what’s more, a lucrative one.

Carver’s use of what’s more here sums up the case in favor of publishing story collections by unknown writers. The first reason, P(A) – that there is a market for short stories – is relatively weak; the second reason, P(B) – that the market is actually lucrative – is much stronger. The conjunction never mind provides a telling contrast here to both what’s more and much less type conjunctions. Never mind has a mild preference for negative contexts: of the twenty-five instances in the BNC, eighteen (72 percent) occur with polarity licensors, and seven (28 percent) in affirmative (though usually modal) contexts. In all its uses, however, never mind marks a connection between two propositions, where the first, P(A), is understood as stronger than the second, P(B): where never mind occurs with a polarity licensor, the B element is higher on the relevant scale than the A element; where it occurs without a licensor, it is the A element that outranks the B. Thus, in (36a) the relevant scale ranks the presence of a decent curry house (B), as a stronger

argument for a town having good amenities than it does the presence of a pub (A); similarly, in (36b), an ability to provoke laughter (B), is taken as a stronger sign of something being entertaining than is an ability to raise a smile (A). (36)

a. They don’t even have a pub there, never mind a decent curry house. b. It’s zany, it’s irreverent and it’s brimful of right-on attitudes, but I found it virtually impossible to raise a smile, never mind a laugh.

However, in the positive contexts in (37) we find just the opposite: in (37a) where the relevant scale involves the degree to which someone is commendable, deserving a medal (A) clearly outranks deserving a compliment (B); and in (37b) where the relevant scale involves a certain sort of worldliness, it is the Brazilian transvestite (A) that outranks the Sotheby’s research assistant (B). (37)

a. Anyone who can work with you deserves a medal, never mind a compliment! b. Malle follows suit, equipping his female lead with a get-up – black leather, gloves, fags, sunglasses, four-inch heels and an Eton crop to make a Brazilian transvestite blink, never mind a research assistant at Sotheby’s, which is what she is meant to be.
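The contrast among let alone, what’s more, and never mind can be put schematically. The following is a minimal sketch rather than part of the analysis itself: it assumes that q-values can be stood in for by integers, that a context is either scale-preserving or scale-reversing, and that the stronger proposition is the one with the higher q-value in preserving contexts and the lower q-value in reversing ones. All names (Connective, more_informative, licensed) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Connective:
    """Hypothetical encoding of an intrinsically scalar connective."""
    name: str
    higher_q: Optional[str]  # coordinate that must rank higher: "A", "B", or None
    stronger: str            # coordinate whose proposition must be more informative


def more_informative(q_a: int, q_b: int, reversing: bool) -> str:
    """Return "A" or "B": which of P(A), P(B) is the stronger proposition.

    Assumption: higher q-values are stronger in scale-preserving contexts,
    lower q-values are stronger in scale-reversing contexts.
    """
    a_is_higher = q_a > q_b
    if reversing:
        return "B" if a_is_higher else "A"
    return "A" if a_is_higher else "B"


def licensed(c: Connective, q_a: int, q_b: int, reversing: bool) -> bool:
    """Check the q-value constraint (if any) and the i-value constraint."""
    if q_a == q_b:
        return False  # the coordinates must be ranked with respect to each other
    if c.higher_q == "A" and not q_a > q_b:
        return False
    if c.higher_q == "B" and not q_b > q_a:
        return False
    return more_informative(q_a, q_b, reversing) == c.stronger


LET_ALONE = Connective("let alone", higher_q="B", stronger="A")
WHATS_MORE = Connective("what's more", higher_q="B", stronger="B")
NEVER_MIND = Connective("never mind", higher_q=None, stronger="A")
```

On these assumptions let alone comes out licensed only in reversing contexts and what’s more only in preserving ones, while never mind is usable in either, provided the coordinates are arranged so that P(A) is the stronger proposition, as in (36) and (37).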

The fact that never mind works in both negative and affirmative contexts shows that intrinsically scalar conjunctions need not be polarity sensitive: since it places no constraints on the relative q-values of its conjuncts, they can be arranged in any context so that the second conjunct has a higher i-value than the first. Of course, the idea that pragmatically loaded conjunctions like let alone and never mind are scalar operators is hardly original and probably not very controversial. The bigger challenge for the Scalar Model comes from the fact that even the most innocent-looking sorts of connectives can sometimes be polarity sensitive. While ordinary expressions of conjunction and disjunction might not seem like the most obviously scalar constructions, their meanings are easily analyzed in terms of an arbitrarily complex lattice of propositions. The lattice consists of the set of all subsets of a set of atomic propositions, ordered from the least restrictive (the set containing the disjunction of all the candidate propositions) to the most restrictive (the set containing the conjunction of all the subsets of the set of candidate propositions). Given any two propositions P and Q, a conjunction of the form (P & Q) unilaterally entails both P and Q individually; and any two atomic propositions P and Q similarly each entail their disjunction (P v Q). In effect, conjunction creates a complex proposition with a high q-value relative to its atomic constituents, while disjunction creates a

(P & Q & R)
(P & Q)    (P & R)    (Q & R)
P    Q    R
(P v Q)    (P v R)    (Q v R)
(P v Q v R)

Figure 6.1 A connective lattice

complex proposition with a low q-value relative to its atomic constituents: (P & Q) → (P) → (P v Q). As with all scalar models, these entailments reverse under negation, so that negated conjunctions are relatively uninformative, and negated disjunctions relatively informative: ¬(P & Q) ← ¬(P) ← ¬(P v Q). Figure 6.1 shows these relations as a lattice of propositions built from the conjunctions and disjunctions of three atomic propositions. A lattice of this sort can be expanded indefinitely by the addition of more atomic propositions to handle conjunctions or disjunctions of any complexity. At the bottom there is complete disjunction over a set of propositions, and at the top there is complete conjunction. Moving up, intermediate levels consist of disjunctions of progressively smaller sets of propositions, then of individual propositions, and then of conjunctions of progressively larger sets of propositions. Since a lattice is just a partial ordering, it is a scale: the lowest point (where all propositions are connected disjunctively) representing a minimal quantitative value, and the highest point (where all propositions are connected conjunctively) representing a maximal quantitative value. Conjunction is an essentially additive function. It adds a proposition to what is said, and in this sense encodes a high q-value. Disjunction is essentially subtractive: it presents a choice among expressed propositions, and thus removes one or more possibilities from consideration. It thus encodes a low q-value. Expressions of conjunction and disjunction form a classic Horn scale supporting scalar implicatures: to use the weaker or conversationally implicates that the stronger and might be misleading. Neo-Griceans (e.g. Horn 1989; Levinson 2000) see this as explaining why natural language expressions of disjunction typically receive an exclusive interpretation: thus the asserted

disjunction in (38a) usually implicates the negated conjunction in (38b), though the implicature is easily suspended by adding or perhaps both. (38)

a. I will (either) give you flowers or bake you a cake. b. ⇒ I will not both give you flowers and bake you a cake.

Under negation and with other scale reversers, the entailments are reversed: the weak denial of a conjunction suggests that the speaker could not honestly give a more informative denial of a disjunction. Thus (39a) implicates (39b). (39)

a. I will not (both) give you flowers and bake you a cake. b. ⇒ I may either give you flowers or bake you a cake.
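The entailment pattern underlying (38) and (39) can be checked mechanically. The following is a minimal sketch, assuming a purely truth-functional treatment of and and or over two atomic propositions; the helper names (entails, neg, and so on) are introduced here for illustration only.

```python
from itertools import product


def entails(f, g) -> bool:
    """True if every valuation of (p, q) that satisfies f also satisfies g."""
    return all(g(p, q) for p, q in product([False, True], repeat=2) if f(p, q))


def conj(p, q):
    return p and q


def atom(p, q):
    return p


def disj(p, q):
    return p or q


def neg(f):
    """Return the negation of a two-place propositional function."""
    return lambda p, q: not f(p, q)


chain = [conj, atom, disj]  # (P & Q), P, (P v Q)

# Affirmative context: each member entails the next one down the scale.
forward = all(entails(chain[i], chain[i + 1]) for i in range(2))

# The entailments are unilateral: no member entails the one above it.
backward = any(entails(chain[i + 1], chain[i]) for i in range(2))

# Under negation the direction reverses: not-(P v Q) entails not-P entails not-(P & Q).
negated = [neg(f) for f in chain]
reversed_under_negation = all(entails(negated[i + 1], negated[i]) for i in range(2))
```

Here forward and reversed_under_negation both evaluate to True and backward to False, which is just the pattern (P & Q) → (P) → (P v Q) and its mirror image under negation.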

Let us assume for the moment that English and and or are not themselves polarity sensitive. If and has a high q-value and or a low q-value and if both are unmarked for i-value, then they should form sensitive constructions when combined with a high i-value operator like even. And indeed the complex connectives and even and or even are emphatic polarity items: or even is an NPI and so is blocked in affirmative clauses, but licensed under negation (40), and in questions, conditionals, and comparatives (41); and even is a PPI and works in affirmative clauses but is blocked in negative and other scale-reversing contexts. (40)

a. Jasper didn’t go home, {or/*and} even leave the office. b. Hank has been to China, {and/*or} even Mongolia.

(41)

a. Do you love me, {or/*and} even like me? b. If you loved me, {or/*and} even liked me, you would buy me coffee. c. Janet likes Rothko more than Klimt {or/*and} even Matisse.
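The composition at work in and even and or even can be sketched as a simple lookup. This is an illustrative simplification, assuming that a form's sensitivity is fully determined by one binary q-value and one binary i-value; the function name and string labels are hypothetical. The attenuating cases mirror the emphatic ones and correspond to forms taken up later in the chapter.

```python
def sensitivity(q_value: str, i_value: str) -> str:
    """Predict polarity sensitivity from a q-value and an i-value.

    q_value: "high" or "low"; i_value: "emphatic" or "attenuating".
    Assumption: a high q-value is informative (and a low one uninformative)
    in scale-preserving contexts, and the reverse in scale-reversing ones.
    """
    if q_value not in ("high", "low"):
        raise ValueError("q_value must be 'high' or 'low'")
    if i_value not in ("emphatic", "attenuating"):
        raise ValueError("i_value must be 'emphatic' or 'attenuating'")
    # High + emphatic and low + attenuating fit scale-preserving contexts (PPI);
    # the two mixed combinations fit scale-reversing contexts (NPI).
    matches = (q_value == "high") == (i_value == "emphatic")
    return "PPI" if matches else "NPI"


# and contributes a high q-value, or a low one; even contributes emphasis.
COMPLEX_CONNECTIVES = {
    "and even": sensitivity("high", "emphatic"),
    "or even": sensitivity("low", "emphatic"),
}
```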

As predicted by the Scalar Model, sensitivity arises through the normal composition of quantitative and informative values in a complex expression. These insights should help to make some preliminary sense of the grammar of English either (Klima 1964; Rullmann 2002, 2003). As Rullmann (2002) points out, modern English includes at least four distinct either constructions: a coordinator which marks the first member of a disjunction and is followed by an or-marked constituent (Disjunctive either in 42); a binary free-choice pronoun or determiner (Determiner either in 43); a disjunctive focus particle following an extraposed or NP (Disjunctive FP either in 44); and a plain focus particle (FP either in 45). (42)

a. We’re going either to LA or to New York. [Disj either]
b. Either you will or you won’t.

(43)

a. You can have either of these cookies. [Det either]
b. He wouldn’t eat either cookie.

(44)

a. She doesn’t like tea, or coffee either. [Disj FP either]
b. Did you ever know English law, or equity either, plain and to the purpose? (Dickens)

(45)

a. I’m not going to LA, and I’m not going to New York, either. [FP either]
b. Publishers will usually reject suggestions, and writers will rarely accept them, either. (Klima 1964)

All four of these constructions have sensitive tendencies: disjunctive either tends not to occur in the scope of a negative; the focus particle constructions are NPIs; and determiner either, much like any, is licensed in scale-reversing and other modal contexts. I propose that either in all its uses is essentially just a way of emphasizing a disjunction, and since disjunction itself marks a low point on a connective lattice, either encodes a low q-value: in its use as a connective, Disj either is a PPI and conventionally attenuating. In its NPI uses as a determiner and a focus particle, either is emphatic. An analysis along these lines seems plausible and is at least broadly consistent with Rullmann’s (2003) claim that FP either should not be analyzed as a synonym of too: the former is an emphatic expression of disjunction (and so an NPI, a sort of minimizer among the connectives), while the latter is an emphatic expression of conjunction (and so a PPI, blocked where conjunction is not emphatic). Too and either are in some ways more like antonyms than synonyms: they are both emphatic, but in opposite directions. In some languages even the most basic expressions of conjunction and disjunction can be polarity sensitive. Szabolsci (2002), for example, argues that the Hungarian disjunctive operator vagy is a positive polarity item which cannot be interpreted inside the scope of clause-mate negation or other antiadditive operators (e.g. nélkül ‘without’ or a negative quantifier like never or no one). Szabolsci discusses possible answers to the question Miért van itt olyan hideg? ‘Why is it so cold in here?’ that would include two possible causes and a negation. In (46), the two possible causes are connected by a disjunction, vagy, in (47), by a conjunction, és. (46)

Nem csukt-uk be az ajto-t vagy az ablak-ot.
not closed-1pl in the door-acc or the window-acc
a. √ ‘Either we didn’t close the door or we didn’t close the window.’
b. * ‘We didn’t close either the door or the window.’

(47)

Nem csukt-uk be az ajto-t és az ablak-ot.
not closed-1pl in the door-acc and the window-acc
a. √ ‘We didn’t both close the door and the window.’
b. √ ‘We both didn’t close the door and didn’t close the window.’

As the glosses here indicate, where the disjunctive vagy occurs with a clause-mate negation, it must take wide scope over the negation. Hungarian simply does not allow what for English would be the most natural reading, (46b), that neither the window nor the door was closed. To express that proposition, Hungarian relies on the conjunction és, which allows either wide or narrow scope with respect to negation: the wide-scope reading in (47b) is logically equivalent to the unavailable narrow-scope interpretation for vagy in (46b). As Szabolsci notes, the scoping possibilities for English and and or are essentially the reverse of Hungarian és and vagy. English or allows broad or narrow scope with respect to negation (though the more informative narrow-scope reading in (48a) is normally preferred). (48)

We didn’t close the window or the door. a. √ ‘We didn’t close either the door or the window.’ b. √ ‘We either didn’t close the window or we didn’t close the door.’

English and, however, usually takes narrow scope with a clause-mate negation: thus the negation in (49) is normally understood as denying the truth or appropriateness of the conjunction itself, as in (49a), rather than applying to both conjuncts separately, as in (49b). (49)

We didn’t close the window and the door. a. √ ‘It’s not the case that we closed both the window and the door.’ b. *‘We both did not close the window and did not close the door.’
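The scope facts in (46) to (49) can be summarized, and the logical equivalences behind them verified, in a few lines. This is a sketch under the assumption that each reading can be modeled truth-functionally; the function names and the AVAILABLE table are hypothetical labels for the judgments reported in the examples, with "wide" meaning that the connective scopes over negation.

```python
from itertools import product


def wide_conj(p, q):
    """Conjunction over negation, as in (47b): not-P and not-Q."""
    return (not p) and (not q)


def narrow_disj(p, q):
    """Negation over disjunction, as in (46b) and (48a): not (P or Q)."""
    return not (p or q)


def wide_disj(p, q):
    """Disjunction over negation, as in (46a) and (48b): not-P or not-Q."""
    return (not p) or (not q)


def narrow_conj(p, q):
    """Negation over conjunction, as in (47a) and (49a): not (P and Q)."""
    return not (p and q)


def equivalent(f, g) -> bool:
    """True if f and g agree on every valuation of (p, q)."""
    return all(f(p, q) == g(p, q) for p, q in product([False, True], repeat=2))


# Readings reported as available with a clause-mate negation.
AVAILABLE = {
    "vagy": {"wide"},            # (46)
    "és": {"wide", "narrow"},    # (47)
    "or": {"wide", "narrow"},    # (48)
    "and": {"narrow"},           # (49)
}
```

Since wide_conj and narrow_disj are equivalent, Hungarian loses no expressive power by blocking the narrow-scope reading of vagy: the same truth conditions are available through wide-scope és.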

But the use of a sentence like (49) usually involves metalinguistic negation – normally one only bothers to deny an and-conjoined predicate if someone or something in the context somehow suggests that the conjoined predicate does in fact hold. As Szabolsci and Haddican (2004) note, such uses require a special intonation contour with contrastive stress on the conjunction, and this sort of contour is typical of metalinguistic negation. But if the negation here is metalinguistic, and if English and does not normally occur in the scope of descriptive negation, then and itself is a positive polarity item – the conjunctive counterpart to disjunctive PPIs like vagy. A disjunction like vagy may be usefully understood as a sort of uncertainty operator: vagy expresses a complex proposition of which only one constituent proposition must hold, and where the speaker cannot say with certainty which one. As a disjunction vagy encodes a low q-value, but it is a PPI because it is also inherently under-informative and its inherently weak i-value cannot be appropriately expressed if it occurs in the scope of negation. Connectives like and, on the other hand, are high q-value scalar operators by virtue of their conjunctive meanings, and they are PPIs by virtue of encoding an emphatic

i-value. These forms cannot be construed in the scope of negation because such a construal would prevent them from contributing to the expression of a relatively strong proposition. Vagy’s conjunctive counterpart, és, thus expresses a high q-value on a connective lattice without being specified for i-value. Thus, in affirmative contexts, és forms a complex proposition which itself entails each of its constituent propositions. In negative contexts it can be effectively emphatic where it takes wide scope over negation and so forms a complex proposition consisting of negative constituent propositions; or it can be effectively attenuating when negation takes scope over the conjunction. Presumably because vagy is limited under negation to an attenuating use, the default interpretation for és under negation is the emphatic one. These observations suggest that the grammaticalization of disjunctive connectives as PPIs may reflect a more general tendency for connectives to be associated with the expression of emphasis and attenuation and so to develop polarity sensitivities. The existence of such polarity items is thus, again, not only consistent with but broadly predicted by the Scalar Model.

6.4 Aspectual polarity items

Aspectual operators like English already, yet, still, and anymore make modest contributions to sentence meaning, but they have inspired a voluminous literature (Horn 1970; König 1977, 1991; Löbner 1987, 1989; Mittwoch 1988; Garrido 1992; Michaelis 1992, 1993; van der Auwera 1993; inter alia). While this work has greatly illuminated the lexical semantics of these forms, the reasons for their sensitivity to polarity have mostly remained obscure. Traditionally, the problem has been treated by an arbitrary rule of suppletion (Traugott & Waterhouse 1969); often it is simply ignored. Predictably perhaps, I argue here that aspectual adverbs are scalar operators, and that like other polarity items, their sensitivities reflect the interaction of their conventional rhetorical force with their basic scalar semantics. Following Horn (1970), and many others since, I view aspectual operators like still and already as marking a relation between two phases of an eventuality, one presupposed and the other asserted. Building on Michaelis’s (1993) analysis of still as a scalar operator, I propose that these phases are in general construed as propositions within a scalar model. Aspect in general is an inherently scalar domain because it depends on the way an eventuality (that is, the state or event profiled in a proposition) is construed with respect to time, and time itself is a scalar phenomenon. There are

basically just two ways in which an eventuality can be construed with respect to a time, and thus two sorts of scalar aspectual operators (Israel 1997): inceptive operators focus on the beginning of a profiled eventuality and order alternatives in terms of their relative earliness; durative operators focus on the end of an eventuality and order alternatives in terms of their lateness. Given these scalar structures, the Scalar Model predicts rather trivially that aspectual operators should be cross-linguistically prone to polarity sensitivity, and more importantly that sensitive aspectual operators should be rhetorically loaded in specific ways and should enter into specific sorts of paradigmatic relationships with other operators. I propose that still and anymore are durative operators which situate an eventuality relatively high on a scale of lateness, while already and yet are inceptive operators which place an eventuality relatively high on a scale of earliness. All four forms thus encode high quantitative values on their respective scales, though the scales themselves involve two converse orderings on the dimension of time. I thus take it that the major differences between already and yet and between still and anymore reflect their different informative values: while still and already are emphatic operators, yet and anymore are inherently attenuating. The proposed analysis explains the sensitivities of these forms in the usual way (i.e. as in §4.5 above): the emphatic PPIs still and already occur only in scale-preserving contexts because only in these contexts are their high-scalar profiled propositions appropriately emphatic; the attenuating NPIs yet and anymore occur only in reversing contexts because only there are they appropriately uninformative (see Israel 1997, 1998a, for further details). The scalar analysis also helps explain other aspects of these forms’ behaviors. Any analysis of already, for example, must account for two basic facts. 
First, already profiles a situation (the ‘already’ state) as holding earlier than might have been expected: thus it only makes sense to say that something is already the case if it wasn’t expected to be the case until later. This point, of course, is directly accounted for in the analysis of already as encoding a high q-value on an inceptive scale, since as such it presents a profiled proposition as holding earlier than its salient alternatives. Second, the profiled ‘already’ state is normally understood as contrasting with an earlier negative phase: normally, that is, it only makes sense to say that something is already the case if it might have been thought not to be the case before. Thus, Garrido (1992: 367) argues that already and still (or rather, their Spanish equivalents, ya and todavía) each make an assertion against the background of a contrary expectation; and similarly, van der Auwera speaks of a

counterfactual course of events against which already is evaluated (1993: 621). I follow Michaelis (1992) in viewing the prior negative phase of already as a generalized conversational implicature closely associated with the word’s use, and particularly with its status as an emphatic operator: basically, if the profiled eventuality is known to hold prior to the reference time, then the assertion of the ‘already’ state at the reference time will be inappropriately uninformative. I take it that yet is denotationally equivalent to already: both encode a high q-value on an inceptive scale, but they differ in their i-values – already is emphatic, and yet attenuating. As such, yet contributes to a proposition which is compatible with, rather than contrary to, some default expectation. While already is an emphatic PPI, yet is an attenuating NPI analogous to forms like much and all that: it situates an asserted proposition high on an earliness scale, and it presupposes the availability of some more informative proposition lower down the scale (i.e. later) which might just as well have been asserted. As noted by Horn (1970: 321) and van der Auwera (1993: 632), the difference between already and yet is particularly noticeable in questions: (50a), with yet, poses a neutral question, but (50b), with already, is strongly biased toward a positive response. (50)

a. Has Larry read your paper yet? b. Has Larry read your paper already?

The difference follows from the two forms’ distinct scalar semantics. Both sentences depend on an expectation that Larry will read the paper at some point, and both ask whether this point has been reached. With already the question posed is effectively whether the event has been realized earlier than was expected. In other words, the focus of the question is already itself, and more particularly its high i-value. With yet the question posed is effectively whether the event has been realized, not in excess of, but in accordance with, what was expected: in other words, is the situation at reference time such that it would be entailed by what is expected later on? Yet presupposes an expectation that the event will take place, but not that it will happen in any way ahead of schedule: it thus produces a neutral question. The same contrast between a neutral yet and a biased already can also occasionally be found in conditional clauses, which, like questions, tend to allow both NPIs and PPIs. The example in (51a) occurred in a poker game and was addressed to a new player who wanted to watch a few hands before starting to play herself. (51)

a. “If you’re playing yet, it’s your deal, otherwise it goes to Josh.” b. If you’re playing already, it’s your deal.

The use of yet here signals an expectation that the addressee will in fact be joining the game, but it remains neutral as to just when this might take place. Contrast this with the less welcoming (51b), where the use of already would suggest an expectation that the new player will wait out at least a few more hands. Horn (1970) suggests that the contrast in questions between a biased already and a neutral yet does not show up with still and anymore, so that the questions in (52) are close to mutual paraphrases. (52)

a. Does Gladys still smoke pot? b. Does Gladys smoke pot anymore?

But while the contrast here is perhaps closer than it is with already and yet in (50), the examples here are perhaps not entirely interchangeable. In particular, (52a) seems more appropriate than (52b) in a context where the speaker would be shocked if Gladys persisted in this passé pastime, and the expression of shock is accentuated where still itself is stressed. The effect is milder with anymore: (52b) presupposes that Gladys did smoke pot in the past and may no longer, but it doesn’t presume that she shouldn’t. As an emphatic, still presents an expressed proposition as somehow surpassing normal expectations (i.e. as exceeding the norm on a durative scale), while the attenuating anymore presents a proposition as relatively unsurprising with respect to expectations. In general, though the distinction is subtle, anymore suggests that a situation will not last indefinitely and may in fact be over, while still suggests that a situation has already lasted longer than expected and probably should be over. Although I cannot explain why the distinction between still and anymore should be more subtle than that between yet and already, I conclude that it is in kind, if not in degree, the same: emphatic PPIs yield biased questions; attenuating NPIs allow for neutral questions. Already, yet, still, and anymore thus form mirror-image pairs of polarity items operating on aspectual scales. Consideration of the English temporal connective until provides an interesting comparison to this aspectual foursome. Until can be very picky about where it appears, and this pickiness has inspired much debate. The controversy centers on examples like those in (53). (53)

a. That morning Lola slept until the bomb went off. b. *That morning Lola woke up until the bomb went off. c. That morning Lola didn’t wake up until the bomb went off.

In (53a) until evidently marks the endpoint of a durative process – in this case the activity of sleeping. In (53b), where the main clause denotes a punctual event, until appears nonsensically ungrammatical. So it seems that until is

sensitive to sentence aspect and can work only with durative predicates; but in (53c) until happily combines with a punctual predicate under negation. The question is whether the until in (a) is the same until as the until in (c). One line of reasoning, going back to Klima (1964) and pursued by Smith (1970), Heinämäki (1974), and Mittwoch (1977), says “yes,” arguing that there is only one until and that the only constraint on until is that it must occur with a durative predicate. According to this view, sentences like (53c) are acceptable because negation itself is a durative predicate (de Swart 1996). Against this view another line of researchers (Lakoff 1969a; Lindholm 1969; Horn 1970, 1971, 1972; Karttunen 1974; Declerck 1995) has argued that the uses in (53a) and (53c) reflect either distinct values of until or at least distinct constructional types – that while until in (53a) indicates the length of a durative process, until in (53c) is punctual and indicates the point at which an event begins. Aside from Declerck, all these researchers view punctual until as a strong NPI. There is more at stake here than the analysis of one lexical item: the real question is what it means to be a polarity item. Punctual until is an unusual sort of NPI, one with particularly strict licensing requirements, and so one may wonder whether this construction really is subject to the same constraints that govern other NPIs. I maintain that it is and will explain its distribution in both durative and negative polarity contexts by treating it as a polysemous scalar operator with two tightly related senses. 
Both senses profile the relation between an expressed eventuality and a temporal boundary, but they differ in the way the boundary is construed: durative until presents the boundary as a high point on a durative scale, thus marking the latest point at which a process holds; punctual until presents the boundary as a low point on an inceptive scale, thus marking the earliest point of the interval within which a punctual event occurs. The analysis of until as a scalar operator suggests that punctual until is in fact a fairly ordinary sort of NPI, encoding a minimal q-value on an earliness scale and being constrained to appear only where it can express a high i-value emphatic proposition: it is, in effect, a kind of aspectual minimizer. Here are the basic facts. English until is a preposition which combines with nominal and clausal complements which denote a boundary of a temporal process: it thus takes an expressed eventuality as its trajector (TR) which it situates with respect to a temporal landmark (LM) on a scale of conceived time. Syntactically, until is flexible about the expression of both its trajector and its landmark, though the details of this flexibility need not concern us. What is crucial is that the trajector must profile an eventuality with an appropriate

156 The Grammar of Polarity polarity and aspectual type, while the landmark must profile a precise temporal interval, either directly by specification of a reference time, or metonymically by reference to a temporally anchored event. In affirmative contexts, until heads a constituent marking the endpoint of a durative process, and as such can only combine with a durative trajector. In Vendler’s (1967) terms, until works with state and activity predicates as in (54), but is ungrammatical with punctual predicates denoting accomplishments and achievements as in (55). (54)

a. Jack and Jill danced until two. b. Jill was happy until the storm hit.

(55)

a. *Simone fainted until midnight. b. *Simone finished her novel until she returned from Hawaii.

The facts are more complicated in negative contexts, where durative predicates become ambiguous and punctual predicates become grammatical. Thus (56) allows both a durative reading, where the dancing didn’t last as late as two, and a punctual reading, where it only began at two. The examples in (57), however, only allow the punctual reading. (56)

Jack and Jill didn’t dance until two.

(57)

a. Simone didn’t faint until midnight. b. Simone didn’t finish her novel until she returned from Hawaii.

The punctual use of until is licensed with a variety of triggers besides clausemate negation (pace Progovac 1994: 81–2), occurring in the scope of quasi-negative quantifiers like few and rarely, negative verbs like doubt, and the adversative degree construction too Adj to X (Horn 1970; Horn & Lee 1995; Declerck 1995). (58)

a. Few of the guests got here until two. b. Lola rarely gets here until two. c. I doubt Lola will get here until two. d. Lola is probably too busy to get here until two.

While such data suggest that punctual until is an ordinary sort of NPI, the evidence must be weighed against the fact that many standard NPI licensors – for example, questions, conditionals, factive adversatives, and comparatives – do not license punctual until. (59)

a. *Will Lola get here until two? b. *If Lola gets here until two, it’ll be a good party. c. *I’m surprised Lola got here until two. d. *I win the lottery more often than Lola gets here until two.

These facts suggest that until’s sensitivity may have less to do with polarity than with the need for an appropriately durative eventuality, in which case there may be no need to posit two distinct lexical entries to explain the two uses of until. Still, the two untils do display a range of strikingly distinct semantic and pragmatic effects. In particular, Declerck (1995: 68) points to a “sense of actualization” inherent in the use of punctual until, which seems to mark the time at which a positive state actually begins rather than the time that a negative situation lasts. This is particularly clear in Karttunen’s famous spinster sentences below (1974: 290). (60)

a. Nancy remained a spinster until she died. b. #Nancy didn’t get married until she died.

While until in (60a) simply indicates the duration of Nancy’s spinsterhood, (60b) seems to suggest that Nancy actually did marry once she was dead. This, of course, conflicts with our real-world understanding of how marriage works, and so makes the sentence weird. The sense of actualization is also evident in other contexts where the two untils “go their separate ways” (Karttunen 1974: 287), as in the examples below, where durative until allows and punctual until resists modification by only and continuation with and beyond and or longer. (61)

a. The princess slept only until two. b. *The princess didn’t wake up only until two.

(62)

a. The princess slept until two and beyond. b. *The princess didn’t wake up until two and beyond.

(63)

a. The princess slept until two or longer. b. *The princess didn’t wake up until two or longer.

Karttunen suggests (1974: 288) that until in the (a) sentences marks the endpoint of an interval which could have been, or may in fact have been, longer. As such it answers the question “how long did X last?” In the (b) sentences, on the other hand, until is roughly equivalent to before (Lindholm 1969), marking the onset of an event rather than the endpoint of an interval. As such it answers the question “how early did X begin?” For Karttunen, the essence of punctual until is lateness: in other words, the answer it gives is “not early at all.” As he puts it, “the time period picked out by… punctual until is at the very end of the time stretch” in which one would expect the negated main clause event to occur (1974: 292). This point is brought home by a contrast first noted by Horn (1972: 61). While (64a) with durative until can be continued with at the latest, it is awkward, at best, with at the earliest; (64b), with punctual until, shows exactly the opposite pattern of preference. (64)

a. The princess slept until two at the {latest/*earliest}. b. The princess didn’t wake up until two at the {*latest/earliest}.

[Figure 6.2 Durative until: TR and LM on a time axis ordered by lateness]

These examples suggest that the difference between the two untils has to do with the ways in which they construe the flow of time: while the durative usage is concerned with how late a situation persists, the punctual use focuses on how early an event occurs. I suggest that until is a polysemous scalar operator, and that its two senses are related by scale inversion. In both its uses, until situates a profiled eventuality on a scale of conceived time. The difference between them is basically a matter of perspective: durative until profiles the extent of an eventuality along a durative scale, with times ordered in terms of their lateness; punctual until situates the onset of an eventuality along an inceptive scale, with times ordered in terms of their earliness. Durative until takes the temporal extent of a profiled process as its trajector (TR) and relates this to a reference time, the landmark (LM), on the scale. The landmark marks a late point in the extent of the trajector, and by implicature is normally understood as the endpoint of the profiled process. Thus, in (65) the subject–verb complex profiles an activity, which is represented schematically in Figure 6.2 by a boxed series of circles; the temporal extent of this activity, represented by the box itself, elaborates the trajector of until; and the landmark, supplied by the time two in the morning, is construed as relatively late on a durative scale. (65)

The princess danced until two in the morning.

The sentence thus entails that the dancing continued at least as late as two and it typically implicates that that’s when it ended too. Until two thus marks a lower bound for the endpoint of the profiled process.
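The lower-bound reading can be made concrete with a small sketch. It is an illustration only, assuming a toy model in which a durative process is represented by nothing more than the hour at which it ends (with 26 standing in for two in the morning); the function names are hypothetical and are not part of the analysis itself.

```python
# Toy model (an assumption for illustration): times are hours counted from
# the previous midnight, and a process is represented only by its end time.

TWO_AM = 26.0  # two in the morning


def durative_until_entailed(process_end: float, landmark: float) -> bool:
    """Entailment of durative 'until LM': the process lasts at least as late as LM."""
    return process_end >= landmark


def durative_until_implicated(process_end: float, landmark: float) -> bool:
    """Usual, cancellable implicature: the process also ends at LM."""
    return process_end == landmark


# Dancing that stops at 2 a.m. satisfies both the entailment and the implicature.
print(durative_until_entailed(26.0, TWO_AM), durative_until_implicated(26.0, TWO_AM))
# Dancing that runs on to 3 a.m. ("until two and beyond") keeps only the entailment.
print(durative_until_entailed(27.0, TWO_AM), durative_until_implicated(27.0, TWO_AM))
# Dancing that stops at 1 a.m. fails the entailment.
print(durative_until_entailed(25.0, TWO_AM))
```

Keeping the entailment and the implicature in separate functions mirrors the point that only the first is part of what the sentence asserts, which is why a continuation like and beyond is coherent with durative until.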

[Figure 6.3 Punctual until: TR and LM on a time axis ordered by earliness]

Punctual until situates a punctual event on a scale of earliness, indirectly, by reference to its antecedent incompletion. As with the durative sense, punctual until profiles a relationship between the temporal extent of some process and a point in conceived time, only here points in time are ordered by their relative earliness rather than their lateness. So instead of a high point on a lateness scale, the landmark for punctual until is a low point on an earliness scale. Again, the trajector is elaborated by the temporal extent of a (usually) clausal process, though in this case a negative one consisting in the absence of an event. For example, in (66) the trajector – represented in Figure 6.3 as a boxed series of circles in outline – is the temporal extent of a situation in which the princess does not faint. (66)

The princess didn’t faint until two in the morning.

The representation here is consistent with the claims of Klima (1964), Mittwoch (1977), and de Swart (1996), among others, that negation is inherently durative and that a negative state of affairs (i.e. the absence of an event) itself counts as an eventuality with a temporal duration; the analysis does not, however, depend on such categorical claims. We do not need to decide whether our ontology should recognize negative states of affairs or whether non-events have the property of existing in time. All we need is a profiled process saliently evoking a precise temporal extent which can elaborate the trajector of until. Since here reference to the princess not fainting necessarily evokes a time in which she might have, but did not, faint, this requirement is met. Thus far, then, punctual until hardly differs from the durative sense. With punctual until, however, what is highlighted is not how long a negative situation lasts but rather how early its positive counterpart begins. This is because the landmark of punctual until is a low point on an inceptive scale: it marks the least early (i.e. latest) point at which the TR holds, and, by implication, the earliest point at which the converse of the TR is realized. At the composite level of organization, (66) thus profiles not the negative state of the princess not fainting, but rather the actual fainting which ends this negative situation. This shift in profiling away from the non-event of the main clause process and toward the implied event itself accounts for the sense of actualization noted above.

The analysis here suggests that punctual until is a rather ordinary emphatic NPI: its sensitivity reflects the fact that it both encodes a minimal q-value on an inceptive scale and requires this minimal value to be construed as part of an emphatic, high i-value proposition. The predictable effect is that punctual until can occur only in contexts where later, low q-values on the scale entail earlier, high q-values. Since, like already and yet, punctual until is an inceptive operator, its meaning is roughly paraphrased by ‘as early as.’ The examples below show that licensors for punctual until – sentence negation (67a), the quantifier few (67b), and the adverb rarely (67c) – support the appropriate inferences from propositions low on an inceptive scale to earlier propositions higher on the scale, and that these inferences are not available in a simple affirmative context (68).

(67)

a. Jen didn’t leave as early as March. → Jen didn’t leave as early as February. b. Few of the bombs exploded as early as midnight. → Few of the bombs exploded as early as sundown. c. Lola rarely arrives as early as two. → Lola rarely arrives as early as one.

(68)

The bomb exploded as early as two. ↛ The bomb exploded as early as one.
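The inference patterns in (67) and (68) can be checked mechanically over a small model. The sketch below is only an illustration under stated assumptions: event times are whole hours on a six-point timeline, ‘as early as m’ is read as ‘no later than m,’ and few is approximated as ‘at most one’; none of these choices, nor the function names, comes from the analysis itself.

```python
from itertools import product

HOURS = range(0, 6)  # toy timeline (an assumption): six possible event times


def as_early_as(m):
    """'The event occurred as early as m': the event time t is no later than m."""
    return lambda t: t <= m


def negated(prop):
    """Sentence negation of a proposition about a single event time."""
    return lambda t: not prop(t)


def entails(p, q, models):
    """p entails q iff q is true in every model in which p is true."""
    return all(q(m) for m in models if p(m))


def few(m, limit=1):
    """'Few events occurred as early as m': at most `limit` event times are <= m."""
    return lambda times: sum(t <= m for t in times) <= limit


# (68) Affirmative: 'as early as two' does not entail 'as early as one'.
print(entails(as_early_as(2), as_early_as(1), HOURS))
# (67a) Negation: 'not as early as two' entails 'not as early as one'.
print(entails(negated(as_early_as(2)), negated(as_early_as(1)), HOURS))
# (67b) 'few': checked over every assignment of times to three events.
THREE_EVENTS = list(product(HOURS, repeat=3))
print(entails(few(2), few(1), THREE_EVENTS))
```

On this toy model the three checks print False, True, and True: the inference from a later point to an earlier one fails in the affirmative context and goes through under negation and under few.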

These facts suggest that punctual until, as an emphatic operator on an inceptive scale, is not all that different from ordinary minimizers like sleep a wink or lift a finger. But punctual until is an especially sensitive sort of NPI and, unlike most minimizers, is not licensed by non-negative triggers like questions, conditionals, or comparatives. The reason for this, however, is precisely that punctual until is a minimizer on an aspectual scale. It marks the least early point at which a situation holds, which is, in effect, the endpoint of a situation. As such its use presupposes a transition from the negative phase profiled by its trajector to the positive event that ends it: without this transition, its landmark cannot count as the least early time at which the profiled situation holds. Thus, a question like (59a) *Will Lola get here until two? is effectively incoherent since it both asks whether the transition will take place by two, and presupposes that it will. Similarly, in (59b), the hypothetical status of Lola’s arrival “until two” leaves open the possibility that she might actually arrive after two. But if she were to arrive after two, then “two” would not be the lowest point on the earliness scale at which she might arrive.

It is worth noting in this regard that the facts associated with the English until are neither unique nor inevitable cross-linguistically. Thus, languages like Latin and German have constructions (Latin ad, German bis) which are similar to durative until but which cannot translate the punctual use (Karttunen 1974: 293). The Spanish preposition hasta, however, exhibits a durative–punctual polysemy similar to that of English until (Bosque 1980), as suggested by the examples below from Schwenter (1999b).

(69)

a. La fiesta duró hasta las tres de la mañana. ‘The party lasted until / as late as three in the morning.’ b. No se cayeron hasta las doce. ‘They didn’t fall down until twelve o’clock.’

But as Schwenter notes, hasta also has another use, illustrated in (70), where it functions as a scalar additive focus particle more like English even than until.

(70)

a. Hasta Sara vino hoy a clase. ‘Even Sara came to class today.’ b. En la cena comí hasta caviar. ‘At the dinner I ate even caviar.’

Schwenter argues that one important difference between this use of hasta and similar focus particles like Spanish incluso and English even is that while the latter forms require only that their focus count as more informative than a salient scalar alternative (Kay 1990), hasta requires its focus to be an actual scalar endpoint and thus to be more informative than all its relevant alternatives. For my purposes, the interesting point is that hasta has grammaticalized here from a durative operator into a scalar focus particle, which strongly suggests that its basic durative sense itself involves an inherently scalar construal. I take the polysemy of hasta as strong corroboration for the claim that until on both its uses is also a scalar operator. Of course, it also shows that even the most similar forms, like hasta and until, may differ in surprising and significant ways, since there is no obvious reason why hasta but not until should enjoy an extra use as a focus particle. In general, it seems, even where lexical semantic patterns are highly motivated, they are never deterministic.

6.5 The limits of diversity

Although it is often suggested that the diversity of polarity items somehow undermines the scalar hypothesis, it is in fact the source of its strongest empirical support. Given the range of syntactic types and semantic functions which polarity items can assume, it is striking that they are so united in their scalar semantics. Such uniformity in the face of so much diversity suggests that we are dealing here with an important generalization: that polarity sensitivity in general is rooted in the scalar semantics of polarity items themselves. The claim that all polarity items share an inherently scalar semantics should not be mistaken for a claim that all polarity items are essentially the same. On this point, I fully agree with Rullmann (1996: 349) that

    in the area of negative polarity there is quite a bit of, often subtle, variation. No single approach is likely to be correct for all NPIs. Rather than drawing conclusions about the entire class of NPIs on the basis of a few cases, we should carefully investigate the detailed properties of each individual item.

Nonetheless, careful attention to the subtle diversity of NPIs and PPIs should not preclude the recognition of some broad generalizations which unite them. The most obvious generalization here is that the semantic domains in which polarity items regularly emerge are all structurally analogous in at least one crucial respect: they are all inherently scalar. And for all their real and apparent differences, in at least two respects polarity items of all sorts seem to operate on their scalar domains in precisely the same ways: (i) they encode a quantitative value by profiling the relative position of an element in a scalar ordering; and (ii) they encode an informative value by highlighting the relative rhetorical strength of an expressed proposition. While the theory does not explain all the many forms polarity items can take, and while it cannot predict all the idiosyncrasies of their various sensitivities, the Scalar Model is both broad and flexible enough to allow for, and even welcome, such variation. If sensitivity really is a reflection of lexical semantics, it is to be expected that polarity items will vary as much and as subtly in their distributions as they do in their meanings, and so every item will have its own story. Ultimately, the only way to explain every polarity item is to do so one at a time, slowly and carefully.

7 The family of English indefinite polarity items

We don’t know a millionth of one percent about anything.
Thomas Edison

7.1 The many splendors of any

In linguistics and philosophy it is often the little words that cause the biggest problems, and by this measure any is a very little word. Controversy over its interpretation has raged since at least the mid nineteenth century, and recently has gone all the way to the Supreme Court, where the split 5–3 decision in Small v. US (No. 03–750, 2005) turned on the precise interpretation in US statutory law of the phrase convicted in any court. Given the narrowness of this decision, there appears to be no end in sight to the dissent this one little word will occasion. Still, there may be hope for some resolution to one old persistent problem. Any is, as Vendler put it, “a many-splendored thing” (1967: 79). The question is just how many splendors are there and how, precisely, are they related?

At the most general level, the splendors of any appear to divide into two major spectra. Polarity sensitive (PS) any, as in (1), occurs almost exclusively with mass or plural nouns in the scope of a polarity licensor and lends itself to analysis as an existential quantifier. Free-choice (FC) any, as in (2), combines with singular count nouns and has a range of uses in modal, generic, and habitual contexts where it expresses a kind of reckless generalization – referring in a way that has been variously called arbitrary, random, free, or quodlibetic (e.g. in Tovena & Jayez 1999; Langacker 1991; Vendler 1967; and Hamilton 1858, respectively). (1)

a. There weren’t any mushrooms at the market. b. Do you have any mushrooms? c. I’d be surprised if Laura had any mushrooms. d. *Craig probably has any mushrooms. Let’s call him.

(2)

a. Mildred will eat just about any mushroom. b. Any mushroom has spores. c. Pick a mushroom, any mushroom – any one you like.

While the polarity sensitive and free-choice uses of any seem to constitute distinct sense spectra, they are also intriguingly similar, and the literature on any has wound itself into contortions trying to sort them out (see Horn 2000a, b, 2005, for panoramic reviews). The basic issue is whether the uses should be analyzed as two distinct, though perhaps related, forms (Klima 1964; Horn 1972: §2; Ladusaw 1979; Carlson 1980, 1981; Linebarger 1981; Dayal 1998, 2004), or whether they are twin reflexes of a unitary abstract meaning, either as a wide-scope universal quantifier (Quine 1960; Horn 1972: §3; LeGrand 1974; Gil 1994, 2005), or as a special indefinite or existential determiner (Fauconnier 1975a, b; Davison 1980; Kadmon & Landman 1993; Lee & Horn 1994; Israel 1995a; Lee 1996; Haspelmath 1997; Giannakidou 1998, 2001; Lahiri 1998; Tovena 1998; Tovena & Jayez 1999; Horn 2000a, b, 2005). This chapter seeks to unravel the uses of any by examining their place in a larger family of English indefinite constructions, including its emphatic NPI cousins ever and at all and its attenuating PPI sister some. These constructions are distinguished from other polarity items by a variety of semantic, pragmatic, and distributional features. As scalar operators, what makes them special is that they operate on a subjectively construed scale of indefinite reference itself, where elements are ordered not by their objective properties, but rather by the subjective effort needed to call them to mind. With this basic insight, I hope to vindicate the intuition that any is a scalar indefinite, to explain its differences with other NPIs, and to reveal some unexpected conceptual connections among the two uses of any and the several uses of some.

7.2 Indefinite family resemblances

While the uses of any have long been a magnet for scholarly attention, there has been less interest in the peculiarities of other indefinite polarity items like some, ever, and at all. This is a pity, for each of these forms presents its own little array of splendors, the understanding of which may yield insights into the mysteries of any. The family of indefinite polarity items itself fits into a larger paradigm of English quantificational constructions. Table 1 thus distinguishes on syntactic and semantic grounds three classes of quantifiers each with eight distinct flavors of quantification. Amount quantifiers – including both determiners like all, most, and some, and cardinality predicates like three, many, and few – combine with nouns to denote a quantity or proportion of things. Frequency quantifiers like always, often, and never combine with predicative constructions and profile the frequency with which a situation is instantiated in some mental space: they are syntactically flexible, appearing in construction with adjective phrases (always hungry), preposition phrases (always on the prowl), and verb phrases (always getting into trouble), among others. Degree quantifiers like altogether, mostly, and hardly indicate the degree or extent to which a gradable relation is instantiated: these are adverbials and combine mostly with adjectives (somewhat grumpy) and adverbs (altogether unseemly), though a few can also modify a VP (she hardly spoke at all).

Table 1. English quantifiers and indefinite constructions

               Amount            Frequency            Degree
               (Things)          (Events)             (Properties)
Universal      all               always               altogether
Majoritary     most              usually              mostly
Multiple       many, lots        often, frequently    very
P-paucal       several, a few    occasionally         fairly
P-indefinite   some              sometimes            somewhat
N-paucal       few, scant        rarely, seldom       hardly, barely
N-indefinite   any               ever                 at all
Negative       no                never                nowise, not at all

Most of these expressions are indefinites in the very broad sense that they do not presuppose that their referents are familiar or accessible to consciousness. The eight distinct series here are not intended as an exhaustive inventory of the varieties of indefinite reference possible within English – there are clearly more, as well as more fine-grained distinctions to be made within these series; however, they do illustrate the close parallels in form and meaning between the nominal determiners and their frequency and degree adverbial counterparts. Among these forms polarity sensitivity is more the norm than the exception: while the (positive) P-paucals (e.g. a few, several, etc.) are weak attenuating PPIs, the (negative) N-paucals (scant, few, rarely, barely) are in fact strong PPIs, not occurring in the same clause or in the semantic scope of negation. The only quasi-exception to this is hardly, which occurs with pleonastic negation in colloquial American English (e.g. I couldn’t hardly tell the difference). The close relationships between the N-indefinites and the P-indefinites appear first and most dramatically in the paradigmatic alternations illustrated below of some with any, sometimes with ever, and somewhat with at all. (3)

a. Jeff has {some/*any} interesting etchings. b. Mutt doesn’t have {*some/any} interesting etchings.

(4)

a. Oscar {sometimes/*ever} dances in the moonlight. b. Lucinda doesn’t {*sometimes/ever} dance in the moonlight.

(5)

a. Aunt Jane was {somewhat / *at all} upset by the news. b. Uncle Bob wasn’t {*somewhat / at all} upset by the news.

Classic analyses (Lees 1960; Klima 1964) derived these alternations syntactically, but others (Bolinger 1960; Lakoff 1969) soon noted that the distributions of these forms are not fully complementary and that the choices between them are often driven by semantic or pragmatic considerations. While most now see the differences here as lexical rather than purely syntactic, the paradigmatic alternations among these forms suggest that they are closely related semantically. Indeed, under ellipsis some and any seem entirely indistinguishable: thus in (6) either can be the antecedent of the other in an elided clause with reversed polarity. (6)

a. Sally didn’t drink anything, but Glynda did (drink something). b. Oscar ate something, but Oswald didn’t (eat anything).

Moreover, while the meanings of these forms are clearly closely connected, they can also be subtle to the point of invisibility: the presence or absence of words like any, ever, and some often has little effect on a sentence’s truth conditions. Thus the sentence pairs in (7–11) all appear to be (at least roughly) mutually entailing, and the pair in (12) might be as well, if somewhat is held to put only a lower bound and no upper bound on the degree to which a predicate holds. (7)

a. Tess doesn’t eat fish. b. Tess doesn’t eat any fish.

(8)

a. None of my friends use heroin. b. None of my friends ever use heroin.

(9)

a. I’m not surprised. b. I’m not at all surprised.

(10)

a. He had eggs for breakfast. b. He had some eggs for breakfast.

(11)

a. Harvey robs liquor stores. b. Harvey sometimes robs liquor stores.

(12)

a. Larry’s examples are often amusing. b. Larry’s examples are often somewhat amusing.

The meanings of these forms are not always evident in their truth-conditional effects, but speakers of English do have clear intuitions about the differences among them. Crucially, the N-indefinites in (7–9) all seem to strengthen the force of an utterance, while the P-indefinites in (10–12) seem comparably weakening. While a sentence like (7a) might be acceptable even if there are exceptional occasions on which Tess does eat fish, the use of any in (7b) precludes any such exceptions. The use of a P-indefinite has the opposite effect, limiting the strength of an expressed proposition. Thus some in (10b) suggests a modest, or at least not an excessive, quantity of eggs, while sometimes in (11b) implies that Harvey’s liquor store robberies are no more than an occasional habit. The N-indefinites and the P-indefinites also show surprisingly similar patterns of polysemy. The details vary from construction to construction, but three sorts of senses commonly recur which I label here crudely as existential, universal, and emphatic. The most basic sense – the only one common to all P- and N-indefinites – is the existential, used in reference to a singular indefinite instance of the appropriate semantic sort (i.e. a thing, an eventuality, or an extent): e.g. She didn’t drink any gin; Why would I ever lie?; Are you at all concerned? Several indefinites also have quasi-universalizing uses. Free-choice any is famous for this, but ever can also be used to mean something more like ‘always’ than ‘once,’ where it profiles constancy across occasions (Israel 1998b), as in the examples below from the WSJ corpus. Such quasi-universal uses stretch back to the earliest stages of Old English (Leuschner 1996) but are now confined to a few semi-productive constructions. (13)

a. Waste management will be another big area as small, densely populated countries like Taiwan cope with ever-increasing amounts of garbage. b. COLAs … are positive feedback loops that drive inflation ever upward. c. Our skipper, ever the optimist, finally acknowledged that our three-week trip was becoming a four-week trip.

Similarly, while modern English at all is not synonymous with entirely or altogether, in its earliest uses it was, as the examples below attest (OED s.v. all, 9b). (14)

a. I thee coniure & commande att alle. (1350) b. My waverand wyt, my cunnying febill at all. (1513)

Although the universalizing use of at all has passed from the language, the form retains some of its old universality where it is used to reinforce a free-choice any (e.g. anything at all!). Beyond the existential and universal senses, at least two very different indefinites, some and ever, have special uses where they lack quantificational force and function as pure exclamatives: e.g. That was some party! or Did we ever dance! Such uses clearly transcend the forms’ basic indefinite semantics, but they also attest to the rhetorical potency that can be found in indefinite reference. Finally, both any and some have adverbial uses roughly synonymous with their frequency and degree counterparts, some being like ‘sometimes’ and ‘somewhat’ in (15), any like ever and at all in (16). (15)

a. We still talk some, but not as often as we used to. b. It hurts some, but only when I laugh.

(16)

a. Your constant whining doesn’t help any. b. I doubt it could get any {better/worse}. c. Was he any {good/use/fun/*bad/*blue}?

These uses are part of a larger pattern of adverbial uses for indefinite determiners which combine with mass nouns (e.g. much, a lot, and a great deal, among others). But the pattern is not without its idiosyncrasies: while adverbial any and much both occur “VP-finally and before all comparatives, some adjectives, and a handful of nouns” (Horn 2000a: 167), adverbial some and a lot have more restricted distributions, as seen in (17). (17)

a. Did you get to dance {any / much / some / a lot} at the party? b. Was it {any / much / *some / a lot} more fun than last time? c. Did it do you {any / much / ?some / *a lot} good to go?

There may be some principled explanation for these restrictions, or it may be that each of these constructions has its own story as well. As in any close-knit family, every individual indefinite construction is likely to have a few quirks of its own. Still, it is remarkable how very similar are the quirky senses which circulate in this family. Which senses a form encodes may be arbitrary, but the senses available for encoding seem to be built from just a few common elements.

7.3 Emphatic construals of indefinite any

The idea that any is a scalar operator has a distinguished, if somewhat controversial, pedigree. The idea was first articulated by Fauconnier (1975b), who took the structural and semantic parallels between any and quantificational superlatives (forms like the simplest problem in Norm can’t solve the simplest problem) as evidence that any denotes an endpoint on a pragmatic scale. Fauconnier’s intuition has since been pursued in Kadmon and Landman (1993), Lee and Horn (1994), and a host of others (e.g. Krifka 1992, 1994, 1995; Israel 1995, 1998a, b, 1999; Lee 1996; Haspelmath 1997; Lahiri 1998; Tovena 1998; Tovena & Jayez 1999; Horn 2000a, b, 2002, 2005; Langacker 2002; van Rooy 2003; Zepter 2003). Fauconnier’s innovation was to treat the quantificational properties of both polarity sensitive and FC any as pragmatic manifestations of the word’s basic scalar semantics, thus achieving a unified account for what seem to be two very different meanings. Vendler (1967) had claimed that the various uses of any are united as signals of a “blank warranty” to choose any random instance of a given nominal type (see also Langacker 1991, 2002; Horn 2000a, b). For Fauconnier, the warranty resides in the fact that any denotes a scalar endpoint and triggers inferences over all its scalar alternatives. Fauconnier’s unified scalar account languished nearly twenty years before being revived, reinterpreted, and more extensively developed in the work of Kadmon and Landman (1993) and Lee and Horn (1994) – henceforth K&L and L&H. These accounts begin with the observation that the English indefinite determiner a/an, like any, systematically allows two types of quantificational interpretation. (18)

a. Zev saw a lion at the zoo. b. A lion eats a lot of meat.

In (18a) the indefinite article has the quantificational force of an existential operator, and suggests that Zev saw only one lion. In (18b), on the other hand, the determiner has the force of a generic operator and suggests that it is a general (though not necessarily exceptionless) fact about lions that they eat a lot of meat. Given that the indefinite article is itself systematically ambiguous, if any is also an indefinite, then perhaps the split quantificational behavior of any in (1–2) simply reflects its status as an indefinite. Both K&L and L&H thus propose that the construction of any with a common noun (CN) – for example any lion, any lions – is semantically equivalent to the corresponding indefinite NP (a lion, lions), “with some additional characteristics contributed by any” (K&L 1993: 357). Both accounts view PS any as an existential indefinite, and FC any as a generic: the hard part, of course, is specifying the special “additional characteristics.” On K&L’s account, the special properties of any are, first, that it widens the interpretation of the NP in which it occurs, and second, that the widened interpretation must produce a stronger proposition than its unwidened equivalent. K&L define widening and strengthening as follows. A: w i de n i n g : In an NP of the form any CN, any widens the interpretation of the common noun phrase (CN) along a contextual dimension. (1993: 361)

B: STRENGTHENING: Any is licensed only if the widening that it induces creates a stronger statement, i.e., only if the statement on the wide interpretation entails the statement on the narrow interpretation. (1993: 369)

Widening explains the fact that, as K&L put it, “any indicates a reduced tolerance of exceptions” (1993: 356), while strengthening constrains the acceptability of a widened interpretation. Since the wide interpretation must entail the narrow, sentences like (19c) are systematically ruled out. (19)

a. Mildred didn’t drink any whiskey.
b. Mildred will drink absolutely any whiskey.
c. *There is any whiskey in the kitchen.
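K&L's definitions A and B, applied to the three contexts in (19), can be checked mechanically in a toy model. The sketch below is only an illustration under stated assumptions: the two denotations of whiskey (NARROW and WIDE), the treatment of each "world" as the set of items of which the predicate holds, and the rendering of FC any as a plain universal are all simplifications introduced here, not part of K&L's own formalism.

```python
from itertools import combinations

# Hypothetical denotations: a narrow reading of "whiskey" and a
# contextually widened reading that is a superset of it (widening, A).
NARROW = frozenset({"scotch", "bourbon"})
WIDE = NARROW | {"moonshine"}


def worlds(domain):
    """Yield every subset of the domain: the items the predicate holds of."""
    items = sorted(domain)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def entails(p, q, domain):
    """p entails q iff q is true in every world in which p is true."""
    return all(q(w) for w in worlds(domain) if p(w))


def negated_existential(cn):
    """Context of (19a): no member of CN was drunk."""
    return lambda w: not (w & cn)


def free_choice(cn):
    """Context of (19b), simplified to a universal: every member of CN is drunk."""
    return lambda w: cn <= w


def plain_existential(cn):
    """Context of (19c): some member of CN is in the kitchen."""
    return lambda w: bool(w & cn)


def strengthening_holds(context):
    """Strengthening (B): the wide statement must entail the narrow one."""
    return entails(context(WIDE), context(NARROW), WIDE)


if __name__ == "__main__":
    for name, ctx in [("(19a)", negated_existential),
                      ("(19b)", free_choice),
                      ("(19c)", plain_existential)]:
        print(name, "strengthening holds:", strengthening_holds(ctx))
```

In this model the wide statement entails the narrow one for (19a) and (19b) but not for (19c): a world containing only the hypothetical moonshine makes the wide existential true and the narrow one false.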

(19a) is good because the assertion on the wide interpretation of whiskey – that Mildred didn’t drink anything at all that could count as a whiskey – entails every narrower interpretation whereby she didn’t drink some particular sort of whiskey. And (19b) is good because the wide interpretation – that she is willing to drink anything that counts as whiskey – entails any narrower interpretation on which she would drink some particular sorts of whiskey. But (19c) is bad because the widened interpretation does not entail its narrower counterparts: if there is some liquor in the kitchen consistent with the widest definition of whiskey, it does not follow that there is anything which meets a more restrictive definition, for example a highland single malt or a bootleg sour mash. Although K&L do not mention scales per se, and although Rullmann suggests that “the notion of scale does not play any role” in their account (1996: 336), their analysis is nonetheless scalar in spirit. Widening means that the denotation of the CN in any CN must be a superset of the CN’s normal denotation. While it is true that the elements within these sets need not themselves be ordered, the set–superset relation itself defines an ordering of the sort which forms a conceptual scale (see §3.3, above; Hirschberg 1985), and in this sense, the entailments which define strengthening are scalar entailments even if they are not advertised as such. In contrast to the crypto-scalarity of Kadmon and Landman, Lee and Horn (see also Horn 2000a, b, 2005) offer an unabashedly scalar analysis. For L&H, what distinguishes any from other indefinites is that it incorporates the semantics of the scalar focus particle even. As they put it, any = even + indefinite. 
Following Horn (1969, 1971), Fauconnier (1975b), and Kay (1990), among others, L&H propose that even presupposes a pragmatic scale associated with the sentence in which it occurs (1994: 15) so that a phrase of the form [any CN] is equivalent to [even a CN], where even has focus over the indefinite a itself. The nature of the evoked pragmatic scale further depends on the type of any in question. PS any evokes a quantity scale with alternatives ordered in terms of size or amount, so that a phrase like any boy is in effect equivalent to even a single boy. FC any evokes a kind scale in which the alternatives are different kinds denoted by the CN – e.g. even the {tallest/cutest/cleverest …} boy – ordered in terms of their potential to yield a valid proposition. PS any thus profiles a minimal quantity which must be construed within a maximally informative proposition, while FC any profiles an extreme kind of some sort in a way that triggers inferences about all less extreme kinds. Examples (19a–b) thus work because they allow for the construction of appropriate quantity and kind scales, as shown by the naturalness of their counterparts in (20), but (19c) fails because it cannot be coherently construed with either a quantity (21a) or a kind scale (21b). (20)

a. Mildred didn’t drink even a single whiskey.
b. Mildred will drink even the nastiest whiskey.

(21)

a. *There is even a single whiskey in the kitchen.
b. *There is even the most delicious whiskey in the kitchen.

Support for an analysis in terms of quantity and kind scales comes from the fact that both types of scalar construal can be seen in the behavior of any’s attenuating counterpart, some (Israel 1999). Thus in (22) an NP of the form [some CN] can either profile a limited quantity of some type or a limited selection of kinds. (22)

a. Some drugs might make you feel better.    Quantity or Kind
b. Some drugs can be very dangerous.    Kind only
c. There are some drugs in the refrigerator.    Quantity only

(22a) allows both construals, though with distinct truth conditions: on a quantity construal, there must be some amount of at least one kind of drug which might work; on a kind construal, there must be at least two kinds which might work. The two construals typically come with slightly different pronunciations: where some is contrastively stressed, only the kind construal is possible; where it is unstressed and reduced (sm) only the quantity construal is possible. Syntactic and semantic factors also come into play: individual-level predicates, as in (22b), favor the kind construal; existential constructions, as in (22c), block it. Crucially, it appears that kind construals are available only in contexts which license FC any: both are blocked in episodic contexts (23) and existential constructions (24). (23)

a. {Some/*Any} syntacticians were in the garden.    Quantity only
b. I introduced Sally to {some/*any} phonologists.    Quantity only

(24)

a. There are some lemurs in Leipzig.    Quantity only
b. There aren’t any lemurs in Silver Spring.    PS only

The parallels here strongly suggest that the contrast between the two any’s is not just a lexical idiosyncrasy, but a principled fact about the interpretation of indefinites. Still, it is not obvious that the relevant principle is really a matter of quantities and kinds. For one thing, predicates which apply only to kinds, like extinct, rare, and widespread (Krifka et al. 1995), are actually hostile to the putatively kind-referring FC any, though they work with, and in fact require a kind reading for, some. (25)

a. *Any lemur is rare/common/widespread/extinct.
b. Some lemurs are rare/common/widespread/extinct.

But the oddness of (25a) in itself is not fatal for the L&H analysis. The idea there is that FC any evokes a scale of kinds by picking out the individual least likely to instantiate some property (“the least likely NP to VP” – Horn 2005) and thereby triggers inferences over other individuals, of whatever kind, more likely to instantiate the property. Thus in (25a) the reference to an unlikely individual required by any clashes with the reference to a kind required by the predicate. Since no individual lemur can be common or rare, the sentence cannot make sense. Still, the naturalness of (25b), with some, undermines the parallel between the two indefinites and suggests that their kind-referring properties may not in fact be of a kind. A more serious problem here is that FC any can sometimes denote a minimal element on a quantity scale, while PS any sometimes evokes a kind scale. Langacker (2002) offers the examples below, his (18–19), to show that both FC and PS any can occur with both kinds of scales. (26)

a. Any knife at all will do (even a piece of glass).    Kind
b. I don’t have any knife at all (not even a piece of glass).    Kind

(27)

a. Any salt at all will help this stew (even a pinch).    Quantity
b. I don’t have any salt at all (not even a pinch).    Quantity

Similar examples also occur in the BNC. In (28) we find FC any denoting a minimal amount on a quantity scale: thus with any luck in (28a) means roughly ‘with even the smallest amount of luck’ and not ‘with even the worst kind of luck.’ (28)

a. “Less than a minute ahead,” he muttered. “With any luck, I should overtake him this time.”
b. Any rise in aggregate demand which was rationally anticipated would have had no such effect; it would merely have led to a rise in prices.

Similarly, the PS any in (29) does not denote a quantity: the phrase any one factor seems to preclude paraphrasing any as ‘even a single’ or ‘even the smallest,’ and requires instead construal on a kind scale, with any meaning something like ‘even the most powerful.’

(29) Certainly, if we exclude our own species, we cannot find any one factor having this sort of effect.

Examples like these show that there is more at stake here than a simple distinction between two kinds of scales; still, the intuition behind the distinction may be worth holding onto. Thus even where it allows a quantitative paraphrase, FC any still seems to have a basically characterizing function: (27a), for example, says something about salt in general (that any amount will help the stew), while (26b), with a putatively kind-denoting PS any, expresses an incidental fact about the absence of a knife. This is precisely the intuition that both L & H and K & L start with – that the difference between FC and PS any is the difference between generic and existential uses of an indefinite – and that Langacker (2002) seeks to capture when he claims that FC any involves the same conceptual operation as PS any – random selection of an indefinite instance – only applied in a particular kind of mental space, what he calls “the structural plane,” which constitutes a generic conception of the way the world works. Ultimately, however we characterize their differences, there are good reasons to think that both uses of any reflect the determiner’s core lexical semantics, and that part of what both share is an inherently scalar semantics. First, this pattern of polysemy is frequently found in indefinite constructions crosslinguistically. Haspelmath (1997: 117) reports that roughly half the languages in his forty-language sample use the same form for both free-choice and negative polarity functions. Moreover, such polysemous forms frequently display a transparently scalar morphology with a simple indefinite marked by an additive scalar particle meaning roughly ‘even’ or ‘also’ – like the bhii in Hindi koii-bhii ‘anyone,’ the -to in Korean amwu-to, the -mo in Japanese daremo, and the af in Hebrew af ehad (Lahiri 1998; Lee & Horn 1994; Lee 1996). 
Such morphological evidence should be handled with caution, since once a particle has grammaticalized inside an indefinite it may lose its original scalar semantics. These sorts of constructions are thus far from uniform in their behavior cross-linguistically, but there is enough of a correlation between ‘even’-marked indefinites and the combination of polarity sensitive and free-choice uses to suggest that scalar semantics does play an important role in the sensitivities of these constructions, and so plausibly in the sensitivities of any as well.

Finally, there are other constructions with an unmistakably emphatic meaning which display a similar split between free-choice and polarity sensitive uses. For example, the slightest is a sort of complex determiner which occurs only where it can trigger scalar inferences, and which, like any, allows both a free-choice reading in generic contexts, as in (30), and an existential reading in polarity contexts, as in (31).

(30)
a. The slightest suggestion of compromise can substantially weaken oil prices.
b. Now the airline finds that it’s in a fishbowl, under harsh scrutiny for the slightest mistake.
c. Mostly, Mr. Perroton says, he keeps to himself, since “the slightest wrong word can set someone off.”
d. *They made the slightest suggestion of compromise.

(31)
a. My Life as a Dog is low-tech and nonviolent – and not the slightest bit Kafkaesque, despite its title.
b. In none of their dancing do they … perform any steps that appear to require the slightest degree of virtuosity.
c. *Some of their dances require the slightest degree of virtuosity.

Not only do any and the slightest exhibit a similar split between PS and FC uses, their distributions in written discourse appear to be almost identical. The unasterisked examples here are taken from a sample of the Linguistic Data Consortium’s WSJ corpus, where, as Table 2 shows, the distribution of PS and FC any across licensing contexts is almost indistinguishable from that of PS and FC the slightest, and very different from the distribution of ever in the same corpus (which of course lacks a free-choice use). The slightest construction here is also reminiscent of the French le moindre N (literally ‘the least N’), which exhibits a similar split between free-choice and polarity sensitive uses (Tovena & Jayez 1999). Since both of these constructions are transparently scalar quantificational superlatives, their striking parallels with any strongly support the idea that any’s apparent polysemy itself reflects a basic scalar semantics.

Table 2. Distributions of three polarity items

                            the slightest    any        ever
                            (n=100)          (n=316)    (n=1031)
Sentence negation           34%              39%        16%
Conditionals                5%               2%         5%
Interrogatives              2%               2%         7%
Comparatives                1%               2%         12%
Superlatives                0%               1%         28%
Other reversing contexts    21%              21%        17%
Total PS                    63%              67%        85%
Free choice                 35%              31%        0%
Other                       2%               2%         15%

There are many ways of understanding the notion of ‘scalarity’ here, but the common intuition behind any such understanding is that any is somehow inherently emphatic. This, in effect, is why Fauconnier and Haspelmath claim that any denotes a scalar endpoint; why Kadmon and Landman hold that it induces strengthening; why Lee and Horn see it as encoding an implicit ‘even’; and why it is held here to encode a high informative value. But there are reasons to doubt that any really is inherently emphatic. First, if it were, then presumably FC any should be a positive polarity item, since where any acts like a universal quantifier, it must take wide scope over negation to yield an emphatic proposition. But FC any is not a PPI, and it seems perfectly acceptable in contexts where it is conspicuously uninformative, as in (32). (32)

a. Mildred won’t drink absolutely any whiskey – even she has her standards.
b. Are you sure that Mildred will drink absolutely any whiskey?
c. I’d be surprised if Mildred will drink just any whiskey.

Horn (2000a) dubs this use “anti-indiscriminative” any, since, as the name suggests, it conveys a lack of total indiscriminateness: (32a), for example, literally entails only minimal discrimination in Mildred’s choice of whiskeys. This use poses a problem both for the Scalar Model and for K&L’s strengthening analysis. For the latter, any here should induce a widened interpretation of the CN whiskey and thereby produce a stronger statement than would otherwise be expressed. But the expressed proposition in (32a) – that Mildred will not drink everything conforming to the broadest definition of a whiskey-like libation – does not entail her disdain for more narrowly defined sorts of whiskey: a small batch bourbon might still be acceptable. By the same logic, the anti-indiscriminative use seems to be conspicuously low in informative value, as in these examples the expressed proposition requires no more than the existence of a single kind of whiskey which Mildred will not drink. But anti-indiscriminative any does not seem to pose a problem for L&H’s implicit ‘even’ analysis. As predicted, the examples in (32) all allow the profiled whiskey to be construed as an outlier on a scale of kinds: thus each can be honestly paraphrased with something like even the nastiest substituted for absolutely any.

(33)

a. Mildred won’t drink even the nastiest whiskey – even she has her standards.
b. Are you sure that Mildred will drink even the nastiest whiskey?
c. I’d be surprised if Mildred will drink even the nastiest whiskey.

Since even is itself an emphatic scalar operator – requiring that the element within its focus (in this case the nastiest whiskey) should form a strongly informative proposition within a scalar model – the success of L&H’s even + indefinite account here suggests that the anti-indiscriminative use of FC any does involve a kind of emphatic strengthening, even if the expressed proposition of the sentence as a whole is not emphatic itself. This sort of embedded emphasis reflects the fact that anti-indiscriminative any is essentially echoic – it always comes in response to some real or imagined prior utterance in which any is used emphatically. This point is implicit in Horn’s doubly negated sobriquet for the construction – the “anti-indiscriminative” formulation itself points to the fact that any here serves to deny an assertion of emphatic indiscriminateness and not just to advance a claim of minimal discriminateness. As such, the emphatic proposition to which any contributes its i-value in (32) concerns a purportedly extreme inclination to whiskey drinking on the part of Mildred: (32a) denies this emphatic proposition, (32b) questions its validity, and (32c) expresses incredulity toward it. In each case, FC any is locally strengthening, though in each case the presence of a scale-reversing polarity trigger prevents the sentence as a whole from expressing a strong proposition. FC any in these examples is thus licensed not because of, but despite, the presence of a polarity trigger, and it is emphatic with respect to the local proposition inside the scope of these triggers (i.e. negation, question formation, and the adversative predicate be surprised). In this respect, FC any appears to behave just like PS any. 
As noted above (§3.4.3; see also Baker 1970; Chierchia 2004), licensing in general only requires a form to occur in a local domain with the right scalar semantics – once licensed, sensitive items are generally impervious to the effects of further embedding under other scale-reversing constructions. So if, as seems to be the case, FC any is licensed in a local proposition, even when it appears to be inappropriately weak with respect to its global context, this only strengthens the parallel between the two sorts of any. But the real problem with the idea that any is an emphatic scalar operator is just that often any does not seem to be emphatic at all (Heim 1984; Krifka 1995; Haspelmath 1997: 122–8; Guerzoni 2004). Thus there is an important difference between stressed uses of any, as in (35), where it is unambiguously emphatic, and unstressed uses, as in (34), where it sounds neutral and innocent.

(34)

a. I have a headache. Do you have any heroin?
b. I’m terribly sorry, but I don’t have any heroin.

(35)

a. Do you have any idea (at all) how much this hurts?
b. I wish I could help you, but I don’t have any heroin (at all).

While the expressive uses in (35) seem to involve some sort of widening or emphatic reference to a scalar extreme, those in (34) are comparatively neutral ways of expressing a question or denial: (34a) will not normally be heard as a request for “even the smallest amount of heroin,” and (34b) need not be understood as an emphatic denial of possession. Observations like these have led some (e.g. Schmerling 1971; Heim 1984) to conclude that the difference between unstressed any and truly emphatic NPIs is precisely that the latter do, while any does not, incorporate the meaning of a scalar focus particle like even. Indeed, since unstressed any does not seem to induce widening or to contrast with scalar alternatives, Rullmann (1996: 349) suggests that, at least in these uses, any is identical in meaning to an ordinary indefinite like English a(n) but with an arbitrary constraint limiting it to downward entailing contexts. Duffley and Larrivée (2010) go further, suggesting that any is not intrinsically scalar in any of its uses – that the impression of scalarity is always a derived effect. Clearly, the distinctive meaning of unstressed any, if any there is, is something more subtle than the unabashed strengthening of a scalar focus particle like even. Still, it might be something subtly related. I maintain that any in all its uses is a scalar operator in the broad sense that it profiles an indefinite instance within an ordered set of alternatives. What is special about unstressed any is precisely that it lacks stress. As such its scalar content is not a focus of attention and its emphatic effects are relatively backgrounded within an expressed proposition. Still, unstressed any remains emphatic in that its use necessarily triggers inferences from a profiled indefinite instance to all alternative instances. 
The inferences are still there, only they are no longer quite the focus of attention: the scale is somehow backgrounded, and more subjectively construed (Langacker 1990; Traugott 1989, 1990). In its unstressed uses, any does not profile a scalar endpoint and does not even require that the alternatives it evokes saliently contrast along an objective scale – the heroin instances in (34), for example, need not be construed in terms of their relative amounts or inherent qualities. The ordering which any evokes does not depend on any imaginable property of the evoked alternatives, but rather on the act of imagination itself – the way, that is, that an indefinite instance is brought to mind.

Indefiniteness is a grammatical property – an abstract aspect of construal that can be encoded in the construction of referring expressions. A fully grammaticalized indefinite is a construction which contributes nothing but a sense of indefiniteness and which is more or less obligatory in the use of referring expressions. Different grammaticalized indefinites within a language (e.g. English a/some/any/whatever) differ in the ways they evoke an indefinite instance, and even in the degree to which they construe an instance as indefinite. Forms like any and ever are special because they are maximally indefinite: they construe a profiled instance as minimally distinguished within a set of possible instances. The relevant scale here is entirely subjective. Different types of referring expressions (e.g. pronouns, proper names, demonstratives, definites, indefinites) differ in the ways they evoke their referents, and, as suggested by Kirsner (1993), in the mental effort they require of a conceptualizer to select a referent. As Langacker (2002: 298) explains it:

Singling out a particular referent, in specific contrast to all other options, requires a higher degree of mental effort and selective control than when one is merely instructed to imagine an instance (any instance). By nature, definite determiners imply a higher level of selective force than do indefinites; the kind of pointing that sometimes accompanies demonstrative use may indicate the upper degree of selective effort, with both physical and mental aspects. Among indefinites, any stands out by requiring the least degree of selective force imaginable.

It is this minimal selective force that gives any its special, quodlibetic character: any, in effect, prompts one to select an arbitrary instance entirely at random from a set of possible instances (Langacker 1991: 116). In this sense any denotes a maximally indistinguished instance of a type, since it requires the minimal effort possible to be applied in the selection of an instance. What any denotes is thus just the idea of an instance – a purely fictive “phantom instance” (Israel 1995). This is why any cannot occur in episodic contexts, where the existence of a specific event is entailed (Giannakidou 2001): because any denotes an instance without distinguishing features, its referent cannot be anchored to any particular time or place or event. I suggest that ever and at all work in much the same way: all three constructions profile phantom indefinites. The core intuition here echoes analyses from Jespersen (1924) to Giannakidou (2001) which view any as a special indefinite “whose use is bound up with some aspect of the hearer’s unrestricted freedom to choose from a set of alternatives in identifying referents or witnesses to fill out the speaker’s proposition” (Horn 2005). What Langacker adds is a focus on the imaginative process by which any presents an instance to consciousness, and thus a way of thinking about indefiniteness itself as something which can vary in intensity. Indefiniteness does not depend on what is referred to, but on how a referent is called to mind. What marks any as an especially indefinite determiner is the minimal effort, the absolute carelessness, it imposes on the selection of a referent from a set of possible alternatives.¹

Like the indefinite NPIs, the indefinite NPs in minimizer constructions (i.e. a finger in lift a finger or a wink in sleep a wink) also denote phantom elements. This explains why neither minimizers nor indefinite NPIs can introduce discourse referents, and also more generally why any NPs lack existential import (Vendler 1967; Horn 1997) so that a sentence like any trespassers will be prosecuted does not presuppose that there are any actual trespassers to prosecute. What any requires of its referent is only that it be imaginable, not that it actually exist. Because phantom indefinites lack referential content of their own, they are only meaningful where reference to the phantom supports inferences about other instances of the same type. Phantom indefinites in general are thus subject to what I have called the Implication Constraint, or IC (Israel 1995: 165).

IC: Given a partially ordered set A with elements {a, b, c, … } and phantom α linked to A, α is licensed in a proposition P iff for every element x of A, P[α] → P[x].
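The IC can be illustrated with a small executable sketch. Everything concrete in it is an assumption made for the illustration: the scale of sleep amounts, the choice of 1 as the phantom value (standing in for a wink), the finite set of worlds, and the reading of P[α] → P[x] as entailment across those worlds. It uses an objectively ordered, minimizer-style scale because that is the simplest case to model; it is not meant as a definitive formalization of the constraint.

```python
# Hypothetical scale of sleep amounts; the phantom is its minimal value.
PHANTOM = 1                     # stands in for "a wink"
ALTERNATIVES = [1, 2, 3, 4]     # the set A, ordered by amount
WORLDS = range(0, 5)            # each world fixes how much sleep occurred


def slept(x):
    """P[x] in an affirmative context: at least x units of sleep occurred."""
    return lambda w: w >= x


def did_not_sleep(x):
    """P[x] under negation: fewer than x units of sleep occurred."""
    return lambda w: w < x


def satisfies_ic(context, phantom=PHANTOM, alternatives=ALTERNATIVES):
    """Check that P[phantom] entails P[x] for every alternative x in A."""
    p_phantom = context(phantom)
    return all(
        context(x)(w)
        for x in alternatives
        for w in WORLDS
        if p_phantom(w)
    )


if __name__ == "__main__":
    print("negative context licenses the phantom:", satisfies_ic(did_not_sleep))
    print("affirmative context licenses the phantom:", satisfies_ic(slept))
```

Under these assumptions the negative context satisfies the constraint, since not sleeping the minimal amount entails not sleeping any larger amount, while the affirmative context does not, since sleeping the minimal amount says nothing about larger amounts.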

The IC is basically just a generalized version of the requirement that NPIs occur only in contexts which support particular sorts of scalar inferences. Because any and the indefinite NPIs are defined by the minimal selective force they impose on a conceptualizer, they are scalar operators just like the minimizers, but with the important difference that the scales they evoke are construed subjectively rather than objectively (Langacker 1990): they are implicit in the act of conceptualization itself. For a minimizer like sleep a wink or lift a finger, the phantom value denoted by the indefinite contrasts with alternative values ordered along some objective scale – e.g. amounts one might sleep or degrees to which one might exert oneself. With the indefinites, however, the scale is defined not by any inherent property of the referents themselves, but rather by the effort required to imagine an instance of the relevant type. One consequence of this is that the alternative instances on an indefinite scale need not be exhaustively ordered amongst themselves: while every particular alternative necessarily takes more mental effort to distinguish than does a phantom instance, no particular alternative need take more mental effort than any other one. Minimizer and indefinite NPIs thus differ crucially in the ways their scales are structured: while minimizers presuppose a contrast between a minimal value and an ordered set of alternative values, indefinite NPIs do not require any particular ordering among alternative values. Instead, they allow a sort of weak scalar construal with a minimal scale, where only the phantom element itself is ordered with respect to all other elements. But this sort of construal is possible only where the scalar content of an NPI is subjectively construed – where no contextual cues highlight an objective quantity or quality scale, and where the choice of the NPI itself is unobtrusively unstressed. This potential for a weak scalar construal explains why indefinite NPIs, unlike their minimizer counterparts, can sometimes sound more neutral than emphatic. The emphatic effect arises where a profiled phantom element is construed as an extreme value, contrasting with other elements along some objective parameter: this sort of construal is built into the meaning of a minimizer but it is optional, or at least much less salient, with the indefinite NPIs. Where such a contrast is absent or only dimly felt, forms like any and ever work more or less like ordinary generics, without implicating anything exceptional about their referents: thus Do you have any sugar? can be a simple request for sweetener, without any hint that the addressee should consider a wide variety of sugar types (e.g. corn syrup, glucose, or confectioner’s sugar) in his answer. Where any occurs with prosodic prominence or emphatic stress, it always receives a strong scalar construal and profiles an extreme value in a fully ordered set of alternatives.
But there is a fine line between the weak scalar construal, on which any denotes a subjectively indistinguished (hence easily imagined) instance, and the strong scalar construal, on which any denotes something which is objectively unexceptional: a negligible amount of something (a low value on a quantity scale) or an insignificant example of something (a low value on a kind scale). These two types of indistinction often go together – the element requiring the least mental effort to imagine will not be imagined as having particularly distinctive objective properties. Presumably this sort of referential overlap between the two construals has facilitated a reanalysis which, in certain contexts, has led any to lose some of its objective scalarity and to grammaticalize as a quasi-neutral expression of indefiniteness.

7.4 The effects of phantom reference

Thus far I have argued that any is indeed an emphatic scalar operator in all its uses, but that it is also highly grammaticalized as an indefinite. As such it can operate on a subjectively construed scale in which elements are ordered in terms of the ease with which they can be imagined rather than in terms of the objective properties which distinguish them. I have called this sort of subjective construal “weakly scalar” since it profiles a phantom indefinite as being ordered with respect to its alternatives but construes the alternatives themselves as unordered with respect to each other. This weak scalar construal makes it possible to use any in non-focal, unstressed positions within an utterance. The adverbials ever and at all are less likely than any to sound completely neutral – perhaps because their inherent optionality makes their use feel more contrastive and emphatic – but like any they do occur without stress in a variety of non-focal positions where they contribute to the expression of backgrounded or topical propositions. Since at all itself denotes an indefinite value on an objective scale of some sort, its profile tends to be construed as a minimal value in an ordered set of alternatives. I take it that as an indefinite at all really does work the same way ever and any do – in effect, by conjuring up an arbitrarily selected value in a set of scalar degrees – but since degrees in general are always intrinsically ordered, there is little practical difference between such indefinite reference and reference directly to a minimal value. As a group, indefinite NPIs are notoriously liberal (“weak”) in their licensing requirements, tolerating a host of licensors which more discriminating (“strong”) NPIs shun. Unlike minimizers and most emphatic NPIs, indefinites occur in non-focal positions within a sentence, contributing to the expression of topical (i.e. presupposed) information, occasionally even in sentences where another constituent bears contrastive stress.

(36) I don’t know about Glynda, but Alice didn’t drink {any / *a drop of} eggnog.
(37) Maybe Glynda has, but Lulu hasn’t {ever read / *read a word of} Lacan.

The minimizer constructions with drop and word here seem to presuppose that there is someone who drank precisely a drop or read precisely a word. Unstressed indefinites can also occasionally occur with metalinguistic negation, which normally neither licenses NPIs nor blocks PPIs (Baker 1970; Linebarger 1980; Horn 1989); nonetheless, the indefinites any and ever are able to slip into B’s responses below so long as they are not themselves construed as the focus of the metalinguistic comment. (38)

A: I’m surprised you managed to get any work done with that music on. B: I didn’t manage to get any work done – it was easy. B′: *I didn’t manage to get any work done – I finished it all.

(39)

A: Do you fantasize about visiting Bali? B: No, I don’t ever fantasize about it – I just go whenever I want. B′: *I don’t ever fantasize about it – I think about it all the time.

This option seems less available for stronger NPIs like so much as or the least bit.

(40) I haven’t {ever / *so much as} fantasized about it. I’ve planned it in meticulous detail.
(41) Sally wasn’t {at all / *the least bit} irritated. She was furious.

Again, the crucial factor here appears to be an NPI’s ability to occur outside of a primary propositional focus. In order for an NPI like so much as to occur with a metalinguistic negation it must be construed as emphasizing the metalinguistic ineptness of the expression in its focus, rather than the scalar extent of the property depicted by that expression. The Implication Constraint explains why the unstressed indefinites are, and minimizers are not, licensed in negated because clauses, as in (42). (42)

David didn’t get the job because he had ______. Lois just liked him.
a. {ever / *EVER} studied tax law
b. {any / *ANY} special talents
c. *the least bit of talent

It appears that any and ever do better in this context than at all does: the examples with at all found in Google Books, given below, appear to be limited to nineteenth-century literary prose. (43)

a. When Mr. Tapley stopped in these calculations in simple addition, he did it, not because he was at all tired of the exercise, but because he was out of breath. (Dickens, Martin Chuzzlewit) b. At the last moment Lord Alfred Grendall had been asked not because he was at all in favour with any of the Longestaffes, but in order that he might be useful in disposing of the great Director. (Trollope, The Way We Live Now) c. So he went to bed and stayed there for nearly eight years; not because he was at all ill, but because he liked it. (Henry George, Social Problems)

Since both minimizers and indefinites denote phantom elements, both are subject to the Implication Constraint. But the inference from a schematic indefinite to a specific value is very different from the inference from a minimal value to a higher value in a scalar ordering. Consider, for example, the contrast below. (44)

Zelda didn’t fall asleep because _______ . She was just tired. a. she drank anything. b. *she drank a drop of gin.

Imagine that Zelda is sleeping and we want to know why. If her somnolence was not caused by some unspecified beverage, then no particular sort of beverage (milk, gin, Nyquil, or whatever) could have caused it either. The IC is thus satisfied for any. But if all we know is that Zelda’s drinking a small quantity of gin did not cause her to doze off, it is still possible that her consuming some larger amount might have had a soporific effect: a pint may suffice where a shot would not. The IC thus fails for the minimizer construction: negated because clauses do not allow inferences from low values to higher values on a quantitative scale, even though they do license inferences from a pure phantom indefinite to more specific values linked to the phantom.

Another context where indefinite polarity items behave differently from minimizers is the focus of only, as in the examples below (Linebarger 1980; Lee & Horn 1994; Israel 1995; Horn 1996). (45)

a. Only people who’ve ever read Lacan will enjoy this movie. b. ??Only people who appreciate Lacan at all will enjoy this movie. c. *Only people who’ve read the least bit of Lacan will enjoy this movie. d. *Only people who’ve read Lacan yet will enjoy this movie.

Linebarger (1980) suggests that NPIs are licensed in this sort of context by negative implicatures – here, that if one has not (ever) read (any) Lacan, then one will not enjoy the movie. Crucially, to the degree the indefinites are licensed here, they seem to emphasize the force of the restriction expressed by only: the rhetorical point is not that the people who enjoy the movie will be precisely those who have read the smallest amount of Lacan (i.e. any Lacan ever), but rather that everyone else really won’t enjoy it. The indefinite NPIs work in this context precisely because they allow a weak scalar construal, triggering inferences about all people with Lacanian experience without highlighting the differences among people in the extent of their experience. In fact, indefinite NPIs can have this effect even without an overt only when they occur inside generic (definite or bare plural) NPs, which suggests that they are licensed here not by the exclusive particle, but rather by their occurrence in the restriction of an implied generic or universal quantifier (Beaver & Clark 2008: 189). What the NPIs contribute in such uses is precisely what is expressed by the addition of only. The examples in (46), all found on the Web, illustrate a common construction of this sort, where indefinite NPIs occur in the clausal complement of a noun like reason, or, as in (46e), before a because clause. (46)

a. I think this site might also be the reason I have any faith left in women. b. I think the reason I have any friends at all is that I am certainly not above self-deprecation. c. Paul Zindel is the reason I ever wanted to write… d. you’re the reason I do this at all.

e. I do this at all because there are plenty of YOUNGSTERS watching who may not understand the esoteric arguments but CAN understand the prosaic physics.

These contexts seem to allow indefinite NPIs just in case the sentence as a whole can be construed as presenting a unique, necessary, and sufficient explanation for the truth of the proposition denoted by the clause in which the NPI appears. While they are in many ways less strict in their licensing requirements than other NPIs, indefinite NPIs are also remarkably robust in their sensitivities. Many polarity items, particularly those susceptible to a non-polarity sensitive literal reading (e.g. an iota, a shred of evidence, bat an eyelash) can occasionally be found in playful or ironic uses without any licensor. Such uses arise most often in nonveridical contexts, which do not entail the truth of an expressed proposition – for example, in the scope of an imperative or a possibility operator as in the examples below found on the Web. (47)

a. Just lift a finger and you’ve got curd, with zero hassles. b. It would be too easy for Him to just lift a finger, and all my problems will fade away. c. Thanks you guys, I think I may sleep a wink or two tonight.

And with a little jocularity, such “liberated minimizers” (Horn 2006) can also appear in the affirmative denial of an asserted negation, as in the invented bit of dialogue below. (48)

A: I’ll bet none of my students have read a word of Deleuze. B: Well, some have probably read a word, maybe even two or three.

Other possible responses with indefinite NPIs seem less jocular and more ungrammatical. (49)

Some have probably a. *read any Deleuze. b. *ever read Deleuze. c. *read Deleuze at all.

The fact that indefinite NPIs are blocked in such non-serious contexts shows that they require a serious licensor of some sort, and in this sense they seem to be more thoroughly grammaticalized than their more lexically contentful minimizer counterparts. Thus while the implication constraint allows indefinite NPIs to occur in a variety of weakly negative contexts inhospitable to other NPIs, it also blocks them more thoroughly from purely positive contexts. For the indefinites, the implication constraint is thus fully grammaticalized and non-defeasible.

This is perhaps also why the presence of an indefinite NPI can sometimes actually improve the acceptability of a less than fully licensed NPI, as below where the addition of ever seems to boost the acceptability of the stronger NPIs had a thing to do with and anymore. (50)

a. Joe was sorry he (+ever) had a thing to do with that woman. b. I didn’t expect that she would (+ever) come back anymore.

This phenomenon has been called secondary (Borkin 1971; Emanatian 1983) or parasitic licensing (den Dikken 2002) since it appears that the stronger NPIs are licensed here by the effect of a weaker polarity item somehow highlighting the context’s overall negativity. The fact that forms like ever and Dutch ooit make such effective secondary triggers may be seen as evidence of grammaticalization of a kind – the presence of these forms alone creates the impression of licensing. Where such licensing is saliently absent, as in (49), indefinite NPIs are robustly ungrammatical, but where it is subtly present, as in (50), the indefinites have a reinforcing effect. I conclude that these forms have grammaticalized as pure phantom indefinites and as such have effectively conventionalized an implicature, namely that the Implication Constraint is met. Less grammaticalized NPIs, which have not conventionalized this implicature, require a stronger licensor but can also function ironically. The more grammaticalized status of weak indefinites thus gives them both their unusual flexibility and distinct restrictions.

But even if the indefinite NPIs really are grammaticalized emphatics in both their stressed and unstressed uses, there is still some question whether free-choice any really is an indefinite or whether, as Dayal (1998, 2004) suggests, it is a kind of universal quantifier. The Implication Constraint licenses any wherever reference to an indefinite instance supports inferences about other instances of the same sort, and it predicts that any should be licensed anywhere an ordinary indefinite like a/an can be interpreted generically. But the claim here is not, as in Kadmon and Landman (1993), that FC any gets its quantificational force from an implicit generic operator, but rather that any’s universal-like interpretation is a function of its special indefinite semantics.2 Like PS any, FC any denotes a maximally indistinguished phantom element. 
The free-choice reading arises where the any phrase figures as part of a proposition which predicates an essential property of such an indistinguished instance. As such, it is itself inherently generic and so does not depend on an external generic operator for its free-choice interpretation. This explains why, as Dayal (1998) notes, FC any can occur in contexts where an ordinary indefinite like a/an receives an existential interpretation – for example, in the scope of deontic and epistemic possibility operators as in (51).

(51)

a. You may pick {a / any} flower. b. {A / Any} pilot could be flying this plane.

In both cases, the plain indefinite entails only that there is some (at least one) instance of the relevant type (i.e. a flower or a pilot) for which the asserted possibility holds true, while the use of any conveys that the possibility actually holds for every relevant instance. Dayal’s basic point is that the conveyed universal force of any here cannot be attributed to an independently motivated generic operator (1998: 437). But that does not mean that any is itself a universal quantifier. As a phantom indefinite, any gets a free-choice reading in contexts like (51) because if a possibility holds (even) for the most indistinguished instance of a type, it must hold for all other particular instances as well. In effect, the Implication Constraint does not require that FC any always occur in the scope of some external, generic operator, but only that it never occur where it has to be construed as denoting a particular individual rather than an undistinguished instance. This is why FC any is often, though not always, blocked with expressions of necessity. Thus Dayal (1998) judges the examples below – her (4a) and (50a) – as ungrammatical. (52)

a. *You must pick any flower. b. *Any pilot must be flying this plane.

The problem with both these examples is that any fails to denote an entirely undistinguished referent. In (52b) progressive marking and demonstrative this entail reference to a particular flying event, and since any given instance of plane flying can involve only one pilot (or at most a few), the referent of any is necessarily distinguished from other instances of the pilot category: the sentence is thus inherently incoherent. (52a), on the other hand, fails only on the plausible interpretation that it demands just one act of flower-picking – it is fine, for example, construed as a broad injunction to destroy any colorful foliage in the garden. The difference, again, is that in the former case, any picks out a participant in a particular flower-picking event, whereas in the latter it describes a kind of event required for flowers in general. In fact, necessity operators regularly license FC any, so long as the necessity can be construed as applying generically to situations which feature an instance of the relevant sort. In other words, where a phantom indefinite occurs in the scope of a necessity operator, the necessity itself must apply without distinction to instances of the relevant type. Whether or not such an interpretation is plausible is largely a matter of pragmatics: the oddness of (52a) simply reflects the unlikelihood that someone would want to prevent flowers from occurring somewhere. But the fact that orders are generally supposed to be obeyed and errors corrected is sufficient to license any in (53a, b), and (53c) is fine as an instruction to a doctoral student who is expected to master the free-choice literature. (53)

a. You must obey any order. b. You must correct any errors. c. You must read any article on free-choice semantics.

The Implication Constraint is also satisfied where free-choice any is used to indicate the open-ended nature of an obligation, as in (54). With modals like must or have to this sort of construal requires a “supplementary any” reinforcing an ordinary indefinite (Jennings 1994; Dayal 2004; Horn 2005). Curiously, this supplementary construction seems not to be necessary with should. (54)

a. You must read an article on free-choice semantics – any article. b. If you don’t know where to start, you should just read any article.

A similar effect is found in certain imperatives, where the use of any typically indicates a kind of indirect offer rather than a simple command. The examples in (55) thus illustrate expressions of possibility rather than necessity, and they convey that an offer applies generally to instances of the relevant type – i.e. cards and things in the fridge – regardless of their particular characteristics. This sort of construal is difficult, however, with explicit expressions of necessity like must or insist, as in (56), where any sounds oddly self-contradictory. (55)

a. Pick a card, any card. b. Help yourself to anything in the fridge.

(56)

a. *You must pick any card. b. *I insist you help yourself to anything in the fridge.

In (56a), if the requirement applies to any random card, then one must in fact “pick” the whole deck; in (56b), if one is obliged to eat whatever is in the fridge, the eating will hardly be “helping oneself.” Ultimately, the Implication Constraint requires any to profile a maximally schematic element selected from a class of possible referents. This explains why any is typically blocked in episodic sentences like (57a), where it would have to denote a distinct individual rather than a schematic instance of some type; and it also explains the effect known as “subtrigging” (LeGrand 1975; Dayal 1998, 2004) – where free-choice any in an episodic sentence is saved by the addition of an appropriate postnominal modifier, as in (57b). (57)

a. *Any soldier died in the blast. b. Any soldier who was there at the time died in the blast. c. ??Any soldier who had a mustache died in the blast.

Intuitively, while the unsubtrigged any in (57a) denotes a particular individual in a particular event, the subtrigged use in (57b) defines the class of individuals who fulfill a given role in the event. In effect, then, subtrigging converts an ordinary episodic context into a characterizing context. The intuition is reinforced by Dayal’s (1998) observation that subtrigging fails in examples like (57c), where the modifier denotes an accidental property rather than an essential condition for participation in the relevant event. Despite the fact that the sentence as a whole is episodic, the any NP in (57b) is construed quasi-generically: it does not refer directly to the participants in an event, but rather profiles a schematic instance of the kind of individual that participates. Precisely why it is that postnominal modification has this effect on any NPs is an interesting question, but it is the effect itself that is crucial for understanding the semantics of free choice. In fact, as Dayal (2004: 9) notes, the effect does not even require an overt subtrigger: one thus finds companies apologizing “for any inconvenience” when their website is down, and, in an example from Jason Stanley, dinner parties where “any leftovers will be thrown away.” In all such cases, the use of any creates a sort of undifferentiated reference to an open-ended class of individuals – any is licensed because its reference to a phantom instance supports inferences about other instances of the same type. Dayal is surely correct that any is more than just a Heimian indefinite which derives its quantificational force from some external generic operator; but it does not follow that any is a universal quantifier. Rather any is a phantom indefinite: it profiles an absolutely indistinguished fictive entity and so is meaningful only where it triggers inferences over a set of alternative instances. 
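The idea that a property predicated of an undistinguished instance carries over to every instance of the type can be pictured with a toy model. The sketch below is offered only as an illustration under stated assumptions: the domain of flowers, the two permission predicates, and the function names are all hypothetical, and the universal check models the inference that any supports, not a claim that any is itself a universal quantifier.

```python
# Toy model; the domain and every name below are hypothetical illustrations.
FLOWERS = ["rose", "tulip", "daisy"]


def holds_of_some_instance(predicate, domain):
    """Plain indefinite: the predicate holds of at least one instance."""
    return any(predicate(x) for x in domain)


def holds_of_arbitrary_instance(predicate, domain):
    """Phantom indefinite: the predicate holds no matter which instance
    is selected, so it carries over to every instance in the domain."""
    return all(predicate(x) for x in domain)


def may_pick_only_rose(flower):
    """A permission that singles out one particular flower."""
    return flower == "rose"


def may_pick_whatever(flower):
    """A permission that ignores which flower is chosen."""
    return True


print(holds_of_some_instance(may_pick_only_rose, FLOWERS))       # True
print(holds_of_arbitrary_instance(may_pick_only_rose, FLOWERS))  # False
print(holds_of_arbitrary_instance(may_pick_whatever, FLOWERS))   # True
```

On this picture, a permission that depends on the identity of the flower is compatible with a plain indefinite but not with a phantom one, since the phantom instance has no identity for the permission to depend on.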
The difference between FC and PS any lies in the ways these inferences are generated. PS any, like other NPIs, occurs in scale-reversing contexts, where reference to a minimally distinguished element pragmatically entails reference to any more distinguished alternative. FC any, on the other hand, is essentially generic – it occurs in characterizing contexts, profiling an element on the structural plane (Langacker 2002), where the minimally distinguished element is schematic with respect to other conceivable instances.

7.5 Some uses of some

Any and some actually have a great deal in common. My central claim here is that attenuation is an essential feature of some’s meaning, just as emphasis is an inherent feature of any’s. While any occurs where reference to an indefinite instance can trigger inferences over all alternatives, some occurs only where its indefinite semantics will not trigger such inferences. Some profiles a limited indefinite instance of a nominal type in a relatively uninformative proposition. In Kadmon and Landman’s terms, some blocks widening and strengthening. In the terms of the Scalar Model, it is a low-scalar attenuating PPI: an NP of the form some CN encodes a low q-value by profiling a limited indefinite instance, and it encodes a low i-value by requiring that the expressed proposition to which it contributes be informatively weaker than a salient context proposition. Any account of some should start with the distinction noted by Milsark (1977) between the reduced form, sm (with syllabic [m]), as in (58a), and the full form (=[sʌm]), as in (58b). (58)

a. There were sm lemurs sitting in the library. b. Some lemurs enjoy drinking whiskey.

Basically, while the reduced sm simply introduces an indefinite instance of a nominal type, the full form typically presupposes a contrast with other instances of the type. Thus, while the lemurs in (58a) need not be understood as contrasting with some other group of lemurs outside the library, the whiskey-drinking lemurs in (58b) can only be construed as a subset of lemurs in general. Because these two construals are usually associated with distinct pronunciations, the two forms of some are sometimes considered distinct, if perhaps related, lexical items. But the distinct construals here could also reflect their differences in pronunciation and so might just be pragmatic variants of a single basic meaning. In this light I suggest that the contrast between full and reduced some precisely parallels the contrast between stressed and unstressed any. Other uses of some, however, cannot be so easily dismissed as pragmatically conditioned variants. Each of the four constructions illustrated below exhibits distinct semantic and syntactic properties.3 (59)

a. You have some peanut butter on your chin. (Basic some)
b. There’s some linguist here to see you. (Spesumptive some)
c. It’s going to be some party. (Exclamative some)
d. We danced some, and then we went home. (Adverbial some)

In what I take to be its most basic use, (59a), some combines with a mass or plural count noun and profiles an indefinite, limited instance of that nominal type. The “spesumptive” use in (59b) departs from this use in that some combines with a singular count noun and adds a nuance to the interpretation of the indefinite – that while the NP does have a specific referent the speaker cannot (or will not) fully identify it. The exclamative use in (59c) again features some with a singular count noun, but here it forms an exclamative predicate rather than an actual indefinite referring expression. Finally, in (59d), some functions not as a determiner, but as an adverb profiling the limited instantiation of an atelic process. Warfel (1972) coined the term spesumptive to capture what he saw as a “presumption of specificity” in uses like those below from the BNC. (60)

a. “But six shots and a man in intensive care,” he added, “makes it reasonably certain that some friend of yours was involved.” b. By the time she had closed her room door behind her, though, while there was still some part of her that didn’t want to be attracted to him some other part of her was arguing, why shouldn’t she be attracted to him? c. Hard fingers encircled his arm and haled him out of the room, and he went like one in a dream, unable to resist but ready for some treacherous pitfall to open under his feet at every step.

Some has a sort of quasi-specific force here, suggesting at once that the NP it forms does have a unique referent, but one which is not fully identifiable. But spesumptive some need not even entail the existence of a specific referent: thus (60c) does not seem to require that there is an actual treacherous pitfall awaiting its protagonist, but only that some such pitfall might well exist. What spesumptive some does convey as a non-defeasible feature of its meaning is that the speaker (or more generally, the conceptualizer whose viewpoint is assumed at a given point in a discourse – in (60c) it is a third person protagonist) is unable to fully identify a referent. This explains the oddness of (61a), where a spesumptive some is immediately followed by the identification of its spesumptive referent. The contrast with (61b) suggests that basic some does not encode any such presumption of unidentifiability. (61)

a. Noah rented some film for us to watch. ??Laura – an old favorite of mine. b. Noah rented some films for us to watch. They’re all old favorites of mine.

Exclamative some has two properties typical of exclamatives in general (Michaelis & Lambrecht 1996; Zanuttini & Portner 2003) – subjectivity and scalar predication. First, it can only be used to express a speaker’s subjective evaluation of a specific referent. This explains why (62a) is odd on an exclamative reading, since the speaker here explicitly disavows the relevant evaluation.4 Second, exclamative some always predicates an extreme (or much greater than expected) scalar property. This is why exclamative some occurs only in predicative NPs, thus barring examples like (62b), where some is not part of the predicate, and allowing examples like (62c). (62)

a. *Harry thought she was some dancer, but I disagreed. b. *Some friend stole my prized bottle cap collection! c. She must have been some friend to rip you off like that!

The semantics of the exclamative is radically different from the other somes. Exclamative some requires a uniquely identifiable – i.e. definite – referent, since the act of exclamation itself presupposes the contextual salience of something to exclaim about. And the act of exclamation itself is fundamentally at odds with some’s other more unassuming and basically uninformative roles. One property which all of the somes do seem to share is that they are finicky about where they appear. Some is a positive polarity item. Usually where some occurs with negation, the result is either ungrammatical, as in (63a–c), or else can only be interpreted with some taking scope over negation. Thus (63d) can assert only that there are some books which were not read, and not that no books were read. (63)

a. *You don’t have some peanut butter on your chin. b. *There isn’t some linguist here to see you. c. *We didn’t dance some, although we both wanted to. d. I didn’t read some books.
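The two scope readings at issue in (63d) can be told apart in a toy model. The following is a minimal sketch, assuming a made-up set of books and hypothetical function names; it illustrates only the truth conditions of the two readings, not any analysis of why the narrow-scope reading is unavailable.

```python
# Toy model; the book titles and function names are hypothetical.
BOOKS = ["Ulysses", "Emma", "Dune"]


def some_scopes_over_negation(read, books):
    """some > not: there are books that were not read.
    This is the reading available for (63d)."""
    return any(book not in read for book in books)


def negation_scopes_over_some(read, books):
    """not > some: no books were read.
    This is the reading that (63d) cannot express."""
    return not any(book in read for book in books)


read_one = {"Emma"}   # one book read, two unread
read_none = set()     # nothing read

print(some_scopes_over_negation(read_one, BOOKS))   # True
print(negation_scopes_over_some(read_one, BOOKS))   # False
print(negation_scopes_over_some(read_none, BOOKS))  # True
```

The situation in which exactly one book was read is the one that separates the readings: the wide-scope reading is true there, while the narrow-scope reading is false.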

The sensitivity of some extends beyond negation as well. As Langacker points out (1991: 103), unlike most other indefinites, some does not allow a generic construal. NPs formed with a/an or with a null determiner can refer either existentially, to an indefinite instance of a type, as in (64), or generically, to a representative instance which stands for the type as a whole, as in (65); NPs with some allow only non-generic readings. (64)

a. Sally saw {a/some} wombat hiding under the bed. b. We have {some/Ø} wombats living in our attic.

(65)

a. {A/*Some} cat likes to chase mice. b. {*Some/Ø} cats are mammals.

The examples in (65) are ungrammatical specifically when some is understood generically. While an NP of the form some CN can denote a set of kinds (e.g. some (kinds of) cats have soft fur), such genera-denoting NPs are not in fact generic since they cannot refer to the full set of kinds encompassed in the extension of the CN. In effect, what they denote is a limited quantity of kinds. Both these limitations on the use of some – its polarity sensitivity and its antigenericity – reflect its status as a scalar operator. In particular, some conforms to the Informativity Hypothesis in that it always expresses a low, indefinite quantitative value and makes a relatively weak and uninformative contribution to an expressed proposition. As a result, some can refer only to a limited, indefinite instance of a category and is systematically excluded from contexts where its use would trigger reference across all instances of a category. In this sense (pace Krifka 1995), some really does appear to be the mirror image of any. While any is a low-scalar emphatic NPI, some is a low-scalar attenuating PPI. The scalar semantics of some is clearest where it functions as a strong, proportional determiner and can be construed as contrasting with other determiners like a little and a lot or most and all. Thus, in the examples below some indicates a positive instantiation of a nominal type, and it implicates that this instantiation could have been greater. (66)

a. I may not have a lot of money, but I do have some. b. Some of my friends actually like metrical phonology.

These examples are attenuating because in each case the expressed proposition formed with some contrasts with stronger propositions that might have been expressed with a different determiner. The contrast is explicit in (66a), where the speaker actually denies having a lot of money, but it is no less evident in (66b), where the use of some generates a scalar implicature that other friends probably do not enjoy metrical phonology. Uses like that in (66b) form the foundation for a variety of pragmatic enrichments which further clarify the inherent scalarity of some. Thus, if we substitute something like the weather in San Diego for metrical phonology – e.g. Some people actually like the weather in San Diego – we get an effect of pregnant irony, as the normal assumption that San Diego has beautiful weather allows some to obliquely convey ‘some and virtually all.’ The result is a case of genuine understatement, as opposed to mere attenuation. The effect depends crucially on the fact that some’s expressed low-scalar q-value allows for the possibility of higher q-values holding as well. The example in (67), from Sammy Cahn’s lyrics to Bei Mir Bist Du Schoen, illustrates another case where the attenuating semantics of some allows it to mean more than it seems to say. (67)

Of all the boys I’ve known, and I’ve known some
Until I first met you I was lonesome.

The implication here is that the singer has in fact known a large number of boys, but the attenuating some seems to soften the claim: it is not a boast, but a concession. The singer, in expressing her affection for one boy, does not want to stress too much the range of her prior intimacies. The delicate rhetoric of the situation thus favors the use of the relatively uninformative some. The attenuating some is circumspect in its claims, but it nonetheless invites the hearer to make the strongest possible interpretation of the singer’s intended meaning. It is not uncommon for attenuating constructions like some to have such pregnant uses, and presumably it was rhetorically loaded uses of just this sort that led to the conventionalization of the emphatic some construction. A similar sort of shifting through repeated understatements from a basically attenuating to a largely emphatic sense can also be observed in the evolution of degree modifiers like rather and pretty, and is in fact quite common with attenuating constructions in general (Israel 2006; Margerie 2007). While examples like those in (66–67) reveal a fine line between attenuation and understatement in the use of basic some, emphatic some (e.g. that was some party!) may be closer to the spesumptive use (e.g. I guess he went to some party) since these are the two uses where some combines with a singular count noun. But if, as seems likely, emphatic some is a sort of short-circuited understatement, then spesumptive some must itself be an attenuating operator, despite the fact that it seems to lack any quantitative force at all. Instances of basic some can generally be construed as denoting a relatively small quantity of some sort, and as contrasting with other determiners like many, most, and all, which denote larger quantities. In this respect, basic some does indeed appear to be the attenuating counterpart of polarity sensitive any. But spesumptive some seems just to denote an indefinite individual of some type. The question is, in what sense can reference to a quasi-specific indefinite individual be construed as scalar? The evoked scale in such uses actually has nothing to do with quantity or size; rather, as happens with any, the scale itself is subjectively construed. What’s at issue is the degree to which an indefinite referent is individuated from other instances of the relevant type – or, more precisely, the degree of selective force which the determiner requires of a conceptualizer in order to pick out an instance. 
Spesumptive some is thus the weak scalar analogue to free-choice any: both combine primarily with singular count nouns, and both profile a minimally distinguished singular instance of a nominal type. The difference is that some is inherently attenuating and so cannot support inferences about other instances, while any is inherently emphatic and so must trigger inferences about alternative values. Any’s referent must thus be a true phantom indefinite, entirely indistinguished from other instances. Some’s referent, since it is quasi-specific, must be distinguished from other instances, but only minimally distinguished – that is, in some way that either is not, or perhaps cannot, be specified. The effect is particularly clear in the example below, discussed by Mazodier (1998), where some raises the question of a referent’s precise identity, even as it signals an inability or an unwillingness to make that identity known. (68) There were two rubbing sticks for making fire, two stones shaped roughly like knives, a woven-root container which held a few pounds of dried worms and the dead body of some rodent.

Some here seems to convey “it was a rodent, that much is clear, but what kind of rodent, I cannot (or will not) say.” The use is effectively attenuating in that it gives a limited amount of information (it was a rodent), even as it insists that more information could be given (what kind of rodent I cannot say). In accord with its inherently attenuating function, spesumptive some picks out a referent which is individuated to only a limited degree. In this sense it functions as a low-scalar alternative to a more fully individuating indefinite like a certain, which requires greater mental effort on the part of a conceptualizer to pick out a particular indefinite referent. In general, spesumptive some is a robust positive polarity item, and as such is awkward or ungrammatical in contexts like questions and conditionals which sometimes tolerate other PPIs, including basic some. This is because spesumptive some, by its very nature, can occur only where it has (or could have) a specific referent. Nonetheless, there is a common construction in which spesumptive some, or something very like it, does occur in a predicate nominal under negation, as in the examples below from the BNC. (69)

a. Doing the best or the right thing is central to the definition of agape love, because it allows us to see that love is not some sugar-plum notion – it is an investment in the future as well. b. A baby is not some final luxury item that consumers can afford (although this is obviously how the Government views it). c. I am not some petty chieftain with time on his hands to exchange chatter and gossip.

The use here is, however, consistent both with some’s status as a PPI and with its basically attenuating function – and in fact it is the spesumptive counterpart to the anti-indiscriminative (not just any) use of free-choice any. Crucially, in both cases the use of negation is always echoic or polemical rather than simply descriptive. Here the construction as a whole does not merely deny that a referent has some property, but serves rather to reject a thought which someone other than the speaker either had or might have had in the context. Much like anti-indiscriminative any, the not some in these examples is naturally expanded to not just some with no adverse effects to truth or felicity. As Horn notes, the construction basically serves to reject “a low-scalar assessment from the discourse context in favor of a high(er)-scalar upgrade” (2000a: 157): the point is not to emphatically assert a low opinion, but rather to reject a low opinion which is either explicit or easily imaginable in the context. And in this sense, the use of some here is in fact perfectly uninformative, since the rejection in itself does nothing to clarify how one should think of the referent in question.

The use of some in which it seems hardest to discern any scalar semantics or rhetorical force is also arguably its most basic use. Where it appears as an unstressed, weak determiner, as in the BNC examples below, some often seems just to introduce a referent of some sort – an indefinite mass or plural instance – without implying any conspicuous contrast with other determiners. (70)

a. On his way across the waste ground he tripped over some rusty car parts and was injured. b. Soon they plan to revamp the kitchen and perhaps invest in some unusual modern furniture, which is a particular passion of Bill’s. c. A lulling orchestral accompaniment, some smokey crooning and an ingratiating melody gatecrash your approval on first hearings and then slowly reveal the tumult of fury that seethes beneath the surface.

Here, the references to “rusty car parts,” “unusual modern furniture,” and “smokey crooning” do not suggest either that such entities are in short supply or that one might have expected there to be more of them. Nor do they make the sentences in which they appear noticeably uninformative. But then in what sense does the use of some in such examples count as an attenuating scalar operator? Strikingly, the contrast between the two sorts of basic some ([sʌm] and [sm]) neatly parallels the contrast between the stressed (emphatic) and unstressed (neutral) versions of the indefinite NPIs noted above (§7.3). Just as stress makes the liberal any and ever more conservative in their licensing requirements, so too does it strengthen the positive polarity proclivities of some. Thus, for example, while unstressed some can occur with questions or conditionals, as in (71), it is much less welcome in such contexts when it occurs with focal stress, as in (72). (71)

a. Would you like some pizza? b. If he knew something, I’m sure he’d tell us.

(72)

a. ??Would you like SOME pizza? b. ??If he knew SOMETHING, I’m sure he’d tell us.

Apparently, the more saliently these words appear in a sentence, the more conspicuously their sensitivities are felt. And this seems reasonable if their sensitivities really do reflect their scalar semantics, since a noticeable effect of emphasizing these forms is to highlight the ways their use in a sentence contrasts with alternative scalar expressions. But just because these scalar contrasts are sometimes not so salient doesn’t mean they aren’t always there. The sentences in (70), for example, all mean something different from what they would if some were replaced by, say, a

lot of: indeed, they express a much less informative proposition. And while the choice of some may feel more or less neutral in such examples, its use is also, and in fact always, consistent with the requirement that it should profile a limited, indefinite instance of a nominal type. For this reason, even the most unassuming uses of some can trigger scalar implicatures. Thus, in the minidialogue below, Bill’s non-committal use of an unstressed sm invites the inference that he did not do the specific reading he is being asked about. (73)

Ann: Did you do the reading for class today? Bill: I did sm reading.

The implicature, of course, depends on the way Bill’s response implicitly contrasts with other things he might have said in the context. But if these contrasts were not somehow inherent in the use of some, it is unclear how they could ever arise in the first place. I conclude that some is, like the other indefinite polarity items, an inherently scalar operator, and specifically that it is the attenuating counterpart to the emphatic any. Even in its most pragmatically unmarked uses, attenuation is an essential feature of some’s meaning. Some profiles an indefinite instance of a nominal type construed against a set of potential alternative instances of that type. The meaning expressed by some, in context, is always less specific, and hence less informative, than the relevant potential alternatives, and some requires that the proposition to which it contributes must be less informative with respect to these other instances.

7.6

The limits of free choice

Despite the many parallels between the determiner any and its emphatic adverbial counterparts ever and at all, there is one glaring difference: the adverbial NPIs ever and at all seem to lack the free-choice readings for which any is so famous (though at all can serve to emphasize the freedom implied in a free-choice NP – e.g. any time at all, anything at all, etc.). (74)

a. *Sally would ever enjoy walking in the rain. b. *Linguists can be at all annoying.

The absence of such uses is particularly striking since both ever and at all do have (or have had) uses in which they express something akin to universal quantification (e.g. 13–14 in §7.2). But while at all has in the past been used to mean ‘altogether’ or ‘entirely,’ there is no evidence of it ever meaning ‘in any respect’ or ‘to any degree’; and though ever still has uses with a sense

like ‘continually’ or ‘at all times,’ at no time has it been used to mean ‘at any time.’ It has been suggested (e.g. Israel 1998b) that the asymmetry between adverbs like ever and determiners like any appears to be an idiosyncratic fact about the English lexicon, and as such, that it may count as evidence against a unified analysis of any. But as Horn (2000b) notes, the absence of a free-choice reading appears to be a systematic feature of adverbial indefinites. Thus, as Horn notes with the examples below, where any functions adverbially, it lacks a free-choice use. (75)

a. This doesn’t help us any. b. If it’s any better (worse, bigger),… c. Is it any good (any use / *any bad / *any blue)? d. It isn’t any different (*any similar).

(76)

a. *This {may/could/might} help us any. b. *He could be any better. c. *“The Phantom Menace” might be any good, for all I know. d. *Any good (at all) can come of this.

In fact, there is no deep conceptual incompatibility between a free-choice meaning and the types of semantic entities which adverbials denote (i.e. for at all, extents or degrees; for ever, cases or occasions). Thus, it is easy to find paraphrases with any that express a quodlibetic selection over random occasions, as in (77), or over extents, as in (78); but these sorts of meaning cannot be expressed with ever or at all. (77)

a. I can quit smoking any time (I want to). b. *I can ever quit smoking.

(78)

Yasmin’s smoking is very unpredictable. a. She may smoke almost any amount on any given day. b. *She may smoke almost at all ever.

This is all the more striking given that both ever and at all sometimes play supporting roles in the expression of free-choice semantics: at all can intensify the force of free-choice any (e.g. in Come by any time at all!), and ever combines with wh- words to form a whole paradigm of free-choice pronouns including whatever, whoever, whenever, wherever, and whichever. But if ever and at all are compatible with the free-choice semantics of these pronominal constructions, why can’t they express a free-choice meaning on their own? The answer, presumably, has something to do with the fact that these forms are adverbials, and something to do with the nature of free-choice reference

itself. Building on the assumption that FC any is a kind of emphatic counterpart to a plain generic indefinite, Horn (2000b) suggests that adverbial NPIs like ever lack free-choice readings because their non-emphatic counterparts can never be generic: since the plain indefinite once has no generic use, there can be no generic use for ever either. This seems reasonable if, as argued above, FC any really is a sort of inherently generic indefinite. But then FC any also occurs in contexts which do not allow a generic reading for plain indefinites – e.g. with the modals in (51–54), with a subtrigger in (57), and notably in (77–78), where substituting a for any leads to ungrammaticality. The question is, what gives FC any its generic force in these contexts and yet precludes such construals with adverbials of any sort? The answer, I suggest, concerns the conceptual structure of genericity itself. Specifically, I claim that the conceptualization of a generic sentence requires two distinct mental operations: first, the imagination of a kind, and second, the predication of a property. In other words, a generic sentence necessarily expresses a categorical judgment (Kuroda 1992: 42ff.), in which one first imagines a category (the categorical subject) and then either affirms or denies some property (the predication) of that category. Categorical judgments in general are distinguished from thetic judgments, in which one simply conceives of a situation as a whole and judges it to be or not to be the case without clearly distinguishing subject and predicate. The former are double judgments (Doppelurteil); the latter are simple judgments (einfache Urteil). The basic intuition here is that free-choice reference in general requires a kind of double judgment: first, one conjures up some freely chosen instance of the relevant kind, and then one affirms some property of the conjured-up instance. 
If free-choice reference requires the construal of a freely chosen instance as the subject of a categorical judgment, then adverbial expressions can never have free-choice uses for the simple reason that they are modifying constructions – they can never be construed autonomously and so can never function as the subject of a predication, either syntactically or conceptually. An adverbial profiles a property of a property, and as such is always construed within the overall conceptualization of some higher-order property. As the examples above show, free-choice reference to occasions and degrees is always possible (if somewhat unusual), but it requires one first to imagine a set of occasions or degrees from which one might have to choose. The effect is achieved where ever is incorporated in the focus of a free relative clause (e.g. whenever you say, however hard you try), but if an adverbial like ever or at all modifies some constituent inside a clause, the relevant occasions and degrees

are just properties of a higher-order profiled relation and so cannot themselves be singled out for consideration. It is worth noting that my proposal here is, at least superficially, at odds with Horn (1997) and Tovena (1998), both of whom have argued that free-choice construal cannot involve a categorical judgment on the grounds that categorical subjects must be referential while any phrases notoriously lack existential import. As Tovena puts it, “since thetic statements do not contain a predication base, and any does not support referential links, then, at least in principle, the two should go together” (1998: 212). She and Horn both equate the “aboutness” of a categorical judgment with its referential import, and so both conclude that any is incompatible with a categorical judgment. But the crucial notion of ‘aboutness’ here is not, I contend, a matter of reference but rather one of construal. The point is that free-choice reference requires two distinct acts of imagination – first, the selection of a “freely chosen” instance, and second, the attribution of an essential property to the selected instance. I thus propose that FC any consists in nothing more than the use of a phantom indefinite as the subject of a categorical judgment. This might help explain the phenomenon of subtrigging, where a relative clause or other postnominal modifier can save an unlicensed any in an episodic sentence (as in 61c, above, or in he believed anything she said). As noted above, such uses allow an any NP to be construed quasi-generically as denoting the kind of referent that participates in an event (e.g. ‘things such that she said them’) rather than a set of particular participants in that event. But why should postnominal modifiers have such an effect? 
The reason, I suggest, is that postmodifiers, unlike prenominal modifiers, basically function as secondary predicates and so require the nominal head they modify to be individuated and construed independently of the modifying clause. I take it that this extra conceptual step gives the postmodified nominal a categorical construal and so allows it to function as the subject of a generic judgment within the NP it heads. The idea that free-choice reference requires a categorical judgment also fits neatly with the idea that spesumptive some is the attenuating counterpart of FC any (§7.6). Basically, if there is anything to this idea, then the two constructions should involve the same kind of construal and differ only in their rhetorical force. In effect, this predicts that spesumptive some, like FC any, should always denote the subject of a categorical judgment, but unlike FC any, should never be construed generically. Indeed, this appears to be a plausible description of the spesumptive construction and may explain its preference for predicates which support a stage-level construal (as in 79a) and

its awkwardness in characterizing contexts with individual-level predicates (as in 79b). (79)

a. Some guy just stole my bike. b. *Some guy is very inconsiderate.

I take it this reflects the fact that with an individual-level predicate, the indefinite subject is liable to be construed generically, but with a stage-level predicate the indefinite must denote a specific (even if still unknown) individual. I will not pursue the idea further here, but leave it as an open hypothesis that what is special about both free-choice any and spesumptive some is that they profile a singular indefinite instance of some type as the subject of a categorical judgment.

7.7

Indefinite conclusions

A theory of polarity items cannot content itself with a few distributional constraints. Ultimately, the goal is not to explain the distribution of any one polarity item but to elucidate the grammatical forces which lead different forms in different languages – or even in the same language – to repeatedly arise and evolve along similar lines. The English sensitive indefinites are a close-knit family, and the close relations among them provide clear evidence that the many uses of any reflect its basic status as an indefinite scalar operator. Developing an idea on any from Langacker (2002), I have proposed that all these forms encode low quantitative values on a subjectively construed scale of indefiniteness: they prompt a conceptualizer to exert (just) the minimal effort it takes to imagine an indefinite instance. The proposed analysis explains at least some of the liberal tendencies of indefinite NPIs, some of the vagaries of free-choice meaning, and some surprising analogies between the emphatic indefinites and their attenuating PPI counterparts. I have argued that the two major divisions in the uses of any – the split between free-choice and polarity sensitive uses, and the split between neutral and expressively emphatic uses – find parallels in the divisions of some, with unstressed some (aka sm) the attenuating counterpart of neutral any, and spesumptive some the counterpart of free-choice any. The various uses of any reflect the various ways a minimally individuated (or, equivalently, a maximally indefinite) entity can be construed emphatically, and the various construals available for any recur systematically with other sorts of sensitive indefinites. These forms differ from other polarity items in

that the scales they evoke are more abstract and more liable to be subjectively construed, as implicit in the act of imagining an indefinite referent. Their special sensitivities reflect their special indefinite semantics, but it is not so much a matter of indefinite reference as one of indefinite construal – of the way, that is, that content is presented to consciousness.

8 Polarity and the architecture of grammar

I fear we are not yet rid of God, for we still believe in grammar. Nietzsche, Twilight of the Idols

8.1

High stakes grammar

Polarity items and polarity sensitivity have been a flashpoint in debates about language and the structure of grammar since the earliest days of generative linguistics. While theories have proliferated and the data have mushroomed in complexity, the terms of the debate have remained fairly constant. The goal is to explain why polarity items should pattern the way they do and, more particularly, to discover what it is in a speaker’s mental grammar that conditions these peculiar patterns. For the most part, the debate has centered on the nature of grammatical representation itself, and on the sorts of constructs which must be available within a theory of grammar in order to explain the complex constraints on polarity items. The significance of this debate is not difficult to see: whatever constructs are needed to explain polarity sensitivity in particular should also, and indeed must, be part of the theory of grammar in general. Polarity sensitivity thus becomes a diagnostic for the structure of language itself, and the debate about polarity items ends up being a debate about the architecture of grammar. The very existence of polarity items raises questions about the nature of human language and its relation to cognition generally. Polarity items are not the sort of thing one would expect to find in a well-designed language, and one may well wonder what it is about human language or the human mind that makes such unlikely things so surprisingly commonplace. How one goes about answering this question depends in some part on how one views the relation between grammatical competence narrowly construed and the rest of cognition: is grammar best conceived as an autonomous cognitive module, or as a structure which somehow depends on the operation of other, more general cognitive abilities? This is a big, unwieldy question which surely will not

be resolved on the basis of a single grammatical phenomenon. In practice, one’s position here may be as likely to influence one’s view of the evidence as the evidence itself is to determine one’s position. Still, whatever one’s views on the matter, there seems to be a broad consensus that the phenomenon of polarity sensitivity may shed useful light on the relation between grammar and cognition. The Scalar Model of Polarity takes a strong position on this relation. I have argued that the grammar of polarity sensitivity is grounded in the basic logic of scalar reasoning – that the existence of polarity items and the constraints on their distributions ultimately reflect the ways very general conceptual abilities are deployed in the negotiation of communicative interaction. Language and cognition, on this view, are inextricably interrelated and interdependent. This is, of course, quite the opposite of the canonical view within generative linguistics, where it is widely assumed that the principles of grammatical structure are fundamentally different from those which govern the rest of cognition. In fact, the various major treatments of polarity in the literature reflect in miniature the range of possible positions in this larger debate. My basic claim in this chapter is that as theories of polarity have grown more sensitive to pragmatic and extra-linguistic aspects of the phenomenon, they have also advanced in their empirical coverage and explanatory force.

8.2

Terms of the debate

In generative linguistics, theoretical work on polarity sensitivity has itself tended to polarize around two competing approaches, one viewing the phenomenon essentially as a matter of syntax, the other seeing it as one of semantics. The basic split is already visible, at least embryonically, when Klima (1964) defines licensing structurally, as a constraint on syntactic representations, but then presents the crucial notion of affectivity as a “grammatico-semantic property” of licensing contexts (emphasis added). Perhaps inevitably the basic tension between syntax and semantics implicit in Klima’s work spawned two very different traditions of analysis, representing two very different views on the nature of grammar. The syntactic approach (e.g. in Linebarger 1980, 1987, 1991; Laka 1990; Progovac 1992, 1994; Uribe-Etxebarria 1994) builds on Klima’s contention that NPIs must occur “in construction with” (i.e. roughly, be c-commanded by) negation. In syntactic accounts, constraints on polarity items apply at a level of syntactic representation distinct from, and independent of, the meaning a sentence may convey, and grammar itself is understood as being largely,

if not entirely, a matter of syntax. The assumption is that linguistic generalizations are best formulated in terms of formal syntactic structures, and that notions like meaning and communication have little if any significance for a theory of linguistic structure. The semantic approach (e.g. in Ladusaw 1979, 1983; Hoeksema 1983, 1986; Dowty 1994; Jackson 1994; Zwarts 1996, 1998; van der Wouden 1997) starts with Klima’s idea that affective contexts share some general semantic property. On these accounts polarity licensing depends directly on a sentence’s semantic representation, and only incidentally on the syntactic structure of a licensing context. The standard assumption here is that meaning, and particularly truth-conditional meaning, is an important part of grammatical structure. Despite the significant differences which divide them, there is a sense in which most generative theories of polarity sensitivity, whether syntactic or semantic, are united in their basic conception of grammar. Proposals in both approaches have tended to be structural in nature, conceiving of the constraints on polarity items as well-formedness conditions for linguistic representations – that is, as constraints on the formal structures associated with sentence types. In these terms, the difference between a semantic approach and a syntactic approach is just a matter of the types of symbolic representations in which the constraints are formulated. Syntactic accounts are necessarily structural, since they depend entirely (or at least, crucially) on the representation of syntactic structures. But semantic accounts tend to be structural as well, for the common assumption among them is that the meaning of a sentence is itself a formal structure, a logical representation algorithmically assigned by the grammar. It is at this logical level of representation that most semantic accounts since Ladusaw (1979) formulate the laws of polarity sensitivity. 
The assumption that the meaning of a sentence is itself a separate level of linguistic representation is by now so widely accepted as to seem beyond questioning. It seems like a natural assumption, and for this reason it also seems like an innocent one. But the existence of such a level is hardly a theoretical necessity. If meaning is understood as the communicative potential of a sentence, it is clear that the semantic representation of a sentence cannot exhaust its meaning. This is not, in fact, a controversial point. It is a truism that speakers regularly and systematically convey much more than they actually say, and that a sentence’s “literal meaning” is at best only the starting point for a complicated process of pragmatic interpretation. It is this process which allows speakers to compute conversational implicatures, understand indirect speech acts, accommodate obvious exaggerations, and generally interpret all sorts of figurative language. No one can seriously doubt the importance of such abilities for any

complete account of language understanding. And no one seriously disputes the claim that at least some such abilities are not specifically linguistic but rather reflect general cognitive capacities for reasoning and communication. The real question here is whether, and to what extent, such general cognitive capacities function as an integral part of grammatical competence. If, on the one hand, pragmatic processes of interpretation depend on the formulation of a literal, truth-conditional meaning as a distinct level of semantic representation, then pragmatics in general may be seen as operating outside of the grammar itself. And if the constraints on polarity items really do require specific reference to the logical structure of semantic representations, this would constitute compelling evidence that such logical forms do play a significant role in the architecture of grammar. But if there is no such level of logical form independent of pragmatics, then the grammatical constraints on polarity items cannot be formulated at such a level and so presumably must depend, at least in part, on pragmatic factors. The Scalar Model of Polarity challenges the fundamental assumption that polarity licensing depends on linguistic representations of any sort, whether syntactic or semantic. While the scalar approach agrees with formal semantic theorists in viewing polarity licensing as a matter of meaning, it parts company with many in its conception of what meaning is. Ultimately, I claim that the evidence from polarity sensitivity supports a view of grammar in which there is no specifically linguistic level of sentence meaning, and in which the “literal” truth-conditional meaning of a sentence has no significant theoretical status independent from the pragmatic processes which support its interpretation in context. 
Rather than depending on static, truth-conditional semantic representations, I suggest that polarity licensing involves a dynamic process of meaning construction in which speakers and hearers use symbolic resources of all sorts, including both conventional linguistic constructions and context-specific inferences and affordances, in the on-line elaboration of coherent conceptualizations. My conclusions here, and indeed some of my argument, should be reminiscent of Fauconnier’s early work (e.g. 1977, 1980). They are also deeply consistent with work in cognitive semantics (e.g. Fauconnier 1997; Fauconnier & Turner 2002; Croft & Cruse 2004; Sweetser & Dancygier 2005; Verhagen 2005), and with recent work in formal semantics and pragmatics which takes a dynamic view of meaning construction (e.g. von Fintel 1999; Beaver & Clark 2003, 2007; Roberts 2004). Still, in most work on polarity since Fauconnier’s first efforts, the conventional wisdom remains that the grammar of sensitivity consists in constraints on linguistic representations and that the role of

pragmatics in licensing, if there is one, is purely extra-grammatical. It is this idea, in particular, which I call into question here.

8.3

The syntactic approach

Syntactic accounts of polarity licensing hold that the constraints on polarity items depend solely on the syntactic structure of their licensing contexts. This is a strong claim. Any theory of licensing based on semantics or pragmatics must allow some role for syntax, broadly construed, in the determination of sentence meaning. Otherwise there would be no explaining the fact that people often create novel sentences whose meanings systematically reflect the meanings of their parts. But a syntactic account can aspire to theoretical purity by refusing any role for meaning and keeping its syntax strictly autonomous. Syntactic accounts view polarity licensing as a structural relation between a polarity item and its licensor at some level of syntactic representation, and they view negation, or some syntactic corollary of negation, as the primary licensor for all polarity items. They tend to differ primarily on the level of syntactic structure at which licensing is held to apply, and, to a lesser degree, on the precise formulation of the licensing relation itself. The constraints on polarity items have been taken to hold at the level of logical form (LF) (Linebarger 1980, 1987, 1991; Mahajan 1990; Uribe-Etxebarria 1994), at S-structure (Laka 1990), or at some combination of the two (Progovac 1992, 1994). The problem of licensing in non-negative contexts often gets short shrift in these accounts and is typically handled by some secondary licensing mechanism, as in Baker (1970) and Linebarger (1980), or by an invisible syntactic operator, as in Progovac, or else it is basically just ignored, as in Laka (1990) and Uribe-Etxebarria (1994). This narrow focus might be justified by the intuition that negation is, in some sense, the prototypical polarity licensor, or by the fact that many languages have items which can only be licensed by negation. 
The assumption seems to be that the grammatically significant phenomenon is the special relation between negation and polarity items, and that the relation of negation to other sorts of licensors is either unimportant or uninteresting. For many researchers in the generative tradition, polarity sensitivity seems to be interesting primarily because it involves a special kind of dependency and therefore constitutes a special source of evidence for a theory of syntactic dependencies. From this perspective, the fundamental issue in the study of polarity items is really the licensing problem. The important question is not so much what polarity items are, or why they exist, or even what makes something a polarity licensor, but rather what polarity items can tell us about the nature

of syntactic structure, and how polarity licensing relates to other long-distance dependencies like constraints on wh-movement and pronominal anaphora.

One should take care in comparing the virtues of competing analyses from fundamentally different frameworks. Since many theorists in the syntactic tradition do not seem particularly interested in the diversity of polarity licensors or in the complicated variation among polarity items in a single language, it is perhaps unfair to make too much of their failing to explain such phenomena. They might, after all, raise similar objections against an account like my own which fails to address the relation between polarity licensing and the syntactic structure of inflectional categories. Still, it would be wrong to assume that different theoretical frameworks are necessarily incommensurable. At least two theorists within the syntactic tradition have made significant attempts at explaining the variety of polarity triggers besides negation, and it is worth considering what these approaches may reveal about the phenomenon.

8.3.1 Progovac: polarity and binding

The work of Progovac is in many ways an exemplary attempt to relate the problem of polarity licensing to a larger set of issues in theoretical syntax. Progovac (1992, 1994) offers a theory of polarity which views the constraints on polarity items as a special case of the principles of Binding Theory. Progovac’s basic goal is to contribute to a restrictive model of Universal Grammar by assimilating the conditions on two apparently very different phenomena, the distribution of pronouns and anaphors, on the one hand, and the distribution of polarity items, on the other, to a single set of syntactic principles. I will focus here on the narrower issue of Progovac’s proposed solution for the licensing problem (for fuller reviews see Horn & Lee 1995; Hoeksema 1996).
Binding Theory is supposed to explain the distributions of anaphors and pronominals, but as Progovac points out, there is nothing inherent in the theory which limits its principles to just these sorts of forms. Anaphors (e.g. reflexives like herself or himself), which can only occur in close construction with a coreferential NP, are like NPIs, which can only occur in close construction with an appropriate licensor. Pronouns, which must be free from any coreferential NP, are like PPIs, which must in some sense be free from negation. Progovac formulates the basic idea as follows (1994: 6–7):

i) NPIs are subject to Principle A: they must be bound to negation (or other [sic] truth-functional operator) in their governing category.
ii) PPIs are subject to Principle B: they must not be bound to (fall within the scope of) negation (or a truth-functional operator) in their governing category.

Progovac views the potential binders for polarity items as functional categories, either negation in Infl (i.e. an inflectional phrase), or a null polarity operator, Op, in Comp (i.e. a complementizer phrase). The null operator is intended to explain the phenomenon of non-negative polarity licensing, and it plays a crucial role in her theory; however, Progovac remains agnostic as to what it is that licenses Op itself and so provides no real generalization about the class of polarity licensors as a whole. As she says, her main concern is with the locality conditions which hold between NPIs and their licensors (1994: 63–4). In this sense she seems to view the licensor question as less theoretically important than the licensing relation problem.

Progovac argues that in order for a polarity item to be licensed by Op, it must raise to Comp at LF, since otherwise its first potential antecedent will be in Infl. She thus makes two predictions: first, that licensing in non-negative contexts should be a property of clauses, rather than phrases; second, that in order for an NPI to be licensed by Op, it must be the sort of thing which can undergo quantifier raising. The first prediction is supported by contrasts like that in (1), where the adversative predicate deny licenses NPIs in a clausal complement in (1a), but not in direct object position in (1b).

(1) a. Natasha denied that she intended to steal anything.
    b. *Natasha denied anything.

But there are plenty of cases where NPIs are licensed in the same clause as an adversative predicate (Kas 1993; Krifka 1995), as in the examples below, adapted from Horn and Lee (1995: 410).

(2) a. Natasha {denied/*admitted} any involvement in the conspiracy.
    b. Natasha {avoided/*accepted} any responsibility for the defeat.
    c. This proposal {lacks/*has} any content.

The second prediction reflects the role of Op in Progovac’s theory. Since Op can only bind an NPI in its local domain, only NPIs which allow raising can be bound by Op. NPIs which cannot raise at LF should only occur with local negation and be barred from non-negative polarity contexts (1994: 80). Thus only quantificational NPIs like any should ever allow licensing by a non-negative trigger. This prediction is dramatically wrong. Polarity items of almost any syntactic type occur with non-negative triggers: a few possible examples include the conjunction let alone, the degree adverbial so much as, the indefinite adverbial ever, the verbal idiom bother to, and the punctual use of until. None of these

is of a sort which is liable to raise at LF, but all of them are easily licensed in non-negative contexts.

(3) a. I’d be surprised if Beth ate salmon, let alone jellyfish.
    b. I’d be surprised if she so much as looked at me.
    c. I’d be surprised if Alan ever bothered to thank us.
    d. I’d be surprised if she gets here until this weekend.

Obviously, my remarks here can do justice neither to the complexity of Progovac’s theory nor to the breadth of its empirical coverage. Progovac deserves credit for her basic insight that polarity licensing in many ways parallels the conditions on pronominal anaphora, and more, for supporting that insight with an impressive wealth of cross-linguistic data. Still, the considerations here do suggest that Progovac’s approach may not be the best way to draw these parallels.

8.3.2 Linebarger: syntax and pragmatics

Linebarger (1980) argues that polarity licensing is fundamentally a syntactic phenomenon, and that the precise sensitivities of polarity items provide strong evidence for a syntactic level of logical form. Her double-barreled approach to licensing builds on Baker (1970) by proposing that polarity items are licensed primarily by negation but may be triggered secondarily by an appropriate negative implicature in otherwise non-negative contexts. In later work (Linebarger 1987, 1991), she takes aim squarely at those who, like Ladusaw (1979, 1983), argue that polarity licensing is a matter of semantics. As she puts it, “the distribution of negative polarity items in English reflects an interplay between syntax and pragmatics, with no apparent role for a level of ‘pure’ semantic representation” (Linebarger 1987: 326).

For Linebarger, the basic constraint on polarity items is strictly configurational: NPIs must occur in the immediate scope of negation at the syntactic level, LF. Linebarger’s Immediate Scope Constraint (ISC) states that

    A negative polarity item is acceptable in a sentence S if in the LF of S the subformula representing the NPI is in the immediate scope of the negation operator. An element is in the immediate scope of NOT only if (1) it occurs in a proposition that is the entire scope of NOT, and (2) within this proposition there are no logical elements intervening between it and NOT. (Linebarger 1987: 338)

The ISC is Linebarger’s answer to Ladusaw’s (1996) Licensor Relation Question and is intended to explain two facts about NPI-licensing. First, it predicts that negation in a matrix clause should not license NPIs in a subordinate clause, as in (4), because in such cases the proposition containing the NPI will not constitute the entire scope of NOT.

(4) a. *Kristen is not the woman who I am the least bit interested in marrying.
    b. *She wasn’t ready for the final because she had cracked a book all quarter.
    c. *She didn’t stay home so that she could lift a finger to help me.

Second, it predicts that certain types of logical elements will block licensing if they come between NOT and an NPI. This explains the contrasts below, where the careless addition of a quantifier in the wrong place seems to lead to licensing failure.

(5) a. Jane didn’t contribute a red cent to the Linguist List.
    b. *Jane didn’t contribute a red cent to every charity.

(6) a. Ted usually wouldn’t lift a finger to help his friends.
    b. *Ted won’t always lift a finger to help his friends.
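The intervention clause of the ISC lends itself to a rough procedural sketch. The Python below assumes, purely for illustration, that an LF can be flattened into a list of logical elements ordered from widest to narrowest scope; the function name and the scope orders assigned to (5) and (6) are hypothetical and not part of Linebarger's formulation. Only clause (2) of the ISC is modeled; the "entire scope" clause is left out.

```python
def in_immediate_scope(lf, npi="NPI", neg="NOT"):
    """Return True if `npi` sits directly below `neg` in the scope order `lf`.

    `lf` lists logical elements from widest to narrowest scope. Only the
    intervention clause of the ISC is modeled (an assumption of this sketch).
    """
    position = lf.index(npi)
    return position > 0 and lf[position - 1] == neg


# Hypothetical scope orders for examples (5) and (6).
examples = {
    "5a": ["NOT", "NPI"],             # a red cent directly under negation
    "5b": ["NOT", "EVERY", "NPI"],    # every intervenes
    "6a": ["USUALLY", "NOT", "NPI"],  # usually scopes over negation
    "6b": ["NOT", "ALWAYS", "NPI"],   # always intervenes
}
verdicts = {label: in_immediate_scope(lf) for label, lf in examples.items()}
```

On these assumed scope orders the sketch accepts (5a) and (6a) and rejects (5b) and (6b), matching the judgments reported above.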

It also explains why the ambiguity in (7a) does not arise in (7b), where the indefinite article a is replaced by the NPI any. While (7a) can mean that not every child was read a story, (7b) can only mean that there are no stories that Gwyneth read to every child.

(7) a. Gwyneth didn’t read every child a story.
    b. Gwyneth didn’t read every child any stories.

The impossibility of the first reading for (7b) is explained by the fact that the logical operator every blocks licensing by negation when it intervenes between NOT and the NPI at LF. While the ISC does effectively constrain the conditions under which NPIs are licensed by negation, it also rules out licensing in many contexts which frequently do trigger NPIs. To account for this situation, Linebarger proposes a secondary mechanism which allows NPIs to be licensed indirectly, by virtue of a negative implicature. The idea is that NPIs are themselves somehow inherently negative and that part of their contribution to sentence meaning consists of a conventional negative implicature. In sentences where an NPI does not conform to the ISC, it may nonetheless be licensed if it effectively “alludes” to another sentence in which it does occur in the immediate scope of negation. According to Linebarger, in non-negative contexts, the NPI is licensed because it does occur in the immediate scope of negation in the LF of an implicature. Thus the NPI lift a finger can be licensed with the adversative be surprised in (8a) by virtue of the negative implicature (NI) in (8b), and sleep a wink is licensed in the before clause of (9a) because its use there will convey the implicature in (9b).

(8) a. I’m surprised that she would lift a finger to help that jerk.
    b. NI: I expected that she would not lift a finger to help that jerk.

(9) a. The phone rang before he managed to sleep a wink.
    b. NI: When the phone rang, he had not managed to sleep a wink.

Linebarger’s proposal has come under attack for the unconstrained nature of a licensing mechanism based on implicature (Krifka 1992; Kadmon & Landman 1993; Yoshimura 1994; Horn 1996; among others). The worry is that without a precise account of how negative implicatures are generated, the theory is immune from disconfirmation. Linebarger is not insensitive to such concerns, and she provides three constraints on what can count as a negative implicature for the purpose of licensing a polarity item (Linebarger 1991: 166). The availability requirement demands that a speaker be actively attempting to convey the NI. The strength requirement demands that the truth of the NI “must virtually guarantee” the truth of the overtly expressed proposition. And the foreground requirement demands that neither the NPI nor the NI occur as background information in the conversational context.

It is unlikely that these constraints are enough to save the theory. The problem is to show why negative implicatures are available where the theory needs them to license NPIs, and why negative implicatures are not available elsewhere, in contexts which do not license NPIs. Horn (1996) points out an outstanding empirical problem here with respect to the licensing potential of forms like almost and barely.

(10) a. Clara barely said a word to me at the party.
     b. *Clara almost said a word to me at the party.

The problem here is that whereas (10a) clearly conveys that Clara did, in fact, say at least one word, (10b) just as clearly conveys that Clara did not succeed in uttering anything quite so elaborate. While the licensing failure in (10b) might be explained by the strength requirement (the NI that Clara didn’t say a word would hardly seem to guarantee the truth of the proposition that Clara almost said a word), it is nonetheless difficult to discern a potential implicature which could explain the licensing success in (10a). Even assuming that the notion of negative implicature could be amended to explain these examples, there is something fundamentally odd about Linebarger’s conception of the syntax–pragmatics interface and her assumption that conventional implicatures come complete with their own syntactic representations: NPIs are licensed by negative implicatures when the NPI obeys the ISC in the logical form of the implicature. This is a peculiar assumption on just about any view of the autonomy of syntax. Trivially, if one doubts

the existence of a level of autonomous syntax, one will tend to assume that a constraint like the ISC, which is crucially stated in terms of such a level, cannot exist either. But if one assumes that there is a level of syntax independent of meaning and the pragmatic inferences that support implicatures, presumably one would expect syntactic constraints to apply only at this autonomous level of syntax. Linebarger, however, seems to assume a theory of grammar in which the output of the syntactic module feeds the semantic and pragmatic machinery of a general conceptual structure, which in turn feeds back into the syntax in a constant complex loop. But if implicatures come complete with syntactic representations, then it seems reasonable to assume that all thoughts must have syntactic representations. And if all thoughts come complete with syntactic structures, it seems odd, at best, to claim that syntax is autonomous from the rest of conceptual structure.

So there are some problems with Linebarger’s notion of negative implicature and with the role it plays in her theory. But while the idea of licensing by implicature may be untenable in principle, it may also be unavoidable in practice. As I have argued above and will argue further below (§9), polarity licensing often depends crucially on the pragmatic effects a polarity item can have in a particular context. The importance of Linebarger’s work lies in her demonstration that polarity licensing is not just a structural phenomenon, and that whether or not a polarity item is acceptable in a given context ultimately depends, at least in part, on what the speaker is trying to say. Thus while Linebarger held that licensing was primarily syntactic in nature, she also helped to establish the pragmatic basis of the phenomenon.

8.4 Semantic approaches

Even the most ambitious syntactic theory has to acknowledge that there are limits to what it can explain. As Ladusaw notes, “it has been clear since Klima (1964) that the range of expressions which license negative polarity items is beyond syntactic characterization” (1996: 327). The point is underscored by the fact that the only syntactic theorists who seriously confront the question of what makes something a polarity licensor, Linebarger and Progovac, both rely on extra-syntactic devices, negative implicature for Linebarger and a special null operator for Progovac, to explain the range of non-negative polarity contexts. The semantic approach to polarity sensitivity begins with an appreciation of Klima’s (1964) claim that polarity contexts are united by a common grammatico-semantic property. Although Klima himself did little more than

give this property a name, his insight paved the way for a project to give semantic substance to the notion of affectivity. There are by now a variety of proposals concerning the precise nature of this semantic substance, all of which are at least loosely united in the view that polarity contexts are defined in terms of their inferential properties. They are also united by their common intellectual debt to the work of Ladusaw, and it is with his work that I will begin.

8.4.1 The Monotonicity Thesis

By many measures the most successful theory of polarity sensitivity is that of Ladusaw (1979, 1983). Ladusaw’s theory has stood as the natural starting point for any semantic approach to polarity, and has spawned a small industry of research on the relation between polarity sensitivity and monotonicity (Hoeksema 1983, 1986; Kas 1993; Dowty 1994; Zwarts 1996, 1998; van der Wouden 1997; von Fintel 1999; and many others). Ladusaw’s thesis is elegantly simple. The basic claim is that polarity sensitivity is a sensitivity to logical monotonicity, and that negative polarity items are only licensed in the scope of a downward entailing operator. Basically, DE operators license inferences from general properties to specific instances, from sets to subsets. Negation, for example, is a DE operator because it licenses the inference in (11) from the general bird to the specific penguin.

(11) a. Beth didn’t see a bird in the garden. →
     b. Beth didn’t see a penguin in the garden.

DE operators are polarity reversers, because their characteristic property is that they reverse the sorts of inferences one could draw in a neutral context, where inferences normally run the other way, from specific instances to general cases, as in (12).

(12) a. Beth saw a penguin in the garden. →
     b. Beth saw a bird in the garden.

Formalizing Fauconnier’s earlier work on polarity and implication reversal (1975a, b, 1976, 1978), Ladusaw defines the class of polarity reversers as in (13a), and posits (13b) as a necessary condition for the occurrence of negative polarity items (1983: 383). Together (13a) and (13b) constitute what I will call the Monotonicity Thesis.

(13) a. An expression d is a polarity reverser iff its denotation function d′ is such that ∀X∀Y[X ⊆ Y → d′(Y) ⊆ d′(X)].
     b. A negative polarity item will be acceptable only if it is in the scope of a polarity-reversing expression.

The definition in (13a) here expands the concept of monotone decreasing quantifiers from Barwise and Cooper (1981) to define a class of monotone decreasing (or downward entailing or downward monotonic) operators of any semantic type. The statement in (13b) is given as a necessary, but not necessarily a sufficient, condition for polarity licensing. In addition to downward entailing operators, other operators may be either upward entailing or non-monotonic. Since UE operators do not reverse, but rather preserve the normal pattern of inferences, they are called polarity preservers.

(14) An expression d is a polarity preserver iff its denotation function d′ is such that ∀X∀Y[X ⊆ Y → d′(X) ⊆ d′(Y)].

In principle, in order to determine the monotonicity properties of a given construction C in a natural language L, one must specify the denotation of C within an algebra of meanings sufficiently precise to allow a demonstration of C’s entailment properties. This requires an interpretation procedure for some fragment of L which includes C, and which specifies the mapping from expressions in L to a set of model-theoretic representations (Ladusaw 1979). For our purposes, however, it will be enough to provide a quick heuristic for determining the monotonicity properties of natural language expressions. With a few significant exceptions (e.g. Ladusaw 1979; Hoeksema 1983; Sánchez Valencia 1991; Kas 1993; Zwarts 1996, 1998), most semantically oriented studies of polarity tend to rely on these sorts of informal tests. Generally, we can classify linguistic contexts directly in terms of the inferences they license. Since downward entailment involves reasoning from the general to the specific, DE operators should allow for the substitution of specific terms, or subsets, in place of more general terms, or supersets, without affecting truth conditions. Similarly, UE operators should allow for the substitution of more general terms, or supersets, in place of more specific terms, or subsets. And non-monotonic operators, those which are neither DE nor UE, will not allow substitutions in either direction, whether from subsets to supersets or from supersets to subsets, without affecting truth conditions. The basic ideas are summarized in (15), where T represents a linguistic context expressing a proposition and T[x] is the context T containing the expression x. The formula [x′ ⊆ y′] may be read as “the denotation of x is a subset of the denotation of y” and the formula T[x] → T[y] means “the proposition expressed by T[x] entails the proposition expressed by T[y].”

(15) Given a linguistic context T and linguistic expressions x, y, such that x′ ⊆ y′
     a. T is an upward entailing context iff T[x] → T[y]
     b. T is a downward entailing context iff T[y] → T[x]
     c. T is non-monotonic iff neither T[x] → T[y] nor T[y] → T[x].
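The heuristic in (15) can be tried out mechanically on a toy model. The Python sketch below assumes that a context T can be treated as a function from the denotation of an expression (a set of entities) to a truth value, and it tests the two entailment patterns over every pair of sets with x′ ⊆ y′. The entity names, the set SEEN, and the three sample contexts are hypothetical. Since only one small model is checked, the result can refute an entailment claim but cannot establish one in general.

```python
from itertools import combinations


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def classify_context(context, universe):
    """Classify `context` by testing (15) on every pair with x' ⊆ y'."""
    subsets = powerset(universe)
    pairs = [(x, y) for x in subsets for y in subsets if x <= y]
    upward = all(context(y) for x, y in pairs if context(x))    # T[x] → T[y]
    downward = all(context(x) for x, y in pairs if context(y))  # T[y] → T[x]
    if upward and downward:
        return "constant in this model"
    if upward:
        return "upward entailing"
    if downward:
        return "downward entailing"
    return "non-monotonic"


# Hypothetical toy model: three entities, two of which Beth saw.
ENTITIES = {"pingu", "tweety", "hedwig"}
SEEN = {"tweety", "hedwig"}

saw = lambda noun: bool(SEEN & noun)                 # "Beth saw an N"
did_not_see = lambda noun: not (SEEN & noun)         # "Beth didn't see an N"
saw_exactly_one = lambda noun: len(SEEN & noun) == 1 # "Beth saw exactly one N"

labels = [classify_context(t, ENTITIES)
          for t in (saw, did_not_see, saw_exactly_one)]
```

In this model the affirmative context comes out upward entailing, its negation downward entailing, and the "exactly one" context non-monotonic, in line with (11) and (12).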

Assuming that skim milk is a sort of milk, HPSG is a syntactic theory, the DNA Lounge is a dance club and LSD is an illegal drug, the examples below show that expressions like rarely, few people, be too Adj to VP, and the restriction of a universally quantified NP all create downward entailing contexts.

(16) a. Mookie rarely drinks milk. → Mookie rarely drinks skim milk.
     b. Few people understand the importance of a syntactic theory. → Few people understand the importance of HPSG.
     c. Rupert is too busy to be spending all night at a dance club. → Rupert is too busy to be spending all night at the DNA Lounge.
     d. Everyone who takes an illegal drug is morally suspect. → Everyone who takes LSD is morally suspect.

The examples in (17) suggest that expressions like often, many people, be Adj enough to VP, and the first argument of a quantified NP like at least five students all define environments which are upward entailing.

(17) a. Mookie often drinks skim milk. → Mookie often drinks milk.
     b. Many people understand the importance of the Minimalist Program. → Many people understand the importance of a syntactic theory.
     c. Glynda is bored enough to go to the DNA Lounge. → Glynda is bored enough to go to a dance club.
     d. At least five students who took LSD wrote a great thesis. → At least five students who took illegal drugs wrote a great thesis.

As should be clear, Ladusaw’s Monotonicity Thesis is motivated by the same basic generalizations which inspired Fauconnier’s scalar approach to polarity. Both approaches build on a recognition that polarity items are sensitive to the inferential properties of the contexts in which they occur, and that polarity contexts are united in terms of the inferences which they license. Both approaches, in other words, view polarity licensing in terms of meaning. Still, there are a variety of ways of thinking about meaning. For Fauconnier, and for the scalar approach generally, the sorts of meaning which define polarity contexts are pragmatic in nature, and so licensing is held to depend on the conveyed content of an utterance in context. Ladusaw takes a narrower view. For Ladusaw (at least for Ladusaw 1979, 1983) the inferencing which defines polarity licensors

is a formal semantic property of linguistic expressions, and licensing is held to depend specifically on a representation of a sentence’s conventional truth-conditional meaning. The choice between the two approaches thus comes down to the question of what aspects of meaning are relevant in defining grammatical constraints: is polarity licensing a matter of truth-conditional denotations alone (in Gricean terms, what is said), or is it a matter of non-natural meaning more generally, a property of the larger conceptual structures which can be expressed in the use of a given sentence?

8.4.2 A hierarchy of negative contexts

There is a significant advantage to Ladusaw’s conservative position that the relevant level of representation for polarity licensing is just the literal, truth-conditional content of a sentence’s meaning. Not only does it make the representation of polarity contexts formally tractable, it allows for the introduction of various fine-grained distinctions characterizing different polarity contexts. In the 1980s and 1990s, scholars at the University of Groningen (e.g. Hoeksema 1986; Kas 1993; Zwarts 1996, 1998; van der Wouden 1997) showed that downward entailment is only the weakest in a hierarchy of logically defined types of negative contexts. This result has allowed the development of an elaborated, multitiered version of the Monotonicity Thesis to explain the diversity of sensitivities found in polarity items.

The monotonicity hierarchy distinguishes three degrees of downward monotonicity, the strongest of which is antimorphicity. Antimorphic operators conform to both of De Morgan’s Laws in (18).

(18) A function f is antimorphic iff for all X, Y in the domain of f,
     i) f(X or Y) ↔ f(Y) and f(X)
     ii) f(X and Y) ↔ f(Y) or f(X)

Almost all antimorphic operators in natural language are expressions of sentential negation. For Dutch, van der Wouden (1997) lists the adverbs allesbehalve ‘anything but’ and allerminst ‘least of all, not at all,’ as antimorphic operators alongside the sentence negator niet. In French both ne … pas and ne … point express antimorphic operators. The examples below show that in English, auxiliary negation, as in (19), and the have yet to construction, as in (20), are both antimorphic.

(19) a. Jeremy doesn’t smoke or drink. ↔ Jeremy doesn’t smoke and Jeremy doesn’t drink.
     b. Jeremy doesn’t smoke and drink. ↔ Jeremy doesn’t smoke or Jeremy doesn’t drink.

(20) a. Gillian has yet to visit London or Paris. ↔ Gillian has yet to visit London and Gillian has yet to visit Paris.
     b. Gillian has yet to visit London and Paris. ↔ Gillian has yet to visit London or Gillian has yet to visit Paris.

Next in the hierarchy is anti-additivity, or regular negation. Anti-additive operators conform only to the first of De Morgan’s Laws. Since anti-additivity thus represents a looser standard of negativity than does antimorphicity, all antimorphic operators are also anti-additive.

(21) A function f is anti-additive iff for all X, Y in the domain of f,
     f(X or Y) ↔ f(Y) and f(X)

The negative quantifiers and adverbs nobody, nothing, and never are typical anti-additive operators. Other anti-additive contexts include the first argument of a universal quantifier like all or every, certain comparative constructions (Hoeksema 1983), before clauses (Sánchez Valencia, van der Wouden & Zwarts 1993), sentential complements of verbs like deny, doubt, and refuse (Kas 1993: 92–8), and (arguably at least) questions and the antecedents of conditionals (Kas 1993; van der Wouden 1997).1 It is worth noting here that the “regular negation” of anti-additivity includes most of the intuitively non-negative contexts – among others, questions, conditionals, and comparatives – which license polarity items. The examples in (22–23) suggest that in English both the conditional marker if and the negative quantifier never make for clauses which are anti-additive but not antimorphic.

(22) a. David will thank you if you talk to him or ask him to dance. ↔ David will thank you if you talk to him and if you ask him to dance.
     b. David will thank you if you talk to him and ask him to dance. ↛ (←) David will thank you if you talk to him or if you ask him to dance.

(23) a. Gillian has never snorted cocaine or smoked opium. ↔ Gillian has never snorted cocaine and Gillian has never smoked opium.
     b. Gillian has never snorted cocaine and smoked opium. ↛ (←) Gillian has never snorted cocaine or Gillian has never smoked opium.

One can see that if, for example, cannot be antimorphic, because the first proposition in (22b) does not entail the second: if David really wants you both to talk to him and to dance with him, doing just one of these may not be enough to earn his thanks. Similarly, in (23b), just because Gillian has never used both cocaine and opium on a single occasion doesn’t mean that she has never done either one alone.

The weakest of the three types of negative environment is simple downward entailingness, the property which Ladusaw relied on to define the class of polarity reversers. Downward entailingness can be defined in terms of implication reversal, as in (13) above, or else, like anti-additivity and antimorphicity, it can be defined in terms of a subset of the entailments in De Morgan’s Laws (van der Wouden 1997: 104–6).

(24) A function f is downward entailing iff for all X, Y in the domain of f, both
     (i) f(X or Y) → f(X) and f(Y), and
     (ii) f(X) or f(Y) → f(X and Y).
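The three definitions in (18), (21), and (24) can be compared side by side with a brute-force check. The Python sketch below assumes that operators can be modeled as functions from sets of individuals to truth values, with "or" read as set union and "and" as set intersection. The four individuals and the sample operators (including the "at most one" stand-in for a weak negative) are hypothetical simplifications, and a check over one small universe is an illustration rather than a proof.

```python
from itertools import combinations


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def negativity_level(f, universe):
    """Return the strongest level of the hierarchy that `f` satisfies."""
    subsets = powerset(universe)
    pairs = [(x, y) for x in subsets for y in subsets]
    # (18i) and (21): f(X or Y) ↔ f(X) and f(Y)
    law1 = all(f(x | y) == (f(x) and f(y)) for x, y in pairs)
    # (18ii): f(X and Y) ↔ f(X) or f(Y)
    law2 = all(f(x & y) == (f(x) or f(y)) for x, y in pairs)
    # (24): the left-to-right halves of the two laws
    de = all((not f(x | y) or (f(x) and f(y))) and
             (not (f(x) or f(y)) or f(x & y)) for x, y in pairs)
    if law1 and law2:
        return "antimorphic"
    if law1:
        return "anti-additive"
    if de:
        return "downward entailing"
    return "not downward entailing"


# Hypothetical operators over sets of individuals.
PEOPLE = {"ann", "bo", "cy", "di"}
not_ann = lambda s: "ann" not in s   # sentential negation about one person
nobody = lambda s: len(s) == 0       # negative quantifier
at_most_one = lambda s: len(s) <= 1  # crude stand-in for a weak negative
somebody = lambda s: len(s) >= 1     # an upward entailing control

levels = [negativity_level(f, PEOPLE)
          for f in (not_ann, nobody, at_most_one, somebody)]
```

Under these assumptions the negation of a single predication satisfies both laws, the negative quantifier satisfies only the first, and the "at most one" operator satisfies only the one-directional entailments in (24).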

All antimorphic and anti-additive operators are, by definition, also downward entailing, but some downward entailing operators fail to be either anti-additive or antimorphic. Prominent in this latter group are “weak” negatives like few N, seldom, rarely, and hardly, which, though basically negative in their meaning, also leave room for positive exceptions to an asserted negative proposition. The examples in (25) and (26), below, suggest that few and rarely are downward entailing without being anti-additive.

(25) Few linguists take Prozac or cheat on their taxes. → (↚) Few linguists take Prozac and few linguists cheat on their taxes.

(26) Mork rarely gets depressed or complains about the futility of life. → (↚) Mork rarely gets depressed and he rarely complains about the futility of life.

Both operators here fail to be anti-additive because neither allows the inference from the second proposition to the first. It is possible, for instance, that only a small number of linguists take Prozac and only a small number cheat on their taxes, but that taken together the set of linguists who either take Prozac or cheat on their taxes is too large a proportion of linguists to count as ‘few.’

The monotonicity hierarchy thus sorts polarity contexts into three groups, with the set of antimorphic contexts contained as a proper subset of the anti-additive contexts, and the set of anti-additive contexts contained as a proper subset of the downward entailing contexts.2 The situation is depicted schematically in Figure 8.1.

[Figure 8.1 The monotonicity hierarchy: nested sets, with Antimorphic (not) inside Anti-additive (if, never) inside Downward Entailing (few, rarely)]

The monotonicity hierarchy forms the basis for a logical approach to the diversity problem. Since different polarity items differ in the sorts of licensors they require, it seems reasonable to hypothesize that polarity items may in fact be sensitive to the different levels of the monotonicity hierarchy. This hypothesis was originally proposed by Frans Zwarts, and further developed in work by Sánchez Valencia (1991), Kas (1993), and particularly van der Wouden (1997), who argues that the three levels of downward monotonicity define three distinct types of positive polarity items and three distinct types of negative polarity items, as summarized in his Laws of Polarity (1997: 130).

(27) Laws of Polarity: PPIs
     a. Strong PPIs are incompatible with all monotone-decreasing contexts.
     b. Medium PPIs are compatible with downward monotonic contexts but incompatible with anti-additive ones.
     c. Weak PPIs are compatible with downward monotonic and anti-additive contexts but incompatible with antimorphic ones.

(28) Laws of Polarity: NPIs
     a. Weak NPIs are licensed in monotone-decreasing contexts.
     b. Medium NPIs can be licensed in anti-additive contexts but not in downward monotonic ones.
     c. Strong NPIs are licensed only in antimorphic contexts.
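Read as stated, the Laws of Polarity amount to a pair of threshold conditions on the monotonicity hierarchy, which the following Python sketch encodes as lookup tables. The numeric ranks, the dictionary names, and the two functions are assumptions made for illustration; each context is labeled by the strongest level it reaches, so "downward entailing" here means merely downward entailing. The sketch only restates the predictions of the laws and makes no claim about whether those predictions are borne out.

```python
# Strongest level a context reaches, ranked as in the monotonicity hierarchy.
LEVEL = {
    "none": 0,                # not downward entailing
    "downward entailing": 1,  # merely downward entailing
    "anti-additive": 2,       # anti-additive but not antimorphic
    "antimorphic": 3,
}

# Weakest context level that licenses an NPI of each strength, per (28).
NPI_MINIMUM = {"weak": 1, "medium": 2, "strong": 3}

# Weakest context level that excludes a PPI of each strength, per (27).
PPI_BLOCKED_FROM = {"strong": 1, "medium": 2, "weak": 3}


def npi_licensed(strength, context):
    """Predict whether an NPI of this strength is licensed in `context`."""
    return LEVEL[context] >= NPI_MINIMUM[strength]


def ppi_compatible(strength, context):
    """Predict whether a PPI of this strength is acceptable in `context`."""
    return LEVEL[context] < PPI_BLOCKED_FROM[strength]


# Predicted profile of a medium NPI across the four context types.
medium_npi_profile = {c: npi_licensed("medium", c) for c in LEVEL}
```

On this encoding a medium NPI is predicted to be acceptable in anti-additive and antimorphic contexts only, and a weak PPI is predicted to be excluded only by antimorphic ones.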

Evidence is given from Dutch and other languages supporting the promulgation of his laws, and Vasishth (1998a, b) argues that the same laws may also apply to polarity items in Japanese and Hindi. Many others, among them Dowty (1994), Jackson (1994), Krifka (1995), Atlas (1996), and van der Wal (1996), have either assumed or explicitly endorsed the basic thesis that polarity items can be sensitive not just to the broad division between upward and downward monotonicity, but also to the more fine-grained distinctions of the monotonicity hierarchy.

Although I cannot pursue this issue with the thoroughness it deserves, I would like to sound a soft cautionary note about its implications. Van der Wouden’s Laws offer a comprehensive and explicit attempt to deal with the troubling fact that polarity items do not all behave alike. The proposal has the advantage of being simple, well formalized, and (for the most part) admirably clear in its predictions. The proposal also has a clear seductive appeal: first, simply as a natural extension of Ladusaw’s basic monotonicity thesis; but also because there is something alluring about the idea that the grammar of polarity – and by extension, human cognition – should reflect the elegant logic of De Morgan’s Laws.

Nonetheless there are reasons to suspect that van der Wouden’s Laws of Polarity may fall short of actually solving the diversity problem. One worry is that the proposal effectively codifies the assumption that polarity contexts are organized in a hierarchy. The basic idea, going back at least to Horn (1970), is that polarity contexts can be ranked in terms of their degree of negativity – or, more generally, in terms of their licensing potential – and that polarity items can be ranked in terms of their licensing requirements. The problem is that this is almost surely an oversimplification. As noted in §2.3, above, and in Israel (1995a, b), polarity items often fail to line up in the simple ways such a hierarchy would predict. On the contrary, their distributions often overlap and diverge in ways that make it difficult to decide which licensors should count as strong and which as weak. Still, the idea of a hierarchy is not in itself misguided and may even be a useful idealization.

The more fundamental question about the monotonicity hierarchy is why polarity contexts should be organized along these particular lines. The monotonicity hierarchy does not, in fact, offer the most intuitive arrangement of polarity contexts one might imagine. This, of course, is part of what makes the proposal interesting. Other conjectures on the diversity problem have grouped different polarity licensors in terms of their functional rather than their logical properties. For example, Edmondson (1981, 1983) divides polarity triggers into four major classes – negation, questions, conditionals, and comparatives – and argues that they are ranked in that order in terms of licensing strength.
Similarly, Haspelmath (1997) explains the differences between different sorts of indefinite constructions in terms of the different sorts of functions they can serve (e.g. specific reference, non-specific reference, irrealis reference, free-choice reference, and uses in questions, conditionals, comparatives, and with negation). Interestingly, the logical distinctions which define the monotonicity hierarchy seem to cut across these more traditional, functional hierarchies. Thus, for example, the monotonicity hierarchy separates weak negatives like few and hardly, which are merely downward entailing, from negative quantifiers like never and no one (anti-additive); and separates both from the sentence negator not, which is antimorphic. The claim that polarity items are sensitive to anti-additivity as such thus predicts that negative quantifiers should pattern more like questions, conditionals, and comparatives, which van der Wouden analyzes as anti-additive, than they do like the negatives not, which is antimorphic, and hardly, which is merely downward entailing. This strikes me as a startling claim, and the evidence in its favor appears to be both sparse and unstable.

The question is, are polarity items really more sensitive to abstract logical distinctions, like the difference between downward entailment and antimorphicity, than they are to functional distinctions, like the difference between a question and a denial? And what exactly is the relation between these logical and functional categories? While van der Wouden and others have offered intriguing evidence that polarity items are actually sensitive to a variety of logical contexts, this is at best only part of the story. As van der Wouden himself acknowledges, the monotonicity hierarchy defines only the borderlines within which polarity sensitivities can range, and individual polarity items may always display their own idiosyncrasies within these broad boundaries (1997: 80). This sober conclusion will probably have to be a part of any solution to the diversity problem. Whatever the principles which govern the varieties of polarity sensitivity, polarity items will always be subject to their own lexical peculiarities.

8.4.3 Veridicality and nonveridicality

By providing fine-grained distinctions within the broad class of downward entailing operators, the monotonicity hierarchy predicts that some polarity items may require more than mere downward entailingness to be licensed. But there are also polarity items which appear to require less, and so even if the logical properties of antimorphicity, anti-additivity, and plain downward entailingness did pick out neat classes of NPIs and PPIs, the monotonicity hierarchy on its own cannot account for the full diversity of sensitive constructions.
The behavior of any is a case in point, since unlike other English NPIs, it is licensed in several contexts which are not even downward entailing – for example, with certain modal verbs, and in habitual and generic contexts. Of course, one may treat these as instances of a distinct free-choice any, but even then, one is left with a new set of sensitivities to explain. In fact, cross-linguistically there are many indefinite constructions (and not just free-choice indefinites) which occur in such non-monotonic contexts and yet are blocked in simple episodic contexts. Indeed, many sensitive constructions actually allow a wider range of licensors than does any. Thus Haspelmath notes that Hindi -bhii indefinites, as in (29), can occur in irrealis contexts so long as they lack a specific referent (1997: 284–5), and that German irgend indefinites, as in (30), can have a specific referent so long as the referent’s identity is unknown to the speaker (1997: 245).

(29) a. Kisii-ne (*bhii) fon kiy-aa thaa, par mujhe nahi maauluum, kis-ne.
        some-erg indef phone do-pfv was, but I:dat neg known who-erg
        ‘Someone has phoned, but I don’t know who.’
     b. Wah kisii-ko bhii fon kar-naa caah-tii hai.
        she some-dat indef phone do-inf want-impf is
        ‘She wants to phone someone (non-specific – it could be anyone).’

(30) a. Ich habe (*irgend) etwas verloren. Rate mal, was!
        ‘I lost something. Guess what!’
     b. Ich habe irgend etwas verloren, aber ich weiss nicht, was.
        ‘I lost something, but I don’t know what.’

Similarly, Giannakidou (1999) shows that the Modern Greek indefinite kanenas, which cannot occur in simple affirmative assertions, can occur in a wide variety of contexts which are clearly not downward entailing but which are at least non-episodic: with, for example, (a) future verbs, (b) deontic modals, (c) disjunctions, and (d) weak epistemic adverbs like isos ‘perhaps.’

(31) a. Tha vro kanena filo na me voithisi.
        fut find.1sg any friend subj me help.3sg
        ‘I will find a friend to help me.’
     b. Prepi na episkeftis kanenan jatro.
        must.3sg subj visit any doctor
        ‘You should visit a doctor.’
     c. I bike kanenas mesa i afisame to fos anameno.
        or entered.3sg anyone in or left.1pl the light lit
        ‘Either somebody broke into the house or we left the light on.’
     d. Isos na irthe kanenas.
        perhaps subj came.3sg anybody
        ‘Perhaps somebody came.’

Based on considerations of this sort, Giannakidou (1998, 1999, 2006, and elsewhere) has argued that polarity items in general are not sensitive to the monotonicity properties of their licensing contexts (or, by extension, to their effects on scalar inferencing), but rather to the status of these contexts as veridical, nonveridical, or antiveridical. Veridical operators entail the truth of propositions in their scope, nonveridical operators lack this entailment, and antiveridical operators entail the falsity of propositions in their scope. Thus, following Giannakidou (1999: 384):

(32) Given a monadic propositional operator Op,
     (i) Op is veridical iff Op p → p is logically valid. Otherwise, Op is nonveridical.
     (ii) A nonveridical operator is antiveridical iff Op p → ¬ p is logically valid.
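The definitions in (32) can be checked mechanically in a small model. The following is a minimal sketch under stated assumptions: propositions are modelled as sets of possible worlds, logical validity is approximated by truth at every world of one three-world toy model, and the three operators (including the crude treatment of "perhaps" as truth at some world) are hypothetical stand-ins rather than Giannakidou's own analyses.

```python
from itertools import combinations

# Hypothetical toy model: three possible worlds; a proposition is the set of
# worlds in which it is true.
WORLDS = ("w1", "w2", "w3")
PROPOSITIONS = [
    frozenset(combo)
    for size in range(len(WORLDS) + 1)
    for combo in combinations(WORLDS, size)
]


def is_veridical(op):
    # Op p -> p: wherever Op p holds, p holds too.
    return all(w in p for p in PROPOSITIONS for w in WORLDS if op(p, w))


def is_antiveridical(op):
    # Op p -> not p: wherever Op p holds, p fails.
    return all(w not in p for p in PROPOSITIONS for w in WORLDS if op(p, w))


def classify(op):
    """Apply (32): antiveridical operators are a subclass of nonveridical ones."""
    if is_veridical(op):
        return "veridical"
    if is_antiveridical(op):
        return "antiveridical"
    return "nonveridical"


# Illustrative operators; each maps a proposition and a world to a truth value.
OPERATORS = {
    "it is true that": lambda p, w: w in p,
    "not": lambda p, w: w not in p,
    "perhaps": lambda p, w: len(p) > 0,  # p holds at some world (an assumption)
}

if __name__ == "__main__":
    for name, op in OPERATORS.items():
        print(f"{name}: {classify(op)}")
```

The sketch treats veridicality globally; the relativization to an epistemic anchor that Giannakidou introduces for intensional predicates is not modelled here.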

Giannakidou (1999) further refines these definitions to capture a distinction between strong intensional predicates like ‘hope’ and ‘want,’ which can license NPIs like Greek kanenas (33a), and weak intensional predicates like ‘dream’ and ‘believe,’ which cannot (33b). As the glosses show, English any is ungrammatical in both contexts.

(33) a. Elpizo na emine kanena komati.
        hope.1sg subj left.3sg any piece
        ‘I hope there is {a/*any} piece left.’
     b. *Onireftika oti irthe kanenas idravlikos.
        dreamt.1sg that came.3sg any plumber
        ‘I dreamt that {a/*any} plumber came.’

Briefly, she suggests that (non)veridicality is defined not globally with respect to a speaker’s model of reality, but locally with respect to an epistemic anchor which can be supplied by the subject of an intensional predicate. Thus verbs like believe and dream are (weakly) veridical with respect to their complement propositions since they entail the existence of a conceptualizer for whom the embedded proposition is somehow taken to be true. Given all this, Giannakidou distinguishes two major classes of polarity items: true negative polarity items (her NPIs), which are licensed in the scope of an antiveridical operator, and a broad class of affective polarity items (APIs), including both the Greek nonemphatic indefinites, which are licensed in the scope of a nonveridical operator, and English any, which Giannakidou suggests is anti-licensed by veridicality. Finally, in the spirit of Linebarger, Giannakidou suggests that some NPIs may be licensed indirectly by a negative implicature. The idea that polarity items are sensitive to veridicality – the Veridicality Hypothesis – has several advantages. Since nonveridicality covers a wider range of contexts than downward monotonicity (i.e. the set of DE contexts is a proper subset of nonveridical contexts), the Veridicality Hypothesis readily explains why APIs are licensed in contexts like questions and conditionals, which are not strictly downward entailing (Heim 1984; von Fintel 1999). Moreover, the concept of nonveridicality seems to capture nicely the descriptive construct of an irrealis context, and many languages seem to have constructions which either mark or are sensitive to the distinction between realis and irrealis contexts. But the Veridicality Hypothesis has some drawbacks as well. Most importantly, the claim that all APIs are licensed by nonveridicality (or anti-licensed by veridicality) massively overgenerates. 
In English, for example, both the liberal NPI any and the more conservative yet are licensed in questions, but neither regularly occurs in nonveridical contexts like those in (31–33) – with futures, inside disjunctions, with perhaps, or with strong intensional verbs like hope or want. On the other hand, there are also sensitive constructions for which the Veridicality Hypothesis is too restrictive: thus, the use of German irgend with specific reference as in (30b) is problematic, since such contexts are not nonveridical. Finally, while nonveridicality may provide a rigorous definition of irrealis contexts, the problem with irrealis as a grammatical category is that different languages actually seem to define it very differently (Bybee, Perkins & Pagliuca 1994). Givón (1995: 115) notes that “there are few if any languages where the grammatical marking of all irrealis submodes (or ‘environments’) is totally uniform,” and suggests that irrealis is a prototype category built around a set of core examples with motivated extensions, rather than being defined by necessary and sufficient logical conditions of the sort proposed by Giannakidou.

Indeed, to account for the diverse sensitivities observed in polarity items both within and across languages, what seems to be required is not just one or even a few broad categories like nonveridicality or downward entailingness, but rather a detailed map of the ways different contexts within these broad categories may be related to one another. While some constructions may be sensitive to veridicality, the Veridicality Hypothesis on its own cannot explain the fine-grained sensitivities exhibited by polarity items. Of course, there is no reason one should have to choose just one particular semantic property as the unique source of constructional sensitivities. The question is not which semantic category – monotonicity or veridicality – polarity items are sensitive to, but rather how either or both of these categories may be broken down in ways that might capture the full variability found in sensitive items.
8.4.4 Semantics and logical form

In his 1979 dissertation, Ladusaw offered the Monotonicity Thesis as an elaboration and formalization of Fauconnier’s work on scale reversal and polarity licensing. Part of the importance of his work thus consisted in recasting Fauconnier’s earlier insight within an explicit theory of semantic representation. This was more than just a technical achievement, and its significance cannot be easily overstated. Ladusaw was arguing for a view of language in which semantic representations play a real role in determining the grammaticality of sentences. His goal was to demonstrate the need for a fully interpreted level of model-theoretic semantics within a theory of generative grammar, and his thesis made a compelling case that the evidence from polarity sensitivity demonstrated this need. For many, the constraints on polarity items still constitute the paradigmatic example of a grammatical phenomenon requiring a semantic explanation.

The Monotonicity Thesis is in many ways radically different from the syntactic analyses of Linebarger, Progovac, and others working within a more narrowly Chomskyan framework. Ladusaw (1983), comparing his own thesis with that of Linebarger (1980), offers an illuminating discussion of the differences between a syntactic account and his own semantic account of polarity licensing. The distinction is easily confused in these works, since the two rely on very different notions of “logical form” in formulating their constraints. For Linebarger, logical form (LF) is a syntactic level of representation and does not represent an interpretation of a sentence’s semantic content. LF does encode some semantic information – it is, for example, the level at which scope ambiguities are resolved – but it is fundamentally no different from other levels of syntactic representation (S-structure, D-structure) in terms of its structure or its formal vocabulary: it is a labeled hierarchical organization of the lexical elements and grammatical formatives of a sentence. This sort of logical form is not in fact a logical structure at all, but rather just the syntactic input to the semantic interpretation. As May puts it, LF includes those “properties of syntactic form … relevant to semantic interpretation – those aspects of semantic structure that are expressed syntactically” (1985: 2, emphasis added). Ladusaw’s notion of logical form, what he calls the composition structure of a sentence, is very different. The composition structure really is the semantic interpretation of a sentence, expressing the way the meaning of a sentence is built up from the meaning of its parts (Ladusaw 1983: 378).
In essence, Ladusaw’s project was, first, to provide a formal theory specifying how the model-theoretic semantic interpretation of a sentence is composed as a function of its syntactic structure together with the meanings of its parts; and, second, to show that this formal interpretation provides the structure needed to capture certain grammatical generalizations. In particular, Ladusaw argued that the property of being a polarity trigger depends on a linguistic expression’s truth-conditional meaning properties, and that the relationship between a polarity trigger and a polarity item is properly captured in the semantic interpretation of a sentence. Ladusaw’s definition of polarity triggers and his definition of scope are thus both cashed out in terms of the function–argument composition of a sentence’s semantic structure.

It is important to understand that the composition structure of a sentence, its semantic representation, is not necessarily the same thing as a sentence’s meaning, at least not in the ordinary, everyday sense of the word meaning. A composition structure is a model-theoretic interpretation of a sentence’s truth conditions formulated in “an algebra of meanings” (Ladusaw 1983: 381): a formal object built algorithmically by the grammar of a language. The “interpretation” here does not represent what one can understand from a sentence: it is an objective representation of the conditions under which a sentence counts as true, and as such is independent of any subjective apprehension of meaning. It is, in short, a mathematical object.

Although the Monotonicity Thesis is often cited as a semantic generalization, it is thus essentially an arbitrary structural constraint. The constraint here applies to semantic objects rather than syntactic objects, but it is nonetheless a condition on formal representations, and thus a matter of form rather than content. What is missing here is an explanation of why NPIs should be so sensitive. In Ladusaw’s formulation, the relationship between a polarity item and its licensor is effectively arbitrary. Ladusaw requires that an NPI should occur within the scope of an appropriate trigger, and he defines scope in terms of the function–argument structure of a semantic representation.

For any two expressions α and β, constituents of a sentence S, α is in the scope of β with respect to a CS [composition structure] of S, S′ iff the interpretation of α is used in the formulation of the argument of β’s interpretation in S′. (1983: 382)

An interesting feature of this definition is that it does not actually require a trigger to affect the interpretation of a polarity item in any way. To be licensed, the NPI, or more precisely, the interpretation of the NPI, must form part of the argument of a downward entailing operator, but the operator itself is not required to interact with the NPI in any particular way. The question that naturally arises is, what does the NPI get from this arrangement? On its own the Monotonicity Thesis accounts for polarity licensing, but it does not explain polarity sensitivity. At best, it offers a foundation on which to build such an explanation. If licensing really is a matter of semantics, then one may reasonably hope for a semantic explanation of why polarity items need to be licensed.

Ladusaw’s work helped to vindicate the importance of semantics for a theory of grammar, but in fact Ladusaw was arguing for much more. Ladusaw’s theory and the semantic tradition following him depend on a particular view of grammar in which logical, truth-conditional aspects of meaning are segregated out and packed into their own level of representation, free from the mess and muddle of pragmatics. The evidence from polarity sensitivity and the tradition that began with Ladusaw’s dissertation build a strong case for this view, but the matter remains at the heart of theoretical debates on the nature of polarity licensing. Linebarger (1991: 165) provides a clear statement of what is at stake.

If it could be demonstrated that NPI filtering is purely semantic, then we would have striking evidence that all speakers of English make a systematic distinction between logical entailment and other kinds of relations between propositions such as conversational and conventional implicature. But if NPI filtering does not delimit a level of “pure” semantics, this lends support to the view that sentence grammar stops short of truth-conditional meaning and that the distinction between the truth-conditional meaning of a sentence and its intended import in context may not be a psychologically real one.

In this work I have sought to resolve this issue in a way that might not please either Linebarger (1991) or Ladusaw (1983) but is indebted to both. With Linebarger, I propose that the constraints on polarity licensing are pragmatic in nature, and that the evidence from polarity sensitivity suggests that truth-conditional meaning cannot be separated from other aspects of meaning in a theory of grammar. But in the spirit of Horn, Fauconnier, Ladusaw, and many others, I maintain that polarity licensing is a semantic phenomenon, and that polarity contexts are defined by their inferential properties. The real question, I suggest, is not whether polarity licensing depends on semantics, but what sort of semantics it depends on. The answer can be found in the conventional meanings of polarity items themselves, and in the conventional pragmatic functions they fulfill. Polarity licensing depends on the pragmatically determined conceptual content which a sentence conveys in context because, ultimately, this is the level of semantics to which a polarity item must contribute its meaning. The lexical semantics of polarity items – or at least that part of their conventional meanings that makes them polarity sensitive – is essentially pragmatic in nature.

8.5 Toward a more pragmatic approach

The Monotonicity Thesis was only a first step toward the semantic explanation of polarity sensitivity. Ladusaw’s theory gave semantic substance to the idea of an affective context, but the basic relation between a polarity item and its licensor remained essentially arbitrary. Work building on Ladusaw’s early effort has thus sought to explain why polarity items should need a licensor and what it is that makes them sensitive in the first place. This work tends to focus on the lexical semantic properties of polarity items themselves and on the special contributions they make to the sentences in which they occur.

Heim (1984) was an early pioneer in this effort. Heim’s brief note began with the observation that certain apparently downward entailing contexts may not be quite as downward entailing as they appear. While the validity of the inference in (34) suggests that the antecedent of a conditional is a DE context, allowing subset for set substitutions salva veritate, the flagrant invalidity of the inference in (35) clearly undermines this conclusion.

(34) a. If you go to Las Vegas, you are bound to enjoy yourself.
     → b. If you go to Las Vegas in November, you are bound to enjoy yourself.

(35) a. If you go to Las Vegas, you are bound to enjoy yourself.
     →??? b. If you go to Las Vegas and get shot, you are bound to enjoy yourself.

But if conditionals are not downward entailing, why do they license polarity items? Heim’s solution to this quandary was to loosen the requirements for licensing so as to allow NPIs in contexts which are not strictly DE, but which do allow some limited, relevant downward inferences (see also von Fintel 1999). Heim suggested the relevant downward inferences are defined directly in terms of the polarity item to be licensed. For example, Heim proposes that an NPI like ever means something like ‘at least once’ and that an expression like any newspaper (at all) means, basically, ‘at least one x.’ A form like ever will be licensed in conditionals so long as the conditional allows inferences from single events to multiple occasions, as in (36). Similarly, any is licensed in (37) because the conditional here allows inferences from a single newspaper to any larger quantity of newspapers. Heim referred to the property of conditionals which allowed them to license polarity items by virtue of a few select downward entailments as “limited downward entailment.”

(36) a. If you ever deceive me, I will cut you out of my will.
     → b. If you deceive me several times, I will cut you out of my will.

(37) a. If you read any newspaper at all, you run the risk of being disgusted.
     → b. If you read six different newspapers, you run the risk of being disgusted.

An important consequence of Heim’s theory, and a significant advance over Ladusaw’s original formulation, is that it requires a substantive semantic relationship between a polarity item and its licensor. On Ladusaw’s theory it was sufficient for an NPI simply to occur in the scope of a polarity reverser; for Heim, on the other hand, a licensor has to actually license appropriate inferences with respect to the denotation of the NPI to be licensed. Licensing succeeds in (36) because the conditional reverses entailments on the scale of frequencies evoked by ever. And it succeeds in (37) because inferences are reversed on the scale of quantities evoked by any.

In a sense, then, Heim’s theory of limited downward entailment is an expanded theory of the role of meaning in polarity licensing. Licensing is no longer an arbitrary relation stated on semantic representations but involves a meaningful relationship between a polarity item and its licensor. Heim’s notion of limited downward entailment made crucial reference to the meanings of polarity items, but beyond a few suggestive comments, Heim’s brief note did not offer a theory of the lexical semantics of polarity items. It did, however, point the way toward a theory which could explain polarity sensitivity directly in terms of the meanings of polarity sensitive items. Ultimately, if we are going to explain why this grammatical phenomenon should exist, we will need to understand what it is about polarity items that makes them so sensitive.

In recent years there has thus been a subtle but significant shift in the literature away from explanations focusing on the constraints which govern the distributions of polarity items, and toward explanations which derive those constraints directly from the lexical properties of polarity items themselves. As Ladusaw notes with respect to Kadmon and Landman’s theory of any, such an account “does not need to characterize a licensing relation directly” since licensing simply reflects the interaction of a polarity item’s special semantic properties with “the general principles which determine the propositional content of the sentence” (1996: 335). Part of the appeal of this sort of approach, then, is that it avoids the need to state the constraints on polarity items as constraints on grammatical representations of any sort, syntactic or semantic. Rather the idiosyncratic distributions of polarity items are understood as the product of their idiosyncratic lexical meanings, which are naturally represented in the lexicon, where idiosyncrasies belong.
To conclude this section, I briefly consider the general and ambitious theory of NPIs and PPIs proposed by Krifka (1992, 1994, 1995). Although Krifka develops his theory in the framework of Montague grammar, his account has much in common with the cognitively oriented Scalar Model developed here. Both theories hold that polarity licensing depends crucially on some sort of inferencing and both maintain that this inferencing is essentially pragmatic in nature. While I will seek to distinguish the two approaches in my discussion here, my basic conclusion is that the two approaches may have much to offer each other.

Krifka holds that polarity sensitivity reflects “a peculiar interaction between the meaning of polarity items and the expressions in which they occur, and certain general pragmatic rules” (1995: 218). Elaborating on Heim (1984), Krifka proposes that the essential properties which distinguish polarity items and make them polarity sensitive are: (i) that they are associated with a set of alternatives; and (ii) that the alternatives introduce an ordering relation such that the polarity item denotes either the least element in the ordering (for negative polarity items) or the greatest element (for positive polarity items). As such, polarity items are understood as elements in a particular sort of scalar structure, that is, a lattice. Krifka (1995) represents polarity items as ordered triples 〈B,F,A〉, where B stands for the contextual background, F for the foregrounded denotation of the polarity item itself, and A for the set of alternatives to F. In Krifka’s semantics the interpretation of a complex expression containing a polarity item (e.g. Betty saw anything) thus consists of a foreground proposition and a set of alternative propositions which are either stronger (in the case of NPIs) or weaker (in the case of PPIs) than the foreground proposition.

The explanation of licensing itself depends crucially not just on the meaning but also on the use of polarity items. Krifka provides a formal pragmatics for assertion and scalar assertion, according to which whenever a speaker uses a polarity item, she both asserts the value F of the polarity item and systematically implicates that she has reasons for not asserting any of the alternative values in A. The result, Krifka claims, is that polarity items can be used only in contexts where their foreground value F defines a more informative proposition than any alternative value in the lattice. As Krifka (1995: 225) puts it, the problem with a sentence like Betty saw anything is not just that it is vague, but that what it asserts systematically contradicts what it implicates: it says that Betty saw something, but its implicatures deny that there is anything she saw, and the result is incoherence. My synopsis here may not do justice to the richness of Krifka’s theory, but it may suffice to draw some broad comparisons between Krifka’s lattice theory and the Scalar Model.
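The clash between assertion and implicature described for Betty saw anything can be reproduced in a toy model. The sketch below is a deliberate simplification and not Krifka's formalism: it assumes that worlds are the sets of things Betty saw, that the alternatives to the foreground are the propositions "Betty saw x" for each individual thing, and that scalar assertion amounts to asserting the foreground while denying every alternative the foreground does not entail. All names in the code are hypothetical.

```python
from itertools import combinations

# Hypothetical toy model: a world is the set of things Betty saw.
THINGS = ("apple", "book")
WORLDS = frozenset(
    frozenset(combo)
    for size in range(len(THINGS) + 1)
    for combo in combinations(THINGS, size)
)


def saw(thing):
    """Proposition 'Betty saw <thing>' as the set of worlds where it holds."""
    return frozenset(w for w in WORLDS if thing in w)


def negate(prop):
    """Complement of a proposition relative to the set of all worlds."""
    return WORLDS - prop


SAW_SOMETHING = frozenset(w for w in WORLDS if w)  # foreground F for 'anything'
ALTERNATIVES = [saw(t) for t in THINGS]            # more specific alternatives A


def scalar_assert(foreground, alternatives):
    """Assert the foreground and deny every alternative it does not entail."""
    result = set(foreground)
    for alt in alternatives:
        if not foreground <= alt:  # alt is not entailed, so it is implicated false
            result -= alt
    return frozenset(result)


def is_coherent(foreground, alternatives):
    # A scalar assertion is coherent if some world survives it.
    return len(scalar_assert(foreground, alternatives)) > 0


# 'Betty saw anything': every specific alternative is denied, leaving no world.
affirmative = is_coherent(SAW_SOMETHING, ALTERNATIVES)

# 'Betty didn't see anything': the negated foreground entails every negated
# alternative, so nothing is denied and the assertion survives.
negative = is_coherent(negate(SAW_SOMETHING), [negate(a) for a in ALTERNATIVES])

if __name__ == "__main__":
    print("affirmative coherent:", affirmative)
    print("negative coherent:", negative)
```

On these assumptions the affirmative use is incoherent because the foreground is the weakest proposition on its scale, while under negation it becomes the strongest and no alternative is left to deny.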
Both accounts share a similar ambition to provide a general theory which would apply both to positive and to negative polarity items, and which could explain the sensitivities of as wide a range as possible of different polarity items. Both accounts explain sensitivity directly in terms of the lexical semantics of polarity items. Both accounts view polarity items as presenting their semantic content against the background of a (partially) ordered set of alternative values. Both accounts make crucial reference to some notion of informativity. And both accounts view licensing itself as a pragmatic phenomenon which depends on the ways polarity items are actually used in context.

There is, however, an important difference in the way the two accounts treat the notion of informativity. In the Scalar Model, informative value is a lexical semantic property, with different polarity items conventionally specified as either emphatic or attenuating. Krifka, on the other hand, does not treat informativity as a lexical property but instead builds the notion into his rules for scalar assertion and relies on formalized Gricean principles to generate the scalar implicatures which can license or defeat the use of a polarity item. For Krifka, informativity is a pragmatic notion based on general principles, and so applies in the same way to all lexical forms. In the Scalar Model, i-value is a conventional feature which gets encoded in different ways in different lexical items.

The two theories differ not only in their conceptions of informativity, but also crucially in their views of its usefulness. On Krifka’s account, unlicensed polarity items are self-defeatingly uninformative: they say too little even as they evoke the possibility of saying more. But this explanation only works for emphatic polarity items. In fact, there are many polarity items which seem to exist precisely for the purpose of saying little, even as they evoke the possibility of saying more. Attenuating polarity items – PPIs like sort of, would rather, and somewhat, and NPIs like much, auxiliary need, and all that – evoke a salient set of more informative propositions which could be, but are not, asserted. They thus appear only and precisely where they are maximally uninformative. Krifka does not discuss these forms, and he leaves no room for them in his theory: on the contrary, his theory predicts they should not exist.

For Krifka’s theory to handle the distinction between attenuating and emphatic polarity items, it needs some way of encoding the rhetorical orientations which separate these classes of polarity items: something like informative value has to be recognized as a conventional lexical pragmatic property. This is a point worth emphasizing, for there is, in fact, ample evidence to support such a feature. It is needed, for example, to explain the different sensitivities of forms which otherwise appear to be synonyms, or near synonyms. For example, Horn (p.c.)
points out the minimal triplet a bit, a little bit, and the least bit. The first of these is insensitive (I’m a bit worried; I’m not a bit worried), the second forms a litotic understatement under negation (I’m a little bit worried = I’m a bit worried; I’m not a little bit worried = I am very worried), and the third is a negative polarity item (*I’m the least bit worried; I’m not the least bit worried). This sort of pattern suggests that whatever lexical semantic features are responsible for polarity sensitivity, they cannot be limited to quantificational semantics alone. A similar point emerges from the examples below, where a bite and a nibble denote small quantities of ingested food, but where a bite appears freely in either polarity, while a nibble is confined to positive contexts.

(38) a. I think I’ll have just a {bite/nibble} of the cake.
     b. I haven’t had even a {bite/*nibble} to eat all day.

On the scalar account, the difference between these forms is that while a bite is unspecified for informative value, a nibble is conventionally specified as

attenuating, and therefore, given its low quantitative value, functions as a PPI. On Krifka’s account, however, there is no way of explaining why a nibble should be a positive polarity item at all. Rather the prediction is that if a nibble is a polarity item and is interpreted within a lattice of alternatives, then it will have to be a negative polarity item, since it will necessarily denote a minimal value in the lattice. Krifka could always adopt a feature like i-value to make these distinctions, but such a move might be out of character with the rest of his theory. The problem is that there really is no place for a feature like i-value in Krifka’s lexical 〈B,F,A〉 triples: it is not part of the background, the foreground, or the lattice of alternatives. I-value does not contribute content to a proposition but, rather, constrains the way content is construed. It is an expression of the speaker’s attitude toward the content she conveys. As such it is an aspect of the subjective conceptualization of a proposition, rather than a feature of objective content. The differences between Krifka’s lattice theory and the Scalar Model are substantial, but they are hardly irreconcilable. The theories rely on the same basic type of explanation, though their explanations are cashed out in very different terms. This is remarkable, really, since the two begin with very different sets of assumptions about the nature of language and meaning. Krifka comes at the problem from the point of view of formal semantics, where meanings are understood as symbolic structures relating words directly to the world by virtue of their truth conditions (Carnap 1942; Davidson 1967; Barwise & Cooper 1981).
The Scalar Model, on the other hand, takes a broadly cognitive view of meaning, seeking “a semantics of understanding,” as Fillmore (1985) had it, and identifying linguistic meanings directly with embodied conceptual structures and patterns of conceptualization (Langacker 1987, 1991). On this view, meaning is fundamentally non-symbolic in nature: words do not represent the world directly, but only through the mediation of a human conceptualizer. In the end both approaches may benefit from the insights of the other. In particular, the Scalar Model might be enriched by Krifka’s more explicit formulation of an alternative semantics. The lattice theory, on the other hand, might benefit from a richer view of semantic content by allowing for figures of informativity like emphasis and attenuation – figures, that is, involving the way a speaker positions an expressed proposition in an intersubjectively construed context with an audience – as elements of encoded meaning.

9 The pragmatics of polarity licensing

Putting together novel expressions is something people do, not grammars. (Langacker 1987: 65)

9.1 Affectivity reconsidered

When Klima first proposed the feature [+Affective] to explain the licensing properties of negatives, interrogatives, and conditionals, it was really just a convenient way of labeling polarity triggers. In creating the label, however, Klima effectively advanced a hypothesis that polarity triggers as a group constitute a natural class in the grammar of English, and by implication in Universal Grammar. The history of polarity studies is in large part an effort to flesh out the intuition behind this hypothesis. The history having proceeded for over forty years now, it is not too soon to reconsider the hypothesis and the intuition behind it. The assumption that polarity licensing in general can be explained by any single mechanism has already been largely abandoned: theorists who agree on little else – Linebarger (1980; 1987), Progovac (1994), van der Wouden (1997), Giannakidou (2006) – agree in positing distinct licensing mechanisms to account for the distinct sensitivities of different types of polarity items. One might go further, however, and wonder whether the sensitivities of any single polarity item can be adequately captured in terms of a single grammatical feature in the first place. The question is, is polarity licensing in fact a property of some class of linguistic representations (whether syntactic or semantic or of some other sort)? And is it sensible to define affective contexts in terms of necessary and sufficient (or even just necessary) conditions on linguistic structures? I suggest that the answer to these questions is, if not a categorical no, at least a qualified not really. The point is not that the notion of an affective context is itself ill-conceived. It all depends on how one conceives it. Trivially, an affective context is one which allows for the felicitous use of a negative polarity item – one in which a polarity item makes a meaningful and coherent

contribution to the use of a well-formed sentence. The question is, what is it about an affective context that gives it these special licensing powers? In this chapter I will argue that affectivity is not a syntactic or a semantic property, and indeed is not a property of sentence types at all; rather, it is an essentially pragmatic phenomenon, and as such is a property of utterances (or of utterance types) and their intended interpretations. I have proposed that polarity items are scalar rhetorical operators and that polarity sensitivity is in general a sensitivity to scalar semantics. If this is correct, then affective contexts should be distinguished by their scalar properties. In particular, an affective context must:

i) allow for the construal of an expressed proposition within a coherent scalar model;
ii) allow for scalar inferences supporting a polarity item’s rhetorical force;
iii) be pragmatically consistent with a polarity item’s rhetorical force.

These are conditions on the content and construal of an expressed meaning, and they do not, in themselves, necessarily entail any restrictions on the sorts of linguistic structures which might express such meanings. Affectivity does not depend on a sentence’s formal structure but rather on the meaning it is intended to convey and on the rhetorical purposes it is intended to fulfill. It depends not on a sentence’s syntactic structure or its semantic representation, but crucially rather on the pragmatic coherence of a sentence’s use. Fauconnier was the first to suggest that polarity sensitivity might involve something other than a structural constraint on grammatical representations. He put it bluntly: “abstract representations play no role in the understanding of the syntactic and semantic facts related to the question of ‘polarity’ ” (Fauconnier 1978: 289). The point, as I understand it, is not that polarity phenomena cannot be formally modeled, but that there is no specifically linguistic level of representation which explains the distribution of polarity items. There is, in effect, no level of linguistic representation – syntactic or semantic – at which the notion of an affective context is defined. Polarity licensing is not a matter of well-formed representations but rather one of coherent conceptualizations: polarity items are licensed in contexts where they make sense and fulfill their conventional expressive functions. This is not to say that the formal properties of a linguistic context are irrelevant. Logical monotonicity, for example, clearly is a property of linguistically encoded semantic representations, and the monotonicity properties encoded by linguistic constructions clearly affect the inferences one draws from a construction’s use. And of course, the syntactic structure of a sentence – for example, whether a polarity item precedes or follows a potential trigger – does

have consequences for the ways in which that sentence can be construed. But while syntactic and semantic properties are important characters in the drama of polarity licensing, the constraints they impose ultimately only set the stage for a story whose hero is the pragmatics of scalar construal.

9.2 Scalar construal

Scalar reasoning is a general conceptual ability, and probably a very primitive one. It is not, fundamentally, a linguistic phenomenon. Scalar reasoning depends on an ability to think about a situation in terms of other potential situations, and to draw inferences about potential situations on that basis. While language may exploit this ability and even include forms which depend on its operation (i.e. scalar operators), scalar reasoning itself need not depend on any particular linguistic ability. On the contrary, it appears to be a basic feature of the way people understand their perceptual experiences and organize their conceptual structures, and it is plausible to think that pre-linguistic infants, non-human primates, and perhaps even other animals might share the same abilities. As such, scalar reasoning might be considered an instance of the way humans are able to draw complex inferences based on simple image-schematic structures involving notions like path, up-down, containment, and inclusion (Johnson 1987; Lakoff 1987; Mandler 1992; Lakoff & Johnson 1999; Bergen & Chang 2005). Or, given the abstractness of scalar reasoning in general, we might view scalar ordering in general as a kind of “superschema” (Grady 2005: 1606) available to structure conceptual domains independently of their particular image (or sensory) contents. Ultimately, whatever the cognitive bases of scalar reasoning, the important point is that this fundamental cognitive ability exists prior to, and independently of, our ability to exploit it in language. In general, any given situation may be an object of scalar reasoning so long as it can be understood with respect to a contrasting set of similar situations. Understanding a situation in this way does not depend on any objective property of the situation in question, but rather reflects the way a conceptualizer may choose to construe the situation. 
Scalar reasoning thus depends on a mode of scalar construal and, in this sense, is essentially a non-logical, or pre-logical, conceptual ability: it does not involve the manipulation of objective facts to draw valid inferences, but rather constitutes a form of cognitive pattern completion based on the understanding of a given situation type. Whether or not a given sentence supports a scalar interpretation depends not so much on its logical or referential properties as it does on pragmatic factors which determine

how it will be construed in context. The same expressed proposition might be understood either as contrasting with other propositions within a scalar model, in which case it can be said to receive a scalar construal, or simply as expressing information about a particular situation, in which case it can be said to receive a simple construal. Consider a situation in which someone utters the sentence Margaret tasted the jellyfish. Such an utterance will normally allow a simple construal in which it is taken to express information about a single act of jellyfish tasting on the part of a single individual. But in the right context, and given appropriate background assumptions, the same sentence may also be used to generate scalar inferences. Given that jellyfish, at least in Europe and the United States, is widely considered an exotic and not very appealing sort of food, the assertion here may easily be taken as a comment on Margaret’s lack of inhibition when it comes to eating. Such an interpretation probably requires a context in which many foods are available for sampling, and of these, the one which is least likely to be appealing and most likely to be considered repulsive is the jellyfish. In such a context, such a sentence may effectively convey that Margaret was so bold as to taste everything, even the least appealing and most appalling of the various offerings. It is important to note that at least in its written form the sentence Margaret tasted the jellyfish has no overt markers for either a scalar or a simple construal. Its interpretation is, in this respect, underdetermined by its formal structure, and the construal it receives ultimately depends on its fit with the context in which it is used. In fact, several distinct scalar construals are possible, depending on what sorts of situations the sentence is understood as contrasting with.
On the interpretation discussed above, jellyfish is understood as contrasting with other, probably more appealing, things one could eat. But one might also understand the same sentence as contrasting Margaret with other people who tasted the jellyfish, or even as contrasting the act of tasting with other things Margaret might have done with the jellyfish, such as merely sniffing it or actually consuming large quantities of it. The first of these would make sense in a situation where Margaret is known to be relatively inhibited in her eating habits, and where the sentence is intended to convey that basically everybody tasted the jellyfish. The second might apply in a situation where Margaret was expected to eat some quantity of jellyfish, and the predicate tasted was used to provide a vague indication of how much or how little she actually did eat. One could, of course, force a scalar construal on any basic sentence. One way would be to insert a scalar focus particle like even, either before the intended scalar focus itself (Margaret, tasted, or the jellyfish), or before the

verb, in which case any of the three possibilities can function as the focus. The other would be to indicate the focus prosodically, with a fall–rise intonation on the intended scalar focus. If a particular scalar construal is forced on the sentence, this has significant consequences for the ways in which the sentence can fit into a given context: a given scalar construal has to be compatible with the information structure of the context in which it occurs. Thus a sentence with subject focus like (1a) can work as an answer to the question Who tasted the jellyfish? but not to the question What did Margaret eat?; while a sentence with object focus like (1b) can only answer the second of these questions.

(1) a. Even Margaret tasted the jellyfish.
    b. Margaret tasted even the jellyfish.

Words like Margaret and jellyfish do not force a scalar construal on their own because they do not inherently contrast with an ordered set of alternatives on a conceptual scale. Polarity items, on the other hand, do. Polarity items are scalar operators profiling a conceptual entity with respect to some set of alternatives ranked on a conceptual scale. As scalar operators, polarity items impose a scalar construal on the interpretation of a sentence, and as such they require a pragmatic context compatible with the scalar construal they impose. Since scalar construal in general depends precisely on the way a sentence’s content is integrated into a larger propositional context, the choice between a scalar and a simple construal is fundamentally pragmatic in nature. However, unlike an implicature, for example, it is not a kind of expressed propositional content itself, but rather a way of accessing expressed content. It is not something that can be said or implicated, but rather a way of saying. As such, though it is fundamentally pragmatic in nature, it can also be grammatically constrained by the presence of a scalar operator. In general, if a sentence is so constrained and yet for one reason or another cannot be accommodated into a larger context, the result is not just pragmatic anomaly, but rather a pragmatically conditioned ungrammaticality.

9.3 Logical conditions are not sufficient

The claim that polarity licensing ultimately depends on the availability of a coherent scalar construal and the observation that scalar construal is itself neither a logical nor a structural phenomenon together predict that the constraints on polarity items ultimately cannot be captured in terms of logical or structural conditions. In this section I will consider evidence that the logical properties

of a linguistic context are not sufficient to predict its potential for licensing polarity items. As will be seen, under certain conditions negative polarity items may fail to be licensed in contexts which are downward entailing, and even anti-additive. These licensing failures are systematically linked to pragmatic properties of the sentences in which they appear, and in particular to a sentence’s ability to support appropriate scalar inferences for a given polarity item. Basically, there are two important ways that things can go wrong. In many cases, polarity items may fail to be licensed because their use somehow depends on the availability of an incoherent, or otherwise pragmatically anomalous, scalar model. In other cases, licensing may fail because a licensor which logically should work simply does not allow for a scalar construal at all. Both types of licensing failure show that the right sorts of entailment are of no avail if they cannot be construed as applying within a scalar model.

9.3.1 Incoherent scalar models

Polarity items are scalar operators, and as such they generate scalar inferences. The simple generalization for this section is that polarity items cannot be used felicitously in contexts where the inferences they generate will not make sense. Although universal quantifiers are uncontroversially downward entailing on their first argument, they do not always manage to license NPIs. Linebarger (1980) and Heim (1984) discuss contrasts like those below.

(2) a. Every restaurant that charges so much as a dime for iceberg lettuce ought to be closed down.
    b. ??Every restaurant that charges so much as a dime for iceberg lettuce actually has four stars in the handbook.

(3) a. Anyone who gives a damn about the environment enjoys recycling.
    b. ??Anyone who gives a damn about the environment shops at Ikea.

As Heim notes, the intuitive difference between the (a) and (b) sentences here is that in the (a) sentences, but not the (b) sentences, there is some natural connection between the matrix and relative clauses. As she puts it, the predicate in (2a) [Heim’s 36] is something that applies to restaurants because they charge a dime or more for iceberg lettuce … whereas the predicate in (2b) [Heim’s 37] just happens to apply to those restaurants without regard to, or even in spite of, what they charge for iceberg lettuce. (Heim 1984: 104–5)

Heim proposes that the NPIs in these examples somehow incorporate the semantics of even – in my terms, that they are emphatic scalar operators – and

that as such they only make sense where they can be construed as contrasting with, and more informative than, an ordered set of alternatives. The (b) sentences thus fail here because, given normal background assumptions, they do not supply the necessary set of scalar alternatives that could make the expressed proposition count as emphatic. Yoshimura (1994) notes a similar phenomenon with respect to NPIs in before clauses. Since before clauses are downward entailing, and in fact anti-additive (Sánchez Valencia, van der Wouden & Zwarts 1993), a strictly logical view of polarity licensing predicts them to be robust polarity triggers; however, the contrasts below suggest that pragmatic factors have a significant impact on the licensing potential of these contexts.

(4) a. Miss Prism spilled her wine before she had drunk a drop.
    b. ??Miss Prism poured her wine before she had drunk a drop.

(5) a. The alarm clock was ringing before I managed to sleep a wink.
    b. ??It was raining before I managed to sleep a wink.

(6) a. Oscar had been studying linguistics for ten years before he learned a damned thing about pragmatics.
    b. ??Oscar had been fishing many times before he learned a damn thing about pragmatics.

Apparently there is more to polarity licensing than just logical monotonicity. Yoshimura herself accepts the thesis that polarity items are sensitive to monotonicity, and she argues on the basis of these and similar examples that they must be sensitive to a “cognitive constraint” as well. Drawing on Relevance Theory (Sperber & Wilson 1986), Yoshimura claims that NPIs share certain procedural-semantic properties with words like but which depend on a contrast between what is said and what might have been expected. A sentence like (4a) is thus licensed, in part, by the fact that one might have expected Miss Prism to have drunk at least a little of her wine before she spilled it; similarly, (4b) is odd because there is normally no expectation that one should drink any before one has poured it. The point is that the felicitous use of an NPI depends on the availability of some sort of implicit contrast. More specifically, I maintain that NPIs require a context which supports a scalar construal, where their expressed content is naturally understood as contrasting with an ordered set of propositions within a scalar model. Thus the NPIs in the (a) sentences here are acceptable because they form emphatic propositions which implicitly contrast with a range of weaker propositions that could have been asserted. In the (b) sentences, however, the NPIs sound peculiar because normal background assumptions make

it difficult to imagine a situation in which the expressed proposition could be construed as appropriately emphatic. In (6a), for example, the longer one studies linguistics, the more likely one should be to know something about pragmatics: Oscar’s minimal knowledge after ten years thus contrasts with what one would have expected him to have learned in that time. But in (6b), there is no particular reason why one should expect Oscar’s fishing trips to have had any effect on his knowledge of pragmatics, and so the scalar construal required by the NPI a damn thing is difficult to provide without the benefit of some particularly rich context. The examples in (4) are similarly instructive. There is a natural connection between pouring wine and drinking it: as a rule, given normal social conventions for drinking wine (e.g. don’t swig directly from the bottle and don’t use a straw), until something is poured, it cannot be drunk. But the relationship here is absolute: waiting longer before pouring something does not, under normal conditions, increase the likelihood that any quantity will be drunk; waiting longer before spilling something, however, may well have this effect. For this reason, the (a) sentence is easily construed within a scalar model, while the (b) sentence is not. Speakers’ intuitions tend to be less robust with more thoroughly negative licensors, but comparable examples may perhaps be constructed for without, as in (7), and for sentential negation in (8): again, these examples will not make sense if their minimizers cannot be construed within a coherent scalar model.

(7) a. Algernon left without saying a word.
    b. ?Algernon napped without saying a word.

(8) a. Cecily didn’t eat a bite of her food.
    b. ??Cecily didn’t stare at a bite of her food.

(7a) is appropriately emphatic largely because it is normal when taking one’s leave to say at least something; but in (7b) the minimizer say a word seems oddly out of place since napping is not something that normally requires talking at all. Similarly, for (8), while there are many activities in which a bite of food might count as a natural minimal unit, staring is not one of them: one can just as easily stare at a whole banquet as at a single bite, and so the minimizer in this context fails to create a particularly emphatic proposition. The need for a scalar construal is particularly apparent in the so much as construction, an emphatic NPI which requires its focus (the nominal or verbal complement of as) to denote a very low value within a scalar model. Exactly what counts as sufficiently low is flexible and may vary with context and with

a speaker’s rhetorical goals. Still, the examples in (9–10) suggest that while its implementation may be fluid, the requirement itself is fairly strict.

(9) Rupert wouldn’t so much as {look at / talk to / kiss} the cigarette girl.
(10) ??Rupert wouldn’t so much as {run away with / make love to / marry} the cigarette girl.

It is perhaps a trivial matter to accommodate these sorts of scalar constraints within a logical view of polarity. The fact that monotonicity is not always sufficient to license polarity items hardly proves that it is not an important factor. But if the constraints on polarity items systematically exceed the predictions of the Monotonicity Thesis, this might be a sign that the thesis has missed the right generalization. The examples here, in any case, suggest that the essential condition on polarity licensing has more to do with the availability of a well-structured scalar model than with logical monotonicity per se. Where an appropriate scalar model is available, NPIs are licensed; where none can be found, NPIs are unwelcome.

9.3.2 On the need for a scalar construal

The licensing failures discussed above show that polarity licensing often depends on various sorts of real-world knowledge which may enter into the construction of a well-formed scalar model. But there is also a more fundamental requirement for polarity licensing, which is just that any potential polarity licensor must allow for a scalar construal of the polarity item. That is, it must foster a conceptualization in which the profiled content of the polarity item is construed against a background of scalar alternatives. Most downward entailing contexts naturally allow such construals, but some do not, and when they don’t they naturally fail to license polarity items. In this section I will consider one such context. As is well known, negation in a matrix clause can often license polarity items in a sentential complement. This is true of all neg-raising predicates (e.g. think, believe, expect) and of a number of other verbs as well (e.g. say, realize). In general, the Monotonicity Thesis predicts the occurrence of NPIs in all these contexts, since they are all demonstrably downward monotonic.
The tests in (11), for example, show that the sentential complement of not realize is both DE (11a), and in fact anti-additive (11b).

(11) a. I didn’t realize that Monica likes scotch. → I didn’t realize that Monica likes expensive scotch.
     b. I didn’t realize that she smokes cigars or drinks scotch. ↔ I didn’t realize that she smokes cigars and I didn’t realize that she drinks scotch.
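The two tests in (11) instantiate the standard set-theoretic definitions: a function f is downward entailing if, whenever X is a subset of Y, f(Y) implies f(X); it is anti-additive if f(X ∪ Y) holds exactly when f(X) and f(Y) both hold. As a minimal sketch, and only under stated assumptions, these definitions can be checked by brute force over a small finite universe. The quantifiers `no_student` and `at_most_one_student`, the universe, and all function names below are hypothetical illustrations; they stand in for not realize, whose factive semantics a toy model of this kind does not capture.

```python
from itertools import combinations

def subsets(universe):
    """Return every subset of a finite universe as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_downward_entailing(f, universe):
    """f is DE if, whenever X is a subset of Y, f(Y) implies f(X)."""
    sets = subsets(universe)
    return all(not f(y) or f(x) for x in sets for y in sets if x <= y)

def is_anti_additive(f, universe):
    """f is anti-additive if f(X | Y) holds exactly when f(X) and f(Y) hold."""
    sets = subsets(universe)
    return all(f(x | y) == (f(x) and f(y)) for x in sets for y in sets)

# Hypothetical toy model: two students and one teacher.
STUDENTS = frozenset({"s1", "s2"})
UNIVERSE = frozenset({"s1", "s2", "t1"})

def no_student(scope):
    """True if no student is in the scope set."""
    return len(STUDENTS & scope) == 0

def at_most_one_student(scope):
    """True if at most one student is in the scope set."""
    return len(STUDENTS & scope) <= 1

print(is_downward_entailing(no_student, UNIVERSE),
      is_anti_additive(no_student, UNIVERSE))
print(is_downward_entailing(at_most_one_student, UNIVERSE),
      is_anti_additive(at_most_one_student, UNIVERSE))
```

In this toy model both quantifiers pass the downward entailment test, but only `no_student` passes the anti-additivity test, which is the sense in which anti-additivity is the stronger property.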

Given these facts, it is not surprising to find that NPIs like any, ever, and the least bit are licensed in this context.

(12) Ken didn’t realize that Monica
     a. knew any oil executives.
     b. had ever been to Haiti.
     c. was the least bit interested in glossolalia.

The sentential complement of not discover presents an unexpected contrast to what we find with not realize. As (13) demonstrates, not discover seems to pass the same tests for downward entailment and anti-additivity as not realize.

(13) a. I didn’t discover that Monica drinks scotch. → I didn’t discover that Monica drinks expensive scotch.
     b. I didn’t discover that Monica smokes crack or shoots heroin. ↔ I didn’t discover that Monica smokes crack and I didn’t discover that she shoots heroin.

But surprisingly, NPIs appear not to be licensed in the complement of not discover – the examples below are especially bad when the NPIs occur with stress.

(14) He didn’t discover that she
     a. knew {several/*any} oil executives.
     b. had (*ever) been to Haiti.
     c. was (*the least bit) interested in glossolalia.

Both the verbs realize and discover are factive in that they presuppose the truth of their complements: (12a) thus presupposes that Monica does know some oil executives, (14b) that she has been to Haiti. The crucial difference seems to be that whereas realize can be used as a stative predicate to mean something like ‘be aware of,’ discover only occurs as an achievement verb and so necessarily refers to the single instant in which knowledge is gained. On the relevant reading, when it’s used in the past tense, to not realize something is effectively to be ignorant of it. The predicate allows a scalar construal because ignorance is an inherently gradable property. Not realize licenses any here because the sentence as a whole allows for a scalar construal in which the asserted proposition – K didn’t know that M knew even one executive – contrasts with an ordered set of alternative propositions – K didn’t know that M knew two executives, … that M knew three executives, … that M knew four …, etc. The different propositions within the scalar model make different claims about the depth of Ken’s ignorance. Discovery, however, unlike ignorance, is not a gradient phenomenon. A single discovery does not suggest a set of contrasting discoveries. (14a), like

(12a), presupposes some familiarity with some number of oil executives, and it asserts that this familiarity has not been discovered with even one individual. But here the very nature of discovery prevents a scalar construal. The assertion indicates that a certain type of event, a discovery, has not taken place, but it does not thereby evoke any larger set of possible discoveries against which the assertion is to be evaluated. This fact makes it difficult, if not impossible, to accommodate an NPI here.

9.4 Logical conditions are not necessary

The evidence so far clearly demonstrates that logical monotonicity alone is not sufficient to guarantee polarity licensing. Polarity items, it turns out, are sensitive not just to the logical structure of a linguistic context, but also, and crucially, to pragmatic factors which affect what inferences can be drawn from a sentence on a given occasion of use. This result certainly challenges Ladusaw’s strong claim that “the property of being a trigger is completely predictable from the truth conditional meaning of an expression” (1979: 162). On the other hand, Ladusaw himself suggested that downward entailment might be a necessary, but not a sufficient, condition for polarity licensing (1979, 1983: 385). So the question now becomes, can polarity items ever be licensed in contexts which are demonstrably not downward entailing? And if they can, then what role, really, does logical monotonicity have to play in polarity licensing? In this section I will consider a number of contexts which, at least sometimes, license polarity items, but which are not logically downward entailing. For Linebarger, at least, this result has been taken to show that notions like logical monotonicity are in fact irrelevant to polarity licensing, and to grammar in general. This conclusion may be too uncharitable. Ladusaw, I think, had the right idea but applied it at the wrong level of grammar. Polarity items may not be sensitive to monotonicity per se, but they are sensitive to something very much like monotonicity, that is, scalar inferencing. In the next section I will consider examples which illustrate this sensitivity as it appears in contexts which are not, strictly speaking, downward monotonic.

9.4.1 ‘Exact’ cardinals

Noun phrases involving a precise cardinal quantity (exactly three puppies, precisely two reasons) pose a problem for the Monotonicity Thesis.
Under the right conditions, an expression of this sort can sometimes license polarity items despite the fact that it is clearly non-monotonic. As the examples below demonstrate, a quantifier like Exactly 3 N will not license upward or downward

244 The Grammar of Polarity entailments within its nuclear scope: from the truth of (15a), neither (15b) nor (15c) can safely be inferred. (15)

a. Exactly 3 professors read a novel last night. b. ↛ Exactly 3 professors read a book last night. c. ↛ Exactly 3 professors read a trashy romance novel last night.

Exactly 3 is not upward entailing since if just exactly three professors read a novel last night, it still may be that many more were busy reading important scholarly monographs. And Exactly 3 is not DE since if exactly three professors read novels, it is still possible that they all read different kinds of novels: one might have read a trashy romance, while the other two read trashy mysteries. Judgments vary on the licensing potential of these sorts of NPs, but I am inclined to agree with Linebarger (1987, 1991) that under the right conditions an ‘exact’ NP can trigger NPIs: the constructed examples in (16) strike me as fairly unimpeachable.

(16)
a. There are exactly two reasons I would ever talk to her again: one is if my life depended on it; the other is if she were to say ‘hello’ to me.
b. Exactly three people have expressed the slightest interest in reading my dissertation: you, my mother, and this homeless guy I talk to in the park.
c. There are precisely four people in the whole world who would so much as consider lifting a finger to help that maniac, and one of them is your father.

Linebarger argues that NPIs are licensed in examples like these because, and only because, they convey an appropriate negative implicature. (16a), for example, clearly conveys that beyond the two reasons given, there are no other reasons the speaker would ever talk to her again. And (16b) similarly conveys that beyond the three people mentioned, no one else has shown the slightest interest. Linebarger’s basic intuition in these cases is surely correct. Some notion of implicature must play a crucial role in these examples. At first blush, the examples in (16) can be no more easily explained by pragmatic scalar entailments than they can by logical monotonicity: as the sentences in (17) suggest, exactly N is neither scale preserving nor scale reversing, any more than it is upward or downward entailing.

(17)
a. Exactly three professors can solve the hard puzzles.
↛ Exactly three professors can solve the easy puzzles.
b. Exactly three professors can solve the easy puzzles.
↛ Exactly three professors can solve the hard puzzles.

In (17a), given three professors who can solve the hard puzzles, there is no reason to suppose that there isn’t an abundance of professors who can solve the easy puzzles. In (17b), given only three who can solve the easy puzzles, it seems unduly optimistic to suppose that all of them can also solve the harder ones. Still, it should be clear that the reason examples like those in (16) are well-formed has everything to do with the scalar semantics of exact quantification. The reason an expression like exactly three N can license NPIs is that it adds something to what would be expressed by three N alone. While three N may sometimes be used to express ‘at least three N,’ exactly three N makes explicit, and so indefeasible, the upper-bounding implicature ‘no more than three N.’ The sense of precision in a word like exactly is thus not symmetrical: although in principle it means ‘neither more nor less,’ in practice the emphasis is often on the ‘not more.’ And in as much as exactly three serves to convey ‘no more than three,’ it is both downward entailing and scale reversing: if no more than three professors can solve the easy puzzles, then no more than three can solve the hard ones either. The suggestion here is that sentences like those in (16) license NPIs not because of what they assert or entail, but more generally because of what they convey. And crucially, what they convey here is not just a matter of truth-conditional semantics, but also of a sentence’s rhetorical function in context. Unless we allow the ‘no more than’ reading as a conventional sense of exactly, an account based strictly on logical monotonicity cannot explain the licensing in (16). Pragmatic considerations should not affect a monotonicity calculus, but they do affect scalar inferencing. For example, given the (quite natural) assumption that a great many people should be interested in reading a given dissertation, the assertion in (16b) probably conveys more information about how many people will not read this dissertation than about who will.
As Linebarger suggests, it is because of this pragmatic potential that NPIs can be licensed in examples like those in (16), and the effect of the NPIs in these examples is itself precisely to emphasize this negative pragmatic potential. In effect, I am proposing a sort of compromise between Ladusaw and Linebarger. Linebarger is right to point to the importance of implicature in licensing the NPIs in (16); however, her account leaves the scalar nature of the implicature conveniently obscure. Ladusaw is right to point to the importance of inferencing as the crucial mechanism of licensing; however, his account leaves no room for the important role pragmatics often plays in creating the appropriate inferences. The scalar approach to polarity licensing seeks to preserve the insights of both Ladusaw’s logic and Linebarger’s pragmatics.
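The logical half of this contrast can be checked mechanically. What follows is only a minimal sketch, under the assumption that determiners are modeled as relations between finite sets (restrictor and nuclear scope); the function names and the five-member universe are illustrative choices and not part of the analysis itself. It searches every pair of sets for a counterexample to monotonicity in the nuclear scope, and it says nothing about the pragmatic, scalar side of the argument.

```python
from itertools import combinations


def exactly(n):
    """'Exactly n A are B': true when the overlap of A and B has size n."""
    return lambda restrictor, scope: len(restrictor & scope) == n


def no_more_than(n):
    """'No more than n A are B': true when the overlap has size at most n."""
    return lambda restrictor, scope: len(restrictor & scope) <= n


def all_subsets(universe):
    """Yield every subset of a finite universe as a frozenset."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def entails_in_scope(quantifier, universe, direction):
    """Brute-force test of monotonicity in the nuclear scope.

    direction="down": Q(A, B) must imply Q(A, B2) for every subset B2 of B.
    direction="up":   Q(A, B) must imply Q(A, B2) for every superset B2 of B.
    """
    sets = list(all_subsets(universe))
    for restrictor in sets:
        for scope in sets:
            if not quantifier(restrictor, scope):
                continue
            for other in sets:
                related = other <= scope if direction == "down" else other >= scope
                if related and not quantifier(restrictor, other):
                    return False  # a counterexample model exists
    return True


# Assumption: five individuals are enough to exhibit counterexamples for n = 3.
PROFESSORS = frozenset(range(5))

results = {
    "exactly 3, downward": entails_in_scope(exactly(3), PROFESSORS, "down"),
    "exactly 3, upward": entails_in_scope(exactly(3), PROFESSORS, "up"),
    "no more than 3, downward": entails_in_scope(no_more_than(3), PROFESSORS, "down"),
    "no more than 3, upward": entails_in_scope(no_more_than(3), PROFESSORS, "up"),
}
```

On this toy universe the search finds counterexamples in both directions for exactly 3, and none in the downward direction for no more than 3, which matches the claim that only the upper-bounding reading supplies a downward entailing context.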

9.4.2 ‘Most’ and ‘few’

The determiners most and few provide another example of environments which at least sometimes license NPIs but which cannot do so by virtue of their logical properties alone. As Ladusaw points out (1979: 151), most is difficult to judge, but the examples in (18) suggest that it is neither upward nor downward entailing on its first argument.

(18)
a. [Most boys who ate an apple] got sick.
b. [Most boys who ate fruit] got sick.
(18a) ↛ (18b); (18a) ↚ (18b)

(18a) does not entail (18b): it could be that all the boys ate some fruit, and that the apples were poisoned so that most of those who ate an apple got sick, but that really very few boys got sick because most just ate cherimoyas and blackberries and avoided the poisoned apples. This shows that most is not upward entailing on its first argument. Similarly, (18b) does not entail (18a): after all, it could be that the cherimoyas were poisoned, but that the apples contained an antidote, so that most of the boys who ate fruit got sick, but those lucky few who ate an apple were all spared. This shows that most is not downward entailing on its first argument. Similar considerations apply to the first argument of the determiner few. The examples below suggest that few, like most, is non-monotonic.

(19)
a. [Few girls who came to the party] rode motorcycles.
b. [Few girls who came to the party alone] rode motorcycles.
(19a) ↛ (19b); (19a) ↚ (19b)

Again, there are no entailments in either direction. It could be that hardly any girls came to the party on motorcycles, but that all those who did came alone: thus (19a) does not entail (19b). Similarly it could be that few of those who came alone rode motorcycles (say, two or three out of forty), but that most of the girls who came (say, 150 out of 200) came on motorcycles: thus (19b) does not entail (19a). But while most and few are both non-monotonic, neither downward nor upward entailing, they do license NPIs (Heim 1984; Jackson 1994; Barker 1995; Israel 1995a). The examples below suggest that many NPIs, both weak and strong, can appear in an NP quantified by most or few. (20a) comes from Barker (1995: 117); (20b) was heard by Larry Horn in a public service announcement from the NYC Department of Health.

(20)
a. Most children with any sense steal candy.
b. Most people who have ever smoked have already quit, and you can too.
c. Most people who have ever studied semantics understand the difference between sense and reference.
d. Most people who would lift a finger to help Bill now are either very foolish or very well-paid.

(21)
a. Few children with any sense play frisbee on freeways.
b. Few people who have ever really thought about it actually believe in the Tooth Fairy.
c. Few people with the least bit of human feeling can fail to be moved by the heroic achievements of our brave men and women.
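The counterexample scenarios offered for (18) and (19) can be written out as small set-theoretic models. The sketch below is illustrative only: it assumes that most means ‘more than half’ and that few means ‘under ten percent’ (the cutoff is a hypothetical stand-in for what is really a vague and context-dependent threshold), and the particular individuals and set sizes are invented to match the prose.

```python
def most(restrictor, scope):
    """'Most A are B', read here as: more than half of the As are Bs."""
    return len(restrictor & scope) > len(restrictor) / 2


def few(restrictor, scope, threshold=0.1):
    """'Few A are B', read here as: under `threshold` of the As are Bs.

    The 10% cutoff is a hypothetical simplification of a vague determiner.
    """
    return len(restrictor & scope) < threshold * len(restrictor)


# (18): ten boys all ate fruit; three of them ate an apple.
fruit_eaters = set(range(10))
apple_eaters = {0, 1, 2}

# Poisoned apples: only two apple eaters got sick.
sick_if_apples_poisoned = {0, 1}
# Poisoned cherimoyas, antidote in the apples: everyone but the apple eaters got sick.
sick_if_cherimoyas_poisoned = set(range(3, 10))

a18_without_b18 = (most(apple_eaters, sick_if_apples_poisoned)
                   and not most(fruit_eaters, sick_if_apples_poisoned))
b18_without_a18 = (most(fruit_eaters, sick_if_cherimoyas_poisoned)
                   and not most(apple_eaters, sick_if_cherimoyas_poisoned))

# (19): two hundred girls came to the party.
girls = set(range(200))

# Model 1: five girls rode motorcycles, and exactly those five came alone.
riders_1 = set(range(5))
alone_1 = set(range(5))
a19_without_b19 = few(girls, riders_1) and not few(alone_1, riders_1)

# Model 2: forty came alone, three of whom rode; 150 girls rode in all.
alone_2 = set(range(40))
riders_2 = {0, 1, 2} | set(range(50, 197))
b19_without_a19 = few(alone_2, riders_2) and not few(girls, riders_2)
```

Each of the four flags comes out true on these models, so neither sentence of either pair entails the other: on these readings the restriction of most and of few is logically non-monotonic, and whatever licenses the NPIs in (20) and (21) must come from elsewhere.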

While these determiners are logically non-monotonic, in the right context and under the right circumstances, they do allow the limited downward entailments needed to license a polarity item (Heim 1984: 102–4). Thus, for example, any is licensed in (20a) and (21a) because these sentences implicate that most children with a lot of sense will steal candy as well and that few children with a lot of sense will play frisbee on the highway either. In other words, despite their logical inadequacy, most and few can license polarity items because they can trigger scalar inferences. The examples below demonstrate that both most and few will, in the right context, reverse entailments in a scalar model. In (22), given the familiar conceptual scale with puzzles ranked in terms of their difficulty, both forms license pragmatic inferences from propositions involving lower-ranked easy puzzles to propositions involving higher-ranked, harder puzzles.

(22)
a. [Most students who could solve the easy puzzles] got a prize.
→ [Most students who could solve the hard puzzles] got a prize.
b. [Few students who could solve the easy puzzles] had trouble on the exam.
→ [Few who could solve the hard puzzles] had trouble on the exam.

The inference in (22a) is natural because, if prizes are being awarded for solving easy puzzles, one naturally assumes they will also be awarded for solving harder puzzles. Similarly, in (22b), if an ability to solve the easy puzzles is normally enough for students to get through the exam without difficulty, then presumably an ability to solve the harder ones will only make the exam that much more trouble-free. So, given the right context and the right scalar model, the restriction of a quantifier like most or few does form a scale-reversing context and so licenses the inferences needed to license an NPI. Of course, the availability of these inferences depends on the context and crucially on the content of the main clause predicate, that is, the quantifiers’ nuclear scope. Switching the predicates in these examples switches the inferences they support.

(23)
a. [Most students who could solve the easy puzzles] had trouble on the exam.
← [Most students who could solve the hard puzzles] had trouble …
b. [Few students who could solve the easy puzzles] got a prize.
← [Few students who could solve the hard puzzles] got a prize.

In (23) the inferences now run from hard puzzles to easy ones. In (23a), if an ability to solve the hard problems does not prevent one from having trouble on the exam, then presumably the mere ability to solve easy ones will not help either. In (23b), if students who solve hard puzzles normally do not receive a prize, then presumably students who just solve easy ones will not be awarded one either. Thus, the same scalar logic which makes most scale reversing in (22a) also makes few scale preserving in (23b); and the same scalar logic which makes few scale reversing in (22b) also makes most scale preserving in (23a). The inferential properties of these forms thus depend on the ways they are used in context, and, more precisely, on the ways they are construed with respect to a scalar model. Switching the predicates in (22) and (23) switches the scalar models: both models license inferences about students based on their problem-solving abilities, but in one the conceptual scale of puzzles supports inferences about who will get a prize, and in the other it supports inferences about who will have trouble on an exam. Thus these forms do license polarity items by virtue of their inferential properties, but these are not logical properties of the forms themselves, nor even of the sentences which contain them. They are rather a pragmatic property of the way these forms can be used.

9.4.3 ‘After’ effects

My final example of a polarity trigger whose affective powers depend crucially on pragmatic assumptions comes from the occasional use of NPIs with subordinating expressions like long after or hours after. The following examples, adapted from Linebarger (1987: 370–1), illustrate after’s licensing potential.

(24)
a. He kept writing novels long after he had any hope they might be published.
b. The mad general kept issuing commands hours after there was anyone left to obey them.
c. She kept trying out for the team long after she had a snowball’s chance in hell of ever making it.
d. He kept wanting to call her up long after he had the slightest interest in actually talking to her.

Linebarger (1987, 1991) uses examples like these to support her contention that non-negative polarity licensing depends on the availability of a negative implicature. In each case the situation profiled by the main clause is presented as persisting despite the fact that (i.e. long after) some normal or expected condition for its continuation has ceased to hold: (24a) conveys that novel writing persisted when there was no hope of publication; (24b) portrays a general issuing orders with no one to obey them; (24c) indicates continued efforts when there is no longer any hope for success; and (24d) concerns a man who still wants to call someone even when he no longer wants to speak with her. As Linebarger notes, expressions like (long) after cannot be considered downward entailing. In fact, the evidence suggests that they are normally actually upward entailing. The examples below show that after licenses inferences from specific instances (in this case, a specific book) to the general case, and not the other way around.

(25)
a. Marcia solved the mystery (long) after she read about it in a book.
b. Marcia solved the mystery (long) after she read about it in A Natural History of Negation.
(25a) ↛ (25b); (25a) ← (25b)

The inference from (25a) to (25b) fails because Marcia might have solved the mystery after reading about it in a book other than Horn’s classic tome: this demonstrates that (long) after is not a downward entailing operator. The inference from (25b) to (25a), on the other hand, seems generally valid: if she solved the mystery after she read about it in A Natural History, then she necessarily solved it after she read about it in a book. This suggests that (long) after is in fact upward entailing. Again, as with exactly, Linebarger’s basic contention is unimpeachable: the licensing effects in (24) depend on the availability of an appropriate implicature and cannot be attributed to the logical structure of these examples. But what Linebarger fails to appreciate (or at least to acknowledge) is that the implicatures in these cases crucially exploit a scalar logic. Each of these examples depends on a background assumption according to which two types of situations are understood in terms of linked scales. The implicature in (24a), for example, depends on a default assumption that one’s interest in writing novels might depend in part on a belief that one’s novels could be published. Given this assumption, the contribution of any is to emphasize that the novel writing persisted even when the normal motivation of possible publication had been reduced to a minimum. And crucially the use of any in an example like this triggers scalar inferences that the novel writing would also persist if there was any greater amount of hope. So the context does succeed in supporting the scalar semantics of the NPI. Similarly, in (24d), there is a normal assumption that the more one wants to talk to someone, the more likely one will be to want to call them. In this context, the emphatic NPI the slightest helps to highlight the perversity of the situation by emphasizing the total dissociation of the subject’s desire to call from any normal motivating desire to talk. Again, the slightest is licensed because the context does support the inference that if the subject had any greater desire to talk, he would also still want to call. It seems to me that it cannot be a coincidence that a form like after licenses NPIs only in contexts which have this sort of rich scalar structure. Linebarger is correct in pointing to the importance of implicature in these examples. The conclusion, however, is not that inferential properties like logical monotonicity are irrelevant to polarity licensing (pace Linebarger 1991 and Giannakidou 1998, 2006), but rather that the relevant inferencing cannot be stated exclusively in terms of a sentence’s logical semantic structure.

9.5 Rhetorical coherence

Polarity licensing, I maintain, is fundamentally a pragmatic phenomenon, and depends more on the ways a sentence can be construed in context than on its strictly logical properties. Logical monotonicity alone cannot license a polarity item where the appropriate scalar inferences cannot be drawn, and where they can be drawn it is not necessary. It is worth considering why this should be. The Scalar Model holds that polarity items are scalar operators, and that their conventional semantic properties cause their sensitivities. These properties are themselves fundamentally pragmatic, at least in the broad sense that they concern an expression’s subjective construal as well as its objective content. Q-value depends on the way an expression’s profiled content is construed against an ordered set of alternatives, and i-value depends on the sorts of rhetorical effect an expression can contribute to an utterance. Polarity items are sensitive because only in certain contexts can they felicitously express both of these scalar semantic-pragmatic properties. The expression of these features often depends on factors other than just the scalar properties of a licensing context. I-value is basically an index of speaker affect, and as such may be constrained by considerations of politeness or interpersonal rhetoric generally. Emphasis and attenuation, in other words, are not just diacritics which segregate polarity items into scale-reversing and scale-preserving contexts. They are genuine properties of an item’s expressive force, with real consequences for its distribution in different expressive contexts. Consequently, polarity items should be sensitive not just to the pragmatics of scalar inferencing, but to the pragmatics of social interaction in general. Consider, for example, the behavior of polarity items in conditional constructions like those below, where a conditional clause serves a sort of quasi-performative function (Austin 1956; Geis & Lycan 1993).

(26)
a. If you’re hungry, there’s some pizza in the kitchen.
b. If you don’t mind my saying so, you look absolutely gorgeous.
c. If you’ve got a minute, could you take a look at this?

Sweetser calls such constructions speech act conditionals since “the performance of the speech act represented in the apodosis is conditional on the fulfillment of the state described in the protasis” (1990: 118, emphasis in original). This construction poses a minor problem for any strictly logical approach to polarity licensing, if only because it is difficult to gauge the logical properties of what is essentially a pragmatic construction. While content conditionals clearly do license (at least limited) downward inferences, the inferential properties of speech act conditionals are harder to establish. Thus, while the content conditional in (27) clearly licenses the downward inference from the genus car to the species Lamborghini, the speech act conditional in (28) does not seem to work the same way.

(27)
a. If you have a car, you can get there on time.
→ b. If you have a Lamborghini, you can get there on time.

(28)
a. If you need a car, I’ll gladly sell you my old Mazda.
↛ b. If you need a Lamborghini, I’ll gladly sell you my old Mazda.

In (28) the speaker conditions her offer on the hearer’s needs, but just because the hearer’s needing a car might justify the speaker’s offer to sell her Mazda, it does not follow that the hearer’s needing a Lamborghini would also justify such an offer. Logically, speech act conditionals are very different from their contentful cousins, since in a speech act conditional the truth of the protasis does not count as a sufficient condition for the truth of the consequent. Indeed, in examples like (26a) the sentence as a whole actually entails the apodosis – that is, the pizza must be in the kitchen whether the addressee is hungry or not (Austin 1956: 210; Horn 2000c). But speech act conditionals often do support the sorts of scalar reasoning found with other conditionals, licensing inferences from low to high scalar values. Basically, if something holding to a small degree is enough to justify the performance of a speech act, then, all things being equal, that same thing holding to a greater degree will justify it as well. Thus in (26c), for example, if your “having a minute” is enough to warrant a request for some of your time, then presumably your having even more time would also warrant the request. Given this sort of logic, speech act conditionals should license NPIs, and indeed they do, at least sometimes. In (29), while the liberal NPIs anything and at all are fine in clauses conditioning an offer of food, the more emphatic the least bit is awkward.

(29)
a. If you want anything to eat, there are plenty of eggs.
b. If you’re at all hungry, there are plenty of eggs.
c. ??If you’re the least bit hungry, there are plenty of eggs.

Considering just its minimal q-value, one might expect the use of the least bit here to emphasize the speaker’s sincerity and generosity in offering the eggs: literally, it just seems to say that the offer stands even if the hearer is only minimally hungry. But actually, it sounds oddly manipulative rather than generous. The problem is that the point of the conditional here is to make the offer without sounding pushy. Speech act conditionals often serve to mitigate the face threat associated with a speech act. In this case, the conditional helps prevent the offer of eggs from being mistaken as a demand that the guest should eat: the offer is conditioned on the addressee’s desire to eat, and this gives the addressee a valid option to refuse. But the emphatic nature of the least bit closes down that option by insisting on the offer even where it might least be desired. The emphatic NPI thus makes what would be a gracious offer into a peremptory demand. The examples in (30) illustrate much the same sort of effect. Here the conditional serves to mitigate the face threat associated with an invitation by making it easier to decline it. The choice between some and any in (30a) may reflect the speaker’s optimism that the addressee might say ‘yes,’ but the more emphatically negative constructions with any … at all and even a second in (30b–c) undermine the rhetorical purpose of the speech act conditional, which is to give the hearer options. Instead of making it easier for the hearer to say “no,” these more emphatic NPIs suggest that the invitation should be accepted even if he has only the smallest amount of time.

(30)
a. Assuming you have {Ø/some/any} time tonight, would you like to see a movie?
b. ?Assuming you have any time at all tonight, would you like to see a movie?
c. *Assuming you have even a second to spare, would you like to see a movie?

The overt clash between the insistence of the emphatic NPIs and the deference implied by the conditional invitation creates an awkward effect, and as the polarity items get more emphatic, the effect gets worse. The problem here has nothing to do with the scalar inferential properties of the conditional construction itself, which does in fact provide the scalar inferences needed to license an NPI. The indirect invitation is predicated on the condition that the hearer should have enough free time, and in general if having a little time suffices to justify the invitation, then having more time should suffice as well. The conditional does make the high i-value NPIs any and at all appropriately emphatic. The problem is that the emphatic rhetoric itself conflicts with the deferential pragmatics of the conditional usage as a whole here. In general, speech act conditionals resist emphatic NPIs. This is particularly true where the premise serves to mitigate the potential face threat associated with an illocutionary act. My judgments in (31) – variations on an example from Quirk et al. (1985) – suggest that as NPIs get more emphatic, they undermine the deferential nature of the conditional as a whole and so lead to a decline in acceptability.

(31)
a. We’re getting married, if it’s of any interest to you.
b. ?We’re getting married, if you’re at all interested.
c. *We’re getting married, if you’re the least bit interested.
d. **We’re getting married, if you give a flying fuck.

The same effect is evident in (32), and is, if anything, more striking since the NPIs here fail to be licensed by a negation inside the conditional. Note that the semi-NPI mind works in this context because it is part of a stereotyped speech act conditional formula, and it fits in the formula precisely because its understating meaning is consistent with the deferential pragmatics of the speech act conditional.

(32)
a. If you don’t mind my saying so, you should dump that jerk.
b. ??If you don’t at all mind my saying so, you should dump that jerk.
c. *If you don’t in the least mind my saying so, you should dump that jerk.

In these examples, of course, the sentence as a whole can hardly be construed as deferential: the advice given in the apodosis is neither delicately phrased nor likely to be welcomed. Still, the speech act conditional here at least pays lip service to the idea that the speaker will withhold her advice if it might cause the hearer any distress. Once again, the emphatic nature of the NPIs undermines this show of consideration, and again licensing fails because the rhetorical force of the polarity items clashes with the rhetoric of the construction in which they appear. Emphatic NPIs are not entirely barred from speech act conditionals. They are welcome, in fact, so long as they do not clash with the pragmatic purpose of the conditional itself. Sometimes, for example, a conditional might be used to establish that a given speech act is relevant to the hearer’s interest. The example in (33a), from Quirk et al. (1985: 1092), is a case in point: the conditional here does not serve to mitigate a face-threatening speech act; on the contrary, it helps to show why the information expressed in the main clause should be welcome. Emphatic NPIs are more than welcome here, where their rhetorical strength effectively reinforces the generosity of the offer.

(33)
a. I’ll be in my office all day, in case you have any problems at all.
b. If you have any problems at all, just give me a call.
c. If you have even the slightest problem, just give me a call.
d. If anybody so much as lifts a finger to interfere with your work here, just give me a call.

The distribution of NPIs in speech act conditionals shows that polarity sensitivity is not just a matter of scalar reasoning tout court, but crucially depends on the expression of a polarity item’s rhetorical affect. Scalar reasoning is critical because it enables this affective expression, but it is the rhetorical affect itself that determines a polarity item’s distribution.

9.6 Affectivity reclaimed

The study of polarity sensitivity began in earnest with Klima’s notion of affective contexts, and with his hypothesis that there is some “grammatico-semantic” property which unites the class of polarity licensors. Since polarity sensitivity is clearly a grammatical phenomenon, a distributional property of particular linguistic constructions, it has seemed natural to assume that licensing itself is a property of linguistic representations, and that polarity contexts, as a class, are defined by innate principles of grammar which govern the set of possible semantic and syntactic structures. A central purpose of this chapter has been to question this assumption. The question is not whether polarity contexts form a class of some sort. Obviously they do: they are the class of contexts which license polarity items. However, the evidence reviewed here suggests that polarity contexts are defined not by their formal linguistic properties, but rather by their meaningful pragmatic effects – by the acts of construal in which linguistic forms are used and understood. There is thus no level of grammatical representation, syntactic or semantic, at which there are necessary and sufficient conditions defining the class of polarity contexts – at least not independently of the way such syntactic and semantic structures are deployed and construed in an actual or imagined utterance. The quality of being a polarity context depends on a combination of morphological, syntactic, semantic, and pragmatic factors, including background knowledge, contextual assumptions, and speaker intentions, all of which work together to make the use of a polarity item in an utterance actually make sense. But the particular sense polarity items need to be licensed is not some abstract grammatical symbol, but an actual act of scalar inferencing in the construal of what is said.

10 Visions and revisions

And so we return, roughly, to where we started in this work – that is, to a vision of grammar grounded in communicative interaction, and of pragmatics itself as an integral part of grammatical structure. My own view is that the evidence from polarity sensitivity strongly supports this vision, but I hardly expect those who take a different view to quickly change their minds in the face of a few odd facts. I speak here of visions, after all, and how one views the facts of a matter necessarily depends on how one sees the matter in which the facts consist. Polarity sensitivity is a complex phenomenon, but there is no reason why a theory of polarity sensitivity should not be simple. The theory I have presented here, the Scalar Model of Polarity, is a simple theory of how grammatical sensitivities can be determined by cognitive constructional semantics. The goal has been to explain both what it is that makes something a polarity item and, incidentally, why it is that polarity items should exist in the first place. The Scalar Model holds that polarity items are scalar operators and as such their denotata must be construed with respect to the information structure supplied by a scalar model. A scalar model is itself a structured set of propositions organized in a way that supports scalar inferencing. The theory is that polarity items are polarity sensitive because they require both that an expressed proposition occupy a fixed position within a scalar model, and that the expressed proposition stand in a fixed inferential relationship to a scalar norm. The first requirement means that polarity items encode a conventional quantitative value, either high or low in a scalar ordering. The second requirement means that polarity items encode a conventional informative value, either emphatic or attenuating with respect to other values in a scalar model. 
The effect is that polarity items can only occur where an appropriate scalar construal is available to render the expression of their quantitative values appropriately emphatic or attenuating, as the case may be. In this sense polarity sensitivity itself is essentially just a sensitivity to scalar inferencing. The important point, however, is not just that polarity items are scalar operators, but more precisely that they are grammaticalized expressions of pragmatic affect, conventional encodings of the twin antithetical rhetorical figures emphasis and attenuation. The key to polarity sensitivity thus lies in the notion of informative value as a grammaticalized (and grammaticalizable) semantic-pragmatic feature. The conventional association of this feature with specific linguistic constructions stems from the ways speakers habitually deploy scalar models to achieve their rhetorical purposes. Grammar, on this view, is driven by the rhetorical factors at play as speakers negotiate the dangers and delicacies of communicative interaction. While polarity sensitivity is a purely linguistic phenomenon, it is grounded in a general feature of human conceptual and perceptual systems, that is, the ability to perceive an instance and to conceive of alternatives. At least in this case, complex grammatical knowledge does not depend on richly structured domain-specific linguistic abilities but arises rather as the linguistic manifestation of a much more general mental ability. The grammar of polarity sensitivity is still “grammar” – something which a speaker must know about her language. But in this case what the speaker must know is itself in fact something essentially pragmatic. What people are sensitive to when they use sensitive constructions are the ways those items contribute to the intersubjective and rhetorical construal of an utterance. Pragmatics thus does not replace, but rather resides in, grammar.

Appendix: A catalogue of English polarity items

This is a partial catalogue of words and constructions I suspect may be polarity sensitive in at least some varieties of Modern English. Although this is by no means an exhaustive list, I have sought to include examples of as many different sorts of English sensitive constructions as possible. The constructions are divided into four groups based on the interaction of the two features, q-value and i-value, as elaborated in §4. The four groups are here named after their most salient exemplars:

any-type NPIs: low quantitative value in a strongly informative proposition, <−q/+i>
some-type PPIs: low quantitative value in a weakly informative proposition, <−q/−i>
much-type NPIs: high quantitative value in a weakly informative proposition, <+q/−i>
galore-type PPIs: high quantitative value in a strongly informative proposition, <+q/+i>

The following conventions are observed here. Different constructions are separated by “;”. Paradigmatic alternatives within a construction are listed in curly brackets and separated by slashes, as {A/B/C}. Simple parentheses are used for constituents within a construction which are either optional or which allow various morphological realizations.

1. Amount
2. Degree
3. Frequency
4. Earliness
5. Lateness
6. Connection
7. Similarity
8. Potential
9. Significance
10. Effort
11. Esteem
12. Inclination
13. Aversion
14. Tolerance
15. Trouble
16. Wealth
17. Poverty
18. Loquacity
19. Sapience
20. Various
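The four-way grouping by q-value and i-value can be pictured as a small lookup table. The following is only an illustrative sketch, not part of the catalogue itself: the names `Features`, `EXEMPLARS`, `sensitivity`, and `label` are hypothetical, ASCII `-` stands in for the minus sign, and the rule in `sensitivity` is simply read off the four group definitions (low-emphatic and high-attenuating items are NPIs; the other two combinations are PPIs).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Features:
    """Hypothetical encoding of the two features used in the catalogue."""
    high_q: bool    # True for +q (high quantitative value), False for -q
    emphatic: bool  # True for +i (strongly informative), False for -i


# Exemplar names for the four groups, as listed in the catalogue.
EXEMPLARS = {
    Features(high_q=False, emphatic=True): "any",
    Features(high_q=False, emphatic=False): "some",
    Features(high_q=True, emphatic=False): "much",
    Features(high_q=True, emphatic=True): "galore",
}


def sensitivity(f: Features) -> str:
    # Read off the four groups: where the two signs match, the group is
    # a PPI type; where they differ, it is an NPI type.
    return "PPI" if f.high_q == f.emphatic else "NPI"


def label(f: Features) -> str:
    # Build a label such as "any-type NPIs <-q/+i>".
    q = "+q" if f.high_q else "-q"
    i = "+i" if f.emphatic else "-i"
    return f"{EXEMPLARS[f]}-type {sensitivity(f)}s <{q}/{i}>"


if __name__ == "__main__":
    for features in EXEMPLARS:
        print(label(features))
```

Running the script prints one label per group. The sketch captures only the classification of the four groups; it says nothing about how individual items are assigned their feature values.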

1. The amount scale: quantitative polarity items

Entities ordered in terms of measurable amounts and individuability.

any <−q/+i>: any; anyone; anything; a (living) soul; a one; a (damn) thing; an iota; a jot; a thimbleful; jot or tittle; jack; diddly; fuck (all); {diddly/doodly} squat; (jack) shit; the {slightest / least (little)}
some <−q/−i>: some; several; certain; a {certain/particular/few/little} NP; one or two; a thing or two; a {couple/handful/number/smattering} (of NP); a {dash/dearth/glint/mite/shiver/smidge/soupcon/spell/splash/sprig/taste/tinge/touch/trifle/whiff} (of NP); a while; a ways; a good {bit/deal/ways/while}; a {fair/decent} bit; a {piddling/trifling} amount; a pittance; dribs and drabs; bits and pieces; (precious) {few/little}; scant
much <+q/−i>: much; many; a whole (hell of a) lot; all that {much/many}; exactly a lot; precisely a whole lot; the half of it
galore <+q/+i>: a (whole) {bunch / heap / pile / rash / ton / shitload, a swarm} (of NP); a {fair/good/great} {amount/bit/deal/many/ways/while}; {barrels/heaps/(shit)loads/mountains/piles/reams/scads/tons} (of NP); {fuck/hell} (of NP); the whole {bit / deal / megillah / show / story / shebang / works / kit and caboodle / nine yards / shooting match / ball of wax}; the long and the short of it; every last Nom; galore

2. The degree scale: intensive polarity items

Situations ordered in terms of extent or intensity.

any <−q/+i>: at all; in the {least / slightest}; the {least / slightest} bit Adj whatsoever; any Adj-er
some <−q/−i>: somewhat Adj; fairly; kinda; sorta; moderately; rather; pretty; a {little/tad} Adj; in a {sense/way}; to a {degree/point}; in a manner of speaking; so to speak; as it were; more or less; somehow (or other); partly; partially; in part; little by little; inch by inch; by degrees; piecemeal; piecewise; bitwise; inchmeal; only so Adj; only so {much/many/long/far}; Adj-ish (e.g. tallish; smallish; lateish; blueish …); hardly; barely; scarcely; neither
much <+q/−i>: all that Adj; as Adj as all that; too {terribly/very}; any too; so very; overly; overmuch; (so) much of a Nom; exactly a Nom; far wrong; be {(any) great shakes / a panacea}
galore <+q/+i>: totally; wholly; ever so Adj; {all/fully/way} Adj; absolutely; amazingly; awfully; considerably; extraordinarily; extremely; insanely; thoroughly; utterly; {fucka/hella/wicked} Adj; Adj as {fuck / hell / shit / all get out}; by far the Adj-est; far and away the Adj-est; {far/way} more Adj; be of a magnitude (that); sheer {joy/luck/N}; all in all; all the time in the world; all smiles; in every (last little) way; whole hog; to the utmost; to all intents and purposes

3. The frequency scale: occasional polarity items

Situations ordered by the density of their occurrences in a region of time or space.

any <−q/+i>: ever; even once; at any time; in my life; in all the time that; anywhere near; in the world; in hell; on the face of the earth
some <−q/−i>: once in a while; once or twice; some; sometimes; a few times; a number of times; occasionally; on occasion; from time to time; at times; now and then; now and again
much <+q/−i>: much; so much; that much; so often; that often; quite all the time
galore <+q/+i>: all the time; be {always/constantly/continually/ever} V-ing; ever since; ever Adj (e.g. ever ready; ever prudent; ever alert); ever Adj-er; ever the Nom; again and again; over and over

4. The earliness scale: inceptive polarity items

Situations ordered by proximity to their starting point or instantiation.

any <−q/+i>: in {weeks / months / ages / eons / (a million) years / a coon’s age / a blue moon}; punctual until (e.g. I won’t love you until you kiss me)
some <−q/−i>: in a while; in a (little) bit; a (little) bit ago; some time ago
much <+q/−i>: yet; as of yet; just yet; (much) in a hurry; any time soon; any too soon; much ago
galore <+q/+i>: already; almost; nearly; suddenly; just now; even now; (just) this (very) {minute/second}; at a moment’s notice; in a {flash / jiffy / heartbeat / trice / (New York) minute}; in {two shakes / the blink of an eye}; before you know it; faster than you can {count to 10 / say X to Y / think about it}; on the {brink/verge} of V-ing; within an eyelash; within an inch (of his life)

5. The lateness scale: durative polarity items

Situations ordered by their span from beginning to end.

any <−q/+i>: be {a sec / a minute / any time}; {believe / think} for a {minute/moment/second}; {have/get/give} a minute’s {peace/rest/solitude}; last a {minute/moment/second}; (can) go on; take but a minute; a moment of your time; have a moment to spare; lose an instant; miss a beat
some <−q/−i>: be a minute or two; be a (little) while; be a (little) bit; last a while; a spell; a bit; go on for a while; just a sec
much <+q/−i>: anymore; any longer; much longer; much more; {be/take/last} long; too long; so long; can {last / go on (like this)}
galore <+q/+i>: still; so far; up to now; {be/have} yet to V; finally; at last; at long last; long since; high time; until the end (of time); for ages; for an eternity; until the cows come home; keep on keeping on

6. The connection scale: connective polarity items

Complex propositions ordered in lattices with component propositions.

any <−q/+i>: X or Y either (Focus Particle either); nor; or even; or any… else; or XP either
some <−q/−i>: either X or Y (Disjunctive either); or {maybe/perhaps}; (can)… X or on the other hand Y
much <+q/−i>: X let alone Y; X much less Y; X never mind Y
galore <+q/+i>: X and Y too; X as well as Y; X, (and) {even / furthermore / in fact / what’s more} Y

7. The similarity scale: comparative polarity items

Pairs of entities ordered by mutual similarity, from the merely comparable to the fully indistinguishable or identical.

any <−q/+i>: (can)… {compare to / hold a candle to / touch}; (be)… {anything like / in the same street as / a patch on / even close to}; (be) a thing like X; by a long shot; the likes of which; without {compare/equal/match/peer} (cf. *with compare, *with equal, …)
some <−q/−i>: something like; circa; give or take; more or less; reminiscent of
much <+q/−i>: (be) an X so much as a Y; be {exactly/that/so} Adj; (be) {exactly / such} a Nom; can place (= ‘recognize’)
galore <+q/+i>: the very one; the identical one; the spitting image; {dead / spot} on; unique; every bit the; one of a kind; a real; a regular

8. The potential scale: modal polarity items

Situations ordered in terms of their potential for being real.

any <−q/+i>: can V (epistemic); can possibly V; (can)… {begin/manage/seem} to V; have a {hope / prayer / (ghost of a) chance} of V-ing; a snowball’s chance in hell; (can)… {cut/hack} it
some <−q/−i>: {may/might} V (epistemic wide scope only); be liable to V; could (just as) well V; just might V; could do with V-ing; maybe S; perhaps S
much <+q/−i>: need V; necessarily (epistemic); just because X (does) mean Y
galore <+q/+i>: {must / should / gotta / (had) better} V; be {bound/compelled/led} to V; absolutely have to V; be every likelihood that S; fully expect that S; certainly S; surely S

9. The significance scale: mattering polarity items

Situations ordered in terms of their importance, or contextual relevance.

any <−q/+i>: matter (e.g. it matters, no matter); make a difference; mean a thing; {be worth / amount to} {jack / shit / (a hill of) beans}; cut any ice; add up; count; signify; register; bear comment; be (any/*some)… {good/point/use} (V-ing); be of (any/*some) avail
some <−q/−i>: mean something; be of some significance; be worth noting; be a drop in the ocean; be a merest detail; be beneath notice; could help; might do X some good; be better than nothing
much <+q/−i>: big deal; make {much / a whole (hell of a) lot of} difference; be {a great matter / of consequence / the be all and end all / a thing to write home about}; set the Thames on fire; be {as bad as all that / as black as X is painted / half bad}
galore <+q/+i>: have every reason to V; mean {a lot / (everything in) the world} to; be the {real thing / real deal / genuine McCoy / shit}; be a matter of some urgency

10. The effort scale: trying polarity items

Actions ordered by degrees of willful exertion involved.

any <−q/+i>: even try; bother to V; bother V-ing; can be bothered to V; (would) stoop to V; lift a finger; do a thing; (do) a {stroke/lick/stitch} of work; budge (an inch); move a muscle; crack a book
some <−q/−i>: make an effort to V; have a {go/stab} at V-ing; take a {shot/stab} at; give X a {go/shot/try/whirl}
much <+q/−i>: put oneself out; sweat it; trouble oneself; let X trouble one’s head about
galore <+q/+i>: {strain / struggle / make every effort} to V; do one’s {all/best/damnedest} to V; give it one’s all; pull out all the stops; leave no stone unturned; go for it; throw oneself into; work one’s fingers to the bone; work like a {(cart) horse / galley slave}

11. The esteem scale: caring polarity items

Situations ordered in terms of an experiencer’s degree of appreciation.

any <−q/+i>: (give)… a {damn / rap / shit / (flying) fuck / (ragged) rat’s ass} (about); have a care (for); care a {bit/fig/jot/pin/snap} (for); (could) care less
some <−q/−i>: be fond of; have a thing for; quite like; rather enjoy
much <+q/−i>: care for; much care; take kindly to; much like; think much of; be one’s cup of tea; think {small beer / vin ordinaire} of; be (all that) thrilled about; be exactly looking forward to
galore <+q/+i>: adore; love; be mad for; delight in; be one’s heart’s delight in; embrace; applaud; welcome; endorse

12. The inclination scale: desiderative polarity items

Situations ordered in terms of a subject’s desire to experience them.

any <−q/+i>: dare V; care to V; have {any / the slightest} intention of V-ing; (would)… {dream of / even consider / be caught dead} V-ing; (could)… bring oneself to V; wild horses could {keep X…from V-ing / get X…to V}; (would)… touch (with a {barge/10-foot} pole); (would) go near; (would give)… {the time of day / a second thought / a moment’s notice}
some <−q/−i>: (would)… {rather / just as soon / prefer to / be tempted to} V; {could/might} (just) as well V; have half a mind to V; might consider V-ing
much <+q/−i>: care to V; be too {keen/eager} to V; exactly {dying to V / looking forward to V-ing}; (exactly) thrilled at the prospect of V-ing
galore <+q/+i>: (would)… {love / be delighted} to V; be {driven/excited} to V; {adore / be all about / look forward to} V-ing; simply must V; have an urge to V; be all over {it / that}

13. The aversion scale: abstentive polarity items

Situations ordered in terms of a subject’s inclination to avoid them.

any <−q/+i>: (can)… {help / resist / keep from / stay away from / live without} V-ing; (can)… {wait / resist the temptation} to V; (would) {hesitate to V / shrink from V-ing}; can get enough of
some <−q/−i>: can do without {NP/V-ing}; can take it or leave it; take exception to; take a dim view of; be unmoved by
much <+q/−i>: mind {NP/V-ing}; be averse to V-ing; have qualms about; (would)… {object to / sniff at / turn up one’s nose at / look down on / kick out of bed / say “no” to}; be bothered about; would {worry / trouble oneself} about
galore <+q/+i>: {hate/loathe} to V; {hate/abhor/detest/despise} V-ing

14. The tolerance scale: tolerative polarity items

Situations ordered in terms of a subject’s willingness to suffer for them.

any <−q/+i>: (can)… {abide / allow / bear / cope with / countenance / stand / stomach / swallow / take / tolerate / put up with / stand the sight of}; would be caught dead V-ing; (will)… {brook / have / hear of / stand for / suffer…(gladly)}
some <−q/−i>: {persist in / go (right) on} V-ing; can live with; weather (the storm); get by; make do (with); shrug off; grin and bear it
much <+q/−i>: have the heart to V; can {resign/reconcile} oneself to V-ing; take NP lying down
galore <+q/+i>: {be dying / give anything / sacrifice everything} to V; give one’s right arm to V; come hell or high water; brave out; take in stride; tough it out

15. The trouble scale: irritative polarity items

Actions ordered in terms of the bother they cause.

any <−q/+i>: (be)… {trouble / a bother / a problem} (to V); (have)… {qualms about / a problem with} (V-ing); find fault with; go with a hitch
some <−q/−i>: be a hassle; bug; irk; vex; trouble; try one’s patience; (feel) a twinge; (can)… {get to one / weigh on one / get one down}
much <+q/−i>: (be)… {skin off one’s nose / a major disaster / the end of the world / the worst thing}; faze; (could/would)… {hurt to V / be a problem to V / kill you to V}; (have)… {much to lose / a reason not to V}
galore <+q/+i>: [(be) a {pig/sod/bitch/pain (in the ass/back/butt/neck)} to V]; (be)… {impossible / a major headache / a pain (in the …)}; to {kill one / break one’s heart / disgust one / be revolting / get one’s goat} that S

16. Wealth: expensive polarity items

Amounts one could earn or spend, ordered from least to most valuable.

any <−q/+i>: a red cent; a blue cent; a plugged nickel; a thin dime; a bent farthing; a kopeck; a brass farthing; a bean; a fig; a prawn; a lobster; a turnip; a penny to one’s name
some <−q/−i>: a pretty penny; a good chunk; a tidy sum; will cost you; run into money; these things add up; get by; scrape by; make do; eke out; do alright
much <+q/−i>: be exactly loaded; have a lot of extra cash
galore <+q/+i>: pay {a king’s ransom / through the nose}; cost an arm and a leg; highway robbery; be worth its weight in gold; be {loaded / made of money / filthy rich / rich as Midas / rolling in it}; have money burning a hole in one’s pocket; shower with (money)

17. Poverty: cheap polarity items

Amounts one would willingly forgo, ordered from most to least valuable.

any <−q/+i>: (for)… {all the money in the world / all the tea in China / love or money / the world}
some <−q/−i>: for a price
much <+q/−i>: lack for; want for
galore <+q/+i>: (for) {love / peanuts / a pittance / next to nothing / a song}; dirt cheap; easy pickings; gratis; free {of charge / for nothing / for the asking}; bankrupt; flat broke; flat out; tapped; zilch

18. Loquacity: talkative polarity items

Speech acts ordered from least to most expressive.

any <−q/+i>: {say/breathe/utter} a {word/peep/syllable}; make a sound; suggest (even) for a minute; say boo to a goose
some <−q/−i>: say {a thing or two / some words}; put in a (good) word for; waffle
much <+q/−i>: say much; make bones about; put too fine a point on; mince {matters / words}; (would)… {elaborate (on) / dignify with a response}
galore <+q/+i>: talk {a blue streak / one’s head off}; run at the mouth; (cry) one’s eyes out; lie through one’s teeth; chatter (away); gab; jabber; natter; yammer; yak; verbal diarrhea

19. Sapience: knowing polarity items

Cognitive states ordered from least to most aware or informed.

any <−q/+i>: (have) a {clue/inkling/idea/notion}; (have) the {foggiest/vaguest/least/slightest} {idea/notion}; (have) {(half) the sense god gave a goose / the sense to come in out of the rain}; (can)… {imagine / believe (one’s eyes) / fathom / begin to understand / make head(s) or tail(s) of / find one’s way to first base / see the wood for the trees}; know {how to thank / what to say / what to make of / the first thing about / X from Adam / one’s ear from one’s elbow / one’s ass from a hole in the {wall/ground} / a hawk from a handsaw}
some <−q/−i>: know {a thing or two (about) / one’s way around / which way the wind is blowing}; glimpse; catch a glimpse; (can) read the writing on the wall; been around the block
much <+q/−i>: be {all there / playing with a full deck / the sharpest (knife in the drawer) / the brightest (bulb) / exactly an Einstein}
galore <+q/+i>: (be a)… {know-it-all / swellhead / walking encyclopedia / utter genius}

20. Miscellaneous scales and items

Unwelcomeness: any: (be) amiss
Inactivity: any: shirk; leave a stone unturned
Easiness: galore: (be) a {cinch / snap / piece of cake}; (be) easy as {pie / 1, 2, 3 / a, b, c}
Fierceness: any: (be) a shrinking violet
Timidity: galore: (be) scared of one’s own shadow
Insignificance: any: (can) {overstate the importance / emphasize too much}; much: (be) chopped liver; to be {sneezed at / taken lightly / trifled with / underestimated}; galore: beneath contempt
Ignorance: some: (be) a few {bricks/cards} short of a {load/deck}; much: born yesterday; galore: fresh off the boat

Morphological NPIs: words with an obligatory negative affix

debunk, disable, disabuse, (a state of) disarray, disband, disbar, discalce, discomfit, discomfiture, disconsolate, disdain, diseased, disgruntled, dismayed

illicit, illusory, immaculate, impeccable, impecunious, impervious, impetuous, (impromptu), inadvertent, inane, incapacitated, inchoate, (terra) incognita, incognito, incommunicado, incomparable, incontrovertible, incorrigible, indefatigable, indignant, indiscriminate, individual, indefectible, indelible, indemnity, indolent, indomitable, inept, inert, inexorable, inevitable, infamy, infirmity, innocent, inordinate, insipid, insouciant, interminable, irradicable, irredentist, irreparable, irrepressible, irresistible

misgivings, misnomer

nonchalant, noncommittal, nondescript, (persona) non grata, nonpareil, nonplussed, non-starter, nonsensical, non-sequitur, nonesuch, notwithstanding

unbeknownst, uncalled for, uncontrollable, unconscionable, unequaled, unflappable, ungodly (*a godly hour), unheard of, unkempt, unmatched, unrequited, unrivaled, unruly, unsavory, (sight) unseen, unstinting, unsung, unswerving, untold, untoward

clueless, endless, heartless, listless, matchless, peerless, relentless, senseless, timeless, tireless, worthless

Notes

1 Trivium pursuits

1 The term is inspired by the Theory of Argumentation in Language (e.g. Anscombre & Ducrot 1983), though my assumptions about meaning and cognition do not always conform to those of these authors.
2 Here and throughout, I use she/her as pronouns for a generic speaker (or writer) and he/him/his for a generic hearer (or addressee).

3 Licensing and the logic of scalar models

1 Huddleston and Pullum (2002) avoid this prejudice by using NPI as short for nonaffirmative polarity item.
2 The “#” here indicates pragmatic anomaly rather than outright ungrammaticality. The difference, I take it, is that an anomalous sentence would be fine given different background assumptions – here, for instance, if Bill Gates were thought to be very poor or the Pope very sexually active – but an ungrammatical sentence is one which cannot be given a coherent construal in any imaginable context.
3 Or perhaps never, if Anscombre and Ducrot are correct in thinking that interrogatives have the same intrinsic argumentative orientation as negatives (1983: 115).
4 Apparently this constraint is not absolute, as suggested by the aphorism attributed to Alice Roosevelt Longworth: “If you can’t say anything good about someone, come and sit here by me.” But this itself seems to be a play on the adage “If you can’t say anything good about someone, don’t say anything at all,” in which someone is referential and takes scope over the negation. Anyway, the constraint does seem to be at work in the fact that instances of the old adage found on the Web are often amended either to “if you can’t say something good about someone …” or “if you can’t say anything good about anyone …”

4 Sensitivity as inherent scalar semantics

1 I use the term operator loosely here: a contextual operator does not map a context T onto another context T′ in the way that, for example, a negative operator maps one proposition P to another, ~P; rather, a contextual operator situates an expressed proposition (or, the expression of a proposition) within a particular sort of imagined context.
2 The attenuating nature of adverbial pretty is evident in some of its earliest recorded uses: the first mention in the OED is from a definition for the Latin diminutive audaculus: “a pretie hardie felow: used in derision” (1565). The playful nature of the understatement here reflects well pretty’s oblique sort of emphasis.

5 The elements of sensitivity

1 Relevance Theory (Sperber & Wilson 1986/1995) is founded on a similar observation of the tension between processing effort and communicative pay-off. In particular, the Communicative Principle of Relevance requires that utterances should create adequate contextual effects and impose minimal processing requirements.
2 I am indebted to Chris Barker, Adele Goldberg, Larry Horn, and Hotze Rullmann, among others, for warning me of this.

6 The scalar lexicon

1 Langacker (1991) presents the same idea, only in reverse – explicating the meanings of nominal quantifiers in terms of modal semantics: “All swans are white implies that choosing a swan at random must yield a white one; similarly, with most random selection should produce a white swan, and with some it may do so” (1991: 108n, italics in original).
2 Jacobson (2006) provides a compositional analysis of the can seem to construction assigning somewhat special meanings to the words can and seem, treating can as an existential quantifier over occasions and seem as a hedging device. This would seem to make the construction as a whole a sort of self-mitigating emphatic NPI: the existential can has to occur with a scale-reversing trigger in order to express a proposition strong enough for seem to be able to attenuate.
3 Wierzbicka (1980: 236) notes that parallels between conjunction and universal quantification have been noticed at least since the work of Roger Bacon in the thirteenth century. If the analogy has any merit, disjunction may be similarly related to existential quantification.

7 The family of English indefinite polarity items

1 Cf. Farkas (2002), who treats the “extreme non-specificity” of Rumanian vreun indefinites in terms of valuation constraints which determine the ways values can be assigned to a variable introduced by an indefinite NP.
2 As Horn notes (2005: fn 6), this is also the view of the Oxford English Dictionary (any, s.v. I.1.c.), which holds that “in affirmative sentences [any] asserts concerning a being or thing of the sort named, without limitation as to which, and thus constructively of every one of them, since every one may in turn be taken as a representative.”
3 Green (2002: 23) notes that in African American English, some can also work as an adverbial intensifier, as in Kareem Abdul is some tall or She can cook some good (=‘very well’). Presumably this is an extension of exclamative some.
4 This is fine on a spesumptive reading, and also, as Horn (p.c.) notes, with said substituted for thought, thus allowing the exclamation to be construed in the reported speech act rather than in the reporting act.

8 Polarity and the architecture of grammar

1 In fact, the status of questions, conditionals, and comparatives within the monotonicity hierarchy is far from settled (cf. von Fintel 1999; van Rooy 2003; Giannakidou 2006), but if these contexts count as monotonic at all, then it seems they should be at least anti-additive.
2 A fourth sort of DE function, the antimultiplicative, conforms just to the second of De Morgan’s Laws and is thus the logical mirror of anti-additive functions. Van der Wouden (1997: 99) argues that NPs of the form not every N are antimultiplicative operators, and that some Dutch polarity items (viz. even ‘equally’ and rozengeur en maneschijn, roughly ‘a bed of roses’) are sensitive to just such operators (van der Wouden 1997: 139).

References

Abbreviations
BLS = (Proceedings of the) Berkeley Linguistics Society
CLS = (Proceedings of the) Chicago Linguistics Society
NELS = (Proceedings of the) Northeast Linguistics Society
SALT = (Proceedings of) Semantics and Linguistic Theory

Anscombre, J.-C. & O. Ducrot. 1983. L’Argumentation dans la langue. Bruxelles: Mardaga.
Aranovich, R. 1996. “Negation, polarity, and indefiniteness: a comparative study of negative constructions in Spanish and English.” Ph.D. dissertation, UC San Diego.
Atlas, J. D. 1984. “Comparative adjectives and adverbials of degree: an introduction to radically radical pragmatics.” Linguistics and Philosophy 7: 347–77.
1996. “‘Only’ noun phrases, pseudo-negative generalized quantifiers, negative polarity items, and monotonicity.” Journal of Semantics 13: 265–330.
Atlas, J. D. & S. C. Levinson. 1981. “It-clefts, informativeness, and logical form: radical pragmatics (revised standard version).” In P. Cole (ed.), Radical Pragmatics. New York: Academic Press, 1–61.
Austin, J. L. [1956] 1970. “Ifs and cans.” In J. O. Urmson & G. J. Warnock (eds.), Philosophical Papers. Clarendon: Oxford University Press, 205–32.
Baker, C. L. 1970. “Double negatives.” Linguistic Inquiry 1: 169–86.
Barker, C. 1995. Possessive Description. Stanford, CA: CSLI Publications.
Barlow, M. & S. Kemmer. 2000. Usage-Based Models of Grammar. Stanford, CA: CSLI Publications.
Barwise, J. & R. Cooper. 1981. “Generalized quantifiers and natural language.” Linguistics and Philosophy 4.2: 150–219.
Beaver, D. I. & B. Z. Clark. 2003. “Always and only: why not all focus sensitive operators are alike.” Natural Language Semantics 11: 323–62.
2007. Sense and Sensitivity: How Focus Determines Meaning. Malden, MA / Oxford: Wiley-Blackwell.
Bender, E. & A. Kathol. 2001. “Constructional effects of just because … doesn’t mean.” In BLS 27, Berkeley: University of California, 13–25.
Benveniste, E. 1966. Problèmes de linguistique générale. Paris: Editions Gallimard.
Bergen, B. & N. Chang. 2005. “Embodied construction grammar.” In J.-O. Östman & M. Fried (eds.), Construction Grammars: Cognitive Grounding and Theoretical Extensions. Amsterdam/Philadelphia: John Benjamins, 147–90.

von Bergen, A. & K. von Bergen. 1993. Negative Polarität im Englischen. Tübingen: Gunter Narr Verlag.
Bernini, G. 1987. “Attempting the reconstruction of negation patterns in PIE.” In A. G. Ramat, O. Carruba & G. Bernini (eds.), Papers from the 7th International Conference on Historical Linguistics. Amsterdam: John Benjamins, 57–69.
Birner, B. & G. Ward (eds.) 2006. Drawing the Boundaries of Meaning: Neo-Gricean Studies in Pragmatics and Semantics in Honor of Laurence R. Horn. Amsterdam/Philadelphia: John Benjamins.
Bolinger, D. 1960. “Linguistic science and linguistic engineering.” Word 16: 374–91.
1972. Degree Words. The Hague: Mouton.
Borkin, A. 1971. “Polarity items in questions.” In CLS 7. Chicago: CLS, 53–62.
Bosque, I. 1980. Sobre la negación. Madrid: Cátedra.
Bouvier, Y.-F. 2002. “A featural account of polarity phenomena.” Ph.D. dissertation, University of Geneva.
Bowerman, M. 1988. “The ‘no negative evidence’ problem: how do children avoid constructing an overly general grammar?” In J. A. Hawkins (ed.), Explaining Language Universals. Oxford: Basil Blackwell, 73–101.
Braine, M. 1971. “On two types of models of the internalization of grammars.” In D. I. Slobin (ed.), The Ontogenesis of Grammar: A Theoretical Symposium. New York: Academic Press, 153–86.
Bréal, M. [1900] 1964. Semantics: Studies in the Science of Meaning. Trans. Mrs. H. Cust. New York: Dover Publications.
Brown, P. & S. Levinson. 1978. “Universals in language usage: politeness phenomena.” In E. Goody (ed.), Questions and Politeness. Cambridge: Cambridge University Press, 56–289. Reissued as a monograph: Politeness: Some Universals in Language Usage. Cambridge: Cambridge University Press, 1987.
Buyssens, E. 1959. “Negative contexts.” English Studies 40: 163–69.
Bybee, J. 1985. Morphology: A Study of the Relation Between Meaning and Form. Amsterdam/Philadelphia: John Benjamins.
2001. Phonology and Language Use. Cambridge/New York: Cambridge University Press.
Bybee, J., R. Perkins & W. Pagliuca. 1994. The Evolution of Grammar: Tense, Aspect, and Modality in the Languages of the World. Chicago: University of Chicago Press.
Caffi, C. & R. W. Janney. 1994. “Toward a pragmatics of emotive communication.” Journal of Pragmatics 22: 325–73.
Carlson, G. 1980. “Polarity any is existential.” Linguistic Inquiry 11: 799–804.
1981. “Distribution of free-choice any.” In CLS 17. Chicago: CLS, 8–23.
Carnap, R. 1942. Introduction to Semantics. Cambridge, MA: Harvard University Press.
Carston, R. 1995. “Quantity maxims and generalized implicature.” Lingua 96: 213–44.
1996. “Metalinguistic negation and echoic use.” Journal of Pragmatics 25: 309–30.
1998. “Informativeness, relevance, and scalar implicature.” In R. Carston & S. Uchida (eds.), Relevance Theory: Applications and Implications. Amsterdam/Philadelphia: John Benjamins, 179–236.
2002. “Linguistic meaning, communicated meaning and cognitive pragmatics.” Mind and Language. Special Issue on Pragmatics & Cognitive Science 17.1–2: 127–48.
Chierchia, G. 2004. “Scalar implicatures, polarity phenomena, and the syntax/pragmatics interface.” In A. Belletti (ed.), Structures and Beyond. New York: Oxford University Press, 39–103.
Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton.
1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
Clark, H. H. 1996. Using Language. New York: Cambridge University Press.
Cormack, A. & N. Smith. 2002. “Modals and negation in English.” In S. Barbiers, F. Beukema & W. van der Wurff (eds.), Modality and Its Interaction with the Verbal System. Amsterdam/Philadelphia: John Benjamins, 133–64.
Coulson, S. 2001. Semantic Leaps: Frame-Shifting and Conceptual Blending in Meaning Construction. New York: Cambridge University Press.
Crain, S. & P. Pietroski. 2002. “Why language acquisition is a ‘snap’.” The Linguistic Review 19: 163–183.
Croft, W. 1991. “The evolution of negation.” Journal of Linguistics 27: 1–27.
2001. Radical Construction Grammar: Syntactic Theory in Typological Perspective. Oxford/New York: Oxford University Press.
Croft, W. & D. A. Cruse. 2004. Cognitive Linguistics. Cambridge: Cambridge University Press.
Cutrer, M. 1994. “Time and tense in narrative and in everyday language.” Ph.D. dissertation, UC San Diego.
Davison, A. 1980. “Any as universal or existential.” In J. van der Auwera (ed.), The Semantics of Determiners. London: Croom Helm, 11–40.
Davidson, D. 1967. “Truth and meaning.” Synthese 17: 304–23.
Dayal, V. 1998. “Any as inherent modal.” Linguistics and Philosophy 21: 433–76.
2004. “The universal force of free choice any.” Linguistic Variation Yearbook 4: 5–40.
De Boer, A., J. de Jong & R. Landeweerd (eds.) 1993. Language and Cognition 3: Yearbook 1993 of the Research Group for Theoretical and Experimental Linguistics. Groningen: TENK.
Declerck, R. 1995. “The problem of not … until.” Linguistics 33: 51–98.
den Dikken, M. 2002. “Direct and indirect polarity item licensing.” Journal of Comparative Germanic Linguistics 5: 35–66.
Dowty, D. 1991. “Thematic proto-roles and argument selection.” Language 67.3: 547–619.
1994. “The role of negative polarity and concord marking in natural language reasoning.” In M. Harvey & L. Santelmann (eds.), SALT IV. Ithaca: Dept. of Modern Languages and Linguistics, Cornell University, 114–44.
Ducrot, O. 1972. Dire et ne pas dire. Paris: Hermann.
1973. La Preuve et le dire. Paris: Maison Mame.
1980. Les Échelles argumentatives. Paris: Minuit.
Duffley, P. J. 1994. “Need and dare: the black sheep of the modal family.” Lingua 94: 213–43.
Duffley, P. J. & P. Larrivee. 2010. “Anyone for non-scalarity?” English Language and Linguistics 14.1: 1–17.
Edmondson, J. A. 1981. “Affectivity and gradient scope.” In CLS 17. Chicago: CLS, 38–44.

References 273 1983. “Polarized auxiliaries.” In F. Heny and B. Richards (eds.), Linguistic Categories: Auxiliaries and Related Puzzles, Vol. I. Dordrecht: D. Reidel Publishing Company, 49–68. Emanatian, M. 1983. “Everything you always wanted to no.” Unpublished ms., UC Berkeley. Falkenberg, G. 2001. “Lexical sensitivity in negative polarity verbs.” In Hoeksema et al. (eds.), 79–97. Farkas, D. 2002. “Extreme non-specificity in Romanian.” In C. Beyssade, R. BokBennema, F. Drijkoningen & P. Monachesi (eds.), Romance Languages and Linguistic Theory 2000. Amsterdam/Philadelphia: John Benjamins, 127–51. Fauconnier, G. 1975a. “Polarity and the scale principle.” In R. E. Grossman, L. J. San & T.J. Vance (eds.), CLS 11. Chicago: CLS, 188–199. 1975b. “Pragmatic scales and logical structures.” Linguistic Inquiry 6: 353–75. [1976] 1980. Etude de certains aspects logiques et grammaticaux de la quantification et de l’anaphore en français et en anglais. Lille: Atelier Reproduction des Thèses. 1978. “Implication reversal in a natural language.” In F. Guenther & S. J. Schmidt (eds.), Formal Semantics and Pragmatics for Natural Languages. Dordrecht: D. Reidel Publishing Company, 289–301. 1980. “Pragmatic entailment and questions.” In J. R. Searle, F. Kiefer & M. Bierwisch (eds.), Speech Act Theory and Pragmatics. Dordrecht: D. Reidel Publishing Company, 57–71. 1985. Mental Spaces: Aspects of Meaning Construction in Natural Language. Cambridge, MA/London: MIT Press. Republished, Cambridge: Cambridge University Press, 1994. 1997. Mappings in Thought and Language. Cambridge: Cambridge University Press. Fauconnier, G. & M. Turner. 2002. The Way We Think. New York: Basic Books. Fillmore, C. J. 1982. “Frame semantics.” In The Linguistic Society of Korea (ed.), Linguistics in the Morning Calm. Seoul: Hanshin, 111–37. 1985. “Frames and the semantics of understanding.” Quaderni di Semantica 6: 222–54. Fillmore, C. J., P. Kay & M. C. O’Connor. 1988. 
“Regularity and idiomaticity in grammatical constructions: the case of let alone.” Language 64: 501–38. Fintel, K. von. 1999. “NPI-licensing, strawson-entailment, and context dependency.” Journal of Semantics 16: 97–148. 2006. “Modality and Language.” In D. Borchert (ed.), Encyclopedia of Philosophy, 2nd edn. Detroit: MacMillan Reference USA. Forget, D., P. Hirschbühler, F. Martineau & M. L. Rivero (eds.) 1997. Negation and Polarity: Syntax and Semantics. Amsterdam/Philadelphia: John Benjamins. Francescotti, R. M. 1995. “Even: the conventional implicature approach reconsidered.” Linguistics and Philosophy 18: 153–73. Gaatone, D. 1971. Etude descriptive du système de la négation en français contemporain. Geneva: Librairie Droz. Garrido, J. 1992. “Expectations in Spanish and German adverbs of change.” Folia Linguistica 26: 357–402.

274 References Gazdar, G. 1979. Pragmatics: Implicature, Presupposition, and Logical Form. New York: Academic Press. Geis, M. 1995. Speech Acts and Conversational Interaction. Cambridge: Cambridge University Press. Geis, M. & W. G. Lycan. 1993. “Nonconditional conditionals.” Philosophical Topics 21: 35–55. Giannakidou, A. 1998. Polarity Sensitivity as (Non)Veridical Dependency. Amsterdam/ Philadelphia: John Benjamins. 1999. “Affective dependencies.” Linguistics and Philosophy 22: 367–421. 2001. “The meaning of free choice.” Linguistics and Philosophy 24: 659–735. 2006. “Only, emotive factive verbs, and the dual nature of polarity dependency.” Language 82.3: 575–603. Gill, D. 1994. “Conjunctive operators in South-Asian languages.” In A. Davison & F. M. Smith (eds.), Papers from the Fifteenth South Asian Language Analysis Roundtable Conference, Iowa City: South Asian Studies Program, University of Iowa, 82–105. 2005. “Conjunctions and universal quantifiers.” In M. Haspelmath, M. Dryer, D. Gil & B. Comrie (eds.), World Atlas of Language Structures. Oxford: Oxford University Press, 230–3. Givón, T. 1995. Functionalism and Grammar. Amsterdam/Philadelphia: John Benjamins. Goldberg, A. 1995. Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press. 2006. Constructions at Work: The Nature of Generalization in Language. Oxford/ New York: Oxford University Press. Goffman, E. 1967. Interaction Ritual: Essays on Face-to-Face Behavior. Garden City, NY: Anchor Books. Gordon, D. & G. Lakoff. 1971. “Conversational postulates.” In CLS 7: 63–84. Grady, J. 2005. “Primary metaphors as inputs to conceptual integration.” Journal of Pragmatics 37: 1595–614. Green, L. J. 2002. African American English: A Linguistic Introduction. Cambridge: Cambridge University Press. Grice, H. P. 1975. “Logic and conversation.” In P. Cole & J. Morgan (eds.), Syntax and Semantics. Vol.III: Speech Acts. New York: Academic Press, 41–58. Guerzoni, E. 2004. 
“Even-NPIs in yes/no questions.” Natural Language Semantics 12: 319–34. de Haan, F. 1997. The Interaction of Modality and Negation: A Typological Study. New York/London: Garland. Hamilton, Sir W. 1858. Discussions on Philosophy and Literature. New York: Harper and Brothers. Hankamer, J. 1973. “Why there are two than’s in English.” In CLS 9. Chicago: CLS, 179–91. Haspelmath, M. 1997. Indefinite Pronouns. Oxford / New York: Oxford University Press. Heim, I. 1984. “A note on negative polarity and downward entailingness.” In C. Jones & P. Sells (eds.), NELS 14. University of Massachusetts, Amherst: GLSA, 98–107.

References 275 Heinämäki, O. 1974. Semantics of English Temporal Connectives. Reproduced by Indiana University Linguistics Club, 1978. Herburger, E. 2000. What Counts: Focus and Quantification. Lingusitic Inquiry, Monograph 36. Cambridge, MA: MIT Press. Hinds, M. 1974. “Doubleplusgood polarity items.” In CLS 10. Chicago: CLS, 259–68. Hirschberg, J. B. 1985. “A theory of scalar implicature.” Ph.D. dissertation, University of Pennsylvania. Hoeksema, J. 1983. “Negative polarity and the comparative.” Natural Language and Linguistic Theory 1: 403–34. 1986. “Monotonicity phenomena in natural language.” Linguistic Analysis 16: 25–40. 1994. “On the grammaticalization of negative polarity items.” In S. Gahl, A. Dalbey & C. Johnson (eds.), BLS 20. Berkeley: University of California, 273–82. 1996. “Review of Progovac,” 1994 Studies in Language: 196–207. 1997. “In days, weeks, months, years, ages: a class of temporal negative polarity items.” ms., Groningen University. 1998. “Corpus studies of negative polarity items.” In M. T. Turell & E. Vallduvi (eds.), IV-V Jornades de corpus lingüístics 1996–1997. Barcelona: Institut Universitari Lingüística Aplicada, 67–86. 2000. “Negative polarity items: triggering, scope and c-command.” In Horn & Kato (eds.), 115–46. 2007. “Parasitic licensing of negative polarity items.” Journal of Comparative German Linguistics 10: 163–82. Hoeksema, J. & H. Rullmann. 2001. “Scalarity and polarity: a study of scalar adverbs as polarity items.” In Hoeksema et al. (eds.), 129–71. Hoeksema, J., H. Rullmann, V. Sánchez Valencia & T. van der Wouden (eds.) 2001. Perspectives on Negation and Polarity Items. Amsterdam: John Benjamins. Horn, L. R. 1969. “A presuppositional analysis of only and even.” In R. I. Binnick, A. Davison, G. M. Green & J. L. Morgan (eds.), CLS 5. Chicago: CLS, 98–107. 1970. “Ain’t it hard anymore.” In CLS 6. Chicago: CLS, 318–27. 1971. “Negative transportation: unsafe at any speed?” In CLS 7. Chicago: CLS, 120–33. 1972. 
“On the semantic properties of logical operators in English”. Ph.D. dissertation, UC Los Angeles, distributed by IULC, 1976. 1978. “Some aspects of negation.” In J. Greenberg, C. Ferguson & E. Moravcsik (eds.), Universals of Human Language, Vol IV: Syntax. Stanford, CA: Stanford University Press, 127–210. 1984. “Toward a new taxonomy for pragmatic inference: Q-based and R-based implicature.” In Schiffrin (ed.), 11–42. 1985. “Metalinguistic negation and pragmatic ambiguity.” Language 61: 121–74. 1989. A Natural History of Negation. Chicago / London: University of Chicago Press. 1991. “Duplex negatio affirmat …: the economy of double negation.” In Lise M. Dobrin, Lynn Nichols & Rosa M. Rodriguez (eds.), CLS: Papers from the Parasession on Negation, 27, 2. Chicago: CLS, 80–106.

276 References 1996. “Exclusive company: only and the dynamics of vertical inference.” Journal of Semantics 13: 1–40. 1997. “All John’s children are as bald as the King of France: existential import and the geometry of opposition.” In K. Singer, R. Eggert & G. Anderson (eds.), CLS 33. Chicago: CLS, 155–80. 2000a. “Pick a theory (not just any theory): indiscriminatives and the free choice indefinite.” In Horn & Kato (eds.), 147–92. 2000b. “Any and (-)ever: free choice and free relatives.” In A. Wyner (ed.), Proceedings of the 15th Annual Conference of the Israeli Association for Theoretical Linguistics (IATL), 71–111. 2000c. “From if to iff: conditional perfection as pragmatic strengthening.” Journal of Pragmatics 32: 289–326. 2001. “Flaubert triggers, squatitive negation, and other quirks of grammar.” In Hoeksema et al. (eds.), 173–200. 2002. “Assertoric inertia and scalar inference.” In M. Andronis, E. Deberport, A. Pycha & K. Yeshimura (eds.), Proceedings of the Panels of the CLS 38, 2. Chicago: CLS, 55–82. 2005. “Airport ’86 revisited: toward a unified indefinite any.” In G. Carlson & F. J. Pelletier (eds.), The Partee Effect. Stanford, CA: CSLI Publications, 179–205. forthcoming. “ONLY connect: how to unpack an exclusive proposition.” In M. Hackl & R. Thornton (eds.), A Festschrift for Jay Atlas. Oxford: Oxford University Press. Horn, L. R. & Y. Kato. (eds.) 2000. Negation and Polarity: Syntactic and Semantic Perspectives. Oxford: Oxford University Press. Horn, L. R. & Young-Suk Lee. 1995. “Progovac on polarity.” Journal of Linguistics 31: 401–24. Horn, L. R. & G. Ward (eds.) 2004. The Handbook of Pragmatics. Basil: Blackwell Publishers. Hübler, A. 1983. Understatements and Hedges in English. Amsterdam/Philadelphia: John Benjamins. Huddleston, R. & G. K. Pullum. 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press. Israel, M. 1995a. “Negative polarity and phantom reference.” In BLS 21. 
Berkeley: University of California, 162–73. 1995b. “Review of Negative Contexts by Ton van der Wouden.” Glot International 1.5: 10–12. 1996. “Polarity sensitivity as lexical semantics.” Linguistics and Philosophy 19: 619–666. 1997. “The scalar model of polarity sensitivity: the case of the aspectual operators.” In Forget et al. (eds.), 209–29. 1998a. “The rhetoric of grammar: scalar reasoning and polarity sensitivity.” Ph.D. dissertation, UC San Diego. 1998b. “Ever: polysemy and polarity sensitivity.” Linguistic Notes from La Jolla 19: 29–45. 1999. “Some and the pragmatics of indefinite construal.” In BLS 25. Berkeley: University of California, 169–82.

References 277 2001. “Minimizers, maximizers, and the rhetoric of scalar reasoning.” Journal of Semantics 18.4: 297–331. 2002. “Literally speaking.” Journal of Pragmatics 34: 423–32. 2004. “The pragmatics of polarity.” In Horn & Ward (eds.), 701–23. 2006. “Saying less and meaning less.” In Birner & Ward (eds.), 143–62. Jackson, E. 1994. “Negative polarity, definites under quantification and general statements.” Ph.D. dissertation, Stanford University. Jacobson, P. 2006. “I can’t seem to figure this out.” In Birner & Ward (eds.), 157–76. Jayez, J. & L. M. Tovena. 2005. “Free choiceness and non-individuation.” Linguistics and Philosophy 28.1: 1–71. Jennings, R. E. 1994. The Geneology of Disjunction. New York: Oxford University Press. Jespersen, O. 1917. Negation in English and Other Languages. Copenhagen: A. F. Host. 1924. The Philosophy of Grammar. London: George Allen & Unwin, Ltd. Johannessen, J. B. 2003. “Negative polarity verbs in Norwegian.” Working Papers in Scandinavian Syntax 71: 33–73. Johnson, M. 1987. The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Reason. Chicago: Chicago University Press. Kadmon, N. & F. Landman. 1993. “Any.” Linguistics and Philosophy 16: 353–422. Karttunen, L. 1974. “Until.” CLS 10: 284–97. Karttunen, L. & S. Peters. 1979. “Conventional implicature.” In C.-K. Oh & D. A. Dineen (eds.), Syntax and Semantics, Vol II: Presupposition. New York: Academic Press, 1–56. Kas, M. 1993. Essays on Boolean Functions and Negative Polarity. Groningen Disserta tions in Linguistics 11. Groningen: Center for Language and Cognition Groningen. Katz, J. J. 1977. Propositional Structure and Illocutionary Force. New York: Crowell. Kay, P. 1983. “Linguistic competence and folk theories of language: two English hedges.” In BLS 9. Berkeley: University of California, 128–37. 1989. “Contextual operators: respective, respectively, and vice versa.” In BLS 15. Berkeley: University of California, 181–93. 1990. 
“Even.” Linguistics and Philosophy 13: 59–111. 1997. Words and the Grammar of Context. Stanford, CA: CSLI Publication. Kay, P. & C. J. Fillmore. 1999. “Grammatical constructions and linguistic generalizations: the what’s X doing Y construction.” Language 75: 1–34. Kirsner, R. S. 1993. “From meaning to message in two theories: cognitive and Saussurean views of the modern Dutch demonstratives.” In R. Geiger & B. RudzkaOstyn (eds.), Conceptualizations and Mental Processing in Language. Cognitive Linguistics Research 3. Berlin and New York: Mouton de Gruyter, 81–114. Klein, H. 1998. Adverbs of Degree in Dutch and Related Languages. Linguistik Aktuell, vol. 21. Amsterdam/Philadelphia: John Benjamins. Klima, E. S. 1964. “Negation in English.” In J. Fodor and J. Katz (eds.), The Structure of Language: Readings in the Philosophy of Language. Englewood Cliffs, NJ: Prentice-Hall, 246–323. Koenig, J.-P. 1999. Lexical Relations. Stanford, CA: CSLI Publications.

278 References König, E. 1977. “Temporal and non-temporal uses of noch and schon in German.” Linguistics and Philosophy 1: 173–98. 1991. The Meaning of Focus Particles: A Comparative Perspective. London: Routledge. König, E. & E. C. Traugott. 1982. “Divergence and apparent convergence in the development of yet and still.” In BLS 8. Berkeley: University of California, 170–9. Kratzer, A. 1991. “Modality.” In A. von Stechow & D. Wunderlich (eds.), Semantics: An International Handbook of Contemporary Research. Berlin: de Gruyter, 639–50. Krifka, M. 1992. “Some remarks on polarity items.” In D. Zaefferer (ed.), Semantic Universals and Universal Semantics. Dordrecht: Foris, 150–89. 1994. “The semantics and pragmatics of weak and strong polarity items in assertions.” In M. Harvey & L. Santelmann (eds.), Proceedings of SALT IV. Ithaca: Dept. of Modern Language and Linguistics, Cornell University, 195–219. 1995. “The semantics and pragmatics of polarity items.” Linguistic Analysis 25: 209–57. Krifka, M., G. Carlson, F. J. Pelletier, A. ter Meulen, G. Chierchia & G. Links. 1995. “Genericity: an introduction.” In G. Carlson & F. J. Pelletier (eds.), The Generic Book. Chicago: University of Chicago Press, 1–124. Kuczaj, S. 1976. “-ing and -ed: a study of the acquisition of certain verb inflections.” ms., University of Minnesota. Kuroda, S.-Y. 1992. Japanese Syntax and Semantics: Collected Papers. Dordrecht: Kluwer Academic Press. Labov, W. 1975. What is a Linguistic Fact? Lisse: Peter de Ridder. 1984. “Intensity.” In Schiffrin (ed.), 43–70. Ladusaw, W. 1979. “Polarity sensitivity as inherent scope relations.” Ph.D. dissertation, University of Texas, Austin. Republished in the series Outstanding Dissertations in Linguistics. New York & London: Garland, 1980. 1983. “Logical form and conditions on grammaticality.” Linguistics and Philosophy 6: 373–92. 1992. “Expressing negation.” In C. Barker & D. Dowty (eds.), Proceedings of the SALT II. 
Columbus: Linguistics Department, Ohio State University, 237–59. 1996. “Negation and polarity items.” In S. Lappin (ed.), The Handbook of Contem porary Semantic Theory. Oxford and Malden: Blackwell Publishers, 321–41. Lahiri, U. 1998. “Focus and negative polarity in Hindi.” Natural Language Semantics 6: 57–123. Laka, I. 1990. “Negation in syntax: on the nature of functional categories and projections.” Ph.D. dissertation, MIT. Lakoff, G. 1972. “Hedges: a study in meaning criteria and the logic of fuzzy concepts.” In P. Perenteau, J. Levi & G. Phares (eds.), CLS 8. Chicago: CLS, 183–228. 1987. Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. Chicago: University of Chicago Press. Lakoff, G. & M. Johnson. 1999. Philosophy in the Flesh. New York: Basic Books. Lakoff, R. 1969. “A syntactic argument for negative transportation.” In R. I. Binnick, A. Davison, G. M. Green & J. L. Morgan (eds.), CLS 5. Chicago: CLS, 140–147. 1970. “Some reasons why there can’t be any some-any rule.” Language 45: 608–15. 1973. “The logic of politeness; or, minding your p’s and q’s.” In C. Colum et al. (eds.), CLS 9, 149–62.

References 279 Langacker, R. W. 1987. Foundations of Cognitive Grammar, Vol. I: Theoretical Prerequisities. Stanford, CA: Stanford University Press. 1988. “A usage-based model.” In B. Rudzka-Ostyn (ed.), Topics in Cognitive Linguistics. Amsterdam: John Benjamins, 127–61. 1990. “Subjectification.” Cognitive Linguistics 1: 5–38. 1991. Foundations of Cognitive Grammar, Vol. II: Descriptive Application. Stanford, CA: Stanford University Press. 1997. “Generics and habituals.” In Angeliki Athanasiadou & René Dirven (eds.), On Conditionals Again, Current Issues in Linguistic Theory 143. Amsterdam/ Philadelphia: John Benjamins, 191–222. 2000. “A dynamic usage-based model.” In Barlow & Kemmer (eds.), 1–64. 2002. “One any.” In Korean Linguistics Today and Tomorrow: Proceedings of the 2002 International Conference on Korean Linguisitcs. Seoul: Association for Korean Linguistics, 282–300. Langendoen, D. T. 1970. “The ‘can’t seem to’ construction.” Linguistic Inquiry 1: 25–35. Larrivée, P. 1996. “A semantic definition of negative polarity items with evidence from French and English.” Unpublished ms., Université of Laval. Lee, C. 1996. “Negative polarity items in English and Korean.” Language Sciences 18: 505–23. Lee, Y. & L. R. Horn. 1994. “Any as Indefinite plus even.” Unpublished ms., Yale University. Leech, G. 1980. Explorations in Semantics and Pragmatics. Amsterdam: John Benjamins. 1983. Principles of Pragmatics. London: Longman. Lees, R. B. 1960. “Review of Bolinger, 1957, Interrogative Structures in American English.” Word 16: 119–25. LeGrand, J. E. 1975. “Or and any: the semantics and syntax of two logical operators,” Ph.D. dissertation, University of Chicago. Leuschner, Torsten, 1996. “Ever and universal quantifiers of time: observations from some Germanic languages.” Language Sciences 18.1–2: 469–84. Levinson, S. C. 1983. Pragmatics. Cambridge: Cambridge University Press. 2000. Presumptive Meanings. Cambridge, MA: MIT Press. Lindholm, J. M. 1969. 
“Negative raising and sentence pronominalization.” In R. I. Binnick, A. Davison, G. M. Green & J. L. Morgan (eds.), CLS 5, 148–58. Linebarger, M. 1980. “The grammar of negative polarity.” Ph.D. dissertation, MIT. 1987. “Negative polarity and grammatical representation.” Linguistics and Philosophy 10: 325–87. 1991. “Negative polarity as linguistic evidence.” In L. M. Dobrin, L. Nichols & R. M. Rodriguez (eds.), Papers from the Parasession on Negation. 27, 2. Chicago: CLS, 165–88. Löbner, S. 1987. “Quantification as a major module of natural language semantics.” In J. Groenendijk, d. de Jongh & M. Stokhof (eds.), Studies in Discourse Repre sentation Theory and the Theory of Generalized Quantifiers. Foris: Dordrecht, 53–85. 1989. “German schon-erst-noch: an integrated analysis.” Linguistics and Philosophy 12: 167–212.

280 References Lyons, J. 1982. “Deixis and subjectivity: loquor, ergo sum?” In R. J. Jarvella & W. Klein (eds.), Speech, Place and Action: Studies in Deixis and Related Topics. New York: Wiley, 101–24. MacWhinney, B. 1995. The CHILDES Project: Tools for Analyzing Talk. Hillsdale, NJ: Erlbaum. Mahajan, A. 1990. “LF-conditions on negative polarity licensing.” Lingua 80: 330–48. Mandler, J. M. 1992. “How to build a baby: II conceptual primitives.” Psychological Review 99: 587–604. Margerie, H. 2007. “From downgrading to (over) intensifying: a pragmatic study in English and French.” In I. Kescskes & L. R. Horn (eds.), Explorations in Pragmatics: Linguistic, Cognitive and Intercultural Aspects. Mouton de Gruyter: Berlin, 287–312. 2008. “A historical and collexeme analysis of the development of the compromiser fairly.” Journal of Historical Pragmatics 9,2: 288–314. Matsumoto, Y. 1995. “The conversational condition on horn scales.” Linguistics and Philosophy 18: 21–60. May, R. 1985. Logical Form: Its Structure and Derivation. Cambridge, MA: MIT Press. Mazodier, C. 1998. “ ‘I must have read it in some article’: instabilité qualitative de some + discontinu singulier.” Cahiers de Recherche en Grammaire Anglais, 111–26. McGloin, N. H. 1972. “Some Aspects of Negation in Japanese.” Ph.D. dissertation, University of Michigan. Meillet, A. 1948. Linguistique historique et linguistique générale. Paris: La Société Linguistique de Paris. Michaelis, L. A. 1992. “Aspect and the semantics-pragmatics interface: the case of already.” Lingua 87: 321–39. 1993. “‘Continuity’ within three scalar models: the polysemy of adverbial still.” Journal of Semantics 10: 193–237. Michaelis, L. A. & K. Lambecht. 1996. “The exclamative sentence type in English.” In A. E. Goldberg (ed.), Conceptual Structure, Discourse, and Language. Stanford, CA: CSLI Publications, 375–89. Milsark, G. 1977. 
“Toward an explanation of certain peculiarities of the existential construction in English.” Linguistic Analysis 3: 1–29. Mittwoch, A. 1977. “Negative sentences with until.” In CLS 13. Chicago: CLS, 410–17. 1988. “Aspects of english aspect: on the interaction of perfect, progressive and durational phrases.” Linguistics and Philosophy 11: 203–54. Moeschler, J. 1997. “La négation comme expression procedurale.” In Forget (eds.), 231–49. Möhren, F. 1980. Le renforcement affectif de la négation par l’expression d’une valeur minimale en ancien français. ‘Beihefte zur Zeitschrift für Romanische Philologie,’ 175. Tubingen: Max Niemeyer Verlag. Morris, C. W. [1938]1955. “Foundations of the theory of signs.” In O. Neurath, R. Carnap & C. W. Morris (eds.), International Encyclopedia of Unified Science, Vol. I.

References 281 [1946]1998. “The scope and import of semiotic.” In A. Kasher (ed.), Pragmatics: Critical Concepts, Vol I. London and New York: Routledge, 7–14. Moxey, L. M. & A. J. Sanford. 1993. Communicating Quantities: A Psychological Perspective. Hove & Hillsdale: Erlbaum. 1994. “Psychological studies of quantifiers.” Journal of Semantics 11.3: 153–70. Newmeyer, F. 2003. “Grammar is grammar and usage is usage.” Language 79.4: 682–707. Palacios Martinez, I. M. 1999. “Negative polarity idioms in Modern English.” ICAME Journal, No. 23, 65–115. Palmer, F. R. 1990. Modality and the English Modals. London and New York: Longman. 1995. “Negation and the modals of possibility and necessity.” In J. Bybee & S. Fleischman (eds.), Modality in Grammar and Discourse. Amsterdam: John Benjamins, 453–71. Paradis, C. 1997. Degree Modifiers of Adjectives in Spoken British English. Lund Studies in English 92. Lund: Lund University Press. 2001. “Adjectives and boundedness.” Cognitive Linguistics 12.1: 47–65. Partee, B. H., A. G. ter Meulen & R. Wall. 1990. Mathematical Methods in Linguistics. Dordrecht: Kluwer Academic Press. Pinker, S. 1989. Learnability and Cognition: The Acquisition of Argument Structure. Cambridge, MA: MIT Press/Bradford Books. Pollard, C. & I. A. Sag. 1994. Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press/Stanford, CA: CSLI Publications. Portner, P. 2009. Modality. Oxford/New York: Oxford University Press. Pott, A. F. 1859. Etymologishce Forschungen auf dem Gebiete der Ino-Germanischen Sprachen, Vol. I. Lemgo/Detmold: Meyer. Powell, M. J. 1992. “Folk theories of meaning and principles of conventionality: encoding literal attitude via stance adverb.” In A. Lehrer & E. F. Kittay (eds.), Frames, Fields and Contrasts: New Essays in Semantic and Lexical Organization. Hillsdale, NJ: Erlbaum, 333–54. Progovac, L. 1992. “Negative polarity: a semantico-syntactic approach.” Lingua 86: 271–99. 1994. Negative and Positive Polarity. 
Cambridge: Cambridge University Press. Quine, W. V. O. 1960. Word and Object. Cambridge: MIT Press. Quirk, R., S. Greenbaum, G. Leech & J. Svartik. 1985. A Comprehensive Grammar of the English Language. London: Longman. Raghibdoust, S. 1994. “The semantic-pragmatic nature of the persian polarity items.” Unpublished ms., University of Ottawa. Roberts, C. 2004. “Context in dynamic interpretation.” In Hom & Ward (eds.), 197–220. Rooy, R. van. 2003. “Negative polarity items in questions: strength as relevance.” Journal of Semantics 20: 239–73. Rullmann, H. 1996. “Two types of negative polarity items.” In NELS 26, University of Massachusetts, Amherst: GLSA, 335–50. 2002. “A note on the history of either.” In M. Andronis, E. Debenport, A. Pycha & K. Yoshimura (eds.), Proceeedings from the Panels of the CLS 38, 2. Chicago: CLS. 2003. “Additive particles and polarity.” Journal of Semantics 20: 329–401. Sadock, J. 1981. “Almost.” In Cole (ed.), Radical Pragmatics. New York: Academic Press, 257–71.

282 References Sánchez Valencia, Víctor. 1991. “Studies on natural logic and categorial grammar.” Ph.D. dissertation, University of Amsterdam. Sánchez Valencia, V., T. van der Wouden & F. Zwarts. 1993. “Polarity and the flow of time.” In Boer, de Jong & Landeweerd (eds.), 209–18. Sapir, E. 1944. “Grading: a study in semantics.” Philosophy of Science 11: 93–116. Reprinted in D. G. Mandelbaum (ed.), Edward Sapir: Selected Writings in Language, Culture, and Personality. Berkeley, Los Angeles/London: University of California Press, 1985, 122–49. Schiffrin, D. (ed.) 1984. Meaning, Form and Use in Context: Linguistic Applications (GURT ’84). Washington: Georgetown University Press. Schmerling, S. 1971. “A note on negative polarity.” Papers in Linguistics 4: 200–6. Schwenter, S. 1999a. The Pragmatics of Conditional Marking: Implicature, Scalarity, and Exclusivity. New York/London: Garland Press. 1999b. “Two types of scalar particles: evidence from Spanish.” In J. GutiérrezRexach & F. Martínez-Gil (eds.), Advances in Hispanic Linguistics. Somerville, MA: Cascadilla Press, 546–61. Smith, S. 1975. Meaning and Negation. The Hague: Mouton. Spencer, N. J. 1973. “Differences between linguists and nonlinguists in intuitions of grammaticality-acceptability.” Journal of Psycholinguistic Research 2: 83–98. Sperber, D. & D. Wilson. [1986]1995. Relevance: Communication and Cognition, 2nd edn. Oxford: Blackwell. Spitzbardt, H. 1963. “Overstatement and understatement in British and American English.” Philologica Pragensia 6:45 277–86. Strawson, P. F. [1970]1985. “Meaning and truth.” 1969 Inaugural Lecture, Oxford University. Reprinted in A. P. Martinich (ed.), The Philosophy of Language. New York/Oxford: Oxford University Press, 101–12. Suppes, P. 1974. “The semantics of children’s language.” American Psychologist 29: 103–14. de Swart, H. 1996. “Meaning and use of not…until.” Journal of Semantics 13.3: 221–63. Sweetser, E. 1988. “Grammaticalization and semantic bleaching.” In S. 
Axmaker, A. Jaisser & H. Singmaster (eds.), BLS 14, University of California, 389–405. 1990. From Etymology To Pragmatics: Metaphorical and Cultural Aspects of Semantic Structure. Cambridge: Cambridge University Press. Sweetser, E. & B. Dancygier. 2005. Mental Spaces in Grammar: Conditional Constructions. Cambridge: Cambridge University Press. Szabolsci, A. 2002. “Hungarian disjunctions and positive polarity.” In I. Kenesei & P. Siptar (eds.), Approaches to Hungarian 8. Budapest: Akadémiai Kiadó, 217–41. 2004. “Positive polarity – negative polarity.” Natural Language and Linguistic Theory 22: 409–52. Szabolsci, A. and B. Haddican. 2004. “Conjunction meets negation: a study in crosslinguistic variation.” Journal of Semantics 21: 219–49. Talmy, L. 1985. “Force dynamics in language and thought.” In W. Eilfort, P. Kroeber & K. Peterson (eds.), CLS: Paracession on causatives and Agentivity, 21. Chicago: CLS, 293–337.

Tomasello, M. 1999. The Cultural Origins of Human Cognition. Cambridge, MA: Harvard University Press.
2003. Constructing a Language: A Usage-Based Theory of Language Acquisition. Cambridge, MA: Harvard University Press.
Tovena, L. 1998. The Fine Structure of Polarity Items. New York: Garland.
Tovena, L. & J. Jayez. 1999. “Any: from scalarity to arbitrariness.” In F. Corblin, C. Dobrovie-Sorin & J. M. Marandin (eds.), Empirical Issues in Formal Syntax and Semantics, Vol. II. The Hague: Theseus, 39–57.
Tovena, L., V. Deprez & J. Jayez. 2004. “Polarity sensitive items.” In F. Corblin & H. de Swart (eds.), Handbook of French Semantics. CSLI Lecture Notes 17. Stanford, CA: CSLI Publications, 403–27.
Traugott, E. C. 1988. “Pragmatic strengthening and grammaticalization.” In S. Axmaker, A. Jaisser & H. Singmaster (eds.), BLS 14. Berkeley: University of California, 406–16.
1989. “On the rise of epistemic meanings in English: an example of subjectification in semantic change.” Language 65.1: 31–55.
Traugott, E. C. & R. B. Dasher. 2002. Regularity in Semantic Change. Cambridge: Cambridge University Press.
Traugott, E. C. & E. König. 1991. “The semantics-pragmatics of grammaticalization revisited.” In E. C. Traugott & B. Heine (eds.), Approaches to Grammaticalization, Vol. 1. Amsterdam/Philadelphia: John Benjamins, 189–218.
Traugott, E. C. & J. Waterhouse. 1969. “Already and yet: a suppletive set of aspect markers?” Journal of Linguistics 5: 287–304.
Uribe-Etxebarria, M. 1994. “Interface licensing conditions on negative polarity items: a theory of polarity and tense interactions.” Ph.D. dissertation, University of Connecticut.
Vallduví, E. 1994. “Polarity items, n-words, and minimizers in Catalan and Spanish.” Probus 6: 263–94.
van der Auwera, J. 1993. “Already and still: beyond duality.” Linguistics and Philosophy 16: 613–53.
2001. “On the typology of negative modals.” In Hoeksema et al. (eds.), 23–48.
van der Auwera, J. & V. Plungian. 1998. “Modality’s semantic map.” Linguistic Typology 2: 79–124.
Vasishth, S. 1998a. “Monotonicity constraints on negative polarity in Hindi.” Ohio State University Working Papers in Linguistics 51: 147–66.
Vendler, Z. 1967. Linguistics in Philosophy. Ithaca, NY: Cornell University Press.
Verhagen, A. 2005. Constructions of Intersubjectivity: Discourse, Syntax, and Cognition. Oxford: Oxford University Press.
Wagenaar, K. 1930. Etude sur la négation en Ancien Espagnol jusqu’au XVe siècle. Groningen/The Hague: J. B. Wolters Uitgevers-Maatschappij.
van der Wal, S. 1996. Negative Polarity Items and Negation: Tandem Acquisition. Groningen Dissertations in Linguistics 17. Groningen: Center for Language and Cognition, Groningen.
Warfel, S. L. 1972. “Some, reference, and description.” In Mid-America Linguistics Conference Papers. Oklahoma City: Oklahoma State University, 41–9.

Wierzbicka, A. 1980. Lingua Mentalis: The Semantics of Natural Language. Sydney: Academic Press.
van der Wouden, T. 1996a. “Three modal verbs.” Paper presented at the Colloquium, The Germanic Verb, Dublin.
1996b. “Negative polarity auxiliaries.” Paper presented at the PIONIER conference Perspectives on Negation, Groningen.
1997. Negative Contexts: Collocation, Polarity and Multiple Negation. London/New York: Routledge.
van der Wouden, T. & F. Zwarts. 1993. “A semantic analysis of negative concord.” In L. L. Lahiri & A. Z. Wyner (eds.), Proceedings of Semantics Theory III. Ithaca, 202–19.
von Wright, G. H. 1951. An Essay in Modal Logic. Amsterdam: North Holland.
Yoshimura, A. 1994. “A cognitive constraint on negative polarity phenomena.” In S. Gohl, A. Dolbey & C. Johnson (eds.), BLS 20. Berkeley: University of California, 599–610.
Zanuttini, R. 1991. “Syntactic properties of sentential negation: a comparative study of Romance languages.” Ph.D. dissertation, University of Pennsylvania.
Zanuttini, R. & P. Portner. 2003. “Exclamative clauses: at the syntax-semantics interface.” Language 79.1: 39–81.
Zepter, A. 2003. “How to be universal when you are existential: negative polarity items in the comparative: entailment along a scale.” Journal of Semantics 20: 193–237.
Zwarts, F. 1996. “A hierarchy of negative expressions.” In H. Wansing (ed.), Negation: A Notion in Focus. Berlin/New York: Walter de Gruyter, 169–94.
1998. “Three types of polarity.” In F. Hamm & E. Hinrichs (eds.), Plural Quantification. Dordrecht: Foris, 177–238.

General index

affective polarity items (APIs) 223 affectivity 30, 48–9, 60, 61, 64, 78, 95, 203, 213, 233–4, 254 affirmation 2, 20, 23, 26, 69, 92, 100 African American English 268 anaphora 22, 207, 209 anomaly, see grammaticality anti-additivity 149, 217, 220–1, 238 anti-concessives 119 anti-licensing 223, see also blocking antimorphicity 216–17, 221 antimultiplicativity 269 any 91, 168–180, 268 adverbial 168 anti-indiscriminative 175–6, 194 free-choice 41, 163, 167, 169–76, 185–8, 193, 194, 197 polarity sensitive 71, 140, 163, 169–76 and subtrigging 187 as universal quantifier 164, 175, 185 unstressed 177–7, 180, 189 Arabic 34 argumentation in language, theory of 52, 267 argumentativity 8–9, 67, 104, 119, 134, 136, 143 assertoric inertia 67 attenuation 7–9, 84–91, 94, 110–16, 119–20, 151, 188, 196, 232, 250, 257 vs. understatement 88, 192–3 autonomy of syntax 211 basic-level categories 28, 108 Basque 24 blocking 3, 26, 61, 74–6, 132, 145, 148–9, 171, 181, 210, 221 Boumaa Fijian 33 Cognitive Grammar 15, 107

cognitive semantics 16–19, 205 communication 7, 12, 14, 62, 78, 110, 113–14, 256–7 animal communication 22 comparative constructions 3, 29, 33, 34, 36, 38, 54, 87, 120, 148, 156, 160, 168, 175, 217, 220 conceptual structure 17, 54–5, 57, 61–2, 78, 80, 198, 212, 232, 235 concessives 114 conditional constructions 3, 29, 33, 34, 38, 50, 63, 134, 138, 148, 153, 156, 175, 194–5, 217, 220, 228, 233 concessive 124 speech act (“Austin”) 250–4 construal 7, 18, 128, 151, 187, 189 emphatic 168 generic 173, 191, 198–9 (in)definite 178, 200–1 scalar 61–3, 70–1, 81–5, 106, 143, 161, 171–3, 180, 183, 234–43, 256 subjective 164, 181, 193, 200–1, 250 of time 151–2, 158 of what is said 254–5 Construction Grammar 15, 51 constructions 9, 15, 23, 62, 164, 205, 224, 234, 257 frequency of use 38, 114–15 contextual operators 79–80, 267 contradictories 20, 93 contraries 1–2, 92, 122, 124, 136, 152 definiteness 108, 178, see also indefinites degree modifiers 28, 40, 74, 81, 83, 86, 88, 106, 122–3, 165 deixis 22 De Morgan’s Laws 216–20, 269 denial 2, 21, 69, 70


disjunction 34, 109, 142, 146–51 diversity problem 32–7, 108, 126, 161–2, 221 Dutch 25, 38, 86, 87, 88, 93, 96, 106, 109, 130, 140, 141, 142, 216, 219, 269

grammaticalization 24, 33, 38, 45–6, 105, 115–16, 128, 132, 136, 142, 143, 151, 161, 173, 178, 180, 184–5 Greek 24, 34, 130 Groningen 216

echoic use 27, 115, 176, 194 emphasis 7–9, 46, 84–91, 94, 110–116, 117–19, 151, 176, 188, 232, 238–40, 250, 257 and stress 118 entailment 4, 9, 147 downward 64, 177, 213–21, 238, 246 limited downward 228, 247, 251 pragmatic 58–9, 69, 103 Strawson 66 Estonian 24 even, see focus particles: additive existential construction 171

Hebrew 24, 173 Hindi 24, 86, 173, 219, 221 Horn scales, see scales: quantitative Hungarian 24, 25, 109, 143, 149–51

face 110–12 face-threatening acts 252 Finnish 24 focus particles additive (even) 64, 72, 80–1, 117, 123–5, 148, 170, 173, 177, 236–9 almost and barely 66–7 at least 119 connective 142 disjunctive (either) 148 exclusive (only) 29, 65, 70, 76, 118, 183 force dynamics 101, 128 French 24, 33, 64, 86, 88, 96, 109, 130, 140, 141 generality 25 generative grammar 4, 10, 224 genericity 198 generic contexts 163, 174, 183, 221 generic operators 29, 169, 180, 185–6, 188, see also construal: generic German 25, 34, 64, 86, 88, 109, 130, 140, 141, 142, 161, 221 grammatical/linguistic representation 4, 6–7, 9, 15, 61, 78, 202–6, 209, 211–12, 224–6, 229, 233–4, 254 grammaticality 11, 61 and acceptability 4 and anomaly 4, 267 and pragmatics 237

illocutionary acts 68, 253 illocutionary effects 70 illocutionary force-indicating devices 137 image schemas 128, 235 Immediate Scope Constraint, see scope Implication Constraint 179, 182–8 implicature 35, 121, 158, 230, 237 conventional 210–11 conversational 61, 116, 153, 204 negative 135, 183, 209–12, 244, 248 scalar 52–4, 65, 113, 147, 192, 196 indefinites 28, 131, 163–201, 221 adverbial 165, 168, 181, 189, 196–8 free-choice 33, 196–200, 221, see any Heimian 188 and intensity 179 phantom 178–88 specific 33 typology of 32–4 individual-level predicates 171, 200 individuation 108 informative value 7, 81, 84, 87, 90, 95, 105, 112–16, 123–5, 132, 148, 152, 162, 174–6, 231 informativity 64, 81, 95, 97, 112–16, 124, 230 redundancy of 109–10 Informativity Hypothesis 104–5, 127, 143, 191, 230 innateness 6, 254 intensifiers 39, 46, 87–9 intensity 48, 106, 110, 120 interrogative constructions 3, 29, 34, 38, 50, 68–70, 131, 138, 148, 153–4, 156, 160, 175, 176, 195, 217, 220, 233, 267 rhetorical 70, 121, 134, 138 and scale reversal 70 intersubjectivity 232 intervention effects, see scope Irish 24, 33

irony 35, 135, 184, 192 irrealis 33, 223–4 Italian 33, 143 Japanese 24, 34, 86, 87, 88, 106, 109, 143, 173, 219 Jespersen cycle 43, 45–6 judgment 61, 78, 119 categorical vs. thetic 198–200 Korean 24, 25, 33, 34, 173 Lango 33 Latin 24, 33, 45, 161, 268 law of double negation 71, 75–8, 138 lexicalization 38, 83, 90, 93, 103 lexicon 6, 15, 31–2, 47, 105, 127–7, 197, 229 Lezgian 24, 33 licensers, see polarity contexts licensing 9, 26 Flaubert 36 and linguistic representations 202–32 and locality 71, 176, 208 and n-words 45 and “possible polarity items” 140 and scalar construal 233–55 and scalar inferencing 48–78, 95 secondary 35, 185, 206 subtrigged 199 licensing conditions/requirements 28, 91, 155, 184, 195, 220 licensing contexts 48–78, 174 licensing problem 32, 37, 95, 206–8 licensing relation question 31, 208 literal meaning, see meaning: literal Lithuanian 33 litotes 88, 123, 231 logical form 4, 18, 30, 78, 204–5, 206–11, 224–7 Maltese 24 Mandarin 130 meaning emotive 110 literal 79, 139, 184, 204–5 sentence 18 truth conditional 16, 61, 70, 166, 205, 216, 225–7, 243, 245 utterance 8, 16–18, 62, 67–9, 215, 234, 250, 254, 257

mental spaces 62, 78, 128, 164, 173 metaphor 1, 77, 102 metonymy 114, 156 modal operators 74, 106, 127–42, 186–7, 221–2 monotonicity 234 hierarchy 216–21, 269 thesis 213–16, 219, 224–6, 241 natural kinds 109, 233 negation 20–6, 92, 109, 115, 122–4, 147, 191, 216–17 and c-command 203 and connectives 147–51 direct and indirect 33 as a durative predicate 155 long-distance 34 metalinguistic/polemical 27, 150, 181–2, 194 and modals 128–34, 140 pleonastic 165 weak, see polarity contexts: approximatives negative concord 43–7 negative contexts 2, 5, 23, 38–9, 48, 60, 76, 100, 131–3, 145–6, 151, 156, 206, 233, 267 hierarchy of 37, 216–21 negative evidence 4 negative face 110, 112, 114 negative morphology 21, 41, 53–4 negative operator 22, 26, 74, 132 negative polarity items (NPIs) 2–5, 27–30, 42, 48, 70, 90–1, 209, 212–13, 219, 223, 226, 230–2, 233, 238 attenuating 86, 91, 231 auxiliary need 39–40, 73, 109 conservative/strong 36, 181 emphatic 90, 160, 164, 177, 251–4 liberal 34–7, 181, 251 maximizers 95–6 minimizers 24, 35, 79, 81, 85, 86, 96–7, 149, 160, 179–80, 179–84 morphological 41, 265–6 negative politeness 111, 113, 129 negativity 68, 185, 217, 220 non-veridicality 34, 221–4 Norwegian 130, 142 only, see focus particles: exclusive

Persian 24, 86, 88 polarity contexts 29–30, 38, 48–78, 62 adversative 29, 38, 50, 66, 70, 71, 144, 156, 176, 208, 210 approximatives 50, 64–5, 70, 156, 160, 218 before clauses 63, 217, 239 exact NPs 243–5 negated because clauses 182–3 quantifier restrictions 29, 36, 63, 76, 183, 217, 246–8, 238, see also comparative constructions; conditional constructions; interrogative constructions; and negative contexts polarity items 2–8, 23, 29–30, 37, 62, 155 abstentive 263 any as 168–80 aspectual 36, 109, 151–61 aversive 138–9 connective 142–51, 260 desiderative 136–8, 262 durative 152, 260 inceptive 152, 160, 260 indefinite 81, 163–201 inverted 95–103 irritative 141, 263–4 modal 127–42, 261 semi-polarity items 38, see also NPIs; PPIs temporal 99–100 tolerative 140–1, 263 Polish 143 politeness 110–12, 129, 250 polysemy 155, 158, 161, 167, 173 Portuguese 25 positive polarity items (PPIs) 2–3, 27–8, 75, 194, 219, 230–2 attenuating 86–90, 91, 111, 164–5, 200, 231 doubleplusgood 87 emphatic 86–7, 96–100 minimizing 97 morphological 41 precedence condition 71–3 predication 22, 59, 190, 198–200 presupposition 61, 66, 69, 80, 102, 103, 153, 160, 165, 170, 181 profiling 104, 107 quantifier raising 208 quantifiers 73–5, 81, 87, 106, 128–9, 131 downward entailing 214 English 164

“exact” 243–5 existential 129, 163, 169 most and few 246–8 negative 44, 149, 217, 220 universal 49, 59, 124, 129 quantitative value 7, 81–4, 87, 90, 103, 105, 148, 152, 162, 191 quasi-negatives, see polarity contexts: approximatives questions, see interrogative constructions recursion 22 Relevance Theory 12, 239, 268 rhetoric 7, 10, 253 interpersonal 250 rhetorical coherence 120, 250 rhetorical force 45, 95, 116, 145, 151, 195, 199, 234, 253 rhetorical function/purpose 7, 112, 116, 143, 234, 245, 252 Rumanian 268 Russian 25, 33, 143 Sanskrit 24 scalar inferencing 51, 54, 64, 76, 91, 93, 104, 129, 174 Scalar Model of Polarity 7, 47, 74, 104, 109, 136, 143, 175, 189, 203, 205, 229–32, 250 scalar models 51, 57–60, 82–3, 95, 143, 147, 176, 234, 236, 238–41, 247–8, 257 scalar norm 83–4, 94, 105 scalar operators 7, 79–81, 85, 104–9, 126–7, 143, 146, 150, 151–2, 164, 168, 179, 235, 237, 250, 256 scale preservation 60, 62, 76, 94, 102, 244, 248, 250 scale principle 58–9 scale reversal 60, 62, 85, 94, 129, 137, 144–5, 176, 188, 224, 244–5, 247–8, 250 scales: canonical and inverted 100–3 conceptual 51, 54–7, 79, 81–4, 105–9, 143, 170, 237, 247 kind vs. quantity 171–3, 175, 180 quantitative (or “Horn”) 52, 118, 147, 183 scope of affective/DE operators 4, 48, 176 of assertion 67 of even 80 Immediate Scope Constraint 209–12

and intervening operators 73–5, 210 and the licensing relation question 31 and logical form 227 of modal operators 34, 130–2, 140, 184–6 of multiple affective operators 75–8, 138 of negation 2, 29, 40, 71–7, 130–1, 138, 149–51, 209 semantic bleaching 116 semantic change 114–16 semantic composition 15, 25, 121, 125, 148 semantic domains 55, 105, 162 sensitivity problem 31–2, 37, 95 Serbo-Croatian 34, 43 Somali 33 some 188–195 adverbial 189, 268 basic 189 exclamative 189–91 spesumptive 189–90, 199 Spanish 24, 34, 44, 45, 142, 161 specificity 220 speech acts 55, 62, 68, 110, 114, 139 denoting predicates 133 indirect 111, 121, 134, 139, 204 participants 83, see also conditional constructions: speech act square of opposition 92–3 strength informative 91, 119, 144 licensing 37, 220 prepositional 9 rhetorical 162, 254

strengthening pragmatic 116 semantic 169–70, 174, 175, 189 subjectification 116 subjectivity 10, 16, 18–19, 164, 177–80, 190 superlatives 49–51, 59, 72, 75, 79, 87, 124, 174, 175 suppletion 22, 151, 165 Swahili 24 syntacticization 45 taxonomy 54, 56, 90 thematic hierarchy 98 thematic roles 98, 100 trajector and landmark 155–60 triggering, see licensing truth conditions, see meaning: truth conditional Turkish 24 understatement 88–9, 91, 113, 122, 192 and negation 123, 231 Universal Grammar 207, 233 usage-based grammar 14, 17 utterance interpretation, see meaning: utterance Veridicality Hypothesis 223–4 wh-movement 207 widening 169–70, 177, 189 Yimas 33

Person index

Anscombre, J.-C. 9, 52, 267 Aranovich, R. 44, 45 Aristotle 92 Atlas, J. D. 65, 67, 112, 219 Austin, J. L. 250–1 Bacon, F. 1 Bacon, R. 268 Baker, C. L. 4, 27, 48, 71, 75–8, 91, 176, 181, 206, 209 Barker, C. 246, 268 Barlow, M. 14 Barwise, J. 214, 232 Beaver, D. 183, 205 Bender, E. 134 Benveniste, E. 16 Bergen, A. von 95 Bergen, B. 15, 235 Bergen, K. von 95 Bernini, G. 24 Bolinger, D. 4, 24, 83, 88, 106, 123, 126, 166 Borkin, A. 86, 121, 185 Bosque, I. 161 Bouvier, Y.-F. 86, 141, 142 Bowerman, M. 4 Braine, M. 4 Bréal, M. 45 Brown, P. 110–12 Buyssens, E. 48 Bybee, J. 14, 128, 224 Caffi, C. 110 Carlson, G. 164 Carnap, R. 10, 14, 53, 232 Carston, R. 16, 64 Carver, R. 145 Chang, N. 15, 235


Chierchia, G. 8, 71, 74, 91, 176 Chomsky, N. 10, 11 Clark, B. Z. 183, 205 Clark, H. H. 16 Coates, J. 130 Cooper, R. 214, 232 Cormack, A. 128 Coulson, S. 16 Crain, S. 72 Croft, W. 15, 24, 205 Cutrer, M. 128 Dancygier, B. 205 Dasher, R. B. 14, 17 Davidson, D. 25, 232 Davison, A. 164 Dayal, V. 185–8 Declerck, R. 155, 156–7 Deprez, V. 141, 142 den Dikken, M. 185 Dowty, D. 44, 101, 213, 219 Ducrot, O. 9, 52, 64, 67, 267 Duffley, P. J. 130, 177 Edison, T. 163 Edmondson, J. A. 37, 130, 220 Emanatian, M. 185 Falkenberg, G. 141 Farkas, D. 268 Fauconnier, G. 4, 8, 13, 16, 17, 48–51, 58–60, 62, 80, 86, 128, 168–70, 205, 213, 215, 224, 227, 234 Fillmore, C. J. 8, 15, 16, 51, 57, 59, 67, 79–80, 142, 143, 232 Fintel, K. von 61, 66, 128, 205, 213, 223, 228, 269 Francescotti, R. M. 80

Gaatone, D. 30, 86, 96 Garrido, J. 151, 152 Gazdar, G. 53 Geis, M. 25, 110 Giannakidou, A. 4, 8, 34, 45, 164, 178, 222–3, 233, 250, 269 Gil, D. 164 Givón, T. 224 Goffman, E. 110 Goldberg, A. 14, 15, 17, 268 Gordon, D. 111 Grady, J. 235 Green, L. J. 268 Grice, H. P. 52 Guerzoni, E. 121, 176

Katz, J. J. 11 Kay, P. 8, 9, 15, 16, 51, 57, 59, 67, 79–80, 84, 142, 143, 161, 170 Kemmer, S. 14 Kirsner, R. S. 178 Klein, H. 28, 64, 67, 74, 83, 86, 88, 93, 106 Klima, E. 4, 30, 48, 78, 142, 148, 155, 159, 164, 166, 203 Koenig, J.-P. 15 König, E. 80, 143, 151 Kratzer, A. 39, 128 Krifka, M. 8, 37, 86, 91, 168, 172, 176, 192, 208, 211, 219, 229–32 Kucaj, S. 5 Kuroda, S.-Y. 198

de Haan, F. Hamilton, W. 163 Haspelmath, M. 8, 23, 24, 28, 32–4, 46, 164, 169, 173, 176, 220–1 Heim, I. 61, 66, 81, 86, 176, 177, 223, 228, 238, 246 Heinämäki, O. 155 Herburger, E. 9 Hinds, M. 87, 121 Hirschberg, J. B. 54, 170 Hoeksema, J. 38, 71, 73, 76, 102, 106, 130, 140, 141, 207, 213–14, 216 Horn, L. 8.2, 16, 17, 22, 24, 28, 35–7, 41, 52–3, 61, 64–8, 74, 75, 76, 80, 81, 86, 91, 112, 123, 128, 131, 135, 139–40, 147, 151, 153, 155, 156, 157, 164, 168–76, 178–9, 181, 183, 185, 187, 197–9, 207–8, 211, 220, 227, 246, 251, 268, 269 Hübler, A. 83, 88, 91, 110, 139 Huddleston, R. 48, 64, 267

Labov, W. 11, 48 Ladusaw, W. 4, 32, 45, 48, 61, 64, 66, 71, 90, 209, 224–7, 228–9, 243, 245, 246 Lahiri, U. 8, 91, 164, 169, 173 Laka, I. 48, 203, 206 Lakoff, G. 16, 80, 102, 111, 235 Lakoff, R. 110–11, 121, 129, 155, 166 Lambrecht, K. 190 Landman, F. 8, 61, 66, 86, 91, 164, 168–75, 185, 189, 211, 229 Langacker, R. W. 13, 14, 15, 16, 17, 55–7, 98, 107, 128, 163, 169, 177, 178–9, 188, 191, 200, 232, 233, 268 Langendoen, T. 42 Larrivée, P. 96, 177 Lee, C. 164, 169 Lee, Y. 8, 81, 156, 164, 168–76, 183, 207–8 Leech, G. 110 Lees, R. B. 4, 48, 166 LeGrand, J. E. 187 Levinson, S. C. 11, 16, 17, 53, 55, 110–12, 147 Lindholm, J. M. 155, 157 Linebarger, M. 4, 8, 48, 61, 66, 71, 74, 81, 109, 121, 135, 181, 183, 203, 206, 225, 227, 233, 238, 243–5, 248–50 Löbner, S. 151 Lycan, W. G. 250 Lyons, J. 16

Jackson, E. 91, 130, 219, 246 Jacobson, P. 42, 268 Janney, R. W. 110 Jayez, J. 141, 142, 163, 164, 169, 174 Jennings, R. E. 187 Jespersen, O. 45, 128, 178 Johnson, M. 102, 235 Kadmon, N. 8, 61, 66, 86, 91, 164, 168–75, 185, 189, 211, 229 Karttunen, L. 9, 157 Kas, M. 208, 213–14, 216–19 Kathol, A. 134

MacWhinney, B. 5 Mahajan, A. 206 Mandler, J. 235 Margerie, H. 91, 193

Martinet, A. 112 Matsumoto, Y. 53 May, R. 225 Mazodier, C. 193 McGloin, N. 86, 87, 88, 91, 106 Meillet, A. 46 Michaelis, L. A. 51, 152, 190 Milsark, G. 189 Mittwoch, A. 151, 155, 159 Moeschler, J. Möhren, F. 24 Morris, C. W. 10, 14 Moxey, L. M. 65 Newmeyer, F. 11, 14 Nietzsche, F. 202 Nobrega, E. 25 O’Connor, M. C. 51, 57, 59, 67, 79–80 Palacios Martinez, I. M. 43 Palmer, F. R. 128, 130 Paradis, C. 83, 88 Partee, B. H. 10 Paul, H. 112 Peters, S. 9 Pietroski, P. 72 Pinker, S. 4 Pollard, C. 15 Portner, P. 128, 190 Pott, A. F. 24 Powell, M. J. 117 Progovac, L. 8, 43, 48, 156, 203, 206–9, 212, 225, 233 Pullum, G. K. 48, 64, 267 Quine, W. V. O. 164 Quirk, R. S. 253 Raghibdoust, S. 86 Roberts, C. 205 Rooy, R. van 8, 9.4, 68, 91, 169, 269 Rullmann, H. 8, 81, 86, 96, 102, 106, 142, 143, 148, 149, 162, 170, 177, 268 Sadock, J. 67 Sag, I. 15

Sánchez Valencia, V. 75, 217–18, 239 Sanford, A. J. 65 Sapir, E. 83 Schmerling, S. 8, 24, 28, 86, 177 Schwenter, S. 53, 161 Smith, N. 128 Smith, S. 155 Spencer, N. J. 11 Sperber, D. 12, 16, 62, 239, 268 Spitzbardt, H. 91 Strawson, P. F. 25 Suppes, P. 5.3 de Swart, H. 155, 159 Sweetser, E. 124, 128, 205, 251 Szabolcsi, A. 8, 109, 143, 149–50 Talmy, L. 101, 128 Tomasello, M. 14, 16, 17 Tovena, L. 141, 142, 163, 164, 169, 174, 199 Traugott, E. C. 17, 114–16, 128, 151, 177 Turner, M. 16, 205 Uribe-Etxebarria, M. 203, 206 Vallduví, E. 44 van der Auwera, J. 128, 151, 153 Vasishth, S. 86, 219 Vendler, Z. 156, 163, 169, 179 Verhagen, A. 8, 16, 67, 205 Wagenaar, K. 24 Wal, S. van der 25, 96, 109, 130, 219 Warfel, S. 190 Waterhouse, J. 151 Wierzbicka, A. 268 Wilde, O. 79 Wilson, D. 12, 16, 62, 239, 268 Wouden, T. van der 37, 44, 48, 65, 86, 130, 213, 216–21, 233, 239, 269 Wright, G. H. von 128 Yoshimura, A. 211, 239 Zanuttini, R. 44, 190 Zepter, A. 8, 91, 169 Zipf, G. K. 112 Zwarts, F. 37, 44, 213–14, 216–19, 239
