VOLUME 26 NUMBER 2 MAY 2009
CONTENTS
ALEX LASCARIDES AND NICHOLAS ASHER Agreement, Disputes and Commitments in Dialogue 109
SIGRID BECK AND SHRAVAN VASISHTH Multiple Focus 159
ANDREA GUALMINI AND BERNHARD SCHWARZ Solving Learnability Problems in the Acquisition of Semantics 185
FORTHCOMING ARTICLES
ALAN C. BALE AND DAVID BARNER: The Interpretation of Functional Heads: Using Comparatives to Explore the Mass/Count Distinction
ALEX LASCARIDES AND MATTHEW STONE: A Formal Semantic Analysis of Gesture
YINGYING WANG: Counterfactual Donkey Sentences: A Response to Robert van Rooij
issn 0167-5133
JOURNAL OF SEMANTICS
AN INTERNATIONAL JOURNAL FOR THE INTERDISCIPLINARY STUDY OF THE SEMANTICS OF NATURAL LANGUAGE
MANAGING EDITOR: BART GEURTS (University of Nijmegen)
ASSOCIATE EDITORS: DAVID BEAVER (Stanford University) REGINE ECKARDT (Universität Göttingen) IRA NOVECK (Institut des Sciences Cognitives, Lyon) PAUL PORTNER (Georgetown University, Washington) PHILIPPE SCHLENKER (Institut Jean-Nicod, Paris) YAEL SHARVIT (University of Connecticut, Storrs) ANNA SZABOLCSI (New York University)
EDITORIAL BOARD: NICHOLAS ASHER (University of Texas, Austin) CHRIS BARKER (University of California at San Diego) JOHAN BOS (University of Edinburgh) PETER BOSCH (University of Osnabrück) RICHARD BREHENY (University College London) MIRIAM BUTT (University of Konstanz) GREG CARLSON (University of Rochester) ANN COPESTAKE (Stanford University) HENRIËTTE DE SWART (Utrecht University) PAUL DEKKER (University of Amsterdam) KURT EBERLE (Lingenio Heidelberg) MARKUS EGG (Universität des Saarlandes) ULRIKE HAAS-SPOHN (University of Konstanz) LAURENCE R. HORN (Yale University) HANS KAMP (University of Stuttgart) GRAHAM KATZ (University of Osnabrück) TIBOR KISS (Ruhr University, Bochum) JONAS KUHN (University of Texas, Austin) CLAUDIA MAIENBORN (Humboldt University, Berlin) JULIEN MUSOLINO (Rutgers University) FRANCIS JEFFRY PELLETIER (University of Alberta) CHRISTOPHER POTTS (University of Massachusetts, Amherst) MARK STEEDMAN (University of Edinburgh) ZOLTAN GENDLER SZABO (Cornell University) KEES VAN DEEMTER (University of Aberdeen) ROB VAN DER SANDT (University of Nijmegen) ROBERT VAN ROOIJ (University of Amsterdam) KAI VON FINTEL (Massachusetts Institute of Technology) ARNIM VON STECHOW (University of Tübingen) BONNIE WEBBER (University of Edinburgh) HENK ZEEVAT (University of Amsterdam) THOMAS EDE ZIMMERMANN (University of Frankfurt)
EDITORIAL CONTACT:
[email protected] © Oxford University Press 2009 All rights reserved; no part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise without prior written permission of the Publishers, or a licence permitting restricted copying issued in the UK by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1P 9HE, or in the USA by the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. Typeset by TnQ Books and Journals Pvt. Ltd., Chennai, India. Printed by Bell and Bain Ltd, Glasgow, UK For subscription information please see back of journal.
Scope of this Journal
The Journal of Semantics publishes articles, notes, discussions, and book reviews in the area of academic research into the semantics of natural language. It is explicitly interdisciplinary, in that it aims at an integration of philosophical, psychological, and linguistic semantics as well as semantic work done in logic, artificial intelligence, and anthropology. Contributions must be of good quality (to be judged by at least two referees) and must report original research relating to questions of comprehension and interpretation of sentences, texts, or discourse in natural language. The editors welcome not only papers that cross traditional discipline boundaries, but also more specialized contributions, provided they are accessible to and interesting for a general readership in the field of natural language semantics. Empirical relevance, sound theoretic foundation, and formal as well as methodological correctness by currently accepted academic standards are the central criteria of acceptance for publication. It is also required of contributions published in the Journal that they link up with currently relevant discussions in the field of natural language semantics.
Information for Authors
Papers for publication should be submitted to the Managing Editor by email as a PDF file or PS file attachment. If this is not feasible please contact the Managing Editor. The receipt of submissions is confirmed by email (when there is more than one author to the first author, whom we assume to deal with all correspondence, unless we are instructed differently), and the paper is reviewed by two members of the editorial board or external experts chosen by the editors. The reviewers remain anonymous. An editorial decision is normally reached within 2-3 months after submission. Papers are accepted for review only on the condition that they have neither as a whole nor in part been published elsewhere, are elsewhere under review or have been accepted for publication. In case of any doubt authors must notify the editor of the relevant circumstances at the time of submission. It is understood that authors accept the copyright conditions stated in the journal if the paper is accepted for publication. The style requirements of the Journal of Semantics can be found at www.jos.oxfordjournals.org, under “Instructions to Authors”, and are binding for the final version to be prepared by the author when the paper is accepted for publication.
LaTeX submission
Please use the Journal class file (http://www3.oup.co.uk/semant/instauth/semant.cls). A tex file (http://www3.oup.co.uk/semant/instauth/guide.tex) is available on how to use the .cls file. Authors who are planning to send source files by email should also include a postscript or PDF version of their paper. Please follow all the instructions to authors that are detailed above and note the text width should be set to 28pc and the text height to 41\baselineskip. Electronic figures can only be used in ps or eps format.
Journal of Semantics 26: 109–158 doi:10.1093/jos/ffn013 Advance Access publication February 6, 2009
Agreement, Disputes and Commitments in Dialogue
ALEX LASCARIDES University of Edinburgh
NICHOLAS ASHER Université Paul Sabatier
© The Author 2009. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected].

Abstract
This paper provides a logically precise analysis of agreement and disputes in dialogue. The semantics distinguishes among the public commitments of each dialogue agent, including commitments to relational speech acts or rhetorical relations (e.g. Narration, Explanation and Correction). Agreement is defined to be the shared entailments of the agents’ public commitments. We show that this makes precise predictions about implicit agreement. The theory also provides a consistent interpretation of disputes and models what content is agreed upon when a dispute has taken place.

1 INTRODUCTION
A semantic framework for interpreting dialogue should provide an account of what is agreed upon and what is in dispute. In spite of the interest that dialogue interpretation has attracted, implicit agreement remains difficult to analyse in logically precise terms. For instance, consider the following real dialogue, described in Sacks et al. (1974: 717):

(1)
a. Mark (to Karen and Sharon): Karen ‘n’ I’re having a fight,
b. Mark (to Karen and Sharon): after she went out with Keith and not me.
c. Karen (to Mark and Sharon): Wul Mark, you never asked me out.

Intuitively, Mark and Karen agree that they had a fight and that this was caused by Karen going out with Keith and not Mark. Thus, implicatures can be agreed upon: that (1b) explains (1a) goes beyond compositional semantics. Furthermore, agreement can be implicated: Karen does not repeat (1a) or (1b) or utter OK to indicate agreement. Finally, (1c) is not (yet) agreed upon. A theory of dialogue interpretation should predict these facts.

In principle, the Grounding Acts Model (GAM; Traum 1994; Poesio and Traum 1997) supports implicit agreement. But the particular rules that are currently specified demand the recognition of an acceptance act for agreement to take place (Matheson et al. 2000), and the rules they offer do not predict such an acceptance act of (1a) and (1b) from Karen’s utterance (1c). Segmented Discourse Representation Theory (SDRT; Asher and Lascarides 2003) errs in the opposite direction. It stipulates that lack of disagreement implies agreement, and so (1c) is incorrectly predicted to be agreed upon. Thus, SDRT needs modification to deal with (1), just as GAM needs supplementation.

Agreement can also occur in the context of corrections or disputes. So disputes must receive a consistent interpretation; otherwise, it will lead to inconsistent inferences about agreement. Consider dialogue (2) from the Toulouse-Stuttgart Procorpe corpus.

(2)
1.1. A: Je suis tombé en panne. 1.2. Peux-tu m’aider?
I have had a breakdown. Can you help me?
2.1. B: Où es-tu?
Where are you?
3.1. A: Je suis devant le refuge qui se trouve un km après Couiza. 3.2. Il y a la une cabine téléphonique.
I’m in front of the refuge that’s 1 km after Couiza. There’s a telephone booth.
4.1. B: Il y a plusieurs refuges aux alentours de Couiza. 4.2. Dans quelle direction es-tu parti de Couiza?
There are several refuges around Couiza. In which direction did you leave Couiza?
5.1. A: Je suis sorti par la route Paul Sabatier. 5.2. Puis j’ai roulé vers la montagne. 5.3. A une clairière j’ai tourne a droite.
I left by the Paul Sabatier road. I then drove toward the mountains. At a clearing I turned right.
6.1. B: Au grand carrefour?
At the big intersection?
7.1. A: Non, après, là ou on commence à avoir une belle vue sur la mer.
No afterwards, where you begin to have a nice view of the sea.
8.1. B: Ah, je vois, au Rocher du diable.
Oh, I see, at the Devil’s cliff.
9.1. A: C’est possible, il y avait un gros rocher.
It’s possible; there was a big cliff.
10.1. B: Donc tu es devant la refuge de la Maison de l’aigle et j’arrive tout de suite.
So, you’re in front of the refuge at the house of the eagle. I’m coming right away.

In (2), B’s utterance (4.1) denies a uniqueness presupposition arising from the definite ‘the refuge that’s 1 km after Couiza’ in (3.1). But utterance (10.1) intuitively implicates B’s agreement about the undisputed parts of this utterance; that is that A is in front of a refuge. In other words, some content that is uttered before the dispute takes place is agreed upon after it. The content of utterance (4.1) in dialogue (2) is inconsistent with the uniqueness presupposition from utterance (3.1) that it denies. But a denial can also be implicated by an utterance that is consistent with the denied content, as the dialogue (3) from Walker (1996) and dialogue (4) show:

(3)
a. A: The new student is brilliant and imaginative.
b. B: He’s imaginative.

(4)
a. A: John is not a good speaker
b. A: because he’s hard to understand.
c. B: I agree that he’s hard to understand.

Intuitively, by agreeing with a strict part of a prior contribution, B implicates a dispute with the other part. Following Hirschberg (1985), Walker argues that (3b) generates a scalar implicature that B does not believe that the new student is brilliant. Similarly, B agrees in (4) with (4b) but implicates that he does not believe (4a). To our knowledge, there is currently no formally precise account of dialogue that yields accurate interpretations of corrections and agreement. We aim to rectify this here.

We will say that a proposition p is grounded just in case p is agreed by the dialogue agents to be true. This follows Clark’s terminology, in particular the concept of grounding a joint action at level 4 (Clark 1996: 388). Clark focuses mainly on grounding at the ‘lower’ levels; how agents ground what was meant, for instance. By contrast, in order to study grounding at the higher level, we will assume a highly idealized scenario where dialogue agents understand each other perfectly, resolving all ambiguities to the same specific values. One of Clark’s main claims is that grounding at all levels occurs only when there is positive evidence for it, and here we aim to explore in a logically precise manner what evidence suffices for grounding a proposition. In future work, we intend to demonstrate that our definitions can be extended to model grounding at the lower levels too; this will involve modelling misunderstandings.

The rest of the paper is as follows. In section 2 we use existing approaches to motivate general criteria for any adequate theory of agreement and disputes. These criteria could be formalized within any Information State Update (ISU) approach to dialogue [see Larsson and Traum (2000) for an overview]. But in section 3 we argue for SDRT as a starting point, and we describe how to extend it to meet the criteria. This extension is then formalized in detail: section 4 defines the syntax and dynamic semantics of the language in which logical forms are expressed and section 5 defines the glue logic that uses linguistic form and contextual information to construct logical form. We analyse several real dialogues and explore how agreements and disputes interact with anaphora.

2 MOTIVATION
Many current theories of dialogue adopt the ISU paradigm (Larsson and Traum 2000): an utterance triggers an update to the information state representing the dialogue context to form a new information state. The differences among ISU approaches reside in three areas: the type of information that is recorded in an information state, the type of update operations that are permissible and the controls over their application. We will examine two ISU theories that already offer an account of agreement: the Grounding Acts Model (GAM; Traum 1994; Poesio and Traum 1997, 1998) and Segmented Discourse Representation Theory (SDRT; Asher and Lascarides 2003).1 The GAM links the speech acts performed with their semantic and cognitive effects, including effects on grounding. Poesio and Traum (1998) formalize this: as the dialogue proceeds, each dialogue agent builds the conversational information state (CIS) shown in (5).

(5)
\[
\begin{array}{|l|}
\hline
G,\ DU1,\ DU2,\ DU3,\ UDU,\ CDU \\
\hline
G = \ldots \\
DU1 = \ldots \\
DU2 = \ldots \\
DU3 = \ldots \\
UDU = \langle DU1, DU3 \rangle \\
CDU = \mathit{first}(UDU) = DU1 \\
\hline
\end{array}
\]

1 Agreement and disputes are not examined in Ginzburg’s (2008) ISU theory, and so we do not examine it here.
A CIS is a DRS (Kamp and Reyle 1993) where each of its referents G (for ground), DU (for discourse unit), etc., are also DRSs. The currently pending discourse units, which require further attention in the dialogue, are grouped under UDU, the top element being the current discourse unit (CDU). The update to a CIS that results from a particular speech act (and the conditions under which the act can be performed) is then specified as changes to (and conditions on) the DRSs G, UDU, etc. For example, the speech act where B asserts K to A updates the common ground G to include an event $e'$ recording that B intends A to believe K and a conditional event $e''$: should A accept the assertion, then A would be socially committed to B to believe K (expressed via the social commitment attitude SCCOE):

(6)
Name: Assertion
Condition on update: $G \supseteq [e : \mathrm{Assert}(B, A, K)]$
Update: $G \mathrel{+}= [e']\, e' : \mathrm{Try}(B, \lambda s'.\, s' : \mathrm{Bel}(A, K)),\ [e'']\, e'' : \mathrm{Accept}(A, e) \Rightarrow [s \mid s : \mathrm{SCCOE}(A, B, K)]$

The update rules form an inheritance hierarchy, and so in addition to the effects specified in (6), Assertion inherits from its super-type speech act Directive an obligation on A to address e, and from its super-type statement a social commitment (SCCOE) of B to A to K. The application of update rules is then controlled by decision trees that use the linguistic analysis of the utterance and the prior CIS to predict which speech acts were performed. The hallmark of content p being mutually agreed upon is that the relevant agents A and B are socially committed to each other to p; in other words, (7) is a part of the CIS (since we ignore misunderstandings, we can assume that A’s and B’s CISs are identical):

(7)
$G \supseteq \mathrm{SCCOE}(A, B, p),\ \mathrm{SCCOE}(B, A, p)$

A social commitment to p can be created either by stating (or asserting) p or by accepting p (which entails that p was conveyed in a prior assertion). With this in mind, consider (1). While it is possible in principle to provide decision trees in GAM that will recognize (1c) as an acceptance act of Mark’s prior assertions, the rules they actually provide only recognize (1c) as an assertion. Consequently, GAM as it stands does not recognize that Karen is socially committed to (1a) and (1b) and the causal relation between them. GAM needs to be supplemented with rules for inferring that Karen was implicitly accepting Mark’s contribution [it also needs rules that recognize Mark’s contribution as conveying a causal relation between (1a) and (1b)].

Let us compare this with SDRT’s analysis of (1) (Asher and Lascarides 2003). In SDRT, an information state is a Segmented Discourse Representation Structure (SDRS), or a set of them when ambiguities remain unresolved (see section 5). An SDRS consists of a set of labels that each represents a unit of discourse and a function that associates each label with a formula representing the unit’s interpretation. These formulae include rhetorical relations between labels. The hierarchical structure on labels that this creates constrains anaphoric dependencies (see sections 3 and 5 for details). We will explore the effects of agreement and disputes on anaphora in the course of this paper. But for now we focus on SDRT’s predictions about agreement in (1). Its logical form is (1′), where $\pi_{1.1}$, $\pi_{1.2}$ and $\pi_{2.1}$ label the contents of the clauses (1a)–(1c), respectively (we will often use the convention that the nth utterance in the mth turn is indexed m.n), and $\pi$ labels the content of the dialogue segment that is created by the rhetorical connections:

(1′) $\pi : \mathit{Explanation}(\pi_{1.1}, \pi_{1.2}) \wedge \mathit{Explanation}(\pi_{1.2}, \pi_{2.1})$

For reasons of space, the semantic representations of the clauses are omitted from (1′). In fact, we will often gloss the content of a label $\pi$ as $K_\pi$. But assuming that $K_{\pi_{1.1}}$ to $K_{\pi_{2.1}}$ are expressed appropriately, (1′) entails the following: Karen and Mark were having a fight, this was caused by Karen going out with Keith and not Mark, and this in turn was caused by Mark not asking Karen out. In short, the presence of rhetorical relations in logical form captures content that is linguistically implicit (here, the causal relations). The update rules and the processes that control their application are rendered in a default glue logic which constructs logical forms like (1′) via axioms that validate default inferences about rhetorical connections among the discourse units (see section 5).

The logical form (1′) captures Karen’s commitments but loses Mark’s. This contributes to SDRT’s problematic analysis of agreement. Asher and Lascarides (2003: 363) stipulate that in the absence of divergent rhetorical relations (in other words, speech acts of denial such as Correction and Counterevidence), all the content is agreed upon. So SDRT wrongly predicts that (1c) is agreed upon. In essence, SDRT takes silence to mean assent, which is a mistake given Clark’s (1992) empirical findings that grounding requires positive evidence.

While SDRT’s current model of agreement is wrong, we believe that rhetorical relations like Explanation are a crucial ingredient in any adequate model. Rhetorical relations are types of relational speech acts (Asher and Lascarides 2003): they are speech acts because explaining something or continuing a narrative are things that people do with utterances; and they are relational because the successful performance of the speech act Explanation, for instance, is logically dependent on the content of the utterance (or sequence of utterances) that is being explained. When Karen utters (1c), she is explaining (1b). So even though the compositional semantics of (1c) does not entail (1b), its illocutionary contribution does entail it, or more accurately entails that Karen is publicly committed to it.2 This shows how recognizing an implicit acceptance is logically dependent on recognizing the particular relational speech acts that the agent performed. An implicit acceptance act follows if that speech act is left veridical (in other words, it entails the content of its first argument) and that first argument labels an utterance (or sequence of utterances) that was spoken by another agent. While Karen is committed to the speech act $\mathit{Explanation}(\pi_{1.2}, \pi_{2.1})$, Mark is committed to $\mathit{Explanation}(\pi_{1.1}, \pi_{1.2})$. So if agreement is defined as shared public commitment, then (1b) is agreed upon. But clearly there is still something missing. By performing the speech act $\mathit{Explanation}(\pi_{1.2}, \pi_{2.1})$, Karen (implicitly) accepts more of Mark’s contribution than just (1b). The agreed content also includes (1a) and the fact that (1b) caused (1a). But this does not follow from the semantics of Karen’s speech act $\mathit{Explanation}(\pi_{1.2}, \pi_{2.1})$. Furthermore, it would be implausible to interpret Karen’s speech act as explaining the content of Mark’s entire turn, that is as $\mathit{Explanation}(\pi_{1M}, \pi_{2.1})$, where $\pi_{1M}$ is the dialogue segment (1a)–(1b), associated with the content $\mathit{Explanation}(\pi_{1.1}, \pi_{1.2})$. It is simply not plausible to assume that Mark not asking her out explains why her going out with Keith and not Mark caused a fight.

2 We think Hamblin’s (1970) notion of public commitment is the appropriate speaker’s attitude to the moves he makes in dialogue (see also Gaudou et al. 2006). We explore the links between commitments and other attitudes like belief in Asher and Lascarides (2008). Poesio and Traum (1998) do not provide truth conditions for SCCOE, but it too can be viewed as public commitment.
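As a rough illustration of the shared-commitment reading discussed above, the following Python sketch treats each speaker's public commitments as a set and takes agreement to be their intersection. It is a toy under stated assumptions, not the paper's formal semantics: the names `Rel`, `VERIDICAL` and `commitments` are hypothetical, the labels are plain strings, and Explanation is assumed to entail the content of both of its arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rel:
    """A relational speech act between two discourse labels (hypothetical type)."""
    name: str
    left: str
    right: str


# Assumption: relations listed here entail the content of both arguments.
VERIDICAL = {"Explanation"}


def commitments(acts):
    """Return the acts a speaker performed plus the labels those acts entail."""
    out = set(acts)
    for act in acts:
        if act.name in VERIDICAL:
            out.update({act.left, act.right})
    return out


# Dialogue (1): Mark explains (1a) with (1b); Karen explains (1b) with (1c).
mark = commitments({Rel("Explanation", "pi1.1", "pi1.2")})
karen = commitments({Rel("Explanation", "pi1.2", "pi2.1")})

# Agreement as shared public commitment: only (1b) is shared at this point.
agreed = mark & karen
print(agreed)  # {'pi1.2'}
```

On this naive reading only the label for (1b) is shared, which mirrors the observation that something is still missing: (1a) and the causal link between (1a) and (1b) are not yet in the intersection.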
So Karen’s commitment to Mark’s entire turn must arise in some other way. We believe, like GAM, that agreement should be defined as shared public or social commitment. This explains why positive evidence is necessary for it, as Clark and Schaefer (1989) claim: both agents must perform a speech act with appropriate semantic consequences in order to form a shared public commitment. But (1) shows that an agent can commit to more than just the rhetorical speech act that they performed. Here, Karen must be committed to more than (1c) explaining (1b). Her commitment that (1b) also explains (1a) must arise as a consequence of the
fact that she explained (1b) and that Mark conveyed that (1b) explains (1a). What this suggests, then, is that commitments persist from prior turns and are even transferred from one speaker to another. The basic intuition regarding Karen’s endorsement of Mark’s contribution is the following principle:

If one implicitly accepts a prior utterance, one normally also accepts its illocutionary effects.
We will explain shortly why we believe that this principle should apply only to implicit endorsements (i.e. it will not apply to utterances of the form OK, I agree, repeating content, and the like). We will also explain why it is a default (note the use of the word normally). But first let us see how this principle has the desired effect in (1). It predicts that Karen commits not only to the speech act Explanation(p1.2, p2.1) that she performed but also through doing so she commits to the illocutionary effects Explanation(p1.1, p1.2) that Mark committed to when he uttered (1b). So the shared entailments of Karen’s and Mark’s public commitments match what they intuitively agree upon, as desired. Exactly which commitments persist from a prior information state to the current one depends on the speech acts that are performed in the current turn and how they are semantically related to the prior commitments. One might wonder, for instance, why the above principle does not apply to explicit acceptance acts. This is because deciding what is agreed upon by an explicit acceptance amounts to simply identifying the first argument of the (relational) acceptance act (or equivalently, determining the semantic scope of the acceptance act). In GAM, this speech act is called Accept; in the version of SDRT from Asher and Lascarides (2003), it is called Acknowledgement (while acknowledging an understanding of what was said is represented with the so-called meta-talk relation Acknowledgement*). In contrast, GAM and Clark (1996) use Acknowledgement to represent grounding an understanding. Therefore, to avoid confusion, we will use the rhetorical relation Acceptance to represent an explicit endorsement (even within SDRT). 
If an utterance like OK is intended to endorse the illocutionary effects of an utterance and not just its semantics, then this should be represented by making the first argument of the Acceptance relation the discourse segment whose content entails those illocutionary effects. In other words, the first argument to Acceptance should be the discourse unit p that out-scopes both p1 and p2, where R(p1, p2) represents the illocutionary effects that are being explicitly endorsed. So making
implicatures and not just compositional semantics accepted follows from the truth-conditional interpretation of the Acceptance relation that is part of logical form, so long as the first argument to the acceptance act (or, equivalently, the semantic scope of the acceptance) is chosen correctly. This makes any persistence axioms for explicit acceptance acts redundant. Rather, what is required are principles for identifying the first argument of the Acceptance relation. Moreover, (3) and (4), repeated here with labels for particular discourse units, show that it would be wrong for the above principle to apply to explicit endorsements.
(4) p1.1. A: John is not a good speaker
    p1.2. A: because he’s hard to understand.
    p2.1. B: I agree that he’s hard to understand.
Walker (1996) argues that (3), which features an explicit acceptance, triggers a scalar implicature that B does not believe that the new student is brilliant. However, this could not be modelled straightforwardly if the above principle applied to explicit endorsements: A’s (prior) commitment in (3) to Continuation(p1.1, p1.2) is consistent with Acceptance(p1.2, p2.1), and so if the above principle applied to explicit acceptances, B would be (wrongly) committed to Continuation(p1.1, p1.2) and thus to the new student being brilliant. To capture the right predictions about B’s beliefs would then require us to conclude that B is somehow violating Sincerity conditions; that is that he does not believe something that he has publicly committed to. This would be highly counterintuitive. In fact, by being very specific about exactly which part of A’s contribution B accepts, he is also conveying a lack of commitment to the other parts of A’s contribution. Similarly in (4), the compositional semantics of p2.1, which makes the content of p1.2 an argument to the predicate agree given the syntactic complement of the verb, seems to suggest that B has accepted p1.2 and no more than this. So while Explanation(p1.1, p1.2) is consistent with accepting p1.2, it would be wrong to assume that B becomes committed to it. Of course, these scalar implicatures about B’s beliefs apply, as Walker (1996) attests, only when the compositional semantics of the acceptance act is highly specific. If it is expressed in an unspecific way, like OK, a default seems to apply that the first argument to Acceptance is the entire last turn. In other words, if p2.1 in (4) were OK, then the update rules should predict a ‘wide-scope’ interpretation of the endorsement act,
(3) A: [The new student is brilliant]p1.1 and [he’s imaginative]p1.2.
    B: [He’s imaginative]p2.1.
making B committed to Acceptance(p1A, p2.1), where p1A is the discourse unit corresponding to A’s entire first turn, which is associated with the content Explanation(p1.1, p1.2). Even if B’s response in (4) is as shown below (so that p2.1 is no longer a syntactic complement to agree), a different interpretation is more salient:

(4) p2.1. B: OK,
    p2.2. B: he’s hard to understand

Namely, OK (explicitly) endorses all of A’s turn, while p2.2 explains why that acceptance act was performed. In terms of rhetorical relations, this is expressed with Acceptance(p1A, p2.1) ∧ Explanation*(p2.1, p2.2) (Explanation* is a meta-talk relation, where the second argument explains why the speech act expressed by the first argument was performed).

We proposed making the principle that the implicit endorsement of an utterance is also an endorsement of its illocutionary effects a default principle, as opposed to monotonic. This is motivated by the need to account for dialogues like (8), where C implicitly endorses p2.2, but does so with a speech act whose illocutionary effects conflict with those of p2.2:

(8) p1.1. A: How is James doing?
    p2.1. B: Actually, not so well.
    p2.2. B: His wife left him.
    p3.1. C: And he lost his job.

Intuitively, C uses p3.1 as a continuation of p2.2, and the resulting segment explains p2.1. This implies that p2.2 is not the sole cause for James not doing so well. In contrast, the illocutionary effect Explanation(p2.1, p2.2) that B commits to by uttering p2.2 implicates, via a scalar implicature, that p2.2 is the sole cause. This conflict between B’s and C’s intended illocutionary effects should block C from committing to Explanation(p2.1, p2.2), even though he has implicitly endorsed p2.2. Making our principle a default achieves this, so long as the consistency checks that accompany the default reasoning are based on what follows nonmonotonically from the individual premises, rather than what follows from them monotonically.

Now let us consider dialogues that feature (explicit) disputes. Dialogue (9) is taken from a chat forum,3 and it involves two agents, PianoCraft and WildWind, discussing David Bowie’s album Outside (we have labelled discourse units that are relevant to our discussion).

(9) p1.1. PianoCraft: It’s a safe bet that most of us love Outside, the dark themes, the death abyss, the mutilations, murder, and ignorance at the turn of the century.
    p1.2. PianoCraft: That’s exactly the album’s theme, that violence is now a form of entertainment, and we watch the destruction as a form of beauty.
    p2.1. WildWind: Much like the statement that was attempted in Natural Born Killers.
    p2.2. WildWind: However, I don’t think we like the album because of the theme, that is, I don’t think it’s the violence and death that makes the album worth listening to.
    p2.3. WildWind: I also know at least one person who can’t stand the album because of the theme (WildWind’s emphasis).

3 www.teenagewildlife.com/Interact/cp/showflat.pl?Cat=&Board=interp&Number=251307&page=17&view=collapsed&sb=7&part=.

This example provides further evidence for representing an agent’s commitments with rhetorical connections. This is because intuitively, WildWind agrees with p1.1 and p1.2, but disputes that the latter explains the former (see p2.2). This illocutionary effect is something that PianoCraft intended to convey, but left linguistically implicit: there is no cue phrase such as because in his utterances. However, a representation of PianoCraft’s commitment must include this intended effect so that the dispute between PianoCraft and WildWind can be reflected in their conflicting commitments. As we mentioned in section 1, one basic requirement is that disputes receive a consistent interpretation. Individuating each agent’s commitments in the dialogue’s information state helps to achieve this because it is consistent for two agents to make mutually inconsistent commitments. However, an agent’s beliefs can change as the dialogue progresses, and such changes can surface in the dialogue by an agent committing to something that is inconsistent with his earlier commitments. For instance, consider the simple (constructed) dialogue (10):

(10) p1.1. A: It’s raining.
     p2.1. B: No it’s not.
     p3.1. A: Oh, you’re right (uttered after A looks out the window)

A’s utterance p3.1 is an (explicit) Acceptance of B’s utterance p2.1, and given the way the anaphoric elements in p2.1 are resolved, this commits A to it’s not raining, contrary to his commitments from the first turn. If one simply adds this new commitment to the representation of his old one via conjunction, then contrary to intuitions, A’s commitments
will become inconsistent, making the entire information state inconsistent. There are several ways one could maintain consistency in A’s commitments. First, one could assume that updating an agent’s commitments involves truth maintenance, thereby incorporating downdating and revision. In dialogue (10), this would trigger the removal of the prior and conflicting commitment to p1.1. However, while this might work in principle, modelling downdating and revision is an unsolved problem for dynamic first-order models. So we will take a different approach in this paper that avoids revision, both in the model theory and in the update rules for constructing information states (see section 3 for an overview). But the general point to make here is that an adequate model of disputes must incorporate principles that identify which prior commitments are preserved and which are dropped. Dialogue (11) shows that should A accept B’s correction of his prior turn, then the principles for identifying ongoing prior commitments must validate that A remains committed to the parts of that turn that B did not dispute (even if B did not endorse them either).

(11) p1.1. A: John went to jail.
     p1.2. A: He embezzled the pension funds.
     p2.1. B: No, it was BILL who stole the pension funds.
     p2.2. B: I was at the trial.
     p3.1. A: Oh, OK.
     p4.1. B: John did go to jail though.

In this dialogue, B uses the second turn (p2.1 and p2.2) to dispute p1.2, and hence also the speech act Explanation(p1.1, p1.2) that A performed. A accepts this correction in the third turn p3.1. And finally, B uses utterance p4.1 to accept p1.1. Crucially, this is sufficient for p1.1 to be agreed between them; A need not repeat his commitment to p1.1. And so p1.1 must be a part of A’s commitments at the third turn, even though by accepting B’s correction A can no longer be committed to Explanation(p1.1, p1.2). Thus, when A endorses a dispute of his prior contribution, A should remain committed to those parts of his contribution that B did not deny. Here, the moves B makes in the second turn make him neutral about John going to jail (although his later turn commits him to this); and so by accepting B’s denial, A maintains a commitment to John going to jail. More generally, the principles that establish which prior commitments persist should ensure the following:
A speaker remains committed to those parts of his prior turns that he does not disavow.

To summarize this section, we have argued that any theory of dialogue should meet the following criteria.

1. Information states should individuate among the commitments of each dialogue agent, with agreement defined to be shared public commitment.
2. An agent’s commitments should include rhetorical connections. This is required for modelling implicit acceptance [e.g. (1)] and denials when what is denied is only the rhetorical connection [e.g. (9)].
3. The task of computing what’s agreed upon when an endorsement is explicit is logically equivalent to identifying the first argument of the speech act Acceptance.
4. The update rules for computing the current information state should identify which prior commitments persist in the current state, and are even transferred from one agent to another. In particular, the rules should ensure the following:
   (a) When one implicitly endorses a prior utterance, one normally also endorses its illocutionary effects.
   (b) An agent remains committed to those parts of his prior turns that he does not disavow.

3 AN ACCOUNT OF DIALOGUE IN SDRT

SDRT as it stands fails to distinguish among the agents’ commitments, but it includes rhetorical relations. GAM distinguishes among each agent’s commitments, but its current rules do not link acceptance acts with relational speech acts and their semantics. Neither GAM nor SDRT includes update rules that effect a transfer of a commitment from one agent to another in appropriate contexts (see criterion 4 above), which we argued is required for an adequate analysis of (1). We now propose a formal theory within SDRT that meets all the above criteria. We use SDRT for two main reasons. First, logical form in SDRT is defined more abstractly than in GAM, and this allows us to think of logical forms as first-order models, as Asher and Lascarides (2003) discuss. We can therefore exploit standard preservation results from model theory when stipulating which commitments persist as the dialogue proceeds. In fact, we will take full advantage of this by ensuring
that updates to logical form always extend the prior logical form and never revise it, even when an agent drops a prior commitment (in which case, the updated logical form adds a non-truth preserving operator with semantic scope over that prior commitment). The dynamics in the interpretation of logical forms will model how commitments change as the dialogue proceeds. Secondly, SDRT makes specific predictions about which parts of a dialogue context can be antecedents to subsequent anaphora. This gives us the opportunity to explore how agreement and disputes interact with anaphora. SDRT’s model of anaphora relies on separating the representation of dialogue content (i.e. an SDRS) from the conversational implicatures which it gives rise to. For instance, the implicature in (3) that B does not believe that the new student is brilliant is not a part of its SDRS—the only place from which antecedents to surface anaphora can be chosen. And this helps us to explain anomalous subsequent anaphora such as A uttering Why not? (meaning Why don’t you believe that the new student is brilliant) or B uttering That’s why we shouldn’t accept him on the course (meaning we shouldn’t accept him on the course because I don’t believe he’s brilliant). It also predicts that B cannot respond to an assertion that p with I do too (meaning I believe that too). In GAM, all effects on content and cognitive states are expressed in the same information state, and so it would need further refinements to explain this anaphoric data. To make SDRT meet the criterion that it distinguishes among each agent’s public commitments, we make the logical form of each dialogue turn a tuple of SDRSs, one for each agent. An SDRS for a given agent at a given turn will represent all of his current commitments, including the ongoing commitments from prior turns [this is a way to avoid revision in the model theory, as discussed in Lascarides and Asher (2008)]. 
The logical form of the dialogue overall is the logical forms of each of its turns. All dialogue participants build all the agents’ SDRSs and not just those representing his own commitments (and since we ignore misunderstandings we can assume they all build the same SDRSs). We assume an extremely simple notion of turns, where turn boundaries occur whenever the speaker changes (even if this happens mid-clause). The proposed logical form of (1) is shown in Table 1. As before, p1.1, p1.2 and p2.1 label the contents of the clauses (1a) to (1c), respectively (these semantic representations are omitted for reasons of space). We also from now on adopt a convention that the label of the dialogue segment of turn n with speaker d is pnd (here, M stands for Mark, K for Karen and S for Sharon). We call this tuple of SDRSs a Dialogue SDRS or DSDRS.
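As a rough illustration of this data structure, the Python sketch below splits a sequence of labelled utterances into turns at each change of speaker and then encodes the two turns of Table 1 as a list of per-agent commitment sets. The names `split_into_turns` and `dsdrs_1`, and the encoding of an SDRS as a set of relation strings, are assumptions made for this sketch; they are not part of the formal definitions of SDRT.

```python
from itertools import groupby


def split_into_turns(utterances):
    """Group (speaker, label) pairs into turns.

    A new turn starts whenever the speaker changes, which is the simple
    notion of turn assumed in the text.
    """
    return [(speaker, [label for _, label in group])
            for speaker, group in groupby(utterances, key=lambda u: u[0])]


# Dialogue (1): Mark utters p1.1 and p1.2, then Karen utters p2.1.
dialogue_1 = [("Mark", "p1.1"), ("Mark", "p1.2"), ("Karen", "p2.1")]
turns = split_into_turns(dialogue_1)

# Hypothetical encoding of Table 1: one dict per turn, mapping each agent
# to the set of relations that agent is publicly committed to.
NO_COMMITMENTS = frozenset()
dsdrs_1 = [
    {"Mark": frozenset({"Explanation(p1.1, p1.2)"}),
     "Karen": NO_COMMITMENTS,
     "Sharon": NO_COMMITMENTS},
    {"Mark": frozenset({"Explanation(p1.1, p1.2)"}),
     "Karen": frozenset({"Explanation(p1.1, p1.2)",
                         "Explanation(p1.2, p2.1)"}),
     "Sharon": NO_COMMITMENTS},
]
```

Each entry of `dsdrs_1` holds an SDRS for every agent, including agents who have not yet spoken, which mirrors the requirement that a turn is a tuple of SDRSs with one component per dialogue participant.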
Turn | Mark’s SDRS | Karen’s SDRS | Sharon’s SDRS
1 | p1M : Explanation(p1.1, p1.2) | ∅ | ∅
2 | p1M : Explanation(p1.1, p1.2) | p2K : Explanation(p1.1, p1.2) ∧ Explanation(p1.2, p2.1) | ∅

Table 1 A representation of dialogue (1)

There is a sharing of labels across the SDRSs in a DSDRS. This reflects the reality that a speaker may perform a relational speech act whose first argument is part of someone else’s turn or part of his own previous turns. As a special case, it captures the fact that through speaking an agent can reveal his commitments (or lack of them) to content that another agent conveyed, even if this is linguistically implicit. Leaving aside for now how this DSDRS is constructed, let us focus instead on its dynamic semantic interpretation. Asher and Lascarides (2003) define precisely the context change potential (CCP) of an individual SDRS. Since the logical form of a dialogue turn is now a tuple of SDRSs, its CCP is the product of the CCPs of the individual SDRSs. In other words, the context of evaluation $C_d$ for interpreting a dialogue turn is a set of dynamic contexts for interpreting SDRSs, one for each agent $a \in D$, where $D$ is the set of all dialogue agents:

$$C_d = \{\langle C_i^a, C_o^a \rangle : a \in D\}$$

Thus, $C_i^a$ and $C_o^a$ are world assignment pairs, given the definitions from Asher and Lascarides (2003). Equation (12) defines the dynamic interpretation of veridical relations (e.g. Narration, Explanation, Acceptance), where meaning postulates then constrain the illocutionary effects $\varphi_{R(\alpha,\beta)}$; for example, for Narration they stipulate the spatio-temporal progression of the events ($m$ in $[\![\cdot]\!]_m$ stands for monologue). Equation (13) defines the dynamic interpretation of Correction (see section 4 for details).

(12) $(w, f)\,[\![R(\alpha, \beta)]\!]_m\,(w', g)$ iff $(w, f)\,[\![K_\alpha \wedge K_\beta \wedge \varphi_{R(\alpha,\beta)}]\!]_m\,(w', g)$

(13) $(w, f)\,[\![\mathit{Correction}(\alpha, \beta)]\!]_m\,(w', g)$ iff $(w, f)\,[\![(\neg K_\alpha) \wedge K_\beta \wedge \varphi_{\mathit{Corr}(\alpha,\beta)}]\!]_m\,(w', g)$

The CCP of a dialogue turn $T = \{S_a : a \in D\}$ is the product of the CCPs of its SDRSs:

$$C_d\,[\![T]\!]_d\,C'_d \ \text{ iff } \ C'_d = \{\langle C_i^a, C_o^a \rangle \circ [\![S_a]\!]_m : \langle C_i^a, C_o^a \rangle \in C_d,\ a \in D\}$$

Accordingly, entailments from a dialogue turn can be defined in terms of the entailment relation $\models_m$ for SDRSs afforded by $[\![\cdot]\!]_m$:

$$T \models_d \varphi \ \text{ iff } \ \forall a \in D,\ S_a \models_m \varphi$$
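The entailment relation over turns can be approximated in a few lines of Python. The sketch below is a deliberate simplification: an SDRS is a set of relation triples, and the only entailments computed are those licensed by equations (12) and (13), namely that a veridical relation entails the content of both of its arguments and that Correction entails the negation of its first argument together with the content of its second. The function names, the tuple encoding and the optional `group` argument (which restricts the check to some of the agents) are assumptions of the sketch, not part of the formal theory.

```python
VERIDICAL = {"Explanation", "Narration", "Acceptance"}


def entailments(sdrs):
    """Return the formulas that follow from a set of (relation, a, b) triples."""
    out = set(sdrs)
    for relation, a, b in sdrs:
        if relation in VERIDICAL:        # equation (12): K_a and K_b both hold
            out.update({a, b})
        elif relation == "Correction":   # equation (13): not K_a, but K_b
            out.update({("not", a), b})
    return out


def agreed(turn, phi, group=None):
    """phi is agreed upon iff every agent (in the group) is committed to it."""
    agents = turn.keys() if group is None else group
    return all(phi in entailments(turn[agent]) for agent in agents)


# Second turn of dialogue (1), following Table 1.
turn_2 = {
    "Mark": {("Explanation", "p1.1", "p1.2")},
    "Karen": {("Explanation", "p1.1", "p1.2"),
              ("Explanation", "p1.2", "p2.1")},
    "Sharon": set(),
}

print(agreed(turn_2, "p1.2"))                           # False
print(agreed(turn_2, "p1.2", group=["Mark", "Karen"]))  # True
print(agreed(turn_2, "p2.1", group=["Mark", "Karen"]))  # False
```

The first check fails because Sharon is committed to nothing; the last fails because only Karen is committed to p2.1. A full implementation would replace `entailments` with the SDRS entailment relation, which this sketch does not attempt.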
The relation $\models_d$ is thus the shared entailment of each agent’s public commitments. And we assume that content $\varphi$ is grounded or agreed upon by all the dialogue agents by a dialogue turn $T$ iff $T \models_d \varphi$. Similarly, $\varphi$ is agreed on by a subgroup $G \subseteq D$ of dialogue agents iff for all agents $a \in G$, $S_a \models_m \varphi$ (where $S_a \in T$). In line with intuitions, this definition predicts that by the end of the second turn of dialogue (1), Mark, Karen and Sharon do not agree on anything other than logical truths (because Sharon is not committed to anything). But Mark and Karen agree that they had a fight, which was caused by Karen going out with Keith and not Mark. Finally, given that the SDRSs for a dialogue turn reflect all an agent’s current commitments, the CCP of the dialogue overall is the CCP of its last turn.

But what are the update rules for constructing this DSDRS-style information state? Asher and Lascarides (2003) provide a detailed glue logic for constructing a single SDRS. This includes default axioms for identifying rhetorical connections among discourse units given their linguistic form and context of utterance (see section 5). The rules are defaults because one never has complete information about the context, including speaker intentions. Clearly, these default axioms are needed for constructing DSDRSs. But in addition, the glue logic must now incorporate principles for computing which commitments persist from prior turns (see criterion 4 from section 2). The data from section 2 showed that the type of the speaker’s current speech act influences which parts of the prior commitments persist and which do not. For convenience, we state one Persistence Principle, together with separate axioms that for each type of (current) speech act compute the so-called undenied commitments of a prior constituent:4

The Persistence Principle: The undenied commitments of a constituent β in turn n persist to turn n + 1.

4 The term undenied commitment is not ideal. As we saw in (3), p2.1 is consistent with the utterance p1.1 and so one might want to think of p1.1 as an undenied commitment. But that is not the intended interpretation of the term here: p1.1 must not be thought of as an undenied commitment (even though it is undenied content), because, as Walker (1996) attests, p2.1 implicitly rejects p1.1.
For instance, the implicit endorsement in (1) concerns simple-left-veridical relations (e.g. Explanation and Background), which entail the content of the first argument but are not Acceptance (which is an explicit endorsement via utterances like OK or repetition of content):

Undenied Commitments for Simple-Left-Veridical Relations: If A commits to R(pj, pi) where R is simple-left-veridical, then for any R′ and any pk such that B is (already) committed to R′(pk, pj) (or to R′(pj, pk)), as long as it’s consistent to do so, the undenied commitments of pj include R′(pk, pj) (or R′(pj, pk)).
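One way to read this axiom operationally is sketched below in Python. The encoding of relations as triples, the function names and the `consistent` callback are assumptions introduced for illustration; in particular, the callback merely stands in for the consistency check of the glue logic, which the sketch does not implement. The set of relations treated as simple-left-veridical is limited to those the text applies the axiom to.

```python
SIMPLE_LEFT_VERIDICAL = {"Explanation", "Background", "IQAP", "Plan-Elab"}


def undenied_commitments(new_rel, other_commitments,
                         consistent=lambda old, new: True):
    """Relations of another agent that persist when new_rel attaches to pj.

    new_rel is (R, pj, pi); other_commitments is a set of (R', x, y) triples.
    """
    relation, pj, _pi = new_rel
    if relation not in SIMPLE_LEFT_VERIDICAL:
        return set()
    return {old for old in other_commitments
            if pj in old[1:] and consistent(old, new_rel)}


def update(new_rels, other_commitments, consistent=lambda old, new: True):
    """Speaker's new SDRS: the relations just performed plus what persists."""
    result = set(new_rels)
    for new_rel in new_rels:
        result |= undenied_commitments(new_rel, other_commitments, consistent)
    return result


# Dialogue (1): Karen explains p1.2, so she inherits Explanation(p1.1, p1.2).
mark = {("Explanation", "p1.1", "p1.2")}
karen = update({("Explanation", "p1.2", "p2.1")}, mark)


# Dialogue (8): a stand-in consistency check blocks B's Explanation(p2.1, p2.2),
# because C offers a different explanation of p2.1.
def no_rival_explanation(old, new):
    return not (old[0] == new[0] == "Explanation" and old[1] == new[1])


b = {("IQAP", "p1.1", "p2.1"), ("Explanation", "p2.1", "p2.2")}
c = update({("Explanation", "p2.1", "p")}, b, no_rival_explanation)
```

Under these assumptions `karen` contains both Explanation relations, while `c` contains IQAP(p1.1, p2.1) and Explanation(p2.1, p) but not Explanation(p2.1, p2.2). The hand-written `no_rival_explanation` check only imitates the scalar-implicature conflict for this one example; it is not a general consistency test.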
Undenied Commitments ensures that when A implicitly endorses pj, he normally also endorses the illocutionary effects that B had intended pj to have. This applies when constructing the DSDRS for (1). Mark’s first turn is constructed via existing glue logic axioms. Similarly, Karen’s utterance p2.1 will be recognized via these axioms as an Explanation of p1.2. The result satisfies the premise to Undenied Commitments (substitute Karen and Mark for A and B, p1.1, p1.2 and p2.1 for pk, pj and pi, respectively, and Explanation for R and R′). The consequence, that Explanation(p1.1, p1.2) is an undenied commitment of p1.2, is consistent with the premises of the glue logic and therefore inferred. So by the Persistence Principle Karen’s SDRS is as shown in Table 1. Undenied Commitments means that when A implicitly endorses a part of the prior speaker’s turn, he normally commits to all of that prior turn. If that prior turn endorsed A’s turn prior to that, then a sequence of applications of this principle ensures that A remains committed to all the commitments he made the last time he spoke. In other words, this principle models as a special case the fact that A remains committed to his prior commitments in those dialogue contexts where each speaker endorses the prior speaker’s turn. To see how this works, consider how Undenied Commitments contributes to the construction of the logical form for dialogue (14) (this is an extract from dialogue r008c from the Verbmobil corpus; Wahlster 2000).

(14) p1.1. A: Can I meet with you sometime in the next two weeks?
     p1.2. A: What days are good for you?
     p2.1. B: Well, I have some free time on almost every day except Fridays.
     p2.2. B: Fridays are bad.
     p2.3. B: So any day besides Friday we can probably work out a time.
     p3.1. A: Well next week I am out of town Tuesday, Wednesday and Thursday.
     p3.2. A: So perhaps Monday afternoon?
Turn | A’s SDRS | B’s SDRS
1 | p1A : Q-Elab(p1.1, p1.2) | ∅
2 | p1A : Q-Elab(p1.1, p1.2) | p2B : Q-Elab(p1.1, p1.2) ∧ IQAP(p1.2, p); p : Explanation(p2.1, p2.2) ∧ Result(p2.1, p2.3)
3 | p3A : Q-Elab(p1.1, p1.2) ∧ IQAP(p1.2, p) ∧ Plan-Elab(p2.3, p3.1) ∧ Q-Elab(p3.1, p3.2) | p2B : Q-Elab(p1.1, p1.2) ∧ IQAP(p1.2, p); p : Explanation(p2.1, p2.2) ∧ Result(p2.1, p2.3)

Table 2 A representation of dialogue (14)
In turn 1, A’s speech act in uttering the question p1.2 is Q-Elab(p1.1, p1.2)—that is p1.2 is a question that is designed to gather information that helps to achieve the goal behind the question p1.1 (to meet in the next two weeks). So by the semantics of Q-Elab, the illocutionary contribution of the question p1.2 is paraphrased as ‘What days in the next two weeks are good for you?’. B’s utterances in the second turn are an indirect answer, so in SDRT the segment they form attaches to p1.2 with the left-veridical relation IQAP (Indirect Question Answer Pair). Undenied Commitments therefore predicts that B is also committed to Q-Elab(p1.1, p1.2), as shown in Table 2. This correctly predicts that B’s response must be interpreted so that its temporal expressions (e.g. every day in p2.1) denote times within the next two weeks. In turn 3, A uses p3.1 to elaborate a plan to achieve the goals set out by p2.3—to meet on any day (in the next two weeks) besides Friday. This type of speech act is rendered with the left-veridical relation Plan-Elab in SDRT. Therefore, by Undenied Commitments, the illocutionary effects of p2.3 become a part of A’s commitments. This makes A committed to the content p of B’s indirect answer to A’s original question, shown in Table 2. This triggers another application of Undenied Commitments (for p), resulting in A’s initial commitment Q-Elab(p1.1, p1.2) being added to his current SDRS, as shown in Table 2 (for reasons of space, we have not reiterated the content associated with label p in A’s third SDRS). Clark (1992) argues that positive evidence is needed for grounding, but he does not make precise exactly what counts as sufficient positive evidence. Similarly, Poesio and Traum (1998) do not provide rules for inferring when a speaker has performed an implicit acceptance. 
We have made the quantity of positive evidence that is needed logically precise, in terms of the (relational) speech acts that both speakers perform, and the logical relationships between the semantics of those speech acts. The Persistence Principle and Undenied Commitments capture a general
class of examples involving implicit acceptance: there is sufficient positive evidence to ground p if p follows from the speech act that B performed in uttering pj, and A performs a (left-veridical) speech act other than Acceptance that connects to pj, whose semantic consequences are consistent with p. Sufficient positive evidence for grounding in the context of explicit endorsements rests on the formal semantic interpretation of the relevant speech act Acceptance and the rules for choosing the first argument to this relation. We will examine in detail the evidence for grounding in the context of disputes in section 3.1. As explained in section 2, making Undenied Commitments a default principle helps to construct the right logical form of dialogue (8), which is shown in Table 3.

(8) p1.1. A: How is James doing?
    p2.1. B: Actually, not so well.
    p2.2. B: His wife left him.
    p3.1. C: And he lost his job.

Turn | A’s SDRS | B’s SDRS | C’s SDRS
1 | p1.1 : $K_{p1.1}$ | ∅ | ∅
2 | p1.1 : $K_{p1.1}$ | p2B : IQAP(p1.1, p2.1) ∧ Explanation(p2.1, p2.2) | ∅
3 | p1.1 : $K_{p1.1}$ | p2B : IQAP(p1.1, p2.1) ∧ Explanation(p2.1, p2.2) | p3C : IQAP(p1.1, p2.1) ∧ Explanation(p2.1, p); p : Continuation(p2.2, p3.1)

Table 3 The logical form of (8)

As explained there, there is a (non-monotonic) semantic conflict between Explanation(p2.1, p2.2) and Explanation(p2.1, p) [where p is Continuation(p2.2, p3.1)], a speech act that the glue logic must identify as one that C intended to perform by uttering p3.1. The former nonmonotonically entails, via a scalar implicature, that James losing his wife is the sole reason he is not doing well, while the latter (monotonically) entails it was a strict part of it. Thus, even though these speech acts satisfy the antecedent to Undenied Commitments, its consequent is not inferred. We saw in section 2 that different ways of expressing an explicit endorsement seem to come with different preferences for resolving the scope ambiguity as to what prior commitments are being endorsed. The logical form for (3) that is shown in Table 4 reflects the ‘narrow-scope’ acceptance discussed earlier.
Turn | A’s SDRS | B’s SDRS
1 | p1A : Continuation(p1.1, p1.2) | ∅
2 | p1A : Continuation(p1.1, p1.2) | p2B : Acceptance(p1.2, p2.1)

Table 4 A representation of dialogue (3)
3.1 Corrections Corrections are highly complex. Like explicit endorsements, they have various scope possibilities. Furthermore, as detailed in van Leusen (1994) and Asher (2004), the focus structure of the correction reveals which element in the context the speaker denies, and this element may be only part of even a minimal discourse unit [e.g. see earlier discussion of dialogue (2)]. We argued in section 2 that to compute the right commitments and get the facts right about agreement, an agent who accepts a correction of his prior commitments must remain committed to those parts of his contribution that were not denied. In this section, we briefly recount the decidable method from Asher and Lascarides (2003) for computing these ‘undenied’ parts of a corrected discourse unit, and then we will exploit it to specify the necessary update rules on commitments. To illustrate SDRT’s method for computing the ‘undenied’ or ‘neutral’ parts of a corrected discourse unit, consider an extract from (11) (pitch accents are shown with small caps):
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
The joint entailment of the SDRSs for the second turn ensures that the new student being imaginative is agreed upon, but his being brilliant is not. This logical form, however, does not express an explicit dispute (i.e. it features no divergent relation like Correction). This is because, as we argued earlier, the denial in (3), while implicated, is not a part of what B said: it cannot act as an antecedent to subsequent anaphora. SDRT models the cognitive effects of utterances in a separate but related logic in which axioms of agent rationality and cooperativity are encoded. Here, that cognitive logic should ensure that by failing to endorse $K_{\pi_{1.1}}$ (as shown in B’s SDRS), B does not believe $K_{\pi_{1.1}}$. We forgo details of this cognitive reasoning here (but see Asher and Lascarides 2008). We now examine explicit disputes in section 3.1, focussing in particular on which commitments persist when a dispute takes place and how disputes affect anaphoric interpretation.
Alex Lascarides and Nicholas Asher 129
(11) $\pi_{1.2}$. A: John embezzled the pension funds.
     $\pi_{2.1}$. B: No, it was BILL who stole the pension funds.

(15) a. A: John embezzled the pension funds.
     b. B: You’re WRONG.

5 We explain how coreference is inferred in the glue logic in section 5.
B’s utterance commits him to Correction($\pi_{1.2}$, $\pi_{2.1}$), and hence to the negation of $K_{\pi_{1.2}}$. But B has not denied every aspect of $K_{\pi_{1.2}}$. For instance, he remains neutral about whether the funds were embezzled (as opposed to merely stolen), with the funds entrusted to Bill. Asher and Lascarides (2003) exploit the focus structure of $\pi_{2.1}$ to predict this. Following Krifka (1991), Rooth (1992), Steedman (2000) and others, the linguistic form partitions the content of $\pi_{2.1}$ into a pair of formal representations, at least one of which is a $\lambda$-abstract: the focus (we will label this $\pi^{f,\lambda}_{2.1}$) and the topic or ‘background’ ($\pi^{b,\lambda}_{2.1}$). It is a partition in that applying the $\lambda$-abstract to the other expression yields the content of $\pi_{2.1}$. In this example, the it-cleft and placement of the pitch accent make $\pi^{f,\lambda}_{2.1}$ equal to $\lambda P.P(bill)$, and therefore, $\pi^{b,\lambda}_{2.1}$ must be (roughly) $\lambda x(steal(e, x, y) \wedge unique(x))$, where $unique(x)$ is a gloss for the uniqueness conditions from the it-cleft, and $y$ corefers with the pension funds mentioned in $\pi_{1.2}$.⁵ The propositional content of the background replaces the $\lambda$-abstracts with existential quantifiers (Rooth 1992); we will refer to this as $\pi^{b}_{2.1}$. So the background proposition $\pi^{b}_{2.1}$ can be paraphrased as Someone (unique) stole the pension funds. This focus structure determines which parts of $\pi_{1.2}$ are denied and which are not: a mapping $f$ maps the focus and background of $\pi_{2.1}$ into a partition of $\pi_{1.2}$ of its denied and undenied parts, respectively (Asher and Lascarides 2003: 351). One can identify $f$ while avoiding undecidable consistency checks by exploiting the fact that $\pi_{1.2}$ and $\pi_{2.1}$ have similar predicate argument structures (we will see shortly what happens when the predicate argument structures are not alike). In other words, $f(\pi^{f,\lambda}_{2.1})$ is $\lambda P.P(john)$, and therefore, $f(\pi^{b,\lambda}_{2.1})$ must be $\lambda x(embezzle(e', x, y))$.
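The partition into focus and background can be illustrated with a small Python sketch. It is only an illustration under strong simplifying assumptions: formulae are plain strings, and the function names (`focus`, `background`, `existential_closure`) are hypothetical and not part of SDRT or of Asher and Lascarides (2003).

```python
# Hypothetical string-based encoding of the focus/background partition
# of B's utterance in (11); all names here are illustrative only.

def focus(P):
    """Focus of the correction: the lambda-abstract  lambda P. P(bill)."""
    return P("bill")

def background(x):
    """Background: lambda x.(steal(e, x, y) & unique(x))."""
    return f"steal(e, {x}, y) & unique({x})"

def existential_closure(abstract, var="x"):
    """Replace the lambda-bound variable with an existentially bound one."""
    return f"exists {var}.({abstract(var)})"

# Applying the focus to the background recovers the content of the utterance.
content = focus(background)

# Existentially closing the background gives the background proposition,
# paraphrased in the text as "Someone (unique) stole the pension funds".
background_proposition = existential_closure(background)

print(content)
print(background_proposition)
```

The same two steps, applied to the image of the background under the mapping $f$, would give the undenied part of A’s utterance; the sketch does not attempt to compute $f$ itself.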
The undenied part of $\pi_{1.2}$, which by convention we call $\pi^{b}_{1.2}$, is also computed by replacing the $\lambda$-abstracts in $f(\pi^{b,\lambda}_{2.1})$ with existential quantifiers. In other words, $\pi^{b}_{1.2}$ is someone embezzled the pension funds. Intuitively, B is ‘neutral’ about this content; he is neither committed (yet) to the stealing being an embezzlement nor committed to its negation. This neutrality means $K_{\pi^{b}_{1.2}}$ should not be entailed by Correction($\pi_{1.2}$, $\pi_{2.1}$), the speech act that B is committed to. But as we will see shortly, $K_{\pi^{b}_{1.2}}$ will affect A’s commitments, should he subsequently endorse B’s correction. Sometimes, corrections are much less specific than in (11):
$$C[\![V(\pi_i, \pi_j)]\!]C' \quad \mathrm{iff} \quad C[\![K_{\pi_i} \wedge K_{\pi_j}]\!]C'$$

The neutral content of $\pi_{1A}$ in (11) can then be expressed as $V(\pi_{1.1}, \pi^{b}_{1.2})$, where $\pi^{b}_{1.2}$ labels the content someone embezzled the pension funds. A’s utterance $\pi_{3.1}$ in (11) endorses B’s utterance. And thus A’s new commitment should not entail that John embezzled the funds, and instead entail that Bill stole those funds (and he was the only one to do so). But as we argued in section 2, it should continue to entail that John went to jail (and that the stealing was an embezzlement). That way, John’s going to jail will be agreed upon as an effect of B’s utterance $\pi_{4.1}$. We can capture these effects on commitments with two general principles. First, if B’s correction contains within its scope an utterance $\pi_i$, then it also corrects all labels that out-scope $\pi_i$. This reflects the intuition that when B corrects something, he also corrects the content

6 We ignore the complexities rendered by presuppositions, such as that John exists. These are represented in SDRT via separate labels from the asserted content, and so the correction denies a presupposition only if the label of the presupposed content is out-scoped by the first argument to Correction.
Arguably the focus of (15b) is now the whole sentence (see the discussion of all-rheme utterances in Steedman 2000) and so all of (15a) is denied.⁶ Like explicit endorsements, a Correction can include more than just the last clause within its scope. When it does, recursion as well as prosody and paraphrase cues enable us to compute exactly what is denied and what is not. The full dialogue (11) demonstrates this. By denying $\pi_{1.2}$, B must also be committed to denying Explanation($\pi_{1.1}$, $\pi_{1.2}$). So arguably, Correction($\pi_{1A}$, $\pi_{2.1}$) as well as Correction($\pi_{1.2}$, $\pi_{2.1}$) express B’s commitments after turn 2. We have already computed the part of $K_{\pi_{1.2}}$ that B is neutral about. The ‘neutral’ part of $K_{\pi_{1A}}$ is then determined recursively: roughly, one replaces all occurrences of $\pi_{1.2}$ in $K_{\pi_{1A}}$ with a new label $\pi^{b}_{1.2}$ that labels the ‘neutral’ part of $\pi_{1.2}$, and all relations R to which $\pi_{1.2}$ is an argument (in this case, Explanation) with dynamic conjunction. In words, this predicts that B is (currently) neutral about the following: John went to jail and someone embezzled the funds. Or to put it another way, B has not committed yet to this content nor to its negation. R is replaced with dynamic conjunction because one cannot assume that the illocutionary effects of uttering $\pi_{1.2}$ also accompany its neutral counterpart: for instance, ‘John went to jail; someone embezzled the funds’ is not a coherent Explanation. We will see shortly that it is sometimes convenient to express dynamic conjunction as a relation V over labels:
of any dialogue segment whose semantics is dependent on the denied content. Given that in SDRT we infer all the discourse relations that are consistent with an attachment point, we will automatically include these additional corrections in the logical form. For the analysis of (11), this means that B’s SDRS for the second turn is (16).

(16) $\pi_{2B}$ : Correction($\pi_{1A}$, $\pi_{2.1}$) $\wedge$ Correction($\pi_{1.2}$, $\pi_{2.1}$) $\wedge$ Explanation*($\pi_{2.1}$, $\pi_{2.2}$)
Crucially, no rules validate an inference that the ‘neutral’ content of $\pi_{1.2}$ (or $\pi_{1A}$) is an undenied commitment, making the Persistence Principle irrelevant when constructing B’s SDRS in (11). The lack of such a rule reflects intuitions about grounding: in (11), we accurately predict that John’s going to jail is not agreed upon after the second turn. A’s third turn is an acceptance of the entire segment $\pi_{2B}$; that is, Acceptance($\pi_{2B}$, $\pi_{3.1}$). Furthermore, as a side effect of this speech act, A is also committed to Correction($\pi_{1A}$, $\pi_{3.1}$). SDRT’s glue logic captures this plurality in A’s dialogue move already: the semantic consequences of Acceptance($\pi_{2B}$, $\pi_{3.1}$) resolve the content of $\pi_{3.1}$ to be equivalent to that of $\pi_{2B}$; this commits A to the negation of his first turn, thereby fulfilling the necessary consequences of a corrective move; and so the general glue logic principle that the necessary consequences of a speech act are normally sufficient for that speech act to be inferred applies. So A is now committed to speech acts that are semantically incompatible with his earlier commitments to $K_{\pi_{1A}}$. But his commitments are consistent because no update rules make $K_{\pi_{1A}}$ an undenied commitment and hence a part of A’s current commitments. Nevertheless, we need to ensure that A now maintains commitments to the parts of $\pi_{1A}$’s content that B was neutral about. We therefore articulate an update rule that ‘imports back’ A’s prior commitment to the undenied bits of the dispute. This ‘minimizes’ the commitments that A drops, and they derive from the scope possibilities for corrections and endorsements. A speaker who accepts a correction adjusts the scope of the acceptance and his interpretation of the correction appropriately to the commitments that he continues to hold. Similarly, a speaker who corrects a correction will adjust the scope of his correction and his interpretation of the first correction in light of what commitments he continues to hold.
This idea of minimizing dropped commitments in response to disputes is captured informally in the principle Accepting Corrections (AC) below. This is the second of our update principles, and we formalize it in section 5:
Undenied Commitments for the Acceptance of Corrections (AC): If A endorses $\pi_j$, where B’s SDRS (already) contains Correction($\pi_i$, $\pi_j$) and A was committed to $\pi_i$ in the prior turn, then the undenied commitments of $\pi_j$ include the ‘neutral’ or undenied content of $\pi_i$, as long as it is consistent to do so.
Undenied Commitments for the Acceptance of Corrections (AC): If A’s SDRS contains $\lambda : R(\pi_j, \pi_k)$ where R is left veridical and B’s SDRS contains Correction($\pi_i$, $\pi_j$) and A was committed to the content of $\pi_i$ in the prior turn, then the undenied commitments of $\pi_j$ are computed as follows: (a) the undenied content of $\pi_i$ is computed recursively by replacing corrected labels that are out-scoped by $\pi_i$ with their
We saw earlier that the undenied content of $K_{\pi_{1A}}$ in (11) is $V(\pi_{1.1}, \pi^{b}_{1.2})$, where $\pi^{b}_{1.2}$ is a (new) label that expresses someone embezzled the pension funds. So according to AC and the Persistence Principle, $V(\pi_{1.1}, \pi^{b}_{1.2})$ is a part of A’s SDRS for the third turn of (11). The relation V is not a coherence relation. And yet it is part of the representation of A’s third turn. The need to introduce V into our inventory of relations is an inevitable consequence of two things. First, in order to avoid revision in the model theory, our SDRSs at a given turn represent all of an agent’s current commitments from the beginning of the dialogue to the end of that turn. Secondly, when an agent accepts a correction, his remaining commitments need not, on their own, be related in a coherent way. But a dialogue is coherent so long as each utterance as it’s interpreted attaches to its context with a coherence relation. To capture A’s commitments correctly, this undenied content must be placed in a veridical part of A’s SDRS, by which we mean that it must be connected to the root label via a sequence of zero or more veridical relations. In general, we can ensure the appropriate effect by putting the undenied content in the same scopal position as the acceptance of the correction. Thus, in dialogue (11), since Acceptance($\pi_{2B}$, $\pi_{3.1}$) is labelled with the root label $\pi_{3A}$, so is the relation $V(\pi_{1.1}, \pi^{b}_{1.2})$. Finally, it makes semantic sense to make this undenied content a Background to the segment consisting of corrective moves, since the undenied parts of disputes behave semantically like ‘given’ or background information (van der Sandt 2001). We achieve this by making the last label of the undenied content a second argument to the relation Background, where the first argument is the segment consisting of the corrections ($\pi_{2B}$ in our example). Thus, we arrive at the following, more specific definition of AC:
undenied parts and veridical rhetorical relations that involved these labels with V; (b) the resulting representation of the undenied content of $\pi_i$ is assigned the label $\lambda$; (c) a Background relation, also labelled with $\lambda$, between the label of the correcting segment $\pi_i$ and the last label in this undenied content is added (in dialogue (11), this introduces $\pi_{3A}$ : Background($\pi_{2B}$, $\pi^{b}_{1.2}$), as shown in Table 5).
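Step (a) of AC can be sketched in Python for the simple case where the corrected segment is a flat conjunction of binary relations. This is a minimal sketch under stated assumptions, not the authors’ formalization (which is given in their section 5): the `Rel` type, the `TOP` marker and the function `undenied_content` are hypothetical names, every relation is treated as veridical, and nested segments would need the full recursion.

```python
from typing import NamedTuple

class Rel(NamedTuple):
    """A binary discourse relation between two labels (illustrative type)."""
    name: str
    left: str
    right: str

TOP = "TOP"  # assumed marker for a trivially true neutral part

def undenied_content(formula, neutral):
    """Single-level sketch of AC step (a).

    formula: list of Rel, read as a conjunction.
    neutral: maps each corrected label to the label of its neutral part
             (or to TOP when nothing informative is left undenied).
    """
    result = []
    for rel in formula:
        if rel.left not in neutral and rel.right not in neutral:
            result.append(rel)          # untouched by the correction
            continue
        left = neutral.get(rel.left, rel.left)
        right = neutral.get(rel.right, rel.right)
        if TOP in (left, right):
            continue                    # compactness: drop the trivial conjunct
        result.append(Rel("V", left, right))  # relation becomes dynamic conjunction
    return result

# Dialogue (11): Explanation(pi1.1, pi1.2), with pi1.2 corrected.
dialogue_11 = undenied_content(
    [Rel("Explanation", "pi1.1", "pi1.2")], {"pi1.2": "pi1.2b"})

# Dialogue (18): pi1.3 is corrected and its neutral part is trivial.
dialogue_18 = undenied_content(
    [Rel("Narration", "pi1.1", "pi1.2"), Rel("Narration", "pi1.2", "pi1.3")],
    {"pi1.3": TOP})
```

Under these assumptions, the first call yields the single conjunct V(pi1.1, pi1.2b) and the second leaves only Narration(pi1.1, pi1.2), matching the undenied contents the text derives for (11) and (18).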
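Turn | A’s SDRS | B’s SDRS
1 | $\pi_{1A}$ : Explanation($\pi_{1.1}$, $\pi_{1.2}$) | $\emptyset$
2 | $\pi_{1A}$ : Explanation($\pi_{1.1}$, $\pi_{1.2}$) | $\pi_{2B}$ : Correction($\pi_{1A}$, $\pi_{2.1}$) $\wedge$ Correction($\pi_{1.2}$, $\pi_{2.1}$) $\wedge$ Explanation*($\pi_{2.1}$, $\pi_{2.2}$)
3 | $\pi_{3A}$ : $V(\pi_{1.1}, \pi^{b}_{1.2})$ $\wedge$ Background($\pi_{2B}$, $\pi^{b}_{1.2}$) $\wedge$ Acceptance($\pi_{2B}$, $\pi_{3.1}$) | $\pi_{2B}$ : Correction($\pi_{1A}$, $\pi_{2.1}$) $\wedge$ Correction($\pi_{1.2}$, $\pi_{2.1}$) $\wedge$ Explanation*($\pi_{2.1}$, $\pi_{2.2}$)
4 | $\pi_{3A}$ : $V(\pi_{1.1}, \pi^{b}_{1.2})$ $\wedge$ Background($\pi_{2B}$, $\pi^{b}_{1.2}$) $\wedge$ Acceptance($\pi_{2B}$, $\pi_{3.1}$) | $\pi_{4B}$ : Acceptance($\pi_{1.1}$, $\pi_{4.1}$) $\wedge$ Contrast($\pi_{2B}$, $\pi_{4.1}$)

Table 5 A representation of dialogue (11)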
The entire representation for (11) is thus depicted in Table 5. It exhibits the plurality of speech acts that an agent can utter with a single discourse unit. However, in light of this plurality, SDRT assumes a compactness principle whereby speech acts that are inferred to be a part of one’s commitments, but which are semantically and structurally redundant given his other commitments, are omitted from the representation. In the analysis of (11) this means that Correction($\pi_{1A}$, $\pi_{3.1}$) $\wedge$ Correction($\pi_{1.2}$, $\pi_{3.1}$) is omitted from the representation of $K_{\pi_{3A}}$ (because Acceptance($\pi_{2B}$, $\pi_{3.1}$) entails the same content and predicts the same available labels, as we will see shortly). In section 4 we will unpack the dynamic interpretation of the representation in Table 5. Here, we simply state that each turn has a consistent interpretation (i.e. the output state is not $\bot$). But the SDRSs in turn 2 are not consistent with each other (one entails John embezzled the funds and the other that he did not). At the end of turn 3, that John did not embezzle the funds and that Bill stole the funds is agreed upon. B remains neutral about whether that stealing was an embezzlement and whether John went to jail, while A is committed to it. By the end of turn 4, B commits to John going to jail, and hence, A and B agree on it.
(11) $\pi'_{4.1}$. B: John didn’t go to jail either.

Intuitively, this should attach with Correction to $\pi_{1.1}$ and hence attach off the right frontier. Alternatively, instead of B saying $\pi_{4.1}$, A may have said it. This should attach not only to A’s prior acceptance act $\pi_{3.1}$ but also arguably to $\pi_{1.1}$ directly.⁷ Similarly, consider the two alternative responses B might make to A’s turn in (17):

(17) $\pi_{1.1}$. A: John embezzled a pension fund,
     $\pi_{1.2}$. A: and then he was convicted of tax evasion.
     $\pi_{2.1}$. B: It was BILL who embezzled the funds.
     $\pi'_{2.1}$. B: It wasn’t a pension fund.
A’s commitments are to Narration($\pi_{1.1}$, $\pi_{1.2}$). So $\pi_{1.1}$ is not on the right frontier. But arguably, $\pi_{2.1}$ should commit B to Correction($\pi_{1.1}$, $\pi_{2.1}$). This not only gets the facts right about commitment but also allows us to finesse what he denies and what he does not (see the exploitation of recursion in the analysis of (11)), which can prove important for getting the facts right for subsequent agreement. Similarly, the anaphoric data in $\pi'_{2.1}$ also suggest that $\pi_{1.1}$ should be available to utterances of a certain form.

7 One might also wonder why A would utter something he is already committed to. But such redundant moves can happen in monologue (Everyone talked. So Harry talked too) and also in dialogue. The move is not entirely redundant here anyway, since the illocutionary effects that are borne from the Contrast relation that is rendered by ‘though’ are new commitments, even if the commitment to John going to jail is not.
B’s fourth turn $\pi_{4.1}$ has been interpreted as accepting $\pi_{1.1}$. But those readers who are familiar with SDRT may notice that $\pi_{1.1}$ is not on the right frontier of the discourse structure and therefore it is not available for attachment. We now examine this issue in some detail, to explain how we predict in SDRT that this attachment to $\pi_{1.1}$ is legitimate. Asher and Lascarides (2003) argue that in general, the available labels are those on the right frontier of the discourse structure: that is, the label of the last clause, and any label that is related to it via a sequence of subordinating discourse relations (e.g. Explanation) and/or the semantic out-scoping relation (recall that $\pi_i$ out-scopes $\pi_j$ if $K_{\pi_i}$ includes $\pi_j$). Asher (1993) and Asher and Lascarides (2003) highlight particular kinds of examples where this right-frontier constraint breaks down (e.g. in the presence of structural relations like Contrast and Parallel). Here, we observed via (11) that it also appears to break down in the face of certain acceptances (i.e. $\pi_{4.1}$). The same is true of corrections: for instance, instead of uttering $\pi_{4.1}$ in (11), B could have uttered $\pi'_{4.1}$.
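The right-frontier definition of availability can be sketched as a small fixpoint computation in Python. This is a hedged illustration, not SDRT’s own implementation: the function name `available_labels`, the tuple encoding of relations, and the choice to pass the set of subordinating relations as a parameter are all assumptions made for the example (the text names only Explanation as subordinating).

```python
def available_labels(F, last, subordinating):
    """Labels on the right frontier, following the definition in the text.

    F: dict mapping a label to the list of (relation, left, right) tuples
       that make up the formula it labels.
    last: the label of the last clause.
    subordinating: set of relation names treated as subordinating (assumed
       to be supplied by the caller).
    """
    available = {last}
    changed = True
    while changed:
        changed = False
        for label, relations in F.items():
            for name, left, right in relations:
                # Out-scoping: a label whose formula mentions an available
                # label is itself available.
                if (left in available or right in available) and label not in available:
                    available.add(label)
                    changed = True
                # Subordination: the first argument of a subordinating
                # relation is available if the second argument is.
                if name in subordinating and right in available and left not in available:
                    available.add(left)
                    changed = True
    return available

# A's turn in (17): Narration is not subordinating, so pi1.1 is unavailable.
frontier_17 = available_labels(
    {"pi1A": [("Narration", "pi1.1", "pi1.2")]}, "pi1.2", {"Explanation"})

# With Explanation instead, the first argument stays on the frontier.
frontier_expl = available_labels(
    {"pi1A": [("Explanation", "pi1.1", "pi1.2")]}, "pi1.2", {"Explanation"})
```

The sketch covers only the default constraint; the attachments off the right frontier discussed in the text would be additional exceptions layered on top of it.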
(18) $\pi_{1.1}$. A: John came in the pub.
     $\pi_{1.2}$. A: He sat down on an old coat.
     $\pi_{1.3}$. A: He drank a beer.
     $\pi_{2.1}$. B: I definitely saw that he didn’t drink ANYTHING.
     $\pi_{3.1}$. A: Oh, OK.
     $\pi_{3.2}$. A: ??It was made of tweed.
The SDRS for utterances $\pi_{1.1}$ to $\pi_{3.1}$ is shown in Table 6. Axioms for inferring within the glue logic the Narration and Correction moves in the first two turns are discussed in section 5 (see also Asher and Lascarides 2003). A then uses the third turn to accept B’s correction $\pi_{2.1}$. Thus, according to AC, we must add a representation of the undenied parts of $\pi_{1A}$ (and $\pi_{1.3}$) to A’s SDRS as well. The background part $\pi^{b}_{1.3}$ of $\pi_{1.3}$ is $\top$. So by the compactness principle mentioned earlier, $\pi^{b}_{1.3}$ is omitted from the representation and hence so is $V(\pi_{1.2}, \pi^{b}_{1.3})$. This makes the undenied content of $\pi_{1A}$ simply Narration($\pi_{1.1}$, $\pi_{1.2}$). Thus, $\pi_{1.2}$ is its last label, and so by AC part (c) it is linked to $\pi_{2B}$ with Background. Finally, AC assigns Narration($\pi_{1.1}$, $\pi_{1.2}$) and Background($\pi_{2B}$, $\pi_{1.2}$) the root label, as shown in Table 6. This logical form correctly predicts that A is (currently) committed to the proposition that John came in the pub and
We propose that these variants of (11) and (17) constitute a special case of what Asher (1993) called discourse subordination: there is enough compositional semantic information in the utterance, derivable from syntax and prosodic cues, to guide the attachment to a particular point in the discourse structure. In contrast, a highly under-specified compositional semantics does not guide attachment off the right frontier: responding to A’s move in (17) with you’re wrong (or OK) would be interpreted as a correction (or an acceptance) of at least $K_{\pi_{1.2}}$. Thus, in dialogues where the compositional semantics of an utterance is sufficiently specific, the availability constraints should be relaxed to include in principle all constituents in the previous turn. This is also what happens in our various versions of (11): B’s very specific utterances $\pi_{4.1}$ (and its alternative $\pi'_{4.1}$) suffice to pick out $\pi_{1.1}$ as an attachment point, even though it is not on the right frontier. Similarly, if A instead of B were to say $\pi_{4.1}$, it can attach to $\pi_{1.1}$. In fact, many types of speech acts allow discourse subordination, at least in principle [see Asher (1993) for examples involving Elaboration]. Because discourse subordination is only possible when an utterance has a certain specific form, the right-frontier constraint on availability is not rendered entirely impotent. For instance, dialogue (18) illustrates AC’s predictions about anaphoric dependencies (imagine that A and B are telling an agent C what happened in the pub last night):
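Turn | A’s SDRS | B’s SDRS
1 | $\pi_{1A}$ : Narration($\pi_{1.1}$, $\pi_{1.2}$) $\wedge$ Narration($\pi_{1.2}$, $\pi_{1.3}$) | $\emptyset$
2 | $\pi_{1A}$ : Narration($\pi_{1.1}$, $\pi_{1.2}$) $\wedge$ Narration($\pi_{1.2}$, $\pi_{1.3}$) | $\pi_{2B}$ : Correction($\pi_{1A}$, $\pi_{2.1}$) $\wedge$ Correction($\pi_{1.3}$, $\pi_{2.1}$)
3 | $\pi_{3A}$ : Narration($\pi_{1.1}$, $\pi_{1.2}$) $\wedge$ Background($\pi_{2B}$, $\pi_{1.2}$) $\wedge$ Acceptance($\pi_{2.1}$, $\pi_{3.1}$) | $\pi_{2B}$ : Correction($\pi_{1A}$, $\pi_{2.1}$) $\wedge$ Correction($\pi_{1.3}$, $\pi_{2.1}$)

Table 6 A representation of the dialogue $\pi_{1.1}$ to $\pi_{3.1}$ in (18)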
4 THE LOGICAL FORM OF DIALOGUE

Section 3 describes a theory of agreement and disputes in SDRT that meets the four general criteria from section 2. We extended logical forms from being a single SDRS to a tuple of them, each one representing the public commitments of an individual agent. We also argued in favour of extending the glue logic to include principles that stipulate when a prior commitment persists, given the particular speech acts that the speaker performed in the current turn. We saw that the
that he (then) sat on an old coat. But $\pi_{1.2}$ is unavailable for subsequent rhetorical connections: it is not the first argument to a subordinating relation where the second argument is available; nor does it out-scope any available label. This correctly predicts that the indefinite noun phrase a coat is not an available antecedent to the pronoun it in the anomalous continuation $\pi_{3.2}$ of this dialogue. Similarly, $\pi_{1.1}$ is not available. And so had this first utterance been John came in with a jacket over his arm, we would correctly predict that it in $\pi_{3.2}$ cannot corefer with the jacket. Thus, the principle AC preserves both commitments and anaphoric dependencies appropriately in such examples. Discourse subordination is not an option for $\pi_{3.2}$, for in contrast to the utterances in (11) and (17), it lacks the specific prosodic and linguistic cues that are required for an attachment off the right frontier. We finish this section with a logical form for (9), given in Table 7 (the semantic representations of the clauses are omitted). Roughly, this logical form commits PianoCraft to the album’s theme of violence explaining why people like it, while WildWind’s (complex) utterance $\pi_{2.2}$ commits him to people liking the album and to its theme being violence, but he denies that one explains the other. He is also committed to a justification for performing this denial, namely the anecdote in $\pi_{2.3}$ that he knows at least one person who hates the album because of its theme.
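Turn | PianoCraft’s SDRS | WildWind’s SDRS
1 | $\pi_{1P}$ : Explanation($\pi_{1.1}$, $\pi_{1.2}$) | $\emptyset$
2 | $\pi_{1P}$ : Explanation($\pi_{1.1}$, $\pi_{1.2}$) | $\pi_{1M}$ : Commentary($\pi_{1.2}$, $\pi_{2.1}$) $\wedge$ Contrast($\pi_{2.1}$, $\pi$) $\wedge$ Explanation*($\pi$, $\pi_{2.3}$)
  |  | $\pi$ : Correction($\pi_{1P}$, $\pi_{2.2}$)

Table 7 A representation of dialogue (9)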
4.1 Syntax

We now define the syntax of SDRSs and then use this to define the syntax of DSDRSs. Definitions 1 and 2 are from Asher and Lascarides (2003).

Definition 1 The Syntax of SDRS Formulae
SDRS formulae are constructed from the following vocabulary:
vocab-1. microstructure: A classical first-order vocabulary, consisting of predicates, terms, Boolean connectives and quantifiers, augmented with the modal operator $\Box$, the modal operator $\delta$ that turns formulae into action terms ($\delta\phi$ is the action of bringing it about that $\phi$), a modal operator ! that turns an action term into a formula ($!\delta\phi$ is used to represent imperatives); and the operator ‘?’ and $\lambda$-terms for representing questions as $?\lambda x_1 \ldots \lambda x_n \phi$, each $x_i$ corresponding to a wh-element.
vocab-2. labels: $\pi$, $\pi_1$, $\pi_2$, etc.
vocab-3. a set of relation symbols for discourse relations: $R$, $R_1$, $R_2$, etc.
The set $L$ of well-formed SDRS formulae is defined as follows:
1. Let $L_{basic}$ be the set of well-formed formulae that are derived from vocab-1 using the usual syntax rules for first-order modal languages with action terms. Then $L_{basic} \subseteq L$.
scope possibilities of acceptances and corrections interact with the persistence of commitments and also with the interpretation of anaphora. Our task now is to formalize this fully. We start by focussing on the formal language in which the logical form of dialogue is expressed, defining its syntax and dynamic semantic model theory. In section 5 we will extend the glue logic with axioms that formalize Undenied Commitments and AC—the principles from section 3 that identify which prior commitments persist.
2. If $R$ is an n-ary discourse relation symbol and $\pi_1, \ldots, \pi_n$ are labels, then $R(\pi_1, \ldots, \pi_n) \in L$.
3. For $\phi, \phi' \in L$, $(\phi \wedge \phi'), \neg\phi \in L$.

Definition 2 An SDRS
Let $L$ be the set of SDRS formulae. Then an SDRS is a triple $\langle \Pi, F, last \rangle$, where:
$\Pi$ is a set of labels; that is, $\Pi \subseteq$ vocab-2.
$last$ is a label in $\Pi$ (intuitively, this labels the last clause); and
$F$ is a function that assigns each member of $\Pi$ a member of $L$.
We say that $\pi$ immediately out-scopes $\pi'$ iff $F(\pi)$ contains $\pi'$ as a literal. The relation $\succ$ that is its transitive closure satisfies the following two constraints: it forms a well-founded partial order over $\Pi$, and it has a unique root (i.e. there is a unique $\pi_0 \in \Pi$ such that $\forall \pi \in \Pi$, $\pi_0 \succeq \pi$).
When there is no confusion, we may write $\langle \Pi, F \rangle$ instead of $\langle \Pi, F, last \rangle$. In the prior sections, the value of $F$ is shown with colons: $\pi : \phi$ means $F(\pi) = \phi$. Definition 3 now formally defines DSDRSs, as illustrated in Tables 1–7.

Definition 3 A DSDRS
Let $D$ be a set of agents. Then a DSDRS is a tuple $\langle n, T, \Pi, F, last \rangle$, where:
$n$ is a natural number (intuitively, $j \leq n$ is the jth turn in the dialogue);
$\Pi$ is a set of labels;
$F$ is a function that assigns each member of $\Pi$ a member of $L$;
$T$ is a mapping from $[1, n]$ to a function from $D$ into SDRSs, such that each SDRS is drawn from $\Pi$ and $F$. That is, if $T(j)(d_i) = \langle \Pi^{d_i}_{j}, F^{d_i}_{j}, last^{d_i}_{j} \rangle$, where $j \in [1, n]$ and $d_i \in D$, then $\Pi^{d_i}_{j} \subseteq \Pi$, $F^{d_i}_{j} =_{def} F \upharpoonright \Pi^{d_i}_{j}$ (i.e. $F^{d_i}_{j}$ is $F$ restricted to $\Pi^{d_i}_{j}$), and $last^{d_i}_{j} \in \Pi^{d_i}_{j}$.
$last =_{def} last^{d}_{n}$, where $d$ is the unique speaker of the last turn $n$ (each turn has a unique speaker, because turn boundaries occur whenever the speaker changes).
We will sometimes write $T(j)(d_i)$ as $T^{d_i}(j)$. Informally, $T$ will map each turn and dialogue participant to an SDRS that represents everything he is currently publicly committed to. While one would expect a contribution in monologue to be coherent (especially if it is edited written text), the same is not true of dialogue. The above definitions allow for this: the formula that is associated with
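(1) $\pi_{1.1}$. Mark (to Karen and Sharon): Karen ’n’ I’re having a fight,
    $\pi_{1.2}$. Mark (to Karen and Sharon): after she went out with Keith and not me.
    $\pi_{2.1}$. Karen (to Mark and Sharon): Wul Mark, you never asked me out.

(1′) $\langle 2, T, \{\pi_{1M}, \pi_{2K}, \pi_{1.1}, \pi_{1.2}, \pi_{2.1}\}, F, \pi_{2.1} \rangle$, where:
$F(\pi_{1.1}) = K_{\pi_{1.1}}$, $F(\pi_{1.2}) = K_{\pi_{1.2}}$, $F(\pi_{2.1}) = K_{\pi_{2.1}}$
$F(\pi_{1M}) =$ Explanation($\pi_{1.1}$, $\pi_{1.2}$)
$F(\pi_{2K}) =$ Explanation($\pi_{1.1}$, $\pi_{1.2}$) $\wedge$ Explanation($\pi_{1.2}$, $\pi_{2.1}$)
$T(1) = \{(M, \langle \{\pi_{1M}, \pi_{1.1}, \pi_{1.2}\}, F_1, \pi_{1.2} \rangle), (K, \emptyset), (S, \emptyset)\}$, where $F_1 = F \upharpoonright \{\pi_{1M}, \pi_{1.1}, \pi_{1.2}\}$
$T(2) = \{(M, \langle \{\pi_{1M}, \pi_{1.1}, \pi_{1.2}\}, F_1, \pi_{1.2} \rangle), (K, \langle \{\pi_{2K}, \pi_{1.1}, \pi_{1.2}, \pi_{2.1}\}, F_2, \pi_{2.1} \rangle), (S, \emptyset)\}$, where $F_2 = F \upharpoonright \{\pi_{2K}, \pi_{1.1}, \pi_{1.2}, \pi_{2.1}\}$

(19) $\langle 2, T, \{\pi_{1M}, \pi_{2K}, \pi_{1.1}, \pi_{1.2}, \pi_{2.1}\}, F, \pi_{2.1} \rangle$, where:
$F(\pi_{1.1}) = K_{\pi_{1.1}}$, $F(\pi_{1.2}) = K_{\pi_{1.2}}$, $F(\pi_{2.1}) = K_{\pi_{2.1}}$
$F(\pi_{1M}) =$ Explanation($\pi_{1.1}$, $\pi_{1.2}$)
$F(\pi_{2K}) =$ Explanation($\pi_{1.1}$, $\pi_{1.2}$) $\wedge$ Explanation($\pi_{1.2}$, $\pi_{2.1}$)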
the root label of a turn may include parts that are not rhetorically connected to any other part of the dialogue (consider in particular the range of well-formed SDRS formulae from Definition 1). However, Definition 3 restricts each turn to having a unique root label. This makes an individual turn part of a single dialogue (addressed to a unique group of people). As Polanyi (1985) shows, this is not always the case (e.g. a ‘self-interruption’ such as Stop that you kids!). We would need to analyse such a dialogue turn with multiple SDRSs, to reflect the idea that more than one conversation is going on simultaneously. But we ignore these complexities here. Definition 3 allows label sharing across speakers and across turns. However, each label is associated with unique content in whatever turn $j \in [1, n]$ it appears in. That is, $\forall \pi \in \Pi^{d_1}_{l} \cap \Pi^{d_2}_{j}$, with $l, j \in [1, n]$ and $d_1, d_2 \in D$, $F^{d_1}_{l}(\pi) = F^{d_2}_{j}(\pi)$. As we explained earlier, a situation where $d_1$ and $d_2$ interpret $\pi$ differently would not correspond to a situation where $\pi$ is assigned distinct contents in distinct SDRSs within the same DSDRS. Rather, it corresponds to a situation where $d_1$ and $d_2$ have each built different DSDRSs (although we do not explore misunderstandings further here). There are several notational variants for DSDRSs. For instance, Table 1 and (19) are notational variants of the DSDRS (1′), the logical form for (1).
$T^{M}(1) = \{\pi_{1M}, \pi_{1.1}, \pi_{1.2}\}$, $T^{K}(1) = T^{S}(1) = \emptyset$
$T^{M}(2) = \{\pi_{1M}, \pi_{1.1}, \pi_{1.2}\}$, $T^{K}(2) = \{\pi_{2K}, \pi_{1.1}, \pi_{1.2}, \pi_{2.1}\}$, $T^{S}(2) = \emptyset$
We will usually represent DSDRSs as tables.
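As a complement to the tabular notation, the well-formedness conditions of Definition 2 and the shape of a DSDRS can be sketched in Python. This is a minimal sketch under assumptions of our own: formulae are encoded as lists of relation tuples, clause-level labels have no entry in `F`, `None` stands in for the empty SDRS, and the function name `is_well_formed_sdrs` is hypothetical.

```python
def is_well_formed_sdrs(labels, F):
    """Check the two out-scoping constraints of Definition 2.

    labels: set of label names (the set Pi of one SDRS).
    F: dict mapping a label to a list of (relation, arg1, arg2, ...) tuples;
       only entries for labels in `labels` are consulted, which plays the
       role of restricting F to Pi.
    """
    # pi immediately out-scopes pi' iff F(pi) mentions pi' as an argument.
    children = {p: {a for (_rel, *args) in F.get(p, []) for a in args}
                for p in labels}
    if any(not kids <= labels for kids in children.values()):
        return False  # a formula mentions a label outside Pi
    outscoped = set().union(*children.values())
    roots = labels - outscoped
    if len(roots) != 1:
        return False  # no root at all, or more than one root
    on_path, done = set(), set()

    def acyclic_from(p):
        if p in on_path:
            return False  # p out-scopes itself: not a partial order
        if p in done:
            return True
        on_path.add(p)
        ok = all(acyclic_from(c) for c in children[p])
        on_path.discard(p)
        done.add(p)
        return ok

    (root,) = roots
    return acyclic_from(root) and done == labels

# The DSDRS (1') for dialogue (1), with M = Mark, K = Karen, S = Sharon.
F = {
    "pi1M": [("Explanation", "pi1.1", "pi1.2")],
    "pi2K": [("Explanation", "pi1.1", "pi1.2"),
             ("Explanation", "pi1.2", "pi2.1")],
}
mark = {"pi1M", "pi1.1", "pi1.2"}
karen = {"pi2K", "pi1.1", "pi1.2", "pi2.1"}
T = {1: {"M": mark, "K": None, "S": None},
     2: {"M": mark, "K": karen, "S": None}}

all_ok = all(is_well_formed_sdrs(sdrs, F)
             for turn in T.values() for sdrs in turn.values()
             if sdrs is not None)
```

Because both speakers’ SDRSs read their contents from the one shared `F`, a label automatically receives the same content wherever it appears, which is the label-sharing condition stated above.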
4.2 Semantics for DSDRSs
Veridical Schema:
$$(w, f)[\![R(\pi_1, \pi_2)]\!]_m (w', g) \quad \mathrm{iff} \quad (w, f)[\![K_{\pi_1} \wedge K_{\pi_2} \wedge \varphi_{R(\pi_1, \pi_2)}]\!]_m (w', g)$$
Meaning postulates impose conditions on when $\varphi_{R(\pi_1, \pi_2)}$ is true for various relations $R$. This forms a major component of SDRT, since it constrains the illocutionary effects of speech acts. For instance, $\varphi_{Explanation(\pi_1, \pi_2)}$ entails that $K_{\pi_2}$ is an answer to why $K_{\pi_1}$?, where the semantics of why-questions follows that given in Bromberger (1962) and Achinstein (1980). A divergent discourse relation such as Correction entails the negation of its first argument:
Semantics of Correction:
$$(w, f)[\![Correction(\pi_1, \pi_2)]\!]_m (w', g) \quad \mathrm{iff} \quad (w, f)[\![(\neg K_{\pi_1}) \wedge K_{\pi_2} \wedge \varphi_{Correction(\pi_1, \pi_2)}]\!]_m (w', g)$$
8 As usual, the quantifier $\exists x$ extends the input assignment function $f$ to a new one $g$ that is like $f$, save that it is defined for $x$ (cf. van Eijck and Kamp 1997). This ensures that assignments to $x$ when interpreting the subformula $\psi(x)$ in $(\exists x \phi) \wedge \psi(x)$ match those that are used to satisfy the body $\phi$ of the quantified formula. In terms of natural language, this captures anaphoric dependencies across sentence boundaries.

9 The problem is that it does not produce a suitable dynamic interpretation of questions. We fix this in Lascarides and Asher (forthcoming) by building on Groenendijk’s (2003) semantics of questions.
Asher and Lascarides (2003) offer a dynamic semantics of SDRSs, with contexts being world variable assignment pairs following Groenendijk and Stokhof (1991), Fernando (1994) and van Eijck and Kamp (1997). So the semantics of an SDRS defines how an input pair $(w, f)$ changes to a different output one $(w', g)$, where $w$ and $w'$ are possible worlds, and $f$ and $g$ are partial variable assignment functions.⁸ While this semantics is not ideal, it will suit our purposes here: to give an idea of how to interpret DSDRSs and predict agreement and disputes.⁹ One crucial task is to specify the content of rhetorical relations. Unlike other atomic formulae, these update the input context rather than acting as a test on it. Veridical discourse relations, such as Explanation, Acceptance and Background, receive the following interpretation (where, as before, $m$ in $[\![\cdot]\!]_m$ stands for monologue):
The meaning postulates for $\varphi_{Correction(\pi_1, \pi_2)}$ entail that $K_{\pi_1}$ and $K_{\pi_2}$ are mutually inconsistent. The semantics of an SDRS is then unpacked recursively, starting with the SDRS formula that is assigned to its unique root label. As we explained in section 3, the interpretation of a DSDRS is the product of the interpretation of its component SDRSs. Definition 4 formalizes this, with the context of evaluation being one dynamic proposition per agent:

Definition 4 Dynamic Semantics of DSDRSs
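$$\sigma_1 [\![K]\!]_d\, \sigma_2 \quad \mathrm{iff} \quad \sigma_1 [\![T(n)]\!]_d\, \sigma_2$$
$$\sigma_1 [\![T(j)]\!]_d\, \sigma_2 \quad \mathrm{iff} \quad \forall d_i \in D,\ \rho_i(\sigma_2) = \rho_i(\sigma_1) \circ [\![T^{d_i}(j)]\!]_m$$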
In words, the CCP of the DSDRS is that of its last turn, which in turn is computed in terms of ½½m.The CCP of a dialogue turn updates the commitments each agent held in the dialogue initial state to include the (dynamic) content of his SDRS for that turn. Thus, the CCPs of the turns in a DSDRS reflect the evolving commitments of each dialogue agent. It is standard when defining truth in a product of models to say that A 3 B ~/ iff A~/ and B~/. This natural definition readily transfers to our dynamic setting, providing a definition of entailment for DSDRSs that is constant whatever the number of participants. Let K ¼ Æn; T; P; F; lastæ be a DSDRS for dialogue participants D, and let ~m be the dynamic semantic entailment relationship afforded by ½½m.10 Definition 5 Grounding K ~d / iff for all di 2 D, T di ðnÞ~m /, where n is the last turn in the conversation.
10
That is, / ~m w iff for all intensional structures A and for all world assignment pairs (w, f ) if there is a pair ðw#; f #Þ such that ðw; f Þ ½½/Am ðw#; f #Þ; then there is a pair ðw$; f $Þ such that ðw#; f #Þ ½½wAm ðw$; f $Þ:
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Let K be a DSDRS Æn, T, P, F, lastæ with dialogue participants D ¼ fd1, . . ., dkg and j 2 [1, n]. Let r1 and r2 each be a set of k pairs of world assignment pairs and let qi, i 2 [1, k] be a projection function onto the ith element of r1 and r2. Then:
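As a rough illustration of Definition 5, grounding can be pictured as intersecting what each participant's last-turn SDRS entails. The sketch below is only a toy: it assumes that each agent's entailments are already available as a finite set of proposition names, so plain set membership stands in for the dynamic entailment relation. The function name `grounded` and the string names for propositions are hypothetical; the sample data follow the paper's discussion of dialogue (11), where A, but not B, is committed to the theft being an embezzlement.

```python
from typing import Dict, Set


def grounded(commitments: Dict[str, Set[str]]) -> Set[str]:
    """Return the propositions entailed by every agent's last-turn SDRS.

    ASSUMPTION: entailment is approximated by membership in a finite set
    of proposition names; the paper's relation is dynamic, not set-based.
    """
    agents = list(commitments)
    if not agents:
        return set()
    shared = set(commitments[agents[0]])
    for agent in agents[1:]:
        shared &= commitments[agent]  # keep only what this agent also entails
    return shared


# Hypothetical encoding of the final commitments in dialogue (11).
common = {
    "K_pi1.1",                        # John went to jail
    "not K_pi1.2",                    # John did not embezzle the funds
    "K_pi2.1",                        # Bill stole the funds
    "K_pi2.2",                        # B was at the trial
    "not Explanation(pi1.1, pi1.2)",
}
final_turn = {
    "A": common | {"embezzle"},       # A keeps the embezzlement commitment
    "B": set(common),                 # B is neutral about it
}

print(sorted(grounded(final_turn)))
```

Because the check quantifies over all participants, a proposition held by only one agent (here `embezzle`) is never grounded, however many agents there are.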
This notion of entailment for dialogue matches exactly the definition of agreement, or the grounding of a proposition. The illocutionary contributions of speech acts are encoded in the semantics of DSDRSs. And thus our definition of agreement as a joint entailment on each agent’s commitments models implicit agreement. For example, the SDRSs for the last turn of dialogue (1), shown in Table 1, have the following dynamic implications:
$$\begin{aligned}
K_{\pi_{1M}} \ &\text{iff} \ Explanation(\pi_{1.1}, \pi_{1.2}) \\
&\text{only if} \ K_{\pi_{1.1}} \wedge K_{\pi_{1.2}} \wedge \varphi_{Explanation(\pi_{1.1},\pi_{1.2})} \\
K_{\pi_{2K}} \ &\text{iff} \ Explanation(\pi_{1.1}, \pi_{1.2}) \wedge Explanation(\pi_{1.2}, \pi_{2.1}) \\
&\text{only if} \ K_{\pi_{1.1}} \wedge K_{\pi_{1.2}} \wedge K_{\pi_{2.1}} \wedge \varphi_{Explanation(\pi_{1.1},\pi_{1.2})} \wedge \varphi_{Explanation(\pi_{1.2},\pi_{2.1})}
\end{aligned}$$
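Thus, $\varphi_{Explanation(\pi_{1.1},\pi_{1.2})}$ (that is, the illocutionary effects that stem from $\pi_{1.2}$ explaining $\pi_{1.1}$) is agreed upon, even though the compositional semantics of neither Mark’s nor Karen’s utterances entails this. Now let us examine some dialogues involving Correction. Consider first the dynamic semantic interpretation of dialogue (11), whose logical form is shown in Table 5. The entailments for the SDRSs of the last turn unpack as follows:11
$$\begin{aligned}
K_{\pi_{3A}} \ &\text{iff} \ V(\pi_{1.1}, \pi^{b}_{1.2}) \wedge Background(\pi_{2B}, \pi^{b}_{1.2}) \wedge Acceptance(\pi_{2B}, \pi_{3.1}) \\
&\text{only if} \ K_{\pi_{1.1}} \wedge K_{\pi^{b}_{1.2}} \wedge K_{\pi_{2B}} \wedge \varphi_{Background(\pi_{2B},\pi^{b}_{1.2})} \wedge K_{\pi_{3.1}} \wedge \varphi_{Acc(\pi_{2B},\pi_{3.1})} \\
&\text{only if} \ K_{\pi_{1.1}} \wedge \exists x(embezzle(e', x, y)) \wedge Correction(\pi_{1A}, \pi_{2.1}) \wedge Correction(\pi_{1.2}, \pi_{2.1}) \wedge Explanation(\pi_{2.1}, \pi_{2.2}) \wedge K_{\pi_{3.1}} \\
&\text{only if} \ K_{\pi_{1.1}} \wedge \exists x(embezzle(e', x, y)) \wedge \neg Explanation(\pi_{1.1}, \pi_{1.2}) \wedge \neg K_{\pi_{1.2}} \wedge K_{\pi_{2.2}} \\
K_{\pi_{2B}} \ &\text{iff} \ Acceptance(\pi_{1.1}, \pi_{4.1}) \wedge Contrast(\pi_{2B}, \pi_{4.1}) \\
&\text{only if} \ K_{\pi_{1.1}} \wedge Correction(\pi_{1A}, \pi_{2.1}) \wedge Correction(\pi_{1.2}, \pi_{2.1}) \wedge Explanation(\pi_{2.1}, \pi_{2.2}) \\
&\text{only if} \ K_{\pi_{1.1}} \wedge \neg Explanation(\pi_{1.1}, \pi_{1.2}) \wedge \neg K_{\pi_{1.2}} \wedge K_{\pi_{2.2}}
\end{aligned}$$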
Thus, the following are all grounded: $\neg K_{\pi_{1.2}}$ (i.e. that John did not embezzle the funds), $K_{\pi_{2.1}}$ (i.e. that Bill stole the funds), $K_{\pi_{2.2}}$ (i.e. that B was at the trial), $\neg Explanation(\pi_{1.1}, \pi_{1.2})$ (i.e. that John embezzling the funds did not cause him to go to jail) and $K_{\pi_{1.1}}$ (i.e. that John went to jail). Further, A remains committed to the stealing being an embezzlement, but B is neutral about it.

11 Roughly, $\varphi_{Acc(\alpha,\beta)}$ constrains $K_\beta$ to be in a non-monotonic equivalence with $K_\alpha \wedge K_\beta$.

In the real dialogue (20) (from personal communication) a correction is corrected:
(20) $\pi_{1.1}$. B: Hey, what happened?
$\pi_{2.1}$. A: I got the climb.
$\pi_{3.1}$. B: No you didn’t.
$\pi_{3.2}$. B: I saw you fall off.
$\pi_{4.1}$. A: No.
$\pi_{4.2}$. A: First time I fell off.
$\pi_{4.3}$. A: Next time I redpointed it.

The DSDRS for (20) is shown in Table 8. Let us motivate it in detail. In the second turn, A publicly commits to a particular answer to the question $\pi_{1.1}$ being true; B then corrects this in the third turn, and he is also publicly committed to the claim that his having seen A fall off explains why he asserts the correction $\pi_{3.1}$. Now let us consider the fourth turn, in which A corrects B’s correction. AC does not apply; intuitively, A wishes to preserve his commitments to all the content prior to the dispute (and not just the parts of that content that B remained neutral about), and also preserve what is available within it. More formally, if the glue logic validates $Correction(\alpha, \beta)$, where the representation of the discourse context includes $Correction(\gamma, \alpha)$, then one needs an axiom ensuring that all of $\gamma$’s content $K_\gamma$ is part of the undenied commitments of $\alpha$ in this context. Moreover, the axiom should make available all the labels within $\gamma$ that were available before $\gamma$ was corrected. Undenied Commitments for Denying Corrections (DC) captures this, and it is a default just in case A’s current speech act conflicts with preserving all prior commitments:

Undenied Commitments for DC: If A’s SDRS (for a given turn) contains $\lambda_1 : Correction(\alpha, \beta)$, and B’s SDRS (for that turn) contains $\lambda_2 : Correction(\gamma, \alpha)$, then normally the undenied commitments for $\alpha$, which are assigned the label $\lambda_1$, include:
– $V(\gamma, \beta)$ if $K_\gamma \in L_{basic}$.
– $K_\gamma \wedge V(\gamma', \beta)$, where $\gamma'$ is the last label of $K_\gamma$, if $K_\gamma \notin L_{basic}$.

DC determines the undenied commitments of a type of speech act in a particular context. The role of the relation $V$ is to guarantee the correct effects both in semantics and in what is available. In dialogue
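Turn | A’s SDRS | B’s SDRS
1 | $\emptyset$ | $\pi_{1.1} : K_{\pi_{1.1}}$
2 | $\pi_{2A} : IQAP(\pi_{1.1}, \pi_{2.1})$ | $\pi_{1.1} : K_{\pi_{1.1}}$
3 | $\pi_{2A} : IQAP(\pi_{1.1}, \pi_{2.1})$ | $\pi_{3B} : Correction(\pi_{2A}, \pi_{3.1}) \wedge Correction(\pi_{2.1}, \pi_{3.1}) \wedge Explanation^{*}(\pi_{3.1}, \pi_{3.2})$
4 | $\pi_{4A} : IQAP(\pi_{1.1}, \pi_{2.1}) \wedge V(\pi_{2.1}, \pi_{4.1}) \wedge Correction(\pi_{3B}, \pi_{4.1}) \wedge Correction(\pi_{3.1}, \pi_{4.1}) \wedge Explanation^{*}(\pi_{4.1}, \pi) \wedge Elaboration(\pi_{2.1}, \pi)$; $\pi : Narration(\pi_{4.2}, \pi_{4.3})$ | $\pi_{3B} : Correction(\pi_{2A}, \pi_{3.1}) \wedge Correction(\pi_{2.1}, \pi_{3.1}) \wedge Explanation^{*}(\pi_{3.1}, \pi_{3.2})$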
(20), A’s SDRS at the point where the dialogue is updated with $\pi_{4.1}$ includes $Correction(\pi_{3.1}, \pi_{4.1})$. Since $Correction(\pi_{2.1}, \pi_{3.1})$ and $Correction(\pi_{2A}, \pi_{3.1})$ are in B’s SDRS, DC and the Persistence Principle yield that A’s SDRS must include $IQAP(\pi_{1.1}, \pi_{2.1})$ and $V(\pi_{2.1}, \pi_{4.1})$ labelled with the root label (see Table 8). Intuitively, $\pi_{4.2}$ and $\pi_{4.3}$ together form a narrative (which we have labelled $\pi$), which in turn elaborates A’s original answer $\pi_{2.1}$ to the question (i.e. it elaborates how A got the climb) and also explains why he performs the correction $\pi_{4.1}$, making A’s final commitments as shown in Table 8. At this point, the labels $\pi_{3.2}$ and $\pi_{4.2}$ are not available. This seems to concur with intuitions: the pronoun it in a subsequent utterance It hurt could not refer back to the fall. The entailments of the DSDRS in Table 8 ensure that B is committed to A not getting the climb ($\neg K_{\pi_{2.1}}$) and that he fell off ($K_{\pi_{3.2}}$ entails this so long as seeing someone falling is interpreted evidentially). A, on the other hand, is committed to A getting the climb ($K_{\pi_{2.1}}$) and to first falling off ($K_{\pi_{4.2}}$) and then red-pointing it ($K_{\pi_{4.3}}$). So at this point, the only content that is agreed upon is that A fell off. Neither ‘A got the climb’ nor ‘A didn’t get the climb’ is agreed upon; nor is any answer to B’s question.

5 CONSTRUCTING LOGICAL FORM
Section 4 detailed the syntax and interpretation of the language in which logical forms for dialogue are expressed. This language expresses logical forms for a range of dialogues in a way that captures intuitions about agreement and disputes. Our task now is to describe how those logical forms are constructed during dialogue interpretation. This
Table 8 The DSDRS for dialogue (20)
Glue Logic Schema: $(\lambda : ?(\alpha, \beta) \wedge Info(\alpha, \beta, \lambda)) > \lambda : R(\alpha, \beta, \lambda)$

In words, if $\beta$ is to be connected to $\alpha$ with a rhetorical relation whose value we do not know yet, and the result is to appear in the scopal position of the DSDRS that is labelled $\lambda$, and moreover $Info(\alpha, \beta, \lambda)$ holds of the content labelled by $\lambda$, $\alpha$ and $\beta$, then normally the rhetorical relation is $R$. The conjunct $Info(\alpha, \beta, \lambda)$ is cashed out in terms of the ULFs that $\alpha$, $\beta$ and $\lambda$ label, and the rules are justified on the basis of underlying linguistic knowledge, world knowledge or knowledge of the cognitive states of the dialogue participants. Thus, glue logic axioms encapsulate default inferences about which types of speech act were performed, on the basis of the content and context of the utterances. One default that we mentioned in section 3 is that the necessary consequences of a speech act being performed are normally sufficient for inferring that it was performed. For instance, if $\alpha$ and $\beta$ are rhetorically connected and the glue logic evaluates that they are semantically incompatible (glossed as $Corr_S(\alpha, \beta)$), then normally they are connected with Correction:

Correction: $(\lambda : ?(\alpha, \beta) \wedge Corr_S(\alpha, \beta)) > \lambda : Correction(\alpha, \beta)$

Assuming a standard semantics of the it-cleft and an alternative semantics to pitch accents (Rooth 1992), the compositional semantics of $\pi_{2.1}$ in (11) conveys that Bill as opposed to anyone else mentioned in the context stole the funds.
involves extending SDRT’s glue logic to model the principles of dialogue interpretation from section 3. So we start with a brief description of the glue logic (for details see Asher and Lascarides 2003). Roughly put, the glue logic exploits the under-specified semantics derived from linguistic form (e.g. Egg et al. 2001). Under-specified logical forms (ULFs) are partial descriptions of (complete) logical forms, in our case DSDRSs. The glue logic incorporates default axioms for inferring (in a decidable manner) a more specific ULF: in other words, the logic identifies the pragmatically preferred way of resolving under-specified aspects of compositional content. Rhetorical connections are inferred on the basis of default axioms of the form shown in Glue Logic Schema, where the symbols $\alpha$ and $\beta$ are meta-variables ranging over the labels in the DSDRS, and $?$ is a variable in the glue logic language that indicates that the value of some constructor in the fully specific logical form (in this case the value of a rhetorical-relation predicate symbol) is currently unknown:
Fight: $(\alpha : (have(e_\alpha, X, z) \wedge and_c(X, x, y) \wedge fight(z)) \wedge \beta : neg(\gamma) \wedge \gamma : go\text{-}out(e_\beta, x, y)) \rightarrow cause_D(\beta, \alpha)$
In SDRT the inferences for constructing logical form can flow in one of several directions. If the premises of glue logic axioms are satisfied by the ULFs derived from the grammar, and one can thus infer via the glue logic’s consequence relation $|\!\sim$ a particular rhetorical relation, then this particular rhetorical connection becomes a part of the (updated) logical form. Further, the semantic consequences of this rhetorical relation may lead to inferences about how other under-specified conditions are resolved (e.g. identifying antecedents to pronouns). Alternatively, there are cases where compositional semantics is insufficient for satisfying the premises to any glue logic axioms. In this case, one can resolve the under-specified compositional semantics to specific values so as to satisfy antecedents to glue logic axioms, leading in turn to a rhetorical relation
(11) $\pi_{1.1}$. A: John went to jail.
$\pi_{1.2}$. A: He embezzled the pension funds.
$\pi_{2.1}$. B: No, it was BILL who stole the pension funds.
$\pi_{2.2}$. B: I was at the trial.
$\pi_{3.1}$. A: Oh, OK.
$\pi_{4.1}$. B: John did go to jail though.

If $\pi : ?(\pi_{1.2}, \pi_{2.1})$ holds, then SDRT’s constraints on anaphoric interpretation (see definitions given shortly) mean that the funds in $\pi_{1.2}$ and $\pi_{2.1}$ corefer. And so long as the incompatibility between this and the compositional (and lexical) semantics of $\pi_{1.2}$ and $\pi_{2.1}$ is transferred into the glue language, the antecedent to Correction will be satisfied, yielding $\pi : Correction(\pi_{1.2}, \pi_{2.1})$. Further axioms for inferring Correction exploit explicit cue phrases such as No and you’re wrong. $Explanation(\alpha, \beta)$ can be inferred when there is evidence in the discourse that $\beta$ causes $\alpha$ (written $cause_D(\beta, \alpha)$). Evidence of a causal relation does not entail an actual causal relation, but they are nonmonotonically linked thanks to the default rule Explanation below and the semantics of Explanation given earlier.

Explanation: $(\lambda : ?(\alpha, \beta) \wedge cause_D(\beta, \alpha)) > \lambda : Explanation(\alpha, \beta)$

Glue logic axioms for inferring $cause_D(\beta, \alpha)$ are monotonic, for either the discourse contains evidence of a causal connection or it does not. For example, the axiom Fight stipulates that if $x$ and $y$ have a fight $z$ and $x$ did not go out with $y$, then there is evidence in the discourse of the latter causing the former. As we will shortly see, this axiom contributes to the construction of the DSDRS for dialogue (1):
being inferred. If one adopts this strategy, and moreover there is a choice of which way to resolve the under-specified content so as to infer a rhetorical relation from it, then one chooses an interpretation which maximizes the coherence of the logical form [see Asher and Lascarides (2003) for details]. Definition 6, taken from Asher and Lascarides (2003), stipulates this general principle that one interprets discourse in a way that maximizes its coherence and illustrates the conservative assumptions that SDRT makes about what factors influence the degree of coherence. While the original version of Maximizing Discourse Coherence (MDC) applied to SDRSs, it also applies now to DSDRSs.12

12 SDRT’s more formal definition of MDC (Asher and Lascarides 2003: 233), which ranks SDRSs into a partial order, easily extends to rank DSDRSs: roughly put, one DSDRS $K_1$ is more coherent than another $K_2$ if (i) they are comparable (i.e. they consist of the same number of turns and the same dialogue participants) and (ii) each SDRS in $K_1$ is at least as coherent as the SDRS for the same dialogue participant and turn in $K_2$.

Definition 6 MDC
Discourse is interpreted so as to maximize discourse coherence, where the (partial) ranking among interpretations is determined by the following principles.
1. All else being equal, the more rhetorical connections there are between two items in a discourse, the more coherent the interpretation.
2. All else being equal, the more semantically under-specified elements are resolved to specific values, the more coherent the interpretation. Moreover, resolutions that lead to $|\!\sim$-consequences for a particular rhetorical relation are preferred over resolutions that are logically unrelated to any rhetorical connection.
3. Some rhetorical relations are inherently scalar. For example, the quality of a Narration is dependent on the specificity of its common topic. All else being equal, an interpretation which maximizes the quality of its rhetorical relations is more coherent than one that does not.
4. All else being equal, the number of labels in the semantic representation is minimal, so long as minimizing the number of labels does not create semantic anomalies among the rhetorical relations in the representation.

The glue logic for DSDRSs involves constructing an SDRS for each turn and each participant. The axioms of SDRS construction just described still apply, serving to provide values for the function $\mathcal{F}$ in a DSDRS for each label $\pi$. In addition, given the definition of DSDRSs, the glue logic must stipulate for every $\pi \in \Pi$ which SDRSs it is a member of. So we extend the glue logic axiom with a 3-place
Definition 7 Discourse Update for DSDRSs
Simple Update. We first define how to update a context with new information $\beta$, given a particular available attachment site $\alpha$. The ULF formula $\lambda : ?(\alpha, \beta) \wedge T(d, j, \lambda)$ specifies that the new information $K_\beta$ is to be attached to the DSDRS as a part of the SDRS $T^{d}(j)$. Let $\sigma$ be a set of (fully specified) DSDRSs, and let $Th(\sigma)$ be the set of all ULFs that partially describe the DSDRSs in $\sigma$. Let $\psi$ be either (a) a ULF $K_\beta$, or (b) a formula $\lambda : ?(\alpha, \beta) \wedge T(d, j, \lambda)$ about attachment, where $Th(\sigma) \vdash_{ulf} K_\beta$. Then $\sigma + \psi$ is a set of DSDRSs defined as follows:
1. $\sigma + \psi = \{\tau : \text{if } Th(\sigma), \psi \mathrel{|\!\sim} \phi \text{ then } \tau \vdash_{ulf} \phi\}$, provided the result is not $\emptyset$;
2. $\sigma + \psi = \sigma$ otherwise.
Discourse Update. Suppose that $A$ is the set of available attachment points in the old information $\sigma$ for the new information $\beta$. Then the power set $\mathcal{P}(A)$ represents all possible choices for what labels in $\sigma$ the new label $\beta$ is actually attached to. $update_{SDRT}$ is neutral about which member of $\mathcal{P}(A)$ is the ‘right’ choice, for $update_{SDRT}(\sigma, K_\beta)$ is the union of DSDRSs that result from a sequence of $+$-operations for each member of $\mathcal{P}(A)$.

The updated ULF may not identify a unique logical form (i.e. $|update_{SDRT}(\sigma, K_\beta)| > 1$). The Principle MDC then ranks the alternative, fully specific logical forms. However, in contrast to the analysis of disputes from Asher and Lascarides (2003), constructing a logical form always involves extending the logical form from the context and never revising it. That is, $Th(update_{SDRT}(\sigma, K_\beta)) \supseteq Th(\sigma)$. In essence, as a dialogue proceeds, we learn strictly more information about the content of (prior) utterances, never revising those prior interpretations but rather refining them. Of course, this monotonicity is feasible only
predicate symbol $T$: where $d \in D$ and $j \in [1, n]$, $T(d, j, \pi)$ means that the label $\pi$ is a part of the SDRS $T^{d}(j)$. Thus, the glue-language predicate symbol $T$ is used to express statements about the function $T$ in the DSDRSs it describes. We need to add axioms to the glue logic that formally specify the principles of dialogue interpretation that we proposed (e.g. Undenied Commitments, AC, DC). But before doing this, we define discourse update and availability for DSDRSs. As in original SDRT, updating a representation of the discourse context with new content involves adding all the $|\!\sim$-consequences of the old and new content to the logical form. If there is under-specified information about which of the available labels the new content attaches to, then update is conservative and generalizes over all the possibilities (see the second part of Definition 7).
Definition 8 The Original Definition of Availability for an SDRS
Let $\langle \Pi, \mathcal{F}, last \rangle$ be an SDRS. Furthermore, where $\pi_1, \pi_2 \in \Pi$, we say that $\pi_1 > \pi_2$ iff either: (i) $R(\pi_1, \pi_2)$ is within the range of $\mathcal{F}$, where $R$ is a subordinating relation (e.g. Q-Elab, Plan-Elab, IQAP, Correction, Background, Elaboration, Explanation) or (ii) $\pi_1$ immediately out-scopes $\pi_2$ (i.e. $\mathcal{F}(\pi_1)$ contains the literal $\pi_2$). Let $>^{*}$ be the transitive closure of the relation $>$. Then the set of available labels $A \subseteq \Pi$ of the SDRS is:
$$A = \{\pi \in \Pi : \pi >^{*} last\}$$

Availability for DSDRSs is defined in terms of that for SDRSs: the available labels of a DSDRS are all the available labels of its SDRSs for the last turn.

Definition 9 Definition of Availability for DSDRSs
Let $D$ be a set of discourse participants, and let $\langle n, T, \Pi, \mathcal{F}, last \rangle$ be a DSDRS for $D$. Furthermore, where $d_i \in D$ and $j \in [1, n]$, let $A^{d_i}_{j} \subseteq \Pi^{d_i}_{j}$ be the set of the available labels for the SDRS $T^{d_i}(j)$, as defined in Definition 8. Then the set $A \subseteq \Pi$ of available labels for the DSDRS is defined as:
$$A = \bigcup_{d_i \in D} A^{d_i}_{n}$$
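Definitions 8 and 9 amount to a reachability computation, which the following sketch spells out. It is an illustration under stated assumptions, not part of SDRT's formal apparatus: an SDRS is encoded (hypothetically) as a list of pairs `(upper, lower)` recording that `upper > lower`, together with its last label; the function names are invented for this example; and the last label is counted as available, following the paper's prose remark that last is itself available.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Edge = Tuple[str, str]  # (upper, lower): upper > lower in Definition 8


def available_labels(edges: Iterable[Edge], last: str) -> Set[str]:
    """Labels p with p >* last, plus last itself (an assumption from the prose)."""
    parents: Dict[str, Set[str]] = {}
    for upper, lower in edges:
        parents.setdefault(lower, set()).add(upper)
    seen = {last}
    stack = [last]
    while stack:  # walk upwards through > until nothing new is reached
        node = stack.pop()
        for upper in parents.get(node, ()):
            if upper not in seen:
                seen.add(upper)
                stack.append(upper)
    return seen


def available_in_dsdrs(
    last_turn: Dict[str, Optional[Tuple[Iterable[Edge], str]]]
) -> Set[str]:
    """Definition 9: union over every participant's SDRS for the last turn."""
    result: Set[str] = set()
    for sdrs in last_turn.values():
        if sdrs is not None:  # None encodes an empty SDRS
            edges, last = sdrs
            result |= available_labels(edges, last)
    return result


# Mark's SDRS after turn 1 of dialogue (1): pi1M : Explanation(pi1.1, pi1.2).
mark = (
    [("pi1M", "pi1.1"), ("pi1M", "pi1.2"),  # pi1M out-scopes both clauses
     ("pi1.1", "pi1.2")],                   # Explanation is subordinating
    "pi1.2",
)
print(sorted(available_in_dsdrs({"M": mark, "K": None, "S": None})))
```

Coordinating relations such as Narration contribute no pair, which is why the first argument of a Narration is unreachable from the last label in this encoding.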
In other words, $A$ is the union of all available labels from all the SDRSs for the last turn $n$ (and so by Definitions 3 and 8 $last \in A$). In the particular examples we have analysed so far, we have always assumed that the content of the current turn attaches to an available label from the SDRS of the unique speaker of the last turn. But Definition 9
because Discourse Update does not restrict the new information $\beta$ to being the label of a single clause. It could label the content of an entire turn or more, and discourse update abstracts over all these possibilities. In short, we maintain monotonicity only by relaxing incremental interpretation. Any implementation of SDRT in a practical dialogue system would need to restrict the massive search space that ensues from non-incrementality, and the restricted ‘beam search’ may mean that revision processes in implementation become inevitable. But approximating this account of update in a dialogue system is a matter for future research; here we focus only on a competence model of dialogue understanding. Definition 8 stipulates that the available labels of an SDRS are the last label and all labels that are connected to it by a sequence of out-scopes relations and/or subordinating rhetorical relations (ignoring the complexities we discussed earlier concerning discourse subordination).
Non-Speakers: $\neg speaker(d, j) \rightarrow (T(d, j-1, \alpha) > T(d, j, \alpha))$

The axiom is a default because silence can be a meaningful act in sufficiently specific contexts (Grice 1975). The labels of utterances that are spoken by a dialogue participant $d$ in a given turn $j$ must, on the other hand, be a part of $T^{d}(j)$. So, if $partof(\alpha, j)$ means that $\alpha$ labels the content of an individual clause that was said in turn $j$, then the following axiom holds:

Speakers: $(partof(\alpha, j) \wedge speaker(d, j)) \rightarrow T(d, j, \alpha)$
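To make the division of labour between the two axioms concrete, here is a small sketch that applies only Non-Speakers and Speakers to a sequence of turns. It rests on simplifying assumptions that should be read as such: the Non-Speakers default is treated as if it always fired (the paper allows exceptions, e.g. meaningful silence), and for the speaker the result is only the lower bound required by Speakers, since which earlier commitments a speaker retains is settled by other axioms that are not modelled here. The function name `assign_labels` and the data layout are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

Turn = Tuple[str, Sequence[str]]  # (speaker, labels of the clauses uttered)


def assign_labels(participants: Sequence[str],
                  turns: Sequence[Turn]) -> Dict[str, List[Set[str]]]:
    """Return, per participant, one label set per turn (index j-1 is turn j)."""
    table: Dict[str, List[Set[str]]] = {d: [] for d in participants}
    for speaker, labels in turns:
        for d in participants:
            previous = table[d][-1] if table[d] else set()
            if d == speaker:
                # Speakers: every clause label of the turn is in T^d(j).
                # This is a lower bound only; persistence is not modelled.
                table[d].append(set(labels))
            else:
                # Non-Speakers (default, assumed to fire): T^d(j) = T^d(j-1).
                table[d].append(set(previous))
    return table


# Dialogue (1): Mark speaks in turn 1, Karen in turn 2, Sharon stays silent.
result = assign_labels(
    ["M", "K", "S"],
    [("M", ["pi1.1", "pi1.2"]), ("K", ["pi2.1"])],
)
for who, sets in result.items():
    print(who, [sorted(s) for s in sets])
```

On this input the silent participant's sets stay empty throughout, while Mark's turn-2 set is a copy of his turn-1 set.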
The Persistence Principle and Undenied Commitments for simple-leftveridical relations from section 3 have a straightforward formalization in the glue logic:
The Persistence Principle: $\lambda : R(\alpha, \beta) \rightarrow \lambda : Undenied\text{-}Commitments(\alpha)$

Undenied Commitments for Simple-Left-Veridical Relations: $(\lambda : R(\alpha, \beta) \wedge T(d_1, j, \lambda) \wedge simple\text{-}left\text{-}veridical(R) \wedge \lambda' : R'(\gamma, \alpha) \wedge T(d_2, j-1, \lambda')) > (\lambda : Undenied\text{-}Commitments(\alpha) \rightarrow \lambda : R'(\gamma, \alpha))$

There is a similar rule to Undenied Commitments for the case where $\lambda' : R'(\alpha, \gamma)$. We can now use the glue logic to derive the DSDRS of dialogue (1), shown in Table 1 (where M is Mark, K is Karen and S is Sharon).
allows a speaker to completely ignore what the last speaker said, and instead address content that was conveyed in a prior turn to the last one. While it might be rare to ignore someone in a two-person dialogue, it is more frequent in a multiparty conversation, especially if the participants have unequal power. A competence model of dialogue should reflect ‘ignoring’ moves in a transparent way. We have achieved this here: the hallmark that someone has ignored the prior speaker’s turn is that his SDRS features no labels that were introduced in that turn. However, all else being equal, an interpretation where a speaker addresses the prior turn is arguably more coherent than one where he ignores it. So for the sake of simplicity, we will from now on focus on interpretations where the current speaker does not ignore the last speaker. Let us now examine how to express axioms in the glue logic that capture the principles from section 3 for computing which commitments persist from prior turns. The axiom Non-Speakers stipulates that normally you change your commitments only if you speak, where speaker(d, j) means that dialogue participant d is the (unique) speaker of turn j:
The axiom Non-Speakers makes $T^{K}(1)$ and $T^{S}(1)$ empty. But the axiom Speakers means $\{\pi_{1.1}, \pi_{1.2}\} \subseteq T^{M}(1)$. By MDC, we prefer to minimize labels and maximize rhetorical connections. Given the definition of availability, this means that an SDRS $T^{M}(1)$ that satisfies $\pi_{1M} : ?(\pi_{1.1}, \pi_{1.2})$ is preferred to any SDRS that does not satisfy this. This assumption about attachment resolves the pronoun she in $\pi_{1.2}$ to Karen, the only accessible antecedent within $\pi_{1.1}$. Therefore, given the ULFs of $\pi_{1.1}$ and $\pi_{1.2}$ derived from the grammar and this resolution of she, the antecedent to Fight is satisfied, yielding $cause_D(\pi_{1.2}, \pi_{1.1})$ (i.e. there is evidence in the discourse that $\pi_{1.2}$ caused $\pi_{1.1}$). So the antecedent to Explanation is satisfied, yielding $\pi_{1M} : Explanation(\pi_{1.1}, \pi_{1.2})$. Now consider the interpretation of the second turn. Non-Speakers makes $T^{S}(2) = T^{S}(1)$ and $T^{M}(2) = T^{M}(1)$. But Speakers means that $\pi_{2.1} \in T^{K}(2)$. K’s second turn can be interpreted as ignoring the first, but by MDC relating it to M’s turn is preferred. M’s SDRS has three available labels: $\pi_{1.1}$, $\pi_{1.2}$ and $\pi_{1M}$. And so we consider updates with all combinations of these labels; MDC will tell us which of these is preferred. Let us suppose that $\pi_{2.1}$ attaches to $\pi_{1M}$, but not to $\pi_{1.1}$ or $\pi_{1.2}$. Then the glue logic fails to validate any rhetorical connection (e.g. we argued in section 2 that an Explanation connection would be implausible) or at the very least, the rhetorical relation would be of inferior quality to the one that is inferable between $\pi_{2.1}$ and $\pi_{1.2}$, and hence dispreferred by MDC. A connection between $\pi_{2.1}$ and $\pi_{1.1}$ but not $\pi_{1.2}$ suffers a similar fate. On the other hand, a glue logic axiom that is similar in style to Fight and that encapsulates the world knowledge about Mark not asking Karen out and Karen not going out with Mark should validate an inference that there is evidence in the discourse that the former caused the latter.
And so via Explanation one can infer ?p : Explanation(p1.2, p2.1) is a part of TK (2) for some label ?p that is also a part of TK (2). This together with TM (1) validates the antecedent to Undenied Commitments, and so its non-monotonic consequence is inferred (since it is consistent with the premises), leading by Modus Ponens on the Persistence Principle to the conclusion ?p : Explanation(p1.1, p1.2). Finally, MDC makes the SDRS minimal, resolving ?p to the root label p2K. And so K’s SDRS for the third turn is as shown in Table 1. Now let us consider the principles from section 3 for interpreting explicit endorsements and challenges. We suggested in section 3 that whenever k : Correction(a, b) forms a part of an SDRS, then k : Correction(c, b) should be a part of it too for all labels c that out-scope a. In fact, this principle follows from the semantics of Correction, Correction and MDC. That is because the (dynamic) interpretation
of Correction(a, b) entails Correction(c, b). Assuming this entailment is transferred (in shallow form) to the glue logic, then this together with the assumption k :?(c, b) would satisfy the antecedent to Correction, yielding k : Correction(c, b). And according to MDC, an interpretation where one assumes k :?(c, b) is more coherent than one where c and b are not related at all, for it yields more rhetorical connections (note that c is available because c out-scopes a). Let us return to the analysis of (11), starting with the SDRSs for the first turn. B’s SDRS, according to Non-Speakers is ;. As before, availability, Speakers and MDC means that p1A :?(p1.1,p1.2) holds. This makes John the only possible antecedent to He in p1.2. So by MDC, the pronoun is resolved this way whatever the rhetorical relation. Using rules that encapsulate relevant world knowledge, one will infer from these premises that causeD(p2, p1) and hence p1A : Explanation(p1.1, p1.2). By Non-Speakers, TA (2) ¼ TA (1). By Speakers, fp2.1, p2.2g 4 TB (2). There are many options for how B’s SDRS may be updated with p2.1 and p2.2, and according to Definition 7 they must all be considered and then ranked by MDC. First, one could form a segment out of p2.1 and p2.2 and attach the result to an available label; by MDC attaching to a label from A’s prior SDRS is preferred, and so the choices are to attach to one or more of p1A, p1.1 and p1.2. Alternatively, one could attach p2.1 to one of these labels first, and then attach p2.2 to an available label in the result. If we were to form a segment out of p2.1 and p2.2 first, and then attach the (root) label of that segment to the context, then as we will see shortly this would create an extra label as compared with the strategy for updating ‘clause by clause’, and so it is dispreferred by MDC. So let us consider now the option of attaching p2.1 to an available label within TA (1). 
We saw earlier that if we assume that p2.1 attaches to p1.2, then Correction validates an inference to ?p : Correction(p1.2, p2.1). This also yields from an assumption that p2.1 attaches to p1A an inference to ?p : Correction(p1A, p2.1). So attaching p2.1 to p1.2 and p1A is preferred by MDC to not attaching to them, since this maximizes the number of rhetorical connections. Furthermore, if p2.1 attaches to p1.1, then no glue logic axioms validate an inference about the identity of the rhetorical connection between them. So, by clause 2 of MDC—one prefers interpretations with fewer unknown values—an interpretation where p2.1 attaches to p1.1 is dispreferred. Given that p2.1 is interpreted as a correction, relevant glue logic axioms will validate that p2.2 explains this corrective move. Finally, by MDC preferring a minimum number of labels, ?p resolves to the root label p2B. And so the final SDRS TB (2) is as shown in Table 5.
Wide Scope OK: $(?\lambda : Acceptance(?\alpha, \beta) \wedge \beta : OK \wedge T(d_1, i-1, ?\alpha) \wedge T(d_2, i, \beta)) > (T(d_1, i-1, \gamma) \rightarrow\ ?\alpha \succeq \gamma)$

In words, if $\beta$ has the form OK and is interpreted as an acceptance act of some part of a different agent’s prior turn, then normally, that acceptance act is of the root label of that turn. So in (11), $\pi_{3A} : Acceptance(\pi_{2B}, \pi_{3.1})$ is inferred. This analysis is not yet complete, however, because we need to apply AC from section 3.1. AC involves computing the undenied content of the corrected material. So we start by adding glue logic axioms that recursively compute undenied content. We define a function $sc$ (standing for Subordinated Correction), which for any label $\pi$ that is corrected by $\pi'$ yields the set of labels out-scoped by $\pi$ that (i) are also corrected by $\pi'$ and (ii) among the labels satisfying (i), have the ‘highest’ scopal position within $\pi$. More formally, where $Correction(\pi, \pi')$:
$$sc_{\pi'}(\pi) = \{\pi'' \ \text{s.t.}\ Correction(\pi'', \pi'),\ \pi \succ \pi'' \ \text{and}\ \forall \pi''' \ \text{s.t.}\ \pi \succ \pi''' \succ \pi'',\ \neg Correction(\pi''', \pi')\}$$
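The function sc can be computed directly from the out-scoping structure, as the sketch below suggests. Two assumptions are made for illustration: the out-scopes relation in the definition is read as the transitive closure of immediate out-scoping, and Correction facts are given as a finite set of pairs `(corrected, correcting)`. The function names and the dictionary encoding are hypothetical.

```python
from typing import Dict, Set, Tuple


def descendants(label: str, outscopes: Dict[str, Set[str]]) -> Set[str]:
    """All labels that `label` out-scopes, directly or indirectly."""
    found: Set[str] = set()
    stack = list(outscopes.get(label, ()))
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(outscopes.get(node, ()))
    return found


def sc(pi: str, pi_prime: str,
       outscopes: Dict[str, Set[str]],
       corrections: Set[Tuple[str, str]]) -> Set[str]:
    """Highest labels below `pi` that are also corrected by `pi_prime`."""
    below = descendants(pi, outscopes)
    corrected = {p for p in below if (p, pi_prime) in corrections}
    result: Set[str] = set()
    for candidate in corrected:
        # Drop the candidate if a corrected label sits between pi and it.
        blocked = any(candidate in descendants(mid, outscopes)
                      for mid in corrected if mid != candidate)
        if not blocked:
            result.add(candidate)
    return result


# Dialogue (11): pi1A out-scopes pi1.1 and pi1.2; pi2.1 corrects pi1A and pi1.2.
outscopes_11 = {"pi1A": {"pi1.1", "pi1.2"}}
corrections_11 = {("pi1A", "pi2.1"), ("pi1.2", "pi2.1")}
print(sc("pi1A", "pi2.1", outscopes_11, corrections_11))
```

For dialogue (11) only the label of the embezzlement clause is returned, since the label for John going to jail is not itself corrected.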
A’s SDRS for the third turn of (11) has a linguistic form that entails that it is either an Acceptance of some available prior content or an acknowledgement of understanding (which is represented in SDRT with the meta-talk relation Acknowledgement*). We assume a glue logic axiom that makes OK default to the more specific form of grounding, namely Acceptance; this means that although the meta-talk relation Acknowledgement* is then inferable from this (on the basis that the necessary semantic consequences of a speech act are normally sufficient for inferring it has been performed), the relation is not added to the logical form because it is semantically and structurally redundant to do so. There is, however, an ambiguity in A’s utterance OK: its highly under-specified compositional semantics does not determine the first argument to the Acceptance relation. Once again using the principle that we prefer interpretations where the current turn relates to the previous one, we prefer the first argument of Acceptance to be one of $\pi_{2.1}$, $\pi_{2.2}$ and $\pi_{2B}$. We argued in section 3 that there appears to be a tendency to interpret explicit endorsements with highly under-determined compositional semantic content, like OK, so that they have the widest scope that is consistent with the premises. This can easily be expressed in the glue logic as a default axiom about attachment:
154 Agreement, Disputes and Commitments in Dialogue
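Undenied: If $\mathit{Correction}(\pi, \pi')$, then:
1. If $sc_{\pi'}(\pi) \neq \emptyset$, then $\mathit{Undenied}_{\pi'}(\pi) = K_\pi[R(\pi''', sc_{\pi'}(\pi)) / V(\pi''', \widehat{sc_{\pi'}(\pi)})]$, where for each $\pi'' \in sc_{\pi'}(\pi)$, $F(\widehat{\pi''}) = \mathit{Undenied}_{\pi'}(\pi'')$.
2. If $sc_{\pi'}(\pi) = \emptyset$, then $\mathit{Undenied}_{\pi'}(\pi)$ = d closure of $f(\widehat{\pi'}, \lambda)$.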
The glue logic axiom which expresses AC is given below.
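Undenied Commitments when Accepting Corrections (AC):
$$(\lambda : R(\beta, \gamma) \wedge \textit{left-veridical}(R) \wedge T(d_1, j, \lambda) \wedge \lambda' : \mathit{Correction}(\alpha, \beta) \wedge T(d_2, j-1, \lambda') \wedge T(d_1, j-1, \alpha) \wedge (d_1 \neq d_2)) > (\lambda : \textit{Undenied-Commitments}(\beta) \rightarrow \lambda : (\mathit{Undenied}_{\beta}(\alpha) \wedge \mathit{Background}(\beta, \mathit{Last}(\alpha))))$$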
This affects the construction of the logical form for (11) in the desired way, yielding the SDRSs shown in Table 5. The axiom Denying Corrections (DC) ensures that the content and the availability of the commitments prior to the dispute are normally preserved when the correction is itself in dispute:
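Undenied Commitments for DCs:
$$(\lambda_1 : \mathit{Correction}(\gamma, \alpha) \wedge T(d_1, j, \lambda_1) \wedge \lambda_2 : \mathit{Correction}(\alpha, \beta) \wedge T(d_2, j, \lambda_2) \wedge d_1 \neq d_2) > (\lambda_2 : \textit{Undenied-Commitments}(\alpha) \rightarrow \lambda_2 : (K_\gamma \wedge V(\mathit{Last}(\gamma), \beta)))$$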
The axiom DC applies when constructing A’s SDRS for the fourth turn in (20), and ensures that $\mathit{QAP}(\pi_{1.1}, \pi_{2.1}) \wedge V(\pi_{2.1}, \pi_{3.1})$ is added to the root label, as shown in Table 8.
We also introduce a function Last from labels to labels: $\mathit{Last}(\pi)$ is the (unique) label $\pi'$ that is the last label in the SDRS formula $K_\pi$ if there is such a label; otherwise, if $K_\pi \in L_{\mathit{basic}}$, then $\mathit{Last}(\pi) = \pi$. We then introduce into the glue logic a function Undenied from labels to formulae, which computes for a label $\pi$ that is corrected by $\pi'$ the undenied part of $\pi$ (relative to that correction by $\pi'$). $\mathit{Undenied}_{\pi'}(\pi)$ is constructed via substitution, and computing it is decidable (unlike down-dating and revision in first-order logics). We write $K[\phi/\phi']$ to mean that all occurrences of $\phi$ within the formula $K$ are substituted with $\phi'$. We also ‘overload’ this notation: where $P = \{\pi_1, \ldots, \pi_n\}$, $K[K_P/K'_P]$ means that each occurrence of $K_{\pi_i}$ in $K$ is substituted with $K'_{\pi_i}$, for $1 \leq i \leq n$. The function Undenied is defined as follows and matches the informal recursive definition we described in section 3.1.
Alex Lascarides and Nicholas Asher 155
6 CONCLUSION
We have presented a novel treatment of agreement and disputes, which captures facts about implicit agreement and also about what is agreed upon when a dispute has taken place. We argued that any adequate account of agreement and disputes must rest on a logical form for dialogue that tracks the public commitments of each agent, including their commitments to rhetorical relations. This ensures that an agent’s commitments are not only to the compositional semantics of his utterances but also to their illocutionary effects. By committing agents to this illocutionary content, a definition of agreement as shared public commitment captures how implicit agreement is dependent on the particular relational speech acts that each agent performs and the semantic relationships among these speech acts. The analysis of disputes also benefits from a logical form that distinguishes among each agent’s commitments. It ensures that if one agent commits to the negation of another agent’s commitments, then since the interpretation of logical form is a product of the interpretation of each agent’s commitments, the overall dialogue remains consistent. We argued that any adequate theory of agreement and disputes needs to include axioms that determine which prior commitments persist. We provided logically precise persistence axioms within the glue logic of SDRT and demonstrated by example that they capture facts about agreement and disputes. These axioms reflect the logical codependence among the interpretations of corrective moves versus endorsing moves that are performed in dialogue. Overall, the persistence axioms followed a general principle that dialogue interpretation maximize each dialogue participant’s commitments to prior commitments (even if those commitments belonged to another agent), provided that this is consistent with the illocutionary effect of his current utterance.
This is similar to the effects of performing down-dating and revision on prior commitments when adding new commitments to them (although down-dating and revision do not typically effect a transition of prior commitments to another agent). But while down-dating and revision are an unsolved problem for first-order languages and certainly uncomputable, our model of dialogue interpretation is computable and precise. The relationship between what is agreed upon (or grounded) and the interpretation of the dialogue is entirely transparent, since it is defined in terms of the model theory of the logical forms. This, together with the logic for constructing logical form, provides a logical basis for exploring Clark’s (1992) notion of positive evidence for grounding, endowing some of his claims with predictive power through logical reasoning.
Acknowledgements
This work has benefited from comments and feedback from Julia Hirschberg, David Israel, Paul Portner, Colin Matheson, Ron Petrick, Mark Steedman, Matthew Stone, David Traum, Lyn Walker, from the participants of Sigdial 2008, Londial 2008 and Sub13 and from two anonymous reviewers for this journal. Any errors that remain are our own.
NICHOLAS ASHER IRIT, Université Paul Sabatier, 118, Route de Narbonne, F-31062 Toulouse, France, e-mail:
[email protected]
ALEX LASCARIDES School of Informatics, University of Edinburgh, 10, Crichton Street, Edinburgh EH8 9AB, Scotland, United Kingdom, e-mail:
[email protected]
REFERENCES
Achinstein, P. (1980), The Nature of Explanation. Oxford University Press. NY, USA.
Asher, N. (1993), Reference to Abstract Objects in Discourse. Kluwer Academic Publishers, Dordrecht, NL.
Asher, N. (2004), ‘From discourse microstructure to macro-structure and back again: the interpretation of focus’. In H. Kamp and B. Partee (eds.), Context-Dependence in the Analysis of Linguistic Meaning. Elsevier, St Louis, MO, USA.
This paper presents just some first steps towards a dynamic theory of grounding. It did not provide a detailed analysis of how questions and imperatives affect commitments, agreements and disputes. This will be addressed in Lascarides and Asher (forthcoming). We also wish to rationally reconstruct SDRT’s logic of cognitive modelling, to take into account the model of dialogue interpretation presented here. This involves linking public commitments to inferences about other attitudes, such as beliefs, preferences and intentions. For instance, we have ignored in our analysis of (1) the fact that Mark might be using his utterances in an accusatory manner, to attribute the blame for the fight to Karen, while Karen attempts to reassign the blame to Mark. This is a matter of ongoing research, with some initial results reported in Asher and Lascarides (2008). Finally, we need to extend this account to a model of grounding at the lower levels (e.g. grounding an understanding of what was said). In future work we intend to incorporate insights from prior models of grounding (e.g. Ginzburg 2008) into the SDRT model of agreement presented here.
Asher, N. & Lascarides, A. (2003), Logics of Conversation. Cambridge University Press. Cambridge, UK.
Asher, N. & Lascarides, A. (2008), ‘Commitments, beliefs and intentions in dialogue’. In Proceedings of the 12th Workshop on the Semantics and Pragmatics of Dialogue (Londial), London. 35–42.
Bromberger, S. (1962), ‘An approach to explanation’. In R. Butler (ed.), Analytical Philosophy. Oxford University Press. Oxford, UK. 72–105.
Clark, H. (1992), Arenas of Language Use. University of Chicago Press, Chicago.
Clark, H. (1996), Using Language. Cambridge University Press, Cambridge, UK.
Clark, H. & Schaefer, E. F. (1989), ‘Contributing to discourse’. Cognitive Science 13:259–94.
Egg, M., Koller, A. & Niehren, J. (2001), ‘The constraint language for lambda structures’. Journal of Logic, Language, and Information 10:457–85.
Fernando, T. (1994), ‘Bisimulations and predicate logic’. Journal of Symbolic Logic 59:924–44.
Gaudou, B., Herzig, A. & Longin, D. (2006), ‘Grounding and the expression of belief’. In Proceedings of the 10th International Conference on Principles of Knowledge Representation and Reasoning (KR’06), Riva de Garda, Italy. 221–9.
Ginzburg, J. (2008), Semantics and Conversation. CSLI Publications, Stanford, USA.
Grice, H. P. (1975), ‘Logic and conversation’. In P. Cole and J. L. Morgan (eds.), Syntax and Semantics Volume 3: Speech Acts. Academic Press. NY, USA. 41–58.
Groenendijk, J. (2003), ‘Questions and answers: semantics and logic’. In Proceedings of the 2nd CologNET-ElsET Symposium. Questions and Answers: Theoretical and Applied Perspectives. Utrecht. 16–23.
Groenendijk, J. & Stokhof, M. (1991), ‘Dynamic predicate logic’. Linguistics and Philosophy 14:39–100.
Hamblin, C. (1970), Fallacies. Methuen, London, UK.
Hirschberg, J. (1985), A Theory of Scalar Implicature. Ph.D. thesis, Computer and Information Science, University of Pennsylvania.
Kamp, H. & Reyle, U. (1993), From Discourse to the Lexicon: Introduction to Model-theoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Kluwer Academic Publishers, Dordrecht, NL.
Krifka, M. (1991), ‘A compositional semantics for multiple focus constructions’. In Steven Moore and Adam Zachary Wyner (eds.), Proceedings from Semantics and Linguistic Theory I. CLC Publications. Ithaca, New York. 127–58.
Larsson, S. & Traum, D. (2000), ‘Information state and dialogue management in the TRINDI dialogue move engine toolkit’. Natural Language Engineering 6:323–40.
Lascarides, A. & Asher, N. (2008), ‘The interpretation of Questions Dialogue’. In Proceedings of the 9th SigDial Workshop on Discourse and Dialogue (SIGDIAL), Columbus, OH, USA. 29–36.
Lascarides, A. & Asher, N. The Interpretation of Questions in Dialogue. In A. Riester and T. Solstad (eds.), Proceedings of Sinn Und Bedeutung 13, forthcoming.
Matheson, C., Poesio, M. & Traum, D. (2000), ‘Modelling grounding and discourse obligations using update rules’. In Proceedings of the First Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), Seattle, USA. 2–9.
Poesio, M. & Traum, D. (1997), ‘Conversational actions and discourse situations’. Computational Intelligence 13:309–347.
Poesio, M. & Traum, D. (1998), ‘Towards an axiomatisation of dialogue acts’. In J. Hulstijn and A. Nijholt (eds.), Proceedings of the Twente Workshop on the Formal Semantics and Pragmatics of Dialogue. Enschede. 207–227.
Polanyi, L. (1985), ‘A theory of discourse structure and discourse coherence’. In P. D. Kroeber, W. H. Eilfort and K. L. Peterson (eds.), Papers from the General Session at the 21st Regional Meeting of the Chicago Linguistics Society, Chicago.
Rooth, M. (1992), ‘A theory of focus interpretation’. Natural Language Semantics 1:75–116.
Sacks, H., Schegloff, E. & Jefferson, G. (1974), ‘A simplest systematics for the organization of turn-taking in conversation’. Language 50:696–735.
Steedman, M. (2000), The Syntactic Process. MIT Press, Cambridge, MA, USA.
Traum, D. (1994), A Computational Theory of Grounding in Natural Language Conversation. Ph.D. thesis, Computer Science Department, University of Rochester.
van der Sandt, R. (2001), ‘Presuppositional denials’. In Proceedings of the 5th International Workshop on Formal Semantics and Pragmatics of Dialogue (BiDialog), Bielefeld. 107–28.
van Eijck, J. & Kamp, H. (1997), ‘Representing discourse in context’. In Johan van Benthem and Alice ter Meulen (eds.), Handbook of Logic and Linguistics. Elsevier, Amsterdam, NL. 179–237.
van Leusen, N. (1994), ‘The interpretation of correction’. In P. Bosch and R. van der Sandt (eds.), Focus and Natural Language Processing. Cambridge University Press, Cambridge, UK.
Wahlster, W. (ed.) (2000), Verbmobil: Foundations of Speech-to-Speech Translation. Springer, Berlin, Germany.
Walker, M. (1996), ‘Inferring acceptance and rejection in dialogue by default rules of inference’. Language and Speech 39.
First version received: 08.01.2008
Second version received: 10.10.2008
Accepted: 17.11.2008
Journal of Semantics 26: 159–184 doi:10.1093/jos/ffp001 Advance Access publication March 3, 2009
Multiple Focus
SIGRID BECK Universität Tübingen
SHRAVAN VASISHTH Universität Potsdam
Abstract
This paper presents the results of an experimental study on multiple focus configurations, that is, structures containing two nested focus-sensitive operators plus two foci supposed to associate with those operators. There has been controversial discussion in the semantic literature regarding whether or not an interpretation is acceptable that corresponds to this association. While the data are unclear, the issue is of considerable theoretical significance, as it distinguishes between the available theories of focus interpretation. Some theories (e.g. Rooth’s 1992) predict such a pattern of association with focus to be impossible, while others (such as Wold’s 1996) predict it to be acceptable. The results of our study show the data to be unacceptable rather than acceptable, favouring important aspects of the theory of focus interpretation developed by Rooth.
1 INTRODUCTION
The semantic literature on focus debates the question of whether association with focus can skip an intervening focus-sensitive operator or not, and if it can, under what circumstances this is possible. An example that illustrates the point is given in (1) (from Rooth 1996a).
(1) We only introduced [Marilyn]F to John Kennedy. We also only introduced [Marilyn]F to [Bob Kennedy]F.
‘Another person who we introduced only Marilyn to is Bob Kennedy.’
The second sentence in (1) on the intended interpretation instantiates the situation in (2). A focus-sensitive operator Op1 should associate with a focus F1 that is c-commanded by a closer focus-sensitive operator Op2 (which comes with its own focus F2). The question we address in this paper is whether an interpretation reflecting this pattern of association exists.
(2) Op1 . . . [ Op2 [X . . . F2 . . . F1 . . . ]]
© The Author 2009. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected].
160 Multiple Focus
2 THEORETICAL BACKGROUND
2.1. The interpretation of focus
Focus introduces alternatives.1 In (3), focus on the verb invokes alternative relations between Renate and ‘Pride and Prejudice’, for example ‘read’.
(3) Renate WATCHED ‘Pride and Prejudice’.
Those alternatives are inherited in the larger structures containing the focused item. The VP in (3) makes available alternatives like the ones in (4a), and the whole sentence, accordingly, the ones in (4b) (what alternatives precisely are invoked depends on the context; this does not concern us here).
(4) a. $\{\lambda x.\lambda w.\ x \text{ watched P\&P in } w,\ \lambda x.\lambda w.\ x \text{ read P\&P in } w\}$ = {reading P&P, watching P&P}
b. $\{\lambda w.\ \text{Renate watched P\&P in } w,\ \lambda w.\ \text{Renate read P\&P in } w\}$ = {that Renate watched P&P, that Renate read P&P}
Alternatives become relevant to semantics or pragmatics at certain points. This can be when we evaluate contrast, as in (5), or when we decide what ‘only’ quantifies over in (6).
1 Rooth (1985, 1992), Kratzer (1991), Krifka (1991). See Geurts & van der Sandt (2004) for a differing analysis, and that issue of Theoretical Linguistics and Sauerland (2005) for discussion. We present the discussion in terms of what we take to be the most standard framework. We do not think that this is crucial for the theoretical point at hand.
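As a rough illustration of the alternative set in (4a), and of how an operator such as only can quantify over it, consider the following toy model. It is a sketch under simplifying assumptions: a world is reduced to a set of (individual, relation, object) facts, properties are identified by their names, and the universal condition mirrors the lexical entry for only given as (7). All function and variable names are hypothetical.

```python
from typing import Callable, Dict, Set, Tuple

Fact = Tuple[str, str, str]            # (individual, relation, object)
World = Set[Fact]                      # a toy world: the facts that hold in it
Property = Callable[[str, World], bool]

def make_property(relation: str, obj: str) -> Property:
    """Build the property 'stands in `relation` to `obj`' (cf. lambda x. lambda w. ...)."""
    return lambda x, w: (x, relation, obj) in w

# The focus alternatives of the VP in (3), as in (4a).
ALTERNATIVES: Dict[str, Property] = {
    "watching P&P": make_property("watch", "P&P"),
    "reading P&P": make_property("read", "P&P"),
}

def only(asserted: str, alternatives: Dict[str, Property], x: str, world: World) -> bool:
    """True iff every alternative property that holds of x is the asserted one."""
    return all(name == asserted
               for name, prop in alternatives.items()
               if prop(x, world))
```

In a world containing only the fact that Renate watched ‘Pride and Prejudice’, `only("watching P&P", ALTERNATIVES, "Renate", world)` is true; adding the fact that she also read it makes the same call false.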
Conflicting views on this are reported in the literature. This is not surprising since the empirical facts which lie at the heart of the debate are subtle and have so far been evaluated largely by the traditional introspective method alone. It is well known (Schuetze 1996) that the subjective intuition of a researcher is, in difficult cases such as these, not the best way to answer an empirical question; what is needed is an objective experimental evaluation. This paper presents the results of an experimental study using stimuli representing the configuration in (2). The study shows that such configurations are very problematic; hence, a theory of association with focus ought to assume that (2) is not possible. We present the relevant background on association with focus in Section 2. Section 3 reports the experiment and Section 4 summarizes our conclusions.
Sigrid Beck and Shravan Vasishth 161
(5) A: Renate read ‘Pride and Prejudice’.
B: No, Renate WATCHED ‘Pride and Prejudice’.
(6) Renate only WATCHED ‘Pride and Prejudice’.
(5B) is an appropriate response to (5A) because the proposition expressed by (5A) is one of the alternatives that (5B) invokes. We evaluate those alternatives, that is, make use of them, when we establish the contrast relationship in (5). The alternatives become relevant (are evaluated) at sentence level. Example (6) says that there is no relevant relation between Renate and ‘Pride and Prejudice’ other than watching. This comes about through the contribution of VP-adjoined only in (7) by identifying the set of relevant properties C as the focus alternatives of the sister of only (i.e. the context would determine a variable assignment function g such that g(C) = (4a)). This results in (6′) as the meaning of (6), the intuitive interpretation (compare Rooth 1985, 1992).
(7) $[\![\text{only}_C]\!]^g = \lambda Q.\lambda x.\lambda w.\ \forall P[g(C)(P) \mathbin{\&} P(x)(w) \rightarrow P = Q]$
(6′) $\lambda w.\ \forall P[P \in \{\text{reading P\&P}, \text{watching P\&P}\} \mathbin{\&} P(\text{Renate})(w) \rightarrow P = \text{watching P\&P}]$
‘the only property out of {reading P&P, watching P&P} that Renate has is watching P&P’
Operators that make use of focus alternatives in their semantics are called focus-sensitive operators. These could be focus-sensitive adverbs like only, also and even and discourse-level operators (compare e.g. Krifka 1995); or we could endorse a more abstract theory like Rooth’s (1992), according to which a single operator (the ~ operator) always handles the interface between focus and focus-sensitive operators such as only. But this is not the issue we want to discuss. The question raised in this paper concerns a particular aspect of the mechanism of focus evaluation.2 Suppose that we have such an operator that evaluates focus alternatives, and suppose that there is more than one focus in its scope. Does the operator necessarily evaluate all foci? Or can it use one and ignore the others? Are the alternatives introduced by these foci passed on for the calculation of alternative sets above the focus-sensitive operator? Or would the alternative set for the whole sentence in (6), for example, be the set in (8) in which the focus on watch is used up and no longer affects the construction of alternative sets above only?
(8) {that Renate only watched ‘Pride and Prejudice’}
2 The term ‘focus evaluation’ is from Beck (2007). It is preferred to the term Rooth (1992) uses, ‘focus interpretation’, because the latter could equally well refer to the role that focus has of introducing alternatives, while we want to talk about what happens when these alternatives are used in some further semantic calculation. Focus evaluation is also preferred over ‘association with focus’ in our theoretical discussion because ‘association’ tends to be used to describe intuitively available interpretations for operators like only, always and most. We want to remain neutral between a theory of direct association and an indirect one like Rooth’s (1992), and we want to specifically refer to a mechanism of the grammar, not an interpretive effect.
We will consider two possible theories of focus evaluation:
(A) Focus evaluation affects all foci in the scope of the evaluating operator. Alternatives triggered by these foci are not passed on for the construction of alternative sets higher in the structure.
(B) Focus evaluation is selective: it can use (and use up) one focus and pass on the others.
(A) is implemented in Rooth’s (1992) theory, and (B) in Wold (1996) [for the detailed semantic analyses from which (A) and (B) follow, see these papers]. Let us reconsider the situation in (2) (repeated from above), and the predictions that (A) and (B) make about sentences in which more than one focus and more than one evaluating operator occur and stand in the structural configuration indicated:
(2) Op1 . . . [ Op2 [X . . . F2 . . . F1 . . . ]]
Theory (A) predicts that the alternatives introduced by F1 cannot be used by Op1 because they are already used by Op2 and then forgotten for the purposes of the alternative sets for the larger structures. Association with focus should never be possible across an intervening operator. Theory (B) predicts that association of Op1 with F1 is fine. The next subsection examines the literature on this subject and illustrates the theoretical question raised here with some examples.
2.2. The literature on multiple focus
It is claimed in the literature (e.g. Krifka 1991; Rooth 1996a) that a focus can skip one focus-sensitive operator and associate with a higher one. Example (9) is from Krifka (1991), and we repeat Rooth’s (1996a) example (10b) (= (1) above), which is claimed to allow the interpretation described in the paraphrase.3
3 Note that ‘Marilyn’ in (10b) bears a second occurrence focus (SOF): it does not typically carry a pitch accent, but the semantically required F feature is realized phonologically by more subtle means like duration. Compare, for example, Rooth (1996b) and Beaver et al. (2007) on SOF. The issue of SOF is not itself relevant for us; we will assume, in accordance with the recent literature, that SOF is formally marked. We will use F to indicate that.
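The contrast between theories (A) and (B) for configuration (2) can be made concrete with a small sketch. It is an illustration only, not an implementation of Rooth’s or Wold’s semantics: operators are listed from highest to lowest, each operator is paired with the set of foci in its scope, and, for the selective theory, each focus carries the index of the operator it is meant to associate with. All names are hypothetical.

```python
from typing import Dict, List, Set

def evaluate_unselective(operators: List[str],
                         scope: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Theory (A): an operator uses up every focus in its scope, so nothing
    is passed on to operators higher in the structure."""
    used: Set[str] = set()
    result: Dict[str, Set[str]] = {}
    for op in reversed(operators):        # evaluate the lowest operator first
        result[op] = scope[op] - used     # only foci not yet used up are visible
        used |= result[op]
    return result

def evaluate_selective(operators: List[str],
                       scope: Dict[str, Set[str]],
                       index: Dict[str, str]) -> Dict[str, Set[str]]:
    """Theory (B): an operator evaluates only the coindexed foci in its scope."""
    return {op: {f for f in scope[op] if index[f] == op} for op in operators}

# Configuration (2): Op1 . . . [ Op2 [X . . . F2 . . . F1 . . . ]]
OPERATORS = ["Op1", "Op2"]                               # highest first
SCOPE = {"Op1": {"F1", "F2"}, "Op2": {"F1", "F2"}}
INDEX = {"F1": "Op1", "F2": "Op2"}
```

Under these assumptions, `evaluate_unselective(OPERATORS, SCOPE)` leaves Op1 with no focus to associate with, because Op2 has already used both F1 and F2, whereas `evaluate_selective(OPERATORS, SCOPE, INDEX)` pairs Op1 with F1 and Op2 with F2.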
(9) John also only drank WATER.
(10) a. We only introduced [Marilyn]F to John Kennedy.
b. We also only introduced [Marilyn]F to [Bob Kennedy]F.
‘Another person who we introduced only Marilyn to is Bob Kennedy.’
We know that the focus on ‘Bob Kennedy’ skips a focus-sensitive operator because only obligatorily associates with focus (here: ‘Marilyn’), but ‘Bob Kennedy’ associates with the structurally higher also in the interpretation paraphrased. The structure according to Rooth’s (1992) theory would amount to (10′), with the association indicated. This is an instance of (2), notice, since there is a constituent containing both foci which is c-commanded by Op2 (only), and Op1 (also) in turn c-commands Op2.
(10′) [ alsoC [ onlyD [X we introduced MarilynF2 to [Bob Kennedy]F1]]] (association: also with F1, only with F2)
The theory of Rooth (1992) [amounting basically to theory (A)] predicts the association of also and ‘Bob Kennedy’ to be impossible, contrary to the intuition reported in Rooth (1996a). Rooth (1996a) considers the alternative LF in (10″) for the example. Here, ‘Bob Kennedy’ has moved out of the c-command domain of only at LF and is now free to associate with also. The structure (10″) no longer instantiates (2). Since we know independently that phrases can move at LF, nothing precludes (10″) as a possible LF of (10b), and we do after all derive the relevant reading (so Rooth argues).
(10″) [ alsoC [ [Bob Kennedy]F1 [3[ onlyD [I introduced MarilynF2 to t3]]]]] (association: also with F1, only with F2)
This makes the prediction that skipping an intervening focus-sensitive operator should be possible only when movement can come to the rescue. Rooth tests this prediction with (11), where the focus is embedded inside a relative clause (an island for movement).
(11) a. We only recovered the diary entries that MARILYN made about John.
b. We also only recovered [the diary entries [that Marilyn made about BOBBY]]
‘Another person such that we recovered only Marilyn’s diary entries about him is Bobby.’
Rooth reports that association with also is still possible, and leaves the example as a problem for a restrictive theory of movement. Krifka
(2006) points out that this example does not establish unambiguously that ‘Bobby’ is inside the island, observing that ‘about Bobby’ would be interpretable outside the relative clause. He argues on the basis of further data that island effects do show up in that, when both foci are clearly inside the island, association with two different operators is bad. Wold (1996), on the other hand, is led by the kinds of data discussed here to the suggestion that focus evaluation is not, after all, unselective in that it affects all foci in its scope. He develops a version of the theory in which the evaluating operator itself bears an index, and evaluates only the contribution of coindexed foci. A representation of (10b) would then look as in (12).
(12) [ also1 C [ only2 D [I introduced MarilynF2 to [Bob Kennedy]F1]]]
See Wold (1996) for semantic details. Suffice it to say that the indexed operator only uses those foci that bear the same index. This predicts that association of focus across intervening focus-sensitive operators is completely free. We have called this theory (B) above. However, von Fintel (1994: 49, Fn 44) observes that when the order of only and also is reversed, the relevant reading is completely impossible. His example is (13B2). This is not what we expect under either Rooth’s movement theory or Wold’s theory.
(13) A: I know that John drank water at the party. What else did he drink?
B1: Besides water he only drank [CARrot juice]F.
B2: #He only also drank [CARrot juice]F.
Beck (2006) also claims that multiple focus is not as freely possible as theory (B) would have it. She tests association of a focus-sensitive operator across ‘only’ and negation in English (with seven speakers) and German (with ten speakers) in an informal survey. One of her English examples is given below. According to her results, many speakers reject such association, though four out of seven English speakers accepted association across ‘only’ in (14) as indicated by the interpretation given below (14B), fixed by the context. Beck (2006) did not find any island effects.
(14) A: You only told THE BOSS that Maria met Sally.
B: Right. I also only told the boss that Maria met BILL.
%Another person such that I told only the boss that Maria met him is Bill.
There is thus no agreement on the data, which obviously are fairly subtle. At the same time, the issue is important for our view of how the
evaluation of focus in natural language proceeds. Competing theories of focus interpretation are on the market (e.g. Rooth 1992, 1996a; Wold 1996; Geurts & van der Sandt 2004; Sauerland 2005; Krifka 2006), and this empirical domain potentially differentiates between them (compare for recent discussion e.g. Büring 2006; Rooth 2006). It is the purpose of the study reported in the next section to gather reliable empirical evidence that will help us to decide between the different theories. It should be stressed that while there has recently been a fair amount of research done on sentences containing more than one focus (see e.g. Büring 2006; Féry & Ishihara 2005; Rooth 2006; Beaver et al. 2007), the multiple-focus configuration (2) has not been systematically tested empirically. This distinguishes the present investigation from others. In many examples in the recent literature, the primary focus occurs outside the domain of the second occurrence focus (SOF) evaluating operator [data discussed in Féry & Ishihara 2005; Büring 2006; Rooth 2006; (15) is taken from Rooth’s paper]. Such data are fine; for instance, the structure of (15b) does not instantiate (2).
(15) a. Eva only gave xerox copies to the GRADUATE students. No, PETR only gave xerox copies to the [graduate]SOF students.
b. Mary only STEAMS vegetables, and even JOHN only [steams]SOF vegetables.
[ evenC [ JohnF1 [ onlyD [ steamsF2 vegetables]]]] (association: even with F1, only with F2)
(2) Op1 . . . [ Op2 [X . . . F2 . . . F1 . . . ]]
We now turn to the experiment that will help us decide upon the acceptability of structures instantiating (2).
3 THE EXPERIMENT
3.1. Method
3.1.1. Participants
Sixteen native English speakers were recruited from Berlin and surrounding areas for this study, and they were paid twenty Euros for completing the experiment; the experiment was conducted in Berlin (in a laboratory provided by the Zentrum für Allgemeine Sprachwissenschaft) and Potsdam (Vasishth Language Processing Lab, Institute for Linguistics). Both laboratories were sound-proof. The mean age of participants was 28.87 years, SE = 1.85
(median 28, minimum 20, maximum 46, first quartile 23, third quartile 34); eight were females and seven males; 12 were from the USA and the remainder were from the UK. The median number of non-native languages spoken by them was 1. The distribution of foreign languages was French: five speakers; Spanish: four speakers; German: 10 speakers; Japanese: one speaker; Italian: one speaker. Of the 16 participants, three failed to complete the full experiment; they completed only the first session of the experiment (described below).
3.1.2. Procedure and stimuli
The software Linger (http://tedlab.mit.edu/~dr/Linger/) was used for presenting items to participants. At the start of the experiment, participants were seated in front of a computer (approximately 45 cm from the screen) and presented with instructions on the computer screen (the instructions are provided in Appendix A). Then they were shown a discourse context on the screen; the context was also made available to them in printed form on a piece of paper. Participants were instructed that the sentences they would hear during the experiment would refer to the situation described in this text. Participants were allowed to look at the text as long as necessary in order to internalize the context, and during the experiment (but before they started a trial), they were also allowed to look at the context again if they needed to refresh their memory. There was only one context situation for the experiment. When the participant decided that he/she had comprehended the context and was ready to carry out the task, he/she pressed the space bar of the computer keyboard and heard a dialogue through headphones. The dialogue consisted of a pair of sentences that was either a target, a control (these are described below) or a distractor. After hearing the dialogue, the computer screen prompted the participant for an acceptability judgment rating; a rating of 1 indicated that the sentence was completely unacceptable and 4 that it was perfectly acceptable. The intermediate ratings 2 and 3 were to be used for judgments that fell between these two extremes. The first five dialogues heard by participants were practice items, and after that the stimulus items were presented, pseudorandomly interspersed with 20 distractor sentences that had no relationship with the research question. Of these, four contained ungrammatical sentences; the purpose of including these was to ensure that participants were attending to the task.
If any participant were to rate the ungrammatical sentences as acceptable, their responses would not be considered reliable. As discussed in Appendix A, the ungrammatical fillers received a uniformly low rating (in the range 1.07–1.50).
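The ordering of trials described above (five practice items first, then stimulus items interspersed with 20 distractors) can be sketched as follows. This is not the Linger configuration used in the study: the function name, the seed and the use of a plain shuffle are assumptions made for illustration, and any further constraints on the pseudorandom order are not stated in the text.

```python
import random
from typing import List

def build_trial_list(practice: List[str],
                     stimuli: List[str],
                     distractors: List[str],
                     seed: int = 0) -> List[str]:
    """Return practice items in fixed order, followed by a shuffled mix of
    stimulus and distractor items (a plain shuffle, assumed for illustration)."""
    rng = random.Random(seed)             # seeded so that the order is reproducible
    main_block = list(stimuli) + list(distractors)
    rng.shuffle(main_block)               # intersperse stimuli and distractors
    return list(practice) + main_block
```

A call with five practice labels, a list of stimulus labels and 20 distractor labels returns a list whose first five entries are the practice items and whose remainder contains every stimulus and distractor exactly once.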
Sigrid Beck and Shravan Vasishth 167
The dialogues were recorded by two native speakers from the USA. They were graduate students in linguistics at the University of Potsdam, but were naive as to the goals of the experiment. The sentences they produced were checked for anomalies in intonation by a phonetician.

The experiment stimuli are described next. Three focus-sensitive operators were tested: only, also and even. These were crossed with four interveners: only, also, even and nobody. Nine combinations were chosen with the configuration shown in (2).4 The F0 contours for the nine target sentences are shown in Appendix B.

(2) Op1 . . . [ Op2 [X . . . F2 . . . F1 . . . ]]
(i) Op1: also Op2: only
(ii) Op1: also Op2: nobody
(iii) Op1: also Op2: even
(iv) Op1: only Op2: also
(v) Op1: only Op2: nobody
(vi) Op1: only Op2: even
(vii) Op1: even Op2: only
(viii) Op1: even Op2: nobody
(ix) Op1: even Op2: also

The purpose of the experiment was to determine the acceptability of the target structure under the target interpretation, not the acceptability of the target structure as such. Hence, the stimulus items were presented in a context that unambiguously fixed the interpretation to the association pattern we are interested in. (16) is the overall context for (17), the first test set. In the dialogue in (17), the second sentence instantiates the configuration (2) with Op1 = also and Op2 = only. We represent both regular focus and second occurrence focus with the subscript F in (17) and the following examples.

(16) Context
A and B are detectives in the San Francisco police force. They are building a case against the well-known director of a local bank. Their main evidence consists of a set of photographs and a video showing the suspect in incriminating circumstances. Their assistants are trying to find eye witnesses to the events taking place in the pictures. Unfortunately, there has been a leak to the press. A and B are trying to find out how the information could have gotten out.

4 We tested negation (in the shape of ‘nobody’) as an intervener because it is a well-known problematic intervener (e.g. Beck 2006), but we did not test it as an associating operator because it is not clear that negation associates with focus (compare the discussion in Rooth 1996a). Hence the nine combinations.

Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
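The count of nine combinations can be checked with a few lines of Python. The sketch assumes that a pair was included whenever Op1 and Op2 are different words; the paper only lists the nine pairs, so this selection rule is an inference that happens to reproduce the list, and the variable names are illustrative.

```python
from itertools import product

operators = ["also", "only", "even"]              # tested as Op1
interveners = ["only", "also", "even", "nobody"]  # tested as Op2

# Assumed rule: keep a pair whenever Op1 and Op2 are different words.
combinations = [
    (op1, op2)
    for op1, op2 in product(operators, interveners)
    if op1 != op2
]
```

Three operators crossed with four interveners give twelve pairs, and removing the three same-word pairs leaves the nine configurations (i)-(ix), in a different order from the list in the text.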
(17) Target
A: You only showed [the photos]F to Carol.
B: Right. I also only showed [the photos]F to RobinF.
[also . . . [ only [X . . . F2 . . . F1 . . . ]] . . . ]

The same participants that rated the target items were also presented with control items. The purpose of the control item was to ensure that each participant’s baseline acceptability for the interpretation of a semantically identical counterpart of the double-focus condition was known. Since we do not test for acceptability of structure, but rather for acceptability of a particular interpretation for that structure, the control items are paraphrases of the intended interpretation of the corresponding target item, but in which the multiple-focus configuration was avoided. That is, the interpretation was kept constant but the structure attempting to convey it was changed. An example of the control items is (18), which is a paraphrase of the target (17).

(18) Control
A: You only showed [the photos]F to Carol.
B: Right. Another person that I only showed [the photos]F to is RobinF.
[another . . . [ only [. . . F1 . . .]] . . . F2 . . .]

The presentation of the control sentences was carried out in a separate session, this session being separated from the first by 4 weeks. In order to control for any effect of presentation order, half the participants saw the target stimuli in the first session and the control stimuli in the second, and the other half saw the targets and controls in the opposite sequence. We separated the presentation of the target stimuli from the control stimuli because showing both sentences in the same session could possibly introduce a priming effect. For example, presenting the semantically identical items (17B) and (18B) in the same session could bias the participants’ judgment of whichever item occurred second. By separating out the targets and controls into separate sessions, we were able to avoid this possible bias.

3.2 A note on the dependent measure: gradient v. categorical grammaticality ratings

It is necessary here to briefly explain the interpretation of the dependent measure that we adopt in this paper. As mentioned above, we used a 4-point scale to obtain a judgment from participants about the grammaticality status of the theoretically interesting constructions. Due to the controversy in the literature about multiple-focus constructions, we expected the response to be gradient. We treat the gradient response in the experiment as an approximation of the (arbitrary) binary distinction of grammatical v. ungrammatical that is often used in linguistics. The only reason for this reductionist interpretation is that the theoretical debate we address in this paper is centered around a binary empirical decision: is the multiple-focus configuration grammatical or not?
3.3 A note on the statistical analysis

Data analysis was carried out using the linear mixed-effects (multilevel) model or LME (Bates and Sarkar 2006) available as the package lme4 in the R programming environment (R Development Core Team 2006). In the pairwise comparisons (discussed below), participants were treated as a random factor (sometimes referred to as random effect) and target v. control as the fixed factor (or fixed effect). LME models have several advantages over traditional repeated-measures analyses of variance (ANOVA) and are becoming standard in experimental research, including psycholinguistics. This is evident from several recent psycholinguistics articles (e.g. Quené & van der Berg 2001, 2004; Bresnan et al. 2007; Oberauer & Kliegl 2006; Vasishth & Lewis 2006) and books (e.g. Snijders & Bosker 1999; Baayen 2008) that demonstrate the many advantages of this technique (see http://lme4.r-forge.r-project.org/bib/lme4bib.html for a complete bibliography spanning various experimental domains).

LMEs are beginning to replace the traditional repeated-measures ANOVA for several reasons. One is that the computational tools they rely on (e.g. Markov chain Monte Carlo techniques) have become feasible relatively recently; another is that the traditional by-participants and by-items (and Min-F) calculation of ANOVA is not necessary in LMEs because participant and item-level variation can be taken simultaneously into account in the model. This considerably simplifies the presentation and interpretation of results (Quené & van der Berg 2001) and increases statistical power (Baayen 2008).

Some further important properties of LMEs are relevant for psycholinguistic research. For example, they are more flexible when modelling diverse sources of heterogeneity and correlation, and they are able to model unbalanced and incomplete repeated-measures data. Since lack of balance (due, e.g. to missing data points) is quite common in psycholinguistic data (e.g. in the present experiment three participants failed to complete the second session), LMEs are an important alternative to traditional methods (such as introducing various corrections in ANOVA, or dropping the incomplete data points) because they result in greater statistical power. In order to demonstrate the differences between mixed-effects modelling and traditional repeated-measures ANOVA, we computed all analyses using traditional repeated-measures ANOVA as well (participants with missing data were removed from the analysis). As shown below, in several pairwise comparisons, the ANOVA did not show a significant difference although LME did.

In order to allow the reader to take a closer look at the details of the data analysis, the complete dataset and code used for statistical analysis are released along with this paper; they are downloadable from http://www.ling.uni-potsdam.de/~vasishth/BeckVasishth/. Also released at the same website are (a) all sound files containing the targets, controls and fillers and (b) PDFs showing the F0 pitch contours for the targets. The F0 pitch contours are also included in Appendix B.

3.4. Results

In order to compare the relative acceptability of each of the focus configurations, it was necessary to compare the response of each target with its control. This results in nine pairwise comparisons. Figure 1 shows the mean ratings for the target and control conditions. Overall, control items were rated significantly better (mean 2.35) than targets (mean 1.93), t = 3.96, P < 0.01 [repeated measures ANOVA, by-participants F1(1,12) = 9.15, P = 0.01, MSE = 10.26, by-items F2(1,8) = 3.45, P = 0.10, MSE = 2.97]. The LME model was fit with target v. control as a fixed factor, and participants and items as crossed random factors.

Figure 1 Mean ratings for target and control conditions for the nine pairs. A rating of 1 corresponds to the judgement ‘‘completely unacceptable’’ and the rating 4 corresponds to the judgement ‘‘perfectly acceptable’’.
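The data preparation that a traditional by-participants (F1) or by-items (F2) ANOVA requires can be illustrated with a short Python sketch: ratings are averaged per participant (or per item) within each condition, and any participant lacking one of the two conditions is dropped. The rows are invented toy data, and the function and field names are assumptions made for the example; this is not the released analysis code.

```python
from collections import defaultdict
from statistics import mean

def condition_means(rows, unit):
    """Average ratings per (unit, condition); unit is 'participant' or 'item'."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[unit], row["condition"])].append(row["rating"])
    return {key: mean(values) for key, values in cells.items()}

def complete_units(means, conditions=("target", "control")):
    """Keep only units that have a mean for every condition."""
    units = {u for u, _ in means}
    return sorted(u for u in units if all((u, c) in means for c in conditions))

# Invented toy data: participant p3 missed the control session.
rows = [
    {"participant": "p1", "item": "also-only", "condition": "target", "rating": 1},
    {"participant": "p1", "item": "also-only", "condition": "control", "rating": 3},
    {"participant": "p2", "item": "also-only", "condition": "target", "rating": 2},
    {"participant": "p2", "item": "also-only", "condition": "control", "rating": 2},
    {"participant": "p3", "item": "also-only", "condition": "target", "rating": 1},
]
by_participant = condition_means(rows, "participant")
by_item = condition_means(rows, "item")
kept = complete_units(by_participant)
```

In this toy case the by-participants analysis would discard p3 entirely, whereas a mixed-effects model can still use that participant's target rating, which is the gain in power the text refers to.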
Figure 2 Estimated coefficients and 95% confidence intervals for the nine target–control comparisons; the regression was carried out using linear mixed-effects (LME) models, with target–control as a fixed effect and participants as a random effect. The interpretation of the graph is discussed in the text.
We turn next to pairwise comparisons between each target and its corresponding control. Presented in Figure 2 are the regression coefficients and 95% confidence intervals based on the LME analysis (participants were treated as a random factor). A coefficient whose confidence interval does not cross the zero line is statistically significant. A significant coefficient with a positive sign means that the control was rated better than the corresponding target; a significant negative coefficient means that the target was rated better than the control. We present the results as regression coefficients rather than conventional t and P values but, as mentioned earlier, we also present the results of a conventional ANOVA for each comparison (note that for each target–control comparison, the ANOVA refers to by-participants analyses; there is only one item in each comparison).

The LME analysis showed that, consistent with Theory A but inconsistent with Theory B, several of the double-focus configurations were rated significantly worse than the corresponding controls. These were also-only [F(1,12) = 32.88, P < 0.01, MSE = 15.39], only-also [F(1,12) = 14, P < 0.01, MSE = 7.54], only-nobody [this comparison was marginal in the conventional repeated measures ANOVA, F(1,12) = 3.6, P = 0.08, MSE = 1.38], only-even [F(1,12) = 7.5, P = 0.01, MSE = 0.96] and even-nobody [this was not significant in the conventional ANOVA, F(1,12) < 1]. In the even-only comparison, the target double-focus configuration was marginally worse than the control [F(1,12) = 2.54, P = 0.13, MSE = 0.96], and no significant differences were found between also-nobody and also-even (Fs < 1). Finally, even-also was rated significantly better than the control [F(1,12) = 11.52, P = 0.005, MSE = 4.65]; however, upon closer examination, it was found that the control had an incorrect name in the critical sentence (rendering it inappropriate in the context in which it was presented); this would explain the lower rating for the control.
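For a single target-control pair rated by the same participants, the by-participants ANOVA with two conditions reduces to a paired comparison: F(1, n-1) is the square of the paired t statistic. The Python sketch below illustrates this, together with the sign convention and the confidence-interval reading used for Figure 2 (a positive difference means the control was rated better; an interval that excludes zero counts as significant). It is a simplification and not the LME analysis reported here; the ratings are invented, and the function name and hard-coded critical value are assumptions.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT_DF12 = 2.179  # two-sided 5% critical value of t with 12 degrees of freedom

def paired_comparison(control, target, t_crit=T_CRIT_DF12):
    """Compare two conditions rated by the same participants.

    Returns the mean difference (control - target), t, F = t**2 and a 95% CI.
    The default t_crit is only valid for 13 participants (df = 12).
    """
    diffs = [c - t for c, t in zip(control, target)]
    n = len(diffs)
    diff = mean(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    t = diff / se
    ci = (diff - t_crit * se, diff + t_crit * se)
    return {
        "diff": diff,
        "t": t,
        "F": t * t,
        "df": (1, n - 1),
        "ci": ci,
        "significant": ci[0] > 0 or ci[1] < 0,  # interval excludes zero
    }

# Invented ratings for 13 hypothetical participants (not the published data).
control = [3, 3, 2, 4, 3, 2, 3, 3, 4, 2, 3, 3, 2]
target = [1, 2, 1, 2, 2, 1, 2, 1, 2, 2, 1, 2, 1]
result = paired_comparison(control, target)
```

With complete data from 13 participants the degrees of freedom are (1, 12), matching the F values reported for the pairwise comparisons.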
3.5. Discussion
To summarize the results, five of the nine comparisons showed that double-focus configurations were rated worse than their corresponding controls, and one comparison was rated marginally worse than its control. Of the remaining three comparisons, two did not show any significant effects, and the remaining comparison showed that the control was rated worse than the target, but this is probably due to the error in the control item. Thus, only eight comparisons are relevant for interpreting the results. No target item received a mean rating above 2.69 (even-also), which is approximately the middle of the rating scale. This suggests that the multiple-focus configuration (2) tends towards being unacceptable rather than acceptable. Thus, we may conclude that the degraded status of the target items is to a significant extent due to the configuration (2), not the nature of the meaning conveyed. A further study might clarify whether conditions also—even and even—also are really better than the other instances of the multiple-focus schema (2). It is not expected from the representative theories discussed in this paper (Rooth 1992; Wold 1996) that the choice of focus operator should matter. If, however, Beaver & Clark (2003) are right and not all focus-sensitive operators are alike, then there is a possibility that a more nuanced differentiation is necessary for multiple-focus configurations. This is an empirical question that must be left for future research. The main purpose of this study was to empirically address the question whether the multiple-focus configuration violates a principle of grammar. Our answer is yes. However, as a reviewer points out, our results are also consistent with an alternative explanation: focus association in multiple-focus configurations might reflect a constraint on human sentence comprehension that is independent of the grammar of focus interpretation. We discuss this possibility next. 
The comprehension process underlying association with focus is an instance of a more general and very well-studied psycholinguistic phenomenon: dependency resolution. Under one commonly held view (Just & Carpenter 1980, 1992; Van Dyke & Lewis 2003; Lewis & Vasishth 2005; Lewis et al. 2006; Van Dyke & McElree 2006; Vasishth & Lewis 2006), at least some types of dependency resolution can be characterized as a cue-driven retrieval process (for related work, see Gibson 1998, 2000; Grodner & Gibson 2005; Gordon et al. 2006). To take a concrete example as an illustration, in the computational model of parsing presented by Lewis & Vasishth (2005), a verb seeking an argument (such as its subject) sets cues (a specification of particular feature-values such as person, number, gender, animacy and/or case agreement) in order to retrieve the target argument. The speed with which this retrieval is completed (the retrieval latency) and the likelihood of a successful retrieval are determined by, among other factors, the number of other nouns in working memory that have similar features. The greater the number of similar nouns, the more difficult it is to retrieve the target noun.

This process of cue-based retrieval has been shown to be quite generally applicable in explaining constraints on dependency resolution. For example, the dependency resolution needed for processing certain kinds of negative and positive polarity items has also been argued to be subject to similar constraints during online processing (Vasishth et al. 2008). Thus, a cue-based retrieval mechanism can also be invoked to explain the increased difficulty in comprehending multiple-focus configurations. Let us revisit an earlier example to see how such a processing account would work. Consider again (10b), repeated below:

(10b) We also only introduced [Marilyn]F to [Bob Kennedy]F.

The prosodically focused phrase Bob Kennedy would set retrieval cues for a c-commanding focus operator in order to complete the dependency. However, at this point we need to decide what these retrieval cues should be. In order to derive the predictions of this processing account, we have no choice but to consult semantic theory. Semantic theory suggests two alternatives: Theory A and B. Under Theory A, the search for a c-commanding focus operator would result in greater retrieval difficulty (or outright retrieval failures) because there are two candidate focus operators. This greater retrieval difficulty could result in the lower acceptability ratings we found (assuming that online processing difficulty is reflected in offline acceptability judgments; we return to this point below). By contrast, under Theory B, each focus operator has a unique index associated with it that corresponds to the index for the focused words. Under this assumption, the focused phrase would seek out a focus operator with a unique index, and since there is only one focus operator in memory that has this particular index, retrieval should be faster and more accurate. This would (contrary to our findings) be reflected in higher acceptability ratings.
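The contrast between the two cue specifications can be made concrete with a deliberately simplified, ACT-R-style calculation in Python. In this sketch each retrieval cue spreads activation to matching items, the strength of a cue falls with the number of items that match it (its fan), and latency decreases exponentially with activation. Base-level activation, noise and c-command are all ignored, the parameter values are arbitrary illustrative choices, and the feature names are hypothetical; it should not be read as the Lewis & Vasishth (2005) model itself.

```python
from math import exp, log

def retrieval_latency(target, memory, cues, max_strength=1.5, latency_factor=0.14):
    """Simplified ACT-R-style latency for retrieving `target` given `cues`.

    Each cue's associative strength shrinks with its fan, i.e. the number of
    items in memory that match the cue. Base-level activation and noise are
    ignored (an assumption made to keep the sketch small).
    """
    weight = 1.0 / len(cues)  # source activation shared equally among cues
    activation = 0.0
    for feature, value in cues.items():
        if target.get(feature) != value:
            continue  # mismatching cues contribute nothing in this sketch
        fan = sum(1 for item in memory if item.get(feature) == value)
        activation += weight * (max_strength - log(fan))
    return latency_factor * exp(-activation)

# Two focus operators held in memory, as in (10b); indices are hypothetical.
also = {"category": "focus-operator", "index": 1}
only = {"category": "focus-operator", "index": 2}
memory = [also, only]

# Theory A-like cue: any focus operator (both operators match the cue).
latency_a = retrieval_latency(also, memory, {"category": "focus-operator"})
# Theory B-like cues: a focus operator carrying the index of the focused phrase.
latency_b = retrieval_latency(also, memory, {"category": "focus-operator", "index": 1})
```

Under these assumptions the uniquely indexed retrieval comes out faster than the unindexed one, which mirrors the qualitative predictions attributed to Theory B and Theory A in the text.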
The important point here is that even if we were to assume that the underlying explanation for our results lies in processing, the assumptions of the processing account would still derive from assumptions about the underlying grammatical constraints. Differently put, there is no processing account available without invoking the assumptions of Theories A and B. Yet another way to look at this is that when we say that the effect is due to grammatical constraints, we do not provide a process model of how these grammatical constraints are used in online comprehension. Adding such a process model, as we do in the sketch above, would couple the grammatical mechanism with well-defined processing steps, but this does not entail that the explanation for the effect lies in processing. This is what we mean when we say that the best explanation for the results presented in this paper lies in grammatical rather than processing constraints. We are not at this point aware of a pure processing explanation that would make no reference to the mechanisms implied by Theory A or Theory B.

4. CONCLUSIONS

Since five out of the eight relevant comparisons show lower ratings for the double-focus configurations (and one showed marginally lower ratings), the evidence is consistent with Theory A but not with Theory B. There appears to be something wrong with the configuration in (2). A theory of focus interpretation should predict that such structures are ungrammatical. Rooth (1992) provides one such theory. A ban on (2) also follows a more general pattern of alternative evaluation observed in Beck (2007) (called the General Minimality Constraint), which states that the evaluation of alternatives cannot skip an intervening focus-sensitive operator.

(2) Op1 . . . [ Op2 [X . . . F2 . . . F1 . . . ]]

An interesting further question concerns free focus. Rooth’s (1992) theory makes no difference between a focus used by an operator like only and a focus that is evaluated purely for discourse purposes. An interesting example is found in Büring (2006).

(19) What did John only eat in PAris? (Schwarzschild p.c. from Büring 2006)
a. # John only ate crepes in PAris.
b. # John only ate CREpes in Paris.
c. It’s CREpes that John only ate in Paris.

In this example, the VP contains a focus that is interpreted by VP adjoined only, namely Paris. At the same time, the VP contains an object NP that needs to be focused in order to satisfy the requirements on question/answer coherence. Whatever operator evaluates question/answer coherence (we call it Q/AC here) will therefore play the role of Op1 in a structure instantiating our schema (2):

(19') [IP Q/AC [John onlyD [VP ate CREpesF1 in ParisF2]]]

The example is unacceptable, just like our multiple-focus data in the experiment. Büring assumes that focus evaluation is selective (as in the systems of Krifka 1991, Kratzer 1991 or Wold 1996), not unselective (as in Rooth 1992). He predicts the ungrammaticality from the impossibility of obeying the relevant constraints on prominence:

(20) Focus Prominence (FP): If P is the domain of a focus-sensitive operator O, the most prominent element in P is a focus of O.

The domain of only is the VP, so the most prominent element in the VP should be Paris. The domain of Q/AC is the IP, and the most prominent element in the clause should be crepes. Since the VP is contained in the IP and crepes is contained in the VP, it is impossible to meet both restrictions at the same time. Büring’s constraint will also rule out structures instantiating (2). He notes, however, that the contrast between the unacceptable (19) and the acceptable (21) poses a problem. A discourse-level operator needs to evaluate the contrast relation between grow and eat, crossing over the dependency between only and rice, which is another instance of (2). This time, the example is fine, however.

(21) a. People who grow rice only EAT rice.
b. [CONTRASTC [ . . . [onlyD [ eatF1 riceF2 ]]]]

It is not clear what distinguishes the rice sentence from the crepes sentence. See Büring (2006) and Rooth (2006) for further thoughts on the subject. Data such as (19) and (21) suggest that question/answer pairs and contrast be included in future systematic experimental studies as well.

The discussion shows that multiple-focus data like the ones investigated in this study play a crucial role in the lively theoretical debate of the form/meaning relationship and the interpretation of focus. At the same time, the facts are far from clear. We hope to have contributed a crucial piece in the empirical foundation for future theoretical development.
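The conflict that Focus Prominence (20) creates for example (19) can be verified by brute force. The Python sketch below treats prominence as a strict ranking of the words in the clause (a modelling assumption introduced here, as are the function names) and checks every possible ranking against the two requirements: Paris must be the most prominent element in the VP, the domain of only, and crepes must be the most prominent element in the IP, the domain of Q/AC.

```python
from itertools import permutations

def most_prominent(domain, ranking):
    """Return the element of `domain` that comes first in `ranking`."""
    return min(domain, key=ranking.index)

def satisfies_fp(ranking, operators):
    """Check (20): in each operator's domain, the most prominent element is its focus."""
    return all(
        most_prominent(domain, ranking) == focus
        for domain, focus in operators
    )

vp = ["ate", "crepes", "Paris"]
ip = ["John", "only"] + vp

# (domain, focus) pairs: `only` takes the VP and associates with Paris,
# Q/AC takes the IP and needs crepes as its focus.
operators_19 = [(vp, "Paris"), (ip, "crepes")]

rankings = [list(p) for p in permutations(ip)]
both = [r for r in rankings if satisfies_fp(r, operators_19)]
only_alone = [r for r in rankings if satisfies_fp(r, [(vp, "Paris")])]
```

No ranking satisfies both requirements, because anything most prominent in the IP and contained in the VP is also most prominent in the VP; each requirement taken alone is satisfiable. The sketch says nothing about why (21) is acceptable, which remains the open problem noted in the text.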
APPENDIX A

Instructions for the experiment

Welcome. Thank you for participating in this Experiment. You will hear a dialogue between two detectives. The background of the dialogue is shown below. Please read it carefully. You also have a sheet of paper in front of you with this background information. You can look at it during the experiment if you need to.

There is a male and a female detective in the San Francisco police force. They are building a case against the well-known director of a local bank. Their main evidence consists of a set of photographs and a video showing the suspect in incriminating circumstances. Their assistants are trying to find eye witnesses to the events taking place in the pictures. Unfortunately, there has been a leak to the press. The two are trying to find out how the information could have gotten out.

When you hear the response of the female detective, you need to rate her response as grammatical or ungrammatical. If you consider it fully grammatical, click on 4, and if you find it completely ungrammatical, click on 1. You can choose an intermediate number if you feel that it is somewhat grammatical or ungrammatical.

Examples:
Example 1: This sentence is a good English sentence. (fully grammatical)
Example 2: Wrong the sentence of course is. (totally ungrammatical)
Example 3: This sentence seems in a way to be neither totally grammatical nor totally ungrammatical. (somewhat ungrammatical)

Let’s try some practice . . . [Four practice trials follow.] If you have any questions, ask the experimenter now. Otherwise, you may begin the experiment.

The context sentence (a), the target sentence (b) and the control (c):

(1) a. You only showed the photos to Carol.
b. Right. I also only showed the photos to Robin.
c. Right. Another person I only showed the photos to is Robin.
(2) a. You showed the photos to Carol. Did you also show the movie to her?
b. No. I only also showed the movie to Robin.
c. No. The only person that I also showed the movie to is Robin.
(3) a. You told nobody about the photos.
b. No. I only told nobody about the movie.
c. No. It’s only the movie that I told nobody about.
(4) a. You told nobody about the movie.
b. Right. I also told nobody about the photos.
c. Right. Another thing I told nobody about is the photos.
(5) a. You told nobody about the movie.
b. Right. I even told nobody about the photos.
c. Right. Even the photos I didn’t tell anybody about.
(6) a. You only showed the photos to Carol.
b. Right. I even only showed the photos to the boss.
c. Even the boss, I only showed the photos to.
(7) a. You showed the photos to Carol.
b. Yes. I also showed the movie to her. I even also showed the movie to Carol’s assistant.
c. Yes. I also showed the movie to her. Even her assistant I showed the photos to as well.
(8) a. You showed the photos to Carol’s assistant. You even showed the movie to him.
b. No. I only even showed the movie to Carol.
c. No. The only person I even showed a movie to is Carol.
(9) a. You showed the pictures to Carol. You even showed her the movie.
b. Yes. I also even showed the movie to Robin.
c. Yes. Another person I even showed the movie to is Robin.

Fillers: Note that fillers 16, 17, 19, and 20 contain either odd or ungrammatical sentences. These items were included in order to make sure that our participants were attending to the task; the expectation was that they would rate these fillers as unacceptable. The mean ratings for all fillers are provided with the raw data and code from the website http://www.ling.uni-potsdam.de/~vasishth/BeckVasishth. As expected, these ungrammatical items were rated as unacceptable by participants (in the range 1.07–1.50).

(1) a. You did talk to the fellow from CNN about the photos.
b. Right. But I didn’t mention what was in them.
(2) a. You didn’t bring up the photos and the movie at the press conference, did you?
b. No, I cleverly avoided mentioning those when outlining the progress we’ve made.
(3) a. You might have said something to your husband about it, and he may have mentioned it to someone.
b. Right, I did tell him. But he never talks about an ongoing case of mine with anyone.
(4) a. You said yesterday that the New York Times reporter was talking to the bank guard for a long time.
b. Right, but I can’t imagine how he would know about the photos and the movie.
(5) a. So, do you think the information was leaked by one of our assistants?
b. No, I am sure they’d never talk about this to the press or anyone else.
(6) a. You really know for sure that nobody has access to our desk when we are out for lunch?
b. Yes, there are only two keys for it and we have them.
(7) a. So, do you think there could have been someone at your computer?
b. Unlikely. Nobody knows the password.
(8) a. Do you remember that guy who sat in the cafe next to us? He might have heard something.
b. Maybe, but I don’t think he has anything to do with the case.
(9) a. You think the courier made duplicates of the pictures?
b. No, I have known him for years. We can trust him.
(10) a. You sent the important files by mail, didn’t you?
b. No, I always deliver them personally.
(11) a. You wanted to ask your special informant if he knows something.
b. Right. I asked him. He has no idea at all.
(12) a. Do you think we’ll end up working on this case forever?
b. No, I think we’ll work it out pretty soon.
(13) a. You talked to that talkative eyewitness on Tuesday.
b. Possibly. But the news media would never publish uncorroborated information.
(14) a. You made some notes the other day. Has anyone seen them?
b. No, I’ve been shredding them as usual after inputting them into the computer.
(15) a. So, are you sure no-one has been eavesdropping on us?
b. Well, you can never be sure about that, but probably not.
(16) a. You can ask the reporter how he got the information?
b. Yes, I could do that. Although is he not going to tell me?
(17) a. So, do you think our boss is involved?
b. I think not he is.
(18) a. You know anyone else from the department who could be interesting for us?
b. Yes, there is sometimes this strange guy from the other side of the building.
(19) a. You have shown the pictures to your daughter, right?
b. I did never at all show her the pictures.
(20) a. You think the technician has talked about the video?
b. I don’t know. He is to our department new.

APPENDIX B
F0 contours for the nine multiple-focus pairs:
Acknowledgements

We are grateful to Kai Sippel for assistance with carrying out the experiments. Rainer Dietrich kindly provided access to his laboratory space in Berlin, and Bryan Jurish and Elizabeth Medvedovsky lent their voices for recording the experiment materials. Frank Kügler and Gerrit Kentner assisted with evaluating the intonation of the target and control stimuli, and with plotting the intonational contours. We would like to thank the organizers of the 2007 London workshop on information structure, David Adger and Daniel Harbour, and its participants, for the chance to present our study and for their comments. Finally, the anonymous reviewers and the action editor, Paul Portner, provided very detailed and thoughtful comments; responding to these has significantly improved this paper. Our thanks to them.
SIGRID BECK
Chair of Descriptive and Theoretical Linguistics
Englisches Seminar
Universität Tübingen
Wilhelmstr. 50
72074 Tübingen
Germany
e-mail: [email protected]

SHRAVAN VASISHTH
Chair of Psycholinguistics and Neurolinguistics
Department of Linguistics
University of Potsdam
Karl-Liebknecht Str 24-25
14476 Potsdam
Germany
e-mail: [email protected]

REFERENCES

Baayen, H. R. (2008), Analyzing linguistic data: A practical introduction to data analysis. Cambridge University Press, UK.
Bates, D. & Sarkar, D. (2006), The lme4 Package. (http://cran.r-project.org/src/contrib/Descriptions/lme4.html).
Beaver, David & Brady Clark (2003), ‘Always and Only: Why not all Focus Sensitive Operators are Alike’. Natural Language Semantics 11:323–62.
Beaver, David, Brady Clark, Edward Flemming, Florian Jaeger & Maria Wolters (2007), ‘When Semantics Meets Phonetics: Acoustical Studies on Second Occurrence Focus’. Language 83:245–76.
Beck, Sigrid (2006), ‘Intervention Effects Follow from Focus Interpretation’. Natural Language Semantics 14:1–56.
Beck, Sigrid (2007), ‘The grammar of focus interpretation’. In Uli Sauerland & Hans-Martin Gärtner (eds.), Interfaces + Recursion = Language? Mouton de Gruyter, Berlin. 255–80.
Bresnan, Joan, Anna Cueni, Tatiana Nikitina & Harald Baayen (2007), ‘Predicting the dative alternation’. In Bouma, G., Kraemer, I. and Zwarts, J. (eds.), Cognitive Foundations of Interpretation. KNAW, Amsterdam. 69–94.
Büring, Daniel (2006), Been There, Marked That – A Tentative Theory of Second Occurrence Focus. UCLA, MS, CA.
Féry, Caroline & Shinichiro Ishihara (2005), Interpreting Second Occurrence Focus. Potsdam, MS.
von Fintel, Kai (1994), Restrictions on Quantifier Domains. Ph.D. dissertation, University of Massachusetts at Amherst.
Geurts, Bart & Rob van der Sandt (2004), ‘Interpreting Focus’. Theoretical Linguistics 30:1–44.
Gibson, E. (1998), ‘Linguistic Complexity: Locality of Syntactic Dependencies’. Cognition 68:1–76.
Gibson, E. (2000), ‘Dependency locality theory: A distance-based theory of linguistic complexity’. In A. Marantz, Y. Miyashita & W. O’Neil (eds.), Image, Language, Brain: Papers from the First Mind Articulation Project Symposium. MIT Press, Cambridge, MA.
Gordon, P. C., Hendrick, R., Johnson, M. & Lee, Y. (2006), ‘Similarity-Based Interference During Language Comprehension: Evidence from Eye Tracking During Reading’. Journal of Experimental Psychology: Learning Memory and Cognition 32:1304–21.
Grodner, D. & Gibson, E. (2005), ‘Consequences of the Serial Nature of Linguistic Input’. Cognitive Science 29:261–90.
Just, M. & Carpenter, P. (1980), ‘A theory of reading: From eye fixations to comprehension’. Psychological Review 87:329–54.
Just, M. & Carpenter, P. (1992), ‘A capacity theory of comprehension: Individual differences in working memory’. Psychological Review 99:122–49.
Kratzer, Angelika (1991), ‘The Representation of Focus’. In von Stechow, A. & Wunderlich, D. (eds.), Semantics: A handbook of contemporary research. Mouton de Gruyter, Berlin. 825–34.
Krifka, Manfred (1991), ‘A compositional semantics for multiple focus constructions’. In Proceedings of SALT 1. Cornell, Ithaca, NY. 127–58.
Krifka, Manfred (1995), ‘The Semantics and Pragmatics of Polarity Items’. Linguistic Analysis 25:209–57.
Krifka, Manfred (2006), ‘Association with focus phrases’. In Valerie Molnar & Susanne Winkler (eds.), The Architecture of Focus. De Gruyter, Berlin, New York. 105–36.
Lewis, R. L. & Vasishth, S. (2005), ‘An activation-based model of sentence processing as skilled memory retrieval’. Cognitive Science 29:1–45.
Lewis, Richard L., Vasishth, Shravan & Van Dyke, Julie (2006), ‘Computational Principles of Working Memory in Sentence Comprehension’. Trends in Cognitive Sciences 10:447–54.
Oberauer, Klaus & Reinhold Kliegl (2006), ‘A formal model of capacity limits in working memory’. Journal of Memory and Language 55:601–26.
Quené, Hugo & van der Berg, H. (2001), ‘On Multi-Level Modeling as a Remedy Against the Language-as-Fixed-Effect Fallacy’. Manuscript.
Quené, Hugo & van der Berg, H. (2004), ‘On Multi-Level Modeling of Data from Repeated Measures Designs: A Tutorial’. Speech Communication 43:103–21.
R Development Core Team (2006), R: A Language and Environment for Statistical Computing. Vienna: Austria.
Rooth, Mats (1985), Association with Focus. Ph.D. dissertation, University of Massachusetts at Amherst.
Rooth, Mats (1992), ‘A Theory of Focus Interpretation’. Natural Language Semantics 1:75–116.
Rooth, Mats (1996a), ‘Focus’. In S. Lappin (ed.), The Handbook of Contemporary Semantic Theory. Blackwell, Oxford, UK; Cambridge, MA. 272–97.
Rooth, Mats (1996b), ‘On the Interface Principles for Intonational Focus’.
Van Dyke, J. & McElree, Brian (2006), ‘Retrieval Interference in Sentence Comprehension’. Journal of Memory and Language 55:157–66.
Vasishth, Shravan & Richard L. Lewis (2006), ‘Argument-Head Distance and Processing Complexity: Explaining Both Locality and Antilocality Effects’. Language 82:767–94.
Vasishth, Shravan, Sven Bruessow, Richard L. Lewis & Heiner Drenhaus (2008), ‘Processing Polarity: How the Ungrammatical Intrudes on the Grammatical’. Cognitive Science 32:685–712.
Wold, Dag (1996), ‘Long distance selective binding: the case of focus’. In Proceedings of SALT 6. Cornell, Ithaca, NY. 311–28.
First version received: 11.02.2008 Accepted: 01.04.2008
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
Proceedings of SALT VI. CLC Publications, Cornell University, Ithaca, NY. Rooth, Mats (2006), S-focus and Relativized Stress F. Handout, Universita¨t Tu¨bingen, June 12 2006. Sauerland, Uli (2005), ‘Don‘t Interpret Focus!’. In Proceedings of Sinn & Bedeutung 9. Nijmegen, The Netherlands. Schuetze, Carson T. (1996), The Empirical Base of Linguistics: Grammaticality Judgements and Lingustic Methodology. University of Chicago Press. Chicago, IL. Snijder, T. A. B. & Bosker, R. (1999), Multilevel analysis: An introduction to basic and advanced modeling. Sage, London. Van Dyke, Julie & Lewis, Richard L. (2003), ‘Distinguishing Effects of Structure and Decay on Attachment and Repair: A Cue-based Parsing Account of Recovery from Misanalyzed Ambiguities’. Journal of Memory and Language 49:285–316.
Journal of Semantics 26: 185–215 doi:10.1093/jos/ffp002 Advance Access publication March 13, 2009
© The Author 2009. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected].

Solving Learnability Problems in the Acquisition of Semantics
ANDREA GUALMINI AND BERNHARD SCHWARZ
Utrecht University & McGill University

Abstract
This paper proposes solutions to two semantic learnability problems that have featured prominently in the literature on language acquisition. Both problems have often been deemed unsolvable for language learners as a matter of logic, and they have accordingly been taken to motivate principles making sure they will not actually arise in the course of language acquisition. One problem concerns the acquisition of ambiguous sentences whose readings are related by entailment. Crain et al.’s (1994) Semantic Subset Principle is intended to preempt the problem by preventing acquisition of the weaker reading before the stronger reading has been acquired. In contrast, we demonstrate that this very order of acquisition becomes feasible in principle if children can exploit non-truth-conditional evidence of various kinds or evidence from sentences containing downward entailing operators. The other learnability problem concerns the potential need for expunction of certain readings of ambiguous sentences from a child’s grammar. It has often been assumed that, in the absence of negative evidence, such expunction is impossible, and Wexler and Manzini (1987) posit a Subset Principle to preempt the problematic learning scenario. We argue, however, that if the evidence available to the child includes dialogues, and if listeners are expected to interpret speakers’ utterances charitably, then expunction of unavailable readings is possible in principle.

1. TWO SEMANTIC LEARNABILITY PROBLEMS
Research in the acquisition of semantics has identified two learnability problems that children must either avoid or solve in order to attain full semantic competence. One problem, identified by Wexler and Manzini (1987) and Crain and Thornton (1998), among others, derives from the well-motivated assumption that children do not have access to negative evidence (see Brown & Hanlon 1970; Marcus 1993). According to this assumption, children are not consistently corrected for linguistic errors and would therefore not receive direct evidence that a sentence they utter is ungrammatical or that a particular semantic interpretation they can assign is unavailable. To illustrate the semantic learnability problem posed by the absence of negative evidence, consider the case of reflexive
pronouns, discussed in Wexler and Manzini (1987). Simplifying somewhat, the grammar of English requires that a reflexive have an antecedent in the smallest clause containing the reflexive. In this, English differs from Icelandic, for example, where a reflexive can be separated from its antecedent by a non-finite clause boundary. Suppose now a child exposed to English incorrectly hypothesizes that the reflexive himself has the grammar of Icelandic reflexives. Such a child would incorrectly consider an English sentence like (1) grammatical, as he would allow for John to antecede himself.

(1) John told me to shave himself

Moreover, the child would incorrectly allow for two interpretations of sentences like (2), licensing the unavailable non-local reading (3b) in addition to the available local reading (3a).

(2) John told Bill to shave himself
(3) a. John told Bill to shave Bill (local)
    b. John told Bill to shave John (non-local)

The question that arises in this hypothetical situation is how a child exposed to English might manage to expunge the non-local interpretation from his grammar. To be sure, children learning English have ample exposure to adult utterances supporting the existence of local readings, and very little or no evidence for the existence of non-local readings. Lack of evidence for a certain reading is not the same as direct evidence against it, however, and assuming no such negative evidence exists, it is not clear how the child could unlearn an initially hypothesized non-local interpretation.1 We will refer to this semantic learnability problem as the expunction problem.

1 We will elaborate on this point in the concluding section of the paper. For the time being, the relevant point is that, on the assumption that direct negative evidence is not available, acquiring that a given string is unacceptable (or a given meaning is unavailable) and acquiring that a certain string is acceptable (or a certain meaning available) differ in important respects. In particular, the former task seems to require that the acquisition process exploits both the input that the learner experiences and the absence of input that he could have experienced, but did not.

Another semantic learnability problem, first identified in Crain et al. (1994), concerns the acquisition of certain ambiguities. These authors propose that generally, a child who at some stage has acquired one reading of a sentence could add a second reading on the basis of truth-conditional evidence, that is, through exposure to an utterance of the sentence as a description of a situation where only the second reading is true. This seems like an easy step, as long as the child assumes that the speaker is cooperative in the sense of Grice (1975) and in particular
obeys the Maxim of Quality. A problem arises, however, in the case of so-called privative ambiguities, that is, ambiguous sentences where one reading truth-conditionally entails the other (Horn 1989). Let us call the reading that entails the other the strong reading, and the reading that is entailed the weak reading. Suppose that at some stage of language acquisition a child has learned the strong reading of the relevant sentence, but he has not yet acquired the weak reading. Again, such a child could find evidence for the existence of the weak reading by hearing the sentence as a description of a situation where only the weak reading is true. Crain et al. (1994) moreover argue that the availability of this type of evidence is made likely by what they call the Principle of Parsimony, a principle that leads adults to prefer weak readings of sentences with a privative ambiguity.2 If adult speakers indeed prefer to use the relevant sentences on exactly those readings that children need to acquire in the relevant learning scenario, that is, on their weak readings, then this will help ensure that children receive robust evidence for such readings.

2 The evidence discussed by Crain et al. (1994) comes from adults’ comprehension of ambiguous sentences. Strictly speaking, in order to characterize the input available to children, a study on production would be more informative.

But now imagine that a child has learned only the weak interpretation of a sentence. Since the strong reading entails the weak reading, there cannot be a situation in which only the strong reading of that sentence is true. So, it is unclear how the child’s linguistic experience might ever provide evidence for the availability of the strong reading. This point is illustrated by Crain et al. (1994) with an example featuring the phenomenon that Jackendoff (1972) dubbed association with focus. Suppose sentence (4) below is pronounced with a pitch accent on the final noun house.

(4) The dinosaur is only painting a house

This pitch accent may indicate narrow focus on the object noun phrase a house or wide focus on the verb phrase painting a house. The resulting readings can be paraphrased unambiguously with the examples in (5), where the relevant foci feature as pivots of so-called pseudo-cleft sentences.

(5) a. A house is the only thing that the dinosaur is painting (narrow focus)
    b. Painting a house is the only thing that the dinosaur is doing (wide focus)
Let us now consider how a child could learn that (4) is ambiguous. A child who has acquired only the wide focus reading could be led to posit the narrow focus reading by experiencing an utterance of (4) in a situation where only the narrow focus reading is true. For instance, this could be achieved in a situation where the dinosaur is painting a house and nothing else, but is also eating a sandwich. In this kind of situation, the assumption that the speaker is cooperative would lead the child to infer that an interpretation other than (5b) must have been intended. But let us now turn to an alternative scenario, one in which the child initially assumed that (4) only has the weak, narrow focus interpretation. Crain et al. (1994) argue that no evidence would allow the child to acquire the strong, wide focus reading. This is because the ambiguity under consideration is privative. The wide focus reading entails the narrow focus reading, as not doing anything (but painting a house) entails not painting anything (but a house). So the child can never experience an utterance of (4) in a situation where only the wide focus reading is true. It is therefore not clear how such a child would ever be led to posit the wide focus reading in addition to the narrow focus reading. We will refer to this second semantic learnability problem as the entailment problem. The expunction problem and the entailment problem are potential problems facing the child in the course of language acquisition. However, they may not be problems that the child will actually need to solve. In fact, Manzini and Wexler (1987) and Crain and Thornton (1998) argue that the learnability problems in question would be unsolvable if language learners ever had to face them, and they accordingly suggest that the Language Acquisition Device is designed to ensure that the problems in question never arise in the first place. 
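The entailment pattern behind (4) can be checked by brute force over a small space of situations. The following is only an illustrative sketch, not part of the original argument: it assumes a toy universe of two verbs and three objects, models a situation as the set of activities the dinosaur is engaged in, and builds the prejacent (the dinosaur is painting a house) into the truth conditions of both readings. The helper names `situations`, `narrow`, `wide`, `entails` and `witnesses` are hypothetical.

```python
from itertools import chain, combinations

# Toy universe (an assumption made purely for illustration).
VERBS = ("paint", "eat")
OBJECTS = ("house", "fence", "sandwich")
ACTIVITIES = [(v, o) for v in VERBS for o in OBJECTS]


def situations():
    """Return every possible situation: each subset of ACTIVITIES."""
    subsets = chain.from_iterable(
        combinations(ACTIVITIES, k) for k in range(len(ACTIVITIES) + 1)
    )
    return [frozenset(s) for s in subsets]


def narrow(s):
    """Reading (5a): a house is the only thing the dinosaur is painting."""
    painted = {o for (v, o) in s if v == "paint"}
    return painted == {"house"}


def wide(s):
    """Reading (5b): painting a house is the only thing the dinosaur is doing."""
    return s == frozenset({("paint", "house")})


def entails(p, q):
    """p entails q iff q is true in every situation where p is true."""
    return all(q(s) for s in situations() if p(s))


def witnesses(p, q):
    """Situations where p is true and q is false (evidence for p alone)."""
    return [s for s in situations() if p(s) and not q(s)]


if __name__ == "__main__":
    print("wide entails narrow:", entails(wide, narrow))
    print("narrow entails wide:", entails(narrow, wide))
    print("situations with only wide true:", witnesses(wide, narrow))
```

Under these assumptions, `witnesses(wide, narrow)` comes out empty, which is the entailment problem in miniature, while `witnesses(narrow, wide)` contains the situation in which the dinosaur paints a house and also eats a sandwich.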
Manzini and Wexler (1987) posit the Subset Principle, a constraint on the Language Acquisition Device intended to ensure that children always select the smallest language compatible with the available evidence (see also Berwick 1985). In cases where the readings generated by the values of a parameter are in a subset–superset relation, this would lead the child to learn the readings of potentially ambiguous sentences conservatively, positing a reading made available by Universal Grammar only when his current grammar does not allow him to assign an interpretation that makes the relevant linguistic input true. Although the Subset Principle would ensure that children never have to retract from a grammar that generates a superset of the structures or readings that are licensed by the target grammar, it does not address the entailment problem, the learnability problem that is
associated with privative ambiguities. To address this problem, Crain et al. (1994) propose that the Language Acquisition Device includes an additional principle, which prevents children from positing only a weak reading of a potentially ambiguous sentence. More specifically, since Crain et al. subscribe to the view that readings are learned conservatively so that children will not have to face the expunction problem, they propose that in cases where one reading entails the other, the Language Acquisition Device ensures that the strong reading is always hypothesized first. Crain et al. (1994) refer to this constraint as the Semantic Subset Principle.

According to the proposals discussed above, then, the expunction problem and the entailment problem are unsolvable and the principles preempting them have the flavour of conceptual necessities. In this paper, we will question the premise that the learnability problems under consideration are unsolvable, and hence we will question the motivation for the relevant preempting principles given in the literature. In section 2 below, we will first show how children could solve three instances of the entailment problem on the basis of different kinds of information. We will also describe a more general solution to the entailment problem which exploits the semantics of so-called downward entailing operators. Crucially, we show that truth-conditional evidence can lead children to solve all the instances of the entailment problem that have been discussed in the literature. Turning to the expunction problem, in section 3 we will explain how a child might expunge unavailable readings on the basis of dialogues, dialogues that may or may not include the child as an active participant. Importantly, the kinds of dialogues we have in mind do not qualify as negative evidence in the usual sense. We are not suggesting that adults directly advise children of the unavailability of certain readings. However, we assume that adults may well object to utterances that they consider false, including utterances produced by children (see Brown & Hanlon 1970), and we explain how such factual objections may provide evidence to the child that a particular interpretation is not actually available.

2. SOLVING THE ENTAILMENT PROBLEM
In this section we will take a closer look at the entailment problem and propose a general possible solution that children might apply. To recapitulate, in cases of privative ambiguity, a child who has only learned the strong reading could find direct, purely truth-conditional,
evidence for the existence of the weak reading, namely by experiencing the sentence as a description of a situation where only the weak reading is true. The entailment problem is the observation that no analogous direct truth-conditional evidence would be available to a child who has only learned the weak reading, given that there are no possible situations where only the strong reading is true. As mentioned, Crain et al. (1994) suggest that the absence of such truth-conditional evidence makes it impossible for a child to learn a strong reading after first learning a weak reading, hence that the entailment problem is unsolvable. They accordingly posit the Semantic Subset Principle as a constraint that preempts the entailment problem.

We believe that the introduction of such a principle is insufficiently motivated. The claim that any instance of the entailment problem is unsolvable ignores a range of conceivable solutions that might be open to the child. Below we will first describe solutions to the entailment problem that apply in certain special cases that have been used to illustrate the entailment problem in the literature, and then move on to present a more general solution that applies in all cases, a solution that exploits children’s knowledge of the semantics of downward entailing expressions.

2.1 Solving special cases: non-truth-conditional evidence and scalar implicatures
Below we will discuss three specific cases of privative ambiguity which have been taken to motivate the Semantic Subset Principle in the literature. In each case, we will argue that particular features of the ambiguity in question could allow the child to solve the instance of the entailment problem at hand, and hence that the ambiguity in question does not in fact establish the need for the Semantic Subset Principle.

We start by considering one of the examples that Crain et al. (1994) have used to motivate the Semantic Subset Principle. Recall that sentence (4), repeated below, has the weak, narrow focus reading in (5a) as well as the strong, wide focus reading in (5b).

(4) The dinosaur is only painting a house
(5) a. A house is the only thing that the dinosaur is painting (narrow focus)
    b. Painting a house is the only thing that the dinosaur is doing (wide focus)

According to Crain et al., a child who has acquired the weak reading would not have access to direct truth-conditional evidence for the
existence of the strong reading, as the child could not possibly be exposed to a use of (4) in a situation in which only the strong reading is true. While we agree with this point, we also note that it presents an argument for the Semantic Subset Principle only under the additional assumption that direct truth-conditional evidence is the only type of evidence potentially useful to the child in acquiring the different readings of sentences like (4). It is this assumption, which Crain et al. do not justify, that we find questionable. After all, it is well known that focus can have a range of non-truth-conditional pragmatic effects.

One well-known effect of this sort is so-called question–answer congruence. Informally speaking and simplifying considerably, an answer to a constituent question can bear focus on a given expression only if that expression corresponds to the wh-expression in the question. This holds true in particular for answers that contain the focus-sensitive particle only. To illustrate, consider (6) and (7) below, taken from Schwarzschild (1999). Schwarzschild notes that (7) is a good answer to (6a) with stress on Marc, but not with stress on the conference. Conversely, (7) is a good answer to (6b) with stress on the conference, but not with stress on Marc.

(6) a. Who did John see at the conference?
    b. Where did John see Marc?
(7) He only saw Marc at the conference

Moreover, in the acceptable answers, focus can be observed to associate with only and to affect truth conditions in the expected way, yielding the entailment that Marc is the only person who John saw at the conference or that the conference is the only place where John saw Marc, depending on whether stress falls on Marc or at the conference, respectively. It is this observation that suggests a possible way for a child to learn the strong reading of sentence (4) after first having learned the weak reading. If such a child knows the principle of question–answer congruence, he would be driven to posit the wide focus reading (5b) when exposed to sentence (4) as an answer to a question, such as in the dialogue below.

(8) Speaker A: What is the dinosaur doing?
    Speaker B: The dinosaur is only painting a house

In this case, since the question word in (8) corresponds to the main predicate of the sentence, the child could infer on the basis of question–answer congruence that stress on a house in (4) (i.e. Speaker B’s utterance) signals focus on the main predicate paint a house. The child
could then learn the strong, wide focus reading based on the assumption that Speaker B’s answer addressed the question posed by Speaker A, thereby making up for the unavailability of direct truth-conditional evidence for the strong reading discussed above, and solving this particular instantiation of the entailment problem.

Apart from question–answer dialogues, children might also exploit the focus effect known as contrast or parallelism (e.g. Rooth 1992; Schwarzschild 1999) to infer the existence of a strong, wide focus interpretation of sentences like (4). To illustrate, note that the second disjunct of the sentence in (9) is felicitous with stress on Fred, but not with stress on talked.

(9) Either she watched a movie or she only talked to Fred

Simplifying again, the parallelism constraint at work here can be characterized as follows. With the exception of possible occurrences of only, material that is not common to both coordinates of a sentence containing disjunction must be focused. This constraint correctly excludes stress on talked in the second disjunct of (9) because such a stress could only mark narrow focus on the verb itself, leaving its complement to Fred unfocused even though this expression does not occur in the first disjunct. On the other hand, the generalization permits stress on Fred because such a stress can mark wide focus on talked to Fred, which makes the two disjuncts identical up to focused material and the occurrence of only.

We can now see how the parallelism constraint described above could be used to solve the entailment problem instantiated by (4). A child who has already acquired this parallelism constraint on disjunctions could infer the existence of the strong reading of sentence (4) when exposed to a sentence like (10), which contains (4) as the second disjunct.

(10) Either the dinosaur is washing the car or the dinosaur is only painting a house

This is because sentence (10) would violate the parallelism constraint if stress on house were taken to mark narrow focus on the object, since this would leave the verb painting unfocused even though it does not have a corresponding occurrence in the first disjunct. Given the parallelism constraint, the child would accordingly be forced to interpret this stress as marking wide focus on the verb phrase painting a house instead. In that case, the child would again learn the strong, wide focus reading based on the assumption that the speaker is obeying
an independently motivated principle and without relying on truth-conditional evidence.

There are several other focus-related effects that a child could conceivably rely on to solve the entailment problem instantiated by only sentences. We will mention just one more. Sentences containing only are often followed by qualifications of the sort illustrated in (11).

(11) The dinosaur is only painting a house, it’s not cleaning the swimming-pool

On the assumption that the speaker’s qualifications are consistent with the intended focus, the child could take the example in (11) as evidence for the existence of the strong, wide focus reading of (4). If the speaker’s intended focus was narrow focus on a house, (11) would be rather infelicitous, as demonstrated by the infelicity of the following example, where the position of only forces narrow focus.

(12) #The dinosaur is painting only a house, it’s not cleaning the swimming-pool

Once again, we see that it would in fact be possible for the child to learn that wide focus is available in the target grammar. In other words, upon exposure to positive evidence, children would be able to add the strong, wide focus reading. We conclude, then, that focus association with only, one of the examples that Crain et al. (1994) use to motivate the Semantic Subset Principle, does not firmly establish the need for such a principle, since children might well be able to solve the potential entailment problem it presents by exploiting non-truth-conditional sources of evidence based on conditions on focus.

Moving away from association with focus, let us now turn to a different instance of the entailment problem. We consider two cases of privative ambiguity where the weak and the strong readings differ as to the relative scope of negation and an additional operator in the sentence. Our first example in (13), taken from Goro and Akiba (2004a,b), is ambiguous as to the relative scope of disjunction or and negation not.

(13) The pig did not eat the carrot or the pepper

In one reading of (13), negation and disjunction take surface scope, that is, disjunction receives narrow scope with respect to negation. In this reading, the sentence has the unambiguous paraphrase in (14a). This interpretation is probably the most natural reading of the sentence, but it also seems possible for disjunction to be interpreted as taking inverse
scope over negation. In this inverse scope reading, the sentence can be paraphrased as (14b).3

(14) a. The pig ate neither the carrot nor the pepper (surface scope)
     b. The pig didn’t eat the carrot or didn’t eat the pepper (inverse scope)

3 With Goro and Akiba (2004), we assume that disjunction has an inclusive semantics, and that exclusive readings are due to pragmatic strengthening through a scalar implicature. We will discuss the significance of scalar implicatures in the present context shortly.

Note that the ambiguity in question is privative, as the surface scope reading entails the inverse scope reading. If the pig ate neither of the two vegetables, then of course there is at least one vegetable that the pig did not eat. Following the familiar reasoning, then, it could be argued that there is a potential entailment problem and that a child who has acquired the inverse scope reading but not the surface scope reading would be unable to acquire the latter. It might accordingly be proposed that the theory of language acquisition must assume that this situation does not arise in the first place. An argument of this sort has indeed been made by Goro and Akiba (2004a).4

4 Goro and Akiba (2004) actually disagree with our assessment of the relevant facts, as they assume that (13) permits surface scope only. For them, therefore, the problem introduced by the entailment is not that the child might fail to acquire ambiguity. They propose a parameter +/−PPI such that disjunction must scope above negation in +PPI languages and below negation in −PPI languages. Goro and Akiba’s version of the entailment problem is that a speaker of English or any other −PPI language who has initially adopted the +PPI setting would not be able to reset the PPI parameter based on truth-conditional evidence. They accordingly stipulate that the default setting of the parameter, the initial setting for speakers of all languages, must be −PPI, a stipulation motivated by the same logic as the Semantic Subset Principle.

We propose that this version of the entailment problem too might actually be solvable for the child. A plausible solution in this case could be based on the fact that disjunction participates in the familiar pragmatic phenomenon known as scalar implicature, a type of generalized conversational implicature in the sense of Grice (1975). In brief, and simplifying somewhat, the standard theory of scalar implicature states that a speaker uttering a given sentence conversationally implicates that the sentence uttered is the strongest true statement among its scalar alternatives, that is, that no stronger true statement could be formed by replacing some operator in the sentence with another operator of the same type (e.g. Horn 1972; Gazdar 1979; Sauerland 2004). In particular, since a conjunction A and B is stronger than the disjunction A or B, a speaker uttering A or B is predicted to implicate that A and B is false. So the standard theory of scalar implicature derives the well-known observation that disjunction is often understood exclusively. It moreover predicts that in the scope of negation, disjunction does not trigger
Andrea Gualmini and Bernhard Schwarz 195
(15) a. The pig ate neither the carrot nor the pepper (surface scope) b. The pig didn’t eat the carrot or didn’t eat the pepper and the pig ate the carrot or ate the pepper (inverse scope) Note now that the two meanings in (15) are not related by entailment. Specifically, even though (a) entails (b) in (14), where scalar implicatures are suppressed, (a) no longer entails (b) in (15). Obviously, from the assumption that the pig ate neither the carrot nor the pepper, it does not follow that the pig did eat one of the two vegetables. The result is that the entailment problem posed by (13) disappears once the relevant readings are pragmatically strengthened by scalar implicature. A child who has acquired the inverse scope reading of (13) but not the surface scope reading could add the latter, provided he masters the mechanism of scalar implicature. Specifically, such a child could infer the availability of the surface scope reading by experiencing (13) as a description of a scenario where the pig ate neither the pepper nor the carrot, a scenario where (15a) is true but (15b) is false. Much the same comment applies to another scope ambiguity which has been used to illustrate the entailment problem and motivate the Semantic Subset Principle. Sentence (16) below, discussed in Musolino (1998) and Musolino et al. (2000), permits two readings that differ as to the relative scope of the universal quantifier and negation. (16) Every horse didn’t jump over the fence (17) a. No horse jumped over the fence (surface scope) b. Not every horse jumped over the fence (inverse scope)
Downloaded from jos.oxfordjournals.org by guest on January 1, 2011
a scalar implicature and hence is not exclusive. This is because a negated conjunction not [A and B] happens to be weaker, not stronger, than the negated disjunction not [A or B], so there is no stronger scalar alternative that might trigger an implicature. Going back to the inverse scope reading of sentence (13), note that its logical form is [not A] or [not B]. This logical form has a scalar alternative [not A] and [not B], which is stronger than [not A] or [not B], so that a speaker uttering a sentence with logical form [not A] or [not B] is predicted to implicate that [not A] and [not B] is false. Thus, at least one of the statements A and B must be true. So the inverse scope reading of (13) is (correctly) predicted to implicate that the pig ate the carrot or the pepper. In contrast, since in the surface scope reading, disjunction is interpreted in the scope of negation, no scalar implicature is predicted to arise. The complete, pragmatically strengthened, meanings of the two readings of sentence (13) can therefore be described as in (15) below.
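The reasoning above can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it enumerates the four truth-value assignments for A (the pig ate the carrot) and B (the pig ate the pepper), and all function names are illustrative assumptions. Under these assumptions, the surface scope reading entails the inverse scope reading as long as implicatures are ignored, and the entailment disappears once the inverse scope reading is strengthened by its implicature.

```python
from itertools import product

# Each situation fixes whether the pig ate the carrot (a) and the pepper (b).
SITUATIONS = list(product([True, False], repeat=2))

def surface(a, b):
    """Negation scoping over disjunction: not [A or B]."""
    return not (a or b)

def inverse(a, b):
    """Disjunction scoping over negation: [not A] or [not B]."""
    return (not a) or (not b)

def inverse_strengthened(a, b):
    """Inverse scope plus the implicature that [not A] and [not B] is false."""
    return inverse(a, b) and not ((not a) and (not b))

def entails(p, q):
    """p entails q if q holds in every situation in which p holds."""
    return all(q(a, b) for a, b in SITUATIONS if p(a, b))

print(entails(surface, inverse))               # literal meanings
print(entails(surface, inverse_strengthened))  # strengthened inverse scope
print(entails(inverse_strengthened, surface))  # opposite direction
```

The situation in which the pig ate neither vegetable is the one that separates the two strengthened readings: it verifies the surface scope reading and falsifies the strengthened inverse scope reading.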
(18) a. No horse jumped over the fence (surface scope) b. Not every horse jumped over the fence but some horse jumped over the fence (inverse scope) Again, these pragmatically strengthened readings are not related by entailment. A child who initially acquired the inverse scope reading could acquire the surface scope reading as well by hearing (16) applied to a scenario where the surface scope reading is true, that is, a scenario where no horse jumped over the fence. An anonymous reviewer questions the availability of events like the ones we described, given that adults seem to rarely use sentences like (16) in their surface scope reading. To be clear, we do not intend to
5 Actually, under the textbook interpretation of universal determiners, (17a) is true but (17b) is false if there are no horses. This, however, is a potential shortcoming of the textbook analysis, as universal statements are usually understood to convey that the domain of quantification is not empty.
6 Having introduced the particular instance of the entailment problem discussed by Musolino et al. (2000), we would like to comment on an observation offered by Musolino (2006), according to which children’s documented behaviour with sentences like (16) raises an empirical problem for the Semantic Subset Principle. In particular, Musolino (2006) suggests that children’s ability to access either interpretation, documented in a study by Musolino and Lidz (2006), falsifies the claim that children initially hypothesize only the strong reading of a privative ambiguity. In our view, this is incorrect. The Semantic Subset Principle could not possibly be falsified by the observation that children have acquired an ambiguity, since that is what the Semantic Subset Principle is designed to ensure. The Semantic Subset Principle would be falsified if we found that children are initially limited to the weak reading of a privative ambiguity, whose strong reading is only acquired at a later stage. We are not aware of any study documenting this scenario.
The sentence in (16) is ambiguous between the surface scope interpretation in (17a) and the inverse scope interpretation in (17b). The ambiguity is privative, with (17a) entailing (17b).5 Musolino et al. (2000) accordingly argue that, if children’s initial hypothesis only included the inverse scope reading in (17b), they would not be able to acquire the surface scope reading (17a).6 However, in this case too the problem becomes solvable once the phenomenon of scalar implicature is taken into account. A negated universal of the form not [every A B] has a stronger scalar alternative not [a A B], where the universal determiner is replaced with an existential determiner. A speaker uttering a sentence with the logical form not [every A B] is therefore predicted to implicate that not [a A B] is false. So sentence (16) in its inverse scope reading (17b) is predicted to implicate, correctly it seems, that some horse did jump over the fence. In contrast, the surface scope reading of (16) is not expected to carry a scalar implicature, given that a scalar alternative [a A] [not B] is weaker than the logical form [every A] [not B]. With scalar implicatures added, therefore, the two readings of (16) can be paraphrased as in (18).
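The entailment pattern for (16) can be verified on small models. The sketch below is an illustration under stated assumptions rather than the authors' own formalization: a model is a tuple recording, for each horse, whether it jumped; the domain is assumed to be non-empty; and only domains of up to three horses are checked. The function names are hypothetical.

```python
from itertools import product

def models(max_horses=3):
    """All ways for 1..max_horses horses to have jumped (True) or not (False)."""
    for n in range(1, max_horses + 1):
        yield from product([True, False], repeat=n)

def surface(jumped):
    """Every horse is such that it did not jump: no horse jumped."""
    return all(not j for j in jumped)

def inverse(jumped):
    """It is not the case that every horse jumped."""
    return not all(jumped)

def inverse_strengthened(jumped):
    """Not every horse jumped, and (by implicature) some horse jumped."""
    return inverse(jumped) and any(jumped)

def entails(p, q, max_horses=3):
    """p entails q if q holds in every checked model in which p holds."""
    return all(q(m) for m in models(max_horses) if p(m))

print(entails(surface, inverse))               # literal readings
print(entails(surface, inverse_strengthened))  # strengthened inverse scope
print(entails(inverse_strengthened, surface))  # opposite direction
```

On these models the literal surface scope reading entails the literal inverse scope reading, while neither strengthened reading entails the other, so a scenario in which no horse jumped distinguishes them.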
2.2 Is there really a learnability problem?
We are not the first ones to suggest that examples like (4) or (16) above do not provide a good motivation for the Semantic Subset Principle. As for (4), Musolino (2006) asks whether there are any languages where
present the relevant evidence as abundant in the input. Nevertheless, we have reasons to believe that data of the kind discussed above exist. This is confirmed by Musolino and Lidz (2006), who report that ‘one of the dozens of examples from the Musolino corpus was used to convey a ‘‘none’’ reading’ (p. 842). This proves our point: examples of the relevant kind exist. They may be rare, but they do exist. Another comment on the proposal above is in order. Children’s computation of scalar implicatures is the subject of a rapidly growing body of research (for a recent review, see Guasti et al. 2005). A frequent finding is that children do not compute implicatures to the same extent as adults. As a consequence, if a child were to initially acquire only the weak reading, he might not have access to the solution described above until relatively late. However, this should not be taken as evidence against our proposal. At most, it could be taken as evidence against the claim that children go through the relevant acquisition scenario in the early stages of language development. This is not our claim, however. Our claim is simply that children could acquire the strong reading of a privative ambiguity after learning the weak reading. It might take them a long time, but the possibility cannot be excluded on logical grounds. The Semantic Subset Principle was not proposed to ensure that children would not encounter problems that could delay their attainment of the target grammar. The Semantic Subset Principle was proposed to ensure that children would not encounter problems that could make the target grammar unattainable. Our point is that the problem discussed by Crain et al. (1994) would not make the target language unattainable. To sum up, we have reviewed different instances of the entailment problem discussed in the literature. Overall, those instances of the entailment problem appear to be solvable. 
Showing that specific instances of the entailment problem can be solved, however, might not be enough to show that there is no work left for the Semantic Subset Principle. To this aim, what needs to be done is to work out a mechanism that would allow the child to solve arbitrary instances of the entailment problem. We propose such a general mechanism in section 2.3. But first, in section 2.2, we will comment on a related recent proposal in Musolino (2006) and related suggestions by an anonymous reviewer.
7 Musolino (2006) moreover suggests that knowledge of syntactic structure of the target language can allow a language learner to solve a potential entailment problem. To illustrate this possibility, Musolino suggests that a learner of English could infer the existence of an inverse scope reading of an example like (16) after having learned that subjects in English originate within the verb phrase. While we consider it plausible that the syntax of the target language could help a learner solve certain instances of the entailment problem, we are doubtful about Musolino’s illustration of this idea. After all, the inverse reading of (16) is not the problematic reading: being the weak reading, its existence can be inferred on the basis of purely truth-conditional evidence.
sentences like (4) lack the strong, wide focus interpretation. If not, it is possible that the successful acquisition of this ambiguity is effectively guaranteed by Universal Grammar. In other words, the acquisition of available readings in this particular case may not in fact have to be conservative, so that a child exposed to a sentence like (4) may immediately posit the ambiguity without ever having to face any learnability problem. Echoing similar remarks in Musolino (2006), an anonymous reviewer moreover suggests that example (16) is not a good illustration of the entailment problem either, arguing that the availability of the strong, surface scope, reading is effectively guaranteed by Universal Grammar as well, specifically by the assumption that meaning is determined compositionally.7 These considerations question the motivation for the Semantic Subset Principle, suggesting that the relevant instances of the entailment problem are preempted independently of such a principle by certain properties of Universal Grammar. By the same token, they also question the need for the solutions to these particular instances of the entailment problem that we have presented above. However, we think that the suggestions put forth by Musolino (2006) and the reviewer fall short of showing that there is no work left to do for the solutions we have proposed. With regard to examples like (4), it remains to be shown that indeed all languages allow for both wide and narrow focus readings; hence, whether it can indeed be argued that Universal Grammar excludes languages where only the narrow focus reading is available. After all, it is certainly conceivable that languages could differ as to the relation between stress and focus width. Also, the suggestion that the surface scope reading of examples like (16) is guaranteed by compositionality seems to us to be based on a misunderstanding. 
We are not aware of any existing explication of the notion of compositionality that would guarantee the availability of surface scope readings. Indeed, such an explication would be inadequate for empirical reasons. This is illustrated by a finding reported in recent acquisition literature: Krämer (2000) has shown that Dutch-speaking children interpret scrambled indefinites in their underlying position up until age 9 (see also Unsworth 2005). In fact,
2.3 A general solution: exploiting downward entailingness
We would like to propose that sentences containing so-called downward entailing operators, operators that reverse entailment relations within their arguments, could provide children with positive truth-conditional evidence that allows them to solve arbitrary instances of the entailment problem. Let us illustrate the mechanism we have in mind by returning to the association with focus example in (4), repeated below for convenience.
(4) The dinosaur is only painting a house
(5) a. a house is the only thing that the dinosaur is painting (narrow focus)
b. painting a house is the only thing that the dinosaur is doing (wide focus)
According to Crain et al. (1994), a child who has acquired the weak reading (5a) would not encounter truth-conditional evidence in favour of the strong reading (5b). To be sure, if the necessary truth-conditional evidence could only come from the occurrence of (4) in a situation in which only the strong reading is true, then Crain et al.’s claim is unobjectionable, as no such situation exists. Nevertheless, one should
any experimental maneuver that has been tried has failed in getting Dutch-speaking four-year-olds to access the adult surface scope reading (see Unsworth & Gualmini 2007; Unsworth & Helder 2007). Descriptively, it looks as if Dutch-speaking children indeed initially lack the surface scope reading that, according to the reviewer, should be hypothesized from the beginning. It should moreover be obvious that even if one could successfully argue that Universal Grammar preempts particular instances of the entailment problem, such as those illustrated by (4) and (16), this would not yet amount to an argument that the Entailment Problem in general is a non-problem, as such an argument would of course require an exhaustive survey of privative ambiguity in natural languages. So we are not ready to concur with Musolino (2006), who concludes that ‘semantic subset problems probably do not exist in the first place’. By the same token, of course, the solutions to the relevant instances of the entailment problem we have proposed also fail to establish that there is no need for the Semantic Subset Principle in the general case. Such a case would have to be based on a general solution to the entailment problem. We will now proceed to presenting such a general solution.
also consider the possibility that the child can receive truth-conditional evidence from sentences other than (4) and in turn use this piece of information to draw conclusions about (4). We propose that a child who has acquired the weak, narrow focus, reading of (4) could add the strong, wide focus, reading after first acquiring a wide focus reading of a sentence where only occurs in the scope of a downward entailing operator. One such downward entailing operator is sentential negation, so let us consider the negated version of (4), shown in (19), and its two readings paraphrased in (20). (19) The dinosaur is not only painting a house
Like the ambiguity of (4), the ambiguity of (19) is privative. However, the effect of negation in (19) is that the direction of entailment has been reversed, so now the narrow focus reading entails the wide focus reading. If the dinosaur painted something other than a house, then he also did something other than painting a house. Now suppose a child has acquired the weak, narrow focus, reading of (4). Presumably, such a child will thereby also have acquired the narrow focus reading of (19). The child could now add the wide focus reading of (19) when exposed to the sentence in a scenario where (20b) is true while (20a) is false. This could be a scenario, for example, where the dinosaur is painting a house and is painting nothing else, but is also eating a sandwich.8 Having concluded that wide focus is available, the child could then infer that (5b) is a possible reading for (4). In this way, the child could in principle acquire the two readings in (5), even if he initially assumed that only narrow focus is available. As another illustration, we apply the proposed mechanism to the scopally ambiguous example in (16) above. Suppose a child has acquired the weak, inverse scope, reading of (16) in (17b). Such a child could add the strong, surface scope, reading when exposed to a sentence like (21), where (16) is embedded under impossible. This is a downward entailing adjective, so the direction of entailment attested in (17) is reversed in (22): the inverse scope reading now entails the surface scope reading.
8 Recall from section 1 that this is the very type of scenario the child could use to add the weak, narrow focus, reading of (4) after initially having acquired the strong, wide focus, reading.
(20) a. a house is not the only thing the dinosaur is painting (narrow focus) b. painting a house is not the only thing the dinosaur is doing (wide focus)
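The reversal of entailment between (5) and (20) can be illustrated with a toy model. This is a sketch under simplifying assumptions that are ours, not the paper's: a situation is a set of (activity, object) pairs describing what the dinosaur is doing, the dinosaur is assumed to be painting a house in every situation considered, and only two additional events are available. All names are hypothetical.

```python
from itertools import combinations

PREJACENT = ("paint", "house")   # assumed to hold in every situation
OTHER_EVENTS = [("paint", "fence"), ("eat", "sandwich")]

def situations():
    """The dinosaur paints a house, plus any subset of the other events."""
    for r in range(len(OTHER_EVENTS) + 1):
        for extra in combinations(OTHER_EVENTS, r):
            yield frozenset((PREJACENT,) + extra)

def only_narrow(s):
    """(5a): a house is the only thing the dinosaur is painting."""
    return all(obj == "house" for act, obj in s if act == "paint")

def only_wide(s):
    """(5b): painting a house is the only thing the dinosaur is doing."""
    return all(event == PREJACENT for event in s)

def negate(reading):
    """Sentential negation applied to a reading."""
    return lambda s: not reading(s)

def entails(p, q):
    """p entails q if q holds in every situation in which p holds."""
    return all(q(s) for s in situations() if p(s))

not_narrow, not_wide = negate(only_narrow), negate(only_wide)  # (20a), (20b)
sandwich = frozenset([PREJACENT, ("eat", "sandwich")])

print(entails(only_wide, only_narrow), entails(only_narrow, only_wide))
print(entails(not_narrow, not_wide), entails(not_wide, not_narrow))
print(not_wide(sandwich), not_narrow(sandwich))
```

In this model the wide focus reading is the strong one without negation and the weak one under negation, and the sandwich scenario makes (20b) true while (20a) is false.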
(21) It’s impossible that every horse didn’t jump over the fence (22) a. It’s impossible that no horse jumped over the fence (surface scope) b. It’s impossible that not every horse jumped over the fence (inverse scope)
Suppose a child has initially acquired the weak, inverse scope, reading of (16), and has thereby also acquired the inverse scope reading of the embedded clause in (21). The child could add the surface scope reading of the embedded clause when exposed to (21) in a scenario where (22a) is true while (22b) is false, that is, a scenario where it is known that some horse jumped, but also that some horse may not have jumped, perhaps because of a broken leg. Having concluded that surface scope is available for an embedded occurrence of (16), the child could then infer that (16) in isolation has a surface scope reading as well. So the child would acquire the two readings of (16), even if he initially assumed that only inverse scope is available. Unlike the focus or implicature-based solutions presented in the previous section, the mechanism described here is completely general in that it applies to privative ambiguities of all kinds. The obvious question at this point is whether sentences with downward entailing operators that would point a child in the right direction indeed occur in the child’s input. A quantitative analysis of the input available to children is beyond the scope of this paper. We note, however, that Crain et al.’s (1994) Principle of Parsimony mentioned in section 1 above, according to which adults prefer weak readings of sentences that have a privative ambiguity, would help make the relevant reading adults’ preferred interpretation of sentences containing a downward entailing operator. We have argued that children could use truth-conditional evidence to add the strong reading to the weak reading of an ambiguous sentence. In particular, children would need to hear a sentence in which the relevant construction is in the scope of a downward entailing operator and the intended reading is generated by the same mechanism that would generate the strong reading of the original construction. 
Crucially, however, this amounts to the weak reading of the construction containing the downward entailing operator. So assuming the Principle of Parsimony is correct, unless we have reasons to believe that adults’ assumed preference for weak readings does not carry over to sentences containing downward entailing operators, it seems plausible to hypothesize that the required evidence would indeed be available.
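The scenario described for (21) and (22) can be made concrete in a small sketch. The modelling choices here are assumptions for illustration only: there are two horses, a world records which horses jumped, an information state is a set of worlds still considered possible, and impossible is treated as true of a state when no world in it verifies the embedded clause.

```python
from itertools import combinations, product

WORLDS = list(product([True, False], repeat=2))  # two horses: jumped or not

def no_horse_jumped(w):
    """Surface scope reading (17a) of the embedded clause."""
    return not any(w)

def not_every_horse_jumped(w):
    """Inverse scope reading (17b) of the embedded clause."""
    return not all(w)

def impossible(p):
    """True of an information state iff no world in the state verifies p."""
    return lambda state: not any(p(w) for w in state)

reading_22a = impossible(no_horse_jumped)
reading_22b = impossible(not_every_horse_jumped)

# Hypothetical scenario: some horse certainly jumped, but one may not have.
scenario = [w for w in WORLDS if any(w)]

def all_states():
    """Every non-empty information state over WORLDS."""
    for r in range(1, len(WORLDS) + 1):
        yield from combinations(WORLDS, r)

# Under impossible, the inverse scope reading entails the surface scope one.
reversed_entailment = all(reading_22a(s) for s in all_states() if reading_22b(s))

print(reading_22a(scenario), reading_22b(scenario), reversed_entailment)
```

In the scenario, (22a) comes out true and (22b) false, which is the kind of evidence that would support adding the surface scope reading of the embedded clause.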
(23) a. ‘‘There is absolutely no question in my mind that this is the proper way to handle bone growth in young horses,’’ said Fisher. ‘‘It’s not a guarantee that every horse won’t buck his shins because there’s an exception to every medical rule www.ctba.com b. No big deal, because it’s not as though every person didn’t get it, but I hope people figure this out theconstructivecurmudgeon.blogspot.com In both cases, the surface scope reading is clearly intended by the writer, as the reading where every takes inverse scope under negation is too strong to be pragmatically consistent with the context provided: the inverse scope reading of (23a) would convey that every horse might buck his shin, and the inverse scope reading of (23b) would convey that every person got the point. Cases of this sort, then, could in principle help children in solving the particular instantiation of the entailment problem posed by sentences like (16), that is, they could allow children to acquire the surface scope reading in (17a) after initially having learned the inverse scope reading in (17b). Let us summarize the proposed mechanism. Given a sentence $S$ and two logically possible readings $S_A$ and $S_B$, where $S_A$ entails $S_B$, we have argued that children could acquire both readings on the basis of truth-conditional evidence, regardless of which reading was posited first. If the child’s first hypothesis is that only $S_A$ is available, then all the child needs to experience is an utterance of $S$ in a context in which reading $S_A$ is false, but $S_B$ is true. By contrast, if the child’s first hypothesis is that only $S_B$ is available, then evidence for the existence of $S_A$ might come
Even if it turned out that the Principle of Parsimony is wrong and that for some reason adults do not always prefer the weak reading of any given sentence, we think that one should not dismiss the necessary evidence as non-existent. We base this belief on the results of an informal internet search for scopally ambiguous sentences of the type discussed above. Notice that such a search stacks the cards against our proposal beyond necessity. In real life, it seems reasonable to assume, like Crain et al. (1994) do, that the child has access to contextual information that would be used to rule out some logically possible meanings while permitting others. By contrast, in our own search we had to restrict our observations to cases where such information could be inferred from the very same context surrounding the relevant utterance. Despite this unnecessary restriction, our search returned many sentences that would force the adoption of the surface scope interpretation. Two relevant examples are presented in (23).
(24) I didn’t read the books The sentence is understood to convey that the speaker did not read any of the books, illustrating Fodor’s (1970) observation that, under sentential negation, plural definite noun phrases are typically interpreted as though they were narrow scope existentials. Accordingly, upon being presented with (24) in a context in which the speaker did not read any of the relevant books, a child might infer that definite plurals can be existentials. Now suppose the child brings this piece of evidence to bear on the interpretation of (25). (25) I read the books According to our reasoning, the child might be led to interpret the definite in (25) as an existential quantifier, on a par with (24). This
from an utterance that receives the logical form $OP_{DE}S$, where $OP_{DE}$ is any downward entailing operator. In this case the mechanisms that generate $S_A$ and $S_B$ would generate $OP_{DE}S_A$ and $OP_{DE}S_B$, respectively, where $OP_{DE}S_B$ entails $OP_{DE}S_A$. In this case, the child could receive truth-conditional evidence in favour of $OP_{DE}S_A$ by simply witnessing an occurrence of $OP_{DE}S$ in a context that makes $OP_{DE}S_A$ true and $OP_{DE}S_B$ false. Then, the child would need to infer that the mechanism that generates $OP_{DE}S_A$ would also be available for the original sentence $S$, thereby acquiring $S_A$. We note again that the path to the acquisition of privative ambiguities proposed by Crain et al. (1994) and the one we proposed are on equal footing in that, by the Principle of Parsimony, the weak reading that the child is assumed to add later ($S_B$ or $OP_{DE}S_A$) is predicted to be adults’ preferred reading of the relevant sentence ($S$ or $OP_{DE}S$), which in both cases increases the likelihood that, for each sentence, the child’s experience would provide him with evidence for the weak reading of that sentence, that is, $S_B$ or $OP_{DE}S_A$. Finally, we note that, if $S$ can have the strong reading $S_A$ but, at the same time, reading $OP_{DE}S_A$ was unavailable for any sentence $OP_{DE}S$, the mechanism we have proposed would fail. We are not at present aware of any case of this sort. In concluding this section, we would like to point to a feature of our proposal that may be considered problematic and that connects the entailment problem discussed here to the expunction problem introduced in section 1. Note that for our account to work, we must allow the child to draw analogies from sentences that contain a downward entailing operator to sentences that do not. This raises the possibility that the child might draw wrong generalizations. To illustrate, consider sentence (24).
3. A SOLUTION TO THE EXPUNCTION PROBLEM: POSITIVE EVIDENCE AND HEARER CHARITY
The Subset Principle, as it was formulated by Wexler and Manzini (1987) (see also Wexler & Culicover 1980), is designed to ensure that language acquisition proceeds solely on the basis of positive evidence. Given that ungrammatical sentences are not labelled as such in the primary linguistic data, the child should never entertain a grammar that generates a superset of the well-formed sentences of the target language. If such a scenario arose, the child would not be able to expunge the ungrammatical forms from his grammar. When it comes to syntactic well-formedness, this reasoning is supported by the finding that children indeed do not receive negative evidence (see Brown & Hanlon 1970; Marcus 1993). However, we think that this reasoning might not apply to sentence meanings. What sets sentence meanings apart is the fact that when speakers produce false statements, the other participants in the conversation might object. In other words, dialogue participants might not object to any speaker whose utterances are not fully grammatical, but they are very likely to object to a speaker whose utterances lack true interpretations. Not surprisingly, this also holds true when children produce false statements (see Brown & Hanlon 1970). In order to illustrate our point, let us return to the case of long-distance reflexives, used in section 1 to introduce the expunction problem. We take it that a speaker who produces a sentence with an
would lead the child to a non-adult interpretation, however, as adult speakers of English must assign universal force to the definite in (25), taking the sentence to convey that I read all of the books and not just some of them. The question then is whether children would ever be able to recover from such a mistake. If the only way for children to recover from their mistake draws upon the use of negative evidence, under current assumptions, we would have to conclude that children would never recover. So while the mechanism that we have proposed might indeed allow the child to solve the entailment problem, this result comes at a potential cost: the child may solve the entailment problem, but at the possible cost of running into the expunction problem. So if the expunction problem is unsolvable and the Subset Principle can be motivated on logical grounds, then our proposal faces a serious challenge. In the remainder of this paper, we address this challenge by proposing a potential solution to the expunction problem.
embedded reflexive is likely to be confronted by an adult listener who considers the local dependency interpretation false—even if that listener considers the proposition that the unavailable long-distance reading would express to be true. To illustrate, consider the constructed dialogue in (26) below. (26) Speaker A: John wanted Bill to shave himself. Speaker B: No, actually, John had no problem with Bill’s beard.
(27) Speaker A: John wanted Bill to shave himself Speaker B: No, actually, John had no problem with Bill’s beard, but John wanted to get a shave from Bill Arguably, the child could then infer that the long-distance interpretation is not actually available in the adult grammar. The child could reason that, if such a long-distance reading were available, Speaker B would have been able to access it and would accordingly not have objected to Speaker A’s utterance on the grounds of it being false. By drawing this inference, then, the child would expunge the unavailable long-distance interpretation from his grammar. A central premise in the child’s inference is the assumption that listeners interpret speakers’ utterances charitably in the sense that they refrain from rejecting an utterance as false as long as an interpretation is available that they do not consider false (and possibly believe to be true). Without this charity assumption, the exchanges in (26) and (27) would not support the relevant inference, as Speaker B could be taken to arbitrarily have accessed one of two interpretations available in the adult grammar.
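The inference attributed to the child can be sketched as a small procedure. This is only an illustration under our own simplifying assumptions: readings are represented as labels, the hearer's beliefs are reduced to the set of readings the hearer considers false, and the function names are hypothetical rather than taken from the paper.

```python
def charitable_verdict(available_readings, considered_false):
    """A charitable hearer rejects an utterance only if every reading
    available in the hearer's grammar is one the hearer considers false."""
    if all(r in considered_false for r in available_readings):
        return "reject"
    return "accept"

def expunge(child_readings, considered_false, observed_verdict):
    """Child's update after observing the hearer's verdict.

    Under the charity assumption, a rejection shows that every reading
    available to the hearer is considered false, so any reading the hearer
    does not consider false cannot be part of the hearer's grammar.
    """
    if observed_verdict != "reject":
        return set(child_readings)  # acceptance is uninformative in this sketch
    return {r for r in child_readings if r in considered_false}

# Dialogue (27): Speaker B considers the local reading false and signals
# that the long-distance reading would be true.
child = {"local", "long_distance"}
b_considers_false = {"local"}

print(charitable_verdict({"local"}, b_considers_false))
print(charitable_verdict(child, b_considers_false))
print(expunge(child, b_considers_false, "reject"))
```

An adult grammar with only the local reading predicts the observed rejection, whereas a grammar with both readings would predict acceptance; the mismatch is what licenses dropping the long-distance reading. Without the charity assumption the first function would not constrain the hearer's verdict, and the update would not go through.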
Speaker B’s response is pragmatically well formed (assuming that Speaker B indeed believes that John did not want Bill to shave). This is so even if Speaker B believes that John wanted to be shaved by Bill, the proposition the potential long-distance reflexive interpretation would express. After all, this reading is actually unavailable in the adult grammar and so it is irrelevant for the pragmatic well-formedness of the discourse at hand. Consider now the hypothetical case of a child who has incorrectly posited a long-distance reading of embedded reflexives in addition to the local interpretation. Suppose this child is Speaker A in (26) or witnesses the exchange as a third person. Suppose moreover that it is known in the utterance context that John indeed wanted Bill to shave him, that is, that the hypothetical long-distance reading is true. Alternatively, suppose Speaker B explicitly indicates that the long-distance reading is true, perhaps by continuing as shown in (27).
We believe that listeners are indeed charitable in the relevant sense and that speakers and other conversation participants expect them to be.9 To illustrate, in the absence of a disambiguating context, Speaker B’s response in (28) below is pragmatically deviant because it portrays Speaker A’s utterance as false, interpreting he as being anaphoric to Bill, even though Speaker B seems to agree with the other possible reading of the sentence, the one where he is instead anaphoric to John.10 (28) Speaker A: John told Bill that he won Speaker B: No, actually, John declared himself the winner
9 Listener charity is naturally related to Grice’s (1975) Cooperative Principle. Listeners might be said to be charitable in the relevant sense because they expect speakers to abide by the Maxim of Quality.
10 The pragmatic deviance of Russell’s (1905) famous yacht example makes much the same point, and more dramatically so: ‘I have heard of a touchy owner of a yacht to whom a guest, on first seeing it, remarked, ‘‘I thought your yacht was longer than it is’’; and the owner replied, ‘‘No, my yacht is not longer than it is’’’. The yacht owner objects to the guest’s utterance by assigning it the contradictory interpretation I thought your yacht was longer than itself even though a consistent interpretation, presumably intended by the guest, is readily available as well.
We note that dialogues like the one in (27) do not count as negative evidence in the usual sense, because no dialogue participant is explicitly pointing out the unavailability of a given meaning. Furthermore, the mechanism we described is different from so-called indirect negative evidence (Chomsky 1981), since the child does not carry out an inference on the basis of what he could have heard but did not hear. Rather, the unavailability of a given meaning can be inferred on the basis of the charity assumption. Thus, a dialogue between competent adult speakers, which we would classify as positive evidence, could be used to infer the unavailability of a specific reading, a result which is traditionally seen as a prerogative of negative evidence. In the learning scenario described here, the evidence in question is negative only in the sense that it can support the inference that a certain interpretation is not available. Recent work in the acquisition of semantics reports experimental results that our proposal might help explain. O’Leary and Crain (1994), Musolino (1998), Gualmini (2004) and Gualmini et al. (2008) report that English-speaking children may assign to the determiner some both wide and narrow scope with respect to clausemate negation. For example, these children can interpret The detective didn’t find someone as conveying either that there is someone the detective did not find, the inverse scope reading, or that the detective found no one, the surface scope reading. On the assumption that some is a positive polarity item in adult English and hence cannot take surface scope in such cases (see e.g. Ladusaw 1979), children may therefore face the acquisition
Andrea Gualmini and Bernhard Schwarz 207
(29) Speaker A: Some pizzas were not lost
     Speaker B: Well, actually, no pizza was lost!
Given that Speaker B signals disagreement with Speaker A’s utterance (well, actually . . .) and states that no pizza was lost, the question is what grammar can be attributed to Speaker B. The relevant reasoning is similar to the argument made on the basis of (27). Speaker B’s response indicates that he takes the potential inverse scope reading of Speaker A’s utterance to be true. If the inverse scope reading were indeed available, charity would therefore require that Speaker B accept Speaker A’s utterance as true, no matter what other readings the sentence might have. Thus, Speaker B’s actual response would arguably lead a listener to infer that the inverse scope reading is not in fact available. The remaining step is to explain why Speaker B’s utterance is pragmatically well formed under the assumption that only the surface scope reading is available. Note that Speaker B’s statement that no pizza was lost is semantically consistent with the surface scope reading of the sentence. This is because the scope ambiguity at hand is privative, with the surface scope reading being entailed by the inverse scope reading: if no pizzas were lost, then there is some pizza that was not lost.12 So it would seem at first sight that Speaker B has no grounds for disagreement with Speaker A’s utterance. However, this apparent conflict dissolves
11 Musolino et al. (2000) take another approach to the problem at hand, relating the acquisition of some to the acquisition of any. These authors propose that once the child learns that any can be interpreted in the scope of clausemate negation, he will infer that its ‘allomorph’ some cannot be so interpreted. This proposal is of course consistent with our own. There may well be more than one way of expunging readings where some scopes under clausemate negation.
12 To be more accurate, this entailment holds only in situations where there are pizzas in the domain, but the relevant situations in the experiments reported in O’Leary and Crain (1994), Gualmini (2004) and Gualmini et al. (2008) were of this sort.
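The entailment pattern behind (29) can be checked mechanically on small models. The following is a minimal sketch, not part of the original proposal: it enumerates every way of marking the pizzas in a small domain as lost or not lost and tests whether one reading of Speaker A's sentence entails the other. The function names and the encoding of a situation as a tuple of booleans (one per pizza, True meaning "lost") are illustrative assumptions.

```python
from itertools import product

def some_over_not(lost):
    """Surface scope of (29A): there is a pizza that was not lost."""
    return any(not x for x in lost)

def not_over_some(lost):
    """Inverse scope of (29A): no pizza was lost."""
    return not any(lost)

def entails(p, q, domain_size):
    """True if q holds in every situation of the given size in which p holds."""
    situations = product([False, True], repeat=domain_size)
    return all(q(s) for s in situations if p(s))

# With at least one pizza, the inverse scope reading entails the surface one.
inverse_entails_surface = all(
    entails(not_over_some, some_over_not, n) for n in (1, 2, 3)
)
# The converse fails as soon as there are two pizzas, so the ambiguity is privative.
surface_entails_inverse = entails(some_over_not, not_over_some, 2)
# With an empty domain the entailment does not go through (cf. footnote 12).
holds_on_empty_domain = entails(not_over_some, some_over_not, 0)
```

Under these assumptions, the check agrees with the text: the reading on which no pizza was lost is the stronger one whenever there are pizzas in the domain, which is why Speaker B's reply is semantically consistent with the surface scope reading.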
problem that the Subset Principle is expected to prevent. The question is how children manage to solve that problem. One candidate is indirect negative evidence (Chomsky 1981). Upon being presented with numerous occurrences of sentences containing some and negation in which some is interpreted outside the scope of negation and no sentence in which some is interpreted in the scope of negation, the learner would infer that the latter interpretation is not available. Appealing as it may be, under this scenario, it is surprising that four- and five-year-olds have still not expunged the relevant interpretation. As an alternative explanation, we propose that under the charity assumption, positive evidence provided by hypothetical dialogues like the one below could lead the child to rid his grammar of the interpretation in which some is interpreted in the scope of negation.11
208 Solving Learnability Problems in the Acquisition of Semantics
once scalar implicatures are taken into account. The surface scope reading of Speaker A’s utterance in (29) has the logical form [some A] [not B], which has the stronger scalar alternative [every A] [not B]. Speaker B is therefore expected to take Speaker A to implicate that this stronger statement is false, hence that some pizza was lost. So Speaker B’s utterance is naturally understood as challenging this scalar implicature. In sum, then, Speaker B’s decision to object to the scalar implicature that some pizza was lost, triggered by his interpretation of Speaker A’s utterance, signals that Speaker B has no choice but to access the interpretation that gives rise to such an implicature, that is, the surface scope interpretation of Speaker A’s utterance. It is important to stress the role of scalar implicatures for the particular acquisition scenario we just presented. We have noted above that children do not compute implicatures to the same extent as adults. It has been found, in fact, that many children do not compute scalar implicatures until age five (Guasti et al. 2005). Interestingly, this is precisely the age when children stop interpreting some in the scope of negation. In the acquisition scenario we are envisioning, then, the necessary information might have been available to children well before the switch happens, but they were not in the position to use it. Before they acquire scalar implicatures, children would not be expected to find any coherent interpretation of dialogues like (29) and they should not be able to use them to infer that some must scope above clausemate negation. Two comments about the dialogues described above. First, we emphasize that the child does not need to participate in the dialogue, that is, the child may be Speaker A, but could also just be overhearing the conversation. 
In other words, our proposal about the use of positive evidence from dialogues in order to expunge readings from the child’s grammar does not require that the child actually use the reading that needs to be expunged. This sets apart the present case from the use of negative evidence to expunge ungrammatical structures. Second, Speaker B does not necessarily intend to inform Speaker A of the unavailability of a specific interpretation. Speaker B is not trying to let Speaker A know that their grammars are not identical. Speaker B might simply hypothesize that Speaker A was assuming a different domain of quantification or that Speaker A did not have access to the same information available to him. Whatever the reason behind (Speaker B’s conjecture about) Speaker A’s failure, these kinds of dialogues could in principle take place. A reasonable question is whether they are likely to take place in real life, that is, whether dialogue participants indeed object to statements whose implicatures are not met. The internet posts in (30) below suggest that
they do. These dialogues demonstrate that when an implicature is false, dialogue participants can object or request clarification.13
Given these attested cases, one would expect to also find real-life exchanges analogous to (29), featuring some and a clausemate sentential negation. Indeed, the example below is precisely of this type.
(31) what do you mean somebody didn’t show up to work . . . are you still showing up?!
     http://profile.myspace.com
In this example, the speaker’s question are you still showing up?! challenges the implicature triggered by wide scope some, namely the implication that someone still shows up to work. To be sure, this challenge would be uncharitable if the speaker were able to interpret some in the scope of negation, accessing the reading that no one shows up to work. So this dialogue too could lead a learner to infer that some cannot in fact be interpreted in the scope of clausemate negation. The final example we present is a potentially problematic case suggested to us by an anonymous reviewer. The reviewer considers the hypothetical dialogue in (32) below.
(32) Speaker A: Every horse didn’t jump over the fence
     Speaker B: Well, actually, none of the horses did
Note that speaker B’s reply in (32) violates the assumption of charity. The content of speaker B’s utterance indicates that he considers the proposition expressed by the surface reading of speaker A’s utterance to be true, yet the form of the reply (well, actually. . .) signals disagreement with speaker A’s statement, indicating that speaker B uncharitably takes speaker A to have intended the inverse scope reading. The reviewer
13 This is of course not surprising, given that speakers often judge sentences to be false if they are associated with a scalar implicature that is false (see e.g. Bott & Noveck 2004).
(30) a. i like it what do you mean ok??? i think it’s great. the pics look good too. you’re the first five i’ve given in a while. lol i better start giving more
        http://www.blogskins.com
     b. What do you mean “most”? Are there some banks they wouldn’t have controlled?
        http://www.wellingtonpublications.com
     c. WARM?! WARM?!!!? What do you mean it’s warm?? It’s HOTTER than hell out here
        http://profile.myspace.com
4. CONCLUSION
In this paper, we started by considering a fairly specific proposal due to Crain et al. (1994) and we finished by challenging a very influential view about language acquisition that dates back to Wexler and Culicover (1980). Our main concern was the logical motivation for the Semantic Subset Principle proposed by Crain et al. (1994) and the Subset Principle proposed by Wexler and Culicover (1980). We argued
suggests that (32) is nevertheless a conceivable exchange. Speaker B may simply be unaware of the surface scope reading of speaker A’s utterance, a performance error that could be related to the findings of Musolino and Lidz (2006) that adults rarely use sentences like speaker A’s in the surface scope interpretation. The reviewer points out that if exchanges like (32) indeed occurred, they could lead the child to incorrectly expunge the relevant surface scope readings from his grammar. We agree with the reviewer’s reasoning. What remains to be seen, though, is whether such non-charitable performance errors occur with sufficient frequency to actually trigger expunction of a reading available in the adult grammar. If they do, the reading in question would need to be reintroduced to the child’s grammar on the basis of positive evidence. In the particular case at hand, where the expunged reading is a strong reading, the evidence in question would have to be of the type described in the previous section. To conclude, we have argued that when it comes to sentence meanings, the Subset Principle is not in fact a conceptual necessity, as the expunction problem is solvable in principle. The availability of a solution is tied to the existence of dialogues that would violate basic pragmatic principles, if the target grammar corresponded to the biggest language generated by the possible values of a parameter. If the expunction problem can be solved, there is no obvious reason to assume that it is avoided. A task that we leave for future research is that of establishing the amount of evidence of the relevant sort that is available to children, a necessary step to explain how children might make use of the input in order to choose among the options made available by Universal Grammar (see Yang 2003). At the present time, we would simply like to offer a speculation. We do not expect examples like the ones above to be pervasively available in the input. 
We regard this as a promising feature, though, as it might help explain why children seem to take so long to expunge the interpretation in which negation takes wide scope and why adult speakers seem to differ with respect to the strength of the polarity sensitivity of some.
that the presence of downward entailing operators in natural languages and the fact that false (or even simply pragmatically odd) statements are often challenged undermine the need for the Subset Principle and the Semantic Subset Principle in the acquisition of semantics. When it comes to sentence meanings, there is no logical problem of language acquisition, but simply an empirical one. This does not mean that the acquisition of sentence meanings is a trivial enterprise. Children’s ability to solve that problem presupposes their understanding of negation (and downward entailingness) and, in particular cases, their ability to compute scalar implicatures. Thus, the child might indeed have to face several problems before he has access to the tools that would allow him to solve those problems. Eventually, however, the child would be in the position of solving those problems that previous literature deemed unsolvable. The existence of potential solutions to the semantic learnability problems in question, such as the particular solutions we have outlined, shows that the Subset Principle and the Semantic Subset Principle are not conceptual necessities. Going one step further, one might ask whether these principles are even feasible as components of comprehensive theories of language acquisition. It is in fact not obvious that they are. To begin, Musolino (2006) raises the following objection against the Semantic Subset Principle. The principle seems to imply that a child must in the relevant cases have access to both possible readings of a potentially ambiguous sentence, only to then apply the Semantic Subset Principle and to (initially) posit a grammar that permits only one of these two readings. In effect, for the Semantic Subset Principle to apply, the child must have access to the reading that the very same principle is designed to prevent or delay. To be sure, we agree with Musolino that this would be an unreasonable acquisition scenario.
We note, however, that as a conceptual argument against the Semantic Subset Principle, Musolino’s objection is incomplete. The hypothesis put forth by Wexler and Manzini (1987) is that parameters may have default values. So the Semantic Subset Principle could be viewed as a proposal concerning the default settings of certain parameters. In this view, the child would not in fact need to assess entailments between two potential readings. Instead, default settings would dictate that initially only strong readings are available; hence, there would be no need for the child to determine entailments before settling on an initial interpretation. This version of the Semantic Subset Principle is not subject to the particular objection raised by Musolino (2006). However, one still
needs to ask what the relevant parameters and their default values would look like. To illustrate, consider the case of a universal quantifier and clausemate negation. Since the reading in which every takes scope over negation entails the other reading, there would presumably have to be a parameter whose default value ensures that this reading is available. But upon closer reflection, it is apparent that such a default setting would not in fact derive the intended effect of the Semantic Subset Principle. After all, we have seen that a higher downward entailing operator can reverse the direction of entailment, so that the reading in which negation takes wide scope ends up entailing the other reading. The parameter in question and its default value would have to be sensitive to that complication. We are doubtful that this requirement can be met in a sufficiently constrained theory of parameters. Hence, like Musolino, we remain unconvinced that the Semantic Subset Principle is tenable as a constraint on language acquisition. As for the Subset Principle, computational models of language acquisition have had a hard time encoding the requirements expressed by it (see Fodor & Sakas 2005). Also, the Subset Principle has been presented as a constraint on parameter setting, yet the parameter setting model of language acquisition is not without alternatives. The Variational Learning model proposed by Yang (2003) is one alternative. Not having been designed to preempt the expunction problem, it seems that this model could rely on a solution to this problem of the sort offered here. It will be useful to elaborate on this point. To illustrate, Yang (2003) attempts to explain how a child might succeed in attaining the target grammar even if he were free to draw upon any grammar made available by Universal Grammar (UG). On this view, the process of language acquisition starts off as a random walk.
In principle, the child is free to select any grammar that is made available by UG according to the probability associated with that grammar. Learning is taken to amount to a re-adjustment of the probability associated with each grammar. For any piece of input, if the grammar which happens to be selected by the child successfully analyzes that piece of input, the probability associated with that grammar increases while the probabilities associated with all other grammars decrease. By contrast, if the grammar selected by the child does not allow him to analyze the input encountered by the child, then the probability associated with that grammar is decreased and the probabilities associated with all other grammars are increased. Crucially, on this model, the input is only used to ‘test’ one grammar at a time, but the results of this test have consequences for the probability associated with each grammar. In the long run, the occurrence of
Acknowledgements
For discussion of the material presented here, we thank Stephen Crain, Marc Garellek, Luisa Meroni, Jennifer Morehouse and Michelle St-Amour, the participants of our Fall 2006 seminar on the acquisition of semantics at McGill, as well as the audience at the 2007 Workshop on Negation at the University of Tübingen. We also thank two anonymous reviewers and Bart Geurts for detailed comments on an earlier version of the paper. Part of the research reported in this paper was supported by a grant to Andrea Gualmini from the Arts Undergraduate Society of McGill University. Andrea Gualmini is currently supported by a VIDI fellowship from the Netherlands
experience that cannot be analysed by any grammar other than the target grammar ensures that the probabilities associated with non-target grammars decrease. The problem is that, in the scenario which the Subset Principle is expected to prevent, there does not seem to be any evidence that could not be accounted for by the grammar associated with the superset language. Thus, it is not clear whether the target subset grammar could ever win if the learner happened to draw upon a grammar that generates a superset of the meanings licensed by the target grammar. In such a scenario, the grammar selected by the learner would always be rewarded and all the competing grammars would be punished, including the target grammar. If the learner embarked on an unlucky streak, he might never be able to decrease the probability of accessing the superset grammar and that grammar might end up being adopted by the learner. In other words, although the Variational Learning model provides us with several advantages over the parameter setting model, it does not provide us with a solution to the expunction problem. As a consequence, unless a solution to the expunction problem can be put forward, it seems as though the initial steps of the learner must be set, a conclusion that seems at odds with the view of language acquisition as a ‘random walk’ among possible grammars. Against this background, one could view our contribution as an enrichment of the kind of evidence that would lead the child to punish any given grammar. It is often assumed that the child receives valuable information by encountering a structure that would not be licensed by his current grammar. Similarly, it is often assumed that the child receives valuable information by encountering a sentence for which no true interpretation would be generated by his current grammar.
In our view, the child might also receive valuable information by encountering a sentence which is well formed and true—but pragmatically infelicitous—according to his current grammar.
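The interaction between the Variational Learning model and the enriched evidence proposed here can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not an implementation taken from Yang (2003): it pits a target (subset) grammar against a single superset grammar, uses a linear reward-penalty update with a hypothetical learning rate, and treats a dialogue like (29) as a failure for the superset grammar because that grammar renders Speaker B's reply pragmatically infelicitous. The parameter names, the starting probability of 0.5 and the rate at which such dialogues occur are all invented for illustration.

```python
import random

def variational_learner(challenge_rate, steps=5000, rate=0.05, seed=0):
    """Return the final probability of the target (subset) grammar.

    Two grammars compete: the target grammar, which excludes the reading with
    `some` in the scope of negation, and a superset grammar that allows it.
    """
    rng = random.Random(seed)
    p_target = 0.5  # assumption: the learner starts undecided
    for _ in range(steps):
        picked_target = rng.random() < p_target
        # A "challenge" datum is a dialogue like (29). By assumption it is
        # pragmatically infelicitous only under the superset grammar.
        is_challenge = rng.random() < challenge_rate
        success = picked_target or not is_challenge
        # Linear reward-penalty update for two grammars: the target gains
        # whenever it is rewarded or its only rival is punished.
        if picked_target == success:
            p_target += rate * (1.0 - p_target)
        else:
            p_target *= 1.0 - rate
    return p_target

# Some dialogues of the relevant kind occur in the input.
with_challenges = variational_learner(challenge_rate=0.2)
# No such dialogues: every datum is compatible with both grammars.
without_challenges = variational_learner(challenge_rate=0.0)
```

With a non-zero challenge rate, the superset grammar is punished from time to time while the target grammar never is, so the target grammar's probability drifts toward 1. With a rate of zero, whichever grammar is selected is always rewarded, and the outcome of a given run depends on chance alone, which is the expunction problem in miniature.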
Organization for Scientific Research (NWO) and Utrecht University. Bernhard Schwarz was supported in part by a Programme d’Établissement de Nouveaux Chercheurs research grant from Fonds Québécois de la Recherche sur la Société et la Culture (FQRSC).
ANDREA GUALMINI
Utrecht Institute of Linguistics OTS
Janskerkhof 13
3512 BL Utrecht
The Netherlands

BERNHARD SCHWARZ
Department of Linguistics
McGill University
1085 Dr. Penfield
Montreal, QC H3A 1A7
Canada

REFERENCES
Berwick, R. (1985), The Acquisition of Syntactic Knowledge. MIT Press. Cambridge, MA.
Bott, L. & I. A. Noveck (2004), ‘Some utterances are underinformative: the onset and time course of scalar inferences’. Journal of Memory and Language 51:437–45.
Brown, R. & C. Hanlon (1970), ‘Derivational complexity and order of acquisition in child speech’. In J. R. Hayes (ed.), Cognition and the Development of Language. John Wiley. New York.
Chomsky, N. (1981), Lectures on Government and Binding: The Pisa Lectures. Foris Publications. Dordrecht.
Crain, S., L. Conway, & W. Ni (1994), ‘Learning, parsing, and modularity’. In C. Clifton, L. Frazier, & K. Rayner (eds.), Perspectives on Sentence Processing. Lawrence Erlbaum Associates. Hillsdale, NJ. 443–67.
Crain, S. & R. Thornton (1998), Investigations in Universal Grammar: A Guide to Experiments on the Acquisition of Syntax and Semantics. MIT Press. Cambridge, MA.
Fodor, J. D. (1970), The Linguistic Description of Opaque Contexts. Ph.D. dissertation, MIT. Cambridge, MA.
Fodor, J. D. & W. G. Sakas (2005), ‘The Subset Principle in syntax: the costs of compliance’. Journal of Linguistics 41:513–69.
Gazdar, G. (1979), Pragmatics. Academic Press. New York.
Goro, T. & S. Akiba (2004a), ‘The acquisition of Japanese disjunction and positive polarity’. In V. Chand, A. Kelleher, A. J. Rodríguez, & B. Schmeiser (eds.), Proceedings of the 23rd West Coast Conference in Formal Linguistics. Cascadilla Press. Somerville, MA. 251–64.
Goro, T. & S. Akiba (2004b), ‘Japanese disjunction and the acquisition of positive polarity’. In Y. Otsu (ed.), Proceedings of the Tokyo Conference on Psycholinguistics. Hitsuji Shobo. Tokyo, Japan. 137–62.
Grice, H. P. (1975), ‘Logic and conversation’. In P. Cole and J. L. Morgan (eds.), Syntax and Semantics 3: Speech Acts. Academic Press. New York. 41–58.
Gualmini, A. (2004), The Ups and Downs of Child Language: Experimental Studies in Children’s Knowledge of Entailment Relationships and Polarity Phenomena. Routledge. New York.
Gualmini, A., S. Hulsey, V. Hacquard, & D. Fox (2008), ‘The Question-Answer requirement for scope assignment’. Natural Language Semantics 16:205–38.
Guasti, M. T., G. Chierchia, S. Crain, F. Foppolo, A. Gualmini, & L. Meroni (2005), ‘Why children and adults sometimes (but not always) compute implicatures’. Language and Cognitive Processes 20:667–96.
Horn, L. (1972), On the semantic properties of the logical operators in English. PhD thesis. University of California, Los Angeles.
Horn, L. R. (1989), A Natural History of Negation. University of Chicago Press. Chicago.
Jackendoff, R. (1972), Semantic Interpretation in Generative Grammar. MIT Press. Cambridge, MA.
Krämer, I. (2000), Interpreting Indefinites: An experimental study of children’s language comprehension. Ph.D. dissertation, University of Utrecht.
Ladusaw, W. (1979), Polarity Sensitivity as Inherent Scope Relations. Ph.D. dissertation, University of Texas, Austin.
Marcus, G. F. (1993), ‘Negative evidence in language acquisition’. Cognition 46:53–85.
Musolino, J. (1998), Universal Grammar and the Acquisition of Semantic Knowledge: An Experimental Investigation into the Acquisition of Quantifier-Negation Interaction in English. Ph.D. dissertation, University of Maryland.
Musolino, J. (2006), ‘On the semantics of the Subset Principle’. Language Learning and Development 2:195–218.
Musolino, J., S. Crain, & R. Thornton (2000), ‘Navigating negative quantificational space’. Linguistics 38:1–32.
Musolino, J. & J. Lidz (2006), ‘Why children aren’t universally successful with quantification’. Linguistics 44:817–52.
O’Leary, C. & S. Crain (1994), ‘Negative polarity items (a positive result) positive polarity items (a negative result)’. Paper presented at Boston University Conference on Language Development.
Rooth, M. (1992), ‘A Theory of Focus Interpretation’. Natural Language Semantics 1:75–116.
Russell, B. (1905), ‘On Denoting’. Mind 14:479–93.
Sauerland, U. (2004), ‘Scalar implicatures in complex sentences’. Linguistics and Philosophy 27:367–91.
Schwarzschild, R. (1999), ‘GIVENness, Avoid F and other Constraints on the Placement of Focus’. Natural Language Semantics 7:141–77.
Unsworth, S. (2005), Child L2, Adult L2, Child L1: Differences and Similarities. A study on the acquisition of direct object scrambling in Dutch. PhD thesis. Utrecht University.
Unsworth, S. & A. Gualmini (2007), ‘Uncovering the pattern of children’s interpretation of negation and indefinites’. Paper presented at Boston University Conference on Language Development 32, Boston University, USA [2nd - 4th Nov].
Unsworth, S. & C. Helder (2007), ‘Dissolving a Dutch delay: The case of specific indefinites’. Paper presented at Generative Approaches to Language Acquisition, Barcelona, Spain [September 6th - 8th].
Wexler, K. & P. Culicover (1980), Formal Principles of Language Acquisition. MIT Press. Cambridge, MA.
Wexler, K. & R. Manzini (1987), ‘Parameters, binding theory, and learnability’. Linguistic Inquiry 18:413–44.
Yang, C. (2003), Knowledge and Learning in Natural Language. Oxford University Press. Oxford.

First version received: 06.06.2007
Second version received: 11.01.2009
Accepted: 25.01.2009