Accentuation and Interpretation Hans-Christian Schmitz
Palgrave Studies in Pragmatics, Language and Cognition Series E...
41 downloads
631 Views
2MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
Accentuation and Interpretation Hans-Christian Schmitz
Palgrave Studies in Pragmatics, Language and Cognition Series Editors: Noël Burton-Roberts and Richard Breheny Series Advisors: Kent Bach, Anne Bezuidenhout, Robyn Carston, Sam Glucksberg, Francesca Happé, François Recanati, Deirdre Wilson Palgrave Studies in Pragmatics, Language and Cognition is a new series of high quality research monographs and edited collections of essays focusing on the human pragmatic capacity and its interaction with natural language semantics and other faculties of mind. A central interest is the interface of pragmatics with the linguistic system(s), with the ‘theory of mind’ capacity and with other mental reasoning and general problem-solving capacities. Work of a social or cultural anthropological kind will be included if firmly embedded in a cognitive framework. Given the interdisciplinarity of the focal issues, relevant research will come from linguistics, philosophy of language, theoretical and experimental pragmatics, psychology and child development. The series will aim to reflect all kinds of research in the relevant fields – conceptual, analytical and experimental. Titles include: Anton Benz, Gerhard Jäger and Robert van Rooij (editors) GAME THEORY AND PRAGMATICS Reinhard Blutner and Henk Zeevat (editors) OPTIMALITY THEORY AND PRAGMATICS María J. Frápolli (editor) SAYING, MEANING AND REFERRING Essays on François Recanati’s Philosophy of Language Corinne Iten LINGUISTIC MEANING, TRUTH CONDITIONS AND RELEVANCE The Case of Concessives Ira Noveck and Dan Sperber (editors) EXPERIMENTAL PRAGMATICS Ulrich Sauerland and Penka Stateva (editors) PRESUPPOSITION AND IMPLICATURE IN COMPOSITIONAL SEMANTICS Hans-Christian Schmitz ACCENTUATION AND INTERPRETATION Christoph Unger GENRE, RELEVANCE AND GLOBAL COHERENCE The Pragmatics of Discourse Type
Palgrave Studies in Pragmatics, Language and Cognition Series Series Standing Order ISBN 0–333–99010–2 Hardback 0–333–98584–2 Paperback (outside North America only) You can receive future titles in this series as they are published by placing a standing order. Please contact your bookseller or, in case of difficulty, write to us at the address below with your name and address, the title of the series and the ISBN quoted above. Customer Services Department, Macmillan Distribution Ltd, Houndmills, Basingstoke, Hampshire RG21 6XS, England
Accentuation and Interpretation Hans-Christian Schmitz University of Frankfurt am Main
© Hans-Christian Schmitz 2008 All rights reserved. No reproduction, copy or transmission of this publication may be made without written permission. No paragraph of this publication may be reproduced, copied or transmitted save with written permission or in accordance with the provisions of the Copyright, Designs and Patents Act 1988, or under the terms of any licence permitting limited copying issued by the Copyright Licensing Agency, 90 Tottenham Court Road, London W1T 4LP. Any person who does any unauthorised act in relation to this publication may be liable to criminal prosecution and civil claims for damages. The author has asserted his right to be identified as the author of this work in accordance with the Copyright, Designs and Patents Act 1988. First published 2008 by PALGRAVE MACMILLAN Houndmills, Basingstoke, Hampshire RG21 6XS and 175 Fifth Avenue, New York, N. Y. 10010 Companies and representatives throughout the world PALGRAVE MACMILLAN is the global academic imprint of the Palgrave Macmillan division of St. Martin’s Press, LLC and of Palgrave Macmillan Ltd. Macmillan® is a registered trademark in the United States, United Kingdom and other countries. Palgrave is a registered trademark in the European Union and other countries. ISBN-13: 978–0–230–00253–1 hardback ISBN-10: 0–230–00253–6 hardback This book is printed on paper suitable for recycling and made from fully managed and sustained forest sources. Logging, pulping and manufacturing processes are expected to conform to the environmental regulations of the country of origin. A catalogue record for this book is available from the British Library. A catalog record for this book is available from the Library of Congress. 10 17
9 16
8 15
7 14
6 13
5 12
4 11
3 10
2 09
1 08
Printed and bound in Great Britain by CPI Antony Rowe, Chippenham and Eastbourne
Contents
Acknowledgements
vii
1
Introduction: Pragmatic and Semantic Effects of Accentuation
2
Optimal Accentuation 1 Hypothesis of optimal accentuation 2 Reconstruction of messages 3 Hypo- and hyperspeech 4 Impairment of speech communication 5 Accentuation and speech recognition 6 Discussion and summary
6 6 7 15 16 20 23
3
Cooperative Information Exchange 1 Conversational maxims and active interpretation 2 The common ground 2.1 Presupposing a common ground 2.2 Updating the common ground (1) 3 Goal-oriented information exchange 3.1 Questions and answers 3.2 Updating the common ground (2) 4 Conversational maxims and active interpretation revisited
28 31 39 39 42 47 48 54 74
4
Reconstruction of Messages 1 A truly simple fragment of English 2 Interpretation by means of QUADs 2.1 Completion with a QUAD 2.2 Exhaustification 2.3 Type shifting 2.4 Interpretation and accentuation 2.5 Presupposition of questions 2.6 Summary 3 Interpretation without QUADs 4 Context configurations for interpretation
83 85 92 92 104 108 116 123 128 129 136
5
Optimal Accentuation vs Focus Accentuation 1 Semantic effects of accentuation 1.1 Semantic effects of optimal accentuation
142 144 144
v
1
vi
Contents
2
3 6
1.2 Semantics of the focus feature 1.3 Comparison Predictions of stress patterns 2.1 Second occurrence foci 2.2 Focus projections 2.3 Summary Conclusions
Summary
158 168 170 171 182 197 199 201
Appendix: Type Logic with Lambda Operator 1 Syntax 2 Semantics
205 205 207
References
210
Index
223
Acknowledgements Thank you very much: Winfried Lenders for constant support and for giving me the opportunity to write the pre-version of this book as a doctoral dissertation while enjoying a position at the former Institute for Communication Research and Phonetics (IKP, University of Bonn); Bernhard ¨ Schroder for illuminating discussions and for being a perfect supervisor; Petra Wagner for performing experiments together with me; the test persons for being test persons; Joost Kremers for translating the manuscript before I revised it (all mistakes are mine); Bernhard Fisseni for critical remarks and for being a TEX-expert; Wolfgang Hess, Eric Fuß, Dietmar Lanc´e, Robert van Rooy, Ulrich Schade, Henk Zeevat and Ede Zimmermann for valuable suggestions and comments; the editorial team at Palgrave – especially Jill Lake, Melanie Blair and Ann Marangos – for being helpful and always nice; Thomas Roth for corrections; my parents Gisela and Bernhard Schmitz and my sister Heike for constant encouragement and support; and ultimately my wife Julia and our son Johannes, who have both been very patient with me (Johannes due to his age somewhat less than Julia).
vii
This page intentionally left blank
1 Introduction: Pragmatic and Semantic Effects of Accentuation The following examples show that accentuation can change the conditions under which a sentence can be used: (1)
a.
Who did John introduce to Sue? –
b.
John introduced BILL to Sue. To whom did John introduce Bill? – John introduced Bill to SUE.
The declarative sentences in the examples (1-a) and (1-b) can be used felicitously as answers to the questions preceding them. The senctence with stress on “Bill” (example (1-a)), however, is not a felicitous answer to the question of example (1-b), nor can the sentence with stress on “Sue” (example (1-b)) be used as a reply to the question of example (1-a). Due to the different stress patterns both sentences require different utterance contexts; they have different use conditions. If the word “only” is added to the sentences, then they do not only have different use conditions, they also obtain different truth conditions: (2)
a. b.
John only introduced BILL to Sue. John only introduced Bill to SUE.
The sentence in (2-a) is true if John introduced only Bill and no one else from a given group to Sue; for the truth conditions, it is irrelevant whether John introduced Bill to other people as well. The sentence in (2-b), on the other hand, is true if John introduced Bill only to Sue, and not to any one else from a given group; the sentence is false if John also introduced Bill to other people. The sentences in (2-a) and (2-b) are therefore true under different circumstances; their semantic difference is connected to the difference in stress patterns. Focus theories form the standard account to the explanation of pragmatic and semantic effects of accentuation. It is assumed that when two sentences have different truth or use conditions they contain different words or have
1
2
Accentuation and Interpretation
different syntactic structures. The sentences in (2-a) und (2-b) contain the same words; they must ergo have different syntactic structures. To identify the syntactic difference, the theoretical term “focus” is used: the sentences in (2-a) and (2-b) have different syntactic structures, because they have different foci; the focus of sentence (2-a) is the word “Bill”, while the focus of sentence (2-b) is the word “Sue”. Stress serves the purpose of marking focus. When two sentences are stressed differently, they have different foci and must therefore be interpreted differently. Focus theories contain, first, syntactic rules that determine which parts of a sentence can be focused, secondly, phonological rules that determine how focus can be marked through stress, and, thirdly, semantic rules for the interpretation of sentences whose foci have already been identified. Through the definition of syntactic, phonological and semantic rules, focus theories specify for every sentence a relation between stress pattern and interpretation. Stress pattern and interpretation are based on the syntactic structure of sentences; the relation between stress pattern and interpretation is grammatical. Focus theories have a flaw: it is possible that although the focus of a sentence is specified, the accentuation and/or interpretation of the sentence still depends on the utterance context. Selkirk (1995) and Schwarzschild (1999), among others, show that the accentuation of a focused constituent that contains more than one word can depend on the utterance context; that is, establishing that a certain phrase is focused does not suffice to determine with certainty which words of the phrase are to be accentuated. Furthermore, the examples in (3) show that for the interpretation of a sentence, it is not always sufficient to determine its syntactic focus-structure: (3)
a. b.
Which men did John introduce to Sue? – John only introduced BILL to Sue. Which students did John introduce to Sue? – John only introduced BILL to Sue.
The sentence “John only introduced BILL to Sue” is true if John introduced only Bill and no other person from a given group to Sue. For a complete determination of the truth conditions, this given group must be identified: when the sentence is uttered as an answer to the question which men John introduced to Sue (example (3-a)), then this group is a subset of the set of all men. That is, the sentence is true if John did not introduce any other man than Bill to Sue. When, on the other hand, the sentence is uttered as an answer to the question which students John introduced to Sue (example (3-b)), then the given group is a subset of all students. The sentence is then true, when John did not introduce any other student than Bill to Sue. The
Introduction: Pragmatic and Semantic Effects of Accentuation 3
sets of men and of students are not the same; therefore, the answers in the examples (3-a) and (3-b) have different truth conditions. The semantic difference exists only as a result of the different utterance contexts; it cannot be related to a syntactic difference. “So what?”, one could argue. “The syntactically based interpretation mentioned above isn’t incorrect because of that. The sentence is obviously true if John hasn’t introduced anyone other than Bill to Sue. Its truth conditions are at most weakened by the reference to the context, in that the domain of quantification of the operator denoted by the word “only” – the given group – is restricted. The fact that a domain of quantification can be contextually restricted does not argue against the focus theoretic interpretation.” (4)
Does John love Sue or does he just like her? – John only LIKES Sue.
The problem cannot be dismissed so easily, however: the sentence “John only LIKES Sue” is true if John stands in only one relation to Sue out of an unspecified set of relations, namely in the relation of liking her. Obviously, John bears many other relations to Sue, e.g. of not being identical with her. In order for “John only LIKES Sue” to be able to be true at all, the set of relations to be considered must be restricted; without such a set being specified, the sentence cannot be adequately interpreted. In example (4) the requisite set is specified by the question. “John only LIKES Sue” must be interpreted with reference to this question; as an answer, the sentence means that John likes Sue, but does not love her. For the sentence “John only introduced BILL to Sue” the account that its interpretation is predominantly determined by its syntactic structure, and that the reference to the context only adds some precision, can perhaps be maintained. For the sentence “John only LIKES Sue”, such an account does not seem plausible. The stress pattern and interpretation of a sentence do not have to be fully determined by its syntactic focus-structure. In order to determine the proper stress pattern and the interpretation of a sentence, it may be necessary to refer to its utterance context. The definition of the relation between stress pattern and interpretation requires a theory that explains how the utterance context is connected to accentuation and interpretation. In the present work, I develop such a theory. The following consideration will guide me: accentuation only takes place in speech communication; stress emphasises words vis-´a-vis their linguistic surroundings. Speech communication is subject to disturbances; in disturbed communication, words that are emphasised are more easily recog-
4
Accentuation and Interpretation
nised by the recipient than non-emphasised words. A recipient may be able to interpret sentences that he has not fully recognised. It may be sufficient for him to recognise only a few words in order to grasp the entire meaning of an uttered sentence; relevant here are those words that are critical for a correct interpretation – I call these words the i-critical words. It aids communication when a speaker emphasises the i-critical words of his utterance by stressing them, and when he thus increases the probability that those words are recognised. I advance the hypothesis that accentuation only serves to emphasise words that are critical for understanding. I call this hypothesis the hypothesis of optimal accentuation. Which words are critical to the understanding of a sentence and must therefore be accentuated must be determined with reference to the discourse context. Different stress patterns are optimal in different contexts. Sentences that differ in their stress patterns must be uttered in different contexts; they have different use conditions. Under the assumption that a sentence is accentuated optimally, the accentuation presupposes a certain configuration of the context. When the context in turn influences the truth conditions of the sentence, the choice of stress pattern indirectly co-determines meaning; its semantic effect arises as an epiphenomenon of optimal accentuation. Which words are critical to the understanding of a sentence in a given situation is determined through a model of the interpretation of incompletely recognised sentences. In the process of interpreting an incompletely recognised sentence, the recipient must semantically enrich the parts of the sentence that he recognised, in order to compensate for the parts that he did not recognise. In order to do this, he, first, needs rules for reconstructing messages; secondly, he needs criteria of adequacy, in order to determine whether a reconstruction that he made may be correct. I call such a model of the interpretation of incompletely recognised sentences a model of active interpretation. A model of active interpretation specifies how an incomplete sentence is to be interpreted in a given context, and furthermore, which words in a sentence must be recognised in a given context in order for the sentence to be interpreted correctly. Knowing these interpretation-critical words, the model determines which words in a given context are to be accentuated in order to reach the optimal effect. This model forms the basis for a theory of optimal accentuation. A theory of optimal accentuation is normative to the extent that it prescribes how a speaker must accentuate his utterance in order to facilitate understanding by the recipient. Speakers do not always accentuate their utterances optimally; nonetheless, I assume that there is a strong tendency toward optimal accentuation. A theory of optimal accentuation is therefore
Introduction: Pragmatic and Semantic Effects of Accentuation 5
descriptive to the extent that it describes how speakers generally accentuate (or rather, to the extent that it describes how speakers accentuate without error). As such, the theory is an empirical theory, and can be evaluated experimentally. In a theory of optimal accentuation, there is no need for the theoretical term “focus”; it does not need to be replaced by any other term. The book is organised as follows. In Chapters 2–4, I define the main features of a theory of optimal accentuation. I develop, argue for and discuss the hypothesis of optimal accentuation in Chapter 2. Subsequently, I develop in Chapters 3 and 4 a model of active interpretation: in Chapter 3 I define a model of cooperative information exchange and specify criteria of adequacy for uttering complete sentences and interpreting them. In Chapter 4, I expand this model with rules for the reconstruction of incompletely recognised sentences. In Chapter 5, the theory of optimal accentuation is compared with focus theories. I show that the theories make partially different predictions of stress patterns and present experimental data in favour of optimal accentuation. I conclude by summarising the results in Chapter 6.
2 Optimal Accentuation In this chapter, I will develop the hypothesis of optimal accentuation, show on which premises this hypothesis is based, and discuss some objections.
1 Hypothesis of optimal accentuation Let us assume Shannon’s model of a general communication system as depicted in Figure 2.1. A message is encoded in speech by the speaker (Information Source/Transmitter). The corresponding linguistic expression is sent to the recipient (Receiver/Destination) as a signal through a transmission channel. During transmission, disturbances may occur that alter the signal. The recipient receives the, possibly altered, signal. He decodes the signal, recognises a linguistic expression, and ideally, reconstructs the message as intended by the speaker. Finally, he interprets the message and – if the interpretation was successful – understands its meaning. Shannon mentions only one type of disturbance, namely channel disturbance. Apart from channel disturbances, communication can also be impaired by imperfect encoding (conversion of the message into a linguistic expression and into a signal), and by imperfect decoding or reconstruction. In speech communication, a speaker sends a message by forming a sequence of words – possibly a sentence – pronouncing these words aloud Information Transmitter Source
Receiver
-
Signal
Message
6 (Received) Signal
Destination
Message
Noise Source
Figure 2.1 General communication system according to Shannon (1948) 6
Optimal Accentuation 7
and emphasising some of them by accentuation. I posit the hypothesis that accentuation is a means of enhancing the ability of the recipient to reconstruct the speaker’s message. In the optimal case, the speaker emphasises the words that are critical for reconstruction and interpretation – the i-critical words. He should not emphasise words that the recipient does not need to recognise: In the optimal case, a cooperative speaker accentuates a minimal number of words. The words that are accentuated are those words that – when recognised – suffice for the recipient to understand the entire meaning of the speaker’s utterance. The hypothesis of optimal accentuation is based on four premises: 1. A cooperative speaker wants to be understood; in the optimal case, he will express himself in such a way that the recipient is most likely to understand him. 2. With knowledge of the discourse context, especially of the question under discussion, a recipient can reconstruct and interpret a message completely even when he has only recognised a few of the words uttered by the speaker. Ideally, the speaker shapes his utterance in such a way that the recipient will recognise at least the i-critical words. 3. Speech communication can be impaired (disturbed). A speaker must assume that his utterance is recognised incompletely. 4. Accentuated words are more likely to be recognised by the recipient than non-accentuated words. The smaller the number of accentuated words is, the higher is the probability that all accentuated words are recognised correctly. The first premise appears to be trivially true. Situations in which the speaker should strive not to be understood in order to reach the communicative goal are at least very difficult to construct. The other premises must be substantiated.
2 Reconstruction of messages With knowledge of the discourse context, especially of the question under discussion, a recipient can reconstruct and interpret a message completely even when he has only recognised a few of the words uttered by the speaker.
8
Accentuation and Interpretation
It is possible that a speaker utters a complete sentence, while the recipient recognises only parts of it. It is also possible that a speaker utters a sentence only incompletely, so that the recipient cannot recognise a complete sentence from the outset. In both cases the recipient must expand that which he has heard in order to reconstruct the intended message of the speaker. Let us first consider the case in which a speaker utters a sentence incompletely: (1)
Who talked to Jane? – Yves.
“Yves” is a name, which in itself denotes nothing more than a specific person. By referring to the explicitly formulated question who talked to Jane, the recipient can, when recognising the name “Yves”, reconstruct the message that Yves talked to Jane, or – semantically stronger – the message that Yves and no one else talked to Jane. “Yves” is interpreted as a constituent answer to the questions under discussion. In the case of the second reconstruction, “Yves” is interpreted as an exhaustive answer. In both cases, the reconstructed message is a proposition; “Yves” is not interpreted as just a name. In example (1), a question is explicitly asked by the utterance of an interrogative sentence. The recipient of the constituent answer can refer back to the interrogative sentence for reconstructing the complete answer. The next example (2) shows that it is not always necessary to understand the meaning of an interrogative sentence to be able to use it in the reconstruction of an answer: (2)
Who expiated his offences? – James.
Someone who does not know what the expiation of offences means, does not understand the question in example (2). Nonetheless, he can use the interrogative sentence in order to reconstruct a complete answer, according to which James expiated his offences. When a question is under discussion and the recipient can assume that the speaker communicates cooperatively, he can rely on the speaker to be answering the question under discussion. The recipient can reject reconstructions of the speaker’s message that are in principle possible but that do not answer the question. With that, we have a criterion for deciding between different possible reconstructions. It may happen that on the basis of the linguistic material available, only one reconstruction is possible. If this single possible reconstruction does not answer the intended question, an anomaly arises. The following example (3) relies on this effect for its predictable punch line:
Optimal Accentuation 9
Behind you!
Behind you!
Figure 2.2 Football, cooperative messages
(3)
– My dog has no nose. – No nose?! How does he smell? – Bloody awful!
Let me take stock. An incomplete sentence can be completed and interpreted by referring to an already uttered interrogative sentence, as the examples (1)–(3) show. An interrogative sentence can even be used to reconstruct an answer when its meaning is not understood. In order to verify the adequacy of a reconstruction, the meaning of the interrogative sentence must however be taken into account. It is possible that a question is given only implicitly; in that case, there is no linguistic material of an interrogative sentence available. Yet, an incomplete sentence can under certain circumstances still be interpreted by referring to a question that remains implicit: Figure 2.2 shows two scenes from a game of football. In the first scene (left) the white player in possession of the ball is attacked by two black players of the opposing team. To avoid losing the ball, the white player will need to know where he can pass it. His team-mate knows the situation; it is clear to him which question is on the agenda of the player with the ball. He can answer this implicit question; his shout “Behind you!” will be understood by the player with the ball in the sense of “Behind you there is someone you can pass the ball
10
Accentuation and Interpretation
Behind you!
Behind you!
Figure 2.3 Football, uncooperative messages
to”.1 In the second scene (right) of Figure 2.2 it is not obvious to the player in possession of the ball that he is being attacked. He appears to have a free path to the goal and he does not need to find anyone to pass the ball to. Nonetheless, he has to look out for possible attacks from players of the opposing team. Now he can only interpret the shout of his teammate as “Be careful, there’s someone from the opposing team behind you”. “Behind you there is someone you can pass the ball to” is not an adequate reconstruction, because the question where the player can pass the ball to is not on his agenda. In order to reconstruct the incomplete sentence “Behind you!”, the recipient must assume that his team-mate is cooperative and answers the question that must be on the agenda. A team-mate’s shout that does not meet this criterion leads to misunderstanding. Figure 2.3 shows two scenes in which “Behind you!” does not mean a cooperative message. In the first scene (left), the player in possession of the ball is attacked from the front, and so looks for someone to pass the ball to. His team-mate’s shout can be understood as an answer to this question. When, however, the white player should now pass the ball backwards, the ball will not reach his team-mate but an opponent. The shout will with some certainty result in a wrong 1 It is immaterial here whether it is the speaker or another team-mate that is behind the player in possession of the ball. For the player there is only one sensible interpretation for the incomplete utterance “Behind you!”.
Optimal Accentuation
11
pass.2 In the second scene (right), the only question relevant to the player in possession of the ball is whether he may be chased by a black player. A shout from a team-mate will cause him to turn his attention to a possible danger coming from behind, rather than to proceed speedily towards the goal. In the best case, then, the shout causes the attack to slow down; it is not cooperative. Let me take stock. An incomplete sentence can in certain circumstances also be interpreted with reference to a background question when that question is only implicitly given, and when there is no explicitly uttered, linguistic material for it available. Uttering an incomplete sentence presupposes that the recipient has enough information to complete the sentence. In the examples so far the recipient has to have an explicit or implicit background question available to him. A speaker, who cannot presuppose such a question, cannot always assume that his incompletely uttered message will be reconstructed correctly: (4)
Flight delayed. Don’t wait. Take Train.
The text message (4) is ambiguous. Let us assume that the receiver is waiting for the sender at an airport. The sender’s flight is delayed. Depending on the questions that the receiver presupposes, several different interpretations become available to him. If he wonders what the sender is going to do, he will take the text message to mean that the sender will not wait for his flight, and instead will take the train. If the receiver wonders what he himself should do, he will take the message to mean that he should not wait for the sender, but take the train. It is also possible that the receiver has a series of questions. Then he will take the message to mean that he should not wait, because the sender is taking a train; or that the sender will not wait and that he – the receiver – should take a train. In this way, the text message has at least four distinct readings; the different readings are determined by different series of background questions. The sender can only expect his text message to be understood properly when he has reason to assume that the receiver presupposes the correct background questions.3
2
It would be unfair, but under certain circumstances effective, if in the left scene a black player, attacking from behind, should shout “Behind you!”, thus causing the white player to pass the ball to him. 3 Some people argue that the sender should write “Will not wait” instead of “Don’t wait” if he wants to convey what he will do next. This might be so. Still, the interpretation of “Don’t wait” in the sense of “I don’t wait” seems to be possible.
12
Accentuation and Interpretation
In Chapter 3, I will construe an interpretation model in which every assertion is interpreted as the answer to an explicitly or implicitly given background question. There are, however, utterances – even incomplete utterances – that do not require background questions in order to be reconstructed. To these belong, for example, newspaper headlines such as the following: (5)
Merkel elected Chancellor.
Example (5) will most likely be completed as “Angela Merkel has been elected Chancellor of Germany.” Less obvious would be completions such as “Angela Merkel has been elected Chancellor of the University of Bonn”, or “Peter Merkel [or anyone else with the surname “Merkel”] has been elected Chancellor of Germany” All these reconstructions can be interpreted as answers to the general and always available questions “What happened?” or “What is the case?”. These general background questions, however, do not prefer a specific reconstruction. The constraint that the reconstructed message must be the answer to a background question must therefore be augmented with further constraints, which make the first reconstruction plausible, and the others less so. Thus far, I have shown that given certain background knowledge – here especially knowledge about a question under discussion – recipients are able to reconstruct and interpret incompletely uttered messages. It is irrelevant whether the sentence was uttered incompletely, or whether it was just recognised incompletely. It changes nothing about the interpretation whether the speaker – as in the following example (6-a) – utters only the name “Yves” or whether he utters a complete sentence – as in example (6-b) – of which the recipient recognises only “Yves”: (6)
a. b.
Who talked to Jane? – Yves. Who talked to Jane? – YVES talked to Jane.
“Yves” in example (6-a) forms a grammatically well-formed elliptic answer, a so-called constituent answer. In contrast, the word “French” in the next example (7-a) does not form a grammatically well-formed constituent answer; it cannot be uttered on its own (example (7-b)): (7)
a. b.
Which student talked to Jane? – The FRENCH student talked to Jane. Which student talked to Jane? – # French.
Optimal Accentuation
13
Nonetheless, for reconstructing the entire answer, it suffices if only the word “French” is recognised: (8)
Which student talked to Jane? – noise FRENCH noise.
Suppose that Jane was at a party that was also visited by three students, namely a French, an Italian and a Spanish student. A recipient that recognises only the word “French” of the response to the question which student talked to Jane should be able to reconstruct an answer in the sense of “The French student talked to Jane”.4 (9)
Which student talked to Jane? – noise talked noise.
Of course, it is not sufficient to recognise any words uttered by the speaker. In example (9), the recipient only recognises the verb “talked” which does not enable him to reconstruct and interpret the complete answer. The recipient has to recognise the words that are crucial for a proper understanding, that is he has to recognise at least the i-critical words. Let me take stock. A recipient can when necessary reconstruct a message even when the part that he recognises is on itself not grammatically wellformed. He is able to understand an answer sentence even when he does not recognise a constituent answer.5 However, he cannot reconstruct the speaker’s message under all circumstance. For a correct understanding he has to recognise at least the i-critical words of the speaker’s utterance. The examples so far show that in the course of reconstructing a message 4 I appeal to the intuition of the reader. The results of a spontaneously performed experiment under conditions that were not completely controlled for and with only a few test persons suggest that the estimate is correct. Native speakers of German ¨ are able to reconstruct a complete answer on the basis of the word “franzosische” (“French”) alone when they know the question that was asked. I expect that this result can be confirmed through an experiment that is executed under methodologically sound conditions. 5 One more example: the dairy section of a supermarket contains various packaged goods. A customer is looking for high-fat curd cheese and would like to know what the various packages contain (background question 1). One package has the word “curd cheese” on it. The customer interprets this label as an answer to his background question: “This package contains curd cheese”. He now asks himself what kind of curd cheese the package contains (background question 2). He turns the package around; on the reverse it says “medium fat”. The customer now understands that the package contains medium-fat curd cheese. The word “medium fat” is not a grammatically well-formed constituent answer to the presuppositional background question 2. Nonetheless, the customer can understand the word as an answer to his question, and consequently puts the package back and looks further for the desired high-fat curd cheese.
14
Accentuation and Interpretation
an incompletely recognised sentence must be expanded; the recognised sentence parts must be semantically enriched. With the last example of this section – which is from Perry (1998) – it is shown that it is not uncommon that a recipient needs to complete and semantically enrich a word sequence that has been recognised in order to understand it; this case is in fact quite common: (10)
It is raining.
Most likely, it is always raining somewhere on earth. Still, we can consider sentence (10) false, e.g. when the sun is shining at the place where the sentence is uttered. If, however, the speaker has just telephoned someone in the Norwegian city of Bergen – where it is almost always raining – and now tells us what the weather is like, we can easily consider the sentence correct. After all, the speaker might just mean that it is raining in Bergen, and not here, where we are currently located. That is, the sentence (10) is semantically underspecified. The place where it is allegedly raining is not specified; in order to be interpreted, it must be further specified with respect to which question is under discussion (“What is the weather like here?”, “What is the weather like in Bergen?”, etc.) Semantically underspecified utterances are no exceptions.6 Speakers often make utterances that the recipient must complete and specify further. “It is now 18.00h” is true of every full hour, unless one specifies the place for which the sentence is supposed to be valid; “a litre of milk costs 2 Euros” is false if we take the prices at a German supermarket as the basis for our interpretation; the sentence will however certainly be true in the future, and already is so at the next petrol station. Summarising, the examples show that recipients can, if required, reconstruct and interpret incompletely uttered or incompletely recognised sentences. In the course of reconstruction the recognised expressions are semantically enriched. I assume that recipients must complete and semantically enrich expressions that they have recognised on a regular basis, not just in exceptional cases. In doing so, they can make reference to the discourse context. In the examples, the target of a reconstruction is given by a question that is under discussion; declarative sentences are reconstructed as answers to such a question.
6 Some theorists – e.g. Searle (1994) – state that no sentence is in itself (i.e. without reference to context) semantically fully specified. Cf. in this regard also Recanati (2004).
Optimal Accentuation
15
3 Hypo- and hyperspeech Ideally, the speaker shapes his utterance in such a way that the recipient will recognise at least the i-critical words. According to the H&H theory,7 speakers have a strong tendency to behave economically; speakers articulate with the least amount of effort needed to ensure that the recipient will understand them. Speech therefore orients itself on two fixed points, namely articulation with minimal effort and articulation with optimal intelligibility. The optimisation unit for articulation is the sentence: sentences tend to be articulated with as little effort as is possible without hampering intelligibility. Speech that is articulated with the least possible effort is called hypospeech; speech that is articulated for the best possible intelligibility is called hyperspeech.8 In hypospeech, the speaker articulates parsimoniously and imprecisely. He “cheats” on articulatory gestures (articulatory undershoot), which leads to reduction and coarticulation effects. He leaves out single sounds and sometimes even complete words (Greenberg (1999)); other sounds are altered – sometimes even categorically (acoustic undershoot).9 Hyperspeech on the other hand requires precise articulation, little or no reduction and therefore a high articulatory effort. The effort that a speaker must make in order to let his utterance be understood varies from situation to situation. A speaker adjusts his speech to the utterance situation and accommodates to the needs of the recipient. Formal situations require a high level of intelligibility from the speaker, so that he must articulate precisely; when on the other hand speaker and recipient are on familiar terms, the speaker can assume that he will be more easily understood, and can therefore reduce his articulatory effort. Determining the proportion of hyper- and hypospeech can accordingly be used to describe different speech styles. Reduction phenomena occur more often in informal conversation than in formal situations (Kohler (1995)); more fa7
Cf. Lindblom (1983, 1990, 1996). Articulatory effort is tied to the amount of energy used for the realisation of articulatory gestures, i.e. to the amount of energy used when moving the speech apparatus and when holding tensed positions. Lindblom and Davis (1998) propose to determine this effort by measuring oxygen use during speaking. Higher articulatory effort should correlate with higher oxygen use. 9 Reduction always entails centring of formants; reduced vowels show a centring of their formant values toward the central vowel [@]. Reduction further consists of the shortening of sound length; with consonants, the voice onset time (VOT) and the length of frictions tend to be shortened, and plosions are sometimes dropped. With a categorical change, a sound no longer shows its own acoustic sound pattern but one that corresponds to another phoneme. 8
16
Accentuation and Interpretation
miliarity between speaker and recipient increases the laxness of articulation (Esk´enazi (1993)). Special adjustment is required for communication with the hearing impaired. Speech directed at people with hearing impairments is usually articulated more precisely; it is louder and slower, and pauses at phrase borders are longer, while fewer pauses are made inside phrases.10 Payton et al. (1994) show that such clear, continuously precisely articulated speech (clear speech) is easier to understand than normal, in part imprecisely articulated speech (conversational speech) for both hearing impaired recipients and recipients with unimpaired hearing. The requirements on articulation also vary within one utterance: parts that are more difficult to foresee and that are especially important for understanding the utterance must be articulated precisely (hyperspeech) in order to enable the hearer to interpret them. Other parts of the utterance can be articulated loosely (hypospeech) when the correct recognition of these parts is of lower importance. Speakers alternate between hypo- and hyperspeech inside utterances. The precisely articulated words in their utterances are demonstrably easier to recognise than imprecisely articulated words. Ideally, i-critical words are articulated precisely.
4 Impairment of speech communication Speech communication can be impaired (disturbed). A speaker must assume that his utterance is recognised incompletely. Speech communication can in principle be impaired in three ways: first, communication can be hampered on the speaker side by imperfect encoding, i.e. through imprecise articulation (hypospeech). Secondly, channel disturbances may occur. Thirdly, the decoding of the signal can be complicated by an attention deficit on the part of the recipient. Every kind of impairment can lead to a poorer understanding by the recipient of the message directed at him. Variations in articulatory precision were discussed in the previous section; let us now turn to disturbances in the transmission channel and to attention deficits. A channel disturbance consists of an attenuation, filtering and/or masking of the speech signal during transmission. An attenuation results from muffling, or when the signal is sent over a large distance. Filtering results from a restriction of the transmission channel, e.g. a restriction of the frequency band, so that the channel brings about a data loss. For example, in a telephone conversation, frequencies that are not covered by the tele10
Cf. Picheny et al. (1985, 1986, 1989) and Uchanski et al. (1996).
Optimal Accentuation
17
phone band are filtered out completely. The discrimination especially of consonants is affected by this, and words without redundancy, e.g. proper names, are recognised less easily. Masking means that other acoustic signals overlie the speaker’s speech signal. The signal can be masked by other speech signals (cocktail-party effect), by the same, temporally delayed signal (echo, reverberation) or by other sounds (traffic noise, music). Masking has a negative effect on intelligibility. The decrease in intelligibility can be determined for a given speech signal on the basis of acoustic parameters. The negative effect increases with the relation between the intensity of a (masking) disturbance and the intensity of the masked or attenuated signal. In much simplified terms: the quieter a speech signal and the louder the disturbing signal in relation to it, the worse the intelligibility of the speech signal.11 Speakers have a tendency to speak louder when background noise increases. Generally, though, the increase of speech volume does not suffice to completely compensate for the increase of background noise. At a speed of 50 kph, the noise level in a modern middle-class car lies at around 55–58 dB, it increases to 67–70 dB at a speed of 100 kph, and with further acceleration to 130 kph it reaches 71–75 dB (Langmann et al. (1998)). An increase of 10 dB in background noise, however, usually prompts a speaker to increase his speech volume by only 3.5–7 dB (Kryter (1970)). That is, an increase in disturbing sounds – e.g. in a car – is compensated for only partially by speaking louder. Moreover, Pickett (1956) points out that the intelligibility of speech decreases steadily from around 78 dB upwards – at the transition from loud articulation to shouting. An increase in speech volume alone is therefore not an adequate means to make oneself understood against strong disturbing signals. Payton et al. (1994) show that acoustic disturbances can be countered by more precise articulation. Speakers do indeed adjust their articulation to noisy surroundings; this effect is called the Lombard Reflex (Lombard (1911), Junqua (1993)). However, the adjustment cannot be straightforwardly described as a type of articulation that is generally more precise and more 11 Established units of measurement are, among others, the Articulation Index (AI, French and Steinberg (1947)) and the Speech Transmission Index (STI, Steeneken and Houtgast (1980)), that take into account the relation between the intensities of speech signal and disturbing signal on various frequency bands. Various frequency bands are taken into account because a speech signal does not have the same intensity over the entire range of frequencies. At frequencies of 500 Hz and higher, the intensity of the signal decreases continuously, so that higher frequencies are more easily disturbed than lower frequencies. Therefore, a disturbance will not necessarily disturb all frequencies equally. The disturbing effect occurs when the frequencies that are crucial for the recognition of the speech signal are effectively masked (Kryter (1970)).
18
Accentuation and Interpretation
intelligible under all circumstances (i.e. hyperspeech). For example, in an undisturbed environment, a speech signal that has been produced in a noisy environment can be more difficult to understand than a textually identical signal that is produced in a silent environment.12 The intelligibility of the speech signal is therefore linked to the utterance context. It should be noted that channel disturbances cannot always be compensated for completely. The best means of compensation is more precise articulation, which requires considerable effort from the speaker. Lastly, the decoding (recognition) of a speech signal can be impaired by an attention deficit on the part of the recipient. An attention deficit can come about in two ways. First, the recipient may be distracted by a sudden change in the environment – e.g. a loud bang (Eimer et al. (1996)). Secondly, the recipient may become tired; his attention drops without being distracted by a stimulus from outside (Koelega (1996)). It is not immediately clear that communication is generally disturbed by fluctuations in attention and that this kind of disturbance can be compensated for by the speaker. Why would a speaker assume in general that the recipient is distracted, and how would he be able to counter such distraction through proper articulation? In shorter conversations, the danger of fatigue should be negligible. In a longer speech, the recipient’s thoughts may drift; but the recipient will then not be able to interpret and understand what is being said. Finally, not even the most precise articulation can counter distractions from suddenly occurring outside stimuli. The assumption that fluctuations in attention form a disturbance that is relevant and that must be compensated for by the speaker becomes more convincing when one takes into account the obscure, i.e. the hardly measurable or even immeasurable, dimension of cognitive effort.13 One can then describe the decrease in attention as a decrease in cognitive effort. Even though cognitive effort is difficult to measure, the assumption that 12 Cf. Junqua (1993): a test person is asked to produce a given text. Simultaneously a channel disturbance is suggested by playing disturbing signals to the test person over head phones. The text is recorded without the disturbing signals. The test person then produces the same text a second time, this time without being played disturbing signals. In this way, two realisations of the same text are obtained, one produced in a noisy environment and one in a silent environment. The intelligibility of the two versions in different contexts is then compared. The version that is produced in a noisy environment is more difficult to understand in a silent environment than the version that was produced in a silent environment. 13 One way to measure cognitive effort is through reaction time experiments, if one assumes that longer reaction times indicate more effort. It is certainly possible, however, that longer reaction times result not from more but rather from less effort made by the recipient over a specified time interval.
Optimal Accentuation
19
there are different levels of cognitive effort is plausible. It is immediately clear that cognitive tasks vary in difficulty and in how much effort they require. It is furthermore plausible to assume that recipients follow an economy principle and strive to minimise their efforts, just like speakers do when they articulate. Recognising a message requires cognitive effort; it requires all the more effort when interfering signals must be filtered out or when the message was articulated imprecisely. The recipient may not be disposed to make the effort required for a complete recognition of the message. If he is simultaneously occupied with another task – e.g. with driving a car – he may not even be able to make the required effort; his attention is then impaired (Heuer (1996)). It is the speaker’s task to direct the recipient’s attention to the recognition of the i-critical words and thus make interpretation possible even with minimal effort. If a speaker emphasises the i-critical words in the sentence that he utters, the hearer only needs to put effort into recognising the emphasised words. Let me take stock. Speech communication is potentially disturbed in three ways: by imprecise articulation, by channel disturbances and by fluctuations in attention. Impairments in speech communication are compensated for only partially: first, speakers follow an economy principle and tend to not articulate every word clearly; secondly, continuously compensating for channel disturbances is hardly possible, and requires much effort; thirdly, it can be assumed that recipients also follow an economy principle and do not concentrate continuously on the decoding of the signal directed at them. Speakers must therefore assume that their messages are recognised only incompletely. The reception of written language is impaired less than the reception of spoken language. In general, the transmission channel for written language is not disturbed, and fluctuations in attention can be compensated for by re-reading. To a large extent it is the reader who determines how much time he requires for reception; he can decode unclear writing non-linearly, by decoding later parts of the text first, and then return to unclear parts earlier in the text. The reader of a written text can successively improve his recognition of the text, and interpret the text when he has recognised it completely. This option is not as readily available to the recipient of a spoken text. It is the speaker, not the hearer, who determines the speed of reception. The hearer can decode the message only linearly, from beginning to end, and can normally do so only once. He may sometimes have the opportunity to ask clarification questions, or ask the speaker to repeat parts of his utterance, but regarding that recognition of spoken language is
20
Accentuation and Interpretation
by default imperfect, clarification questions are not asked very often. Besides, it is not always possible to ask clarification questions. Lectures, radio broadcasts, etc. are received only once and in their entirety. The hearer cannot correct imperfections in perception afterwards, but must compensate for them during reconstruction and interpretation.
5 Accentuation and speech recognition Accentuated words are more likely to be recognised by the recipient than nonaccentuated words. The smaller the number of accentuated words is, the higher is the probability that all accentuated words are recognised correctly. Stress, i.e. the effect of accentuation, manifests itself in the speech signal by an abrupt change of the fundamental frequency and a relative increase of intensity and duration. It can additionally be supported by gestures, such as eyebrow movements (Massaro (2002)).14 Accentuation requires hyperspeech; as a rule, accentuated words are articulated more precisely than non-accentuated words (Greenberg et al. (2001)). Precise articulation, higher intensity and longer duration are the means of clear speech. They facilitate intelligibility and help prevent channel disturbances. Ultimately, stress gives a word prominence. Accentuation directs the recipient’s attention to a word. Accentuated words are therefore more likely to be recognised correctly than non-accentuated words. With respect to the articulatory effort, optimal accentuation is the most economical means to compensate for the communication disturbances described above. Clear articulation does not become less effective when a speaker articulates clearly throughout his utterance, but a continuously clear articulation is costly, and runs against the speaker’s desire to keep the articulatory effort to a minimum. Speakers therefore tend to articulate parts of their utterances loosely (hypospeech) and to articulate altogether parsimoniously. Parsimonious accentuation furthermore serves the purpose of guiding the attention of the recipient: stress highlights a word in its context and thus attracts attention. Such a highlight, or emphasis, can be described as a deviation from a certain standard. A standard becomes stronger, and a deviation from the standard more prominent, when fewer words are em14 The acoustic correlates of stress can be instantiated differently in different languages. Beckman (1986) shows that stress in Japanese is marked more by fundamental frequency change, while in English it correlates with a combination of features, among which an increase of intensity and duration. For an overview of accentuation in different languages, see the various papers in Hirst and di Christo (1998).
Optimal Accentuation
21
phasised. The lower the number of words emphasised in an utterance, the more prominent those words are. Staccato speech, in which each word is accentuated, may be suitable for lending emphasis to a complete utterance in a discourse, but it does not serve intelligibility when an entire text is spoken staccato, so that being accentuated is the standard. Stress is more effective when fewer words are accentuated. That is, when fewer words are accentuated, the probability that the accentuated words are recognised correctly increases.15 Cutler (1976) and Cutler and Fodor (1979) demonstrate experimentally – through measuring reaction times in phoneme recognition tasks – that accentuation serves to guide attention. They show that recipients actively search for accentuated words and that they recognise these words faster than non-accentuated words. However, Terken and Nooteboom (1987) show that incorrect accentuation slows down the recognition of sentences. Words that are not critical for interpretation16 – which should therefore not be accentuated, according to the hypothesis of optimal accentuation – are recognised as such faster when they are not accentuated. It is therefore not the case that accentuated words are always recognised faster than nonaccentuated words. Accentuating words that in the optimal case should not be accentuated does not improve the intelligibility of an utterance. Accentuation also influences the performance of systems for automatic speech recognition (ASR). The guiding of attention should not play a role there, while clear articulation should have a positive effect. In an evaluation of different ASR-systems, Greenberg and Chang (2000) observe that accentuated words are more likely to be recognised correctly than nonaccentuated words. They tested the ASR-systems on the recognition of a part of the annotated Switch-Board Corpus (Godfrey et al. (1992)), a corpus of telephone conversations containing short utterances of various speakers of American English. The utterances are in conversational speech, and in part loosely articulated. Following Greenberg and Chang, there is a 50 per cent higher probability for non-accentuated words in the test corpus to be recognised incorrectly or not at all (i.e. they are left out) than for accentuated words.
15 I justify the thesis that stress is more effective when fewer words are accentuated by referring to the guiding effect that stress has on attention. Of course, if the base probabilities of accentuated words being recognised do not depend on the number of the accentuated words, then for probabilistic reasons alone the recognition of a smaller set of accentuated words is more likely than the recognition of a larger set. 16 Following Terken and Nooteboom (1987), these are words that express known information.
22
Accentuation and Interpretation
Just to clarify: (a) in the phonetic literature, word level stress is distinguished from stress at the utterance level. In every word, one syllable is accentuated, carrying word stress. Stress at the utterance level does not only emphasise syllables in words, but complete words in the linguistic environment. Here, I am only interested in stress at the utterance level. (b) With “stress” and “accent” I basically mean the same. Repeatedly, accentuation is reduced to a change of the fundamental frequency. A distinction is then made between pitch accents17 and stress. In this sense, I am talking about stress, not about (pitch) accentuation. (Cf. Cruttenden (1986).) (c) I use the term “accentuated” as denoting a binary feature of words. However, the acoustic correlates of stress can have varying strengths; they can at least be measured on an ordinal scale. Accentuated words are perceived as being prominent. Recipients can detect different levels of prominence, that is, the perceptional prominence feature can be measured on an ordinal scale as well. A word can be accentuated in a perceivable manner even if it does not show the strongest acoustic correlates and is not the most prominent word in the sentence. This insight will become important in the discussion on focus projections, as they are assumed by focus theories. I will return to this matter in Chapter 5. Accentuated words are perceived by recipients as being emphasised (prominent) in their surroundings; prominence is a relation between a word and its surroundings.18 Emphasising a word requires that some change be made to it in comparison to the words uttered before it. When two adjacent words are accentuated, some change must take place in order to emphasise the second word in comparison to the first: the second word must be accentuated in a different way than the first, that is, the acoustic correlates of both accentuations must be different. One, but not the only possibility is that the second word shows stronger acoustic stress correlates than the first. In general, recognising stress is made easier when the accentuation of two immediately adjacent words is avoided. Ideally, a speaker will accentuate as few words as possible, and especially as few adjacent words as possible.
17 The dimensions of fundamental frequency, intensity and duration are acoustic dimensions; the corresponding prosodic, psycho-acoustic dimensions are pitch, loudness and length. 18 Consequently, metrical stress theories describe stress as a relative feature. (Cf. Kager (1995).)
Optimal Accentuation
23
6 Discussion and summary I have made the hypothesis of optimal accentuation precise and have substantiated its premises. I will now turn to four objections to my argumentation, and then summarise the results. Objection 1: the first objection is directed at the adaption of the shannonian communication model. Contrary to what the model suggests, it is sometimes argued that signal decoding (recognition) and interpretation cannot be separated. Recognition is argued to be guided by expectation, and expectations only arise during interpretation. This undermines the premise that utterances can be completely interpreted even when recognition is imperfect. Expanding on the objection: the objection finds empirical support. First, under equal conditions, ciphers (or names for ciphers) are more likely to be recognised correctly than words in a well-formed sentence, which in turn are clearly recognised better than meaningless sequences of syllables (Gelfand (1998)). This can be explained if it is assumed that a recipient has certain expectations about the signal to be decoded. The more possible decodings he can exclude in advance, and the fewer the number of decodings that he has to decide between, the easier it is for him to recognise the signal. In the simplest task (cipher recognition), he has to decide between only ten different ciphers; in the most difficult task (recognition of meaningless syllable sequences) he cannot distinguish between predictable words, and must instead recognise the signal in its entirety. The latter task is clearly more difficult, and recognition is consequently more error-prone. In the task of intermediate difficulty (recognising words in a well-formed sentence) the predictability that aids recognition arises through interpretation; the interpretation therefore influences signal recognition. Secondly, examples such as (11-a) and (11-b) (from Lindblom (1990)) show that not only does it depend on interpretation if something is recognised, but also how it is recognised. The identical sequence of sounds (less’n) is decoded into different word sequences depending on the different question contexts. (11)
a. b.
What is your homework assignment? – Less’n five. recognised as Lesson five How many came? – Less’n five. recognised as Less than five
Recognition is therefore not necessarily a bottom-up process in which an incoming signal is searched for acoustic or articulatory invariants and in
24
Accentuation and Interpretation
which a speech signal is then identified based on the recovered invariants. Recognition can also be a top-down process, in which a signal is only searched for specific features that the hearer predicts will be present based on the utterance that he expects to hear. Recognition in this case serves the purpose of discriminating and selecting one of several predictable expressions. (The H&H theory is also based on this insight.) Reply: recognition is indeed guided by expectation, and expectations depend on the interpretations of previous utterances, possibly even on interpretation that looks ahead. Nonetheless, recognition and interpretation can be modelled distinctly and separately. First: Cherry (1978) reports that test persons in shadowing experiments, in which they have to repeat a text as it is being read to them, have no problems with this task. The test persons however speak monotonously and do not put any emotion into the text. Moreover, they can hardly answer questions about the text after the experiment. A correct decoding of the signal, according to Cherry, is a precondition for shadowing it. The test persons decoded the signal, but they did not pre-construct the text to be expected, and could therefore not put any emotion into their speech. They furthermore did not store any information from the text and for that reason could not answer any questions. Cherry assumes that the reason for the test persons not being able to pre-construct the text and store any information from it was that they did not interpret the text. The test persons appeared to be able to decode (recognise) signals without interpreting them; recognition and interpretation must therefore be modelled distinctly and separately. Secondly: even if one adopts the assumption that recognition can be guided by expectation and may in first instance be a discrimination task, one can still consider the distinction between recognition and interpretation useful. (a) A recipient can recognise and interpret a signal even when he has no expectations about it. (b) In any case, even if the recipient has expectations about the signal to be recognised, recognition still (co-)depends on the incoming signal. Note that even if recognition is influenced by expectations, it can still happen that the recipient recognises the signal incorrectly or incompletely. I just state that defective recognition can be compensated for during the subsequent interpretation process. Objection 2: the second objection is directed against the thesis that imperfections in the signal recognition must be taken into account, and that such imperfections can be compensated for during interpretation. Recognition is a discrimination task; a word can already be recognised on the basis of only a few cues in the speech signal. If the expectations with regard to a word to be recognised are extremely high and if therefore the set of words to be distinguished is relatively small, then a few cues will prob-
Optimal Accentuation
25
ably suffice to identify the word that the speaker utters. In this case, the speaker can articulate imprecisely (hypospeech) and still assume that the signal will be recognised correctly. If, on the other hand, the set of words that can be expected is large, the recipient must rely on more cues in order to recognise the word that is meant. The speaker supplies these cues by articulating precisely (hyperspeech) and accentuating the word. In the optimal case then, those words will be accentuated for which the recipient only has weak expectations, and for which it cannot be assumed that they will be recognised correctly when they are articulated without much effort. Expanding on the objection: according to the hypothesis of optimal accentuation, a speaker adjusts himself to what the recipient requires for the reconstruction a message. Objection 2 claims that he adjusts to what is required for the recognition of a linguistic expression.19 It is not impossible that both hypotheses – the hypothesis of optimal accentuation and the competing hypothesis of objection 2 – lead to the same predictions with respect to accentuation. According to Greenberg (1999), words with a high information content are accentuated. Words with high information content must be recognised in order to understand the utterance, and are moreover difficult to anticipate. Both hypotheses therefore predict that they will be accentuated. The difference between the two hypotheses lies in how channel disturbances and fluctuations in attention are assessed: nonaccentuated words have only few discriminative features and are therefore easily rendered unrecognisable by channel disturbances. The competing hypothesis requires that the speaker make sure that these words be recognisable as well, and that the recipient direct his attention to them. Reply: the articulatory and recognition effort that must be made according to objection 2 is exaggerated. Non-accentuated words do not need to be recognised in order for interpretation to be successful; they often are not recognised. It is obvious that recipients can under certain circumstances completely reconstruct sentences that have not been uttered completely. Nothing speaks against the assumption that they can equally reconstruct sentences that have been uttered completely but were recognised only in19 In order to prevent a possible misunderstanding: according to the hypothesis of objection 2, speakers adjust themselves to what the recipient requires for recognising a linguistic expression. The requirements for recognition are related to the relative predictability of the utterance. In contrast, according to the model developed in Chapters 3 and 4, the requirements for the reconstruction of a message (which is what in my hypothesis the speaker adjusts himself to) are not related to the relative predictability of the message. The recipient reconstructs a message in such a way that it fulfils several pragmatic constraints (to the extent that such a reconstruction is possible). The set of adequate reconstructions may be limited, but all adequate reconstructions are equally “predictable”.
26
Accentuation and Interpretation
completely. If recipients do not need to recognise utterances completely, why would one make the strong assumption that they consistently do? Objection 3: stress is realised on syllables. The acoustic correlates of stress are limited to one single syllable of the accentuated word (as a rule on the syllable that also carries word stress). Even the precise articulation that accompanies stress appears to affect primarily the stressed syllable rather than the entire word (Greenberg et al. (2001)). How can stress on syllables serve to highlight entire words and improve their intelligibility? Reply: in the case of words that consist of one syllable, syllable stress obviously highlights the entire word. Monosyllabic words form a large part of the actively used vocabulary. Even if only about a fourth of all English words are monosyllabic, according to Greenberg (1996) more than 85 per cent of the words used in speech communication consist of only one syllable. Furthermore, recognising the syllable that is known to carry word stress also facilitates the recognition of multisyllabic words. When the stressed syllable of a multi-syllabic word is recognised correctly, all expectable words whose stressed syllable differs from the recognised syllable can be excluded. It can also plausibly be assumed that a syllable stress directs the attention of the recipient beyond the syllable to the meaningful unit – here the word – which should increase the probability of a correct recognition. Objection 4: the four premises on which the hypothesis of optimal accentuation is based may very well be correct. The hypothesis of optimal accentuation, however, is not entailed by them. The possibility exists that natural languages are not optimal in the sense intended here. Stress may be used to enhance the recognisability of words, it may however also be used for completely different purposes, e.g. to mark focus. Reply: the objection is correct. A theory of optimal accentuation is required to account for so-called “focus phenomena” as epiphenomena of optimal accentuation. If this can be done successfully, the theoretical term “focus” is no longer needed to account for these phenomena. There is then no longer any ground to assume that natural languages are not optimal in the sense intended here. Accounting for focus phenomena is the topic of Chapter 5. Summarising: ideally, a speaker accentuates in an utterance the minimal set of i-critical words, and as few adjacent words as possible. When a sequence of adjacent words is accentuated, the stress on each word must be varied and possibly raised in order to emphasise each word. In order to determine which words in a sentence must be accentuated in a given discourse context, a model of (active) interpretation of incompletely recog-
Optimal Accentuation
27
nised sentences is required. I define such a model in the two following Chapters 3 and 4 to the extend that interesting predictions on the relation of accentuation and interpretation can be tested.
3 Cooperative Information Exchange According to the communication model described in the previous chapter, a message is encoded into a signal by the speaker and transmitted to the recipient. The recipient recognises (decodes) the signal. The recognition may be imperfect; nonetheless, the recipient may still be able to reconstruct and interpret the complete message. A model of reconstruction and interpretation of incompletely recognised messages must provide, first, operations for reconstruction and, secondly, criteria for the adequacy of reconstructions. Let us assume that a recipient has recognised a declarative sentence directed at him only incompletely. He can interpret the words that he recognised, but thereby alone he does not comprehend the meaning of the entire sentence. In order to understand the meaning of the entire sentence, he needs to complete the words that he recognised to an expression that denotes a proposition, that is, he must semantically enrich the recognised words. In order to be able to accomplish this task, he needs appropriate operations. If he has several such operations at his disposal, he may arrive at more than one reconstruction of the incompletely recognised sentence. It is possible that not every reconstruction has the meaning that the speaker intended. The recipient must determine the reconstruction with the intended meaning, i.e. the correct reconstruction. For this, he requires criteria of adequacy with which he can distinguish correct from incorrect reconstructions. (1)
a. b.
Why are you late? – noise BICYCLE noise FLAT. Why is Peter late? – noise BICYCLE noise FLAT.
The incompletely recognised answer of example (1-a) is to be interpreted in the sense of “I (the speaker) am late because my bicycle had a flat tyre”, while the answer of example (1-b) is to be interpreted in the sense of “Peter is late because his bicycle had a flat tyre”. To derive these interpretations, the recipient has to interpret the words “bicycle” and “flat”. Moreover, he has to perform operations of semantic enrichment in order to expand the meanings of these words to the propositions that the speaker’s bike had a flat tyre or that Peter’s bike had a flat tyre, respectively. Finally, he has to apply criteria of adequacy for choosing the correct reconstruction.
28
Cooperative Information Exchange
29
One might argue, that the recipient would not need to apply criteria of adequacy if his operations of semantic enrichment were dynamically restricted: in the question of example (1-a) the speaker, but not Peter is explicitly mentioned; therefore, the answer can only be so reconstructed that it means that the speaker’s bicycle has a flat tyre. On the contrary, in the question of example (1-b) Peter, but not the speaker is mentioned; therefore, the answer can only be so reconstructed that it means that Peter’s bicycle has a flat tyre. Operations of semantic enrichment refer to the utterance context; distinct contexts enable distinct reconstructions. In each context only one message can be reconstructed; there is no need for the application of adequacy criteria. I agree that semantic enrichment must be contextually restricted. However, an account that completely relies on contextual restrictions of enrichment operations will presumably have problems with examples like the following: (2)
Why do you believe that Peter will be late? – noise SAW noise BICYCLE noise FLAT.
The answer to example (2) is to be reconstructed in the sense of “I saw that Peter’s bicycle had a flat tyre”. Both the person answering and Peter are mentioned in the question, and for the reconstruction of the answer the recipient has to refer to both persons. It seems to be hard to define a plausible and robust set of semantic enrichment operations that only allows for the correct reconstruction but not for the incorrect reconstructions in the senses of “I saw that my bicycle had a flat tyre”, “Peter saw that my bicycle had a flat tyre”, and “Peter saw that Peter’s bicycle had a flat tyre”. Even if the definition of such operations was feasible, it seems to be at least much simpler to allow for the different reconstructions and then apply criteria of adequacy in order to filter out the inadequate reconstructions. Pragmatic criteria of adequacy are needed in any case. Cooperative information exchange requires that certain conversational maxims are fulfilled. Messages of cooperative speakers must fulfil these maxims and recipients will try to reconstruct messages directed at them in such a way that they fulfil the maxims. Conversational maxims determine what a cooperative speaker may mean in a given discourse context, and how his utterance is to be understood. It is up to the speaker to express himself in such a way that he can expect his message to be adequately reconstructed, and thus that he is properly understood. In the present chapter, a formal model of cooperative information exchange is developed and defined, in which criteria for the adequate utterance of
30
Accentuation and Interpretation
complete sentences and for their reconstruction and interpretation are specified. For the time being, the model is only defined for a formal language. In Chapter 4, I extend the model to include a description of naturallanguage communication, and in the course of doing so I define operations for the semantic enrichment of incompletely recognised sentences. First, in section 1, I discuss how criteria of adequacy can be specified according to the Gricean conversational maxims. Following Stalnaker, I describe in section 2 cooperative information exchange as a successive expansion of the common ground. I make this description precise and define a classical update system (oriented on Veltman) for the modification of the common ground. In line with this update system I can define initial criteria of adequacy for cooperative information exchange; however, I do not yet possess a criterion for the thematic relevance of utterances. Next, in section 3, I specify topics of information exchange as questions under discussion (following van Kuppevelt and Ginzburg, among others). A discourse is supposed to answer questions under discussion; relevant is that which serves the purpose of answering such questions. Orienting myself on a proposal by Groenendijk I extend the update system and supplement the criteria of adequacy. Lastly, in section 4, I return to Grice, revise the conversational maxims that he postulates and determine their role in cooperative information exchange. Before I begin, I have to define some important terms: the (intensional) meaning of a declarative sentence is a proposition; by uttering a declarative sentence a proposition is asserted; utterances and assertions are speech acts. The (intensional) meaning of an interrogative sentence is a question – in the sense of Groenendijk and Stokhof (1984): a propositional concept – by uttering an interrogative sentence a question is asked; asking a question is a speech act. That is, I distinguish between linguistic objects (declarative sentences and interrogative sentences), semantic objects (propositions and questions) and speech acts (utterance, uttering; assertion, asserting; the asking of a question, asking). The model that is to be developed here involves only two kinds of sentences, namely declarative and interrogative sentences, and their respective semantic meanings, namely propositions and questions. An answer is a proposition that answers a question when it is asserted. More precisely, one could define answers as relations between propositions and questions. An answer in the first sense is then a proposition that stands in the answering relation to a question. An answer sentence is a declarative sentence whose meaning is an answer. Somewhat loosely an answer sentence, its meaning and the speech act of answering (i.e. uttering the
Cooperative Information Exchange
31
answer sentence) can be called an “answer” – as long as there is no risk of misunderstanding. I will use the term in this loose sense for all three objects, but it should be noted that this is not meant to imply that the distinction between the linguistic object (declarative sentence), the semantic object (proposition) and the speech act is denied. Lastly, a message is send by the utterance of a sequence of words that can be a sentence. I take a message to be the fully specified semantic representation of a sentence, denoting a proposition or a question. Sometimes – when there is no risk of misunderstanding – I do also call a speech act and an uttered expression an “utterance”.
1 Conversational maxims and active interpretation According to Grice (1967), cooperative discourse participants observe four conversational maxims: first, they observe the maxim of quantity, and express themselves just as informative as the joint communicative goal requires. On the one hand, they do not give superfluous information, and on the other they do not withhold required information. Secondly, they observe the maxim of quality, and do not utter anything that they do not believe themselves or for which they lack evidence. A speaker who knows what he is talking about utters only true sentences. Thirdly, speakers observe the maxim of relation, and convey not just any message but only messages that are relevant given the joint communicative goal. What they say is interesting in the given situation. Fourthly and lastly, they observe the maxim of manner, and speak clearly and intelligibly, in a language that the recipients understand. They make the reception of their utterances easy, do not mislead the recipients and avoid ambiguities that cannot be solved directly. The conversational maxims of quantity, quality and relation apply to the meaning of an utterance; they determine what should be said. The maxim of manner applies primarily to the form of an utterance; it determines how something is to be said. I call the maxims of quantity, quality and relation the semantic maxims; the maxim of manner is the syntactic maxim. Grice (1967) argues that cooperative discourse participants are expected to observe the conversational maxims. Individual violations of the maxims are not inconsistent with this assumption. It must be specified precisely to what extent a speaker can violate the maxims without acting uncooperatively. Can the role of the maxims be enhanced and specified in such a way that cooperative information exchange only takes place when the conversational maxims are observed continuously?
32
Accentuation and Interpretation
If the Gricean conversational maxims are adopted unchanged, this question must be answered in the negative. According to Grice, cooperative information exchange can also take place when the conversational maxims are not observed continuously. There are two reasons for that, which may possibly be eliminated, however: First, a conflict may arise between the maxim of quantity and the maxim of quality. It is possible that a speaker does not possess the information necessary to answer a question completely. The speaker must therefore violate either the maxim of quantity, in that he does not give all the desired information, or he must violate the maxim of quality, in that he says something for which he has no evidence. Under these conditions the speaker cannot observe all maxims; nonetheless he can communicate cooperatively. This problem is eliminated easily. The maxims of quantity, quality and relation can be combined to a super-maxim. This super-maxim requires from the speaker that he conveys exactly the relevant information that he can justly assume to be true. He should not say anything that is not relevant or of which he is not sufficently certain. The separate sub-maxims can no longer come into conflict when they are subsumed under the supermaxim; according to the super-maxim, cooperativity does not break down when the single maxims cannot be fulfilled together.1 In order for communication not to fail as a result of a lack of linguistic competence, also the fourth maxim should be changed, so that a speaker is not required to express himself more clearly and intelligibly than he is able to. Non-native speakers, children and other people who do not have perfect competence in a language can communicate cooperatively; they are not expected to use the language in the same way that a completely competent speaker would. The changes make sure that in principle any speaker can obey the maxims at any time. The changes are minimal, and retain the fundamental idea behind the maxims. The second reason why violations of conversational maxims are consistent with cooperativity lies, according to Grice, in the fact that a violation can suggest an implicature. A violation is therefore not only consistent with cooperativity; it can actually serve cooperative information exchange. Grice gives the following examples: 1. In testimonials, there are generally no explicitly negative evaluations. 1 Another argument for subsuming the three maxims into a super-maxim is that the maxim of relation already appears to be included in the maxim of quantity. The maxim of relation requires that a speaker expresses himself only regarding the common communicative goal. The maxim of quantity requires that he only expresses himself regarding the common communicative goal and furthermore that he gives all desired information. (Cf. also Martinich (1980).)
Cooperative Information Exchange
33
A negative evaluation is expressed by omitting more important, positive evaluations. Grice interprets this omission as an obvious violation against the maxim of quantity; the violation allegedly implies a negative evaluation. 2. Someone is asked whether something is the case, he gives an answer and then explains in detail how he came to his conviction. Grice takes the detailed justification of the answer as an infringement of the maxim of quantity.2 The infringement implies that the answer is to a certain extent controversial and cannot be believed with certainty. 3. If a speaker expresses himself, for example, ironically or metaphorically and if it is immediately clear that what he says is literally understood to be incorrect, the recipient will, according to Grice, recognise a violation of the maxim of quality and infer the intended, non–literal but true meaning. 4. A speaker who violates the maxim of manner expresses himself not as clearly as he would be able to. As examples, Grice gives ambiguous, poetic texts and other utterances that could apparently have been shorter and more concise. The example for the last case is the utterance “Miss X produced a series of sounds that corresponded closely with the score of Home Sweet Home”, instead of the more concise “Miss X sang Home Sweet Home”. The recipient allegedly recognises a breach of the maxim of manner and concludes that Miss X’s performance satisfied the prevailing notions of singing only marginally. The interpretation of the examples requires an answer to the question how the recipient arrives at the intended implicatures after he has noticed the violations. Grice does not define rules for drawing the implicatures. I propose – in opposition to Grice – that the implicatures do not depend on violations of the conversational maxims, but on the assumption that the maxims are observed. A recipient has some leeway when he interprets an utterance. He can reconstruct several messages based on the utterance, and moreover he can accommodate his representation of the utterance context. The recipient has to use the leeway he has in interpreting an utterance in order to reconcile it with the conversational maxims. Implicatures are assumptions that a recipient must make so that the utterance directed at him 2 Alternatively, the excessively long justification may be seen as an infringement of the maxim of relation.
34
Accentuation and Interpretation
satisfies the maxims. The maxims become constraints of interpretation.3 – How can Grice’s examples then be explained? 1. Testimonials generally do not contain negative evaluations. The person who writes the testimonial, however, is supposed to give an exhaustive answer to the question which positive characteristics the applicant has that are relevant for the job. If only a few positive characteristics are mentioned, and if the reader is to assume that the maxim of quantity is observed, i.e. that all relevant positive characteristics have been mentioned, he will interpret the testimonial as a negative evaluation. The implicature does not stem from a violation of the maxim of quantity but, quite on the contrary, from the assumption that it is observed. 2. If a speaker justifies a decision in detail, he believes the justification to be relevant in the given context. If a justification is relevant, i.e. if its validity is under discussion, it is not self-evident, but rather to a certain extent controversial. The recipient can accommodate his representation of the utterance context accordingly. He infers from the justification of the conviction and the presupposition that this justification is relevant that the conviction is controversial. 3. The recipient of an ironic utterance arrives at a non-literal reconstruction, because the literal reconstruction cannot be reconciled with the 3 (A) The pipeline model of syntactic–semantic speech processing comprises three sequential modules for speech processing: the first module receives as input a linguistic utterance and yields as output its syntactic analysis. The syntactic analysis is the input for a reconstruction module that produces a semantic representation. This representation is then the input for the last, pragmatic module, in which implicatures can be drawn on the basis of the semantic representation and the utterance context. According to the pipeline model, the (semantic, i.e. truth-conditional) meaning of an utterance is determined on linguistic principles; on the basis of the linguistically determined meaning, pragmatically motivated conclusions can be drawn. I deviate from the pipeline model and lean toward radical pragmatics – cf. Breheny (2002) – according to which non-linguistic, pragmatic moments already occur during the reconstruction process, which is still purely semantic according to the pipeline model. The content of an utterance is linguistically underspecified, and is enhanced by pragmatic principles already during reconstruction; in this way, the semantic and pragmatic analysis modules are combined. – (B) Criteria of adequacy are needed for filtering out inadequate reconstructions of messages. If such criteria are defined as constraints – in the manner proposed here – all reconstructions that do not satisfy the constraints are filtered out. Alternatively, criteria of adequacy can be specified as pure maxims, which must be obeyed as well as possible. Several reconstructions can then be compared on the extent to which they fulfil the maxims; the reconstruction that fulfils them the best must be the one the speaker intended. According to this definition, criteria of adequacy serve to favour one reconstruction over others, rather than to assess reconstructions independently of the alternatives, and to filter them out if necessary. (Cf. Blutner (2000) and van Rooy (2003a).)
Cooperative Information Exchange
35
maxim of quality.4 Again the implicature results from the presupposition that the maxims are observed. 4. Suppose that a speaker has attended a performance of Miss X; it may be assumed that he can describe the performance precisely. Now, the speaker does not say that Miss X has sung Home Sweet Home; rather, he declares that she produces a series of acoustic events that are reminiscent of this song. If Miss X’s rendition can justly be described as singing, the speaker should describe it as such if he is to fulfil the conversational maxims. He does not describe the rendition as singing, which means that it cannot straightforwardly be described as such. The utterance of the speaker is only adequate if Miss X’s performance satisfied the prevailing notions of singing only marginally. What Grice considers an implicature on the basis of a violation of the maxims, can be explained with the presupposition that the maxims are obeyed. (The Gricean interpretation of poetry need not be considered plausible. An ambiguous, poetic text cannot be substituted with a clearer, less ambiguous text. The maxim of manner requires that a speaker expresses his message as clearly as possible. A poetic message is as clear as possible; its disambiguated substitute would not be equivalent. If it is at all possible to assume that poetry is a form of information exchange – one would need a concept of aesthetic information for this – it does not violate the conversational maxims.) To derive implicatures in Grice’s examples it is not necessary to assume violations of the conversational maxims; on the contrary, one can presuppose that they are fulfilled. The objection immanent in Grice against the thesis that cooperative information exchange takes place only when the maxims are observed is at least weakened. I suspect that cooperative communication can always be described as being in accordance with the conversational maxims. Furthermore, fulfilment of the maxims appears to guarantee cooperative behaviour: if a speaker always expresses himself as it is required, does not withhold information, never makes superfluous, irrelevant remarks, only says what he can rightly believe to be true, and if each of his utterances is clearly intelligible, then he behaves cooperatively in an exemplary manner. 4 If the literal reconstruction cannot be reconciled with the maxims and further stylistic parameters of the utterance (intonation, accompanying facial gestures, etc.) point to irony, the recipient may be able to simply negate the literal meaning and see if the result of this operation fulfils the conversational maxims. That would give a very simple operation for the derivation of a non-literal reconstruction. (Cf. also Bach and Harnish (1979).)
36
Accentuation and Interpretation
In order to behave uncooperatively, it appears to be necessary to break at least one maxim. One can therefore tentatively link the term cooperative information exchange to the fulfilment of the conversational maxims. (I am careful here and use the word “tentatively” because until now only the validity of the maxims for assertions has been discussed – not for asking questions – and moreover because the term “relevance” is only vaguely specified.) The conversational maxims should serve to describe realistic situations of information exchange. I presuppose that discourse participants generally (although not always) behave cooperatively. If this were not the case, the conversational maxims would miss their intended point. I cannot prove the presupposition of cooperativity,5 but I can make it plausible with the following example: (3)
What time is it? – It is 18:03, but my watch is three minutes fast.
¨ Suppose that we asked Benedikt Lowe (who provided the example) for the time. He looks at his watch and answers that it is three minutes past six (18:03), but his watch is three minutes fast. We understand that his watch shows 18:03, but that the actual time is 18:00.6 Why? On the face of it, two interpretations are possible: first, our interpretation, according to which it is actually 18:00 – or more precisely, Benedikt thinks it is 18:00 – while his watch shows 18:03, and, secondly, an alternative interpretation according to which it is 18:03, while Benedikt’s watch shows 18:06. We want to know the time and are not interested in any technical details about the watch. In the alternative interpretation (18:03), we learn what Benedikt takes to be the time, and on top of that we learn an uninteresting detail about his watch. The information about the state of the watch does not contribute to answering our question. If we interpret Benedikt’s utterance in this alternative manner, we assume that he violates the maxim of quantity or of relation. 5 Merin (1999) forgoes the presupposition that information exchange is cooperative. He describes discourses with the purpose of gaining information as zero sum games. His description makes different presuppositions from mine, but neither description strictly refutes the other. 6 An experiment with 33 undergraduate students (22 native speakers of German and 11 non-native speakers) showed that this is the generally preferred interpretation. The students were shown the example in German (Wie sp¨at ist es? – Es ist 18:03, meine Uhr geht aber drei Minuten vor.); no interpretation was given to them. The students wrote down their answers on a piece of paper. 31 students (94%) wrote that it was 18:00. Only two students (6%) wrote that it was 18:03. The students that had a deviant interpretation were native speakers of German.
Cooperative Information Exchange
37
The first interpretation (18:00), on the other hand, conforms to the conversational maxims. We learn that Benedikt’s watch shows 18:03, and furthermore that it is three minutes fast, so that there is strong evidence that it is 18:00. We learn what Benedikt takes to be the time and how he came to his conclusion. One could now argue that we did not ask how he came to his conclusion, and that the additional information constitutes a breach of the maxim of quantity or of relation. That is not entirely correct, however. We wanted to know the time. With the answer we know that Benedikt measures time in minutes, that he is well-informed about the state of his watch, and that therefore his indication of time is most likely correct and accurate. The information about the watch serves our interest, because we can conclude that we have received a reliable and precise indication of the time.7 The deciding point of the example is that the recipient only arrives at the correct interpretation because he presupposes that the utterance observes the conversational maxims. Observing the maxims is a cooperative task. It presupposes an active recipient who uses the interpretation leeway in order to reconstruct the message directed at him in such a way that it fulfils the conversational maxims. Not all utterances can be reconstructed in that way. Suppose someone asks which day of the week it is, and Benedikt replies that it is Wednesday, but that his calendar is from 2002. The reply is not a satisfactory answer; the recipient cannot be expected to conclude with certainty which day of the week it is today.8 It is up to the speaker 7 Do we really care whether it is 18:03 or 18:00? Do we choose a reading, or do we just take the time to be about 18:00/18:03? – That depends on our interests. A train to Cologne departs from Bonn Main Station at 18:01. If we want to take that train, it is important for us to know whether it is 18:00 or 18:03. 8 This also was the subject of an experiment with 33 undergraduate students (22 native speakers of German, and 11 non-native speakers). The test persons had to imagine that it was January 2004 (which was actually the case) and that they did not know the day of the week. (The experiment took place on a Thursday.) They now asked someone what day it was, and received the reply “Es ist Mittwoch, aber mein Kalender ist von 2002.” (It is Wednesday, but my calendar is from 2002.) The test persons had to interpret the answer, and determine which day of the week the speaker meant. Less than half, namely 14 test persons (42.4%), chose Wednesday, 9 test persons (27.3%) chose Friday, 7 test persons (21.2%) could not deduce any information about the day of the week, and the other three persons mentioned some other day. No significant distinction was recognisable between native and non-native speakers. With such disparate interpretations, a speaker cannot assume that his answer is understood correctly. – How can the different interpretations of the test persons be explained? – Those that guessed Wednesday simply ignored the information about the calendar. Those that guessed Friday assumed that the day in 2002 that corresponded to the actual date was a Wednesday. Two years later the day must therefore be two days later than a Wednesday, that is, a Friday. (2 years = 365 x 2 days; 1 week = 7 days; (365 x 2) modulo 7 = 2; two days must therefore be added. Example: 9 Jan-
38
Accentuation and Interpretation
to express himself in such a way that he can expect his utterances to be understood correctly. That is, the conversational maxims regulate both the utterances of the speaker and the reconstruction and interpretation effort of the recipient.9 The Gricean conversational maxims apply primarily to the uttering of declarative sentences. However, in a cooperative information exchange, questions are asked as well. An explicitly asked question can determine the goal of the information exchange and in that way establish which subsequent utterances are of interest and which are not. Asking questions is also subject to rules. Obviously, interrogative sentences must be formulated clearly, and when uttering them one must observe the maxim of manner. To what extent uttering interrogative sentences can be regulated by the semantic maxims is not clear, however. The semantic maxims require from the speaker that he know what he is talking about and that he provides the recipient with new knowledge, among other things. A speaker cannot know the question that he asks in the same way that he can know a proposition, and by asking a question he cannot inform the recipient in the same way as by asserting a proposition (except on the basis of possible presuppositions, that may be ignored here). Therefore additional rules for uttering interrogative sentences must be determined. With the aid of the conversational maxims, I want to determine what cooperative information exchange is, and set up a theoretical framework for the description of language games for information exchange. To this end, I must formalise the maxims precisely and enhance them in such a way that they also apply to questions.
uary 2002 was a Wednesday, so 9 January 2004 was a Friday.) The other test persons may simply have guessed, or made an error in the derivation of the actual day of the week from the day of the week in 2002 (e.g. chose two days before Wednesday, instead of two days after). 9 Do the maxims also regulate non-cooperative behaviour in information exchange? – Suppose that someone has committed a crime and is suspected. It cannot be presupposed that he will behave cooperatively all the time during police interrogation and always observes the conversational maxims. It is in his own interest to hide certain things and possibly to lie. As a rule, however, a lie has no value when it is not believed. (Let us keep it simple. Of course, it might be that he follows a strategy according to which he has to appear as a liar.) Therefore it is in the suspect’s interest to give the impression that he is not lying, but communicating cooperatively and fulfilling the conversational maxims. In this way, the behaviour of an intelligent liar can be described as a tactical fulfilling and non-fulfilling of the conversational maxims. He depends on the validity of the maxims in order to be able to break them effectively.
Cooperative Information Exchange
39
2 The common ground In this section, I define a simple model for information exchange by assertions. In the first subsection (2.1), I explain that assertions are informative if they expand the common ground of the discourse participants. I derive two requirements from the conversational maxims that pertain to the relation between declarative sentences to the common ground. In the second subsection (2.2), I define a simple, classical update system for information states and declarative sentences. With this system, I describe how assertions can expand the common ground, and I specify the two requirements that were derived from the conversational maxims. 2.1
Presupposing a common ground
According to Stalnaker (1978), discourse participants presuppose a common ground for their discourse, that they can refer to at any time and that they expand successively in the course of the discourse. The presupposed common ground is changed by events that all discourse participants can evidently observe. For example, if an explosion takes place, all participants can assume that this explosion was observed by all, and if the participants watch a television programme together, they can assume that all of them follow the programme and that the information in it becomes part of the common ground. After the discourse participants have watched the weather forecast, they do not need to tell each other what the weather will be like; they already know this from television. For the present purpose, the most interesting events are those whereby one discourse participant asserts something and the others accept what is being said; as a result of such an event, the asserted proposition becomes part of the common ground. How do assertions change knowledge and a common ground in particular? – Let us assume two persons engaged in a discourse to inform themselves about a certain topic. Both possess certain knowledge about the topic, possibly they possess different knowledge. They exchange their knowledge, in order to turn their private knowledge into mutual knowledge. One of the discourse participants asserts the proposition p. Since he observes the conversational maxims, he himself believes that p. Let us assume that the speaker is well-informed and critical, so that the recipient can trust him and can assume that the speaker knows what he is talking about. The recipient therefore learns, first, that p and, secondly that the speaker knows that p. Using the notation of Fagin et al. (1995), I write Ki ( p) for “the discourse participant i knows that p”. The index s stands for the speaker and the index r for the recipient. Now it holds that Ks ( p) ∧ Kr ( p) ∧ Kr (Ks ( p)). The recipient does not contradict the speaker
40
Accentuation and Interpretation
and the speaker can therefore be certain of the effect of his assertion. He knows how the recipient has updated his knowledge: Ks ( p) ∧ Kr ( p) ∧ Kr (Ks ( p)) ∧ Ks (Kr ( p)) ∧ Ks (Kr (Ks ( p))). The consequence of not contradicting the assertion can be calculated by the recipient; he knows which knowledge the speaker acquires: . . . ∧ Kr (Ks (Kr ( p))) ∧ Kr (Ks (Kr (Ks ( p)))). And so on: The speaker knows which knowledge the recipient acquires and the recipient knows in turn which knowledge the speaker acquires, ad infinitum. The discourse participants (speaker and recipient) are aware of each other’s knowledge of p. After the proposition p has been asserted, and the assertion has been interpreted and accepted – i.e. not contradicted – by all discourse participants of a group G, p belongs to the mutual knowledge (that is, the common ground) CG of G. First, all discourse participants i ∈ G know that p: ∀i ∈ G [Ki p], which I abbreviate to EG ( p). Secondly, all discourse participants know that all discourse participants know that 0 p be equivalent with p all discourse participants know . . . that p. Let EG k +1 k k p, for and let EG p be an abbreviation for EG EG p. Then it holds that EG k = 1, 2, . . . (cf. Fagin et al. (1995)). The discourse participants do not need to calculate the mutual knowledge ad infinitum. When the need arises, however, they can derive knowledge about the knowledge of the others.10 An utterance is informative if it alters the presupposed mutual knowledge (the common ground) of the discourse participants. A credible assertion always gives information about the knowledge of the speaker. It can even be informative even if the recipient already knew its semantic content. It is therefore not necessary to require from the speaker that he only says things that the recipient did not know before. Each member of a group of discourse participants G makes assumptions about what belongs to the common ground of G. The assumptions can be distinct, and some discourse participants may be mistaken. When a proposition p belongs to the common ground CG , every member of G knows that p belongs to CG . That is, if any member of G does not know that p belongs to CG , then p does not belong to CG . Nonetheless some of the discourse participants may falsely assume that p is part of CG . They then presuppose more mutual knowledge than is actually available. This can, for example, cause them to express themselves less informatively than would be adequate. An incorrect assumption about the common ground can never lead to over-informative utterances. If a speaker believes that p does not belong to the common 10 Knowledge about the knowledge of others can be interesting and useful. A popular, albeit not very realistic example is the so-called Muddy Children Puzzle (cf. Fagin et al. (1995)). The role of knowledge about the knowledge of others in games such as the murder mystery game CLUEDO and various card games is described by Van Ditmarschen (2000).
Cooperative Information Exchange
41
ground, then p does not belong to the common ground. Asserting p is then informative, at least for the speaker himself. – Can a speaker inform himself? – Yes. Suppose that all discourse participants except the speaker himself believe that CG ( p). Their belief is false, because the speaker does not believe that CG ( p), so that ¬CG ( p) is the case. Now the speaker asserts that p. With this, the speaker alters his representation of the common ground; as a result of his utterance, he now also believes that CG ( p). He does not alter the information state of the other discourse participants, who do not believe anything that they did not believe before. Nonetheless he turns their mere believing into knowing; what they used to assume falsely, they now assume correctly and rightly. According to Stalnaker (1978), discourse participants have a certain degree of freedom in how much common ground they presuppose. They sometimes presuppose less than they tentatively could. This is especially the case with small talk, when the point is not to inform oneself effectively, but rather to simulate an informative conversation. Let me take stock. An assertion can alter the presupposed common ground of the discourse participants. According to the conversational maxims, assertions must be informative; discourse participants should not assert anything that they feel is already part of the common ground. Independent of how much mutual knowledge the discourse participants presuppose, on that ground it holds that they should not assert anything that has already been asserted or what follows from what has already been said.11 In addition, the discourse participants should only claim what they know themselves. Knowledge is consistent. As long as a speaker observes the conversational maxims, he cannot contradict himself. The recipient detects a violation of the conversational maxims when the speaker contradicts the common ground – and therefore contradicts himself – e.g. by uttering a logically false sentence or an inconsistent set of sentences. The minimal requirement is that the common ground should not be contradicted. Three objections must be discussed. Objection 1: assume two meteorologists engaged in small talk. One says that it is cold; the other adds it is also windy. Both only express things that both already know. Their utterances are therefore not informative and violate the conversational maxims. 11 I make the too strong assumption that discourse participants are logically omniscient: a declarative sentence is understood if its truth conditions are known. If its truth conditions are known, the truth conditions of all logical implications (not: pragmatic implicatures) are also known. This is certainly not very convincing. I will discuss the assumption of logical omniscience in section 4.
42
Accentuation and Interpretation
Nonetheless, their behaviour cannot be considered uncooperative; the requirement of informativity is therefore too strong. Reply: if two discourse participants want to exchange information effectively they should presuppose all the mutual knowledge that they can reasonably presuppose. That is, if the meteorologists presupposed as much mutual knowledge as possible, their utterances were truly uninformative. However, an effective search for information is not the purpose of small talk. The meteorologists need to presuppose only little mutual knowledge, so that even with the utterances mentioned they could change the common ground. Objection 2: according to the reply, discourse participants can presuppose as little mutual knowledge as they wish. In that way, even the most trivial utterances can be informative. This means that the informativity requirement is too weak. Reply: it is correct that the informativity requirement is weak; however, it is not too weak. The discourse participants are free to choose the common ground at the beginning of their discourse. During their discourse, the common ground is modified; these modifications cannot easily be undone. Once p has been asserted and accepted, it becomes part of the common ground. A repeated assertion of p, or of a proposition that is entailed by p, is superfluous and uninformative, and violates the maxim of quantity. (A meteorologist who constantly repeats that it is cold is not a pleasant person to talk to, and also not a cooperative discourse participant.) Not every possible utterance can be informative in the course of a conversation. Objection 3: even if not every possible utterance can be informative in the course of a conversation, then at least the one that introduces the discourse can. The first utterance cannot be redundant in view of a previously performed utterance. That too, however, is implausible; not every utterance is adequate for starting a discourse. Reply: informativity is not the only criterion of adequacy. If the speaker of the first utterance could decide freely what will be presupposed as common ground, almost any utterance that introduces a discourse can be informative. It is however not always practical to presuppose only little mutual knowledge, and not every utterance is relevant for a given communicative goal. The choice for a discourse-introducing utterance is not just regulated by the requirement of informativity, but rather by requirements of relevance and truth (or weaker, plausibility). 2.2
Updating the common ground (1)
I can now formulate the criteria for cooperative information exchange mentioned above in the framework of a simple, classical update system. Dis-
Cooperative Information Exchange
43
course participants presuppose a common ground; each discourse participant possesses a representation of this common ground. I describe cooperative information exchange as a process of continuous updating the representations of the common ground. To this end, a representation of the common ground is specified as an information state, and the updating of this information state is modelled with an update system. An update system determines how information states are altered by the uttering (and accepting) of sentences: Definition 3-1 (Update System) An update system is a structure U = L, Σ, [ ] that consists of a language L, a set of information states Σ and an update function [ ] : L × Σ → Σ, that changes each information state σ ∈ Σ into an information state σ ∈ Σ for each sentence φ ∈ L. Let us assume for now that discourse participants communicate through the type-logical language TL that is defined in the Appendix. In Chapter 4, I turn to the communication with natural-language expressions – building on the system construed here. I represent knowledge with information states. An information state is defined depending on a language – here: TL – and on a model M of this language. By uttering a sentence of TL the information expressed by this sentence is added to the information state. After updating the information state this information is known, i.e. the sentence is entailed by the updated information state. All “known”12 sentences are entailed by the information state. The intensional meaning of a sentence of TL with respect to M is a proposition, i.e. a set of indices (so called possible worlds). It seems natural to specify an information state as a proposition as well, or more precisely, as the intensional meaning of the conjunction of all known sentences: Definition 3-2 (Information State) Let M = D M , I M , [[ ]] M be a (reduced) model of TL:13 (a) σ M is an information state w.r.t. M and TL iff σ M ⊆ I M . 12 I allow the formulation that not only propositions (meanings of declarative sentences) but also declarative sentences themselves can be known. 13 M is a reduced model, because the real world-index i ∗ , in relation to which the truth value of a sentence is determined, is lacking. Cf. definition A-10 in the appendix. – Why is M a reduced and not a complete model? – The holder of an information state must possess a model of TL in order to interpret the sentences of TL and to update his information state. If he would possess a complete model, he would be able to assign a truth value to each sentence. He would therefore already possess all the information that sentences of TL can give; updating his information state would not provide any new information.
44
Accentuation and Interpretation
(b) – 0 = I M is the minimal information state, – 1 = ∅ is the absurd information state. (c) The set of all information states w.r.t M, Σ M , is the power set of I M : ρ( I M ). An information state with respect to an interpretation M of TL is a set of indices i ∈ I M , ergo a proposition. Let us assume that an index i∗ ∈ I M corresponds to reality. If i∗ is not an element of a given information state, the holder of the information state is misinformed; his information state contradicts reality. As long as i∗ is an element of the information state, however, the holder of the information state does not believe anything that is not actually the case; he is not misinformed, and his information state is compatible with reality. The holder of an information state cannot identify i∗ , i.e. he cannot decide which index of his information state corresponds to reality. The more he learns the more indices he excludes. If he were omniscient, his information state would contain exactly the index i∗ . Definition 3-3 (Propositional Knowledge) Let σ be an information state w.r.t. M and TL. The proposition expressed by a sentence φ ∈ TL is known in σ iff:
[[φ]]iM = true, for all i ∈ σ A sentence φ ∈ TL contradicts σ iff:
[[φ]]iM = false, for all i ∈ σ A sentence φ ∈ TL is compatible with σ iff:
[[φ]]iM = true, for at least one i ∈ σ
The proposition expressed by a sentence is already known by the holders of an information state if the sentence is true with respect to each index of the information state. A sentence that is false with respect to each index contradicts the information state. A sentence that is true with respect to at least one index of the information state is compatible with the information state. The minimal information state 0 describes the state of complete ignorance; only logically true sentences are known, and only logically false sentences contradict the information state. The absurd information state 1 describes contradictory knowledge: no sentence is compatible with the information state; at the same time, all propositions – even contradictions – are known. It is a state of confusion.
Cooperative Information Exchange
45
An information state is augmented when propositional knowledge that does not contradict the information state is added to it. Let us assume that an information state is updated with a sentence p. The information state is compatible with p, so that the update does not lead to the absurd information state 1. Some sentences, which so far had been compatible with the information state, now contradict it. One such sentence is for example ¬ p: after the update with p, the information state only contains indices according to which p is true and ¬ p is false. Definition 3-4 (Classical Update System) Let M be a (reduced) model of TL and Σ the set of all information states w.r.t. M and TL. UTL = TL, Σ, [ ]1M is a classical update system for TL iff for all sentences φ ∈ TL and all information states σ ∈ Σ: σ [φ]1M = {i ∈ σ | [[φ]]iM = true}
In what follows I will define two update functions [ ] M , which I will distinguish through subscript numbers. For every sentence φ ∈ TL, [φ]1M is a function Σ → Σ. [φ]1M (σ) is the information state σ that results from updating σ with φ. Instead of [φ]1M (σ), I also write σ [φ]1M . σ[φ]1M denotes the updating of the information state σ through the uttering of φ; σ [φ1 ]1M [φ2 ]1M . . . [φn ]1M denotes the successive updating of σ through the uttering of φ1 , φ2 , . . ., and φn in that order (Veltman (1996)). By uttering (and accepting) a sentence of TL the common ground is updated. This update is purely eliminative: all indices with respect to which the sentence is false are filtered out, and no elements are added to the information state. The update is furthermore distributive: each index is tested individually whether it must be filtered out or not. Systems of eliminative and distributive updates are classical update systems (Groenendijk and Stokhof (1991b)). The system defined here is therefore a classical update system. By updating an information state with a sentence that contradicts the information state all indices are filtered out; the update leads to the absurd information state. An update with a sentence that expresses a proposition that is already known does not result in any change; uttering this sentence is therefore uninformative. Sentences that express information that according to a given information state is already known, are entailed by that information state. A sentence entails a second sentence exactly then if every information state entails the second sentence after having been updated with the first:14 14
(A) In a valid argument the conclusion is a consequence of the premises. (contd)
46
Accentuation and Interpretation
Definition 3-5 (Entailment) Let UTL = TL, Σ, [ ]1M be a classical update system for TL and M. (a) An information state σ ∈ Σ entails a sentence φ of TL (the fact that φ, is a consequence of the facts that are known according to σ) σ |=UTL φ iff σ [φ]1M = σ (b) A sentence φ of TL entails a sentence ψ of TL (the fact that ψ is a consequence of the fact that φ) φ |=UTL ψ iff for all information states σ ∈ Σ: σ [φ]1M = σ[φ]1M [ψ]1M
It is now possible to define the two requirements for cooperative information exchange that were stated at the end of the previous subsection: first, Do not assert anything that is already part of the common ground! and, secondly, Do not assert anything that contradicts the common ground! Definition 3-6 (Semantic Conversational Maxims) Let UTL = TL, Σ, [ ]1M be a classical update system for TL and M. Let the information state CG ∈ Σ represent the common ground of a group of discourse participants G. The uttering of a sentence φ ∈ TL observes the conversational maxims only if the following holds: (a) φ is not already entailed by the common ground: CG [φ]1M = CG therefore CG |=UTL φ Through the notion of consequence the notion of validity for arguments is defined. For several, in the framework of a classical update system equivalent possibilities for defining the notion of validity, see Veltman (1996). – (B) The relation of consequence can also be defined for pairs of information states: An information state σ entails an information state σ (σ |= σ ) iff σ ⊆ σ .
Cooperative Information Exchange
47
(b) φ does not contradict the common ground: CG [ φ ] 1 = 1 therefore CG |=UTL ¬φ
If a recipient can derive several, non-equivalent interpretations of an utterance, he can use the maxims defined here as criteria to test the adequacy of the interpretations. To the extent that he presupposes that the speaker is cooperative, he must interpret his utterances in such a way that it conforms to the maxims. He can reject as inadequate any interpretation according to which the utterance does not satisfy the maxims. So far, information states have been defined as unstructured sets of indices that represent propositional knowledge. In a discourse, the common ground of propositional knowledge is expanded. So far, this knowledge is expanded randomly, not in a goal-oriented manner. For a goal-oriented information exchange, not only a common ground is needed, but also common communicative goals. Knowledge about such goals is called thematic knowledge. The relevance of utterances is determined in relation to the common thematic knowledge. Communicative goals can be set explicitly by asking questions, i.e. by uttering interrogative sentences. TL does not contain any interrogative sentences; that means that setting a communicative goal by asking a question is not possible at present. The update system must therefore be extended, so that communicative goals can be represented, goals can be set or changed by asking questions and the relevance of utterances can be determined in relation to the existing goals.
3 Goal-oriented information exchange In this section, I extend the model for information exchange with questions, and I add a relevance criterion to it. In the first subsection (3.1), I specify goals of cooperative information exchange as questions that are to be answered in the discourse. Several authors15 propose to interpret assertions always as answers to questions under discussion and to describe 15 Ginzburg (1996a), Ginzburg (1996b), Klein and Stutterheim (1987), van Kuppevelt (1994), van Kuppevelt (1995a), van Kuppevelt (1995b), Roberts (2001), Stutterheim (1994), Stutterheim (1997), Stutterheim and Klein (2002), Vennemann (1975), among others.
48
Accentuation and Interpretation
discourses as series of (possibly implicit) background questions and answers. According to this proposal an utterance is relevant if it contributes to answering a background question. I adopt the proposal. In the second subsection (3.2), I extend the language TL to a language QL that includes interrogative sentences. I follow Groenendijk and Stokhof (1984) in defining questions – that is, the meanings of interrogative sentences – as the sets of their exhaustive answers. A question can have more than one possible exhaustive answer, but only one of these answers is true. When a question is put up for discussion, the discourse goal is to add the true, exhaustive answer to the common ground. I use QL to extend the update system that was defined in section 2 above. I orient myself on Groenendijk (1999) and Groenendijk (2003) and modify the system in such a way that (a) information states can contain knowledge about discourse topics (questions under discussion) and that (b) topics can be raised by uttering interrogative sentences. I extend the formal specification of the conversational maxims with the relevance criterion that every utterance should contribute to answering a question that was put up for discussion. 3.1
Questions and answers
Discourse can be structured through series of background questions. A question is under discussion – that is, answering it is a discourse goal – first, if it is asked explicitly, secondly, if it follows from the broader context or was evoked by some event, or, thirdly, if it is presupposed by the speaker, without being explicitly asked or triggered by the context: 1. A question can be put up for discussion explicitly by the utterance of an interrogative sentence. A reply is relevant and satisfies the maxim of relation if it contributes to answering the question. If the question is answered, the communicative goal has been reached and the question is no longer under discussion. For the following examples (4) and (5) I presuppose that the respective questions must be answered exhaustively and cannot simply be removed from the agenda without being answered. By just ignoring a question under discussion and raising a new topic, one breaks the principle of cooperativity. (4)
A: Who has put what on the table? a. b. c. d.
B: Tom has put the plates on the table and Emma the glasses. B: Tom has put the plates on the table. B: # Emma later drove home. B: Tom has put the plates on the table and Emma the glasses, and Emma later drove home.
Cooperative Information Exchange
e.
49
B: # Tom has put the plates on the table, and Emma later drove home.
Example (4) shows different replies to an explicitly asked question. Let us assume that only plates and glasses are on the table. The first reply (a) answers the question exhaustively. The reply is cooperative, i.e. it fulfils the conversational maxims. The second reply (b) contributes to answering the question, but it does not answer the question exhaustively. If B knows that Emma put the glasses on the table, but does not mention this, she is not supplying all the information that she has, and therefore violates the conversational maxims. However, if she does not know who put the glasses on the table, she is giving the best possible answer she can give and observes the maxims. The third reply (c) does not contribute to answering the question. The utterance may be informative, but it is uninteresting with respect to the communicative goal that was set by the question; it does not satisfy the conversational maxims. The fourth reply (d) first answers the question exhaustively and then provides information that was not asked for. After A’s question has been answered, it is no longer under discussion, which means that B is no longer required to respond to it. She can set a new topic and provide information regarding the new topic, even if this information was not explicitly asked for. Things are different with the last reply (e). Here, the question under discussion is first answered only partially, and then information is provided that does not contribute to further answering the question. Because the question was not answered exhaustively, it is still under discussion. The speaker should conform to this; she does not do so, and therefore violates the conversational maxims. This will likely lead to a misunderstanding: in the best case, A observes that B digresses from the topic. In the worst case A assumes that B has answered the question exhaustively and then moved on to another topic. A now wrongly concludes that only plates are on the table, not glasses. The worst case seems to be the most likely one. Let me take stock. Uttering an interrogative sentence defines a communicative goal. Subsequent assertions should serve this goal by contributing to answering the question. Superfluous information, that does not contribute to answering the question, should not be provided. If the question was answered exhaustively, the conversational goal has been reached and the question is no longer under discussion. (The strong requirements that only information needed to answer a question may be
50
Accentuation and Interpretation
given and that a question is under discussion until it answered exhaustively will be reviewed in section 4.) (5)
A: Who put what on the table? – B: I’ll have to think about that. a. b. c. d. e. f.
C: What did Tom put on the table? C: Did Tom put the plates on the table? C: # What did Emma do later? C: ? Who put what on the table? C: ? Who put what where? C: Do you know which table he means?
After a question has been asked and before it is answered, further questions may be asked. In example (5) A asks a question, B requests some time to think, and C asks a follow-up question. The first two of C’s follow-up questions (a, b) are subordinate to A’s question. C’s questions set intermediate goals, structure the discourse, may help B on her way, and thus guide the answering of A’s superordinate question. Exhaustive answers to C’s questions are partial answers to A’s question and in that way contribute to answering it. Uttering either of the first two interrogative sentences seems to be adequate in the given context. Uttering the third interrogative sentence (c) interrupts the discourse and sets a new conversational goal. The question asked by C is independent of A’s original question; an answer does not contribute to answering the original question, so that in the given context, the question does not seem to be adequate. C can ask the fourth and fifth questions (d, e) as echo questions if he has not understood the interrogative sentence uttered by A sufficiently. Otherwise the questions are inadequate. Asking the same or an equivalent question again (d) does not make any contribution to answering the original question. The fifth question (e) is superordinate to the original question; instead of guiding the answering of the original question, it extends it and thus modifies the communicative goal. It, too, does not serve the purpose of answering the original question. C’s final question (f) checks if a precondition for answering A’s question is given. A refers to a specific table; if B does not know which table is meant, she has not fully understood the question. She can only answer the question if she knows which table is meant. C’s final utterance is adequate if it is doubtful whether B has understood the question. C encounters a possible misunderstanding, and in that way contributes to
Cooperative Information Exchange
51
answering the original question. If, on the other hand, there is no doubt that B knows which table is meant – e.g. because A visibly points to it – the answer to C’s question is entailed by the common ground. The question is then superfluous.16 Let me take stock. Background questions can regulate the uttering of declarative sentences as well as the uttering of interrogative sentences. As long as a background question is under discussion, every utterance should serve the purpose of answering this question. The answering of a question can be structured and guided by further questions. For this to be possible, the follow-up questions must be subordinate to the original question, i.e. answers to the follow-up questions must answer the original question at least partially. Lastly, it is possible to ask follow-up questions to check the preconditions for answering the original question. Uttering an interrogative sentence, that establishes a topic that is independent from the original question or goes beyond it, is inadequate. The same is true of interrogative sentences whose answer already follows from the common ground. 2. A question can also be under discussion if it was not asked explicitly (by the utterance of an interrogative sentence) but arises from the broader action context or is evoked by some contextual occurrence. In Chapter 2, I already made an example that shows that questions can arise from the action context (the football example: “Behind you!”). The answer in dialogue (6), which is adapted from Allen and Perrault (1980), is a further example: (6)
When is the next train to Cologne? – 18:01. Platform 1.
At an information booth at a train station, someone asks when the next train to Cologne departs. An employee answers the question by giving a time followed by a platform number. The mentioning of the platform is obviously interpreted as “The train departs from platform 1”, that is, as an answer to the question from which platform the train departs. This question arises from the action context: the person asking the question wishes to know the departure time of a certain train; this information is only relevant to him if he wants to take the train or take someone to it. In order for him to do this, he must also know from which platform the train departs. The rail employee can presuppose the corresponding question; his answer serves the superordinate goal of the person asking for information. 16
For the presupposition of questions by questions, see Carlson (1983).
Accentuation and Interpretation
52
(7)
Knocking on the door – Tom.
Questions can be evoked by contextual occurrences. In example (7) the discourse participants hear knocking on the door. Automatically the question arises who is knocking. The speaker can refer to this implicitly given question and answer it by merely mentioning a name. (8)
Someone is knocking on the door. – Tom.
The question who is knocking also arises after the first sentence of example (8) is uttered. According to van Kuppevelt17 the question is triggered by the use of the indefinite noun phrase “someone”. The use of an indefinite expression puts the question how this expression can be specified on the agenda. According to this idea, the use of “someone” regularly triggers a who? question. Let me take stock. A question can be asked explicitly. Alternatively, it can be evoked by superordinate action goals or by contextual occurrences. Regardless of how it was put up for discussion, the question regulates the adequacy and the interpretation of subsequent utterances. 3. Every utterance should be relevant. In at least some cases, the relevance or irrelevance of an utterance can be accounted for by reference to a background question. In various discourse models, this special case is generalised: relevance is always accounted for by reference to a background question, and every utterance is interpreted in reference to a background question. According to the quæstio model of Klein and Stutterheim18 every text has as its basis an overarching question – the quæstio. The quæstio is successively answered by the text, in that subordinate questions are answered one by one by subparts of the text, until all the information required to answer the quæstio has been provided. (Such a subpart of the text can consist of a complex sentence, a main and subclause or of a section of several sentences.) The rhetoric structure of a text is determined by the order of the subordinate questions; the order is restricted by several global organisational principles, but is otherwise relatively free. According to van Kuppevelt19 parts of a text can answer questions and possibly raise new ones. The ordering of a text arises from the question–answer relations between the subparts of the text. The question–answer relations correspond to rhetorical relations, 17 18 19
Cf. van Kuppevelt (1994), van Kuppevelt (1995a), and van Kuppevelt (1995b). Cf. Stutterheim (1994) and Stutterheim (1997). Cf. van Kuppevelt (1994), van Kuppevelt (1995a), and van Kuppevelt (1995b).
Cooperative Information Exchange
53
as described e.g. by the rhetorical structure theory (Mann and Thompson (1988)). Carlson, Ginzburg and Roberts20 state that questions determine the topics of dialogues – especially of dialogues for effective information exchange. Following Ginzburg a dialogue model must contain a list of questions under discussion (qud), according to which the utterances of the dialogue are interpreted and assessed.21 I adopt this generalisation: relevance is always accounted for by reference to a background question, and every utterance is interpreted in reference to a background question. With this generalisation, I obtain four independent variables: first, the contextually given background question, secondly, the mutual knowledge, thirdly, a linguistic expression (a reply to the background question) and, fourth, the relation of the utterance to the context (relevance or irrelevance/adequacy or inadequacy). Given mutual knowledge and a background question, it can be determined which utterances are relevant and which are not. If one does not have any knowledge of a background question, but assumes that a given utterance is relevant, i.e. contributes to answering a background question, he can reduce the set of possible background questions. If the given utterance is an assertion, it should answer the presupposed background question as far as possible. Furthermore, the answer should not already be entailed by the mutual knowledge. (9)
I will not resign.
By uttering example (9) the question whether the speaker will resign is presupposed. It is also possible to presuppose stronger questions that correlate with a larger need for information, e.g. the question what the speaker will do, or the question who will resign.22 The recipient learns that the speaker refuses to resign, that resigning was compatible with the common ground presupposed by the speaker, and that his resignation was under discussion. The recipient can assume that the question of the speaker’s resignation was not under discussion without reason. If the speaker happens to be the CEO of a corporation, the recipient can assume that the year’s results have been negative. 20
Cf. Carlson (1983), Ginzburg (1996a), Ginzburg (1996b), and Roberts (2001). A dialogue model following this account was implemented as part of the Trindi project: http://www.ling.gu.se/projekt/trindi/. Cf. Cooper et al. (2001)). 22 Which question is presupposed also depends on the stress pattern of the answer. I will discuss this matter is Chapter 4. 21
54
Accentuation and Interpretation
Summarising: the relevance of messages is determined in relation to background questions. Uttering an interrogative sentence puts a new questions up for discussion, directs the answering of an already given question, or checks the preconditions for answering such a question. A question is under discussion until it has been answered as well as possible or satisfactorily (here: exhaustively). As long as a question is under discussion, it regulates all types of utterances: every utterance should serve the purpose of answering it, and the discourse participants cannot ask other arbitrary questions, neither explicitly, nor implicitly by making assertions. An assertion presupposes a background question; the background question is at least partially answered by the assertion. If a background question has already been established, and a speaker utters something that does not contribute to answering this question, he expresses himself inadequately. If, on the other hand, the participants in a discourse do not have common knowledge of a background question, an assertion can allow them to accommodate their discourse knowledge in such a way that the speaker does answer a question (even if that question was given behind the fact). For that, the discourse participants presuppose that the speaker expresses himself in a relevant manner. 3.2
Updating the common ground (2)
Let us extend the update system that was defined in section 2.2 so that thematic knowledge (knowledge of questions) can be represented, such knowledge can be updated and criteria of relevance can be defined. To that end, it must first be made clear what a question actually is. A theory of questions is required that satisfies the following criteria: first, since questions mark the goals of information exchange, and since such goals exist in answering these questions, the theory must define the notion of question as well as the notion of answer.23 Secondly, a theory of questions should provide an empirically maintainable semantics of interrogative sentences. Since questions can be asked by uttering interrogative sentences and since interrogative sentences denote questions, questions must be defined in such a way that they can function as the meaning of interrogative sentences. Thirdly, a semantic theory of interrogative sentences determines semantic relations (entailment, equivalence) between questions, and between interrogative sentences respectively. The specification of semantic relations is of interest here because the adequacy of a newly asked question depends on the relation it has to an eventually given background question. 23 Questions are often answered by uttering elliptical sentences (“Who is sleeping?” – “Ann”). In this chapter, I only discuss the interpretation of complete sentences; the interpretation of elliptical answers is a topic of Chapter 4.
Cooperative Information Exchange
No one is sleeping.
Only Ann is sleeping.
Only Peter is sleeping.
55
Ann and Peter are sleeping.
Figure 3.1 Partition for “Who is sleeping?” and “Who is not sleeping?”
The question theory (semantics of interrogative sentences) of Groenendijk and Stokhof (1984) satisfies these criteria.24 It provides the means required for the further modelling of cooperative information exchange. Let me first illustrate the theory informally by means of some examples: (10)
Who is sleeping?
Following Groenendijk and Stokhof, understanding an interrogative sentence means knowing what counts as an exhaustive answer. An answer is a proposition; typically there are several possible answers. Let us assume that Tom has two children (Ann and Peter) and that he is asked who (meaning who of the children) is sleeping. There are four possible exhaustive answers: none of the children is sleeping; only Ann is sleeping; only Peter is sleeping; or both children are sleeping. The exhaustive answers are incompatible with each other. They cover the entire space of possibilities, so that exactly one of the answers corresponds to reality and is true. In a given model a proposition is a subset of the total set of all indices. The four exhaustive answers are four propositions (index sets) that do not have intersections and whose union is identical to the total set of all indices. The answers together form a partition of the set of all indices (represented by Figure 3.1). Groenendijk and Stokhof call such a partition a question. 24 (A) Interrogative sentences can appear as subclauses of declarative sentences (for example: “Tom knows who is sleeping”). It is an advantage if questions are defined as possible meanings of interrogative sentence regardless of whether they appear as main or subclauses. The question theory of Groenendijk and Stokhof (1984) has this advantage. (B) For some criticism to Groenendijk and Stokhof (1984) cf. section 4. I use Groenendijk’s and Stokhof’s theory primarily because it meets my practical requirements. However, there are certainly more than just pragmatic arguments in favour of Groenendijk’s and Stokhof’s theory. (C) For various (formal) semantic treatments of interrogative sentences, see B¨auerle and Zimmermann (1989), Higginbotham (1996), Groenendijk and Stokhof (1997) and Harrah (2002).
56
Accentuation and Interpretation
In order to understand an interrogative sentence, one must know which question, i.e. which set of propositions, the sentence denotes. According to this, a question is the intensional meaning of an interrogative sentence. The communicative goal connected to asking a question (i.e. uttering an interrogative sentence) consists in identifying the correct answer and to reject the other possible answers. For each question there is exactly one correct and exhaustive answer. A question has been answered after this answer has been added to the common ground. (11)
Tom knows who is sleeping.
Groenendijk and Stokhof define the extensional meaning of an interrogative sentence as its correct, exhaustive answer. If Tom knows who is sleeping, he knows the extensional meaning of “Who is sleeping?”. Depending on the situation at hand, he knows that no one is sleeping, that only Ann is sleeping, that only Peter is sleeping, or that both are sleeping. Let me take stock so far. A question is a set of mutually incompatible propositions that cover the entire space of possibilities. Every element of this set is a possible, exhaustive answer to the question. With the notion of question the notion of answer and therefore also the notion of discourse goal (for cooperative information exchange) are defined. Answering a question and in that way satisfying the discourse goal can be done in three ways: (a) one utters a sentence whose meaning is an exhaustive answer; (b) one utters a sentence that entails an exhaustive answer; or (c) one utters a sentence that is incompatible with at least one possible exhaustive answer, so that the set of possible answers is reduced and the question is at least partially answered. The meaning of interrogative sentences is defined in terms of propositions, i.e. of the meanings of declarative sentences. Note that the meaning of interrogative sentences is not reduced to the meaning of declarative sentences: the intensional meaning of a declarative sentence is a proposition, whereas the intensional meaning of an interrogative sentence is a set of propositions (a propositional concept); the extensional meaning of a declarative sentence is a truth value, while the extensional meaning of an interrogative sentence is a proposition. The meanings of declarative and of interrogative sentences are of different types, but they can be specified within the same model-theoretic framework. According to Groenendijk and Stokhof an interrogative sentence entails a second interrogative sentence if every exhaustive answer to the first sentence entails an exhaustive answer to the second one:
Cooperative Information Exchange
Peter is not sleeping.
No one is sleeping.
Only Ann is sleeping.
57
Peter is sleeping.
Only Peter is sleeping.
Ann and Peter are sleeping.
Figure 3.2 Partition for “Is Peter sleeping?”
(12)
a. b.
Is Peter sleeping? Who is sleeping?
The polar question (“yes/no question”) denoted by (12-a) has two possible answers: either Peter is sleeping, or he is not sleeping. (The corresponding partition is represented in Figure 3.2.) Every exhaustive answer to the wh-question denoted by (12-b) entails an exhaustive answer to the polar question (12-a); therefore the wh-question who is sleeping entails the polar question whether Peter is sleeping. This does not hold the other way around, however: the polar question does not entail the wh-question. Someone may know that Peter is sleeping and at the same time falsely believe that Ann is awake. The information whether Peter is sleeping or not does not entail an exhaustive answer to the question who is sleeping. Two sentences are equivalent with respect to a model if they have the same intensional meaning. Two interrogative sentences are therefore equivalent with respect to a model if they denote the same partitions, i.e. if they have the same exhaustive answers:25 (13)
a. b.
Who is sleeping? Who is not sleeping?
Every exhaustive answer to the interrogative sentence (13-a) is also an exhaustive answer to the interrogative question (13-b), and vice versa. If we know who (from a given set of persons under consideration) is sleeping – that is, if the question who is sleeping is answered – then we also know who (from the set of persons under consideration) is not sleeping. Everybody except the people that are sleeping is not sleeping. Vice versa: if we know who (from a given set of persons under consideration) is not sleeping – that is, if the question who is not sleeping is answered – then we 25 Groenendijk and Stokhof (1984) adopt a notion of strong exhaustivity. For a discussion of strong vs weak exhaustivity cf. the titles named in footnote 24 (C).
58
Accentuation and Interpretation
also know who (from the set of persons under consideration) is sleeping. Everybody is sleeping except the people that are not sleeping. The two interrogative sentences (13-a) and (13-b) are therefore equivalent. The interrogative sentences “Who is sleeping?” and “Who is not sleeping?” denote the same question, i.e. they have the same answers. The answers are propositions that can be expressed by uttering various linguistic expressions. The same linguistic expression can be interpreted differently depending on the interrogative sentence with which the question was asked: the name “Ann” when answered to the first question is expanded into a different answer (Ann is sleeping) than when it is answered to the second question (Ann is not sleeping). Depending on circumstances, it may not suffice for a correct interpretation to know the background questions (the uniform meaning of the different interrogative sentences “Who is sleeping?” and “Who is not sleeping?”). Additionally it may be required to know the linguistic form of the interrogative sentence by which the question has been asked. I return to this matter in Chapter 4. Let me take stock. Following Groenendijk and Stokhof (1984), I define the semantic relations of equivalence and consequence between questions, or interrogative sentences, in terms of the semantic relations between the corresponding exhaustive answers. I now define questions as formal objects. First I extend the language TL to the language QL, that includes interrogative sentences. Then I define the semantics of QL. By doing so I define the model-theoretic interpretation of interrogative sentences and the formal notion of a question as well. Definition 3-7 (Syntax of QL) QL is an extension of TL for which holds: (a) The vocabulary of QL contains the entire vocabulary of TL plus the two locutionary mode operators ! and ?. (b) Well-formed expressions of QL: (i) Every well-formed expression of TL is a well-formed expression of QL. (ii) If φ is a sentence of TL, then !φ is a well-formed expression of QL. (iii) If ψ is a formula of TL, then ?ψ is a well-formed expression of QL. (iv) Noting else is a well-formed expression of QL. Definition 3-8 (Sentence of QL) A sentence of QL is a QL-expression of the form !φ or ?ψ.
Cooperative Information Exchange
59
A sentence !φ is a declarative sentence of QL; it is interpreted just like a sentence of TL. A sentence ?ψ is an interrogative sentence of QL. If ?ψ does not contain a free variable, its meaning is a polar question; uttering ?ψ asks whether it is the case that ψ. For example ?sleep( peter ) can be interpreted in the sense of “Is Peter sleeping?” and ?∃ x [sleep( x )] in the sense of “Is anyone sleeping?”. If ?ψ, on the other hand, contains at least one free variable, ?ψ has the meaning of a wh-question; uttering ?ψ asks for all possible instantiations of the free variables. In this way, ?sleep( x ) can be understood in the sense of “Who is sleeping?” – that is, How can x be instantiated so that sleep( x ) is true? – and ?eat( x, y) can be understood in the sense of “Who eats what?” – that is, How can x and y be instantiated so that eat( x, y) is true?.
Definition 3-9 (Model of QL) Let M = D M , I M , [[ ]] M be a (reduced) model of TL. M = D M , I M , [[ ]] M is a (reduced) model of QL iff for all indices i ∈ I M it holds that:
(a) [[φ]]iM = [[φ]]iM , for all QL-expressions φ that are also expressions of TL, (b) [[!φ]]iM = [[φ]]iM , (c) [[?φ]]iM = {i ∈ I M | [[φ]]iM = [[φ]]iM }, for all sentences ?φ that do not contain free variables, and M ,g
(d) [[?ψ]]iM = {i ∈ I M | { g | [[ψ]]i = true} = { g | [[ψ]]i all sentences ?ψ that do contain free variables. M,g
= true}}, for
Ad (a): QL is an extension of TL. Expressions that are already part of TL are interpreted just like in TL. Ad (b): a declarative sentence of QL is interpreted like a formula that does not contain free variables, i.e. like a sentence of TL. The extensional meaning [[!φ]]iM of a declarative sentence !φ with respect to M and an index i is a truth value. The intensional meaning of !φ with respect to M and the set of all indices is a proposition, namely the set of indices in relation to which !φ is true:
[[!φ]] M := {i | [[!φ]]iM = true} The locutionary mode operator ! is actually not needed for interpretation; it serves the more didactic function of making declarative sentences and interrogative sentence easily distinguishable. The definition 3-9 is redundant. Case (c) is a special case of case (d), and it can be subsumed under (d). However, distiguishing between the semantics of polar questions (case c) and the semantics of wh-questions
60
Accentuation and Interpretation
(case d) might make the definition more clear. Ad (c): an interrogative sentence ?φ that does not contain free varaiables expresses a polar question. The extensional meaning [[?φ]]iM of such an interrogative with respect to M and an index i is a proposition, namely the set of indices in relation to which φ has the same truth value as in relation to i. This proposition is the true answer to ?φ in relation to i: M [[?φ]]iM := {i ∈ I M | [[φ]]iM = [[ φ ]]i }
The intensional meaning of ?φ with respect to M is a question, i.e. the set of the mutually exclusive possible answers to ?φ: M M [[?φ]] M := {{i ∈ I M | [[φ]]iM } = [[ φ ]]i } | i ∈ I
Trivially, a polar question has at most two elements, first, the set of all indices in relation to which φ is true, and, secondly, the set of all indices in relation to which φ is false. If φ is a tautology or a contradiction, the intensional meaning of ?φ has only one element. Above, the a polar question corresponds to a partition on the set of indices, i.e as a set of index sets. Partitions can be described as equivalence relations – i.e. as sets of index pairs i, i , with i, i ∈ I M . An equivalence relation induces a partition; the equivalence relation and the partition can be mapped one-to-one onto each other. A polar question can be represented as an equivalence relation in the following way:
[[?φ]] M := {i, i ∈ I M × I M | [[φ]]iM = [[φ]]iM } According to this representation, the true answer to ?φ in i – that is, the extensional meaning of ?φ in i – is:
[[?φ]]iM := {i | i, i ∈ [[?φ]] M } Ad (d): the extensional meaning of an interrogative sentence is its true, exhaustive answer. For polar questions there are only exhaustive answers; for wh-questions there can be both exhaustive as well as partial answers. Let us take another look at example (10) (“Who is sleeping?´’). “Who is sleeping?” translated into QL is: ?sleep( x ). Let it be given that Ann and Peter are sleeping; a third child with the name Simon, however, is awake. The proposition that Ann and Peter are sleeping is true, but it is just a partial answer, as it does not exclude the possibility of Simon sleeping as well. The true, exhaustive answer is the proposition that only Ann and Peter are sleeping. (Cf. Figure 3.3.) The wh-interrogative ?sleep( x ) contains the free variable x. Several assignments g are possible for this variable, so
Cooperative Information Exchange
Only
Only
Only
Simon,
Only
Only
Only
Simon
Simon
Ann
Ann
No one
Simon
Ann
Peter
and
and
and
and
is sleeping.
is sleeping.
is sleeping.
is sleeping.
Ann
Peter
Peter
Peter
are sleeping.
are sleeping.
are sleeping.
are sleeping.
61
Ann and Peter are sleeping.
Figure 3.3 Partition for “Who (incl. Simon) is sleeping?”
M,g
that the following holds: [[sleep( x )]]i = true. To determine the exhaustive answer all such variable assignments must be taken into account. The extensional meaning [[?sleep( x )]]iM of the interrogative sentence ?sleep( x ) with respect to M and the index i is a proposition, namely the set of all indices with respect to which the formula sleep( x ) is true for the same variable assignments as with respect to i. Precisely: the formula sleep( x ) is true in i for the variable assignments g and g , with g( x ) = ann and g ( x ) = peter. It is false in i for the assignment g , with g ( x ) = simon. The extensional meaning of ?sleeps( x ) is therefore the set of all indices i M,g
for which it holds that [[sleep( x )]]i M,g
M,g
= true, [[sleep( x )]]i
= true and
[[sleep( x )]]i = false. All variable assignments must be taken into account. Accordingly, I generally determine the extensional meaning of a whinterrogative ?ψ as follows: [[?ψ]]iM := {i ∈ I M | { g | [[ψ]]i
M,g
M,g
= true} = { g | [[ψ]]i
= true}}
The intensional meaning of ?ψ with respect to M is a question, i.e. the set of the possible exhaustive answers to ?ψ. The set of the possible exhaustive answers is a set of mutually exclusive sets of indices that together cover the entire space of possibilities; i.e. it is a partition on the set of indices I M :
[[?ψ]] M := {{i ∈ I M | { g | [[ψ]]i
M,g
M,g
= true} = { g | [[ψ]]i
= true}} | i ∈ I M }
As in case (c), this partition can be represented as an equivalence relation on the set of indices, i.e. as a set of index pairs:
62
Accentuation and Interpretation
[[?ψ]] M := {i, i ∈ I M × I M | { g | [[ψ]]i
M,g
M,g
= true} = { g | [[ψ]]i
= true}}
Finally, also the extensional meaning of ?ψ in i can be representated analogously to case (c):
[[?ψ]]iM := {i | i, i ∈ [[?ψ]] M } Now that I have defined the notion of question and the meaning of interrogative sentences, I can determine the semantic relations between interrogative sentences: two interrogative sentences are equivalent in a model M iff they have the same intensional meaning. (That applies to all sentences, not just to interrogatives.) An interrogative sentence entails a second interrogative sentences iff every exhaustive answer to the first interrogative entails an exhaustive answer to the second interrogative. That is, I define the semantic relations between interrogative sentences as follows: Definition 3-10 (Semantic Relations of Interrogative Sentences) Let M = D M , I M , [[ ]] M be a (reduced) model of QL, and let ?ψ and ?ψ be interrogative sentences of QL: (a) ?ψ and ?ψ are equivalent in M iff [[?ψ]] M = [[?ψ ]] M . (b) – For questions being represented as sets of index sets (partitions): ?ψ entails ?ψ in M iff ∀ p ∈ [[?ψ]] M ∃ p ∈ [[?ψ ]] M : p ⊆ p . – For questions being represented as sets of index pairs (equivalence relations): ?ψ entails ?ψ in M iff [[?ψ]] M ⊆ [[?ψ ]] M . I have defined a notion of question, a notion of answer, and semantics of interrogative sentences including semantic relations between interrogative sentences, so that I can extend the classical update system from section 2.2 for the representation and modification of thematic knowledge and for the definition of relevance criteria. In extending the update system I orient myself on Groenendijk (1999) and Groenendijk (2003). Alternative update systems with questions are defined by J¨ager (1995) and Hulstijn (1997). In the update system so far, an information state with respect to a model M is a proposition, that is, an index set I ⊆ I M . Such an information state represents propositional knowledge, but no thematic knowledge. Thematic knowledge (knowledge about questions) can be represented by an equivalence relation over a set of indices. One could now model an information state as a pair PK, TK – consisting of an index set PK representing
Cooperative Information Exchange
63
propositional knowledge and a set of index pairs TK representing thematic knowledge. Alternatively, and more elegantly, one can represent propositional and thematic knowledge together with an equivalence relation, if one adapts the representation of propositional knowledge accordingly. I choose the second, more elegant option. I adapt the representation of propositions to the representation of questions, so that propositions are represented as index pairs as well. Until now a proposition was a set of indices with respect to a model M. From now on I represent it as a set of pairs of identical indices i, i , with i ∈ I M . I represent the exhaustive and correct answers to ?ψ in M and with respect to an index i as follows:
[[?ψ]]iM := {i , i | i, i ∈ [[?ψ]] M } An information state is modelled as a set of index pairs. It is updated by filtering its index pairs: Definition 3-11 (Information State) Let M = D M , I M , [[ ]] M be a (reduced) model of QL. (a) σ is an information state w.r.t. M and QL iff σ is an equivalence relation on a subset I of the indices I M (I ⊆ I M ). (b) For all information states σ w.r.t. M and QL the following holds: – σ is a state of total ignorance, i.e. a minimal information state, iff σ ⊇ {i, i | i ∈ I M }. – σ is a state of disinterest iff for all i, i , i , i ∈ σ: i, i ∈ σ. – σ is an absurd information state iff σ ∩ {i, i | i ∈ I M }. (c) – 0 = I M × I M is the minimal information state of total disinterest. – 1 uniformly represents any arbitrary absurd information state.
An information state with respect to M and TL represents only propositional knowledge; an information state with respect to M and QL on the other hand also represents thematic knowledge, i.e. a question under discussion: Definition 3-12 (Propositional and Thematic Knowledge) Let σ be an information state w.r.t. M and QL.
64
Accentuation and Interpretation
(a) The proposition expressed by a declarative sentence !φ ∈ QL is known in σ iff: [[!φ]]iM = true, for all i, i ∈ σ A sentence !φ ∈ QL contradicts σ iff:
[[!φ]]iM = false, for all i, i ∈ σ A sentence !φ ∈ QL is compatible with σ iff:
[[!φ]]iM = true, for at least one i, i ∈ σ (b) The question expressed by an interrogative sentence ?ψ ∈ QL is known26 in σ iff: σ ⊆ [[?ψ]] M An interrogative sentence ?ψ ∈ QL contradicts σ iff: σ ∩ [[?ψ]] M = ∅ An interrogative sentence ?ψ ∈ QL is compatible with σ iff σ ∩ [[?ψ]] M = ∅
I define propositional knowledge analogously to knowledge in information states of TL. A question is known in an information state if the information state is partitioned in accordance with the question, i.e. if the indices are grouped into exhaustive answers to the question. Because a question is a total equivalence relation on I M and therefore contains all pairs i, i (with i ∈ I M ), it can only contradict an information state if this state does not contain a pair i, i , making it absurd. An interrogative sentence is therefore compatible with any information state except the absurd state. How can an information state be updated? Definition 3-13 (Classical Update System) Let M be a (reduced) model of QL and Σ the set of all information states σ w.r.t M and QL; let !φ, ?φ and ?ψ furthermore be sentences of QL with ?ψ containing free variables and ?φ not containing any free variable. UQL = QL, Σ, [ ]2M is a classical update system for QL iff: 26 What is meant here is that the question is established and under discussion according to the information state. It is not meant that its exhaustive answer is known, like with e.g. “Tom knows who is sleeping”.
Cooperative Information Exchange
65
(a) σ [?φ]2M = {i, i ∈ σ | [[φ]]iM = [[φ]]iM } (b) σ [?ψ]2M = {i, i ∈ σ | { g | [[ψ]]i
M,g
M,g
= true} = { g | [[ψ]]i
= true}}
(c) σ [!φ]2M = {i, i ∈ σ | [[φ]]iM = [[φ]]iM = true} Ad (a) and (b): updating an information state σ with a polar question ?φ or a wh-question ?ψ only affects the thematic knowledge. If elements of the information state are filtered out in the course of an update, then only elements that are pairs i, i ∈ σ with i = i are filtered out. Pairs i, i ∈ σ, that represent propositional knowledge, are never filtered out, since they always fulfil the conditions [[φ]]iM = [[φ]]iM and M,g
M,g
{ g | [[ψ]]i = true} = { g | [[ψ]]i = true}. Ad (c): updating an information state σ with a declarative sentence !φ on the other hand affects propositional knowledge. In the course of an update, pairs i, i ∈ σ can be filtered out. Assertions can also modify thematic knowledge, namely when they answer a question under discussion, thus concluding the corresponding topic. An assertion can never bring up a new question, however. Utterances that change the common ground expand the propositional knowledge or put a question up for discussion: Definition 3-14 (Informativity and Questions) Let UQL = QL, Σ, [ ]2M be a classical update system for QL. For all information states σ ∈ Σ the following holds: (a) A declarative sentence !φ ∈ QL is informative for σ iff: σ [!φ]2M = σ (b) An interrogative sentence ?ψ ∈ QL sets a new topic for σ iff: σ [?ψ]2M = σ
Declarative sentences are informative if they are not already entailed by the common ground. Declarative sentences that contradict the common ground are informative, but updating the common ground with such sentences leads to the absurd information state. If by uttering an interrogative sentence the common ground is altered, the utterance is an expression of inquisitiveness. Updates through interrogative sentences never change propositional knowledge and therefore cannot lead to the absurd information state. If a question has already been asked – possibly even answered –
66
Accentuation and Interpretation
the interrogative sentence that expresses it does not contradict the common ground, but uttering it is not an expression of inquisitiveness. The notion of entailment defined for UTL holds analogously for UQL . An information state entails a declarative or interrogative sentence if updating the information state with the sentence does not yield any change. A sentence entails a second sentence if any arbitrary information state updated with the first sentence entails the second: Definition 3-15 (Entailment) Let UQL = QL, Σ, [ ]2M be a classical update system for QL and M. 1. An information state σ ∈ Σ entails a sentence φ of QL (either a declarative or an interrogative sentence) σ |=UQL φ iff: σ [φ]2M = σ 2. A sentence φ of QL entails a sentence ψ of QL (either a declarative or an interrogative sentence) φ |=UQL ψ iff for all information states σ ∈ Σ: σ [φ]2M = σ[φ]2M [ψ]2M
Accordingly, the following holds: 1. No interrogative sentence ?ψ can entail a declarative sentence !φ, unless !φ is necessarily true and therefore entailed by every declarative and interrogative sentence. (I ignore presuppositions that can be triggered by interrogative sentences.) 2. A declarative sentence !φ entails an interrogative sentence ?ψ iff !φ answers the question asked by ?ψ exhaustively. After an information state has been updated with !φ, a following update with ?ψ does not yield any further change. Trivially, every declarative sentence !φ entails the question whether the sentence is true: !φ |=UQL ?φ. The declarative sentence “Ann and Peter are sleeping” entails the interrogative sentences “Are Ann and Peter sleeping?”, “Is Ann sleeping?”, “Is Peter sleeping?” and “Is anyone sleeping?”: if anyone knows that Ann and Peter are sleeping, he also knows that Ann is sleeping, that Peter is sleeping and that someone is sleeping. “Ann and Peter are sleeping” does not entail
Cooperative Information Exchange
67
“Who is sleeping?”. After all, it is possible that apart from Ann and Peter, other people under consideration – e.g. Simon – are sleeping as well, so that the question who is sleeping is not answered exhaustively. 3. An interrogative sentence ?φ entails another interrogative sentence ?ψ iff every exhaustive answer to ?φ entails an exhaustive answer to ?ψ. Accordingly, “Who is sleeping?” entails e.g. “Is Peter sleeping?”. The reverse is not the case, however: “Is Peter sleeping?” does not entail “Who is sleeping?”. Let me summarise so far. I have extended the formal language with questions and modified the update system in such a way that it allows updates by uttering interrogative sentences. I can now formally describe discourses in which questions are asked and answered successively. Next, I need to specify criteria for the relevance of utterances in the update framework. In order for an utterance to be relevant, it must serve the purpose of answering a question under discussion. A question is only under discussion if it is entailed by the common ground of the discourse participants. However, all questions whose exhaustive answers are entailed by the common ground are also entailed the common ground.27 These questions that are already answered are not under discussion. For a question to be under discussion there must be a possible answer that is compatible with the common ground but not entailed by it: Definition 3-16 (Question Under Discussion) Let UQL = QL, Σ, [ ]2M be a classical update system for QL. Let the information state CG ∈ Σ represent the common ground of a group of discourse participants G. The sentence ?ψ ∈ QL denotes a question under discussion for G iff: (a) ?ψ is entailed by the common ground of G: CG |=UQL ?ψ (b) There is a sentence !φ that denotes a possible answer to ?ψ, and that is compatible with the common ground but not entailed by it: !φ |=UQL ?ψ CG |=UQL !¬φ CG |=UQL !φ 27 If a sentence !φ denotes an exhaustive answer to the question denoted by ?ψ, then it holds that: !φ |=UQL ?ψ (see above). If !φ is entailed by the common ground that, so that ?ψ is answered exhaustively, then ?ψ is entailed by the common ground.
68
Accentuation and Interpretation
If a question is under discussion, there are at least two possible exhaustive answers that are compatible with the common ground.28 An assertion is relevant if it filters out at least one of these answers and in that way reduces the set of answers deemed possible. Such an assertion gives a partial answer to the question under discussion. If the assertion filters out all answers except one, it gives an exhaustive answer and concludes the topic represented by the question. In the following example a question is asked that is partially answered twice: (14)
Who has prepared dessert? (?q) a. b.
Tom helped prepare dessert. (!p) Tom helped prepare dessert and went home early. (![ p ∧ r ])
The question who prepared dessert (?q) entails the polar question whether (also) Tom prepared dessert (?p). With the information that Tom helped prepare dessert, i.e. that he was one of the people who prepared dessert (!p), the subordinate (entailed) polar question ?p is answered exhaustively and the wh-question ?q under discussion is answered partially. Both replies !p and ![ p ∧ r ] provide information that contributes to answering the question at hand. To that extent, both replies are relevant. The second reply ![ p ∧ r ] furthermore gives the information that Tom went home early (!r). This information does not serve to answer the question at hand. It is not under discussion whether Tom went home early (?r), and therefore it is not under discussion whether Tom helped prepare dessert and went home early (?p ∧ r). The second reply is over-informative. In uttering a sentence !φ one should not provide any information that does not serve to answer a question under discussion. In order for the uttering of !φ to be adequate, it must be under discussion whether !φ is true or not. The assertion of !φ is only adequate – relevant and not overinformative – if ?φ denotes a question under discussion:
According to definition 3-16, a question [[?ψ]] M is only under discussion for G if [[?ψ]] M ∩ CG induces a partition, i.e. if it contains at least two propositions (exhaustive answers). I furthermore require that these answers must be expressible linguistically. That means that a question that cannot be answered through a speech act cannot be under discussion. This is desirable. If one wishes to allow for answers to be given through other acts than speech acts, the definition must be modified accordingly. – How then can answers be given through acts other than speech acts? – Someone might not be able to name the objects in his bag. He is nonetheless able to answer the question what is in his bag by putting the objects on a table visible to the one who asked him. 28
Cooperative Information Exchange
69
Definition 3-17 (Partial Answer) Let UQL = QL, Σ, [ ]2M be a classical update system for QL. Let the information state CG ∈ Σ represent the common ground of a group of discourse participants G. A declarative sentence !φ ∈ QL denotes a partial answer to a question under discussion for G iff ?φ denotes a question under discussion for G. I can now distinguish between four different answer relations between questions and propositions: 1. A proposition is a partial answer to a question under discussion iff it satisfies the conditions named in definition 3-17. 2. A proposition gives a partial answer iff it entails a partial answer. A proposition that gives a partial answer without being a partial answer is over-informative. 3. A proposition gives an exhaustive answer to a question under discussion if it filters out all possible answers to the question except one. If ?ψ denotes a question and !φ denotes an exhaustive answer to this question, then for all information states σ it holds that: σ [!φ] |=?ψ. Every proposition that gives a complete answer also gives a partial answer, and can be over-informative. 4. A proposition is an exhaustive answer to a question under discussion iff it is a partial answer and gives an exhaustive answer. An assertion that satisfies the conversational maxims is a partial, sometimes even exhaustive, answer to a question under discussion. Let us take another look at example (14) (repeated here as (15)): (15)
Who has prepared dessert? (?q) a. b.
Tom helped prepare dessert. (!p) Tom helped prepare dessert and went home early. (![ p ∧ r ])
Let us assume that it is already part of the common ground that Tom went home early. Under these circumstances, the question whether Tom helped prepare the dessert and went home early (?[ p ∧ r ]) is under discussion: the question is entailed by the common ground and has not been answered exhaustively yet. Accordingly, the proposition expressed in the second reply (![ p ∧ r ]) not only gives a partial answer, it even is one. Nonetheless, the second reply is of course inadequate. If it is already known that Tom went home early, both replies from example (15) alter the common ground in the same way. In both cases all index
70
Accentuation and Interpretation
pairs according to which Tom did not help prepare dessert are filtered out; with respect to the given information state, both replies are therefore equivalent. However, they are not equivalent with respect to any information state; the first sentence (!p) is logically weaker than the second (![ p ∧ r ]). A simple criterion for excluding the second reply is readily available: uttering a partial answer is only adequate if there is no alternative answer that, although equivalent with respect to the common ground, is logically weaker. Uttering a declarative sentence is adequate in a given context iff the sentence denotes a partial answer to a question under discussion, and if there is no logically weaker sentence that would modify the common ground in the same way: Definition 3-18 (Semantic Conversational Maxims for Assertions) Let UQL = QL, Σ, [ ]2M be a classical update system for QL. Let the information state CG ∈ Σ represent the common ground of a group of discourse partners G. If uttering a declarative sentence !φ ∈ QL to G is to satisfy the conversational maxims, the following must hold: (a) ?φ denotes a question under discussion for G. (b) There is no sentence !χ ∈ QL with !φ |=UQL !χ and !χ |=UQL !φ, so that CG [!φ] = CG [!χ]. Ad (a): if ?φ denotes a question under discussion for G, both !φ and !¬φ are compatible with GG , so that the sentence !φ does not contradict the common ground CG (maxim of quality). Uttering !φ answers a question under discussion (maxim of relation). The utterance is informative, but not over-informative; it does not give more new information than requested (maxim of quantity). Ad (b): the second condition excludes the possibility that the same answer could be given by uttering a logically weaker sentence. It does not give old information (maxim of quantity). Excursus: Groenendijk (1999) proceeds differently in specifying the relevance criterion: according to him an assertion is relevant iff the asserted proposition is an answer to a given question with respect to every possible information state – not just with respect to the information state that represents the current common ground of the discourse participants. Cf. example (15): the proposition that Tom helped prepare dessert and went home early, gives, according to Groenendijk, a partial answer to the question who prepared dessert. It is not the case that it is a partial answer, because it is over-informative when it is not already known that Tom went
Cooperative Information Exchange
71
home early. Groenendijk specifies what is a partial answer to a given question, not, like me, with respect to a specific information state, but with respect to every possible information state: the utterance of a declarative sentence is inadequate if it can be over-informative with respect to a given question. Groenendijk can therefore do without the additional criterion that one must choose the logically weaker of two answers that are equivalent with respect to the common ground. – Why do I not proceed the way Groenendijk does? First, Groenendijk’s relevance criterion is sometimes too strong: (16)
Who did Michael talk to? a. b. c.
He talked to exactly two people. He talked to exactly two people, who were both born in Hamburg in May 1967. He talked to exactly two of his friends.
Let it be given that Michael talked to Oliver and Frank, among others. Oliver and Frank are friends of Michael’s; furthermore, both were born in Hamburg in May 1967. (a) The first reply in (16) denotes a partial answer to the question under discussion. All possible answers according to which Michael talked to more or less than two persons can be filtered out. The utterance is adequate both according to Groenendijk’s definition and according to my definition. (b) The second reply gives more information than the first. On the basis of the utterance, all index pairs that specify that Michael talked to two people who were not born in Hamburg can be filtered out. If places of birth of the people under consideration are not known, these index pairs do not form an answer; the utterance is therefore over-informative and not adequate. According to Groenendijk, the second reply is always inadequate because it can be over-informative. According to my definition, the reply would be adequate if for all people under consideration it was known whether they were born in Hamburg in May 1967 or not. That is rather improbable, so that the second utterance is most probably inadequate. (c) To the third reply: the additional information that the persons that Michael talked to were friends of his, is according to Groenendijk’s definition possibly superfluous, just like the additional information in the second reply. Therefore the third reply is inadequate for Groenendijk. That is not plausible. If it is known who Michael’s friends are – something that is considerably more likely than that it is known when and where the people under consideration were born – the additional information is truly relevant: all index pairs according to which Michael talked to people that are not among his friends can be filtered out. My definition
72
Accentuation and Interpretation
makes reference to the available, propositional knowledge. This allows the third reply to be adequate. Secondly, Groenendijk’s relevance criterion is sometimes too weak: take the example (17), and let us assume that it is already part of the common ground that Michael talked to Oliver. (17)
Who did Michael talk to? a. b.
He talked to Frank. He talked to Oliver and Frank.
The first reply denotes a partial answer to the question who Michael talked to. It only gives new information and is adequate according to both Groenendijk’s and my definition. The second reply, on the other hand, is inadequate according to my definition, because it is over-informative. After all, the fact that Michael talked to Oliver is already part of the common ground. According to Groenendijk’s definition, the second answer is adequate; it denotes a partial answer to the given question, and that with respect to every possible information state. Groenendijk’s criterion is too weak to correctly judge the second reply. (One may have a different opinion about (17). But what when it was only just said that Michael talked to Oliver? Then it would not be adequate to say that Michael talked to Oliver and Frank. One would rather say that Michael talked to Frank too.) The examples (16) and (17) suggest that in order to judge the relevance of an utterance the common ground of propositional knowledge of the discourse participants must be taken into account. (End of excursus) Let us now turn for the last time to the question who is sleeping: (18)
Who is sleeping? a. b. c.
Ann is sleeping. Ann and Peter are sleeping. Only Ann and Peter are sleeping.
The first two answers in example (18) are partial answers; the third answer is an exhaustive answer to the question. All answers satisfy the adequacy conditions defined above. That’s how it has to be: both answers are not only adequate according to the definition, but also according to our intuition. There is no doubt, however, that the second answer is better than the first, because it gives more relevant information. The third answer is even better, because it gives all relevant information and concludes the topic. The answers differ with respect to their level of informativity:
Cooperative Information Exchange
73
Definition 3-19 (Level of Informativity of Answers) Let UQL = QL, Σ, [ ]2M be a classical update system for QL. Let furthermore !φ and !ψ be arbitrary, non-equivalent declarative sentences of QL which satisfy the maxims mentioned in definition 3-18 when uttered in a given situation. The sentence !φ is more informative than !ψ, and therefore uttering !φ is better in the given situation than uttering !ψ iff: !φ |=UQL !ψ
According to the Gricean maxims, a speaker should not withhold any information that is relevant in the given context. That is, he should give the best possible answer he can with the knowledge he has. In order to incorporate this requirement into our formal specification of the semantic maxims, we would need to refer to the private knowledge of the speaker. The maxims, however, should not only regulate the behaviour of the speaker, but should also function as adequacy criteria for interpretations. A recipient should be able to verify whether the maxims have been satisfied, but he cannot be certain about criteria that refer to the private knowledge of other people. That means that the requirement that an interpretation should conform to the best possible answer a speaker can give cannot be a good criterion for the adequacy of interpretations. (I return to this matter in the final section of this chapter.) To finish off, let us turn to the adequacy of asking a question: if there is no question under discussion, uttering an interrogative sentence can put a new question up for discussion. A precondition for this is that the question has not yet been answered exhaustively and can therefore set a new discourse goal. If there is already a question under discussion, all utterances should serve to answer this question. Uttering an interrogative sentence should then not raise a new question, but structure the answering of the given question: Definition 3-20 (Semantic Conversational Maxims for Asking Questions) Let UQL = QL, Σ, [ ]2M be a classical update system for QL. Let the information state CG ∈ Σ represent the common ground of a group of discourse participants G. If uttering an interrogative sentence ?ψ ∈ QL to G is to satisfy the conversational maxims, the following must hold: (a) If CG is a state of disinterest (i.e. if there is no question under discussion): CG |=UQL ?ψ. (b) If for G a question is under discussion:
74
Accentuation and Interpretation
(i) ?ψ denotes a question under discussion for G, and (ii) there is a question under discussion for G that is denoted by ?χ, so that the following holds: ?χ |=UQL ?ψ and ?ψ |=UQL ?χ.
Ad (a): if there is no question under discussion, uttering an interrogative sentence should express inquisitiveness. Every question whose answer is not part of the common ground and with which therefore a new discourse topic can be set, may be asked. Ad (b): if there is already a question under discussion, every newly asked question should serve the purpose of answering this question. By asking a new question the answering of the given question can be structured and guided. Ad (ba): asking a new question should not bring up a new topic. Ad (bb): the new question should be subordinate to the given question, i.e. it should be entailed by the given question and it should not be equivalent with it. It should set an intermediate goal for answering the given, superordinate question. If ?ψ denotes a question that is subordinate to a given question, the following holds: CG [?ψ] = CG . This discourse-structuring function of a follow-up question is not reflected in the update system as it has been defined so far. In Chapter 4, I alter the system so that uttering a follow-up question can have a discourse-structuring effect. Summarising: in this section, I argued for the modelling of discourse goals as questions under discussion. I then specified questions as modeltheoretic objects, following Groenendijk and Stokhof (1984), and defined a formal semantics of questions. Lastly, I developed a classical update system for cooperative information exchange with interrogative and declarative sentences. Within this system, I have defined criteria for the adequacy of utterances and their reconstructions, or interpretations, on the basis of the Gricean conversational maxims.
4 Conversational maxims and active interpretation revisited In this chapter, a preliminary model of cooperative information exchange was construed: information exchange consists of successive changes of the common ground, that is, of the changes in the representations of the common ground that the discourse participants each possess. The common ground can only be modified through complete sentences – i.e. through expressions that denote propositions or questions. So far information exchange takes place through a formal language QL. The model is to be extended in the next chapter, so that it allows, first, information exchange
Cooperative Information Exchange
75
with natural-language sentences and, secondly, adequate interpretation of incompletely uttered, or rather incompletely recognised, sentences. The extension to natural language is simple: I define a translation to QL, so that natural-language utterances can be translated into expressions of QL, and then be interpreted as defined in this chapter. For the interpretation of incomplete sentences I must furthermore define operations of semantic enrichment: the recognised parts of an incompletely uttered and/or recognised sentence are translated into expressions of QL. If the translated expressions do not together form a sentence of QL, they cannot serve to modify the common ground. In order for the expressions to be informative, they must be completed so that they form a sentence of QL. Such a completion causes a change in meaning; the recognised expressions are semantically enriched. The interpretation of a natural-language expression is not necessarily determined non-ambiguously by the recognised parts of the utterance: natural-language sentences can – contrary to sentences of QL – be ambiguous; depending on circumstances, one sentence may be translated into several, non-equivalent QL sentences. Furthermore, a recipient may possess more than one operation for semantic enrichment, which means that he may be able to complete an incompletely recognised utterance in different ways, arriving at different interpretations. A recipient may have some leeway in his reconstruction. In order to use this leeway correctly, he needs adequacy criteria on the basis of which he can distinguish adequate from inadequate reconstructions. Commonly, a reconstruction is deemed adequate if it has the meaning that the speaker intended. A criterion of adequacy that refers to the intention of the speaker is not useful to the recipient, however, because the recipient would have to be able to read the speaker’s mind in order to verify if his reconstruction accords with the speaker’s intention. Certainly, he cannot read minds, and if he could, there would be no need for the speaker to say anything at all. What we need are objective criteria that do not refer to private, purely subjective attitudes. Instead of requiring that the reconstruction of a message has the meaning that the speaker intended, it must be required that it has a meaning that the speaker may have intended, i.e. a meaning that is intendable in the given discourse situation. Intendability is an objective criterion for what a speaker can mean in a given situation and for how he must therefore be understood. Thus, the criteria for the adequacy of reconstructions are based on the criteria for the adequacy of utterances. For determining the adequacy criteria for utterances, from which adequacy criteria for reconstructions can be derived, I assumed the Gricean
76
Accentuation and Interpretation
conversational maxims. The Gricean maxims are plausible, but they cannot be adopted unaltered for my purpose: first, Grice allows for speakers to violate the maxims in order to evoke conversational implicatures. A reconstruction according to which the reconstructed utterance violates a maxim could not be rejected on the ground of the violation alone. That is, because Grice’s maxims are understood as loosely enforceable rules that serve as guide-lines rather than as constraints, it is not possible to derive strict selectional criteria for reconstructions from them. In order for the maxims to form a basis for criteria for the reconstruction of messages, I establish them as strict rules that are constitutive for information exchange. Secondly, the maxims are defined rather vaguely. They apply primarily to assertions and not in the same degree to utterances of interrogative sentences. The maxims must therefore be expanded and made more concrete. I have reformulated the maxims as constraints for utterances: every assertion must constitute at least a partial answer to a question under discussion and must in that way be relevant (maxim of relation), informative but not over-informative (maxim of quantity), and compatible with the common ground (maxim of quality). If there is no question under discussion, the uttering of an interrogative sentence must be an expression of inquisitiveness. Otherwise, it must guide the answering of an already given question; the new question must be logically subordinate to the given question. From the reformulated maxims, adequacy criteria for the reconstruction of messages are derived: the reconstruction of an uttered message is adequate if according to the reconstruction, the utterance satisfies the maxims. The adequacy criteria determine in which relation a given utterance must stand to the common ground. The common ground comprises both propositional knowledge (factive knowledge) and thematic knowledge (knowledge about questions under discussion). The original Gricean maxims also refer to the private knowledge of the speaker: a speaker should say everything he knows about a given topic (original maxim of quality). A recipient cannot verify under all circumstances what the relation between the utterance directed toward him and the speaker’s private knowledge is. The reformulated maxims therefore require nothing more than that a given utterance is relevant and informative with respect to the shared, quasi-open knowledge, i.e. the common ground, and furthermore that it is compatible with the common ground.29 Nonetheless speakers should of course 29 The adequacy criterion of quality is weaker than the corresponding original maxim: a speaker’s utterance can be compatible with the common ground, and therefore satisfy the adequacy criterion, while at the same time contradicting knowledge of the speaker and thereby violating the original maxim. The weakened ade-
Cooperative Information Exchange
77
express themselves in the best way possible and not withhold any relevant information that they have – as required by the original maxims. Even if a recipient is not under all circumstances able to determine what the relation between an utterance directed toward him and the private knowledge of the speaker is, he may at least assume that a cooperative speaker says what he knows. (19)
Who talked to Jane? – Yves talked to Jane.
In example (19) – which was already discussed in Chapter 2 – someone is asked who talked to Jane. Let us assume that it belongs to the common ground that the addressee knows who talked to Jane, and therefore knows the exhaustive answer to the question. The answer that Yves talked to Jane taken literally is just a partial answer that is compatible with a situation in which also other people talked to Jane, as well as with a situation in which no one except Yves talked to Jane. Normally, the answer is interpreted stronger than the literal meaning: it is understood that only Yves talked to Jane. The stronger interpretation makes the answer an exhaustive answer, as a well-informed cooperative speaker is expected to give. In order to arrive at the stronger interpretation, the recipient must expand the interpretation of “Yves” to the interpretation of “only Yves”. He needs an operation that allows him to strengthen the answer to an exhaustive answer. Such an operation is defined in the next chapter.30 If a recipient has the option to interpret as an exhaustive answer an utterance that in its literal meaning is only a partial answer, he needs a criterion on the basis of which he can decide which of the two interpretations is quacy criterion does not prevent a speaker from lying, as long as he lies consistently. A good liar simulates cooperative behaviour. He expresses himself informatively, does not contradict what he already said, although he of course contradicts his own private knowledge. He runs the risk of his lie being disproven if his interlocutor is better informed about his (the liar’s) private information state than the liar is aware of. Let us assume that Albert knows that p; he claims to Bertha that ¬ p. Although neither p nor ¬ p belong to the common ground of Albert and Bertha, Bertha knows that Albert knows that p – KBertha KAlbert ( p). Bertha observes a contradiction between the utterance of Albert and his information state; she considers Albert either confused or a liar. If Albert did not know to what extent Bertha is aware of his information state – ¬KAlbert KBertha KAlbert ( p) – he could not foresee that she would show him to be a liar (or confused). 30 Formally, there are several ways in which the utterance could be strengthened to an exhaustive answer. One could, for example, define an operation with which “Yves talked to Jane” is expanded to “Yves and all other people talked to Jane”. This is not the sense in which the answer sentence is normally understood, however. An operation for the semantic strengthening of an answer sentence must be empirically justified. An operation that strengthens “Yves” in the sense of “only Yves” is the only plausible.
78
Accentuation and Interpretation
preferable. He could prefer the exhaustive answer, because – according to the definition given above – it is the better one. For that, however, he needs to presuppose that the speaker possesses sufficient evidence for giving the exhaustive answer, and he cannot have any expectations himself that contradict the exhaustivity of the answer. Summarising: adequacy criteria for utterances were defined on the basis of the Gricean conversational maxims. These criteria position an utterance in relation to the utterance situation. An utterance situation is specified especially by the common ground, or rather by the representations of the common ground that the discourse participants possess. The common ground contains propositional knowledge (factive knowledge) and thematic knowledge (knowledge about the question under discussion). The reception of an utterance is not deterministic; a recipient has some leeway in reconstructing a message; he makes decisions about the way a given utterance is to be reconstructed and interpreted. We can speak of an active recipient. Adequacy criteria for the reconstruction and interpretation of messages were specified on the basis of the maxims for the uttering of messages: an utterance should be reconstructed in such a way that it is compatible with the reformulated conversational maxims. A recipient can make an utterance compatible with the maxims by (a) choosing one of several alternative reconstructions, and/or (b) by accommodating the context configuration. Every utterance presupposes a common ground. A recipient can accommodate the context configuration by accommodating his representation of the common ground in such a way that the utterance satisfies the maxims, given this representation. The modification can apply to both the propositional and to the thematic aspect of the common ground.31 My model of cooperative information exchange so far is relatively simple. Let us discuss four objections that suggest that the model may be too simple in certain respects. Objection 1: according to the model defined above, holders of information states are logically omniscient. An information state updated by uttering 31 How does this accommodation function? – There are two possibilities: it can be assumed that the presupposed information state σ is fully specified and that the accommodation only alters it by adding or deleting index pairs. Alternatively, it can be assumed that the presupposed information state is underspecified. Accommodating then amounts to further specifying the information state – propositional or thematic knowledge is added, but knowledge that has already been incorporated cannot be revised. The second possibility does not allow corrections. (Cf. Beaver (1991) and Zeevat (1992) for elaboration.)
Cooperative Information Exchange
79
a declarative sentence not only represents the proposition that the sentence denotes, but also all of its logical implications. A recipient therefore always knows everything that follows from what has been said. Reply: recipients can draw valid conclusions from what is said and their prior knowledge. If someone hears that Socrates is human, and if he already knows that all humans are mortal, then he will also learn that Socrates is mortal. Under normal circumstances, recipients cannot draw all logically possible conclusions; their capacity is limited. My notion of information is only tenable as long as I base it on small models with limited domains. If what can be known at all is manageable, the implications of utterances are also manageable. Under these conditions – that is, in a laboratory situation – all valid conclusions can be drawn, and logical omniscience is possible. The model suffices for interesting experimental investigations in laboratory situations. Furthermore, the first objection does not touch the maxim that assertions should be informative. The objection only raises the point that the question when it can be presupposed that something specific is known has not yet been satisfactorily answered. With this I agree. Objection 2: according to the model defined above, a question is only under discussion until it has been answered exhaustively. There are questions, however, that only require partial answers . The following example (20) is the standard example for such a mention-some question: (20)
Where can I buy an Italian newspaper?
The questioner has no desire to have all shops that sell Italian newspapers listed to him. He wishes to know the location of one shop, preferably the one nearest to him. His question is answered as soon as a shop is mentioned; the requirement that the question remains under discussion until it is exhaustively answered is too strong. Reply: that is true. The goal is not to determine the exhaustive, but rather the optimal answer. In some cases the exhaustive answer is the optimal answer. By following Groenendijk and Stokhof (1984) in defining a question as a set of its exhaustive answers, I make the special case the general one. Van Rooy (e.g. van Rooy (2003b)) defines questions as sets of their optimal answers, i.e. not necessarily as partitions, but rather as sets of propositions that may be compatible with each other. He determines optimality in terms of decision theory in relation to subsequent action or information goals. Van Rooy’s decision-theoretic notion of questions is convincing, but integrating it into the model here is formally more complex than could be justified. I wish to show how incompletely recognised utterances can be interpreted, and the relatively simple question theory of Groenendijk and Stokhof (1984) serves my purpose.
80
Accentuation and Interpretation
Objection 3: according to the conversational maxims defined above an utterance is inadequate if it gives more information than is necessary for answering the question under discussion. This criterion appears to be too strong: (21)
Is anyone missing? a. b. c.
Yes. Dirk is missing. Dirk is missing. Dirk has been missing for two hours.
In example (21) a question is asked and answered in three different ways. The first reply first answers the question with “yes”, after which the question is no longer under discussion. The person answering may therefore presuppose a new question for his second sentence. He presupposes the question who is missing and answers it. The first reply satisfies the conversational maxims defined above and is therefore adequate. Things are different with the second reply, that gives an answer and at the same time answers the superordinate question who is missing. According to the conversational maxims defined above, this reply is over-informative and therefore inadequate. The third reply does not only answer the question who is missing, but even for how long he is missing. It, too, is over-informative with respect to the explicitly stated question, and uttering it is inadequate according to the conversational maxims defined above. In reality – i.e. according to the judgement of a competent speaker, and contrary to the conversational maxims defined above – all three replies are acceptable in a cooperative information exchange. The assessments based on the conversational maxims are therefore too strong. Reply: assume a discourse in which the common information state σ is presupposed and in which the sentences α1 , . . . , αn are successively uttered. Then, sentence αn+1 is uttered: σ[α1 ] . . . [αn ][αn+1 ] The recipient must interpret the sentence αn+1 in such a way that the utterance is compatible with the conversational maxims. If the sentence has more than one possible interpretation, he can choose one. Furthermore he can accommodate the information state σ that was presupposed for the entire discourse. In this process of accommodation, new questions and new propositional knowledge can be incorporated. For example, if an utterance presupposes that a certain question is under discussion, the recipient can alter his representation of the common ground in such a way that this precondition is met. Accommodation, however, cannot render any
Cooperative Information Exchange
81
of the utterances α1 , . . . , αn inadequate. The second and third reply in example (21) each denotes an answer to a question that is superordinate to the question that was asked explicitly. Possibly the recipient can accommodate his representation of the common ground in accordance with these questions. The explicitly stated question does not become inadequate with this; after all, it is subordinate to the presupposed questions, and may guide the process of answering them. The second and third reply in example (21) are therefore compatible with the conversational maxims as defined here if the utterance conditions can successfully be adjusted in accordance with them. Objection 4: if one can presuppose and accommodate new utterance conditions so freely, do we not run the risk of allowing too many utterances, possibly also unacceptable ones? And: if a recipient possesses such a wide array of possibilities for making an utterance compatible with the conversational maxims, he may well arrive at very different interpretations, which may nonetheless all be adequate. Is it at all possible to derive useful adequacy criteria for reconstructions and interpretations from the conversational maxims? Reply: I have already discussed a similar objection in section 2.1. I admit that the accommodation of the discourse context must be restricted. Until now, it is restricted by the requirement that the presupposed context is set up before the discourse and that all utterances must be adequate with respect to this context in the order in which they are made. That is, the representation of the common ground cannot be altered in the middle of the discourse, so that later utterances are adequate but earlier utterances are retroactively rendered inadequate. Additional restrictions arise when information exchange is described in the broader action context: information exchange can serve non-communicative goals; these goals determine what kind of information is desired and therefore which questions are under discussion. These questions restrict the possibilities of presupposing new questions. If an utterance can be reconstructed in several, non-equivalent ways and all reconstructions satisfy the adequacy criteria, further criteria on the basis of which a certain reconstruction can be selected are needed. Such additional criteria may arise on the basis of e.g. game-theoretic considerations: if one assumes that the accommodation of the utterance context involves high cost (of whatever sort), a reconstruction that does not require accommodation is on principle preferable. A recipient is anxious to minimise his cost; a speaker must be aware of this and cannot offer reconstructions that are “cheaper” than the intended one. (cf. Parikh (2001)).32 32 According to Heim (1992), presupposing surprising and controversial assumptions in the expectation that the utterance context will be accommodated accord-
82
Accentuation and Interpretation
So far, I have disregarded the conversational maxim of manner. This maxim requires from the speaker that he expresses himself clearly and intelligibly. If an utterance can be interpreted in different, non-equivalent ways, and all interpretations are equally acceptable for the recipient, the utterance is certainly not clear. In a given context, a clear utterance has only one adequate interpretation, or, if it has several, non-equivalent interpretations, one of them must without doubt be preferable to the others. If a speaker expresses himself clearly, the adequacy criteria for reconstruction and interpretation derived from the (semantic) conversational maxims are useful. In the next chapter, I define a further criterion for the clarity of utterances based on the hypothesis of optimal accentuation. Let us now extend the model of cooperative information exchange for the information exchange with natural language and for the reconstruction of incompletely recognised messages.
ingly, is bad communicative style. The speaker expects something that he might in principle expect from the recipient, but that he should not expect from him. In game-theoretic terms, one could account for this style requirement in such a way that the accommodation of surprising and controversial assumptions is especially costly.
4 Reconstruction of Messages Following the model developed here, information exchange through natural language functions as follows: a speaker sends a message by uttering a word or a sequence of words. The message denotes either a proposition or a question, regardless of whether the words form a complete sentence. A recipient recognises some of the uttered words; he might even recognise all of them. On the basis of which words he recognises, and on the basis of his knowledge of the discourse context, he tries to reconstruct the entire message as a sentence of QL. For this, he translates the recognised expressions to QL and, eventually, applies context-sensitive or context-insensitive operations of semantic enrichment. It may happen that by doing so the recipient can reconstruct several non-equivalent messages. Each of these messages can be evaluated on its contextual adequacy. The recipient possesses a representation of the discourse participants’ common ground. (That is, he makes assumptions about the common ground; these assumptions can eventually be false.) The criteria of adequacy describe how a reconstructed message should relate to this common ground representation. If a reconstructed message is considered inadequate, the recipient can either reject the reconstructed message (and possibly choose a different reconstruction), or he can accommodate his representation of the common ground, thereby “making” the message adequate. Both the possibility of reconstructing and the possibility of accommodating are restricted, so that a given utterance cannot be interpreted in any arbitrary way. Ultimately, a recipient can update his common ground representation with the reconstructed message. The maxim of manner for cooperative information exchange requires that a speaker expresses himself clearly and intelligibly. An utterance is clear and intelligible if the speaker’s intended message can be reconstructed easily and selected as the message that the speaker most probably intended. The speaker must make sure that the recipient can recognise at least a set of i-critical words that are sufficient for understanding the entire utterance. According to the hypothesis of optimal accentuation, the speaker must accentuate these words. Which words these are depends on the particular discourse context, and especially on the question under discussion. Clarity furthermore requires that an utterance can only be reconstructed in one way, or – when several, non-equivalent reconstructions 83
84
Accentuation and Interpretation
are possible – that one single reconstruction is preferable, that is, that the recipient can reject all other reconstructions as less plausible. Which words in an utterance are optimally to be accentuated must be determined in relation to the discourse context. The recipient of an utterance can – under the assumption that the utterance was accentuated optimally – accommodate his representation of the context on the basis of the accentuation. That is, accentuation can be informative by presupposing a certain discourse context. The recipient of an utterance may be faced with several different context configurations for interpretation. First, the reconstruction depends on the words that the recipient has recognised: has he recognised all words, or just a few? If he has not recognised all words, did he at least recognise the accentuated words, – i.e. in the case of optimal accentuation, the icritical words? Secondly, reconstruction and interpretation are influenced by the question whether the recipient can determine which of the words he recognised are accentuated and which are not. Stress patterns are not necessarily recognised. However, if a stress pattern is recognised it can have an informative effect and therefore aid understanding.1 Thirdly, reception and interpretation depend on the knowledge of the utterance context. It is especially helpful if the recipient knows the question that the speaker presupposes to be the question under discussion. The recipient then knows the communicative goal of the utterance and can restrict the set of possible reconstructions accordingly. The correct reconstruction and interpretation of a message is not possible under all circumstances; in fact, misunderstandings and lack of understanding do occur. This chapter is organised as follows: in section 1, I define a small fragment of English together with a compositional semantics that determines how expressions of the fragment are represented by expressions of the formal language QL. In sections 2 and 3, I define operations for semantic enrichment and for the completion of expressions. The operations enable a recipient to interpret sentences as complete sentences, even if they were uttered or recognised only incompletely. Finally, in section 4, I illustrate the context configurations for the reception of utterances; I show how even under unfavourable conditions, comprehension is possible. 1 A recipient can recognise words without noticing whether they were accentuated. The difficulty in determining stress should be greater if the recipient’s command of the language of communication is smaller; it should therefore be more difficult in foreign language communication than in native-language communication. In automatic speech recognition word and stress recognition are problems that are dealt with separately – if stress is recognised at all (“[. . . ] intonation is rarely used in ASR” O’Shaughnessy (2000), 432).
Reconstruction of Messages
85
1 A truly simple fragment of English First I define a small fragment of a natural language. This fragment, – I call it L – should contain interrogative and declarative sentences. With these sentences, it should be possible to investigate several interesting phenomena of accentuation and interpretation; otherwise, L should remain as simple as possible. I define L as a fragment of English (rather than e.g. German), so that I am not burdened by having to make sure that case, number and gender agreement is correct. Example sentences of L are “Which square is in the left field?” and “Every black square is in the left field”. For the sake of simplicity, I forgo the use of plural noun and verb phrases, i.e. sentences like “Which squares are in the left field?”. Let L be the set of expressions defined by the following lexicon and grammar. Most abbreviations for phrase types and word classes (np, vp, . . .) correspond to the abbreviations commonly used in phrase structure grammars. The abbreviation ds stands for declarative sentence, is for interrogative sentence; ip stands for interrogative pronoun and aip for adjectival interrogative pronoun: 1. Lexicon: (a) ip → “what” (b) aip → “which” (c) pn → “O1”, . . . , “O9” (d) det → “some”, “every” (e) n → “circle”, “square”, “triangle” (f) adj → “black”, “grey”, “white” 2. Grammar: (a) nom → n (b) nom → adj nom (c) np → pn (d) np → det nom (e) whp → ip (f) whp → aip nom (g) ds → np vp (h) is → whp vp
86
Accentuation and Interpretation
For the reconstruction and interpretation of L-expressions I define a compositional semantics. Every expression of L is assigned at least one semantic representation. If an expression consists of several partial expressions, its semantic representation is composed of the representations of the parts. I represent the meaning of L-expressions by expressions of QL: Definition 4-1 (The Language L: Syntax and semantics) Let the following part of speech types be given: adj, aip, ds, det, ip, is, n, nom, np, pn, vp and whp. The function t1 assigns to each of the types a set of pairs consisting of a natural-language expression and a type-logical meaning representation. Let the expression A · B be the concatenation of two natural-language expression A and B (in that order). Let the meaning representations of natural-language expressions be expressions of QL. The following holds: (a) “what”, λX [object( x ) ∧ X ( x )] ∈ t1 (ip) (b) “which”, λX1 λX2 [ X1 ( x ) ∧ X2 ( x )] ∈ t1 (aip) (c) “O1”, λZλX [ Z (o1) ∧ X (o1)] ∈ t1 (pn) ... “O9”, λZλX [ Z (o9) ∧ X (o9)] ∈ t1 (pn) (d) “some”, λX1 λZλX2 [∃ x [ Z ( x ) ∧ X1 ( x ) ∧ X2 ( x )]] ∈ t1 (det) (e) “every”, λX1 λZλX2 [∀ x [ Z ( x ) ∧ X1 ( x ) → X2 ( x )]] ∈ t1 (det) (f) “circle”, λx [circle( x )] ∈ t1 (n) ... (g) “black”, λx [black( x )] ∈ t1 (adj) ... (h) “is in the left field”, λx [ left( x )] ∈ t1 (vp) (i) A, P ∈ t1 (nom)
⇐= A, P ∈ t1 (n)
(j) A · B, λx [ P1 ( x ) ∧ P2 ( x )] ∈ t1 (nom) ⇐= A, P1 ∈ t1 (adj) ∧ B, P2 ∈ t1 (nom) (k) A, P ∈ t1 (np)
⇐= A, P ∈ t1 (pn)
(l) A · B, Q( P) ∈ t1 (np) (m) A, P ∈ t1 (whp)
⇐= A, Q ∈ t1 (det) ∧ B, P ∈ t1 (nom)
⇐= A, P ∈ t1 (ip)
(n) A · B, Q( P) ∈ t1 (whp)
⇐= A, Q ∈ t1 (aip) ∧ B, P ∈ t1 (nom)
Reconstruction of Messages
87
t1 (is) which triangle is in the left field, ?[triangle( x ) ∧ left( x )]
t1 (whp) which triangle, λX [triangle( x ) ∧ X ( x )]
S
S
P
PPP P
t1 (aip) which, λX1 λX2 [ X1 ( x ) ∧ X2 ( x )]
S
S
t1 (nom) triangle, λx [triangle( x )] t1 (n) triangle, λx [triangle( x )]
S
S
SS
t1 (vp) is in the left field, λx [ left(x)]
Figure 4.1 A kind of syntax tree for example (1-a)
(o) A · B, ![( Q(λx []))( P)] ∈ t1 (ds) ⇐= A, Q ∈ t1 (np) ∧ B, P ∈ t1 (vp) (p) A · B, ?[ Q( P)] ∈ t1 (ds) ⇐= A, Q ∈ t1 (whp) ∧ B, P ∈ t1 (vp) The language L is the set of expressions A for which holds: A, α ∈ t1 (K ), for arbitrary QL-expressions α and arbitrary phrase or word types K. The set of expressions α for which it holds that A, α ∈ t1 (K ) – for all A ∈ L and arbitrary phrase or word types K – is the set of all formal meaning representations of L. Examples (1-a) and (1-b) show two sentences of L together with their QL-representations. Figure 4.1 visualises the step-by-step construction of the first sentence (1-a) together with a formal representation of its meaning. (1)
a.
“Which triangle is in the left field?”, ?[triangle( x ) ∧ left( x )] ∈ t1 (is)
b.
“Some white triangle is in the left field”, ![∃ x [ ∧ triangle( x ) ∧ white( X ) ∧ left( x )]] ∈ t1 (ds)
The QL-representation of example (1-b) requires explanation: normally noun phrases are interpreted as generalised quantifiers,2 i.e. as expressions of the type e, t, t. That is, “some triangle” should be represented with the term λX [∃ x [triangle( x ) ∧ X ( x )]]. I represent noun phrases as terms of type e, t, e, t, t: the formal analogue of “some triangle” then becomes λZλX [∃ x [ Z ( x ) ∧ triangle( x ) ∧ X ( x )]]; the variable Z here serves as 2
Cf. Barwise and Cooper (1981).
88
Accentuation and Interpretation
place holder for a predicate for domain restriction. – Why do I not proceed in the standard way? (2)
The English love to write letters. Most children have several pen pals in many countries.
Westerst˚ahl (1985) shows on the basis of examples such as (2) that the domains of generalised quantifiers can be restricted and need not necessarily contain the entire universe of discourse. The domain of the generalised quantifier “the English” in the first sentence of example (2) is unrestricted; it contains the entire universe of discourse. That is, “the English” denotes all English. Things are different with “most children” in the second sentence. The domain of this quantifier contains only a subset of the universe of discourse; the quantifier denotes the majority of English children, not the majority of all children (irrespective of nationality). If “most children” were translated to QL and if the corresponding term would contain a variable for domain restriction, this variable would have to be appropriately instantiated with the predicate λx [english( x )]. The domain of “several pen pals” does not contain only English; the domain restriction that applies to “most children” does not apply here. The variable Z in the translation of “some triangle” – λZλX [ ∃ x [ Z ( x ) ∧ triangle( x ) ∧ X ( x )]] – can be instantiated with a predicate that denotes a domain restriction. If the declarative sentence “Some triangle is in the left field” is translated to QL, this variable is replaced with the neutral domain restriction λx []. (Cf. definition 4-1 (o).) The predicate λx [] denotes for each index the set of all entities, i.e. the entire universe of discourse; it does not serve to effectively restrict a quantificational domain. If the variable for domain restriction would always have to be replaced with λx [], it would not be needed; it would have no semantic effect and would needlessly complicate the representations of sentences of L. Example (3) however, shows that the restriction with λx [] is not always sufficient: (3)
Which white object is in the left field? – Some triangle is in the left field.
If we reconstruct the answer sentence in the manner described above with the domain restriction λx [], we obtain the following QL sentence: ![∃ x [ ∧ triangle( x ) ∧ left( x )]] This is not the correct meaning representation. As an answer to the question under discussion, the sentence “Some triangle is in the left field” does not just mean that any triangle is in the left field; rather, it has the stronger
Reconstruction of Messages
89
meaning that some white triangle is in the left field. The domain of “some triangle” must be restricted to white objects: ![∃ x [white( x ) ∧ triangle( x ) ∧ left( x )]] In order to generate this reconstruction – the correct one – the variable for domain restriction must be instantiated with the predicate λx [white( x )] instead of λx []. I define such a replacement in section 2. The message that is conveyed by uttering a sentence can be reconstructed if a QL-representation can be determined by means of the function t1 . Let D be a declarative sentence of L, and let I be an interrogative sentence of L. The possible reconstructions of messages conveyed by D or I are QLsentences !φ with D, !φ ∈ t1 (ds), and QL-sentences ?ψ with I, ?ψ ∈ t1 (is), respectively. With the criteria defined in Chapter 3 the contextual adequacy can be tested. Let us tentatively assume that an uttered sequence of words can be reconstructed in various ways and that more than one reconstruction is adequate given the discourse context. (The language L has not been accommodated for such a case yet.) An update-function maps a given information state and an uttered expression unambiguously onto a new information state. For an utterance that can be reconstructed in different ways, it is not appropriate to define an update-function but rather a set of update-rules or -constraints, because the given information state may be changed in different ways by the different reconstructions. Let A1 · . . . · An be a sequence of words recognised by a recipient. If the word sequence can be reconstructed as a sentence φ of QL through t1 , if φ is adequate with respect to σ – in the sense defined in Chapter 3 – (adequate(σ, φ)) and if the classical update σ [φ]2M brings about the information state σ , then σ is a possible information state updated by the utterance of A1 · . . . · An : σ ∈ update(σ, A1 · . . . · An )
⇐=
A1 · . . . · An , φ ∈ t1 (K ) ∧ K ∈ {ds, is} ∧ adequate(σ, φ) ∧ σ = σ[φ]2M As an example, let us reconstruct and interpret two sentences of L: let a model M = D M , I M , i∗ , [[ ]] M for the language QL be given. Let the domain D M contain the nine objects O1, . . ., O9. These objects have different colour and form properties – they are black, grey or white, and triangular, round or square – and furthermore they are spread over two fields – a left one and a right one. Let every possible configuration correspond to an in-
90
Accentuation and Interpretation
Figure 4.2 Toy world i∗
dex i ∈ I M ; let the configuration depicted in Figure 4.2 correspond to the index i∗ , the real world. Let the information state σ be a state of disinterest, i.e. there is no question under discussion. Let it represent complete knowledge about the colours and forms of the objects O1,. . ., O9 but no knowledge about their arrangement. With nine objects and two possible locations in which the objects can be located, there are 29 = 512 possible arrangements; let each of these arrangements correspond to an index pair i, i ∈ σ.3 (4)
a.
p: Which triangle is in the left field? p QL : ?[triangle( x ) ∧ left( x )] p, pQL ] ∈ t1 (is)
b.
q: Some white triangle is in the left field. qQL : ![∃ x [triangle( x ) ∧ white( x ) ∧ left( x )]] q, qQL ∈ t1 (ds)
The information state σ is to be updated by utterances of the example sentences (4-a) (p) and (4-b) (q). The function t1 together with the sentences yields the QL-representations p QL und qQL . By the utterance of p, σ is partitioned. The result – σ [ p QL ]2M – is the partition illustrated in Figure 4.3. Now, the question which triangle is in the left field is under discussion. 3 Because the information state σ is a state of disinterest, σ contains a total of 5122 index pairs. The set of index pairs increases rapidly if it represents less knowledge about the objects O1, . . ., O9 – e.g. no knowledge about the colour properties. The set of index pairs also increases (exponentially, in fact) with the set of available objects. Bos and Gabsdil (2000) call this the “combinatorial explosion problem”.
91
O3, O8 and O9 are in the left field.
Only O8 and O9 are in the left field.
Only O3 and O9 are in the left field.
Only O3 and O8 are in the left field.
Only O9 is in the left field.
Only O8 is in the left field.
Only O3 is in the left field.
No triangle is in the left field.
Reconstruction of Messages
Figure 4.3 Partitioning by “Which triangle is in the left field?”
Sentence q denotes a partial answer to this question: all partition sets according to which O3 is not in the left field are filtered out; therefore, the proposition denoted by q gives a partial answer. Apart from the complete partition sets, no further index pairs are filtered out; that is, the proposition also is a partial answer. Accordingly, uttering q satisfies the adequacy criteria defined in Chapter 3. The sentence is compatible with a situation in which apart from O3 the other two triangles – O8 and O9 – are in the left field as well, and it is compatible with a situation in which both these triangles are not in the left field. The sentence therefore does not denote an exhaustive answer. Uttering q would not satisfy the conversational maxims, and would therefore be inadequate, if it were not known which colours the triangles have. The sentence would be over-informative because apart from the complete partition set according to which no triangle is in the left field, also all index pairs according to which none of the triangles in the left field is white, would be filtered out. These pairs do not form complete partition sets. The adequacy of q (4-b) not only depends on the common thematic knowledge (knowledge about a question) but also on the common propositional knowledge (factive knowledge). Summarising: the language L does not contain ambiguous sentences; therefore the function t1 assigns exactly one sentence of QL to every sentence of L. A recipient that only has function t1 at his disposal for reconstructing a message, can only interpret complete sentences of L, and can only assign exactly one QL-representation to every recognised sentence. He has no leeway when reconstructing messages. I will change this in the next section, when I extend the possibilities for reconstruction.
92
Accentuation and Interpretation
2 Interpretation by means of QUADs Let us now turn to the interpretation of word sequences that do not form complete sentences: in subsection 2.1, I extend representations of the common ground – which are so far just sets of index pairs – with lists of QUADs (question abstract domain pairs) and I specify how QUADs can be used to expand representations of constituent answers into complete messages. In subsection 2.2, I show that a message that has been reconstructed with a QUAD sometimes denotes only a partial answer, but may be semantically strengthened and interpreted as an exhaustive answer. In subsection 2.3, I define rules for the type-shifting of expressions. After that, I show in subsection 2.4 which words in a sentence must be recognised under which conditions in order for the sentence to be understood properly, and I define corresponding requirements for optimal accentuation. In subsection 2.5, I explain how utterance contexts can be presupposed through accentuation and how this can evoke conversational implicatures. I summarise the results in section 2.6. 2.1
Completion with a QUAD
In example (5) it suffices to utter the name “O3” in order to (partially) answer the question what is in the left field: (5)
What is in the left field? – O3.
Groenendijk and Stokhof (1984) describe the interpretation as follows: in course of the interpretation of “What is in the left field?” the question abstract λx [object( x ) ∧ left( x )] is formed. The abstract can be used for the completion of the utterance “O3”. “O3” is translated to λX [ X (o3)].4 This term is then functionally applied to the question abstract; the result is the sentence object(o3) ∧ left(o3), which denotes an adequate, partial answer. (The transformation of this sentence into a sentence of QL is a purely formal modification: ![object(o3) ∧ left(o3)].) The idea of using a question abstract or a variant of such an abstract for the reconstruction of an answer is not new. Question abstracts are used in many models for the interpretation of constituent answers, especially in categorial question semantics (e.g. Hausser (1983)) and in structured meaning theory (e.g. Krifka (2001), Stechow (1989a)).5 4 For the moment, I leave out the variable for domain restriction. Groenendijk and Stokhof (1984) do not use such a variable, though it can be introduced without problems. 5 An overview can be obtained by reading Ginzburg (1996b), Groenendijk and Stokhof (1997) or Higginbotham (1996). A critical introduction is provided by (contd)
Reconstruction of Messages
93
In the model defined so far, there is neither a question abstract available nor an interrogative sentence from which such an abstract can be construed. How then can we access an abstract? Attempt 1 (a): by uttering the interrogative sentence in example (5) an information state is partitioned. The recipient has this partitioned information state available to him. Can the abstract that corresponds to the interrogative sentence be derived afterwards from the partitioned information state? Unfortunately, the answer is “no”. The correct abstract cannot be derived from the partitioned information state. (6)
What is not in the left field? – O3.
With respect to the model defined so far the interrogative sentences (5) (“What is in the left field?”) and (6) (“What is not in the left field?”) are equivalent; by uttering interrogative sentence (5) a given information state is partitioned in the same way as by uttering the interrogative sentence (6). The information state therefore does not offer any clue whether the abstract λx [object( x ) ∧ left( x )] or the abstract λx [object( x ) ∧ ¬ left( x )] must be derived. It is, however, immediately obvious that such a clue is required; after all, the choice of abstract has a direct influence on the interpretation of the constituent answer “O3”. Attempt 1 (b): let us assume that exactly the two non-equivalent abstracts mentioned can be derived from the given information state, and therefore that the constituent answers can be reconstructed in two ways. The reconstructions can be tested for their adequacy with respect to the given information state. Is it possible to recognise the incorrect reconstruction as inadequate? No, the proposition that O3 is in the left field, and the proposition that O3 is not in the left field are both adequate, partial answers to both questions. The incorrect reconstruction cannot be recognised as the inadequate reconstruction by the recipient without access to a question abstract. A question abstract can be transformed into an interrogative sentence of QL by a simple syntactic modification. The interrogative sentence can be used to partition a given information state. The other way around, however, it is not possible to unambiguously reconstruct an interrogative sentence or a corresponding question abstract from a partitioned information state. Attempt 2: all this means that a question abstract must be generated when an interrogative sentence is received and that it must be stored for the reception of any subsequent utterances. A solution that easily suggests the first paper in Groenendijk and Stokhof (1984).
94
Accentuation and Interpretation
itself is to expand the representations of the common ground accordingly: such common ground representations can be modelled as pairs consisting of a set of index pairs (representing, as before, the mutual thematic and propositional knowledge) and a list of question abstracts. By uttering an interrogative sentence the set of index pairs is filtered and a new element is added to the list of abstracts. Uttering a declarative sentence also filters the set of index pairs, and additionally, all abstracts whose corresponding questions are answered are deleted. The filtering of index pairs is done as before; only the modification of a list of abstracts is new. Attempt 2 promises to be successful. According to this attempt information states are redundant – when a question abstract is taken up, the thematic knowledge no longer needs to be represented with a partition, it can be represented by the abstract – but this redundancy is not harmful. On the contrary, representing the thematic knowledge by a partition allows me to retain the semantic relations of declarative and interrogative sentences and the adequacy requirements defined in Chapter 3 unchanged. (Furthermore, in section 3, I will make a proposal for reconstructing messages without using question abstracts. For that, I will need the knowledge representation that I defined in Chapter 3.) Before attempt 2 is implemented, it must be seen whether question abstracts along the lines of Groenendijk and Stokhof (1984) are really always suitable for the interpretation of constituent answers – we will see shortly that they are not always suitable and that therefore a different structure is needed. Construing a question abstract along the lines of Groenendijk and Stokhof (1984) is easy to do: if a simple wh-question as in example (5) is given, it is so far translated into a QL-sentence of the form ?[ψ], where ψ is a formula with a single free variable. Let ξ be this variable. When ξ is bound in ψ by a λ-operator, λξ [ψ] is obtained; this is the desired question abstract. For the interrogative sentence from example (5) (“What is in the left field?”) the abstract λx [object( x ) ∧ left( x )] is formed, and for the interrogative sentence of the following example (7) the abstract λx [black( x ) ∧ object( x ) ∧ left( x )] is formed:6 (7)
Which black object is in the left field? – Every square.
6 The question abstract of a multiple constituent question ?ψ (e.g. of ,,Who gives what to whom?”) would be an expression λζ [ψ], where ζ is a list of the free variables in ψ (e.g. λxyz[ give( x, y, z]). For a polar question ?φ (e.g. “Is it raining?”) the sentence φ can be used as a question abstract. To this sentence, expressions of the type t, t can be applied. That is, possible constituent answers are “yes” (λX t [ X t ]), “no” (λX t [¬ X t ]), “possibly” (λX t [♦X t ]) etc.
Reconstruction of Messages
95
Let us again take the world depicted in Figure 4.2 as a basis. The constituent answer of example (7) is true in this world, and in relation to the given question uttering it is adequate. By way of experiment, let us assume that for the interpretation of “Every square”, the abstract for the interrogative sentence “Which black object is in the left field?” is available. “Every square” is translated as: λZλX [∀ x [ Z ( x ) ∧ square( x ) → X ( x )]] In this term, the variable Z for the domain restriction can be instantiated with the predicate λx []; the result of this replacement can be rewritten as: λX [∀ x [ ∧ square( x ) → X ( x )]] ⇔ λX [∀ x [square( x ) → X ( x )]] This term is now applied to the question abstract λx [black ( x ) ∧ object( x ) ∧ left( x )]. I place an exclamation point before the resulting expression and obtain a declarative sentence of QL: ![∀ x [square( x ) → black( x ) ∧ object( x ) ∧ left( x )]] The resulting QL-sentence does not denote the intended answer to the question under discussion. It means that all squares – O1, O2, O4 and O5 – are black and are located in the left field. Given the world depicted in Figure 4.2, this sentence is false; there are a white and a grey square, which are both in the right field. The person asking the question did not want to know anything about the white and the grey square; nonetheless he is informed about these squares – incorrectly. If the constituent answer “every square” is interpreted in the sense of the QL-sentence that was derived, it does not denote a correct and adequate answer – contrary to what it should. The correct reconstruction of the answer is the following: !∀ x [black( x ) ∧ object( x ) ∧ square( x ) → left( x )] Let us take another look at the interrogative sentence in example (7): (8)
Which black object is in the left field?
The formal equivalent of the aip-complement “black object” is the predicate λx [black( x ) ∧ object( x )]; in the correct reconstruction of the answer, this predicate serves to restrict the domain of quantification. The formal equivalent of the vp “is in the left field” is the predicate λx [ left( x )]; in the correct reconstruction, this predicate determines the scope of quantification. That is, the two predicates play different roles in the reconstruction
Accentuation and Interpretation
96
of the answer sentence; it must be possible to access them independently. Instead of a question abstract along the lines of Groenendijk and Stokhof (1984) – λx [black( x ) ∧ object( x ) ∧ left( x )] – a pair consisting of a reduced abstract, to which the constituent answer can be applied – λx [ left( x )] – and of a predicate for domain restriction – λx [black( x ) ∧ object( x )] – is needed. I call such a pair a QUAD (question abstract domain pair, or, more precisely, question abstract domain restrictor pair):
λx [ left( x )], λx [black( x ) ∧ object( x )] With this QUAD, the answer in example (7) is reconstructed as follows: the function t1 provides a representation of the generalised quantifier “every square”. The variable for domain restriction contained in this representation is instantiated by functionally applying the quantifier to the second element of the QUAD (the predicate for domain restriction): λZλX [∀ x [ Z ( x ) ∧ square( x ) → X ( x )]](λx [black( x ) ∧ object( x )])
⇔ λX [∀ x [black( x ) ∧ object( x ) ∧ square( x ) → X ( x )]] The resulting term can now be applied to the reduced abstract (the first element of the QUAD); the result of this application is the desired QLsentence: ![λX [∀ x [black ( x ) ∧ object( x ) ∧ square( x ) → X ( x )]](λx [ left( x )])]
⇔
![∀ x [black( x ) ∧ object( x ) ∧ square( x ) → left( x )]]
Let me take stock. In a question abstract along the lines of Groenendijk and Stokhof (1984), predicates that have different roles in the reconstruction of a question are not separated but combined into a single lambda term. Such an abstract is not always suitable for the reconstruction of an answer. I therefore replace it with a QUAD in which predicates that play different roles in the reconstruction of an answer are separated from each other. Objection: is the transformation of a question abstract into a QUAD just an ad hoc solution to the problem illustrated above? – Reply: rather not. The problem that the transformation of the question abstract is supposed to solve not only occurs in our model, but was noted earlier already (cf., for example, Higginbotham (1996)): (9)
a. b.
Which bachelors are men? Which men are bachelors?
Reconstruction of Messages
97
According to the question semantics of Groenendijk and Stokhof (1984), the interrogative sentences (9-a) und (9-b) are identical in meaning. Both sentences denote the same question; this question is answered exhaustively when all bachelors – which are necessarily also men – are named. The interrogative sentences therefore denote the same set of exhaustive answers, i.e. the same partition. Now, it is possible to reply to both sentences by uttering the determiner “all”. If “all” is uttered as a reply to the first question (9-a), it must be expanded to a QL-representation of “All bachelors are men”. It is consequently interpreted as a true answer: ![∀ x [bachelor ( x ) → man( x )]. If “all” is uttered as a reply to the second question (9-b), it must be expanded to a representation of “All men are bachelors”. This however denotes a false answer: ![∀ x [man( x ) → bachelor ( x )]. Groenendijk and Stokhof (1984) generate the same abstract for both interrogative questions (9-a) and (9-b). The abstracts therefore provide no information on how the constituent answer is to be reconstructed. Given the examples (9-a) and (9-b), assigning the representations of the aip-complement and the vp different roles in the reconstruction of an answer sentence is a plausible move, certainly not just a trick. It is not surprising that the proposal to represent contextually given questions by QUADs or by QUAD-like structures has already been implemented in other models: QUADs are strongly reminiscent of the representations of questions that a theory of structured meaning provides (cf. Krifka (2001), Stechow (1989a)). Why is the QUAD solution then only rather not an ad hoc solution? – One may regard the use of a question abstract or a QUAD fundamentally implausible. In section 3, I discuss how incompletely recognised utterances can be reconstructed without the use of QUADs. Let us model interpretations by means of QUADs. First, I define what a QUAD is and how it is generated. Then I specify tentatively how a recipient can use a QUAD for the interpretation of an incomplete answer sentence. In order for QUADs to be usable, they must be stored in the available representation of the common ground. I expand the common ground representations in such a way that they also contain QUADs, and I specify how expanded common ground representations can be updated: Definition 4-2 (QUADs of L) Let the language L and the translation function t1 be given. Let the function t2 assign to each of the part of speech types is, whp, ip and aip a set of pairs A, α, β consisting of a natural-language expression A and a pair of QL-expressions α and β. Let the expression A · B be the concatenation of two natural-language expressions A and B (in that order). The following holds:
Accentuation and Interpretation
98
(a) “what”, λXλx [ X ( x )], λx [object( x )] ∈ t2 (ip) (b) “which”, λXλx [ X ( x )], λXλx [ X ( x )] ∈ t2 (aip) (c) A, P ∈ t2 (whp) ⇐= A, P ∈ t2 (ip) (d) A · B, P1 , Q( P2 ) ∈ t2 (whp) ⇐= A, P1 , Q ∈ t2 (aip) ∧ B, P2 ∈ t1 (nom) (e) A · B, Q( P1 ), P2 ∈ t2 (fs) ⇐= A, Q, P2 ∈ t2 (whp) ∧ B, P1 ∈ t1 (vp) For all A, α, β ∈ t2 (is), α, β is a QUAD of L. Furthermore is a QUAD of L, called the empty QUAD. Nothing else is a QUAD of L. The domain of the function t2 is a proper subset of the domain of the function t1 defined above in the context of L. By means of t2 a QUAD can be construed for every interrogative sentence of L. Apart from the QUADs for interrogative sentences, is a QUAD as well. This empty QUAD serves a purely functional purpose that will be explained below.7 Let us now assume that a QUAD ατ , βτ is available for the reconstruction of a message. The recipient has recognised from an utterance the words A1 · . . . · An , in that order. He translates A1 · . . . · An by means of t1 into the term ψτ τ,t . The recipient applies this term first to the domain restriction β and then to the abstract α; he then places an exclamation point or a question mark before this sentence, generating a declarative or interrogative sentence φ of QL: φ ∈ reconstruct(α, β, A1 · . . . · An )
⇐=
A1 · . . . · An , ψ ∈ t1 (K ) ∧ ( φ = ![(ψ( β))(α)] ∨ φ = ?[(ψ( β))(α)] ) (10)
Who is missing? – Andrew. / Andrew?
Whether a message is to be reconstructed as a declarative or an interrogative sentence is marked by the speaker through intonation. The utterance of “Andrew” in example (10) can – depending on the intonation – be reconstructed as a declarative sentence in the sense of “Andrew is missing”, or as an interrogative sentence in the sense of “Is Andrew missing?”. Both reconstructions are adequate given the background question: the declarative sentence answers the question at least partially; the interrogative sentence guides the answering. 7
For QUADs of multiple-constituent and polar questions, see footnote 13.
Reconstruction of Messages
99
That is, the function reconstruct provides a set of possible, complete reconstructions for a QUAD and a sequence of words, i.e. it provides a set of QL-sentences. If the word sequence that the recipient has recognised can already be translated into a QL-sentence by means of t1 , then the speaker’s message can be reconstructed without taking recourse to a QUAD: φ ∈ reconstruct(quad, A1 · . . . · An )
⇐=
A1 · . . . · An , φ ∈ t1 (K ) ∧ K ∈ {ds, is} How can a recipient access a QUAD? – The common ground representations of the discourse participants must be expanded with lists of QUADs so that recipients have QUADS available for interpretation: Definition 4-3 (Information State with QUADs) Let M = D M , I M , [[ ]] M be a (reduced) model of QL. (a) For all QL-information states σ and for all QUAD-lists quads: ς is an information state with QUADs w.r.t. M and L iff: ς = σ, quads Σ is the set of all information states ς. (b) For all information states ς with QUADs the following holds: – ς = σ, quads is a state of total ignorance, i.e. a minimal information state, iff σ ⊇ {i, i | i ∈ I M }. – ς = σ, quads is an absurd information state iff σ = ∅. – ς = σ, is a state of disinterest iff it holds for all i, i and i , i ∈ σ that i, i ∈ σ. (c) – 0 = I M × I M , is the minimal information state of total disinterest. – 1 uniformly represents any arbitrary absurd information state. An information state σ, quads can be updated as result of an utterance. The update applies both to the set of index-pairs σ and the QUAD-list quads. By uttering an interrogative sentence, a new QUAD can be added to the list quads: Definition 4-4 (Updates with Interrogative Sentences) Let an information state σ, quads and a sequence of words A1 · . . . · An be given. Let each of the words A1 , . . . , An be a word of L. An information state σ , quads is a possible result of updating σ, quads with A1 · . . . · An if:
100
Accentuation and Interpretation
(a) ?ψ is a possible reconstruction of the message conveyed by A1 · . . . · An , and ?ψ satisfies the adequacy criteria for asking questions with respect to σ, (b) σ is the result of updating σ with ?ψ, (c) adding a QUAD χ to the list quads by applying the function addq yields a new list of QUADs quads . If A1 · . . . · An is a complete interrogative sentence of L, A1 · . . . · An , χ is an element of t2 (is). The QUAD χ is to be the first element of quads ; it corresponds with the reconstructed question ?ψ.
σ , quads ∈ update(σ, quads, A1 · . . . · An )
⇐=
quad ∈ quads ∧ ?ψ ∈ reconstruct(quad, A1 · . . . · An ) ∧ adequate(σ, ?ψ) ∧
σ =
σ[?ψ]2M
∧ quads = addq(quads, A1 · . . . · An , ?ψ)
I leave addq undefined. The definition of addq is rather simple when lists – especially the lists quads and quads – are defined as in the Prolog programming language.8 Reconstruct takes a QUAD and a sequence of words as arguments. If a given information state represents a state of disinterest, it contains only the empty QUAD. The empty QUAD cannot support the reconstruction of a message; in order for the recipient to be able to construct a QL-sentence, the sequence A1 · . . . · An must form a complete sentence. (I defined the empty QUAD above so that I do not additionally need to introduce a one-place reconstruction function.) Updating a given information state σ, quads with a declarative sentence takes place analogously to updating with an interrogative sentence, although, when an assertion answers a question under discussion exhaustively, the corresponding QUADs are filtered out of quads: Definition 4-5 (Updates with Declarative Sentences) Let an information state σ, quads and a sequence of words A1 · . . . · An be given. Let each of the words A1 , . . . , An be a word of L. An information state σ , quads is a possible result of updating σ, quads with A1 · . . . · An if: 8 (A) For Prolog, see Sterling and Shapiro (1994) – (B) It may happen that with t2 different question abstracts can be generated for the sequence of words A1 · . . . · An (e.g. when one of the words A1 , . . . , An is ambiguous). In order to be able to determine the correct abstract in this case – i.e. the abstract that corresponds to the reconstruction ?ψ – it is useful to combine t1 and t2 , so that the construction of the QL-interrogative sentence can be tuned to the abstract. – (C) The definition of addq is not straightforward for word sequences A1 · . . . · An that do not form complete interrogative sentences.
Reconstruction of Messages 101
(a) !φ is a possible reconstruction of the message conveyed by A1 · . . . · An , and !φ satisfies the adequacy criteria for assertions with respect to σ, (b) σ is the result of updating σ with !φ, (c) with the function delq all QUADs α, β are deleted from the list quad for which it holds that ?[α( x ) ∧ β( x )] is not a question under discussion in σ ; the result of this filtering operation is the list quads .
σ , quads ∈ update(σ, quads, A1 · . . . · An )
⇐=
quad ∈ quads ∧ !φ ∈ reconstruct(quad, A1 · . . . · An ) ∧ adequate(σ, !φ) ∧ σ = σ[?φ]2M ∧ quads = delq(quads, σ )
I leave delq undefined; the definition is rather simple when lists are defined as in Prolog. The same sequence of words can sometimes be reconstructed in different ways; more than one reconstruction can be adequate in the given context; this means that there may be various ways to modify the common ground representations. Therefore not an update function for word sequences and information states, but only rules for updating can be defined. A system based on such rules is not a classical update system, because in the course of updating an information state, it is not only possible to filter out elements, but also to add new ones – namely QUADs. In Chapter 3, I established that interrogative sentences can be uttered to guide the answering of a given question. With the introduction of QUADs I now possess a means to explain the discourse-structuring function of an interrogative sentence that is ‘already known’. Even if uttering an interrogative sentence does not put a new question up for discussion, the utterance can be informative, in that it adds a new QUAD for the reconstruction of subsequent utterances. (11)
Who dances with whom? – Who dances with Tom?
The question denoted by the second interrogative sentence in example (11) is already under discussion when only the first question has been uttered. Nonetheless the utterance is informative because it provides a new QUAD. An information state comprises a list of QUADs and therefore may provide more than one QUAD. This makes it possible that suitable QUADs are available for the reconstruction of both answers in example (12):
102
(12)
Accentuation and Interpretation
Who dances with whom? – Who dances with Tom? – Emma. And Judith with Theo.
The first question act puts a question up for discussion, and a QUAD is added to the representation of the common ground. The second question act does not put a new question up for discussion – the question who dances with Tom is already under discussion after the first question act – but nonetheless a new element is added to the list of QUADs. The new QUAD must be used for the reconstruction of the first partial answer “Emma”; the other QUAD is required for the reconstruction of the second partial answer “Judith with Theo”. I leave the length of the QUAD list unlimited. Recipients certainly cannot refer back to QUADs of questions regardless of how long ago they were asked; how old a question may be for a discourse participant to still be able to refer to its QUAD will differ from person to person. Maximum and average lengths of QUAD lists are to be determined experimentally. To conclude this section, let us look at example (7) – repeated here as (13) – again; the answer is to be interpreted in relation to the toy world depicted in Figure 4.2: (13)
Which black object is in the left field? – EVERY SQUARE is in the left field.
By uttering “Which black object is in the left field?”, a question is put up for discussion and a QUAD is made available. Let us assume that the answer sentence was recognised completely, that it can be translated into QL by means of t1 : ![∀ x [square( x ) → left( x )]]. This QL-sentence does not denote an adequate answer. First, not all squares are in the left field; that is, the QL-sentence is factually false. Secondly, the question in which field all squares are located is not under discussion. Under discussion is the question in which field the black squares are.9 (14)
Which black object is in the left field? – EVERY SQUARE.
If the answer is reconstructed only on the basis of the accentuated words – here written capitalised – one arrives at a true and adequate representation of the answer. The representation means that all black squares are in the left field: ![∀ x [black( x ) ∧ object( x ) ∧ square( x ) → left( x )]]. 9 The question in which field the black objects are was explicitly asked. This question entails the sub-question in which field the black squares are. Therefore, the sub-question is in under discussion as well.
Reconstruction of Messages 103
The fact that only the interpretation of the constituent answer – which here consists merely of the accentuated words – leads to the correct result, need not be an undesirable effect. The constituent answer in example (14) at least appears to be clearer than the complete answer sentence in example (13). It appears more natural and more likely to be uttered than the sentence answer; accordingly, it should be easier to understand. Nonetheless it would be unacceptable if a recipient of the answer sentence in example (13) could only arrive at the correct interpretation if he recognises only the accentuated words. He should be able to understand the utterance even when he recognises more words, possibly even the complete sentence. There are two ways in which this can be made possible. First, one could define a rule that states that QUADs can also be used for the interpretation of complete sentences – specifically for restricting a domain of quantification. Secondly, one could allow a recipient to ignore some of the words that he recognised and let him reconstruct a given utterance solely on the basis of the accentuated words. That is, if the recipient recognises the sentence completely, including the stress pattern, he can arrive at a reconstruction in two ways; regarding the adequacy criteria, he can decide which reconstruction is preferable. The constituent answer in example (14) is then easier to understand because only one reconstruction – in fact, the correct one – is possible, so that it is not necessary to compare several reconstructions with respect to their adequacy.10 10 The term “constituent answer” has not been defined yet. Generally, it denotes a well-formed linguistic expression that is not a complete sentence, but that can nonetheless answer a given question when uttered. The following definition is compatible with this common usage:
Definition 4-6 (Constituent Answer) Given are a language L and the translation function t1 ; given is furthermore an information state σ, quads according to definition 4-3. Let there be a question under discussion and let the QUAD α, β be an element of quads. An expression A of L is a constituent answer with respect to a question under discussion in σ, quads iff: (a)
A, ψ ∈ t1 (K ) for any part of speech type K other than ds or is and
(b)
![(ψ( β))(α)] satisfies the adequacy criteria for assertions specified in Chapter 3.
A constituent answer is a syntactic object, not a semantic object. Since I use the term “question” for semantic objects, it would be more precise to use the phrase “answer term” instead of “constituent answer”. However, in this respect I prefer to conform to common usage. Analogously to the term “constituent answer”, the term “constituent question” can be defined: a constituent question is a linguistic expression that can be converted into a question by means of a QUAD.
104
Accentuation and Interpretation
2.2
Exhaustification
When the constituent answers of the following examples (15-a)–(15-c) are interpreted with QUADs as described in the previous section, each of them denotes a partial answer to the question preceding it: (15)
a. b. c.
Which black object is in the left field? – Every square. Which white object is in the left field? – Some triangle. Which triangle is in the left field? – O3.
The QL-representations created from the given nps and QUADs are compatible with the situation that apart from the black square, the white triangle or the object O3, respectively, there are other black objects, white objects or triangles in the left field. If, on the other hand, the recipient assumes that the speaker possesses enough knowledge to answer the questions under discussion exhaustively, he will interpret the replies in a stronger sense, namely as exhaustive answers:11 the first reply (15-a) as an exhaustive answer means that all black squares are in the left field, and that no other black object is. The second reply (15-b) as an exhaustive answer means that there is a white square in the left field, and no other white objects. The third reply (15-c) as an exhaustive answer is to be understood as meaning that object O3 is the only triangle in the left field. Let us again assume the toy world of Figure 4.2: both the weaker, literal answers as well as the stronger, exhaustive answers are true and adequate in the context of their respective questions. The exhaustive answers are the better answers. When the replies in the examples (15-a)–(15-c) are interpreted literally – i.e. as mere partial answers – first formal representations of the constituents are created by the function t1 , which are then completed by applying them to the domain restriction and the abstract of the given QUAD. In order to interpret the replies stronger – in the sense of exhaustive answers – representations of the constituent answers must be created first as well. These must then be strengthened semantically and subsequently completed into a sentence of QL with the given QUAD. For the semantic strengthening, a function exh is needed: 1. The representation of “every square” (example (15-a)) must be transformed by means of exh into a representation that denotes every square and nothing else in a quantificational domain which is to be determined:
11 I suspect that it is not the exception but rather the rule that recipients presuppose that their interlocutors are competent, and that they therefore generally interpret replies such as in the examples (15-a)–(15-c) as exhaustive answers.
Reconstruction of Messages 105
exh(λZλX [∀ x [ Z ( x ) ∧ square( x ) → X ( x )]]) ⇔ λZλX [∀ x [ Z ( x ) ∧ square( x ) → X ( x )] ∧
∀ x [ Z ( x ) ∧ X ( x ) → square( x )]] This term can be expanded into a representation of an exhaustive answer by means of the QUAD λx [black ( x ) ∧ object( x )], λx [ left( x )]: ![∀ x [black( x ) ∧ object( x ) ∧ square( x ) → left( x )] ∧
∀ x [black( x ) ∧ object( x ) ∧ left( x ) → square( x )]] The QL-representation denotes the exhaustive answer that every black square is located in the left field and that every black object in the left field is a square, i.e. that apart from the black squares, there are no black objects in the left field. 2. The representation of “some triangle” (example (15-b)) is to be understood as denoting one triangle and no other object in a domain still to be determined: exh(λZλX [∃ x [ Z ( x ) ∧ triangle( x ) ∧ X ( x )]]) ⇔ λZλX [∃ x [ Z ( x ) ∧ triangle( x ) ∧ X ( x ) ∧ ∀y[ Z (y) ∧ X (y) → x = y]]] With reference to the QUAD λx [white( x ) ∧ object( x )], λx [ left( x )] for the corresponding, previously uttered interrogative sentence the term can be expanded into a representation of an exhaustive answer: ![∃ x [white( x ) ∧ object( x ) ∧ triangle( x ) ∧ left( x )∧
∀y[white(y) ∧ object(y) ∧ left(y) → x = y]]] According to this reconstruction, there is one white triangle in the left field, and no other white object. 3. Lastly, the representation of the name “O3” (example (15-c)) must be strengthened into a representation in the sense of “only O3”: exh(λZλX [ Z (o3) ∧ X (o3)]]) ⇔ λZλX [ Z (o3) ∧ X (o3) ∧ ∀ x [ Z ( x ) ∧ X ( x ) → x = o3]] Expanding this by means of the QUAD λx [triangle( x )], λx [object( x )] for the corresponding interrogative sentence yields: ![triangle(o3) ∧ left(o3) ∧ ∀ x [triangle( x ) ∧ left( x ) → x = o3]]
106
Accentuation and Interpretation
This QL-sentence means that O3 is a triangle and is located in the left field, and that there is no other triangle in the left field. This sentence also denotes an exhaustive answer to the question under discussion. Each constituent answer of L is a generalised quantifier of the type of one of the constituent answers of the examples (15-a)–(15-c). In order to define the function exh in such a way that it can be used to strengthen any of the constituent answers of L, all I need to do is to replace the non-logical characters in the applications of exh above – i.e. the variables Z, X, x and the constants square, triangle, o3 – with meta-variables (indicated by Greek characters here). This yields three rules of exhaustification; these rules suffice to specify exh for my present purpose:12 Definition 4-7 (Exhaustification) The function exh serves to semantically strengthen generalised quantifiers: exh(λξ 1 λξ 2 [∀ζ [ξ 1 (ζ ) ∧ α(ζ ) → ξ 2 (ζ )]]) ⇔ λξ 1 λξ 2 [∀ζ [ξ 1 (ζ ) ∧ α(ζ ) → ξ 2 (ζ )] ∧ ∀ζ [ξ 1 (ζ ) ∧ ξ 2 (ζ ) → α(ζ )]] exh(λξ 1 λξ 2 [∃ζ 1 [ξ 1 (ζ 1 ) ∧ α(ζ 1 ) ∧ ξ 2 (ζ 1 )]]) ⇔ λξ 1 λξ 2 [∃ζ 1 [ξ 1 (ζ 1 ) ∧ α(ζ 1 ) ∧ ξ 2 (ζ 1 ) ∧ ∀ζ 2 [ξ 1 (ζ 2 ) ∧ ξ 2 (ζ 2 ) → ζ 1 = ζ 2 ]]] exh(λξ 1 λξ 2 [ξ 1 (α) ∧ ξ 2 (α)]]) ⇔ λξ 1 λξ 2 [ξ 1 (α) ∧ ξ 2 (α) ∧ ∀ζ [ξ 1 (ζ ) ∧ ξ 2 (ζ ) → ζ = α]]
A given constituent answer can be translated into QL, the function exh can be applied to the resulting QL-term, and the strengthened term that results can be expanded into a sentence of QL by means of a given QUAD: φ ∈ reconstruct(α, β, A1 · . . . · An )
⇐=
A1 · . . . · An , ψ ∈ t1 (K ) ∧ χ = exh(ψ) ∧ φ = ![(χ( β))(α)] This reconstruction rule makes it possible for the exhaustification function to be applied to the representation of a constituent answer, and for the 12 Groenendijk and Stokhof (1984) (299) define a uniform exh operator for the application to every type of generalised quantifier. They do not use any additional variables along the lines of Westerst˚ahl for the domain restriction, which means that I cannot easily adopt their definition. Adapting it would be possible, but is somewhat complicated formally.
Reconstruction of Messages 107
result of this application to be expanded into a QL-sentence by means of a QUAD. The rule does not allow the option that an entirely recognised sentence is translated into a QL-sentence by means of t1 , and is strengthened only then: a QL-sentence that denotes a partial answer to a given question is compatible with at least two partition sets. In order to strengthen it in such a way that it denotes an exhaustive answer, it would have to be incompatible with all but one partition set. At first sight, there would appear to be at least two possible ways for it to be strengthened; the sentence itself does not provide any clue as to which of these two ways is the correct one. Because of this, the domain of the function exh cannot consist of QL-sentences; exh can only be applied to representations of constituent answers. Objection: it is not immediately obvious that only constituent answers can be interpreted as exhaustive answers. Even when the reply in the following example (16) is completely recognised (i.e. not just the name “O3”), it can be interpreted as an exhaustive answer. (16)
Which triangle is in the left field? – O3 is in the left field.
Reply: a similar objection was already discussed at the end of the last subsection. The exhaustive interpretation of the constituent answer seems to me to be more plausible than the exhaustive answer of the sentence answer. I could allow that the recognised, but not accentuated words are ignored and that the reply is to be reconstructed on the basis of the accentuated words alone. In this way, a recipient can arrive at a representation of the exhaustive answer, although he recognised the entire sentence. (I have already made this proposal above.) Alternatively, if I allow that the function exh can be applied during the translation of the complete sentence by means of t1 , I must additionally allow the domain restriction of the question abstract to restrict the quantificational domain of “O3” (λZλX [ Z (o3) ∧ X (o3)]). If not, a representation according to which O3 is the only object that is located in the left field was created. This representation is not adequate; after all, the reply in example (16) asserts at the most that O3 is the only triangle in the left field.13 13 (A) When a term is expanded by means of a QUAD, a variable must be instantiated to the domain restriction. Now the objection may be raised that it may be plausible that generalised quantifiers contain such variables, but that other expressions may also be used as constituent answers, and that it is neither plausible nor formally elegant to add variables for domain restrictions to these expressions:
(i)
What is Tom doing? – Waiting.
(contd)
108
Accentuation and Interpretation
2.3
Type shifting
According to the hypothesis of optimal accentuation, the i-critical words of an utterance are accentuated. I assume that by default accentuation is optimal and that a stress pattern that in a given context is generally considered (by competent speakers, especially native speakers) to be a natural, correct and expectable pattern, is an optimal pattern. (17)
a. b. c.
Which black square is in the left field? – EVERY black square. Which square is in the left field? – EVERY BLACK square. Which white object is in the left field? – A white TRIANGLE.
The constituent answer “waiting” must be translated into a formal expression and then expanded into a sentence of QL. The translation of “waiting” is λx [wait( x )]; this expression does not contain a variable for domain restriction. It must be expanded to ![waiting(tom)] by application of a question abstract; the expansion does not require any domain to be restricted. Why would one want to create a QUAD and make it available for the reconstruction of the constituent answer, when a “normal” question abstract suffices? – Reply: I concede that it is not necessary to restrict any domain when completing a constituent answer of (i) to a representation that denotes a partial answer to the given question. Things are different when a representation must be generated in the sense of an exhaustive answer, since this answer should not mean that Tom has no other property than to be waiting, but rather that Tom does not perform any other activity. That is, the domain with respect to which the constituent answer must be exhaustive must be limited to the set of possible activities. The obvious way would be to create the QUAD λX [ X (tom)], λX [ action( X )] for the interrogative sentence in example (i), and to apply the predicate λX [ action( X )] contained in it for the restriction of a domain when necessary. (B) The QUAD of a multiple constituent question such as (ii) must contain several, possibly different predicates for domain restriction. How can these predicates be assigned to the corresponding terms that have to be restricted with respect to their domains? (ii)
Which triangle is next to which square?
Reply: it is true that a QUAD for a multiple constituent question must contain several predicates for domain restriction. So I must model the QUAD as a pair consisting of an abstract and a set or sequence of such predicates. If the domain restrictors are combined into a set, there is no clear assignment of the restrictors to their corresponding terms. Any assignment is possible, as long as the result satisfies the adequacy criteria defined in Chapter 3. Should this constraint not suffice to prevent incorrect assignments, the restrictors can be ordered into a sequence, thereby establishing the order of their application. Lastly, it would be possible to define QUADs as recursive structures; in the simpler case, a QUAD would, as before, consist of an abstract and a restrictor, in the more complex case of a QUAD and a restrictor. (C) It does not make sense for the QUAD of a polar question (e.g. “Is it raining?”) to contain a predicate for domain restriction. – Reply: it does not need to. When QUADs consist of an abstract and a set or sequence of restrictors, this set or sequence can be empty. For the QUAD of a polar question, it would be empty.
Reconstruction of Messages 109
The words in the examples (17-a)–(17-c) that are naturally, correctly, expectably and therefore optimally accentuated are written in capitals. Given knowledge of the utterance context, it should be sufficient to recognise these words in order to be able to reconstruct the entire message. This, however, is not possible with just the reconstruction rules defined so far. 1. The interpretation of example (18) (≈ (17-a)) requires a trivial type shift: (18)
Which black square is in the left field? – EVERY noise.
The abstract for the interrogative sentence in example (18) contains two expressions of type e, t: λx [ left( x )], λx [black( x ) ∧ square( x )]. The t1 translation of the word “every” is λX1 λZλX2 [∀ x [ Z ( x ) ∧ X1 ( x ) → X2 ( x )]]; it has the type e, t, e, t, e, t, t. If I apply the translation of “every” to both elements of the QUAD, I obtain an expression of type e, t, not – as required – one of type t. That is, I cannot generate a QL-representation of the message conveyed by the reply just by means of the QUAD. In order to generate a QL-representation, the translation of “every” must be transformed into a generalised quantifier – here an expression of the type e, t, e, t, t – before it is expanded. The goal of this transformation is solely to type shift the term; an obvious way to do this is through functional application to the predicate λx []: λX1 λZλX2 [∀ x [ Z ( x ) ∧ X1 ( x ) → X2 ( x )]] (λx [])
⇔
λZλX [∀ x [ Z ( x ) → X ( x )]]
The resulting quantifier can be paraphrased with “every element of a domain yet to be restricted”. It can be expanded into a sentence by means of the given QUAD, so that one arrives at a correct representation of the message conveyed by the utterance “every”. The message says that all black squares are in the left field:14 ![(λZλX [∀ x [ Z ( x ) → X ( x )]] (λx [black ( x ) ∧ square( x )]))(λx [ left( x )])]
⇔ ![∀ x [black( x ) ∧ square( x ) → left( x )]] 14 Note that exhaustification of the quantifier makes no sense, since the representation obtained without the application of exh already denotes an exhaustive answer. When all elements of the set of black squares are in the left field, it is also the case that only all elements of this set are in the left field. The application of the function exh yields a representation in the sense of “all black squares are in the left field, and everything that is a black square and that is in the left field, is an object of the universe of discourse”. This representation is equivalent to the one generated without exh.
110
Accentuation and Interpretation
Let me take stock. Apart from the reconstruction rules already defined, a rule for transforming determiners into quantifiers is needed. This rule should be a rule of type shifting that has no further semantic effects. This semantically neutral type shifting is achieved by functional application to the predicate λx []. 2. The interpretation of example (19) (≈ (17-b)) requires a similar trivial type shift: (19)
Which square is in the left field? – EVERY BLACK noise.
To reconstruct the answer, the recipient must create a generalised quantifier which he must then complete by means of the QUAD given by the interrogative question. For the creation of this quantifier he has available to him the recognised words “every” and “black”, and the function t1 . The function t1 yields a formal representation for grammatically wellformed phrases (according to the grammar defined above). However, since the representations of “every” and “black” are not quantifiers, and “every black” is not a well-formed expression, t1 does not yield an appropriate quantifier for the accentuated words. If it is possible to treat the adjective “black” as a nom-phrase, it becomes possible to create an appropriate representation for “every black”, by functional application of the representation of “every” to the representation of “black”. This yields the following quantifier:15 λZλX [∀ x [ Z ( x ) ∧ black( x ) → X ( x )]] With reference to the QUAD λx [ left( x )], λx [square( x )] the quantifier can be expanded into an adequate representation of the answer:16 ![∀ x [black ( x ) ∧ square( x ) → left( x )]] Let me take stock. For the reconstruction and interpretation of the reply in example (19) it suffices to recognise the accentuated words, if an adjective can be treated as a nom-phrase in the reconstruction. 15
The type shift only yields the desired result with intersective predicates, such as, e.g. colour predicates. If I were to extend the fragment of English with nonintersective predicates the type shifting rule had to me adapted. 16 Alternatively, the quantifier λZλX [∀ x [ black ( x ) ∧ Z ( x ) → X ( x )]] can be exhaustified first and then expanded into a full sentence. This yields a representation of an exhaustive answer to the given question: ![∀ x [square( x ) ∧ black( x ) → left( x )] ∧ ∀ x [square( x ) ∧ left( x ) → black ( x )]] However, this is only an exhaustive answer if it is already known which objects are black squares.
Reconstruction of Messages 111
3. Finally, the interpretation of example (20) (≈ (17-c)) requires a less trivial type shift: (20)
Which white object is in the left field? – noise TRIANGLE.
The speaker wants to know which white objects are in the left field. Of the answer, he only recognises the word “triangle”, which is formally represented by the predicate λx [triangle( x )]. This predicate cannot be expanded into a sentence by means of the given QUAD; the completion requires a generalised quantifier instead of λx [triangle( x )]. The questioner can assume that at least one of the white objects in the left field has the property denoted by the accentuated word. That means that he can transform λx [triangle( x )] into a generalised quantifier by means of existential quantification: λZλX [∃ x [ Z ( x ) ∧ triangle( x ) ∧ X ( x )]] This quantifier can be expanded into the following QL sentence with the aid of the QUAD given by the interrogative sentence:17 ![∃ x [white( x ) ∧ object( x ) ∧ triangle( x ) ∧ left( x )]] The type shift from λx [triangle( x )] is accomplished by functional application of the determiner λX1 λZλX2 [∃ x [ Z ( x ) ∧ X1 ( x ) ∧ X2 ( x )]] (“some”). This shift may be made possible by a reconstruction rule that allows a nom-phrase to be transformed into a generalised quantifier by application of a determiner. I tentatively determine that the transformation is always (and not just in the case mentioned) carried out by means of the determiner “some”. A priori it cannot be excluded that the type shift can be accomplished by the application of a determiner other than “some”. The application should, however, not result in the recipient reconstructing an answer that is logically stronger than what the speaker wanted to convey. For type shifting, then, a logically weak determiner like “some” 17 It may make sense to apply the function exh and create a representation of an exhaustive reading. This yields the following QL-sentence:
![∃ x [white( x ) ∧ object( x ) ∧ triangle( x ) ∧ left( x ) ∧
∀y[white(y) ∧ object(y) ∧ left(y) → x = y]]] This sentence gives a stronger answer to the question than the sentence generated without applying exh. However, it only denotes an exhaustive answer if it is already known that there is only one white triangle, and if this triangle can already be identified.
112
Accentuation and Interpretation
is to be preferred over a logically stronger determiner such as e.g. “every”. The new type-shifting rule should allow reconstructions that have not been possible so far. Of course, these new reconstructions must satisfy the criteria of adequacy defined in Chapter 3. (21)
Which white object is in the left field? – noise TRIANGLE. a. b.
Some white triangle is in the left field. The white triangle(s) is/are in the left field.
Let us take another look at example (21) (= (20)). With the aid of the new type-shifting rule the answer, of which only “triangle” was recognised, is reconstructed in the sense of the first paraphrase (“Some white triangle . . . ”). The reconstruction is only adequate in the given question context if the knowledge which white objects are triangles is already part of the common ground. Only then does the reconstruction denote a (partial) answer to the question under discussion. If, on the other hand, it is not known which white objects are triangles, then, although the message denoted by the reconstruction gives a partial answer (that there is at least one white object in the left field), it is not an answer. The additional information that the white object in the left field is a triangle does not filter out any further partition sets and therefore does not help to answer the question. In that sense, the information is superfluous, and the reconstructed message is inadequate. The reconstruction therefore makes a relatively strong presupposition about the common ground; this presupposition may be too strong in some cases. Let us assume that there are several white triangles and that not every white triangle can be identified. Just now, however, a specific triangle has been discussed and this triangle can be identified. It is likely that the recipient reconstructs the answer in example (21) in such a way that the white triangle that was just discussed is located in the left field. He then reconstructs the answer in the sense of the second paraphrase indicated above (“The white triangle. . . ”). This reconstruction is logically stronger than the one proposed first, but it demands less from the interlocutors in terms of previous knowledge and may therefore require less to be adequate.18 18 “The white triangle” is not to be understood as a definite description in Russelian sense but as an anaphoric expression – as in the framework of dynamic predicate logic (Groenendijk and Stokhof (1991a)). For the interpretation of definite articles with dynamic semantic models, see the various models of Groenendijk et al. (1995b) and of Heusinger (2003) and Peregrin and Heusinger (1995).
Reconstruction of Messages 113
We are now confronted with an interesting phenomenon: the first paraphrase is in the given situation over-informative and therefore inadequate, while the logically stronger (in itself more informative) second paraphrase is not over-informative and even adequate. It may be sufficient to transform a recognised predicate into a logically weak quantifier by means of existential quantification; sometimes, however, it may be necessary to convert the predicate into a stronger anaphoric quantifier. It is therefore useful to define a second rule, in addition to the typeshifting rule that was proposed above, which allows a logically stronger type shift. Let me take stock. Unlike the two type-shifting rules motivated earlier, the transformation rule for the representation of a nom-phrase into a quantifier causes a semantic enrichment. I propose to carry out this type shift through existential quantification by standard. Specifying reconstruction rules has effects on the predictions of stress patterns. Assume that a speaker utters a noun phrase (np) that exists of a determiner and a nom-phrase. If by type-shifting the recipient can expand the reconstruction of the nom-phrase into a reconstruction of the entire np, there is no need for him to recognise the determiner in order to correctly reconstruct the np. In this case, according to the hypothesis of optimal accentuation, the speaker does not need to accentuate the determiner. Things are different, however, if by type-shifting the recipient would arrive at an incorrect reconstruction for the np which lead to a misunderstanding. In this case, the recipient must recognise the determiner, so that the speaker must accentuate it. (22)
a. b. c. d.
Which white object is in the left field? Which white object is in the left field? Which white object is in the left field? Which white object is in the left field?
– Some TRIANGLE. – The TRIANGLE. – EVERY TRIANGLE. – TWO TRIANGLES.
Assuming that accentuation is optimal by standard, I predict that determiners such as “some”, “a” and “the” are normally not accentuated. The tendency to accentuate other determiners, e.g. “every” and “two”, should, on the other hand, be much stronger. Of course, even these determiners need not always be accentuated: the cardinal “two” e.g. need not be accentuated in the utterance “two triangles” if it was asked of which object there are two in the left field.19 19 The results of a trial experiment (under conditions that were not completely controlled for and with only a few test persons) suggest that my predictions are correct: I confronted the test persons with a list of question–answer pairs of (contd)
114
Accentuation and Interpretation
I can now expand the semantics of the language L with type-shifting rules. The function t1 assigns to each expression of L at least one formal meaning representation. In order to be able to reconstruct incompletely recognised sentences, sometimes meaning representations for expression that are not well-formed (because they are incomplete) have to be created. This is to be done with the proposed type-shifting rules. In definition 4-8 a function t1 is specified that like t1 assigns to each well-formed expression at least one formal representation (ad a) and that furthermore specifies the representations that can be formed by the transformation of detrepresentations into np-representations (ad b), by the transformation of adjrepresentations into nom-representations (ad c) and by the transformation of nom-representations into np-representations (ad d): Definition 4-8 (The Language L with Type Shifting) Let t1 and t1 be functions that assign to each part of speech type named in definition 4-1 a set of pairs, consisting of a natural-language expression and a formal representation. Let the function t1 be identical with the function of the same name specified in definition 4-1. Let A be any expression of L; let P be any expression of L. The following holds: (a) A, P ∈ t1 (K )
⇐= A, P ∈ t1 (K ), for all part of speech types K
(b) A, P(λx []) ∈ t1 (np) (c) A, P ∈ t1 (nom)
⇐= A, P ∈ t1 (det)
⇐= A, P ∈ t1 (adj)
(d) A, λZλX [∃ x [ Z ( x ) ∧ P( x ) ∧ X ( x )]] ∈ t1 (np)
⇐= A, P ∈ t1 (nom)
The language L is the set of expressions A for which it holds that A, α ∈ t1 (K ) – for any formal expressions α and any phrase or word type K. The set of expressions α, for which it holds that A, α ∈ t1 (K ) – for all A ∈ L and any phrase or word type K – is the set of formal representations of L. The functions t1 und t1 specify for each part of speech type a set of pairs consisting of a linguistic expression and a formal representation. Only the form “Was ist im linken Feld?” – “Zwei graue Kreise sind im linken Feld.” (“What is in the left field?” – “Two grey circles are in the left field.”) and “Was ist im linken Feld?” – “Der graue Kreis ist im linken Feld.” (“What is in the left field? – The grey circle is in the left field.”) The test persons had to mark which words were to be accentuated in the answer sentences. Certain determiners were never marked; the determiner “ein” (“a” or “one”) was only marked when the answer sentence was a reply to a question asking for a quantity – e.g. “Wieviele Kreise sind im linken Feld?” (“How many circles are there in the left field?”). Cardinals such as “zwei”, “drei” (“two”, “three”) etc. were almost always marked; “alle” (“all”), “jeder” (“every”) were marked in the majority of cases.
Reconstruction of Messages 115
the expressions A with A, α ∈ t1 (for any α) are grammatically wellformed; the set of expressions B with B, α ∈ t1 (again for any α) contains incomplete and grammatically not well-formed expressions. The function t1 can be used for the reconstruction of messages. I replace the function t1 in the reconstruction rules that were defined in subsections 2.1 and 2.2 with the function t1 , thus obtaining the following new rules: Definition 4-9 (Reconstruction Rules) (a) Reconstruction without a QUAD: φ ∈ reconstruct(quad, A1 · . . . · An )
⇐=
A1 · . . . · An , φ ∈ t1 (K ) ∧ K ∈ {ds, is} (b) Reconstruction with a QUAD: φ ∈ reconstruct(α, β, A1 · . . . · An )
⇐=
A1 · . . . · An , ψ ∈ t1 (K ) ∧ (φ = ![(ψ( β))(α)] ∨ φ = ?[(ψ( β))(α)]) (c) Reconstruction with a QUAD and exhaustification: φ ∈ reconstruct(α, β, A1 · . . . · An )
A1 · . . . · A n , ψ ∈
t1 (K )
⇐=
∧ χ = exh(ψ) ∧ φ = ![(χ( β))(α)]
According to the first reconstruction rule, the sequence of recognised words A1 · . . . · An should be transformed into a sentence of QL by means of t1 alone; because the standard rules of type shifting can take effect in the reconstruction, the sequence of words does not have to form a complete sentence of L (unlike was the case with translation through t1 ). The second and third reconstruction rules make it possible to expand the representation of a constituent answer with a QUAD. The word sequence recognised by the recipient does not even have to be a complete constituent answer, all that is required is that the meaning representation of a constituent answer can be assigned to it by means of t1 . The application of the third rule exhaustifies the reconstructed constituent answer. (23)
Which triangle is in the left field? – noise WHITE noise.
116
Accentuation and Interpretation
With the new reconstruction rules, the reply in example (23) can be reconstructed: the adjective “white” is translated into the predicate λx [white( x )]. By type-shifting, this predicate can function as the representation of a nomphrase. It can be transformed into a generalised quantifier – i.e. into the representation of a noun phrase – through existential quantification. The resulting quantifier – λZλX [∃ x [ Z ( x ) ∧ white( x ) ∧ X ( x )]] – can (optionally) be exhaustified and expanded into a QL sentence by means of the QUAD given by the previously uttered question. The following two reconstructions are obtained: 1. without exhaustification: ![∃ x [triangle( x ) ∧ white( x ) ∧ left( x )]] 2. with exhaustification: ![∃ x [triangle( x ) ∧ white( x ) ∧ left( x ) ∧ ∀y[triangle(y) ∧ left(y) → x = y]]] Which is the desired result. 2.4
Interpretation and accentuation
Let me summarise what I have so far. A recipient recognises all or some of the words of an uttered sentence. The sentence denotes a message. I model the reconstruction of this message in such a way that the recipient translates the recognised words into expressions of a formal language QL; if the words together form phrases, he can combine the representations of the words into representations of phrases. If representations cannot be combined, it may be possible to type-shift them, and then combine them. Type-shifting may require semantic enrichment. Generalised quantifiers can be strengthened semantically by application of the function exh; the application of exh is optional. By means of a QUAD, a quantifier can be expanded into a full sentence of QL. The rules for type-shifting and completion with a QUAD serve to transform QL-expressions into complete QL-sentences; exhaustification does not contribute to sentence formation. With the reconstruction rules all the answers in the examples (24-a)–(24-g) can be interpreted correctly: (24)
a. b. c. d.
What is in the left field? – EVERY BLACK SQUARE noise Which square is in the left field? – EVERY BLACK noise Which black object is in the left field? – EVERY noise SQUARE noise Which black square is in the left field? – EVERY noise
Reconstruction of Messages 117
e. f. g.
Which triangle is in the left field? – noise WHITE noise Which white object is in the left field? – noise TRIANGLE noise What is in the left field? – noise WHITE TRIANGLE noise
The answers in the examples are reconstructed either in the sense of “Every black square is in the left field” ((24-a)–(24-d)) or in the sense of “Some white triangle is in the left field” ((24-e) - (24-g)). All that is needed for a successful reconstruction is to know the respective QUADs and to recognise all optimally accentuated (here: capitalised) words. No further propositional or thematic knowledge is required; such knowledge is only needed to judge the adequacy of completed reconstructions. According to the optimal accentuation hypothesis the accentuated words in an utterance form a smallest possible set of words that suffice for proper understanding. This means that in the answers in the examples (24-a)– (24-g), at most the capitalised words are accentuated (even when the person answering utters a complete sentence in each example). Depending on how much common ground is available, it may suffice for the recipient to recognise only some of these words; the speaker therefore does not necessarily have to accentuate each of the capitalised words: (25)
Which black object is in the left field? – Every black SQUARE is in the left field.
Let us assume that it is known that all black squares are in the same field; it is, however, not known in which. If the recipient of the answer in example (24-c) only recognises the word “square”, he will reconstruct the message directed at him through type-shifting and QUAD-completion in the sense of “Some black square is in the left field”. Although this sentence is logically weaker than the uttered sentence “Every black square is in the left field”, with respect to the presupposed common ground it is equivalent. If all black squares are in the same field and some black square is in the left field, then all black squares are in the left field. The speaker does not need to be worried about his message receiving a logically weaker reconstruction than intended if this reconstruction leads to the same modification of the common ground. If it is not part of the common ground that all black squares are in the same field, sentences such as “Some black square is in the left field” and “Every black square is in the left field” are not equivalent with respect to the common ground. Under such circumstances, the speaker must make sure that the recipient also recognises the determiner “every”; this means that he has to accentuate the determiner.
118
Accentuation and Interpretation
Reconstructing a message consists in creating a QL-sentence φ for a recognised sequence of words A1 · . . . · An while taking a given information state σ, quads into account. The reconstruction is based on the rules defined above. The reconstruction rules establish a relation between information states, word sequences and QL-sentences. For information states and word sequences sets of QL-sentences are determined; such sets may be empty, since it is not always possible to reconstruct a QL-sentence after recognising a sequence of words. For each information state and every message (represented by a QLsentence) a set of word sequences can be determined. Each of these word sequences can function to convey the message in the situation specified by the information state. (I assume that every message can be conveyed linguistically, and that therefore there exists for any message at least one word sequence that can convey the message.) The word sequences can have different lengths; some of the word sequences form syntactically wellformed sentences, others perhaps do not. Let us assume that a speaker wishes to convey a message represented by φ and to this end wishes to utter a grammatically well-formed sentence A1 · . . . · An . The sentence A1 · . . . · An must satisfy the following criteria: 1. It must be possible to reconstruct φ from A1 · . . . · An . 2. Some of the words A1 , . . . , An must be accentuated. The sequence of words to be accentuated must be as short as possible; still, the recipient should be able to reconstruct φ or an equivalent message φ even if he only recognised these words. 3. When the recipient has recognised more than just the accentuated words, he must also be able to reconstruct φ or φ on the basis of the words that he recognised. The correct reconstruction must not presuppose that some parts of the sentence are not recognised. 4. It is presupposed that the message φ satisfies the adequacy criteria defined in Chapter 3. Furthermore, it is assumed that it is the best message that the speaker can give; he does not withhold any relevant information and he does not claim anything for which he has no evidence. By appealing to the reconstruction rules, it is possible to determine constraints for accentuation; an optimal stress pattern is a pattern that satisfies the constraints: let us assume that the information state CG = σ, quads represents the common ground of a group of discourse participants. Say that a speaker wishes to convey a message φ. Let the word sequence
Reconstruction of Messages 119
A1 · . . . · An form a grammatically well-formed sentence that can be translated into the QL-sentence φ by means of t1 . – How can an optimal stress pattern for A1 · . . . · An be determined? Attempt 1: let X0 , . . . , X j be word sequences of arbitrary length (possibly of length 0), and let C1 · . . . · Cj be the shortest possible word sequence that allows the message φ to be reconstructed in relation to σ, quads, so that the sequence X0 · C1 · . . . · Cj · X j is identical with the sequence A1 · . . . · An . (If X0 , . . . , X j are variables, X0 · C1 · X1 · . . . · Cj · X j can be unified with A1 · . . . · An .) That is, the words C1 , . . . , Cj occur in the order 1 − j in the sentence A1 · . . . · An . The sentence A1 · . . . · An is optimally accentuated if of its words only C1 , . . . , Cj are accentuated. Let C1stress , . . . , Cjstress be the accentuated words C1 , . . . , Cj ; let A1 · . . . · A n be the sentence A1 · . . . · An with stress pattern. The stress pattern is optimal if A1 · . . . · A n is identical (can be unified) with X0 · C1stress · X1 · . . . · Cjstress · X j : A1 · . . . · A n ∈ accentuate( A1 · . . . · An , φ, σ, quads)
⇐=
A1 · . . . · An , φ ∈ t1 (K ) ∧ K ∈ {is, ds} ∧ C1 · . . . · Cj ∈ shortest({ D1 · . . . · Dl |
α, β ∈ quads ∧ φ ∈ reconstruct(α, β, D1 · . . . · Dl ) ∧ A1 · . . . · An = Y0 · D1 · Y1 · . . . · Dl · Yl }) ∧ A1 · . . . · A n = X0 · C1stress · X1 · . . . · Cjstress · X j The function shortest yields a shortest sequence of a given set of word sequences. The sequence C1 · . . . · Cj would therefore be the shortest, or at least one of the shortest sequences for which in the given context the message φ can be reconstructed. Objection: we have seen in example (25) above that it is not always necessary to be able to reconstruct the message φ. It suffices to reconstruct any message φ , as long as this message is equivalent to φ in the given information state. It may be that for the reconstruction φ fewer words need to be recognised than for the reconstruction of φ; therefore C1 · . . . · Cj need not necessarily be a shortest sequence that is optimally to be accentuated. Reply: that is correct; the accentuation rule has to be modified. Attempt 2: let UQL = QL, Σ, [ ]2M be a classical update system for QL as defined in Chapter 3. The common ground of the discourse participants is represented by the information state CG = σ, quads, with σ ∈ Σ. It must be possible for the sequence of accentuated words C1 · . . . · Cj to be reconstructed as a message φ in such a way that φ and the speaker’s intended message φ are equivalent with respect to σ. Following this, the new accentuation rule becomes:
120
Accentuation and Interpretation
A1 · . . . · A n ∈ accentuate( A1 · . . . · An , φ, σ, quads)
⇐=
A1 · . . . · An , φ ∈ t1 (K ) ∧ K ∈ {is, ds} ∧ C1 · . . . · Cj ∈ shortest({ D1 · . . . · Dl |
α, β ∈ quads ∧ φ ∈ reconstruct(α, β, D1 · . . . · Dl ) ∧ σ[φ]2M = σ[φ ]2M ∧ A1 · . . . · An = Y0 · D1 · Y1 · . . . · Dl · Yl })
∧ A1 · . . . · A n = X0 · C1stress · X1 · . . . · Cjstress · X j
Objection: the rule is better than the one from attempt 1; it still does not determine exclusively optimal stress patterns, however. For example, in the following example (26), both answers are optimally accentuated according to the rule, although only the stress pattern of the first answer can be considered optimal: (26)
Which triangle is concealed by the white square? a. b.
The white square conceals the WHITE triangle. The WHITE square conceals the white triangle.
Reply: this objection is correct as well. It was stated above that the correct reconstruction must also be possible when the recipient has recognised more than just the accentuated words. If a recipient recognises not just the accentuated word “white” but the entire subject-np in the second answer sentence, the extension with the given QUAD will yield a reconstruction that denotes that the white square is a triangle that is concealed by the white square. This reconstruction is obviously inadequate. The problem only occurs when, first, a word to be accentuated occurs more than once so that it must be decided which occurrence is to be accentuated, and, secondly, when a QUAD is contextually available. If no QUAD is available, all words must be accentuated except those whose non-recognition can be compensated for through type-shifting. Since it is irrelevant where these words occur, the accentuation rule of attempt 2 can continue to be used in this case. If a QUAD is available for the reconstruction of a message, the speaker does not have to utter a complete sentence. It suffices for him to utter an expression whose formal representation can be expanded into a representation of the complete message by means of a QUAD. Such an expression is called a constituent answer or a constituent question. All the words of a constituent answer (or question) that cannot be compensated for through type-shifting must be accentuated. For the accentuation of such expressions, the accentuation rule of attempt 2 can be used. Attempt 3: even if a QUAD is contextually given, a speaker can utter a complete sentence; in this case a constituent answer or question is part
Reconstruction of Messages 121
of this sentence. The speaker needs to accentuate only the words of the constituent answer (or question); the sentence parts that do not belong to the constituent answer (or question) can be compensated for by means of the QUAD. This means that it is necessary to identify a sentence part whose representation can be expanded by means of a QUAD. Which words in this sentence part need to be accentuated is determined by the accentuation rule of attempt 2; other words are not to be accentuated: A1 · . . . · A n ∈ accentuate( A1 · . . . · An , φ, σ, quads)
⇐=
A1 · . . . · An , φ ∈ t1 (K ) ∧ K ∈ {is, ds} ∧ α, β ∈ quads ∧ A1 · . . . · An = Z0 · B1 · Z1 · . . . · Bm · Zm ∧ B1 · . . . · Bm , ψ ∈ t1 (K ) ∧ (ψ( β))(α) = φ ∧ C1 · . . . · Cj ∈ shortest({ D1 · . . . · Dl |
ψ ∈
t1 ( D1
· . . . · Dl )) ∧ σ[(ψ( β))(α)]2M = σ[(ψ ( β))(α)]2M
∧ B1 · . . . · Bm = Y0 · D1 · Y1 · . . . · Dl · Yl }) ∧ B1 · . . . · Bm = X0 · C1stress · X1 · . . . · Cjstress · X j ∧ A1 · . . . · A n = Z0 · B1 · Z1 · . . . · Bm · Zm
Objection: this excludes the incorrect accentuation of example (26), but not the incorrect accentuation of example (27). Only the first of the two indicated stress patterns is optimal; the rule of attempt 3 also allows the second, non-optimal pattern:20 (27)
Who likes Bill? a. b.
BILL likes Bill. Bill likes BILL.
Reply: that is again correct. If a word alone forms a constituent answer and if this word occurs more than once in a sentence, the accentuation rule of attempt 3 does not allow us to decide which occurrence of the word must be accentuated. The acceptability of the first and inacceptability of the second pattern can be explained as follows: the question that was asked is who likes a certain person (namely Bill). If the recipient of the answer only recognises “Bill likes” (which is more likely with the first stress pattern than with the second one), he understands that the speaker is saying something about which person or which object Bill likes. If it is presupposed that the 20 By the way: If we translate the example into German, so that we are not tied to word order as much, the problem does not occur: Both “BILL mag Bill” and “Bill mag BILL” can be optimally accentuated replies to “Wer mag Bill?”.
122
Accentuation and Interpretation
speaker expresses himself adequately and answers the question under discussion, “Bill likes” can only be expanded into “Bill likes Bill”. Any other expansion would be inadequate in the given context. If, on the other hand, the recipient recognised “likes Bill” (which is more likely with the second stress pattern than with the first one), he understands that the speaker is saying something about which person likes Bill. Perhaps the recipient already knows that Bill is liked, as he wants to know who it is who likes him. He does not receive any clue how to adequately reconstruct “likes Bill”. That means that the recipient is more likely to arrive at a correct reconstruction of the speaker’s message on the basis of the recognised words in the case of the first stress pattern (example (27-a)) than in the case of the second stress pattern (example (27-b)). Sentences such as those in example (27) are certainly not uttered very often; nonetheless it should be possible to determine correct stress patterns for them as well. By now, I do not have the formal means to correctly complete “Bill likes” in the given context, so that I cannot formally motivate the advantages of the first stress pattern. The application of the reconstruction rules specified so far requires that “likes” be ignored and that the remaining term “Bill” is expanded by means of the given QUAD. This way, however, one can complete not only “Bill likes” but also “likes Bill”. Given the rule apparatus then it is irrelevant which of both word sequences the recipient recognises. Let me take stock. I have formulated a rule for the optimal accentuation of sentences. If a sentence is accentuated optimally, it can be understood on the basis of the accentuated words alone. The set of words to be accentuated is as small as possible; if a sentence is not accentuated optimally, then either more or less words have been accentuated than would be necessary. The accentuation of too many words is a disadvantage because it reduces the probability of the recognition of all accentuated words. Furthermore, a recipient who assumes that the speaker accentuates optimally may make incorrect assumptions about the discourse context on the basis of the over-accentuation, and accommodate his representation of the context incorrectly, i.e. not in the way that the speaker intended. (In 2.5 I focus on the accommodation of utterance contexts.) A speaker also increases the risk of misunderstanding when he accentuates too few words, i.e. not all i-critical words. He must assume that non-accentuated words may not be recognised. If the speaker does not accentuate all i-critical words, it is probable that the recipient does not recognise all of them:
Reconstruction of Messages 123
(28)
a. b.
What is in the left field? – EVERY black square is in the left field. Which triangle is in the left field? – Some white TRIANGLE is in the left field.
If the recipient recognises only the accentuated word “every” in the answer in example (28-a), it might be that he reconstructs the answer in the sense of “Every object is in the left field”. This reconstruction is considerably stronger than the one that the speaker intended; it may even be false. The answer in example (28-b) is under-accentuated as well. If the recipient recognises only the accentuated words, he cannot fully reconstruct the speaker’s intended message. He does arrive at the reconstruction of a correct, partial answer, according to which at least one triangle is in the left field, so that he can exclude the possibility that there is no triangle in the left field. In this case, an incorrect accentuation does not lead to a misunderstanding, but it increases the probability that the speaker’s message is not fully understood. 2.5
Presupposition of questions
An optimal stress pattern for a given sentence can be determined for a certain discourse context. Conversely, for a sentence with stress pattern the contexts according to which this pattern is optimal can also be determined: (29)
I did NOT delete your data.
Let us assume that someone enters his office and is greeted with the utterance in example (29). He recognises the sentence completely, including the stress pattern. He immediately suspects that his data are lost. – Why? Every declarative sentence should answer a question under discussion at least partially. When a recipient translates a declarative sentence into the meaning representation !φ, he must assume that for the speaker the question ?φ – or possibly even a logically stronger question – is under discussion. The recipient of the sentence in (29) can therefore assume that it is under discussion whether the speaker deleted the data of the recipient. Under the assumption that the sentence is accentuated optimally, it should be possible to reconstruct the conveyed message on the basis of the accentuated words alone. The reconstruction of the message of (29) on the basis of “I” and “not” cannot be accomplished by type-shifting alone; a QUAD is required. A QUAD is only available if the question corresponding to the QUAD is under discussion. The recipient already establishes that the question whether the speaker deleted the data is under discussion. However, if the QUAD of this question were available, the speaker would optimally
124
Accentuation and Interpretation
only accentuate “not” – “I did NOT delete your data”; the stress pattern of example (29) (with stress on “I” and “not”), would be suboptimal. The accentuation of “I” and “not” is optimal only if the QUAD that is available corresponds to the question who deleted the recipient’s data. The speaker presupposes this question by his accentuation, and the recipient accommodates his representation of the common ground accordingly. A question is not put up for discussion without a reason; conditions must be given that make answering the question interesting. The question who deleted certain data is only interesting out of the blue when the data have been deleted. Exactly that is what the recipient must assume; he must also accommodate his representation of the common ground in this regard. Objection: why must we consider the stress pattern to reach this conclusion? Without taking the stress pattern into account, we can establish that the question whether the speaker deleted the data of the recipient is under discussion. This question too must be interesting. Out of the blue, it is only interesting when the data were deleted. Reply: no. (a) The question whether the speaker deleted the data can also be asked in different contexts. If the speaker is very clumsy and one must always fear that he may have deleted certain data, then the message that he did not delete certain data should be reassuring. (b) Other accentuations presuppose different questions and thus have different effects: (30)
I did not delete YOUR data.
The accentuation of “your” presupposes the question whose data the speaker deleted. He did not delete the recipient’s data, but he does not mention anything about the data of others; these may be lost. Not taking the stress pattern into account would mean to deny the difference between (29) and (30). This difference is substantial: after the utterance in (29), the recipient assumes that his data were deleted; after the utterance in (30) he suspects that at least his own data are intact. Let me take stock. The accentuation rule formulated in section 2.4 above places sentences, stress patterns, messages and discourse contexts in relation to each other. Taking into account the message that is to be conveyed and the existing context, a speaker can determine an optimal stress pattern for a sentence. A recipient in turn can – if he recognises the sentence completely, including its stress pattern – conclude what the speaker takes the discourse context to be. For that he must presuppose that the speaker accentuates optimally. Specifically, the stress pattern gives information about which question is under discussion for the speaker. There are always pre-
Reconstruction of Messages 125
conditions tied to a question being under discussion. The recipient can accommodate his context-representation in relation to the question that the speaker presupposes and its preconditions; in that way he brings his own context-representation more in line with the speaker’s. Different accentuations of a sentence may require different accommodations; a speaker can therefore inform the recipient in different ways by using different accentuations. If a recipient knows which question is under discussion, he can semantically strengthen a declarative sentence that is uttered as an answer to this question through exhaustification. He can also strengthen the sentence if he only finds out what the question is on the basis of the stress pattern. That is, accentuation can influence the interpretation of a sentence, since it indicates how the sentence is to be exhaustified. The next example is from Rooth (1992): Steve, Paul and Mats have taken part in a test; they were immediately informed of the results; each of them therefore knows how he and the other two performed. George knows about the test and knows that the results were given immediately. He meets Mats and asks him how it went. Mats answers “Well, I passed”: (31)
a. b.
Well, I PASSED. Well, I passed.
The sentence “I passed” can be accentuated in different ways; in example (31-a) the verb “passed” is accentuated, while in example (31-b) the pronoun “I” is accentuated. If the stress pattern is disregarded, the sentence that Mats utters means no more than that Mats passed the test. Rooth (1992) claims – and I agree – that the recipient George can extract more information from the utterance, and that the extra information depends on the stress pattern. On the basis of (31-a) (“I PASSED”) he can suspect that Mats passed, but not very convincingly; he learns nothing about the results of Steve and Paul. On the basis of (31-b) (“I passed”), on the other hand, George can suspect that Mats passed (possibly even aced), but that Steve and Peter failed the test. Rooth (1992) observes that the different stress pattern evoke different conversational implicatures. To the extent that they evoke implicatures, the stress patterns are informative in different ways. To account for this phenomenon, Rooth (1992) uses the theoretical term “focus”: the different accentuations mark different foci; determining the foci evokes different implicatures. My explanation here does not make reference to foci, but to a question that is presupposed through accentuation: First, in Chapter 3, I established that a speaker’s message must be reconstructed in such a way that it satisfies certain criteria of adequacy. It must
126
Accentuation and Interpretation
furthermore be the best answer that the speaker can give, i.e. it should not be assumed that the speaker withholds relevant information. Finally, an utterance should be clear and intelligible; it can be presupposed that an uttered sentence is optimally accentuated. Secondly, the stress on “passed” in example (31-a) is optimal only if Mats’ result is under discussion. Things are different in example (31-b); the stress on “I” is optimal only if it is under discussion who passed the test. That is, the accentuation presupposes two different questions; both are logically subordinate to the question that George asked – that is, the question how the test went for Steve, Paul and Mats. They can therefore be put up for discussion retroactively and thus guide the answering process for George’s question. George can accommodate his representation of the common ground accordingly. Thirdly, George knows that Mats knows the exhaustive answer to the question that he asks, and so he also knows that Mats knows the exhaustive answers to the presupposed (sub-) questions. He can (strictly speaking: must) exhaustify Mats’ answers and reconstruct them in the sense of “I (Mats) just passed” (example (31-a)), and “Just me (Mats) passed” (example (31-b)). From the reconstructions, it logically follows that Mats did not ace ((31-a)), and that Steve and Paul did not pass ((31-b)), respectively. These are the implicatures mentioned above. The implicatures arise from the fact that George “makes” Mats’ utterance adequate. For this purpose, George accommodates his representation of the common ground to the question that Mats presupposes in his accentuation, and exhaustifies the literal meaning of Mats’ utterance with respect to this question.21 21
It is interesting that it is hardly possible to answer George’s original question exhaustively by uttering “I passed”: (A) optimally, given the original question, both words “I” and “passed” should be accentuated. Accentuation requires that the words are emphasised in relation to the surrounding words. But if every word is accentuated, there are no longer surrounding words against which the accentuated words can be emphasised, ergo no word is emphasised. The recipient may not recognise that every word was accentuated; he may instead recognise that no word was accentuated. Alternatively, he may assume that he did not recognise the stress pattern, or that only “passed” was accentuated – since the accentuation of both words requires that “passed” is emphasised in relation to “I”. (B) In order to exhaustify “I PASSED” (both words accentuated), a representation must be created according to which, first, Mats is the only person that passed (meaning of “I” exhaustified, meaning of “passed” not exhaustified), and, secondly, Mats just passed (meaning of “I” not exhaustified, meaning of “passed” exhaustified). That is, the utterance must be interpreted as an answer to two questions: first, as the answer to the question who passed, and, secondly, as the answer to the question how well those that passed did. The function exh cannot be applied to the meanings of “I” and “passed” simultaneously, because the recipient would then arrive at an interpretation in the sense of “Just Mats just passed”. This representation (contd)
Reconstruction of Messages 127
Summarizing: if the recipient understands which question the speaker presupposes, he can semantically strengthen the speaker’s utterance through exhaustification. That is, accentuation can influence the reconstruction and interpretation of an utterance. Through accentuation, a speaker can presuppose contextual requirements for his utterances – especially a question under discussion. The presupposition is not always unambiguous; it may happen that a stress pattern is optimal with respect to different questions: (32)
a.
EVERY triangle is in the left field. (i) How many triangles are there in the left field? (ii) Which triangle is in the left field?
b.
A BLACK triangle is in the left field. (i) What sort of triangle is in the left field? (ii) Which triangle is in the left field?
The stress patterns of the example sentences (32-a) and (32-b) are both optimal in relation to two non-equivalent questions. If the recipient does not already know which question is under discussion – one of the questions may have been asked explicitly – he has several ways in which he can accommodate his information state. In doing so, he must make sure that the speaker’s utterance is adequate with respect to the accommodated information state. If he knows how many triangles there are altogether, but does not know which objects are triangles, then, in view of the utterance (32-a), only the accommodation to the first question (“How many triangles . . . ”) is allowed. If the recipient were to accommodate his information state to the second question, the statement in (32-a) would be over-informative and would not denote an answer. The same is true for example (32-b); accommodation to the second question (“Which triangle. . . ”) is only adequate if the recipient can identify the black triangles. If the recipient of sentence (32-b) cannot decide which question is presupposed, there are different ways in which he can reconstruct and interpret the sentence through exhaustification. He can take it to mean that there is a triangle in the left field that is black and has no other interesting property (e.g. having a couple of red lines on it). Alternatively, he can take it to mean that there is only one triangle in the left field and that this triangle is black. To prevent misunderstanding, it would be advisable not to exhaustify, or to ask the speaker what he meant. denotes not the intended, exhaustive but merely a partial answer.
128
Accentuation and Interpretation
Let me take stock. A stress pattern can give information about a question under discussion, though it need not specify this question unambiguously. Perhaps one of the questions that are compatible with the stress pattern has already been put up for discussion, so that the recipient does not need to accommodate his information state. It could also be the case that only one of the questions is contextually adequate, and that all other questions are excluded by the criteria of adequacy. Otherwise, it appears to be a good interpretation strategy to interpret the corresponding utterance so weakly that it is adequate with respect to each of the possible questions. That may avoid misunderstanding. 2.6
Summary
Several operations are available for reconstructing a message: 1. Grammatically well-formed expressions can be translated into formal meaning representations. 2. A recipient can utilise an available QUAD (question abstract domain pair) to expand the representation of a constituent answer or constituent question. 3. By application of the function exh the representation of a constituent answer can be semantically strengthened. 4. Meaning representations can be type-shifted; this may result in a meaning enrichment. A message must be adequate with respect to the common ground of the discourse participants. The discourse participants possess (possibly distinct) representations of what they suppose is the common ground. The adequacy of messages is to be judged with respect to these representations. A recipient has some leeway in reconstructing messages; sometimes, a recipient is able to reconstruct several messages, but can reject the inadequate ones. It is also possible that he can make a message adequate by accommodating his representation of the common ground. Criteria of adequacy exist for messages, and also for linguistic expressions with which messages are conveyed. An utterance must be clear and intelligible. Clarity and intelligibility depend on the common ground; a recipient that presupposes that the speaker expresses himself clearly and intelligibly can adequately adjust his representation of the common ground. A clear and intelligible utterance is optimally accentuated. The recipient of an utterance can accommodate his representation of the common ground, so that the utterance is optimally accentuated with respect to it. A speaker
Reconstruction of Messages 129
can rely on the ability of the recipient to make such accommodations, and can therefore presuppose a certain context – especially a question under discussion – through accentuation. The recipient of an utterance reconstructs a message; possibly he accommodates his representation of the common ground during this process. With a message that has been reconstructed and deemed adequate, he can update his representation of the common ground. A speaker in turn can directly update the common ground representation that he posseses with his intended message. With this, I have defined the main characteristics of a model for the interpretation of incomplete or incompletely recognised sentences. By referring to the rules for the reconstruction of messages and the adequacy criteria for messages I specified a rule for optimal accentuation. With this rule, stress patterns are predictable.
3 Interpretation without QUADs If there is no QUAD available for the reconstruction of a message, the type-shifting operations are the only operations through which recipients can semantically enrich expressions. In the last section, I defined three type-shifting operations. The following example (33), which was already given in Chapter 3, shows that further operations are needed: (33)
Why are you so late? – noise BICYCLE noise FLAT.
The words that are accentuated and that the recipient recognised, “bicycle” and “flat”, can be translated into formal meaning representations. These representations can be type-shifted; type-shifting yields generalised quantifiers in the sense of “a bicycle” and “a flat” – i.e. “a flat tyre”. Uttering the question “Why are you so late?” does not provide a QUAD with which these quantifiers can be expanded into a complete sentence of QL. The answer therefore cannot be reconstructed by means of a QUAD. An obvious solution is to put the quantifiers in a relation to each other and to generate the message that some bicycle had a flat tyre. Still, this message does not yet denote an adequate answer to the question that was asked; after all, the fact that some bicycle is broken is no reason for being late. We can assume that it is not just any bicycle that is broken, but the speaker’s bicycle; and we can therefore plausibly reconstruct the answer in the sense of “My (the speaker’s) bicycle had a flat”. With the operations that have been defined so far, this enrichment is not possible. That is, the type-shifting operations so far do not suffice for the reconstruction of messages such as the answer in example (33); further operations need to be defined.
130
Accentuation and Interpretation
(34)
Where is the book? – noise TABLE.
As an answer to the question in example (34), the word “table” can be interpreted in such a way that the intended book is somewhere on, under or next to a (certain) table. The representation of “table” can be typeshifted to a representation of “a table” or “the table”. This expression must be transformed into an expression that denotes a location. The location can remain underspecified; it might be that only a broad indication of where the book is located is needed. The result of type-shifting can therefore be a semantically underspecified expression. According to the model described in the previous section, QUADs are used when (type-shifted) translations of the recognised expressions do not suffice for the reconstruction of a message. The expansion of an expression with a QUAD is a contextually licensed semantic enrichment. So far, it is the only operation for contextually licensed enrichment; further operations of this kind are needed, however: 1. QUADs are only given for explicitly uttered interrogative sentences; they are not available for the reconstruction of answers to implicitly given questions. Nonetheless the discourse context can guide the interpretation of an answer to a question that is merely implicit (35)
When is the next train to Cologne? – 18:01. Platform 1.
In Chapter 3, I argued that, after the answer to the question when the train to Cologne departs has been given, the rail employee in example (35) can presuppose the question from which platform the train leaves for Cologne, and answer it by uttering “Platform 1”. This question arises from the questioner’s need for information which can be assumed in the given situation by standard. According to the QUAD-model, no QUAD is available for the reconstruction of the message that the train departs from platform 1; the message can therefore not be reconstructed in that model. In order to reconstruct it, the representation of “platform 1” must be transformed into the representation of a location such as “from platform 1”; the location must then be expanded with the subject and predicate of the original interrogative sentence. That is, the QUAD of the original interrogative sentence does not suffice for reconstructing the message. (36)
a. b.
Knocking on the door – Tom. Someone is knocking on the door. – Tom.
Reconstruction of Messages 131
In the context of the examples (36-a) and (36-b), “Tom” is to be interpreted in the sense of “Tom is knocking on the door”. The interpretation must refer to the knocking that the discourse participants observed, or to the utterance that someone is knocking, respectively. No appropriate QUAD is available so far. 2. In the following examples from Ginzburg and Sag (2000), interrogative sentences are uttered, adding QUADs to the context. These QUADs, however, cannot be used for the reconstruction of the follow-up and reprise questions posed by uttering “Who?” and “Mo?”: (37)
a. b. c.
Did anyone call? – Yes. – Who? Did you meet Makriyannis? – Who? Did Mo dupe the judges? – Mo?
The reconstruction of the follow-up question in example (37-a) and the reprise questions in the examples (37-b) and (37-c) requires reference to the interrogative sentences in each example. The QUADs of these interrogative sentences are not appropriate for the reconstruction; the sentence must be available to the recipient in a different format. The following example, from Schwarzschild (1999), behaves similarly: (38)
John drove Mary’s red convertible. What did he drive before that? – He drove her BLUE convertible.
If only the accentuated word “blue” would be recognised in the answer sentence, then according to the model of the previous section, the message can be reconstructed that John drove some blue object. Correct would be the reconstruction that John drove Mary’s blue convertible, however. The correct reconstruction requires reference to constituents of the declarative sentence uttered earlier; using the given QUAD is not sufficient for reconstruction. Discourse participants possess representations of the common ground. In the model of the previous section, a representation of the common ground is a pair consisting of a set of index-pairs σ) and a QUAD-list quads. The set σ represents the common propositional and thematic knowledge of the discourse participants; the adequacy of a message is judged in reference to σ. The elements of the list quads can be used for the reconstruction of messages. As we have seen, however, these QUADs do not always suffice: QUADs are representations of recently uttered interrogative sentences. QUADs have a format that allows only limited use of these representations.
132
Accentuation and Interpretation
For the reconstruction of the reprise questions (37-b) and (37-c) the original interrogative sentence must be available in a different format; the QUADs cannot aid reconstruction. It may be necessary to also take into account recently uttered declarative sentences. Recipients must have representations of these sentences and their constituents available to them. The model of the previous section can be modified in such a way that instead of QUADs, formal representations of recently uttered sentences and their constituents and representations of other, generally observed events are available. An information state then consists of a set σ and a list of formal expressions that represent the recently uttered expressions and generally observed events. During the reconstruction of a message, the recognised words can be translated into formal expressions and type-shifted independently of the discourse context. Furthermore, they can be semantically enriched by combining them with the contextually available representations of expressions uttered earlier. Thereby they can be expanded into complete messages. The messages generated in this way can then be tested for their contextual adequacy. (39)
Which black square is in the left field? – EVERY noise
As an example, let us interpret the answer in the dialogue (39): after “Which black square is in the left field?” has been uttered and understood, the discourse participants have available formal representations of the entire sentence and of its constituents, i.e. of “black”, “square”, “black square”, “is in the left field”, etc: λx [black( x )], λx [square( x )], . . . These expressions can be used for reconstructing the answer. First, the recognised determiner is translated into a formal representation, which is then type-shifted to the generalised quantifier λZλX [∀ x [ Z ( x ) → X ( x )]]. This quantifier can be expanded into a sentence of QL by functional application of two contextually given predicates of the type e, t. Because several predicates of type e, t are available – λx [black ( x )], λx [square( x )], . . . – the quantifier can be expanded in different ways. The recipient can therefore reconstruct several, non-equivalent messages: 1. ![∀ x [black( x ) ∧ square( x ) → left( x )]] 2. ![∀ x [ left( x ) → black( x ) ∧ square( x )]] 3. ![∀ x [black( x ) → square( x )]] 4. ![∀ x [ left( x ) → square( x )]] 5. . . .
Reconstruction of Messages 133
The reconstructable messages can be tested for their adequacy. Only one message – the first of those indicated above – satisfies the criteria of adequacy defined in Chapter 3. The recipient can reject all other reconstructable messages as inadequate. (40)
Knocking on the door – Tom.
In order to be able to interpret the utterance of example (40), the recipient must possess a representation of the knocking sound, or more precisely a representation of the action that evokes the sound. An appropriate representation would be the predicate λx [knock( x )]. The name “Tom” can be translated into a generalised quantifier and expanded into a sentence of QL with this predicate; in this way, the recipient arrives at a representation of the message that Tom is knocking: ![knock(tom)]. Let me summarise so far. I propose that instead of a list of QUADs, information states contain a list of representations of recently uttered expressions and generally observed events. During the reconstruction of a message, these representations can be used to semantically enrich the recognised expressions. As before, context-independent type-shifting must be possible. In order to create representations of exhaustive answers, the exhaustification function exh must be available in the modified model as well; the application of this function is no longer linked to the use of a QUAD. In their main characteristics, the models that I proposed for interpretation by means of QUADs and for the interpretation without QUADs are similar: both models comprise a semantics for the interpretation of recognised expressions, furthermore operations for context-independent semantic enrichment (operations for type-shifting and exhaustification) and operations for context-dependent semantic enrichment (the application of QUADs, and the expansion with representations of recently uttered expressions, respectively). Let us discuss two objections to the modification of the QUAD-model. Objection 1: by making reference to representations of already uttered sentences, it may be possible to reconstruct more messages than would be possible by just using QUADs. It is possible that distinct, non-equivalent messages can be reconstructed and that these messages satisfy the criteria of adequacy equally well: (41)
Whom does Irene not like? – Tom.
134
Accentuation and Interpretation
If the interrogative sentence in example (41) makes a QUAD available with which the constituent answer “Tom” is interpreted in the sense of “Irene does not like Tom”. If the utterance does not make a QUAD available, but instead meaning representations of “like”, “Irene” and “not”, it becomes possible to reconstruct two non-equivalent, but equally adequate messages: one in the sense of “Irene likes Tom” and one in the sense of “Irene does not like Tom”. (It is not mandatory to apply a contextually available negation operator during reconstruction.) That is, the constituent answer in example (41) may be misunderstood. Reply: the objection is correct; the model must be adapted in such a way that for the reconstruction of the answer in example (41), the negation operator must be applied. Nonetheless, a constituent answer to a question with negative polarity seems to my intuition to be less easily understood than a constituent answer to a question with positive polarity. (42)
Who does Irene like? – Tom.
The answer of example (42) is clearly to be interpreted in the sense of “Irene likes Tom”. The answer of example (41) is possibly not as easy to understand; it is more likely to evoke a clarification question: “So Irene does NOT like Tom?” The effect that a constituent answer in reply to a question with negative polarity can be more easily misunderstood may be considered desirable. It would need to be tested empirically if the effect occurs just in the model or also in actual communication. Objection 2: after the interrogative sentence in example (43) has been uttered, there are sufficient expressions available in the modified model to reconstruct the answer that Tom has a certain book. The recipient would not have to recognise any word of the answer sentence to be able to reconstruct the answer. Of course, that cannot be the case. (43)
Does Tom have the book, or does Andrew have the book? – Tom noise
Reply: it is true that the recipient can reconstruct at least two contextually adequate messages without having recognised a single word of the declarative sentence, namely one that says that Tom has a certain book and one that says that Andrew has this particular book. If the recipient recognises the word “Tom” in the answer sentence and uses it for the reconstruction of the answer, he can only reconstruct the message that Tom has the book. That is, the recipient has to recognise the name not for being able to reconstruct a message at all, but for being able to choose the intended message from the reconstructable ones. Recognition does not always have to extend
Reconstruction of Messages 135
the possibilities for reconstruction; it can also enhance intelligibility when it restricts the possibilities for reconstruction. It be given that a speaker utters a sentence, the recipient recognises some words of it and is on that basis able to reconstruct an adequate message. He accepts the message and does not utter any objection. The speaker can then update his information state with the message that he intended, and the recipient can update his information state with the reconstructed message. Ideally, the intended and the reconstructed message are equivalent. Furthermore, speaker and recipient can add new elements to their lists of representations of recently uttered expressions. So far so inconcrete: which of the linguistic expressions that have already been uttered can be used for the reconstruction of a message? The “recycling” of context material may well be restricted. A speaker knows every expression that he uttered,22 but the recipient does not necessarily recognise all these expressions. Should the speaker update his list of representations of all expressions only with representations of the accentuated expressions, which the recipient most likely recognised? And the recipient on the other hand, should he add to his list only representations of the expressions that he recognised, or also semantically enriched representations? Does the danger exist that speaker and recipient expand their lists in different ways and does that raise the probability that accentuation is misleading? How old can an utterance be before its representation can no longer be used for the reconstruction of a message? How are representations of uttered expressions deleted? Answering these questions and specifying the interpretation model accordingly will require empirical data to be obtained and assessed. I do not have such data available at present; answering these questions must therefore be left to future research. The same applies to the definition of further type-shifting operations. These operations must also be justified empirically, and an interpretation model that includes them must be evaluated experimentally. I have so far only established that further operations are needed. 22 More precisely: a speaker possesses a representation of the intended meanings of all the expressions that he uttered. He may not literally remember every expression that he uttered. It is possible that he makes mistakes that he does not notice himself. Consider the following example from Ulrich Schade (p.c.): “Das Etikett auf den grunen ¨ Weinflaschen sind spiegelverkehrt gedruckt.” (“The label on the green wine bottles are printed in mirror image.”) The subject of this sentence is syntactically singular, but conceptually plural; it is quite likely that the speaker does not even notice that he incorrectly uses the plural form of the verb, let alone that he corrects his mistake. In this case, the speaker does not literally “know” both phrases “the label on the green wine bottles” and “are printed in mirror image”, but nevertheless he possesses semantic representations of the phrases.
136
Accentuation and Interpretation
4 Context configurations for interpretation For the correct interpretation of an utterance, parts of the utterance must be recognised. Furthermore, contextual knowledge may be required to semantically enrich the recognised expressions and to judge the adequacy of the utterance. Because stress patterns can also be informative, it is beneficial if the recipient recognises which words the speaker accentuates. The more a recipient recognises from an utterance and the more contextual knowledge he has that is useful for interpretation, the higher the probability that he understands what the speaker tries to convey to him. However, understanding an utterance is still possible when the utterance is recognised only partially or if the recipient has only partial knowledge of the discourse context that the speaker presupposes. Recipients may therefore face different sets of context configurations for the interpretation of an utterance: 1. The recipient recognises all uttered words (case 1), or he does not recognise all uttered words. If he does not recognise all uttered words, he recognises at least all accentuated words (case 2), or he does not recognises some of the accentuated words (case 3). 2. The recipient notices which of the recognised words are accentuated, or he does not perceive the stress pattern. 3. By an utterance, a speaker presupposes a question. This question may be already known to the recipient or not. If the question and its QUAD are known to the recipient, he can use the QUAD to semantically enrich the recognised expression. He can moreover assess the relevance of a reconstructed message with respect to the question under discussion. We can now take a closer look at the various context configurations of reception: – Case 1.1: the recipient perceives all uttered words correctly, he distinguishes accentuated from non-accentuated words, and he knows the question under discussion. He can reconstruct the speaker’s message without making reference to the context, and he can assess whether the speaker expresses himself adequately. If the utterance does not show the optimal stress pattern given the question under discussion, the recipient can accommodate his representation of the common ground and thus “make” the stress pattern optimal. In doing this, he can make assumptions that can be classified as conversational implicatures. (Cf. Rooth’s “I passed”-example in subsection 2.5.) Under these conditions, it is possible to compensate for anomalies in accentuation and in syntax:
case 1
All accentuated words recognised?
Figure 4.4 Context configurations for interpretation
Accentuation Q recognised? Q yes s Question 1 no Q P Pq known? no P
case 2 yes 7
yes 1 Question P q yes 3 known? noPP
@ no @ R @
case 1.3 case 1.4
case 2.1 case 2.2
@
case 2.3
yes no @ R Question 1 P Pq known? no P case 3
case 1.2
Accentuation recognised? @
case 1.1
case 2.4
Reconstruction of Messages 137
yes All words recognised? C C C no C C C CW
*
yes 1 Question P Pq known? no P yes
138
Accentuation and Interpretation
(44)
Who talked to Jane? – Yves talked to JANE.
The question that is presupposed by the accentuation in the answer of example (44) is not logically subordinate to the question that is actually asked. The recipient cannot accommodate his information state so as to conform to this question. That means that the answer is not adequate: either it does not serve the purpose of answering the question, or it is not optimally accentuated. The best interpretation seems to be the one that ignores the stress pattern. (45)
Who talked to Jane? – Talked Jane YVES.
The answer sentence of example (45) is not grammatically well-formed. The recipient can interpret the sentence by ignoring all the words except the accentuated one and expanding this word with the given QUAD. Under the condition that the speaker accentuates optimally, the recipient can interpret even syntactically irregular utterances. – Case 1.2: the recipient perceives all uttered words correctly, he distinguishes accentuated from non-accentuated words, but he does not know the question under discussion. If he presupposes that the utterance was accentuated optimally, he can accommodate his representation of the common ground in line with a question that is compatible with the stress pattern. He can reconstruct the speaker’s message and exhaustify it in relation to the newly assumed question. The recipient is not able to test the adequacy of the utterance, for that he must trust the speaker. – Case 1.3: the recipient perceives all uttered words correctly, and he knows the question under discussion. He does not perceive the stress pattern, that is, he cannot distinguish between accentuated and nonaccentuated words. The recipient can reconstruct a message, exhaustify it in relation to the question under discussion and test the adequacy of the reconstructed message. He cannot determine whether the utterance was accentuated optimally or whether he needs to accommodate his representation of the common ground in order to make the accentuation optimal. As a result, he cannot draw any conversational implicatures based on the stress pattern. – Case 1.4: the recipient perceives all uttered words, but he does not recognise the stress pattern, nor does he possess a representation of the question under discussion. He can reconstruct a message, but he cannot exhaustify this message in relation to any question. He cannot carry out
Reconstruction of Messages 139
any adjustment of his representation of the common ground that is justified by reference to a stress pattern, and he cannot determine whether the message satisfies the criteria of adequacy. In this respect, he must trust the speaker. – Case 2.1: the recipient does not perceive all words correctly, but he perceives at least all accentuated words. He distinguishes the accentuated words from the non-accentuated ones, and he knows the question under discussion. The recipient can ignore the non-accentuated words and reconstruct a message by means of the given QUAD; he can exhaustify this message in relation to the question under discussion. The reconstructable messages can be judged for their adequacy. (46)
Who is waiting for Andrew? – PETER is waiting noise
The fact that some words were not recognised does not have to have any effects on interpretation. The answer of example (46) can be also interpreted without applying the given QUAD; the recipient may not even notice that he did not recognise all words. (47)
Who talked to Jane? – YVES noise to Jane
I have not defined any operations that allow the recipient to construe a semantically underspecified representation of the presupposed question on the basis of the recognised words. Let us assume that the recipient has such an operation available. He is then able to generate an underspecified representation of the question presupposed by the answer in example (47): ?[ R( x, jane)] (R denotes a relation that is still to be specified). If this representation can be specified as a representation of the actual question under discussion (?[talk ( x, jane)]), the recipient can be certain that the answer was optimally accentuated. – Case 2.2: the recipient does not recognise all words, but he does recognise all accentuated words. He recognises the stress pattern of the utterance, but does not possess a representation of the question under discussion. Possibly he can create an underspecified representation of the speaker’s message through type-shifting. In the course of the discourse, he may be able to further specify this representation: (48)
(Who talked to Jane?) YVES noise to Jane ANDREW did NOT talk to Jane.
140
Accentuation and Interpretation
If the recipient first recognises “Yves” and “to Jane”, he can assume that the speaker wanted to tell him that Yves either did, does or will do something together with Jane. The stress on “Yves” signifies that the speaker presupposes the question who is connected to Jane through this unknown activity. The next sentence is recognised completely. The recipient learns that Andrew did not talk to Jane and that the speaker presupposes that the question under discussion is who talked to Jane. By referring to this question, the recipient can specify his representation of the message uttered before so as to mean that Yves talked to Jane. – Case 2.3: the recipient does not recognise all words, but he recognises at least all accentuated words. He knows the question under discussion, but he cannot distinguish accentuated words from non-accentuated words. (49)
Who talked to Jane? a. b.
Yves noise Yves noise Jane
The reconstruction of the message proceeds as in case 2.1. The recipient can tentatively assume that he recognises exactly the accentuated words (first answer of example (49)); he can enrich these words with the given QUAD. If he recognises more than just the accentuated words and if the enrichment with the QUAD does not lead to an appropriate reconstruction, he can ignore words and try to reconstruct a message on the basis of just some of the recognised words. That is, he can reconstruct the second answer of example (49) in the sense of “Yves talked to Jane”, in the sense of “Jane talked to Jane”, or in the sense of “Yves and Jane talked to Jane”. All three reconstructions satisfy the criteria of adequacy; however, if the recipient finds it unlikely that Jane talked to herself, he will prefer the first reconstruction. In case 2.3 the interpretation without QUADs, as outlined in section 3, appears to be superior to the interpretation with QUADs. In the interpretation with QUADs it may happen that recognised expressions must be ignored; there are no clues, however, that indicate which expression must be ignored. In the interpretation without QUADs it is never necessary to ignore recognised expressions, because apart from complete QUADs, there are also “smaller” type-adequate expressions available for semantic enrichment. – Case 2.4: again the recipient does not recognise all words but he recognises at least all accentuated words. He does not recognise the stress
Reconstruction of Messages 141
pattern, and he does not possess a representation of the question under discussion. The interpretation must proceed as in case 2.2: the recipient construes an underspecified representation of the speaker’s message, which he may be able to specify further in the course of the discourse. – Case 3: the recipient does not perceive all accentuated words correctly. If the speaker accentuated his utterance optimally, it follows that the recipient does not recognise all i-critical words. As a result, he can at best understand only part of the speaker’s utterance: (50)
Who talked to Jane? a. b. c.
noise talked noise noise Jane. noise STUDENT noise
Let us assume that the intended answer in example (50) is that the French student talked to Jane. Even if the recipient knows the question under discussion, he cannot reconstruct a message when he only recognises the word “talked” (first answer). If he only recognises the name “Jane” (second answer), he can reconstruct a message in the sense of “Jane talked to Jane”, but he can immediately reject this reconstruction because of its lack of plausibility. Not rejecting this reconstruction leads to a misunderstanding. On the basis of the word “student” (third answer), a recipient can – if he knows the question under discussion – create a representation of “Some student talked to Jane”. This representation does not cover the entire meaning of the speaker’s intended message. Still, it covers the meaning partially, and can thus be informative. If the recipient does not know the question under discussion, there is no way for him to understand the speaker’s utterance. Summarising: the interpretation proceeds similarly in all cases described. The recipient tries to translate the expressions that he recognises into representations of messages by appealing to the discourse context. He then evaluates these messages with respect to his representation of the common ground. For interpretation it is not essential that the recipient knows whether he recognised all words or all accentuated words; his interpretation strategy does not have to be influenced by such knowledge. Interpreting utterances by appealing to the discourse context is a surprisingly robust process. Even when an utterance is only partially recognised, and even when the recipient has only limited knowledge of the discourse context, understanding is possible.
5 Optimal Accentuation vs Focus Accentuation The following examples, which were already given in the Introduction, show that the use conditions of a sentence can be influenced by accentuation: (1)
a.
Who did John introduce to Sue? –
b.
John introduced BILL to Sue. To whom did John introduce Bill? – John introduced Bill to SUE.
The declarative sentences of the examples can be uttered as answers to the questions indicated. The declarative sentence of example (1-a) – with stress on “Bill” – however, is not an adequate answer to the question of (1-b), nor can the declarative sentence of example (1-b) – with stress on “Sue” – be used as an answer to the question of example (1-a). On the basis of the different stress patterns the sentences require different discourse contexts (here: questions under discussion); that is, the sentences have different use conditions. Within a theory of optimal accentuation, the difference is easily accounted for: the two stress patterns are optimal in different contexts. If a recipient knows the question under discussion, he needs only recognise the word “Bill” for understanding the answer of example (1-a), so that exactly this word must be accentuated. Analogously, in example (1-b), “Sue” needs to be accentuated, because a correct understanding requires that this word be recognised. If the word “only” is added to the examples, the sentences differ not only in their use conditions, but also in their truth conditions: (2)
a. b.
John only introduced BILL to Sue. John only introduced Bill to SUE.
The sentence (2-a) with stress on “Bill” is true iff John did not introduce any person (from a given group) other than Bill to Sue. Whether John introduced Bill to other people as well has no effect on the truth value. In contrast, the sentence (2-b) with stress on “Sue” is true iff John introduced
142
Optimal Accentuation vs Focus Accentuation 143
Bill to no person (from a given group) other than Sue.1 The sentence is false if John introduced Bill to other people (from a given group) as well. The sentences (2-a) and (2-b) therefore have different truth conditions. The semantic difference is linked to the difference in stress pattern. How can semantic effects of stress patterns be accounted for? – In a theory of optimal accentuation, stress patterns do not have a direct semantic function; stress only serves to emphasise the i-critical words. The optimality of a stress pattern is determined on the basis of the specific discourse context; accentuation presupposes a certain configuration of this context. Because the context in turn can influence the interpretation of an utterance, stress can indirectly have a semantic function; its semantic effect appears as an epiphenomenon of optimal accentuation. This explanation is demonstrated below by means of the examples (2-a) and (2-b). Focus theories2 account for the semantic and pragmatic function of stress patterns in a different way. The theoretical term “focus” is introduced into the syntactic description of sentences: one or more constituents of a sentence can be foci, i.e. carry a syntactic focus feature. Stress serves to mark focused constituents, it gives information about which constituents carry a focus feature. Focus theories argue that sentences with different stress patterns have different (syntactic) focus-background structures. This syntactic difference affects interpretation. In the case of examples (2-a) and (2-b), the syntactic difference yields different use and truth conditions; in the case of examples (1-a) and (1-b), its effect is limited to the use conditions only. A theory of optimal accentuation differs from focus theories, first, in its methodology: in a theory of optimal accentuation, stress patterns are determined purely pragmatically; because pragmatic factors can affect the interpretation of an utterance, a stress pattern can influence the truth conditions of an uttered sentence. A focus theory is – compared to a theory of optimal accentuation – a more “linguistic” theory: in a focus theory, stress patterns are grammaticalised; sentences with different stress patterns differ syntactically and must therefore be interpreted differently. The semantic difference that results from the syntactic difference can be – but need not be – reflected in the truth conditions of the sentence; it certainly affects the use conditions. In a theory of optimal accentuation, the semantic difference between sentences that are accentuated differently is based on a pragmatic difference; in focus theories, a pragmatic difference is based on a semantic difference. Focus theories – unlike a theory of optimal accentuation – con1 According to some focus theories, the sentence can have other readings. I will discuss the existence of other readings below. 2 For an overview, see Stechow (1989a), Rooth (1996a), Krifka (1996).
144
Accentuation and Interpretation
form to the pipeline model of speech processing, which states that a sentence is first analysed syntactically, that the syntactically analysed sentence is then interpreted semantically – establishing among other things the truth conditions of the sentence – and that finally the result of interpretation is analysed pragmatically in order to establish the use conditions. Secondly, a theory of optimal accentuation differs from focus theories with respect to the empirical data: both types of theories make predictions about how utterances must be and therefore are accentuated in given discourse contexts, and about how utterances with given stress patterns must be and therefore are interpreted in discourse contexts. Both types of theories determine a relation between stress patterns and interpretations. However, they do not specify the same relation. By means of sentences with given stress patterns and given interpretations it can be tested which relations are correct; that is, the theories can be empirically evaluated. This chapter is structured as follows: in section 1, it is shown how semantic effects of stress patterns are accounted for by a theory of optimal accentuation and by focus theories. In section 2, it is shown that a theory of optimal accentuation and focus theories make contradictory predictions about stress patterns. Such predictions are evaluated. In section 3, the results are summarised.
1 Semantic effects of accentuation In the current section, I first show how the semantic effects of stress can be described as epiphenomena of optimal accentuation. To give some examples, I explicate several interpretations, for which I presuppose the implicit adjustments of the lexicon and grammar specified in the previous chapter. Secondly, I illustrate the focus-theoretic explanation of accentuation effects. Thirdly, I sum up the important differences of the optimal accentuation account and the competing focus-theoretic account. 1.1
Semantic effects of optimal accentuation
Through accentuation, a speaker presupposes a discourse context; in this context, it should be possible for a recipient to understand the utterance even when he only recognises the accentuated words. According to the model defined in the previous chapter, part of the discourse context is the question under discussion; the speaker’s utterance must be interpreted with respect to this question. If a recipient recognises the speaker’s entire utterance – i.e. all words, including the stress pattern – he can infer what kind of question the speaker presupposes and he can accommodate his
Optimal Accentuation vs Focus Accentuation 145
representation of the common thematic knowledge accordingly. A stress pattern need not lead to exactly one possible presupposed question. If it is compatible with different, non-equivalent background questions, a recipient cannot identify a specific presupposed question, but he can restrict the class of possible presupposed background questions. That is, the stress pattern provides him at least with a constraint for accommodating his representation of the common thematic knowledge, and for the interpretation of the given utterance against the background of this knowledge. Let us consider two examples that show how the interpretation of a sentence can be influenced by the presupposition of a question. Example (3) is repeated from example (2-a) with stressed “only”: (3)
John ONLY introduced BILL to Sue.
The stress pattern of sentence (3) is only optimal if it is under discussion which persons (from a certain group) were introduced by John to Sue. By uttering this sentence such a question is presupposed. Remember that “John only introduced Bill to Sue” must denote an answer to the question that is presupposed to be under discussion. This answer must be reconstructable from “only” and “Bill” alone. The reconstruction is not possible by the mere application of meaning enrichment operations that do not refer to the discourse context. Discourse material must be taken into account. Let us assume the QUAD-model: for the reconstruction of the answer that John only introduced Bill to Sue from “only” and “Bill” alone, a QUAD of the form λx [introduce( john, x, sue)], RESTRICTOR must be available. The variable RESTRICTOR which replaces a fully specified domain-restrictor represents a predicate of the type e, t, e.g. λx [ person( x )] or λx [ person( x ) ∧ male( x )]. If RESTRICTOR is substituted with the predicate λx [ person( x )], the QUAD corresponds to the question who John introduced to Sue; the question domain comprises all persons, without further restrictions. If RESTRICTOR on the other hand is substituted with λx [ person( x ) ∧ male( x )], the QUAD represents the question which man John introduced to Sue; the question domain does not comprise all persons, but only all men. By the stress pattern of sentence (3) some question that can be represented by a fully specified instance of the QUAD given above is presupposed. The stress pattern is therefore compatible with several background questions; but these questions only differ in their domain restriction. I interpret “only” in the sense of the exhaustification operator exh that was defined in Chapter 4.3 The operator can be applied to the interpreta3
Cf., for example, Zeevat (2002).
146
Accentuation and Interpretation
tion of “Bill” – the quantifier λZλX [ Z (bill ) ∧ X (bill )],4 so that one obtains the formal representation of “only Bill”: λZλX [ Z (bill ) ∧ X (bill ) ∧ ∀ x [ Z ( x ) ∧ X ( x ) → x = bill ]] This quantifier can be expanded into a sentence of QL by means of the QUAD of the presupposed question: ![ RESTRICTOR(bill ) ∧ introduce( john, bill, sue) ∧
∀ x [ RESTRICTOR( x ) ∧ introduce( john, x, sue) → x = bill ]] This means that the sentence (3) is interpreted in the sense that John introduced only Bill and no other person (from a group that is still to be specified) to Sue. Let us now assume that instead of “Bill”, “Sue” is accentuated: (4)
John ONLY introduced Bill to SUE.
In contrast to sentence (3), which was just interpreted, sentence (4) is only optimally accentuated if it is under discussion to which person (from a given group) John introduced Bill. By uttering this sentence, it is presupposed that such a question is under discussion and that a corresponding QUAD of the form λx [introduce( john, bill, x )], RESTRICTOR is available. Again, the variable RESTRICTIOR represents a predicate of the type e, t, e.g. λx [ person( x )] or λx [ person( x ) ∧ f emale( x )]. Against the background of the presupposed question, sentence (4) can be interpreted as follows: first, representations for the meanings of “only” and “Sue” are taken from the lexicon; these representations are then combined into a representation of the meaning of “only Sue”. The resulting quantifier is expanded into a sentence of QL by means of the QUAD of the presupposed question. This yields: ![ RESTRICTOR(sue) ∧ introduce( john, bill, sue) ∧
∀ x [ RESTRICTOR( x ) ∧ introduce( john, bill, x ) → x = sue]] According to this, (4) is interpreted in the sense that John introduced Bill only to Sue and not to any other person (from a group that is either already given or still to be specified). This interpretation is not equivalent to the interpretation of example (3). 4 As explained in Chapter 4, I take generalised quantifiers to be expressions of type e, t, e, t, t. Z is a variable that is to be instantiated by a predicate for domain restriction.
Optimal Accentuation vs Focus Accentuation 147
Let me take stock. The example sentences (3) and (4) are identical except for their stress patterns. Therefore the different interpretations are to be ascribed to the different stress patters. The examples show that is it possible to presuppose different background questions through different accentuations of the same sentence, and thus to force different interpretations. In both the examples (3) and (4) the adverb “only” is accentuated. However, the sentences must also be interpreted differently when “only” does not carry stress: (5)
a. b.
John only introduced BILL to Sue. John only introduced Bill to SUE.
How can the difference in meaning of the example sentences, in which “only” is not accentuated, be accounted for? I interpret “only” as the exhaustification operator exh. According to the rules defined in Chapter 4 for the reconstruction of messages, a recipient can freely apply the exhaustification operator to create the representation of an exhaustive answer; the application of the operator does not have to be forced by uttering the word “only”. Let us assume that the question under discussion is whom John introduced to Sue. The recipient only recognises the accentuated word “Bill” in the speaker’s answer. He can interpret the speaker’s utterance without applying the exhaustification operator as a partial answer in the sense of “John introduced Bill to Sue”. Alternatively, he can apply the exhaustification operator and interpret the utterance as an exhaustive answer in the sense of “John introduced only Bill to Sue”. That is, the recipient does not necessarily have to recognise the word “only” in order to correctly interpret the example sentence (5-a) as an answer to the question whom John introduced to Sue; he can compensate for not recognising “only” by applying the exhaustification operator. This means that the stress pattern of example sentence (5-a) can be optimal in relation to the question whom John introduced to Sue. Therefore the sentence can be interpreted as an answer to this question – just like the example (2-a) (with stress on “only”). Analogously, the sentence (5-b) (with stress on “Sue”) can be interpreted in the same way as example (2-b) (with stress on “only” and on “Sue”). A recipient applies the exhaustification operation freely and in that way interprets an utterance as an exhaustive answer if he has no expectations that contradict the exhaustive answer and if he assumes that the cooperative speaker possesses enough evidence to provide the exhaustive answer. Under these circumstances he does not need to recognise the word “only” in order to interpret the examples (5-a) and (5-b) correctly. Accordingly,
148
Accentuation and Interpretation
(a)
(b)
Figure 5.1 Reference worlds for the “only”-experiment
a speaker does not have to stress “only” if he assumes that these circumstances apply – optimally, he only accentuates those words whose nonrecognition a recipient cannot compensate for. Under these circumstances, the word “only” is not i-critical but pragmatically redundant. Excursus: Whether or not a speaker accentuates “only” depends on his assessment of the recipient’s disposition toward applying the exhaustification operator. If the speaker can expect the recipient to apply the exhaustification operator, he does not need to accentuate “only”; if not, he must accentuate it. I assume that the accentuation of “only” is associated with an expectation on the part of the recipient that the speaker cannot or will not give an exhaustive answer – or rather with a suspicion on the part of the speaker that the recipient has such an expectation. My assumption is confirmed by the results of an experiment that is described by Schmitz (2006) and which I discuss here in short: recordings of dialogue (6-a) and (6-b) are presented together with Figures 5.1(a) and 5.1(b) to 29 test persons. The test persons are first year students at the former Institute of Communication Research and Phonetics at the University of Bonn (Germany) who attend an introductory course in computational linguistics. 14 test persons are native speakers of German, 15 are non-native speakers. The reason for the suboptimal number of test persons is that exactly 29 students – unfortunately not more – attend the course. The experiment is conducted in German. I assume that the German word “nur” behaves like “only” with respect to interpretation and accentuation.
Optimal Accentuation vs Focus Accentuation 149
(6)
a.
(i)
(ii)
b.
(i)
(ii)
Was ist im LINKEN Feld? – Nur das QUADRAT ist im linken Feld. (What is in the LEFT field? – Only the SQUARE is in the left field.) Was ist im LINKEN Feld? – NUR das QUADRAT ist im linken Feld. (What is in the LEFT field? – ONLY the SQUARE is in the left field.) Was ist im LINKEN Feld? – Nur das DREIECK ist im linken Feld. (What is in the LEFT field? – Only the TRIANGLE is in the left field.) Was ist im LINKEN Feld? – NUR das DREIECK ist im linken Feld. (What is in the LEFT field? – ONLY the TRIANGLE is in the left field.)
The left Figure 5.1(a) is shown as a reference world for the recordings of the first dialogue (6-a). The right Figure 5.1(b) is shown as a reference world for the recordings of the second dialogue (6-b). Before the recordings are played, the test persons are explained that the questioner does not know which objects are in which fields, but that the respondent is fully informed about the positions of the objects. Before the recordings of the second dialogue (6-b) are played, the test persons are furthermore told that, although the questioner does not know the positions of the objects, he expects that the objects are more or less equally distributed over the two fields. The test persons have to judge which of the recorded answers they consider more adequate in each case – the one without stress on “only” or the one with stress on “only”. The experiment is a forced choice experiment, i.e. the test persons have to choose one of the answers in each case. Before the recordings of the first dialogue (6-a) are played, nothing is said about the expectations of the questioner. I therefore assume that accentuating “only” and not accentuating “only” are equally acceptable; that is, I do not expect a clear preference for one of the recordings. For the second dialogue, however, I assume that there is a tendency to prefer the recording with a stress on “only”: the questioner expects that the objects are more or less equally distributed over both fields; the information that only one object – namely the triangle – is in the left field contradicts his expectation. Therefore, the respondent cannot expect that the questioner will understand the answer as an exhaustive answer unless he recognises “only”. Therefore, “only” has
150
Accentuation and Interpretation
to be accentuated. After all, I assume a correlation between the tendency to prefer stress on “only” and the information about the expectations of the questioner: I expect a significantly higher preference for stress on “only” in the second dialogue (6-b) than in the first (6-a). The results of the experiments confirm my expectations: the answers of the first dialogue (6-a) are both considered more or less equal; 16 test persons prefer stress on “only”, the other 13 test persons consider the answer better without stress on “only”. The tendency to prefer accentuating “only” is stronger in the second dialogue (6-b), in which the exhaustivity of the answer contradicts the expectations of the questioner. Twenty-one test persons prefer stress on “only” – especially all 14 native speakers of German do. Only 8 test persons feel that in the second dialogue “only” need not be accentuated. The one-sided t-test for comparing the null hypothesis (The accentuation of “only” and the exhaustivity that contradicts the expectations of the recipient are not correlated) with my alternative hypothesis (There is a stronger preference for accentuating “only” when “only” contradicts an exhaustivity expectation.) yields a p-value of < 0.05. (End of excursus) Let me take stock. The stress patterns of the sentences (5-a) – “John only introduced BILL to Sue” – and (5-b) – “John only introduced Bill to SUE” – can be optimal with respect to the questions who John introduced to Sue, and to whom John introduced Bill, respectively, even if “only” is not accentuated. Uttering the sentences can presuppose the questions, which can thus influence the interpretation of the sentences. That is, the sentences (5-a) and (5-b) without stress on “only” can be interpreted analogously to the sentences (3) and (4) with stress on “only”. Three objections to the analysis of the example sentences without stress on “only” must be discussed. Objection 1: the explanation that the example sentences without accentuation of “only” can be analysed analogously to the sentences with accentuation of “only” presupposes that “only” denotes the exhaustification operator. “Only” is not always used as an exhaustification operator, however. Let us assume that we have a field divided into a left and a right field; three squares – a red one, a black one and a red and black striped one – are distributed over both fields: (7)
Albert: Which squares are in the left field? Barbara: Only the square that is only red is in the left field.
In Barbara’s utterance only the first “only” serves to exhaustify the answer, not the second “only”. The example (7) shows that “only” is not always used to exhaustify an answer.
Optimal Accentuation vs Focus Accentuation 151
Reply: admittedly, some modifications would be necessary in order to ascribe an exhaustifying function to the second “only”. That is not required, however. I claim that “only” denotes the exhaustification operator exh; I do not claim that “only” is always used to exhaustify an answer. If “only” is used to exhaustify an answer, it may be pragmatically redundant, because a recipient can apply the exhaustification operator independently, without having recognised the word “only”. When “only” is not used to exhaustify an answer and cannot be reconstructed through some contextual cue, it is not pragmatically redundant and must be accentuated. The second “only” in the answer sentence of example (7) is not pragmatically redundant; I therefore expect it to be accentuated: (8)
Albert: Which squares are in the left field? Barbara: Only the square that is ONLY RED is in the left field.
For a correct reconstruction of the answer, “only” and “red” must be recognised. The meaning representation of “only” can be applied to that of “red”; this yields a predicate in the sense of “only red”.5 This predicate can be transformed into a generalised quantifier with a standard type-shifting operation; the quantifier has the meaning of “an object that is only red” or 5 In Chapter 4, I defined the exhaustification operator only for application to several generalised quantifiers. Here the operator must be applied to the predicate λx [red( x )], that is to a predicate of type e, t. “Only red” denotes the property of having only one of a given set of properties, namely, being red: every object that is “only red” has other properties (e.g. the property of being extended in space). The domain of properties to be taken into account must be restricted. In order to be able to restrict this domain, I represent the meaning of “only red” as follows (I represent the domain restrictions by an argument that is to be applied to the term):
exh(λx [red( x )]) ⇔ λZλy[ Z (λx [red( x )]) ∧ red(y) ∧ ∀ X [ Z ( X ) ∧ X (y) → X = λx [red( x )]]] The variable Z stands for a predicate that restricts the set of properties to which λx [red( x )] belongs and of which λx [red( x )] is the only one that applies to the as yet unspecified object y. In the context given in example (8) there is no predicate available that can be substituted for the variable Z. I tentatively assume that there is a standard rule for determining such predicates. Here, an obvious result of such a rule would be to substitute Z with the predicate λY [colour (Y )]. This yields a representation of the property of being red and not having any other colour: λy[colour (λx [red( x )]) ∧ red(y) ∧ ∀ X [colour ( X ) ∧ X (y) → X = λx [red( x )]]] Because it is generally known that red is a colour, the representation can be abbreviated: λy[red(y) ∧ ∀ X [colour ( X ) ∧ X (y) → X = λx [red( x )]]]
152
Accentuation and Interpretation
of “the object that is only red”.6 The quantifier can now be exhaustified7 and expanded into a representation of the exhaustive answer by means of the QUAD for the question that was asked: only a square that is only red is in the left field.8 This shows that although the second “only” of example (8) is not used to explicitly exhaustify the answer, it can still denote the exhaustification operator exh. I distinguish between the meaning and the use of the adverb “only”, and therefore I need not assume that the adverb always serves the purpose of exhaustifying an answer. Nonetheless I assume that it usually does. The use of “only” in example (8) (or (7) is, I believe, rather unusual. Objection 2: Horn (1969) distinguishes between a propositional part and an assertional part in the meaning of “only”. Uttering “John only introduced BILL to Sue” presupposes that John introduced Bill to Sue and asserts that John introduced no one other than Bill to Sue. The interpretation of “only” as the exhaustification operator does not take the distinction of the presuppositional part and the assertional part into account. Reply: the distinction between a presuppositional and an assertional part should not be accepted without reservations. According to Horn (1969), the answer in example (7) – “Which squares are in the left field?” “Only the red square is in the left field.” – presupposes that the red square is in the left field. However, this is exactly something that the speaker cannot presuppose in the given situation; the distinction between presupposition and assertion is not convincing in this case.9 Regardless of this reservation, the distinction of a presuppositional and an assertional part of “only” is compatible with my explanation of the semantic effects of stress patterns. My explanation requires that if a non-accentuated “only” is not recognised, it 6
Cf. definition 4-8 in Chapter 4, subsection 2.3:
λZλY [∃y[ Z (y) ∧ red(y) ∧ ∀ X [colour ( X ) ∧ X (y) → X = λx [red( x )]] ∧ Y (y)]] 7
Cf. definition 4-7 in Chapter 4, subsection 2.2:
λZλY [∃y[ Z (y) ∧ red(y) ∧ ∀ X [colour ( X ) ∧ X (y) → X = λx [red( x )]] ∧ Y (y) ∧ ∀ x [ Z ( x ) ∧ Y ( x ) → x = y]]] 8
Cf. definition 4-9 in Chapter 4, subsection 2.3:
![∃y[square(y) ∧ red(y) ∧ ∀ X [colour ( X ) ∧ X (y) → X = λx [red( x )]] ∧ le f t(y) ∧ ∀ x [square( x ) ∧ le f t( x ) → x = y]]]] 9 Besides, Horn himself revised his semantics of “only”. Cf. Atlas (2002) and Atlas and Horn (2002), among others.
Optimal Accentuation vs Focus Accentuation 153
can be compensated for by independent application of the exhaustification operator. The exhaustive interpretation of a sentence that does not take “only” into account must be sufficiently similar to an interpretation that does take “only” into account; that is both interpretations must alter the common ground of the discourse participants in the same way. In order to alter the common ground in the same way the interpretations do not have to be equivalent. (The interpretations would be equivalent if they update every information state – and not just the current one – in the same way.) Since I allow for the accommodation of common ground representations, the interpretations especially do not have to be divided into the same presuppositional and assertional parts. Objection 3: take another look at the standard example. (9)
John only introduced BILL to Sue.
Just to remember: I claim that uttering sentence (9) presupposes the question who John introduced to Sue. The sentence is interpreted as an answer to this question; accordingly, it means that John introduced only Bill to Sue. In order to be able to assume that the question mentioned is presupposed, it must be assumed that “only” is used as an exhaustifier. This corresponds to the common usage of “only”, as I assumed. Another usage is not common, but possible. Turning to the objection: because “only” does not necessarily have to be used as an exhaustifier, it is possible that uttering sentence (9) presupposes another question than the one mentioned. For example, it is possible that the question who John introduced only to Sue is presupposed. Sentence (9) must then be interpreted in such a way that John introduced Bill only to Sue. If the uttering of sentence (9) can presuppose different, non-equivalent questions, the sentence can be interpreted in different, non-equivalent ways – in spite of the accent on “Bill”. This means that the accent on “Bill” cannot determine meaning; a theory of optimal accentuation fails to account for the semantic effect of the accentuation of “Bill”. Reply: no, the theory does not fail. In the following example (10), the utterance “John only introduced BILL to Sue” answers the explicitly stated question who John introduced only to Sue: (10)
I heard that John introduced each man only to one woman. Who did he only introduce to SUE? – John only introduced BILL to Sue.
In general, the preferred reading of example (10) will be an exhaustive answer according to which John introduced only Bill to Sue. Much less
154
Accentuation and Interpretation
probable than the exhaustive reading, it seems to me, is the reading in which the sentence is interpreted as a non-exhaustive answer in the sense of “John introduced Bill only to Sue”. – Why? – Usually, “only” is used to exhaustify an answer. In order to interpret “John only introduced BILL to Sue” as a non-exhaustive answer (in the sense of the reading that I consider less probable), it must be assumed that “only” does not have its usual function. There must be a reason to make such an assumption, and no such reason is available in example (10). Interjection: that is, uttering the answer sentence preferably answers the question who John introduced to Sue, and not the explicitly stated question who John only introduced to Sue? – That does not matter. After the common ground of the discourse participants has been updated with “I heard that John introduced each man only to one woman” both questions are equivalent with respect to the common ground; an answer to one question is automatically an answer to the other. In example (11) there is a reason for the assumption that “only” does not serve to exhaustify the answer: (11)
I heard that John introduced each man to only one woman. Who did he only introduce to SUE? – John only introduced BILL to Sue, and he also only introduced TOM to Sue.
If “only” would exhaustify the answer in example (11), the answer would be contradictory: John would have introduced only Bill and additionally only Tom to Sue. If one assumes that the speaker is saying the truth, one has to interpret the answer sentence in such a way that John introduced Bill and Tom only to Sue. I believe that this interpretation is possible.10 Nonetheless I take the answer sentence to be confusing. It is not necessary at all to use “only” contra its purpose as exhaustifier; the answer sentence would be clearer and easier to understand if “only” had simply been left out:11 (12)
I heard that John introduced each man to only one woman. Who did he only introduce to SUE? – John introduced BILL to Sue, and he also introduced TOM to Sue.
10 Beaver and Clark (2003) have a different opinion: cf. subsection 2.1 of the present chapter. An experiment to test the contradicting hypotheses remains to be done. 11 The interrogative sentence would also be clearer and easier to understand without “only”. However, I am not discussing the interpretation and adequacy of the interrogative sentence, but of the answer sentence.
Optimal Accentuation vs Focus Accentuation 155
Because of the unusual way in which “only” is used, the answer in example (11) conflicts with the Gricean maxim of manner.12 I believe that although the answer of example (11) can be understood, it is pragmatically inadequate; it is not clear, and difficult to understand: “John only introduced BILL to Sue” is preferably interpreted as the exhaustive answer to the question who John introduced to Sue, because for every other interpretation it must be assumed that “only” is used in a pragmatically inadequate way. Let me take stock. In order to account for the semantic effect of stress patterns in connection to “only”, a theory of optimal accentuation assumes that the common use of “only” is to exhaustify an answer.13 Deviations from common usage must somehow be motivated. If there is no reason for such a deviation, the recipient of the phrase “John only introduced BILL to Sue” can assume that “only” has its usual function and that the sentence denotes an exhaustive answer to the question who John introduced to Sue. Accordingly, the recipient can interpret the sentence to mean that John introduced only Bill to Sue. How can “John only introduced Bill to Sue” be interpreted when no information about the accentuation is available, and the question under discussion is not known? (This situation occurs, for example, when the sentence is the first sentence of a written text.) (13)
John only introduced Bill to Sue.
“Only” marks that “John introduced Bill to Sue” (without “only”) is to be interpreted as an exhaustive answer. Regardless of which question is to be answered, the recipient understands that John introduced Bill to Sue. Exhaustification operators are generally applied to constituent answers; an exhaustification operator denoted by “only” is applied to the meaning of an expression in the domain of “only”, i.e. to the meaning of an expression that is locally c-commanded by “only”. Therefore, some expression or sequence of expressions in the domain of “only” must be able to function as a constituent answer to the question that the sentence is to answer. That is, uttering sentence (13) exhaustively answers one of the following questions: 1. the question who (from a restricted set of people) John introduced to Sue – ?[introduce( john, x, sue) ∧ RESTRICTOR( x )] – e.g. “Which men 12 In contrast, the uncommon use of “only” in example (7) – “Which squares are in the left field?” “Only the square that is only red is in the left field.” – does not contradict the modality of manner. 13 Therefore, “only” is normally “associated” with an accentuated word in its domain. (See below, subsections 1.2 and 2.1.)
156
Accentuation and Interpretation
did John introduce to Sue?”. The answer sentence must be accentuated as follows: “John only introduced BILL to Sue.” 2. the question to whom (from a restricted set of people) John introduced Bill – ?[introduce( john, bill, x ) ∧ RESTRICTOR( x )] – e.g. “To which women did John introduce Bill?”. The answer sentence must be accentuated as follows: “John only introduced Bill to SUE.” 3. the question who (from restricted sets) John introduced to each other – ?[introduce( john, x, y) ∧ RESTRICTOR1 ( x ) ∧ RESTRICTOR2 (y)] – e.g. “Which men did John introduce to which women?”. The answer sentence must be accentuated as follows: “John only introduced BILL to SUE.” 4. the question which relations (from a restricted set of relations) John has to Bill and Sue – ?[ X (sue)(bill )( john) ∧ RESTRICTOR ( X )] – e.g. “What did John do to Bill and Sue?”. The answer sentence must be accentuated as follows: “John only INTRODUCED Bill to Sue.” 5. the question which relations (from a restricted set) John has to Sue – ?[ X (sue)( john) ∧ RESTRICTOR ( X )] – e.g. “What did John do to Sue?”. The answer sentence must be accentuated as follows: “John only INTRODUCED BILL to Sue”. 6. the question which relations (from a restricted set) John has to Bill – ?[ X (bill )( john) ∧ RESTRICTOR ( X )] – e.g. “What did John do to Bill?”. The answer sentence must be accentuated as follows: “John only INTRODUCED Bill to SUE.” 7. the question which properties (from a restricted set of properties) John has – ?[ X ( john) ∧ RESTRICTOR ( X )] – e.g. “What did John do?”. The answer sentence must be accentuated as follows: “John only INTRODUCED BILL to SUE.”14 The questions 4–7 can only be correctly answered by “John only introduced Bill to Sue” if the question domains are restricted. Otherwise the sentence is false as an answer to these questions: John certainly stands in other relations to Bill and Sue apart from the introducing relation; trivially, for example, he stands in a relation of not being identical to them. Furthermore, John has other properties than of having introduced Bill and Sue; trivially, he has the property of being identical to himself. “John only 14 According to focus theories “introduced” and possibly even “Bill” do not have to be accentuated. I discuss the different accentuation predictions in subsection 2.2.
Optimal Accentuation vs Focus Accentuation 157
introduced Bill to Sue” can be a true answer to questions 1–3 even if the question domains are not restricted; the domains of these questions may be restricted, but do not have to be. The question which properties John has – that is, question 7: ?[ X ( john) ∧ RESTRICTOR ( X )] – is semantically the strongest of the seven questions above. If the variables RESTRICTOR and RESTRICTOR are instantiated in such a way that RESTRICTOR( x ) and RESTRICTOR ( X ) are tautologies for any x and X, then question 7 entails the other six questions – an exhaustive answer to question 7 gives an exhaustive answer to each of the other questions.15 Furthermore the question can be made equivalent to any instance of any of the other questions by instantiating RESTRICTOR . For example, it can be made equivalent to an instance of “Which man did John introduce to Sue?” – question 1: ?[introduce( john, x, sue) ∧ person( x ) ∧ male( x )] – by instantiating RESTRICTOR with the predicate λY [Y = λx [introduce( x, y, sue) ∧ person(y) ∧ male(y)]]: ?[ X ( john) ∧ λY [Y = λx [introduce( x, y, sue) ∧ person(y) ∧ male(y)]]( X )]
⇐⇒ ?[introduce( john, y, sue) ∧ person(y) ∧ male(y)] That is, “John only introduced Bill to Sue” can in any case be interpreted as an exhaustive answer to the question which properties (of a not yet defined set of properties) John has. The set of interesting properties under discussion that John could have is unspecified; the only property that is certainly part of it is the property of having introduced Bill to Sue. By determining the properties under discussion, “John only introduced Bill to Sue” can be specified as an exhaustive answer to any of the other six questions named above. Therefore, it is possible to interpret the sentence independently of the stress pattern in such a way, first, that John introduced Bill to Sue, and, secondly, that of a still undefined set of properties, John only has the property of having introduced Bill to Sue. I represent the interpretation as follows: ![ RESTRICTOR (λx [introduce( x, bill, sue)]) ∧ introduce( john, bill, sue)
∧ ∀ X [ RESTRICTOR ( X ) ∧ X ( john) → X = λx [introduce( x, bill, sue)]]] The representation is underspecified but can be fully specified by instantiating the variable RESTRICTOR . It strongly resembles the representation that is proposed in alternative semantics as will be shown in the following subsection. 15
Cf. Chapter 3, subsection 3.2.
158
Accentuation and Interpretation
Let me take stock. For the sentence “John only introduced Bill to Sue”, which has an unspecified stress pattern, an underspecified meaning representation is generated; this representation is made more concrete through reference to a given background question. 1.2
Semantics of the focus feature
Focus theories use the theoretical term “focus” to explain semantic and pragmatic effects of accentuation: an accent marks a focus; a focus is an expression that carries the syntactic focus feature; and the focus feature influences the meaning of the expression that carries it. Sentences with different stress patterns have different foci, that is, different syntactic structures. For that reason they can have different truth conditions and use conditions. For a focus-theoretic description, three types of rules are required: 1. Syntactic rules that determine which expressions can carry a focus feature, or rather how the focus feature is assigned to an expression. 2. Phonological rules that determine how focused expressions are marked through stress. 3. Semantic rules that determine how the focus feature is to be interpreted, and what effect focusing – i.e. assigning a focus feature to one or more constituents – has on the truth and use conditions of sentences. In the present section, I am concerned with the semantics of the focus feature, i.e. with the third kind or rules. For the sake of simplicity, I assume that every syntactically well-formed expression can be focused. That is, let a collection of syntactic rules of the following schema be given, in which the variable C can be replaced with the symbol for any word class or phrase type, and in which the subscripted “F ” denotes the focus feature:
[C ] F → C In this section, I only discuss example sentences in which individual words are focused. I assume that exactly these words are accentuated – the standard assumption in focus theories. I will discuss foci that consist of multiple words in the following section. The most influential theories of focus semantics are structured meaning theory and alternative semantics.16 I illustrate these theories with the examples (14-a) and (14-b): 16 For structured meaning theory cf. a.o. Krifka (1992a); for alternative semantics cf. a.o. Rooth (1992). There are other theories of focus interpretation, e.g. (contd)
Optimal Accentuation vs Focus Accentuation 159
(14)
a. b.
John only introduced [BILL]F to Sue. John only introduced Bill to [SUE]F .
1. According to structured meaning theory, the sentences (14-a) and (14-b) are interpreted as follows: in both sentences, the domain of “only” consists of a focus (“Bill” and “Sue”, respectively) and a background that is formed out of the constituents without focus feature (“introduced . . . to Sue” and “introduced Bill to . . . ”, respectively). Separate semantic representations are generated for the foci and the backgrounds, so that for each sentence a pair consisting of a background representation and a focus representation is obtained:17
λxλy[introduce(y, x, sue)], bill λxλy[introduce(y, bill, x )], sue The first of these pairs represents the meaning of “introduced [BILL] F to Sue”; the second represents the structured meaning of “introduced Bill to [SUE]F ”. In the background representations, the position of the focus is occupied by a lambda-bound variable (here x). Functional application of the background representations to the focus representations yields a “destructured” representation in which the focus representation takes the position of the place-holder variable. The meaning of the focus feature is not represented in the destructured representation: λy[introduce(y, bill, sue)] “Only” denotes a so-called focus operator (represented as only) that requires a structured meaning as argument. A structured meaning is only available as argument if at least one expression in the domain of “only” is focused; the adverb “only” therefore requires at least one focus in its domain. The operator only is applied to the structured meaning representation of the domain of “only”. This application resolves the structure of the argument: theories based on the syntactic movement of foci (Chomsky (1977)), and furthermore in-situ binding semantics (Wold (1998)) and unification-based replacive theories (Pulman (1997)). (The terms “in-situ binding semantics” and “replacive theory” come from Krifka (1996).) Structured meaning theory and alternative semantics can be considered paradigmatic for focus semantics; the other theories do not deviate from them in any fundamental sense that is relevant in the present discussion. 17 For sake of simplicity, I represent names as expressions of type e. Names can also be represented as generalised quantifiers, and be integrated in structured meaning representations. Cf. Krifka (1993).
160
Accentuation and Interpretation
only(λxλy[introduce(y, x, sue)], bill )
⇐⇒ λy[introduce(y, bill, sue) ∧
∀z ∈ ALT(bill )[introduce(y, z, sue) → z = bill ]] only(λxλy[introduce(y, bill, x )], sue)
⇐⇒ λy[introduce(y, bill, sue) ∧
∀z ∈ ALT(sue)[introduce(y, bill, z) → z = sue]] Finally, the resulting terms are connected to the semantic representation of the subject “John”; this yields semantic representations of the entire example sentences (14-a) and (14-b): ![introduce( john, bill, sue) ∧
∀z ∈ ALT(bill )[introduce( john, z, sue) → z = bill ]] ![introduce( john, bill, sue) ∧
∀z ∈ ALT(sue)[introduce( john, bill, z) → z = sue]] The terms “ALT(bill )” and “ALT(sue)” denote sets of alternative entities for the persons Bill and Sue, respectively. Accordingly, sentence (14-a) – “John only introduced [BILL]F to Sue” – means, first, that John introduced Bill to Sue and, secondly, that John did not introduce any alternative entity to Sue. Sentence (14-b) – “John only introduced Bill to [SUE]F ” – on the other hand means, first, that John introduced Bill to Sue and, secondly (and here is the difference with (14-a)), that John did not introduce Bill to any alternative entity other than Sue. Let me take stock. The focus feature structures meaning representations; the meaning representations of the focused expressions are separated from the meaning representations of the background expressions. This enables a focus operator such as only to refer to the meaning representations of the focus and of the background separately. 2. The interpretation in the alternative semantics framework goes as follows: for each expression two meanings are derived – a normal meaning and a focus meaning. The focus feature influences the focus meaning of
Optimal Accentuation vs Focus Accentuation 161
an expression that carries a focus feature or that contains an expression with a focus feature, but it does not influence the normal meaning of this expression. The normal meanings of “John only introduced [BILL]F to Sue” (14-a) and “John only introduced Bill to [SUE]F ” (14-b) are therefore identical. They are represented as follows: ![introduce( john, bill, sue) ∧
∀ X ∈ C [ X ( john) → X = λx [introduce( x, bill, sue)]]] If the focusing is not taken into account, “John only introduced Bill to Sue” means, first, that John introduced Bill to Sue, and, secondly, that from a given set of properties C John possesses only the property of having introduced Bill to Sue. Trivially, John has other properties as well, e.g. the property of being identical with himself. The set C therefore must not contain all possible properties; it must be restricted. To further specify and thus restrict the set C, alternative semantics refers to the focus meanings of “introduced [BILL]F to Sue” and “introduced Bill to [SUE]F ”. – What is a focus meaning? (a) The focus meaning of an expression that does not carry the focus feature is a set that contains only one element, namely the normal meaning of the expression. The focus meaning of “Sue” is the set {sue}, whose only element is the person Sue.18 (b) The focus meaning of an expression that carries the focus feature is the set of all objects that are of the same type as the normal meaning of the focused expression. That is, the focus meaning of “[Sue] F ” is the set { x | x ∈ D e };19 the person Sue (sue) is one element but not the only element of the focus meaning of “[Sue]F ”. (c) The focus meaning of a complex expression is the set of objects that can be formed by combining the objects in the focus meanings of the constituents of this expression: let e.g. A · B be a complex expression consisting of the constituents A and B, and let the normal meaning representation of A · B be generated by applying the normal meaning representation of A to the normal meaning representation of B. Let α and β be the focus meanings of A and B. Now, the focus 18 According to Rooth (1992), names do not denote generalised quantifiers, but expressions of type e. It is possible to adjust the semantics in such a way that names denote generalised quantifiers. For the sake of simplicity, I will not do this. 19 D is the domain of a given model; D e is the set of all entities. Cf. Appendix, section 2.
162
Accentuation and Interpretation
meaning of A · B is the set { a(b) | a ∈ α ∧ b ∈ β}. The normal meaning of a complex expression is one element but usually not the only element of its focus meaning. The focus meanings of the verb phrases “introduced [BILL]F to Sue” and “introduced Bill to [SUE]F ” are the following sets of properties:
{λx [introduce( x, y, sue)] | y ∈ D e } {λx [introduce( x, bill, y)] | y ∈ D e } These focus meanings can be used to restrict the domain of quantification of only. Let the domain C of only be a subset of the focus meaning of the domain of “only”; it therefore only contains properties of the sort of having introduced someone to Sue (example (14-a)), and of having introduced Bill to someone (example (14-b)), respectively: ![introduce( john, bill, sue) ∧
∀ X ∈ C [ X ( john) → X = λx [introduce( x, bill, sue)]] ∧ C ⊆ {λx [introduce( x, y, sue)] | y ∈ D e }]
![introduce( john, bill, sue) ∧
∀ X ∈ C [ X ( john) → X = λx [introduce( x, bill, sue)]] ∧ C ⊆ {λx [introduce( x, bill, y)] | y ∈ D e }] Let me take stock. Alternative semantics introduces two semantic levels, one for a normal meaning, which is not influenced by the focus feature, and one for a focus meaning, which is influenced by the focus feature. The focus meaning of an expression of type τ is a set of possible denotations of τ-expressions. By means of such a set the domain of an operator like only can be restricted. Structured meaning theory and alternative semantics interpret the example sentences in different ways. Nonetheless, they generally arrive at equivalent results.20 In the interpretation of the sentences (5-a) and (5-b), 20 Focus meanings in alternative semantics must consist of intensional objects for deriving interpretations that are equivalent to those of structured meaning theory. – Why? – Cf. the following example from Krifka (1996):
(i)
This horse only has a [HEART]F .
(contd)
Optimal Accentuation vs Focus Accentuation 163
both theories refer to the focus feature for reconstructing a set of alternatives. This set of alternatives is used to determine the domain of only. The domain of only need not be identical with the alternative set, but must be a subset of it: (15)
a. b.
Who does John like? – John only likes [SUE]F . Does John love Sue or does he just like her? – John only [LIKES]F Sue.
If the domain of only in the answer sentence of example (15-a) is identified with the set of alternatives that is built on the basis of the focus, and if this set is not further restricted – that is, if the domain is identified with the set ALT(sue) or the focus meaning of “likes [Sue]F ” – the sentence is interpreted to mean that John does not like anything or anyone besides Sue. The question under discussion is whether a certain horse has a heart, kidneys or both. Every living being that has a heart (horses among them) also has kidneys; sentence (i) is therefore false. Structured meaning theory interprets it in such a way that the horse in question only has a heart, and no kidneys; that is, structured meaning theory captures the meaning correctly. However, according to alternative semantics, the sentence means that of all the predicates in the focus meaning of “has a [heart]F ”, only the predicate of having a heart applies to the horse in question. If the focus meaning consists of extensional objects, the sentence will be interpreted as true, because extensionally, the predicate of having a heart and the predicate of having kidneys are identical. If on the other hand the focus meaning consists of intensional objects, sentence (i) is false in alternative semantics as well, because intensionally the two predicates are different. Krifka (1996) points out that the two predicates are also identical intensionally if it already belongs to the common ground that every being that has a heart also has kidneys. With respect to every index in the common ground, the extensions of the predicates are identical, so their intensions are identical as well. That means that alternative semantics theory (incorrectly) interprets sentence (i) as true. Krifka’s argument only holds if the intensions of predicates are determined by the worlds of the common ground and not by all the worlds of the underlying model. That this should be the case is not immediately obvious. However, Krifka hits a sore spot with his criticism, one that has already been noticed by Rooth (1985): let us construct a model with only one index. Let the predicate λx [introduce( x, bill, sue)] (introducing Bill to Sue) have the same extension as the predicate λx [introduce( x, tom, sue)] (introducing Tom to Sue) for this index. As a result, the two predicates have the same intension. Now, let the names “Bill” and “Tom” have different extensions (and therefore, in the one-index model, also different intensions). With respect to this model, structured meaning theory (correctly) interprets the sentence “John only introduced [BILL]F to Sue” as false, while alternative semantics (incorrectly) interprets it as true. To remedy this problem, Rooth (1985) proposes to only use more “realistic” models than the “degenerate” one-index models for the interpretation of natural language expressions, so that the problem does not occur and alternative semantics arrives at interpretational results that are equivalent to those of structured meaning theory. Rooth’s solution requires, of course, that each model in which every horse that has a heart also has kidneys must be rejected as unrealistic.
164
Accentuation and Interpretation
This interpretation does not correspond to the intended meaning. As an answer to the question here, the sentence must be interpreted as meaning that John does not like any other person than Sue; John’s like or dislike of objects is not under discussion. In order for the set of alternatives that is built on the basis of the focus to be able to function as the domain of only, it must be restricted with respect to the question under discussion. The same is true of example (15-b): if the set of alternatives built on the basis of the focus functions without restrictions as the domain of only, the answer sentence means that John does not stand in any other relation to Sue other than liking her. Under this interpretation, the sentence is false, because John obviously stands in other relations to Sue, e.g. in the relation of not being identical with her. In order for the sentence to be interpreted correctly, the domain of only must be restricted to the relations of loving and liking that are under discussion. That is, the focus only determines a constraint for the identification of the domain: the domain must be a subset of the set of alternatives that is built on the basis of the focus. Which subset this is must be determined in relation to the discourse context. Structured meaning theory and alternative semantics do not explain how the domains of only in the examples (15-a) and (15-b) can be completely specified by referring to the context; both theories determine the meaning of the answer sentences only vaguely. However, the theories do explain the meaning difference between the sentences. In focus theories, operators such as only are called “focus operators”; words such as “only”, that denote focus operators, are called “focus particles” or “focus adverbs”. Apart from “only”, words such as “even”, “always”, “mostly” and “seldom” are focus adverbs. A focus adverb can be associated with one or more foci in the phrase which it locally c-commands; the foci serve to determine the domain of the operator denoted by the focus adverb. If a focus influences the truth conditions of a sentence, it is associated with a focus adverb; the operator denoted by the adverb absorbs the meaning of the focus feature. Unlike a theory of optimal accentuation, structured meaning theory and alternative semantics do not make any claims about the circumstances under which not only foci but also the focus adverbs associated with them must be accentuated. Structured meaning theory and alternative semantics make different predictions about the interpretation of sentences in which no foci are marked (i.e. when there is no stress pattern that marks a specific focus-background structure):
Optimal Accentuation vs Focus Accentuation 165
(16)
John only introduced Bill to Sue.
To interpret sentence (16) according to structured meaning theory, a focus must be identified in the domain of “only”: the operator denoted by “only” requires a structured meaning as its argument; a structured meaning can only be generated on the basis of a focus. Several different constituents in the domain of “only” can in principle be focused, which means that sentence (16) is ambiguous with respect to its focus-background structure. Determining a focus determines the meaning of the sentence, albeit in a vague manner. (The vagueness lies in the fact that the set of alternatives can vary.) According to alternative semantics, sentence (16) is not ambiguous; its meaning is vaguely specified – even if no focus is identified in the domain of “only”: John has no other contextually interesting property than the property of having introduced Bill to Sue. The meaning of the sentence is fully specified when the other interesting properties that John might have (but does not actually have) are specified. A focus may be useful for establishing what these properties are, but it is not essential for it. Foci do not always influence the truth conditions of sentences. In the answer sentences of the following examples (17-a) and (17-b) they only affect the use conditions; they serve to guarantee the question–answer congruence – as observed in focus theories. In order to guarantee the congruence of a declarative sentence with a question under discussion, the constituent that could serve as a constituent answer must be focused.21 In the answer sentence in example (17-a), this constituent is “Bill”; in the answer sentence in example (17-b) “Sue” is the constituent to be focused: (17)
a. b.
Who did John introduce to Sue? – John introduced [BILL]F to Sue. To whom did John introduce Bill? – John introduced Bill to [SUE]F .
Foci with semantic effect are always associated with focus adverbs; they are therefore called bound foci. Pragmatic foci, which only influence the use conditions, are not associated with a focus adverb; they are called free foci. Jacobs (1984) proposes to treat all foci equally as bound foci; pragmatic foci, that are not associated with a focus adverb, should be associated with the locutionary mode of the sentence: 21 If the question under discussion is a multiple constituent question, several constituents must be focused: Whom did John introduce to whom? – John introduced [BILL]F to [SUE]F .
166
(18)
Accentuation and Interpretation
a. b.
Who did John introduce to Sue? – ASSERT John introduced [BILL]F to Sue. To whom did John introduce Bill? – ASSERT John introduced Bill to [SUE]F .
Jacobs represents the locutionary mode of a declarative sentence with the operator ASSERT. In the answer sentences of the examples (18-a) und (18-b), ASSERT is associated with the focus “Bill” and with the focus “Sue”, respectively. The foci serve to determine a set of alternatives; through this set, ASSERT can connect the sentence “John introduced Bill to Sue” to a question under discussion. The focus meaning of “John introduced [BILL] F to Sue” can be construed as a superset of the possible answers to the question under discussion, who John introduced to Sue.22 If the focus meaning of the sentence is not a superset of the question under discussion, congruence between question and answer is not guaranteed; uttering the sentence would be inadequate. The uttering of a declarative sentence should always serve the purpose of answering a question that is presupposed to be under discussion. According to this, every declarative sentence must be in congruence with a background question. Therefore every declarative sentence must have at least one pragmatic focus associated with ASSERT; this focus may at the same time function as semantic focus: (19)
Who did John introduce to Sue? – ASSERT2 John only1 introduced [[BILL]F2 ]F1 to Sue.
The focus of example (19) fulfils two roles; first, it is a semantic focus, associated with “only”, and, secondly, it is a pragmatic focus, associated with ASSERT.23 It is possible for a sentence to have multiple foci, and for these foci to have different functions; one focus can have a pragmatic function and be associated with the locutionary mode, another focus can have a semantic function and be associated with a focus adverb. To interpret such a sentence, it is necessary not only to determine the foci but also what functions they have. It must be determined which foci are associated with the focus 22
Cf. Rooth (1992). In order for a superset of possible answers to be generated, “Bill” must denote a generalised quantifier. The focus meaning of “John introduced [BILL]F to Sue” then not only contains the exhaustive, but also the partial answer to the question whom John introduced to Sue. 23 I adopt the notation for double assignment of the focus feature from Krifka (1992a). An interesting question would be whether the second, pragmatic focus associated with ASSERT would not be better treated as a discontinuous focus consisting of the words “only” and “Bill”.
Optimal Accentuation vs Focus Accentuation 167
adverbs and which with the locutionary mode. (In the following examples, I forego marking the words to be accentuated. It is debatable whether all foci are marked through stress.) (20)
ASSERT [John]F only introduced [Bill]F to Sue.
Determining the focus associations in example (20) is simple. “Only” must be associated with a focus in its domain; the only focus in the domain of “only” is “Bill”; therefore, “Bill” is associated with “only”. The focus “John” is not in the domain of any focus adverb; therefore it can only be associated with the locutionary mode ASSERT. “Bill” can, in addition to being associated with “only”, also be associated with ASSERT; this does not influence the truth conditions of the sentence, however. Determining the focus associations in a sentence becomes more difficult if there are more foci in the domain of a focus adverb. In that case, determining the focus associations also affects the truth conditions of the sentence.24 The examples (21) contain multiple foci in the domains of several focus adverbs; the interpretation of the example sentences depends on the question which focus is associated with which focus adverb:25 (21)
a. b.
(John only introduced [BILL]F to Mary.) John even2 only1 introduced [Bill]F1 to [Sue]F2 . (John only introduced Tom to [SUE]F .) John even2 only1 introduced [Bill]F2 to [Sue]F1 .
The interpretational contexts of the example sentences are determined by the preceeding sentences in parentheses. The sentences do not differ in their foci and focus adverbs, but they do differ in the associations of foci and focus adverbs. As a consequence, they differ in truth values: sentence (21-a) is true iff John introduced only Bill even to Sue; sentence (21-b) is true iff John introduced even Bill only to Sue. For the interpretation of “John even only introduced Bill to Sue”, it does not suffice to determine the foci of the sentence; it must also be established with which focus adverbs the foci are associated.26 24 This also occurs in the case of the answer sentence of example (i), which has already been discussed above:
(i)
I heard that John introduced each man to only one woman. Who did he only introduce to SUE? – ASSERT2 John only1 introduced [Bill]F2 to [Sue]F1 .
25 The examples are from Schroder ¨ ¨ (2003) and Schroder and Schmitz (2003). For convenience’ sake, I do not mark ASSERT, nor do I mark which foci are associated with this operator. 26 The formalism of structured meaning theory of Krifka (1992a) and the for-
168
Accentuation and Interpretation
Let me take stock. The foci of a sentence are associated with focus adverbs and/or with the sentence’s locutionary mode. The stress pattern of a sentence can inform the recipient which constituents are focused but it does not provide an indication of which focus is associated with which focus operator. Determining focus associations requires making reference to the discourse context. 1.3
Comparison
I can now compare the interpretations of the paradigmatic examples sentences that are derived by a theory of optimal accentuation with the interpretations derived by focus theories: the interpretation of optimal accentuation corresponds to the interpretation of structured meaning theory if a word in the domain of “only” is recognised as being accentuated (and focused). As an example, the first of the following two QL-sentences represents the meaning of “John only introduced BILL to Sue” according to the interpretation of optimal accentuation; the second QL-sentence represents the meaning of “John only introduced [BILL]F to Sue” according to structured meaning theory: ![ RESTRICTOR(bill ) ∧ introduce( john, bill, sue) ∧
∀ x [ RESTRICTOR( x ) ∧ introduce( john, x, sue) → x = bill ]]
![introduce( john, bill, sue) ∧
∀z ∈ ALT(bill )[introduce( john, z, sue) → z = bill ]] Under the assumption that the predicate RESTRICTOR in the first representation and the term ALT(bill ) in the second representation denote the same contextually restricted set of entities, and that bill is an element of ALT(bill ) – something that should hold in any case – both meaning representations are equivalent. The interpretation of optimal accentuation corresponds to the interpretation of alternative semantics if no stress pattern is recognised, i.e. if no malism of alternative semantics of Rooth (1992) do not allow the establishment of several different focus associations. In both formalisms, the sentence “John even only introduced [Bill]F to [Sue]F ” must be interpreted in such a way that both foci are associated with “only”; the focus associations indicated in examples (21-a) and (21-b) cannot be interpreted with either formalism. This is a defect of the formalisms, ¨ ¨ not of the theories. Schroder (2003) and Schroder and Schmitz (2003) show how structured meaning theory can be formalised in such a way that it becomes possible to interpret different focus associations in a single sentence. Wold (1996) and Wold (1998) define an interpretation that follows the idea of alternative semantics.
Optimal Accentuation vs Focus Accentuation 169
element in the domain of “only” is marked as a focus. The first of the following two QL-sentences represents the meaning of “John only introduced Bill to Sue” according to the interpretation of optimal accentuation; the second QL-sentence represents the interpretation according to alternative semantics: ![ RESTRICTOR (λx [introduce( x, bill, sue)])
∧ introduce( john, bill, sue) ∧ ∀ X [ RESTRICTOR ( X ) ∧ X ( john) → X = λx [introduce( x, bill, sue)]]] ![introduce( john, bill, sue) ∧
∀ X ∈ C [ X ( john) → X = λx [introduce( x, bill, sue)]]] The predicate RESTRICTOR in the first representation corresponds to the set C of the second representation; λx [introduce( x, bill, sue)] is an element of C; both meaning representations are therefore equivalent. Focus theories and a theory of optimal accentuation derive equivalent (underspecified) interpretations for the paradigmatic example sentences. According to focus theories, the meaning of a sentence depends on which sentence parts carry the syntactic focus feature. Stress marks expressions that carry the focus feature; it therefore gives information about the syntactic structure of the sentence and in that way about which meanings the sentence can have. The deciding point of focus theories is that they take the semantic effects of stress to be grammaticalised; the influence of the stress pattern on the meaning of a sentence is determined with reference to the syntactic structure, and not with reference to a discourse context. This point turns out to be a defect: only underspecified interpretations are derived for sentences with given stress patterns; these interpretations can be further specified only with reference to a discourse context. Reference to a context is necessary in order to restrict the sets of alternatives of the foci and – in the case of a sentence with multiple foci – in order to establish which focus is associated with which focus adverb or with the sentence’s locutionary mode. Focus theories do not define how the context can be referenced in order to specify the meaning. Given two sentences with “only”, completely identical except for their stress patterns, focus theories can explain the fact that these sentences have different meanings, but even if the discourse context is given, they cannot fully specify these different meanings. A theory of optimal accentuation on the other hand explains how a
170
Accentuation and Interpretation
discourse context – namely a question under discussion – can be referred to: the stress pattern of a sentence should be optimal given the discourse context; accentuation restricts the set of possible contexts, in that way the set of possible context-relative meaning specifications and consequently the set of possible interpretations.
2 Predictions of stress patterns Focus theories and a theory of optimal accentuation make partly different predictions about stress patterns. These predictions can be tested to evaluate the theories: 1. Suppose a sentence and a discourse context are given; the sentence has a focus that does not need to be recognised in order for the sentence to be understood. According to a theory of optimal accentuation, this focus is not to be accentuated. Therefore, if the focus is accentuated, it provides support for focus theories. If on the other hand it is not accentuated, it provides support for a theory of optimal accentuation. In the next subsection (2.1), I discuss the phenomenon of so-called second occurrence foci. These are expressions that according to focus theories are focused, but for which it is at least questionable whether they are accentuated or not. 2. Suppose a declarative sentence and a question under discussion are given: according to focus theories, the part of the declarative sentence that denotes a constituent answer to the question must be focused and marked with an accent in order to establish the question–answer congruence. According to a theory of optimal accentuation, the i-critical words must be accentuated. In the QUAD-model these words always belong to the sentence part that denotes a constituent answer to the question under discussion. Both focus theories and a theory of optimal accentuation predict that at least one word in the sentence part that denotes a constituent answer must be accentuated. However, the theories do not always make the same predictions as to which words are to be accentuated: it is conceivable that a word is accentuated to mark a pragmatic focus, although it does not need to be accentuated for optimal recognition. Furthermore, it is conceivable that in order to mark a focus it is not necessary to accentuate all i-critical words that are to be accentuated for optimal recognition. In the second subsection (2.2), I turn to accentuation of pragmatic foci that consist of multiple words, about which focus theories and a theory of optimal accentuation make different predictions.
Optimal Accentuation vs Focus Accentuation 171
2.1
Second occurrence foci
In example (22), Albert answers the question who Tom introduced to Sue by saying that Tom introduced only Bill to Sue. Barbara contradicts Albert and says that it was John who only introduced Bill to Sue. With her contradiction, Barbara presupposes the question who introduced only Bill to Sue: (22)
Albert: Tom only introduced BILL to Sue. Barbara: No. JOHN only introduced Bill to Sue.
In Albert’s utterance, “Bill” is accentuated, and therefore focused, according to focus theories. The focus serves two functions: first, it is a semantic focus that determines the domain of only; secondly, it is a pragmatic focus associated with the locutionary mode operator ASSERT. The focusbackground structure can be made explicit as follows: ASSERT2 Tom only1 introduced [[Bill]F2 ]F1 to Sue. In Barbara’s utterance, “John” is accentuated and therefore focused. The focus “John” is not associated with any focus adverb; it only has a pragmatic function. According to structured meaning theory, the focus adverb “only” must be associated with a focus in its domain. Barbara’s utterance is to be interpreted in such a way that John introduced only Bill to Sue; that means that “Bill” must be focused and associated with “only”. That is, the focus-background structure of Barbara’s utterance can be made explicit in the following manner: ASSERT2 [John]F2 only1 introduced [Bill]F1 to Sue. Partee (1999)27 expresses the intuition that the focus “Bill” is accentuated in Albert’s utterance, but not in Barbara’s. In her utterance, the verb phrase “only introduced [Bill]F to Sue” is repeated. It is therefore not necessary to determine with which focus “only” is associated; the association is already given with Albert’s utterance. “Bill” in Barbara’s utterance is a so-called second occurrence focus (SOF); SO-foci are not to be accentuated. 27 A preliminary version of the paper was already available in 1994. Example (i) is the original example from Partee (1999):
(i)
Albert: Eva only gave xerox copies to [the GRADUATE STUDENTS]F . Barbara: No. PETR only gave xerox copies to [the graduate students]F .
The semantic focus associated with “only” in this example is a phrase that consists of multiple words. I do not discuss the accentuation of focused phrases until the next subsection (2.2), and so I use a variant here of the already amply discussed sentence “John only introduced Bill to Sue”.
172
Accentuation and Interpretation
Let us assume that Partee is correct. Structured meaning theory requires that a focus adverb is always associated with a focus. If the theory additionally requires that each focus is accentuated, then according to Partee’s intuition, it is falsified by example (22). If the theory allows some foci not to be marked, it is not falsified. The thesis that focus adverbs must always be associated with foci can then be maintained; however, this move raises the question how it is at all possible to falsify the focus theory.28 Furthermore, if we assume that not every focus must be accentuated, it must be determined under which conditions a focus is accentuated and what purpose stress here serves. It no longer suffices to refer to the focus feature to account for or predict stress: structured meaning theory is therefore weakened in its ability to explain and predict stress patterns. In alternative semantics, “only” does not have to be associated with a focus under all circumstances; the non-accentuated word “Bill” in Barbara’s utterance in example (22) therefore does not necessarily have to be a focus.29 Alternative semantics therefore does not have the difficulty of having to account for the absence of stress. However, if alternative semantics assumes that a focus adverb such as “only” is not always associated with an accentuated focus, it needs to explain under which circumstances an expression in the domain of “only” is focused and accentuated. The explanation that a sentence with “only” has certain truth conditions and must therefore show a certain focus-background structure and be accentuated in such and such a way, no longer holds: if the association of “only” with a focus is merely optional, then one single reading of a sentence is compatible with different focus-background structures and with different stress patterns; the power of a theory of alternative semantics to predict stress patterns is therefore weakened. Compare example (22) with (23): (23)
Albert: John only introduced BILL to Sue. Barbara: Yes. John only introduced BILL to Sue.
28 Krifka (1995b), 263: “Although I feel that this view [that SOF are not accentuated, HCS] is defendable, especially if one adduces additional independent evidence of some sort for it, I also feel that this position is quite close to immunizing hypothesis (I) [focus operators are always associated with foci, HCS] against possible falsification.” 29 There is only one version of alternative semantics, the free parameter theory of association with focus (Rooth (1992)), in which “only” does not have to be associated with a focus in its syntactic domain. Other versions – e.g. in the first version of Rooth (1985) – require an association between a focus adverb and a focus in its domain. Rooth (1992) prefers a variant of alternative semantics in which association with a focus is to a large extent free, although some focus adverbs are obligatory associated with a focus. “Only” is one of these adverbs.
Optimal Accentuation vs Focus Accentuation 173
As in example (22), Barbara repeats the verb phrase “only introduced Bill to Sue”. This time, unlike in example (22), “Bill” is accentuated in her utterance. The focus “Bill” has both a semantic and a pragmatic function in the repeated phrase. Barbara answers the same question as Albert; the sentence that she utters has the same focus-background structure as the Albert’s sentence: ASSERT2 John only1 introduced [[Bill]F2 ]F1 to Sue. To differentiate the repeated focus in example (23) from an SOF such as in example (22), Bartels (1995) calls the repeated in (23) focus an echo focus. The essential distinction is that an SOF does not have a pragmatic function, while an echo focus does. Examples such as (22) and (23) suggest that foci are only accentuated when they serve a pragmatic function. A SOF – i.e. a focus that only serves a semantic function – is not accentuated. If this suspicion is correct, semantic effects of stress are to be treated as epiphenomena of pragmatic focus accentuation, and stress patterns are in first instance determined by the discourse context. This does not refute focus theories, but it would mean that they have to be genuinely pragmatic theories, and as such they would be closer to a theory of optimal accentuation as one might have expected. Excursus: A theory of optimal accentuation requires that in Barbara’s utterance of example (22) only “John” is accentuated. The SOF “Bill” does not need to be accentuated, because “John” is the only word that the recipient must recognise in order to understand Barbara’s utterance. Apart from the word “John”, the recipient must also recognise which role this word plays in the utterance; he must recognise whether “John” functions as subject (replacement of “Tom” in Albert’s utterance), as direct object (replacement of “Bill”) or as indirect object (replacement of “Sue”). A fairly certain cue to the role of “John” is the position in which the word appears: (24)
Albert: Tom only introduced BILL to Sue. Barbara: (i) No. JOHN noise. (No. JOHN only introduced Bill to Sue.) (ii) No. noise JOHN noise. (No. Tom only introduced JOHN to Sue.) (iii) No. noise JOHN. (No. Tom only introduced Bill to JOHN.)
If in example (24) “John” is the first word uttered by Barbara (i), it is highly likely that it functions as the subject. If it appears in the middle of the utter-
174
Accentuation and Interpretation
ance (ii), it likely replaces the direct object “Bill”; at the end of the utterance (iii), it likely functions as a replacement for the indirect object “Sue”. The recipient does not need to recognise the other parts of Barbara’s utterance in order to determine which position “John” occupies. It suffices for him to perceive noise, so that he recognises whether anything was uttered before or after “John”, and he can estimate the word’s position in the sentence.30 In example (25), the recipient cannot perceive any noise that could give a cue to the role of “John”: (25)
Albert: Tom only introduced BILL to Sue. Barbara: No. JOHN.
Barbara’s utterance in example (25) will presumably be interpreted in the sense of “Tom introduced only John to Sue”. Albert answered the question who Tom introduced to Sue. The most obvious interpretation of Barbara’s utterance is to interpret “John” as a corrective constituent answer to the same question. This means that it is unlikely that Barbara can correct Albert in the sense of “John only introduced Bill to Sue” by merely uttering “John”. Barbara must utter more than just “John” – although for a correct interpretation, Albert only needs to recognise this one word; it suffices if he perceives the rest of the utterance as noise. In this way, noise can contribute information. (End of excursus) Is the intuition of Partee (1999), that SO-foci are not accentuated, really correct? Rooth (1996b)31 disputes this assumption. According to his intuition, the SOF “Bill” in Barbara’s utterance of example (22) – repeated here as (26) – must be accentuated: (26)
Albert: Tom only introduced [[BILL]F ]F to Sue. Barbara: No, [JOHN]F only introduced [BILL]F to Sue.
Rooth performs an experiment to undermine his intuition. He makes recordings of the B-sentences in the examples (27-a) and (27-b): (27)
a.
A: B:
Paul only [[named]F ]F Manny today. So what. Even [Eva]F only [named]SOF Manny today.
b.
A: B:
Paul only named [[Manny]F ]F today. So what. Even [Eva]F only named [Manny]SOF today.
30 In German – with its generally freer word order – recognising the position of a word is a less useful cue than in English. It would be interesting to see if this has an effect on accentuation. 31 See also Rooth (1995b).
Optimal Accentuation vs Focus Accentuation 175
The words “named” and “Manny” occur in two variants in the sentences Rooth recorded – namely non-focused and as SOF: in the B-sentence of example (27-a) “Manny” is not focused and “named” is SOF, in the B-sentence of (27-b), things are reversed. Rooth compares the recordings and notices that the SO-foci are audibly accentuated and clearly distinguish themselves from their non-focused variants. An acoustic analysis of the recordings shows that the SO-foci do not correlate with an extreme value of the fundamental frequency (f0). Unlike the non-focused variants, however, they do correlate with an increase in intensity and vowel duration. Rooth interprets this to mean that SO-foci are accentuated, but that this accentuation is realised differently than with first occurrence foci (FOF), namely without an f0-maximum or -minimum. Rooth records the test sentences under uncontrolled conditions: he reads the sentences himself, and does so in such a way that to his ear they sound natural. It may be that Rooth’s impression of naturalness is shaped by his linguistic knowledge; perhaps he only finds a stress pattern natural that conforms to his focus theory. Rooth’s experiment is therefore not entirely methodologically sound, something he admits himself. The result can therefore not be taken as evidence that SO-foci are generally accentuated. It can be taken as an indication that Partee’s intuition, according to which SO-foci are generally not accentuated, must be tested. Bartels (1995) also conducts an experiment to investigate the accentuation of SO-foci. She does a production experiment in which linguistically native speakers of American English – all of the undergraduate students in linguistics at the University of Massachusetts – read sentences with “only”. The sentences are placed in different contexts, so that every focus associated with “only” is read once as FOF, once as SOF, and once as echo focus. The following dialogues are examples of the texts that the test persons read. The different roles are read by different test persons. The setting is a telephone interview conducted by a radio reporter with a globetrotter: (28)
Globetrotter: When I was in China, I lived only on rice mush for a month. Reporter: Gee, I’m glad I wasn’t there. I couldn’t live only on rice mush for a month.
In example (28), “rice mush” occurs as FOF in the sentence uttered by the globetrotter. The focus is italicised, so that the test persons receive a typographic cue that this focus must be accentuated. In the reporter’s utterance “I” is FOF and also italicised; “rice mush” occurs in the sentence as a typographically unmarked SOF.
176
Accentuation and Interpretation
For the example dialogue (29), the dialogue setting is extended: it is assumed that the telephone connection for the interview is bad. The reporter repeats the utterances of the interviewee, so that the radio listeners can understand them properly. “Rice mush” now occurs in the reporter’s utterance as a typographically unmarked echo focus. (29)
Globetrotter: When I was in China, I lived only on rice mush for a month. Reporter: When you were in China, you lived only on rice mush for a month.
Bartels analyses the speech signals of the test persons to see to what extent relative changes in fundamental frequency, intensity and syllable duration correlate with the various “rice mush”-foci. The results are as follows: both the change in fundamental frequency and the increase in intensity and duration are significantly higher in FO foci than in echo foci, and significantly higher in echo foci than in SO-foci. That is, FO foci are accentuated significantly “stronger” than SO-foci. SO-foci are not only accentuated “differently”, they are also accentuated “less”, if at all. Bartels compares the speech signals of different focused expressions; she does not compare the signals of focused and corresponding non-focused expressions. Her data show that SO-foci are accentuated “less” than other foci, but they do not answer the question whether SO-foci are accentuated at all. A problem in the experiment is the fact that single words are italicised. It cannot be excluded that the SO-foci are accentuated “less” than the echo foci because in the context of the SO-foci there appears an italicised word, while in the context of the echo foci no word is italicised. Following the experiments of Rooth and Bartels, Beaver et al. (2007) conduct a production experiment to compare the acoustic realisations of SOF and of corresponding non-focused expressions. They let native speakers of American English read various texts with two variants each. Example (30) consists of two variants of one of the texts: (30)
a.
(i)
Both Sid and his accomplices should have been named in this morning’s court session. (ii) But the defendant only named Sid in court today. (iii) Even the state prosecutor only named Sid in court today.
b.
(i)
Defence and Prosecution had agreed to implicate Sid both in court and on television. (ii) Still, the defence attorney only named Sid in court today. (iii) Even the state prosecutor only named Sid in court today.
Optimal Accentuation vs Focus Accentuation 177
Both variants of the text consist of three sentences (i–iii). The first sentence (i) sets up the context for the two follow-up sentences (ii) and (iii); the second sentence (ii) contains a focus adverb associated with a FOF; this focus is repeated in the third sentence (iii) as SOF. The variants are distinct in their first sentences, and consequently in the focus-background structure of the second and third sentences. In the second sentence of the first variant, “Sid” is a FOF associated with “only” that is repeated as SOF in the third sentence. The FOF in the third sentence is the expression “the state prosecutor”, which is associated with “even”; “in court” is not focused in any sentence of the first text variant. Things are different in the second variant. Here, the “in court” in the second sentence is a FOF associated with “only”, which is repeated as SOF in the third sentence. “Sid” is not focused in the sentences of the second variant. The third sentences in both text variants of example (30) only differ with respect to their SOF. In the first variant “Sid” is SOF, while “in court” is not focused at all; conversely, in the second variant “Sid” is not focused and “in court” is SOF. By comparing the acoustic realisation of “Sid” and “court” in the first variant with the realisations in the second variant, Beaver et al. compare the realisation of SO-foci with the realisation of identical, nonfocused expressions. Beaver et al. let 20 linguistically naive test persons – all of them native speakers of American English, none of them with any training in linguistics – read the variants of seven different texts. In four texts, the SO-foci, whose acoustic realisation the experiment is about, are associated with the focus adverb “only”; in the other three texts, the SO-foci are associated with the focus adverb “always”. Unlike in Bartels’ experiment, the foci are not typographically marked in the texts. The speech signals of the test persons are analysed to see whether they mark the SO-foci in comparison to the corresponding non-focused expressions by (a) an f0-minimum or -maximum. (b) an increase of intensity, and/or (c) a lengthening of word duration. In a preliminary version of their paper, Beaver et al. also tested whether the choice of the focus adverb – “only” or “always” – had any influence on the way the SOF was marked.32 The results of the experiment are the following: SO-foci are not marked by pitch accents, i.e. the SO-foci are not correlated with extreme values of the fundamental frequency. However, there is a significant difference between the durations of SO-foci and their non-focused counterparts (p-value 32 The title of the preliminary version was Second Occurrence Focus Is Prosodically Marked; Results of a Production Experiment. This paper was published on David Beaver’s home page in 2002 (http://www.stanford.edu/˜dib). It can still be found via the wayback machine on http://www.archive.org.
178
Accentuation and Interpretation
< 0.1). The differences are slight: the duration of SO-foci is on average only 6 ms longer than that of the corresponding non-focused words. Moreover, there is a marginally significant difference between the intensities of SOfoci and their non-focused counterparts (p-value < 0.5); SO-foci are on average realised with 0.31 dB higher intensity. According the preliminary version of the paper, the acoustic differences can be detected for SO-foci associated with “only” but not for SO-foci associated with “always”. That is, SO-foci associated with “always” are not correlated with an extreme value of the fundamental frequency, longer duration or higher intensity.33 Beaver et al. claim that the results confirm the hypothesis that SO-foci are prosodically marked. According to the preliminary version of their paper this is only true for SOi foci associated with “only”: SO-foci associated with “always” are not prosodically marked. If one assumes that every focus is accentuated, one must assume that “always” is not always associated with a focus; the association is merely optional. SO-foci associated with “only” on the other hand are accentuated, even though they do not correlate with an f0-extremum and the other acoustic correlates are weaker than those of FOF stress. According to Beaver et al. one can hold the hypothesis that the focus adverb “only” is always associated with an accentuated focus and that this association is obligatory. The interpretation of the results for the SO-foci associated with “always” are completely in line with a theory of optimal accentuation: foci associated with “always”, that do not need to be accentuated for optimal recognition, are not accentuated. But how can the requirements of optimal accentuation be made to agree with the accentuation of SO-foci associated with “only”? I believe that Beavers’ interpretation of the result for the SO-foci associated with “only” can be challenged: the average increase in duration and intensity is very slight. There is certainly some variation in these factors; we can assume that on a part of the recordings the SO-foci were somewhat more emphasised, and that on another part they were not emphasised at all. It would be interesting to see in how many cases the SOF associated with “only” is not emphasised at all. Because the average increase of duration and intensity is so slight, these cases cannot just be extreme outliers. If they are not just extreme outliers, the test results support the hypothesis that SO-foci associated with “only” may be accentuated, but they do not support the stronger hypothesis that these SO-foci must be accentuated. On the assumption that foci are always marked by stress, “only” need not necessarily be associated with a focus; the association is not obligatory. 33 According to the preliminary version of the paper, the duration of SO-foci that are associated with “only” is on average 7 ms longer than that of their non-focused counterparts.
Optimal Accentuation vs Focus Accentuation 179
A slight increase in duration and intensity does not necessarily guarantee that the word with which the increase correlates is perceived as accentuated. The fact that some SO-foci correlate to a slight increase in intensity and duration does not automatically lead to the conclusion that those SO-foci are effectively emphasised by stress. According to Lehiste (1970) the threshold for observable differences in duration in speech signals lies between 10 and 40 ms. Beaver et al. report that the average increase in duration of SO-foci lies at 6 ms – for SO-foci associated with “only” at 7 ms. One can therefore reasonably doubt whether the lengthening really has any accentuating effect. For this reason, Beaver et al. (2007) make a perception experiment in which they give test persons the task to detect the SOFs in the recordings that were made in the production experiment. (31)
a. b.
Even the state prosecutor only named [Sid]SOF in court today. Even the state prosecutor only named Sid [in court]SOF today.
As an example, the test persons have to decide in which of the recordings of (31-a) and (31-b) the speaker wishes to make “court” more prominent than “Sid”. Before the test persons decide they can listen to the recordings as often as they want. Result: in 63 per cent of the cases the test persons choose the “right” recording, i.e. the one in which the word to be concentrated on – in example (31): “court” – is a SOF. The t-test shows that the result is highly significant for a correlation between SOF-marking and perception of stress (p-value < 0.001). However, in 37 per cent of the cases the SOF are not perceived as accentuated. That is, the probability is quite high that a recipient cannot detect a SOF just because of its prosodic marking. Moreover, the test conditions are very artificial and do not have much in common with the normal circumstances of speech recognition; we can expect that the recognition is much worse in normal dialogue situation or even when the test persons are not allowed to listen to the recordings more than once. Therefore, we can take the result of the experiment rather as evidence that the prosodic marking of “only”-SO-foci does not have a robust accentuation effect. In spite of all the doubts connected to the interpretation of Beaver et al.’s results, two questions remain to be answered: first, why is it not just in a few negligible cases that an “only”-SOF is correlated with a slightly longer duration and a higher intensity? Secondly, wherein lies the difference between “only” and “always” that causes the tendency to accentuate a word in the domain of “only” to be stronger than the tendency to accentuate a word in the domain of “always”? Let us begin with the second question: if “only” serves to exhaustify an answer, at least one word in the domain of “only” must be accentuated.
180
Accentuation and Interpretation
In section 1, I claimed that “only” is usually used to exhaustify an answer. If the claim is correct, then one word in the domain of “only” is usually accentuated. This word must be interpreted as (part of) a constituent answer to a question under discussion; the constituent answer is exhaustified because of the occurrence of “only”. “Always” has other use conditions than “only”; one should not similarly expect that a word in the domain of “always” is accentuated and interpreted as (part of) a constituent answer: (32)
a.
A: B:
Does Sandy feed Nutrapop to her dogs? Yes, Sandy always feeds Nutrapop to [FIDO]F , and she also always feeds Nutrapop to [BUTCH]F .
b.
A: B:
Does Sandy feed Nutrapop to her dogs? Yes, Sandy only feeds Nutrapop to [FIDO]F , and she also only feeds Nutrapop to [BUTCH]F .
The examples in (32-a) and (32-b) come from Beaver and Clark (2003). In both examples, B’s answers are contradictory if the foci “Fido” and “Butch” are associated with the focus adverbs “always” and “only”, respectively. In order to interpret the answers as non-contradictory, it must be assumed that “always” and “only” are associated with “Nutrapop”. Both answers then mean that Sandy feeds her dogs Fido and Butch Nutrapop and nothing else. Beaver and Clark take the charitable reading to be possible in the case of “always” (example (32-a)). They assume that “always” is not necessarily associated with the accentuated focus in its domain. They take the charitable reading to be impossible in the case of “only”. “Only”, they argue, must be associated with the focus in its domain; therefore the answer sentence must be interpreted as contradictory. I consider this requirement too strong. According to my intuition it is possible to follow the principle of charity and interpret the answer in example (32-b) as non-contradictory. Nevertheless, the answer sounds odd, because “only” is used contra to its common function as an answer exhaustifier. I agree that there is a difference in use between the adverbs “only” and “always”, and that this difference has an effect on the interpretation of stress patterns and the accentuation of expressions in the domains of both adverbs. However, to account for this, it suffices to point to the different uses of “only” and “always”; the theoretical term “focus” is not required.34 34 Two more examples against the obligatory association of “only” with all foci in its domain:
(i)
a.
(John only introduced [Bill]F to Mary.) [Bill]F1 to [Sue]F2 .
He even2 only1 introduced (contd)
Optimal Accentuation vs Focus Accentuation 181
That does not yet explain why it happens at all that a word in the domain of “only” is prosodically marked, although optimally it should not be marked at all, and why (according to the data of Beaver et al.) such a word is marked weaker than the words that are optimally to be accentuated. It seems that focus theories have difficulties in explaining why “only”-SOfoci are not always marked and why they are so weakly marked when they are marked, while a theory of optimal accentuation has difficulties in explaining why some “only”-SO-foci are marked at all. I suspect that when reading the experiment texts of Beaver et al., the test persons are uncertain whether or not they need to accentuate a word in the repeated “only”-phrases. On the one hand, it is usually the case that in a repeated phrase no words are accentuated; on the other hand, there is usually some word in the domain of “only” that is accentuated. Experiences of Schmitz et al. (2001) support the suspicion that the use of “only” in repeated phrases is uncommon and therefore can lead to uncertainties with respect to the optimal accentuation. Schmitz et al. conducted experiments that were designed to bring test persons to freely utter focus constructions with “nur” (“only”) and “sogar” (“even”) – without being given a specific text. (I presuppose that the English “only” and the German “nur” have comparable use conditions.) This turned out to be difficult: it was not easy to bring the test persons to utter sentences with “nur” and “sogar”; utterances with repeated focus constructions – i.e. constructions with SO-foci – were not produced at all. Even forcing SOF-constructions by explicitly requiring answers in the form of complete sentences to questions such as “Markus is in a group only with Judith. Who else is in a group only with Judith?” failed completely. Ignoring the instructions, the test persons answered with constituent answers (“THOMAS”), they left out “nur” in b.
(John only introduced Tom to [Sue]F .) He even2 only1 introduced [Bill]F2 to [Sue]F1 .
The examples (i) were already given in the first section of this chapter. According to Beaver and Clark (2003), both foci “Bill” and “Sue” must be associated with “only”; I claim that the associations indicated are unusual but possible. (ii)
A: B:
Mary wasn’t so bad after all. Of all the things we were afraid she might do, she only [invited Bill for dinner]F . You got the person wrong. She only [invited [Lyn]F for dinner]SOF . But it’s true that she did only one of those terrible things she could have done.
Example (ii) is from Roberts (2001). In B’s utterance, “invited Lyn for dinner” is supposedly a focus associated with “only”; only one part of that focus – namely “Lyn” – however, is a pragmatic focus that must be accentuated. This focus is not to be associated with “only”.
182
Accentuation and Interpretation
their answers (“THOMAS is in a group with Judith.”), or they just copied the stress pattern of the question (“THOMAS is in a group with only JUDITH.”). The effect of a weak SOF-stress that occurred in the experiment of Beaver et al. could not be observed. In the experiment texts of Beaver et al. the word “only” occurs in repeated phrases; optimally, there is no need to accentuate any word in such phrases. The texts do not give the impression of being uncommon or artificial; nonetheless it may be difficult to have such texts spoken freely. In spite of the clarity and naturalness of the texts, it is likely that readers are uncertain about the optimal accentuation. The weak prosodic marking of an SOF may be assessed as uncertainty on the part of the test person about the optimal accentuation. Apart from that, it is possible that a word in the domain of “only” is accentuated in order to presuppose a corresponding background question, so that “only” is used in its common function of an answer exhaustifier. In that case, however, “only” should be accentuated “normally”, just like a first occurrence focus. Let me take stock. Second occurrence foci are semantically and not pragmatically motivated foci. According to a theory of optimal accentuation, they do not need to be accentuated. Experimental data of Beaver et al. (2007) show that SO-foci associated with “only” appear often – although not always – to be accentuated weakly. I assume that the non-optimal accentuation of the SO-foci can be accounted for by referring to the unusual usage of “only” in the texts that the test persons had to read and to the general expectation that in the domain of “only” at least one word is to be accentuated. 2.2
Focus projections
Let us now turn to the accentuation of foci that consist of multiple words: a complete theory of focus entails a syntactic rule of focus assignment, a semantic or pragmatic rule of focus interpretation and a phonological rule of focus accentuation. Focus theories can differ with respect to any of these rules, and they can therefore state different empirical relations between stress patterns on the one hand and truth and use conditions on the other hand. E.g. two theories of focus can differ in their phonological rule of focus accentuation; for a sentence with a given focus structure, they can therefore predict different stress patterns: (33)
What did John do? – John [introduced Bill to Sue]F .
For the declarative sentence of the example (33) to be an adequate answer to the question under discussion, the entire verb phrase “introduced Bill
Optimal Accentuation vs Focus Accentuation 183
to Sue” must be in focus. Note that the focus is a complex phrase which consists of more than one word. A phonological rule of focus accentuation determines which words of the phrase have to be accentuated in order to mark the whole phrase as a focus. Following the nuclear stress rule of Chomsky and Halle (1968), only “Sue”, which is the rightmost word of the phrase, must bear an accent. From this word the focus can be projected to the entire verb phrase. (34)
What did John do? – John only [introduced Bill to SUE] F .
Following rules of Gussenhoven (1984), Selkirk (1995) and Schwarzschild (1999), “Bill” must also bear an accent. That is, due to different accentuation rules, different predictions regarding the stress pattern are made. (35)
What did John do? – John only [introduced BILL to SUE]F .
According to a theory of optimal accentuation, the i-critical words of an utterance must be accentuated. In the context of example (33) almost all word in the phrase “introduced Bill to Sue” count as i-critical and therefore have to be accentuated; excluded is only the word “to” that marks “Sue” as indirect object: (36)
What did John do? – John only INTRODUCED BILL to SUE.
The example shows that focus theories and a theory of optimal accentuation make in some cases different stress predictions. These predictions can be evaluated experimentally. Together with Petra Wagner I performed several experiments to test the different stress predictions. All of these experiments were conducted in German with German-speaking test persons. (The test persons were undergraduate students at the former Institute of Communication Research and Phonetics at the University of Bonn.) We assume that English and German do not behave differently in any interesting aspect regarding the experiments. Therefore, we assume that we would get similar results if we performed the experiments in English. A precise description of all experiments is given by Schmitz and Wagner (2006). First, we test the nuclear stress rule against the hypothesis of optimal accentuation. According to the nuclear stress rule it is always sufficient to accentuate only the rightmost word of a phrase in order to mark that phrase as a focus. Moreover, accentuation of foci is not context-dependent. That is, as soon as the focus-structure of a sentence is fully specified no further information is needed to determine the correct stress pattern.
184
Accentuation and Interpretation
We play recordings of four dialogues like the ones in (37-a) and (37-b) to 39 test persons: (37)
a.
(i)
(ii)
b.
(i)
(ii)
Was ist im linken Feld? – Nur [das quadratische HAUS]F ist im linken Feld. (What is in the left field? – Only [the square HOUSE]F is in the left field.) Was ist im linken Feld? – Nur [das QUADRATISCHE HAUS]F ist im linken Feld. (What is in the left field? – Only [the SQUARE HOUSE]F is in the left field.) Welches Quadrat ist im linken Feld? – Nur [das quadratische HAUS]F ist im linken Feld. (Which square is in the left field? – Only [the square HOUSE]F is in the left field.) Welches Quadrat ist im linken Feld? – Nur [das QUADRATISCHE HAUS]F ist im linken Feld. (Which square is in the left field? – Only [the SQUARE HOUSE]F is in the left field.)
Each dialogue consists of a question and answer. Dialogue (37-a) contains a broad question without a restrictor (“What ...”); dialogue (37-b) contains a narrow question with a restrictor (“Which square ...”). The two recordings of each of the dialogues differ only in the stress pattern of the answer. On one recording, the word “house” is accentuated, on the other recording both “square” and “house” are accentuated. The test persons have to decide which accentuation is the best one in their perception. The experiment is a forced-choice experiment, the test persons have to choose one of the recordings – i.e. one of the stress patterns – in each case. Experiments on the accentuation of focus operators (cf. Schmitz (2006)) have shown that the test persons can be influenced by the order in which the dialogue-recordings are played. To compensate for such an influence, the order of presentation is varied: for two dialogues the recording with an accent only on the noun is played first; for the other two dialogues the recording with accents on the adjective and the noun is played first. A theory of optimal accentuation determines stress patterns in relation to discourse contexts, especially in relation to questions under discussions and the (presupposed) mutual knowledge of the discourse participants. If predictions of stress patterns are empirically evaluated, it must be made sure that the relevant discourse context is sufficiently specified. In order to control for the questions under discussion, we play recordings of dialogues
Optimal Accentuation vs Focus Accentuation 185
Figure 5.2 Reference world for the dialogues (37-a) and (37-b)
with explicitly asked questions. In order to control for presuppositions regarding the mutual knowledge, we present small toy worlds such as the one in Figure 5.2 and explain which knowledge the dialogue participants have about these worlds. In the toy worlds, there are nine different objects that can be classified as squares, triangles or circles, or alternatively as cars, houses or faces. The objects are distributed over two fields (a left one and a right one). We tell the test persons that the questioner knows which objects exist but that he does not know in which fields the objects are. The person answering has complete knowledge of the reference worlds. He shall inform the questioner adequately. The test dialogues relate to the toy worlds, so that only knowledge regarding the toy worlds can be relevant for judging the stress patterns. By specifying reference worlds for the dialogues, we constitute controlled test conditions. According to the nuclear stress rule, the focus “the square house” can be marked through a stress on “house”. In both dialogue contexts – both after a question with and without question restrictor – it should suffice to accentuate the noun “house”; the adjective “square” does not need to be accentuated in any of the answers. The nuclear stress rule leads us to the expectation that there is no correlation between the preferred stress pattern and the context; it predicts that the test persons prefer recordings like (37-a-i) and (37-b-i). According to optimal accentuation, both the adjective and the noun must be accentuated after a question without a restrictor (“What ...”). After a question with a restrictor (“Which square ...”), only the noun must be accentuated. According to a theory of optimal accentuation, the accentuation of the answer sentence depends on the context; the theory predicts that the test persons prefer recordings like (37-a-ii) and (37-b-i).
186
Accentuation and Interpretation
These are the results of the experiment: after a question without a restrictor (“What ...?”), the test persons prefer in over 66 per cent of the cases the accentuation of both the adjective and the noun (“Only the SQUARE HOUSE ...”, (37-a-ii)). After a question with a restrictor (“Which square ...?”) they prefer in over 70 per cent of the cases the sole accentuation of the noun (“Only the square HOUSE ...”, (37-b-i)). The data show a clear correlation between the question types and the test persons’ accentuation preferences (p-value < 0.0001). They corroborate the hypothesis of optimal accentuation. Let me take stock. The nuclear stress rule predicts an incorrect stress pattern for the answer sentence of dialogue (37-a) in which a question without a restrictor is under discussion. It is therefore not always sufficient to accentuate only the final, i.e. rightmost word of a focus. The answer sentences of the dialogues (37-a) and (37-b) have the same foci; nonetheless, these foci are accentuated differently, depending on context. Focus accentuation therefore can be context-dependent: a declarative sentence with a given focus structure can serve as an answer to different, non-equivalent questions. Which of these questions is under discussion has no influence on the focus structure of the sentence. However, the experimental data show that the choice of the question under discussion has an influence on which words of the focus have to be accentuated. Note that a theory of optimal accentuation does not make any predictions on the acoustic realization of accents. Acoustic correlates of accents can be a longer duration, a higher intensity, an extreme value of the fundamental frequency (f0) and a high spectral tilt. An accent need not be realised by all these means – e.g. a word can be perceived as accentuated even though it does not correlate with an extreme f0-value. In the recordings that are used for the experiment there are pitch accents on all accentuated words, i.e. both accentuated adjectives and accentuated nouns are correlated with extreme f0-values. A theory of optimal accentuation does not claim that this must always be the case. One might object that the specification of the foci in the answers to the question with a restrictor was not correct. Instead of the entire phrase that serves as subject, only the noun had to be focused: (38)
Which square is in the left field? – Only the square [HOUSE]F is in the left field.
This is not correct, however: the focus of the answer sentence in example (38) is, first, associated with “only” and, secondly, serves the function of
Optimal Accentuation vs Focus Accentuation 187
establishing the question–answer congruence. If only “house” is focused, then the question of (38) has to be interpreted in the sense of “Of what kind is the square in the left field?” Under this interpretation, the questioner asks for a property, not for an object. Since the question “Which square is in the left field?”can be answered by naming an object – e.g. “Object 1 is in the left field” – this interpretation is not convincing. Apart from that, the crucial data are the test persons’ evaluation of the answers to the questions without a restrictor (“What is in the left field?”). There is no dispute that in these answers the entire subject has to be focused. The accentuation of foci is context-dependent. It is not sufficient to identify the foci in the syntactic structure of a sentence; additionally it is necessary to identify which words in the foci must be accentuated given a specific discourse context. For this purpose, Selkirk (1995) introduces the technical feature f . Selkirk defines rules for the assignment of the f -feature, specifies the relation between foci and expressions that carry the f -feature and determines which of these expressions are to be accentuated: 1. The feature f can be freely assigned to any word. A collection of rules of the following schema is given, in which the variable W can be replaced with the symbol for any word class:
[W ] f → W Each word that is assigned the f -feature by such a rule is accentuated. If in the sentence “Only the square house is in the left field” both the words “square” and “house” must be accentuated, they must both be assigned the f -feature: (39)
Only the [SQUARE] f [HOUSE] f is in the left field.
If in the same sentence only the word “house” must be accentuated, only this word should receive the f -feature: (40)
Only the square [HOUSE] f is in the left field.
2. The f -feature can be projected from the internal argument of a phrase to the phrase head. The determiner “the” is the head of the phrase “the square house”, “house” is an argument of the head, and “square” is a modifier of “house”. The f -feature can be projected from the noun “house” to the determiner “the”:
188
Accentuation and Interpretation
(41)
a. b.
Only [the] f [SQUARE] f [HOUSE] f is in the left field. Only [the] f square [HOUSE] f is in the left field.
A word that is assigned an f -feature through projection – here “the” – is not accentuated. 3. From the head of a phrase the f -feature can project to the entire phrase: (42)
a. b.
Only [[the] f [SQUARE] f [HOUSE] f ] f is in the left field. Only [[the] f square [HOUSE] f ] f is in the left field.
4. A focus is an expression that carries the f -feature and that is not syntactically dominated by any other expression that carries the f -feature. In the sentences (42-a) and (42-b) it is the entire phrase “the square house” that is focused: (43)
a.
Only [[the] f [SQUARE] f [HOUSE] f ]Ff is in the left field.
b.
Only [[the] f square [HOUSE] f ]Ff is in the left field.
Selkirk’s rules make different stress patterns possible for the focus “the square house”; the focus can be pronounced either with or without accentuating the adjective “square”. It is the discourse context that determines whether “square” must be accentuated, i.e. whether it must carry the f -feature. According to Selkirk the adjective “square” must carry the f -feature and be accentuated if it is contextually “new”; if on the other hand it is contextually “given”, it cannot carry the f -feature and cannot be accentuated: 1. Every expression that carries the f -feature and that is not a focus, is contextually new. 2. Every expression that does not carry the f -feature, is contextually given. 3. Every focus is contextually new or given. The distinction between contextually given and new expressions is not always precisely defined in the literature. Let us follow Halliday (1967):35 An expression is contextually new if it cannot be deduced from the previous discourse or from the broader situation, if it contrasts to a previously uttered or somehow derivable alternative, or if it replaces the wh-element 35 Cf. also Prince (1981) and Schwarzschild (1999). Schwarzschild (1999) gives a precise definition of “given”.
Optimal Accentuation vs Focus Accentuation 189
of a question under discussion. Otherwise it is given. In the answer to the question what is in the left field, the adjective “square” is new, and must therefore be f -marked and accentuated: (44)
What is in the left field? – Only [[[the] f [SQUARE] f [HOUSE] f ] f ]F is in the left field.
In the answer to the question which square is in the left field, the adjective “square” is given; it cannot be f -marked nor accentuated: (45)
Which square is in the left field? – Only [[[the] f square [HOUSE] f ] f ]F is in the left field.
That is, according to Selkirk, the accentuation of the focus “the square house” depends on the context. Selkirk predicts the same stress patterns for the answer sentences of examples (44) and (45) as a theory of optimal accentuation. With Selkirk’s rules stress patterns can be described as grammaticalised while still being context dependent. Unlike the nuclear stress rule, her rules specify the stress patterns of the answer sentences of (44) and (45) correctly. However, Selkirk does not always predict the same stress patterns as a theory of optimal accentuation: in general, Selkirk’s theory states that the internal arguments of a phrase can receive the f feature only through free assignment; if they carry the feature, they must be accentuated. The head of a phrase can receive the f -feature also through projection; that is, a head that carries the f -feature does not always have to be accentuated. This means that there is a general asymmetry between the accentuation of contextually new, f -marked heads and new, f -marked arguments of a phrase, a so-called head-argument asymmetry. Optimal accentuation does not entail such an asymmetry. As a result, accentuation predictions of optimal accentuation differ regularly from Selkirk’s.36 Consider the following example (46) together with the reference world depicted in Figure 5.3: 36 In the answer of “What did John do? – John [introduced Bill to Sue] .” the names F “Bill” and “Sue” are contextually new and therefore must be f -marked. They cannot receive the f -feature through projection and must therefore be accentuated. The verb “introduced” is new as well, and must also be f -marked. Because it is the head of the verb phrase “introduced Bill to Sue”, however, it receives its f -feature through projection, which means that it is not accentuated. From “introduced” the f -feature is projected onto the entire verb phrase, which functions as focus. Selkirk therefore predicts a stress on “Bill” and on “Sue”. This stress pattern can either mark “Bill” and “Sue” or the entire phrase “introduced Bill to Sue” as focused. The accentuation of “Bill” and “Sue” does not fully specify the focus-background structure of the sentence “John only introduced Bill to Sue”; for further specification, it is necessary to refer to the discourse context – here: the question under discussion.
Accentuation and Interpretation
190
1
Peter
Jan
Thomas
Frank
2
Peter
Jan
+
+
-
-
DEL
Thomas
Frank
DEL
Figure 5.3 Reference world for example (46)
(46)
What does Peter do? – Peter reduces the square.
Figure 5.3 shows a game board with three objects: a triangle, a square and a cross. The objects can be manipulated with the buttons on the left in three different ways: they can be enlarged, reduced or deleted. That is, altogether there are nine possible actions: enlarging/reducing/deleting the triangle/square/cross. Four players can take these actions, namely Peter, Jan, Thomas or Frank. The first picture on the left shows the game board before an action is taken; the second picture on the left shows the game board after Peter – whose name is printed in larger size than the other names – reduced the square. The questioner of example (46) asks which action Peter takes according to Figure 5.3. Following Selkirk, the verb “reduces” and the noun “square” are both contextually new in the answer of the example and therefore have to be f marked. The f -feature is freely assigned to the noun “square”; it therefore has to be accentuated. From “square” the f -feature is projected to the head “the” of the phrase “the square”; from “the” it is projected to the entire phrase which is an internal argument of the verb “reduces”. The f -feature is projected to the verb (head) and from the verb to the entire verb phrase “reduces the square”. The verb phrase becomes a focus. “Reduces” is contextually new, but since it can receive the f -feature by projection it does not have to be accentuated: (47)
What does Peter do? – Peter [[reduces] f [[the] f [SQUARE] f ] f ]F .
Following a theory of optimal accentuation, the recognition of the verb “reduces” is critical for the understanding of the entire answer. The verb therefore has to be accentuated:
Optimal Accentuation vs Focus Accentuation 191
(48)
What does Peter do? – Peter REDUCES the SQUARE.
A focus theory a` la Selkirk and a theory of optimal accentuation predict different stress patterns for example (46). Which theory is right? Excursus: Schwarzschild (1999) criticises Selkirk’s rule system. He shows that the rules of contextual linking are ad hoc and lead to partially incorrect predictions. Schwarzschild proposes an alternative rule system, according to which f -marking is less strongly syntactically restricted. In his model, though, there is still a head-argument asymmetry – unlike in a theory of optimal accentuation. This may in fact be an important reason for him to adhere to the idea of syntactic f -marking: “We stop short of eliminating f -marking altogether, but this move is strongly suggested” (Schwarzschild (1999), 143). It is not necessary to illustrate Schwarzschild’s rule system here. For example (46), Schwarzschild predicts the same stress pattern as Selkirk, which differs from the one optimal accentuation predicts; the discussion of the rule system applies equally to his system. Gussenhoven (1984) defines a differently motivated rule (system), the Sentence Accent Assignment Rule. According to his rule, there is also an asymmetry similar to the head-argument asymmetry which is not entailed by a theory of optimal accentuation. The results of experiments whether there is such an asymmetry and of what kind this asymmetry is also apply to the evaluation of his rule. (End of excursus) We perform a perception experiment to test the stress predictions of Selkirk against the predictions of optimal accentuation. The experimental design is very similar to that of the nuclear stress rule-experiment described before. Recordings of four dialogues like the ones in (49-a) and (49-b) are played to 47 test persons: (49)
a.
(i) (ii)
b.
(i)
(ii)
Was macht Peter? – Peter verkleinert das QUADRAT. (What does Peter do? – Peter reduces the SQUARE.) Was macht Peter? – Peter VERKLEINERT das QUADRAT. (What does Peter do? – Peter REDUCES the SQUARE.) Was verkleinert Peter? – Peter verkleinert das QUADRAT. (What does Peter reduce? – Peter reduces the SQUARE.) Was verkleinert Peter? – Peter VERKLEINERT das QUADRAT. (What does Peter reduce? – Peter REDUCES the SQUARE.)
192
Accentuation and Interpretation
The dialogues (49-a) and (49-b) differ only with respect to the questions: (49-a) inquires about an action; (49-b) inquires about an object. As before, the recordings of the dialogues only differ in the stress pattern on the answer: in the first recording, only the noun (“square”) is accentuated; in the second recording the verb (“reduces”) is also accentuated. The test persons must decide for each dialogue which of the recordings – i.e. which stress pattern – is better in their perception. Just as the other experiment, this too is a forced-choice experiment; the test persons must choose one of the recordings. Again, controlled test conditions are constituted by presenting the dialogues in connection to game boards similar to the one in Figure 5.3. The test persons are told that the questioner does not know what the respective player (Peter) is doing on the game board. The answerer on the other hand does know. He should pass this information on to the questioner in an adequate manner. Focus theories state that in the answer to a question like “What does Peter do?” (dialogue (49-a)) the entire verb phrase “reduces the square” is focused. In the answer to a question like “What does Peter reduce?” (dialogue (49-b)), only “the square” is focused. According to Selkirk, the f feature can be freely assigned to the noun “square”, and from there it can be projected via “the” to “the square”. From “the square”, the f -feature can then be projected onto the verb “reduces” and beyond onto “reduces the square”. Regardless of the question whether the verb phrase or the object noun phrase is focused, only the noun “square” must receive the f -feature through free assignment. That means that in both contexts, only “square” must be accentuated. Hence, Selkirk predicts that the test persons prefer recordings like (49-a-i) and (49-b-i) and that no correlation between the preference for a stress pattern and the context can be observed. According to a theory of optimal accentuation, the stress pattern of the answer depends on the context: in the answer to the question what Peter does (dialogue (49-a)), the verb “reduces” and the noun “square” must be accentuated; in the answer to the question what Peter reduces (dialogue (49-b)), accentuating the noun suffices. That is, a theory of optimal accentuation predicts that the test persons prefer recordings like (49-a-ii) and (49-b-i) and that a correlation between the preference for a stress pattern and the context can be observed. These are the results of the experiment: in the case of an answer to an inquiry about an object (like in dialogue (49-b): “What does Peter reduce?”) the test persons prefer the sole accentuation of the noun in almost 80 per cent of the cases. In the case of an answer to an inquiry about an action (like in dialogue (49-a): “What does Peter do?”) the test persons prefer the sole
Optimal Accentuation vs Focus Accentuation 193
accentuation of the noun in almost 60 per cent of the cases. That is, in both contexts a majority of the test persons prefer the sole accentuation of the noun; this corresponds to Selkirk’s predictions. However, the data show a highly significant correlation between context and stress preference. The one-sided t-test for the comparison of Selkirk’s hypothesis (null hypothesis: There is no correlation between context and accentuation) and the hypothesis of optimal accentuation (alternative hypothesis: There is a correlation between accentuation and context, to the extent that the verb is more likely to be accentuated if it does not occur in the interrogative sentence) yields a p-value of < 0.0001. In this respect the data support the expectations of optimal accentuation and contradict Selkirk’s prediction. The results of the experiment neither confirm focus theories a` la Selkirk (1995), nor do they confirm a theory of optimal accentuation. If – as Selkirk assumes – the verb in the answers of the test dialogues is not be accentuated at all, there should be no correlation between context and stress preferences. Yet, the results are highly significant in favour of such a correlation; Selkirk’s assumption therefore cannot be correct. However, the fact that in all contexts a majority of test persons prefer the recording with the sole accentuation of the noun (“Peter reduces the SQUARE”) to the recording in which both the verb and the noun are accentuated (“Peter REDUCES the SQUARE”), seems to support Selkirk’s assumption. That is, in the case of an answer to a question such as “What does Peter do?”, a majority of the test persons prefer recordings with a non-optimal stress pattern. How can these results be explained? – In the recordings in which the verb (“verkleinert”, reduces) and the noun (“Quadrat”, square) are accentuated, both words are accentuated about equally strong and both words bear pitch accents. A possible explanation of the results can be that the test persons consider the stress on the verb in the recordings that they hear too strong. It is possible that the predictions of optimal accentuation are correct, but that the stress on the verb must be weaker than the one on the noun. Let us sidestep for a moment in order to explain how we arrive at this idea: Wagner (2002) shows that in sentences that are pronounced in citation form, words of different word classes are accentuated with different strengths. Furthermore, content words at the end of intonational phrases are accentuated stronger than words before them.37 (So far, I have treated stress as a binary feature. For Wagner, stress is a gradual feature.) A declarative sentence is pronounced in its citation form if it is uttered as an answer to the question what is the case. According to Wagner, when “Pe37 The latter point may contribute to the explanation of why the (incorrect) nuclear stress rule often appears quite plausible.
194
Accentuation and Interpretation
ter verkleinert das Quadrat” (Peter reduces the square) is pronounced in its citation form, the proper name “Peter” and the noun “Quadrat” are accentuated stronger than the verb “verkleinert”. “Quadrat” is the last content word of an intonational phrase and therefore receives the strongest stress; the stress on the article “das” is so weak that it can be ignored: (50)
PETER++++ VERKLEINERT+++ das QUADRAT+++++ . (PETER++++ REDUCES+++ the SQUARE+++++ .)
According to optimal accentuation, in a sentence in citation form all the words must be accentuated whose non-recognition cannot be compensated for through standard operations of semantic enrichment. When the sentence “Peter verkleinert das Quadrat” is pronounced in citation form, the words “Peter”, “verkleinert” and “Quadrat” must optimally be accentuated. The article “das” does not need to be accentuated, because it can be compensated for through type-shifting the meaning representation of “Quadrat”. A theory of optimal accentuation therefore makes the same predictions that Wagner makes with regard to which words are to be accentuated in a sentence in citation form. However, because the theory as I outlined it so far treats stress as a binary feature, it does not predict different stress strengths – unlike Wagner. In dialogue (49-a) the sentence “Peter verkleinert das Quadrat” answers the question what Peter is doing. This question is not equivalent to the general question what the case is. The sentence is therefore not pronounced in citation form. Let us assume that a theory of optimal accentuation correctly predicts which words are to be accentuated in sentences that are not pronounced in their citation form, and let us follow Wagner in determining stress strength. According to these assumptions, both the verb “verkleinert” and the noun “Quadrat” must be accentuated; the verb, however, must be accentuated weaker than the noun: (51)
Was macht Peter? – Peter VERKLEINERT+++ das QUADRAT+++++ . (What is Peter doing? – Peter REDUCES+++ the SQUARE+++++ .)
There is no reason to assume that all words must be accentuated equally strong when the sentence is not pronounced in its citation form. It is therefore not merely possible but even likely that the sentence, when uttered as a reply to the question what Peter is doing, must be optimally accentuated as indicated in example (51). If this is indeed the case, we presented inadequate stress patterns to the test persons in our experiment. We can explain the test results by assuming that the majority of the test persons chose the recording without stress on the verb in order to make sure that
Optimal Accentuation vs Focus Accentuation 195
it has a weaker stress than the sentence-final noun. The test persons chose the recording with stress on the verb more often when the question what Peter was doing was under discussion, because it is only in this answer that the verb has to carry an accent at all – albeit a weaker one than the noun.38 We perform a production experiment in order to test the hypothesis on different stress strengths. In the experimental setup we keep close to the perception experiment performed before: a person sitting in front of a test person shows pictures that are similar to the one depicted in Figure 5.3. He asks questions like “Was macht Peter?” (“What does Peter do?”) or “Was verkleinert Peter?” (“What does Peter reduce?”). The test person looks at the pictures and answers the questions with complete sentences like “Peter verkleinert das Quadrat.”(“Peter reduces the square.”). The test person does not read the answer but generates it freely only by looking at the picture. The test persons’ answers are recorded and then analysed. The experiment is split into a training phase and a test phase. During the training phase, the test persons answer three questions to get familiarised with the task. During the test phase, the test persons answer altogether 12 questions of the kinds “What does X do?”, “What does X enlarge/reduce/delete?” and “Who enlarges/reduces/deletes the Y?”. Twentysix test persons finish the experiment. In the acoustic analysis of the answer sentences we concentrate on the acoustic correlates of the verbs and the nouns in the object noun phrases. We are in particular interested to find out whether the verbs in answers to action-inquiries like “What does Peter do?” are realised differently from 38 Objection: above, it was shown that the recipients of the dialogue in (i) clearly prefer that both the adjective “quadratisch” (square) and the noun “Haus” (house) are accentuated:
(i)
Was ist im linken Feld? – Nur das quadratische Haus ist im linken Feld. (What is in the left field? – Only the square house is in the left field.)
Should we not expect that the preference for accentuation of the adjective is weaker than the experiment showed – that is, that it turns out to be just as weak as the preference for accentuation of the verb in dialogue (49-a)? Reply: there are indeed two reasons to expect that “quadratische” will be accentuated weaker than “Haus”. According to Wagner (2002), nouns are generally accentuated more strongly than adjectives; furthermore, the noun “Haus” is at the end of an intonational phrase. However, Wagner also argues that adjectives are generally accentuated more strongly than verbs. It should therefore not surprise us that test persons judge the stress pattern of “Nur das QUADRATISCHE HAUS ist im linken Feld” (Only the SQUARE HOUSE is in the left field) differently from the stress pattern of “Peter verkleinert das Quadrat” (Peter REDUCES the SQUARE.)
196
Accentuation and Interpretation
the verbs in answers to object-inquiries like “What does Peter reduce?”. The boundaries of each verb and each noun are hand labelled by a phonetic expert. Then the differences between the fundamental frequencies in the lexically stressed syllables and the average fundamental frequencies of the answers are computed. Moreover, the verb and noun durations are measured. The absolute intensity level or related measures are not taken into account, because we do not control the distance to the microphone in the recordings. These are the results: (a) there are no significant, context-related differences in the realizations of the nouns; the nouns are accentuated in the action-inquiry and in the object-inquiry contexts in the same way. (b) In answers to action-inquiries (“What did Peter do?”) the verbs are produced on average 42 ms longer than in answers to object-inquiries (“What did Peter reduce?”). We find a clear correlation of contexts and verb durations – the two-sided t-test yields a p-value of < 0.0001. When taking into account that the increase is probably concentrated on the lexically stressed syllable, it is clear that an average difference of 42 ms is audible and increases the perceptual prominence or level of accentuation of the word.39 Moreover, the effect is very robust – 22 out of 26 test persons significantly lengthen the verb in the action-inquiry context. (c) No acoustic difference could be found for the fundamental frequency – the values are almost identical for verbs in answers to action-inquiries and to object-inquiries. We interpret the results as a corroboration of our hypothesis: in the action-inquiry context, all i-critical words are accentuated. However, there is an asymmetry of accentuation: the verbs are accentuated less strongly – only by an increase of duration, without an extreme value of f0 – than the sentence final nouns which are also marked by an f0-extremum. Let me take stock. There is an asymmetry in the accentuation of i-critical words that according to a theory of optimal accentuation must be accentuated. Selkirk (1995) understands this asymmetry as a syntactic asymmetry between phrase heads and their arguments. She describes it by means of a syntactic rule for the projection of the f -feature. I-critical words (in Selkirk’s analysis, words that are contextually new) that must carry the f -feature but that can receive it through projection are not accentuated. I suspect that the asymmetry can be explained as an asymmetry in stress strength, without having to refer to a syntactic rule of f -projection. Stress must accordingly be treated as a gradual feature – and not as a binary one, as I have done so far. Furthermore, the accentuation rule defined in Chapter 4 must be 39 Compare the 42 ms with the 6 ms measured in the Beaver-Experiment that was discussed in section 2.1!
Optimal Accentuation vs Focus Accentuation 197
expanded with rules for the determination of stress strength, i.e. along the lines of Wagner (2002) so that stress strengths are determined as depending on word classes and positions in intonational phrases. 2.3
Summary
Accentuation is context-dependent: 1. According to focus theories, “only” and “always” are focus adverbs that must (structured meaning theory) or at least can (alternative semantics) be associated with a focus in their domains. It is not always the case that an expression in the domain of “only” and “always” is accentuated. Focus theories must therefore either admit that not every focus is accentuated, or they must admit that a focus adverb need not always be associated with a focus. Such admissions weaken the predictive power of these theories: the explanation that a declarative sentence with “only” or “always” has certain truth conditions, so that it shows a certain focusbackground structure and must therefore be accentuated in such-andsuch a way, is not valid.40 40 If not every focus is accentuated, then the accentuation of a word is not a necessary condition for the word to be part of a focus. According to Lambrecht ¨ (1994) and Buring (1999), the accentuation of a word is also not a sufficient condition for the word to be in focus. Stress – as they claim – is also used to to mark topics:
(i)
What did the pop stars wear? – The [FEMALE]T pop stars wore [CAFTANS] F .
¨ Example (i) comes from Buring (1999). In the answer sentence, “female” and “caftans” must be accentuated. “Caftans” is a pragmatic focus that establishes the question–answer congruence; “female” on the other hand cannot be interpreted as ¨ ¨ focus, but, according to Buring, serves as topic. If Buring is correct in claiming that the word “female” in example (i) cannot carry a focus feature, focus and stress are only loosely connected: foci are not always accentuated, and stress does not always serve to mark foci. The hypothesis that topics can be accentuated is relevant for the assessment of focus theories. Yet, I only mention it in a footnote: first, I do not want to be forced to discuss whether the use of the theoretical term “topic” is at all useful, and, secondly, ¨ Lambrecht and Buring claim that topics are accentuated in a different way than foci, i.e. that topic accentuation robustly correlates to different features of the speech signal. I do not have data to my disposal that would allow me to test this claim. According to a theory of optimal accentuation, the sentence “The FEMALE pop star wore CAFTANS” presupposes and partially answers the question which pop stars wore what. The theory uses neither the theoretical term “focus” nor the theoretical term “topic”. (I presume that “What did the pop stars wear?” is interpreted in the sense of “For each pop star: what did he or she wear?”, and not in the sense of “Which pieces of clothing were worn by all pop stars together?” That is, the presupposed question is equivalent to the question that was actually asked.)
198
Accentuation and Interpretation
It can plausibly be assumed that speakers only accentuate foci that they consider to be pragmatically motivated. The phenomenon that foci associated with “only” are often accentuated without pragmatic motivation can be explained by referring to the normal use of “only”: “only” is normally associated with a pragmatic focus, which has to be accentuated. A speaker can be uncertain about the role of a focus associated with “only” and accentuate it, even if the focus is not pragmatic and therefore should not be accentuated. I suspect that this uncertainty is more likely to occur when reading unknown texts (such as the experiment texts of Beaver et al. (2007), and less likely in spontaneous speech. If the assumption that focus stress is motivated pragmatically is not only plausible but also correct, semantic accentuation effects of focus theories are best explained as epiphenomena of pragmatic focus accentuation. 2. A focus that has to be accentuated can consist of multiple words. Which of these words must carry stress depends on the utterance context. An accentuation rule such as the nuclear stress rule, which does not refer to the discourse context, predicts partially inadequate stress patterns. Selkirk (1995) introduces the syntactic feature f , so that the words to be accentuated can be determined in relation to the utterance context: contextually new words must carry the f -feature; they can receive this feature either through free assignment, or through projection. If a word receives the f -feature through free assignment, it must be accentuated; if it receives the feature through projection, it is not accentuated. The deciding reason for positing the f -feature lies in the explanation of the lack of accentuation on certain contextually new words. The feature would not be needed if all contextually new words were accentuated. Within a theory of optimal accentuation, the lack of accentuation on contextually new words is accounted for by means of type-shifting rules: it must be possible to compensate for the non-recognition of unstressed words by referring to the discourse context, or through standard typeshifting operations. New words that are not recognised cannot be compensated for by referring to the context. If there is no need to recognise (and hence accentuate) such words, it must be possible to compensate for them through type-shifting. Asymmetries in the accentuation of contextually new words that cannot be accounted for through type-shifting rules must be interpreted as asymmetries in the strength of accentuation.
Optimal Accentuation vs Focus Accentuation 199
3 Conclusions In the present chapter, I showed that semantic effects of stress patterns can be accounted for as epiphenomena of optimal accentuation without the need for the theoretical term “focus”. Focus theories form the standard account for semantic effects of stress patterns. I deviate from this standard. When one deviates from a standard, one must motivate one’s decision. I have three reasons for the deviation: first, theories should be simple; dispensing with a theoretical term makes a theory simpler. The theoretical term “focus” should therefore be dispensed with if this is possible. I have shown that this is indeed possible. (The burden of proof now lies with the proponents of focus theory: rather than the abandonment of the term “focus”, its use must be justified.) Secondly, a theory of optimal accentuation makes predictions that are partially different from the predictions of focus theories. I claim that the predictions of a theory of optimal accentuation are more accurate than the predictions of focus theories. In this chapter, I have supported my claim with experimental data. Thirdly, a theory of optimal accentuation not only describes how utterances are accentuated, it also explains why accentuation occurs. One might get the false impression that the development of a theory of optimal accentuation, in which the term “focus” is not used, is a radical undertaking. In truth, such a theory is rather conservative, and resembles focus theories in several essential aspects. The central ideas of focus theories deal with the accentuation of foci that consist of multiple words – after all, if the assignment of foci and stress were a one-to-one relation, the term “focus” would not be needed – and with the interpretation of foci: 1. Concerning accentuation: constituent answers are focused without each of their words being accentuated. Focus theories account for these omissions of stress by specifying a focus projection rule. In a theory of optimal accentuation these rules are replaced by standard rules of semantic type-shifting. This essentially amounts to shifting the omission rules from syntax to semantics. 2. Concerning interpretation: foci serve to form sets of alternatives or to identify contextually given sets; a theory of optimal accentuation states that on the basis of stress patterns presupposed background questions can be inferred. Suppose a sentence including stress pattern is given, for which a set of alternatives (focus theory) and a background question (theory of optimal accentuation) can be construed; if now the background question is represented with a QUAD consisting of a question abstract and a question restrictor, then the restrictor can be identified
200
Accentuation and Interpretation
with the focus-theoretic set of alternatives. It is therefore not surprising that focus theories and a theory of optimal accentuation generally arrive at equivalent interpretations. The ideas of focus theories and of a theory of optimal accentuation with respect to the omission of accentuations and to the interpretation of stress patterns are similar, but not identical. The theories therefore make partially different predictions for stress patterns. Some of the different predicitions may result from different notions of stress: the nuclear stress rule becomes substantially more plausible if one assumes that it only talks about pitch accents. I do not reduce stress to extreme values of the fundamental frequency (f0). I consider such a reduction inadequate because recipients may recognise words as accentuated even when they do not correlate to an f0 minimum or maximum.41 If focus theories in fact only deal with the interpretation of pitch accents, then a theory of optimal accentuation is a radical undertaking. It would mean that the theory does not explain a phenomenon in a different way from how it has been explained so far; instead, it would explain a different phenomenon.
41 That is, not only the words that Pierrehumbert (1987) marks with “H*” or “L*” count as accentuated. A word can also be recognised as accentuated when it does not correlate with an extreme value of the fundamental frequency, but rather with relatively high intensity and long duration, or when it is highlighted against the other words through hyperspeech. Cf. Chapter 2.
6 Summary Starting out with Shannon (1948), I developed the outlines of a model of active interpretation: a speaker transmits a message φ by uttering a sequence of words A1 · . . . · An , thus sending an acoustic signal to the recipient. The recipient receives and decodes the signal. On the basis of the words that he recognises, he reconstructs the speaker’s message. Hypospeech on the part of the speaker, acoustic disturbances in the communication channel and attention deficits on the part of the recipient may lead to problems in recognition, so that the recipient recognises a sentence directed at him only partially. Hypospeech, acoustic disturbances and attention deficits occur regularly; incomplete recognition must therefore regularly be taken into account. In order to be able to reconstruct the message that an incompletely recognised sentence conveys, the recipient has to semantically enrich the sentence parts that he recognised. I represent messages as sentences of the type-logical language QL. In Chapter 4, I defined the following operations for the reconstruction of messages: 1. Natural-language expressions that have been recognised can be translated into type-logical meaning representations: words are translated with a lexicon; complex expressions are translated by functional application of the translations of their parts; in this way, completely recognised natural-language sentences can be translated into sentences of QL, i.e. into representations of entire messages. 2. Type-logical expressions can be modified through operations of semantic enrichment. If a sentence is recognised only incompletely, the translations of the recognised parts may be independently (actively) expanded into a sentence of QL my means of these operations: (a) semantic enrichment can be carried out with reference to the discourse context through the use of QUADs (question abstract domain pairs) or through the use of representations of already uttered expressions. (b) Without reference to the discourse context, semantic enrichment can be carried out through the application of the exhaustification function exh and through typeshifting.1 1
The distinction between semantics and pragmatics is not very sharp in (contd)
201
202
Accentuation and Interpretation
Semantic enrichments are not controlled by the speaker; the recipient carries them out on his own. It is possible that the recipient can reconstruct several non-equivalent messages on the basis of the words that he recognised. In order to find out which of the possible reconstructions corresponds to the message that is intended by the speaker, the recipient requires criteria that enable him to distinguish between the messages that a speaker may intend in a given situation – that is, what kinds of messages are intendable in the given discourse situation – from the messages that he cannot intend. In Chapter 3, I defined such criteria, which serve as criteria of adequacy both for making utterances and for interpreting utterances. If a speaker obeys the adequacy criteria – I presuppose that a cooperative speaker always does – the message he intends is an element of the set of intendable messages. If the speaker expresses himself clearly and intelligibly, the intersection of the set of intendable messages and the set of reconstructable messages contains exactly one element, namely the intended message.2 The interpretation of an utterance depends on the discourse context: first, contextually available material – i.e. a QUAD or the type-logical representation of an expression that has already been uttered – can be used to semantically enrich a reconstruction; secondly, the adequacy of a reconstructed message is determined in reference to the discourse context. The goal of interpreting an utterance is to reconstruct a contextually adequate message. A recipient can proceed goal-oriented (top-down) and uses the words that he recognised to select a message from the set of messages that are compatible with the adequacy criteria. In doing so, he may occasionally ignore some of the recognised words, e.g. to compensate for syntactic anomalies. Alternatively, a recipient can proceed data-oriented (bottom-up). If he recognises a sentence completely, he can translate the sentence into the representation of a message. If this message is not adequate given his representation of the discourse context, he may be able to “make” the message adequate by accommodating his context representation. Of course, the accommodation of context representations is restricted; a recipient cannot my model. Following Peregrin (1999), one can count the invariants of interpretation specified in lexicon and grammar to semantics, while the free application of operations for semantic enrichment – regardless of whether these refer to the discourse context or not – can be considered part of pragmatics. 2 Weaker: a speaker’s utterance can also be clear and intelligible if it allows more than one reconstruction that is compatible with the adequacy criteria. However, the recipient must be able to decide with some certainty which of the reconstructable messages is the one the speaker intended. The probability that the speaker intends exactly this message must be higher than the probabilities for the other possible messages.
Summary
203
make just any message adequate by accommodating the context representation arbitrarily. In order to reconstruct a message, a recipient need not necessarily recognise all of the words that the speaker uttered, but he must recognise some of them. The words that would be necessary and sufficient for reconstructing the whole message are the words that are critical for interpretation, the so-called i-critical words. A cooperative speaker wishes to be understood; therefore, he must express himself in such a way that the recipient will recognise at least the i-critical words. (If there are multiple sets of icritical words, the speaker must make sure that the recipient recognises one of these sets.) The speaker can increase the probability that the recipient recognises the i-critical words by emphasising these words through accentuation. I posit the hypothesis that accentuation only serves to highlight i-critical words; I call this hypothesis “the hypothesis of optimal accentuation”: In the optimal case, a cooperative speaker accentuates a minimal number of words. The words that are accentuated are those words that – when recognised – suffice for the recipient to understand the entire meaning of the speaker’s utterance. Which words in a sentence are critical for understanding depends on which message the uttering of the sentence is to convey and in which discourse context the sentence is uttered. If a recipient knows the discourse context, it must be the case that he can reconstruct the message on the basis of the i-critical words alone. That is, the i-critical words that have to be accentuated can be determined on the basis of the reconstruction operations defined in Chapter 4. Accordingly, I defined a three-place function accentuate that makes reference to the reconstruction operations. The function accentuate assigns stress patterns to sentences for specific messages and discourse contexts. Accentuation is just one aspect of prosody. Apart from emphasising icritical words, prosody performs other functions; it serves, for example, to mark a speaker’s attitudes, emotions, locutionary modes, etc.3 The various aspects of prosody can interfere with each other, which can affect the realisation of stress. The function accentuate determines which words of a sentence are to be accentuated. It does not determine in which way and how strong these words are to be accentuated, and it also does not determine which of the possible acoustic correlates a specific stress must have. Certainly, words that are to be accentuated must be accentuated strong enough to effectively increase the probability that they are recognised correctly. 3
Cf. e.g. Pierrehumbert and Hirschberg (1990).
204
Accentuation and Interpretation
Accentuation is not the only way to emphasise words or word sequences. Another means is, for example, topicalisation. In a language such as German, it is possible to position almost any sentence part, even syntactically incomplete parts, at the front of the sentence.4 Although other means of emphasising can enhance accentuation, they cannot replace it – even icritical words that are topicalised must be accentuated. According to the hypothesis of optimal accentuation, stress patterns do not have a direct semantic function. They can, however, have an indirect semantic effect: the stress pattern of a sentence must be optimal given a specific discourse context; accentuation therefore presupposes a corresponding discourse context; this context can influence the interpretation of the sentence. Semantic effects of stress patterns can be described as epiphenomena of optimal accentuation; the theoretical term “focus” is not needed to account for them. The hypothesis of optimal accentuation can be empirically evaluated in combination with a model of active interpretation: on the basis of the hypothesis combined with an interpretation model, one can predict how sentences with given stress patterns and in a given discourse contexts will be interpreted and how sentences with given interpretations and in given discourse contexts will be accentuated. These predictions partly contradict predictions made by other theories that describe the relation between accentuation and interpretation (in casu: focus theories). We conducted experiments that tested several predictions. The results of these experiments support the hypothesis of optimal accentuation.
4 Compare the following example from Muller ¨ (1998): “Dem Peter zu geben versucht hat das Buch keiner.” (the.dat Peter to give tried has that.acc book no one.nom: No one has tried to give that book to Peter.)
Appendix: Type Logic with Lambda Operator 1 Syntax Definition A-1 (Types) The set of types T is the smallest set for which the following holds: (a) e ∈ T, t ∈ T. (b) If σ ∈ T and τ ∈ T, then σ, τ ∈ T. Definition A-2 (Syntax of TL) (a) The vocabulary of TL contains: (i) Non-logical constants: – For each type σ there is a set of constants of this type: P1σ , P2σ , ...1
– Identity: let = be a predicate constant of type e, e, t.
(ii) Variables: for each type σ there is a set of variables of this type: X1σ , X2σ , ...2 (iii) Connectives: ¬, ∧, ∨, →, ↔ (iv) Quantifiers: ∃, ∀ (v) The lambda operator: λ (vi) Auxiliary symbols: (, ), [, ], , (b) Well-formed expressions of TL: (i) If φσ is a constant or variable of TL, then φσ is a well-formed expression of type σ of TL. (ii) If φσ,τ and ψσ are well-formed expressions, then φσ,τ (ψσ ) is a wellformed expression of type τ. 1 When indices are not needed to distinguish constants, and when the type of a constant follows unambiguously from the context, indices and type designations can be left out. – As long as there is no risk of misunderstanding, constants can be represented by lower-case words: square, circle, ... (cf. Chapter 4). 2 When indices are not needed to distinguish variables, and when the type of a variable follows unambiguously from the context, indices and type designations can be left out.
205
206
Accentuation and Interpretation
(iii) If φt is a well-formed expression of type t, then ¬φt is also a well-formed expression of type t. (iv) If φt and ψt are well-formed expressions of type t, then (φt ∧ ψt ), (φt ∨ ψt ), (φt → ψt ) and (φt ↔ ψt ) are well-formed expressions of type t. (v) If φt is a well-formed expression of type t and ξ τ is a variable of an arbitrary type τ, then ∀ξ τ [φt ] and ∃ξ τ [φt ] are well-formed expressions of type t. (vi) If φτ is a well-formed expression and ξ σ is a variable of an arbitrary type σ, then λξ σ [φτ ] is a well-formed expression of type σ, τ . (vii) Nothing else is a well-formed expression of TL.
Definition A-3 (Formula of TL) A formula of TL is a well-formed expression φt of type t of TL. Definition A-4 (Free Variable) An occurrence of a variable ξ in a well-formed expression φσ is free iff: (a) φσ = ξ, or (b) there are well-formed expressions ψτ,σ and ω τ , so that φσ is the well-formed expression ψτ,σ (ω τ ) and ξ is free in ψτ,σ or in ω τ , or (c) there is a well-formed expression (a formula) ψt , so that φσ is the expression ¬ψt and ξ is free in ψt , (d) there are well-formed expressions (formulas) ψt and ω t , so that φσ is the expression ψt ∧ ω t , ψt ∨ ω t , ψt → ω t or ψt ↔ ω t and the corresponding occurrence of ξ in ψt or ω t is free, or (e) there is a formula ψt , so that φt is the formula ∀ζ [ψt ] or ∃ζ [ψt ], the corresponding occurrences of ξ in ψt is free and the variable ζ is different from ξ, or (f) there is a formula ψτ , so that φσ is the formula λζ [ψτ ], the corresponding occurrence of ξ in ψτ is free and the variable ζ is different from ξ. Definition A-5 (Sentence of TL) A sentence of TL is a formula φt of TL that does not contain free variables.
Appendix: Type Logic with Lambda Operator 207
2 Semantics Definition A-6 (Possible Denotations) Let a discourse domain D be given: (a) The set of possible denotations of well-formed expressions φe of TL is D e = D. (b) The set of possible denotations of well-formed expressions (formulas) φt of TL is D t = {true, f alse}. (c) The set of possible denotations of well-formed expressions φσ,τ of TL is σ D σ,τ = D τD (the set of functions F from D σ to D τ : { F | F : D σ → D τ }). Definition A-7 (Variable Assignment) A variable assignment in TL w.r.t. a universe of discourse D is a function g that assigns to each variable ξ σ of TL an element of D σ . Definition A-8 For variable assignments g and g and variables ξ it holds that g =ξ g iff for all variables ξ that are different from ξ, it holds that g(ξ ) = g (ξ ). Definition A-9 Let g be a variable assignment w.r.t. D, let ξ σ be a variable of type σ and let d be an element of D σ . Then gξdσ is the variable assignment g for which the following holds: (a) g =ξ σ g and (b) g (ξ σ ) = d. Definition A-10 (Reduced Model) A reduced model of TL is a (reduced) Kripkestructure M = D M , I M , [[ ]] M iff: (a) D M is a non-empty discourse domain of TL, (b) I M is a non-empty set of indices (possible worlds), (c) [[ ]] M is an interpretation function that assigns to every expression of TL for each index i ∈ I M a possible denotation of TL, so that for each i ∈ I M the following holds: (i) for all well-formed expressions φσ that do not contain free variables, it holds that: [[φσ ]]iM ∈ D σ . (ii) for all variable assignments g in TL, there is a function [[ ]] M,g , so that: (A) for all well-formed expressions φσ of TL it holds that: M,g [[φσ ]]i = [[φσ ]]iM if φσ does not contain free variables,
208
Accentuation and Interpretation M,g
(B) for all variables ξ σ of TL it holds that: [[ξ σ ]]i
= g ( ξ σ ), M,g
(C) for all expressions φe and ψe it holds that: [[φe = ψe ]]i iff
M,g [[φe ]]i
= true
M,g [[ψe ]]i ,
=
(D) for all well-formed expressions φσ,τ and ψσ it holds that: M,g M,g M,g [[φσ,τ (ψσ )]]i = [[φσ,τ ]]i ([[ψσ ]]i ), (E) for all well-formed expressions (formulas) φt and ψt the following holds: M,g
– [[¬φt ]]i – –
M,g
= true iff [[φt ]]i
M,g ∧ ψt ]]i M,g [[φt ∨ ψt ]]i
[[φt
= f alse,
M,g = true iff [[φt ]]i = M,g = f alse iff [[φt ]]i
M,g
true and [[ψt ]]i
= true, M,g
=
M,g
=
= f alse and [[ψt ]]i
f alse, M,g
– [[φt → ψt ]]i true,
M,g
– [[φt ↔ ψt ]]i
M,g
= true iff [[φt ]]i
M,g
= true iff [[φt ]]i
= f alse or [[ψt ]]i M,g
= [[ψt ]]i
,
and (F) for all formulas φt and all variables ξ σ the following holds: M,g
– [[∀ξ σ [φt ]]]i
= true iff for all variable assignments g =ξ σ g M,g
it holds that: [[φt ]]i –
= true,
M,g [[∃ξ σ [φt ]]]i = true iff for at least one M,g g =ξ σ g it holds that: [[φt ]]i = true.
variable assignment
The definition of the semantics of modal operators usually makes use of a relation R M ⊆ I M × I M that describes the accessibility of the indices i ∈ I M (cf. Kripke (1963)). The language TL defined above does not contain modal operators. Therefore, I do not need to include a relation R M . A model M can be defined in such a way that each index i ∈ I M is assigned its own domain D. In that case, D M is not defined as a fixed domain, but as a function of indices to domains. For the sake of simplicity, I assume that each index has the same domain. Accordingly, D M will be a fixed domain for each index. In a reduced model M according to definition A-10, a sentence of TL is not assigned a truth value unambiguously. A sentence only has a truth value in relation to an arbitrarily selected index i ∈ I M . A complete model M , that assigns a truth value to each sentence unambigously, must be expanded with a selected index i∗ ∈ I M . The index i∗ should represent reality; the truth value of a sentence must be determined in relation to this index.
Appendix: Type Logic with Lambda Operator 209
Definition A-11 (Complete Model) A complete model of TL is a Kripke-structure3 M = D M , I M , i∗ , [[ ]] M iff: (a) M = D M , W M , [[ ]] M is a reduced model of TL, (b) i∗ ∈ I M and (c) for each sentence φ of TL the following holds: – If [[φ]]iM ∗ = true, then φ is true in M. – Otherwise φ is false in M.
Sentence 1 [ψσ /ξ σ ](φτ ) is the expression φτ in which all occurrences of ξ σ are replaced with ψσ ; λξ σ [φτ ](ψσ ) is equivalent to [ψσ /ξ σ ](φτ ) if none of the variables that are free in ψσ is bound by a quantifier or λ-operator in [ψσ /ξ σ ](φτ ).
3
Because of the omission of R M it is actually a reduced Kripke-structure.
References Allen, J. F., and C. R. Perrault. 1980. Analysing Intention in Utterances. Artificial Intelligence 15, 143-178. Atlas, J. D. 2002. Negative Polarity Items and Overcoming Assertoric Inertia. Position Statement at One Day “Only”. Amsterdam. Atlas, J. D., and L. Horn. 2002. Discussion. One Day “Only”. Amsterdam. Bach, K., and R. M. Harnish. 1979. Linguistic Communication and Speech Acts. Cambridge/Mass. Bartels, C. 1995. Comments on Asher and Krifka: Acoustic Correlates of ‘Second Occurrence’ Focus: Toward an Experimental Investigation. In Kamp and Partee (1995b), 11–30. Barwise, J., and R. Cooper. 1981. Generalized Quantifiers and Natural Language. Linguistics and Philosophy 4, 159–219. B´atori, I., W. Lenders, and W. Putschke (eds). 1989. Computational Linguistics/Computerlinguistik. Berlin, New York. B¨auerle, R. and T. E. Zimmermann. 1989. Frages¨atze. In Stechow and Wunderlich (1989), 333–348. Beaver, D. 1991. The Kinematics of Presupposition. In Dekker, P., and M. Stokhof (eds). Proceedings of the 8th Amsterdam Colloquium. Amsterdam. Beaver, D., and B. Clark. 2003. “Always” and “Only”: Why Not All FocusSensitive Operators Are Alike. Natural Language Semantics 11, 323–362. Beaver, D., B. Clark, E. Flemming, and M. Wolters. 2007. When Semantics meets Phonetics: Acoustical Studies of Second Occurrence Focus. Language 83, 245–276. Beckman, M. E. 1986. Stress and Non-Stress Accent. Dordrecht. Bernardi, R., and M. Moortgat (eds). 2003. Questions and Answers: Theoretical and Applied Perspectives. Utrecht Institute of Linguistics. Blutner, R. 2000. Lexical Pragmatics. Journal of Semantics 15, 115–162. Bolinger, D. 1972. Accent Is Predictable (If You’re a Mind Reader). Language 48, 633–644. Bolinger, D. 1986. Intonation and Its Parts. Melody in Spoken English. Stanford. Bolinger, D. 1989. Intonation and Its Uses. Melody in Grammar and Discourse. London et al. Bonomi, A., and P. Casalegno. 1993. Only: Association with Focus in Event Semantics. Natural Language Semantics 2, 1–45. Bos, J., and M. Gabsdil. 2000. First-Order Inference and the Interpretation of Questions and Answers. In Poesio, M., and D. Traum (eds), Proceedings of
210
References 211
Goetalog 2000. Fourth Workshop on the Semantics and Pragmatics of Dialogue. Gothenburg Papers in Computational Linguistics 00-5, 43–50. Bosch, P., and R. van der Sandt (eds). 1994a. Focus & Natural Language Processing. Volume 1: Intonation and Syntax. IBM Working Papers of the Institute for Logic and Language 6. Heidelberg. Bosch, P., and R. van der Sandt (eds). 1994b. Focus & Natural Language Processing. Volume 2: Semantics. IBM Working Papers of the Institute for Logic and Language 7. Heidelberg. Bosch, P., and R. van der Sandt (eds). 1994c. Focus & Natural Language Processing. Volume 3: Discourse. IBM Working Papers of the Institute for Logic and Language 8. Heidelberg. Bosch, P., and R. van der Sandt (eds). 1999. Focus. Linguistic, Cognitive, and Computational Perspectives. Cambridge. Breheny, R. 2002. The Current State of (Radical) Pragmatics in the Cognitive Sciences. Mind & Language 17, 169–187. ¨ Buring, D. 1999. Topic. In Bosch and van der Sandt (1999), 142–165. Caelen-Haumont, G. 1994. Semantic and Pragmatic Prediction of Prosodic Structures. In Keller (1994), 271–293. Carlson, L. 1983. Dialogue Games. An Approach to Discourse Analysis. Dordrecht. Carlson, G. N., and F. J. Pelletier (eds). 1995. The Generic Book. Chicago. Carston, R. 1999. The Semantics/Pragmatics Distinction: A View from Relevance Theory. In Turner (1999), 85–125. Cherry, C. 1978. On Human Communication. Cambridge/Mass. Chomsky, N. 1972. Deep Structure, Surface Structure, and Semantic Interpretation. In Chomsky, N. Studies on Semantics in Generative Grammar. The Hague, Paris, 62–119. Chomsky, N. 1977. Conditions on Rules of Grammar. In Chomsky, N. Essays on Form and Interpretation. New York, 163–210. Chomsky, N., and M. Halle. 1968. The Sound Pattern of English. New York. Cohen, P. R., and H. J. Levesque. 1990. Rational Interaction as The Basis for Communication. In Cohen et al. (1990), 221–255. Cohen, P. R., J. Morgan, and M. E. Pollack (eds). 1990. Intentions in Communication. Cambridge/Mass. Cooper, R., S. Larsson, J. Hieronymus, S. Ericsson, E. Engdahl, and P. Ljunglf. 2001. GoDiS and Questions Under Discussion. In Trindi Consortium (ed.). The Trindi Book. Draft, 29–74. (23.3.2007) Cruttenden, A. 1986. Intonation. 2nd edition. Cambridge. Cutler, A. 1976. Phonem-Monitoring Reaction Time as a Function of Preceding Intonation Contour. Perception and Psychophysics 20, 55–60.
212
References
Cutler, A., and J. A. Fodor. 1979. Semantic Focus and Sentence Comprehension. Cognition 7, 49–59. Di Christo, A. 1998. Intonation in French. In Hirst and di Christo (1998), 195–218. Dretske, F. 1971. Contrastive Statements. Philosophical Review 81, 411–437. ¨ Eimer, M., D. Nattkemper, E. Schroger, and W. Prinz. 1996. Involuntary Attention. In Neumann and Sanders (1996), 155–184. Esk´enazi, M. 1993. Trends in Speaking Styles Research. Proceedings of Eurospeech ’93. Berlin, 501–509. Fagin, R., J. Y. Halpern, Y. Moses, and M. Y. Vardi. 1995. Reasoning About Knowledge. Cambridge/Mass. Fastl, H., and E. Zwicker. 2006. Psychoacoustics. Facts and Models. 3rd edition. Berlin et al. ¨ Fisseni, B. 2004. Something Empirical About Focus. In Lowe, B., R. van ¨ Rooij, B. Schroder, H. Zeevat (eds). ILLC-Day 2 in Bonn. ‘Language’. ILLC Publications X-2004-04. ILLC, Amsterdam. (23.3.2007) Fisseni, B. 2005. (Je)dem Fokus seine Definition. Phonetics-Colloquium, Cologne. (23.3.2007) French, N. R., and J. C. Steinberg. 1947. Factors Governing the Intelligibility of Speech Sounds. JASA 19, 90–119. Gamut, L. T. F. 1991. Logic, Language, and Meaning. Volume II: Intensional Logic and Logical Grammar. Chicago. Gelfand, S. A. 1998. Hearing. An Introduction to Psychological and Physiological Acoustics. New York et al. Geurts, B., and R. van der Sandt. 1997. Presuppositions and Backgrounds. In Dekker, P., M. Stokhof, and Y. Venema (eds). Proceedings of the 11th Amsterdam Colloquium. Amsterdam. Geurts, B., and R. van der Sandt. 2002. Only. Position Statement at One Day “Only”. Amsterdam. Ginzburg, J. 1996a. Dynamics and the Semantics of Dialogue. In Seligman and Westerst˚ahl (1996), 221–237. Ginzburg, J. 1996b. Interrogatives: Questions, Facts and Dialogue. In Lappin (1996), 385–422. Ginzburg, J., and I. Sag. 2000. Interrogative Investigations. The Form, Meaning, and Use of English Interrogatives. Stanford. Godfrey, J. J., E. C. Holliman, and J. McDaniel. 1992. SWITCHBOARD: Telephone Speech Corpus for Research and Development. Proceedings of ICASSP-92. San Francisco, 517–520. Goldsmith, J. A. (ed.). 1995. Handbook of Phonological Theory. Oxford.
References 213
Greenberg, S. 1996. Understanding Speech Understanding – Towards a Unified Theory of Speech Perception. Proceedings of the ESCA Workshop on Workshop on the Auditory Basis of Speech Perception. Keele, 1–8. Greenberg, S. 1997. On the Origins of Speech Intelligibility in the Real World. Proceedings of the ESCA Workshop on Robust Speech Recognition for Unknown Communication Channels. Pont-a-Mousson, 23–32. Greenberg, S. 1999. Speaking in Shorthand – A Syllable-Centric Perspective for Understanding Pronunciation Variation. Speech Communication 29, 159–176. Greenberg, S., and S. Chang. 2000. Linguistic Dissection of SwitchboardCorpus Automatic Speech Recognition Systems. Proceedings of the ISCA Workshop on Automatic Speech Recognition: Challenges for the New Millennium. Paris. Greenberg, S., S. Chang, and L. Hitchcock. 2001. The Relation of Stress Accent to Pronunciation Variation in Spontaneous American English Discourse. Proceedings of the ISCA Workshop on Prosody in Speech Processing and Understanding. Red Bank/NJ. Grice, H. P. 1967. Logic and Conversation. Reprinted in Grice, H. P. 1989. Studies in the Way of Words. Cambridge/Mass., 1–143. Groenendijk, J. 1999. The Logic of Interrogation. Classical Version. In Matthews, T., and D. L. Strolovitch (eds). SALT IX: Semantics and Linguistic Theory. Ithaca, 109–126. Groenendijk, J. 2003. Questions and Answers: Semantics and Logic. In Bernardi and Moortgart (2003), 16–23. Groenendijk, J., and M. Stokhof. 1984. Studies on the Semantics of Questions and the Pragmatics of Answers. Dissertation, University of Amsterdam. Groenendijk, J., and M. Stokhof. 1990. Dynamic Montague Grammar. In ´ K´alm´an, L., and L. Polos (eds). Proceedings of the Second Symposion on Logic and Language. Budapest, 3–48. Groenendijk, J., and M. Stokhof. 1991a. Dynamic Predicate Logic. Linguistics and Philosophy 14, 39–100. Groenendijk, J., and M. Stokhof. 1991b. Two Theories of Dynamic Semantics. In van Eijck, J. (ed.). Logics in AI. LNAI 478. Berlin et al., 55–64. Groenendijk, J., and M. Stokhof. 1997. Questions. In van Benthem and ter Meulen (1997), 1055–1124. Groenendijk, J., M. Stokhof, and F. Veltman. 1995a. Coreference and Modality in the Context of Multi-Speaker Discourse. In Kamp and Partee (1995a), 195–215. Groenendijk, J., M. Stokhof, and F. Veltman. 1995b. Coreference and Contextually Restricted Quantification: Is There Another Choice? In Kamp and Partee (1995a), 217–240.
214
References
Groenendijk, J., M. Stokhof, and F. Veltman. 1996a. This Might Be It. In Seligman and Westerst˚ahl (1996), 255–270. Groenendijk, J., M. Stokhof, and F. Veltman. 1996b. Coreference and Modality. In Lappin (1996), 179–213. Gussenhoven, C. 1984. On the Grammar and Semantics of Sentence Accents. Dordrecht. Gussenhoven, C. 1999. On the Limits of Focus Projection in English. In Bosch and van der Sandt (1999), 43–56. Halliday, M. A. K. 1967. Notes on Transitivity and Theme in English. Part 2. Journal of Linguistics 3, 199–244. Harel, D., D. Kozen, and J. Tiuryn. 2002. Dynamic Logic. In Gabbay, D. M., and F. Guenthner (eds). Handbook of Philosophical Logic. 2nd edition. Volume 4, 99–217. Harrah, D. 2002. The Logic of Questions. In Gabbay, D. M., and F. Guenthner (eds). Handbook of Philosophical Logic. 2nd edition. Volume 8, 1–60. Hausser, R. 1983. The Syntax and Semantics of English Mood. In Kiefer, F. (ed.). Questions and Answers. Dordrecht, 97–158. Heim, I. 1992. Presupposition Projection and the Semantics of Attitude Verbs. Journal of Semantics 9, 183–221. Heuer, H. 1996. Dual-Task Performance. In Neumann and Sanders (1996), 113–153. Heusinger, K. von. 1999. Intonation and Information Structure. Habilitationsschrift, Universit¨at Konstanz. Heusinger, K. von. 2003. The Double Dynamics of Definite Descriptions. In Peregrin (2003), 149–168. Higginbotham, J. 1996. The Semantics of Questions. In Lappin (1996), 361– 383. Hirschberg, J. 1993. Pitch Accent in Context: Predicting Intonational Prominence from Text. In Pereira, F. C. N., and B. J. Grosz (eds). Natural Language Processing. Cambridge/Mass., 305–340. Hirst, D., and A. di Christo (eds). 1998. Intonation Systems. A Survey of Twenty Languages. Cambridge. Horn, L. 1969. A Presuppositional Analysis of “Only” and “Even”. In Binnick, R., et al. (eds). Papers from the 5th Regional Meeting of the Chicago Linguistic Society. Chicago, 98–107. Horn, L. 1996. Presupposition and Implicature. In Lappin (1996), 299–319. Horn, L. 2004. Implicature. In Horn, L., and G. L. Ward (eds). Handbook of Pragmatics. Oxford, 3–28. Hulstijn, J. 1997. Structured Information States: Raising and Resolving Issues. In Benz, A., and G. J¨ager (eds). Proceedings of Mundial ’97. Munich.
References 215
Jackendoff, R. S. 1972. Semantic Interpretation in Generative Grammar. Cambridge/Mass. Jacobs, J. 1984. Funktionale Satzperspektive und Illokutionssemantik. Linguistische Berichte 91, 25–57. J¨ager, G. 1995. Only Updates. On the Dynamics of the Focus Particle “Only”. In Dekker, P., and M. Stokhof (eds), Proceedings of the 10th Amsterdam Colloquium, Amsterdam. Junqua, J.-C. 1993. The Lombard Reflex and Its Role on Human Listeners and Automatic Speech Recognizers. JASA 93, 510–524. Kadmon, N. 2001. Formal Pragmatics. Pragmatics, Presupposition and Focus. Oxford. Kager, R. 1995. The Metrical Theory of Word Stress. In Goldsmith (1995), 367–402. Kamp, H. 1978. Semantics Versus Pragmatics. In Guenthner, F., and S. J. Schmidt (eds). Formal Semantics and Pragmatics for Natural Languages. Dordrecht, 255–287. Kamp, H. 1985. Context, Thought and Communication. Proceedings of the Aristotelian Society NS 85, 240–261. Kamp, H., and B. Partee (eds). 1995a. Context-dependence in the Analysis of Linguistic Meaning. Proceedings of the Workshops in Prague (February 1995), Bad Teinach (May 1995). Volume 1: Papers. IMS, University of Stuttgart. Kamp, H., and B. Partee (eds). 1995b. Context-dependence in the Analysis of Linguistic Meaning. Proceedings of the Workshops in Prague (February 1995), Bad Teinach (May 1995). Volume 2: Comments and Replies. IMS, University of Stuttgart. Kamp, H., and U. Reyle. 1993. From Discourse to Logic. Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory. Dordrecht. Keller, E. (ed.). 1995. Fundamentals of Speech Synthesis and Speech Recognition. Basic Concepts, State-of-the-Art and Future Challenge. Chichester et al. Klein, W., and C. von Stutterheim. 1987. Quaestio und referentielle Bewegung in Erz¨ahlungen. Linguistische Berichte 109, 163–183. Koelega, H. S. 1996. Sustained Attention. In Neumann and Sanders (1996), 277–331. Kohler, K. J. 1995. Articulatory Reduction in Different Speaking Styles. Proceedings of the 13th ICPhS. Stockholm, 12–19. Krifka, M. 1992a. Compositional Semantics for Multiple Focus Constructions. In Jacobs, J. (ed.). Informationsstruktur und Grammatik. Linguistische Berichte, Sonderheft 4. Opladen, 17–53. Krifka, M. 1992b. A Framework for Focus-Sensitive Quantification. In Barker, C., and D. Dowty (eds). Proceedings of SALT II. Ohio State Uni-
216
References
versity, Columbus, 215–236. Krifka, M. 1993. Focus and Presupposition in Dynamic Interpretation. Journal of Semantics 10, 269–300. Krifka, M. 1995a. Focus and the Interpretation of Generic Sentences. In Carlson and Pelletier (1995), 238–264. Krifka, M. 1995b. Focus and/or Context: A Second Look at Second Occurrence Expressions. In Kamp and Partee (1995a), 253–275. Krifka, M. (1996), Frameworks for the Representation of Focus. Proceedings of the Conference on Formal Grammar. ESSLLI, Prag. Krifka, M. 2001. For a Structured Meaning Account of Questions and Answers. In F´ery, C., and W. Sternefeld (eds). Audiatur Vox Sapientia. A Festschrift for Arnim von Stechow. Berlin, 287–319. Kripke, S. A. 1963. Semantical Analysis of Modal Logic I. Normal Modal Propositional Calculi. Zeitschrift fur ¨ mathematische Logik und Grundlagen der Mathmatik 9, 67–96. Kryter, K. D. 1970. The Effects of Noise on Man. New York, London. Ladd, R. 1996. Intonational Phonology. Cambridge. Lambrecht, K. 1994. Information Structure and Sentence Form. Topic, Focus and the Mental Representation of Discourse Referents. Cambridge. Langmann, D. et al. 1998. CSDC – The MoTiV Car Speech Data Collection. Proceedings of the First International Conference on Language Resources and Evaluation. Volume 2. Granada, 1107–1113. Lappin, S. (ed.). 1996. Handbook of Contemporary Semantic Theory. Oxford. Lehiste, I. 1970. Suprasegmentals. Cambridge/Mass. Lenders, W. 1974. Semantische und argumentative Textdeskription. Hamburg. ¨ Lenders, W. 1989a. Computergestutzte Verfahren zur semantischen Beschreibung von Sprache. In B´atori et al. (1989), 231–244. ¨ ¨ Lenders, W. 1989b. Ubersicht uber die Verstehensproblematik hinsichtlich der Computersimulation von Sprache. In B´atori et al. (1989), 260–272. Levelt, W. J. M. 1989. Speaking. From Intention to Articulation. Cambridge/Mass. Lewis, D. K. 1979. Scorekeeping in a Language Game. In B¨aurle, R., U. Egli, and A. von Stechow (eds). Semantics from Different Points of View. Berlin et al., 172–187. Lindblom, B. 1983. Economy of Speech Gestures. In MacNeilage, P. F. (ed.). The Production of Speech. New York et al., 217–245. Lindblom, B. 1990. Explaining Phonetic Variation: A Sketch of the H&H Theory. In Hardcastle, W. J., and A. Marchal (eds). Speech Production and Speech Modeling. Dordrecht, 403–439. Lindblom, B. 1996. Role of Articulation in Speech Perception: Clues from Production. JASA 99, 1683–1692.
References 217
Lindblom, B., and J. H. Davis. 1998. Calculating and Measuring the Energy Costs of Speech Movements. Proceedings of FONETIK 98. Stockholm, 32– 35. Lombard, E. 1911. Le Signe de l’El´evation de la Voix. Annales Maladiers Oreille, Larynx, Nez, Pharynx 37, 101–119. Mann, W. C., and S. A. Thompson. 1988. Rhetorical Structure Theory: Toward a Functional Theory of Text Organization. Text 8, 243–281. ´ ements de linguistique g´en´erale. Paris. Martinet, A. 1960. El´ Martinich, A. P. 1980. Conversational Maxims and Some Philosophical Problems. Philosophical Quarterly 30, 215–228. Massaro, D. W. 2002. Multimodal Speech Perception: A Paradigm for ¨ Speech Science. In Granstrom, B., D. House, and I. Karlsson (eds). Multimodality in Language and Speech Systems. Dordrecht, 45–71. Merin, A. 1999. Information, Relevance, and Social Decision Making: Some Principles of Decision-Theoretic Semantics. In Moss, L. S., J. Ginzburg, and M. de Rijke (eds). Logic, Language and Computation. Volume 2. Stanford, 179–221. Moln´ar, V., and S. Winkler (eds). 2006. The Architecture of Focus. Berlin, New York. Montague, R. 1968. Pragmatics. Reprinted in Thomason (1974), 95–118. Montague, R. 1970. English as a Formal Language. Reprinted in Thomason (1974), 188–221. Montague, R. 1973. The Proper Treatment of Quantification in Ordinary English. Reprinted in Thomason (1974), 247–270. ¨ Muller, G. 1998. A Derivational Approach to Remnant Movement in German. Dordrecht. Muskens, R., J. van Benthem, and A. Visser. 1997. Dynamics. In van Benthem and ter Meulen (1997), 587–648. Neumann, O., and A. Sanders (eds). 1996. Attention. London et al. O’Shaughnessy, D. 2000. Speech Communications. Human and Machine. 2nd edition. Piscataway. Parikh, P. 2001. The Use of Language. Stanford. Partee, B. H. 1994. Focus, Quantification, and Semantics–Pragmatics Issues. Preliminary Version. In Bosch and van der Sandt (1994b), 363–377. Partee, B. H. 1999. Focus, Quantification, and Semantics–Pragmatics Issues. In Bosch and van der Sandt (1999), 213–231. Paul, H. 1898. Prinzipien der Sprachgeschichte. 3rd edition. Halle. (1st edition: 1880) Payton, K. L., R. M. Uchanski, and L. D. Braida. 1994. Intelligibility of Conversational and Clear Speech in Noise and Reverberation for Listeners with Normal and Impaired Hearing. JASA 95, 1581–1592.
218
References
Peregrin, J. 1999. The Pragmatization of Semantics. In Turner (1999), 419– 442. Peregrin, J. (ed.) 1999. Meaning: The Dynamic Turn. Oxford et al. Peregrin, J., and K. von Heusinger. 1995. Dynamic Semantics with Choice Functions. In Kamp and Partee (1995a), 329–353. Perry, J. 1998. Indexicals, Contexts and Unarticulated Constituents. (23.3.2007) Picheny, M. A., N. I. Durlach, and L. D. Braida. 1985. Speaking Clearly for the Hard of Hearing I: Intelligibility Differences Between Clear and Conversational Speech. Journal of Speech and Hearing Research 28, 96–103. Picheny, M. A., N. I. Durlach, and L. D. Braida. 1986. Speaking Clearly for the Hard of Hearing II: Acoustic Characteristics of Clear and Conversational Speech. Journal of Speech and Hearing Research 29, 434–446. Picheny, M. A., N. I. Durlach, and L. D. Braida. 1989. Speaking Clearly for the Hard of Hearing III: An Attempt to Determine the Contribution of Speaking Rate to Differences in Intelligibility Between Clear and Conversational Speech. Journal of Speech and Hearing Research 32, 600–603. Pickett, J. M. 1956. Effects of Vocal Force on the Intelligibility of Speech Sounds. JASA 56, 902–905. Pierrehumbert, J. 1987. The Phonology and Phonetics of English Intonation. PhD-Dissertation, MIT 1980. Reproduced by the Indiana University Linguistics Club. Pierrehumbert, J., and J. Hirschberg. 1990. The Meaning of Intonational Contours in the Interpretation of Discourse. In Cohen et al. (1990), 271– 311. Prince, E. F. 1981. Toward a Taxonomy of Given–New Information. In Cole, P. (ed.). Radical Pragmatics. New York et al., 223–253. Pulman, S. 1997. Higher Order Unification and the Interpretation of Focus. Linguistics and Philosophy 20, 73–115. Quirk, R., S. Greenbaum, G. Leech, and J. Svartvik. 1985. A Comprehensive Grammar of the English Language. London, New York. Recanati, F. 2004. Literal Meaning. Cambridge. Roberts, C. 2001. Information Strukture in Discourse: Towards an Integrated Formal Theory of Pragmatics. (23.3.2007) Rochemont, M. 1986. Focus in Generative Grammar. Amsterdam. Rooth, M. 1985. Association with Focus. PhD-thesis, University of Massachusetts, Amherst. Rooth, M. 1992. A Theory of Focus Interpretation. Natural Language Semantics 1, 75–116. Rooth, M. 1995a. Indefinites, Adverbs of Quantification, and Focus Seman-
References 219
tics. In Carlson and Pelletier (1995), 265–299. Rooth, M. 1995b. Comments on Krifka. In Kamp and Partee (1995b), 167– 178. Rooth, M. 1996a. Focus. In Lappin (1996), 271–297. Rooth, M. 1996b. On the Interface Principle for Intonational Focus. In Galloway, T., and J. Spence (eds). Proceedings of SALT VI. Cornell. Rooth, M. 1999. Association with Focus or Association with Presupposition? In Bosch and van der Sandt (1999), 232–244. Russel, B. 1905. On Denoting. Reprinted in Marsh, R. C. (ed.). 1956. Bertrand Russel. Logic and Knowledge. Essays 1901-1950. London, New York, 39–56. Schmitz, H.-C. 2006. Experiments on the Accentuation of Focus Operators. IKP Working Paper NS 19, University of Bonn. (23.3.2007) ¨ Schmitz, H.-C., B. Schroder. 2002a. On Focus and Quantifier Raising. IKP Working Paper NS 01, University of Bonn. (23.3.2007) ¨ Schmitz, H.-C., B. Schroder. 2002b. On Focus and VP-deletion. Snippets 5, 16-17. (23.3.2007) ¨ Schmitz, H.-C., B. Schroder, and P. Wagner. 2001. Zur Akzentuierung se¨ mantischer und pragmatischer Fokusse. In Hess, W., and K. Stober (eds). Elektronische Sprachverarbeitung. Dresden, 151–158. Schmitz, H.-C., P. Wagner. 2006. Experiments on Accentuation and Focus Projektion. IKP Working Paper NS 21, University of Bonn. (23.3.2007) ¨ Schroder, B. 2003. Zur Logik des Fokus. Habilitationsschrift, University of Bonn. ¨ Schroder, B., and H.-C. Schmitz. 2003. Underspecified Focus Representation. In Decker, P., and R. van Rooy (eds). Proceedings of the 14th Amsterdam Colloquium. Amsterdam, 193–198. Schwarzschild, R. 1997a. Why Some Foci Must Associate. Manuscript. (23.3.2007) Schwarzschild, R. 1997b. Interpreting Accent. Manuscript. (23.3.2007) Schwarzschild, R. 1999. Givenness, AviodF and Other Constraints on the Placement of Accent. Natural Language Semantics 7, 144–177. Searle, J. 1994. The Rediscovery of the Mind. Cambridge/Mass. Seligman, J., and D. Westerst˚ahl (eds). 1996. Logic, Language and Computation. Volume 1. Stanford. Selkirk, E. 1984. Phonology and Syntax: The Relation Between Sound and Structure. Cambridge/Mass.
220
References
Selkirk, E. 1995. Sentence Prosody: Intonation, Stress, and Phrasing. In Goldsmith (1995), 550–569. Shan, C.-C., and B. ten Cate. 2002. The Partition Semantics of Questions, Syntactially. In Nissim, M. (ed.). Proceedings of the Seventh ESSLLI Student Session. Trento. Shannon, C. E. 1948. A Mathematical Theory of Communication. Reprinted in: Sloane, N. J. A., and A. D. Wyner (eds). 1993. Claude Elwood Shannon. Collected Papers. Piscataway, 5–83. Sperber, D., and D. Wilson. 1986. Relevance. Communication and Cognition. Oxford. Speber, D., and D. Wilson. 2002. Pragmatics, Modularity and MindReading. Mind & Language 17, 2–23. Stalnaker, R. 1978. Assertion. Reprinted in Stalnaker (1999), 78–95. Stalnaker, R. 1998. On the Representation of Context. Reprinted in Stalnaker (1999), 96–113. Stalnaker, R. 1999. Context and Content. Oxford. Stechow, A. von. 1989a. Focusing and Background Operators. Arbeitspaper Nr. 6. Fachgruppe Sprachwissenschaft der Universit¨at Konstanz. Stechow, A. von. 1989b. Current Issues in the Theory of Focus. In Stechow and Wunderlich (1989), 804–825. Stechow, A. von, and D. Wunderlich (eds). 1989. Semantik/Semantics. Berlin, New York. Steedman, M. 1994. Remarks on Intonation and “Focus”. In Bosch and van der Sandt (1994a), 185–204. Steeneken, H. J. M., and T. Houtgast. 1980. A Physical Method for Measuring Speech-Transmission Quality. JASA 67, 318–326. Sterling, L., and E. Shapiro. 1994. The Art of Prolog. Advanced Programming Techniques. 2nd edition. Cambridge/Mass. Stutterheim, C. von. 1994. Quaestio und Textaufbau. In Kornadt, H.-J., J. Grabowski, and R. Mangold-Allwinn (eds). Sprache und Kognition. Perspektiven moderner Sprachpsychologie. Heidelberg, 251–272. Stutterheim, C. von. 1997. Einige Prinzipien des Textaufbaus. Empirische Un¨ tersuchungen zur Produktion mundlicher ¨ Texte. Tubingen. Stutterheim, C. von, and W. Klein. 2002. Quaestio and L-Perspectivation. In Graumann, C. F., and W. Kallmeyer (eds). Perspective and Perspectivation in Discourse. Amsterdam, 59–88. ten Cate, B., and C.-C. Shan. 2002. Question Answering: From Partitions to ¨ Prolog. In Egly, U., and C. Fernmuller (eds). Proceedings of TABLEAUX 2002. Berlin. ten Hoopen, G. 1996. Auditory Attention. In Neumann and Sanders (1996). 79–112.
References 221
Terken, J., and S. G. Nooteboom. 1987. Opposite Effects of Accentuation and Deaccentuation on Verification Latencies for Given and New Information. Language and Cognitive Processes 2, 145–163. Thomason, R. H. (ed.). 1974. Formal Philosophy. Selected Papers of Richard Montague. New Haven, London. Turner, K. (ed.) 1999. The Semantics/Pragmatics Interface from Different Points of View. Oxford et al. Uchanski, R. M., S. S. Choi, L. D. Braida, C. M. Reed, and N. I. Durlach. 1996. Speaking Clearly for the Hard of Hearing IV: Further Studies of the Role of Speaking Rate. Journal of Speech and Hearing Research 39, 494–509. van Benthem, J., and A. ter Meulen (eds). 1997. Handbook of Logic and Language. Amsterdam et al. van Deemter, K. 1994. What’s New? A Semantic Perspective on Sentence Accent. Journal of Semantics 11, 1–31. van Ditmarschen, H. 2000. Knowledge Games. ILLC Dissertation Series DS200-06. Amsterdam. van Kuppevelt, J. 1994. Directionality in Discourse. In Bosch and van der Sandt (1994c), 485–501. van Kuppevelt, J. 1995a. Discourse Structure, Topicality and Questioning. Journal of Linguistics 31, 109–147. van Kuppevelt, J. 1995b. Main Structure and Side Structure in Discourse. Linguistics 33, 809–833. van Rooy, R. 2003a. Relevance and Bidirectional OT. In Blutner, R., and H. Zeevat (eds). Pragmatics in Optimality Theory. Hampshire, 173–210. van Rooy, R. 2003b. Questions and Relevance. In Bernardi and Moortgart (2003), 96–107. Veltman, F. 1996. Defaults in Update Semantics. Journal of Philosophical Logic 25, 221–261. Vennemann, T. 1975. Topics, Sentence Accent, Ellipsis: A Proposal for Their Formal Treatment. In Keenan, E. L. (ed.). Formal Semantics of Natural Language. Cambridge, 313–328. Wagner, P. 2002. Vorhersage und Wahrnehmung deutscher Betonungsmuster. Dissertation. University of Bonn. Westerst˚ahl, D. 1985. Determiners and Context Sets. In van Benthem, J., and A. ter Meulen (eds). Generalized Quantifiers in Natural Language. Dordrecht, 45–71. Widera. C. 2002. Zur Reduktion von Vokalen. Eine experimentell-phonetische Untersuchung. Dissertation. University of Bonn. Wilson, N. L. 1959. Substances Without Substrata. Review of Metaphysics 12, 521–539. Wold, D. 1996. Long Distance Selective Binding. The Case of Focus. In
222
References
Galloway, T., and J. Spence (eds). Proceedings f SALT VI. Cornell, 311–328. Wold, D. 1998. How to Interpret Multiple Foci Without Moving a Focussed Constituent. In Benedicto, E., M. Romero, and S. Tomioka (eds). Proceedings of Workshop on Focus. Amherst, 277–289. Zeevat, H. 1992. Presuppositon and Accommodation in Update Semantics. Journal of Semantics 9, 379–412. Zeevat, H. 2002. Only Pragmatics. Position Statement at One Day “Only”, Amsterdam. Zimmermann, T. E. 1989. Kontextabh¨angigkeit. In Stechow and Wunderlich (1989), 156–229. Zimmermann, T. E. 1999. Context & Point of View. In Wilson R., and F. Keil (eds). The MIT Encyclopedia of Cognitive Science. Cambridge/Mass., 156–229. Zipf, G. K. 1949. Human Behavior and the Principle of Least Effort. Cambridge/Mass.
Index accent, 22 accentuate, 119, 121, 203 accentuation, 20, 118–123 accentuation of focus operators, 148– 150 accommodation, 78, 80–81, 128–129 acoustic disturbance, see channel disturbance acoustic undershoot, 15 active interpretation, 4, 26, 28–29, 78, 83–84, 128–129, 201 adequacy criteria, 28–29, 34, 75–78, 128–129 Allen, J., 51 alternative semantics, 158, 160–162 always, 178–180 answer, 30–31, 55, 69 to be a partial answer, 69 to be an exhaustive answer, 69 to give a partial answer, 69 to give an exhaustive answer, 69 Articulation Index (AI), 17 articulatory effort, 15–16, 20 articulatory undershoot, 15 asking, 30 assertion, 30 Atlas, J., 152 attention, 18–21 attenuation, 16 Automatic Speech Recognition (ASR), 21
calendar-example, 37–38 Carlson, L., 51, 53 categorial question semantics, 92 Chang, S., 21 channel disturbance, 16–18 Cherry, C., 24 Chomsky, N., 159, 183 citation form, 193–194 Clark, B., 154, 180, 181 clear speech, 16, 20 CLUEDO, 40 coarticulation, 15 cocktail-party effect, 17 cognitive effort, 18–19 combinatorial explosion problem, 90 common ground, 39–47 constituent answer, 8, 12, 103 context configurations, 84, 136–141 context-dependency of accentuation, 197–198 context-dependency of focus interpretation, 2–3, 162–164 conversational maxims, 29–38, 75–78 conversational speech, 16, 21 Cooper, R., 53, 87 cooperative information exchange, 9– 10, 28–82 criteria of adequacy, see adequacy criteria Cruttenden, A., 22 Cutler, A., 21 declarative sentence, 30 denotations, 207 descriptivity, 5 di Christo, A., 20 discontinuous focus, 166 discourse goal, 48 discourse structuring, 48–54, 101 discrimination task, 23–24 domain restriction, 87–89, 94–96, 107– 108, 164–165 duration, 20, 22, 175–177, 196, 200
B¨aurle, R., 55 ¨ Buring, D., 197 Bach, K., 35 Bartels, C., 173, 175–177 Barwise, J., 87 Beaver, D., 78, 154, 176–182, 196, 198 Beckman, M., 20 Blutner, R., 34 Bos, J., 90 bottom up process, 23, 202 bound focus, 165 Breheny, R., 34
echo, 17
223
224
Index
echo focus, 173 Eimer, M., 18 empty QUAD, 98 entailment, 46, 66 Esk´enazi, M., 16 exh, 106 exhaustification, 104–107, 125–127 exhaustive answer, 104–107 explicit question, 48–51 facial gestures, 20, 35 Fagin, R., 39, 40 filtering, 16–17 first occurrence focus, 175 focus operator, 148–150, 164 focus phenomena, 1, 26 focus pragmatics, 165–168 focus projection, 182–197 focus semantics, 158–165 focus theory, 1–3, 143, 158–170, 204 Fodor, J., 21 follow-up question, 50–51, 131 football-example, 9–11 formula of TL, 206 free focus, 165 free parameter theory, 172 free variable, 206 French, N., 17 fundamental frequency (f0), 20, 22, 175–177, 196, 200 Gabsdil, M., 90 Gelfand, S., 23 general communication system, 6, 23, 28 generalized quantifier, 87–89 Ginzburg, J., 30, 47, 53, 92, 131 given–new, 188 Godfrey, J., 21 Greenberg, S., 15, 20, 21, 25, 26 Grice, H. P., 30–38, 73–76, 78, 155 Groenendijk, J., 30, 45, 48, 55–58, 62, 70–72, 74, 79, 92–97, 106, 112 Gussenhoven, C., 183, 191 H&H theory, 15, 24 Halle, M., 183 Halliday, M., 188 Harnish, R., 35 Harrah, D., 55
Hausser, R., 92 head argument asymmetry, 189–198 hearing impairment, 16 Heim, I., 81 Heuer, H., 19 Heusinger, K. von, 112 Higginbotham, J., 55, 92, 96 Hirschberg, J., 203 Hirst, D., 20 Horn, L., 152 Houtgast, T., 17 Hulstijn, J., 62 hyperspeech, 15–16, 18, 20, 200 hypospeech, 15–16, 20 hypothesis of optimal accentuation, 4, 6–7, 203 i-critical, 7, 13, 16, 148, 203 implicature, 32–33, 35, 125–127 implicit question, 51–52, 130–131 in-situ binding semantics, 159 information state, 43–44, 63, 99 informativity, 40, 65, 72–73 informativity of noise, 173–174 intensity, 20, 22, 175, 176, 178, 200 intention vs intendability, 75, 202 interrogative sentence, 30 intonational phrase, 193–195 irony, 33, 35 J¨ager, G., 62 Jacobs, J., 165, 166 Junqua, J.-C., 17, 18 Kager, R., 22 Klein, W., 47, 52 Koelega, H., 18 Kohler, K., 15 Krifka, M., 92, 97, 143, 158–160, 162, 163, 166, 167, 172 Kripke, S., 207–209 Kryter, K., 17 ¨ Lowe, B., 36 λ-conversion, 209 Lambrecht, K., 197 Langmann, D., 17 language fragment L, 85–91 Lehiste, I., 179 length, 22
Index 225 level of informativity, 72–73 Lindblom, B., 15, 23 locutionary mode operator, 166 logical omniscience, 78–79 Lombard reflex, 17 Lombard, E., 17 loud articulation, 17 loudness, 22 ¨ Muller, G., 204 Martinich, A., 32 masking, 17 Massaro, D., 20 maxim of manner, 31, 83 maxim of quality, 31, 32 maxim of quantity, 31, 32 maxim of relation, 31, 32 mention-some question, 79 Merin, A., 36 message, 31 metrical stress theories, 22 model of QL, 59 model of TL, 207–209 movement theory of focus, 159 Muddy Children Puzzle, 40 multiple constituent question, 94, 108 multiple focus construction, 166–167 mutual knowledge, see common ground newspaper headline, 12 non-cooperative information exchange, 10–11, 38, 76–77 Nooteboom, S., 21 normativity, 4 nuclear stress rule, 183–186, 198 O’Shaughnessy, D., 84 only, 145, 147, 150–155, 178–182 over-informativity, 42, 79 Parikh, P., 81 Partee, B., 171, 172, 174 partial answer, 69 partition, 55 Payton, K., 16, 17 Perault, C., 51 Peregrin, J., 112, 202 Perry, J., 14 Picheny, M., 16 Pickett, J., 17
Pierrehumbert, J., 200, 203 pipeline model of speech processing, 34, 144 pitch, 22 pitch accent, 22, 200 polar question, 57, 60, 108 pragmatically redundant, 148 predictions of stress patterns, 113, 170– 198 presupposition of questions, 123–128 Prince, E., 188 private knowledge, 39, 76 Prolog, 100, 101 prominence, 22 proposition, 30 propositional concept, 30, 56 propositional knowledge, 44, 62–64 Pulman, S., 159 QUAD, 92, 97–98 quæstio, 52 question, 30 question abstract, 92–96 question logic (QL), 58–62 question theory, see semantics of interrogative sentences question under discussion, 53, 67–68 question–answer congruence, 165 radical pragmatics, 34 reading, 19–20 Recanati, F., 14 recognition task, 21, 23 reconstruct, 98, 99, 106, 115 reconstruction of messages, 7–14 reconstruction operations, 28, 128 reconstruction rules, 115 reduction, 15 replacive theories of focus, 159 reprise question, 131 reverberation, 17 rhetorical structure theory, 53 Roberts, C., 47, 53, 181 Rooth, M., 125, 136, 143, 158, 160– 163, 166, 168, 172, 174–175 Russel, B., 112 Sag, I., 131 Schade, U., 135 Schmitz, H.-C., 148, 167, 168, 181–184
226
Index
¨ Schroder, B., 167, 168 Schwarzschild, R., 2, 131, 183, 188, 191 Searle, J., 14 second occurrence focus, 171–182 Selkirk, E., 2, 183, 187–193, 196–198 semantic effects of accentuation, 144– 170 semantic effects of optimal accentuation, 143–158, 168–170 semantic enrichment, 28, 29, 75, 202 semantic maxims, 31–32, 46–47, 70, 73–74 semantic relations of interrogative sentences, 57–58, 62 semantics of interrogative sentences, 55–62 semantics of L, 86–87, 114 Sentence Accent Assignment Rule (SAAR), 191 sentence of QL, 58 sentence of TL, 206 shadowing task, 24 Shannon, C., 6, 23, 201 Shapiro, E., 100 shouting, 17 small talk, 41 speech act, 30 speech styles, 15–16 Speech Transmission Index (STI), 17 staccato speech, 21 Stalnaker, R., 30, 39–41 Stechow, A. von, 92, 97, 143 Steeneken, H., 17 Steinberg, J., 17 Sterling, L., 100 Stokhof, M., 30, 45, 48, 55–58, 74, 79, 92–97, 106, 112 stress, 20, 22 stress strength, 193–197 structured meaning theory, 92, 158– 160 Stutterheim, C., 47, 52 Switch-Board Corpus, 21 syllable, 22, 26 syntactic maxim, 31 syntax of L, 85–87, 114 syntax of QL, 58 syntax of TL, 205–206
Terken, J., 21 text message-example, 11 thematic knowledge, 62–64 theory of optimal accentuation, 3–5 time-example, 36–38 top down process, 24, 202 topic accentuation, 197 topicalisation, 204 toy world, 90, 148, 185, 190 traffic noise, 17 Trindi project, 53 Type Logic (TL), 205–209 type shifting, 108–116 types, 205 Uchanski, R., 16 underspecification, 130 update, 89, 100, 101 update constraint, 89 update function, 89, 101 update rule, 89, 101 update system, 43, 45, 64–65, 99–101 updates with declarative sentences, 100– 101 updates with interrogative sentences, 99–100 utterance, 30, 31 van Ditmarschen, H., 40 van Kuppevelt, J., 30, 47, 52–53 van Rooy, R., 34, 79 variable assignment, 207 Veltman, F., 30, 45, 46, 112 Vennemann, T., 47 violation of maxims, 31–33 Wagner, P., 183, 193–195, 197 wayback machine, 177 Westerst˚ahl, D., 88, 106 wh-question, 57, 60–62 Wold, D., 159, 168 Zeevat, H., 78, 145 Zimmermann, T. E., 55