Constraints in Discourse
Pragmatics & Beyond New Series (P&BNS) Pragmatics & Beyond New Series is a continuation of Pragmatics & Beyond and its Companion Series. The New Series offers a selection of high quality work covering the full richness of Pragmatics as an interdisciplinary field, within language sciences.
Editor Andreas H. Jucker
University of Zurich, English Department, Plattenstrasse 47, CH-8032 Zurich, Switzerland
Associate Editors
Jacob L. Mey, University of Southern Denmark
Herman Parret, Belgian National Science Foundation, Universities of Louvain and Antwerp
Jef Verschueren, Belgian National Science Foundation, University of Antwerp

Editorial Board
Shoshana Blum-Kulka, Hebrew University of Jerusalem
Jean Caron, Université de Poitiers
Robyn Carston, University College London
Bruce Fraser, Boston University
Thorstein Fretheim, University of Trondheim
John C. Heritage, University of California at Los Angeles
Susan C. Herring, Indiana University
Masako K. Hiraga, St.Paul’s (Rikkyo) University
David Holdcroft, University of Leeds
Sachiko Ide, Japan Women’s University
Catherine Kerbrat-Orecchioni, University of Lyon 2
Claudia de Lemos, University of Campinas, Brazil
Marina Sbisà, University of Trieste
Emanuel A. Schegloff, University of California at Los Angeles
Deborah Schiffrin, Georgetown University
Paul Osamu Takahara, Kobe City University of Foreign Studies
Sandra A. Thompson, University of California at Santa Barbara
Teun A. van Dijk, Pompeu Fabra, Barcelona
Richard J. Watts, University of Berne

Volume 172
Constraints in Discourse
Edited by Anton Benz and Peter Kühnlein
Constraints in Discourse
Edited by
Anton Benz Zentrum für Allgemeine Sprachwissenschaften
Peter Kühnlein Rijksuniversiteit Groningen
John Benjamins Publishing Company Amsterdam / Philadelphia
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.

Library of Congress Cataloging-in-Publication Data
Constraints in discourse / edited by Anton Benz, Peter Kühnlein.
p. cm. (Pragmatics & Beyond New Series, ISSN 0922-842X ; v. 172)
Includes bibliographical references and index.
1. Discourse analysis. 2. Constraints (Linguistics) I. Benz, Anton, 1965- II. Kühnlein, Peter.
P302.28.C66 2008
401'.41--dc22 2007048314
ISBN 978 90 272 5416 0 (Hb; alk. paper)

© 2008 – John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher.
John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands
John Benjamins North America · P.O. Box 27519 · Philadelphia PA 19118-0519 · USA
Table of contents

Acknowledgements
1. Constraints in discourse: An introduction

Part I. The Right Frontier
2. Troubles on the right frontier
   Nicholas Asher
3. The moving right frontier
   Laurent Prévot and Laure Vieu

Part II. Comparing Frameworks
4. Strong generative capacity of RST, SDRT and discourse dependency DAGs
   Laurence Danlos
5. Rhetorical distance revisited: A parameterized approach
   Christian Chiarcos and Olga Krasavina
6. Underspecified discourse representation
   Markus Egg and Gisela Redeker

Part III. The Cognitive Perspective
7. Dependency precedes independence: Online evidence from discourse processing
   Petra Burkhardt
8. Accessing discourse referents introduced in negated phrases: Evidence for accommodation?
   Barbara Kaup and Jana Lüdtke

Part IV. Language Specific Phenomena
9. Complex anaphors in discourse
   Manfred Consten and Mareile Knees
10. The discourse functions of the present perfect
   Atsuko Nishiyama and Jean-Pierre Koenig
11. German right dislocation and afterthought in discourse
   Maria Averintseva-Klisch
12. A discourse-relational approach to continuation
   Anke Holler
13. German Vorfeld-filling as constraint interaction
   Augustin Speyer

Index
Acknowledgements
The contributions collected in this volume are based on the proceedings of the first conference on Constraints in Discourse held at the University of Dortmund. All contributions have been reviewed again and thoroughly revised before publication. The conference was organised by the two editors Anton Benz and Peter Kühnlein together with Claudia Sassen. Both editors regret that Claudia Sassen, who did a great job at organising the conference, had to leave the editorial board. We thank Angelika Storrer from the Institute for German Language at the University of Dortmund as well as the Deutsche Forschungsgemeinschaft for their financial support. Furthermore, we have to thank our employers, the IFKI at the University of Southern Denmark, the University of Bielefeld, the ZAS in Berlin and the University of Groningen for their help and encouragement. John Tammena has helped reduce the unreadability of our introductory chapter. We want to thank him as well as Paul David Doherty, who helped set up the index. Our special thanks, however, go to Andreas Jucker, the series editor of P&BNS, and of course to Isja Conen from John Benjamins Publishing Company, for their untiring help and patience.
Constraints in discourse: An introduction

1. General remarks

For a long time the development of precise frameworks of discourse interpretation has been hampered by the lack of a deeper understanding of the dependencies between different discourse units. The last 20 years have seen a considerable advance in this field. A number of strong constraints have been proposed that restrict the sequencing and attaching of segments at various descriptive levels, as well as the interpretation of their interrelations. An early and very influential line of work on the sequencing and accessibility of expressions across sentence boundaries was concerned with the Right Frontier Constraint (RFC), often associated with a paper by Polanyi (1988). The RFC formulates a restriction on the possible discourse positions of pronominal expressions. Another much discussed constraint governing pronominal reference is the centering principle formulated by Grosz and Sidner (1986).

In addition to the proposal of new discourse constraints, recent years saw the development of competing formal frameworks for discourse generation and interpretation, most importantly Rhetorical Structure Theory (RST, Mann and Thompson 1987) and Segmented Discourse Representation Theory (SDRT). In particular, the publication of Asher and Lascarides (2003), which summarises more than ten years of joint research in SDRT, gave a strong impulse to the field of discourse semantics and led to the publication of an increasing number of papers.

Constraints play a role not only in diverse fields of linguistics, but in a wide variety of fields of research in general, such as computer science, especially artificial intelligence (cf., e.g., Blache 2000). What the use of constraints has in common in all these fields is that they describe properties of objects in order to specify whether certain objects are well-formed from the point of view of the background theory. As soon as an object carries the property or properties specified by all of the constraints defined by the theory, it counts as well-formed and is accepted as (part of) a model of the theory. The object is then said to satisfy the constraints set by the theory.

In the present collection, a number of authors define constraints, thus understood, that specify properties relevant to research on discourse. The multiplicity of identified constraints mirrors the multiple facets of this research area itself. To give a rough understanding of major issues in discourse research, we will lay out three paradigms in this introduction and relate them to each other and to the texts in this volume.

The three paradigms we selected share a focus on rhetorical relations: a discourse is conceived as such only if every part of it is connected to the rest via certain relations that specify its role. This property of discourse is classically related to coherence and cohesion and can be used as a constraint to distinguish well-formed discourses from arbitrary sets of objects. The paradigms were developed during the last 20 years, and within their frameworks a number of such constraints have been proposed for the description and explanation of the multiplicity of dependencies between units of discourse.

Segmented Discourse Representation Theory (SDRT), for example, posits a selection principle over interpretations of discourse: among the possible interpretations of a discourse, the one selected is the one that renders the discourse as coherent as possible. This is operationalised via the number of rhetorical relations that connect parts of the discourse and an ordering over preferences for those relations: the more the better, given their type for some discourse. This principle is called Maximise Discourse Coherence (MDC) and is of course a constraint over the selection of interpretations as well as discourses: of those interpretations that can be generated for a given discourse, only those are acceptable that have the highest possible degree of coherence. And among objects generally, only those count as discourse for which some interpretation establishes coherence. Consider what would happen if (1b) and (1c) were exchanged in example (1), taken from (Asher and Lascarides 2003); the resulting discourse would clearly be less acceptable, and one might well argue that this would be due to the loss of coherence.

(1) a. One plaintiff was passed over for promotion three times.
    b. Another didn’t get a raise for five years.
    c. A third plaintiff was given a lower wage compared to males who were doing the same work.
    d. But the jury didn’t believe this.
One prominent constraint that is recognised by almost all theories of discourse is the so-called Right Frontier Constraint (RFC); see especially the chapters in Part I of this book. This constraint amounts to a restriction over attachment points in a discourse. (We will give a short characterization here and discuss the RFC a little more extensively in Section 3.) Consider Example (1) again. Under any reasonable interpretation, (1d) can only be related to either the immediately preceding utterance (1c) or to the totality of the preceding utterances (1a–1c). In the first case, what the jury didn’t believe was just the fact that one plaintiff was given a lower wage compared to males who were doing the same work. In the second case, the jury wouldn’t believe any of the reported facts. What should not be possible, and that is the claim connected with the RFC, is an attachment of (1d) to (1a) or (1b) alone. These two utterances should be blocked as attachment points.

The name Right Frontier Constraint derives from an assumption over representations stating that more recent utterances or, more generally, constituents in a discourse are graphically represented to the right of less recent ones. Discussion of formal representations of discourse structure and measures of anaphoric distances can be found in the chapters of Part II of this book. The most recent constituents in discourse (1) prior to the utterance of (1d) are either (1c) or the compound constituent (1a–1c), so that, given this assumption, these two are situated on the right-hand side of the representation. As accordingly all and only those constituents that are accessible for pronominal anaphoric attachment are on the right-hand side of the representation, this constraint is called RFC.

Figure 1. A graphical representation of what it means for a node to be on the right frontier: node α represents the last utterance in a discourse. α and every node dominating α (like β) is thus on the right frontier and available for attachment for a subsequent utterance γ.

As a reaction to the variety of constraints, there will be discussions on a broad spectrum of restrictions on well-formedness, be these universal, language-independent restrictions, like the two mentioned seem to be, or language-specific constraints. It is one interesting property of constraints that they can be more or less specific, and their effects can add to each other. Thus, one can end up with a very strong filter over admissible structures by combining constraints that pertain to different properties of objects. For example, some chapters discuss constraints found in, e.g., German that do not seem to be readily transferable to other languages. For more on language-specific constraints, see the chapters in Part IV of this book.

Other chapters, in Part III, deal with psycholinguistic or neurolinguistic reflexes of constraints and their empirical testing. During the processing of discourses by human participants, the linguistic constraints can be expected to produce effects and generate preferences for strategies or solutions. These predictions of course should be empirically testable.
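The idea that constraints act as filters whose effects add up can be illustrated with a small sketch. It assumes, purely for illustration, that a constraint is a predicate over candidate objects; the candidates are the possible attachment points for (1d) in Example (1). The names `admissible`, `on_right_frontier` and `is_single_utterance` are hypothetical, and the second constraint is invented only to show how combining constraints narrows the set of admissible structures.

```python
from typing import Callable, Iterable, List

# Assumption of this sketch: a constraint is a predicate over candidates.
Constraint = Callable[[str], bool]


def satisfies(candidate: str, constraints: Iterable[Constraint]) -> bool:
    """A candidate is well-formed only if it satisfies every constraint."""
    return all(check(candidate) for check in constraints)


def admissible(candidates: Iterable[str], constraints: List[Constraint]) -> List[str]:
    """Keep the candidates that satisfy all of the given constraints."""
    return [c for c in candidates if satisfies(c, constraints)]


# Candidate attachment points for (1d) in Example (1).
SITES = ["1a", "1b", "1c", "1a-1c"]


def on_right_frontier(site: str) -> bool:
    # Per the text, only (1c) and the compound (1a-1c) are open for attachment.
    return site in {"1c", "1a-1c"}


def is_single_utterance(site: str) -> bool:
    # Hypothetical second constraint, included only to show how filters combine.
    return "-" not in site


print(admissible(SITES, [on_right_frontier]))
print(admissible(SITES, [on_right_frontier, is_single_utterance]))
```

With only the right frontier predicate, two sites survive; adding the second, invented predicate leaves one. Nothing here depends on the particular predicates chosen.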
2. The cognitive status of rhetorical relations

The theory of rhetorical relations is a cornerstone of discourse analysis. In general, it is undisputed that the meaning of a text is more than the conjunction of the meanings of its sentences, but there are different opinions about the cognitive status of rhetorical relations. One position assumes that rhetorical relations are part of the linguistic inventory of language users and therefore of their linguistic competence. When faced with a sequence of two text segments, the hearer or reader searches a closed list of rhetorical relations and chooses the relation that fits best, where the criterion for fitting best varies from theory to theory. From this we may distinguish positions that assume that the extra information that the reader infers from the concatenation of two text segments is derived solely from, e.g., assumptions about the speaker’s intentions, commonsense world knowledge, and conversational maxims. Rhetorical relations are then not part of our basic linguistic inventory. We may call the first position non-reductionist and the second reductionist. Within reductionist positions we may roughly distinguish between approaches that take their starting point in plan-based reasoning, and approaches that take their starting point in Gricean pragmatics. The most important frameworks of discourse analysis discussed in this volume are non-reductionist in character, e.g., the Linguistic Discourse Model (Polanyi 1986), Rhetorical Structure Theory (Mann and Thompson 1987), and Segmented Discourse Representation Theory (Asher and Lascarides 2003). As an illustration, we discuss the following example:

(2) Ann calls a taxi service.
    Ann: (1) I need a taxi now. (2) Pick me up at the Dortmund railway station and (3) drop me at Haus Bommerholz.
The first sentence is a directive speech act asking the taxi service to provide Ann with transportation. Propositions (2) and (3) provide more information about the ride. They elaborate the content of the first sentence. A non-reductionist would assume that there exists a rhetorical relation Elaboration that is inferred by the addressee. The inference of text coherence begins with an interpretation of the sentences (1), (2) and (3). The addressee then searches a mental library of rhetorical relations. We may assume that it contains the entries Elaboration, Explanation, and Result. Each rhetorical relation defines constraints that must be fulfilled by text segments which are connected by the relation. For example, a text segment β can only elaborate a text segment α if β denotes a sub-eventuality of α, whereas Explanation and Result assume that the eventualities are non-overlapping and that one is the result of the other. Hence, the addressee can infer Elaboration, and therefore text coherence, from the fact that the propositions in (2) and (3) refer to sub-eventualities of the event mentioned in (1). (For more on this cf. Section 6.)

A reductionist tries to show discourse coherence without reference to a predefined set of rhetorical relations. Instead, the explanation may for example rest on assumptions about the speaker’s domain plans. Taking a taxi is an activity which can be broken down into being picked up by the taxi at a certain place, the taxi ride, and being dropped at the destination. Schematically, we can describe this decomposition as follows:

(S1) TakingTaxi(P) → PickUp(P, Time1, Place1), TaxiRide, Drop(P, Time2, Place2)
An analysis of Example (2) may proceed as follows: Sentence (1) states the speaker’s domain intention. This activates schema (S1), which is shared knowledge in the relevant language community. In order to make the directive in (1) felicitous, some of the parameters in (S1) have to be specified. This is done in sentences (2) and (3); they state the place of departure Place1 and the destination Place2. Coherence is achieved by direct reference to a schema like (S1). Discourse becomes incoherent if the hearer cannot find a domain schema which connects the text segments, as seen in the following example:

(3) Ann calls a taxi service.
    Ann: (1) I need a taxi now. (2) I grew up in Bielefeld, Ostwestfalen-Lippe.
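The schema-based analysis of Examples (2) and (3) can be sketched roughly as follows. This is only an illustration under strong assumptions: the mapping from an utterance to a step of (S1) is annotated by hand rather than inferred, and the names `SCHEMAS`, `Utterance`, `is_coherent` and `bindings` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Schema (S1) from the text: a domain plan and its sub-steps.
SCHEMAS: Dict[str, List[str]] = {
    "TakingTaxi": ["PickUp", "TaxiRide", "Drop"],
}


@dataclass
class Utterance:
    text: str
    # Hand-annotated step of the schema that the utterance specifies, if any.
    step: Optional[str] = None
    # Schema parameters that the utterance fills in, e.g. {"Place1": ...}.
    params: Dict[str, str] = field(default_factory=dict)


def is_coherent(goal: str, follow_ups: List[Utterance]) -> bool:
    """Coherent if every follow-up connects to some step of the goal's schema."""
    steps = SCHEMAS.get(goal, [])
    return all(u.step in steps for u in follow_ups)


def bindings(follow_ups: List[Utterance]) -> Dict[str, str]:
    """Collect the schema parameters specified by the follow-up utterances."""
    result: Dict[str, str] = {}
    for u in follow_ups:
        result.update(u.params)
    return result


example_2 = [
    Utterance("Pick me up at the Dortmund railway station", "PickUp",
              {"Place1": "Dortmund railway station"}),
    Utterance("drop me at Haus Bommerholz", "Drop",
              {"Place2": "Haus Bommerholz"}),
]
example_3 = [Utterance("I grew up in Bielefeld, Ostwestfalen-Lippe.")]

print(is_coherent("TakingTaxi", example_2), bindings(example_2))
print(is_coherent("TakingTaxi", example_3))
```

The follow-ups in Example (2) each specify a step of (S1) and fill Place1 and Place2, whereas the follow-up in Example (3) connects to no step, so the check fails.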
A reductionist position which is based on plan recognition is widespread among approaches in artificial intelligence, e.g., (Grosz and Sidner 1986; Litman and Allen 1990).

The assumption that rhetorical relations are part of our linguistic inventory has consequences for our understanding of both pragmatics and, especially, conversational implicatures (Grice 1975). For an example we look at:¹

(4) Ann: Smith doesn’t seem to have a girl friend.
    Bob: He’s been paying lots of visits to New York lately.
    Implicature: Smith possibly has a girl friend in New York (p).

In order to understand Bob’s utterance as a contribution to the ongoing conversation, Ann has to find a rhetorical relation that connects his utterance to her contribution. We may assume that there exists a rhetorical relation of Counterevidence. The inference of Counterevidence can proceed from the semantic content of the utterances and their prosodic and other linguistic properties. It is not necessary that the inference takes into account the interlocutors’ intentions. If Counterevidence holds between Ann’s and Bob’s utterances, then Bob’s utterance must provide evidence for the negation of Ann’s claim, i.e., it must provide evidence for the claim that Smith has a girl friend. This is the case if one assumes that Smith possibly has a girl friend in New York. Hence, the construction of a rhetorical relation between the two utterances leads to an accommodation of the implicature (p).

We may contrast this reasoning with the standard theory of conversational implicatures (Grice 1975), (Levinson 1983, Ch. 3), which assumes that the implicatures are derived by reasoning about each other’s intentions. According to Grice, interlocutors adhere to a number of conversational principles which spell out how discourse participants should behave in order to make their language use rational and efficient. In particular, Grice assumes that each contribution to the ongoing conversation serves a joint goal of speaker and hearer. A possible derivation of the implicature may proceed as follows: (1) Ann’s utterance raises the question whether Smith has a girl friend; (2) Bob’s contribution must be relevant to this question; (3) Bob’s contribution can only be relevant if Smith possibly has a girl friend in New York; (4) as Bob has done nothing in order to stop Ann from inferring that (p), it follows that she can safely infer that (p). In contrast to the first explanation, this explanation infers implicatures directly from joint intentions and a general principle of relevance.²

¹ For a more thorough discussion of this example and the relation between Grice’s theory of conversational implicatures and the assumption of rhetorical relations see (Asher and Lascarides 2003, Sec. 2.6).
² Asher and Lascarides (2003) point out that any existing theory of conversational implicatures in the tradition of Grice has to assume that interlocutors carry out costly computations about each other’s intentions. Hence, a theory of conversational implicatures which is based on the theory of rhetorical relations is attractive from a cognitive point of view as it makes weaker assumptions about the inference capabilities of the interlocutors.

3. Topics in the analysis of discourse constraints

In the previous section, we introduced different positions concerning the status of rhetorical relations. Rhetorical relations provide the backbone of some of the most important formal frameworks in discourse analysis. In this section, we address some topics in discourse analysis which are related to the investigation of discourse constraints. We start with constraints related to rhetorical relations and the discourse structures constructed by them. In this context, we introduce, for example, the Right Frontier Constraint as first codified by Livia Polanyi (1986) in her LDM (for more detail see Section 4).

Text coherence is the result of interconnectedness of text segments. The analysis using rhetorical relations naturally leads to a representation as a graph. The terminal nodes of the graph can be identified with elementary illocutionary acts. The graph in Figure 2 shows an analysis of the following example, in which Ann tells how she came to Haus Bommerholz:

(5) Ann: (1) I arrived at 10 am. (2) I took a taxi then. (3) It picked me up at the Dortmund railway station and (4) dropped me at Haus Bommerholz. (5) I thought it might be quite complicated to get to this place but (6) it wasn’t.

[Figure 2: graph over segments (1)–(6) with the relation labels Contrast, Evidence, Narration and Elaboration.]
Figure 2. An analysis of Example (5). The graph shows the rhetorical relations that hold between text segments.

A natural question that arises concerns the general structure of these graphs. First, we may ask what kinds of branches are associated with the different rhetorical relations. Are they always of the same kind, or can we distinguish between different types of relations? Closely related is the question of which types of graphs can be generated. For example, the graph in Figure 2 has a tree-like structure and only binary branches. A third question concerns the comparability of different representations. The tree in Figure 2 is an RST graph (Mann and Thompson 1987). These trees are different from the trees which we usually find in syntax. In syntactic trees, the relations that connect two constituents are normally attached to the branching nodes. In RST graphs they are labels on the edges connecting the nodes. We will see syntax-like graphs in the section about the Linguistic Discourse Model. The answers to the above questions impose more or less strict constraints on discourse. These topics are discussed especially in the contributions by Danlos (Chapter 4) and Egg & Redeker (Chapter 6).

In Figure 2, we can find two types of relations: relations like Elaboration which are attached to an arc, and relations like Narration which are attached to branches starting from a shared node. Text segments connected by Narration are intuitively on the same level, whereas a text segment that is attached to another text segment by Elaboration or Evidence is subordinated to this segment. The distinction between coordinating and subordinating discourse relations became very influential with (Grosz and Sidner 1986).³

³ RST distinguishes between multi-nuclear and nucleus-satellite relations. This distinction is closely related to Grosz and Sidner’s (1986) distinction between coordinating and subordinating relations.

One way of conceptualising the distinction between subordinating and coordinating rhetorical relations is based on the discourse intentions of the speaker. In Example (2), the sentences ‘Pick me up at Dortmund railway station’ and ‘Drop me at Haus Bommerholz’ provide information without which the addressee cannot successfully perform what was asked of him in the first sentence ‘I need a taxi now’. In a coordinated sequence like ‘(1) I arrived at 10 am. (2) I took a taxi then.’ neither is (1) uttered in order to support (2), nor is (2) uttered in order to support (1). Each sentence can stand alone, and neither needs the other in order to justify its occurrence. In contrast, the utterance of (2) ‘Pick me up at Dortmund railway station’ in Example (2) cannot be justified without the information that Ann needs a taxi.

The distinction between coordinating and subordinating discourse relations is incorporated in most formal frameworks and in all frameworks which we will present in the next sections. There are differences in how subordination and coordination are defined. In particular, there are different ways of thinking about the nature of these relations. For example, they may be defined in terms of discourse plans and intentions, or in a purely syntactic way.

Subordination and coordination are the properties of rhetorical relations that define the right frontier. Roughly, the right frontier denotes the zone in a graph where new text segments can attach. It is on the right side of the discourse graph if we assume that the graph is a tree and that the order from left to right corresponds to the natural order of discourse segments in text or dialogue. We consider the following example, where Ann tells another story:
(6) Ann: (1) I took a taxi to Haus Bommerholz. (2) It picked me up at the railway station. (3) The ride took more than half an hour. (4) The taxi driver didn’t know his way. (5) This was very annoying.
To which proposition does (5) refer? Sentences (2) and (3) are coordinated with each other and subordinated to (1). Sentence (4) is subordinated to (3). The right frontier consists of the segments (1), (3), (4), and (2+3). It is defined as follows: the top node of a tree is always on the right frontier; if a sequence of coordinated nodes is subordinated to a node on the right frontier, then the sequence itself and its rightmost coordinated node are also on the right frontier.⁴

The right frontier constraint states that new discourse segments can only attach to segments that are positioned on the right frontier. This means that in our example, sentence (5) can only attach to (1), (3), (4), or the compound (2+3). This restriction does not follow from expectations about which things are annoying, as the following variant shows:
(7) Ann: (1) I took a taxi to Haus Bommerholz. (2) I had to wait very long for it. (3) Then, the ride took more than half an hour. (4) The driver didn’t know his way. (5) This was very annoying.
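The definition of the right frontier given above can be made concrete with a minimal sketch. It assumes a simplified tree in which each segment carries at most one coordinated sequence of subordinated segments, ordered from left to right; the names `Segment` and `right_frontier` are hypothetical, and the structure encoded below is the one described for Examples (6) and (7), with (2) and (3) coordinated under (1) and (4) subordinated to (3).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List


@dataclass
class Segment:
    label: str
    # Coordinated sequence subordinated to this segment, in discourse order.
    # Simplifying assumption: at most one such sequence per segment.
    subordinates: List[Segment] = field(default_factory=list)


def right_frontier(top: Segment) -> List[str]:
    """Labels on the right frontier, following the definition in the text."""
    frontier = [top.label]  # the top node is always on the right frontier
    node = top
    while node.subordinates:
        sequence = node.subordinates
        if len(sequence) > 1:
            # the coordinated sequence itself, e.g. the compound "2+3"
            frontier.append("+".join(s.label for s in sequence))
        node = sequence[-1]  # its rightmost coordinated node
        frontier.append(node.label)
    return frontier


# Structure of Examples (6) and (7) before (5) is attached.
story = Segment("1", [Segment("2"), Segment("3", [Segment("4")])])

frontier = right_frontier(story)
print(frontier)         # ['1', '2+3', '3', '4']
print("2" in frontier)  # False: (2) is not available for attachment
```

Frameworks differ in the precise definition, so this should be read as one possible rendering of the informal statement rather than as the definition used by any particular theory.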
Again, (5) can only attach to the segments on the right frontier, i.e., to (1), (3), (4), and the compound (2+3), but not to (2).

[Figure 3: graph over segments (1)–(4) with the relation labels Elaboration, Narration and Explanation.]
Figure 3. An analysis of Example (6).

⁴ The precise definition of the right frontier and its associated constraint is, of course, framework dependent; see especially Sections 4 and 6.

The claim that new discourse segments can only attach to the right frontier needs some qualification. What can attach are anaphoric expressions, i.e., discourse elements which need a previous discourse element in order to receive a truth value. Examples of anaphoric expressions are pronouns like ‘he,’ ‘she,’ or ‘it’, but also abstract object anaphora (Asher 1993) like ‘this’ in sentence (5), which refers to a preceding event. Furthermore, we can think of a complete sentence like (5) as an anaphoric expression that needs a previous discourse segment to which it can be linked by a rhetorical relation. Not all anaphoric expressions are bound by the right frontier constraint. For example, definite descriptions can pick up objects which were introduced in segments to the left of the right frontier. Here is a slight variation of an example from (Asher and Lascarides 2003):
(8) (1) One plaintiff was passed over for promotion three times. (2) Another didn’t get a raise for five years. (3) A third plaintiff was given a lower wage compared to males who were doing the same work. (4) But the jury didn’t believe this. (4ʹ) But the jury didn’t believe the first case.
‘This’ in sentence (4) can only refer either to the compound of (1), (2) and (3) or to (3) alone. In contrast, ‘the first case’ in (4ʹ) refers to (1), which is not on the right frontier. Cataphors, i.e., pronouns that refer to objects that are introduced later in discourse, are an obvious problem for the right frontier constraint. The graph in Figure 2 shows another potential problem: The last coordinated sentences (5) and (6) are superordinated to the previous discourse (1)–(4) in such a way that (1)–(4) are attached to the last sentence (6). This is not possible if we assume that sentences (5) and (6) are attached sequentially to the previous graph for (1)–(4). It is possible to analyse the discourse in Example (5) in other ways which avoid this problem. The right frontier constraint is discussed especially in the papers by Asher (Chapter 2) and Prévot & Vieu (Chapter 3). Consten & Knees (Chapter 9) discuss abstract object anaphora. Chiarcos & Krasavina (Chapter 5) discuss different methods to measure the distance between anaphors and their antecedents in discourse graphs.

Another important constraint connected to rhetorical relations and the structures defined by them is the Maximise Discourse Coherence (MDC) constraint introduced by Asher and Lascarides (2003). RST graphs, for example, connect discourse segments by a single rhetorical relation. The MDC constraint represents the contrary position. It states that as many rhetorical relations as possible are realised between discourse segments. This can be understood best from the interpretation perspective. The addressee tries to connect the different segments by as many discourse relations as possible. Coherence is defined by connectedness through rhetorical relations. Maximising the number of relations that hold between segments is then the same as maximising discourse coherence. An intuitive example is the following one taken from (Asher and Lascarides 2003, p. 18):
(9) (1) John moved from Brixton to St. John’s Wood. (2) The rent was less expensive.
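Applied to Example (9), the MDC constraint as stated here amounts to choosing the reading that supports the most rhetorical relations. The sketch below is a deliberate simplification: it only counts relations and ignores any ranking by relation type, and the names `READINGS` and `preferred_reading` are hypothetical. The relations listed for the two readings of ‘the rent’ are the ones given in the discussion of this example.

```python
from typing import Dict, Set

# Candidate resolutions of 'the rent' in (9), each paired with the
# rhetorical relations that would then connect sentence (2) to sentence (1).
READINGS: Dict[str, Set[str]] = {
    "rent in Brixton": {"Background"},
    "rent in St. John's Wood": {"Background", "Explanation"},
}


def preferred_reading(readings: Dict[str, Set[str]]) -> str:
    """Pick the reading with the largest number of relations.

    Simplification: relations are only counted, not weighted by type.
    With a tie, max() returns the first reading encountered.
    """
    return max(readings, key=lambda reading: len(readings[reading]))


print(preferred_reading(READINGS))
```

Under this simple count the St. John’s Wood reading wins with two relations against one.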
There are two possibilities to resolve the bridging anaphora in (2). ‘The rent’ can relate to the rent in Brixton or St. John’s Wood. In both cases, (2) provides background information, hence (2) can attach to (1) by a relation named Background. But if we assume that ‘the rent’ refers to St. John’s Wood, then we get in addition also an explanation for why John moved. This is the preferred reading of Example (9). We get this interpretation if we maximise the number of discourse relations as the preferred reading allows to connect (2) with Background and Explanation to (1), whereas the dispreferred reading allows a connection only with Background. So far, we presented phenomena and constraints directly related to the discourse structure defined by rhetorical relations. But not all discourse constraints are connected to these relations. We here mention two important principles: centering (Grosz et al., 1995) and DRT subordination (Kamp and Reyle 1993). Grosz and Sidner (1986) distinguished between three components of discourse structure: the linguistic structure, the intentional structure, and the attentional state. The linguistic structure is defined by discourse segments and the relations holding between them. The intentional structure is defined by the speaker’s intentions that underlie the discourse segments and the relation between these intentions. The attentional state is defined by the immediate focus of attention at each point of the discourse. Grosz and Sidner distinguish between local and global discourse coherence. Global discourse coherence roughly corresponds to the coherence defined by the discourse relations holding between discourse segments, i.e., it is associated with the linguistic structure. Local coherence refers to coherence among the utterances of one discourse segment.5 Centering Theory (CT) explains, for example, why the discourse in Example (10a) is more coherent than the discourse in (10b) (Grosz et al., 1995, p. 206). (10) a. 
(1) John went to his favourite music store to buy a piano. (2) He had frequented the store for many years. (3) He was excited that he could finally buy a piano. (4) He arrived just as the store was closing for the day.
5. Here, discourse segment has to be understood roughly as meaning a sequence of coordinated utterances.
Constraints in discourse — an introduction
b. (1ʹ) John went to his favourite music store to buy a piano. (2ʹ) It was a store John had frequented for many years. (3ʹ) He was excited that he could finally buy a piano. (4ʹ) It was closing just as John arrived.
CT assigns to each utterance a set of forward looking centres and a unique backward looking centre. Forward and backward looking centres are semantic domain entities like persons, things, and events. The backward looking centre is the immediate focus of attention. The forward and backward looking centres of two consecutive utterances are related to each other as follows: the backward looking centre of the second utterance must be one of the forward looking centres of the first utterance. The forward looking centres are ranked according to salience. The subject is most likely to be ranked highest.
CT formulates several discourse constraints that are derived from forward and backward looking centres. One rule states that the backward looking centre of an utterance must be realised as a pronoun if any forward looking centre of the previous utterance is realised as a pronoun in the current utterance. This predicts that (11a) is better than (11b):
(11) a. John met Mary. He loves her.
b. John met Mary. John loves her.
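The pronoun rule can be made concrete with a small sketch. It assumes that each utterance is given as a list of mentions ordered by salience (subject first), and that the backward looking centre is the highest-ranked forward looking centre of the previous utterance that is realised again, as in the standard formulation of CT. The names `Mention`, `backward_centre` and `obeys_pronoun_rule` are hypothetical and serve only this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    entity: str    # the semantic entity that is referred to
    pronoun: bool  # True if the entity is realised as a pronoun


def backward_centre(prev: List[Mention], cur: List[Mention]) -> Optional[str]:
    """Highest-ranked forward looking centre of `prev` that is realised in `cur`."""
    realised = {m.entity for m in cur}
    for m in prev:  # `prev` is ordered by salience, most salient first
        if m.entity in realised:
            return m.entity
    return None


def obeys_pronoun_rule(prev: List[Mention], cur: List[Mention]) -> bool:
    """If any centre of `prev` is pronominalised in `cur`, the backward centre must be too."""
    cb = backward_centre(prev, cur)
    if cb is None:
        return True  # nothing is continued, so the rule does not apply
    prev_entities = {m.entity for m in prev}
    some_pronoun = any(m.pronoun and m.entity in prev_entities for m in cur)
    cb_is_pronoun = any(m.pronoun and m.entity == cb for m in cur)
    return (not some_pronoun) or cb_is_pronoun


# Example (11): "John met Mary." followed by (a) "He loves her." or (b) "John loves her."
u1 = [Mention("John", False), Mention("Mary", False)]
u2a = [Mention("John", True), Mention("Mary", True)]
u2b = [Mention("John", False), Mention("Mary", True)]

print(obeys_pronoun_rule(u1, u2a))  # (11a)
print(obeys_pronoun_rule(u1, u2b))  # (11b)
```

Under these assumptions the check accepts (11a) and rejects (11b), because in (11b) Mary is pronominalised while the backward looking centre John is not.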
Another rule states, for example, that a continuation of backward looking centres is preferred over a change. This explains the observation in Example (10) and explains why the use of the pronoun ‘he’ in sentence (5) of Example (12) is misleading (Grosz et al., 1995, p. 207). (12) (1) Terry really goofs sometimes. (2) Yesterday was a beautiful day and he was excited about trying out his new sailboat. (3) He wanted Tony to join him on a sailing expedition. (4) He called him at 6 AM. (5) He was sick and furious at being woken up so early.
DRT subordination likewise imposes restrictions on anaphoric accessibility of discourse objects. In contrast to the constraints presented so far, DRT subordination is derived from the logical form of utterances. It explains why, for example, the following uses of pronouns are infelicitous: (13) a. In the cage there was no lion. *It was snoring and sleeping. b. If a farmer owns a donkey, he beats it. ?He is my neighbour.
Discourse is interpreted incrementally by constructing Discourse Representation Structures (DRSs). Several construction algorithms have been proposed. One suggestion is to construct a separate DRS for each new sentence and merge it with a DRS representing discourse-old information. A DRS consists of a pair 〈U,Con〉 of a discourse universe U and discourse constraints Con. The universe U contains discourse referents, which correspond to the familiar variables in first-order logic. U represents the set of entities introduced by the discourse. The discourse constraints in Con are, in the most
simple case, a set of first-order formulas that represent the truth-conditionally relevant content of the discourse. A DRS representing the sentence ‘It was snoring and sleeping’ is 〈{y}, {snoring(y), sleeping(y)}〉: y is the discourse referent introduced by ‘it.’ In order to interpret the sentence in a given context, y has to be linked to a discourse-old referent. Let’s consider the case where the first sentence had been ‘In the cage there is a lion.’ This sentence can be represented by a DRS 〈{x}, {lion(x), in-cage(x)}〉. We can see that y can only be linked to x. We obtain a DRS representing the meaning of the whole discourse by merging the two DRSs into one. This can be achieved either by building the unions of the universes and conditions and adding the constraint x = y, or by replacing y by x in the DRS for ‘It was snoring and sleeping’ and then building the unions. This leads either to 〈{x, y}, {lion(x), in-cage(x), x = y, snoring(y), sleeping(y)}〉, or to 〈{x}, {lion(x), in-cage(x), snoring(x), sleeping(x)}〉. In our example (13a), the context is given by the sentence ‘In the cage there was no lion.’ In DRT, this sentence is represented by a DRS of the form 〈∅, {¬〈{x}, {lion(x), in-cage(x)}〉}〉, or in graphical notation:
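+---------------------------+
|                           |
+---------------------------+
|   +---------------------+ |
| ¬ | x                   | |
|   +---------------------+ |
|   | lion(x), in-cage(x) | |
|   +---------------------+ |
+---------------------------+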
We here encounter a negated DRS 〈{x}, {lion(x), in-cage(x)}〉 in the condition set of a larger DRS. The negated DRS corresponds to the first-order formula ¬∃x (lion(x) ∧ in-cage(x)). In addition to the truth conditions, the DRSs represent information about the accessibility of discourse referents for subsequent anaphors. Anaphors in a new DRS D can only be linked to discourse referents contained in the universe of the DRS with which D is merged. In our example, we see that y introduced by ‘it’ cannot be linked to x because x is not an element of the universe of the DRS representing the first sentence of (13a). The universe of the subordinated negated DRS is not accessible. In contrast to Grosz and Sidner (1986), subordination and coordination are not defined in terms of discourse goals but in terms of the logical form of sentences.
Psycholinguistic evidence about the accessibility of discourse referents will be discussed by Burkhardt in Chapter 7 and by Kaup & Lüdtke in Chapter 8. Burkhardt studies definite determiner phrases and Kaup & Lüdtke discuss the accessibility of discourse referents which were introduced in negated contexts.
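The merge operation and the accessibility restriction can be illustrated with a minimal sketch. It assumes a deliberately simplified encoding: conditions are plain strings or negated sub-DRSs, and only referents in the top-level universe of the context count as accessible. The names `DRS`, `Neg`, `accessible` and `merge` are hypothetical, and the sketch implements only the first merging variant described above (unions plus an identity condition).

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple, Union


@dataclass(frozen=True)
class Neg:
    """A negated sub-DRS occurring as a condition of a larger DRS."""
    body: "DRS"


Condition = Union[str, Neg]


@dataclass(frozen=True)
class DRS:
    universe: FrozenSet[str]
    conditions: Tuple[Condition, ...]


def accessible(context: DRS, referent: str) -> bool:
    """Only referents in the top-level universe of the context are accessible."""
    return referent in context.universe


def merge(context: DRS, new: DRS, links: Iterable[Tuple[str, str]] = ()) -> DRS:
    """Union of universes and conditions, plus one identity condition per
    (anaphor, antecedent) link. Raises ValueError for an inaccessible antecedent."""
    links = list(links)
    for anaphor, antecedent in links:
        if not accessible(context, antecedent):
            raise ValueError(f"{antecedent} is not accessible for {anaphor}")
    identities = tuple(f"{antecedent} = {anaphor}" for anaphor, antecedent in links)
    return DRS(context.universe | new.universe,
               context.conditions + identities + new.conditions)


lion = DRS(frozenset({"x"}), ("lion(x)", "in-cage(x)"))   # 'In the cage there is a lion.'
it = DRS(frozenset({"y"}), ("snoring(y)", "sleeping(y)"))  # 'It was snoring and sleeping.'
no_lion = DRS(frozenset(), (Neg(lion),))                   # 'In the cage there was no lion.'

merged = merge(lion, it, links=[("y", "x")])
print(sorted(merged.universe), merged.conditions)
print(accessible(no_lion, "x"))  # x sits inside the negated DRS
```

In the positive context the pronoun can be linked and the merged DRS has the universe {x, y}; in the negated context of (13a) the same link is refused because x is not in the top-level universe.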
4. The LDM
One of the first theories of discourse structure and interpretation that explicitly acknowledged the RFC was formulated in (Polanyi 1986). This paper was based on
earlier work (Polanyi and Scha 1983a, b) and remains influential. We will here describe a recent version of the LDM, taken mainly from Polanyi (2001). Following the idea that it is possible to build up discourse recursively, the LDM identifies basic or elementary discourse units; we will use the abbreviation “e-DCU” here for those units, taking up the notation from (Polanyi 1986). The criterion for a stretch of discourse to be counted as an e-DCU is that it is (part of) an utterance that describes a single event or event type in a Davidsonian sense as stated in, for example, (Davidson 1980). The characterisation is thus twofold: on the one hand, it is syntactic to the extent that it gives a criterion that can be tested purely by the form of the constituent. On the other hand, there is a semantic/pragmatic criterion that might give rise to the identification of DCUs which are only parts of utterances. We will discuss an example below. But note that the LDM does not make use of, for example, the notion of a speech act.
C
  I like to read SF
  I like to ski
  I like to sleep late
Figure 4. [C[e-DCU I like to read SF], [e-DCU I like to ski], [e-DCU I like to sleep late]] as a tree representation.
The claim that discourse can be constructed recursively means, in turn, that there have to be (syntactic) rules of combination for the constituents at various levels. The LDM knows three such rules:
Coordination
According to the LDM, coordination is an n-ary conjunction of the coordinated DCUs, where the semantics of the conjunction corresponds to the intersection of the semantics of the constituents. The coordinated DCUs are subordinated to a freshly introduced or already existing C-node. Thus, to quote an example from Polanyi (1986), [C[e-DCU I like to read SF], [e-DCU I like to ski], [e-DCU I like to sleep late]], consisting of a conjunction of three e-DCUs, is interpreted as ∩([[I like to read SF]], [[I like to ski]], [[I like to sleep late]]), where “[C. . .]” expresses the top-level node, ∩ has the same arity as the conjunction and “[[f]]” maps f into its meaning. (For a tree representation of the discourse structure, see Fig. 4.) Thus, all the information of subordinated nodes (here, e-DCUs) is collected and inherited by the superordinate C-node.
Subordination
According to the LDM, subordination is always binary, and the semantics of the superordinate S-node is the semantics of the left sister. For example, the structure [S[e-DCU I like to do fun things on vacation], [e-DCU I like to read SF]] receives an interpretation that is just [[I like to do fun things on vacation]] (“[S. . .]” expresses the top-level node again). The semantics of the right sister doesn’t
contribute to the meaning of the S-node. It is appropriate to distinguish between subordination of nodes in the hierarchy and subordination of DCUs with respect to one another. As explained in the previous paragraph, the daughters of a C-node are coordinated, while in the present case, the daughters of an S-node are not. (The left sister is superordinated.)
n-ary relations
There is one case of n-ary relation that is different from coordination.6 Polanyi (2001) doesn’t give a general characterisation of this type of relation, but states that logically, rhetorically, or interactionally related pairs of DCUs are n-arily connected. [B[e-DCU If John goes to the store], [e-DCU he’ll buy tomatoes]] is an example of such a binary structure; here, the relation between the DCUs is the logical if/then. The semantics of such nodes is complex. According to Polanyi et al. (2004), what is available at those nodes is information about each constituent and the relationship connecting them.
The analysis of ‘if John goes to the store he’ll buy tomatoes’ is the example of sub-sentential e-DCUs announced above. This analysis is triggered by the presence of the logical connective if ___ then ___. There is no general characterisation of the triggering class of constructions in the texts on the LDM. But as propositional logical connectives take expressions as arguments that can classically be interpreted as sets of possible worlds in which the respective argument is evaluated as true, this seems to mirror the desire to allow for a fine-grained semantic analysis of discourse. This might be the natural point to have a closer look at the semantics as utilised by the LDM. As Polanyi (2001) puts it, the central concern of the LDM is the setting and resetting of contexts.
This emphasis on contexts is quite in the spirit of Kaplan (1978) and distinguishes the otherwise rather similar semantic concerns from those found in (Heim 1983; Kamp and Reyle 1993; Groenendijk and Stokhof 1990) and elsewhere. Of course, as Polanyi stresses, the semantics utilised by the LDM is dynamic. Figure 5 shows a representation of an already complicated case, namely the semantics of a DCU expressing reported speech. What is summarily described as “index” or “indexes” for DCUs in the upper list actually comes in different flavours; Polanyi (2001) gives a list that spells it out as the partial ordering interaction > speech event > genre unit > modality > polarity > point of view of contexts.
Polanyi (2001) demonstrates the explanatory power of the LDM by giving an interesting analysis of a Yiddish anecdote. We will here just briefly analyse the short taxi ordering event from Example (2), p. 5. As discussed above, Ann’s utterance of I need a taxi now gets further elaborated by the subsequent utterance Pick me up at the Dortmund railway station and drop me at Haus Bommerholz. According to the LDM principles set out, the first utterance is analysed as [e-DCU I need a taxi now], expressing a single event. The second utterance has to be
6. Polanyi (1986) in an earlier version of the LDM acknowledged only binary rather than n-ary relations.
[Figure 5 diagram; labels: indexes of reporting DCU; e1 at t1 ···; event of reporting; indexes of reported DCU; reported event(s)]
Figure 5. The representation of dynamic semantic interpretation of a case of reported speech.
rendered as a binary construction including [e-DCU Pick me up at the Dortmund railway station] and [e-DCU drop me at Haus Bommerholz]. We arrive at the intermediate analysis for the second utterance [n-ary[e-DCU Pick me up at the Dortmund railway station], [e-DCU drop me at Haus Bommerholz]]; and finally, using abbreviations for the utterances, for the whole discourse we get [S[e-DCU u1], [n-ary[e-DCU u2], [e-DCU u3]]]. Note that there might be a choice in the analysis; one might decide to first attach [e-DCU u2] to [e-DCU u1] as [S[e-DCU u1], [e-DCU u2]] and only later introduce the node [n-ary] to enable the attachment of [e-DCU u3]. This, however, depends on the actual implementation of the discourse parser and is only relevant for the processing, while it makes no difference to the result.
With regard to the interpretation of this short discourse, what we obtain at the top-level node is just the interpretation of [e-DCU I need a taxi now], according to the rule for interpreting subordinating nodes. The interpretation of the subordinated DCU ⟦[n-ary[e-DCU Pick me up at the Dortmund railway station], [e-DCU drop me at Haus Bommerholz]]⟧ plays no role for the semantics of the discourse. This is a little surprising and might need some further investigation; however, given the way the semantics for the LDM is set up it makes sense: all the relevant indexes seem to be set for the taxi ordering event. The events described in the subordinate DCUs carry different indexes that should not have a direct impact on the current ones. At this point it may be beneficial to introduce a distinction between parts of semantic information with respect to their availability. There should probably be parts that percolate through the tree (e.g., information relating to individuals) and others that don’t (like the indexes). We will see how other theories handle the passing of information in the following sections of this chapter.
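The interpretation rules for C-nodes and S-nodes can be sketched in a few lines. The sketch assumes, purely for illustration, that the meaning of an e-DCU is a set of possible worlds given as strings; the LDM's context indexes are not modelled, and since no general semantics for n-ary nodes is given above, the sketch leaves them uninterpreted. All class and function names (`EDCU`, `Coord`, `Subord`, `NAry`, `interpret`) are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
from typing import FrozenSet, Tuple, Union

Meaning = FrozenSet[str]  # toy assumption: a meaning is a set of world names


@dataclass(frozen=True)
class EDCU:
    text: str
    worlds: Meaning


@dataclass(frozen=True)
class Coord:  # C-node with one or more daughters
    daughters: Tuple["Node", ...]


@dataclass(frozen=True)
class Subord:  # S-node, always binary
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class NAry:  # n-ary node; its semantics is deliberately left open here
    relation: str
    daughters: Tuple["Node", ...]


Node = Union[EDCU, Coord, Subord, NAry]


def interpret(node: Node) -> Meaning:
    if isinstance(node, EDCU):
        return node.worlds
    if isinstance(node, Coord):
        # Coordination: intersection of the meanings of all daughters.
        return reduce(frozenset.intersection, [interpret(d) for d in node.daughters])
    if isinstance(node, Subord):
        # Subordination: the meaning of the left sister only.
        return interpret(node.left)
    raise NotImplementedError("no interpretation rule for n-ary nodes in this sketch")


likes = Coord((
    EDCU("I like to read SF", frozenset({"w1", "w2", "w3"})),
    EDCU("I like to ski", frozenset({"w2", "w3"})),
    EDCU("I like to sleep late", frozenset({"w3", "w4"})),
))

u1 = EDCU("I need a taxi now", frozenset({"w1", "w2"}))
u2 = EDCU("Pick me up at the Dortmund railway station", frozenset({"w1"}))
u3 = EDCU("drop me at Haus Bommerholz", frozenset({"w2"}))
taxi = Subord(u1, NAry("n-ary", (u2, u3)))

print(sorted(interpret(likes)))
print(sorted(interpret(taxi)))
```

Note that interpreting the taxi discourse never touches the n-ary node: the S-node rule consults only the left sister, which is exactly why the subordinated material plays no role at the top level.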
5. Rhetorical Structure Theory
The origins of Rhetorical Structure Theory (RST; Mann and Thompson 1987) lie in text generation. It soon developed into a general linguistic framework for analysing
[Figure 6 diagram: SEQUENCE over (1), (2) and (5); ELABORATION linking (2) to a LIST of (3) and (4)]
Figure 6. An analysis of Example (14).
text in terms of the rhetorical relations that hold between text segments. The minimal units of texts are the speech acts produced at sentence level. Units linked by rhetorical relations themselves form a text segment that can again be linked to other text segments. RST distinguishes between two types of relations: relations that connect text segments which are of equal importance to the text, so-called multi-nuclear relations, and relations that connect text segments of differing importance, so-called nucleus-satellite relations. A nucleus is a text segment which can stand alone as a coherent text. In contrast, a satellite would not form a coherent text without its superordinate nucleus. The following Example (14) shows three nuclei in (1), (2) and (5), and two satellites in (3) and (4) which elaborate sentence (2):
(14) Ann: (1) I arrived in Dortmund by train. (2) I took a taxi to Haus Bommerholz. (3) It picked me up at the railway station. (4) The ride took more than half an hour. (5) When I arrived at Haus Bommerholz, I checked in immediately.
If we delete (3) or (4), then the remaining text is still coherent, whereas a deletion of (2) produces an incoherent text because the two satellites cannot be subordinated to any other text segment. The sentences (1), (2) and (5) form a temporal sequence. Each of them could be deleted independently of the others without disrupting text coherence. A graphical RST representation of Example (14) is shown in Figure 6.
As mentioned before, RST distinguishes between two types of relations: multi-nuclear and nucleus-satellite relations. The definition of a multi-nuclear relation is divided into two parts: constraints on the combined nuclei and a definition of the intended effects of the relation. Definitions of nucleus-satellite relations are divided into three parts: constraints on the nucleus and satellite individually, constraints on the nucleus-satellite combination, and again a definition of the intended effects. We here show some examples for each relation type.7
7. The definitions are taken from (Mann and Thompson 1992). They can also be found on the rst homepage http://www.sfu.ca/rst/01intro/definitions.html. The page provides an extensive list of relations together with examples and many elaborated text analyses.
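The deletion test described for Example (14) can be sketched as follows. The encoding is an assumption made for this illustration only: the List of (3) and (4) is flattened into two satellites of the same nucleus, and coherence is reduced to the question of whether a deletion leaves satellites without their nucleus. The names `NucleusSatellite`, `orphaned_by_deleting` and `remains_coherent` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class NucleusSatellite:
    relation: str
    nucleus: int
    satellites: Tuple[int, ...]


# Example (14): (1), (2) and (5) are the nuclei of a multi-nuclear Sequence;
# (3) and (4) elaborate (2).
SEQUENCE_NUCLEI = (1, 2, 5)
ELABORATION = NucleusSatellite("Elaboration", nucleus=2, satellites=(3, 4))


def orphaned_by_deleting(unit: int, relations: Iterable[NucleusSatellite]) -> List[int]:
    """Satellites that lose their nucleus when `unit` is deleted."""
    orphans: List[int] = []
    for rel in relations:
        if rel.nucleus == unit:
            orphans.extend(rel.satellites)
    return orphans


def remains_coherent(unit: int, relations: Iterable[NucleusSatellite]) -> bool:
    """Toy criterion: a deletion is harmless if it orphans no satellite."""
    return not orphaned_by_deleting(unit, relations)


for unit in (1, 2, 3, 4, 5):
    print(unit, remains_coherent(unit, [ELABORATION]))
```

Only the deletion of (2) fails the test, since it would leave (3) and (4) without a nucleus; deleting any other unit, including the Sequence nuclei (1) and (5), passes.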
Multi-nuclear relations
The following tables show the definitions of the relations Contrast and Sequence. The relation Sequence corresponds to the SDRT relation Narration, see Section 6.
Contrast
Constraints on nuclei: No more than two nuclei; the situations in these two nuclei are (a) comprehended as the same in many respects (b) comprehended as differing in a few respects and (c) compared with respect to one or more of these differences.
Intended effects: Receiver recognizes the comparability and the difference(s) yielded by the comparison.
Sequence
Constraints on nuclei: There is a succession relationship between the situations in the nuclei.
Intended effects: Receiver recognizes the succession relationships among the nuclei.
Nucleus-satellite relations
In the following tables, N stands for nucleus, S for satellite, R for Receiver, and W for Writer.
Evidence
Constraints on N or S: on N: R might not believe N to a degree satisfactory to W; on S: R believes S or will find it credible
Constraints on N+S: R’s comprehending S increases R’s belief of N
Intended effects: R’s belief of N is increased
Preparation
Constraints on N or S: none
Constraints on N+S: S precedes N in the text; S tends to make R more ready, interested or oriented for reading N
Intended effects: R is more ready, interested or oriented for reading N
Elaboration
Constraints on N or S: none
Constraints on N+S: S presents additional detail about the situation or some element of subject matter which is presented in N or inferentially accessible in N in one or more of the ways listed below. In the list, if N presents the first member of any pair, then S includes the second:
• set : member
• abstraction : instance
• whole : part
• process : step
• object : attribute
• generalization : specific
Intended effects: R recognizes S as providing additional detail for N. R identifies the element of subject matter for which detail is provided.
Graphical convention
There is a graphical convention for drawing RST trees. We have seen examples in Figures 2, 3, and 6. Corresponding to the two types of relations, these graphs are built up from two types of graphical components. Figure 7 shows the graphical convention for depicting multi-nuclear relations. Figure 8 shows the corresponding convention for nucleus-satellite relations.
RST is mainly a theory of discourse coherence based on rhetorical relations. Important for the understanding of RST is the role of intentions. A text is only coherent if the receiver can recognise behind each text segment an effect intended by the text’s author. This contrasts, for example, with the treatment of rhetorical relations in SDRT. As a result, the inclusion of the author’s intentions in the definition of rhetorical relations leads to a more fine-grained distinction within these relations than in other approaches. The explicit inclusion of intentions makes it an attractive framework for discourse generation. In RST, it is assumed that only one relation can hold between two segments, but multiple analyses of one text may be possible. Perhaps the strongest drawback compared to other theories is RST’s lack of a theory of anaphoric restrictions. Likewise,
[Figure 7 diagram: RELATION over Nucleus 1, Nucleus 2, ..., Nucleus n]
Figure 7. Graphical component for multi-nuclear relations.
[Figure 8 diagram: RELATION linking Nucleus and Satellite]
Figure 8. Graphical component for nucleus-satellite relations.
there is no direct connection to theories of discourse interpretation, like Discourse Representation Theory (Kamp and Reyle 1993). As a consequence, RST is weak as a framework for discourse interpretation.
6. SDRT
One of the most recently developed theories of discourse meaning is SDRT, prominently defended by Asher and Lascarides (2003). It can be thought of as building on two cornerstones: the first is RST, as laid out in Section 5. The other is DRT, which was already touched upon in Section 3. DRT had been around in the form of grey papers in the 80s and was canonized in (Kamp and Reyle 1993). For an example of how DRT deals with the construction of meaning representations of discourse, let us look at Ann’s calling a taxi service again. “I need a taxi now” can be taken to introduce four discourse markers: one each for Ann and a taxi, one for the event (or state) of needing the taxi, and one for the moment denoted by “now”. In DRT, this would be written as in Figure 9(a):
(a) 〈{i, x, e1, n}, {taxi(x), need(e1, i, x, n)}〉
(b) 〈{j, y, r, e2, m}, {Dortmund-railway-station(r), pick-up(e2, j, y, r, m)}〉
Figure 9. Representation of I need a taxi now and Pick me up at Dortmund railway station in standard DRT.
The continuation in Example (2) above, “Pick me up at Dortmund railway station”, might be given a semantic representation in the form of the DRS 9(b) by similar reasoning. Using a standard merging operation as in (van Eijck and Kamp 1997), the two DRSs turn into the representation in Figure 10. Notice that this representation still makes it possible to identify all the discourse entities after the merge, and the truth conditions of the discourse are rendered quite nicely: the whole discourse can be said to be true with respect to some model if the referents can be mapped to that model such that the properties and relations expressed by the predicates are satisfied, just as required by standard model theory. However, some information is not preserved by the obtained DRS: “Pick me up at Dortmund railway station” is a request or command, whereas “I need a taxi now” is a statement. The utterance mood (or illocutionary force) is not preserved. Further, the two utterances stand in a certain relation (cf. Section 1.5): the second utterance elaborates the first. Note that their order is not arbitrary. This information likewise is not preserved because the discourse segmentation is abstracted over in DRT.
〈{i, x, e1, n, j, y, r, e2, m}, {taxi(x), need(e1, i, x, n), Dortmund-railway-station(r), pick-up(e2, j, y, r, m), i = j}〉
Figure 10. Representation of I need a taxi now. Pick me up at Dortmund railway station.
SDRT preserves discourse segmentation. The DRSs that represent discourse segments are labelled by prepending tags (here, Greek letters with indices) as shown in Figure 11. The labels can be used to refer to the DRSs representing the meanings of discourse segments. Whereas it might be best to think of earlier discourse theories, very obviously so in the case of (Grosz and Sidner 1986), as adhering to a metaphor of stack execution, SDRT might best be reconstructed along the lines of relational databases. (Discourse referents are distinguished entries in that they don’t point to other entities etc.) It is thus possible to express relations that hold between them and properties they have. Thus, the representations for the meanings of the two utterances from Ann’s taxi call turn into a structured representation from which it is possible to recover much more information than from the plain DRS.
(a) π1: 〈{i, x, e1, n}, {taxi(x), need(e1, i, x, n)}〉
(b) π2: 〈{j, y, r, e2, m}, {Dortmund-railway-station(r), pick-up(e2, j, y, r, m)}〉
Figure 11. Representation of I need a taxi now and Pick me up at Dortmund railway station in SDRT.
This representation is called an SDRS. As can be seen in Figure 12, the SDRS not only contains the two sub-SDRSs representing the meaning of each utterance, but also a predicate denoting the kind of relation in which the two meanings stand: π2 is interpreted as elaborating π1. SDRT has adopted from earlier approaches the insight that it is necessary to express structural underspecification, e.g., from UDRT (Underspecified DRT), cf. Reyle (1993), or CLLS (Constraint Language for Lambda Structures; Egg et al., 2001). The way SDRT does so with respect to discourse relations is by introducing variables for relations.
〈{π1, π2}, {π1: 〈{i, x, e1, n}, {taxi(x), need(e1, i, x, n)}〉, π2: 〈{y, r, e2, m}, {Dortmund-railway-station(r), pick-up(e2, i, y, r, m)}〉, Elaboration(π1, π2)}〉
Figure 12. Representation of I need a taxi now. Pick me up at Dortmund railway station in SDRT.
In Figure 12, the Elaboration relation is fully specified, following the analysis above. If, however, it were unclear exactly which relation(s) held between the two discourse units, it would be feasible to just write R1 in place of Elaboration, thus introducing a variable for a relation. Note that SDRT, in marked contrast to RST, allows for multiple relations to hold between discourse units. This is relevant not only for trivial cases like inverse relations, of which Explanation and Cause are an example. The SDRT analysis of Explanation is that of a causal explanation: the occurrence of one event explains (or can be employed to explain) why another event occurs. Non-trivial cases might be an Explanation in parallel to Background, as in Example (9).
Combinations of possible relations holding between discourse constituents (or rather, the SDRSs expressing their meaning) are constrained, however, by the outcomes that are predicted by the relations: if, e.g., R1 predicts temporal overlap of the related events, and R2 predicts a temporal sequence of them, then R1 and R2 cannot simultaneously hold between two constituents. According to the database metaphor, the criterion is that the table containing the outcomes may not become inconsistent. If adding a relation Rn made the table containing the outcomes predicted by the relations R1 . . . Rm inconsistent, a decision would have to be made which (set of) relation(s) would have to be dropped. The principle guiding such a decision would be that of Maximize Discourse Coherence, cf. p. 9, i.e., drop the minimal number of relations that is necessary to render the outcomes maximally consistent (taking scalar relations etc. into account).
The introduction of labels for discourse constituents makes it possible to reason about the structure of discourse without having to bother with the meaning of the constituents.
Thus, the complexity of modal dynamic predicate logic that is needed for the interpretation of the discourse constituents is not introduced into the reasoning about the
structure of the discourse. However, when needed, all of the information can be recovered and used. To accomplish this, the language (and corresponding logic) to describe the rhetorical structure of a discourse is combined with the language used to describe (and logic used to reason over) underspecified logical forms. There is a very detailed description of this so-called glue language (and corresponding logic) in (Asher and Lascarides, 2003, 184ff) and we will not go into the details here.
Compared to RST, SDRT is definitely stronger as a theory of discourse interpretation. Because of its rich formal inventory, SDRT allows a detailed description of discourse meaning, whereas RST in the described version must be said to be definitely more restricted. The LDM, on the other hand, seems to have a coverage that is similar to that of SDRT. While SDRT can be said to be semantically driven and syntactically informed, the LDM ought to be characterized as being syntactically driven. It would definitely be worth looking at which approach is cognitively more appropriate, but this cannot be done here.
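The consistency criterion on combinations of relations, and the idea of dropping as few relations as possible, can be sketched as follows. The table of predicted outcomes is a toy assumption: R1 and R2 are the relation variables used in the text above, R3 is a hypothetical relation without a temporal prediction, and only the overlap/sequence clash is encoded. Scalar relations and the glue logic are not modelled, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set

# Toy assumption: the temporal outcome predicted by each relation, if any.
PREDICTS: Dict[str, str] = {"R1": "overlap", "R2": "sequence"}

# Pairs of outcomes that cannot hold together of the same two constituents.
INCOMPATIBLE: List[FrozenSet[str]] = [frozenset({"overlap", "sequence"})]


def consistent(relations: Iterable[str]) -> bool:
    """True if the outcomes predicted by `relations` contain no incompatible pair."""
    outcomes = {PREDICTS[r] for r in relations if r in PREDICTS}
    return not any(pair <= outcomes for pair in INCOMPATIBLE)


def largest_consistent_subsets(relations: Iterable[str]) -> List[Set[str]]:
    """All consistent subsets of maximal size, i.e. with the fewest relations dropped."""
    candidates = sorted(set(relations))
    for size in range(len(candidates), -1, -1):
        found = [set(c) for c in combinations(candidates, size) if consistent(c)]
        if found:
            return found
    return [set()]


print(consistent({"R1", "R3"}))
print(consistent({"R1", "R2"}))
print(largest_consistent_subsets({"R1", "R2", "R3"}))
```

For the three relations the search returns two equally good candidates, one dropping R1 and one dropping R2. Choosing between them is exactly the point where a principle such as Maximize Discourse Coherence would have to say more than this sketch does.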
7. About the papers
The book divides into four parts. The first part contains two chapters by Asher and Prévot & Vieu which discuss the right frontier constraint. The second part contains papers which compare different frameworks according to the discourse structures which can be generated in these frameworks. It includes chapters by Danlos, Chiarcos & Krasavina, and Egg & Redeker. The third part approaches the topic of discourse constraints from the cognitive perspective. The chapters are based on experimental studies conducted by Burkhardt and Kaup & Lüdtke. The last and largest part contains work which applies discourse theory to language-specific phenomena. It includes five chapters by Consten & Knees, Nishiyama & Koenig, Averintseva-Klisch, Holler, and Speyer.
In view of the importance of the Right Frontier to the various frameworks as a restriction on anaphoric accessibility, it is well justified to open this book on constraints in discourse with a discussion of the RFC. The first contribution, Troubles on the Right Frontier by Nicholas Asher, discusses some challenges to the Right Frontier Constraint and proposes a refinement of the RFC that meets them. It is argued that different anaphors behave differently with respect to the RFC depending on their presupposed information content. In general, anaphoric expressions with little to no presuppositional content appear to obey the RFC without exception. However, anaphors with a ‘heavier’ presuppositional content (definite descriptions and complex demonstratives) have a better chance of remaining felicitous even though the relations between them and their antecedents violate the Right Frontier Constraint.
The right frontier constraint is also at the heart of the paper contributed by Laurent Prévot & Laure Vieu, The Moving Right Frontier. The authors argue that the coordinating or subordinating nature of discourse relations plays a major role in certain cases of revision of the discourse structure.
They focus in particular on a relation typical in narratives, Result, as well as on a family of dialogue relations: content-relations
introduced by interrogatives. They conclude that the observed complex behaviour shows that the RFC needs to be “handled with care”.
Closely related to those discussions of the RFC is Laurence Danlos’ Comparing RST and SDRT Discourse Structures through Dependency Graphs, since the RFC is a restriction over docking points in discourse structures. Danlos discusses the RST distinction between Nucleus and Satellite arguments and the SDRT distinction between coordinating and subordinating discourse relations. She proposes a third mode of representation for discourse structures, called Dependency Graph. She argues that RST is far too restrictive with respect to generative capacity, SDRT a little bit too restrictive, and the dependency DAG formalism a little bit too powerful.
Christian Chiarcos & Olga Krasavina, in their Rhetorical distance revisited — A parameterised approach, develop a notion of rhetorical distance. This notion allows for a comparative representation of different theories; the effect is a scalar concept of accessibility. The authors reconstruct three theories of discourse-structural accessibility that differ in their assumptions on discourse structuring and its limiting force on the search for the antecedent. Results of an empirical study on the comparative predictivity of the reconstructed theories for the use of pronouns in German and English newspaper corpora are discussed.
Underspecified discourse representation is at the center of Markus Egg & Gisela Redeker’s paper. An approach to discourse structure that builds on syntactic structure to derive that part of discourse structure that can be captured without taking recourse to deep semantic or conceptual knowledge is proposed. The authors claim that this contribution is typically only partial. They develop a notion of underspecified constraints that describe the structures a given discourse might have.
This results in an interface from syntax to discourse and a clean interface to modules of discourse resolution.
Still dealing with discourse structure, but from a processing perspective, Petra Burkhardt’s Dependency Precedes Independence: Online Evidence from Discourse Processing investigates the integration of definite determiner phrases (DPs) as a function of their contextual salience. DPs depend on previously established discourse referents or introduce a new, independent discourse referent (and see Asher’s paper in this volume). A formal model that explains how discourse referents are represented in the language system and what kind of mechanisms are implemented during DP interpretation is presented. Experimental data from an event-related potential study are discussed that demonstrate how definite DPs are integrated in real-time processing. Two distinct mechanisms, Specify R and Establish Independent File Card, and a model that includes various processes and constraints at the level of discourse representation are assumed to explain the data.
Barbara Kaup & Jana Lüdtke in their Accessing Discourse Referents Introduced in Negated Phrases: Evidence for Accommodation? present an investigation into negation. According to standard theories of dynamic semantics, a discourse referent introduced by a noun phrase (NP) in the scope of a negation should be inaccessible to subsequent anaphoric reference. Kaup & Lüdtke’s paper presents empirical findings on anaphora resolution in the context of negations.
Constraints in discourse — an introduction
Lexical or pronominal NPs that refer to propositionally structured referents (such as events, processes, states and facts) while introducing them as unified entities into a discourse representation, called complex anaphors, are the focus of Manfred Consten’s & Mareile Knees’ contribution Complex Anaphors in Discourse. They describe anaphoric complexation processes and constraints on them in terms of ontological categories. Additionally, they provide a resolution model for complex anaphors and discuss different kinds of disambiguation processes based on ontological and lexical features as well as conceptual knowledge. Atsuko Nishiyama & Jean-Pierre Koenig report two corpus studies of the present perfect in English and Japanese in The discourse functions of the present perfect. They argue that the inferences required to interpret the present perfect follow from general default rules or commonsense entailment rules, and that the use of the perfect is relevant for discourse coherence in two ways. First, the presence of the state which the perfect introduces helps establish discourse relations, or allows the establishment of additional discourse relations between discourse segments. Second, the pragmatic inferences required to interpret the perfect can indirectly trigger the rules needed to establish discourse relations. German right dislocation, according to Maria Averintseva-Klisch in her German Right Dislocation and Afterthought in Discourse, subsumes two distinct constructions, dislocation proper and afterthought. These differ in a number of prosodic, syntactic and semantic characteristics and also have different discourse-functional properties. Right dislocation marks a discourse referent as especially salient at the current stage of the discourse. This requires the fulfilment of certain anaphoric constraints on the following discourse. Afterthought is a local reference clarification strategy and has no impact on the global discourse structure.
Anke Holler’s contribution A discourse-relational approach to continuation draws upon the distinction between two classes of non-restrictive relative clauses in German: continuative and appositive ones. The paper investigates whether the notion of communicative-weight assignment first introduced by Brandt can be couched in discourse-structural terms by exploiting the distinction between coordinating and subordinating discourse relations in the sense of Asher and Vieu (2005). The filling of the vorfeld in German, following Augustin Speyer in his German Vorfeld-filling as Constraint Interaction, depends on information-structural rather than strictly syntactic constraints. He argues for a ranking among scene-setting elements (which are said to be most likely to appear in the vorfeld), followed by contrastive elements and finally by topics. The differences in likelihood of appearing in the vorfeld are argued to be best modelled by an Optimality Theoretic account that is sketched out in the paper.
Bibliography
Asher, N. (1993). Reference to Abstract Objects in Discourse. Kluwer Academic Publishers.
Asher, N. and Lascarides, A. (2003). Logics of Conversation. Cambridge University Press.
Bäuerle, R., Schwarze, C., and Stechow, A. v., editors (1983). Meaning, Use, and Interpretation of Language. Foundations of Communication, Library Edition. de Gruyter.
Blache, P. (2000). Constraints, linguistic theories and natural language processing. In Proceedings of NLP-2000.
Davidson, D. (1980). Essays on Actions and Events. Clarendon Press.
Egg, M., Koller, A., and Niehren, J. (2001). The constraint language for lambda structures. Journal of Logic, Language and Information, 10:457–85.
Grice, H.P. (1975). Logic and Conversation. In Cole, P. and Morgan, J.L., editors, Syntax and Semantics, volume 3, pages 41–58. Academic Press.
Groenendijk, J. and Stokhof, M. (1990). Dynamic Montague Grammar. Technical report, ILLC, University of Amsterdam. Obtainable via FTP from http://www.wins.uva.nl/research/illc/.
Grosz, B.J., Joshi, A.K., and Weinstein, S. (1995). Centering: A Framework for Modelling the Local Coherence of Discourse. Technical Report TR-18-95, Center for Research in Computing Technology, Harvard University.
Grosz, B.J. and Sidner, C. (1986). Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175–204.
Heim, I. (1983). File change semantics and the familiarity theory of definiteness. In Bäuerle et al. (1983).
Kamp, H. and Reyle, U. (1993). From Discourse to Logic: Introduction to Modeltheoretic Semantics of Natural Language, Formal Logic and Discourse Representation Theory, volume 42 of Studies in Linguistics and Philosophy. Kluwer.
Kaplan, D. (1978). Dthat. In Cole, P., editor, Pragmatics, volume 9 of Syntax and Semantics, pages 221–43. Academic Press.
Kempen, G., editor (1987). Natural Language Generation. Number 135 in NATO Advanced Science Institutes, Applied Sciences. Martinus Nijhoff Publishers.
Levinson, S.C. (1983). Pragmatics. Cambridge University Press.
Litman, D.J. and Allen, J.F. (1990). Discourse processing and commonsense plans. In Cohen, P.R., Morgan, J., and Pollack, M.E., editors, Intentions in Communication, pages 417–44. MIT Press.
Mann, W.C. and Thompson, S.A. (1987). Rhetorical Structure Theory: Description and Construction of Text Structures. In Kempen (1987), pages 85–95.
Mann, W.C. and Thompson, S.A., editors (1992). Discourse Description: Diverse linguistic analyses of a fund-raising text. John Benjamins Publishing Company.
Mann, W.C., Matthiessen, C.M.I.M., and Thompson, S.A. (1992). Rhetorical Structure Theory and Text Analysis. In Mann, W.C. and Thompson, S.A., editors, Discourse Description: Diverse linguistic analyses of a fund-raising text, pages 39–78. John Benjamins Publishing Company.
Polanyi, L. (1986). The linguistic discourse model: Towards a formal theory of discourse structure. Technical Report TR-6409, BBN Laboratories Inc.
Polanyi, L. (2001). The Linguistic Structure of Discourse. In Schiffrin, D., Tannen, D., and Hamilton, H.E., editors, Handbook of Discourse Analysis. Blackwell.
Polanyi, L., Culy, C., van den Berg, M.H., Thione, G.L., and Ahn, D. (2004). Sentential structure and discourse parsing. In Webber, B. and Byron, D., editors, Proceedings of the ACL2004 Workshop on Discourse Annotation. ACL/SIGDial.
Polanyi, L. and Scha, R. (1983a). On the Recursive Structure of Discourse. In Ehlich, K. and van Riemsdijk, H., editors, Connectedness in Sentence, Discourse and Text, pages 141–78. Tilburg University.
Polanyi, L. and Scha, R. (1983b). The Syntax of Discourse. Text, 3(3):261–70.
Reyle, U. (1993). Dealing with Ambiguities by Underspecification: Construction, Representation and Deduction. Journal of Semantics, 10:123–79.
van Eijck, J. and Kamp, H. (1997). Representing discourse in context. In Benthem, J.F. v. and Ter Meulen, A.G., editors, Handbook of Logic and Language. MIT Press.
part i
The Right Frontier
Troubles on the right frontier Nicholas Asher
CNRS, Laboratoire IRIT
1. Overview
The Right Frontier Constraint (RFC), originally proposed by Polanyi in the eighties (Polanyi 1985), is one of the central empirically motivated constraints on discourse update and anaphora resolution in Segmented Discourse Representation Theory or SDRT, a formal theory of discourse interpretation that integrates dynamic semantics and a conception of rhetorical function into the analysis of discourse content.1 In SDRT these two tasks are codependent; anaphora resolution offers constraints on discourse attachment and vice versa. This is because in SDRT attachment with or without discourse connectors is a matter of resolving an underspecified antecedent for a term in a discourse relation, which is exactly what is involved with pronoun resolution. While there is considerable support for the Right Frontier Constraint, there are also some challenges. In this paper I will look at these challenges in detail, and propose a refinement of RFC that meets them. In particular, I’ll argue that different anaphors behave differently with respect to RFC depending on their presupposed information content. In general, anaphoric expressions with little to no presuppositional content appear to obey RFC without exception. This predicts that discourse attachments, understood as anaphors with no presupposed content, must obey RFC, which also is in accord with the facts. However, anaphors with a ‘heavier’ presuppositional content (definite descriptions and complex demonstratives) have a better chance of remaining felicitous even though the relations between them and their antecedents violate the RFC. This points to a special role that such expressions play in discourse, a role noticed by Ariel, Gundel and others: such expressions permit the reader to focus on a discourse entity that was not salient at this point in processing the discourse. Integrating such observations into a theory of discourse structure will enable us to refine RFC appropriately.
To set the stage, I’ll begin with an overview of SDRT and its formalization of RFC.
1. I would like to thank Laure Vieu, Laurent Prévot and Laurence Danlos for helpful comments on this paper and for helpful discussions on the topic in general.
2. An introduction to the Right Frontier Constraint and its formalisation
Anaphors in natural language are subject to several constraints governing their possible antecedents. While syntactic and semantic constraints of the sort discussed in generative syntax and dynamic semantics respectively are widely accepted, discourse constraints on anaphora are less well known, at least in the philosophical community. The semantic constraint of accessibility of antecedents in dynamic semantics easily makes sense in conceptual terms: if the constraint of semantic accessibility is violated when one identifies a variable introduced by an anaphor with some antecedently introduced variable or discourse referent v, there is no value assigned to v in the local context of the anaphor and so the identification is uninterpretable. On the other hand, the syntactic constraints from Binding Theory and RFC are structural constraints. Schlenker (2005) has argued for a pragmatic reinterpretation of the Binding Theory, and one can explain RFC in similar terms; RFC is a presentational constraint that, together with other principles of SDRT, is a refinement of the Gricean constraints of relevance and orderliness. But to see precisely how this is the case, I need to give a little background about SDRT. A discourse structure in SDRT, or SDRS, is a triple 〈A, F, Last〉, where:
• A is a set of labels;
• Last is a label in A (intuitively, this is the label of the content of the last clause that was added to the logical form); and
• F is a function which assigns each member of A a formula of the SDRS language, which includes formulas of some version of dynamic semantics (DRT, DPL, Update Semantics, Martin-Löf Type Theory, among others).
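The triple can be written down directly as a small data structure. The following is a minimal sketch in Python, not part of SDRT itself: the class name `SDRS`, its field names, and the encoding of formulas (a plain string for clause-level content, a list of `(relation, left, right)` tuples for relational content) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple, Union

# Hypothetical encoding: a relation instance R(left, right) is a 3-tuple.
Relation = Tuple[str, str, str]
# A formula is either opaque clause-level content (kept as a string)
# or a conjunction of relation instances over labels.
Formula = Union[str, List[Relation]]


@dataclass(frozen=True)
class SDRS:
    labels: FrozenSet[str]       # A: the set of labels
    content: Dict[str, Formula]  # F: maps every label in A to a formula
    last: str                    # Last: label of the most recently added clause

    def __post_init__(self) -> None:
        if self.last not in self.labels:
            raise ValueError("Last must be a label in A")
        if set(self.content) != set(self.labels):
            raise ValueError("F must be defined on exactly the labels in A")


# A two-clause structure whose clauses are linked by one relation.
example = SDRS(
    labels=frozenset({"π0", "π1", "π2"}),
    content={
        "π1": "K_π1",
        "π2": "K_π2",
        "π0": [("Narration", "π1", "π2")],
    },
    last="π2",
)
```

The two checks in `__post_init__` only enforce what the definition states: Last is a member of A, and F is defined on every member of A.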
This notion of discourse structure is very abstract and very general. One important distinction for SDRT (and for many other theories of discourse structure) that needs to be added to understand the notion of a right frontier is the distinction between two types of discourse relation. There are subordinating discourse relations and coordinating discourse relations. Asher and Vieu (2005) provide some theory-internal tests as to whether a given discourse relation is subordinating or coordinating. These tests confirm that the discourse relation of Narration is a prime example of a coordinating relation, while the relation of Elaboration is a prime example of a subordinating relation. The difference between coordinating and subordinating relations for defining the right frontier constraints is best understood by moving from the abstract definition of an SDRS to a graphical representation of an SDRS. Here’s the algorithm for constructing a graph from an SDRS understood as above.
• Each constituent (or label) is a node.
• Each subordinating relation creates a downward edge.
• Each coordinating relation creates a horizontal edge.
This graphical representation immediately imposes some constraints on what sort of SDRSs are possible.
• No two nodes can be connected by both a subordinating and coordinating relation.
• Several edges (of the same type) are possible between 2 constituents.
• Many SDRSs can be represented as trees (Baldridge and Lascarides 2005) but some cannot (Danlos 2003).
• Anaphora resolution and SDRS update are dependent on the graph structure.
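The construction steps and the first two constraints above can be sketched as follows. This is an illustrative Python sketch only: the function name `build_graph`, the edge encoding, and the two relation sets are assumptions, with the classification of Elaboration as subordinating and Narration as coordinating taken from the text.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Relation = Tuple[str, str, str]   # (relation name, left label, right label)
Edge = Tuple[str, str, str, str]  # (source, target, edge kind, relation name)

# Classification taken from the text; other relations would have to be added.
SUBORDINATING = {"Elaboration"}
COORDINATING = {"Narration"}


def build_graph(labels: Iterable[str],
                relations: Iterable[Relation]) -> Tuple[Set[str], List[Edge]]:
    """Turn labels and relation instances into nodes and typed edges."""
    nodes = set(labels)                       # each constituent is a node
    edges: List[Edge] = []
    kind_of_pair: Dict[FrozenSet[str], str] = {}
    for name, left, right in relations:
        if name in SUBORDINATING:
            kind = "down"                     # subordinating: downward edge
        elif name in COORDINATING:
            kind = "horizontal"               # coordinating: horizontal edge
        else:
            raise ValueError(f"unclassified relation: {name}")
        if left not in nodes or right not in nodes:
            raise ValueError(f"{name} relates an unknown label")
        pair = frozenset((left, right))
        # Several edges of the same type are fine; mixing types is not.
        if kind_of_pair.setdefault(pair, kind) != kind:
            raise ValueError(f"{left} and {right} linked by both edge types")
        edges.append((left, right, kind, name))
    return nodes, edges


nodes, edges = build_graph(
    ["π1", "π2", "π3"],
    [("Elaboration", "π1", "π2"), ("Narration", "π2", "π3")],
)
```

Rejecting a pair of nodes that would be linked by both edge types is one way to enforce the first constraint at construction time; the text itself only states the constraint, not where it is checked.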
These graphs also make explicit a dimension of discourse coherence. Discourse coherence is dependent on the connectedness of the graph; the degree of connectedness of the graph is one measure of coherence. However, SDRT allows for underspecified graph connections as well as underspecified anaphoric connections. These lead to a scalar notion of coherence, Maximize Discourse Coherence, or MDC. Roughly, a discourse structure is maximally coherent if it has the fewest underspecifications, the maximal number of connections, and the strongest connections between constituents.2 Here is a simple example of a discourse structure, familiar from Asher and Lascarides (2003):
(1) π1. John bought an apartment
π2. but he rented it.
Here is (1)’s discourse structure: (1ʹ)
• • • • •
A = {π0, π1, π2} F(π1) = ∃x∃e(e ⊰ now ∧ apartment(x) ∧ buy(e, j, x)) F(π2) = ∃eʹ(eʹ ⊰ now ∧ rent(eʹ, j, x)) F(π0) = Narration(π1, π2) ∧ Contrast(π1, π2) Last = π2
Here’s another familiar, but slightly more complex example.
(2) π1. John had a great evening last night.
π2. He had a great meal.
π3. He ate salmon.
π4. He devoured lots of cheese.
π5. He then won a dancing competition.
π6. # It (# the salmon) was a beautiful pink.
Here’s the SDRS for (2):
(2ʹ) 〈A, F, Last〉, where:
• A = {π0, π1, π2, π3, π4, π5, π6, π7}
• F(π1) = Kπ1, F(π2) = Kπ2, F(π3) = Kπ3, F(π4) = Kπ4, F(π5) = Kπ5,
F(π0) = Elaboration(π1, π6),
F(π6) = Narration(π2, π5) ∧ Elaboration(π2, π7),
F(π7) = Narration(π3, π4)
• Last = π5
2. MDC also involves a notion of minimization of discourse constituents beyond those introduced by the clauses of a text. But that will not be an issue here.
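Here’s the corresponding graph of (2ʹ):
[Figure: graph of (2ʹ). An Elaboration edge leads down from “John had a lovely evening” to “He had a great meal”, which is joined by a horizontal Narration edge to “He won a dancing competition”. A second Elaboration edge leads down from “He had a great meal” to “He ate salmon”, which is joined by a horizontal Narration edge to “He devoured cheese”.]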
With these examples, we can now describe the “right frontier” as it’s defined in SDRT; it is more general and more precise than the notion of a right frontier in other discourse theories. This governs where new information can attach in SDRT. We define the set of available nodes for attachment as falling under the following possibilities.
1. The label α = Last;
2. Any label γ ≥*D α, where ≥*D is defined recursively:
(a) R(γ, α) is a conjunct in F(l) for some label l, where R is a subordinating discourse relation (like Elaboration, Explanation or ⇓);
(b) R(γ, δ) is a conjunct in F(l) for some label l, where R is a subordinating discourse relation and F(δ) contains as a conjunct Rʹ(δʹ, α) or Rʹ(α, δʹ), for some Rʹ and δʹ; or
(c) R(γ, δ) is a conjunct in F(l) for some label l, where R is a subordinating discourse relation and δ ≥*D α.
For all relations other than structural relations, we can now also use the notion of the available nodes to constrain the resolution of anaphoric conditions in SDRT. Imagine the following situation:
• β:Kβ;
• Kβ contains anaphoric condition φ.
The available antecedents then are:
1. in Kβ and drs-accessible to φ;
2. in Kα, drs-accessible to any condition in Kα, and there is a condition R(α, γ) in the SDRS such that γ = β or γ ≥*D β (where R isn’t structural).
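The definition of available attachment nodes can be made concrete with a short computation. The sketch below is a literal Python transcription of clauses (a) to (c) as stated above and adds nothing beyond them. The function names, the tuple encoding of relations, and the use of opaque strings for clause content are assumptions for illustration, and only the relations the text names as subordinating are listed.

```python
from typing import Dict, List, Set, Tuple, Union

Relation = Tuple[str, str, str]       # (relation name, left label, right label)
Formula = Union[str, List[Relation]]  # opaque clause content, or relation conjuncts

# Relations the text names as subordinating; extend as needed.
SUBORDINATING = {"Elaboration", "Explanation"}


def conjuncts(content: Dict[str, Formula], label: str) -> List[Relation]:
    """Relation conjuncts of F(label); empty for clause-level content."""
    formula = content.get(label, [])
    return [] if isinstance(formula, str) else list(formula)


def available_labels(content: Dict[str, Formula], last: str) -> Set[str]:
    """Labels available for attachment: Last plus every γ with γ ≥*D Last."""
    all_relations = [rel for label in content for rel in conjuncts(content, label)]
    dominating: Set[str] = set()
    changed = True
    while changed:  # iterate to a fixpoint to capture the recursion in clause (c)
        changed = False
        for name, gamma, delta in all_relations:
            if name not in SUBORDINATING or gamma in dominating:
                continue
            clause_a = delta == last
            clause_b = any(last in (l, r) for _, l, r in conjuncts(content, delta))
            clause_c = delta in dominating
            if clause_a or clause_b or clause_c:
                dominating.add(gamma)
                changed = True
    return {last} | dominating


# The structure (2ʹ) from the text, with clause contents left opaque.
content_2: Dict[str, Formula] = {
    "π1": "K_π1", "π2": "K_π2", "π3": "K_π3", "π4": "K_π4", "π5": "K_π5",
    "π0": [("Elaboration", "π1", "π6")],
    "π6": [("Narration", "π2", "π5"), ("Elaboration", "π2", "π7")],
    "π7": [("Narration", "π3", "π4")],
}

frontier = available_labels(content_2, last="π5")
print(sorted(frontier))  # ['π1', 'π5']
```

On this literal reading, π3 (the clause that introduces the salmon) is not among the available labels for (2ʹ), which is what makes the continuation π6 in (2) problematic.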
The upshot of these definitions is that an antecedent for an anaphoric expression must be drs-accessible on the right frontier as defined in SDRT. The predictions of the Right Frontier Constraint largely confirm intuitions. For instance, the availability constraint on anaphors predicts that (2π1 – π6) is infelicitous; the relation between pronoun or the definite description and its antecedent violates the right frontier condition. Also the attachment doesn’t make sense. So this discourse is doubly damned according to the principle MDC. Why does RFC exist? It derives from the idea that the author should present information in an orderly way. If one wishes to comment or modify the information in some discourse constituent, one should do it before one has closed off that part of the story. This is an intuitive idea that SDRT makes quite precise. One cannot just simply go back to elaborating on or commenting on the salmon once one has moved on to talk about the rest of John’s evening in (2). The fact that it’s difficult to attach π6 to the rest of the discourse structure also shows that something like the Gricean constraint of Relevance is being violated.3 Matters, however, are more complicated than this straightforward picture would suggest. If we replace π6 in (2) with (π7) the discourse is much better. Why?
(2) (π7) The entire next day John kept remembering what a beautiful color his salmon had been.
SDRT and its formalisation of RFC don’t at all explain why (2π1 – π5, π7) sounds quite adequate. This is the challenge for RFC that I want to examine here. How prevalent are these violations in real texts? In a preliminary study using Wall Street Journal news stories, editorials, and letters annotated with SDRT discourse structure (each by 2 annotators), I found 10 out of 173 cases of anaphoric definites that violate RFC. Two thirds of these cases can be resolved by a choice of attachment point in the structure (annotators had trouble with this). Less than 2% of the cases look like definite violations of the Right Frontier. So RFC seems to be a real constraint even for definites. Nevertheless, there is a difference between anaphoric expressions that may be referentially equivalent. The use of definites as opposed to pronouns often improves the ability of speakers to recover anaphoric connections. The following example adapted from Laure Vieu and Laurent Prévot (2005) shows this: (3)
a. This morning, in the subway, I almost got robbed.
b. At some point, I noticed that a man was pulling at my purse.
c. I just froze, I couldn’t say a word.
d. Suddenly, a woman screamed.
3. For a discussion of the relation between Relevance and SDRT see Asher and Lascarides (2003).
e. The pickpocket (The man, ?He) let go of my purse and ran away.
f. I wanted to thank the woman (?her) but she had already disappeared into the crowd.
Though for some the use of the simple pronouns is passable, the use of the definites markedly improves the discourse for most speakers of English. There is certainly enough information available to the interpreter, given the different gender of the two antecedents, to find the intended antecedents for the two pronouns. (3) isn’t a discourse where interpreters lack relevant information and hence are simply unable to pick out the intended antecedents of the anaphoric expressions. The awkwardness noticed with pronouns in (3) is not a case of simple pronoun ambiguity as in (4):
(4) a. John called Jim a Republican and then he insulted him.
b. Pat invited Sandy over. She cooked dinner for her.
We observe the same differential behavior between pronouns and definite descriptions when we consider a slight variant of (2π7):
(2) (πʹ7) The entire next day John kept remembering what a beautiful color it had been.
(2πʹ7), when appended to (2π1 – π5), is no better than (2π1 – π6). Somehow definites enable us to pick up non salient antecedents in a way that pronouns don’t. The differences between the use of pronoun and of the definite description in the variations of (2) or in (3) are striking. Since nothing else in the variations changes, it is logical to try to explain the difference by taking a look at the difference between pronouns and definite descriptions. The perspective of generation here is instructive. Why would someone choose to use a definite description or a complex demonstrative over a simple anaphoric pronoun? I think we can find the beginnings of an answer in the work of Ariel (1988) and Gundel et al. (1990). Gundel and Ariel hypothesize a hierarchy of referential expressions, according to which certain expressions require a more salient antecedent than others. We can express their observations as follows using >> to represent ‘requiring a more salient antecedent than’: • The hierarchy of referential expressions: 0 anaphors >> pronouns >> definite descriptions >> proper names.
From the perspective of the use of a referring expression, we can put the point in a slightly different and perhaps more illuminating way. Using an expression from the right hand side of the hierarchy makes salient a discourse entity that was not salient before. One problem with these observations is that the authors provide no precise model of discourse salience. Furthermore, existing models of salience like Centering Theory (Joshi, Weinstein and Grosz 1986; Beaver 2004) or the numerical algorithms of
Mitkov (1994) provide no explanation of the differential behavior between pronouns and definites.4 So it’s difficult to use these theories to account for the observations. RFC, in effect, offers a model of salience, and a way of putting the Referential Hierarchy to the test: any discourse referent introduced within a constituent on the right frontier is salient; salient antecedents must occupy a position on the right frontier. The point then about definites and complex demonstratives is that they in effect change discourse structure by putting their constituents on the right frontier. On the other hand, 0 pronouns should not be able to alter discourse structure and so the notion of discourse structure and discourse update in Asher and Lascarides (2003) or Asher (1993) should suffice. Thus, we will coarsen the hierarchy of referential expressions, by putting a division between pronouns and the expressions to their left and other expressions to the right on the hierarchy:
• Salience Hypothesis: expressions on the referential hierarchy that require at least as salient antecedents as pronouns must have SDRT available antecedents.
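The coarsened hierarchy amounts to a simple cut-off, which the following Python sketch illustrates. The list `HIERARCHY`, the function names, and the sample set of available labels are assumptions for illustration; the sketch encodes only what the Salience Hypothesis requires and is silent about expressions to the right of pronouns.

```python
from typing import List, Set

# The hierarchy of referential expressions, most demanding of salience first.
HIERARCHY: List[str] = [
    "zero anaphor",
    "pronoun",
    "definite description",
    "proper name",
]


def requires_available_antecedent(expression_type: str) -> bool:
    """True for pronouns and everything to their left on the hierarchy."""
    return HIERARCHY.index(expression_type) <= HIERARCHY.index("pronoun")


def violates_salience_hypothesis(expression_type: str,
                                 antecedent_label: str,
                                 available: Set[str]) -> bool:
    """True only if the hypothesis demands an available antecedent and
    the antecedent's constituent is not among the available labels."""
    return (requires_available_antecedent(expression_type)
            and antecedent_label not in available)


# Assumed for illustration: the labels available when the final clause is added.
available = {"π1", "π5"}

pronoun_case = violates_salience_hypothesis("pronoun", "π3", available)
definite_case = violates_salience_hypothesis("definite description", "π3", available)
```

A `False` result for a definite description means only that the hypothesis places no availability requirement on it, not that the resulting discourse is guaranteed to be felicitous.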
3. An Application of the Right Frontier to Ellipsis
Let’s see how this proposal works out. Asher (1993) provides considerable evidence for the Right Frontier Constraint with regard to propositional anaphora and to VP ellipsis. To give one example,
(5) One plaintiff was passed over for promotion three times. Another didn’t get a raise for five years. A third plaintiff was given a lower wage compared to males who were doing the same work. But the jury didn’t believe this (any of this).
Here it is difficult, if not impossible, to get any other antecedents to the simple demonstrative, except the proposition expressed by the penultimate clause or the proposition expressed by the first three clauses. The Right Frontier Constraint and SDRT’s semantics for anaphors referring to abstract entities explain these facts. But there I did not distinguish between the behavior of various sorts of anaphors that refer to abstract entities. To look at how the Salience Hypothesis fares, I want to look at another form of 0 anaphora or ellipsis, sluicing. Sluicing is a kind of ellipsis that has always been thought to be governed by syntactic constraints. But a recent paper of Romero and Hardt (2004) suggests that discourse constraints may also be at work in this phenomenon. The following are typical examples. The material that should follow the wh elements has been deleted or is missing in (6); this material must be recovered from the context.
(6) a. John ate, but I never figured out what 0 [John ate].
b. John ate. Sam ate. But I never figured out what 0 [John ate and Sam ate].
c. John ate. But I don’t know what.
d. Mary kissed somebody. You’ll never guess who.
4. Centering Theory exploits a number of features like whether the antecedent was mentioned in the previous clause, what grammatical role it plays in the previous sentence and so on, to determine these transitions. Centering theory doesn’t make any predictions about what is and isn’t possible in terms of anaphoric connections, though it ranks the anaphoric links in terms of transitions and so at least implicitly imposes a preference ordering on the set of antecedents. Mitkov’s model is a much simpler account which provides just a partial ordering of candidate antecedents based on a variety of superficial and easily recoverable features.
Sluicing can occur across separate sentences; so traditional syntactic theories, whose domain of inquiry is the syntactic structure of an individual sentence, can’t impose any relevant constraints on the phenomena we will be studying. The Right Frontier Constraint makes interesting predictions concerning sluicing. Consider (7):
(7) a. John left and then Mary kissed someone. You’ll never guess who.
b. Mary kissed someone and then John arrived. #You’ll never guess who.
c. Mary kissed someone and then John arrived. You’ll never guess from where.
d. John arrived and then Mary kissed someone. #You’ll never guess from where.
By using the expression and then and using SDRT’s rules for inferring discourse relations (Asher and Lascarides 2003), we’ve forced a Narration relation on the discourses in (7). Given SDRT’s rules, this forces the right frontier to contain just the second clause of the first sentence as well as the constructed topic, required by the axioms for Narration (Asher and Lascarides 2003). The upshot of this is that only material in the clause or in the topic is available for reconstructing the ellipsis; and since topics must generalize over the clauses they span, we can conclude that only the second clause will furnish material for reconstruction. The examples in (7) bear out this prediction: in (7a, c) the second clause furnishes an appropriate antecedent; in (7b, d) it does not. Further evidence that the Right Frontier Constraint is operative (and not some simpler constraint like adjacency of discourse units) comes from the following data:
(8) a. Mary kissed someone because John left for some other party. You’ll never guess who.
b. ??Because Mary kissed someone, John left early. You’ll never guess who.
c. Mary kissed someone. He’s a student here. You’ll never guess who.
d. Mary kissed someone. You know him. But you’ll never guess who.
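The predictions for (7), and the contrast with an adjacency constraint in a case like (8a), can be replayed with a toy check. In this Python sketch everything is a hypothetical illustration: the `wh` sets are hand annotations of which wh-phrases each clause could supply material for, and the function names are invented. The only theoretical content is that a coordinating relation leaves just the second clause available, whereas a subordinating relation leaves both clauses available.

```python
from typing import Dict, List, Tuple

Clause = Dict[str, object]  # {"text": str, "wh": set of wh-phrases it can resolve}


def sluice_sources(clauses: Tuple[Clause, Clause], relation_kind: str) -> List[Clause]:
    """Clauses that remain available for reconstructing a following sluice."""
    first, second = clauses
    if relation_kind == "coordinating":   # e.g. Narration: only the second clause
        return [second]
    if relation_kind == "subordinating":  # either clause is available
        return [first, second]
    raise ValueError(f"unknown relation kind: {relation_kind}")


def sluice_ok(wh: str, clauses: Tuple[Clause, Clause], relation_kind: str) -> bool:
    """True if some available clause can supply material for the wh-phrase."""
    return any(wh in clause["wh"] for clause in sluice_sources(clauses, relation_kind))


# Hand-annotated clauses (the annotations are assumptions, not part of the theory).
kissed = {"text": "Mary kissed someone", "wh": {"who"}}
arrived = {"text": "John arrived", "wh": {"from where"}}
left = {"text": "John left", "wh": set()}
left_party = {"text": "John left for some other party", "wh": set()}

results = {
    "7a": sluice_ok("who", (left, kissed), "coordinating"),
    "7b": sluice_ok("who", (kissed, arrived), "coordinating"),
    "7c": sluice_ok("from where", (kissed, arrived), "coordinating"),
    "7d": sluice_ok("from where", (arrived, kissed), "coordinating"),
    "8a": sluice_ok("who", (kissed, left_party), "subordinating"),
}
```

The sketch deliberately leaves out (8b), whose badness depends on further considerations than the coordinating or subordinating status of the relation alone.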
SDRT predicts (8a, c, d) to be OK, whereas a constraint of adjacency would not. The reason is that in these examples subordinating discourse relations obtain between the first two clauses, and according to RFC either the first or the second clause furnishes an available antecedent for the ellipsis. SDRT, however, predicts (8b) to be bad, since the only appropriate antecedent is not available according to RFC. Let’s now turn to single sentence examples like:
(9) a. *Mary arrived after John ate but it’s unclear what.
b. Mary arrived after John ate but it’s unclear what John ate.
c. *Mary arrived after John ate but it’s unclear what Mary arrived after John ate.
d. Agnes arrived while John was eating and I was trying to figure out what.
e. John ate before Mary arrived, but I never figured out what.
These simple sentences show a remarkable range of grammaticality. As Chung et al. (1995) point out, we cannot recover the material explicit in (9c) because that constitutes an island violation. But what syntactic constraints alone don’t at all explain is why (9a) can’t have the reading (9b), which is perfectly straightforward. Nor can syntactic constraints explain why the sluicing examples (9d, e) are OK when (9a) is ungrammatical. Semantic constraints on ellipsis don’t really help us here either. Romero and Hardt note that off the shelf theories of ellipsis that exploit focus (like Rooth 1992, Fiengo and May 1994, etc) would predict that a matching content can be found between the ellipsis site and some content in the antecedent discourse. The first observation to make is that the material in the after or before clauses is presupposed; it escapes the scope of negation and the interrogative force of a question: (10) a. It’s not true that Mary arrived after John ate → John ate. b. Did Mary arrive after John ate? → John ate.
Both (10a–b) entail that John ate, and these are classic tests for presupposition. Now in discourse, presuppositions have a strong preference to attach high up in the structure; and in any case the material in the third clause with the ellipsis, which is asserted, cannot attach to the presupposed material. Only additional presupposed material (as in (9d)) can attach to presuppositional material. This explains why (9d) is fine. One additional SDRT hypothesis is required to explain the data. The theory of ellipsis resolution proposed in Asher (1993) requires that ellipsis material be recovered from the discourse constituent to which the constituent containing the ellipsis is attached. This is because constituents with ellipsis always attach at least with the structural relations Parallel or Contrast (though they may attach with more relations). These relations give rise to some sort of matching condition, which consists in constructing a maximal partial isomorphism between the two related constituents. The matching process requires a common theme (for Parallel) or two contrasting themes (for Contrast) (see Asher 1993, or Asher, Hardt and Busquets 2001), and it also requires identifying antecedents with material so as to contribute to the isomorphism. The matching condition has considerable empirical support. It predicts that the two quantified constituents in (11a, b) must have the same quantifier structure, either ∃∀ or ∀∃. It also predicts that the antecedent of the deleted VP in (11c) is in the consequent of the conditional, which would not normally be available, if we assume that some coordinating relation links the constituents provided by the subordinate and main clauses.
(11) a. Every doctor saw at least one patient, and every nurse saw at least one patient too.
b. Every doctor saw at least one patient, and every nurse did too.
c. When John goes to school, he normally brings his books. But when Samantha goes to school, she normally doesn’t.
With this in mind, let’s now return to our examples. The discourse structures for (9d, e) are both ones where the clause with the ellipsis attaches to the constituent with the relevant matching material. However, in (9a) the elided constituent is asserted and attaches to the assertion that Mary arrived. It cannot recover the appropriate antecedent for the ellipsis. What the matching condition requires here is that we recover Mary arrived as the value of the ellipsis, but this results in the following, plainly ungrammatical sentence:
(12) Mary arrived after John ate but I never figured out what Mary arrived.
In (9d) the elided constituent is part of the presupposition; it can attach to the presupposed clause John was eating and can recover the appropriate antecedent. Romero and Hardt use a constraint other than RFC to try to explain the data. They invoke the constraint that the source of the ellipsis must C-command the target in a discourse structure. SDRT agrees that there is a discourse constraint on ellipsis and sluicing, but it’s defined through the relations of Parallel and Contrast. However, there is reason to think that C-command in the simple sort of discourse structure Romero and Hardt employ is not the right constraint for constraining ellipsis either. Consider the following example of VP ellipsis (13):
(13) a. If John’s teasing bothered Sam (π1), he didn’t show it (π2). Pat didn’t either (π3).
b. If John’s teasing bothered Sam, Sam didn’t show that Sam was bothered and Pat didn’t show that Pat was bothered.
c. If John’s teasing bothered Sam, Sam didn’t show that Sam was bothered. If John’s teasing bothered Pat, Pat didn’t show that Pat was bothered by John’s teasing him either.
d. If John’s teasing bothered Sam, Sam didn’t show that Sam was bothered. If John’s teasing bothered Sam, Pat didn’t show that Pat was bothered by John’s teasing Sam either.
(13a) contains an ellipsis with three readings (13b, c, d). Given that there is considerable evidence that the conditional introduces a coordinating relation, Romero and Hardt’s discourse constraint cannot be satisfied in this case. They would presumably have the tree-like structure shown in Figure 1 for the three clauses, and in such a structure the source plainly does not C-command the target. Nevertheless, the discourse is acceptable. Using its analysis of the discourse relations Parallel and Contrast, which impose particular structural constraints, SDRT predicts (13a) to be good and to have the three readings given. SDRT posits two structures for this discourse. The first, given by the graph in Figure 2, gives the conditional wide scope over the Parallel relation and accounts for reading (13b), while the second, given by the graph in Figure 3, gives the Parallel relation wide scope and accounts for readings (13c, d) (for details see Asher 1993).

[Figure 1. Simple Tree for (13a). Node labels: Parallel, Conditional, π1, π2, π3.]
[Figure 2. First SDRT graph for (13a). Node labels: π0, π1, Conditional, π2, π4, Parallel, π3.]
[Figure 3. Second SDRT graph for (13a). Node labels: π0, π4, π1, Conditional, Parallel, π5, π2.]

So there is considerable evidence that RFC together with structural constraints induced by Parallel and Contrast provide the right constraints on resolving ellipses. But there are still some complications to consider. Consider (14a, b) due to Bernhard Schwarz.
(14) a. John died after he ate something poisonous, but I’m not sure what.
b. ??John survived after he ate something poisonous, but I’m not sure what.
c. John left after Mary kissed somebody. You’ll never guess who.
(14a, c) don’t pattern at all with the other sluicing examples and seem to go against the picture that I’ve sketched so far. What seems to be different is that there is a causal connection between clauses in (14a, c) that is missing in (14b) and (9a). The discourse relation that holds between the clauses in (14b) and (9a) is Background, with additional temporal information being given by the adverbial clause. Now SDRT predicts that causal relations, as well as relations like Elaboration or Explanation, can’t bind presuppositions. Causal links are part of the foreground. So if these clauses trigger presuppositions, it would appear that the material is both part of the foreground (giving us the causal link) as well as being presupposed. One test for this hypothesis is to think about how attachments are affected by such causal links. It appears that we can’t use Result, Explanation or Elaboration to attach to presupposed material straightforwardly, but we can when the material, ordinarily presupposed, itself has causal links to material in the foreground.
(15) a. John left (π1) after Mary ate (π2). She had an aioli (π3). (Explanation(π1, π3), not Elaboration(π2, π3))
b. John left (π1) after Mary ate (π2). ?She was very happy (π3). (?Explanation(π1, π3), not Result(π2, π3))
c. John died (π1) after he ate something poisonous (π2). He had blowfish (π3). (Elaboration(π2, π3))
Given these observations, we now have an explanation of the apparently contrary data. In (14a) the modifier clause remains part of the assertion and so is also available for attachment with new asserted material. Since according to SDRT the Parallel matching condition is satisfied, the sluicing is predicted to be OK.5 In sum, we see that with at least one class of 0 anaphor, ellipses in English, RFC functions as predicted together with the rest of the SDRT machinery. Another prediction from our assumptions is that discourse attachments should also obey RFC without any modification. Discourse relations are in effect 0 anaphors; they require an antecedent discourse constituent to fill in the first term of the relation. But typically such relations do not convey any presupposed content, and so they are of a piece with 0 anaphors.
5. Romero and Han’s approach would presumably not be able to account for (3) either.

4. RFC and Definites

If RFC appears to be a robust predictive constraint for both ellipses or 0 anaphors and pronouns, matters are different with anaphoric expressions involving presupposed content – definite descriptions, complex demonstratives, names and even pronouns to
a certain extent. We’ve already seen in the variations of (2) that using an expression with more presupposed content does make a difference to the felicity of the discourse. In contrast to 0 anaphoric expressions, definite descriptions, complex demonstratives and proper names all introduce presupposed constituents containing the information that constitutes their presuppositions, and these presupposed constituents seem to affect matters, though the different expressions have different resolution strategies for their presuppositions (Hunter and Asher 2005). According to Roberts (2003), ‘The presupposed content constitutes the conventionally given constraints of that utterance of the expression on the context’. Expressions like definites, complex demonstratives and even pronouns all presuppose information about their referents as well, and thus impose lexically given constraints on their antecedents. For instance, the pronoun ‘he’ presupposes that its antecedent must be singular in number and masculine in gender. In all cases, these presupposed constituents must be attached via a relation like Consequence (binding) or Background (accommodation). Definite descriptions, for instance, require that the variable introduced by the definite must be related via some underspecified relation R (often identity unless we have a bridging definite) to some available antecedent. Here’s an example of the SDRT interpretation of a definite like the salmon:
• the salmon → p : ∃x, y(salmon(x) ∧ R(x, y) ∧ y = ? ∧ R = ?), a: λPP(x)
Definites require a lot of processing in order to be fully integrated into the discourse context. Perhaps this processing can also lead to changes in the discourse structure that would make available antecedents that in the standard SDRS would not be. The presuppositional content of definites is well known in the literature. It’s also well known that such presupposed information has to be integrated into the discourse context, and Asher and Lascarides (1998, 2003) give an account of such integration within SDRT. What hasn’t been investigated is whether this integration in turn affects the discourse structure beyond simply attaching the presupposed information in some appropriate place. The challenge to RFC posed by definites involves complicated, interconnected issues. In SDRT, there’s not only the matter of the accessibility of the antecedent that affects discourse coherence; there’s also the discourse attachment and the strength of the discourse connections to consider. Consider a variation on the example (5) of Asher (1993) where we again replace a pronoun with a definite:
(16) One plaintiff was passed over for promotion three times. Another didn’t get a raise for five years. A third plaintiff was given a lower wage compared to males who were doing the same work. The jury didn’t believe the claim about the plaintiff that didn’t get a raise for five years.
The example is perfectly intelligible. It is possible, even easy, for speakers to get the intended antecedent that’s predicted to be unavailable by our current formulation of
RFC and our current analysis of discourse structure. Indeed it is the only possible antecedent since the definite description picks out exactly the proposition that is the intended antecedent. The SDRS graph for the first three sentences of (16) is identical to that for (5):
(16ʹ) Topic: Three badly treated plaintiffs make claims (dominating (16a), (16b) and (16c))
(16a) Continuation (16b) Continuation (16c)
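As a rough illustration of how the Right Frontier Constraint restricts attachment in a graph like (16ʹ), the following Python sketch computes the available attachment sites for a toy discourse graph. It is a simplification under stated assumptions, not SDRT's actual definition of availability: the class `DiscourseGraph`, the edge label `Topic` for the link from the topic to its elements, and the treatment of the right frontier as "the last constituent plus whatever dominates it through subordinating edges" are all introduced here for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class DiscourseGraph:
    """Toy discourse graph; each edge is (source, relation, target, subordinating)."""
    edges: list = field(default_factory=list)
    last: str = ""  # most recently attached constituent

    def attach(self, source, relation, target, subordinating):
        """Record an edge and remember the target as the latest constituent."""
        self.edges.append((source, relation, target, subordinating))
        self.last = target

    def right_frontier(self):
        """Last constituent plus every node dominating it via subordinating edges."""
        frontier = [self.last]
        pending = [self.last]
        while pending:
            node = pending.pop()
            for source, _relation, target, subordinating in self.edges:
                if target == node and subordinating and source not in frontier:
                    frontier.append(source)
                    pending.append(source)
        return frontier


def build_example_16():
    """First three sentences of (16); the 'Topic' edge label is hypothetical."""
    graph = DiscourseGraph()
    graph.attach("topic", "Topic", "16a", subordinating=True)
    graph.attach("16a", "Continuation", "16b", subordinating=False)
    graph.attach("topic", "Topic", "16b", subordinating=True)
    graph.attach("16b", "Continuation", "16c", subordinating=False)
    graph.attach("topic", "Topic", "16c", subordinating=True)
    return graph


print(build_example_16().right_frontier())  # ['16c', 'topic']
```

On this toy encoding only (16c) and the topic are open for attachment, while (16b), the constituent containing the intended antecedent, is not.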
The attachment of the constituent given by the last sentence must take place to the constituent (16c) or to the topic three badly treated plaintiffs make claims, according to SDRT’s definition of discourse update (Asher and Lascarides 2003). And so the theory predicts the antecedent of the definite to be unavailable. At least there’s something right about the RFC prediction; the example sounds awkward. The discourse just isn’t well put together. As currently formulated, SDRT locates the problem with the unavailability of the antecedent of the definite. But the implication of this, that the definite can’t find an antecedent, is wrong; interpreters can bind the presupposition of the definite to its intended antecedent. Nevertheless, the discourse is bad. There is something about the choice of anaphor that makes the whole thing go amiss, since (16) differs from the felicitous (5) only in the choice of a different referential expression in the last clause. We can redeem the awkwardness of (16) if we talk about all the claims in the last constituent:
(17) One plaintiff was passed over for promotion three times. Another didn’t get a raise for five years. A third plaintiff was given a lower wage compared to males who were doing the same work. While the jury didn’t believe the claim that one plaintiff didn’t get a raise for five years, they did believe the claims about the other two.
This shows that unavailability is not here the cause of the awkwardness. (17) has exactly the same discourse structure as (16) and (5). Yet (17), like (5), is perfectly fine! In these examples, attachment, the strength of the discourse attachment and anaphora resolution all interact. Supposing that the definite does render the antecedent accessible and makes it somehow part of the right frontier, the anaphor can be resolved. Nevertheless, the attachment is less than felicitous. The problem is that while we get a commentary in both (16) and (17), (16) is less felicitous, because the commentary only singles out one of the many elements mentioned in the constituent dominated by the topic, to which the commentary is naturally attached. This is what should follow given that the attachment facts are predicted to remain constant. The example (17) is much better because the commentary spans all the information under the topic. In general, this seems to be a feature of felicitous discourse structure; attachments to a constituent
are better, and hence preferred by MDC, if, when they attach to topic, the relation links all of the content under the topic to the new constituent. This observation is robust. Consider again the case of (2). While we can accommodate material so that the presupposition is satisfied in the context, the attachment of π6 to π5 via Background, as inferred in the glue logic, makes little sense. Background requires that the salmon’s color give information that somehow sets the stage for the information in the foregrounded constituent. But plainly that isn’t satisfied. So it’s unclear how to attach the information in π6 to the discourse. Of course we understand what someone who utters such a discourse was trying to say, but there’s simply a much better way of saying it—describe the salmon’s color while you’re talking about the salmon! Consider further (18), also due to Laure Vieu and Laurent Prévot (2005). (18a–d, eʹ) and (18a–d, eʹʹ) are all pretty awkward or marginal according to most speakers, though (18e) is rated quite acceptable, but still not unproblematic, since one would like to know what happened to the other purchases. (18eʹʹʹ) seems perfect to my ears. (18)
a. Last night, John went on a wild shopping spree.
b. He bought an expensive tuxedo.
c. He booked a cruise to the Caribbean,
d. and he ordered three cases of champagne.
e. Early this morning, the champagne was delivered to him.
eʹ. ?Early this morning, the ticket was delivered to him.
eʹʹ. #Early this morning, it was delivered to him (where ‘it’ refers to the tuxedo or the cruise).
eʹʹʹ. Early this morning the tuxedo, the ticket and the champagne were all on his doorstep when he stepped out to get the morning paper.
The problem here is, again, not the accessibility or availability of antecedents but rather the attachment itself and the resulting discourse structure. If we contrast (19a–d, e) with (19a–d, fʹ)
(19) a. John had a great evening last night.
b. He had a great meal.
c. He ate a wonderful fresh salmon.
d. Then he devoured several scrumptious cakes.
e. The salmon was especially delicious.
fʹ. The salmon was good.
we find that the discourse with (fʹ) is less good, because the last constituent comments only on a part of the meal. In contrast, the use of especially in (e) makes implicit reference to the other elements of the meal. Likewise, (18e, eʹ) also only continue the narrative with respect to a component of the constituents. This means that we need to refine in SDRT the computation of discourse relations; ideally, when they attach a new bit of information to a complex constituent α they should not leave us “hanging” about elements involved in α.
This phenomenon is another reflection of Gricean constraints of orderliness and relevance and leads to a refined formulation of RFC. If one is going to comment on just one element in a story, then the comment should be attached directly to that element, not to a superordinate constituent which serves as a topic for several elements in the story. Similarly, if one is going to describe a Result of one element of a story, one should ensure that one attaches the information about the result to the constituent involving just that one element, not the whole story. Similarly for Explanations and Narrations. The other relations of SDRT in monologue, Background and the structural relations of Parallel and Contrast, have their own constraints, with the result that the Gricean implicatures aren’t directly relevant there. We can directly implement these constraints as part of MDC. Once we separate out the effects of these constraints on discourse coherence, we have a better chance of analysing what is really going on with definites. In dialogue, it appears that violations of a simple RFC could be frequent.6 Another factor is introduced into the discourse structure construction process when several agents are concerned. But that’s entirely to be expected if the foundation of RFC has to do with Gricean constraints of orderliness and relevance. When several agents interact in discourse, there may be, and often are, competing messages that they want to get across. Orderliness and relevance thus have to take a different form when competing “discourses” are involved. For one thing, while one agent is talking, another may not have had the time or opportunity to inject the appropriate commentary. Consider (20):
(20) a. A: John had a great evening last night.
b. A: He had a great meal.
c. A: He ate salmon.
d. A: He devoured lots of cheese.
e. A: He then won a dancing competition.
f. B: The salmon was a beautiful pink.
This seems to me to be much better than the original (2) with the simple definite. Given its pragmatic rationale, RFC doesn’t apply as it stands to multi speaker discourse, since B may not have had any chance until the turn he takes in (20) to add his bit concerning the salmon, and interpreters readily understand that. Such apparent violations of the Right Frontier in dialogue require more study, but one hypothesis might just be that π6 can attach to the first speaker’s contribution in different ways than in monologue, say as an Acknowledgement or Commentary. Notice, however, that replacing the salmon with a pronoun in dialogue is still strange and almost uninterpretable. This would provide some evidence that the accommodation strategy is also at work in dialogue. Further, it confirms that the real source of the infelicity of (2π1 – π6) as a monologue is the discourse attachment of π6 to the context.

6. Thanks to Francis Corblin for this remark.

In these texts, we see a phenomenon that I earlier noted also occurs in monologue, though rarely. Asher (1993) gives naturally occurring examples of definite descriptions or
even whole clauses with enough metalinguistic content like (21) so that they can rearrange the right frontier and refer to discourse constituents that would be otherwise unavailable. (21) a. I found your first claim rather puzzling. b. Let me go back to the first thing you said.
Asher (1993) introduces the idea of discourse subordination, in analogy with modal subordination, to treat these examples. The idea is that here the presupposition given by the definite is attached via Consequence (i.e., it is entailed or bound in presupposition theory terminology) to some other constituent α in the discourse structure that need not be on the right frontier. In so doing the right frontier is itself rearranged so that α is now an available attachment site. These sorts of descriptions bring a constituent into salience. While the sort of definite description in (2.7) is not obviously a description of some discourse constituent, we might suppose that the definite here has a similar function of bringing some entity into salience. Assuming that the constraints on attachment strength are as I have stated them to be, what I propose to do is to modify the process of discourse update in the spirit of Asher (1993), according to which the use of (at least some) definites can modify discourse structure in monologue. The essence of the proposal in Asher (1993) is that the choice of words in referential expressions can modify discourse structure and render available, as antecedents for anaphoric pronouns, constituents that would otherwise be unavailable according to RFC. This seems sensible enough if indeed RFC is a matter of how information is presented, not a constraint on what information is present. Thus, anaphoric expressions which are referentially equivalent (definite descriptions, 0 pronouns, or overt pronouns) may have different discourse effects. In particular, what appears to be in line with the Referential Hierarchy is that the presuppositional content of certain referential expressions, notably definites, complex demonstratives and even pronouns following Geurts (1999), can affect the presentation of information. And since this presentation has essentially to do with representations, this means that such expressions change the discourse structure.
Following the underlying assumptions of the Referential Hierarchy, I suggest that to make some linguistic antecedent salient via the use of a definite description or some other presuppositional DP is to put the discourse referent that the antecedent introduces or is associated with into some SDRT available discourse constituent to which the presupposition introduced by the definite or presuppositional DP can then be bound via the relation of Consequence. Topics would be a natural place within which to insert such material. SDRT hypothesizes that topics are responsible for structuring discourse and making certain antecedents (viz. plural sums in Asher (1993, 2004)) available in the sense that they become members of the right frontier, but topics only play a role with certain discourse relations in SDRT, and hence a much more limited role than that accorded to them by others, like van Kuppevelt (1995). While no theory of discourse structure gives a precise idea of how to construct topics if they are not explicitly given, more than a decade of research has enabled us to put constraints on what topics should be. And even
though some see topics as creatures of darkness (Kehler 2004) – a viewpoint I can readily understand having tried to make sense of these discourse entities for over a decade – there is considerable evidence that topicality is a real feature of discourse structure. Other constraints on topic stem from observations about quantificational domains. Stanley and Szabo (2000) postulate that the restrictor of a quantified noun phrase contains a variable that is contextually specified to provide a domain of quantification, a hypothesis that is amply supported by the linguistic facts. Putting Stanley and Szabo within the framework of a theory of discourse structure, I hypothesize that the variable they posit functions like a pronoun; it gets “bound” to an antecedent in the discourse context or the nonlinguistic context – i.e., it’s anaphoric or deictic. Such a variable can pick up the set of all of the entities mentioned in a particular span of discourse. Putting such a contextual domain variable as part of topic information seems reasonable, as the following sort of example from Laure Vieu attests: (22)
a. Last night, John went on a wild shopping spree.
b. He bought an expensive tuxedo.
c. He booked a cruise to the Caribbean.
d. He ordered three cases of champagne.
e. Early this morning, he immediately went to tell everything to his shrink.
Early this morning in (22e) is a frame adverbial which in the literature on that subject is postulated to create a new topic linked to the one given by (22a) (Vieu et al. 2005, Le Draoulec and Pery-Woodley 2003). The quantifier everything in (22e) is a restricted quantifier; it ranges over all the events or actions that occurred during the shopping therapy. Somehow a domain of quantification is established from the relevant entities introduced into the discourse there and the quantifier is restricted to those. One way to implement such an idea is to put more theoretical weight on a topic. A topic like that given by (22a) would contain not only a proposition, but a quantificational domain value that could be picked up anaphorically. Topics in SDRT are required to be simple with respect to their propositional content in order to predict effects about plural anaphoric reference to groups that are not explicitly given in the text (Asher 1993, 2004). There may, however, be more to a topic, when presentational effects require it. Besides quantificational domains, I will postulate, as in Kaplanian contexts, a (very small) set of salient entities that are entered into topic by the presuppositions of definites. In principle, definites can raise to prominence any DRS accessible discourse referent or entity.7 In effect, I am suggesting that we pursue a strategy of discourse accommodation for definites.

7. Definites of course cannot raise to salience an inaccessible discourse referent:
(23) a. John went on a wild shopping spree.
b. He ordered 3 cases of champagne and booked a cruise to the Caribbean. However, he saw no car that he wanted to buy. # The car, # it wasn’t fancy enough.
Here is the sketch of a formal model of a topic:
• A topic is a triple consisting of: a proposition, a quantificational domain and perhaps a list of distinguished elements. Formally, we have: τ : 〈fτ, S, x1, …, xn〉, where S is a quantificational domain.
• Unlike Asher (2005) we no longer integrate the salient individuals into the topic’s propositional content. The salient individuals are distinct.
• Both the DRS accessible entities mentioned in the propositional content of a topic and those in the list of distinguished elements are available, if any DRS accessible entity mentioned in the topic is available. Using a definite allows us to make an entity salient and put it on the list of distinguished elements.
• We adopt the constraints on the scalarity of attachments along with SDRT’s MDC as sketched above.
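The topic model just listed can be pictured as a small data structure. The Python sketch below is only an illustration under assumptions: the class name `Topic`, its field names, and the string encoding of propositions and entities are hypothetical and are not part of SDRT's formal apparatus. It shows the three components (propositional content, quantificational domain S, and the list of distinguished elements) being kept separate, with a definite adding an entity to the list.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy topic: propositional content, quantificational domain, salient list."""
    proposition: str                                    # propositional content
    domain: set = field(default_factory=set)            # quantificational domain S
    distinguished: list = field(default_factory=list)   # x1, ..., xn

    def make_salient(self, referent):
        """Add a referent to the distinguished elements, as a definite might.

        The propositional content is left untouched: the salient
        individuals are kept distinct from it.
        """
        if referent not in self.distinguished:
            self.distinguished.append(referent)


# Illustrative values loosely based on the shopping spree example.
spree = Topic(
    proposition="John went on a wild shopping spree",
    domain={"buy_tuxedo", "book_cruise", "order_champagne"},
)
spree.make_salient("champagne")
print(spree.distinguished)  # ['champagne']
```

Keeping `distinguished` apart from `proposition` mirrors the choice of no longer integrating the salient individuals into the topic's propositional content.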
On the other hand, topics are dynamic and evolve as discourse proceeds. Here matters require further investigation, but one can imagine that Centering Theory or other devices may order the set of salient entities into a stack that dynamically evolves as discourse proceeds. A definite may evoke an entity into salience, but if it is not picked up as the discourse proceeds it may be placed down on the stack of salient entities, requiring another referential expression to make it salient again. To implement the accommodation rules for definites, I exploit the description language of SDRSs of Asher and Lascarides (2003) used in that formulation of SDRT’s glue logic to calculate discourse structure and the modifications that discourse structure may induce on constituents. The hypothesis is that definites can modify the local topic of the constituent to which they attach, allowing access to normally unavailable antecedents within the local SDRS. Let ≤*D stand for the closure of the immediate discourse subordination relation as in Asher and Lascarides (2003). Then, we have the following accommodation rule for definites:
• Accommodation Rule for Definites: Let σ be the discourse context constructed so far:
(?(α, β, λ) ∧ [p: ∃!x f(x)](β) ∧ [∃v f](γ) ∧ γ ≤*D α ∧ no available antecedent for x in σ) → ◊Update_sdrt(σ, [∃x1](topic(γ)) ∧ [x1 = v](γ))
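To make the shape of this rule more concrete, here is a minimal Python sketch of its antecedent conditions. Everything in it is an assumption made for illustration: constituents are plain strings, `parent_of` encodes immediate discourse subordination, `content` maps each constituent to its discourse referents and the predicate describing them, and `available` is the set of referents already reachable on the right frontier. The uniqueness requirement of the definite is not enforced, and the output is only a list of possible updates, corresponding to the ◊ in the rule, not an actual update.

```python
def dominated_by(alpha, parent_of):
    """All gamma with gamma <=*D alpha: alpha plus its transitive descendants."""
    result = {alpha}
    changed = True
    while changed:
        changed = False
        for child, parent in parent_of.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def possible_updates(alpha, predicate, content, parent_of, available):
    """Return (gamma, referent) pairs the rule licenses as possible updates."""
    # The rule only fires when no matching antecedent is already available.
    for referents in content.values():
        for referent, described_as in referents.items():
            if referent in available and described_as == predicate:
                return []
    updates = []
    for gamma in sorted(dominated_by(alpha, parent_of)):
        for referent, described_as in content.get(gamma, {}).items():
            if described_as == predicate:
                updates.append((gamma, referent))
    return updates


# Hypothetical encoding of a meal story: pi_b (the meal) elaborates pi_a,
# and pi_c (salmon) and pi_d (cakes) elaborate pi_b.
parent_of = {"pi_b": "pi_a", "pi_c": "pi_b", "pi_d": "pi_b"}
content = {
    "pi_b": {"m": "meal"},
    "pi_c": {"s": "salmon"},
    "pi_d": {"c": "cake"},
}
print(possible_updates("pi_b", "salmon", content, parent_of, {"m", "c"}))
# [('pi_c', 's')]
```

If the salmon referent were already in `available`, the function would return an empty list, which corresponds to the case where the presupposition can simply be bound and no accommodation is triggered.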
Note that the hypothesis advances a modification of discourse constituents to handle presuppositions. It does not modify RFC on the attachment of new information. Hence, we can’t simply attach a constituent even with a definite description anywhere we would like. The Accommodation Rule together with the Right Frontier Constraint makes the correct predictions concerning pronouns and 0 anaphors; since in those cases there is nothing triggering any accommodation, it predicts that there should be no apparent violations of the Right Frontier Constraint. It also predicts that attachments should follow RFC since they are in effect 0 anaphors, cases of
discourse subordination with definite descriptions that explicitly refer to constituents aside. In order to be able to bind the presupposition introduced by a definite, we have to make the possible update actual. ◊f in the glue logic simply means that f holds of some discourse structure that we might compute. Eliminating the ◊ means making f hold of the actual discourse update and this incurs a cost that we keep track of in computing discourse structure. It doesn’t follow in the glue logic by itself. So if we factor this cost into MDC, we predict first that uses of the Accommodation Rule will be rare in actual discourse, as we’ve already seen. We also predict the differences between the following examples: (24)
a. John had a great evening last night.
b. He had a great meal.
c. He ate a wonderful fresh salmon.
d. Then he devoured several scrumptious cakes.
e. #It was especially delicious. (meaning the salmon)
(25)
a. John had a great evening last night.
b. He had a great meal.
c. He ate a wonderful fresh salmon.
d. It was especially delicious.
e. Then he devoured several scrumptious cakes.
While the pronoun it has some presupposed content, it is insufficient to distinguish between the meal and the salmon as antecedents. Because making the salmon the antecedent to the definite would require using the ◊ discharge rule, this discourse structure is dispreferred and so speakers are predicted to choose the meal as the antecedent, as is the case. Because no accommodation is needed in (25), this discourse is predicted to be perfectly felicitous. On the other hand, using the definite has as a discourse purpose to render something non salient salient, so we predict that (19) repeated below is OK, and in fact slightly better than (26), where the choice of the pronoun would have been equally good. The definite description in (26) serves no special purpose. (19)
a. John had a great evening last night.
b. He had a great meal.
c. He ate a wonderful fresh salmon.
d. Then he devoured several scrumptious cakes.
e. The salmon was especially delicious.
(26)
a. John had a great evening last night.
b. He had a great meal.
c. He ate a wonderful fresh salmon.
d. The salmon was especially delicious.
e. Then he devoured several scrumptious cakes.
Another example where our constraints predict a difference is a variation on (18): (27) a. Yesterday John went on a wild shopping spree. b. He bought an expensive tuxedo, booked a cruise to the Caribbean, and ordered three cases of champagne. c. Then he went to a very fancy restaurant with his girlfriend. d. The champagne was of very high quality.
I cannot get the champagne to refer back to the champagne John bought. The only reading I get is one where I make a bridging inference to what John and his girlfriend had at the restaurant. And that induces one to take (27d) to be an Elaboration or a Background for (27c). The Accommodation Rule predicts this. The Accommodation Rule will not be triggered, as there is already an available antecedent for the presupposition of the definite to bind to. Of course, we can block the bridging in (27) by adding additional content to (27e):
(27eʹ) Early next morning, the champagne was delivered.
(27eʹʹ) Early next morning, the tuxedo, the ticket and the champagne (his purchases) were all on his doorstep.
Here we have an accommodation that goes first to a subtopic and then to a higher topic, so that it can be picked up by the attaching constituent. Since the bridging inference fails, there is no available antecedent to bind the presupposition of the definite. So we can use the Accommodation Rule. Our new account of the presuppositions of definites in discourse doesn’t predict everything we wanted. Consider again (3). The first clause has a passive, but lexical semantics still treats rob as a transitive verb with a nonexpressed subject. Hence, SDRT would have as a logical form for (3a):
(3aʹ) ∃x(almost(∃e(rob(e, x, i) ∧ e < now))).
As is known from the literature on bridging, indefinites can sometimes license bridging inferences. This seems to be the case with the DP a man in the second constituent. We are invited to identify the man with the robber in the topic position, and this allows us to bind the definite without any special mechanism. On the other hand, it is somewhat puzzling why the pronoun is worse for many speakers given this bridging interpretation of the indefinite. Perhaps it is because the presuppositions of gender and number of the pronoun can’t match anything with the antecedent. As to the anaphoric reference back to the woman, once again the presupposed content of the pronoun suffices to raise a unique antecedent to salience. Nevertheless, once again, the use of the pronoun in (3f) is less good than the definite description for many speakers. This is not predicted on the current account. It may be that there are further constraints on the use of the Accommodation Rule with pronouns. For one thing, pronouns do not have any uniqueness requirement, so the rule, as it stands, will not fire. It is an easy matter to write a rule for pronouns, but it appears that they are perhaps subject to further constraints.
In particular, we might well take pronouns to be best with already available antecedents. The accommodation rule for pronouns might introduce a weakened possible update, something of the form ◊◊φ, so that the cost of accommodating a pronoun would be greater than that for a definite description. To be explicit,
• Accommodation Rule for Pronouns: Let σ be the discourse context constructed so far:
(?(α, β, λ) ∧ [p: ∃x Num(x) = pl/sg ∧ Gen(x) = M/F/N](β) ∧ [∃v f](γ) ∧ γ ≤*D α ∧ no available antecedent for x in σ) → ◊◊Update_sdrt(σ, [∃x1](topic(γ)) ∧ [x1 = v](γ))
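One way to picture the difference between ◊ and ◊◊ is as a difference in the cost that enters into the comparison of candidate interpretations. The Python sketch below is a toy illustration only: the numeric costs (one unit per possibility operator that must be discharged), the dictionary `ACCOMMODATION_COST` and the function `cheapest_antecedent` are assumptions introduced here, not part of MDC as defined in SDRT.

```python
# Hypothetical costs: one unit per possibility operator to be discharged.
ACCOMMODATION_COST = {"definite": 1, "pronoun": 2}


def cheapest_antecedent(expression, candidates):
    """Pick the candidate antecedent with the lowest accommodation cost.

    candidates is a list of (referent, already_available) pairs; an
    already available antecedent needs no accommodation and costs nothing.
    """
    def cost(candidate):
        _referent, already_available = candidate
        return 0 if already_available else ACCOMMODATION_COST[expression]

    return min(candidates, key=cost)[0]


# A pronoun with two candidates: the meal is available, the salmon is not.
print(cheapest_antecedent("pronoun", [("salmon", False), ("meal", True)]))
# meal
```

With these illustrative numbers an available antecedent always wins, and when accommodation is unavoidable it is cheaper for a definite than for a pronoun.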
This clearly deserves more research both on constructed and naturally occurring texts. Given our formal model of a topic and our Accommodation Rules for definites and pronouns, we can now consider more complex cases of topic accommodation. They suggest that the list of salient entities and the very content of the topic can interact.
(28) a. Michael had a great evening last night.
b. He had a big meal. He ate salmon. He devoured lots of cheese. He then went dancing.
c. The next morning was not so good.
d. John woke up ill.
e. It turns out that the salmon had been left unrefrigerated too long and in addition had been undercooked.
f. It had given him food poisoning.
The discourse structure for this example has two top level constituents (28a) and (28c) linked by Narration, with the Elaboration as before of (28a). (28c) also has an Elaboration given by (28d). Then (28e) and (28f) are linked together via Result and together form a complex constituent π1 that is attached to (28d) via Explanation. Clearly, the antecedent for the salmon is not at all on the right frontier of this discourse. However, the same strategy of accommodation allows us to make the salmon salient introducing it into the list of salient entities in topic position in (28a) and then into the topic position given by (28c), so that the presupposition can be bound. Does this change in effect the content of the topic? It would seem so. This discourse is now much more about the salmon. It might even be a candidate for the topic dominating (28a) and (28c), which, since they are related by Narration, must have a topic.
5. Conclusions

The Right Frontier Constraint or RFC remains an important constraint on anaphora. It appears to be pretty much absolute for anaphors without presupposed content. We’ve seen that even highly constrained phenomena like sluicing and ellipsis provide evidence in favor of the RFC over other constraints like surface adjacency or C-command
in the discourse structure. The examination of sluicing data showed that we must attend to the presuppositional status of information in attachment; it makes a difference to discourse structure and to the Right Frontier Constraint itself. We have also seen that anaphors with presupposed content have a more complex interaction with RFC. I’ve argued that the presuppositions provided by definite descriptions and complex demonstratives bring certain entities into salience that might not otherwise be salient. While the RFC remains an absolute constraint, these referential expressions affect the discourse structure, in particular the structure of the topics in ways that are not predicted by the standard rules of SDRT. Pronouns are an attenuated case of definite descriptions. To make these ideas precise, I elaborated a formal model of topic and Accommodation Rules for definites and pronouns that modify topics in certain circumstances. This framework will allow us to study other effects of salience in discourse structure, something which I hope to pursue in future research.
References

Ariel, M. (1988). Referring and accessibility. Journal of Linguistics, 65–87.
Asher, N. (1993). Reference to Abstract Objects in Discourse. Dordrecht: Kluwer Academic Publishers.
Asher, N. (2004). Discourse Topic. Theoretical Linguistics, 163–203.
Asher, N. and Lascarides, A. (2003). Logics of Conversation. Cambridge University Press.
Asher, N., Hardt, D. and Busquets, J. (2001). Discourse Parallelism, Ellipsis and Ambiguity. Journal of Semantics, 18: 1–25.
Asher, N. and Vieu, L. (2005). Subordinating and Coordinating Discourse Relations. Lingua, 115: 591–610.
Baldridge, J. and Lascarides, A. (2005). Probabilistic Head-Driven Parsing for Discourse Structure. To appear in Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005), Ann Arbor, MI.
Chung, S., Ladusaw, W. and McCloskey, J. (1995). Sluicing and Logical Form. Natural Language Semantics, 239–282.
Danlos, L. (2003). Discourse semantic dependency representations as DAGs. In Proceedings of the First International Conference on MTT, Paris.
Fiengo, R. and May, R. (1994). Indices and Identity. Cambridge, MA: MIT Press.
Geurts, B. (1999). Presuppositions and Pronouns. Elsevier Publications.
Grosz, B. and Sidner, C. (1986). Attention, Intentions and the Structure of Discourse. Computational Linguistics, 12: 175–204.
Gundel, J., Hedberg, N. and Zacharski, R. (1990). Givenness, Implicature and the Form of Referring Expressions in Discourse. In K. Hall et al. (eds.), Papers from the 16th Annual Meeting of the Berkeley Linguistics Society, 442–453.
Hunter, J. and Asher, N. (2005). A Presuppositional Account of Indexicals. Proceedings of the Amsterdam Colloquium, ILLC Publications.
Kehler, A. (2004). Comments on Asher’s ‘Discourse Topic’. Theoretical Linguistics, 227–241.
van Kuppevelt, J. (1995). Main Structure and Side Structure in Discourse. Linguistics, 33: 809–833.
Le Draoulec, A. and Pery-Woodley, M.P. (2003). Time travel in text: Temporal framing in narratives and non-narratives. In L. Lagerwerf, W. Spooren and L. Degan (eds.), Determination of Information and Tenor in Texts: Multidisciplinary Approaches to Discourse. Amsterdam/Münster: Stichting Neerlandistiek VU/Nodus Publikationen, 267–275.
Mitkov, R. (1997). Robust pronoun resolution with limited knowledge. In Proceedings of the 18th International Conference on Computational Linguistics (COLING’98)/ACL’98 Conference, 869–875. Montreal, Canada.
Polanyi, L. (1985). A Theory of Discourse Structure and Discourse Coherence. In P.D. Kroeber, W.H. Eilfort and K.L. Peterson (eds.), Papers from the General Session at the 21st Regional Meeting of the Chicago Linguistics Society.
Prévot, L. and Vieu, L. (2005). The Moving Right Frontier. In C. Sassen, A. Benz and P. Kühnlein (eds.), Constraints in Discourse. Dortmund: Universität Dortmund, 136–142.
Romero, M. and Hardt, D. (2004). Ellipsis and the Structure of Discourse. Journal of Semantics, 21: 1–42.
Roberts, C. (2003). Uniqueness in definite noun phrases. Linguistics and Philosophy, 26: 287–350.
Rooth, M. (1992). Ellipsis Redundancy and Reduction Redundancy. In S. Berman and A. Hestvik (eds.), Proceedings of the Stuttgart Ellipsis Workshop, Arbeitspapiere des Sonderforschungsbereich 340, number 29.
Schlenker, P. (2005). Non Redundancy: Toward a Semantic Reinterpretation of Binding Theory. Natural Language Semantics, 13(1): 1–92.
Szabo, Z. and Stanley, J. (2000). On Quantifier Domain Restriction. Mind and Language, 15: 219–261.
The moving right frontier

Laurent Prévot¹,² and Laure Vieu¹,³
¹ Laboratory for Applied Ontology (LOA), ISTC-CNR
² Institute of Linguistics, Academia Sinica
³ Institut de Recherche en Informatique de Toulouse (IRIT), CNRS
This paper analyzes systematic cases of revision of the discourse structure entailing a modification of the right frontier. We show that the coordinating or subordinating nature of discourse relations plays a major role in this revision, examining in particular a relation typical in narratives, Result, as well as a family of dialog relations: content-relations introduced by interrogatives. Their complex behaviour shows that the Right Frontier Constraint, a major principle in most discourse theories, needs to be handled with care. We also generalize the discussion about problems due to the multiplication and the sophistication of discourse principles operating within SDRT, in particular the Maximize Discourse Coherence principle which constitutes an important improvement of the theory but also introduces some methodological issues.
1. Introduction

The Right Frontier Constraint (RFC) on accessibility and possible discourse continuations, introduced in (Webber 1988; Polanyi 1988), is exploited in several theories of discourse. The notion of right frontier refers to the tree-like structure of a discourse representation, which in all theories involves the notion of complex segment. In SDRT (Asher 1993; Asher and Lascarides 2003), the theory that will be discussed in this paper, discourse segments are represented by constituents which accordingly are either (i) simple constituents having a propositional content, typically representing a single clause or utterance, or (ii) complex constituents corresponding to larger segments, that are some kind of container for other (sub-)constituents and the discourse relations that relate them. Like SDRT, most discourse theories do use discourse relations, and in several of them, such relations also affect the hierarchical discourse structure and as a result the definition of the right frontier. For instance, LDM (Polanyi 1988), Grosz & Sidner’s theory (Grosz and Sidner 1986), RST (Mann and Thompson 1987) and SDRT all make use of two kinds of relations behaving differently in the discourse structure. SDRT has extensively exploited this difference in behavior to explain many phenomena at the semantics-pragmatics interface (Asher and Lascarides 2003). In SDRT, a coordinating
relation pushes the right frontier to the right, closing off its attachment point, while a subordinating relation extends the right frontier downward¹ and leaves its attachment point open for further attachments. In order to discuss the RFC on clear, solid ground, we propose in Def 1 our definition of the constraint, directly inspired by (Asher and Lascarides 2003): p. 148.²

Def 1. Right Frontier Constraint
The available attachment points in the discourse structure for a new constituent are those of the right frontier, i.e.,
1. the last simple constituent introduced in the structure, and
2. any constituent dominating the last one,
where dominance between constituents is defined by the transitive closure of direct dominance: a constituent β is directly dominated by a constituent α iff β is attached to α by a subordinating relation, or β is a sub-constituent of the complex constituent α. The discourse referents available for anaphora resolution are those which are DRT-accessible³ within the constituents of the right frontier from the attachment point up.

The “coord/subord” distinction is considered by most authors to be part of the definition of the discourse relations in a stable way. In other words, a given relation is by essence of a given kind. However, studying the actual substance of the “coord/subord” distinction, (Asher and Vieu 2005) have shown that there are cases in which coordinating relations may become subordinating. This means that a given continuation can make a coordinating attachment become subordinating. It can revise the structure and change the right frontier, opening an attachment point that was closed. We will see in this paper that the opposite change can occur as well. In dialogs, questions are attached to the context with a Relq, the question version of the relation Rel that would have attached an answer to the question to the same context. Answering the question brings in both a relation between the question and the answer (QAP), and the relation Rel between the context and the answer. Since relations Relq are proved to be subord (see (Asher and Lascarides 2003): p. 332), if the corresponding assertive relation Rel is coord, answering a question modifies the right frontier, closing off an open node above the attachment point of the answer.

In the next section we will describe and discuss the right frontier change when a coordinating relation, Result, becomes subordinating. Then, we will examine the dialog relations Narrationq and Explanationq to show how, in some cases, answering a question alters the right frontier. We will end this paper with more general methodological questions on how theoretical discourse constraints such as the RFC can be evidenced and formulated, especially in the case of theories making use of several such interdependent constraints.

1. Without necessarily introducing a complex segment, a difference with other theories.
2. The original definition makes use of other concepts that we would like to pass over for the sake of concision since they do not concern our point here.
3. See (Kamp and Reyle 1993) for the notion of accessibility in DRT.
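The way Def 1 computes the right frontier can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not part of SDRT: the class `DiscourseStructure` and its method names are hypothetical, constituents are plain strings, and only direct dominance is stored, since coordinating relations create no dominance and therefore play no role in the closure.

```python
from collections import defaultdict


class DiscourseStructure:
    """Toy structure recording only what Def 1 needs (hypothetical names)."""

    def __init__(self):
        # constituent -> constituents that directly dominate it
        self.dominators = defaultdict(set)
        self.last = None  # last simple constituent introduced

    def add_simple(self, label):
        self.last = label

    def attach_subord(self, lower, upper):
        # lower is attached to upper by a subordinating relation
        self.dominators[lower].add(upper)

    def add_member(self, sub, complex_label):
        # sub is a sub-constituent of the complex constituent
        self.dominators[sub].add(complex_label)

    def right_frontier(self):
        # last constituent plus the transitive closure of direct dominance
        frontier, todo = set(), [self.last]
        while todo:
            node = todo.pop()
            if node is not None and node not in frontier:
                frontier.add(node)
                todo.extend(self.dominators.get(node, ()))
        return frontier


# A coordinating attachment of pi2 to pi1: nothing dominates pi2.
coord_reading = DiscourseStructure()
coord_reading.add_simple("pi1")
coord_reading.add_simple("pi2")

# A subordinating attachment: pi1 dominates a complex segment tau*
# whose sub-constituents are pi2 and pi3.
subord_reading = DiscourseStructure()
for label in ("pi1", "pi2", "pi3"):
    subord_reading.add_simple(label)
subord_reading.add_member("pi2", "tau*")
subord_reading.add_member("pi3", "tau*")
subord_reading.attach_subord("tau*", "pi1")

print(sorted(coord_reading.right_frontier()))
print(sorted(subord_reading.right_frontier()))
```

Under these assumptions, pi1 is an available attachment point only in the second structure, which is the sense in which a coordinating relation closes off its attachment point while a subordinating one leaves it open.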
2. Chameleon relations in monologic discourse

In (Asher and Vieu 2005), several criteria to decide whether a given relation is coordinating or subordinating are proposed, most of them relying on possible or impossible cases of anaphora resolution. On the basis of these criteria, it is shown that some relations are only coordinating by default. Punctuation and coordination particles can force them to become subord, as shown for Result on two examples taken from (Asher and Vieu 2005) reported below:

(1) a. Lea screamed (π1), so the burglar ran away (π2).
    b. Lea screamed (π1), so the burglar ran away (π2). Max woke up (π3). #She also got a sore throat (π4).
    c. Lea screamed (π1), so the burglar ran away (π2) but Max woke up (π3). She also got a sore throat (π4).
In (1b), that Max woke up can’t be seen as a result of Lea’s scream. It is simply understood as a continuation of the story that is being told, i.e., π3 is attached by Narration to π2. This is shown by the impossibility of continuing the text with π4, for the anaphora in the parallel marker also can’t be resolved. This contrasts with (1c), in which the punctuation and the connective but force the attachment of π3 to π2 by Contrast as well as some kind of Continuation, creating a complex segment which can be seen as collecting all the consequences of Lea’s scream. In this context, it is now possible to continue to extend this complex segment with π4. We see in example (1c) that Result changes from coord to subord. As a result, the structure built with the attachment of π2 to π1, Fig.1:(1-a), is revised when attaching π3 to obtain that of Fig.1:(1-c). The right frontier is modified, reopening π1.

[Figure: three tree diagrams, panels (1-a), (1-b) and (1-c); in (1-b), π4 cannot be attached.]
Figure 1. Chameleon relations in example (1).
We would like to emphasize that we are not facing a new Result relation when its coord/subord nature changes. The relation keeps the same triggering rules and the same semantic effects. The semantics of a relation belongs to the information content level and remains unchanged with chameleon transformations. In fact, what changes belongs only to the information packaging level.⁴ The information-packaging level is generally considered as dealing with defeasible information, and this alone suggests that it is not absurd to consider that the coord/subord nature of a relation may change. As we have seen, Result changes in our example because of the presence of punctuation and connectives, which clearly affect information packaging; and it is very likely that the nature of a relation interacts with other information packaging ingredients, such as intonation. In (Asher and Vieu 2005), it is suggested to handle chameleon phenomena by stating that some relations (e.g., Result) are by default coordinating, and that this default can be overridden by more specific discourse clues such as punctuation and structural discourse markers (but and also in our examples). This proposal is not formally implemented in SDRT yet, but it would actually involve using revision mechanisms. Since revision mechanisms are in general best avoided, and since it would without doubt be theoretically simpler to assume that a given relation is always of a given hierarchical nature, we would now like to examine two possible alternative explanations. The first one extensively uses the notion of discourse topic while the second one tries to handle these problems with underspecification.
2.1 Topic Insertion?

In the case at hand, exemplified in (1–c), we have (i) two constituents (π2 and π3) that should be attached with the same relation to a third one (π1), and (ii) this relation is coord. If the relation were subord, as we have just suggested, there would be no structural problem (cf. Fig.2:A). A solution that may come to mind, for keeping Result coord, would be to group the two constituents (π2 and π3) into a complex constituent dominated by a topic and to relate this topic constituent to π1 with the original coord relation, as shown in Figure 2:B.

[Figure: two tree diagrams. (A) π1 is linked by Result to the complex constituent τ∗, which contains π2 and π3 linked by Continuation. (B) π1 is linked by Result to πtop, which is linked by Topic to τ∗, containing π2 and π3 linked by Continuation.]
Figure 2. Topic insertion in example 1.

4. In other theories, for instance in RST, the distinction corresponding to the coord/subord one in SDRT has been taken to be closely linked to the semantics of the relation. SDRT showed from the start the need to distinguish e.g., Result (coord) from Explanation (subord) whose semantic contents both refer to causality between eventualities.
In SDRT, discourse topics are assumed to be propositional and are integrated in the discourse structure like any other constituent. Some discourse topics are explicit (e.g., when we have an Elaboration), but others are only implicit and have to be built from the contents of the segment they are a topic of, by some kind of subsuming operation (Asher 1993; Asher 2004). In SDRT, discourse topics are essential ingredients of the discourse structure. For example, a Narration is necessarily dominated by a discourse topic (either explicit or implicit). A solution based on discourse topics, as sketched above, raises two problems. First, the two structures depicted in 2:A and 2:B do not have the same availability properties. More precisely, in 2:A the referents in π1 are available for π3 while this is not the case in 2:B. Example (1–c) suggests that this availability link exists and therefore indicates 2:A as a more adequate solution. The second point concerns the content of the topic constituent πTOP. The relation Continuation does not have a semantics by itself; it is only a mark of the continuation of the Result relation in this span of discourse. Therefore the topic has to be built taking into account the Result relation. However, this relation holds between π1 and π2 and between π1 and π3, but not between π2 and π3. This implies that the potential topic constituent (πTOP) must somehow include some information from π1 (what is shared between π2 and π3 is that they both are consequences of π1), which is clearly odd from a discourse topic building viewpoint. More precisely, in this case, πTOP does not include e1, the main eventuality of π1, and cannot therefore be the topic of π1, but only of what happened once e1 occurred. For both these reasons, this approach does not seem suitable. In addition, let us note that, if we started using topics in such a way, there would be little point in keeping subordinating relations other than Topic in SDRT, since this mechanism could apply to any subord relation.
2.2 Underspecification?

Another way of handling this issue could be to exploit the relatively recent Maximize Discourse Coherence (MDC) constraint of SDRT (Asher and Lascarides 2003): p. 230. Equipped with such a tool, a new option consists in questioning the established coord nature of Result and attributing to it an underspecified nature. Then it is possible to use additional clues to decide on the nature of the relation, possibly with the help of subsequent clauses. In this way, if the context supports an additional inference to Narration, as in (1–b), Result will be coord. But if the updated context supports the creation of a complex segment gathering several “results”, as in (1–c), Result will be subord. The version of MDC (Def 2) we use is based on the gloss given in (Asher and Lascarides 2003): p. 234.

Def 2. Maximize Discourse Coherence
MDC is based on a coherence partial order on discourse structures. Maximizing coherence amounts to preferring discourse structures with the smallest number of nodes, the fewest semantic and pragmatic clashes, the largest number of rhetorical relations and the fewest underspecifications.

The introduction of MDC resulted in an important improvement of SDRT, accounting for new phenomena and significantly simplifying the account of others. However, with this principle we have lost the possibility of accounting for the total incoherence of a given discourse. One structure is simply better than another one. In particular, for an unacceptable discourse it is possible to say that the best structure representing it still has some clashes and similar problems, but not to reject it as incoherent on the grounds that no representation can be built, as was done in earlier versions of SDRT. The interesting counterpart of this potential problem is the possibility of leaving discourse relations underspecified after an update, delaying the decision until enough information is available. This makes it possible to deal with example (2), awkward at first but perfectly fine once completed (adapted from (Caenepeel 1991) and (Asher and Lascarides 2003)).

(2) a. Joe was released from hospital (π1). ?He recovered completely (π2).
    b. Joe was released from hospital (π1). He recovered completely (π2) and they needed the bed (π3).
    c. Joe was released from hospital (π1). He recovered completely (π2), then he resumed training (π3).
In this example, the relation between π1 and π2 is underspecified before the utterance of π3, which makes clear in (2–b) that it is an Explanation, a subord relation, and in (2–c) a Narration, a coord relation. (Asher and Lascarides 2003) do not discuss in detail how to deal with such underspecification, although it is quite clear that this case is not resolved with the construction of a number of alternative SDRSs, as for truly ambiguous discourses. The constituent π2 is surely attached to π1 but since the relation is left underspecified, its nature is underspecified as well. One wonders then which sites are available after this attachment, i.e., where is the right frontier of such a discourse? The formal definition of SDRS update in (Asher and Lascarides 2003) considers that only coord relations induce a constraint; so an underspecified relation is dealt with as a subord one, leaving all the sites available. This seems quite reasonable in this example. But if the same is applied to the “underspecification” of the nature of Result, it amounts to considering Result as subord by default, rather than coord by default. This apparently clashes with the intuition that Result is usually coord, as assumed up to now in SDRT on the basis of quite a number of examples. We therefore consider that one should admit that there are such things as chameleon relations, to be dealt with by some sort of revision mechanism. Changes are fortunately not so frequent, and always triggered by specific clues. (Asher and Vieu 2005) suggests that Narration, a prototypical coordinating relation in narratives, is always coordinating, and that no subordinating relation can be turned into a coordinating one. We do not take issue with this precise point here, but, examining dialogs, we will now see that something very close to turning a subordinating relation into a coordinating one can occur and alter the right frontier accordingly.
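The coherence ordering glossed in Def 2 can be sketched in Python as a comparison over four counts. This is a hedged illustration only: the source does not say how the four criteria are combined, so the sketch assumes a Pareto-style ordering (one structure is preferred only if it is no worse on every criterion), and the names `Profile` and `compare` as well as the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Counts describing one candidate discourse structure (hypothetical)."""
    nodes: int
    clashes: int
    relations: int
    underspecified: int

    def scores(self):
        # Orient every criterion so that smaller is better.
        return (self.nodes, self.clashes, -self.relations, self.underspecified)


def compare(a, b):
    """Return 'a', 'b', 'equal' or 'incomparable' (assumed Pareto ordering)."""
    sa, sb = a.scores(), b.scores()
    a_no_worse = all(x <= y for x, y in zip(sa, sb))
    b_no_worse = all(y <= x for x, y in zip(sa, sb))
    if a_no_worse and b_no_worse:
        return "equal"
    if a_no_worse:
        return "a"
    if b_no_worse:
        return "b"
    return "incomparable"


# Same structure, but one relation is left underspecified in the second.
resolved = Profile(nodes=3, clashes=0, relations=2, underspecified=0)
pending = Profile(nodes=3, clashes=0, relations=2, underspecified=1)

# An extra node buys extra relations: the criteria pull in opposite directions.
richer = Profile(nodes=4, clashes=0, relations=4, underspecified=0)

print(compare(resolved, pending))
print(compare(richer, resolved))
```

With this assumed ordering, the fully resolved structure is preferred over the one with a pending underspecification, while a structure that adds a node and gains relations is simply incomparable with the smaller one. A different way of combining the criteria, for example a weighted or lexicographic one, would rank such pairs differently.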
3. Content relations and interrogatives

Some questions require their answers to satisfy a given rhetorical relation with the previous discourse context. These questions (introducing relations like Explanationq, Narrationq...) have been briefly presented in (Asher and Lascarides 2003), but we believe that the structural aspect of their treatment in SDRT requires more attention, as noted in (Prévot et al., 2002; Prévot 2004). In order to show this, we are going to consider interrogatives introducing subordinating or coordinating relations. We will pay special attention to the state of the right frontier after the question resolution.
3.1 Narrationq versus Elaborationq
Narration and Elaboration are among relations that are assumed not to exhibit a chameleon behaviour (Asher & Vieu 2005). Narration is coordinating while Elaboration is subordinating. Narrationq, Elaborationq and QAP have been shown to be subordinating (Asher and Lascarides 2003). In example (3), the subordinating nature of Elaboration, Elaborationq and Background⁵ is coherent and predicts correctly that π5 is open for pursuing the story (see Fig 3).

(3) A1 Yesterday I visited Fez, it was great!
    B2 Really? Where did you go?
    A3 In the morning, I’ve been in the medina. (π3)
    A4 I started by getting lost (π4)
    A5 and then a child guided me to the souk. (π5)
    B6 The tanner’s souk? (π6)
    A7 No the shoemaker’s one. (π7)
    A8 There were some wonderful babouches there! (π8)
    B9 He took you to his uncle’s shop, right? (π9)
In example (4) the interrogative in B6 introduces a Narrationq.

(4) A1 Yesterday I visited Fez, it was great!
    B2 Really? Where did you go?
    A3 In the morning, I’ve been in the medina. (π3)
    A4 I started by getting lost (π4)
    A5 and then a child guided me to the souk. (π5)
    B6 Then, what did you do? (π6)
    A7 There I recognized the place (π7)
    A8 and I went to the shoemaker’s of the other day. (π8)
    B9 # He took you to his uncle’s shop, right? (π9)
5. In (Vieu and Prévot 2004), we applied the test proposed in (Asher and Vieu 2005) and we found out that Background was a subordinating relation, contrary to what had been proposed up to now in SDRT but in agreement with RST’s viewpoint.
[Figure: discourse graph over π3 to π9, with the relations Topic, Narr., Result, Background, Elab., Elab_q and QAP.]
Figure 3. Discourse structure for (3): A3 – B9.

[Figure: discourse graph over π3 to π9, with the relations Topic, Narr., Narr_q and QAP.]
Figure 4. Discourse structure for (4): A3 – B9.
In this case, the standard SDRT analysis (Asher and Lascarides 2003) faces two problems. Firstly, the structure predicts wrongly the availability of π5 for further attachments, for instance for π9, which is unacceptable (see Figure 4). Secondly, the subord nature of Narrationq and QAP results in a puzzling subord Narration between π5 and π7, Narration being the prototypical coord relation. These two problems point toward the necessity of a coordinating attachment between π5 and some other node. Indeed, since π5 is not available for π9 in example (4), the hypothesis that there is some node on its right would explain the blocking. We conjecture, as we will see now, that this additional node needs to be attached to π5, instead of the answer, by a Narration relation.
3.2 A solution using a question-answer topic
The solution proposed, as presented in (Prévot et al., 2002; Prévot 2004), is to assume that the question-answer pair generates a dominating discourse topic. This topic is a simple constituent whose content is the resolved question-answer pair. In the case of simple answers, the content of elliptical answers to questions is already reconstructed in the answer constituent and therefore the topic is only a copy of the answer. But in the case of complex answers the topic is built as an abstraction over the answers, just as for narrative topics. The establishment of the QAP relation generates this topic over the question-answer sequence and this topic is attached to the previous discourse with the expected assertive relation, with its expected type of attachment. In Figure 5, (A) corresponds to a subord relation after question resolution while (B) corresponds to a coord relation. In this figure, γ is the target of the question α, and β is the answer to α. The Topic-Question relation associates two constituents: τ*, which is a complex constituent for the segment consisting of the question and the answer, and τ, which is the topic itself, a simple constituent built from the question and its answer(s). With our solution, what changes is the importance of the Relationq in the structure. It is at first crucial for tackling the coherence of the dialog, and it becomes secondary once the structure is updated by the establishment of a satisfying answer to the question. The relation between γ and β is actually established between γ and τ. Surely Narrationq still holds between π5 and π6 in example (4), but it is no longer important for availability issues. This Narrationq is only part of the dialog history but still helps increase the overall coherence for the Maximize Discourse Coherence constraint, which prefers discourse interpretations offering the highest number of rhetorical links (among other criteria). Instead, Narration between π5 and τ takes on a more important role for the Right Frontier Constraint.

[Figure: two diagrams, (A) and (B), over the constituents γ, α, β, τ and τ∗, with the relations Sub, Sub-q, Coo, Coo-q, QAP and Topic-Question.]
Figure 5. Question-Answer attachment in the case of a simple question-answer pair.

Applying our proposal to examples (3) and (4) leads to the discourse graphs represented in Figures 6 and 7 respectively. Figure 7 shows that we correctly model the fact that π5 is not available anymore for further attachment once the question π6 is answered and closed. We correctly capture the unavailability of the discourse referents introduced in this constituent for pronominal anaphora resolution.

[Figure: discourse graph over π3 to π9 with the added topic τ, with the relations Topic, Narr., Result, Elab., Elab_q, Background, Topic-Question and QAP.]
Figure 6. New structure for (3): A3 – B9.

[Figure: discourse graph over π3 to π9 with the added topic τ, with the relations Topic, Narr., Narr_q, Topic-Question and QAP.]
Figure 7. New structure for (4): A3 – B9.
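The effect of the question-answer topic on the right frontier, for the simple case of Figure 5, can be sketched in Python as follows. This is a minimal sketch under explicit assumptions rather than an implementation of SDRT update: the function names `right_frontier` and `resolve_question` are hypothetical, dominance is stored as a plain dictionary, and the statement that Relq is "no longer important for availability" is modelled by dropping the Relq edge from the dominance map once the question is answered.

```python
def right_frontier(last, dominators):
    """Last constituent plus everything reachable through direct dominance."""
    frontier, todo = set(), [last]
    while todo:
        node = todo.pop()
        if node not in frontier:
            frontier.add(node)
            todo.extend(dominators.get(node, ()))
    return frontier


def resolve_question(dominators, gamma, alpha, beta, rel_is_subord):
    """Return the dominance map after question alpha (target gamma) is answered by beta."""
    new = {node: set(doms) for node, doms in dominators.items()}
    # Assumption: Rel_q(gamma, alpha) stays in the dialog history only.
    new.setdefault(alpha, set()).discard(gamma)
    # QAP is subordinating: the question dominates its answer.
    new.setdefault(beta, set()).add(alpha)
    # tau* groups question and answer; the topic tau dominates tau*.
    new[alpha].add("tau*")
    new[beta].add("tau*")
    new.setdefault("tau*", set()).add("tau")
    # Rel(gamma, tau): only a subordinating Rel keeps gamma above tau.
    if rel_is_subord:
        new.setdefault("tau", set()).add(gamma)
    return new


# Before the answer: alpha is attached to gamma by the subordinating Rel_q.
before = {"alpha": {"gamma"}}
open_question = right_frontier("alpha", before)

# Case (B): the assertive relation Rel is coordinating (e.g., Narration).
after_coord = right_frontier(
    "beta", resolve_question(before, "gamma", "alpha", "beta", rel_is_subord=False))

# Case (A): the assertive relation Rel is subordinating (e.g., Elaboration).
after_subord = right_frontier(
    "beta", resolve_question(before, "gamma", "alpha", "beta", rel_is_subord=True))

print(sorted(open_question), sorted(after_coord), sorted(after_subord))
```

In this toy model gamma is on the frontier while the question is open, stays there after the answer when Rel is subordinating, and is closed off when Rel is coordinating, which mirrors the contrast between examples (3) and (4).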
4. More general and methodological issues

In order to account for the subtleties that appeared around the RFC, we proposed solutions that introduced modifications of the discourse structure. In particular, we use more and more implicit discourse topics that do not directly correspond to the surface form. Such a method is also followed in (Asher 2004), (Asher, this volume), essentially for dealing with definite descriptions. In Asher’s proposal, binding definite descriptions whose antecedents are not on the right frontier might force the creation of new implicit discourse topics. This path toward sophistication seems unavoidable but raises three main issues.

Methodological confusion. The profusion of theoretical constraints and principles that could be extended or altered forces one to choose among potential modifications for explaining any new phenomenon at hand. For example, as we saw in Section 2, in order to explain the availability of discourse referents in example (1), one person might decide to introduce chameleon relations, another to introduce more sophisticated topic management rules, while yet another might just let MDC do the job and simply introduce more underspecification. As we have seen, there seem to be reasons for preferring the first option, but we still believe that a clear methodological line for deciding when, where and how we should preferably integrate new elements into the system is lacking.

Principle Interdependence. More sophistication results in a variety of principles that are more difficult to handle. Their complex interaction is difficult to deal with since all the constraints may move simultaneously. Namely, RFC, MDC, topic construction rules and the coord/subord distinction all have important consequences for the discourse structure and therefore for referent availability. We saw that topic construction rules and chameleon relations have important effects on the right frontier. Similarly, introducing more implicit topic nodes in the representations affects MDC, as this yields less preferred structures. If we modify these constraints without taking care of their interaction, there is a high risk of entering a long chain of modifications without succeeding in stabilizing the system.

Acceptability criterion. The sophistication of the theory is unavoidable for accounting for more linguistic phenomena, i.e., analyzing a larger number of acceptable discourses as coherent. However, extending the theory in this direction often means relaxing constraints. And while relaxing constraints, we have to make sure that we remain able to analyze unacceptable discourses as incoherent. In particular, the Maximize Discourse Coherence principle made us lose what was once our basic methodological rule: being able to account for the acceptability or unacceptability of a given discourse, analyzing it as coherent or incoherent (see Def. 2).
In spite of –and because of– its recent move introducing MDC, SDRT today requires a general reflexion on how to handle the scalarity of discourse acceptability. Some anaphora links seem to manage to violate RFC, as in example (5). Similarly, when looking for examples with interrogatives (for Section 3), we actually ended up many times finding examples that were strangely acceptable in spite of the theoretical unavaibility of discourse referents. We believe that this is often due to complex phenomena occurring in the construction of discourse topics (not only for questionanswer pairs), so, in essence, we agree with (Asher, this volume). But overall, such difficulties point toward the scalarity in the acceptability of discourses, based on some kind of scalarity of availability of referents for anaphora resolution. This point constitutes a strong argument in favor of the MDC, although apparently at the cost of releasing RFC, at least in its referent availability point. Such a move is argued against in (Asher, this volume). (5)
π1 π2 π3 π4 π5 π6
This morning, in the subway, I almost got robbed. At some point a man started pulling at my purse. I just froze. A woman screamed, and the pickpocket escaped. I wanted to thank her but she had disappeared.
(6)
π1 On his birthday, John had a great evening. π2 He started by winning a dance competition. π3 His partner was very seductive π4 and she gave him her phone number. π5 Then he had a great dinner and party with some friends. π6 The entire next day John kept hesitating about calling her.
In fact, according to (Asher, this volume), the pronouns in bold in examples (5) and (6) yield unacceptable discourses, while the same examples with definite descriptions would be acceptable. We agree that such discourses are more awkward than others, but we believe that a deep corpus search is bound to turn up similar examples6. Moreover, it is important that the theory explain why such forced examples are still better than very bad examples like (7). The scalarity of acceptability is also signalled by the fact that naive readers disagree (on both French and English examples) about their acceptability. What needs to be discovered is whether such scalarity is part of RFC or is accounted for by MDC, with less satisfactory “forced” anaphora yielding less satisfactory structures. The second option looks more elegant, but we need to be sure that it can explain why examples like (5)–(6) are less acceptable than perfectly “well-formed” discourses and more acceptable than totally mistaken ones such as (7). In order to do so, MDC, i.e., the coherence partial order on discourse structures, which combines several possibly non-converging criteria, now needs to be tested more systematically, including on corpus examples.
6. It is clear that finding authentic corpus examples of similar anaphora patterns is necessary. However, spotting such phenomena is rather difficult because there are a lot of pronouns in the data and most of them do not qualify for testing our proposals. Most pronouns are clearly linked to a referent either in the discourse topic or in the last utterance. In order to facilitate this search, one needs a corpus annotated with anaphoric links, discourse structure, and more particularly discourse pop-ups.
(7)
(Asher and Lascarides 2003) π1 John had a great evening last night. π2 He had a great meal. π3 He ate salmon. π4 He devoured lots of cheese. π5 He then won a dancing competition. π6 # It (# The salmon) was a beautiful pink.
5. Conclusion
This paper has shown some limit cases for the Right Frontier Constraint. RFC in SDRT is founded on the coordinating/subordinating nature of relations, and we explained that this nature, situated at the information packaging level, is not as stable as believed. Moreover, the importance of a given coherence relation might evolve during the interpretation of discourses, as shown by content-level relations introduced by interrogatives in dialogs. RFC is therefore a discourse principle that needs to be used with care. In order to make its use more reliable, we must (i) systematically examine each relation in the light of the tests of (Asher and Vieu 2005) for its nature, (ii) clarify the interaction between the nature of relations and other information-packaging phenomena, and (iii) propose a new revision mechanism to be integrated within SDRT for dealing with chameleon relations. We then need to pursue the work on topic construction started by Asher in (Asher 2004) and (Asher, this volume), as the insertion of implicit topics in discourse structures, a crucial method for handling a number of phenomena in SDRT, also affects RFC. However, we also discussed the difficulties due to the multiplication of interacting principles when elaborating SDRT to increase its coverage of acceptable discourses, as we have just done in this paper. In particular, we noted the loss of a clear-cut acceptability/unacceptability criterion with MDC. This last principle is powerful but, from our point of view, requires a systematic evaluation of its application to various examples of good, merely correct and odd discourses. This paper thus contributes to showing that a sophisticated theory like SDRT is in need of general methodological principles on how to handle the evolution of its own foundations, i.e., discourse constraints such as RFC and MDC. This is especially true now that the use of SDRT is spreading in the community.
Acknowledgments
The authors would like to thank Isabel Gómez Txurruka, Nicholas Asher, Philippe Muller and Nicolas Maudet for discussions related to the topic of this paper, and the “Constraints in Discourse” organizers, reviewers and participants for their questions, comments and suggestions.
References
Asher, N. (1993). Reference to Abstract Objects in Discourse. Kluwer Academic Publishers.
Asher, N. (2004). Discourse topic. Theoretical Linguistics, 30:161–201.
Asher, N. and Lascarides, A. (2003). Logics of Conversation. Cambridge University Press.
Asher, N. and Vieu, L. (2005). Subordinating and coordinating discourse relations. Lingua, 115(4):591–610.
Caenepeel, M. (1991). Event structure versus discourse coherence. In Caenepeel, M., Delin, J., Oversteegen, L., and Sanders, J., editors, Proceedings of the DANDI Workshop on Discourse Coherence.
Grosz, B. and Sidner, C.L. (1986). Attention, intentions, and the structure of discourse. Computational Linguistics, 12(3):175–204.
Kamp, H. and Reyle, U. (1993). From Discourse to Logic. Kluwer Academic Publishers.
Mann, W. and Thompson, S. (1987). Rhetorical Structure Theory: A theory of text organization. Technical report, Information Science Institute.
Polanyi, L. (1988). A formal model of the structure of discourse. Journal of Pragmatics, 12.
Prévot, L. (2004). Structures sémantiques et pragmatiques pour la modélisation de la cohérence dans des dialogues finalisés. PhD thesis, Université Paul Sabatier.
Prévot, L., Muller, P., Denis, P., and Vieu, L. (2002). Une approche sémantique et rhétorique du dialogue. Un cas d’étude: l’explication d’un itinéraire. Traitement Automatique des Langues, 43(2):43–70.
Vieu, L. and Prévot, L. (2004). Background in Segmented Discourse Representation Theory. In Workshop Segmented Discourse Representation Theory, 11th conference on Natural Language Processing (TALN), pages 485–494.
Webber, B.L. (1988). Discourse deixis and discourse processing. Technical Report MS-CIS86–74, Department of Computer and Information Science, University of Pennsylvania.
part ii
Comparing Frameworks
Strong generative capacity of rst, sdrt and discourse dependency dags
Laurence Danlos
Institut Universitaire de France
The aim of this paper0 is to compare the discourse structures proposed in rst, sdrt and dependency dags which extend the semantic level of mtt for discourses. The key question is the following: do these formalisms allow the representation of all the discourse structures which correspond to felicitous discourses and exclude those which correspond to infelicitous discourses? Hence the term “strong generative capacity”, taken from formal grammars.
1. Introduction
rst (Mann and Thompson 1988) and sdrt (Asher and Lascarides 2003) are quite different discourse theories; however, they both rely on discourse relations and postulate an asymmetry: some parts of a text play a “subordinate” (“less important”) role relative to other parts. This asymmetry is expressed in rst as a distinction between the arguments of discourse relations: arguments of type Nucleus are more important than arguments of type Satellite. It is expressed in sdrt as a distinction between the types of discourse relations: a coordinating relation links arguments of equal importance, while a subordinating relation links an important argument to a less important one. Since these distinctions come from the same idea, we use the following terminology for the two theories: a multi-nuclear or coordinating relation links two Nuclei1, while a nucleus-satellite or subordinating relation links a Nucleus and a Satellite. On the other hand, dependency dags for discourse (Danlos 2004), which extend the sentential semantic level of mtt (Meaning-Text Theory) (Mel’cuk 1988) to the discourse level, rely on discourse relations but not on the distinction Nucleus/Satellite or coordinating/subordinating. This formalism only calls upon constraints coming from
0. A French version of this paper is published in Revue TAL, Volume 47 Numéro 2, pp 169–198, 2006.
1. A coordinating relation may link more than two Nuclei, e.g., Narration or Sequence. However, I leave aside this case, which means that it is supposed that all discourse relations discussed in this paper have exactly two arguments.
the semantic behavior of discourse connectives, which are extrapolated to discourse relations which are not lexicalized by a discourse connective. These constraints, which are minimal, are also respected in rst and sdrt (Section 4).
The aim of this paper is to compare rst, sdrt and discourse dependency dags with respect to the following question: do these formalisms allow the representation of all the discourse structures which correspond to felicitous discourses and exclude those which correspond to infelicitous discourses? By taking the term “strong generative capacity” from formal grammars, I reformulate this question as: what is the strong generative capacity of each of these three formalisms? It will be shown that none of these formalisms has the appropriate strong generative capacity. rst, which imposes that discourse structures be tree shaped (among other constraints), is too restrictive (there exist some felicitous discourses whose structure is not controversial but is not representable as an rst tree). At the opposite extreme, the dependency dag formalism, which only imposes constraints coming from the semantics of discourse connectives, is too powerful (there exist some dependency dags which respect these constraints, but which seem to correspond to no felicitous discourse). Finally, sdrt comes near to the appropriate strong generative capacity but still falls short. This conclusion may sound negative; however, I hope this study will shed new light on the constraints which must govern discourse structures and on the geometric properties of the graphs representing discourse structures.
Let us underline the following point: this paper only focuses on discourse structures. Nothing is said about the processes that are or should be implemented to compute discourse structures from discourses.
The paper is organized as follows. Sections 2, 3 and 4 present the main features of the discourse structures proposed respectively in rst, sdrt and dependency dags.
Section 5 concerns the strong generative capacity of these three formalisms. Section 6 concludes. In sections 2 to 4, a reference discourse is used. This discourse, taken from (Asher and Lascarides 2003), is given in (1). (1)
a. Fred experienced a lovely evening last night.
b. He had a fantastic meal.
c. He ate salmon.
d. He devoured lots of cheese.
e. He won a dancing competition.
This narrative discourse describes Fred’s evening which is elaborated with two sub-events, the meal and the dancing competition. The meal is itself elaborated with two courses, salmon and cheese. The discourse relations involved and their arguments are not controversial, they are the following:
• Elaboration holds between the first sentence (1a) and the rest of the discourse (1b–e),
• Narration (called Sequence in rst) holds between the meal and dancing competition sentences, i.e., (1b) and (1e),
• Elaboration holds between the meal sentence (1b) and the two next sentences, i.e., (1c) and (1d),
• Narration holds between the salmon and cheese sentences, i.e., (1c) and (1d).
Elaboration is a nucleus-satellite relation in rst and is subordinating in sdrt, while Narration is multinuclear in rst and coordinating in sdrt. This fits with the claim that nucleus-satellite/subordinating and multinuclear/coordinating are terminological variants.
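As an illustration only, the relations listed above for (1) can be written down as plain data. The following Python sketch is a hypothetical encoding (the names `Relation`, `RELATION_TYPE` and `roles` are assumptions made for this example and belong to neither rst nor sdrt); it simply records each relation with its two arguments and derives the Nucleus/Satellite roles from the relation type.

```python
from dataclasses import dataclass

# Relation types as stated in the text. The dictionary is an illustrative
# encoding, not a data structure defined by rst or sdrt.
RELATION_TYPE = {
    "Elaboration": "subordinating",  # nucleus-satellite in rst
    "Narration": "coordinating",     # multinuclear in rst (called Sequence)
}


@dataclass(frozen=True)
class Relation:
    name: str
    first: tuple   # sentences of (1) covered by the first argument
    second: tuple  # sentences of (1) covered by the second argument


# The four relations given for discourse (1).
RELATIONS_FOR_1 = [
    Relation("Elaboration", ("a",), ("b", "c", "d", "e")),
    Relation("Narration", ("b",), ("e",)),
    Relation("Elaboration", ("b",), ("c", "d")),
    Relation("Narration", ("c",), ("d",)),
]


def roles(relation):
    """Return the roles of the (first, second) arguments of a relation."""
    if RELATION_TYPE[relation.name] == "subordinating":
        return ("Nucleus", "Satellite")
    return ("Nucleus", "Nucleus")


for relation in RELATIONS_FOR_1:
    print(relation.name, relation.first, relation.second, roles(relation))
```

In this encoding the meal sentence (1b) is the first argument of two relations, a point that becomes important when the tree and dag representations are compared.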
2. RST
rst is a theory which dates back twenty years and which has been widely used in descriptive and computational linguistics as well as in NLP (both for text understanding and text generation). Unsurprisingly, the numerous authors working in this framework do not always share the same points of view. It is not in the scope of this paper to present the various points of view on rst; see (Taboada and Mann 2006a) and (Taboada and Mann 2006b) for a review. Thus, we limit ourselves in Section 2.1 to a particular interpretation of rst, that of Marcu, which has had a strong impact in discourse analysis (Marcu 2000a) and discourse annotation (Carlson et al. 2003). However, we discuss in Section 2.2 one of the issues which is debated within the rst community, namely the Nucleus/Satellite distinction.
2.1 Graphical representations and predicate-argument relations
The original graphical representation proposed in (Mann and Thompson 1987) for discourse structures is illustrated in Figure 1, which shows the discourse structure for (1). In this diagram, the representation of the ith sentence is noted πi. This notation comes from sdrt and not from rst, where the notation Ci is preferred, see for example (Egg and Redeker 2007) (this volume). This difference in notation is irrelevant since the representation of sentences is not discussed at all in this paper, which focuses on discourse structures annotated with discourse relations.
Figure 1. rst diagram for (1) in orthodox style. For memory: π1 ≅ Fred’s evening, π2 ≅ meal, π3 ≅ salmon, π4 ≅ cheese, π5 ≅ dancing competition.
(Marcu 1996) has proposed graphical representations which are equivalent to the original diagrams, but which do look like trees. For a discourse made up of two sentences (clauses) linked by a discourse relation R, the representation is a binary tree: the root is R, the edges are labelled N for Nucleus or S for Satellite, the (ordered) leaves are the representations of the two sentences. If R is a nucleus-satellite relation, the Nucleus (resp. Satellite) of R is the leaf on the edge labelled N (resp. S); the Nucleus precedes or follows the Satellite. If R is a multi-nuclear relation, both edges are labelled N. Marcu has also proposed a principle, called “Nuclearity Principle” (or “Compositionality Principle”), which gives the predicate-argument relations when a nucleus-satellite (subordinating) relation is embedded in another discourse relation. I extend his principle to give the predicate-argument relations when a multinuclear (coordinating) relation is embedded in another one. This leads to the “Mixed Principle”.
Mixed Principle: Let ni be a non terminal node in an rst tree whose left (resp. right) daughter is nj. The left (resp. right) argument of ni is:
• if nj is a leaf, nj;
• if nj is a coordinating discourse relation, the sub-tree rooted at nj;
• if nj is a subordinating discourse relation, the Nucleus argument of nj (recursively given by the Mixed Principle).
This principle is case-based. The first two cases correspond to the standard interpretation of trees used in computer science; the third one corresponds to Marcu’s Nuclearity Principle. The rst tree for (1), which must be interpreted with the Mixed Principle, is shown in Figure 2.
Figure 2. rst tree for (1) in standard style. In bracket notation, with edge labels N and S: Elaboration(N: π1, S: Narration(N: Elaboration(N: π2, S: Narration(N: π3, N: π4)), N: π5)).
In this tree, the right (Satellite) argument of the root, the topmost Elaboration, is the sub-tree rooted at the topmost coordinating relation Narration. In this sub-tree, the left argument of the root is π2 (thanks to the Nuclearity Principle). π2 is also the left argument of the embedded Elaboration. In a nutshell, π2 is an argument of two discourse relations, although it has a single parent in the tree in Figure 2.
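The way the Mixed Principle assigns arguments can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation taken from Marcu or from this paper: the class `Node`, the sets `COORDINATING` and `SUBORDINATING`, and the helper names are all hypothetical, and the tree is the one described for (1).

```python
from dataclasses import dataclass

# Relation types used in example (1); the sets are assumptions of this sketch.
COORDINATING = {"Narration"}
SUBORDINATING = {"Elaboration"}


@dataclass(frozen=True)
class Node:
    relation: str
    left: object                 # a leaf (str) or another Node
    right: object
    labels: tuple = ("N", "S")   # edge labels of the left and right daughters


def leaves(tree):
    """Ordered leaves of a tree."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree.left) + leaves(tree.right)


def argument(daughter):
    """Argument contributed by a daughter, following the three cases."""
    if isinstance(daughter, str):            # case 1: a leaf
        return daughter
    if daughter.relation in COORDINATING:    # case 2: the whole sub-tree
        return daughter
    # case 3: subordinating relation, take its Nucleus argument recursively
    nucleus = daughter.left if daughter.labels[0] == "N" else daughter.right
    return argument(nucleus)


def describe(arg):
    """Readable rendering of an argument (leaf name or relation over leaves)."""
    if isinstance(arg, str):
        return arg
    return f"{arg.relation}[{' '.join(leaves(arg))}]"


def predicate_arguments(tree):
    """List (relation, left argument, right argument) for every inner node."""
    if isinstance(tree, str):
        return []
    here = (tree.relation,
            describe(argument(tree.left)),
            describe(argument(tree.right)))
    return [here] + predicate_arguments(tree.left) + predicate_arguments(tree.right)


# The rst tree for (1).
TREE_FOR_1 = Node(
    "Elaboration", "π1",
    Node("Narration",
         Node("Elaboration", "π2", Node("Narration", "π3", "π4", ("N", "N"))),
         "π5", ("N", "N")),
)

for triple in predicate_arguments(TREE_FOR_1):
    print(triple)
```

Under these assumptions, π2 comes out as the left argument of both the upper Narration and the embedded Elaboration, even though it has a single parent in the tree.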
2.2 Nucleus/Satellite distinction
The criterion put forward by Marcu for the Nucleus/Satellite distinction is the fact that the Satellites of a text can be deleted without harming its coherence. He used this criterion for text summarization: his strategy consists in deleting all the Satellites from the original text to keep only the Nuclei (Marcu 2000b). Along the lines of the theory as originally proposed in Mann and Thompson (1988), Marcu considers that only a few discourse relations are multi-nuclear (mainly Sequence, Parallel Contrast, Joint and List). He considers that all the other ones are nucleus-satellite relations, in particular all the relations that can be lexicalized by a subordinating conjunction (the main clause corresponding to the Nucleus, the subordinate clause to the Satellite) (Matthiessen and Thompson 1988). Marcu’s position is not unanimously accepted within the rst community. On the one hand, his criterion for the Nucleus/Satellite distinction has been criticized, for example, in (Stede 2007), where it is proposed that the notion of salience should be taken into account to determine which elements can be qualified as Nuclei according to the context. On the other hand, the direct mapping advocated by (Matthiessen and Thompson 1988) between the linguistic structure and the type of a discourse relation (i.e., a subordinating conjunction lexicalizes a subordinating relation) has been widely criticized, recently in (Delort 2006). We will see in Section 3.3 that the distinction between coordinating and subordinating relations made in sdrt is also debated.
3. sdrt
3.1 Box representations and graphs for SDRSs
Originally, (Asher 1993) designed sdrt as an extension of drt, Discourse Representation Theory (Kamp and Reyle 1993), to account for specific properties of discourse. Thus, a discourse structure in sdrt, called an sdrs, gets a box representation à la drt. In box representations, the distinction between coordinating and subordinating discourse relations is not taken into account. However, the theory, especially the “logic of information packaging”, makes heavy use of this distinction, which is crucial in order to give sdrss hierarchical structures represented as graphs. Let us introduce these graphical representations.
For a discourse made up of two sentences (clauses) linked by a discourse relation R, the nodes of the sdrs graph are the labels π1 and π2 of the drss giving the semantic representations of the two sentences. They are linked by an arrow labelled by the discourse relation R. The arrow “is horizontal with the newer constituent to the right if R is coordinating, while it is vertical (oblique) with the newer constituent below if R is subordinating” (Asher and Vieu 2005). Taking into account the Nucleus/Satellite type of arguments, this means that a horizontal arrow links two Nuclei, while a vertical arrow goes from a Nucleus down to a Satellite. It is supposed that the Nucleus of a subordinating relation always precedes the Satellite. We will come back to this simplification in Section 3.4.
In addition to nodes representing sentences (noted πi and called “sentence nodes”), sdrs graphs include “scope nodes” (noted πʹ, πʹʹ, …). In the box representation scheme, a scope node labels a sub-sdrs. In the graph scheme, a scope node is linked by lines (and not arrows) to sentence nodes. Figure 3 illustrates these two representation schemes for the sdrs of (1) (these diagrams are taken from (Asher and Lascarides 2003)). The notation Kπi stands for the drs representing the ith sentence.
Figure 3. sdrs for (1) in a box representation and as a graph. Box representation: the outer box declares π1 and πʹ, with π1 : Kπ1 and Elaboration (π1, πʹ); πʹ labels a sub-sdrs declaring π2, π5 and πʹʹ, with π2 : Kπ2, π5 : Kπ5, Narration (π2, π5) and Elaboration (π2, πʹʹ); πʹʹ labels a sub-sdrs declaring π3 and π4, with π3 : Kπ3, π4 : Kπ4 and Narration (π3, π4). Graph: a vertical Elaboration arrow from π1 down to πʹ; lines from πʹ to π2 and π5, which are linked by a horizontal Narration arrow; a vertical Elaboration arrow from π2 down to πʹʹ; lines from πʹʹ to π3 and π4, which are linked by a horizontal Narration arrow.
It should be noted that Narration (π2, π5) and Elaboration (π2, πʹʹ) are on an equal footing in the sub-sdrs labelled πʹ in the box representation, because no difference is made there between coordinating and subordinating discourse relations. However, this is not the case in the graph: the scope node πʹ immediately dominates the two Nuclei π2 and π5 of the coordinating relation Narration, while it immediately dominates only the Nucleus π2 of the subordinating relation Elaboration (it dominates the Satellite πʹʹ but does not immediately dominate it2). This asymmetry between coordinating and subordinating relations in sdrs graphs can be considered as equivalent to the asymmetry present in the Mixed Principle in the framework of rst (Section 2.1).
3.2 Topic nodes
In addition to sentence and scope nodes, an sdrs can also include “topic nodes” (noted π*, π**, …) which are constructed nodes used for representing the theme of several constituents when this theme is not linguistically realized. These topic nodes are mainly introduced for constituents linked by a coordinating relation. As an illustration, consider the discourse obtained by deleting the first sentence of (1). Its sdrs graph is exactly like the one for (1) except that π1 is replaced by a topic node π*: the content of π* is Fred’s evening. The introduction of topic nodes is mainly motivated by the Right Frontier Constraint.
3.3 Right Frontier Constraint
The notion of “right frontier” was originally proposed by (Polanyi 1988). In an sdrs graph for a discourse with n sentences (clauses), the right frontier contains the node πn representing the last sentence, and the sentence and topic nodes which are on the rightmost branch of the graph and dominate πn. In the graph of Figure 3, the right frontier contains the nodes π5 and π1. In the dynamic construction of an sdrs, via an incremental update process, the discourse constituents on the right frontier are the only available sites for attachment of new information. This is known as the “Right Frontier Constraint”. Moreover, this constraint states that the antecedent of an anaphoric expression must be on the right frontier (and “DRS-accessible”). The notion of right frontier is therefore crucial in sdrt. As it relies on the distinction between coordinating and subordinating relations, this distinction has been widely discussed (Asher and Vieu 2005), (Prévot and Vieu 2005). It is not in the scope of this paper to present all these discussions. Let us just say that these authors consider that a given discourse relation may have a type only by default, which can be revised in context. For example, (Asher and Vieu 2005) propose that the relation Result is coordinating by default, but that it becomes subordinating in some contexts.
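The definition of the right frontier given above can be made concrete with a small Python sketch. It is only an illustration under explicit assumptions: the dominance edges are those read off the sdrs graph for (1) as described in Section 3.1 (vertical arrows of subordinating relations plus lines from scope nodes), the names `IMMEDIATELY_DOMINATES`, `SCOPE_NODES`, `dominators` and `right_frontier` are hypothetical, and the code assumes that every node dominating the last sentence lies on the rightmost branch, as is the case when the graph is built incrementally.

```python
# Immediate dominance in the sdrs graph for (1): vertical arrows of
# subordinating relations and lines from scope nodes. Horizontal arrows
# (coordinating relations) do not contribute to dominance.
IMMEDIATELY_DOMINATES = {
    "π1": ["π'"],          # Elaboration(π1, π')
    "π'": ["π2", "π5"],    # scope node π' over Narration(π2, π5)
    "π2": ["π''"],         # Elaboration(π2, π'')
    "π''": ["π3", "π4"],   # scope node π'' over Narration(π3, π4)
}
SCOPE_NODES = {"π'", "π''"}


def dominators(node, edges):
    """All nodes that dominate `node` (transitive closure of the edges)."""
    found = set()
    pending = [node]
    while pending:
        current = pending.pop()
        for parent, children in edges.items():
            if current in children and parent not in found:
                found.add(parent)
                pending.append(parent)
    return found


def right_frontier(last, edges, scope_nodes):
    """Last sentence node plus the sentence/topic nodes dominating it."""
    return {last} | (dominators(last, edges) - scope_nodes)


print(sorted(right_frontier("π5", IMMEDIATELY_DOMINATES, SCOPE_NODES)))
```

For the complete graph of (1) this yields π1 and π5, as stated in the text. Applied to the graph as it would stand before π5 is attached (an assumption about the intermediate state), the same function returns π4, π2 and π1.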
2. Dominance is the transitive closure of immediate (direct) dominance.
3.4 Subordinating conjunctions and linear order
Subordinating conjunctions have been largely ignored in sdrt, in which the focus has been put on inferring the discourse relations which are not lexicalized by a discourse connective. As a result, preposed subordinate clauses, which appear before the main clause, are ignored. Hence the following simplification: it is assumed that the Nucleus always precedes the Satellite (see the quotation “vertical arrows for subordinating relation with the newer constituent below” in Section 3.1). This simplification is not made in rst: the Satellite of a subordinating relation follows or precedes the Nucleus. As we don’t want to anticipate how preposed subordinate clauses should be handled in sdrt, we limit the rest of this study (the goal of which is to compare the generative capacity of rst and sdrt) to the cases where the Satellite of a subordinating relation follows the Nucleus.
3.5 Summary on rst and sdrt, discourses in the canonical order
From this short presentation of rst and sdrt, it should be clear that the two theories do not study exactly the same set of phenomena. To carry on the comparison between the discourse structures they propose, we must limit ourselves to discourses which have been studied in both theories. That is the reason why any discourse including a preposed subordinate clause is put aside, since we have just seen that such subordinate clauses are ignored in sdrt. So the rest of this paper concerns discourses in the “canonical order”, i.e., discourses of the form S1 (Conn1) S2 … Si (Conni) Si+1 … Sn, which count n sentences, in fact n clauses, noted Si, and with no preposed subordinate clauses. The clauses are linked by optional discourse connectives noted Conni which appear in the initial position of their host sentence. It is assumed that Si includes no discourse connective: this means that cases with multiple discourse connectives are put aside (they are discussed in (Webber et al. 2001)). Discourses in canonical order are also supposed to involve no discourse relation such as Attribution. This relation raises difficulties concerning the linear order of its arguments when one of them is embedded in the other one. It is discussed in (Redeker and Egg 2006) in the framework of rst and in (Hunter et al. 2006) in the framework of sdrt. The representation of Si, whatever it is, is noted πi. A connector Conni lexicalizes a discourse relation noted Ri. If Conni is not present, it is all the same assumed that there is a discourse relation Ri. In fact, there could be several discourse relations between two constituents if they are of the same type, i.e., either all coordinating or all subordinating. This is authorized in sdrt but not in rst.
It can easily be authorized in rst without any drastic change: a node noted Ri /Rʹi in an rst tree can be used to indicate that the two relations Ri and Rʹi hold between two constituents. As Ri and Rʹi are of the same type, there is no problem with the Nucleus/Satellite type of their arguments. In sdrs graphs, a horizontal arrow can be labelled Ri /Rʹi if Ri and Rʹi are both coordinating, and a vertical arrow can be labelled Ri /Rʹi if Ri and Rʹi are both subordinating.
It is clear that the discourse structures proposed in rst and sdrt are quite different; however, these two theories share the fact that they rely upon discourse relations and the distinction Nucleus/Satellite or coordinating/subordinating. Do they rely upon the same set of discourse relations and do they give them the same type? The answer to these two questions is roughly affirmative. In fact, (Asher 1993) started from the discourse relations proposed in (Mann and Thompson 1988) and even if there exist a few differences in the set of discourse relations used in rst and sdrt, they are not relevant for the study proposed here. Concerning the type given to a discourse relation, it is the same in rst and sdrt in most cases. The well-known exception is the relation Result, which is subordinating in rst and coordinating (by default) in sdrt. As a consequence, no example involving Result will be presented in the rest of this paper. We are now going to present another mode of representation for discourse structures, which is inspired by dependency grammars and which doesn’t rely upon the distinction between coordinating and subordinating relations.
4. Discourse dependency dags
Among dependency grammars, the most famous theory is probably mtt (Meaning-Text Theory) (Mel’cuk 1988), designed for sentence generation but adapted to sentence analysis by (Kahane 2001). It involves three levels of representation: semantic, syntactic and morphological. We propose below an extension of the semantic level for discourse, and compare the discourse structures obtained to those proposed in rst and sdrt. The core of the semantic level in mtt is a directed labelled graph in which nodes are “semantemes”, either lexical or grammatical. A lexical semanteme represents a meaning of a word (e.g., ‘bake1’ (a potato) and ‘bake2’ (a cake) are two lexical semantemes for the verb bake). A semanteme is viewed as a predicate which is linked to its arguments (if any) by arrows pointing towards them. These arrows are labelled with numbers which distinguish the arguments. For discourse, it can be stated that discourse relations are semantemes when they are lexicalized by a discourse connective. In this perspective, a discourse relation corresponds to the meaning of a discourse connective or to one of its meanings. By extrapolation, discourse relations which are not lexicalized are also considered as semantemes. Two sentences linked by a discourse relation R thus receive the same representation as in rst, namely a binary tree whose root is R and whose leaves are the sentence representations. However, there is a crucial difference between semantic dependency graphs and rst representations, namely the tree structure of the graphs: a semantic dependency graph is not always tree shaped, while rst representations are necessarily tree shaped (Section 2.1). This difference comes from the way predicate-argument relations are computed. In semantic dependency graphs, whether they represent a sentence as in mtt or a discourse in the mtt extension proposed here, predicate-argument
relations are computed in a simple and standard way: the arguments of a predicate (e.g., a discourse relation) are always its daughters. There is no equivalent to the Nuclearity Principle used in rst (Section 2.1). For example, the dependency graph for (1) (in fact a dag, see below) is shown in Figure 4. In this graph, π2 has two parents, which straightforwardly translates the fact that π2 is an argument of two discourse relations. On the other hand, it must be remembered that this fact is not graphically visible in the rst tree for (1) (see Figure 2); it requires the help of the Nuclearity Principle.
Convention: In this paper, any rst tree must be interpreted with the Mixed Principle, any dependency graph with the standard interpretation. To avoid confusion, edges are graphically represented as lines in rst trees and as arrows in semantic dependency graphs.
What are the constraints which govern dependency graphs representing discourse structures? First, it is assumed that they are acyclic. Therefore, these dependency graphs are dags (Directed Acyclic Graphs). In these dags, it is assumed that the leaves, projected on a horizontal line, are ordered, as is the case for the leaves of an rst tree. It is also assumed that any non terminal node has exactly two daughters, which comes from the fact that a non terminal node is a discourse relation with two arguments (see note 1). Second, from our knowledge of the semantics of discourse connectives, two (minimal) constraints, noted C1 and C2, can be postulated for discourses in the canonical order (in particular, without preposed subordinate clauses, see Section 3.5) of the form S1 (Conn1) S2 … Si (Conni) Si+1 … Sn. Constraint C1 states that the first argument of a discourse connective Conni is on the left of Conni. Constraint C2 states that a sentence Si+1 introduced by a discourse connective Conni is under the scope of this connective.
By extrapolation, I postulate that these two constraints also hold when a discourse relation Ri is not lexicalized by a discourse connective. These constraints are formulated as follows in semantic dependency dags.
Figure 4. Dependency graph for (1). With edge labels N and S: Elaboration (N: π1, S: Narration), this Narration having the daughters π2 (N) and π5 (N); Elaboration (N: π2, S: Narration), this Narration having the daughters π3 (N) and π4 (N). π2 thus has two parents.
Constraint C1: the first argument of Ri is the representation of a text span which appears on the left of (Conni) Si+1.
Constraint C2: the second argument of Ri is the representation of a text span which starts at πi+1 (this text span can be reduced to πi+1). In terms of dominance, C2 means that Ri dominates πi+1.
Let us show that constraints C1 and C2 are respected both in rst and sdrt, which is not a surprise since they are minimal. In rst, the “adjacency principle” is postulated. It states that the arguments of a discourse relation expressed through a discourse connective are given by continuous text spans which are adjacent to the discourse connective (Mann and Thompson 1987). The adjacency principle is also postulated when Ri is not lexicalized. This principle makes no distinction between the first and second argument of a discourse relation. More precisely, it is equivalent to constraints Cʹ1 and C2, in which Cʹ1 mirrors C2 (Cʹ1 is thus a constraint which is stronger than C1).
Constraint Cʹ1: the first argument of Ri is the representation of a text span which ends at πi (this text span can be reduced to πi). In terms of dominance, Cʹ1 means that Ri dominates πi.
Constraints Cʹ1 and C2 are used in (Egg and Redeker 2007) (this volume) for underspecified discourse representations in the framework of rst. For a discourse in the canonical order with n sentences, the underspecified representation they propose is given in Figure 5 (a dotted line represents dominance, a solid line immediate dominance). This underspecified representation respects exactly constraints Cʹ1 and C2.
Figure 5. Underspecified rst representation from (Egg and Redeker 2007). The nodes are the relations R1, R2, …, Rn-1 and the leaves π1, π2, π3, …, πn-1, πn.
In sdrt, constraints Cʹʹ1 (below) and C2 can be inferred from the incremental updating process. In a simplified way, when dealing with the current sentence Si+1, the underspecified condition ?R(α, πi+1) holds, in which ?R is a discourse relation variable, which will be specified as Ri in our notation, and α is an attachment site which must be on the right frontier of the sdrs graph representing the left-context of (Conni)Si+1 (Section 3.3). Therefore, a constraint stronger than C1 holds for the first argument of Ri, namely Cʹʹ1. Constraint Cʹʹ1: the first argument of Ri is a text span which appears on the right frontier of the sdrs graph representing the left context of (Conni) Si+1.
Laurence Danlos
For the second argument of Ri, the underspecified condition ?R(α, πi+1) states that πi+1 is the second argument of Ri. However, later on in the non-monotonic updating process, this condition may be revised so that Ri only dominates πi+1. This means that C2 is also respected in sdrt.
In conclusion, dependency graphs representing discourse structures are ordered dags, in which non-terminal nodes are discourse relations with two daughters. The predicate-argument relations respect a heavy constraint C2 on the right argument and a weaker constraint C1 on the left argument. rst and sdrt also respect C2 for the second (right) argument, while these discourse theories respect respectively Cʹ1 and Cʹʹ1 for the first (left) argument, these constraints being stronger than C1. From these observations, one can expect the dependency dag formalism to be more powerful in strong generative capacity than rst and sdrt. This will indeed be shown in Section 5. Constraints Cʹ1 and Cʹʹ1 cannot be directly compared without taking into account other rst or sdrt constraints (e.g., the constraint stating that rst structures must be tree shaped). However, it will be shown in Section 5 that rst is less powerful in strong generative capacity than sdrt.
Let us underline the following point: the only constraints which hold on discourse dependency dags are C1 and C2, which don’t involve the coordinating/subordinating distinction that is so crucial in rst and sdrt. The next section will examine the consequences of this fundamental difference.
To illustrate the constraints on dependency dags, let us examine the possible dags for discourses in the canonical order with three sentences, a case which will be thoroughly studied in Section 5. The possible dags have three ordered leaves, π1 < π2 < π3, and two non-terminal nodes R1 and R2. The heavy constraint C2 means that the second argument of R1 must start at π2, and that π3 is compulsorily the second argument of R2.
The weaker constraint C1 means that π1 is compulsorily the first argument of R1. These constraints leave us with four possible non-labelled dags,3 which are shown in Figure 6. Two of them are non tree shaped (one in which π1 has two parents, the other one in which π2 has two parents), the two others are tree shaped.
Figure 6. Non-labelled dags for a discourse with three sentences in the canonical order, respecting constraints C1 and C2.
3. Non-labelled graphs do not take into account the labels (N or S) on the edges.
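The count of four non-labelled dags can be checked by brute force. The sketch below is only one possible formalization: it assumes that C1 means "every leaf under the first argument of Ri is among π1..πi" and that C2 means "the leftmost leaf under the second argument of Ri is πi+1". The names `p1`, `p2`, `p3`, `leaves_of` and `respects_c1_c2` are hypothetical and introduced here for illustration.

```python
from itertools import product

LEAVES = ("p1", "p2", "p3")  # stand-ins for the ordered leaves π1 < π2 < π3
RELS = ("R1", "R2")          # the two discourse relations

def leaves_of(node, dag, seen=()):
    """Return the leaf indices dominated by node, or None if a cycle is met."""
    if node in LEAVES:
        return {LEAVES.index(node) + 1}
    if node in seen:
        return None
    dominated = set()
    for arg in dag[node]:
        sub = leaves_of(arg, dag, seen + (node,))
        if sub is None:
            return None
        dominated |= sub
    return dominated

def respects_c1_c2(dag):
    """Check the assumed readings of C1 and C2 for every relation Ri."""
    for i, rel in enumerate(RELS, start=1):
        first, second = dag[rel]
        left, right = leaves_of(first, dag), leaves_of(second, dag)
        if left is None or right is None:
            return False            # cyclic graph, not a dag
        if max(left) > i:           # C1: first argument lies left of S(i+1)
            return False
        if min(right) != i + 1:     # C2: second argument starts at π(i+1)
            return False
    return True

def candidate_dags():
    """Every way of giving R1 and R2 two arguments (leaves or the other relation)."""
    r1_args = LEAVES + ("R2",)
    r2_args = LEAVES + ("R1",)
    for a1, a2, b1, b2 in product(r1_args, r1_args, r2_args, r2_args):
        yield {"R1": (a1, a2), "R2": (b1, b2)}

admissible = [dag for dag in candidate_dags() if respects_c1_c2(dag)]
for dag in admissible:
    print(dag)
```

Under these assumptions the filter keeps the two tree shaped dags and the two dags in which π1 or π2 has two parents, and it rejects a dag in which R1 takes π3 as second argument.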
A crucial point is that constraint C2 excludes the non tree shaped dag in which π3 has two parents, namely the dag in Figure 7. This dag is excluded since the second argument of R1 does not start at π2 (in other words R1 does not dominate π2), and so constraint C2 is not respected. There seems to exist no felicitous discourse whose structure is the dag in Figure 7. As explained in (Danlos 2004), this is justified on psycholinguistic grounds: what would be a discourse in which the second sentence is not linked at all to the first one?4
Figure 7. Non-labelled dag which does not respect constraint C2.
Summary of the representations for discourse structures: Taking as illustration the discourse in (1), we have examined three representations for discourse structures: rst trees, sdrt graphs and dependency dags. We are going to compare the strong generative capacity of these three formalisms.
5. Strong generative capacity
To compare the strong generative capacity of the three formalisms under study, I start with quite a simple case of discourses in the canonical order, namely discourses with three sentences (clauses) noted here as S1(Conna) S2(Connb) S3. Their discourse structures involve three sentence representation nodes noted πi and two discourse relations noted Ra and Rb. The methodology is the following: I begin with rst trees, since rst is the most constraining formalism. Next I move to semantic dependency dags, since this formalism is the least constraining one. This will allow me to situate sdrt in between these two formalisms.
4. The link between the first two clauses of a discourse can be given by a third sentence, as in (i) below in which S3 establishes a joint link between S1 and S2 through its plural subject. See also (ii) in which the first two sentences establish the setting for the third sentence. These discourses must be orally uttered with a special intonation.
(i) It is raining. Fred arrived late. These two facts irritated Mary.
(ii) The sun was shining. Nice music was playing on the radio. Fred woke up in a good mood that day.
5.1 rst trees and their equivalents in the other representations
For discourses in the canonical order of the form S1(Conna) S2(Connb) S3, rst trees must have three ordered leaves (π1 < π2 < π3) and two internal nodes (Ra and Rb). To respect the tree structure, one internal node must be the daughter of the other one. Therefore, there exist only two non-labelled binary trees5, namely either Rb(Ra(π1, π2), π3) or Ra(π1, Rb(π2, π3)). These two trees lead to eight rst trees with the edges labelled. First, the four cases with an embedded subordinating discourse relation are examined; next, the four other cases, with an embedded coordinating relation, are examined.
Cases with an embedded subordinating relation: In Table 1, the first row shows the four rst trees with an embedded satellite-nucleus relation, namely (Ia)–(IVa). The second row shows the equivalent dependency dags, namely (Ib)–(IVb). None of these dags is tree shaped. This fits with the fact that the Nuclearity Principle is involved to compute the predicate-argument relations in the rst trees (Ia)–(IVa) since the embedded relation is subordinating. For example, in (Ia), the Nucleus of Rb is π1, which is also the Nucleus of Ra; hence the dependency dag (Ib), in which π1 has two parents.
The third row in Table 1 shows the equivalent sdrt graphs. Let us describe the sdrt graph in (Ic) with the help of the predicate-argument relations visible in (Ib): starting from π1, there is a vertical arrow pointing towards π2 and labelled with the subordinating relation Ra. Next, from π1, there is a horizontal arrow pointing towards π3 and labelled with the coordinating relation Rb. The scope label πʹ immediately dominates π1 and π3, and dominates π2.
The four discourse structures (I)–(IV) given in Table 1 can all be linguistically realized in felicitous discourses, for example in the discourses given in (2).
(2) a. Fred was badly sick last week. He had a bad flu. However, he is in good shape this week.
Structure (I) with Ra = Elaboration and Rb = Contrast
b. Fred is badly sick. He probably has the flu. He took a walk in the rain yesterday.
Structure (II) with Ra = Elaboration and Rb = Explanation
c. Fred is in good shape. However, Mary is badly sick. She has a bad flu.
Structure (III) with Ra = Contrast and Rb = Elaboration
d. Fred was in a foul mood. He hadn’t slept well that night. His electric blanket hadn’t worked.6
Structure (IV) with Ra = Explanation and Rb = Explanation
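The way predicate-argument relations are read off an rst tree in this section can be sketched in a few lines. This is a hedged illustration, not the author's implementation: it assumes that a subordinating daughter contributes its Nucleus (Nuclearity Principle), that a coordinating daughter contributes the whole sub-tree, and that the Nucleus is the left daughter, as in discourses in the canonical order. The names `Rel`, `argument` and `dependencies` are hypothetical.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Rel:
    """A binary rst node; leaves are plain strings such as 'pi1'."""
    name: str
    left: "Node"
    right: "Node"
    subordinating: bool  # True: nucleus-satellite, False: multi-nuclear

Node = Union[str, Rel]

def argument(node):
    """What a dominating relation takes as argument when node is its daughter."""
    if isinstance(node, str):
        return node                      # a leaf stands for itself
    if node.subordinating:
        return argument(node.left)       # Nuclearity Principle: pass the Nucleus up
    return node.name                     # coordinating: the sub-tree is the argument

def dependencies(tree):
    """Map each relation name to its (first, second) arguments."""
    deps = {}
    def walk(node):
        if isinstance(node, Rel):
            deps[node.name] = (argument(node.left), argument(node.right))
            walk(node.left)
            walk(node.right)
    walk(tree)
    return deps

# Tree shape Rb(Ra(pi1, pi2), pi3) with Ra subordinating, as in (Ia).
tree_Ia = Rel("Rb", Rel("Ra", "pi1", "pi2", True), "pi3", False)
# Same shape with Ra coordinating, as in (Va).
tree_Va = Rel("Rb", Rel("Ra", "pi1", "pi2", False), "pi3", False)

print(dependencies(tree_Ia))
print(dependencies(tree_Va))
```

With a subordinating Ra, both relations take π1 as first argument, so π1 has two parents; with a coordinating Ra, Rb takes the sub-tree rooted at Ra and the result stays tree shaped.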
Cases with an embedded coordinating relation: In Table 2, the first row shows the four rst trees with an embedded multi-nuclear relation, namely (Va)–(VIIIa).
5. As explained in note 3, non-labelled graphs don’t take into account the labels N and S on the edges, that is the coordinating vs subordinating nature of discourse relations.
6. This discourse is taken from (Hobbs 1979).
Table 1. rst trees for S1(Conna) S2(Connb) S3 with an embedded subordinating relation, and their equivalent dependency dags and sdrt graphs.
[Columns: structures (I)–(IV). Rows: RST trees (Ia)–(IVa), dependency DAGs (Ib)–(IVb), SDRT graphs (Ic)–(IVc). The diagrams are not reproduced here.]
The second row shows the equivalent dependency dags, namely (Vb)–(VIIIb), which are all tree shaped. The rst trees and dependency dags are quite similar: graphically, they differ only by the edges which are lines in rst trees and arrows in dependency dags, according to the convention presented in Section 4. This similarity comes from the fact that the predicate-argument relations in the rst trees (Va)–(VIIIa) are computed in the standard way (i.e., without involving the Nuclearity Principle). For example, in (Va), the Nucleus of Rb is the sub-tree rooted at Ra, hence the dependency dag (Vb). The third row in Table 2 shows the equivalent sdrt graphs. In these graphs, the topic nodes are omitted for the sake of simplicity: the scope nodes are supposed to play their roles on the right frontier for the attachment of new information.
Table 2. rst trees for S1(Conna) S2(Connb) S3 with an embedded coordinating relation, and their equivalent dependency dags and sdrt graphs.
[Columns: structures (V)–(VIII). Rows: RST trees (Va)–(VIIIa), dependency DAGs (Vb)–(VIIIb), SDRT graphs (Vc)–(VIIIc). The diagrams are not reproduced here.]
The four discourse structures (V)–(VIII) given in Table 2 can all be linguistically realized in felicitous discourses, for example in (3).
(3) a. Fred ate a big salmon. He also devoured a lot of cheese. On the other hand, Mary missed her dinner.
Structure (V) with Ra = Parallel/Narration7 and Rb = Contrast
b. Fred ate a big salmon. He also devoured a lot of cheese. This was a fantastic meal.
Structure (VI) with Ra = Parallel/Narration and Rb = Comment8
c. Fred missed his dinner. On the other hand, Mary ate a big salmon. She also devoured a lot of cheese.
Structure (VII) with Ra = Contrast and Rb = Parallel/Narration
d. Fred had a fantastic meal. He ate a big salmon. He also devoured a lot of cheese.
Structure (VIII) with Ra = Elaboration and Rb = Parallel/Narration
7. The notation Ra = Parallel/Narration has been introduced in Section 3.5. It means that the two coordinating relations Parallel and Narration can be inferred to link π1 and π2.
In conclusion, for discourses in the canonical order of the form S1(Conna) S2 (Connb) S3, rst allows exactly eight discourse structures. These eight rst trees correspond to dependency dags and sdrt graphs which are authorized in these two formalisms. They can all be linguistically realized in felicitous discourses.
5.2 Dependency dags without any equivalent rst tree
We have shown in Section 4 that there are four non-labelled dags for discourses in the canonical order of the form S1(Conna) S2(Connb) S3, respecting constraints C1 and C2. When the edges are labelled, sixteen dags are obtained (four for each non-labelled dag). Eight of these dags have already been examined in the previous section: dags (Ib)–(VIIIb) in Tables 1 and 2 with an equivalent rst tree. We are left with the eight other dags without any equivalent rst tree, which thus correspond to discourse structures excluded in rst. We are going to examine whether these discourse structures are also excluded in sdrt or not, and study their linguistic realization. We start with the four non tree shaped dags.
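The bookkeeping behind "sixteen dags, eight of them with an equivalent rst tree" can be made explicit. The sketch below is an illustration under stated assumptions: each relation is either coordinating or subordinating, and an rst tree with a subordinating embedded relation yields the non tree shaped dag in which the Nucleus of that relation has two parents, while a coordinating embedded relation yields the tree shaped dag of the same shape. The shape names and function names are hypothetical.

```python
from itertools import product

# The four non-labelled dags, as (arguments of Ra, arguments of Rb).
SHAPES = {
    "tree, Ra embedded": (("pi1", "pi2"), ("Ra", "pi3")),
    "tree, Rb embedded": (("pi1", "Rb"), ("pi2", "pi3")),
    "pi1 has two parents": (("pi1", "pi2"), ("pi1", "pi3")),
    "pi2 has two parents": (("pi1", "pi2"), ("pi2", "pi3")),
}
TYPES = ("coord", "subord")

def labelled_dags():
    """Every non-labelled dag combined with every typing of (Ra, Rb)."""
    return {(shape, ta, tb) for shape in SHAPES for ta, tb in product(TYPES, TYPES)}

def rst_equivalents():
    """Dags obtained from the eight labelled rst trees (assumed reading)."""
    found = set()
    for embedded, ta, tb in product(("Ra", "Rb"), TYPES, TYPES):
        embedded_type = ta if embedded == "Ra" else tb
        if embedded_type == "subord":
            # The Nucleus of the embedded relation is shared by both relations.
            shared = "pi1" if embedded == "Ra" else "pi2"
            shape = f"{shared} has two parents"
        else:
            shape = f"tree, {embedded} embedded"
        found.add((shape, ta, tb))
    return found

all_dags = labelled_dags()
with_rst = rst_equivalents()
without_rst = all_dags - with_rst

print(len(all_dags), len(with_rst), len(without_rst))
for dag in sorted(without_rst):
    print(dag)
```

Under these assumptions the eight dags left over split into four non tree shaped dags (the shared node is the first argument of a coordinating relation) and four tree shaped dags whose embedded relation is subordinating.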
5.2.1 Non tree shaped dags without any equivalent rst tree
Non tree shaped dags in which π1 has two parents: dags (IXb) and (Xb) in Table 3 differ from dags (Ib) and (IIb) in Table 1 by the fact that Ra is coordinating (and not subordinating). Hence the impossibility of obtaining rst trees (interpreted with the Mixed Principle) with the same predicate-argument relations. These dags convert into the sdrt graphs (IXc) and (Xc), which are excluded by the Right Frontier Constraint (Section 3.3): π3 cannot be attached to π1 which is not on the right frontier.
Non tree shaped dags in which π2 has two parents: dags (XIb) and (XIIb) in Table 3 differ from dags (IIIb) and (IVb) in Table 1 by the fact that Rb is coordinating (and not subordinating). In structure (XII), it is assumed that Ra ≠ Rb. In other words, this discourse structure does not involve a unique discourse relation linking three constituents.
8. Comment is subordinating both in rst and sdrt.
Table 3. Non tree shaped dependency dags without any equivalent rst tree, and their equivalent sdrt graphs (on a shaded background for those which are excluded by the theory).
[Columns: structures (IX)–(XII). Rows: dependency DAGs (IXb)–(XIIb), SDRT graphs (IXc)–(XIIc). The diagrams are not reproduced here.]
sdrt graph (XIc) is excluded by the “Continuing Discourse Pattern Constraint” which states that “coordinated constituents of a substructure must behave in a homogeneous fashion with respect to a dominating constituent” (Asher and Vieu 2005).9 In the non-monotonic updating process of sdrt, graph (XIc) is compulsorily transformed into graph (VIIIc) in Table 2, in which the two coordinated constituents are dependent on π1. On the other hand, sdrt graph (XIIc) is not excluded by any constraint.
9. A similar constraint has been put forward in syntax for coordination: (Sag et al. 1985), for example, state that two constituents can be coordinated only if they have the same syntactic function.
Remark on the arborescence of sdrt graphs and their projectivity: sdrt graphs never look like trees since they contain horizontal arrows (for coordinating relations). Nevertheless, one can disregard these horizontal arrows and examine the arborescence of sdrt graphs focusing on the relations between a mother and a daughter coming from a subordinating relation (graphically a vertical or oblique arrow) or a scope relation (graphically an oblique line). In this perspective, sdrt graphs (Ic)–(IXc) look like trees with a single root and a single parent for each node. However, this is not the case for (Xc)–(XIIc): in each of these graphs, one node has two parents, namely π1 in (Xc) and π2 in (XIc) and (XIIc). The fact that (XIIc) is not excluded means that an sdrt graph can be non tree shaped.
sdrt graph (IXc) is tree shaped (disregarding horizontal arrows) but it doesn’t respect the Right Frontier Constraint. Let us show that this sdrt tree is not “projective”.10 The notion of projectivity has been introduced in dependency grammars for syntax. First a definition: in a tree, the (maximal) projection of a node x, noted Proj(x), is the set of nodes dominated by x, x included. A syntactic dependency tree for a sentence is projective iff all the projections of words are continuous segments of the sentence (Lecerf 1961). Lecerf proved that a dependency tree is projective iff dependencies never cross each other and no dependency covers the root. As an illustration, if ω1, ω2, ω3, and ω4 are four words occurring in a projective dependency tree with the linear order ω1 < ω2 < ω3 < ω4, then it is not possible that ω1 be linked to ω3 and ω2 to ω4 (such a case is known as “crossing dependencies”).
The notion of projectivity can be straightforwardly extended to sdrt trees (i.e., sdrt graphs which are tree shaped disregarding horizontal arrows). The reader can check, for example, that the sdrt graph for (1) in Figure 3 is a projective tree. The sdrt graphs (Ic)–(VIIIc) are also projective trees. On the other hand, this is not the case for (IXc), which is not projective: Proj(π1) = {π1, π3} does not form a continuous segment since π2 intervenes between π1 and π3. In summary, (Ic)–(VIIIc) are projective trees and they respect the Right Frontier Constraint, while (IXc) is not projective and doesn’t respect the Right Frontier Constraint. More generally, it is possible to show that an sdrt tree is projective iff it respects the Right Frontier Constraint (Sylvain Kahane, pc).11
We are now going to examine how the discourse structures (IX)–(XII) given in Table 3 are linguistically realized.
We start with structure (IX), which raises questions on the status of anaphoric relations in discourse structures. Linguistic realization of (IX): The discourse (4) involves an anaphoric link between an indefinite NP (a salmon) in the first sentence and a definite NP in the third sentence (the salmon). It can be given structure (IX), represented as the dependency dag (IXb), with Ra = Parallel/Narration and Rb = Elaboration/Comment. This discourse is felicitous, and numerous examples following this pattern, in which the third sentence elaborates/comments an entity occurring in the first sentence, can be found in corpora.
(4) Fred ate a big salmon. He also devoured a lot of cheese. The salmon came from Norway.
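If (4) is given structure (IX), the corresponding sdrt tree is the non-projective one discussed above. The projectivity test can be sketched as follows. The encoding is an assumption made for illustration: horizontal arrows are disregarded, the scope label is a node without linear position, (IXc) is read as a scope node dominating π1 and π2 with π3 attached under π1, and (Ic) as a scope node dominating π1 and π3 with π2 attached under π1. The names `projection` and `is_projective` are hypothetical.

```python
def projection(node, children):
    """Proj(node): the set of nodes dominated by node, node included."""
    dominated = {node}
    for child in children.get(node, ()):
        dominated |= projection(child, children)
    return dominated

def is_projective(children, order):
    """True iff every projection covers a continuous segment of the linear order.

    Nodes absent from `order` (scope labels) have no position and are ignored
    when the continuity of a segment is checked.
    """
    position = {node: i for i, node in enumerate(order)}
    nodes = set(children) | {c for cs in children.values() for c in cs}
    for node in nodes:
        idx = sorted(position[n] for n in projection(node, children) if n in position)
        if idx and idx[-1] - idx[0] + 1 != len(idx):
            return False
    return True

ORDER = ["pi1", "pi2", "pi3"]
# Assumed reading of (IXc): pi3 attached under pi1, pi2 a sister of pi1.
IXC = {"scope": ["pi1", "pi2"], "pi1": ["pi3"]}
# Assumed reading of (Ic): pi2 attached under pi1, pi3 a sister of pi1.
IC = {"scope": ["pi1", "pi3"], "pi1": ["pi2"]}

print(sorted(projection("pi1", IXC)), is_projective(IXC, ORDER))
print(sorted(projection("pi1", IC)), is_projective(IC, ORDER))
```

In the first encoding the projection of π1 skips π2, so the test fails; in the second every projection is a continuous segment.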
As structure (IX) does not correspond to any rst tree (because of the Mixed Principle) and is excluded in sdrt (because of the Right Frontier Constraint), other analyses for (4) are proposed in these discourse theories.
10. I thank Sylvain Kahane for drawing my attention to projectivity issues.
11. This rule is valid only for sdrt trees such that the Nucleus of any subordinating relation occurs before the Satellite. This is the case for the sdrt trees studied here, which represent discourses in the canonical order (Section 3.5).
In the framework of rst, (Egg and Redeker 2007) (this volume) would give (4) the tree shaped structure (VIa) in Table 2, with again Ra = Parallel/Narration and Rb = Elaboration/Comment. In this tree, the Nucleus of Rb is the sub-tree rooted at Ra. This means that the anaphoric link between a salmon and the salmon is ignored in the discourse structure. More generally, Egg and Redeker claim that “anaphora can create relations between sentences that are not directly linked by discourse structure”. This position is not adopted by (Wolf and Gibson 2005): although these authors work in the rst framework, they would give (4) structure (IX) and use such examples as evidence against the arborescence of discourse structures.
In the framework of sdrt, the anaphoric definite expression the salmon in (4) violates the constraint stating that the antecedent of an anaphoric expression must be on the right frontier (Section 3.3). However, (Asher 2007) (this volume) claims that such a definite NP, which has presuppositional content, is accommodated with the following consequence: the referent for a salmon is introduced in the topic π* of π1 and π2 (see Section 3.2 for the notion of topic). π* is on the right frontier and π3 is attached to π*. The sdrt graph for (4) is shown in Figure 8, with Ra = Parallel/Narration and Rb = Elaboration/Comment. It respects the Right Frontier Constraint both for the attachment of π3 and for the antecedent of the salmon. As explained in Section 5.1, topic nodes are omitted in the diagrams of Table 2 for the sake of simplicity. If it were not the case, the diagram (VIc) in Table 2 would be replaced by the one in Figure 8. This amounts to saying that (Asher 2007) gives (4) structure (VI) (disregarding topic nodes), which is the structure advocated in (Egg and Redeker 2007) in the framework of rst.
Figure 8. sdrt graph proposed in (Asher 2007) for discourse (4).
To sum up, (4) is given either structure (VI) or (IX) according to the positions adopted for the status of anaphoric relations in discourse structures. We have seen that (IXc) is not projective. In syntax, it is well-known that most structures for English (or French) sentences are projective, but not all of them. If (4) is given structure (IX), then it can be stated that most structures for English (or French) discourses are projective, but not all of them.
What can be said about crossing dependencies in discourse? Consider discourses with four sentences (clauses) in which the third sentence elaborates an entity occurring in the first one (or elaborates the first one) and the fourth sentence elaborates an entity occurring in the second one (or elaborates the second one). Such discourses, which have been examined by (Stede 1999), (Wolf and Gibson 2005) and (Egg and Redeker 2007) (this volume), are illustrated in (5), with the anaphoric links a big salmon/the salmon and a lot of cheese/the cheese.
(5) Fred ate a big salmon. He also devoured a lot of cheese. The salmon came from Norway. The cheese came from France.
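Crossing dependencies were characterized above for four words ω1 < ω2 < ω3 < ω4: ω1 is linked to ω3 and ω2 to ω4. The same test can be transposed to the four sentences of (5). The sketch below assumes, for illustration only, that each anaphoric link of (5) is represented as an arc between sentence positions (1 to 3 and 2 to 4), alongside an arc between the first two sentences; `crossing_pairs` is a hypothetical helper.

```python
from itertools import combinations

def crossing_pairs(arcs):
    """Return the pairs of arcs (i, j) whose endpoints strictly interleave."""
    spans = [tuple(sorted(arc)) for arc in arcs]
    return [
        (a, b)
        for a, b in combinations(spans, 2)
        if a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]
    ]

# Assumed arcs for (5) when anaphoric links are part of the structure.
arcs_5 = [(1, 2), (1, 3), (2, 4)]
# Assumed arcs for (4) under the same position: no fourth sentence.
arcs_4 = [(1, 2), (1, 3)]

print(crossing_pairs(arcs_5))
print(crossing_pairs(arcs_4))
```

Arcs that merely share an endpoint are not counted as crossing, which is why the three-sentence discourse (4) yields no crossing pair under this encoding.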
If anaphoric links are taken into account in discourse structure (Wolf and Gibson’s position), then the dependency dag for (5) includes crossing dependencies, namely Elaboration (π1, π3) and Elaboration (π2, π4). On the other hand, Egg and Redeker (this volume) and Asher (pc.) would give (5) a structure which doesn’t reflect the anaphoric links.12
In conclusion, a definite anaphoric NP can have its antecedent a priori anywhere in its left context (as long as binding constraints are respected). Taking into account anaphoric links in discourse structures raises a problem for the arborescence of rst structures and for the Right Frontier Constraint in sdrt, and leads to crossing dependencies in dependency dags. Another solution, advocated by Egg in rst and Asher in sdrt, consists in not systematically representing anaphoric links in discourse structures. I have no conclusive argument to decide between these two solutions.
Linguistic realization of (X): Structure (X) in Table 3, in which two coordinating relations have the same constituent as first argument, violates the sdrt Right Frontier Constraint for the attachment of π3. It turns out that it is hard to imagine an example with this structure. Nevertheless, let us examine the discourse in (6a). It is a priori of structure (X) with also lexicalizing Ra = Parallel and next lexicalizing Rb = Narration. However, one can argue that the second sentence (Mary did too) is perceived as secondary information, which amounts to demoting the coordinating relation Parallel down to a subordinating relation, and therefore to giving (6a) structure (I) in Table 1 instead of structure (X). We are left with the cases where (6a) is part of a longer discourse in which Fred and Mary are on an equal footing, for example (6b).
(6) a. Fred ordered salmon. Mary did too. Next, he had an apple pie.
Structure (X) with Ra = Parallel and Rb = Narration
b. Fred and Mary went to a fancy restaurant last night. Fred ordered salmon. Mary did too. Next, he had an apple pie. On the other hand, Mary had cheesecake.
In (6b), the narrative of Fred and Mary’s dinner is organized following the dishes they ordered: first the main course, next the dessert. This narrative structure can be reflected in an sdrt graph in which the topic π* of the second and third sentences is Fred and Mary’s main course to which is attached the topic π** of the fourth and fifth sentences defined as Fred and Mary’s dessert. This discourse structure is given in Figure 9. In this structure, the sub-discourse (6a) is not given structure (X). In a nutshell, since (6a) illustrates the only pattern of felicitous discourses with structure (X) I can imagine, it would seem that this structure cannot be linguistically realized, provided that it is accepted that the coordinating/subordinating type of a discourse relation changes according to the context (for example, provided that it is accepted that the coordinating relation Parallel can be demoted down to a subordinating relation in context).
12. In this structure, the third and fourth sentences form a complex constituent which elaborates/comments the complex constituent made up of the first and second sentences (which are linked by the relation Parallel/Narration).
Figure 9. sdrt graph for the discourse in (6b).
Linguistic realization of (XI) and (XII): discourses (7a) and (7b) illustrate respectively structures (XI) and (XII).
(7) a. Mary is worrying because her eldest son has bad marks. Her youngest son also has bad marks (but she doesn’t care).
Structure (XI) with Ra = Explanation and Rb = Parallel
b. Fred tidied up his bedroom this afternoon. Mary did too. Next she went to the movies.
Structure (XII) with Ra = Parallel and Rb = Narration
In sdrt, the discourse in (7a) cannot be given structure (XI): sdrt graph (XIc) is transformed into graph (VIIIc) in Table 2 by the Continuing Discourse Pattern Constraint (see above). For (7a), this constraint amounts to inferring that Mary is also worrying about the bad marks of her youngest son, although this may be wrong (as suggested by the element we put into brackets but she doesn’t care). These data lead us to state that the Continuing Discourse Pattern Constraint could be maintained when the subordinating relation Ra is Elaboration, but should not be maintained when Ra is Explanation. This allows (7a) to be analyzed as (XIc).
We have seen that (XIc) and (XIIc) are not tree shaped (disregarding horizontal arrows) since the node π2 has two parents. The felicitous examples in (7) show that non tree shaped sdrt graphs must be authorized.
5.2.2 Tree shaped dags without any equivalent rst tree
In the four tree shaped dependency dags in Table 2, namely (Vb)–(VIIIb), the embedded relation is coordinating. What can be said about the discourse structures corresponding to the four other tree shaped dags with an embedded subordinating relation? Consider dag (XIIIb) in Figure 10 (this figure also includes the equivalent sdrt graph which is commented below). The Nucleus argument of Rb in this dag is the sub-tree rooted at Ra. This predicate-argument relation is not possible in (Marcu’s version of) rst because of the Nuclearity Principle which states that the Nucleus argument of Rb is π1.
Figure 10. dag (XIIIb) and its equivalent sdrt graph (XIIIc).
The discourse structure (XIIIb) can be felicitously realized, for example in discourse (8), in which the antecedent of the pronoun this is the interpretation of its left context, namely the causal relation linking the interpretations of the two first sentences.
(8) Fred is upset because his wife is abroad for a week. This proves that he does love her.
Structure (XIIIb) with Ra = Explanation and Rb = Comment
The discourse structure for (8) is not controversial. In the framework of rst, Egg (pc.) indeed does analyze (8) as (XIIIb) and acknowledges that this is a clear counter-example to the systematic applicability of the Nuclearity Principle. In sdrt, Asher (pc.) analyzes (8) as (XIIIc), i.e., as a graph including square brackets around π1, π2 and the arrow labelled Ra. These brackets mean that (XIIIc) must be interpreted as including a complex constituent formed by π1 and π2 linked by Ra. This is not yet formalized in the theory, in which only complex constituents formed around a coordinating discourse relation are formally handled in the present state of the formalism.
In summary, neither rst nor sdrt can accurately handle the structure of (8). This is explained by the fact that (8) does not fit with the basic principles of these theories which rely heavily on the coordinating/subordinating distinction and consider the relation Explanation as subordinating and thereby the explanation of a fact as a satellite of lower importance. Yet, in (8), the explanation given for Fred’s being upset is essential for the interpretation of the third sentence.13 The only formalism in which the structure of (8) does not raise any problem is the dependency dag formalism, which goes along with the fact that this formalism does not make any use of the coordinating/subordinating distinction. The three other tree shaped dags with an embedded subordinating relation raise the same questions as (XIIIb). So I will not comment on them in detail and I consider that they can all be linguistically realized in a felicitous way.
6. Summary and conclusion
First, we give an assessment of the strong generative capacity of rst, sdrt and the dependency dag formalism, for discourses in the canonical order with three sentences. There exist sixteen dependency dags which respect the minimal constraints C1 and C2 exposed in Section 4.
rst (in the version presented in Section 2) authorizes only eight discourse structures, namely structures (I)–(VIII) in Tables 1 and 2. These discourse structures can all be realized in felicitous discourses. However, rst is too restrictive: it excludes discourse structures which can be realized in felicitous discourses, as described below.
sdrt authorizes the eight discourse structures admitted by rst and the two structures (XI)14 and (XII), which are realized in felicitous discourses, cf. (7). sdrt excludes structures (IX) and (X) because of the Right Frontier Constraint. The question of whether structure (IX) can be realized in felicitous discourses depends (at least) upon the status that should be given to anaphoric links: should they be systematically reflected in discourse structures or not (cf. the discussion of examples (4) and (5))? Concerning structure (X), it seems hard to find a felicitous discourse realizing it (see the discussion of (6)). If it turns out that this structure cannot indeed be realized in felicitous discourses, this is a strong argument in favor of the Right Frontier Constraint for the attachment of new information and, thereby, a strong argument against the dependency dag formalism which cannot exclude structure (X).
13. Nevertheless, neither Egg nor Asher contemplates the solution consisting in promoting Explanation as a coordinating relation in an example such as (8).
14. If the Continuing Discourse Pattern Constraint is applied in a limited way, which means that (XIc) is not excluded.
We are left with structure (XIII) and the three other structures in which a subordinating relation forms a complex constituent. These structures can be realized in felicitous discourses, see (8). Only dependency dag formalism authorizes them.15 In conclusion, none of the three formalisms under study – rst, sdrt, and dependency dag formalisms – has the appropriate strong generative capacity. These formalisms are either too restrictive or too powerful. This conclusion may sound negative, however I hope this study will shed new light on the constraints which must hold on discourse structures and on the following open questions: should the distinction between the coordinating/subordinating type of discourse relations be kept? Should the type of a given discourse relation be fixed in a static way – which is easy to implement – or computed in a dynamic way according to the context – which is hard to implement and may give rise to vicious circles? What can be expected about the strong generative capacity of these formalisms for discourses in the canonical order with more than three sentences? The same results as those obtained for discourses in the canonical order with three sentences. This is because constraints are too restrictive in rst, not restrictive enough in dag formalism, and not totally adequate in sdrt, whatever the number of sentences. On the other hand, there is an unknown factor, namely the number of discourse structures which can be realized in felicitous discourses. Consider discourses in the canonical order with four sentences. There exist five non-labelled rst trees and twenty-five non-labelled dependency dags respecting constraints C1 and C2 for these discourses. This leads to forty labelled rst trees and two hundred labelled dependency dags. I cannot tell where the number of felicitous discourses stands between forty and two hundred. 
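The figures quoted for three and four sentences follow from simple arithmetic. The sketch below assumes that the non-labelled rst trees are the binary trees over n ordered leaves (counted by Catalan numbers) and that each of the n−1 relations can be labelled in two ways (coordinating or subordinating). The numbers of non-labelled dags, four and twenty-five, are taken from the text and are not recomputed here. The function names are hypothetical.

```python
from math import comb

def binary_tree_shapes(n_leaves):
    """Number of binary trees over n ordered leaves: the Catalan number C(n-1)."""
    k = n_leaves - 1
    return comb(2 * k, k) // (k + 1)

def labelled_structures(n_shapes, n_leaves):
    """Each of the n-1 relations is labelled coordinating or subordinating."""
    return n_shapes * 2 ** (n_leaves - 1)

# Three sentences: 2 non-labelled trees, 4 non-labelled dags (from the text).
print(binary_tree_shapes(3), labelled_structures(binary_tree_shapes(3), 3))
print(labelled_structures(4, 3))

# Four sentences: 5 non-labelled trees, 25 non-labelled dags (from the text).
print(binary_tree_shapes(4), labelled_structures(binary_tree_shapes(4), 4))
print(labelled_structures(25, 4))
```

The same two functions give the counts for longer discourses on the rst side; the number of non-labelled dags for more than four sentences is not stated in the text and would have to be enumerated separately.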
Future research is needed for discourses which are not in canonical order, for example, for discourses with preposed subordinate clauses or discourses with multiple discourse connectives in the same clause. Acknowledgments: I thank first Nicholas Asher (sdrt) and Markus Egg (rst) for their comments and the fruitful discussions we had in Malaga during ESSLLI’2006. I also thank Sylvain Kahane for his invaluable help on dependency structures. Finally, I thank André Bittar and Laure Vieu for their comments.
15. sdrt is ready to authorize them, but does not yet have the formal mechanism to do it.

References
Nicholas Asher and Alex Lascarides. 2003. Logics of Conversation. Cambridge University Press, Cambridge.
Nicholas Asher and Laure Vieu. 2005. Subordinating and coordinating discourse relations. Lingua, 115(4):591–610.
Nicholas Asher. 1993. Reference to Abstract Objects in Discourse. Kluwer, Dordrecht.
Nicholas Asher. 2007. Troubles on the right frontier. In A. Benz and P. Kühnlein, editors, Constraints in Discourse. John Benjamins.
Lynn Carlson, Daniel Marcu, and Mary Ellen Okurowski. 2003. Building a discourse-tagged corpus in the framework of rhetorical structure theory. In Jan van Kuppevelt and Ronnie Smith, editors, Current Directions in Discourse and Dialogue, pages 85–112. Kluwer Academic Publishers.
Laurence Danlos. 2004. Discourse dependency structures as constrained DAGs. In Proceedings of SIGDIAL’04, pages 127–135, Boston.
Laurence Delort. 2006. Clause ‘subordination’ and discourse relations. In Proceedings of the 28th Annual Meeting of the German Society for Linguistics (DGfS-06), Workshop on Subordination vs. Coordination in Sentence and Text from a Cross-linguistic Perspective, Bielefeld, Germany.
Markus Egg and Gisela Redeker. 2007. Underspecified discourse representation. In A. Benz and P. Kühnlein, editors, Constraints in Discourse. John Benjamins.
Jerry Hobbs. 1979. Coherence and coreference. Cognitive Science, 6:67–90.
Julie Hunter, Nicholas Asher, Brian Reese, and Pascal Denis. 2006. Evidentiality and intensionality: Two uses of reportive constructions in discourse. In Constraints in Discourse, pages 99–106, Maynooth, Ireland.
Sylvain Kahane. 2001. Grammaires de dépendance formelles et théorie sens-texte. In Proceedings of Tutoriel of TALN, pages 17–76, Tours, France.
Hans Kamp and Uwe Reyle. 1993. From Discourse to Logic. Kluwer Academic Publishers, Dordrecht.
Yves Lecerf. 1961. Une représentation algébrique de la structure des phrases dans diverses langues naturelles. Compte Rendu de l’Académie des Sciences, 252:232–34.
William Mann and Sandra Thompson. 1987. Rhetorical structure theory. In G. Kempen, editor, Natural Language Generation, pages 85–95. Martinus Nijhoff Publisher, Dordrecht.
William Mann and Sandra Thompson. 1988. Rhetorical structure theory: Toward a functional theory of text organization. Text, 8(3):243–281.
Daniel Marcu. 1996. Building up rhetorical structure trees. In The Proceedings of the 13th National Conference on Artificial Intelligence, pages 1069–1074, Portland.
Daniel Marcu. 2000a. The rhetorical parsing of unrestricted texts: A surface-based approach. Computational Linguistics, 26(3):395–448.
Daniel Marcu. 2000b. The Theory and Practice of Discourse Parsing and Summarization. The MIT Press.
Christian Matthiessen and Sandra Thompson. 1988. The structure of discourse and ‘subordination’. In John Haiman and Sandra Thompson, editors, Clause Combining in Grammar and Discourse, volume 18 of Typological Studies in Language, pages 275–329. John Benjamins, Amsterdam/Philadelphia.
Igor Mel’cuk. 1988. Dependency Syntax: Theory and Practice. State Univ. of NY Press, Albany.
Livia Polanyi. 1988. A formal model of the structure of discourse. Journal of Pragmatics, 12:601–638.
Laurent Prevot and Laure Vieu. 2005. The moving right frontier. In Proceedings of the Workshop on Constraints in Discourse, pages 136–142, Dortmund, Germany.
Gisela Redeker and Markus Egg. 2006. Says who? On the treatment of speech attributions in discourse structure. In Constraints in Discourse, pages 140–146, Maynooth, Ireland.
I. Sag, G. Gazdar, T. Wasow, and S. Weisler. 1985. Coordination and how to distinguish categories. Natural Language and Linguistic Theory, 3(2):117–171.
Manfred Stede. 1999. Rhetorical structure and thematic structure in text generation. In Proceedings of LORID’99, pages 44–50.
Manfred Stede. 2007. RST revisited: Disentangling nuclearity. In Subordination vs. Coordination in Sentence and Text from a Cross-linguistic Perspective.
Maite Taboada and William Mann. 2006a. Applications of rhetorical structure theory. Discourse Studies, 8(4):567–588.
Maite Taboada and William Mann. 2006b. Rhetorical structure theory: Looking back and moving ahead. Discourse Studies, 8(3):423–459.
Bonnie Webber, Alistair Knott, and Aravind Joshi. 2001. Multiple discourse connectives in a lexicalized grammar for discourse. In H. Bunt, R. Muskens, and E. Thijsse, editors, Computing Meaning, volume 2, pages 229–246. Kluwer Academic Press.
Florian Wolf and Edward Gibson. 2005. Representing discourse coherence: a corpus-based study. Computational Linguistics, 31(3):249–287.
Rhetorical distance revisited – A parameterized approach*
Christian Chiarcos1 and Olga Krasavina2
1University of Potsdam
2Moscow State University, Humboldt University of Berlin
This paper presents the notion of rhetorical distance within a parameterized framework for discourse-structural accessibility. Based on a minimal set of parameters, it allows for a comparative representation of different theories, and is capable of assessing discourse-structural effects beyond the binary accessibility criterion. This work is motivated by general heterogeneity of approaches that relate discourse structure and pronominalization. The proposed parameterized framework reconstructs three of these approaches on a common foundation. Finally, we outline the results of a comparative empirical study on predictions of these theories for the use of personal pronouns in German and English newspaper corpora.
* Christian Chiarcos’ work was supported by the DFG (German Research Foundation), Graduiertenkolleg 275, project number 5220 8303. Olga Krasavina’s work was supported by grant number 05-04-04240a from the Russian Foundation for the Humanities.

1. Introduction
For more than 20 years, the impact of discourse structure on anaphoric accessibility, especially on the use of pronouns, has been taken for granted. Beginning with Fox’s early empirical study (Fox 1987), a strong interaction between the embedding of utterances into discourse structure and anaphoric accessibility has been assumed. Later work provided further evidence that the structuring of discourse plays an important role in pronominalization; for instance, Givón (1990) demonstrated how discourse cues indicate discourse organization. Yet, Tetreault and Allen (2003) questioned the effect of discourse-structural organization on pronominal anaphora: a pronoun resolution algorithm based on discourse structure did not improve the output. These results can be interpreted in two ways: either discourse structure may be disregarded as far as its impact on anaphora is concerned, or a more fine-grained empirical analysis is due. To find a reasonable solution to this controversy, both a closer look at the theory and careful research are crucial.
In the present study, we focus on the following theories that relate discourse and anaphora: the stack model by Grosz and Sidner (1986) (GS), the veins theory by Cristea et al. (Cristea et al. 1998, Ide and Cristea 2000) (VT) and the approach of rhetorical distance by Kibrik (2000) and Kibrik and Krasavina (2005) (KK), from which we adopt the key term of this paper. The predictions of these approaches overlap to some extent, but there are also cases in which they diverge. As a consequence, a serious methodological problem emerges: one is no longer concerned with a general claim, but rather with its separate instantiations. In order to identify the contribution of discourse organization to the choice of referring expressions, we propose a general framework of discourse-structural accessibility. In this manner, we assess accessibility in a meta-theoretic and thus theory-neutral way.1 This study focuses on three main objectives:
1. Reduce three theories of accessibility to a common denominator.
(a) More precisely, we identify parameters of discourse-structural accessibility based on the notion of rhetorical distance between anaphor and antecedent.
(b) Further, we formalize this framework as a closed set of weighted parameters.
(c) Finally, we represent the theories as different weights of this set of parameters. By demonstrating how to reconstruct the approaches of Grosz and Sidner, Cristea et al. and of Kibrik and Krasavina within our framework, we prove its adequacy with respect to these theories.
2. Evaluate the predictive power of the theory reconstructions on an empirical basis, and
3. Compare it with the impact of referential distance, i.e., a measurement of distance between anaphor and antecedent based on the sequential structure of discourse.
This paper consists of two essential parts: a theoretical (sections 3 and 4) and an empirical (sections 4 and 5). Here, special emphasis is put on theoretical issues. In the first part, we consider sequential distance-based approaches vs. those based on discourse structure and outline our parameterized approach. In the second part, the results of two pilot studies are presented. Using corpus data of German and English newspaper texts, the Potsdam Commentary Corpus (Stede 2004) and the RST Discourse Treebank (Carlson et al. 2003), we investigate the predictive force of the respective theory reconstructions in the pronominalization task, i.e., the anaphoric use of pronouns in contrast to full descriptions.

1. Although we analyze discourse structure on the basis of yet another theory (RST), this approach is theory-neutral in that we abstract over specific theories of referential accessibility.
2. Background
2.1 Theories of discourse-structural accessibility
One of the main linguistic constraints on the use of pronouns, referential distance (RefD), was first examined in Givón (1983). Referential distance corresponds to the gap between a referent’s current location in the text and its last previous mention, i.e., the number of clauses between anaphor and antecedent; it is thus a measurement of sequential distance. However, Fox (1987) shows that Givón’s approach predicts pronouns in too many cases, because the influence of discourse-structural boundaries is ignored. For example, at the beginning of a new rhetorical unit, full NPs, i.e., grammatical devices with greater explicitness, are preferred over pronouns. The right-frontier principle (Polanyi 1988; Webber 1991; Asher 1993) formulates a necessary discourse-structural condition for pronominal reference (or anaphoric relations in general): anaphoric reference is possible only to those nodes that directly dominate (or structurally precede) the actual discourse segment. An alternative formulation of this insight was provided by Grosz and Sidner (1986). Here, anaphoric accessibility is a consequence of the interplay of attentional and intentional structure:
• The intentional structure induces a global hierarchical structuring of the text, where different discourse units are related by either coordinating satisfaction-precedence or subordinating dominance relations, both of which are mono-directional.
• The attentional structure is, then, defined as a stack induced by the intentional structure. It comprises the potential antecedents of anaphors occurring in the current segment, grouped together in “focus spaces”, where each focus space is associated with a dominating (or the actual) segment. Whenever a new segment is opened, the actual focus space is pushed onto the stack; when the segment is closed, it is popped.
Thus, focus spaces are ranked according to the number of dominance relations between their segment and the actual one. Accessibility of a discourse fragment for antecedent search depends on the depth of this fragment in the stack. Projected to the RST tree, dominance relations can be interpreted as subordinating relations in the RST, as suggested by Moser and Moore (1996), see below.
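The stack behaviour described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the names `FocusSpace`, `FocusStack`, `open_segment` and `close_segment` are hypothetical, the segments and referents are invented, and depth 0 stands for the focus space of the current segment.

```python
from dataclasses import dataclass, field

@dataclass
class FocusSpace:
    segment: str                                  # name of the discourse segment
    referents: set = field(default_factory=set)   # potential antecedents

class FocusStack:
    """Toy attentional state: one focus space per open segment."""

    def __init__(self):
        self.spaces = []

    def open_segment(self, segment):
        # Opening a segment pushes a fresh focus space onto the stack.
        self.spaces.append(FocusSpace(segment))

    def close_segment(self):
        # Closing a segment pops its focus space; its referents are no longer reachable.
        return self.spaces.pop()

    def mention(self, referent):
        # A referent is recorded in the focus space of the current segment.
        self.spaces[-1].referents.add(referent)

    def depth_of(self, referent):
        """Depth of the nearest focus space holding the referent, or None."""
        for depth, space in enumerate(reversed(self.spaces)):
            if referent in space.referents:
                return depth
        return None

stack = FocusStack()
stack.open_segment("DS1")
stack.mention("the appraiser")
stack.open_segment("DS1.1")        # a segment dominated by DS1
stack.mention("the house")
print(stack.depth_of("the house"), stack.depth_of("the appraiser"))
stack.close_segment()              # DS1.1 is finished and popped
print(stack.depth_of("the house"))
```

After the embedded segment is closed, its referents can no longer be found on the stack, while those of the dominating segment remain available at depth 0.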
Another important theory of discourse-structural accessibility is VT. It states that the accessibility of antecedents depends on so-called “veins” computed by special rules applied to an RST-like graph. The status of a discourse unit (nucleus or satellite) is essential. Relative order is considered as well, so the existence of a left sister node is relevant. The semantics of rhetorical relations is of no importance at all. In KK, the notion of rhetorical distance is applied, which is defined as the length of the shortest path pointing from an anaphor to its antecedent along the rhetorical tree. The theories presented so far share important common insights:
• Depending on rhetorical structure, a path is constructed (cf. the sequence of dominating segments in the stack model, veins).
• Then, a discourse segment is regarded as accessible iff it is included on this path.
• The distance between an anaphor and its potential antecedent on this path corresponds to the likelihood that it serves as an antecedent.
We suggest that theory-dependent accessibility judgments can be modeled by assigning weights to edges on this path depending on their respective types. Thus, such edge weights are interpreted as parameters of a generalized framework of discourse-structural accessibility.
2.2 RST in a nutshell Our study builds on Rhetorical Structure Theory (Mann and Thompson 1988), a descriptive framework for discourse-structural analysis of text. RST relies on the notion of text spans as building blocks of discourse. For these minimal text spans which can be considered as constituents of discourse, we will henceforth use the term elementary discourse unit (EDU) (Carlson et al. 2003). In syntactic terms, an EDU roughly corresponds to a syntactic clause. Adjacent text spans are connected by rhetorical relations that make up more complex text spans organized in a tree structure. The construction principles ensure RST analyses are trees with text spans (discourse segments) interpreted as nodes. EDUs, then, correspond to terminal nodes, while relations between nodes establish more complex discourse segments which are represented as non-terminal nodes. We refer to such a text span consisting of several sub-constituents as their parent, with the sub-constituents being its children. Accordingly, parents are located above their children in the graph. The central assumption of RST is the concept of nuclearity in discourse, i.e., the asymmetries between the text spans that make up a more complex structure. Some elements, called nuclei, are more important to the writer’s purpose, less easy to substitute and more necessary for understanding. Unlike nuclei, satellites can be replaced without any significant change to the text function. They depend in their meaning on other elements.
According to their nuclearity, two basic prototypes of schemata are distinguished:
• mononuclear rhetorical relations, or subordinating relations, with one nucleus and one satellite, forming an asymmetric group
• multinuclear rhetorical relations, or coordinating relations, with several nuclei but no satellites, forming a symmetric group
3. A parameterized framework for rhetorical distance
3.1 Trees and paths
Our approach builds on RST trees as representations of discourse structure. However, these are formalized as labeled graphs according to the following conventions (cf. Fig. 3 and Fig. 4):
• Nodes correspond to either an EDU or a discourse segment which is established by a discourse relation connecting discourse segments.
• Between nodes, directed edges are established pointing from a parent node (i.e., a discourse segment) to a child node (i.e., a subsumed discourse segment or an EDU).
• Edges receive labels which specify the relationship between parent node and child node. (We consider vertical edges only, i.e., relations between a child and a parent node in the rhetorical tree. No direct links between different child nodes are allowed.)
When constructing paths over these labeled trees, the following terminology applies:
• A node n contains a referring expression r if either
– n is a terminal node and the expression r occurs in the text of node n, or
– n is a non-terminal node and one of its direct children contains r.
• We treat edges as elementary building blocks of path construction.
• We say that an edge in the path points from one node n1 to another node n2 if
– n1 is the child node of n2 and n1 contains the anaphor (ascending edge), or
– n1 is the parent node of n2 and n2 contains the antecedent (descending edge).
• Nodes are ordered sequentially; thus we distinguish left and right children of a parent node.
The underlying intuition is to construct an acyclic path, i.e., a sequence of edges, pointing from one EDU α containing the anaphor to another EDU β containing a potential antecedent, henceforth represented as α → β. The edges that are not on the path, i.e., those pointing to or from text spans containing neither antecedent nor anaphor, are excluded. In path construction, each edge label is taken from the underlying labeled tree and augmented with a direction, i.e., ascending or descending.
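The path construction just described can be sketched as follows. The nested-tuple encoding and the function names are assumptions made for illustration; the tree is transcribed from the fragment shown in Fig. 4, and only binary trees are handled.

```python
# Labeled tree as nested tuples: (name, [(edge_label, subtree), ...]).
# EDUs are leaves with an empty child list.
TREE = ("39-44", [
    ("moln", ("39-42", [
        ("moln", ("39-40", [("moln", ("39", [])), ("mors", ("40", []))])),
        ("mors", ("41-42", [("mols", ("41", [])), ("morn", ("42", []))])),
    ])),
    ("mors", ("43-44", [("moln", ("43", [])), ("mors", ("44", []))])),
])

def edges_from_root(tree, target):
    """Edge labels on the way from the root down to `target`, or None if absent."""
    name, children = tree
    if name == target:
        return []
    for label, child in children:
        below = edges_from_root(child, target)
        if below is not None:
            return [label] + below
    return None

def path(tree, anaphor, antecedent):
    """Labels of the ascending edges from the anaphor EDU, then the descending ones."""
    up = edges_from_root(tree, anaphor)
    down = edges_from_root(tree, antecedent)
    if up is None or down is None:
        raise ValueError("both EDUs must occur in the tree")
    # Skip the edges above the lowest common node. In a binary tree the two
    # edges leaving a node differ in their position letter, so equal labels
    # at the same depth identify the same edge.
    shared = 0
    while shared < min(len(up), len(down)) and up[shared] == down[shared]:
        shared += 1
    ascending = [label + "a" for label in reversed(up[shared:])]
    descending = [label + "d" for label in down[shared:]]
    return ascending + descending

print(path(TREE, "43", "39"))
```

Edges above the lowest node that dominates both EDUs are dropped, which corresponds to excluding edges that lead to spans containing neither anaphor nor antecedent.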
3.2 Parameters
Applying the terminological conventions sketched above, we now propose a minimal inventory of edge types that provides parameters for the general framework. Note that we ignore the type of relation beyond the distinction of mononuclear (asymmetric) and multinuclear (symmetric) relations. In addition to the directional distinction between ascending and descending edges in the path to be constructed, we suggest the following parameters to define necessary distinctions among edges in the tree (cf. Fig. 1).
• relation nuclearity: whether a relation is mononuclear (i.e., subordinating, asymmetric) or multinuclear (i.e., coordinating, symmetric)
• node nuclearity: whether the child node serves as satellite or nucleus of the dominating node (for a multinuclear relation, all child nodes are defined as nuclei)
• node position: whether the child node is left-most or right-most of the children.2
As a convention, these four features of edges are encoded by edge labels in the following way:
• the first two letters denote the nuclearity of the underlying discourse relation, i.e., mo (mononuclear) or mu (multinuclear)
• the third letter denotes the sequential position of the child node, i.e., l (left) or r (right)
• the fourth letter denotes the type of the child node, i.e., s (satellite) or n (nucleus)
• the fifth letter denotes the direction of the edge, i.e., whether it is ascending (a) or descending (d)
For example, mulnd denotes a multinuclear discourse relation, with the antecedent (descending) in the left-most (nuclear) child node, while molna corresponds to a mononuclear discourse relation, with the anaphor (ascending) in the nucleus that is the left-most child node, cf. Fig. 2.

Figure 1. Building a labeled tree from an RST tree fragment.
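The label convention lends itself to a small decoder. The following sketch uses hypothetical function names; it splits a five-letter label into its four features and enumerates all well-formed labels, skipping satellites of multinuclear relations, which do not exist under the definition above.

```python
from itertools import product

RELATION = {"mo": "mononuclear", "mu": "multinuclear"}
POSITION = {"l": "left", "r": "right"}
NODE = {"n": "nucleus", "s": "satellite"}
DIRECTION = {"a": "ascending", "d": "descending"}

def decode(label: str) -> dict:
    """Split a five-letter edge label into its four features."""
    if len(label) != 5:
        raise ValueError("edge labels have exactly five letters")
    relation, position, node, direction = label[:2], label[2], label[3], label[4]
    if relation == "mu" and node == "s":
        raise ValueError("multinuclear relations have no satellites")
    return {
        "relation": RELATION[relation],
        "position": POSITION[position],
        "node": NODE[node],
        "direction": DIRECTION[direction],
    }

def all_labels() -> list:
    """Enumerate every well-formed label."""
    labels = []
    for relation, position, node, direction in product(RELATION, POSITION, NODE, DIRECTION):
        if relation == "mu" and node == "s":
            continue  # all children of a multinuclear relation are nuclei
        labels.append(relation + position + node + direction)
    return labels

print(decode("molna"))
print(len(all_labels()))
```

Of the 2 × 2 × 2 × 2 = 16 letter combinations, the four with a multinuclear satellite are excluded, which leaves twelve edge labels.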
2. As there are different possibilities to deal with n-ary branching relations we decided to concentrate on binary trees for both theoretical aspects and empirical evaluation.
Figure 2. From tree fragment to edge label. [Node A carries the edge label moln and node B the edge label mors; the anaphor is in A, so the path ascends from A: molna.]
Then, a path in a rhetorical tree can be represented as a sequence of edge labels, beginning at the position of an anaphor and ending at the EDU containing the antecedent; see the example in Fig. 5. The rhetorical distance between referential expressions in two EDUs is then defined as the weighted sum of occurrences of labels on the path between them. The higher the relative degree of rhetorical accessibility of a potential antecedent, the lower its rhetorical distance. For pronominalization, an accessibility threshold τ is introduced, converting the gradual distance measurement into a binary criterion. For theories which provide a binary accessibility criterion (such as GS and VT, see below), τ is set on theoretical grounds; for theories which provide a gradual approximation of accessibility (such as referential distance and Kibrik and Krasavina’s rhetorical distance, see below), it is a value determined empirically to optimize the results. Given an anaphor α and an antecedent β, we now consider the frequency of edge labels on the path pointing from α to β:

\[
\text{rhet-dist}(\alpha, \beta) = \sum_{\substack{i \,\in\, \{molna,\, molnd,\, molsa,\, molsd,\\ morna,\, mornd,\, morsa,\, morsd,\\ mulna,\, mulnd,\, murna,\, murnd\}}} w_i \cdot \mathrm{freq}(\alpha, \beta, i)
\]

\[
\mathrm{acc}(\alpha, \beta) \iff \text{rhet-dist}(\alpha, \beta) < \tau
\]

Here, $i$ represents one possible edge label, $\mathrm{freq}(\alpha, \beta, i)$ denotes the frequency, or number of occurrences, of the edge label $i$ on the path pointing from α to β, $w_i \in \mathbb{R}^{+}$ is the (non-negative) weight associated with edge label $i$, and τ is the accessibility threshold. In the following section, we suggest different configurations of weights and thus prove the adequacy of our proposal with respect to the theories under consideration.
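The two definitions translate directly into code. In the minimal sketch below, the function names are hypothetical and the weights are purely illustrative (they do not correspond to any of the theories); the path is the one shown in Fig. 5.

```python
from collections import Counter

def rhet_dist(path, weights):
    """Weighted sum of edge-label frequencies on the path from anaphor to antecedent."""
    freq = Counter(path)
    return sum(weights[label] * count for label, count in freq.items())

def accessible(path, weights, tau):
    """Binary criterion: accessible iff the distance stays below the threshold."""
    return rhet_dist(path, weights) < tau

# Illustrative weights only; labels missing from the dictionary would raise a KeyError.
weights = {"molna": 0.0, "morsa": 1.0, "molnd": 0.5}
path_43_39 = ["molna", "morsa", "molnd", "molnd", "molnd"]

print(rhet_dist(path_43_39, weights))          # 0.0 + 1.0 + 3 * 0.5
print(accessible(path_43_39, weights, tau=3.0))
```

Because only label frequencies enter the sum, any two paths with the same multiset of labels receive the same distance.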
4. Reconstruction of the theories within the framework
In this section we reconstruct the three models under consideration, GS, VT and KK, using the set of parameters, edge labels and accessibility thresholds as outlined in Section 3. By assigning different weights to the set of parameters, we generate the necessary distinctions, thus demonstrating the adequacy of the proposed framework with respect to all three theories. Finally, the minimal set of parameters is used to compare the theories.
4.1 Stack Model (GS, Grosz and Sidner 1986)
Previous interpretations of GS in RST terms (Moser and Moore 1996, Tetreault and Allen 2003) agreed on the rough correspondence between RST-nuclearity and GS-dominance, thus equating dominance relations with mononuclear (subordinating) RST relations, and satisfaction-precedence with multinuclear (coordinating) RST relations. This analysis is consistent with the partial mapping from GS’s discourse structure onto RST that has been implemented in Relational Discourse Analysis (RDA, Moser and Moore 1996:414), where embedded segments in GS are analyzed as satellites in RST. This partial mapping has been extended by Marcu (2000:527f.), who proposed an isomorphic mapping between GS-dominance and RST-subordination, and a (monodirectional) homomorphism transforming satisfaction-precedence into multinuclear relations. Nevertheless, RST structures and those assumed by Grosz and Sidner (1986) differ with respect to their granularity. So, it seems that relating GS’s discourse structure to rhetorical structure in the sense of Mann and Thompson (1988) can be problematic. Generalizing over these insufficiencies, we summarize GS’s proposal as follows:3
• ascending from a nucleus and descending into a nucleus is always possible
• ascending from a satellite is always possible
• descending into a satellite node is not permitted
Then, the GS configuration assigns every descending satellite edge (i.e., molsd, morsd) the weight $w_{molsd} = w_{morsd} = 1$, and every other edge the weight 0. If we now compare the values calculated for rhetorical distance in the GS configuration to the original binary criterion of accessibility, we find that a discourse segment is accessible iff its rhetorical distance is lower than 1. We therefore set the accessibility threshold $\tau_{GS} = 1$. Returning to the example given in Fig. 3, our application of GS to RST-like graph structures would predict EDU (39) as the location of a possible antecedent for the pronoun she in EDU (43), but rule out the EDUs (40)–(42), as these satellite nodes are “popped out” from the stack as soon as new discourse material is attached to the non-terminal node (39–42). In the rhetorical distance approach, we calculate distances between pairs of EDUs according to the configurations.
3. Note that GS’s proposal cannot account for left satellites according to Ide and Cristea (2000). One could assume that left satellites in RST arise due to its greater level of detail in discourse segmentation or that they are not in a dominance relation, but in a satisfaction-precedence relation. However, this interpretation conflicts with Marcu (2000) and Moser and Moore (1996). With respect to the situation of n-ary branching relations, where sequential order among satellites does not necessarily imply satisfaction-precedence relations, another solution is to apply Marcu’s mapping onto left-branching satellites as well, though such configurations could not have been expressed in the original GS’s proposal. Following Tetreault and Allen (2003), we opt for the last alternative.
Figure 3. RST example. [RST tree over the EDUs (39)–(44), with the complex segments 39–44, 39–42, 43–44, 39–40 and 41–42 and the relations elaboration-additional, manner, enablement and elaboration-object-attribute-e. The EDUs are:
(39) Once inside, she spends nearly four hours
(40) measuring and diagramming each room in the 80-year-old house
(41) gathering enough information
(42) to estimate what it would cost to rebuild it
(43) She snaps photos of the buckled floors and the plaster
(44) that has fallen away from the walls]

Figure 4. Reconstruction of RST fragment as labeled tree. [Each node is listed with the label of the edge from its parent:
39–44
  moln: 39–42
    moln: 39–40
      moln: (39) Once inside, she spends nearly four hours
      mors: (40) measuring and diagramming each room in the 80-year-old house
    mors: 41–42
      mols: (41) gathering enough information
      morn: (42) to estimate what it would cost to rebuild it
  mors: 43–44
    moln: (43) She snaps photos of the buckled floors and the plaster
    mors: (44) that has fallen away from the walls]

Figure 5. Example analysis for (43) → (39). [The anaphor α is in EDU (43), the antecedent β in EDU (39); node 39–44 is the root of the path.]

\[ \mathrm{path}(\alpha, \beta) = (molna,\ morsa,\ molnd,\ molnd,\ molnd) \]
\[ \text{rhet-dist}(\alpha, \beta) = w_{molna} + w_{morsa} + w_{molnd} + w_{molnd} + w_{molnd} \]
\[ \mathrm{acc}(\alpha, \beta) \iff \text{rhet-dist}(\alpha, \beta) < \tau \]
Thus, according to the GS configuration, the path between (43) and (39) (Fig. 5) yields a rhetorical distance of 0. As this distance is below $\tau_{GS}$, (39) is regarded as accessible from (43). The distances for (43) → (40), (43) → (41) and (43) → (42), however, are not below the accessibility threshold; the EDUs (40)–(42) are thus regarded as inaccessible, just as predicted by GS ($\text{rhet-dist}_{GS}(43, 40) = 1$, $\text{rhet-dist}_{GS}(43, 41) = 2$, $\text{rhet-dist}_{GS}(43, 42) = 1$).
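Written out term by term, with the paths read off the labeled tree in Fig. 4 and the GS weights (1 for molsd and morsd, 0 for every other label), the four distances are obtained as follows:

```latex
\begin{align*}
\text{rhet-dist}_{GS}(43, 39) &= w_{molna} + w_{morsa} + 3\,w_{molnd} = 0 + 0 + 3 \cdot 0 = 0 \\
\text{rhet-dist}_{GS}(43, 40) &= w_{molna} + w_{morsa} + 2\,w_{molnd} + w_{morsd} = 0 + 0 + 2 \cdot 0 + 1 = 1 \\
\text{rhet-dist}_{GS}(43, 41) &= w_{molna} + w_{morsa} + w_{molnd} + w_{morsd} + w_{molsd} = 0 + 0 + 0 + 1 + 1 = 2 \\
\text{rhet-dist}_{GS}(43, 42) &= w_{molna} + w_{morsa} + w_{molnd} + w_{morsd} + w_{mornd} = 0 + 0 + 0 + 1 + 0 = 1
\end{align*}
```

Only the first value lies below the threshold of 1, so (39) is the only accessible EDU under this configuration.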
4.2 Veins Theory (VT, Cristea et al. 1998)
VT was actually designed as a generalization of Centering Theory (Grosz et al. 1995) that extends the applicability of centering rules by defining larger domains of accessibility within a discourse tree. These domains, veins, are derived as follows:
1. In a bottom-up manner, a head is assigned to every discourse segment:
(a) if an RST node is terminal, i.e., an EDU, its head is its unique ID label,
(b) if a node has one nuclear child node (mononuclear node), its head is the head of its nucleus,
(c) if a node has several nuclear child nodes (multinuclear node), its head is the concatenation of the heads of its nuclear children.
2. Then, the vein is calculated top-down using the heads derived so far:
• The vein of the root is its head.
• Suppose v is the vein of a non-terminal node with left child L (head hl) and right child R (head hr).
• The veins of L and R are defined as follows:

type of relation    nuclearity       veins of child nodes
                    L      R         L          R
multinuclear        NUC    NUC       v          v
mononuclear         NUC    SAT       v          ṽ ∘ hr
mononuclear         SAT    NUC       v ∘ hr     (hl) ∘ v

Notation:
– x ∘ y is a flat list comprising the heads enumerated in x and y according to their respective linear order.
– Parentheses (...) are used as additional symbols when constructing the vein of right nuclear children of mononuclear discourse relations.
– Given a vein x, x̃ denotes those elements from x that are not marked by parentheses.
3. For an EDU with label u, all EDUs that are in the vein of u and precede u sequentially are defined as accessible.
From these rules, the following conditions can be derived:
1. as the veins are calculated top-down, veins theory allows for unrestricted ascending (mulna, murna, molna, molsa, morna, morsa)
2. nuclei are accessible as antecedents (mulnd, murnd, molnd, mornd)
3. right satellites are inaccessible as antecedents (morsd)
4. left satellites (molsd) are
(a) accessible only from the sub-tree dominated by the corresponding nucleus,
(b) but not from right-branching satellite nodes (morsa)
As stated in (4b), the combination of molsd and morsa has to be prohibited. But while multiple morsa edges are allowed on a path (see Fig. 1), multiple molsd edges are not. So we assign molsd a large constant as a weight, say $w_{molsd} = 1$, and morsa a small one, say $w_{morsa} = 0.05$, with $w_{molsd} + w_{morsa}$ as the threshold of accessibility, thus $\tau_{VT} = 1.05$. Edge morsd is always inaccessible (3); its weight is therefore larger than $w_{molsd} + w_{morsa}$, here $w_{morsd} = 2$. Every other type of edge is accessible. This proposal has one limitation: its local character. While in VT descending into a satellite is always impossible except in the root node (4a), our method of counting the edges of a certain type is insensitive to their relative order. To approximate this behavior, we assign a small additional weight (less than $w_{morsa}$, i.e., 0.01) to every descending edge, but this still does not guarantee full compatibility. Thus, $w_{molnd} = w_{mornd} = w_{mulnd} = w_{murnd} = 0.01$, as compared to ascending edges with $w_{molna} = w_{morna} = w_{mulna} = w_{murna} = 0$. For our example in Fig. 3, the head of node (39–44) comprises just the label of the node (39), i.e., $h_{39–44} = (39)$. Treating this fragment as a complete RST tree, we equate the vein $v_{39–44}$ with the head $h_{39–44}$, and derive the veins of the relevant children as $v_{43–44} = v_{43} = (39, 43)$. Thus, only node (39) would be predicted as being accessible. In our reconstruction of VT, the rhetorical distance for (43) → (39) can be calculated as 0.08. It is thus lower than $\tau_{VT}$, and (39) is regarded as accessible, while the distances for nodes (40)–(42) as antecedent EDUs exceed the accessibility threshold ($\text{rhet-dist}_{VT}(43, 40) = 2.07$, $\text{rhet-dist}_{VT}(43, 41) = 3.06$, $\text{rhet-dist}_{VT}(43, 42) = 2.07$).
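The VT figures can be retraced with the weights just given. The sketch below makes one assumption that the text leaves open: the weight of molsa is set to 0, since ascending is unrestricted. The paths are read off the labeled tree in Fig. 4, and the names `W_VT`, `TAU_VT` and `PATHS` are hypothetical.

```python
from collections import Counter

# VT configuration as described in the text; w_molsa = 0 is an assumption.
W_VT = {
    "molna": 0.0, "morna": 0.0, "mulna": 0.0, "murna": 0.0,
    "molsa": 0.0,
    "morsa": 0.05,
    "molnd": 0.01, "mornd": 0.01, "mulnd": 0.01, "murnd": 0.01,
    "molsd": 1.0,
    "morsd": 2.0,
}
TAU_VT = 1.05

def rhet_dist(path, weights):
    """Weighted label count, rounded to two decimals to hide floating-point noise."""
    counts = Counter(path)
    return round(sum(weights[label] * n for label, n in counts.items()), 2)

# Paths from EDU (43) to each candidate antecedent EDU.
PATHS = {
    "39": ["molna", "morsa", "molnd", "molnd", "molnd"],
    "40": ["molna", "morsa", "molnd", "molnd", "morsd"],
    "41": ["molna", "morsa", "molnd", "morsd", "molsd"],
    "42": ["molna", "morsa", "molnd", "morsd", "mornd"],
}

for edu, edge_labels in PATHS.items():
    d = rhet_dist(edge_labels, W_VT)
    print(edu, d, "accessible" if d < TAU_VT else "inaccessible")

# Order-insensitivity: reversing a path does not change its distance.
print(rhet_dist(list(reversed(PATHS["41"])), W_VT) == rhet_dist(PATHS["41"], W_VT))
```

The last line illustrates the limitation mentioned above: since only label counts matter, two paths with the same labels in a different order cannot be told apart.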
4.3 Rhetorical distance approach (KK, Kibrik and Krasavina 2005)
Given a pair of EDUs with one containing a potential antecedent and the other one containing a potential anaphor, Kibrik and Krasavina’s algorithm constructs a path pointing from the potential anaphor to the potential antecedent. Then, rhetorical distance corresponds to the number of transitions on this path. According to KK, rhetorical distance is calculated according to the following rules:
1. Move along the RST graph towards the nearest antecedent and count the number of “horizontal jumps”, i.e., transitions ascending from a satellite or descending into a satellite of a discourse relation.
2. If a jump penetrates into a symmetrical structure or makes a step out of it, a penalty of 0.5 is added.
3. Nothing else contributes to rhetorical distance.
Our parameterized framework implements these conditions by the following configuration:
1. edges from and to satellites receive weight 1
wmolsa,molsd,morsa,morsd = 1
2. edges from and to nuclei in multinuclear relations receive weight 0.5
wmulna,mulnd,murna,murnd = 0.5
3. edges from and to nuclei in mononuclear relations receive weight 0
wmolna,molnd,morna,mornd = 0
As KK’s method provides a gradual measurement of discourse-structural accessibility, the value of τ is not specified and has to be determined empirically. Applied to our example (see Fig. 5), we find that rhet-distKK for the pair (43) → (39) is 1, making node (39) the most probable antecedent EDU of (43) (rhet-distKK(43, 40) = 2, rhet-distKK(43, 41) = 3, rhet-distKK(43, 42) = 2).
4.4 Theoretical issues: Summary
In this section we have shown that the set of parameters proposed above allows for the reconstruction of three approaches to referential accessibility in discourse. This enables comparison of the configurations proposed for the corresponding theories. Having reinterpreted the absolute weights as partial orders, we found, not surprisingly, a great deal of compatibility, especially among the less accessible relations, morsd and molsd, indicating that descending into a satellite is quite a costly operation. Additionally, ascending from the nucleus of a mononuclear relation (molna, morna) seems to be least problematic, with multinuclear relations being intermediate. Generally, ascending seems to be easier than descending, as predicted by GS. The implementation of the theories within our framework is optimal for GS and KK, since these can be modeled as monodirectional search algorithms. For veins theory, the approach allows for a locally optimal reconstruction only. The VT comprises both a bottom-up determination of heads and a top-down determination of accessibility domains and thus a global optimization step. However, this is uncritical for all but one case, that is, to distinguish whether a satellite containing the potential antecedent in its nucleus is directly attached to the root node of the path or whether the satellite with the antecedent is attached to a nuclear child of the root node. Here, the accessibility of an EDU is restricted not just by the types of edges, but also by their order. Even without considering the rhetorical structure of the whole discourse as performed by the interaction of bottom-up and top-down strategies of the original model by Cristea et al. (1998), our reconstruction is consistent with the majority of the VT-based accessibility judgments. However, for the sake of comparison with the other approaches, special extensions to deal with this locality problem remain a topic for further research. With respect to the minimal set of parameters all theories have in common, we obtain the configurations in Tab. 1.

Table 1. Edge labels and their corresponding weights, according to Grosz and Sidner (1986), Cristea et al. (1998) and Kibrik and Krasavina (2005).
GS = stack model (Grosz/Sidner 1986); VT = veins theory (Cristea et al. 1998); KK = rhetorical distance (Kibrik/Krasavina 2005).

edge label                  GS      VT       KK
molna                       0       0        0
molnd                       0       0.01     0
molsa                       0       0        1
molsd                       1       1        1
morna                       0       0        0
mornd                       0       0.01     0
morsa                       0       0.05     1
morsd                       1       2        1
mulna                       0       0        0.5
mulnd                       0       0.01     0.5
murna                       0       0        0.5
murnd                       0       0.01     0.5
accessibility threshold τ   < 1     < 1.05   n.a.
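The configurations in Tab. 1 can be illustrated with a minimal sketch. It assumes that rhetorical distance is simply the sum of the edge weights along the path from the anaphor EDU to the antecedent EDU, and that an antecedent counts as accessible when this sum is below τ. The function names `rhet_dist` and `accessible` are illustrative, and the example path is hypothetical: it is merely one sequence of edge labels that reproduces the values 0.08 (VT) and 1 (KK) reported for (43) → (39), not the path read off the figures.

```python
from typing import Dict, Iterable, Optional

EDGE_LABELS = ("molna", "molnd", "molsa", "molsd",
               "morna", "mornd", "morsa", "morsd",
               "mulna", "mulnd", "murna", "murnd")

# Weights copied from Tab. 1, listed in the order of EDGE_LABELS.
WEIGHTS: Dict[str, Dict[str, float]] = {
    "GS": dict(zip(EDGE_LABELS, (0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0))),
    "VT": dict(zip(EDGE_LABELS, (0, 0.01, 0, 1, 0, 0.01, 0.05, 2, 0, 0.01, 0, 0.01))),
    "KK": dict(zip(EDGE_LABELS, (0, 0, 1, 1, 0, 0, 1, 1, 0.5, 0.5, 0.5, 0.5))),
}

# Accessibility thresholds from Tab. 1; KK leaves tau unspecified.
THRESHOLDS: Dict[str, Optional[float]] = {"GS": 1.0, "VT": 1.05, "KK": None}


def rhet_dist(path: Iterable[str], theory: str) -> float:
    """Assumed definition: sum of the edge weights along the path."""
    weights = WEIGHTS[theory]
    # Rounding only hides floating-point noise such as 0.08000000000000002.
    return round(sum(weights[edge] for edge in path), 10)


def accessible(path: Iterable[str], theory: str, tau: Optional[float] = None) -> bool:
    """Assumed rule: accessible if the distance is strictly below tau."""
    tau = THRESHOLDS[theory] if tau is None else tau
    if tau is None:
        raise ValueError(f"no threshold available for {theory}; pass tau explicitly")
    return rhet_dist(list(path), theory) < tau


# Hypothetical path: one ascent from a satellite, three descents into nuclei.
example_path = ["morsa", "mornd", "molnd", "molnd"]
for theory in ("GS", "VT", "KK"):
    print(theory, rhet_dist(example_path, theory))
```

Because KK does not fix a threshold, `accessible` requires an explicit `tau` for that configuration, which mirrors the empirical tuning of τ described in the text.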
5. Comparative empirical evaluation
5.1 Preliminaries
Having outlined a meta-theoretic framework and argued for its adequacy with respect to three relevant theories of discourse-structural accessibility, we can now empirically investigate the respective predictions of these theories as to the impact of RST on referential choice. Specifically, we concentrate on the following questions in this section: 1) the interaction of rhetorical distance and referential distance, 2) differences between the theories. We examined the use of pronouns as opposed to full definite descriptions. Full descriptions encompass proper names, demonstrative descriptions and definite descriptions; pronouns include personal, possessive and demonstrative pronouns and, in German, pronominal adverbs. The data used in this analysis consisted of 134 texts from the Potsdam Commentary Corpus (PCC, Stede 2004) and 84 texts from the RST Discourse Treebank (Carlson et al. 2003). The Potsdam Commentary Corpus is a collection of German newspaper commentaries annotated for morpho-syntax (using the TIGER format, cf. Brants and Hansen 2002), coreference (PoCoS, Krasavina and Chiarcos 2007), rhetorical structure (Reitter and Stede 2003), information structure (Götze 2003), and discourse connectives (Stede 2004). The RST Discourse Treebank is a collection of American English newspaper articles of various kinds (WSJ, from the Penn Treebank) annotated for rhetorical structure (Carlson et al. 2003). The sample that we used for the current study was additionally annotated with anaphoric links (PoCoS, Krasavina and Chiarcos 2007). The rhetorical and co-reference annotation of both corpora is compatible. It should be noted, however, that the segmentation of the RST Discourse Treebank is more fine-grained than that of the Potsdam Commentary Corpus, so that EDUs in the former are often smaller than in the latter. For example, EDUs in the RST Discourse Treebank are usually syntactic clauses, but sometimes sub-clausal elements as well, while the current EDU segmentation in the Potsdam Commentary Corpus tends to equate EDUs with clauses or sentences. To evaluate the performance of the theories, we used the following measurements:
precision: the proportion of correctly predicted pronouns among all predicted pronouns;
recall: the proportion of correctly predicted pronouns among all originally used pronouns;
f-measure: the harmonic mean of recall and precision;4
χ2 test: a test against the null hypothesis of independence.
Since the Kibrik and Krasavina (2005) model of rhetorical distance does not provide an accessibility threshold, we applied different values of the accessibility threshold τ. Additionally, we tested a baseline defined as the path length, i.e., the number of edges on the path, again with different τ-values.
Table 2. Predictions as to pronouns according to Grosz and Sidner (GS), veins theory (VT) and Kibrik and Krasavina (KK).
                    correct     total       total
                    predicted   predicted   correct   recall   precision   f        χ2
German data from Potsdam Commentary Corpus (PCC, Stede 2004)
GS                  173         418         282       61.35%   41.39%      49.43%   32.62, p < 0.01
VT                  209         523         282       74.11%   39.96%      51.93%   38.01, p < 0.01
KK, τ = 2.0         193         413         282       68.44%   46.73%      55.54%   78.13, p < 0.01
baseline, τ = 5.0   220         496         282       78.01%   44.35%      56.56%   80.29, p < 0.01
English data from RST Discourse Treebank (Carlson et al. 2003)
GS                  53          163         121       43.80%   32.52%      37.32%   1.72, p = 0.89
VT                  63          208         121       52.07%   30.29%      38.30%   0.40, p = 0.53
KK, τ = 3.5         113         346         121       93.39%   32.66%      48.39%   13.82, p < 0.01
baseline, τ = 5.0   95          253         121       78.51%   37.55%      50.80%   23.38, p < 0.01
4. Here, precision, recall and f-measure can be interpreted relative to each other only, rather than with respect to the overall data, as the data set has been restricted.
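The three measures can be recomputed from the counts in Table 2. The following is a small sketch in which the function name `evaluate` is illustrative; the example uses the GS row for the German data (173 correctly predicted, 418 predicted, 282 actual pronouns).

```python
from typing import Tuple


def evaluate(correct_predicted: int, total_predicted: int,
             total_correct: int) -> Tuple[float, float, float]:
    """Return (recall, precision, f-measure) as fractions between 0 and 1."""
    if correct_predicted == 0:
        # No correct prediction: all three measures are taken to be 0 here.
        return 0.0, 0.0, 0.0
    recall = correct_predicted / total_correct
    precision = correct_predicted / total_predicted
    # f-measure: harmonic mean of recall and precision.
    f = 2 * precision * recall / (precision + recall)
    return recall, precision, f


recall, precision, f = evaluate(173, 418, 282)
print(f"{recall:.2%} {precision:.2%} {f:.2%}")  # 61.35% 41.39% 49.43%
```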
Table 3. The impact of referential distance on pronominalization.

                       correct     total       total
                       predicted   predicted   correct   recall   precision   f        χ2
German (PCC), τ = 2    247         462         282       87.59%   53.46%      66.40%   206.57, p < 0.01
English (RST), τ = 3   101         238         121       83.47%   42.44%      56.72%   50.97, p < 0.01
In this study, we proceeded under the following conditions:
• distances were counted to sequential antecedents, i.e., to the most recent mention of a discourse referent;
• trivial cases, i.e., cases in which both anaphor and antecedent are in the same EDU, were excluded;
• anaphor-antecedent pairs with edges from n-ary branching relations were excluded.
5.2 Rhetorical distance and pronominalization
Comparing the performance of the theories, we found that KK’s proposal (for German with τ = 2, for English with τ = 3.5) seemed to achieve a slight advantage over veins theory and the stack model, which perform nearly equally, cf. Tab. 2. However, this should not be interpreted without reservations, since the baseline (for German with τ = 3 resp. τ = 5, for English with τ = 5) outperformed all theories in both corpora. This finding is surprising, as it implies that the established theories are either inapplicable to the pronominalization task or redundant, being replaceable by much simpler formulations. In view of this, we can hypothesize that local strategies interfere with rhetorical distance. This assumption conforms to the interaction between global and local structure of discourse as assumed by GS. GS operates with minimal discourse segments, which are building blocks of the global structure. Discourse segments are larger than EDUs in the RST. A discourse segment is defined by a specific discourse purpose, allowing for the accumulation of multiple functionally related sentences within one minimal discourse segment. By contrast, EDUs in the RST are defined as syntactic units at or below the sentence level; thus they tend to be smaller per definitionem. Within a (minimal) discourse segment, then, local strategies, such as those outlined in Centering Theory (Grosz et al. 1995), are applied for pronominalization and pronoun interpretation. In general, these strategies rely on the notion of recency as a necessary condition for pronominalization.5
5. As an example, in the original formulation of Centering Theory by Grosz et al. (1995), pronominalization is intrinsically tied to the status of the “backward-looking center”, which must have an antecedent in the directly preceding utterance.
5.3 Effects of referential distance
As to the question of how rhetorical and referential distances interact, surprisingly, for both German and English, rhetorical distance seems to be overridden by referential distance, as illustrated in Tab. 3. For short-distance references, discourse structural constraints are overruled due to the overwhelming frequency of pronouns. As a consequence, we assume that referential distance is a main determinant of pronoun use in local contexts. Conversely, rhetorical distance is a device allowing for pronominal reference where it is unexpected under a purely sequential account. To have a closer look at the effect of rhetorical distance, we excluded cases that are more properly explained by short-distance effects by introducing a bias β. β corresponds to referential distance values: anaphoric links with referential distance less than β were filtered out. The results of this evaluation are summarized in Tab. 4. For long-distance pronominal references, i.e., with higher referential distance, rhetorical distance turned out to be a better indicator in general. In the German data, this effect was found for pronominal references with referential distance greater than 2 (β > 2). For the English sample, it showed up at distances greater than 3 (β > 3). Curiously, the theories seemed to perform similarly at the lower end of the spectrum. For German, the difference in f-measure values between the best RhetD-configuration (KK with τ = 1.5) and the baseline (with τ = 5) did not exceed 3%, and for English, the baseline (with τ = 6) outperformed all theories at β = 4. Nevertheless, for longer distances, a slight advantage of KK’s rhetorical distance was observed in both corpora (on the PCC for β between 3 and 5, on the RST Discourse Treebank for β greater than 4).
The data on long-distance pronominal anaphors is, however, too sparse to draw any definite conclusions, as illustrated by the seemingly better performance of the baseline for pronouns with referential distance greater than 5 in German. Similarly, the number of long-distance pronouns is too low to make any commitments at this point. In the 134 texts from the PCC, only 35 pronouns (β = 3; 12.4%), and in the 84 texts from the RST Discourse Treebank, only 20 (β = 4; 16.5%) resp. 12 pronouns (β = 5; 9.9%) can be considered long-distance pronouns. It is necessary to test long-distance pronominalization on additional corpora.
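The filtering by β can be sketched as follows. This is only an illustration under stated assumptions: a pronoun is predicted whenever the distance is strictly below τ, the `Link` record and the function `f_measure_at` are hypothetical names, and the toy links are invented for demonstration and are not corpus data.

```python
from typing import List, NamedTuple


class Link(NamedTuple):
    ref_dist: int      # referential distance to the most recent mention
    rhet_dist: float   # rhetorical distance under some configuration
    is_pronoun: bool   # True if the corpus uses a pronoun for this link


def f_measure_at(links: List[Link], beta: int, tau: float) -> float:
    """f-measure of the rule 'pronoun iff rhet_dist < tau' for ref_dist >= beta."""
    kept = [link for link in links if link.ref_dist >= beta]   # apply bias beta
    predicted = [link for link in kept if link.rhet_dist < tau]
    correct = sum(link.is_pronoun for link in predicted)
    actual = sum(link.is_pronoun for link in kept)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / actual
    return 2 * precision * recall / (precision + recall)


# Invented toy data, for illustration only.
toy_links = [
    Link(1, 0.0, True), Link(1, 1.0, True), Link(2, 1.0, False),
    Link(3, 1.0, True), Link(3, 3.0, False), Link(4, 2.5, False),
]
for beta in (1, 3):
    print(beta, round(f_measure_at(toy_links, beta, tau=2.0), 4))
```

Replacing `rhet_dist` by `ref_dist` in the prediction rule would give the corresponding referential-distance predictor.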
5.4 Empirical evaluation: summary and related work
In conclusion, for short-distance references, we found a tendency for referential distance to be a more appropriate predictor of pronominalization than rhetorical distance in both corpora. Still, such findings might be an artifact of the restrictions applied in this study. The most important restriction is that we considered sequential antecedents, not rhetorical ones. Another restriction is that we considered only the accessibility aspect. There are additional factors affecting the choice of referring expressions, such as referential interference.6
6. On the impact of ambiguity on interference, cf. the case studies collected in Givón (1983).
Table 4. f-measure: Interplay between referential distance (RefD) and rhetorical distance (RhetD) in the pronominalization task. Results for German from the Potsdam Commentary Corpus and for English from the RST Discourse Treebank.

RefD bias β        β = 1    β = 2    β = 3    β = 4    β = 5    β = 6    β = 7
German data from Potsdam Commentary Corpus (PCC, Stede 2004)
RhetD GS           49.43%   40.76%   20.11%   20.80%   22.47%   20.29%   18.60%
RhetD VT           51.93%   42.22%   18.34%   20.69%   20.00%   17.95%   16.33%
RhetD KK           55.54%   45.99%   21.88%   28.92%   32.73%   33.33%   28.57%
  τ                2        2        1.5      1.5      1.5      1.5      1.5
baseline           56.56%   47.36%   18.89%   22.86%   30.43%   38.89%   35.39%
  τ                5        4        5        5        4        4        3
RefD               66.40%   57.40%   14.92%   13.53%   14.08%   11.11%
  τ                2        2        5        6        5        6
absolute frequencies
# prn              282      165      35       23       18       13       9
# full NP          601      544      386      263      196      148      93
English data from RST Discourse Treebank (Carlson et al. 2003)
RhetD GS           37.32%   24.62%   20.45%   17.17%   11.11%   8.89%    13.33%
RhetD VT           38.30%   31.33%   24.16%   18.56%   15.79%   10.17%   8.33%
RhetD KK           48.39%   40.44%   31.41%   25.60%   21.28%   18.42%   20.00%
  τ                3.5      3.5      3.0      3.0      3.0      3.0      3.0
baseline           50.80%   42.34%   31.79%   32.53%   17.31%   14.29%   16.67%
  τ                5        5        5        6        7        7        5
RefD               56.72%   48.19%   38.24%   28.57%   18.75%   6.45%
  τ                3        3        4        5        5        6
absolute frequencies
# prn              121      80       38       20       12       9        8
# full NP          298      272      206      163      134      117      96
(For each column, the best values for f-measure are marked bold.)
For long-distance pronouns, referential distance is overridden by rhetorical distance. Although the data was of limited size, the tendency was observed for two corpora of different languages. As both corpora were annotated independently for co-reference and rhetorical structure respectively, it is unlikely that this effect is an annotation artifact. The existence of a small domain that is specifically reserved for effects of discourse structure is an indicator that sequential and discourse structural strategies are applied in parallel. Due to the greater number of short-distance references performed by pronouns, referential distance can be seen as a “default” measurement of referential accessibility, whereas rhetorical distance comes into play in cases of long-distance anaphora. In other words, we suggest the combined application of Walker’s cache model (1998, 2000) and Grosz and Sidner’s stack model (or related approaches) side by side. This two-channel strategy of pronominalization as suggested here conforms to Tetreault and Allen’s approximation of Grosz and Sidner’s stack model (Tetreault and Allen 2003), and to Poesio et al. (2002).
6. Conclusion
This paper presents a parameterized framework for reconstructing theoretical models of referential accessibility and discourse structure, namely those of Grosz and Sidner (1986), Cristea et al. (1998) and Kibrik and Krasavina (2005). By introducing a generalized notion of rhetorical distance, we create a common foundation for these models. The proposed minimal set of parameters enables comparison of the theories by assigning absolute weights corresponding to the theories’ predictions. We have demonstrated the adequacy of our framework by presenting reconstructions of the theories considered here and have compared the theories’ respective predictions. Finally, a pilot empirical evaluation revealed a two-channel pronominalization tendency: referential and rhetorical distances interact, indicating that factors from both the local context (sequential proximity) and the global context (rhetorical structure) have an impact on the choice and the interpretation of pronouns.
References
Asher, N. 1993. Reference to abstract objects in discourse. Dordrecht: Kluwer.
Brants, S. and S. Hansen 2002. Developments in the TIGER Annotation Scheme and their Realization in the Corpus. In Proceedings of the Third Conference on Language Resources and Evaluation (LREC 2002), 1643–1649.
Carlson, L., D. Marcu, and M.E. Okurowski 2003. Building a Discourse-Tagged Corpus in the Framework of Rhetorical Structure Theory. In: J. van Kuppevelt and R. Smith (eds.), Current Directions in Discourse and Dialogue. Dordrecht: Kluwer, 85–112.
Chiarcos, Ch. 2005. Mental salience and grammatical form: Generating referring expressions. In: M. Stede, Ch. Chiarcos, M. Grabski and L. Lagerwerf (eds.), Salience in discourse. Proceedings of the 6th Workshop on Multidisciplinary Approaches to Discourse (MAD-05), Chorin.
Cristea, D., N. Ide, and L. Romary 1998. Veins Theory. A model of global discourse cohesion and coherence. In Proc. 36th Ann. Meeting of the ACL, 281–285.
Fox, B. 1987. Discourse Structure and Anaphora. Cambridge: CUP.
Givón, T. 1983. Topic continuity in discourse. Amsterdam: Benjamins.
Givón, T. 1990. Syntax. A functional-typological introduction. Amsterdam: Benjamins.
Götze, M. 2003. Zur Annotation von Informationsstruktur. Diploma thesis, University of Potsdam.
Grosz, B. and C. Sidner 1986. Attention, intentions, and the structure of discourse. Computational Linguistics 12(3):175–204.
Grosz, B., A.K. Joshi, and S. Weinstein 1995. Centering: A framework for modelling the local coherence of discourse. Computational Linguistics 21(2):203–225.
Ide, N. and D. Cristea 2000. A Hierarchical Account of Referential Accessibility. In Proc. 38th Ann. Meeting of the ACL.
Kibrik, A. 2000. A cognitive calculative approach towards discourse anaphora. In Proc. Discourse anaphora and anaphor resolution conference (DAARC’2000).
Kibrik, A. and O. Krasavina 2005. A corpus study of referential choice. The role of rhetorical structure. In Proceedings of DIALOG’05, Zvenigorodsky, Russia.
Krasavina, O. 2005. Types of third-person pronouns and salience conditions. In: M. Stede, Ch. Chiarcos, M. Grabski and L. Lagerwerf (eds.), Salience in discourse. Proceedings of the 6th Workshop on Multidisciplinary Approaches to Discourse (MAD-05), Chorin.
Krasavina, O. and Ch. Chiarcos 2007. PoCos - Potsdam Coreference Scheme. In Proceedings of the Linguistic Annotation workshop held in conjunction with ACL 2007, Prague, June 2007, 156–163.
Mann, B. and S. Thompson 1988. Rhetorical Structure Theory. TEXT 8(3):243–281.
Marcu, D. 2000. Extending a formal and computational model of Rhetorical Structure Theory. In Proc. 18th Int. Conf. on Computational Linguistics (COLING’2000).
Moser, M. and J. Moore 1996. Toward a synthesis of two accounts of discourse structure. Computational Linguistics 22(3):409–419.
Poesio, M., B. di Eugenio and G. Keohane 2002. Discourse Structure and Anaphora: An Empirical Study. University of Essex, NLE Technical Note TN-02-02.
Polanyi, L. 1988. A formal model of the structure of discourse. Journal of Pragmatics 12:601–638.
Reitter, D. and M. Stede 2003. Step by step: Underspecified markup in incremental rhetorical analysis. In Proceedings of the 4th international workshop on Linguistically Interpreted Corpora (LINC-03).
Tetreault, J. and J. Allen 2003. An empirical evaluation of pronoun resolution and clausal structure. In Proc. 2003 Int. Symp. on Reference Resolution and its Applications to Question Answering and Summarization, 1–8.
Walker, M.A. 1998. Centering, anaphora resolution, and discourse structure. In: M.A. Walker, A.K. Joshi, and E.F. Prince (eds.), Centering in Discourse. Oxford: Oxford University Press, 401–435.
Walker, M.A. 2000. Toward a model of the interaction of centering with global discourse structure. Verbum.
Webber, B. 1991. Structure and ostension in the interpretation of discourse deixis. Natural Language and Cognitive Processes 2(6):107–135.
Underspecified discourse representation*
Markus Egg and Gisela Redeker
This paper proposes an approach to discourse structure that builds on syntactic structure to derive that part of discourse structure that can be captured without taking recourse to deep semantic or conceptual knowledge. This contribution is typically only partial; we intend to capture this partiality in terms of underspecified constraints that describe (but do not enumerate) the structures a given discourse might have. This allows a rather straightforward interface from syntax to discourse and yields a clean interface to modules of discourse resolution.
1. Introduction
The analysis of discourse structure has been gaining increasing importance in Natural Language Processing. Discourse structure provides semantic information that interacts with the meaning of clauses (and other constituents of the discourse that are atoms in discourse structure) in the derivation of the full interpretation of the discourse. Consider e.g., (1) from Asher and Lascarides (2003). In its preferred interpretation, C2–C5 give further details about the evening described in C1, and C3–C4 about the meal described in C2. This means that the eventualities (states of affairs) described in C2–C5 are part of Max’s evening, and C3–C4 describe parts of his meal as introduced by C2. This information goes beyond the compositionally derived semantics of C1–C5 and complements it. I.e., this information, like the result of semantic composition, is only partial in that it does not fully determine the interpretation of the discourse.
(1) Max experienced a lovely evening last night (C1). He had a fantastic meal (C2). He ate salmon (C3). He devoured lots of cheese (C4). He won a dancing competition (C5).
Asher and Lascarides (2003) show that the derivation of such a fully specified discourse structure presupposes a semantic analysis of the discourse atoms (clauses and other constituents that are atoms in discourse structure) as well as vast amounts of conceptual knowledge. However, for any computational attempt at analysing discourse structure and its contribution to the meaning of a discourse, this raises the question of how to fulfill these presuppositions. There is as yet no system available for the computational determination of discourse-atom semantics, let alone wide-coverage representations of conceptual knowledge. I.e., modelling the interaction of clause- and discourse-level semantics as described in Asher and Lascarides (2003) is at present not a realistic goal for a computational approach to discourse structure. Thus, we pursue a more modest goal in our paper, viz., to derive information on discourse structure solely on the basis of syntactic structure and an appropriate syntax-discourse interface. In this respect, we follow researchers like Marcu (1997), Schilder (2002), and Webber (2004). Descriptions of discourse structure that are obtained in this way are characteristically only partial, since they use syntactic structure as the only knowledge source determining the eventual discourse structure. This suggests formalising such descriptions as constraints on discourse structure (Schilder 2002), similar to the ones used in the treatment of structural ambiguity in underspecification formalisms (Reyle 1993; Copestake et al. 2005; Egg et al. 2001). We first introduce discourse relations and discourse structure representations and the syntax-discourse interface on which our analyses are based. For an extended example we show how an incrementally built initial underspecified discourse structure representation can be enriched by further information derivable from syntactic structure. We then defend our representations against some counterarguments raised in the literature and compare our approach to related work.

* We thank the participants of CID ’05 in Dortmund and two anonymous reviewers for valuable comments.
2. Discourse structure
Various systems of discourse relations have been proposed in recent approaches to the modelling of coherence and discourse connectives (Marcu 1997; Redeker 2000; Carlson et al. 2003; Soricut and Marcu 2003; Asher and Lascarides 2003; Webber 2004). The most explicit and elaborated one is Marcu (1997), an extension of the empirically very successful Rhetorical Structure Theory (RST) (Mann and Thompson 1988). RST analyses are based on the analyst’s plausibility judgments and have been applied to many text types in many languages, e.g., Dutch and German (Abelen et al. 1993; Stede 2004).1
1. RST distinguishes two kinds of relations: The asymmetric mononuclear relations like elaboration or justify relate a nucleus (centrally important) and a satellite (additional information, which could in many cases be left out without rendering the text incoherent). The symmetric multinuclear relations like list or joint relate discourse entities of equal status.
We will base our analyses on the set of relations as defined in (classical) RST, unless otherwise stated. These relations are set in small capitals. The relations between discourse segments are sometimes (but not always) indicated by explicit discourse connectives such as so, but, or while. For the interpretation of Dutch connectives, we will draw on the comparative research on English and Dutch discourse connectives by Knott and Sanders (1998). On the basis of the discourse relations, discourse structure is modelled in terms of binary trees in the following way: The leaves of these trees stand for the discourse atoms, while all other nodes of the tree correspond to complex discourse constituents. The label of the node for a complex constituent indicates the relation that links its immediate subconstituents. E.g., a text with two clauses C1 (nucleus) and C2 (satellite) related by an elaboration relation would schematically be depicted as follows:
(2)   elaboration_n
        /     \
      C1       C2
Here the mother describes a functor, and its daughters the arguments of the functor. To distinguish nucleus and satellite among the arguments (where appropriate), a subscript (n or s) indicates the status of the left daughter. The trees we assume for our analyses differ from those assumed in RST. In RST trees, all the nodes are discourse units and the mother node is the convex union of all its daughter nodes. Daughters are related by different sorts of links, which also determine their position as satellite or nucleus. But our trees and RST trees have in common that all leaves are discourse atoms. (2) would be rendered as (3) in RST. The discourse structure is a tree whose mother is the segment consisting of C1 and C2 and whose daughters are the nucleus C1 and the satellite C2 elaborating on C1:
(3) [RST diagram: span 1-2 with daughters 1 (nucleus) and 2 (satellite), linked by Elaboration]
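The binary trees used here, with a relation label on each mother node and a subscript for the status of the left daughter, can be written down as a small data structure. The following is a sketch only; the class and function names (`Atom`, `Node`, `leaves`, `show`) are illustrative and not part of the approach itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Atom:
    """A discourse atom (leaf), e.g. a clause such as C1."""
    name: str


@dataclass(frozen=True)
class Node:
    """A complex discourse constituent labelled with a discourse relation."""
    relation: str
    left: "Tree"
    right: "Tree"
    # Status of the LEFT daughter: "n" (nucleus), "s" (satellite), or None
    # where the distinction does not apply (multinuclear relations).
    left_status: Optional[str] = None


Tree = Union[Atom, Node]


def leaves(tree: Tree) -> List[str]:
    """Names of the discourse atoms in left-to-right order."""
    if isinstance(tree, Atom):
        return [tree.name]
    return leaves(tree.left) + leaves(tree.right)


def show(tree: Tree) -> str:
    """Render a tree as relation_status(left, right)."""
    if isinstance(tree, Atom):
        return tree.name
    label = tree.relation + (f"_{tree.left_status}" if tree.left_status else "")
    return f"{label}({show(tree.left)}, {show(tree.right)})"


# Example (2): C1 (nucleus) elaborated by C2 (satellite).
example_2 = Node("elaboration", Atom("C1"), Atom("C2"), left_status="n")
print(show(example_2))    # elaboration_n(C1, C2)
print(leaves(example_2))  # ['C1', 'C2']
```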
For the simple examples discussed so far and in the next two sections, the transformation from one type of tree to the other is straightforward, but not for nuclei with more than one satellite. We will discuss this problem in connection with the extended example (17) in section 5.3 below. As soon as we try to account for more complex examples, we are faced with the problem that discourse structure can only be described in part. Consider e.g., (4):
(4) John is stubborn (C1). His sister is stubborn (C2). His parents are stubborn (C3). So, they are continually arguing (C4).
While C2 and C3 are attached to the previous discourse by implicit connectives (expressing a list), C4 presents a (non-volitional) result of a suitable part of the preceding discourse due to the connective so. This does not fix the discourse structure in (4) completely: In its preferred interpretation, C4 is the result of C1–C3, i.e., due to the stubbornness of the whole family, they are constantly arguing. In another, less preferred reading, C4 is the result of C3 only (while John and his sister are stubborn, his parents are too, and the latter is the reason why the parents are constantly arguing), but there is no reading in which C4 is the result of exactly C2 and C3. In the following section we will show that the chosen way of representing discourse structure is capable of dealing with incomplete information on discourse structure.
3. Representing discourse structure
This section introduces the representation of discourse structure that underlies the analyses in this paper. We describe (partial) information on discourse structure by expressions of a suitable underspecification formalism, here, a version of the Constraint Language for Lambda Structures (CLLS; Egg et al. 2001). First, we will present these expressions in a more intuitive way in section 3.1, then we will introduce the formal foundations for the expressions in section 3.2.
3.1. Underspecified and fully specified discourse representations
As a first example for the expressions that represent information on discourse structure, consider (5), the discourse representation for (4). Such expressions are called constraints and describe a number of discourse structures, which are all formalised as tree structures. The key ingredients of constraints are (reflexive, transitive and antisymmetric) dominance relations, which are indicated by dotted lines (see Schilder 2002 for a similar approach). Dominance of X1 over X2 means that X2 is part of the structure below (and including) X1, but there might be additional material intervening between X1 and X2. In these constraints and the trees they describe, ‘∅n’ stands for the (very unspecific) discourse relation as introduced by the n-th implicit discourse connective,2 SO for the relation introduced by so, and Cn for the meaning of the n-th clause of the discourse:
(5)    ∅1          ∅2          SO
      /  \        /  \        /  \
     •    •      •    •      •    •
     :     :    :      :    :     :
     C1      C2          C3       C4
(Dotted dominance lines are rendered as ‘:’.)
2. Indices are merely added to facilitate reference to different tokens of the same relation.

In prose: The three discourse connectives (the implicit connectives ∅1 and ∅2 and the explicit so) are all binary in that they link two text segments, which are represented as their daughters. Thus, C1 is linked to a part of the following discourse (including at least C2) by the implicit connective ∅1, ∅2 connects two discourse segments (comprising at least C2 and C3, respectively), and, finally, so connects a discourse segment to its left (which includes at least C3) to C4. This constraint is compatible with a number of tree structures, called its solutions. If we assume that these tree structures may only comprise material that is already introduced in the constraint, then there are exactly five fully specified tree structures compatible with the constraint. These tree structures describe the potential discourse structures for (4) (see Webber 2004). (6d–e) model the preferred and (6a–b) the less preferred interpretation of (4); the unacceptable interpretation of (4) is modelled by (6c):
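(6) [five tree structures, given here in bracket notation, where R(X, Y) is a tree with mother R, left daughter X and right daughter Y]
(a), (b): ∅2(∅1(C1, C2), SO(C3, C4)) and ∅1(C1, ∅2(C2, SO(C3, C4)))
(c): ∅1(C1, SO(∅2(C2, C3), C4))
(d), (e): SO(∅1(C1, ∅2(C2, C3)), C4) and SO(∅2(∅1(C1, C2), C3), C4)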
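The count of five solutions can be checked with a short enumeration. This is a sketch under the assumptions stated in the text: the constraint is a simple chain in which each connective sits between two neighbouring clauses, and solutions may only use material introduced in the constraint. Under these assumptions the solutions are exactly the binary trees whose left-to-right reading is C1, ∅1, C2, ∅2, C3, SO, C4. The function name `solutions` and the tuple encoding of trees are illustrative.

```python
from typing import List, Sequence, Tuple, Union

# A tree is either an atom name or a triple (connective, left, right).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def solutions(atoms: Sequence[str], connectives: Sequence[str]) -> List[Tree]:
    """All binary trees over the atoms, with connective i between atoms i and i+1."""
    if len(atoms) != len(connectives) + 1:
        raise ValueError("need exactly one connective between neighbouring atoms")
    if len(atoms) == 1:
        return [atoms[0]]
    trees: List[Tree] = []
    for i, connective in enumerate(connectives):
        # Choose connective i as the root; split the remaining material around it.
        for left in solutions(atoms[:i + 1], connectives[:i]):
            for right in solutions(atoms[i + 1:], connectives[i + 1:]):
                trees.append((connective, left, right))
    return trees


trees_for_4 = solutions(["C1", "C2", "C3", "C4"], ["∅1", "∅2", "SO"])
print(len(trees_for_4))  # 5
for tree in trees_for_4:
    print(tree)
```

The enumeration is purely structural: it also returns the tree corresponding to (6c), which the text rules out as an interpretation of (4).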
3. To keep track of such specifications, numeric subscripts are sometimes preserved in the tree structures. List models a specific kind of conjunction in RST, where the arguments must be comparable (as opposed to joint).

So far, we have only arranged the various discourse atoms into a tree structure. A second task, which is crucial for the derivation of fully specified discourse structure, is the specification of the discourse relations. This task is due to the fact that discourse connectives themselves need not fully determine discourse relations. For implicit discourse markers, this is quite obvious, but explicit discourse markers, too, do not always fully specify a discourse relation, which shows up e.g., in the taxonomies for English and Dutch discourse markers in Knott and Sanders (1998): In these taxonomies, the semantic contributions of explicit discourse markers are not restricted to the bottom elements, but often show up as elements higher up in the hierarchy. For example (4) and its potential representations (6a–e), the specification of the discourse relations goes as follows. First, the semantic contributions of the implicit discourse markers (modelled as labels ∅1 and ∅2) are specified to a list relation.3 Since lists may comprise more than two elements, we break them down into binary-branching subtrees, which add one element at a time. Second, SO, the semantic contribution of so, is specified to the discourse relation of non-volitional result in the context of (4). (Formalisation of this step goes beyond the CLLS formalism proper, see section 3.2.2 for the technical details.) Thus, eventually, we obtain (7a) or (7b) as final representations of the preferred discourse structure of (4):
(7) (a)
resultn
(b)
C4
list2
list2
C1
C2
C1
C4
list1
C3
list1
resultn
C2
C3
As a second example, consider the discourse structure of (8) [= (1)]. Its five sentences are connected by implicit discourse connectives, which gives rise to the constraint (9):
(8) Max experienced a lovely evening last night (C1). He had a fantastic meal (C2). He ate salmon (C3). He devoured lots of cheese (C4). He won a dancing competition (C5).
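(9) [Figure: constraint over the atoms C1–C5 with node variables labelled ∅1–∅4; as in (5), the left daughter of each ∅i-node variable dominates Ci and its right daughter dominates Ci+1.]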
The preferred interpretation (10a) for (8) is based on one of the tree structures that are described in (9). In this tree structure, the discourse relations are not yet specified:
(10) (a) ∅1(C1, ∅4(∅2(C2, ∅3(C3, C4)), C5))
(b) elaborationn1(C1, joint4(elaborationn2(C2, list3(C3, C4)), C5))
An appropriate specification models the fact that in (8), C2–C5 elaborate C1 (the evening), C3–C4 elaborate C2 (the meal), while C2–C4 (as a whole) and C5 are related by a joint relation, and C3 and C4 form a list. Thus, in the eventual representation (10b) for (9), the ∅-relations are specified appropriately. In sum, underspecified constraints on discourse structures are an efficient way of modelling partial information on discourse structure.
Before we formalise our discourse structure representations, we discuss an issue that seems to clash with our claim that discourse structures can be modelled as trees, viz., the question of what it means for a relation to link two nonatomic discourse segments D1 and D2. Marcu (1996) says that this is possible if and only if the relation also
holds between the nuclei of D1 and D2 (if these segments are mononuclear). This condition may apply recursively. Danlos (2004, 2006) formulates this ‘nuclearity principle’ as follows: What looks like relations between nonatomic discourse segments are in fact relations between their nuclei, because the arguments of discourse relations can only be discourse atoms (and segments whose top relation is multinuclear). But then discourse structures cannot be trees, as nodes may have several parents. Consider e.g., (10b) in Danlos’ analysis: The relation joint4 would link C2 (the head of the segment C2–C4 instead of the segment as a whole) to C5, thus, C2 would have two parents, viz., one for the elaboration2 relation between C2 and C3–C4, and one for the relation joint4. We regard the ‘nuclearity principle’ as a means of understanding discourse structure representations without being a part of these representations. E.g., our understanding of the tree (10b) would include the insight that, eventually, the relation joint4 between C2–C4 and C5 also means that C2 is joined to C5, but this claim is not hardwired into the discourse representation. In section 5 we will make ample use of this weak version of the ‘nuclearity principle’.
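The weak reading of the ‘nuclearity principle’ can be pictured as a function that interprets a tree without changing it. The Python sketch below is a minimal illustration under stated assumptions: each inner node records its nuclearity explicitly as "n" (the left daughter is the nucleus), "s" (the right daughter is the nucleus) or "multi" (multinuclear), which is our reading of the subscripts on relation names; the function names `nucleus` and `weak_reading` are hypothetical.

```python
from typing import Tuple, Union

# A segment is an atom name or a node (relation, nuclearity, left, right).
# Assumed encoding: "n" = nucleus is the left daughter,
# "s" = nucleus is the right daughter, "multi" = multinuclear relation.
Segment = Union[str, Tuple[str, str, "Segment", "Segment"]]


def nucleus(segment: Segment) -> Segment:
    """Follow nucleus daughters until an atom or a multinuclear segment is reached."""
    if isinstance(segment, str):
        return segment
    _, nuclearity, left, right = segment
    if nuclearity == "n":
        return nucleus(left)
    if nuclearity == "s":
        return nucleus(right)
    return segment  # multinuclear: the recursion stops here


def weak_reading(node: Segment):
    """What a relation 'eventually' links: the nuclei of its two arguments."""
    relation, _, left, right = node
    return (relation, nucleus(left), nucleus(right))


# Tree (10b): the tree itself is never modified.
c2_c4 = ("elaboration", "n", "C2", ("list", "multi", "C3", "C4"))
joint4 = ("joint", "multi", c2_c4, "C5")
tree_10b = ("elaboration", "n", "C1", joint4)

print(weak_reading(joint4))
```

For the joint4 node of (10b), the function returns the pair C2 and C5, i.e., the insight that C2 is joined to C5, while the tree continues to record joint4 as a relation between the segment C2–C4 and C5.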
3.2. Formal foundations of discourse representations
After the informal introduction to the discourse representations used in this paper in the preceding section, we will now characterise them in a more formal way. We will first show how the constraints on discourse structure can be expressed in the Constraint Language for Lambda Structures (CLLS; Egg et al. 2001), and how the arrangement of discourse atoms into a tree structure can be handled in CLLS (section 3.2.1). For the specification of discourse relations, however, we must extend CLLS, which will be discussed in section 3.2.2.
3.2.1 Arranging discourse atoms in CLLS
In CLLS, constraints on tree structures introduce node variables, labels for these variables, and dominance relations between them.4 Intuitively, node variables correspond to discourse segments. Labels correspond to discourse relations: they specify single discourse relations (e.g., elaborationn) or indicate the information of a given connective about a discourse relation (e.g., SO or ∅). For atomic segments, labels specify a unique name. Finally, dominance relations indicate those parts of a discourse structure that are not yet known. The graphical representations for constraints used so far are shorthand for conjunctions of atomic constraints on tree structures. E.g., the constraint in (11a) is spelt out in (11b), where ‘◁*’ indicates dominance:
(11) (a) [Figure: the node variable X1, labelled label1, has the daughters X2 and X4; X2 dominates X3, which is labelled label2.]
(b) X1 : label1(X2, X4) ∧ X2 ◁* X3 ∧ X3 : label2
4. We simplify CLLS in two ways here: First, some of the atomic constraints in CLLS (e.g., the ones for λ-binding or parallelism) are omitted. These constraints are only useful if CLLS structures are used to describe λ-terms as in Egg et al. (2001) (this was the original goal of the formalism, which also explains its name). Second, discourse connectives are represented as binary CLLS node labels. CLLS proper would represent them (just like the discourse atoms) as nullary labels and model the application of the connective to its arguments in terms of explicit nodes for functional application (labelled by ‘@’), whose daughters are nodes for functor and argument. E.g., (2) would be rendered as (i):
(i) @(@(elaborationn, C1), C2)
This allows us to make the intuitive partial ordering of strength between such constraints more precise: C1 is at least as strong as C2 iff C1 comprises at least all the atomic constraints of C2. Now the arrangement of discourse atoms (as given in a constraint) into a tree structure can be specified as follows: A tree structure is described by (or compatible with) a constraint, if there is a variable assignment (for the node variables of the constraint) into the domain of the tree structure (i.e., its nodes) that satisfies every atomic constraint within the constraint. E.g., both the tree structures (12a) and (12b) are compatible with (11). In (12a), both X2 and X3 are mapped onto N2, which is compatible with X2 ◁* X3, since dominance includes identity:
(12) (a) [Figure: the node N1, labelled label1, has the daughters N2 (labelled label2) and N4.]
(b) [Figure: a larger tree rooted in N1 (labelled label1) with the nodes N1–N7, which contains further labels (label3, label4, label5) besides label2.]
Note that actual nodes are distinguished from node variables by their names (Nn and Xn, respectively, where n ∈ ℕ). Graphically represented tree structures such as those in (12) are also just shorthand for conjunctions of atomic relations between nodes, for instance, (12a) depicts N1 : label1(N2, N4) ∧ N2 : label2. These examples illustrate that, in fact, constraints like (11) describe an infinite number of tree structures, because the material below the node assigned to X2 is not restricted, except that it must comprise the node assigned to X3. However, in this paper, we are only interested in so-called constructive solutions, where the mapping is surjective, i.e., every node in the solution corresponds to a node variable in the constraint.
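The notion of compatibility can be made concrete with a small brute-force check. The following Python sketch is an illustration under simplifying assumptions, not an implementation of CLLS: a tree structure is a dictionary from node names to a label and a list of daughters, a constraint is a list of labelling and dominance atoms, and all assignments of variables to nodes are tried in turn. The encoding and the function names are hypothetical, and the leaf N4 of (12a) is given no label because the text does not state one.

```python
from itertools import product


def dominates(tree, a, b):
    """Reflexive, transitive dominance: dominance includes identity."""
    return a == b or any(dominates(tree, child, b) for child in tree[a][1])


def satisfies(tree, constraint, g):
    """Check every atomic constraint under the variable assignment g."""
    for atom in constraint:
        if atom[0] == "lab":  # ("lab", X, label, [daughter variables])
            _, x, label, args = atom
            node_label, children = tree[g[x]]
            if node_label != label or children != [g[a] for a in args]:
                return False
        else:  # ("dom", X, Y): X dominates Y
            _, x, y = atom
            if not dominates(tree, g[x], g[y]):
                return False
    return True


def assignments(tree, constraint, variables):
    """All variable assignments into the nodes of `tree` that satisfy `constraint`."""
    nodes = sorted(tree)
    found = []
    for values in product(nodes, repeat=len(variables)):
        g = dict(zip(variables, values))
        if satisfies(tree, constraint, g):
            found.append(g)
    return found


def is_constructive(tree, g):
    """A solution is constructive if the assignment is surjective."""
    return set(g.values()) == set(tree)


# Constraint (11b) and tree structure (12a).
constraint_11 = [
    ("lab", "X1", "label1", ["X2", "X4"]),
    ("dom", "X2", "X3"),
    ("lab", "X3", "label2", []),
]
tree_12a = {
    "N1": ("label1", ["N2", "N4"]),
    "N2": ("label2", []),
    "N4": (None, []),  # label not given in the text
}

result = assignments(tree_12a, constraint_11, ["X1", "X2", "X3", "X4"])
print(result, [is_constructive(tree_12a, g) for g in result])
```

For (12a) the search finds exactly one assignment, which maps both X2 and X3 onto N2, and this assignment is surjective, so (12a) counts as a constructive solution of (11).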
3.2.2. Specifying discourse relations in an extension of CLLS
For the specification of discourse relations, we must extend CLLS. First, we assume a join-semilattice structure 〈L, ≤〉 for the set of labels L. Atomic elements of this structure represent the discourse relations themselves. Since the relation ‘≤’ can be interpreted as ‘is more specific than’, all other elements of the lattice, in particular, its greatest element ∅, model partial information on discourse relations. These elements comprise labels for the discourse connectives themselves (written as the name of the connective in capital letters, e.g., ‘WANT’ for Dutch want ‘because’; in addition, ‘∅’ models the semantic contribution of the implicit discourse connective). In this way, one can represent the fact that connectives need not fully specify a discourse relation. The lattice structure formalises the intuition of Knott and Sanders (1998) that connectives can be arranged into a taxonomy.5 Then we can extend the above partial ordering of strength between constraints recursively to account for cases of different labels for the same node variable: C1 is at least as strong as C2 iff C1 comprises at least all the atomic constraints of C2 or if all of the following conditions are met:
● $C_1 = X_n : \mathit{label}_1(\vec{Y}) \wedge \varphi_1$, where ‘$\vec{Y}$’ stands for a specific sequence of zero or more arguments of $\mathit{label}_1$
● $C_2 = X_m : \mathit{label}_2(\vec{Y}) \wedge \varphi_2$
● $\mathit{label}_1 \leq \mathit{label}_2$
● $\varphi_1$ is at least as strong as $\varphi_2$
Analogously, the notion of solution can be extended in that an atomic constraint $X_n : \mathit{label}_1(\vec{Y})$ can be satisfied by $N_n : \mathit{label}_2(\vec{M})$ under a specific variable assignment iff $\mathit{label}_2 \leq \mathit{label}_1$ and the assignment maps $X_n$ onto $N_n$, and the elements of the sequence of node variables $\vec{Y}$ onto the elements of the sequence of nodes $\vec{M}$.
Fully specified discourse structures are thus modelled as solutions of underspecified constraints on discourse structures such as (5) and (9). Solving such constraints usually involves adding further dominance relations between constraint node variables and specifying labels for discourse relations. E.g., the crucial step from (5) to a representation of the preferred reading of (4) consists in adding a dominance relation between the left daughter of the so- and the ∅1-node variable. This rules out all but the last two possibilities in (6). Then the ∅1- and ∅2-node variables can be arranged in either order (since either connective is interpreted as a list relation).
In sum, the proposed approach allows a straightforward analysis of discourse structure. What is more, in this approach one can model partial descriptions of discourse structure and their resolution (or disambiguation) in terms of (monotonically) strengthening the involved constraints. In the next section we will show that this approach also allows a straightforward interface to syntax, i.e., a simple mapping from syntactic structure to discourse constraints.
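The role of the label ordering in this extended notion of solution can be illustrated with a toy taxonomy. The Python sketch below is hypothetical throughout: the fragment of the label hierarchy is invented for the example (only the facts that every label lies below the implicit connective and that non-volitional result lies below SO are taken from the discussion of (4)), `EMPTY` stands in for the label of the implicit connective, and the function names are our own.

```python
# Hypothetical fragment of a label taxonomy: each label is mapped to the
# labels immediately above it (i.e., its less specific neighbours).
MORE_GENERAL = {
    "EMPTY": [],            # label of the implicit connective, greatest element
    "SO": ["EMPTY"],
    "WANT": ["EMPTY"],
    "result_n": ["SO"],     # non-volitional result, a specification of SO
    "list": ["EMPTY"],
    "joint": ["EMPTY"],
}


def leq(a: str, b: str) -> bool:
    """a <= b: label a is at least as specific as label b."""
    return a == b or any(leq(parent, b) for parent in MORE_GENERAL[a])


def label_satisfies(node_label: str, constraint_label: str) -> bool:
    """A node label satisfies a constraint label iff it is at least as specific."""
    return leq(node_label, constraint_label)


print(label_satisfies("result_n", "SO"))   # a result node can resolve SO
print(label_satisfies("list", "EMPTY"))    # a list node can resolve the implicit connective
print(label_satisfies("list", "SO"))       # not in this toy taxonomy
```

Resolving a constraint then only ever replaces a label by a more specific one, which is what makes the strengthening of constraints monotonic.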
5. This formalisation can be used for classifications of different granularity, e.g., in the realm of conjunctive discourse relations, which is given a much finer-grained partition in Knott and Sanders (1998) than in (classical) RST.
4. Constructing and resolving discourse constraints
In this section, we will first introduce the syntax-discourse interface for our analyses and then work out a larger example.
4.1. The syntax-discourse interface
Constraints such as (5) and (9) are derived incrementally by simple interface rules: For incoming sentences consisting of one clause C, the left daughter of the node variable that carries the label for their discourse connective dominates the node variable for the immediately preceding discourse segment C0, and its right daughter dominates a node variable for C. C then becomes the new immediately preceding discourse segment:
(13) [Figure: on the left, the constraint up to now, which ends in the immediately preceding discourse segment C0 (old); on the right, the new constraint, in which a node variable labelled D has been added whose left daughter dominates C0 and whose right daughter dominates C, the immediately preceding discourse segment (new).]
For sentences S consisting of two clauses C1 and C2 related by a discourse connective D, e.g., the fourth sentence in (17) below, the daughters of the D-node variable dominate C1 and C2, respectively, and the daughters of a node variable with the label for the (implicit or explicit) discourse connective D´ (that links S as a whole to the preceding discourse) dominate the node variable for C0 and the D-node variable, respectively. In addition, C2 is determined as the new preceding discourse segment for the next sentence. (14) visualises this updating procedure:
(14) [Figure: on the left, the constraint up to now, which ends in the immediately preceding discourse segment C0 (old); on the right, the new constraint, in which a node variable labelled D´ has been added whose left daughter dominates C0 and whose right daughter dominates the D-node variable; the daughters of the D-node variable dominate C1 and C2, and C2 is the immediately preceding discourse segment (new).]
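The two update rules can be sketched as a small incremental procedure. The Python code below is an illustration under stated assumptions, not an implementation used in this paper: constraints are lists of atomic labelling and dominance constraints encoded as tuples, fresh node variables are named X1, X2, and so on, and the class and method names (`ConstraintBuilder`, `add_clause`, `add_two_clauses`) as well as the ASCII labels `EMPTY1` and `EMPTY2` for the implicit connectives are hypothetical.

```python
class ConstraintBuilder:
    """Builds a discourse constraint incrementally, one sentence at a time."""

    def __init__(self, first_clause):
        self.atoms = []      # atomic constraints, encoded as tuples
        self.counter = 0
        # The first clause is the initial 'immediately preceding segment'.
        self.prev = self._atom_node(first_clause)

    def _fresh(self):
        self.counter += 1
        return f"X{self.counter}"

    def _atom_node(self, clause):
        x = self._fresh()
        self.atoms.append(("lab", x, clause, ()))
        return x

    def _connective_node(self, label):
        x, left, right = self._fresh(), self._fresh(), self._fresh()
        self.atoms.append(("lab", x, label, (left, right)))
        return x, left, right

    def add_clause(self, clause, connective):
        """Rule (13): a sentence consisting of one clause."""
        _, left, right = self._connective_node(connective)
        c = self._atom_node(clause)
        self.atoms.append(("dom", left, self.prev))  # left daughter dominates C0
        self.atoms.append(("dom", right, c))         # right daughter dominates C
        self.prev = c

    def add_two_clauses(self, c1, c2, inner, outer):
        """Rule (14): two clauses linked by `inner`; `outer` links the sentence."""
        d, d_left, d_right = self._connective_node(inner)
        x1, x2 = self._atom_node(c1), self._atom_node(c2)
        self.atoms.append(("dom", d_left, x1))
        self.atoms.append(("dom", d_right, x2))
        _, o_left, o_right = self._connective_node(outer)
        self.atoms.append(("dom", o_left, self.prev))  # dominates C0
        self.atoms.append(("dom", o_right, d))         # dominates the D-node variable
        self.prev = x2


# Deriving a constraint for example (4) with rule (13) only.
b = ConstraintBuilder("C1")
b.add_clause("C2", "EMPTY1")
b.add_clause("C3", "EMPTY2")
b.add_clause("C4", "SO")
for atom in b.atoms:
    print(atom)
```

Each call only adds atomic constraints and never retracts one, so the description can be built up alongside syntactic parsing and strengthened later.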
For other sentences consisting of more than one clause, additional assumptions are called for, in particular, for sentences with an embedded sentence S. Here S becomes the new preceding discourse segment. This is illustrated by cases such as (15), where the second sentence is linked to only the embedded sentence of its predecessor. In this way, we can model the intuition that the second sentence in (15) is also embedded in the modal context of Max’s wish: (15) Max wished that a wolf would come in. It would devour his nasty supervisor.
Formally, we can handle this low attachment of the second sentence with a rule that resembles (13), but incorporates two additional assumptions: First, we must encode the relation between matrix clause C1 and embedded sentence C2 in terms of a common dominating node variable of attribution (i.e., the wish that a wolf would come in is attributed to Max; this relation is introduced in Carlson et al. 2003). The node variable for the satellite C1 is the left child of the attribution node variable, the right child of this node variable dominates the variable for C2. Second, we determine C2 as the immediately preceding discourse segment:
(16) [Figure: on the left, the constraint up to now, which ends in the immediately preceding discourse segment C0 (old); on the right, the new constraint, in which a node variable labelled D has been added whose left daughter dominates C0 and whose right daughter dominates an attribution node variable; the left child of the attribution node variable is C1, and its right child dominates C2, the immediately preceding discourse segment (new).]
In the following section we will show how the rules (13) and (14) can be used in order to construct initial discourse constraints such as (5) and (9). These constraints can be derived incrementally, along with syntactic parsing. However, the syntactic structure of a discourse may yield more clues to the discourse structure, which can then be used to constrain these initial descriptions of discourse structure further. This two-level strategy is also employed in Schilder (2002). Such clues include the parallel structures of C1–C3 in (4), which strongly suggest that they should combine to form one single constituent in the discourse structure. This is borne out by our preference for (6d) or (6e) as its discourse structure. A second clue is modal subordination (Roberts 1989), which shows up in (15): The auxiliary in the second sentence indicates that this sentence is still part of the modal context introduced by the matrix verb of the first sentence. Further clues are the syntactic position of temporal clauses (Schilder 1998) and cleft sentences (Delin and Oberlander 1995). The extended example in the following section will illustrate such clues.
4.2 An extended example
With a larger example (from a Dutch fund-raising letter) we will now show how much information can be gathered by an appropriate syntax-discourse interface:
(17) Helaas raken de Nederlandse asielen iedere zomer weer boordevol met dakloze dieren. (C1) Dieren die om welke reden dan ook door hun baasje zijn weggedaan en die nu aan hun lot zijn overgelaten. (C2) Namens hen vragen wij om uw hulp. (C3) Want om deze dieren een beter bestaan te geven, (C4) is er natuurlijk geld nodig. (C5) Voor inentingen en sterilisaties. (C6) Voor uitbreiding van het aantal onderkomens. (C7) Voor extra medische zorg wanneer noodzakelijk. (C8)
(Unfortunately, the Dutch animal shelters fill to the brim with homeless animals every summer. (C1) Animals that have been done away with by their owner for whatever reason and that are now left to their destiny. (C2) It is in their name that we ask your help. (C3) Because to improve the existence of these animals (C4) there is of course a need of money. (C5) For vaccinations and sterilisations. (C6) For increase of the number of shelters. (C7) For extra medical care when necessary. (C8))
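Rules (13) and (14) derive (18) as the initial discourse structure representation of (17):
(18) [Figure: constraint over the atoms C1–C8 with node variables labelled ∅1, ∅2, WANT, OM, ∅3, ∅4 and ∅5. OM links C4 and C5, WANT links this sentence as a whole to the preceding discourse, and the implicit connectives link the remaining adjacent segments.]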
To derive a fully-fledged discourse structure representation from this constraint, we take advantage of further syntactic clues, in particular, the parallel syntactic structure of C6–C8. We assume that these structures give rise to lists. We pick (arbitrarily) one of the possible ways of modelling lists in terms of binary branching, viz., (19):
(19) list(C1, list(C2, ... list(Cn−1, Cn) ...))
In addition, we assume that such lists as a whole are linked directly to an immediately adjacent discourse segment (if there is one). We render this linking by a discourse relation node variable (here, the one labelled ∅3) such that its left child is the node variable for the first segment (here, C5), and its right child dominates the node variable for the second segment. Another potentially discourse-relevant piece of syntactic information is the fact that C2 consists only of an NP whose head word is a direct repetition of the last word of C1. This suggests a direct relation between the two clauses in terms of a discourse relation node variable whose children dominate the two clauses. For (18), this is of no avail, because there already is such a node variable, viz., the one labelled ∅1.6 Due to the parallel syntactic structure in C6–C8, constraint (18) can be strengthened to (20):
(20) [Figure: constraint (18) after strengthening. The ∅1-, ∅2- and WANT-node variables are still only related by dominance; the ∅3-node variable has C5 as its left child, and its right child dominates the list structure list(C6, list(C7, C8)).]
This constraint is much less ambiguous, since only the position of the ∅1-, ∅2- and WANT-node variables with respect to each other is not yet fixed. I.e., the ambiguity is analogous to the one in (5): there are five possible discourse structures left. Considering the fact that the number of solutions for simple ‘zigzag’ constraints like (5) and (9) with n discourse atoms is the Catalan number C(n–1), this considerably reduces the number of ambiguities (for the eight atoms of (17), C(7) = 429).7 At the same time, the implicit discourse connectives ∅4 and ∅5 are specified to the relation list.
6. At present, these rules have the status of hypotheses; we intend to validate them empirically against Dutch corpora like the PAROLE corpus (http://parole.inl.nl/) or the Corpus Gesproken Nederlands (http://www.tst.inl.nl/cgn.htm).
7. One of the formulae for the Catalan number of n is $C(n) = \frac{(2n)!}{n!\,(n+1)!}$.
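The figures quoted here can be checked directly from the factorial formula for the Catalan numbers. The following Python lines are only a worked check of the arithmetic; the function name `catalan` is our own.

```python
from math import factorial


def catalan(n: int) -> int:
    """Catalan number C(n) = (2n)! / (n! (n+1)!), computed with exact integers."""
    return factorial(2 * n) // (factorial(n) * factorial(n + 1))


# A 'zigzag' constraint with k discourse atoms has catalan(k - 1) solutions.
for atoms in (4, 5, 8):
    print(atoms, "atoms:", catalan(atoms - 1), "solutions")
```

With four atoms, as in (5), this gives five solutions, and with eight atoms 429, so that strengthening (18) to (20) reduces 429 candidate structures to five.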
5. Treeness of discourse structures
This section discusses the adequacy of the proposed discourse representations. We model discourse structures by specific tree structures, where the leaves are discourse atoms and the other nodes are given by the relations between discourse constituents. These structures are more restricted than the RST-style trees (each of our trees can be mapped into an RST-style tree but not vice versa), let alone representations of discourse structures in terms of graph structures as suggested by Knott et al. (2001), Danlos (2004, 2006), or Wolf and Gibson (2005).
This section will be devoted to a number of discourses that might be adduced as counterexamples proving that our notion of discourse structure is too restricted. We will show how this seemingly contrary evidence can be explained away. These potential counterexamples fall into three groups, which are ordered with respect to the specific problems that are claimed to emerge from the attempt to model their structure in terms of trees. These problems are crossed dependencies, discontinuous constituents, and structures with multiple parents.
5.1 Crossed dependencies
Wolf and Gibson (2005) claim that discourse structures often exhibit crossed dependencies, like, e.g., in the following example:
(21) Schools tried to teach students history of science (C1). At the same time they tried to teach them how to think logically and inductively (C2). Some success has been reached on the first of these aims (C3). However, none at all has been reached on the second (C4).
According to Wolf and Gibson (2005), C3 links to C1 and C4 to C2, as elaborations, while C1 and C2 on the one side and C3 and C4 on the other side are supposed to form contrasts. We disagree with this analysis, because it fails to take the surface structure (order and connectives) into account. The analysis can derive crossover only by assuming that the relatedness of C1 and C3 and the one of C2 and C4 should be represented as a direct relation between those segments. But the text first relates C1 and C2, and C3 and C4, respectively, and even marks those relations with connective expressions. The writer obviously gave preference to this structure over the alternative of first joining C1 and C3 and then C2 and C4. What is more, many of the examples Wolf and Gibson (2005) adduce for cross-dependency rely on ordinary or on complex anaphora (Schwarz-Friesel et al. 2004), i.e., anaphors that relate to whole sentences or larger discourse segments (abstract objects in the sense of Asher 1993). I.e., the intuition that there are dependencies in these examples that cross other dependencies can be put down to a cohesive device. Thus, the supposed cross-dependency in (21) arises from the complex anaphors the first of these aims and the second in C3 and C4, which refer back to the propositions introduced in C1 and C2 (that schools had the goal of teaching history of science and the goal of teaching logic and inductive thinking, respectively). Thus, the structure we would assign to (21) is (22):8
(22) resultn(list(C1, C2), contrast(C3, C4))
This example shows that discourse structure is just one possibility of organising a text. Referential anaphors can create relations between sentences that are not directly linked by discourse structure (Redeker 1991); an additional coding of such anaphoric relations in terms of discourse structure would thus be superfluous.9 An analogous explanation is available for another example adduced by Wolf and Gibson (2005) as evidence for crossed dependencies. They claim that C4 elaborates C2 only, thus crossing the relation of (non-volitional) cause between C3 and the sequence of C1 and C2:
(23) Susan wanted to buy some tomatoes (C1) and she also tried to find some basil (C2) because her recipe asked for these ingredients (C3). The basil would probably be quite expensive at this time of the year (C4).
We assign to (23) the structure (24), and explain the intuition that there is some dependency between C4 and C2 by the anaphor the basil in C4, which relates back to some basil in C2.
(24) elaborationn(causen(list(C1, C2), C3), C4)
5.2. Non-continuous discourse constituents
The second group of examples that look problematic at first glance are cases where there seems to be a non-continuous discourse constituent, which is interrupted by another, embedded, constituent. However, we contend that these cases do not pose a problem given our version of the ‘nuclearity principle’ (see section 3.1).
8. In contrast to Wolf and Gibson (2005), we regard the relation between C1 and C2 as list and the one between C1–C2 and C3–C4, as (volitional) result.
9. Nevertheless, discourse structure and anaphoric relations are interdependent, see e.g., the results of the work in Veins Theory (Cristea 2003).
There are two kinds of interruptions. First, an (otherwise) atomic discourse segment is interrupted. This case shows up in (25), where Mr. Baker’s assistant for inter-American affairs, Bernard Aronson, acknowledged is interrupted after the subject DP by another, complex discourse constituent (C2–C3):
(25) Mr. Baker’s assistant for inter-American affairs, Bernard Aronson, (C1) while maintaining (C2) that the Sandinistas had also broken the cease-fire, (C3) acknowledged: (C4) “It’s never very clear who starts what.” (C5)
This example – as well as the next one – is quoted by Wolf and Gibson (2005) from the RST Discourse Treebank (Carlson et al. 2003; from example wsj_0655). Second, something that would in principle constitute a larger discourse segment can be interrupted at a position where one of its (potential) subconstituents ends and another one begins. This second sort of example appears typically when attributions occur within the attributed text:
(26) “The administration should now state (C1) that (C2) if the February election is voided by the Sandinistas (C3) they should call for military aid,” (C4) said former Assistant Secretary of State Elliott Abrams. (C5) “In these circumstances, I think they’d win.” (C6)
In this example, there is direct speech (C1–C4 and C6), which would form a straightforward discourse constituent, were it not for the intervening attribution satellite C5. In this hypothetical constituent, C6 would elaborate on C1–C4. We claim that these sorts of examples can indeed be assigned a tree structure, and that seeming differences between these tree structures and intuitions on the interpretation of the examples can be explained once we understand the tree structures in terms of the ‘nuclearity principle’. In addition, for the first kind of example, we need a device of indicating that two discourse segments are in fact part of one single discourse atom. Here we use the (quasi-)discourse relation same-unit as introduced by Carlson et al. (2003). It merely links the (nucleus of the) first constituent and the second constituent together. Consequently, we can uphold the analysis (27) that Carlson et al. (2002) assign to the discourse structure of (25).10 In this structure, relating C1–C3 and C4 by the relation same-unit expresses the fact that C1 (i.e., C1–C3 without the satellite C2–C3 for the interruption) and C4 are in principle one constituent in (25). This indicates that C2–C3 is a concession to C1 and C4 together:
(27) attributions(same-unit(concessionn(C1, attributions(C2, C3)), C4), C5)
10. We deviate from their analysis in that we regard the relation between C1 and C2–C3 as concession, not as elaboration-additional-e (i.e., a general elaboration relation, which interrupts another segment). Note that in an attribution relation, the nucleus is the message and the satellite, the segment that indicates the source.
With our version of the ‘nuclearity principle’ we can also account for example (26) without relinquishing the treeness of discourse structure. Its analysis (28) in the WSC Discourse Corpus (Carlson et al. 2002) is criticised in Wolf and Gibson (2005), who claim that it fails to model two intuitions on (26): First, C6 is part of the message linked by attribution to C5, where the source is given, and, second, C6 evaluates C2–C4:
(28) evaluationn(attributionn(attributions(C1, same-unit(C2, conditions(C3, C4))), C5), C6)
The second intuition can be reconstructed as follows: C6 is related to C1–C5 by the relation of evaluation. Consequently, eventually, C6 also evaluates C2–C4, i.e., the nucleus of the nucleus of C1–C5. (Since same-unit is a multinuclear relation, the iteration stops at this point.) This intuition (that the chances of winning a military conflict under specific circumstances are evaluated) is shared by Wolf and Gibson (2005) and supported by an instance of modal subordination: Due to its modal auxiliary, C6 takes up the hypothetical mood of C2–C4. The first intuition also follows from (28). The message attributed to the source cited in C5 consists of C1–C4 and C6: Since the source is cited in the satellite of the attribution relation, subsequent segments can relate to the nucleus of this relation, i.e., to the message. In such cases, the message continues after the segment citing the source. In this example, once again, a complex anaphor (these circumstances) reinforces the relation between C6 and C2–C4. In the RST-style of encoding, example (26) can be modelled in an even more direct way, which straightforwardly encodes the fact that C6 eventually evaluates C1–C4 (i.e., this need not be derived from the fact that C1–C4 is the nucleus in the constituent C1–C5 evaluated by C6):
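(29) [Figure: RST-style tree for (26). The span 1-6 has the nucleus 1-4 with two satellites, unit 5 (Attribution) and unit 6 (Evaluation). Within 1-4, unit 1 is an Attribution satellite of the span 2-4; 2-4 combines unit 2 and the span 3-4 by Same-unit; within 3-4, unit 3 is a Condition satellite of unit 4. Units: 1. The administration should now state 2. that 3. if the {February} election is voided by the Sandinistas... 4. they should call for military aid, 5. said former Assistant Secretary of State Elliott Abrams. 6. In these circumstances, I think they’d win.]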
The basic idea here is that in RST-style trees, one nucleus can have several satellites. This means that an interrupting constituent such as C5 – as long as it is a satellite (like the source in an attribution relation) – does not prevent further satellites such as C6 from relating to the same nucleus (here, C1–C4).
5.3. N-ary RST trees
One more argument can be levelled against the sort of tree structures that we use to model discourse structure in this paper. This argument upholds the claim that discourse structures are indeed trees, but only in the RST sense. However, it challenges the claim that RST trees are always binary (or can straightforwardly be mapped into such trees), which means that no mapping into tree structures in our sense should be possible. The underlying problem here has already been introduced: In RST trees, one nucleus can be associated with several satellites. E.g., an RST-analysis of (17) in (30) could regard C1–C2 as circumstance, and C4–C8, as justification of C3:
(30) [Figure: RST-style tree for (17). The span 1-8 has unit 3 as its nucleus, with two satellites: the span 1-2 (Circumstance), whose units are linked by Elaboration, and the span 4-8 (Justify). Within 4-8, Solutionhood links unit 4 and the span 5-8; within 5-8, Background links unit 5 and the span 6-8; 6-8 is a List of the units 6, 7 and 8. Units: 1. Unfortunately, the Dutch animal shelters fill to the brim with homeless animals every summer. 2. Animals that have been done away with by their owner for whatever reason and that are now left to their destiny. 3. It is in their name that we ask your help. 4. Because to improve the existence of these animals, 5. there is of course a need of money. 6. For vaccinations and sterilisations. 7. For increase of the number of shelters. 8. For extra medical care when necessary.]
Stede (2004) suggests splitting such potential multiple-satellite constructions (MSC) into binary parts, one for each satellite. For (17), this would be possible; then C4–C8 is the justification to C1–C3, which is internally complex (C1–C2 describing a circumstance of C3). However, for most of the potential MSC structures in fund-raising letters this would not yield plausible analyses, as it would fail to reflect the powerful rhetorical effect of symmetric justify or motivation satellites often found around the appeal to donate money (cf. Abelen et al. 1993). This may also be true for other types of persuasive texts.
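The splitting of a multiple-satellite construction into binary parts can be pictured as follows. The Python sketch is only an illustration of the idea, not Stede's procedure: it assumes that the satellites are given in the order in which they are to be attached (innermost first) together with the side of the nucleus on which they occur in the text; the function name `binarise` and the tuple encoding are hypothetical.

```python
def binarise(nucleus, satellites):
    """Turn one nucleus with several satellites into nested binary nodes.

    `satellites` is a list of (relation, segment, side) triples, innermost
    first; `side` is "left" if the satellite precedes the nucleus in the text
    and "right" if it follows it. Nodes are encoded as (relation, left, right).
    """
    tree = nucleus
    for relation, satellite, side in satellites:
        if side == "left":
            tree = (relation, satellite, tree)
        else:
            tree = (relation, tree, satellite)
    return tree


# Example (17): C3 with a circumstance satellite C1-C2 and a justify satellite C4-C8.
print(binarise("C3", [("circumstance", "C1-C2", "left"),
                      ("justify", "C4-C8", "right")]))
```

The output nests the circumstance relation inside the justify relation, as in the binary analysis of (17) described above. The sketch also makes the cost of the strategy visible: an attachment order has to be chosen, so two satellites that are rhetorically symmetric end up at different depths of the tree.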
But if there are discourses that can only be analysed by such MSC, a more fundamental issue emerges, viz., the question of whether the kind of tree structures we assume for discourse structure is compatible with the basic assumptions of RST: If the label of a node (or node variable) for a constituent C indicates the relation that links the immediate subconstituents of C, we cannot directly translate analyses such as (30) into a tree structure in our sense, because nodes may not have more than one label. E.g., the problem in (30) is the claim that C3 is brought about by two discourse relations, which would force us to give two labels to the node for C3.
Now the frequency of such potential MSC seems to be genre-dependent. While Carlson et al. (2003) and Stede (2004) found only few instances in their newspaper corpora, they abound in the fund-raising letters analysed by Abelen et al. (1993). The analyses on the RST website (http://www.sfu.ca/rst/pdfs/rst-analyses-all.pdf) corroborate this impression: In the analysed fund-raising letters (20 units altogether), five MSC are found, while all other analysed data (193 units) exhibit only 10 instances of the phenomenon. While these numbers are more illustrative than decisive, we feel that the strategy of splitting the potential MSC into binary parts should nevertheless be empirically tested against data from a variety of genres. The representation of potential MSCs thus remains an unresolved issue, which may, however, be avoidable in nonpersuasive text types.
With these comments, we conclude our analyses. They have shown that syntactic structure on its own already reveals a lot about the underlying discourse structure. In this way, one can gain valuable information that contributes to the derivation of a unique discourse structure representation for a given discourse.
While we have demonstrated that a few apparent problems for the analysis of discourse structure in terms of the tree structures presented in this paper can be explained away, the question of whether all discourse structures can be modelled adequately in terms of such tree structures calls for further discussion.
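The binary-splitting strategy discussed above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class `Node`, the helpers `split_msc` and `labels`, and the segment names are hypothetical and are modelled on the prose description of the split, not on any implementation in the cited work. Each node carries exactly one relation label, so a nucleus with two satellites has to be folded into two nested binary nodes.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Node:
    """Binary discourse-tree node carrying exactly one relation label."""
    relation: str
    nucleus: "Tree"
    satellite: "Tree"


# A leaf is simply a segment name such as "C3" (hypothetical naming).
Tree = Union[str, Node]


def split_msc(nucleus: Tree, satellites: List[Tuple[str, Tree]]) -> Tree:
    """Fold a nucleus with several (relation, satellite) pairs into nested
    binary nodes. Satellites are attached in list order, innermost first,
    so that every node keeps a single label."""
    tree = nucleus
    for relation, satellite in satellites:
        tree = Node(relation, tree, satellite)
    return tree


def labels(tree: Tree) -> List[str]:
    """Collect relation labels from the root down along the nucleus spine."""
    found = []
    while isinstance(tree, Node):
        found.append(tree.relation)
        tree = tree.nucleus
    return found


# Assumed input: nucleus C3 with a circumstance satellite and a
# justify satellite, attached in that order.
msc = split_msc("C3", [("circumstance", "C1-C2"), ("justify", "C4-C8")])
print(labels(msc))
```

The result is asymmetric by construction: one satellite ends up closer to the nucleus than the other, and the attachment order is a choice made by the caller. This is the property that makes the split implausible for the symmetric satellites found in fund-raising letters.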
6. Related work
In this section, we will compare our approach to related work. First of all, we share many intuitions with Schilder (2002). The main difference lies in the further processing of the initial underspecified discourse structure representations: Schilder uses Information Retrieval techniques (vector space model and position method) to derive full discourse structure representations from these initial representations while we merely capture the information available from syntactic structure without attempting to obtain a fully specified discourse structure representation. The work on the Potsdam Commentary Corpus also recognises the importance of underspecification in the representation of discourse structure but implements it by chart parsing techniques (subtree sharing and local ambiguity packing) (Stede 2004).
Markus Egg and Gisela Redeker
The LTAG (Lexicalised Tree-Adjoining Grammar) community build their analyses of discourse structure on LTAG, which constructs syntactic tree structures for expressions from tree fragments associated lexically with the words in that expression. Subsequent construction of discourse structure (as well as of semantic representations) is based not on the syntax tree but on its derivation tree. This makes the syntax-discourse structure interface relatively complex, as can be seen e.g., in Webber’s (2004) derivation of the discourse structure for (4). What is more, potential ambiguity of a given discourse structure as e.g., for (4) must be resolved during the process of constructing it. Depending on the integration of this process into larger NLP systems, we envisage two potential problems for this strategy: If it takes place before the results of semantic construction for the discourse are available, there is the danger of not ending up with the preferred discourse structure. And if the results of semantic construction are already available, there is the question of how to let them guide the process of selecting and constructing one single discourse structure. The proposed approach is more modular than the one based on LTAG, since it does not enforce choosing one of the possible discourse structure alternatives during discourse structure construction. This choice can be relegated to a more convenient time at which additional information (in particular, results of semantic construction) is available, which allows for a clear interface between discourse structure construction and other modules. The preferences that guide discourse structure construction on the basis of LTAG structures could be incorporated into the proposed approach as resolution preferences for underspecified discourse structure constraints. 
Asher and Lascarides (2003) offer an account of discourse structures in terms of underspecified semantic representations for the involved clauses with a (possibly underspecified) discourse relation that links the respective clause to a not yet specified discourse segment. From these representations, fully specified discourse structures are built incrementally by deciding for each new clause C (a) a segment C´ of the discourse structure of the previous discourse to which it attaches and (b) which discourse relation links C to C´. This is done by inference rules that use the semantics of the discourse segments involved. The proposed constraint-based approach differs from the one of Asher and Lascarides in that we limit ourselves to indefeasible discourse knowledge, which is encoded in discourse structure constraints, and do not model defeasible discourse knowledge, which takes the form of inference rules. E.g., their narration rule states that two clauses can be related by a narration relation if they describe events that are parts of a natural event sequence, and a nonmonotonic logic infers the structure of a discourse on the basis of these rules (e.g., a defeasible Modus Ponens). Finally, there is much common ground between our work and the work of Danlos (2004, 2006). We are both investigating the exact nature of the discourse structure and its formalisation, which involves comparing already existing approaches to discourse structure representation as well as testing potential discourse structure analyses against a wide range of data.
7. Conclusion
In this paper, we sketched an approach to discourse structure analysis. We will apply the results of this approach to discourse structure annotation. There are as yet no large-scale corpora for Dutch that are annotated for discourse structure. We are currently setting up such an annotation initiative, where we will first automatically derive partial information on discourse structure from syntactically analysed corpora. This derivation will implement the discourse-syntax interface as sketched in this paper and output discourse constraints on the basis of a suitable syntactic analysis. These constraints will then be manually specified by human annotators. Discourse annotation at the University of Potsdam (Stede 2004) has shown that such a two-layered annotation process for discourse structure can boost inter-rater reliability and speed of corpus annotation. Further research questions will include the search for further (indefeasible) factors to constrain discourse structure underspecification and the integration of resolution heuristics to obtain fully specified discourse structure representations. E.g., in (9), simple ontological knowledge such as the fact that salmon and cheese are meal items could be used to infer the fact that C3 and C4 are elaborations of C2, which would go a long way towards resolving (9). In the future, we will also investigate the interaction between semantic construction and discourse structure construction.
References
Abelen, E., G. Redeker, and S. Thompson (1993). “The rhetorical structure of US-American and Dutch fund-raising letters.” Text 13, 323–350.
Asher, N. (1993). Reference to abstract objects in discourse. Dordrecht: Kluwer.
Asher, N. and A. Lascarides (2003). Logics of conversation. Cambridge: Cambridge University Press.
Carlson, L., D. Marcu, and M.E. Okurowski (2002). RST Discourse Treebank. Corpus number LDC 2002T07, Linguistic Data Consortium, Philadelphia.
Carlson, L., D. Marcu, and M.E. Okurowski (2003). “Building a discourse-tagged corpus in the framework of Rhetorical Structure Theory.” In J. van Kuppevelt and R. Smith (eds), Current Directions in Discourse and Dialogue, pp. 85–112. Dordrecht: Kluwer.
Copestake, A., D. Flickinger, C. Pollard, and I. Sag (2005). “Minimal Recursion Semantics. An introduction.” Research on Language and Computation 3, 281–332.
Cristea, D. (2003). “The relationship between discourse structure and referentiality in Veins Theory.” In W. Menzel and C. Vertan (eds), Natural Language Processing between Linguistic Inquiry and System Engineering. Iasi: “Al.I.Cuza” University Publishing House.
Danlos, L. (2004). “Discourse dependency structures as constrained DAGs.” In M. Strube and C. Sidner (eds), Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue, Cambridge, Massachusetts, USA, pp. 127–135. Association for Computational Linguistics.
Danlos, L. (2006). “Comparing RST and SDRT discourse structures through dependency graphs.” This volume.
Delin, J. and J. Oberlander (1995). “Syntactic constraints on discourse structure: the case of it-clefts.” Linguistics 33, 465–500.
Egg, M., A. Koller, and J. Niehren (2001). “The Constraint Language for Lambda-Structures.” Journal of Logic, Language, and Information 10, 457–485.
Knott, A., J. Oberlander, M. O’Donnell, and C. Mellish (2001). “Beyond elaboration: The interaction of relations and focus in coherent text.” In T. Sanders et al. (eds), Text representation: linguistic and psycholinguistic aspects, pp. 181–196. Amsterdam: Benjamins.
Knott, A. and T. Sanders (1998). “The classification of coherence relations and their linguistic markers: An exploration of two languages.” Journal of Pragmatics 30, 135–175.
Mann, W. and S. Thompson (1988). “Rhetorical Structure Theory: Towards a functional theory of text organization.” Text 8, 243–281.
Marcu, D. (1996). “Building up rhetorical structure trees.” In Proceedings of the 13th National Conference on Artificial Intelligence, Portland, pp. 1069–1074.
Marcu, D. (1997). The Rhetorical Parsing, Summarization, and Generation of Natural Language Texts. Ph.D. thesis, Department of Computer Science, University of Toronto.
Redeker, G. (1991). “Linguistic markers of discourse structure.” Linguistics 29, 1139–1172.
Redeker, G. (2000). “Coherence and structure in text and discourse.” In W. Black and H. Bunt (eds), Abduction, Belief and Context in Dialogue, pp. 233–263. Amsterdam: Benjamins.
Reyle, U. (1993). “Dealing with ambiguities by underspecification: construction, representation, and deduction.” Journal of Semantics 10, 123–179.
Roberts, C. (1989). “Modal subordination and pronominal anaphora in discourse.” Linguistics & Philosophy 12, 683–721.
Schilder, F. (1998). “Temporal discourse markers and the flow of events.” In Proceedings of the Workshop on discourse relations and discourse markers, COLING/ACL 98, pp. 58–61.
Schilder, F. (2002). “Robust discourse parsing via discourse markers, topicality and position.” Natural Language Engineering 8, 235–255.
Schwarz-Friesel, M., M. Consten, and K. Marx (2004). “Semantische und konzeptuelle Prozesse bei der Verarbeitung von Komplex-Anaphern.” In I. Pohl and K.-P. Konerding (eds), Stabilität und Flexibilität in der Semantik, pp. 67–86. Frankfurt: Peter Lang.
Soricut, R. and D. Marcu (2003). “Sentence level discourse parsing using syntactic and lexical information.” In Proceedings of NAACL 2003.
Stede, M. (2004). “The Potsdam Commentary Corpus.” In B. Webber and D. Byron (eds), ACL 2004 Workshop on Discourse Annotation, Barcelona, Spain, pp. 96–102.
Webber, B. (2004). “D-LTAG: extending lexicalized TAG to discourse.” Cognitive Science 28, 751–779.
Wolf, F. and E. Gibson (2005). “Representing discourse coherence: a corpus-based study.” Computational Linguistics 31, 249–287.
part iii
The Cognitive Perspective
Dependency precedes independence
Online evidence from discourse processing
Petra Burkhardt
University of Marburg
This paper investigates the integration of definite determiner phrases (DPs) as a function of their contextual salience, which is reflected in the degree of dependency on prior information. DPs depend on previously established discourse referents or introduce a new, independent discourse referent. This paper presents a formal model that explains how discourse referents are represented in the language system and what kind of mechanisms are implemented during DP interpretation. Experimental data from an event-related potential study are discussed that demonstrate how definite DPs are integrated in real-time processing. The data provide evidence for two distinct mechanisms – Specify R and Establish Independent File Card – and substantiate a model that includes various processes and constraints at the level of discourse representation.
1. Introduction
This paper presents a model for the representation and processing of definite Determiner Phrases (DPs) and investigates the function of definite determiners as well as the role of contextual information for the interpretation of DPs. Many theoretical approaches assume that definite DPs (e.g., the ladybug, the pianist) represent discourse-old/given information (cf. e.g., Hawkins 1978; Heim 1982; Prince 1981, 1992), i.e., a definite DP refers to an entity in the mental model that has previously been introduced, which therefore serves as a potential anchor for a dependency relation – in contrast to indefinite DPs (e.g., a virus, a singer) which represent discourse-new information that does not need to be anchored. In (1), the definite DP the movie is coreferential with the previously introduced entity a movie. In this case, an identity relation is established between the two DPs at the level of discourse representation (i.e., between the discourse representation for a movie and that for the movie), and all information relevant to the discourse unit movie is accumulated in one place.
(1) Megan watched a movie last night. The movie was so scary that she had nightmares.
(2) Paul talked to a stranger at the train station. He said that the book was hilarious.
(3) Tom talked to a woman at the airport. He complained about the weather.
(4) Sid was happy because the sun was shining.
(5) Rachel went to a concert in London. The soloist reminded her of an old friend.
Traditionally, a functional correspondence has been assumed between definiteness and the given-new distinction (cf. Chafe 1976 among others), and as a consequence, definite DPs are not expected to represent new information. This is illustrated by the observation that the definite DP the book in (2) yields an infelicitous interpretation: without additional information, this passage sounds incoherent and awkward, because it is not clear which book Paul is talking about and a linkage to the previous utterance cannot be established. However, in certain contexts, interpretation can still be accomplished, for instance if Paul is pointing at the book in question. Deictic information can then be used to integrate the definite DP into the mental representation. Likewise, situational uses of definite DPs and world knowledge facilitate the integration of a definite, but discourse-new entity, as in (3) and (4), where the definite DPs the weather and the sun refer to information made available by the broader context in which the communicative act is taking place. Another way to rescue the interpretation of a definite, but discourse-new referent is by means of inferential knowledge. In such cases, a previously mentioned entity, which is generally expected to hold a prominent role at the time of utterance, licenses an ‘inferential bridging’ relationship, where presuppositions are activated and inferential knowledge is retrieved to establish a connection between two entities (cf. e.g., Haviland and Clark 1974; Clark 1975). Accordingly, in (5), the inference that soloists may be present at concerts is utilized and the soloist can be interpreted as the soloist at the concert Rachel went to. These examples illustrate that a strict correspondence between definite marking and givenness cannot be supported empirically.
Corpus studies further indicate that violations of the definiteness-givenness correspondence as in (3)–(5) are common in the human language system. Fraurud (1990) analyzed Swedish texts and found that 60.9% of definite DPs represented discourse-new entities. Investigating English texts in a series of studies, Poesio and Vieira (1998) report that approximately 50% of definite descriptions embodied new information. These findings strongly suggest that a theory that seeks to formalize the representation and interpretation of DPs must account for the different mechanisms available to integrate definite DPs, as observed in (1)–(5), and in particular must develop a machinery for the integration of discourse-new definite DPs.
2. The interpretation of definite DPs
The processes required for the interpretation of definite DPs are here formalized within the framework of the Syntax-Discourse Model (cf. Avrutin 1999; Burkhardt 2005), which has its roots in File Change Semantics (cf. Heim 1982). The Syntax-Discourse Model operates on strict correspondence rules between the syntax module – which is in charge of phrase-structural information encoding – and the discourse module – which takes care of maintaining discourse representation structures. With respect to DP interpretation, the
correspondence rule states that each syntactic unit that represents a DP – regardless of its internal structure – triggers the creation of a corresponding discourse unit (so-called ‘individual file card’). (6) below illustrates that DPs have the internal structure of a head D and its complement NP – e.g., [D[NP]], as in [the[apple]] – and this basic structure is also reflected in the file card representation: a file card consists of a frame and a heading (cf. Avrutin 2004), and the referential features associated with the head D of the DP are encoded in the frame and the lexical features associated with the complement in the heading. (6)
[Figure: the syntactic tree [DP [D the] [NP apple]] and the corresponding file card, with heading apple and frame the]
The two main components of a DP – head D and complement – serve distinct functions during the interpretation process. The complement provides lexical information, activating lexical-semantic processes, and the head carries functional features and encodes definiteness (e.g., a pumpkin, the umbrella), number (e.g., five butterflies, many stars), person, gender, and Case (e.g., from German: die Kinder (‘the children’– 3rd person, plural, neuter, nominative/accusative), dem Kind (‘to the child’– 3rd person, singular, neuter, dative)). These two qualitatively different sources of information impact the interpretation process in distinct ways. The information encoded in the head – and in the following I will only talk about definiteness, but the general idea pertains to all feature specifications associated with D – is first and foremost grammatical in nature, is largely processed by narrow syntax and influences the interface operation linking syntax and discourse structure. The information encoded in the complement is lexical and affects processes in discourse structure. Following for instance Chafe (1976) or Heim (1982), it is assumed that a discourse unit that is marked indefinite represents new information, while a discourse unit that is marked definite represents given information and hence depends on a previously introduced discourse entity. Accordingly, the information about the newness or givenness of a discourse unit is strongly associated with the definiteness feature carried by D and is detached from the lexical and semantic content of the DP. These accounts therefore suggest a predominance of the definiteness information carried by D, which in the current framework affects the interface operation between syntax and discourse, because every DP introduces a discourse unit and the features of the head determine the newness or givenness of this unit. 
The information associated with D hence serves the primary function of signaling the discourse module how to deal with an incoming file card (i.e., treating it as new or linking it to old discourse units). Then as soon as a file card representation is created in discourse – and its givenness status has been established – lexical and semantic features associated with the complement can be interpreted. It should be at this point that definite DPs, which by the above definition depend on previously established information, enter into a relation R with a given discourse referent based on the content information provided by the complement.
However, before elaborating the distinct mechanisms available to establish the relation R, a few words need to be said about definiteness, since this paper explicitly focuses on definite DPs. Much research has been done on the nature of definiteness, and numerous functions have been associated with definite marking including uniqueness, specificity, inclusiveness, identifiability, and familiarity (for overviews see Lyons 1999 and Abbott 2004). As the discussion above indicates, familiarity is a core concept within the present framework, but the notion of given (i.e., familiar) is here used as a broad and flexible concept that evokes a dependency relation R – in contrast to viewing it in its strict sense as a mere indicator of an identity relation (see below). This view is compatible with semantic accounts of definiteness that propose that definite markers serve the purpose to mark their complement nouns as functional concepts (cf. e.g., Löbner 1985; Chierchia 1995). Löbner, for example, suggests that complements of definite Ds can be inherently relational (e.g., the mother) or depend on contextual information to satisfy a particular relation R. In other words, a definite DP introduces the presupposition that reference is made to a particular entity, and in this sense definite DPs are used anaphorically and enter into a dependency relation R with an antecedent or an anchor (cf. Chierchia 1995). This in turn requires some prior knowledge about the entity that serves as antecedent or anchor – which is instantiated by previous utterances, by deictic cues, by the current situation, or by general world knowledge. Which mechanisms are available to establish the dependency relation R? Identity is considered to be the most economical process during the interpretation of definite DPs. 
In these cases, the corresponding file card entry matches an antecedent file card in the discourse space – e.g., in (1) – which leads to the merging of all relevant information pertaining to this discourse referent onto a single file card representation (cf. also Heim 1982; Avrutin 1999). This move is of course motivated by considerations of economy, because intuitively it is more economical for the language processor to keep all information associated with a particular referent in one place. If identity cannot be achieved on the basis of the information available in the given discourse representation structure, the system searches for other types of dependencies to satisfy the presupposition associated with the definite marker. These dependencies can be licensed by an inference-based relation R, which is either established between two individual discourse entities – e.g., (5) – or between a discourse unit and a ‘situation file card’ (cf. Avrutin 1999) – e.g., (3). The situation file card is available in any given speech situation, holds information about the immediate and general context of a speech act, and represents information shared by the speaker and the hearer (e.g., about participants, the venue, the weather, . . .). Inference-based relations as exemplified in (5) – also referred to as inferential bridging (cf. Clark 1975), accommodation (cf. Lewis 1979; Heim 1982) or associative anaphora (Hawkins 1978) – denote dependency relations where objects or events are related to each other through a relation that is only implicitly available. To establish this relation R, lexical knowledge and world knowledge are activated. Crucially, in drawing inferences to integrate a definite DP, the system strives to achieve coherence,
and therefore the notion of salience (i.e., what is being talked about) plays an important role in the formation of the relation R. For instance in (5), the soloist can be considered a salient participant in the event described in the preceding utterance (Rachel went to a concert in London). Likewise in (7), the ceiling represents a salient part of the topic of the previous utterance (new apartment); and under the same assumption that the focus of the utterance is the apartment, the view must be interpreted as the view from Laura’s new apartment. Alternative interpretations of these definite DPs would be incoherent.
(7) Laura showed Sue her new apartment. The ceiling was high and the view was breathtaking.
These linking processes between a definite DP and a salient entity already available in the discourse space are therefore essential for successful communication and illustrate that inferential bridging relations foster coherence and thus contribute to smooth and felicitous communication. As a final point, there are two reasons why a relation R of the sort represented by inferential bridging is needed by the language system. The first reason is related to discourse structure internal considerations, the second one has to do with general conversational principles. First, coupled with considerations of economy, the amount of information stored and maintained at the level of discourse structure must be kept at its minimum. As a consequence, any incoming information marked as given should ideally be merged with the corresponding information already available on the antecedent’s file card (i.e., yielding identity). This idea has for example also been formalized in Asher and Lascarides’ (1998) rule If Possible Use Identity. Due to the correspondence rule that connects syntax and discourse representation within the current framework, every definite DP introduces a discourse unit that should first attempt to enter into an identity relation with an antecedent. In the absence of a matching antecedent, however, the system attempts to establish ties with other information units already available in the mental model in order to bridge the incoming information with previously given information and integrate it in the most reasonable manner. This bridging operation is carried out to satisfy the expectations associated with the definiteness of the head D and is hence driven by the feature specification of D. It can therefore be argued that this is a bottom-up trigger to establish a relation R resulting from the syntax-discourse interface operation. 
Second, every communicative act is guided by general conversational principles, in particular that it should obey constraints of coherence (cf. also Grice 1975). This means that the system always attempts to establish coherence between incoming and prior information. Assuming that general principles of communication are part of conceptual structure, which includes processes of reasoning and forming intentions and is not part of the language system proper, this can be viewed as an additional trigger guiding the formation of a relation R, such that higher-level conceptual structure impacts processes at the level of discourse structure. Both motivations – those arising at the syntax-discourse interface
and those arising at the conceptual structure – discourse interface – interact with each other and the integration of definite DPs via a relation R is then achieved by activating lexical and world knowledge.
3. The processing of definite DPs
For sentence processing this means, first, that distinct knowledge sources – narrow syntax, interface rules, discourse structure, and conceptual structure – can impact the interpretation of definite DPs, and second, that various mechanisms are available to establish the dependency relation R. Following the view that the information associated with the definite marker is prevalent, a definite DP must always be integrated via a dependency relation – i.e., the presence of the definite marker triggers a process Establish R. (Note that in the case of an indefinite marker, this process should not be initiated, but instead the corresponding discourse referent is added to the discourse representation as an independent discourse unit – Establish Independent File Card – which should result in storage cost.) The process Establish R is completely automatic and is evoked by the features retrieved in narrow syntax. It activates a discourse-based process Specify R, which utilizes the information encoded in the heading of the file card. The realization of this process is guided by principles of economy that state that an identity relation should be preferred over any other type of dependency relation during the processing of definite DPs. In other words, identity relations are less demanding for the sentence processor than for instance bridging relations. But if the processor cannot form an identity relation, additional knowledge sources are activated to satisfy the need for a dependency relation. Furthermore, the language system strives to carry out Specify R, and following from this, a definite DP that lacks an anchor entirely is the least preferred DP and results in increased processing effort – and should ultimately yield an incoherent interpretation.
The figure in (8) illustrates the functional relations between the syntax and the discourse module as well as the processes resulting from the distinct linguistic components that have been discussed so far. (8)
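[Figure: the syntactic tree [DP [D the] [NP x]] and the corresponding file card, with heading x and frame the; the figure labels the processes ESTABLISH R and SPECIFY R (via IDENTITY or BRIDGING)]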
What we should then observe during sentence processing is a reflection of the formation of the relation R, which can first be distinguished on the basis of the type of
dependency (i.e., identity vs. bridging) and second on the basis of coherence considerations – because not every discourse-new definite DP is integrated with the same ease and some cannot be integrated via a bridging dependency at all, resulting in an incoherent or infelicitous interpretation (cf. also Almor’s 1999 Informational Load Hypothesis). Both of these distinctions have consequences for the amount of effort required for the formation of R. Moreover, identity and bridging must be dissociated on the basis of the number of discourse referents that must be maintained in discourse, because a definite DP that enters into a bridging relation keeps an independent file card representation (to serve as reference for future mention), while file cards with the same heading are merged under an identity relation. This results in increased integration cost due to a process Establish Independent File Card during the interpretation of discourse-new definite DPs. Crucially, the present model predicts that Specify R is carried out prior to Establish Independent File Card. The system first strives to establish a linkage with prior information. In the case of identity, interpretation is completed when a matching antecedent is located. In the case of bridging, the nature of the relation R demands the introduction of an independent file card representation. It is therefore assumed that dependency formation precedes the establishment of a new and independent file card during the processing of definite DPs. In the remainder of this paper, these two processes – Specify R and Establish Independent File Card – are investigated and the distinctions between these two processes are hypothesized to be reflected in differing electrophysiological patterns. Next, I provide an overview of previous sentence processing studies that are relevant for an investigation of definite DPs and the processes that are required for their interpretation.
Then I introduce the methodology of event-related brain potentials (ERPs) and discuss prior findings and predictions for the present investigation.
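The ordering of processes assumed in this section can be sketched as a toy procedure. This is a minimal illustration, not an implementation of the Syntax-Discourse Model: the names `FileCard`, `Discourse`, `specify_r`, and `integrate` are hypothetical, bridging knowledge is reduced to a hand-written set of heading pairs, the situation file card is left out, and the treatment of a definite DP without any anchor (a new card is still created) is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class FileCard:
    heading: str                         # lexical content of the complement NP
    frame: str                           # features of the head D, e.g. "the" or "a"
    anchor: Optional["FileCard"] = None  # card this one is bridged to, if any


@dataclass
class Discourse:
    cards: List[FileCard] = field(default_factory=list)
    # Toy world knowledge: (anchor heading, bridged heading) pairs. Hypothetical.
    bridges: Set[Tuple[str, str]] = field(default_factory=set)

    def specify_r(self, heading: str) -> Tuple[str, Optional[FileCard]]:
        """Specify R: prefer identity, fall back to bridging, else no anchor."""
        for card in reversed(self.cards):
            if card.heading == heading:
                return "identity", card
        for card in reversed(self.cards):
            if (card.heading, heading) in self.bridges:
                return "bridging", card
        return "none", None

    def integrate(self, determiner: str, noun: str) -> List[str]:
        """Return the ordered list of processes run for one DP."""
        steps: List[str] = []
        if determiner == "the":
            steps.append("Establish R")
            relation, target = self.specify_r(noun)
            steps.append(f"Specify R ({relation})")
            if relation == "identity":
                # Merged with the antecedent's card: no independent card.
                return steps
            # Bridging keeps its own card; the no-anchor case is assumed
            # to do the same (an assumption of this sketch).
            card = FileCard(noun, determiner, anchor=target)
        else:
            card = FileCard(noun, determiner)
        steps.append("Establish Independent File Card")
        self.cards.append(card)
        return steps


d = Discourse(bridges={("concert", "soloist")})
print(d.integrate("a", "concert"))
print(d.integrate("the", "soloist"))
print(d.integrate("the", "concert"))
```

With the concert example from (5), the indefinite DP only establishes an independent file card, the bridged definite DP runs Establish R and Specify R before establishing its own card, and the repeated definite DP stops after Specify R because identity merges it with its antecedent.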
3.1. Previous processing evidence
From a processing perspective, two findings are of particular relevance for the present research question. First, it has been shown that the integration of given information is less costly than that of new information (cf. e.g., Haviland and Clark 1974; Yekovich and Walker 1978). This demonstrates that economy considerations affect sentence comprehension and that processing demands increase with the amount of new information that must be integrated. Second, Haviland and Clark (1974) report longer comprehension times for bridging relations than for identity relations. These findings generally converge with the view adopted above, but they only provide end-of-sentence measures and cannot shed light on the specific processes distinguished in the current framework. In contrast, the investigation presented below explores the interpretation of definite DPs in real time utilizing ERP measures.
3.2. Previous ERP findings
ERPs provide continuous measures of the electrical brain activity during sentence processing and supply information about the time course of processing. This technique
has proven to have a high temporal resolution and to reveal fine-grained distinctions of the underlying neural mechanisms during sentence processing, which is reflected in the latency, amplitude, polarity, and topography of the ERP signal. It is therefore an informative methodology to investigate the temporal and functional dimensions of DP integration and the processes elaborated on above. For present purposes, two ERP components are of particular relevance and are therefore discussed in the following paragraphs: the so-called N400 and the P600. A negative deflection with a latency of about 400 ms - the N400 – has been reported by numerous language studies that investigated the processing of semantic implausibilities and contextual incoherence (cf. e.g., Kutas and Hillyard 1980, 1984; van Petten and Kutas 1991; van Berkum, Hagoort, and Brown 1999; Kutas and Federmeier 2000; Hagoort, Hald, Bastiaansen, and Petersson 2004). It has been shown that the amplitude of the N400 varies with the degree of semantic plausibility, such that the less plausible the integration of a lexical item is, the more pronounced is the negativity (e.g., “cry” in The pizza was too hot to eat/drink/cry. – from Kutas and Hillyard 1980) and also as a function of pragmatic-conceptual plausibility (e.g., Amsterdam is a city that is very old/new. – from Hagoort et al. 2004). Additionally, an N400 has been recorded relative to the onset of pronominal entities (in contrast to proper name entities), suggesting that the negativity is related to the establishment of a dependency relation between a pronominal and its antecedent (cf. Streb, Rösler, and Hennighausen 1999, Streb, Hennighausen, and Rösler 2004; Burkhardt 2005). Furthermore, a reduction of the amplitude of the N400 has been observed during the comprehension of repeated entities (known as ‘repetition priming’), suggesting that the N400 amplitude is a marker of semantic and conceptual relatedness (cf. 
Rugg 1985; Weisbrod, Kiefer, Winkler, Maier, Hill, Roesch-Ely, and Spitzer 1999; Kutas and Federmeier 2000). Finally, N400-modulations have been observed with respect to processes that require inference drawing. St. George, Mannes, and Hoffman (1997) report a more pronounced N400 at the end of multi-sentence passages that are not based on prior inference-based processing compared to passages that depend on the retrieval of inferential knowledge (e.g., no inferential processing required: Pam set the dining room table. She put the turkey in the oven. The guests were outside playing badminton. It was too bad the turkey burned. vs. drawing of inferences required (i.e., the consequences of forgetting about the turkey must be inferred in order to properly integrate the subsequent (third) sentence): Pam set the dining room table. She forgot about the turkey in the oven. The guests were disappointed with the ruined meal. It was too bad the turkey burned.). For the present investigation, these findings suggest that the N400 represents an electrophysiological potential associated with the retrieval of lexical and inferential information and should hence be evoked during the formation of a relation R (i.e., during the processing of Specify R). In addition, a positive deflection peaking around 600 ms – the so-called P600 – which has generally been characterized as an index of reanalysis and syntactic revision (e.g., Friederici, Pfeifer, and Hahne 1993), has more recently been interpreted as a marker of integration cost (cf. Kaan, Harris, Gibson, and Holcomb 2000; Fiebach, Schlesewsky,
and Friederici 2002; Kaan and Swaab 2003). In these latter studies, a pronounced P600 has been registered in constructions that exert increased processing demands during discourse integration, exemplified by the need for a discourse representation for who, but not for whether (e.g., Emily wondered who the performer in the concert had imitated for the audience’s amusement. vs. Emily wondered whether the performer in the concert had imitated a pop star for the audience’s amusement.). Moreover, an enhanced P600 has been observed during the processing of discourse-new proper names compared to referentially dependent pronouns, suggesting that processing new information exerts integration cost (cf. Burkhardt 2005). For the current research question, these results indicate that the P600 might serve as an index for differences in integration load and storage cost, triggered by the process Establish Independent File Card.
4. The present study
The study reported here investigated the online processing of definite DPs as a function of the licensing power of the information provided by a context sentence. In a reading experiment in German, three conditions were contrasted, which were hypothesized to show increasing integration difficulty: definite DPs that corresponded to a previously introduced matching discourse referent (identity condition), definite DPs that depended on a bridging relation with an entity that was introduced in the context sentence (bridging condition), and definite DPs that lacked a matching referent or a salient relation in the discourse representation (incoherence condition). (9) illustrates example stimuli for all three conditions. Notice that the target sentence, containing the critical definite DP, is identical in all three conditions.
(9) a. Identity Condition:
       Regine beschreibt einen Portier aus dem Adlon.
       Regine describes a doorman from the Adlon
       Sie denkt, dass der Portier wohl überqualifiziert war.
       she thinks that the doorman probably overqualified was
       “Regine describes a doorman from the Adlon. She thinks that the doorman was probably overqualified.”
(9) b. Bridging Condition:
       Rebekka beschreibt ein Hotel in der Eifel. Sie denkt, dass der Portier wohl überqualifiziert war.
       “Rebekka describes a hotel in the Eifel. She thinks that the doorman was probably overqualified.”
(9) c. Incoherence Condition:
       Ruth schwatzt gelegentlich mit ihrer Friseurin. Sie denkt, dass der Portier wohl überqualifiziert war.
       “Ruth talks occasionally with her hairdresser. She thinks that the doorman was probably overqualified.”
Since we are dealing with definite DPs, the syntax-discourse interface operation is hypothesized to trigger the process Establish R in all three conditions and the present experiment therefore does not address the processing correlates of this particular process. But the conditions differ in how they carry out Specify R and whether they require Establish Independent File Card: In the identity condition, the definite DP the doorman forms an identity relation with its antecedent. This should result in the most economical integration process and realization of Specify R. Guided by findings from repetition priming, this allows for the prediction that – in contrast to the other two conditions – the least pronounced effect in the N400-component should be observed for the identity condition (in fact, a reduced N400). Assuming furthermore that the P600-component indexes the establishment of an independent discourse representation, no effect should accrue in the P600-component. In the bridging condition, the critical definite DP the doorman depends on the salient entity the hotel as an anchor for an inferentially licensed relation R (yielding the doorman of the hotel in the Eifel). The additional processing effort required to form this relation might be reflected in a more pronounced N400 (in contrast to the identity condition), thus marking lexical and inferential processes, and certainly in a more pronounced P600 (also in contrast to the identity condition), indexing the preservation and storage of a discourse-new and independent file card representation. In the incoherence condition, the critical discourse-new definite DP lacks both an identical referent and an anchor in the preceding discourse and consequently Specify R cannot be carried out successfully. Since the two successive sentences do not share a common topic, this condition violates coherence principles (cf. e.g., Lascarides and Asher 1993 for the coherence constraint on Background). 
Due to the constraint to optimize coherence, the language system might thus try hard to identify a potential anchor and search for some kind of dependency relation. This search for clues and inferences is predicted to result in processing cost and will most likely end in a failure to satisfy coherence, which on the basis of previous ERP research should be reflected in the most enhanced N400 compared to the other two conditions.¹ In terms of the process Establish Independent File Card, a pronounced P600 might be registered in the incoherence condition as an indication that the system introduces an independent representation for the definite DP and possibly anticipates resolving the relation via a cataphoric relation made available by successive utterances.
1. Alternatively, a common topic might be constructed, resulting in a connection between the two sentences (e.g., the act of talking with the hairdresser and the mention of a doorman in (9.c)), such as the drawing of an inference that Ruth and her hairdresser must have previously talked about this particular doorman and that the doorman is therefore known. Nevertheless, such an interpretation should also be costly, because elaborate reasoning must be applied to avoid incoherence. However, sentence combinations as presented in the incoherence condition are usually evaluated as incoherent when participants are asked to rate whether the second sentence is a good continuation of the first one, which suggests that elaborate (and often far-fetched) reasoning is generally not applied to resolve the interpretation of these kinds of discourse-new definite DPs.
4.1 Norming study Prior to the ERP study, a questionnaire study was conducted to determine good candidate sets for the target and anchor DPs in the bridging condition, i.e., pairs of nouns that were highly associated with each other. Association norms were collected on predetermined pairs of nouns because bridging relations are easily established in the presence of strongly related and associated anchor-target pairs that represent salient parts or roles (cf. Clark 1975). In the questionnaire study, participants were asked to rate the relationship between pairs of nouns on a seven-point-scale (with ‘1’ corresponding to no apparent connection between the two items, and ‘7’ corresponding to a very strong connection). The questionnaire consisted of a total of 90 word pairs, which included 30 filler items. The experimental pairs consisted of salient relations, including terms for objects and related salient individuals (e.g., Torte – Bäcker ‘tart – baker’), events and salient individuals (e.g., Hochzeit – Bräutigam ‘wedding – groom’), or locations and respective salient individuals (Schule – Rektor ‘school – principal’). The filler pairs represented nonsense associations (e.g., Tulpe – Demonstration ‘tulip – demonstration’). Experimental and filler pairs were randomized, and two versions of the list of 90 noun pairs were designed varying in the order of the items. Thirty monolingual native speakers of German (14 female) from the participant pool of the Max-Planck-Institute for Human Cognitive and Brain Sciences in Leipzig participated in the questionnaire study. Their ages ranged from 20 to 35 years (Mean=25.9, SD=3.3). Average responses from all thirty participants were calculated per word pair. The averages of the experimental pairs ranged from 3.7 to 6.93 (M=6.23, SD=0.71); the averages of the filler pairs varied from 1 to 4.13 (M=1.78, SD=0.85). 
For the selection of the stimulus material for the ERP study, a cut-off value of 6.0 was defined for the average rating of a noun pair. This yielded 45 experimental word pairs with an average rating above 6.0. From this list, the 40 most highly rated pairs were chosen for the design of the experimental sentences of the ERP study.
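The selection step described above can be sketched in a few lines. This is only an illustration, not the original procedure: the function name `select_pairs`, the data layout, and the toy ratings are assumptions. Only the cut-off of 6.0, the strict "higher than" comparison, and the choice of the most highly rated pairs are taken from the text.

```python
from statistics import mean


def select_pairs(ratings, cutoff=6.0, n_select=40):
    """Pick the n_select most highly rated pairs whose mean rating exceeds cutoff.

    ratings maps a (anchor, target) pair to the list of 1-7 ratings it received.
    Returns the selected pairs (best first) and the per-pair averages.
    """
    averages = {pair: mean(values) for pair, values in ratings.items()}
    above = [pair for pair, avg in averages.items() if avg > cutoff]
    above.sort(key=lambda pair: averages[pair], reverse=True)
    return above[:n_select], averages


# Toy ratings (hypothetical values, three raters instead of thirty).
ratings = {
    ("Hochzeit", "Bräutigam"): [7, 7, 6],
    ("Schule", "Rektor"): [6, 7, 6],
    ("Torte", "Bäcker"): [6, 6, 6],          # mean of exactly 6.0 is not above the cut-off
    ("Tulpe", "Demonstration"): [1, 2, 1],   # filler-like nonsense pair
}
selected, averages = select_pairs(ratings, cutoff=6.0, n_select=2)
```

With the study's parameters, the same call would use `n_select=40` on the averaged ratings of all experimental pairs.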
4.2 ERP study²
4.2.1 Method
Forty experimental stimuli were constructed per condition. They were designed in triplets and consisted of a context and a target sentence as illustrated in (9). Target sentences were kept identical within each triplet, while context sentences varied in the salience of the critical DP (yielding identity, bridging, or incoherence). In addition to the 120 experimental stimuli, 160 filler items were created that served as distracter items and matched the experimental stimuli in length.
2. This study is also reported in Burkhardt (2006) and is here discussed with reference to the model outlined above. Additional details about design and statistical data are available therein.
Twenty-four students (12 female) from the University of Leipzig participated in this experiment (21 to 29 years of age; M=24.3, SD=2.6). All participants were right-handed native speakers of German and reported normal or corrected-to-normal visual acuity. One participant had to be excluded from the analysis of the ERP data due to excessive artifacts. Participants were fitted with an electrode cap, with which the electroencephalogram (EEG) was recorded. They were seated in a sound-attenuating booth and were instructed to read the stimuli for comprehension; the stimuli were presented visually in the center of a computer screen. DPs were presented phrase by phrase (for 500 ms) and all other elements word by word (for 400 ms), each with an inter-stimulus interval of 100 ms. Following the presentation of the context and target sentence of a given stimulus, a comprehension question was presented in its entirety, which participants had to answer with respect to the stimulus just read. This comprehension question was asked to ensure that participants were processing and understanding the material properly, and items with an incorrect or timed-out response were excluded prior to data analysis. In the comprehension task, information was probed from both context and target sentences, and ‘yes’ and ‘no’ responses were equally distributed across all items. Different questions were assigned to each stimulus and counterbalanced across participants. For instance, for the identity condition presented in (9.a), possible comprehension questions were Was the doorman from the Adlon? (expected response: ‘yes’) or According to Regine, is the doorman too young? (expected response: ‘no’).
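To make the presentation timing concrete, the following sketch computes onset and offset times for the segments of one target sentence. The durations (500 ms for DPs, 400 ms for other elements, 100 ms inter-stimulus interval) come from the text; the function name `onsets` and the segmentation of the example sentence, including the decision to treat only der Portier as a DP segment, are assumptions made for illustration.

```python
DP_MS = 500     # presentation time of a DP, shown as one phrase
WORD_MS = 400   # presentation time of every other element
ISI_MS = 100    # blank interval between consecutive segments


def onsets(segments):
    """Return (text, onset_ms, offset_ms) for each (text, is_dp) segment."""
    schedule = []
    t = 0
    for text, is_dp in segments:
        duration = DP_MS if is_dp else WORD_MS
        schedule.append((text, t, t + duration))
        t += duration + ISI_MS
    return schedule


# Hypothetical segmentation of the target sentence in (9).
target = [
    ("Sie", False), ("denkt,", False), ("dass", False),
    ("der Portier", True), ("wohl", False),
    ("überqualifiziert", False), ("war.", False),
]
schedule = onsets(target)
```

Under these assumptions the critical DP appears 1500 ms after the start of the target sentence, which is the point to which the ERPs would be time-locked.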
4.2.2 Results: Behavioral data
Behavioral data are based on the responses to the comprehension question. First, error rates were computed over incorrect and timed-out responses (i.e., responses that failed to be registered within 4000 ms after the presentation of the verification question). Second, mean reaction times per condition were calculated for items with correct responses. The analysis of the error rates revealed that participants had an accuracy rate of about 90% or higher, indicating that they were attending to the stimuli properly. A repeated measures analysis of the reaction times to the comprehension task showed a main effect of condition, which resulted from significantly longer reaction times for the incoherence condition over both the identity and the bridging condition (see Table 1).
Table 1. Behavioral data by condition.
                             IDENTITY CONDITION   BRIDGING CONDITION   INCOHERENCE CONDITION
Mean Reaction Times (SDs)    1662 ms (336 ms)     1648 ms (299 ms)     1693 ms (319 ms)
Mean Error Rates (%)         3.83 (9.56%)         3.61 (9.02%)         4.22 (10.54%)
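The two behavioral measures can be computed along the following lines. This is a hedged sketch: the trial format, the function name `summarize`, and the toy trials are assumptions; only the 4000 ms time-out, the pooling of incorrect and timed-out responses as errors, and the restriction of reaction times to correct responses follow the description above.

```python
from statistics import mean

TIMEOUT_MS = 4000  # responses not registered within this window count as errors


def is_error(trial):
    """A trial is an error if it timed out or was answered incorrectly."""
    rt = trial["rt_ms"]
    return rt is None or rt > TIMEOUT_MS or not trial["correct"]


def summarize(trials):
    """Return error rate (%) and mean RT of correct responses per condition."""
    by_condition = {}
    for trial in trials:
        by_condition.setdefault(trial["condition"], []).append(trial)
    summary = {}
    for condition, items in by_condition.items():
        correct_rts = [t["rt_ms"] for t in items if not is_error(t)]
        n_errors = len(items) - len(correct_rts)
        summary[condition] = {
            "error_rate_pct": 100 * n_errors / len(items),
            "mean_rt_ms": mean(correct_rts) if correct_rts else None,
        }
    return summary


# Toy trials (hypothetical values); rt_ms=None marks a timed-out response.
trials = [
    {"condition": "identity", "correct": True, "rt_ms": 1600},
    {"condition": "identity", "correct": True, "rt_ms": 1700},
    {"condition": "identity", "correct": False, "rt_ms": 1500},
    {"condition": "identity", "correct": True, "rt_ms": None},
]
summary = summarize(trials)
```

The statistical comparison across conditions (a repeated measures analysis in the study) is not part of this sketch.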
4.2.3 Results: ERP data
Grand-average ERPs for identity (dotted grey line), incoherence (dashed line), and bridging (solid line) are shown in Figure 1, where ERPs were time-locked to the onset of the critical definite DP in the target sentence. By convention, negativity is plotted upward and positivity downward. Only experimental items that elicited a correct response to the verification task entered the data analysis. Repeated measures analyses of variance were carried out for two time windows corresponding to the N400- and the P600-component (for statistical details see Burkhardt 2006). In the N400-window, ranging from 350 to 550 ms post-onset, the statistical analysis revealed a significant effect, which had its maximum over posterior electrode sites. This negativity was modulated by the degree of salience and coherence, which differentially affects the process Specify R in the three conditions: the definite DPs in the incoherence condition registered the most pronounced negativity, followed by the bridging condition with an intermediate (but clearly reduced) negativity, and finally the identity condition, which showed a strong reduction of the N400-component. In the P600-window, spanning from 600 to 900 ms, the statistical analysis registered a significant effect, which was resolved over left posterior electrode sites. A comparison of the three conditions revealed that the incoherence condition and the bridging condition showed an enhanced positive-going waveform and differed significantly from the identity condition. This supports the view that the P600 indexes the creation of independent discourse referents and measures storage cost as required by Establish Independent File Card.
[Figure 1: ERP waveforms at electrodes F3, FZ, F4, C3, CZ, C4, P3, PZ, and P4; vertical axis from -5 to 5 µV, horizontal axis in seconds.]
Figure 1. Grand-average ERPs from nine electrode positions, recorded to the onset of the definite DP (onset at vertical line) for IDENTITY (dotted grey line), BRIDGING (solid line), and INCOHERENCE (dashed line). Negative voltage changes are plotted upwards; the time course is plotted on the horizontal axis (from 200 ms pre-onset till 1200 ms post-onset relative to the definite DP). ERPs show an effect of salience and coherence – related to Specify R – in the N400-component, with the most pronounced negativity for the incoherence condition, followed by the bridging and identity condition, as well as an effect of the creation of an independent discourse representation – cf. Establish Independent File Card – in the P600-component for the bridging and the incoherence condition.
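As an illustration of what a time-window measure involves, the sketch below averages a waveform over the two windows named above (350 to 550 ms and 600 to 900 ms). It rests on several assumptions: that the dependent measure is the mean amplitude per window, that window boundaries are inclusive, and that the toy waveform and the name `mean_amplitude` stand in for real data and tooling. The statistical analysis itself is reported in Burkhardt (2006) and is not reproduced here.

```python
N400_WINDOW = (350, 550)  # ms after onset of the critical DP
P600_WINDOW = (600, 900)  # ms after onset of the critical DP


def mean_amplitude(times_ms, amplitudes_uv, window):
    """Average the amplitudes whose time stamps fall inside window (inclusive)."""
    start, end = window
    values = [a for t, a in zip(times_ms, amplitudes_uv) if start <= t <= end]
    if not values:
        raise ValueError("no samples fall inside the requested window")
    return sum(values) / len(values)


def toy_waveform(t):
    """Hypothetical single-electrode waveform: a negativity, then a positivity."""
    if 350 <= t <= 550:
        return -3.0
    if 600 <= t <= 900:
        return 2.0
    return 0.0


# Epoch from 200 ms pre-onset to 1200 ms post-onset, sampled every 50 ms.
times = list(range(-200, 1201, 50))
wave = [toy_waveform(t) for t in times]
n400 = mean_amplitude(times, wave, N400_WINDOW)
p600 = mean_amplitude(times, wave, P600_WINDOW)
```

In an actual analysis such values would be computed per participant, condition, and electrode before entering the analysis of variance.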
5. Discussion The goal of this paper was to bring together theoretical aspects of the interpretation of definite DPs with evidence from sentence processing in order to shed light onto the nature of the interpretive processes underlying DP integration. Two processes were of particular interest: Specify R and Establish Independent File Card. On the basis of the Syntax-Discourse Model, the two functionally distinct components of a DP were hypothesized to impact sentence comprehension in specific ways: heads, which provide information for syntax-discourse interface operations, indicate the (saliency) status of a DP and consequently determine the manner in which a DP is integrated. When D carries the feature [definite], this introduces the presupposition that reference is made to a known entity, prompting the process Establish R, upon which the lexicosemantic information supplied by the complement is assessed to carry out Specify R. The relation R can be resolved via identity or bridging and the ease of specifying R was predicted to emerge in the N400-component of the ERP signal. In addition, coherence considerations were also expected to modulate the N400. Additional integration and storage cost resulting from Establish Independent File Card in the case of non-identical discourse units was hypothesized to emerge in the P600-component. The electrophysiological data support these predictions and illustrate that different processes are carried out during the integration of definite DPs. The identity condition – hypothesized to be the least costly in the current comparison – elicited a reduced negativity but no pronounced positivity (reflecting the computational advantage of establishing an identity relation), while the incoherence condition – hypothesized to be subject to the most costly processes – registered a pronounced N400 followed by a P600. 
Interestingly, the bridging condition showed a reduction of the N400 – similar to the identity condition – and a pronounced P600 – patterning with the incoherence condition. The obtained N400-results converge with ERP findings from semantic and repetition priming that indicated that the amplitude of the N400 varies as a function of semantic and conceptual relatedness (cf. e.g., Kutas and Federmeier 2000). The present data specifically demonstrate that the availability of semantic and inferential relations eases interpretation very early in the time course of integration (i.e., around 350 ms after stimulus onset). The observed P600 can be
interpreted as an index of increased integration cost (cf. e.g., Kaan et al. 2000; Fiebach et al. 2002; Kaan and Swaab 2003) and is here attributed to the need to establish and store an additional discourse referent that is new within discourse structure (cf. also Burkhardt 2005). Overall, the results of this study reveal two distinct neural mechanisms involved in the interpretation of definite DPs that can be functionally dissociated and can be linked to the processes Specify R and Establish Independent File Card respectively. Based on the time course of the neural correlates for these two processes, Specify R reveals an earlier onset latency than Establish Independent File Card, which suggests that dependency formation precedes the integration of independent discourse entities. Critically, during the first phase of integration, both coreferential and inferential relations facilitate interpretation, as evidenced by the reduced N400 to the identity condition and the bridging condition. This indicates that salience is an important feature during the initial interpretative processes carried out at the level of discourse representation. In contrast, in the incoherence condition, the definite DPs are less consistent with prior context and occur unexpectedly. The search for a potential relation R and the detection of the incoherence between context sentence and target DP yielded an N400 with a pronounced amplitude, as previously reported for pragmatic and semantic plausibility violations (cf. e.g., Kutas and Hillyard 1984; van Berkum et al. 1999; Hagoort et al. 2004). The enhanced N400 therefore appears to mark the unavailability of a relation R. In general, the amplitudinal variations observed for the N400 suggest that during this time window, a search for an antecedent or anchor takes place and additional knowledge sources are accessed to specify the relation R. 
The second neural mechanism observed in the present study indexes the integration of an independently available entity, as a consequence of which an additional discourse referent must be kept in storage for future referential processes. This results in cost and is reflected in the pronounced P600 in the incoherence condition and the bridging condition, which have in common that – independent from salience and coherence considerations – they introduce a previously unknown entity into discourse structure. Table 2 summarizes the findings per condition. It further shows that DPs that are subject to inferential bridging relations share processes with the two other types of DPs investigated in the present study.
Table 2. Interpretation processes and ERP findings.
                                   DEPENDENCE: IDENTITY   DEPENDENCE: BRIDGING     INDEPENDENCE (INCOHERENCE)
SPECIFY R                          ✓ reduced N400         ✓ (less) reduced N400    enhanced N400
ESTABLISH INDEPENDENT FILE CARD    -                      ✓ P600                   ✓ P600
To conclude, the present investigation indicates that during the integration of DPs, interpretation is constrained by discourse-internal economy considerations, namely that the formation of a dependency relation is less costly than the introduction of an independent discourse referent and that dependency relations are (attempted to be) specified prior to the establishment of independent reference. This has previously been proposed on the basis of offline studies, and is here substantiated by online data, which further reveal that distinct neural mechanisms are involved. The ease of dependency formation in the bridging and identity conditions is observed within the N400-window, which is interpreted as a marker for the establishment of a relation R and the activation of lexical and conceptual neural networks. Additionally, a new discourse referent must still be formed in the case of inferentially bridged DPs – and seems to be established for definite DPs that lack a proper anchor – which results in cost and is reflected in the P600-component. Finally, this paper brings together model-theoretical considerations with evidence from language processing, implying that ERP-measures can be utilized to test predictions for other discourse phenomena and to determine the relevant features and constraints that should be encoded in a theory of discourse representation.
Acknowledgements This research was carried out at the Max-Planck-Institute for Human Cognitive and Brain Sciences in Leipzig, Germany. I would like to thank Cornelia Schmidt and Beate Günther for their support during data acquisition, as well as Sergey Avrutin, Ina Bornkessel-Schlesewsky, Angela Friederici, and Matthias Schlesewsky for fruitful discussion.
References
Abbott, B. 2004. “Definiteness and indefiniteness.” In The handbook of pragmatics, L. Horn and G. Ward (eds), 122–149. Oxford/Malden, MA: Blackwell.
Almor, A. 1999. “Noun-phrase anaphora and focus: The information load hypothesis.” Psychological Review 106 (4): 748–765.
Asher, N. and Lascarides, A. 1998. “Bridging.” Journal of Semantics 15 (1): 83–113.
Avrutin, S. 1999. Development of the syntax-discourse interface. Boston: Kluwer.
Avrutin, S. 2004. “Weak syntax.” In Broca’s region, Y. Grodzinsky and K. Amunts (eds), 49–62. Oxford/New York: Oxford University Press.
Burkhardt, P. 2005. The syntax-discourse interface: Representing and interpreting dependency. Amsterdam/Philadelphia: John Benjamins.
Burkhardt, P. 2006. “Inferential bridging relations reveal distinct neural mechanisms: Evidence from event-related brain potentials.” Brain and Language 98: 159–168.
Chafe, W.L. 1976. “Givenness, contrastiveness, definiteness, subjects, topics, and point of view.” In Subject and topic, C.N. Li (ed.), 25–55. New York: Academic Press.
Chierchia, G. 1995. Dynamics of meaning. Anaphora, presupposition, and the theory of grammar. Chicago: University of Chicago Press.
Clark, H.H. 1975. “Bridging.” In Theoretical issues in natural language processing, B. Nash-Webber and R. Schank (eds), 188–193. Cambridge, MA: Yale University Mathematical Society Sciences Board.
Fiebach, C.J., Schlesewsky, M., and Friederici, A.D. 2002. “Separating syntactic memory costs and syntactic integration costs during parsing: the processing of German WH-questions.” Journal of Memory and Language 47 (2): 250–272.
Fraurud, K. 1990. “Definiteness and the processing of NPs in natural discourse.” Journal of Semantics 7: 395–433.
Friederici, A.D., Pfeifer, E., and Hahne, A. 1993. “Event-related brain potentials during natural speech processing: Effects of semantic, morphological and syntactic violations.” Cognitive Brain Research 1 (3): 183–192.
Grice, H.P. 1975. “Logic and conversation.” In Speech acts, P. Cole and J.L. Morgan (eds), 41–58. New York: Academic Press.
Hagoort, P., Hald, L., Bastiaansen, M., and Petersson, K.M. 2004. “Integration of word meaning and world knowledge in language comprehension.” Science 304 (5669): 438–441.
Haviland, S.E. and Clark, H.H. 1974. “What’s new? Acquiring new information as a process in comprehension.” Journal of Verbal Learning and Verbal Behavior 13: 512–521.
Hawkins, J.A. 1978. Definiteness and indefiniteness. London: Croom Helm.
Heim, I. 1982. The semantics of definite and indefinite noun phrases. Ph.D. Dissertation, University of Massachusetts, Amherst.
Kaan, E., Harris, A., Gibson, E., and Holcomb, P. 2000. “The P600 as an index of syntactic integration difficulty.” Language and Cognitive Processes 15 (2): 159–201.
Kaan, E., and Swaab, T.Y. 2003. “Repair, revision, and complexity in syntactic analysis: an electrophysiological differentiation.” Journal of Cognitive Neuroscience 15 (1): 98–110.
Kutas, M., and Hillyard, S.A. 1980. “Reading between the lines: Event-related brain potentials during natural sentence processing.” Brain and Language 11 (2): 354–373.
Kutas, M., and Hillyard, S.A. 1984. “Brain potentials during reading reflect word expectancy and semantic association.” Nature 307: 161–163.
Kutas, M., and Federmeier, K.D. 2000. “Electrophysiology reveals semantic memory use in language comprehension.” Trends in Cognitive Sciences 4 (12): 463–470.
Lascarides, A. and Asher, N. 1993. “Temporal interpretation, discourse relations and commonsense entailment.” Linguistics and Philosophy 16: 437–493.
Lewis, D. 1979. “Scorekeeping in a language game.” In Semantics from Different Points of View, R. Bäuerle, U. Egli, and A. v. Stechow (eds), 172–87. Berlin: Springer.
Löbner, S. 1985. “Definites.” Journal of Semantics 4: 279–326.
Lyons, C. 1999. Definiteness. Cambridge: Cambridge University Press.
Poesio, M. and Vieira, R. 1998. “A corpus-based investigation of definite description use.” Computational Linguistics 24 (2): 183–216.
Prince, E.F. 1981. “Toward a taxonomy of given-new information.” In Radical pragmatics, P. Cole (ed.), 223–255. New York: Academic Press.
Prince, E.F. 1992. “The ZPG letter: Subjects, definiteness, and information-status.” In Discourse description: Diverse linguistic analyses of a fund raising text, W.C. Mann and S.A. Thompson (eds), 295–325. Amsterdam: John Benjamins.
Rugg, M.D. 1985. “The effects of semantic priming and word repetition on event-related potentials.” Psychophysiology 22 (6): 642–647.
St. George, M., Mannes, S., and Hoffman, J.E. 1997. “Individual differences in inference generation: An ERP analysis.” Journal of Cognitive Neuroscience 9 (6): 776–787.
Streb, J., Rösler, F., and Hennighausen, E. 1999. “Event-related responses to pronoun and proper name anaphors in parallel and nonparallel discourse structures.” Brain and Language 70 (2): 273–286.
Streb, J., Hennighausen, E., and Rösler, F. 2004. “Different anaphoric expressions are investigated by event-related brain potentials.” Journal of Psycholinguistic Research 33 (3): 175–201.
van Berkum, J.J.A., Hagoort, P., and Brown, C.M. 1999. “Semantic integration in sentences and discourse: Evidence from the N400.” Journal of Cognitive Neuroscience 11 (6): 657–671.
van Petten, C., and Kutas, M. 1991. “Influences of semantic and syntactic context on open- and closed-class words.” Memory and Cognition 19 (1): 95–112.
Weisbrod, M., Kiefer, M., Winkler, S., Maier, S., Hill, H., Roesch-Ely, D., and Spitzer, M. 1999. “Electrophysiological correlates of direct versus indirect semantic priming in normal volunteers.” Cognitive Brain Research 8 (3): 289–298.
Yekovich, F.R. and Walker, C.H. 1978. “Identifying and Using Referents in Sentence Comprehension.” Journal of Verbal Learning and Verbal Behavior 17: 265–78.
Accessing discourse referents introduced in negated phrases
Evidence for accommodation?
Barbara Kaup and Jana Lüdtke
Berlin University of Technology
In two experiments we compared anaphor resolution times in negative sentences (e.g., Either Peter does not catch a train, or it will arrive late in the evening.) with those in affirmative sentences (e.g., If Peter catches a train, then it will arrive late in the evening.). Sentences were read segment by segment, and segment reading times were recorded. In line with the hypotheses, segment reading times following the anaphoric expression were longer in the negative than in the affirmative condition, but only when the critical entity was referred to (e.g., the train as compared to Peter). When a repeated name was used for reference instead of a pronoun (e.g., the train as compared to it), resolution times were faster specifically in the negative condition. Implications for different accounts of language comprehension are discussed.
1. Introduction
According to proponents of dynamic semantics, the meaning of a sentence is not defined in terms of its truth conditions but rather in terms of its potential to change the context in which it occurs. This concept of meaning as context-change potential is central, for instance, in Heim’s File-Change Semantics (Heim 1982) or Kamp’s Discourse-Representation Theory (DRT, Kamp 1981). According to these theories, a sentence containing an indefinite noun phrase (NP) [such as a lion in (1a)] introduces a discourse referent into the discourse representation, and this discourse referent can be utilized when the respective entity is referred to in the upcoming text (e.g., 1b, see Figure 1). Thus, such a sentence changes the context by providing a discourse referent to which upcoming text can be related. Accordingly, dynamic semantics is particularly well suited to account for anaphoric binding across the sentence boundary.
(1) a. In the cage there was a lion.
    b. It was sleeping and snoring.
[x, y | cage(x), lion(y), in(y,x), sleeping(y), snoring(y)]
Figure 1. Discourse representation for text (1).
Negation provides an interesting case in this context: An indefinite NP in the scope of the negation operator introduces a discourse referent. However, this discourse referent is usually not available for anaphoric reference in the upcoming text. For instance, in (2), the anaphoric reference in the second sentence seems awkward. This apparent inaccessibility is accounted for by assuming that negation is an operator that applies to a subordinate DRS, and that discourse referents represented in a negated sub-DRS are inaccessible for anaphor resolution in the main DRS (see Figure 2).¹
(2) a. In the cage there was no lion.
    b. *It was sleeping and snoring.
However, under certain conditions, anaphoric reference to entities introduced within the scope of the negation operator is possible. For instance, in (3), where the lion is introduced within the scope of a double negation, anaphoric reference is felicitous. It seems that comprehenders resolve the double negation, with the effect that the respective discourse referent becomes accessible for anaphor resolution in the second clause. This accommodation process (Lewis 1979) is licensed in cases where the original DRS is logically equivalent to the transformed DRS: The negation of the negation of a proposition p is logically equivalent to the proposition p. Accordingly, accommodation can transform a DRS that is negated twice into a DRS that is not negated at all (see Figure 3). It is usually assumed that accommodation is triggered by encountering an anaphoric expression that cannot otherwise be resolved. In other words, accommodation does not take place spontaneously but only when it is required for anaphor resolution (cf. Kuschert 1999).
[x | cage(x), ¬[y | lion(y), in(y,x)], sleeping(?), snoring(?)]
Figure 2. Discourse representation for (2). The pronoun in the second sentence cannot be resolved because no adequate discourse referent is accessible in the main DRS.
1. Note that the cage is not a potential referent because the verb in the anaphoric sentence requires an animate subject.
Accessing discourse referents introduced in negated phrases
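(3) a. It’s not true that there was no lion in the cage.
    b. I saw it sleeping and heard it snoring.
[Figure 3 DRSs. (A) main box with referent y and condition cage(y), containing a doubly negated sub-DRS with referent x and conditions lion(x), in(x,y). (B) referents x, y; conditions cage(y), lion(x), in(x,y). (C) referents x, y; conditions cage(y), lion(x), in(x,y), sleeping(x), snoring(x).]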
Figure 3. A: Discourse representation for (3a). B: Discourse representation after the accommodation process has taken place: The discourse referent representing the lion is accessible in the main-DRS. C: Discourse representation for the whole discourse after successful anaphor resolution.
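The accessibility constraint and the double-negation accommodation step described above can be illustrated with a minimal sketch. The class names `DRS` and `Neg` and the functions `accessible` and `accommodate` are hypothetical names introduced only for this illustration; they are not part of any DRT implementation cited in this chapter, and the sketch covers only the top-level accessibility case shown in Figures 2 and 3.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Neg:
    """A negated sub-DRS, used as a condition of the enclosing DRS."""
    body: "DRS"


@dataclass
class DRS:
    """A box: discourse referents plus conditions (plain strings or Neg)."""
    referents: List[str] = field(default_factory=list)
    conditions: List[Union[str, Neg]] = field(default_factory=list)


def accessible(drs: DRS) -> List[str]:
    """Referents available to an anaphor in the main DRS (top level only)."""
    return list(drs.referents)


def accommodate(drs: DRS) -> DRS:
    """Rewrite every condition of the form not(not K) as K, lifting K's
    referents and conditions into the enclosing box."""
    referents = list(drs.referents)
    conditions: List[Union[str, Neg]] = []
    for cond in drs.conditions:
        inner = cond.body if isinstance(cond, Neg) else None
        is_double_negation = (
            inner is not None
            and not inner.referents
            and len(inner.conditions) == 1
            and isinstance(inner.conditions[0], Neg)
        )
        if is_double_negation:
            lifted = accommodate(inner.conditions[0].body)
            referents.extend(lifted.referents)
            conditions.extend(lifted.conditions)
        else:
            conditions.append(cond)
    return DRS(referents, conditions)


# (2a): single negation, the lion referent stays inside the negated sub-DRS.
drs_2a = DRS(["x"], ["cage(x)", Neg(DRS(["y"], ["lion(y)", "in(y,x)"]))])

# (3a): double negation, as in Figure 3A.
drs_3a = DRS(
    ["y"],
    ["cage(y)", Neg(DRS([], [Neg(DRS(["x"], ["lion(x)", "in(x,y)"]))]))],
)

single_after = accessible(accommodate(drs_2a))
double_before = accessible(drs_3a)
double_after = accessible(accommodate(drs_3a))
print(single_after, double_before, double_after)
```

In this sketch the single negation in (2a) is left untouched, so the lion referent remains inaccessible, whereas the doubly negated box in (3a) is flattened into the main DRS as in Figure 3B.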
The assumptions concerning accommodation directly translate into hypotheses concerning the time that is required for anaphor resolution in language comprehension. An anaphoric expression with an antecedent in the scope of a double negation [e.g., (3)] should take longer to resolve than an anaphoric expression with an antecedent in an affirmative phrase [e.g., (4)]. The reason is that the former triggers a time-consuming accommodation process. Once the anaphor is resolved, the two conditions should not differ with respect to the accessibility of the introduced discourse referents, because the accommodation process modifies the discourse representation. Thus, if an anaphoric expression is used for the second time, no difference should emerge between the two different antecedent conditions [i.e., (3) and (4), respectively]. (4) a. It is true that there was a lion in the cage. b. I saw it sleeping and heard it snoring.
The first of these two predictions was investigated in a self-paced reading experiment by Kuschert (1999). Participants were presented with narrative texts in which a target entity (in the following critical entity) was introduced either within the scope of a double negation or within an affirmative phrase [e.g., (5) and (6), respectively]. The next sentence then referred to this entity by means of a pronoun. Sentences were being presented segment-by-segment, and segment reading times were being recorded [segment borders are indicated by a “/” in (5) through (8) below]. As predicted, reading times for the segment following the pronoun (i.e., in the mall today) were significantly longer in the negative than in the affirmative condition. This is in line with the idea that comprehenders, upon encountering the pronoun in the anaphor sentence, initiated a time-consuming accommodation process in the negative condition
(see Figure 4). Interestingly, the same result was not obtained in two control conditions [(7) and (8)], where the anaphor sentence did not refer to the critical entity but to an entity that was mentioned in the previous sentence by means of a proper name or a definite NP (e.g., Jim; in the following: non-critical entity). The respective discourse referent was not introduced but only referred to in the previous sentences and should therefore be represented in the accessible main-DRS right away, even in the negative condition (see Figure 4). The fact that the negation effect did not generalize to these conditions rules out the possibility that the prolonged reading times in the negative condition reflect general processing difficulties subsequent to negative sentences (for instance due to spill-over effects).
(5) a. Mary denied / the statement / that Carl / does not have a sister.
    b. She had met her / in the mall today.
(6) a. Mary confirmed / the statement / that Carl / has a sister.
    b. She had met her / in the mall today.
(7) a. Mary denied / the statement / that Carl / does not have a sister.
    b. She had met him / in the mall today.
(8) a. Mary confirmed / the statement / that Carl / has a sister.
    b. She had met him / in the mall today.
[Figure 4 DRSs. (A) referents x, y, z; conditions Mary(x), Carl(y), sister-of(z,y). (B) referents x, y; conditions Mary(x), Carl(y), and a doubly negated sub-DRS with referent z and condition sister-of(z,y). (C) the doubly negated sub-DRS with referent z and condition sister-of(z,y) = the same sub-DRS without negation.]
Figure 4. A: Discourse representation for (6a). B: Discourse representation for (5a). C: Accommodation: Logical equivalence of not not P and P. The representation in A corresponds to the representation for (5a) after accommodation has taken place. Note: This figure is slightly simplified: Mary denied that P is reduced to Not P, and Mary confirmed that P is reduced to P.
By the same reasoning as above, Kuschert also investigated anaphor resolution in so-called bathroom sentences [e.g., (9)].2 Here, the anaphor refers to an entity that was introduced within the scope of a negation operator in the first clause of the sentence. The respective discourse referent is represented in a negated sub-DRS, and accordingly, should be inaccessible for anaphor resolution in the second clause (see Figure 5A). Accommodation in this case presumably utilizes the fact that Not-a or b is logically equivalent to If a then b. In DRT it is generally assumed that discourse referents represented in the sub-DRS that corresponds to the antecedent of an implication are accessible from within the sub-DRS that corresponds to the consequent of the implication. Thus, accommodation in this case should be successful, because subsequent to the accommodation process, the critical discourse referent is available for anaphor resolution from within the sub-DRS containing the respective proposition (see Figure 5B). Translated into predictions concerning the time that is needed for anaphor resolution, we would expect to find longer resolution times for the anaphor in bathroom sentences compared to the corresponding affirmative implication [e.g., (9) and (10), respectively]. As before, this difference should only be obtained when the anaphor refers to the critical entity, but not when it refers to the non-critical entity [as in (11) and (12)]. The results of Kuschert’s experiment corresponded to these predictions: The segment following the anaphor (i.e., arrive very late) was read more slowly when the antecedent was in the scope of a negation operator [(9)] than when it was in an affirmative phrase [(10)]. Furthermore, when instead of the critical entity (i.e., train) the non-critical entity was being referred to [(11) and (12)], the negative condition did not lead to longer reading times than the affirmative condition. Thus, the results obtained with these materials replicated the results obtained with double negation.
[Figure 5 DRSs. (A) main box with referent x and condition Peter(x); a negated sub-DRS with referent y and conditions train(y), caught(x,y), joined by disjunction (∨) to a sub-DRS with the unresolved condition arrive-late(?). (B) main box with referent x and condition Peter(x); a sub-DRS with referent y and conditions train(y), caught(x,y), joined by implication (⇒) to a sub-DRS with the condition arrive-late(y).]
Figure 5. A: Discourse representation for (9). B: Discourse representation for (10). Note: The representation in B corresponds to the representation for (9) after accommodation has taken place.
2. The term is due to a structurally similar example attributed to Barbara Partee, namely Either there is no bathroom in this house, or it is in a funny place.
(9) Either Peter did not / catch a train, / or it will / arrive very late / in the evening.
(10) If Peter / caught a train, / it will / arrive very late / in the evening.
(11) Either Peter did not / catch a train, / or he will / arrive very late / in the evening.
(12) If Peter / caught a train, / he will / arrive very late / in the evening.
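The two logical equivalences that license accommodation in this chapter (double negation, and Not-a or b versus If a then b) can be checked by brute force over truth values. The following is a minimal propositional sketch; the function names are hypothetical, and the check deliberately ignores the referent-accessibility structure that DRT adds on top of truth conditions.

```python
from itertools import product


def double_negation(p: bool) -> bool:
    """not not P, as in the double-negation materials (3) and (5)."""
    return not (not p)


def bathroom_form(a: bool, b: bool) -> bool:
    """Either not a, or b: the form of the bathroom sentence (9)."""
    return (not a) or b


def implication_form(a: bool, b: bool) -> bool:
    """If a then b: the form of the affirmative implication (10)."""
    return b if a else True


truth_values = [False, True]

double_negation_ok = all(double_negation(p) == p for p in truth_values)
bathroom_ok = all(
    bathroom_form(a, b) == implication_form(a, b)
    for a, b in product(truth_values, repeat=2)
)
print(double_negation_ok, bathroom_ok)  # True True
```

The check confirms only that the sentence pairs are truth-conditionally equivalent; whether and when comprehenders actually carry out the corresponding transformation is the empirical question addressed by the reading-time experiments.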
To summarize, the results of Kuschert’s study fit nicely with the predictions of the accommodation hypothesis. Anaphors referring to entities that were introduced in the scope of a negation are more slowly resolved than those referring to entities introduced in affirmative phrases, presumably because they trigger a time-consuming accommodation process. However, in an earlier study conducted in our lab (Kaup, Dijkstra, and Lüdtke 2004), we failed to replicate Kuschert’s results in several experiments employing sentences with double negation. Instead of finding evidence for a temporary inaccessibility of the critical entity in the negative condition, we found evidence for a relatively high accessibility of this entity. Obviously, this is in contrast to the predictions of the DRT-based accommodation hypothesis. Before reporting two new experiments in which we tried to replicate Kuschert’s result obtained with bathroom sentences, let us take a closer look at our study with double negation.
2. Previous Study: Double Negation
In the first experiment we attempted to replicate the results by Kuschert. Participants were presented with narrative stories containing passages such as (5)–(8), and sentence reading times were being measured (see Table 1, for an example). Reading times for the anaphoric sentences were analyzed. Reading times were longer in the negative than in the affirmative condition, but this difference was only significant for the non-critical conditions (see Figure 6A). These results did not replicate Kuschert’s results. A second experiment was designed to find out whether the differences reflect differences in methodology (segment vs. sentence reading times). In this experiment, instead of presenting the stories sentence-by-sentence, the stories were presented segment-by-segment, self-paced by the participants, with the segment borders being assigned to the materials according to Kuschert’s criteria. We analyzed the reading times for the final segment of the anaphoric sentences. The results replicated those of the first experiment: Reading times were significantly longer after negative than after affirmative sentences, but this difference was only
significant in the non-critical conditions (see Figure 6B). Thus, the fact that the results did not replicate the results by Kuschert cannot be due to differences in the experimental procedure.
Table 1. Sample text for the study in Kaup, Dijkstra and Lüdtke 2005
Setting: We were awaiting company for the weekend. Hours before Uncle Sam was supposed to be arriving my whole family was already panicking.
Introducing Sentence
  [Aff] My brother Stanley assured everybody that our sister had made a cake.
  [Neg] My brother Stanley objected to the statement that our sister had not made a cake.
Anaphor Sentence
  [Critical] He told us that he had seen it in the kitchen just now.
  [Non-Critical] He told us that he had seen her in the kitchen just now.
Final Sentence: We were really glad when Uncle Sam finally arrived and everything went fine.
Question: Did the visitor arrive?
[Figure 6 plots: two panels, A and B; y-axis: Reading Time (in ms); x-axis: Critical (pronominal) vs. Non-Critical (pronominal); separate lines for NEG and AFF.]
Figure 6. A: Mean reading times for the anaphoric sentences in Experiment 1 of Kaup, Dijkstra and Lüdtke, 2004. B: Mean reading times in Experiment 2 of this study.
What then are the implications of these results? Contrary to the predictions of the accommodation account, anaphors referring to the critical entity were not resolved more slowly after a negative than after an affirmative sentence. Instead the predicted polarity effect was obtained in the non-critical conditions. Why should the polarity of the introducing sentence affect the accessibility of the non-critical entity? In the following we will discuss an account of the results (foregrounding account) that rests on the assumption that comprehenders spontaneously resolve the double negation when processing the introducing sentence in the negative condition. As a result, the critical entity is represented as if it had been represented in an affirmative phrase. According to the foregrounding account, the results obtained
in Experiments 1 and 2 reflect two (partly counteracting) phenomena: First, the processing of the introducing sentence can be assumed to be more difficult in the negative than in the affirmative condition, because the initial representation is far more complex in the former than in the latter case (see Figure 4 above). These difficulties may “spill over” to the anaphor sentence, in the sense that participants have not in all cases finished their representation of the negative introducing sentence when they begin processing the anaphor sentence. As a result, the processing of the anaphor sentence is slowed down in the negative conditions. Second, in those cases where participants have completed their representation (and accordingly have resolved the double negation in the negative condition), the critical entity is relatively highly accessible in the negative conditions, because participants take the relatively complex construction that is used to convey the existence of the critical entity in the negative conditions as a signal that this entity is important for what is to come. This foregrounding of the critical entity in the negated conditions facilitates anaphor resolution and neutralizes the spill-over effect from the previous sentence in the critical conditions. Two further experiments were designed that investigated this foregrounding account of the results. If the foregrounding account is correct, and the critical entity is indeed foregrounded in the negated conditions, then this should be reflected in a repeated-name penalty (Gordon, Grosz and Gilliom 1993) when a repeated-name anaphor is being used instead of a pronoun: Entities in the discourse focus are usually referred to by means of a pronoun. If instead of a pronoun a repeated name is being used for referring to an entity in the discourse focus, anaphor resolution is hampered, and resolution times are prolonged.
Thus, if using a repeated-name anaphor prolongs the resolution times specifically in the negative-critical condition, then this can be interpreted as indirect support for the assumption that the critical entities were indeed relatively highly accessible in the negated conditions. This prediction was examined in Experiments 3 and 4 of this study. In Experiment 3, participants were presented with the narrative texts employed in the previous experiments except that only the critical entity was being referred to, in half of the cases by means of a pronoun, and in the other half by means of a repeated-name anaphor (He told us that he had seen the cake in the kitchen just now). Narratives were presented sentence-by-sentence, self-paced by the participants. As expected, there was a negation-by-anaphor interaction (see Figure 7A). Reading times were longer in the negative than in the affirmative conditions but only with repeated-name anaphors. This fits well with the predictions: In the negated conditions, the target entity is foregrounded and repeated-name anaphors were therefore inadequate (repeated-name penalty). Accordingly, reading times were significantly prolonged in the negative-repeated-name condition. For the affirmative conditions there was a different pattern; here the target entity is not foregrounded, and accordingly repeated names did not hamper but rather help the resolution process (cf. Almor 1999).
[Figure 7 plots: two panels, A and B; y-axis: Reading Time (in ms); separate lines for NEG and AFF. Panel A x-axis: Pronoun (critical) vs. Rep-Name (critical). Panel B x-axis: Non-Critical (repeated name) vs. Critical (repeated name).]
Figure 7. A: Mean reading times for the anaphoric sentences in Experiment 3 of Kaup, Dijkstra and Lüdtke, 2004. B: Mean reading times in Experiment 4 of this study.
Experiment 4 investigated the prediction that a main effect of negation would occur when repeated names are used for both the critical and the non-critical conditions. Participants were presented with the narrative texts employed in Experiments 1 and 2 except that all anaphors were repeated-name anaphors (He told us that he had seen the cake / our sister in the kitchen just now). Narratives were presented sentence-by-sentence, self-paced by the participants. As expected, there was a significant main effect of negation (see Figure 7B). This provides additional evidence for the idea that the critical entity is foregrounded in the negated conditions. A repeated-name penalty reduces the relative advantage of the critical entity in the negated condition. Consequently, the negation effect is now also significant for the critical-antecedent conditions. Taken together, the results of Experiments 3 and 4 supported the foregrounding account. Specifically, the results supported the assumption that the critical entity is foregrounded after processing the introducing sentence in the negative version. This indirectly suggests that comprehenders spontaneously resolved the double negation when processing the introducing sentences. In other words, in contrast to what the accommodation hypothesis assumes, it seems that accommodation is not triggered by an anaphoric element but takes place as soon as its licensing conditions are met: a double negation is resolved regardless of whether there is an anaphor referring to an inaccessible discourse referent or not. Thus, taken together, the results of the four experiments can be explained by two assumptions: First, negative sentences are more difficult to process than affirmative sentences, and this difficulty may spill over to subsequent sentences. Second, when processing double negations as in (5) or (7), participants “calculate” the content of the actual state of affairs.
As a consequence, the critical entity becomes foregrounded and relatively highly accessible. With respect to the DRT-based accommodation assumption we can conclude that accommodation is not (only) triggered by an anaphoric element that can otherwise not be resolved. Rather, it seems that accommodation can take place whenever its licensing conditions are met: When the comprehender has created a DRS in which a discourse referent is embedded in an inaccessible sub-DRS,
and this DRS is logically equivalent to a DRS in which the respective discourse referent is accessible, then the comprehender spontaneously creates this “simpler” DRS, to the effect that the respective discourse referent becomes accessible for subsequent anaphor resolution. What are the implications of these considerations with respect to the bathroom sentences discussed in the introduction of this chapter? As was briefly reported above, in Kuschert’s experiment, bathroom sentences and double-negation sentences produced equivalent results. We were unable to replicate Kuschert’s result with double-negation sentences, and as yet it remains unclear why. The question arises whether we can replicate Kuschert’s results with bathroom sentences. If the above considerations are correct, it seems that we might: In bathroom sentences [e.g., (9), here repeated as (13)], the anaphoric element is encountered immediately after the processing of the phrase containing the negation operator, and more importantly, prior to the sentence boundary. Considering that creating a DRS with an embedded sub-DRS is time-consuming, it seems well possible that accommodation has not yet taken place when the anaphoric element is being encountered. As a consequence, the critical entity may still be encapsulated by the negation operator, and anaphor resolution should be difficult. There is another aspect in which bathroom sentences differ from the double-negation sentences: In double-negation sentences, the discourse referent representing the critical entity is accessible in the main DRS as a result of the accommodation process, or in other words, it stands for an entity that exists in the described world. In contrast, in bathroom sentences the respective discourse referent is embedded even after the accommodation process has taken place.
In other words, the sentence does not provide definite information with respect to the existence of the critical entity (it is not definite but only possible that the critical entity exists; see Figure 5B). It seems well possible that accommodation takes place spontaneously only when accessibility in the main DRS is at stake. If so, then comprehenders of bathroom sentences cannot be expected to accommodate prior to encountering the anaphoric element in the second clause.
(13) Either Peter did not catch a train, or else it will arrive very late in the evening.
In any case, if either of these assumptions is correct then the critical entity should not be foregrounded when the anaphoric element is encountered in bathroom sentences. Thus, in contrast to what was found with double-negation sentences, we should not find evidence for a relatively high accessibility of the critical entity, nor a repeated-name penalty. These predictions were investigated in two experiments. In the first experiment, pronouns were used for referring to the critical and non-critical entities. Thus, this experiment was equivalent to Kuschert’s experiment with bathroom sentences. In the second experiment, we replaced the pronouns with repeated-name anaphors.
3 Current Study: Bathroom Sentences
3.1 Experiment 1
3.1.1 Method
Participants. Fifty-six students of the Berlin University of Technology participated for course credit or financial reimbursement of EUR 8 per hour. All participants were native speakers of German.
Materials. The materials consisted of 50 short stories, 16 of which were used as experimental items, 32 as filler items, and 2 as practice items.3 The experimental items were constructed according to the following schema (see Table 2; for a German example see Table 3): The first two sentences specified the setting of the story. The next sentence (target sentence) mentioned two new entities: The non-critical entity was always referred to by a name (e.g., Peter) or a definite noun phrase (e.g., the building), whereas the critical entity was always introduced via an indefinite noun phrase (e.g., a train, an elevator). Both entities were introduced in the first of the two clauses that made up this sentence. In the second clause, the sentence pronominally referred to either the critical entity or the non-critical entity. In the affirmative version, the target sentence was an implication (e.g., If Peter catches a train, it/he will arrive late). In the negative version the target sentence was a bathroom sentence (e.g., Peter either does not catch a train, or it/he will arrive late). The next sentence was the final sentence of the story. The filler stories were comparable in length and topic to the experimental stories and served to obscure the manipulation. Sixteen of the filler stories contained a negation somewhere in the story, whereas the remaining sixteen did not. For each story, a simple comprehension question was constructed, with half of the comprehension questions requiring a ‘yes’ response and the other half requiring a ‘no’ response.
Design and Procedure. Each participant read all 16 experimental items intermixed with all 32 filler items.
The 16 experimental items were assigned to four sets, the 56 participants to four groups, and the assignment of versions to sets and groups was according to a 4×4×4 Latin square. Thus we employed a 2(polarity: affirmative vs. negative) × 2(antecedent: critical vs. non-critical) × 4 group/set design with repeated measurement on the first two variables. Text presentation was segment-by-segment, self-paced by the participant pressing the space-bar, according to a moving-windows procedure (Haberlandt 1994). Pressing the space-bar after reading the final segment of the final sentence of the story elicited the presentation of the comprehension question. Participants responded by pressing the appropriate key (the ‘.’ and ‘x’ keys, marked with ‘y’ and ‘n’, respectively). The experimental session lasted approximately 30 minutes.
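The counterbalancing scheme described above can be sketched as follows. This is an illustration under stated assumptions: the rule for assigning items to sets (`item % 4`) and the rotation rule `(group + set) % 4` are hypothetical choices that produce a valid Latin square; the chapter does not specify which particular assignment was used.

```python
from collections import Counter

# The four versions of each experimental item (polarity x antecedent).
CONDITIONS = [
    ("affirmative", "critical"),
    ("affirmative", "non-critical"),
    ("negative", "critical"),
    ("negative", "non-critical"),
]
N_ITEMS, N_SETS, N_GROUPS, N_PARTICIPANTS = 16, 4, 4, 56


def item_set(item: int) -> int:
    """Assign an item to one of four sets (hypothetical rule)."""
    return item % N_SETS


def condition_for(group: int, item: int) -> tuple:
    """Latin-square rotation: each group sees each set in a different version."""
    return CONDITIONS[(group + item_set(item)) % len(CONDITIONS)]


# Version of every item for every participant group.
schedule = {
    group: [condition_for(group, item) for item in range(N_ITEMS)]
    for group in range(N_GROUPS)
}

# Participants are distributed evenly over the four groups.
group_sizes = Counter(p % N_GROUPS for p in range(N_PARTICIPANTS))

for group, versions in schedule.items():
    print(group, Counter(versions))
```

Under this scheme every participant reads four items per condition, and across the four groups every item appears once in each of its four versions.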
3. We thank Susanna Kuschert for providing us with her experimental materials. Many of the narratives employed in this study are based on her originals.
Table 2. Sample text
Setting: My roommate Carol and I / decided to remodel / the kitchen of our apartment. / In addition to painting the walls, / we also wanted to get / some new kitchen furniture. / Today, Carol suggested / that we go to John’s carpentry shop / to get an impression / of his work. / I am not so sure / whether that’s worth it.
Target Sentence
  [Neg/Critical] Either John will not be willing / to build a dining table at all, / or it will be / extremely expensive.
  [Neg/Non-Critical] Either John will not be willing / to build a dining table at all, / or he will be / extremely expensive.
  [Aff/Critical] If John is willing / to build a dining table at all, / it will be / extremely expensive.
  [Aff/Non-Critical] If John is willing / to build a dining table at all, / he will be / extremely expensive.
Final Sentence: Carol persuaded me / to go there anyway.
Question: Was the kitchen to be remodelled?
Table 3. Sample text
Setting: Zu beneiden ist Doris nicht. / An diesem Wochenende / Bahn zu fahren, / wird die Hölle sein.
Target Sentence
  [Neg/Critical] Entweder wird Doris / keinen Sitzplatz mehr bekommen, / oder er wird / im Raucherabteil sein.
  [Neg/Non-Critical] Entweder wird Doris / keinen Sitzplatz mehr bekommen, / oder sie wird / im Raucherabteil sein.
  [Aff/Critical] Wenn Doris noch / einen Sitzplatz bekommen wird, / dann wird er / im Raucherabteil sein.
  [Aff/Non-Critical] Wenn Doris noch / einen Sitzplatz bekommen wird, / dann wird sie / im Raucherabteil sein.
Final Sentence: Ich kann mir vorstellen, / dass Doris / nach ihrer Ankunft / gerne einen Spaziergang / an der Elbe / machen würde.
Question: Kommt Doris mit dem Flugzeug nach Hamburg?
3.1.2 Results and Discussion
The analyses were performed on the segment reading times in the experimental stories. More specifically, we analyzed the reading times for the final segment of the target sentence, which always was the segment following the one containing the anaphoric expression. Reading times longer than 8000 ms or shorter than 400 ms were omitted, as well as reading times falling outside 1,458 standard deviations (cf. Selst and Jolicoeur 1994) from the item’s mean in the respective condition (this eliminated less than 5% of the data). We submitted the remaining segment reading times to two analyses of variance, one based on participant variability (F1) and one based on item variability (F2). The mean reading times in the four different conditions are displayed in Figure 8: When the sentences referred to the critical entity, reading times were longer in the negative than in the affirmative condition. The same did not hold when the non-critical entity was being referred to. Also, references to the critical entity led to faster reading times than references to the non-critical entity, but only in the affirmative versions of the sentences. These differences were reflected in the statistical analyses. There was a significant main effect of antecedent and a negation-by-antecedent interaction in the analysis by participants, but no main effect of negation (antecedent: F1(1,52) = 4.4, p < .05; F2(1,12) = 1.5, p = .25; negation-by-antecedent: F1(1,52) = 4.4, p < .05; F2(1,12) = 2.1, p = .17; negation: both Fs < 1). Planned comparisons revealed a negation effect for the critical but not for the non-critical entity (critical: F1(1,52) = 4.5, p < .05; F2(1,12) = 4.4, p = .05; non-critical: both Fs < 1; F2(1,12) = 1.5; p = .25). An effect of antecedent emerged in the analyses by participants in the affirmative but not in the negative conditions (negative: both Fs < 1; affirmative: F1(1,52) = 8.2, p < .01; F2(1,12) = 3.5, p = .08).
[Figure 8 plot: y-axis: Reading Time (in ms); x-axis: Critical (pronominal) vs. Non-Critical (pronominal); separate lines for NEG and AFF.]
Figure 8. Mean reading times in the four different conditions of Experiment 1.
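The two-step outlier elimination described in the Results section (absolute cutoffs, followed by a standard-deviation criterion relative to the item's mean in the respective condition) could be sketched roughly as below. The function name `trim_reading_times`, the trial format, and in particular the default `sd_cutoff=2.5` are assumptions made for illustration; the default is a placeholder and is not the criterion value used in the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_reading_times(trials, low=400, high=8000, sd_cutoff=2.5):
    """Return the trials that survive both elimination steps.

    trials: list of dicts with keys "item", "condition", "rt" (in ms).
    sd_cutoff is a placeholder value, not taken from the study.
    """
    # Step 1: absolute cutoffs (shorter than 400 ms or longer than 8000 ms).
    kept = [t for t in trials if low <= t["rt"] <= high]

    # Step 2: group the remaining reading times by item and condition.
    cells = defaultdict(list)
    for t in kept:
        cells[(t["item"], t["condition"])].append(t["rt"])

    result = []
    for t in kept:
        rts = cells[(t["item"], t["condition"])]
        if len(rts) < 2:
            result.append(t)  # no spread can be estimated from one value
            continue
        cell_mean, cell_sd = mean(rts), stdev(rts)
        if cell_sd == 0 or abs(t["rt"] - cell_mean) <= sd_cutoff * cell_sd:
            result.append(t)
    return result


# Made-up reading times for one item in one condition.
example_rts = [300, 1000, 1100, 1050, 950, 1020, 980, 1010, 990, 1030, 5000, 9000]
example_trials = [
    {"item": 1, "condition": "neg-critical", "rt": rt} for rt in example_rts
]
trimmed = trim_reading_times(example_trials)
print(sorted(t["rt"] for t in trimmed))
```

The surviving reading times would then be averaged once per participant and once per item within each condition to obtain the inputs for the F1 and F2 analyses, respectively.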
These results replicate the results that Kuschert obtained with bathroom sentences: Immediately after reading the first clause of a bathroom sentence, the critical entity is relatively low in accessibility. This suggests that at this point in the comprehension process the critical entity was (still) represented in the inaccessible negated sub-DRS in the negative conditions. Thus, in contrast to what was found with the double-negation sentences, the critical entity was apparently not foregrounded when the pronoun was encountered in the bathroom sentence. The fact that in the affirmative conditions, reading times were faster when the anaphor referred to the critical compared to the non-critical entity probably reflects the fact that the critical entity’s existence is what the first clause of this sentence is about. That the same advantage for the critical entity was not found in the negative conditions can be counted as further support for the view that the critical entity was relatively low in accessibility in these conditions.
In Experiment 2 we investigated the prediction that we would not find a repeated-name penalty with bathroom sentences. This prediction follows directly from the view that the critical entity is not foregrounded in bathroom sentences.
4 Experiment 2
4.1 Method
Participants. Thirty-two students of the Berlin University of Technology participated for course credit or financial reimbursement of EUR 8 per hour. All participants were native speakers of German.
Materials. The materials were the same as those in Experiment 1, except that all antecedents were critical antecedents. Two new conditions were created by replacing the pronoun in the second clause of each experimental sentence with a repeated-name anaphor (e.g., Peter either did not catch a train or else the train will arrive late).
Design and Procedure. The design was a 2(polarity: affirmative vs. negative) × 2(anaphor: pronoun vs. repeated name) × 4 group/set design with repeated measurement on the first two variables. The procedure was the same as in Experiment 1.
4.2 Results and Discussion
[Figure 9 plot: y-axis: Reading Time (in ms); x-axis: Pronoun (critical) vs. Repeated-name (critical); separate lines for NEG and AFF.]
Figure 9. Mean reading times in the four different conditions of Experiment 2.
Outlier elimination was performed as in Experiment 1, which in this experiment reduced the data set by less than 3%. The data of one participant were discarded because he or she had made five mistakes with the comprehension questions in the experiment. The mean reading times in the four different conditions are displayed in Figure 9. When a pronoun was used to refer to the critical entity, negative conditions led to longer segment reading times than affirmative conditions, which replicates the polarity effect observed in the critical-antecedent conditions of Experiment 1. As expected with respect to a potential repeated-name penalty, we did not find any indication of such a penalty, neither in the affirmative nor in the negative conditions. In the negative conditions, repeated names even helped the resolution process. Segment reading times were shorter in the negative repeated-name condition than they were in the negative pronoun condition. In affirmative conditions, repeated names and pronouns led to similar segment reading times. These differences were reflected in the statistical analyses. There was no main effect of polarity (both Fs < 1), and the main effect of anaphor was only significant in the by-participants analysis (F1(1,27) = 5.2, p < .05; F2(1,12) = 2.0, p = .18). However, there was a significant polarity-by-anaphor interaction (F1(1,27) = 8.0, p < .01; F2(1,12) = 5.6, p < .05). Separate analyses for the two anaphor conditions indicated that the polarity effect was significant in the by-participant analysis only when a pronoun was used for reference (pronoun: F1(1,27) = 4.3, p < .05; F2(1,12) = 2.7, p = .12; repeated-name: F1(1,27) = 3.0, p = .10; F2(1,12) = 2.6, p = .13). Separate analyses for the two polarity conditions indicated that repeated-name anaphors led to shorter segment reading times in the negative conditions (F1(1,27) = 12.5, p < .01; F2(1,12) = 5.2, p < .05) but did not affect reading times in the affirmative conditions (both Fs < 1). The results nicely match the predictions. Repeated-name anaphors did not lead to prolonged reading times in the negative conditions with critical antecedents. This is in line with the view that the critical entity was not foregrounded in these conditions. Moreover, the fact that repeated names even helped the resolution process specifically in the negative but not in the affirmative conditions indicates that the critical entity was relatively inaccessible in the negative conditions.
This is in line with the view that in the negative conditions, the critical entity was still represented in an inaccessible substructure at the point in time of testing, i.e., at the point in time when the anaphoric element was being encountered.
5. General Discussion
In previous studies we investigated the DRT-based accommodation hypothesis with sentences containing a double negation. According to this hypothesis, anaphors referring to entities introduced in a sentence with a double negation should take longer to resolve than anaphors referring to entities introduced in an affirmative sentence, because with the former but not with the latter comprehenders supposedly initiate a time-consuming accommodation process. In contrast to what was found in an earlier study by Kuschert (1999), we did not find positive evidence for this hypothesis. The results of four experiments consistently showed that entities introduced in a sentence with a double negation are relatively highly available shortly after the processing of these sentences. We interpreted these findings as suggesting that the accommodation process is not (only) triggered by anaphoric elements that can otherwise not be resolved. Rather, it seems that comprehenders spontaneously initiate the accommodation mechanism
when processing a sentence with a double negation. As a consequence, entities introduced within the scope of a double negation become foregrounded shortly after the processing of the sentence, and accordingly these entities are relatively highly available at this point in the comprehension process. The experiment reported in the third section of the present chapter investigated the DRT-based accommodation hypothesis with bathroom sentences. With respect to the accommodation hypothesis, bathroom sentences differ in (at least) two relevant aspects from the materials employed in our previous study with double negation. First, the anaphor referring to the entity introduced within the scope of a negation is encountered prior to a sentence boundary. Second, and probably more important, bathroom sentences do not convey definite information with respect to the existence of the critical entity in the described world. Thus, we hypothesized that comprehenders would either not spontaneously accommodate with these kinds of sentences, or alternatively would not yet have initiated the accommodation process when encountering the anaphor in the second clause. Accordingly, we expected the critical entity to be relatively low in accessibility in the negative conditions of the present experiment. In line with this prediction, reading times for the segment following the anaphor in bathroom sentences were relatively long compared to the reading times in equivalent segments in affirmative control sentences. In two further control conditions, in which not the critical entity but the non-critical entity was being referred to, no reading time difference emerged. This rules out an alternative explanation according to which the relatively long reading times in the negative condition are purely due to spill-over effects from the negation in the first clause of the sentences.
The fact that no repeated-name penalty was observed with bathroom sentences when a repeated-name anaphor was used for reference instead of a pronoun provides further support for the view that the critical entity is not foregrounded in bathroom sentences. Rather, it seems that the critical entity is relatively inaccessible: repeated names not only did not hamper the resolution process but even helped it. Taken together, the results of the two experiments are in line with the hypothesis that in bathroom sentences, the critical entity is still represented in an inaccessible substructure when the anaphoric element is being encountered. The results thereby replicate the results obtained by Kuschert (1999) and suggest that with bathroom sentences, accommodation has not taken place when the anaphoric element is encountered. In principle there are two different explanations for this finding. The first explanation rests on the assumption that comprehenders always spontaneously accommodate, but simply had not yet initiated this process when the anaphor was being encountered in the second clause of the sentences. Thus, this explanation attributes the differences in results to the fact that the anaphor in bathroom sentences is encountered prior to the sentence boundary. The second explanation rests on the assumption that comprehenders only spontaneously accommodate if an entity can be made accessible in the main-DRS, or in other words, if accommodation concerns an entity that exists in the described world. Thus, this explanation attributes the differences in results to the fact that bathroom sentences do not convey
definite information with respect to the existence of the critical entity in the described world. It should be noted that this explanation only implies that comprehenders do not spontaneously accommodate with bathroom sentences, i.e., prior to encountering an anaphoric element that requires accommodation in order to be resolved. This explanation does not imply that comprehenders do not accommodate at all. Future studies are necessary to find out which of the two explanations is correct. If the first explanation is correct, and comprehenders simply did not have enough time to initiate the accommodation process in the bathroom sentences employed in our experiment, we might find foregrounding effects with sentences such as (14), in which additional material is inserted in between the disjunction operator and the anaphoric element. On the other hand, if the second explanation is correct, then inserting additional material should not make a difference because the sentence still does not convey definite information with respect to the existence of the critical entity.
(14) Either Peter does not have a girl friend, or, and in that case I find him very awkward, he simply never brings her to his house.
What are the implications of the present results with respect to the current debate in Psychology concerning the format of the representations employed in language comprehension? According to situation-model theory (e.g., van Dijk and Kintsch 1983; Zwaan and Radvansky 1998), comprehenders create a referential representation which consists of mental tokens that stand for the referents that the linguistic input introduces and refers to. Usually it is (at least implicitly) assumed that the referential level of representation, which consists of mental tokens representing the relevant referents, is augmented by propositions that assign properties and relations to these tokens. Thus in this respect, situation-model theory resembles DRT. In the present chapter we interpreted our results in terms of DRT. We argued that the results can be accounted for if one assumes that accommodation may take place spontaneously. The same post-hoc assumption could be made in situation-model theory, which implies that the results in principle can be accounted for by this theory. However, in language comprehension research there is growing evidence that suggests that text comprehension is tantamount to the construction of a mental simulation of the described state of affairs. This simulation has been shown to be grounded in perception and action (Barsalou 1999; Glenberg 1997; Glenberg and Kaschak 2002; Zwaan 2004; Zwaan, Stanfield and Yaxley 2002). In experiential simulations, negation cannot be represented explicitly. Instead, it has been proposed that negation is implicitly represented in the simulation processes that are undertaken when processing a negative sentence. More specifically, when processing a negative sentence, the comprehender is assumed to create a simulation of the negated state of affairs that he or she keeps separate from the simulation of the actual state of affairs. The negation is then implicitly captured in the deviations between the two simulations (cf. 
Kaup and Zwaan 2003; Kaup, Yaxley, Madden, Zwaan and Lüdtke 2007; Kaup, Lüdtke, and Zwaan 2006). For instance, when processing a sentence such as Carl does not have a
sister, the comprehender is assumed to simulate Carl with a sister (negated state of affairs) as well as Carl without a sister (actual state of affairs). The simulation of the actual state of affairs captures the information that Carl does not have a sister by deviating in this respect from the simulation of the negated state of affairs. The case of double negation (as in It is not true that Carl does not have a sister) is more complex: The information from the subordinate clause in the first sentence (i.e., Carl does not have a sister) will lead to representations of Carl with a sister and Carl without a sister (as explained). For the negation in the main clause this two-simulation representation corresponds to the negated state of affairs. The actual state of affairs then again contains Carl with a sister (see Figure 10B). If we compare this representation with the simulation created for the corresponding affirmative sentence (i.e., It is true that Carl has a sister; see Figure 10A) it becomes evident that the resulting simulations of the actual states of affairs do not differ. What differs is the simulation history. Thus, to summarize, according to the experiential-simulations view of language comprehension, comprehenders spontaneously resolve the double negation to the effect that embedded discourse entities that exist in the described world become available in the simulation of the actual state of affairs. Thus, overall the experiential-simulations
[Figure 10 graphic. Panel labels: Actual State of Affairs [Carl has sister]; Negated State of Affairs [Carl does not have sister].]
Figure 10. A: Mental simulation for It is true that Carl has a sister. B: Mental simulation for It is not true that Carl does not have a sister.
account seems to fit nicely with the results of the double negation experiments. In fact, the account has the advantage of predicting that the double negation is being spontaneously resolved when processing the introducing sentence, rather than having to assume this in a post-hoc manner. How about the results obtained with respect to the bathroom sentences [e.g., (15)]? According to the experiential-simulations account, the first clause leads to a representation of Peter with a girl friend (negated state of affairs) as well as to a simulation of Peter without a girl friend (actual state of affairs). When the disjunction operator is encountered, the comprehender supposedly sets up an alternative simulation of Peter with a girl friend. The immediately following pronoun referring to the critical entity (girl friend) can only be resolved upon completion of this “alternative” simulation. Obviously, the chances that this alternative simulation is already available when the pronoun is being encountered increase with the time that the comprehender has prior to encountering the anaphoric element. In our experiment, the pronoun immediately followed the disjunction operator. Thus, the comprehender had nearly no time to create the “alternative” simulation before encountering the pronoun. From the perspective of the experiential-simulations account, it is therefore not surprising that in the present experiment, the critical entity was relatively inaccessible when the pronoun was encountered. If it turned out that inserting additional material in between the disjunction operator and the pronoun [see (14)] enhances accessibility of the critical entity in the negative condition, then this would further support the experiential-simulations interpretation of the results.
(15) Either Peter does not have a girl friend, or he never brings her to his house.
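The two-simulation treatment of negation and double negation described in this section can be illustrated with a small Python sketch. It is only an illustration under simplifying assumptions: the names Atom, Not, Simulation and simulate are hypothetical, and a simulated state of affairs is reduced to a (content, holds) pair, which is far poorer than an experiential simulation. Under these assumptions, the sketch shows that an affirmative sentence and its doubly negated counterpart end in the same actual state of affairs and differ only in their simulation history.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:
    """An affirmative clause, e.g. 'Carl has a sister' (hypothetical type)."""
    content: str


@dataclass(frozen=True)
class Not:
    """A negation operator applied to a clause (hypothetical type)."""
    inner: "Union[Atom, Not]"


# Assumption: a state of affairs is just (content, holds).
State = Tuple[str, bool]


@dataclass(frozen=True)
class Simulation:
    actual: State                 # simulation of the actual state of affairs
    history: Tuple[State, ...]    # simulations built on the way (negated states)


def simulate(sentence: Union[Atom, Not]) -> Simulation:
    """Build the actual state of affairs plus the simulation history."""
    if isinstance(sentence, Atom):
        # Affirmative clause: one simulation, no negated state of affairs.
        return Simulation(actual=(sentence.content, True), history=())
    # Negation: whatever was simulated for the embedded clause becomes the
    # negated state of affairs; the actual state deviates from it.
    negated = simulate(sentence.inner)
    content, holds = negated.actual
    return Simulation(actual=(content, not holds),
                      history=negated.history + (negated.actual,))


affirmative = simulate(Atom("Carl has a sister"))
double_negation = simulate(Not(Not(Atom("Carl has a sister"))))
print(affirmative)
print(double_negation)
```

The recursion mirrors the verbal description only loosely: each negation turns the previously built representation into a negated state of affairs, so the depth of the history grows with the number of negations while the actual state simply alternates.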
6. Conclusions
In this chapter we were concerned with the question of whether accessing discourse referents introduced in negated phrases is more time consuming than accessing discourse referents introduced in affirmative phrases. In contrast to what was predicted on the basis of a DRT-based accommodation hypothesis, we did not find evidence for a reduced accessibility in the case of negation in general. Rather, the results suggest that in the case of double negation comprehenders spontaneously resolve the double negation to the effect that embedded discourse referents become accessible in the available discourse representation. The same does not hold in the case of bathroom sentences. Future studies are needed to clarify whether comprehenders generally do not spontaneously accommodate with bathroom sentences, or alternatively whether they simply need more time before encountering the anaphor in these types of sentences. Overall the results can be accounted for by DRT and situation-model theory, provided one adds a post-hoc assumption concerning the accommodation process. The results can also be accounted for by the experiential-simulations account of language comprehension,
with the slight advantage of not requiring post-hoc assumptions. According to this view, negation is not explicitly encoded in language comprehension but implicitly captured in the deviations between two simulations, namely a simulation of the negated state of affairs and a simulation of the actual state of affairs.
References
Almor, A. (1999). Noun-phrase anaphora and focus: The informational load hypothesis. Psychological Review, 106, 748–765.
Barsalou, L.W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 577–660.
Glenberg, A.M. (1997). What memory is for. Behavioral and Brain Sciences, 20, 1–55.
Glenberg, A.M., and Kaschak, M.P. (2002). Grounding language in action. Psychonomic Bulletin & Review, 9, 558–565.
Gordon, P.C., Grosz, B.J., and Gilliom, L.A. (1993). Pronouns, names, and the centering of attention in discourse. Cognitive Science, 17, 311–347.
Haberlandt, K. (1994). Methods in reading research. In M.A. Gernsbacher (Ed.), Handbook of Psycholinguistics (pp. 1–31). New York: Academic Press.
Heim, I. (1982). The semantics of definite and indefinite noun phrases. Doctoral dissertation, University of Massachusetts, Amherst (distributed by SFB 99, University of Konstanz).
Kamp, H. (1981). A theory of truth and semantic representation. In J. Groenendijk, T. Janssen, and M. Stokhof (Eds.), Formal methods in the study of language: Part 1 (pp. 277–322). Amsterdam: Mathematisch Centrum.
Kaup, B., Dijkstra, K., and Lüdtke, J. (2004). Resolving anaphors after reading negative sentences. Poster presented at the 44th Annual Conference of the Psychonomic Society, Minneapolis (MN), USA.
Kaup, B., Lüdtke, J., and Zwaan, R.A. (2006). Processing negated sentences with contradictory predicates: Is a door that is not open mentally closed? Journal of Pragmatics, 38, 1033–1050.
Kaup, B., Yaxley, R.H., Madden, C.J., Zwaan, R.A., and Lüdtke, J. (2007). Experiential simulations of negated text information. Quarterly Journal of Experimental Psychology, 60, 976–990.
Kaup, B., and Zwaan, R.A. (2003). Effects of negation and situational presence on the accessibility of text information. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 439–446.
Kuschert, S. (1999). Dynamic meaning and accommodation. Dissertation, Universität des Saarlandes.
Lewis, D. (1979). Scorekeeping in a language game. Journal of Philosophical Logic, 8, 339–359.
Van Selst, M., and Jolicoeur, P. (1994). A solution to the effect of sample size on outlier elimination. The Quarterly Journal of Experimental Psychology, 47A, 631–650.
van Dijk, T.A., and Kintsch, W. (1983). Strategies of discourse comprehension. New York: Academic Press.
Zwaan, R.A. (2004). The immersed experiencer: Toward an embodied theory of language comprehension. In B.H. Ross (Ed.), The psychology of learning and motivation, vol. 44 (pp. 35–62). New York: Academic Press.
Zwaan, R.A., and Radvansky, G.A. (1998). Situation models in language comprehension and memory. Psychological Bulletin, 123, 162–183.
Zwaan, R.A., Stanfield, R.A., and Yaxley, R.H. (2002). Language comprehenders mentally represent the shapes of objects. Psychological Science, 13, 168–171.
Part IV
Language Specific Phenomena
Complex anaphors in discourse¹
Manfred Consten and Mareile Knees
1. Overview
Researchers have heterogeneously referred to what we call complex anaphors, e.g., abstract object anaphora (Asher 1993, 2000), labelling (Francis 1994), as well as extended reference and reference to fact (Halliday/Hasan 1976), sentence-related reference (Koeppel 1993), proposition-related anaphora (Greber 1993), situational anaphora (cf. Fraurud 1992; Dahl/Hellmann 1995), discourse deixis (Webber 1991) or shell nouns (Schmid 2000). All these approaches deal with the linguistic means of picking up a whole sentence (or even a larger piece of text) and handling a respective abstract referent (see 2.1). In this paper, we consider different ways to handle abstract referents. For example (1):
(1) Good linguists are bad football players. This/This fact/This image/This prejudice/This impertinence/*It …
While the demonstrative pronoun this is semantically neutral, the lexical NP anaphors seem to cause some change with respect to the epistemic or ontological status of the referent. All these anaphors have in common that they condense the propositionally structured antecedent to a nominal expression that is easily manageable as a unified entity in the following discourse. Thus, complex anaphors serve as a means of text economy and at the same time effect the progression of the information flow (cf. Schwarz 2000a: 132). In contrast, the mere personal pronoun it is not capable of functioning in this way; it is not possible as a complex anaphor at all (see 3.3). In order to differentiate the effects complex anaphors can have on the ongoing discourse, we will provide a categorisation of anaphoric complexation processes in terms of ontological types (Section 2). Due to the lexical content of the anaphoric expression
1. This paper has been written within the context of the research group “KomplexTex” (supervised by Monika Schwarz-Friesel), granted by the Deutsche Forschungsgemeinschaft (SCHW 509/6–2). We would also like to thank Maria Averintseva-Klisch, Tübingen, for her innumerable inspiring questions and helpful comments, Marlies Schleicher, Jena, for her critical remarks, Nicolaus Janos Eberhardt, Jena, for formatting work, and the anonymous reviewers for their helpful suggestions.
the ontological status of the referent can change during the anaphoric process. This process is subject to ontological constraints, see:
(2) Good linguists are bad football players. *This event/*This process …
Section 3 will suggest a resolution model which takes semantic and conceptual features into account. We will discuss different examples of ambiguous complex anaphors in order to show how ontological features as well as lexical and conceptual knowledge constrain the resolution process.
2. Types of and constraints on complexation processes
2.1 What is ‘abstractness’?
As mentioned above, referents of complex anaphors are ‘abstract’ in a sense that deserves a closer definition. These referents are propositionally structured objects that have received several detailed analyses. Davidson (1967) pointed out that when talking about events, situations, etc. they can be handled like things. The Davidsonian “event argument” (Davidson 1967; Parsons 1990) is adopted by several current semantic theories, especially DRT (cf. Asher 1993, 2000; Higginbotham 2000; Maienborn 2003, 2004, Kratzer 1995, 2003). There has been no final agreement on the ontological categorisation of such referents as events, states, processes or situations (cf. Vendler 1967; Dowty 1979; Kim 1969, 1976 as well as Asher 1993, 2000, and Maienborn 2003). We use the following classification showing the increasing abstractness of the proposed ontological types.

degree of abstractness   ontological category
high                     proposition (pp)
                         fact (f)               [dependent on world]
                         state (s)              [– dynamic, – telic / dependent on world and time]
                         process (p)            [+ dynamic, – telic]
low                      event (e)              [+ dynamic, + telic]
Figure 1. Abstractness scale.
Events are spatio-temporal entities defined in terms of certain results, e.g., to eat an apple (with the result that the apple has disappeared) or to run from A to B (with the result that you are at B). Consequently, event-clauses can be specified by adverbials like three times. In contrast, processes are defined in terms of temporal duration like running for hours (Vendler 1967, for critical remarks cf. Davidson 1985; Engelberg 2000; Maienborn 2003). Of course, eating an apple has a temporal duration as well,
that can be focussed in a sentence like She was eating an apple for half an hour (before she realised that she doesn’t like apples at all). As this example shows, sometimes ontological categories are assigned to an entity not ‘on their own’, but only by their linguistic context (cf. Parsons 1990). We consider this the most interesting point with respect to the discourse function of complex anaphors and will discuss this issue in the following sections. A further distinction is made with respect to states (although it is not taken up again in the other sections of this paper): Davidsonian (D-)states are expressed by verbs like sit, lie, hang as well as sleep, wait, stick. Kimian (K-)states, on the other hand, are expressed by know, believe, love, possess, cost, resemble. In contrast to events, processes and D-states, K-states are higher on the abstractness scale since they do not depend on space (if you are in the state of waiting, sitting etc. this state will only hold at a certain location; if you resemble another person or if you know something this will be true independently of where you are). However, K-states as well as events, processes and D-states are bound to an experiencer and time (you may forget what you knew, for instance), whereas facts are only bound to a world (namely the world where their proposition is true; cf. Maienborn 2003: 121; Asher 2000: 133f). Finally, propositions are not specified with respect to logical values. Therefore, they are often expressed by embedded that- or if-clauses such as It is true/a lie/certain/possible that pp, It is unknown if pp. Some researchers have assumed that there is a fundamental difference between ‘real’ ontological categories and those created by natural language itself (Bach’s “Natural Language Metaphysics” 1989; Asher 1993). According to these approaches, abstract objects are objects that are not ‘in the world’ but formed by speakers.
Abstrakte Objekte verstehe ich [...] als mentale Konstrukte, die den Bedürfnissen einer effizienten Verarbeitung natürlicher Sprache sowie anderen kognitiven Leistungen dienen, die aber letztlich auf andere, ontologisch grundlegende Kategorien reduzierbar sind. (Maienborn 2003: 120)
(I understand abstract objects as mental constructs that fulfill the requirements for an efficient processing of natural language as well as other cognitive processes. However, they can finally be reduced to other categories which are ontologically basic.)
Following Asher (1993), K-states, facts and propositions are regarded as “abstract objects” in the stated sense of the term, while events, processes and D-states (summed up as “situations”) are ‘real’ objects that depend on world and time.
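The classification in this section can be written down as a small data structure. The Python sketch below is a hedged illustration, not part of the cited ontologies: the class OntologicalType, the numeric ranks, and the helper is_abstract_object are hypothetical names. The split of states into D-states and K-states on the scale, the empty dependency set for propositions, and the idea that "no dependence on space" is what separates abstract objects from situations are our reading of the text, stated here as assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class OntologicalType:
    label: str
    rank: int                  # assumed numbering: 0 = least abstract
    dynamic: Optional[bool]    # None where Figure 1 gives no feature value
    telic: Optional[bool]
    depends_on: FrozenSet[str]


# Feature values follow Figure 1; the dependency sets follow the prose of 2.1.
EVENT = OntologicalType("event", 0, True, True,
                        frozenset({"world", "time", "space"}))
PROCESS = OntologicalType("process", 1, True, False,
                          frozenset({"world", "time", "space"}))
D_STATE = OntologicalType("D-state", 2, False, False,
                          frozenset({"world", "time", "space"}))
K_STATE = OntologicalType("K-state", 3, False, False,
                          frozenset({"world", "time"}))
FACT = OntologicalType("fact", 4, None, None, frozenset({"world"}))
# Assumption: a proposition, being unspecified for truth, depends on nothing.
PROPOSITION = OntologicalType("proposition", 5, None, None, frozenset())

SCALE = (EVENT, PROCESS, D_STATE, K_STATE, FACT, PROPOSITION)


def is_abstract_object(t: OntologicalType) -> bool:
    """Assumed criterion: abstract objects are those not bound to space."""
    return "space" not in t.depends_on


for t in SCALE:
    kind = "abstract object" if is_abstract_object(t) else "situation"
    print(f"{t.rank} {t.label:<12} {kind:<16} {sorted(t.depends_on)}")
```

On this encoding, moving up the scale never adds a dependency, which is one way of making "increasing abstractness" concrete.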
2.2 Types of anaphoric complexation
Now let us have a closer look at the complexation process. We distinguish between three types of complex anaphoric reference (cf. 2.2.1, 2.2.2, and 2.2.3):
2.2.1 Maintenance by neutral anaphors
With this kind of complex anaphor, the anaphoric expression itself is neutral with respect to ontological types. For this reason, the discourse entity established by the anaphoric process usually keeps the ontological type denoted by the antecedent.²
(3) z_neutral ≈ x
Of course, the ontological type assigned by the anaphor has to be compatible with the context, especially the semantic structure of the verb the anaphor is dependent on. For example, it is tested whether a complex anaphor can be combined with the verb happen (see (4) and (5)) in order to distinguish events and processes from states (Maienborn 2003: 59–62).
(4) Arthur played the piano. This happened while …
(5) Arthur owned a bike./The apples cost 3 Euro. *This happened while …
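The happen-test in (4) and (5) can be mimicked with a few lines of Python. This is a toy sketch under stated assumptions: the ontological annotations of the example sentences are supplied by hand (treating (4) as a process is our assumption), and the function name mark_happen_continuation is hypothetical. Linking the test to the feature [± dynamic] of Figure 1 is likewise an assumption; the text only says that the test separates events and processes from states.

```python
# Assumed link between the happen-test and the [+/- dynamic] feature of Figure 1.
DYNAMIC = {"event": True, "process": True, "state": False}


def mark_happen_continuation(antecedent_type: str) -> str:
    """Return '' if 'This happened while ...' is predicted to be acceptable,
    and '*' if it is predicted to be unacceptable."""
    return "" if DYNAMIC[antecedent_type] else "*"


# Hand-annotated antecedents (the annotations are illustrative assumptions).
EXAMPLES = [
    ("Arthur played the piano.", "process"),
    ("Arthur owned a bike.", "state"),
    ("The apples cost 3 Euro.", "state"),
]

for sentence, ont_type in EXAMPLES:
    mark = mark_happen_continuation(ont_type)
    print(f"{sentence} {mark}This happened while ...")
```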
Here are some examples of different ontological types:³
(a) Events
(6) [The Americans tried to storm the building but were forced back by shots from the top floor.]e [This/The whole thing]n happened yesterday while Mr. Rumsfeld visited Baghdad. (derived from the newspaper-text cited as (19) below)
(7) Panic nahm die Gelegenheit wahr, um eigene Aktien im Wert von 13 Millionen Dollar abzugeben. Dies brachte ihm zahlreiche Aktionärsklagen ein, die zum Teil heute noch anhängig sind. (Tiger-Corpus, 1547f.)
[Panic took the opportunity to sell shares of his own worth 13 million dollars.]e [This]n brought him numerous shareholder lawsuits, some of which are still pending today.
(b) Processes
(8) [The amount of jobs decreases, while the importance of the service sector is growing at the same time]p. [The whole thing/This]n hasn’t finished yet. (derived from the corpus example cited as (20) below)
2. ‘≈’ assigns a complex referent (x) to an anaphor (z) (cf. Asher 1993: 145).
3. Most of our examples are taken directly from the “Tiger-Corpus” as indicated. The Tiger-Corpus consists of approximately 40,000 sentences (700,000 tokens) taken from German newspaper texts (Frankfurter Rundschau). In the framework of the DFG-research project “KomplexTex” we have used it in order to systematically determine different grammatical and ontological types of complex anaphors (cf. Consten/Knees/Schwarz-Friesel 2007). Examples without any reference are based on natural language examples from the Tiger-Corpus or other newspaper texts but modified in order to show the range of possible and impossible ontological change.
(c) States
(9) [The Jacob Sisters’ dogs resemble each other as much as their owners do.]s [This]n has held since their first performance.
Our next example does not deal with coreference in a strict sense, but with ‘sloppy identity’, an anaphoric relationship well-known with respect to nominal anaphors since Karttunen (1969).
(10) Bis zu seinem altersbedingten Abschied vom Chefsessel des größten deutschen Elektrokonzerns am 1. Oktober jedenfalls “habe ich noch genug zu tun“, versichert der 64jährige Manager. Dies wird bei Kaskes Nachfolger Heinrich von Pierer nicht anders sein. (Tiger-Corpus, 2977f.)
Until his age-related departure from the top post of the biggest German electronics company on 1 October, at any rate, “I still [have enough to do]s”, the 64-year-old manager affirms. [This]n will not be any different for Kaske’s successor Heinrich von Pierer.
The anaphor-sentence is obviously intended to indicate that whoever is director of the company will be busy, whereas a reading like (10’) is ruled out for lack of communicative plausibility:
(10’) #For Kaske’s successor Heinrich von Pierer, it will not change that [Kaske still has enough to do till his parting]s
Consequently, the referent for this is not the state denoted in the antecedent sentence, but the respective type of state (for events vs. event-types see Asher 1993: 237–239). In other words, the state-referent becomes abstracted from its concrete experiencer and time in the context of the anaphoric sentence. Even though the anaphor is neutral with respect to ontological types, there are cases where a different ontological type is fixed by the syntactic or semantic context. In the following examples the antecedents denote an event or a state while the referent for the neutral anaphor this is understood as a fact/a proposition due to the lexical meaning of the verb. The event-referent in (11) must be factual in order to serve as a proof. In (12) the states denoted in the antecedent sentence can only necessitate something if they are real, whereas in (13) the anaphor sentence marks the referent as hypothetical by the verb assume, which creates an attitude context.
(11) [The Americans tried to storm the building but were forced back by shots from the top floor.]e [This]n proves that the situation isn’t under control yet. (derived from the newspaper-text cited as (19))
(12) Das Land durchläuft keine Rezession im westlichen Sinne, sondern steht in einem “notwendigerweise langwierigen Prozeß” des Neuaufbaus der Wirtschaft. Dies macht mittelfristig hohe Spar- und Investitionsanlagen nötig. (similar Tiger-Corpus, 728f.)
[The country is not experiencing a recession in the Western sense]s [but is in a “necessarily lengthy process” of rebuilding the economy.]s [This]n necessitates high levels of savings and investment in the medium term.
(13) [The country is not experiencing a recession in the Western sense]s [but is in a “necessarily lengthy process” of rebuilding the economy.]s At least [this]n is assumed by some economists.⁴
The examples discussed by Maienborn (2003: 111–113) show as well that in many cases the ontological status of complex referents only becomes explicit by anaphoric reference:
(14) Arthur sleeps. This has already lasted for the whole morning.
(15) Arthur sleeps. This shows/proves that he played the whole afternoon …
Due to the verbal context the complex anaphor this in (14) denotes an entity with a temporal extension. It cannot refer to a fact since facts are not temporally bound by definition (they are only bound to a world, namely the world in which the proposition is true, cf. 2.1). Maienborn (2003: 112) describes (14) as reification adding a temporal dimension to the fact. That means that the ontological status of the referent in (14) gets changed by the complex anaphor. In contrast to this view, we assume that an ontology-changing complexation does not occur in (14) but in (15). The antecedent sentences in both (14) and (15) denote states. States feature an inherent temporal dimension which in (14) gets picked up by the complex anaphor whereas in (15) the complexation process evokes an abstraction and focuses on the mere factuality of the referent. In 2.2.3 we will postulate that this kind of abstraction is an essential function of anaphoric complexation processes whereas complexation ‘the other way round’ (i.e., evoking decreasing abstractness) will not occur (see 2.3).
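The direction constraint postulated here (abstraction may increase during complexation, but not decrease) can be stated as a small check over the scale of Figure 1. The following Python sketch is an illustration only: the function resolve_type and its parameters are hypothetical, the scale ignores the D-state/K-state distinction, and treating the antecedent of (2) as a state is our assumption.

```python
from typing import Optional

# Figure 1, from low to high abstractness.
SCALE = ["event", "process", "state", "fact", "proposition"]
RANK = {label: i for i, label in enumerate(SCALE)}


def resolve_type(antecedent: str,
                 anaphor: Optional[str] = None,
                 context: Optional[str] = None) -> Optional[str]:
    """Return the ontological type of the referent after complexation,
    or None if the shift would decrease abstractness.

    anaphor: type denoted lexically by the anaphor (None for neutral 'this')
    context: type required by the verbal context (None if unconstrained)
    """
    result = antecedent
    for required in (anaphor, context):
        if required is None:
            continue  # nothing imposed: the current type is maintained
        if RANK[required] < RANK[result]:
            return None  # shift towards lower abstractness is blocked
        result = required
    return result


# (14): state antecedent, neutral anaphor, durative context -> stays a state.
print(resolve_type("state", None, "state"))
# (15): state antecedent, neutral anaphor, 'proves' context -> fact.
print(resolve_type("state", None, "fact"))
# (2): state antecedent (assumed), anaphor 'this event' -> blocked.
print(resolve_type("state", "event"))
```

A neutral anaphor in an unconstraining context simply returns the antecedent's type, which corresponds to the maintenance case in (3).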
2.2.2 Maintenance by lexical anaphors
While most of the research discussed in section 2.1 deals with verbal phrases representing certain ontological types, there are also some nouns that can be assigned to ontological categories. This seems trivial for nouns that function as ontological category-labels anyway, e.g., event, process, state, fact, and quite smooth for some nouns derived from typical event- or state-predicates such as resemblance in (21). Respective nouns serve as antecedents in the examples described in this section: the ontological status of the referents stays the same during the anaphoric process, since the antecedent and the anaphor denote the same ontological type.
(16) z_x ≈ x
4. Here, it is not quite clear whether this is related to the first part of the antecedent sentence – the state of ‘not undergoing’, the second part – the state of ‘being part of a process’, or both of them. There is a preference for the latter due to discourse structural reasons (right frontier constraint) and world knowledge.
However, for most of the nouns it is anything but easy to fix an ontological category independent from context, e.g., (similar to (14) and (15)):
(17) [They went out to take in some air.]e [This walk]e was the first activity outside the house in two weeks.
(18) [They went out to take in some air.]e [This walk]p lasted for two hours.
As in 2.2.1, we present some of the few clear examples of events, processes and states.
(a) Events
(19) Die Amerikaner versuchten, in das Gebäude einzudringen, wurden aber von Schüssen aus dem Obergeschoss zurück gedrängt. Zwei Soldaten seien bei dieser Aktion verletzt worden, einer im Haus, einer außerhalb. (Süddeutsche Zeitung online, 25.7.2003)
[The Americans tried to storm the building but were forced back by shots from the top floor.]e It is said that two soldiers were injured during [this action]e, one inside the house and the other outside.
(b) Processes
(20) Unbestritten ist, daß die Zahl der Arbeitsplätze in der Industrie geringer wird, während gleichzeitig das Gewicht des Dienstleistungssektors zunimmt. Dieser Prozeß ist auch noch längst nicht abgeschlossen. (Tiger-Corpus 18138f.)
It is indisputable that [the number of jobs decreases, while the importance of the service sector is growing at the same time.]p [This process]p hasn’t finished yet.⁵
(c) States
(21) Die Hunde der Jacob-Sisters sehen einander so ähnlich wie deren Besitzer. Kenner der Branche machen die Ähnlichkeit für den Erfolg verantwortlich.
[The Jacobs-Sisters’ dogs resemble each other as much as their owners do.]s Insiders hold [this [German: the] resemblance]s responsible for their [German: the] success.
2.2.3 Ontology-changing anaphors
Examples like (15) and (18) show that the ontological status of the referent can change in the course of the text. This section deals with examples where this ontological shift is evoked not (or not only) by the context but by the anaphor itself. In contrast to the examples in 2.2.2, the anaphoric expression denotes a different ontological type from its antecedent. Consequently, the anaphoric process results in a shift of the discourse representation with respect to the ontological type of the current referent.
5. As the example shows, the process-referent in the antecedent sentence is still anaphorically accessible even though it is embedded in a factual statement. It is not possible to read the complex anaphor as referring to the indisputable fact that the process holds.
(22) z_x ≈ y
Here are some selected examples:
(a) Event becomes fact
(23) [The Americans tried to storm the building but were forced back by shots from the top floor.]e [This fact]f proves that the situation isn’t under control yet.
Compared with (11), the factual reading of the referent in (23) is doubly specified, namely by the verb prove (as in (11)) and at the same time by the lexical meaning of the anaphor itself. Of course, both factors have to agree, which they do not in the following examples:
(24) *[This fact]f happened yesterday while Mr. Rumsfeld visited Baghdad. (event-reading)
(25) *[This fact]f was assumed by Al Jazeera. (proposition-reading)
The next variants (derived from (19) and (20)) show that the complex anaphor can also evoke the ontological shift on its own, i.e., in a ‘neutral’ context.
(b) Event/process becomes fact/proposition/negated fact
(26) [The Americans tried to storm the building but were forced back by shots from the top floor.]e Rumsfeld had to explain the consequences resulting from [this fact]f / [this rumour]pp during a press conference in the afternoon.
(27) [The number of jobs decreases, while the importance of the service sector is growing at the same time.]p [This insight]f / [this misbelief]neg f / [this assumption]pp determined the economic sciences of the 20th century.
(c) State becomes fact
(28) Schwächen und Lähmungen, die auch in einstigen kommunalen Hochburgen der SPD wie in Frankfurt am Main oder in Berlin zu besichtigen sind, resultieren stärker aus strukturellen Verschiebungen als individuellen Fehlleistungen. Diese Erkenntnis verschiebt die Kritik an einer Partei, die nach der jüngsten Bundestagswahl angetreten war, um in der laufenden Legislaturperiode die – auf dem Papier – schwache Koalition zwischen der Union und den Freien Demokraten abzulösen. (Tiger-Corpus, 22276f.) [Weakness and paralysis observed in former communal strongholds of the SPD like in Frankfurt/Main or in Berlin result rather from structural movements than from individual mistakes.]s [This insight]f drives the criticism of the party that competed in the recent election for the Bundestag against the weak coalition between the CDU and the FDP. (29) Barrieren überall! Von dieser Erkenntnis wird auch ein Jahr nach der hart erkämpften Verankerung des Benachteiligungsverbotes Behinderter im Grundgesetz der Alltag dieser Menschen geprägt. (Tiger-Corpus, 26815f.)
[Barriers [are] all around!]s [This insight]f still holds and affects the everyday life of disabled people even one year after the prohibition of discrimination against disabled people was incorporated into the Basic Law.
There are examples not only of an ontological change to facts or propositions but also of changes at a lower level of abstractness:
(d) Event becomes state
(30) Statt ihren Praktikumsbericht zu schreiben, ist sie Eis essen gegangen. Diese Herumhängerei guck’ ich mir nicht mehr länger mit an. (oral communication) [Instead of working on her training report, she went out to eat ice cream.]e I won’t tolerate [this hanging out]s any longer.
In (30), the single event (the referent’s going out to eat ice cream) is released from its concrete temporal and spatial fixation by the state-anaphor Herumhängerei; it is thus understood as a typical, exemplary incident.
(e) Process becomes state
(31) Die Regierung hat tagelang ergebnislos über Subventionsabbau diskutiert. Die Opposition ist über diesen Stillstand empört. (Newspaper text)
[The government has been discussing the reduction of subsidies for days without any result.]p The opposition is outraged about [this stagnation.]s
2.3 Constraints on ontology-changing complexation
(32) [The earth turns about the sun.]p [This process]p/[this state]s will presumably last for 7 × 10⁹ years. [This fact]f has been well known since the Middle Ages. Researchers of the Vatican were not allowed to examine [this possibility]pp/*[This event]e …
As the example shows, anaphoric complexation can shift referents of any ontological type to a discourse entity of either the same ontological type or a more abstract ontological type. They cannot be shifted to a discourse entity that is less abstract. Thus, anaphoric complexation can be a process of increasing abstractness (in terms of the abstractness scale, see figure 1). This claim holds for all examples discussed in 2.2.3.⁶
(33) *z_y ≈ x if x > y (“if x is higher on the abstractness scale than y”)
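The constraint in (33) can be read as a simple ordering check. The following Python sketch is only an illustration: the ranking event < process < state < fact < proposition is assumed here on the basis of the examples in 2.2.3 and (32) (the relative rank of fact and proposition in particular is an assumption), and the names ABSTRACTNESS and may_refer are hypothetical.

```python
# Assumed abstractness scale (lowest to highest). The relative order of
# "fact" and "proposition" is an assumption made for this sketch.
ABSTRACTNESS = {"event": 0, "process": 1, "state": 2, "fact": 3, "proposition": 4}


def may_refer(anaphor_type: str, antecedent_type: str) -> bool:
    """Constraint (33): an anaphor of type y cannot pick up a referent of
    type x if x is higher on the abstractness scale than y."""
    return ABSTRACTNESS[anaphor_type] >= ABSTRACTNESS[antecedent_type]


# Example (32): the antecedent "The earth turns about the sun" is a process.
for anaphor in ("process", "state", "fact", "proposition", "event"):
    verdict = "ok" if may_refer(anaphor, "process") else "*"
    print(f"this {anaphor}: {verdict}")
```

Only the event-anaphor is starred for the process-antecedent, which mirrors the pattern of acceptability in (32).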
The relevance of this constraint for the resolution of complex anaphors is discussed in section 3.
6. In our corpus study of German newspaper texts (Consten/Knees/Schwarz-Friesel 2007), we found that out of 30 demonstrative and 30 definite complex anaphoric NPs, 17 and 13, respectively, were complex anaphors evoking a change of the ontological category (like the examples in 2.2.3). None of them violates the constraint given in (33).
3. The resolution of complex anaphors
In our model, we integrate procedural aspects by combining DRSs with cognitive Text-world Models (Schwarz 2000, 2001; similarly Johnson-Laird 1994⁷). The model takes semantic and conceptual features into account. These features will be presented in the discussion of several examples of ambiguous complex anaphors. It will be shown how ontological features as well as lexical and conceptual knowledge constrain the resolution process.
We assume that complex anaphors establish new discourse entities at the textworld level by condensing previously mentioned propositionally structured referents. This differentiates them from (direct) nominal anaphors, as the latter refer to objects already introduced as discourse entities. DRT approaches do not reflect this difference, as they assume that each occurrence of an anaphor introduces a new discourse referent into the DRS (cf. the critical remarks in Löbner 1985: 320; Cornish 1999: 186 and Consten 2004: 61).
In our model, we distinguish between different levels: the text semantic level, the textworld level and the knowledge base. Referents are introduced by textual structures at the text semantic level; they establish discourse entities at the textworld level by activating the corresponding concept in long-term memory (phase 1) and by applying cognitive strategies (e.g., condensing complex textual structures to unified referents). The textworld level represents the discourse entities which are talked about in the discourse. The knowledge base contains different sources of knowledge, e.g., lexical or conceptual knowledge. We restrict our illustration to those parts of knowledge that are used in order to resolve the complex anaphor. Initially, anaphors do not establish discourse entities at the textworld level but are interpreted at the text semantic level, where the appropriate part of the textual structure is re-activated. In the case of complex anaphors, these textual parts are propositionally structured (phase 2). Finally, the complex anaphor establishes the referent as a new complex discourse entity (phase 3).
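The division of labour between the text semantic level and the textworld level can be made concrete with a small data structure. The Python sketch below is a simplified illustration under stated assumptions, not the authors' formalisation: the class DiscourseModel, its method names and the placeholder type label "object" are hypothetical, and the knowledge base is left out. The referent labels follow the convention of lower-case letters for text semantic referents and upper-case letters for textworld entities, using the referents of example (34a) in 3.1.

```python
from dataclasses import dataclass, field


@dataclass
class DiscourseModel:
    """Toy two-level store: text semantic referents and textworld entities."""
    text_semantic: dict = field(default_factory=dict)  # label -> ontological type
    textworld: set = field(default_factory=set)        # established discourse entities

    def introduce_nominal(self, label: str) -> None:
        # Phase 1: a nominal expression introduces a referent and directly
        # establishes a discourse entity (w -> W).
        self.text_semantic[label] = "object"  # hypothetical placeholder type
        self.textworld.add(label.upper())

    def introduce_propositional(self, label: str, onto_type: str) -> None:
        # Phase 1: a propositional expression introduces a complex referent
        # at the text semantic level only.
        self.text_semantic[label] = onto_type

    def establish_complex(self, label: str) -> None:
        # Phase 3: a resolved complex anaphor establishes its referent as a
        # unified discourse entity (e1 -> E1).
        if label not in self.text_semantic:
            raise KeyError(f"unknown referent: {label}")
        self.textworld.add(label.upper())


model = DiscourseModel()
for nominal in ("w", "x", "v"):
    model.introduce_nominal(nominal)
model.introduce_propositional("s1", "state")
model.introduce_propositional("e1", "event")
model.establish_complex("e1")  # outcome of resolving "this event" in (34a)
print(sorted(model.textworld))
```

The state referent s1 stays at the text semantic level because no complex anaphor has picked it up; only e1 is promoted to a textworld entity.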
3.1 Disambiguation by ontological features
Our first example demonstrates how the ‘abstractness constraint’ presented as (33) above can serve to explain the ontologically based resolution of ambiguous complex anaphors:
(34) [The Jacobs-Sisters are always in a wonderful mood and flashy.]s [Yesterday they had a great performance in New York.]e
a. [This event]e has surely made them even more popular.
b. [This quality]s has surely made them even more popular.⁸
c. [This/that]n has surely made them even more popular.
7. Johnson-Laird’s “mental model” theory differs from the Text-world Model Theory especially in the differentiation of a propositional and a mental level of representation.
8. This continuation is plausible if the second antecedent sentence is read as an embedded subsegment so that the first sentence is still accessible as an active discourse unit.
The two complex anaphors in (a) and (b) have different antecedents, although from a purely structural point of view both sentences in (34) are accessible as possible antecedents for both anaphors (as version (c) shows).⁹ However, the first sentence is ruled out as antecedent in the case of (a), since an event-anaphor cannot be assigned to a state-antecedent. In the case of (b), there is no such restriction (as (30) shows, it is possible in principle to assign state-anaphors to event-antecedents), but there seems to be a preference for an antecedent of the same ontological type if one is provided by the preceding text.¹⁰
Let us now have a closer look at the resolution process of example (34a), illustrated in figure 2. In phase 1, the nominal expressions Jacobs-Sisters, great performance and New York introduce referents at the text semantic level (w, x . . . in figure 2). As nominal expressions they directly establish discourse entities at the textworld level (W, X . . .). The nominal anaphor ‘they’ in the second sentence is immediately resolved to the Jacobs-Sisters, since ‘they’, as a personal pronoun, refers to plural entities previously introduced into the discourse, and the only plural object talked about so far is the Jacobs-Sisters. The same holds for ‘them’ in the sentences with the complex anaphors.
[Figure 2: three-phase diagram across the textworld level, the knowledge base and the text semantic level.
Phase 1 (encounter the complex anaphor). Text semantic level: Jacobs-Sisters (w); s1 – be in wonderful mood and flashy (w); yesterday (e1); they (w); e1 – give (w, x); great performance (x); in (e1, v); New York (v); event (z_e); p1 – make more popular (z_e, w); them (w); open: z_e ≈ s1 ∨ e1? Textworld level: W, X, V.
Phase 2 (resolving the complex anaphor). Knowledge base: → event ⇐ event; *event ⇐ state. Result: → z_e ≈ e1; *z_e ≈ s1. Re-activated: e1 – give (w, x).
Phase 3 (establishing e1 as a discourse object). Text semantic level: event (e1); p1 – make more popular (e1, w); them (w). Textworld level: W, X, V, E1.]
Legend: x, y: nominal referents at the text semantic level; e1, s1: complex referents at the text semantic level, indicated as “event”, “state” etc.; z_e, z_s: anaphors indicated as “event”, “state” etc.; W, X: nominal discourse entities established at the textworld level; E, S: complex discourse entities indicated as “event”, “state” etc., established at the textworld level.
Figure 2. Resolution model for (34a).
In contrast to the nominal expressions, propositional expressions introduce complex referents (like events, states etc.) only at the text semantic level (e1, s1 …); they do not establish discourse entities at the textworld level. They are not as individuated as nominal expressions. Furthermore, they are not the focused entities talked about, since they only specify the relations between, and give information about, the discourse entities talked about. As mentioned before, complex anaphors can function as linguistic reifications, since after being resolved they establish new discourse entities at the textworld level by condensing previously mentioned propositionally structured referents. So in phase 1 the complex anaphor z (this event) of type e (“event”) denotes, due to its lexical meaning, an event-referent. Thus, it cannot refer to one of the referents of the nominal expressions (e.g., w, x or v), since they do not denote events. In phase 2, the anaphor (z_e) activates knowledge about ontological categories (i.e., the abstractness constraint (33): z_e ≈ e1; *z_e ≈ s1).¹¹ Following this constraint, the anaphor is assigned to the adequate previously mentioned referent out of the group of possible referents (e1, s1) at the text semantic level, namely e1, giving a great performance. In phase 3, the anaphor re-activates this propositionally structured referent and thereby establishes it as a unified discourse entity E1 at the textworld level. Even though we do not focus on the resolution of nominal anaphors, it should be mentioned that in phase 3 ‘them’ also re-activates the referent Jacobs-Sisters, which was already established by the nominal expression the Jacobs-Sisters in phase 1.
9. ‘That’ seems more likely to be related to the second antecedent sentence (cf. Asher 1993: 226f), while ‘this’ might sum up both of the antecedent sentences and refer to the antecedent as a whole (in contrast with Asher’s (ibid.) more complex example, where ‘this’ is preferably related to the first sentence, which represents the discourse topic).
10. Having a great performance is conceivable as antecedent for a state-anaphor when the antecedent gets an iterative reading as discussed with example (30).
[Figure 3: three-phase diagram across the textworld level, the knowledge base and the text semantic level.
Phase 1 (encounter the complex anaphor). Text semantic level: Jacobs-Sisters (w); s1 – be in wonderful mood and flashy (w); yesterday (e1); they (w); e1 – give (w, x); great performance (x); in (e1, v); New York (v); quality (z_s); p1 – make more popular (z_s, w); them (w); open: z_s ≈ s1 ∨ e1? Textworld level: W, X, V.
Phase 2 (resolving the complex anaphor). Knowledge base: → state ⇐ state; state ⇐ event. Result: → z_s ≈ s1; z_s ≈ e1. Re-activated: s1 – be in wonderful mood and flashy (w).
Phase 3 (establishing s1 as discourse object). Text semantic level: quality (s1); p1 – make more popular (s1, w); them (w). Textworld level: W, X, V, S1.]
Figure 3. Resolution model for (34b).
11. The preferred interpretation is marked by an arrow in the figure.
Figure 3 illustrates the resolution process involved in example (34b). The only difference between (34a) and (34b) is that the complex anaphor z (this quality) of type s (“state”) denotes a state-referent. Moreover, the abstractness constraint does not restrict the resolution process as it does in (34a). Instead, we assume a preference, namely that the complex anaphor refers to a state rather than to an event (see phase 2). This preference nonetheless disambiguates the complex anaphor, as it is interpreted as referring to the text semantic referent s1 rather than to e1. Again, as s1 is re-activated by this anaphoric nominal reference, it is established as a discourse entity at the textworld level.
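Phase 2 for (34a) and (34b) combines a hard filter (the abstractness constraint) with a soft preference (same ontological type). The following Python sketch shows one way this could be operationalised. It is an illustration under assumptions: the numeric scale is assumed as in figure 1 (with the order of fact and proposition being a guess), the function name resolve is hypothetical, and falling back to the first admissible candidate when no same-type antecedent exists is an arbitrary tie-break, not a claim of the model.

```python
# Assumed abstractness scale (lowest to highest); see the hedges above.
ABSTRACTNESS = {"event": 0, "process": 1, "state": 2, "fact": 3, "proposition": 4}


def resolve(anaphor_type, candidates):
    """Pick an antecedent label for a complex anaphor.

    candidates: list of (label, ontological_type) pairs taken from the
    text semantic level, e.g. [("s1", "state"), ("e1", "event")].
    """
    # Hard filter: drop candidates that would violate constraint (33).
    admissible = [
        (label, typ) for label, typ in candidates
        if ABSTRACTNESS[anaphor_type] >= ABSTRACTNESS[typ]
    ]
    # Soft preference: an antecedent of the same ontological type wins.
    same_type = [c for c in admissible if c[1] == anaphor_type]
    ranked = same_type or admissible  # arbitrary fallback: first admissible
    return ranked[0][0] if ranked else None


candidates_34 = [("s1", "state"), ("e1", "event")]
print(resolve("event", candidates_34))  # (34a) "this event"   -> e1
print(resolve("state", candidates_34))  # (34b) "this quality" -> s1
```

For (34a) the state candidate is removed by the filter, whereas for (34b) both candidates survive and only the preference decides.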
3.2 Disambiguation by lexical features
As stated in 2.2.2 and 2.2.3, lexical properties of the complex anaphoric expression can determine the resolution of the complex anaphor. The second example illustrates how such lexical knowledge in addition to the ontological features is involved in the anaphora resolution process.
(35) [Thirty cars crashed into the unlighted road works barriers]e and [subsequent cars had to wait for several hours.]s
a. Without avail, some guys had tried to prevent [the accident.]e
b. Without avail, some guys had tried to prevent [the delay.]s
In (35a), the complex anaphor the accident refers to the crashing event specified in the first antecedent sentence, whereas in (35b) the complex anaphor the delay sums up the referent of the second antecedent: the state of waiting. Disambiguation is based on the lexical meaning of the complex anaphor: accident is a typical example of an event-noun (according to the criteria stated in the abstractness scale, see figure 1). In contrast, delay denotes a non-dynamic and non-telic entity and refers to a state. Compared with the first example, the processes in phase 1 are similar except for the additional activation of lexical knowledge triggered by the complex anaphoric expressions the accident in (35a) and the delay in (35b). Besides the abstractness constraint, it is the lexical meaning of the anaphoric expressions which biases the resolution process. Accident is a better description of the previously mentioned crashing of cars than of the waiting of some other cars, whereas delay is semantically more closely related to waiting cars. Phase 2 of figure 4 and figure 5 shows how the complex anaphors activate not only ontological features but also lexical knowledge in order to determine the adequate antecedent. Once the corresponding antecedent is found, the referent at the text semantic level is re-activated and established at the textworld level (phase 3).
[Figure 4: three-phase diagram across the textworld level, the knowledge base and the text semantic level.
Phase 1 (encounter the complex anaphor). Text semantic level: thirty cars (w); e1 – crash into (w, x); unlighted road works barriers (x); following cars (y); s1 – have to wait (y); for several hours (s1); without avail (p1); some guys (v); p1 – try to prevent (v, z_e); accident (z_e); open: z_e ≈ s1 ∨ e1? Textworld level: W, X, Y, V.
Phase 2 (resolving the complex anaphor). Knowledge base: → CRASHING; WAITING; → event ⇐ event; *event ⇐ state. Result: → z_e ≈ e1; *z_e ≈ s1. Re-activated: e1 – crash into (w, x).
Phase 3 (establishing e1 as discourse object). Text semantic level: without avail (p1); some guys (v); p1 – try to prevent (v, e1); accident (e1). Textworld level: W, X, Y, V, E1.]
Figure 4. Resolution model for (35a).
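The interplay of the abstractness constraint and lexical fit described for (35) can be sketched in the same style. Everything specific in the following Python block is an assumption made for illustration: the toy LEXICON (which simply links accident to the concept CRASHING and delay to WAITING, as in the knowledge base of figures 4 and 5), the function name resolve_lexical, the assumed abstractness ranks, and the decision to rank lexical fit above sameness of ontological type.

```python
# Assumed abstractness scale (lowest to highest); illustrative only.
ABSTRACTNESS = {"event": 0, "process": 1, "state": 2, "fact": 3, "proposition": 4}

# Hypothetical toy lexicon: ontological type of the anaphoric noun plus the
# concepts it describes well (hand-written for this example).
LEXICON = {
    "accident": {"type": "event", "concepts": {"CRASHING"}},
    "delay": {"type": "state", "concepts": {"WAITING"}},
}


def resolve_lexical(noun, candidates):
    """Return the label of the best antecedent for a lexical complex anaphor."""
    entry = LEXICON[noun]
    rank = ABSTRACTNESS[entry["type"]]
    # Hard filter: constraint (33).
    admissible = [c for c in candidates if rank >= ABSTRACTNESS[c["type"]]]

    def score(candidate):
        lexical_fit = candidate["concept"] in entry["concepts"]
        same_type = candidate["type"] == entry["type"]
        return (lexical_fit, same_type)  # tuples compare left to right

    best = max(admissible, key=score, default=None)
    return best["label"] if best is not None else None


candidates_35 = [
    {"label": "e1", "type": "event", "concept": "CRASHING"},
    {"label": "s1", "type": "state", "concept": "WAITING"},
]
print(resolve_lexical("accident", candidates_35))  # (35a) -> e1
print(resolve_lexical("delay", candidates_35))     # (35b) -> s1
```

In (35a) the constraint alone already excludes s1, while in (35b) both candidates are admissible and the lexical fit of delay with WAITING settles the choice.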
[Figure 5: three-phase diagram across the textworld level, the knowledge base and the text semantic level.
Phase 1 (encounter the complex anaphor). Text semantic level: thirty cars (w); e1 – crash into (w, x); unlighted road works barriers (x); following cars (y); s1 – have to wait (y); for several hours (s1); without avail (p1); some guys (v); p1 – try to prevent (v, z_s); delay (z_s); open: z_s ≈ s1 ∨ e1? Textworld level: W, X, Y, V.
Phase 2 (resolving the complex anaphor). Knowledge base: CRASHING; → WAITING; → state ⇐ state; state ⇐ event. Result: → z_s ≈ s1; z_s ≈ e1. Re-activated: s1 – have to wait (y).
Phase 3 (establishing s1 as discourse object). Text semantic level: without avail (p1); some guys (v); p1 – try to prevent (v, s1); delay (s1). Textworld level: W, X, Y, V, S1.]
Figure 5. Resolution model for (35b).
3.3 Disambiguation by conceptual knowledge
In our final example, we will show that conceptual knowledge has an impact on the resolution of complex anaphors as well. Any lexical influence of the complex anaphoric expression is ruled out here since we are dealing with “neutral” anaphors as in 2.2.1.
(36) [Thirty cars crashed into the unlit road-works barriers]e and [the following cars had to wait for several hours]s.
a. By warning the drivers about the road works, some guys had tried to prevent [this]n without avail.
b. By redirecting the traffic around the place of accident, some guys had tried to prevent [this]n without avail.
Again, phase 1 of figure 6 and of figure 7 is in principle similar to the previous examples. The only difference is that the sentences with the complex anaphors are quite extensive. Due to the gerund construction, the propositionally structured entities in (36a) and (36b) are interpreted as causally connected. This is reflected by the purpose relation¹² in the knowledge base in phase 2. As the complex anaphor is neutral, we have to take the context of the anaphor into account more than before. So, in contrast to the previous examples, it is not only the complex anaphor which activates conceptual or lexical knowledge in the knowledge base but also its context. In (36a), the anaphor occurs as an argument of an intended PREVENT-process, which is causally related to some warning-event. Thus, the reader has to reason about the relation between the warning, the intended prevention and some previously introduced activity.
[Figure 6: three-phase diagram across the textworld level, the knowledge base and the text semantic level.
Phase 1 (encounter the complex anaphor). Text semantic level: thirty cars (w); e1 – crash into (w, x); unlighted road works barriers (x); following cars (y); s1 – have to wait (y); for several hours (s1); by (e2, p1); e2 – warn about (v, u, r); drivers (u); road works (r); some guys (v); p1 – try to prevent (v, z_n); this (z_n); without avail (p1); open: z_n ≈ e1 ∨ s1 ∨ e2? Textworld level: W, X, Y, U, R, V.
Phase 2 (resolving the complex anaphor). Knowledge base: WARNING – PURPOSE – PREVENT: → CRASHING; WAITING; *WARNING. Result: → z_n ≈ e1; z_n ≈ s1; *z_n ≈ e2. Re-activated: e1 – crash into (w, x).
Phase 3 (establishing e1 as discourse object). Text semantic level: some guys (v); p1 – try to prevent (v, e1); this (e1); without avail (p1). Textworld level: W, X, Y, U, R, V, E1.]
Figure 6. Resolution model for (36a).
12. Following Graesser et al. (2001) we assume certain basic relations which are relevant for the coherence in texts and which are more or less explicit at the textual surface structure. These basic relations are comparable to some of the discourse relations proposed by Asher and Lascarides (2003).
[Figure 7: three-phase diagram across the textworld level, the knowledge base and the text semantic level.
Phase 1 (encounter the complex anaphor). Text semantic level: thirty cars (w); e1 – crash into (w, x); unlighted road works barriers (x); following cars (y); s1 – have to wait (y); for several hours (s1); by (p2, p1); p2 – redirect around (v, u, r); traffic (u); place of accident (r); some guys (v); p1 – try to prevent (v, z_n); this (z_n); without avail (p1); open: z_n ≈ e1 ∨ s1 ∨ p2? Textworld level: W, X, Y, U, R, V.
Phase 2 (resolving the complex anaphor). Knowledge base: REDIRECT – PURPOSE – PREVENT: CRASHING; → WAITING; *REDIRECT. Result: → z_n ≈ s1; z_n ≈ e1; *z_n ≈ p2. Re-activated: s1 – have to wait (y).
Phase 3 (establishing s1 as discourse object). Text semantic level: some guys (v); p1 – try to prevent (v, s1); this (s1); without avail (p1). Textworld level: W, X, Y, U, R, V, S1.]
Figure 7. Resolution model for (36b).
The preferred reading for the complex anaphor is that it refers to the crashing of the cars (e1), since the drivers were warned about the road works which were then involved in the crash. Taking the waiting (s1) as referent is less plausible but not impossible. The warning (e2) is ruled out as referent for logical reasons: it is not sensible to warn in order to prevent the same warning. In (36b), the reader has to relate the redirecting to the intended prevention and some previously introduced activity. Here, resolving the complex anaphor to the waiting (s1) is the preferred reading, since redirecting the traffic around the place of accident is more plausibly a means of preventing a holdup. Moreover, the place of accident is mentioned, so it can be inferred that the accident had happened before the redirecting. Thus, the redirecting could not have prevented the accident. Again, for logical reasons it can be ruled out that the complex anaphor refers to the redirecting (p2), as it is not sensible to redirect something in order to prevent this redirection. We have seen that in both example sentences conceptual knowledge was involved in phase 2 in order to determine the most adequate antecedent for the anaphor, but it is difficult to formalise world or conceptual knowledge since it is as complex as the world.
Once the complex referent is established as a unified discourse entity by a complex anaphor, the discourse entity is accessible by personal pronouns (like ‘it’ in the 3rd sentence of (37)),¹³ whereas the use of personal pronouns in the Vorfeld as a complex anaphor (like ‘it’ in the 2nd sentence) is restricted (cf. Hegarty/Gundel/Borthen 2002; Hegarty 2003):¹⁴
(37) [The earth turns about the sun.]p [This process]p/[This]n/*[It] will presumably last for 7 × 10⁹ years. [It] might, however, terminate a few years earlier.
4. Summary and outlook
On our account, complex anaphors condense previously mentioned propositional referents and establish them as unified discourse entities. Thus, anaphoric complexation has two dimensions: from an ontological point of view, it consists in a process of (potentially) increasing abstraction. Concerning the discourse structure, on the other hand, it results in a sort of reification, since complex items are handled like “things” by language. There are, however, other aspects of complex anaphora which should not be ignored: regarding textual functions, complex anaphors are thematic and rhematic elements at the same time, since (1) they are interpreted by reactivating previously mentioned referents and (2) they establish new discourse objects. Furthermore, some lexical anaphors evoke, due to their lexical meaning, an additional evaluation (see (38)) or a meta-discursive categorisation (see (39)) (cf. Consten/Knees/Schwarz-Friesel 2007; Schwarz-Friesel/Consten/Marx 2004; Consten 2004: 34).
(38) Ratzinger has been elected pope. This fortune/This catastrophe …
(39) The earth turns about the sun. This thesis/This claim/This proposal/This blasphemous misbelief/This joke Johannes Kepler made when he was drunk …
These discourse functions go beyond the mere abstraction and resolution processes that were the topic of this paper. With respect to abstractness, we differentiated between neutral and ontology-changing complexation and proposed an “abstractness constraint”. We then suggested a process model of anaphoric complexation which serves to explain the resolution of certain kinds of ambiguous complex anaphora not solved by current approaches. Moreover, our model shows how anaphor resolvers make use of ontological constraints as well as lexical and conceptual knowledge. Accordingly, it is able to integrate different cognitive aspects of language processing.
13. ‘It’ in the 3rd sentence is not a complex anaphor since it is not assigned to a propositionally structured antecedent but to an NP-antecedent (This process/This) by which a unified discourse entity has already been established.
14. Hegarty (2003: 1f) assumes that events introduced by a clause are immediately accessible by personal pronouns since they are in focus merely due to their ontological status. However, some of our data do not support his claim. We have no evidence that ontological states of referents are determinants of a salience hierarchy.
References
Asher, N. 1993. Reference to Abstract Objects in Discourse. Dordrecht: Kluwer.
Asher, N. 2000. “Events, Facts, Propositions and Evolutive Anaphora.” In Speaking of Events, J. Higginbotham, F. Pianesi and A.C. Varzi (eds), 123–150. Oxford: Oxford University Press.
Asher, N. and Lascarides, A. 2003. Logics of Conversation. Cambridge: Cambridge University Press.
Bach, E. 1989. “The Algebra of Events.” Linguistics and Philosophy 9: 5–16.
Consten, M. 2004. Anaphorisch oder deiktisch? Zu einem integrativen Modell domänengebundener Referenz [LA 484]. Tübingen: Niemeyer.
Consten, M., Knees, M. and Schwarz-Friesel, M. 2007. “The Function of Complex Anaphors in Text.” In Anaphors in Text, M. Schwarz-Friesel, M. Consten and M. Knees (eds), 81–102. Amsterdam: Benjamins (SLCS 86).
Cornish, F. 1999. Anaphora, Discourse, and Understanding: Evidence from English and French. Oxford: Clarendon Press.
Dahl, Ö. and Hellmann, C. 1995. What Happens When We Use An Anaphor? Ms. Dept. of Linguistics, Stockholm.
Davidson, D. 1967. “The Logical Form of Action Sentences.” In The Logic of Decision and Action, N. Rescher (ed), 81–95. Pittsburgh: University of Pittsburgh Press. Reprint in D. Davidson. 1980. Essays on Action and Events, 105–122. Oxford: Clarendon Press.
Dowty, D.R. 1979. Word Meaning and Montague Grammar: the Semantics of Verbs and Times in Generative Semantics and in Montague’s PTQ. Dordrecht et al.: Reidel.
Francis, G. 1994. “Labelling discourse: An aspect of nominal-group lexical cohesion.” In Advances in written text analysis, M. Coulthard (ed), 83–101. London/New York: Routledge.
Fraurud, K. 1992. “Situation Reference. What does ‘it’ refer to?” In K. Fraurud. Processing Noun Phrases in Natural Discourse. PhD thesis. Depart. of Linguistics, Stockholm University.
Graesser, A., Wiemer-Hastings, P. and Wiemer-Hastings, K. 2001. “Construction Inferences and Relations during Text Comprehension.” In Text representation: Linguistic and psycholinguistic aspects, T. Sanders, J. Schilperoord and W. Spooren (eds), 249–271. Amsterdam: John Benjamins.
Greber, E. 1993. “Zur Neubestimmung von Kontiguitätsanaphern.” Sprachwissenschaft 18 (4): 361–405.
Halliday, M. and Hasan, R. 1976. Cohesion in English. London: Longman.
Hegarty, M. 2003. Type shifting of Entities in Discourse. Presentation at the First International Workshop on Current Research in the Semantics-Pragmatics Interface, Michigan State University.
Hegarty, M., Gundel, J. and Borthen, K. 2002. “Information structure and the accessibility of clausally introduced referents.” Theoretical Linguistics 27 (2–3): 163–186.
Higginbotham, J. 2000. “On Events in Linguistic Semantics.” In Speaking of Events, J. Higginbotham (ed), 49–80. New York: Oxford Univ. Press.
Johnson-Laird, P. 1994. “Mental Models and Probabilistic Thinking.” Cognition 50: 189–209.
Karttunen, L. 1969. “Pronouns and Variables.” CLS 5: 108–115.
Kim, J. 1969. “Events and their Descriptions. Some Considerations.” In Essays in Honor of Carl G. Hempel, N. Rescher et al. (eds), 198–215. Dordrecht: Reidel.
Kim, J. 1976. “Events as Property Exemplifications.” In Action Theory. Proceedings of the Winnipeg Conference on Human Action, M. Brand and D. Walton (eds), 159–177. Dordrecht/Boston: Reidel.
Koeppel, R. 1993. Satzbezogene Verweisformen: eine datenbankgestützte Untersuchung zu ihrer Distribution und Funktion in mündlichen Texten, schriftlichen Texten und schriftlichen Fachtexten des Deutschen [Tübinger Beiträge zur Linguistik 386]. Tübingen: Narr.
Kratzer, A. 1995. “Stage-level and individual-level predicates.” In The generic book, G.N. Carlson and F.J. Pelletier (eds), 125–175. Chicago: The Univ. of Chicago Press.
Kratzer, A. 2003. The Event Argument and the Semantics of Verbs. Manuscript. University of Massachusetts at Amherst. Available at http://semanticsarchive.net.
Löbner, S. 1985. “Definites.” Journal of Semantics 4: 279–326.
Maienborn, C. 2003. Die logische Form von Kopula-Sätzen. Berlin: Akademie-Verlag.
Maienborn, C. 2004. “On Davidsonian and Kimian States.” In Existence: Syntax and Semantics, I. Comorovski and K. von Heusinger (eds). Dordrecht: Kluwer.
Parsons, T. 1990. Events in the Semantics of English: a Study in Subatomic Semantics. Cambridge, MA: MIT Press.
Schmid, H.-J. 2000. English abstract nouns as conceptual shells. From corpus to cognition. Berlin et al.: Mouton de Gruyter.
Schwarz, M. 2000a. Indirekte Anaphern in Texten. Studien zur domänen-gebundenen Referenz und Kohärenz im Deutschen [LA 413]. Tübingen: Niemeyer.
Schwarz, M. 2000b. “Textuelle Progression durch Anaphern.” In Prosodie – Struktur – Interpretation [Linguistische Arbeitsberichte 74], J. Dölling and Th. Pechmann (eds), 111–126. Leipzig: Institut für Linguistik, Universität.
Schwarz, M. 2001. “Establishing Coherence in Text. Conceptual Continuity and Text-world Models.” Logos and Language 2 (1): 15–24.
Schwarz-Friesel, M., Consten, M. and Marx, K. 2004. “Semantische und konzeptuelle Prozesse bei der Verarbeitung von Komplex-Anaphern.” In Flexibilität und Stabilität, I. Pohl (ed), 67–86. Frankfurt am Main: Peter Lang.
Vendler, Z. 1967. Linguistics in Philosophy. Ithaca/New York: Cornell University Press.
Webber, B. 1991. “Structure and ostension in the interpretation of discourse deixis.” Language and Cognitive Processes 6: 107–135.
The discourse functions of the present perfect Atsuko Nishiyama and Jean-Pierre Koenig
Ritsumeikan University / University at Buffalo, the State University of New York
The interpretation of the present perfect is often assumed to require pragmatic inferences. However, it is not clear what rules speakers use to perform these inferences. This paper reports two corpus studies of the present perfect in English and Japanese that show that the inferences required to interpret the present perfect follow from general default rules or commonsense entailment rules. These studies also show that the use of the perfect helps discourse coherence in two ways. First, the presence of the state the perfect introduces helps establish discourse relations or allows the establishment of additional discourse relations between discourse segments. Second, the pragmatic inferences required to interpret the perfect can indirectly trigger the rules needed to establish discourse relations.
1. Introduction
Establishing text coherence involves establishing discourse relations between clauses, and discourse relations, in turn, help interpret temporal and anaphoric relations between clauses (Mann and Thompson 1988; Hobbs et al. 1993; Lascarides and Asher 1993; Asher and Lascarides 2003). According to many previous studies, the establishment of discourse relations ultimately relies on the conversational participants’ commonsense knowledge about events or, possibly, the lexical reflexes of that knowledge. In this paper, we show that the choice of grammatical forms, in particular the choice of a present perfect form over a simple past tense form, can also play a role in inferring discourse relations and building text coherence. Based on the results of two corpus studies of English and Japanese present perfect examples we collected from diverse genres, we argue that the choice of a perfect form affects discourse coherence. We first discuss the discourse functions of the English present perfect and its interaction with the establishment of discourse relations and then show that our results extend to the Japanese nonpast perfect -te-i-ru.
2. Background: the semantics of the English perfect
Many scholars have argued that the present perfect introduces a past event and a state holding at present (henceforth, the perfect state) (Kamp and Reyle 1993; van Eijck and Kamp 1997; Michaelis 1998; de Swart 1998; Borillo et al. 2004). However, scholars differ on the nature of that state, and many previous studies provide only a temporal or causal definition of the perfect state. This paper cannot discuss the semantics of the perfect in any detail.1 Suffice it to say that, to model the fact that the nature of the perfect state varies with the perfect’s interpretation, we modify the standard analysis of the perfect within Discourse Representation Theory (DRT) (Kamp and Reyle 1993; van Eijck and Kamp 1997; de Swart 1998) as in (1).
(1) The meaning of the perfect introduces:
 i. an eventuality ev which satisfies the base eventuality description φ;
 ii. a subpart ev′ of ev which also satisfies φ and precedes reference time r (τ(ev′) ⊰ r);
 iii. a perfect state s, which overlaps reference time r (τ(s) ○ r), and whose category is semantically a free variable X (X in Figure 1).
The presence of the free variable X is a semantic constraint (imposed by the perfect form), but the value of X has to be filled in via pragmatic inferences.2 Figure 1 represents the Discourse Representation Structure (DRS) that results from the interpretation of a sentence whose verb and arguments contribute the eventuality description φ (hereafter, the base eventuality description). It is important to note two differences between our analysis of the semantics of the English perfect and other theories. One is that the category of the perfect state is semantically a free variable; the other is that the relationship between the perfect state and the base eventuality is one of inferability, not a temporal (abutting) relation or a causal relation.
 Discourse referents: ev, ev′, s, r
 Conditions: φ(ev); ev′ ≤ ev; φ(ev′); τ(ev′) ⊰ r; X(s); τ(s) ○ r
Figure 1. The meaning of the English perfect.
1. See Nishiyama and Koenig (2004) for more details about previous studies of the perfect and a preliminary version of our semantics for the English perfect. 2. This use of a free variable X can be compared to the use of property variables in Kay and Zimmer (1978) and Partee (1984) to model the semantics of noun-noun compounds and genitives and the use within DRT of property variables to model VP ellipsis in Hardt (1999).
Possible values of X for sentences (2) and (3) and the labels for the corresponding uses of the perfect are shown informally in (2a)–(2b) and (3a)–(3b), respectively.3
(2) Ken has broken his leg.
 a. Ken’s leg is currently broken (= X(s)). [Entailed resultative reading]
 b. Ken is behind in his project (= X(s)). [Conversationally implicated resultative reading]
(3) Ken has lived in London.
 a. Ken (still) lives in London (= X(s)). [Continuative reading]
 b. Ken knows good restaurants in London (= X(s)). [Conversationally implicated resultative reading]
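The division of labor described above, where the semantics introduces a perfect state whose category X is left open and pragmatics supplies its value, can be pictured with a small data structure. The following is only an illustrative sketch: the class name, the field names, and the numeric time points are assumptions made for the example and are not part of the DRT analysis itself.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PerfectMeaning:
    """Toy record for the meaning in (1); all names here are hypothetical."""
    base_description: str                  # the base eventuality description
    event_time: float                      # time of the subpart ev' (a point, for simplicity)
    reference_time: float                  # the reference time r
    state_category: Optional[str] = None   # X: a free variable until pragmatics fills it in

    def event_precedes_reference(self) -> bool:
        # Clause (ii) of (1): the time of ev' precedes r.
        return self.event_time < self.reference_time

    def with_x(self, category: str) -> "PerfectMeaning":
        # Pragmatic inference supplies a value for X; the semantics only leaves the slot open.
        return replace(self, state_category=category)


# Sentence (2): one semantic representation, two pragmatically resolved readings.
ken = PerfectMeaning("Ken break his leg", event_time=-3.0, reference_time=0.0)
reading_2a = ken.with_x("Ken's leg is currently broken")
reading_2b = ken.with_x("Ken is behind in his project")
print(ken.state_category, "|", reading_2a.state_category, "|", reading_2b.state_category)
```

The point of the sketch is that (2a) and (2b) share every field except the value chosen for X.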
Assuming the correctness of our analysis of the English perfect, for which we have argued elsewhere, three questions arise. The first is what type of inferences are used to find the value of X, i.e., what inference rules lead to fully specified perfect readings. The second is what roles the choice of a perfect form over a simple past tense form plays in discourse. The third is to what extent the results of our analysis of the English perfect extend to other languages such as Japanese. The next section answers the first question through the analysis of a sample of examples culled from a diverse range of corpora.
3. Inference patterns needed to find the value of X in English
We collected data from various genres: two newspapers (Graff 1997), two discussion articles in CQ Researcher Online, conversation data from the Switchboard Corpus (Graff et al. 1998), and narrative data from Netlibrary. In a pseudo-randomly selected portion of each corpus, we examined the interpretations of all present perfect examples, including those that occurred in embedded clauses (605 examples in all). Non-finite forms of the perfect (e.g., perfect forms that followed modal auxiliaries or to) were excluded from the analysis, as was the idiomatic expression ’ve got to.4
3. We use the term entailed resultative perfect readings for readings in which the value of X corresponds to the resultant state entailed by the base eventuality description. We use the term conversationally implicated resultative perfect readings (Depraetere 1998), or (non-entailed) resultative perfect readings, for readings in which the value of X is a resultant state that is not entailed by the base eventuality description. Some scholars call the latter readings existential or experiential perfect readings (McCawley 1971; Dahl 1985).
4. This is a slightly revised version of the study presented in Nishiyama and Koenig (2006) and Nishiyama (2006b).
We found that three types of inference patterns were needed to assign a value to X. First, in most perfect examples (81.98%; see (4)–(5) and Table 1 below), readers need to draw only trivial inferences in order to find the value of X, namely that the state either described or entailed by the verb and its arguments persists until the present (the presumption of persistence (McDermott 1982)). The inferences drawn in the remaining 18% of the examples fall into a handful of inference patterns.
Type (i) Entailed resultative or continuative perfects. In example (4), readers infer that the state described by the base eventuality description, i.e., the state of his being a member of her household, still persists at present. In example (5), the occurrence of the event of Yeltsin’s health becoming a major issue entails the resultant state of Yeltsin’s health being a major issue. Readers infer that the entailed state still persists at present.
(4) . . . , he has been a member of her household ever since. (X = He is a member of her household.) (Cather 1996, 24) —inference of persistence
(5) Yeltsin’s health has become a major issue in the closing days of Russia’s presidential race. (X = Yeltsin’s health is a major issue in the closing days of Russia’s presidential race.) (Graff 1995–1997: Wall Street Journal, 07.01.1996) — entailment and inference of persistence
Type (ii) Speech-act/Epistemic perfects. Some perfect sentences have speech act verbs or epistemic verbs as their main verbs, and the value of X can be inferred via default rules that reflect the speaker’s and hearer’s expectations about each other’s speech acts. These uses can be divided into two subtypes.
Subtype (ii-a) Evidential use. Authors may use the perfect to communicate that the complement of performative or epistemic verbs such as say, promise, or see presently holds or is likely to hold in the future (see (6) and (8)). The default inference rule at play here is that, e.g., if Z says or promises Y and Z is trustworthy (conforms to our cultural model of language and information, see Sweetser (1987)), Y is (normally) true or likely to be true in the future as per the speech act’s sincerity conditions (see (7) and (9)) (Searle 1969), and therefore, Y holds or is likely to hold (assuming that if p is true, the state whose category is p holds (Ismail 2001)).
(6) Sumitomo has said its losses from Mr. Hamanaka’s trading stand at $1.8 billion. (X = Sumitomo’s losses from Mr. Hamanaka’s trading stand at $1.8 billion.) (Graff 1995–1997: Wall Street Journal, 07.01.1996)
(7) ∀x∀p((say (x, p)∧trustworthy (x))>p) (‘>’ means ‘nonmonotonically/defeasibly entail’ (Pelletier and Asher 1997))
(8) Britain’s opposition Labor Party has also promised a ban on all tobacco advertising if it wins the election due to be held by May next year. (X = There is likely to be a ban on all tobacco advertising if the Labor Party wins the election.) (Graff 1995–1997: Reuters Financial News, 07.01.1996)
(9) ∀x∀p((promise (x, p)∧trustworthy (x))>likely (future (p)))
In example (6), if Sumitomo says Y, then, given rule (7), Y is normally true, and the value of X is ‘Sumitomo’s losses from Mr. Hamanaka’s trading stand at $1.8 billion.’ In example (8), if Britain’s Labor Party promises Y, then, given rule (9), Y is likely to hold in the future, and the value of X is ‘there is likely to be a ban on all tobacco advertising in the future if it wins.’
Subtype (ii-b) Topic negotiation. Speakers often use the perfect with epistemic verbs at the beginning of a conversation to ask about addressees’ epistemic state and to set up a topic (see (10) and (11)). For instance, in examples (10) and (11), the speaker is trying to set up a topic, camping or the movie “Dances with Wolves,” by asking about the addressee’s camping experience or his having seen the movie. Here, the speaker relies on the default rule that if she asks the addressee whether an epistemic pre-condition for having a conversation on her chosen topic Y is satisfied (by asking, e.g., about the extent of the addressee’s experience or knowledge of the topic), she wants to talk about Y (see (12)).
(10) Have you done a lot of camping recently? (X = I want to talk about camping with you.) (Graff et al. 1998, sw2009.txt)
(11) A: Have you seen DANCING WITH WOLVES? (X = I want to talk about the movies.)
 B: Yeah. I’ve seen that, . . . . (Graff et al. 1998, sw2010.txt)
(12) ∀x∀y (ask_addressee_know (x, y) > want_talk (x, y))
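The way the default rules (7), (9), and (12) yield a value for X can be sketched procedurally. This is a toy illustration under stated assumptions: the function name infer_x, the string encodings of propositions, and the boolean trustworthy flag are all hypothetical simplifications, and defeasibility is modeled only as the default failing to fire when the trustworthiness condition is not met.

```python
from typing import Optional


def infer_x(act: str, agent: str, content: str, trustworthy: bool = True) -> Optional[str]:
    """Return a candidate value for X, or None when no default applies."""
    if act == "ask_addressee_know":
        # Rule (12): asking about the addressee's knowledge of a topic
        # defeasibly signals that the speaker wants to talk about it.
        return f"want_talk({agent}, {content})"
    if not trustworthy:
        # Rules (7) and (9) require a trustworthy source; otherwise nothing follows.
        return None
    if act == "say":
        return content                          # rule (7)
    if act == "promise":
        return f"likely(future({content}))"     # rule (9)
    return None


x6 = infer_x("say", "Sumitomo", "losses_stand_at_1.8_billion")           # example (6)
x8 = infer_x("promise", "Labor Party", "ban_on_tobacco_advertising")     # example (8)
x10 = infer_x("ask_addressee_know", "speaker", "camping")                # example (10)
blocked = infer_x("say", "Sumitomo", "losses_stand_at_1.8_billion", trustworthy=False)
print(x6, x8, x10, blocked, sep="\n")
```

A fuller treatment would use a nonmonotonic logic rather than a flag, since defaults can also be overridden by later, conflicting information.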
Type (iii) Commonsense entailment. Authors sometimes use the perfect, instead of the past tense, to indicate that the occurrence of a past event provides evidence or an explanation for the truth of a claim they made or will make. The value of X in these cases is the state description conveyed by a clause that precedes or follows the sentence containing the perfect. For example, in (13) the events introduced by the sentence containing the perfect (that the speaker curled up and watched international programs) support and provide evidence for the assertion conveyed by the first sentence. The fact that the speaker curled up on a couch and watched those international programs is proof that you can metaphorically go around the world in 80 channels.
(13) . . ., you can go around the world in 80 channels (= X(s)). . . . I’ve curled up on my living room couch, clicker in hand, and watched, among other things, an Italian salute to mothers; Latin American telenovelas and variety shows; Greek movies; Japanese samurai epics and modern domestic dramas; Indian musicals; the evening news from Moscow; Chinese-language pop videos; Korean game shows; and France’s “Bouillon de Culture,” … (Graff 1995–1997: Wall Street Journal, 07.01.1996)
In order to find the value of X in example (13), readers need to make use of a rather specific commonsense entailment rule such as (14).
(14) ∀x (curl_up_clicker_in_hand (x) ∧ watch_international_programs (x) > can_go_around_the_world_in_80_channels (x))5
(15) is a similar example. The value of X for the first sentence is what the second sentence expresses. (15) House Democratic leader Richard Gephardt of Missouri …has played a key role in recruiting the party’s congressional candidates. Many are merely reflecting his priorities. . . (=X(s)) (Graff 1995–1997: Wall Street Journal, 07.01.1996)
Here, Gephardt’s key role in recruiting candidates explains why many congressional candidates reflect his priorities. Readers infer the value of X using another rather specific commonsense entailment rule, the one stated in (16).
(16) ∀x∀y (play_key_role_recruiting (x, y) > reflect_priorities_of (y, x))
It must be noted here that all the inference rules used in type (i–iii) are available independently of using the perfect, whenever speech participants are aware that the base eventuality occurred. The perfect simply triggers a search for available and relevant inference rules, because of the need to find a value for X (see above and Nishiyama and Koenig (2004) for more discussion). Although the inferential process is semantically triggered by the use of the perfect, the triggered inference rules are not part of its semantics. Table 1 summarizes the types of rules used to determine the value of X in our sample. It is striking that the value of X for the overwhelming majority of present perfect examples we have looked at can be found through very general default principles. 81.98% of all the examples belong to Type (i), where the value of X can be derived through the principle of persistence. 11.24% of the examples belong to Type (ii), where the value of X can be inferred through general default expectations regarding speech acts. In total, 93.22% of the examples require general default rules to assign a value to X. Only a small number of examples (4.46%) require specific commonsense knowledge rules.
5. This rule is stated more specifically than is plausible. But it is easy to generalize and nothing substantial hinges on the particular formulation of the commonsense rule that we posit, as long as there is one such rule (or set of rules) that is plausibly shared by speech participants.
4. Discourse Functions of the English Present Perfect
Much of the previous literature on the English perfect, including the previous section, has focused on its meaning, i.e., what information about the world is conveyed by the use of a perfect form. This section focuses, instead, on the pragmatics of the English perfect and tries to answer the following question: why do writers or speakers choose a perfect form to describe an eventuality that occurred or started in the past? Simply put, our answer is that the choice of a perfect form is guided by writers’ or speakers’ desire to help addressees understand the coherence of the discourse they read or hear.
Table 1. Numbers and percentages of examples of each type of inference pattern.
                   A     B     C     D    Total (%)
 Type (i)        181   140    70   105   496 (81.98)
 Type (ii) (a)    21     5     7    18    51 (8.43)
 Type (ii) (b)     0     0    13     4    17 (2.81)
 Type (iii)        8     6     8     5    27 (4.46)
 Others            2     4     2     6    14 (2.31)
 A: Newspapers (Graff 1995–1997: Reuters Financial News, 07.01.1996, Wall Street Journal, 07.01.1996)
 B: Discussion: CQ Researcher Online (http://library2.cqpress.com/cqresearcher)
 C: Conversation: Switchboard Corpus (Graff et al. 1998, files sw2001.txt through sw2019.txt)
 D: Narrative: Netlibrary (http://www.netlibrary.com/) (H.G. Wells, The Time Machine; I. Bernard Cohen, Howard Aiken: Portrait of a Computer Pioneer; Willa Cather, O Pioneers!)
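The totals and percentages reported for Table 1 can be re-derived from the per-corpus counts. The short check below is a sketch added for illustration; the dictionary keys and variable names are our own labels, while the counts are copied from the table.

```python
# Counts per corpus (A, B, C, D), copied from Table 1.
counts = {
    "Type (i)":    [181, 140, 70, 105],
    "Type (ii-a)": [21, 5, 7, 18],
    "Type (ii-b)": [0, 0, 13, 4],
    "Type (iii)":  [8, 6, 8, 5],
    "Others":      [2, 4, 2, 6],
}

row_totals = {label: sum(row) for label, row in counts.items()}
sample_size = sum(row_totals.values())
percent = {label: round(100 * total / sample_size, 2) for label, total in row_totals.items()}

# Types (i), (ii-a) and (ii-b) are the ones resolved by general default rules.
general_defaults = sum(row_totals[k] for k in ("Type (i)", "Type (ii-a)", "Type (ii-b)"))
general_share = round(100 * general_defaults / sample_size, 2)

print(sample_size, row_totals, percent, general_share)
```

The recomputed figures agree with the sample size of 605 and with the 81.98%, 11.24% (Type (ii) as a whole), 93.22%, and 4.46% cited in Section 3.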
4.1 Type (i)–(ii-a,b)
Lascarides and Asher (1993) and Asher and Lascarides (2003) argue, with others, that a discourse coheres to the extent that one can establish discourse relations between its segments (typically, clauses). We assume the correctness of this hypothesis and will discuss the role the perfect plays in enhancing discourse coherence in the context of the approach to discourse coherence developed in Segmented Discourse Representation Theory (SDRT) (Asher and Lascarides 2003). In SDRT, for two sentences or other pieces of text to form a coherent discourse, there must be a discourse relation R that relates their corresponding meaning representations or DRSs. More precisely, R takes two utterances’ meaning representations as its arguments (R(π1, π2), where πi is a label for the DRS of an utterance or clause), and is nonmonotonically inferred from the information content of utterances, discourse contexts, and world knowledge. Whenever an utterance introduces an additional eventuality into discourse, a new discourse relation must relate its semantic content to that of a surrounding utterance. Our contention is that when a sentence containing a perfect form introduces a perfect state, additional discourse relations can be established on the basis of the relations that can exist between the perfect state and eventualities described in the surrounding text.6 More generally, the presence of an additional eventuality (the perfect state) either prevents conflicting inferences from being made or increases the number of discourse relations between discourse segments. In both cases, it strengthens the coherence of discourses, as per the Maximal Discourse Coherence Principle (MDC) of Asher and Lascarides (2003), which ranks a discourse as more coherent the more consistent discourse relations it supports between its segments.7
6. As shown in the previous section, the underspecified value of X is easily resolved at the level of the logical form of each clause, typically via very general default inferences. And if the inferences required for resolving the value of X are more specific, as in type (iii), the value is found in the surrounding text. The information associated with the value of X is, thus, always available for the establishment of discourse relations.
7. The version of SDRT presented in Asher and Lascarides (2003) assumes that each clause describes a single eventuality. There is, thus, a one-to-one correspondence between DRS labels and discourse markers anchored to eventualities. As a result, relations between eventualities easily map onto relations between DRSs of clauses that describe these eventualities. Unfortunately, for any theory that assumes that perfect operators introduce an additional stative eventuality, like that of Kamp and Reyle (1993) and ours, the mapping between eventuality relations and discourse segment relations will be slightly more complex (cf. the treatment of presuppositions in Asher and Lascarides (2003)). Since this technical issue is of no relevance to the main point of this paper, we do not discuss it any further.
4.1.1 Type (i)
Consider examples of perfect uses of Type (i).
(17) a. Alexandra took him in, and he has been a member of her household ever since. (X = He is a member of her household.)
 b. He is too old to work in the fields, but he hitches and unhitches the work-teams and looks after the health of the stock.
In (17a), the value of X for the perfect clause, obtained through an inference of persistence, is the state of his being a member of Alexandra’s household. The DRSs for the clauses (17a) and (17b) can form an Elaboration relation through the Elaboration rule in (18) because the state of his being too old to work in the fields, of his hitching and unhitching the work-teams and of his looking after the health of the stock is a part of the state of his being a member of Alexandra’s household (= X) (Asher and Lascarides 2003). Since the state of his being too old to work in the fields, the habitual state of his hitching and unhitching the work-teams . . . are all temporally included in the state of his being in Alexandra’s household (= X), sentence (17b) can elaborate sentence (17a).
(18) Elaboration Rule
 ∀α∀β∀P∀P′∀ev∀ev′((P(ev, α) ∧ P′(ev′, β) ∧ Part-of (ev′, ev)) > Elaboration (α, β))
 (In this and subsequent rules, α and β are DRSs, P and P′ are eventuality descriptions, and ev and ev′ are eventualities. The predicate Part-of describes a part-of relation between eventualities.)
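The Elaboration rule in (18) can be sketched with eventualities modeled as time intervals and Part-of approximated by temporal inclusion. This is a simplifying assumption made for illustration: the class and function names are hypothetical, the time points are invented, and the sketch returns a plain boolean instead of a defeasible conclusion.

```python
from dataclasses import dataclass

STILL_HOLDS = float("inf")   # end point used for a state that persists at present


@dataclass(frozen=True)
class Eventuality:
    description: str
    start: float
    end: float


def part_of(inner: Eventuality, outer: Eventuality) -> bool:
    # Approximation of Part-of: the inner eventuality is temporally included in the outer one.
    return outer.start <= inner.start and inner.end <= outer.end


def elaboration(alpha: Eventuality, beta: Eventuality) -> bool:
    # Rule (18): if beta's eventuality is part of alpha's, infer Elaboration(alpha, beta).
    return part_of(beta, alpha)


# (17a): the perfect state X persists at present; (17b): his current habitual activities.
member_now = Eventuality("be a member of Alexandra's household", -10.0, STILL_HOLDS)
chores = Eventuality("hitch and unhitch the work-teams", -5.0, STILL_HOLDS)

# Contrast case: a membership state that ended before the present.
member_ended = Eventuality("be a member of Alexandra's household", -10.0, -1.0)

print(elaboration(member_now, chores))    # True: the part-whole relation holds
print(elaboration(member_ended, chores))  # False: no part-whole relation, so no Elaboration
```

The second call shows why the persistence of the state matters: once the containing state no longer reaches the present, the premise of rule (18) fails.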
The elaboration rule in (18) says that if two discourse segments describe eventualities such that one is part of the other, then one can (defeasibly) infer the existence of an elaboration relation between these two discourse segments.8 If the simple past had been used in (17a) instead (i.e., ‘and he was a member . . .’), readers would only have been able to infer that his membership held prior to speech time. Because that state might not hold anymore at speech time (in fact, the use of the past tense would implicate it does not), the addressee could not establish a part-whole relation between the prior state of being a member of Alexandra’s household and the current state of hitching and unhitching work-teams . . . . No inference that an Elaboration relation holds between the DRSs for (17a) and (17b) could, then, be drawn.
Discourse (19) illustrates how the choice of a perfect form can help prevent addressees from drawing conflicting inferences. Figure 2 shows the SDRS for discourse (19).9
(19) a. For centuries, the Havasupai Indians of northwest Arizona have performed the ram dance to conduct the spirits of their dead relatives to the next world. (X = The Havasupai Indians of Northwest Arizona (usually) perform the ram dance to conduct the spirits of their dead relatives to the next world.)
 b. But today the sacred ceremony has become more than just a funeral rite. (Cooper 1996) (X = Today the sacred ceremony is more than just a funeral rite.)
The presence of the discourse marker but at the beginning of (19b) indicates that (a portion of) the DRSs for (19a) and (19b) stand in a Contrast relation. One argument of this relation, the one encoded in (19b), corresponds to the event of the ceremony becoming more than a funeral rite. The combination of the semantics of inchoatives and the use of the perfect form will lead readers to infer that the ceremony is more than a funeral rite. But, in turn, this means that the ceremony must still be performed. Had the verb form of (19a) been the simple past performed, an implicature would have arisen that would contradict that entailment, namely an implicature that the ceremony is no longer being performed. There are thus two ways in which the use of perfect forms in (19) helps establish discourse coherence. First, the establishment of a Contrast relation between (19a) and (19b) depends on the fact that the ceremony was performed in the past and still is, an inference that the perfect form in (19b) helps trigger. Second, the coherence of (19) depends on the absence of any information that would contradict that inference. The use of the perfect form in (19a) implicates that the ceremony is still being performed and, more crucially, blocks the implicature that it is no longer performed.
8. Our inference rules are stated somewhat informally for ease of presentation and, thus, differ slightly from those found in Asher and Lascarides (2003). Nothing substantial hinges on these simplifications. We also add to the set of discourse relations that Asher and Lascarides (2003) discuss; see our Topic negotiation QAP relation in Section 4.1.3.
9. All SDRSs in this paper are simplified for ease of presentation.
 π1, π2
 π1: referents ev, s, n; conditions H.I._perform_R.D (ev); τ(ev) ⊰ n; X(s); τ(s) ○ n
 π2: referents ev′, s′, n; conditions R.D_become_more_than_funeral_rite (ev′); τ(ev′) ⊰ n; X′(s′); τ(s′) ○ n
 Contrast (π1, π2)
 X = H.I._perform_R.D_to_conduct_the_dead_to_next_world
 X′ = R.D_be_more_than_funeral_rite
Figure 2. SDRS for (19).
4.1.2 Type (ii-a)
In uses of the English perfect of Type (ii-a), the perfect introduces X(s) together with the source from which the speaker derives this piece of information. As we saw in the previous section, the use of a perfect form leads to the inference that X(s) holds. This inference, in turn, helps readers infer the presence of additional discourse relations between the relevant discourse segments. In discourse (20), for example, the DRSs for (20a) and (20b) form a Result relation based on the causal relation that readers can infer between the event introduced in (20a) and the perfect state introduced by the perfect form in (20b), as seen in (21).
(20) a. Mr. Hamanaka spent billions of dollars . . .
 b. Sumitomo has said its losses from Mr. Hamanaka’s trading stand at $1.8 billion (X = Sumitomo’s losses from Mr. Hamanaka’s trading stand at $1.8 billion).
(21) ∀x∀y∀n∀ev∀ev′((spend (x, n, ev) ∧ loss_stand_at (x, n, ev′)) > cause (ev, ev′))
The inference rule in (21) is based on the knowledge that an eventuality of x spending n dollars normally causes an eventuality of x’s loss standing at n dollars. The readers’ ability to infer a causal relation between spending and losses, in turn, helps trigger the Result discourse relation inference rule stated in (22).
(22) Result Rule
 ∀α∀β∀P∀P′∀ev∀ev′((P (ev, α) ∧ P′(ev′, β) ∧ cause (ev, ev′)) > Result (α, β))
The rule in (22) says that if the eventuality descriptions P and P′ are true of ev and ev′ in DRSs α and β and if there is a causal relation between ev and ev′, then α and β stand in a Result relation. Together, the two rules in (21) and (22) ensure that a Result relation holds between the DRSs for the two sentences in (20). Figure 3 is a representation of the SDRS for discourse (20).
 π1, π2
 π1: referents e, n; conditions Mr.H._spend_billions_of_dollars (e); τ(e) ⊰ n
 π2: referents e, s, n; conditions S_say_S’s_loss_stand_at_$1.8_billion (e); τ(e) ⊰ n; X(s); τ(s) ○ n
 Result (π1, π2)
 X = S’s_loss_stand_at_$1.8_billion
Figure 3. SDRS for (20).
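How the commonsense rule in (21) feeds the Result rule in (22) can be sketched as a two-step lookup. The encoding below is an assumption made for illustration: eventualities are reduced to their predicate names, the arguments x and n of rule (21) are abstracted away, and the names CAUSE_RULES and result_relation are hypothetical.

```python
# Rule (21), reduced to a pair of predicate names: spending normally causes losses.
CAUSE_RULES = {("spend", "loss_stand_at")}


def causes(pred1: str, pred2: str) -> bool:
    # Stand-in for the defeasible inference cause(ev, ev') licensed by commonsense rules.
    return (pred1, pred2) in CAUSE_RULES


def result_relation(alpha_pred: str, beta_pred: str) -> bool:
    # Rule (22): if alpha's eventuality causes beta's, infer Result(alpha, beta).
    return causes(alpha_pred, beta_pred)


spending = "spend"               # the event described in (20a)
saying = "say"                   # the saying event described in (20b)
perfect_state = "loss_stand_at"  # the value of X inferred for (20b)

print(result_relation(spending, perfect_state))  # True: Result is licensed via the perfect state
print(result_relation(spending, saying))         # False: the saying event alone does not license it
```

The contrast between the two calls mirrors the claim that the Result relation runs through the perfect state, so its category has to be inferred first.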
Again, the present perfect plays a critical role in facilitating the inference of a Result relation, as that relation does not hold between Sumitomo’s act of saying and Mr. Hamanaka’s spending, but between the fact that Sumitomo is reporting and Mr. Hamanaka’s spending, i.e., between the perfect state whose category is implicated by (20b) and the content of (20a). Inferring the category of the perfect state is a prerequisite to the establishment of the relevant discourse relation, and the need to infer a category for the perfect state triggers this inference in a way a past tense form would not have.
4.1.3 Type (ii-b): Topic Negotiation QAP
Uses of the perfect of Type (ii-b) can also help make interrogative sentences and their answers cohere. Consider the question in (23) or the question-answer pair in (24).
(23) Have you done a lot of camping recently? (X = I want to talk about camping.) (Graff et al. 1998, sw2009.txt)
(24) a. Have you seen DANCING WITH WOLVES? (X = I want to talk about this movie.)
 b. Yeah. I’ve seen that, , that’s, uh, that was a really good movie. (Graff et al. 1998, sw2010.txt)
(25) ∀x∀y∀p(ask_whether_experienced (x, y, p) > ask_whether_know (x, y, p))
Rule (25) says that if x asks addressee y whether y has experienced p, normally x is asking whether y knows about p. As discussed before, if x asks whether y knows about p, it can, in turn, be defeasibly inferred that x wants to talk about p, as seen in rule (26) below. We propose that under such circumstances, question and answer pairs such as (24) form a Topic-Nego-QAP via the simplified rule in (27). Q and A in rule (27) are conversational events as defined in Poesio and Traum (1997) or goal-related speech acts as defined in Asher and Lascarides (2003). If α is the label of a DRS describing a conversational event Q of x expressing her desire to talk about p (indirectly expressed through the perfect form, here) and β is the label of a DRS describing a conversational event A of y informing x whether he wants to talk about p, then one can defeasibly infer that Q and A form a Topic-Negotiation-Question-Answer-Pair.
(26) ∀x∀y∀p(ask_whether_know (x, y, p) > want_to_talk_about (x, p))
(27) ∀α∀β∀x∀y∀p∀Q∀A ((express (Q, x, y, want_to_talk_about (x, p), α) ∧ inform_whether (A, y, x, want_to_talk_about (y, p), β)) > Topic-Nego-QAP (α, β))
Figure 4 is a simplified SDRS for the dialogue in (24). Although this paper cannot fully discuss discourse relations between speech acts, we can at least say that the choice of a perfect form in uses of Type (ii-b) facilitates the establishment of a Topic-Negotiation-QAP relation between the question and answer by introducing the perfect state and triggering the inference rule in (27).
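The chain of defaults from (25) through (26) to the Topic-Nego-QAP rule in (27) can be sketched as a simple forward-chaining loop. This is a toy encoding under stated assumptions: predicates are bare strings, their arguments x, y, and p are omitted, and the names DEFAULTS, chain, and topic_nego_qap are hypothetical.

```python
from typing import List

# Each default maps an antecedent predicate to its defeasible consequent.
DEFAULTS = {
    "ask_whether_experienced": "ask_whether_know",   # rule (25)
    "ask_whether_know": "want_to_talk_about",        # rule (26)
}


def chain(predicate: str) -> List[str]:
    """Apply defaults repeatedly and return every step of the inference."""
    steps = [predicate]
    while steps[-1] in DEFAULTS:
        steps.append(DEFAULTS[steps[-1]])
    return steps


def topic_nego_qap(question_pred: str, answer_addresses_topic: bool) -> bool:
    # Rule (27), simplified: the question (indirectly) expresses a wish to talk about p,
    # and the answer informs the asker whether the addressee wants to talk about p.
    return chain(question_pred)[-1] == "want_to_talk_about" and answer_addresses_topic


steps_24a = chain("ask_whether_experienced")   # the perfect question in (24a)
print(steps_24a)
print(topic_nego_qap("ask_whether_experienced", answer_addresses_topic=True))
```

A question whose predicate triggers none of the defaults, or an answer that does not address the topic, would leave the relation unestablished in this sketch.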
4.2 Type (iii)
In uses of the perfect of Type (iii), the value of X is found in the surrounding text, as discussed in Section 3. Interestingly, the perfect form is still instrumental in establishing the coherence of the discourse in which it occurs, even though the perfect state would still have been introduced into the discourse had a perfect form not been used. This is because the perfect triggers the retrieval of a commonsense rule that may constitute a needed premise to establish a discourse relation between the sentence that contains the perfect and the sentence that contains the state description that is the value of X. Consider the use of the perfect in discourse (28) (simplified from (13)). The perfect form triggers the search for the value of X and the retrieval of rule (29), from which the value of X is determined.
 π1, π2
 π1: ask_whether (x, y, π3: e: [y see p]); τ(e) ⊰ n; X(s); τ(s) ○ n
 π2: inform (y, x, π3); X′(s′); τ(s′) ○ n
 Topic-Nego-QAP (π1, π2)
 X = x_want_to_talk_about_p
 X′ = y_want(or not_want)_to_talk_about_p
Figure 4. SDRS for (24) (tentative).
(28) . . ., you can go around the world in 80 channels (= X). I’ve . . . watched, among other things, an Italian salute to mothers; Latin American telenovelas and variety shows; Greek movies; Japanese samurai epics and modern domestic dramas; Indian musicals; the evening news from Moscow; Chinese-language pop videos; Korean game shows; and France’s “Bouillon de Culture,” (Graff 1995–1997: Wall Street Journal, 07.01.1996) (29) ∀x (watch_international_programs (x) > can_go_around_the_world_in_80_channels (x))
The use of the perfect ’ve watched in the second sentence in (28) facilitates the establishment of an Evidence coherence relation between the SDRSs for the two sentences of (28), because the perfect triggers the retrieval of the commonsense rule in (29).
(30) Evidence Rule
 ∀α∀β∀P∀P′∀ev∀ev′((P(ev, α) ∧ P′(ev′, β) ∧ (P(ev, α) > P′(ev′, β))) > Evidence (α, β))
The Evidence Rule in (30) says that if the eventuality descriptions P and P′ in DRSs α and β are true of ev and ev′ and one can defeasibly infer P′(ev′, β) from P(ev, α), then α is evidence for β. In other words, if one makes two claims such that one can (defeasibly) infer the truth of the first from that of the second, the second claim is evidence in favor of the first claim. By evoking a rule on the basis of which one can defeasibly derive P′(ev′, β) from P(ev, α) (rule (29)), the perfect helps trigger the rule in (30) on which the coherence of the discourse in (28) partly rests. The SDRS for (28) is shown in Figure 5.
Atsuko Nishiyama and Jean-Pierre Koenig
π1, π2
π1: s, n
    You_can_go_around_the_world_in_80_channels (s)
    τ(s) n
π2: ev, s, n
    I_watch_Italian_salute, …etc. (ev)
    τ(ev) n
    X(s)
    τ(s) n
Evidence (π1, π2)
X = You_can_go_around_the_world
Figure 5. SDRS for (13).
Table 2 summarizes the various uses of the English present perfect we just reviewed and differences in the kinds of inferences addressees must perform when interpreting English present perfect forms.

Table 2. Inference types and discourse functions of English present perfects

                                     Type (i)              Type (ii-a)           Type (ii-b)        Type (iii)
General inference                    +                     +                     +                  –
Value X is in the surrounding text   –                     –                     –                  +
Perfect state                        is introduced         is introduced         suggests           already present
                                     implicitly            with qualification    a new topic        in discourse
Typical discourse function           To add discourse      To add discourse      To negotiate       To help establish
                                     relations in          relations in          a topic            primary discourse
                                     discourse             discourse                                relation
5. The Japanese perfect
This section examines uses of the Japanese nonpast perfect -te-i-ru (-te-i- + nonpast) to determine whether the results of our corpus study of the uses of the English present perfect extend to other perfect operators. The English and the Japanese perfect forms have similar meanings, but differ in one important respect: Japanese -te-i- forms can receive both progressive and perfect interpretations, while the English perfect has only perfect readings (Nishiyama 2006a). Japanese -te-i- forms might best be translated as English progressive perfect forms. Despite this difference, the Japanese -te-i-ru examples we collected have the same discourse functions our English present perfect examples had.
5.1 Summary of the Japanese -te-i-ru data
We collected Japanese -te-i-ru examples from two Japanese newspapers (Graff and Wu 1995), discussion articles from Aozora Bunko (www.aozora.co.jp), conversations from BS Archive (Ohori 1993), and narratives from three novels (Ooe 1959; Nitta 1973; Murakami 1985). The examples include both progressive and perfect readings and occurred in both main and subordinate clauses (including relative clauses), as well as in adjectival phrases that modified noun phrases. The data we report also include the form -te-ru, a frequent colloquial form of -te-i-ru. As was the case for our English sample, the perfect examples we examined were all the relevant examples in a pseudorandomly selected portion of each corpus (1186 examples in all). Table 3 shows the types of inference patterns needed to interpret -te-i-ru examples.

Table 3. Inference patterns of -te-i-ru

                  Type (i)       Type (ii-a)    Type (ii-b)   Type (iii)   Others
Newspaper*        337            110            0             0            0
Conversation**    125            4              0             0            1
Discussion***     235            14             0             0            0
Narrative****     358            2              0             0            0
Total (%)         1055 (88.95)   130 (10.96)    0 (0)         0 (0)        1 (0.09)

*Nihon Keizai Shimbun, 07.01.1994; Dow Jones Telerate/Kyodo News Service, 06.30.1995 (Graff and Wu 1995).
**Ohori (1993)
***Suzuki (1997), Takahashi (1997)
****Ooe (1959), Nitta (1973), Murakami (1985)
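The totals and percentages in Table 3 can be rechecked with a short Python sketch that sums the per-corpus counts and divides by the grand total. The variable names are illustrative only. Rounding each share independently to two decimals yields 0.08 for "Others", whereas the table reports 0.09; the reported figure may reflect an adjustment so that the shares sum to 100, which is an assumption here and not something the text states.

```python
# Counts per inference type, in the order Newspaper, Conversation,
# Discussion, Narrative, copied from Table 3.
counts = {
    "Type (i)":    [337, 125, 235, 358],
    "Type (ii-a)": [110, 4, 14, 2],
    "Type (ii-b)": [0, 0, 0, 0],
    "Type (iii)":  [0, 0, 0, 0],
    "Others":      [0, 1, 0, 0],
}

# Column totals and the overall number of examples.
totals = {label: sum(row) for label, row in counts.items()}
grand_total = sum(totals.values())

# Share of each inference type, as a percentage rounded to two decimals.
shares = {label: round(100 * t / grand_total, 2) for label, t in totals.items()}

for label in counts:
    print(f"{label:12s} {totals[label]:5d} ({shares[label]:.2f}%)")
print("All examples:", grand_total)
```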
Type (i): Progressives, Entailed Resultative Perfects, and Continuative Perfects.
Uses of Japanese -te-i-ru that belong to Type (i) include progressive, entailed resultative perfect, and continuative (progressive perfect) readings. All readings require an inference of persistence.10 Sentence (31) is an example of a progressive perfect (continuative) use.

10. The semantics of Japanese -te-i- differs from that of the English progressive form. Japanese -te-i- sentences require an inference of persistence to receive a progressive reading (Nishiyama 2006a), while English progressives do not.

(31) Mou- nannen-mo kore-o tsuka -tte-i- ru. (Ooe 1959, 26)
     Already years-as.long.as this-acc use -te-i- npst
     ‘I have been using this for years now.’ (X = I’m using this.)
Progressive, entailed-resultative, and continuative uses make up 88.95% of all our -te-i-ru examples. That is, 88.95% of -te-i-ru examples correspond to English Type (i) uses, as seen in Table 3: the interpretation of an even larger majority of Japanese nonpast perfect examples depends on an inference of persistence.

Type (ii-a): Speech Act/Epistemic-Evidential Uses.
Other Japanese nonpast perfect examples in our corpora turn out to belong to a single inference pattern, namely Type (ii-a) (10.96%). Example (32) illustrates such uses.

(32) Bei-seifu-wa genjoo-de-wa ‘nichi-bei- seifu
     u.s.-government-top current.situation-in-top ‘Japan-u.s.- government
     choutatsu kyoutei-ni ihan-su ru’ to mi- -te-i- ru.
     trading rules-dat violation-do npst’ comp judge -te-i- npst
     (Graff and Wu 1995)
     ‘The U.S. government has judged that in the current situation “(Japan’s free trading) violates WTO rules.”’
     (X = Japan’s trading violates WTO rules.)
In discourse (32) the author conveys to the reader that the complement of a speech act/epistemic verb (mi- ‘judge/regard’) is true. Determining the value of X relies on the same sincerity condition-based rules seen in the corresponding evidential uses of the English present perfect discussed in Section 3. To infer the value of X, readers of (32) rely on the default rule that if somebody gives her judgment and if we assume she fits our cultural model of communication and information (to simplify, she is trustworthy), her opinion is correct and what she says is (normally) true, as seen in rule (33).
(33) ∀x∀p ((express_judgement (x, p) ∧ trustworthy (x)) > p)
Rule (33) says that if x expresses her judgment about p or explains p, and x is trustworthy, normally p is true. Type (ii-a) uses of Japanese -te-i-ru cover examples that include verbs such as setsumei-su- (‘explain’), bunseki-su- (‘analyze’), yosoku-su- (‘predict’), kento-su- (‘consider’), shiteki-su- (‘point out’), iu (‘say’), and so on. Type (i) and Type (ii-a) uses of Japanese -te-i-ru amount to 99.91% of all our examples. In contrast to our English data set, the interpretation of our Japanese nonpast perfect examples only required the use of general default inference rules (the persistence rule or rules based on a speech act’s sincerity conditions).
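The defeasible character of a default such as (33) can be illustrated with a small Python sketch: the complement of the judgment verb is taken as the value of X only if the source counts as trustworthy and nothing already known contradicts it. The function perfect_state_value and its parameters are hypothetical names introduced for this illustration, and the simple membership test stands in for the nonmonotonic conditional ">" without reproducing its logic.

```python
def perfect_state_value(judge, proposition, trustworthy, contradicted=frozenset()):
    """Defeasible reading of rule (33).

    Returns the proposition as the value of X if the judge is assumed to be
    trustworthy and the proposition is not contradicted by known information;
    returns None when the default is blocked.
    """
    if judge in trustworthy and proposition not in contradicted:
        return proposition
    return None


# Example (32): the U.S. government judges that Japan's trading violates WTO rules.
x = perfect_state_value(
    judge="US_government",
    proposition="Japan's_trading_violates_WTO_rules",
    trustworthy={"US_government"},
)
print(x)
```

Passing the same proposition in contradicted, or leaving the judge out of trustworthy, blocks the inference, which is the sense in which the rule holds only "normally".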
5.2 The discourse functions of Japanese -te-i-
Our sample of Japanese nonpast perfect examples displayed the same range of discourse functions that their corresponding English examples did (to the extent there were examples of a particular type of present perfect use). For instance, the DRSs for the first and the second sentence in (34) (an example of Type (i) use) form a Narration relation and a Result relation, because of the relationship between the events described in the two sentences (the change in Japanese manufacturers’ ability to take orders for commercial satellites and the increased rarity of NASDA satellites). In addition, a Background relation can be established between the perfect state introduced in (34a) (there is currently no way for Japanese manufacturers to take orders for commercial satellites) and the overlapping event described in the second clause. Figure 6 is a simplified SDRS for (34).

(34) a. Nihon-no eisei-meekaa-wa, . . . jitsuyou-eisei
        Japanese satellite-manufacturer-top … commercial-satellite
        juchuu-no michi-o jijitsu-jou tozasare- -te-i- ru.
        taking.order-gen way-acc virtually close-caus- -te-i- npst
        ‘Japanese satellite manufacturers’ ability to take orders for commercial satellites has been virtually shut down.’
        (X = There is no way for Japanese manufacturers to get orders for commercial satellites.)
     b. . . . NASDA-no eisei-seisaku-wa . . . kazusukunai
        . . . nasda-gen satellite-production-top . . . rare
        uchuu-kanren-gijutsu-o chikuseki-suru-ba-to na-ru.
        space-related-technology-acc accumulation-do-place-dat become-npst
        ‘Making an NASDA (research) satellite (will) become a rare opportunity (for Mitsubishi Electronic) to build up space-related technology.’
π1, π2
π1: e, s, n
    J.M_have_a_way_closed (e)
    X(s)
    τ(s) n
π2: e, n
    Making_satellite_become_rare_opportunity (e)
    ¬τ(e) n
Narration (π1, π2)
Result (π1, π2)
Background (π1, π2)
X = There_be_no_way
Figure 6. The SDRS for (34).
In (35) (an example of Type (ii-a) use), the DRSs for the first and the second sentences form an Explanation relation, because of the causal relation that can be inferred between the eventuality described in the first sentence (consumption is not rising), and the perfect state described in the second sentence (two factors are negatively affecting that consumption). Figure 7 is a simplified SDRS for (35). (We assume with others that the progressive use of -te-i-ru is stative.)

(35) a. Kojin-shouhi-wa . . . moriagari-ni kake,
        Individual-consumption-top . . . upsurge-in lack
        ‘(The total amount of) consumption by individuals is not rising up, and
     b. dou-shiten-de-wa ‘. . . futatsu-no mainasu youin-ga
        that-branch-in-top ‘. . . two minus factors-nom
        hibii- -te-i- ru’ to shi- -te-i- ru.
        affect- -te-i- npst’ comp regard- -te-i- npst.
        ‘the branch (Bank of Japan) has judged that there are two factors that negatively affect (that consumption).’
        (X = Two factors are negatively affecting (that consumption).) (Graff and Wu 1995)

π1, π2
π1: s, n
    Individual_consumption_not_being_rising (s)
    τ(s) n
π2: e, s, n
    Branch_regard_two_minus_factors_being_affecting (e)
    X(s)
    τ(s) n
Explanation (π1, π2)
X = two_minus_factors_being_affecting
Figure 7. The SDRS for (35).
5.3 Differences between the Japanese and English samples
A striking difference between our sample of English present perfect examples and our sample of Japanese -te-i-ru examples is that there are no uses of Type (ii-b) in our Japanese sample. There are two reasons for this. First, because -te-i-ru is vague between a progressive and a perfect interpretation, asking about someone’s past experience is typically interpreted as a (perfect) progressive question. This is particularly true as there is another form koto-ga aru, which is exclusively used to ask about someone’s past experience (see the made-up example in (36)).

(36) Panda-o mi- ta- koto-ga arimasu-ka.
     Panda-acc look.at- pst comp-nom exist.polite-q
     ‘Have you looked at a panda?’
Second, even when a sentence contains -te-i-ru and the verb shi- (‘know (get to know)’) and serves a Topic Negotiation function, we categorized it as a Type (i) use, not as a Type (ii-b) use in Table 3. Our decision was based on the fact that addressees only need draw an inference of persistence to interpret the value of X, since the entailed state of someone getting to know someone or something still persists. Consider the examples in (37) which we categorized as examples of Type (i), despite the fact that they are part of a Topic Negotiation QAP pair.

(37) a. A: Kore shi- tte- ru?
           This know te-(i)- npst
           ‘Do you know this?’ (X = You know this.)
     b. C: Minna shi- tte- ru tte.
           Everyone know- te-(i)- npst prtcl
           ‘Everyone knows it.’ (X = Everyone knows it.)
     (Ohori 1993, RK.data-04)
In discourse (37) speaker A asks whether the addressees know about something, implicating that she wants to talk about it. After speaker C accepts the topic by saying everyone knows about it, speaker A starts talking about the topic. The same rule we used in the context of English Type (ii-b) examples can be used here. If the speaker wants to know whether the addressees know something (and thus can talk about it), she probably wants to talk about it (see (12)). The inference rule in (12) ensures that discourse (37) can be part of a Topic-Negotiation-QAP pair. Thus, although -te-i-ru examples never need to make use of speech-act inference rules to determine the underspecified value of the perfect state, the point of choosing -te-i-ru over a simple past form might be the same as in the corresponding English present perfect examples, namely to negotiate a topic.

Another difference between our English and Japanese samples is that there does not seem to be any -te-i-ru example whose interpretation requires the use of a commonsense rule (uses of Type (iii)). It is hard to know how to interpret this difference. First, English examples of Type (iii) are also quite rare (only 4.46%). It is therefore possible that the fact we did not find any corresponding example in our Japanese sample is due to the vagaries of sampling. Second, such uses are not impossible and are attested in written texts. In discourse (38) a clause containing -te-i- (whose DRS is labelled as π2) serves as evidence of the truth of what was claimed in the preceding context (and which corresponds to the value of X (38a)). The context, which we omit for brevity, states that the manager knows about crimes (in parentheses in the translation). Given the context, the fact that the culprit made a basic mistake that only those who have never committed a crime would commit can serve as evidence that the manager is not the culprit.

(38) a. . . . Kanri-nin-no A-san-wa hannin-ja arimasen.
        . . . Manager-gen A-Mr.-top culprit-cop.top -neg.
        ‘Manager Mr. A is not the culprit.’
     b. . . . shohan-mo yara nai youna misu-o ya
        . . . first.time.criminal-too do neg like mistake-acc do
        -tte-i- ru deshou. (Yura 1985)
        -te-i- npst tag
        ‘. . . (the culprit) made a basic mistake which only those who have never committed a crime would make, right?’
        (Context: The manager has a prior criminal record and is knowledgeable about how to commit a crime.)
        (X = The manager is not the culprit.)
The presence of X(s) in the semantic representation of clause (38b) triggers the search for the value of X and the retrieval of the commonsense rule in (39). Rule (39), in turn, facilitates the establishment of an Evidence relation between the discourse segments π1 and π2 via rule (30). Figure 8 shows the resulting SDRS for discourse (38).
(39) ∀x∀y ((make_basic_mistake (x) ∧ be_experienced/knowledgeable (y)) > ¬x = y)
π1, π2
π1: s, n
    Manager_A_be_not_culprit (s)
    τ(s) n
π2: e, s, n
    Culprit_make_basic_mistake (e)
    τ(e) n
    X(s)
    τ(s) n
Evidence (π1, π2)
X = Manager_A_be_not_culprit
Figure 8. The SDRS for (38).

In conclusion, Japanese -te-i-ru has the same range of interpretive possibilities as the English present perfect form, although the interpretation of -te-i-ru seems to rely even more on very simple default inferences. Table 4 summarizes the possible inference types and discourse functions of Japanese -te-i-ru perfect uses. Note that, although no example in our sample had to be categorized as a Type (ii-b) use (because the value of X could always be determined through an inference of persistence), -te-i-ru forms can be used to negotiate a topic. All in all, and despite the slight difference in meaning between the English and Japanese perfects, both forms help discourse coherence in very similar ways.

Table 4. Inference types and discourse functions of Japanese -te-i- perfects

                                     Type (i)                  Type (ii-a)               Type (iii)
General inference                    +                         +                         –
Value X is in the surrounding text   –                         –                         +
Perfect state                        is introduced             is introduced             already present
                                     implicitly                with qualification        in discourse
Typical discourse function           To add discourse          To add discourse          To help establish
                                     relations in discourse    relations in discourse    primary discourse
                                     (including Topic-Nego     (including Topic-Nego     relation
                                     QAP)                      QAP)
6. Summary
This paper examined a sample of over 600 English present and 1100 Japanese nonpast perfect examples to provide a partial answer to why speakers or writers choose a present or nonpast perfect form to describe a past eventuality. Simply put, the perfect can help establish discourse coherence or maximize discourse coherence. In many cases, the help that the perfect form provides is directly tied to the introduction of an additional eventuality into the Segmented Discourse Representation Structure. The more eventualities, the more semantic relations. The more semantic relations between eventualities, the more possible discourse relations between the SDRSs that describe these eventualities. But in some cases, the help comes from the fact that the category of the perfect state is underspecified semantically. That is, the rules themselves that are needed to determine the category of the perfect state might provide a premise that is needed to infer that, for example, an Evidence relation holds. The first kind of discursive help supports the claim that the perfect introduces a perfect state. Without this additional eventuality, no additional discourse relations would arise. The second kind of discursive help supports the claim that the category of the perfect state is semantically underspecified. Without the “firing” of rules used to determine the nature of the perfect state, some premises needed to derive discourse relations would be missing. More generally, the contrast between past tense and present perfect forms we discussed in this paper suggests that inferring discourse relations between discourse segments does not solely depend on their informational content. It also depends on the grammatical structure speakers or writers choose to communicate that content.
References
Asher, N. and A. Lascarides 2003. Logics of Conversation. Cambridge: Cambridge University Press.
Borillo, A., M. Bras, A. L. Draoulec, L. Vieu, A. Molendijk, H. De Swart, H. J. Verkuyl, C. Vet, and C. Vetters 2004. Tense, connectives and discourse structure. In Handbook of French Semantics, H. De Swart and F. Corblin (eds.), 309–348. Stanford: CSLI Publications.
Cather, W. 1996. O Pioneers! Raleigh: Alex Catalogue. (http://www.netlibrary.com/).
Cooper, M. 1996. Native Americans’ future. In The CQ Researcher Online. (http://library.cqpress.com/cqresearcher/cqresrre1996071200).
Dahl, Ö. 1985. Tense and Aspect Systems. New York: Basil Blackwell.
de Swart, H. 1998. Aspect shift and coercion. Natural Language and Linguistic Theory 16, 347–385.
Depraetere, I. 1998. On the resultative character of present perfect sentences. Journal of Pragmatics 29, 597–613.
Graff, D. 1995–1997. North American News Text Corpus. Philadelphia: Linguistic Data Consortium, University of Pennsylvania.
Graff, D., A. Canavan, and G. Zipperlen 1998. Switchboard-2 Phase 1. Philadelphia: Linguistic Data Consortium, University of Pennsylvania.
Graff, D. and Z. Wu 1995. Japanese Business News Text. Philadelphia: Linguistic Data Consortium, University of Pennsylvania.
Hardt, D. 1999. Dynamic interpretation of verb phrase ellipsis. Linguistics and Philosophy 22, 185–219.
Hobbs, J., M. Stickel, D. Appelt, and P. Martin 1993. Interpretation as abduction. Artificial Intelligence 63, 69–142.
Ismail, H.O. 2001. Reasoning and Acting in Time. State University of New York at Buffalo. Ph.D. dissertation.
Kamp, H. and U. Reyle 1993. From Discourse to Logic, Part 1, 2. Studies in Linguistics and Philosophy. Dordrecht: Kluwer Academic Press.
Kay, P. and K. Zimmer 1978. On the semantics of compounds and genitives in English. Sixth California Linguistics Association.
Lascarides, A. and N. Asher 1993. Temporal interpretation, discourse relations and commonsense entailment. Linguistics and Philosophy 16(5), 437–493.
Mann, W.C. and S.A. Thompson 1988. Rhetorical structure theory: Toward a functional theory of text organization. Text 8(3), 243–281.
McCawley, J.D. 1971. Tense and time reference in English. In Studies in Linguistic Semantics, C. J. Fillmore and D. T. Langendoen (eds.). New York: Holt, Rinehart, and Winston.
McDermott, D. 1982. A temporal logic for reasoning about processes and plans. Cognitive Science 6, 101–155.
Michaelis, L.A. 1998. Aspectual grammar and past-time reference. Routledge studies in Germanic linguistics; 4. London, New York: Routledge.
Murakami, H. 1985. Sekai-no owari-to haadoboirudo wandaarando. Tokyo: Shincho-sha.
Nishiyama, A. 2006a. The meaning and interpretations of the Japanese aspectual marker -te-i-. Journal of Semantics 23, 185–216.
Nishiyama, A. 2006b. The semantics and pragmatics of the perfect in English and Japanese. Dissertation, University at Buffalo, the State University of New York.
Nishiyama, A. and J.-P. Koenig 2004. What is a perfect state? In WCCFL 23, University of California, Davis, 595–606. Cascadilla Press.
Nishiyama, A. and J.-P. Koenig 2006. The perfect in context: A corpus study. In Penn Working Papers in Linguistics Vol. 12.1: Proceedings of the 29th Annual Penn Linguistics Colloquium, A. Eilam, T. Scheffler, and J. Tauberer (eds.), University of Pennsylvania, 265–278.
Nitta, J. 1973. Koko-no Hito. Tokyo: Shincho-sha.
Ohori, T. 1993. Rikkyo-93. In BS Archive. Tokyo: University of Tokyo.
Ooe, K. 1959. Shisha-no ogori, Shiiku. Tokyo: Shincho-sha.
Partee, B. 1984. Compositionality. In Varieties of Formal Semantics: Proceedings of the Fourth Amsterdam Colloquium, F. Veltman (ed.), 281–311. Dordrecht: Foris Publications.
Pelletier, F. J. and N. Asher 1997. Generics and defaults. In Handbook of Logic and Language, J. v. Benthem and A. G. ter Meulen (eds.), 1125–1177. Cambridge: The MIT Press.
Poesio, M. and D.R. Traum 1997. Conversational actions and discourse situations. Computational Intelligence 13(3), 309–347.
Searle, J.R. 1969. Speech Acts. Cambridge: Cambridge University Press.
Suzuki, S. 1997. Media tai ‘Watashi’. Tokyo: The Expanded Books – J engine/Voyager Japan Inc. (http://www.aozora.gr.jp/).
Sweetser, E.E. 1987. The definition of lie. In Cultural models in language and thought, D. Holland and N. Quinn (eds.), 43–66. Cambridge, New York: Cambridge University Press.
Takahashi, Y. 1997. Ongaku-no Hanhouhouron josetsu. Tokyo: The Expanded Books – J engine/Voyager Japan Inc. (http://www.aozora.gr.jp/).
van Eijck, J. and H. Kamp 1997. Representing discourse in context. In Handbook of Logic and Language, J. v. Benthem and A. G. ter Meulen (eds.), 179–237. Cambridge: The MIT Press.
Yura, S. 1985. Sousou koushinkyoku satsujin jiken. Tokyo: Shincho-sha.
German right dislocation and afterthought in discourse Maria Averintseva-Klisch Univ. Tübingen1
I show that German right dislocation subsumes two distinct constructions, which I label right dislocation proper and afterthought. These differ in a number of prosodic, syntactic and semantic characteristics and also have different discourse-functional properties. Right dislocation marks a discourse referent as especially salient at the current stage of the discourse. This requires the fulfilment of certain anaphoric constraints on the following discourse. Afterthought is a local reference clarification strategy and has no impact on the global discourse structure.
1. Introduction
German right dislocation is a construction which consists of an NP at the end of the clause and a coreferent proform inside the clause, as in (1). It is generally assumed that right dislocation is a strategy of spoken German, which enables the speaker to resolve a (pro)nominal reference that might be unclear to the hearer (Altmann 1981; Auer 1991; Selting 1994; Uhmann 1993, 1997; Zifonun/Hoffmann/Stecker 1997):

(1) a. Ich mag sie_i nicht, (ich meine) die Serena_i.
       I like her_i not, (I mean) the Serena_i
    b. Und dann passierte das Unglück_i, (ich meine) dieser
       And then happened the unfortunate-thing_i, (I mean) this
       schreckliche Autounfall_i.
       terrible car-accident_i
Resolution of an unclear reference is in fact the function of the right-peripheral NP in (1). There are, however, cases where it is not plausible to assume that right dislocation has the function of resolving a vague reference, because the reference is unambiguously clear, as in (2):

(2) a. (“Der Taifun!” rief Lukas dem Kapitän zu. “Da ist er!”).
       (“The typhoon!” called Lukas the captain to. “There is it!”)
       Ja, da war er_i, der Taifun_i.
       Yes, there was it_i, the typhoon_i
       [Ende, M.: Jim Knopf und die Wilde 13]
    b. (Den Tag, den vergess’ ich nicht,) der_i war viel zu
       (The day, d-pron forget I not,) d-pron_i was much too
       schön, der Tag_i.
       wonderful, the day_i
       [Altmann (1981:129)]

1. This paper is based on research conducted as part of my doctoral thesis, which was till March 2006 financially supported by the DFG within the graduate school Economy and Complexity in Language (Humboldt University Berlin / Potsdam University). I would like to thank my supervisor Claudia Maienborn (Univ. Tübingen) for her constant support and guidance, the audience of the workshop Constraints in Discourse for very stimulating feedback, and Manfred Consten (Univ. Jena) and Barbara Schlücker (FU Berlin) for helpful comments.
Instead, as I show in the following, the right dislocation in (2) has the function of marking the discourse referent (in the standard dynamic semantics sense, e.g., Karttunen 1976) that is going to be especially salient for the following discourse segment.2 In (2a), what follows is a description of the typhoon (terrible wind, dark waves, and so on). In (2b) one expects the speaker to supply more information about the unforgettable day.

In the following, I will argue that ‘reference clarification’ and ‘salience marking’ are not two different functions of one construction, but that there are actually two different constructions that have been subsumed under the label of German right dislocation. I name the salience-marking construction as in (2) right dislocation proper (in the following: right dislocation, RD) and the reference-clarifying one as in (1) afterthought (AT).3 I will show that these constructions differ not only with respect to their discourse functions, but also in their prosodic, syntactic and semantic features.

The paper is organized as follows: in section 2 the prosodic, syntactic and semantic differences between RD and AT are briefly introduced. I show that RD is prosodically and syntactically part of its host sentence, whereas AT is an ‘orphan’ that gets integrated into its host sentence only at the level of the discourse. In section 3 I address the integration of AT in terms of SDRT (Asher/Lascarides 2003) and introduce the discourse relation Afterthought, which is responsible for this integration. Section 4 discusses in more detail the discourse function of RD, i.e., the marking of a discourse referent as being especially salient for the following discourse segment. Interesting parallels and differences between the discourse functions of the right and the left sentence periphery are dwelt upon in this context. Then I show how RD takes part in the subdivision of a discourse segment into its main story line and background. Finally, in section 5 the results are summed up and conclusions are drawn.

2. By discourse segment I understand intuitively a relatively small, thematically contiguous part of a discourse; roughly, a discourse segment is minimally an utterance, or, as is more often the case, several interrelated utterances. In written discourse, a discourse segment corresponds rather often to a paragraph, cf. Goutsos (1997).
3. I use the term afterthought for this construction for two reasons. First, it is traditionally used in the literature exactly to denote the reference-clarifying NP additions to the right of the clause (cf. e.g., Ward/Birner 1996: 470, Fretheim 1995: 31). Second, by using this term I assume that reference-clarifying additions are actually a subtype of the rather heterogeneous group of syntactically free additions coming after the sentence and supplying additional information about some referent in the sentence (in this broader sense the term afterthought is used e.g. in Cann/Kempson/Otsuka 2002: 20).
2. Right dislocation vs. afterthought: formal differences
As stated above, RD and AT differ crucially in their prosody and syntax. In the following I will introduce and illustrate these differences. First, prosody: RD is prosodically integrated into its host sentence (3a), i.e., it continues the tone movement of the host sentence and thus does not build a prosodic unit of its own,4 whereas AT builds a prosodic unit (optionally separated from the clause by a pause) with a tone movement and a clause-like accent of its own (3b):

(3) a. [Ich MAG sie_i nicht, die Brigitte_i].            RD
    b. [Ich MAG sie_i nicht], | [die BriGITte_i].        AT
       I like her_i not, the Brigitte_i
    (|: pause; [ ]: prosodic unit; CAPITALS: main accent)
Prosodic differences go along with syntactic differences: RD is syntactically and prosodically part of its host sentence, whereas AT is an independent unit. The differences are listed below. Firstly, for RD, morphological agreement between the clause-internal proform and the NP on the right is obligatory (4) (cf. also Altmann 1981). For AT, on the other hand, it is only optional, at least as far as gender agreement is concerned, as the examples in (5) show.

(4) a. (“Der Taifun!” rief Lukas dem Kapitän zu. “Da ist er!”)
       (“The typhoon!” Lukas called to the captain. “Here it comes!”)
       Ja, da war er_i, der Taifun_i / *das Unwetter_i.                        RD
       Yes, there was it_NOM_MASK, the typhoon_NOM_MASK / *the storm_NOM_NEUTR
    b. Ich mag ihn nicht, den Peter / *der Peter / diesen
       I like him_AKK not the Peter_AKK / *the Peter_NOM / [this
       unmöglichen Volltrottel / *dieser unmögliche Volltrottel.5              RD
       impossible idiot]_AKK / *[this impossible idiot]_NOM

(5) a. Und dann passierte das Unglück_i, (ich meine)
       And then happened [the unfortunate-thing]_NEUTR, (I mean)
       dieser schreckliche Autounfall_i.                                       AT
       [this terrible car-accident]_MASK
    b. Ich habe ihn_i vorhin gesehen, das Kleine_i von der
       I have him_MASK before seen, the little-one_NEUTR of the
       Nachbarn.                                                               AT
       neighbours

4. Altmann (1981) observes two prosodic patterns for what he calls German right dislocation, the integrated and the non-integrated one. However, as he does not differentiate between RD and AT, he does not explain his observation. Fretheim (1995) takes the prosodic criterion to be crucial for the distinction between RD and AT in Norwegian: he shows that prosodically integrated structures are RDs, and prosodically non-integrated ones ATs.
Secondly, optional additions like ich meine (‘I mean’) or also (‘that is’) are possible in the case of AT (6a), where some of them serve to enhance the reference clarifying function (cf. Altmann 1981). On the other hand, additions are bad with RD (6b).

(6) a. (Anna und Brigitte kommen morgen.) Ich mag sie_i nicht,
       (Anna and Brigitte are coming tomorrow.) I like her_i / them_i not,
       ich meine / also / tatsächlich, die Brigitte_i.                          AT
       I mean / that-is / really the Brigitte_i
    b. (“Der Taifun!” rief Lukas dem Kapitän zu. “Da ist er!”) Ja, da
       (“The typhoon!” Lukas called to the captain. “Here it comes!”) Yes, there
       war er_i, *ich meine / *also / *tatsächlich der Taifun_i.                RD
       was it_i, *I mean / *that-is / *really the typhoon_i
Thirdly, AT is not restricted to the position at the right periphery, but can occupy a variable position in its host sentence, cf. (7): the afterthought I mean Peter can come at the very end of the sentence (7a), immediately after the coreferent pronoun (7b) or between the temporal adverbial yesterday and the adverbial with effort (7c). RD, on the contrary, is only possible at the right periphery, so that (8a), but not (8b), is well-formed.6

(7) a. Ich habe ihn_i gestern nur mit Mühe wiedererkannt, | ich
       I have him yesterday only with effort recognized, I
       meine den Peter_i.
       mean the Peter
    b. Ich habe ihn_i, | ich meine den Peter_i, | gestern nur mit
       I have him, I mean the Peter, yesterday only with
       Mühe wiedererkannt.
       effort recognized
    c. Ich habe ihn_i gestern, | ich meine den Peter_i, | nur mit
       I have him yesterday, I mean the Peter, only with
       Mühe wiedererkannt.
       effort recognized
       (I hardly recognized him yesterday, Peter.)                             AT

(8) a. Ich kann ihn_i nicht leiden, den Peter_i.
       I can him not suffer, the Peter
    b. *Ich kann ihn_i, den Peter_i, nicht leiden.
       I can him the Peter not suffer
       (I don’t like him at all, Peter)                                        RD

5. The relative weight of the NP seems to play a role, improving the cases without congruence with AT. Still, even heavy NPs like the second variant this impossible idiot in (4b) do not allow incongruence with RD.
6. Here, and in the following, I use the prosodic structure as a diagnostic to distinguish between RD and AT. This means that for cases marked as RD I assume prosodic integration. So, (8b) is bad with the RD prosody, while it is well-formed when the NP builds a prosodic unit of its own as an AT construction.
Fourthly, subordinate clauses are not allowed between the clause-internal pronoun and the NP in a RD (9a), cf. Altmann (1981), while they are not at all problematic for AT (10):

(9)  (“Der Taifun!” rief Lukas dem Kapitän zu. “Da ist er!”)
     (“The typhoon!” Lukas called to the captain. “Here it comes!”)
     *Ja, da war er_i, den sie alle befürchtet haben, der Taifun_i.7   RD
     *Yes, there was it_i, whom they all were-afraid-of, the typhoon_i
(10) Ich mag die Frau_i nicht, die gestern hier war, | (ich meine) die Anna_i.   AT
     I like the woman_i not, who yesterday here was, (I mean) the Anna_i
Summarizing the findings in (3)–(10), one can see that there is ample evidence that RD belongs prosodically and syntactically in a much more straightforward way to its host sentence. Prosodically, RD is a part of its host sentence’s tone contour. Considered syntactically, RD is much more restricted in allowing insertions between
7. The utterance is well-formed if the subordinate clause does not intervene between the proform and the NP:
(a) Ja, da war er_i, der Taifun_i, den sie alle befürchtet haben.
    Yes, there was it_i, the typhoon_i, whom they all were-afraid-of
Maria Averintseva-Klisch
the host sentence and the RD-NP than is AT: RD does not allow subordinate clause insertion ((9)–(10))8 nor optional additions of any kind (6). Besides, RD occupies a fixed position in the host sentence (at its right periphery). Moreover, as shown in (4), the right-peripheral NP has to agree morphologically with the clause-internal pro-form, which also suggests that the NP is part of the clause, because morphological agreement is not expected to function across sentence boundaries.9 That leads to the assumption that RD is syntactically part of its host sentence, presumably a right adjunct to the IP.10 A full syntactic analysis of right dislocation would, however, exceed the limits of this paper.

AT, on the contrary, can vary its position in its host sentence (see (7)). Furthermore, AT does not strictly require morphological agreement between the NP and the clause-internal pronoun, and it allows various insertions between the host sentence and the AT-NP, e.g., additions like I mean etc. or subordinate clauses. All in all, AT appears to be syntactically fairly free. The prosodic and syntactic independence of AT leads to its analysis as an orphan (in terms of Haegeman 1991, Shaer 2003). An orphan is a phrase that is syntactically autonomous and gets integrated into its host sentence only at the level of discourse via some discourse relation.11

RD and AT also differ semantically in a crucial way. RD is much more restricted than AT as far as the semantic status of the NP is concerned. The RD-NP can only
8. The inability to insert a clause between the clause-internal coreferent proform and the NP has been analysed in the generative framework as the ‘right roof constraint’ (Ross 1967) or ‘upward boundedness’ (cf. e.g., Müller 1995). That means that the NP is analysed as being moved out of its host sentence, where a ‘pronominal copy’ (Altmann 1981) is left. Analysed like this, (9) gives strong evidence that right dislocation is syntactically part of its host sentence.
9. As Consten (2004: 90) notes, gender congruence of anaphora and its antecedent in general is easily violated, but only across sentence boundaries. Sentence-internally, gender agreement is expected.
10. I assume with Müller (1995) that the standard position for right adjunction is the IP adjunction.
11. This proposal has consequences for locating AT with respect to its host sentence. Obviously, it cannot be a right peripheral construction in the proper sense of the word, as the right periphery is a sentence-bound concept, and AT is not a part of its host sentence. Actually, in spoken and even in written language, AT often comes explicitly after a sentence boundary, marked e.g. by an intervention of another speaker, or, in the case of written language, graphically with an appropriate punctuation mark, e.g., a full stop in (a):
(a) (Der Koch war schon an Bord, der Matrose ebenfalls.) Er aß die Fliegen. Der Koch, nicht der Matrose.
    (The cook was already on board, and the sailor too.) He_i ate the flies. The cook_i, not the sailor.
    [from: Yann Martel: Schiffbruch mit Tiger: 364; I owe this example to Konstanze Marx]
refer to a definite specific individual (11a), whereas neither indefinite specific NPs (11b) nor any kind of quantificational NPs (11c) are possible:12

(11) a. Da kommt er_i schon wieder, der Peter_i / der blonde Mann_i / dieser blonde Mann_i.
        There comes he_i already again, the Peter_i / the blond man_i / that blond man_i
     b. Da kommt er_i schon wieder, *so ein Typ_i aus dem Tanzkurs.13
        There comes he_i already again, such a guy from the dancing-class_i
     c. Alle blonden Frauen sind für ihn wunderschön. Peter liebt sie_i, *alle blonden Frauen_i.14
        All blonde women are for him beautiful. Peter loves them_i, all blonde women_i
12. The requirement of specific individual reference for the RD-NP might be the reason why operator binding like in (a) is only marginally available for RD, even if considered syntactically nothing would prevent it: seine Frau (‘his wife’) here does not refer to a specific individual but to an ordered set (of women in relation to men), and this is against the restrictions on the RD-NP:
(a) ??Jeder Mann_i liebt sie, seine_i Frau.
    ??Every man_i loves her, his_i wife
13. Note that (11b) and (11c) would be well-formed as ATs, cf. (a) and (b):
(a) Da kommt er schon wieder, | ich meine so ein Typ aus dem Tanzkurs.
    There comes he already again, I mean so a guy from the dancing-class
(b) Peter liebt sie, | ich meine alle blonden Frauen.
    Peter loves them, I mean all blonde women
14. The discourse in (11c) would be perfectly well-formed without right dislocation, cf. (a):
(a) Alle blonden Frauen sind für ihn wunderschön. Peter liebt alle blonden Frauen.
    All blonde women are for him beautiful. Peter loves all blonde women
Grosz/Ziv (1994) state that in English, right dislocation cannot be used to refer to entities that were mentioned in the sentence immediately preceding the one with the right dislocation (Grosz/Ziv (1994: 190); see, however, objections in Ward/Birner (1996)). In German this is possible, cf. (b), so that this cannot be the cause of the ill-formedness of (11c):
(b) Verena ist für ihn die schönste. Peter liebt sie, die Verena.
    Verena is for him the prettiest. Peter loves her, the Verena
Besides, it is required that the referent of the RD-NP is discourse-old (in terms of Prince 1992), so that (12) with a discourse-new referent is bad. Discourse-old is understood here to include situationally evoked information (in terms of Prince 1981), as in (13):15

(12) A: Und wie geht die Festvorbereitung?
     B: Ich weiß nicht was ich noch versuchen soll. Ich kann einfach keine Jazz-Band für den Abend auftreiben.
     A: How are the festival preparations coming along?
     B: I don’t know what I should try next: I haven’t been able to get a jazz band for the evening.
     A: #Du könntest ihn_i fragen, diesen Chorleiter_i. Bestimmt kennt er jemanden.
     A: You could him_i ask, that choirmaster_i. Certainly knows he somebody
(13) Der_i spinnt doch, der Typ_i / dieser Schröder_i.
     He_i is-crazy sure, the guy_i / that Schröder_i
     (context: A and B are talking about linguistics. A sees a newspaper B has on his table with a picture of the German federal chancellor on the front page, points to it and comments on it)
AT, on the contrary, allows nearly all kinds of NPs, definite and indefinite, specific and non-specific, as well as quantificational ones (14):

(14) a. Sie_i kommt heute zum Abendessen, | ich meine Paula_i / eine Frau_i aus meinem Tanzkurs.
        She_i comes today to dinner, I mean Paula_i / a woman from my dancing-class_i
     b. Hast Du eins_i, | ich meine ein Euro-Stück_i?
        Have you one_i, I mean a euro piece_i
        (context: standing near a locker in a library)
     c. Sie_i sind Fleischfresser, | ich meine alle Löwen_i.
        They_i are carnivorous I mean all lions_i
Discourse-new information is also possible with AT, so that (15), contrary to (12), is well-formed:

(15) A: How are the festival preparations coming along?
     B: I don’t know what I should try next: I haven’t been able to get a jazz band for the evening.
     A: Du könntest diesen Typen_i fragen, | na, diesen Chorleiter_i. (Bestimmt kennt er jemanden.)
     A: You could that guy_i ask, interj this choirmaster_i (Sure he knows somebody.)

15. Situationally evoked entities also behave like discourse-old ones with respect to other linguistic diagnostics: e.g., situationally evoked information, as well as discourse-old information in the proper sense of the word, can be preposed in inversion, as Betty Birner pointed out to me.
The only condition for an AT-NP is that its reference should be clear enough to enable the AT to fulfil its function of reference repair, i.e., the referent should be easily identifiable by the particular NP expression. So, NPs with a vague or too general reference are dispreferred (cf. (16)).

(16) Ich habe Äpfel und Pflaumen gekauft.
     (I’ve bought apples and plums.)
     Die_i schmecken aber leider nicht, | (ich meine) die Äpfel_i / #die Früchte_i / #dieses Kernobst_i.16
     They_i taste but unfortunately not, (I mean) the apples_i / the fruits_i / those pip-fruit_i
A brief summary of prosodic, syntactic and semantic properties of RD and AT so far: RD is prosodically and syntactically part of its host sentence. Its discourse function is to mark a discourse-old referent having a definite specific individual reference as being especially salient for the following discourse. In section 4 I will focus on the discourse function of the RD. AT, on the contrary, is a prosodically and syntactically independent unit, i.e., an orphan. Its discourse function is to repair an unclear pronominal reference. Being an orphan, AT only gets integrated into its host sentence at the level of discourse (cf. Shaer/Frey 2004). In the following section I will develop a discourse relation Afterthought that attaches AT to its host sentence.
3. The discourse relation Afterthought

I argue that there is a special discourse relation for attaching AT to its host sentence, which I call Afterthought. The following analysis is done in terms of Segmented Discourse Representation Theory (SDRT), Asher/Lascarides (2003). For SDRT, a crucial assumption is that contents of utterances building up a discourse are related to each other via discourse relations. Asher/Lascarides (2003) propose a number of such discourse relations, e.g., Narration, a coordinating discourse relation that combines two utterances whose eventualities occur in the sequence in which they are described, as in Max came in. He sat down. Discourse relations can be coordinating, as in the case of
16. The latter remains a bad repair even if one assumes that the information that apples are pip fruit and plums are not is known to the hearer. Still, such repair requires too much effort from the recipient and is for this reason dispreferred.
Narration, where two eventualities described are at the same level of detail, or subordinating, when the second constituent provides more details about the eventuality of the first one without bringing the flow of narration any further. An example of a subordinating discourse relation is Elaboration as in Max had a great meal. He ate salmon., where the second utterance provides more details about the eventuality of the first utterance (cf. Asher and Lascarides (2003), ch. 4). Discourse relations hold between informational units, most often contents of utterances, but also contents of bigger chunks of discourse might participate in a discourse relation.17

In the following I argue that AT requires a special discourse relation Afterthought that attaches AT to its host sentence. As shown above, AT repairs an insufficient (pro)nominal reference. In other words, it provides a characteristic of a discourse referent, which helps to identify the referent in question in the discourse model, as in (17):

(17) a. Er hat angerufen, | (ich meine) Dein Chef! Es klappt!
        He_i has phoned, (I mean) your boss_i! It works-out
        (context: A to B just after having laid down the receiver)
     b. A: (Serena und Teresa kommen auch mit.) B: Ich mag sie nicht, | (ich meine) Serena.
        A: (Serena and Teresa are coming too.) B: I like her_i / them_i not, (I mean) Serena_i
In most cases, anaphoric pronouns like er (‘he’) or sie (‘her’) in (17) are used when the resolution of the anaphor is unproblematic. In some cases, however, as in (17), there is either none immediately apparent (17a) or more than one equally plausible candidate antecedent for the anaphoric expression (17b). That is why the speaker decides to resolve the unclarity explicitly by supplying what he believes to be an unambiguous identification for the referent, e.g., the mentioning of the unique relation to the hearer (17a) or of the proper name of the referent (17b). Reflecting this function of the right-peripheral NPs in (17), the discourse relation Afterthought is informally described in (18):

(18) Afterthought holds whenever the second constituent provides additional information about some discourse referent in the first constituent, in such a way that the information helps to identify this discourse referent.
17. A well-known example from Asher/Lascarides (2003) is (a), where utterances 2–5, as a whole, elaborate on the utterance 1, giving details of the great evening (the inner relations between the utterances 2–5 are neglected for the moment).
(a) 1. John had a great evening last night.
    2. He had a great meal.
    3. He ate salmon.
    4. He devoured lots of cheese.
    5. He then won a dancing competition.
    Elaboration (α, β), where α: 1, β: 2–5
Importantly, AT cannot be subsumed under any other discourse relation. At first glance, Elaboration might seem suitable here. However, Afterthought differs crucially from Elaboration in its impact on the truth conditions of the whole sentence: it is only the AT that makes it possible to establish the truth conditions for the utterance; because the reference is unclear, this is not possible before the AT is added.18 To be able to define the properties of Afterthought, it is necessary to find out whether it is a subordinating or coordinating relation. This can be tested with the help of the tests for subordination and coordination proposed by Asher and Vieu (2005) (cf. also Vieu/Prévot 2004). These tests show that Afterthought is a subordinating relation.19 The second important point is that Afterthought, unlike Elaboration, is a cognitive-level discourse relation20 in the terms of Asher/Lascarides (2003), which means that not only the contents of the clauses that are related are important, but also the
18. Asher and Lascarides (2003) state that “R is a distinct discourse relation only if there is evidence that it affects the truth conditions of the elements it connects, and these effects cannot be explained by other means” (Asher/Lascarides 2003: 145). That leads to the assumption of Afterthought as a separate discourse relation.
19. To illustrate the claim with one of the tests: 1. Attachment Test: given are two constituents, α and β, a relation R (α, β), and a possible extension with a constituent γ; the nature of R is to be tested. If it is possible to attach γ to α, then R is subordinating; if attachment is possible only to β, then R is coordinating. (cf. Asher/Vieu 2005: 600)
(1) a. Dann ist sie weggelaufen, (α)
    b. (ich meine) die Serena. (β)
    c. Das macht sie immer wenn sie wütend ist. (γ)
    Explanation (α, γ)
    ‘Then she ran away, (I mean) Serena. That’s what she always does when she is angry.’
The Attachment Test shows that Afterthought is subordinating. The remaining Continuation Test and Anaphora Test yield the same result, which is why I present only one of the tests here.
20. Asher and Lascarides (2003) distinguish between content-level discourse relations and cognitive-level discourse relations. For the former, it is only the content of the utterances building up a discourse that matters, as e.g., with Narration or Elaboration. For the latter, not only the content of the utterances, but also the intentions of the speaker and the hearer are important for defining their semantics. So, e.g., for a discourse in (a) it is assumed that the discourse relation NEI (Not Enough Information) connects two utterances, and the semantics of this relation is defined in the following way: NEI holds if the speaker of 2 has an intention of making clear with his utterance that he does not know the answer to 1, and thus cannot help the speaker of 1 to achieve his speech act related goal (SARG); in the case of (a) the goal being to learn who is coming to the party:
(a) 1: Who is coming to the party?
    2: I don’t know.
Nearly all discourse relations for dialogue are cognitive-level.
intentions of the participants. Crucial for producing an afterthought is the intention of the speaker to repair a reference he believes to be unclear for the hearer.21 With all this in mind, (18) can be made more precise in (19):

(19) Afterthought (α,β) is a cognitive-level, subordinating discourse relation, which holds whenever the speaker of α and β supplies β with the speech act related goal22 of clearing the reference of a discourse referent x that has been introduced in α by establishing a relation x=z, where z is a discourse referent introduced in β, and the reference of z in the discourse representation is assumed to be unambiguous.23
     [α: the host sentence; β: the afterthought; x, z: discourse referents]
That means that AT refers back to an element of its host sentence, whose reference it resolves. Note that the reference resolution is purely local, in that it does not affect the structure of the discourse as a whole. For example, in (20) the discourse segment is about a certain play with an actress playing the role of a nun, and the AT occurs in the utterance claiming that the actress was much more beautiful than the actual nun. The theme of the discourse segment is the play, and the AT I mean the nun does not change this; it does not affect the global structure of the discourse segment:

(20) ([. . .] und das ist es auch [. . .] was das Stück will, was man um so deutlicher sieht, als die Bethmann wirklich eine sehr hübsche Frau ist oder doch zum wenigsten viel hübscher,)
     ([. . .] and this is also [. . .] what is the point of the play, and one sees it even clearer, because the Bethmann is really beautiful, or at least much more beautiful)
     als sie_i wirklich war ich meine die Nonne_i, (was aber nichts schadet [. . .].)
     than she_i really was, I mean the nun_i (but it is not so bad [. . .].)
     [Newspaper Corpus of Bonn BZK: 2014916]
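Definition (19) can be read as a small decision procedure. The following Python sketch is only an illustration under simplifying assumptions and is not part of SDRT or of the analysis itself: the names `Referent`, `Constituent` and `afterthought` are hypothetical, "unambiguous reference" is modelled as a candidate set with exactly one member, and the speaker's intention (the cognitive-level component) is reduced to the requirement that α and β have the same speaker.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Referent:
    """A discourse referent with the entities it might denote (hypothetical model)."""
    name: str
    candidates: FrozenSet[str]

    def is_unambiguous(self) -> bool:
        # Assumption: a reference counts as unambiguous if exactly one candidate remains.
        return len(self.candidates) == 1


@dataclass
class Constituent:
    """A discourse constituent: alpha (host sentence) or beta (afterthought)."""
    label: str
    speaker: str
    referents: Dict[str, Referent]


def afterthought(alpha: Constituent, beta: Constituent,
                 x: str, z: str) -> Optional[Referent]:
    """Return x resolved via x = z if the conditions of (19) are met, else None."""
    # (19) speaks of "the speaker of alpha and beta": one and the same speaker.
    if alpha.speaker != beta.speaker:
        return None
    # x must be introduced in alpha, z in beta.
    if x not in alpha.referents or z not in beta.referents:
        return None
    target, repair = alpha.referents[x], beta.referents[z]
    # The reference of z must be unambiguous.
    if not repair.is_unambiguous():
        return None
    # Assumption: if alpha offers candidate antecedents, z must be one of them.
    # An empty candidate set models "no immediately apparent antecedent", as in (17a).
    if target.candidates and not repair.candidates <= target.candidates:
        return None
    # Purely local repair: only the reference of x is updated.
    return Referent(name=x, candidates=repair.candidates)


# Example (17b): 'sie' could be Serena or Teresa; the AT 'Serena' repairs it.
alpha = Constituent("alpha", "B", {"x": Referent("x", frozenset({"Serena", "Teresa"}))})
beta = Constituent("beta", "B", {"z": Referent("z", frozenset({"Serena"}))})
resolved = afterthought(alpha, beta, "x", "z")
print(sorted(resolved.candidates))  # ['Serena']
```

In this sketch a vague repair, comparable to the dispreferred #die Früchte in (16), fails simply because its candidate set still has more than one member; the function touches nothing but the referent x, mirroring the claim that the repair is local.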
21. In some cases, the hearer might make his inability to establish the reference explicit, and thus directly trigger the intention of the speaker (see also footnote 11).
22. Speech act related goal (SARG) is a goal that is either conventionally associated with a particular type of utterance (that is e.g., the case with AT, where an extrasentential NP is required) or is recoverable from the discourse context. E.g., the SARG of a question is to learn the answer to this question (cf. Asher/Lascarides (2003: ch. 7)).
23. According to the analysis presented here, corrections like (a) are a subtype of ATs (cf. ‘alien-initiated repair’ in the terms of Uhmann 1993):
(a) (A: Ann went to London. B: No, she didn’t, I just met her.) A: I meant Ann Smith.
Besides, it is possible that Afterthought is also used to attach other kinds of optional additions, such as certain kinds of appositions or non-restrictive relative clauses. However, this issue needs further investigation.
This means that AT is a backward-looking local repair strategy that does not influence the global discourse structure. Below I will show that for RD quite the opposite is true: it influences the global structure of its discourse segment.
4. The discourse function of right dislocation

It has been stated above that the function of RD is to mark the discourse referent that is going to be especially salient for the following discourse. In the following, I call this referent the discourse topic referent. I assume that for any given coherent discourse segment there exists such a discourse topic referent with which this discourse segment is concerned. Using the term discourse topic referent I assume the local concept of the discourse topic.24 The notion of discourse topic referent corresponds to the entity-based approach to discourse topic, which is advocated in Oberlander (2004).25 According to Oberlander, the only sort of discourse topic needed in addition to discourse relations for establishing coherence is an entity the discourse segment is ‘about’.26 The concept of ‘an entity the discourse segment is about’ matches the intuitive understanding one has about discourse. When questioned about the subject of the discourse (in the pretheoretical sense), e.g., “What were you talking / reading etc. about?”, spontaneous answers refer to entities (or, more precisely, to nominal discourse referents). So, some of the possible answers could be “We were talking about Woody Allen’s last movie / Anna’s wedding / my new colleague / our holiday plans / German right dislocation etc.”
24. In choosing this concept of discourse topic, I do not attempt a theoretical solution to the problem of the status of discourse topic, which has been extensively discussed in literature. See e.g., Keenan & Schieffelin (1976), Brown & Yule (1983/2004), Goutsos (1997) and, more recently, Büring (2003), Asher (2004), Kehler (2004), Oberlander (2004), Stede (2004) and Zeevat (2004), to name a few, for the questions of what a discourse topic is (some answers are: a proposition, a question the discourse answers, an entity etc.) and whether modeling of the discourse needs this concept in the first place.
25. The existence of some kind of entity that is most salient at a given stage of the discourse and that is relevant for establishing coherence seems to be the common point of the papers in the recent issue of Theoretical Linguistics dedicated to discourse topics (cf. “recurring sentence topic” in Oberlander (2004), “local topics within discourse segments” in Kehler (2004), “protagonist” in Zeevat (2004) and “Discourse topic 1” in Stede (2004)).
26. That corresponds at the level of the discourse to the notion of aboutness proposed by Reinhart (1981) at the sentence level for sentence topics.
In line with this intuitive understanding of the discourse topic, RD serves to mark some discourse referent as the discourse topic referent for the following discourse segment. Take, for example, (21) as an illustration:

(21) (Und als der König seine Frau verloren hatte, bedauerte ihn die Dutitre: “Ach ja, für Ihnen is et ooch nich so leicht [...].”)
     (And when the king lost his wife, Dutitre pitied him: “Dear me, I should say, for you things aren’t that easy either [. . .].”)
     Sie_i war ein Original, die Madame Dutitre_i.
     she_i was an original the Madame Dutitre_i
     (Sie verstand nie, warum man über ihre Aussprüche lachte. Sie war eben echt und lebte, wie alle wirklich originalen Menschen, aus dem Unbewussten. Kein falscher Ton kam deshalb bei ihr auf.)
     (She never understood why everybody always laughed at her remarks. She was genuine and lived unconsciously, as all unique people do. She never came across as being artificial.)
     [Fischer-Fabian, S., Berlin-Evergreen, Berlin: Knaur. 125]
Segment (21) is about a certain Madame Dutitre. Madame Dutitre is available (and most salient) as the referent for the pronoun sie (‘she’) in the second sentence of (21) containing a RD. What RD does here is to mark that the following is about Madame Dutitre. Madame Dutitre is thus explicitly set as the discourse topic referent for the (sub)segment following the right dislocation.27 Importantly, in (21) no topic shift takes place, as Madame Dutitre is also the discourse topic of the preceding (sub)segment.

Here it is instructive to look at the left-peripheral constructions in German and to compare their discourse functions to that of RD. Frey (2004) shows that a left-peripheral construction called hanging topic is used in a similar way to mark the discourse topic referent for the following segment. That is why in the following I briefly introduce two German left-peripheral constructions, hanging topic and left dislocation, and show how they relate to RD.
27. The difference between using a RD or a sentence without RD like (a) in the same context is that although Madame Dutitre is in both cases the discourse topic referent, RD marks this fact explicitly, whereas in (a) this remains implicit:
(a) Madame Dutitre war ein Original. (Sie verstand nie, warum man über ihre Aussprüche lachte.)
    Madame Dutitre was an original. (She never understood why everybody always laughed at her remarks.)
4.1 To the left and to the right: left dislocation, right dislocation and hanging topic

Since Altmann (1981), two forms of left-peripheral NPs with a coreferent proform inside the clause are distinguished in German: left dislocation (LD) and hanging topic (HT) (‘free theme’ in terms of Altmann 1981), cf. (22a) vs. (22b):

(22) a. Den Otto_i→, den_i mag jeder.   LD
        The Otto_i, d-pron_i likes everybody
     b. Otto_i↓, | jeder mag ihn_i.   HT
        Otto_i, everybody likes him_i
     [Frey (2005: 20)] (→: progredient tone; ↓: falling tone; |: pause; capitals: main accent)
The common assumption (again, since Altmann 1981) is that the main formal difference between LD and HT is the clause-internal pro-form: LD only allows so-called ‘weak d-pronouns’ der, die, das as coreferent clause-internal pro-forms, whereas for HT different resumptive forms are possible (e.g., personal pronoun ihn (‘him’) in (22b)). As Frey (2004), (2005) and Shaer/Frey (2004) show, there are more important formal and functional differences between these two constructions. They amount to the following (cf. Frey 2004): On the one hand, LD is prosodically and syntactically part of its host sentence (cf. also Altmann 1981). Its function in the discourse is to mark the clause-internal pronoun as the sentence topic. HT, on the other hand, is an orphan. Its discourse function is to mark the introduction of a new discourse topic,28 so that the NP refers to that discourse topic. In (23), the discourse topic changes from Hans to the Berlin underground, and HT signals this change:

(23) (Hans ist ein richtiger Fan der Berliner U-Bahn. Deshalb reist er oft nach Berlin.)
     (Hans is a real fan of the Berlin underground. That’s why he rather often goes to Berlin.)
     Die Berliner U-Bahn_i, sie_i nahm 1902 ihren Betrieb auf. [. . .]
     the berlin underground_i it_i took 1902 its service on
     (Now, the Berlin underground, it started in 1902.)
     [Frey 2004, ex. (57)]
This means that RD and HT have in common that they both mark the discourse topic referent for the following discourse segment. The difference is twofold: firstly, in
28. Frey’s understanding of discourse topic as the “main theme of a Section of a text” (Frey 2004: 217) corresponds roughly to what I call the discourse topic referent in this paper.
the case of hanging topic the discourse topic referent in question is bound to change from the preceding segment, while RD does not have this additional requirement, as one can see in (21), where the discourse topic referent does not change.29 Secondly, the following segment includes the host sentence in the case of HT, and excludes it in the case of RD. It is important that although there are formal similarities between left and right dislocation in German (both are prosodically and syntactically integrated into their host sentence, while HT is not), functionally it is HT that RD corresponds to. Evidence for this claim comes from the option of interchanging the constructions in the same context: RD can be replaced by HT, but not by LD,30 as in (24):

(24) (Und als der König seine Frau verloren hatte, bedauerte ihn die Dutitre: “Ach ja, für Ihnen is et ooch nich so leicht [. . .].”)
     Sie_i war ein Original, die Madame Dutitre_i.   RD
     she_i was an original the Madame Dutitre_i
     (Sie verstand nie, warum man über ihre Aussprüche lachte. Sie war eben echt und lebte, wie alle wirklich originalen Menschen, aus dem Unbewussten. Kein falscher Ton kam deshalb bei ihr auf.)
     And when the king lost his wife, Dutitre pitied him: “Dear me, for you, things really aren’t that easy either [. . .].” She was unique, det Madame Dutitre. She never understood why everybody always laughed at her remarks. She was genuine and lived unconsciously, as all unique people do. She never came across as being artificial.
     a. Madame Dutitre_i↓, | sie_i war ein Original.   HT
        Madame Dutitre_i she_i was an original
     b. #Die Madame Dutitre_i→, die_i war ein Original.   LD
        the Madame Dutitre_i d-pron_i was an original
As shown above, RD in (24) marks that the referent of the NP in question is the discourse topic referent for the following segment. A hanging topic NP refers to the discourse topic referent for the following segment too, and that is why (24a) is equally
29. The difference between RD and HT is expected, according to Lambrecht (2001). Lambrecht shows that, cross-linguistically, left dislocation constructions (i.e., HT in the case of German, cf. Frey 2004) are used for the announcement or establishment of a new topic relation between a referent and a predication, while right dislocation constructions serve the continuation or maintenance of an already established relation. However, in the case of German the data show that for RD, the discourse topic referent is neither bound to be new nor necessarily old; that issue is left completely open.
30. Frascarelli/Hinterhölzl (2007) propose for Italian that left and right dislocation might both derive from the same deep structure. They state, however, that this analysis is not applicable to German. (24) supplies discourse-functional evidence against assuming one deep structure for left and right dislocation, in spite of some formal similarities between them.
possible here.31 Left dislocation (24b), however, is not suitable here: it can only locally mark the sentence topic, and that does not capture the suggestion that the whole segment, and not only this one sentence, is ‘about’ Madame Dutitre.32

To sum up: in spite of their formal similarities, left and right dislocation differ in a crucial way as far as their discourse function is concerned. LD is a local (i.e., sentence-internal) aboutness marker, whereas RD is a global one, in that it marks the discourse topic referent for the following discourse segment. In this sense RD parallels HT: both mark the discourse topic referent for the following segment. However, there is one important difference between RD and HT: RD does not impose any conditions with respect to the discourse topic referent in the preceding segment, while HT requires a change of the discourse topic referent. In the next section I will discuss in more detail what consequences the marking of a discourse topic referent by RD has for the structure of the following discourse.
4.2 Right dislocation and the ‘foreground’ vs. ‘background’ distinction

In order to understand the role of RD in the discourse some preliminaries are required: the distinction between foreground and background in a discourse segment, and the discourse relation Background (cf. Asher/Lascarides 2003, Vieu/Prévot 2004) that accounts for this distinction. I will now briefly introduce these. Background (in terms of SDRT) is a discourse relation that is responsible for subdividing a discourse into the foreground (main story line) and background (less

31. The difference between the RD in (24) and the HT in (24a) is whether the author wishes to mark a change of discourse topic or not. It depends on whether Madame Dutitre is understood to be the discourse topic referent for the preceding segment. The preceding segment goes as follows:
(a) Und als der König seine Frau verloren hatte, die beim Volke so beliebte Königin Luise, bedauerte ihn die Dutitre: “Ach ja, für Ihnen is et ooch nich so leicht, wer nimmt heute schon’n ollen Witwer mit sieben kleine Kinder?”
    And when the king lost his wife, the very popular queen Louise, Dutitre pitied him: “Dear me, for you, really, things aren’t that easy either, who would nowadays be willing to marry an old widower with seven little children?”
It is plausible to consider Madame Dutitre as the discourse topic referent of the preceding segment also; that would explain the choice of the RD. However, if the author intends the first segment as being about the king, then the hanging topic construction in (24a) would be preferred.
32. Altmann (1981: 88) states that certain kinds of HT may be best “paraphrased” as right dislocation; in this case the right dislocation also has the function that is otherwise ascribed to HT, i.e., “continuation of a previously introduced theme”, and not “disambiguation of a pronominal reference” that is according to Altmann (1981) typical for right dislocation. However, Altmann does not pursue this idea, and even generally considers HT (introducing a NP with a clear reference) and right dislocation (disambiguating an unclear reference) as being functionally complementary (Altmann 1981: 107).
Maria Averintseva-Klisch
important information about the state of affairs relating to the time interval of the main story line) (cf. Asher/Lascarides (2003: 460) for the exact definition). An example of Background is (25): (25) 1. A burglar broke into Mary’s apartment. 2. Mary was asleep. 3. He stole the silver. [Asher and Lascarides (2003: 166)]
In (25), the information of utterance 2, that Mary was sleeping, serves as a background for the main story line (a burglary in Mary’s apartment).33 It is important that Background, being a subordinating relation (cf. Vieu/Prévot 2004), ensures that the discourse referents in the foreground are always available for anaphoric reference.
33. The distinction of foreground vs. background corresponds roughly to the distinction between main structure and side structure in the discourse made by von Stutterheim/Klein (2002). According to von Stutterheim & Klein, the main structure is built of partial answers to the Quaestio, a (mostly) implicit question the discourse as a whole is answering; in other words, it is the main story line of a discourse (segment). Side structures include sentences that supply information that is not immediately relevant as a partial answer to the Quaestio. One function that Background might have is to attach a special subtype of side structure. This is the function of Background that is relevant for this paper. Background also has other functions; for instance, an important function of Background is to attach presuppositions (cf. Asher / Lascarides 2003: 239).
It has been claimed above that RD influences the structure of the following discourse segment. This happens in the following way: RD assists the division of a discourse segment into a main story line and background. More specifically, this means that, firstly, the sentence with the RD always belongs to the main story line; secondly, the RD signals that this main story line is not exhausted, but is going to be resumed in what follows. This is expected, as RD marks the discourse topic referent for the following segment. It follows that the utterances between the RD and the resumption of the main story line are understood as supplying background information, and thus coherence is maintained (see also Averintseva-Klisch 2007). Evidence for this claim comes from some peculiarities concerning anaphoric resumption, as in (26):
(26) 1. Hast Du schon das Neueste von Melanie_m gehört?
        Have you already the newest about Melanie_m heard
     2. Lisa_l hat gestern Geburtstag gefeiert und ein paar Leute eingeladen.
        Lisa_l has yesterday birthday celebrated and a couple people invited
     3. Die_m war auch eingeladen, die Melanie_m.
        She_m was also invited the Melanie_m
     4. Lisa_l hatte sehr leckeres Essen gekocht und sich_l echt was einfallen lassen für eine schöne Feier.
        Lisa_l had very tasty meal cooked and refl really something came-up-with for a wonderful party
     5. So typisch Lisa_l eben.
        So typically Lisa_l just
     6. Und sie hockte den ganzen Abend da mit unzufriedener Miene.
        And she_m crouched the whole evening there with unsatisfied face
     7. Und zu mir hat sie dann noch gesagt, dass die Feier ganz “uncool” war.
        And to me has she_m then yet said that the party totally uncool was
     1. Have you heard the latest about Melanie?
     2. It was Lisa’s birthday yesterday, and she threw a little party.
     3. She was also invited – Melanie.
     4. Lisa cooked a delicious meal and arranged everything wonderfully.
     5. It was Lisa at her best.
     6. And she sat the whole evening in a corner and sulked.
     7. And to me she said that the party was stupid.
What is important here is that for the anaphor sie (‘she’) in 6 (and 7) the reference to Melanie is not only possible, but even preferred to the reference to Lisa, although the NP Lisa is in linear terms nearer and thus (being also morphosyntactically suitable) should be preferred. This follows, however, from the structure of the discourse segment. The discourse structure inferred here is illustrated in (27). The main story line is about Melanie (and her bad manners at the party), and the utterances about Lisa are interpreted as the background of the main story line:
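(27) [Figure: inferred discourse structure of (26). Utterance 1 (Hast Du schon das Neueste von Melanie gehört?) stands at the top and is connected to the structure below it by an arc labelled Elaboration. The row labelled foreground contains utterances 2 (Lisa hat gestern Geburtstag gefeiert und ein paar Leute eingeladen), 3 (Die war auch eingeladen, die Melanie), 6 (Und sie hockte den ganzen Abend da mit unzufriedener Miene) and 7 (Und zu mir hat sie dann noch gesagt, dass die Feier ganz “uncool” war), with arcs labelled Narration. The row labelled background contains utterances 4 (Lisa hatte sehr leckeres Essen gekocht und sich echt was einfallen lassen für eine schöne Feier) and 5 (So typisch Lisa eben), attached by arcs labelled Background and Elaboration.]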
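The preference described for the pronoun in 6 can be made concrete with a small sketch. It is not part of the SDRT formalism: the class `Utterance`, the two functions, and the foreground/background labels assigned to utterances 3 to 5 of (26) are illustrative assumptions based on the analysis given here. The sketch contrasts a purely linear search for an antecedent with a search restricted to foreground utterances.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Utterance:
    number: int    # utterance number as in example (26)
    level: str     # "foreground" or "background" (assumed labels)
    referent: str  # the human referent named in the utterance


# Utterances 3-5 of (26); the pronoun 'sie' to be resolved occurs in utterance 6.
CONTEXT = [
    Utterance(3, "foreground", "Melanie"),  # marked by RD: "..., die Melanie"
    Utterance(4, "background", "Lisa"),
    Utterance(5, "background", "Lisa"),
]


def nearest_antecedent(context: List[Utterance]) -> Optional[str]:
    """Linear strategy: take the referent of the closest preceding utterance."""
    return context[-1].referent if context else None


def foreground_antecedent(context: List[Utterance]) -> Optional[str]:
    """Structural strategy: skip background utterances and take the referent
    of the closest preceding foreground utterance."""
    for utterance in reversed(context):
        if utterance.level == "foreground":
            return utterance.referent
    return None


print(nearest_antecedent(CONTEXT))     # Lisa
print(foreground_antecedent(CONTEXT))  # Melanie
```

Under these assumptions only the second strategy yields the reading reported as preferred for sie in 6.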
Maria Averintseva-Klisch
The anaphoric resumption occurs only at the foreground level, and the pronoun in 6 resumes the current discourse topic referent, which has been explicitly marked with a RD in 3, i.e., Melanie.
There are two constraints on Background in the SDRT analysis: firstly, the background utterance must denote a state. Secondly, the background state and the foreground event or state must overlap. Clause 5 fulfils both of these constraints. It refers to a (not explicitly specified) property of Lisa, which is a state and overlaps with the time interval of the main story line, the main story line being Melanie exhibiting bad manners at Lisa’s birthday party. Nevertheless, one might argue that there is a problem with clause 4. It describes the cooking of a delicious meal, which (as we infer from our knowledge of the world) took place before the party, and which is not a state but an activity. However, the tense form used here, past perfect, denotes not the action of cooking a delicious meal itself but its resultant state, i.e., the result of Lisa’s activities which contributed to a wonderful party. This resultant state overlaps with the time of the main story line. Thus, both constraints on Background are fulfilled.
To sum up: RD interacts in a special way with the foreground-background subdivision of a discourse segment. The RD itself and sentences resuming the discourse topic referent belong to the foreground, which enables the discourse topic referent to remain available for anaphoric resumption throughout the background part. The function of RD is thus twofold: from the point of view of production, RD assists in subdividing the discourse segment into a main story line and background. From the recipient’s point of view, RD helps to keep track of the main story line and to resolve anaphors in a way compatible with the intentions of the author.
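The two SDRT constraints on Background discussed above can be stated as a simple check. The following is a minimal sketch, not the SDRT definition itself: the `Eventuality` class, the function names, and in particular the clock times are hypothetical values chosen only to illustrate the reasoning about clauses 4 and 5.

```python
from dataclasses import dataclass


@dataclass
class Eventuality:
    label: str
    is_state: bool  # True for states, False for events or activities
    start: float    # hypothetical start time (hour of day)
    end: float      # hypothetical end time (hour of day)


def overlaps(a: Eventuality, b: Eventuality) -> bool:
    """True if the two time intervals share more than a boundary point."""
    return a.start < b.end and b.start < a.end


def satisfies_background(background: Eventuality, foreground: Eventuality) -> bool:
    """Check both constraints: the background must denote a state,
    and it must temporally overlap the foreground eventuality."""
    return background.is_state and overlaps(background, foreground)


# Invented times: the party (main story line) runs from 19:00 to 23:00.
party = Eventuality("Melanie sulking at the party", False, 19.0, 23.0)
cooking = Eventuality("Lisa cooking the meal", False, 15.0, 18.0)
meal_ready = Eventuality("resultant state: the meal is cooked", True, 18.0, 23.0)
typical_lisa = Eventuality("property of Lisa (clause 5)", True, 0.0, 24.0)

print(satisfies_background(cooking, party))       # False
print(satisfies_background(meal_ready, party))    # True
print(satisfies_background(typical_lisa, party))  # True
```

With these invented values, the cooking activity itself fails both constraints, whereas its resultant state passes them, which mirrors the argument made for clause 4.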
5. Summary and conclusions
In this paper, I considered the construction traditionally called German right dislocation. I argued that there are actually two different constructions, right dislocation proper and afterthought, which are both subsumed under this label. Evidence for the distinction between right dislocation and afterthought comes from prosody, syntax and semantics, as well as from the discourse functions they have. Right dislocation is prosodically and syntactically seen as part of its host sentence, i.e., a right IP adjunct (presumably as a result of a movement). Its function in discourse is to mark some discourse-old referent as the discourse topic referent for the following discourse segment (in the sense of the referent the following segment is about). This leads to the preference for definite specific NPs referring to individuals for RD. RD marking of the discourse topic referent has consequences for anaphoric resumption in the following segment and for the structure of the following segment as a whole. In other words, RD is a ‘forward-looking’ discourse device, issuing certain constraints on the subsequent discourse segment.
Afterthought is prosodically and syntactically independent of its host sentence and is integrated into the sentence at the level of discourse. In order to account for this kind of integration, I proposed a subordinating cognitive-level discourse relation Afterthought. Functionally, AT is a local repairing strategy, which is directed towards the host sentence, and which does not have any impact on the global discourse structure. In this sense, it is a ‘backward-looking’ discourse device. To conclude: In this paper, I showed that certain semantic constraints on the RD-NP seem to follow from its referring to the discourse topic referent. Observations of this kind might allow insights into the nature of the otherwise elusive pragmatic category of the discourse topic. Discourse topic is crucially important for the structure of the discourse model. In this sense RD in its function of marking the discourse topic referent is an explicit means revealing how the discourse model is built up.
References
Altmann, H. 1981. Formen der ‘Herausstellung’ im Deutschen: Rechtsversetzung, Linksversetzung, Freies Thema und verwandte Konstruktionen. Tübingen: Niemeyer.
Asher, N. 2004. “Discourse Topic”. Theoretical Linguistics 30: 163–201.
Asher, N. and Lascarides, A. 2003. Logics of Conversation. Cambridge: CUP.
Asher, N. and Vieu, L. 2005. “Subordinating and coordinating discourse relations”. Lingua 115: 591–610.
Auer, P. 1991. “Vom Ende deutscher Sätze”. Zeitschrift für Germanistische Linguistik 19: 139–157.
Averintseva-Klisch, M. 2007. “Anaphoric properties of the German right dislocation.” In: Anaphors in Text. Cognitive, formal and applied approaches to anaphoric reference, M. Schwarz-Friesel, M. Consten and M. Knees (eds.), 165–182. Amsterdam: Benjamins.
Brown, G. and Yule, G. 2004. Discourse analysis. Cambridge: CUP [reprinted from original of 1983].
Büring, D. 2003. “On D-Trees, Beans, and B-Accents”. Linguistics & Philosophy 26:5: 511–545.
Cann, R., Kempson, R. and Otsuka, M. 2002. “On Left and Right Dislocation: A Dynamic Perspective”. Ms., URL: http://semantics.phil.kcl.ac.uk/ldsnl/papers.
Consten, M. 2004. Anaphorisch oder deiktisch? Zu einem integrativen Modell domänengebundener Referenz. Tübingen: Niemeyer.
Frascarelli, M. and Hinterhölzl, R. 2007. “Types of Topics in German and Italian”. In: On Information Structure, Meaning and Form. Generalizations across languages, K. Schwabe and S. Winkler (eds.), 87–116. Amsterdam: Benjamins.
Fretheim, T. 1995. “Why Norwegian Right-Dislocated Phrases are not Afterthoughts”. Nordic Journal of Linguistics 18: 31–54.
Frey, W. 2004. “Pragmatic properties of certain German and English left peripheral constructions”. Linguistics 43/1: 89–129.
Frey, W. 2005. “Zur Syntax der linken Peripherie im Deutschen”. In: Deutsche Syntax: Empirie und Theorie. Symposium Göteborg 13–15 Mai 2004, Franz Josef d’Avis (ed.). Göteborg.
Goutsos, D. 1997. Modeling Discourse Topic: Sequential Relations and Strategies in Expository Text. Norwood: Ablex (= Advances in Discourse Processes LIX).
Grosz, B. and Ziv, Y. 1994. “Right Dislocation and Attentional State”. In: Proceedings of the 9th Annual Conference & Workshop on Discourse of the Israel Association for Theoretical Linguistics, R. Buchalla and A. Mittwoch (eds.), 184–199. Jerusalem: Akademon Press.
Haegeman, L. 1991. “Parenthetical adverbials: The radical orphan approach.” In: Aspects of modern English linguistics: Papers presented to Masatomo Ukaji on his 60th birthday, S. Chiba et al. (eds.), 232–254. Kaitakusha.
Karttunen, L. 1976. “Discourse referents”. In: Syntax and Semantics 7. Notes from the Linguistic Underground, J.D. McCawley (ed.), 363–385. New York: Academic Press.
Keenan, E.O. and Schieffelin, B.B. 1976. “Topic as a Discourse Notion: A Study of Topic in the Conversations of Children and Adults”. In: Subject and Topic, Ch. N. Li (ed.), 336–385. New York: Academic Press.
Kehler, A. 2004. “Discourse topics, sentence topics, and coherence”. Theoretical Linguistics 30: 227–240.
Lambrecht, K. 2001. “Dislocation”. In: Language Typology and Language Universals / Sprachtypologie und sprachliche Universalien. An International Handbook, M. Haspelmath et al. (eds.), 1050–1078. Berlin: de Gruyter.
Müller, G. 1995. “On Extraposition & Successive Cyclicity”. In: On extraction and extraposition in German, U. Lutz and J. Pafel (eds.), 213–243. Amsterdam: Benjamins.
Oberlander, J. 2004. “On the reduction of discourse topic”. Theoretical Linguistics 30: 213–225.
Prince, E.F. 1981. “Towards a Taxonomy of Given/New Information”. In: Radical Pragmatics, P. Cole (ed.), 223–254. New York: Academic Press.
Prince, E.F. 1992. “The ZPG Letter: Subjects, Definiteness, and Information-Status”. In: Discourse Description. Diverse linguistic analyses of a fund-raising text, W. C. Mann and S. A. Thompson (eds.), 295–325. Amsterdam: Benjamins.
Reinhart, T. 1981. “Pragmatics and Linguistics: An Analysis of Sentence Topics”. Philosophica 27: 53–94.
Ross, J. 1967. Constraints on Variables in Syntax. Ph. Diss. Cambridge, Mass.: MIT.
Selting, M. 1994. “Konstruktionen am Satzrand als interaktive Ressource in natürlichen Gesprächen”. In: Was determiniert Wortstellungsvariation?, B. Haftka (ed.), 299–318. Opladen: Westdeutscher Verlag.
Shaer, B. 2003. “An ‘Orphan’ Analysis of Long and Short Adjunct Movement in English”. In: WCCFL 22 Proceedings, G. Garding and M. Tsujimura (eds.), 450–463. Somerville MA: Cascadilla Press.
Shaer, B. and Frey, W. 2004. “‘Integrated’ and ‘Non-integrated’ Left-Peripheral Elements in German and English”. ZASPiL 35, Vol. 2, Dez. 2004: 465–502.
Stede, M. 2004. “Does discourse processing need discourse topics?”. Theoretical Linguistics 30: 241–253.
Uhmann, S. 1993. “Das Mittelfeld im Gespräch”. In: Wortstellung und Informationsstruktur, M. Reis (ed.), 313–354. Tübingen: Niemeyer.
Uhmann, S. 1997. Grammatische Regeln und konversationelle Strategien. Fallstudien aus Syntax und Phonologie. Tübingen: Niemeyer.
Vieu, L. and Prévot, L. 2004. “Background in SDRT”. Workshop SDRT, TALN-04.
von Stutterheim, C. and Klein, W. 2002. “Quaestio and L-perspectivation”. In: Perspective and perspectivation in discourse, C.F. Graumann and W. Kallmeyer (eds.), 59–88. Amsterdam: Benjamins.
Ward, G. and Birner, B. J. 1996. “On the Discourse Function of Rightward Movement in English”. In: Conceptual Structure, Discourse and Language, A. Goldberg (ed.), 463–479. Stanford: Center for the Study of Language and Information Publications.
Zeevat, H. 2004. “Asher on discourse topic”. Theoretical Linguistics 30: 203–211.
Zifonun, G., Hoffmann, L. and Strecker, B. (eds.). 1997. Grammatik der deutschen Sprache. Berlin: de Gruyter. Vol. 1. (= IDS 7.1)
A discourse-relational approach to continuation
Anke Holler
University of Göttingen
In German, there exist two classes of non-restrictive relative clauses: continuative and appositive ones. The clauses of both classes are comparable in their syntactic and semantic behavior as they are both non-integrated clauses denoting a proposition. They differ, however, with respect to their discourse function. This was first observed by Brandt (1990), who discriminates non-restrictive relative clauses by their communicative weight. She claims that continuative relative clauses provide major information, whereas appositive relative clauses express minor information. This paper investigates to what extent Brandt’s notion of communicative-weight assignment can be couched in discourse-structural terms by exploiting the distinction between coordinating and subordinating discourse relations in the sense of Asher and Vieu (2005).
1. Introduction
In German, there exists an interesting subclass of non-restrictive relative clauses, dubbed continuative relative clauses, a term that emphasizes their specific discourse-functional role. Classic examples of this clausal class are given in (1).
(1) a. Oskar traf einen Bauern, den er dann nach dem Weg fragte.
       Oskar met a farmer whom he then for the way asked
       ‘Oskar met a farmer, whom he then asked the way.’
    b. Emma suchte eine Telefonzelle, die sie schließlich auch fand.
       Emma sought a phone booth which she finally PART found
       ‘Emma sought a phone booth, which she finally found.’
    c. Oskar machte einen Versuch, der aber restlos scheiterte.
       Oskar made an attempt which however completely failed
       ‘Oskar made an attempt, which however completely failed.’
    d. Emma hat es einer Freundin erzählt, die nun den Tratsch durch die ganze Stadt trägt.
       Emma has it a friend told who now the gossip throughout the whole town spreads
       ‘Emma told a friend the news, who now is spreading gossip throughout the whole town.’
Syntactically, continuative relative clauses behave like non-integrated clauses as they show typical root clause properties, cf. Holler (2005). Semantically, continuative relative clauses are introduced by an anaphoric pronoun and denote propositions, which is a consequence of their non-restrictiveness. This contrasts with restrictive relative clauses, which are usually analyzed as denoting properties. Basically, continuative relative clauses share these grammatical properties with “ordinary” non-restrictive relative clauses such as (2), which I will henceforth call appositive relative clauses.
(2) a. Oskar traf einen Bauern, der übrigens einen Strohhut trug.
       Oskar met a farmer who incidentally a straw hat wore
       ‘Oskar met a farmer, who incidentally was wearing a straw hat.’
    b. Emma suchte eine bestimmte Telefonzelle, die nie kaputt ist.
       Emma sought a certain phone booth which never out of order is
       ‘Emma sought a certain phone booth, which is never out of order.’
    c. Oskar machte diesen Versuch, der nicht scheitern konnte.
       Oskar made this attempt which not fail could
       ‘Oskar made this attempt, which could not fail.’
    d. Emma hat es ihrer Freundin erzählt, die sie übrigens schon lange kennt.
       Emma has it her friend told who she actually already for a long time knows
       ‘Emma told her friend the news, whom she has actually known for a long time.’
Thus, the question arises in which respect continuative relative clauses differ from appositive ones. Brandt (1990) addresses this issue and suggests that main and minor information is differently distributed in constructions containing continuative and appositive relative clauses. She uses the term ‘Kommunikative Gewichtung’ (communicative-weight assignment) to characterize this distributional difference and argues that a continuative relative clause provides main information. Consequently, the same communicative weight is assigned to a continuative relative clause and its host clause, whereas an appositive relative clause is weighted less than its host clause since it describes minor information. Brandt’s observation seems to be feasible. However, she does not give a theoretical explanation for the observed empirical facts. In particular, Brandt (1990) does not discuss how communicative-weight assignment can be embedded into a more general approach to information packaging.
In this paper I provide an analysis that copes with the empirical properties of the continuative relative clause construction by discourse-structurally expressing communicative-weight assignment. The analysis is based on the assumption that continuative relative clauses can be distinguished from appositive relative clauses by the way they are rhetorically connected to their host clause. While appositive relative clauses are linked with a subordinating rhetorical relation, continuative relative clauses are rhetorically attached by a relation that coordinates the relative clause with its host clause. This paper is organized as follows: Section 2 gives an overview of the relevant syntactic properties of continuative relative clauses. Section 3 reviews the approach proposed by Brandt (1990), which distinguishes continuative relative clauses and appositive ones pragmatically by their communicative-weight assignment. In section 4, I will argue for a discourse-structurally based distinction between continuative relative clauses and appositive relative clauses. In section 5, I will develop a formal analysis that accounts for both the syntactic and the discourse structural properties of the continuative relative clause construction. Section 6 concludes and summarizes the paper.
2. Relevant syntactic properties
As has been shown by Holler (2005), continuative relative clauses behave syntactically like non-integrated clauses, cf. Reis (1997). Evidence for this assumption comes primarily from the following syntactic properties: First, continuative relative clauses are prosodically and pragmatically independent from their matrix clause, which is indicated by an independent focus domain and an autonomous illocutionary force. This is exemplified in (3)1 and (4). If the construction in (3) is uttered with maximal focus in answer to the question of what happened, the sequence is not appropriate. Similarly, as (4) illustrates, an interrogative operator cannot take scope over the whole construction.
(3) Was ist passiert? (‘What happened?’)
    #[Oskar traf einen Bauern, den er dann nach dem Weg fragte.]F
    Oskar met a farmer whom he then for the way asked
(4) #Traf Oskar einen Bauern, den er dann nach dem Weg fragte?
    Met Oskar a farmer whom he then for the way asked
1. F marks the focus domain.
Secondly, continuative relative clauses are syntactically dispensable, as can be seen in (5).
(5) a. Oskar traf einen Bauern, den er dann nach dem Weg fragte.
       Oskar met a farmer whom he then for the way asked
       ‘Oskar met a farmer, whom he then asked the way.’
    b. Oskar traf einen Bauern.
       Oskar met a farmer
       ‘Oskar met a farmer.’
Thirdly, continuative relative clauses disallow variable binding from outside, as is demonstrated by (6).
(6) *Niemand_i traf einen Bauern, den er_i dann nach dem Weg fragte.
    Nobody met a farmer whom he then for the way asked
Fourthly, continuative relative clauses occur only at the very end of a complex sentence, as (7) shows.
(7) a. *. . ., weil Oskar einen Bauern getroffen hat, den er dann nach dem Weg fragte, als er allein unterwegs war.
       because Oskar a farmer met has whom he then for the way asked when he alone on the way was
    b. . . ., weil Oskar einen Bauern getroffen hat, als er allein unterwegs war, den er dann nach dem Weg fragte.
       because Oskar a farmer met has when he alone on the way was whom he then for the way asked
       ‘. . ., because Oskar met a farmer when he was on his way, whom he then asked the way.’
How these syntactic properties of continuative relative clauses can be captured is shown in section 5. In the next section, Brandt’s approach to continuative relative clauses shall be briefly reviewed.
3. Assigning communicative weight
Brandt (1990) assumes that main and subordinate clauses differ in their communicative potential. She further claims that a continuative clause2 has the same communicative potential as a main clause although it is subordinated in its syntactic form. Brandt (1990) understands communicative-weight assignment as marking the weight that a speaker intends to give to an independent information unit using linguistic means. The purpose of such an assignment is to facilitate the hearer’s reception of an utterance. For instance, syntactic subordination normally means that the information expressed by the respective clause is less weighted. In other words, two information units are considered as differently weighted if they are combined in a compound sentence (which means that one clause is subordinated to another), or they are considered as equally weighted if they are ordered sequentially.3
2. Brandt (1990) investigates several types of syntactically subordinated continuative clauses, such as continuative adverbial clauses and continuative relative clauses.
Further, Brandt (1990) relates the communicative-weight assignment to the pragmatic distinction between primary (= main) and secondary (= minor) information in a text. She claims that highly weighted information units usually express primary information, whereas secondary information is expressed by less weighted information units. It follows from this that subordinated clauses usually signal minor information.
Discussing the question in which respect continuative relative clauses differ from appositive ones, Brandt (1990) shows that continuative relative clauses bear the same communicative weight as their host clause and, hence, express major information.4 Her argumentation is based on two tests: a coordination test and the so-called dennoch (‘nonetheless’) test adopted from Pasch (1983). The coordination test exploits the fact that only a subset of the class of non-restrictive relative clauses can be paraphrased by a coordinated clause. As illustrated by the contrast between (8) and (9), non-restrictive relative clauses in an intermediate position cannot be transformed into a clause introduced by a coordination particle. They may, hence, not function as continuative clauses.5
(8) a.
Sie gab das Buch Emil, der es dann zur Bibliothek brachte.
she gave the book Emil who it then to the library took
‘She gave the book to Emil, who then took it to the library.’
b. Sie gab das Buch Emil, und er brachte es dann zur Bibliothek.
   She gave the book Emil and he took it then to the library
   ‘She gave the book to Emil, and he then took it to the library.’
(9) a. Ausgestopfte Wildenten und Eisvögel, die er an der Nordküste Finnlands gejagt hatte, sahen einen erschreckt von der Wand herab an.
       stuffed mallards and kingfishers which he on the north coast Finland hunted had looked you frightened from the wall down at
       ‘Stuffed mallards and kingfishers, which had been hunted on the north coast of Finland, looked frightened as they stared down from the wall.’
    b. *Ausgestopfte Wildenten und Eisvögel, und die hatte er an der Nordküste Finnlands gejagt, sahen einen erschreckt von der Wand herab an.
       stuffed mallards and kingfishers and they had he on the north coast Finland hunted looked you frightened from the wall down at
3. Note that communicative weight is assigned to independent information units.
4. Brandt (1990) admits that certain continuative clauses may express minor information.
5. All examples are cited according to Brandt (1990).
However, relative clause constructions like (10), though intuitively not continuative, pass the coordination test as well.
(10) a. Ich habe gerade den neuen Roman von ihm gelesen, der wieder sehr interessant ist.
        I have just the new novel by him read which again very interesting is
        ‘I have just read his new novel, which once again is very interesting.’
     b. Ich habe gerade den neuen Roman von ihm gelesen, und der ist wieder sehr interessant.
        I have just the new novel by him read and it is again very interesting
        ‘I have just read his new novel, which once again is very interesting.’
To solve this problem Brandt (1990) assumes with Posner (1979) that a coordination particle may have different pragmatic functions. A continuative relative clause construction is realized only if the coordination particle indicates a conversation6 in the relative clause’s paraphrase. Since this is not the case in (10), the non-restrictive relative clause is not classified as continuative. In addition to the coordination test, Brandt (1990) identifies continuative relative clauses by the so-called dennoch (‘nonetheless’) test originally developed by Pasch (1983). In this test, the right context of the relative clause is manipulated by introducing a sentence that starts with dennoch (‘nonetheless’). Brandt (1990) assumes that a non-restrictive relative clause conveys main information and is therefore continuative if dennoch (‘nonetheless’) can refer to it. This is demonstrated by the examples in (11) taken from Brandt (1990).
(11) a. Ich beobachtete Peter, der neben Anna saß. *Dennoch sagten sie kein Wort.
        I looked at Peter who next to Anna sat nonetheless told they no word
        ‘I looked at Peter, who was sitting next to Anna. *Nonetheless, they weren’t talking to each other.’
     b. Sie machten dann ihr Experiment, das auch gelang. Dennoch wurde die Untersuchung abgebrochen.
        they conducted then their experiment which part succeeded nonetheless was the investigation abandoned
        ‘They then conducted their experiment, which was successful. Nonetheless, the investigation was broken off and abandoned.’
6. Brandt (1990) uses the term ‘gesprächsandeutend’.
According to the dennoch (‘nonetheless’) test only (11b) is considered as a continuative relative clause since the investigation was abandoned although it had been successful. In (11a), however, dennoch (‘nonetheless’) cannot resume the proposition expressed by the relative clause. Summarizing the approach proposed by Brandt (1990), non-restrictive relative clauses differ in their pragmatic function. Only a continuative relative clause is characterized by a communicative-weight assignment that corresponds to that of a main clause. Although Brandt’s account seems to be empirically adequate, it is problematic in at least one respect: there is no independent explication of what it means to assign a communicative weight to a clause. Considering continuative relative clauses as an example, I will show next how communicative-weight assignment can be explained in a discourse-grammatical way. I will argue that to assign a communicative weight means to establish a discourse relation.
4. A discourse structural account
Given the textual autonomy of a continuative relative clause (which can be attributed to its root properties), the relation between a continuative relative clause and its host clause should not be described by means of sentence grammar, but rather as a relation that is subject to discourse-grammatical restrictions.7 A continuative relative clause gives additional information on the issue raised in its host clause. This information is asserted by the speaker. In this respect continuative relative clauses do not contrast with appositive ones. Nevertheless, there seems to be an intuitive difference between these two types of relative clauses. In the philologically oriented literature, it is claimed by most linguists that a continuative relative clause expresses a new thought or introduces a new state of affairs. Brandt (1990) tries to be a bit more precise, and she describes the intuitive difference between appositive and continuative relative clauses as a difference in communicative-weight assignment which operates at the level of speaker intention. She claims that a continuative relative clause provides the main information, whereas an appositive relative clause adds secondary information, meaning that only in the first case the same communicative weight is assigned to the relative clause and its host clause.
7. For a similar assumption regarding non-restrictive relative clauses see Sells (1985).
To date, however, Brandt’s notion of communicative-weight assignment lacks an explication. Therefore, I propose to explain communicative-weight assignment in terms of the way clauses are rhetorically connected. Assuming a hierarchical discourse structure,8 two clauses have the same communicative weight if they are rhetorically connected by a symmetric discourse relation that coordinates two discourse constituents. Two clauses differ in their communicative weight if they are rhetorically connected by a subordinating discourse relation. Applying this insight to non-restrictive relative clauses, a continuative relative clause is a relative clause that is rhetorically connected with its host by a coordinating discourse relation, whereas an appositive relative clause is a relative clause that is rhetorically subordinated to its host. On this analysis, continuative relative clauses show a uniform discourse structural behavior that contrasts with the behavior of appositive relative clauses.
I adopt the view of discourse structure taken in Segmented Discourse Representation Theory (SDRT; Asher 1993), a theory that offers a formal account of the hypothesis that discourse has a hierarchical structure upon which interpretation depends. On this view, a relation is considered subordinating only if one constituent discourse-dominates another. α ⇓ β is written to denote that α discourse-dominates β. A relation is considered coordinating, on the other hand, if neither constituent discourse-dominates the other, i.e., ¬(α ⇓ β) & ¬(β ⇓ α) holds. (12) gives the semantic definition of discourse dominance, cf. Asher (1993).
(12) α ⇓ β iff the main eventuality described in β is a subsort of the main eventuality described in α, or the proposition associated with β defeasibly implies that which is associated with α.
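The classification just described can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not part of SDRT: the dominance relation is stipulated by hand as a set of ordered pairs (a hypothetical stand-in for the semantic test in (12)), and the clause labels are invented for the example.

```python
# Hypothetical sketch: classify the link between two discourse constituents
# as subordinating or coordinating, given a discourse-dominance relation.
# The dominance facts are stipulated here; deriving them from eventualities,
# as the definition in (12) requires, is outside this sketch.

def classify(alpha, beta, dominance):
    """Return 'subordinating' if one constituent dominates the other,
    and 'coordinating' if neither does."""
    if (alpha, beta) in dominance or (beta, alpha) in dominance:
        return "subordinating"
    return "coordinating"

def same_communicative_weight(alpha, beta, dominance):
    """Two clauses carry the same weight iff their link is coordinating."""
    return classify(alpha, beta, dominance) == "coordinating"

# Assumed toy facts: the host 'host_app' dominates its appositive relative
# clause 'rc_app'; no dominance holds within the continuative pair.
DOMINANCE = {("host_app", "rc_app")}

print(classify("host_app", "rc_app", DOMINANCE))
print(classify("host_cont", "rc_cont", DOMINANCE))
```

On these assumed facts the first pair comes out as subordinating and the second as coordinating, which mirrors the appositive versus continuative distinction proposed in the text.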
At first glance, the basic rhetorical relation to describe communicative balance is Continuation, since it clearly fulfills the axiom in (13), stating that neither argument of this discourse relation may dominate the other one.
(13) Continuation(α, β) → ¬(α ⇓ β) & ¬(β ⇓ α)
Continuation is certainly the most general symmetric relation that can be established in a continuative relative clause construction. Additionally, more specific discourse relations coordinating two discourse constituents can be observed in this relative construction, as example (14) demonstrates.
(14) a.
Oskar machte einen Versuch, der aber restlos scheiterte.
Oskar made an attempt which however completely failed
‘Oskar made an attempt, which however completely failed.’
b. Emma hat es einer Freundin erzählt, die nun den Tratsch
Emma has it a friend told who now the gossip
8. The view that information in a discourse is richly structured is well established, cf. Grosz and Sidner (1986), Polanyi (1988), Mann and Thompson (1988), Asher (1993) among others.
A discourse-relational approach to continuation
durch die ganze Stadt trägt.
throughout the whole town spreads
‘Emma told a friend, who now is spreading gossip throughout the whole town.’
In (14a) the rhetorical relation Contrast is established, lexically indicated by the particle aber. In (14b), the continuative relative clause is attached to its host by the rhetorical relation Narration, since the topic time in the continuative relative clause, which is the time the assertion is made for (cf. Klein 1994), shifts with respect to the topic time of the host clause. Temporal progression is also made explicit by the cue phrases dann (‘then’) and nun (‘now’). A formalized analysis of (14a) and (14b) in the framework of Asher and Lascarides (1993) is given by (15a) and (15b). ME(α) stands for the main eventuality described in α.9 (15) a.
〈α, β〉 & make_an_attempt(α) & fail(β) > Contrast(α, β)
b. 〈α, β〉 & tell_a_friend(α) & spread_the_gossip_throughout_town(β) & ME(α) ⊰ ME(β) > Narration(α, β)
Besides the aforementioned examples, the remaining classic examples of a continuative relative construction given in (1a) and (1b) also fulfill the criterion that a coordinating discourse relation is established between the relative clause and its host. This is illustrated by (16).
(16) a.
〈α, β〉 & meet_a_farmer(α) & ask_the_way(β) & ME(α) ⊰ ME(β) > Narration(α, β)
b. 〈α, β〉 & seek_a_phone_booth(α) & find_a_phone_booth(β) & ME(α) ⊰ ME(β) > Narration(α, β)
All these examples clearly support the hypothesis that continuative relative clauses are discourse-structurally coordinated with their host clause. In each case, the relative clause is attached to its host by a coordinating relation that is subsumed by Continuation. Unlike continuative relative clauses, appositive ones elaborate the assertion of their host clause or provide secondary information describing an event or an individual. An appositive relative clause is, hence, discourse-structurally subordinated to
9. Asher and Lascarides (2003) use a slightly different formalization. For the purpose that shall be accomplished here, it seems to be adequate to stick to the version of Asher and Lascarides (1993). Naturally, it is possible to make the point in the Asher and Lascarides (2003) framework as well.
its host clause, which is expressed by a respective rhetorical relation. Elaboration, a complementary relation to Continuation, is the paradigm relation inducing subordination. It is defined as follows, cf. Asher (1993):
(17) Elaboration(α, β) iff (α ⇓ β ∨ (for every e ∈ ME(β) there is an e′ ∈ ME(α) such that e is a part of e′)) & β is more complex than α.
Example (18) illustrates how the appositive relative clause constructions in (2) are analyzed in the presented framework.
(18) a. 〈α, β〉 & meet_a_farmer(α) & wear_a_straw_hat(β) & ME(α) ⇓ ME(β) > Elaboration(α, β)
b. 〈α, β〉 & seek_a_phone_booth(α) & never_be_out_of_order(β) & ME(α) ⇓ ME(β) > Elaboration(α, β)
c. 〈α, β〉 & make_an_attempt(α) & can_not_fail(β) & ME(α) ⇓ ME(β) > Elaboration(α, β)
d. 〈α, β〉 & tell_a_friend(α) & know_for_a_long_time(β) & ME(α) ⇓ ME(β) > Elaboration(α, β)
Since the main eventuality of the relative clause is discourse dominated by the main eventuality of its host clause, the relative clause is categorized as appositive. Having identified the discourse functional difference between continuative relative clauses and appositive ones, I will now develop a formal analysis that accounts for both the syntactic and the discourse structural properties of the continuative relative clause construction. A single-layered constraint-based grammar like HPSG is well suited to implementing such an analysis.
5. HPSG analysis
In section 2, I argued that continuative relative clauses are syntactically non-integrated clauses. They share this property with appositive relative clauses, as has been shown by Holler (2005). I thus opt for a syntactic approach to continuative and appositive relative clauses that analyzes both of them as orphan constituents which are syntactically unattached.10 By providing additional information, orphaned clauses serve to form the discourse frame against which the proposition expressed in the host clause is evaluated. Thus, non-integrated (i.e., non-restrictive) relative clauses are attached in discourse albeit
10. The orphan analysis goes back to Haegeman (1991), who describes peripheral adverbials as orphans.
they are orphaned in syntax. In accordance with the empirical facts, this approach makes it possible to distinguish between continuative and appositive relative clauses discourse structurally without being forced to distinguish between them syntactically. The sign-based monostratal architecture of HPSG is very well suited to implementing the proposed analysis.
In HPSG, the modelling domain is a system of sorted feature structures. The linguistic theory specifies which feature structures are to be considered admissible. A grammar is formulated declaratively as a set of constraints on these feature structures. HPSG is sign-based in terms of its architecture. Signs are taken to be structured complexes of phonological, syntactic, semantic, discourse, and phrase-structural information, which is reflected in feature-value pairs representing this linguistic information. It is generally assumed that all signs possess at least the two features phon and synsem, which describe phonological information and a complex of syntactic and semantic information, respectively. That the feature structures employed in HPSG are sorted means that each node is labelled with a sort symbol that indicates which type of object the structure is modelling. The finite set of sorts is assumed to be partially ordered, which is depicted in a so-called sort hierarchy. The feature labels which can appear in a feature structure depend on its sort. This means that the ontological category of a linguistic entity determines its attributes.
Adopting a theory variant advocated by Sag (1997),11 it can be assumed that orphan constituents have to satisfy the requirements associated with a phrase of the sort head-orphan-phrase. This sort is newly defined as a subsort of headed-phrase in addition to the sorts head-nexus-phrase and head-filler-phrase, which represent syntactically linked structures such as head-argument, head-adjunct and head-filler structures.
The gist of the analysis developed here is that the modification relationship between the continuative relative clause and its host is not established in syntax, but rather at the level of discourse interpretation. To ensure this in the grammar, the constraint given in fig. 1 needs to be formulated. It says that the mod attribute of a head-orphan-phrase is restricted to the value none, which means that an orphaned constituent may not attach to another constituent. This explicates the fact that continuative relative clauses are syntactically non-integrated clauses.
head-orphan-phrase → [ … | mod none ]
Figure 1. Restricting the mod value of an orphan.
11. Sag’s analysis is construction-based, in the sense of allowing grammatical properties to be associated directly with constructions, rather than requiring that they are projected from lexical or grammatical formatives. Constructions are organized in sort hierarchies, where constraints on higher sorts are inherited by lower ones.
Further, fig. 2 shows that the content value of an orphan is unified with the background set of its head. This is expressed by tag 3 representing the respective shared structure. In this way it is possible to establish a relation between the relative clause and its host clause at the discourse level without being forced to establish a syntactic relation at the same time. The required structure sharing ensures that a continuative relative clause is discourse-structurally linked to its host clause, which means that the semantic information contributed by the continuative relative clause provides additional information for the interpretation of the host clause. In section 2 it has been shown that a continuative relative clause is not included in the host’s information structure and has illocutionary force of its own. These two facts are captured by manipulating the values of the information-structure attribute12 of an orphan and by further restricting its background set. As depicted in fig. 2, the information-structure value of an orphan has to be different from the information-structure value of its head. Similarly, the psoa object of sort intend in the background set of an orphan, which expresses, following Green (2000), the fact that a constituent has illocutionary force, differs from the one in the background set of the orphan’s head.
head-orphan-phrase →
  head daughter: [ head verb, information-structure [1], background { [3], [4] intend, … } ]
  orphan daughter: [ mod none, content [3], information-structure [2], background { [5] intend, … } ]
  ∧ [1] ≠ [2] ∧ [4] ≠ [5]
Figure 2. Restricting orphan constituents such as non-integrated clauses.
Continuative relative clauses as well as appositive relative clauses are typical orphan phrases, being syntactically non-integrated clauses. Thus, two subsorts of
12. This attribute has been introduced by Engdahl and Vallduví (1996).
head-orphan-phrase are defined which are called head-cont(inuative)rel(ative)-phr(ase) and head-app(ositive)rel(ative)-phr(ase). These sorts representing continuative relative clauses and appositive relative clauses, respectively, inherit all constraints associated with the sort head-orphan-phrase. This is depicted in fig. 3.
phrase
  non-hd-phrase
  hd-phrase
    hd-nexus-phr
    hd-orphan-phr
      hd-apprel-phr
      hd-contrel-phr
    hd-filler-phr
Figure 3. Partition of phrase.
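The inheritance behavior of the sorts in fig. 3 can be sketched in Python. This is a minimal illustration, not an HPSG implementation: the sort names follow the figure, while the parent table, the function names, and the informal string used to stand for the constraint on orphans are assumptions made for the example.

```python
# Hypothetical sketch of the sort hierarchy in fig. 3 with constraint
# inheritance. The encoding (a child-to-parent table, constraints as
# informal strings) is an assumption made for illustration only.

PARENT = {
    "non-hd-phrase": "phrase",
    "hd-phrase": "phrase",
    "hd-nexus-phr": "hd-phrase",
    "hd-orphan-phr": "hd-phrase",
    "hd-filler-phr": "hd-phrase",
    "hd-apprel-phr": "hd-orphan-phr",
    "hd-contrel-phr": "hd-orphan-phr",
}

# Constraints stated directly on a sort (informal labels only).
LOCAL_CONSTRAINTS = {
    "hd-orphan-phr": ["orphan's mod value is none"],
}

def supersorts(sort):
    """Yield the sort itself, then each ancestor up to the root."""
    while sort is not None:
        yield sort
        sort = PARENT.get(sort)

def is_subsort(sort, ancestor):
    """True if `sort` equals `ancestor` or lies below it in the hierarchy."""
    return ancestor in supersorts(sort)

def constraints(sort):
    """Collect constraints from the root downwards, inherited ones first."""
    chain = list(supersorts(sort))[::-1]
    return [c for s in chain for c in LOCAL_CONSTRAINTS.get(s, [])]

print(constraints("hd-contrel-phr"))
print(is_subsort("hd-apprel-phr", "hd-phrase"))
```

Both relative clause sorts pick up the constraint stated on hd-orphan-phr without restating it, which is the point of defining them as its subsorts.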
Now, the distinct discourse structural properties of continuative and appositive relative clauses can be described by representing rhetorical relations as members of a sign’s context background set, since this attribute takes as a value a set of parametrized states of affairs (also called psoas) corresponding to what are best thought of as the appropriateness conditions associated with an utterance of a given sort of phrase. Adapting a proposal by Pollard and Sag (1994, ch. 8) for analyzing semantic relations, all rhetorical relations introduced by Asher (1993) are defined as atomic subsorts of the sort quantifier-free parametrized state of affairs, abbreviated as qfpsoa. They are declared in a hierarchy of sorts as given in fig. 4, where they are partitioned as coordinating and subordinating, respectively, in the sense of Asher and Vieu (2005). Each sort representing a rhetorical relation bears two appropriate attributes, which are called constituent α and constituent β. The values of these attributes are of sort psoa, specifying the semantic content of the constituents that are connected by the represented rhetorical relation. In the case of a construction containing a non-restrictive relative clause, the constituent α value is token-identical with the content value of the host clause, and the constituent β value is shared with the content value of the relative clause.
qfpsoa
  coord-rhetorical-rel
    narrate
    contrast
    …
  subord-rhetorical-rel
    elaborate
    …
Figure 4. Representing rhetorical relations.
With this means at hand, the different discourse structural behavior of continuative relative clauses and appositive relative clauses can be captured by requiring that the background set of a relative clause of sort head-contrel-phrase contains a member of sort coord-rhetorical-rel, whereas the background set of a relative clause of sort head-apprel-phrase contains a member of sort subord-rhetorical-rel. This is implemented by formulating the constraints shown in fig. 5.
(19) Oskar traf einen Bauern, den er dann nach dem Weg fragte.
Oskar met a farmer whom he then for the way asked
‘Oskar met a farmer, whom he then asked the way.’
head-contrel-phrase → [ … | background { coord-rhetorical-rel, … } ]
head-apprel-phrase → [ … | background { subord-rhetorical-rel, … } ]
Figure 5. Discourse structural constraints on non-restrictive relative clauses
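The two constraints of fig. 5 can be read as a membership check on the background set. The following Python sketch shows that reading under simplifying assumptions: feature structures are flattened to plain dictionaries, only the relation sorts named in fig. 4 are listed, and the key names and predicate labels are hypothetical.

```python
# Hypothetical sketch of the fig. 5 constraints: a head-contrel-phrase needs
# a coordinating rhetorical relation in its background set, and a
# head-apprel-phrase a subordinating one. Dictionaries are simplified
# stand-ins for feature structures; key names are assumptions.

COORD = {"narrate", "contrast"}   # subsorts of coord-rhetorical-rel (fig. 4)
SUBORD = {"elaborate"}            # subsorts of subord-rhetorical-rel (fig. 4)

REQUIRED = {
    "head-contrel-phrase": COORD,
    "head-apprel-phrase": SUBORD,
}

def satisfies(phrase):
    """True if the background set holds a relation of the required kind."""
    allowed = REQUIRED[phrase["sort"]]
    return any(member["sort"] in allowed for member in phrase["background"])

# Example (19): host and relative clause are linked by narrate.
example_19 = {
    "sort": "head-contrel-phrase",
    "background": [
        {"sort": "intend"},
        {"sort": "narrate",
         "constituent_alpha": "meet_a_farmer",
         "constituent_beta": "ask_the_way"},
    ],
}

print(satisfies(example_19))
```

Relabelling the same structure as head-apprel-phrase makes the check fail, since narrate is not a subordinating relation.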
The proposed analysis is exemplified by the feature structure given in fig. 6, which represents the continuative relative clause construction of (1a) repeated here as (19). The continuative relative clause appears as a value of the orphan daughter of a phrase of sort head-contrel-phrase. It does not modify the head daughter syntactically since its mod value is specified as none. However, it is related to the head daughter by its content value which is unified with the head daughter’s background set. This is expressed by tag 4 .13 Furthermore, the host clause and the continuative relative clause are discourse-structurally connected since they are both constituents of a coordinating rhetorical relation which is a member of the background set of the phrase representing the whole construction.
13. I have nothing to say here about the question of how the values of the content and the information-structure attribute of the sign representing the continuative relative clause construction are related to the ones of its daughters, i.e., the relative clause and the host clause. To simplify matters I assume for both the content and the information-structure value that the sign’s value corresponds to the unified values of the daughters. For a discussion of semantic aspects of non-restrictive relative clauses see Holler-Feldhaus (2003) and Arnold (2004).
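head-contrel-phrase
  phon: Oskar traf einen Bauern den er dann nach dem Weg fragte
  head verb, content [3] ∪ [4], information-structure [1] ∪ [2]
  background { [4], [5], [6], narrate [ constituent α [3], constituent β [4] ], … }
  head daughter:
    phon: Oskar traf einen Bauern
    head verb, content [3], information-structure [1]
    background { [4], [5] intend, … }
  orphan daughter:
    phon: den er dann nach dem Weg fragte
    head verb, mod none, content [4], information-structure [2]
    background { [6] intend, … }
Figure 6. Feature structure describing (19).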
6. Conclusion
This paper has provided an account of German continuative relative clauses which deals with their main grammatical properties, and captures the similarities and differences between continuative relative clauses and appositive relative clauses. It has
been argued that continuative relative clauses and appositive ones are best treated as syntactic orphans that are only discourse structurally attached to their host clause. Continuative relative clauses and appositive relative clauses differ, however, in the way the relative clause is rhetorically connected to its host clause. Whereas continuative relative clauses establish a coordinating rhetorical relation to their host, appositive relative clauses are discourse structurally subordinated. This analysis provides a straightforward explanation for the observation made by Brandt (1990) that continuative relative clauses express main information, and appositive relative clauses convey secondary information. While Brandt (1990) attributes this difference to a communicative-weight assignment, this paper has outlined a discourse relational explanation which exploits independently motivated devices. The presented approach thereby makes the notion of communicative-weight assignment precise.
References
Arnold, D. (2004). Non-restrictive relative clauses in construction based HPSG. In S. Müller (Ed.), Online Proceedings of the Eleventh International Conference on HPSG. Stanford, CA: CSLI.
Asher, N. (1993). Reference to Abstract Objects in Discourse. Studies in Linguistics and Philosophy 50. Dordrecht: Kluwer.
Asher, N. and A. Lascarides (1993). Temporal interpretation, discourse relations and commonsense entailment. Linguistics and Philosophy 16, 437–493.
Asher, N. and A. Lascarides (2003). Logics of Conversation. Cambridge University Press.
Asher, N. and L. Vieu (2005). Subordinating and coordinating relations in discourse. Lingua 115, 591–610.
Brandt, M. (1990). Weiterführende Nebensätze. Stockholm: Almquist & Wiksell International.
Engdahl, E. and E. Vallduví (1996, May). Information packaging in HPSG. In C. Grover and E. Vallduví (Eds.), Edinburgh Working Papers in Cognitive Science, Vol. 12: Studies in HPSG, Chapter 1, pp. 1–32. Scotland: Centre for Cognitive Science, University of Edinburgh.
Green, G.M. (2000). The nature of pragmatic information. In R. Cann, C. Grover, and P. Miller (Eds.), Grammatical Interfaces in HPSG, Number 8 in Studies in Constraint-Based Lexicalism, pp. 113–138. Stanford: CSLI Publications.
Grosz, B. and C. Sidner (1986). Attentions, intentions and the structure of discourse. Computational Linguistics 12, 175–204.
Haegeman, L. (1991). Parenthetical adverbials: The radical orphanage approach. In Y. Fuiwara, N. Yamada, O. Koma, S. Chiba, A. Ogawa and T. Yagi (Eds.), Aspects of Modern English Linguistics, pp. 232–254. Tokyo: Kaitakusha.
Holler, A. (2005). Weiterführende Relativsätze. Empirische und theoretische Aspekte. Number 60 in Studia grammatica. Berlin: Akademie Verlag.
Holler-Feldhaus, A. (2003). Zur Grammatik der weiterführenden Relativsätze. Zeitschrift für Germanistische Linguistik 31(1), 78–98.
Klein, W. (1994). Time in Language. London and New York: Routledge.
Mann, W. and S. A. Thompson (1988). Rhetorical structure theory: toward a functional theory of text organization. Text 8(3), 243–281.
Pasch, R. (1983). Untersuchungen zu den Gebrauchsbedingungen der deutschen Kausalkonjunktionen da, denn und weil. Linguistische Studien Reihe A(104), 41–243.
Polanyi, L. (1988). A formal model of the structure of discourse. Journal of Pragmatics 12, 601–638.
Pollard, C. and I.A. Sag (1994). Head-Driven Phrase Structure Grammar. Chicago: CSLI Publications and University of Chicago Press.
Posner, R. (1979). Bedeutung und Gebrauch der Satzverknüpfer in den natürlichen Sprachen. In G. Grewendorf (Ed.), Sprechakttheorie und Semantik, pp. 345–385. Frankfurt/M.: Suhrkamp.
Reis, M. (1997). Zum syntaktischen Status unselbständiger Verbzweitsätze. In C. Dürscheid, K. H. Ramers, and M. Schwarz (Eds.), Syntax im Fokus. Festschrift für Heinz Vater. Tübingen: Niemeyer.
Sag, I.A. (1997). English relative clause constructions. Journal of Linguistics 33(2), 431–484.
Sells, P. (1985). Restrictive and non-restrictive modification. Technical report, CSLI, Stanford, CA.
German Vorfeld-filling as constraint interaction*
Augustin Speyer
University of Pennsylvania
The filling of the vorfeld (= clause-initial position in German declarative clauses) depends on information structural rather than strictly syntactic constraints. Referential phrases of one of the following three types are eligible for the vorfeld: scene-setting elements, contrastive elements and topics. The main point of this paper is to show that these types seem to be ranked: scene-setting elements are the most likely ones to appear in the vorfeld, followed by contrastive elements and finally by topics. Note that topics are thus not the preferred vorfeld-fillers even in German (see Speyer 2007; Frey 2004a). The difference in likelihood to be in the vorfeld can be modelled by an Optimality Theoretic account that is sketched out in this paper.
1. Introduction
German clauses have been described in traditional German linguistics by means of the so-called ‘Feldermodell’ or ‘field model’.1 This model makes crucial use of the fact that the verbal elements show strict constraints on their placement: They can appear either at the beginning of a clause (so-called ‘verb-first’ or V1-clauses), at a position after the first phrase of the clause (‘verb-second’ or V2-clauses) or at the very end of the clause (‘verb-final’ clauses, sometimes abbreviated VL for German verb-letzt).
* Earlier versions of this paper were presented at the Workshop for Dislocated Elements in Discourse at the Zentrum für Allgemeine Sprachwissenschaft in Berlin (November 28–30, 2003), at PLC 28 in Philadelphia (February 27–29, 2004) and at the Workshop ‘Constraints in Discourse’ in Dortmund (June 3–5, 2005). I wish to thank the participants of these workshops, especially Maria Alm, Werner Frey and Anita Steube. I also want to express my warmest thanks to Ellen Prince, Marga Reis, and two anonymous reviewers for their valuable comments and suggestions, and Jean-Francois Mondon for helping me with my English. All remaining mistakes are of course my responsibility.
1. For further discussion of the field model, see Grewendorf, Hamm, and Sternefeld 1987 and Reis 1987:147.
VL can be obscured by right-dislocated elements. From this distribution we get two potential positions for verbal material, one at the beginning of the clause with an optionally filled phrasal position before it, and one at the end. These two positions are called Linke / Rechte Satzklammer ‘left / right sentence bracket’.2 All material which is not part of the verb form is located either between the sentence brackets, before the left one or after the right one. These positions are referred to as Mittelfeld ‘middle field’, Vorfeld ‘pre-field’ and Nachfeld ‘post-field’, respectively. A schematic overview is given in (1).
(1) (Vorvorfeld):        coordinators; left-dislocated material
    Vorfeld:             1 phrase
    Linke Satzklammer:   finite verb; complementizer
    Mittelfeld:          n phrases
    Rechte Satzklammer:  rest of verbal complex; the entire verbal complex
    Nachfeld:            n phrases(?) (right-dislocated material)
I assume a grammatical model in which both the vorfeld and the left sentence bracket are filled by movement in these cases; all non-verbal elements have been base-generated in the mittelfeld, all verbal elements in the RSK (cf. Bach 1962; Koster 1975; den Besten 1983). I furthermore assume that at least in German there is no structural difference between clauses with the subject in the vorfeld and clauses with something else in the vorfeld (cf. den Besten 1983). We are mostly interested in sentences that have a vorfeld. The archetypical declarative main clause and the archetypical wh-question main clause are the most common clause types with a vorfeld. A typical example of a German declarative main clause is given in (2). (2)
Der Wähler hat dem Kandidaten nur zeigen wollen, wie sehr ihm Politik stinkt.
The voter has the candidate only demonstrate wanted how much him politics stinks
‘The voter only wanted to show the candidate, how tired he is of politics.’
2. The whole verbal complex is presumably generated in clause-final position, that is: in the right sentence bracket. If the left sentence bracket is already occupied by a complementizer (which presumably is also generated there), no part of the right sentence bracket can move. If the left sentence bracket is empty, the finite part of the verb form is moved there; if the verb form is only one word, the verb form as a whole moves there. The left sentence bracket cannot be left empty. The left sentence bracket corresponds to C, the vorfeld corresponds to Spec,CP in generative terms (Vikner 1995).
VF          LSK  MF                  RSK             NF
Der Wähler  hat  dem Kandidaten nur  zeigen wollen,  wie sehr ihm Politik stinkt
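The field analysis of example (2) can also be written down as a small data structure. The Python sketch below is purely illustrative: the field abbreviations are those of the table, while the dictionary layout and the function name are assumptions.

```python
# Hypothetical sketch: example (2) segmented into topological fields.
# Field labels follow the table (VF, LSK, MF, RSK, NF); the encoding as a
# dictionary and the helper function are assumptions for illustration.

FIELD_ORDER = ["VF", "LSK", "MF", "RSK", "NF"]

sentence_2 = {
    "VF": "Der Wähler",
    "LSK": "hat",
    "MF": "dem Kandidaten nur",
    "RSK": "zeigen wollen,",
    "NF": "wie sehr ihm Politik stinkt",
}

def linearize(fields):
    """Concatenate the filled fields in their fixed left-to-right order."""
    return " ".join(fields[name] for name in FIELD_ORDER if fields.get(name))

print(linearize(sentence_2))
```

Because the order of the fields is fixed, only the choice of what fills each field varies from clause to clause; the vorfeld slot is the one this paper is concerned with.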
Whereas in the case of wh-questions the vorfeld-filling is determined rather strictly – it is the wh-phrase which needs to stand in the vorfeld – in the case of declarative main clauses no such strict conditions seem to hold: Although the syntax of German main clauses requires the vorfeld to be occupied, it does not determine which constituent moves there.3 It is therefore reasonable to assume that the choice of the phrase which is moved to the vorfeld follows other, not strictly syntactic rules. A natural assumption, which I adopt here, would be that the choice reflects discourse requirements. These requirements are the topic of this paper.
The paper is organized as follows: In section 2 several theories about what could or should be in the vorfeld are presented briefly. Sections 3 and 4 refer to a corpus study that I undertook (501 tokens); section 3 states what kinds of phrases we actually find in the vorfeld, whereas section 4 addresses the (more interesting) question of what kinds of elements have a higher likelihood of appearing in the vorfeld than others. Since the discourse requirements responsible for vorfeld-filling are easiest to identify for referential expressions, and since these make up the largest part of vorfeld-fillers (405 out of a total of 501, that is roughly 81%), I confine myself in this paper to cases where the constituent occupying the vorfeld has a clear referent. The conditions under which non-referential expressions move to the vorfeld are left for future research.
2. Expectations about vorfeld-filling
Discussion of the filling of the German vorfeld in the traditional descriptive syntactic literature has been notoriously vague, to say the least. Behaghel (1932), e.g., says that certain classes of elements – which are more or less coextensive with the terms topics, contrastive and scene-setting elements, used in this paper – can occur in the vorfeld, but which of them has a higher likelihood than the others is never discussed.
3. Purely syntactic accounts have been proposed, too, e.g., by Frey (2004b), who assumes three vorfeld positions, SpecCP, SpecKontrP (for contrastive elements) and SpecFinP (for any element that is high in the middle field, either base generated high or scrambled), and derives effects similar to the ones discussed in this paper by A’-movement of phrases into these positions. In particular, certain adverbials and scene-setting elements are generated high in the mittelfeld (see Frey and Pittner 1998) and can move into the vorfeld because of that; likewise topics (which are moved to a topic position right below FinP). He has to make reference to discourse structural requirements, too, though, so the difference is perhaps not too large, and my account and Frey (2004b) probably turn out to be reconcilable.
In order to test whether we can do better than that, let us look at other languages. German is not the only language with a V2-syntax, which produces a clause structure in which the vorfeld is an issue at all. More or less closely related languages have some versions of V2 also, among which are English, Dutch, Yiddish and the Scandinavian languages. In some of these languages it is easier to determine what stands at the front than in German. Starting from these languages we can form some expectations about what we can suppose to find in the German vorfeld. Furthermore less closely related languages such as Czech which share with German the trait of a relatively free word order have been studied under functional perspectives e.g., by the Prague School, and their results have been claimed to be applicable also to German. This can function as a second source of expectations.
2.1 Subject as unmarked vorfeld filler
This is the assumption that clearly holds for English (with its obligatory subject-before-verb syntax) and has been argued to apply also to Dutch (Koster 1975; Travis 1984; Zwart 1997). The main argumentation, using a generative framework in Chomsky’s tradition, is as follows: The verb needs to be moved from V to I (or – in English – the inflectional markers from I to V, but this is beside the point) and the subject needs to move to Spec, IP in order to receive nominative case. If we assume that IP is to the left of VP, this movement suffices to give us a kind of V2 sentence, with the restriction that only subjects can stand before the verb. If something else is to be moved to the left of the verb, another projection needs to be opened to the left of IP (usually thought of as being identical to the CP – complementizer phrase – of subordinate clauses), the specifier of which is occupied by the non-subject ‘vorfeld’ element; the verb needs to move further to C in order to come again into second place. Under this view ‘topicalization’ sentences – i.e., sentences not starting with the subject – are structurally more complex than subject-initial sentences; subject-initial sentences are automatically more basic than ‘topicalization’ sentences, as ‘topicalization’ sentences are always derived from subject-initial sentences. A similar analysis was suggested for German already by Bach (1962), to whom virtually all subsequent treatments of German and especially Dutch word order refer in some way. Whereas, however, in Dutch it is possible to find arguments in favour of such an analysis, e.g., the position of subject clitics (see Zwart 1997), in German it is harder to find compelling evidence in favour of an analysis under which the subject in the vorfeld is more basic than other elements. By Occam’s Razor it is easier to assume one underlying clause structure for German than two, if there is no evidence for a twofold analysis.
Since there is no evidence for such an analysis, it is highly improbable that a German language learner would derive two analyses – one for subject-initial cases, one for all others – where s/he could do with only one. The subject, being the highest argument in the structure, still might be less marked than other cases, simply because it is the phrase base-generated closest
to the vorfeld. It has been noted (e.g., Molnar 1991:169f. with references) that the subject is more often in the vorfeld than any other part of speech. Whether this is a direct consequence of subjecthood or only indirectly connected is not clear, however.4
2.2 English Topicalization: poset-elements

Prince (1999) argues that topicalization in English – a construction which opens a ‘second’ preverbal position, so to speak, to the left of the subject; examples in (3) – depends on the notion of partially-ordered set (henceforth, poset). In particular, Prince (1999:7) proposes the following condition:

The topicalised element stands in a salient partially-ordered set (poset) relation to some entity evoked in the discourse.

The condition is to be read such that only poset elements may be topicalized. For the purposes of this paper, an informal treatment of the poset relation is sufficient; for a more formal discussion, the reader is referred to Hirschberg (1985:122) and Prince (1999:8). A poset relationship exists if the discourse representation contains a set of entities, explicit or implicit, and the topicalised element refers to a member of that set, as in (3a,b), or if a bona-fide set can easily be constructed. A poset relation also exists if the element in question is in contrast to some entity already evoked, as in (3c), or if it resumes a whole set already evoked, as in (3d).5

(3) a. ‘We’ve got Earl Grey, Ceylon, Lemon Ginger, Raspberry, Rose hip. Which’d you like?’ – ‘Earl Grey I’d like.’
b. Thanks to all who answered my note asking about gloves. I didn’t look at this bb for several days and was astounded that there were 11 answers. Some I missed, darn. (from Prince 1999:1)
c. The necklace she got from a friend. The ring she bought for herself.
d. ‘And who did you invite for this spontaneous orgy, you chump?’ – ‘Well, there’s Charlie and Al and Liz and Pat and Tom and Shermy and Rick and John and Mary and Bill. All these guys you’ll have to order pizza for, I’m afraid.’
This construction has in common with the German vorfeld that some phrase is fronted; as modern topicalisation developed out of a pattern very similar to the German vorfeld-filling (remember that Old and Middle English had a version of V2,
4. Speyer (2004; 2007) argues that it is epiphenomenal.
5. The font conventions in the examples are as follows: standard text: italics. Topics: boldface. Antecedent of topic: underlined. Contrastive element: non italic. Scene-setting element: small caps. In the glosses italics stand for contrastive elements.
Augustin Speyer
too, with minor details distinguishing it from the Modern German version of V2) we can abstract away from the fact that in Modern English the subject intervenes between topicalised phrase and verb.6 The main point is: If in English poset elements can be fronted, we could expect the same to happen in German too; as German does not have the subject-before-verb-constraint, the vorfeld is ‘free’ to receive the fronted / topicalized element.
2.3 Topic or Theme

Word order and the information-structural requirements determining it have been a focus of research for the linguists of the so-called ‘Prague School’ (e.g., Mathesius 1928; Daneš 1966). One of the most frequently cited results of their research is the ordering of the sentence according to what they call theme-rheme-structure: The theme (which can be described as a piece of discourse-old information that represents the entity which the utterance is about; one could think of it as a kind of heading under which all relevant information is clustered; another, almost identical term is aboutness-topic) has a strong tendency to stand before the rheme (which is all information that is added to the theme; cf. Mathesius 1928:66; Daneš 1966:228; Halliday 1967:205; 212; Sgall, Hajičová and Benešová 1973:16). The implications of this assumption for a free word-order language such as German or the Slavic languages, which have been in the focus of the Prague School, are obvious: In such languages we would expect to find the theme before the rheme even more often than in fixed word-order languages, since in free word-order languages nothing prevents phrases from moving around in order to establish the desired theme-rheme structure. Applied to the problem of vorfeldbesetzung this would imply that the vorfeld would be the archetypical theme or topic position, as it is the foremost constituent slot in the sentence. This view is indeed proposed rather frequently (see e.g., Molnár 1991; Vallduví and Engdahl 1996:282ff.). Recent research, e.g., by Werner Frey, suggests however that the archetypical topic position is rather at the left edge of the mittelfeld, that is, immediately after the left sentence bracket (Frey 2004a). From this perspective it looks as if theme-rheme-structuring is only relevant for the mittelfeld, whereas for the vorfeld other factors potentially hold, independent of theme-rheme-structure.

Slightly related to theme-rheme structure (in the sense that themes tend to be discourse-old and rhemes tend to introduce new material) is the notion that discourse-old material tends to appear earlier in the sentence than discourse-new material. This has been shown to be relevant especially for non-canonical word-order constructions in English (Birner 2004). So we should not be surprised if vorfeld-elements are essentially discourse-old.
6. This is especially true since the ‘competing’ fronting construction, Hanging Topic Left Dislocation, can be distinguished quite easily (see Shaer and Frey 2004).
We now have three conflicting expectations about what we should find in the vorfeld:
• the subject,
• a poset-element or
• the topic.
It will turn out that each expectation can account for a fraction of cases, but that none of them applies to all vorfeld-cases.
3. Types of vorfeld-fillers in German

Let us now see what kinds of referential expressions we really do find in the German vorfeld. It will turn out (not surprisingly) that Behaghel’s description, vague as it is, hits the target, but to make clearer what is meant by the terms I will dwell on each of them and try to approach a suitable definition. I examined two corpora consisting of text from a variety of genres with varying degrees of formality in order to see what kinds of referential expressions we find in the vorfeld. The first corpus was used only to detect the patterns; the second was used for control and was also the basis of the frequency calculations in section 4. Most examples in this paper are from the second corpus. For this corpus only subliterary texts were chosen (what in German one would call gebrauchsprosa), coming from three sources: newspapers (editor’s comments and long reports), concert programs and essays written for oral presentation on the radio. These four genres of gebrauchsprosa were chosen randomly, but with the thought in mind that they should constitute types of gebrauchsprosa that are as different as possible. The analysed passages from the texts were chosen randomly, but were checked beforehand for sufficient coherence (e.g., no lists, no texts consisting almost entirely of quotations etc.). An exact list can be found at the end of the paper. Taking into account only sentences in which the vorfeld is indeed occupied by a referential expression, it becomes apparent that in the majority of sentences (364 out of a total of 405 with referential expressions in the vorfeld, that is roughly 90%; 73% of all sentences in the corpus) the vorfeld-element conforms to one of the following three types of elements: Topic, contrast or scene-setting.7 In the following examples, topics
7. The remaining 10% of cases are either subject pronouns (on which see Speyer 2006), expletive ‘es’ or elements that have in common their being discourse-new elements. An example would be “Mehr als 100000 Jobs sind nach dem 11. September in Manhattan verloren gegangen.” (“more than 100 000 jobs have been lost in Manhattan after 9/11”; StZ 6, 19), where the information “more than 100000 jobs” was never mentioned in the text, let alone evoked, thus it
are marked bold, their antecedents are underlined, contrast elements are in italics in the glosses and in normal font in the examples, scene-setting elements are in small capitals.
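The proportions just reported can be checked with a few lines of Python. This is only a sketch: the two counts are taken from the text, while the variable names are invented here, and the implied corpus size rests on the assumption that the 73% figure refers to the same 364 sentences.

```python
# Counts reported in the text for the second corpus.
vorfeld_referential = 405  # sentences with a referential expression in the vorfeld
three_types = 364          # of these: topic, contrast or scene-setting

# Share of the three types among referential vorfeld-fillers.
share_of_referential = three_types / vorfeld_referential

# Cases not covered by the three types (see note 7).
remaining = vorfeld_referential - three_types

# Assumption: "73% of all sentences" refers to the same 364 sentences,
# so the corpus would contain roughly 364 / 0.73 sentences.
implied_corpus_size = round(three_types / 0.73)

print(f"{share_of_referential:.1%} of referential vorfeld-fillers")
print(f"{remaining} remaining cases")
print(f"about {implied_corpus_size} sentences in the corpus (inferred)")
```

The inferred corpus size is not stated in the text and should be read as a back-of-the-envelope estimate only.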
3.1 Topic

For the definition of ‘Topic’ I choose as a first step the definition of backward-looking center in Centering Theory. Centering Theory is a framework originally proposed as a model of discourse coherence and the felicitous use of pronouns (Grosz, Joshi and Weinstein 1995; Prince 1998; Walker, Joshi and Prince 1998). In Centering Theory, the referential expressions in an utterance appear on a list of forward-looking centers (Cf), which are ranked in a language-specific way according to non-pragmatic factors such as syntactic function and thematic role.8 The highest-ranked forward-looking center is called the preferred center (Cp). Most sentences – basically all that feel intuitively ‘coherent’ with the previous discourse – also have a backward-looking center (Cb), which links the utterance to the previous discourse. That is, the referent of the Cb is coreferential with some entity in the prior discourse. Of the Cf entities in the sentence, the Cp is the one with the highest probability of being coreferential with the Cb of the following utterance. In a highly coherent discourse, the Cb of each utterance is coreferential with the Cp of the preceding utterance. An example of a Cb in the vorfeld is given in (4).

(4) Verteidigungsminister Peter Struck (SPD) hat gestern sein Sparprogramm
defence-minister Peter Struck (SPD) has yesterday his cut-expense-prgr.
bekannt gegeben. Er sieht darin auch einen Schritt zur Reform der
known given he sees therein also a step to-the reform of-the
Bundeswehr.
federal army
‘Minister of Defence Peter Struck (SPD) proposed his program for cutting expenses yesterday. He sees it also as a step towards a reform of the Federal Army.’ (StZ 1,1–2)
is brand-new. To deduce from that that being discourse-new is a property which makes a phrase eligible for vorfeld-movement is premature; I indeed never thought that, although the wording in Speyer 2004 might suggest that. The key property of these phrases which makes them move to the vorfeld still needs to be found.
8. Strube and Hahn (1996) argue that centers are ranked according to functional criteria in free word-order languages, esp. German. In the light of Speyer (2007) this is slightly circular: The centering hierarchy is meant to create something akin to a theme-rheme-structure, but does that starting from independent factors. To say that the centering hierarchy takes a theme-rheme-structure as a starting point to create a theme-rheme-structure is circular.
Cbs are often realized as pronouns in the discourse, as is the case in ex. (4). From this it follows that a possible method of testing whether a referential expression has the potential of being a Cb is the pronominalization test: If it is possible to replace the referential expression in question with a pronoun and preserve the unique reference of the phrase, there is a good chance that the referential expression is a Cb (see ex. 5).

(5) a. Die Landesverteidigung solle künftig nicht mehr primäre Aufgabe
the country-defence shall in-the-future not more primary task
der Bundeswehr sein. Die Streitkräfte sollten vielmehr im
of-the fed.army be the forces should rather in
UN-Auftrag ‘überall auf der Welt’ einen Beitrag zur
UN-mandate anywhere in the world a contribution to
internationalen Sicherheit leisten.
international security afford
‘The defence of the country would in the future no longer be the primary task of the Federal Army. The armed forces (b: it) should instead contribute to international security everywhere in the world, under U.N. mandate.’ (StZ 1, 8–9)
b. Die Landesverteidigung solle künftig nicht mehr primäre Aufgabe der Bundeswehr sein. Sie sollte vielmehr im UN-Auftrag ‘überall auf der Welt’ einen Beitrag zur internationalen Sicherheit leisten.
As the property of being pronominalizable is a necessary but not sufficient condition for centerhood and thereby also for topichood, the test cannot determine for sure what the Cb of a clause is, but it can identify expressions which are definitely not Cbs. In (6), for instance, the reference with a pronoun in the second sentence crashes. The subject ‘Lemon Ginger’ cannot be the Cb of the second sentence as there is more than one equally ranked Cp in the preceding sentence.
(6) a. We’ve got Earl Grey, Ceylon, Lemon Ginger, Raspberry, Rose hip. Lemon Ginger is a tremendous beverage.
b. We’ve got Earl Grey, Ceylon, Lemon Ginger, Raspberry, Rose hip. # It / This is a tremendous beverage.
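The centering notions used above can be made concrete in a small Python sketch. It follows the common formulation in which the Cb of an utterance is the highest-ranked element of the preceding Cf list that is realized again. The function names, the entity labels and the subject-first ranking of (4) are assumptions made for illustration; the ambiguity check is a deliberately crude model of the pronominalization test that only looks at candidates compatible with the pronoun.

```python
from typing import List, Optional, Set


def preferred_center(cf: List[str]) -> Optional[str]:
    """Cp: the highest-ranked forward-looking center of an utterance."""
    return cf[0] if cf else None


def backward_looking_center(prev_cf: List[str], realized: Set[str]) -> Optional[str]:
    """Cb: the highest-ranked element of the previous Cf list
    that is realized in the current utterance (None if there is none)."""
    for entity in prev_cf:
        if entity in realized:
            return entity
    return None


def pronoun_is_unambiguous(candidate_tiers: List[List[str]]) -> bool:
    """Crude pronominalization test: a pronoun resolves uniquely only if
    the top-ranked tier of compatible candidates holds exactly one entity."""
    return bool(candidate_tiers) and len(candidate_tiers[0]) == 1


# Example (4); the ranking (subject before object) is an assumption.
cf_u1 = ["Struck", "Sparprogramm"]
u2_entities = {"Struck", "Sparprogramm", "Schritt", "Reform", "Bundeswehr"}
print(backward_looking_center(cf_u1, u2_entities))

# Example (6): the teas are coordinated and hence equally ranked.
teas = [["Earl Grey", "Ceylon", "Lemon Ginger", "Raspberry", "Rose hip"]]
print(pronoun_is_unambiguous(teas))
```

In the sketch, ‘Struck’ comes out as Cb of the second sentence of (4), matching the pronoun ‘Er’, whereas the tea list of (6) fails the uniqueness check, which mirrors the crash of ‘It / This’ in (6b).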
Although examples like (4) and (5), where a whole NP functions as Center, might be expected to be the most common case, they turn out to be not very frequent, and many examples contain less prototypical Cbs, such as Cbs which are embedded in other phrases or which are elided. This problem is treated in more detail in Speyer (2007), and it is not relevant here. Of course it is not only the property of being discourse-old which makes a topic out of a referential expression. The second condition, perhaps more important than
the first one, is that the topic is the entity which the sentence is ‘about’ (Strawson 1964; Halliday 1967; Kuno 1972; Reinhart 1982; Gundel 1985 etc.). This notion is notoriously harder to formalize than the one offered by Centering Theory, but Reinhart (1982) proceeds rather far. Without repeating her formal definitions here, I refer simply to her metaphor of the subject-ordered library catalogue: The topic is described as a ‘defining entry’ which organizes the propositions in the context set (the set of all propositions which have been agreed to be true in the previous discourse) and assigns them to referents that are taken from the context set as well. Each sentence has the potential of adding further information to one of these entries. Heim’s (1982) filecard metaphor is closely related: Each ‘topic’ represents a filecard which is filled with new information as the discourse proceeds; if the topic shifts, a new filecard has to be created or an old one has to be picked up again from the ‘stack’ of topics already mentioned during the discourse. For the present study it is sufficient to define topic as an entity which
• is discourse-old information
• functions as heading to which the sentence in question adds information
• conforms to the definition of backward-looking center

In some cases we find a phrase in the vorfeld which is not a topic under the definition given above, because it is not discourse-old, but which denotes an entity that will be used as topic in the subsequent sentences (“Das Virus” in 7). It is, in Centering terminology, a preferred center, and the sentence in which it stands has a continue or retain relation to the following sentence; at the same time it functions as center for the first sentence itself, but is newly introduced; so we would have a rough-shift relation to the previous sentence.
As it is a priori not clear whether these cases are archetypical topics or not, they are left out of the calculation (although it turned out that for matters of the ranking described in section 4 these cases behave similarly to ‘normal’ topics).

(7) Das Virus ist tückisch,
The virus is pernicious
bis heute weiß keiner, wie es auf den Campen-Hof gelangte.
till today knows no-one how it on the Campen-farm arrived
‘The virus is pernicious; to the present day nobody knows how it got to Campen’s farm.’ (SZ 1, 46–47)
3.2 Contrast

Some phrases found in the vorfeld of German sentences have a property which can be described as ‘contrast’. It is not ‘contrast’ in the sense of ‘having contrastive focus’, although many examples in this class would show contrastive focus if read aloud, but rather ‘contrast’ in the sense of ‘belonging to a set of entities which is being evoked
in the discourse (or already has been evoked)’. This description shares much with the definitions of ‘poset-relations’ as given by Hirschberg (1985) and Prince (1999 – note that this is the condition under which English topicalisation can take place, see 2.2), but also with the notion of ‘kontrast’ (sic!) as defined by Vallduví and Vilkuna (1998).9 Let me illustrate this with some examples. Example (8) is perhaps the ‘clearest’ case: A set M is established by being explicitly referred to, and some members of the same set are referred to in the following discourse.

(8) Bisherige sozialdemokratische Vorzeigeminister wollen nicht mehr über
Former social-democrat present-ministers want not more over
sich verfügen lassen.
themselves order allow
Clement verabschiedet sich, Struck lehnt den Posten des
Clement takes-leave himself Struck declines the post of-the
Außenministers ab (. . .) Schröder selbst hat eine andere “Lebensplanung”.
foreign minister ptc. Schröder himself has another life-plan
Manche werden gar nicht mehr genannt.
Some become ptc. not more mentioned
Set M: M = Bisherige soz.dem. Vorzeigemin.; M = {. . . , Clement, Struck, Schröder, . . . }
‘Former social-democrat prominent ministers do not want to be available any more. Clement leaves. Struck turns down the post of foreign minister. Schröder himself has another ‘plan for his life’. Some are not mentioned at all.’ (FAZ 1, 3–7)
The set as a whole need not be mentioned before some members are enumerated; it can be referred to as a whole after some members are enumerated (9), or not at all (10). This is the most common case, though the set needs to be easily inferable from its members. (9)
Schon jetzt . . . haben Union und SPD deutlich gemacht, dass die
already now have union and SPD clear made that the
Tarifautonomie erhalten bleibt und dass die Sonn-, Feiertags- und
wages-autonomy preserved stays and that the sun- holiday and
Nachtzuschläge auch künftig nicht besteuert werden.
night premiums also in-the-future not taxed become
In beiden Fällen haben die Union und ihre Kanzlerkandidatin
In both cases have the union and her chancellor-candidate
9. A more strict definition of what was termed ‘p-kontrast’ in Speyer (2004) is too strong for the observable cases and can only capture a subset.
eine andere Position vertreten.
another position defended
M = {Tarifautonomie bleibt erhalten, Sonn- etc. -zuschläge werden nicht besteuert}
M: M = exclusive social democrat positions agreed upon in the coalition talks
[[in beiden Fällen]] = M
‘Even now . . . CDU and SPD have made clear that the autonomy of wages will be kept and extra pay for work on Sundays, holidays and nights will stay exempt of taxes. In both cases the CDU and its candidate had different views.’ (FAZ 2, 26–27)

(10) So gehen die Experten davon aus, dass am Grund des Meeres
thus go the experts therefrom out that at-the base of-the sea
damals eine leichte Strömung vorgeherrscht haben muß.
then a light current existed have must
Hunderte versteinerte Tintenfische wurden in einer entsprechenden
hundreds fossilized squid became in a corresponding
Anordnung gefunden.
pattern found
Die Kadaver der Saurier waren gegen abgesunkene Baumstämme
the corpses of-the saurs were against sunk tree-trunks
geschwemmt worden [. . .].
washed become
‘Thus the experts assume that a slight current must have prevailed at the bottom of the sea at that time. Hundreds of fossilized squid were found in a corresponding formation. The corpses of the 〈plesio〉saurs had been washed up against sunken treetrunks.’
M = {. . . , squid, plesiosaurs, . . .}
M: M = animals that can end up on the bottom of Jurassic lagoons (StZ 3, 37–39)
Normally the members of such a set are mentioned in different sentences, but this need not be the case. Example (11) shows a sentence in which two such members are enumerated in the same clause.

(11) Ihre heimischen Zirkel faßten zu eng. Kein langwieriges Geschäft,
their domestic circles caught too narrow no long-lasting business,
keine kurzweilige Liebe konnte sie binden.
no short-time love could them bind
‘Their domestic circles were too narrow. Neither time-consuming business nor entertaining love could bind them.’ (GrT 1, 37–38)
Note that a locality condition seems to hold for contrastive cases. All references to the set or its members must be made in adjacent sentences. That means that satellites (that is: small self-contained sub-discourses that elaborate on something from the main discourse, but feature a topic different from the main discourse surrounding them) cannot intervene without disturbing the establishment of such a set. They can only intervene if they have the previous contrast element as topic. If in example (10), for instance, a clause were to be inserted between the second and third clause that does not take the member ‘hundreds of fossilized squid’ as a topic, but some other entity in the sentence, it is rather questionable whether the reader or hearer could relate ‘the corpses of the plesiosaurs’ to the same set as ‘squid’; s/he would probably only think that the discourse is strangely incoherent (10’).

(10’) So gehen die Experten davon aus, dass am Grund des Meeres damals eine leichte Strömung vorgeherrscht haben muß. Hunderte versteinerte Tintenfische wurden in einer entsprechenden Anordnung gefunden. Diese Anordnung erinnerte die Forscher an einen halbmondförmigen Sandkuchen. #Die Kadaver der Saurier waren gegen abgesunkene Baumstämme geschwemmt worden [. . .].
‘Thus the experts assume that a slight current must have prevailed at the bottom of the sea at that time. Hundreds of fossilized squid were found in a corresponding formation. This formation reminded the researchers of a crescent-shaped mud pie. #The corpses of the 〈plesio〉saurs had been washed up against sunken treetrunks.’
How can we distinguish such contrast cases from normal topics? Note that the present definition of contrast also includes topics, as they too evoke a set, albeit one with only a single member, namely the topic itself. Under Hirschberg’s (1985) and Prince’s (1999) definition of posets (= partially ordered sets) such cases count as poset relations, and by that token resumptive pronouns in English, for example, show properties similar to those of members of a list etc. with respect to topicalization (Prince 1999 argues that a poset relationship to other entities is the very property which elements must have in order to be topicalized in English and Yiddish). The pronominalization test, which is applicable to topics, fails for contrast elements, as was demonstrated in (6). So it would be undesirable to subsume both under the same heading. The failure of the pronominalization test gives us a hint as to how to distinguish these cases, however: Pronominal reference can be made felicitously only if the referent is uniquely identifiable, moreover familiar to the addressee and salient in the discourse (cf. Gundel, Hedberg and Zacharski 1993). Topics have these properties. Contrast elements are not necessarily familiar or salient; they become salient and
inferable only after the first mention of the set or reference to one of its members has been made. As they always have to be seen against the backdrop of the set to which they belong, they are not uniquely identifiable. Rather, the members by themselves are, but as more than one member is enumerated in these cases – all of them equally salient – pronominal reference has to crash, as it cannot pick out one of them and allow clear predictions about which one is the referent.

So we can briefly describe the ‘contrast’ elements in the vorfeld as members of a set or the set itself; the set is evoked in the discourse either by direct reference or can be inferred from its members as they are mentioned. All references to the set and/or its members must be made in adjacent utterances. One-member-sets are exempt.10
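The working description of contrast can be sketched as a simple check in Python. The representation (one set of entity labels per utterance), the function name and the labels are assumptions made for illustration. The sketch also simplifies the locality condition: it ignores the exception that a satellite may intervene if it takes the previous contrast element as its topic.

```python
from typing import List, Set


def licenses_contrast(utterances: List[Set[str]], members: Set[str]) -> bool:
    """Check the two conditions stated above for a candidate set M:
    (1) more than one member of M is mentioned (one-member sets are exempt);
    (2) all utterances mentioning members of M are adjacent."""
    hits = [i for i, entities in enumerate(utterances) if entities & members]
    mentioned = set().union(*(utterances[i] & members for i in hits))
    if len(mentioned) < 2:
        return False
    return all(later - earlier == 1 for earlier, later in zip(hits, hits[1:]))


animals = {"squid", "plesiosaurs"}

# Example (10): the two members are mentioned in adjacent sentences.
ex10 = [{"experts", "current"}, {"squid", "formation"}, {"plesiosaurs", "tree-trunks"}]

# Example (10'): a sentence about the formation intervenes.
ex10_prime = [
    {"experts", "current"},
    {"squid", "formation"},
    {"formation", "researchers", "mud pie"},
    {"plesiosaurs", "tree-trunks"},
]

print(licenses_contrast(ex10, animals))
print(licenses_contrast(ex10_prime, animals))
```

Under these assumptions (10) passes and (10’) fails, in line with the judgement marked by ‘#’ in (10’). A case like (11), where two members occur in the same clause, passes as well, since only one utterance is involved.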
3.3 Scene-Setting

Some phrases in the vorfeld could be subsumed under the term ‘scene-setting’. A scene-setting element can be defined as an expression that names a crucial restriction on the situation (such as the place, the time, etc.) in which the proposition is true (for a similar definition see Jacobs 2001:656). Let me illustrate this with example (12):

(12) Zwar den weitesten Weg [. . .] doch den sichersten [. . .] nahm Simon Dach,
though the farthest way but the most-secure took Simon Dach
dessen Einladungen diesen Aufwand ausgelöst hatten.
whose invitations this expense caused have
Schon im Vorjahr [. . .] waren die vielen einladenden und den
already in-the pre-year were the many inviting and the
Treffpunkt beschreibenden Briefe geschrieben [. . .] worden.
meeting-point describing letters written become
‘Simon Dach, whose invitations started this business, took the farthest, but the most secure way. Already in the preceding year the huge amount of letters, inviting and describing the meeting point, had been written.’ (GrT 1, 21–22)
10. Linking my results back to Birner 2004 shows partial concord: Topics are by definition discourse-old; contrastive elements are at least evoked (by other members of the set). The generalisations for English do not hold for German. Scene-setting elements, however, need not be discourse-old; good examples are clauses at the beginning of paragraphs that ‘set the scene’: To begin a text by e.g., “In der Lagerhalle 45 des Duisburger Hafens war es ganz still, bevor der erste Schuss fiel.” (“in storage hall 45 of the Duisburg harbour it was completely quiet, before the first shot rang out”; my example) is completely normal; yet the vorfeld-element cannot be discourse-old here, simply because there was no discourse up to that point. Further bear in mind that there is a class of elements, mentioned in note 7, that seem to possess ‘discourse-newness’ among their properties.
The proposition [[such-and-such letters had been written]] is only true in the situation described by the adverbial ‘already in the preceding year’. In a situation which had e.g. ‘at the narration time’ as time-frame, the proposition would be false. Scene-setting elements are thus mostly local or temporal adverbials, including expressions like ‘now’, ‘then’, ‘always’ etc. Another example would be (13):

(13) Erstmals haben am 11. September gesellschaftliche Akteure international
first-time have at 11 September communal actors internationally
zugeschlagen. . . An diesem Tag fand der erste Angriff im
struck on this day took the first attack in-the
Weltbürgerkrieg statt.
world-civil-war place
‘On September 11 non-governmental agents have struck for the first time internationally. . . On this day the first attack in the global civil war took place.’ (L2, 15–16)
The proposition [[the first attack in the global civil war took place]] is true only at the date given by the scene-setting element [[on this day]], referring back to September 11 of the preceding sentence. Not all local and temporal adverbials fall under this definition, of course. Take a sentence such as (14), for example:

(14) Niemand wollte um diese Uhrzeit nach Köln fahren
‘nobody wanted to drive to Cologne at this time of day.’
There are two adverbials in this sentence, one local and one temporal. But “um diese Uhrzeit” does not modify the main proposition p ‘nobody wanted q’, but the subordinate proposition q ‘to drive to Cologne’. It is conceivable that only specifications of the matrix situation show this strong tendency to appear in the vorfeld, although a sentence with this element in the vorfeld does not sound infelicitous (14’).

(14’) Um diese Uhrzeit wollte niemand nach Köln fahren
But the interpretation is ambiguous between “um diese Uhrzeit” modifying “nach Köln fahren” or “wollte”. The phrase “nach Köln” finally is, strictly speaking, not adverbial at all; one could argue that the goal is a necessary complement of the verb ‘fahren’ and thus an argument rather than an adjunct. As arguments are inalienable parts of the proposition it is impossible under the definition of scene-setting elements given above to use them as scene-setting elements. Note furthermore that “nach Köln” behaves differently with respect to vorfeld-movement: Whereas in the case of “um diese Uhrzeit” vorfeld movement was still somewhat possible, however at the price
of introducing ambiguity (14’), it is possible with “nach Köln” only in a contrastive context (14”).11

(14”) a. Die Uhr schlug elf.
the clock struck eleven
#Nach Köln wollte um diese Uhrzeit niemand fahren.
To Cologne wanted at this clock-time nobody drive
‘The clock struck eleven. Nobody wanted to drive to Cologne at this time of the day.’
b. Die Uhr schlug elf.
the clock struck eleven
Nach Köln wollte um diese Uhrzeit niemand fahren, nach
To Cologne wanted at this clock-time nobody drive to
Düsseldorf schon gar nicht.
Dusseldorf already very not
‘The clock struck eleven. Nobody wanted to drive to Cologne at this time of the day, even less to Dusseldorf.’
Under this definition of scene-setting, elements which are not clearly referential can also be included (cf. Jacobs 2001:655ff.), such as certain adverbials limiting the domain of the proposition as in sentence (15a) – strictly speaking, all adverbials with a ‘with respect to X’ sense would be included – or conditionals, be they realized nominally (15b) or as a clause (15c). They are left out of the subsequent analysis, however, as I wanted to restrict it to classical referential expressions.

(15) a. Körperlich geht es Peter gut
body-wise goes it Peter good
‘Peter is fine, with respect to his body’
b. Im Falle eines Sieges wird die Mannschaft eine
In case of-a victory will the team a
Belobigung vom Präsidenten erhalten
commendation from-the president get
‘In the case of victory the team will receive a commendation from the president.’
c. Wenn sie siegt, wird die Mannschaft eine Belobigung vom Pr.
If she wins will the team a commendation from-the p.
erhalten
get
‘If it wins, the team will receive a commendation from the president’
(15a, b after Jacobs 2001:655)

11. A reviewer pointed out that the sentence sounds better if an ‘aber’ is inserted: “Nach Köln wollte um diese Uhrzeit aber niemand fahren.” The particle “aber” induces a contrastive reading for the sentence as a whole, implicating that there are alternatives to the preposed element in the discourse universe, even though they are not explicitly mentioned.
3.4 Problems for the subsequent analysis We have seen that most referential expressions in the vorfeld fall under one of the three following types: topic, contrast, scene-setting. One sees on first glance that these terms belong to completely different pragmatic dimensions. A rather undesirable consequence of the fact that these types of elements do not form a homogenous class is that elements can belong to two types at the same time. It is not altogether possible to define these types of element in such a way that they exclude each other, since they do not belong to the same pragmatic dimension. Take givenness, for example: Topics are clearly given information – this is part of their definition – contrast elements are inferable – this is part of their definition. But scene-setting elements are not per se of a certain givenness status – they can be discourse-old or discourse-new. The example (12) was an example of a discourse-new scene-setter (as can be checked from the context from which the text is taken). An example for a discourse-old scene-setter would be (13). An extreme example is “In der Asienkrise der neunziger Jahre” in (16): (16) Von der Konvertierbarkeit ihrer Währungen profitierten vor allem from the convertability of-their currencies profitted in-first-place westliche Banken und Investoren, während die betroffenen Länder in western banks and investors whereas the affected countries in einer Finanzkrise versanken. a financial crisis submerged 1998 traf sie Russland, 1999 Brasilien, die Türkei 2001 und 1998 hit it Russia 1999 Brazil the Turkey 2001 and im gleichen Jahr auch Argentinien. in-the same year also Argentina In der Asienkrise der neunziger Jahre verloren manche Regierungen In the Asia-crisis of-the 1990s lost some governments ihr Amt, viele Menschen aber ihren Arbeitsplatz und ihre Ersparnisse their mandate many persons but their job and their savings. 
‘Western banks and investors profited most from the convertibility of their currencies, whereas the affected countries sank into a financial crisis. In 1998 it hit Russia, in 1999 Brazil, in 2001 Turkey and in the same year also Argentina. In the Asian crisis of the 1990s some governments lost their mandate, but many people lost their jobs and their savings.’ (L2, 32–34)
“Krise” can be taken as topic: the passage is about financial crises, and the crisis is clearly the topic of sentence [L2,33], so it could be understood as topic in [L2,34] as well. “Asien”, however, is clearly a contrast element, forming a set ‘M:M=regions and countries subject to financial crisis’ together with {Russia, Brazil, Turkey and Argentina}. The whole phrase “In der Asienkrise der neunziger Jahre”, finally, conforms to the definition of a scene-setting element provided above. The impossibility of assigning every vorfeld-filler to one and only one type on the basis of the definitions given above is not a real problem as long as we are only interested in
Augustin Speyer
what kind of elements can be in the vorfeld at all. But as soon as we go on to ask which of these elements are preferred over the others for vorfeld-placement, the tokens that conform to more than one definition pose a problem, in that there is no way to decide which of the factors is mainly responsible for their movement to the vorfeld. I am not sure whether it is possible to rephrase the definitions so that one can say of a given element that it is, e.g., contrast and nothing but contrast. Operationally, the best we can do is to concentrate on the examples that can be assigned to only one type, and to use only those for the subsequent analysis.
4. The ranking of vorfeld-fillers
As was said above, most of the referential phrases in the vorfeld are topic, contrast, or scene-setting elements. These three properties obviously favour vorfeld-movement; phrases that have one of these properties are singled out and preferentially moved to the vorfeld. We now have to ask what happens if the sentence contains more than one phrase with a vorfeld-favouring property. In many sentences this is not a problem, as they have only one topic and no contrast or scene-setting element, or only one contrast element and neither topic nor scene-setting element, etc. But there are still many sentences in which two or more phrases are attracted to the vorfeld. The easiest way to find out what is going on is to gather the sentences that contain both a topic and a contrast element, both a topic and a scene-setting element, both a contrast and a scene-setting element, or all three types of elements, and to see which type of element actually occupies the vorfeld. As was mentioned in section 3.4, only sentences in which the elements in question can be assigned exclusively to one category are taken into account.
Table 1. Topic + Contrast (ex. 17)
          total number   Contrast in VF   Topic in VF   sth. else in VF (see note 7)
numbers   32             20               9             3
percent   100%           63%              28%           9%
This result is probably skewed by one text (L2), which alone accounted for 5 of the cases in which the topic was in the vorfeld (ex. 18). The topic was preferred in these cases either for stylistic reasons (in order to create a series of sentences with anaphora in the rhetorical sense, that is, sentences starting with the same word) or because of processing constraints (e.g., so as not to put overly heavy elements into the vorfeld).
Table 2. Topic + Scene-setting (ex. 19)
          total number   Sc.-setting in VF   Topic in VF   sth. else in VF
numbers   29             25                  4             0
percent   100%           86%                 14%           0%
Table 3. Contrast + Scene-setting (ex. 20)
          total number   Contrast in VF   Sc.-setting in VF   sth. else in VF
numbers   16             3                12                  1
percent   100%           19%              75%                 6%
Table 4. Topic + Contrast + Scene-setting (ex. 21; also one in 16: L2,33)
          total number   Contrast in VF   Topic in VF   Sc.-setting in VF   sth. else in VF
numbers   7              1                0             6                   0
percent   100%           14%              0%            86%                 0%
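The percentages in Tables 1–4 can be rechecked from the raw counts. The following is a minimal sketch, not part of the original study: the names `TABLES`, `percent`, `shares` and `pooled_share` are illustrative, and Table 4 is read as contrast 1, topic 0, scene-setting 6, something else 0, which is an assumption about the table layout. The sketch also pools the three tables in which a scene-setting element is among the competitors.

```python
# Raw counts of what occupies the vorfeld (VF) in each competition type.
# Assumption: Table 4 is read as contrast 1, topic 0, scene-setting 6, other 0.
TABLES = {
    "Table 1": {"contrast": 20, "topic": 9, "other": 3},
    "Table 2": {"scene-setting": 25, "topic": 4, "other": 0},
    "Table 3": {"contrast": 3, "scene-setting": 12, "other": 1},
    "Table 4": {"contrast": 1, "topic": 0, "scene-setting": 6, "other": 0},
}


def percent(part, total):
    """Percentage rounded half-up, using integer arithmetic only."""
    return (200 * part + total) // (2 * total)


def shares(counts):
    """Percentage share of each vorfeld-filler type within one table."""
    total = sum(counts.values())
    return {kind: percent(n, total) for kind, n in counts.items()}


def pooled_share(tables, winner):
    """Pool all tables in which `winner` is one of the competitors."""
    relevant = [counts for counts in tables.values() if winner in counts]
    wins = sum(counts[winner] for counts in relevant)
    total = sum(sum(counts.values()) for counts in relevant)
    return wins, total, percent(wins, total)


for name, counts in TABLES.items():
    print(name, sum(counts.values()), shares(counts))
print("scene-setting pooled:", pooled_share(TABLES, "scene-setting"))
```

Integer rounding is used here because Python's built-in `round` rounds exact halves to the nearest even number, which would turn the 62.5% of Table 1 into 62 rather than 63.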
(17) Die Richtlinienkompetenz des Kanzlers gilt. . . nicht. . . gegenüber
     the guideline-competence of-the chancellor is-valid not toward
     dem Bundestag [. . .]
     the parliament
     Die Parteien bestimmen die Richtlinien der Politik
     The parties determine the guidelines of-the politics
     der Reichskanzler wurde als Vollzieher und Hüter der
     the empire-chancellor became as fulfiller and guardian of-the
     Koalitionsrichtlinien bezeichnet.
     coalition-guidelines addressed
     ‘The Chancellor’s competence to set guidelines does not hold with respect to the parliament. The parties determine the political guidelines; the chancellor was described as fulfiller and guardian of the coalition’s guidelines.’ (FAZ 2, 18; 20–21)
(18) Sie (= Non-Government-Organizations) verstehen sich als der bessere
     They understand themselves as the better
     Repräsentant der abendländischen Kultur [. . .].
     representative of Western culture
     Sie kümmern sich um die Benachteiligten, [. . .]
     They care themselves for the disadvantaged
     Sie helfen bei der Konfliktbearbeitung und bei der Konfliktlösung.
     they help at the conflict-treatment and at the conflict-solution
     ‘They see themselves as the better representatives of Western culture. They care for the disadvantaged. They help with the treatment and resolution of conflicts.’ (L2, 45–48)
(19) Am Dienstag mittag können die deutschen Helfer [. . .] aufbrechen.
     At Tuesday noon can the German helpers start
     ‘Tuesday at noon the German helpers can start’ (FAZ3, 46)
(20) Zu Bachs Zeiten hatten beide Feiertage eine wichtige Stellung im
     At Bach’s times had both holidays an important position in-the
     Kirchenjahr.12
     church-year
     Zum Reformationstag komponierte Bach . . . die beiden heute gespielten
     To-the reformation day composed Bach the two today played
     Kantaten [. . .]
     cantatas
     Zu Michaelis komponierte Bach außer BWV 19 und 149
     To Michaelmas composed Bach besides BWV 19 and 149
     noch BWV 50 . . .
     also BWV 50
     ‘In Bach’s time both holidays were prominent in the festival calendar of the 〈Lutheran〉 church. For Reformation Day Bach composed the two cantatas played tonight. For Michaelmas Bach composed, besides BWV 19 and 149, also BWV 50.’ (Ri1, 7–8; 10)
(21) Im Umkreis von drei Kilometern töteten sie (= the veterinary officers,
     In radius of three kilometres killed they
     mentioned in preceding sentence) sämtliches Geflügel, mit Gas, per
     all poultry by gas by
     Stromstoß.
     electric shock
     ‘Within a radius of 3 km they killed all poultry, using gas and electric shocks’ (SZ1,43)
12. Note that “Bach” appears for the first time in the text; therefore, it is not to be regarded as the Topic in [Ri1,7].
We see clear trends in tables 1–4: if a scene-setting element is one of the competitors, it wins out in most cases (Tables 2, 3, 4: 43 out of 52 cases = 83%). It does not matter whether the other competitor is a topic or a contrast element. If no scene-setting element is among the competitors, i.e., if the competition is between contrast and topic, contrast wins out in most cases. This preference is less clear-cut in the tables above; that it can be overridden at all suggests that it is not as strong as the preference for scene-setting elements in the vorfeld. These numbers suggest that vorfeld-placement is not strictly categorical but happens on a competitive basis: there are three ‘constraints’ on vorfeld-placement; these constraints are understood in a sense close to Optimality Theory (on which see, e.g., Prince and Smolensky 1993; Kager 1999). The three constraints are:
Constraint 1 (Topic-VF): The topic is moved to the vorfeld
Constraint 2 (Contrast-VF): The contrast element is moved to the vorfeld
Constraint 3 (Scene-setting-VF): The scene-setting element is moved to the vorfeld
If these constraints are ranked in the following order, we would expect exactly the distribution which we observed.
Scene-setting-VF >> Contrast-VF >> Topic-VF
This ranking can be read as follows: if a sentence contains more than one phrase conforming to the conditions stated in the vorfeld-constraints, the optimal candidate has in the vorfeld the phrase that conforms to the conditions of the highest-ranked relevant constraint. As constraints in Optimality Theory are violable in principle, it is not a problem if the constraints in this ranking do not account for 100% of the cases; the ‘exceptions’ in tables 1–4 might either be suboptimal candidates that simply happened to slip in instead of the optimal ones (the basic idea behind Stochastic Optimality Theory), or they might be due to interaction with further constraints. The author of text L2, for instance, has a stylistic-rhetorical constraint (like ‘sentences start with identical words’) and another, more central constraint (Behaghel’s Law of increasing members, phrased as a constraint: ‘heavy elements are to the right’) that interfere with the three vorfeld-constraints outlined above; these are ranked higher for him (or for his perception of the genre he is writing in) than the three vorfeld-constraints, and thus candidates are chosen that, strictly speaking, would not be optimal if the output were determined only by the three vorfeld-constraints.
As examples (19) to (21) suggest, the topic tends to appear in the mittelfeld-initial position in cases in which it is ousted from the vorfeld by higher-ranked elements. This is in accordance with the findings of Frey (2004a). The topic can move from this position into the vorfeld only if the vorfeld is not filled otherwise. The reason why it is the topic that is singled out for vorfeld-movement in these cases is perhaps that it is the closest phrase, being in the topmost adjunct or (in the case of a subject topic) argument position within the mittelfeld or IP.
If one changed the word order in examples (17) and (19) to (21) and put the lower-ranked phrase into the vorfeld instead of the phrase that has been put there in accordance with the constraint ranking, the resulting sentences would sound slightly less acceptable than the original sentences in the given context. This might be further evidence in favour of the ranking proposed here.
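The way the ranking Scene-setting-VF >> Contrast-VF >> Topic-VF selects a vorfeld-filler can be illustrated with a small evaluation routine. This is only a sketch under simplifying assumptions: each phrase carries exactly one label, the candidates are the sentence’s own phrases, and a constraint is violated when a phrase of the relevant type exists but is not in the vorfeld. The names `RANKING`, `violations` and `optimal_vorfeld` are hypothetical, and the labelling of example (19) is one plausible reading, not an annotation taken from the corpus. The sketch models strict domination only, so it does not reproduce the exceptions in the tables.

```python
# Constraints in ranked order, highest-ranked first.
RANKING = ["scene-setting", "contrast", "topic"]


def violations(candidate, phrases):
    """Violation profile for placing `candidate` in the vorfeld.

    A constraint receives one mark if the sentence contains a phrase of
    that type but the vorfeld candidate is not of that type.
    """
    profile = []
    for constraint in RANKING:
        present = any(p["type"] == constraint for p in phrases)
        satisfied = candidate["type"] == constraint
        profile.append(1 if present and not satisfied else 0)
    return tuple(profile)


def optimal_vorfeld(phrases):
    """Pick the candidate with the best violation profile.

    Tuples compare lexicographically, so a violation of a higher-ranked
    constraint always outweighs violations of lower-ranked ones.
    """
    return min(phrases, key=lambda p: violations(p, phrases))


# Example (19), labelled by assumption for illustration.
EXAMPLE_19 = [
    {"text": "die deutschen Helfer", "type": "topic"},
    {"text": "Am Dienstag mittag", "type": "scene-setting"},
]

print(optimal_vorfeld(EXAMPLE_19)["text"])
```

Representing each candidate’s violations as a tuple ordered by rank keeps the comparison close to the usual tableau reading: the first position in which two candidates differ decides between them.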
5. Conclusions
A corpus study showed that the German vorfeld is filled according to pragmatic considerations. It is not possible, however, to pinpoint a single property which a phrase must have in order to be movable to the vorfeld; rather, there are at least three competing properties, viz. topichood, contrasthood, and being a scene-setting element. In cases in which the sentence contains more than one phrase with one of these properties, vorfeld-movement follows the ranking scene-setting >> contrast >> topic.
References
Corpus:
1. Newspaper: Comments
FAZ1: Frankfurter Allgemeine, 12.10.2005, p.1 “Hoffnungsträger gesucht”. 22 sentences
FAZ2: Frankfurter Allgemeine, 12.10.2005, p.1 “Richtlinienkompetenz unter gleich Starken”. 48 sentences
Sum: 70 sentences
2. Newspaper: Reports
FAZ3: Frankfurter Allgemeine, 12.10.2005, p.9 “Kein Laut mehr aus den Trümmern”. 79 sentences
SZ1: Süddeutsche Zeitung, 24.10.2005, p.3 “Wenn es still wird im Stall”. 99 sentences
Sum: 178 sentences
3. Concert Program Notes
Ri1: Konzertprogramm Int. Bachakademie Konz. 23.10.2005, p.1. 13 sentences
Ri2: Konzertprogramm Int. Bachakademie Konz. 23.10.2005, p.10f. 30 sentences
Ri3: Konzertprogramm Int. Bachakademie Konz. 20.11.2005, p.5ff. 44 sentences
FB1: Konzertprogramm Freiburger Barock Konz. 10.3.2006, p.7. 11 sentences
FB2: Konzertprogramm Freiburger Barock Konz. 10.3.2006, p.9ff. 28 sentences
Sum: 126 sentences
4. Radio essays
L1: Radioessay “Am Anfang war der Big Bang”, SWR2 Aula, 15.1.2006. 58 sentences
L2: Radioessay “Die Globalisierung des Terrors”, SWR2 Aula, 7.9.2003. 69 sentences
Sum: 127 sentences
Total: 501 sentences
Other Sigla (examples from the 1st corpus, see Speyer 2004, Speyer 2007):
StZ1: Stuttgarter Zeitung, 22.2.2003, p.1 “Struck legt Tornados und Boote still”
StZ3: Stuttgarter Zeitung, 22.2.2003, p.34 “Auferstehung eines Schnittzahnsauriers”
StZ6: Stuttgarter Zeitung, 28.2.2003, p.29 “Am 11. September wirft die Sonne keinen Schatten”
GrT: Günther Grass: Treffen in Telgte.
Secondary Literature:
Bach, E. 1962. “The Order of Elements in a Transformational Grammar of German.” Language 38: 263–269.
Birner, B. 2004. “Discourse Functions at the Periphery: Noncanonical Word Order in English.” In Proceedings of the Dislocated Elements Workshop. ZASPiL 35 vol.2, B. Shaer, W. Frey and C. Maienborn (eds), 41–62. Berlin: ZAS.
Daneš, F. 1966. “A Three-Level Approach to Syntax.” In Travaux Linguistiques de Prague 1: L’École de Prague d’aujourd’hui, J. Vachek (ed), 225–240. Tuscaloosa: University of Alabama Press.
den Besten, H. 1983. “On the Interaction of Root Transformations and Lexical Deletive Rules.” In On the formal syntax of the Westgermania, W. Abraham (ed), 47–131. Amsterdam/Philadelphia: John Benjamins.
Frey, W. 2004a. “A Medial Topic Position for German.” Linguistische Berichte 198: 153–190.
Frey, W. 2004b. “The grammar-pragmatics interface and the German prefield.” Sprache & Pragmatik 52: 1–39.
Frey, W. and Pittner, K. 1998. “Zur Positionierung der Adverbiale im deutschen Mittelfeld.” Linguistische Berichte 176: 489–534.
Grewendorf, G., Hamm, F. and Sternefeld, W. 1987. Sprachliches Wissen. Frankfurt/M: Suhrkamp.
Grosz, B. J., Joshi, A.K. and Weinstein, S. 1995. “Centering: A Framework for Modelling the Local Coherence of Discourse.” Computational Linguistics 21: 203–225.
Gundel, J.K. 1985. “‘Shared Knowledge’ and Topicality.” Journal of Pragmatics 9: 83–107.
Halliday, M.A.K. 1967. “Notes on Transitivity and Themes in English II.” Journal of Linguistics 3: 199–244.
Heim, I. 1982. The semantics of definite and indefinite noun phrases. PhD diss., University of Massachusetts, Amherst.
Hirschberg, J. 1985. A Theory of Scalar Implicature. PhD diss., University of Pennsylvania, Philadelphia.
Jacobs, J. 2001. “The Dimensions of Topic-Comment.” Linguistics 39: 641–681.
Kager, R. 1999. Optimality Theory. Cambridge: Cambridge University Press.
Koster, J. 1975. “Dutch as an SOV language.” Linguistic Analysis 1: 111–136.
Kuno, S. 1972. “Functional Sentence perspective: A case study from Japanese and English.” Linguistic Inquiry 3: 269–320.
Mathesius, V. 1928. “On Linguistic Characterology with Illustrations from Modern English.” In Actes du Premier Congrès International de Linguistes à La Haye, 56–63 (Reprinted in and cited from: A Prague School Reader in Linguistics, J. Vachek (ed), 59–67. Bloomington: Indiana University Press, 1964).
Molnár, V. 1991. Das TOPIK im Deutschen und im Ungarischen. Stockholm: Almqvist & Wiksell.
Prince, A. and Smolensky, P. 1993. Optimality Theory: Constraint Interaction in Generative Grammar. Ms., Rutgers University.
Prince, E.F. 1999. “How Not to Mark Topics: ‘Topicalization’ in English and Yiddish.” In Texas Linguistics Forum, Austin: University of Texas Press.
Reinhart, T. 1982. “Pragmatics and Linguistics: An Analysis of Sentence Topics.” Philosophica 27: 53–94.
Reis, M. 1987. “Die Stellung der Verbargumente im Deutschen. Stilübungen zum Grammatik: Pragmatik-Verhältnis.” In Sprache und Pragmatik. Lunder Symposion 1986, I. Rosengren (ed), 139–177. Stockholm: Almqvist & Wiksell.
Sgall, P., Hajičová, E. and Benešová, E. 1973. Topic, Focus and Generative Semantics. Kronberg: Scriptor.
Shaer, B. and Frey, W. 2004. “‘Integrated’ and ‘Non-Integrated’ Left-peripheral Elements in German and English.” In Proceedings of the Dislocated Elements Workshop. ZASPiL 35 vol.2, B. Shaer, W. Frey and C. Maienborn (eds), 465–502. Berlin: ZAS.
Speyer, A. 2004. “Competing Constraints on Vorfeldbesetzung in German.” In Proceedings of the Dislocated Elements Workshop. ZASPiL 35 vol.2, B. Shaer, W. Frey and C. Maienborn (eds), 519–541. Berlin: ZAS.
Speyer, A. 2006. Filling the vorfeld in spoken and written discourse. Paper presented at ‘Linguistic Evidence 2’ (Tübingen, Germany, February 2006) and ‘Organisation in Discourse 3: The Interactional Perspective’ (Turku, Finland, August 2006).
Speyer, A. 2007. “Die Bedeutung der Centering Theory für Fragen der Vorfeldbesetzung im Deutschen.” Zeitschrift für Sprachwissenschaft 26: 83–115.
Strawson, P. 1964. “Identifying reference and truth-values.” Theoria 30: 96–118.
Strube, M. and Hahn, U. 1996. “Functional centering.” In ACL’96 – Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, 270–277.
Travis, L. 1984. Parameters and effects of word order variation. Ph.D. thesis, MIT.
Vallduví, E. and Engdahl, E. 1996. “The linguistic realization of information packaging.” Linguistics 34: 459–519.
Vallduví, E. and Vilkuna, M. 1998. “On Rheme and Kontrast.” In Syntax and Semantics 29: The Limits of Syntax, P.W. Culicover and L. McNally (eds), 79–108. New York: Academic Press.
Vikner, S. 1995. Verb Movement and Expletive Subjects in the Germanic Languages. Oxford/New York: Oxford University Press.
Walker, M.A., Joshi, A.K. and Prince, E.F. 1998. “Centering in Naturally Occurring Discourse: An Overview.” In Centering Theory in Discourse, idem (eds), 1–28. Oxford/New York: Oxford University Press.
Zwart, C. J.-W. 1997. Morphosyntax of Verb Movement: a minimalist analysis of the syntax of Dutch. Dordrecht: Kluwer.
Pragmatics & Beyond New Series A complete list of titles in this series can be found on the publishers’ website, www.benjamins.com 178 Schneider, Klaus P. and Anne Barron (eds.): Variational Pragmatics. A focus on regional varieties in pluricentric languages. vii + 371 pp. Expected May 2008 177 Rue, Yong-Ju and Grace Qiao Zhang: Request Strategies. A comparative study in Mandarin Chinese and Korean. xix, 324 pp. + index. Expected June 2008 176 Jucker, Andreas H. and Irma Taavitsainen (eds.): Speech Acts in the History of English. viii, 318 pp. Expected April 2008 175 Gómez González, María de los Ángeles, J. Lachlan Mackenzie and Elsa M. GonzálezÁlvarez (eds.): Languages and Cultures in Contrast and Comparison. xxii, 354 pp. + index. Expected May 2008 174 Heyd, Theresa: Email Hoaxes. Form, function, genre ecology. 2008. vii, 239 pp. 173 Zanotto, Mara Sophia, Lynne Cameron and Marilda C. Cavalcanti (eds.): Confronting Metaphor in Use. An applied linguistic approach. 2008. vii, 315 pp. 172 Benz, Anton and Peter Kühnlein (eds.): Constraints in Discourse. viii, 292 pp. Expected April 2008 171 Félix-Brasdefer, J. César: Politeness in Mexico and the United States. A contrastive study of the realization and perception of refusals. 2008. xiv, 195 pp. 170 Oakley, Todd and Anders Hougaard (eds.): Mental Spaces in Discourse and Interaction. 2008. vi, 262 pp. 169 Connor, Ulla, Ed Nagelhout and William Rozycki (eds.): Contrastive Rhetoric. Reaching to intercultural rhetoric. 2008. viii, 324 pp. 168 Proost, Kristel: Conceptual Structure in Lexical Items. The lexicalisation of communication concepts in English, German and Dutch. 2007. xii, 304 pp. 167 Bousfield, Derek: Impoliteness in Interaction. 2008. xiii, 281 pp. 166 Nakane, Ikuko: Silence in Intercultural Communication. Perceptions and performance. 2007. xii, 240 pp. 165 Bublitz, Wolfram and Axel Hübler (eds.): Metapragmatics in Use. 2007. viii, 301 pp. 164 Englebretson, Robert (ed.): Stancetaking in Discourse. 
Subjectivity, evaluation, interaction. 2007. viii, 323 pp. 163 Lytra, Vally: Play Frames and Social Identities. Contact encounters in a Greek primary school. 2007. xii, 300 pp. 162 Fetzer, Anita (ed.): Context and Appropriateness. Micro meets macro. 2007. vi, 265 pp. 161 Celle, Agnès and Ruth Huart (eds.): Connectives as Discourse Landmarks. 2007. viii, 212 pp. 160 Fetzer, Anita and Gerda Eva Lauerbach (eds.): Political Discourse in the Media. Cross-cultural perspectives. 2007. viii, 379 pp. 159 Maynard, Senko K.: Linguistic Creativity in Japanese Discourse. Exploring the multiplicity of self, perspective, and voice. 2007. xvi, 356 pp. 158 Walker, Terry: Thou and You in Early Modern English Dialogues. Trials, Depositions, and Drama Comedy. 2007. xx, 339 pp. 157 Crawford Camiciottoli, Belinda: The Language of Business Studies Lectures. A corpus-assisted analysis. 2007. xvi, 236 pp. 156 Vega Moreno, Rosa E.: Creativity and Convention. The pragmatics of everyday figurative speech. 2007. xii, 249 pp. 155 Hedberg, Nancy and Ron Zacharski (eds.): The Grammar–Pragmatics Interface. Essays in honor of Jeanette K. Gundel. 2007. viii, 345 pp. 154 Hübler, Axel: The Nonverbal Shift in Early Modern English Conversation. 2007. x, 281 pp. 153 Arnovick, Leslie K.: Written Reliquaries. The resonance of orality in medieval English texts. 2006. xii, 292 pp. 152 Warren, Martin: Features of Naturalness in Conversation. 2006. x, 272 pp. 151 Suzuki, Satoko (ed.): Emotive Communication in Japanese. 2006. x, 234 pp. 150 Busse, Beatrix: Vocative Constructions in the Language of Shakespeare. 2006. xviii, 525 pp. 149 Locher, Miriam A.: Advice Online. Advice-giving in an American Internet health column. 2006. xvi, 277 pp.
148 Fløttum, Kjersti, Trine Dahl and Torodd Kinn: Academic Voices. Across languages and disciplines. 2006. x, 309 pp. 147 Hinrichs, Lars: Codeswitching on the Web. English and Jamaican Creole in e-mail communication. 2006. x, 302 pp. 146 Tanskanen, Sanna-Kaisa: Collaborating towards Coherence. Lexical cohesion in English discourse. 2006. ix, 192 pp. 145 Kurhila, Salla: Second Language Interaction. 2006. vii, 257 pp. 144 Bührig, Kristin and Jan D. ten Thije (eds.): Beyond Misunderstanding. Linguistic analyses of intercultural communication. 2006. vi, 339 pp. 143 Baker, Carolyn, Michael Emmison and Alan Firth (eds.): Calling for Help. Language and social interaction in telephone helplines. 2005. xviii, 352 pp. 142 Sidnell, Jack: Talk and Practical Epistemology. The social life of knowledge in a Caribbean community. 2005. xvi, 255 pp. 141 Zhu, Yunxia: Written Communication across Cultures. A sociocognitive perspective on business genres. 2005. xviii, 216 pp. 140 Butler, Christopher S., María de los Ángeles Gómez González and Susana M. Doval-Suárez (eds.): The Dynamics of Language Use. Functional and contrastive perspectives. 2005. xvi, 413 pp. 139 Lakoff, Robin T. and Sachiko Ide (eds.): Broadening the Horizon of Linguistic Politeness. 2005. xii, 342 pp. 138 Müller, Simone: Discourse Markers in Native and Non-native English Discourse. 2005. xviii, 290 pp. 137 Morita, Emi: Negotiation of Contingent Talk. The Japanese interactional particles ne and sa. 2005. xvi, 240 pp. 136 Sassen, Claudia: Linguistic Dimensions of Crisis Talk. Formalising structures in a controlled language. 2005. ix, 230 pp. 135 Archer, Dawn: Questions and Answers in the English Courtroom (1640–1760). A sociopragmatic analysis. 2005. xiv, 374 pp. 134 Skaffari, Janne, Matti Peikola, Ruth Carroll, Risto Hiltunen and Brita Wårvik (eds.): Opening Windows on Texts and Discourses of the Past. 2005. x, 418 pp. 133 Marnette, Sophie: Speech and Thought Presentation in French. Concepts and strategies. 2005. 
xiv, 379 pp. 132 Onodera, Noriko O.: Japanese Discourse Markers. Synchronic and diachronic discourse analysis. 2004. xiv, 253 pp. 131 Janoschka, Anja: Web Advertising. New forms of communication on the Internet. 2004. xiv, 230 pp. 130 Halmari, Helena and Tuija Virtanen (eds.): Persuasion Across Genres. A linguistic approach. 2005. x, 257 pp. 129 Taboada, María Teresa: Building Coherence and Cohesion. Task-oriented dialogue in English and Spanish. 2004. xvii, 264 pp. 128 Cordella, Marisa: The Dynamic Consultation. A discourse analytical study of doctor–patient communication. 2004. xvi, 254 pp. 127 Brisard, Frank, Michael Meeuwis and Bart Vandenabeele (eds.): Seduction, Community, Speech. A Festschrift for Herman Parret. 2004. vi, 202 pp. 126 Wu, Yi’an: Spatial Demonstratives in English and Chinese. Text and Cognition. 2004. xviii, 236 pp. 125 Lerner, Gene H. (ed.): Conversation Analysis. Studies from the first generation. 2004. x, 302 pp. 124 Vine, Bernadette: Getting Things Done at Work. The discourse of power in workplace interaction. 2004. x, 278 pp. 123 Márquez Reiter, Rosina and María Elena Placencia (eds.): Current Trends in the Pragmatics of Spanish. 2004. xvi, 383 pp. 122 González, Montserrat: Pragmatic Markers in Oral Narrative. The case of English and Catalan. 2004. xvi, 410 pp. 121 Fetzer, Anita: Recontextualizing Context. Grammaticality meets appropriateness. 2004. x, 272 pp. 120 Aijmer, Karin and Anna-Brita Stenström (eds.): Discourse Patterns in Spoken and Written Corpora. 2004. viii, 279 pp. 119 Hiltunen, Risto and Janne Skaffari (eds.): Discourse Perspectives on English. Medieval to modern. 2003. viii, 243 pp. 118 Cheng, Winnie: Intercultural Conversation. 2003. xii, 279 pp.
117 Wu, Ruey-Jiuan Regina: Stance in Talk. A conversation analysis of Mandarin final particles. 2004. xvi, 260 pp. 116 Grant, Colin B. (ed.): Rethinking Communicative Interaction. New interdisciplinary horizons. 2003. viii, 330 pp. 115 Kärkkäinen, Elise: Epistemic Stance in English Conversation. A description of its interactional functions, with a focus on I think. 2003. xii, 213 pp. 114 Kühnlein, Peter, Hannes Rieser and Henk Zeevat (eds.): Perspectives on Dialogue in the New Millennium. 2003. xii, 400 pp. 113 Panther, Klaus-Uwe and Linda L. Thornburg (eds.): Metonymy and Pragmatic Inferencing. 2003. xii, 285 pp. 112 Lenz, Friedrich (ed.): Deictic Conceptualisation of Space, Time and Person. 2003. xiv, 279 pp. 111 Ensink, Titus and Christoph Sauer (eds.): Framing and Perspectivising in Discourse. 2003. viii, 227 pp. 110 Androutsopoulos, Jannis K. and Alexandra Georgakopoulou (eds.): Discourse Constructions of Youth Identities. 2003. viii, 343 pp. 109 Mayes, Patricia: Language, Social Structure, and Culture. A genre analysis of cooking classes in Japan and America. 2003. xiv, 228 pp. 108 Barron, Anne: Acquisition in Interlanguage Pragmatics. Learning how to do things with words in a study abroad context. 2003. xviii, 403 pp. 107 Taavitsainen, Irma and Andreas H. Jucker (eds.): Diachronic Perspectives on Address Term Systems. 2003. viii, 446 pp. 106 Busse, Ulrich: Linguistic Variation in the Shakespeare Corpus. Morpho-syntactic variability of second person pronouns. 2002. xiv, 344 pp. 105 Blackwell, Sarah: Implicatures in Discourse. The case of Spanish NP anaphora. 2003. xvi, 303 pp. 104 Beeching, Kate: Gender, Politeness and Pragmatic Particles in French. 2002. x, 251 pp. 103 Fetzer, Anita and Christiane Meierkord (eds.): Rethinking Sequentiality. Linguistics meets conversational interaction. 2002. vi, 300 pp. 102 Leafgren, John: Degrees of Explicitness. Information structure and the packaging of Bulgarian subjects and objects. 2002. xii, 252 pp. 101 Luke, K. K. 
and Theodossia-Soula Pavlidou (eds.): Telephone Calls. Unity and diversity in conversational structure across languages and cultures. 2002. x, 295 pp. 100 Jaszczolt, Katarzyna M. and Ken Turner (eds.): Meaning Through Language Contrast. Volume 2. 2003. viii, 496 pp. 99 Jaszczolt, Katarzyna M. and Ken Turner (eds.): Meaning Through Language Contrast. Volume 1. 2003. xii, 388 pp. 98 Duszak, Anna (ed.): Us and Others. Social identities across languages, discourses and cultures. 2002. viii, 522 pp. 97 Maynard, Senko K.: Linguistic Emotivity. Centrality of place, the topic-comment dynamic, and an ideology of pathos in Japanese discourse. 2002. xiv, 481 pp. 96 Haverkate, Henk: The Syntax, Semantics and Pragmatics of Spanish Mood. 2002. vi, 241 pp. 95 Fitzmaurice, Susan M.: The Familiar Letter in Early Modern English. A pragmatic approach. 2002. viii, 263 pp. 94 McIlvenny, Paul (ed.): Talking Gender and Sexuality. 2002. x, 332 pp. 93 Baron, Bettina and Helga Kotthoff (eds.): Gender in Interaction. Perspectives on femininity and masculinity in ethnography and discourse. 2002. xxiv, 357 pp. 92 Gardner, Rod: When Listeners Talk. Response tokens and listener stance. 2001. xxii, 281 pp. 91 Gross, Joan: Speaking in Other Voices. An ethnography of Walloon puppet theaters. 2001. xxviii, 341 pp. 90 Kenesei, István and Robert M. Harnish (eds.): Perspectives on Semantics, Pragmatics, and Discourse. A Festschrift for Ferenc Kiefer. 2001. xxii, 352 pp. 89 Itakura, Hiroko: Conversational Dominance and Gender. A study of Japanese speakers in first and second language contexts. 2001. xviii, 231 pp. 88 Bayraktaroğlu, Arın and Maria Sifianou (eds.): Linguistic Politeness Across Boundaries. The case of Greek and Turkish. 2001. xiv, 439 pp. 87 Mushin, Ilana: Evidentiality and Epistemological Stance. Narrative Retelling. 2001. xviii, 244 pp. 86 Ifantidou, Elly: Evidentials and Relevance. 2001. xii, 225 pp.
85 Collins, Daniel E.: Reanimated Voices. Speech reporting in a historical-pragmatic perspective. 2001. xx, 384 pp.
84 Andersen, Gisle: Pragmatic Markers and Sociolinguistic Variation. A relevance-theoretic approach to the language of adolescents. 2001. ix, 352 pp.
83 Márquez Reiter, Rosina: Linguistic Politeness in Britain and Uruguay. A contrastive study of requests and apologies. 2000. xviii, 225 pp.
82 Khalil, Esam N.: Grounding in English and Arabic News Discourse. 2000. x, 274 pp.
81 Di Luzio, Aldo, Susanne Günthner and Franca Orletti (eds.): Culture in Communication. Analyses of intercultural situations. 2001. xvi, 341 pp.
80 Ungerer, Friedrich (ed.): English Media Texts – Past and Present. Language and textual structure. 2000. xiv, 286 pp.
79 Andersen, Gisle and Thorstein Fretheim (eds.): Pragmatic Markers and Propositional Attitude. 2000. viii, 273 pp.
78 Sell, Roger D.: Literature as Communication. The foundations of mediating criticism. 2000. xiv, 348 pp.
77 Vanderveken, Daniel and Susumu Kubo (eds.): Essays in Speech Act Theory. 2002. vi, 328 pp.
76 Matsui, Tomoko: Bridging and Relevance. 2000. xii, 251 pp.
75 Pilkington, Adrian: Poetic Effects. A relevance theory perspective. 2000. xiv, 214 pp.
74 Trosborg, Anna (ed.): Analysing Professional Genres. 2000. xvi, 256 pp.
73 Hester, Stephen K. and David Francis (eds.): Local Educational Order. Ethnomethodological studies of knowledge in action. 2000. viii, 326 pp.
72 Marmaridou, Sophia S.A.: Pragmatic Meaning and Cognition. 2000. xii, 322 pp.
71 Gómez González, María de los Ángeles: The Theme–Topic Interface. Evidence from English. 2001. xxiv, 438 pp.
70 Sorjonen, Marja-Leena: Responding in Conversation. A study of response particles in Finnish. 2001. x, 330 pp.
69 Noh, Eun-Ju: Metarepresentation. A relevance-theory approach. 2000. xii, 242 pp.
68 Arnovick, Leslie K.: Diachronic Pragmatics. Seven case studies in English illocutionary development. 2000. xii, 196 pp.
67 Taavitsainen, Irma, Gunnel Melchers and Päivi Pahta (eds.): Writing in Nonstandard English. 2000. viii, 404 pp.
66 Jucker, Andreas H., Gerd Fritz and Franz Lebsanft (eds.): Historical Dialogue Analysis. 1999. viii, 478 pp.
65 Cooren, François: The Organizing Property of Communication. 2000. xvi, 272 pp.
64 Svennevig, Jan: Getting Acquainted in Conversation. A study of initial interactions. 2000. x, 384 pp.
63 Bublitz, Wolfram, Uta Lenk and Eija Ventola (eds.): Coherence in Spoken and Written Discourse. How to create it and how to describe it. Selected papers from the International Workshop on Coherence, Augsburg, 24-27 April 1997. 1999. xiv, 300 pp.
62 Tzanne, Angeliki: Talking at Cross-Purposes. The dynamics of miscommunication. 2000. xiv, 263 pp.
61 Mills, Margaret H. (ed.): Slavic Gender Linguistics. 1999. xviii, 251 pp.
60 Jacobs, Geert: Preformulating the News. An analysis of the metapragmatics of press releases. 1999. xviii, 428 pp.
59 Kamio, Akio and Ken-ichi Takami (eds.): Function and Structure. In honor of Susumu Kuno. 1999. x, 398 pp.
58 Rouchota, Villy and Andreas H. Jucker (eds.): Current Issues in Relevance Theory. 1998. xii, 368 pp.
57 Jucker, Andreas H. and Yael Ziv (eds.): Discourse Markers. Descriptions and theory. 1998. x, 363 pp.
56 Tanaka, Hiroko: Turn-Taking in Japanese Conversation. A Study in Grammar and Interaction. 2000. xiv, 242 pp.
55 Allwood, Jens and Peter Gärdenfors (eds.): Cognitive Semantics. Meaning and cognition. 1999. x, 201 pp.
54 Hyland, Ken: Hedging in Scientific Research Articles. 1998. x, 308 pp.
53 Mosegaard Hansen, Maj-Britt: The Function of Discourse Particles. A study with special reference to spoken standard French. 1998. xii, 418 pp.
52 Gillis, Steven and Annick De Houwer (eds.): The Acquisition of Dutch. With a Preface by Catherine E. Snow. 1998. xvi, 444 pp.