Domains and Dynasties The Radical Autonomy of Syntax
Studies in Generative Grammar The goal of this series is to publish those texts that are representative of recent advances in the theory of formal grammar. Too many studies do not reach the public they deserve because of the depth and detail that make them unsuitable for publication in article form. We hope that the present series will make these studies available to a wider audience than has hitherto been possible.
Editors: Jan Koster, Henk van Riemsdijk
Other books in this series:
1. Wim Zonneveld A Formal Theory of Exceptions in Generative Phonology
2. Pieter Muysken Syntactic Developments in the Verb Phrase of Ecuadorian Quechua
3. Geert Booij Dutch Morphology
4. Henk van Riemsdijk A Case Study in Syntactic Markedness
5. Jan Koster Locality Principles in Syntax
6. Pieter Muysken (ed.) Generative Studies on Creole Languages
7. Anneke Neijt Gapping
8. Christer Platzack The Semantic Interpretation of Aspect and Aktionsarten
9. Noam Chomsky Lectures on Government and Binding
10. Robert May and Jan Koster (eds.) Levels of Syntactic Representation
11. Luigi Rizzi Issues in Italian Syntax
12. Osvaldo Jaeggli Topics in Romance Syntax
13. Hagit Borer Parametric Syntax
14. Denis Bouchard On the Content of Empty Categories
15. Hilda Koopman The Syntax of Verbs
16. Richard S. Kayne Connectedness and Binary Branching
17. Jerzy Rubach Cyclic and Lexical Phonology: the Structure of Polish
18. Sergio Scalise Generative Morphology
19. Joseph E. Emonds A Unified Theory of Syntactic Categories
20. Gabriella Hermon Syntactic Modularity
21. Jindrich Toman Studies on German Grammar
22. J. Guéron/H.G. Obenauer/J.-Y. Pollock (eds.) Grammatical Representation
23. S.J. Keyser/W. O'Neil Rule Generalization and Optionality in Language Change
24. Julia Horvath FOCUS in the Theory of Grammar and the Syntax of Hungarian
25. Pieter Muysken and Henk van Riemsdijk Features and Projections
26. Joseph Aoun Generalized Binding. The Syntax and Logical Form of Wh-interrogatives
27. Ivonne Bordelois, Heles Contreras and Karen Zagona Generative Studies in Spanish Syntax
28. Marina Nespor and Irene Vogel Prosodic Phonology
29. Takashi Imai and Mamoru Saito (eds.) Issues in Japanese Linguistics
Jan Koster
Domains and Dynasties The Radical Autonomy of Syntax
1987 FORIS PUBLICATIONS Dordrecht - Holland/Providence - U.S.A.
Published by: Foris Publications Holland, P.O. Box 509, 3300 AM Dordrecht, The Netherlands
Sole distributor for the U.S.A. and Canada: Foris Publications USA, Inc., P.O. Box 5904, Providence RI 02903, U.S.A.
CIP-DATA
Koster, Jan
Domains and Dynasties: the Radical Autonomy of Syntax / Jan Koster. - Dordrecht [etc.] : Foris. - (Studies in Generative Grammar; 30)
With ref.
ISBN 90 6765 270 9 paper
ISBN 90 6765 269 5 bound
SISO 805.4 UDC 801.56
Subject heading: syntax; generative grammar
ISBN 90 6765 269 5 (Bound)
ISBN 90 6765 270 9 (Paper)
© 1986 Foris Publications - Dordrecht
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission from the copyright owner. Printed in The Netherlands by ICG Printing, Dordrecht.
Contents
Preface
Chapter 1. The Invariant Core of Language
1.1. The research program
1.2. The configurational matrix
1.3. Domain extensions
1.4. Conclusion
Notes
Chapter 2. Levels of Representation
2.1. Introduction
2.2. D-structure
2.3. NP-structure
2.4. Logical Form
2.5. Conclusion
Notes
Chapter 3. Anaphoric and Non-Anaphoric Control
3.1. Introduction
3.2. Where binding and control meet
3.3. Some minimal properties of control
3.4. Infinitival complements in Dutch
3.5. Asymmetries between N and V
3.6. Conclusion
Notes
Chapter 4. Global Harmony, Bounding, and the ECP
4.1. Introduction
4.2. On the nature of local domains
4.3. The Cinque-Obenauer hypothesis
4.4. The parametrization of dynasties
4.5. Global harmony
4.6. The grammar of scope
4.7. Conclusion
Notes
Chapter 5. NP-Movement and Restructuring
5.1. Introduction
5.2. Passives and ergatives in Dutch
5.3. Case, agreement, and subject drop in Dutch
5.4. A difference between English and Dutch
5.5. Reanalysis and covalency
5.6. Against reanalysis
5.7. Transparency without reanalysis
5.8. Restructuring in French
5.9. Conclusion
Notes
Chapter 6. Binding and its Domains
6.1. Introduction
6.2. Reflexives in Dutch
6.3. The principles B and C in English and Dutch
6.4. Principle C effects in parasitic gap constructions
6.5. Conclusion
Notes
Chapter 7. The Radical Autonomy of Syntax
Bibliography
Index of names
General index
Preface
Linguistics, like any other field of inquiry, can only make progress through a certain diversity of viewpoints. Although there have been many challenges to "standard" theories of generative grammar, there have been relatively few major controversies within what is often referred to as the Theory of Government and Binding. The theory presented in this study accepts the major goals of Government and Binding, but differs from the standard view in a number of respects. The basic difference is that the theory of Domains and Dynasties entirely rejects the notion "move alpha" and, therefore, the idea of levels connected by "move alpha". Apart from Lexical Structure and Phonetic Representation, only one level is accepted, namely the level of S-structure. In my opinion the traditional level of D-structure can most appropriately be seen as a substructure of S-structure, while the notion of Logical Form is rejected altogether.
This study grew out of my reactions to Chomsky's Pisa lectures. Shortly before the Pisa lectures, I had published a version of Subjacency (the Bounding Condition) that appeared to be almost indistinguishable from principle A of the binding theory. This strongly suggested that a generalization was being missed. Currently, more than seven years after the Pisa lectures, a condition like the Bounding Condition also shows up in mainstream GB theories under the name 0-subjacency, and also in the idea that all traces are antecedent-governed in a strictly local domain. It seems to me that such a strict locality condition makes traditional Subjacency superfluous and that it brings back into focus what I consider one of the most important problems of the theory of grammar: how is the locality condition for the binding of traces related to the locality domains of other grammatical dependencies? The answer given here is that at an appropriate level of abstraction, there is a uniform locality condition for all grammatical relations of a certain type.
The idea of a uniform locality condition leads to the Thesis of Radical Autonomy. According to this thesis, core grammar is characterized by a configurational matrix of properties that are entirely construction-independent. A further perspective is that the configurational matrix determines the form of a computational faculty that is not intrinsically built for language. Grammar in the traditional generative sense is perhaps only an application of this computational module, in the same way that book-keeping is an application of arithmetic. Language in this view only
originates through the interaction of the abstract computational module with our conceptual systems, whereas the lexicon can be considered the interface among these components. Rules like LF-movement cannot be fundamental computations from such a perspective since they are specific to certain conceptual contents, which belong to a different and presumably equally autonomous system.
Research for this book started in 1979 in a project (Descriptive Language) organized by the University of Nijmegen and the Max Planck Institute for Psycholinguistics and sponsored by the Netherlands Organization for the Advancement of Pure Research (Z.W.O.). The original versions of my theory were discussed with Angelika Kratzer of the Max Planck Institute, and with Dick Klein and John Marshall of the University of Nijmegen, among others. The many visitors to the Max Planck Institute, Robert May and Edwin Williams in particular, also contributed much to the development of my views. Also during this time, I had regular meetings with a group of linguists from the Federal Republic of Germany. This book would probably not exist without the many discussions of Chomsky's Pisa lectures I had with Tilman Höhle, Craig Thiersch, Jindra Toman, Hans Thilo Tappe, and many others. I have very good memories of the friendship and encouragement I experienced in this group.
Most of the work on this book was done after I joined the faculty of Tilburg University in 1981. Here, I worked under the excellent conditions created by Henk van Riemsdijk. As ever, I felt greatly stimulated by the harmonious combination of friendship and polemics dating back to our student days. Several aspects of this study were discussed with Henk, and also with my other colleagues at Tilburg, including Reineke Bok-Bennema, Norbert Corver, Jan van Eijck, Anneke Groos, Casper de Groot, Anneke Neijt, Rik Smits, and Gertrud de Vries.
Furthermore, I was able to discuss my work with several visitors, such as Ken Hale, Jean-Roger Vergnaud, and Maria-Luisa Zubizarreta. More than anything else, the content of this study was inspired by the seminal work of Richard Kayne. I learned very much from our discussions and from the critical comments that Richie gave me on several parts of the text. Likewise, I was inspired by the work of Guglielmo Cinque and Hans-Georg Obenauer, as is clear from several chapters. In addition, I would like to thank Guglielmo Cinque for his detailed comments on large parts of the text. Other colleagues and friends I would like to thank for comments include Hans den Besten, Elisabet Engdahl, Ton van Haaften, Riny Huybregts, David Lebeaux, Robert May, Carlos Otero, Christer Platzack, Thomas Roeper, and Tarald Taraldsen. I am grateful to Gaberell Drachman of Salzburg University, Austria, for giving me the opportunity to present parts of this book at the Salzburg International Summer Schools of 1982 and 1985. I was much encouraged and stimulated by the discussions and the friendship of the many participants. As for the 1982 Summer School, I would like to acknowledge
the contributions of Sascha Felix, Wim de Geest, Liliane Haegeman, Hubert Haider, David Lebeaux, Anna Szabolcsi, and Dong-Whee Yang. Of the 1985 Summer School, I would like to mention Elena Benedicto, Clemens Bennink, Leonardo Boschetti, Anna Cardinaletti, Kirsti Christensen, Gunther Grewendorf, Willy Kraan, Martin Prinzhorn, Alessandra Tomaselli, and Gert Webelhuth.
The Netherlands Organization for the Advancement of Pure Research (Z.W.O.) gave me the opportunity to visit MIT and the University of Massachusetts at Amherst in the fall of 1983 (grant R30-191), which I hereby gratefully acknowledge. At MIT, I discussed parts of chapter 4 with Noam Chomsky, Danny Jaspers, Carlos Quicoli, Luigi Rizzi, and Esther Torrego, among others. At Amherst, I profited from the comments of David Pesetsky and Edwin Williams.
Charlotte Koster read the whole text and proposed many improvements of both content and style. Especially chapter 6 owes much to her ideas on learnability. I would like to thank her in more ways than one, as ever!
In preparing the final text, I received excellent editorial assistance from Rita DeCoursey of Foris Publications and technical assistance from the staff of my current department at the University of Groningen. In the department, Corrie van Os helped me with the bibliography and Wim Kosmeijer compiled the index.
Versions of chapters 1 and 3 were published earlier, respectively in Theoretical Linguistic Research 2 (1985), 1-36, and Linguistic Inquiry 15 (1984), 417-459, and are reprinted here with kind permission of the publishers.
Jan Koster
Groningen, December 1986
Chapter 1
The Invariant Core of Language
1.1. The research program
Recently, Noam Chomsky appropriately characterized the goal of generative grammar as a contribution to the solution of "Plato's problem": how can we know so much given that we have such limited evidence?¹ Among the cognitive domains that confront us with this problem, our language is a particularly striking and important example. In studying human language, it is difficult not to be impressed by the richness, subtlety, and specificity of the system of knowledge acquired. Since only a fraction of this richness seems to be encoded in the evidence available to the language learner, much of the architecture of the acquired system must spring from the innate resources of the organism itself. Either the learning child possesses rich powers of abstraction and generalization (general learning strategies), or its inborn capacities involve an articulated and specific system that is only triggered and "finished" by the evidence. There is, to my knowledge, no research program in linguistics that is based on general learning strategies and that is even beginning to come to grips with the richness of our knowledge of language. So far, only the second approach, i.e. the attempt to formulate a highly articulate initial scheme, has attained a promising degree of success. I therefore believe that this is the right approach to Plato's problem in the domain of natural language. This conclusion is sometimes called pretentious or unmotivated, but it is often hard to see what motivates the opposition beyond prejudice. On the one hand there is not the slightest evidence that the data available to the child, or "general learning mechanisms", are rich enough to account for the nature of the system acquired; on the other hand, the program based on the alternative, the assumption of an articulate initial scheme, has led to a very successful research program.
I fail to see how critics of the Chomskyan program can account for the total lack of success of the other theories and the continuous development and success of the program criticized. Even if one fully agrees with Chomsky's approach to Plato's problem, there are different ways to execute the research program based on it. Generative grammar in Chomsky's sense is a much more pluriform enterprise than it is sometimes believed to be. This pluralism is generally
considered healthy and even necessary for progress, as in any other science. It is a truism that one of the most effective tools towards progress is criticizing existing theories by the formulation of challenging alternatives. Given the Chomskyan approach to Plato's problem, then, we can distinguish several largely overlapping but sometimes conflicting lines of research. The most common line of research has always stressed the importance of distinct levels of syntactic representation. Most of these levels are supposed to be connected by a special mapping, nowadays generally referred to as "move alpha". Chomsky, for instance, distinguishes lexical structure, D-structure, S-structure, Logical Form (LF), and Phonetic Form (PF). Van Riemsdijk and Williams (1981) add yet another level to this series, namely the level of NP-structure. My own approach differs somewhat from this commonly assumed picture. It has always seemed to me that with the introduction of trace theory in Chomsky (1973), the original arguments for certain levels have lost their force. To a certain extent, this was also observed by Chomsky at the end of "Conditions on Transformations" (1973): as soon as you have traces there is an obvious alternative according to which traces are base-generated at S-structure. In this view, D-structure is not necessarily a separate level, but can also be interpreted as a substructure or a property of S-structure.² Chomsky has never been convinced of the meaningfulness of the alternative, mainly because of the alleged properties of "move alpha". In Chomsky's view, the alternative could only be formulated with interpretive rules at S-structure that duplicate the unique properties of "move alpha".³ Since I believe that this latter conclusion is false, I have been trying to develop the alternative in Koster (1978c) and subsequent papers.
These attempts have nothing to do with a general preference for frameworks without transformations or with a preference for context-free rules in the sense of Gazdar and others.⁴ I agree with Chomsky (1965) that the significant empirical dimension of the research program has little to do with the so-called Chomsky hierarchy. What is significant is the attempt to restrict the class of attainable grammars (perhaps to a finite class) in a feasible way. From this point of view, formulating grammars with or without transformations is not necessarily a meaningful question (apart from empirical considerations). My main argument is that I consider the attempts to isolate the properties of "move alpha" entirely unconvincing. "Move alpha" exists only to the extent that it can be shown to have properties. Neither attempts to establish properties of "move alpha" directly, nor attempts to establish movement indirectly by attributing special properties to its effects (traces) have been successful, in my opinion. At the same time, it is understandable that these attempts to isolate "move alpha" as something special have inhibited research into unified theories, i.e. theories that subsume movement and, for instance, anaphora under a common cluster of properties.
Functionally speaking, "move alpha" is insufficiently general for the job that it is supposed to do. Movement can be seen as a transfer mechanism: it connects certain categories with deep structure positions (which are also available at S-structure under trace theory) and transfers the Case- and θ-license of these positions to the moved categories. It is hardly controversial that not all transfer can be done by movement. A standard example demonstrating this is left dislocation:
(1) That book, I won't read it
Originally, such sentences were also derived by movement transformations (see Ross (1967)). But it is generally assumed now that (1) and many similar cases of transfer cannot be accounted for by "move alpha". An example like (1) shows that anaphors like it can transfer θ-roles to NPs (like that book) in non-θ-positions. This independently needed transfer mechanism makes "move alpha" superfluous. Obviously, we can do with only one general transfer mechanism from dependent elements to their antecedents. This transfer mechanism is instantiated by (1) and in a similar way by a "movement" construction like (2):
(2) Which book did you read t?
The trace t in (2) appears to behave like the pronominal it in (1) in the relevant aspects. The burden of proof is certainly on those who claim that we need an entirely new transfer mechanism ("move alpha") beyond what we need anyway for (1). Attempts have been made to meet this burden of proof, but the question is whether these attempts have been successful. If "move alpha" is superfluous from a functional point of view, it might still be argued that it can be recognized by its special properties. Chomsky (1981b, 56) argues that the products of "move alpha", traces, have the following three distinct properties:
(3) a. trace is governed
    b. the antecedent of trace is not in a θ-position
    c. the antecedent-trace relation satisfies the Subjacency condition
Note, however, that none of these properties uniquely distinguishes movement from other grammatical dependency relations. It is already clear from (1) that also the antecedents of lexical anaphors (or pronominals) can be in non-θ-positions (3b). Also, government (3a) is not a distinguishing property, because all lexical anaphors bear Case and must therefore be governed.⁵ The only plausible candidate for the status of distinguishing property has always been Subjacency (3c). It is for this reason that I have focused on this property in Koster (1978c) and elsewhere. The crucial question from my point of view, then, is whether Subjacency is really that different
from, say, the locality principles involved in the binding theory of Chomsky (1981b). If we take a closer look at Subjacency, it can hardly be missed that the form it is usually given (and which is clearly distinct from the anaphoric locality principles) is entirely based on certain idiosyncrasies of English and a few other languages. Under closer scrutiny, Subjacency as a separate property appears to dissolve. The version originally proposed on the basis of English in Chomsky (1973) simply conflates a general locality principle with a small extension for limited contexts in English. Before I demonstrate this with examples, I would like to stress that I consider Subjacency, or more generally, the idea that "unbounded" movement is built up from a succession of local steps, as one of the most important advances of generative grammar in the 1970s. Thanks to Subjacency, it has become clear for the first time that grammatical dependency relations that look wildly different at the surface might, contrary to appearances, be instantiations of a common underlying pattern. Subjacency has been a crucial conceptual step, and my own attempts at further unification only became possible because of Subjacency, which reduced a mass of seemingly unbounded relations to a simple local pattern. My criticisms do not concern Subjacency as a strict locality principle, but the particular form given to it in Chomsky (1973), which makes it unsuitable for further unification with other locality principles. If we want a further unification, we have to get rid somehow of the differences between the locality format for movement (Subjacency) and, for instance, for anaphora (principle A of the binding theory). At first sight, this is not so easy because there seem to be some clear differences. These differences can be summarized as follows:
(4) a. Subjacency is often formulated as a condition on derivations, while principle A of the binding theory is a condition on representations
    b. Subjacency involves two domain nodes, while principle A only involves one node (the governing category)
    c. Contrary to Subjacency, principle A involves opacity factors like INFL or SUBJECT
Given the desirability of unification, these differences present themselves as a puzzle: how can we show that "move alpha" and anaphoric binding are governed by the same basic locality principle? Let us consider in turn the differences listed in (4). Originally, Subjacency was formulated as a condition on derivations. But Freidin (1978) and Koster (1978c) claimed that, with traces, it could just as well be formulated as a condition on representations. Also Chomsky (1985a) formulates Subjacency as a condition on S-structure. So, it is questionable whether this point is still controversial: we can simply formulate
Subjacency as a condition on S-structure, just like principle A, as long as there is no evidence to the contrary. There is also an easy solution to the second difference. In Koster (1978c) it was concluded that even for English, Subjacency could be replaced by a one-node domain statement (like the later principle A for anaphors) for all contexts except one. The standard two-node formulation was based on the peculiar postverbal context of English, which was a bad place to look to begin with. Thus, in general, the bounding facts of English can be formulated by specifying just one bounding node, S' or NP. Much of the subject condition of Chomsky (1973), for instance, follows from a condition that says that elements cannot be extracted from an NP:
(5) *Who did you say that [NP a picture of t] disturbed you?
The one-node format would have been sufficient for these cases, but it did not seem to be for contrasts like the following:
(6) a. Who did you see [NP a picture of t]
    b. *Who did you hear [NP stories about [NP pictures of t]]
Even from English alone, however, it is clear that (6b) is irrelevant for a choice between a one-node and a two-node Subjacency format. The reason is that standard two-node Subjacency is both too strong (7b) and too weak (7a) for English in this context:
(7) a. *Who did you destroy [NP a picture of t]
    b. Which girl did you consider [NP the possibility of [NP a game with t]]
As (7a) shows, one node can already lead to unacceptable sentences, while (7b) and many other examples show that extraction across two or even three bounding nodes may still yield acceptable sentences. In short, one node is sufficient for all contexts of English, except the postverbal context, in which we can find almost anything. The conclusion that Subjacency is a one-node condition was reinforced by the fact that even (6a) is ungrammatical in most languages, Dutch among them:
(8) *Wie heb je [NP een foto van t] gezien?
It must therefore be concluded that one node is sufficient for Subjacency in almost all languages known to have "unbounded" movement in all contexts, and in some languages, like English, in all contexts but one. In the exceptional context, two-node Subjacency is just as irrelevant as one-node Subjacency.
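The comparison drawn from (6) and (7) can be tabulated in a short sketch. It is only an illustration under simplifying assumptions: the node counts are read off the bracketed NPs printed in the examples, a "format" is reduced to a limit on how many such nodes an extraction may cross, and the names `EXAMPLES`, `predicts_acceptable`, and `mismatches` are hypothetical.

```python
# Illustrative sketch only. Assumption: each example is reduced to the number
# of bracketed NP nodes crossed, as printed in (6) and (7) of the text.

# (example label, bracketed NP nodes crossed, judgment reported in the text)
EXAMPLES = [
    ("6a", 1, True),
    ("6b", 2, False),
    ("7a", 1, False),
    ("7b", 2, True),
]


def predicts_acceptable(nodes_crossed, limit):
    """A format with this limit blocks extraction across `limit` or more nodes."""
    return nodes_crossed < limit


def mismatches(limit):
    """Labels of the examples whose reported judgment the format gets wrong."""
    return [
        label
        for label, nodes_crossed, acceptable in EXAMPLES
        if predicts_acceptable(nodes_crossed, limit) != acceptable
    ]


print("one-node format misses:", mismatches(1))
print("two-node format misses:", mismatches(2))
```

Under these assumptions each format misjudges two of the four postverbal examples, which is one way of seeing why this context cannot decide between them.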
On the basis of the facts, then, we are justified in also taking the second step towards unification: both bounding and binding involve local domains that specify only one node. Of course, we are left with the problem of how to account for cases like (6a) and (7b), but it seems at least plausible that this problem has nothing to do with Subjacency. Recently, I have tried to give a solution for this problem by adopting certain ideas formulated by Kayne (1983). According to this solution (Koster (1984b) and chapter 4 below), the basic bounding domain is a one-node domain, which can be extended under very specific and partially universal conditions. A bounding domain can be extended only if the last trace of a chain is structurally governed and if all domains up to the antecedent are governed in the same direction. With some qualifications, to which I will return, I believe that bounding is constrained by the one-node format in all other cases. This part of the puzzle is therefore solved by splitting standard Subjacency in two parts: a universal one-node domain specification, and a domain extension based on the language-particular fact that prepositions can be structural governors in English, together with the fact that the direction of government is rather uniform in English. As I will argue below, the one-node domain that we have split off from Subjacency forms the basis of a construction-independent and universal locality principle. With respect to this one-node locality principle, all languages are alike, while languages differ with respect to the extensions, which are also the loci of parametric variation. If this hypothesis solves the first two aspects of the unification puzzle, the next step is trying to solve the third aspect by splitting off the same universal domain from the binding conditions for anaphors. In the case of anaphoric domains, it is already generally assumed that the locality format involves only one node, the governing category. 
The big problem here is how to split off the opacity factors, such as INFL and SUBJECT. It seems to me that the solution is very similar to what we saw in the case of bounding: there is a basic one-node domain defined without opacity factors; these opacity factors only play a role in partially language-specific domain extensions. As before, English is a poor choice to illustrate this because this language has a relatively impoverished system of anaphors. But in many languages clitics are used in the domain of V, while different pronouns are used for binding into PPs and other constituents. For the clitics, the opacity factors are usually irrelevant: the clitics are simply bound in the minimal Xmax (S') in which they are governed, just like traces.⁶ Thus, a clitic governed by V is bound in its minimal S', just like a trace governed by V. Often clitics cannot be bound in any other environment. French, for instance, uses a reflexive se in the domain of a verb, but other forms, like lui-même, in the domain of P and other categories (see chapter 6 for a more elaborated account). Dutch forms a very interesting illustration of this point of view. This
language has at least two reflexives, zich and zichzelf. The crucial fact is that these reflexives overlap in the domain of V, but contrast in other contexts (i.e. in extended domains), for instance in the domain of P. The following examples illustrate this:
(9) a. Jan wast zichzelf
       Jan washes himself
    b. Jan wast zich
       Jan washes himself
It is not the case that both reflexives occur with all verbs in this context, which is probably a lexical fact. The point is that verbs that select both forms can have them in the same context, namely the domain of V. We can account for the sentences in (9) by a domain statement that does not refer to opacity factors like SUBJECT. We can simply say that both zich and zichzelf are bound in the minimal Xmax of the governor V (under the assumption that this domain is S'). I assume that in the unmarked case both Dutch reflexives are bound in their minimal Xmax (in practice only the minimal S') without any reference to opacity factors. Opacity factors only play a role in the marked case, under so-called "elsewhere" conditions. Thus, if the reflexives are not bound in their minimal Xmax, they contrast with respect to the notion subject: zichzelf must be bound in the minimal domain containing a subject, while zich must be free in this domain. The contrast is illustrated in the following examples, in which the reflexives are bound across a PP boundary (and therefore not bound in their minimal governing Xmax):
(10) a. Jan schiet [PP op zichzelf]
        Jan shoots at himself
     b. *Jan schiet [PP op zich]
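The distribution of the two Dutch reflexives described above can be summarized as a small decision procedure. What follows is only a sketch under stated assumptions: the function name `licensed` and its two boolean inputs are hypothetical labels introduced for illustration, and lexical restrictions on individual verbs are ignored.

```python
def licensed(form: str,
             bound_in_minimal_xmax: bool,
             bound_in_subject_domain: bool) -> bool:
    """Sketch of the unmarked/marked split for Dutch zich and zichzelf.

    form                     -- "zich" or "zichzelf"
    bound_in_minimal_xmax    -- antecedent found in the minimal Xmax of the governor
    bound_in_subject_domain  -- antecedent found in the minimal domain containing a subject
    """
    if form not in ("zich", "zichzelf"):
        raise ValueError(f"unknown reflexive: {form}")
    # Unmarked case: both reflexives are fine when bound in the minimal Xmax.
    if bound_in_minimal_xmax:
        return True
    # Marked ("elsewhere") case: the two forms contrast with respect to the subject domain.
    if form == "zichzelf":
        return bound_in_subject_domain       # must be bound in that domain
    return not bound_in_subject_domain       # zich must be free in that domain


# (9a) and (9b): both forms bound in the domain of V
print(licensed("zichzelf", True, True), licensed("zich", True, True))
# (10a) and (10b): bound across a PP boundary, inside the subject domain
print(licensed("zichzelf", False, True), licensed("zich", False, True))
```

The two inputs are kept separate so that the overlap in the basic domain and the complementary distribution in the extended domain are visible as two distinct branches.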
Thus, in Dutch the distinction between the basic domain and the extended domain (which involves opacity factors) can be detected by the fact that the two reflexives overlap in the former domain while they are in complementary distribution with respect to the latter domain. There is much more to say about Dutch reflexives (see Koster (1985) and chapter 6 below), but the basic approach is clear from these simple examples. The path towards unification, then, can only be followed if we see that neither standard Subjacency (with its two nodes) nor binding principle A (with its opacity factors) formulates the primitive locality domain for the dependency relations in question. Both conditions conflate the common universal part with language-particular extensions. If we split off the extensions, it appears that bounding and binding are governed by exactly the same basic locality principle. The approach taken here involves a theory of markedness. The unmarked locality principle for all local dependencies in all languages is a simple one-node domain principle that says that an element must be connected with its antecedent in the minimal Xmax in which it is governed. Beyond this, there are only marked extensions from which languages may or may not choose. Both directionality factors in the sense of Kayne (1983) and opacity factors in the binding theory belong to the theory of markedness. The theory of markedness is also the main locus of parametrization. The basic, unmarked domain might be part of all languages without parametrization; this certainly is the strongest possible hypothesis, one that we would like to maintain as long as possible. If all this is correct, the unmarked format for Subjacency (the Bounding Condition of Koster (1978c)) is indistinguishable from the unmarked locality format for binding. None of the properties in (3), then, distinguishes "move alpha" from any other dependency relation in the unmarked case. If "move alpha" can be detected neither by its functional role nor by its properties, then without new evidence, there is no reason to assume that "move alpha" exists.
1.2. The configurational matrix
The most fundamental notion of the theory of grammar is the dependency relation. Most grammatical relations are dependency relations of some kind between a dependent element δ and an antecedent α:
(11) ... α, ... , δ ...
         |________|
             R
In anaphoric relations, for instance, the anaphors are dependent on their antecedent. Similarly, subcategorized elements that receive a θ-role or Case are dependent on some governor, usually the head of a phrase. There are many different types of dependency relations, but all have something in common, both functionally and formally. Functionally speaking, dependency relations have the following effect:
(12) share property
Any kind of property can be shared by two properly related elements. Antecedent and anaphor, for instance, share a referential index, which entails that they have the same intended referent. A "moved" lexical category and its trace share one lexical content (found at the landing site) and one set of licensing properties (found at the trace position). Formally speaking, all dependency relations have the same basic form, while some have their basic form extended in a certain way. As already
indicated in the previous section, domain extensions are language-particular options that result from parameter setting, and which fall within the limits of a very narrow hypothesis space, which is defined by Universal Grammar. Domain extensions for empty categories involve chains of equally oriented governors, and domain extensions for other anaphors involve the opacity factors or chains of governors that agree with respect to some factor. More will be said on domain extensions in the next section. In this section, I will only define the basic, unextended form of dependency relations. First, I will mention and briefly illustrate the properties of the relation R (of (11)). Then, I will discuss the question to what extent the list of properties has some internal structure. I will conclude this section with a discussion of the scope of the properties in question. As I have discussed elsewhere, it seems to me that basic dependency relations of type R (in (11)) have at least the following four properties:7
(13) a. obligatoriness
     b. uniqueness of the antecedent
     c. c-command of the antecedent
     d. locality
The first property, obligatoriness, is almost self-explanatory. All dependency relations with the properties of (13) are obligatory in the sense that the dependent elements in the relation must have an antecedent. Thus, a reflexive pronoun does not occur without a proper antecedent:
(14) *I hate himself
A structure like (14), in which no antecedent for the reflexive can be found, is ill-formed, and if there is an appropriate antecedent, it cannot fail to be the antecedent:
(15)
John hates himself
In this respect, the binding of reflexives differs from the binding of other pronouns, like the (optional) binding of him in: (16)
John thinks that Mary likes him
As is well known, we can optionally connect him with the possible antecedent John, but we may also leave the pronoun unbound. The second property, uniqueness, applies only to antecedents. Thus, we may connect an antecedent with more than one anaphor:
(17)
They talked with each other about each other
But we can only have one antecedent for an anaphor; in other words, split antecedents are impossible:
(18) *John confronted Mary with each other
Again, this is not a necessary property of anaphoric connections. Pronominals differ from bound anaphors in that they can take split antecedents, as has been known since the 1960s: (19)
John told Mary that they had to leave
The third property, c-command, is so well known that it hardly stands in need of illustration here. In (20a), himself is not c-commanded by the antecedent John. For pronominals, c-command is not necessary, as shown by (20b): (20)
a. *[NP The father of John] hates himself
b. [NP The father of John] thinks he is happy
The form of c-command that I have in mind is the more or less standard form proposed by Aoun and Sportiche (1983), according to which the minimal Xmax containing the antecedent must also contain the anaphor. The fourth property, locality, is illustrated by the following contrast:
(21) a. John hates himself
     b. *John thinks that Mary hates himself
Again, it can be observed that pronominals like him are not constrained by the locality principle in question: (22)
John thinks that Mary likes him
The standard form of locality for anaphors is given by principle A of the binding theory of Chomsky (1981b, ch. 3): anaphors must be bound in their governing category. A governing category is the minimal Xmax containing the governor of an anaphor and a SUBJECT (subject or AGR) accessible to the anaphor. The basic form of locality that I am assuming here differs from this standard format. Instead, I will assume that the Bounding Condition of Koster (1978c) is basic, not only for empty categories, but for all local dependencies:
(23) Bounding Condition
A dependent element δ cannot be free in: ... [β ... δ ... ] ... where β is the minimal Xmax containing δ (and the governor of δ)
This locality principle accounts for the contrast between (24a) and (24b), under the assumption that S' is the relevant Xmax:
(24) a. [S' John hates himself]
     b. *[S' John thinks [S' that himself is sick]]
The following acceptable sentence is not accepted by the basic locality principle (23), because himself is not bound in the minimal PP in which it is governed:
(25) John depended [PP on himself]
This sentence is only accepted by adding a marked option to the basic locality principle. According to this "elsewhere" condition, a reflexive must be bound in the extended domain defined as the minimal Xmax that contains a subject. Thus, principle A of the binding theory is considered a marked, extended domain from this point of view.8 Apart from this not unsubstantial modification, the properties listed under (13) are well known, especially c-command and locality. What has not received much attention, however, is the fact that the properties in question form a cluster: if a dependency relation involves locality it usually also involves c-command and uniqueness. The fact that these properties co-occur suggests that there might be some further structure to this collection. It seems to me that the relation R is in fact a function. According to the definition of a function, there is a unique value in the co-domain for each argument in the domain. Suppose now that we take dependent elements in a given structure as arguments. In that case, we can consider antecedents in the same structure as values. The function is not defined in structures without appropriate antecedents, and these structures are rejected. In this way, we account for the obligatoriness of R (property (13a)). Similarly, we account for the uniqueness property: a function always gives a unique value for a given argument, in this case a unique antecedent. Assuming that R is a function, the only two substantial properties are (13c) and (13d): c-command and locality, respectively. It seems to me that these two properties are not unrelated either. In fact, both properties are locality principles. C-command is locality seen from the perspective of the antecedent. It can be formulated as follows: (26)
C-command
A potential antecedent α cannot be free in: ... [β ... α ... ] ... where β is the minimal Xmax containing α
This is very similar to the Bounding Condition (23), repeated here for convenience:
(27) Bounding Condition
A dependent element δ cannot be free in: ... [β ... δ ... ] ... where β is the minimal Xmax containing δ (and the governor of δ)
The similarity between (26) and (27) is just too striking to be accidental. I assume therefore that R is a bilocal function, a function that gives a unique value (the antecedent) for each dependent element, in such a way that the antecedent is in the minimal domain of the dependent element (cf. (27)) and the dependent element in the minimal domain of the antecedent (cf. (26)). If this conclusion can be maintained, the list in (13) can be replaced by a simple function that shows a certain degree of symmetry with respect to the notion "locality". An intriguing question that I will not pursue here is whether there is a counterpart to the notion of domain extension for (26). Recall that one of the most general domain extensions for (27) involves the notion "subject". Under this extension, a dependent element is not accessible in the domain of a subject. If there is full symmetry in this respect, we expect that there are also languages that define their antecedent domain as a similar extension of (26): in such languages potential antecedents are not accessible in the domain of a subject. I have argued elsewhere that it is exactly this situation that we find in languages like Japanese, Korean, and many others, in which only subjects can be antecedents for reflexives: if potential antecedents are not accessible in the domain of a subject, only the subject itself is accessible in the given domain (Koster (1982b)). If this conclusion is correct, then unrestricted c-command, as in English, is the unmarked condition for antecedents, while the subjects-only option for antecedents is a marked extension, not unlike the extensions that we find for anaphors in principle A of the binding theory. This would be a remarkable confirmation of the view that c-command is the antecedent counterpart of locality, as it is usually defined for the dependent element.
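The symmetry between (26) and (27) can be made concrete on toy phrase markers. The following is a minimal sketch, not an implementation of the theory: the class `Node` and the helpers `minimal_xmax`, `dominates`, and `bilocal` are hypothetical names, the set `MAXIMAL` encodes the working assumption of the text that S', NP, and PP (but not VP) count as the relevant Xmax, and the governor clause of (27) is simplified to "closest dominating Xmax".

```python
# Assumption: only these labels count as maximal projections (Xmax) here.
MAXIMAL = {"S'", "NP", "PP"}


class Node:
    """A bare-bones tree node with a parent pointer."""

    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def minimal_xmax(node):
    """Closest node properly dominating `node` whose label is in MAXIMAL."""
    current = node.parent
    while current is not None and current.label not in MAXIMAL:
        current = current.parent
    return current


def dominates(ancestor, node):
    """True if `ancestor` properly dominates `node`."""
    current = node.parent
    while current is not None:
        if current is ancestor:
            return True
        current = current.parent
    return False


def bilocal(antecedent, dependent):
    """Both halves of the relation: (27) for the dependent, (26) for the antecedent."""
    dep_domain = minimal_xmax(dependent)
    ant_domain = minimal_xmax(antecedent)
    bounding_ok = dep_domain is not None and dominates(dep_domain, antecedent)
    c_command_ok = ant_domain is not None and dominates(ant_domain, dependent)
    return bounding_ok and c_command_ok


# (15) [S' John [VP hates himself]]
john, himself = Node("NP"), Node("NP")
Node("S'", john, Node("VP", Node("V"), himself))
print(bilocal(john, himself))        # True

# (20a) [S' [NP the father [PP of John]] [VP hates himself]]
john_a, himself_a = Node("NP"), Node("NP")
Node("S'", Node("NP", Node("N"), Node("PP", Node("P"), john_a)),
     Node("VP", Node("V"), himself_a))
print(bilocal(john_a, himself_a))    # False: fails on the antecedent side (26)

# (25) [S' John [VP depended [PP on himself]]]
john_c, himself_c = Node("NP"), Node("NP")
Node("S'", john_c, Node("VP", Node("V"), Node("PP", Node("P"), himself_c)))
print(bilocal(john_c, himself_c))    # False under the basic, unextended domain
```

The last case returns False by design: as noted in connection with (25), the sentence is only admitted once the marked domain extension is added, which this sketch deliberately leaves out.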
In any case, it seems worthwhile to look not only for lists of correlating properties like (13) but also for the deeper structural principles from which these properties follow. The properties in (13) (and the principles from which they follow) define a configurational matrix for almost all grammatical dependency relations. There are surprisingly few relations that are not somehow characterized by the properties of this configurational matrix. In fact, there might be only one major class of exceptions, which I will briefly discuss in a moment. Furthermore, there are anaphoric systems, like the one for the reflexive zibun in Japanese, that seem to be characterized by locality on the antecedent (c-command) but not by locality on the dependent element (as in the case of English anaphors). The major exception that comes to mind is the class of dependencies
that seem to be characterized by principles of argument structure. Thus, control structures are not generally characterized by the properties in (13). There are control structures without obligatory antecedents (28a), with split antecedents (28b), with non-c-commanding antecedents (28c), and with nonlocal antecedents (28d) (see Koster (1984a) and chapter 3 below): (28)
a. It is impossible [PRO to help Bill]
b. John proposed to Mary [PRO to help each other]
c. It is difficult for Mary [PRO to help Bill]
d. John thinks [S it is impossible [S PRO to shave himself]]
In some cases, the antecedent of PRO must f-command it (in the sense of Bresnan (1982)). Similar observations can be made about anaphor binding in many languages. Even in English, c-command is not always necessary, as was observed by Jackendoff (1972): (29)
A book by John about himself
This does not mean that the configurational binding theory can be replaced for English by a theory based on argument structure. In languages like English and Dutch, possibilities like (29) are limited to certain prepositions, while c-command is much more generally usable. In control structures, principles of argument structure are more prominent in English, but even in the case of control these principles interact with the purely structural notions of (13) (see Koster (1984a) and chapter 3 below). One might argue that Universal Grammar defines two systems: a system based on argument structure, and a purely structural system. The former system might be the older system, while the latter system might be the result of a later evolutionary development. Whatever the merit of these speculations, it seems to me that nonconfigurational principles have a minority position in most natural languages. Most dependency relations fall within the limits of the configurational matrix characterized by (13). At least the following dependency relations have the form specified by (13):
(30) a. licensing relations
        government
        subcategorization
        θ-marking
        Case assignment
     b. agreement
        subject-verb
        COMP-verb
     c. anaphor binding
     d. movement
        NP-movement
        Wh-movement
     e. obligatory control
     f. predication
     g. gapping
For most of these dependencies, Chomsky (1981b, 1982a) postulates different modules, such as government theory, Case theory, binding theory, bounding theory, control theory, etc. Insofar as each of these subtheories has some characteristics of its own, I agree. But it would be a mistake to consider each subtheory a totally primitive structure. To a large extent, the subtheories are made from the same stuff, namely the properties of the configurational matrix (13). In many cases, the fact that the construction types in (30) have the properties listed in (13) needs little illustration. It is clear, for instance, that the licensing relations, (30a), have the four properties: a subcategorized element is obligatorily dependent (13a) on a unique head (13b). Furthermore, the head c-commands its complements (13c) in a local domain, i.e. the head does not govern into the domain of another governor (13d). Similarly, the agreement relations, (30b), and the predication relation, (30f), have the four properties in a rather perspicuous manner. The other relations are interesting in that they seem to contradict the uniformity hypothesis in one way or another. Obligatory control has already been briefly discussed: a well-defined subclass of control structures has the properties listed in (13), as has been argued in Koster (1984a) and chapter 3 below. Anaphor binding and movement are the most problematic from the point of view of a unified theory. Both seem to involve wildly varying domains, within one language, and also across languages. Some of this variation has already been discussed, and I will return to it in the next section. I will conclude the present section with some nonstandard applications of the configurational matrix.
First, I will give a brief review of the properties of the gapping construction, which is constrained by (13) in a nontrivial way.9 One problem with gapping is that it is not quite clear what kind of representation is appropriate for coordinate structures. Often, coordination has been treated in terms of normal tree structures. Accordingly, the gaps in the gapping construction were handled by the usual transformational or interpretive processes. Thus, in Ross (1967) the gap in (31b) is created by deleting the corresponding verb in (31a):
(31) a. John reads a newspaper and Mary reads a book
     b. John reads a newspaper and Mary a book
Using essentially the same type of representation, others (like Fiengo (1974)) have replaced the deletion transformation by interpretive rules.
More radical proposals do not consider coordinated structures as basic phrase markers but as the derivative product of a linearization rule. One of the earliest examples is Williams (1978), and more recently De Vries (1983) and Huybregts (to appear) have been exploring three-dimensional representations (based on set union of reduced P-markers in the sense of Lasnik and Kupin (1977)). For present purposes, I will assume representations in the spirit of Williams (1978), which is most readily accessible. In this kind of framework, conjuncts before linearization can be represented in columns:
(32)   John            a newspaper
       NP     reads    NP
       Mary            a book
In this representation, elements in the same column have the same function. Thus, both John and Mary have the status of subject, and they receive the same θ-role. The conjuncts each occupy one row, and two conjuncts are properly coordinated if the minimal Xmax containing the column of the two conjuncts contains a conjunction. As before, we assume that S' can function as the minimal Xmax containing the elements governed by V (or INFL). Applied to (32), this means that both John and Mary and a newspaper and a book are properly coordinated. The column with John and Mary, for instance, is accepted by the conjunction and in its minimal S'. The same holds for the column with a newspaper and a book. In coordinate structures, then, the relation R of (11) is interpreted as a relation between conjunctions and columns of type Xi (where Xi is an element from the X-bar system). A special feature of (32) is that the gap of the second conjunct is not considered a deletion site or an empty V. The properties of the verb read are simply equally distributed over the members of the column to which the verb is related. Thus, in (32) both the book and the newspaper are governed by the verb read. If we assume that the relation between conjunctions and columns has the properties in (13), many facts about gapping are explained. Particularly, the local properties of gapping are explained if we assume that columns are only possible if they are licensed by a conjunction in the same local domain (in the sense of the Bounding Condition; see Koster (1978c, ch. 3)). For instance, the facts that Neijt (1981) seeks to explain in terms of Hankamer's Major Constituent Condition seem to follow. A relevant contrast is the following:
(33) a. *Peter was invited by Mary and John _ Bill   (gap: was invited by)
     b. Peter was invited by Mary and John _ by Bill   (gap: was invited)
Contrary to (33b), the gap of the ungrammatical (33a) also includes the preposition by.
The explanation is straightforward, if we assume that gapping is constrained by (13). Consider the underlying representation of (33a):
(34) a. *[S' and   Peter                      Mary
                   NP     was invited [PP by  NP   ]]
                   John                       Bill
This sentence is ungrammatical because Mary and Bill are not properly coordinated, i.e. the maximal column containing these NPs is not licensed by a conjunction in the minimal local domain (which is the PP headed by by). The representation underlying (33b), however, is well-formed:
(34) b. [S' and   Peter                  by Mary
                  NP     was invited     PP
                  John                   by Bill   ]
In this case, Mary and Bill are part of the more inclusive PP column, thanks to the presence of the second occurrence of by. The PP conjuncts are properly coordinated because their column is licensed by the conjunction and in the minimal domain S'. These examples are representative of the local properties of gapping as described by the Major Constituent Condition of Neijt (1981). The facts straightforwardly follow from the Bounding Condition, which also determines all other local dependencies. Various other hitherto unexplained gapping facts follow from the hypothesis that gapping is constrained by the configurational matrix. So far, it is clear that the list in (30) covers an enormous mass of facts. Many entries are themselves abbreviations for large collections of constructions, "Wh-movement" for instance (see Chomsky (1977)). And yet the list is probably far too short, due to certain arbitrary limitations imposed on the relations considered. One such limitation is the fact that usually only those instantiations of R in (11) are considered in which α does not dominate δ. As soon as we drop this arbitrary limitation, the scope of the configurational matrix is considerably extended. Consider for instance the vertical relation in the X-bar system, and in phrase structure in general. All sister nodes depend on an immediately dominating mother node. The relation between mother and daughters has the properties in (13): the relation is obligatory (13a), there is always a unique mother to a given pair of daughters (13b), and clearly the relation is local (13d):
(35) [VP V [PP P NP]]
P is the head of PP and not of VP, which (for the P) is beyond the limits imposed by the Bounding Condition. It seems to me, then, that there is a close relationship between the Bounding Condition and X-bar theory. The nodes of a projection form a family within the domain (Xmax) defined by the Bounding Condition. Similarly, our modified concept of c-command applies (13c): not only are daughter nodes determined by the mother node within their minimal Xmax, but also the mother node determines the daughters within its minimal Xmax. It is somewhat accidental, perhaps, that vertical grammatical relations (like the relations between members of a projection) have hardly been studied from the same perspective as "horizontal" relations like anaphora and movement (an exception is Kayne (1982)). If we abstract away from the distinction related to dominance, it might appear that (13) simply sums up the properties of all local relations of grammar, including both those given in (30) and those implied by the X-bar system. In chapter 2, some applications of this perspective will be discussed. Henk van Riemsdijk has pointed out (personal communication) that scope relations can be seen as an instantiation of "vertical locality". Normally, quantified NPs are assigned a scope either by (an interpretation of) QR (May (1977)) or by relating the quantified element to an abstract morpheme Q (in the sense of Katz and Postal (1964)). Both procedures have the effect that the properties of the scope relation are given the format of a "normal" dependency relation, in which the dependent element is not dominated by its antecedent. If the dominance/nondominance distinction is irrelevant, we can assign scope to a quantified element without QR or an abstract morpheme. We can simply interpret the scope of a quantified element as a relation between this element and the minimal S that contains (i.e. dominates) it.
I will not pursue further the many intriguing consequences of interpreting (13) also as a property of vertical relations. Apart from the applications discussed in chapter 2, I consider the vertical dimension as a topic for future research.
1.3. Domain extensions
So far we have assumed that purely structural grammatical dependency relations have the same unparametrized form in all constructions in all languages (in the unmarked case). This form is determined by the properties in (13), which include the C-command Condition (26) and the Bounding Condition (27) as universal locality principles. For several constructions in several languages nothing further has to be said.
But in many languages the basic domain as determined by the Bounding Condition can be "stretched" in a certain manner. As mentioned before, domain stretching belongs to the theory of markedness. This conclusion is based on the fact that it is not universal and subject to parametric variation. A trace of Wh-movement, for instance, cannot be bound across a PP boundary in most languages. This fact follows from the Bounding Condition (27), which entails that a trace must be bound in the minimal PP (an Xmax) in which it is governed. In other words, the domain for Wh-traces cannot be stretched beyond the size of a PP (or any other Xmax) in most languages (with overt Wh-movement). English and the Germanic Scandinavian languages are among the very few languages with preposition stranding, which entails domain stretching beyond PP boundaries. But even in these languages, this marked phenomenon is limited to very narrowly defined conditions, to which I will return in a moment. Standard Subjacency blocks extraction from complex NPs (in the sense of Ross (1967)), but allows extraction from PPs. This shows that Subjacency, taken as a universal locality principle, is too permissive. It fails to indicate that extraction from PP is something rather exceptional, even in English. In retrospect, we can say that standard Subjacency conflates elements of the unmarked locality principle (27) with elements of the language-particular domain stretching that makes preposition stranding possible in certain contexts. In my opinion, one of the most interesting developments during the last few years has been the emergence of theories that try to describe exactly under what conditions domain stretching is possible. As mentioned in the first section, two types of domain stretching can be distinguished. According to the first type, a domain can be extended by specifying an extra category that the domain must contain. This option is probably limited to categories like subject, INFL, or COMP.
Thus, if a category is governed by a preposition, it must be bound within its minimal governing category (= PP) in the unmarked case. By stipulating that the minimal domain must also contain a subject, the minimal domain PP is extended to the first S containing the PP (this S being the first category up that contains a subject). For English, this is the domain extension chosen for bound anaphors (see Chomsky (1981b, ch. 3) for further details). In languages that do not select this option for certain anaphors, the anaphors in question cannot be bound across PP boundaries. Examples were given in section 1.1 above. Here, I will limit myself to the second type of domain extension, the one that allows violation of Wh-islands in certain languages, among other things. For this type of extension, the key insight was provided by Kayne (1983): the path from dependent element to antecedent must meet certain conditions (see also Nakajima (1982)). In particular, Kayne observed that the direction in which the successive projections (up to the antecedent) are governed plays a crucial role in domains the size of which exceeds the size
of the minimal Xmax. This insight led to some remarkable predictions; for instance, as to the (near) absence of parasitic gaps in SOV languages like German and Dutch (Bennis and Hoekstra (1984), Koster (1983, 1984b), and chapter 4 below). In addition to some minor modifications necessary for languages like Dutch, my interpretation of the directionality constraints differs somewhat from Kayne's. First of all, it seems to me that directionality plays no role in the assignment of scope (whether it is executed as LF movement or not). Second, directionality constraints belong entirely to the theory of markedness in my view. In the unmarked domain theory (entailed by the Bounding Condition (27)), directionality does not play a role (see chapter 4 for further details). It seems to me that Kayne's theory of path conditions can also be generalized for types of long distance dependencies other than Wh-movement. Many languages have long distance anaphora, for instance (see Yang (1984)). As in the case of Wh-movement, domain stretching in these cases often depends on the nature of the successive governors. In Icelandic, for instance, long distance reflexivization is possible if all Vs from the reflexive up to the domain of the antecedent are in the subjunctive mood (see Maling (1981) and the literature cited there, and furthermore chapters 4 and 6 below). Possibly, there are very similar conditions on long Wh-movement in certain languages. Alexander Grosu has informed me, for example, that in certain cases of Rumanian long Wh-movement, all verbs of the path from trace to antecedent must take the supine form if the verb of the top domain (containing the Wh-antecedent) has the supine form (see also Georgopolous (1985) for uniform paths of the realis or irrealis). In general, then, long distance dependencies (other than successive cyclic Wh-movement) seem to require certain types of agreement among the successive domain governors.
These governors form a chain that we might call a dynasty (Koster (1984b) and chapter 4 below): (36)
A dynasty is a chain of governors such that each governor (except the last one) governs the minimal domain containing the next governor.
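Definition (36), together with the agreement requirement on successive governors, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the class `Governor`, its field names, the domain labels, and the helper functions `is_dynasty` and `agrees` are hypothetical, and the direction values in the second example merely stand in for an OV pattern in which verbs govern leftward and prepositions rightward.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Governor:
    name: str
    own_domain: str                  # label of the minimal domain containing this governor
    governed_domain: Optional[str]   # label of the domain it governs; None for the last governor
    direction: str                   # "left" or "right" (direction of government)
    mood: Optional[str] = None       # e.g. "subjunctive"; unused for non-verbs


def is_dynasty(chain: Sequence[Governor]) -> bool:
    """(36): each governor except the last governs the minimal domain of the next one."""
    if not chain:
        return False
    return all(upper.governed_domain == lower.own_domain
               for upper, lower in zip(chain, chain[1:]))


def agrees(chain: Sequence[Governor], feature: str) -> bool:
    """True if every governor in the chain has the same value for `feature`."""
    return len({getattr(g, feature) for g in chain}) == 1


# Preposition stranding: the verb governs the PP that contains the preposition.
english = [Governor("talk", "S'", "PP", "right"),
           Governor("to", "PP", None, "right")]

# Hypothetical OV-type chain: verb governs leftward, preposition rightward.
ov_type = [Governor("V", "S'", "PP", "left"),
           Governor("P", "PP", None, "right")]

print(is_dynasty(english) and agrees(english, "direction"))   # True
print(is_dynasty(ov_type) and agrees(ov_type, "direction"))   # False
```

Keeping the chain check and the agreement check apart mirrors the idea that the form of a dynasty is fixed, while the feature on which its members must agree (direction, mood, category) varies with the type of dependency.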
Thus, the governors that can stretch the domain for Icelandic reflexives must be in the subjunctive form. The governors that can stretch the domain for Wh-traces must govern in the same direction, and so on. Until evidence to the contrary is found, I assume that there are only very few kinds of dynasties, and that their nature is determined by Universal Grammar. In fact, I know of only three kinds of dynasties, determined by the following types of agreement: directionality (for Wh-movement), interclausal verb agreement (subjunctive, supine, etc.), and agreement of lexical category (see below). If dynasties are defined by UG, the nature of domain extensions is not
determined by data, and is not by itself a matter of parametric variation. Dynasties might just be dormant features of all grammars, which become available in certain cases if independent parameters are set. Thus, preposition stranding involves a certain type of domain extension (beyond the minimal PP containing the trace). It is presumably acquired by the language learner if certain data (for instance, stranded prepositions) show that the language under consideration has prepositions among its structural governors (see Kayne (1984, ch. 5)). Even if the domain extension is usually acquired on the basis of data, there is no reason to assume that the same holds for the nature of the dynasty, which determines where prepositions can be stranded and where not. Similarly, long distance reflexivization might be an option for all languages in which interclausal verb dependency is somehow expressed. What is a matter of parametric variation, then, is the nature of the verb-verb agreement, not the fact that it defines a domain extension. Data seem to play a role in the factors that trigger domain extensions, and not in the factors that determine their shape. If all this is correct, we have the following domain theory. The shape of grammatical domains is entirely determined by UG, by the Bounding Condition (27) in the unmarked case, and by a very limited number of dynasty-governed domain extensions in the marked case. Parameters play a precisely defined and limited role in this theory: they block or open the way to certain domain extensions. In other words, parameters do not play a role at all in the universal configurational matrix (13) that defines the basic shape of dependencies in all languages. In domain theory, parameters are the switches that separate the unmarked domain and its marked extensions.
It is not unlikely that parameters play other roles as well, but there can be little doubt that the theory of parameters can develop beyond a mere statement of differences among languages only if the use of parameters is somehow severely limited. I will now turn to the role and nature of dynasties in island violations. Until Chomsky (1977), generative grammar had a rather simple theory of islands. There were just a few, like the Complex NP Constraint (CNPC) and the Wh-island Condition, which were both explained by Subjacency. This theory was elegant and suggestive, but it was not entirely satisfactory for a number of reasons. Some reasons have already been mentioned, among others the stricter nature of island conditions in a language like Dutch. Other languages, like Italian and the Scandinavian languages, turned out to be more permissive with respect to island violations. But even within English, violations of island conditions vary strongly in acceptability. Some of these differences, such as the subject-object asymmetry in Wh-island violations, were explained in terms of the ECP, but others led to many theories but little agreement among linguists. One of the controversial theories is the directionality theory based on Kayne (1983), which was briefly mentioned before. So far, it is the only
available theory that explains why Dutch has only stranding of postpositions (not of prepositions), and why parasitic gaps are practically lacking in Dutch. This theory also explains the sharp difference between English and Dutch with respect to violations of the CNPC. Thus, certain violations of this condition are reasonably acceptable in English:
(37) [Which race did you express [NP a desire [to win t]]]
The trace is not bound within its minimal domain (expressed by the innermost brackets). So, it can only be bound in an extended domain, in this case the domain indicated by the outermost brackets. The domain extension is well-formed, because the governors of the dynasty all govern in the same direction: the three relevant governors, express, desire, and win, all govern to the right. This kind of directional agreement is required by the theory of Kayne (1983) and its offspring (like Bennis and Hoekstra (1984), Koster (1984b), and chapter 4 below). The Dutch equivalent of (37) is hopelessly ungrammatical:
(38) *[Welke race heb je [een verlangen [te t winnen] uitgedrukt]]
The explanation is straightforward: the N verlangen 'desire' governs to the right, but contrary to what we see in English, the two verbs govern to the left in an SOV language like Dutch. Since there is no dynasty of governors governing in the same direction, the domain extension is not well-formed. A theory based on directionality, though successful in many cases, does not work as an account for the variable acceptability of Wh-island violations, both within one language and across languages. For example, earlier attempts to explain the relative strictness of Wh-islands in Dutch dealt with examples like the following (Koster (1984b)):
(39) *Welk boek weet je [wie t gelezen heeft]
     which book know you who read has
     'Which book do you know who read?'
This fact seemed to be explained by the directionality constraints, under the assumption that the matrix verb governs the clausal complement to the right, while the object in the embedded clause (indicated by the trace) is leftward-governed by the verb. This is in accordance with the fact that tensed complement clauses must occur to the right of the verb, while NP-objects must occur to the left. This explanation is incorrect, as pointed out by Koopman and Sportiche (1985), who have given relatively acceptable violations of Wh-islands in Dutch:
(40) Met welk mes weet je niet hoe je dit brood zou kunnen snijden
     with which knife know you not how you this bread could cut
     'With which knife don't you know how you might cut this bread?'
Relatively acceptable Wh-island violations can be found in Dutch after all, contrary to the predictions made by the directionality theory. The fact that earlier studies claimed a stricter Wh-island behavior for Dutch than for English is probably due to two factors. First of all, Wh-island violations in English are often milder with relative pronouns extracted from dependent questions:
(41) ?This is the boy that I know who kissed
In Dutch, such sentences are distinctly worse:
(42) *Dit is de jongen die ik weet wie kuste
This contrast is probably due to an independent factor, namely the fact that Dutch has so-called d-words (like die) in such cases, which are somewhat more difficult to extract, even in non-island contexts. Furthermore, Dutch has only a very limited supply of infinitival Wh-complements. In English, these are among the best examples of relatively acceptable Wh-island violations, while extractions from tensed clauses (like (39)) are often bad in both languages if subjects are crossed. Examples without Wh-subjects in COMP lead to relatively mild violations in Dutch:
(43) a. ?Welke boeken wil je weten aan wie hij gegeven heeft?
        which books want you know to whom he given has
        'Which books do you want to know to whom he gave?'
     b. ?Aan wie wil je weten welke boeken hij gegeven heeft?
        to whom want you know which books he given has
        'To whom do you want to know which books he gave?'
Koopman and Sportiche claim a further contrast between examples like (43a) and (43b): extraction of a direct object is supposed to be worse (43a) than extraction of a subcategorized PP (43b). To my ear, however, (43a) and (43b) hardly differ in acceptability. It is really not a contrast to build a theory on. The directionality theory is of course also insufficient for contrasts within one language. In earlier work, I observed a contrast between the extractability of adjuncts and, for instance, direct objects on the basis of examples like the following (Koster (1978c, 195-198)):
(44) a. What don't you know how long to boil?
     b. *How long don't you know what to boil?
Huang (1982) sought to relate such differences between the extractability of complements and adjuncts to the ECP: complements are properly governed (in the sense of the ECP), while adjuncts are not. Koopman and Sportiche (1985) further developed this type of theory by stipulating that long extraction across Wh-islands is possible if and only if the long-moved Wh-element comes from a θ-position. An alternative theory has been developed by Hans Obenauer (1984, based on work presented in 1982) and Guglielmo Cinque (1984). According to this theory, extraction beyond the domains defined by Subjacency always involves pro. Since only NPs (and certain designated PPs) have the feature +pro, only these elements can be extracted from Wh-islands. This theory also explains the poor extractability of adjuncts in cases like (44b). In spite of success in cases like this one, neither the Huang-Koopman-Sportiche theory nor the Cinque-Obenauer theory explains all facts. The former theory, for instance, does not explain Adriana Belletti's observation that extraction of thematic PPs from certain islands is much worse than extraction of NPs:
(45) *With whom did you express [a desire [to talk t]]
For the Cinque-Obenauer approach, such facts and many others (see Koster (1984b)) are unproblematic, because there is no overt pro-form corresponding to the PPs in question. The Cinque-Obenauer theory, on the other hand, does not account for the relative acceptability of (43b). This fact cannot be accounted for by Subjacency, as suggested for similar facts in Spanish by Obenauer (1984). Subjacency would have to be formulated with S' as bounding node for Dutch. But apart from all the other problems with Subjacency (some of which have been mentioned above), this solution would not account for the fact that the following sentence is still relatively acceptable in Dutch:
(46) ?Aan wie wil je weten [S' welke boeken hij zegt [S' dat hij gegeven heeft]]
     to whom want you know which books he says that he given has
     'To whom do you want to know which books he says that he has given?'
This sentence is (43b) with one embedding added. The fronted PP comes from the most deeply embedded clause. Therefore, it has to pass two S's, which is a violation of Subjacency in the intended sense. And yet (46) is hardly less acceptable than (43b). Subjacency, in other words, cannot be
the factor that governs the extractability of PPs from islands in these cases. Summarizing, we have the following situation. Many facts, such as the nature of P-stranding in Dutch, the near absence of parasitic gaps in German and Dutch, and the strong contrast between English and Dutch with respect to the CNPC, can only be accounted for at the moment by a theory that incorporates Kayne's directionality constraints in some form. The nonextractability of adjuncts follows from the Huang theory and its further development by Koopman and Sportiche (1985). It also follows from the Cinque-Obenauer theory. The latter theory has the advantage that it also explains Adriana Belletti's observation of the nonextractability of complement PPs in almost all cases, other than (43b) or (46). At least for this reason, the Cinque-Obenauer theory must be accepted as an important supplement to a Kayne-type directionality theory (along with the qualifications made in chapter 4 below, in my opinion). The Koopman-Sportiche theory has one advantage, however. It is the only theory that does not exclude (46). As we have seen, both the application of the directionality theory to this type of example and the Cinque-Obenauer theory wrongly exclude (46). The question, then, is whether we can save this advantage of the Koopman-Sportiche theory in some form. In fact, examples like (43b) and (46) were given a special status in Koster (1984b) in a discussion of similar examples from Italian. In one of the well-known examples from Rizzi (1978), a PP is extracted from a Wh-island:
(47) Tuo fratello, a cui mi domando che storie abbiano raccontato t, era molto preoccupato
     your brother to whom I wonder which stories they have told was very troubled
Like (46), this example is incompatible with the Cinque-Obenauer theory as interpreted in Koster (1984b). For this reason, I introduced an extra condition, the Extended Bounding Condition, for examples like (47). According to this condition, the unmarked domain (27) is stretched if there is a dynasty of only Vs. Contrary to the directionality-governed dynasty, which only allows extraction of NPs (= pro), this V-dynasty would allow Wh-fronting of all categories, just like in the unmarked domain (Wh-movement within a single clause). This view has the consequence that Italian counterparts of examples like (46) are predicted to be relatively acceptable, contrary to what the Subjacency account of Rizzi (1978) suggests. To my knowledge, this prediction is borne out. In spite of this, some other data from Koopman and Sportiche (1985) suggest that this formulation (in terms of the Extended Bounding Condition) is too permissive: the account permits extraction of categories of all types (including adjuncts) in domains determined by a pure V-dynasty. Adjuncts, however, cannot be extracted from Wh-islands within the domains in question:
(48) *Waarom wil je weten [wat hij t gelezen heeft]
     why want you know what he read has
     'Why do you want to know what he read t?'
It appears that the Koopman-Sportiche generalization is exactly right for extended domains with pure V-dynasties: in those domains only θ-marked categories (NPs or PPs) can be extracted. But as soon as we have dynasties with mixed categories, for instance N and V as in the CNPC, directionality constraints become relevant and only NPs can be extracted (in accordance with the Cinque-Obenauer approach). Both the Huang-Koopman-Sportiche approach and the Cinque-Obenauer approach, then, are right, although they concern slightly different domains. All in all, we have a three-way distinction for Wh-movement, one for the unmarked case (49a), and two for the marked case (49b and c), depending on the nature of the dynasty:
(49) a. all categories movable within basic domain (27) (no dynasty)
     b. only complements movable in a domain defined by a dynasty of Vs (no directionality)
     c. elsewhere: only NPs moved if there is a dynasty of equally oriented governors
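The three-way distinction in (49) can be read as a small decision procedure. The following is a minimal sketch under simplifying assumptions: the names `Moved` and `movable`, and the encoding of a dynasty as a list of (category, direction) pairs, are hypothetical conveniences introduced here, and the direction values assigned to the Dutch verbs are illustrative only.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Moved:
    category: str        # e.g. "NP", "PP", "AdvP"
    is_complement: bool  # True if the element comes from a theta-marked position


def movable(moved: Moved, dynasty: Sequence[Tuple[str, str]]) -> bool:
    """Apply (49). `dynasty` lists (category, direction) for each governor
    on the path; an empty list stands for the basic domain (27)."""
    if not dynasty:
        # (49a): no dynasty, all categories movable
        return True
    if all(cat == "V" for cat, _ in dynasty):
        # (49b): pure V-dynasty, only complements movable, direction irrelevant
        return moved.is_complement
    # (49c): elsewhere, only NPs, and only with equally oriented governors
    same_direction = len({direction for _, direction in dynasty}) == 1
    return same_direction and moved.category == "NP"


# Illustrative encodings of examples from the text.
v_dynasty = [("V", "right"), ("V", "left")]                       # as in (43b), (48)
english_cnpc = [("V", "right"), ("N", "right"), ("V", "right")]   # as in (37), (45)
dutch_cnpc = [("V", "left"), ("N", "right"), ("V", "left")]       # as in (38)

print(movable(Moved("PP", True), v_dynasty))       # complement PP, cf. (43b)
print(movable(Moved("AdvP", False), v_dynasty))    # adjunct, cf. (48)
print(movable(Moved("NP", True), english_cnpc))    # NP, cf. (37)
print(movable(Moved("PP", True), english_cnpc))    # thematic PP, cf. (45)
print(movable(Moved("NP", True), dutch_cnpc))      # NP, cf. (38)
```

The sketch returns a plain yes/no, whereas the judgments in the text are graded (marked with ? rather than *), so it should be read as a summary of which extractions the theory permits at all, not of their relative acceptability.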
The contrast between (49b) and (49c) is not entirely unexpected: quite generally, the acceptability of extractions from islands is a function of the uniformity and simplicity of dynasties.10 The most important conclusion, however, is that the extraction facts from many languages confirm the reality of the (unmarked) Bounding Condition (27). To the best of my knowledge, the Bounding Condition defines the only domain (in all languages with Wh-movement) in which categories of all types can be moved to COMP. Domain extensions (which lead to Wh-island violations) are only possible under very limited conditions that can be met in some languages but not in others, depending on the fixing of certain parameters. A domain extension can be recognized not only by its dynasty conditions, but also by strict limitations on the type of category that can be moved to COMP.
1.4. Conclusion
In recent years, much attention has been paid to parametrized theories of grammar. On the one hand, this has given linguistic theory the necessary flexibility, but on the other hand, it has led to a rather unconstrained use
of parameters. This is somewhat reminiscent of the earlier unconstrained use of features. Like a theory of features, a theory of parameters must be constrained: it can only contribute to explanatory adequacy, beyond the mere description of differences among languages, if it indicates where parameters play a role and where not. A tentative effort towards this goal is the hypothesis of the previous section that parameters do not play a role in the unmarked core of grammar, but only as switches between this core and the marked periphery. The most important conclusion, however, is that there is an invariant core of language after all, in spite of the obvious need for parameters at some point in the theory. This invariant core is a configurational matrix, characterized by the four properties listed in (13), which plays a role in almost all local dependencies in (presumably) all languages. A crucial feature of (13) is that it incorporates a universal locality principle, the Bounding Condition (27), that is believed to hold for all constructions mentioned under (30). This locality principle is in a sense the minimally necessary locality principle for all languages in that it defines domains similar to the maximal projections of X-bar theory. Abstracting away from the dominance/nondominance distinction, we concluded that an obvious generalization can be made: the notion "maximal projection" not only defines the domain for vertical dependency relations, it also defines the unmarked domain for all other local dependency relations. Under the crucial assumption that S' (rather than VP) can be the minimal domain of V, the unmarked locality principle (27) characterizes many of the constructions in (30) without further problems.
The real challenge for the hypothesis of a universal unmarked locality principle comes from the fact that many constructions, particularly control, bound anaphora, and movement constructions, seem to require a domain definition that somehow deviates from the Bounding Condition. Control, for instance, seems to allow long distance dependency, and more generally, seems to involve principles of argument structure rather than a purely configurational theory. I have tried to show, however, that a well-defined subclass of control structures - namely, obligatory control in the sense of Williams (1980) - has exactly the properties in (13), including the Bounding Condition (27) (see chapter 3 for further details). The biggest problem has been the unification of bound anaphora and "move alpha" in terms of the Bounding Condition. The domain statement for bound anaphora, principle A of the binding theory of Chomsky (1981b, ch. 3), deviates from the Bounding Condition in that the minimal relevant Xmax must contain a SUBJECT (in the sense defined in Chomsky (1981b)). An even greater discrepancy exists between the Bounding Condition and the standard locality principle for "move alpha", i.e. Subjacency. Contrary to the Bounding Condition, Subjacency does not specify one, but two nodes of type Xmax (traditionally NP and S' (or S)). In short, both bound anaphora and movement seem to require domains
larger than the one specified by the Bounding Condition. The idea that bigger domains must be defined was reinforced by the study of long distance anaphora in languages like Icelandic (and from a different perspective, Japanese) and by reports concerning languages with permissive island behavior, like Romance and Scandinavian. It is fairly obvious now, I believe, that in many languages with phenomena that seem to require more extended domains, the minimal domain defined by the Bounding Condition (27) can still be detected somehow. In languages with long distance anaphora, different things often happen in the minimal domain. In Dutch, for instance, the two reflexives zich and zichzelf are usually in complementary distribution, but they are bound in the same way in the only minimal domain in which they can have an antecedent, namely the domain of V (= S'). As we saw in section 1.1, this domain is specified by the Bounding Condition (without reference to the notion subject). The notion subject only appears to play a role if the anaphors in question are not bound in their minimal Xmax: zichzelf must be bound in the domain of a subject (like English himself), while zich must be free in the minimal domain containing a subject. Similarly, clitics are usually bound in their minimal governing Xmax and cannot be bound across major phrase nodes. Again, the domain for these clitics can be defined by the Bounding Condition, without reference to the notion subject. The facts from Dutch suggest that the notion subject does not play a role in the basic domain, but only in an extended domain, which is not universal, as shown by the clitics in many languages. In short, bound anaphors are universally bound within their minimal Xmax. Outside this minimal domain, anaphors are bound in the minimal subject domain, free in the minimal subject domain, or not bound at all.
In comparing various languages, we observe that notions like subject, INFL, or COMP do not define basic domains, but only play a role as domain stretchers. Domain stretching is a marked option in this view. Another method of domain stretching, necessary for long distance anaphora and long movement, is based on the dynasty concept. According to this idea, a domain can be stretched if the governors in the path from dependent element to antecedent agree in some fashion (see chapter 6 for further details). "Move alpha" is the most important case, because its alleged deviant properties have always played a role in the defense of the traditional derivational perspective on grammar. "Move alpha" defines the mapping between various levels of representation. If the properties of "move alpha" cannot be defined, one argument for a particular multilevel approach collapses.11 As we have seen, Subjacency is the only relevant distinguishing property of "move alpha". If "move alpha" is not characterized by Subjacency, but by the universal Bounding Condition, it loses its distinct character.
The evidence that "move alpha" is not characterized by Subjacency but by the Bounding Condition is very strong in my opinion. Even in English, the Bounding Condition - simpler than Subjacency - suffices for almost all contexts. The only exception is a certain class of postverbal extractions. But this context is clearly irrelevant because, on the one hand, Subjacency is both too weak and too strong for this context, and on the other hand, in many languages (Dutch, for instance) extraction in this context, just as in the other contexts, is perfectly characterized by the Bounding Condition (see Koster (1978c)). The peculiar permissiveness of movement from postverbal contexts in English and a few other languages derives from the possibility of preposition stranding, together with the uniform direction from which the successive projections from trace to antecedent are governed. Thanks to some independent structural features of English, this language allows for a domain extension in the very limited context in question, an extension determined by dynasties of uniformly oriented governors. Strong evidence for the Bounding Condition has come from the study of Wh-island violations in recent years. These violations differ much in strength, depending on the nature of the Wh-category moved to COMP. The relevant fact here is that in the domain defined by the Bounding Condition all categories (including adjuncts) can be moved to COMP, while there are severe limitations both on the type of category moved and on the dynasty conditions if a Wh-element is moved to COMP in an extended domain. The Bounding Condition, in other words, defines the domain in which all categories can be moved to COMP, relatively free of further conditions. This distinction between the unmarked domain and the extended domain can be observed in most (perhaps all) languages studied from this perspective, even in Italian, as shown by Huang (1982) (see chapters 4 and 5 for further details).
If all this is correct, the theory of the configurational matrix (which includes the Bounding Condition) is a step in the direction of a unified theory of grammatical dependency relations. The theory is not only universal in the sense that it applies to all languages, it is also universal in the sense that it applies to all constructions of a certain type. The hypothesis that the core properties of grammar are construction-independent, I will refer to as the Thesis of Radical Autonomy (see chapter 7). Needless to say, a theory with this scope is highly abstract. But the promising aspect of it is that in spite of this degree of abstractness, it makes very concrete predictions about a large number of constructions. It determines the locality properties of constructions as diverse as subcategorization, bound anaphora, control, and gapping. In the chapters that follow, I will demonstrate the reality of the configurational matrix in X-bar structures (chapter 2), control structures (chapter 3), structures involving Wh-movement (chapter 4) and NP-movement (chapter 5), and also in bound anaphora (chapter 6). If the
configurational matrix can be detected in all these different constructions, the Thesis of Radical Autonomy is confirmed, which ultimately entails that core grammar is not functionally determined but rather based on mental structures without an inherent meaning or purpose (chapter 7).
NOTES
1. Chomsky (1984).
2. See Sportiche (1983) for a lucid development of this idea.
3. See Chomsky (1981b).
4. See Gazdar (1982), for example.
5. See Bouchard (1984) for the fundamental similarities between empty categories and lexical anaphors in this respect.
6. I am assuming throughout this book that S' (rather than VP) is the minimal Xmax for V. This assumption is at variance with the usual assumption that the maximal projection of V is VP, and that INFL and/or COMP are the heads of new projections. I have never been quite convinced by this assumption, however. It might be useful to make a distinction between lexical projections (based on the categories V, N, P, and A) and auxiliary projections (based on Q, COMP, and INFL). For some purposes, then, S' might be the minimal domain for V (i.e. VP plus its auxiliary projections based on INFL and COMP), and for others VP might be the relevant domain (i.e. the lexical projection without its auxiliaries). Whatever the ultimate truth in this respect, it seems to me that S' often replaces VP as the minimal domain of V.
7. For earlier accounts, see for instance Koster (1982b) and (1984a).
8. Thus, the binding theory for English has the following form:
   A bound anaphor must be bound in:
   (i) its minimal Xmax, or elsewhere:
   (ii) in its minimal SUBJECT domain
   The first part, (i), is the universal Bounding Condition. The second part, (ii), is the language-particular extension for English. The status of (ii) can be derived from the fact that it is either lacking in other languages, or is a dimension of contrast, as we saw in section 1.1 for the Dutch reflexives.
9. The following discussion of gapping is from Koster (1984c), where these and other facts are somewhat more extensively discussed.
10. See Koster (1984b), for example.
11. It should be noted that I am not arguing against multilevel theories in general. Apart from S-structure (with its "D-structure" and "LF" properties), I am assuming LS (lexical structure) and PF. The mapping among these levels, however, does not have the properties of "move alpha".
Chapter 2
Levels of Representation
2.1. Introduction
The construction of levels of representation, like deep and surface structure, connected by movement transformations is the standard solution to a certain reconstruction problem. Thus, there are idiomatic expressions like to make headway, in which the idiomatic connection requires the adjacency of the verb make and the NP headway. Assuming that adjacency is a necessary condition for idiomatic interpretation, the following type of example, in which the idiomatic elements are "scattered", poses the classical problem:
(1) Headway seems to be made
Since the necessary adjacency is lost here, it must be somehow reconstructed. Deep structure was the answer: there must be an underlying level at which make and headway are literally adjacent:
(2) seems to be made headway
The surface structure (1) is derived from the deep structure representation (2) by what is now called "move alpha". This solution was generalized to most situations in which a strictly locally defined relation must be reconstructed. Another example is subject-verb agreement:
(3) a. Mary thinks that the boys have lost
    b. The boys think that Mary has lost
The number of the finite verb (have vs. has) is determined by the number of the subject that immediately precedes it. As in the idiom example, an element of the agreement relation (the subject in this case) can be indefinitely far away from the verb:
(4) Which boys do you think that Bill said that Mary thinks have lost
Since it is entirely obvious that number agreement depends on the local subject of a verb, and since the relevant subject which boys is not occupying the relevant local position, it is again reasonable to reconstruct the deep structure in which the subject and the verb are adjacent:
(5) do you think that Bill said that Mary thinks which boys have lost
These examples, to which many others could be added, illustrate one of the fundamental problems that transformational-generative grammar has sought to solve. The standard solution, constructing a level of deep structure, seems very natural. In fact, it seems to be the only reasonable solution in a framework without traces. The standard solution to the reconstruction problem has been undermined by two developments. First, it was shown that the proposed solution was not sufficiently general in that there were similar cases that could not be solved by postulating a level of deep structure. Secondly, trace theory came to the fore, which suggested what in my opinion is a more promising alternative. To illustrate the first point, consider binding of the anaphor himself. Like idiom interpretation and number agreement, anaphor binding is a local relation:
(6) John thinks that the boy admires himself
Both antecedent and reflexive enter into the binding relation if they are within the same local domain. As before, the antecedent can be moved from the necessary local position:
(7) Which boy does John think admires himself
As before, it is clear that the local pattern can be restored by reconstructing the antecedent position of which boy:
(8) does John think which boy admires himself
It is also possible to reorder the reflexive instead of the antecedent:
(9) a. Himself I don't think he really likes
    b. What he really likes is himself
It is my claim that in these cases the standard solution does not work. Neither in the case of topicalization (9a), nor in the case of pseudo-cleft (9b) is it possible to literally reconstruct himself in the local domain of the antecedent (the object position of like). I will return to topicalization in what follows. Here, I will briefly illustrate this point with the pseudo-cleft construction. In accordance with the standard solution to the reconstruction problem, it was originally thought that the deep structure of (9b) literally has himself in the object position of the verb:
(10) [NP it [S' he really likes himself]] is __
Deriving (9b) from (10) is not easy. Himself has to be moved to the postcopular position indicated by __, and it must be replaced by what (see Chomsky (1970, 209) for a solution along these lines). This way of deriving pseudo-cleft sentences has been universally abandoned. Roger Higgins (1973) convincingly demonstrated that it does not work. In present terms, the movement of himself is impossible because it violates Subjacency. It would also violate the θ-criterion because himself, an argument, would fill a θ-position at D-structure which is filled by the variable (also an argument) bound by what at surface structure. Last but not least, the binding theory that relates himself to its antecedent does apply at S-structure (see Chomsky (1981b)), so that himself can only indirectly be linked to its antecedent. In short, (9b) is a clear example in which a local relation, the antecedent-reflexive relation, cannot be reconstructed in the standard way by stipulating that there is a deep structure like (10). Apparently, local relations may be reconstructed in a weaker way, namely by the mediating properties of anaphors. In the copular predication (9b), the reflexive himself is interpreted as the value of the pronominal what, which in turn binds a trace at the position where the antecedent-reflexive relation is normally locally determined. The consequences of the fact that the reconstruction problem cannot be solved by standard means in (9b) should not be underestimated. In fact, we can interpret (9b) as a counterexample to the standard approach if the latter is taken to have the following content: local relations can only be satisfied by elements in situ, i.e. by elements that literally occupy the positions involved in the local relations. It seems to me that this is one of the core ideas of the standard level approach; (9b) shows that the standard approach is untenable as a general solution to the reconstruction problem. A somewhat weaker principle is in order.
Suppose that local relations are defined for a local domain β. We then need a principle like: (11)
A dependent element δ and an antecedent α satisfy a local relation in a domain β if α and δ are in domain β, or if α or δ are respectively related to α' or δ' in β.
The standard approach requires "being in" a certain position; the revised approach (11), necessary in view of examples like (9b), says that "being in" the relevant positions is fine, but "being related" to the positions in question is sufficient. It is now clear why (11), in conjunction with trace theory, potentially undermines the standard approach. In a theory with traces, the S-structures of (1) and (4) are (12a) and (12b), respectively:
(12)
a. Headway seems to be made t
b. Which boys do you think that Bill said that Mary thinks t have lost
Headway is interpreted idiomatically if it is in the object position of make, but according to (11) it is also so interpreted if it is related to an element in the object position of make. The trace t in (12a) is precisely the "anchor" element to which headway can be related. Similarly, which boys in (12b) is linked to an element t in the relevant local domain, so that which boys satisfies the locally defined agreement relation. Given the necessity of (11), trace theory is not a complement to the standard approach, but an alternative to it: with traces represented at S-structure, it is not necessary to have a separate level of D-structure. In a sense, deep structure does not disappear, because its relevant aspects are now coded into S-structure. Chomsky (1973, sect. 17) realized that trace theory suggested the alternative just mentioned, but has never accepted it as the better theory. Since many of the standard arguments lose their force under the assumptions of trace theory, the motivation for a separate level of D-structure, related to S-structure by "move alpha", must be sought elsewhere. In principle, there are two ways to justify D-structure plus movement: either to show that there are properties that are naturally stated only at D-structure (and not at S-structure), or to demonstrate that "move alpha" has properties that cannot be identified as the properties of rules of construal at S-structure. Note that the second type of argumentation is indirect and weak in principle. The only point of this type of argumentation is that "move alpha" can be reformulated as a rule of construal at S-structure, but that such restatements are unsuccessful if the rules of construal still have the properties of "move alpha", which are distinct from the properties of other construal rules. The theory without "move alpha" would be a notational variant of the two-level theory, at best (Chomsky (1981b)).
If "move alpha" has distinct, irreducible properties, the derivational perspective is not really well established, because it is clear that different rules of construal can have different properties at S-structure. Thus, the alleged unique properties of "move alpha" give circumstantial evidence for a derivational approach, at best. If it can be shown, however, that there are no unique principles applying to "move alpha" and not to other rules of construal, a much stronger point can be made: "move alpha" becomes entirely superfluous. This is one of the central theses of this book: the (unmarked) configurational core of "move alpha" can also be found in a subclass of control structures, in bound anaphora constructions, and in many other constructions. In short, my argument against "move alpha" is essentially an argument of conceptual economy. I agree with Chomsky (1981b, 92) that there is no argument based on conceptual economy if the properties
of "move alpha" are not shared by other rules of construal. But I will show that there is much evidence that there is a common core in "move alpha" and the rules of construal. One of the redundancies of the current GB approach is that it has two indexing procedures: free indexing for construal, and indexing by application of "move alpha". By generating S-structures directly, we can do with only one procedure, namely free indexing. The configurational matrix discussed in chapter 1 can be seen as a definition of possible coindexing configurations: coindexing is only permitted between a dependent element δ and a unique antecedent α within a local domain β. As we briefly indicated in chapter 1, coindexing can be interpreted in one and only one way: (13)
share property
This mode of interpretation is sufficient for both the antecedent-trace relation and the antecedent-anaphor relation. It is the central interpretive rule of grammar that these two forms of coindexing share with several other relations. Properties are only optionally shared. A category can derive properties from another category only if it does not yet have the properties in question. This is determined by the uniqueness property of the configurational matrix. Thus, an NP can only share the lexical content of another NP if it does not have a lexical content of its own. Similarly, θ-roles and referential indices can only be borrowed by categories that do not have a θ-role or a referential index of their own. Some examples may illustrate this: (14)
a. John_i saw himself_i
b. John_i saw Bill_i
Suppose that all NPs in a tree except anaphors have an inherent referential index. Suppose furthermore that the indices in (14) do not indicate intended coreference but accessibility of rule (13) for the two elements in question. Then, Bill in (14b) cannot share a referential index with John by (13), because this would violate the uniqueness property: an NP can have one and only one referential index. As a consequence, John and Bill must have a different referential index in (14b), which is ultimately interpreted as "disjoint reference". An anaphor like himself, however, does not have an inherent referential index. This might be seen as the definition of the notion "anaphor". But since all NPs of a certain type must have a referential index, himself must borrow the index from its possible antecedent John, which is brought about by (13). Compare now (15a) with (15b): (15)
a. John_j was arrested t_j
b. John_j saw himself_j
Again, we have two coindexed NPs in a local relation permitted by the configurational matrix. Again, then, whatever properties are lacking from one of the NPs can be transferred by (13). In the first case, (15a), a θ-role must be transferred. Since John stands in the proper relation to its trace t, it can borrow a θ-role from the trace by (13). Nothing blocks this transfer, because John is not in a position where it is assigned another θ-role. In (15b), we find two NPs that meet the same configurational criteria, but here it is not possible to transfer a θ-role from himself to John. The optional rule (13) allows this transfer, but the result would be filtered out by the uniqueness property (usually referred to as the θ-criterion): since John already has a θ-role it cannot share another θ-role with an element coindexed with it. In short, optional property transfer (13) in conjunction with independent principles like the uniqueness condition not only gives the results of the construal rules, but also the results of "move alpha". It should be said at the outset that I am not claiming that we find the same relation in (15a) and (15b). There is an obvious difference between the antecedent-trace relation found in (15a) and the antecedent-anaphor relation found in (15b). What I am claiming is something different: both (15a) and (15b) involve the same interpretive rule with the same configurational properties, namely (13). The result of this rule is different in these two cases because of independent factors, namely, the fact that John in (15b) already has a θ-role, while in (15a) John is in a non-θ-position. But clearly, this difference has nothing to do with the interpretive rule involved, which is (13) in both cases. What I am advocating here, in other words, is a more modular approach to the two different relations in (15a) and (15b), respectively: one interpretive rule together with two different antecedents (θ versus non-θ) yields two different relations.
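The interplay of optional sharing and the uniqueness filter in (14) and (15) can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than part of the theory: the class name NP, the function name share_property, the integer indices, and the role labels "agent" and "theme" are hypothetical stand-ins. The sketch only encodes the idea that a property is borrowed when, and only when, the receiving category lacks it.

```python
from dataclasses import dataclass, field


@dataclass
class NP:
    """An NP as a bag of properties; an absent key means the property is lacking."""
    label: str
    props: dict = field(default_factory=dict)


def share_property(receiver: NP, source: NP, prop: str) -> bool:
    """Optional transfer of `prop` between two coindexed NPs, in the spirit of (13).

    Returns True if the property was transferred. The transfer is blocked
    (returns False) when the receiver already has a value for `prop`
    (uniqueness) or when the source has nothing to share.
    """
    if prop in receiver.props or prop not in source.props:
        return False
    receiver.props[prop] = source.props[prop]
    return True


# (14a) John_i saw himself_i: the anaphor lacks an inherent referential index.
john = NP("John", {"index": 1, "theta": "agent"})
himself = NP("himself", {"theta": "theme"})
got_index = share_property(himself, john, "index")

# (14b) John_i saw Bill_i: Bill has its own index, so sharing is blocked.
bill = NP("Bill", {"index": 2, "theta": "theme"})
bill_shares = share_property(bill, john, "index")

# (15a) John_j was arrested t_j: John is in a non-theta position and borrows.
john_passive = NP("John", {"index": 1})
trace = NP("t", {"theta": "theme"})
got_theta = share_property(john_passive, trace, "theta")

# (15b) John_j saw himself_j: John already has a theta-role, so nothing is shared.
second_theta = share_property(john, himself, "theta")
```

The same function handles all four cases; the different outcomes follow from what each NP already has, not from different rules.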
The alternative approach sketched here gives a unified account of the common core of "move alpha" and other rules of construal. It not only accounts for the classical cases discussed at the beginning of this chapter but also for the problematic (9b), which was beyond the scope of "move alpha". Let us briefly consider, then, how these cases are accounted for. Take the S-structure representation of (1): (16)
Headway_j seems to be made t_j
The relevant idiomatic interpretation is forced upon this structure if the complement of made has the lexical content headway. Since the trace t_j does not have inherent lexical properties, they must be borrowed elsewhere. The trace and its antecedent headway meet the conditions of the configurational matrix, so that (13) applies. This entails that t_j has the required lexical properties, which it shares with its antecedent. Thanks to (13), this result can be derived without reconstruction of a level of D-structure in which headway actually occupies the position of the trace. Similar considerations hold for the agreement fact (12b), repeated here
for convenience:
(17)
Which boys_j do you think that Bill said that Mary thinks t_j have lost
The agreement relation requires the feature "plural" on the trace t_j. Traces never have such properties inherently, but thanks to (13) the feature can be borrowed from the antecedent which boys, which is inherently plural. Let us now have a closer look at the various levels that have been proposed in the literature: (18)
a. D-structure (Chomsky (1981b))
b. NP-structure (Van Riemsdijk and Williams (1981))
c. S-structure (Chomsky (1981b))
d. Logical Form (Chomsky (1981b))
e. surface structure (Chomsky (1981b))
There is some consensus about the idea that S-structure is the most fundamental level of syntactic representation. Given the strong and growing evidence for empty categories with their distinct properties, the existence of this abstract level seems well established. Naturally, surface structure is then also relatively unproblematic. It differs from S-structure by certain marginal deletions, and perhaps by certain stylistic rules. All the other levels are highly problematic. They are interrelated by "move alpha", a ghost device the properties of which have never been successfully identified. This can be seen by inspecting the properties of traces, the products of "move alpha". Chomsky (1981b, 56) gives the following distinguishing properties: (19)
a. trace is governed
b. the antecedent of trace is not in a θ-position
c. the antecedent-trace relation satisfies the Subjacency condition
None of these properties distinguishes traces from other things. Not only traces but also lexical anaphors are governed. There is strong evidence that PRO can be governed in a subclass of control structures (Koster (1984a) and chapter 3 below); pro (Chomsky (1982a)) is also governed, as a subject in pro-drop languages and also as a resumptive pronoun (chapter 4 below). The second property (19b) is shared by trace and overt resumptive pronouns. It is an error to consider this property the property of a rule ("move alpha"). It is clearly an independent property of certain antecedents. The fact that the subject position of verbs like seem and the subject position of passive constructions are non-θ has nothing to do with "move alpha". The subject positions in question have the same properties without "move alpha", as is clear from structures like it seems that ... and from passives like it is said that ....¹ It is very unfortunate that an
independent property of the antecedent is confused with a property of the rule itself; as if the fact that anaphors can have plural or singular antecedents entails that there are two entirely different rules of bound anaphora. The third property (19c) is the only substantial property that has been attributed to "move alpha". It is one of the main theses of this book that Subjacency is not a distinguishing property either. The gaps that we find in movement constructions appear to be divided into two classes with entirely different properties. The dividing line is not Subjacency, but the Bounding Condition, which also characterizes locality in many other constructions (chapter 4 below). In other words, there is no rule with the properties of (19). Of course, there are relations with these properties. But these relations are not primitive; they are modularly built up from independent elements, such as the properties of antecedent positions, and the all-purpose property-sharing rule (13). This latter rule has the properties of the configurational matrix, which has nothing in particular to do with movement constructions. If "move alpha" is an artefact, it is hard to imagine what else could justify levels like D-structure, NP-structure, or LF. Apart from "move alpha", the standard approach is to isolate properties that can only be naturally stated at one level or another. But as noted before, such arguments are weak in principle because the relevant aspects of D- or NP-structure are represented at S-structure as subparts. Arguments for levels come down, then, to the idea that subparts of S-structure can be distinguished which have their own properties. This conclusion seems hardly controversial. Let us nevertheless have a closer look at the properties that are supposed to characterize the various levels.
2.2. D-structure
It is not easy to find out exactly what D-structure is. In Chomsky (1981b, 39) we find the following characterization: (20)
D-structure lacks the antecedent-trace relation entirely. At D-structure, then, each argument occupies a θ-position and each θ-position is occupied by an argument. In this sense, D-structure is a representation of θ-role assignment - though it has other properties as well, specifically, those that follow from X-bar theory and from parameters of the base (e.g. ordering of major constituents) in a particular language.
There are two aspects here: (i) D-structure has no traces, and (ii) it is a pure representation of GF-θ (among other things). Note that these two aspects are independent of one another. In practice, D-structure is interpreted as a level without traces, but its significance is obviously based on the second aspect, i.e. its being a pure representation of GF-θ. That the
two aspects are not interrelated can be seen from an example like the following:
(21)
What_j did he see t_j?
If D-structure is defined as a level without traces, (21) is of course not a D-structure, but if it is only defined as a level at which each argument occupies a θ-position, (21) does qualify as a potential D-structure. In GB theory, a Wh-trace is considered a variable, i.e. an argument. So, the representation (21) contains two θ-positions that are both filled by an argument (he and t, respectively). If the essence of D-structure is the pure representation of GF-θ, movement to A'-positions is irrelevant: before and after the movement the A-chains have exactly one element, which is typical of D-structures. In practice, (21) is not interpreted as a D-structure, but this then depends on the extra stipulation that D-structure contains no traces, neither NP-traces nor Wh-traces. For Wh-traces, this has nothing to do with the essence of D-structure (its being a pure representation of GF-θ). If we drop the unmotivated stipulation, we can maintain the essence of D-structure and consider (21) a D-structure (which falls together with its S-structure, as in so many other cases). This is a welcome conclusion, because there are independent reasons to assume that Wh-phrases must be base-generated in COMP in certain cases. This is so in languages with overt resumptive pronouns (which are very marginal in English). I will show below that English has empty resumptive pronouns that cannot be related to their Wh-antecedent in COMP by "move alpha". So, the Wh-phrase in COMP in (21) is in one of its possible base positions, and its trace is an argument with a function chain of one member, which is in accordance with the definition of D-structure. Since it is not possible to exclude (21) as a D-structure on the basis of the argument-θ-role distribution, and since the Wh-phrase is also in a possible D-structure position, I see only one argument - apart from arbitrary stipulation - against its D-structure status: the properties of "move alpha".
If "move alpha" is a condition on derivations with specific properties, and if the antecedent-trace relation in (21) has these properties, then (21) is not a plausible D-structure. Chomsky has argued recently, however, that there are reasons to consider the traditional characteristic of "move alpha", Subjacency, a property of S-structure (LF movement does not obey Subjacency; class lectures, fall 1983, and Chomsky (1986a)). But if Subjacency is a property of S-structure, there are no significant reasons left to deny D-structure status to structures with only Wh-traces. This is a fortiori true for the theory presented here, according to which "move alpha" has no characteristic properties at all. We must conclude, then, that the D-structure/S-structure distinction is practically meaningless for the many constructions that only involve Wh-movement (see Chomsky (1977) for the scope of this rule). If the D-structure/S-structure distinction is significant at all, it must be based on NP-movement, because only this rule creates A-chains with more than one member. But here we meet other problems. If Wh-movement (for instance in (21)) exists, there must be a distinction between a category as a functional position in a structure and the lexical content of that category. This is clear from the fact that the alleged D-structure of (21) has the Wh-phrase in the position of the argument, the trace: (22)
COMP he PAST saw [NP what]_j
The θ-role can be assigned to the object NP only in abstraction from its lexical content. The reason is that this lexical content is moved to COMP, where it does not have a θ-role (Chomsky (1981b, 115)). The θ-role is left behind at the now empty NP position (the trace). It is therefore not necessary for "move alpha" to carry along θ-roles. What (22) and (21) have in common from the point of view of the θ-criterion and the Projection Principle is that in both cases there is one θ-role assigned to one argument position, i.e. the object position. In (22), this position has lexical content, and in (21) the lexical content has been moved. What remains constant is the θ-role assigned to the NP position, which then has this θ-role in abstraction from its lexical content. This is not what we see in the case of NP-movement: (23)
a. NP was arrested [NP John]_j
b. John_j was arrested [NP t_j]
This case has been treated in different ways. One way is to assign the θ-role to the NP John in (23a); when John is moved to the subject position, the θ-role is carried along. The θ-role is then not assigned to the object position, in abstraction from its lexical content, as in (22). This is hardly a fortunate result, because θ-role assignment would be more or less dependent on the content of NPs: if the NP contains a (quasi-) quantifier, the θ-role is assigned to the position (22), and if the NP contains a referential expression, the θ-role is assigned to that expression (i.e. not to the position but to the content of the position: (23a)). The problem can be circumvented by assigning θ-roles to chains, which is more or less standard now (see Chomsky (1982a)). But this is also problematic, because now John no longer has a θ-role itself in (23b). At S-structure, then, the only way to see whether the conditions of the θ-criterion are met is by inspecting the chain. But this algorithm, which checks whether John is connected to a θ-position, practically mimics "move alpha". In short, both methods of transmitting a θ-role to a derived A-position lead to problems: either Wh-movement and NP-movement get a different treatment, or "move alpha" is duplicated. But even if these problems can
be solved, the biggest conceptual problem remains: the derived structure (23b) seems to contain two arguments, a name (John) and an anaphor (the NP-trace). GB theory explicitly states that anaphors are arguments, which is only reasonable (Chomsky (1981b, 35)). Since NP-traces are anaphors for the binding theory (Chomsky (1981b, ch. 3)), a structure like (23b) contains two arguments. This is at variance with the θ-criterion and the Projection Principle, which require a one-to-one relation between θ-roles and arguments at all levels. In practice, therefore, NP-traces are supposed to be non-arguments in structures like (23b). This does not follow from the θ-criterion, which only entails that (23b) contains one argument, without telling which of the two NPs is the argument. If not only names, but also all anaphors are arguments, (23b) is in fact ruled out by the θ-criterion, unless it is guaranteed somehow that some anaphors (NP-traces) are non-arguments. This must be done by stipulation: (24)
Anaphors are arguments unless they are non-θ-bound in a non-Case-position
Even with this stipulation of the worst possible sort, the contradiction remains, because NP-traces must be arguments for binding purposes: (25)
a. They_i seem [t_i to like each other_i]
b. They_i were confronted t_i with each other_i
In both cases, each other is A-bound by a trace of NP-movement. But if NP-traces can enter into a chain of coreference, they must be capable of some referential function themselves, and are therefore arguments by definition. There is also another reason to consider both they and its trace to be arguments in (25a). Both are followed by a VP; if the notion argument makes sense at all, it is reasonable to say that each NP in the predication relation par excellence, the [NP VP] relation, is an argument. It seems to me that the ugly stipulation and the contradiction that we observed form strong counterevidence against the second part of the θ-criterion, i.e. the second conjunct of (26) (Chomsky (1981b, 36)): (26)
Each argument bears one and only one θ-role, and each θ-role is assigned to one and only one argument
If both the antecedent and the trace (after NP-movement) are arguments, we have one θ-role distributed over two arguments. This is a welcome conclusion, because, as we discussed in chapter 1, the configurational matrix requires a unique antecedent but not a unique dependent element. In other words, the core relations of grammar are not biunique. But this fact throws a new light on the θ-criterion (26). As mentioned in chapter 1, licensing relations meet the conditions of the configurational
matrix. If this is the case, the first part of the θ-criterion need not be stipulated. It simply follows from the general uniqueness property of the configurational matrix: the θ-roles can depend on one and only one antecedent, the licensing governor in this case. This fact is completely analogous to what we observe for bound anaphors: they cannot have split antecedents: (27)
*John confronted Mary with themselves
A dependent element like a reflexive can receive only one referential index from one antecedent. Similarly, an argument can receive only one θ-role from one licensing category. But if the second part of the θ-criterion is false, the licensing relation is also in this respect like other core relations. Anaphors must have a unique antecedent, but a given antecedent can take more than one anaphor:
(28)
They talked with each other about each other
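The asymmetry between (27) and (28) amounts to saying that the dependency relation is a function from dependents to antecedents that need not be one-to-one. A minimal sketch of that check follows; the function name split_antecedents and the encoding of a sentence as (dependent, antecedent) pairs are hypothetical conveniences, not part of the theory.

```python
from collections import defaultdict


def split_antecedents(links):
    """Return the dependents that are linked to more than one antecedent.

    `links` is a list of (dependent, antecedent) pairs. An antecedent may
    recur freely; only a dependent with several antecedents is reported.
    """
    antecedents = defaultdict(set)
    for dependent, antecedent in links:
        antecedents[dependent].add(antecedent)
    return sorted(d for d, found in antecedents.items() if len(found) > 1)


# (27) *John confronted Mary with themselves: one dependent, two antecedents.
ex27 = [("themselves", "John"), ("themselves", "Mary")]
bad27 = split_antecedents(ex27)

# (28) They talked with each other about each other: two dependents, one antecedent.
ex28 = [("each other (first)", "they"), ("each other (second)", "they")]
bad28 = split_antecedents(ex28)
```

Only (27) is flagged: the check constrains the number of antecedents per dependent and says nothing about the number of dependents per antecedent.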
All in all, it appears that the theory of grammar is considerably simplified if we drop the second part of the θ-criterion. It is no longer necessary at all to stipulate the θ-criterion, if licensing is a core relation. Together with the empirical evidence given earlier, this forms very strong evidence for the idea that NP-traces are in fact arguments. Consider now a relevant example: (29)
John_j seems [t_j to go]
If this S-structure contains two arguments (to one θ-role), its D-structure, by the Projection Principle, also contains two arguments. But then it becomes senseless to postulate a D-structure which is different from its S-structure for (29). For NP-movement, then, we come to the same conclusion as for Wh-movement: it does not make sense to remove traces from D-structure (= S-structure). In other words, it does not make sense to distinguish D-structure from S-structure. We have now also located our main difference with the standard GB theory. According to the standard approach, the θ-criterion is a biuniqueness condition that states that the relation between θ-role assigners and arguments is one to one. According to the present approach, the relation between θ-role assigners and arguments is one to one or one to many. As we have seen, this leads to three disadvantages for the standard approach: (i) part of the θ-criterion has to be stipulated, (ii) it must be stipulated that some anaphors are not arguments, (iii) this latter stipulation leads to a contradiction. I will now try to sketch the outlines of a theory without these three disadvantages. As already mentioned, the θ-criterion disappears, because its empirically relevant part follows from the general properties of core
relations (in particular from the uniqueness property of the configurational matrix). Although it does not make sense to distinguish D-structure from S-structure in the alternative theory, the Projection Principle still makes sense. This is so because the existence of Lexical Structure, distinct from S-structure, is not disputed. Thus, if a verb selects an object, this object must always be represented at S-structure. In structures with fronted Wh-objects, then, the gap in object position must contain an empty category (a trace in the standard theory). Nevertheless, I would like to slightly modify the Projection Principle, or rather its scope. Much of the standard theory is inspired by the desire to define syntactic structure as a projection from the lexicon. This has not been entirely successful, because of the obligatoriness of subjects. This has led to the Extended Projection Principle: syntactic structures consist of projections from the lexicon plus subjects (Chomsky (1982a, 10)). These are also the θ-positions. In the same spirit, I would like to define the possible θ-positions (argument positions): (30)
θ-roles are assigned by:
a. heads (for complements) (to direct θ-positions)
b. predicates (for subjects) (to indirect θ-positions)
The first part (30a) is in accordance with the standard Projection Principle. The second part (30b) is an extension that goes slightly beyond the standard extension of the Projection Principle. The standard extension concerns subjects in the sense of Chomsky (1965), i.e. subjects defined as [NP, S]. It seems to me that this is not sufficient, and that the extension must cover all subjects of subject-predicate relations in the sense of Williams (1980) and subsequent papers. According to this conception, a subject is an NP in the configuration [β NP XP], where XP stands for any maximal projection (including S'). The NP subject in this sense may receive a θ-role by indirect θ-marking (Chomsky (1981b, 38)), but also by binding an element in the predicate XP. Some possibilities are exemplified by (31): (31)
a. John broke his arm
b. John_j [VP seems [t_j to go]]
c. John_j [S' O_j [I don't really like t_j]]
In all three cases, the argument John is followed by a predicate. In (31a), John receives a θ-role by indirect θ-marking in the usual sense. In (31b), John receives a θ-role by binding an open place in the following predicate. The θ-role of the open place is transmitted by the property-sharing rule (13).
It seems to me that the subject-predicate relation is the only extension we need: it is the only place where direct projection of θ-roles from the
lexicon fails. Ultimately, all θ-roles come from the lexicon, but they are only indirectly assigned to subjects. Since we gave up the one-to-one requirement between θ-roles and arguments, this indirect θ-marking by "property sharing" with another argument is unproblematic. Topicalized constructions like (31c) have always been very problematic for the standard approach. The open sentence is predicated over John in (31c) (Chomsky (1977)), so that John must be an argument according to any reasonable definition of this term. But if John is an argument, it must have a θ-role. Under the property-sharing approach, this is not a problem, because John is linked to the trace in (31c) by a construal chain. This trace, an argument, has a θ-role that may be shared by the other argument, John. A movement analysis, on the other hand, is impossible for topicalization. John would originate in the trace position, move to COMP, and from there it would be lifted to the topic position by Vergnaud-raising (see Van Haaften et al. (1983)). But as I will argue below, Vergnaud-raising is impossible for topicalization. In Dutch, topicalization may look like English topicalization, but it may also involve a so-called d-word in COMP position (Van Riemsdijk and Zwarts (1974), Koster (1978a)): (32)
Die man, die ken ik
that man that know I
In this case, not only a θ-role is transferred, but also Case. In languages with rich overt Case-marking, like German, agreement in Case is normal (see Van Riemsdijk (1978)): (33)
Den Hans (acc.), den (acc.) mag ich nicht
the John him like I not
'John, I don't like him'
This example shows once again that in general there is no one-to-one relation between antecedents and dependent elements. There is always a unique antecedent (the Case assigner in this example), but there may be more than one dependent element (Case-bearing NPs). The Dutch and German cases definitely do not involve Vergnaud-raising, which would create the d-word with its Case ex nihilo (see also Cinque (1983a) and section 3 below for more arguments). So, here we have a crucial example: Case and θ-role assignment to the topic by movement is impossible, while the property-sharing rule may use the construal chain through the anaphoric d-words to transfer to the topic the licenses it needs. The examples with the d-words are particularly interesting because d-words do not usually link idiom chunks to their licensing position, as shown by Van Riemsdijk and Zwarts (1974):
(34)
a. Ik geloof er de ballen van
   I believe there the balls of
   'I don't believe any of it'
b. *De ballen, dat/die geloof ik er van
   the balls that believe I there of
Usually, "move alpha" can transfer at least three things: a θ-role, Case, and lexical content. If we compare (32) and (33) to (34), we see a discrepancy: in the first two examples, it appears that d-words can transmit a θ-role and Case, but from (34) it is clear that lexical content cannot be transmitted. This difference does not come as a surprise. As I argued before, the property-sharing rule transmits whatever properties can be transmitted. Normally, the uniqueness condition works as a filter. Thus, Case cannot be transmitted to NPs that already have Case. Similarly, lexical content cannot be transmitted to an NP position that already has lexical content. Thus, the representation of (34b) is as follows: (35)
*De ballen_j die_j geloof ik t_j er van
A Case-marked trace must have a unique lexical content as antecedent (antecedents are always unique). Die in (35) qualifies as the lexical content of the trace, but then it is impossible for the idiomatic NP de ballen to also qualify as the lexical content of the trace position. Die_j cannot be skipped, because according to the configurational matrix, an antecedent is obligatory within a local domain. The transfer of θ-roles and Case is unproblematic, however, in such cases. For those, the licensing element (the assigner) is the antecedent. So, the trace t_j in (35) has a unique antecedent within the local domain, the verb geloof. As noted before, the number of dependent elements is not constrained by a uniqueness condition, so that both the topic and the d-word may depend on the assigner of Case and θ-role. So, the rule "share property" works selectively, since its scope is "filtered" by independent principles, such as the uniqueness property of the configurational matrix. This approach solves a paradox about easy-to-please constructions (Chomsky (1981b, 308-314)): (36)
Johnj [vP is [AP easy [OJ [PRO to please tj]]]]
John seems to be in a non-θ-position because it can be replaced by it (it is easy to please John). Traditionally, it has also been assumed that John has its D-structure position in the trace position, from where it is moved to the matrix subject position (see Lasnik and Fiengo (1974), however, for a deletion approach, and also Chomsky (1977) for a similar approach). A movement analysis for (36) leads to a paradox, as noted by Chomsky (1981b, 309). The problem is that idiom chunks cannot be moved to the matrix subject position, as one might expect under a movement analysis:
(37) a. *Good care_j is hard to take t_j of the orphans
     b. *Too much_j is hard to make t_j of that suggestion
It seems to me that this paradox cannot be solved under the standard assumptions. Chomsky (1981) assumes that the examples in (37) show that a movement analysis is not possible. I agree, but it must be concluded then that the standard assumptions are seriously undermined, because the standard approach crucially assumes that θ-roles are assigned directly, and not by linking. Moreover, Chomsky (1981b, 313) observes that a nonmovement analysis creates a new problem. If John is inserted in D-structure, the Projection Principle requires that its position be a θ-position, which it is not. Chomsky therefore weakens the assumptions about lexical insertion by assuming that John is inserted in S-structure in (36) (while such names are inserted in D-structure elsewhere). This is even interpreted as an argument in favor of D-structure, because the solution of the paradox crucially involves the distinction between S-structure and D-structure (Chomsky (1981b, 346, point (e))). It seems reasonable, however, to interpret the paradox as an argument against D-structure and the standard assumptions. Clearly, John is an argument in (36), which must receive its θ-role directly, if the standard assumptions are correct. For the alternative approach, however, (36) is unproblematic. John is inserted at S-structure like all other lexical items (the simplest theory) and it may receive a θ-role because it is a subject. In particular, it must receive a θ-role from its predicate according to (30b). Since there is a construal chain (indicated by the indices in (36)), this θ-role may be shared with the trace coindexed with it, a trace within the predicate as required. As we saw in the Dutch case, idiomatic lexical content is not necessarily transferred in construal chains. It is only transferred if the chain does not contain other lexical material. It is reasonable, however, to assume that the operator O_j in (36) has features.
Intermediate links in COMP do not necessarily have content, but a COMP-to-COMP chain always ends in an operator position, usually marked by the feature +WH (see Chomsky (1977)). It seems appropriate to assume, then, that the feature that makes a COMP position an operator is also present if the operator is not phonetically realized, as in (36). We can also consider these lexical features of the operator position the realization of the Case assigned to the trace. Under the alternative theory, there is nothing paradoxical about (36). There is a construal chain as indicated, and property sharing is filtered by the uniqueness condition as usual. A θ-role is transferred to John, because it is not in a direct θ-position. Case is not transmitted, however, because John is already in a Case position. Similarly, lexical content is not transmitted, because the lexical content of the trace position is already satisfied by the features of the operator position. But since lexical content is not transmitted, idiom chunks cannot appear in the matrix subject position, as shown by (37).
I will now give a brief review of all arguments in favor of D-structure that can be found in Chomsky (1981b), and that are summarized there on page 346. There is some consensus that S-structure is the basic level of syntactic representation. Chomsky notes that the arguments for D-structure (as a level distinct from S-structure) are "highly theory-internal". In particular, "[t]he existence of a level of D-structure, as distinct from S-structure, is supported by principles and arguments that are based on or refer to specific properties of this level, which is related to S-structure by the rule Move-α." The arguments in which D-structure plays a role are summarized as follows (page numbers of Chomsky (1981b) added):
(38) a. asymmetric properties of idioms (ch. 2, note 94)
     b. movement only to non-θ-position ( ... and discussion ... of the distinction between NP-trace and PRO) (p. 46ff.)
     c. restriction of an operator to a single variable (p. 203)
     d. the requirement that AGR-subject coindexing be at D-structure, as distinct from government by AGR at S-structure, with its various consequences (p. 259)
     e. the possibility of inserting lexical items either at D- or S-structure (p. 312)
We have just discussed argument (38e) and concluded that the facts in question form arguments against D-structure. We can therefore limit our attention to the first four arguments (38a-d). The idiom argument hinges on the fact that some idioms can be "scattered" at S-structure (good care_i was taken t_i of the orphans), while others cannot (*the bucket_i was kicked t_i). In other words, idioms of the first type can undergo movement (bind traces), while idioms of the second type cannot. The argument deserves to be quoted in full (Chomsky (1981b, 146, note 94)):
Thus idioms in general have the properties of non-idiomatic structures, and appear either in D-structure or S-structure form, but not only in S-structure or LF-form. D-structure, not S-structure or LF, appears to be the natural place for the operation of idiom rules, since it is only at D-structure that idioms are uniformly not "scattered" and it is only the D-structure forms that always exist for the idiom (with marked exceptions), S-structures sometimes being inaccessible to idiomatic interpretation. Thus at D-structure, idioms can be distinguished as subject or not subject to Move-α, determining the asymmetry just noted.
It is true that there are idioms that only exist in their D-structure form, but there are also idioms that only exist in S-structure form (the marked exceptions mentioned in the quotation). Bresnan (1982), for instance, gives passive idioms like x's goose is cooked (meaning, x is in trouble and there is no way out). But it is irrelevant whether there are many or few such examples, because the logic of the argument is unclear. What is an idiom rule? Presumably it is a rule that says that a V + NP combination, among others, has an idiomatic interpretation (make + headway, kick + the bucket, etc.). It seems to me that the most natural place for such interpretation (e.g. kick the bucket = 'die') is not D-structure but the lexicon. The crucial fact, then, is that some idioms can be scattered and some cannot. But of course, the most natural place for that information is also the lexicon. The question is how this information must be coded. It should be noted that the fact to be accounted for is not that no element of certain idioms can be moved. The NP part of certain V + NP idioms cannot be moved, but there is no direct evidence from English that the V part is also immobile. A language like Dutch has some obligatory V-movement rules, V-second (Koopman (1984)), and V-raising (Evers (1975)). It appears that the V part of all V + NP idioms in Dutch undergoes these rules, including idioms of the type kick the bucket. An example is de pijp uitgaan ('to die', lit. to go out of the pipe):
(39) a. dat hij de pijp uit ging (non-root order)
        that he the pipe out went
     b. hij ging de pijp uit t (root order after V-second)
     c. dat hij de pijp t scheen uit te gaan (after V-raising)
        that he the pipe seemed out to go
I conclude from these facts that the non-scattering of idioms is a fact of the NP, not of the V, in V + NP idioms. The question now is what the nature of this fact is. Chomsky (1981b, 146, note 94), assumes - and that is the crux of the argument - that the NP must be marked as not undergoing "move alpha". This marking can of course be done in the lexical specification of the idiom, but it remains a fact about certain idioms which cannot be moved, and therefore can only be inserted in D-structure. But note that under this interpretation the argument tacitly assumes what it must prove, namely that the crucial fact about certain idiomatic NPs is plus or minus "move alpha". It is not only possible but presumably even necessary to code the properties of the idiomatic NPs in the lexicon in a different way. The fact to be explained is that the bucket in kick the bucket cannot bind a trace at S-structure. Suppose now that we code this in the lexicon as follows:
(40) [V kick] [NP the bucket] = 'die'
              [-antecedent]
Idioms like care (to take care) and headway are not marked with [-antecedent], a marking which presumably follows from a more general property, e.g. the property of being nonreferential in some sense. The marking with [-antecedent], as in (40), now no longer blocks insertion at S-structure, but the result is filtered out if the bucket binds something at S-structure, for instance a trace. This solution is presumably better than the marking with [-move alpha] at D-structure, because (40) also blocks (41) at S-structure:
(41) *He kicked the bucket_j before he had paid for it_j
The bucket cannot be the antecedent for the pronominal it either, a fact about binding stated at S-structure. Parts of idioms like care can sometimes be antecedents at S-structure (Chomsky (1981b, 327)):
(42) Care_j was taken t_j of the orphans, but it_j was sometimes insufficient
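The lexical coding in (40) and its S-structure effect in (41)-(42) can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the names `IdiomEntry`, `LEXICON`, and `may_bind` are hypothetical, the boolean field `antecedent` stands in for the feature [-antecedent], and the gloss given for take care is merely a placeholder. It is not meant as an implementation of the theory itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdiomEntry:
    """Hypothetical lexical entry for a V + NP idiom, cf. (40)."""
    verb: str
    np: str
    meaning: str
    antecedent: bool  # False encodes the marking [-antecedent]


# Illustrative lexicon; feature values follow the discussion of (40)-(42).
LEXICON = {
    "the bucket": IdiomEntry("kick", "the bucket", "die", antecedent=False),
    "care": IdiomEntry("take", "care", "look after", antecedent=True),
}


def may_bind(np: str) -> bool:
    """S-structure filter (assumed): an idiomatic NP marked [-antecedent]
    may not bind anything, neither a trace nor a pronoun.
    NPs without an idiom entry are not restricted by this filter."""
    entry = LEXICON.get(np)
    return True if entry is None else entry.antecedent


# (41) *He kicked the bucket_j before he had paid for it_j
print(may_bind("the bucket"))  # False
# (42) Care_j was taken t_j of the orphans, but it_j was sometimes insufficient
print(may_bind("care"))        # True
```

The point of the design is that insertion is never blocked; only the binding relation is filtered, so the same feature excludes both trace binding and pronoun binding.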
All in all, it can hardly be concluded that the idiom argument supports D-structure. Idioms surely differ from one another, a fact that is naturally expressed in the lexicon. But the differences in question are best interpreted as differences in S-structure behavior.
The second argument (38b), "movement only to non-θ-positions", has to do with the θ-criterion. Again, we see that "movement" is already presupposed. But since part of the θ-criterion is preserved in the alternative account, the fact in question receives an explanation that does not substantially differ from the standard account:
(43) NP_j, ..., NP_j
       <----θ----
If two NPs are coindexed, property sharing, including sharing of the θ-role, is possible. But as we saw before, property sharing is filtered by the uniqueness condition: the second NP in (43) can transmit a θ-role to the first only if it does not have a θ-role of its own. This fact has nothing to do with D-structure, but is explained by the uniqueness property of the configurational matrix, which is a property of S-structure relations. Note that it is also guaranteed under the alternative account that in a function chain GF_1, ..., GF_n, it is always GF_n that is directly licensed. Suppose it were otherwise, i.e. that a θ-role were indirectly assigned (transmitted) to the last NP in a chain:
(44) ..., NP_{n-1}, ..., NP_n
             ----θ---->
Because of the c-command requirement, each link in a chain c-commands the next link; therefore, NP_{n-1} c-commands NP_n. Suppose now that NP_n is not directly θ-marked, but that it receives its θ-role from NP_{n-1}. According to (30), indirect θ-marking goes only from predicates to subjects. Consequently, NP_{n-1} must be contained in the predicate of which NP_n is the subject. But this is only possible if NP_{n-1} does not c-command NP_n (the predicate itself c-commands the subject, so that the material contained by the predicate does not c-command the subject). But if NP_{n-1} does not c-command NP_n, these two NPs do not form a link of a chain. Therefore, it is impossible for the last element of a chain to get a θ-role indirectly. The last element must always be in a direct θ-position, and the other elements must be in non-θ-positions because of the uniqueness condition. The difference between trace and PRO will be the topic of the next chapter.
The third argument (38c) concerns examples like (Chomsky (1981b, 203)):
(45) *Who_i did you give [pictures of t_i] to t_i?
This example is supposed to be ungrammatical because of the fact that it contains two variables. The idea is that D-structure cannot contain traces and who_i can fill only one variable position at D-structure, so that the D-structure for (45) always contains a non-argument, [NP e]. This argument is without force, because, as we saw before, the definition of D-structure does not exclude a base-generated Wh-phrase binding two variables (unless it is stipulated that D-structure does not contain Wh-traces). More importantly, the intended explanation is completely overruled by the discovery (or rediscovery) of parasitic gaps:²
(46) Which book_i did you return t_i before reading e_i?
This structure contains two variables that cannot both be filled at D-structure by which book. It is therefore not surprising that the earlier explanation for the ungrammaticality of (45) is not maintained in Chomsky (1982a).
The fourth argument (38d) has to do with the ungrammaticality of the following Italian sentence (Chomsky (1981b, 259)):
(47) *NP_i AGR_i sembra [S Giovanni leggere i libri]
     seems to read the books
The intended explanation is based on the idea that assigning nominative Case involves a mechanism with two components:
(48) a. AGR is coindexed with the NP it governs
     b. nominative Case is assigned to (or checked for) the NP governed by AGR
Clearly, (48b) applies at S-structure (as Chomsky notes), because Case must be checked after Raising. The argument, then, crucially involves the assumption that (48a) applies at D-structure (and not at S-structure). If this assumption is plausible, we might have some confirmation for D-structure. According to Chomsky, (48a) must apply at D-structure for the following reason. If it is assumed that in pro-drop languages the rule R (which adjoins AGR to V) applies in the syntax, AGR will govern Giovanni in (47): "If AGR could be coindexed with Giovanni by [(48a)], then both conditions for nominative Case assignment would be fulfilled: Giovanni would receive nominative Case in [(47)] and raising of the embedded subject would not be obligatory. But if the agreement phenomenon is determined at D-structure, then the structure [(47)] is barred as required" (Chomsky (1981b, 259-260)). It seems to me that this argument is based on questionable assumptions. A much simpler analysis of (47) assumes that the nominative Case assigner AGR (with or without the rule R) or INFL is not accessible to Giovanni. Nominative Case is assigned under government by AGR (or INFL), but in this case Giovanni cannot be governed by AGR because Giovanni is already in the domain of another governor, namely the verb sembra. As we discussed in chapter 1, government is determined by minimal c-command, i.e. a governor γ cannot govern an element δ in the domain of another governor γ'. Consequently, Giovanni is not governed by AGR in (47), so that nominative Case is not assigned to it and (47) is ruled out by the Case filter without Raising.³ There are various ways to account for pro-drop phenomena, as indicated by Chomsky (1982a, 78ff.), but there is no evidence that an account involving D-structure is somehow superior, or even plausible. We must conclude, then, that none of the arguments given in Chomsky (1981b) and summarized in (38) supports D-structure.
Let us now turn to some direct arguments against D-structure. The first argument is based on a phenomenon analyzed by Obenauer (1984). Obenauer has shown that there is a phenomenon in French which he calls "Quantification at a Distance", the binding of the empty element e_i by beaucoup in (49b):
(49) a. Il a rencontré [beaucoup de linguistes]
     b. Il a beaucoup_i rencontré [e_i de linguistes]
The interesting property of beaucoup is that it can also occur in these contexts without binding an empty element, with a similar meaning, codetermined by the verb:
(50) Il a beaucoup rencontré Jean
There are similar quantifiers, like combien, that can undergo Wh-movement:
(51) a. [Combien de linguistes]_j a-t-il rencontré t_j
     b. [Combien]_j a-t-il rencontré [e_j de linguistes]
It appears that this type of Wh-movement is not possible across a quantifier like beaucoup:
(52) *[QP Combien]_j a-t-il beaucoup rencontré [NP [QP e]_j de linguistes]
Obenauer calls this pseudo-opacity. This phenomenon is problematic for a movement account, because nothing blocks the movement of combien in (52) (cf. (51b)), and beaucoup occurs in the preverbal position where it can normally occur without having a movement source (see (50)). Obenauer argues persuasively that beaucoup necessarily becomes an A'-binder of the trace, if there is a trace at S-structure. Under this assumption, (53) (= (52)) is straightforwardly ruled out:
(53) *Combien_j a-t-il beaucoup_j rencontré [t_j de linguistes]
In the theory presented here, this structure is ruled out by the uniqueness principle at S-structure: a trace can have only one antecedent. The point is that this analysis is based on the fact that the relation between beaucoup and the trace is of the same nature as the relation between combien and the trace (the uniqueness condition constrains relations of a given type). But then the relation in question has nothing to do with "move alpha", because the relation between beaucoup and the trace is not created by "move alpha". In terms of a movement account, beaucoup only becomes a trace-binder after the unrelated movement of combien. In other words, (52) clearly involves a relation of the antecedent-trace type that is not created by "move alpha". As Obenauer rightly concludes, such examples favor a representational view of the antecedent-trace relation over a derivational view.
A similar argument against the derivational approach can be based on island violations. In some languages, like Swedish, islands can relatively easily be violated (see Allwood (1976)), which might have to do with the productivity of the resumptive pronoun strategy in this language (see Engdahl (1984)). But even in English, relatively acceptable violations of the Complex NP Constraint can be produced (as we briefly discussed in chapter 1):
(54) Which race_j did you express [NP a desire [S to win e_j]]
This is not a universal phenomenon, because the Dutch equivalent is totally ungrammatical:
(55) **Welke race_j heb je het verlangen (om) e_j te winnen uitgedrukt?
     which race have you the desire COMP to win expressed
If Subjacency were discovered on the basis of Dutch, we could simply say that this sentence is ruled out by Subjacency (or some equivalent of it; see chapter 4). It would be a reasonable conclusion, then, that (54), which as a Subjacency violation does not show the characteristic property of movement, cannot be derived by movement. If this conclusion is correct, it must be possible to generate Wh-phrases in COMP (as in (54)) without "move alpha". Since there is nothing in the definition of D-structure that precludes base-generation of variables (apart from stipulation), (54) must be a possible base structure. Despite the fact that (54) does not show the property of "move alpha", it is usually assumed that it is derived by "move alpha", perhaps under the further assumption that Subjacency is not an absolute principle, but an expression of the unmarked case. The Dutch example, however, shows that Subjacency is an absolute principle: if it is violated, the resulting sentence is very unacceptable. I will show in chapter 4 that relatively acceptable island violations do not involve "move alpha" in the traditional sense, but a resumptive pronoun strategy, as discovered by Cinque (1983b) and Obenauer (1984). A violation of island constraints (as in (54)) is not simply "movement" with less strict Subjacency. Gaps in such structures appear to have properties that are entirely different from the properties of standard traces. Standard traces can be of all categories (NPs, PPs, adjuncts, etc.). Gaps in islands (like e_j in (54)), parasitic gaps among them, are usually exclusively of the category NP. Moreover, they can only be found in structures in which certain directionality constraints are met (global harmony, see chapter 4). The directionality constraints in question can be met in SVO languages like English, Romance, and Scandinavian, but not in SOV languages like Dutch or German; hence, the acceptability of (54) in the SVO languages, and the total unacceptability in the SOV languages (see (55)).
Gaps in islands, in other words, show a surprising clustering of properties:⁴
(56) a. Subjacency is violated
     b. only NPs are possible
     c. directionality constraints must be met
Gaps that are strictly locally bound, like "traces", lack these three properties. What this indicates is that the violation of Subjacency in (54) is not an accidental property, but a characteristic property of the construction in question (i.e. the construction with the antecedent-resumptive pronoun strategy): the properties (56b-c) are found only if Subjacency is violated. If these conclusions (worked out in chapter 4) are correct, (54) cannot be derived by "move alpha". There is, therefore, strong evidence that Wh-phrases can be generated in COMP without involvement of "move alpha". Like Obenauer's argument, this argument favors the representational view over the derivational view for the Wh-phrase-gap relation.
We can now summarize our objections against D-structure and "move alpha". D-structure is essentially a reconstruction level for the direct assignment of θ-roles. As we have seen, it is not possible to construct such a level (distinct from lexical structure). Even if D-structure is assumed, θ-role assignment must be extended to subjects (indirect θ-marking in the sense of Chomsky (1981b, 38)). This is not sufficient, however, because arguments in topicalization and easy-to-please constructions are neither complements nor subjects that qualify as indirect θ-positions in the sense of the Extended Projection Principle. Indirect θ-role assignment must be extended to all subjects in subject-predicate constructions. If we do so, the biuniqueness property of the θ-criterion must be dropped, which is a welcome move for principled reasons. As soon as we drop biuniqueness (the one-to-one relation between θ-roles and arguments), the distinction between D-structure and S-structure becomes meaningless. "Move alpha" is essentially a transfer mechanism. Its status as a unique transfer mechanism, distinct from other construal rules, is not supported by the facts. First of all, it has the same configurational properties as other construal rules (see chapter 1 and the following chapters). And, secondly, it is a transfer mechanism filtered by the same uniqueness condition as other construal rules. This second aspect is very important, because in the standard GB approach much weight is given to the fact that movement is to non-θ-positions.
This is, however, totally determined by the uniqueness condition that also filters the transfer potential of other construal rules. θ-roles cannot be transmitted to positions that already have a θ-role (or that cannot have one in principle), which is just like the fact that referential indices cannot be transmitted to NPs that already have a referential index (such as nonanaphors), or that cannot have a referential index in principle (like A'-positions). That "move alpha" is just an instance of the general property-sharing rule (13), filtered by uniqueness and other independent factors, is clearly demonstrated by the fact that "move alpha" transmits different things in different circumstances:
(i) "move alpha" transmits Case (and lexical content), but no θ-role, as in Wh-movement
(57) Whom_j did you see t_j?
This follows from (30): indirect θ-role assignment is only to arguments (subjects). This fact is significant because it instantiates a general aspect of the property-sharing rule: only those properties are transmitted that the "receiving" category can take independently. In this case, a θ-role is not transmitted, because non-arguments are inherently unable to take a θ-role.
(ii) "move alpha" transmits a θ-role (and lexical content) but no Case, as in NP-movement
(58) John_j was arrested t_j
Naturally, a θ-role can be transmitted to a non-θ-position. Transfer of a θ-role to a θ-position would violate the uniqueness condition. Similarly, John in (58) receives Case independently (from INFL). Again, the uniqueness condition prohibits transport of a second Case to such positions. What Wh-movement in (57) and NP-movement in (58) have in common is the transfer of lexical content. This is not necessary, however:
(iii) "move alpha" transmits a θ-role but no lexical content:
(59) John_j wants [PRO_j to be arrested t_j]
Again, we see that what "move alpha" transmits is dependent on the inherent properties of the "landing site". If the landing site cannot have lexical content for independent reasons (like the PRO position in (59)), no lexical content is transmitted. It must be concluded, then, that "move alpha" cannot be functionally defined: what it does is contextually determined. Since "move alpha" cannot be configurationally defined either, there is no evidence for the idea that it deserves an independent existence in the theory of grammar. It is just an artefact, the result of the combined properties of the property-sharing rule (13) (i.e. the properties of the configurational matrix) and the independent properties of the landing sites. Crucial evidence in favor of this view, apart from its obvious advantage in terms of conceptual economy, can be found in topicalization and other left dislocation structures, and in easy-to-please constructions:
(60) a. Bill_j, I don't like him_j
     b. John_j is easy [O_j [to please t_j]]
As a transfer mechanism of θ-roles and Case, "move alpha" is neither necessary nor sufficient. We have already seen that "move alpha" does not always transmit a θ-role (57) or Case (58). The examples in (60) show that Case and θ-role can also be transmitted by other construal rules. What is crucial is that the transfer is dependent on the same filtering mechanism in these nonmovement construal rules. Thus, in (60a) both a θ-role and a Case are transmitted to Bill, because Bill is an argument in a position to which Case and θ-role are not assigned independently. In (60b), John is in a position with inherent Case (assigned by INFL), but without a direct θ-license. As a consequence of the uniqueness condition, Case can be transmitted in (60a) but not in (60b). A θ-role must (and can) be transmitted in both cases. Nonmovement construal, in other words, is subject to the same functional selectivity as movement. Where movement and nonmovement construal sometimes differ is with respect to the transfer of lexical content (for instance, idiom chunks):
(61) *Good care_j is hard to take t_j of the orphans
But again, this is not a difference in the nature of the transfer mechanism itself, but the consequence of an independent filter mechanism. If an idiom chunk is related to its licensing position in a construal chain that involves other lexical material (like him in (60a) and the lexical features of the operator O_j in (60b)), the lexical content of the idiom chunk cannot be fully transmitted to (shared by) the licensing position. Again, this is a result of the uniqueness condition: a given position has one and only one lexical content. No wonder, then, that idiomatic lexical content is only optimally transferred in construal chains with no other lexical features. The difference in grammaticality between (60b) and (61) is usually interpreted as an indication that "move alpha" is not involved in the θ-role transfer of the trace t_j in (60b) to John. The implicit assumption, then, is that the transfer of a θ-role and the transfer of lexical content always correlate in "move alpha". As we have seen, however, this assumption is false. In (59), for instance, a θ-role is transferred from t_j to PRO_j, while lexical content is clearly not transferred. Given the contextually determined functional selectivity of "move alpha", the discrepancy in grammaticality between (60b) and (61) is not a sufficient argument against the involvement of "move alpha" in the θ-transfer from t_j to John_j in (60b). But there are other arguments against a movement analysis. That (60a) does not involve movement is quite uncontroversial. We explained the ungrammaticality of (61) by the hidden lexical features of the operator O_j (cf. (60b)). If John in (60b) inherits a θ-role by "move alpha" through the operator position in COMP, lexical features must be created ex nihilo, and we would have a chain with two different Cases (of John and the trace, respectively), which is not possible for chains in general (Chomsky (1981b, 334)).
A structure like (60b), then, forms a strong counterexample to the usual assumptions concerning D-structure and S-structure as a pair related by "move alpha". In any case, (60b) shows a structure with two arguments (John and the trace) sharing only one θ-role. There is no corresponding D-structure for such cases, according to standard assumptions. We can give (60b) a D-structure only by dropping the assumption of a one-to-one relation between θ-roles and arguments at D-structure. This brings us once again to the essence of our argument: dropping the one-to-one relation in question comes down to giving up the idea that there is a significant distinction between D-structure and S-structure. It appears, then, that one of the oldest examples presented in favor of a level of deep structure, John is easy to please, forms one of the strongest arguments against what is now called D-structure.
In conclusion, it is useful to give a summary of the kinds of positions that we find at S-structure:
(62) a. positions that are directly projected from the lexicon (basic positions)
     b. positions that are related to (and share properties with) basic positions:
        i. Wh-positions in COMP
        ii. subjects in non-θ-positions
        iii. topics
     c. adjuncts
It has never been controversial that there are positions like the basic positions of (62a), distinct from the positions summed up in (62b). I will maintain this aspect of D-structure by sometimes referring to basic positions as D-structure positions. In chapter 4, I will show that directionality constraints (global harmony) are computed from D-structure positions only. In this sense, D-structure survives as a substructure of S-structure with specific properties. This fact is compatible with a theory like the one presented in Koster (1978c). What was denied in that theory, and what seems even more untenable now, is that S-structure with the positions mentioned under (62b) and its lexical substructure (the D-structure positions) exist as two different levels that can be bridged by "move alpha".
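As an informal illustration of the selectivity summarized in (i)-(iii) and (57)-(59) above, the following Python sketch models property sharing filtered by uniqueness. It is a toy under stated assumptions, not part of the theory: the names `LandingSite` and `transmitted` and the three property labels are hypothetical; every basic position is assumed, for simplicity, to offer all three properties; and the `can_take` sets simply stipulate what the text derives from independent principles (for instance, that PRO cannot bear Case or lexical content). Cases such as (60b), where an intervening operator supplies the lexical content, are deliberately left out.

```python
from dataclasses import dataclass

# Simplifying assumption: a basic position offers all three properties.
ALL_PROPERTIES = frozenset({"theta", "case", "lexical"})


@dataclass(frozen=True)
class LandingSite:
    """Hypothetical model of the 'receiving' position in a construal chain."""
    name: str
    can_take: frozenset                        # what the position can bear at all
    independently_has: frozenset = frozenset() # what it is licensed for elsewhere


def transmitted(site: LandingSite, offered: frozenset = ALL_PROPERTIES) -> list:
    """Properties shared with the landing site: offered by the chain,
    bearable by the site, and not already present there (uniqueness)."""
    return sorted(p for p in offered
                  if p in site.can_take and p not in site.independently_has)


# (57) Wh-position in COMP: a non-argument, so no theta-role can be taken.
comp = LandingSite("COMP", can_take=frozenset({"case", "lexical"}))
# (58) Passive subject: already Case-marked by INFL.
subject = LandingSite("subject", can_take=ALL_PROPERTIES,
                      independently_has=frozenset({"case"}))
# (59) PRO position: assumed unable to bear Case or lexical content.
pro = LandingSite("PRO", can_take=frozenset({"theta"}))

print(transmitted(comp))     # ['case', 'lexical']
print(transmitted(subject))  # ['lexical', 'theta']
print(transmitted(pro))      # ['theta']
```

Nothing in `transmitted` mentions movement: the three outcomes differ only because the landing sites differ, which is the sense in which "move alpha" is said above to be contextually determined rather than functionally defined.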
2.3. NP-structure
Van Riemsdijk and Williams (1981) have proposed a level of representation distinct from both D- and S-structure. This level is situated between the application of NP-movement and Wh-movement:
(63) D-structure
        ↓ move NP
     NP-structure
        ↓ move Wh
     S-structure
The arguments for NP-structure are very much like the standard arguments for D-structure: certain facts are treated most elegantly, in the most revealing way, if certain elements are in certain positions, rather than related to certain positions. It should be noted that it is essentially an argument of elegance, because there is a certain consensus that the
58
Domains and Dynasties
arguments in question are not absolutely compelling if there are traces at S-structure. The point can best be illustrated with an example of predication that Van Riemsdijk and Williams give (p. 205): (64)
a. John ate the meatj rawj
b. How rawj did John eat the meat tj?
Both sentences represent a subject-predicate relation with the meat as subject and (how) raw as predicate. If we call the level at which the subject-predicate relation is fixed "predicate structure" (as in Williams (1980)), the c-command condition on the predication relation is presumably stated as follows:
(65)
In predication structure, a subject must c-command its predicate or a trace of its predicate
Van Riemsdijk and Williams argue that if we assume that predicate structure is in fact the pre-Wh-movement NP-structure, the statement (65) can gain in elegance by dropping the reference to the trace of the predicate (the final phrase of (65), "or a trace of its predicate"). An argument of this type is weak in principle because the full representation of (64b) is as follows:
(66)
[AP how raw]j did John eat [NP the meat]j [AP t]j
Since the predication relation is defined for pairs [NPj APj], it is simply false that the statement of the c-command relation has to refer to the notion "trace" at S-structure. The theory presupposed by (65) is typically a theory without traces, in spite of the fact that (65) mentions the notion "trace". With traces, the c-command condition can be given at S-structure without reference to the notion trace:
(67)
A subject NPj in a predicate structure NPj XPj must c-command its predicate XPj
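The condition in (67) can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: the `Node` class, the helper names, the flat VP, and the first-branching-node definition of c-command are choices made for this sketch and are not taken from the text. It encodes the S-structure (66) and checks that the subject NP c-commands the coindexed AP position (the trace), while it does not c-command the fronted lexical content in COMP.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A constituent in a tree (hypothetical helper for this sketch)."""
    label: str
    index: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        # Link every child back to its mother node.
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a properly dominates b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """Assumed definition: neither node dominates the other, and the
    first branching node above a dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


def satisfies_67(subject: Node, predicate: Node) -> bool:
    """Condition (67): a coindexed subject must c-command its predicate."""
    return subject.index == predicate.index and c_commands(subject, predicate)


# (66): [AP how raw]j did John eat [NP the meat]j [AP t]j
# A flat VP is assumed here purely for illustration.
subject = Node("NP", "j", [Node("the meat")])
trace = Node("AP", "j", [Node("t")])
fronted = Node("AP", "j", [Node("how raw")])
vp = Node("VP", children=[Node("V", children=[Node("eat")]), subject, trace])
s = Node("S", children=[Node("NP", children=[Node("John")]), vp])
s_bar = Node("S'", children=[fronted, s])

print(satisfies_67(subject, trace))    # the functional predicate position
print(c_commands(subject, fronted))    # the lexical content in COMP
```

On these assumptions the condition is met by the trace position alone, so nothing in (67) needs to mention the fronted phrase.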
Both in (64a) and (64b) (= (66)), this simple condition is fulfilled at S-structure. So, there is no element of elegance in this case. The argument is apparently based on the mistaken assumption that the whole predicate has been moved in (66). What has been moved in (66), however, is not the whole predicate but only the lexical content of the predicate. A given lexical content has a syntactic function (such as "being a predicate") only with respect to a functional position. The Wh-position in COMP is not a functional position in this sense. The situation is analogous to what we observe when a Wh-phrase is moved from an argument position:
(68)
Whatj did you see tj
It is generally assumed that what in COMP is not an argument in this case; only its "original" position, the position of the trace, is an argument position. Similarly, how raw is not a predicate in (66); only the trace position is. Basically, all the arguments that Van Riemsdijk and Williams give are of this "elegance" type, and basically all of these arguments show the same weakness, namely, that a functional position is not distinguished from its lexical content. I will return to the arguments in detail. But first I will show that predicate structure cannot be NP-structure in the intended sense. Consider a topicalization structure like (69):
(69)
Billj [S' Oj [I don't like tj]]
Clearly, this is a predication structure with Bill as subject and the open sentence as predicate (see Chomsky (1977)). The topic Bill inherits its Case from the trace position tj, and binding conditions are also transferred from this position:
(70)
[Pictures of each other]j [Oj [theyi don't like tj]]
This reveals an inconsistency in the NP-structure model: according to Van Riemsdijk and Williams, NP-structure is the level at which the binding theory applies and at which Case is assigned. This means that the topic must have its NP-structure position in the position indicated by the trace. It is only here that Case is assigned directly and that the binding theory applies without extra transfer. It is for this reason that the NP-structure model derives topicalization by so-called Vergnaud-raising, a rule proposed by Vergnaud (1974). Applied to topicalization, this analysis assumes movement of the topic from the trace position tj to the operator position Oj in (70), followed by Vergnaud-raising to the topic position. In this analysis, the predicate structure in the optimally elegant sense that a lexical subject c-commands a lexical predicate, is only formed after Wh-movement of the topic from tj to Oj. I will show that Vergnaud-raising for topicalization is impossible for independent reasons. Here, the example suffices to establish that it is impossible to construct a level at which both Case assignment and predication apply in the intended sense.
More generally, the arguments against NP-structure are of the same type as the arguments against D-structure: the properties of the mapping, "move alpha", cannot be isolated and established, and the properties attributed to NP-structure itself are not exclusive properties of that level. Let us therefore have a closer look at the properties in question. Van Riemsdijk and Williams (1981) give the following four properties of NP-structure:
(71)
a. the opacity condition (ultimately the binding theory) applies at NP-structure
b. (abstract) Case is assigned at NP-structure
c. contraction (i.e. to-contraction) operates at NP-structure
d. (certain) filters apply at NP-structure
I will now briefly discuss these arguments, beginning with the idea that the binding theory applies at NP-structure. That Wh-movement does not affect the binding possibilities in the same way as NP-movement is not very surprising, given the fact that the binding theory concerns the relations between arguments, i.e. elements in A-positions. Rules that move material to A'-positions, naturally, have no effect on relations between A-positions. But apart from this, (71a) cannot fulfill its promises. As we saw in (70), the binding theory can only apply at NP-structure if we assume Vergnaud-raising for topicalization. This cannot be right because, as I will show, Vergnaud-raising is impossible for topicalization. If this conclusion is correct, (71a) must be false.
One aspect of (71a) deserves special attention. Van Riemsdijk and Williams see it as an advantage of their model that the binding theory does not have to refer to Wh-traces: at NP-structure the future Wh-traces are still filled by Wh-phrases, which, as nonanaphors (and nonpronominals) cannot be bound. This would explain the alleged binding properties of Wh-traces in other models without stipulating a difference between Wh-traces and other empty categories (1981, 174). I strongly agree with the spirit of this explanation, as I will argue below. But the proposed execution of the idea, with the Wh-phrases physically filling the future trace positions, again does not give what it promises. The reason is that there are gaps, like parasitic gaps, that are identified by Wh-phrases but that cannot be literally filled by these Wh-phrases at NP-structure. As I will show in detail in chapter 4, the relation between Wh-phrases and parasitic gaps is definitely not characterized by Wh-movement. Here we might add that in languages with rich overt Case-marking, like Finnish, the parasitic gaps can have a Case different from the Wh-phrase, which always agrees with the locally bound gap (the trace; see Taraldsen (1984)). This confirms the idea that the relation between a Wh-phrase and a trace has the properties of what is usually called Wh-movement, but that parasitic gaps have different properties. In spite of the fact that parasitic gaps cannot "physically" be filled by their binding Wh-phrases, they have the properties of Wh-traces with respect to the binding theory (they must be A-free in every governing category; see chapter 6 for more details). This fact undermines the execution that Van Riemsdijk and Williams give to a certain explanatory idea with which I agree, i.e. the idea that the behavior of certain gaps is derived from the nature of their antecedent. In particular, the facts about parasitic gaps undermine the idea that NP-structure is necessary or even possible in carrying out the intended explanation.
In short, (71a) is not supported by the binding facts. It is, on the contrary, incompatible with the binding facts if a larger class of facts is considered, in particular the facts about topicalization and parasitic gaps. The second property, (71b), will be my main target for a somewhat more elaborate critique, so I will postpone its discussion until after a brief discussion of (71c) and (d). These two arguments are very similar and show the same weakness: they overlook the fact that the behavior of a position is the joint result of its functional status and its lexical content. The contraction facts on which (71c) is based are well known. A Wh-trace blocks contraction (72), while an NP-trace and PRO do not block contraction ((73) and (74), respectively):
(72)
a. Whoj do you want tj to beat Nixon
b. *Whoj do you wanna tj beat Nixon
(73)
a. Johnj is supposed tj to leave
b. Johnj is sposta tj leave
(74)
a. Ij want PROj to leave
b. Ij wanna PROj leave
The argument is that it is already known that material that is "physically" present between want and to blocks contraction, and that it is only natural to assume that "physically present" material blocks contraction at PR (Phonetic Representation). If contraction applies at NP-structure, before Wh-movement, the Wh-phrase is literally present between want and to in (72), so that contraction is blocked in the most natural way. It seems to me that this approach adds nothing to the standard approach, which explains the difference by the fact that Wh-traces differ from NP-traces and PRO in that Wh-traces have Case. Case is what Wh-traces have in common with lexical NPs, which explains the similarity in contraction behavior.
Since there are Case-marked gaps identified by a c-commanding Wh-phrase that are not the result of Wh-movement, namely parasitic gaps and gaps in islands, it is possible to give a crucial test. We have seen before that gaps in islands differ from traces (see (56) above); particularly, they miss the characteristic Subjacency property of the traces of Wh-movement. If these gaps are not created by Wh-movement (see chapter 4), the NP-structure theory and the standard theory make different predictions. Since the gaps in question are Case-marked, the standard GB theory predicts that contraction across these gaps is just as bad as in (72b). The NP-structure approach, however, predicts that contraction is possible because the gaps are not created by Wh-movement; particularly, the gaps have not been physically filled at any level by the Wh-phrase. Consider now the following data:
(75)
a. ??Which manj did you express [a desire [that you want ej to succeed Reagan]]
b. *Which manj did you express [a desire [that you wanna ej succeed Reagan]]
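The logic of this test can be laid out as a small table of predictions. The Python sketch below is a hedged illustration only: the `Gap` record, its two boolean features, and the function names are hypothetical devices of this sketch, and the classification of the island gap in (75) as Case-marked but not created by Wh-movement simply follows the assumption made in the text (to be argued in chapter 4).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Gap:
    """Hypothetical feature bundle for an empty position."""
    case_marked: bool
    created_by_wh_movement: bool


def blocks_contraction_case_account(gap: Gap) -> bool:
    # Standard account: Case on the gap suffices to block contraction.
    return gap.case_marked


def blocks_contraction_np_structure_account(gap: Gap) -> bool:
    # NP-structure account: only a gap that was physically filled by
    # the Wh-phrase before Wh-movement blocks contraction.
    return gap.created_by_wh_movement


GAPS: Dict[str, Gap] = {
    "(72) Wh-trace": Gap(case_marked=True, created_by_wh_movement=True),
    "(73) NP-trace": Gap(case_marked=False, created_by_wh_movement=False),
    "(74) PRO": Gap(case_marked=False, created_by_wh_movement=False),
    "(75) gap in island": Gap(case_marked=True, created_by_wh_movement=False),
}


def diverging_cases() -> List[str]:
    """Return the examples on which the two accounts disagree."""
    return [
        name
        for name, gap in GAPS.items()
        if blocks_contraction_case_account(gap)
        != blocks_contraction_np_structure_account(gap)
    ]


print(diverging_cases())
```

Under these assumptions the two accounts coincide on (72) through (74) and come apart only on the island gap, which is why (75) serves as the test case.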
Naturally, such island violations yield less acceptable sentences. But to the extent that these data are clear, it seems to me that we find the same contrast as in (72). If this conclusion is correct, the standard approach is confirmed: Case suffices to block contraction, and "physical presence" of the Wh-phrase - as required by the NP-structure model - is not necessary. The fourth argument, (71d), is based on a filter proposed for certain Italian data by Longobardi (1980). In Italian, adjacent infinitives are bad under certain circumstances (Van Riemsdijk and Williams (1981, 177)): (76)
*Giorgio comincia ad amare studiare
         begins to like to study
Essentially, Longobardi's filter has the following form:
Again, Van Riemsdijk and Williams observe, the filter is blocked by a Wh-trace (in an argument position) but not by an NP-trace or PRO. Moreover, the filter still seems to apply if the second infinitive is preposed, for instance by clefting (78a) or topicalization (78b):
(78)
a. *E [andare a Roma]j che potrei desiderare tj
   it-is to-go to Rome that I-might wish
b. *[Andare a Pisa]j potrei preferire tj
   to-go to Pisa I-might prefer
Since the configuration of the filter (77) seems to be destroyed in these structures, Van Riemsdijk and Williams assume that there must be a pre-Wh-movement level, NP-structure, at which the two infinitives are still adjacent. In this case, it might seem that it is less easy to dismiss the argument than in the predication case (66). In the predication case, the relevant information, the category AP, was still present after Wh-movement. In (78) something seems to be definitely lost, namely the internal structure of the trace. The categorial structure of the trace (presumably S') is irrelevant. What really matters is the internal structure of S', i.e. the fact that it contains an infinitive. One solution, proposed by Longobardi, would be layered traces, so that the S'-trace can have an empty V with the feature
inf. This solution is not particularly attractive, according to Van Riemsdijk and Williams. It seems to me that the solution of NP-structure is unattractive for the same reason: it does not seem very plausible that the
infinitives in (78) directly bind the trace, just as it is implausible that cleft sentences and topicalizations are derived by Vergnaud-raising, as Van Riemsdijk and Williams assume. It is not unlikely that the construal chain involves an extra step, as in:
(79)
Whatj I prefer tj is to go to Rome
As mentioned before, Higgins (1973) has argued that such pseudo-clefts cannot be derived by movement. (79) is a good example, because the trace is of type NP, while the focus constituent, the infinitive, is of type S'. Similarly, the cleft and topicalization structures might involve a Wh-trace of an NP, bound by an operator in COMP, which is in turn linked to a fronted S'. If an analysis along these lines is correct for (78), then the trace tj is an NP, and both the solution based on layered traces and that based on NP-structure must be incorrect, because these solutions presuppose a trace of the categorial type of the infinitive, i.e. an S'. Later on, I will present a solution that avoids this problem in particular, and the impossible Vergnaud-type solution for topicalization in general.
But first I will analyze (71b), the claim that NP-structure is the level at which Case is assigned. Case assignment is for NP-structure what θ-marking is for D-structure. Whereas the postulation of D-structure is inspired by the desire to construct a level where all θ-roles are directly assigned, NP-structure is postulated by a desire, among other things, to construct a level at which Case is directly assigned. The underlying assumption is the same in both cases: direct licensing is more natural than indirect licensing through traces and other links of construal chains. As we have seen in the case of D-structure, it is not possible to construct a level at which all θ-marking is direct. But at least it could be maintained that all θ-roles are ultimately derived from the properties of lexical items. For Case, not even that can be true. Consider first an example in which the Case of a topic is derived from the complement Case of a verb. A relevant example is the German example given earlier (Van Riemsdijk (1978, 167)) (see (33) above):
(80)
Den Hansj (acc.), denj (acc.) mag ich tj nicht
In a way, this example is already problematic for the idea that NP-structure is the level of Case assignment. The problem is that (80) contains two Case-marked NPs. It is not possible to create a level at which both NPs receive their Case directly in the object position of the verb. It is therefore again necessary to derive (80) by Vergnaud-raising. But note that this would be a very undesirable kind of chain formation in this case. First of all, "move alpha" would have to be complicated by giving it the ad hoc power of being able to create an extra lexical position. This would result in a chain with two Cases, which is an anomaly (cf. Chomsky (1981b, 334)).
Secondly, it is hard to maintain that den in (80) is a kind of "visible" trace of the moved topic den Hans. The point is that den can also occur independently (this is essentially Dougherty's anaporn principle):
(81)
Deni mag ich ti nicht
that one like I not
'That one, I don't like'
Den is just an independent pronominal. So, we could already interpret (80) as a counterexample to the idea that there is a level at which all Case is assigned directly. Things become more problematic if we consider examples in which the Case of the topic is not derived from a complement (or subject) position. Thus, Van Riemsdijk (1978, 168) gives examples like the following:
(82)
Der Hans (nom.), mit dem (dat.) spreche ich nicht mehr
the John with him talk I not more
'John, I don't talk to him any longer'
Here, the topic has obligatory nominative Case, while the bound d-word in COMP has dative Case. Case agreement leads to an ungrammatical sentence:
(83)
*Dem Hans (dat.), mit dem (dat.) spreche ich nicht mehr
In (82), then, the nominative Case of the topic is not derived from a clause-internal position. This shows two things. First, it appears that not all Cases are ultimately derived from positions projected from the lexicon, as θ-roles are. Secondly, (82) shows that Vergnaud-raising is not a general solution to the Case-transfer problem. What we see in (82) is the selectivity of transfer that characterizes rules of construal in general. If a topic already has a Case (82) for some independent reason, Case is not transmitted. If the topic has no independent Case, it must be transmitted (80). Apparently, nominative Case is optionally assigned to topics (it is also possible as an option in (80)), as a default Case (as suggested by Jan Odijk (personal communication)), or as a generalization of the Case assignment to subjects (the topics in question are subjects in a subject-predicate relation). If the nominative option is not chosen, Case must be transmitted by a c-commanding d-word, as in (80). If the d-word does not c-command the topic, no Case is transmitted, and the nominative is obligatorily chosen because of the Case filter (cf. (82) and (83)). The idea that NP-structure is the level of Case assignment is motivated by examples like the following:
(84)
Whomj did you see tj
The idea is that only the trace position is a position of direct Case assignment. Since direct Case assignment is the most natural Case assignment, and since whom is only in its natural Case position before Wh-movement, i.e. at NP-structure, NP-structure is the natural level of Case assignment. As in the case of θ-marking and D-structure, this view only makes sense if it can be established that there is a one-to-one correspondence between Case positions and Case-bearing NPs. If Case can be transmitted by (nonmovement) construal, so that the one-to-one correspondence breaks down, there is no evidence for NP-structure. It must be shown, in other words, why it is implausible that the Case of whom in (84) is derived from the trace position at S-structure. The view that Case can only be assigned to NPs in direct Case positions and not be transmitted by construal rules is plainly false. Practically all construal rules that connect NPs can transfer Case. We have already seen examples like (80), but also ordinary (non-d) pronouns transmit Case. Van Riemsdijk (1978, 175, note 27) gives examples like:
(85)
Den Hans (acc.), ich habe ihn (acc.) gestern gesehen
the John I have him yesterday seen
'John, I saw him yesterday'
There is almost full consensus that such cases of ordinary Left Dislocation are not transformationally derived by "move alpha" (see Van Riemsdijk and Zwarts (1974) for arguments). One reason is that epithets can also transfer Case (see Cinque (1983a)):
(86)
John, Mary doesn't like that little bastard
Also, in many other examples of nonmovement construal, it appears that Case can be transmitted: (87)
a. Whatj he really likes tj (obj.) is himselfj (obj.)
b. He saw something awfulj (obj.): himselfj! (obj.)
c. Whatj did he see tj (obj.)? Himself! (obj.)
d. John saw Billj (obj.) and Peter himselfj (obj.)
Another clear case is Sluicing, discussed by Van Riemsdijk (1978, 231ff.). Van Riemsdijk convincingly argues that Sluicing structures are not derived by deleting the whole context as in: (88)
a. Someone has done the dishes, but I am not sure who [has done the dishes] (bracketed material deleted)
Sluicing falls into a domain of facts that Van Riemsdijk refers to as "connectedness of discourse" phenomena. What has already been shown
by (87), but what is also demonstrated in Sluicing constructions by Van Riemsdijk, is that "connectedness in discourse" is sufficient in most cases for Case transfer. Thus, Van Riemsdijk (1978, 244-245) gives the following German example: (88)
b. Er will jemandem (dat.) schmeicheln, aber sie wissen nicht wem (dat.)
   he wants someone flatter but they know not whom
   'He wants to flatter someone but they don't know whom'
The dative Case of wem, which is obligatory, can only be derived from jemandem in the preceding discourse. The Case cannot be derived from the deleted context, because, as Van Riemsdijk shows, there has never been a context to delete. But if even discourse rules can transmit Case, there is not the slightest reason to doubt that Case can be transmitted in local construals like ((84), repeated here for convenience): (89)
Whomj did you see tj
Since whom is construed with the Case-bearing trace, it can derive a Case without problems. In fact, there is direct evidence that Case can be transmitted in very similar situations without movement. As we have mentioned several times, gaps in islands do not have the properties of traces and cannot be the result of "move alpha" (see also chapter 4). Consider now the following island violation, which is reasonably acceptable: (90)
Whomj did you express [a desire [to see ej]]
Here, the NP-structure solution fails, because there is no underlying structure with whom in the place of the gap. Particularly, it is not possible to derive (90) from such an NP-structure by "move alpha". But if (90) does not involve NP-structure, I see no reason why (89) should involve NP-structure. It is fair to say, then, that Case assignment has nothing to do with NP-structure. Since the other alleged properties of NP-structure (71) do not support NP-structure either, we must conclude that there is no evidence for NP-structure. Furthermore, we have seen that there are predicate structures that involve Wh-movement (69), which contradicts the view that predicate structure is the pre-Wh-movement NP-structure. A more general objection is that models with NP-structure crucially involve "move alpha", a rule without known properties. A specific feature of the NP-structure hypothesis is that it leads to a Vergnaud-raising analysis for topicalization in a number of cases. I have
already indicated why I find this rule questionable. It leads, for instance, to chains with two Case positions. I will conclude this part of the discussion with two more arguments against a Vergnaud-type analysis for topicalization. The first argument is a familiar one by now. If gaps in islands are not created by movement, topics cannot originate in the gap position either (from which they are moved to an operator position from which they are raised to the topic position). If the analysis (of chapter 4) for the gaps in question is correct, the following example cannot be derived by movement (including Vergnaud-raising): (91)
That racej [S' Oj [I did not express [a desire [to win ej]]]]
In spite of the fact that such examples cannot be derived by movement, the construal chain suffices for the transfer of anaphoric relations: (92)
[Pictures of each other]j [Oj [theyi did not express [a desire [to buy ej]]]]
This example shows once again that NP-structure is not the level of the binding theory, in the sense that binding relations must directly be expressed there, without construal transfer. This leads us to a second argument against a Vergnaud-type analysis for topicalization. Consider the following example, in which the reflexive is inside an AP (or a small clause):
(93)
Hei was never [satisfied with himselfi]
This AP can be topicalized: (94)
[Satisfied with himselfi]j [Oj [hei never was tj]]
In English, it is not obvious that this example is incompatible with Vergnaud-raising. In Dutch, however, it is possible to have a d-word in the operator position: (95)
[AP Tevreden over zichzelfi]j [S' datj [is hiji nooit tj geweest]]
satisfied about himself that has he never been
The crucial point, now, is that the d-word is an NP and not an AP (or a small clause). In other words, Vergnaud-raising would lead here to the transmutation of categories. The transmutation of water into wine is more credible than the transmutation of NPs into APs by the nonexisting rule "move alpha". I will now turn to a point in which I fundamentally agree with Van Riemsdijk and Williams (1981), and disagree with the standard GB
approach. The point can best be illustrated with strong crossover:
(96)
*Whoj did hej say that Mary liked tj
According to the standard GB approach, the ungrammaticality of this sentence is explained by the fact that the trace is defined as a variable, which can be clarified by the following LF paraphrase of (96): (97)
For which x, x a person, x said that Mary liked x
At S-structure, where the binding theory applies, the trace tj in (96) is already defined as a variable. Under the "natural" assumption that variables have something in common with names ("variables are unspecified names", see Chomsky (1981b, 102)), variables are supposed to have the same binding properties as names. This is expressed by the well-known principle C of the binding theory: R-expressions (names and variables) must be free in every governing category. This principle is supposed to explain (96), in which the variable tj is illegitimately bound by an antecedent he. This view is confusing for a number of reasons. To begin with, it has traditionally been assumed that pronouns, like he, are the natural language equivalent of variables. Thus, (98a) can be paraphrased as (98b):
(98)
a. Everyone thinks that he is happy
b. For every x, x a person, x thinks that x is happy
In this case, the pronoun he, which corresponds to the rightmost occurrence of x in (98b), cannot be free, but on the contrary, must be bound. From the point of view of Logical Form, this discrepancy between (97) and (98) seems paradoxical: in (97) a variable must be free, and in (98b) a variable in a similar position must be bound. But of course, the binding theory applies at S-structure, where we have two different categories corresponding to the variables in (97) and (98b). In (97) the variable corresponds to a trace at S-structure, and in (98b) the variable corresponds to a pronoun at S-structure. At S-structure, there is no paradox because traces differ from pronouns in their binding properties. But the problem is that an S-structure like (96) has always been seen as a precursor of the logical form (97). Without this assumption, it makes little sense to call something a variable at S-structure. If syntactic categories are precursors of LF categories, then it is hard to avoid he also being called a variable at S-structure, so that the paradox reappears at S-structure. Nevertheless, it is essential to the standard binding theory that a Wh-trace be called a variable at S-structure. Ultimately, this cannot be done
without stipulation. In other words, the standard approach is problematic in two respects. It leads to a paradox and it requires unnecessary stipulation. It is a fundamental merit of the analysis given by Van Riemsdijk and Williams (1981), that both problems are avoided. I will later show that their analysis (or some variant of it) avoids several other problems of the standard approach (see section 4 on Logical Form). According to the NP-structure analysis, the binding theory applies to (96) before Wh-movement:
(99)
*Hej said that Mary liked whoj
The position of the object trace is still filled by the Wh-phrase who here. This structure is naturally ruled out by the binding theory because who is neither an anaphor nor a pronominal, so that it is accepted neither by principle A nor by principle B of the binding theory. This theory is not a notational variant of the standard theory because a (quasi-) quantified NP like who in (99) is not a variable. A quantified NP is neither an operator nor a variable, but a category that somehow combines the two aspects of quantification, a matter to which I will return. In fact, we can entirely dispense with principle C of the binding theory if we assume that only anaphors and pronominals can be bound. Since who is neither, (99) can be ruled out without reference to principle C or the notion variable.
Although I consider this account a definite improvement over the standard theory, it does not quite work for all cases of strong crossover. Strange as it may seem, this is due to the very idea of NP-structure, the idea that Wh-phrases are literally present in the positions of the gaps at some level. Crucial evidence against this aspect of Van Riemsdijk and Williams' analysis comes from the anti-c-command condition on parasitic gaps.⁵ Thus, (100a) is relatively acceptable, while (100b), in which the first gap c-commands the second, is ungrammatical (see Chomsky (1982a)):
(100)
a. Which bookj did you return tj without reading ej
b. *Which bookj tj was returned before you could read ej
It may seem that (100b) favors the standard account of strong crossover: if the rightmost gap (the parasitic gap) is a variable, (100b) is ruled out because the variable is not free. The NP-structure analysis, on the other hand, does not apply here because it is not possible to reconstruct a pre-Wh-movement level at which the Wh-phrase fills both gaps. Moreover, parasitic gaps are usually in islands, so that (100b) cannot be derived from NP-structure by "move alpha" without violating Subjacency. It seems to me that the essence of the Van Riemsdijk-Williams analysis can be preserved by changing the mode of execution, i.e. by giving up NP-structure. The reason why the NP-structure account breaks down in cases
like (100b) is that it is based on an untenable dogma that also led to problems in the case of D-structure. In order to see how the alternative execution works, we must make a short excursion into the domain of categories and their lexical content. In particular, we must sketch the outlines of a theory of identification of categories.
All syntactic categories must be identified somehow. The most direct way of identifying a category is by giving it a lexical content. Case assignment is another identification strategy (see the discussion of "visibility" in Chomsky (1981b)). Thus, a category that has both Case and lexical content is doubly identified in a sense. Adjuncts, on the other hand, are only identified by their lexical content. I think that this is the ultimate reason why adjunct gaps have a much more limited distribution than NP gaps (see chapter 4). The Case-marked NPs can be identified by weaker means than the non-Case-marked adjuncts. NP gaps can, for instance, be identified by AGR in pro-drop languages. Furthermore, we have already indicated that NP gaps are the only gaps in English that occur in islands. Huang (1982) has discussed in detail how strict the island behavior of adjuncts is (see chapter 4 below). In English, Case is never a sufficient identifier. Contrary to what we see in languages like Japanese or Chinese (see Xu (1984)), a Case-marked NP gap must always be identified by a c-commanding category of some kind.
It seems to me that many construals between categories are identification strategies. A bound anaphor, for instance, is incompletely identified because it is an argument without an inherent referential index. Binding, then, is an instantiation of the property-sharing rule (13). The missing identification is shared with some local antecedent. Similarly, the subject of a passive is insufficiently identified because it is in a position where it lacks an inherent θ-role.
I have used "dynamic" terminology for these phenomena, such as "inheriting a θ-role", or "transmitting Case, or a referential index". This usage is somewhat metaphorical, for expository reasons. What I really mean is that the two related categories literally share a certain property, without transfer in the "dynamic" sense. Thus, two NPs can be mapped onto one referential index, or one θ-role, etc.:
(101)
a.  John saw himself
    [John and himself are mapped onto one referential index i]
b.  John was arrested t
    [John and t are mapped onto one θ-role]
As we have mentioned before, all these identification mappings have the form of the configurational matrix discussed in chapter 1. Let us now make a distinction between a category and its lexical content. This distinction has traditionally been expressed by the device of lexical insertion. Thus, the sentence John loves Mary is derived by inserting the lexical elements (102a) in the syntactic skeleton (102b):
(102)
a.  [NP John]i, [NP Mary]j, [V love]k
b.  [[NP e]i [[V e]k [NP e]j]]
In (102a), we find the lexical content of the categories given in (102b). I will assume now that also after lexical insertion, a filled NP position consists of two parts, the functional category and its lexical content. Thus, in a sentence like John loves Mary, we distinguish for instance the object category [NP e] and its lexical content [NP Mary]. We can also say that Mary identifies the functional object position. It is a fundamental aspect of the representational theory presented here that lexical content is considered part of the identification of a category, on a par with other identificational material, like referential indices, Case, and θ-roles. This entails that lexical material can be shared by two positions, just as Case or referential indices can. Thus, a sentence like who did you see can be represented like (101): (103)
[[COMP NP] [you saw NP]]
[both NP positions share the lexical content who]
This representation would be the same for a sentence without overt Wh-movement, like you saw who (with the Wh-phrase in situ). Given the general possibility of property sharing (within the limits of the configurational matrix) it is immaterial in which position the lexical material actually occurs. In short, a functional category can be identified in two ways: by its lexical content in situ, i.e. when the category dominates its lexical content, or by binding, i.e. when the lexical content is the antecedent of the functional category in some domain. In both cases, the functional category shares its properties (such as Case and θ-role) with the information provided by the lexical content. I will refer to the former situation as isotopic property sharing, and to the latter as non-isotopic property sharing.

It seems to me that generative grammar has always had a strong bias towards isotopic property sharing. Originally, for instance, anaphors were derived from full names in a sentence like John saw himself. We could interpret this by saying that himself already had an isotopic referential index in deep structure. This view turned out to be untenable (see, for instance, Jackendoff (1972)). Since then, it has generally been assumed that interpretation of anaphors is non-isotopic, i.e. the referential index is derived from (shared with) a binding category. Much of the foregoing discussion can be seen as a demonstration of the inevitable development in the direction of non-isotopic theories for Case and θ-roles as well. A continuing preference for D-structure can be seen as an attempt to maintain an isotopic theory for θ-role assignment. And the development of NP-structure can be seen as an attempt to construct an isotopic theory for
Case also. In both cases, these attempts are unsuccessful, in my opinion. One reason is that there can be a difference between isotopic property sharing and non-isotopic property sharing. With isotopic property sharing, the functional category and its lexical content share all the properties they have. Non-isotopic property sharing, however, can be partial: as we have seen many times, it is filtered by independent properties of antecedents. Thus, an antecedent and an anaphor share only one identifier, the referential index. In general, we can say that the fewer properties an antecedent has, the more properties it can share. What has been called "move alpha" is a case of partial property sharing and is therefore most naturally treated as a case of non-isotopic property sharing. Under a movement analysis, some properties of the functional position (the D-structure position) must inevitably be "left behind" (like the θ-role in Wh-movement).

We can now see why the NP-structure theory did not really work. The reason is that NP-structure requires isotopic property sharing for Wh-phrases, i.e. they are in a functional position at NP-structure, so that they must share all features with their NP-structure position. This is impossible in the case of parasitic gaps, as we saw in (100). It is still possible, under a non-Wh-movement analysis, to determine partial property sharing at S-structure. Thus both (96) and (100b) (repeated here) can be excluded if we drop the assumption of (rather) full property sharing presupposed by the movement account: (104)
a.  *Whoi did hei say that Mary liked ti
b.  *Which booki ti was returned before you could read ei
In (104a), the trace position is fully identified by the Wh-phrase in COMP, as in the NP-structure theory, i.e. the trace position shares the lexical features of the fronted Wh-phrase. This rules (104a) out, because binding by he requires anaphoric or pronominal identification. How can we rule out both (104a) and (104b) without movement or variables? Let us first consider some simpler cases, for instance French reflexivization: (105)
Ili sei lave ti
he himself washes
'He washes himself'
Il must bind se, because se is specified in the lexicon as an anaphor, i.e. as an incompletely identified element. Il cannot bind se directly, because se is not in an A-position: only elements in A-positions can be bound. The trace cannot be bound either. Here we make a crucial assumption different from the standard approach. According to the standard binding
theory, Case-marked empty elements have inherent binding properties. In contrast, I will assume that empty Case-marked NPs have no inherent binding properties at all: their binding properties depend on the properties of their identifiers. What this comes down to is that neither se nor t can be directly bound in (105). But se must be bound. The solution is property sharing. A reflexive can only be bound in relation to an A-position. The relation can be isotopic, i.e. if the reflexive is dominated by the NP in A-position. The identification relation can also be non-isotopic, as in (105). In both cases, the reflexive lexical content identifies the A-position. In (105), then, se can be bound because it identifies (shares the properties of) an A-position. Similar mechanisms rule out the following example, with a nonreflexive clitic: (106)
*Ili lei lave ti
he him washes
Le as a nonanaphor cannot be bound. In (106), it identifies an A-position, which rules the sentence out (the lexical content le must be free in relation to an A-position). This same mechanism is sufficient to rule out (104a): as such, the trace has no binding properties, but its identifier, the Wh-phrase, cannot be bound in relation to this A-position. Consider now an island violation, which involves an empty resumptive pronominal according to the theory referred to before: (107)
Which racei did you express [a desire [to win ei]]
The empty pronominal is identified by a c-commanding phrase (the Wh-phrase), as required. The gap is not checked by the binding theory, because it is not necessary for resumptive pronouns to be A-bound, nor is there an A-binding antecedent in the sentence. Consider next a case in which the empty pronominal is A-bound:
(108)
*Hei said that Mary liked ei
Why can a resumptive empty pro not be identified by an A-binder? Recall that an A-binder does not really bind an empty Case-marked NP. It only binds the identifier of this empty category in relation to this A-position. For (108) this would entail that he binds itself, which is impossible under the reasonable assumption that A-binding is a nonreflexive relation. Identification by a Wh-phrase in COMP is no problem in (107) because in this example there is no A-binder, so that the binding theory does not apply. Now consider (104b) again (repeated here): (109)
*Which booki ti was returned before you could read ei
Parasitic gaps are resumptive pros according to the Cinque-Obenauer theory to be discussed in chapter 4. As empty elements, they must be identified. In (109) this requirement is met, because there is a c-commanding Wh-phrase. But the empty ei is also bound in (109), namely by the c-commanding trace (which is also identified by the Wh-phrase). It is now clear that (109) is ruled out in the same way as (108). The trace stands in a potential binding relation to the gap ei. As before, this gap has no inherent binding properties. The c-commanding trace (an A-binder) only binds the identifier of this gap in relation to (the A-position of) the gap. But the identifier, which book, cannot be bound, because it is not an anaphor or a pronominal in relation to its functional position, the position of the trace. In other words, the trace cannot bind the parasitic gap because it cannot bind its identifier in relation to its own position. (Of course, the trace also cannot bind which book in relation to the other A-position, the position of the gap ei.)

In sum, if Case-marked empty positions always transfer binding relations to their identifier, which is then bound in relation to the empty position, we have a uniform account for the facts in (104), (105), (106), and (108). The analysis preserves the fundamental and desirable property of the Van Riemsdijk-Williams analysis of strong crossover. But by giving up the idea of movement and NP-structure, the analysis (based on property sharing now) can be extended to the crossover-like anti-c-command condition on parasitic gaps. Needless to say, the analysis based on property sharing can handle all the facts that the NP-structure hypothesis can handle. Thus, the transfer of binding relations after Wh-movement is accounted for (cf. property (71a)): (110)
[Which pictures of each otheri]j do theyi like tj
They c-commands tj, so that by property sharing the lexical content in COMP is also in the domain of they. Since each other is dominated by a phrase in the domain of they, each other is also in the domain of they, as required. This analysis avoids the problematic transmutation of categories that seemed to be required for a movement analysis of (95) (repeated here): (111)
[AP Tevreden over zichzelfi]j [S' datj [is hiji nooit tj geweest]]
Instead of Vergnaud-raising, we may of course have a binding relation between an AP (or a small clause) and an NP. There is independent evidence for this type of relation (see also Ross (1969)): (112)
Peter is tevreden over zichzelfi en Hans is dati ook
Peter is satisfied with himself and John is that also
'Peter is satisfied with himself, and so is John'
This is an anaphoric relation that does not involve transformations. Essentially, we find the same base-generated relation in (111) (recall Dougherty's anaporn principle: dat can also be interpreted without an AP topic in (111)). The anaphoric relation, a construal that normally leads to property sharing, appears to be sufficient for the transfer of the c-command relation. As we concluded before, a movement analysis (and therefore an NP-structure analysis) is impossible in (111). The contraction facts are already handled by the Case-feature on the Wh-trace. The facts also follow from property sharing: (113)
*Whoj do you wanna tj succeed Reagan?
It is not necessary to reconstruct who literally into the place of the trace, because the trace is identified by who, so that the trace position shares the lexical content with the COMP position. Case has been extensively discussed above. Let us therefore end with the Italian facts related to the Longobardi filter. The most problematic facts concern fronted infinitives, cases in which the configuration of the Vinf Vinf filter seems to be blocked: (114)
*È [andare a Roma]j che potrei desiderare tj
it-is to-go to Rome that I-might wish
If the fronted infinitive is directly linked to the trace, the example is accounted for immediately by property sharing. If the filter says that an infinitive may not be followed by a category the lexical content of which begins with an infinitive, (114) meets the structural description of the filter without (pseudo-) reconstruction or layered traces. Fronted phrases simply specify the lexical content of their phrases. As in other cases, structural descriptions have an isotopic and a non-isotopic interpretation. The idea that (114) must involve some kind of reconstruction is based on the untenable dogma that structural descriptions can only be met isotopically.

In sum, then, we see that NP-structure was one of the last attempts to give the priority of isotopic property sharing some weight. D-structure tried to maintain the dogma for θ-roles. NP-structure tried to revitalize it for Case positions. In both domains we have seen a development that started earlier, with the theory of bound anaphora. With respect to the property-sharing rule, all identifiers are equal: referential indices, θ-roles, Cases, and lexical content. In all cases, it must be concluded that non-isotopic property sharing cannot generally be derived from representations with only isotopic property sharing, i.e. from distinct levels of representation like D-structure or NP-structure.
2.4. Logical Form
Since the term Logical Form is used in different senses, I will first briefly mention what is not at issue in this section. The syntactic level of Logical Form (LF) has very little to do with what logicians call logical form. So, if I am criticizing LF, I am saying nothing whatsoever about logic or semantics in the logician's sense. Furthermore, I will be critical of LF but not of one of the ideas that led to it in the first place, namely the idea that the scope of quantified elements is, at least in part, determined by syntactic principles. The partial dependence on syntax of scopal phenomena is a well-established fact, something I will not take issue with in what follows.

The syntactic level of LF differs only minimally from S-structure, particularly by so-called LF movement, an application of "move alpha" that adjoins certain quantified elements to a category containing them. In essence, this idea was first proposed by Chomsky (1973) for unmoved Wh-elements, so-called Wh-elements in situ (like what in Who saw what). This analysis was extended to other quantified elements, like everyone, by May (1977). Other applications are the movement of elements in focus, like the stressed JOHN in I saw JOHN (see Chomsky (1981b, 196)). LF movement is based on the analogy that is supposed to exist between the output of this rule and the output of overt Wh-movement at S-structure: (115)
Whoi did he see ti?
It has been assumed for some time (i.e. since Chomsky (1973)) that this is a representation of the (quasi-) quantifier-variable relation that can be paraphrased in this case as:
(116)
For which x, x a person, he saw x
This is even one of the ideas that stimulated the development of trace theory, the thought that natural language structures give direct evidence for quantifier-variable representation as the natural ("biologically real") expression of quantification. In discussing these matters, Chomsky (1980b, 165) comes to the following conclusions:

If these conclusions are correct, one might speculate that the familiar quantifier-variable notation would in some sense be more natural for humans than a variable-free notation for logic; it would be more readily understood, for example, in studying quantification theory and would be a more natural choice in the development of the theory. The reason would be that, in effect, the familiar notation is "read off of" the logical form that is the mental representation for natural language. The speculation seems to me not at all implausible.
These speculations have had a great influence on syntactic theory, in that
much effort has been invested in the development of a syntactic level of LF. Furthermore, this "logistic" view of syntax has also had much influence on the views of S-structure. Particularly, Wh-movement is usually seen as a rule that creates the precursor of Logical Form. From this point of view it is not incoherent to refer to Wh-traces as variables, and to Wh-phrases in COMP as operators. In the binding theory, as we saw in the preceding section, Wh-traces are treated as variables, and the fact that they behave like names is considered natural under the assumption that variables are unspecified names (Chomsky (1981b, 102)).

Although this view is quite prominent in GB theory, it is not generally accepted, not even among those who accept GB theory in other respects. From the point of view defended here, for instance, the "logistic" approach to syntax is rejected entirely. It seems to me that it is extremely unlikely that Frege, by giving the foundations for quantifier-variable notation in his Begriffsschrift (1879) after centuries of logic, did nothing else than rediscover our deepest nature. I find it, on the contrary, more plausible that quantifier-variable notation was developed so late in the history of logic because natural language contains nothing that fully corresponds to the operators and variables in predicate calculus.6 Universal quantification, for instance, is not expressed in natural language by an operator and a variable, but by a quantified NP, which involves aspects of both. Thus, everyone in (117a) corresponds to the italicized part of the canonical representation (117b): (117)
a.  Everyone left
b.  For every x, x a person, x left
Thus, a quantified NP in natural language merges three elements of the canonical format: (i) a quantifier, (ii) a restriction on the quantifier, and (iii) a variable. Natural language expresses these three elements simultaneously, in one NP, and not analytically, as in standard predicate calculus. As for variables, it has traditionally been assumed that pronouns are the closest natural-language counterparts of these elements of the predicate calculus. But where the latter contains "pure variables", natural-language pronouns can only be "dressed" variables, i.e. elements that always have features like person and number. All in all, it seems to me that there are no direct counterparts to pure variables bound by pure operators in natural language and that it is an error to interpret the output of Wh-movement along these lines.

The development of LF is based on the logistic interpretation of the output of Wh-movement. According to this view (developed in May (1977) and (1985)), the familiar rule "move alpha" picks up a quantified NP and adjoins it to S (this rule is called Quantifier Raising (QR)). Thus, QR gives the following output of (117a):
(118)
[S [everyone]j [S tj left]]
According to this view, the output of QR is logically transparent in the same sense as the output of Wh-movement at S-structure (115): both structures are supposed to be almost like the canonical representations (116) and (117). The word almost plays a somewhat underestimated role in this context. Not much attention has been paid to the fact that there is still a significant gap between (118) and the canonical representation (117b). In (117b), the restriction on the quantifier is also analytically represented, while in (118) the restriction is still implicit in everyone. Further rules are needed to "extract" the restriction. Usually, these rules are only hinted at by saying that (118) can be paraphrased as (117b). It is not easy to make these rules (which seem to involve the insertion of lexical material) explicit, as experience with Generative Semantics has shown. In short, quantified NPs contain three elements, a quantifier, a restriction, and a variable. QR makes the quantifier and the variable partially explicit, while the restriction is left implicit. The alternative, to be sketched in a moment, seems somewhat more natural in its underlying assumption that it is not the business of syntax at all to make explicit certain aspects of the content of lexical elements; or at least, "move alpha" is only an instance of the property-sharing rule, and not a device that analytically splits lexical items.

The alternative view, according to which Wh-movement has nothing to do with logic, has already been anticipated in the preceding section. I will refer to it as the identification approach, as opposed to the standard logistic approach. According to the identification approach, empty Case-marked elements are not variables but dummies that must be identified by some element in an A'-position. As mentioned before, these dummies do not have independent binding properties. Their binding properties depend on the properties of their identifiers.
Thus, if the identifier is a reflexive clitic like French se, this element must be locally bound in relation to the dummy position (which as an argument position has the necessary features). If the identifier does not have anaphoric properties, it cannot be bound in the dummy position. This rules out the following example, in which he has the same index as the trace: (119)
*Whoj did hej see tj
The ungrammaticality of this sentence according to the identification approach has nothing to do with the status of the trace as a variable. It is a simple consequence of the fact that the trace cannot be identified as an anaphor. If there is a construal chain that makes it possible to identify the trace with reflexive features, the sentence is grammatical, even if the trace is A'-bound from COMP:
(120)
Himselfj [Oj [hej does not really like tj]]
If the trace is interpreted as a variable, it is not easy to circumvent the binding theory for this case (the variable is not free). Under the alternative approach, there is no problem at all: himself is bound in position tj, to which it is connected by construal, which entails property sharing. As already mentioned in the preceding section, there is no difference between (119) and its counterpart in a language without overt Wh-movement, which could be represented in English as:
(121)
*Hej saw whoj
The sentence would be ungrammatical for exactly the same reason: who is not an anaphor. Such sentences can be ruled out by the binding theory at S-structure, and it is not necessary to have recourse to the alleged properties of variables at LF. A sentence like (121) represents isotopically what (119) represents non-isotopically. In both cases, it is the combination of an A-position and its specific lexical content that makes the sentence ungrammatical. In neither case is there a direct relation with canonical representations like (116). In particular, it is meaningless in this approach to say that (115) is closer to the canonical representation than (121). Who is not an operator in (119) and t is not a variable. Together, these two elements express the content of a (quasi-) quantified NP who in a nonanalytic fashion; i.e. the operator part and the variable part (not to mention the restriction) are not separated by "move alpha". These three aspects remain indistinguishable in the quantified NP in question. What (119) "splits" in comparison with (121) is the argument and its identifying lexical content, not the variable part and the operator part of that lexical content.

In this alternative view, Wh-movement might still be indirectly related to scope assignment. In chapter 4, I will assume that all cases of scope assignment for Wh-phrases involve an abstract scope marker Q, in the sense of Katz and Postal (1964) and Baker (1970). The representations of (119) and (121) are as follows in this view: (122)
a.  [S' Qj [Wh-phrase]j [S ... tj ...]]    (119)
b.  [S' Qj [S ... [Wh-phrase]j ...]]    (121)
In both cases, scope is assigned to the Wh-phrase by its relation with the scope marker Q. The difference is that the content of the Wh-phrase is adjacent to Q in (122a). It is not excluded in the alternative analysis that this adjacency somehow facilitates scope assignment. It might even be necessary in languages in which the scopal domain of Wh-phrases is S (instead of S') to move the content of the Wh-phrase first to the COMP position, if direct linking to Q from the S-internal position is not possible as a consequence of the domain restriction of Wh-phrases (see chapter 4). But saying that Wh-movement to COMP somehow marks scope is
something very different from saying that the Wh-phrase in COMP is an operator that binds a variable.

If the alternative view is coherent, we must ask ourselves how it can be empirically distinguished from the standard view. Some evidence has already been given. In contrast with the standard view, all A'-binding of A-positions has the same characteristics in the alternative theory: it identifies empty argument positions. It is not necessary, as in the standard theory, to make a distinction between A'-bound argument positions that are operator-bound, and A'-bound positions that are not operator-bound, and therefore not interpreted as variables (positions bound by clitics, stylistically dislocated material, etc.). Furthermore, nothing special has to be said about cases like (120).

It seems to me, however, that a much stronger case can be made. I will show that the alternative theory can handle everything the operator-variable theory can handle, and that there is no reason to assume that the notion "variable" deserves a place in syntactic theory. What is more important is that there is crucial evidence that decisively favors the alternative theory. Pied Piping in general, and certain cases of Wh-movement in German, have a fully regular status in the identification theory, but are irreparably anomalous in the standard theory. The notion "variable" has played a crucial role in the explanation of the following contrasts: (123)
strong crossover
a.  Whoj tj said Mary kissed himj
b.  *Whoj did hej say Mary kissed tj

(124) weak crossover
a.  Hisj mother likes Johnj
b.  *Hisj mother likes everyonej
c.  *Whoj does hisj mother like tj

(125) COMP-to-COMP violation
a.  *Whoj [S tj thought [S' tj [S John would see tj]]]

(126) anti-c-command condition for parasitic gaps
a.  Which bookj did you return tj before you could read ej
b.  *Which bookj tj was returned before you could read ej
In all these examples, the rightmost t or e is considered a variable according to the logistic approach. Apart from weak crossover, the ungrammatical sentences are ruled out by principle C of the binding theory, which stipulates that variables may not be bound (as in the ungrammatical sentences).
The same facts follow immediately from the identification approach, which does not make reference to the notion "variable". In all ungrammatical cases, the rightmost empty element is identified by a Wh-phrase, a nonanaphor and nonpronominal, which may not be bound in the A-position in question. For strong crossover and the anti-c-command condition of parasitic gaps, this explanation was already demonstrated in the preceding section. It is clear that the same simple explanation will do for the COMP-to-COMP condition of Chomsky (1973), a fact for which complex logistic machinery has also been proposed (see May (1979)).

Weak crossover is also optimally simple under an identification approach (see Koster (1983)). A pronoun can only be interpreted as a ("dressed") variable bound by a quantifier, if a quantified NP (in A-position) c-commands that pronoun at S-structure. Thus, in (124) there is no binding relation at all between who and his (as assumed by Koopman and Sportiche (1982)). Binding (in the sense of sharing of referential properties) is by definition a relation between A-positions (see Chomsky (1981b)). Consequently, his can only be bound by who in an A-position. The A-position in question, the trace position in (124c), does not c-command his, so that his cannot be bound by who (see Reinhart (1976) and especially Haïk (1984) for a similar explanation). This is the optimally simple and natural approach to weak crossover, while alternatives based on the notion "variable" only create new and difficult problems (as shown by Haïk (1984)). The fact that only (124a) is grammatical is not surprising, because this is the only case in which the two NPs can be coreferential without c-command (binding). Without binding there can only be accidental coreference, a possibility for referential NPs like names and not for quantifiers (see Lasnik (1976)).
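The identification account of the binding contrasts just reviewed can be made concrete with a small sketch. What follows is only an illustration under simplifying assumptions, not a formalization proposed in the text: the names Kind, ABinder, APosition and licensed are hypothetical, identifiers are reduced to three classes (anaphor, pronominal, neither), and the presence of a coindexed c-commanding A-position is entered by hand as "local", "nonlocal" or absent rather than computed from a tree. The point it illustrates is that the empty position carries no binding requirement of its own: the check is always made on the identifier, in relation to the A-position it identifies.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    """Binding class of the identifying lexical content (simplified assumption)."""
    ANAPHOR = "anaphor"        # e.g. se, himself
    PRONOMINAL = "pronominal"  # e.g. le
    OTHER = "other"            # neither, e.g. a Wh-phrase


class ABinder(Enum):
    """Where a coindexed, c-commanding A-position sits, if there is one."""
    LOCAL = "local"
    NONLOCAL = "nonlocal"


@dataclass(frozen=True)
class APosition:
    """An argument position together with the lexical content that identifies it."""
    identifier: str
    kind: Kind
    a_binder: Optional[ABinder] = None  # None: no A-binder in the sentence


def licensed(pos: APosition) -> bool:
    """Check the identifier, not the empty position, against the binder."""
    if pos.kind is Kind.ANAPHOR:
        return pos.a_binder is ABinder.LOCAL       # must be locally bound
    if pos.kind is Kind.PRONOMINAL:
        return pos.a_binder is not ABinder.LOCAL   # must be locally free
    return pos.a_binder is None                    # must be free altogether


# Hand-annotated positions for the rightmost t or e of each example.
EXAMPLES = {
    "(105)": APosition("se", Kind.ANAPHOR, ABinder.LOCAL),
    "(106)": APosition("le", Kind.PRONOMINAL, ABinder.LOCAL),
    "(120)": APosition("himself", Kind.ANAPHOR, ABinder.LOCAL),
    "(123b)": APosition("who", Kind.OTHER, ABinder.NONLOCAL),
    "(125a)": APosition("who", Kind.OTHER, ABinder.NONLOCAL),
    "(126a)": APosition("which book", Kind.OTHER, None),
    "(126b)": APosition("which book", Kind.OTHER, ABinder.NONLOCAL),
}

for label, pos in EXAMPLES.items():
    status = "ok" if licensed(pos) else "excluded"
    print(label, pos.identifier, status)
```

In this sketch no entry refers to a variable or an operator; the difference between (126a) and (126b), for example, reduces to whether the position identified by which book has an A-binder at all. Weak crossover is deliberately left out, since on the account above it involves the absence of c-command between the trace and the pronoun rather than a condition on identifiers.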
It seems to me, then, that the standard cases are equally well or even more simply explained by the variable-free identification approach. Crucial evidence against the standard logistic approach has been around for some time but has not received the full attention it deserves. Curiously, one of the oldest arguments against the logistic interpretation of the output of Wh-movement can be found in Chomsky (1977, 83): "The error of identifying trace itself as the variable within the scope of the wh-quantifier, which is overcome by the much more natural theory just outlined, resulted from concentration on too narrow a class of wh-phrases." In what follows, Chomsky shows that the parallelism of Wh-trace and variable is only seemingly direct in cases like (127a), but not in cases with more inclusive Wh-phrases like (127b) and (c): (127)
a.  I wonder what John saw t
b.  Whose book did Mary read t
c.  Pictures of whom did Mary see t
The more complex examples (127b) and (c) show beyond reasonable
doubt that Wh-phrases do not directly translate as operators. Nor do the traces correspond to variables, as is clear from the paraphrases that Chomsky gives: (128)
a.  For which x, x a person, Mary read [x's book]
b.  For which x, x a person, Mary saw [pictures of x]
In these cases, the traces only correspond to phrases containing variables, not to the variables themselves. It is crucial, then, that strong crossover, etc. can also be observed in these cases: (129)
*[Whose brother]j did hej say that Mary liked tj
The trace can still be defined as a variable, but the notion loses its significance, because as Chomsky (1977) pointed out, the Wh-phrase does not correspond to an operator. Under the identification approach, (129) is ruled out in the same way as before: the trace cannot be bound by he because it is identified by a nonpronominal, the Wh-phrase. If a PP containing a Wh-word is preposed, we have a really crucial example: (130)
*[With whomj] did hej say that Mary talked [PP t]
This is a normal case of strong crossover. It is not possible to construct a reading in which there is a binding relation between he and (the variable corresponding to) whom. In this case, the example is not ruled out by the binding theory because the binding theory says nothing about PP-traces. The identification approach, however, rules out (130) as it does all the other cases. The preposed Wh-phrase is the lexical content which identifies the trace. The preposed Wh-phrase can only be interpreted in relation to the functional position indicated by the trace. In this position, the NP whom is in the domain of he, contrary to what is possible for a Wh-phrase. In the logistic approach, (130) can presumably only be ruled out in the following way. First, the Wh-phrase is reconstructed in its original position. Then, Wh-movement applies again, this time only affecting the alleged operator part of the Wh-phrase. Only then is a variable created. The variable is then perhaps illegitimately in the domain of he, but we can only rule out the structure in question by applying the binding theory at LF. Since the binding theory must also apply at S-structure (see Chomsky (1981b)), this would lead to an undesirable duplication of the operations of the binding theory. In the identification approach, all these complications are unnecessary, because a functional category and its lexical content can share their properties not only isotopically but also non-isotopically. Non-isotopic
property sharing works "as if" the lexical content is literally reconstructed in the functional position. But the whole idea of literal reconstruction is foreign to the identification approach. Reconstruction in any form is only necessary if it is assumed that lexical content and functional position must be represented isotopically for certain purposes. It is this insistence on isotopic property sharing that causes the problems, as in many other cases. In Chomsky (1981b, 185) the problematic Pied Piping facts are mentioned, but not solved. From the present perspective, then, these facts are a serious anomaly for the logistic interpretation of Wh-movement. At the same time, these facts crucially support the alternative theory based on identification. Given the identification theory, it is also clear why some Wh-movements look like the creation of an operator-variable structure. In general, a Wh-phrase is a phrase of some size containing a Wh-word that corresponds to a logical operator. In the smallest possible Wh-phrase, this Wh-word falls together with the NP containing it (as in (127a)). If the Wh-phrase is preposed, it might then look as if an operator has been moved. Appearances can be deceiving, however, as we have seen by considering a broader class of Wh-phrases. The bigger Wh-phrases cannot be translated as variables, but only as the identifying lexical content of categories containing a variable. If we want to have a uniform interpretation of Wh-phrases, the smallest Wh-phrase must also be interpreted as the identifying lexical content of a category, a category with which it falls together in this smallest case. It seems to me that this interpretation is entirely consistent and is the only interpretation that avoids the Pied Piping anomaly. Although I consider even the English Pied Piping facts as decisive evidence against the logistic interpretation, it is useful to add some spectacular evidence from German discussed by Van Riemsdijk (1982) and (1983). 
German can "pied pipe" whole clauses, as first discussed by Ross (1967) (see also Longobardi (1980) and Cinque (1980) for related cases in Italian). Van Riemsdijk (1982) gives examples like the following: (131)
Jetzt hat er sich endlich den Wagen [S' den zu kaufen]j er tj sich schon lange vorgenommen hatte, leisten können
now has he (himself) finally the car which to buy he (himself) already long planned had afford been-able-to
'Now he has finally been able to afford the car which he had planned to buy for a long time'
Van Riemsdijk shows convincingly that the moved Wh-phrase (den zu kaufen) is of type S', and moreover, that the relative pronoun den has been moved internally, i.e. to the COMP of the pied-piped S', which is in COMP itself. Thus, the structure is as follows (his (34)):
(132)
[NP [N' der Wagen] [S' [COMP1 [S'j [COMP2 deni] [S PRO ti zu kaufen]]] [S er tj sich [VP vorgenommen hatte]]]]
According to the logistic approach, Wh-movement in relative clauses is also considered the creation of an operator-variable structure. A structure like (132) dramatically deviates from this picture. The Wh-phrase in COMP1 is not an operator at all, but only the content of the rightmost trace tj. If anything is an operator, it is den in COMP2. But COMP2 is not the operator COMP that is supposed to introduce English relative clauses. COMP2 introduces the complement of the matrix verb sich vornehmen, which strictly subcategorizes it. But contrary to COMP1, COMP2 is not a likely candidate for an operator COMP. In other words, neither COMP immediately dominates an operator in (132). COMP1 only contains a phrase containing a possible operator (den), and COMP2 is not an operator COMP, so that it does not make sense to interpret the word den that it dominates as an operator. Of course, (132) is fully compatible with the identification approach: COMP1 contains the material that identifies tj, and COMP2 contains the material that identifies ti. If it is not possible to give a coherent logistic interpretation to (132), we have another counterexample to the standard approach.
There is also direct evidence from German that a fronted Wh-phrase is not necessarily in an operator position. It appears that in languages with both Wh-movement and overt scope markers for questions, the trajectory from the D-structure position of a Wh-phrase to its point of scope marking can be shared by a series of movements and a series of scope markers. This is particularly clear in German (see De Mey and Marácz (1984) for a similar phenomenon in Hungarian). In German, the distance from an already moved Wh-phrase to its scope position can be bridged by a repetition of the scope marker was in the intermediate COMP positions: (133)
Was glaubst du, was Peter meint, mit wemj Hans sagt, dass Klaus behauptet, dass Maria tj gesprochen hat
what think you what Peter believes with whom Hans says that Klaus claims that Maria spoken has
According to Van Riemsdijk (1983a, 13), mit wem has the widest possible scope, i.e. over the matrix clause. In that case, scope is not computed from the point where the Wh-phrase has been moved to (in an intermediate COMP), but from the highest occurrence of was. Interestingly, Van Riemsdijk shows that the Wh-phrase can end up in any intermediate COMP, as long as the path to the matrix is filled by a series of was: (134)
a. Was glaubst du, was Peter meint, was Hans sagt, was Klaus behauptet, mit wem Maria gesprochen hat
b. Was glaubst du, was Peter meint, was Hans sagt, mit wem Klaus behauptet, dass Maria gesprochen hat
c. Was glaubst du, was Peter meint, mit wem Hans sagt, dass Klaus behauptet, dass Maria gesprochen hat
d. Was glaubst du, mit wem Peter meint, dass Hans sagt, dass Klaus behauptet, dass Maria gesprochen hat
e. Mit wem glaubst du, dass Peter meint, dass Hans sagt, dass Klaus behauptet, dass Maria gesprochen hat
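The pattern in (134) can be made explicit with a small sketch. It assumes only the generalization stated above: every COMP above the landing site of the Wh-phrase is filled by was, and every COMP below it by dass. The function names and the way clauses are stored are illustrative assumptions, not part of Van Riemsdijk's analysis.

```python
# Illustrative sketch only: names and data layout are assumptions.
# Clauses of (134), from the matrix clause downwards, as (subject, verb).
SUBJECT_VERB = [
    ("du", "glaubst"),       # matrix clause (verb precedes subject)
    ("Peter", "meint"),
    ("Hans", "sagt"),
    ("Klaus", "behauptet"),
]
LOWEST_CLAUSE = "Maria gesprochen hat"
WH_PHRASE = "mit wem"


def comp_sequence(landing_site):
    """Return the COMP fillers from the matrix COMP (0) to the lowest COMP (4).

    COMPs above the landing site hold the scope marker 'was',
    COMPs below it hold the complementizer 'dass'.
    """
    n_comps = len(SUBJECT_VERB) + 1
    if not 0 <= landing_site < n_comps:
        raise ValueError("landing site must be one of the COMP positions")
    below = n_comps - landing_site - 1
    return ["was"] * landing_site + [WH_PHRASE] + ["dass"] * below


def sentence(landing_site):
    """Spell out the string for a given landing site of the Wh-phrase."""
    comps = comp_sequence(landing_site)
    clauses = []
    for depth, (subject, verb) in enumerate(SUBJECT_VERB):
        if depth == 0:
            clauses.append(f"{comps[depth]} {verb} {subject}")
        else:
            clauses.append(f"{comps[depth]} {subject} {verb}")
    clauses.append(f"{comps[-1]} {LOWEST_CLAUSE}")
    text = ", ".join(clauses)
    return text[0].upper() + text[1:]


if __name__ == "__main__":
    # Landing sites 4, 3, 2, 1, 0 correspond to (134a) through (134e).
    for site in range(4, -1, -1):
        print(sentence(site))
```

In every output the scope position is the matrix COMP, whatever the landing site of mit wem, which is the point the examples are meant to make.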
The sentences in (134) all have the same logical form in that the Wh-phrase (more precisely, the Wh-word it contains) has matrix scope. Only in (134e) does the Wh-phrase also fill the COMP from where its Wh-part has scope over the whole sentence. In other words, (134) forms evidence of the most direct possible kind against the view that Wh-movement creates an operator-variable structure. Scope is only assigned to Wh-phrases by linking them to a scope marker (Q). In English, there is always the requirement that this Q be adjacent to at least one Wh-phrase (in Wh-questions). In other languages, this requirement is lacking if the Wh-phrase can be linked to Q by a series of intermediate scope markers.7 Given the fact that (134) is entirely unproblematic for the identification view on Wh-movement, it seems to me that (134) forms compelling counterevidence against the logistic interpretation of Wh-movement: Wh-movement has nothing to do with the creation of an operator-variable structure.
If Wh-movement is not the creation of an operator-variable structure, there is not much point to so-called LF movement, the rule (including QR) that creates LF on the analogy of overt Wh-movement. If overt Wh-movement does not serve any logical purpose, why would covert Wh-movement? In principle, there are two ways to test the hypothesis of LF movement. The first prediction is that it has the properties of "move alpha", in particular that it is constrained by Subjacency. A second possible prediction is that the gaps created by LF movement have the same properties as overt gaps. Both consequences have been proposed. Thus, May (1977) claimed that scope assignment is constrained by Subjacency. This prediction of the hypothesis of LF movement is generally believed to be false now (see Chomsky (1977), and especially Huang (1982); also May (1985)). The fact that LF movement does not appear to have the properties of movement is, of course, an argument against LF movement.
The second prediction, the similarity in the behavior of gaps, is the only one that is currently seriously defended. According to this hypothesis, there is a significant generalization about overt gaps and covert gaps, namely the ECP (see Lasnik and Saito (1984) for a recent discussion). I will try to show in chapter 4 that this second prediction is also false, at least with respect to Wh-elements in situ. Here, I will only add a brief discussion of QR. QR is the rule proposed by May (1977) that adjoins quantified NPs to the Ss containing them, on the analogy of overt Wh-movement. Thus, everyone saw Bill is represented as: (135)
[S [NP everyonei] [S [NP ti] [VP [V saw] [NP Bill]]]]
According to this view, everyone has scope over the whole sentence because it c-commands the nodes in the sentence. Scope is, therefore, believed to depend on c-command at LF. Furthermore, the trace in (135) is considered a variable, like the analogous traces of Wh-movement. Note, however, that there is no evidence that the scope of (non-Wh) quantified NPs must be expressed by c-command. Thus, the following sentence is usually supposed to be ambiguous between a reading in which everyone has wide scope and a reading in which someone has wide scope (see May (1977, 1985); also Haïk (1984)): (136)
Everyone loves someone
Everyone c-commands someone, but not the other way around. So, there is
no direct evidence that c-command is a necessary condition for scope. We might simply say that a quantified NP has scope over the minimal S that contains it (this is essentially command in the original sense; see Langacker (1969)). This statement, which seems empirically equivalent to QR + c-command, does not require the creation of an entirely new level by a rule without known properties. Moreover, it avoids the creation of traces that must be interpreted as variables. This would be ad hoc, because as we have just seen, there is no reason to interpret traces as variables at
S-structure. Moreover, since the binding theory applies at S-structure (see Chomsky (1981b)), the new level of LF would not serve any other purpose than the representation of scope. In fact, it would be created for the sole purpose of expressing scope by c-command rather than by S-command. To my knowledge, there is no empirical evidence that makes it plausible that scope must be assigned by c-command rather than by S-command. Binding relations require c-command, as the following example shows: (137)
a. Everyone thinks that he is happy
b. *The father of everyone thinks that he is happy
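The two structural relations at issue in (137) can be checked mechanically. The following is a minimal sketch under stated assumptions: c-command is taken in the first-branching-node sense, S-command is taken as "the minimal S dominating the first node also dominates the second", and the phrase structures assigned to (137a) and (137b) are simplified illustrations. All class and function names are hypothetical.

```python
# Illustrative sketch only: tree shapes and names are assumptions.
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Yield the nodes properly dominating `node`, lowest first."""
    while node.parent is not None:
        node = node.parent
        yield node


def dominates(a, b):
    return any(anc is a for anc in ancestors(b))


def c_commands(a, b):
    """First-branching-node c-command (assumed definition)."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    branching = next((n for n in ancestors(a) if len(n.children) > 1), None)
    return branching is not None and dominates(branching, b)


def s_commands(a, b):
    """The minimal S dominating a also dominates b (assumed definition)."""
    if a is b:
        return False
    minimal_s = next((n for n in ancestors(a) if n.label == "S"), None)
    return minimal_s is not None and dominates(minimal_s, b)


def embedded_clause():
    he = Node("NP:he")
    clause = Node("S'", [Node("COMP:that"), Node("S", [he, Node("VP:is happy")])])
    return he, clause


# (137a) Everyone thinks that he is happy
he_a, comp_a = embedded_clause()
everyone_a = Node("NP:everyone")
s_137a = Node("S", [everyone_a, Node("VP", [Node("V:thinks"), comp_a])])

# (137b) The father of everyone thinks that he is happy
he_b, comp_b = embedded_clause()
everyone_b = Node("NP:everyone")
subject_b = Node("NP", [
    Node("Det:the"),
    Node("N'", [Node("N:father"), Node("PP", [Node("P:of"), everyone_b])]),
])
s_137b = Node("S", [subject_b, Node("VP", [Node("V:thinks"), comp_b])])

if __name__ == "__main__":
    print(c_commands(everyone_a, he_a), s_commands(everyone_a, he_a))
    print(c_commands(everyone_b, he_b), s_commands(everyone_b, he_b))
```

On these assumed structures, everyone c-commands he only in (137a), while it S-commands he in both, which is the asymmetry between binding and scope discussed in the text.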
Contrary to what we see in (137a), the quantified NP everyone cannot bind the pronoun he because it does not c-command it (see Lasnik (1976), Reinhart (1976), Culicover (1976), and recently, Haïk (1984)). The scope of the quantified NP, however, is not expressed by c-command but by S-command: (138)
a. Everyone thinks that someone loves him
b. The father of everyone thinks that someone loves him
In both cases, the reference of someone varies as a function of the fact that it is in the scope of everyone (see Haïk (1984) for this property of indefinites). Of course, we can convert (138) into structures in which everyone c-commands someone, but the question is: why should we? What is worse for the QR-hypothesis is that QR is not really able to disambiguate (136), given the transfer properties of traces. The original idea was that (136) can be disambiguated by applying QR in two different fashions (May (1977)): (139)
a. [Everyone]i [someone]j [ti loves tj]
b. [Someone]j [everyone]i [ti loves tj]
According to the hypothesis in question, (139a) represents the reading in which everyone has wide scope, and (139b) the reading with wide scope for someone. But (139b), for instance, can also be interpreted with everyone having wide scope. Everyone c-commands the trace of someone, tj. By transfer, then, everyone can also c-command someone. This interpretation cannot be avoided if traces keep their normal properties. Compare, for instance, the following S-structure: (140)
[Which pictures of each otheri]j do theyi really like tj
They does not c-command each other directly, but each other is dominated by a phrase the trace of which is in the domain of they. By transfer, then, each other may also be interpreted as being in the domain of they. This is the normal property of traces, as we have seen before in many contexts. In
(139) traces can only be deprived of their normal properties by stipulation, which annihilates the explanatory force of the analysis. Movement, especially movement to A'-positions, is in a sense configuration-preserving; i.e. the relations holding at the positions of origin may still hold after movement. This was also the observation that inspired Van Riemsdijk and Williams (1981) to postulate a pre-Wh-movement level, NP-structure, at which major properties are defined: Wh-movement leads to practically no changes in the basic relations. Even in (139), therefore, we still need a disambiguating procedure, for instance the kind of scope indexing introduced by Haïk (1984).8 If this conclusion is correct, it does not make sense to create structures like (139), because scope indexing can just as well be done at S-structure, as shown by Haïk.
Arbitrary (ultimately stipulative) neglect of the transfer properties of traces undermines many other explanations based on QR. May (1985, ch. 1) observes, for instance, the following contrast: (141)
a. Dulles suspected everyone who Angleton did
b. *Dulles suspected Philby, who Angleton did
The explanation is based on the idea that VP-deletion is only possible if neither the missing verb nor its antecedent c-commands the other. The first example, (141a), involves a quantifier that can be extracted by QR: (142)
[[everyone who Angleton did]2 [Dulles suspected t2]]
In this structure, suspect no longer c-commands the missing verb, so that it can be reconstructed in accordance with the conditions on VP-deletion: (143)
[[everyone who Angleton suspected e2]2 [Dulles suspected t2]]
Since QR does not apply to (141b), which has a name instead of a quantifier, the anti-c-command condition cannot be circumvented in this case. Again, it is not clear what difference QR makes here, because the fronted quantifier phrase in (143) is still indirectly c-commanded by suspect, through the trace. Moreover, (143) is suspect for two other reasons. First of all, it involves Pied Piping, which is a highly idiosyncratic phenomenon that differs from language to language (Ross (1967)). It is likely that the idiosyncratic Pied Piping patterns of languages must be learned on the basis of evidence. But what evidence could there be concerning the nature of Pied Piping with respect to the "invisible" LF movement? Another problem with (143) is that it violates a well-formedness condition on coindexed NPs, Chomsky's i/i-filter (Chomsky (1981b, 212)): (144)
*[γ ... δ ... ], where γ and δ bear the same index
In short, QR does not solve what it is supposed to solve in this case, and it creates new problems that could have been avoided without QR. It seems obvious that the contrast in (141) is caused by the fact that the NP to be reconstructed is bound by a quantifier in (141a) and by a name in (141b). We can distinguish quantified NPs from other NPs at S-structure by giving them two different kinds of indices, for instance i for nonquantified NPs and i/i for quantified NPs (see Haïk (1984)). This is sufficient to distinguish (141a) from (141b) under reconstruction at S-structure: (145)
a. Dulles suspected everyonei/i who Angleton did [suspect ei/i]
b. Dulles suspected Philbyi, who Angleton did [suspect ei]
Whatever the explanation, the condition can clearly be stated at S-structure: VP-reconstruction under conditions of c-command is only possible in the scope of a quantifier. Another interesting case discussed by May (1985, ch. 1) is the following: (146)
Every pilot hit some Mig that chased him
As in previous examples, this sentence exhibits a scope ambiguity: either quantifier may be understood as having broader scope than the other. May observes that the construal of the pronoun him varies according to the scope relations: him can only be bound by every pilot if every pilot has broader scope than some Mig that chased him. This, again, would follow from the representation after QR: (147)
a. [Every pilot2 [some Mig that chased him3 [t2 hit t3]]]
b. [Some Mig that chased him3 [every pilot2 [t2 hit t3]]]
The idea is that only in (147a) is the pronoun c-commanded by every pilot, so that it can be construed as a bound variable. In (147b), in contrast, him is in a phrase with broader scope than every pilot. Consequently, him is not c-commanded by every pilot and thus cannot be interpreted as a bound variable. Again, the explanation does not seem to work. Also in (147b) the quantified NP every pilot c-commands the trace of the NP with broader scope, so that him can indirectly be construed as a variable. This indirect construal is, again, quite normal at S-structure: (148)
[Which of hisi sisters]j does every pilot like most tj
His can be construed as a bound variable, in spite of the fact that every pilot does not c-command it directly. The transfer properties of the trace are sufficient. Given the fact that the binding theory applies at S-structure, also for
pronouns bound to a quantified NP, and also given the fact that binding is a relation between A-positions, the explanation based on QR (147) is not intelligible. (For an alternative, see the analysis in terms of the Extended Name Constraint in Haïk (1984).)
QR does not have the desired effects because it overlooks the transfer properties of traces. Another problem has to do with the definition of c-command. Consider the structure of (139a), for instance: (149)
[S' [S everyonei [S someonej [S ti loves tj]]]]
This structure was supposed to solve the scope ambiguity because everyone (with wide scope) does c-command someone (with narrow scope), and not the other way around. This interpretation is inconsistent with the current definition of c-command, as noted by May (1985, ch. 2). According to the definition given by Aoun and Sportiche (1983), c-command is defined in terms of maximal projections: a node c-commands another node if each maximal projection dominating the first node also dominates the second node. Under the assumption that S' is a maximal projection in (149) (and not S), it follows from the Aoun-Sportiche definition of c-command that the quantifiers in (149) c-command one another. In other words, the asymmetry in c-command, necessary for disambiguating the structure, is now lost. It is not easy to preserve the desired properties of QR, if the Aoun-Sportiche definition of c-command is accepted. In an interesting attempt to solve this serious problem, May (1985) proposes the Scope Principle: (150)
Scope Principle: Σ-sequences are arbitrarily interpreted
A Σ-sequence is a class of operators Ψ, such that for any operators Oi, Oj ∈ Ψ, Oi governs Oj. The mutual governance is expressed by mutual c-command, so the two "operators" everyone and someone form a Σ-sequence in (149). The arbitrary interpretation that is allowed in such sequences according to (150) entails, among other things, that either everyone or someone has scope over the other. C-command, in other words, is no longer used as a disambiguating condition in (149). This is a rather radical departure from the assumptions of May (1977). To see how (150) works and interacts with other principles, consider the following examples (from May (1985, ch. 2)): (151)
a. [S' [S every student2 [S some professor3 [S e2 admires e3]]]]
b. [S' [S some professor3 [S every student2 [S e2 admires e3]]]]
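The loss of asymmetry under the Aoun-Sportiche definition can be verified on a structure like (151a). The sketch below is illustrative only: it encodes the tree as a child-to-parent table, assumes (as in the text) that S' is a maximal projection and S is not, additionally assumes that VP is maximal, and compares the result with a first-branching-node definition. The node identifiers and function names are hypothetical.

```python
# Illustrative sketch only: identifiers and tree encoding are assumptions.
# Structure (151a): [S' [S every student2 [S some professor3 [S e2 admires e3]]]]
PARENT = {
    "S_a": "S'",               # outer adjunction site
    "every student": "S_a",
    "S_b": "S_a",              # inner adjunction site
    "some professor": "S_b",
    "S_c": "S_b",              # the original clause
    "e2": "S_c",
    "VP": "S_c",
    "admires": "VP",
    "e3": "VP",
}
MAXIMAL = {"S'", "VP"}         # S is assumed not to be a maximal projection


def ancestors(node):
    """Return the nodes properly dominating `node`, lowest first."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def dominates(a, b):
    return a in ancestors(b)


def children(node):
    return [child for child, parent in PARENT.items() if parent == node]


def c_commands_first_branching(a, b):
    """a c-commands b if the first branching node above a dominates b."""
    if a == b or dominates(a, b) or dominates(b, a):
        return False
    for node in ancestors(a):
        if len(children(node)) > 1:
            return dominates(node, b)
    return False


def c_commands_maximal(a, b):
    """a c-commands b if every maximal projection dominating a dominates b."""
    if a == b or dominates(a, b) or dominates(b, a):
        return False
    return all(dominates(m, b) for m in ancestors(a) if m in MAXIMAL)


if __name__ == "__main__":
    q1, q2 = "every student", "some professor"
    print(c_commands_first_branching(q1, q2), c_commands_first_branching(q2, q1))
    print(c_commands_maximal(q1, q2), c_commands_maximal(q2, q1))
```

Under the first-branching-node definition only the higher quantifier c-commands the lower one; under the maximal-projection definition the relation is mutual, so the two quantifiers qualify as a Σ-sequence in the sense of (150).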
In the theory of May (1977), the two representations in (151) served to disambiguate the sentence every student admires some professor. In the new theory of May (1985), only (151b) represents the two readings. The two quantified NPs govern each other, so that the Scope Principle (150) applies: either some professor or every student has broader scope than the other. Interestingly, (151a) is supposed to be ruled out by the ECP at LF. According to this analysis, e2 is not properly governed because the antecedent-governor every student is not adjacent to e2. Contrary to e3, e2 is not lexically governed, so that the sentence is ruled out by the ECP. (Later, in chapter 5, May reformulates this alleged ECP fact in terms of the Path Containment Condition of Pesetsky (1982b).) This analysis is subsequently applied to an interesting ambiguity, observed in the following sentences: (152)
a. What did everyone buy for Max?
b. Who bought everything for Max?
The first sentence, (152a), has two readings. If what has wide scope, a possible answer is (153a), and if everyone has wide scope, a possible answer is (153b): (153)
a. Everyone bought Max a Bösendorfer piano
b. Mary bought Max a tie, Sally a sweater, and Harry a piano
In contrast, (152b) has only one reading, with who having wide scope. An appropriate answer would be: (154)
Mary bought everything for Max
According to May, the contrast between (152a) and (152b) as regards scope possibilities is accounted for by an interaction of QR, the Scope Principle (150), and the ECP. This can be seen from the representations: (155)
a. [S' What2 [S everyone3 [S e3 bought e2 for Max]]]
b. [S' Who3 [S everything2 [S e3 bought e2 for Max]]]
In both cases, the conditions of the Scope Principle are fulfilled, but (155b) is ruled out by the ECP. Consequently, only (155a) is well-formed as the representation of a scope ambiguity. This might seem to have the unwanted consequence that the grammatical (152b) does not have a representation at all. May, therefore, proposes another modification, based on earlier ideas of Williams (1977) and Sag (1976): QR can also adjoin material to VP. This yields the following representation for (152b): (156)
[S' Who3 [S e3 [VP everything2 [VP bought e2 for Max]]]]
The ECP is no longer violated, and relative scope is no longer determined
by the Scope Principle, because the two quantifiers who and everything do not govern each other. If the Scope Principle does not apply, scope is determined in the old way, by c-command: who has wider scope than everything, because who c-commands everything, and not the other way around. I find this explanation unconvincing for a number of reasons. First of all, if QR can adjoin an NP to a VP and to an NP (as May later on assumes), why can QR not adjoin everything to S', yielding the following representation for (152b): (157)
[S' Everything2 [S' who3 [e3 bought e2 for Max]]]
The conditions of the ECP are met, and the Scope Principle also applies (if it did not, everything would still have wide scope because of ccommand). It seems to me that (157) cannot be excluded without arbitrary stipulation. This is a serious objection, because (157) represents the reading that the analysis seeks to exclude. But apart from this analytical problem, there are also empirical reasons to reject the solution. Even in (152b) it is not quite clear whether the less accessible reading (with everything having wide scope) should be excluded. But even if there is a factor that rules this reading out, it is not obviously the structural factor related to the ECP. The problem is that there are structurally analogous cases in which it is perfectly natural for the universal quantifier to have wide scope: (158)
a. Which chemical gives each wine its own flavor?
b. Who knows the right medicine for each patient?
In these examples, the universal quantifiers can have wide scope, as is clear from the possible answers: (159)
a. Tannin gives Bordeaux its flavor, and sulphite all the rest
b. John knows the right medicine for this one, and Mary knows the right medicine for that one, etc.
Since these cases are structurally similar to (152b), we do not expect these readings if the ECP account is correct. Or in other words, the analysis does not seem to explain what it is supposed to explain. There is no clear evidence that everyone must be adjoined to the VP in certain cases. May mentions the following example from Williams (1977): (160)
Max saw everyone before Bill did
This example is ambiguous, according to Williams, depending on adjunction of the quantified phrase to the VP (the collective reading) or the S (the distributed reading):
(161)
a. [Everyone]j Max [VP saw tj] before Bill did [VP -]
b. Max [VP everyonej [VP saw tj]] before Bill did [VP -]
According to May, the difference in adjunction will correlate with whether the quantified phrase, or just the variable it binds, is "reconstructed" back into the position of the missing VP. He then goes on to say that (162) is not ambiguous: (162)
Who saw everyone before Bill did?
This sentence would only show the "collective" construal, while the "distributed" reading of (161) would be lacking. This would follow from the impossibility of S-adjunction in (162), due to the ECP (government of the subject trace of who would be blocked if everyone were adjoined to S). Again, it seems unlikely that (160) can be disambiguated by the two representations in (161). As before, the transfer properties of traces make these two representations equivalent. Both are non-isotopic representations of the same pair everyone (the lexical content in an A'-position) plus its functional position (the trace in an A-position). If the VP is reconstructed in (161a), the content of the antecedent is reconstructed. This means that not only saw but also the content of the trace is reconstructed. But apart from this, I detect no clear difference between (160) and (162). Both can have the "collective" and the "distributed" reading for everyone. This is perhaps clearer with verbs like kiss, which are somewhat easier to construe with the distributed than with the collective reading:
(163)
Which girl kissed everyone before Sally did?
The distributed reading comes more readily to mind here than the collective reading. Obviously, the availability of the two readings is independent of Wh-movement. If the disambiguation (161) is accepted, then (163) is in fact a counterexample against the ECP account. If we do not accept the disambiguation (161) (and we should not, in my opinion), there is no longer an argument for LF-adjunction of quantified NPs to the VP. The second argument for VP-adjunction involves the following pair: (164)
a. Which of his poems did every poet read?
b. Which of his poems were read by every poet?
In (164a), it is natural to construe his as a variable bound to every poet. In (164b), this construal is considerably less natural, although not entirely impossible, as May notes. This would follow from the analysis given, because only in (164a) can his be in the scope of every poet, in accordance with the Scope Principle.
Again, the analysis does not carry much weight. To begin with, the bound variable construal is not entirely impossible in (164b), as May himself notes. If this reading is possible at all, (164b) is a counterexample against the intended analysis, because his would not be in the scope of every poet, which is adjoined to VP. The real point is that every poet indirectly c-commands his at S-structure, as required for binding. Binding from a by-phrase is not entirely impossible, as in a book by John about himself. Similarly, the trace in (164b) is not c-commanded by every poet, so that the bound variable construal depends on the extent to which one accepts binding from a by-phrase. A similar contrast can be found in: (165)
a. Everyone said that he was happy
b. It was said by everyone that he was happy
All in all, I see no argument, not even a subtle one, for adjunction to VP in (164). Another argument for adjunction of quantified NPs to VP, again based on earlier work by Williams (1977) and Sag (1976), is given in May (1985, ch. 3). A sentence like some student admires every professor is ambiguous in isolation, but in the following VP-deletion context, the ambiguity disappears (only a specific construal is available for some student): (166)
Some student admires every professor, but John doesn't
The explanation is, as before, based on the idea that only reconstruction of a VP containing the every-phrase will give rise to a well-formed logical representation. If every professor can be adjoined to the VP and to the S, the following structures are possible: (167)
a. [S Some student2 [S e2 [VP every professor3 [VP admires e3]]]], but John (does not) [VP every professor3 [VP admire e3]]
b. [S Every professor3 [S some student2 [S e2 [VP admires e3]]]], but John (does not) [VP admire e3]
(167b) is considered problematic because of the fact that the second conjunct contains a free variable (the scope of every professor is limited to the first conjunct). The other example, (167a), is not problematic because the reconstructed VP contains a quantifier that binds the variable, thanks to the fact that every professor is adjoined to the VP in the first conjunct. It is this possibility of adjunction either to S or VP, together with the alleged impossibility of (167b), that explains the fact that some student can only have the specific reading (in which some student has wide scope) according to May. This would entail another argument for LF-adjunction to VP. It seems to me, however, that the specific reading of some student in
(166) is not a fact about VP-deletion but a fact about coordination. In noncoordinated structures, the ambiguity is preserved under VP-deletion: (168)
a. Some student admires every professor who John does
b. Some student admired every professor at a time that John didn't yet
In both examples, the first part is identical to the first part of (166), and both examples involve VP-deletion in the same way as (166). In spite of this, both sentences are ambiguous: they also allow the nonspecific construal with some student in the scope of every professor. This is inconsistent with the explanation given for (166): we would expect the same exclusion of the nonspecific reading under the analysis given. If we look at other coordinate structures, however, we see the same pattern as in the coordinate structure (166): (169)
a. Some student and John admire every professor
b. Some student admires every professor and Bill only Quine
Both examples have nothing to do with VP-deletion, but both only allow the specific construal. I conclude from this fact that the specific reading in (166) has nothing to do with VP-deletion, so that it cannot be considered an argument in favor of adjunction to VP. De facto this was the last argument in favor of adjunction to VP in May (1985). The last argument presented is not a real argument, because it shows only that the evidence in question is compatible with the adjunction-to-VP analysis. It does not show that the evidence in question must involve adjunction to VP. The relevant fact is: (170)
Every pilot hit some Mig that chased him
The point of concern is that him can only be taken as a bound variable if the some-phrase containing it has narrower scope than every pilot. This well-known fact is not compatible with May's earlier analysis (QR as adjunction to S) in conjunction with certain assumptions about weak crossover. May shows that these assumptions are compatible with his new conception of QR (adjunction to VP). As said before, this compatibility does not give new support for the analysis. The fact in question is also compatible with a number of other analyses (for instance, in terms of the Extended Name Constraint in Haïk (1984)). This concludes my criticisms of the arguments in favor of QR-as-adjunction-to-VP.
I will end with one last argument in favor of LF (and QR), one of the strongest arguments for such a level, according to May (1985, ch. 4). The argument is about the inverse-linking cases, discussed in May (1977):
(171)
Somebody from every city despises it
May argues convincingly that it is not a donkey-pronoun (like the it in Every man who owns a donkey beats it), and that Haïk's concept of indirect binding is therefore problematic for these cases (Haïk (1984)). In inverse-linking cases like (171), everyone has wide scope. It is thanks to this wide-scope property, expressed by c-command at LF, that it can be interpreted as a bound variable in (171). There are two claims involved: (i) it is a bound variable, bound by every city, (ii) the binding relation is expressed by c-command at LF. This, according to May, is one of the strongest arguments for LF: (171) clearly shows that c-command at S-structure does not work (as assumed by Reinhart (1976), and recently by Haïk (1984)). One could, of course, say that if c-command at S-structure does not work, something else at S-structure might work. It is after all not a priori certain that all prominence relations must be reduced to c-command at some level. But it is not necessary to make this move. The reason is that (171) does not seem to involve binding at all, precisely because every city does not c-command it at S-structure. The examples that May gives do not involve simple quantified NPs like someone and everyone, but noun phrases like each city and each pianist. In the latter class, the quantifier is expressed by a determiner, and the restriction on the quantifier is expressed by the head noun. There is a considerable difference between these two kinds of quantified NPs. Simple quantifiers practically require c-command for bound-variable interpretation (at S-structure). The other quantified NPs behave almost like descriptions or names. Thus, if we replace every city in (171) by everyone, the sentence becomes ungrammatical:
(172)
*A brother of everyone hates him
We find a similar contrast among cases of weak crossover. All of the following examples are less than optimal, but (173b) is considerably better than (173c):
(173)
a. ?His father hates John
b. ??His father hates every pianist
c. *His father hates everyone
We find a similar contrast among the following sentences:
(174)
a. The father of John hopes that he will win prizes
b. The father of every pianist hopes that he will win prizes
c. *The father of everyone hopes that he will win prizes
In (174c), everyone does not c-command he at S-structure, which makes the sentence ungrammatical. Maybe English is a bad example because of
the deviant behavior of everyone in certain contexts. But in most languages, a simple quantified NP must c-command a bound pronoun at S-structure (see Higginbotham (1980) for Mandarin Chinese, and Koopman and Sportiche (1982) for French, Dutch, and other languages). Quantified NPs like each pianist, however, can even be embedded in a sentential noun complement:
(175)
The fact that each pianist plays Mozart does not prove that he likes music
Such examples are also beyond the scope of QR, which is certainly not designed to move quantified NPs out of a complex NP, in spite of the fact that the canonical paraphrase of (175) is something like:
(176)
For every x, x a pianist, the fact that x plays Mozart does not prove that x likes music
In short, with pronouns related to NPs like each pianist we find nothing like binding conditions at all, no c-command at S-structure or at LF. Rather, the quantified NPs behave like quasi-names, and anaphora seems to assume the character of free anaphora to some extent (see Lasnik (1976)). There is direct evidence that it is not in a bound position in Somebody from every city despises it. We have examples like:
(177)
a. Somebody from every city hates the place
b. The parents of each pianist want the fellow to be happy
Epithets and the like are never possible in strict binding positions, i.e. if the quantifier c-commands the A-position in question:
(178)
*Each pianist thinks that the fellow is happy
The contrast between (178) and (177b) shows that the fellow is not bound at all by each pianist in (177b). The inverse-linking cases, therefore, do not challenge the view that binding is expressed by c-command at S-structure, nor do they form any evidence for a level of Logical Form. This concludes the discussion of a fair sample of arguments for Logical Form. However interesting the idea of LF has always been, I can only conclude that there is no evidence for it (the same conclusion will be drawn from the ECP evidence in chapter 4). In a way, this conclusion is more negative than in the case of D-structure and NP-structure. Nothing real corresponds to LF, there are no known properties of LF movement, and there are no known properties of the level itself (and the same can be said of the ECP, to which I will return). In the case of D-structure, the conclusion was different. In spite of the
fact that not all argument positions could be filled from lexical positions by "move alpha", we could still distinguish a substructure of basic θ-positions, projected from the lexicon. There is also good evidence that this substructure plays a role in grammar. There is nothing real, however, in the case of Logical Form. It is my belief that the inspiration from the quantifier-variable notation of the standard predicate calculus was wrong to begin with. The analogy between Wh-movement and the creation of quantifier-variable structure was very misleading in this respect. It is not the first time that generative grammar has been led astray by images from logic textbooks. Generative Semantics was inspired by the same illusion. In chapter 7, I will make some further remarks to motivate my scepticism about logicism in the study of grammar. Syntax, as I see it, has nothing to do with logic, natural or otherwise. The essence of syntax, the configurational matrix, defines a structure without meaning or inherent purpose. It might be an entirely accidental, epiphenomenal spin-off of our brain structure. In any case, it is in itself not a calculus for some inherent purpose, like the expression of meaning. The idea that syntax is a "natural" calculus to that end is, in my view, an obstacle to the scientific study of grammar.
2.5. Conclusion
Traditionally, the idea of distinct levels of representation has played an important role in generative grammar. According to some, it is even the most important idea in generative grammar. Also in the framework presented here, at least the following levels must be distinguished: lexical structure, surface structure, and most important, S-structure. Furthermore, it seems likely - although beyond the scope of grammar as such - that there are levels of representation in which aspects of syntax and meaning are integrated. Empirically speaking, little is known about these integrated levels. What we find in the current model-theoretic approaches to syntax/semantics cannot be the biologically real model we are looking for. The reason is that as yet syntax is often treated in these approaches as something that is already known, whereas only the dimmest outlines have been glimpsed. Integrated levels can only be constructed if the constituent elements are better known. They are at a different level of abstraction than what has been studied in generative grammar. The idea that distinct levels of representation are the essence of syntax has led to a true proliferation of levels in recent years. Some of these levels are based on elements that have been part of generative grammar since its inception. D-structure is an example. Others, like LF and NP-structure, are relatively new. The conclusion of this chapter is that there is no convincing evidence for D-structure, NP-structure, or LF.
S-structure is, according to Chomsky (1981b, 39), factored into two components: D-structure and "move alpha". The possibility of generating S-structure directly has been envisaged since Chomsky (1973). The standard GB approach to these matters, called "theory Ia" in Chomsky (1981b, 90), assumes that only D-structures are base-generated and that S-structures are derived from them by the rule "move alpha". The alternative defended here, called "theory Ib" by Chomsky, generates S-structures directly, without the rule "move alpha". According to Chomsky (1981b, 90), "[i]t is not easy to find empirical evidence to distinguish between Theories Ia and Ib. It may be that they are to be understood as two realizations of the same more abstract theory, which captures the essential properties of UG at the level of abstraction appropriate for linguistic theory." Particularly, theory Ib has never been accepted as the better theory because it is thought that it would still have to distinguish D-structure as a substructure and, instead of "move alpha", interpretive rules with the properties of "move alpha". These rules would have properties distinct from other construal rules, so that there would be no support for theory Ib on empirical grounds (Chomsky (1981b, 92)). I disagree with this interpretation for two reasons. To begin with, we have shown that there are more arguments at S-structure in non-θ-positions than can be "filled" by a rule with the properties of "move alpha". This is particularly clear in topicalization structures and easy-to-please constructions. Furthermore, the discovery of empty resumptive pronouns in parasitic gap constructions and in islands has obliterated the idea that empty categories bound by a Wh-phrase in COMP must be generated by "move alpha". The gaps in question cannot be generated by "move alpha", because the antecedent-gap relation does not have the properties of "move alpha".
Most important of all, in spite of the fact that this issue has existed for almost 15 years, the proponents of theory Ia have not been successful in isolating distinctive properties of "move alpha". In this chapter, I have shown that "move alpha" cannot be functionally defined. It is, just like the other construal rules, an instance of the property-sharing rule. It is filtered by the same uniqueness principle (which accounts for the selectivity of property sharing) and it has the same configurational properties (in the unmarked case). In what follows, I will show that the bounding conditions of "move alpha" can be factored into two components. Unmarked bounding is defined by the Bounding Condition of Koster (1978c). According to this condition, empty categories are bound in their minimal governing category. In the unmarked case, the "minimal governing category" falls together more or less with the notion "maximal projection". Thus, an empty category must be bound in the minimal NP, PP, AP, or S' in which it is governed. I will show in chapter 3 that a subclass of control structures is characterized by exactly the same Bounding Condition. It concerns those control complements for which there is much evidence that they are
transparent (like S'-deletion complements), so that PRO must be governed. In chapter 4, I will show that in some languages, like Dutch, traces are characterized by the strict Bounding Condition just mentioned. Interestingly, gaps in nonstrict bounding contexts have entirely different properties from standard traces. They must be pro according to Cinque (1983b) and Obenauer (1984), and can only be of the category NP. That these gaps only occur in some languages, but not in others, is explained in terms of certain directionality constraints. The most important aspect of the conclusions of chapters 3, 4, and 5 is that there is no difference between the locality principle for governed PRO and the (unmarked) locality principle for "move alpha". In chapter 6, I will show that the unmarked locality principle, the Bounding Condition, also plays a role in reflexivization. In chapter 1, I indicated that all licensing relations are also characterized by the Bounding Condition, including the governance relation itself. In the same chapter, I also indicated that gapping is constrained by the Bounding Condition (for details, see Koster (1978c, ch. 3)). All in all, it appears that there is much evidence that Subjacency is an artefact. There is no evidence that "move alpha" is characterized by a locality principle distinct from what we find in rules of construal. It is not true, in other words, that theory Ib must mimic "move alpha" with a construal rule with the properties of "move alpha". It is rather the case that there is nothing at all that has the properties of "move alpha". If this conclusion is correct, then theories Ia and Ib are not notational variants. Theory Ib must, on the contrary, be the better theory and theory Ia is refuted. NP-structure was in part based on an assumption underlying the idea of D-structure.
Van Riemsdijk and Williams (1981) admitted that there were no compelling arguments for NP-structure (and D-structure) apart from considerations of elegance and naturalness. I will return to these considerations shortly. Empirically, the NP-structure model was not quite successful, particularly because of the existence of the nonstandard gaps, bound long-distance by a Wh-phrase. The content of these gaps cannot be reconstructed by assuming NP-structure and "move alpha". The reason is that "long distance" A'-binding does not have the properties that are attributed to "move alpha". (A deeper reason for scepticism is, of course, that "move alpha" itself is a dubious concept.) Nevertheless, the gaps-bound-at-a-distance often show the "reconstruction" properties of filled gaps at NP-structure. This shows that NP-structure, as a literal, "physical" reconstruction of the content of the gaps, cannot be right. Also in other respects, it was shown that NP-structure could not be defined as the level of certain properties, such as binding or Case-marking. In spite of these disagreements, I think that the NP-structure model is superior to the standard model in one respect, namely in its treatment of
Wh-traces as reconstruction sites (and not as variables). This has led to a certain scepticism about LF, shared by the theory presented here. The essence of the critique on LF is that the Wh-phrase-trace relation has nothing to do with the operator-variable structures of the standard predicate calculus. In this sense, the NP-structure model has led to what in my opinion is decisive evidence against the logistic interpretation of Wh-movement (see Van Riemsdijk (1982, 1983)). If Wh-movement does not create operator-variable structures, the idea that inspired LF and LF movement becomes meaningless. But empirically as well, the idea has not been convincing. In this case, LF movement is not even considered to have the properties normally attributed to "move alpha". And the properties that the level of LF is supposed to have remain in the dark. In chapter 4, I will show that there is no clear evidence for the ECP (all dependent elements must be governed, whether they are lexical or not). Also, attempts to establish the idea that LF is the level of anaphor binding have been very unsuccessful. This latter fact has something to do with the similar scepticism about other levels: even if "move alpha" existed, it would be configuration-preserving in a sense, because of the transfer properties of traces. The fact that traces have transfer properties, like anaphors, might be the clue to an understanding of the fact (if it is a fact) that the derivational approach is not successful in the long run. Since its inception, transformational grammar has been characterized by what we might call "derivational concepts" and "representational concepts". In the beginning, there was a heavy bias towards derivational concepts, such as "movement", and the cycle (see Chomsky (1965), the culmination point of this early development). Already at an early stage, there were also purely representational concepts.
The notion c-command, for instance, goes back to the notion "in construction with", found as early as Klima (1964). The other fundamental notion, locality, has had a more mixed career. A fundamental shift took place when Dougherty (1969), Jackendoff (1972), and others developed representational theories for bound anaphora. This led to a "mixed" perspective in Chomsky (1973), according to which locality principles for anaphors are representational (as in the current binding theory), and locality principles for traces are derivational (Subjacency). Up until recently, this has been the standard perspective (but see Jenkins (1976), Lightfoot (1977), Freidin (1975) and (1978), Koster (1978c), and more recently Cinque (1983b), Rizzi (1983), Sportiche (1983), and others). What has undermined the derivational perspective? Trace theory, ultimately the idea of Structure Preservingness (see Emonds (1970) and (1976)), in my opinion. Structure Preservingness has two aspects, the Emondsian idea that certain landing sites of moved categories are already base-generated, and trace theory. It is, incidentally, the case that other nontransformational approaches, like the ones developed by Brame and Bresnan, were also originally inspired by the work of Emonds. The
underlying idea is quite natural: if transformations only create structures that are already base-generated, what sense does it make to have transformations in the first place? This query was answered in two ways: (i) there must still be a link to the positions in which categories are licensed, and (ii) movement rules have different properties from construal rules. The first motivation for movement, the licensing problem, was entirely undermined by the second aspect of Structure Preservingness, the development of trace theory since Chomsky (1973). Curiously, traces were not accepted by some of the earlier proponents of base-oriented approaches, like Bresnan and Brame. It seems to me that in the traceless base-oriented approaches that they developed, the problem of strictly local licensing was never solved. This perhaps had to do with a misguided anti-abstractness bias, particularly with the idea that a syntactic position is defined exclusively by lexical content. Obviously, however, a syntactic position is defined by lexical content and (or) by context, i.e. by licensing relations. By simply assuming that lexical content must be distinguished from the functional position it fills (as in Aspects, and particularly as in Chomsky (1967)) we get trace theory practically for free (see also Chomsky (1981b, 85ff.)). Trace theory solves the problem of strictly local licensing, even if the lexical content associated with a functional position is "at a distance". The development of trace theory, which was indeed more implicit in earlier variants of transformational grammar than sometimes realized, brought, as we saw, theory Ib to the fore as a potential alternative to the derivational approach. The defense of this theory in Koster (1978c) was, naturally, directed against the only remaining motivation for "move alpha", the idea that movement rules have distinct properties.
From the same perspective, Freidin (1978) showed that another core notion of the derivational approach, the principle of the cycle, does not have an independent status in the theory of grammar (under the assumptions of trace theory). In retrospect, it is clear why it has been thought for such a long time that "move alpha" has distinct properties. Long Wh-movement is an accidental property of English (and some other well-studied languages). At first sight, this looks like a process with properties very different from what we see in the construal rules. This conception has been undermined by two facts. First of all, there are languages with local Wh-movement, but without long Wh-movement. In Dutch, for instance, most Wh-movement is of the strictly local variety. Apart from the exception discussed in chapter 1, long Wh-movement exists only as an iteration of local Wh-movement ("successive cyclic movement"). In other languages with Wh-movement, like certain varieties of German, long Wh-movement does not exist at all. On the other hand, there are languages with "long construal", like the long reflexivization found in Icelandic (see Thráinsson (1976), Maling (1981), Anderson (1983), and Everaert (1986)). A language like Dutch, with mostly strictly local movement and construal, is in a way, then, a case in which the similarity
between "movement" and "construal" is most obvious. It is interesting to see to what extent the idea of Structure Preservingness has undermined the derivational approach with its proliferation of levels. In the early stages of the Extended Standard Theory, the issue of levels was much connected with the question at what level semantic interpretation applied (see for instance Jackendoff (1972)). The basic idea was that certain aspects of interpretation must be stated at deep structure, while others must be stated at surface structure. Given the importance that was originally attached to the question which levels are the levels of semantic interpretation, it is significant that this whole issue has almost disappeared. There are very few linguists that refer to GB theory as "interpretive semantics" (also because of the disappearance of Generative Semantics). The issue has largely lost its significance because of Structure Preservingness, mainly because of trace theory. Chomsky (1975, 117), for instance, indicates that thanks to traces all semantic interpretation can be done at S-structure. It has been insufficiently realized that thanks to the Emondsian aspect of Structure Preservingness (pre-generated landing sites), one could just as well say that all (or most) semantic interpretation can be done at the level of D-structure. One of the important aspects of Van Riemsdijk and Williams (1981) is that they show that Wh-movement (and other forms of movement to A'-positions) has practically no effect on semantic interpretation. It is because of this that they consider the possibility of semantic interpretation before Wh-movement (i.e. at their level of NP-structure). But precisely the landing sites of NP-movement are pre-generated in Emonds's sense. Thus, one of the traditional arguments for semantic interpretation at S-structure is a sentence like the following (see Chomsky (1981b, 43)):
(179)
They seem to each other [t to be happy]
The argument has always been that the binding rules for anaphors must apply at surface (or S-) structure, because only at this level is there a suitable antecedent (they) that c-commands the anaphor each other. This argument is far from compelling under current assumptions. Consider the D-structure of (179):
(180)
NP_i seem to each other_i [they_i to be happy]
The binding theory says that each other must be A-bound. Clearly, this condition is fulfilled at D-structure: each other is bound by the empty subject NP_i, the future landing site of they. The binding theory, in short, could just as well be applied at D-structure. Of course, full interpretation requires the further information that the antecedent is they. But this, too, can be determined at D-structure. We can stipulate that lexical NPs are only interpreted in relation to a Case position. They in (180) is not in a
Case position, so it cannot be interpreted in situ. But it can be interpreted elsewhere in the pre-generated NP-chain, namely in its only Case position, the matrix subject position in (180). Given the correctness of the view that movement to A'-positions is semantically neutral (at least for bound anaphora), all relevant positions for semantic interpretation are already present at D-structure. It is Structure Preservingness in both senses, then, that ultimately undermines the derivational approach. Everything relevant for semantic representation is present at both D- and S-structure. Similarly, LF movement cannot change anything in principle, given the transfer properties of traces. Semantically, the distinction of three levels, S-structure, D-structure, and LF, is useless. As already mentioned, Van Riemsdijk and Williams (1981) realize that the arguments for NP-structure are not compelling in a theory with traces. Similarly, Chomsky (1981b) mentions several times that the arguments for D-structure are "highly theory-internal" in a theory with traces. It is important to note, then, that there is some consensus that S-structure is by far the best-established level of syntactic representation. A preference for the extra levels of D- and NP-structure is mainly based on considerations of elegance and naturalness. It is not immediately obvious what is meant by elegance and naturalness in this context. It seems to me that the underlying assumption of both Chomsky and Van Riemsdijk and Williams is that the natural relation between a functional category and its lexical content is isotopic, in the sense defined earlier. This is in my opinion the essence of the idea that certain things are most naturally represented at distinct levels. What all these levels appear to do, among other things, is to reconstruct relations of isotopic property sharing.
It is highly significant that Van Riemsdijk and Williams (1981) consider various possibilities of reconstruction only in this sense: it is either reconstruction of isotopic relations before Wh-movement (NP-structure) or reconstruction of isotopic relations after Wh-movement (at LF). Non-isotopic property sharing is somehow considered less basic, and therefore derived. Hence, the insistence on the concept of reconstruction, i.e. the reconstruction of isotopic relations. I will conclude this chapter by showing that there is no fundamental difference between isotopic and non-isotopic property sharing, and that there is therefore no reason to reconstruct anything by the postulation of levels. Ultimately, isotopic property sharing and non-isotopic property sharing are manifestations of exactly the same deeper principles, namely the properties of the configurational matrix. Note first that the preference for isotopic property sharing is not universally applied to all syntactic relations. For construal rules, it is generally assumed that property sharing is non-isotopic and not reducible to isotopic property sharing. Thus, a reflexive pronoun shares its referential index with its antecedent non-isotopically. Isotopic property sharing would mean in this case that the reflexive originates in a position
where the referential index is assigned directly, followed by "movement" to its surface structure position. This is impossible, hence the universal acceptance of what I call non-isotopic property sharing. It has been a fundamental idea in the approach originating with Koster (1978c) that lexical content must essentially be treated like a referential index: it can be shared non-isotopically, by categories in two different positions. At this point, it could be objected that there is still a difference: referential indices must be shared non-isotopically and lexical content can only be shared non-isotopically. Isotopic property sharing for lexical content and the functional category assigned to it could still be basic. I will show now that this apparent difference is deceptive. The fact that isotopic property sharing is not possible in the case of bound anaphora is entirely due to independent factors. This opens the perspective of a conceptually unified theory: all syntactic property sharing is either isotopic or non-isotopic. In fact, I will make a stronger claim: there is no need for the distinction between isotopic and non-isotopic property sharing. These terms were only used for purposes of exposition. Both types of relations are manifestations of the configurational matrix. Ultimately, then, there is only one type of syntactic relation, characterized by one set of properties. Consider the properties of the configurational matrix once again: (181)
a. obligatoriness
b. uniqueness of the antecedent
c. prominence of the antecedent
d. locality
Compare next a representation of isotopic property sharing (182a) and a representation of non-isotopic property sharing (182b):
(182)
a. [S' COMP [S [NP John] [VP [V saw] [NP_j [NP what]_j]]]]
b. [S' [COMP [NP what]_j] [S [NP John] [VP [V saw] NP_j]]]
In both cases, we can distinguish two aspects of the object (of saw): the functional position NP_j (i.e. [NP, VP] with its Case and θ-license) and the associated lexical content [NP what]_j. In (182a), the functional position dominates the lexical content (so that the two fall together). In that case, the properties of the two entities are shared isotopically, as we called it. In (182b), we find exactly the same two ingredients. In this case, the functional category does not dominate the lexical content. Note furthermore that the dominance relation is such that the lexical content also dominates
the functional category in (182b) (i.e. NP_j from the lexicon to which the features of what are assigned). As I mentioned, the isotopic representation (182a) has somehow been considered natural or basic, and the non-isotopic representation derived. I have already indicated in the discussion of LF that (182a) and (182b) are semantically equivalent representations. I will now show that it is meaningless to construe (182a) as the "reconstruction" of (182b). Nothing needs to be reconstructed, because both (182a) and (182b) represent a relation with the same properties, namely the properties of the configurational matrix. For (182b), this is nothing new. Let us therefore consider (182a). The first property, obligatoriness, is fulfilled: the functional position NP_j must be related to the lexical content [NP what]_j, and vice versa. Uniqueness is also a property of the isotopic relation. Thus, it is not possible (in noncoordinated structures) to assign two lexical contents, [NP John]_j and [NP Bill]_i, to one functional position NP_j:
(183)
*[NP_j John Bill]
The fourth property, locality, is trivially fulfilled. Both the functional category and its lexical content are in the same domain. Let us now turn to the third property, prominence, which appears to be the crucial property. As discussed in chapter 1, the standard prominence property is c-command, usually defined as follows (see Aoun and Sportiche (1983)):
(184)
α c-commands β =df every maximal projection dominating α dominates β, and α does not dominate β
What is surprising, from the present perspective, is the curious stipulation about dominance (the final clause of (184)). This ad hoc addition to an otherwise natural principle has accompanied the definition of c-command since Reinhart (1976). To my knowledge, it has not been noticed that the stipulation in question occurs twice in the current theory of grammar, at least partially. What I have in mind is Chomsky's i-within-i filter (Chomsky (1981b, 212)):
(185)
*[γ ... δ ... ], where γ and δ bear the same index
As Chomsky shows, this filter is independently motivated. It is also a very natural principle: a category cannot overlap in reference with a proper subpart of it. But it should be clear that the filter duplicates what the stipulation in the definition of c-command purports to do. Consider an example:
(186)
[NP_j Det [N' N [PP P [NP_j each other]]]]
According to the stipulation in the definition of c-command, the higher NP_j cannot be the antecedent of the lower NP_j, and the same is prevented by the filter (185). For related reasons, NP_j cannot be the antecedent of itself: the essence of anaphora is that it involves incomplete lexical items: the referential index cannot come from itself, but must be shared with an NP with a content that is complete in the desired sense. Given the independent status of the filter (185), we can drop the stipulation in the definition of c-command, thereby simplifying the concept of prominence. But note now that the stipulation concerned "dominance", i.e. the factor that differentiates isotopic property sharing from non-isotopic property sharing with respect to the properties of the configurational matrix. If we drop the reference to dominance, which is independently motivated, there is no longer a difference between isotopic and non-isotopic property sharing with respect to (181): both modes of property sharing are complete and equal realizations of the properties of the configurational matrix. The idea, then, that isotopic property sharing is more natural or basic has no theoretical foundation. Consequently, it is meaningless to reconstruct somehow a situation of isotopic property sharing, for instance by the postulation of distinct levels of representation. We have another reason, therefore, to be sceptical about a proliferation of levels: they reconstruct something that does not need to be reconstructed. Isotopic and non-isotopic property sharing are manifestations of one and the same relation with one set of defining properties. The difference between the mode of property sharing in anaphora and the position-content relation follows from independent factors. By giving up the idea of reconstruction levels we can simplify the theory of grammar in more ways than one: first of all the levels of NP- and D-structure are eliminated as artefacts.
Furthermore, "move alpha", also an artefact, can be seen as the manifestation of a rule with more general properties, the property-sharing rule with the properties of the configurational matrix. And third, we can unify the definition of property sharing by simplifying the definition of c-command. I will now conclude with the speculation that, thanks to the simplification of the notion of prominence, important aspects of the (so-called) base rules can also be reduced to the properties of the configurational matrix. By dropping the stipulation about dominance in the definition of c-command, the cluster of configurational properties is extended to vertical relations, i.e. relations in which one of the terms dominates the other. This is exactly what we want, because it is clear that "property sharing" is also vertical in this sense. Thus, the elements of an X'-projection share certain properties, which is indicated by the notation of X'-theory. There is also a long tradition of relations that extend beyond the boundaries of a projection. This is the "long" property sharing usually referred to as percolation. We might ask ourselves now whether "vertical" relations can be studied from the same perspective. The unified treatment of isotopic and non-isotopic property sharing described above suggests such an extension. It is indeed the case that the vertical relations in tree representations are characterized by the properties of the configurational matrix: all nodes (except the root) are obligatorily and locally dependent on a unique, more prominent node. In a way, then, I agree with Chomsky (1982a, 16) that it is better to reduce the rewriting rules of the base to simple things like X'-theory and "move alpha" than the other way around. In the same vein, I believe that the properties of the configurational matrix characterize the foundation of all syntactic dependency relations, both "horizontal" and "vertical". All the rest is filling it in.
NOTES
1. In fact, I will argue in chapter 5 that these structures involve movement of it. See also Bennis (1986).
2. According to some recent proposals, examples like (46) involve two chains (Chomsky (1986b)). For an argument against this idea, see chapters 4 and 6 below.
3. I am following standard assumptions here. For a different account of nominative Case-marking, see chapter 5.
4. For some qualifications, see chapters 1 and 4.
5. This condition will be further discussed in chapter 6.
6. The fact that operator-variable notation was developed so late in the history of logic may contradict Chomsky's view that this notation is somehow natural to the human mind.
7. Q is the scope marker of Katz and Postal (1964), which was mentioned in chapter 1. Ultimately, I believe that Q can be dispensed with. See the discussion of "vertical locality" in chapter 1.
8. This is one important reason why Haïk's scope indexing is not a notational variant of QR (as has been claimed by Hornstein (1985)).
Chapter 3
Anaphoric and Nonanaphoric Control
3.1. Introduction
According to our theory, most grammatical relations have a common core. Functionally, this common core can be characterized as free property sharing. Formally, this property sharing holds for those relations that are characterized by the configurational matrix that was extensively discussed in the preceding two chapters. The central question of the present chapter is whether control (of the PRO subject of infinitives) is also characterized by the configurational matrix. As usual, the most important issue concerns the locality property of the configurational matrix, the Bounding Condition. In chapters 4 and 5, I will continue the line of my earlier work (Koster (1978c)) by demonstrating that both Wh-movement and NP-movement are - in the unmarked case - characterized by the Bounding Condition, which is essentially a one-node version of Subjacency (interpreted as a condition on representations). In chapter 6, it will be claimed that the Bounding Condition is also the unmarked locality principle for the binding theory. Provided, then, that these conclusions are correct, a considerable degree of unification will be reached if it can be shown that control is also characterized by the Bounding Condition. According to standard assumptions (Chomsky (1981b)), control is something rather different from movement or bound anaphora. I fully agree with this standard view as long as we consider the total of properties of control vis-à-vis the full set of properties of "movement" or binding. But as soon as we analyze the relations in question into their components, a rather different picture arises. Control, for instance, is often considered a matter of argument structure (whatever that is). I will not take issue with this view, but I will claim that argument structure is not the whole story.
What little we know about argument structure does not suffice to account for the sharp division we find in control structures between optional and obligatory control in the sense of Williams (1980). Both forms of control involve principles of argument structure, but obligatory control is also characterized by the principles of the configurational matrix. The latter situation only arises, as I will show, in transparent complements with governed PRO. Governed PRO behaves like an anaphor in that it is always strictly locally bound. There are, in other words, two kinds of control, namely anaphoric and nonanaphoric control.
If anaphoric control in the sense of this chapter exists, it is a phenomenon of great theoretical significance. It would be the case that there is a well-defined subclass of control structures that has all the properties of the configurational matrix in common with "movement" and bound anaphora. This would not only be a significant step in the direction of a more unified theory, it would also be a vindication of the Thesis of Radical Autonomy, according to which the core properties of grammar are entirely construction-independent.
3.2. Where binding and control meet
It is quite commonly assumed that infinitival complements are sentential in that they have a subject. In accordance with Koster and May (1982) I will assume that the embedded subject is an empty category, so that a sentence like John wants to go has the following structure:
(1) Johnj wants [ej to go]
The empty element ej is usually referred to as PRO. A crucial question is how this element is related to the binding theory. Are its properties totally independent from the configurational matrix, and is it in all cases governed by a separate theory, the theory of control? Or are there certain overlaps and interactions? One of my central claims in Koster (1978c) was that the following two sentences have the same configurational properties:
(2) a. John seems [e to go]
    b. John tries [e to go]
Furthermore, I claimed there that the usual distinction between the empty elements (trace in (2a) and PRO in (2b)) is not based on the alleged fact that trace and PRO are two different primitives. I considered the θ-status of the matrix subject with respect to the two different verbs to be totally independent of the configurational properties of the relation between John and e. In a sense, then, a distinction was made in Koster (1978c) between trace and PRO, or at least a clear distinction was made between the antecedent-trace relation and the antecedent-PRO relation (see pp. 32-34). Any listing of differences between these two relations (Chomsky (1982b, 87)) is therefore an inadequate response to my original claim, as is clear from the references just given. What I really had in mind did not concern the relations but the primitive status of trace and PRO. These categories had been claimed to differ intrinsically (in feature content), whereas they had the same primitive status in Koster (1978c). More recently, Chomsky (1981b, 1982a) has also
abandoned the position that there are intrinsic differences among empty categories. Thus, until recently at least, there was a growing consensus that the properties of empty categories are contextually determined (but see Chomsky (1986a)). On the other hand, in Koster (1978c) I tended to underestimate the independent status of the theory of control. I now agree with Chomsky (1981a,b) that there is such a theory, independent from the binding theory. Here too, then, the different positions have become less sharply distinguishable. In spite of these convergences, I would like to claim that the standard analysis of the difference between (2a) and (2b) is not quite correct. I wish to maintain that (2b) is characterized not only by the independent theory of control, but also by the full binding theory. In other words, the binding theory and the theory of control overlap in certain cases (though not in others). The arguments for an independent theory of control are familiar by now. In terms of the previous discussion we can say that all four properties of the configurational matrix (which are also the properties of anaphor binding) can be violated. Thus, there are control constructions without obligatory antecedents (3a), with split antecedents (3b), with non-c-commanding antecedents (3c), and with nonlocal ("long distance") antecedents (3d):
(3) a. It is impossible [e to help Bill]
    b. John proposed to Mary [e to go to the movies]
    c. It is difficult for Mary [e to help Bill]
    d. John thinks [S it is impossible [S e to shave himself]]
It is true that we never find such deviant properties for the antecedent-trace relation, but from this fact it cannot be concluded that the antecedent-PRO relation is generally lacking the properties of the configurational matrix (which always characterize trace binding). The crucial point is that examples like (3a-d) form a precisely definable subclass of the antecedent-PRO relation. Williams (1980) has made the important observation that there are two distinct clusters of properties associated with PRO. Obligatory control (the control in complements of verbs that do not select the complementizer for or a gerund) has roughly the properties of the configurational matrix, whereas the deviant properties exemplified in (3) occur only in complements that do select the complementizer for or a gerund. Thus, all the examples (3a-d), and in general all examples usually given to demonstrate the trace-PRO distinction, involve optional PRO:1
(4) a. It is impossible [for Mary to help Bill]
    b. John proposed to Mary [for Bill to go to the movies]
    c. It is difficult for Mary [for John to help Bill]
    d. John thinks it is impossible [for him to shave himself]
In other words, I have never seen an argument for the deviant properties of control on the basis of obligatory PROs in the complements of verbs like try, begin, condescend, etc. Try, for instance, has an obligatory PRO in its complement and never selects a for-complementizer:
(5) *John tried very hard [for Bill to go]
The relevant fact, now, is that so-called raising predicates (seem, be likely, etc.) are also forms that never select a for-complementizer or a gerund. Traces in embedded subject position (of infinitivals) are, so to speak, just as obligatory as obligatory PROs:
(6) a. *It seems [for Bill to go]
    b. *It is likely [for Mary to help Bill]
These are the matrix predicates that select reduced S's (referred to as S'-deletion in Chomsky (1981b)). Such reduced clauses are transparent for government from the matrix verb, which is also necessary for exceptional Case-marking in the complements of believe-type verbs. Note once again that believe-type verbs do not select for (see also Chomsky and Lasnik (1977)):
(7) *John believes [for Mary to go]
I believe that the phenomenon referred to as S'-deletion is in fact the absence of COMP:2
(8) a. Full clauses: [S' COMP S]
    b. Reduced clauses: [S' S]
Small clauses (in the sense of Chomsky (1981b)) can also be subsumed under (8b): if these structures are clauses at all, they are clauses without a complementizer. It is generally agreed that reduced clauses are transparent for government from the matrix verb. This can now be seen as an automatic consequence of the absence of COMP. Traces in the complements of raising verbs can be governed this way, as required. Let us assume now that government is the crucial factor that determines that empty categories are bound in accordance with the four properties of the configurational matrix. Are there reasons to exclude PRO from this pattern? The opposite appears to be true. If we assume that PRO can be the subject of reduced infinitival clauses, we can explain why these PROs are exactly the PROs that are bound in accordance with the four properties of
the configurational matrix (just like traces). Consider, for example, the following structure:
(9) John tries [S' [S e to go]]
If we assume that try selects a reduced clause, an automatic consequence of the fact that try does not select for, we must conclude that tries governs the embedded subject e (as in the case of seem). This is perhaps forbidden by the standard government and binding theory, but that is a disadvantage of that theory, because governed PRO makes the right predictions in cases like (9): absence of COMP always triggers the pattern of the configurational matrix, not only in an antecedent-trace relation but also in an antecedent-PRO relation. In other words, whether the pattern of the configurational matrix is triggered or not does not depend on the intrinsic content of the relation (antecedent-trace or antecedent-PRO) but on construction-independent configurational factors. Once again, it appears that the pattern of the configurational matrix is a radically autonomous pattern.
3.3. Some minimal properties of control
One objection against the theory of governed PRO is the loss of an ECP explanation for the following case:
(10) *John was tried [e to go]
According to the ECP explanation, this sentence is ruled out because the trace of John (the embedded e) cannot be governed by try, since this is not a verb that triggers S'-deletion. If we assume that try does select a reduced complement, this explanation is lost. This is, however, the point where the independent theory of control comes into play. Since it is this theory that explains (10), let us take a closer look at it. Although some minimal assumptions about control are in order, it is not my purpose here to give a theory of control (see Manzini (1983a)). My main concern is the overlapping properties of anaphor binding and a subclass of control constructions. I will therefore limit myself here to control of infinitival complements to verbs.3 The least that can be said about the theory of control in the sense intended here is that it involves argument structure (see Chomsky (1981b, 77)). Infinitival subjects (of verb complements) are usually controlled by one or more arguments of the matrix predicate. Since these arguments are minimally contained by the next higher clause, control appears to be a rather local process. That there are apparent cases of "long distance" control involves the fact that arguments can remain implicit. Therefore, let
me first explain what I understand by implicit arguments. As is well known, the by-phrase of a passive construction is optional. Thus, we find both John was hit and John was hit by someone. Semantically speaking, the agent is still presupposed in the first case. This tacitly present agent remains part of the argument structure, and it is this kind of hidden argument that I will refer to as an implicit argument. Another example is John gave his money, where the indirect object is implicit. I would like to claim now that certain processes - in particular, processes that crucially depend on argument structure - do not always distinguish between explicit and implicit arguments. Control is an example of such a process. Thus, consider a verb like suggest. A person who suggests something has an addressee in mind:
(11) My teacher suggested to me to take another topic
In this case, I am the one who receives suggestions. In an appropriate context, the same content can be expressed by leaving the receiver implicit:
(12) My teacher suggested - to take another topic
The point here is that the (implicit) receiver remains the controller. A further claim is that, apart from some marginal exceptions, so-called long distance control involves an implicit controller in the immediately adjacent matrix clause of the infinitival complement:4
(13) It is difficult to take another topic
The understood subject of to take another topic is the same person (or set of persons) for whom it is difficult to take another topic. Difficult is a subjective modal: there is always someone for whom something is difficult. If this hidden argument is explicitly expressed, it must be the controller. Thus, in the following sentence Bill is the controller, and not Mary:
(14) Mary said it was difficult for Bill to take another topic
Thus, long distance control in such cases is possible only if the for-phrase is not made explicit, and only if it can implicitly be interpreted as the long distance controller:
(15) Mary said it was difficult to take another topic
In this case, Mary can be interpreted as the controller because Mary can be interpreted as the one for whom the particular action is difficult. The examples just given reveal another aspect of argument-structure binding - namely, the fact that the argument can be contained in a
characteristic PP, in this case a for-phrase. This is why c-command can be violated: the relevant argument qualifies as a controller, no matter how it is structurally expressed. In general, this means that the controlling arguments can be left implicit, or can be couched in a characteristic PP. It is, of course, also possible that the controlling argument c-commands the embedded subject. As we have seen before, this third option is obligatory for those cases of control that are also subject to the binding theory, since the binding theory requires an explicit c-commanding antecedent (cases like try). What all three manifestations of the controlling argument have in common is that they belong to the lexical structure of the predicate in the adjacent matrix clause. This leads to the first (locality) property of an important class of control cases:
(16) The controller is an argument of the minimal argument structure containing the control complement.
A second restriction is that the controller is a designated argument (perhaps predictable on the basis of a more inclusive theory of control). Thus, for promise only the subject qualifies as a controller (under certain circumstances), whereas for persuade the object must be chosen. The common assumption here is that this information must be stipulated in the lexical structure of these verbs. There are also verbs for which two arguments are possible (see Chomsky (1968, 48) for such examples):
(17) a. John asked Bill to go
     b. John asked Bill to be permitted to go
I suppose that in such cases both the matrix subject and the indirect object are lexically designated controllers, and that further choices are either pragmatically induced or determined by a future, more inclusive theory of control (which, again, is not our primary concern here). In any case, we can modify (16) as follows:
(18) The controller for an embedded subject (PRO) is a designated argument of the minimal argument structure containing the control complement.
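As a rough illustration of (18), and not as part of the theory itself, the lexical designation of controllers can be sketched in Python. The dictionary DESIGNATED, the class Argument, and the function possible_controllers are hypothetical names introduced only for this sketch; the entries merely restate what is said above about promise, persuade, and ask.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lexical entries (an assumption for illustration): the
# argument roles that each matrix verb designates as controller of PRO.
DESIGNATED = {
    "promise": ("subject",),
    "persuade": ("object",),
    "ask": ("subject", "object"),
}


@dataclass(frozen=True)
class Argument:
    role: str                   # e.g. "subject" or "object"
    form: Optional[str] = None  # None models an implicit argument


def possible_controllers(verb, arguments):
    """Return the arguments of the matrix predicate admitted by (18).

    Only arguments of the minimal argument structure are passed in, so
    locality is built into the call; the lexical entry then selects the
    designated ones. Whether an argument is explicit or implicit plays
    no role at this point.
    """
    roles = DESIGNATED[verb]
    return [arg for arg in arguments if arg.role in roles]


john = Argument("subject", "John")
bill = Argument("object", "Bill")

print([a.form for a in possible_controllers("promise", [john, bill])])
print([a.form for a in possible_controllers("persuade", [john, bill])])
print([a.form for a in possible_controllers("ask", [john, bill])])
```

For ask the sketch returns both arguments and deliberately leaves the further choice open, in line with the remark that it is pragmatically induced or left to a more inclusive theory of control.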
These minimal assumptions about control primarily concern argument structures determined by a V (or a copula with an A). Quite similar observations can be made about NPs. Recall, for instance, the interesting examples given by Postal (1969):5
(19) a. America's attempt to attack Cuba at night
     b. the American attempt to attack Cuba at night
In both cases, the controller is the (understood) subject of attempt, i.e. America. Again, we see that in argument-structure processes c-command is not required.6 It does not matter how the relevant argument is structurally expressed. It can be inferred from an adjective, as in (19b); it can also be couched in a characteristic by-phrase, as in (20a), or be left implicit, as in (20b):
(20) a. the attempt by America to attack Cuba at night
     b. the attempt to attack Cuba at night
Other familiar examples are:
(21) a. We found plans to kill the Ayatollah
     b. We have plans to kill the Ayatollah
In both cases, the controller is an implicit argument of plans (someone's plans, our plans), the nature of which is again determined pragmatically. With this minimal, rather conventional theory of control in mind, we can return to our original problem, the ungrammaticality of (22) (= (10)):
(22) *John was tried [e to go]
The explanation appears to be quite simple. Try is not only a verb that selects reduced clauses (which makes the embedded PRO an anaphor subject to the binding theory); it is also, unlike seem, a verb of control. Since it is a control verb, it must have a designated argument that serves as the controller. For try, this designated element is the underlying subject, which is not explicitly expressed in (22). Why, then, can this argument not be left implicit or be expressed by an agentive by-phrase? It is here that the independent binding theory comes into play. Since the infinitival complement of try in (22) is a reduced clause, its subject e (PRO) is governed and must therefore be bound in its minimal governing category. This means that there must be an obligatory c-commanding antecedent. John in (22) is the only NP that fulfills these conditions. But John is not the controller according to the independent theory of control, so the sentence is ruled out. In other words, (22) is ungrammatical because the combined requirements of the theory of control and the binding theory cannot be met.7 The explanation of the ungrammaticality of (22) is analogous to the explanation of the ungrammaticality of (23):
(23) *Bill was promised [e to go]
This sentence is ungrammatical because the designated controller, the underlying subject of promise, is absent. Again, the controller cannot be left implicit since promise is also a verb that selects reduced clauses (it does not select a for-complementizer). As before, this leads to government of the
PRO-subject of the complement. Governed PRO is an anaphor subject to principle A of the binding theory. In other words, e must be bound by Bill in (23) (the only possible binder according to the binding theory). Again, this NP is not the underlying subject, which is the designated controller of promise. As in the case of (22), the requirements of the binding theory and the theory of control are not compatible, hence the ungrammaticality.8 In conclusion, it appears that the ECP is by no means needed to explain the ungrammaticality of (22). It follows just as well from the independently needed factors that rule out (23). In general, the type of account just given explains what Bresnan (1982, 402) calls "Visser's generalization": " ... the observation that verbs whose complements are predicated of their subjects do not passivize". Bresnan gives the following examples (her (84) and (86)):
(24) a. He strikes his friends as pompous
     b. The boys made Aunt Mary good little housekeepers
     c. Max failed her as a husband
     d. The vision struck him as a beautiful revelation
     e. Mary promised Frank to leave
(25) a. *His friends are struck (by him) as pompous
     b. *Aunt Mary was made good little housekeepers (by the boys)
     c. *She was failed (by Max) as a husband
     d. *He was struck (by the vision) as a beautiful revelation
     e. *Frank was promised to leave (by Mary)
Examples (24e) and (25e) correspond to (23). The other examples involve small clauses as complements (in the sense of Chomsky (1981b)). Small clauses lack a complementizer and are therefore transparent for government by the matrix verb. This entails that all PRO-subjects of the small clauses are governed anaphors that must be bound by a c-commanding NP. The designated controllers (the underlying subjects) are also c-commanding antecedents for binding in (24), but not in (25). In the latter set of examples, only the derived subjects are possible binders, which are not the designated controllers. In short, our theory of governed PRO in reduced clauses (without a complementizer) explains Visser's generalization where it operates in combination with lexical stipulations concerning the designated controller. These lexical stipulations (which can perhaps in part be reduced to more general principles, like Rosenbaum's Minimal Distance Principle),9 are of the same nature for cases of obligatory control (in the sense of Williams (1980)) and optional control. The difference is that in the case of obligatory control these lexical properties interact with the binding theory. It is this interaction that gives our account explanatory force, since it relates Visser's generalization to a general pattern - namely, the pattern of four properties of the configurational matrix that also characterizes many other core grammar dependencies that have nothing to do with control.
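The interaction just described can be summarized in a small Python sketch. It is a simplification under stated assumptions, not a formalization of the binding theory: the class Controller and the functions pro_is_governed and control_is_licensed are hypothetical names, and the designated controller is reduced to two properties, whether it is overtly expressed and whether it c-commands the embedded subject.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controller:
    """The designated controller as it is realized in a given sentence."""
    explicit: bool    # overtly expressed (not left implicit)?
    c_commands: bool  # c-commands the embedded subject?


def pro_is_governed(complement_has_comp: bool) -> bool:
    # Assumption taken from the text: a complement without COMP is
    # transparent, so the matrix verb governs its PRO subject.
    return not complement_has_comp


def control_is_licensed(controller: Controller, complement_has_comp: bool) -> bool:
    if not pro_is_governed(complement_has_comp):
        # Only the theory of control applies: the designated controller
        # may be implicit or couched in a PP.
        return True
    # Governed PRO is an anaphor: the designated controller must also be
    # an explicit, c-commanding antecedent.
    return controller.explicit and controller.c_commands


# (24e) Mary promised Frank to leave: the subject controls and binds.
print(control_is_licensed(Controller(True, True), complement_has_comp=False))
# (25e) *Frank was promised to leave (by Mary): controller in a by-phrase.
print(control_is_licensed(Controller(True, False), complement_has_comp=False))
# (22) *John was tried [e to go]: controller left implicit.
print(control_is_licensed(Controller(False, False), complement_has_comp=False))
# (13) It is difficult to take another topic: implicit controller, full clause.
print(control_is_licensed(Controller(False, False), complement_has_comp=True))
```

The sketch keeps the two theories apart on purpose: the lexical choice of the controller is taken as given, and the binding requirement is added only when the complement lacks COMP.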
It is not clear whether the lexical-functional variant of generative grammar explains Visser's generalization. According to this alternative, control from the by-phrases in (25) is excluded by the stipulation that obligatory controllers must be subjects or objects in a nonoblique form (Bresnan (1982, 376)). Lexical stipulations in this framework may refer only to "semantically unrestricted" functions like subject and object in the case of control, and not to oblique functions like by-phrases. This is said to follow from "severe constraints on the lexical encoding of semantically restricted functions". But these "severe constraints" have no explanatory value, since they do nothing beyond stipulating that by-phrases occur in certain lexical frames but not in others. Bresnan mentions another empirical generalization, Bach's generalization, which also follows from our account: where the object of a verb is an obligatory controller, "intransitivization" is impossible (Bresnan (1982, 418)). Bresnan gives the following examples (her (122) and (123)):
(26) a. Louise taught Tom to smoke
     b. *Louise taught to smoke
(27) a. Louise signaled Tom to follow her
     b. Louise signaled to follow her
In (27b) the object can be omitted, but not in (26b). According to our previous account of such cases, controllers can be omitted (or given in oblique form) only if the complement is a full clause with a complementizer (for). This prediction is confirmed by the following examples from Bresnan (1982) (her (124) and (125)):
(28) a. *Louise taught Tom for him to smoke
     b. Louise signaled Tom for him to follow her
All in all, it seems that our hypothesis of governed PRO has a real explanatory advantage over the standard theory (which does not allow governed PRO) and the lexical-functional approach. In our theory, absence of the complementizer for (in D-structure) makes PRO accessible for the governing verb of the matrix clause. This makes PRO an anaphor according to the standard binding theory (principle A), so that it is predicted that the controller cannot be left implicit (as in (26)) or expressed with a by-phrase (as in (25)). The reason is that the binding theory requires an explicit, c-commanding antecedent. Our hypothesis therefore explains the generalizations made by Visser and Bach. The standard theory, on the other hand, has no obvious explanation for the fact that the impossibility of the for-complementizer is necessarily correlated with an explicit nonoblique antecedent. This salient fact remains entirely accidental. The same can be said about the lexical-functional approach, because, as
we have seen, the impossibility of implicit or oblique controllers in certain cases is entirely a matter of stipulation in this framework. The facts of English, therefore, give strong support to the claim that PRO must be governed under certain circumstances. Even stronger support for this claim appears to come from the complement system of Dutch.
3.4. Infinitival complements in Dutch
There is an intriguing difference between English and Dutch concerning certain facts discussed by Williams (1980). Williams points out that the following type of construction is generally impossible with verbs that do not select a for-complementizer:
(29) *It was tried [e to see Bill]
Such constructions do occur, however, with verbs that select for. Williams makes the strong claim that such constructions are possible only with verbs that select for:10
(30) It was arranged [e to see Bill]
It is easy to see that (the possibility of) such contrasts follows from the assumptions made so far. In (29) the complement is transparent because it is never introduced by for. PRO (e) is therefore governed in (29), which entails that it must be bound as an anaphor in its minimal governing category. The only available antecedent is the matrix subject it, which is not the designated controller for the control verb try. According to control theory, the underlying subject is the only possible controller, which conflicts with the requirements of the binding theory. In (30), however, the verb arrange selects a for-complementizer (which must be deleted if it is not followed by a lexical subject; see Chomsky and Lasnik (1977)). This makes the infinitival complement opaque for government from the matrix verb arrange. Consequently, e (the embedded PRO) does not have to be bound according to the binding principles. The only theory that applies in this case is the theory of control. Contrary to the binding theory, this theory allows an implicit argument as controller; hence the possibility of (30). This type of explanation receives remarkable support from certain facts in Dutch. In this language, the equivalent of try, the verb proberen (which has exactly the same meaning), differs from try only in that it can select an optional complementizer om. Interestingly, this produces a grammatical equivalent of the ungrammatical English example (29):
(31) Er werd geprobeerd [(om) e Bill te bezoeken]
     there was tried COMP Bill to visit
In Dutch, an SOV language, these complements introduced by complementizers occur only to the right of (the underlying position of) the verb. The complement of a verb like proberen can also occur to the left of the verb, in which case the embedded verb is obligatorily adjoined to the matrix verb (Verb Raising in the sense of Evers (1975)). I will return to the Dutch complement system in more detail, but at this point it is only relevant to mention an exceptionless fact: infinitival complements on the left-hand side of the matrix verb (which undergo Verb Raising) never have a complementizer. This renders these complements transparent in many respects, as I will demonstrate. What is crucial at this point is that our hypothesis predicts that PRO in these complements is also accessible for government from the matrix verb, so that this PRO must be bound in accordance with principle A of the binding theory. In other words, it predicts that we will never find examples like (31) for transparent complements. This is indeed the case. If the complement of proberen is to its left, it patterns like English try:
(32) *Er werd [e Bill ti] geprobeerd te bezoekeni
     there was PRO Bill tried to visit
These are representative examples. Extraposed complements with the possibility of the om-complementizer (like (31)) pattern like English examples with for-complements, whereas Verb Raising complements, which never have a complementizer, pattern like English verbs that never have a for-complementizer. What is so striking about the Dutch verb proberen is that we see the two distinct patterns with the same verb. I take this as strong evidence for the thesis of governed PRO.11 The general pattern becomes even more perspicuous if we consider the Dutch complement system in more detail. The Dutch complement system differs from the English system in two respects. First, like English, Dutch has infinitives with and without the morpheme te (English to). Dutch, however, has a much wider use for infinitives without te. Roughly, Dutch has te-less infinitives not only where English has to-less infinitives (as in John saw Bill go), but also where English has gerunds. As I will show, te-less infinitives form a Dutch counterpart of the English gerund, which explains why the distribution of these infinitives deviates from that of other infinitival complements. The most important difference between Dutch and English, however, involves the underlying SOV structure of Dutch. The exact nature of the difference will become clear as we proceed, but for the moment it suffices to consider the major facts. In contrast with English, Dutch has infinitival complements on both sides of the matrix verb. Certain complements occur only in extraposed position, to the right of the matrix verb. I will refer to these complements as extraposed complements. Other complements occur only to the left of the matrix verb, in which case, as we saw in connection with (32), the verb of the complement is obligatorily adjoined to the
matrix verb. This process is called Verb Raising (VR), and I will refer to the complements in question as VR-complements. It is very interesting from the point of view of our present theoretical concerns to see which complements occur only in extraposed position, and which complements occur only as VR-complements, or as both VR-complements and extraposed complements. The entire pattern is rather complex and has been unravelled by Evers (1975). As I will show, several of the classical problems can be solved within the framework of the theory of government and binding. As described by Evers (1975), the properties of the Dutch infinitival system can be summarized as follows:12
(33)
a. Only extraposed complements can be introduced by COMP.
b. Raising (to subject position) occurs only from VR-complements.
c. Control is possible with both extraposed and VR-complements.
d. Infinitives without te occur only as VR-complements.
e. Exceptional Case-marking occurs only with VR-complements.
f. Obligatory control (in the sense of Williams (1980)) is a property of VR-complements.
g. Only VR-complements show certain transparencies (Verb Raising, R-movement, adverbial scope).
There is a beautiful pattern in these complex data, but this only becomes clear if the data themselves are crystal clear. Therefore, let me summarize the data in yet another way:
(34)
[VP [S ... ] V [S' COMP [S ... ]]]
(VR-complements are the S to the left of V; extraposed complements are the S' to the right of V.)

                          VR-complements   Extraposed complements
a. COMP                         -                    +
b. Raising                      +                    -
c. Control                      +                    +
d. Without te                   +                    -
e. Exceptional Case             +                    -
f. Obligatory control           +                    -
g. Transparency                 +                    -
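As an illustration only, the summary in (34) can be written down as a small lookup table. The sketch below is in Python; the names SUMMARY_34, same_pattern, and vr_only are hypothetical and not part of the original analysis, and the minus values are an assumption read off the "only" statements in (33).

```python
# Rows of (34): True stands for "+", False for "-".
# The False entries are inferred from the "only" statements in (33).
SUMMARY_34 = {
    "COMP":               {"VR": False, "extraposed": True},
    "Raising":            {"VR": True,  "extraposed": False},
    "Control":            {"VR": True,  "extraposed": True},
    "Without te":         {"VR": True,  "extraposed": False},
    "Exceptional Case":   {"VR": True,  "extraposed": False},
    "Obligatory control": {"VR": True,  "extraposed": False},
    "Transparency":       {"VR": True,  "extraposed": False},
}


def same_pattern(a: str, b: str) -> bool:
    """True if two properties have the same distribution over complement types."""
    return SUMMARY_34[a] == SUMMARY_34[b]


def vr_only() -> list:
    """Properties found with VR-complements but not with extraposed complements."""
    return [p for p, row in SUMMARY_34.items()
            if row["VR"] and not row["extraposed"]]


if __name__ == "__main__":
    print(same_pattern("Obligatory control", "Raising"))   # compare (34f) and (34b)
    print(same_pattern("Obligatory control", "Control"))   # compare (34f) and (34c)
    print(vr_only())
```

Under these assumptions, obligatory control shares its row pattern with raising and not with control in general, which is the comparison the text draws between (34f), (34b), and (34c).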
Before going on, it is useful to point out a preliminary generalization on
the basis of this summary. The analysis of English complementation in section 2 showed that there are reasons to correlate the absence of a complementizer with transparency phenomena such as government into embedded clauses, raising (and obligatory control), and exceptional Case-marking in believe-type verbs. My theory differs from the standard theory in considering obligatory control to fall in the same natural class as the other transparency phenomena (usually referred to as S'-deletion phenomena). What is so interesting about (34) is that in Dutch, transparent and opaque infinitival complements are "physically" separated by the matrix verb. All transparency phenomena are found in VR-complements, which typically lack COMP (see (34a)). What really strongly confirms our theory of obligatory control is that it patterns with raising and other transparency phenomena (cf. (34f) and (34b)), and not with control in general (cf. (34f) and (34c)). It seems to me that much of (34) can be explained by the standard assumptions of government and binding theory such as the ECP (traces must be properly governed), together with two additional, well-motivated assumptions. The first rather crucial additional assumption is the claim made by several linguists that government of argument positions is directional.13 There must be a parameter that determines that languages are either SVO or SOV. A simple formulation of this parameter is the idea that government (of argument positions) is directional. In SVO languages the verb governs to the right, and in SOV languages it governs to the left. This simple assumption accounts for the fact that (35a) is grammatical in an SOV language like Dutch, whereas (35b) is not:
(35)
a. Ik denk dat hij Mary zag
   I think that he Mary saw
b. *Ik denk dat hij zag Mary
   I think that he saw Mary
In (35a) Mary is governed by the verb zag, because the verb governs to the left. (35b) is ungrammatical because Mary is to the right of the verb, so that it remains ungoverned. In an SVO language like English, the opposite pattern holds. A second additional assumption is that te-less infinitives are θ-marked (and perhaps Case-marked), which accounts for the NP-like distribution of these clausal complements. I will return to this matter in the more general discussion of te-less infinitives. Let us now turn to some illustrations of the facts listed under (34) and the principles that explain them. First, the possibility of COMP. We have seen that a verb like proberen 'try' can select a complementizer om if its infinitival complement is in extraposed position. The same holds for many other control verbs: they select om-complements only if the complements are extraposed but never if
they are to the left and undergo Verb Raising. The following facts illustrate this: (36)
a. Ik denk dat zij probeerde (om) het boek te lezen
   I think that she tried COMP the book to read
   'I think that she tried to read the book'
b. Ik denk dat zij (*om) het boek probeerde te lezen
   I think that she (COMP) the book tried to read
   'I think that she tried to read the book'
(37)
a. Ik denk dat hij weigerde (om) Mary te kussen
   I think that he refused COMP Mary to kiss
   'I think that he refused to kiss Mary'
b. Ik denk dat hij (*om) Mary weigerde te kussen
   I think that he (COMP) Mary refused to kiss
   'I think that he refused to kiss Mary'
It is sometimes claimed that the lack of a complementizer in VR-complements shows that these complements are in fact VPs. But this argument has no force. First of all, such a VP-analysis would entail that one verb (such as proberen, weigeren) has two clause-like complements instead of one: a VP-complement to the left of the matrix verb and an S'-complement to the right. This can hardly be considered an elegant conclusion. More important, even VP-analysts would have to stipulate that complements with complementizers do not occur to the left of the verb. The point is that tensed complements, which always have a complementizer in Dutch, cannot occur to the left of the verb either:
(38)
a. Ik denk dat hij zei dat hij zou komen
   I think that he said that he would come
   'I think that he said that he would come'
b. *Ik denk dat hij dat hij zou komen zei
   I think that he that he would come said
On the basis of these facts, which have nothing to do with infinitives, we can conclude that there is a general ban in Dutch against complementizers to the left of the verb. One would hope to find an explanation for this fact, but whatever rules out (38b) seems sufficient to rule out the ungrammatical variants of (36b) and (37b) as well.14 Moreover, there is nothing inherent to VP-analyses that requires VP-complements to be generated to the left of the matrix verb. One would have to stipulate this fact, which reduces the explanatory advantage of VP-analyses to zero. This becomes even clearer if we consider the next case, raising complements. Dutch has many raising complements with te and without te preceding the infinitive. Both types of complements occur only to the left of the matrix verb, as VR-complements. For te-less complements, this also follows from an independent factor to which I will return shortly. Crucial
cases are therefore raising complements with te, like the complement of schijnen 'seem': (39)
a. Ik denk dat zij het boek schijnt te lezen
   I think that she the book seems to read
   'I think that she seems to read the book'
b. *Ik denk dat zij schijnt het boek te lezen
   I think that she seems the book to read
These facts follow from the standard assumption of the theory of government and binding that traces must be properly governed, in conjunction with the well-motivated assumption that government is directional for arguments (leftward in Dutch). This becomes clear if we consider the underlying structures: (40)
a. Ik denk dat zij [t het boek te lezen] schijnt (= (39a))
b. *Ik denk dat zij schijnt [t het boek te lezen] (= (39b))
(40a) is the structure underlying (39a) (before raising of the verb lezen). The trace t is properly governed (as required by the ECP) since the verb schijnen governs leftward. It is for this reason that (40b) (underlying (39b)) is ungrammatical: the trace is to the right of schijnen, where it cannot be governed. It is here that raising crucially differs from control. As we saw in (36) and (37), both proberen and weigeren select a te-complement that can occur on both sides of the matrix verb. This is allowed by the theory, as the structures underlying (36) demonstrate: (41)
a. Ik denk dat zij probeerde [(om) e het boek te lezen]
   I think that she tried COMP the book to read
b. Ik denk dat zij [e het boek te lezen] probeerde
   I think that she the book to read tried
The contrast between (41) (two grammatical sentences) and (39) (one grammatical sentence) provides crucial evidence in favor of the theory of government and binding and the assumptions made here. In (39) (see (40)), the antecedent zij and its trace t form a chain, and empty categories in chains must be governed. (41) involves control, which means that the antecedent zij and the embedded PRO (e) are in two distinct chains. This follows from the θ-criterion.15 Unlike traces, PROs are not necessarily governed. This is why (41a) is accepted: e is to the right of proberen, so that it is not governed. This has no consequences here, contrary to what we see with schijnen ((39b), (40b)). I have demonstrated above that PRO can be governed in transparent complements (like (41b)), but nowhere
have I claimed that PRO must be governed. Only traces must be governed. It should be noted here that, as in Koster (1978c) and Chomsky (1981b, ch. 6), I am not making a distinction between trace and PRO as primitives. There is only one type of empty category, its status being determined by the relations into which it enters. In a chain, with an antecedent belonging to the same chain, an empty category must be governed; this requirement does not exist when an empty category has an antecedent in a different chain. In any case, no other theory that I know of explains the ungrammaticality of (39b) and the grammaticality of (41a). As in the lexical-functional framework, the theory advocated here makes it possible to group obligatory control together with raising (cf. the notion "functional control" in Bresnan (1982)). The overlap in binding properties follows, according to the present theory, from the possibility of governed PRO under certain circumstances. Thus, both the lexical-functional approach and the present approach differ from the standard approach in that they classify obligatory control with raising. In contrast with the lexical-functional approach, however, the present approach is like the standard approach in that it assumes empty categories, (directional) government, and the ECP. This has led to an explanation for the fact that raising sometimes differs from control (cf. (39b)). On the other hand, it is not clear how the ungrammaticality of (39b) follows from anything in the lexical-functional approach. One can of course stipulate that raising complements are of a specific type. But nothing in a lexically oriented theory forbids these complements of a specific type from occurring to the right of the matrix verb. Again, requiring that schijnen 'seem' select VPs that occur only to the left of the matrix verb would be mere stipulation.
The government and binding approach, on the other hand, explains the fact that raising complements in Dutch never occur in extraposed position. In this approach, this crucial fact follows from the ECP and the independently motivated assumption of directional government (see the references of note 13). If my assumptions are correct, then, the present approach has the advantage over both the standard and the lexical-functional approaches that it explains the properties of obligatory control (the properties of the configurational matrix). Besides, it has the extra advantage over the lexical-functional approach that it explains the contrast between raising and control with respect to the distribution of infinitival complements in Dutch. Dutch has an abundance of infinitival complements without te 'to'. This class of complements deserves special attention, because it involves some distributional peculiarities that confirm the idea of directional government. Infinitives without te come in three varieties: raising verbs (42), control verbs (43), and verbs with "exceptional Case-marking" (44):
(42)
a. Ik denk dat Peter zal vertrekken
   I think that Peter will disappear
b. Ik denk dat Mary moet blijven
   I think that Mary must stay
(43)
a. Ik denk dat Peter boeken leerde lezen
   I think that Peter books learned read
   'I think that Peter learned to read books'
b. Ik denk dat Mary een auto wilde kopen
   I think that Mary a car wanted buy
   'I think that Mary wanted to buy a car'
(44)
a. Ik denk dat Mary hem hoorde zingen
   I think that Mary him heard sing
   'I think that Mary heard him sing'
b. Ik denk dat Peter haar liet komen
   I think that Peter her let come
   'I think that Peter let her come'
The first class (42) includes the Dutch auxiliaries. There is no reason, however, to assume that there is a special category Aux in Dutch. All so-called auxiliary verbs are rather regular verbs and lack the defective paradigms and deviant distribution of the English auxiliaries. What the auxiliaries have in common is that they do not assign a θ-role to the subject. In other words, they are raising verbs (like seem). This means that a sentence like (42b) is derived as follows:16
(45)
a. Ik denk dat [s NP [s Mary blijven] moet]   (NP Raising)
b. Ik denk dat [s Mary [s t blijven] moet]   (Verb Raising)
c. Ik denk dat [s Mary [s t ] moet blijven]
All Dutch auxiliaries can be treated as normal verbs (Vs) that select a clausal complement. In this way, we can maintain the natural generalization that there is a one-to-one correspondence between subjects and verbs. Besides, it is by far the simplest solution because both NP Raising and Verb Raising are needed anyway. A solution that postulates a special category Aux for these verbs, or VP-complements, adds something to the grammar that is entirely superfluous. A control case like (43a) has the following underlying structure: (46)
Ik denk dat [s Peter_i [s e_i boeken lezen] leerde]
After Verb Raising we derive the structure underlying (43a): (47)
Ik denk dat [s Peter_i [s e_i boeken - ] leerde lezen]
In this case, the subject of the embedded clause (e_i) is a PRO controlled by Peter. A sentence like (44a) is derived from the following underlying structure: (48)
Ik denk dat [s Mary [s hem zingen] hoorde]
After Verb Raising its structure is as follows: (49)
Ik denk dat [s Mary [s hem - ] hoorde zingen]
I will return to these cases of "exceptional Case-marking" later. What all te-less complements in (42)-(44) have in common is that they occur only as VR-complements. They never occur in extraposed position.17
(50)
*Ik denk dat Peter zal [s t het boek lezen]
   I think that Peter will the book read
The only grammatical variant is (51): (51)
Ik denk dat Peter [s t het boek - ] zal lezen
   I think that Peter the book will read
   'I think that Peter will read the book'
The cases of control (43) and exceptional Case-marking (44) also lack a variant with an extraposed complement (Evers (1975)): (52)
a. *Ik denk dat [s Peter_i leerde [s e_i boeken lezen]]
b. *Ik denk dat [s Mary hoorde [s hem zingen]]
The raising case (50) is ungrammatical because of directional government. As with schijnen (cf. (39b)), the trace in the complement is not properly governed, because the matrix verb zal does not govern to the right. The same principle explains the ungrammaticality of (52b). Here we need exceptional Case-marking from the verb hoorde. But exceptional Case-marking is like regular Case-marking in that it only works under government. Since the verb hoorde in (52b) does not govern to the right, hem is not governed by this verb, so that it cannot receive Case from it. The only new problem is (52a), the case of control. The ungrammaticality of this example does not follow from directional government in the same way, because here the embedded subject e_i is a PRO that need not be governed. Why, then, is (52a) ungrammatical? The answer again involves directional government, this time not with respect to the embedded subject, but with respect to the containing clause. Complements that have infinitives with te are not necessarily governed by a verb to their right, so that these complements occur on both sides of the matrix verb (cf. (36)-(37)). Te-less infinitives, however, differ in interesting ways. They behave like English gerunds in that they have NP-like
distribution. If te-less infinitives have the same distribution as NPs, the ungrammaticality of (52a,b) is what we expect: like NPs, the te-less complements do not occur to the right of the verb. This is not unlike English gerunds, which behave like NPs (Emonds (1972)) and do not undergo extraposition (Rosenbaum (1967, 45)). There is good evidence that te-less infinitives behave like NPs in other contexts. They are the only type of complement that can occur in subject position:
(53)
Ik denk dat [boeken lezen] noodzakelijk is
   I think that books read necessary is
   'I think that reading books is necessary'
Note that the te-less infinitive in the Dutch example is naturally translated into English by using a gerund. Other types of complements (tensed clauses and infinitives with te) are impossible in subject position:
(54)
a. *Ik denk dat [dat hij komt] noodzakelijk is
   I think that that he comes necessary is
b. *Ik denk dat [boeken te lezen] noodzakelijk is
   I think that books to read necessary is
Not only in embedded clauses, but also in root sentences in which the finite verb precedes the subject (such as questions; cf. English subject-aux inversion), te-less infinitives are the only possible type of complement:
(55)
a. Is [boeken lezen] noodzakelijk?
   is books read necessary
   'Is reading books necessary?'
b. *Is [dat hij komt] noodzakelijk?
   is that he comes necessary
c. *Is [boeken te lezen] noodzakelijk?
   is books to read necessary
In an earlier paper (Koster (1978a)), I concluded on the basis of data like (54) and (55b,c) that "subject sentences don't exist". Full clauses appear only in peripheral positions: in extraposed positions and in topicalized positions, but never in typical NP-positions such as subject positions. Emonds (1972), on which Koster (1978a) was based, noted that gerunds in English are different from full clauses in that they do occur in typical NP-positions such as subject positions. From (53) and (55a) it is clear that te-less infinitives in Dutch have a status similar to that of gerunds in English. The NP-like status of te-less infinitives is confirmed by several other facts. One important example will be given in the next section: nouns (Ns) have neither NP-complements nor te-less infinitival complements. Another relevant fact is that te-less infinitives (with the exception of the
exceptional Case-marking cases, to which I will return) are the only type of VR-complement that easily undergoes topicalization: (56)
a. [Boeken lezen] wil hij niet
   books read want he not
   'Reading books, he does not want'
b. [Piano spelen] leert zij nooit
   piano play learns she never
   'Playing the piano, she will never learn'
In this respect, te-less infinitives behave like NPs and not like some other clause types, which are often difficult to topicalize: (57)
a. *[Boeken te lezen] probeerde hij nooit
   books to read tried he never
b. *[De hond te vangen] heeft hij geweigerd
   the dog to catch has he refused
In short, there is plenty of evidence that te-less infinitives have an NP-like distribution, just like English gerunds. The question, then, is how to account for this fact. On the basis of their NP-like distribution, Emonds (1972) concluded that gerunds are in fact NPs. This would account for their distribution. However, the clause-like gerunds (and the equally clause-like te-less infinitives) are not NP-like in any other respects. They do not have a noun as their head, and they are in fact like regular clauses in most respects. In principle, the theory of government and binding allows another solution. As mentioned earlier in connection with directional government, NPs owe their distribution to their governors. The nature of the governor determines where arguments can be governed and where they cannot. Furthermore, it is generally assumed that θ-marking and Case-marking depend on government. The latter two processes are concomitant features of the assignment of argument status to a category. Suppose now that, in general, clauses do not occur in θ-positions, but that there is an exception to this rule, namely, gerunds (and te-less infinitives). These two clause types would then have argument status, without being NPs. Their similarity to NPs in distributional character would follow from something that they share with NPs, i.e. θ-role assignment and Case-marking. Since these processes only occur under government, we can explain why the clauses in question occur in NP-positions without being NPs. Technically, we can consider Case assignment to be a relation between a Case assigner and a head. Thus, normally a Case assigner affects the head N of the NP to which Case is assigned. Similarly, we can assume that in the case of te-less infinitives, the matrix verb affects the head V of the infinitival complement. Schematically, then, we would have the following situation:
(58)
... [s ... V ... ] ... V ...
[Diagram: an arrow labeled Case links the matrix V to the V inside the bracketed S]
If a V receives Case in this way, it must assume a particular morphological shape. In English, Case-marked Vs are realized as gerunds, in Dutch as te-less infinitives. Considering Case assignment as a relation between a head and a Case assigner that is external to its maximal projection has the advantage that the fact that Ns do not assign Case can be related to a general property of Ns, as I will show in the next section. In conclusion, it seems to me that by incorporating the notion of directional government, the theory of government and binding explains two classical problems of Dutch syntax. First, it explains the fact that raising complements with te occur only to the left of the matrix verb (as VR-complements), whereas control complements with te occur on both sides of the matrix verb. Second, it explains the fact that te-less infinitives (involving raising or control) do not occur at all in extraposed position. Let us now turn to exceptional Case-marking in Dutch. This phenomenon is found only with verbs of perception (horen 'hear', zien 'see', etc.) and causatives like laten 'let' (see Evers (1975, 4)):
(59)
a. Ik denk dat ik [haar het boek - ] zag lezen
   I think that I her the book saw read
   'I think that I saw her read the book'
b. Ik denk dat ik [Peter de auto - ] laat wassen
   I think that I Peter the car let wash
   'I think that I'll let Peter wash the car'
There are reasons to assume that these are cases of exceptional Case-marking, because there is no normal thematic relation between the matrix verb and the objective forms (haar in (59a) and Peter in (59b)) that depend on it. This was shown by De Geest (1972, 170), who gave examples like (60):
(60)
Ik zag geloof overal ontbreken
   I saw faith everywhere lack
   'I saw faith lacking everywhere'
A sentence like (60) does not entail Ik zag geloof 'I saw faith'. Therefore, this might be a form of exceptional Case-marking. On the other hand, there is a well-known heresy, raising to object position, which cannot be excluded out of hand. It is true that most of Postal's (1974) arguments are unconvincing (see Bresnan (1976)), but I am not entirely convinced by the arguments for exceptional Case-marking either. One persistent problem is that complements of verbs like believe (in their exceptional Case-marking analysis) fail to pass constituency tests. Bresnan (1982), for instance, mentions Postal's Right Node Raising argument:
(61)
*Mary believes, but Catherine doesn't believe, Peter to be fat
The part that has undergone Right Node Raising (Peter to be fat) fails the test for constituency. This argument is not decisive for two reasons. First of all, it is not clear whether Right Node Raising is a valid test for constituency. In English, and definitely in Dutch, there are cases of Right Node Raising of nonconstituents.18 But even if it were a valid test, it would not be decisive since the tests in question do give sufficient, but not necessary, conditions for constituency. In spite of these objections, it must be said that a sentence like (61) constitutes a problem for a theory that incorporates the idea of exceptional Case-marking. A better case can be made on the basis of examples like (59) in Dutch. Dutch is a so-called verb-second language, which means that in root sentences there is always one constituent preceding the finite verb.19 This provides us with an excellent test for constituency: only constituents can precede the finite verb. We have already seen that te-less infinitives can be preposed in Dutch (cf. (56)). Alleged exceptional Case-marking in Dutch always involves te-less infinitives (cf. (59)). It is a remarkable fact, then, that preposing the complements in question yields highly ungrammatical sentences:
(62)
a. *[Haar boeken lezen] zag ik zelden
   her books read saw I rarely
b. *[Peter de auto wassen] liet ik nooit
   Peter the car wash let I never
These facts strongly suggest that the preposed complements (between brackets) do not form constituents (unless one can find an independent explanation for the ungrammaticality). The alternative seems to give an advantage here because the objective forms haar and Peter can be left behind (in fact must be left behind):
(63)
a. [Boeken lezen] zag ik haar zelden
   books read saw I her rarely
b. [De auto wassen] liet ik Peter nooit
   the car wash let I Peter never
These facts provide compelling evidence for the constituency of the preposed complements.20 Presumably, these complements are clauses, just like in the other cases of preposed te-less infinitival complements (cf. (56)). Under the alternative analysis, it could be maintained that the preposed constituents in (63) are VPs, the subjects of their clauses being left behind. This would be a highly problematic conclusion, however, since there is no independent evidence in Dutch for VP-preposing.21 In short, examples like (62) and (63) give some evidence for the following underlying structure for an example like (59a):
(64)
[Tree diagram: S dominating NP and VP; legible labels within VP include t, NP het boek 'the book', and V lezen 'read']
All in all, it seems to me that the debate "exceptional Case-marking versus raising to object position" is still open. Fortunately, the outcome of this debate is irrelevant for the issues that concern us here. What really matters in this context is that both analyses require the embedded clause to be transparent. Chomsky (1981b) assumes S'-deletion both for exceptional Case-marking and for raising. In both cases the embedded subject position in (64) must be governed by the matrix verb. Thus far, we have used the notion of directional government and the essential transparency of VR-complements for the following facts:
(65)
a. Passivization of VR-control verbs is impossible (32).
b. Raising complements are always VR-complements.
c. Te-less infinitives cannot be extraposed.
d. Exceptional Case-marking (or raising to object position) is possible in VR-complements.
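The reasoning behind (65b-d) can be made concrete in a small sketch. The Python below is an illustrative simplification and not part of the original analysis: the class Complement and the functions governs and is_licensed are invented names, and the three checks stand in for the ECP, Case assignment under government, and the assumed government requirement on te-less infinitives. Obligatory control and the passivization fact (65a) are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Complement:
    subject: str   # "trace" (raising), "PRO" (control), or "lexical" (needs Case)
    has_te: bool   # False for te-less infinitives, assumed to be NP-like
    side: str      # "left" (VR-complement) or "right" (extraposed)


def governs(verb_direction: str, side: str) -> bool:
    """Directional government: a verb governs only in its own direction."""
    return verb_direction == side


def is_licensed(c: Complement, verb_direction: str = "left") -> bool:
    """Check a complement against the simplified conditions (Dutch: leftward)."""
    governed = governs(verb_direction, c.side)
    if c.subject == "trace" and not governed:
        return False   # ECP: a trace must be properly governed
    if c.subject == "lexical" and not governed:
        return False   # exceptional Case-marking works only under government
    if not c.has_te and not governed:
        return False   # te-less infinitive: the clause itself must be governed
    return True        # PRO subjects of te-infinitives need no government


if __name__ == "__main__":
    for side in ("left", "right"):
        print("schijnen + te:", side, is_licensed(Complement("trace", True, side)))
        print("proberen + te:", side, is_licensed(Complement("PRO", True, side)))
        print("leren, no te: ", side, is_licensed(Complement("PRO", False, side)))
        print("horen, no te: ", side, is_licensed(Complement("lexical", False, side)))
```

With leftward government, only the control complement with te comes out as licensed on both sides of the matrix verb, matching the contrast between (41) and (39), (50), and (52).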
In the remainder of this section, I will show that there are several other transparency phenomena that distinguish VR-complements from extraposed complements. Crucially, I will show that several of these transparency phenomena can be observed in control complements as long as they are to the left of the matrix verb. Often the same control verbs can also have their complements in extraposed position, in which case no transparency can be found.
First, Verb Raising itself requires transparency. All complements to the left of the matrix verb undergo Verb Raising. It is not known why this is necessary, but it is no doubt possible thanks to the fact that the clause that loses its verb is transparent because it lacks a complementizer. Extraposed complements never undergo Verb Raising. 22 Second, there are certain transparency phenomena concerning reflexivization in Dutch that can only be observed in VR-complements (Koster (1985) and chapter 6 below). Consider the following contrast: (66)
a. Ik denk dat Peter [s Mary naar zich toe zag komen]
   I think that Peter Mary to himself prt. saw come
   'I think that Peter saw Mary come toward himself'
b. *Ik denk dat Peter Mary dwong [s' om naar zich toe te komen]
   I think that Peter Mary forced COMP to himself prt. to come
   'I think that Peter forced Mary to come to himself'
(66a) involves a VR-complement; (66b) has an extraposed complement, introduced by the complementizer om. In (66a) the reflexive zich can be bound by Peter across the specified subject Mary. This is not possible in (66b), where Mary controls the subject of the complement. There is considerable independent evidence in Dutch that zich cannot be bound across the boundaries of a full clause (see Koster (1985) and chapter 6 below for details). The minimal S in (66a) is not a full clause, so that zich can have an antecedent external to it. In (66b), however, the minimal clause is an extraposed complement (which can be introduced by a complementizer, as indicated). Extraposed complements are full clauses, which are absolute boundaries for the binding of zich. Thus, there is a clear difference between VR-complements and extraposed complements with respect to binding possibilities. This difference is explained by assuming that the VR-complement is not a full clause (as indicated by the impossibility of a complementizer). The third kind of transparency phenomenon yields a clear difference between VR-complements and extraposed complements of the same (control) verbs. It has been known since Evers (1975) that clitics in VR-complements can be moved across subjects: (67)
Ik denk dat hij het Peter t hoorde zingen
   I think that he it Peter heard sing
   'I think that he heard Peter sing it'
This is a striking fact because normally het cannot be moved across a subject. If the raising-to-object analysis is correct for such cases, het in (67) has been moved from its object position in the complement across the
raised constituent (Peter) in the matrix clause. This kind of "clitic climbing" is possible only from VR-complements. It is never possible to move het out of an extraposed complement: (68)
a. Ik denk dat Peter probeerde (om) het aan Mary te geven
   I think that Peter tried COMP it to Mary to give
   'I think that Peter tried to give it to Mary'
b. *Ik denk dat Peter het probeerde (om) aan Mary te geven
   I think that Peter it tried COMP to Mary to give
Other striking examples, which directly involve control complements, have to do with R-movement, extensively discussed by Van Riemsdijk (1978). In the following example, the particle er 'there' has been moved from its original position (69a) to a position in front of the subject (69b): (69)
a. Ik denk dat iemand er over schrijft
   I think that someone there about writes
   'I think that someone writes about it'
b. Ik denk dat er iemand t over schrijft
   I think that there someone about writes
It appears that VR-complements are no barrier for this kind of movement. Er can be moved from the complement across the matrix subject: (70)
a. Ik denk dat iemand [s er over] probeerde te schrijven
   I think that someone there about tried to write
   'I think that someone tried to write about it'
b. Ik denk dat er iemand [s t over] probeerde te schrijven
   I think that there someone about tried to write
This example constitutes direct and very strong evidence for the transparency of control complements as opposed to the opaqueness of extraposed complements, because er cannot be moved out of the latter: (71)
a. Ik denk dat iemand probeerde [S' om er over te schrijven]
I think that someone tried COMP there about to write
'I think that someone tried to write about it'
b. *Ik denk dat er iemand probeerde [S' om t over te schrijven]
I think that there someone tried COMP about to write
A last, equally compelling argument involves adverbial scope. It seems reasonable to assume that sentence adverbials like waarschijnlijk 'probably' have as their scope the minimal full clause containing them: (72)
Mary says that John is probably crazy
Probably has scope over the embedded clause, and not over the matrix clause. Again, we find a striking contrast between the VR-complements of control verbs and their extraposed complements: (73)
Ik denk dat Peter [S het boek waarschijnlijk] probeerde te lezen
I think that Peter the book probably tried to read
'I think that Peter probably tried to read the book'
It is very likely that the adverbial waarschijnlijk is contained by the complement: it is probably internal to its VP because it has passed the object het boek. It is therefore striking that the scope of the adverbial is not the complement but the next higher clause. This follows from the assumption that the scope of such sentence adverbials is the minimal full clause containing them, together with the assumption that the VR-complement is not a full clause (it always lacks a complementizer). Limiting the scope to the complement does not even give a possible reading, and again we see that extraposed complements are opaque:
(74)
*Ik denk dat Peter probeerde [S' om waarschijnlijk het boek te lezen]
I think that Peter tried COMP probably the book to read
'I think that Peter tried to probably read the book'
If we try to limit the scope of waarschijnlijk to the complement, we get an impossible reading. The crucial point in this case is that the complement is opaque, so that we do not derive the reading where the adverbial has scope over the next clause up. In conclusion, it seems to me that all VR-complements show transparency phenomena. Transparency is expressed in Chomsky (1981b) by the device of S'-deletion. As we have seen, this phenomenon is correlated in English with the impossibility of the complementizer for. It is striking that VR-complements, which all show transparency phenomena in Dutch, can never have a complementizer. What is most crucial is that there is direct evidence that control complements also show these correlated phenomena:
absence of a complementizer plus transparency. But since control complements (in their VR-position) are so obviously transparent in Dutch, there is no reason to block government of PRO by the matrix verb. Since controlled VR-complements fall under the same generalization as other complements that show "S'-deletion"-like behavior, the conclusion seems inescapable: PRO must be governed under certain circumstances. Of course, this is not a frightening conclusion. It is, on the contrary, the only plausible explanation for the fact that PROs of a well-defined class behave like anaphors. In the next section, I will show that the nature of infinitival complements to nouns confirms the conclusion of this section. Ns are not proper governors and therefore select only opaque infinitival complements, both in English and in Dutch.
3.5. Asymmetries between N and V
There is considerable evidence that nouns and verbs do not have the same governance properties. Richard Kayne has investigated these differences and expressed them by stipulating that Vs are structural governors and that Ns are just governors (1981, 1983). Normally a governor governs the categories that it subcategorizes, but a structural governor can also govern elements from other projections under certain conditions. Schematically, this difference is as follows: (75)
a. ... N ... [Xⁿ ... Y ... ] ... (government of Y by N is blocked)
b. ... V ... [Xⁿ ... Y ... ] ... (government of Y by V is possible)
A V may govern across a maximal projection boundary Xⁿ, but an N never does. Although this difference has many interesting consequences, I will not explore them here, limiting myself instead to the consequences for infinitives. Nevertheless, it is important to bear in mind that there is considerable independent evidence for the distinction. Let me, therefore, give one example that has nothing to do with the infinitival complements that concern us here. Ross (1967) proposed a condition generally known as the Complex NP Constraint. This generalization entails that elements can never be extracted from the complement of an N. There are no such restrictions on the complements of Vs. In other words, there is no such thing as a Complex VP Constraint. This N-V asymmetry follows from the distinction shown in (75), in conjunction with the requirement that traces must be properly
governed. Assuming that Wh-phrases are extracted through COMP, we can express the distinction as follows: (76)
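a. [tree diagram: N taking an S'-complement with a trace in COMP]
b. [tree diagram: V taking an S'-complement with a trace in COMP]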
In (76a) the trace in COMP is not accessible for the potential governor N, because this N, not being a structural governor, cannot pass the maximal projection boundary S'. In (76b) there appears to be no problem. V is a structural governor, so that it can govern the trace in COMP. Extraction from an S'-complement of an N would, therefore, lead to an ECP violation, whereas extraction from a V-complement leaves the trace properly governed. In other words, the difference in governance properties of N and V, together with the ECP, explains why Vs can be bridges, whereas Ns are generally not. With this much evidence in mind, we can now turn to infinitival complements. First, we can explain why gerunds do not occur as the complements of Ns, and more generally, why Ns do not assign Case. Recall that we have decided to consider Case assignment to be a relation between a governor and a head. If this is the correct view, we can immediately explain why Vs assign Case, whereas Ns do not: (77)
a. [NP1 N1 [NP2 ... N2 ... ]] (Case assignment from N1 to N2 is blocked)
b. [VP V [NP ... N ... ]] (Case is assigned by V to N)
In (77a) N2 cannot receive Case from N1 because N1 cannot govern N2 across the maximal projection NP2. Since government is a necessary condition for Case assignment, N1 cannot assign Case to N2. Again, a similar problem does not arise in (77b) because V, as a structural governor, can cross the maximal projection boundary NP. I will assume, however, that Vs can govern categories in other projections only if these categories are not governed by another governor,
such as INFL or COMP. Thus, if a verb selects a for-complement, this complement is "protected" from government by the matrix verb. This accounts for the fact that for-complements are opaque, whereas complements without a (D-structure) complementizer appear to be transparent for government from the matrix verb. The consequences of this difference have been extensively illustrated in the previous sections. In Dutch, the category V is also a structural governor, but in this language we must take into account the effects of directional government. VR-complements never have a complementizer, are therefore transparent, and are open to government from the matrix verb. Extraposed complements often have a complementizer, but even without a complementizer they are not accessible to the governing powers of the matrix V.23 In this way, we have been able to explain a whole array of remarkable differences between VR-complements and extraposed complements. Given the fact, then, that Ns do not govern into another projection, we can make a very strong prediction. Our theory predicts that infinitival complements to Ns are never transparent, but always opaque. This prediction is exactly, in all details, borne out. In order to see this, we must briefly go through the list of properties (34). First, N-complements can select complementizers. In English the complementizer for can be selected; in Dutch the complementizer om is often preferred: (78)
a. de poging om Nicaragua aan te vallen
the attempt COMP Nicaragua prt. to attack
'the attempt to attack Nicaragua'
b. het verlangen om rijk te worden
the desire COMP rich to get
'the desire to get rich'
N-complements differ in this respect from VR-complements, which never select a complementizer. Second, raising is impossible in NPs, as was already shown in Chomsky (1970) (see also Williams (1982)): (79)
a. John appears [t to come]
b. *John's appearance [t to come]
Exactly the same holds for Dutch. There is no raising in NPs: (80)
a. John schijnt ziek te zijn
John seems sick to be
'John seems to be sick'
b. *John's schijn ziek te zijn
John's appearance sick to be
As in the previous cases, this follows from the ECP and the fact that Ns do not govern across clause boundaries. As a consequence, the trace t is governed in (79a), whereas it remains ungoverned in (79b). Third, control is of course possible in NPs. Peter is the controller in the following example: (81)
Peter's attempt PRO to go home
Control in general does not distinguish opaque complements from transparent ones (cf. (34c)), so we will leave this property here. Fourth, te-less infinitives provide interesting confirmation for our analysis. In the previous section, they were analyzed as NP-like complements, which had to undergo Case-marking. Naturally, this can only happen under government. But Ns do not assign Case, as is well known. Under the assumption made before - that Case-marking involves government into an NP - we predict that Ns do not select NPs, gerunds, or te-less infinitives. This is indeed the case: (82)
a. Zij wil vertrekken
she wants to leave
b. *haar wil vertrekken
her will to leave
This generalization has no exceptions, a remarkable fact never considered before to my knowledge. Fifth, exceptional Case-marking occurs only in the domain of a V, and never in the domain of an N: (83)
a. Mary believes him to come
b. *Mary's belief him to come
Again, this fact is easily explained by the requirement that Case is only assigned under government. A related fact was noted by Williams (1982): passivization is not possible from the complement of belief. Compare: (84)
a. The book was believed t to be stolen
b. *the book's belief t to be stolen
The explanation is as before: only verbs govern into complements. Sixth, and most crucial, obligatory control (in the sense discussed before) does not occur systematically in NPs. Usually, it is not necessary to have an explicit, c-commanding antecedent: (85)
a. Bill's attempts to leave the country
b. the attempts (by Bill) to leave the country
c. the attempts for Bill to leave the country
The most interesting cases involve a contrast between a verb and a closely related noun. Refuse seems to be a verb of obligatory control: (86)
a. Bill refused to go
b. *I refused for Bill to go
c. *It was refused (by me) to go
The corresponding noun refusal has the two possibilities that refuse lacks: (87)
a. Bill's refusal to go
b. Bill's refusal for Mary to go
c. the refusal to go
The same can be said about Dutch. To my knowledge, there are no systematic examples of obligatory control. It is almost always possible to replace the subject-controller of an NP by an article. If this generalization is correct, it provides important evidence for the involvement of government (of PRO) in obligatory control. Without the assumption of governed PRO, there is no reason to expect that obligatory control patterns with raising and other transparency phenomena. As for the remaining transparency phenomena, N-complements are opaque, insofar as they can be tested in NPs. There is clearly no counterpart of Verb Raising in NPs: there is no rule that adjoins the V of the complement to the N of the NP. Clitics cannot be moved out of N-complements, but this does not tell us very much because there is no clitic position in NPs that can serve as a possible landing site. Sentence adverbials usually cannot occur in N-complements, which pattern like extraposed complements in this respect: (88)
*zijn poging om waarschijnlijk te vertrekken
his attempt probably to leave
It appears, therefore, that N-complements are opaque and behave like extraposed complements - and not like VR-complements - in Dutch. The relevant facts can be summarized by repeating (34) with the properties of N-complements added: (89)
Properties of infinitival complements in Dutch
                          VR    Extraposed    N
a. COMP                   -     +             +
b. Raising                +     -             -
c. Control                +     +             +
d. Without te             +     -             -
e. Exceptional Case       +     -             -
f. Obligatory control     +     -             -
g. Transparency           +     -             -
It is, of course, very remarkable that extraposed complements and N-complements have exactly the same properties, which in most cases sharply contrast with the properties of VR-complements. The explanation of the contrast rests, as we have seen, on the fact that VR-complements lack a complementizer (see (89a)), which makes them transparent with respect to government. Extraposed complements are either opaque because they have a complementizer, or in principle transparent, but not affected by government from the matrix V, since this V governs only in the other direction. N-complements are not affected by government either, since Ns do not govern into complements in general. It is the notion of government, therefore, that ultimately explains the very complex distribution of facts summarized in (89).
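The reasoning of this section can be restated as a small decision procedure. The following is only an illustrative sketch, not part of the theory's formal apparatus: the class `Complement`, the function `is_transparent`, and the feature names are hypothetical, and the three conditions are a simplified reading of the text (the selecting head must be a structural governor, no complementizer may protect the complement, and the complement must stand on the side to which the head canonically governs, assumed here to be the left for Dutch V).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Complement:
    label: str
    head: str                 # category of the selecting head: "V" or "N"
    has_complementizer: bool  # complementizer present at D-structure
    side: str                 # position relative to the head: "left" or "right"


# Assumption: Dutch V governs canonically to its left (SOV order).
# N has no entry because, on this account, N never governs into another projection.
CANONICAL_SIDE = {"V": "left"}


def is_transparent(c: Complement) -> bool:
    """Return True if the matrix head can govern into the complement."""
    structural_governor = c.head == "V"
    unprotected = not c.has_complementizer
    canonical = CANONICAL_SIDE.get(c.head) == c.side
    return structural_governor and unprotected and canonical


# Hypothetical feature assignments for the complement types discussed above.
COMPLEMENTS = [
    Complement("VR", "V", False, "left"),
    Complement("extraposed, with om", "V", True, "right"),
    Complement("extraposed, no om", "V", False, "right"),
    Complement("N-complement", "N", True, "right"),
]


def transparency_table() -> dict:
    """Map each complement type to its predicted transparency."""
    return {c.label: is_transparent(c) for c in COMPLEMENTS}


if __name__ == "__main__":
    for label, value in transparency_table().items():
        print(f"{label:22s} {'+' if value else '-'}")
```

Under these assumptions only the VR-complement comes out as transparent, and an N-complement stays opaque even when it lacks a complementizer, which matches the pattern of row (g) in the summary of properties.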
3.6. Conclusion
In chapter 1, I specified a very general set of properties, the configurational matrix, that I claimed to be the essence of all core grammar dependencies. According to the Thesis of Radical Autonomy, there are no sets of such major properties that are specific to a certain type of construction (e.g. movement, anaphor binding, predication, control). Control has been a challenge from this point of view, because it is usually claimed that control has little in common with movement or even anaphor binding. According to the standard theory of government and binding, this deviant status of control is due, among other things, to the alleged fact that PRO is not governed. The purpose of this chapter has been to show that the standard view is false in this respect. The reason appears to be that the standard view ignores the fact that there are two kinds of PRO, as was stressed by Williams (1980) and, in a different framework, by Bresnan (1982). This chapter started from the assumption that what Williams calls "obligatory control" (and Bresnan "functional control") has properties that are remarkably similar to the properties of anaphor binding, and ultimately to the properties of movement and predication as well. This can only mean that obligatory PROs (in the intended sense) are in fact anaphors. Since PRO is an empty category, it can only be an anaphor if it is governed (principle A of Chomsky's binding theory). It was furthermore shown that PRO can only be governed (i.e. be anaphoric) if it is the subject of a transparent complement. In English and Dutch, at least, transparency follows from the absence of a complementizer in underlying structure.24 Although it might seem somewhat strange and unorthodox to postulate the possibility of governed PRO, the consequences appear to overwhelmingly favor this assumption. First of all, it leads to an explanation of the correlation between the lack of a complementizer and obligatory control. Furthermore, it provides - in conjunction with some common assumptions about control - an explanation for some generalizations made by Visser and Bach. These generalizations are not explained by the standard theory, nor by the lexical-functional alternative, in spite of claims to the contrary. Nevertheless, the theory presented in this chapter can be seen as a slight modification of the standard theory. Much of this theory was adopted here, especially the S-analysis of infinitival complements plus the various possibilities of government in general, and of empty categories in particular. The advantages of this approach are particularly striking with respect to the complement system of Dutch. In this case, it is not even clear how governed PRO can be avoided (in VR-complements), and, on the positive side, a very intricate set of facts falls into place. To the extent that the analyses in this chapter can be maintained, the Thesis of Radical Autonomy is vindicated because the assumption of governed PRO entails that there is a class of control cases that shows the strict properties of anaphor binding. My theory differs in this respect from the standard theory, which still has the - in my opinion undesirable - tendency toward construction-specific properties. In a sense, the same can be held against the alternatives proposed by Bresnan and Williams. In the lexical-functional approach, there is no obvious way to explain the similarities between a "functionally" described phenomenon such as obligatory control and a purely nonfunctional, configurational phenomenon such as Wh-movement. In conclusion, Williams's (1980) approach was one of the sources of inspiration for this chapter. In this framework, much is reduced to the properties of predication; or at least obligatory control is reduced to predication.
The Thesis of Radical Autonomy suggests that this generalization might be somewhat misleading. The properties of predication themselves stand in need of clarification. What particularly calls for explanation is the fact that the configurational properties of the predication relation largely overlap with the properties of anaphor binding, movement, and many other dependencies. What is needed, in other words, is a deeper configurational principle that explains the similarities among all local dependencies. It is my hope that the properties of the configurational matrix provide the basis for such a deeper principle.
NOTES
1. For the sake of convenience, I will refer to all PROs in for-infinitival complements as optional PROs, in spite of the fact that such PROs are sometimes obligatory, as in John knows what PRO to do. The obligatoriness of PRO (and absence of for) in such cases follows from independent factors, as has been familiar since Chomsky and Lasnik (1977).
2. Apart from overt complementizers, I also assume the existence of null complementizers. This assumption seems necessary, because the relevant distinctions can also be found in languages without overt complementizers for infinitives.
3. For the sometimes rather different properties of other control structures, see Van Haaften (1982). See also Manzini (1983a).
4. For the old idea of an implicit controller, see Koster (1978b, 583) and Roeper (1983), among others.
5. Note that the infinitives in these examples are complements to Ns, which never contain obligatory PRO in the sense intended here (see note 1).
6. In fact, the scope of the c-command property is unclear and not well explored. Reflexivization, for instance, does not always involve c-command (see Koster (1985)). Jackendoff (1972) gives examples like a book by John about himself; see also E. Kiss (1981). McCloskey (1984) gives an interesting argument for raising into PPs in modern Irish. If McCloskey is right, even movement does not always involve a c-commanding antecedent.
7. Some verbs are more complex than try. The verb say, for instance, has two possibilities: John said PRO to go, with PRO ("arbitrarily") controlled by the x to whom John addresses himself, and John was said t to go, which does not involve control at all, but raising.
8. Guglielmo Cinque has reminded me of the well-known fact that passivization in the complement of promise may lead to corresponding passivization of the matrix verb: Bill was promised PRO to be allowed to go.
9. See Rosenbaum (1967). See also Koster (1978c, ch. 3) for discussion.
10. It is not possible at the moment to give necessary and sufficient conditions for such structures. As Williams (1980) points out, not all verbs with for-complements permit this construction (want, for instance, is an exception). But if these constructions are possible at all, the complement is usually a for-complement.
11. Hans den Besten and Riny Huybregts have pointed out to me that (32) is not necessarily a crucial example because, according to them, passivization is generally impossible with Verb Raising complements. I disagree, because in my speech the following example (which involves Verb Raising and passivization) is perfectly grammatical:
(i) dat hij de voorzitter geacht werd te zijn ...
that he the president considered was to be
'that he was considered to be the president ...'
Thus, if I am right, the account given here explains the incompatibility of Verb Raising and passivization for control verbs. There are also noncontrol verbs, such as verbs of perception, that seem to show the incompatibility of Verb Raising and passivization, but this is probably an independent matter that has nothing to do with Verb Raising. I base this conclusion on the fact that passivization is also impossible in the corresponding English examples, which certainly do not involve Verb Raising:
(ii) We saw Bill go
(iii) *Bill was seen t go (by us)
12. Henk van Riemsdijk has pointed out to me that there is an exception to (33b):
(i) dat Bill werd geacht t vergeten te zijn
that Bill was considered forgotten to be
'that Bill was considered to be forgotten'
This is not a productive pattern in Dutch, and it occurs only in a very limited number of cases with participles. As far as I know, all (nonparticipial) verbs that allow raising to subject are VR-verbs.
13. See Stowell (1981), Hoekstra (1982), and Kayne (1983). Henk van Riemsdijk (personal
communication) has suggested that directionality is limited to argument positions. For the SOV character of Dutch and German, see Koster (1975), Den Besten (1977), and Thiersch (1978).
14. See Hoekstra (1984) for discussion of this matter.
15. I am assuming here that D-structure properties can be projected from S-structure (see Koster (1978c) and Sportiche (1983)). A further assumption is that (at S-structure) a θ-role assigned to an NP is optionally inherited by its antecedent. If the antecedent is assigned an independent θ-role (as in control structures), inheritance is blocked by the θ-criterion, which prevents double θ-marking of NPs.
16. Riny Huybregts has argued that Verb Raising is a two-step operation: reanalysis at D-structure and actual movement of the verb at the phonological level (PF); see chapter 5 below for discussion of this idea. Note that the surface order of (45b) also gives a grammatical sentence in Dutch. This does not mean, however, that Verb Raising is optional. Presumably, reanalysis (or rather, the alternative discussed in chapter 5) always applies, whereas the actual surface order is determined at PF by minor movements that vary somewhat from dialect to dialect in Germanic (especially with modals).
17. See Evers (1975). Note that the surface order of (50) is grammatical in Flemish dialects. This is not the result of extraposition, however, but a consequence of the fact that the dialects in question allow incorporation of one X" category under Verb Raising (see chapter 5 for references and some discussion).
18. A relevant example is the following:
(i) Ik geloof dat John met en Mary zonder een potlood werkte
I believe that John with and Mary without a pencil worked
The words in italics definitely do not form a constituent.
19. Actually, it is possible for two constituents to precede the finite verb, if the second constituent is a so-called d-word. See Koster (1978a,c) and Thiersch (1978).
20. It is even possible in Dutch to insert a d-word between the preposed constituents and the finite verb in (63):
(i) [Boeken lezen] dat_i zie je haar t_i zelden
books read that see you her rarely
The d-word dat is an NP, which usually has an NP or S' as antecedent. The D-structure position, indicated by the trace, is an NP-position. Haar remains the understood subject of boeken lezen, which indicates that this subject is not part of the preposed S' at S-structure.
21. VP-preposing is totally impossible without an auxiliary verb. But in Dutch auxiliary verbs are raising verbs, which always have an S'-complement that can be preposed. Consequently, there is no evidence for VP-preposing.
22. This is probably an independent matter. If Verb Raising involves reanalysis (see note 16), it probably also involves the typical condition on reanalysis, namely, adjacency of the reanalyzed items. After extraposition, the matrix verb and the complement are not adjacent. For an argument against reanalysis, see chapter 5.
23. This follows from the fact that Vs do not govern arguments to their right in Dutch.
24. Presumably, there are clauses with null complementizers. Guglielmo Cinque has pointed out to me that two classes of complements can also be distinguished in Italian, in spite of the fact that in this language the distinction does not depend on the presence or absence of an overt complementizer.
Chapter 4
Global Harmony, Bounding, and the ECP
4.1. Introduction
One of the most popular developments in the theory of grammar during the past five years, the Empty Category Principle (ECP) can be seen as a rapprochement between the grammar of gaps and the grammar of scope. According to most versions of the ECP, there is a level of Logical Form (LF) to which the condition applies. LF is derived from S-structure by the rule "move alpha", which adjoins, among other things, quantified phrases to some category containing them (see May (1977) and (1985) for details). In this way, the nature of scopal domains is determined in part at least by the properties of either move alpha or by the properties of gaps (empty categories) in general. Not all versions of the ECP apply to LF. In fact, one of the most elaborated versions, the Connectedness Condition (CC) of Kayne (1983), applies at the level of S-structure. But in one crucial aspect Kayne's version agrees with most other versions: it assumes the essential parallelism of the grammar of gaps and the grammar of scope (particularly, the grammar of Wh-elements in situ). The following elements can be distinguished in Kayne's CC: (1)
a. proper government (standard ECP)
b. bounding (Subjacency)
c. percolation (directionality)
According to the CC, an empty category must have an antecedent in one of its g-projections. An ec (empty category) has a g-projection if and only if it is properly governed.1 The CC, therefore, incorporates some version of Chomsky's classical ECP (Chomsky (1981b)). Especially in an earlier version (Kayne (1981)), Kayne's ECP was also designed to incorporate the bounding theory (Subjacency). The current version, the CC, no longer entails the whole bounding theory, but there are still elements of this theory that the CC covers. The subject condition of Chomsky (1973), for instance, follows from the CC: (2)
*Who_i did [NP a picture of t_i] disturb you?
The trace of who, t_i, is properly governed, so that it has a g-projection. Assuming that canonical government is to the right in English, percolation goes up to the subject NP (indicated by the brackets in (2)). From here, there is no further percolation because the subject is on a left branch, contrary to the condition of canonical government, which requires g-projections on right branches all the way to the antecedent who. It is therefore not possible to include the antecedent who in any g-projection of the trace, from which it follows that (2) violates the Connectedness Condition. Other versions of the ECP also include proper government (1a) and elements of bounding theory (1b), but Kayne's CC is unique in that it crucially involves percolation (1c), particularly certain directionality constraints, as entailed by the notion "canonical government configuration". It is for this reason that Kayne's version of the ECP is the only version that has something to say about preposition stranding, among other things. In what follows, I will develop Kayne's directionality constraints and show that the nature of preposition stranding, the near absence of parasitic gaps, and the extreme marginality of island violations in an SOV language like Dutch follow from these directionality constraints. But in spite of my agreement with this particular aspect of Kayne's CC, I will show that the crucial assumption that inspired several versions of the ECP (including the CC), the idea of strong parallelism between the grammar of gaps and the grammar of scope, is unmotivated. In particular, I would like to argue that there is at best a weak parallelism in terms of (1a), i.e. in terms of the standard ECP effects. Not only the bounding conditions (i.e. (1b)) but also the directionality constraints (i.e. (1c)) fail to show the expected parallelism.2 As for the bounding conditions, the lack of parallelism has been convincingly demonstrated by Huang (1982).
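The percolation argument for (2) can be made concrete with a toy procedure. This is a deliberately simplified sketch and not Kayne's actual definition of g-projection: it assumes strictly binary trees, takes proper government of the trace for granted, lets percolation continue only while the current node sits on the canonical (right) branch, and counts an antecedent as connected when it is a daughter of some node reached by percolation. The names `Node`, `g_projections`, and `connected`, and the two tree shapes, are hypothetical illustrations.

```python
class Node:
    """A node in a strictly binary tree; the first child is the left branch."""

    def __init__(self, label, children=()):
        self.label = label
        self.parent = None
        self.side = None
        self.children = list(children)
        for index, child in enumerate(self.children):
            child.parent = self
            child.side = "left" if index == 0 else "right"


def g_projections(trace, canonical_side="right"):
    """Collect the nodes reached by upward percolation from the trace.

    Percolation stops as soon as the current node is not on the
    canonical branch of its parent.
    """
    reached = []
    node = trace
    while node.parent is not None and node.side == canonical_side:
        reached.append(node.parent)
        node = node.parent
    return reached


def connected(trace, antecedent, canonical_side="right"):
    """Simplified check: the antecedent is a daughter of a reached node."""
    return any(p is antecedent.parent
               for p in g_projections(trace, canonical_side))


def object_extraction():
    # Rough shape of: who [you [see t]]
    who, t = Node("who"), Node("t")
    vp = Node("VP", [Node("see"), t])
    s = Node("S", [Node("you"), vp])
    Node("S'", [who, s])
    return t, who


def subject_extraction():
    # Rough shape of (2): who [[a picture [of t]] VP]
    who, t = Node("who"), Node("t")
    pp = Node("PP", [Node("of"), t])
    np = Node("NP", [Node("a picture"), pp])
    s = Node("S", [np, Node("VP")])
    Node("S'", [who, s])
    return t, who


if __name__ == "__main__":
    print(connected(*object_extraction()))
    print(connected(*subject_extraction()))
```

In the second tree percolation reaches PP and NP and then halts, because the subject NP is on a left branch of S; the node that dominates who is never reached, which is the configuration the text describes for (2).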
Recall that May's rule of Quantifier Raising (QR) was an instance of move alpha. May (1977) derived a strong prediction from this fact, namely that the grammar of scope (as determined by QR) would be constrained by the distinguishing property of move alpha, namely Subjacency. This prediction has turned out to be false. In fact, Chomsky (1977) had already argued that scopal domains are not really determined by Subjacency, but at best by a specificity constraint (see Fiengo and Higginbotham (1981)), as can be seen in the following examples (Chomsky (1977, 214)): (3)
a. We can't find books that have any missing pages
b. *We can't find the books that have any missing pages
In (3a), any can have wide scope in spite of the fact that it is in a complex NP. Specificity of the NP in (3b) blocks such an interpretation. Similarly, Chomsky showed that the scope of Wh-phrases in situ is not really subject
to island constraints. Chomsky (1981b, 235) repeats this conclusion and gives the following examples: (4)
a. Who remembers where we bought which book
b. I wonder who heard the claim that John had seen what
In (4a), the rightmost Wh-phrase, which book, can have wide scope, in violation of the Wh-island constraint. And in (4b), the violation of the Complex NP Constraint seems tolerable. Huang (1982) extensively demonstrates similar violations of island constraints for Chinese, and Lasnik and Saito (1984) give similar examples for another language without overt Wh-movement, namely Japanese. In short, the idea that the scope of Wh-in situ is constrained by Subjacency has been almost universally abandoned (also by May (1985)). What has not been universally concluded, however, is that this result undermines the concept of LF movement. LF movement exists to the extent that it has properties. If it does not have the properties of move alpha, what other properties does it have? Unless this question is answered, the concept of LF movement is more obscure than its popularity might suggest. In any case, the alleged parallelism between the grammar of scope and the grammar of gaps has effectively been undermined. In recent work, Chomsky has "institutionalized" the observed discrepancy by stipulating that Subjacency is the characteristic condition of gaps at S-structure, while the (standard) ECP is the characteristic condition of gaps at LF (Chomsky (1986a)). It is the main purpose of this chapter to show that there is a second major discrepancy between S-structure gaps and so-called LF gaps, namely in terms of the directionality constraints (1c). My elaboration of Kayne's directionality constraints is based on the more general theory of syntactic domains that was briefly introduced in chapter 1 and that I will discuss first.
4.2. On the nature of local domains
The general background assumption of the following discussion is the Thesis of Radical Autonomy. According to this thesis, the major properties of core grammar are construction-independent. Thus, the c-command property meets the criterion of radical autonomy because it can be found in most modules, such as predication theory, government theory (including licensing relations like θ-marking, Case-marking, and subcategorization), binding theory, and bounding theory. In other words, c-command is not the exclusive property of any of these subtheories. Locality conditions, on the other hand, are more or less supposed to be different for each subtheory. Thus, the locality property of binding is expressed by the notion governing category, while the locality condition of bounding is
expressed by the unrelated and different notion of Subjacency. The most important implication of the Thesis of Radical Autonomy is that what holds for c-command also holds for locality: locality conditions and their extensions are roughly of the same nature for all subtheories. As said before, it should be noted that the Thesis of Radical Autonomy does not entail that there are no differences between NP-movement and control (see Koster (1984a)), or between binding and bounding. Clearly, there are certain properties that differentiate the various subtheories of grammar. Radical autonomy only means that there is a common core to the subtheories, a core which is therefore totally construction-independent, and which includes (in my view) the locality conditions. In other words, the current subtheories are not atomic but molecular: they are to a large extent made from the same stuff. It is my purpose to determine what the common core is, and to maximize, for instance, the similarities between binding and bounding. Although it seems to me that such a goal would be fairly obvious in any science that is trying to expand its explanatory core, the momentum of linguistic theory has been almost in the opposite direction in recent years. Thus, the opacity conditions of Chomsky (1973) were supposed to hold for both movement and anaphor binding. In Chomsky (1981b), things are more strictly separated. Subjacency holds for movement only, while opacity (as incorporated in the binding theory) only holds for anaphor binding (including NP-traces, but not Wh-traces). Similarly, the NIC of Chomsky (1980a) applied to both Wh-traces and anaphors. In Chomsky (1981b) part of the NIC is subsumed under the binding theory (anaphors) while a residue was developed into the ECP (traces). There are several other examples of this development, and its main thrust is clear: the maximization of the differences between binding and bounding.
In my opinion, the results of this development have not always been convincing, and in any case it seems useful to counterbalance the development just mentioned by an attempt to stress the similarities between binding and bounding. If we look at a language like English, or at other SVO languages such as Italian, French, Norwegian, or Swedish, the differences between the properties of Wh-traces (bounding) and the properties of anaphors (binding) seem almost overwhelming at first sight. But if we look at the much stricter bounding conditions on Wh-traces in an SOV language like Dutch, the similarities with the binding properties of anaphors can hardly be overlooked. It is my claim that the more permissive nature of bounding in English and other SVO languages is (in part) the consequence of the nature of the directionality constraints that determine domain extensions. In a language like Dutch, the nature of directionality blocks the domain extensions, hence the strict character of bounding. On the other hand, there are many languages with a much more permissive type of binding than English. In English, the domain of binding is essentially the governing category of the
anaphor. But in many languages, the domain can be "stretched" under certain conditions. Long distance reflexivization, for instance, is by no means exceptional, as shown in a survey by Yang (1984). So, in fact, there could be a possible natural language which is more or less the mirror image of English with respect to the relative permissiveness of binding and bounding. Such a language would have the strict bounding conditions of Dutch and the more permissive binding conditions of Icelandic. In other words, the emphasis on the difference between binding and bounding has perhaps been determined by the accidental properties of English and some other languages that happen to have the same differences between binding and bounding. Assuming that this bias is accidental, I will now sketch a theory of local domains (and their extensions) that stresses the similarities between binding and bounding. According to the theory introduced in chapter 1, there is a simple prototypical domain that is relevant for a number of different construction types. Other domain definitions can be seen as simple extensions of this prototypical "Urform". I am assuming, then, that the following domain definitions are sufficient for most construction types: (5)
a. ... [β ... γ ... δ ... ] ...
b. ... [β ... (ω) ... γ ... δ ... ] ...
c. if ... [β ... (ω) ... γ ... δ ... ] ... is a domain, then ... [β' ... γ' ... [β ... (ω) ... γ ... δ ... ] ... ] ... is a domain
In these definitions, δ stands for "dependent category" (for instance, an anaphor or Wh-trace); γ is the minimal governor of δ, and β is the minimal maximal projection containing γ and δ. For some unexplained reason, it appears that the value of β is usually not VP, but S'. Thus, the values of β are: NP, PP, AP, and S'. ω stands for "opacity factor", i.e. the elements subject and AGR (SUBJECT in the sense of Chomsky (1981b)), and presumably INFL or COMP (see chapter 6). The third definition (5c) is recursive, and therefore the most interesting. In this definition, β' stands for the minimal category (with the same values as β) containing β and γ' (the governor of β). In principle, the recursive definition (5c) can lead to domains of unlimited size, usually if there is some sort of agreement between the successive governors γ and γ'. This agreement between successive governors is a very common property of this type of domain extension. The first type of domain (5a) is by far the most productive. It is part of the configurational matrix and the locality condition for the government relation itself, and therefore also for the licensing relations (θ-marking, Case-marking, subcategorization) that depend on government. As we saw in chapter 3, it is also the locality condition for obligatory control (see Koster (1984a)) and predication (see Williams (1980)). Most importantly, it is the locality condition that determines chain formation. It has exactly the same format as the Bounding Condition of Koster (1978c) (as
reformulated in Koster (1979) and chapter 1 above). This condition is a somewhat stricter locality principle than Subjacency. As an illustration of (5a), consider the government relation. Government is strictly local, as is clear from the following example: (6)
... V [PP P NP] ...
In such configurations, the NP is governed by P and not by V. This follows from the fact that government involves locality principle (5a). The NP is the dependent element δ, which is dependent on the minimal governor P (the γ in (5a)). The minimal category β containing these elements is the PP. As a result, the NP cannot be governed by an element outside this PP. The V is therefore not a possible governor of the NP. The second type of domain (5b) clearly is a minimal extension of (5a). By adding the opacity factor ω, the values of β are reduced to those categories that can contain a SUBJECT or COMP or some other opacity factor (see chapter 6). This is typically the domain proposed for binding relations (the notion "governing category" in Chomsky (1981b, ch. 3)). The third type of domain (defined by (5c)) is a recursive extension of (5a) or (5b). This is the domain of long reflexivization in Icelandic (and many other languages; see Yang (1984)). It is also the domain for long Wh-movement and parasitic gaps in English, French, Italian, Spanish, Scandinavian, etc. As mentioned before, this domain extension often requires a kind of agreement between the governor of the lower domain and the governor of the superjacent domain. Long reflexivization and long Wh-movement differ in this agreement relation, but otherwise the nature of the domain extension seems to be similar. In both cases, the domain extension is triggered by a chain of successive governors. I will refer to such a chain as a dynasty: (7)
Dynasty =df a chain of successive governors (γ1, ..., γn) such that for each i (1 ≤ i < n), γi governs the minimal domain β containing γi+1
Thus, if n = 5 for instance, we have the following situation, where each γ (except γ5) governs the domain β of the next governor:
As in all other good dynasties, the members of a syntactic dynasty have something in common. What they have in common is in part a matter of parametrization. One of the best known examples is the kind of domain extension we find for Icelandic reflexives (see Thráinsson (1976), Maling (1981), Anderson (1983)). In the simplest case, domain extensions for Icelandic reflexives are determined by a dynasty of Vs, such that each V (except the first one) is in the subjunctive mood. Anderson (1983) gives examples like the following:
(9) Jón segir að María viti að Haraldur vilji að Billi meiði sig
    John says that Mary knows(subj.) that Harold wants(subj.) that Bill hurts(subj.) himself
In this example, the reflexive sig is not bound in its minimal domain but in a higher domain. In fact, a reflexive can be bound in any higher domain as long as the intermediate verbs are in the subjunctive mood. Moreover, each domain must be governed by the next higher subjunctive verb, up to the domain of the antecedent. The dynasty for Icelandic reflexivization in an example like (9) is as follows: (10)
... [S' Jón ... V1 [S' ... V(subj.) ... [S' ... V(subj.) ... [S' ... V(subj.) ... sig ... ]]]] ...
The chain of subjunctive verbs is dependent on the first verb (V1) which is in the indicative mood. In general, dynasties for anaphors are formed if the chain of governors is somehow dependent on the first element. We can express this by stipulating that the elements of the dynasty have the feature [+dependent] as a common characteristic (see Yang (1984)). The feature [+dependent] appears to be a parameter. It is the property shared by the members of the relevant dynasties (for n ≥ 2) and it can have at least the following values (see chapter 6 for some problems and refinements): (11)
a. [+dep] = subjunctive (Icelandic)
b. [+dep] = infinitive (Icelandic, Norwegian)
c. [+dep] = reanalysis (Dutch, German)
In Norwegian, the domain of the reflexive seg is extended if the dependent verb is an infinitive (see Hellan (1980)):
(12) Ola bad oss [PRO snakke om seg]
     Ola asked us to talk about himself
The Dutch equivalent of (12) is ungrammatical:
(13) *Marie vroeg ons [om PRO over zich te praten]
      Mary asked us comp. about herself to talk
In Dutch (and German), the domain is only extended with accusative-with-infinitive verbs (Reis (1976)). These verbs undergo Verb Raising under conditions of "reanalysis" (here used only as a descriptive term; see chapter 5): (14)
Marie liet i [ons over i zich praten i ] Mary let us about herself talk
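The parametrization in (11) can be made concrete with a small sketch. It is only an illustration under stated assumptions: the table `DEP_VALUES`, the function name `reflexive_domain_extends`, and the string labels for verb forms are hypothetical and do not come from the text. The sketch also assumes that every governor below the first must carry one of the language's [+dep] values, and it ignores the problems and refinements deferred to chapter 6.

```python
# Illustrative parameter table for (11); the dictionary layout and the
# value names are assumptions of this sketch, not notation from the text.
DEP_VALUES = {
    "Icelandic": {"subjunctive", "infinitive"},
    "Norwegian": {"infinitive"},
    "Dutch": {"reanalysis"},
    "German": {"reanalysis"},
}


def reflexive_domain_extends(language, dependent_verbs):
    """Return True if every governor below the first one counts as [+dep].

    dependent_verbs lists the relevant property of each governor after
    the first one, ordered from the highest to the lowest.
    """
    allowed = DEP_VALUES.get(language, set())
    if not dependent_verbs:
        # No dependent governors: there is no extension to license.
        return False
    return all(form in allowed for form in dependent_verbs)


# (9): the three verbs below the indicative verb are subjunctive.
icelandic_9 = reflexive_domain_extends("Icelandic", ["subjunctive"] * 3)
# (12): the embedded verb is an infinitive.
norwegian_12 = reflexive_domain_extends("Norwegian", ["infinitive"])
# (13): an ordinary infinitive in Dutch does not count as [+dep].
dutch_13 = reflexive_domain_extends("Dutch", ["infinitive"])
# (14): accusative-with-infinitive with "reanalysis".
dutch_14 = reflexive_domain_extends("Dutch", ["reanalysis"])

print(icelandic_9, norwegian_12, dutch_13, dutch_14)
```

Under these assumptions the sketch licenses the extension in (9), (12), and (14) and blocks it in (13), matching the judgments reported above.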
In short, the various Germanic languages that allow long reflexivization are parametrized with respect to the feature [+dep]. This feature [+dep] is the property which the dynasties that determine the domain extension have in common. I would now like to show that the same dynasty concept applies to empty categories (ec's). I will assume, in accordance with Koster (1978c) and chapter 1 above, that the Bounding Condition (5a) is the locality condition for traces in the unmarked case. Extensions of this minimal domain always have the form (5c) and are often only possible if a dynasty of a certain type can be formed. As in the case of reflexivization, dynasties can be formed only if the successive governors of the domain extension share a certain property. This property can be met in some languages, but not in others. If and only if the property in question can be met, does the language in question have long Wh-movement. For the time being, I will assume that in fact two conditions must be met: (15)
ec's have a dynasty iff
(a) γn (of (7)) is a structural governor, and
(b) each γi has the same orientation
Whether something is a structural governor or not differs from language to language;4 so, let us assume that (15a) involves a parametrized notion. The second condition, however, might be invariant. So, let us make the strongest claim possible, namely, that it is an unparametrized principle of Universal Grammar. The relevant form of orientation is the direction of government, an important feature of grammars as argued by Stowell (1981), Kayne (1983), Hoekstra (1984), and others. The intuitive content of the directionality constraints (to be defined in the next section) can best be illustrated with an example: (16)
Whoi did you see [a picture [of ti]]
(the arrows under the governors see, picture, and of all point to the right)
The preposition of is a structural governor, so condition (15a) is fulfilled. The domain of this P, indicated by the brackets, is governed by the head
(N) of the next domain. In turn, the domain of this N (the NP indicated by the next pair of brackets) is governed by the verb see. So, in principle, we have a dynasty here, consisting of the elements V, N, and P. According to condition (15b), however, we can form a dynasty for the trace only if all these elements govern in the same direction. This happens to be the case in (16), as indicated by the arrows underneath the governors. My claim is that this sameness of the directionality of government is a necessary condition for grammaticality. If this claim is universally true, we make the strong prediction that in languages in which the orientation of government is not uniform (as it is in (16)), the equivalent of (16) is ungrammatical. This is indeed the case in a language like Dutch. (17)
*Wiej heb je [een foto [van tj] gezien]
 who have you a picture of seen
(the arrows under foto and van point to the right; the arrow under gezien points to the left)
As I will show in what follows, prepositions are structural governors in Dutch, so that the first condition of (15) is fulfilled. The second condition, however, is not fulfilled: the orientation of the P van 'of' and the N foto 'picture' is the same as in English (government to the right), but the orientation of the verb zien 'see' is in the opposite direction (government to the left). The nonuniformity of the direction of government, indicated by the arrows in (17), makes the sentence ungrammatical. There is no dynasty, which blocks the recursive domain extension (5c). Consequently, the Bounding Condition (5a) defines the maximal domain in which the trace can be bound. In general, traces are bound in their minimal maximal projection (NP, PP, AP, S'), unless the domain can be extended. The domain can be extended only if a dynasty of uniformly oriented governors can be formed. Before discussing the evidence for this hypothesis, I will first point out certain differences between ec's in (A')-chains and ec's in domain extensions as defined by (5c).
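The two conditions in (15) can be sketched as a simple check over a chain of governors. This is a minimal illustration, not the author's formalization: the class `Governor`, the function `has_ec_dynasty`, and the default set of structural governors (V and P) are assumptions introduced here; condition (15a) is parametrized per language, which the `structural` argument is meant to reflect.

```python
from dataclasses import dataclass


# Hypothetical encoding for illustration; none of these names come from the text.
@dataclass(frozen=True)
class Governor:
    category: str   # e.g. "V", "N", "P", "A"
    direction: str  # "right" or "left": the direction in which the head governs


def has_ec_dynasty(governors, structural=frozenset({"V", "P"})):
    """Check (15) for a chain ordered from the highest governor to the lowest."""
    if not governors:
        return False
    lowest = governors[-1]
    # (15a): the lowest governor (the minimal governor of the ec) must be
    # a structural governor; which categories qualify is language-specific.
    if lowest.category not in structural:
        return False
    # (15b): all governors in the chain must govern in the same direction.
    return len({g.direction for g in governors}) == 1


# (16): see, picture, of all govern to the right.
english_16 = [Governor("V", "right"), Governor("N", "right"), Governor("P", "right")]
# (17): zien governs to the left, foto and van to the right.
dutch_17 = [Governor("V", "left"), Governor("N", "right"), Governor("P", "right")]

print(has_ec_dynasty(english_16))
print(has_ec_dynasty(dutch_17))
```

With these inputs the check succeeds for the English chain and fails for the Dutch one, which is the contrast between (16) and (17).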
4.3. The Cinque-Obenauer hypothesis
In an earlier paper, I made a distinction between an anaphoric strategy and a pronominal strategy for A'-bound ec's. The anaphoric strategy connects the links of an A'-chain, in accordance with the Bounding Condition (5a). The pronominal strategy can be found in constructions involving island violations and parasitic gaps (Koster (1983)). Recently, Cinque (1983b) and Obenauer (1984) have given much evidence that the notion pronominal strategy should be taken literally. Guglielmo Cinque, whose ideas I will follow here to a large extent, has proposed that the empty category with the features [-anaphor, +pronominal] is not only the little pro identified by AGR in pro-drop languages, but also the empty resumptive pronoun found in parasitic gap
constructions (among others). In the latter case, pro is not identified by AGR but by an operator in A'-position. Since this pro is considered a resumptive pronoun, it can only be an argument NP.5 This hypothesis explains several facts about so-called long Wh-movement. Consider for instance Belletti's well-known observation that violations of the Complex NP Constraint are much worse if the extracted Wh-phrase is a PP: (18)
a. ?Whoj do you believe the claim that John spoke to ej
b. *To whomj do you believe the claim that John spoke ej
(19) a. ?This is the guy Oj I heard about a plan to give a book to ej
     b. *This is the guy to whomj I heard about a plan to give a book ej
Extraction of PPs from islands is consistently worse than extraction of NPs from islands. This fact is explained, according to Cinque, if we assume that the rightmost ej in (18) and (19) is not a trace but an empty resumptive pronoun (pro). Obviously it must be determined then that the ec's in these sentences cannot be part of a chain. Why, in other words, are the ec's in (18) and (19) not traces? Cinque solves this problem by stipulating that there is a Wellformedness Condition on Chains (WCC) which entails, expressed in our terms, that a dynasty for traces contains only structural governors. Assuming that P and V are the structural governors (Kayne (1984)), it follows that the ej in (18) and (19) cannot be a trace: in all cases, the dynasty that extends the domain of ej to the antecedent contains the N head of the complex NP (claim, plan). Since N is not a structural governor, the WCC is not met; therefore, the ec's cannot be traces and must be pro. Since only NPs can be pro, the contrasts in (18) and (19) are explained. Although my own views differ somewhat with respect to the WCC, I think this approach is essentially correct. Or, at least, it is the only approach that explains the contrasts in (18) and (19) at all. Another much discussed problem that is explained by Cinque's approach is the following contrast (Chomsky (1982a, (71), (72)); see also Lasnik and Saito (1984) and Pesetsky (1984)); (20)
a. someone who John expected t to be successful though believing e to be incompetent
b. *someone who John expected t would be successful though believing e is incompetent
c. *someone who John expected t would be successful though believing that e is incompetent
In fact, we find a three-way contrast here: (20c), with undeleted that, is worse than (20b), which is in turn considerably worse than (20a). How can we account for these contrasts? Suppose that e is a resumptive pro in all cases. (20a) would be
unproblematic because e is structurally governed and there is a path of right branches (in the sense of Kayne's Connectedness Condition) up to the antecedent (contrary to traces, pro does not require a dynasty of only structural governors). The other examples, however, are problematic if e must be pro. The problem is that there is no structural governor for the e in nominative subject position in (20b,c). These ec's can at best be structurally governed by COMP or antecedent-governed by a local antecedent in the immediately preceding COMP. But if ec's are locally bound in this way, they cannot be pro, but must, instead, be anaphoric. In a GLOW presentation (Copenhagen, 1984), Cinque gave evidence that resumptive pronouns (like pro) are never bound by a close operator (if the pronouns are subjects). Even in English this can be demonstrated: (21)
a. *the man whoj hej died yesterday ...
b. the man whoj we didn't know who had invited himj
So, it is clear that the e in (20b,c) cannot be pro. But then it must be a trace. Sentence (20c) is then immediately ruled out by the principles that account for the that-t effect. But (20b) is also ruled out; since e is not pro, there is no direct link to the antecedent. The only other option would be successive cyclic linking (as in chains). But there cannot be a chain from e to the antecedent because not all intermediate governors are structural (the governor though is not) and thus Cinque's WCC is not met. The only remaining option, according to which the e (trace) is locally bound by an operator, which is itself pronominally linked to who, is also excluded: (22)
someone whoj would be successful though believing [Oj [tj ... ]]
(pronominal linking between whoj and Oj)
This option is excluded because only arguments (NPs) can be resumptive pronouns. A third fact explained by the Cinque-Obenauer approach is the well-known fact, studied in detail by Huang (1982), that certain adjuncts, like why, cannot be extracted from islands:
(23) *Whyj do you believe the claim that John left ej
Again, the ej cannot be a trace: since the intermediate N (claim) is not a structural governor, the WCC cannot be met. But it cannot be a pro either, because it is not an argument NP. So, this approach offers a very simple solution for Huang's adjunct facts (to the extent that they involve overt gaps). From the discussion of the examples in (20) it appeared that parasitic gaps must be considered to be pro. This view is at variance with the alternative view that parasitic gaps are in fact traces, bound by an abstract operator O. According to this view, parasitic gap constructions involve
two traces, and therefore two chains that are somehow compounded (Chomsky (1986b)):
(24) Which book did you return t without O reading t
     (chain 1: which book ... t; chain 2: O ... t; the two chains are compounded)
This view is based on the apparent island-sensitivity of parasitic gaps, that is, the fact that parasitic gaps are usually bad in islands:
(25) This is the man I interviewed t
     a. before telling you to give the job to e
     b. *?before reading [NP the book you gave to e] (CNPC)
(26) This is the man I interviewed t
     a. before telling you to give the job to e
     b. *?before asking you [which job to give to e] (Wh-island)
If the parasitic gaps are traces, which obey Subjacency, these facts are accounted for. If parasitic gaps are resumptive pro, on the other hand, there is no obvious explanation for the ungrammatical (25b) and (26b), because it is almost a characteristic property of resumptive pronouns that they make it possible to circumvent island conditions. I will argue below that the latter generalization holds for overt resumptive pronouns but not for empty resumptive pronouns. But first it should be noted that the chain composition approach, though attractive, meets with some obvious problems. First of all, it has always been a remarkable feature of parasitic gaps that they do violate island conditions under certain circumstances, particularly in the so-called subject cases (This is a man who everyone who knows e admires t; see Taraldsen (1981)). The island effects cannot always be found in the subject cases, so the chain composition approach does not really carry over to these cases. Secondly, it is not clear how chain composition is brought about. Suggestions to the effect that the second chain would be connected to the tail of the first chain by a kind of predication mechanism are extremely problematic. In ordinary predication, the subject c-commands the predicate, and vice versa. With parasitic gap constructions we find, in contrast, an anti-c-command condition (Chomsky (1982a)). Thus, the tail of the first chain (like t in (24)) never c-commands the hypothetical second chain (see chapter 6 (section 6.4) for further details). But even if these two problems can be solved, chain composition cannot explain why the second chain usually contains only NP gaps. In a normal A'-chain, almost any category (NP, PP, AP, or adjunct) occurs. Parasitic gaps are always NPs, and non-NPs give a typically bad result:
(27) a. *This is the man to whom we gave a present t without talking e
     b. *This is the champion against whom we fought t without yelling e
Adjuncts like how also give impossible sentences:
(28) *How did you solve the problem t without knowing that you could do e
These facts are predicted by the hypothesis that parasitic gaps are resumptive pro, because only NPs can be pro. Chain composition, in contrast, leaves unexplained the fact that parasitic gaps are always NPs. But how can we account for the island-sensitivity of parasitic gaps? Note, first, that according to the pro analysis, parasitic gaps can be embedded in islands. In the subject case, this is obvious. But also in the adjunct cases, parasitic gaps are already in islands (if there is no hidden operator). If this is the case, the reason (25b) and (26b) are so bad is because the parasitic gaps are embedded in two islands, the adjunct island (in both cases), and a complex NP (25b) and a Wh-island (26b). In languages like English, island constraints can be violated to some extent. As we will see later, it is possible to formulate the necessary conditions for these violations, conditions that can be met in English, Scandinavian, and Romance, but not in Dutch and German. It is, however, not possible to formulate sufficient conditions on the distribution of gaps in islands (pro). In English, for instance, it is very difficult to violate island conditions twice, even in cases that are quite tolerable with one island violation. Thus, the following island violations are not so bad: (29)
a. Which race did you express a desire to win t?
b. What books don't you remember who borrowed t from you?
If a second island is embedded, these sentences become considerably worse:
(29) a'. *Which race did you express a desire to meet the man who won t?
     b'. *What books don't you remember who knew who had borrowed t from you?
Such cases are analogous to (25b) and (26b), but here it is not possible to have recourse to the idea of chain composition. Pro, in other words, does not have the same free distribution as lexical resumptive pronouns. Contrary to the latter, it must be entirely identified by its antecedent, which is only possible in contexts that are not too complex (compare the specificity constraints on scope assignment). It is at present not possible to give an exact formulation of the degree of
complexity of the context that pro can tolerate (see note 5). But it is clear from (29) that more than one island may be too many. The fact that parasitic gaps cannot be embedded in too many islands is hardly surprising, given the fact that pro cannot be so embedded in other cases either (29). Chain composition cannot solve the problem in general (certainly not in (29)), and therefore the ungrammatical (25b) and (26b) do not form evidence in favor of extra hidden chains. In short, island-sensitivity cannot decide the status of parasitic gaps (trace or pro). There is still another argument against chain composition. Cinque (1983b) has argued that it is not always absolutely necessary to have a licensing trace for ec's with parasitic-gap-like properties. The following example (from Chomsky (1982a, (99c))) is a case in point: (30)
the man that I went to England without speaking to e
This sentence is marginal, but relatively acceptable compared to (31), in which the gap is a PP: (31)
the man to whom I went to England without speaking e
This is a familiar phenomenon by now (compare Belletti's observation discussed before). The gap in this case has the same properties as parasitic gaps - not only PPs, but also adjuncts resist extraction: (32)
*Why did he go to England without speaking to Bill e
The adjunct cannot be construed with the sentence embedded under without. Cinque also observes that the third diagnostic criterion, apparent lack of successive cyclic movement, is met in these constructions:
(33) *The student that I went to England without saying (that) e is intelligent
As before (cf. (20)), the sentence is bad with or without deletion of that. In other words, the gaps in these constructions have the same properties as parasitic gaps and other gaps embedded in islands. But note that since there is no licensing chain, the gap in (30) must be directly linked to the operator position that follows the head of the relative clause: (34)
the man Oj that I went to England without speaking to ej
In this case, we cannot have a second operator following without (as in (24)): (35)
the man Oj I went to England without Oj speaking to ej
If the rightmost Oj completes the chain, as in (24), the leftmost Oj is a vacuous operator, which is not permitted by any theory. So, example (30) obliterates the idea of chain composition. In conclusion, then, it seems to me that the Cinque-Obenauer approach is the only current theory that explains the systematic difference between traces in chains, and the gaps that we find in islands and adjuncts.
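The reasoning of this section, as it applies to (18) and (23), can be summarized in a rough decision sketch. It is a simplification under stated assumptions: the function name `classify_gap`, the category labels, and the treatment of V and P as the structural governors (as assumed in the discussion of (18) and (19)) are illustrative choices made here. The sketch deliberately ignores the further restrictions discussed above, such as locally bound subjects and gaps embedded in more than one island.

```python
# Assumption of this sketch: V and P are the structural governors.
STRUCTURAL = frozenset({"V", "P"})


def classify_gap(path_governors, gap_category):
    """Classify an empty category as 'trace', 'pro', or 'excluded'.

    path_governors: categories of the governors between the gap and its
    antecedent. gap_category: 'NP' (argument), 'PP', or 'adjunct'.
    """
    # WCC-style condition: a chain needs structural governors throughout.
    if path_governors and all(g in STRUCTURAL for g in path_governors):
        return "trace"
    # Otherwise the gap can only be an empty resumptive pronoun,
    # and only argument NPs can be pro.
    if gap_category == "NP":
        return "pro"
    return "excluded"


# (18a): NP gap inside a complex NP headed by the N claim.
print(classify_gap(["V", "N", "V", "P"], "NP"))
# (18b): PP gap in the same configuration.
print(classify_gap(["V", "N", "V"], "PP"))
# (23): adjunct gap (why) inside a complex NP.
print(classify_gap(["V", "N", "V"], "adjunct"))
```

On these inputs the sketch returns pro for (18a) and excludes (18b) and (23), in line with the contrasts reported in this section.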
4.4. The parametrization of dynasties
If we tentatively assume that there is a clear distinction between traces in chains and pro in other contexts, we have to solve a demarcation problem: where do we find traces, and where do we find pro? Cinque (1983b) has solved this problem in terms of his Well-formedness Condition on Chains and Kayne's Connectedness Condition. According to this view, traces are only acceptable if all g-projections that connect trace and antecedent are complements of structural governors. Pro only meets the weaker conditions on g-projections as expressed by Kayne's CC: the successive g-projections are not necessarily structurally governed. The only requirement is that the g-projections are on a left branch or a right branch, depending on the canonical government configuration of a language. Although I agree to a large extent with the view that pro depends on the CC, it seems to me that the WCC is too weak. Moreover, the CC and the WCC overlap too much. I would therefore like to propose a stronger condition on chains, one that does not overlap with the CC. The condition on chains that I would like to propose is the Bounding Condition of Koster (1978c, 1979), discussed in chapter 1 and repeated above as (5a). This view entails that in the unmarked case, each trace of a chain is bound in its minimal domain β (in the sense of (5a)). Thus, under normal successive cyclic movement this condition is met: (36)
Whoi do you think [ti that he said [ti that he saw ti]]
Assuming that the traces in COMP are accessible for government from the higher verbs, each trace is bound in its minimal domain, the minimal governing S'. Nothing further has to be stipulated with respect to the nature of the dynasty involved. In (36), there is a dynasty of Vs (structural governors in accordance with Cinque's WCC), but this fact does not have to be stipulated because it follows from the simplest domain definition, namely (5a). The Complex NP Constraint violations cannot involve traces (a chain) because it is not possible to provide an antecedent for each t in its minimal domain: (37)
a. Whoj do you believe [NP the claim [S' tj that [Bill saw tj]]]
The rightmost trace is bound in its minimal domain S', but the intermediate tj in COMP cannot be bound in its minimal domain. If nonstructural governors like N cannot penetrate another projection at all (see chapter 3), this trace remains ungoverned so that it does not have a domain. If the N does govern this trace, it is not bound in its minimal domain because the NP lacks a landing site (a COMP). The elements (whoj, tj, tj) therefore do not form a chain in (37a). But then there is only one other way to connect the rightmost tj with its antecedent whoj, namely by interpreting it as pro. As we saw in the preceding section, the properties of pro are exactly the properties of gaps that we find in complex NPs. Cinque (1983b) gives only one kind of fact for which the WCC seems to give different predictions from the simpler and more general Bounding Condition (5a). Cinque notes that his WCC predicts the possibility of extraction from the following context in English: (37)
b. ... [VP V [PP P [S' ... e ... ]]] ...
If a Wh-phrase is moved from the position e to a position outside of the VP, a chain is possible according to the WCC. This is so because the dynasty from e to its antecedent would only involve structural governors, namely the P and V in (37b). Chain formation of this kind would be inconsistent with the Bounding Condition, because an intermediate trace in (37b) could not be bound in the PP in (37b). Given the fact that PP extraction always involves chains, the following examples (Cinque's (67)) might confirm the validity of the WCC:
(38)
a. the girl to whomj he was counting [PP on [S' [S PRO giving a present tj]]] ...
b. the man from whomj we were looking forward [PP to [S' [S PRO receiving a letter tj]]] ...
c. the man to whomj they insisted [PP on [S' [S PRO sending an invitation tj]]] ...
These examples are grammatical, and the WCC seems to be confirmed. Note, however, that these examples are not overwhelmingly convincing. In all cases, there is an alternative analysis that is consistent with the Bounding Condition. Suppose that the V and the adjacent P in (37) and (38) were to undergo "reanalysis", so that a complex verb would be formed (see chapter 5 for an alternative account of "reanalysis"):
(39) ... [VP [V P]V [S' tj [ ... tj ... ] ...
In that case, the complex V could govern the intermediate trace in COMP, and the Bounding Condition would be met as in (36). There are good reasons to assume that this is what we actually find in (38). Van Riemsdijk (1978) has argued that pseudo-passives, contrary to
other cases of preposition stranding, involve complete reanalysis of V and P. All Ps in (38) can be reanalyzed with the V as complex verbs in pseudo-passives:
(40)
a. It cannot be counted on t
b. That was looked forward to t
c. That course was insisted on t
In these cases, the t must be governed by the preceding "complex verb". Since the elements in question can undergo complete reanalysis, there is no reason to assume that this option is excluded for (38). Let us conclude therefore that the cases of chain formation considered so far are consistent with the Bounding Condition as a criterion that distinguishes chains from pro-binding. Before giving positive evidence for this view (which is inconsistent with the WCC), I would first like to make a few remarks about the general nature of dynasties.
Dynasties come in various degrees of perfection. A perfect dynasty would be a set of governors that are all of the same kind. Thus, the kind of dynasty that we find in Icelandic long reflexivization is perfect because it consists of Vs and nothing but Vs:
(41) ... V [ ... V [ ... V [ ... V [ ... V ... ]]]] ...
Even with only Vs there might be degrees of perfection, because there are different kinds of verbs, such as infinitives and finite verbs. A slightly lower degree of perfection is exhibited by the kind of dynasty involved in Cinque's WCC. This type of dynasty involves structural governors only:
(42) ... V [ ... P [ ... V [ ... P [ ... V ... ]]]] ...
An even less perfect dynasty involves lexical governors only (Longobardi (1985)):
(43) ... V [ ... N [ ... A [ ... P [ ... V ... ]]]] ...
The least perfect kind of dynasty, a really decadent dynasty, is the dynasty that defines percolation in Kayne's Connectedness Condition. Here, the only requirement is uniformity of the direction of government (canonical government). But apart from this faint family resemblance, there is little uniformity; not even lexical government is required:
(44) ... V [ ... NP [ ... V [ ... VP [ ... V ... ]]]] ...
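The four degrees of perfection in (41)-(44) can be restated as a small classification procedure. The following Python lines are only an illustrative sketch, not part of the theory itself: the names `Perfection` and `classify_dynasty` are hypothetical, governors are represented as plain category labels, maximal projections such as NP or VP stand in for nonlexical governors, and uniform direction of government is assumed to have been checked elsewhere.

```python
from enum import IntEnum


class Perfection(IntEnum):
    """Degrees of perfection, ordered from least to most perfect."""
    UNIFORM_DIRECTION_ONLY = 0  # as in (44): not even lexical government
    LEXICAL = 1                 # as in (43): lexical governors only
    STRUCTURAL = 2              # as in (42): structural governors only
    SINGLE_CATEGORY = 3         # as in (41): governors all of the same kind


# Assumption: V and P are the structural governors; V, N, A, P the lexical ones.
STRUCTURAL_GOVERNORS = {"V", "P"}
LEXICAL_GOVERNORS = {"V", "N", "A", "P"}


def classify_dynasty(governors):
    """Return the degree of perfection of a sequence of governor labels."""
    categories = set(governors)
    if not categories:
        raise ValueError("a dynasty needs at least one governor")
    # Assumption: "perfect" means one single lexical category throughout.
    if len(categories) == 1 and categories <= LEXICAL_GOVERNORS:
        return Perfection.SINGLE_CATEGORY
    if categories <= STRUCTURAL_GOVERNORS:
        return Perfection.STRUCTURAL
    if categories <= LEXICAL_GOVERNORS:
        return Perfection.LEXICAL
    return Perfection.UNIFORM_DIRECTION_ONLY


print(classify_dynasty(["V", "V", "V", "V", "V"]).name)
print(classify_dynasty(["V", "P", "V", "P", "V"]).name)
print(classify_dynasty(["V", "N", "A", "P", "V"]).name)
print(classify_dynasty(["V", "NP", "V", "VP", "V"]).name)
```

Because the degrees are ordered, two dynasties can be compared directly, which is the kind of comparison the discussion of relative acceptability relies on.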
I would now like to argue that the acceptability of gaps is in part a function of the dynasty that connects them to their antecedents. It has, for instance, often been observed that CNPC violations are worse with
relative clauses than with N-complements. Assuming that relative clauses have the structure [NP NP S'], this can be seen as a function of the perfection of the dynasties: (45)
a. complement Ns: ... V [ ... N [ ... V t ... ] ...
b. relative clauses: ... V [ ... NP [ ... V t ... ] ...
Similarly, most parasitic gap constructions involve imperfect dynasties. The adjunct cases, for instance, might have the following structure, so that extraction from an adjunct always involves government of a g-projection by a nonlexical category (VP in this case):
(46)
[Tree diagram; recoverable node labels: NP, VP, Adjunct]
This accounts for the fact that adjunct parasitic gaps are always somewhat marginal. The same holds for constructions like the following: (47)
the man that I went to England without speaking to e
The upshot of this discussion is that the marginality of these constructions is not only due to the fact that they involve pro, but also due to the fact that their dynasty is far from perfect. Consider now simple cases of preposition stranding in English: (48)
[S' What [are you talking [PP about e]]]
It is with respect to these constructions that the WCC and the Bounding Condition make different predictions. According to the WCC, e is a trace because chain formation is possible: the dynasty involves the P about and the V talk, both structural governors. According to the Bounding Condition approach to the demarcation problem, the e is not a trace but a pro: the e cannot be bound in its minimal domain (PP). How can we test these predictions? Before giving some possibly crucial evidence, I would like to recall some facts discussed by Ross (1967). Elaborating certain observations by Kuroda (1964), Ross points out that certain nouns like time, way, manner, place, etc. cannot be pronominalized (Ross (1967, 4.203-204)):
(49)
a. My sister arrived at a time when no buses were running, and my brother arrived at a time when no buses were running too
b. *My sister arrived at a time when no buses were running and my brother arrived at one too
(50)
a. Jack disappeared in a mysterious manner and Marian disappeared in a mysterious manner too
b. *Jack disappeared in a mysterious manner and Marian disappeared in one too
(51)
a. I live at the place where Route 150 crosses Scrak River and my dad lives at the place where Route 150 crosses Scrak River too
b. *I live at the place where Route 150 crosses Scrak River and my dad lives at it too
Ross then goes on to point out that in these cases prepositions cannot be stranded (p. 4.205): (52)
a. *What time did you arrive at e?
b. *The manner which Jack disappeared in e was creepy
c. *The place which I live at e is the place where Route 150 crosses Scrak River
On the basis of these facts, Ross (p. 4.206) comes to a conclusion that is very interesting from the present perspective (recall that NP = PP for Ross): (53)
No NP whose head noun is not pronominalizable may be moved out of the environment [P ___ ]NP
In other words, ec's must have pronominal features in this context. But this is exactly what we expect if ec's after stranded prepositions are pro. A really crucial test would be the following. The Bounding Condition approach entails that gaps in PPs (as in (48)) are pro, while the WCC approach entails that they are traces. The Bounding Condition approach, then, predicts that PPs cannot be extracted from PPs (cf. Belletti's observation, discussed above). Thus, the following configuration is allowed by the WCC but excluded by the BC:
(54)
Whj [ ... V ... [PP P [PP ej ]] ... ] ...
The path from ej to Whj involves the structural governors P and V, so the linking can be considered a chain according to the WCC. The BC, on the other hand, excludes chains of this type because the ej is not bound in its minimal domain (PP). The divergent predictions can be tested with Ps that take a PP complement (PP → P PP). Although such PPs exist in English (Jackendoff (1977), Van Riemsdijk (1978)), the prediction is not easy to test, because there are not too many bona fide prepositions that can be stranded and that take PP complements. Particles must be excluded, and so must the possibility that the preposition is assimilated to the verb (as we saw in connection with (38)). But to the extent that these pitfalls can be circumvented, the prediction of the BC approach seems to be correct. Van Riemsdijk (1978, 146) gives examples like the following:
(55)
a. They took a shot at him [PP from [PP behind the car]]
b. *[Behind the car]j they took a shot at him [PP from [PP ej]]
Other examples are: (56)
a. They heard the noise [PP from [PP inside the barn]]
b. *[Inside which barn]j did they hear the noise [PP from [PP ej]]?
(57)
a. They sent him [PP down [PP into the valley]]
b. *[Into which valley]j did they send him [PP down [PP ej]]?
So, to the extent that the evidence is representative, the BC approach seems to be vindicated. In the next section, I will show that evidence from Dutch points in the same direction. Let us tentatively conclude, therefore, that gaps in preposition stranding contexts are not traces but pro and that PPs are absolute barriers for "movement" (chain formation). Thus, both the following structures are cases not of movement but of pro-linking:
(58)
a. Whatj are you talking [about proj]
b. the man Oj that I went to England [without speaking to proj]
The relative acceptability of (58a) in comparison with (58b) has to do with the relative perfection of the dynasty in (58a): the dynasty in (58a) involves only structural governors, while the adjunct phrase (without ...) in (58b) is not even lexically governed. Problematic for the BC approach are the well-known violations of the Wh-island condition in Italian, studied by Rizzi (1978b):
(59)
Tuo fratello, a cuij mi domando che storie abbiano raccontato tj era molto preoccupato
your brother to whom I wonder which stories they have told was very troubled
The problem is that a PP has been extracted out of a Wh-island. Since the BC cannot be met (the tj is not bound in its minimal S'), we would expect a pro under our hypothesis. But since the trace is a PP it cannot be pro. The same can be observed in those dialects of English in which the following example is grammatical:
(60)
the man to whomj I wondered [S' which storiesi [PRO to tell ti tj]]
If extraction of PP is a diagnostic criterion for chain formation, we must conclude that domains for traces can be bigger than the minimal domain of (5a). In fact, these problems suggest a new approach to parametrization, based on the dynasty concept. The basic idea is that not bounding nodes but dynasties form the primary locus of parametrization. According to the application of the BC sketched earlier, chains are characterized by the BC, the simplest domain principle of UG (5a). But it is clear, as we saw for Icelandic and other Germanic languages, that not only the binding of pro but also the binding of anaphors can involve extended domains with dynasties (in the sense of (5c)). As we saw before, the binding of pro conforms to the rather permissive dynasty of Kayne's Connectedness Condition. But nothing in our theory prohibits an independent domain extension for traces. It seems to me that the relevant domain extension is similar to the domain extensions that we see for anaphors in certain languages. Exactly as in the cases of long reflexivization, the domain extension for traces is a parametric option based on the category V. If we assume that the BC involves exactly one governor X⁰, for instance V (61a), we may hypothesize that several languages, including Italian and certain varieties of English or even Dutch (see chapter 1), can extend the bounding domain by a dynasty of Vs (61b):
(61)
a. Bounding Condition: (V)
b. Extended Bounding Condition: (V1, ..., Vi, ..., Vn)
This means that in the unmarked case, a bounding domain is defined by exactly one governor (V in (61a)). In the marked case, the domain is defined by a dynasty of Vs (61b). As it stands, (61b) is too permissive for the facts of Italian that we find in Rizzi (1978b). This problem can be solved in two ways. Either we could stipulate an upper bound on certain dynasties, or we could maintain (61b) and hypothesize that the limitations on the size of the dynasties in Italian are caused by complexity factors independent of the bounding theory. Although I tend to opt for the latter solution, I will show that Rizzi's major data can be handled by a dynasty of exactly two Vs. Let us first note that an adjunct cannot be extracted from the simplest Wh-islands in Italian, as observed by Huang (1982, ch. 7, data provided by Rita Manzini):
(62)
a. *Questo è la ragione per la quale mi chiedo che cosa ho comprato tj
   this is the reason for whichj I wonder what I bought tj
b. *Questo è il modo nel quale mi chiedo che cosa ho comprato tj
   this is the way in whichj I wonder what I bought tj
If we assume with Huang that only governed complements trigger percolation, these facts are easily explained. With a dynasty of only Vs, the trace itself and each successively higher domain containing it must be governed by a V. This condition is not met in (62): the trace of the adjunct (tj) is not governed by V, and is definitely not a complement of the embedded verb. Note that the sentences in (62) would also be ruled out if we had pro instead of trace, since adjuncts do not have pros. But the pro-linking approach is irrelevant for cases like (62), because PPs can be extracted from the Wh-islands in question (see (59)). This shows that we are dealing with a trace domain in (62) and not with a pro domain. As said before, most of Rizzi's data can be accounted for by stipulating that the Extended Bounding Condition involves a dynasty of exactly two Vs. What this would come down to is that traces in Italian must be bound in their domain governing category in the sense of Manzini (1983a). Two sets of data are relevant in this respect. First of all, Rizzi has shown that violation of more than one Wh-island leads to ungrammaticality:
(63) *Questo incarico che non so proprio chi possa avere indovinato a chi affiderò, mi sta creando un sacco di grattacapi
this task that not I-know really who might have guessed to whom I-will-entrust is getting me into trouble
The structure of such double Wh-island violations is as follows:
(64)
Oi ... V [whj [tj ... V [whk [ ... V ti tk]]]]
Clearly, the path from ti to Oi involves a dynasty of three Vs, which is one more than the maximally permitted number of two. Another example discussed by Rizzi is the following. Extraction of a relative pronoun from a clause introduced by a Wh-phrase gives a reasonably acceptable result ((65a), corresponding to Rizzi's (17a)), while the sentence becomes ungrammatical if the Wh-phrase does not introduce the clause from which the relative pronoun is extracted, but the next clause up ((65b), corresponding to Rizzi's (17b)):
(65)
a. Oi [S ... V [S' ti [S ... V [S' whj [S ... V tj ti]]]]]
b. *Oi [S ... V [S' whj [S ... tj ... V [S' ti [S ... V ti ... ]]]]]
Sentences illustrating this contrast are the following (Rizzi's (18a) and (18b)) (although (66b) can be made virtually acceptable by slight stylistic modifications according to Guglielmo Cinque (personal communication)):
(66)
a. Il mio primo libro che credo che tu sappia a chi ho dedicato mi è sempre stato molto caro
   my first book which I believe that you know to whom I have dedicated to me has always been very dear
b. *Il mio primo libro che so a chi credi che abbia dedicato mi è sempre stato molto caro
   my first book which I know to whom you believe that I have dedicated to me has always been very dear
These facts are explained by Rizzi by stipulating that S' is the bounding node for Subjacency in Italian. In (65a), the path from the rightmost ti to Oi never passes more than one S', while in (65b) there is no such path: the linking of the ti in the rightmost COMP to Oi passes more than one S'. Note however that the same facts are explained by the hypothesis that a dynasty for traces in Italian involves at most two Vs. In (65a), this is immediately obvious: the path connecting the two traces contains only two Vs. The second example (65b) looks somewhat more problematic at first sight, because the path connecting the two ti's contains only one V, while the path connecting the intermediate ti to the operator Oi contains two Vs. But note that the latter path does not contain a well-formed dynasty, if we assume in the spirit of Huang (1982) that this type of percolation can only be triggered by complements. Since the intermediate ti in COMP is not a complement, it cannot be the basis of a dynasty. Only the rightmost ti is a complement that can trigger a dynasty. But the path from this trace to Oi contains three Vs, one more than permitted.⁶
All in all, it seems to me that the basic contrasts in Italian are explained by the hypothesis that a trace is either bound in its minimal domain, or in a domain defined by a dynasty of not more than two Vs. In spite of the fact that stipulating an upper bound to dynasties for Italian presumably leads to a certain degree of descriptive adequacy, I have a preference for the second option mentioned above. According to this alternative, Italian traces are either locally bound, or bound in a domain defined by any number of Vs. In this way, we do not have to complicate the domain definitions given under (5): domains are either strictly local or recursively expanded to the extent that a dynasty (of some type) can be formed.
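The two options for Italian (a dynasty of at most two Vs, or a dynasty of any number of Vs) differ only in whether an upper bound is imposed. A minimal Python sketch, under the assumptions that a dynasty is represented as a list of governor labels along the path, that only complements can trigger it (in the spirit of Huang), and that the hypothetical function name `trace_dynasty_ok` and its parameters are purely illustrative, makes the difference explicit:

```python
from typing import Optional, Sequence


def trace_dynasty_ok(governors: Sequence[str],
                     trigger_is_complement: bool,
                     max_vs: Optional[int] = 2) -> bool:
    """Check whether a trace can be linked through a dynasty of Vs.

    governors: labels of the successive governors on the path.
    trigger_is_complement: whether the lowest trace is a governed complement.
    max_vs: upper bound on the dynasty; None means no upper bound.
    """
    if not trigger_is_complement:
        return False          # adjuncts and traces in COMP trigger no dynasty
    if not governors:
        return False          # the trace itself must be governed by a V
    if any(g != "V" for g in governors):
        return False          # a dynasty for traces consists of Vs only
    if max_vs is not None and len(governors) > max_vs:
        return False          # bound exceeded (first option only)
    return True


# Path with two Vs and a complement trigger, as described for (65a).
print(trace_dynasty_ok(["V", "V"], trigger_is_complement=True))
# Path with three Vs, as described for (64): excluded under the bound of two,
print(trace_dynasty_ok(["V", "V", "V"], trigger_is_complement=True))
# but admitted if no upper bound is stipulated (second option).
print(trace_dynasty_ok(["V", "V", "V"], trigger_is_complement=True, max_vs=None))
# Adjunct trace, as in (62): no dynasty can be triggered at all.
print(trace_dynasty_ok(["V"], trigger_is_complement=False))
```

Under the second option the degraded status of multiple Wh-island violations is not computed by such a check at all; it would have to come from independent complexity factors.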
The fact that it is difficult to violate more than one Wh-island in Italian must then be due to complexity factors of various kinds. In fact, there is some evidence for this second option. First of all, decrease of acceptability with multiple Wh-island violations seems to be more gradual than the other approaches suggest (Guglielmo Cinque, personal communication). Furthermore, Rizzi observes that violations of more than one Wh-island lead to more acceptable results if the dynasty (in our terms) involves infinitives. This fact is not easily accounted for if there is
a sharp cut-off point after two bounding nodes or a dynasty of exactly two Vs. In general, it seems to me that domain extensions for traces (or for any other dependent element) are not defined in terms of bounding nodes, but in terms of the nature of the successive domain governors. If this is true, dynasties form one of the primary loci of parametrization among languages. This was already clear from the study of reflexivization in various languages. Thus, domains for reflexives can often be expanded if there is a dynasty of infinitives. Similarly, it has often been observed that Wh-islands can be relatively easily violated in many dialects of English if there is a dynasty of infinitives. Reinhart (1975), for instance, gives the following examples:
(67)
a. What don't you know when to file?
b. What don't you know how long to boil?
c. What don't you know where to put?
Reinhart even gives examples of Wh-island violations from tensed sentences that appear to be acceptable to many speakers: (68)
What books don't you remember who borrowed from you?
Given these facts, one might wonder what the difference between English and Italian really is. S/S' parametrization for Subjacency is obviously not sufficient. And if we accept the dynasty approach, we face the same problem: both English and Italian allow Wh-island violations in certain contexts, so for both languages we have to define an extended domain. That the dynasty approach is adequate for examples like (67) appears from the fact that we cannot extract adjuncts (as observed in Koster (1978c, 198)). If we reverse the Wh-phrases in (67), the sentences become ungrammatical (cf. the Italian sentences in (62)):
(69)
a. *When don't you know what to file t?
b. *How long don't you know what to boil t?
c. *Where don't you know what to put t?
In short, there appears to be considerable overlap between the Wh-island behavior of English and of Italian. So, again, where are the differences? The following data from Chomsky (1973, 69) seem to be relevant for a potential answer to this question: (70)
a. *What booksi does John know to whomj to give ti tj?
b. *To whomj does John know what booksi to give ti tj?
The judgments given for (67)-(69) seem to hold for most varieties, but with respect to (70) there might be a dialect split: some speakers accept (70) while others reject these sentences. Note also that the dynasty approach does not rule out (70) in an obvious way. The sentences in (69) were no problem from this point of view, because adjuncts are not governed by Vs so that there was no way to trigger a percolation domain. In (70), in contrast, all traces are governed by the verb give. So, why are the sentences in (70) bad in certain varieties of English? I would like to suggest the following answer to this question. The domain of empty categories is determined by a governor, or, in the case of a dynasty, by a chain of governors. Let us suppose now that there is a uniqueness condition that says that each governor can define the domain of exactly one empty category. If both traces in (70) are governed by the verb give, this uniqueness condition would be violated because the governor would determine the domain of more than one ec (ti and tj). Along similar lines, we would have an explanation for the following ungrammatical sentence:
(71)
*Whati do you wonder whomj to give ti to ej?
The ec corresponding to whom (i.e. ej) must be pro according to our earlier account of ec's inside PPs. This entails that there must be a weak dynasty (in the sense of Kayne's CC) that connects this ec to its antecedent whomj. This dynasty consists of the successive governors P (to) and V (give). So, the verb give defines (or co-defines) the domain of an ec. But the same verb give also defines - as its governor - the domain of ti. Clearly, this is a violation of the uniqueness condition, which says that a governor can determine the domain of only one ec. We are now able to account for all of the following contrasts (discussed by Chomsky in class lectures, fall 1983):
(72)
a. This is a paper Oi that we really need to find someonej [whoj [tj understands ei]]
b. ?*This is a paper Oi that we really need to find someonej [Oj that [we can intimidate tj with ei]]
c. *This is the reason whyi we really need to find someonej [Oj that [we can intimidate tj with this paper ei]]
d. ?*This is a paper Oi that we really need to find someonej [whoj [we can all agree [tj [tj understands ei]]]]
The ungrammaticality of sentence (72c) is explained by the Cinque-Obenauer theory discussed before. Since the ei is not bound in its minimal domain, it must be a pro. But only NPs can be pro, so that the adjunct why does not qualify as an element that binds pro. The other two ungrammatical sentences are explained by the uniqueness condition. In (72b), the verb intimidate defines two domains, namely the domains of tj and ei (the dynasty of ei involves the verb, which also defines the domain of tj). Similarly, in (72d) the verb agree co-determines the domains of the intermediate trace tj, and of ei. In contrast, the verb understand in (72a) only determines the domain of ei. The domain of tj is not determined by any lexical governor at all, but by the antecedent governor whoj in COMP or by INFL. It is, in other words, not necessary under the uniqueness explanation to adopt the idea of Leland George that who remains in its (D-structure) subject position at S-structure (followed only by movement to COMP at LF). Let us assume, then, that the domain of A'-bound empty categories is always determined by a unique governor.
The uniqueness principle suggests a new solution for the English-Italian difference (to the extent that there is a difference). As we saw before, most varieties of English allow some form of Wh-island violation (see (67)-(68)). Problematic is the impossibility of the sentences in (70) in some varieties. In these sentences, the VP contains two traces, in apparent violation of the uniqueness principle. The same problem arises in the sentences discussed by Rizzi, particularly in examples like (59) (repeated here for convenience):
(73)
Tuo fratello, a cuij mi domando che storiei abbiano raccontato ti tj era molto preoccupato
This type of sentence also has two traces in the VP. How can we reconcile this with the uniqueness principle? We would not like to parametrize the uniqueness principle itself. It looks natural, and it loses its force if it is not universal: something is unique or it is not. I would therefore like to propose an alternative solution that parametrizes not the uniqueness principle but the scope of governors. Consider the structure of the predicate in English: (74)
[Tree diagram, rendered as a labelled bracketing: [VP [VP [V' V [NP what]] [PP to whom]] [Adjunct when]]]
The adjunct is weakly governed by the adjacent VP; the PP is governed by V', and the NP by V. Alternatively, it might be the case that the V governs both the NP and the PP. Suppose now that this is a matter of parametrization: in some languages (as in some varieties of English) the V governs both the NP and the PP, while in other languages (like Italian) the NP is governed by V and the PP by V'. We have in principle, then, a way to circumvent the uniqueness principle if the domain extensions for traces are determined by a dynasty of Vs or V's. Take the following example again:
(75)
To whomj do you wonder [whati [to [VP [V' [V give] ti] tj]]]
If only the embedded V governs, the dynasty for tj is
(76)
*[VP V t (P) t']
This filter only applies in languages with uniform V-dynasties, in which the V governs both the first trace t and the (domain of the) second trace t'. In languages that lack the filter, the second trace (or its PP domain) is governed by V'. The parametrization only concerns the scope of government and therefore of the filter (76). The uniqueness principle is not parametrized but universal. It explains the filter (76) and also other data like (72d), repeated here for convenience:
(77) ?*This is a paper Oi that we really need to find someone [whoj [we can all agree [tj [tj understands ei]]]]
This sentence does not show the filter configuration (76) at all, and yet it is explained by the same uniqueness principle (the verb agree (co-)determines the domains of both tj and ei). One minor problem to conclude with. The uniqueness principle (or the filter (76)) has no bearing on configurations with traces from Wh-movement and NP-movement mixed together:
(78)
Whoj was iti given ti to tj?
Parametrization does not help here, because such sentences seem grammatical for all dialects. This means that either NP-movement does not exist (does not leave a trace) or that the uniqueness principle only holds for uniformly bound empty categories. Let us then assume the following form for the uniqueness condition:
(79)
Uniqueness Condition
A governor γ can (co-)define the domain of one and only one X-bound empty category
where "X-bound" means either A-bound or A'-bound. But even with this formulation, many problems remain. Guglielmo Cinque has, for instance, pointed out that (79) incorrectly rules out (67c), in which both Wh-phrases are subcategorized by put. Of course, there are differences between (67c) and (70), since the former example involves the adjunct-like where. In spite of this, it seems to me that (79) is not entirely satisfactory as it stands. In summary, we have come to the following conclusions. The behavior of A'-bound empty categories is determined by two dynasties, (80a) and (80b):
(80)
a. ... Ki [ ... Ki [ ... Ki [ ... Ki ec ... ]]] ...   (i ≤ 3)
b. ... V [ ... V [ ... V [ ... V ec ... ]]] ...
In all cases, the unmarked behavior of the ec's is determined by the Bounding Condition (5a). The dynasties in (80) define domain extensions (in the sense of (5c)) beyond the domain defined by the Bounding Condition. The first dynasty (80a) defines the domain of empty pro.⁷ This dynasty does not require lexical government but only uniformity of government (in the sense of Kayne's CC). The second dynasty does require lexical governors of the type V (or V'). It is a (perhaps marked) extension of the Bounding Condition for traces. In contrast with (80a), uniformity of orientation of the governors is not necessary, as we saw in chapter 1. Thus, some dynasties require uniform orientation and some do not. In the next section, we will see the dramatic consequences of a lack of uniform orientation in a language like Dutch.
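The Uniqueness Condition, in the form that restricts it to uniformly bound empty categories, amounts to a simple bookkeeping check: no governor may (co-)define the domains of two empty categories with the same type of binding. The following Python sketch is only an illustration under explicit assumptions: the function name `uniqueness_violations` is hypothetical, each empty category is listed with its binding type ("A" or "A'") and with the governors that (co-)define its domain, and the governor lists for the examples are entered by hand from the discussion above rather than computed from a structure.

```python
from collections import defaultdict


def uniqueness_violations(domains):
    """Return the governors that (co-)define more than one domain.

    domains maps the name of an empty category to a pair
    (binding_type, governors), with binding_type "A" or "A'".
    The result maps (governor, binding_type) to the set of offending ec's.
    """
    defined_by = defaultdict(set)
    for ec, (binding_type, governors) in domains.items():
        for governor in set(governors):
            defined_by[(governor, binding_type)].add(ec)
    return {key: ecs for key, ecs in defined_by.items() if len(ecs) > 1}


# Wh-trace governed by "give" plus a pro whose dynasty runs through
# "to" and "give": the pattern described for the ungrammatical case.
both_a_bar = {
    "ti": ("A'", ["give"]),
    "ej": ("A'", ["to", "give"]),
}
print(uniqueness_violations(both_a_bar))

# Mixed case: an A-bound NP-trace and an A'-bound ec sharing the verb.
mixed = {
    "ti": ("A", ["given"]),
    "tj": ("A'", ["to", "given"]),
}
print(uniqueness_violations(mixed))
```

Keying the check on the pair of governor and binding type is what keeps passives with Wh-extraction out of the reach of the condition; dropping the binding type from the key would give the stronger, unrestricted version.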
4.5. Global harmony
The basic hypothesis of this section is that governed empty categories are, in the unmarked case, bound in their minimal NP, PP, AP, or S', i.e. in accordance with the Bounding Condition (5a). Domain extensions for governed empty categories always have the form (5c) and are only possible under very strict conditions that can be met in some languages but not in others.
As discussed before, domain extensions are defined by dynasties. I will assume with Kayne (1983) that domain extensions for empty categories are determined by directionality constraints. Directionality in Kayne's Connectedness Condition is defined by the notion "canonical government configuration": canonical government is rightward if in the grammar of the language, V governs NP to its right; it is leftward if V governs NP to its left. In English, V precedes the NP it governs so that canonical government is to the right. In Japanese, the V follows the NP it governs, which means that canonical government is to the left in this language. If an empty category is bound to an antecedent outside of its minimal g-projection (which I take to be NP, PP, AP, or S'), the fundamental orientation must be preserved: all subsequent g-projections (up to the one governed by the antecedent) must be either on left branches or on right branches, depending on the canonical government orientation of the language. In general, the maximal g-projection of an ec must contain the antecedent of the ec. If this is not the case, there is one way to save the ec: its maximal g-projection must form a subtree with the g-projection of another ec bound to the same antecedent, provided that the maximal g-projection of the latter ec includes the antecedent. In our terms, the subtrees of the Connectedness Condition are compounds of extended domains defined by dynasties of uniform orientation, where "uniform orientation" is determined by the notion "canonical government configuration". In what follows, I will deviate from Kayne's theory in two respects. First of all, I will show that the notion "canonical government" is too rigid for a language with mixed branching patterns like Dutch. I will replace "canonical government" by the related, but slightly more flexible notion "global harmony". In the next section (4.6), I will deviate from Kayne's theory in a more fundamental way.
Kayne's CC not only constrains the grammar of ec's but also the grammar of Wh-scope and negation. The directionality facts of Dutch seem to contradict this view: neither the scope of Wh-in situ nor wide-scope negation is determined by the directionality conditions that we find for gaps. According to the hypothesis of global harmony, a language does not have a canonical government configuration. In some languages, like Japanese, the direction of government of the various lexical heads is uniform, while in other languages, like Dutch, some lexical heads govern in one direction and others govern in the opposite direction. In Kayne's g-projections, directionality is uniform and in accordance with the canonical government configuration of the language. Global harmony simply drops the latter assumption, while preserving the former: the only requirement for g-projections is uniform orientation of government, either to the left or to the right. In languages with uniform branching, the two notions make the same predictions. In languages with mixed branching, the predictions are different, as we will see. To summarize, I would like to propose a theory with the following two elements:
(81)
a. Condition of Global Harmony
An empty category is bound in accordance with the Condition of Global Harmony iff it meets the Connectedness Condition (with "uniform orientation" replacing "canonical government").
b. Binding of Empty Categories
A governed empty category is:
(i) either bound in its minimal g-projection (NP, PP, AP, S')
(ii) or bound in accordance with the Condition of Global Harmony.
I will now illustrate this theory, mainly on the basis of Dutch. The following facts of Dutch are explained by (81):
(82)
a. b. c.
the fact that only postpositions are stranded the (near) absence of parasitic gaps the strict character of islands
First of all, I will discuss preposition stranding in Dutch. In an earlier paper, I tried to relate the difference between English and Dutch with respect to preposition stranding to the difference in basic word order between the two languages. In an SVO language like English, prepositions can be stranded, while in an SOV language like Dutch, only postpositions can be stranded; prepositions cannot (Koster (1978b, 576-577)). The earlier proposals, made essentially in terms of basic word order, can easily be translated into the present framework, in which "basic word order" is replaced by the closely related notion "direction of government". An additional assumption that I will make is that in a language with mixed branching, like Dutch, government is both to the left and to the right. This assumption is in accordance with the fact that in the Dutch X'-system some types of complements occur on both sides of the head. Head-complement order is not completely free, however. NPs in argument position are always to the left of their governing V, for instance. So, it is more accurate to say that government is bidirectional unless stipulated otherwise (as for arguments). In Dutch, bidirectionality can be demonstrated with PPs: most Ps occur both as prepositions (government to the right) and as postpositions (government to the left). Normal NPs usually follow the P, while categories preceding the P usually take the form of a so-called R-pronoun (Van Riemsdijk (1978)): (83)
a. onder de tafel
   under the table
b. *de tafel onder
(84) a. *onder het
         under it
     b. er onder
        there under
The fact to be explained is that complements that follow the P (as in (83a)) cannot be extracted, while categories preceding the same P (as in (84b)) can: (85)
a. *Welke tafelj slaapt hij [onder tj]?
    which table sleeps he under
b. Waarj slaapt hij [tj onder]?
   where sleeps he under
These facts follow from the Condition of Global Harmony (under the further condition, which I am assuming throughout, that S' is the minimal g-projection of V): (86) a.
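[Tree diagram: welke tafelj ... hij [PP onder tj] slaapt, with P governing to the right and V governing to the left]
b. [Tree diagram: [+R]j ... hij [PP tj onder] slaapt, with P and V both governing to the left]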
In (86a) (corresponding to the ungrammatical (85a)), the P governs to the right (as indicated by the arrow); since the V governs the PP to the left, the Condition of Global Harmony is not met, so that the structure is ruled out. In (86b), however, the Condition of Global Harmony is met since both the P and the V govern to the left (as indicated by the arrows). In other words, (85b) is predicted to be grammatical. As it stands, the notion "canonical government" is not sufficient in these cases. Both in (86a) and (86b), the trace is governed by a structural governor (the P onder). In both cases, the minimal g-projection of this P, the PP, is on a left branch, in accordance with the canonical government configuration of Dutch (i.e. Dutch is OV; see Koster (1975) and Thiersch (1978)). This means that if we define the Connectedness Condition on the basis of the notion "canonical government configuration", both (85a) and (85b) are predicted to be grammatical. According to this account, only the orientation of the g-projections is relevant. 8 It appears, however, that the orientation of the g-projections must be in agreement with the direction of government of the structural governor. In a language with mixed branching, like Dutch, this structural governor can sometimes govern in two directions, hence the difference in grammaticality between (85a) and (85b). The account given here seems to make predictions very similar to the ones made by the account in terms of the Nesting Hypothesis in Koster (1978b). According to the latter account, ec's must be (either left or right) peripheral in the minimal category of type VI or Si (if they are not immediately dominated by such a category). I prefer the present account for two reasons. First, the present account is based on the dynasty concept that is also relevant for the domain extensions necessary for bound anaphors in many languages. The present account is therefore based on more general notions. Second, it is easier to extend the present account to parasitic gap constructions, a matter to which I will return shortly. An interesting matter, which I will not discuss here, is that the Nesting Hypothesis might follow from the Condition of Global Harmony as a theorem, if we make the further assumption that only binary branching is possible (Kayne (1984)). One feature of Koster (1978b) preserved in the present account is that preposition stranding is an essentially marked phenomenon (see also Van Riemsdijk (1978)). According to the unmarked domain definition (the Bounding Condition (5a)), PPs are islands. Extraction from PP involves the marked domain extension (5c). As said before, this domain extension is always defined by a dynasty of some sort. According to the present account, the dynasty for ec's is defined by the global harmony of the successive domain governors. This also entails that chains cannot be formed across PP boundaries.
According to the Cinque-Obenauer hypothesis cited earlier, only pros (and therefore only empty NPs) can be found inside PP islands. This hypothesis was confirmed by certain facts of English. How does it fare with respect to Dutch? Relevant cases are the PP complements of Ps. A PP complement can either follow or precede a PP. Moreover, there are two possibilities for the PP complements themselves: either they have the form of a P followed by an NP, or they have the form of an [+R] word preceding the same P. All in all there are four possibilities: (87)
a. [PP P [PP P NP]]
b. [PP P [PP [+R] P]]
c. [PP [PP P NP] P]
d. [PP [PP [+R] P] P]
In none of these four cases can the PP complement be extracted from the matrix PP, as predicted by the assumptions made earlier (based on the Cinque-Obenauer hypothesis). Only in (87a) and (87d) do the two Ps have the same orientation (they both govern to the right and to the left, respectively). So, only in these two cases can the Condition of Global Harmony in principle be met. Extraction, however, always involves a landing site in S or S', so that the path up to the landing site always passes through a node on a left branch. PP complements to Vs, for instance, precede these Vs in D-structure, so that the Vs govern the PPs to the left. This means that extraction is only possible in (87d), and not in (87a). Only in the former case can a dynasty of uniformly oriented governors be formed (P, P, V). Extraction of the NP in (87a) from the PP (as a VP complement) would involve a dynasty of the form (P, P, V), which does not meet the Condition of Global Harmony (the Ps govern to the right and the V to the left). All these predictions are borne out exactly. Let us therefore consider some examples (from Van Riemsdijk (1978)). First of all some illustrations of the patterns in (87): (88)
a. Hij kocht de cognac [pp voor [pp bij de koffie]]
   he bought the cognac for with the coffee
   'He bought the cognac to go with the coffee'
b. Hij kocht de cognac [pp voor [pp er bij]]
   he bought the cognac for there with
   'He bought the cognac to go with it'
c. Hij ging [pp[pp onder het viaduct] door]
   he went under the viaduct through
   'He passed through under the viaduct'
d. Hij ging [pp[pp er onder] door]
   he went there under through
   'He passed through under it'
As predicted by the Cinque-Obenauer approach (in conjunction with our assumptions about the extension of chains), extraction of the embedded PP is impossible in all four cases: (89)
a. *[pp Bij welke koffie]j kocht hij cognac [pp voor [pp tj]]
    with which coffee bought he cognac for
b. *[pp Waar bij]j kocht hij cognac [pp voor [pp tj]]
    where with bought he cognac for
c. *[pp Onder welk viaduct]j ging hij [pp[pp tj] door]
    under which viaduct went he through
d. *[pp Waar onder]j ging hij [pp[pp tj] door]
    where under went he through
I will now demonstrate how extraction of the NPs and the R-words is blocked in all cases of (87) and (88), except (87d) and (88d). In all cases of (88), Verb-second, a root transformation, has applied. The government orientation of the verb is only clear from the underlying verb-final order, which I will use in the following examples. As before, orientation will be indicated by an arrow underneath the governors. In the structure underlying (88a), the two Ps govern to the right, while the verb governs to the left: (90)
a. dat hij [vp de cognac [pp voor [pp bij de koffie]] kocht]
   that he the cognac for with the coffee bought
   (government: voor →, bij →, kocht ←)
Since the Condition of Global Harmony is not met, extraction of the NP de koffie is impossible, as predicted: (90)
a'. *Ik vraag me af welke koffie hij de cognac voor bij t kocht
     I wonder which coffee he the cognac for with bought
In the structure underlying (88b), the word er precedes the P bij. This P therefore governs to the left. The other P governs to the right, and the verb again governs to the left: (90)
b. dat hij [vp de cognac [pp voor [pp er bij]] kocht]
   that he the cognac for there with bought
   (government: voor →, bij ←, kocht ←)
Again, the arrows do not all point in the same direction. As in the preceding example, the Condition of Global Harmony is not met, so that extraction (of er in this case) is impossible: (90)
b'. *Ik vraag me af waar hij de cognac voor t bij kocht
     I wonder where he the cognac for with bought
The situation is similar with the third example, the structure underlying (88c): (90)
c. dat hij [vp[pp[pp onder het viaduct] door] ging]
   (government: onder →, door ←, ging ←)
As before, the Condition of Global Harmony is not met, so that extraction of the NP het viaduct is impossible: (90)
c'. *Ik vraag me af welk viaduct hij onder t door ging
     I wonder which viaduct he under through went
Only in the structure underlying (88d) do all arrows point in the same direction, so that the Condition of Global Harmony is met: (90)
d. dat hij [vp[pp[pp er onder] door] ging]
   that he there under through went
   (government: onder ←, door ←, ging ←)
As predicted, this is the only case where apparent extraction (of er) is possible: (90)
d'. Ik vraag me af waar hij t onder door ging
    I wonder where he under through went
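The reasoning behind (90a-d) can be restated as a small mechanical check. The following Python sketch is only an illustration and not part of the formal statement of the theory: it assumes that a dynasty can be represented as a list of (governor, direction) pairs, and it encodes nothing beyond the necessary condition of uniform orientation. The names `Direction`, `is_globally_harmonic`, and `DYNASTIES` are hypothetical, and the directions are the ones stated in the text for each example.

```python
from enum import Enum


class Direction(Enum):
    LEFT = "left"
    RIGHT = "right"


def is_globally_harmonic(dynasty):
    """Return True iff every governor in the dynasty governs in the same direction.

    `dynasty` is a list of (governor, Direction) pairs. An empty dynasty is
    trivially harmonic (assumption: no domain extension is needed then).
    """
    directions = {direction for _governor, direction in dynasty}
    return len(directions) <= 1


L, R = Direction.LEFT, Direction.RIGHT

# Governors of the gap in the verb-final structures (90a-d), as described in the text.
DYNASTIES = {
    "90a": [("bij", R), ("voor", R), ("kocht", L)],
    "90b": [("bij", L), ("voor", R), ("kocht", L)],
    "90c": [("onder", R), ("door", L), ("ging", L)],
    "90d": [("onder", L), ("door", L), ("ging", L)],
}

RESULTS = {label: is_globally_harmonic(d) for label, d in DYNASTIES.items()}

if __name__ == "__main__":
    for label, harmonic in RESULTS.items():
        status = "harmonic, extraction possible in principle" if harmonic else "not harmonic, extraction blocked"
        print(f"({label}): {status}")
```

Under these assumptions only the dynasty of (90d) passes the check, in line with the judgments in (90a'-d'). The check says nothing about whether a harmonic dynasty is actually sufficient for extraction.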
The facts of extraction from complex PPs in Dutch thus give strong support to the Cinque-Obenauer hypothesis (only NPs can be extracted from islands), and the Condition of Global Harmony (extension beyond the minimal domain (the Bounding Condition for chains (5a)) is only possible if there is a dynasty of equally oriented governors). Apart from some problems to which I will return, the theory developed so far gives a rather complete account of preposition stranding in Dutch. It should be noted that global harmony is only a necessary condition for preposition stranding, not a sufficient condition. Thus, standard German, which has the same OV orientation as Dutch, does not allow stranding of postpositions, even if the Condition of Global Harmony is met: (91)
*Woj hat er es tj mit getan?
 what has he it with done
 'What has he done it with?'
Kayne (1984, ch. 5) has argued that the absence of preposition stranding from Romance is due to the fact that Ps are not structural governors in these languages, contrary to English. Apparently, then, German is like Romance in that a necessary condition for preposition stranding is not met, namely the structural character of the governor P. Dutch, however, seems to be similar to English in this respect. All in all we have the following pattern: (92)
a. English, Scandinavian:  only preposition stranding
b. Dutch, German dialects: only postposition stranding
c. standard German:        no stranding
d. Romance:                no stranding
This pattern is explained by two parameters: the parameter determining the nature of a governing P (structural or not), and the parameter determining the basic word order (VO or OV). The first parameter determines whether a language shows preposition stranding at all. The second parameter determines (in conjunction with the Condition of Global Harmony) whether prepositions or postpositions are stranded.
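The interaction of the two parameters can be made explicit with a small lookup. The Python sketch below is merely illustrative: the function name `stranding_pattern`, the boolean encoding of the first parameter, and the strings "VO" and "OV" are assumptions introduced here, and the parameter settings per language are those implied by the discussion above.

```python
def stranding_pattern(p_is_structural_governor: bool, basic_order: str) -> str:
    """Map the two parameters onto the stranding pattern of (92).

    p_is_structural_governor: first parameter (nature of the governing P).
    basic_order: second parameter, either "VO" or "OV".
    """
    if basic_order not in ("VO", "OV"):
        raise ValueError("basic_order must be 'VO' or 'OV'")
    if not p_is_structural_governor:
        # First parameter: without a structural governor P there is no stranding at all.
        return "no stranding"
    # Second parameter: word order decides which kind of P can be stranded.
    if basic_order == "VO":
        return "only preposition stranding"
    return "only postposition stranding"


# Parameter settings as suggested by the text (illustrative encoding).
LANGUAGES = {
    "English": (True, "VO"),
    "Dutch": (True, "OV"),
    "standard German": (False, "OV"),
    "Romance": (False, "VO"),
}

PATTERNS = {name: stranding_pattern(*params) for name, params in LANGUAGES.items()}

if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        print(f"{name}: {pattern}")
```

The lookup reproduces the four rows of (92); it does not model the Condition of Global Harmony itself, which is what links the word-order parameter to the choice between prepositions and postpositions.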
One problem that requires some addition to the present theory of preposition stranding is that reordering rules seem to change the orientation of government sometimes. In principle, this might change constituent order in such a way that the Condition of Global Harmony might be met after all. In fact, this never happens. Reordering always has the effect that the categories in question become "frozen", i.e. they become absolute islands. An example is the rule of PP extraposition in Dutch, which reorders a PP from its (D-structure) preverbal position to a position following the verb (see Koster (1973)): (93)
a. Hij heeft over het meisje gepraat
   he has about the girl talked
   'He has talked about the girl'
b. Hij heeft tj gepraat over het meisjej
   he has talked about the girl
If global harmony were determined at S-structure, the derived structure (93b) would have the right orientation (i.e. all arrows point in the same direction):
(94) Hij heeft gepraat over het meisje
This equality of orientation would in principle allow (postverbal) preposition stranding. This does not appear to be the case: (95)
*Welk meisjej heeft hij tj gepraat over ti
 which girl has he talked about
As observed in Koster (1978b, 578), the same holds for English: (96)
a. He saw a picture of Bill, yesterday
b. Who did he see a picture of t, yesterday
c. He saw a picture, yesterday, of Bill
d. *Who did he see a picture, yesterday, of t
Similar observations can be made about preverbal reorderings in Dutch. PPs with prepositions or postpositions can be reordered preverbally: (97)
a. Hij heeft een prijs daar mee gewonnen
   he has a prize there with won
   'He won a prize with that'
b. Hij heeft daar mee een prijs gewonnen
Obviously, global harmony is not affected by this reordering: in both examples, P and V govern to the left. Nevertheless, it appears that P-stranding is possible only in (97a) (Koster (1978c, 103-104)): (98)
a. Waarj heeft hij een prijs tj mee gewonnen
   where has he a prize with won
   'With what did he win a prize?'
b. *Waarj heeft hij tj mee een prijs gewonnen
Again we might assume that (97b) represents the derived order, from which preposition stranding is impossible. In an earlier paper (Koster (1978b)), I tried to account for these facts in terms of Huybregts's Antecedency Binding Constraint (called the Double Binding Constraint in Koster (1978b, 578)). According to this principle, categories in derived positions (that bind a trace) are islands. Recently, Bennis and Hoekstra (1984) have discussed similar facts. They propose solving the problem in terms of Kayne's (1984, ch. 5) conditions on cosuperscripting. According to these conditions, a P is a proper governor (necessary for P-stranding) iff it is properly governed by V. Given the assumption that canonical government is to the left in Dutch, (95) would be accounted for. In this example, the preposition over is to the right of the verb, where it is not properly governed. It is not clear, however, how this solution accounts for the contrast between (98a) and (98b): in both examples the stranded P is to the left of the V, where it is canonically governed in principle. Bennis and Hoekstra therefore make the further assumption that the P must be adjacent to the V. But this is factually false because stranded prepositions can be followed by certain (preverbal) categories. Stranded Ps can, for instance, either precede (as in (99a)) or follow (as in (99b)) certain PP complements to the verb: (99)
a. Waarj is hij tj mee naar de dokter gegaan
   where is he with to the doctor gone
   'With what (= why) did he go to the doctor?'
b. Waarj is hij naar de dokter tj mee gegaan
To my ear, (99a) even sounds slightly more natural, in spite of the fact that the stranded P is not adjacent to the verb. A way out would be to assume that PPs can be incorporated into the verb. But that is not very plausible in this case, because the usual diagnostic for incorporation, moving along under Verb Raising, gives an ungrammatical result. PPs like the one in (99) cannot be incorporated into a verbal cluster (as can other incorporated elements, such as particles; see Koster (1975)): (100)
*dat hij zou willen naar de dokter gaan
 that he would want to the doctor go
 'that he would like to go to the doctor'
I conclude from these facts that adjacency to the verb is not a necessary condition for preposition stranding in Dutch. But then (98b) becomes problematic under the Bennis-Hoekstra proposal. This example is accounted for under the alternative, if the PP containing mee is in a derived position in (98b). It is not quite obvious, however, that (98a) represents the underlying word order and (98b) the derived order. All in all, it seems to me that the matter is not entirely settled. But it is reasonable to conclude at this point that global harmony is a D-structure property (i.e. an S-structure property that is not computed for categories that bind traces). Some remaining problems have to do with preposition stranding in NPs and APs. Preposition stranding in NPs is usually impossible, as predicted by the theory. There are some apparent exceptions, to which I will turn shortly. APs are more problematic. The impossibility of P-stranding in NPs is illustrated by the following example (see Van Riemsdijk (1978)): (101)
*Waarj heeft hij [NP een argument [tj tegen]] verworpen?
 where has he an argument against rejected
This is the general situation: prepositions (postpositions) cannot be stranded in NPs. This fact is explained by the Condition of Global Harmony (as indicated by the arrows in (101), orientation of government is not uniform). Apparent exceptions are limited to PPs with the prepositions van 'of' and over 'about': (102) a.
Waarj heeft hij een collectie tj van gezien?
where has he a collection of seen
'What did he see a collection of?'
b. Waarj heeft zij een boek tj over geschreven?
   where has she a book about written
   'What did she write a book about?'
There are, however, many reasons to believe that PPs with van and over are exceptional and not necessarily part of the NP. Contrary to other ("real") PP complements to Ns, PPs with van and over can precede the NP with which they are construed. Compare the ungrammatical (103a) (corresponding to (101)) to the grammatical (103b,c) (corresponding to (102)):
(103) a. *Hij heeft tegen Wittgensteinj een argument tj verworpen
          he has against Wittgenstein an argument rejected
      b. Hij heeft van Wittgensteinj een collectie tj gezien
         he has of Wittgenstein a collection seen
      c. Zij heeft over Wittgensteinj een boek tj geschreven
         she has about Wittgenstein a book written
If PPs with van and over are not really complements, but PPs that are associated with NPs by other means, then the examples in (102) are no longer counterexamples to the Condition of Global Harmony. If the PPs in question are outside the NPs, the rightward orientation of the Ns is simply irrelevant. Similar problems are posed by APs. But contrary to what we saw with NPs, almost all Ps can apparently be stranded in APs. If the PP complements in question are really part of the APs, the Condition of Global Harmony is violated, as indicated by the arrows:
(104) Waarj is hij [AP tevreden [pp tj mee]] geweest?
      where is he satisfied with been
      'What has he been satisfied with?'
Often it is also possible to place the whole PP in front of the adjective:
(105) Hij is daarmeej [AP tevreden [pp tj]] geweest
      he is therewith satisfied been
      'He has been satisfied with that'
Consequently, one might conclude that APs are not islands at all. This conclusion is contradicted by the fact that predicative APs that cannot be construed with the verb are always opaque (see Koster (1978c, 82-83)):
(106) a. Bill looked at her, full of excitement
      b. *Whatj did Bill look at her, full of tj?
Similarly, in Dutch APs appear to be islands if they are not construed with the verb as in (104): (107)
a. Hij heeft, tevreden met zijn nieuwe ontdekking, Marie gekust
   he has satisfied with his new discovery Mary kissed
   'Satisfied with his new discovery, he kissed Mary'
b. *Dit is de ontdekking waarj hij [AP tevreden tj mee] Marie kuste
    this is the discovery where he satisfied with Mary kissed
It seems to me, then, that APs can be assimilated to verbs in the sense that they form a complex verbal expression, at least with respect to subcategorization (of PPs in this case) and θ-marking. That there is a kind of complex predicate formation (weaker than reanalysis) has been convincingly demonstrated by Jayaseelan (1984). He has argued, for instance, that Mary in (108) does not receive a θ-role from the verb give, but from the combination give permission (cf. also put the blame on someone, etc.): (108)
John gave Mary permission to leave
Let us assume therefore that APs can generally be assimilated to verbs in this sense. If there is a PP complement, as in (109) (corresponding to (104)), we can then assume that this PP is outside the AP and subcategorized by the whole complex predicate AP V: (109)
Waarj is hij [AP tevreden] [pp tj mee] [v geweest]
Only if this type of assimilation is possible are APs transparent. If complex predicate formation is excluded, APs are islands (cf. (107b)). That complex predicate formation in the sense described above is not always unrestricted can be concluded from certain data from Swedish. Christer Platzack has pointed out that adjectives in Swedish are structural governors, which assign Case to NPs (Platzack (1982)):
(110) Han var överlägsen sin motståndare
      he was superior his opponent
      'He was superior to his opponent'
Wh-movement of the object of the adjective (or of the complex predicate V A) is unproblematic (Christer Platzack, personal communication):
(111) Vemj var han överlägsen tj?
As Platzack points out, there is an exceptional class of adjectives in Swedish, which require the object to precede the head:
(112) a. Hon var honom kär
         she was him dear
         'She was dear to him'
      b. *Hon var kär honom
In these cases, Wh-movement is impossible:
(113) *Vemj var hon tj kär?
This fact must be due to the direction of government, because if the same adjective is followed by a complement, Wh-movement is possible according to Platzack:
(114) a. Hon var kär för honom
         she was dear to him
      b. Vemj var hon kär för tj?
These facts follow from the Condition of Global Harmony because in the grammatical examples (111) and (114b), all governors uniformly govern to the right, while in the ungrammatical (113), the verb governs to the right and the adjective to the left. The problem, then, is why (113) cannot be "saved" by complex predicate formation. I would like to suggest an answer along the following lines. Even if Swedish allows complex predicate formation (in (110) for instance), the basic VO order of Swedish must be preserved. This is not the case in (112a) and (113), so that there is no complex predicate formation, in which case the NP remains inside the AP. But then extraction is only possible if the Condition of Global Harmony is met, which is the case in (110) but not in (113). Needless to say, the notion "complex predicate formation" has not yet been sufficiently clarified. Definitive answers to the questions raised here therefore have to await further theoretical developments. In spite of this uncertainty, it seems clear that APs are islands if there is no way to assimilate them to the V. Let us turn now to parasitic gaps. The fact to be explained is that parasitic gaps are nearly absent in Dutch (see Koster (1983)), and also in German (see Felix (1983), Torris (1984), and Haider (1984)). For reasons discussed by Chomsky (1982a), a fact like this is highly significant for a theory of Universal Grammar. The point is that parasitic gaps are extremely exotic (they were practically overlooked during the first 25 years of generative research), and yet they have very specific and complex properties. It seems extremely unlikely that these properties are learned. They are somehow hidden in the fabric of language, and they are therefore a good test case for a theory of Universal Grammar: can it predict the properties of parasitic gaps or not?
For similar reasons, it would be a success for a theory of grammar if it could explain why parasitic gaps are entirely missing from some languages. Why do English, Romance, and Scandinavian have parasitic gaps, while they are nearly absent from Dutch and German? When Kayne (1983) showed the relevance of directionality constraints for the distribution of parasitic gaps, an answer immediately suggested itself. The languages with parasitic gaps are SVO languages, while Dutch and German are SOV (Koster (1975) and Thiersch (1978)). Given the branching patterns of Dutch and German, it is impossible to meet the directionality constraints in these languages (Koster (1983)). I would like to show here that the parasitic gap facts are straightforwardly explained by the Condition of Global Harmony (which is an elaboration of Kayne's notion "canonical government"). For purposes of exposition, I will make the common distinction between the so-called subject cases and adjunct cases of parasitic gaps. The subject cases involve examples like the following:
(115) a. This is a man whoj close friends of ej admire tj
      b. This is a man whoj everyone who knows ej admires tj
The Condition of Global Harmony is met in these examples, as indicated by the uniformly oriented arrows. The Dutch counterparts of these sentences are totally ungrammatical:
(116) a. *Dit is een man diej goede vrienden van ej tj bewonderen
      b. *Dit is een man diej iedereen die ej kent tj bewondert
The NP-internal structure is similar to what we find in English (government to the right), but the verbs govern to the left, so that there is no global harmony. To the best of my knowledge, these examples are representative: it is not possible to construct examples with parasitic gaps in subject phrases in Dutch, because it is not possible to meet the Condition of Global Harmony, as it is in English, Romance, and Scandinavian. With one exception, the same holds for the adjunct cases. Thus, take a familiar English example like (117):
(117) Which bookj did you return tj before you could read ej
Not only the two verbs (return and read), but also the preposition (before) governs to the right in such examples. In Dutch, however, the Condition of Global Harmony cannot be met in this way: (118)
*Welk boekj heb je tj teruggebracht voor dat je kon ej lezen
 which book have you returned before that you could read
The preposition voor governs to the right (like English before), but the two verbs govern in the opposite direction due to the SOV pattern of Dutch. The result of this lack of governmental harmony is an ungrammatical sentence. Again, the example is representative. It is not possible to construct adjuncts with parasitic gaps in tensed clauses. It has been observed since the earliest research on parasitic gaps, however, that there is one exception, namely if the adjunct contains an infinitive (see Koster (1983)):
(119) Welk boekj heb je zonder ej te lezen tj teruggestuurd
      which book have you without to read returned
      'Which book did you return without reading?'
One difference between (118) and (119) is that in (119) the adjunct precedes the verb, which is generally possible in Dutch. Thus, the adjunct in (118) (with the tensed clause) can also precede the verb, which does not alter the ungrammatical status of this sentence:
(120) *Welk boekj heb je voordat je ej kon lezen tj teruggestuurd?
In other words, the distinction between (119) and (120) has to do with the ±tense distinction. If (119) contains a parasitic gap, we have an apparent counterexample to the Condition of Global Harmony, because the orientation of the preposition zonder is the opposite from the orientation of the verbs:
(121) Welk boekj heb je zonder ej te lezen tj teruggestuurd?
It is therefore important to know whether (121) really contains a parasitic gap. One complication is that a verb like lezen 'read' in (121) can be used transitively and intransitively (like eat). But this is not the right solution, because the construction is also possible with verbs that are always transitive, like uitlezen 'to finish reading': (122)
Welk boekj heb je zonder ej uit te lezen tj teruggestuurd?
Therefore, we cannot escape the conclusion that there are two gaps in examples like (121) and (122). Bennis and Hoekstra (1984) have also observed that there are always an A'-binder and a trace licensing the parasitic gap, as in English (Chomsky (1982a)). The necessary A'-binder is not always a Wh-phrase; it can also be an NP that has moved to the front of the VP, as has been observed for German by Felix (1983):
(123) Hans hat Mariaj ohne ej anzuschauen tj geküsst
      John has Mary without to look at kissed
      'John kissed Mary without looking at her'
Felix gives some arguments for this VP-internal NP-movement, which had already been established for Dutch by Kerstens (1975) (see also De Haan (1979, 58ff.) for additional arguments, and Hoekstra (1984, 116) for the idea that this type of NP-movement involves Chomsky-adjunction). That VP-internal movement (in order to create A'-binding) is necessary can be seen from the following Dutch examples: (124)
a. Hij heeft het boekj zonder ej uit te lezen tj teruggestuurd
   he has the book without to finish to read returned
   'He returned the book without finishing it'
b. *Hij heeft zonder ej uit te lezen het boekj teruggestuurd
In (124b), the NP is in its original position, so that there is no A'-binder for the gap ej. This renders the sentence ungrammatical. Note that the word order of (124b) yields a perfect sentence with an overt pronoun replacing the gap: (125)
Hij heeft zonder hetj uit te lezen het boekj teruggestuurd
We can tentatively conclude, then, that infinitives, and only infinitives, contain parasitic gaps in Dutch. But then we have to find a solution for the contrast between (119) and (120), and, more generally, for the fact that (119) contains a parasitic gap at all, which seems to run counter to the Condition of Global Harmony. A solution can perhaps be based on the idea proposed by Emonds (1976), which we also find in Chomsky and Lasnik (1977), that the infinitival complementizer for is in fact a preposition. We could similarly argue that Dutch infinitival complements can have Ps as complementizers, perhaps optionally, so that there are two analyses available for a phrase like zonder te lezen (literally, 'without to read'): (126) a.
[Tree diagram: [PP [P zonder] [S' COMP S]]]
(126) b. [Tree diagram: [S' [COMP [P zonder]] S]]
If infinitival complements introduced by a P can be analyzed as (126b), the problem is solved (see Bennis and Hoekstra (1984), who have independently come to a similar conclusion). Recall that (119) was problematic because the preposition zonder governs to the right, while the verbs govern to the left. With a structure like (126b), the orientation of the P becomes irrelevant, because it now falls within the minimal g-projection of V (which is S', as we have been assuming throughout). Thus, instead of (121) we would have (127), in which the orientation of the P zonder is neutralized, so that global harmony is no longer disturbed:
(127) Welk boekj heb je [s' zonder [ej te lezen]] tj teruggestuurd?
      which book have you without to read returned
Tensed complements to the same Ps obviously cannot be reanalyzed in this way. In most cases, a tensed complement to a P is introduced by the normal complementizer for tensed clauses, dat 'that'. Thus, for these complements the structure must be similar to (126a):
(128) [PP [P zonder] [S' [COMP dat] S]]
In such a structure, the orientation of the P does count, and this explains why (129) is ungrammatical (as a structure that violates the Condition of Global Harmony):
(129) *Welk boekj heb je zonder dat je ej uit las tj teruggestuurd
which book have you without that you finished reading returned
To conclude this discussion of parasitic gaps, I would like to mention certain facts that are reminiscent of what we saw with preposition stranding. It was observed above that prepositions can only be stranded if the PP is in D-structure position. The same holds for parasitic gaps. Extraposition, for instance, destroys the environment for parasitic gaps:
(130) a. A man whoj everyone who knows ej admires tj
b. *A man whoj everyone admires tj who knows ej
In Dutch, preposing or extraposition of PPs made them opaque for extraction. Assuming that the adjunct phrases introduced by zonder are removed from their D-structure position if they are preposed or extraposed, we predict that parasitic gaps will no longer occur in them. Note first that the phrases in question can be preposed (131b) and extraposed (131c):
(131) a. Ik geloof dat niemand boeken zonder ze uit te lezen terugstuurt
I believe that no one books without them to finish reading returns
'I believe that no one returns books without reading them'
b. Ik geloof dat zonder ze uit te lezen niemand boeken terugstuurt
c. Ik geloof dat niemand boeken terugstuurt zonder ze uit te lezen
If only (131a) represents the D-structure order, we predict that parasitic gaps can only occur with the order of (131a). This prediction is borne out (although intuitions vary somewhat on (132c)):
(132) a. Welk boekj heeft niemand zonder ej uit te lezen teruggestuurd?
which book has no one without to finish reading returned
'Which book has no one returned without finishing?'
b. *Welk boekj heeft zonder ej uit te lezen niemand teruggestuurd?
c. *Welk boekj heeft niemand teruggestuurd zonder ej uit te lezen?
These facts show once again that the notion "D-structure position" is more relevant for transparency than adjacency (as a precondition for cosuperscripting). First of all, adjacency of the transparent category and the matrix verb is irrelevant (as we saw with PPs): (133)
Welk boekj heeft hij zonder ej uit te lezen naar de bibliotheek gestuurd?
which book has he without to finish reading to the library sent
'Which book did he send to the library without reading?'
This sentence is grammatical in spite of the fact that the transparent clause (introduced by zonder) is not adjacent to the matrix verb gestuurd. In fact, it is not clear at all how cosuperscripting could play a role in these cases, because the government properties of the P zonder are neutralized here, as we saw before. Moreover, the clause introduced by zonder is an adjunct phrase, which is not obviously governed by the matrix verb (see Huang (1982)). I conclude therefore that the facts favor the hypothesis that the Condition of Global Harmony can only be met with respect to D-structure. As for parasitic gaps, this conclusion is inconsistent with certain data discussed by Longobardi (1985). According to Longobardi, parasitic gaps are possible in topicalized adjuncts in Italian:
(134) ?Senza prima conoscere bene ej non so proprio quale ragazzoj Maria sarebbe felice di sposare ej
without first knowing well not I know which boy Mary would be happy to marry
'Without knowing ej well beforehand, I really don't know which boyj Mary would be happy to marry ej'
In Dutch, it is also possible to topicalize embedded adjuncts (135a), but parasitic gaps are then entirely impossible (135b):
(135) a. Zonder ze uit te lezen heeft hij de boeken teruggestuurd
without them to finish reading has he the books returned
'Without finishing them, he has returned the books'
b. *Zonder ej uit te lezen heeft hij de boeken teruggestuurd
Also in English, it is sometimes possible to have parasitic gaps in preposed adjuncts (Liliane Haegeman, personal communication): (135) c.
This is the kind of food whichj before you buy ej tj must be weighed
This example is in accordance with the Condition of Global Harmony, but it is problematic if the before-phrase has been preposed. Given our assumptions, the problem can be solved by assuming that the before-phrase is in one of its possible D-structure positions. For the Italian problem, however, I see no solution. Given the marginality of the parasitic gap phenomenon, it seems unlikely that languages have a choice as to whether global harmony is determined at D-structure, or at NP-structure (in the sense of Van Riemsdijk and Williams (1981)). Nevertheless, the Italian example seems to fall into place if global harmony is determined at NP-structure. Last but not least, it should be noted that only NP-gaps occur in the only possible Dutch parasitic gap construction. PPs give ungrammatical results:
(136) a. *Aan welk boekj heeft hij zonder ej te werken tj geld verdiend?
on which book has he without to work money earned
'With which book did he make money, without working on it'
b. *Met welk meisjej heeft hij zonder ej te praten tj samengewerkt?
with which girl has he without to talk cooperated
'With which girl did he cooperate, without talking to her?'
As before, these facts are explained by the Cinque-Obenauer hypothesis: if a gap is not part of a chain (like a parasitic gap) it must be pro, i.e. an NP. To summarize, we have seen that the Condition of Global Harmony explains why parasitic gaps are generally absent from Dutch. The one exception can be accounted for by assuming that Ps can fill the COMP position of infinitival complements. As with PPs, the phrases containing parasitic gaps are only transparent in very specific positions, presumably D-structure positions (Dutch) or NP-structure positions (Italian?). As in English, parasitic gaps are never PPs but only NPs, as predicted by the Cinque-Obenauer hypothesis.
A relatively simple but important prediction of the Condition of Global Harmony has to do with the relatively strict nature of island conditions in Dutch. In English, not to mention Romance or Scandinavian, there are many reasonably acceptable violations of the Wh-island constraint. Some of these were discussed above:
(137) a. What don't you know when to file?
b. What don't you know how long to boil?
c. What don't you know where to put?
Even extractions from tensed Wh-islands give reasonable sentences in many cases, as has been observed by Reinhart (1975) and many others since:
(138) What books don't you remember who borrowed from you?
In Dutch, most such sentences are ungrammatical (Koster (1978c, 192-193); but see chapter 1 above):
(139) a. *Wat weet je niet wanneer op te bergen?
b. *Wat weet je niet hoe lang te koken?
c. *Wat weet je niet waar neer te leggen?
(140) *Welke boeken herinner je je niet wie van je leende?
With relative clauses, the sentences in question are not at all better:
(141) *Dit is de man die ik weet wie ontmoet heeft
this is the man who I know who met has
'This is the man that I know who met'
A remarkable fact in this context is that Dutch does not show the well-known subject-object asymmetry discussed for English and other languages. Of the following two sentences, the first one (142a) is usually claimed to be much worse than the second (142b):
(142) a. *Whoj do you remember whatj tj saw tj?
b. ?*Whatj do you remember whoj tj saw tj?
The corresponding Dutch sentences are both very bad:
(143) a. *Wie herinner je je wat zag?
b. *Wat herinner je je wie zag?
If anything, (143b) (corresponding to (142b)) is even slightly worse than (143a) (corresponding to (142a)). As for the Complex NP Constraint, we find similar contrasts. There have been many reports in the literature of reasonably acceptable violations of the CNPC. In English, for instance, the following sentence is supposed to be relatively acceptable (Chomsky, class lectures, fall 1983):
(144) Which racej did you express [NP a desire [S' to win tj]]
In Dutch, such sentences are totally unacceptable (in fact, the contrast with English is even more dramatic than with Wh-island violations):
(145) *Welke racej heb je [NP een verlangen [S' (om) tj te winnen]] uitgedrukt
which race have you a desire COMP to win expressed
To my knowledge, it is not possible to produce relatively acceptable examples of CNPC violations in Dutch (like the ones given for English, Romance, and Scandinavian). German, again, behaves like Dutch in this respect. As before, it appears that the acceptable CNPC violations occur in SVO languages but not in SOV languages with mixed orientation, like German and Dutch. As before, the facts follow from the Condition of Global Harmony (with some qualifications; see chapter 1). Let us first consider the Wh-island condition. In a relatively acceptable sentence like (138), all arrows point in the same direction:
(146) What booksj don't you remember [whoj [tj borrowed tj from you]]
In the ungrammatical Dutch sentence (140), the Condition of Global Harmony cannot be met in this way:
(147) *Welke boekenj herinner je je niet [wiej [tj tj van je leende]]
which books remember you not who from you borrowed
The same contrast in global harmony can be observed in the case of the CNPC violations. In the English example (144), the Condition of Global Harmony can be met (148a), while this is impossible in the Dutch equivalent (148b):
(148) a. Which racej did you express [NP a desire [S' to win tj]]
b. *Welke racej heb je [NP een verlangen [S' (om) tj te winnen]] uitgedrukt
It should be noted that global harmony is a property of dynasties. It only plays a role if the recursive domain definition (5c) is involved. It is irrelevant if ec's are bound in accordance with the Bounding Condition (5a) (see also (81b)). This is important in connection with normal long Wh-movement from tensed clauses in Dutch. Such movement is always possible in Dutch, even if the Condition of Global Harmony is not met:
(149) [S' Welk boekj heb je gezegd [S' tj dat [hij tj gelezen heeft]]]
which book have you said that he read has
Global harmony is not met here, because verbs govern their tensed S's to the right (in contrast with NP arguments, which are governed in the opposite direction). It is a fact that sentential complements occur on both sides of the verb in Dutch, but tensed complements only occur to the right. There is, therefore, no reason to assume that the sentential complement is in a derived position in (149) (as a result of extraposition). Moreover, we have seen before that extraposition turns categories into absolute islands. The S' complement in sentences like (149) is clearly not an island, which forms strong evidence for the hypothesis that the postverbal position of tensed complements is the D-structure position of these complements. In any case, then, the dynasty concept is not involved in (149). This example contains a simple compounding of two local steps. The rightmost trace is bound in its minimal governing S' (in accordance with the Bounding Condition (5a)), and the intermediate trace in COMP is also bound in its minimal governing S' (again, in accordance with the Bounding Condition (5a)). This solution entails that intermediate traces in COMP can be governed from the left (see the orientation of gezegd in (149)). Only if the intermediate position in COMP is filled by a Wh-phrase (see (147)) is the Condition of Global Harmony relevant. In (147), it is not possible to combine two local bindings, because the rightmost trace cannot be bound in its minimal governing S'.
If the analysis given here is correct, certain standard facts and theories must be reconsidered, especially Subjacency and its empirical domain. In Chomsky (1973, 1977), Subjacency was proposed as a more general principle subsuming the following conditions (most of which go back to Ross (1967)):
(150) a. the Complex NP Constraint (CNPC)
b. the Subject Condition
c. the Wh-island Condition
In connection with what follows, it is relevant to repeat the kinds of examples (and their structure) that these conditions are supposed to account for:
(151) a. *Whoj [S did you believe [NP the fact [S' tj that [Mary saw tj]]]]
b. *Whatj [S did [NP a picture of tj] disturb Bill]
c. *Whatj [S do you wonder [whoj [S tj saw tj]]]
In one of the most paradigmatic developments of generative grammar, Chomsky was able to subsume various empirical generalizations (those of (150) among them) under one principle, which has had the following form since Chomsky (1977):
(152) Subjacency
A cyclic movement rule cannot move a phrase from position Y to position X (or conversely) in:
... X ... [α ... [β ... Y ... ] ... ] ... X ...
where α and β are cyclic nodes
The cyclic nodes were considered to be S' and NP, and, since Rizzi (1978b), S and NP for English and S' and NP for Italian. As briefly discussed in chapter 1, it is unsatisfactory from the point of view of radical autonomy that this formulation crucially differs from the format of locality principles for anaphors. This is so because according to the Thesis of Radical Autonomy, all locality conditions of fundamental grammatical dependencies have the same basic format. Let us compare, then, (152) with the (pre-Pisa) format of locality conditions on bound anaphors:
(153) Opacity
A rule of construal cannot connect an anaphor Y with X in:
... X ... [β ... Y ... ] ... X ...
where β is a cyclic domain containing a subject or "tense"
Subjacency (152) is a condition on derivations, while (153) is usually considered to be a condition on representations (especially so in the form it received in the binding theory of Chomsky (1981b, ch. 3)). The most striking difference is that (152) specifies two cyclic nodes, and (153) only one. In spite of these differences, the similarity between (152) and (153) is so striking that it seems very likely that a generalization has been missed. This hypothesis (of the missed generalization) is at the core of the Thesis of Radical Autonomy. Matters become even more suspect if we realize that all the facts mentioned under (151) can be handled by a one-node condition like (153):
(154) One-node Subjacency
A movement rule cannot connect Y with X in:
... X ... [β ... Y ... ] ... X ...
where β is a cyclic node
This empirically adequate alternative to standard Subjacency is so close to (153) that it would be a miracle if it were not in fact the same underlying principle. It is therefore interesting to reconsider the original motivation for the two-node format (152) in Chomsky (1973). In retrospect, the following contrast appears to be entirely irrelevant (Chomsky's (86) and (87)): (155)
a. Who did you see [NP a picture of t]
b. *Who did you hear [NP stories about [NP pictures of t]]
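What the two formats say about the pair in (155) can be made concrete with a small sketch. It is only an illustration, not part of the original analysis: the function names and the bracket-string encoding are hypothetical, the antecedent is assumed to sit outside every bracket shown, and the cyclic nodes are assumed to be S' and NP.

```python
# Hypothetical sketch: count the cyclic nodes separating a gap "t" from an
# antecedent that is assumed to sit outside every bracket in the string.

CYCLIC_NODES = {"S'", "NP"}  # assumption: the S'/NP choice of cyclic nodes


def labels_above_gap(bracketing, gap="t"):
    """Return the labels of all brackets that are open at the first gap."""
    stack = []
    for token in bracketing.replace("]", " ] ").split():
        if token.startswith("["):
            stack.append(token[1:])  # "[NP" opens a node labelled NP
        elif token == "]":
            stack.pop()
        elif token == gap:
            return list(stack)
    raise ValueError("no gap found in bracketing")


def crossed_cyclic_nodes(bracketing):
    """Number of cyclic nodes between the gap and the antecedent."""
    labels = labels_above_gap(bracketing)
    return sum(1 for label in labels if label in CYCLIC_NODES)


def blocked_by_two_node(bracketing):
    """Two-node format (152): blocked once two cyclic nodes intervene."""
    return crossed_cyclic_nodes(bracketing) >= 2


def blocked_by_one_node(bracketing):
    """One-node format (154): blocked once a single cyclic node intervenes."""
    return crossed_cyclic_nodes(bracketing) >= 1


one_np = "Who did you see [NP a picture of t]"
two_nps = "Who did you hear [NP stories about [NP pictures of t]]"

for sentence in (one_np, two_nps):
    print(sentence, blocked_by_two_node(sentence), blocked_by_one_node(sentence))
```

Under these assumptions the two-node check separates the two strings, while the one-node check treats both as extractions beyond the local domain.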
This contrast is indeed predicted by the two-node format (152) but not by the one-node format (154). In spite of this fact, it is now clear why the contrast in (155) forms a non-argument in favor of a two-node format (152). The point is that the two-node format is both too weak and too strong for the postverbal contexts in question. First of all, it was shown by Chomsky (1977) that there are examples similar in structure to (155a) that are less acceptable to most speakers:
(156) *Who did you destroy [NP a picture of t]
Earlier, Bach and Horn (1976) had given many examples in which extraction from one NP also leads to ungrammaticality:
(157) a. *About whom did they destroy [a book t]
b. *What did Einstein attack [a theory about t]
c. *About what did Einstein attack [a theory t]
d. *Which city did Jack search for [a road into t]
e. *Into which city did Jack search for [a road t]
Bach and Horn sought (correctly, in my opinion) to explain these facts by a one-node condition:
(158) The NP Constraint
No constituent that is dominated by an NP can be moved or deleted from that NP by a transformational rule
Chomsky (1977) dismissed the NP Constraint by examples like the following:
(159) Of the students in the classj [several tj] failed the exam
This is a counterexample with the derivation as indicated. It has frequently been observed, however, that PPs introduced by of (and a few others) do not necessarily have their source inside an NP (see also our discussion of (102) above). If this were always the case, the following sentence would have an ungrammatical source:
(160) Of the students in the classj I like [Mary tj] better than anyone else
The source would be ungrammatical in this case:
(161) *I like [Mary of the students in the class] better than anyone else
It seems clear that PPs introduced by of do not necessarily have their source inside the NP with which they are construed. 9 Moreover, one could just as well say that the following sentences (from Koster (1978b)) falsify two-node Subjacency:
(162) a. Which ship do you suspect [NP the quality of [NP the wood of t]]?
b. Which plants do you recognize [NP the need for [NP a tag on t]]?
c. Which team did you consider [NP the possibility of [NP a game with t]]?
d. Which subjects is there [NP danger for [NP too much fuss about t]]?
e. Which planet is there [NP money for [NP a missile to t]]?
f. Which sentence did you hear [NP stories about [NP the structure of t]]?
These sentences vary in acceptability, but most native speakers accept many sentences of this kind. In fact, the irrelevance of postverbal stranding contexts could already have been concluded from the famous examples in Ross (1967) (also mentioned by Chomsky (1973)):
(163) Reports which the government prescribes [NP the height of [NP the lettering on [NP the covers of t]]]
Examples like these show that two-node Subjacency is not only too weak, but also too strong. Apart from this, it also appears that the original grammatical example (155a) is ruled out as soon as we decide that S is a bounding node for Subjacency in English (Chomsky (1977)):
(164) Whoj [S did you see [NP a picture of tj]]?
Chomsky (1977) tries to solve this problem by assuming an extraposition-like restructuring rule that transforms (164) into a structure like:
(165) [Whoj [S did you see [NP a picture tj] [PP of tj]j]]
If whoj is extracted from the PP, only one bounding node (S) is now passed. But this solution does not work, because, as we have seen above, extraposition has in general the opposite effect in that it turns phrases into absolute islands. In short, the postverbal context in question is the wrong place to look if we want to understand the nature of bounding. Given the theory of dynasty-dependent domain extensions, we also understand why. As an absolute constraint on extraction, Bach and Horn's NP Constraint is too strong. But reinterpreted as a characterization of the unmarked case, it seems correct. Reinterpreted in this manner, it can be seen as an instantiation of the Bounding Condition (5a). In conjunction with the Condition of Global Harmony, this theory predicts that postverbal contexts in English are in principle transparent for extraction. An example like (164), then, is a violation of the basic locality principle (just like (162) and (163)), which is the Bounding Condition (5a). This condition has the one-node format (154), and is, moreover, a condition on representations. An example like (164) involves the recursive domain extension (5c), which requires, as we have seen, global harmony:
(166) Who did you see [NP a picture [PP of t]]
Global harmony also immediately explains the subject-object asymmetry, i.e. the fact that an NP in subject position is opaque ((151b) is repeated here for convenience): (167)
*What did [NP a picture of t] disturb Bill
Contrary to an object NP, a subject NP is on a left branch, so that the Condition of Global Harmony cannot be met (both Kayne's "canonical government configuration" and "global harmony" require that transparent phrases be on right branches in constructions without parasitic gaps). As was indicated in section 4.2 above, a directionality constraint on dynasties like the Condition of Global Harmony also explains why it is not possible to reproduce an example like (164) in an SOV language like Dutch:
(168) *Wie heb je [een foto [van t]] gezien
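The directionality reasoning behind (166) and (168) can be sketched as a simple uniformity check. This is a minimal illustration under stated assumptions, not the author's formalization: the function name and the encoding of governors as (word, direction) pairs are hypothetical, and the direction assigned to the noun in each language is an assumption made only for the example. The verb and preposition directions follow the text (Dutch verbs govern NPs to the left, Dutch prepositions govern to the right, and in the English case all governors point the same way).

```python
# Hypothetical sketch: a dynasty is encoded as a list of (governor, direction)
# pairs, ordered from the matrix governor down to the governor of the gap.


def is_globally_harmonic(dynasty):
    """True if every governor in the dynasty governs in the same direction."""
    directions = {direction for _, direction in dynasty}
    return len(directions) <= 1


# (166) Who did you see [NP a picture [PP of t]]
english_166 = [("see", "right"), ("picture", "right"), ("of", "right")]

# (168) *Wie heb je [een foto [van t]] gezien
# Assumption: the noun "foto" is given the same direction as the preposition.
dutch_168 = [("gezien", "left"), ("foto", "right"), ("van", "right")]

print(is_globally_harmonic(english_166))
print(is_globally_harmonic(dutch_168))
```

On this encoding the English dynasty is uniform and the Dutch one is mixed, which is the contrast the text attributes to the Condition of Global Harmony.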
Clearly, these examples lack global harmony in Dutch. An example like (168), then, provides strong evidence against standard Subjacency and in favor of a one-node format like the Bounding Condition. From (167) it appears that the Bounding Condition is also valid for English. All postverbal extractions from NPs are marked exceptions that can only be found in languages with preposition stranding, in which the Condition of Global Harmony can be met. The Bounding Condition is confirmed by the fact that it gives the right demarcation line between strictly local binding of traces (the unmarked case) and "long" binding of pro (the marked case), which, contrary to the former type of binding, involves the Condition of Global Harmony. The Bounding Condition is also confirmed by the fact that it gives the demarcation line between chains (that involve all kinds of categories) and pro-binding (that only involves NPs, according to the Cinque-Obenauer hypothesis). The Cinque-Obenauer hypothesis is confirmed again by the fact that PPs cannot be extracted from NPs, not even from one postverbal NP, as demonstrated by some of the Bach-Horn examples cited before:
(169) a. *About whom did they destroy [a book t]
b. *About what did Einstein attack [a theory t]
c. *Into which city did Jack search for [a road t]
Even in Italian, a language that is supposed to have S' as a bounding node, it is usually not possible to extract PPs from one NP, as demonstrated by Cinque (1980):
(170) *Il paese [PP a cui] ricordiamo [NP un attacco t] è la Polonia
the country on which we remember an attack is Poland
Note that two-node Subjacency does not explain this class of facts. The Bounding Condition as a demarcation criterion for chains, however, does explain (170). Since the trace is bound outside its minimal governing NP, it cannot form a chain with its antecedent a cui. But then the antecedent cannot be a PP according to the Cinque-Obenauer hypothesis. Another fact explained in this way is that PPs cannot be extracted from NPs in Dutch, even if an S-boundary is not passed (see Koster (1978c, 80-81)):
(171) a. Wij hebben [NP de reis naar Hawaii] geannuleerd
we have the trip to Hawaii cancelled
'We have cancelled the trip to Hawaii'
b. *Wij hebben naar Hawaii [NP de r
we
The ungrammaticality of (171b) is surprising, because normally PPs can easily precede an object NP in Dutch (the movement in (171b) could even be structure-preserving). Note also that two-node Subjacency says nothing about (171b), because only one bounding node (NP) is passed. The Bounding Condition (5a), however, easily explains this fact (again in conjunction with the Condition of Global Harmony, or the Cinque-Obenauer hypothesis). It should be noted that the Bounding Condition was explicitly designed as a characterization of the unmarked case. With respect to the marked case, extraction from NP, PP, AP, or S', it was always recognized that it is limited to certain positions. As for extraction from PPs, for instance, I followed Joseph Emonds (personal communication) by saying that PPs with stranded prepositions "have to be peripheral within V'" (Koster (1978c, 96)). In present terms, this is of course the environment in which the PPs in question are governed (see Nakajima (1982) for an explicit statement of this conclusion). In any case, the Bounding Condition has always entailed that extraction from adjuncts is not possible because adjunct PPs are external to V', and therefore not in the only position from which extraction from PP was recognized to be possible. It is therefore not necessary to formulate an entirely new condition in order to prevent extraction from adjuncts (Huang (1982, 505)):
(172)
Condition on Extraction Domain
A phrase A may be extracted out of a domain B only if B is properly governed
Huang's condition (172) says exactly what the Bounding Condition says for these cases, if we add the assumption about government (Nakajima (1982)). We may conclude, then, that the Bounding Condition has a much wider scope than standard Subjacency. It not only explains most of the standard facts (islands; see (159)), but also (in conjunction with the Cinque-Obenauer theory) the fact that PPs cannot be extracted at all from NPs in English, Italian, or Dutch, and the fact that adjuncts are islands (172). Apart from Rizzi's parametrization of Subjacency (discussed above), the main evidence against the Bounding Condition and in favor of two-node Subjacency was based on postverbal contexts, which happened to be misleading as to the nature of bounding. Since the Bounding Condition (one-node Subjacency) also enters into the explanation of many other facts not covered by standard (two-node) Subjacency, I will assume that it is the locality principle for governed empty categories. From the point of view of radical autonomy, this conclusion has profound consequences. Contrary to standard Subjacency, the Bounding Condition cannot be considered the exclusive property of movement rules. There is, as we have seen, strong evidence that the same locality principle governs a well-defined subclass of control structures (see chapter 3). Moreover, the Bounding Condition is, contrary to Subjacency, so similar to binding principle A, that a generalization seems to be missed.
In fact, I will argue that the Bounding Condition also defines the unmarked domain for bound anaphors (chapter 6 below). For what follows, the most important conclusion is that the Bounding Condition divides gaps in two broad classes. Within its limits, we find ec's of all categories (traces, elements of chains). Beyond the limits of the (Extended) Bounding Condition, we only find empty NPs (pro), bound in accordance with the Condition of Global Harmony. As we have seen, the two classes of ec's were sometimes mixed up, which led to an inadequate conception of Subjacency. This new perspective on the nature of bounding appears to have consequences for the grammar of scope.
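The two-way division of gaps summarized above can be written down as a small decision function. This is a hedged sketch, not a statement of the theory: the function name and its parameters are hypothetical, category labels are plain strings, and every condition other than the three mentioned in the summary is ignored.

```python
# Hypothetical sketch of the two classes of gaps described above.


def gap_is_licensed(category, within_bounding_domain, globally_harmonic):
    """Return True if a gap of the given category is allowed.

    within_bounding_domain: the gap is bound inside the domain set by the
        (Extended) Bounding Condition, so it is a trace in a chain.
    globally_harmonic: the Condition of Global Harmony is met on the path
        between the gap and its antecedent.
    """
    if within_bounding_domain:
        return True  # traces in chains: any category
    # Beyond the domain the gap must be pro (an NP) and the path harmonic.
    return category == "NP" and globally_harmonic


print(gap_is_licensed("PP", True, False))   # local PP trace
print(gap_is_licensed("NP", False, True))   # long NP gap, as in (166)
print(gap_is_licensed("PP", False, True))   # long PP gap, as in (169)
print(gap_is_licensed("NP", False, False))  # long NP gap, as in (168)
```

The harmony value is deliberately ignored inside the bounding domain, in line with the earlier remark that global harmony only plays a role when the recursive domain extension is involved.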
4.6. The grammar of scope
In this section, I will limit myself to the grammar of Wh-elements in situ. It remains to be seen to what extent the results can be extended to the grammar of other quantified expressions, but here I will simply ignore these matters. In the introduction to this chapter, it was mentioned that the ECP can be seen as a rapprochement between the grammar of gaps and the grammar of scope. No one really claims that gaps have the same distributional properties as lexical elements with scope. In practice there is even a standard way of expressing the differences: the distinguishing properties of overt gaps are relegated to S-structure, while the remaining properties of covert gaps (created by LF movement) are considered to be properties of a distinct level of representation, namely Logical Form (LF). Although this might be the correct way to develop the theory of grammar, it must be said that the original motivation to create LF gaps in the first place was the idea that LF gaps share certain properties with overt gaps. By creating LF gaps, the properties of all gaps could be stated at LF so that no generalization would be missed. The fate of LF, then, depends on the extent to which similarities can be found between overt gaps and LF gaps. If there are no similarities at all, the case for LF is considerably weakened. Let us therefore reconsider the properties that can be distinguished in certain developments of the ECP, mentioned in the introduction to this chapter:
(173) a. proper government (standard ECP)
b. bounding (Subjacency)
c. percolation (directionality)
According to Chomsky, (b) and (c) are S-structure properties that do not belong to the grammar of scope. 10 As was mentioned before, it was already concluded in Chomsky (1977) that scope is not constrained by Subjacency, and Huang (1982) has convincingly demonstrated that Chomsky's conclusion was correct. The standard view, then, is that property (173a) is the typical LF property, which applies to both overt gaps and LF gaps. In what follows, I will come to a slightly different conclusion. It is not entirely obvious whether overt gaps and Wh-elements in situ (LF gaps) must be governed in the same way. But as for directionality ((173c), preposition stranding, etc.) I agree with the standard view: the Condition of Global Harmony is an S-structure property; it definitely does not apply to Wh-elements in situ. The most interesting issue concerns (173b). I will conclude that bounding does apply after all to the scope assignment of Wh-elements in situ. Even if the arguments that standard Subjacency is irrelevant for LF gaps are convincing, conclusions may be different under a different theory of bounding. It is clear that if we want to assess the degree of similarity between overt gaps and LF gaps, the traditional arguments do not suffice. According to these arguments, Wh-elements in situ do not obey Subjacency because they can violate island constraints. But there are also lots of overt gaps that violate island constraints, and from this we do not conclude that the grammar of gaps is not constrained by Subjacency. On the contrary, the problem is rather to determine the empirical domain of Subjacency vis-à-vis the total distribution of gaps. Similarly, we might argue that Subjacency is relevant for only a subclass of Wh-elements in situ. One of the main conclusions of this section is that my reformulation of Subjacency, the Bounding Condition (5a), plays an important role in the grammar of Wh-elements in situ. This role appears to be very similar to the role of the Bounding Condition in the grammar of gaps.
As we saw in the previous sections, the Bounding Condition is not an absolute constraint on the occurrence of all gaps; it is only a dividing line between gaps with two distinct sets of properties. Gaps within the limits set by the Bounding Condition are members of chains and can be of all categories. Gaps beyond the limits of the (Extended) Bounding Condition are characterized by two extra conditions: first, they must obey directionality constraints (global harmony), and, second, they are pro and therefore only of the category NP (according to the Cinque-Obenauer hypothesis). My claim is that the Bounding Condition makes a similar split in the class of Wh-elements in situ. Within the limits of the Bounding Condition we find Wh-elements of all types. Outside of these limits, extra requirements must be met. Unlike what we saw for gaps, these extra requirements do not include the Condition of Global Harmony, but, interestingly, the conditions of the Cinque-Obenauer hypothesis must be met: beyond the limits of the Bounding Condition we only find NPs as Wh-elements in situ. If these conclusions are correct, it does not become any easier to evaluate the arguments for an extra level of LF. In my opinion, a serious objection against LF movement has always been that it does not seem to show the most characteristic (perhaps the only) property of movement rules, namely Subjacency. But if the grammar of scope meets a bounding condition after all, things may be different. I hasten to add that the Bounding Condition is not a property unique to movement rules, but a much more general locality principle, which also plays a role in the theory of control constructions (chapter 3) and bound anaphora (chapter 6). We may conclude, then, that a specific objection against LF movement disappears, but that a more general objection against movement remains: it is not possible to identify a property unique to movement rules.
Before going into the bounding problem, I would first like to discuss (173a) and (173c), the standard ECP properties and global harmony, respectively. If the standard ECP is a correct generalization over both overt gaps and LF gaps, we expect that the two kinds of gaps will always be governed in the same way. It is not obvious that this is true, however. Most Wh-elements in situ occur in positions where overt gaps may also occur, but there are some intriguing exceptions. The positions I have in mind are the nominative subject position of tensed clauses (174) and certain adjunct positions (175):
(174) a. Whoj did he say tj saw Bill?
b. *I wondered whatj whoj saw tj
(175) a. Whyj did he come tj?
b. *Whoj tj came whyj?
In (174a) we have a trace in a nominative position, where it is impossible to have a Wh-element in situ (174b). Similarly, we find a trace in (175a) in a position where the corresponding Wh-element in situ leads to ungrammaticality. Existing theories explain away these discrepancies by certain COMP-inaccessibility hypotheses, in conjunction with a theory of antecedent government.11 According to these theories, a gap is properly governed if it is governed either by a lexical category or by a local antecedent. The trace in (174a) is not lexically governed, but it is antecedent-governed by an antecedent in COMP. If there is an extra element in COMP, like the complementizer that, the route to the antecedent governor is blocked. This is the familiar that-t effect:
(176) *Who_j did he say [t_j that] t_j saw Bill?
Similarly, the trace of who_j in (174b) (after LF movement) is not accessible to its antecedent governor in COMP, due to the presence of what (already in COMP). The same account can be given for the contrast between (175a) and (175b): the trace in (175a) is antecedent-governed by why in COMP, while after LF movement of why in (175b) this antecedent governor would
be inaccessible due to the element who already present in COMP. Thus, according to this account, the contrasts in (174) and (175) are due to an independent factor and have nothing to do with the way in which the elements in question must be governed. But note that this account depends on the correctness of the view that we need a notion like antecedent government and COMP accessibility (of the kind in question). If both notions are unmotivated, as I will argue, the contrasts in (174) and (175) might show a real difference in government context between overt gaps and Wh-elements in situ. The relevance of COMP accessibility can be questioned on the basis of the fact that an adjunct like why can be extracted (see Huang (1982)):
(177) Why_i do you think [t_i that] he came t_i?
If the structure is as indicated, the rightmost trace can only be antecedent-governed if the trace in COMP is accessible. But contrary to expectations, the complementizer that does not block accessibility in this case. Another problem is that the Dutch equivalent of (176) is not so bad (there is no clear that-t phenomenon), while the equivalent of (174b) is presumably as bad as the English example:12
(178) a. Wie_i zei hij [t_i dat] t_i Wim gezien had?
         who_i said he that t_i Bill seen had
      b. *Ik vraag me af wat_j wie_i t_j zag
         I wonder what who saw
Examples like (177) and (178) demonstrate rather conclusively that a complementizer does not block COMP accessibility. Furthermore, the notion antecedent government seems rather ad hoc: it adds an entirely new kind of government to the theory, a kind which plays a very limited role. I will therefore entirely dispense with the notion antecedent government. For the that-t effect, I will propose a theory based on accounts given by Aoun and Brody, who have sought to reduce this effect to the binding theory.13 This alternative theory has the advantage that it accounts for the difference between English and Dutch with respect to the that-t phenomenon. As for facts like (174b) and (175b), I will argue that these are due to a failing domain extension. Unlike the complementizer that, a Wh-phrase may indeed make a COMP inaccessible, which makes the domain extension necessary in the first place. In (174b) this failing domain extension is probably due to the fact that the subject who is not lexically or structurally governed. But then, lexical government is a condition that triggers the domain extension in question. It is not a necessary condition on traces, as shown by (174a) and (178a). In (175b) the failing domain extension is due to the fact that why is not an NP (see Huang (1982)). I will show that this might be seen as an
application of the Cinque-Obenauer hypothesis, which only allows certain long distance strategies for NPs. This conclusion entails that the ungrammaticality of (175b) has nothing to do with the fact that the adjunct why is not antecedent-governed. It is, in fact, quite easy to show that antecedent government is not the decisive factor in (175b). As shown by Huang, it is not true that adjuncts cannot remain in situ in general. Adjuncts like where and when, which can be considered NPs according to Huang, can remain in situ: (179)
a. Who_i t_i came when?
b. Who_i t_i lives where?
Traditionally, when and where have been considered adjuncts like why. I see no reason to question the traditional view. But if when and where are adjuncts, the explanation of the ungrammaticality of (175b), based on antecedent government, does not work. If adjuncts are not governed with an inaccessible (already filled) COMP, the sentences in (179) would also be ungrammatical. In other words, when and where must be governed by an element not in COMP. But if these adjuncts can be governed in this way, there is no reason to assume that why and how cannot also be governed in this way. In general, I will assume that adjuncts are governed by the category to which they are Chomsky-adjoined. Thus, not only when and where, but also why and how are governed by the lower VP in the following configuration:
(180)
        VP
       /  \
     VP    Adjunct
In general, I will assume that a category can be governed by any lexical category X^i if it is minimally c-commanded by that category, and if the immediately dominating category does not belong to a different projection. In (180), then, the lower VP is of type V^i, and the immediately dominating category (the upper VP) is not of a distinct projection, so that the lower VP governs the Adjunct. A subject, however, is not governed by the VP, because the immediately dominating category S belongs to a different projection. If adjuncts are always governed in this way, if the that-t effect has nothing to do with the alleged fact that the complementizer makes the antecedent governor inaccessible, and if the that-t effect has nothing to do with antecedent government, then we can entirely dispense with the notion antecedent government. Let us therefore turn to the alternative theory of the that-t effect. On the basis of Perlmutter (1971) it was originally thought that Dutch was an exception to what Chomsky and Lasnik (1977) called the [that-t] filter. But efforts to make Dutch a regular instance of Universal Grammar
led to the idea that the that-t behavior of Dutch did not essentially differ from the corresponding behavior of English. It is indeed the case that extraction from subject position (181a) is slightly less good than extraction from object position (181b):
(181) a. ?Wie_j denk je dat t_j verdwenen is
         who think you that disappeared has
         'Who do you think disappeared?'
      b. Wat_j denk je dat hij t_j gelezen heeft
         what think you that he read has
         'What do you think that he read?'
(181a) can be improved by adding er (there) before the extraction site: (182)
Wie_j denk je dat er t_j verdwenen is
In any case (181a) superficially looks like a that-t effect. The weakness of the effect led to some disagreement about the data, and eventually the myth was born that there are two dialects with respect to the that-t effect, Dutch A and Dutch B (GLOW, Nijmegen 1980; see also Pesetsky (1982a)). The problem was and is that (181a) presumably is not as bad for most speakers as the corresponding English sentence ((181a) is slightly worse if the order of participle and finite verb is reversed, which is also a possible order in Dutch). The weak Dutch effect is practically limited to intransitives. If objects are added to the verb, the that-t effect disappears altogether: (183)
Wie_j denk je dat t_j het gedaan heeft
who think you that it done has
'Who do you think that did it?'
In spite of the fact that the that-t filter is violated, this sentence is perfectly grammatical, presumably for all speakers of Dutch. There is no reason at all, then, to assume that there is a dialect split in Dutch with respect to (183). It must be concluded, in other words, that the alleged distinction between Dutch A and Dutch B is just a myth. What is certainly rather normal in Dutch is the occurrence of doubly-filled COMPs, particularly with the complementizer of 'if' in indirect questions. In these cases, we find a sharper subject-object asymmetry. The following sentence seems to be rather low in acceptability:
(184) ?*Ik weet niet wie_j of t_j verdwenen is
      I know not who if disappeared has
But again, the ungrammaticality cannot be explained in terms of the standard COMP-inaccessibility theories. As in (183), we can add an
object, and one or even two complementizers do not make the sentence ungrammatical: 14 (185)
Ik weet niet wie_j of (dat) t_j het gedaan heeft
I know not who if that it done has
'I do not know who did it'
So, with the complementizer of as well, it is doubtful whether (the equivalent of) a systematic that-t effect can be observed. It is true that (184), to my ear at least, is worse than (181a), but that does not tell us very much. Examples with of in a doubly-filled COMP are slightly substandard. The behavior of of is rather idiosyncratic. Examples can easily be constructed in which it cannot co-occur with another element in COMP. PPs in COMP, for instance, lead to ungrammaticality:
(186) a. *Ik weet niet aan wiens vader of hij dacht
         I know not of whose father if he thought
         'I do not know of whose father he thought'
      b. *Ik weet niet tijdens welke vakantie of hij verdwenen is
         I know not during which vacation if he disappeared has
         'I do not know during which vacation he disappeared'
Since the behavior of of is not understood at all in such constructions, not much can be concluded from it for the time being. What remains is a very weak that-t effect with intransitives, as in (181a), and the total absence of such an effect in most other cases. But even in (181a) the effect is too weak to trust theories that entirely reject this sentence. I will conclude, then, that Dutch does not have a that-t effect like English. But then the original problem makes a comeback: how can Dutch be reconciled with Perlmutter's generalization that that-t effects are only absent from what we now call pro-drop languages (assuming that Dutch is not a pro-drop language like Italian)? The absence of a clear that-t effect from Dutch is particularly interesting in conjunction with the fact, mentioned before, that the superiority facts in Dutch are similar to their English counterparts ((178b) is repeated here for convenience):
(187) *Ik vraag me af wat_j wie_j t_j zag
      I wonder what who saw
If the facts are correctly represented, (187) is remarkable, because according to the usual assumptions the explanation for the superiority facts runs parallel to the explanation for the that-t effect. It is, however, not entirely clear how ungrammatical (187) is. It is surely less acceptable than the Dutch that-t violations, but it is not so ungrammatical as, for instance, root sentences without Verb-second. The judgment (187) is
usually based on the clear contrast with (188), in which the order of subject and object is reversed:
(188) Ik vraag me af wie_j t_j wat zag
Contrasted with this sentence, (187) is bad, but contrasted with other ECP violations (like *John is impossible to go) it is relatively acceptable. It is in principle possible, then, that the relative unacceptability of (187) is not due to an ECP violation but to something else. I will return to this matter. But first I will try to give an explanation for the clear contrast between English and Dutch with respect to the that-t phenomenon. It seems to me that the solution lies in the older "On Binding" theory, according to which Wh-traces are subject to the Nominative Island Condition (NIC) (see Chomsky (1980a)). Thus, my solution will be akin to the account by Aoun (1981), who sought to reduce the ECP (for the that-t effect) to the binding theory. If the NIC is the key, we must consider the question how it is implemented in general and how English and Dutch differ with respect to it. Domain definitions for binding purposes usually have the following form ((5b) above):
(189) ... [β ... ω ... γ ... δ ... ] ...
In this formulation, γ is the governor and δ the dependent element (the anaphor). The opacity factor ω can be INFL (AGR), subject, or COMP. The domain β is the minimal category containing δ, γ, and ω. If ω is specified as AGR, the value of β is S (the minimal category containing AGR), and if the value of ω is COMP, the value of β is S' (the minimal category containing COMP). Thus, opacity factors define domains: S is the INFL (AGR) or subject domain, S' is the COMP domain. As for the implementation of the NIC, I will assume for the time being that nominative Case is assigned to the subject by its governor INFL (but see chapter 5). INFL is not accessible to the object of a sentence, because objects are governed by V (a governor does not penetrate the c-command domain of another governor). Similarly, adjuncts are not governed by INFL if they are governed by VP (see above). In other words, the exclusive governance relation between INFL and the subject follows from minimal c-command as the domain-constituting factor of governors. Let us now turn to the difference between the INFL domain of English and the INFL domain of Dutch. It has generally been assumed, at least since Chomsky (1957), that English has a special Aux node (see also Jackendoff (1972), Akmajian, Steele, and Wasow (1979)), which is usually considered a daughter of S and a sister of subject and predicate (Emonds (1976)). The English Aux (INFL) is clearly recognizable by the expression of tense (do-support) and by the modals that differ both morphologically and distributionally from ordinary verbs. If INFL (Aux) is the element
that assigns nominative Case to the subject and if it defines an anaphoric domain, this domain must be S (being the minimal domain containing INFL). In Dutch, however, there is no such clear evidence for an Aux. Dutch does not express tense by do-support, and the modals do not significantly differ from ordinary verbs. Furthermore, Dutch does not have a rule like English VP-deletion and other rules that require Aux as necessary context. It was therefore assumed in Koster and May (1982) that the element that assigns nominative Case in Dutch is COMP (see also Evers (1975)). Some, like Bayer (1984), have even proposed introducing CONFL for the grammars of Dutch and German. In any case, it seems crucial that in English the category S is the domain for the NIC, because the minimal category containing the governor of the subject, INFL, is S. In Dutch, the subject may be governed by COMP, so in this language S' is the minimal domain for the NIC. If the nature of the NIC domain determines the occurrence of the that-t phenomenon, it is now easy to see how the difference between English and Dutch follows from the domain definitions. Let us therefore have a closer look at sentences with subject extraction in the two languages:
(190) a. Wie_j denk je [S' [COMP t_j dat] [S t_j het gedaan heeft]]
         who think you that it done has
      b. *Who_j do you think [S' [COMP t_j that] [S t_j INFL has done it]]
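The domain reasoning of the preceding paragraph can be restated as a small sketch. The Python below is only an illustration under stated assumptions: the dictionaries and function names are hypothetical encodings introduced here, not part of the theory's notation. It assumes, as in the text, that the subject is governed by INFL in English and by COMP in Dutch, that the NIC domain is the minimal category containing that governor, and that the only candidate binder in (190) is the intermediate trace in COMP.

```python
# Hypothetical encoding of the NIC domain reasoning for (190).
# All names below are illustrative assumptions, not the author's notation.

# Governor of the nominative subject, per language (as assumed in the text).
SUBJECT_GOVERNOR = {"English": "INFL", "Dutch": "COMP"}

# Minimal category containing each governor.
MINIMAL_CATEGORY = {"INFL": "S", "COMP": "S'"}

# Categories that contain a given position in [S' [COMP t that] [S t ...]].
CONTAINING_CATEGORIES = {
    "COMP": {"S'"},           # the intermediate trace sits in COMP, outside S
    "subject": {"S", "S'"},
}


def nic_domain(language):
    """Return the category in which a subject trace must be bound."""
    return MINIMAL_CATEGORY[SUBJECT_GOVERNOR[language]]


def subject_trace_bound(language, binder_position="COMP"):
    """True if the binder lies inside the NIC domain of the subject trace."""
    return nic_domain(language) in CONTAINING_CATEGORIES[binder_position]


if __name__ == "__main__":
    for lang in ("Dutch", "English"):
        print(lang, nic_domain(lang), subject_trace_bound(lang))
```

Under these assumptions the subject trace counts as bound in Dutch (domain S') and as unbound in English (domain S), in line with the judgments in (190). The sketch covers only the configuration with an overt complementizer.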
The grammatical Dutch sentence (190a) is a literal translation of the ungrammatical English sentence (190b). In both languages, the complementizer is followed by a subject trace. The Dutch example is grammatical, because the subject is governed from COMP (by the complementizer dat, which has the feature +tense). Therefore, S' is the relevant domain (as the minimal category containing COMP). In the English example, S is the relevant domain, because in this language S is the minimal category containing the governor of the subject, i.e. INFL. Therefore, in Dutch the subject trace must be bound in its domain S', which it is, while in the English example (190b) the subject trace is not bound in its minimal domain, namely S. So, the contrast in grammaticality between (190a) and (190b) follows straightforwardly from the differences in domain definition for the two languages. Why, then, does deletion of that save (190b)? Why is the following sentence grammatical?
(191) Who_j do you think [S' [COMP t_j] [S t_j INFL has done it]]
This problem can be solved by an extension of the domain definitions that is needed anyway. Projections are generally believed to be defined by their heads. There
are reasons to assume that projections lose their independence if the head disappears or undergoes reanalysis. Verb Raising in Dutch is a case in point (see Evers (1975) and chapters 3 and 5). In this process, a V-head of an embedded clause is adjoined to the V of the next higher clause. As a result, the embedded clause loses its independence in several respects, a process that is often referred to as clause union (see also Hoekstra (1984)). If we assume that complementizers are the heads of their projection, as proposed by Chomsky (1981b, 162), Stowell (1981), Kayne (1984, ch. 8, note 16), and others; if, in other words, the complementizer is the head of S', it becomes possible to consider an S' without a complementizer as a headless category, indistinguishable from S. A very similar situation arises under what Chomsky (1981b) calls S'-deletion: see chapter 3 above for an argument that this is in fact the absence of a complementizer. If we assume that deletion (or nongeneration) of that makes S' indistinguishable from S, the grammaticality of (191) is explained. The subject trace must be bound - according to the NIC as incorporated in the binding theory - in the minimal category containing INFL, i.e. in S. The subject trace is not bound in this S, but in contrast with (190b), S is now indistinguishable from S', so that binding in S' will also do. The proposed solution is compatible with facts like the following:
(192) a. [S' Who_j [S t_j INFL left]]
      b. I saw the man [S' who_j [S t_j INFL wrote the book]]
Application of the NIC is unproblematic here. As before, the subject traces must be bound in the minimal category in which INFL governs the nominative subject. As before, this category S is indistinguishable from S', because S' does not contain a head (a complementizer).15 This analysis solves a problem that has been salient since Chomsky's Pisa lectures. Koopman (1983) suggests that the following contrast can perhaps be accounted for by the ECP:
(193) a. Who_j did Bill see t_j?
      b. *Who_j did t_j come?
This looks like a subject-object asymmetry: Subject Aux Inversion seems possible after fronting of the object (193a), but not after fronting of the subject (193b). The explanation suggested by Koopman is that in (193b) the governor who_j is not adjacent to the subject trace. In other words, the governor of the subject trace would be somehow inaccessible, as in the standard accounts of the that-t effect. This explanation is not compatible with our earlier rejection of the notion antecedent government. Fortunately, then, there are no reasons to assume that (193b) shows an ECP effect. The example becomes grammatical with emphatic stress on did. Exactly the same contrast can be observed in sentences without Wh-movement:
(194) a. *John did come
      b. John DID come
Since (194a), unlike (193b), does not involve an empty category, there must be a cause independent of the ECP that is responsible for its ungrammaticality. The same cause (other than the ECP) will account for (193b). Moreover, Koopman mentions the following problematic facts:
(195) a. Who_j has t_j come?
      b. Who_j will t_j come?
      c. Who_j must t_j come?
It is hard to see why did in (193b) makes the governing who_j less accessible than the auxiliaries in (195). Who_j is not adjacent to the traces in (195), if we accept the traditional assumption that Subject Aux Inversion is triggered (whatever that means) by a Wh-phrase in COMP. The facts in (195) also form a strong counterexample against the standard account of COMP inaccessibility. Williams (1974) has argued that Subject Aux Inversion is in fact movement of Aux to COMP. Since then it has become clear that all Germanic languages other than English have a Verb-second rule. Koopman (1984) has shown that there are several African languages that also show the Verb-second phenomenon. If Verb-second is such a universal phenomenon, especially among the Germanic languages, it becomes very likely that Subject Aux Inversion is in fact an instance of Verb-second (movement of the tensed verb). If this conclusion is correct, and if we accept the hypothesis of Den Besten (1977) that Verb-second moves the verb to the tense feature that is expressed by the complementizer, then we must assume the following structure for, for instance, (195a):
(196) [S' [COMP Who_j has] [S t_j INFL come]]
If this interpretation is correct, we have a strong counterexample against the doubly-filled COMP account of that-t phenomena, and also against the idea that a branching COMP blocks c-command. Under the alternative analysis, however, (196) is unproblematic. Again, we have a subject trace that must be bound in S. Again, this S is indistinguishable from S', because S' lacks an independent head (a complementizer). If we assume that INFL is the head of S, we see in (196) that this head has been moved to the head position of S', so that the two projections are literally indistinguishable, insofar as projections are defined by heads. A last virtue of the proposed analysis has to do with the interesting case discussed by Huang (1982) and mentioned before:
(197) Why_j do you think [S' [COMP t_j that] [S she left t_j]]?
Under most current analyses, the adjunct trace at the end of the sentence is not lexically governed but antecedent-governed by the intermediate trace in COMP. But this form of antecedent government is incompatible with the COMP-accessibility theories used in the same analyses. Under the alternative analysis, the rightmost trace is (nonstructurally) governed by VP, and it must, like all traces, be bound in its minimal governing category S'. This condition is met in (197), because as we saw above, that (Dutch dat) does not block the accessibility of the antecedent in COMP, if the trace is not governed by INFL. In summary, it appears that the ECP is not only unnecessary for an explanation of the that-t phenomena, it is also incapable of explaining the difference between English and Dutch. This difference is explained by an Aoun-type theory that reduces this instance of the ECP to the NIC (ultimately the binding theory). I will now turn to global harmony and the role of bounding in constructions with Wh-elements in situ. In this domain, we find very little overlap with the grammar of overt gaps. Recall that a domain extension for gaps is triggered by a structural governor; furthermore, the domain extension appeared to be defined by a dynasty of equally oriented governors. Neither condition applies to the grammar of Wh-elements in situ. It was already observed by Huang (1982) that Wh-phrases can occur inside adjuncts that are islands for overt Wh-movement:
(198) a. Who_j left [despite which warning]?
      b. *Which warning_j did he leave [despite t_j]?
An adjunct preposition like despite cannot be stranded. This can be explained by assuming that it is not a structural governor, so that the necessary domain extension is not triggered. But then it is obvious that a structural governor is not needed for the extended domain of a Wh-phrase in situ. This might also be expressed by saying that if LF movement allows preposition stranding, it obeys other laws than preposition stranding at S-structure. No matter how one phrases the difference, the contrasting sentences in (198) show a clear difference between the grammar of overt gaps (198b) and the grammar of Wh-in situ (198a). The difference shows up not only in the elements that trigger domain extensions (structural governors); an equally clear distinction exists with respect to global harmony. As we saw in the previous section, the domain extensions for gaps are heavily constrained by directionality constraints (the Condition of Global Harmony). These directionality constraints do not play a role in the grammar of Wh-in situ. In order to show this, I will briefly discuss the three areas in which the Condition of Global Harmony appears to constrain the grammar of gaps in Dutch (preposition stranding, parasitic gaps, and islands). As we saw above, global harmony entailed that only postpositions
could be stranded in Dutch. Prepositions could not be followed by gaps. As for Wh-elements in situ, there is no problem at all. They can easily follow a preposition, in spite of the fact that the Condition of Global Harmony is violated, as indicated by the arrows: (199)
Wie_j t_j heeft met wie_j gesproken?
who has with who talked
'Who talked to whom?'
Such sentences are grammatical, no matter which preposition one chooses. Recall, for instance, that the Condition of Global Harmony prevented Wh-movement of an R-pronoun if it followed a preposition:
(200) a. Hij heeft de cognac [PP voor [PP er bij]] gekocht
         he has the cognac for there with bought
         'He bought the cognac to go with it'
      b. *Waar_j heeft hij de cognac [PP voor [PP t_j bij]] gekocht?
         where has he the cognac for with bought
Again, we get a grammatical sentence if we replace the gap in (200b) by a Wh-element, in spite of the lack of global harmony: (201)
Wie_j t_j heeft de cognac [PP voor [PP waar bij]] gekocht?
who has the cognac for where with bought
'Who bought the cognac to go with what?'
No matter what examples one tries to construct, global harmony appears to be irrelevant for Wh-elements in situ. This is also clear from examples involving extraposition. We saw in the previous section that extraposition turns PPs into absolute islands. Not so for Wh-phrases in situ: these categories can easily occur in an extraposed PP: (202)
Wie_j t_j heeft t_k gesproken [PP met wie_j]_k?
Note that it is extremely unlikely that global harmony can be preserved in (199) by Pied Piping (of the whole PP met wie) at LF. Pied Piping is a step away from LF, it requires reconstruction, after which the Condition of Global Harmony might still be violated after extraction of the NP (wie), the only element with scope after all. Pied Piping at LF is an obscure notion, which creates more problems than it solves. Moreover, it would require Pied Piping of categories that normally cannot be pied piped in Dutch: (203)
Wie_j t_j heeft [NP de vader [PP van wie]] gezien?
who has the father of who seen
'Who saw the father of whom?'
Global harmony can only be preserved in (203) by applying LF movement to the whole NP (indicated by brackets) in (203). But this NP does not qualify as a Wh-phrase in Dutch, or at least it does not undergo overt Wh-movement: (204)
*Ik vraag me af [NP de vader [PP van wie]]_j hij t_j zag
I wonder the father of who he saw
Since Wh-elements in situ can be indefinitely far embedded in Dutch, the notion of Pied Piping would lose all significance. Last but not least, there is good evidence that scope assignment to Wh-elements is limited to NPs in certain contexts (as I will argue below). This conclusion is incompatible with LF Pied Piping, which would involve movement of PP from the contexts in question. Global harmony excludes parasitic gaps from most adjuncts in Dutch. Again, it appears that Wh-elements in situ form no problem in these contexts, in spite of the lack of global harmony:
(205) Welke jongen heeft welk meisje gekust [voor [dat hij welk boek las]]?
      which boy has which girl kissed before that he which book read
      'Which boy kissed which girl before he read which book?'
Even in the very ungrammatical subject cases of parasitic gaps, we can replace the gap by a Wh-element without ungrammaticality: (206)
Wie zei dat [NP goede vrienden van wie] welke man bewonderd hebben?
who said that close friends of who which man admired have
'Who said that close friends of whom admired which man?'
So, in the domain of parasitic gap constructions as well, we find a clear contrast between the distribution of gaps and the distribution of Wh-elements in situ. The grammar of the former is characterized by the Condition of Global Harmony, which appears to play no role in the grammar of the latter. The same appears to hold for islands. The Condition of Global Harmony explains why islands are relatively strict (for overt Wh-movement) in an SOV language like Dutch. But, as expected, islands are not at all strict for Wh-elements in situ. Thus, recall the following contrast between the relatively acceptable English sentence (207a) and the fully unacceptable Dutch sentence (207b) (both are violations of the CNPC):
(207) a. Which race_j did you express [NP a desire [to win t_j]]?
      b. *Welke race_j heb je [NP een verlangen [(om) t_j te winnen]] uitgedrukt?
Wie heeft [NP een verlangen [om welke race te winnen]] uitgedrukt?
who has a desire COMP which race to win expressed
'Who expressed a desire to win which race?'
Similarly, scope assignment to Wh-elements in situ can violate the Wh-island condition in Dutch, which is usually rather strict for overt Wh-movement as we saw (but see chapter 1). Consider first a relevant example in English: (209)
Who remembers where we bought which books?
According to the standard Baker analysis (Baker (1970)), which books can have wide scope. If global harmony were relevant for Wh-in situ, this would be predicted to be impossible in Dutch. Nevertheless, welke boeken can have matrix scope in (210) just like in the English example: (210)
Wie herinnert zich wie welke boeken gekocht heeft?
who remembers who which books bought has
In other words, the Wh-island condition does not apply to Wh-in situ in Dutch. Examples could easily be multiplied, and the conclusion seems clear. There is a fundamental difference between (overt) gaps and Wh-elements in situ in terms of global harmony. Since directionality constraints play such a fundamental role in the grammar of S-structure gaps, the distribution of Wh-elements, not subject to these directionality constraints, is much more liberal. It seems to me that the common conclusion that bounding is also irrelevant for Wh-elements in situ is in part due to the very significant difference with respect to directionality constraints, which seem, without further analysis, interwoven with bounding conditions. A standard argument has been that there are examples of Wh-elements in situ for which scope assignment violates the island conditions. Contrary to what has often been done, one cannot immediately conclude from these facts that Subjacency is irrelevant for Wh-in situ. After all, we do not conclude from the fact that overt Wh-movement violates islands (see (207a), for instance) that Subjacency is irrelevant for gaps at S-structure. The role of Subjacency (the Bounding Condition, in my opinion) can only indirectly
be demonstrated in such examples, by showing that the properties of gaps in islands are different from those of gaps in contexts without island violations. Gaps that are not strictly locally bound (in accordance with the (Extended) Bounding Condition) differ in two fundamental respects: (i) they must be bound in accordance with the Condition of Global Harmony, and (ii) they must be NPs, in accordance with the Cinque-Obenauer hypothesis. As for Wh-elements in situ, no one (as far as I know) has ever tried to demonstrate a difference between such Wh-elements that are locally linked to their scope markers and Wh-phrases that are linked long distance. If there is such a difference, bounding is relevant to the grammar of Wh-in situ, contrary to what has been concluded so far. What we are looking for, of course, are similarities with what we have seen for overt gaps. But we have already concluded that these similarities are nonexistent in at least one respect: global harmony. When Wh-elements are linked to their scope marker outside the strictly local domain, this linking does not obey the Condition of Global Harmony. There is only one other known similarity with overt gaps possible, namely, similarity in terms of the Cinque-Obenauer hypothesis. According to this, only NPs can be linked to an antecedent outside the strictly local domain. It seems to me that this is exactly what we find in the various languages that have been studied from this perspective. Huang (1982) points out that adjuncts like why and how cannot be moved out of a Wh-island in LF, contrary to NPs and adjuncts like where and when. Huang shows that there are reasons to assume that where and when are NPs (they appear as complements in PPs like since when and from where, for instance). Nevertheless, Huang interprets the relevant distinctions in terms of the complement/noncomplement distinction. Where and when are taken to be complements, together with object NPs.
But as I already briefly indicated, there are no reasons to assume that where and when are not adjuncts. Being an NP is not incompatible with adjunct status, as is clear from the existence of temporal adjuncts like yesterday, these days, the day I left, etc. On the other hand, it is highly counterintuitive to consider where and when as verbal complements in most cases. All in all, it seems to me that Huang has demonstrated a difference between NPs and non-NPs and not a difference between complements and noncomplements. But even if this is the right conclusion, it cannot be very firmly established because there are, as we will see, too many problematic data. As for the data of Chinese and Japanese that have been discussed in the literature, the NP/non-NP distinction seems to correspond with the local/nonlocal distinction. Thus, consider the following Chinese data from Huang (1982):
(211) Ni xiang-zhidao [S' shei mai-le sheme]
      you wonder who bought what
      a. What is the thing x such that you wonder who bought x
      b. Who is the person x such that you wonder what x bought
Global Harmony, Bounding, and the ECP
According to the (a) and (b) interpretations (also discussed by Lasnik and Saito (1984)), both shei 'who' and sheme 'what' can have wide scope, in spite of the fact that the embedded clause is a Wh-island. In other words, Wh-NPs can have scope outside their minimal local domain. If the embedded question contains a non-NP, like weisheme 'why', this non-NP cannot have wide scope (# means "impossible reading"):
(212) Ni xiang-zhidao [S' Lisi weisheme mai-le sheme]
      you wonder Lisi why bought what
      a. What is the thing x such that you wonder why Lisi bought x
      b. #What is the reason x such that you wonder what Lisi bought for x
In the (a) interpretation, the NP what has wide scope, a reading which is not possible for the adjunct in (b). An adjunct that behaves like weisheme is zeme 'how':
(213) Ni xiang-zhidao [S' shei zeme mai-le shu]
      you wonder who how bought book
      a. Who is the person x such that you wonder how x bought books
      b. #What is the manner x such that you wonder who bought books in x
The subject NP shei can have wide scope, like the object NP, but the adjunct zeme cannot have wide scope, which means that it cannot escape from the Wh-island. Similar contrasts, according to Huang, can be found with respect to complex NPs. Shei and sheme can have a scope wider than the complex NP, while the adjuncts cannot escape from the complex NP:
(214) a. [NP [S' shei xie] de shu] zui youqu
         who write DE book most interesting
         'Books that who wrote are the most interesting?'
      b. [NP [S' ta taolun sheme] de shu] zui youqu
         he discuss what DE book most interesting
         'Books in which he discusses what are most interesting?'
(215) a. *[NP [S' ta weisheme xie] de shu] zui youqu
         he why write DE book most interesting
         'Books that he wrote why are most interesting?'
      b. *[NP [S' ta zeme xie] de shu] zui youqu
         he how wrote DE book most interesting
         'Books that he wrote how are most interesting?'
Lasnik and Saito (1984) point out that another language without overt Wh-movement, Japanese, behaves like Chinese in this respect. The following examples show, for instance, that nani 'what' can have wide scope
(with respect to a complex NP), while naze 'why' cannot:
(216) a. [NP [S' Taroo-ga nani-o te-ni ireta] koto]-o sonnani okotteru no
         Taro-nom what-acc obtained fact-acc so much be angry Q
         'You are so angry about the fact that Taro obtained what?'
      b. *[NP [S' Taroo-ga naze sore-o te-ni ireta] koto]-o sonnani okotteru no
         Taro-nom why it-acc obtained fact-acc so much be angry Q
         'You are so angry about the fact that Taro obtained it why?'
Adjuncts with the meaning of when and where do not behave like why but like what in Chinese and Japanese. Thus, the following sentence is ambiguous, like (211) and unlike (212):
(217) Ni xiang-zhidao [Lisi (zai) shemeshihou mai-le sheme]
      you wonder Lisi (at) when bought what
      a. What is the thing x such that you wonder when Lisi bought x
      b. What is the time x such that you wonder what Lisi bought at x
Nali 'where' is also NP-like in Chinese and behaves in the same way. It can for instance be embedded in a complex NP, according to Huang (1982):
(218) [NP [S' ta zai nali pai] de dianying] zui hao
      he at where film DE movie most good
      'Movies that he filmed where are the best?'
What all these data demonstrate is that islands delimit the scope of non-NPs, but not the scope of NPs (even if these NPs are non-arguments or adjuncts). Huang further observes that non-NP Wh-operators may go long distance as long as they do not cross islands:
(219) Ni renwei [ta weisheme meiyou lai]
      you think he why not come
      'Why_i do you think that [he didn't come t_i]?'
This fact also illustrates another feature that distinguishes Chinese and Japanese from English: words like why can remain in situ. Lasnik and Saito (1984) point out that the adjuncts can also remain in place if there is another Wh-phrase in the same clause or in a higher clause. They give illustrative examples from Japanese:
(220) a. Kimi-wa nani-o naze sagasiteru no
         you-topic what-acc why looking-for Q
         'Why are you looking for what?'
      b. Kimi-wa dare-ni John-ga naze kubi-ni natta tte itta no
         you-topic who-to John-nom why was fired COMP said Q
         '*To whom did you say that John was fired why?'
The different behavior of NPs and non-NPs can also be found in languages with overt Wh-movement. As in Japanese and Chinese, the distinction corresponds with local domains. In languages like German and Dutch, the distinctions in question are perhaps very similar to what we have seen in the Asian languages. English and French seem to require a slightly different definition of the notion "local domain" in these cases. In German and Dutch, adjuncts can remain in situ, as in Japanese and Chinese:
(221) a. Wer ist warum weggegangen?
         who has why left
         '*Who left why'
      b. Wie is waarom weggegaan?
         who has why left
According to Haider (1986), we find the by now familiar NP/non-NP distinction in German, as soon as we embed the Wh-phrases in islands:
(222) a. Wer bezweifelt die Tatsache, dass Fritz was verloren hat?
         who doubts the fact that Fritz what lost has
         'Who doubts the fact that Fritz has lost what?'
      b. *Wer bezweifelt die Tatsache, dass Fritz warum verloren hat?
         who doubts the fact that Fritz why lost has
The Dutch counterparts of these sentences are as follows:
(223) a. Wie betwijfelt het feit dat Fritz wat verloren heeft?
      b. *Wie betwijfelt het feit dat Fritz waarom verloren heeft?
The contrast is rather subtle, but it seems to me that it is real. If the facts are correctly represented, we observe what we saw in Chinese and Japanese: scope assignment for Wh-NPs can violate island constraints, and scope assignment for adjuncts cannot. Since the Cinque-Obenauer approach made the same distinction for overt gaps, we might say that the facts just mentioned form strong evidence for LF movement: we have to create LF gaps in order to state the Cinque-Obenauer split between NPs and non-NPs as a generalization
over all gaps. In any case, the role of the Bounding Condition for Wh-in situ is confirmed: within local domains, scope is assigned to Wh-phrases of all categories, while scope markers outside of such a local domain are only accessible to NPs. It is not entirely obvious, however, that the Cinque-Obenauer distinction between NPs and non-NPs must be made in terms of gaps. In fact, the appeal of this analysis for gaps was tied up with the idea that empty pros had to be NPs, just like lexical pronominals. According to Cinque (1983b), empty pro, contrary to traces, is not created by move alpha. If we accept this approach, it is not possible to derive the pronominal features of long-distance-bound Wh-in situ by replacing it (by LF movement) with empty pro. But there is an alternative to LF movement that has been available since Baker (1970) (who developed certain ideas of Katz and Postal (1964)). According to this approach, scope is assigned to a Wh-element in situ by coindexing it with an abstract question operator Q, which can serve as scope marker (see also the alternative account in terms of vertical locality in chapter 1 above):
(223) c. ... [β Q_i [ ... Wh_i ... ]] ...
In all cases that I know of, this approach is at least empirically equivalent to LF movement of the Wh-element. Within the local domain β, any category can be linked to Q.16 This linking would be in accordance with the Bounding Condition (5a). As is generally the case with this general purpose locality principle, there is no reason to expect any special conditions on the Wh-element in (223c). Usually, special conditions, like the conditions on the nature of dynasties, are only to be expected if a linking involves an extended domain. Suppose now that such special conditions are also required if a Wh-element is linked to Q beyond the limits of the Bounding Condition. Suppose furthermore that these special conditions come down to the requirement that the Wh-element is [+pronominal]:
(224) ... Q_i ... [β ... [ ... Wh_i ... ]] ...
                              [+pro]
Under the further assumption made by Cinque and Obenauer that the feature [+pro] is only compatible with NPs, the facts that we have discussed are accounted for: only NPs can have an antecedent outside of β, i.e. in one swoop. Like all other indexing, indexing to Q can be iterated (compare "successive cyclic" movement). In this way, non-NPs can also be linked at a distance, so long as the intermediate COMPs are free:
(225) ... Q_i ... [Q_i ... [Q_i ... Wh-adjunct_i ... ]] ...
This accounts for the fact that adjuncts like why can have scope outside their minimal clause in Chinese, Japanese, etc. An index in COMP is not free if the clause is subcategorized by a verb that takes indirect questions, like wonder. Thus, the following structure is impossible:
(226) *... Q_i ... wonder [Q_i [ ... Wh_i ... ]] ...
The rightmost Q_i is in a position where it is governed by the verb. This always entails that the scope of elements linked to this position is fixed. This can immediately be observed in languages with overt Wh-movement. It is certainly not the case that overt Wh-movement as such fixes the scope of a Wh-element, as can be seen in languages like German or Hungarian, which have scope markers distinct from the Wh-phrase.17 But if Wh-movement is to a position where the Wh-phrase is governed by a verb like wonder (i.e. a verb that takes questions), the scope is always fixed. This means that adjuncts cannot be linked beyond a subcategorized Q (as in (226)). Thus, for a word like why, (227a) is possible but (227b) is not:
(227) a. ... Q_i ... [Q_i ... [Q_i ... why_i ... ]] ...
      b. *... Q_i ... [wonder [Q_i ... why_i ... ]] ...
The adjunct cannot be linked directly to the leftmost Q_i, because non-NPs can only be locally linked to Q, in accordance with the Bounding Condition. In other words, the intermediate Q may not be skipped in (227b). For NPs, the situation is different, because NPs can be pronominal. They can therefore skip an intermediate subcategorized Q, as long as this Q is assigned the index of some other Wh-phrase. Thus, for the Chinese or Japanese equivalent of I wonder who saw what (with all Wh-phrases in situ) there are the following possibilities:
(228) a. [ ... [ ... wonder [Q_1,2 [who_1 ... what_2]]]] ...
      b. [Q_1 [ ... wonder [Q_2 ... [who_1 ... what_2]]]] ...
      c. [Q_2 [ ... wonder [Q_1 ... [who_1 ... what_2]]]] ...
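The linking conditions illustrated in (223c) through (228) can be summarized in a small sketch. It is only an illustration under simplifying assumptions: the class `WhPhrase`, the function `can_link_to_q`, and the status labels "free", "subcategorized", and "island" are hypothetical names introduced here, and the further requirement that a skipped subcategorized Q bear the index of some other Wh-phrase is not modelled.

```python
from dataclasses import dataclass
from typing import Sequence

# Assumption: following the Cinque-Obenauer hypothesis as stated in the text,
# only NPs are compatible with the feature [+pro].
PRONOMINAL_CATEGORIES = frozenset({"NP"})


@dataclass(frozen=True)
class WhPhrase:
    form: str      # e.g. "shei", "weisheme"
    category: str  # hypothetical labels: "NP" for NPs, "ADV" for non-NP adjuncts


def can_link_to_q(wh: WhPhrase, intermediate_comps: Sequence[str]) -> bool:
    """Decide whether `wh` (in situ) can be linked to a target Q.

    `intermediate_comps` describes each COMP between the local domain of
    `wh` and the target Q: "free" if indexing can be iterated through it,
    anything else (e.g. "subcategorized", "island") if it cannot.
    An empty sequence means the target Q is inside the local domain.
    """
    if not intermediate_comps:
        return True  # local linking is open to every category
    if wh.category in PRONOMINAL_CATEGORIES:
        return True  # [+pro] NPs may be linked long distance in one swoop
    # Non-NPs can only be linked step by step through free COMPs.
    return all(status == "free" for status in intermediate_comps)


why = WhPhrase("weisheme", "ADV")
who = WhPhrase("shei", "NP")
print(can_link_to_q(why, ["free", "free"]))    # pattern (227a)
print(can_link_to_q(why, ["subcategorized"]))  # pattern (227b)
print(can_link_to_q(who, ["subcategorized"]))  # pattern (228b)
```

The sketch treats every non-free COMP alike, so a complex NP as in (215) and a COMP selected by wonder as in (227b) block a non-NP in the same way.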
Contrary to what we saw with adjuncts (227b), NPs like the Chinese equivalents of who and what can skip the intermediate Q. As NPs, they can have the feature [+pronominal], so that they can be bound outside their local domain. The approach just sketched differs from the account of Huang (1982) (and the related account of Lasnik and Saito (1984)) in a number of respects. First of all, it is not based on LF movement but on indexing at S-structure. Furthermore, the distinction between NPs and certain adjuncts is not derived from the notion of government. In Huang's approach, an adjunct like why is not properly governed, unless it is antecedent-governed. LF movement is then necessary in order to create the properly indexed structure. Illegitimate structures arise if antecedent government is somehow blocked. The major idea in that respect is that multiple Wh-movement leads to branching COMPs, which are partially inaccessible for antecedent government. In the account given above, all adjuncts are sufficiently governed in situ by the VP node; LF movement is therefore not necessary for the purposes of proper government (it is not necessary to create a local controller or antecedent governor). But since antecedent government is not necessary for adjuncts, it can be entirely dispensed with. In what follows, I will criticize the major assumptions of the other account. Indexing of Wh-phrases to Q might look like a notational variant of LF movement, but in fact it has a slight empirical advantage. Furthermore, I will show that the most common COMP-inaccessibility theory, based on an idea of Aoun, Hornstein, and Sportiche, is presumably incoherent and in any case empirically inadequate. Here it suffices to mention that the theory sketched above covers all facts that are also covered by Huang's approach. In particular, this theory gives a correct account of the distributional differences between NP Wh-phrases and non-NP Wh-phrases in Chinese, Japanese, and other languages. But there is also a definite empirical advantage in comparison with Huang's theory. In the theory discussed above, the fact that (the Chinese equivalents of) when and where behave differently from why and how follows from the application of the Cinque-Obenauer hypothesis, in conjunction with the idea that the Bounding Condition provides the demarcation line between the occurrence of NPs and non-NPs. Since where and when are NPs, and why and how are not, the former two adjuncts allow long distance linking, while the latter two do not allow it.
Under the other account, however, the difference between the two kinds of adjuncts is a complete mystery. There is no reason to assume that where and when are arguments or complements like who or what. Nor is there any reason to assume that the proper government conditions for when and where differ from those that are needed for other adjuncts. There are also problems that are neutral between the two accounts. The most difficult empirical problems are caused by the behavior of adjuncts in languages like English or French. One of the most intriguing of Huang's observations is that there is also a distributional difference in English, comparable to what we saw in Chinese, between where and when on the one hand, and how and why on the other hand. Thus, most speakers agree, according to Huang, that there is a contrast between (229a-d) and (229e-f):
(229) a. Who remembers where we bought what?
      b. Who remembers where we met who?
      c. Who remembers what we bought where?
      d. Who remembers what we bought when?
      e. *Who remembers what we bought why?
      f. *Who remembers what we bought how?
As in Chinese and Japanese, where and when differ from the other adjuncts in that they pattern with the NPs. But contrary to the Asian languages, and presumably also contrary to German and Dutch, an adjunct like why cannot remain in situ at all:
(230) a. *Who left why?
      b. *Who said he left why?
The explanation given by Huang (1982) and Lasnik and Saito (1984) is based on the COMP-indexing mechanism first proposed by Aoun, Hornstein, and Sportiche (1981). According to this approach, there is a COMP-indexing rule of the following form:
(231) [COMP ... X_i ... ] → [COMP_i ... X_i ... ]
This rule is used to account for the superiority facts:
(232) a. I wonder who_i t_i saw what
      b. *I wonder what_j who saw t_j
The explanation runs as follows. Rule (231) applies at S-structure, so that the COMPs in (232a) and (232b) receive the indices i and j, respectively. After LF movement, the following structures are derived:
(233) a. I wonder [COMP_i what_j who_i] t_i saw t_j   (from (232a))
      b. I wonder [COMP_j who_i what_j] t_i saw t_j   (from (232b))
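The index inheritance of rule (231), as applied to (232), can be sketched as follows. This is a simplified illustration, not the formulation of Aoun, Hornstein, and Sportiche: the pair representation of Wh-phrases and the function names `comp_index` and `comp_matches_subject_trace` are assumptions made here, and the final comparison only anticipates the ECP reasoning that the text goes on to describe.

```python
from typing import Optional, Sequence, Tuple

# A Wh-phrase is represented as a (form, index) pair; this representation
# is a hypothetical convenience, not part of the original proposal.
WhItem = Tuple[str, str]


def comp_index(comp_at_s_structure: Sequence[WhItem]) -> Optional[str]:
    """Rule (231), simplified: COMP inherits the index of the Wh-phrase
    it contains at S-structure. Returns None for an empty COMP."""
    if not comp_at_s_structure:
        return None
    _form, index = comp_at_s_structure[0]
    return index


def comp_matches_subject_trace(
    comp_at_s_structure: Sequence[WhItem], subject_index: str
) -> bool:
    """True if the index inherited by COMP is that of the subject trace,
    so that COMP itself could serve as the antecedent governor."""
    return comp_index(comp_at_s_structure) == subject_index


# (232a): who_i is in COMP at S-structure; the subject trace bears index i.
print(comp_matches_subject_trace([("who", "i")], "i"))
# (232b): what_j is in COMP at S-structure; the subject trace bears index i.
print(comp_matches_subject_trace([("what", "j")], "i"))
```

Phrases added to COMP by LF movement are deliberately ignored, since on this account the index is fixed at S-structure.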
Thanks to rule (231), the COMPs have inherited the index of the Wh-phrase that they contained at S-structure. LF movement of the remaining Wh-phrase leads to branching COMPs. This has the consequence that (233a) is well-formed according to the ECP, while (233b) is not. In (233a), the subject trace t_i is antecedent-governed by COMP (which has received the index i), so that the subject trace is properly governed. The object trace t_j is also properly governed, namely by the verb saw. In (233b), however, the subject trace remains ungoverned, because the COMP has inherited the wrong index, namely the index j of the object that it contained at S-structure. The subject trace cannot be directly governed by who_i, because this Wh-phrase in COMP does not c-command the subject trace due to the branching COMP. This ingenious procedure is also supposed to explain the ungrammaticality of (230a) and (230b). Like the subject in (232b), the adjunct why is not properly governed at S-structure (by hypothesis). Therefore, it must be antecedent-governed. A local controller in the required sense can only be created by moving why to COMP at LF. But there it forms a branching COMP with who, so that c-command (necessary for local control) is
impossible. Rule (231) does not help, because at S-structure COMP has received the index of who, not of why. This is the current standard account of (230). It seems to lead to a serious technical problem, and it is definitely inadequate from an empirical point of view. The technical problem can be demonstrated with (233). If the Wh-phrases form a branching COMP that blocks direct c-command of the traces, we can perhaps explain why the subject trace t_i is not antecedent-governed by who_i in (233b). But the same inaccessibility of the antecedent also prevents proper binding: normally speaking, a trace must be c-commanded by a Wh-phrase in COMP. In other words, destroying c-command for the purposes of antecedent government also destroys the c-command relation necessary for other purposes. It is not clear how this technical problem can be solved. But even if it can be solved, the whole procedure leads to empirical problems. The first problem is the well-known fact that adding a third Wh-phrase considerably improves (232b) (see Kayne (1983)):
(234) I wonder what who saw where
If COMP bears the index of what, neither the trace of who nor that of where is properly governed after LF movement. Or, even if where were properly governed at S-structure, who would be improperly governed exactly as it is in (233b). Kayne's Connectedness Condition cannot save (234) at LF, because that condition crucially applies at S-structure. After LF movement, the Connectedness Condition can only rescue (234) if (233b) is also rescued. Another serious problem is that languages like Chinese and Japanese allow structures like (230), with an adjunct like why in situ. Lasnik and Saito (1984) give examples like the following:
(235) a. Kimi-wa nani-o naze sagasiteru no?
         you-topic what-acc why looking-for Q
         'Why are you looking for what?'
      b. Kimi-wa dare-ni John-ga naze kubi-ni natta tte itta no?
         you-topic who-to John-nom why was fired COMP said Q
         '*To whom did you say that John was fired why?'
After LF movement, (235a) has one of the following two structures (naze and nani in COMP, on the right in Japanese):
(235) c. [Kimi-wa t_1-o t_2 sagasiteru] [naze_2 nani_1] no
      d. [Kimi-wa t_1-o t_2 sagasiteru] [nani_1 naze_2] no
The problem is that with a branching COMP neither nani nor naze c-commands its trace. So, how can (235a) with the adjunct naze 'why' in situ
be saved? Lasnik and Saito (1984) propose to solve the problem by means of the notion "head of COMP" (cf. Stowell (1981, ch. 6)). In rule (231), COMP receives the index from the Wh-phrase which is its head. In a language like Japanese, with only LF movement, we are free to choose any Wh-phrase in COMP as the head at LF: either naze or nani in (235c-d). In this way, we can derive a structure in which the trace of the adjunct is antecedent-governed by COMP at LF. In a language like English, with overt Wh-movement, a COMP is filled with a Wh-phrase at S-structure; this Wh-phrase is the head of COMP at S-structure, but also at LF, because Lasnik and Saito stipulate that if COMP has a head at S-structure, nothing else can be the head of COMP at LF. Note that the stipulation undermines the explanatory force of the account. As the Japanese example shows, it can be determined at LF what the head of COMP is. After LF movement, the English structure of the equivalent of (235a) is indistinguishable from the Japanese structure, apart from indexing. Re-indexing at LF can only be prevented in the English structure by stipulation. The real problem with the Lasnik-Saito account is empirical. Their theory entails that in a language with overt Wh-movement, an adjunct like why cannot remain in situ: since the head of COMP is already fixed at S-structure in such a language, there is no way to assign the index of the adjunct to COMP in order to create proper antecedent government. This prediction happens to be false. As we have seen, a language like German has overt Wh-movement, but still allows adjuncts like why to be in situ. Thus, Haider (1986) gives examples like the following:
(236) a. Wer ist warum weggegangen?
         who has why left
      b. Ich weiss nicht, wer weshalb verloren hat
         I know not who why lost has
         '*I don't know who lost why'
All in all, it seems to me that the account in terms of antecedent government and COMP-accessibility (as determined by the various possibilities of COMP-indexing) is not only technically problematic but also seriously inadequate from an empirical point of view. There is a very simple alternative based on the analysis given above. Under this alternative, LF movement is replaced by coindexing with Q, and antecedent government is entirely dispensed with. Recall that if an adjunct in situ leads to ungrammaticality, this is due to the fact that Q is not accessible in a local context. Thus, in Chinese and Japanese, Q was not accessible across islands. Only NPs (with the feature [+pronominal]) could be linked to Q at a distance. Reconsider now the paradigm in (229) and the ungrammatical sentences in (230). The sentences with why and how are ungrammatical, while the sentences with NPs (including adjuncts) are grammatical. This is of
course the familiar distribution of data that in other contexts was explained by the Bounding Condition and the Cinque-Obenauer hypothesis. In other words, the facts in (229) and (230) are explained if why and how are somehow not accessible to Q. A very simple way of implementing this idea is parametrizing the Bounding Condition for linking of Wh-phrases to Q: in Chinese, Japanese, and also in German and Dutch, S' is the local domain for this linking, while in English and French, S is the relevant domain. Thus, the following (direct) linking to Q is legitimate in many languages but not in English and French:
(237) ... [S' Q_i ... [S ... why_i ... ]] ...
If S is the local domain for linking to Q (the demarcation in terms of the Cinque-Obenauer hypothesis), we predict that only NPs can be linked to Q in English. Thus, the following linking is also possible in English:
(238) ... [S' Q_i ... [S ... what_i ... ]] ...
Since the local boundary S is crossed, this is a long distance linking in English. As in the many cases discussed before, long distance linking is only possible for NPs like what (and when, etc.), which have the necessary pronominal features.18 If this solution is tenable, we have a very simple explanation for the fundamental fact observed by Huang (1982), that the distribution of Wh-phrases in situ in Chinese that are linked to Q across an island is very similar to the distribution of Wh-phrases within S'. If we assume S as a bounding node for French as well, we have an explanation for a fact that remains unaccounted for in Lasnik and Saito (1984). French has overt Wh-movement, but it is not obligatory under all circumstances. Thus, the object may remain in situ in the following example:
(239) Tu as vu qui
      you saw whom
The remarkable fact is that the adjunct pourquoi cannot remain in situ (see Aoun (1986)):
(240) *Tu es venu pourquoi
This fact does not follow from the Lasnik-Saito account, because pourquoi can be moved to COMP at LF where it can become the head of COMP, since COMP is not filled at S-structure in (240). Thus, nothing prevents assigning the index of pourquoi to COMP at LF, so that the adjunct trace is properly governed, thereby meeting the requirements of the ECP. Under the alternative account, however, the ungrammaticality of (240) follows
immediately:
(241) *[S' Q_i [S ... pourquoi_i ... ]]
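The parametrization behind (237) through (241) can be made concrete with a small sketch. It is an illustration only: the table `LOCAL_DOMAIN_FOR_Q`, the function `wh_in_situ_linkable`, and the category labels are hypothetical names chosen here. The sketch covers direct linking of a Wh-phrase in situ to the Q of its own clause, counts only NPs as [+pro], and ignores the subject (superiority) cases.

```python
# Assumed parameter setting, as proposed in the text: the local domain for
# linking a Wh-phrase to Q is S' in some languages and S in others.
LOCAL_DOMAIN_FOR_Q = {
    "Chinese": "S'",
    "Japanese": "S'",
    "German": "S'",
    "Dutch": "S'",
    "English": "S",
    "French": "S",
}


def wh_in_situ_linkable(language: str, category: str) -> bool:
    """Can a Wh-phrase in [S' Q [S ... Wh ... ]] be linked directly to Q?

    `category` uses the hypothetical labels "NP" and "ADV" (non-NP adjunct).
    """
    if LOCAL_DOMAIN_FOR_Q[language] == "S'":
        # Q is inside the local domain: any category can be linked.
        return True
    # The local domain is S, so the linking crosses its boundary and counts
    # as long distance: only [+pro] categories (here: NPs) qualify.
    return category == "NP"


print(wh_in_situ_linkable("French", "NP"))    # qui in (239)
print(wh_in_situ_linkable("French", "ADV"))   # pourquoi in (240)
print(wh_in_situ_linkable("German", "ADV"))   # warum in (221a)
print(wh_in_situ_linkable("English", "ADV"))  # why in (230a)
```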
If S is a bounding node, this linking is impossible, while the NP qui in (239) can be linked across a bounding node (as in Chinese, Japanese, and many other languages).19 Before going into some problems, I would like to point out that the account just given entails an argument against LF movement of Wh-phrases. As mentioned before, linking of Wh-phrases in situ to Q could be seen as a notational variant of LF movement. But if the parametrization just given is correct, this is not very likely. The reason is that for overt Wh-movement, S' is the bounding node:
(242) [S' Why_i [S did you leave t_i]]
Clearly, an adjunct trace can be bound across an S boundary. But, as we have seen, a Wh-adjunct cannot be linked to Q across S. Since overt Wh-movement and linking to Q do not have exactly the same locality properties, it becomes less likely that the two processes can be collapsed. I will now turn to some problems. First, I will discuss the problem of AP adjuncts, and then I will end with the superiority facts. Huang (1982) limits himself to adjuncts like where, when, why, and how. But if we consider a larger array of adjuncts, some problems arise, particularly with AP adjuncts. Thus, the following sentences are not so bad according to native speakers of English:
(243) a. Who_i t_i worked how long?
      b. Who_i t_i did it how often?
Assuming that phrases like how long are adjuncts, these data are problematic for all existing accounts. For the parametrization account, they are problematic because the adjuncts must be linked to Q across an S-boundary. But if that is a long distance linking, how long must be an NP, which it is not. Traditionally, these adjuncts have been considered APs with determiner how. I see no reason to doubt this traditional insight, so let us assume that these adjuncts are in fact APs. One solution would be to assume that APs can be pronominal, like NPs. This conclusion is perhaps not entirely unjustified. Ross (1969) has argued that there are many similarities between NPs and APs. Ross points out, for instance, that in many languages an adjective can be replaced by a pronoun:
(244) a. Peter ist klug, aber die Frauen sind es nicht (German)
         Peter is smart but the women are it not
      b. Pierre est intelligent, mais les femmes ne le sont pas (French)
The same example can be translated into Dutch:
(245) Peter is slim, maar de vrouwen zijn het niet
In English, this pattern is less productive, but even in this language there are examples, according to Ross:
(246) a. Harry is smart, although he doesn't look it
      b. People want me to be polite, but being it is often difficult
The assumption that APs can be pronominal has a number of consequences for real gaps. We predict, for instance, that APs can escape from islands more easily than PPs or adverbial adjuncts like why. This prediction cannot be directly tested with adjunct APs, because adjunct APs do not leave a structurally governed gap. As we saw before, domain extensions for ec's are only possible under certain directionality constraints and only if the gap itself is structurally governed. Since adjuncts are usually governed by VP (and not by V), they do not meet the requirement for extraction (domain extension). It is, however, possible to construct examples with APs in structurally governed positions, for instance the complement of made in (247):
(247) They made him very sick
It seems to me that extraction of Wh-phrases from these positions gives better results than extraction of PPs or adverbial adjuncts. Given the fact that extraction from tensed Wh-islands is usually bad in English, we expect the following sentence to be rather bad as well:
(248) ?*How big_i do you remember [what_j [they made t_j t_i]]
But even here we find a contrast with the following sentences (with extracted PPs or adjuncts), which are worse:
(249) a. *Why_i do you remember what they made t_i
      b. *How_i do you remember what they made t_i
      c. *To whom_i do you remember what they sent t_i
With complex NPs, the contrast is even clearer. Thus, (250a) seems to be better than (250b-c):
(250) a. ??How big_i did you express [a desire [to make it t_i]]
      b. *How_i did you express a desire to make it t_i
      c. *With whom_i did you express a desire to work t_i
I will assume, then, that APs are closer to NPs than to PPs or Adverbs
in their behavior: only NPs and APs can have pronominal features, which give more flexibility for long distance linking. Among the most problematic cases are the superiority facts (251a) and the related facts with Wh-phrases in subject position (251b):
(251) a. *I wonder [what_i [who saw t_i]]
      b. ??Who_i t_i said that who saw Bill?
In principle, these sentences could be grammatical under the account given above, because the subject who is an NP in both sentences: NPs can be linked to Q across an S-boundary. But if these sentences are definitely ungrammatical, it is not difficult to add a provision that rules these sentences out. Like other domain extensions, the domain extension for Wh-phrases in situ might be triggered by a certain type of government. Domain extensions for gaps, for instance, are triggered by structural government. Similarly, we might stipulate that the domain extension for Wh-phrases is only triggered by lexical government. This would be sufficient to rule out (251), because subjects are only governed by INFL, which is not a lexical governor. Objects are governed by V, and certain adjuncts by VP: both categories are therefore governed by a lexical element of type X^i, so that both allow domain extensions (adjuncts only if they are NPs). We could leave the matter here, but the problem is that we find conflicting reports about these sentences in the literature. What everyone agrees on is that (251a) is worse than (251b), but beyond that there is little agreement. As for (251a), it appears that there are several ways to improve this type of sentence:
(252) ?I wonder which books which girls read
If the subject which girls is not sufficiently governed, we would expect this sentence to be as bad as (251), which it is not. This casts some doubt on the proposed ways to rule out (251a). There have also been reports from other languages that claim that the equivalent of (251a) is grammatical. Haider (1986), for instance, claims that (253) is grammatical in German:
(253) Was hat wer gekauft?
      what has who bought
As for the Dutch equivalent, I find it difficult to make up my mind, as mentioned before:
(254) Wat heeft wie gekocht?
      what has who bought
The sentence does not seem entirely ungrammatical, which again casts doubt on the generality of the proposed explanations. As for (251b), the picture from the literature is even more confusing. In a paper that started much research on multiple questions (Baker (1970)), the following sentence (with the same structure as (251b)) is given as grammatical:
(255) Which girl regrets that which boy didn't dance with her?
Bresnan (1977, 191) claims that there exist "grammatical nonecho questions" of the following form:
(256) a. Who recommends that who be fired?
      b. Which man ordered that which woman be fired?
Bresnan explicitly comments that these sentences are "perfectly grammatical". To my ear, the Dutch version of such sentences is also perfectly grammatical:
(257) Welke man zei dat welke vrouw ontslagen moest worden?
      which man said that which woman fired had to be
So, there is at least substantial doubt about the status of sentences of type (251b). Assuming that there is not a real dialect split involved, we might hypothesize that such sentences are in fact grammatical, perhaps only slightly less acceptable because of the weak kind of government involved. In that case, (251a) might also be grammatical insofar as the ECP is concerned. It would not be too difficult to explain why (251a) is less acceptable than (251b): (251a) involves a Wh-phrase that interrupts the linking of another Wh-phrase to its trace, which always leads to less grammatical sentences, especially when the intermediate Wh-phrase is superior in some sense (see Fiengo (1980), Gueron (1982), Cinque (1983b), and also Koster (1978c, ch. 3)). All in all, it is very questionable whether the ECP is responsible for the negative judgments in (251). Most examples like (251b) are not bad enough, and (251a) clearly involves an independent factor that contributes to its marginality. But even if the marginality of the sentences in (251) is caused by insufficient government, the sentences can be saved by adding a third Wh-phrase, in accordance with Kayne's Connectedness Condition. It was demonstrated above that the linking of Wh-phrases in situ does not involve directionality constraints. It appeared that the Condition of Global Harmony is only relevant for S-structure gaps. In Kayne's formulation of the Connectedness Condition, the directionality constraints are built in, and are supposed to apply also to Wh-elements in situ. Nothing prevents us from adopting a more modular approach, according to which connectedness without the directionality constraints applies to
both gaps and Wh-elements. The directionality constraints could then be added for gaps only. If we adopt this approach, each Wh-element in situ can have g-projections (triggered by lexical government) that must form a subtree with the g-projections of other Wh-phrases in the sentence in the sense of Kayne (1983). Thus, the relative acceptability of (258) can be explained by the Connectedness Condition, even if stated without the directionality constraints:
(258) ?I wonder whatⱼ who saw tⱼ where
This sentence contains two Wh-phrases in situ that must be linked to Q in COMP. If who is insufficiently governed, it can be rescued as before: the path of where to the antecedent Q involves the node S, which immediately dominates who, so that a subtree is formed.²⁰ As Kayne (1983) has pointed out, this approach crucially assumes that Wh-phrases are linked to their antecedent at S-structure. After LF movement, (258) would be similar to the less acceptable (259) in all relevant aspects:
(259) *I wonder whatⱼ who saw tⱼ
After LF movement, both (258) and (259) would have only gaps in the original positions of the Wh-phrases. But the contrast between (258) and (259) shows that who is not "picked up" by an extra (lexically governed) gap, but by an extra Wh-phrase in situ. Since it is only at S-structure that Wh-phrases are in situ, it follows that S-structure is the relevant level for connectedness. We must conclude, then, that there are two good arguments that favor the hypothesis that Wh-phrases in situ are connected with their scope marker by S-structure linking, and not by LF-movement. The first argument is the S/S' parametrization that distinguishes linking to Q from movement to COMP. The second argument, connectedness, makes the point more directly: only at S-structure can the relevant paths be constructed.
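The rescue of who in (258) can be pictured with a small sketch. It assumes a deliberately simplified tree for the embedded question (the node labels and the parent map are illustrative choices, not Kayne's formal definition of g-projections) and treats a weakly governed Wh-phrase as rescued when its mother node lies on the path that a lexically governed Wh-phrase in situ forms toward Q:

```python
# Hypothetical parent map for the embedded question in (258)/(259):
# S' (hosting Q) dominates S; S dominates the subject "who" and VP;
# VP dominates the object trace and the adjunct "where".
PARENT = {
    "S'": None,
    "S": "S'",
    "who": "S",
    "VP": "S",
    "object-trace": "VP",
    "where": "VP",
}

def path_to_scope(node, parent=PARENT):
    """Return the nodes dominating `node`, from its mother up to S'."""
    path = []
    node = parent[node]
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def rescued(weak_phrase, licensed_in_situ, parent=PARENT):
    """A weakly governed Wh-phrase counts as rescued if its mother node
    lies on the path of some lexically governed Wh-phrase in situ."""
    mother = parent[weak_phrase]
    return any(mother in path_to_scope(w, parent) for w in licensed_in_situ)

# (258): "where" is in situ, and its path passes through S, the mother of "who".
rescued_in_258 = rescued("who", ["where"])
# (259): there is no further Wh-phrase in situ, so nothing picks up "who".
rescued_in_259 = rescued("who", [])
print(rescued_in_258, rescued_in_259)
```

On these assumptions only in-situ phrases contribute paths, which mirrors the point that who is picked up by an extra Wh-phrase in situ rather than by an extra gap.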
4.7. Conclusion
The main focus of this chapter has been the question whether the rapprochement between the grammar of gaps and the grammar of Wh-elements in situ in terms of the ECP is justified. Our starting point was one of the richest conceptions of the ECP, Kayne's Connectedness Condition, in which the following elements could be distinguished:
(260) a. proper government (standard ECP)
      b. bounding (Subjacency)
      c. percolation (directionality)
Our clearest result concerns (260c): directionality constraints apply to overt gaps, not to LF gaps, i.e. not to Wh-elements in situ. Domain extensions for gaps must be formulated in terms of the Condition of Global Harmony, a development of Kayne's notion "canonical government configuration", not only for English and French, but also for languages with a mixed branching pattern like Dutch. Wh-movement can violate island constraints only if a dynasty can be formed of equally oriented domain governors. Wh-movement is a strictly local process in languages in which the dynasty in question cannot be formed. In such languages, Wh-movement is characterized by the Bounding Condition (one-node Subjacency), which is not only a stricter locality principle than standard Subjacency, but also a domain principle that plays a role in certain control structures (chapter 3) and in bound anaphora (chapter 6). The Bounding Condition, the unmarked locality principle, was concluded to be the demarcation between two kinds of gaps: gaps bound within the local domain can be of all categories, and gaps bound outside the local domain must have pronominal features (the Cinque-Obenauer hypothesis).
The Bounding Condition also plays a role in the grammar of Wh-elements in situ. Just as in the grammar of gaps, it is a demarcation criterion for two kinds of bound elements. Wh-elements that are linked to their scope marker within the local domain can be of all categories, and Wh-elements linked to their scope marker outside the local domain must be pronominal (NP, and perhaps AP). Here, then, it might seem that we have a clear similarity between the grammar of gaps and the grammar of Wh-elements, and an argument in favor of LF movement. This is an illusion, however. First of all, the Bounding Condition is not, like standard Subjacency, a property unique to movement rules. It is, on the contrary, a much more general locality principle that also characterizes a number of nonmovement linkings.
Furthermore, the nonlocally bound Wh-elements cannot be turned into gaps by movement, because movement is a local process by definition. But the clearest point is an empirical one. It appears that the Bounding Condition can have either S or S' as a bounding node, and that this parametrization can distinguish the grammar of gaps from the grammar of Wh-elements in situ within one language. Thus, in English S' is the bounding node for overt Wh-movement, while S is the bounding node for Wh-elements in situ. If this conclusion is correct, the Bounding Condition leads here in fact to an argument against the identification of Wh-movement and the linking of Wh-elements to their scope marker. So far, then, we have two clear differences between the grammar of gaps and the grammar of scope, namely in terms of (260b) and (260c). The Bounding Condition is implemented in two different ways for the two types of grammar, and the directionality constraints that are so characteristic for the grammar of gaps are entirely lacking in the grammar of Wh-elements in situ. So, if the rapprochement between the two grammars is
justified, there must be a significant similarity in terms of (260a), the standard ECP. According to the ECP, empty categories must be governed in a certain way, a way shared by S-structure gaps and LF gaps (Wh-elements in situ). Note that the similarity is not just in terms of government, but in terms of a particular kind of government. Simple government does not distinguish anything from anything, because all categories, including lexical categories, are somehow governed (see Bouchard (1984)). So, the ECP is only a meaningful generalization over gaps and Wh-elements if this particular kind of government can be found in both domains. The particular kinds of government involved in the ECP are: either lexical government by a category of type X⁰, or antecedent government. This is a stronger type of government than ordinary government. Subjects, for instance, can be governed by INFL, but then they do not meet the requirements of the ECP, because INFL is neither a lexical category nor an antecedent governor. Another important fact is that neither gaps nor Wh-elements are governed everywhere in the same way. Gaps bound in a local domain have to meet weaker conditions for government than gaps bound in an extended domain. This point has usually been overlooked in evaluations of the ECP. Let us begin, then, with a comparison between gaps and Wh-elements in local domains. As for gaps, it appears that there is no evidence that we need a stronger notion of government than just ordinary government. Stronger forms have been advocated for subject gaps and for adjuncts:
(261) a. Whoᵢ do you think [tᵢ [tᵢ left]]
      b. Whyᵢ did she leave tᵢ
In these cases, the traces are supposed to be antecedent-governed by a local controller in COMP. This local controller would be inaccessible to other material in COMP (the that-t effect):
(262) *Whoᵢ do you think [tᵢ that [tᵢ left]]
But, as we saw in the preceding section, a complementizer does not block accessibility in all cases: that-t sentences are usually grammatical in Dutch, and also, if an adjunct is extracted, the presence of that does not seem to block the accessibility of the intermediate trace:
(263) Whyᵢ do you think [tᵢ that [he left tᵢ]]
At least in these cases, antecedent government can be dispensed with entirely. In (263), the adjunct trace can be considered to be governed by VP, and for (261) there is no reason to assume that government of the subject trace by INFL is not sufficient. By dispensing with the notion
"antecedent government", several conceptual, technical, and empirical problems can be avoided. The that-t phenomenon must not be explained in terms of the inaccessibility of the antecedent governor, but in terms of the domain definition for the NIC. This is so far the only plausible way to explain the difference between English and Dutch, in my opinion. In sum, there are no cases where we need a notion of government for gaps in local domains that goes beyond the ordinary government that we also need for lexical anaphors. The same can be said about Wh-in situ. For Wh-elements that are NPs, it is not remarkable at all to be governed, since Case (necessary for all lexical NPs) is only assigned under government. What remains are the ungrammatical adjuncts in situ, and the superiority facts:
(264) a. *I wonder [S' what [S who saw]]
      b. *I wonder [S' who [S left why]]
According to our analysis, these sentences are not ungrammatical as a result of improper government, but as a result of the fact that linking to the scope marker is not strictly local in English, because S is the bounding node for this process. If scope is only assigned in an extended domain, the Wh-element must (preferably) be lexically governed (which is not the case in (264a)), and it must be an NP (which is not the case in (264b)). This approach, based on the Cinque-Obenauer hypothesis, explains why the following sentences, with NP adjuncts, are grammatical:
(265) a. I wonder who lived when
      b. I wonder who lived where
If adjuncts are insufficiently governed in general, and if they can only be rescued by antecedent government, these sentences remain a mystery. We must therefore assume that adjuncts in situ are in fact sufficiently governed (without antecedent government), and that the grammaticality of (265) (as opposed to (264b)) is caused by the NP status of when and where. If our analysis of (264) is correct, there are no remaining cases for which we need anything beyond ordinary government. If ordinary government suffices for both gaps and Wh-elements bound within local domains, it must be concluded that there is no evidence for the ECP at all within these local domains. The reason is deceptively simple: not only empty categories, but also lexical categories must be ordinarily governed. In fact, all categories must be governed, except PRO. So, the exceptional status of PRO is remarkable and requires explanation, not the fact that empty categories must be governed in general. Or at least, it does not need explanation beyond what can be said about the government requirement for lexical categories. If there is an ECP, conceptualized as a significant generalization over
gaps and Wh-elements in situ, it cannot be found within local domains. Therefore, it must show up in the behavior of these elements when they are bound in extended domains. But here we find the clearest differences between gaps and Wh-elements in situ. Domain extensions for gaps are only triggered if the gaps are structurally governed. Domain extensions for Wh-elements in situ are triggered by a weaker type of government, lexical government at best. This difference showed up in (198), repeated here for convenience:
(266) a. Whoⱼ left [despite which warning]?
      b. *Which warningⱼ did he leave [despite tⱼ]?
The second sentence, (266b), is ungrammatical because of the fact that despite is not a structural governor. This blocks the formation of a dynasty-based domain extension. From (266) it is clear that despite is a sufficient governor for the domain extension of Wh-phrases. It must therefore be concluded that in extended domains we find even less evidence for the ECP. That the ECP is a spurious generalization can be seen from the following summary of our results:
(267)    properties                gaps              Wh-in situ
      a. type of dynasty           global harmony
      b. type of bounding          S'                S (or S')
      c. type of category
         (i)  local                XP                XP
         (ii) nonlocal             NP (AP)           NP (AP)
      d. type of government
         (i)  local                ordinary          ordinary
         (ii) nonlocal             structural        lexical
The point of this list is that the similarities are all of an uninteresting kind, i.e. the similarities are shared by lexical categories. Within a local domain, all dependent categories, lexical or not, are governed. For nonlocal binding, pronominal features are necessary, for both lexical and nonlexical categories. Apart from these trivial similarities, the properties of gaps and Wh-elements crucially differ. If this is indeed the case, Wh-movement and scope marking of Wh-elements are two distinct processes. The idea that scope assignment of Wh-elements is an instance of move alpha (LF movement) can be justified in two ways: either it must be shown that it has the properties of move alpha, or it must be shown that the resulting LF gaps share significant properties with overt gaps. Since scope assignment of Wh-elements shows neither class of properties, the existence of LF movement and a level of LF (distinct from S-structure) cannot be justified on the basis of the distribution of Wh-elements in situ.
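As a reading aid, the comparison in (267) can be restated as a small lookup that separates the shared properties from the differing ones. The encoding below (the property labels and the use of None for the cell that (267) leaves blank) is an illustrative assumption, not part of the analysis itself:

```python
# Summary (267) as a mapping: property -> (value for gaps, value for Wh-in situ).
# None marks the dynasty cell for Wh-in situ, which (267) leaves blank.
SUMMARY_267 = {
    "dynasty": ("global harmony", None),
    "bounding": ("S'", "S (or S')"),
    "category, local": ("XP", "XP"),
    "category, nonlocal": ("NP (AP)", "NP (AP)"),
    "government, local": ("ordinary", "ordinary"),
    "government, nonlocal": ("structural", "lexical"),
}

def split_by_agreement(table):
    """Partition the properties into those on which the two columns
    agree and those on which they differ."""
    shared = [prop for prop, (gaps, wh) in table.items() if gaps == wh]
    differing = [prop for prop, (gaps, wh) in table.items() if gaps != wh]
    return shared, differing

shared, differing = split_by_agreement(SUMMARY_267)
print("shared:   ", shared)
print("differing:", differing)
```

The shared rows are exactly the ones that also hold of lexical categories (category type and local government), while dynasty, bounding, and nonlocal government come out as differing.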
NOTES
1. Percolation in Kayne's sense requires "structural government", which is a somewhat stronger notion than proper government (see Kayne (1984)).
2. More precisely, I will show that there is a weak parallelism in terms of bounding. Bounding, however, appears to be differently implemented for the two domains of facts.
3. Actually, ~ might be a parameter. In section 4.6, I will show that its value can be S.
4. The structural governors in English are Vs, and the Ps that allow preposition stranding (see Kayne (1984)).
5. It is not entirely clear why there could not be resumptive PPs. If we consider the possible ec's in island violations, it appears that there are more possibilities for NPs than for PPs. For adjuncts, there are hardly any possibilities at all. It seems to me that this state of affairs reflects the extent to which the categories in question are identified. NPs are optimally identified by subcategorization, θ-role assignment and Case assignment. PPs can only be subcategorized, and adjuncts are not even subcategorized. Assuming now that ec's in islands form a highly marked phenomenon, it is not entirely unexpected that only optimally identified categories (i.e. NPs) can be found in the contexts in question.
6. All known domain extensions are triggered by some type of government. For pro, this is "structural government". For trace (as in Italian), it must be the strong type of "complement government" (cf. s-government in the sense of Jaeggli (1981)) that we find in subcategorization. It seems to be likely that the two notions can be collapsed, so that domain extensions for all ec's are triggered if and only if the ec's are structurally governed complements.
7. If the rightmost Xl is a structural governor, i.e. [+ V] 0.
8. Bennis and Hoekstra (1984, n. 2) have noticed that the notion "canonical government" can be improved by stipulating that not only the orientation of the g-projections matters, but also the direction of the government of the gap itself. They suggest that the direction of the government of the gap must be in agreement with the direction in which the g-projections are governed. This is a step towards "global harmony", so at least this kind of intermediate position is necessary. It is very well possible that global harmony must be combined with the idea that the direction of government must be canonical.
9. See also Klein (1983) for arguments in favor of this conclusion.
10. Class lectures, fall 1983.
11. See for instance Huang (1982, ch. 7) and Lasnik and Saito (1984).
12. It is not entirely clear how bad (178b) is in comparison with the English equivalent. For discussion, see below.
13. Aoun (1981).
14. Adding a second or a third element in COMP is substandard. In my own judgment, (185) is perfect.
15. I am assuming here that in certain contexts it is possible for the complementizer to take over the role of governor from INFL. For Dutch, I have been assuming that anyway. For French, the transition is indicated by the que/qui switch. A related phenomenon can be observed in relative clauses in English and Scandinavian. (I would like to thank Guglielmo Cinque for pointing out the problem to me.)
16. Ultimately, I believe that the NP/non-NP distinction is explained by the better identification of NPs in comparison with other categories (see note 5). If this assumption is correct, the feature [+pronominal] can be dispensed with.
17. See Van Riemsdijk (1983a) for this very important observation in German, and chapter 2 above. Compare also the mit strategy in Hungarian discussed by De Mey and Maracz (1984).
18. According to this theory, all linking of Wh-phrases in situ to Q is marked long distance linking. This is different from what we observe in Chinese, where the linking of in situ elements can be local (because S' is the relevant bounding node). But note that in English the unmarked local linking to Q also exists, namely after overt Wh-movement to COMP. In other words, overt Wh-movement is involved in the unmarked scope assignment, while interpretation of in situ elements always constitutes a marked phenomenon in English.
19. Aoun (1986) observes that the behavior of comment is anomalous in French in that it does not behave as an adjunct. Thus, the following sentence is grammatical: Tu es venu comment?
20. The path theory for Wh-phrases in situ remains undeveloped here. Percolation is presumably triggered by lexical government, but so far, I have been unable to identify dynasty conditions for Wh-phrases.
Chapter 5
NP-Movement and Restructuring
5.1. Introduction
So far we have seen that there is a prototypical locality principle, the Bounding Condition ((5a) in chapter 4), of which the other locality principles are marked extensions. Domains are expanded either by adding opacity factors (like subject or Tense), or by dynasty formation. The focus of the preceding chapter was on A'-binding, particularly on the various empty categories bound by Wh-phrases. Domain extensions for the empty elements in question were determined by directionality factors: they were bound in accordance with the Principle of Global Harmony. In what follows, I will focus on anaphors and their domains. I will first discuss various forms of NP-movement in English and Dutch. The second part of this chapter is about so-called restructuring. In the next chapter, I will focus on lexical anaphors, particularly on reflexives in Dutch. In general, I will show that for both NP-traces and reflexives, the unmarked domain defined by the Bounding Condition plays a significant role. For A-bound NP-traces, domain extensions are very rare. In fact, pseudopassives, supposedly involving reanalysis, are the only example of a domain extension for A-bound NP-traces. I will argue that the percolation mechanisms studied in the preceding chapters make reanalysis superfluous. So-called reanalysis is just an instance of the independently motivated, dynasty-driven percolation mechanism. Dutch differs from English in that it has more than one lexical reflexive. These reflexives partially overlap, but otherwise differ in distribution. Obviously, this means that the binding theory of Chomsky (1981b) has to be extended somehow. I will show in chapter 6 that the unmarked Bounding Condition defines the domain in which the two Dutch reflexives overlap. Opacity factors only play a role beyond the simplest domain.
It is only there that Dutch reflexives contrast: one reflexive has to be bound in the minimal subject domain (as in English), while the other must be free in the minimal subject domain. More generally, I will show that the opacity factors, which define anaphoric domains for various languages, determine a markedness hierarchy in accordance with the Subset Principle of Berwick (1982). In this hierarchy, the domains for English anaphors have an intermediate position. The smallest language is defined by the unmarked Bounding
Condition. The language defined by the domains familiar from English (based on the opacity factors subject and Tense) is a superset of this elementary language. An even more inclusive language is defined by opacity factors like "Indicative Tense", which plays a role in the domain definition for anaphors in Icelandic. I will also consider the question how the term "language" has to be applied here, given Chomsky's penetrating criticisms of the notion (E-)language (Chomsky (1986a)).
Before going into particular instances of NP-movement in English and Dutch, I would like to point out that I am only using the term NP-movement as a descriptive term. Like Wh-movement, NP-movement is just a particular relation between an antecedent and an empty category. Wh-movement refers to the relation between Wh-phrases and traces (A'-binding), and NP-movement refers to the relation between NPs in A-positions and traces. In neither case do I assume a real movement analysis, because the notion "movement" does not have defining properties (as was shown in chapter 2). In the case of NP-movement, it is much easier to see than in the case of Wh-movement that the property that is supposed to define movement (Subjacency) is entirely superfluous. One reason is that NP-movement is also constrained by principle A of the binding theory of Chomsky (1981b), because NP-traces are treated there as anaphors. In most cases, principle A defines a stricter domain than Subjacency, so that the role of Subjacency cannot even be detected. One case in which principle A defines a larger domain than Subjacency is the following:
(1) John says that [it is clear that [[pictures of himself] are for sale]]
The brackets indicate bounding nodes for Subjacency. Thanks to the formulation of opacity, the binding theory accepts this sentence: it of the intermediate clause is not an accessible SUBJECT in the sense of Chomsky (1981b).
As a result, this it does not make its minimal S a domain, so that himself can be bound across two Ss. So, the nature of the opacity factors allows binding in domains larger than the domain defined by Subjacency in cases like (1). But it is clear that the intricate formulation of the opacity factors that makes (1) possible is relevant only for lexical anaphors. If we replace himself in (1) by an NP-trace, the result is ungrammatical for independent reasons:
(2) *John says that [it is clear that [[pictures of t] are for sale]]
As we saw in chapter 2, it is never possible to A-bind an empty category in a Case-marked position, neither locally nor nonlocally. In fact, there is no evidence at all that opacity factors play a role for nonlexical anaphors (NP-traces). In all the cases that come to mind, the
unmarked Bounding Condition suffices. Consider, for instance, the following standard case:
(3) John seems [S' t' to be likely [S' t'' to go]]
Both traces (t' and t'') are bound in the minimal S' in which they are governed, i.e. in the domain defined by the Bounding Condition, which is formulated without opacity factors. Needless to say, the Bounding Condition eliminates the need for Subjacency in this case as well. Obviously, it makes no sense to say that the traces in question cannot be bound across two bounding nodes, if they cannot even be bound across one bounding node. To my knowledge, there is not a single case where the domain of NP-traces must be defined in terms of opacity factors or in terms of the two nodes of classical Subjacency. To conclude this preliminary discussion, consider just one more case in which the formulation of principle A leads to a domain larger than the minimal domain defined by the Bounding Condition:
(4) [They want very much [for each other to go]]
Each other is neither bound in its minimal S nor in its minimal S'. The minimal S' contains the governor of each other (for), but it does not contain an accessible subject or Tense. It is for this reason that only the matrix clause qualifies as a domain: the root S is the minimal category that contains not only the anaphor and its governor, but also an accessible subject and Tense. Again, we have an extension of the minimal domain that is only possible for lexical anaphors. If a trace occurs in a clause introduced by for, this clause is always the minimal domain in which the trace is bound. For Wh-traces, this is rather unproblematic:
(5) Who do you want [t (*for) [t to go]]
In standard English, for is always deleted if it is not followed by lexical material (see Chomsky and Lasnik (1977)). One might argue therefore that want does not necessarily select for. There are good reasons to assume, however, that verbs like want always select for, even if it does not show up (see chapter 3). In a sentence like who do you want t to go, this underlying for is also the proper governor of the trace (in the sense of the ECP). In any case, the rightmost trace in (5) can be bound in its minimal domain, namely by the intermediate trace in the COMP of the for-complement. But also for NP-traces, the minimal domain definition always suffices. Consider the following well-known example:
(6) *Bill was preferred [t (for) [t to have seen Tom]]
Without the intermediate trace, this sentence is simply excluded by the Bounding Condition: the rightmost (governed) trace would not be bound in its minimal S'. But with the intermediate trace, the sentence is also excluded for reasons that originally led to principle C of the binding theory. Thus, one might say that with an intermediate trace in COMP, the rightmost trace is a variable that must be A-free (which it is not in (6)). But even if this explanation is rejected, there are other mechanisms that exclude the A-binding of a trace in argument position, as we have seen in chapter 2. Note also that it is not possible to interpret the rightmost trace of (6) as PRO. This PRO could not be bound by Bill, because Bill is not the underlying subject that must be the controller if prefer is a control verb (see chapter 3 for the details of this account). There is no way to save (6). In any of its interpretations, the sentence is excluded by independent mechanisms and we do not have to have recourse to the ECP (as suggested by Chomsky (1981b, 252)). The ECP account probably does not work because the underlying for is a proper governor, as can be concluded from examples like (5). In general, then, all extensions of the minimal domain are based on problems with lexical anaphors. With the exception of pseudo-passives, to which I will return, the behavior of NP-traces is adequately characterized by the Bounding Condition. If this conclusion is correct, the reality of the Bounding Condition is confirmed again.
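The contrast between (3) and (6) can be made concrete with a minimal sketch. It rests on two simplifying assumptions that go beyond the text: clauses are represented only by a map from each S' to the S' immediately containing it, and the local domain of a trace is identified with the innermost S' that contains its governor (seems and likely for the traces in (3), the underlying for in (6)). The function and variable names are hypothetical:

```python
# Hypothetical clause maps: each S' points to the S' immediately containing it.
EXAMPLE_3 = {"root": None, "S'1": "root", "S'2": "S'1"}  # John seems [S'1 t' ... [S'2 t'' ...]]
EXAMPLE_6 = {"root": None, "S'1": "root"}                # Bill was preferred [S'1 (for) [t ...]]

def dominates(parent_of, outer, inner):
    """True if clause `outer` is `inner` itself or contains it."""
    node = inner
    while node is not None:
        if node == outer:
            return True
        node = parent_of[node]
    return False

def bound_locally(parent_of, governor_clause, antecedent_clause):
    """Simplified Bounding Condition: the antecedent must sit inside the
    minimal S' that contains the trace and its governor, taken here to be
    the innermost S' containing the governor."""
    return dominates(parent_of, governor_clause, antecedent_clause)

# (3): t'' is governed by likely (in S'1); its antecedent t' is also in S'1.
lower_trace_ok = bound_locally(EXAMPLE_3, "S'1", "S'1")
# (3): t' is governed by seems (in the root clause); its antecedent John is in the root.
upper_trace_ok = bound_locally(EXAMPLE_3, "root", "root")
# (6) without an intermediate trace: t is governed by for (in S'1), but Bill is in the root.
example_6_ok = bound_locally(EXAMPLE_6, "S'1", "root")
print(lower_trace_ok, upper_trace_ok, example_6_ok)  # True True False
```

No opacity factors enter the check: under these assumptions a single containment test separates (3) from (6).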
5.2. Passives and ergatives in Dutch
Dutch is not a free word order language like Latin. It is, for instance, usually not possible for indirect objects to precede subjects, particularly not if these subjects are agents (Koster (1978c, 157)):
(7) a. Ik denk dat Peter hem het boek gaf
       I think that Peter him the book gave
       'I think that Peter gave him the book'
    b. *Ik denk dat hem Peter t het boek gaf
In Koster (1978c, 3.2.2.3) it was observed that there are two exceptions to this word order pattern. Indirect objects can precede the subjects of passives (8) and the subjects of "ergative" verbs (9):
(8) a. Ik denk dat het boek hem gegeven werd
       I think that the book him given was
       'I think that the book was given to him'
    b. Ik denk dat hem het boek t gegeven werd
(9) a. Ik denk dat de ergste rampen haar overkwamen
       I think that the most terrible disasters her happened
       'I think that the most terrible disasters happened to her'
    b. Ik denk dat haar de ergste rampen t overkwamen
Implicit in this analysis is the idea that het boek in (8b) and de ergste rampen in (9b) are in subject position, and that a rule of "indirect object preposing" has applied. A further assumption in Koster (1978c) was that Dutch has lexical passives. In retrospect, the arguments for this were rather weak, as was pointed out by Den Besten (1981a) and Hoekstra (1984). The main argument was that only direct objects are passivized in Dutch. Indirect objects are in general not passivized:
(10) *Hij werd een boek gegeven
     he was a book given
     'He was given a book'
Furthermore, it was noted that Dutch does not have pseudo-passives (like the boat was decided on) or raising passives (like John was believed to go). Den Besten showed that raising passives are indeed extremely marginal in Dutch (there is only one idiomatic expression), but that passivization of indirect objects does occur:
(11) a. Er werd ons verzocht om weg te gaan
        there was us requested for away to go
        'We were asked to leave'
     b. Wij werden verzocht om weg te gaan
        we were requested for away to go
Dutch has impersonal passives that also occur with indirect objects (like ons in (11a)). Den Besten notes, however, that (11b) is in fact the preferred form (in spite of some traditional normative opposition). Den Besten further observes that indirect object passivization is limited to examples like (11b), which have a sentential complement (see Everaert (1982) for further examples). If we replace the complement clause in (11b) by a pronoun (or any other NP), passivization is impossible:
(12) a. Men heeft het hem verzocht
        one has it from-him requested
     b. Het is hem verzocht
        it has from-him requested
     c. *Hij is het verzocht
        he has it been-requested
Indirect object passivization does occur after all, but possibly only if there is a sentential object. I will show later on that in that case the underlying object het ('it') is optional (or deleted). The absence of pseudo-passives in Dutch is also a very weak argument against transformational passives. The difference between English and
Dutch is due to the difference in word order between the two languages (and the effects of the Principle of Global Harmony). I will return to this in section 5.5. Not only was the original evidence against "transformational" passives in Dutch weak, there is also positive evidence for them. At least two arguments have appeared in the literature that show that data as in (8) and (9) point in the direction of NP-movement from object position to subject position. Thus, according to the alternative analysis, there are transformational passives in Dutch. There is not necessarily indirect object preposing, but there certainly is direct object preposing across an indirect object (which remains in place). Thus, according to this analysis, (8a) represents the derived order (see (13a)), while (8b) shows the underlying order (see (13b)). (The notation np indicates a "landing site" for NP-movement.)
(13) a. Ik denk dat het boek [VP hem t gegeven werd]
        I think that the book him given was
        'I think that the book was given to him'
     b. Ik denk dat np [VP hem het boek gegeven werd]
A problem to which I will return is the question whether the subject position np must be filled or not in S-structure. The first argument for (13b) was given by De Haan (1979, 197ff.). This argument is based on the fact that the pronominal R-word er typically follows the subject (Van Riemsdijk (1978)): (14)
a. dat Fred de jongens [er mee] heeft geplaagd
   that Fred the boys there with has teased
   'that Fred has teased the boys with it'
b. dat Fred er de jongens [t mee] heeft geplaagd
c. *dat er Fred de jongens [t mee] heeft geplaagd
As (14c) shows, if er is extracted from a PP, it cannot precede a definite subject. De Haan uses this fact to show that het boek in (8b) and (13b) is not in subject position, while in (8a) and (13a) it is. De Haan illustrates the latter fact with the following sentences: (15)
a. dat het boek Mary er voor werd gegeven
   that the book Mary there for was given
   'that Mary was given the book for it'
b. dat het boek er Mary t voor werd gegeven
c. *dat er het boek Mary t voor werd gegeven
As expected, er cannot precede the syntactic subject het boek in (15c). Surprisingly, however, (16b) (derived from (16a)) is grammatical:
(16)
a. dat Mary het boek er voor werd gegeven
   that Mary the book there for was given
b. dat er Mary het boek t voor werd gegeven
Assuming that er cannot precede definite subjects, these facts are explained if het boek is in subject position in (15c) but not in (16b). These facts do not, of course, show that indirect object preposing does not exist in Dutch (it is not clear what could prohibit it). What these facts do show is that there is an analysis of (8b) or (16b) in which het boek is in VP-internal (or at least in non-subject) position. By far the simplest analysis, then, is to assume that this is also the underlying order and that the other order (8a) is derived by moving the direct object to subject position (across the indirect object). Similar things can be said about the second argument, which has been provided by Hans den Besten. According to an observation in Den Besten (1981b), the complex expression wat voor can be split (by subextraction of wat) in direct object positions but not in subject positions (see also Hoekstra (1984, 216)): (17)
a. Wat heb jij in Italië [t voor musea] bezocht?
   what have you in Italy for museums visited
   'What kind of museums did you visit in Italy?'
b. *Wat hebben [t voor mensen] jou geholpen?
   what have for people you helped
   'What kind of people helped you?'
According to Den Besten's explanation, the trace is properly governed (by V) in (17a); for subjects, however, there is no proper governor (17b). The contrast is explained, in other words, in terms of the ECP. Interestingly, a similar contrast can be observed in examples like (8) and (9): (18)
a. Wat werd hem [t voor boek] gegeven?
   what was him for book given
   'What kind of book was given to him?'
b. *Wat werd [t voor boek] hem gegeven?
(19)
a. Wat zijn haar [t voor rampen] overkomen?
   what are her for disasters happened
   'What kind of disasters happened to her?'
b. *Wat zijn [t voor rampen] haar overkomen?
If we want to maintain the same type of explanation, these examples
suggest that the object is still in its underlying (VP-internal) position in (18a): only in this position, the NP is governed by the verb. Similarly, we must assume for (19a) that the NP containing the trace is in VP-internal
position. The order in (18b) and (19b), then, must be the result of movement of the direct object to subject position. I accept this argument with the qualification that the ECP is not something special for empty categories. As we saw in chapter 4, the ECP is just an instance of a general requirement for dependent elements: all dependent elements bound by an antecedent in accordance with the configurational matrix must be governed. Thus, both lexical anaphors and empty categories must be governed. The wat voor facts in Dutch show that the relevant notion is real government (by a lexical head), and not antecedent government in the sense of Lasnik and Saito (1984). This can be seen as follows. Recall from chapter 4 that that-trace facts are nonexistent in Dutch sentences with a transitive verb. Thus, the following sentence is perfectly acceptable: (20)
Wie denk je dat t het boek gelezen heeft?
who think you that the book read has
'Who do you think read the book?'
In chapter 4, the difference between Dutch and English in this respect was explained in terms of the NIC. Alternatively, one might argue that the complementizer dat is somehow not a barrier for antecedent government from COMP. This explanation would not work because wat voor split is not possible in the position of the trace in (20): (21)
*Wat denk je [t dat] [t voor mensen] het boek lazen?
what think you that for people the book read
'What kind of people do you think read the book?'
Apparently, the intermediate trace in COMP cannot save the situation (as an antecedent governor). This cannot be due to inaccessibility of the trace in the subject phrase, because such traces are accessible in the structurally analogous object phrases: (22)
Wat denk je [t dat] hij [t voor mensen] gezien heeft
what think you that he for people seen has
'What kind of people do you think that he saw?'
In fact, then, (21) shows that the notion "antecedent governor" is irrelevant for the explanation of subject-object asymmetries. Let us suppose therefore that only government-by-a-head is relevant. How, then, do we account for the difference between (20) and (21)? Suppose that the relevant governor for subjects is INFL (or COMP in Dutch, see chapter 4). In (20), the subject trace is properly governed by dat. In (21), however, the nonstructural governor dat cannot govern the trace inside the subject phrase. Only structural governors (V or P) can govern into another projection, as was discussed in chapter 3.
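The reasoning about (20)-(22) can be restated as a small decision procedure. The following Python fragment is only an illustrative sketch, not part of the analysis itself: the names TraceSite and is_head_governed, and the reduction of each example to a governor category plus an "inside the phrase" flag, are assumptions introduced here for exposition.

```python
from dataclasses import dataclass

# Assumption taken from the text: only V and P are structural governors.
STRUCTURAL_GOVERNORS = {"V", "P"}
# Assumption: COMP and INFL are the nonstructural head governors of subjects.
NONSTRUCTURAL_GOVERNORS = {"COMP", "INFL"}


@dataclass(frozen=True)
class TraceSite:
    example: str         # example number in the text
    governor: str        # category of the head governing the phrase that hosts the trace
    inside_phrase: bool  # True if the trace is embedded in that phrase (wat voor split)


def is_head_governed(site: TraceSite) -> bool:
    """Return True if the trace counts as governed by a head in this sketch."""
    if site.governor in STRUCTURAL_GOVERNORS:
        # Structural governors govern the phrase and also into it.
        return True
    if site.governor in NONSTRUCTURAL_GOVERNORS:
        # Nonstructural governors reach the phrase itself, but not into it.
        return not site.inside_phrase
    return False


SITES = [
    TraceSite("(20)", "COMP", inside_phrase=False),  # subject trace after dat
    TraceSite("(21)", "COMP", inside_phrase=True),   # wat voor split in subject
    TraceSite("(22)", "V", inside_phrase=True),      # wat voor split in object
]

for site in SITES:
    print(site.example, "ok" if is_head_governed(site) else "*")
```

Under these assumptions the sketch marks only (21) as excluded, which matches the judgments reported above. Antecedent government plays no role in the function, in line with the conclusion drawn from (21).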
I would now like to give a third argument for the VP-internal status of het boek in (8b), based on the possibilities for verb projection preposing in
Dutch. Verbs and their projections can be preposed in Dutch under various circumstances. In chapter 3, for instance, examples were given like: (23)
[Boeken lezen] zag ik haar zelden
books read saw I her rarely
In particular, te-less infinitives can be preposed in Dutch with or without the object. Also, past participles can be preposed, with or without the object, as in (24a) and (24b), respectively. (24)
a. [Het boek gelezen] heeft hij niet
   the book read has he not
b. [Gelezen] heeft hij het boek niet
   read has he the book not
Definite subjects cannot be incorporated in the preposed constituent. Thus, if the embedded subject haar of (23) is part of it, the result is ungrammatical: (25)
*[Haar boeken lezen] zag ik zelden
her books read saw I rarely
This does not mean that subjects are never incorporated. Indefinite subjects can be adjoined to the VP and preposed together with it: (26)
a. [Iemand lachen] zag je er nooit
   someone laugh saw you there never
   'You never saw someone laugh there'
b. [Iemand boeken lezen] zag je er zelden
   someone books read saw you there rarely
   'You rarely saw someone read books there'
Even with transitive verbs, an indefinite subject can be incorporated, as (26b) shows. Recall now the fact that indirect objects can precede the subjects of passives (8b) but not the agentive subjects of active sentences (7b). Corresponding facts with indefinite subjects are (27a) and (27b): (27)
a. dat hem boeken gegeven werden
   that him books given were
   'that books were given to him'
b. *dat hem iemand boeken gaf
   that him someone books gave
If the inversion of the relative order of subject and indirect object involves
subjects in the VP, we predict that boeken can be preposed with the VP in (27a). This appears to be correct: (28)
[Boeken gegeven] werden hem maar zelden
books given were him only rarely
The correct order corresponding to (27b) is (29): (29)
dat iemand hem boeken gaf
that someone him books gave
'that someone gave books to him'
From a structure with this order, it is impossible to prepose the subject with the verb and leave the indirect object behind: (30)
[Iemand boeken geven] zag je hem zelden
someone books give saw you him rarely
'You rarely saw him give books to someone'
This sentence is grammatical, but only in the reading according to which the preposed iemand is the indirect object and hem is the embedded subject. It is impossible to interpret iemand as the embedded subject and hem as the indirect object. This fact contrasts with (28), in which hem is interpreted as the indirect object. In other words, leaving the indirect object behind and preposing the subject with the verb is only possible if the subject can be moved across the indirect object and become part of the verb projection. The same fact can be demonstrated with ergative verbs, which also allow the subject to move across the indirect object to the VP-internal position: (31)
[Rampen overkomen] zag je hem maar zelden
disasters happen saw you to-him only rarely
'Only rarely, you saw disasters happen to him'
Like in (28), and contrary to (30), hem is interpreted as the indirect object. All in all, then, there is good evidence that inversion of the order of subject and indirect object in Dutch involves incorporation of the subject into the VP. In Koster (1978c), it was concluded that passives and ergatives, which allow the inversion of subject and indirect object, are also the verbs that form their perfect tense with the auxiliary zijn 'be', rather than with the usual auxiliary hebben 'have'. This conclusion was based on facts like the
following: (32)
a. Hij heeft het boek gelezen
   he has the book read
   'He has read the book'
b. *Hij is het boek gelezen
   he is the book read
(33)
a. Het boek is gelezen
   the book is been-read
   'The book has been read'
b. *Het boek heeft gelezen
   the book has been-read
(34)
a. Rampen zijn hem overkomen
   disasters are to-him happened
   'Disasters have happened to him'
b. *Rampen hebben hem overkomen
   disasters have to-him happened
Passives and ergatives differ in this respect from most other intransitives (unergatives), which form their perfect tense with hebben: (35)
a. Hij heeft gelachen
   he has laughed
b. *Hij is gelachen
   he is laughed
Hoekstra (1984) claims that verbs form their perfect tense with zijn 'be' if and only if they are unaccusatives (including passives). Although I agree that verbs that select zijn are always unaccusatives, I will show in a moment that the converse does not hold: not all unaccusatives select zijn; there are also unaccusatives that select hebben 'have'. Summarizing so far, we have seen the following similarities between passives and ergatives: (36)
a. They form their perfect with zijn
b. They allow the indirect object to precede the subject
c. They allow er to precede definite subjects if the indirect object precedes the subject
d. Their subjects allow wat voor split in certain contexts
e. Their subjects can be incorporated under verb projection preposing
Hoekstra (1984) mentions two other relevant facts. First of all, he observes that prenominal (adjectival) past participles only lead to paraphrases with the auxiliary zijn ('be') and never to paraphrases with hebben ('have'). Or,
in other words, perfects with hebben have no corresponding prenominal participle at all: (37)
a. Het gelezen boek
   the read book
b. Het boek is gelezen
   the book is been-read
   'The book has been read'
(38)
a. De snel gegroeide kool
   the fast grown cabbage
b. De kool is snel gegroeid
   the cabbage is fast grown
   'The cabbage has grown fast'
(39)
a. *De gelachen man
   the laughed man
b. De man heeft gelachen
   the man has laughed
(40)
a. *De gedanste kinderen
   the danced children
b. De kinderen hebben gedanst
   the children have danced
Another relevant fact is the observation of Perlmutter (1978) that impersonal passives do not occur with ergative (unaccusative) verbs in Dutch (i.e. the verbs that select zijn as their auxiliary): (41)
a. Het water is verdampt
   the water has evaporated
b. *Er werd verdampt door het water
   there was evaporated by the water
(42)
a. De kinderen hebben gehuild
   the children have cried
b. Er werd gehuild door de kinderen
   there was cried by the children
Relational Grammar explains this contrast in terms of the 1-advancement exclusiveness law, according to which only one promotion to subject is possible within a single clause. Unaccusatives (like verdampen in (41)) always involve promotion to subject of the underlying accusative (het water in (41a)). Passivization would be a second promotion (of er in (41b) according to Perlmutter and Zaenen (1984)), which is forbidden by the law in question. In contrast, (42a) does not involve a promotion because the derived subject is also the underlying subject of unergative verbs like huilen. Passivization would therefore only be the first promotion (42b). In the present framework, the facts in question can be accounted for by a simple lexical statement. The key insight is that passive morphology "de-externalizes" the external θ-role (cf. Williams (1981)). It also absorbs objective Case (Chomsky (1981b)), but only if there is an NP with objective Case. De-externalization, however, always applies, even if Case absorption cannot apply (for instance in impersonal passives). De-externalization can be stated as a lexical rule that erases the underlining, which indicates the external nature of a θ-role (cf. Williams (1981)): (43)
$V\colon (\underline{\theta_i}, \ldots) \;\rightarrow\; V + en\colon (\theta_i, \ldots)$
This lexical statement guarantees that passive only applies if there is an external θ-role to begin with. Thus, unaccusatives, which do not have an external θ-role, cannot be passivized according to (43). Passive morphology is selected by the verb be in English. In many languages, causative verbs and verbs of perception may have a similar effect on their complement verbs (see for instance Kayne (1975), Burzio (1981) and Zubizarreta (1985)). In Dutch, causative verbs like laten 'let' and perception verbs like zien 'see' optionally turn their complement verbs into verbs with one of the passive effects, namely de-externalization of the external θ-role. Thus, the following alternation exists in Dutch (see also Coopmans (1985)): (44)
a. Fred laat Peter de aardappelen schillen
   Fred let Peter the potatoes peel
   'Fred let Peter peel the potatoes'
b. Fred laat de aardappelen schillen (door Peter)
   Fred let the potatoes peel by Peter
This kind of passivization is possible with transitives (as in (44b)) and also with unergatives: (45)
a. Peter hoorde de kinderen huilen
   Peter heard the children cry
b. Peter hoorde huilen (door de kinderen)
   Peter heard cry by the children
This is never possible with unaccusatives: (45)
c. Peter hoorde het water verdampen
   Peter heard the water evaporate
d. *Peter hoorde verdampen (door het water)
   Peter heard evaporate by the water
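The effect of the lexical rule (43) on the three verb classes in (44) and (45) can be made concrete with a small sketch. The Python fragment below is an illustration under stated assumptions, not a formalization taken from the text: the class ThetaGrid, the field names, the role labels "agent" and "theme", and the function de_externalize are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ThetaGrid:
    verb: str
    external: Optional[str]         # the underlined role of (43), or None
    internal: Tuple[str, ...] = ()  # remaining roles
    demoted: Optional[str] = None   # former external role (door-phrase only)


def de_externalize(grid: ThetaGrid) -> Optional[ThetaGrid]:
    """Erase the external marking, as in (43).

    Returns None when the verb has no external role to begin with,
    i.e. when the rule is not applicable.
    """
    if grid.external is None:
        return None
    return replace(grid, external=None, demoted=grid.external)


# Role labels are assumptions made for illustration.
schillen = ThetaGrid("schillen", external="agent", internal=("theme",))
huilen = ThetaGrid("huilen", external="agent")
verdampen = ThetaGrid("verdampen", external=None, internal=("theme",))

for grid in (schillen, huilen, verdampen):
    result = de_externalize(grid)
    print(grid.verb, "->", "not applicable" if result is None else result)
```

In this sketch the rule is defined for the transitive schillen and the unergative huilen, but not for the unaccusative verdampen, which mirrors the contrast between (44b), (45b) and (45d). Case absorption is deliberately left out, since the text treats it as a separate property.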
Superficially, one might think that in examples like (45b) the grammatical subject is simply dropped in the complement. It can, however, be omitted (or internalized) only if it is the underlying external subject. In all cases in which we have a derived subject, subject drop leads to ungrammatical
sentences: (46)
a. Fred laat Peter geslagen worden
   Fred let Peter been-beaten be
   'Fred let Peter be beaten'
b. *Fred laat geslagen worden
   Fred let been-beaten be
(47)
a. Mary laat het boek goed verkopen
   Mary let the book well sell
   'Mary let the book sell well'
b. *Mary laat goed verkopen
   Mary let well sell
(48)
a. Peter laat de was smelten
   Peter let the wax melt
b. *Peter laat smelten
   Peter let melt
In (48a), de was is understood as the derived subject. An underlying subject (the external θ-role) can be dropped (internalized), as expected: (49)
a. Peter laat Fred de was smelten
   Peter let Fred the wax melt
   'Peter let Fred melt the wax'
b. Peter laat de was smelten (door Fred)
In the next section, I will argue that passive be is a main verb that selects a small clause. This optimizes the parallelism between standard passives and passive effects in the complements of verbs like laten. Both cases involve a passivization feature on the verb in the clausal complement of another verb (the italicized V indicates: "triggers passive effects"): (50)
a. ... be ... [ ... V ... ] ...
b. ... laten ... [ ... V ... ] ...
Structurally, there are no real differences. The main differences are morphological and (partially) functional. Verbs like be select a participle (V + en), while verbs like laten do not impose any morphological changes at all on their complement verbs. A second difference is that be makes passivization obligatory, while laten leaves it optional. Functionally speaking, standard passivization does two things, while passivization in the complement of laten does only one thing. Standard passivization absorbs the Case of the object and blocks the expression of the external θ-role. Passivization in the complement of laten only prevents the external θ-role from being realized. There is no evidence of Case absorption in complements of causatives and perception verbs.
The similarities and differences between the two kinds of passive suggest that the two properties (Case absorption and de-externalization) of the standard passive are two independent features. Case absorption is a morphological effect connected with the past participle selected by be (and get) in English. De-externalization, however, is a property that certain verbs (like be and the causatives) impose on their complements. If the two types of passivization only affect verbs with external arguments, we have a direct criterion for underlying structure: if a verb can undergo passivization, it is not an unaccusative verb. This is important for the question to what extent there is a correlation among the properties in (36). To the best of my knowledge, verbs that form their perfect with zijn 'be' never passivize. The conclusion that these verbs are unaccusative in Dutch is rather safe. More problematic is the status of verbs that select hebben, but which nevertheless show the other properties listed under (36). It appears that some of these verbs are not unaccusatives according to the passivization criterion, while others are. The first group involves psychological verbs (occurring in what Perlmutter and Postal (1984) call Inversion Clauses). These verbs were discussed by Lenerz (1977) for German, and, briefly, by Den Besten (1982) for Dutch. Verbs in this class are interesseren 'interest', ergeren 'irritate', verwonderen 'surprise', verbazen 'surprise', fascineren 'fascinate', ontroeren 'move', and many others. These verbs have an obligatory object (often in the accusative in German) and they form their perfect with hebben. Contrary to normal transitive verbs, these verbs allow subject-object inversion: (51)
a. dat de boeken hem interesseren
   that the books him interest
   'that the books interest him'
b. dat hem de boeken interesseren
According to all the other criteria listed under (36), the subjects of these verbs can be incorporated into the VP. Thus, er can precede a definite subject, if the indirect object precedes it: (52)
dat er hem de boeken over interesseren
that there him the books about interest
'that the books about it interest him'
Wat voor split shows that the subject is VP-internal if the indirect object precedes it:
(53)
a. *Wat hebben [t voor boeken] hem geïnteresseerd?
   what have for books him interested
   'What kind of books have interested him?'
b. Wat hebben hem [t voor boeken] geïnteresseerd?
Indefinite subjects can also be preposed together with the rest of a verb projection: (54)
[Iets interesseren] zag je hem maar zelden
something interest saw you him only rarely
'Only rarely, you saw something interest him'
It is clear, in other words, that the subject can be incorporated in the VP of these verbs. But it can also be shown that these psychological verbs are not unaccusatives, because they often undergo passivization (cf. Den Besten (1982)): (55)
a. Hij werd geïnteresseerd door het boek
   he was interested by the book
b. Hij werd ontroerd door het boek
   he was moved by the book
c. Hij werd gefascineerd door het boek
   he was fascinated by the book
Contrary to what we see with normal transitive verbs, passivization is rather peculiar in complements of causatives or verbs of perception: (56)
a. Bernstein liet de symfonie ons ontroeren
   Bernstein let the symphony us move
   'Bernstein let the symphony move us'
b. %Bernstein liet ons ontroeren (door de symfonie)
   Bernstein let us be-moved (by the symphony)
(The symbol % is used to indicate "strangeness" of meaning.) In most cases, this type of passivization is even entirely unacceptable: (57)
a. *De leraar liet ons fascineren (door het boek)
   the teacher let us be-fascinated (by the book)
b. *De leraar liet ons interesseren (door het boek)
   the teacher let us be-interested (by the book)
Contrary to standard passivization, this type of passivization seems to require an agent as the original external argument. Normal passivization (55), however, seems sufficient to establish the hypothesis that verbs that form their perfect tense with hebben are not unaccusatives, but verbs with an underlying external argument. Incorporation of the subject into the VP of these verbs is then comparable to the incorporation of unergative subjects described for Italian by Burzio (1981) and Belletti and Rizzi (1981). And yet the hypothesis that verbs with auxiliary hebben are never unaccusatives is probably not quite correct. In a review of the relevant
part of Koster (1978c), Balk-Smit Duyzentkunst (1979) pointed out that there are quite a few exceptions. The discussion concerned the well-known class of ergatives that have a causative counterpart (Koster (1978c, 163)): (58)
a. John smelt de was
   John melts the wax
b. De was smelt
   the wax melts
The verb smelten 'to melt' forms its perfect with hebben in the transitive use (58a) and with zijn in the intransitive use (58b): (59)
a. John heeft de was gesmolten
   John has the wax melted
b. De was is gesmolten
   the wax has melted
Like Hoekstra (1984), I concluded that the intransitive members of such pairs always select zijn. Balk-Smit Duyzentkunst (1979, n. 18) showed, however, that this claim is wrong. There are many examples of intransitives that, according to some of the other criteria, are unaccusatives, and that nevertheless select hebben as their perfect auxiliary. Balk-Smit Duyzentkunst gives the following examples, among others: (60)
a. Hij kookt de aardappelen
   he cooks the potatoes
b. De aardappelen koken
   the potatoes cook
c. De aardappelen hebben gekookt
   the potatoes have cooked
(61)
a. Ik beweeg mijn hand
   I move my hand
b. Mijn hand beweegt
   my hand moves
c. Mijn hand heeft bewogen
   my hand has moved
Other verbs of this type mentioned by Balk-Smit Duyzentkunst are krullen 'curl', draaien 'turn', spelen 'play', slingeren 'sling', luiden 'ring', and stoppen 'stop'. Either the intransitives in question are not unaccusatives, or not all unaccusatives select zijn as their auxiliary. There are good reasons to believe that the latter conclusion is correct. The intransitives in (60b) and (61b) are almost prototypical unaccusatives. This status is confirmed by
the fact that impersonal passives are not possible. This can be illustrated with koken, a verb that can appear without an object, like English eat: (62)
a. Peter kookt (de aardappelen)
   Peter cooks (the potatoes)
b. De aardappelen koken
   the potatoes cook
Interestingly, an impersonal passive can be formed corresponding to the intransitive in (62a), but not to the intransitive in (62b): (63)
Er werd gekookt
there was cooked
The implicit subject here is the agent of koken, i.e. the underlying external argument. The implicit subject cannot be de aardappelen, i.e. the derived subject. This contrast can also be clarified by adding the door-phrases: (64)
a. Er werd gekookt door Peter
   there was cooked by Peter
b. *Er werd gekookt door de aardappelen
   there was cooked by the potatoes
Pollmann (1975) observed that Dutch impersonal passives differ from normal passives in that they always involve a human agent. This is in accordance with the unaccusative hypothesis and the insight of Perlmutter (1978) that unaccusatives, contrary to unergatives, never allow passivization. The unergatives that do allow passivization are typically verbs of action (cf. Perlmutter and Postal (1984)). Similar observations can be made for the complements of causatives and verbs of perception: (65)
Ik hoorde koken
I heard cook
'I heard someone cooking'
As in regular impersonal passives, the implicit subject can only be an agent. It is impossible to interpret (65) as 'I heard the potatoes cooking'. It is difficult, then, to avoid the conclusion that verbs like koken 'cook' are unaccusatives. If this conclusion is correct, there are counterexamples to the claim made by Koster (1978c) and Hoekstra (1984): a subclass of the unaccusatives forms its perfect tense with the auxiliary hebben 'have'. Presumably, this class differs semantically from the verbs that select zijn. Traditionally, it has been observed that the verbs that select zijn are verbs of change. Perhaps the unaccusatives that select hebben do not denote a change of state but a continuing state.
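The logic of this conclusion, namely that selection of zijn implies unaccusativity while the converse fails, can be checked mechanically against the verbs discussed above. The following Python sketch is illustrative only: the class VerbEntry, the field names, and the reduction of each verb to two properties are assumptions made here, and unaccusativity is identified with the failure to passivize, following the passivization criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbEntry:
    form: str
    auxiliary: str    # perfect auxiliary: "zijn" or "hebben"
    passivizes: bool  # does the verb allow (impersonal) passivization?


# Judgments as reported in the text, with the relevant example numbers.
LEXICON = [
    VerbEntry("lezen", "hebben", True),          # (32), (33)
    VerbEntry("huilen", "hebben", True),         # (42)
    VerbEntry("interesseren", "hebben", True),   # (53), (55)
    VerbEntry("overkomen", "zijn", False),       # (34)
    VerbEntry("verdampen", "zijn", False),       # (41)
    VerbEntry("koken (intransitive, theme subject)", "hebben", False),  # (60c), (64b)
]


def is_unaccusative(entry: VerbEntry) -> bool:
    """Passivization criterion: a verb that passivizes is not unaccusative."""
    return not entry.passivizes


zijn_implies_unaccusative = all(
    is_unaccusative(e) for e in LEXICON if e.auxiliary == "zijn"
)
counterexamples_to_converse = [
    e.form for e in LEXICON if is_unaccusative(e) and e.auxiliary != "zijn"
]

print("zijn implies unaccusative:", zijn_implies_unaccusative)
print("unaccusatives selecting hebben:", counterexamples_to_converse)
```

With this small sample, the first check succeeds and the second list contains only intransitive koken, which is the counterexample to the biconditional of Hoekstra (1984) discussed above.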
Apart from this problem for further research, I believe that Dutch does not essentially differ from Italian and other languages. As in Italian, there are two kinds of intransitives, unaccusatives and unergatives (see Perlmutter (1978) and Burzio (1981)). With Den Besten (1982) and Hoekstra (1984), I believe that the existence of transformational passives in Dutch is a rather well-established fact. A theoretically important difference with the English passive is that passive NP-movement is optional in Dutch. I will return to this difference in the following sections. Here it suffices to recall that the inversion of the passive subject and the indirect object is caused by the optional movement of the passive subject from object position to subject position: (66)
a. dat np [VP hem [V' het boek gegeven werd]]
   that him the book given was
   'that the book was given to him'
b. dat het boek [VP hem [V' t gegeven werd]]
   that the book him given was
If (66a) is the correct structure, we have an interesting problem: the subject position can remain empty, and the derived subject can remain in its underlying object position, in spite of the fact that accusative Case is absorbed in this position under the standard analysis of passives (Chomsky (1981b)). This indicates that the standard analysis, based on the Case Filter, is incorrect. Before going into this problem, I will discuss the problems for Case assignment and agreement posed by (66a). Furthermore, I will discuss in the next section the circumstances under which subjects can remain empty in Dutch.
5.3. Case, agreement, and subject drop in Dutch

Just as in English, the subjects of passives in Dutch must be assigned nominative Case: (67)
a. De politie arresteerde hem
   the police arrested him
b. Hij werd gearresteerd door de politie
   he was arrested by the police
c. *Hem werd gearresteerd door de politie
For English, nominative Case assignment after passivization is supposed to be dependent on Tense (INFL). Furthermore, it is assumed that passivization is obligatory in order to escape from the effects of the Case Filter (Chomsky (1981b)). We have seen, however, that Dutch differs from English in that the direct object can remain in place under passivization
while the subject position remains empty. I will return to this difference between Dutch and English in the next section. Here, I will limit myself to matters of Case assignment and agreement. Suppose that our account of the Dutch passive is correct. We then have to account for the fact that nominative Case is assigned to NPs that are governed by the V. This is somewhat at odds with the common view that nominative Case is assigned under government of Tense. This assumption is problematic, however. There are several cases in which nominative Case is assigned without a governing Tense. This was briefly discussed in chapter 2 in connection with Henk van Riemsdijk's example from German: (68)
Der Hans (nom.), mit dem (dat.) spreche ich nicht mehr
the Hans with him talk I not more
'Hans, I don't talk to him any longer'
The nominative phrase der Hans is in topic position and therefore not governed by Tense. Other examples can be found in Dutch. Thus, Sturm and Pollmann (1977) give examples like the following: (69)
Karel een huis kopen, wie had dat kunnen denken
Charles a house buy who had that can think
'Charles buying a house, who could have imagined that'
Clearly the initial NP Karel is assigned nominative Case, which becomes clear when we replace Karel by the pronominal hij 'he': 1 (70)
Hij (nom.) een huis kopen, wie had dat kunnen denken
The verb kopen is as tenseless as a verb can be. One might suggest, then, that nominative Case is not assigned by a governor at all, but that, instead, it is a default Case (see Jakobson (1935)). I adopt this view with the addition that nominative Case is only assigned as the default Case to the subjects of subject-predicate combinations. Both the nominative in (68) and the one in (70) are subjects in this sense, and so are the subjects of tensed sentences. A consequence of the default view is that we have to prevent nominative Case assignment to the subjects of infinitival complements. This can be done by a simple filter for complements: 2 (71)
$*[_{S} \ldots NP_{nom} \ldots V \ldots]$ (S minimal), unless $NP_{nom}$ agrees with V
This type of account has the added advantage that we allow for nominative Case in complements with inflected infinitives, as in Portuguese (see Chomsky (1981b, 52)). In any case, we can now account for the fact that there are also passives like (70): (72)
Hij (nom.) gearresteerd, dat geloof ik niet
he arrested that believe I not
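The interaction of default nominative Case with filter (71) can be summarized in a short sketch. The Python fragment below is a simplification under explicit assumptions: the class Clause and its fields are hypothetical, the filter is taken to apply to complements only (as the text states), and the "S minimal" condition is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    label: str
    is_complement: bool  # filter (71) is stated for complements
    subject_case: str    # e.g. "nom"
    verb_agrees: bool    # does V agree with the subject?


def violates_filter_71(clause: Clause) -> bool:
    """True if a complement has a nominative subject without agreement on V."""
    return (
        clause.is_complement
        and clause.subject_case == "nom"
        and not clause.verb_agrees
    )


CLAUSES = [
    Clause("plain infinitival complement", True, "nom", False),
    Clause("inflected infinitive complement (Portuguese type)", True, "nom", True),
    Clause("tensed complement", True, "nom", True),
    Clause("root subject-predicate combination, as in (70)", False, "nom", False),
]

for clause in CLAUSES:
    print(clause.label, "->", "*" if violates_filter_71(clause) else "ok")
```

On these assumptions only the plain infinitival complement is filtered out, while the tenseless root construction of (70) and (72) keeps its default nominative.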
The nominative of the VP-internal subject of Dutch passives can simply be inherited from the empty external subject, just as in the following well-known examples from Italian (Chomsky (1981b, 260)): (73)
a. [NP ei] [VP [VP telefonano] [molti studenti]j]
b. [NP ei] [VP arrivano [molti studenti]j]
It is generally assumed that Case inheritance from the empty subject (to the postverbal subject) requires a special style of indexing, namely cosuperscripting, to distinguish the relation from the binding relation. Note, however, that the theory of the configurational matrix (chapters 1 and 2) does not require coindexing of any type. As long as two elements, an antecedent and a dependent element, are in the proper mutual configuration, property sharing is always possible. Inheritance can be upward (from right to left) as in the case of lexical content under Wh-movement, or downward (from left to right) as in the case of referential properties in anaphoric relations. In (73), molti studenti cannot inherit referential properties from the empty subject (binding), because the lexical NP already has referential properties (as a non-anaphor). 3 But it certainly can inherit other properties from an antecedent. A lexical NP is not a dependent element with respect to binding, but it definitely is a dependent element with respect to Case assignment. It must receive Case either from a Case assigner, or from another NP that already has Case. Case inheritance, then, is just another instance of the property-sharing relation that is generally allowed within the configurations defined by the configurational matrix. For Dutch passives, this entails that nominative Case is assigned to the empty subject [NP e]j, and that this Case is inherited by the NP het boek in (74): (74)
dat [NP ei] [vP hem [het boekj gegeven werd]] that him the book given was 'that the book was given to him'
I will now focus on the main difference between Dutch and English in this context, namely the fact that Dutch subjects can remain empty under certain circumstances. Obviously, Dutch is not a pro-drop language like Italian. Under which conditions, then, can subjects remain empty in Dutch? An earlier attempt to answer this question led to the following
hypothesis (Safir (1985)): "In German [and Dutch - JK], a subject bearing a thematic role (agent, patient, etc.) may not be missing - only an expletive subject can - while in Italian, a subject may be missing regardless of whether it has a thematic role or not." In Safir's interpretation, this means that er 'there' and het 'it' may be dropped in Dutch subject positions: (75)
a. Mij werd (er) verteld dat de aarde rond was
   me was there told that the earth round was
b. Duidelijk is (het) dat hij komt
   clear is it that he comes
Although Safir acknowledges the existence of empty expletive pronouns in languages like German and Dutch, he nevertheless assumes that er and het are also expletives (that can be dropped). In contrast, I will argue that neither er nor het are expletives and that the only pleonastic element in Dutch is the empty element. Consider (74), for instance. According to our earlier discussion, such passive sentences (and also the unaccusatives) can have an empty subject. It does not make sense to say that er or het has been dropped in such structures, because neither is possible with definite NPs: (76)
a. *Er werd de jongen het boek gegeven
   there/it was the boy the book given
b. *Het werd de jongen het boek gegeven
If the passive subject remains in its underlying VP-internal position, I assume that the subject position is occupied by an empty Case-marked expletive element. The fact that this element occurs with passives and unaccusatives suggests the following hypothesis: (77)
A subject can be empty in Dutch if and only if there is no external θ-role.
This generalization covers the subject drop phenomenon in examples like (75) and also in impersonal passives: (78)
Overal werd (er) gedanst
everywhere was there danced
Examples like (78) are slightly more natural with er, but impersonal passives with sentential complements are perfect without er: (79)
Overal werd (er) gezegd dat hij ziek was
everywhere was there said that he sick was
'Everywhere, it was said that he was sick'
Why are er and het not pleonastic elements? For het and the corresponding English it the answer seems clear: these elements are arguments (or quasi-arguments in weather contexts like het regent 'it rains'). Arguments for this conclusion will be given in a moment. For er, however, the answer is less straightforward. The main point is that there is simply no evidence that er is an NP. Traditional grammarians call it an adverb, a conclusion that is confirmed by distributional evidence. There is no uncontroversial NP-position in which er can occur. Contrary to what we see for English there, er can always be added to a sentence with an indefinite subject, no matter how many NPs there are: (80)
Er heeft iemand Peter een boek gegeven
there has someone Peter a book given
'Someone has given a book to Peter'
If er is an NP, this would add a fourth NP position to the sentence, which is not justified on any other grounds. I assume therefore that er is a presentational adverb that is used if the NP subject of the sentence is not a suitable topic of the sentence. Thus, (80) describes an event, not a predication over iemand. Impersonal passives are typically lacking a subject for predication:
(81)
Er werd gedanst
there was danced
I assume that er does not occupy the subject position in this sentence, but that the real subject is a pleonastic NP, which can be empty as in the case of passives and unaccusatives. This entails the following structure for (81): (82)
Er werd [NP e] [VP gedanst]
As in (80), the real subject is an NP, and the adverb er is only added to indicate that the empty NP is not the topic of the sentence. German es is probably a real pleonastic element (cf. Safir (1985)): (83)
Es wurde ein Mann getötet
there was a man killed
Es is a real dummy that fills the obligatory first position of the sentence. In other contexts, it must be dropped:
(84) a. dass (*es) ein Mann getötet wurde
        that there a man killed was
     b. Gestern wurde (*es) ein Mann getötet
        yesterday was there a man killed
German es contrasts sharply with Dutch er in such contexts. The presence of er is even strongly preferred in the following sentences: (85)
a. dat er een man gedood werd
   that there a man killed was
b. Gisteren werd er een man gedood
   yesterday was there a man killed
It is even less plausible that Dutch het or its English counterpart it is a dummy. Contrary to what is generally assumed, I believe that it (and het
in the corresponding Dutch sentences) is an argument in the following examples: (86)
a. It seems that Bill is happy
b. It is clear that Mary knows the answer
Hans Bennis and I independently came to the conclusion that the it in question is the same it as in (87):4 (87)
I regret it that Bill is sick
It is quite reasonable to assume that it is an argument (theme) in this case.
The verb must be subcategorized for this NP, and therefore it must also θ-mark it: subcategorization entails θ-marking according to Chomsky's projection principle (1981b, 37ff.). That it is an argument can also be concluded from the fact that the clause can be omitted: (88)
I regret it
This it can also be questioned (interpreted as a variable), which indicates argument status: (89)
What do you regret t?
It is also replaced by an NP interpreted as a variable in:
(90)
[That Bill is sick]i Oi I regret ti
It, then, is an argument, and the following clause is a specification of the content of this argument. Naturally, no NP other than it can be related to a following clause:
(91)
*I regret themi [that John will come]i
In general, clauses cannot occupy argument Case-positions (see the Case Resistance Principle of Stowell (1981)). Therefore, they always occur in
dislocated positions from where they can be linked to an NP-position. Thus, subject sentences are in fact topics that bind a trace (variable) in the NP-subject position (see Koster (1978a)). Similarly, object clauses never occupy the same position as a Case-marked object NP. In Dutch, a similar phenomenon can be observed with PP complements (cf. Hoekstra (1984)). Thus, it is generally impossible to have sentential complements in certain PPs: (92)
a. Ik reken op hem
   I count on him
b. Ik reken er op
   I count there on
   'I count on it'
c. *Ik reken op [dat hij komt]
   I count on that he comes
(93) a. Ik denk aan haar
        I think of her
     b. Ik denk er aan
        I think there on
     c. *Ik denk aan [dat zij komt]
        I think of that she comes
It is nevertheless possible to express the content of (92c) and (93c) by
adding dislocated clauses to (92b) and (93b), respectively: (94)
a. Ik reken eri op [dat hij komt]i
   I count there on that he comes
   'I count on his coming'
b. Ik denk eri aan [dat zij komt]i
   I think there of that she comes
   'I think of her coming'
Again, the clauses cannot appear in argument positions but they must be linked to argument positions. The sentences in (86) are entirely analogous to (87): it is an argument, and the following clause is a specification of this argument. Bennis (1986) has insightfully concluded that it must be an underlying internal argument, so that (86a) and (86b) have the following underlying structure: (95)
a. np seems iti [that Bill is happy]i
b. np is [iti clear] [that Mary knows the answer]i
As in passives, the subject positions must obligatorily be filled (so that (86) is derived). In this view, it is not only an argument, but also an internal argument that must be externalized. This entails that verbs like seem and be are in
fact unaccusatives (seem is perhaps a case of inversion in the sense of Perlmutter and Postal (1984)). This conclusion is confirmed by the fact that verbs like seem and appear do not undergo passivization (Perlmutter and Postal (1984, 105)): (96)
a. *I am seemed to by Harry to be wrong
b. *I am appeared to by Louise to be wrong
The same can be said about the Dutch equivalents. Moreover, there are Dutch examples that show the inversion of subject and indirect object that is typical of verbs with an underlying VP-internal subject: (97)
a. dat mij die dingen juist schijnen
   that me those things correct seem
   'that those things seem to me to be correct'
b. dat die dingen mij t juist schijnen
(98) a. dat mij die dingen duidelijk zijn
        that me those things clear are
        'that those things are clear to me'
     b. dat die dingen mij t duidelijk zijn
Furthermore, adjectives always form their perfect with the auxiliary zijn 'be', which is a compelling argument for their unaccusative status: (99)
a. Het is duidelijk geweest [dat hij komt]
   it is clear been that he comes
   'It has been clear [that he will come]'
b. *Het heeft duidelijk geweest [dat hij komt]
   it has clear been that he comes
In sum, the evidence that verbs like seem and adjectives (with be) are unaccusatives is quite strong. Consequently, I assume that Bennis is right and that (95) represents the underlying structures of the sentences in question. If the it in these cases is the argument of the verb, which is specified by the following clause, it must be the only NP that can fill the role of argument (cf. (91)). Thus, no important conclusions can be based on sentences like the following (Chomsky (1982a, 22)):
(100) *Theyi seem [ti to feed each other] would be difficult
If not because of the ECP, this sentence is ungrammatical for the same reason as (91): only it can fill the subject position as the argument that is specified by the extraposed clause. Everything said so far about it can be said about Dutch het. It definitely seems to be an argument, and neither Dutch er nor het can be
considered dummies. Dutch differs from English in that the internal argument it is not necessarily moved to subject position (as in passives and unaccusatives). In VP-internal positions het can be dropped if a clause follows, hence the following contrast between Dutch and English: (101)
a. Duidelijk is dat hij komt
   clear is that he comes
b. *Clear is that he comes
Interestingly, it can usually be dropped in English too, if it is in VP-internal position: (102)
I regret (it) that he will come
Many differences between English and Dutch, then, can be explained by the fact that English, contrary to Dutch, has an obligatory lexical subject, even if there is no external θ-role. In the next section, I will give a somewhat more systematic account of the consequences of this difference between the two languages.
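Hypothesis (77) and the contrast with English stated above can be restated as a small decision procedure. The following Python sketch is only an illustration: the function name and the reduction of each language to a single condition are assumptions made here, not part of the analysis itself.

```python
def subject_may_be_empty(language: str, has_external_theta_role: bool) -> bool:
    """Hypothetical helper: may the subject position stay empty?

    Simplification: only the two languages and the single factor
    discussed in the text are modelled.
    """
    if language == "Dutch":
        # Hypothesis (77): empty if and only if there is no external theta-role.
        return not has_external_theta_role
    if language == "English":
        # English requires a lexical subject even without an external theta-role.
        return False
    raise ValueError(f"language not modelled: {language}")


# (74): Dutch passive, no external theta-role
print(subject_may_be_empty("Dutch", False))    # True
# a Dutch verb that assigns an external theta-role
print(subject_may_be_empty("Dutch", True))     # False
# (101b): *Clear is that he comes
print(subject_may_be_empty("English", False))  # False
```

The sketch deliberately leaves out the separate requirement on the first position of Dutch root clauses.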
5.4. A difference between English and Dutch

We have seen in the preceding section that subjects that are not assigned an external θ-role can remain empty in Dutch. In English, on the other hand, every subject position must be lexically filled. Nevertheless, there is a position in Dutch (and German) that also must be filled in declarative sentences. This is the first position of the sentence, which precedes the "verb second" position (see Thiersch (1978) and Koster (1975)). Words like er and het cannot be dropped in this position:5
(103) a. Duidelijk is (het) dat hij komt
         clear is it that he comes
      b. *(Het) is duidelijk dat hij komt
(104) a. Vaak wordt (er) gezegd dat hij komt
         often is there said that he comes
      b. *(Er) wordt vaak gezegd dat hij komt
(105) a. Overal werd (er) gedanst
         everywhere was there danced
      b. *(Er) werd overal gedanst
Similarly, the first position must be filled in sentences with passives or unaccusatives: (106)
a. Hem werd een boek gegeven
   him was a book given
b. Een boek werd hem gegeven
   a book was him given
   'A book was given to him'
c. *- werd hem een boek gegeven
   was him a book given
(107) a. Hem overkwam een ramp
         him happened a disaster
      b. Een ramp overkwam hem
         a disaster happened him
         'A disaster happened to him'
      c. *- overkwam hem een ramp
         happened him a disaster
The requirement to fill the first position only holds for root sentences. In the verb-final subordinate clauses, the subject may be empty:
(108) a. Ik denk dat duidelijk is dat hij komt
         I think that clear is that he comes
      b. ... dat vaak gezegd wordt dat hij komt
         that often said is that he comes
      c. ... dat overal gedanst werd
         that everywhere danced was
      d. ... dat hem een boek gegeven werd
         that him a book given was
      e. ... dat hem een ramp overkwam
         that him a disaster happened
English, then, is a subject-oriented language, while Dutch and German are V-second languages. In subject-oriented languages, the subject position must be filled, and in V-second languages the first position of the sentence must be filled (in declarative sentences). One may wonder which parameters are responsible for the differences in question. In Koster (1986a) and (1986b), I have proposed to account for the distinction in terms of the nature of INFL and COMP in English and Dutch. Contrary to Dutch and German, English has a strong INFL which governs the subject position. This has the consequence that subjects must be either lexically filled or bound (if empty). In Dutch, COMP governs the topic position, which must therefore be filled (or bound) like the English subject position. The fact that English subjects must be filled has several consequences. It partially explains, for instance, why passivization in English leads to an obligatory fronting of the underlying object to subject position. I have already indicated why I believe that the usual account in terms of the Case Filter is incorrect. According to this standard account (Chomsky (1981b, 124)), the participle killed absorbs Case, so that the NP John must go to the subject position in order to escape the Case Filter:
(109) a. np was killed John
      b. John was killed t
There are at least three reasons to assume that this account is wrong. First of all, it has never been explained in a satisfactory way why (109a) cannot be saved by inserting of, as in the complements of nouns and adjectives (cf. Chomsky (1981b, 142, n. 49)). This would yield a sentence like (Dummy) was killed of John. A more important argument against the standard account is that it does not work for Dutch. As we have seen in the discussion of passives and ergatives in Dutch, Case-absorbed objects can remain in place because they can inherit nominative Case from the empty subject position. I conclude therefore that the obligatoriness of NP-movement is not caused by the Case Filter, but by the need to fill the subject in English. This alternative account suffices for cases like (109). It does not suffice, however, for cases with indefinite NPs. In these cases, it is not immediately obvious why the structure cannot be saved by there-insertion. Why, in other words, is the following sentence ungrammatical? (110)
*There was arrested a man in the yard
Here, an explanation in terms of the Case Filter does not work, because there are many cases in which nominative Case can be inherited through there (see for instance Emonds (1976, 104ff.)): (111)
a. There remain a few problems to be solved
b. There appeared a hatless stranger
c. There occurred a catastrophe in that century
d. There may not exist a solution to that problem
Interestingly, all of these sentences have verbs that presumably are unaccusatives. With none of the verbs is it possible to form impersonal passives in the corresponding Dutch sentences:
(112) a. *Er werd gebleven (door de problemen)
         there was remained by the problems
      b. *Er werd verschenen (door de vreemdeling)
         there was appeared by the stranger
      c. *Er werd gebeurd (door de catastrofe)
         there was occurred by the catastrophe
      d. *Er werd bestaan (door het probleem)
         there was existed by the problem
Furthermore, most of these verbs form their perfect tense with zijn rather than with hebben, which is strong evidence for their unaccusative status:
(113) a. Er is een probleem gebleven
         there is a problem remained
         'There has remained a problem'
      b. Er is een vreemdeling verschenen
         there is a stranger appeared
         'There has appeared a stranger'
      c. Er is een ramp gebeurd
         there is a disaster occurred
         'There has occurred a disaster'
Only bestaan 'exist' forms its perfect with hebben 'have', which shows once again that not all unaccusatives select zijn:
(114) Er heeft een probleem bestaan
      there has a problem existed
      'There has existed a problem'
In any case, the existence of examples like (111) forms a third argument against the standard account of passives. As we have seen in Dutch, there is a strong parallelism between passives and unaccusatives: both have an NP in underlying VP-internal position. The fact, then, that unaccusatives with subjects in VP-internal position can be saved by there-insertion (111) and passives cannot (110), shows that something is missing from the standard account of passives in English. What has to be added to the usual account is the assumption that passive be is a main verb that selects a (small) clausal complement. This possibility was explored by Hasegawa (1968), but dismissed by Chomsky (1970, n. 29), among other things, on the basis of idioms in passives like advantage was taken of John. This idiomatic example would indicate that the passive subject (advantage) could not be base-generated. Hasegawa's hypothesis has made a comeback, however, through the possibility of interpreting passive be as a raising verb that selects a small clause (see Stowell (1978), Borer (1980), and particularly Burzio (1981, 234ff.)). According to the modified hypothesis, the structure of John was killed is as follows:
(115) Johni was [S ti killed ti]
As before, movement of John is obligatory because the subject position(s) must be filled. The extra (intermediate) trace is the key to understanding the ungrammaticality of (110). The explanation has to do with the special semantic conditions on NPs that follow the combination there be in English. Roughly, NPs following this combination must be indefinite (Milsark (1974)). This explains why the structure underlying (110) must be rejected:
(116) Therei was [S ti arrested a man in the yard]
The NP following the combination there was is the trace of there, i.e. the subject of the small clause. Contrary to what is required, this NP is not indefinite, hence the ungrammaticality. Without the extra trace, a man would be the first NP. This NP is indefinite, so that the sentence is not necessarily rejected. Note also that there is immediate evidence for this account (and the extra NP-position that it involves) in the acceptability of (117):
(117) There has just been [a mani arrested ti]
Summarizing, we have seen that the obligatoriness of NP-movement in English passives does not follow from the Case Filter, but from the requirement that the subject position must be filled. Structures with indefinite NPs cannot be saved by there-insertion, because be selects a small clause. The subject of this small clause follows the combination there be, and must therefore be indefinite. With unaccusatives like there remains a problem to be solved, the problem does not arise, because in these cases the verb be (with its clausal complement) is not involved. I would like to conclude this section with some examples of the consequences of the preceding analyses. To begin with, consider the fact that English does not have impersonal passives of simple intransitive verbs: (118)
*It was danced
According to our earlier analysis, it is always an argument in English. Sentence (118) is then ruled out by the θ-criterion. As an argument, it must be linked to some θ-position. Since the verb dance is intransitive, it has no VP-internal θ-position to which it can be linked. In contrast, the following sentence is grammatical:
(119) It was said that Bill was sick
The verb say does have an object, namely the preposed NP it. The clausal complement is a specification of this argument NP, as is clear from the underlying structure of (119):
(120) np was said iti [that Bill was sick]i
Since the subject position cannot remain empty in English, it must be moved to the subject position, which results in (119). For a different reason, it cannot be replaced by there in (118):
(121) *There was danced
This example is ruled out by the fact that the underlying structure of (121) would involve an unfilled or an inappropriate subject, namely the subject of the clausal complement of was:
(122) Therei was [ti danced]
At best, the embedded subject is the trace of there, which is definitely not the indefinite NP that is required in the context there was .... A possibly different manifestation of the fact that subjects must be filled in English is that we do not find the equivalent of the Dutch sentence (123b), next to an equivalent of (123a):
(123) a. Ik liet Peter de aardappelen schillen
         I let Peter the potatoes peel
         'I let Peter peel the potatoes'
      b. Ik liet - de aardappelen schillen (door Peter)
         I let the potatoes peel by Peter
Such sentences are not possible at all in English, as was observed by Zubizarreta (1985) in a comparison between French and English. Thus, French is like Dutch in that it has complements without an overt subject:
(124) a. Il faut laisser vivre
         one must let live
      b. Ce médicament fait dormir
         this medicine makes sleep
      c. J'ai entendu sonner à la porte
         I have heard ring at the door
      d. J'ai rarement vu pleurer dans ma vie
         I have rarely seen cry in my life
Zubizarreta shows that the literal translations of these sentences are ungrammatical in English:
(125) a. *One must let live
      b. *This medicine makes sleep
      c. *I heard ring the door bell
      d. *I rarely saw cry in my life
Apart from some idiosyncratic translation problems like (126a), such sentences are generally acceptable in Dutch:
(126) a. ?Men moet laten leven
      b. Dit medicijn doet slapen
      c. Ik hoorde bellen aan de deur
      d. Ik heb zelden zien huilen in mijn leven
In this respect, then, Dutch is closer to French than to English.
5.5. Reanalysis and covalency
All processes discussed so far in this chapter are very local. Thus, both passivization and the promotion rules for unaccusatives are fully characterized by the Bounding Condition. No opacity factors needed to be considered, nor was there any reason to have recourse to the other type of domain extension, the one characterized by the dynasty concept that was introduced in chapter 4. The question, then, is whether domain extension (percolation) plays a role at all in NP-movement. In the following sections, I will argue that dynasty-controlled domain extension plays a role in so-called restructuring (or reanalysis) phenomena. In fact, what has been referred to as restructuring is just an instance of the more general phenomenon of domain extension. As for NP-movement to A-positions, the case that immediately comes to mind is pseudo-passivization. Pseudo-passivization violates the Bounding Condition: (127)
[s Mary was looked [pp at t]]
Since the trace is not bound in its minimal domain (the PP), the Bounding Condition is violated. According to the domain theory discussed in chapter 4, this entails that pseudo-passivization is a marked phenomenon. It does not generally occur in languages with passives. A necessary condition is the possibility of preposition stranding, and in part therefore, pseudo-passivization is constrained by the same factors as Wh-movement. In Dutch, for instance, the equivalent of (127) is ungrammatical:
(128) *Marie werd [PP naar t] gekeken
      Mary was at looked
This is straightforwardly explained by the Bounding Condition, which prohibits extraction from PP. As we saw in chapter 4, the minimal domain defined by the Bounding Condition can be extended if the Ps of a language are structural governors and if the Condition of Global Harmony is met. This condition is not met in (128), because the preposition naar governs to the right, while the verb (gekeken) governs to the left. Global harmony predicts that NP-movement (or, in fact, er-movement) is possible in Dutch if er is extracted from a postpositional PP:
(129) Er werd [PP t naar] gekeken
      there was at looked
      'It was looked at'
In this case, both naar and gekeken govern in the same direction.
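The direction check that separates (128) from (129) can be written out explicitly. The sketch below is a deliberately reduced illustration: the function name, the "left"/"right" encoding, and the default that Dutch Ps count as structural governors are assumptions introduced here, and the full Condition of Global Harmony of chapter 4 may involve more than this single comparison.

```python
def pp_extraction_allowed(p_direction: str, v_direction: str,
                          p_is_structural_governor: bool = True) -> bool:
    """Hypothetical check: may the domain be extended beyond the PP?

    Assumption: harmony is reduced to P and V governing in the
    same direction ("left" or "right").
    """
    return p_is_structural_governor and p_direction == v_direction


# (128): preposition naar governs rightward, gekeken governs leftward
print(pp_extraction_allowed("right", "left"))  # False
# (129): postposition naar and gekeken both govern leftward
print(pp_extraction_allowed("left", "left"))   # True
```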
Often, pseudo-passivization is considered an argument for reanalysis (see for instance Van Riemsdijk (1978)). In current terminology, the typical argument is that passivization involves absorption of the Case of a sister of V. In this view, it is not unreasonable to assume that where the Case-absorbed NP is not a deep structure sister of V, reanalysis has applied, so that the allegedly necessary sisterhood exists at a derived level of structure. Most other arguments for restructuring involve NP-movement to A'-positions, such as the movement of clitics from an embedded clause to the matrix verb ("clitic climbing"). A classic example of this phenomenon is clitic extraction from the complements of faire in French (Kayne (1975, 269)):
(130) a. Elle fera [S partir ses amis]
         she will have leave her friends
         'She'll have her friends leave'
      b. Elle les fera [S partir t]
         she them will have leave
         'She'll have them leave'
Related phenomena have been discussed for Spanish (for instance by Contreras (1979) and Bok-Bennema (1981)):
(131) a. Paco quiere dártelas
         Paco wants to give you them
      b. Paco te las quiere dar
The phenomenon has been discussed in great detail for Italian by Rizzi (1978a) and by Burzio (1981, ch. 6). Rizzi has pointed out that clitic climbing is just one instance of a whole set of related transparency phenomena that occur with certain verbs. Of clitic climbing, Rizzi gives the following example, among others:
(132) a. Piero verrà a parlarti di parapsicologia
         Piero will come to speak to you about parapsychology
      b. Piero ti verrà a parlare [t] di parapsicologia
      c. Piero deciderà di parlarti di parapsicologia
         Piero will decide to speak to you about parapsychology
      d. *Piero ti deciderà di parlare [t] di parapsicologia
The matrix verb in (132a,b) has a transparent complement, while the matrix verb in (132c,d) has a complement from which clitics cannot be promoted. Similarly, in "impersonal si" sentences, the direct object of the embedded clause can be promoted with some verbs (133a,b), while promotion is impossible with other verbs (133c,d):
(133) a. Finalmente si comincerà a costruire le nuove case popolari
         finally will begin to build the new houses of the people
         'Finally, one will begin to build the new council houses'
      b. Finalmente le nuove case popolari si cominceranno a costruire t
      c. Finalmente si otterrà di costruire le nuove case popolari
         finally will get permission to build the new houses of the people
         'Finally, one will get permission to build the new council houses'
      d. *Finalmente le nuove case popolari si otterranno di costruire t
A third phenomenon concerns the selection of avere 'have' or essere 'be' as the auxiliary of the perfect tense. Similarly to what we saw in Dutch, some verbs select avere (134a), while others select essere (134b):
(134) a. Mario ha (*è) voluto un costoso regalo di Natale
         Mario has wanted a present expensive of Christmas
         'Mario has wanted an expensive Christmas present'
      b. Mario è (*ha) tornato a casa
         Mario is returned home
         'Mario has come home again'
Interestingly, main verbs that usually select avere can optionally select essere if these main verbs have transparent complements with verbs that select essere:
(135) a. Mario ha voluto tornare a casa
      b. Mario è voluto tornare a casa
      c. Mario ha promesso di tornare a casa
         Mario has promised to return home
      d. *Mario è promesso di tornare a casa
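The auxiliary pattern in (134) and (135) amounts to a simple conditional. The following sketch restates it; the function and parameter names are hypothetical, and transparency of the complement is taken as a given lexical property of the matrix verb rather than derived.

```python
def possible_auxiliaries(matrix_aux: str, embedded_aux: str,
                         complement_is_transparent: bool) -> list:
    """Hypothetical helper: perfect auxiliaries available to the matrix verb."""
    options = {matrix_aux}
    # A verb that normally selects avere may also take essere when its
    # transparent complement contains a verb that selects essere.
    if (matrix_aux == "avere" and complement_is_transparent
            and embedded_aux == "essere"):
        options.add("essere")
    return sorted(options)


# (135a,b): volere (avere, transparent complement) + tornare (essere)
print(possible_auxiliaries("avere", "essere", True))   # ['avere', 'essere']
# (135c,d): promettere (avere, opaque complement) + tornare (essere)
print(possible_auxiliaries("avere", "essere", False))  # ['avere']
```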
Somewhat less generally known is the fact that very similar transparency phenomena played a prominent role in the first chapter of Evers (1975), versions of which had circulated since 1969, and which played the same seminal role for the syntax of Dutch and German as Kayne (1975) did for the syntax of Romance. Some of these phenomena were discussed in chapter 3 in connection with control. The Dutch and German facts add an interesting dimension to the Romance facts because, as we saw in chapter 3, the very same verbs have transparent complements to their left (Verb Raising (VR) complements) and opaque complements to their right (extraposed complements).
This indicates that the nature of the matrix verb is only one necessary condition for "restructuring". Another necessary condition is that the transparent complement must be governed. The fact that the Dutch and German transparency phenomena are direction-dependent suggests the role of the Principle of Global Harmony. I will return to this matter in detail below. Here I will repeat some of the examples from chapter 3 for reasons of convenience. Recall that er can be extracted from VR-complements and placed before indefinite matrix subjects:
(136) dat er iemand [S t over] probeerde te schrijven
      that there someone about tried to write
      'that someone tried to write about it'
If the complement is extraposed (to the right of proberen), extraction of er leads to an ungrammatical result:
(137) *dat er iemand heeft geprobeerd [S (om) t over te schrijven]
      that there someone has tried COMP about to write
Similarly, the German pronoun es can be extracted from a VR-complement and placed before the matrix subject:
(138) dass es der Hans [t zu begreifen] versucht
      that it the John to understand tries
      'that John tries to understand it'
Again, extraposition makes the complement opaque:
(139) *dass es der Hans versucht [t zu begreifen]
Transparency versus opacity can also be observed with the scope of quantifiers and other operators. Sentential adverbs like waarschijnlijk 'probably', for instance, can have matrix scope in a VR-complement: (140)
Ik denk dat Peter [S het boek waarschijnlijk] probeerde te lezen
I think that Peter the book probably tried to read
'I think that Peter probably tried to read the book'
After extraposition, matrix scope is no longer possible:
(141) *Ik denk dat Peter probeerde [(om) het boek waarschijnlijk te lezen]
For both Romance and Germanic, then, transparency ("restructuring") phenomena are well documented. Furthermore, it is fair to assume that all these phenomena fall into the same class. An adequate explanation, in other words, has to work for all the cases discussed so far. Because the solution proposed by Evers (1975) fails to apply to Romance, it must be rejected. Evers sought to account for the transparency of VR-complements in terms of a pruning operation. This pruning operation was supposed to be triggered by Verb Raising: a clause that loses its V must be pruned. On the basis of the Romance facts, however, it is clear that transparency has nothing to do with the formation of a complex verb (as a result of Verb Raising) in the matrix clause. Kayne (1975, 217ff.) has convincingly shown that faire and its complement verb do not form a constituent (a complex verb). All kinds of material can intervene between faire and its complement verb. In questions, for instance, subject clitics are attached to the right of faire (142a) and not to the right of the hypothetical complex verb (142b): (142)
a. Fera-t-il partir Marie?
   will have he leave Mary
   'Will he have Marie leave?'
b. *Fera partir-il Marie?
Similarly, the negative element pas must be placed to the right of faire: (143)
a. On ne fera pas partir Jean
b. *On ne fera partir pas Jean
Furthermore, quantifiers and adverbials can be placed between the two verbs: (144)
a. Ils la feront sans aucun doute pleurer
they her will make without any doubt cry
'They will no doubt make her cry'
b. Elles feront toutes les trois soigneusement controler leurs voitures
they will have all the three carefully check their cars
'They will all three have their cars checked carefully'
These examples conclusively show that faire and its complement verb do not form a complex verb.
Although Rizzi (1978a) convincingly demonstrates that there is a coherent set of transparency phenomena associated with certain verbs, he is remarkably vague as to the nature of the restructuring process itself. He says that restructuring creates a "verbal complex", and he considers two possibilities of raising the embedded verb to a position near the matrix verb. One of these possibilities raises the embedded verb to a sister position of the matrix verb under the VP. This option, more or less rejected by Rizzi on other grounds, seems to violate the Projection Principle. The other possibility looks like Evers's Verb Raising, i.e. the embedded verb is Chomsky-adjoined to the matrix verb. In spite of Rizzi's preference for the latter option, he argues that the two verbs nevertheless cannot form a complex verb. His arguments are reminiscent of Kayne's, i.e. adverbs can intervene between the two verbs (the verbs undergoing restructuring are italicized): (145)
Maria è dovuta immediatamente tornare a casa
Maria is must immediately return home
'Maria had to come home again immediately'
Rizzi gives several examples of this kind and rightly concludes that the two verbs do not form a complex V. But it must be concluded that Rizzi does not really give a coherent account of the restructuring process itself: the only solution that he hesitantly finds acceptable at all involves the creation of a complex verb, which he rejects on other grounds. Burzio (1981) collapses restructuring and the fare-infinitive rule and claims that both involve raising of the embedded VP into the matrix VP. This solution meets the same problem as the solution that Rizzi rejected, in that it violates the Projection Principle (as was observed by Zubizarreta (1982)). In general, the exact mechanisms that are supposed to govern reanalysis have remained in the dark. Recently, however, a more formal approach has been hinted at by Riny Huybregts in lectures and unpublished work. This work is also the starting point of Haegeman and Van Riemsdijk (1986). I will now go into this theory in somewhat more detail. According to the theory in question ("covalency"), tree representations do not suffice for restructuring phenomena. Multidimensional representations are believed to be necessary for sentences like (146a,b): (146)
a. John looked at Mary
b. Ik denk dat hij [Peter het boek lezen] zag
I think that he Peter the book read saw
'I think that he saw Peter read the book'
The italicized words have a lexical status of their own, but moreover, look and at in (146a) and lezen and zag in (146b) are supposed to form bigger units as a result of reanalysis. As we saw before, reanalysis for (146a) is suggested by pseudo-passivization (Mary was looked at), and for (146b) it is suggested by the transparency phenomena that we discussed above. The multidimensional structure of the VP of (146a) can, somewhat misleadingly, be represented as follows: (147)
[tree diagram with two dimensions over the terminal string look at Mary: upper tree [VP [V look] [PP [P at] [NP Mary]]]; lower tree [VP [V look at] [NP Mary]]]
This structure does not exactly express how the covalency theory analyzes the VP in question. What is really intended is a development of certain ideas of Lasnik and Kupin's (1977), who do not represent syntactic structures as trees but as reduced phrase markers (RPMs). To see what RPMs are, we have to consider their relation to phrase markers (P-markers) in general. In Chomsky (1955), phrase structure derivations are formed by applications of phrase structure rules. By applying the rules in different orders, a class of equivalent derivations can be formed. A P-marker for a sentence S is the set of strings occurring as lines in any of the equivalent phrase structure derivations of S. The lines of P-markers in this sense may contain terminals, nonterminals, or both. RPMs are more restrictive: they may contain the terminal string of the P-marker and furthermore all and only the monostrings of the P-marker. The set of monostrings consists of all nonterminals together with their terminal context (which may be null). Lasnik and Kupin (1977) give the following example of an RPM: (148)
{S, Ab, Cb, aB, ab}
This RPM contains the terminals a and b, and the nonterminals S, A, C, and B. It contains the terminal string ab and for each nonterminal, a specification of its context of terminals (for S this context is null). The RPM (148) is compatible with an infinite set of trees. The following two are, for instance, compatible with (148):
(149) [S [A [C a]] [B b]]
(150) [S [C [A a]] [B b]]
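The relation between trees and RPMs described above can be made concrete with a small sketch. It assumes a tree encoding as nested tuples (label followed by daughters, with terminals as plain strings); that encoding and the function names are illustrative choices, not Lasnik and Kupin's formalism. The sketch collects the terminal string plus one monostring per nonterminal, and shows that a tree in which A dominates C and a tree in which C dominates A reduce to the same set (148).

```python
def leaves(tree):
    """Return the terminal symbols of a tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    _label, *children = tree
    out = []
    for child in children:
        out.extend(leaves(child))
    return out


def rpm(tree):
    """Build a reduced phrase marker: the terminal string plus, for every
    nonterminal, that nonterminal flanked by its terminal context."""
    terminals = leaves(tree)
    result = {"".join(terminals)}

    def walk(node, start):
        # Returns the index just past the last terminal covered by node.
        if isinstance(node, str):
            return start + 1
        label, *children = node
        end = start
        for child in children:
            end = walk(child, end)
        left, right = "".join(terminals[:start]), "".join(terminals[end:])
        result.add(left + label + right)
        return end

    walk(tree, 0)
    return result


# Two hypothetical trees differing only in whether A dominates C or vice versa.
tree_a_over_c = ("S", ("A", ("C", "a")), ("B", "b"))
tree_c_over_a = ("S", ("C", ("A", "a")), ("B", "b"))

rpm_1 = rpm(tree_a_over_c)
rpm_2 = rpm(tree_c_over_a)
print(sorted(rpm_1))
print(rpm_1 == rpm_2)
```

Because the monostrings record only terminal context, the dominance relation between A and C is lost in the reduction, which is why (148) cannot be paired with a unique tree.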
A and C have the same terminal context, namely b. For this reason, it is not possible to say that either (149) or (150) uniquely corresponds to (148). Chomsky (1981b, 146, n. 94) has suggested that Lasnik and Kupin's formalism can be used to express reanalysis, for instance for idiomatic expressions like kick the bucket. Assuming that this string is analyzed as V~NP, we can add an element to an RPM so that the string is also analyzed as V. Applied to "reanalysis" structures like (147), this gives the following RPM (consisting of the terminal string look at Mary and the monostrings): (151)
a. VP
b. V~at~Mary
b'. V~Mary
c. look~PP
d. look~P~Mary
e. look~at~NP
f. look~at~Mary
The lines (151a-f) specify the "normal" RPM. The added line (b') expresses reanalysis and is the cause of the fact that (151) is no longer representable as a tree. Assuming for the sake of argument that syntactic structures must be represented as RPMs, we must conclude that reanalysis in this sense increases the expressive power of linguistic theory. Without (b'), (151) could also be represented as a tree. Whatever the overall relation between RPMs and trees, by allowing added rules like (151b') we give up the normal, more restrictive type of representation according to which "there is one and only one element in the reduced phrase marker for each distinct non-terminal in the phrase marker" (Lasnik and Kupin (1977, 176)). For this reason alone, the covalency theory must be considered with scepticism. But there are also obvious empirical problems. The analysis treats look at as a constituent (V), which is at variance with all available evidence (as I will show in the next section). Verb Raising in Dutch (in the sense of Evers (1975)) may involve adjunction of the embedded V to the matrix V, so that a complex V is created. In French and Italian, however, restructuring does not create a complex V, as we have seen before.
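The claim that the added line (151b') makes the RPM unrepresentable as a tree can be checked mechanically. The following is a minimal sketch under stated assumptions: each monostring is encoded as a hypothetical (label, start, end) triple saying which stretch of the terminal string the nonterminal covers, and "representable as a tree" is approximated by the requirement that any two such stretches are either nested or disjoint. This is a simplification of Lasnik and Kupin's dominance and precedence conditions, not a restatement of them.

```python
from itertools import combinations

TERMINALS = ("look", "at", "Mary")

# Assumed encoding: the nonterminal covers TERMINALS[start:end].
NORMAL_LINES = [
    ("VP", 0, 3),  # (151a) VP
    ("V", 0, 1),   # (151b) V~at~Mary
    ("PP", 1, 3),  # (151c) look~PP
    ("P", 1, 2),   # (151d) look~P~Mary
    ("NP", 2, 3),  # (151e) look~at~NP
]
REANALYSIS_LINE = ("V", 0, 2)  # (151b') V~Mary


def monostring(line):
    """Render a (label, start, end) triple in the notation of (151)."""
    label, start, end = line
    return "~".join(TERMINALS[:start] + (label,) + TERMINALS[end:])


def compatible(x, y):
    """Two spans fit in one tree if they are disjoint or nested."""
    (_, s1, e1), (_, s2, e2) = x, y
    disjoint = e1 <= s2 or e2 <= s1
    nested = (s1 <= s2 and e2 <= e1) or (s2 <= s1 and e1 <= e2)
    return disjoint or nested


def crossing_pairs(lines):
    """List the pairs of monostrings whose spans overlap without nesting."""
    return [
        (monostring(x), monostring(y))
        for x, y in combinations(lines, 2)
        if not compatible(x, y)
    ]


def tree_representable(lines):
    return not crossing_pairs(lines)


normal_ok = tree_representable(NORMAL_LINES)
reanalysed = NORMAL_LINES + [REANALYSIS_LINE]
reanalysed_ok = tree_representable(reanalysed)
print(normal_ok, reanalysed_ok, crossing_pairs(reanalysed))
```

Under this encoding the only offending pair is look~PP and V~Mary: the added V covers look at while PP covers at Mary, so the two overlap on at without either containing the other.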
Ultimately, I reject the covalency theory because it formalizes something which is taken for granted but which appears to be entirely superfluous: reanalysis. It is not at all necessary to weaken linguistic theory for the sake of restructuring, since it probably does not exist anyway. The phenomena in question fall entirely within the range of dynasty-controlled domain extensions, which are independently necessary for facts that can in no way be reduced to reanalysis, let alone to the covalency theory.
5.6. Against reanalysis
According to the reanalysis theory adopted by Haegeman and Van Riemsdijk (1986), pseudo-passivization and Verb Raising in Dutch involve a similar restructuring process, i.e. addition of a line to the underlying P-markers. In both cases, two lexical heads are analyzed as a constituent. Thus, in (146a) look at is analyzed as V in the added dimension, and in (146b) lezen zag is interpreted as a V. This equal treatment is unsatisfactory because constituent tests fail for look at, while they succeed for the products of Verb Raising. That look and at do not form a constituent can be concluded from gapping facts: (152)
*John looked at Mary and Bill - Sue
The preposition at cannot be gapped along with the verb, while gapping of the verb alone does not create any problems: (153)
John looked at Mary and Bill - at Sue
These facts are entirely unproblematic if we assume that the underlying structure V + P is maintained. Example (152), in other words, is a counterexample against reanalysis. Other rules involving sisters of V also count against reanalysis. Consider for instance "Heavy NP Shift", the possibility to move the italicized heavy NP to the right: (154)
a. John saw the woman that he loved very often
b. John saw very often the woman that he loved
If heavy sisters of V can be moved to the right, we can expect the same after reanalysis, which creates the sisterhood in question. This expectation is not met: (155)
a. John looked at the woman that he loved very often
b. *John looked at very often the woman that he loved
Superficially, the following facts might be considered evidence in favor of reanalysis: (156)
a. Who did you look at t and kiss t?
b. He either looked at or kissed Mary
In (156a), across-the-board extraction has applied, an operation that usually requires a rather strict parallelism (see Williams (1978)). Similarly, one might argue that (156b) involves coordination of the equal constituents looked at and kissed. The latter conclusion is not necessary, however. What also could be involved is Right Node Raising, which would entail the following structure for (156b):6
(157) He either looked at t or kissed t Mary
Neither across-the-board extraction nor Right Node Raising requires full parallelism, as is clear from the following examples: (158)
a. Who did you see a picture of t and kiss t?
b. He either saw a picture of t or kissed t Mary
If there is no reason to reanalyze see a picture of as a complex verb in these cases, there is no reason to do so in the case of look at either. All in all, it seems to me that, apart from the absorption phenomenon in pseudo-passives, there is no evidence for reanalysis in cases like look at. If we compare the clusters that result from Verb Raising, we see that gapping (among other rules) does show that the derived structure involves a constituent (a complex V):
(159) dat hij Mary een boek zag lezen en Peter een krant
that he Mary a book saw read and Peter a newspaper
'that he saw Mary read a book and Peter a newspaper'
Contrary to what we saw in French and Italian, Dutch verb clusters cannot be broken up by adverbs and other material (apart from incorporations, to be discussed in a moment). There are good reasons, then, to assume that Dutch verb clusters form a constituent, something that Evers (1975) sought to explain by his adjunction analysis (which adjoins the embedded verb lezen to the right of the matrix verb zag in (159)). Haegeman and Van Riemsdijk (1986) reject the adjunction analysis without much argument. Instead, they assume reanalysis along the lines of Huybregts's covalency theory. Furthermore, they claim that the inversion of the verb order is carried out in the phonology. I have a preference for the adjunction theory, because adjunction is a normal structure-forming process that is needed in many other places in
the grammar. Verb movement in the phonological component, however, seems rather ad hoc and unmotivated, and (like the covalency theory) it increases the expressive power of the theory of grammar. The only argument against adjunction that Haegeman and Van Riemsdijk (1986) give is based on certain facts of Swiss German (Züritüütsch). It is based on the following underlying structure:
(160) das er em Karajan en arie vorsinge] chone] wil]
that he Karajan (dat.) an aria sing-for can wants
'that he wants to be able to sing an aria for Karajan'
In Swiss German, Verb Raising is possible with or without incorporation of objects. Furthermore, the order of the verbs is reversed, as in Dutch. For (160), this entails that en arie can be moved along or left behind under adjunction. According to Haegeman and Van Riemsdijk, there is one possible word order in Swiss German that cannot be derived from (160) by adjunction:
(161) das er em Karajan wil en arie chone vorsinge
The verbs are in the right order (the inverse of the underlying order in (160)), but it is allegedly impossible to place the NP en arie between wil and chone by adjunction and incorporation. If we start in the most deeply embedded S of (160), we can adjoin vorsinge under incorporation of en arie to the right of the next verb up (chone). Then, we repeat adjunction and adjoin the verbal complex from the intermediate cycle to the right of the matrix verb wil. The result would be
(162) das er em Karajan wil chone en arie vorsinge
and not the possible (161). Haegeman and Van Riemsdijk not only claim that the adjunction analysis cannot derive (161), they also claim that this sentence can be derived on the basis of their multidimensional representations. Even if the latter conclusion is right, it is easy to see that the former conclusion is wrong. Contrary to what Haegeman and Van Riemsdijk claim, an adjunction derivation of (161) is not problematic at all.
There are various ways to derive (161) and to maintain an adjunction analysis. In their argument against adjunction, Haegeman and Van Riemsdijk tacitly assume that the adjunction rule is also the operation that changes the linear order of the verbs (as in Evers (1975)). Earlier in their article, they suggest that the bewildering variety of linear order in Germanic verb clusters (see Den Besten and Edmondson (1981)) is brought about by inversion rules that work on the output of the adjunction rule. Let us assume that this is the correct approach and that the adjunction rule is basically the same for all varieties of Germanic:
(163) V^i V → [V V^i V]
This rule Chomsky-adjoins a verb projection V^i to the adjacent verb of the next higher S. The superscript i is a parameter and stands for the number of bars. In standard Dutch and German, the value of i is 0, i.e. only heads of type V undergo Verb Raising (VR). In many dialects of Germanic, such as Swiss German and West Flemish, the value of i is equal to or smaller than 2. This is what Haegeman and Van Riemsdijk call Verb Projection Raising (VPR). The final order of verbs in these languages can be accounted for by a very simple inversion rule on the outputs of (163). The rule that Haegeman and Van Riemsdijk give has essentially the following form (M stands for Modal, A for Auxiliary): (164)
[inversion rule display; recoverable symbols: V^i, V_M/A]
With these assumptions in mind, (161) can be derived as follows. First, we apply VPR to the underlying structure (160) (indicating the output of adjunction by brackets):
(165) ] (en arie vorsinge chone)] wil]
Then, at the intermediate cycle, we adjoin vorsinge to chone:
(166) ] (en arie (vorsinge chone))] wil]
Next, we apply VPR again and adjoin the whole cluster to the matrix verb wil:
(167) ]] ((en arie (vorsinge chone)) wil)]
We conclude the derivation by applying the inversion rule (164) to all verbs:
(168) ]] (wil (en arie (chone vorsinge)))]
This is the desired output (161). It is clear, then, that the conclusion that (161) cannot be derived by an adjunction analysis is based on rather
arbitrary assumptions. If we distinguish between the adjunction rule itself and the inversion rule, as Haegeman and Van Riemsdijk do in another context, the derivation of (161) is unproblematic. As it stands, there is no convincing evidence against an adjunction analysis for Germanic verb clusters. There is, however, some positive evidence that favors the adjunction analysis over the alternative based on reanalysis. Some of this evidence has already been discussed. Contrary to what we saw with the combination V + P in pseudo-passives, verb clusters appear to behave as a unit in rules like gapping. Here, I will add some evidence based on verb (projection) preposing and the A-over-A principle. The evidence in question makes use of the fact that VR is optional in certain cases in Dutch. Particularly, VR is optional with tensed verbs that select an infinitive without te (see chapter 3), if these infinitives themselves do not have a VR-complement. Thus, I assume that VR is optional in (169) and obligatory in (170):
(169) a. dat hij het boek lezen] wil]
that he the book read wants
'that he wants to read the book'
b. dat hij het boek t] wil lezen]
(170) a. *dat hij het boek lezen] t] zou willen]
that he the book read would want
'that he would like to read the book'
b. dat hij het boek t] t] zou willen lezen]
Furthermore, VR is optional for past participles, even if there are more embeddings than one: (171)
a. dat hij het boek gelezen] t] t] zou kunnen hebben]
that he the book read would can have
'that he might have read the book'
b. dat hij het boek t] t] t] zou kunnen hebben gelezen]
In all other cases, VR (and inversion) are obligatory in standard Dutch. Even if VR is optional in the cases just mentioned, the complements appear to be transparent. This is in accordance with what we saw in the Romance languages: transparency does not depend on the formation of a complex verb. That the complements in the optional VR-configuration are still transparent can be demonstrated with the usual phenomena. Er can be extracted and placed before the matrix sentence, for instance:
(172) dat er iemand [s een boek t over schrijven] wilde
that there someone a book about write wanted
'that someone wanted to write a book about it'
Furthermore, adverbs like waarschijnlijk 'probably' in these complements can have matrix scope, which is, as we saw above, only possible if the complements are transparent:
(173) dat hij [s het boek waarschijnlijk lezen] wilde
that he the book probably read wanted
'that he probably wanted to read the book'
The transparency facts with optional VR are compatible with the reanalysis theory of Haegeman and Van Riemsdijk because transparency is in their conception not caused by actual verb movements (VR or VPR) but by covalency, i.e. by an extra V in the added dimension. Problems for the reanalysis theory without adjunction arise from another property of Dutch, the possibility to front verbs or verb projections. This process, which we discussed in chapter 3 and section 5.2 above, preposes verbs (174a) or their projections (174b):7 (174)
a. lezen wilde hij dat boek niet
read wanted he that book not
'He did not want to read that book'
b. dat boek lezen wilde hij niet
that book read wanted he not
According to the adjunction analysis, verb clusters can be broken up only if the A-over-A principle is not violated.8 The Verb Second rule, for instance, can front a finite verb that is extracted from a verb cluster. The A-over-A principle is not violated, because there is always only one finite verb in a cluster, which is of course not dominated by a finite verb itself (see Koster (1975)). Another prediction of the adjunction analysis (in conjunction with A-over-A) is that verbs (or verb clusters) can be fronted if they have not undergone VR. We have just seen that VR is optional in certain cases in Dutch. The embedded verbs that have not undergone VR are not part of a cluster so that they can be fronted (as predicted): (175)
a. Lezen kan ik die boeken niet
read can I those books not
'I cannot read those books'
b. Gelezen heb ik die boeken niet
read have I those books not
'I have not read those books'
c. Gelezen zou ik die boeken willen hebben
read would I those books want have
'I would like to have read those books'
As expected, embeddings can be added with participles (175c). In all other cases, extra embeddings make VR obligatory and block extraction of the most deeply embedded verb:
(176) a. Ik zou die boeken niet kunnen lezen
I would those books not can read
'I would not be able to read those books'
b. *Lezen zou ik die boeken niet kunnen
(177) a. Je kunt hem die boeken maar zelden zien lezen
you can him those books only rarely see read
'Only rarely can you see him read those books'
b. *Lezen kun je hem die boeken maar zelden zien
With verbs like weigeren 'refuse', which always select verbal complements with te 'to', VR is always obligatory (with the complement to the left of the matrix verb): (178)
a. *dat hij dat boek altijd te lezen] weigerde]
that he that book always to read refused
b. dat hij dat boek altijd t] weigerde te lezen]
'that he always refused to read that book'
As predicted, preposing of the verb that has undergone VR (adjunction) is blocked by the A-over-A principle: (179)
*Te lezen weigerde hij dat boek altijd
It is not clear how all these facts can be accounted for without an adjunction analysis. A crucial point is that it is not possible to apply the A-over-A principle to reanalysis structures (V-clusters in the added dimension). Such an analysis would incorrectly exclude sentences like (175). Nevertheless, in this view it is necessary to assume reanalysis in these cases, because the transparency phenomena (172) and (173) show that what motivated reanalysis in the first place does apply. But (175) shows that reanalysis (the formation of a complex V in the added dimension) is not capable of preventing preposing of the embedded verb by exploiting the A-over-A principle. In short, adjunction feeds the A-over-A principle and therefore makes the right predictions. Reanalysis, on the other hand, does not feed A-over-A. This is an argument against reanalysis anyway, and besides, the differences with respect to verb extraction (explained by the optionality or obligatoriness of adjunction) remain unaccounted for. One of the core observations made by Haegeman and Van Riemsdijk
has to do with the behavior of quantifiers under VPR. Quantified NPs in transparent complements can have matrix scope, but incorporation of the quantified NPs under VPR destroys this possibility. After incorporation, the scope is limited to the embedded clause. Haegeman and Van Riemsdijk illustrate this with examples from West Flemish and Swiss German (which both have VPR). The following example is from West Flemish: (180) a.
da Jan vee boeken hee willen lezen
that John many books has wanted read
'that John has wanted to read many books'
b. da Jan hee willen vee boeken lezen
In (180a), only VR has applied, so that the NP vee boeken is not incorporated. In (180b), VPR has applied, so that the quantified NP is moved along with the verb. Interestingly, (180a) is ambiguous. According to one reading, vee boeken has narrower scope than the modal willen. In the other reading, vee boeken has wider scope than willen. In (180b), however, only one reading is available, namely the reading with narrow scope for vee boeken. The same fact can be observed in Züritüütsch: (181)
a. das de Hans vill buecher hat wele lase
b. das de Hans hat wele vill buecher lase
Haegeman and Van Riemsdijk give similar examples of scope difference for adverbials, and claim that the difference is systematic. They further claim that the rule for scope indexing introduced by H
Scope indexing applies freely when NPs belong to the same minimal S. Otherwise, NPj must be c-commanded by NPi in order to be indexed as being in the scope of NPi.
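The scope rule just stated can be illustrated with a small sketch. It assumes a bare tree encoding and the first-branching-node definition of c-command; the class and function names are hypothetical, and the two toy structures (a simplex clause and a biclausal one) are illustrations rather than analyses taken from the text.

```python
class Node:
    """A tree node that records its parent so ancestors can be walked."""

    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def ancestors(node):
    while node.parent is not None:
        node = node.parent
        yield node


def minimal_s(node):
    """Nearest S dominating the node, or None."""
    return next((a for a in ancestors(node) if a.label == "S"), None)


def dominates(a, b):
    return any(x is a for x in ancestors(b))


def c_commands(a, b):
    """Assumed definition: the first branching node above a dominates b,
    and neither of a and b dominates the other."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    branching = next((x for x in ancestors(a) if len(x.children) > 1), None)
    return branching is not None and dominates(branching, b)


def may_take_scope_over(np_i, np_j):
    """Free indexing inside one minimal S, c-command required otherwise."""
    if minimal_s(np_i) is minimal_s(np_j):
        return True
    return c_commands(np_i, np_j)


# Simplex clause: [S NP1 [VP V NP2]]
np1, np2 = Node("NP"), Node("NP")
simplex = Node("S", np1, Node("VP", Node("V"), np2))
simplex_both_ways = (may_take_scope_over(np1, np2), may_take_scope_over(np2, np1))

# Biclausal: [S NP3 [VP V [S NP4 [VP V]]]]
np3, np4 = Node("NP"), Node("NP")
embedded = Node("S", np4, Node("VP", Node("V")))
matrix = Node("S", np3, Node("VP", Node("V"), embedded))
biclausal_both_ways = (may_take_scope_over(np3, np4), may_take_scope_over(np4, np3))

print(simplex_both_ways, biclausal_both_ways)
```

In the simplex structure both scope orders are allowed regardless of c-command, while in the biclausal structure only the c-commanding matrix NP can take scope over the embedded one.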
If VPR creates a monosentential structure and if (182) also applies to the relative scope of elements other than NP, it incorrectly predicts that (180b) and (181b) will show scope ambiguity. Haegeman and Van Riemsdijk try to solve this problem by stipulating that scope depends solely on c-command. Furthermore, they make crucial use of Huybregts's covalency theory. They illustrate their solution with the following examples, in which (183a) is a VR-structure and (183b) a VPR-structure:
(183) [two-dimensional tree diagrams, not reproduced: a. VR-structure and b. VPR-structure for Jan PRO en hus kopen wilt (John a house buy wants), each with an upper and a lower tree over the nodes S, NP, VP, XP, V and VM]
In this view, scope depends entirely on c-command and both dimensions (the upper and the lower tree) may play a role. We now observe that the VR-structure (183a) represents the scope ambiguity in question. In the upper tree, VM c-commands the quantified NP en hus (XP), and in the lower tree this XP c-commands the modal VM. Both readings are therefore represented by an asymmetrical c-command relation. In the VPR-structure (183b), however, VM asymmetrically c-commands the XP in both dimensions. This entails that (183b) represents only one reading, namely the reading in which the modal has wider scope. I find this account rather complex and artificial. First of all, it ignores the well-known fact that c-command is irrelevant within the same simplex S. Thus, everyone loves someone is ambiguous in spite of the fact that
everyone asymmetrically c-commands someone. This fact is accounted for by Haïk's scope rule (182). It is crucial for Haegeman and Van Riemsdijk to ignore this fact, but unfortunately they give no motivation. Furthermore, a whole new dimension is created for a few problematic facts (transparency and the loss of transparency after VPR). In general, it is not satisfactory to increase the expressive power of linguistic theory for a few isolated facts. If the same facts could be accounted for without invoking new dimensions, we would have a better theory. Fortunately, such a theory already exists.
5.7. Transparency without reanalysis
The alternative theory that I have in mind is the theory of dynasty-controlled domain extensions discussed in chapter 4. It seems to me that so-called reanalysis or restructuring is just one instance of a much more general pattern. In the unmarked case (of the Bounding Condition), a domain is defined by just one governor: a dependent element must have an antecedent in the minimal Xmax that contains its governor. The intuitive content of the dynasty idea is that in the marked case, a domain can also be defined by a set of governors that are in a certain relation. More precisely, a dynasty is an ordered n-tuple of subjacent domain governors with the governor of the dependent element as the last member (= n). The theory of dynasties is a generalization of Kayne's theory of percolation projections or g-projections (Kayne (1984)). In chapter 4, examples were given of the dynasty that expands the domain for Wh-traces. What the members of this dynasty have in common is their equal orientation: they govern in the same direction. In the next chapter, some dynasties for long reflexivization in Icelandic and other languages will be discussed. As we will see, domain extensions for reflexives are determined by verbs of a certain type. It seems to me that the restructuring phenomena studied by Rizzi and others are hardly different. In all cases of transparency, certain elements have a larger domain than their minimal Xmax, i.e. a larger domain than they would have in the unmarked case characterized by the Bounding Condition. Furthermore, all cases of reanalysis involve a series of subjacent domain governors. With the exception of pseudo-passives, these domain governors even form a series of Vs (as in long reflexivization). As I will show, there is strong evidence for this view of "restructuring" in the fact that to the extent that transparency phenomena in VR-complements involve movement, the Principle of Global Harmony applies.
For ease of exposition, it is therefore necessary to give a very brief recapitulation of the crucial features of the theory of chapter 4. According to the theory of Global Harmony, an empty category must be bound
(184) a. in its minimal Xmax (NP, PP, AP, S'), or
b. in accordance with the Principle of Global Harmony.
The Bounding Condition (184a) defines the unmarked domain and (184b) formulates necessary conditions for domain extension (percolation). Extraction from PPs (stranding) violates (184a) but is permitted in a few languages that have structurally-governing prepositions and that can conform to the Principle of Global Harmony: (185)
[s' What have you talked [pp about t]]
English has the right type of prepositions and the Principle of Global Harmony can be met: both the preposition about and its domain governor (the V talked) govern to the right. Dutch also has the right prepositions, but the Principle of Global Harmony cannot be met in the equivalent of (185): (186)
*[Wat heb je[pp over t] gepraat]
In this example, the preposition governs to the right and the verb to the left. The sentence is therefore rejected by the Principle of Global Harmony. Only with postpositions can the principle be satisfied: (187)
[Waar heb je [pp t over] gepraat]
where have you about talked
'What did you talk about?'
Both the postposition and the verb govern to the left in this case. In the present context, it is crucial to recall from chapter 4 that percolation is only possible from projections in D-structure position. As soon as a PP is moved to a derived position, stranding is no longer possible, even if the direction of the governors remains the same: 9 (188)
a. He talked t yesterday about his brother
b. *Whoj did he talk t yesterday about tj
(189) a. You thought that to him nothing was clear t
b. *Whoj did you think that to tj nothing was clear t
Also for Dutch, this phenomenon was documented in chapter 4. Thus, after PP extraposition, prepositions still cannot be stranded, even if the direction of the governors superficially looks right: (190)
a. Hij heeft t gepraat over zijn broer
he has talked about his brother
b. *Wiej heeft hij t gepraat over tj
who has he talked about
Also a small step from the D-structure position to the left makes PPs percolation islands: (191) a.
Hij heeft daar over een boek t gelezen
he has there about a book read
'He read a book about that'
b. *Waarj heeft hij tj over t een boek gelezen
where has he about a book read
'What did he read a book about?'
From the D-structure position, this type of percolation (stranding) is possible: (192)
Waar heeft hij een boek t over gelezen?
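The two conditions on stranding reviewed so far (uniform direction of government, and a PP that is still in its D-structure position) can be summarized in a minimal sketch. It is a deliberate simplification: only the adposition and the verb are compared, whereas a dynasty may contain more governors, and the record type and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrandingCase:
    example: str               # example number in the text
    adposition_direction: str  # "left" or "right": direction in which P governs
    verb_direction: str        # direction in which the domain governor V governs
    pp_in_base_position: bool  # False once the PP has been moved


def stranding_predicted(case):
    """Stranding is predicted only if both governors point the same way
    and the PP has not left its D-structure position."""
    harmonic = case.adposition_direction == case.verb_direction
    return harmonic and case.pp_in_base_position


CASES = [
    StrandingCase("185", "right", "right", True),    # What have you talked about t
    StrandingCase("186", "right", "left", True),     # *Wat heb je over t gepraat
    StrandingCase("187", "left", "left", True),      # Waar heb je t over gepraat
    StrandingCase("188b", "right", "right", False),  # *Who did he talk t yesterday about t
]

PREDICTIONS = {case.example: stranding_predicted(case) for case in CASES}
print(PREDICTIONS)
```

The values assigned to each case follow the description in the text: English about and talked both govern to the right, the Dutch preposition over governs to the right while the verb governs to the left, the postposition in (187) governs to the left, and in (188b) the PP has been extraposed.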
In other words, domain extension is bound by a freezing principle. If a projection binds a trace, it becomes a percolation island. Pending a deeper explanation, we can formulate the generalization as follows:
(193) Xmax is a percolation island if it (or one of its subprojections) binds a trace.
According to this condition, the Bounding Condition (184a) applies, unless a path is trace-free. Only with trace-free paths can the minimal domain be extended in accordance with the Principle of Global Harmony. Ultimately, then, the ungrammaticality of the (b) sentences of (188-191) is explained by the Bounding Condition. I will now show that the so-called reanalysis facts fit this existing percolation theory. In this section, I will limit myself to pseudo-passives and the transparency facts of V(P)R-complements. Some Romance facts will be discussed in the next section. As for pseudo-passives, it is important to bear in mind that a theory of syntactic structure can only formulate necessary conditions. Quite generally, passivization is not only determined by syntactic structure but also by lexical and semantic factors. On the other hand, it is not a priori clear how the boundaries between the various modules must be drawn. Thus, Chomsky (1965, 218, n. 27) observes a contrast between the following two sentences: (194)
a. England was lived in by many people
b. England was died in by many people
The first sentence is much more natural than the second. In this case, the distinction might be syntactic after all. Evidence from Dutch shows that sterven 'die' is an unaccusative verb, while live (in the intended sense) might be an unergative. In any case, wonen 'live' forms its perfect tense
with hebben, while sterven 'die' selects zijn:
(195) a. Hij heeft in Engeland gewoond
he has in England lived
'He has lived in England'
b. Hij is in Engeland gestorven
he is in England died
'He died in England'
The fact that unaccusatives like die cannot be passivized can be entirely explained in syntactic terms (see Perlmutter and Postal (1984)). But if passivization is constrained by purely semantic factors (for instance, "property interpretation" in the sense of Fiengo (1974)), these factors are neutral between a syntactic analysis in terms of restructuring and an analysis in terms of dynasty-controlled domain extension. An essential feature of passivization is the absorption of Case by the passive morphology. This absorption can be seen as a strictly local dependency relation between the passive verb and the Case-absorbed NP. Pseudo-passivization is a violation of the Bounding Condition: (196)
[s' Mary was looked [pp at t]]
The trace is not bound in its minimal domain PP, so that a marked domain extension must be involved. If the absorption relation is a relation between looked and the trace, it also violates the Bounding Condition. The domain extension for the trace is quite similar to what we saw for Wh-traces in chapter 4, i.e. the normal conditions on preposition stranding must be met. The relevant principle is the Principle of Global Harmony, which is satisfied in (196): both at and its minimal domain governor looked govern to the right. As was already mentioned at the beginning of section 5.5, the Principle of Global Harmony cannot be met in the Dutch translation of (196) ((128), repeated here for convenience): (197)
*Marie werd [pp naar t] gekeken
Mary was at looked
In fact, Dutch has no pseudo-passivization at all. The following example (198a) looks like a pseudo-passive, but it is just an impersonal passive with fronted er. This example does not involve absorption, as can be concluded from the fact that er may remain in its PP-internal D-structure position (198b):
(198) a. Er werd [pp t naar] gekeken
there was at looked
'It was looked at'
b. Gisteren werd [pp er naar] gekeken
yesterday was there at looked
'Yesterday, it was looked at'
The contrast between (197a) and (198a) shows that NP-movement (or er-movement) has to meet the same criteria as Wh-movement: violation of the Bounding Condition is possible only if the successive governors of a dynasty govern in the same direction. The absorption relation also involves the dynasty (looked, at). In fact, it seems appropriate to say that this dynasty governs the trace. Thus, in the unmarked case, a dependent element is governed by a single governor, but in the marked case a whole dynasty may govern a trace. This formulation has several advantages. First of all, we can maintain the simplest domain statement for traces: a trace must be bound in its governing category. As before, the governing category is defined as the minimal domain containing the dependent element and its governor. If there is a single governor, the domain is small; if the governor is a dynasty, the domain can be large. A further advantage of accepting dynasties as governors is that it becomes possible, in principle, that everything that must happen under (normal) government can also happen under dynasty-government. Thus, a nontrivial consequence of our formulation might be that Case assignment (and absorption) can also depend on dynasties. It is my claim that this is exactly what we see in pseudo-passivization and also in the French faire-construction, to be discussed in the next section. So, consider (199a) and its passive counterpart (199b):
(199) a. John looked [pp at Mary]
b. Mary was looked [pp at t]
The verb look does not assign Case by itself, but the dynasty (look, at) does assign Case. Thus, Mary in (199a) receives Case from its governor, just like Mary in John saw Mary. In the former case, the governor is a dynasty, while in the latter case the governor is a single head. Similarly, absorption in (199b) is controlled by the governing dynasty. Dynasty formation is the general pattern of domain enlargement that applies to both Wh-movement and passivization. But there is a difference with respect to the role of the lexicon. Lexical factors do not play an evident role in the formation of extended domains for Wh-traces, but passives are somehow lexically constrained. The reason is that passivization has to do with Case assignment and absorption, which are lexical processes by definition. It is the typical function of the lexicon to specify which governors assign which kind of Case. So, to the extent that dynasties assign Case, they must be mentioned in the lexicon. Thus, alongside statements like (200a), we also find such specifications for dynasties as (200b):
(200) a. see → objective Case
b. (look, at) → objective Case
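The two lexical statements in (200) can be pictured as entries of one and the same table, keyed either by a single head or by a dynasty. The following is only an illustrative sketch: the dictionary `CASE_LEXICON` and the function `assigned_case` are hypothetical names introduced here, not part of the theory as stated in the text.

```python
# Toy lexicon (hypothetical): a governor is either a single head (a string)
# or a dynasty (a tuple of heads). Only the two entries of (200) are listed.
CASE_LEXICON = {
    "see": "objective",            # (200a): single governor
    ("look", "at"): "objective",   # (200b): dynasty as governor
}


def assigned_case(governor):
    """Return the Case listed for this governor, or None if none is listed."""
    return CASE_LEXICON.get(governor)


print(assigned_case("see"))             # single head
print(assigned_case(("look", "at")))    # dynasty
print(assigned_case("look"))            # look alone assigns no Case
```

On this picture, look by itself has no entry, while the dynasty (look, at) does, which mirrors the claim that the verb does not assign Case on its own.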
In neither case is it of course necessary to specify for each individual item that it assigns the Case in question. In principle, (200a) is just an instance of the rule that applies to all (or most) transitive verbs. Similarly, (200b) might be an instance of the rule that holds for all verbs that are followed by a preposition that governs in the same way (along the lines of Kayne (1981)). In any case, it is important to distinguish two independent factors in pseudo-passivization: (i) the general syntactic mechanism of dynasty-controlled domain extension, and (ii) the lexical fact that certain dynasties assign Case. The first factor is exactly the same as in Wh-movement. In both cases the conditions of the Principle of Global Harmony must be met. The second factor is specific to the passive construction. 1 If we evaluate the theory presented here in comparison with other theories, these two factors must be carefully kept apart. Thus, other theories involve complete amalgamation of the V and the P involved in pseudo-passivization (Hornstein and Weinberg (1981)) or assignment of the category V to V + P in the extra dimension (Haegeman and Van Riemsdijk (1986)). Amalgamation or addition of an extra V do not occur spontaneously, but are triggered by the same lexical factors that determine pseudo-passivization in the theory presented here. The lexical factors, then, are neutral between the three theories. What has to be compared are the following three mechanisms:
(201) a. amalgamation rules
b. extra rules in an added dimension
c. dynasty-controlled domain extension
It seems to me that theory (201b) is the worst of these three theories. It is not only ad hoc, it also weakens linguistic theory in a rather drastic way. Both (201a) and (201b) are empirically falsified by the fact that the combination V + P does not behave as a constituent of type V in any rule. Theory (201c) has none of these disadvantages: it is based on an independently motivated mechanism (see chapters 4 and 6) and it does not say that V + P forms a constituent in pseudo-passives. Both amalgamation (201a) and Huybregts's application of the Lasnik/Kupin formalism (201b) require adjacency of the "reanalyzed" terms in order to prevent unwanted overlaps in structure. This means that these types of reanalysis cannot be involved in preposition stranding under Wh-movement (see also Van Riemsdijk (1978)). In Dutch Wh-movement, the stranded preposition and the verb need not be adjacent:
(202) a. Waar heeft hij aan zijn dissertatie t mee gewerkt?
where has he on his dissertation with worked
'With what did he work on his dissertation?'
b. Waar heeft hij t mee aan zijn dissertatie gewerkt?
The second sentence (202b), in which the P mee and the V gewerkt are not
adjacent, even sounds somewhat more natural to my ear. This evidence is similar to what we saw for faire + V and other Romance V series: neither preposition stranding nor the faire-infinitive construction involves complete amalgamation. On the other hand, pseudo-passives do usually involve adjacency of P and V. Thus, one might argue that the following contrast slightly favors amalgamation (cf. Chomsky (1981b, 123)):
(203) a. John was spoken to
b. *John was spoken angrily to
It seems to me, however, that this contrast can be explained by independent factors. Suppose that we only allow binary branching (in the sense of Kayne (1984)). In that case, it is reasonable to assume that the to-phrase in (203) is a D-structure sister of the verb: contrary to angrily, this PP is a complement of the verb. Together with binary branching, this entails that the to-phrase is no longer a sister of the verb in (203b). If the PP has been moved to the right, it binds a trace in the derived structure:
(204) John was spoken tj angrily [to t]j
According to (193), the PP becomes a percolation island as soon as it binds a trace (compare also (188)). In sum, adjacency facts do not automatically support theories that take restructuring as an amalgamation process. I will now return to the contrasts in transparency between VR-complements and extraposed complements in Dutch (and German). Recall that the following contrasts were observed:
(205) a. Ik denk dat er iemand [s' t over] probeerde te schrijven
I think that there someone about tried to write
'I think that someone tried to write about it'
b. *Ik denk dat er iemand probeerde [s' (om) t over te schrijven]
(206) a. dass es der Hans [s' t zu begreifen] versucht
that it the John to understand tries
'that John tries to understand it'
b. *dass es der Hans versucht [t zu begreifen]
(207) a. Ik denk dat Peter [s' het boek waarschijnlijk] probeerde te lezen
I think that Peter the book probably tried to read
'I think that Peter probably tried to read the book'
b. *Ik denk dat Peter probeerde [s' het boek waarschijnlijk te lezen]
The contrasts in (205) and (206) follow straightforwardly from the Principle of Global Harmony. In (205a), the trace is not bound in its minimal domain, but domain extension is possible because all relevant governors govern to the left (over, the verbs schrijven and proberen). In (205b), however, the matrix verb governs to the right (or not at all, if we assume that the extraposed complement is in a derived position). The same explanation holds for the contrast between (206a) and (206b). Assuming that in the unmarked case adverbs like waarschijnlijk have scope over the minimal S' that contains them, we can explain the contrast in (207) in a similar way: on the left of the verb, percolation is possible because the complement is in its D-structure position. After extraposition, the minimal S' containing waarschijnlijk becomes a percolation island according to (193). Given the availability of this type of explanation, we see that the interesting scope phenomena that Haegeman and Van Riemsdijk observed in Swiss German are already accounted for. In order to see this, we must reconsider (180a):
(208) [s' da Jan [s' vee boeken] hee willen lezen]
that John many books has wanted read
'that John has wanted to read many books'
The quantified NP vee boeken is in its D-structure position in its minimal domain S'. As in (207a), scope percolation is possible because this minimal domain is governed on the left of the matrix verb. In the narrow scope reading, the modal willen has scope over vee boeken. In the wide scope reading, willen can also be in the scope of the quantified NP. VPR has the same effect as extraposition, in that it creates a trace:
(209) da Jan [s' tj] [hee willen [v' vee boeken lezen]j]
As before, the binding of a trace creates a percolation island. Domain extension is not possible from derived positions according to (193). It seems, therefore, that the scope facts are already explained by existing and independently motivated theories. It is not necessary at all to weaken the theory of grammar by adding non-tree-representable dimensions to it. Very similar scope phenomena can also be observed in standard Dutch. Recall the various forms of verb (projection) preposing that we discussed before:
(210) a. Hij wil maar zelden [boeken lezen]
he wants only rarely books read
'Only rarely does he want to read books'
b. [Boeken lezen]j wil hij maar zelden tj
Like VPR, this type of movement leads to disambiguation in certain cases.
Thus, the following sentence, with the VP in D-structure position, is ambiguous:
(211) [s' Hij wil [s [vp veel boeken lezen]]]
he wants many books read
'He wants to read many books'
Veel boeken either has scope over the embedded clause or over the matrix clause. After VP-preposing, however, the sentence becomes unambiguous:
(212) [vp veel boeken lezen]j wil hij [s tj]
In this case, only the narrow scope reading is possible. As before, this is explained by the fact that categories that bind traces are percolation islands. Note that it is also impossible to prepose a VP with an adverb that has matrix scope: (213)
a. Hij wil [s [het boek helaas lezen]]
he wants the book unfortunately read
'Unfortunately, he wants to read the book'
b. *[Het boek helaas lezen]j wil hij [s tj]
Contrary to what we see in English, adverbs with matrix scope like helaas 'unfortunately' can occur in embedded clauses in Dutch (see (213a)). This is again explained by the familiar percolation mechanism: VR-complements are governed on the left of the matrix verb and are transparent. As soon as the VP with the adverb is preposed, the sentence becomes ungrammatical. The VP binds a trace, so that according to (193) the minimal domain S' becomes a percolation island. All transparency phenomena in VR-complements are explained by the fact that a dynasty of Vs can be formed on the left of matrix verbs. It is only there that the complements are properly governed, which is a necessary condition for dynasty formation. Movement of the minimal domain S' (by extraposition) or one of its subprojections (VPR or VP-preposing) creates percolation islands in accordance with (193). The island condition (193) is independently motivated by the facts of preposition stranding under Wh-movement. It must be concluded, then, that restructuring in the sense of creation of a new constituent is motivated neither by pseudo-passives nor by the transparency facts of German and Dutch.
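The interplay of the island condition (193) and the Principle of Global Harmony, as used in this section, can be summarized in a small sketch. It is a deliberately simplified model under stated assumptions: the class `Link` and the function `can_extend_domain` are hypothetical names, a path is reduced to the list of successive governors with their direction of government, and the directions given for the Dutch and English examples are assumptions read off the discussion above.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One step on the path from a trace to its antecedent (toy model)."""
    governor: str
    direction: str                      # "left" or "right"
    domain_binds_trace: bool = False    # projection is in a derived position


def can_extend_domain(path):
    """Domain extension needs a trace-free path (193) whose successive
    governors all govern in the same direction (Global Harmony)."""
    if any(link.domain_binds_trace for link in path):
        return False                    # percolation island, cf. (193)
    return len({link.direction for link in path}) == 1


# (196) Mary was looked [pp at t]: at and looked both govern to the right.
english = [Link("at", "right"), Link("looked", "right")]
# (197) *Marie werd [pp naar t] gekeken: P governs right, V governs left.
dutch = [Link("naar", "right"), Link("gekeken", "left")]
# (204) the moved to-phrase binds a trace and is therefore an island.
moved_pp = [Link("to", "right", domain_binds_trace=True), Link("spoken", "right")]

print(can_extend_domain(english), can_extend_domain(dutch), can_extend_domain(moved_pp))
```

The sketch says nothing about how paths or directions are computed from a tree; it only records the two necessary conditions that the text appeals to.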
S.S. Restructuring in French
One of the much discussed properties of the French faire-infinitive construction is the obligatory reversal of the mutual order of the embedded subject and the following verb:
(214) a. *Elle fera ses amis partir
she'll have her friends leave
b. Elle fera partir ses amis
Italian restructuring often does not involve change of word order, but there are facts that are quite similar to what we see in the French causative construction (Burzio (1981, 551)):
(214) c. *Lo è andato Giovanni a prendere
it has gone Giovanni to fetch
'Giovanni has gone to fetch it'
d. Lo è andato a prendere Giovanni
Contrary to Rizzi (1978a), Burzio concludes that (214) (and the Italian equivalent with fare) show the same phenomenon, which he describes as VP-movement. If this conclusion is correct, restructuring in French and Italian always involves a change in constituent order. In what follows, I will discuss the nature of this constituent order change and try to determine which factors govern transparency in the Romance constructions in question. Also in French, a whole VP can seem to appear to the left of the subject:
(215) Elle fera lire le livre à Jean
she'll have read the book to John
'She'll have John read the book'
Contrary to phrases introduced by par 'by' in the faire-par construction, the phrase à Jean functions as an external subject in several respects (see Kayne (1975)). Zubizarreta (1985) claims, for instance, that the reflexive se is always bound by an external subject, which shows that Jean is an external subject in (216):
(216) On a fait se laver les mains à Jean
one has had himself wash the hands to John
'We had John wash his hands'
Since Kayne (1975), many authors have described the change in word order as a leftward movement of V (+ NP). The intuition underlying this idea is probably the fact that the postverbal NPs (ses amis in (214) and Jean in (216)) continue to function as a subject. Rouveret and Vergnaud (1980) take over the idea with some changes but without further arguments. Burzio (1981) and Zubizarreta (1982) adopt Kayne's idea with some new arguments, to which I will return. In what follows, I will adopt an alternative analysis. According to this analysis, the change in word
order is not brought about by leftward movement of V (+ NP), but by rightward movement of the subject. This alternative has a tradition of its own and has been proposed by Goldsmith (1975), Beaubien et al. (1976), and Milner (1979), among others (see also Hulk (1982, 185ff.)). For a number of reasons, the idea of V (+ NP)-movement is not satisfactory. First of all, it has never become quite clear what has to be moved. Movement of the whole VP often gives an ungrammatical output. Perhaps for this reason, Kayne (1975) moves the V with or without the NP. If the NP is moved along, no reference is made to the node VP and is adjoined to the underlying subject. As we mentioned, Burzio (1981) moves the whole VP, but without solving the problems inherent in this. In fact, his raising of the lower VP into the higher VP violates the Projection Principle, as has been observed by Zubizarreta (1982). Zubizarreta emphatically rejects movement of the whole VP. But movement of only a subprojection of the VP leads to another problem, namely that it ignores the fact that movement usually only involves maximal projections or heads. 11 In my opinion, the most unsatisfactory aspect of these various V-movement rules is that they seem superfluous in many cases. The problem is that (214b) and (215b) can also be generated by movement of the subjects to the right. Both French and Italian already have several independently necessary rules that move subjects to the right. Thus, French has stylistic inversion (Kayne and Pollock (1978)) in (217) and NP-extraposition in (218) (Kayne (1975, 330)):
(217) a. Quand est parti Jean?
when has left John
'When did John leave'
b. le livre qu'a lu Jean
the book that has read John
'the book that John has read'
(218) a. Il est arrivé trois femmes
there have arrived three women
b. Il a été mangé plusieurs tartes
there have been eaten several pies
'Several pies have been eaten'
Also Italian has several well-studied rules that move NPs to the right (see for instance Burzio (1981)):
(219) a. Giovanni arriva
Giovanni arrives
b. Arriva Giovanni
(220) a. Giovanni telefona
Giovanni telephones
b. Telefona Giovanni
(221) a. Giovanni scrive una lettera
Giovanni writes a letter
b. Scrive una lettera Giovanni
The last example shows a difference when compared to the inversion in Italian fare-constructions: (222)
Farò leggere il libro *(a) Mario
I'll have read the book to Mario
'I'll have Mario read the book'
As in French, the postposed subject obligatorily has a attached to it. This is impossible in (221b), and this fact has played a role in the assumption that different processes are involved in the two constructions. Kayne (1975, 327ff.) gives detailed motivation for preferring a V-movement rule over an NP-extraposition rule. One major argument is the Specified Subject Condition, which according to Kayne (1975) constrains the options for clitic climbing in French. If the subject NP moves to a VP-internal position, there is no longer a structural subject according to this argument. Since the SSC requires a subject in the relevant positions in derived structures, the NP subject cannot have been moved in the complements of faire. Apart from the question whether NP-extraposition really bleeds the structure relevant for the SSC, the argument has no obvious force in present GB terms. The SSC is incorporated into the binding theory of Chomsky (1981b), which is a theory of A-binding and not of the A'-binding involved in cliticization. Moreover, the facts that are supposed to follow from the SSC follow from other principles in a less problematic way, which I will demonstrate in a moment. Kayne further points out that à can be inserted only in the faire-construction. NP-extraposition that leads to the forbidden sequence V NP NP cannot be saved by insertion of à:
(223) a. *Il mangera cette tarte trois filles
there will eat this pie three girls
b. *Il mangera cette tarte à trois filles
Furthermore, stylistic inversion exceptionally allows the sequence V NP NP in cases like (224): (224)
Le jour où a pris fin la deuxième guerre mondiale
the day when has taken end the second war world
'The day when the Second World War came to an end'
Insertion of à (to the second NP) leads to an entirely ungrammatical sentence:
(225) *Le jour où a pris fin à la deuxième guerre mondiale
Of course, these arguments say nothing against NP-extraposition in faire-constructions. Faire-infinitives with intransitive verbs do not involve à anyway. Apart from this, the sentences only show that there is a difference in that à is inserted only with faire. The fact that the presence of faire triggers à-insertion can be completely independent from the question whether faire-complements have NP-extraposition or not. A final difference in comparison with other cases of NP-extraposition concerns the fact that the NP in faire-complements can be followed by predicate adjectives:
(226) Cela fera devenir Jean complètement fou
that will make become John completely crazy
'That will make John completely crazy'
In other cases of NP-extraposition this sequence is impossible:
(227) *Il est devenu trois filles complètement folles
there have become three girls completely crazy
In what follows, I will show that all the differences follow from the presence of faire, which is responsible for the insertion of à and certain changes in word order. In no way can the observed differences be taken as an argument against the involvement of NP-extraposition in the faire-construction. Burzio (1981) claims that the restructuring facts described by Rizzi (1978a) must be accounted for by the same VP-fronting rule that also applies for the complements of Italian fare and French faire. He opts for VP-fronting (and thereby rejects NP-extraposition) on the basis of certain dativization facts in Italian (1981, 555ff.). According to Burzio, a must be inserted if the NP subject of the complement immediately follows an NP or S. He considers the following structure:
(228) Faccio [vp venire Giovanni [s PRO a prenderlo NP]]
I made come Giovanni to fetch it
'I will have Giovanni come to fetch it'
The clitic lo cannot be extracted from this structure:
(229) *Lo faccio venire (a) Giovanni a prendere
This is explained in Burzio's terms by the fact that restructuring has not applied: the VP with the source of the clitic has not been fronted across the subject Giovanni. After this operation, the sentence becomes grammatical:
(230) Lo faccio venire a prendere a Giovanni
Burzio accounts for the insertion of a in (230) by the presence of the NP-trace (see (228)) after prendere. Not only NPs but also clauses trigger the insertion of a:
(231) Ciò farà sognare [s di vincere] a Giovanni
this will make dream to win to Giovanni
'This will make Giovanni dream of winning'
Given the fact that a-insertion is triggered after S, the following sentence shows, according to Burzio, that movement of VP (and not NP-extraposition) is the right rule:
(232) Faccio venire a lavorare (*a) Giovanni
I make come to work Giovanni
If Giovanni had moved to the right, the clause preceding it would trigger a-insertion. But as a matter of fact, a-insertion is prohibited. Note, however, that (232) shows only that Giovanni is not preceded by an S [a lavorare]. Nothing prevents Giovanni (the underlying subject of venire) from being adjoined to the lower V lavorare. Precisely because of the fact that the clause of lavorare is transparent, the lowered Giovanni could still be properly linked to its underlying subject position (as in telefona Giovanni) - from where it binds the pro subject of lavorare. If this analysis is correct, the dativization facts of Italian do not favor VP-preposing over NP-extraposition. Still another argument for V-projection preposing is given by Zubizarreta (1982, ch. 4). According to this argument, a non-maximal V-projection must be preposed in the complements of verbs like faire and voir in order to meet the subcategorization needs of these verbs; i.e. it is claimed that these verbs are subcategorized for a small clause projected from V. In order to turn the S-complements into small clauses of this type, a nonmaximal V-projection must be moved to the head position (= COMP) of these Ss, so that they themselves become projections of V. This argument is not convincing, because V-projection preposing is optional with verbs like laisser and voir. It is true that the word order change triggers transparency, but it is not clear why the preposing should change the categorial status of the complement. Moreover, there are many other languages, such as Dutch or German, in which the naked infinitives of causative verbs and perception verbs are transparent without V-projection movement to S.
As we have seen before, these complements are even transparent if Verb Raising does not apply:
(233) dat hij [s Peter het boek waarschijnlijk lezen] zag
that he Peter the book probably read saw
The complement is transparent, which can be concluded from the fact that the adverb waarschijnlijk has matrix scope in spite of the fact that Verb Raising has not applied. All in all, it seems to me that there are no convincing arguments in the literature that favor V-projection preposing over NP-extraposition. I will now sketch an alternative analysis of the faire-infinitive construction which is based on NP-extraposition, an operation that is needed anyway. Furthermore, I will make use of the fact discussed in the preceding section that Case can be assigned by dynasties. It has convincingly been demonstrated that the combination faire + V does not form a constituent in French, but from this it does not follow that the combination cannot assign Case. On the contrary, if a dynasty is a governor, we expect instances of Case assignment by dynasties. This is what we claimed for combinations like look + at in English, and we will make a similar claim for the dynasty (faire, V). There is even some evidence that dynasties can select arguments that are not compatible with any of the constituting verbs alone. In Dutch, the causative verb laten 'let' is in the same class as verbs of perception. As we have seen in the preceding sections, these verbs select te-less transparent infinitival complements like faire and voir in French. As we saw in section 5.2, the Dutch verbs in question may select complements without an external argument, like the French faire-par construction. Interestingly, there are also examples that look like the French faire-à construction, as has been observed by Bordelois (1974), Coopmans (1985), and others:
(234) a. Marie liet de taart proeven aan Peter
Mary let the pie taste to Peter
'Mary let Peter taste the pie'
b. Marie liet het boek lezen aan Peter
Mary let the book read to Peter
'Mary let Peter read the book'
c. Marie liet haar antwoord weten aan Peter
Mary let her answer know to Peter
'Mary let Peter know her answer'
These sentences have paraphrases with Peter in the normal subject position:
(235) a. Marie liet Peter de taart proeven
b. Marie liet Peter het boek lezen
c. Marie liet Peter haar antwoord weten
In spite of the superficial similarities between the sentences in (234) and the French faire-à sentences, the two constructions cannot be considered identical. First of all, the Dutch construction is limited to a rather small class of verbs. Basically, only stative verbs of cognition may figure in
complements of laten with aan. Verbs of action do not occur in this construction. 12
(236) a. *Zij liet de boerderij bezoeken aan haar ouders
she let the farm visit to her parents
b. Elle a fait visiter la ferme à ses parents
she had visit the farm to her parents
'She had her parents visit the farm'
(237) a. *Zij liet de verklaring tekenen aan haar echtgenoot
she let the declaration sign to her husband
b. Elle a fait signer la déclaration à son mari
she had sign the declaration to her husband
'She had her husband sign the declaration'
The exact nature of the difference between French and Dutch in this respect is a largely unexplored area. In any case, it is clear from the examples (which can easily be multiplied) that there are considerable differences between the constructions in the two languages. Another difference is that reflexives are not very good in the Dutch complements: (238)
*We lieten zich die dag herinneren aan Jan
we let REFL that day remember to John
This fact suggests that the Dutch aan-phrase is not (connected with) an external argument like its French counterpart. What is interesting about the Dutch construction is that it can select arguments that cannot occur with each of the verbs separately:
(239) Hij liet doorschemeren aan Jan dat hij ziek was
he let glimmer to John that he sick was
'He hinted to John that he was sick'
In this case, Jan cannot be interpreted as the subject of the embedded verb:
(240) *Jan schemerde door dat hij ziek was
Nor is it possible to have another subject, with Jan in the VP-internal aan-phrase (het = it):
(241) *Het schemerde door aan Jan dat hij ziek was
Obviously, the argument aan Jan belongs to the combination laten doorschemeren. Another example is the following (hem = him):
(242) a. Hij liet het zich aanleunen
he let it REFL lean against
'He took it as his due'
b. *Het leunde hem aan
Apparently, there are arguments that only occur with the combination laten + V. There are also examples in which the embedded subject occurs independently, but in which the meaning of laten + V cannot be reduced to its parts:
(243) Hij liet haar de stad zien
he let her the town see
'He showed her the town'
The combination laten + zien does not mean 'cause to see', but 'show'. In this case, it is also reasonable to assume that the thematic roles are assigned by the dynasty (laten, zien). In spite of these phenomena, it is reasonable to assume that laten and its complements form bisentential structures. This is clear from examples like:
(244) Jan liet Peter Marie een boek geven
John let Peter Mary a book give
'John let Peter give a book to Mary'
With a monosentential analysis, either we must decide that there are verbs with four arguments, which is not generally true, or we must assume that the subject Peter is at the same time a complement of the higher V and the subject of the lower VP. This latter option violates the Projection Principle. Nevertheless, it must be concluded that dynasty formation can be a step in the direction of full clause union. A dynasty can, like any other governor, assign Case or θ-roles. Given this fact, there is nothing against the idea that a dynasty with the form (faire, V) assigns Case, or even θ-roles. Here, I will limit myself to the Case option. Apart from this, I will assume that (faire, V)-dynasties create certain templates in French. The most important template goes together with the special Case assignment of (faire, V) itself. I will return to this feature in a moment. The other template can best be formulated as a filter for the time being: (245)
*[ ... Vi ... Case ... Vi ... ]
The indices i indicate that the two verbs are members of the same dynasty. What the filter says is that two members of a V-dynasty may not be interrupted by a Case-bearing element such as a lexical NP, a clitic, or a
trace of these. It accounts for sentences like the following: (246)
*Elle fera Jean partir
she'll have John leave
In this case, I assume that the filter (245) prevents Jean from being moved from its D-structure position in the domain of partir to the external subject position of the complement. Thus, I assume that the correct order (247) reflects the D-structure order of ergative verbs like partir (see Burzio (1981)):

(247) Elle fera [NP [VP partir Jean]]

As with other unaccusatives, the embedded subject is not the external argument. There is only one θ-role, which is assigned inside the VP. For unergatives, I assume the following structure:

(248) Elle fera [NP_i [VP [VP téléphoner] Jean_i]]

In the familiar way, this structure differs from (247) in that the subject Jean is adjoined to the VP. Moreover, unergatives have an external θ-role,
which must be inherited from the subject NP_i in this case. Thus, neither in (247) nor in (248) is Case assigned to the embedded (external) subject. In both cases, objective Case is assigned by the dynasty (faire, V). Further details of Case assignment will be discussed in a moment. As with PRO and other Case-less positions, nothing prevents the embedded subject from being active in respects other than Case assignment. So, in (248) the embedded subject NP_i gets a θ-role and transmits it in the usual manner. So far, the structures are derived by a combination of the filter (245) and the standard assumptions for postverbal subject positions in Romance. The same facts are derived by the rule that preposes V-projections. The filter has an advantage, however. It also accounts for the sentences in (249) without any further cost (Kayne (1975, 270)):

(249) a. *Elle fera les partir
         she'll have them leave
      b. *Elle fera le manger à Jean
         she'll have it eat to John
         'She'll have John eat it'
      c. *Elle fera lui manger ce gâteau
         she'll have to him eat that cake
         'She'll have him eat that cake'
      d. *Elle fera le lui manger
         she'll have it to him eat
         'She'll have him eat it'
      e. *Elle le fera lui manger
      f. *Elle lui fera le manger
Another construction accounted for is the following one (from Kayne (1975, 230)):13

(250) ?*Elle le fera lire ce livre
      she him will have read this book

The clitic cannot be bound to a postverbal position because French does not have double object constructions. So, the trace must be between the two verbs (in the underlying position of the embedded subject):

(251) *Elle le fera t lire ce livre
This is again the configuration forbidden by (245). The filter does not exclude the occurrence of the pro-PPs en and y between the two verbs (examples from Rouveret and Vergnaud (1980, 137)):
(252) a. Marie a fait y aller Jean
         Mary had there go John
         'Mary had John go there'
      b. Marie a fait en parler Jean
         Mary had about it speak John
         'Mary had John speak about it'
As pro-PPs, en and y do not have Case. More problematic is the occurrence of the reflexive se between the two verbs (Rouveret and Vergnaud (1980, 140)): (253)
Pierre a fait se présenter Jean au concours
Peter had REFL present John at the examination
'Peter made John sit for the examination'
This problem, however, is not typical of the approach presented here. Probably, the reflexive se is an intransitivizing affix that does not represent a Case at all. Interesting constructions can be observed with verbs like laisser 'let', which, contrary to faire, only optionally trigger dynasty formation (and the word order change that goes along with it). Thus, both the following sentences are possible (Kayne (1975, 203)):

(254) a. Il a laissé partir son amie
         he has let leave his friend
         'He let his friend leave'
      b. Il a laissé son amie partir
It follows from the filter (245) that the two verbs do not form a dynasty in
(254b). Consequently, the complement is no longer transparent. We can now explain the following contrast (Rouveret and Vergnaud (1980, 159)): (255)
a. *Pierre y_i a laissé [S' Jean monter t_i]
   Peter to it let John come up
   'Peter let Jean come up there'
b. Pierre y_i a laissé [S' monter t_i Jean]
In (255a), the Case-marked NP Jean is between the two verbs, so that there cannot be a dynasty without violating the filter (245). Since dynasty formation is not possible, domain extension for the trace of the clitic y is also impossible. Without the possibility of dynasty-controlled percolation, it must be bound by the default condition, the Bounding Condition. So, the fact that the trace in the complement of (255a) must be bound in the minimal domain has nothing to do with the Specified Subject Condition. Local binding simply is the unmarked situation. In (255b), the filter does not prevent dynasty formation. Percolation becomes possible and the trace is still bound in the domain containing its governing dynasty.

The filter definitely accounts for many more facts than the rule of V-projection preposing. It avoids the formulation problems of this rule, and in a sense it is less ad hoc. In terms of partial clause union, (245) makes much sense. It is a template that improves the monosentential appearance of faire-infinitive constructions. It has the effect that all arguments occur either on the right of the verbal complex, or, as clitics, on the left. This is just the kind of NP distribution that we see in monosentential structures with respect to the (single) verb.

I will now show that there is a second template with a similar purpose that, together with the properties of percolation discussed in the preceding sections, accounts for some aspects of clitic placement in faire-constructions. The template I have in mind is derived from the Case-frame associated with the dynasty (faire, V). We saw in the discussion of Dutch causatives that dynasties can have arguments of their own (i.e. arguments that do not belong to one of the verbs). Along these lines, I assume that the dynasty (faire, V) assigns Case in the following way:

(256)
(faire, V) → acc. (dat.)
The italicization of the optional (dat.) indicates that it is associated with an external argument (see Williams (1981) for this notation). The Cases assigned by this dynasty (acc. (dat.)) are quite common in French monosentential structures. Verbs like donner 'give', for instance, assign these Cases:

(257) Je donne le livre à Pierre
      I give the book to Peter
I assume that French has a second template that improves the similarity with a monosentential pattern, in this case the pattern of (257), which also reflects the Case-frame of (256): (258)
*[ ... V_i ... V_i - X" - NP_i - Y" - (à NP_i) ... ]
unless X" and Y" are null or empty
As before, the i indices on the Vs indicate that they belong to a dynasty. The same indices on the NPs indicate that they are Case-marked by the dynasty in accordance with (256). The filter looks more complex than it is. It just mentions a common subcategorization pattern (that in (257)) and associates it with a dynasty. The effect of the filter is the following: it prohibits lexical material associated with other governors from being interspersed with the arguments associated with the dynasty. Thus, the X" and the Y" in (258) may be complements that the second V has independent of the dynasty. It blocks structures like the following: (259)
*Je ferai [NP [VP [VP écrire à sa soeur malade] Jean]]
I will have write to his sister sick John
'I will have John write to his sick sister'
In accordance with (256), accusative Case is assigned to the first NP met. This is Jean. This NP is in the derived postverbal position for unergatives. So, it is Chomsky-adjoined to the VP and it gets its θ-role from the embedded external subject position. The sentence violates filter (258), because there is an X", the complement à sa soeur of écrire, that intervenes between the dynasty and the NP that it Case-marks (cf. also the Case-adjacency Condition of Stowell (1981)). The filter can be circumvented by extraposing the X" in question, which yields the more natural (260):

(260) Je ferai écrire t Jean à sa soeur
For what follows, it is important to keep in mind that the extraposed PP binds a trace in (260). Another sentence with an extraposed PP is the following (Rouveret and Vergnaud (1980, 178)):

(261) Jean a fait comparer cette sonatine t à Paul à une symphonie
      John had compare this sonata to Paul to a symphony
      'John had Paul compare this sonata with a symphony'

In this case, a term corresponding to Y" of (258) has been extraposed. Most simple cases are straightforwardly accounted for by the Case-assignment mechanism (256) and are not ruled out by the filter (258). Consider some examples:
(262) a. Elle fera partir Jean
      b. Elle fera lire le livre à Jean
In both cases, the dynasty assigns accusative Case to the first NP met (Jean in (262a) and le livre in (262b)). In the latter case, the objective Case that lire independently must assign to its object (le livre) is simultaneously satisfied. The dative Case of (256) can be assigned to Jean (and expressed by inserting à) because Jean must be interpreted as an external argument (recall the italicization of dat. in (256)). I assume that Jean inherits its thematic role from the external subject of lire, just like the postverbal subjects of intransitive unergatives. Thus, the external θ-role is assigned in the normal way to the external subject NP_i and transmitted to Jean_i:

(263) Elle fera [NP_i [VP [VP lire le livre] à Jean_i]]
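
The Case-frame (256) can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the function name `assign_dynasty_case` and the flat list encoding of the postverbal NPs are hypothetical and not part of the text, and the sketch ignores everything except the linear order of NPs and whether an NP is the external argument.

```python
from typing import Dict, List, Optional, Tuple


def assign_dynasty_case(nps: List[Tuple[str, bool]]) -> Dict[str, Optional[str]]:
    """Toy version of (256): (faire, V) -> acc. (dat.).

    nps lists the NPs to the right of the verbal complex in linear order,
    each as (form, is_external_argument). Hypothetical encoding.
    """
    cases: Dict[str, Optional[str]] = {}
    acc_assigned = False
    dat_assigned = False
    for form, is_external in nps:
        if not acc_assigned:
            # Accusative goes to the first NP met.
            cases[form] = "acc"
            acc_assigned = True
        elif is_external and not dat_assigned:
            # The optional dative is reserved for the external argument
            # and is spelled out by inserting 'à'.
            cases[form] = "dat"
            dat_assigned = True
        else:
            # Not Case-marked by the dynasty in this sketch.
            cases[form] = None
    return cases


# (262a) Elle fera partir Jean
print(assign_dynasty_case([("Jean", False)]))
# (262b) Elle fera lire le livre à Jean
print(assign_dynasty_case([("le livre", False), ("Jean", True)]))
```

On this encoding, Jean receives accusative in (262a), while in (262b) le livre receives accusative and Jean, as external argument, receives dative.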
I will now make some tentative proposals for certain problems of clitic distribution in faire-constructions. One classic problem is the fact that dative clitics cannot move to the highest V (Kayne (1975, 281)):

(264) a. Je ferai écrire mon ami à sa soeur malade
         I will have write my friend to his sister sick
         'I'll have my friend write to his sick sister'
      b. *Je lui ferai écrire mon ami
         I to her will have write my friend
Originally, Kayne sought to explain this fact by the Specified Subject Condition. For reasons discussed before, this solution is not available to us (see also Burzio (1981, ch. 5)). Moreover, this solution leaves the grammaticality of (265) in the dark:

(265) Jean les a fait acheter t à Marie
      John them had buy to Mary
      'John had Mary buy them'

As we saw before, clauses with à must have an external subject because reflexives can be bound in them (Elle fera se laver les mains à Jean 'She will have John wash his hands'). If the SSC is the right explanation for (264b), it is not clear why (265) is grammatical. Therefore, there must be an alternative to the SSC account. The explanation that I would like to propose is based on the template (258). As we have seen, lexical material intervening between the Vs and the positions Case-marked by the dynasty is obligatorily extraposed in order to meet the template (258). See for instance (260). These extraposition rules are purely stylistic. It seems to me that they leave traces that function in a way that differs from the way traces of Wh-movement and other extractions function. As is well known, traces of Wh-movement can have
phonological effects. They can block wanna-contraction, for instance. Similarly, the trace of an extracted clitic functions as if it were lexically filled. See, for instance, how (250) is ruled out by filter (245). I assume that the traces of stylistic rules do not function as if they were lexical. In fact, this is even impossible if we want (258) to work properly: the template can only be satisfied if the extraposition rules leave behind nothing with a lexical content. How can we characterize this difference between stylistic rules and extraction rules? A lexical category must be licensed somehow. Extracted NPs, for instance, must be linked to a Case position. Suppose now that within a strictly local context, it is not always necessary for a lexical category to occupy a subcategorized position. A category is functionally licensed in its D-structure position, but it is lexically licensed anywhere in its local domain. Stylistic rules, then, characterize the cases in which the functional (D-structure) position and the lexical position do not fall together. Since lexical licensing is restricted to minimal domains, stylistic rules are always strictly local. An extracted element must be linked to a D-structure position for its functional license, but it may be linked to any appropriate position for its lexical license. It can be the D-structure position, but by the nature of stylistic variation, it can also be another position in the local domain of the D-structure position. I assume now that a trace functions as a lexical category with respect to filters and other processes if it is a lexical licensing position. This means that the template (258) is satisfied if X" and Y" are not present at all or if they are traces created by stylistic rules. In the relevant contexts, stylistic rules are the only rules that leave traces that are not also lexical licensing positions. Let us now consider the consequences for clitic extraction. 
One consequence is that the clitic in (264b) cannot be directly linked to its underlying position:

(266) Je lui_i ferai écrire t_i mon ami

If the trace (as a lexical licensing position) is non-empty, this structure is rejected by (258). The position in which the clitic is normally lexically realized, the position preceding écrire, is not available because of the filter (245). There is no reason to assume that an NP-clitic like lui can be extraposed like its PP counterpart in (260). But even if extraposition were possible, the structure would be ruled out. After extraposition, the clitic would not be directly licensed by its underlying position, but by its derived clause-final position:

(267) Je lui_i ferai écrire t_i mon ami t_i

The problem now is that the derived position is not bound in its minimal
domain. It can only be linked to its antecedent by dynasty-controlled percolation, but as we have observed many times, percolation is only possible from D-structure positions. Categories that bind traces are percolation islands (see (193)). Since the rightmost trace in (267) is not a D-structure position, percolation is impossible. There is some independent evidence that this explanation is in the right direction. Burzio (1981, 390) has observed that extraction of dative clitics is much better in Italian than it is in French:

(268) ?Gli feci telefonare Piero
      him I made telephone Piero
      'I had Piero telephone him'

Interestingly, the following word order is much more acceptable in Italian than in French (Burzio (1981, 405)):

(269) Farò [VP telefonare a Piero] Mario
      I will make phone to Piero Mario
      'I will make Mario phone Piero'

Apparently, the template (258) does not (or not in the same way) hold for Italian. Contrary to what we observed in French (see (259)), lexical PPs of the embedded verb can precede the postponed subject and therefore remain in their D-structure position. From there, percolation is possible, so that the clitic in (268) can be licensed. Note in conclusion that our solution to the dative extraction problem does not prevent the following grammatical sentences from being generated (Kayne (1975, 301)):

(270) a. Cela y fera aller Jean
         that to it will make go John
         'That will make John go to it'
      b. Elle en fera sortir Jean
         she of it will make come out John
         'She will make John come out of it'
If the clitics y and en were directly linked to their underlying postverbal positions, the sentences would be ruled out by the template (258):
(271) a. Cela y fera aller t Jean
      b. Elle en fera sortir t Jean
Contrary to what we saw with the dative clitic, these PP clitics can occupy the position between the two verbs (cf. also (252)). Thus, the following sentences are grammatical:
(271) c. Cela fera y aller Jean
      d. Elle fera en sortir Jean

Since y and en are Case-less pro-PPs, the filter (245) does not exclude these structures. With dative clitics, which are NPs, the filter does apply: (272)
*Je ferai lui écrire mon ami
For French dative clitics, then, there is no way to climb to the position of the higher verb without violating either (258) or (245). Summarizing, it seems to me that clitic extractions in French (and Italian) are regulated by the dynasty concept and its properties. Transparency itself is caused by the fact that verbs of a selected type form a composite governor: a dynasty, in which each governor governs the minimal domain of the next governor. As in other cases, dynasty-controlled domain extensions are only possible for categories in D-structure positions. In French, certain D-structure positions must be empty in order to meet the requirements of two templates. The function of these templates seems to be to make the shape of two partially united clauses more monosentential. The templates sometimes lead to certain derived positions from which percolation is no longer possible.
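
As a rough illustration of how the filter (245) sorts the examples of this section, the following sketch encodes a sentence as a flat sequence of items and checks whether a Case-bearing item intervenes between two verbs of the same dynasty. The encoding (the class `Item`, the fields `dynasty` and `has_case`, and the function name `violates_filter_245`) is an assumption made for illustration only; the filter itself is stated over syntactic structures, not over strings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Item:
    form: str
    is_verb: bool = False
    dynasty: Optional[int] = None  # index shared by the verbs of one dynasty
    has_case: bool = False         # lexical NPs, NP-clitics, and their traces


def violates_filter_245(items: List[Item]) -> bool:
    """True if a Case-bearing item stands between two verbs of one dynasty."""
    for i, left in enumerate(items):
        if not left.is_verb or left.dynasty is None:
            continue
        for j in range(i + 1, len(items)):
            right = items[j]
            if right.is_verb and right.dynasty == left.dynasty:
                if any(x.has_case for x in items[i + 1:j]):
                    return True
    return False


def verb(form: str, dynasty: Optional[int] = 1) -> Item:
    return Item(form, is_verb=True, dynasty=dynasty)


def np(form: str) -> Item:
    return Item(form, has_case=True)


# (246) *Elle fera Jean partir
ex246 = [np("Elle"), verb("fera"), np("Jean"), verb("partir")]
# (247) Elle fera partir Jean
ex247 = [np("Elle"), verb("fera"), verb("partir"), np("Jean")]
# (252a) Marie a fait y aller Jean: y is a Case-less pro-PP
ex252a = [np("Marie"), verb("a fait"), Item("y"), verb("aller"), np("Jean")]
# (254b) Il a laissé son amie partir: no dynasty is formed with laisser here
ex254b = [np("Il"), verb("a laissé", None), np("son amie"), verb("partir", None)]

for name, ex in [("246", ex246), ("247", ex247), ("252a", ex252a), ("254b", ex254b)]:
    print(name, violates_filter_245(ex))
```

In this toy encoding, the absence of a dynasty in (254b) is represented by leaving the `dynasty` index unset, which corresponds to the conclusion that the complement then remains opaque.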
5.9. Conclusion

The reality of the unmarked locality principle, the Bounding Condition, is confirmed by the results of this chapter. As we concluded in section 5.1, two-node Subjacency is entirely superfluous for NP-movement if it is also constrained by the more restrictive principle A of the binding theory. In fact, we came to a stronger conclusion, namely that the opacity factors incorporated in principle A do not play a role for the traces of NP-movement. Their distribution is - in the unmarked case - entirely determined by the Bounding Condition. In the first half of this chapter, it was concluded that passives and ergatives do involve the binding of NP-traces in Dutch (contrary to what was claimed in Koster (1978c)). Furthermore, it was shown that the order indirect object - subject in Dutch passives and ergatives requires an analysis with the subject in VP-internal position, so that the external subject position remains empty. From this, it was concluded that NP-movement in Dutch passives is optional and that the standard analysis of the obligatoriness of passive NP-movement in English is false. According to the standard analysis, Case absorption forces the object to move to the subject position in order to receive Case. This analysis would leave the optionality of NP-movement in Dutch unexplained, as well as the fact that expletives can transmit Case to postponed NPs. An alternative was proposed according to which Dutch and English
differ with respect to the necessity of filling the subject position. Dutch, it was shown, is a semi-pro-drop language, in which subjects can remain empty if the external θ-role is absent or realized elsewhere. In English, the nature of INFL determines that the empty subject must be either bound or lexically filled (see Koster (1986a) and (1986b)). The second half of this chapter sought to assimilate restructuring phenomena to the dynasty concept developed in chapter 4. English pseudo-passives violate the Bounding Condition and are therefore a marked phenomenon. As with Wh-movement (chapter 4), it was concluded that global harmony is a necessary property of pseudo-passives. The essence of the dynasty concept is that in marked domain extensions (beyond the domain defined by the Bounding Condition), the domain is not defined by a single governor but by a set of governors. Pseudo-passivization is possible in those languages in which Case assignment and absorption can be governed by a dynasty (instead of a single governor). The nature of the dynasty set up in pseudo-passivization is determined by lexical and semantic factors, which were concluded to be neutral among the various analyses of restructuring phenomena. What distinguishes the various analyses is the way in which a larger domain is created. Some analyses require reanalysis by complete amalgamation of governors (Hornstein and Weinberg (1981)). Other analyses increase the expressive power of linguistic theory by requiring non-tree-representable representations for the purpose of so-called reanalysis. Both these alternatives were concluded to be inadequate. There is no evidence for amalgamation in English, and in the case of verb "restructuring" in Romance, there appeared to be conclusive counterevidence against amalgamation. In Romance, the "restructured" verbs are not necessarily adjacent. Lack of adjacency is also a problem for the multidimensional analyses (as proposed by Haegeman and Van Riemsdijk (1986)).
But the main objection against the extra (a priori undesirable) dimensions is their superfluity. Apart from the neutral lexical/semantic factors, the restructuring phenomena can be accounted for completely by the independently motivated dynasty mechanism. Apart from this conceptual point, involvement of the dynasty concept was empirically confirmed by data from Germanic and Romance. Dutch and German infinitival complements are only transparent (in the sense of chapter 3) if they are governed from the right direction, so that the Condition of Global Harmony is fulfilled. Extraposed complements are opaque and behave in accordance with the Bounding Condition. In other words, the transparency versus opaqueness of Dutch/German infinitives confirms both the reality of the Bounding Condition and the dynasty-governed nature of domain extension. As for the faire-construction in French, it was concluded that "restructuring" is only possible if the structure of certain templates is met, which mimic the subcategorization frames of monoclausal structures. As
in English pseudo-passivization, Case relations (assignment or absorption) were extended from a single governor to a dynasty ((faire, V) in this case). As for the reality of the Bounding Condition, the behavior of the French verb laisser appeared to be particularly important. Contrary to what is assumed for faire, subject postposing is only optional with laisser. If subject postposing applies, the structure of the relevant template (245) is met, so that a dynasty can be formed that makes the complement transparent. Without subject postposing, the template is not matched, no dynasty can be formed, and the domain remains opaque, in accordance with the Bounding Condition. All in all, it seems to me that NP-movement and "restructuring" phenomena provide us with ample evidence for the reality of the Bounding Condition and its dynasty-driven extensions.
NOTES

1. Richard Kayne has pointed out to me that English assigns objective Case to the NP in such examples: Him buy a house? Never!
2. It is essential in this account that S be a (governed) complement. If S were not a complement, (70) would be a counterexample to (71). Furthermore, (71) must be qualified for languages without subject-verb agreement.
3. Similarly, an anaphor in the position of molti studenti would not have a proper antecedent. If the empty subject had the necessary referential index, the verb in question would end up with two arguments, contrary to what it can take.
4. For arguments, see Bennis (1986). It should be noted that speakers differ somewhat as to the acceptance of sentences like (87). For some other examples, see Kiparsky and Kiparsky (1970, sect. 6). These authors also give some arguments for a distinction between factive it and expletive it. Factivity, however, is not at issue here. I am only making the weaker claim that it - factive or not - is always an argument. Therefore, their arguments do not have a bearing on my claim that so-called expletive it is in fact an argument.
5. In Dutch yes/no questions, the first position of the sentence can remain empty.
6. I am not really committed to the rule of Right Node Raising. See Zwarts (1986) for penetrating criticisms. According to an analysis like Zwarts's, it is even less necessary for look at and kiss to be coordinated constituents in (156b).
7. See also Webelhuth (1985) for similar arguments for German.
8. I am taking the "A-over-A principle" here as a descriptive term, assuming that the A-over-A phenomenon must ultimately be explained by deeper principles.
9. Richard Kayne rightly points out that (189b) is much more unacceptable than (188b). The difference may again be a matter of directionality. In (188b), the PP remains on a right branch (as in its D-structure position), while in (189b) the fronted PP has been adjoined as a left branch.
10. By making a distinction between domain extension (under the Principle of Global Harmony) and the lexical factors, we allow for the possibility that a language has preposition stranding under Wh-movement but not in pseudo-passives. This is what we find in Icelandic, for example.
11. As before, I am using the term "movement" here as a descriptive term only.
12. The French examples are from Kayne (1975, 204). Richard Kayne has informed me that the French construction is also semantically restricted. However, the Dutch restrictions are certainly different, as is clear from the examples.
13. Richard Kayne has informed me that (250) is better than the examples in (249). If necessary, one could qualify (245) by making a distinction between lexical NPs and traces.
Chapter 6
Binding and Its Domains
6.1. Introduction

Anaphors and bound pronouns are referentially dependent, i.e. they derive their intended reference from some antecedent. Within generative grammar, it is generally assumed that the dependency relation in question also applies to quantified antecedents and variables. Thus, both the following examples are instances of anaphoric binding:

(1) a. Bill likes himself
    b. Everyone likes himself
In both cases, the referential status of the NP himself is derived from the antecedent (Bill and everyone, respectively). It is further assumed that "intended coreference" in the grammatical sense is not a relation holding for objects in the real world. Rather, the relation holds for objects in some domain D of mental representation (see Chomsky (1981b), Bouchard (1984), Montalbetti and Wexler (1985), among others). The same can be said about so-called disjoint reference:
(2) *He hates John
In this example, he and John cannot refer to the same object in domain D, even when these NPs can refer to the same person in the real world. Grammatically speaking, in other words, (2) is ill-formed if he and John refer to the same object in domain D. The relation between the domain D and the real world poses many intriguing and partially obscure questions that will not be discussed here. In the present context, I will limit myself to the purely configurational aspects of the coreference relation: under what circumstances does the grammatical dependency relation hold, and under what circumstances does it not hold? As before, I will not be concerned with the content of the dependency relation but only with its configurational properties, which it appears to share to a large extent with dependency relations of totally different content (such as grammatical agreement relations). The fact that (1) involves a coreference relation says really nothing about the configurations in which this relation can be realized. Not only
are the configurations in question like those of relations with a totally different content, they are also subject to a certain degree of parametric variation. The configurational properties can differ somewhat from language to language, while the content of the relations remains constant. This suggests that the content of a relation does not determine its grammatical form (i.e. its configurational properties). It is important, then, to keep in mind that the following paragraphs do not present a theory of coreference (let alone a theory of language and reference) but only a further development of the theory of the configurational matrix insofar as it enters into coreference relations. We are concerned, in other words, with the purely configurational module that we also detected in control relations (chapter 3) and "movement" relations (chapters 4 and 5). As before, I will show that this module involves a hierarchical set of domain definitions as its main substance. As in the case of obligatory control and the case of movement, anaphoric binding in the unmarked case is constrained by the minimal locality principle, the Bounding Condition discussed in the preceding chapters. Beyond this, anaphoric binding involves certain extended domains of the kind discussed in chapter 4. Thus, languages can select marked domains by adding certain opacity factors to the unmarked domain definition, or by familiar percolation mechanisms. In the latter case, an extended domain is only possible if a dynasty of a certain type can be formed. My starting point is the standard binding theory (Chomsky (1981b, 188)). (3)
Binding Theory
(A) An anaphor is bound in its governing category
(B) A pronominal is free in its governing category
(C) An R-expression is free
Principles A and B express the fact that anaphors like himself and pronominals like him are in complementary distribution with respect to some local domain (the governing category). Principle C expresses the alleged fact that R-expressions (names and variables) cannot have a c-commanding antecedent at all. As for anaphors and pronominals, it has become clear that a simple theory like (3) does not suffice, not even for English and certainly not for the many languages with more than one type of anaphor. In the latter case, the various anaphors differ in distribution, which usually requires more than one domain definition. Altogether, at least the following additions seem necessary. A rather straightforward addition is necessary to account for the well-known fact that some anaphors can only have a subject as their antecedent, while others can be bound by other c-commanding NPs as well. The anaphoric
systems of different languages can vary in this respect. But also language-internally, in languages with more than one type of anaphor, the various anaphors can differ in this respect, as has been shown by Hellan (1983) for Norwegian, by Vikner (1985) for Danish, and by Koster (1985) for Dutch, among others. The most fundamental shortcoming of the binding theory (3) is that it does not distinguish the unmarked local domain from its marked extensions. From the point of view taken here, the notion "governing category" of principles A and B presupposes the following domain extension. The unmarked domain defined by the Bounding Condition is enlarged by adding just one opacity factor, namely the notion (accessible) SUBJECT (see Chomsky (1981b, ch. 3)). As claimed by Wexler and Manzini (1985) and Koster (1984a), opacity factors differ somewhat from language to language. In some languages, a domain is defined as the minimal domain containing Tense, while in other languages the corresponding domain is defined as the minimal domain containing indicative Tense. The latter option allows larger domains. I will assume that the Bounding Condition (the minimal domain defined without opacity factors) represents the unmarked local domain for all languages with locally bound anaphors. Furthermore, I will assume that opacity factors form a hierarchy in terms of the size of the language that they define, a matter to which I will return in a moment. Apart from this richer theory of domain extensions, I will show that anaphors can be multidimensional, in the sense that they can be free or bound with respect to more domains than one. This is in fact what we already find for the element PRO in the standard binding theory: PRO must be bound in its minimal governing category and PRO must be free in its minimal governing category. Another feature that I will add to the standard binding theory has to do with the complement/noncomplement distinction discussed by Huang (1982) and others.
It seems to me that this distinction plays a role in the facts of binding that so far has hardly been recognized. Consider the contrast between (4) and (5), for instance:
(4)
a. *Jan dacht aan zich
b. *John thought of him
(5)
a. Jan zag een slang naast zich
b. John saw a snake near him
The contrast between the English sentences (4b) and (5b) forms a standard problem (see Lakoff (1968)), which remains unsolved in the binding theory (3). The ungrammaticality of (4b) is correctly predicted by (3), but (5b) is a counterexample (see Chomsky (1981b, ch. 5) for some discussion). I will argue below that the contrast between the Dutch sentences has to do with the fact that in (4a) zich is bound by the subject of a V that governs the
domain of the anaphor (a "governing subject"), while in (5a) zich is not bound by such a subject. It seems to me that the distinction between a governing and a nongoverning subject plays an important role in all languages with long distance anaphors and also in the effects for which principle C of the binding theory is held responsible. Furthermore, it appears that the notion "governing subject" is tightly connected with the dynasty concept discussed in the preceding chapters. In sum, the major goal of this chapter is to show that anaphoric relations are like other local dependencies in that their domains must be split into an unmarked part (characterized by the Bounding Condition) and certain domain extensions (characterized by opacity factors and dynasties). As I will show for Dutch, the unmarked Bounding Condition can be detected even for anaphors that seem to call for more elaborate domain definitions. Before applying them to Dutch anaphora, I will first give a somewhat more detailed account of the domain extending mechanisms. As for the opacity factors, I assume that they indirectly define a hierarchy of languages in accordance with the Subset Principle of Berwick (1985). More precisely, the Subset Principle specifies a markedness hierarchy for parameters which determine languages that are in a subset relation as a result of the values of the parameters in question. Thus, if i and j are values of a parameter p, we can define L(p(i)) as the language attained by setting the value of the parameter as i, and L(p(j)) as the language for value j. If L(p(i)) is a subset of L(p(j)), i is less marked than j according to the Subset Principle. For language acquisition, this could mean - in the simplest case - that value i is tried first by the learner, and that j is only selected if positive evidence shows that j is the right value (see Wexler and Manzini (1985)).
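The ordering imposed by the Subset Principle can be illustrated with a small Python sketch. It is only an illustration under simplifying assumptions: languages are modeled as finite sets of sentence labels, the parameter values are assumed to be listed in subset order, and the names `less_marked` and `learn` are hypothetical rather than taken from Berwick (1985) or Wexler and Manzini (1985).

```python
def less_marked(lang_i, lang_j):
    """Value i is less marked than value j if L(p(i)) is a proper subset of L(p(j))."""
    return lang_i < lang_j  # proper-subset test on Python sets


def learn(values, evidence):
    """Return the parameter value a conservative learner ends up with.

    values   -- list of (name, language) pairs, assumed to be ordered from
                the smallest language to the largest (a subset chain)
    evidence -- iterable of sentences heard by the learner (positive evidence only)
    """
    index = 0  # start with the least marked value
    for sentence in evidence:
        # Move to a more marked value only when the current language
        # cannot accommodate a sentence that was actually heard.
        while sentence not in values[index][1] and index < len(values) - 1:
            index += 1
    return values[index][0]


# Toy languages with hypothetical sentence labels: L(p(i)) is a subset of L(p(j)).
L_i = {"s1"}
L_j = {"s1", "s2"}
values = [("i", L_i), ("j", L_j)]

print(less_marked(L_i, L_j))        # True
print(learn(values, ["s1"]))        # i
print(learn(values, ["s1", "s2"]))  # j
```

The learner never retreats to a smaller language, which is the point of trying the least marked value first: only positive evidence is needed to reach the correct setting.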
As just indicated, it is generally recognized that principle A of the binding theory (3) is not sufficient for anaphoric binding in general. Many languages appear to have larger domains than English for anaphor binding (see Yang (1984) for a survey). As briefly mentioned in chapters 1 and 4, Icelandic anaphors, for instance, can be bound outside their minimal tensed clause if this clause has its verb in the subjunctive form (see Maling (1981) and the literature cited there):
(6)
a. *Jon segir [að Maria elskar sig]
John says that Mary loves REFL
b. Jon segir [að Maria elski sig]
John says that Mary loves (subj.) REFL
The first sentence (6a) seems to behave in accordance with principle A of the binding theory (3): sig is not bound in its governing category (i.e. the domain indicated by the brackets). The second sentence (6b), however, violates the original formulation of principle A: the reflexive sig is not
bound in the minimal category containing Tense and the subject (both accessible to sig in the sense of Chomsky (1981b, ch. 3)). In this sentence, the embedded verb elski is in the subjunctive form, which contrasts with the indicative form elskar of (6a). In Icelandic, a clause is accessible in principle if its verb is in the nonindicative form, i.e. if it is in the subjunctive or the infinitival form. A common approach, adopted here, is to maintain principle A and to parametrize the notion of the governing category. This is brought about by varying the opacity factors that define the governing category. Thus, for English anaphors, the opacity factors are subject and Tense, and for Icelandic sig, the opacity factor is "indicative Tense". In (6a), the reflexive cannot be bound outside the embedded clause, because this is the minimal clause containing indicative Tense. In (6b), however, the matrix clause is the minimal clause containing the indicative, so that sig can also have its antecedent in this clause. Wexler and Manzini (1985) claim that the different parametrizations of the notion of the governing category (in terms of opacity factors) form a markedness hierarchy with respect to the Subset Principle. They give the following definition:
(7)
γ is a governing category for α iff γ is the minimal category which contains α and
a. has a subject, or
b. has an Infl, or
c. has a Tns, or
d. has an indicative Tns, or
e. has a root Tns
This list is a disjunction which can be interpreted as a parameter with five values. If this parameter is correct, it belongs to Universal Grammar and defines the hypothesis space for anaphoric domains in all languages. What is particularly interesting is that the five values in (7) form a hierarchy with respect to the Subset Principle. Wexler and Manzini demonstrate that if the value (a) (subject) is chosen for an anaphor, a smaller language is chosen than when (b) (Infl) is selected. In fact, the language L(a) is a subset of L(b). Similarly, (b) determines a smaller language than (c): L(b) is a subset of L(c), and so on. In other words, markedness increases if we go down the list from (a) to (e). In terms of learnability, this means that if L(a) is selected, the grammar can grow to L(b) on the basis of further positive evidence. I adopt the idea that the markedness hierarchy is given by the Subset Principle, but I would like to qualify Wexler and Manzini's idea in a number of respects. First of all, I do not believe that (7a) defines the least marked domain. (7a) is based on English, a language which certainly has relatively small
anaphoric domains compared to Icelandic. But it is simply not true that the domains for English are the smallest we can find for anaphors among the languages of the world. Clearly, the domain definition for English defines a domain bigger than the one defined by the Bounding Condition: (8)
John talks [PP about himself]
If the Bounding Condition were the only condition applying to English anaphors, this sentence would be ungrammatical: himself is not bound in the minimal Xmax in which it is governed, namely the PP. Obviously, the grammar of English extends the minimal domain to the subject domain: the minimal Xmax must at least contain a subject (7a). There are many languages, however, in which certain anaphors cannot be bound across major clause boundaries like PP boundaries. The French reflexive se is a case in point:
(9)
*Il parle [PP de se]
he talks about REFL
The clitic se can only be bound in the minimal domain of V (S or S') (10a); binding across a PP boundary requires a different lexical item, namely lui or lui-même (10b) (see Zribi-Hertz (1980)):
(10)
a. Il se lave
he REFL washes
'He washes himself'
b. Il parle [PP de lui-même]
he talks about himself
Anaphoric clitics in particular are usually bound in a domain smaller than the governing category defined by (7a). It appears, then, that neither the anaphoric domain of Icelandic reflexives (7d) nor the domain of English anaphors (7a) represents a domain that is minimal in any sense. The minimal domain can be found in languages with clitics like French, and also, as I will argue below, in a language like Dutch. I conjecture that this minimal domain is the domain defined by the Bounding Condition, i.e. the minimal Xmax in which an anaphor is governed. There is a certain logic in adding the minimal domain definition to the list (7). Going down the list from (7a) to (7e) corresponds to an increasing language size (in accordance with the Subset Principle). Going up from (7e) to (7a), then, corresponds with decreasing language size. But of course there is still one step up from (7a), namely the elimination of opacity factors altogether. The smallest language is defined by a domain where the governing category is simply the smallest category containing the anaphor.
Let us therefore add the Bounding Condition to (7) as the first step of the list (i.e. (11a)):
(11)
γ is a governing category for α iff γ is the minimal Xmax which contains α and
a. has the governor of α and
b. has a subject, or
c. has an Infl, or
d. has a Tns, or
e. has an indicative Tns, or
f. has a root Tns
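The way the values of (11) pick out successively larger domains can be made concrete with a minimal Python sketch. Several things here are assumptions made for illustration only: each maximal projection dominating the anaphor is reduced to a label plus the set of relevant elements it contains, clause (11a) is read as requiring the governor of the anaphor for every value, and the names `Projection`, `OPACITY`, and `governing_category` are hypothetical. The encodings of (8) and (6b) are deliberately simplified.

```python
from dataclasses import dataclass

# Opacity factor added by each value of (11); None stands for the bare
# Bounding Condition (11a), which adds no opacity factor at all.
OPACITY = {
    "a": None,
    "b": "subject",
    "c": "infl",
    "d": "tns",
    "e": "indicative_tns",
    "f": "root_tns",
}


@dataclass(frozen=True)
class Projection:
    label: str
    contains: frozenset  # relevant elements contained in this maximal projection


def governing_category(chain, value):
    """Return the label of the minimal projection satisfying (11) for `value`.

    chain -- maximal projections dominating the anaphor, innermost first
    """
    factor = OPACITY[value]
    for proj in chain:
        has_governor = "governor" in proj.contains
        has_factor = factor is None or factor in proj.contains
        if has_governor and has_factor:
            return proj.label
    return None  # no domain of the required kind exists


ALL = frozenset({"governor", "subject", "infl", "tns", "indicative_tns", "root_tns"})

# (8) John talks [PP about himself]: the PP contains only the governor.
english = [Projection("PP", frozenset({"governor"})), Projection("S", ALL)]

# (6b): the embedded subjunctive clause lacks indicative Tense.
icelandic = [
    Projection("embedded clause", frozenset({"governor", "subject", "infl", "tns"})),
    Projection("matrix clause", ALL),
]

print(governing_category(english, "a"))    # PP
print(governing_category(english, "b"))    # S
print(governing_category(icelandic, "b"))  # embedded clause
print(governing_category(icelandic, "e"))  # matrix clause
```

Because containment is inherited upward in the chain, moving down the list of values can only keep the domain the same or enlarge it, which is the subset ordering the hierarchy is meant to express.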
As before, I assume that this list represents a markedness hierarchy, in accordance with the Subset Principle. Also in a second respect, the theory of Wexler and Manzini has to be expanded. Without further additions, their theory predicts that the following Icelandic sentence is ambiguous (from Maling (1981); see Giorgi (1984)): (12)
Jon segir að Haraldur komi fyrst Maria bjodi ser
John says that Harold comes since Mary invites self
The minimal domain containing ser and an indicative Tense marking is the matrix clause. Both subjects (Jon and Haraldur) are within this domain, so that both qualify as possible antecedents. Nevertheless, only Jon is a possible antecedent. Giorgi (1984) gives similar facts for Italian and Japanese. She points out that the Italian word proprio is a long distance anaphor: (13)
Osvaldo pensava che quella casa appartenesse ancora alla propria famiglia
Osvaldo thought that that house belonged still to self's family
If the anaphoric element is in an adjunct phrase, the subject of the minimal domain containing indicative Tense is not an appropriate antecedent:
(14)
*Osvaldo ritornò in patria prima che il fisco sequestrasse il proprio patrimonio
Osvaldo returned to his country before that the IRS sequestered the self's estate
'Osvaldo returned to his country before the IRS sequestered his estate'
This example shows again that Wexler and Manzini's theory is incomplete. As in Icelandic, a second c-commanding subject does not lead to ambiguity (Giorgi (1984, 315)): (15)
Maria_j sperava che Osvaldo_i ritornasse in patria prima che il fisco sequestrasse il proprio_*i/j patrimonio
Maria hoped that Osvaldo would return to his country before that the IRS sequestered the self's estate
'Mary hoped that Osvaldo would return to his country before the IRS sequestered her estate'
Maria is the only possible antecedent, in spite of the fact that both Maria and Osvaldo are c-commanding subjects in the minimal domain containing indicative Tense. In fact, as Giorgi points out, similar facts were already discussed by Kuroda (1965) for Japanese: (16)
John-wa Bill-ga zibun-o mita toki hon-o yonde ita
John Bill self saw when book reading was
'John was reading a book when Bill saw himself'
Both John and Bill are c-commanding subjects, but only Bill qualifies as an antecedent. As in Italian, a more superordinate subject does qualify as an antecedent: (17)
Mary-wa_j Bill-ga_i zibun-o_i/j/*k mita toki John-ga_k hon-o yonde ita to itta
Mary Bill self saw when John book reading was comp said
'Mary said that John was reading a book when Bill saw her (himself)'
Mary is a possible antecedent, and so is Bill. So, why is John not a possible antecedent? Giorgi claims that the facts in question can be found in all languages with long distance anaphors. I will assume that this conclusion is correct and make the further claim that related facts can be found in English and Dutch. It seems to me that all the facts in question involve the notion "governing subject" (see Koster (1985)): (18)
Governing Subject (g-subject)
A governing subject for an anaphor Y is the subject of an X⁰ such that X⁰ governs Y or a domain containing Y
The similar notion used in Koster (1985) concerns a special subcase of the notion defined by (18). The notion as it was used earlier implicitly
involved the dynasty concept. Under certain circumstances, it is still necessary, I believe, to refer to this subcase: (19)
Dynasty Subject (d-subject)
A dynasty subject is a governing subject such that X⁰ (of (18)) is a member of a dynasty
In general, long distance anaphors must be bound by a governing subject. But in Dutch at least, the antecedent must be a dynasty subject if the anaphor is governed by a dynasty (of more than one member). As discussed in chapter 5, a dynasty governs Y if one of the members of the dynasty governs Y. I will return to the distinctions in question in the next section. Assuming now that adjuncts are not governed, we can account for the facts of long distance anaphora that we discussed. The impossible antecedents in Icelandic, Italian, and Japanese are not governing subjects in the sense of (18), while the possible antecedents are. Before going into the problems of principles B and C of the binding theory (and parasitic gaps), I will first give an analysis of the Dutch reflexives zich and zichzelf that modifies my earlier account (Koster (1985)) in a number of respects. A main shortcoming of that account (and other previous analyses) was that it did not cover the occurrence of zich in the domain of V. Like Vat (1980), Everaert (1986), and others, I treated zich as a clitic, a category entirely different from zich in other contexts. I will now assume that there is only one zich. The fact that zich in the domain of V has different properties from zich in other contexts is now taken as evidence for the reality of the minimal domain (defined by the Bounding Condition: see (11a)) in the theory of Dutch reflexives. Another difference between the analysis here and the earlier one is that the old notion "governing subject" is now redefined in terms of dynasties (d-subject), which brings out the similarities between anaphoric domains and the domains of traces in a clearer fashion. The more systematic use of the notion "governing subject" also has certain consequences for the analysis of anaphors in directional and locational phrases (John saw a snake near him, etc.).
6.2. Reflexives in Dutch
Dutch reflexives can take different forms, which partially overlap but which differ greatly in distribution. For this reason alone, principle A of the binding theory has to be modified or supplemented. By and large I will limit myself here to the third person anaphors zich and zichzelf. The distribution of these two forms differs in a simple way in one respect and in relatively complex ways in other respects. The simple difference is that
zich must always be bound by a subject, while zichzelf may also be bound by other NPs. In this respect, zich behaves like long distance anaphors in other languages (see, for instance, Giorgi (1984) and Vikner (1985)). I will give details and illustrations in a moment. Apart from picture noun contexts, there are two major problems concerning the distribution of zich and zichzelf. The first problem is that the two anaphors are in complementary distribution, with one exception: in the immediate domain of the category V, the two reflexives can overlap in distribution. Thus, (20) is one example of the many contexts in which the two reflexives differ in distribution, while (21) illustrates the one context in which they overlap:
(20)
a. *Peter schiet op zich
Peter shoots at REFL
b. Peter schiet op zichzelf
Peter shoots at himself
(21)
a. Peter wast zich
Peter washes REFL
b. Peter wast zichzelf
Peter washes himself
If zich occurs in a prepositional object of a verb, it can never be bound by the subject of the same sentence (see (20)). As (21a) shows, however, zich can be bound by the subject of the same sentence if it is governed by V. The occurrence of zich and zichzelf in the immediate domain of V is in part a matter of (presumably semantically based) selection restrictions. I will not go into this problem: I simply assume that zich and zichzelf can occur in the immediate domain of V. The major contribution of Vat (1980) is the insight that the nonoccurrence of zich in contexts like (20a) is not a matter of selection. If we add a certain type of embedding, zich can be bound by the next subject up:
(22)
Jan liet [Peter op zich schieten]
John let Peter at REFL shoot
'John let Peter shoot at him'
This observation is a slight generalization of a fact first noted by Reis (1976) concerning the behavior of reflexives in directional and locational contexts: (23)
Peter zag [Marie naar zich toe komen]
Peter saw Mary to REFL prt. come
'Peter saw Mary coming towards him'
Reis observed that in the German equivalent of such sentences, the Specified Subject Condition was violated. Since then, it has become clear
that there are many languages with long distance anaphors (Yang (1984)), and that Dutch zich (and German sich) share certain properties with these. Locational and directional contexts form the second major problem. In so-called snake sentences, a reflexive is bad in English, while a pronoun is possible: (24)
a. *John saw a snake [near himself]
b. John saw a snake [near him]
Without contrastive stress, zichzelf is impossible in the Dutch equivalent of (24a): (25)
*Jan zag een slang [naast zichzelf]
John saw a snake near himself
It is quite normal, however, to use zich in such contexts:
(26)
Jan zag een slang [naast zich]
A literal translation of (24b) (with hem) is also possible (27a), although most speakers of standard Dutch prefer the reduced form 'm (27b):1 (27)
a. Jan zag een slang [naast hem]
b. Jan zag een slang [naast 'm]
Apart from the reduced form, Dutch does not differ in this respect from English. Like English him, hem cannot be bound in its minimal domain if it is governed by V: (28)
*Jan wast hem
John washes him
In several Dutch dialects, however, (28) is grammatical. In these dialects, hem is both a pronominal and an anaphor. In standard Dutch, we only find the same form for anaphors and pronominals in the first and the second person. Again, there is a preference for the reduced forms me and je, but the full forms mij and jou are also possible: (29)
a. Ik was me (mij)
I wash me
'I wash myself'
b. Jij wast je (jou)
you wash you
'You wash yourself'
Dutch is like French in this respect (je me lave) and unlike modern
English. In Middle English, however, him could be used anaphorically. Faltz (1977), for instance, gives the following example: (30)
He cladde hym
he dressed himself
It is clear from such facts that it is not necessary for a language to make a
lexical distinction between anaphors and pronominals. Before going into the two major binding problems of Dutch reflexivization, I would like to discuss the antecedent problem briefly here. I already mentioned that the antecedent for zich must be a subject and that zichzelf can be bound by any c-commanding NP. If reflexivization falls into the pattern of the configurational matrix discussed in chapter 1, we expect that the antecedent always c-commands the anaphor. This is not true, however. Jackendoff (1972) gives examples like the following: (31)
A book by John about himself
The antecedent John is in a by-phrase and does not c-command the anaphor. Similar examples can be found in Dutch (see Koster (1985)): (32)
a. Een boek van Jan over zichzelf
a book by John about himself
b. De confrontatie van Jan met zichzelf
the confrontation of John with himself
Another familiar example in which the antecedent does not obviously c-command the anaphor is the following: (33)
That picture of himself made John famous
F-command (in the sense of Bresnan (1982)) does not work here, because (as has been pointed out by Chris Driessen (personal communication)) such examples are even possible if John does not f-command himself: (34)
That picture of himself led to John's suicide
Picture noun contexts form a classic problem. The most promising solution is binding of the anaphor by the implicit subject of the picture noun, which by itself is controlled in some fashion (see Daalder and Blom (1976) and Chomsky (1986a)). I will not go into the picture noun problem at all here, but simply point out that the problem is not limited to picture nouns. In Dutch impersonal passives, a reflexive can be bound by an antecedent in the door-phrase (the equivalent of the English by-phrase):
(35)
Er wordt door Jan te veel over zichzelf gepraat
there is by John too much about himself talked
'John talks too much about himself'
In this case, the problem cannot be solved by reference to a general solution to the picture noun problem. Something else seems necessary: the antecedent must be a c-commanding argument, no matter how this argument is expressed. Thus, if a semantically unrestricted argument is the antecedent, the anaphor is bound by a c-commanding NP. If the argument is couched in a PP, the anaphor is bound by an antecedent in a c-commanding PP. This solution, however, overgenerates in an interesting way. All examples involving an antecedent in a PP have the anaphor in a PP as well. If the anaphor is in the immediate domain of a V, the antecedent must be a c-commanding NP, as shown by the following contrast: (36)
a. Jan gaf zichzelf een boek
John gave himself a book
b. *Door Jan werd zichzelf een boek gegeven
by John was himself a book given
This indicates that binding by an antecedent in a PP is a somewhat marked option. For reasons that I will go into shortly, an anaphor can only be bound in its minimal domain (in the sense of the Bounding Condition) if it is governed by V. In (36b), the anaphor zichzelf is governed by V, and the antecedent Jan is in a PP contained by the same minimal domain (S or S'). It appears, however, that in the minimal domain only a c-commanding NP is a possible antecedent. I will follow this pattern of reasoning in general: the minimal domain in the realm of anaphora can be detected by its deviant behavior. In this case, the situation seems to be as follows. The reflexive zichzelf can be bound by an argument in a PP, except if it is bound in its minimal domain. If this pattern of reasoning is correct, we find indirect evidence for the reality of the Bounding Condition in the realm of anaphora. Another conclusion is that anaphor binding cannot in general be reduced to a theory of argument structure. Zichzelf must be bound by an argument, but the form which this argument will take is in part dependent on purely structural factors. Another indication that binding cannot be fully reduced to a theory of argument structure is that neither zichzelf nor zich is necessarily bound by a co-argument in Dutch. The view that the differences between the two Dutch reflexives might be explained by the co-argument notion is suggested by the following contrasts: (37)
a. Jan zag een slang naast zich (*zichzelf)
John saw a snake near REFL (*himself)
b. Jan schoot op zichzelf (*zich)
John shot at himself (*REFL)
According to the co-argument hypothesis, (37a) cannot select zichzelf because the subject John is not a co-argument of zichzelf: the latter form is contained in a PP which is not subcategorized by the verb. In (37b), however, the reflexive zichzelf is contained in a PP which is traditionally considered a prepositional object of the verb. In that case, zichzelf can be considered an argument of the verb, just like the subject of the sentence (i.e. its co-argument Jan). For zich, just the opposite conditions would hold, so that the contrast is explained in terms of the co-argument notion. This simple theory of the contrast does not work, because it is easy to find examples with zichzelf in a nonsubcategorized adjunct:
(38)
Jan sprak namens zichzelf
John talked on behalf of himself
Clearly, the PP introduced by namens is not a prepositional object, but a phrase that functions as a sentential adverb. Consequently, zichzelf is not a co-argument of the subject Jan. And yet the sentence is perfectly grammatical. Other evidence that strongly counts against the co-argument hypothesis is the fact that both zich and zichzelf can appear as the subject of a clause embedded under a perception verb:
(39)
Jan zag [zich(zelf) vallen]
John saw REFL(self) fall
'John saw himself fall'
As is generally assumed, the embedded subject in this case does not have a thematic relation with the matrix verb.2 In other words, Jan and the reflexive it binds are not co-arguments. Another false hypothesis concerning the antecedents of Dutch reflexives is that they must be subjects (see Faltz (1977)). This cannot be right for zichzelf, because, just like its English counterpart, it can also have an indirect object as antecedent:
(40)
Jan raadde Peter zichzelf aan
John recommended Peter himself prt.
'John recommended himself to Peter'
This sentence is ambiguous: both the subject Jan and the indirect object Peter can be the antecedent of zichzelf. The subject hypothesis cannot be maintained by claiming that Peter is in fact the subject of a small clause in the sense of Kayne (1982). This hypothesis, not well founded for Dutch double object constructions anyway, would make the incorrect prediction
that Peter can also be the antecedent of zich. This is not the case: (41)
Jan raadde Peter [de vrouw naast zich] aan
John recommended Peter the woman near REFL prt.
'John recommended the woman near him to Peter'
This sentence is unambiguous: only the "real" subject John is a possible antecedent for zich. If Peter is interpreted as the antecedent, the sentence is entirely ungrammatical. As for the antecedent of zichzelf, then, it must be concluded that it is neither necessarily a co-argument nor a subject. Zich, however, must have a subject-antecedent. It can be a deep structure subject (as in the examples already given), or a derived subject, as in: (42)
Peter werd gewekt door het lawaai naast zich
Peter was woken by the noise near REFL
I will now leave the topic of possible antecedents and turn to the two major binding problems of Dutch reflexivization. The first major problem, recall, is the fact that zich and zichzelf overlap in the domain of V (43a), while they are in complementary distribution in other contexts (43b): (43)
a. Jan wast zich(zelf)
John washes REFL(self)
b. Jan schiet op zichzelf (*zich)
John shoots at himself (*REFL)
In fact, the problem is practically ignored in most studies on Dutch reflexives. In Koster (1985), it is simply said that zich in the immediate domain of V is a clitic with the same distribution as zichzelf, unlike zich in other contexts, which is treated as a nonclitic. Essentially the same nonsolution was proposed by Vat (1980) and Huybregts (in unpublished lectures). Treating zich as a clitic is not satisfactory, because there is no evidence that the Dutch "weak" pronouns are in fact clitics. Unlike the clitics in Romance and other languages, they are not co-analyzed with the verb to which they are connected. Dutch has two verb movement rules, Verb Second and Verb Raising, neither of which moves the alleged clitics along with the verb. The earlier proposal says in fact nothing beyond the observation that zich in the domain of V behaves differently from zich in other contexts, which leaves the problem exactly where it is. Another solution was suggested by Everaert (1986). According to this solution, zich in the domain of V is not an argument, so that the normal binding theory does not apply to it. This idea is based on the observation that zich is always used in contexts without a real internal argument, for
example with inherent reflexive verbs (like zich vergissen 'to err'). This solution is not satisfactory either. Although it is true that zich is always used with verbs without a true argument-object, it is also used as an argument. This can most clearly be concluded from double object constructions with zich in indirect object position. Quite often, the indirect object can be left implicit: (44)
Jan verschafte (zich) een alibi
John gave (REFL) an alibi
'John gave (himself) an alibi'
If zich is omitted, the understood indirect object is arbitrary (or contextually determined). Clearly, then, zich satisfies a thematic role in such cases. But if zich can satisfy a thematic role, there is no reason to deny it argument status. In short, the problem of the overlapping distribution in the immediate domain of V has never been solved. Here, I would like to propose a new solution based on the Bounding Condition, i.e. on the role of the minimal domain definition in the realm of bound anaphora. Throughout the preceding chapters, I have assumed that the minimal domain of V is S (or S'). Furthermore, I have assumed that the minimal domains of N, A, and P are NP, AP, and PP, respectively. What is necessary for zich to be bound in its minimal domain? First of all, the minimal domain should contain a subject (since zich must be bound by a subject). In principle, this leaves the domains NP and S, and perhaps AP. Furthermore, the lexical anaphor zich must receive Case from the head of the domain. Since neither N nor A assigns Case to NPs, only S remains as a minimal domain in which zich can be bound. In short, zich is bound in its minimal domain if it is governed by V and bound in the minimal S containing it. The same holds for zichzelf. This gives us a clue to the solution: the only context in which zich and zichzelf overlap in distribution is where they are bound in their minimal domain, i.e. where they are governed by V and bound in the minimal S containing them. The other domains in which they are bound are extensions of the minimal domain in the usual way; that is, the enlarged domains either involve opacity factors or dynasties of some sort. In these extended domains, zich and zichzelf contrast in distribution. In this way, we detect the reality of the minimal domain: the Bounding Condition defines the only domain in which both zich and zichzelf can be bound.
Most facts discussed so far can now be accounted for by a simple binding theory that incorporates this idea:
(45)
Binding Theory for Dutch Anaphors
a. Zich and zichzelf are bound in their minimal Xmax (if it contains an antecedent)
elsewhere (i.e. if Xmax does not contain an antecedent):
b. (i) zichzelf is bound in its minimal subject domain
(ii) zich is free in its minimal d-subject domain
c. zich is bound in its minimal COMP domain
Most of this theory is rather straightforward, with the exception of the role of dynasties and the notion "d-subject", which I will explain below. The first statement (45a) involves the Bounding Condition, the unmarked locality principle that we also detected in the theory of obligatory control (chapter 3) and in the theory of "movement" (chapters 4 and 5). Here, the Bounding Condition defines the (only) domain in which both reflexives may be bound (must be bound if there is an appropriate antecedent). Statement (45b) treats the notion "subject domain" as a dimension of contrast: one reflexive must be bound in the minimal subject domain, while the other reflexive must be free in it. The domain involved is more or less the domain of principles A and B (i.e. the governing category) of the standard binding theory (see (3) above). However, the fact that zich must be free in this domain (with some qualifications to be discussed later) shows that the domain definition in question is not a universal definition of the minimal domain in which anaphors must be bound. Zich must be free in this domain, but it must be bound (if possible) in a smaller domain, namely in the domain defined by the Bounding Condition (45a). Icelandic sig, on the other hand, can be bound in a more inclusive domain, which shows that the standard binding domain of principle A is merely a somewhat intermediate step on a universal scale (see (11)). Zich can also be bound in domains larger than the domain defined by principle A, but never outside the domain of COMP (which falls together with INFL in Dutch according to the assumptions made in chapter 4). This is what (45c) accounts for, which can be demonstrated by the following example: (46)
*Jan
zei [dat Peter een slang naast zich zag] John said that Peter a snake near REFL saw
Peter is a possible antecedent for zich, but it can never be bound across a COMP (by Jan across the COMP dat in this case).
Step by step, then, we observe the reality of the domain hierarchy (11). Zich must be bound in the domain defined by (11a) (the unmarked Bounding Condition). It must skip the next domain (11b) (the domain defined by principle A of the standard binding theory), in which it must be free. But the next domain definition (11c) poses an upper bound to the size of the domain in which zich can be bound. Zichzelf, on the other hand, can be bound in the minimal domain (11a), and also in the next domain (11b), which is also the maximal domain in which it can be bound.
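The way the three clauses of (45) interact can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions, not part of the theory itself: the class Candidate, its four boolean fields, and the function possible_antecedents are hypothetical names, and the domain memberships are entered by hand rather than computed from a syntactic structure. It covers (45) only; neither stress-based selection nor any condition on long distance anaphora is modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """A potential antecedent, described relative to one anaphor (hypothetical encoding)."""
    name: str
    in_min_xmax: bool              # inside the anaphor's minimal X^max (45a)
    in_min_subject_domain: bool    # inside its minimal subject domain (45bi)
    in_min_d_subject_domain: bool  # inside its minimal d-subject domain (45bii)
    in_min_comp_domain: bool       # inside its minimal COMP domain (45c)


def possible_antecedents(anaphor: str, candidates: List[Candidate]) -> List[str]:
    """Return the names of the candidates that (45) allows as binders."""
    # (45a): if the minimal X^max contains an antecedent, binding is settled there.
    local = [c.name for c in candidates if c.in_min_xmax]
    if local:
        return local
    # Elsewhere conditions.
    if anaphor == "zichzelf":
        # (45bi): bound in the minimal subject domain.
        return [c.name for c in candidates if c.in_min_subject_domain]
    if anaphor == "zich":
        # (45bii) and (45c): free in the minimal d-subject domain,
        # bound in the minimal COMP domain.
        return [c.name for c in candidates
                if c.in_min_comp_domain and not c.in_min_d_subject_domain]
    raise ValueError(f"unknown anaphor: {anaphor}")


# Example (46): zich inside the adjunct PP 'naast zich' in an embedded clause.
# Assumption: since the PP is an adjunct, Peter is a subject but not a d-subject.
EX_46 = [
    Candidate("Peter", False, True, False, True),
    Candidate("Jan", False, False, False, False),  # outside the COMP 'dat'
]

# A prepositional-object configuration (verb plus selected preposition):
# the only subject, Jan, is also the minimal d-subject.
PREP_OBJECT = [Candidate("Jan", False, True, True, True)]

print(possible_antecedents("zich", EX_46))
print(possible_antecedents("zich", PREP_OBJECT))
print(possible_antecedents("zichzelf", PREP_OBJECT))
```

In this encoding the contrast between the two reflexives comes down to a single field: the same subject admits zichzelf but blocks zich exactly when it is flagged as the minimal d-subject.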
In general, I believe that anaphors can have multidimensional domain statements, i.e. they must be bound with respect to one domain of (11) and they can be free with respect to another domain of (11). Naturally, the latter domain is always a subdomain of the maximal domain in which an anaphor can be bound at all. It also seems natural to put further constraints on the possible binding theories for anaphors. I assume that, apart from the Bounding Condition (11a), anaphors can have at most one definition of the maximal binding domain and at most one definition of the subdomain in which they must be free.
Before turning to the second major binding problem ("snake" sentences) and the role of the dynasty concept, I will first illustrate principles (45a) and (45b). The first binding problem (overlapping vs. contrasting distribution of zich and zichzelf) is accounted for, as is clear from the examples of (43) (repeated here for convenience):
(47) a. [Jan wast zich(zelf)]
        John washes REFL(self)
     b. [Jan schiet [op zichzelf (*zich)]]
        John shoots at himself (*REFL)
In (47a), zich or zichzelf is governed by V and bound by Jan. In both cases the anaphor is bound within its minimal domain (indicated by the brackets). Both zich and zichzelf can be bound in this way, in accordance with (45a). In (47b), the anaphors cannot be bound within the minimal domain (the PP indicated by the innermost brackets) because there is no antecedent within this domain. Automatically, then, the elsewhere conditions apply ((45b) and/or (c)). Zichzelf can (and must) be bound in the next domain (the outermost brackets), because this is the minimal domain containing a subject (Jan) (45bi). Zich, however, must be free in this domain (45bii): the domain in question is the minimal domain containing the d-subject Jan. This is the subject of the dynasty (schieten, op), which governs zich. According to (45bii), zich must be free in the minimal domain of such a d-subject. Consider next some examples involving causatives or verbs of perception:
(48) a. Jan liet [Peter zich wassen]
        John let Peter REFL wash
        'John let Peter wash himself'
     b. Hans gelooft [dat Jan [Peter [op zich schieten] zag]]
        Hans believes that John Peter at REFL shoot saw
Both sentences are unambiguous. In (48a), Peter is the only antecedent. Zich is governed by the verb wassen and it can (and therefore must) be bound in its minimal domain (indicated by the brackets). This is in accordance with (45a). In (48b), Jan is the only possible antecedent. In this case, zich cannot be bound in its minimal domain (the PP indicated by the innermost brackets). Therefore, (45a) does not apply, so that (45b) and (c) must apply. Peter is excluded as an antecedent by (45bii), since this subject is the d-subject determining the minimal d-subject domain in which zich must be free. Nothing of what we have said so far excludes Jan as a possible antecedent. It is a subject (as required for zich) and it is not the minimal d-subject. Hans, finally, is outside the minimal COMP domain (defined by dat), so that binding of zich by Hans is excluded by (45c). All in all, then, it appears that (45) correctly predicts that Jan is the only antecedent for zich. Consider now a Dutch "snake" sentence:
(49) [Jan zag een slang [naast zich]]
     John saw a snake near REFL
     'John saw a snake near him'
Again, zich is not bound in its minimal domain (the PP introduced by naast). So, the elsewhere conditions apply as before. Note, however, that this time the sentence is not excluded by (45bii). Jan is the subject defining the minimal subject domain for zich, but contrary to what we saw before, Jan is not a d-subject for zich. The reason is that the locational phrase is an adjunct, so that zag and naast do not form a dynasty. Nothing, therefore, excludes Jan as a possible antecedent for zich in this case. The last principle, (45c), is of course not violated, because Jan does not bind zich across a COMP. So far so good. The problem, however, is that (45) does not exclude the ungrammatical (50) either:
(50) *[Jan zag een slang [naast zichzelf]]
In this case, Jan is the minimal subject for zichzelf, so that the anaphor is correctly bound in accordance with (45bi). This brings us to the problem of "snake" sentences in general. The standard binding theory makes the wrong predictions about "snake" sentences in English, as noted by Chomsky (1981b, 291):
(51) John saw a snake [near him (*himself)]
The problem with such sentences is that him is bound to the antecedent John within the minimal governing category. According to principle A, then, we would expect a reflexive (an anaphor), while principle B excludes a pronominal in this context. As is clear from (51), the opposite appears to be true: an anaphor is impossible, while a pronominal gives a grammatical result.
As we have seen in (49)-(50), a similar observation can be made about Dutch: the reflexive zichzelf is excluded, while zich is possible. In fact, it is also possible to use a pronominal, as in English:
(52) Jan zag een slang naast hem
     John saw a snake near him
As said before, most speakers of Dutch prefer the reduced form 'm over the full form hem in such cases, but (52) is also acceptable (to my ear, at least). A hypothesis that immediately suggests itself is that the antecedent and the pronominal are not in the same governing category. If that were the case, most examples would be in accordance with the binding theory. Unfortunately, it is very difficult to make a strong case for an extra domain. What has been suggested, for instance, is a small clause analysis (SC = small clause):
(53) John saw [SC a snake near him]
Chomsky (1981b, 291) concludes - rightly, I believe - that a small clause analysis is not generally available for such cases. One of the problems is that we find the same distinctions in the locational (and directional) complements to intransitives (see Koster (1985, (16))):
(54) Jan keek [achter zich]
     John looked behind REFL
As before, we also find ordinary pronominals in these contexts, particularly in the reduced form:
(55) Jan keek achter 'm
     John looked behind him
It is not clear what the small clause would be like in these cases. Perhaps the only possibility is a small clause introduced by PRO:
(56) Jan keek [PRO achter zich]
But in such a structure, the only feasible controller of PRO is the subject Jan. Since zich is bound by Jan, it must also be bound by PRO, which is controlled by Jan. But then the small clause solution collapses: zich would be bound again in its minimal (d-)subject domain, which is forbidden by (45bii). In Koster (1985) I rejected the small clause analysis for this reason, and adopted an ad hoc solution. I stipulated that PPs containing a directional or locational P have the same status as domains containing a subject. Directional and locational PPs, in other words, were analyzed as minimal governing categories in the sense of principle A of the binding theory. I would now like to propose an alternative based on the notion "d-subject" and on certain independent properties of locational and directional prepositions. Let us therefore reconsider some problematic contrasts:
(57) a. *Jan schoot op zich
         John shot at REFL
     b. Jan keek achter zich
        John looked behind REFL
This contrast straightforwardly follows from (45). In (57a), zich is governed by the dynasty (schoot, op). Therefore, Jan is a d-subject and zich is bound in the domain of this minimal d-subject, which is forbidden by (45bii). In (57b), the preposition achter introduces an adjunct and does not form a dynasty with the verb (only the heads of prepositional objects form a dynasty with the verbs that select them). Consequently, Jan is not a d-subject for zich, and (45bii) does not exclude the sentence. We concluded earlier that (45) is still problematic because it does not exclude zichzelf in the context of zich, contrary to what seemed necessary:
(58) *Jan keek achter zichzelf
      John looked behind himself
This observation is not entirely correct, however. (58) becomes grammatical if we put the main stress on the reflexive:
(59) Jan keek achter zichZELF
In those adjunct phrases in which zichzelf is possible, this is the unmarked stress pattern:
(60) Jan sprak namens zichZELF
     John talked on behalf of himself
The difference between (59) and (60) is that stress on the reflexive is unmarked in (60) but contrastive in (59). It seems to me that this is due to an inherent difference between prepositions like namens and locational prepositions like achter. Prepositions of the former type are always unstressed, while directional and locational prepositions are stressed when they are followed by an anaphor or pronoun. We can now simply assume that from the point of view of the binding theory both zich and zichzelf can occur in adjunct PPs, but that if the preposition can be (uncontrastively) stressed zich must be selected, while zichzelf must be selected otherwise. If this solution is correct, the complementary distribution of zich and zichzelf in adjuncts has nothing to do with the binding theory and its domains, but only with selection based on stress. We have not accounted yet for English "snake" sentences and for the fact that pronominals can be found in adjunct domains. I will return to this matter in the next section, (6.3), which discusses principle B.
I will now turn to the nature of the dynasties involved in reflexivization. According to our earlier definitions, a dynasty consists of a number of successive domain governors that have something in common. This means that a domain can be extended to the next more inclusive domain as long as the former domain is governed within the latter and the successive governors have the right property. It seems to me that these properties of dynasties not only account for several facts of Dutch reflexivization, but also for the differences between long reflexivization in Dutch on the one hand and in some Scandinavian languages on the other.
Dynasties for anaphors are built up from lexical heads that are dependent on some verb. In general, there are two types of heads that satisfy this description: dependent verbs, such as infinitives and subjunctives, and the prepositions of prepositional objects, which have traditionally been believed to be closely associated with verbs (or adjectives). Apart from the notion of the d-subject in (45bii), the binding theory (45) says nothing about dynasties. It seems to me that a partially language-specific binding theory is not the right place for this aspect of domain extension. Rather, it seems appropriate to introduce the dynasty concept in the definition of long distance anaphora. With this in mind, I will make the following assumptions (about Universal Grammar):
(61) a. long distance anaphors are anaphors bound by a nonminimal governing subject
     b. long distance anaphors governed by a dynasty must have a d-subject as antecedent
These assumptions are not just part of the binding conditions for Dutch (45), but are claimed to hold for all languages. Let us now see how these assumptions interact with the binding conditions. Consider the following case of long distance binding:
(62) dat Jan [Peter op^i zich schieten^i] liet^i
     that John Peter at REFL shoot let
Peter is not a possible antecedent for zich, because it is the minimal d-subject for zich excluded by (45bii). Jan is a nonminimal subject within the domain of the minimal COMP, so it is a possible antecedent according to the binding theory (45). Binding by Jan would be a case of long distance anaphora according to (61a). Clearly, Jan is also a d-subject for zich, because it is the subject of liet, which belongs to the dynasty (liet, schieten, op), indicated by the superscripts in (62). Therefore, (61b) must and can be satisfied in (62). Consider next the following sentence, which contrasts with (62) in an interesting way:
(63) *dat Jan [het schot^i op^i zich] betreurde
      that John the shot at REFL regretted
As in (62), binding of zich by Jan would be a case of long distance binding: Jan is a governing subject for zich because it is the subject of betreurde, the verb which governs the domain containing zich. Moreover, Jan is a nonminimal subject if we assume that the NP (indicated by the brackets) contains an implicit subject. But contrary to (62), (63) is ruled out by (61b). In this case, Jan is not a d-subject (as required by (61b)), because Jan is the subject of betreurde, which does not form a dynasty with the dynasty (schot, op) which governs zich. By assumption, dynasties for anaphors are only formed by dependent heads. The preposition op can be said to depend on the verb schieten 'shoot' or its nominalization, but the head of an NP is not dependent on a V in this sense. For the same reason, (64) (discussed in Koster (1985)) is ruled out:
(64) *dat Jan [Marie verliefd^i op^i zich] achtte
      that John Mary in love with REFL considered
Zich is governed by the dynasty (verliefd, op) (op is a dependent head, i.e. a fixed preposition selected by the adjective verliefd). As in the preceding example, the dynasty cannot be expanded with the matrix verb. Neither Jan nor Marie is a possible antecedent. Marie is the minimal d-subject, which is not a possible antecedent according to (45bii). Jan is a possible antecedent according to the binding theory; but since it is not a d-subject, it is ruled out as a possible antecedent by (61b).
All these considerations concerning dynasties are irrelevant for the occurrences of zich in adjunct phrases. Since adjuncts are by definition not governed, these occurrences of zich are not long distance anaphors in the sense of (61a). Thus, Jan is not a governing subject for zich in (65):
(65) Jan zag een slang [naast zich]
     John saw a snake near REFL
     'John saw a snake near him'
Since zich is not a long distance anaphor in the sense of (61), the only principle that applies is principle (45c): zich must be bound (by a subject) in its minimal COMP-domain. This condition is met in (65). Similarly, the ambiguity of the following sentence is predicted:
(66) dat Jan [Marie de slang [naast zich] knuffelen] liet
     that John Mary the snake near REFL hug let
     'that John let Mary hug the snake near him (her)'
Marie is the minimal subject for zich, but it is not a d-subject, so that Marie is not ruled out (as an antecedent) by (45bii). Nothing rules out Jan either: both Jan and Marie are possible antecedents within the minimal COMP-domain. Similarly, the following examples (discussed in Koster (1985)) are accounted for:
(67) a. [Met Marie naast zich(*zelf)] is Jan altijd nerveus
        with Mary near REFL(*self) is John always nervous
     b. Jan werd nerveus van [Marie's lawaai naast zich(*zelf)]
        John became nervous by Mary's noise near REFL(*self)
In both (67a) and (67b), zich is in a locational adjunct, so that only (45c) applies. In both cases, zich is bound within its minimal COMP-domain by Jan. In neither case can zichzelf be bound by Jan, because in both cases Marie is the minimal subject beyond which zichzelf cannot be bound according to (45bi). All in all, it seems to me that the binding theory (45), together with (61), accounts for most of the facts of Dutch reflexivization discussed in Koster (1985) and elsewhere.
A fact that we have not yet accounted for is that long distance reflexivization only occurs in Verb Raising complements in Dutch, i.e. in complements to the left of their matrix verb. Following a suggestion by Richard Kayne, I will assume that this fact has to do with the direction of government in Dutch. Another factor is the possible presence of complementizers in extraposed infinitives. In chapters 3 and 5, I have assumed that sentential complements with a complementizer are generated to the right of their matrix verb. Only infinitives without complementizers are (base-)generated to the left of their matrix verb. These orders correspond with the direction of government: government is to the right for complements introduced by complementizers, while the matrix verb governs Verb Raising complements to the left. Since we have assumed that the direction of government is a D-structure property, extraposed complements that bind a trace are not governed at all (see chapters 4 and 5). Consider now the following example (Koster (1985, (26b))):
(68) *Peter dwong Marie om naar zich toe te komen
      Peter forced Mary COMP to REFL prt. to come
This example is straightforwardly ruled out by (45c): zich is not bound in the minimal domain introduced by the COMP om. The sentence is also ungrammatical if the complementizer is absent. If this is due to complementizer deletion in PF, the sentence is still ruled out by (45c) which applies before PF. If the COMP-less complement binds a trace (resulting from extraposition) the complement is not governed at all, so that no dynasty can be formed. Given my assumptions about the direction in which complements are governed, the relevant dynasties can only be formed on the basis of Verb Raising complements.
In languages in which all complements are governed to the right of their matrix verb (and that lack (45c)) long distance anaphors are in principle possible in all of their complements that are so governed. This is precisely what we see in the Scandinavian languages. In Icelandic, long distance anaphora is possible if a dynasty of dependent verbs can be formed. Since Icelandic has not only infinitives but also subjunctives, a great deal more is possible in this language than in the other Scandinavian languages. But Danish, Norwegian, and Swedish still have a possibility that Dutch and German lack. The Scandinavian languages have long distance anaphors in control complements to the right of the matrix verb. As mentioned in chapter 4, the following Norwegian sentence, for example, is grammatical (Hellan (1980)):
(69) Ola bad oss [PRO snakke om seg]
     Ola asked us to talk about REFL
The Dutch equivalent of this sentence is ungrammatical for the reasons we already mentioned:
(70) *Marie vroeg ons [(om) PRO over zich te praten]
      Mary asked us (COMP) about REFL to talk
Unfortunately, reflexivization into a control complement is still rather bad if the complement is generated to the left of the matrix verb (where it undergoes Verb Raising):
(71) dat Marie ons over zich vroeg te praten
     that Mary us about REFL asked to talk
This example is somewhat hard to judge, due to the fact that Verb Raising with object control is rather odd anyway. Verb Raising combines better with object control if the embedded infinitive lacks te 'to':
(72) dat Marie Peter boeken leerde lezen
     that Mary Peter books taught read
     'that Mary taught Peter to read books'
A crucial question now is whether te-less infinitives can contain long distance reflexives. Again, the facts are not crystal clear:
(73) dat Marie Peter aan zich leerde denken
     that Mary Peter of REFL taught think
     'that Mary taught Peter to think of her'
It seems to me that such a sentence is somewhat less perfect than a corresponding sentence with a non-control verb like laten:
(74) dat Marie Peter aan zich liet denken
     that Mary Peter of REFL let think
     'that Mary let Peter think of her'
On the other hand, I find (73) distinctly better than (70) or than (75) (with Marie binding zich):
(75) *dat Marie Peter zich liet wassen
      that Mary Peter REFL let wash
This sentence is only grammatical if Peter binds zich, as predicted by (45a). As it stands, the theory (i.e. (45) and (61)) predicts that (73) is grammatical. If it is not - which I doubt - we have to limit the permissible dynasty for reflexives to dependent prepositions and verbal complements to causatives and verbs of perception. This would be an unfortunate result. By far the most elegant theory would entail that long distance reflexivization is always possible in principle in Verb Raising complements, i.e. in complements to the left of the matrix verb. If that is correct, we can give a very simple account of the parametric differences among Dutch, Icelandic, and the other Scandinavian languages. Other things being equal, we would not have to stipulate anything special for reflexives in these particular languages at all. We could assume that in all languages with long distance anaphors, these anaphors are possible to the extent that a dynasty of dependent heads can be formed. If infinitives and subjunctives are universally defined as dependent verbs, the difference between Icelandic and the other Scandinavian languages follows simply from the fact that the former language has subjunctive verbs, while the latter have not. The facts of Dutch and German would follow from the independent assumption that the complements containing the dependent verbs are only found to the left of matrix verbs. It is not clear, however, whether things can be kept that simple.
Summarizing, we have come to the following conclusions. The local domain involved in principles A and B of the standard binding theory is not sufficient for a language like Dutch with more than one anaphor. It appears that the domain of the standard binding theory is just one option among a set of domain definitions that define languages that are in a subset-superset relation. The smallest language is defined by the Bounding Condition, the unmarked locality principle that we also detected in the theory of control and the theory of movement. Particularly in the languages with anaphoric clitics, the Bounding Condition suffices as a principle-A-like condition. But also in Dutch this domain can be detected: it is the only domain in which the distributions of zich and zichzelf overlap (45a). As in English, the further distribution of the anaphors is determined by slight extensions of the basic domain. Zichzelf is not unlike the English anaphors in that the domain is defined as the minimal domain containing a subject (45bi). For zich, a slightly more inclusive domain was defined, namely the minimal domain containing a COMP (45c). A special feature of the binding theory for Dutch (45) is that it also specifies a "negative" condition for an anaphor (45bii): zich cannot be bound in a certain domain. The standard binding theory specifies negative conditions only for nonanaphors (principles B and C). The negative condition in question exploits the notion "d-subject", which introduces the dynasty concept in the anaphoric conditions. The dynasty concept also appeared to play a role in the tentative characterization of long distance anaphora (61). Finally, certain differences among the Germanic languages were related to certain independent factors (such as the direction of government). I will now turn to the standard negative conditions (principles B and C) and show that the notion "d-subject" plays a role in these conditions as well.
6.3. The principles B and C in English and Dutch
The following two sentences seem problematic for principle B of the binding theory:
(76) a. The men found a smokescreen around them
     b. They like their dogs
If (76a) is analyzed as a monoclausal structure, the binding of them by the men is ruled out by principle B. This problem was in fact noted by Lees and Klima (1963), who discussed (76a) as a potential counterexample to their Clause Mate Constraint on bound anaphora. They sought to solve the problem by proposing a bisentential underlying structure:
(77) The men found [a smokescreen to be around them]
The derivation of (76a) (from (77)) would involve deletion of to be, a rule which is no longer considered acceptable. In the current framework, we can analyze (76a) as involving a small clause (indicated by the brackets):
(78) The men found [a smokescreen around them]
Although plausible for this particular example, I have already mentioned in the preceding section why I do not believe that the small clause solution is generally available for locational and directional PPs. In this section, I will briefly discuss the problem in connection with principle B. The second sentence, (76b), is problematic because of the much discussed fact that the complementary distribution between anaphors and pronominals breaks down in possessive NPs (see Chomsky (1981b, 1986a), Huang (1982), Aoun (1986)). Thus, both an anaphor like each other's and a pronominal like their are possible in the context in question:
(79) a. They like each other's dogs
     b. They like their dogs
If each other is bound in its governing category in (79a), principle B seems to predict that the pronominal their is forbidden in this domain. The first problem (pronominals with an antecedent in the same local domain) was not only discussed by Lees and Klima, but also by Chomsky (1965, end of ch. 3) and Lakoff (1968), and more recently by Chomsky (1981b) and E. Kiss (1982b). Chomsky (1981, 290) gives the following examples:
(80) a. John pushed the book away from him
     b. John drew the book towards him
     c. John turned his friends against him
     d. John saw a snake near him
As mentioned in the preceding section, the preferred Dutch translation of him in these contexts is zich, although the pronominal hem (preferably in the reduced form 'm) is also possible:
(81) a. Jan duwde het boek van zich ('m) af
     b. Jan trok het boek naar zich ('m) toe
     c. Jan keerde zijn vrienden tegen zich ('m)
     d. Jan zag een slang naast zich ('m)
In the preceding section, it was concluded that the small clause solution is not generally available because of the fact that the PPs in question also co-occur with intransitives in Dutch. The same can be observed in English:
(82) John looked [behind him]
As in the Dutch example that we discussed, a small clause would involve PRO. If this PRO is interpreted as John, the small clause analysis fails, because him would again be bound in a local domain, contrary to what principle B stipulates:
(83) John_i looked [PRO_i behind him_i]
Another problem is that, as in Dutch, the reflexive form is not entirely excluded in (82), particularly not in contrastive contexts:
(84) John looked [behind himSELF]
All in all, it seems to me that sentences like (82) form a genuine problem for principle B of the binding theory. The solution that I would like to propose is similar to the one proposed for Dutch in the preceding section. Apparently, himself is not compatible with noncontrastively stressed prepositions. We can state this fact by assuming that the underlying form is the anaphor himself in sentences like (80) and (82), from which the intensifier self is deleted in the context of certain prepositions:
(85) P - pro self → P - pro
     where P is noncontrastively stressed
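The effect of rule (85) on a surface string can be sketched as a simple rewrite. The Python fragment below is a toy illustration only: the set STRESSED_PREPOSITIONS, the table REFLEXIVE_TO_PRONOUN, the function self_deletion, and the flag reflexive_is_contrastive are all hypothetical, and the list of prepositions is an assumption standing in for whatever independently determines which prepositions bear noncontrastive stress.

```python
# Assumption: a hand-picked list of locational/directional prepositions that
# are taken to bear noncontrastive stress before an anaphor or pronoun.
STRESSED_PREPOSITIONS = {"near", "behind", "towards", "from", "against", "around"}

# "pro self" forms and the bare "pro" forms left after deleting the intensifier.
REFLEXIVE_TO_PRONOUN = {"himself": "him", "herself": "her", "themselves": "them"}


def self_deletion(words, reflexive_is_contrastive=False):
    """Apply a toy version of rule (85): P - pro self -> P - pro.

    If the reflexive carries contrastive stress, the preposition is assumed
    not to be noncontrastively stressed, so the rule does not apply.
    """
    out = list(words)
    if reflexive_is_contrastive:
        return out
    for i in range(1, len(out)):
        preposition = out[i - 1].lower()
        form = out[i].lower()
        if preposition in STRESSED_PREPOSITIONS and form in REFLEXIVE_TO_PRONOUN:
            out[i] = REFLEXIVE_TO_PRONOUN[form]
    return out


print(" ".join(self_deletion("John saw a snake near himself".split())))
print(" ".join(self_deletion("John talked on behalf of himself".split())))
print(" ".join(self_deletion("John looked behind himself".split(),
                             reflexive_is_contrastive=True)))
```

The sketch rewrites only the output string; the input with himself is what a condition on anaphors would inspect, which is the point of ordering the deletion after binding.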
If we assume that this rule applies at PF, (82) is no longer problematic. At S-structure, him is still himself, so that the binding theory applies correctly. Before going into some further complications concerning anaphors in adjuncts, I will first discuss the second problem.
The second problem, the overlapping distribution of pronominals and anaphors in possessive NPs, can probably be solved in a rather trivial way. The solution that I have in mind simply assumes that possessives like their are ambiguous between a pronominal reading and an anaphoric reading. Such ambiguities are quite common across languages. In the preceding section, for instance, I mentioned that hem can be analyzed as an anaphor (himself in English) or as a pronominal (him in English) in several dialects of Dutch:
(86) Jan waste hem
     John washed him(self)
In standard Dutch, hem can only be interpreted as a pronominal (disjoint in reference from Jan) in (86). What seems relevant in this context is the observation made earlier that the anaphor/pronominal ambiguity also exists in standard Dutch, namely for first and second person pronouns: me (mij) and je (jou) can be used as anaphors (with a local antecedent) but also as pronominals (without a local antecedent). Apparently, the ambiguity can exist in one language for some forms but not for others. If this is the case, it is possible that the ambiguity does not exist for the objective forms like them, but only for the possessive forms like their. If we assume that forms like her, his, and their are ambiguous in the intended sense, the problem is solved. Both each other's and their are bound in their minimal governing category in (79) (repeated here for convenience):
(87) a. They like each other's dogs
     b. They like their dogs
A related ambiguity can be found in the following Dutch example:
(88) Jan sprak namens hem zelf
     John talked on behalf of him self
This sentence is ambiguous. Hem zelf is either interpreted as an anaphor bound by Jan or as a pronominal disjoint in reference from Jan. The Dutch intensifier zelf can be added to any NP. Thus, John himself is translated as Jan zelf in Dutch. Pronominals can also be "intensified" in this way:
(89) Ik zag hem zelf
     I saw him self
     'I saw him himself'
Hem zelf clearly differs from English himself in this respect: the latter can only be used anaphorically. The fact that hem zelf can be interpreted as an intensified pronominal has obscured the fact that it can also function as an anaphor. In local contexts, such as (88), hem zelf cannot be replaced by the unambiguous pronominal hem:
(90) *Jan sprak namens hem
As we saw earlier, however, zichzelf can occur in this context:
(91) Jan sprak namens zichzelf
In English, this is an anaphoric context too, which requires himself. The anaphoric use of hem zelf is not unrestricted, because it cannot be bound by the minimal d-subject of a verb:
(92) a. *Jan zag hem zelf
         John saw him self
     b. *Jan schoot op hem zelf
         John shot at him self
Zichzelf is absolutely necessary in these contexts. Unfortunately, the notion "minimal d-subject" that we used for zich (see (45bii)) does not fully characterize the domain in which hem zelf must be free. The following sentence is grammatical to my ear, which shows that hem zelf can be bound by the d-subject of an adjective (see Koster (1985)):
(93) Marie acht [Peter tevreden over hem zelf]
     Mary considers Peter satisfied with him self
Peter is the subject of the dynasty (tevreden, over) which governs hem zelf. The difference between (93) and (92) is that in the latter, the d-subject is the subject of a verb. This is the crucial difference, as can be demonstrated with the following minimal pair:
(94) a. *Jan beschrijft hem zelf
         John describes him self
     b. Jans beschrijving van hem zelf
        John's description of him self
Summarizing so far, we have come to the following conclusions. For English, principle B of the binding theory is just right. The two apparent problems can be accounted for in a rather simple way. In directional and locational PPs, the apparent bound pronominal (him) is in fact an anaphor (himself) that has undergone self-deletion. In possessives, the apparent pronominal is ambiguous between a pronominal and an anaphor, which accounts for the overlapping distribution under local binding. For Dutch, we do not have to modify principle B either. Superficially, the occurrence of "intensified" pronominals (like hem zelf) in the local domain of some antecedent might suggest a modification. But we have argued instead that forms like hem zelf are ambiguously analyzed as anaphors or pronominals. The anaphoric form has to be excluded from certain contexts, which we can account for as follows: (95)
Binding condition for hem zelf
Hem zelf must be free in the domain of its minimal verbal d-subject
As in English, the intensifier zelf is usually deleted in directional and locational PPs, which accounts for the fact that both zich and hem occur locally bound in "snake" sentences: (96)
Jan zag een slang naast zich (hem)
John saw a snake near REFL (him)
In general, the two anaphors zichzelf and hem zelf overlap in a minimal subject domain, except in a minimal verbal d-subject domain, in which only zichzelf can be used. The net result of this is that if the antecedent is a verbal subject, the overlap only occurs in adjunct phrases. Condition (95) only specifies a negative condition for anaphoric hem zelf. A natural question is whether there is also a positive condition, as there is for all other anaphors. The problem is that this is practically impossible to tell, due to the fact that hem zelf is also used pronominally. In an earlier account of Dutch reflexivization, I claimed that there is a
positive condition for the reduced form 'm zelf. In anaphoric contexts, the reduced form is preferred over the full form, and I saw an opacity effect in sentences like the following: (97)
*Jan zei dat Peter 'm zelf waste
John said that Peter himself washed
Such facts have remained controversial, however. The reason is that even the reduced form can be used pronominally: (98)
Ik zag 'm zelf
I saw himself
'I saw him himself'
If the reduced form can be used pronominally, it is hard to see why the intended reading in (97) must be excluded. Nevertheless, I would like to maintain that there is a third anaphor in Dutch. Hem zelf is not an intensified pronominal in all contexts. It is an anaphor if locally bound (88), in those contexts in which the "bare" pronominal is excluded (90).
I would now like to turn to principle C of the binding theory, which is by far the most problematic. Apart from certain problems of interpretation discussed by Lasnik (1980) and Finer (1985), there are two major problems with principle C. The first problem is that the many different cases supposed to be ruled out by principle C vary enormously in acceptability. This suggests that the various "principle C effects" do not form a unitary phenomenon. The second problem is that, interpretively speaking, disjoint reference is the main effect of a principle C violation. It appears, however, that a principle C violation is neither necessary nor sufficient for a disjoint reference interpretation. To begin with the first problem, consider the following sentences: (99)
a. *He hates John
b. *He thinks that John is sick
c. *John thinks that John is sick
d. *He left because John was sick
e. *John left because John was sick
f. *Nobody left because John was sick
g. *Who_j t_j thinks that we like t_j
h. *Who_j t_j was arrested before we saw e_j
All these sentences are supposed to be ruled out by principle C. This is not satisfactory, because the sentences of (99) differ enormously in acceptability (as has been noted by Ann Farmer in a lecture at Tilburg (1985)).
For example, (99a) is entirely unacceptable in the intended reading, while (99e) is almost acceptable. I will suggest below that principle C effects are caused by a discourse principle that also takes structural information into account. These structural features discriminate between (99a) and (99e) in such a way that (99a) is assigned a lower acceptability status than (99e). Similar principles play a role with respect to the second problem, the fact that a principle C violation is neither necessary nor sufficient for the disjoint reference interpretation. That a principle C violation is not necessary for disjoint reference is made clear by the following examples, among others: (100)
a. *We talked with him about John
b. *He and John went away
In the first example, (100a), him is in a PP that does not dominate John. Therefore, him does not c-command John. Principle C is not violated, in other words. Nevertheless, a disjoint reference interpretation is clearly in order. The second example of disjoint reference, (100b), is not covered by principle C either. The reason is that with two conjoined NPs, the first does not c-command the second. This has been convincingly demonstrated by Huybregts (in unpublished work) and also by De Vries (1983). Interestingly, there are indications that a principle C violation is not sufficient either for disjoint reference. Consider the following example (due to Bolinger): (101)
We gave her the furcoat that Mary has always wanted to have
This sentence is acceptable under the intended interpretation (i.e. coreference of her and Mary). The problem, then, is that one might assume that her c-commands Mary, so that principle C is violated. In other words, (101) could involve a principle C violation without disjoint reference. An extra complication is that it is not entirely clear in (101) whether her c-commands Mary at all, or if it does, in what way. If we accept Kayne's (1984) ideas of binary branching, the structure of the relevant part of (101) might be as follows: (102)
[VP [V' V her] [ ... Mary ... ]]
Whether her c-commands Mary or not depends on the definition of c-command. If it is defined with respect to the minimal branching node (V' for her), her does not c-command Mary. If it is defined with respect to the minimal Xmax (arguably VP), her does c-command Mary. Only in the latter case do we have a violation of principle C. The issue is hard to decide because there are facts that involve a configuration like (102), but for which a disjoint reference interpretation seems more appropriate: (103)
?We told her that we loved Mary
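The two definitions of c-command contrasted for (102) can be made concrete with a small sketch. It is only an illustration under stated assumptions: the Node class, the use_xmax flag, the is_max marking, and the internal shape of the second object (an NP containing a clause with Mary) are hypothetical choices made for this example, not part of the analysis in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A tree node; is_max marks a maximal projection (Xmax)."""
    label: str
    children: List["Node"] = field(default_factory=list)
    is_max: bool = False
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self):
        # Link each child back to this node so we can walk upward.
        for child in self.children:
            child.parent = self

    def dominates(self, other: "Node") -> bool:
        node = other.parent
        while node is not None:
            if node is self:
                return True
            node = node.parent
        return False


def reference_node(node: Node, use_xmax: bool) -> Optional[Node]:
    """First branching node above `node`, or first Xmax if use_xmax is set."""
    current = node.parent
    while current is not None:
        if use_xmax and current.is_max:
            return current
        if not use_xmax and len(current.children) > 1:
            return current
        current = current.parent
    return None


def c_commands(a: Node, b: Node, use_xmax: bool = False) -> bool:
    ref = reference_node(a, use_xmax)
    return ref is not None and ref.dominates(b) and not a.dominates(b) and a is not b


# Structure (102): [VP [V' V her] [ ... Mary ... ]]
# The inside of the second object is an assumed simplification.
verb = Node("V")
her = Node("her")
mary = Node("Mary")
v_bar = Node("V'", [verb, her])
clause = Node("S'", [mary, Node("...")], is_max=True)
second_object = Node("NP", [Node("the furcoat"), clause], is_max=True)
vp = Node("VP", [v_bar, second_object], is_max=True)

branching_result = c_commands(her, mary, use_xmax=False)
xmax_result = c_commands(her, mary, use_xmax=True)
```

With the branching-node definition the reference node for her is V', which does not dominate Mary; with the Xmax definition it is VP, which does. Only the second setting yields a principle C configuration.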
It is usually assumed that cases like (101) are not possible if the antecedent is a subject. But also with subject antecedents, the facts appear to be complex. Klein (1980) discusses many apparent principle C violations with subject antecedents in Dutch. He rightly observes, for instance, that the following sentence is grammatical:
(104)
Hij kende Eline al vele jaren toen Osewoudt plots besloot haar te huwen
he knew Eline already many years when Osewoudt suddenly decided her to marry
'He had known Eline for many years already, when Osewoudt suddenly decided to marry her'
In this case, the nonanaphor (Osewoudt) is in an adjunct phrase. As in the preceding example, we could argue that c-command is defined in terms of branching nodes and that the adjunct phrase is Chomsky-adjoined to S. Under these assumptions, hij would no longer c-command Osewoudt. But as before, it is easy to find examples with subject anaphors and names in adjuncts that seem to require a disjoint reference interpretation. Klein (1980), for instance, gives the following example: (105)
*Hij kuste Eline, toen Osewoudt binnenkwam
he kissed Eline when Osewoudt entered
Thus, even if we assume that the anaphor does not c-command the name in such cases, we observe disjoint reference (although (105) is better with some intonations than with others). A fact that has some bearing on the c-command question is that quantified NPs can bind pronouns in the configuration responsible for the apparent principle C violation in (101): (106)
We gave everyone the furcoat that he wanted to have
According to the assumptions of chapter 2, this type of binding is only possible if everyone c-commands he.
With respect to adjuncts, matters are more complex. The problem here is that the operator words introducing adjunct phrases have a scope of their own. It has been known for a long time that sentences with adjuncts can involve scope ambiguities. Kraak (1966), for instance, discusses the ambiguities in (Dutch) sentences like (107): (107)
He did not leave the house because it rained
In one reading, the subject of the main clause did leave the house, but not because of the rain. In the other reading, he did not leave the house. The difference has been accounted for by either interpreting not as having scope over because, or the other way around. This type of scope ambiguity is relevant for the behavior of anaphors in adjuncts. Thus, consider the following example: (108)
No one left because he was sick
This sentence shows the familiar ambiguity: either people left for other reasons than sickness, or nobody left at all. Under the former reading, no one has scope over because, and he can be bound by no one. Under the latter reading, because has the wider scope, and binding of he (by no one) is entirely impossible.
It seems to me that the principle C violations that Klein discusses involve adjuncts with scope over the sentences containing the pronouns. This can be observed in sentences containing negation: (109)
Hij kende Eline niet, toen Osewoudt besloot haar te huwen
he knew Eline not when Osewoudt decided her to marry
'He did not know Eline, when Osewoudt decided to marry her'
The sentence is ambiguous in the same way as the other examples. Either toen is in the scope of niet, or the other way around. Osewoudt can only bind hij under the latter interpretation. If niet has scope over the adjunct phrase, i.e. if he possibly knew Eline at some other time than at the time that Osewoudt decided to marry her, then the sentence is ungrammatical under the intended reading. In that case, in other words, Osewoudt cannot be coreferential with hij. Under the most natural intonation, the ungrammatical reading is characterized by heavier stress on niet and by the absence of a pause (comma intonation) between niet and toen: (110)
*Hij kende Eline NIET toen Osewoudt besloot haar te huwen
Some of the problems noted can be solved by making a distinction between broad and narrow c-command, more or less along the lines of Contreras (1985). For narrow c-command, we take the first dominating branching node as the relevant node; for broad c-command, we consider
the first Xmax the relevant node or the Xmax of the same type that immediately dominates the relevant node (under Chomsky-adjunction). Thus, in (102), we can choose between V' (narrow c-command) and VP (broad c-command). Under the narrow c-command interpretation, her does not c-command Mary, and disjoint reference is not necessary in (101). Since there is also a broad c-command interpretation, the quantified NP everyone can bind the pronoun he in (106). But even if only broad c-command exists, the relative acceptability of sentences like (101) and (104) can be accounted for. Typically, we find such counterexamples to principle C if the antecedents are not governing subjects (in the sense defined above). In (101), the antecedent is not a subject at all, and in (104) the subject is not a g-subject with respect to the name (Osewoudt) in the adjunct. This is so because the verb does not govern the adjunct clause. If the antecedent is a governing subject, we always get strong principle C effects: (111)
*Hij denkt dat Osewoudt ziek is
he thinks that Osewoudt sick is
Interestingly, (104) also becomes ungrammatical if we add an embedding: (112)
*Hij zei dat hij Eline al vele jaren kende toen Osewoudt plots besloot haar te huwen
he said that he Eline already many years knew when Osewoudt suddenly decided her to marry
This sentence is ungrammatical in the reading according to which the adjunct clause belongs to the embedded clause. In that case, the hij of the matrix clause is the subject of the verb that governs the domain containing the adjunct clause. In other words, the sentence-initial hij of (112) is a governing subject for Osewoudt. If the facts are correctly interpreted along these lines, we see that the same notions that define anaphoric domains also play a role with respect to the various principle C effects. In any case, we must conclude that grammatical notions play a definite role in disjoint reference interpretations.
In spite of this conclusion, I am not convinced that principle C is a principle of grammar in the strict sense. My doubts are based on both theoretical and empirical considerations. Theoretically speaking, principle C as a negative condition does not make sense in the way principle B does. I will argue that principle B can be inferred from the existence of principle A, given some reasonable assumptions about Universal Grammar. A similar move does not seem to be possible for principle C. Empirically speaking, the effects of principle C are just too diverse (see (99)) for a principle of grammar. The effects of principle A, for instance, have an all-or-nothing character: a sentence with a reflexive that is not bound, or is
not properly bound, is entirely ungrammatical under all circumstances. Principle C violations, in contrast, vary considerably, depending on the exact configuration and the lexical nature of the NPs involved. What I would like to claim is that negative conditions, such as principle B of the binding theory, are not primitive statements of the theory of grammar. Principle B can be entirely eliminated if we assume a natural anti-redundancy principle, akin to the Uniqueness Condition of Chomsky and Lasnik (1977) (see also Roeper et al. (1984) for discussion). According to the principle that I have in mind, positive evidence for a binding relation for an anaphor in some domain automatically leads the language learner to infer that other anaphors are excluded in the domain in question. Thus, a child receiving positive evidence for binding of himself in (113a), automatically infers that other items, such as him, cannot be bound in the same context (113b): (113)
a. John washes himself
b. *John washes him
The principle in question can be formulated as follows: (114)
Nonredundancy Principle
Each domain definition defines the binding properties of maximally one type of anaphor (or pronominal)
Ideally, then, different types of anaphors do not overlap distributionally in a language. In practice, however, there is some overlap among anaphors, as we saw in the discussion of the binding facts of Dutch. This means that (114) must be interpreted as a principle that belongs to the theory of markedness. Thus, in the unmarked case there is no overlap. Marked overlap can only be introduced in a particular grammar by additional positive evidence. If (114) is part of UG (perhaps derived from some more general learning principle), we can account for the contrast between (113a) and (113b) without principle B. A positive condition like principle A would suffice, together with the establishment of the independently necessary lexical-type distinction between anaphors and pronominals. If it is learned that anaphors are bound in a governing category of some type, it follows from (114) that pronominals are not bound in this governing category. More generally, (114) suggests a way to get rid of negative conditions like principle B entirely. If it is correct, language-particular grammars only specify positive conditions (which are themselves taken from a small array of options from UG). By (114), each domain type is exclusively connected with one type of anaphor or pronominal, unless there is additional positive evidence to the contrary. If (114) can be maintained, the binding theory is improved, because the
mirror image relation between the effects of principles A and B is no longer accidental and unrelated. It would follow from a general principle, which is a highly desirable result. It should be noted now that (114), which makes sense of mirror image effects, cannot be extended to all existing negative conditions. In particular, it cannot be extended to principle C of the binding theory. The reason is that (114) is only defined for potential dependent elements, such as anaphors and pronominals. Nondependent elements, such as names, fall outside its scope because there are no positive domain definitions connected with these elements. All in all, it seems possible to make sense of principle B but not of principle C. Principle B effects can be derived from a natural general principle, while principle C remains unrelated to anything else in the grammar. I take this as a conceptual consideration against principle C (as a principle of grammar). It seems to me that more generally it can be said that grammar is about the configurational properties of dependent elements, not about the behavior of independent elements. In spite of these theoretical considerations, it can of course not be denied that there are principle C effects in an empirical sense. If the facts in question cannot be accounted for by a principle of sentence grammar, it is appropriate to ask what else could account for them. A natural answer is that principle C effects are due to some discourse principle. This conclusion is not incompatible with what we observed earlier, namely that principle C effects vary along strictly grammatical dimensions. A problem with discourse principles is that they are usually hard to make precise. In spite of this problem, I will give a tentative formulation of a discourse principle based on the anaphoricity scale of Lakoff (1968). Lakoff insightfully observed that anaphoricity is a matter of degree according to the following scale: (115)
a. names
b. definite descriptions
c. epithets
d. pronouns (anaphors)
Thus, in the following four examples, the second NP (in italics) corresponds with the items mentioned in (115): (116)
When Rambo arrived
a. they arrested Rambo
b. they arrested the president
c. they arrested the bastard
d. they arrested him
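As a rough illustration, the scale in (115) can be treated as an ordered type and applied to the four continuations in (116). This is a minimal sketch under assumptions: the names Anaphoricity and is_more_anaphoric are hypothetical, the numeric values only encode the order of the list, and the sketch looks at the scale alone, leaving aside any structural or discourse factors.

```python
from enum import IntEnum


class Anaphoricity(IntEnum):
    """Scale (115); a higher value means more anaphoric (lower on the list)."""
    NAME = 1
    DEFINITE_DESCRIPTION = 2
    EPITHET = 3
    PRONOUN = 4


def is_more_anaphoric(second: Anaphoricity, first: Anaphoricity) -> bool:
    """True if the second NP is lower on list (115) than the first."""
    return second > first


# (116): the first mention is the name "Rambo" in the when-clause.
first_mention = Anaphoricity.NAME
continuations = {
    "Rambo": Anaphoricity.NAME,                          # (116a)
    "the president": Anaphoricity.DEFINITE_DESCRIPTION,  # (116b)
    "the bastard": Anaphoricity.EPITHET,                 # (116c)
    "him": Anaphoricity.PRONOUN,                         # (116d)
}

# For each continuation: is it more anaphoric than the first mention?
preferred = {
    np: is_more_anaphoric(kind, first_mention)
    for np, kind in continuations.items()
}
```

Only the repeated name in (116a) fails to move down the scale.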
The first example, (116a), in which the name is repeated, is not felicitous. In normal discourse, there is a strong preference for replacing the second
occurrence of a referring expression by an NP that is more anaphoric according to (115) (i.e. an NP-type that is lower on the list). Starting from this fact, we can formulate the following discourse principle: (117)
Discourse Principle for Coreferential NPs
For each sequence of coreferential argument NPs C = (NP_1, ..., NP_i, NP_{i+1}, ..., NP_n) (1 < i ≤ n), NP_{i+1} must be more anaphoric than NP_i (unless both are anaphors/pronominals), depending on the relative prominence of NP_i
Crucial in this formulation is the role given to the relative prominence of NP_i. The intuitive (in part traditional) idea is that the need to continue a sequence with a more anaphoric NP decreases if the prominence of the last NP of the discourse sequence decreases. Relative prominence can be determined by purely structural factors, and also by discourse semantics. As for the latter case, one might hypothesize for instance that NPs contained in a presupposed sentence are less prominent than NPs in a clause that expresses new information. As for the purely structural factors, I assume the following prominence hierarchy: (118)
Prominence
(i) c-command
    a. local subject
    b. governing subject
    c. subject
    d. nonsubject
(ii) non-c-command
    a. degree of embedding i (i > 0)
    b. degree of embedding i + 1
    c. etc.
This specification of the relative prominence of two NPs in a sequence distinguishes two cases: (i) the first NP c-commands the second NP, (ii) the first NP does not c-command the second NP. In the former case, the first NP is relatively more prominent if it is a local subject with respect to the second NP (i.e. the two NPs are in the same local domain). If we go down the list, the disjoint reference interpretation becomes less compelling (or more variable, depending on nonstructural factors). The following examples illustrate this: (119)
a. John hates John
b. John thinks that John is sick
In both sentences, the disjoint reference interpretation is rather compelling. In (119a), the first occurrence of John is a local subject for the second. In (119b), it is not a local subject but still a governing subject. According to the most obvious interpretation of (118), the disjoint reference interpretation is slightly more compelling in (119a) than in (119b). This conclusion seems to be correct, although the intuitions are hardly crystal clear. Disjoint reference becomes less compelling if the first occurrence of John is a nongoverning subject: (120)
John left because John was sick
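The ordering that (118i) imposes on (119a), (119b), and (120) can be sketched as follows. This is only an illustration under assumptions: the Prominence type and its numeric values are hypothetical, and only their relative order is meant to reflect (118i). The sketch ranks the first occurrence of John in each sentence and says nothing about how prominence interacts with the anaphoricity of the second NP.

```python
from enum import IntEnum


class Prominence(IntEnum):
    """Case (118i): the first NP c-commands the second.

    A higher value means a more prominent first NP, hence a more
    compelling disjoint reference interpretation.
    """
    NONSUBJECT = 1          # (118id)
    SUBJECT = 2             # (118ic)
    GOVERNING_SUBJECT = 3   # (118ib)
    LOCAL_SUBJECT = 4       # (118ia)


# Status of the first "John" with respect to the second, as described in the text.
examples = {
    "(119a) John hates John": Prominence.LOCAL_SUBJECT,
    "(119b) John thinks that John is sick": Prominence.GOVERNING_SUBJECT,
    "(120) John left because John was sick": Prominence.SUBJECT,
}

# Most compelling disjoint reference first.
ranked = sorted(examples, key=examples.get, reverse=True)
```

The resulting order runs from (119a) through (119b) to (120), matching the decreasing strength of the disjoint reference effect described above.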
The sentences of (119) and (120) are all infelicitous, because the second John is not lower on the anaphoric scale than the first John. If we replace the second John by an epithet, the reality of (118) is more clearly demonstrated: (121)
a. *John hates the bastard
b. ??John thinks that we hate the bastard
c. John left because the bastard was sick
Of these examples, (121a) is ungrammatical under a coreference interpretation. The third sentence, however, seems rather acceptable in spite of the principle C violation. The second sentence, (121b), seems to be intermediate in acceptability (perhaps closer to (121a) than to (121c)). In any case, the discourse approach seems to be more flexible than the grammatical approach with respect to (121). If we have just principle C as a grammatical principle, all three sentences of (121) are ruled out in the same way. With nonsubjects as c-commanding antecedents, disjoint reference is not always compelling either, as we saw in the discussion of Bolinger's example ((101), repeated here for convenience): (122)
We gave her the furcoat that Mary has always wanted to have
This is an instantiation of case (118id). In this case, the second NP is even less anaphoric than the first. In general, disjoint reference effects are stronger if anaphoricity decreases. Thus, it has often been pointed out that both the following sentences are bad, but that (123b) is worse than (123a): (123)
a. *John thinks that John is sick
b. *He thinks that John is sick
If anaphoricity decreases, even sentences with antecedents that are not governing subjects are bad (compare (120) and (121c)):
(124)
*He left because John was sick
But, as we have seen in the discussion of Dutch examples like (109), examples with nongoverning subject antecedents are among the examples that sometimes lack disjoint reference under decreasing anaphoricity. In any case, a simple grammatical principle (like principle C) would fail to make all the relevant distinctions. Another advantage of the discourse approach is that it does not make disjoint reference effects entirely dependent on binding configurations (with a c-commanding antecedent). Thus, (100a), repeated here as (125), is also accounted for: (125)
*We talked with him about John
Strictly speaking, this example is not accounted for by principle C because him does not c-command John. But even if there is no strict binding configuration, decreasing anaphoricity leads to ungrammaticality unless the first NP becomes less prominent in terms of embedding. We can improve (125) by embedding the pronoun more deeply: (126)
We talked with his father about John
In this respect, (118ii) incorporates the classical insights about backward anaphora (see Lakoff (1968)). I will now briefly consider how the discourse approach applies to the principle C effects (in the descriptive sense) that have been observed for empty categories. Usually, the effects in question are very strong, as is clear from (99g) (repeated here for convenience): (127)
*Who_j t_j thinks that we like t_j
The first trace is a governing subject for the second trace, so in principle this example can be accounted for by (117) and (118ib). A question that arises, however, is how the two traces are situated on the anaphoricity scale (115). If both traces are anaphors, the principle C effect is not triggered according to (117). But it is incorrect to treat Wh-traces as anaphors (in any simple way). In the discussion of strong crossover in chapter 2 it was already argued that Wh-traces behave like Wh-phrases in situ under principle C. The same assumptions account for (127). If the second gap is interpreted as a Wh-phrase, anaphoricity does not increase from the first to the second gap in (127). Since the first gap is a governing subject for the second, it is sufficiently prominent to trigger a principle C effect according to (117). This accounts for the unacceptability of (127). Nothing changes if we interpret the second gap in (127) as a parasitic gap. In fact, we only find parasitic gaps if the first gap is not too
prominent with respect to the second (in the sense of (118)). A more detailed discussion of parasitic gaps will be given in the next section. Quite generally, multiple-gap constructions show the usual principle C effects. The same holds for construals of pronominals and gaps, as is clear from the well-known strong crossover facts. Interestingly, Wh-gaps can be "protected" against principle C effects if the gap and the antecedent are separated by an operator. It is this fact that makes easy-to-please constructions grammatical according to Chomsky (1981b, 201): (128)
John_j is easy [O_j [to please t_j]]
The empty operator O_j marks the maximal domain in which the trace is A'-bound. Chomsky (1981b, 201) rightly observes that principle C effects do not hold if the antecedent of a trace is outside this maximal operator domain. This fact is highly significant for the question whether certain parasitic gap constructions involve a second operator. In the next section, I maintain that the many principle C effects in parasitic gap constructions argue against the second operator hypothesis.
As for principles B and C of the LGB binding theory, we have come to the following conclusions. Contrary to principle A, principles B and C are negative conditions, i.e. they specify that certain elements may not be bound in certain domains. In spite of these similarities, the two principles are very different. Principle B makes sharp predictions, while principle C effects differ considerably, depending on the syntactic configuration and the lexical nature of the NPs considered. The most important difference is that principle B effects can be derived from principle A and a natural, general nonredundancy principle (114). Such a procedure does not seem possible with respect to principle C. It seems more appropriate to derive the highly variable principle C effects from independent discourse principles. As we have seen, such principles can exploit grammatical information without contradiction.
More generally, I would like to conclude that negative conditions, such as principles B and C, are never primitive statements of any particular grammar. Positive binding conditions are selected by setting certain parameters of Universal Grammar. Without further evidence to the contrary, negative binding conditions are automatically derived from the nonredundancy principle. The latter principle also belongs to Universal Grammar.
6.4. Principle C effects in parasitic gap constructions
Having redefined the nature of principle C effects, I would now like to return to parasitic gaps and in particular to principle C effects in parasitic gap constructions. Parasitic gap constructions are potentially very important from the point of view that concerns us here. The reason is that they
might offer crucial evidence in favor of the representational view of grammar and against the traditional view that involves movement transformations or "move alpha". If "move alpha" exists, there should be a one-to-one correspondence between traces and their immediate antecedents, since a moved category leaves behind only one trace (per move). The only apparent exception to this generalization can be found in so-called across-the-board (ATB) extractions (see Ross (1967)): (129)
Who_j did you [visit t_j] and [kiss t_j]
In principle, this kind of example could already be interpreted as a counterexample to the derivational view, were it not that ways have been found to reduce the two trace positions to one transformational factor. Williams (1978), for instance, made a well-known proposal to this effect. In his view, Wh-movement applies to conjuncts before linearization: (130)
COMP did you [visit who]
         and [kiss  who]
In this structure, who is treated as a common factor of the two parallel VPs. After its movement to COMP, both VPs have an empty object. Williams's proposal also accounts for the parallelism that we usually find in ATB constructions. This is somewhat problematic, however. In chapter 5, for instance, I gave some examples in which there is no strict parallelism: (131)
a. Who did you look at t and kiss t?
b. Who did you see a picture of t and kiss t?
To the extent that Williams's solution requires strict parallelism, these examples are evidence against it. Assuming, however, that these problems can be solved, ATB extractions can be claimed to be compatible with the derivational approach. Parasitic gap constructions are much more problematic in this respect, because they rarely involve the strict parallelism of ATB constructions. As it stands, parasitic gaps, which share many properties with traces, are so problematic for the classical derivational approach that they definitely seem to undermine it. If the derivational approach requires a one-to-one correspondence between Wh-phrases and gaps, there are in principle two ways to reconcile the problematic parasitic gap facts with it. One possible approach is to claim that some parasitic gap constructions are ATB constructions after all. In that case, the two gaps can be reduced to a single factor. This is the approach taken by Huybregts and Van Riemsdijk (1985). Another approach, taken by Chomsky (1986b), is to claim that parasitic
gap constructions involve an extra (empty) operator, so that the one-to-one correspondence between operators and gaps is restored. This is the chain composition approach discussed and rejected in chapter 4 above. In what follows, I will give an extra argument against this approach, based on the principle C effects that we find in parasitic gap constructions. But first I will briefly discuss the ATB proposal that Huybregts and Van Riemsdijk (1985) have made for parasitic gaps in Dutch. As we saw in chapter 4, the parasitic gap phenomenon in Dutch is mainly limited to infinitives introduced by prepositions: (132)
Wie heb je [zonder e te kennen] t teruggestuurd?
who have you without to know sent back
'Who did you send back without knowing?'
According to Huybregts and Van Riemsdijk, the alleged parasitic gap e is not a parasitic gap at all, but a trace parallel to the other trace. The two traces are the result of ATB extraction from an underlying structure like (130):
(133)
Wie heb je         [VP t teruggestuurd]
            zonder [VP t te kennen]
The idea is that after ATB extraction of wie (as indicated) a leftward linearization rule places the phrase introduced by zonder in the position preceding the upper VP. Another implication is that zonder can be analyzed as a coordinator. It is very difficult to make sense of this approach. One problem is that the two "coordinated" phrases in (133) lack the functional parallelism that we usually find in coordinated structures to which ATB extractions apply. The upper VP is the predicate of the sentence, while the lower VP is part of an adjunct to this predicate. This lack of functional parallelism strongly argues against the analysis. Besides, (133) is somewhat misleading. It looks as if two VPs are "coordinated". This, however, cannot be the case if one assumes in other contexts that infinitival complements are S or S'. Clearly, the infinitive following zonder in (133) is a control structure: the subject of the complement of zonder is controlled by the matrix subject (je). So, the real structure is rather (134) than (133): (134)
Wie heb je                    [VP t teruggestuurd]
            zonder [S' [S PRO [VP t te kennen]]]
If we assume the S-analysis of infinitives, then, the alleged parallelism (that we normally find in ATB constructions) collapses. The most serious problem is empirical. Huybregts and Van Riemsdijk rightly claim that their approach predicts that Ross's Coordinate
Structure Constraint applies to structures like (133). Although ATB extraction is possible from parallel structures, extraction from a single conjunct has to be blocked: (135)
*Who did you [VP visit t] and [VP kiss Mary]
Similarly, extraction from one of the "coordinated" structures in the sense of (133) would be impossible. Huybregts and Van Riemsdijk claim that their prediction is confirmed by the following sentence: (136)
Wiei heeft hij erj [zonder echt [ej naar] te verlangen] ti [tj om] gevraagd?
who has he there without really for to long for asked
'Who has he asked for it without really longing for it?'
Huybregts and Van Riemsdijk claim that their ATB theory correctly predicts that (136) is ungrammatical. The opposite must be true, however, because (136) is perfectly acceptable, not only to me but to all other native speakers I have asked so far. 5 If (136) is grammatical, in other words, the ATB approach is falsified. Other examples of this kind lead to the same conclusion. Thus, an indirect object can be extracted from only one of the alleged conjuncts without problems: (137)
Wiei heb je het boekj [zonder ej te lezen] [ti tj teruggegeven]?
who have you the book without to read back given
'To whom did you give back the book without reading it?'
In this case, the Wh-phrase is extracted from the second "conjunct" only. If the Coordinate Structure Constraint were to apply, the sentence would be ungrammatical. It is perfectly acceptable, however. Again, we must conclude that Dutch parasitic gap constructions do not involve coordination in any sense. I conclude therefore that the ATB approach to parasitic gap constructions in Dutch must be rejected. It should be noted that even if the ATB approach worked in some cases, it would not work for all parasitic gap constructions. In the well-known cases with parasitic gaps in the subject and traces in the VP, it is entirely impossible to reconstruct an ATB-type parallelism:
(138) A man whoj close friends of ej admire tj
Thus, no matter what the eventual fate of the ATB approach is, it can never be a general solution for the problem that the one-to-one correspondence of fillers and gaps seems to collapse in parasitic gap constructions. But at this point, I would like to make a stronger claim, namely that the ATB approach does not work at all.
Another way to restore the one-to-one correspondence between heads of A'-chains and Case-marked empty categories is by postulating a second chain for parasitic gap constructions. This is the chain composition approach briefly discussed in chapter 4. According to this approach, worked out in Chomsky (1986b), the second chain is headed by an empty operator: (139)
Which bookj did you [VP return tj [Oj before tj you could read tj]]
In this familiar instance of an adjunct parasitic gap construction, the first chain is headed by which bookj according to this analysis, and the second chain by the empty operator OJ. The empty operator has first been moved from the rightmost trace position to the COMP position after before (indicated by the trace). From there, it has been adjoined to the PP introduced by before, in order to meet a certain condition on chain composition, namely O-subjacency. Chomsky (1986b) introduces O-subjacency to account for the facts that in earlier analyses were accounted for by principle C of the binding theory: (140)
*Which bookj tj was [vp returned tj OJ before you could read ej]
In Chomsky (1982a), it was observed that parasitic gap constructions are subject to an anti-c-command condition. This would account for the contrast between (139) and (140). According to the analysis in question, the first trace in (139) does not c-command the parasitic gap, while in (140) it does. If parasitic gaps are treated as variables to which principle C of the binding theory applies, the contrast is accounted for: contrary to what we see in (139), the parasitic gap is not free (in the sense of the binding theory) in (140). The anti-c-command explanation has been criticized by Contreras (1985). He argues that anti-c-command does not save (141a), since the ungrammaticality of (141b) shows that the position of the first gap does c-command the second gap after all: (141)
a. Which articlesj did you file ej without reading ej?
b. *I filed themj without reading [Mary's articles]j
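The anti-c-command contrast between (139) and (140) can be made concrete with a small sketch. It is only an illustration under stated assumptions: the first-branching-node definition of c-command, an adjunct adjoined to VP (the conventional structure), and hypothetical names (`Node`, `dominates`, `c_commands`, `build_clause`) that are not part of any analysis discussed here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality: two "t" nodes stay distinct
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self):
        # Link each daughter back to its mother.
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a properly dominates b."""
    n = b.parent
    while n is not None:
        if n is a:
            return True
        n = n.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """Assumed definition: neither node dominates the other, and the
    first branching node above a dominates b."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    n = a.parent
    while n is not None and len(n.children) < 2:
        n = n.parent
    return n is not None and dominates(n, b)


def build_clause(subject: Node):
    """[S subject [VP [VP V t] [PP before [S' you [VP read e]]]]]
    The adjunct PP is adjoined to VP (an assumption of this sketch)."""
    obj_trace = Node("t")
    gap = Node("e")
    adjunct = Node("PP", [Node("before"),
                          Node("S'", [Node("you"),
                                      Node("VP", [Node("read"), gap])])])
    vp = Node("VP", [Node("VP", [Node("V"), obj_trace]), adjunct])
    Node("S", [subject, vp])
    return obj_trace, gap


# (139): the real gap is the object trace; (140): it is the subject trace.
subject_trace = Node("t")
object_trace, parasitic_gap = build_clause(subject_trace)
object_c_commands_gap = c_commands(object_trace, parasitic_gap)
subject_c_commands_gap = c_commands(subject_trace, parasitic_gap)
```

Under these assumptions the object trace does not c-command the parasitic gap, while the subject trace does. If the adjunct were instead made a sister of V and the object inside a flat VP, the object trace would c-command the gap as well, which is the configuration Contreras's argument relies on.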
Chomsky accepts this argument and assumes that parasitic gaps in these constructions are not subject to principle C effects. Instead, he assumes O-subjacency, which says that in chain composition (of two chains C and C'), the head of C' must be O-subjacent to the final element of C. A gap t is O-subjacent to t' if there is no barrier including t and excluding t'. Crucial to this account is the concept of a barrier. A maximal projection is a barrier for the categories containing it if it is not theta-marked or if it immediately dominates a non-theta-marked maximal projection. Given the O-subjacency requirement on chain composition, the contrast
between (139) and (140) can be accounted for without having recourse to an anti-c-command condition (like principle C of the binding theory). The ungrammaticality of (140) follows from the fact that there is one barrier too many (namely the node VP) between the subject trace of the first chain and the empty operator heading the second chain. In (139), in contrast, the empty operator is connected with the object trace (after return), which is not separated by a VP barrier from the second chain. In fact, in (140) as well, the operator could be connected with the object trace. This is prevented, however, by Chomsky's final version of the condition on chain composition: (142) The operator of the parasitic gap must be O-subjacent to the head of the A-chain of the real gap This revised condition crucially refers to the head of the A-chain. Consequently, the object trace has to be disregarded in (140). Only the subject trace is relevant, because this is the head of the A-chain. If we accept (142), then, the contrast between (139) and (140) is indeed accounted for without reference to principle C of the binding theory. It is not clear, however, why (142) should be accepted. Moreover, it is far from clear that principle C effects (in the descriptive sense) are irrelevant for parasitic gap constructions. I will return to these matters in a moment. First, I would like to criticize the chain composition approach on more general grounds. It is useful in this connection to stress that the barrier and chain composition approach should not be seen as being radically opposed to the connectedness approaches proposed by Kayne (1983), and worked out by Longobardi (1985), Bennis and Hoekstra (1984), Koster (1984b), and many others. Huybregts and Van Riemsdijk (1985) give this impression, but Chomsky (1986b, sect. 
2) explicitly mentions that his notion of a "blocking category" (crucial to the definition of barriers) is based on an insight of Cattell (1976) that was developed in the connectedness approaches. In terms of chapter 4 above, the notion of a blocking category corresponds with the domains defined by the Bounding Condition. Crucial to the theory of dynasty-driven domain extensions is the idea of a set of governors (a dynasty) that have something in common. What the governors have in common in Chomsky's view is the fact that they assign a theta role. In our terminology, then, the Bounding Condition can be circumvented according to Chomsky's approach if there is a dynasty of theta-role assigners. (In fact, Chomsky uses the notion of L-marking, which is more complex than just theta-marking.) So far, there is a substantial common core among the various approaches: all claim that there is a strict locality principle (Bounding Condition, minimal g-projection, blocking category) and a percolation mechanism based on government. The idea of a strict locality principle is
fairly traditional by now. What is new, and in my opinion the frontier of current generative research, is the common attempt to determine the nature of the percolation mechanisms. Although there is a great deal of consensus that percolation depends on a set of governors, there is perhaps some disagreement as to the nature of this set. One implication of chapter 4 is that a chain of theta-role assigners (or L-markers) is not sufficient. An additional requirement is that the theta-role assigners govern in the same direction (global harmony). This extra requirement accounts for many facts, and no alternative has been developed so far. In this area, there would only be substantial controversy if there were alternatives. Apart from the directionality conditions, there is perhaps a second area of controversy, namely with respect to the idea of chain composition itself. As such, this idea is not incompatible with the connectedness approaches. On the contrary, it could easily be claimed that parasitic gap constructions involve two chains that correspond to portions of trees that must form a subtree in the sense of Kayne (1983). Conceptually speaking, then, the idea of chain composition is neutral among the various percolation theories. As said before, the issue is more important for the debate between representational and derivational theories of A'-chains. Empirically speaking, I see no evidence for chain composition in parasitic gap constructions. The only evidence, characterized as "strong" by Chomsky (1986b), is based on the fact that parasitic gaps are bad in islands. It seems to me, however, that this evidence is rather weak, and I can only repeat and try to sharpen what led me to this conclusion in chapter 4. If a gap cannot be embedded in islands, we may generally conclude that it is part of an A'-chain with local links (see Chomsky (1977)). 
This conclusion can be extended to parasitic gaps: the well-documented fact that parasitic gaps cannot be embedded in certain islands shows that parasitic gaps are part of an A'-chain. Nothing more and nothing less can be concluded on the basis of the facts. It is, for instance, not possible to infer from the facts how many chains are involved in the construction. Consider, for example, the following bad sentence: (143)
*Which book did you return t before you met the man who wrote e?
The gap e is bad in the island created by the relative clause construction. From this we may infer that e is part of an A'-chain. It is crucial from our point of view that this is the case when e is connected to a hidden second operator, but also when e is directly linked to the Wh-phrase that introduces the sentence (which book). On the basis of facts like (143) alone, it is not possible to decide which interpretation is correct. It is therefore
incorrect to consider facts like (143) as evidence for a second chain and a second operator. The evidence is neutral in this respect. If there is only one operator position in constructions like (143), the parasitic gap is already in an island (an adjunct island in (143)). According to the one-operator approach, parasitic gaps are usually gaps in islands that are "saved" by a second gap. Even then, parasitic gaps remain marginal and have a status comparable to other island violations that are just acceptable. As in the former case, the latter type of gap becomes unacceptable if an extra island is added (the Condition of Global Harmony does not necessarily save the constructions in question, since it is only a necessary condition for acceptability). Thus, (144a) is a fairly acceptable island violation, but an extra island makes the sentence in (144b) bad: (144) a.
Which race did you express a desire to win t?
b. *Which race did you express a desire to meet the man who won t?
No one concludes from this contrast that (144a) involves a second operator or chain, because (144a) only has one gap. Similar things can be said about the marginal constructions with extractions from adjuncts: (145) a.
Which man did you go to England without talking to t?
b. *Which man did you go to England without knowing the woman who talked to t?
No matter how marginal (145a) is, an additional island (145b) makes the sentence much less acceptable. Data like (144) and (145) establish the fact that also when there is one chain, marginal sentences with gaps become unacceptable by adding islands. But then it is an illusion to think that bad parasitic gaps in islands point in the direction of two chains. Since the island evidence forms the only evidence given for the extra chains, we must conclude that there is no valid evidence available for chain composition. On the contrary, I will point out shortly that there is some evidence against it. Even if there is no evidence for chain composition, one could argue that it does make sense somehow. In the case of parasitic gap constructions, however, it is hard to make sense of the second (empty) operator. The chains derived by Wh-movement in Chomsky (1977) are often predicated on something. This is particularly true for easy-to-please constructions, which are believed to involve empty operators (Chomsky (1981b)): (146)
John is easy [OJ [to please t]]
The structure headed by OJ is combined with easy and predicated on John. We can even create a kind of composition of two A'-chains:
(147)
Who [t seems [t to be easy [Oj to please tj]]]
In this structure, the second chain is connected with the last element of the first chain by the predication relation that we also observe in (146). It should be noted that predication in this sense has properties entirely different from the alleged chain composition in parasitic gap constructions. In (146) and (147), the anti-c-command condition is not only lacking, c-command is even obligatory: the subject of the predication must c-command the complex predicate introduced by the empty operator. Secondly, we observe that O-subjacency is violated in both (146) and (147): there is always a VP-barrier between the subject and the second chain. Thirdly, (142), necessary to make chain composition work, is violated in (147). In (147), the second chain is not connected to the head of the first chain but to its tail. It is clear, in other words, that chain composition cannot be reduced to predication. But if chain composition is not predication, what else can it be? It seems an ad hoc concept for two reasons. First, the second operator does not create a complex predicate as it usually does. It remains unclear what its function is. Second, the composition mechanism itself is based on (142), which has no independent status in the grammar. So far, I have concluded that there is no evidence for chain composition and that its functional sense has not yet been clarified. I will now show that even if these objections can be overcome, it is not clear whether the composition condition (142) works technically. Reconsider, for instance, (140), which is supposed to be ruled out by (142): (148)
*Which bookj tj was [VP returned tj Oj before you could read ej]
The explanation only works if the VP between the head of the first A-chain (the subject) and the operator heading the second chain forms a barrier. It is difficult to see why this VP is a barrier in the relevant sense. It is a barrier only if the adjunct introduced by before is not Chomsky-adjoined to a lower VP. In other words, the following commonly assumed structure is tacitly rejected: (149)
[S NPj [VP [VP V [NP tj]] [PP Oj [PP [P before] S']]]]
In order to rule out (148) (= (140)), it must be assumed that there is a barrier between the head of the first chain (the subject under S in (149)) and the operator OJ. In (149), however, the topmost VP is not a barrier in this sense. The reason is that this VP does not include the operator in the technical sense of Chomsky (1986b). According to the technical definition, the VP in adjunction structures like (149) has two segments. A category includes another category only if both segments of the former dominate the other. In (149), only one segment of the VP dominates the operator OJ, so that the VP is not a blocking category and therefore not a barrier. In other words, the subject in (149) is O-subjacent to the operator, so that (148) is not ruled out under this structure. A necessary condition for this solution to work at all is that the adjunct is not adjoined to the VP-projection but an intrinsic part of it. Chomsky explicitly assumes that there is a "small VP" with which the adjunct forms a higher echelon of the VP projection. Thus, instead of (149), we would have something like (150): (150)
[S NP [VP ... [PP O [PP [P before] S']]]]
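The difference between domination by one segment and inclusion by a whole category, on which the contrast between (149) and (150) turns, can be sketched as follows. The encoding is an assumption made for illustration only: segments of one adjunction structure share a `category` key, and the names `Node`, `segments`, `includes`, `build_149`, and `build_150` are hypothetical.

```python
class Node:
    def __init__(self, label, children=(), category=None):
        self.label = label
        self.children = list(children)
        self.parent = None
        # Segments of the same category share one key (assumed encoding);
        # every other node gets a unique key of its own.
        self.category = category if category is not None else object()
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if segment a properly dominates node b."""
    n = b.parent
    while n is not None:
        if n is a:
            return True
        n = n.parent
    return False


def segments(root, category):
    """Collect every segment in the tree that belongs to the category."""
    found = [root] if root.category == category else []
    for child in root.children:
        found.extend(segments(child, category))
    return found


def includes(root, category, target):
    """A category includes target only if all of its segments dominate it."""
    segs = segments(root, category)
    return bool(segs) and all(dominates(s, target) for s in segs)


def adjunct_pp(operator):
    # [PP O [PP before S']]
    return Node("PP", [operator, Node("PP", [Node("before"), Node("S'")])])


def build_149():
    """Adjunction: both VP nodes are segments of one category."""
    op = Node("O")
    lower = Node("VP", [Node("V"), Node("t")], category="VP")
    upper = Node("VP", [lower, adjunct_pp(op)], category="VP")
    return Node("S", [Node("NP"), upper]), "VP", op


def build_150():
    """No adjunction: the small VP is a category distinct from the VP above it."""
    op = Node("O")
    small = Node("VP", [Node("V"), Node("t")], category="small VP")
    upper = Node("VP", [small, adjunct_pp(op)], category="VP")
    return Node("S", [Node("NP"), upper]), "VP", op
```

Calling `includes(*build_149())` gives `False`, because the lower VP segment does not dominate the operator, so the VP cannot count as a barrier between subject and operator. Calling `includes(*build_150())` gives `True`, since the VP then has a single segment that dominates the operator.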
In this case, the VP has only one segment that fully dominates the operator. In this case, then, there would be a barrier. But it is hard to see why (150) has to be preferred over the rather conventional structure (149). In this case, the adjunct has no intrinsic relations with other material in the VP projection, and one might plausibly argue that such adjuncts are always adjoined to projections rather than intrinsic constituents of them. But even if (150) is the correct structure, there is nothing that prevents further adjunction of the adjunct PP to the VP or even the S. In that case, the barrierhood of the VP would be circumvented. Exactly such procedures are used by Chomsky (1986b) in other cases to meet the requirements of O-subjacency, for instance the extra adjunction of the operator to the PP. Since there is little reason to consider the VP an unsurmountable barrier, it is not clear how (148) can technically be excluded by the condition on chain composition (142). Under the usual anti-c-command assumptions, on the other hand, (148) is straightforwardly ruled out. There are also cases for which a two-chain interpretation seems impossible:
(151)
Who would [a picture of e] surprise t?
This sentence involves only one clause. So, if we assume that operators belong to clauses, there can only be one operator in (151). In fact, then, the one-to-one correspondence between operators and gaps collapses in examples like (151), so that it favors the representational approach over the derivational one. But even then, (151) can be interpreted as a structure with two chains that share the operator heading the chains. Under this interpretation, however, (151) is not accepted by (142). The reason is that the head of the second chain (who, heading the chain (who, e)) is not O-subjacent to the head of the A-chain of the "real" gap: there is a VP-barrier between who and t. In fact, there is no obvious two-chain interpretation of (151) which is compatible with (142). We must conclude, then, that O-subjacency and (142) cannot replace the anti-c-command condition and that there are cases in which (142) excludes a two-chain interpretation for two-gap constructions. Efforts to get rid of principle C effects in parasitic gap constructions have not been very convincing so far. It seems to me that Contreras's conclusion with respect to facts like (141b) (repeated as (152) for convenience) has been an overstatement: (152)
*I filed themi without reading [Mary's articles]i
Although this particular sentence is not great, we certainly find subject-object asymmetries with anaphors that have their antecedent in a following adjunct (see Lakoff (1968)). Thus, (153a) is definitely better than (152): (153)
a. We filed themi before we could read the articlesi
b. *Theyi were filed before we could read the articlesi
In general, parasitic gap constructions do show the same pattern of relative acceptability as constructions with backward anaphora do (i.e. they show principle C effects). Consider, for instance, the case of a totally unacceptable parasitic gap construction mentioned by Contreras:
(154) *Whoi did you introduce ei to ei?
A corresponding construction with backward anaphora is also totally unacceptable:
(155) *Did you introduce him(self)i to Billi?
This sentence is far worse than (152). Sentences with acceptable parasitic gaps usually have corresponding sentences with acceptable backward anaphora:
(156) a. Whoj would [a picture of ej] surprise tj?
b. A picture of him(self)j surprised Billj
The same holds for parasitic gap constructions in Dutch: (157) a.
Welk boekj heb je [zonder ej te kennen] tj gekocht?
which book have you without to know bought
'Which bookj did you buy tj without knowing ej?'
b. Heb je [zonder hetj te kennen] het boekj gekocht?
have you without it to know the book bought
'Have you bought itj without knowing the book?'
The fact that structures with backward anaphora like (152) are occasionally somewhat more marginal than the corresponding structures with parasitic gaps hardly undermines the general conclusion that there is a rather strong parallelism between the possibility of backward anaphora and parasitic gaps. In both types of structures we observe very similar principle C effects. Parasitic gap constructions do show principle C effects after all, and attempts to get rid of these have been unsuccessful. But if parasitic gap constructions do show principle C effects, we have evidence against a second operator. A second operator would neutralize the principle C effects, as is clear from easy-to-please constructions: (158) Whoj tj is easy OJ to please tj? The first trace c-commands the second trace, but there is no principle C effect, thanks to the operator that separates the two traces. In parasitic gap constructions, we see just the opposite: (159)
*Whoj tj was happy before we knew tj?
If there were a hidden operator between the two traces, the structure would be saved, as in (158). We must conclude, then, that the chain composition approach to parasitic gap constructions is unsupported at the moment. The evidence in favor of it (islands) is not valid. The second operator does not make sense, and it does not block principle C effects, such as those in predication structures. Moreover, the conditions that are supposed to determine the actual composition, O-subjacency and (142), are ad hoc, technically defective, and not sufficient to save the one-to-one correspondence between Wh-phrases and gaps (as is clear from (151)). If this conclusion is correct, parasitic gaps form genuine evidence against the derivational approach to chain formation and favor the representational view (without "move alpha").
6.5. Conclusion
A comparison between anaphoric dependencies and the dependencies created by "move alpha" has often led to the conclusion that the two types of relations must be different. The locality condition for bound anaphors is, under standard assumptions, identified as being principle A of the binding theory, i.e. the governing category characterized by government of the anaphor and some opacity factor (SUBJECT in Chomsky (1981b)). The locality condition for movement is usually considered something rather different: subjacency characterized by two nodes rather than the one specified for principle A. Furthermore, opacity factors are deemed irrelevant for movement. Much of this traditional distinction is based on the study of English. If we study other languages, even the ones that are closely related to English, we find a rather different pattern. In Dutch and German, for instance, "movement" can be characterized by a one-node condition that is very similar to the locality condition of principle A. The main difference between the Bounding Condition postulated for "movement" in chapter 4 and the locality condition of principle A lies in the opacity factors. If we study the binding systems of languages other than English, we find that the opacity factors familiar from principle A are not absolute, invariant conditions. Rather, they have a position on a universal scale ((11) above) that can be interpreted as a markedness hierarchy in accordance with the Subset Principle of Berwick (1985). In some languages, the opacity factors are such that larger anaphoric domains (larger than for English) are defined. In many languages, it is also possible to detect smaller domains. Naturally, the smallest anaphoric domains are defined when the opacity factors are entirely lacking. The resulting condition is indistinguishable from the Bounding Condition that we have postulated for "move alpha" and other local dependencies. 
This universal, multipurpose locality principle is sufficient for anaphoric clitics in many languages. In Dutch, its reality can be demonstrated indirectly: it is the only domain in which zich and zichzelf can both be bound. The reality of the Bounding Condition for anaphora, then, is supported by both conceptual and empirical considerations. Conceptually speaking, it is the natural "null domain" of a hierarchy of domains of increasing size. Empirically, it is supported by the facts mentioned and described in this chapter. The most important conditions of binding theory are principle A-like conditions. In fact, the hierarchy (11) of principle A-like conditions is the substance of the binding theory. As in the case of bounding theory, it can be seen as a family of conditions, with the Bounding Condition as the unmarked core and the other conditions as extensions of it. As in the case of bounding, the extensions can also be determined by a dynasty of some sort. In general, dynasties for long distance anaphors are defined by dependent heads (mainly verbs). According to the theory sketched in this chapter, principle B-like
conditions are not primitive but derived. In order to account for the complementarity of principles A and B, I have postulated a nonredundancy condition as a universal learning principle. This condition characterizes the unmarked case, according to which anaphors do not overlap in their domain conditions. In practice, grammars deviate in one way or another from this ideal. The deviations, which are rather slight, must be learned through extra positive evidence. If these conclusions are correct, the further conclusion seems justified that the inductive leap from data to grammar is greatly facilitated by universal principles, which are by hypothesis part of the initial state of the language acquisition device. As for principle C of the binding theory, we have concluded that it is not a principle of grammar in the strict sense. We reinterpreted it as a discourse principle, according to which the crucial relative prominence of NPs in a discourse is determined by both structural and nonstructural factors. This step was motivated by the diversity of principle C effects, and by the fact that principle C, contrary to principle B, could not be linked to the positive conditions (the condition A-like conditions). We concluded this chapter with a discussion of parasitic gaps. The ATB and chain composition approaches to parasitic gaps were rejected. The former approach was rejected on the basis of crucial counterexamples (violation of the Coordinate Structure Constraint); the latter approach was also considered unsupported by the evidence. The positive evidence is not convincing and there is clear negative evidence, namely the abundance of principle C effects to which parasitic gaps are subject. If there were a second chain in parasitic gap constructions, headed by an empty operator, these principle C effects would be neutralized. All in all, the conclusions of this chapter favor the representational view of grammar over the derivational one in two respects. 
First, the general characteristics of domain definitions for binding appear to be the same as for bounding: in both cases we find a prototypical domain, the Bounding Condition, which is extended in certain ways. This conclusion is at variance with the traditional indirect evidence for "move alpha", according to which "move alpha" can be detected by its deviant domain definition (Subjacency rather than principle A). A similar conclusion can be drawn from the properties of parasitic gaps. In parasitic gap constructions, the one-to-one correspondence between fillers and gaps seems to collapse. This conclusion is compatible with the representational view on filler-gap constructions: as in anaphoric and other dependencies, it is always possible to have one antecedent for more than one dependent element. On the derivational view, in contrast, the lack of a one-to-one correspondence between fillers and gaps seems inexplicable.
NOTES
1. All speakers of standard Dutch prefer zich (as in (26)). Some speakers reject (27a) but accept (27b).
2. See De Geest (1972), quoted in chapter 3 above.
3. Judgments on (73) vary; some speakers accept it in the intended reading, while others reject it.
4. This is not the kind of structure that Kayne (1982) assumes for such structures. Kayne analyzes double object constructions as small clauses.
5. See Bennis (1986) for a similar conclusion.
Chapter 7
The Radical Autonomy of Syntax
The aim of generative grammar, both in the sense of standard GB theory and in the sense of this study, is to give an answer to Plato's problem in the realm of human language. Technically speaking, this endeavor has led to two major theoretical concerns that can be described as level theory and domain theory. According to the most common development of level theory, syntactic representations must be dissected into a number of levels, some of them connected by "move alpha". According to most versions of domain theory (including the one presented in this study) syntactic computations are constrained by locality principles. For antecedents, a fairly uncontroversial and uniform locality principle has been assumed, namely the principle of c-command (see chapter 1). For dependent elements, a great variety of locality principles exist, such as Subjacency, opacity (principle A of the binding theory), and the extensions for long distance anaphora. As for level theory, a distinction has been made among Lexical Structure (LS), D-structure (DS), S-structure (SS), Logical Form (LF), and Phonetic Form (PF). Others have even added a sixth level, namely the level of NP-structure (NPS). One major conclusion of this study is that this proliferation of levels is unjustified. Quite generally, it is believed that level theory is one of the core theories (perhaps the core theory) of generative grammar. I disagree with this interpretation. In my opinion, level theory lost its major significance with the introduction of trace theory (Chomsky (1973)). This is particularly true for the levels connected by "move alpha", i.e. D-structure, NP-structure, and Logical Form, as levels distinct from S-structure. The conclusion of the preceding chapters is that the common 5- or 6-level model can be reduced to a 3-level model: (1)
LS
↓
SS
↓
PF
"Move alpha" plays no role in this model because there is no evidence for this traditional notion. As I tried to demonstrate in the preceding chapters, it is as superfluous for linguistic theory as the notion of phlogiston was for
the theory of combustion. Functionally, "move alpha" cannot be distinguished from other forms of property sharing at S-structure (chapter 2). Formally, it has precisely the locality properties of the configurational matrix, which we also find for a great number of grammatical relations that were never analyzed as "movement". At the time that English was the main empirical basis for generative theories, it seemed rather obvious that the locality principle for "movement" (Subjacency) was different from the locality principle for bound anaphora. The study of other languages, however, has revealed a range of variation for both types of locality principles. For Dutch, for instance, it was concluded that the proper formulation of Subjacency (the Bounding Condition) hardly differed from principle A as formulated in Chomsky's Pisa lectures. This insight gave the impetus to this study and to the belief that a very important generalization is missed in standard GB theory, i.e. the generalization that locality is essentially the same across construction types. If "move alpha" does not exist, level theory loses much of its significance, and domain theory becomes the all-important core theory of grammar. In more practical terms, this means that linguistic theory has to focus on the question how we can develop an optimally unified theory of syntactic domains. If we look at actual research practice over the last ten years, we see that the traditional preoccupation with level theory has in fact impeded the formulation of optimally unified domain theories. The ambition to establish a clear distinction between "move alpha" and other syntactic relations has led to the postulation of properties like Subjacency and the ECP, both conceived as properties unique to "move alpha". Both properties appear to be unmotivated, and they have stood in the way of a unified theory of syntactic domains. 
The unified domain theory presented in this study is based on the well-known Chomskyan distinction between core and periphery. Universal Grammar provides a core domain for each syntactic domain of a certain kind (the Bounding Condition). As a consequence, each relevant syntactic relation in every language is minimally constrained by this prototypical locality principle. Languages only differ with respect to the possible extensions of the basic domain. As we saw in the preceding chapters, domain extension is a matter of parametric variation. Like the definition of the core domain, domain extension is tightly constrained by Universal Grammar. Extended domains are either defined by an opacity factor, selected from a small and strictly organized set (chapter 6), or they are defined by a dynasty, a set of governors also selected from a small number of possible sets. This hypothesis, then, is a tentative answer to Plato's problem with respect to domain theory. Syntactic domains are built up from a strictly invariant core domain, which can be extended on the basis of a small number of options.
The domain theory presented in this study confirms the Thesis of Radical Autonomy, at least for the core domain. According to the Thesis of Radical Autonomy, basic syntactic properties are entirely construction-independent. The oldest generative grammars contained rules that corresponded to certain construction types. Thus, for the passive construction there was a passive transformation, for the cleft construction there was a cleft transformation, and so on. It was soon realized that the rules in question have much in common, and that an adequate theory has to factor out these common elements. In the 1970s this led to highly impoverished rule components ("move alpha") that interacted with principles that were in part construction-independent. Or at least many ingredients of the principles were construction-independent. The property of c-command, for instance, was part of many different dependencies, such as anaphor binding and the antecedent-trace relation (movement). The property of locality, however, remained to a certain extent construction-dependent. In standard GB theory, for instance, constructions involving anaphor binding are characterized by a locality principle typical of anaphor binding.
Of course, the standard GB theory is autonomous in the sense that the form of domains cannot be derived from their function. There is, in other words, only an arbitrary relation between the domain implied by principle A of the binding theory (form) and anaphor binding (function). Although the relations in question are arbitrary, they exist nevertheless, i.e. there are fixed correspondences between construction types and properties. According to the Thesis of Radical Autonomy, there are no correspondences at all in core grammar. Core grammar is defined by the properties of the configurational matrix (chapter 1). None of the properties of this matrix (c-command, locality) uniquely determines any kind of syntactic construction. There are no core properties of, say, "movement" or anaphor binding.
To what extent the properties of the more peripheral domain extensions are radically autonomous is an open question at this moment. In principle, it is possible that anaphoric constructions involve domain extensions of a type different from other constructions. In that case, domain extensions would still be autonomous, but not radically autonomous.
The Thesis of Radical Autonomy has a strong anti-functionalistic bias. European structuralism often adopted a certain form of functionalism. According to this functionalism, syntactic structures are not autonomous but can only be understood - ultimately - in terms of their function with respect to meaning expression and communication. This form of functionalism, quite strong in the Prague School, is firmly embedded in the Hegelian tradition of the Geisteswissenschaften, with its emphasis on organic growth and other teleological metaphors (see Holenstein (1984)). It seems to me that the Thesis of Radical Autonomy (if correct) refutes this form of functionalism. In fact, the ordinary thesis of autonomy already left little hope for functionalism, since it has never been established that the correlation between syntactic properties and (construction-bound) functions is nonarbitrary. But as long as there was any correlation at all, the functionalist could still have hope that this correlation could be explained in functional terms in the future. If there is no correlation at all between syntactic properties and specific constructions, the notion of a functional explanation of correlations no longer makes sense.
In the view of grammar developed here, the core properties of syntactic modules have no inherent meaning or purpose. At some level of abstraction, language is a meaningful, goal-oriented human activity. But this is not the level of abstraction relevant for the study of the syntactic modules that underlie language in this "humanistic" sense.
A belief in the inherently meaningful and goal-oriented nature of grammar is not really limited to functional grammarians in the narrow sense. In fact, standard GB theory implicitly adopts a weak form of functionalism by stressing the concept of Logical Form. Logical Form is conceived of as a natural counterpart to certain calculi containing quantifier-variable notation. Chomsky's interpretation of Wh-traces as variables and so-called LF-movement (Quantifier Raising, again supposed to create quantifier-variable structures) have been the major manifestations of this tradition. In the preceding chapters, I have attempted to explain why I believe that this development does not make sense. Wh-movement does not generally create operator-variable structures, and QR is not able to disambiguate structures with scope ambiguity (chapter 2). Moreover, LF does not make sense, because the mapping deriving it from S-structure ("move alpha") does not make sense. The relevant empirical insight that quantifier interpretation is somehow island-sensitive can just as well be expressed at S-structure. At this point, I would like to add a more principled objection to the notion of LF.
The notion of LF is presumably related to the confusion that can be found in the following statement by Richard Montague (1970):
    There is in my opinion no important difference between natural languages and the artificial languages of logicians; indeed, I consider it possible to comprehend the syntax and semantics of both kinds of languages within a single natural and mathematically precise theory. On this point I differ from a number of philosophers, but agree, I believe, with Chomsky and his associates.
It is ironical that the only time that Montague expressed agreement with Chomsky, this agreement was based on a notion of language that Chomsky has entirely rejected, especially in recent years. Chomsky (1986a) has unambiguously rejected the notion of E-language that Montague seems to have had in mind. An E-language is a set of sentences (well-formed formulas) of some sort. It is inappropriate to reconstruct "knowledge of language" as knowledge of a set of sentences (E-language), just as it is inappropriate to reconstruct "knowledge of chess" as knowledge of a set of moves. Someone who knows a language has access to a set of modules, the open parameters of which are set in a certain manner (I-language). In other words, natural E-languages (the natural languages that Montague has in mind) do not exist. To the extent that there are sets of sentences in the world that we wish to define as languages, these languages are all artificial. "Natural" E-languages are not natural at all; they are not "given" in some sense. They are the arbitrarily chosen product of some arbitrary selection of our cognitive modules. The sets in question can only be selected on the basis of some decision. Therefore, they belong to the world of human culture and not to the world of human individual psychology. In Popperian terms, E-languages belong to world 3 and not to world 2. If we maintain this terminology, generative grammar concerns the structure of certain aspects of world 2 and not the E-languages of world 3 (see Popper (1972)).
If these conclusions are correct, Montague's statement becomes entirely trivial and tautological: two kinds of artificial languages can be studied in the same way, namely as artificial languages. On the other hand, it cannot be denied that the definition of natural languages as sets of sentences in Chomsky (1957) and the subsequent development of mathematical linguistics have misled generative grammarians by suggesting that generative grammar adopts the concept of E-language.
It seems to me that the notion of LF, and more generally the notion of natural logical calculi with quantifier-variable notation, is misleading to the same degree. Logical calculi are on a par with E-languages and belong to the world of the products of human culture (Popper's world 3). Logical calculi are always designed with a certain goal (or even an interpretation) in mind and are, therefore, inherently goal-oriented and meaningful. There is, however, not the slightest reason to assume that among the various ways we can combine our cognitive modules there is one outcome that is inherently goal-oriented or meaningful, namely a natural calculus with quantifier-variable structure.
Ultimately, the idea of natural Logical Forms seems just as meaningless as the idea of natural E-languages. In fact, the two concepts seem closely related. It must be concluded, then, that any attempt to combine the notion of individual psychology with the notion of a calculus (interpreted or not) is based on a kind of category mistake. The study of the modules of grammar is a matter of cognitive psychology, but the study of the product of these modules when put to use is not a matter of cognitive psychology. In the terminology adopted here, modules and their structures belong to world 2, while E-languages and logical forms belong to world 3.
If we look at the forms that we find in the modules of grammar, we meet notions that have also played an important role in the study of matter, namely mathematical essences. The modules of grammar seem to specify discrete infinities, recursively defined structures, and so on. Furthermore, we find certain forms of symmetry, mirror symmetry in the structure of X-bar categories, and translation symmetry in such notions as successive cyclicity and the dynasty concept described in the previous chapters. If these notions are the core notions of grammar, the cognitive modules in question can be studied just as the structures of matter can be, namely as the manifestations of certain mathematical ideas.
Such ideas, it seems to me, are incompatible with the traditional functional view of language. Devices that have the form of certain mathematical essences may be useful for certain purposes. But this never means that the forms in question can be explained in terms of their application. To give an example, genetic information is organized, in part, in the form of a double helix. This appears to be an effective form of organization, but it would be absurd to conclude from this state of affairs that the helical shape that matter can take is due to genetic transmission. The fact that matter can organize itself in certain forms can be entirely reduced to principles of physics, which are autonomous with respect to any application.
Likewise, the modules of grammar have certain forms that happen to be useful at some "higher" level of reality. Conceptually, the forms of our cognitive modules appear to be related to the nonfunctionally defined forms of matter as studied in physics and chemistry and in the nonfunctional forms of biology, such as the study of "growth and form" in the tradition of D'Arcy Thompson (1961) and others. In such traditions, the usefulness of form is never intrinsic but always a matter of happy accident. Likewise, I believe that the fact that the forms of our cognitive modules find a powerful application in language is entirely accidental. Language as a human creation is useful and meaningful, but the underlying mechanisms are pure form. Their patterns are radically autonomous, rich and beautiful, but lacking in inherent meaning or purpose, like the patterns we find on the wings of butterflies.
Bibliography
Akmajian, A., S. Steele and T. Wasow (1979), "The Category Aux in Universal Grammar", Linguistic Inquiry 10, 1-64.
Allwood, J. (1979), "The Complex NP Constraint as a Nonuniversal Rule and Some Semantic Factors Influencing the Acceptability of Swedish Sentences Which Violate the CNPC", in J. Stillings, ed., University of Massachusetts Occasional Papers in Linguistics 2. Amherst, Massachusetts.
Anderson, S. (1983), "Types of Dependency in Anaphors: Icelandic (and Other) Reflexives", Journal of Linguistic Research 2, 1-23.
Aoun, J. (1981), The Formal Nature of Anaphoric Relations. Ph.D. Dissertation, MIT.
Aoun, J. (1983), "On the Formal Theory of Government", The Linguistic Review 2, 211-236.
Aoun, J. (1986), Generalized Binding: the Syntax and Logical Form of Wh-interrogatives. Foris, Dordrecht.
Aoun, J. and D. Sportiche (1983), "On the Formal Theory of Government", The Linguistic Review 2, 211-236.
Aoun, J., N. Hornstein and D. Sportiche (1981), "Some Aspects of Wide Scope Interpretation", Journal of Linguistic Research 1, 69-95.
Bach, E. and G. Horn (1976), "Remarks on 'Conditions on Transformations'", Linguistic Inquiry 7, 265-299.
Baker, C.L. (1970), "Notes on the Description of English Questions: the Role of an Abstract Question Morpheme", Foundations of Language 6, 197-219.
Balk-Smit Duyzentkunst, F. (1979), "Chomsky's Metaforen", Spektator 9, 3-13.
Bayer, J. (1984), "COMP in Bavarian Syntax", The Linguistic Review 3, 209-274.
Beaubien, F., M. Saburin and M. St. Amour (1976), "Faire-Attraction: Deplacement du Sujet", Montreal Working Papers in Linguistics 7, 21-40.
Belletti, A. and L. Rizzi (1981), "The Syntax of 'Ne': Some Theoretical Implications", The Linguistic Review 1, 117-154.
Bennis, H. (1986), Gaps and Dummies. Foris, Dordrecht.
Bennis, H. and T. Hoekstra (1984), "Gaps and Parasitic Gaps", The Linguistic Review 4, 29-87.
Berwick, R. (1982), Locality Principles and the Acquisition of Syntactic Knowledge. Ph.D. Dissertation, MIT.
Berwick, R. (1985), The Acquisition of Syntactic Knowledge. MIT Press, Cambridge, Massachusetts.
Besten, H. den (1977), "On the Interaction of Root Transformations and Lexical Deletive Rules", Ms., University of Amsterdam (published in Groninger Arbeiten zur germanistischen Linguistik (GAGL) 20, 1-78, University of Groningen).
Besten, H. den (1981a), "A Case Filter for Passives", in A. Belletti, L. Brandi and L. Rizzi, eds., Theory of Markedness in Generative Grammar. Scuola Normale Superiore, Pisa.
Besten, H. den (1981b), "Government, syntaktische Struktur und Kasus", in M. Kohrt and J. Lenerz, eds., Sprache, Formen und Strukturen. Max Niemeyer, Tübingen.
Besten, H. den (1982), "Some Remarks on the Ergative Hypothesis", in Groninger Arbeiten zur germanistischen Linguistik (GAGL) 21, 61-81, University of Groningen.
Besten, H. den and J. Edmondson (1981), "The Verbal Complex in Continental West Germanic", in Groninger Arbeiten zur germanistischen Linguistik (GAGL) 19, 11-61, University of Groningen.
Bok-Bennema, R. (1981), "Clitics and Binding in Spanish", in R. May and J. Koster, eds., Levels of Syntactic Representation. Foris, Dordrecht.
Bordelois, I. (1974), The Grammar of Spanish Causative Complements. Ph.D. Dissertation, MIT.
Borer, H. (1980), "Empty Subjects in Modern Hebrew and Constraints on Thematic Relations", NELS 10, Ottawa.
Bouchard, D. (1984), On the Content of Empty Categories. Foris, Dordrecht.
Bresnan, J. (1976), "Nonarguments for Raising", Linguistic Inquiry 7, 485-501.
Bresnan, J. (1977), "Variables in the Theory of Transformations", in P. Culicover, T. Wasow and A. Akmajian, eds., Formal Syntax. Academic Press, New York.
Bresnan, J. (1982), "Control and Complementation", Linguistic Inquiry 13, 343-434.
Burzio, L. (1981), Intransitive Verbs and Italian Auxiliaries. Ph.D. Dissertation, MIT.
Cattell, R. (1976), "Constraints on Movement Rules", Language 52, 18-50.
Chomsky, N. (1955), The Logical Structure of Linguistic Theory, Ms., MIT (published by Plenum Press, New York (1975)).
Chomsky, N. (1957), Syntactic Structures. Mouton, The Hague.
Chomsky, N. (1965), Aspects of the Theory of Syntax. MIT Press, Cambridge, Massachusetts.
Chomsky, N. (1967), "The Formal Nature of Language", appendix of E. Lenneberg, Biological Foundations of Language. John Wiley and Sons, New York.
Chomsky, N. (1968), Language and Mind. Harcourt, Brace and World, New York.
Chomsky, N. (1970), "Remarks on Nominalization", in R. Jacobs and P. Rosenbaum, eds., Readings in English Transformational Grammar. Ginn, Waltham, Massachusetts.
Chomsky, N. (1973), "Conditions on Transformations", in S. Anderson and P. Kiparsky, eds., A Festschrift for Morris Halle. Holt, Rinehart and Winston, New York.
Chomsky, N. (1975), Reflections on Language. Pantheon, New York.
Chomsky, N. (1977), "On WH-movement", in P. Culicover, T. Wasow and A. Akmajian, eds., Formal Syntax. Academic Press, New York.
Chomsky, N. (1980a), "On Binding", Linguistic Inquiry 11, 1-46.
Chomsky, N. (1980b), Rules and Representations. Columbia University Press, New York.
Chomsky, N. (1981a), "On the Representation of Form and Function", The Linguistic Review 1, 3-40.
Chomsky, N. (1981b), Lectures on Government and Binding. Foris, Dordrecht.
Chomsky, N. (1982a), Some Concepts and Consequences of the Theory of Government and Binding. MIT Press, Cambridge, Massachusetts.
Chomsky, N. (1982b), On the Generative Enterprise: a Discussion with Riny Huybregts and Henk van Riemsdijk. Foris, Dordrecht.
Chomsky, N. (1986a), Knowledge of Language: its Nature, Origin, and Use. Praeger, New York.
Chomsky, N. (1986b), Barriers. MIT Press, Cambridge, Massachusetts.
Chomsky, N. and H. Lasnik (1977), "Filters and Control", Linguistic Inquiry 8, 425-504.
Cinque, G. (1980), "On Extraction from NP in Italian", Journal of Italian Linguistics 5, 47-99.
Cinque, G. (1983a), "Topic Constructions in Some European Languages and Connectedness", in K. Ehlich and H. van Riemsdijk, eds., Connectedness in Sentence, Discourse and Text, Tilburg Studies in Language and Literature 4, Tilburg University.
Cinque, G. (1983b), "Island Effects, Subjacency, ECP/Connectedness and Reconstruction", Ms., Università di Venezia.
Cinque, G. (1984), "A'-bound pro vs. Variable", GLOW Newsletter 12, 21-23.
Contreras, H. (1979), "Clause Reduction, the Saturation Constraint, and Clitic Promotion in Spanish", Linguistic Analysis 5, 161-182.
Contreras, H. (1985), "Parasitic Gaps and Binding", Ms., University of Washington.
Coopmans, P. (1985), Language Types: Continua or Parameters? Ph.D. Dissertation, University of Utrecht.
Culicover, P. (1976), "A Constraint on Coreferentiality", Foundations of Language 14, 109-118.
Daalder, S. and A. Blom (1976), "De Strukturele Positie van Reflexieve en Reciproke Pronomina", Spektator 5, 397-414.
D'Arcy Thompson, W. (1917), On Growth and Form (ed. J.T. Bonner, Cambridge University Press, Cambridge, 1977).
Dougherty, R. (1969), "An Interpretive Theory of Pronominal Reference", Foundations of Language 5, 488-519.
E. Kiss, K. (1981), "Move α and C-Command in a Nonconfigurational Language", paper presented at the GLOW colloquium, Göttingen.
E. Kiss, K. (1982), "On the Core Grammar of Argument Binding", paper read at the Salzburg International Conference on Comparative Syntax, August 1982.
Emonds, J. (1970), Root and Structure Preserving Transformations. Ph.D. Dissertation, MIT.
Emonds, J. (1972), "A Reformulation of Certain Syntactic Transformations", in S. Peters, ed., Goals of Linguistic Theory. Prentice-Hall, Englewood Cliffs, New Jersey.
Emonds, J. (1976), A Transformational Approach to English Syntax. Academic Press, New York.
Engdahl, E. (1984), "Parasitic Gaps, Resumptive Pronouns and Subject Extraction", Ms., University of Wisconsin, Madison.
Everaert, M. (1982), "A Syntactic Passive in Dutch", Utrecht Working Papers in Linguistics 11, 37-73.
Everaert, M. (1986), The Syntax of Reflexivization. Ph.D. Dissertation, University of Utrecht.
Evers, A. (1975), The Transformational Cycle in Dutch and German. Ph.D. Dissertation, University of Utrecht. (Also distributed by the Indiana University Linguistics Club, Bloomington, Indiana).
Faltz, L. (1977), Reflexivization: a Study in Universal Syntax. Ph.D. Dissertation, University of California, Berkeley. (Reproduced by University Microfilms International, Ann Arbor, Michigan).
Felix, S. (1983), "Parasitic Gaps in German", Ms., University of Passau.
Fiengo, R. (1974), Semantic Conditions on Surface Structure. Ph.D. Dissertation, MIT.
Fiengo, R. (1980), Surface Structure: the Interface of Autonomous Components. Harvard University Press, Cambridge, Massachusetts.
Fiengo, R. and J. Higginbotham (1981), "Opacity in NP", Linguistic Analysis 7, 395-422.
Finer, D. (1985), The Formal Theory of Switch-Reference. Garland Publishing Inc., New York.
Frege, G. (1879), Begriffsschrift und andere Aufsätze. 2. Nachdruckauflage der 3. Auflage, Wissenschaftliche Buchgesellschaft, Darmstadt, 1977.
Freidin, R. (1975), "The Analysis of Passives", Language 51, 384-405.
Freidin, R. (1978), "Cyclicity and the Theory of Grammar", Linguistic Inquiry 9, 519-549.
Freidin, R. and H. Lasnik (1981), "Disjoint Reference and Wh-Trace", Linguistic Inquiry 12, 39-53.
Gazdar, G. (1982), "Phrase Structure Grammar", in P. Jacobson and G. Pullum, eds., The Nature of Syntactic Representation. Reidel, Dordrecht.
Geest, W. de (1972), Complementaire Constructies bij Verba Sentiendi in het Nederlands. Ph.D. Dissertation, University of Nijmegen.
Giorgi, A. (1984), "Toward a Theory of Long Distance Anaphora: a GB Approach", The Linguistic Review 3, 307-362.
Goldsmith, J. (1975), "French Causatives, the Constituent Constraint, the Specified Subject Constraint and Traces", Montreal Working Papers in Linguistics 5, 27-55.
Gueron, J. (1982), "Logical Operators, Complex Constituents, and Extraction Transformations", in R. May and J. Koster, eds., Levels of Syntactic Representation. Foris, Dordrecht.
Haaften, T. van (1982), "Interpretaties van Begrepen Subjecten", GLOT 5, 107-122.
Haaften, T. van, R. Smits and Jan Vat (1983), "Left Dislocation, Connectedness and Reconstruction", in K. Ehlich and H. van Riemsdijk, eds., Connectedness in Sentence, Discourse and Text, Tilburg Studies in Language and Literature 4. Tilburg University.
Haan, G. de (1979), Conditions on Rules. Foris, Dordrecht.
Haegeman, L. and H. van Riemsdijk (1986), "Verb Projection Raising, Scope, and the Typology of Rules Affecting Verbs", Linguistic Inquiry 17, 417-466.
Haider, H. (1984), "A Unified Account of Case and θ-marking: the Case of German", Ms., Universität Wien.
Haider, H. (1986), "Affect α: a Reply to Lasnik and Saito, 'On the Nature of Proper Government'", Linguistic Inquiry 17, 113-126.
Haïk, I. (1984), "Indirect Binding", Linguistic Inquiry 15, 185-223.
Hasegawa, K. (1968), "The Passive Construction in English", Language 44, 230-243.
Hellan, L. (1980), "On Anaphora in Norwegian", in J. Kreisman and A.E. Ojeda, eds., Papers from the Parasession on Pronouns and Anaphora, Chicago Linguistic Society, Chicago.
Hellan, L. (1983), "Anaphora in Norwegian and Theory of Binding", Working Papers in Scandinavian Syntax 5, University of Trondheim.
Higginbotham, J. (1980), "Pronouns and Bound Variables", Linguistic Inquiry 11, 679-708.
Higgins, R. (1973), The Pseudo-Cleft Construction in English. Ph.D. Dissertation, MIT.
Hoekstra, T. (1982), "Government and a Left-Right Asymmetry between Dutch and English", Ms., University of Leyden.
Hoekstra, T. (1984), Transitivity: Grammatical Relations in Government Binding Theory. Foris, Dordrecht.
Holenstein, E. (1984), "Die russische ideologische Tradition und die deutsche Romantik", in Das Erbe Hegels II. Suhrkamp, Frankfurt a/M.
Hornstein, N. (1985), Logic as Grammar: an Approach to Meaning in Natural Language. MIT Press, Cambridge, Massachusetts.
Hornstein, N. and A. Weinberg (1981), "Case Theory and Preposition Stranding", Linguistic Inquiry 12, 55-94.
Huang, J. (1982), Logical Relations in Chinese and the Theory of Grammar. Ph.D. Dissertation, MIT.
Hulk, A. (1982), Het Clitisch Pronomen En. Ph.D. Dissertation, University of Utrecht.
Huybregts, R. (to appear), A Formal Theory of Binding and Bounding.
Huybregts, R. and H. van Riemsdijk (1985), "Parasitic Gaps and ATB", TILL Papers, Tilburg University.
Jackendoff, R. (1972), Semantic Interpretation in Generative Grammar. MIT Press, Cambridge, Massachusetts.
Jackendoff, R. (1977), X'-Syntax: a Study of Phrase Structure. MIT Press, Cambridge, Massachusetts.
Jaeggli, O. (1981), Topics in Romance Syntax. Foris, Dordrecht.
Jakobson, R. (1935), "Beitrag zur allgemeinen Kasuslehre", in R. Jakobson, Selected Writings 2. Mouton, The Hague.
Jayaseelan, K.A. (1984), "Complex Predicates and the Theory of θ-marking", GLOW Newsletter 12, 34.
Jenkins, L. (1976), "NP-interpretation", in H. van Riemsdijk, ed., Green Ideas Blown Up. University of Amsterdam.
Katz, J. and P. Postal (1964), An Integrated Theory of Linguistic Descriptions. MIT Press, Cambridge, Massachusetts.
Kayne, R. (1975), French Syntax: the Transformational Cycle. MIT Press, Cambridge, Massachusetts.
Kayne, R. (1981), "ECP Extensions", Linguistic Inquiry 12, 93-133.
Kayne, R. (1982), "Unambiguous Paths", in R. May and J. Koster, eds., Levels of Syntactic Representation. Foris, Dordrecht.
Kayne, R. (1983), "Connectedness", Linguistic Inquiry 14, 223-249.
Kayne, R. (1984), Connectedness and Binary Branching. Foris, Dordrecht.
Kayne, R. and J.-Y. Pollock (1978), "Stylistic Inversion, Successive Cyclicity, and Move NP in French", Linguistic Inquiry 9, 595-621.
Kerstens, J. (1975), "Over Afgeleide Structuur en de Interpretatie van Zinnen", Ms., University of Amsterdam.
Kiparsky, P. and C. Kiparsky (1970), "Fact", in M. Bierwisch and K. Heidolph, eds., Progress in Linguistics. Mouton, The Hague.
Klein, M. (1980), "Anaforische Relaties in het Nederlands", in M. Klein, ed., Taal Kundig Beschouwd. Martinus Nijhoff, The Hague.
Klein, M. (1983), "Vooropstaande PP's en Thematische Relaties", Gramma 7, 41-50.
Klima, E. (1964), "Negation in English", in J. Fodor and J. Katz, eds., The Structure of Language: Readings in the Philosophy of Language. Prentice-Hall, Englewood Cliffs, N.J.
Koopman, H. (1983), "ECP Effects in Main Clauses", Linguistic Inquiry 14, 346-350.
Koopman, H. (1984), The Syntax of Verbs. Foris, Dordrecht.
Koopman, H. and D. Sportiche (1982), "Variables and the Bijection Principle", The Linguistic Review 2, 139-160.
Koopman, H. and D. Sportiche (1985), "θ-theory and Extraction", GLOW talk, Brussels.
Koster, J. (1973), "'PP over V' en de Theorie van J. Emonds", Spektator 2, 294-311.
Koster, J. (1975), "Dutch as an SOV Language", Linguistic Analysis 1, 111-136.
Koster, J. (1978a), "Why Subject Sentences Don't Exist", in S.J. Keyser, ed., Recent Transformational Studies in European Languages. MIT Press, Cambridge, Massachusetts.
Koster, J. (1978b), "Conditions, Empty Nodes, and Markedness", Linguistic Inquiry 9, 551-593.
Koster, J. (1978c), Locality Principles in Syntax. Foris, Dordrecht.
Koster, J. (1979), "Anaphora: an Introduction without Footnotes", Filosofisch Instituut, Nijmegen.
Koster, J. (1982a), "Enthalten syntaktische Repräsentationen Variablen?", Linguistische Berichte 80/82, 70-100 (Teil 1), and 83 (1983), 36-60 (Teil 2).
Koster, J. (1982b), "Counter-opacity in Korean and Japanese", Ms., Tilburg University.
Koster, J. (1983), "Syntax without Variables", Ms., Tilburg University.
Koster, J. (1984a), "On Binding and Control", Linguistic Inquiry 15, 417-459.
Koster, J. (1984b), "Global Harmony", TILL Papers, Tilburg University.
Koster, J. (1984c), "Onverteerde Restanten", Spektator 14, 357-362.
Koster, J. (1985), "Reflexives in Dutch", in J. Gueron, H.-G. Obenauer and J.-Y. Pollock, eds., Grammatical Representation. Foris, Dordrecht.
Koster, J. (1986a), "Lege Subjecten en Open VP's", in C. Hoppenbrouwers, J. Houtman, I. Schuurman and F. Zwarts, eds., Proeven van Taalwetenschap ter Gelegenheid van het Emeritaat van Albert Sassen, TABU 16, University of Groningen.
Koster, J. (1986b), "The Relation between Pro-drop, Scrambling and Verb Movements", Groningen Papers in Theoretical and Applied Linguistics, TTT Nr. 1, University of Groningen.
Koster, J. and R. May (1982), "On the Constituency of Infinitives", Language 58, 117-143.
Kraak, A. (1966), Negatieve Zinnen. De Haan, Hilversum.
Kuroda, S.-Y. (1964), "A Note on English Relativization", Ms., MIT.
Kuroda, S.-Y. (1965), Generative Grammatical Studies in the Japanese Language. Ph.D. Dissertation, MIT.
Lakoff, G. (1968), "Pronouns and Reference", Ms., reproduced by the Indiana University Linguistics Club.
Langacker, R. (1969), "On Pronominalization and the Chain of Command", in D. Reibel and S. Schane, eds., Modern Studies in English: Readings in Transformational Grammar. Prentice-Hall, Englewood Cliffs, N.J.
Lasnik, H. (1976), "Remarks on Coreference", Linguistic Analysis 2, 1-23.
Lasnik, H. (1980), "On Two Recent Treatments of Disjoint Reference", Journal of Linguistic Research 1/4, 48-58.
Lasnik, H. and R. Fiengo (1974), "Complement Object Deletion", Linguistic Inquiry 5, 535-570.
Lasnik, H. and J. Kupin (1977), "A Restrictive Theory of Transformational Grammar", Theoretical Linguistics 4, 173-196.
Lasnik, H. and M. Saito (1984), "On the Nature of Proper Government", Linguistic Inquiry 15, 235-289.
Lees, R. and E. Klima (1963), "Rules for English Pronominalization", Language 39, 17-28.
Lenerz, J. (1977), Zur Abfolge nominaler Satzglieder im Deutschen. Narr, Tübingen.
Lightfoot, D. (1977), "On Traces and Conditions on Rules", in P. Culicover, T. Wasow and A. Akmajian, eds., Formal Syntax. Academic Press, New York.
Longobardi, G. (1980), "Remarks on Infinitives: A Case for a Filter", Italian Linguistics 5, 101-157.
Longobardi, G. (1985), "Connectedness and Island Constraints", in J. Gueron, H.-G. Obenauer and J.-Y. Pollock, eds., Grammatical Representation. Foris, Dordrecht.
McCloskey, J. (1984), "Raising, Subcategorization and Selection in Modern Irish", Natural Language and Linguistic Theory 1, 441-485.
Maling, J. (1981), "Non-clause-bounded Reflexives in Icelandic", in Th. Fretheim and L. Hellan, eds., Papers from the Sixth Scandinavian Conference of Linguistics, Røros.
Manzini, R. (1983a), "On Control and Control Theory", Linguistic Inquiry 14, 421-446.
Manzini, R. (1983b), Restructuring and Reanalysis. Ph.D. Dissertation, MIT.
May, R. (1977), The Grammar of Quantification. Ph.D. Dissertation, MIT.
May, R. (1979), "Must COMP-to-COMP Movement Be Stipulated?", Linguistic Inquiry 10, 719-725.
May, R. (1985), Logical Form: its Structure and Derivation. MIT Press, Cambridge, Massachusetts.
Mey, J. de, and L. Marácz (1984), "On Real and Apparent Cases of Wh-movement in Hungarian", Ms., University of Groningen.
Milner, J.-C. (1979), "La Redondance Fonctionelle", Linguisticae Investigationes 3, 87-145.
Milsark, G. (1974), Existential Sentences in English. Ph.D. Dissertation, MIT.
Montague, R. (1970), "Universal Grammar", Theoria 36, 373-398.
Montalbetti, M. and K. Wexler (1985), "Binding is Linking", Ms., University of California, Irvine.
Nakajima, H. (1982), "X'-detection", Ms., Chiba University, Japan.
Neijt, A. (1981), "Gaps and Remnants: Sentence Grammar Aspects of Gapping", Linguistic Analysis 8, 69-93.
Obenauer, H.-G. (1984), "On the Identification of Empty Categories", The Linguistic Review 4, 153-202.
Perlmutter, D. (1971), Deep and Surface Structure Constraints in Syntax. Holt, Rinehart, and Winston, New York.
Perlmutter, D. (1978), "Impersonal Passives and the Unaccusative Hypothesis", Berkeley Linguistic Society 4, 157-189.
Perlmutter, D. and P. Postal (1984), "The 1-Advancement Exclusiveness Law", in D. Perlmutter and C. Rosen, eds., Studies in Relational Grammar 2. The University of Chicago Press, Chicago.
Perlmutter, D. and A. Zaenen (1984), "The Indefinite Extraposition Construction in Dutch and German", in D. Perlmutter and C. Rosen, eds., Studies in Relational Grammar 2. The University of Chicago Press, Chicago.
Pesetsky, D. (1982a), "Complementizer-trace Phenomena and the Nominative Island Condition", The Linguistic Review 1, 297-343.
Pesetsky, D. (1982b), Paths and Categories. Ph.D. Dissertation, MIT.
Pesetsky, D. (1984), "Extraction Domains and a Surprising Subject/Object Asymmetry", GLOW Newsletter 12, 58-60.
Platzack, Ch. (1982), "Transitive Adjectives in Swedish: a Phenomenon with Implications for the Theory of Abstract Case", The Linguistic Review 2, 39-57.
Pollmann, T. (1975), Oorzaak en Handelende Persoon: de Beschrijving van Passieve Zinnen in de Nederlandse Grammatica. Ph.D. Dissertation, University of Nijmegen.
Popper, K. (1972), Objective Knowledge. Clarendon Press, Oxford.
Postal, P. (1969), "Anaphoric Islands", in R. Binnick, A. Davison, G. Green and J. Morgan, eds., Papers from the Fifth Regional Meeting of the Chicago Linguistic Society, University of Chicago, Chicago, Illinois.
Postal, P. (1971), Cross-over Phenomena. Holt, Rinehart and Winston, New York.
Postal, P. (1974), On Raising. MIT Press, Cambridge, Massachusetts.
Reinhart, T. (1975), "A Note on the Two-comp Hypothesis", Ms., MIT.
Reinhart, T. (1976), The Syntactic Domain of Anaphora. Ph.D. Dissertation, MIT.
Reis, M. (1976), "Reflexivierung in deutschen A.c.I.-Konstruktionen: ein transformationsgrammatisches Dilemma", Papiere zur Linguistik 9, 5-82.
Riemsdijk, H. van (1978), A Case Study in Syntactic Markedness: the Binding Nature of Prepositional Phrases. Foris, Dordrecht.
Riemsdijk, H. van (1982), "Zum Rattenfängereffekt bei Infinitiven in deutschen Relativsätzen", TILL Papers, Tilburg University.
Riemsdijk, H. van (1983), "Correspondence Effects and the Empty Category Principle", in Y. Otsu et al., eds., Studies in Generative Grammar and Language Acquisition. Editorial Committee, Tokyo.
Riemsdijk, H. van and E. Williams (1981), "NP-structure", The Linguistic Review 1, 171-217.
Riemsdijk, H. van and R. Zwarts (1974), "Left Dislocation in Dutch and the Status of Copying Rules", Ms., MIT.
Rizzi, L. (1978a), "A Restructuring Rule in Italian", in S. Jay Keyser, ed., Recent Transformational Studies in European Languages. MIT Press, Cambridge, Massachusetts.
Rizzi, L. (1978b), "Violations of the Wh-island Constraint in Italian and Subjacency", in L. Rizzi, Issues in Italian Syntax. Foris, Dordrecht (1982).
Rizzi, L. (1982), Issues in Italian Syntax. Foris, Dordrecht.
Rizzi, L. (1983), "On Chain Formation", Ms., Università della Calabria.
Roeper, T. (1983), "Implicit Thematic Roles in the Lexicon and Syntax", Ms., University of Massachusetts, Amherst.
Roeper, T. et al. (1984), "The Problem of Empty Categories and Bound Variables in Language Acquisition", Ms., University of Massachusetts, Amherst.
Rosenbaum, P. (1967), The Grammar of English Predicate Complement Constructions. MIT Press, Cambridge, Massachusetts.
Ross, J.R. (1967), Constraints on Variables in Syntax. Ph.D. Dissertation, MIT (also distributed by the Indiana University Linguistics Club, Bloomington, Indiana).
Ross, J.R. (1969), "Adjectives as Noun Phrases", in D. Reibel and S. Schane, eds., Modern Studies in English: Readings in Transformational Grammar. Prentice-Hall, Englewood Cliffs, N.J.
Rouveret, A. and J.-R. Vergnaud (1980), "Specifying Reference to the Subject: French Causatives and Conditions on Representations", Linguistic Inquiry 11, 97-102.
Safir, K. (1985), "Missing Subjects in German", in J. Toman, ed., Studies in German Grammar. Foris, Dordrecht.
Sag, I. (1976), Deletion and Logical Form. Ph.D. Dissertation, MIT.
Sportiche, D. (1983), Structural Invariance and Symmetry in Syntax. Ph.D. Dissertation, MIT.
Sportiche, D. and J. Aoun (1981), "On the Formal Theory of Government", GLOW talk, Göttingen.
Stowell, T. (1978), "What was There before There was There", in D. Farkas et al., eds., Papers from the Fourteenth Regional Meeting of the Chicago Linguistic Society, University of Chicago, Chicago, Illinois.
Stowell, T. (1981), Origins of Phrase Structure. Doctoral Dissertation, MIT.
Sturm, A. and T. Pollmann (1977), "Is Seen X in de bar-notatie?", Spektator 6, 476-478.
Taraldsen, T. (1981), "The Theoretical Interpretation of a Class of 'Marked' Extractions", in A. Belletti, L. Brandi, and L. Rizzi, eds., Theory of Markedness in Generative Grammar. Scuola Normale Superiore, Pisa.
Taraldsen, T. (1984), "On the Complementarity of Bounding and Binding", paper presented at GLOW, Copenhagen 1984.
Thiersch, C. (1978), Topics in German Syntax. Ph.D. Dissertation, MIT.
Thráinsson, H. (1976), "Reflexives and Subjunctives in Icelandic", NELS 6, 225-239.
Torris, T. (1984), "Parasitic Gaps, Across-the Board, and Comparative Clauses", Ms., Universität Köln.
Vat, J. (1980), "Zich and zichzelf", in S. Daalder and M. Gerritsen, eds., Linguistics in the Netherlands 1980. North-Holland, Amsterdam.
Vergnaud, J.-R. (1974), French Relative Clauses. Ph.D. Dissertation, MIT.
Vikner, S. (1985), "Parameters of Binder, and of Binding Category in Danish", Ms., Department of English, University of Geneva.
Vries, G. de (1983), "Drie-dimensionele Koördinatie", Masters thesis, University of Amsterdam.
Webelhuth, G. (1985), "German is Configurational", The Linguistic Review 4, 203-246.
Wexler, K. and R. Manzini (1985), "Parameters and Learnability in Binding Theory", to appear in T. Roeper and E. Williams, eds., Parameters in Linguistic Theory.
Williams, E. (1974), Rule Ordering in Syntax. Ph.D. Dissertation, MIT.
Williams, E. (1975), "Comparative Reduction and the Cycle", Ms., University of Massachusetts, Amherst.
Williams, E. (1977), "Discourse and Logical Form", Linguistic Inquiry 8, 101-140.
Williams, E. (1978), "Across-the-Board-Rule Application", Linguistic Inquiry 9, 31-43.
Williams, E. (1980), "Predication", Linguistic Inquiry 11, 203-238.
Williams, E. (1981), "Argument Structure and Morphology", The Linguistic Review 1, 81-114.
Williams, E. (1982), "The NP Cycle", Linguistic Inquiry 13, 227-295.
Yang, D.-W. (1984), "The Extended Binding Theory of Anaphors", Theoretical Linguistic Research 1, 195-218.
Zribi-Hertz, A. (1980), "Coréférences et Pronoms Réfléchis: Notes sur le Contraste lui-lui-même en Français", Linguisticae Investigationes 4, 131-179.
Zubizarreta, M.L. (1982), On the Relationship of the Lexicon to Syntax. Ph.D. Dissertation, MIT.
Zubizarreta, M.L. (1985), "The Relation between Morphonology and Morphosyntax: the Case of Romance Causatives", Linguistic Inquiry 16, 247-289.
Zwarts, F. (1986), Categoriale Grammatica en Algebraïsche Linguïstiek. Ph.D. Dissertation, University of Groningen.
Index of Names
Akmajian, A. 208
Allwood, J. 52
Anderson, S. 102, 151
Aoun, J. 10, 90, 106, 204, 208, 222, 223, 236 fn. 13, 236 fn. 19, 342
Bach, E. 118, 142, 196, 198, 199
Baker, C.L. 79, 215, 220, 230
Balk-Smit Duyzentkunst, F. 255
Bayer, J. 209
Beaubien, F.-M. 298
Belletti, A. 23, 24, 154, 158, 163, 254
Bennis, H. 19, 21, 108 fn. 1, 181, 182, 187, 188, 236 fn. 6, 262, 263, 264, 314 fn. 4, 361, 370 fn. 5
Berwick, R. 239, 318, 368
Besten, H. den 143 fn. 11, 144 fn. 13, 211, 243, 245, 253, 254, 257, 281
Blom, A. 326
Bok-Bennema, R. 272
Bolinger, D. 347, 354
Bordelois, I. 302
Borer, H. 268
Bouchard, N. 29 fn. 5, 233, 315
Brame, M. 101, 102
Bresnan, J. 13, 47, 101, 102, 117, 118, 125, 130, 131, 141, 142, 230, 326
Brody, M. 204
Burzio, L. 251, 254, 257, 268, 272, 276, 297, 298, 300, 301, 305, 309, 311
Cattell, R. 361
Chomsky, N. 1, 2, 3, 4, 10, 14, 16, 18, 20, 26, 29 fn. 1, 29 fn. 3, 33, 34, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47, 48, 49, 50, 51, 54, 56, 59, 63, 68, 69, 70, 76, 77, 81, 82, 83, 85, 87, 88, 99, 101, 102, 103, 104, 106, 108, 108 fn. 2, 108 fn. 5, 109, 110, 111, 112, 113, 115, 117, 119, 125, 132, 135, 138, 141, 143 fn. 1, 145, 146, 147, 148, 149, 150, 154, 156, 158, 168, 169, 171, 185, 187, 188, 193, 194, 195, 196, 197, 201, 202, 205, 208, 210, 239, 240, 241, 242, 251, 257, 259, 263, 264, 266, 267, 268, 277, 278, 290, 294, 299, 315, 317, 319, 326, 333, 334, 342, 351, 356, 357, 360, 361, 362, 362, 365, 368, 371, 372, 374, 375
Cinque, G. 23, 24, 25, 44, 53, 65, 74, 83, 100, 101, 143 fn. 8, 144 fn. 24, 153, 154, 155, 158, 159, 160, 161, 166, 167, 172, 199, 220, 230, 236 fn. 15
Contreras, H. 272, 349, 360, 366
Coopmans, P. 251, 302
Culicover, P. 87
Daalder, S. 326
D'Arcy Thompson, W. 376
Dougherty, R. 64, 75, 101
Driessen, C. 326
Edmondson, J. 281
E. Kiss, K. 143 fn. 6, 342
Emonds, J. 101, 128, 129, 188, 200, 208, 267
Engdahl, E. 52
Everaert, M. 102, 243, 323, 329
Evers, A. 48, 121, 127, 130, 133, 144 fn. 17, 209, 210, 273, 275, 276, 278, 280, 281
Faltz, L. 326, 328
Farmer, A. 346
Felix, S. 185, 187
Fiengo, R. 14, 45, 145, 230, 291
Finer, D. 346
Frege, G. 77
Freidin, R. 4, 101, 102
Gazdar, G. 2, 29 fn. 4
Geest, W. de 130, 370 fn. 2
George, L. 170
Georgopolous, C. 19
Giorgi, A. 321, 322, 324
Goldsmith, J. 298
Grosu, A. 19
Guéron, J. 230
Haaften, T. van 44, 143 fn. 3
Haan, G. de 187, 244
Haegeman, L. 191, 276, 279, 280, 281, 282, 283, 284, 285, 286, 288, 293, 295, 313
Haider, H. 185, 219, 225, 229
Haïk, I. 81, 86, 87, 88, 89, 90, 95, 96, 286, 288
Hankamer, J. 15
Hasegawa, K. 268
Hellan, L. 151, 317, 339
Higginbotham, J. 97, 146
Higgins, R. 33, 63
Hoekstra, T. 19, 21, 143 fn. 13, 144 fn. 14, 152, 181, 182, 187, 188, 210, 236 fn. 8, 243, 245, 249, 255, 256, 257, 263, 361
Holenstein, E. 373
Horn, G. 196, 198, 199
Hornstein, H. 108 fn. 8, 222, 223, 293, 313
Huang, C.-T.J. 23, 24, 25, 28, 70, 85, 146, 147, 155, 165, 166, 167, 190, 200, 202, 204, 205, 211, 212, 216, 217, 218, 221, 222, 223, 226, 227, 236 fn. 11, 317, 342
Hulk, A. 298
Huybregts, R. 15, 143 fn. 11, 144 fn. 16, 181, 276, 280, 286, 293, 329, 347, 357, 358, 359, 361
Jackendoff, R. 13, 71, 101, 103, 143 fn. 6, 163, 208, 326
Jaeggli, O. 236 fn. 6
Jakobson, R. 258
Jayaseelan, K.-A. 183
Jenkins, L. 101
Katz, J. 17, 79, 108 fn. 7, 220
Kayne, R. 6, 8, 17, 18, 19, 20, 21, 24, 136, 143 fn. 13, 145, 146, 152, 154, 155, 165, 173, 176, 179, 181, 185, 210, 224, 230, 231, 232, 236 fn. 1, 236 fn. 4, 251, 272, 273, 275, 276, 288, 293, 294, 297, 298, 299, 305, 306, 309, 311, 314 fn. 1, 314 fn. 9, 314 fn. 12, 314 fn. 13, 328, 338, 347, 361, 362, 370 fn. 3
Kerstens, J. 187
Kiparsky, C. 314 fn. 4
Kiparsky, P. 314 fn. 4
Klein, M. 236 fn. 9, 349
Klima, E. 101, 341, 342
Koopman, H. 21, 22, 23, 24, 25, 48, 81, 97, 210, 211
Kraak, A. 349
Kupin, J. 15, 277, 278, 293
Kuroda, S.-Y. 162, 322
Lakoff, G. 317, 342, 352, 354, 366
Langacker, R. 86
Lasnik, H. 15, 45, 81, 86, 87, 97, 112, 119, 143 fn. 1, 147, 154, 188, 205, 217, 218, 221, 223, 224, 225, 226, 236 fn. 11, 241, 246, 277, 278, 293, 346, 351
Lees, R. 341, 342
Lenerz, J. 253
Lightfoot, D. 101
Longobardi, G. 62, 75, 83, 190, 361
Maling, J. 19, 102, 151, 320, 321
Manzini, R. 113, 143 fn. 3, 165, 166, 317, 318, 319, 321
Marácz, L. 84, 236 fn. 17
May, R. 17, 76, 77, 81, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 109, 145, 146, 147, 209
McCloskey, J. 143 fn. 6
Mey, S. de 84, 236 fn. 17
Milner, J.-C. 298
Milsark, G. 268
Montalbetti, M. 315
Montague, R. 374, 375
Nakajima, H. 18, 200
Neijt, A. 15, 16
Obenauer, H.-G. 23, 24, 25, 51, 52, 53, 54, 74, 100, 153, 155, 220
Odijk, J. 64
Perlmutter, D. 205, 207, 250, 253, 256, 257, 264, 291
Pesetsky, D. 91, 154, 206
Plato 1, 2
Platzack, C. 184
Pollmann, C. 184
Pollock, J.-Y. 298
Popper, K. 375
Postal, P. 17, 79, 108 fn. 7, 115, 130, 131, 220, 253, 256, 264, 291
Reinhart, T. 81, 87, 96, 106, 168, 192
Reis, M. 152, 324
Riemsdijk, H. van 2, 17, 37, 44, 57, 58, 59, 60, 62, 63, 64, 65, 66, 67, 69, 74, 83, 85, 88, 100, 101, 103, 104, 134, 143 fn. 12, 143 fn. 13, 160, 163, 164, 174, 176, 177, 182, 191, 236 fn. 17, 244, 258, 272, 276, 279, 280, 281, 282, 283, 284, 285, 286, 288, 293, 295, 313, 357, 358, 359, 361
Rizzi, L. 24, 101, 164, 165, 166, 167, 170, 195, 200, 254, 272, 276, 288, 297, 300
Roeper, T. 143 fn. 4, 351
Rosenbaum, P. 117, 128, 143 fn. 9
Ross, J.R. 3, 14, 18, 74, 83, 88, 136, 162, 163, 194, 197, 227, 228, 357, 358
Rouveret, A. 297, 306, 307, 308
Safir, K. 260, 261
Sag, I. 91, 94
Saito, M. 86, 147, 154, 217, 218, 221, 223, 224, 225, 226, 236 fn. 11, 246
Sportiche, D. 10, 21, 22, 23, 24, 25, 29 fn. 2, 81, 90, 97, 101, 106, 144 fn. 15, 222, 223
Steele, S. 208
Stowell, T. 152, 210, 225, 262, 268, 308
Sturm, A. 258
Taraldsen, K.T. 60, 156
Thiersch, C. 144 fn. 13, 144 fn. 175, 185, 265
Thráinsson, H. 102, 151
Torris, T. 185
Vat, J. 323, 324, 329
Vergnaud, J.-R. 59, 297, 306, 307, 308
Vikner, S. 317, 324
Visser, F. 117, 118
Vries, G. de 15, 347
Wasow, T. 208
Webelhuth, G. 314 fn. 7
Weinberg, A. 293, 313
Wexler, K. 315, 317, 318, 319, 321
Williams, E. 2, 15, 26, 37, 43, 57, 58, 59, 60, 62, 63, 67, 69, 74, 88, 91, 92, 94, 100, 103, 104, 109, 111, 117, 119, 138, 139, 141, 143 fn. 10, 191, 211, 251, 280, 307, 357
Xu, L.-J. 70
Yang, D.-W. 19, 149, 150, 151, 318, 325
Zaenen, A. 250
Zribi-Hertz, A. 320
Zubizarreta, M.-L. 251, 270, 276, 297, 298, 301
Zwarts, F. 44, 65, 314 fn. 6
General Index
A over A Principle, 28ff, 314 fn. 8
  and verb (projection) fronting, 284f
A-position, 60
  and NP-movement, 60
A-bar binding of -, 80
A-bar position
  and Wh-movement, 60
ACI-verbs
  and Verb Raising, 152
Adjectives
  and structural Government, 184
  as unaccusatives, 264
Adjuncts
  and anaphor binding, 121
  and antecedent government, 212
  AP -, 227f
  and Binding, 349
  and identification, 70
  -infinitives and parasitic gaps, 186f
  and WH-elements in situ, 205
Adjunction, 280ff
  Chomsky -, 205, 276, 282, 308, 348, 350, 364
  and inversion rules, 281ff
  -Rule, 282, 282f
  to S, 92f
  and Verb Raising, 280ff
  to VP, 91f
African languages, 211
Agent, 114, 254, 256
AGR, 10
  and identification, 70, 153
  and opacity factors, 208
  -subject coindexing, 47, 51f
Agreement, 257ff
  case -, 64
  COMP-verb -, 13
  directional -, 21
  and NP-traces, 41f
  number -, 31
  of successive governors and domain extensions, 149
  -relations, 14
  subject-verb -, 13, 31, 258f, 314 fn. 2
  verb-verb -, 20
Anaphoricity
  and disjoint reference, 354f
Anaphors, 9
  and adjuncts, 349
  backward -, 355, 366f
  bound -, 10, 18
  and the Bounding Condition, 327
  Dynasty for -, 336
  lexical -, 240f, 246
  and locality principles, 371
  long distance -, 19, 318, 321ff
  multi-dimensional -, 317
  and multi-dimensional domain statements, 332
  and NP-traces, 240f
  and Principle A, 318
  and subject antecedents, 316
  and subject-object asymmetry, 366
  and subjunctive verbs, 318f
Antecedent, 3, 8, 315, 326
  and adjuncts, 333
  -anaphor relations, 35, 315
  and c-command, 326
  and c-commanding arguments, 327
  as non g-subject, 350
  non-local -, 13
  and non theta positions, 37
  in the minimal Xmax, 288
  in PP's, 327
  obligatory -, 13
  -PRO relations, 110
  quantified -, 315
  -resumptive pronoun strategy, 53f
  split -, 10, 13, 42
  subject -, 316, 326ff, 348
  -trace relations, 35, 99, 110, 159, 357
  unique -, 44
Antecedent Binding Constraint, 181
Argument(s)
  and control, 109, 113
  controlling -, 115
  designated -, 115ff
  external -, 307ff
  implicit -, 114
  of laten + V, 303f
  quasi -, 261
  -structure, 13
  and theta-roles, 41
  and the theta-criterion, 41
Asian languages, 219, 223
AUX, 126, 208f
  and do-support, 208
  and modals, 208
  and theta-role assignment, 126
Avere (Auxiliary verb) (Italian), 273
Bach's Principle, 118
Barrier, 361ff
Basic word order or direction of government, 174
Be (Verb), 251
  and passive morphology, 251
  as Raising Verb, 268f
  and small clauses, 268f
  as unaccusative verb, 264
Beaucoup (Quantifier) (French), 51f
  as A-bar binder, 52
Before-phrase, 191, 360
Binary Branching, 294, 347
  and pseudo-passives, 294
Binding, 5
  A- -, 73, 240
  anaphor -, 4, 13, 315
  and argument structure, 327
  and by-phrases, 94
  and c-command, 87
  and conditions for anaphors
  and Control, 110ff
  and domains, 315ff
  of empty categories, 174
  and extraposed complements, 133
  long distance -, 337
  into PP's, 6
  and scope, 349
  and VR-complements, 133
Binding Theory, 14, 316
  and complement/non-complement distinction, 317
  configurational -, 13
  and Control, 110, 115ff
  and locality principles, 4
  and the non-redundancy principle, 351
  and parasitic gaps, 60, 360ff
  and PP-traces, 82
  Principle A of the -, 4f, 11f, 117, 141, 312, 316, 350
  Principle B of the -, 69, 316, 341ff, 350
  Principle C of the -, 68, 69, 80, 242, 316, 318, 346ff
  and Principle C violations, 346ff
  and "snake" sentences, 333
  and the SSC, 299
  and that-trace effects, 204ff
Blocking Category, 361, 365
  vs. the Bounding Condition, 361
Bounding, 145
  and Control, 99
  -domain, 6
  and Global Harmony, 145
  -nodes, 5, 23, 197, 199
  -theory, 14
Bounding condition, 8, 10, 10ff, 99
  and anaphor binding, 361ff
  and bound anaphors, 330
  and chains, 159ff
  and Control, 109ff
  for Dutch anaphors, 330ff
  and empty categories, 152
  extended -, 24, 165, 202, 216
  and movement to COMP, 27
  and NP-movement, 241ff
  and NP-traces, 241ff
  parametrization of -, 226
  and passives, 271
  and pseudo passives, 271, 291
  and reanalysis, 288ff
  and scope, 203
  and subjacency, 38, 150
  and unaccusatives, 271
  and WH-elements in situ, 212ff, 232
  and WH-movement, 232
  and X-bar theory, 17
By-Phrase, 114, 116, 326
  and passives, 114
Case
  -absorption, 251, 272
  accusative -, 307ff
  -assignment, 13, 129
  dative -, 307ff
  default -, 64, 258
  and dynasties, 292
  -filter, 51, 64, 257, 266ff
  and Government, 51, 128, 137
  -inheritance, 259, 267
  and lexical specification, 292
  -marked V's, 129f
  nominative -, 50, 257f
  overt -, 60
  -position and lexical NP's, 104
  -transfer, 44
  -theory, 14
Case Adjacency Condition, 308
Case Resistance Principle, 262
Category
  functional -, 71
  and identification, 70
  and lexical content, 70f
  transmutation of -, 67
Causative construction, 297
C-Command, 11
  and anaphor binding, 373
  and antecedent-trace relations, 373
  Aoun and Sportiche -, 90, 106
  and branching COMP's, 211, 222ff
  broad -, 349f
  and Government, 205
  and local control, 223
  narrow -, 349f
  and predication, 58
  and Principle-C violations, 346ff
  and Relation R, 10
Chain(s)
  A -, 39f
  and the Bounding Condition, 159ff
  COMP to COMP -, 46
  construal -, 46, 56
  and government, 124
  headed by an empty operator, 360
  and (in)direct theta-positions, 50
  licensing -, 158
Chain composition, 156ff, 358, 361, 361ff
  -approach, 358, 360
  and Barriers, 361
  vs. Connectedness, 361f
  and Islands, 362
  as predication, 364
  problems with -, 156ff
  and 0-subjacency, 360ff
Chinese, 70, 147, 216ff, 236
  Mandarin -, 97
The Cinque Obenauer Hypothesis, 153ff, 176ff, 191, 199, 200ff, 215ff, 232, 234
  and long WH-movement, 154
  and WH-elements in situ, 202
Clause
  and argument positions, 263
  -inversion, 253
  small -, 341
  tensed complement -, 21
  -union, 210, 304
Clause Mate Constraint, 341
Cleft sentences, 62
Clitic, 6, 73, 272, 306f, 310
  anaphoric -, 320
  -climbing, 134, 272, 299, 310ff
  dative -, 309, 311f
  and dynasty controlled percolation, 311
  and dynasties of V's, 304
  extraction of -, 310ff
  NP -, 310
  subject -, 275
  and VR-complements, 133
Co-arguments Hypothesis, 327ff
Combien (Quantifier) (French), 51
Command, 86
  f- -, 13, 326
  s- -, 87
COMP
  and accessibility for Government, 159, 203ff, 222
  branching -, 222ff
  -to-COMP violations, 87
  doubly filled -, 206f
  head of -, 225f
  -indexing mechanism, 223
  intermediate -, 220
  and nominative Case-assignment, 209
  as opacity factor, 149f
  PP's in -, 207
Complements
  clausal -, 21
  extraposed -, 120ff, 273f, 293f
  for- -, 138
  infinitival -, 110, 119
  infinitival WH -, 22
  N- -, 136ff
  -/non-complement distinction, 216
  Raising -, 123f
  sentential -, 194
  tensed -, 194
  VR- -, 120ff, 273f, 283, 293f
Complementizer
  and antecedent government, 205
  -deletion in PR, 339
  and extraposed infinitives, 338
  as head of S-bar
  null -, 143 fn. 2
Complex NP Constraint, 136, 159ff
  and chains, 159ff
  and Global Harmony, 193
  violations of the -, 154
Condition(s)
  on derivation, 4
  on representation, 4
Condition on Extraction Domains, 200
Configurational Matrix, 8ff, 20, 98, 104f, 105, 246, 259, 272 (see also Relation R)
  and Base rules, 107
  and Binding, 316
  and Control, 109ff, 125
  and Government, 112, 149
  and identification, 70
  and indexing, 35
  and licensing relations, 41, 149
  and reflexivization, 326
  and uniqueness, 35, 42, 99
  and vertical relations, 108
  violations of the -, 111
  and Visser's Generalization, 117
Conjuncts, 15
Connectedness Condition, 145ff, 159ff, 165, 230
  and canonical Government, 173, 175
  and g-projections, 145f, 159, 162, 288
  and the Grammar of gaps and scope, 145
  and S-structure, 145, 224
  and weak dynasties, 169
  without directionality constraints, 231
Construction independent properties, 147
Control, 109ff
  anaphoric -, 109ff
  and the Bounding Condition, 109ff
  and extraposed complements, 121ff
  functional -, 125
  and implicit/explicit arguments, 114
  of infinitival complements to verbs, 113f
  non-anaphoric -, 109ff
  and NP's, 139
  obligatory -, 14, 111
  and raising, 124
  and semantically unrestricted functions, 118
  -structures, 13, 200
  -theory, 14
  and VR-complements, 121ff
Coordinate Structure Constraint, 359f
Coordinated
  -constituents, 314 fn. 6
  properly -, 14f
  -structures, 14f, 95, 280
Core Grammar
  and parameters, 25f
  invariant -, 26
Covalency, 276ff, 286
  and transparent complements, 284
Crossover
  strong -, 68f, 80f, 355f
  weak -, 80, 96
Cyclic
  successive - linking, 155
Danish, 317, 339
Deep-structure, 31f
Den (Pronominal) (German), 64, 83
Despite (Adjunct Preposition), 212
Determiner
  as quantifier, 96
Die (Relative Pronoun) (Dutch), 22, 45
Directionality constraints, 19, 53
  and domain extensions, 173ff, 228
  as D-structure property, 338
  and LF-gaps, 147, 201ff
  and overt gaps, 232
  or percolation, 145
  and SOV-languages, 53
  and S-structure gaps, 147
  and SVO-languages, 53
  and the theory of markedness, 19
  and WH-elements in situ, 212, 215, 230
Discourse Principle, 347, 352ff, 353
  and the anaphoricity scale, 352
  and Principle C, 347, 352
Discourse Semantics, 353
Domain
  and ACI-verbs, 152
  anaphoric -, 6
  antecedent -, 12
  basic -, 7, 25
  and Binding, 315ff
  co- -, 11
  d-subject -, 333ff
  dynasty governed -, 20, 27
  -extension, 6f, 9, 11, 17ff, 148, 317
  -extensions and lexical Government, 204, 229, 235
  -extractions and structural Government, 235
  and lexical factors, 292
  local -, 6, 14, 15
  of long distance reflexivization, 149f
  of long WH-movement, 150
  maximal binding -, 332
  minimal -, 12, 18
  minimal COMP -, 333
  of P, 6f
  of parasitic gaps, 150
  and Principle C, 350
  scopal -, 79
  -stretching, 18
  of a subject, 12, 331
  of V, 6f
Double Binding Constraint, 181
Dougherty's Anaporn Principle, 64
D-structure, 2, 38ff
  and percolation, 311
  as substructure of S-structure, 2, 34, 57, 99
  and Theta-theory, 33, 40ff
  and variables, 50
d-Words, 22, 44, 144 fn. 19, 144 fn. 20
  bound -, 64
  and transmitting, 45
Dynasty, 19f, 150, 150ff, 372, 375
  and adjuncts, 333
  and agreement of lexical categories, 19
  and anaphor binding, 316ff, 337
  and Case-assignment/absorption, 292f, 304, 307f
  and directionality, 19f, 152f, 338
  and the distribution of zich/zichzelf, 330
  and domain extensions, 212ff, 235, 293, 336
  and extended domains of anaphors, 165
  and governing subjects, 318
  of governors, 21, 288, 292, 336, 362, 372
  and interclausal verb agreement, 19f
  and L-marking, 361
  and long distance anaphors, 336
  and parametrization, 151, 159ff
  perfection degree of -, 161ff
  and prepositions, 153
  and proper government, 296
  and reanalysis, 279ff
  and reflexivization, 336
  of structural governors, 159f
  and theta-role assignment, 304, 361
  upper bounding to -, 166f
  of V's, 24f, 151, 159, 171, 296
  of two V's, 165ff
Each other (Reciprocal), 240
Easy to please, 45, 99, 356, 363
  and Principle C, 356
  and empty operators, 363, 367
Empty Category Principle (ECP), 203ff, 233ff, 242, 245, 264
  and Bounding, 145
  and Global Harmony, 145
  and LF, 91, 145
  and Subjacency, 145
E-language, 240, 374f
  vs. I-languages, 374
Empty categories, 245
  in A-bar chains, 153ff, 360
  in domain extensions, 153ff
  and the dynasty concept, 152f
Er (R-pronoun) (Dutch), 178f, 206, 244f, 253, 260ff
  and impersonal passives, 260f
  -movement, 271, 274
  as presentational adverb, 261
  in Topic-position, 265, 291
Ergatives, 242ff
  and word order, 242
  with a causative counterpart, 255
Es (Pleonastic element) (German), 261f, 274
Essere (Auxiliary) (Italian), 273
  and transparent complements, 273
Exceptional Case Marking, 112, 121, 127, 130ff
  and causatives, 130
  and directional government, 127
  and NP's, 139
  and perception verbs, 130
  vs. raising to object position, 131
  and S-bar deletion, 112, 132
  and te-less infinitives, 131
  and VR-complements, 121
Extended Name Constraint (ENC), 90, 95
Extended Standard Theory (EST), 103
Extraction
  across the board -, 280, 357ff
  of adjuncts, 22, 155f
  er -, 283f
  and Islands, 155
  out of complex NP's, 18, 199
  out of PP's, 18
  from tensed WH-islands, 228
Faire (French), 272, 275, 293, 296ff
  and à-insertion, 299ff
  and Case-assignment, 302, 305, 307ff
  and clitics, 272, 309
  and double object constructions, 306
  and dynasties, 302, 304
  -infinitive constructions, 296, 307
  and NP-extraposition, 298ff
  par - constructions, 297, 302
  and predicate adjectives, 300
  and pro PP's, 306
  and small clauses, 301
Finnish, 60
Flemish, 144
  west -, 282, 286
For-Phrase, 111f, 118, 119, 189
  and Control, 111
  -deletion, 241
  as proper governor, 242
French, 6, 51, 72, 97, 148, 150, 219, 222, 226f, 232, 236, 270, 272, 278, 280, 292, 296ff, 320, 325
Functional position, 40, 102
  -licensing -, 310
Functionalism, 373f
  and LF, 373f
Gapping, 14
  -construction, 14
  and constituency, 279f
  and Verb Raising, 280, 283
Generative Semantics, 78, 98, 103
German, 19, 24, 44, 53, 63, 80, 83, 84, 102, 144, 151f, 157, 179, 185, 187, 193, 209, 219ff, 236, 253, 258ff, 273f, 282, 294, 296, 301, 313, 324f, 339f, 368
  Swiss -, 281, 286, 295
Germanic languages, 144, 152, 165, 211, 273, 275, 283, 290, 294, 297, 305, 313
Global Harmony, 145, 172ff
  and Binary Branching, 176
  condition of -, 174
  and domain extensions, 172ff
  as a D-structure property, 289
  and Island Conditions, 192ff, 214
  and L-marking, 362
  and NP-movement, 271ff
  and parasitic gaps, 185ff, 214, 363
  and passives, 244
  and P-stranding, 174ff, 212ff
  and S-structure gaps, 230
  and Subjacency, 194ff
  and transparent complements, 274, 288ff
  and WH-elements in situ, 202, 212ff
Governing Category, 4, 6, 10, 292, 316, 342
  and the Binding Theory, 147, 150
  domain -, 166
  and domain extensions, 317
  and minimal Xmax, 6f, 10, 15, 99, 116, 119, 320
  parametrization of the -, 319
  and subject domains, 331
Government
  antecedent -, 91, 203ff, 221, 233
  bidirectional -, 174
  canonical -, 146f
  direction of -, 6, 18, 122ff, 138, 338
  dynasty -, 292
  lexical -, 91, 233, 246
  as a licensing relation, 13
  and minimal c-command, 51
  proper -, 23, 91, 145, 201, 203, 224
  structural -, 6, 234
  -theory, 14
  and transparent complements, 274
  transparency for -, 112
Governor
  agreement of -, 150
  chain of successive -, 150, 168
  structural -, 6, 20, 152, 159
  successive -, 19
  and the Uniqueness Condition, 169
  unique - and A-bar bound ec's, 170
Grammatical function (GF)
  GF-theta, 38f
Heavy NP-Shift, 279
Hebben (Auxiliary) (Dutch), 248ff, 291
  and psychological verbs, 253f
  and unaccusatives, 248
  and unergatives, 290
Hem (Pronoun) (Dutch), 325, 342
  anaphoric use of hemzelf, 344ff
  Binding Condition for hemzelf, 345
  as pronominal and anaphor, 325
  and snake sentences, 334
Het (Expletive) (Dutch), 260ff
  as argument, 262ff
  as quasi argument, 261
  in Topic-position, 265
  in VP-internal position, 265
Himself (Reflexive), 240
Hungarian, 84, 221, 236
Icelandic, 19, 27, 102, 149ff, 161, 165, 240, 288, 318ff, 331, 339f
Identification
  and variables, 81f
Idiom chunks, 44f
Idiomatic expressions, 31, 34, 47f, 243
  and adjacency, 31
  and coreference, 48
  and the lexicon, 48
  and obligatory V-movement, 48
  and passives, 268
  and reanalysis, 278
i/i Condition, 88, 106
  and quantified NP's, 89
Indefiniteness effect, 267f
Indexing
  as cosuperscripting, 259
  free -, 35
  and move-alpha, 35
  to Q, 220
  referential -, 35, 106f
  at S-structure, 221
Indirect object preposing, 243f
  and passivisation, 243
Infinitives
  adjacent -, 62
  as dependent element in dynasties, 151
  fronted -, 75
  te-less -, 120ff, 138, 247, 302
  te-less -, and gerunds, 120ff
  te-less -, and theta-marking, 122
Infinitival complements
  and Control, 125f
  in Dutch, 119ff
  and ECM, 125f
  and nominative Case-assignment, 258
  and nouns, 136ff
  opaque -, 122
  and raising verbs, 125f
  transparent -, 122
INFL, 4, 6
  and anaphoric domains, 209
  and nominative Case-assignment, 208f, 257f
  as opacity factor, 149, 208
  and proper government, 233
  strong -, 266
Intransitives
  unaccusative -, 257
  unergative -, 257
Inverse linking cases, 95f
Irish, 143
Italian, 20, 24, 28, 50, 62, 75, 144, 148, 150, 164ff, 190f, 195, 199f, 236, 254, 257ff, 272f, 278, 280, 297ff, 311f
Island Violations, 20ff
  of the CNPC, 20, 52, 147, 154
  and Global Harmony, 192ff, 214
  and parasitic gaps, 156
  and scope-assignment, 219
  WH- -, 18, 20, 21, 147, 164, 214, 217
  and WH-elements in situ, 146f
It (Expletive), 262ff
  as argument, 262, 269
  expletive -, 314 fn. 4
  factive -, 314 fn. 4
  as variable, 262
  in VP-internal position, 265
Japanese, 12, 27, 70, 147, 216ff, 236, 321ff
Korean, 12
Laisser (Verb) (French), 306f
  and dynasties, 306f
Latin, 242
Left dislocation, 3, 65
Lexical content, 40, 102
  and theta-roles, 40
  and functional positions, 59, 102
Lexical Functional Grammar, 118, 125
Lexical structure, 2, 43, 98
  and the Projection Principle, 43
Linearization Rule, 15
Lo (Clitic) (Italian), 300
Local
  controller, 223, 233
  -/non-local distinction, 216
  relation and trace theory, 33
Local domain, 6, 14, 15, 147ff
  -extensions, 149ff
Locality, 10, 101, 361
  and Control, 115
  minimal - and anaphor binding, 316
  and the sub theories of Grammar, 147
  vertical -, 220
Locality Principle, 6, 11, 371f
  anaphoric -, 4
  strict -, 361
Logical Form (LF), 2, 76ff, 100, 374
  and the Binding Theory, 82
  and c-command, 96
  and the logistic interpretation, 76ff
  -movement, 147, 219, 222f
  and proper Government, 201
  and Subjacency, 203
Long distance reflexivization
  and control complements, 339
  and dynasties of dependent verbs, 339f
  and VR-complements, 338f
Lui-même (French), 6, 320
  and Binding, 320
Major Constituent Condition, 15f
Maximal projection
  and local dependency relations, 26
  and vertical dependency relations, 26
Minimal Distance Principle, 117
Movement
  of AUX to COMP, 211
  LF -, 76, 100
  local WH -, 102
  long WH -, 19, 102, 154
  minor -, 144 fn. 16
  NP -, 14, 40, 187
  vs. pro-linking, 164
  R -, 121, 134
  successive cyclic -, 102, 158, 220
  to non-theta positions, 49
  unbounded -, 4f
  WH -, 14, 40
Move-Alpha, 2f, 8, 31f, 54f, 357, 371
  and the Bounding Condition, 27f
  and partial property sharing, 72
  and rules of construal, 34
  and Subjacency, 27f
Multi-dimensional representations, 276ff, 313
  and Verb Raising, 281
Nali ("Where") (Japanese), 218
Nani ("What") (Japanese), 217, 22f
Naze ("Why") (Japanese), 218, 224f
Nesting Hypothesis, 176
Nodes
  cyclic -, 195
  daughter -, 17
  mother -, 16
  sister -, 16
Nominative Island Constraint, 148, 208ff
  and minimal domains, 209
Nonredundancy Principle, 351, 356
  and Principle B, 351f
  and Principle C, 352
  and the theory of markedness, 351
Norwegian, 148, 151, 317, 339
Noun (N)
  and government and opaque infinitival complements, 136ff
  picture -, 326
Nominal Phrase (NP)
  anaphors/pronominals and passives, 340ff
  complex -, 97
  -Constraint, 196, 198
  -extraction and Case-positions, 310
  and extraposition, 128, 298ff
  -gaps and identification, 70
  and non-local scope markers, 220
  -/non-NP distinction, 219
  and obligatory control, 139f
  Principle C and conjoined -, 347
  quantified -, 17, 69, 78ff, 348
  -raising, 126
  and te-less infinitives
NP-movement, 40, 239ff
  and the Bounding Condition, 291f
  and the Case-filter, 267
  and domain extensions, 271
  and Heavy NP-shift, 279
  from object to subject position, 244, 312
  opacity factors and -, 240f, 312
  optional passive -, 257
  and Principle A, 240, 312
  and Subjacency, 240, 312
NP-structure, 2, 57, 100f
  and abstract Case, 59, 63
  and the Binding theory, 59f, 67
  and filters, 62f
  and predication, 59f
  and to-contraction, 61f
  and pre WH-movement, 58
Of (Complementizer) (Dutch), 206ff
  -insertion, 267
Om (Complementizer) (Dutch), 119f, 138, 338
  and extraposed complements, 122f, 338
  and VR-complements, 123
Opacity
  anaphor binding and -, 316
  -elimination, 320
  factors, 6f, 149, 240f, 319, 368
  and governing categories, 317
  and NP-traces, 240
  - pre-Pisa, 195
  pseudo -, 52
  zich/zichzelf distribution and -, 330
Opacity Condition, 148
Operator (O), 46, 59
  COMP, 84f
  empty -, 356ff
  non-NP WH -, 218
  and Principle C, 367
  Question - (Q), 220ff
  vacuous -, 159
  and [+wh]-feature, 46
Parametric variation, 6
  and configurational relations, 316
  and domain extensions, 372
  and dynasties, 159ff
  S/S-bar -, 201ff, 232
  and Subjacency, 200
Parasitic Gaps, 19, 21, 50, 60, 155ff, 214, 356
  A-bar chains and -, 362
  adjunct cases of -, 186f, 360
  and ATB extraction, 357ff
  and backward anaphors, 366
  and the anti c-command condition, 69, 80, 156, 360, 364ff
  and chain-composition, 362ff
  coordination and -, 358
  vs. derivational approach, 357
  directionality constraints and -, 185
  as a D-structure property, 190f
  empty operators and -, 360
  and Island sensitivity, 156f
  as NP-gaps, 156
  NP-structure and -, 191
  and operator binding, 155
  Principle C and -, 356ff
  and pro, 191
  as resumptive pro, 74, 99, 155ff
  Subjacency and -, 156
  subject cases of -, 185f, 214, 359
  tense and -, 187, 358
  as variables, 360
  and WH-elements in situ, 214
  and WH-phrases, 60
Particles, 181
  and incorporation, 181
Passives, 242ff
  and Case-theory, 252f, 257ff
  Case-filter and -, 266ff
  empty subjects and -, 260
  and external theta-roles, 251ff
  and human agents, 256
  impersonal -, 243, 256, 260f, 269, 291
  lexical -, 243
  obligatory -, 252
  optional -, 252
  pseudo -, 243, 280, 288, 291ff
  raising -, 243
  and reflexivization, 326
  and sentential complements, 243
  Topic-position and -, 265
  and word order, 242f
Passive morphology
  and be and causative verbs, 251
  and a non-theta subject position, 251
  and objective Case absorption and verbs of perception, 251
Past participles, 247
  and optional Verb Raising, 283
  prenominal -, 249
Path, 18 trace free -, 290 Path Containment Condition, 91 Percolation -domain, 169 governed complements and -, 166 -mechanism, 362 Percolation Islands, 290 and extraposition, 295f Phonetic Form (PF), 2 (see also surface structure) Phonetic Representation (PR), 61 Phrase markers, 277ff reduced -, 277ff Phrase Structure Rules, 277 Pied Piping, 80, 83, 88, 213f and invisible LF-movement, 88 Plato's problem, 1, 2, 371 Portuguese, 259 Postposition Stranding, 21, 174ff, 212 and Global Harmony, 174ff, 212, 289 Pourquoi (Adjunct) (French), 226f Prepositional Phrase (PP) and absolute Islands, 180, 213 -extraposition, 180, 213 as minimal governing category, 335 -preposing, 188 resumptive -, 236 fn. 5 Predication, 14 and c-command, 58 complex predicate formation, 183ff raising predicates, 112 -structure, 58 and WH-movement, 59 Prepositions (P) and infinitival complements, 188 non-contrastively stressed -, 343 and proper government, 181 as structural governors, 179, 236 fn. 4 and tensed complements, 188 Preposition Stranding, 18, 20, 162ff AP's and -, 182ff adjacency and -, 181ff, 293 and the Bounding Condition, 162 and Connectedness, 146 and cosuperscripting, 181 and dynasties, 162ff and Global Harmony, 174ff, 289 and non-pronominalizable NP-heads, 162f NP's and -, 182ff and pro, 162ff pseudo passives and -, 271 pro, 23 distribution of -, 157 -drop languages, 51 and government, 37, 154ff
identification and -, 157 long binding of -, 199 as a resumptive pronoun, 37, 73, 154ff semi pro-drop, 313 and successive g-projections, 159 vs. traces, 159ff WH-elements in situ and -, 220 PRO, 13, 109ff, 234, 242 and anaphor binding, 141 governed -, 113ff and government, 37 obligatory -, 112 optional -, 111 and Principle A, 120 and to-contraction, 61 Projection Principle, 41f, 262, 276, 298, 304 Extended -, 43 violations of the -, 276, 298, 304 Prominence Hierarchy, 353 Pronominal, 3, 10 intensified -, 344f Pronoun, 6 bound -, 315 donkey -, 96 empty expletive -, 260 overt -, 188 relative -, 22 R -, 174ff, 213, 244 resumptive -, 39, 53, 73 Property transfer optional -, 36 and the configurational matrix, 36 Proprio (long distance anaphor) (Italian), 321 Pruning operation, 275 and Verb Raising, 275 Pseudo cleft constructions, 32f Quantification at a distance, 51 Quantifier(s) and transparent complements, 286 and VPR, 286 -variable notation, 374f Quantifier Raising (QR), 17, 77ff, 85ff Que/Qui-Switch, 236 fn. 15 Raising and directionality of government, 127 and NP's, 138 to object, 131, 133f and obligatory Control, 125 into PP's, 143 fn. 6 reading collective, 92f distributed, 92f Reanalysis, 271ff
A over A principle and -, 285 and A-bar type of NP-movement, 272 and adjacency, 144 fn. 22, 293 against -, 279ff clitics and -, 274 constituent tests and -, 279f, 293 and covalency, 276ff as dependent factor in dynasties, 151 and dynasty controlled domain extensions, 271, 288ff Fare-infinitives and -, 276 Heavy NP-shift and -, 279 Multi dimensional representations and -, 276ff and the nature of matrix verbs, 274 optional Verb Raising and -, 283 and P-stranding, 161, 293 and pseudo passives, 160f, 272, 279, 283 and reduced phrase markers, 278 and subjacent domain governors, 288 and transparency, 272ff of V and P, 160f, 283 and Verb Raising, 144 fn. 16 and verbal complexes, 275, 278ff Reflexivization and transparency, 133 (co)Reference and configurational aspects, 316 and the Discourse Principle, 353 disjoint -, 315, 343f, 346ff intended -, 315 Referential Expression (R-expression), 40, 68, 316 Referential index, 8 Relation(s) agreement -, 14, 31 anaphoric -, 8 dependency -, 8, 13 grammatical dependency -, 4 licensing -, 13f, 41f local -, 33 operator-variable -, 80 predication -, 364 quantifier-variable -, 76f, 364 -R, 9 scope -, 17 subject-predicate -, 43, 64 unbounded -, 4 Relation R and c-command, 10f and coordinate structures, 14f and locality, 10f and obligatoriness, 9, 11 and uniqueness, 9f, 52 Relational Grammar, 250 Representations, levels of, 2, 27, 31ff
and generative grammar, 98 (empty) resumptive pronoun close operator bounding vs. -, 155 and extraction out of PP's, 154f as operator, 154 as pro, 153 Right Node Raising, 131, 314 fn. 6 and constituency, 131, 280 Romance languages, 27, 53, 157, 179, 185f, 192f, 273, 275, 283, 290, 297, 305, 313, 329 Root sentences and te-less infinitives, 128 Rule-R, 51 Rules of construal, 34f and Case-transmitting, 65 local -, 66, 102 and move-alpha, 34, 54, 99 non-movement -, 55f, 65 and S-structure, 34 Rumanian, 19 S
as bounding node as governed complement, 314 fn. 2 as INFL domain, 208 S-bar, 26f as bounding node, 167, 227 -deletion, 112ff, 132, 210 as domain of COMP, 208 the minimal governing -, 194 as the minimal g-projection of V, 175 as minimal maximal projection, 149 schijnen (Raising verb) (Dutch), 124f Scandinavian languages, 18, 20, 27, 53, 150, 157, 179, 185f, 192f, 236, 336, 339f Scope, 79ff adverbial -, 134, 274f, 286, 295 -ambiguity, 86ff, 286f, 296, 349 -assignment, 79, 214f bounding and -, 202 and c-command, 86ff, 286f of governors, 170f Grammar of -, 201ff and incorporation, 286 -indexing, 88, 286 -markers, 84f, 216 matrix -, 85, 286 and minimal local domains, 217 narrow -, 90, 286, 295 -percolation, 295 -principle, 90, 90f and transparent complements, 284, 286 wide -, 146, 215, 274f, 286, 295 Scope Indexing Rule, 286, 288 se (Reflexive) (French), 6, 72, 297, 306
and Binding, 320 as intransitivizing affix, 306 seem (Raising verb), 37 as unaccusative verb, 264 seg (Reflexive) (Norwegian), 151 Self-deletion, 343 Sentential noun complement, 97 Ser (Reflexive) (Icelandic), 321 Share property, 8, 35, 70f, 372 and Case-inheritance, 259 and construal rules, 99 isotopic, 71f, 79, 82, 104f non-isotopic, 71f, 79, 82, 93, 104f and relation between A-positions, 81 and successive governors, 152 and transmitting, 45, 70 Shei ("Who") (Chinese), 217 Sheme ("What") (Chinese), 217 si (Pronoun) (Italian), 272f Sig (Reflexive) (Icelandic), 151, 318ff, 331 Sluicing, 65f and connectedness of discourse, 65f Snake sentences, 325, 333ff and Principle A, 333 and Principle B, 333 and pronouns, 325, 333 and reflexives, 325, 333 and the small clause analysis, 334 SOV-language, 120 and directionality of Government, 122 Spanish, 23, 150, 272 Specified Subject Condition (SSC), 299, 307, 309 violations of the -, 324 S-structure, 2 and Binding Theory, 33, 68f, 89f and bounding, 200 and c-command, 96f Empty [NP, S] and -, 244 linking of WH-phrases and -, 231 and move-alpha, 99 and percolation, 200 and semantic interpretation, 103 and theory Ia, 99 and theory Ib, 99 Structure preservingness, 101f and trace theory Stylistic Inversion, 298f Subcategorization as a licensing relation, 13 and theta-marking, 262 Subjacency, 3f, 53, 69, 195, 195ff and the antecedent-trace relation, 37 Bounding Theory and -, 148 and the Bounding Condition, 38 as a condition on representation, 195 ECP and -, 145
and LF-movement, 85f one node -, 5, 195, 196 two node -, 5, 195ff and NP-movement, 240 vs. Principle A, 368 0-subjacency, 360, 364ff and scope assignment, 85 WH-elements in situ and -, 202, 215 Subject, 12 -AUX inversion, 128, 210f derived -, 256 -drop phenomenon, 257ff d-subjects and adjectives, 344f dynasty -, 323, 331ff governing -, 318, 318, 322f, 337, 354 implicit -, 256 incorporation of the -, 247ff -indirect object inversion, 264 and infinitival complements, 110 local -, 31, 354 minimal d -, 344 non-governing -, 354 -object asymmetry, 192, 198, 206, 210, 246 -object inversion, 253 obligatory lexical -, 265 -oriented language, 266 -predicate relation, 43, 258 raising to -, 121 rightward movement of the -, 298 -sentences, 263 unergative -, 254 -verb inversion, 297 SUBJECT, 4, 6, 10, 149f, 240, 317 Subjunctive, 19, 318 and dynasties, 151, 339 Subset Principle, 239, 318ff, 368 and the Bounding Condition, 320 and the markedness hierarchy of opacity factors, 319, 321 Superiority effects, 229, 234 Surface structure, 31f, 37 (see also PF) and deletion rules, 37 and stylistic rules, 37 SVO-language and directionality of government, 122 Swedish, 52, 148, 184f, 339 te (Infinitival marker) (Dutch), 120 That-trace effect, 155ff, 203ff, 233f, 246 and the Binding Theory and doubly filled COMP's, 211 intransitives and -, 207 NIC and -, 208f, 234, 246 and pro-drop languages, 207 superiority effects and -, 207f
and that deletion, 158 Theory directionality -, 20 domain -, 371f of identification, 70 level -, 371 of markedness, 7 trace -, 32 There -insertion, 267f Thesis of Radical Autonomy, 28, 110, 141, 147, 372 and locality, 148, 195 and Subjacency, 195 Theta theory Barriers and -, 360 and Complex Predicate formation, 183 and expletive it, 269 and government, 129 and property transfer, 36 and non-theta positions theta criterion, 33, 41, 49, 124 theta marked categories, 25 theta marking, 13 theta position, 3 theta-role, 3, 8 theta-role assignment, 43 unaccusatives and -, 305 unergatives and -, 305 Three Level Model, 371 Topicalization structures, 44, 59, 62, 99 and Case theory, 63f, 258 and Vergnaud Raising, 63ff and VR-complements, 129 Topic, 59, 261 and Government by COMP, 266 and subject sentences, 263 Trace, 2 antecedent- - relation, 3 domain extensions and -, 165 and government, 3, 37 intermediate -, 194 and level theory, 371 and lexical licensing, 310 NP -, 39 WH -, 18, 39 -theory, 32 Transparency, 272ff and causatives, 301 domain extensions and -, 288f dynasties of V's and -, 296 of infinitival complements, 313 laisser and -, 306f and matrix scope perception verbs and -, 301 and percolation, 290ff -without reanalysis, 288ff
Verb Raising and -, 283 and VP-movement, 297 Transformational Generative Grammar (TGG), 32, 101 and cycles, 101f and derivational concepts, 101 and isotopic property sharing, 71 and representational concepts, 101 Transmitting Case -, 45, 54f, 312 of lexical content, 45, 54f theta-role -, 45, 54f, 308f
Unaccusative Hypothesis, 256 and passives, 256 Uniqueness Condition, 172, 351 Universal Grammar (UG), 9, 12, 19, 99, 206 binding and -, 350 core domains and -, 372 and dynasties, 152 and long distance anaphors, 336 Non-redundancy Principle and -, 351 opacity factors and -, 319 and parasitic gaps, 185 the positive binding condition and -, 356 Universal quantification and quantified NP's, 77 Universal quantifier, 92 Variable, 50, 68, 78ff, 315, 360 vs. A-binding, 242 dressed -, 77, 81 and LF, 98ff Principle C and -, 355 Verb (V) ergative -, 305 and leftward movement, 297 minimal g-projection of -, 188 -movement in PF, 280f -projection preposing, 247ff, 254, 283 -projection raising, 283, 295 -Raising, 48, 120, 133, 181, 210, 275, 281f, 329 reflexive -, 330 -Second, 48, 131, 178, 211, 265f, 284, 329 and structural government, 136 Vergnaud Raising, 44, 59 and Case transfer, 64 Visser's Generalization, 117 and the configurational matrix, 117 Verb Phrase (VP) -analysis, 123 -barrier, 365f -deletion, 88, 94f, 209
-preposing, 131f, 297ff Wanna-contraction, 310 Was (Scope marker) (German), 84f Wat voor split, 245f, 253f Weisheme ("Why") (Chinese), 217 Wellformedness Condition on Chains (WCC), 154ff WH-movement multiple - and branching COMP's, 222 non overt -, 217 WH-phrase and COMP, 39 gaps and -, 233 and Global Harmony, 202 -in situ, 71, 76, 145, 201ff local thinking and -, 216 long distance linking and -, 216 and Principle C, 355f as [+ pronominal], 220 and resumptive pronouns, 39 X-bar theory, 375 and auxiliary projections, 29 fn. 6
and lexical projections, 29 fn. 6 and vertical relations, 16, 108 Y (Clitic) (French), 306f, 311f
and Case-less pro-PP's, 312 and domain extensions, 307 zibun (Reflexive) (Japanese), 12 zich (Reflexive) (Dutch), 7, 133, 317f, 323ff, 342 and adjunct PP's, 335 antecedent of -, 326ff Case and -, 330 as clitic, 329 domain of -, 330ff and domain of V, 324, 329f and double object construction, 330 and "snake" sentences, 333 theta-roles and -, 330 zichzelf (Reflexive) (Dutch), 7, 323ff and adjunct PP's, 335 antecedent of -, 326ff domain of -, 330ff and domain of V, 324, 329f zijn (Auxiliary) (Dutch), 248ff, 291