Theoretical Approaches to Universals
Linguistik Aktuell/Linguistics Today Linguistik Aktuell/Linguistics Today (LA) provides a platform for original monograph studies into synchronic and diachronic linguistics. Studies in LA confront empirical and theoretical problems as these are currently discussed in syntax, semantics, morphology, phonology, and systematic pragmatics with the aim to establish robust empirical generalizations within a universalistic perspective.
Series Editor Werner Abraham University of Vienna
Advisory Editorial Board Guglielmo Cinque (University of Venice) Günther Grewendorf (J.W. Goethe-University, Frankfurt) Liliane Haegeman (University of Lille, France) Hubert Haider (University of Salzburg) Christer Platzack (University of Lund) Ian Roberts (Cambridge University) Ken Safir (Rutgers University, New Brunswick NJ) Lisa deMena Travis (McGill University) Sten Vikner (University of Aarhus) C. Jan-Wouter Zwart (University of Groningen)
Volume 49 Theoretical Approaches to Universals Edited by Artemis Alexiadou
Theoretical Approaches to Universals
Edited by
Artemis Alexiadou University of Potsdam
John Benjamins Publishing Company Amsterdam / Philadelphia
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.
Library of Congress Cataloging-in-Publication Data Theoretical Approaches to Universals / edited by Artemis Alexiadou. p. cm. (Linguistik Aktuell/Linguistics Today, ISSN 0166–0829 ; v. 49) Papers from a conference on Universals organized by the Research Center for General Linguistics, the Linguistics Department of the University of Potsdam and the Dutch Graduate School in Linguistics and hosted in Berlin in March 1999. Includes bibliographical references and indexes. 1. Universals (Linguistics)--Congresses. 2. Grammar, Comparative and general--Congresses. I. Alexiadou, Artemis. II. Zentrum für Allgemeine Sprachwissenschaft, Typologie und Universalienforschung. III. Universität Potsdam. Institut für Linguistik. IV. Landelijke Onderzoekschool Taalwetenschap. V. Linguistik aktuell; Bd. 49. P204 .T48 2002 415’.01-dc21 ISBN 90 272 2770 5 (Eur.) / 1 58811 191 1 (US) (Hb; alk. paper)
2002021464
© 2002 – John Benjamins B.V. No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher. John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands John Benjamins North America · P.O. Box 27519 · Philadelphia PA 19118-0519 · USA
Table of contents

List of contributors

Introduction
Artemis Alexiadou

Universal features and language-particular morphemes
Maya Arad

Agree or attract? A relativized minimality solution to a proper binding condition puzzle
Cedric Boeckx

Distributed deletion
Gisbert Fanselow and Damir Ćavar

Roots, constituents, and c-command
Robert Frank, Paul Hagstrom, and K. Vijay-Shanker

A four-way classification of monadic verbs
Murat Kural

On agreement: locality and feature valuation
Luis López

A minimalist account of conflation processes: Parametric variation at the lexicon-syntax interface
Jaume Mateu and Gemma Rigau

Morphological constraints on syntactic derivations
Juan Romero

Intermediate traces, reconstruction and locality effects
Joachim Sabel

Index
List of contributors

Artemis Alexiadou
University of Potsdam, Institute of Linguistics, Postfach 601553, 14415 Potsdam, Germany

Maya Arad
University of Geneva, Dept. of Linguistics, 2, rue de Candolle, CH - 1211 Genève 4

Cedric Boeckx
Department of Linguistics, 4088 Foreign Language Building, 707 South Mathews Avenue, MC-168, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA

Damir Ćavar
Dresdner Bank, CC IT, enateg Research & Innovations, 60301 Frankfurt a.M., Germany

Gisbert Fanselow
University of Potsdam, Institute of Linguistics, Postfach 601553, 14415 Potsdam, Germany

Robert Frank
Department of Cognitive Science, Johns Hopkins University, 243 Krieger Hall, 3400 N. Charles St., Baltimore, MD 21218-2685, USA

Paul Hagstrom
Department of Modern Foreign Languages & Literatures, Boston University, 718 Commonwealth Ave., Boston, MA 02215, USA

Murat Kural
University of California, Irvine, Department of Linguistics, 3151 Social Science Plaza, Irvine, CA 92697-5100, USA

Luis López
University of Illinois-Chicago, Dept. of Spanish, French, Italian and Portuguese, College of Liberal Arts and Sciences, 601 South Morgan St., Chicago, IL 60607-7117, USA

Jaume Mateu
Departament de Filologia Catalana, Facultat de Filosofia i Lletres, Edifici B, Universitat Autònoma de Barcelona, E-08193 Bellaterra, Spain

Gemma Rigau
Departament de Filologia Catalana, Facultat de Filosofia i Lletres, Edifici B, Universitat Autònoma de Barcelona, E-08193 Bellaterra, Spain

Joachim Sabel
ZAS, Jägerstr. 10-11, 10117 Berlin, Germany

Juan Romero
Dept. de Filología Española, Universidad Autónoma de Madrid, 28049 Madrid, Spain

Vijay K. Shanker
Department of Computer and Information Science, University of Delaware, Newark, Delaware 19716, USA
Introduction Artemis Alexiadou University of Potsdam
The present volume has its origin in the GLOW conference on Universals organized by the Research Center for General Linguistics (ZAS, Berlin), the Linguistics Department of the University of Potsdam and the Dutch Graduate School in Linguistics (LOT) and hosted in Berlin in March 1999.¹ In the first part of this introduction, I offer a brief overview of the main issues involved in our understanding of and quest for universals, by presenting some points of controversy concerning the proper characterization of universal and language-specific properties. In the second part, I summarize the contributions to this volume.
1. Universals in linguistic theory
The search for universals has always been at the center of interest in linguistic theory. Two main approaches can be recognized. On the one hand, work by Greenberg (1966), Comrie (1981), Croft (1990) and others searches for surface properties that would be common to all languages and attempts to identify patterns of regularities. For the generative linguist, on the other hand, fundamental claims about universal properties of language are built into the very architecture of the theory of Universal Grammar (UG) in various forms, as we will see below. Alongside formal universals, generative linguists also seek substantive universals in inventories, markedness patterns, feature hierarchies, etc., of the type explored in Greenberg’s work. As Croft (1990) points out, both approaches deal with the question ‘what is a possible language?’ and believe that there are universal constraints that define the answer to this question. Moreover, both believe that the answer to this question is reached through the comparative study of language.
Typologists have determined so-called unrestricted universals, which are assertions that all languages belong to a particular type on some parameter, and the other types on the same parameter are not attested (or are extremely rare), e.g. all languages have oral vowels (Croft 1990: 46). Most importantly, however, typological work established a number of implicational universals. Implicational universals offer a description of logically possible language types that limits linguistic variation but does not eliminate it. Not all of them are absolute and without exception. In several cases it turns out that we are dealing with universal tendencies rather than universal properties.

In the generative tradition, properties of languages that vary cross-linguistically will be learnt as a result of exposure to some specific environment. On the other hand, properties that are shared by all languages might well be taken to be part of UG. Certain UG principles are fixed and invariant, e.g. that in all languages phrase structure is endocentric (see the discussion in Haegeman 1997). Others contain parameters, i.e. dimensions with respect to which languages vary. The former need not be acquired, while the latter need to be fixed. Language acquisition then primarily consists of fixing the parameters. The principles and parameters approach opened up the possibility of accounting for certain universal tendencies stated in e.g. Greenberg’s work.

Let me offer an illustration of both approaches with an example from the domain of word order typology. Word order typology is concerned with various permutations of S, V, and O. Basic order is generally taken to appear in unmarked constructions; this would exclude questions and cases where constituents have been focused by fronting, as in (2):

(1) I like Alexandra    SVO
(2) Alexandra, I really cannot stand    OSV
The basic word order should be the one most frequently appearing in a language. English is unambiguously SVO, Turkish and Japanese are SOV (cf. 3), Welsh is VSO, Malagasy is VOS. But some Australian languages such as Warlbiri permit all possible permutations (cf. 4):

(3) Ken-ga Naomi-wo miru    (Japanese)
    Ken-nom Naomi-acc sees
    ‘Ken sees Naomi’
(4) a. Ngarka-ngu ka wawirri panti-rni
       man-erg aux kangaroo spear-nonpast
    b. wawirri ka panti-rni ngarka-ngu
    c. panti-rni ka ngarka-ngu wawirri
       ‘The man is spearing the kangaroo’
Greenberg (1966) took a sample of about thirty languages from different families and different parts of the world. He observed that although there was considerable variation in word order, the variation was structured in the sense that certain properties varied together. Greenberg put forth several universals that capture this structured variation. These are of the type in (5):

(5) With overwhelmingly greater than chance frequency, languages with normal SOV order are postpositional.
(5) simply states that the majority of languages with normal SOV order are postpositional. In fact the order of V and O can be taken to be the central parameter, which determines the serialization of the language as modifier (operator)-head (operand) or head-modifier, as shown in (6).

(6) SOV  AN  GN  DetN  RelN  Post
    SVO  NA  NG  NDet  NRel  Prep

where O = object, V = verb, N = noun, A = adjective, G = genitive, Det = determiner (article), Rel = relative clause, Post = postpositions, Prep = prepositions. After Chomsky (1986), the difference between SVO and SOV languages such as English and Japanese was seen as the result of a difference in the values of a parameter of X′-theory. OV and VO patterns result from choosing different values for the head parameter that regulates the position of head categories in relation to their complements. This parameter can take two values, head first and head last, each accounting for one of the two patterns of word order, head-complement and complement-head, found in languages such as English and Japanese. In other words, OV arises when a language opts for ‘follows’ in (7b) and VO when a language opts for ‘precedes’.

(7) a. X follows its specifier
    b. X precedes its complement
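The effect of the choice in (7b) can be made concrete with a small sketch. The Python fragment below is an illustration only, under simplifying assumptions that are not part of the original discussion: the class `Phrase`, the function `linearize` and the flag `head_first` are hypothetical names, the subject is simply placed in the specifier of the verbal projection, and specifiers always come first, as in (7a).

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Phrase:
    """A toy phrase: a head with an optional complement and specifier."""
    head: str
    complement: Union["Phrase", str, None] = None
    specifier: Optional[str] = None


def linearize(node: Union[Phrase, str], head_first: bool) -> List[str]:
    """Spell out a phrase as a list of category labels.

    head_first=True  corresponds to 'precedes' in (7b);
    head_first=False corresponds to 'follows' in (7b).
    The specifier is always placed first (assumption, cf. (7a)).
    """
    if isinstance(node, str):
        return [node]
    comp = [] if node.complement is None else linearize(node.complement, head_first)
    # The single parameter value decides head-complement order in every phrase.
    core = [node.head] + comp if head_first else comp + [node.head]
    spec = [] if node.specifier is None else [node.specifier]
    return spec + core


# Hypothetical toy structures: a clause and an adpositional phrase.
clause = Phrase(head="V", complement="O", specifier="S")
adposition = Phrase(head="P", complement="N")

for value in (True, False):
    print(value, " ".join(linearize(clause, value)), "|",
          " ".join(linearize(adposition, value)))
```

Because one value is applied to every phrase, the sketch yields S V O together with P N when the head comes first, and S O V together with N P when it comes last. This is the kind of clustering that a single head parameter is meant to capture, although real languages show the correlation only as a tendency.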
The hypothesis here is that when learning a language the child has to determine the word order. Word order variation is seen as a parameter according to which languages vary. The parameter can have a limited set of values, i.e. either VO or OV. Sentences like the ones in (1) provide evidence to the English child that
in her language the verb precedes the object, while (3) provides evidence to the Japanese child that in her language the object precedes the verb.

The question that arises is whether parameters such as these can handle all the results of typological work. Consider again (5). By the same token, we take OP patterns to arise when a language opts for ‘follows’ in (7b) and PO when a language opts for ‘precedes’. On this view, one deep property, the choice of ‘precedes’ or ‘follows’ in (7b), is responsible for the derivation of several other properties. As (6) suggests, OV languages tend to have PossN and AN orders. Roberts (1997), however, points out that these conclusions are slightly tricky for the principles and parameters approach: possessors and adjectives are taken to occupy specifiers within the DP/NP. In principle they should be independent of the ordering patterns we find within X′.

On the view just presented, parameters are macroparameters, that is, they constitute a small set of binary-valued parameters with very far-reaching consequences (see Baker 1996 for a recent discussion in defense of this view). However, more recent work in generative syntax has re-examined the status of such parameters, and the components of grammar these operate in. For instance, Kayne (1994) proposes that the head parameter can be dispensed with. In order to derive OV patterns from a universal SVO order, one needs to make use of extensive movement operations. In this respect, Kayne’s proposal for a universal ordering merely shifts the burden from phrase structure to movement. Others pursue the idea that macroparameters can be dispensed with and that linguistic variation is best understood in terms of microparameters. Microparameters are local, low-level phenomena which can partially obscure macroparametric variation.
The view that parametric variation affects only the inflectional system of languages, as proposed in Borer (1983), has recently received a lot of attention among generative linguists. On this view, parameters are associated with individual lexical items. As a result, the structure-building apparatus is simple, unified and universal. Well-formedness conditions on phrase structure such as X′-theory are eliminated in favor of a system that incorporates the operations Merge and Move (Chomsky 1995). Fundamental claims about universal properties of language are built into the very architecture of the theory of UG in the form of e.g. primitives (features), combinatorial operations (Merge), the operation ‘Move’, interfaces with extra-linguistic systems (LF, PF), and so forth. Language variation is a reflex of the interaction of language-specific properties of lexical items, and in particular of morpho-lexical features, operating in the structure-building apparatus seeking to satisfy the requirements imposed by the interface (see Arad; Mateu & Rigau).
Several issues arise. First, how are the primitive notions of the structure-building apparatus, Merge, Move, Agree or Attract defined? How do these apply, and what are the properties they are sensitive to? Second, what features are relevant for Agree or trigger overt displacement? What does a typology of features look like? Third, how are we to understand variation in morpho-lexical features exactly? Finally, is it true that morpho-lexical variation dispenses with the need for structural variation? All these questions are taken up in the contributions to this volume.

In what follows I briefly turn to some of the issues concerning clause structure and universal properties that have preoccupied the recent literature. What has become a more or less standard view by now is to assume that functional morphemes occupy different syntactic slots in the structural representation of the (verbal and nominal) clause. A first issue concerns the types of features that we can assume to project in the functional domain. According to some authors, the types of features that are present in the syntactic terminals are those that are relevant for semantic interpretation at LF, as in (8) (see Halle & Marantz 1993; Chomsky 1995; Embick 1997). The underlying assumption is that there is a universal set of features, and each language draws from that pool. Languages will differ as to whether they will realize the feature at all, and whether they will realize it by means of an auxiliary, an affix, a particle and so on (see Arad; Romero for further discussion).

(8) Tense Neg Aspect Force Mood Number etc.
These functional heads further possess [-interpretable] features, e.g. Case, which need to be eliminated either via Agree or via overt displacement of an element (see Boeckx; López). On the other hand, other researchers claim that even features that merely have an effect on morpho-phonological well-formedness, such as word markers and theme vowels, project to the phrasal level. This is illustrated below with an example from Bernstein (1993). Spanish nouns may be marked for number. Plurality is manifested consistently with the suffix /-s/. Gender has no direct phonological realization. Moreover, Spanish nouns contain stems with or without word markers (WMs). Word markers are elements that occur in word-peripheral position, where they can be followed at most by the plural suffix (cf. (9); Harris 1991). The complete inventory of WMs includes any of the five underlying vowels of Spanish /a e i o u/:

(9) muchach-o-s ‘boys’
    muchach-a-s ‘girls’
    abuel-o-s ‘grandfathers’
    abuel-a-s ‘grandmothers’
According to Bernstein, languages that show morphological evidence for the category word marker project this category in the syntax. In fact Bernstein derives certain word order asymmetries within the DP by arguing that these are linked to the presence vs. absence of the functional category of WM. Specifically, the lack of noun movement within the DP in English is attributed to the fact that English, unlike Spanish, lacks WMs. On the other hand, Spanish nouns project the category WM, and as a result noun movement can be observed in the Spanish DP (see the contrast in (10)).

(10) a. the American girls
     b. *the girls American
     c. las muchachas americanas
     d. *las americanas muchachas
On such views, the overt morphological instantiation of some feature, i.e. rich number or gender or even tense inflectional morphology, is correlated with the presence of overt movement. This correlation between syntactic movement and the presence of overt morphology could be taken to be a principle of UG (see the discussion in and the contributions to Haegeman 1997). However, as Snyder (1995) points out, it is not clear whether there is a principled reason to expect the particular feature combinations distinguished by a given morphological paradigm to have direct consequences for language-specific properties of syntax. One could imagine that the implications are completely the reverse.

A related point of controversy concerns the number and the order of the functional projections associated with these features. Some authors take the number and the order of these projections to be universal; see Chomsky (1995), and Cinque (1999) for an elaborated CP-IP domain, shown in (12):

(11) [TP [Asp [VoiceP [V]]]]
(12) [Mood speech act [Mood evaluative [Mood evidential [Mood epistemic [T (Past) [T (Future) [(Mood irrealis) [Mood necessity [Mood possibility [Mood volitional [Mood obligation [Mood ability/permission [Asp habitual [Asp repetitive [Asp frequentative I [Asp celerative [T (anterior) [Asp terminative [Asp continuative [Asp perfect? [Asp retrospective [Asp proximative [Asp durative [(?) Asp generic/progressive [prospective [Asp sg Completive I [Asp Pl Completive I [Voice [Asp Celerative II [Asp sg Completive II [Asp repetitive II [Asp frequentative II
Others, however, take the order and number of functional projections to be subject to cross-linguistic variation. For instance, Ouhalla (1991) argues that the order of functional projections is parameterized. Iatridou (1990) and Thráinsson (1995) among others argue that one should assume only those functional projections one has evidence for in a given language. On this view, languages may vary as to whether they have a pre-Pollockian unsplit IP or an IP containing an Agreement phrase distinct from Tense (the so-called Split Infl Parameter). In particular, Bobaljik and Thráinsson (1998) argue that there is a series of straightforward consequences of assuming such a parameter, both for the syntax and for the morphology: there are more specifier positions in (13b) than in (13a), there are non-local relations among Infl-type heads in (13b), and there are more terminal nodes in (13b) than in (13a). On this view, one might expect two VP-external subject positions (the specifiers of AgrP and TP) and perhaps a VP-external DP object position in languages that have structure (13b), but not in those that have (13a).

(13) a. [IP [I′ I VP]]
     b. [AgrP [Agr′ Agr [TP [T′ T [AgrP [Agr′ Agr VP]]]]]]
The authors argue that this is the correct interpretation of multiple subject position and object shift phenomena in Icelandic, as opposed to Mainland Scandinavian and English. Hence Icelandic licenses Spec, TP as an intermediate subject position, allows for object shift, and exhibits transitive expletive constructions (TECs). On the other hand, languages such as English lack object shift (OS) and transitive expletive constructions, and do not license Spec, TP as a further subject position (14) (see López’s contribution).

(14) a. það klaruðu margar mys [VP alveg ostinn]    TEC
        there finished many mice completely the cheese
     b. það klaruðu margar mys ostinn [VP alveg ]    OS
        there finished many mice the cheese completely
        ‘Many mice completely finished the cheese’
     c. *there read a man a book
Furthermore, the authors develop an account of verb movement, according to which feature checking can take place both with movement and without movement (between a head and its complement). In (13a) feature checking is thus allowed without movement, since Infl and V stand in a head-complement relationship. Economy conditions will then imply that feature checking must not take place via movement. Therefore, there is no V-to-Infl movement in languages with an unsplit Infl. In (13b), however, only the lowest functional head is in a complement relation with the verb, hence verb movement seems required for feature checking with higher heads.

Evidently, the recent growth in crosslinguistic study opens new opportunities for extending the empirical base, confirming or challenging old generalizations and establishing new ones. At the same time, recent theoretical developments in syntax lead to important questions concerning the formal and/or substantive nature of universals in language, and the quest for the exact sources of variation between languages. The articles in this collection constitute a step towards this goal.
2. The papers

The contributions to this volume all attempt to identify universal properties of the language faculty, as well as the source of cross-linguistic variation. Some of the articles pay particular attention to the organization of the grammar, the type of operations that are effective, the role of features in determining variation, and primitive notions of phrase structure. Others show how structural differences capture semantic and morphological differences within a language and across languages, and how these are ultimately responsible for variation. In sum, the papers in this volume are concerned with both formal and substantive universals. I turn to a brief summary below.

Arad argues that there are three sources for language variation: the inventory of roots the language has, the features it has selected, and the way these features are bundled together. In particular, Arad, following Marantz (1997) and Halle and Marantz (1993), views the Lexicon of a language as a set of roots and (possibly bundled) features. There are three ways in which languages vary:
(i) Root inventory, i.e. signs or lexical pieces available in a language. Variation in roots includes also the variation in contextual meanings that roots can be assigned.
(ii) Subset of features selected from the universal pool, i.e. the features a language uses in building its lexicon. More precisely, does it have morphological case features, gender features, aspect (perfective/imperfective), etc.?
(iii) The ways in which features are bundled together. Languages may put together different sets of features into morphemes. This allows for a further source of lexical variation: the bundles of features available in a language.

Specifically, Arad argues that the functional head little v comes in two types. Generalizing the case of little v, she further argues that verbal heads in general are feature bundles. Languages may thus have different verbal morphemes, i.e. different feature bundles, and the same root can form different types of “verbs” when combining with verbal morphemes of different types. The empirical discussion of verb-creating morphemes concentrates on psychological verbs.

Boeckx’s contribution is concerned with a critical evaluation of the organization of the grammar, and the movement operations (Agree vs. Attract F(eature)) that could be effective. In particular he shows that Lasnik’s argumentation in favor of Attract F and against Agree can be dealt with within a One-cycle model, such as the one put forth in Chomsky (1999); hence Agree is superior to Attract F. A further question that Boeckx addresses is whether the One-cycle model is a notational variant of the Single output model, proposed in Bobaljik (1995), Groat and O’Neil (1995) among others. Boeckx outlines empirical arguments, based on the ‘how likely’ paradigm, that Lasnik’s analysis is inadequate. He offers an alternative that implicates Relativized Minimality in support of the One-cycle model and the Agree operation.
Fanselow and Ćavar’s contribution deals with XP-split constructions in German and Slavic languages. These splits have the following properties: they arise in the context of operator movement only. XP-splits can retain or invert the order of the elements found in the continuous counterpart. The latter type of split cannot show up with PPs; it is replaced by a construction that differs from XP-splits only in the presence of copies of the preposition in all slots where parts of the PP appear. Pull splits do not show up for all types of operator movement in German. The authors point out that movement analyses of such splits face serious problems with respect to syntactic islands and the phonetic shape of the parts of the split phrase (the “regeneration” problem discovered by van Riemsdijk). They argue that these problems render a simple movement analysis of the XP-split construction impossible. But these patterns are not amenable to a treatment in which both parts are base-generated in situ, either. The authors propose that the copy & deletion (CD) approach to movement (Chomsky 1995) offers a way to account for the paradox, if the CD is implemented in such a way that the deletion operation following the copying step of movement may affect both copies. The CD-approach offers a unified analysis for both DP and PP splits.

Frank, Hagstrom, and Vijay-Shanker’s article is concerned with the proper characterization of grammatical structures, and the primitive notion determining the properties of tree structures. They demonstrate that there is no way for grammar to refer to dominance. More specifically, dominance does not figure into grammatical explanation. The authors show that notions that have been defined in terms of dominance can be translated into statements about c-command. Two concepts are considered: roots and constituents. The authors show how these can be described in terms of c-command. The c-command based view on roots distinguishes between the categorial root and the site of cyclic attachment. Thus there is no need to refer to the dominance relation in order to determine the root of a tree structure. The c-command based definition of constituents further enables an understanding of cases where movement is blocked.

Kural’s paper deals with certain problems that emerge from the two-way classification of monadic verbs as unaccusative and unergative verbs, as proposed in e.g. Perlmutter (1978) and Burzio (1986). Kural argues that (a) the tests used to distinguish between the two classes do not all test the same structural properties, and (b) the discrepancies in the behavior of some monadic verbs across these tests can be explained naturally by positing a four-way classification rather than the traditional two-way classification. Kural points out that some of the tests, e.g. there-insertion, make reference to some VP-external position, while others, e.g. resultatives and cognate objects, relate to the VP-internal base position of the object.
The four classes are: verbs of being, change of state verbs, change of location verbs, and finally verbs of creation. Each class is associated with a distinct structure. The syntactic behavior of these verbs thereby becomes more transparent, and it is also shown that the classification of these verbs along syntactic lines fully coincides with their broad semantic properties such as denoting a change of location or creation of an abstract entity.

López proposes a new look at the operations Agree and Move. He argues that (i) the operation Agree is strictly local, and (ii) the operation Move is triggered by the instability created in the system by unvalued features (following similar ideas in Frampton & Gutmann 1999). The paper introduces the concept of co-valued features: two terms with unvalued features of the same type that are related by the operation Agree must have their features valued ‘in tandem’. On the conceptual side, these alterations allow us to revisit and eliminate some unnecessary assumptions. On the empirical side, this paper presents analyses of structural case assignment, typological differences in expletive constructions, quirky subjects and concord phenomena, amply demonstrating the empirical advantages of the approach.

Mateu and Rigau show that the ‘conflation processes’ involved in so-called ‘lexicalization patterns’ (see Talmy 1985) can receive an adequate explanation when translated into syntactic terms. Talmy claims that languages can be classified according to how semantic components like ‘figure’, ‘motion’, ‘path’, ‘manner’, or ‘cause’ are conflated into the verb. For example, conflation of motion with path is argued to be typical of Romance languages like Spanish, whereas conflation of motion with manner is typical of English. They argue that an analysis of these conflation processes in purely semantic terms like that put forward by Talmy (1985) can be descriptively adequate but cannot be regarded as explanatory at all, since the ‘parametric variation’ to be found in such processes can be shown to crucially involve morpho-syntax, not pure semantics. The paper claims that ‘parameterized variation’ is not to be confined to inflectional systems.

Romero argues that languages differ in the formal features they encode, i.e., there is no universal catalogue of shared formal features. Each language determines independently (although not arbitrarily) its own formal features from the universal set of features F available for the faculty of language. The presence of these features is available for the learner in the primary linguistic data. In this system each language may “formalize” different sets of features. Whether a language defines two (Spanish), three (Latin), seventeen (Swahili) or no (English) features for gender/class is an idiosyncratic property. The relations established by these features may receive different morphosyntactic representations that can also affect syntactic computation.
Romero’s hypothesis is supported by the Person Case Constraint (PCC), which shows up only if there are agreement features involved. Romero further explores the role of the EPP feature, and proposes that this feature is crucially involved in the PCC. Moreover, an analysis for object shift in Scandinavian languages is proposed and an explanation for the PCC based on the interactions between agreement and EPP. Specifically, Romero proposes that person agreement in languages with object agreement is tied to an EPP feature. Sabel’s article provides evidence for the constraint in (15): (15) Constraint on Adjunction Movement (CAM) Movement may not proceed via intermediate adjunction.
The CAM predicts that the only intermediate traces of a moved element are traces in specifier positions. The empirical evidence against intermediate adjunction is formulated with respect to different movement types, such as wh-movement, empty-operator movement, A-movement, extraposition, quantifier raising, scrambling, and head movement. Furthermore, data that were traditionally taken as evidence for intermediate adjunction are explained as involving movement via a second specifier position. For example, the analysis of scrambling in German and Japanese rests on the assumption that Japanese allows for multiple A-specifiers whereas German does not. On the other hand, multiple (CP) A′-specifiers seem to be the unmarked case across languages, as argued in connection with the proposed analysis of extraction out of wh-islands.
Acknowledgements
I wish to thank the Zentrum für Allgemeine Sprachwissenschaft in Berlin, the Linguistics Department of the University of Potsdam, the Dutch graduate school OTS, and the DFG for their support in organizing GLOW in Berlin in March 1999. I thank the authors of this volume for their co-operation, as well as the participants of the conference. I am grateful to Werner Abraham and Kees Vaes for their assistance in the preparation of this volume.
Notes
The papers by Fanselow & Cavar and Boeckx were not presented at that event.
References
Baker, M. (1996). The Polysynthesis Parameter. Oxford: Oxford University Press.
Bernstein, J. (1993). Topics in the Syntax of Nominal Structure across Romance. Ph.D. Diss., CUNY.
Bobaljik, J. (1995). Morphosyntax: The Syntax of Verbal Inflection. Ph.D. Diss., MIT.
Bobaljik, J., & H. Thráinsson (1998). Two heads aren’t always better than one. Syntax, 1, 37–71.
Borer, H. (1983). Parametric Syntax: Case studies in Semitic and Romance Languages. Dordrecht: Foris.
Burzio, L. (1986). Italian Syntax: A Government and Binding Approach. Kluwer.
Chomsky, N. (1986). Barriers. Cambridge, MA: MIT Press.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. (1999). Derivation by Phase. MIT Occasional Papers in Linguistics.
Cinque, G. (1999). Adverbs and Functional Heads. Oxford: Oxford University Press.
Comrie, B. (1981). Language Universals and Linguistic Typology. Chicago: University of Chicago Press.
Croft, W. (1991). Typology and Universals. Cambridge University Press.
Embick, D. (1997). Voice and its interfaces with syntax. Ph.D. Diss., University of Pennsylvania.
Frampton, J., & S. Gutmann (1999). Cyclic Computation. Syntax, 2.1, 1–27.
Greenberg, J. (1966). Some universals of grammar with particular reference to the order of meaningful elements. In J. Greenberg (Ed.), Universals of Grammar (pp. 73–113). Cambridge, MA: MIT Press.
Groat, E., & J. O’Neil (1995). Spell-Out at the LF Interface. In W. Abraham et al. (Eds.), Minimal Ideas (pp. 113–139). Amsterdam: John Benjamins.
Haegeman, L. (1997). Introduction. In L. Haegeman (Ed.), The New Comparative Syntax (pp. 1–32). London: Longman.
Halle, M., & A. Marantz (1993). Distributed Morphology and the Pieces of Inflection. In K. Hale & S. J. Keyser (Eds.), The View from Building 20 (pp. 111–176). Cambridge, MA: MIT Press.
Harris, J. W. (1991). The Exponence of Gender in Spanish. Linguistic Inquiry, 22, 27–62.
Iatridou, S. (1990). About AgrP. Linguistic Inquiry, 21, 551–576.
Kayne, R. (1994). The Antisymmetry of Syntax. Cambridge, MA: MIT Press.
Marantz, A. (1997). No escape from syntax: Don’t try a morphological analysis in the privacy of your own lexicon. Ms., MIT.
Ouhalla, J. (1991). Functional Categories and Parametric Variation. London: Routledge.
Perlmutter, D. (1978). Impersonal passives and the Unaccusative Hypothesis. Proceedings of the Fourth Annual Meeting of the Berkeley Linguistics Society, 157–189.
van Riemsdijk, H. (1989). Movement and Regeneration. In P. Benincà (Ed.), Dialect Variation and the Theory of Grammar: Proceedings of the GLOW Workshop on Linguistic Theory and Dialect Variation (pp. 105–136). Dordrecht: Foris.
Roberts, I. (1997). Comparative Syntax. London: Arnold.
Snyder, W. (1995). Language Acquisition and Language variation: The role of Morphology. Ph.D. Diss., MIT.
Talmy, L. (1985). Lexicalization patterns: Semantic structure in lexical forms. In T. Shopen (Ed.), Language Typology and syntactic description, Vol. III: Grammatical categories and the lexicon (pp. 57–149). Cambridge: Cambridge University Press.
Thráinsson, H. (1995). On the non-universality of functional projections. Ms., Harvard University/University of Iceland.
Universal features and language-particular morphemes
Maya Arad
University of Geneva
Introduction: Roots and morphemes
A widely accepted view assigns variation among languages to morphology (cf. Borer 1984; Chomsky 1995). In this paper I suggest a specific formulation of this view. I argue that there are three sources for language variation: the inventory of roots a language has, the features it has selected out of a universal inventory, and the way these features are bundled together. Concentrating on feature bundling, I argue that category features such as “v” are not primitives, but rather are feature bundles. Languages may bundle features in different ways, thus having different “v” heads.1
Consider, first, the main participants of word formation mentioned above – roots and features. Following work in the framework of Distributed Morphology (Halle & Marantz (1993) and subsequent work), I assume here that the Lexicon of a particular language consists of roots and features. Roots are the lexical kernels that the language has. They are devoid of all functional material and are category neutral. When combined with category features such as n or v they become actual nouns or verbs:
(1) √fish:
    a. [v √fish]   to fish (v)
    b. [n √fish]   a fish (n)
The root √fish is the lexical and phonological core shared by the noun a fish and the verb to fish. Speakers thus have access to words – complex entities built of roots and features – but not to the roots themselves. Perhaps for that reason,
the idea that in all languages the (category neutral) root is distinct from the word-creating morphology may not seem straightforward. Indeed in English, where word-creating morphology is non-obligatory, the root is not morphologically distinct from the nominal or verbal morpheme. However, other languages (e.g. Romance) always have some overt verbal morphology on the verb stem. In Semitic languages, going one step further, the root is most easily distinguishable from the word-creating morphology: on their own, roots – units of three consonants – are neither pronounceable nor members of any grammatical category. Only when put into word-creating morphology (known as patterns) do roots become nouns, verbs or adjectives. Consider the following example from Hebrew:
(2) √lmd → lamad (learn, v, pattern CaCaC)
         → talmid (student, n, pattern taCCiC)
         → limudi (pertains to learning, adj., pattern CiCCuCi)
Consider, next, the second participant in word formation – features. I will rely here on two complementary assumptions:
1. UG makes available a universal set of features (Chomsky 1998).
2. Languages may select a subset of these features. In particular, a language may bundle together some of these features into “morphemes”, i.e. feature bundles (cf. Marantz 1999).
While the first claim is a conceptual necessity, the second claim could be subject to debate: do all languages have all possible features? I will assume here that languages select a subset of the feature inventory offered by UG. For example, Russian has instrumental case features and grammatical Gender features, while English does not. Similarly, English, but not Russian, has progressive marking for verbs. So here is an important source for language variation: which morpho-syntactic features has the language selected out of the universal pool? Suppose, furthermore, that languages may bundle together different properties.
If this is the case, then here is another source for variation: two languages with the same set of features could bundle these features in different ways, thus ending up with a different set of morphemes. Following the essence of Borer’s (1984) proposal, I argue that language variation is restricted to the lexical and morphological component, and has three sources: first, languages differ as to their inventory of roots (namely, what “signs” a language has). Second, languages may select different features out of the universal pool made available by UG. Finally, different languages may bundle features in different ways, thus having different morphemes. I will illustrate this with a case study of verb-creating morphemes. My starting point is a specific verbal head, “little v” (cf. Chomsky 1995, 1998). Following Marantz (1999), I claim that this head bundles together two sets of features: semantic content (giving the event an agentive interpretation) and transitivity properties (case checking properties). These two features are separable. They may be bundled with other features, into different verbal morphemes. Generalizing the case of little v, I will argue that verbal heads in general are feature bundles. Languages may thus have different verbal morphemes, i.e. different feature bundles, and the same root can form different types of “verbs” when combining with verbal morphemes of different types.
In my discussion of verb-creating morphemes I will concentrate on psychological verbs. This is because a sub-group of these verbs alternates between an agentive and a non-agentive reading, which enables us to see the interaction of a single root with two different verbal morphemes. Furthermore, I argue that in many cases both types of psych verbs – Subject Experiencer and Object Experiencer – are formed from the same root, combined with different verbal morphemes. So, because of their special properties, these verbs make a good case study of verbs and feature bundles. But before getting to psych verbs, a few words on the best known verbal morpheme, little v.
Little v
Little v (Chomsky 1995, 1998; also Collins’s (1997) Transitivity Phrase) is commonly taken to be the upper head in the VP-shell, and is often referred to as a “transitivity head,” which heads transitive constructions. This head introduces an external argument in its specifier and enters into a relation with the object (“Agree,” or checks structural Case):2
(3) [vP external argument [v v [VP V Object]]]
The motivation for postulating v is twofold: first, it captures the correlation between the presence of an external argument and (structural) object case (Burzio’s 1986 generalization). Second, by having the external argument introduced by a functional head we capture the observation that this argument is not an argument of the verb. Structurally, it is external to the verb phrase. Semantically, its interpretation is given compositionally by the whole verb phrase (Marantz 1984; Kratzer 1996).
Many questions arise at this point. For example, do agentive verbs with oblique complements (e.g. hit at the wall) have a v head too? Another important question is whether v necessarily entails an agent, or whether it could introduce any kind of external argument. Stative verbs, such as know algebra, have been shown to have an external argument (cf. Belletti & Rizzi 1988), yet the semantic role of this external argument is certainly not that of an agent. Finally, if verbal morphemes are bundles of features, then what types of bundles, what types of v heads apart from “little v”, exist? Let us develop this last question. Note that in Hebrew different types of verbs are formed from the same root:3
(4) a. √yšv + CaCaC = yašav (be seated, stative)4
    b. √yšv + hiCCiC = hošiv (make someone sit down, causative)5
    c. √yšv + hitCaCeC = hityašev (sit down, inchoative)
(5) a. √rgz + CaCaC = ragaz (be angry, stative)
    b. √rgz + hiCCiC = hirgiz (anger, causative)
    c. √rgz + hitCaCeC = hitragez (get angry, inchoative)
    (Hebrew)
A single root can form different types of verbs: stative, causative or inchoative. These verbs share something (in (4), for example, they all refer to an event of sitting), but they also differ: one refers to a stative event, being in a state of sitting, another to a causative event – making someone sit down, and yet another to an inchoative event of change of state: sit down. How can one root form several “verbs”? My hypothesis is that the root is combined in each case with a verbal morpheme of a different type or different semantic “flavor”, thus forming different “verbs”:
(6) a. √root + Va = stative.
    b. √root + Vb = causative.
    c. √root + Vc = inchoative.
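The root-and-pattern combinations in (4) to (6) can be sketched procedurally. The Python fragment below is only an illustration, not part of the analysis: the function name `interdigitate`, the dictionary `PATTERNS`, and the convention that each `C` in a pattern is a slot for the next root consonant are assumptions introduced here. The sketch deliberately ignores phonological adjustments, so it would not derive hošiv from √yšv + hiCCiC.

```python
def interdigitate(root: str, pattern: str) -> str:
    """Fill each 'C' slot of a pattern with the next consonant of the root."""
    slots = pattern.count("C")
    if slots != len(root):
        raise ValueError(
            f"pattern {pattern!r} has {slots} slots, root has {len(root)} consonants"
        )
    consonants = iter(root)
    # Every character other than 'C' belongs to the pattern and is copied as is.
    return "".join(next(consonants) if ch == "C" else ch for ch in pattern)


# Hypothetical labels for the three verbal patterns shown in (4) and (5).
PATTERNS = {"stative": "CaCaC", "causative": "hiCCiC", "inchoative": "hitCaCeC"}

for flavor, pattern in PATTERNS.items():
    print(flavor, interdigitate("rgz", pattern))
```

The point of the sketch is that the root contributes only the consonants, while the pattern, standing in for the verbal morpheme, determines which "verb" results.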
The hypothesis that verbal heads have different semantic flavors raises (at least) two questions: 1. What types of verbal morphemes exist in natural language? 2. What are the contents of verbal morphemes?
In what follows, I will examine these questions through the specific case of a class of verbs that alternate between an agentive and a non-agentive reading: Object Experiencer verbs. I will argue that the two readings are achieved by combining the same root with different verbal morphemes, one agentive and one stative.
Object Experiencer verbs
Psych verbs, verbs denoting mental states, fall into two main groups (for some references cf. Ruwet 1972; Belletti & Rizzi 1988; Grimshaw 1990; Pesetsky 1995). In the first group the experiencer is realized as the subject (Subject Experiencer verbs), in the other it is realized as the object (Object Experiencer verbs):
(7) a. Subj(ect)Exp(eriencer) verbs: fear, love, admire, hate.
    b. Obj(ect)Exp(eriencer) verbs: frighten, annoy, surprise, bother.
The starting point of my discussion is Belletti and Rizzi’s (1988) seminal work on Italian psych verbs. Belletti and Rizzi (henceforth B&R) show that although ObjExp verbs seem identical to standard transitive verbs, they differ from them syntactically in a number of ways. Qualifying B&R’s typology further, I will argue the following:
1. ObjExp verbs such as frighten can have a stative reading or an agentive reading. This ambiguity is pervasive throughout the class of ObjExp verbs: their subjects can be interpreted as either agentive or non-agentive.6
2. The two readings of ObjExp verbs differ syntactically: only the stative reading has the syntactic effects noted by B&R. On the agentive reading, the verb behaves like a standard transitive verb.
3. The stative and the agentive readings of ObjExp verbs are formed from the same root (e.g. √fright). The difference between them arises from the type of verbal morpheme with which the root is combined. On the agentive reading the root is combined with (standard) little v. On the stative reading it is combined with a verbal head which is stative and causative (in a sense to be made explicit below). I call this head “stative little v”.
The stative and non-stative reading of ObjExp verbs
Consider, first, the two possible readings of ObjExp verbs, agentive and stative. These readings differ with respect to two properties: whether there is an agent who purposely aims to bring about a mental state, and whether there is a change of state in the experiencer. The agentive reading has an intentional agent and change of state in the experiencer:
(8) Anna frightened Laura deliberately / in order to make her go away.
On this reading, the agent – in this case Anna – intends to bring about a state of fright in the experiencer. The experiencer, Laura, undergoes a change of state and becomes frightened. In contrast, the stative reading has neither an agent nor a change of mental state in the object.7 For example, in (9a) below there is no single point in time in which Laura turns from “unconcerned” into “concerned.” Rather, perception of the problem by Laura triggers a concomitant state of concern. When she happens to think of the problem, she experiences a “spell” of concern (cf. Pylkkänen 1997):
(9) a. This problem concerned Laura.
    b. Anna/Anna’s behavior frightens Laura.
    c. Blood sausage disgusts Laura.8
Several points have to be noted about this reading. First, since there is no agent on the stative reading, the triggering of the mental state by the stimulus is neither volitional nor under the stimulus’ control. Even when the subject is human, he or she does not act on purpose: in (9b) it is something about Anna that frightens Laura, rather than Anna’s attempts to frighten. Second, there is no change of state in the experiencer.9 The stative reading only asserts that the experiencer is in a specific mental state as long as she perceives the stimulus (or has it on her mind). Another difference concerns the relationship between the entity that brings about the mental state and the mental state itself. On the agentive reading the agent only brings about the resulting state, which holds independently, and is not part of the event of mental state (cf. Pesetsky 1995). On the stative reading the stimulus has to co-occur with the mental state in order for it to hold: the experiencer is in a specific mental state for as long as she perceives the stimulus (cf. Pylkkänen 1997). The stimulus is thus an inherent part of the event of mental state. The schema in (10) describes the difference between the two readings:
(10) a. Agentive reading: agent brings about a mental state
        action:        - - - - - - - - - - - - - - - ->
        mental state:                                   ..................... (indefinite)
     b. Stative reading: stimulus triggers a concomitant mental state
        perception of stimulus (problem, blood sausage): ____________ stop
        mental state (concern, disgust, etc.):           ............ stop
Finally, note that both readings are causative.10 Following Pylkkänen (1997), I assume that causation can be active or stative. Active causation involves an agent, who acts and brings about a change of state, while stative causation involves a stative causer (or a stimulus) which triggers a state (whose existence is co-extensive with that of the stimulus).
The syntactic realization of the stative and the agentive readings
So far we have been concerned with the semantic properties of ObjExp verbs. However, these verbs are also known for exhibiting some syntactic peculiarities. In this section I argue that such peculiarities only occur with the stative reading of ObjExp verbs. When the subject is interpreted as agentive, ObjExp verbs lose their “psych” properties. I will illustrate this with the properties pointed out by Belletti and Rizzi (1988) for Italian – reflexive clitics, causativization and extraction from the object – but this observation holds across a number of languages (English, Hebrew, Spanish, Greek – see Arad (1998) for a detailed account). Consider reflexivization first. As pointed out by B&R, ObjExp verbs cannot appear with a reflexive clitic si in Italian (11a). However, (11a) is crucially interpreted as a stative reading of frighten. If an agentive reading is forced (11b) the sentence is fine:11
(11) a. ??Gianni si spaventa.
          Gianni self frightens
     b. Gli studenti si spaventano prima degli esami per indursi a studiare di più
        the students self frighten before the exams to urge-refl to study more
        ‘The students frighten themselves before exams in order to urge themselves to study harder.’
Next, B&R show that embedding an ObjExp verb under the causative construction in Italian is ungrammatical. However, if a minimal pair of stative and agentive readings is constructed, we can see that this construction is ungrammatical with the stative reading (12a), but is allowed on the agentive one (12b):
(12) a. *Gianni/questo gli ha fatto preoccupare Maria.
         G./this him-dat made worry M.
         ‘This/Gianni made him frighten Maria.’
     b. Gianni gli ha fatto spaventare Maria per farla lavorare di più
        G. him-dat made frighten M. to make her work more
        ‘Gianni made him frighten Maria to make her work harder.’
Finally, B&R show that extraction from the object is ungrammatical with ObjExp verbs (13a). Again, on the agentive reading, extraction is grammatical (13b):
(13) a. *La ragazza di cui Gianni spaventa i genitori
         the girl of which Gianni frightens the parents
     b. La ragazza di cui Gianni spaventa i genitori perché gliela facessero sposare.
        the girl of which Gianni frightens the parents for him-dat-her-acc make (3rd pl) marry
        ‘The girl whose parents G. frightens so that they will allow him to marry her.’
Consider further evidence for the difference in syntactic representation of the two readings: object case marking in Spanish. Object pronouns in Spanish take dative case on the stative reading, and accusative on the agentive:
(14) a. el niño/la musica le molestó.   (stative)
        the boy/the music her-dat bothered
        ‘The boy/the music bothered her.’
     b. el niño/*la musica la molestó.   (agentive)
        the boy/the music her-acc bothered
        ‘The boy/*the music bothered her.’
     c. Lo hice para molestarla/lo (*le).
        it I did in order to bother her/him (acc) (dat)
        ‘I did it in order to bother him/her.’
Speakers interpret (14a) as non-agentive: something about the boy or the music triggered a mental state of bothering. (14b), on the other hand, is interpreted as unambiguously agentive: the boy intended to bother (and “the music” cannot be an agentive subject). Finally, in unambiguously agentive contexts (14c), only the accusative case is allowed. To conclude: the data above suggest that on their agentive reading, ObjExp verbs behave like standard active or agentive verbs: they allow reflexivization, causativization and extraction. On their stative reading they exhibit a special behavior.
The case for “stative little v”
Consider the similarities and the differences between the two readings of ObjExp verbs. The similarities are evident: first, the two readings share their morphology: both contain the same lexical kernel, or root (e.g. fright). Second, the readings bear some semantic similarity: both refer to an event of mental state, and, furthermore, both describe a causation of a mental state. Now consider the differences between the readings. The type of event they describe is different in each case: bringing about a change of state (agentive) or triggering a concomitant state (stative). They also differ with respect to the semantic role of the causing element: an agent on the agentive reading, a stimulus (or stative causer) on the stative reading:
(15) a. frighten (agentive)     b. frighten (stative)
        /fright/                   /fright/
        mental state               mental state
        causative                  causative
        agentive                   stative
        “agent”                    “stimulus”
The similarities between the two readings seem related to the root they share. The differences between them – in particular having an agent or a stimulus – seem related to their different external arguments. External arguments are assumed in current theory to be assigned by some functional head (cf. Kratzer’s 1996 Voice head). Suppose that heads introducing external arguments could belong to more than one type. Specifically, in the case of ObjExp verbs, this head could be agentive or stative. We can then explain the two readings of ObjExp verbs by assuming that the root √fright can combine with two types of verbal heads, which introduce an agentive or a stative external argument. As a result, the same root forms two types of ObjExp verbs: agentive or stative. Let me elaborate now on this hypothesis. What is shared by the two readings is precisely the root, for example √fright. This is the smallest kernel, referring to some event of fright, which both readings share. The root √fright forms both the stative and agentive readings. The identity of the verbal head with which the
root is combined determines the syntactic properties of the verb. Schematically, I am arguing something like the following:
    √fright + Va = /frighten/ (agentive)
    √fright + Vb = /frighten/ (stative)
The agentive reading behaves like a standard active verb. I assume it has the same transitive structure, headed by (the well-known) little v:
(16) [vP [NP Agent] [v v [√P √fright [NP Experiencer]]]]
The argument at the specifier of v is interpreted as an agent by virtue of its structural position. The object of the verb is interpreted as an experiencer because of the type of the root involved, a root denoting a mental state. Note that “experiencer” is thus only a convenient label, not a syntactically relevant notion.12 The head v combines with the root phrase, √P. Below v there can be a predicate of change of state (as in the case of frighten) or other types of predicates. For example, psych predicates can have the form of a change of possession predicate (cf. give fright) or change of location (e.g. French mettre en colère, literally “put into anger”).
Consider now the stative reading. I suggest that on this reading the root is merged with a verbal head that I will call “stative little v”. This head has the following properties: first, it gives the event the interpretation of stative causation (unlike standard little v, which is active). Second, the argument in its specifier is interpreted as a stative causer. Finally, its object is marked with dative case (cf. Spanish) rather than accusative. Put more explicitly, I suggest that “little v” comes in two flavors, active and stative:
(17) a. [v1 agent [v1 v1 …ACC]]
     b. [v2 stative causer [v2 v2 …DAT]]
Stative little v is responsible for the stative reading of psych verbs (a stative causative construction):
(18) [v2 stative causer [v2 v2 [√P √fright NP]]]
Both active and stative v introduce an external argument in their specifier and check object case. However, they differ with respect to their semantic content: one gives the event an interpretation of an agentive or active event, while the other gives the event an interpretation of stative causation. The arguments in their specifiers are interpreted accordingly, as an agent or as a stative, nonagentive causer. The heads also differ with respect to the morphological spell out of the object-case they assign (dative vs. accusative). In the next section I will further examine the similarities and differences between these two heads.
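The contrast between the two flavors in (17) and (18) can be restated as a small data structure. The Python sketch below is only an illustration of the proposal as described in the text: the class name `LittleV`, its field names, and the helper `merge_root` are hypothetical, and the case labels simply record the accusative/dative spell-out discussed for Spanish.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LittleV:
    label: str           # "v1" or "v2", as in (17)
    causation: str       # "active" or "stative"
    specifier_role: str  # interpretation of the argument in the specifier
    object_case: str     # morphological spell-out of the object case


ACTIVE_V = LittleV("v1", "active", "agent", "ACC")
STATIVE_V = LittleV("v2", "stative", "stative causer", "DAT")


def merge_root(root: str, head: LittleV) -> dict:
    """Combine a category-neutral root with a verbal head.

    The reading, the role of the external argument and the object case
    all come from the head; the root only supplies the lexical kernel.
    """
    return {
        "root": root,
        "head": head.label,
        "reading": "agentive" if head.causation == "active" else "stative",
        "external_argument": head.specifier_role,
        "object_case": head.object_case,
    }


for head in (ACTIVE_V, STATIVE_V):
    print(merge_root("fright", head))
```

Under these assumptions the same root yields two verbs that share what the root supplies and differ only in what the head supplies.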
Towards a typology of “little v” properties
Two sets of v properties
I have argued above that there are two flavors of “little v”: two heads which share some features, but differ with respect to other features. This, I suggest, is only a special case of a more general claim that has been made recently: category features (e.g. v, D) are not primitives, but stand for bundles of features (Marantz 1997; Chomsky 1998). UG makes available a universal set of “properties” or features, and the “Lexicon” of a particular language contains a subset of these features (Chomsky 1998). Furthermore, subsets of this subset of features can be bundled together pre-syntactically (“lexical items”, Chomsky 1998, “morphemes” in Distributed Morphology). This raises a host of questions: what features exist in the universal pool? Are there any restrictions on the way in which they are bundled together? Do different languages bundle features in different ways? What are the features that are bundled together under the title “v”? Concentrating on the properties of the category “V”, I will discuss these questions in view of the ObjExp data above.
Consider, first, standard (active) little v. Following Marantz (1999), I argue that it has (at least) three types of properties:
– Verbalizing property (combines with a root to form a verb).
– Semantic content.
– Transitivity (formal properties: introducing external argument, agree with object).
Consider the first property. All verbal morphemes create verbal environments, as part of their defining properties: they make roots into verbs, rather than nouns or adjectives. I believe that the “verbalizing” property could be reduced to a formal requirement of merging with T (or, possibly, Asp), but I leave the issue open here (other properties of verbs, such as person features, could also be associated with T). The head little v also has semantic content. Possible characterizations of this content are agentivity (gives the event agentive interpretation, cf. Kratzer’s (1996) VoiceP), causation (the event is interpreted as a causative event, cf. Harley’s 1995 CAUS) or some aspectual content, such as “process,” giving the event a durative interpretation (cf. Borer’s 1998 AspP head). Finally, consider the transitivity property of v. It has an external argument in its specifier and forms a relation with the object (“Agree” – Chomsky 1998). Taken together, these properties capture the essence of Burzio’s (1986) generalization, that is, the correlation or the dependency between an external argument and structural object case.13
Suppose now that these three features – verbalizing, semantic content, and transitivity – are features of the universal set made available by UG. Let these features be the ingredients that are used for building verbal morphemes (of which little v is just an example). The verbalizing property distinguishes the verbalized root from a noun. Semantic content has several values or “flavors”: agentive, stative etc. Transitivity is a general property – any head that introduces an external argument in its specifier and agrees with the object has the transitivity property (I will discuss this in detail below). Languages can bundle together any of these features, thus forming verbal heads of different types. In other words, different verbal morphemes are different bundlings of the features available in the language.
Possible examples of such feature bundles are:
bundle 1        bundle 2    bundle 3    bundle 4      bundle 5
“verby”         “verby”     “verby”     “verby”       “verby”
agentive        agentive    stative     inchoative    causative
transitivity                                          transitivity
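Feature bundles of this kind can be modelled as plain sets, which makes it easy to see on which features two verbal heads differ. The Python sketch below is only illustrative: the feature names follow the text, while the helper names `bundle` and `difference` and the variable names are hypothetical and stand for no particular morpheme of any language.

```python
# Feature names follow the text; function and variable names are hypothetical.
VERBY, TRANSITIVITY = "verby", "transitivity"
FLAVORS = {"agentive", "stative", "inchoative", "causative"}


def bundle(flavor: str, transitive: bool = False) -> frozenset:
    """Build a verbal head as a set: 'verby' + one flavor (+ transitivity)."""
    if flavor not in FLAVORS:
        raise ValueError(f"unknown semantic flavor: {flavor}")
    features = {VERBY, flavor}
    if transitive:
        features.add(TRANSITIVITY)
    return frozenset(features)


def difference(a: frozenset, b: frozenset) -> frozenset:
    """Features on which two bundles differ (symmetric difference)."""
    return a ^ b


agentive_transitive = bundle("agentive", transitive=True)
agentive_plain = bundle("agentive")
causative_transitive = bundle("causative", transitive=True)

# Same semantic content, different transitivity.
print(sorted(difference(agentive_transitive, agentive_plain)))
# Same transitivity, different semantic content.
print(sorted(difference(agentive_transitive, causative_transitive)))
```

Representing heads as sets rather than as atomic category labels is the design choice that matters here: it lets two heads overlap partially, which an atomic "v" could not express.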
What counts as a possible “V” bundle? I assume that all v heads have the verby property: all verbal morphemes make roots into verbs. All heads also have some semantic content: heads with no content do not enter into the computational system. However, not all heads have the transitivity property, because not all verbs in language have an external argument and object case (cf. unaccusatives). We could, in principle, find heads that share only a subset of their properties. For example, heads sharing their semantic content but not transitivity, or heads sharing transitivity but not semantic content. In fact, this is exactly the case, as I will illustrate through several salient cases.
Sharing semantic content
Consider the contrast between transitive and reflexive verbs in Romance languages:
(19) a. Gianni lava Maria      (Gianni washes Maria)
     b. Gianni si lava         (Gianni washes himself)
     (Italian)
A reflexive verb does not agree with its object (no object case is assigned). Also, at least according to some analyses of Romance reflexives (e.g. Marantz 1984), si verbs have no external argument. However, transitive and reflexive wash bear many similarities: both refer to a washing event and both have an agent, a washer, even if it is not syntactically projected. I assume that transitives and reflexives are made from the same root, combined with different verbal morphemes. These morphemes share their semantic content (in this case agentive), but differ precisely with respect to their transitivity: one has an external argument and checks object case, while the other does not. Passives are a similar case. One of the best known generalizations about passives is that they do not have a (syntactically projected) external argument and that they do not check object case: (20) a. He was hit b. The book was read
However, another well-established claim about passives is that they share many characteristics of their active counterparts. In particular, passives retain their agentive interpretation: in (20) there was an agent who did the hitting or read the book, even if it is not projected in the syntax. I assume that active and passive verbs are formed from the same root (note that they do share their basic verbal morphology, as in eat and eaten). The root may be combined with two morphemes, both sharing the property of being verbal and having some semantic content. However, one also has the transitivity property (active) while the other lacks it (passive). Passive morphology may thus be only an overt manifestation of the lack of external argument and object case (in many languages passive morphology is similar to that of reflexives or unaccusatives, all three of them lacking object case).
Sharing transitivity
Consider now the opposite case: two heads sharing the transitivity property but not their semantic content. This is the case of active and stative little v, which was presented above in relation to ObjExp verbs. The two heads do not share their semantic content – one is active while the other is stative. However, both have the transitivity property: they merge with an external argument (an agent in one case, a stimulus in the other) and check object case (spelled out as accusative in one case, dative in the other). It is possible, thus, for two verbal morphemes to share transitivity but not semantic content. I take transitivity to be a general property: any head with a filled specifier and a relation (agree) with a lower element has the transitivity property.
(21) [v filled spec [v v relation with a lower element]]
Another head that has the transitivity property is the applicative element in Double Object Constructions (DOC). According to some analyses (Marantz 1993; McGinnis 1998), DOC involve an applicative head which introduces a benefactive argument and checks the case of the lower object:

(22) [v agent [v v [vapplicative benefactive [vapplicative vapplicative [√P √give theme]]]]]
     John gave Mary an apple
The applicative head does not share the semantic content of either active or stative little v, but it does share the transitivity property with them. It merges with an external argument (benefactive; cf. Marantz (1993), where it is argued that the benefactive argument is external to the inner event of change of state) and checks the case of the lower object. The examples discussed here represent the kind of morphemes that can be made from the ingredients described above: verbalizing property, semantic content and transitivity. The features selected from the lexicon of a particular language can be bundled in different ways, thus creating verbal heads of different types. Active little v, stative little v and heads creating passives are examples of such morphemes. Crucially, as argued above, morphemes may differ with respect to one feature only (transitivity or semantic content). In the next section I will come back to the topic of Italian psych verbs, and consider another case of verbal heads that differ with respect to some of their properties.

The case of Italian piacere and preoccupare verbs

It was argued above that B&R’s effects hold only for the stative reading of ObjExp verbs. The question remains: what is it about the stative reading that triggers these effects? I suggest that this may be related to the type of (language-particular) morpheme that creates ObjExp verbs in Italian. The crucial facts are related to a sub-group of Italian ObjExp verbs, the piacere group, which will be discussed below. As shown by B&R, besides the worry-type (preoccupare) class, Italian has verbs of the please-type (piacere). Apart from piacere, this small group includes scocciare (displease) and interessare (interest):

(23) a. Questo preoccupa Gianni.
        this worries G.
     b. Questo piace a Gianni.
        this pleases to G.
Following B&R, I argue that piacere verbs are similar in every respect to preoccupare verbs, except for their object case, which is dative (but cf. Pesetsky 1995). Interestingly, piacere verbs do not exhibit the same syntactic behavior as preoccupare verbs. They can form reflexives and be embedded under a causative verb:

(24) a. ??Gianni si preoccupa.
        G. self worries.
     b. Gianni si piace.
        G. self please
        ‘G. likes himself/thinks highly of himself.’14
(25) a. *Gianni ci ha fatto preoccupare Maria.
        Gianni us-dat has made worry Maria
        ‘Gianni made us worry Maria.’
     b. Gianni ci ha fatto piacere il gelato.
        Gianni us-dat has made please ice cream
        ‘Gianni made us like ice cream.’
So the situation in Italian is as follows: preoccupare verbs on their agentive reading do not exhibit psych effects. Piacere verbs do not exhibit psych effects. Only preoccupare verbs on their stative reading exhibit psych effects. Psych effects exist only when accusative case is assigned to the experiencer in the absence of an agent. My assumption is that this is related to the properties of the verbal head that is involved in forming each of these verb groups. Active v merges with the root and forms the agentive reading of preoccupare verbs. Stative v forms dative ObjExp verbs like piacere:

(26) a. [v1 agent [v1 v1 [ …ACC ]]]             preoccupare (agentive)
     b. [v2 stative causer [v2 v2 [ …DAT ]]]    piacere
Unlike B&R, I assume that the dative case on piacere verbs is not inherent, but is assigned by stative little v. Note, for example, that it can be absorbed with a reflexive clitic, as in (24b).15 Thus, both agentive preoccupare and piacere have structural case on their object. I assume that the stative reading of preoccupare verbs is formed by combining the root with a “defective” v head: this morpheme shares the semantic property of stative little v (giving the event a stative causative interpretation), but not its transitivity property (i.e. does not agree with the object):

(stative) preoccupare: verby, stative-causative
piacere:               verby, stative-causative, transitive
Stative preoccupare verbs are thus semantically similar to piacere verbs. However, the ACC case on the object of stative preoccupare is lexically marked (or inherent, as suggested by B&R). Stative preoccupare verbs thus differ both from agentive preoccupare and from the piacere group, in that they are formed through a head that does not have a transitivity property. My hypothesis is that the peculiar behavior of stative preoccupare verbs in Italian is related to this lack of transitivity. Suppose that the ability to form reflexives or to causativize is related to the transitivity property of the head. The presence or absence of the transitivity property on the head may explain why agentive preoccupare verbs and piacere verbs can form reflexives and causatives, while stative preoccupare verbs cannot. Let us see how this should work for the syntactic effects noted above. Consider reflexivization first. Reflexivization affects the transitivity property of the verb and object case is absorbed (cf. Section 4.1.1 above): (27) a.
Gianni guarda Maria. (transitive, object case)
Gianni looks at Maria
b. Gianni si guarda. (reflexive, no object case)
Gianni self looks at
Suppose that the process of reflexivization involves dispensing with the transitivity property of v. Taking this as my hypothesis, I assume that it is only verbs that are formed through verbal heads possessing the transitivity property which may appear as reflexives. In other words, in order to suppress the transitivity property, a head must first have the option to have it. Agentive preoccupare and piacere verbs can form reflexives, when their object case is absorbed. The case on the object of stative preoccupare verbs cannot be absorbed, as it is not assigned by a transitivity head, and thus the process of reflexivization cannot take place. B&R note that passives and raising verbs in Italian cannot reflexivize, and assume that this is related to the fact that they lack external arguments. My hypothesis is that what passives, raising verbs and stative preoccupare verbs share is this: they are formed by combining the root with a verbal head which does not have any transitivity properties, hence their failure in reflexivization. Consider next causativization. I assume that the formation of causatives involves placing a verbal head on top of another head, in a process similar to restructuring. If the causativized verb is mono-argumental, then the subject of the lower verb is assigned ACC by the upper v:
(28) [vcause Maria [vcause vcause [vagentive Gianni [vagentive vagentive √work]]]]
     Maria ha fatto lavorare Gianni.
     M. made work G.-acc
     ‘Maria made Gianni work.’
If the lower v is transitive, then the causative form is essentially similar to double object constructions (Marantz 1993), with the upper (causative) v checking structural DAT, instead of ACC:

(29) [vcause Maria [vcause vcause [vagentive to Gianni [vagentive vagentive [√P √eat [NP an apple]]]]]]
     Maria ha fatto mangiare una mela a Gianni.
     M. made eat an apple-acc to Gianni
     ‘Maria made Gianni eat an apple.’
Causativized structures involve two structural cases, whose morphological spell-out is determined, I assume, post-syntactically. Thus, when the lower v assigns morphological accusative, the upper v assigns dative (29). However, if the lower v assigns dative case, as with piacere verbs, then the causative (upper) v checks accusative case (25b). I assume that the restriction on the causativization of stative preoccupare verbs has to do with the morphological component rather than with the syntax. Recall that causativization in this case involves merging a causative head that assigns structural case with a defective v (i.e., a v which lacks transitivity), whose object is lexically marked with accusative case. In the morphology, dative is inserted for the object of the upper verb, following the morphological accusative of the lower verb. This dative is infelicitous, perhaps because the subjects of stative little v are not meant to be marked with dative, as this would lead them to be interpreted as subjects of agentive v. The argument above is tentative, but I believe that there is some evidence that the issue is indeed morphological. Italian has one verb that belongs to both the piacere and preoccupare groups, interessare, “interest”, which can take either a dative or accusative object (30a). I assume that such roots are compatible with both types of stative little v: transitive (with structural dative) or defective (with lexically marked accusative). Interestingly, the two variants exhibit a difference in causativization: the dative variant can undergo causativization, but the accusative one cannot (30b–c):

(30) a. La politica / Maria lo / gli interessa.
        Politics / Maria he-acc / he-dat interests
        ‘Politics/Maria interest him.’
     b. Gianni ha fatto interessare Maria a Paolo.
        Gianni made interest Maria to Paolo.
        ‘Gianni made Maria interest Paolo.’
     c. *Gianni ha fatto interessare Paolo a Maria.
        Gianni made interest Paolo to Maria
The behavior of ObjExp verbs in Italian provides another example of the types of verbal morphemes found across languages. In the next section I will look at another case of variation across languages, Subject Experiencer verbs. There, too, much of the variation can be traced to the verbal morphemes that create the verbs.
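The feature bundles proposed for the three Italian verb groups can be restated as a small data structure. The Python below is only an illustrative sketch, not part of the analysis: the class `VerbalHead` and the functions `can_reflexivize` and `shared_properties` are hypothetical names introduced here, and the one-line rule in `can_reflexivize` simply encodes the hypothesis stated above that only heads with the transitivity property can have it suppressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbalHead:
    """A verbal morpheme as a bundle of features (hypothetical encoding)."""
    label: str
    content: str        # semantic "flavor", e.g. "agentive", "stative-causative"
    transitive: bool    # external argument plus object case checking
    verby: bool = True  # every v head makes its root into a verb


def can_reflexivize(head: VerbalHead) -> bool:
    # Hypothesis from the text: transitivity can only be dispensed with
    # on a head that has the transitivity property in the first place.
    return head.transitive


def shared_properties(a: VerbalHead, b: VerbalHead) -> set:
    """List which of the three ingredients two heads have in common."""
    shared = {"verby"}
    if a.content == b.content:
        shared.add("semantic content")
    if a.transitive and b.transitive:
        shared.add("transitivity")
    return shared


HEADS = [
    VerbalHead("agentive preoccupare", "agentive", True),
    VerbalHead("piacere", "stative-causative", True),
    VerbalHead("stative preoccupare", "stative-causative", False),
]

for head in HEADS:
    status = "possible" if can_reflexivize(head) else "blocked"
    print(f"{head.label}: reflexive si {status}")
```

On this encoding, piacere and stative preoccupare come out as sharing semantic content but not transitivity, which is the configuration the text holds responsible for the contrast in (24).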
Subject Experiencer verbs

Consider now the second group of psychological verbs, Subject Experiencer (SubjExp) verbs: love, hate, admire etc. The morphological similarity between pairs of ObjExp and SubjExp verbs is apparent, and intriguing:

(31) a. John worries about the dog.
     b. The dog worries John.
(32) a. John is amused at the game.
     b. The game amused John.
If we take roots to be the basic elements in the lexicon, the relation between SubjExp and ObjExp verbs is straightforward. I suggest that (in many cases, in many languages) SubjExp and ObjExp verbs are formed from the same root. The combination of this root with different verbal morphemes yields “verbs” of different types. SubjExp and ObjExp verbs share some of their morphology in many languages:16

(33) a. ragaz (be angry)
     b. hirgiz (anger)
     c. hitragez (get angry)
     (√rgz) (Hebrew)
(34) a. inhoa (find disgusting)
     b. inhotta (disgust)
     (√inho) (Finnish: Pylkkänen 1997)
(35) a. s’étonner (be amazed)
     b. étonner (amaze)
     (√éton) (French: Pesetsky 1995)
(36) a. udivljat-sja (be surprised)
     b. udivljat (surprise)
     (√udivl) (Russian: Pesetsky 1995)
Like all verbal heads, the verbal morpheme that makes roots into SubjExp verbs is a bundle of features. Consider now the features of this morpheme. It certainly has the verbalizing feature, namely, making roots into verbs. It also has some semantic content. In this case, the content is stative, non-causative. Like locative prepositions, it establishes a static relation between the experiencer and a mental state (the experiencer is at some mental state). This head makes the root into a stative event, and introduces an external argument that is stative and non-causative. I assume that these features are universally associated with morphemes that make roots into SubjExp predicates: these verbs have to be stative, as part of their content. However, the third property that verbal morphemes may have, transitivity, is not obligatory. Interestingly, as far as transitivity is concerned, the head forming SubjExp predicates exhibits cross-linguistic variation. In some languages, like English and Italian, it has the transitivity property and behaves like standard agentive little v. Thus, SubjExp verbs in Italian and English have accusative object marking. Furthermore, SubjExp verbs in Italian behave like agentive transitive verbs in allowing reflexive si and causativization (cf. B&R):
(37) a. Gianni si ama / apprezza / teme.
        Gianni refl loves / appreciates / fears
        ‘Gianni loves/appreciates/fears himself.’
     b. Gianni ha fatto amare / apprezzare / temere Paolo a Maria
        Gianni made love / appreciate / fear Paolo to M.
        ‘Gianni made Maria love/appreciate/fear Paolo.’
In other languages stative verbs are realized differently from active verbs. In Hindi and Georgian, for example, statives mark their subjects with dative case: (38) a.
Gela-s nino uqvars. (Georgian: McGinnis 1998)
Gela-dat Nino-nom love
b. Ram-ko Sita-se pyaar hai. (Hindi)
Ram-dat Sita-instr love be-Present
In Irish and Scottish Gaelic SubjExp predicates are expressed through the verbs be or have, combined with a noun or an adjective: (39) a.
Tá fuath do Y ag X. (Irish: McCloskey & Sells 1988)
is hatred to Y at X
‘X hates Y.’
b. Tha eagal orm. (Scottish Gaelic: Ramchand 1997)
is fear on me.
‘I am afraid.’
I assume that the head that forms SubjExp verbs has the same semantic content (stative, non-causative) in all languages. The syntactic realization of the stative head (its argument structure, case marking properties and transitivity) is decided by each language separately, depending on which features it chooses to bundle together. Some languages (English, Italian) bundle the transitivity property and stative content under the same morpheme, while others (Georgian, Hindi) do not. The kind of morpheme a language has will affect the type of “verbs” (the combinations of roots with morphemes) it has. Variation in morphemes thus leads to syntactic variation in the realization of verbs across languages. The “Experiencer” argument of SubjExp verbs is an external argument, introduced by a verbal head. It is a subject of a state, while the “Experiencer” of ObjExp verbs is the argument that is being put into a state. The two groups of verbs also differ with respect to the type of event they encode. SubjExp verbs are static: the Experiencer is at a certain state. ObjExp verbs (on both readings) are dynamic: the Experiencer is being put into a state (with or without change of state). We thus expect the two groups to have their “Experiencer” syntactically realized in different positions – subject (in static relations) or object (in dynamic relations). It is important to note at this point that cross-linguistic variation occurs with SubjExp verbs and other stative verbs, but not with standard, agentive verbs. Quirky (dative marked) subjects are often subjects of stative verbs, never agents. I assume that this fact is related to stativity in some way. Perhaps because stativity is a property shared by verbs and prepositions of a certain kind, there are fewer restrictions on the way in which it is realized (unlike agentive predicates, which must be realized as verbs). I leave the question open for future research.
Summary

Regarding the Lexicon of a language as a set of roots and (possibly bundled) features enables us to give a more precise content to the claim that language variation is restricted to lexical items. There are three ways in which languages vary:

1. Root inventory: what signs or lexical pieces does the language have? Variation in roots includes also the variation in the meanings that are assigned to roots in different environments, or contexts. For example, cake is assigned different meanings in eat a cake, take the cake or piece of cake (cf. Marantz 1997).17
2. Subset of features selected from the universal pool: what features does the language employ in building its lexicon? Does it have morphological case features, gender features, aspect (perfective/imperfective)?
3. The ways in which features are bundled together. As argued above, languages may put together different sets of features into morphemes. This allows for a further source of lexical variation: what bundles does the language have.

In this paper I concentrated specifically on verbal morphemes, in order to account for variation in verb types. I argued that there are three types of properties that serve as “ingredients” of verbal morphemes: verbalizing property (making roots into verbs), semantic content (which may come in several “flavors” – agentive, stative, stative-causative) and transitivity (formal features, external argument and case checking, which are optional). Different phenomena across languages (passivization, reflexivization and the formation of psych verbs) have been argued to result from the combination of roots with different morpheme types.
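The three sources of variation listed above can be pictured as three independent comparisons between lexicons. The following Python sketch is a toy illustration under stated assumptions: the `Lexicon` class, the `variation` function, and the two sample lexicons (`lang_a`, `lang_b`, with placeholder roots) are hypothetical and stand for no particular language. The sample is built so that the two lexicons agree on roots and on selected features and differ only in how the features are bundled.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

Bundle = FrozenSet[str]


@dataclass(frozen=True)
class Lexicon:
    """A lexicon as roots, selected features, and feature bundles (toy model)."""
    roots: FrozenSet[str]
    features: FrozenSet[str]
    bundles: FrozenSet[Bundle]


def variation(a: Lexicon, b: Lexicon) -> Dict[str, frozenset]:
    """Return, per dimension, the items found in one lexicon but not the other."""
    return {
        "roots": a.roots ^ b.roots,
        "features": a.features ^ b.features,
        "bundles": a.bundles ^ b.bundles,
    }


# Placeholder data: same roots, same features, different bundling.
shared_roots = frozenset({"root-1", "root-2"})
shared_features = frozenset({"verby", "stative", "transitive"})

lang_a = Lexicon(shared_roots, shared_features,
                 frozenset({frozenset({"verby", "stative", "transitive"})}))
lang_b = Lexicon(shared_roots, shared_features,
                 frozenset({frozenset({"verby", "stative"})}))

for dimension, difference in variation(lang_a, lang_b).items():
    print(dimension, "differ" if difference else "match")
```

Keeping the three dimensions separate makes the point of the third one visible: two lexicons may select identical features from the universal pool and still yield different verb types.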
Notes

1. For a similar proposal, assigning a morpho-syntactic origin for language variation, see Mateu and Rigau (this volume).
2. Some theories postulate a richer structure of the verbal projection, correlating with different verb classes. See, in particular, Hale and Keyser (1998), Kural (this volume). In this paper I argue that such richness is best captured in terms of features of the verbal head that merges with the root. However, this approach shares much with the structural approach mentioned above.
3. This syntactic and semantic alternation of the root is reflected in the morphology of the verb. The root appears with a different verb-creating morpheme, or pattern, in each of its appearances as a stative, inchoative or causative verb. Note that the same type of alternation exists in Romance languages, although its morphological effects are less rich. The causative-inchoative alternation is morphologically expressed by the presence or absence of the pronominal clitic SE (e.g. French réchauffer, heat-causative vs. se réchauffer, heat-inchoative).
4. b, k and p are spirantized in Hebrew in post-vocalic positions, yielding v, x and f respectively.
5. Initial y gives rise to phonologically contracted forms in certain contexts, such as hošiv.
6. This ambiguity dates back to Ruwet (1972), who notes that verbs such as strike cannot be interpreted as psych verbs if they are agentive (strike someone with your intelligence vs. strike someone with a bat).
7. There is also a non-stative non-agentive reading, or an eventive reading, which has a change of mental state in the experiencer, but no intentional agent:
   (i) The explosion/the noise/the thunderstorm frightened Laura.
   This reading patterns with the agentive reading in some languages, and with the stative reading in others. In this paper I abstract away from this issue, as it does not bear directly on my analysis. See Arad (1998) for discussion.
8. Note that although the stative reading is easier to get with present tense or habitual aspect, it cannot be reduced to it. As shown in Pylkkänen (1997), the stative reading can also refer to a single event, or “spell” of mental state.
9. In languages in which object case marking is sensitive to change of state, such as Finnish, the objects of these verbs are marked with partitive case (instead of accusative).
10. In languages in which psych verbs bear causative morphology, such as Hebrew or Finnish, both readings carry a causative morpheme (cf. Pylkkänen 1997).
11. B&R show, in fact, that verbs such as colpire, strike, can take a reflexive on their physical, non-psych reading:
   (i) Gianni si è colpito
       Gianni self is struck
       a. con un bastone.
          with a stick
       b. *per la sua prontezza.
          by the his quickness
12. In fact, as shown by Ruwet (1972), many ObjExp verbs have a non-psych reading, in which the same participant is interpreted not as an experiencer, but as a patient: shake, disturb, move etc.
13. Note that this is not an explanation of Burzio’s generalization, but rather, a description. In line with much work on case (e.g. Marantz 1991; Laka 1993) I think this generalization could be derived from independent principles (such as the manner in which case is assigned and the relation between v and T and the EPP).
14. Note that this reading is non-agentive: it does not mean that John is trying to please himself, but rather, that he likes himself, thinks highly of himself, etc.
15. Note that dative is assigned structurally in causative constructions in Italian when the causativized verb is transitive (see discussion in the text below).
16. English has very few pairs of SubjExp and ObjExp verbs formed from the same root with no morphological modification (√worry).
17. At an even more local environment, that of the immediate phrase, roots can also acquire context-dependent meaning. Thus, for example, French pomme is interpreted as apple on its own, but as potato in the context of pomme de terre.
References

Arad, M. (1998). VP structure and the Syntax-Lexicon Interface. Doctoral dissertation, University College London.
Belletti, A. & L. Rizzi (1988). Psych verbs and theta theory. NLLT, 6, 291–352.
Borer, H. (1984). Parametric Syntax. Dordrecht: Reidel.
Borer, H. (1998). Passive without Theta Grids. In S. Lepointe (Ed.), Morphology and its Interfaces with Phonology and Syntax. Stanford, CA: CSLI.
Burzio, L. (1986). Italian Syntax. Dordrecht: Kluwer.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, N. (1998). Minimalist inquiries: The framework. MITOPL. MIT.
Collins, C. (1997). Local Economy. Cambridge, MA: MIT Press.
Grimshaw, J. (1990). Argument Structure. Cambridge, MA: MIT Press.
Hale, K. & J. Keyser (1998). The Basic Elements of Argument Structure. In H. Harley (Ed.), MITWPL 32. Papers from the Upenn/MIT Roundtable on Argument Structure and Aspect (pp. 73–118).
Halle, M. & A. Marantz (1993). Distributed Morphology and Pieces of Inflection. In K. Hale and J. Keyser (Eds.), The View from Building 20 (pp. 111–176). Cambridge, MA: MIT Press.
Harley, H. (1995). Subjects, Events and Licensing. Doctoral dissertation, MIT.
Laka, I. (1993). Unergatives which assign ergative. In MITWPL. Papers on case and agreement.
Kratzer, A. (1996). Severing the External Argument from its Verb. In J. Rooryck and L. Zaring (Eds.), Phrase Structure and the Lexicon. Dordrecht: Kluwer.
Kural, M. (this volume). A four-way classification of monadic verbs.
McCloskey, J. & P. Sells (1988). Control and A chains in Modern Irish. NLLT, 6, 143–189.
McGinnis, M. (1998). Locality in A-movement. Doctoral dissertation, MIT.
McGinnis, M. (1999). Types of causation and the distribution of psych roots. Ms., UPenn.
Marantz, A. (1984). On the Nature of Grammatical Relations. Cambridge, MA: MIT Press.
Marantz, A. (1993). Implications of asymmetries in double object constructions. In S. Mchombo (Ed.), Theoretical Aspects of Bantu Grammar. Stanford: CSLI Publications.
Marantz, A. (1997). No Escape from Syntax: Don’t Try Morphological Analysis in the Privacy of Your Own Lexicon. In A. Dimitriadis, L. Siegel et al. (Eds.), Upenn Working Papers in Linguistics, Vol 4.2, Proceedings of the 21st Annual Penn Linguistics Colloquium (pp. 201–225).
Marantz, A. (1999). Case and Derivation: Properties of little v. Talk presented at the University of Paris VIII.
Mateu, J. & G. Rigau (this volume). A minimalist account of conflation processes: Parametric variation at the syntax-lexicon interface.
Pesetsky, D. (1995). Zero Syntax. Cambridge, MA: MIT Press.
Pylkkänen, L. (1997). Stage and Individual level psych verbs in Finnish. Paper presented in the workshop on events in syntax and semantics, LSA Summer Institute, Cornell University.
Ramchand, G. (1997). Aspect and Predication. Oxford: Oxford University Press.
Ruwet, N. (1972). Théorie syntaxique et syntaxe du français. Paris: Editions du Seuil.
Agree or attract? A Relativized Minimality solution to a Proper Binding Condition puzzle

Cedric Boeckx
University of Illinois at Urbana-Champaign
I examine a paradigm first discussed by Kroch and Joshi (1985), which Lasnik (in press) takes as an argument for feature-movement. I show that Lasnik’s solution is problematic on both conceptual and empirical grounds. I offer an alternative approach that, in contrast to previous solutions, does not rely on the Proper Binding Condition, but instead deeply implicates Relativized Minimality.
Introduction
Considerable insight has been gained in recent years into the nature of word order by focusing on possibilities of remnant movement.1 Abstractly, remnant movement takes the form in (1). An element α is moved out of an element β, which is subsequently moved to a position higher than (and featurally distinct from) α’s derived position. (1) [[β ... t α ...] ... [ ... α ... [ ... t β ...]]]
The process is illustrated in (2) (from German).

(2) [Y t_i gelesen]_j hat [X das Buch]_i keiner t_j
    read has the book noone
    ‘Noone has read the book’
Because of its two-step character, remnant movement has come to be regarded as a powerful argument for a derivational approach to syntactic computation (see Müller 1998; but see Brody & Szabolcsi 2000). One of the most fascinating aspects of remnant movement is the problem it poses for the Proper Binding Condition (PBC), which demands that traces be bound (hence c-commanded) by their antecedents.

(3) Proper Binding Condition (Fiengo 1977)
    Traces must be bound
The influence of remnant movement analyses has grown steadily in recent years. For instance, projects are being developed to show that most (perhaps all) instances of head-movement can, and should, be reanalyzed in terms of remnant movement (see Koopman & Szabolcsi 2000 and Mahajan 2000, among others). Kayne (1998, 2001) has even suggested that traditional instances of covert processes such as scope reversal via QR or reconstruction be captured by making massive use of remnant movement, with no need for a distinct LF component. Remnant movement raises many intriguing questions which I will not discuss here (see Müller 1998 for what is to date the most detailed study of remnant movement).2 Rather, I will focus on a fairly narrow issue that arises in the remnant movement approach, and suggest that once properly analyzed the problem is only apparent. In addition, the investigation will enable us to draw some important conclusions about the structure of the grammar; in particular, about the status of a distinct LF component.
How likely

Consider the contrast in (4)–(5), originally reported in Kroch and Joshi (1985).

(4) a. John is likely to win
    b. There is likely to be a riot
(5) a. How likely to win is John
    b. *How likely to be a riot is there
(5a) appears to be a paradigm case of remnant movement. In a first step, John is raised from its VP-internal position to its surface position (SpecIP). Next, the non-finite predicate how likely ..., containing the trace of John, is raised to SpecCP. This is schematized in (6).

(6) (=5a) [CP [how likely [t_i to win]]_j [C’ is_k [IP John_i t_k [VP t_j ]]]]
Surprisingly, an example like (4b), which is minimally different from (4a), fails to yield a grammatical output if the how likely ... chunk is raised, as in (5b) (compare (6) and (7)).
(7) (=5b) [CP [how likely [t_i to be a riot]]_j [C’ is_k [IP there_i t_k [VP t_j ]]]]
The crucial difference between (4a) and (4b) is that John raises to matrix SpecIP in (4a), but not in (4b). Instead, in (4b) an expletive is inserted in the embedded SpecIP (due to the preference of Merge-over-Move, see Chomsky 1995: 348; 2000: 111), and undergoes movement to matrix SpecIP. (I return to the status of expletive raising below. Note that if expletives are taken to undergo predicate movement, as in Moro 1997 and related work, (5b) is equally puzzling.) The contrast in (5) is puzzling in more than one respect. Müller (1998: 7, n.10) mentions the paradigm in (5) and suggests we treat (5b) as an unexplained exception. It is indeed difficult to see what grammatical property would exclude (5b) while ruling in (5a). Further, it has been suggested in Vukić (1998) and Bošković (2001a) that expletives are merged in their surface position without undergoing any movement (both Vukić and Bošković treat expletives as grammatical formatives that are merged as late as possible. See Bošković for evidence that there is no EPP-checking in the infinitival complement of raising predicates). If correct, the ‘late-insertion’ view of expletives replaces (7) with (8).

(8) [CP [how likely [to be a riot]]_j [C’ is_k [IP there t_k [VP t_j ]]]]
That (8) yields an ungrammatical output, and (6) doesn’t is clearly unexpected. If anything, one would expect the reverse pattern of grammaticality, as (6) appears to violate the PBC, whereas (8) does not (the raised predicate does not contain any trace at all). It is this puzzle that I will concentrate on in this paper. On grounds that I have discussed elsewhere (see Boeckx 2001), I will assume that the derivation of (5b) given in (8) is the correct one (i.e., expletives are merged in their surface positions). The thesis I will entertain here is that the contrast in (5) has nothing to do with remnant movement or the Proper Binding Condition. Whatever their ultimate status, these will remain unaffected by (5), for reasons to be developed shortly. I will also argue, contra Lasnik (in press), that the contrast in (5) fails to provide an argument for the existence of feature movement (Chomsky 1995) and for a distinct LF component.
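The puzzle can be made concrete with a small sketch. The Python below is an illustration only and not part of the analysis: it encodes the bracketings in (6) and (8) as trees and checks whether each trace has a co-indexed antecedent that c-commands it. The class `Node`, the helper names, and the category label `XP` for the fronted predicate are assumptions introduced here, and c-command is simplified to "the parent of A dominates B, and A does not dominate B".

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Node:
    """A tree node; `index` is the movement index (e.g. "i"), if any."""
    label: str
    index: Optional[str] = None
    is_trace: bool = False
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def descendants(node: Node) -> Iterator[Node]:
    for child in node.children:
        yield child
        yield from descendants(child)


def dominates(a: Node, b: Node) -> bool:
    return any(d is b for d in descendants(a))


def c_commands(a: Node, b: Node) -> bool:
    # Simplified c-command: a's parent dominates b, and a does not dominate b.
    if a is b or a.parent is None or dominates(a, b):
        return False
    return dominates(a.parent, b)


def unbound_traces(root: Node) -> List[Node]:
    """Traces lacking a co-indexed, c-commanding antecedent (PBC violations)."""
    nodes = [root, *descendants(root)]
    unbound = []
    for t in (n for n in nodes if n.is_trace):
        antecedents = [n for n in nodes if not n.is_trace and n.index == t.index]
        if not any(c_commands(a, t) for a in antecedents):
            unbound.append(t)
    return unbound


def trace(index: str) -> Node:
    return Node("t", index=index, is_trace=True)


# (6): [CP [how likely [t_i to win]]_j [C' is_k [IP John_i t_k [VP t_j]]]]
structure_6 = Node("CP", children=[
    Node("XP", index="j", children=[
        Node("how likely"),
        Node("IP", children=[trace("i"), Node("to win")]),
    ]),
    Node("C'", children=[
        Node("is", index="k"),
        Node("IP", children=[Node("John", index="i"), trace("k"),
                             Node("VP", children=[trace("j")])]),
    ]),
])

# (8): [CP [how likely [to be a riot]]_j [C' is_k [IP there t_k [VP t_j]]]]
structure_8 = Node("CP", children=[
    Node("XP", index="j", children=[Node("how likely"), Node("to be a riot")]),
    Node("C'", children=[
        Node("is", index="k"),
        Node("IP", children=[Node("there"), trace("k"),
                             Node("VP", children=[trace("j")])]),
    ]),
])

print([f"t_{n.index}" for n in unbound_traces(structure_6)])
print([f"t_{n.index}" for n in unbound_traces(structure_8)])
```

Under these assumptions the check flags the trace of John in (6) and nothing in (8), which is the reverse of the observed grammaticality pattern and so restates why the PBC alone cannot be what separates (5a) from (5b).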
More facts

Before offering a solution to the contrast in (5), let me expand the data base by showing that the contrast in (4)–(5) is not limited to existential constructions. An effect similar to (5b) is found with idiom chunks (9b).3

(9) a. Advantage is likely to be taken of John
    b. *How likely to be taken of John is advantage
Descriptively speaking, raising of a portion of an idiom is unproblematic (9a), unless the movement is followed by predicate raising (9b). An additional puzzle of the how likely-paradigm is that the contrasts observed so far disappear if only a portion of the predicate is raised, as in (10).

(10) a. How likely is John to win
     b. How likely is there to be a riot
     c. How likely is advantage to be taken of John
Relying on Kroch and Joshi’s original observations, Lasnik and Saito (1992: 141) take the how-likely paradigm to require an explanation in terms of the PBC. According to them, (5b) and (9b) are out because the trace (of there and of advantage, respectively) fails to be properly bound after predicate raising. (Note that when only part of the predicate raises, as in (10), all traces are bound, as the portion of the predicate containing the trace remains in situ.) According to Lasnik and Saito, what saves (5a) is the existence of an alternative derivation that does not contain an unbound trace. Lasnik and Saito appeal to a long-standing view that modal predicates such as likely are ambiguous between raising and control predicates. Thus, (4a) may be represented as (11) or (12). (11) John is likely [t to win] (12) John is likely [PRO to win]
If the derivation in (11) is chosen, (5a) will violate the PBC, as shown in (13).
(13) [CP [how likely [*ti to win]]j [C’ isk [IP Johni tk [VP tj ]]]]
If, however, the control derivation is chosen, no trace will be contained in the raised predicate, and the sentence will be grammatical.
(14) [CP [how likely [PRO to win]]j [C’ isk [IP John tk [VP tj ]]]]
Lasnik and Saito note that since expletives and idiom chunks cannot control PRO, a derivation like (14) is unavailable to them.
Agree or attract?
(15) a. *there tried [PRO to be a riot] b. *advantage wants [PRO to be taken of John] (16) a. *[how likely PRO to be a riot] is there b. *[how likely PRO to be taken of John] is advantage
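The logic of Lasnik and Saito's account can be laid out as a small decision procedure. The Python sketch below is only an illustration under simplifying assumptions: the function name, the table of subjects, and the reduction of the PBC to "no trace of the subject inside the fronted predicate" are hypothetical conveniences, not Lasnik and Saito's own formalization.

```python
# Illustrative sketch of the Lasnik and Saito (1992) account; all names are hypothetical.
# Assumption: the PBC is violated whenever the fronted predicate contains the
# trace of a subject that stays behind in the matrix clause.

# Expletives and idiom chunks cannot control PRO, cf. (15).
CAN_CONTROL_PRO = {"John": True, "there": False, "advantage": False}

def ls_grammatical(subject, infinitive_fronted):
    """Return True if at least one derivation of the how likely question survives."""
    # Raising derivation: the subject leaves a trace inside the infinitive.
    # That trace is unbound only if the infinitive is fronted with how likely.
    raising_ok = not infinitive_fronted
    # Control derivation: PRO replaces the trace, so fronting is harmless,
    # but the option exists only for subjects that can control PRO.
    control_ok = CAN_CONTROL_PRO[subject]
    return raising_ok or control_ok

for subject in ("John", "there", "advantage"):
    print(subject, ls_grammatical(subject, True), ls_grammatical(subject, False))
```

On these assumptions the sketch reproduces the pattern in (5), (9b) and (10): only John survives fronting of the whole predicate, while all three subjects are fine when how likely fronts alone.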
By capitalizing on the PBC and on a raising/control ambiguity for predicates like likely, Lasnik and Saito account for the whole how likely paradigm. There are, however, several problems with Lasnik and Saito’s solution. The first I want to mention is the raising/control distinction of modal predicates. As already noted, it is standard to capture the root vs. epistemic modal distinction via a control vs. raising analysis. The main motivation for such an analysis is that root modals (but not epistemic modals) appear to assign a theta-role to the subject. Assuming some version of the theta-criterion, root modals then have to project their own subjects. Aside from the status of the theta-criterion in minimalist models of grammar, I note that the claim that root modals assign a theta-role to the subject has not gone unchallenged (see already Newmeyer 1975). Further, if one follows Kratzer (1991) in taking the root/epistemic distinction not to be the result of a lexical ambiguity, but of different modal bases that are provided by different conversational backgrounds, different syntactic (/thematic) structures are a priori not necessary to capture the distinction. In addition, Wurmbrand (1998) (see also Bobaljik & Wurmbrand 2000) and Barbiers (1995) provide detailed arguments based on unrelated phenomena (restructuring and ellipsis, respectively) against a control/raising analysis of the root/epistemic modal distinction.4 Moreover, if control reduces to raising, as Hornstein (1999) has argued, Lasnik and Saito’s PBC account of the contrasts above cannot be correct. Perhaps the strongest argument against a PBC account of the how likely paradigm comes from the ungrammaticality of (17) (pointed out to me by Koji Sugisaki, p.c.; see also Nomura 2001 and Abels, in progress).5
(17) *who said that there was how likely to be a riot
A similar fact obtains with idiom chunks. (18) *who said that advantage was how likely to be taken of John
Note, in contrast, the grammaticality of (19).6 (19) who said that John was how likely to win
The three sentences just given mirror the contrast between (5a) and (5b)–(9b). The importance of the present cases is that they do not involve the remnant
movement part (predicate raising). Hence they cannot be ruled out via the PBC.
A Move-F account
Recently, Lasnik (in press) has revisited the how-likely paradigm, and argued that it provides an argument for Chomsky’s (1995) treatment of covert movement as feature movement. To understand the argument, it is useful to retrace certain developments in the minimalist program concerning the nature of covert movement and the timing of operations.
Theories of movement
Whereas the first Minimalist paper (Chomsky 1993) had taken over the so-called Y-model (Chomsky & Lasnik 1977) to some extent (compare (20) and its minimalist variant (21)), subsequent research within the Minimalist Program (henceforth MP) has led to interesting modifications of the central architecture of the language faculty.
(20) [Diagram: D(eep)-Structure leads to S(urface)-Structure, which branches into PF and LF]
(21) [Diagram: Initial Array leads to Spell-Out, which branches into PF and LF]
Chomsky (1995) still adopts the ‘temporal’ asymmetry between overt and covert movement (i.e., pre- vs post-Spell-Out operations), much like in Chomsky (1993), but suggests that we view ‘covert operations’ as consisting not of movement of categories that happen to receive no pronunciation (as was the case in 1993, and in work within the GB-framework), but rather of movement of formal features. For Chomsky (1995), “the operation Move (...) seeks to raise just F[eature]” (p. 262). Chomsky’s reasoning is that movement is triggered to check features. We therefore expect under Minimalist assumptions that if the computational component can raise just what is needed (features to carry out the checking operation), it will do so. The question now arises as to why sometimes whole categories, and not just formal features, move. Chomsky’s answer (in 1995) is that overt movement is to be decomposed in the following way.
Applied to the feature F, the operation Move creates at least one, perhaps two “derivative chains” alongside the chain CHF = (F, tF ) constructed by the operation itself. One is CHFF = (FF[F], t FF[F] ), consisting of the set of formal features FF[F] and its trace; the other is CHCAT = (α, t), α a category carried along by generalized pied-piping. (Chomsky 1995: 265)
Chomsky assumes that any overt operation is the result of moving features first (so far, overt and covert movements are indistinguishable, except perhaps in terms of timing, a point I will discuss extensively below), and then an operation of pied-piping which carries along the ‘remnants’ of the item from which the features have been moved. Chomsky assumes that
For the most part – perhaps completely – it is properties of the phonological component that require pied-piping. Isolated features and other scattered parts of words may not be subject to its rules, in which case the derivation is canceled; or the derivation might proceed to PF with elements that are ‘unpronounceable,’ violating F[ull] I[nterpretation]. (Chomsky 1995: 266)
As emphasized in the first quote, within a Move-F framework, overt and covert operations are indistinguishable up to a certain point (formation of a derivative chain/pied-piping). This leads Chomsky to claim that “such considerations could permit raising without pied-piping even overtly, depending on morphological structure.” (p. 266) What Chomsky means by ‘overtly’ here is ‘the overt (pre-Spell-Out) component’ in (21). Before Move-F, it was assumed without discussion that covert operations took place after Spell-Out. Given Move-F, it is now possible to view covert movement as the ‘first’ part of overt movement, that is, as an operation that does not require any distinct component. This is indeed the conclusion that Chomsky (2000, 2001a, b) embraces:
There is a single cycle; all operations are cyclic. Within narrow syntax, operations that have or lack phonetic effects are interspersed. There is no distinct LF component within narrow syntax. (Chomsky 2000: 131)
Other proponents of the so-called single output syntax (ignoring important differences among them) are Bobaljik (1995, 2001), Groat and O’Neil (1996), Pesetsky (2000), and, from a representational perspective, Brody (1995, 1997). (For a very different implementation, see Kayne 1998.)
Proponents of single output syntax (SOS) make use of an insight expressed in Chomsky (1993), reviving the analysis of movement in Chomsky (1955), that traces are copies of the moved elements. The important idea is that chains (the objects of syntactic computation) consist of sequences of copies of a given element, but that at the interfaces (LF and PF) only one position in a given chain (link) is (typically) privileged or ‘interpreted.’ (At least in the general case; see Nunes 1999, to appear, and Bošković 2000, 2001b, for valuable discussion and necessary refinements.) Departing from Chomsky (1993, 1995), SOS defenders argue that not only may LF privilege either the higher or the lower copy (see Chomsky’s 1993 discussion of A-bar movement reconstruction in such terms), but that PF also may choose which copy to privilege (i.e., pronounce). On this view, ‘overt’ and ‘covert’ movements are distinguished not by temporal ordering in the derivation, but rather by the choice of which copy to pronounce and which copy to delete.7 Or, put differently, pronunciation of the highest copy corresponds to ‘overt’ movement, and pronunciation of the lowest copy corresponds to ‘covert’ movement. The major point of departure for Chomsky from other Single-Output syntax models like Groat and O’Neil (1996) and Bobaljik (1995, 2001) lies in his acceptance of a covert operation that is distinct from ‘covert’ (i.e., unpronounced) phrasal movement (either feature movement or Agree). (Kayne 1998: 172 remains somewhat agnostic; Brody 1997 and Pesetsky 2000 assume some version of Agree.) The need for such an operation is best illustrated on the basis of existential sentences.8 Originally, Chomsky (1986) proposed that the associate-indefinite NP in such cases replaces the expletive there in the covert component, as illustrated in (22).
(As far as I can see, unless some extra stipulation is built in, such as Bobaljik’s 2001 Minimize PF-LF mismatch, Groat and O’Neil 1996 and Bobaljik 1995, 2001 make the same prediction as Chomsky 1986.)
(22) a. there is a man in the garden (S(urface)-Structure)
b. a man is [t in the garden] (LF-expletive replacement)
This analysis was criticized as soon as it was proposed (apparently, first, by Lori Davis; Howard Lasnik, personal communication): the expletive replacement analysis gets the scope facts wrong. As is well-known, indefinites in subject positions are scopally ambiguous (see (23)). (22b) predicts that such ambiguity exists in existential constructions. But this is not the case. The associate in (24) only has a low reading.
(23) someone from New York is likely to win the lottery (someone likely/likely someone)
(24) there is likely to be someone here (*someone likely/likely someone)
Chomsky (1991) puts forward a new analysis of existential constructions. He suggests that at LF the associate does not literally replace the expletive but adjoins to it, as shown in (25). (25) [a man [there]] is [t in the garden]
There are many problems with this analysis, and I won’t review them here. They are thoroughly discussed in Lasnik (1992). Chomsky (1995) proposes a much more satisfactory account. Chomsky’s reasoning is that movement is triggered to check features. We therefore expect under Minimalist assumptions that demand minimization wherever possible that if the computational component can raise just what is needed (features to carry out the checking operation), it will do so (recall “the operation Move (...) seeks to raise just F[eature]” (Chomsky 1995: 262)). Thus, Chomsky argues for the existence of feature movement (Move-F). Relying on the Move-F hypothesis, Chomsky proposes that in existential constructions only formal (φ-) features of the associate NP move (head-adjoin) to Infl0 , leaving all phonological and semantic features behind. Raising of φ-features immediately accounts for the fact that finite agreement in existential constructions is controlled by the feature specification of the associate, as illustrated in (26). (I here set aside semi-formulaic examples like there’s two men in the garden.) (26) a. there is/*are a man in the garden b. there *is/are two men in the garden
As Lasnik has extensively discussed (see the essays in Lasnik 1999b), the Move-F account provides a straightforward explanation for the narrow scope of the associate NP in these constructions if we assume, quite plausibly, that the establishment of scopal relations is more than a matter of formal features, and requires phrasal displacement (see Pesetsky 2000: 2–5 for some discussion).9 On largely conceptual grounds, Chomsky (2000:123) dispenses with feature movement altogether and captures its effects via the operation Agree. The latter amounts to a process of feature checking (in his terms, valuation) at a distance. The summary of minimalist views on covert movement given here, especially the covert process that accounts for agreement in existential sentences
will now enable us to examine Lasnik’s (in press) argument in favor of feature movement based on the how likely paradigm discussed in previous sections.
The Proper Binding residue
Lasnik (in press) uses the paradigm above in support of the Move-F hypothesis. His argument runs as follows. Barss (1986) rules out (5b) by capitalizing on Chomsky’s (1986) expletive-replacement analysis of existential constructions. Barss notes that if a riot must replace there at LF, the movement will be illicit because it is sidewards. Lasnik agrees that the expletive replacement account cannot be correct, but he claims that Barss’s analysis can be maintained under the feature movement analysis, crucially not under an Agree analysis. If the φ-features of a riot are attracted in (5b), the element becomes PF-deficient (see Chomsky’s 1995 quote above; see also Ochi 1999a, b; Lasnik 1999c and Uriagereka 1999). How likely-fronting then removes the category from the c-command domain of the moved features, so the necessary repair strategy cannot be carried out, and the derivation crashes due to the presence of scattered features at the interfaces. Lasnik claims that an Agree account cannot capture the fact in (5b), as in the absence of (feature) movement, there is no feature scattering to start with, hence no requirement for the associate NP to remain within the c-command domain of the expletive. Lasnik’s analysis is appealing for the level of subtlety it reaches. In contrast to many other studies (see, e.g., Wurmbrand 2001), it is not concerned with whether or not some non-phrasal ‘covert’ process exists, but with the more difficult question of which form that process takes. However, Lasnik’s account faces many problems. First, by adopting a feature movement analysis, Lasnik inherits the conceptual difficulties that led Chomsky to reject feature chains in favor of Agree. In particular, it is not clear what feature scattering (crucial for Lasnik) means under the copy theory of movement.
Movement of the feature will leave a copy behind, rendering the need for repair obscure. Further, if feature movement chains reduce to head-chains (Chomsky 1995; Bošković 1998), they inherit the problems associated with the latter (see Chomsky 2000, 2001a; Brody 2000; Boeckx & Stjepanović 2001; and Mahajan 2000). Second, in order to account for why feature movement is not accompanied by repair in standard existential sentences, Lasnik has to assume that that instance of feature movement takes place after Spell-out, in a separate LF component (where feature scattering does not cause any crash). He is thus forced to return to the Y-model. Third, Lasnik has to postulate that feature movement out of the copy left by remnant movement is impossible (contra Bošković 1997 and Nishioka 1997). If it were possible, the remnant movement case (5b) would be virtually identical to (4b), as illustrated in (27).
(27) [how likely to be a riot] [is [there [ ]]]
[Diagram: the empty brackets mark the copy left by remnant movement; a line labeled F-movement links that copy to is]
Fourth, Lasnik’s solution says nothing about the badness of (17), as it crucially relies on remnant movement to exclude (5b). Fifth, it is not clear how the feature movement account of (5b) extends to (9b) (let alone (18)). Unlike the expletive-associate relation, the relation between a raised idiom chunk and the idiom remnant has never been treated in terms of feature movement as far as I know. On the basis of the problems it faces, I think it is fair to say that Lasnik’s analysis is inadequate.
Relativized Minimality
We have seen that neither a raising vs. control/PBC account nor a feature movement account adequately captures the how likely paradigm. In this section I propose a novel way of looking at the facts that not only accounts for the whole range of data, but also allows us to preserve the arguably more elegant single output model of syntax, and does not jeopardize any conclusions about remnant movement reached by previous studies. The format of the solution I would like to argue for is well-known. It essentially amounts to a Relativized Minimality violation (Rizzi 1990). An element α enters into a relation with an element β if there is no γ that meets the requirement(s) of α (i.e., that matches α) and c-commands β. The illicit situation is schematized in (28).
(28) [α ... [ ... γ ...[ ... β ...]]] (γ c-commands β)
In the following I will adopt Starke’s (2001) conception of Relativized Minimality, as it leads to what I think is a clearer solution. The portion of Starke’s view on chains that will be relevant for us is roughly as follows. (29) a. α ... α ... α b. αβ... α ... αβ
In (29a) we have three elements of the same type (α). Attempting to relate the first and the third (in linear order) leads to a violation of Relativized Minimality. The situation in (29b) is more complex. The intervening element (second element in linear order) is of type (α). The first and the third elements which the grammar is trying to relate both contain an α feature. In addition, they contain a β feature which is missing from the intervener. Starke’s point is that if the first and the third element are α-related, the situation that obtains is equivalent to that in (29a), and is thus ruled out by Relativized Minimality. If, however, the first and the third elements are β-related, no intervention effect emerges, as the potential intervener is not “of the same type.” A concrete case of (29a) is a Superiority violation of the type found in (30)–(31).
(30) *whati did who buy ti
(31) [C [ who T [buy what]]]
C: α, +wh; who: α, +wh; what: α, +wh
Starke claims that the situation in (29b) corresponds to a weak island. As is well-known from the work of Cinque (1990), Rizzi (1990), and Szabolcsi and Zwarts (1993), weak islands such as wh-islands are ‘selective’ islands, in the sense that certain elements associated with well-defined readings can extract, while others, which lack these readings, can’t.10 Classic cases are the theta-/nontheta related contrast (32), and the D-linked/non-D-linked contrast (33).11 (Judgements are contrastive, rather than absolute.) (32) a. *How many pounds do you wonder whether he weighed t? ANSWER 1 [non-referential/non-theta reading]: he weighed 100 lbs (he was skinny) b. How many pounds do you wonder whether he weighed t? ANSWER 2 [referential/theta reading]: he weighs 100 lbs (by lifting the package) (33) a. *what do you wonder whether Mary read b. which book do you wonder whether Mary read
Starke’s view is that the good instances of extraction (32b, 33b) correspond to the situation in (29b) when the moving element β-relates to its final landing site. The bad instances of extraction correspond to an α-relation.
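The α/β logic of (29) can be stated as a simple check on feature sets. The following Python sketch is a minimal illustration, assuming that each element is just a set of feature labels and that a relation established on a feature is blocked whenever the intervener bears that same feature; the function name and the encoding are hypothetical and are not taken from Starke (2001).

```python
# Minimal sketch of the intervention logic in (29); names are hypothetical.
# Each element is modeled as a set of feature labels.

def can_relate(first, intervener, third):
    """Return True if `first` and `third` can be related across `intervener`.

    A relation on feature f is blocked when the intervener also bears f,
    so a licit relation needs a shared feature that the intervener lacks.
    """
    shared = first & third
    return bool(shared - intervener)

# (29a): α ... α ... α, where the only shared feature is matched by the intervener.
case_29a = can_relate({"α"}, {"α"}, {"α"})
# (29b): αβ ... α ... αβ, where the β-relation escapes intervention.
case_29b = can_relate({"α", "β"}, {"α"}, {"α", "β"})

print(case_29a)  # False
print(case_29b)  # True
```

The design choice here is to treat "being of the same type" as plain set membership, which is enough to separate (29a) from (29b) but abstracts away from c-command and from the hierarchical organization of features.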
In what follows, I argue that the α/β-relation in (29b) plays a role in the how-likely paradigm. The reader should understand that I will not try to rule out the bad cases by appealing to some inadequacy of the remnant movement step fronting how likely .... The ungrammaticality of (17)–(18) suffices, in my view, to show that the badness of (5b)–(9b) is independent of remnant movement. In other words, they should be ruled out prior to the application of remnant movement. Put differently, in order to derive the contrast between (4b)/(9a) and (5b)/(9b), we must find a difference between the following stages of the derivation:
(34) a. is [likely to be a riot]
b. is [likely to be advantage taken of John]
(35) a. is [how likely to be a riot]
b. is [how likely to be taken advantage of John]
The difference cannot affect the good cases (4a)/(5a): (36) is [likely John to win] (37) is [how likely John to win]
With Chomsky (2000) I assume that in existential constructions, Infl0 and the associate stand in a checking relationship by Agree, and that feature matching takes place as the derivation unfolds, not in a distinct LF-component. It will therefore be crucial to examine the various cases step by step. The good cases (4a, 4b, 9a) are derived straightforwardly, as in (38)–(39).
(38) [ T [likely [... NP ...]]]
[Diagram: a line links T and NP]
(39) a. [ T [likely [John to win]]] Agree (T,John) + Move (John)
b. [ T [likely [to be a riot]]] Agree (T,[a riot]) + Merge there (there-insertion)
c. [ T [likely [to be taken advantage of John]]] Agree (T,advantage) + Move (advantage)
The proposal I would like to make to rule out the cases in (5b, 9b) is that the presence of how in how likely ... blocks the Agree relation between T and some NPs. It has often been argued that wh-words are decomposable into a wh-part and an indefinite part (an idea going back to Chomsky 1964 and Katz & Postal 1964). Suppose that how in how likely actually consists of a wh-part and an indefinite part, roughly as ‘wh-indefinite (degree).’12 I would like to argue
that the indefinite part of how creates an intervention/Relativized Minimality effect,13 as schematized in (40).
(40) [ T [how likely [... NP ...]]]
T: [+NP]; how: [+WH,+NP]; NP: [+NP]; the relation between T and NP is blocked (*)
It may be objected that the indefinite part of the wh-phrase does not c-command the associate, hence should not be a blocker for Agree (recall (28)). However, there are various ways around this well-known ‘almost c-command’ problem. For concreteness, I will assume that c-command out of the specifier of an XP is possible (Kayne 1994). (Note that some features of how must ‘head’ the whole phrase to trigger pied-piping under wh-movement.) Although (40) rules out the crucial Agree relation in (39b, c), it appears to do so in (39a) as well, predicting (5a) to have the same status as (5b)–(9b), contrary to fact. However, here Starke’s characterization of weak islands (29b) comes in handy. Recall that blocking is obviated if there is an ‘alternative’ Agree relation involving a feature that is absent from the blocker. I will argue that [+NP] corresponds to α in (29b). The idea is that there is another feature that is present and can partake in Agree in (39a), but not in (39b, c). The feature that I will make use of is [+D]. It is often assumed that definite noun phrases are DPs, while indefinites are NPs (see, e.g., Chomsky 1995: 342, 350). Further, indefinites are ambiguous between a DP reading (e.g., specific indefinites) and an NP reading. The ambiguity may account for the two readings in (41).
(41) someone is likely to win the lottery (someone likely/likely someone)
As is well-known, the ambiguity in (41) is missing in existential sentences. (42) there is likely to be a lottery winner (*a winner likely/likely a winner)
Let us take the absence of the wide scope reading in (42) to mean that the indefinite NP in existential constructions is a pure NP (it lacks the DP reading). (This restriction may underlie the well-known definiteness effect.) Now let us go back to the examples in (39) and the intervention effect in (40). The presence of the indefinite feature on how blocks the Agree relation that relates T and the noun phrase in the infinitive complement. This is the α-relation in (29b). I propose that Agree can succeed if a D-feature is involved. This would correspond to the β-relation in (29b). The D-feature is absent from the intervener how. Since the indefinite noun phrase in existential constructions is a pure NP, the β-relation (Agree [+D]) cannot be established in (39b). Likewise, the saving β-relation cannot be established in (39c). As is well-known, portions of idioms are non-referential noun phrases, and thus quite plausibly lack the DP-reading. Summarizing the discussion so far, (39b, c) are excluded due to intervention as in (43).
(43) [ T [how likely [... NP ...]]]
T: [+NP] (α); how: [+WH,+NP] (α); NP: [+NP] (α); the relation between T and NP is blocked (*)
(39a) is rescued by the presence of a D-feature on the noun phrase being attracted.
(44) [ T [how likely [... DP ...]]]
T: [+DP] (β); how: [+WH,+NP] (α); DP: [+DP] (β); the relation between T and DP is established
An interesting, and correct, prediction of the present analysis is that in case an indefinite is used, and attracted in the how-likely frame, it cannot have an indefinite, non-referential reading (that reading would match the indefinite part of how, and the element could not be attracted); it must have a specific, (quasi-)referential reading. As (45) shows, the prediction is borne out. (45) [how likely to win the lottery] is someone from New York (someone likely/*likely someone)
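The intervention configurations in (43) and (44) can be checked mechanically. The sketch below is a hedged Python illustration: the feature sets assigned to each noun phrase, the assumption that T may Agree on either [+NP] or [+D], and all names are hypothetical encodings of the proposal in the text, not an independent formalization.

```python
# Hypothetical feature encoding of (43)/(44); an illustration only.
HOW = {"WH", "NP"}      # wh-part plus indefinite part of how
T_PROBE = {"NP", "D"}   # assumption: T may Agree on [+NP] or on [+D]

NOUN_PHRASES = {
    "John": {"NP", "D"},                  # definite noun phrase: DP
    "a riot": {"NP"},                     # associate of there: pure NP
    "advantage": {"NP"},                  # idiom chunk: non-referential, pure NP
    "someone (specific)": {"NP", "D"},
    "someone (non-specific)": {"NP"},
}

def agree_succeeds(goal, intervener=frozenset()):
    """Agree succeeds if T and the goal share a feature the intervener lacks."""
    return bool((T_PROBE & goal) - intervener)

for name, features in NOUN_PHRASES.items():
    # First value: plain likely (no intervener); second value: how likely.
    print(name, agree_succeeds(features), agree_succeeds(features, HOW))
```

Under these assumptions every noun phrase can Agree with T across plain likely, whereas across how likely only the D-bearing ones can, which mirrors the contrast between (5a) and (5b)/(9b) and the loss of the non-specific reading in (45).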
So far our proposal is able to capture the basic how-likely paradigm without any appeal to remnant movement, distinct LF-component, or move-F, which I take to be desirable. What remains to be explained is the improvement in (10b, c) (repeated). (46) a. how likely is there to be a riot b. how likely is advantage to be taken of John
In an earlier version of the present work (Boeckx 1999), I claimed that extraposition took place in (46). More precisely, I adopted Larson’s 1988 treatment of extraposition as resulting from Light-Predicate Raising (see already Fiengo 1977; see also Kayne 1994, and, for a precise formulation of the Light-Predicate Raising rule, Runner 1995). The derivation I assumed is given in (47).
(47) a. [how [likely [to be a riot]]] → predicate-raising/‘extraposition’
b. [to be a riot]i [how [likely [ti ]]] →
c. is [to be a riot]i [how [likely [ti ]]] (Agree between Infl0 and a riot) → remnant movement
d. [how [likely [ti ]]]j is there [to be a riot]i tj
The crucial step was (47b), which brings the NP past the intervener, allowing the establishment of the α-relation (Agree [+NP]), thus avoiding the fate of (39b, c). However, as pointed out in Nomura (2001) there are at least two problems with the extraposition analysis (problem #1 was also brought to my attention by Klaus Abels, p.c.). First, what prevents extraposition from taking place in (17), repeated here as (48). (48) *who said that there was how likely to be a riot
Indeed, pending a more precise formulation of extraposition in (47), nothing seems to block a derivation like (49). (Irrelevant stages omitted.)
(49) a. [how [likely [to be a riot]]] → predicate raising/‘extraposition’
b. [to be a riot]i [how [likely [ti ]]] → remnant movement
c. is [[how [likely [ti ]]]j [to be a riot]i [tj ]] Agree between Infl0 and a riot
d. there is [[how [likely [ti ]]]j [to be a riot]i [tj ]] →
e. who said that there is [[how [likely [ti ]]]j [to be a riot]i [tj ]]
The problematic step here is (49c), where an Agree/α-relation can be established. By remnant-moving how likely ... in (49b), the (almost) c-command relation between how and the lower NP (a riot) is broken, preventing intervention from taking place. (49) thus predicts (17) to be grammatical (a similar problem arises for (18), which I won’t illustrate here). A second problem for the extraposition approach is that the extraposition step in (47a) patterns unlike familiar instances of extraposition (see Nomura 2001). (The contrasts are subtle, but nonetheless significant.) As shown in (51), wh-movement out of an extraposed infinitival is degraded.
(50) whati did Bill ask Mary [to fix ti ] yesterday
(51) ??whati did Bill ask Mary yesterday [to fix ti ]
If wh-movement out of an extraposed clause crosses a wh-island, the result is ungrammatical (more so than if extraposition does not take place, (53)). (52) *whati did you wonder [how nicely Bill asked Mary yesterday [to fix t i ]] (53) ?*whati did you wonder [how nicely Bill asked Mary [to fix t i ] yesterday]
Crucially, the deviance in (52) is not replicated to the same degree in parallel cases involving how likely.
(54) ??whati do you wonder [how likely Bill is [to fix ti ]]
(55) ???whati do you wonder [how likely Bill is [to ask Mary to fix ti ]]
As I see no straightforward solution to the problems raised for the extraposition account by Nomura (2001), I reject the solution to (10b, c) offered in Boeckx (1999), and turn to an alternative. The proposal I would like to make is based on an intuition going back to Rosenbaum (1967: 108, n. 1).14 The idea is that what are traditionally referred to as raising adjectives (likely, certain) are in fact “peculiar adverbs” (Rosenbaum’s term). Treating raising adjectives as “adjuncts” opens up a new possibility of dealing with (10b, c). It has become popular since Lebeaux (1988) to view adjuncts as being inserted acyclically (in contrast to complements, which conform to a strict view of the cycle, along the lines of Chomsky’s 1993 Extension Condition). (See, e.g., Chomsky 1993 and much subsequent work.) Suppose then that there are two ways of inserting how likely: cyclically (in which case it takes the to-infinitive as its complement), or ‘acyclically,’ separately from its alleged complement (the to-infinitive). If the latter option is chosen, a sentence like (56) is derived as in (57).
(56) how likely is John to win
(57) a. is [John to win] → attraction of John
b. [Johni [is [ti to win]]] → acyclic insertion of how likely15
c. [Johni [is [how likely] [ti to win]]] → wh-movement
d. [[how likely]j isk [Johni [tk [tj ] [ti to win]]]]
Acyclic insertion essentially allows us to treat how likely as a constituent, independent from the to-infinitive. This option therefore does not affect cases where wh-movement of how likely pied-pipes the infinitive. This is an important point because if acyclic insertion could be involved in ‘pied-piping cases,’ nothing would prevent a derivation like (58), which is clearly unwanted, as it would incorrectly rule in cases like (5b).16
(58) a. is [to be a riot] → Agree (T,riot) (+ there-insertion)
b. [there [is [to be a riot]]] → acyclic insertion of how likely
c. [there [is [how likely] [to be a riot]]] → wh-movement (+ pied-piping)
d. [[how likely [to be a riot]]j isk [there [tk [tj ]]]]
Acyclic insertion of how likely does not face the second problem raised by Nomura for the extraposition account. In the absence of extraposition, we do not expect (52) and (54) to pattern the same way. As for the first problem (how likely in situ, as in (17)), it also does not arise under acyclic insertion. If how likely is an adjunct, (17) reduces to (59). That is, whatever excludes in-situ wh-adjuncts in multiple questions will exclude the acyclic insertion option for a case like (17). The only option available will be cyclic insertion, which gives rise to intervention, as demonstrated above.
(59) *who left why
The acyclic insertion analysis of how likely thus seems superior to the extraposition analysis in accounting for the grammaticality of (10b, c), without running afoul of (17)–(18).
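Putting the two ingredients together (intervention under cyclic insertion, and the restricted availability of acyclic insertion), the predictions for the whole paradigm can be tabulated. The Python sketch below is a rough illustration under stated assumptions: the three configuration labels, the table recording which noun phrases bear a D-feature, and the function names are hypothetical shorthand for the analysis in the text.

```python
# Hypothetical summary of the analysis; labels and names are illustrative only.
GOAL_HAS_D = {"John": True, "a riot": False, "advantage": False}

CONFIGS = ("infinitive_pied_piped", "how_likely_fronted_alone", "how_likely_in_situ")

def acyclic_available(config):
    """Acyclic insertion is assumed to be possible only when how likely fronts alone.

    It is excluded under pied-piping (how likely must take the infinitive as
    its complement) and in situ (on a par with *who left why).
    """
    return config == "how_likely_fronted_alone"

def predicted_grammatical(goal, config):
    # Cyclic insertion makes how an intervener, so Agree needs a D-feature.
    cyclic_ok = GOAL_HAS_D[goal]
    return cyclic_ok or acyclic_available(config)

for goal in GOAL_HAS_D:
    print(goal, [predicted_grammatical(goal, c) for c in CONFIGS])
```

On this encoding John is predicted to be fine in all three configurations, cf. (5a), (10a) and (19), while a riot and advantage are predicted to be fine only when how likely fronts alone, cf. (5b)/(9b), (10b, c) and (17)–(18).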
Conclusion
To conclude, I have examined a paradigm first discussed by Kroch and Joshi (1985) which Lasnik (in press) took as an argument for feature-movement. I have shown that Lasnik’s solution is problematic on several grounds. In particular, it fails to provide a solution for part of the paradigm (the idiom case) and moreover leads to an organization of the grammar that contains a separate LF component and feature chains. I have offered an alternative approach to the how likely paradigm that deeply implicates Relativized Minimality. Not only does the present analysis capture the full paradigm straightforwardly, it also need not assume the existence of feature-movement chains or of a distinct LF-component.17 In so doing, the present analysis lends credence to the conceptually more elegant mechanism of Agree and the One-cycle model of syntax. Finally, the account reconciles the how likely paradigm with independent conclusions about remnant movement, which previous analyses (Müller 1998) had failed to do.
Notes
For helpful comments on an early version of this paper, I thank Željko Bošković, Howard Lasnik, Adolfo Ausín, Koji Sugisaki, and especially Jairo Nunes. I particularly appreciate comments by Klaus Abels and Masashi Nomura which led to a sharpening of the hypothesis I first entertained, and considerable improvement of its technical implementation. I am grateful to Masashi Nomura for making available to me a draft of Nomura (2001), where Lasnik’s solution is also examined. (Klaus Abels also discusses aspects of the how likely paradigm in work in progress to which I have not had access.) Finally, I thank Artemis Alexiadou for her interest in this paper, and her offer to include it in the present volume.
Saito (2001) argues that remnant movement is possible if and only if the remnant contains no trace (in his terms, if the remnant is a complete constituent). Besides the unclear status of what it means to contain no ‘trace’ under the copy theory of movement (Chomsky 1995), Saito’s account is designed to rule in only those cases of remnant movement out of which A-movement has taken place (following Lasnik 1999a, Saito assumes that A-movement leaves no trace). Müller contains many examples which appear to falsify Saito’s proposal, as they arguably involve A-bar movement.
Kayne (2001) notes that the predicate part of small clauses patterns like idioms in this respect.
(i)
a. the winneri is likely to be [Small Clause John t] b. *[how likely to be John t i ]j is the winneri t j
I will not discuss such examples here, as the simple cases of raised predicates like (i-a) pose non-trivial questions (why is there no Relativized Minimality effect when an NP, the winner, crosses another one, John?), to which I cannot provide an answer here. If the exact nature of the first step of movement out of the small clause to SpecIP is ignored, the solution I offer in the text for (5b)–(9b) extends to (i-b). . The raising-control ambiguity appears to be supported by Martin’s (1992, 1996) argument, based on observations in Barss (1986), that a sentence like (i) allows for only one reading of the indefinite, in contrast with regular raising cases (see (ii)), but alongside with control predicates (see May 1985) (iii). (i)
[How likely to win the lottery] is someone from NY (someone likely/*likely someone)
(ii) someone from NY is likely to win the lottery (someone likely/likely someone) (iii) A unicorn is eager to be apprehended (a unicorn eager/*eager a unicorn) Setting aside the issue of whether control predicates do indeed block reconstruction (see Hornstein 2001: 139–140), I note that Sauerland (1999) provides an analysis of the contrast between (i) and (ii) that is independent from raising vs control. The explanation I provide below for the how likely paradigm also captures the asymmetry. . Contrast: (i)
who said that it was how likely that there will be a riot
. Some speakers find (19) marginal (see note 14 below for a possible explanation for this). Crucially, they still perceive a contrast between (17)/(18) and (19). (Thanks to Masashi Nomura for discussion of the relevant examples.)
For a different tack on the overt/covert asymmetry, see Nissenbaum (2000) and Chomsky (2001b), where covert movement is taken to be phrasal movement taking place after (cyclic) spell-out. I will not discuss this option here, as many details still remain to be worked out.

The LF-intervention effects on A-bar movement discussed in Pesetsky (2000) seem to provide another argument in favor of non-phrasal ‘covert’ operations.

Lasnik also points out that the Move-F analysis captures the paradigm discussed in Den Dikken (1995) (see also Lasnik & Saito 1991 for similar examples in ECM-contexts), which is problematic under expletive replacement.
(i) a. Some applicants seem to each other to be eligible for the job
    b. No applicants seem to any of the deans to be eligible for the job
(ii) a. *there seem to each other to be some applicants eligible for the job
     b. *there seem to any of the deans to be no applicants eligible for the job
As the data in (ii) show, the associate is incapable of licensing an NPI/anaphor located in the matrix clause, which is unexpected under the expletive replacement analysis, since according to the latter (i) and (ii) share the same LFs. Lasnik takes the ungrammaticality of the sentences in (ii) to mean that such licensing mechanisms require more than formal features. But see Branigan (1999), Yatsushiro (1999), and Watanabe (2000) for some arguments that binding (but not scope) can be established via feature movement.

Strictly speaking, this is not Cinque’s (1990) view. According to him, all islands are absolute. Apparent (good) extractions out of islands are cases of base-generation and (non-overt) resumption. I ignore this detail here (for arguments that resumption demands a movement approach, see Boeckx 2001).

I will not be concerned here with defining the relevant factors easing extraction out of weak islands. For valuable discussion, see Frampton (1999), Szabolcsi and Den Dikken (1999), and Starke (2001).

Masashi Nomura (p.c., attributing the original observation to Howard Lasnik) points out that if the proposal I make is correct, we expect no substantial difference between (i) and (ii). My proposal indeed seems to predict a Relativized Minimality effect triggered by somewhat in (ii).
(i) (who said that) there is how likely to be a riot
(ii) there is somewhat likely to be a riot
The fact of the matter is that for some speakers (ii) is much better than (i). To account for the contrast, I am forced to say that for those speakers, there must be a structural difference between (i) and (ii). In other words, for them, the (offending) indefinite part of the degree modifier is (structurally) more prominent in (i) than it is in (ii) (i.e., it c-commands a riot). Intuitively, this seems correct. In (i), how must project in some way or other in order for the complex how likely to act as a wh-phrase. Exactly how to formulate this prominence distinction is a task I leave for future research.

Properly speaking, the intervention effect would be of the type that Chomsky (2000: 123) characterizes as a “defective intervention effect.” Defective intervention arises when an element α matches the featural requirements of a probe P, but fails to agree with it. (In other words, γ blocks the raising of β to α even though γ itself cannot raise to α.)

Howard Lasnik (p.c.) informs me that Noam Chomsky has made a proposal similar to Rosenbaum’s at various times. Neither Lasnik nor I have been able to locate the proposal in Chomsky’s writings. It must therefore have been made in class lectures.

Alternatively, one could follow Rizzi (1990), Uriagereka (1988), Law (1991, 1993), Boeckx (2001), and Starke (2001), among others, and take wh-adjuncts to be directly inserted in COMP. As far as I can see, this possibility yields identical results to the text discussion (i.e., it is unavailable in the ‘pied-piping’ cases and in the in-situ examples, for reasons I discuss in the text immediately below). Klaus Abels points out that many speakers of English find long-distance questions with how likely like (i) deviant.
(i) *?how likely did John say that there was to be a riot
This suggests that how likely patterns like how come, which is known to lack long-distance construal.
(ii) *how come did John say that Mary left t
The facts in (i) and (ii) may demand an analysis in terms of base-generation in [+wh] SpecCP for how come and how likely. Such an analysis may account for the fact that some speakers find (19) deviant.

Unless Relativized Minimality is viewed as a condition on representation (see Rizzi 1986). I reject this option as I follow Chomsky (2000) in taking Agree to be derivationally established.

Lasnik’s (in press) second argument in favor of feature movement (based on the interaction of head-movement and ellipsis) is equally problematic, as discussed in Boeckx and Stjepanović (2001).
References
Barbiers, S. (1995). The syntax of interpretation. The Hague: HAG.
Barss, A. (1986). Chains and Anaphoric Dependence. Doctoral dissertation, MIT.
Bobaljik, J. D. (1995). Morphosyntax: The syntax of verbal inflection. Doctoral dissertation, MIT.
Bobaljik, J. D. (2001). A-chains at the Interfaces: Copies, agreement, and “covert” movement. Ms., McGill University.
Bobaljik, J. D. & S. Wurmbrand (2000). Modals, Raising, and Reconstruction. Ms., McGill University.
Boeckx, C. (1999). Agree or Attract? Ms., University of Connecticut.
Boeckx, C. (2001). Mechanisms of Chain Formation. Doctoral dissertation, University of Connecticut.
Boeckx, C. & S. Stjepanović (2001). Head-ing toward PF. Linguistic Inquiry, 32, 345–355.
Bošković, Ž. (1997). The Syntax of Non-finite Complementation: An economy approach. Cambridge, MA: The MIT Press.
Bošković, Ž. (1998). LF-movement and the Minimalist Program. In P. N. Tamanji and K. Kusumoto (Eds.), Proceedings of NELS 28 (pp. 43–57). University of Massachusetts, Amherst: GLSA.
Bošković, Ž. (2000). What is Special about Multiple Wh-fronting. Ms., University of Connecticut.
Bošković, Ž. (2001a). A-movement and the EPP. Ms., University of Connecticut.
Bošković, Ž. (2001b). On the Nature of the Syntax-phonology Interface. London: Elsevier.
Branigan, P. (2000). Binding Effects with Covert Movement. Linguistic Inquiry, 31, 553–557.
Brody, M. (1995). Lexico-logical form: A radically minimalist program. Cambridge, MA: The MIT Press.
Brody, M. (1997). Perfect Chains. In L. Haegeman (Ed.), Elements of grammar (pp. 139–167). Dordrecht: Kluwer.
Brody, M. (2000). Mirror Theory. Linguistic Inquiry, 31, 29–56.
Brody, M. & A. Szabolcsi (2000). Overt Scope: A case study in Hungarian. Ms., University College London and New York University.
Chomsky, N. (1955). The Logical Structure of Linguistic Theory. Ms., Harvard University. (Published in part, 1975, New York: Plenum.)
Chomsky, N. (1964). Current Issues in Linguistic Theory. The Hague: Mouton.
Chomsky, N. (1986). Knowledge of Language. New York: Praeger.
Chomsky, N. (1991). Some Notes on Economy of Derivation and Representation. In R. Freidin (Ed.), Principles and Parameters in Comparative Grammar (pp. 417–454). Cambridge, MA: The MIT Press.
Chomsky, N. (1993). A Minimalist Program for Linguistic Theory. In K. Hale and S. J. Keyser (Eds.), The view from Building 20 (pp. 1–52). Cambridge, MA: The MIT Press.
Chomsky, N. (1995). Categories and Transformations. In The Minimalist Program (pp. 219–394). Cambridge, MA: The MIT Press.
Chomsky, N. (2000). Minimalist Inquiries: The framework. In R. Martin, D. Michaels and J. Uriagereka (Eds.), Step by Step (pp. 89–155). Cambridge, MA: The MIT Press.
Chomsky, N. (2001a). Derivation by Phase. In M. Kenstowicz (Ed.), Ken Hale: A life in language (pp. 1–50). Cambridge, MA: The MIT Press.
Chomsky, N. (2001b). Beyond Explanatory Adequacy. Ms., MIT.
Cinque, G. (1990). Types of A-bar dependencies. Cambridge, MA: The MIT Press.
Den Dikken, M. (1995). Binding, Expletives, and Levels. Linguistic Inquiry, 26, 347–354.
Fiengo, R. (1977). On Trace Theory. Linguistic Inquiry, 8, 35–61.
Frampton, J. (1999). The Fine Structure of Wh-movement and the Proper Formulation of the ECP. The Linguistic Review, 16, 43–61.
Groat, E. & J. O’Neil (1996). Spell-out at the LF Interface. In W. Abraham, S. D. Epstein, H. Thráinsson, and C. J.-W. Zwart (Eds.), Minimal Ideas (pp. 113–139). Amsterdam: John Benjamins.
Hornstein, N. (2001). On A-chains: A reply to Brody. Syntax, 3, 129–143.
Kayne, R. S. (1994). The Antisymmetry of Syntax. Cambridge, MA: The MIT Press.
Kayne, R. S. (1998). Overt vs. Covert Movement. Syntax, 1, 128–191.
Kayne, R. S. (2001). Raising, Reconstruction, and Remnant Movement. Talk given at the Asymmetry conference, May 2001, UQAM.
Katz, J. & P. Postal (1964). An Integrated Theory of Linguistic Descriptions. Cambridge, MA: The MIT Press.
Koopman, H. & A. Szabolcsi (2000). Verbal Complexes. Cambridge, MA: The MIT Press.
Kratzer, A. (1991). Modality. In A. von Stechow and D. Wunderlich (Eds.), Semantics. An international handbook of contemporary research (pp. 639–650). Berlin: de Gruyter.
Kroch, A. & A. Joshi (1985). The Linguistic Relevance of Tree-adjoining Grammar. Ms., University of Pennsylvania.
Larson, R. (1988). On the Double Object Construction. Linguistic Inquiry, 19, 335–391.
Lasnik, H. (1992). Case and Expletives: Notes toward a parametric account. Linguistic Inquiry, 23, 381–405.
Lasnik, H. (1999a). Chains of Arguments. In S. D. Epstein and N. Hornstein (Eds.), Working Minimalism (pp. 189–215). Cambridge, MA: The MIT Press.
Lasnik, H. (1999b). Minimalist Analysis. Oxford: Blackwell.
Lasnik, H. (1999c). On Feature Strength: Three minimalist approaches to overt movement. Linguistic Inquiry, 30, 197–217.
Lasnik, H. (In press). Feature Movement or Agreement at a Distance? In A. Alexiadou, E. Anagnostopoulou, S. Barbiers and H.-M. Gaertner (Eds.), Remnant Movement, F-movement and the T-model. Amsterdam: John Benjamins.
Lasnik, H. & M. Saito (1991). On the Subject of Infinitives. In L. Dobrin, L. Nichols and R. Rodriguez (Eds.), Papers from the 27th Regional Meeting of CLS (pp. 324–343). University of Chicago, IL.
Lasnik, H. & M. Saito (1992). Move α. Cambridge, MA: The MIT Press.
Law, P. S. (1991). Effects of Head-movement on Theories of Subjacency and Proper Government. Doctoral dissertation, MIT.
Law, P. S. (1993). On the Base Position of Wh-adjuncts and Extraction. Paper presented at the 67th Annual Meeting of the Linguistic Society of America, Los Angeles.
Mahajan, A. (2000). Eliminating Head-movement. Ms., University of California, Los Angeles.
Martin, R. (1992). On the Distribution and Case Features of PRO. Ms., University of Connecticut.
Martin, R. (1996). A Minimalist Theory of PRO and Control. Doctoral dissertation, University of Connecticut.
May, R. (1985). Logical Form. Cambridge, MA: The MIT Press.
Moro, A. (1997). The Raising of Predicates. Cambridge: CUP.
Müller, G. (1998). Incomplete Category Fronting. Dordrecht: Kluwer.
Newmeyer, F. (1975). English Aspectual Verbs. The Hague: Mouton.
Nishioka, N. (1997). On Trace Movement. English Linguistics, 14, 182–202.
Nissenbaum, J. (2000). Explorations in Covert Phrase Movement. Doctoral dissertation, MIT.
Nomura, M. (2001). Extraposition or Scattered Deletion. Ms., University of Connecticut.
Nunes, J. (1999). Linearization of Chains and the Phonetic Realizations of Chain Links. In S. D. Epstein and N. Hornstein (Eds.), Working Minimalism (pp. 217–249). Cambridge, MA: The MIT Press.
Nunes, J. (To appear). Sideward Movement and Linearization of Chains in the Minimalist Program. Cambridge, MA: The MIT Press.
Ochi, M. (1999a). Constraints on Feature Checking. Doctoral dissertation, University of Connecticut.
Ochi, M. (1999b). Some Consequences of Attract-F. Lingua, 109, 81–109.
Pesetsky, D. (2000). Phrasal Movement and its Kin. Cambridge, MA: The MIT Press.
Rizzi, L. (1986). On Chain Formation. In H. Borer (Ed.), Syntax and Semantics, 19: The syntax of pronominal clitics (pp. 65–95). Orlando: Academic Press.
Rizzi, L. (1990). Relativized Minimality. Cambridge, MA: The MIT Press.
Rosenbaum, P. S. (1967). The Grammar of English Predicate Complement Constructions. Cambridge, MA: The MIT Press.
Runner, J. (1995). The Licensing of Noun Phrases. Doctoral dissertation, University of Massachusetts, Amherst.
Saito, M. (2001). A Derivational Approach to the Interpretation of Scrambling Chains. Ms., Nanzan University.
Sauerland, U. (1999). Scope Reconstruction Without Reconstruction. In Proceedings of WCCFL 17 (pp. 582–596). Stanford, CA: CSLI.
Starke, M. (2001). Move Dissolves into Merge: A theory of locality. Doctoral dissertation, University of Geneva.
Szabolcsi, A. & M. den Dikken (1999). Islands. Ms., New York University and CUNY.
Szabolcsi, A. & F. Zwarts (1993). Weak Islands and Algebraic Semantics for Scope Taking. Natural Language Semantics, 1, 235–284.
Uriagereka, J. (1988). On Government. Doctoral dissertation, University of Connecticut.
Uriagereka, J. (1999). Minimal Restrictions on Basque Movements. Natural Language and Linguistic Theory, 17, 403–444.
Vukić, S. (1999). Attract F and the Minimal Link Condition. Linguistic Analysis, 28, 185–226.
Watanabe, A. (2000). Feature Copying and Binding: Evidence from Complementizer Agreement and Switch Reference. Syntax, 3, 159–181.
Wurmbrand, S. (1998). Infinitives. Doctoral dissertation, MIT.
Wurmbrand, S. (2001). Move or Agree? Paper presented at WCCFL 20, February 2001, University of Southern California.
Yatsushiro, K. (1999). Case Licensing and VP Structure. Doctoral dissertation, University of Connecticut.
Distributed deletion*
Gisbert Fanselow and Damir Ćavar
University of Potsdam

1. Introduction
DPs and PPs often surface in a discontinuous manner. Standard Wh-movement extracts constituents out of DPs and PPs (1). Quantifiers may appear to the right of the DP which they modify semantically (as in (2)). According to Sportiche (1988), this construction emerges by the stranding of the quantifier when DP moves to Spec,IP. Whether “extraposition from NP” in (3) involves rightward movement depends on the status of the antisymmetry hypothesis (Kayne 1994; Chomsky 1995), but independent considerations may militate against a rightward movement explanation as well (see Culicover & Rochemont 1990). Noun incorporation also gives rise to discontinuous noun phrases, as (4) illustrates for Greenlandic. Finally, DPs and PPs may simply be ‘split’ in a considerable number of languages such as German, Croatian, Polish, Russian, Hungarian, Finnish, Latin, Ancient Greek, and Warlpiri, as (5) and (6) illustrate.

(1) Who did you see a photo of ?
(2) The students have all written a paper on logic.
(3) A book appeared about Chomsky.
(4) Marlun-nik ammassat-tur-p-u-nga. (Greenlandic, Geenhoven 1998:16)
    two-inst.pl sardine-eat-ind-[-tr]-1sg
(5) a. Interessante Bücher hat sie mir keine aus Indien empfohlen. (German)
       interesting books has she me none from India recommended
       “She has not recommended any interesting books from India to me.”
    b. Knjige mi je Marija zanimljive preporučila. (Croatian)
       books me has Mary interesting recommended
       “Mary has recommended interesting books to me.”
    c. Książki mi Marek interesujące zaproponował. (Polish)
       books me Marek interesting suggested
(6) a. Mit was hast du für Frauen gesprochen? (German)
       with what have you for women spoken
       “With what kind of women did you speak?”
    b. Na kakvo se Ivan stablo penje? (Croatian)
       on what-kind-of self I. tree climbs
       “On what kind of tree does Ivan climb?”
    c. Na jakie się Marek drzewo wspina? (Polish)
       on what-kind-of self M. tree climbs
The empirical focus of the present article lies on the constructions in (5)–(6), which we will call (XP-) split constructions.1 The standard analysis of (5) was proposed by van Riemsdijk (1989): the part of the XP that appears in clause-initial position is moved out of XP, stranding the material left behind. If left branch extraction is impossible, the analysis of (6) must be more complex. It involves remnant movement of an XP out of which some material has been extracted before it was placed into the clause-initial slot. However, movement analyses face serious problems with respect to syntactic islands and the phonetic shape of the parts of the split phrase (the “regeneration” problem discovered by van Riemsdijk). We will argue that these problems render a simple movement analysis of the XP-split construction impossible. However, the construction does not seem amenable to a treatment in which both parts are base-generated in situ, either. A way out of this apparent paradox is offered by the copy & deletion (cd-) approach to movement (Chomsky 1995) if it is implemented in such a way that the deletion operation following the copying step of movement may affect both copies. The cd-approach offers a unified analysis for both types of constructions, i.e., DP-splits as in (5) and PP-splits as in (6). How such a derivation may proceed is illustrated in (7) for Croatian.

(7) mi je Marija zanimljive knjige preporučila
    me has Mary interesting books recommended
    → Complete copying
    zanimljive knjige mi je Marija zanimljive knjige preporučila
    → Partial deletion in upper copy
    zanimljive knjige mi je Marija zanimljive knjige preporučila
    → Complementary deletion in lower copy
    zanimljive knjige mi je Marija zanimljive knjige preporučila
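The three steps of a copy & deletion derivation can also be sketched procedurally. The following Python fragment is an illustration only and rests on simplifying assumptions that are not part of the analysis: clauses are flat lists of words, the moved phrase is contiguous in its base position, and the names `distributed_deletion` and `keep_upper` are hypothetical. Which words are pronounced in which copy is supplied by the caller; the sample call keeps the noun in the upper copy and the adjective in the lower one, which corresponds to the surface order of (5b).

```python
def distributed_deletion(clause, phrase, keep_upper):
    """Sketch of copying plus distributed deletion over flat word lists.

    clause     -- words of the clause before movement
    phrase     -- the contiguous words of the XP that is copied
    keep_upper -- words of the XP pronounced in the upper (fronted) copy
    """
    n = len(phrase)
    # Locate the XP in its base position (raises StopIteration if absent).
    start = next(i for i in range(len(clause) - n + 1)
                 if clause[i:i + n] == phrase)
    # Step 1, complete copying: a full copy of the XP is placed clause-initially.
    upper_copy, lower_copy = list(phrase), list(phrase)
    # Step 2, partial deletion in the upper copy.
    upper_pronounced = [w for w in upper_copy if w in keep_upper]
    # Step 3, complementary deletion in the lower copy.
    lower_pronounced = [w for w in lower_copy if w not in keep_upper]
    return upper_pronounced + clause[:start] + lower_pronounced + clause[start + n:]


base = ["mi", "je", "Marija", "zanimljive", "knjige", "preporučila"]
xp = ["zanimljive", "knjige"]
print(" ".join(distributed_deletion(base, xp, {"knjige"})))
```

Passing `{"zanimljive"}` as `keep_upper` instead leaves the adjective in the fronted copy and the noun in the lower one, so the same procedure covers order-preserving and order-inverting outcomes in this toy setting.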
This account of XP-splitting may suggest itself, so the major virtue of the present paper lies in the presentation of the empirical arguments in its favor, and in the development of the approach in some detail. Our account may also be applicable to (3) and (4), and its general idea seems helpful for a number of further puzzles of syntax. The article is structured as follows. Section 2 introduces core properties of the split construction, and distinguishes two types of splits. Section 3 is dedicated to a discussion and refutation of previous analyses. Sections 4 and 5 present the distributed deletion theory in some detail. Section 6 briefly discusses loose ends and possible extensions of the present approach.
2. Some core properties
In an XP-split construction, the phonetic material of a single phrase appears in more than one position. There is no principled limit to the number of slots on which a phrase can be scattered, as German (8a) and Croatian (8b) illustrate. Similarly, more than one phrase can be split up in a single clause,2 as (9) shows for German and Polish:

(8) a. Bücher hat man damals interessante in den Osten keine mitnehmen dürfen.
       books has one then interesting in the East no with-take may
       “As for books, one could not take any interesting ones to the East then”
    b. Koje je Ivan zanimljive kupio knjige.
       which is Ivan interesting bought books
       “Which interesting books did Ivan buy?”
(9) a. Sonaten haben Frauen bislang nur wenige welche geschrieben.
       sonatas have women up to now only few some written
       “As for sonatas: Up to now, only few women have composed some”
    b. Piotr powiedział, że żaden ładnei chłopiec dziewczyny nie zignoruje.
       Piotr said that not-one beautiful boy girl not ignores
       “Piotr said that no boy ignores beautiful girls”
XP-splits arise in wh-movement contexts, as (10) illustrates, and when there is focus/topic movement to various positions, as exemplified in (8). Frey (2000) argues from contrasts such as the one in (11) that XP-splits are confined to movement to topic/focus-positions (preceding the normal position of sentential adverbials in German), and do not arise in the context of standard (A-) scrambling (targeting positions following sentence level adverbs). Thus, it seems that XP-splits are confined to operator movement.

(10) a. Na kakav je Ivan krov skočio? (Croatian)
        on what-kind has Ivan roof jumped?
        “On what kind of roof has Ivan jumped?”
     b. Wieviel hat er Schweine gekauft? (German)
        how many has he pigs bought
        “How many pigs has he bought?”
(11) a. dass er teure Bücher wahrscheinlich der Frau keine schenken wollte.
        that he expensive books probably the.dat woman no give wanted
        “. . . that he probably did not want to give the woman expensive books as a present.”
     b. ?*dass er wahrscheinlich teure Bücher der Frau keine schenken wollte.
XP-splits come in two varieties. XPs can simply be pulled apart (Pull-splits), leaving XP-internal order intact. This is illustrated in (12). German differs from the Slavic languages in allowing pull-splits for simple wh-extraction only (13).

(12) Na kakav je Ivan krov skočio? (Croatian)
     on what-kind has Ivan roof jumped?
     “On what kind of roof has Ivan jumped?”
(13) a. Wieviel hat er Bücher gelesen?
        how many has he books read
        “How many books has he read?”
     a′. Wieviel Bücher hat er gelesen?
     b. *Keine hat er Bücher gelesen.
        no has he books read
        “He has not read any books.”
     b′. Keine Bücher hat er gelesen.
The internal order of the XP can also be inverted in the split construction, as illustrated in (14). Inverted splits are well-formed for noun phrases only, but not for PPs (15). Therefore, PP-splits are confined to wh-movement in German.
(14) a. Crveni je Ivan auto kupio. (Croatian)
        red has Ivan car bought
        “Ivan has bought a red car”
     a′. Auto je Ivan crveni kupio.
     b. Autos besitzt er (nur) schnelle. (German)
        cars owns he only fast
        “As for cars, he owns only fast ones.”
     b′. *(Nur) schnelle besitzt er (nur) Autos.
(15) a. Na kakav je Ivan krov skočio? (Croatian)
        on what-kind has Ivan roof jumped?
        “On what kind of roof has Ivan jumped?”
     b. *Krov je Ivan na kakav skočio?
PPs can be torn apart in a different way in German (and certain dialects of Croatian), however. In (16), the prepositional head of the PP appears in both parts, while the DP-part of the PP is split in the inverted way (as compared to in keinen Schlössern “in no castles,” *in Schlössern keinen).

(16) In Schlössern habe ich noch in keinen gewohnt
     in castles have I yet in no lived
     “I have not yet lived in any castles.”
The core properties of XP-splits can thus be summarized as follows:
a. XP-splits arise in the context of operator movement only.
b. XP-splits can retain or invert the order of the elements found in the continuous counterpart. The latter type of split cannot show up with PPs; it is replaced by a construction that differs from XP-splits only in the presence of copies of the preposition in all slots where parts of the PP appear.
c. Pull splits do not show up for all types of operator movement in German.
3. Previous analyses
3.1 Simple movement theories
The standard account for XP-discontinuity is movement. In (17a) the verb phrase is serialized discontinuously, because who has been extracted from it. That (17b) involves movement, too, seems to be the standard view, though alternative accounts have been proposed (see Horn 1975). The null hypothesis for XP-splits thus should also involve the creation of discontinuity by movement, as has been proposed for German by van Riemsdijk (1989), Tappe (1989), Diesing (1992), and Kniffka (1996), among others, by Franks and Progovac (1994) for Croatian, and by Yearley (1993) and Sekerina (1997) for Russian.

(17) a. who did you [VP see t] ?
     b. who did you see [DP a picture of t] ?
At early stages of generative theory, movement analyses for XP-splits were confronted with the problem that movement is restricted to minimal or maximal projections, while the analysis of split noun phrases seems to presuppose that submaximal projections are moved, cf. (18) for an illustration, and Fanselow (1988) for the pertinent argument.

(18) a. Sie hat keine interessanten neuen Bücher gekannt.
        she has no interesting new books known
        “She did not know any interesting new books.”
     b. [ Bücher ]_i hat sie [ keine interessanten neuen t_i ] gekannt.
     c. [ Neue Bücher ]_i hat sie [ keine interessanten t_i ] gekannt.
     d. [ Interessante neue Bücher ]_i hat sie [ keine t_i ] gekannt.
     e. [ Keine interessanten neuen Bücher ]_i hat sie t_i gekannt.
As (18) shows, any segment of [keine [interessanten [neuen [Bücher]]]] can undergo movement in an extraction account of XP-splits, and at first glance, only one of these segments can be maximal. But, as was noted by Tappe (1989) and Kniffka (1996), this line of reasoning is problematic because of the additional layers of functional structure that have been discovered in the DP, following the seminal work of Abney (1987) – in fact, the movement facts of (18) themselves constitute evidence for an elaborate internal structure of noun phrases, which might look as in (19). A movement analysis of inverted splits can thus pick any of the functional projections in the noun phrase, and move it to the front. (19) [DP [D keine] [AGR-A1-P [AP interessanten] [[AGR-A1 e] [AGR-A2-P [AP neuen ] [[AGR-A2 e][Nom-P Bücher]]]]]]
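The observation that any right-edge segment of the noun phrase in (18) can be fronted may be made concrete with a small enumeration. The Python sketch below is an illustration only: it treats the noun phrase as a flat list of words, it ignores the change in adjectival inflection between (18a) and (18b–e), and the function name `right_edge_segments` is hypothetical.

```python
def right_edge_segments(layers):
    """Return (fronted, stranded) pairs, smallest fronted segment first.

    Each pair cuts the word list at one boundary: everything to the right
    of the cut is fronted, everything to the left is stranded.
    """
    return [(layers[i:], layers[:i]) for i in range(len(layers) - 1, -1, -1)]


noun_phrase = ["keine", "interessanten", "neuen", "Bücher"]
for fronted, stranded in right_edge_segments(noun_phrase):
    # Printed as: fronted part | stranded part
    print(" ".join(fronted), "|", " ".join(stranded))
```

The four pairs correspond, in order, to the cuts shown in (18b) through (18e); on a structure like (19), each fronted segment would be one of the functional projections of the noun phrase.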
Pull splits require a slightly more complex derivation. Since P+Det does not form a constituent, the derivation of (20) must involve remnant movement in the sense of den Besten & Webelhuth (1990), Müller (1998), see, e.g., Corver (1990) and the discussion in Sekerina (1997): first, krov is extracted from na kakav krov (this involves an inverted split), then [na kakav t] is moved to sentence initial position.
(20) Na kakav je Ivan krov skočio? (Croatian)
     on what-kind has Ivan roof jumped?
     “On what kind of roof has Ivan jumped?”
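The two-step remnant movement derivation assumed for (20) can be sketched as follows. This Python fragment is a toy model under stated assumptions: clauses are flat word lists, the trace is written as the string "t", and the extracted noun is assumed to land immediately to the left of the PP, a placement chosen here only for illustration. The function name `remnant_movement` is hypothetical.

```python
TRACE = "t"


def remnant_movement(clause, phrase, extractee):
    """Return (after_step_1, after_step_2, surface) for a remnant movement sketch."""
    n = len(phrase)
    # Locate the phrase in its base position (raises StopIteration if absent).
    start = next(i for i in range(len(clause) - n + 1)
                 if clause[i:i + n] == phrase)
    # Step 1: extract one word out of the phrase, leaving a trace behind.
    # Assumption: the extractee lands immediately to the left of the phrase.
    remnant = [TRACE if w == extractee else w for w in phrase]
    step1 = clause[:start] + [extractee] + remnant + clause[start + n:]
    # Step 2: move the remnant (which still contains the trace) to the front.
    step2 = remnant + clause[:start] + [extractee] + clause[start + n:]
    # Traces are not pronounced.
    surface = [w for w in step2 if w != TRACE]
    return step1, step2, surface


clause = ["je", "Ivan", "na", "kakav", "krov", "skočio"]
pp = ["na", "kakav", "krov"]
for stage in remnant_movement(clause, pp, "krov"):
    print(" ".join(stage))
```

The last printed stage is the word order of (20). Note that the sketch performs the first step freely, so it says nothing about whether extraction out of the PP is independently licensed.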
This analysis has the advantage of reducing pull splits to inverted splits followed by remnant movement, and thus seems to explain why languages allow pull splits only if inverted ones are licensed, too. However, it faces the problem that PPs disallow inverted splits though pull splits of PPs are fine. Simple movement theories face at least two kinds of problems, both of which have already been alluded to. First, inverted splits can be “imperfect” in the sense that the two parts contain more phonetic material than fits into a single constituent. The case of preposition doubling (21a, b) has been discussed above,3 but a similar constellation arises with determiners, too, as (21c–d) illustrate.4

(21) a. In Schlössern habe ich noch in keinen gewohnt.
        in castles have I yet in no lived
        “As for castles, so far I have not lived in any.”
     b. *In keinen in Schlössern habe ich gewohnt.
     c. Einen amerikanischen Wagen kann ich mir keinen neuen leisten.
        an American car can I me no new afford
        “As for American cars, I cannot afford a new one.”
     d. *Keinen neuen einen amerikanischen Wagen
        no new an American car
The indefinite article and the negative quantifier kein do not go together in German noun phrases (as (21d) shows), because they compete for the same structural position, but they may occur in different parts of a split noun phrase (21c). Imperfect splits such as (21a, c) have no well-formed source in a movement account: there is not enough space in a single continuous XP for the material present in the split case. Van Riemsdijk (1989) attributes the imperfection of the split in (21c) and similar examples to a “regeneration” process: according to his theory, what moves to first position in (21c) is just amerikanischen Wagen. This sequence, however, is not a legal independent noun phrase in German.5 Therefore, phrase structure rules re-apply after movement and insert an indefinite article in order to guarantee well-formedness. At the present moment, one can at least say that “regeneration” adds a complication to the movement analysis, which one would hope to be able to avoid. The second problem of the movement account of split XPs has also been mentioned already. Recall that (6), (8b), and (10) require a remnant movement analysis, in which a nominal projection is moved out of PP. But this ingredient of an extraction analysis of splits is confronted with the serious problem that PPs are otherwise islands for movement in Croatian, as (22) illustrates:

(22) a. Ivan se popeo [PP na veliko drvo]
        Ivan self climbed on big tree
        “Ivan climbed on a big tree.”
     b. *Što_i se Ivan popeo [PP na veliko t_i ]
        what self I. climbed on big
     c. *Drvo_i se Ivan popeo [PP na veliko t_i ]
        tree self I. climbed on big
     d. *Ivan se drvo_i popeo [PP na veliko t_i ]
        I. self tree climbed on big
     e. Na veliko se Ivan drvo popeo.
        on big self I. tree climbed
The examples in (22) show that PPs are islands for wh-extraction (b), topicalization (c), and scrambling (d). However, a split of the complex PP is possible, as (22e) shows. Thus, the movement step necessary for creating the discontinuous PP is not well-formed, since it violates a strong island restriction.6 The problem is not confined to split PPs. In German, split noun phrases do not respect at least three types of islands, as the following data illustrate. First, (23) shows that subjects (of non-unaccusative verbs, at least) are islands for the extraction of PPs (cf. e.g. Müller 1996). Nevertheless, subjects can be split up, as Fanselow (1988, 1993) observes (24).7

(23) a. *[An Maria] haben mir [keine Briefe t] gefallen.
        to Mary have me no letters pleased
        “No letters to Mary have pleased me.”
     b. *[An Maria] hat mich [kein Brief t] erschreckt.
        to Mary has me no letter frightened
        “No letter to Mary has frightened me.”
(24) a. Briefe an Maria gefallen mir keine.
        letters to Mary please me no
        “As for letters to Mary, they do not please me.”
     b. Briefe an Maria haben mich keine erschreckt.
        letters to Mary have me no frightened
        “As for letters to Mary, they have not frightened me.”
Kniffka (1996: 52) shows that subjects can be split up even when they precede modal particles, which are often claimed to mark the boundary of VP. Likewise, in contrast to claims made in Diesing (1992), subjects of individual level predicates fail to disallow XP-splits:

(25) a. Ärzte dürften schon ein paar altruistisch sein
        doctors may really a few altruistic be
        “As for doctors, a few will be altruistic”
     b. Skorpione sind ziemlich viele giftig
        scorpions are rather many poisonous
        “As for scorpions, rather many of them are poisonous”
Dative indirect objects (26a–b) and many genitive (26c–d) noun phrases illustrate essentially the same point. They are islands for movement (Müller 1996; Vogel & Steinbach 1998), yet split noun phrases can be formed on their basis (Fanselow 1993; Kniffka 1996: 33).

(26) a. *[Über Polen ] ist hier noch [keinen Büchern t ] ein Preis verliehen worden
        about Poland is here yet no books-dat a prize awarded been
        “No books about Poland have been awarded a prize here.”
     b. Interessanten Büchern über Polen ist hier noch keinen ein Preis verliehen worden.
        interesting books about Poland is here yet no a prize awarded been
        “As for interesting books about Poland, no prize has been awarded to any of them here so far.”
     c. *[An Studenten] habe ich ihn [schrecklicher Morde] angeklagt
        at students have I him horrible-gen murders-gen accused
        “I have accused him of horrible murders of students.”
     d. Schrecklicher Morde an Studenten ist er vieler beschuldigt worden.
        horrible murders at students is he many accused been
        “He has been accused of many horrible murders of students.”
Similar arguments can be formulated with respect to the Specific Subject Condition and pragmatic constraints on movement. Thus, a number of stable generalizations concerning extraction8 are not fulfilled by split noun phrases in German.
Mohawk is also in line with this picture. As Baker (1991, 1995) shows, wh-movement is subject to standard CED effects in Mohawk, cf. (27) (= (28), (29), and (30a) in Baker 1991).
(27) a. uhka i-hs-ehr-e’ v-ye-atya’tawi-tsher-a-hnhnu-’
        who Ø-2sS-think fut-FsF-dress-nom-buy-punc
        “Who do you think will buy a dress?” Complement
     b. *uhka wa’-te-s-ahsvtho-’ ne tsi wa’-e-ihey-e’
        who fact-dup-2sS-cry-punc because fact-FsF.die-punc
        “*Who do you cry because (he) died?” Adjunct Islands
     c. *uhka we-sa-tsituni- ’tsi wa’-t-ha-a’shar-ya’k-e’
        who fact-NsS2sO-make.cry-punc -dup-MsSknife-break-punc
        “Who did that he broke the knife upset you?” Subject Islands
Noun phrases are opaque for movement (28) (= (34) in Baker 1991), but they can be split up irrespective of grammatical function (29) (= (40)–(41) in Baker 1991).
(28) a. *uhka_i se-nuhwe’-s ne t_i ako-kara
        who 2sS-like-hab NE FsP-story
        “Whose story do you like?”
     b. *uhka we-sa-tsituni-’ ne t_i ako-kara
        who fact-2sO-make.cry-punc NE FsP-story
        “Whose story made you cry?”
(29) a. to ni-hati wa’-she-kv-’ rati-ihn-a-rakv
        how part-MpS fact-2sS/3pO-see-punc MpS-skin-be-white
        “How many white men did you see?”
     b. to ni-hati wa-esa-kv’- rati-ihn-a-rakv
        “How many white men saw you?”
     c. ka nikayv wa-hse-nut-e’ ne kweskwes
        which fact2sS-feed-punc NE pig
        “Which pig did you feed?”
     d. ka nikayv wa’-ka-nvst-a-k-e’ ne kweskwes
        which fact-ZsS-corn-eat-punc NE pig
        “Which pig ate the corn?”
The same problems arise in Slavic languages, as Sekerina (1997) shows. The most fundamental predictions of a movement account of split constituents, namely those of bounding theory, are thus not borne out.9 There are further data that require additional complications in a movement account. In (30), one part of a split noun phrase occupies a position in a VP moved to clause-initial position, whereas the other part is left behind.
(30) [VP Bücher gelesen ] habe ich keine.
     books read have I no
     “I have not read any books.”
In principle, (30) might involve remnant VP movement (Thiersch 1985; den Besten & Webelhuth 1990; Müller 1998; but see Fanselow, in press) as exemplified by the derivation in (31). But the movement that precedes remnant VP-topicalization for (30) would not yield grammatical results in isolation, as (32) shows, and it would have to affect non-constituents in cases like (33):
(31) a. hat man wahrscheinlich [VP den Mann geküsst ] ⇒
        has one probably the man kissed
     b. hat man den Manni wahrscheinlich [VP ti geküsst ] ⇒
     c. [VP ti Geküsst ] hat man den Manni wahrscheinlich.
        “One has kissed the man probably.”
(32) *dass ich keine damals Bücher gelesen habe.
     that I no then books read have
     “that I did not read any books at that time”
(33) a. Ich habe [keine [Bücher über Maria]] gelesen.
        I have no books about M. read
        “I haven’t read any books about Mary.”
     b. Bücher gelesen habe ich noch keine über Maria.
On the other hand, the overtly legal split operation (34a) does not feed remnant VP topicalization (34b), in contrast to all other movement types.
(34) a. weil Bücher selbst der Fritz noch keine t geschrieben hat
        because books even the F. yet no written has
        “Because even Fritz has not yet written any books”
     b. *[[ noch keine t ] geschrieben ] hat Bücher selbst der Fritz
     c. Bücher geschrieben hat selbst der Fritz noch keine.
        books written has even the F. yet no
        “Even Fritz has not yet written any books.”
If one finds a way of blocking (32), one could try harder and attempt to explain well-formed (30) or (34c) by two – rather than one – steps of movement preceding remnant VP topicalization. Thus, one could first move Bücher out of keine Bücher, so as to yield (35b). If it is now possible to front the remnant noun phrase keine t, as in (35c), one would have produced a constituent (underlined in (35c)) that contains exactly the phonetic material one needs to front in remnant VP topicalization for (30).
(35) a. (habe) ich [ keine Bücher gelesen ]
     b. (habe) ich [ Bücher [ [ keine t ] gelesen ]]
     c. (habe) ich [ keine t ] [ Bücher [ t gelesen ]]
This last step preceding VP-fronting is problematic, however: it involves the scrambling (adjunction to VP/IP) of a category that itself contains a scrambling trace. As Müller (1998) has observed, the ban against this kind of movement is the core restriction on remnant movement:
(36) Unambiguous Domination: In ... [A ... B ...] ..., A and B must not undergo the same kind of movement.
We have seen, then, that simple movement theories of XP-splits face at least three types of problems:
a. They cannot account for the repetition of phonetic material in imperfect splits.
b. They cannot cope with the fact that XP-splits disrespect standard islands for movement (PP-islands, barriers by lack of l-marking).
c. They cannot easily handle the existence of XP-splits in VP-fronting constructions.

Base generation theories

At least the first two of the three problems for movement accounts would not arise if the parts of split constituents were base-generated in place. The idea that discontinuous phrases are generated as two (or more) independent constituents goes back to Hale (1983). According to him, split noun phrases (in Warlpiri) are a diagnostic for non-configurationality. Thematic theory seems to militate against the view that more than one phrase is linked to a single thematic role, but whether this constitutes a problem depends on the nature of thematic linking. Hale (1983) proposed a theory of Lexical Conceptual Structure and its relation to phrase structure in which linking more than one NP to a single role is unproblematic. Furthermore, NPs fulfill functions other than the referential closing of argument slots in Warlpiri. It may even be the case that the only function of non-pronominal NPs in Warlpiri (Jelinek 1984) or Mohawk (Baker 1995) is that of adjuncts, so that no conflict with standard theta-theory arises. Van Geenhoven (1998) presents a semantic theory that is able to handle multiple XPs that are linked to the same argument slot, at least for the case of direct objects.
Hale, Jelinek, and Baker attribute the presence of (base-generated) split constituents to non-configurationality. This is not appropriate, since a survey of Australian languages (Austin & Bresnan 1996: 262) revealed that the existence of split noun phrases depends neither on generally free constituent order (Diyari refutes such a connection) nor on enclitic pronouns bearing the argument function (Jiwarli has split noun phrases but no pronominal clitics). Therefore, a base generation account needs to assume that NPs may have non-argumental, attributive functions quite independent of non-configurationality, a possibility entertained, e.g., in Fanselow (1988) and van Geenhoven (1998). See also Kuhn (1998, to appear) for a base generation approach to split NPs within the framework of LFG. Just as movement seems to be optimal for (37), base generation accounts are correct for (apparent) XP-splits in Japanese. As Tanaka (in prep.) observes, the type of NP-discontinuity exemplified in (38) must not involve movement in the crucial derivational steps, because no islands for movement such as the Coordinate Structure Constraint are respected (see (39a) vs. (39b)).
(37) a. Who did you see a photo of?
     b. *Who did a photo of please you?
(38) Peter-wa kuruma-wa itsumo akai-no-o kat-teiru yo
     Peter-top car-top always red-no-acc buy-pres prt
     “As for cars: Peter always buys red ones”
(39) a. isu-wa Peter-wa kinoo rampu-to kurashikkuno-no-o kat-ta
        chair-top Peter-top yesterday lamp-and classical-no-acc bought
        “Peter bought a lamp and a classical chair yesterday”
     b. ??isu-wa Peter-wa kinoo rampu-to kat-ta
        chair-top Peter-top yesterday lamp-and bought
        “Peter bought a lamp and a chair yesterday”
The Japanese XP-splits (38), (39a) involve two independent noun phrases, one generated in an A-position, the other being merged in a Topic position. XPsplits of the Slavic and German type differ from Japanese, however, in that certain kinds of islands have to be respected. This fact can be accounted for in a base generation account only indirectly. (40a) illustrates the fact that the relation between the parts of an XP-split respects the Complex Noun Phrase Constraint in German. Similarly, the complex noun phrase (40b) is an island for both movement (40c) and split constituent formation (40d) in Croatian.
(40) a. *Bücher habe ich [eine Geschichte dass sie keine liest ] gehört.
        books have I a story that she no reads heard
        “I have heard a story that she does not read any books.”
     b. Ivan je vidio [NP auto [RelCP koji je Marija svojoj sestri kupila ]]
        I. is seen car which is M. her sister bought
        “Ivan has seen the car which Mary bought for her sister.”
     c. *[NP Čijoj sestri ] je Ivan vidio [NP auto [RelCP koji je Marija ti kupila ]]?
        whose sister is I. seen auto which is M. bought
        “Whose sister is such that Ivan saw the car which Mary bought for her?”
     d. *Čijoji je Ivan vidio [NP auto [RelCP koji je Marija ti sestri kupila ]]?
Fanselow (1988) tries to account for such facts by assuming that one of the two parts of an XP-split has to undergo obligatory movement to Spec,CP after having been merged independently of the other part of the split construction. Since the two NP parts are merged independently of each other, it is obvious why the relation between keine and Bücher itself need not respect islands for movement as such (41a). If Bücher has to undergo later movement (41b), however, the relation between Bücher and its trace must be compatible with subjacency, which, Fanselow claims, captures (40a).
(41) a. er [NP keine ] [NP Bücher ] gelesen hat
        he no books read has
     b. [NP Bücher ]i hat er [NP keine ] ti gelesen hat
        “He has not read any books.”
This argument is valid, however, only if there are additional constraints on the distance at which two XPs may be merged independently of each other when they are linked to the same thematic role. Otherwise, one could circumvent all island constraints by simply merging the XP-parts at any distance. Even if such locality constraints on merger can be identified,10 a base-generation account of XP-splits for German leaves it open why one of the XPs must move to Spec,CP or the sentence-internal “topic”-position in the sense of Frey (2000). Similarly, a theory which merges the parts of XP-splits in Croatian and Polish in situ would leave it unexplained why at least one part of a split DP or PP must appear in front of VP proper, in a focus position. There may be technical ways
to guarantee that such a kind of movement takes place (see Fanselow 1988) but they are certainly not satisfactory. A further disadvantage of base generation solutions is that they have little to say about a phenomenon favoring movement analyses. Riemsdijk (1989) observes that some linear order facts are unexpected in base generation theories.11 Order is not free in German noun phrases. As (42a) illustrates, there is only one option for arranging the prenominal elements keine, zwei, grüne, and this is mirrored in the discontinuous case, as (42b) shows. We can explain (42b) if the source of a split noun phrase is its continuous counterpart (= (42a)), while it is not obvious how a base generation account might capture (42b): noun phrases such as zwei Bücher, keine grünen, keine Bücher or zwei Grüne are perfect if they form a single complete phrase. Thus, they should be able to co-occur within a single clause if they can be generated independently of each other.
(42) a. keine zwei grünen Bücher
        no two green books
        *keine grünen zwei Bücher
        *zwei keine grünen Bücher
        *grüne keine zwei Bücher
     b. Grüne Bücher hat sie keine zwei.
        green books has she no two
        “She does not have two green books.”
        *zwei Bücher hat sie keine grünen
        *keine Bücher hat sie zwei grüne
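The order argument based on (42) can be made concrete with a small sketch. It assumes, purely for illustration, that an inverted split corresponds to a single cut in the continuous noun phrase, with the right-hand portion fronted and the left-hand portion stranded. The function name `derivable_splits` is hypothetical, and adjective inflection is ignored by using the bare stem `grün`.

```python
def derivable_splits(noun_phrase):
    """Return all (fronted part, stranded part) pairs obtained by one cut.

    Assumption for illustration only: a split keeps the internal order of
    the continuous noun phrase and divides it at exactly one point.
    """
    return [
        (tuple(noun_phrase[i:]), tuple(noun_phrase[:i]))
        for i in range(1, len(noun_phrase))
    ]


# Continuous source phrase of (42a); 'grün' stands in for grüne/grünen.
continuous = ["keine", "zwei", "grün", "Bücher"]
splits = derivable_splits(continuous)

# (42b): 'Grüne Bücher hat sie keine zwei' is derivable ...
good = (("grün", "Bücher"), ("keine", "zwei")) in splits
# ... whereas '*zwei Bücher hat sie keine grünen' is not.
bad = (("zwei", "Bücher"), ("keine", "grün")) in splits
```

Under this assumption the ungrammatical patterns in (42b) are simply not among the outputs, whereas independent generation of the two parts would have to exclude them by some other means.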
Similarly, adjective order is not free. (43a) is unmarked while (43b) is not: the latter is fully acceptable only if amerikanische bears focal stress.
(43) a. u Ich kaufe neue amerikanische Bücher.
        I buy new American books
        “I am buying new American books.”
     b. m Ich kaufe AMERIKANISCHE neue Bücher.
        “As for new books, I am buying ones from America.”
Interestingly, a similar pragmatic constraint holds for the discontinuous case. (44b) shares the pragmatic well-formedness conditions of (43b), while (44a) is as unmarked as a split noun phrase can be. This is explained if the split category in (44a) is derived from the continuous NP in (43a), and if the same holds for the pair (43b)–(44b).
(44) a. amerikanische Bücher kaufe ich neue
     b. m neue Bücher kaufe ich amerikanische
It is hard to imagine that such restrictions12 can be made to follow from semantic or related considerations in base generation accounts; at least, no such accounts have been proposed so far.

A prosodic option?

Zec and Inkelas (1990) assume that syntactic constituents may be split by enclitics in Serbo-Croatian. They claim that such data provide evidence for a phonological or prosodic placement of enclitics. For Croatian, (45) shows that the clitic cluster may appear after a complex DP (45a), or apparently ‘inside’ the complex DP, as in (45b).
(45) a. Taj čovjek joj ga je poklonio.
        this man her it be3sg presentptc
        “This man presented it to her.”
     b. Taj joj ga je čovjek poklonio.
        this her it be3sg man presentptc
Regardless of whether one concedes that a prosodic rule of clitic placement can split a constituent, the examples discussed in the preceding sections show not only that a prosodic solution cannot account for inverted splits in German; many pull splits of Croatian resist a prosodic analysis as well, because material other than clitics can intervene between the two parts of the split construction. Thus, as has been pointed out by Browne (1976), a wh-phrase can be fronted, leaving the head noun in situ, as in (46).
(46) Kakav je Ivan kupio auto?
     what-kind-of be3sg I. buyptc car
     “What kind of car has Ivan bought?”
As Ćavar (1999) points out, the same type of syntactic discontinuity is possible with the constructions discussed in Zec and Inkelas (1990). In (47), a demonstrative is topicalized, being separated from the head noun of the complex DP by the subject Ivan, and not just by clitics.
(47) Taj je Ivan kupio auto.
     this be3sg I. buyptc car
     “Ivan bought this car.”
Prosodic placement of clitics thus cannot be the general analysis of split constituents. Whether some XP-splits emerge as a result of prosodic clitic placement is an open issue, however. Thus, Browne (1975) argues that (48b) must be due to a non-syntactic clitic placement, because (48c) suggests that proper names can only be split in sentence-initial position, quite unlike what we have seen in (46) and (47).
(48) a. Lav Tolstoj je veliki ruski pisac.
        Leo Tolstoy be3sg great Russian writer
        “Leo Tolstoy is a great Russian writer.”
     b. ?Lav je Tolstoj veliki ruski pisac.
        Leo be3sg Tolstoy great Russian writer
     c. *Lav je bio Tolstoj veliki ruski pisac.
        L. be3sg beptc T. great Russian writer
Franks (1998) and Ćavar (1999) argue, however, that proper names are split in syntax, too. While (49a) illustrates that both parts of a name can be inflected, there is a marginal possibility of inflecting only the first word of a complex proper name (49b). (50) from Franks (1998) shows that splits arise only when both words are overtly inflected (see also the discussion in Section 5).
(49) a. Lava Tolstoja čitam.
        L. T. read1sg
        “I am reading Leo Tolstoy.”
     b. ?Lava Tolstoj čitam.
(50) a. Lava sam Tolstoja čitao.
        L. be1sg T. readptc
        “I read Leo Tolstoy.”
     b. *Lava sam Tolstoj čitao.
        L. be1sg T. readptc
Similar conditions were observed in Bošković (1997) for the syntactic split of proper names. As illustrated in (51), proper names can be split by non-enclitic elements, if both parts are inflected.
(51) a. Lava čitam Tolstoja.
        L. read1sg T.
     b. *Lava čitam Tolstoj.
        L. read1sg T.
(51a) argues against a simple prosodic account of splitting proper names in Croatian (see also Ćavar (1999), on which the present discussion is based).
However, Anderson (2000a) develops an approach to clitic placement in which inflectional affixes may share the distributional properties of clitics, and pied-pipe the verb they are attached to, so that the verb ends up in a position otherwise reserved for clitics. In such an account, (51a) turns out to be not very different from (48b). We will leave open here the issue of whether proper noun splits should be analyzed like other splits in Croatian, but we would like to point out that a purely prosodic account of clitic placement has serious shortcomings quite independent of the present discussion, as argued in detail in Ćavar (1999).
The copy and deletion approach

The general mechanism

The evidence considered so far seems paradoxical: some aspects of the split construction require a movement analysis, others rule it out. E.g., split DPs disrespect the subject island condition (52a), but respect the complex noun phrase constraint (52b).
(52) a. Briefe an Maria haben mich keine erschreckt.
        letters to Mary have me no frightened
        “As for letters to Mary, none of them has frightened me”
     b. *Bücher habe ich [eine Geschichte dass sie keine liest] gehört.
        books have I a story that she no reads heard
        “I have heard a story that she does not read any books.”
For the contrast in (52), the following characterization suggests itself: a movement barrier Σ does not block the formation of a split XP if and only if Σ itself is the XP to be split up. This follows if (a) splitting up Σ involves movement (then, (52b) is explained), but (b) not movement out of Σ.13 If splits are not formed by moving something out of the category that will be split up, the subject condition has no chance to block (52a). The idea that split formation may involve movement, but not movement of part of XP out of XP is enigmatic at first glance only. It makes sense if we assume that a chain <Σ, Σ> is formed (so barriers dominating Σ must be respected), in which the phonetic material of Σ is partially realized in the upper position, and partially in the lower copy. This is the core idea of the partial (distributed) deletion account of split constituents.
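The characterization just given can be restated as a toy check. The sketch below is only an illustration: the function names and the representation of structure as a list of barrier labels above XP are assumptions made here, not part of the analysis itself.

```python
def split_blocked(barriers_above_xp):
    """A split copies the whole XP, so only barriers dominating XP matter."""
    return len(barriers_above_xp) > 0


def subextraction_blocked(barriers_above_xp, xp_is_barrier):
    """Extraction out of XP additionally has to cross XP itself."""
    return xp_is_barrier or len(barriers_above_xp) > 0


# (52a): the subject is itself a barrier, but no barrier dominates it.
subject_split = split_blocked([])
subject_subextraction = subextraction_blocked([], xp_is_barrier=True)

# (52b): the noun phrase to be split sits inside a complex noun phrase.
cnpc_split = split_blocked(["complex NP"])
```

The subject can be split but not extracted from, while a dominating complex noun phrase blocks the split as well, which is the pattern in (52).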
Recall that movement is a combination of copying and deletion in the Minimalist Program (the cd-theory of movement, see, e.g., Chomsky 1995; Nunes 2001). Thus, the overt movement of α involves the steps in (53): First, α is copied to its landing site (53b), then the copy left behind is deleted, or made invisible to the phonological component (53c).
(53) a. ... α ...
     Copying ⇒
     b. α ... α ...
     Full Deletion of lower Copy ⇒
     c. α ... α ... (lower α unpronounced)
There is evidence that the deletion of the lower copy is not an automatic sequel to movement. Rather, as was argued, e.g., by von Stechow (1992), Groat and O’Neill (1996), Pesetsky (1998) and Sabel (1998), among others, at least some instances of covert movement14 are better analysed as movement in the overt component, with the upstairs rather than the downstairs copy being made invisible to the phonological component:
(53) Full Deletion of upper Copy ⇒
     c′. α ... α ... (upper α unpronounced)
While these two modes of realizing chains phonologically may be considered standard, there exist other ways of dealing with copies in chains, which have received less attention. Thus, as was originally pointed out by Höhle (1996), copies of “light” wh-phrases may fail to be deleted in the so-called “Copy Construction”, see Hiemstra (1986), Fanselow and Mahajan (2000), Fanselow and Ćavar (2001), and Nunes (2001) for analyses.
(54) wer denkst du denn wer du bist?
     who think you ptc who you are?
     “Who do you think you are?”
Furthermore, Pesetsky (1998) argues that (certain) resumptive pronouns reflect the failure of copies of movement to delete completely. Thus, there seems to be some evidence that (53c) and (53c′) are not the only legal modes of treating chains in terms of phonological realizations. What we would like to add to this picture is the idea that, under certain conditions, deletion may affect both the upstairs and the downstairs copy, but each only partially, which yields the split XP construction. Thus, simplifying matters first, assume that a movement step maps (55a) onto (55b), by copying a noun phrase. If the downstairs copy deletes completely, we get standard topicalization (55c); if part of the lower material is retained, split topicalization arises (55d).
(55) a. hat er keine Bücher gelesen
        has he no books read
     Copying of the noun phrase ⇒
     b. keine Bücher hat er keine Bücher gelesen
     “Overt” movement because of full deletion of lower copy ⇒
     c. keine Bücher hat er keine Bücher gelesen (lower copy unpronounced: keine Bücher hat er gelesen)
     Split noun phrase because of partial deletion in both copies ⇒
     d. keine Bücher hat er keine Bücher gelesen (upper keine and lower Bücher unpronounced: Bücher hat er keine gelesen)
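The copy-and-delete steps in (55) can be mimicked with a short sketch. It is an illustration under simplifying assumptions only: the clause is a flat list of words, the two copies are identified by index ranges, and the function `spell_out` with its `keep_upper` and `keep_lower` parameters is hypothetical.

```python
def spell_out(clause, upper, lower, keep_upper, keep_lower):
    """Pronounce a clause containing two copies of the same phrase.

    upper / lower: index ranges of the higher and the lower copy.
    keep_upper / keep_lower: words that survive deletion in each copy.
    """
    pronounced = []
    for i, word in enumerate(clause):
        if i in upper and word not in keep_upper:
            continue  # deleted in the upper copy
        if i in lower and word not in keep_lower:
            continue  # deleted in the lower copy
        pronounced.append(word)
    return " ".join(pronounced)


# (55b): the noun phrase has been copied to clause-initial position.
clause = ["keine", "Bücher", "hat", "er", "keine", "Bücher", "gelesen"]
upper, lower = range(0, 2), range(4, 6)

# (55c): full deletion of the lower copy.
full_topicalization = spell_out(clause, upper, lower, {"keine", "Bücher"}, set())
# (55d): partial deletion in both copies.
split_topicalization = spell_out(clause, upper, lower, {"Bücher"}, {"keine"})
```

Both outputs start from the same copied structure and differ only in which material is kept in which copy, which is the point of the distributed deletion idea.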
Under such a view,15 it is obvious why simple island effects fail to arise with split XPs: the step from (55b) to (55d) does not involve movement at all (so the XP to be split up cannot be a barrier), but split formation involves movement, so that barriers containing the lower XP have an effect on well-formedness (52b). Partial or distributed deletion as envisaged here is an extension of partial reconstruction at LF to the overt component of grammar.16 Reflections on the failure of quantifier raising or LF-wh-movement to bleed Principle C effects (see, e.g., Fox 1995; Nunes 1995; Pesetsky 2000) and further considerations (see Chomsky 1995) suggest that the semantic material of a phrase may end up being distributed to more than one position in a chain. It has been observed, e.g., that LF quantifier raising normally does not bleed the effects of Principle C of the Binding Theory. Thus, him and John cannot be coreferent in (56), although Quantifier Raising of the object should yield an LF-representation such as (57), in which John is no longer c-commanded and bound by him. If, however, as much semantic material of the quantified NP is reconstructed after LF-movement as is compatible with the necessity to keep the quantificational head in place, as in (57′), a structure arises that represents scope, does not fail to imply the Principle C effect, and which is, in effect, identical with the kind of structure that arises by partial deletion in overt syntax, according to our account.17 See Fox (1995), Pesetsky (2000), Wilder (1997) for arguments that show that partial reconstruction is superior to an analysis in which Principle C is checked before LF-movement.
(56) *I sent himi [every letter Johni expected]
(57) *[every letter Johni expected]k I sent himi tk
(57′) *[every]k I sent himi [tk letter Johni expected]
Pragmatic conditioning

Both in Croatian and in German, XP-splits go hand in hand with a particular pragmatic structure that was studied in detail by Kniffka (1996) and de Kuthy (2000) for German, and for Slavic languages, e.g., by Siewierska (1984) (Polish), Lapteva (1976) and Sekerina (1997) (Russian). In a split construction, the right part of XP must be focal, while the lefthand part may be a (link-) topic or a second focus. Note that both Croatian and German sentence structure offer a number of positions reserved for YPs with specific pragmatic functions, such as focus and topic positions (see Frey 2000; Pili 2001). Bringing these observations together, the following generalization suggests itself: the XP-split construction is grammatical only if a single XP must fulfill two different positional requirements defined by pragmatic constraints on order.18 In other words: Suppose that XP bears a feature f1 that requires that XP be overtly realized in position A, and an additional feature f2 that forces XP into position B. Then XP is split up in languages like Croatian or German.
(58) [[A XP] ..... [[B XP] ......]]
This general idea can be made precise along the following lines. Suppose that an XP = [ap [b c]q ] bears two semantic or pragmatic features p, q, such as [+wh], [+focus], [+link-topic], etc., and suppose that these features are checked by corresponding heads Hp and Hq in the standard way: the head attracts a phrase bearing a corresponding feature. Consider now a structure such as (59a). If the features p and q must be checked on Hp and Hq, respectively, (59b) will arise after two instances of movement/attraction.
(59) a. [Hp .... [Hq ... [XP ap [b c]q ]]]
     b. [[XP ap [b c]q ] [Hp .... [[XP ap [b c]q ] [Hq ... [XP ap [b c]q ]]]]]
In the approach proposed here, the “strength” of the attracting feature does not determine whether movement (copying) applies before Spellout or not. Rather, copying always takes place as soon as possible.
The strength of the attracting feature rather determines which of the copies created by movement is spelt out. In the “easy” case, the attracting features of both Hp and Hq are weak, so that the lowest copy is spelt out (= (59c)) if (60) holds.
(59) c. [[XP ap [b c]q ] [Hp .... [[XP ap [b c]q ] [Hq ... [XP ap [b c]q ]]]]] (only the lowest copy is spelt out)
(60) In a chain C = <C1, C2> of XP, C1 is not spelt out if the feature attracting XP to C1 is weak.
For heads with strong attracting features, the simplest implementation of standard ideas would seem to be (61). (61) yields correct results when only one attracting feature is strong (as sketched in (59d)), but problems arise as soon as two
attracting heads have strong features, as seems to be the case in the split construction (both parts appear in positions related to semantic/pragmatic features). (61) would then require that XP be spelt out in both positions.
(61) In a chain C = <C1, C2> of XP, C1 is spelt out if the feature attracting XP to C1 is strong.
(59) d. [[XP ap [b c]q ] [Hp .... [[XP ap [b c]q ] [Hq ... [XP ap [b c]q ]]]]]
Multiple full copies of a single phrase (caused by the presence of two strong features of different heads attracting the same XP) seem non-existent in natural languages. Thus, (61) cannot be maintained. A situation in which both p and q of [XP ap [b c]q ] are attracted by corresponding strong features of different heads implies either ineffability or the XP-split construction. The former situation holds in Dutch (where a constellation in which one part of an NP is focal, the other topical, simply cannot be expressed, as Henk van Riemsdijk (p.c.) points out), the latter in German and Croatian.
(62) Suppose C = <C1, C2> is formed because a strong feature of H has attracted XP and suppose that H checks the operator features f1 ... fk of XP. Then the categories bearing f1 ... fk must be spelt out in C1.
According to (62), operator positions checked by strong features must be filled by phonetic material bearing the corresponding operator feature. This implies an XP-split construction whenever the operator features are checked in two different specifier positions. When a phrase bears only one operator feature, it is not split up, even if not all of its parts bear that feature. This is guaranteed if the phonetic spellout is governed by a contiguity principle (see Fanselow & Ćavar 2001): material that is contiguous at one step in the derivation (that is, e.g., merged as a single phrase) should remain contiguous unless other principles force a violation of contiguity.
The absence of a split construction in languages like Dutch may then be a consequence of a constellation in which contiguity cannot be violated in the interest of (62).
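The interplay of (60), (62) and contiguity can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' formalization: the function `realize`, the position labels `SpecTop`, `SpecFoc` and `base`, and the encoding of an XP as a list of (word, feature) pairs are all hypothetical, and the ineffability option attributed to Dutch is not modelled.

```python
def realize(parts, chain):
    """Decide where the words of an XP are pronounced.

    parts: list of (word, operator_feature) pairs in base order.
    chain: list of (position, is_strong, features_checked), highest first;
           the last entry is the base position.
    Returns a dict mapping each position to the words pronounced there.
    """
    spelled = {position: [] for position, _, _ in chain}
    words = [word for word, _ in parts]
    strong = [(pos, feats) for pos, is_strong, feats in chain if is_strong]

    if not strong:
        # (60): copies attracted by weak features stay silent.
        spelled[chain[-1][0]] = words
    elif len(strong) == 1:
        # Contiguity: nothing forces a split, so the XP stays together.
        spelled[strong[0][0]] = words
    else:
        # (62): each feature-bearing part is pronounced where it is checked.
        for word, feature in parts:
            targets = [pos for pos, feats in strong if feature in feats]
            if not targets:
                raise ValueError(f"no strong head checks {feature!r}")
            spelled[targets[0]].append(word)
    return spelled


xp = [("keine", "focus"), ("Bücher", "topic")]
two_strong = realize(xp, [("SpecTop", True, {"topic"}),
                          ("SpecFoc", True, {"focus"}),
                          ("base", False, set())])
one_strong = realize(xp, [("SpecTop", True, {"topic"}),
                          ("base", False, set())])
all_weak = realize(xp, [("SpecTop", False, {"topic"}),
                        ("base", False, set())])
```

Only the configuration with two strong attractors distributes the material over two positions; in the other two cases the phrase is pronounced as a unit.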
Anti-freezing

In the preceding section, we have seen why XP-splits arise only if XP bears two different pragmatic or semantic functions. The mechanism of splitting XPs that is implicit in (62) makes a stronger claim: it presupposes that phrases that are split are moved to specifier positions linked to operator features. Is this stronger claim really justified? At least for Croatian, the answer seems to be positive.
In Section 3.1, we have observed that barriers such as subject islands, dative islands or PP islands are not respected by split constituents. Croatian obeys a further restriction: descriptively speaking, a constituent cannot be split up in its root position; rather, a split is possible just in case both (all) parts of the XP occupy derived positions. Thus, as (63) illustrates, a PP cannot be split up if part of it remains in the base position following the verb (recall that Croatian is underlyingly an SVO language):
(63) a. *na kakav je Ivan bacio loptu krov
        on what is Ivan thrown ball roof
     b. *na kakav je Ivan bacio krov loptu
     c. na kakav je Ivan krov bacio loptu
Dative DPs share these properties. They are islands for extraction, as (64a) illustrates. Nevertheless, they can be split up, as expected (64b), but only if no part of the discontinuous NP follows the verb (65).19
(64) a. *čega je policajac pokazao šoferu put za Split
        of what is policeman shown driver way to Split
        “The policeman has shown the way to Split to the driver of what?”
     b. šoferu je policajac autobusa pokazao put za Split
        driver is policeman of-bus shown way to Split
        “The policeman has shown the driver of the bus the way to Split.”
(65) a. *čijoj je Ivan dao knjigu sestri
        whose is Ivan given book to sister
        “Whose sister has Ivan given the book to?”
     b. *čijoj je Ivan dao sestri knjigu
     c. čijoj je Ivan sestri dao knjigu
It thus appears as if splitting up DPs and PPs is possible in derived positions only. This is predicted if the spellout principle (62) refers to chains and attractors, and not to focus or topic positions. However, in Croatian (and in Polish), there is an exception to the generalization just presented: accusative noun phrases can be discontinuous even if part of the DP follows the verb:
(66) Čiju je Ivan vidio sestru?
     whose is Ivan seen sister
     “Whose sister has Ivan seen?”
This difference may find a straightforward explanation if we acknowledge the fact that accusative noun phrases in base generated positions cannot be islands for movement. (66) could thus be due to normal extraction, which may reduce to a
remnant movement of the accusative NP following standard extraction from NP, or to some sort of left branch extraction.20
The two types of splits

The formation of XP-splits involves a copying operation followed by two instances of partial deletion. One therefore expects that constraints on copying/movement exert some influence on the nature of XP-splits. XP-splits arise when an XP = [XP ap [b c]q ] possesses two operator features p and q attracted by different heads in a constellation such as (59a), repeated here for convenience.
(59) a. [Hp .... [Hq ... [XP ap [b c]q ]]]
It is reasonable to assume that the features p and q in XP stand in a c-command relation to each other. In Chomsky (1995), operator features are taken to be “subfeatures” of categorial features. Recall that overt movement is triggered only by the need to check categorial features (or their subfeatures), and that it is subject to the Minimal Link Condition. Suppose that p and q are always subfeatures of the same “type” in terms of the functioning of grammar. Then their attraction is always subject to relativized minimality21 and/or A-over-A22 effects: in [XP a_p [b c]_q ], only p but not q can be attracted. Furthermore, syntactic features (may) become invisible for the computational system after being checked. Thus, a relativized minimality or A-over-A effect exerted by a_p in (59a) disappears as soon as p has been checked: the feature q becomes accessible for attraction/movement as soon as p has been checked in [XP a_p [b c]_q ].

In other words, a converging (successful) derivation will have the properties sketched in (67). In the constellation (67a), H1 can attract p only and not q, because p is closer to H1 than q. After copying, p is checked in (67b), so that a_p ceases to block further attraction of q. (62) guarantees, however, that a_p must be spelt out in the specifier position of H1. The second copying step moves [XP a_p [b c]_q ] to the specifier position of H2, with q being the attracted feature. Because of (62), q must find a phonetic realization in the new landing site. Thus, (67c) is spelt out as in (67d), i.e., an inverted split arises. Recall that in the constellation in question, neither (60) nor (62) implies that any material must be present in the root position, so that no phonetically realized elements will appear there, in order to minimize the degree to which the contiguity23 of XP is violated.

(67) a. [H2 .... [H1 ... [XP a_p [b c]_q ]]]
     b. [H2 .... [[XP a_p [b c]_q ] [H1 ... [XP a_p [b c]_q ]]]]
     c. [[XP a_p [b c]_q ] [H2 .... [[XP a_p [b c]_q ] [H1 ... [XP a_p [b c]_q ]]]]]
     d. [[XP ⟨a_p⟩ [b c]_q ] [H2 .... [[XP a_p ⟨[b c]_q⟩] [H1 ... ⟨[XP a_p [b c]_q]⟩]]]]
        (in d., material in angle brackets is not pronounced)
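The steps of the derivation in (67) can be traced procedurally. What follows is a minimal sketch, not a formalization proposed in the text: the names `XP_PARTS`, `attract`, `derive`, and `spell_out` are hypothetical, the XP is reduced to two parts, and the Minimal Link Condition is simplified to "attract the highest feature that has not yet been checked".

```python
# Toy model of the derivation in (67). All names are illustrative assumptions.
# XP = [a_p [b c]_q]: parts are listed from highest to lowest, each paired
# with the operator feature it carries.
XP_PARTS = [("a", "p"), ("b c", "q")]


def attract(checked):
    """Return the closest unchecked feature (simplified Minimal Link Condition)."""
    for _, feature in XP_PARTS:
        if feature not in checked:
            return feature
    return None


def derive():
    """Copy XP once per attracted feature; return landing sites, lowest first."""
    checked, sites = set(), []
    feature = attract(checked)
    while feature is not None:
        sites.append(feature)   # XP is copied to the specifier of the attracting head
        checked.add(feature)    # a checked feature no longer blocks attraction
        feature = attract(checked)
    return sites


def spell_out(sites):
    """In each copy, pronounce only the part whose feature is checked there."""
    surface = []
    for feature in reversed(sites):  # the highest copy comes first in the string
        surface += [part for part, f in XP_PARTS if f == feature]
    return surface                   # the copy in the root position stays silent


if __name__ == "__main__":
    print(derive())             # order in which features are attracted
    print(spell_out(derive()))  # surface order of the pronounced parts
```

Under these assumptions p is attracted before q, so the part carrying q ends up in the higher copy and precedes the part carrying p at the surface, which is the inverted order of (67d).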
Distributed deletion
That relativized minimality considerations imply that XP-splits are of the inverted type is a welcome consequence, given that inverted splits are the default version of the split construction. Pull splits preserve the c-command relations among the overt elements of the continuous XP: an XP merged as [a [b [c]]] appears as [a [X [b [Y [c] ....]]]] at the surface. Therefore, pull splits may be related to the Parallel Movement Constraint (PMC) proposed by Müller (2001).

(68) Parallel Movement Constraint
     If A c-commands B at level L, then A c-commands B at level L’

The PMC requires that c-command relations generated in the base should be preserved (to the extent that this is possible). If the PMC is interpreted as a principle governing phonetic realizations, pull splits will be generated.

Having identified the two principles of grammar that might be made responsible for inverted and pull splits, respectively, one has to identify the “traffic rules” for their interaction. Initially, one might suspect that the choice between pull and inverted splits is correlated with the relative ranking of the relativized minimality/A-over-A condition and the PMC in the spirit of Optimality Theory. Structures that respect one of the two constraints inevitably violate the other. What is grammatical and what not would thus be a function of which of the two principles has priority over the other. This simplistic account fails for two reasons, however. First, it predicts that there are languages in which only pull splits exist (in which the PMC outranks the A-over-A condition), and this does not appear to be the case. Second, it ignores the fact that the choice between pull and inverted splits seems to be correlated with the operator features involved – at least in German. When the split XP involves a wh-feature and a topic/focus feature, the choice of split type must reflect the hierarchical relations among the attracting heads:

(69) a. Bücher weiss ich nicht wieviel er gelesen hat.
        books know I not how many he read has
        “as for books, I do not know how many of them he has read”
     b. wieviel denkst du dass er täglich Bücher liest?
        how many think you that he daily books reads
        “how many books do you think that he reads every day?”
When no wh-feature is involved, splits are inverted. This observation suggests a refinement of an assumption made above. Recall that the A-over-A condition and/or the Minimal Link Condition affect only features that are identical from a grammatical perspective. The distribution of split types in German suggests that topic and focus features are identical from the perspective of the Minimal Link Condition,
Gisbert Fanselow and Damir Ćavar
while the wh-feature is different from the topic-focus feature. Therefore, wh-splits do not have to be inverted. For Croatian (and perhaps Slavic languages in general), we then only have to add the assumption that topic and focus features may optionally be treated as distinct. If they are, the A-over-A condition/the MLC will no longer force an inverted serialization of the split construction, as required.
Island effects revisited

Features present on specifiers, determiners, adjectives (and, arguably, the noun) can trigger the pied piping of the complete DP in wh-movement contexts, as (70) illustrates for German.

(70) a. [das wievielte Buch] ist das?
        the “how-many-eth” book is that
        “how many books does that make”
     b. [ein wie teueres Buch] hat sie gekauft?
        a how expensive book has she bought
        “how expensive a book has she bought”
     b′. [wessen Buch] hat er gekauft
        whose book has he bought
     b″. [wem sein Buch] hat er gekauft (dialectal)
        who his book has he bought
     (sie wollte wissen)
     (she wanted to know)
     c. [den wievielten Geburtstag] er heute feiert
        the how-many-eth birthday he today celebrates
        “how old did he get today”
     d. [ein welcher Student] das geschrieben hat (dialectal)
        a which student that written has
     e. welches Buch hat er geschrieben
        which book has he written
On the other hand, there is no pied-piping for features that follow the noun, that is, for features c-commanded by the lexical noun.

(71) es ist egal – “it does not matter”
     a. *einen Bruder von wem sie liebt
         a brother of whom she loves
         “whose brother she loves”
     b. *einen Bruder wessen sie liebt (=a.)
         a brother whose she loves
     c. *den Versuch wen zu kuessen er wagte
         the attempt who to kiss he dared
         “who he made an attempt to kiss”
     d. *eine Geschichte, dass sie wen liebt, er glaubte
         a story that she who loves he believed
         “who he believed a story that she loved”
     e. *einen Mann der wen liebt er kennt
         a man who whom loves he knows
         “who he knows a man who loves t”

We will not attempt to derive this generalization from general principles, and confine ourselves to stating it: If a head H attracts the feature f, then Σ = [DP . . . f . . . ] can be pied piped only if f is not c-commanded by the nominal “head” of the DP.

XP-splits involve the attraction of two features residing in XP. We expect, then, that a DP may be split only if both parts contain “prenominal” material. Recall that a DP is split up only if it is attracted twice to specifier positions in which operator features are checked. These features must sit in the prenominal domain for there to be a chance of pied piping the complete DP. This prediction is borne out:

(72) a. Bücher kaufe ich keine.
        books buy I no
        “I buy no books.”
     b. Bücher kaufe ich nur Peters.
        books buy I only Peter’s
        “I just buy Peter’s books.”
     c. Bücher kaufe ich interessante.
        books buy I interesting
        “I buy interesting books”
     d. blaue kaufe ich keine.
        blue buy I no
        “I don’t buy blue ones.”
     e. interessante Bücher kaufe ich keine neuen.
        interesting books buy I no new
        “I do not buy any new interesting books”
Furthermore, there can be no XP-splits in which one part contains postnominal material only. This prediction is also borne out, and (73) illustrates an important consequence. Recall that underlying subjects and indirect objects are barriers for movement in German (Müller 1996), as exemplified in (73). XP-split formation does not respect these islands, because splitting up a DP does not involve extraction out of that DP. We must guarantee, then, that (73) cannot arise by moving the complete DP and splitting it up by partial deletion. Our model implies this without
further stipulations: in [keine [Briefe [an Maria]]] the PP an Maria is c-commanded by the noun Briefe. The attraction of a feature residing in PP thus cannot trigger the pied-piping of the whole DP – which is necessary for the emergence of a split construction involving an Maria and keine Briefe.24

(73) *[DP keine Briefe an Maria] haben mir [keine Briefe an Maria] gefallen
      no letters to Mary have me no letters to Mary pleased
      “No letters to Mary have pleased me”

For the same reason, the examples in (74a, c) cannot arise. The CNPC cannot be circumvented either by using partial deletion for deriving phonetic strings blocked by movement constraints: wen in (75a) could not have triggered pied-piping, as (75b) shows.

(74) a. *den ich kenne mag ich jeden
         who I know like I everyone
     b. ich mag jeden den ich kenne
        I like everyone who I know
        “I like everyone who I know.”
     c. *dass Maria schläft machte er die Behauptung
         that Mary sleeps made he the claim
     d. er machte die Behauptung, dass Maria schläft
        he made the claim that Mary sleeps
        “he made the claim that Mary is sleeping”

(75) a. *wen hast du eine Geschichte, dass sie t liebt kritisiert
         who have you a story that she loves criticized
         “Who did you criticize a story that she loves t ?”
     b. *es ist egal [CP [DP eine Geschichte dass sie wen liebt] du kritisiert hast]
         it is equal a story that she who loves you criticized have
         “it does not matter about who you have criticized a story that she loves him”
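The pied-piping generalization stated above (a feature f can pied-pipe the DP only if f is not c-commanded by the nominal head) can be checked mechanically on bracketed structures such as [keine [Briefe [an Maria]]]. The sketch below is illustrative only: the nested-list encoding of the DP and the names `leaves`, `c_commands`, and `can_pied_pipe` are assumptions of the example, and c-command is simplified to "a sister of A contains B".

```python
# Illustrative check of the pied-piping generalization; all names are hypothetical.
# A DP is encoded as nested lists, e.g. [keine [Briefe [an Maria]]].

def leaves(node):
    """Return the words dominated by a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    return [word for child in node for word in leaves(child)]


def c_commands(tree, a, b):
    """True if word a c-commands word b, i.e. some sister of a contains b."""
    if isinstance(tree, str):
        return False
    if a in tree:  # a is an immediate daughter of this node
        return any(b in leaves(sister) for sister in tree if sister != a)
    return any(c_commands(child, a, b) for child in tree)


def can_pied_pipe(dp, feature_host, noun):
    """A feature can pied-pipe the DP only if the noun does not c-command its host."""
    return not c_commands(dp, noun, feature_host)


if __name__ == "__main__":
    dp = ["keine", ["Briefe", ["an", "Maria"]]]
    print(can_pied_pipe(dp, "keine", "Briefe"))  # prenominal host
    print(can_pied_pipe(dp, "Maria", "Briefe"))  # host inside the postnominal PP
```

On this encoding, a feature hosted by keine may pied-pipe the DP, while a feature hosted inside an Maria may not, which is why the split in (73) cannot be derived by partial deletion.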
Morphological and other wellformedness conditions

Strong and weak inflection

In German, determiners, quantifiers and adjectives take their morphological forms from two paradigms, the “strong” and the “weak” inflection. The choice is determined by the syntactic context. Thus, in the neuter nominative/accusative paradigm, the negative universal quantifier takes the form kein if it appears in a noun phrase with a lexical noun (or an adjective), as in (76a). If the noun phrase contains neither a lexical noun nor an adjective, as in (76b–c), the strong form keines must be chosen. When a noun phrase is discontinuous, as in (77)–(78), the form which kein takes is not the one found in the corresponding continuous case (compare (76a) with (77)). Rather, kein takes exactly the form it would have if the second part of the split noun phrase were a single, independent noun phrase.

(76) a. er hat kein Geld.
        he has no money
     b. er hat keines/*kein.
        he has none
     c. er hat keines/*kein aus Deutschland.
        he has none from Germany

(77) Geld hat er kein-es/*kein.
     money has he no
     “he has no money.”

(78) Geld hat er kein amerikanisches
     money has he no American
     “he has no American money”

In other words, the second part of the split noun phrase takes the shape of a wellformed complete independent noun phrase with identical lexical content. The same holds for the first part. When an adjective such as englisch “English” is preceded by a definite determiner, it appears in the weak form englische in a neuter nominative/accusative situation (79a), while the strong form must be used when no determiner precedes or if the adjective follows an indefinite determiner (79b). In the discontinuous case, the form of the adjective in the first part of the split noun phrase again is not necessarily the one it would take in the corresponding continuous DP. Rather, it takes the form it would have if the first part were a simple independent noun phrase.
(79) a. ich habe nur das englische Geld da.
        I have only the English money there
        “I just have this English money over there.”
     b. ich habe (ein) englisches Geld.
        I have (an) English money

(80) englisches Geld hab ich nur das da.
     English money have I only that there
     “I just have this English money.”

This observation concerning the local morphological well-formedness of the parts of a split noun phrase is an old one (cf., e.g., Haider 1985). It has been used as an argument against a movement analysis, which is far from convincing because there is no reason to believe that the morphological shape of the determiner or adjective is not determined after copying and deletion.25 In other words, the morphemes merged into a syntactic representation are abstract entities. These abstract morphemes can be marked as [-pronounced] after the copying part of a movement operation. How they are spelt out is determined by the constellations they are part of at the spellout level. In a DP-split construction, both parts are dominated by a DP node. Thus, (81) affects both parts of the split DP. The strong-weak distinction in the form of the articles and adjectives is but one of the conditions that must be met by both parts of the split construction.

(81) The phonetic string dominated by a DP node must meet the lexical and morphological wellformedness conditions for DPs.
Overt determiners

The discussion in the preceding paragraph helps to understand a fundamental restriction concerning the formation of pure DP-splits in German. As had already been observed in the early work concerning DP-splits (Fanselow 1988; van Riemsdijk 1989), DP-splits can be wellformed in certain varieties of German only if the split phrase is a plural DP, or is projected from a mass noun. Thus, contrasts such as (82) can be observed:

(82) a. *Alten Professor kennt sie keinen
         old professor knows she no
         “she knows no old professor”
     b. Alte Professoren kennt sie keine
        old professors knows she no
        “she knows no old professors”
An informal internet questionnaire study revealed that only 3 out of 45 native speakers of German rated (82a) as grammatical. The “best” comparable structure (83) was accepted by 15 of the 45 consultants (of whom only one rejected pure splits of plural DPs).

(83) Lampe habe ich keine
     lamp have I no
     “I have not got a lamp”

The restriction in question is easy to understand: unless they are headed by a mass noun, German singular noun phrases are well-formed only if they have an overt determiner – unlike what holds for the plural:

(84) a. ich kenne Professoren
        I know professors
     b. *ich kenne Professor
         I know professor
     c. ich kenne einen Professor
        I know a professor
Whatever the nature of the restriction exemplified in (84) is, it creates a problem for the phonetic realization of a split construction [[einen Professor] . . .. [einen Professor] . . ..], because it implies in conjunction with (81) that the first occurrence of the copied DP cannot be pronounced without a determiner. In this situation van Riemsdijk’s regeneration idea comes into play. Since singular count nouns do not constitute well-formed DPs by themselves, alten Professor in (82a) cannot be part of a split DP. The problem can be circumvented, however, by ‘inserting’ an indefinite article into the determiner position of the left copy of the DP.

(82) c. einen alten Professor kennt sie keinen
        an old professor knows she no
        “she knows no old professor”
If we follow the standard idea that keinen is the spellout of a negative operator merged with an indefinite determiner, the following description seems natural: An abstract DP [neg [indef [alt [professor]]]] is copied to two operator positions. If two operator features are present, the constellation [[neg [indef [alt [professor]]]] — [neg [indef [alt [professor]]]] . . .. ] arises, in which the abstract morphemes neg and indef have been marked as [-pronounced] in the lefthand copy. This implies a conflict between requirement (85) blocking the realization of singular count DPs without overt determiners, and the pronunciation principle (86) that requires that no material be pronounced twice. If the former principle is stronger than the latter, the [-pronounced] instruction for abstract [indef] is ignored in the left copy – this
is the most economical way of respecting DP well-formedness. Consequently, [indef] is pronounced as einen as in (82c). This happens in the dialect of most speakers of German. The minority dialect ranks pronunciation economy (86) (= non-pronunciation of indef) higher than the determiner requirement (85). For these speakers, (82a) is grammatical.

(85) Singular count DPs start with a determiner
(86) Do not pronounce material twice

This dialectal difference constitutes an aspect of the split construction that is easy to account for in Optimality Theory. The same holds for (87). This sentence has been rated as ungrammatical by only 2 of the 45 consultants (9 found the sentence questionable, and 34 grammatical) – which is surprising given the number mismatch between the left and the right part of the split DP.

(87) Zeitungen liest er nur eine – die taz
     newspapers reads he only one the taz
     “As for newspapers, he only reads one: the taz”

Such constructions are grammatical only if the left DP is plural, and the right DP singular. The reverse constellation is strongly ungrammatical. Assume that the DP that was merged originally is nur eine Zeitung “only one newspaper-sg” – or rather a constellation of abstract morphemes corresponding to that. After copying and partial deletion, the configuration (88) arises. The left copy in (88) violates (85). One way of dealing with the problem is to ‘insert’ an article (if (85) outranks (86)), which leads to (90b), while the minority dialect tolerates (90a) since (86) >> (85). But there is a further way of dealing with the problem constituted by (85): one can realize the lefthand DP in the plural, so that (85) is not violated at all. Such a strategy obviously violates a further principle of spellout: abstract formal features should find the proper phonetic realization. But if (85) >> (89), the slight deviation from the input is warranted.

(88) [DP zeitung, sg] liest er [DP nur eine]

(89) Feature Faith: The Phonetic Realization must respect the formal features of the input

(90) a. Zeitung liest er nur eine
        newspaper-sg reads he only one
     b. eine Zeitung liest er nur eine einzige
        a newspaper reads he only a single
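The interaction of (85), (86), and (89) can be made concrete with a small Optimality-Theoretic evaluation. This is a sketch under stated assumptions rather than the authors' own tableau: the labels `DET`, `ECON`, and `FAITH` stand for (85), (86), and (89), each candidate for the left copy is assumed to violate exactly one constraint once, and the three total rankings are illustrative, since the text only fixes pairwise rankings.

```python
# Minimal OT evaluation for the left copy of an input "Zeitung, sg".
# DET = (85), ECON = (86), FAITH = (89). Violation counts are assumptions.
CANDIDATES = {
    "Zeitung":      {"DET": 1, "ECON": 0, "FAITH": 0},  # bare singular count noun
    "eine Zeitung": {"DET": 0, "ECON": 1, "FAITH": 0},  # indef pronounced twice
    "Zeitungen":    {"DET": 0, "ECON": 0, "FAITH": 1},  # plural deviates from input
}


def optimal(ranking):
    """Return the candidate whose violations are least serious under the ranking.

    Violation profiles are compared lexicographically, highest-ranked
    constraint first, which is the standard OT evaluation procedure.
    """
    return min(CANDIDATES, key=lambda c: tuple(CANDIDATES[c][k] for k in ranking))


if __name__ == "__main__":
    for ranking in (["DET", "FAITH", "ECON"],   # article 'inserted', cf. (90b)
                    ["ECON", "FAITH", "DET"],   # bare singular tolerated, cf. (90a)
                    ["DET", "ECON", "FAITH"]):  # plural left part, cf. (87)
        print(" >> ".join(ranking), "->", optimal(ranking))
```

Each ranking selects the candidate that violates only the lowest-ranked constraint, so moving a single constraint to the bottom of the hierarchy is enough to switch between the three attested patterns.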
(91) exemplifies a number of puzzles that arise in the context of DP splits in German. The structures exemplified in (91) have been used as arguments in favor of
base-generation by Fanselow (1988, 1993), but they can be dealt with successfully in the present theory as well if one assumes more pronunciation principles like (85). Thus, the order of the words in the two copies of the split DPs is sometimes not fully inverted. The sequence keine nur Bücher that underlies (91a) in our account is ungrammatical, as (91b) shows, because nur ‘only’ must be left-peripheral in a DP, while relative clauses (91c, d) have to appear at the right edge. Does this argue against deriving (91a, c) from sources like (91b, d)? If the two serialization constraints just mentioned do not govern the construction process of noun phrases, but rather apply to DPs in isolation at surface structure, the contrast in (91a–d) is explained: by splitting it up, the DP loses its offending properties.

(91) a. nur Bücher liest er keine
        only books reads he no
        “He just does not read any books.”
     b. *er liest keine nur Bücher
     c. Bücher, die erfolgreich waren, kennt er keine von Maria
        books which successful were knows he no by Mary
     d. *er kennt keine Bücher, die erfolgreich waren von Maria
         “He does not know any books by Mary that have been successful.”
     e. Bücher hat er welche
        books has he some
     f. *er hat welche Bücher
Likewise, welche “some” cannot co-occur with an overt noun in German, a problem that welche Bücher manages to solve in (91e) by splitting up.
Loose ends

The English–German/Slavic contrast

So far, we have focused on problems that may arise in the left copy of a split DP. The restrictions affecting the righthand copy of an inverted split seem to have more severe consequences. In an inverted DP-split, the right copy has no overt nominal head. In German and Slavic, this cannot create a problem because the overt realization of a noun in DP is never necessary. Likewise, Warlpiri noun phrases need no overt noun (Hale 1983); the same holds for West Greenlandic (Fortescue 1984), Latin (Kühner-Stegman 1976, §§61, 247; Ostafin 1986), and Dyirbal and Yidiñ (Dixon 1972, 1979). The option for omitting a noun, that is, the option for an ellipsis of the complement of some functional category in the DP, is certainly related to the
“strength” of agreement in the noun phrase in these languages. Lobeck (1991) suggests that only agreeing functional heads permit ellipsis of their complements. Languages tolerating noun ellipsis allow a split construction. In contrast, most English noun phrases need an overt nominal head. The literal translation (92b) of German (92a) is ungrammatical – the empty nominal position must be filled by one, as in (92c). No XP-splits exist (92d). Noun ellipsis is also impossible in Japanese, which has no movement-based split construction either.

(92) a. Ich kaufe ein teures
     b. *I buy an expensive
     c. I buy an expensive one
     d. *books, I bought three expensive
Fanselow (1988) tries to derive the grammaticality of XP-splits from the independent existence of DPs lacking an overt noun. Because of the repair strategies discussed in 5.2, the present model does not correlate XP-splits and the existence of noun phrases without nouns.
PP-splits

Since local wellformedness requirements as discussed in 5.2. imply that the parts of a split DP should come as close as possible to the shape that complete independent DPs have, it is natural to suspect that the same holds for split PPs. (93) seems to be an obvious and trivial condition for the phonetic realization of PPs. It implies that PPs cannot be split in the strict sense (94): only one of the two copies can fulfill (93) if distributed deletion is maximal.

(93) Left Edge of PP
     PPs begin with an overt preposition

(94) *Bücher hat er in keine geschaut
      books has he in no looked
      “He has not looked into any books.”

Just as in the cases discussed in 5.2., the problem can be repaired by choosing a less economical pronunciation, that is, by realizing the preposition in both copies:

(95) in Schlössern habe ich noch in keinen gewohnt
     in castles have I yet in no lived
     “So far, I have not yet lived in any castle.”

However, two copies of the preposition are retained in inverted splits only, and not in pull splits, as (96) shows:
(96) Na kakvo se Ivan stablo penje? (Croatian)
     on what-kind-of self I. tree climbs
     “On what kind of tree does Ivan climb?”
This difference might be captured if (93) is replaced by two requirements. First, we assume that the preposition and the category it selects should be phonetically adjacent. This implies that na kakvo and in keinen have to be phonetic neighbors in (96) and (95), respectively. Second, we assume that the highest element in a chain created by a strong categorial feature W must contain an overt element realizing the categorial feature W. If PPs are attracted by a P-feature, the second principle implies the presence of a preposition for in Schlössern in (95) (so both copies of in keinen Schlössern must realize the preposition) while it does so for na kakvo in (96). Consequently, the preposition can be absent in the lower copy of na kakvo stablo there.
A mystery

Some speakers of German (10 out of the 45 informants)26 find structures such as (97a) unobjectionable – a construction which cannot be integrated easily into the present framework because no speaker of German accepts noun phrases with more than one nominal head, that is, *nur Bussarde Raubvögel is completely ungrammatical.

(97) a. #Raubvögel kennt Gereon nur Bussarde
         birds of prey knows Gereon only buzzards
         “As for knowing birds of prey, Gereon knows just buzzards.”
     b. *er kennt nur Bussarde Raubvögel
     c. #Raubvögel greifen den Gereon immer nur Bussarde an
         birds of prey attack the Gereon always just buzzards ptc
         “as for being attacked by birds of prey – Gereon is always attacked by buzzards only”
Similar problems arise in the analysis of noun incorporation, as Mithun (1984: 870) and Anderson (2000b) point out: the incorporation of ‘fish’ does not preclude the appearance of a head noun in the object DP in Mohawk (98).

(98) sha’té:ku snikú:ti rabahbót wahutsyahní:nu ki rake’niha
     eight of.them bullhead he.fish.bought thus my.father
     “my father bought eight bullheads”

How can these structures be analyzed in incorporation models? In many languages, N2 frequently bears a possessor relation to the incorporated noun N1 in the incorporation structure [VP [V V-N1] [DP . . . N2 . . . ]]. We may analyse such a constellation as arising from a movement of N1 out of a DP in which N1 is the head and N2 the specifier. If this is correct, one just needs to account for the objective Case appearing on N1 and N2 – but this simply illustrates Case concord between a head and a specifier amply documented in, e.g., Massam (1986). This account may then be extended to (98) and even to (97) if the range of “possessive” relations between N2 and N1 can include a general partitive relation, too.

As (99) shows, the construction (97a) is more restricted than standard splits. (97a) contrasts with (99a–b). (99c) suggests that the first noun phrase must not bear dative case (in contrast to split DPs), while (99d) shows that the problem cannot be solved by simply assuming that Raubvögel is a free topic in (97a). We have to leave the precise analysis of the construction open.

(99) a. *Raubvögel kennt er keine Bussarde
         birds of prey knows he no buzzards
     b. *einen Raubvogel kennt er nur einen Bussard
         a bird of prey knows he only a buzzard
     c. Raubvögeln ähnelt ein Dinosaurier keinen/*Bussarden
        birds of prey resembles a dinosaur no / *buzzards
     d. Raubvögel gekannt hat er nur Bussarde
        birds of prey known has he only buzzards
Other constructions

It is tempting to explain by distributed deletion a further construction type whose properties are very similar to those of the one discussed here, namely extraposition. Haider (1997) notes a number of problems concerning the assumption of an extraposition operation for CPs in German: clauses in “extraposed” position are not barriers, as they should be in a derived position (but see Müller 1998 for considerations weakening this argument), they are c-commanded by the elements preceding them according to evidence involving polarity items, and the movement that extracts them out of their host noun phrase would violate conditions on movement more often than not: In (100a), the relative clause would have been moved out of a PP. This problem shows up with the apparent extraposition of PPs, too, as (100b) shows.

(100) a. ich habe an eine Frau gedacht, die Bücher liest
         I have at a woman thought who books reads
         “I have thought about a woman who reads books”
      b. ich habe über den Titel nachgedacht von deinem Buch
         I have about the title thought of your book
         “I have reflected about the title of your book”
The island problem would be avoided if the whole DP or PP is generated behind the verb, and if its movement to, e.g., the AGR-O position preceding the verb can strand the relative clause or a PP. That this stranding might be an instance of partial deletion was suggested by Mahajan (p.c.); see Hinterhölzl (1999) for a detailed version of this position. We do not want to assess the virtues of these ideas in the present paper, but we wish to point out one difference. Recall that our account implies that no material may rest in the base position of a phrase, since partial deletion applies only if the phrase in question hosts two or more different features that cannot be phonetically realized in a single position. Obviously, the relative clause in (100a) does not bear such a feature – it seems as if one would have to assume purely phonological principles that enforce partial deletion in the case of relative clause stranding.

Fanselow and Ćavar (2001) re-analyse the appearance of stranded verbal particles in German and Dutch verb second movement as involving distributed deletion. Hinterhölzl (to appear) derives a number of apparent remnant movement effects from distributed deletion. Thus, the scope of the mechanism proposed here may well go beyond the split construction.
Notes

* This article has its roots in a joint presentation at the 1997 International Conference on Pied Piping held at the Friedrich Schiller University Jena. Parts of the paper have been presented at workshops and conferences at the universities in Leipzig, Osijek, Poznan, and Stuttgart. For inspiring discussions and helpful hints, we are grateful to Artemis Alexiadou, Josef Bayer, Caroline Féry, Gereon Müller, Henk van Riemsdijk, Peter Staudacher, and Masatoshi Tanaka. Particular thanks go to Anoop Mahajan, who suggested distributed deletion as an analysis of split DPs in the discussion period of a 1995 talk by Fanselow. The research reported here was partially supported by grants of the German Research Foundation (DFG) to the Innovationskolleg “Formale Modelle kognitiver Komplexität” (INK 12) at the University of Potsdam, and the Graduiertenkolleg “Ökonomie und Komplexität in der Sprache” at the Humboldt University Berlin and the University of Potsdam.

(5) and (6) belong to the class of “separation constructions” in the sense of Pesetsky (2000). “Split topicalization” and “split scrambling” have been used as further labels for (5) and (6) in the literature. The Slavic constructions have also been discussed as a subtopic of “left-branch extractions.”

Speakers differ in the extent to which they accept or reject multiple splits – presumably, because multiple splits involve a highly complex pragmatic structure. Furthermore, multiple splits necessarily involve one phrase which is completely split within IP. Such splits were considered ungrammatical in the early literature (see, e.g., Fanselow 1988; Kniffka 1996), but such claims were based on data that were constructed in a less than optimal way.

The existence of this construction type has been brought to our attention by Josef Bayer.
See Kniffka (1996) for an assessment of the dialectal distribution of imperfect DP splits with two determiners.

More precisely, Riemsdijk (1989) assumes that the clause initial position of German, nowadays the specifier of CP, can host maximal projections only, and considers amerikanische Wagen not to be one – certainly a necessary assumption in models of noun phrase structure that did not assume the fine functional structure related to the DP models.

Of course, the problem will not be solved if we assume extraction of non-constituents.

In a questionnaire study carried out together with Reinhold Kliegl and Matthias Schlesewsky, we found a certain nestedness of judgments: there are some (few) speakers who accept splitting for accusative noun phrases only, others accept splits of nominative and accusative noun phrases, and a third group accepts discontinuity for accusative, nominative, and dative phrases. The nesting of the judgements in our questionnaire study reflects the development of judgements in the literature, to a certain extent.

DeKuthy (2000) has argued that German noun phrases are islands for extractions of PPs. Structures showing apparent PP-extraction from NP involve an underlying structure [VP NP PP V] in her approach. A discussion of this view is beyond the scope of the present paper. Note that our major point is not affected if her analysis is correct: movement processes have to obey the island conditions, independent of whether the construction one wants to compare XP splits with is a movement construction or not. In fact, we might say that our general point would rather be strengthened, because it would be fairly unclear why noun phrases should be islands for extractions of PPs but not for extractions of, say, NomPs. Given that splits affect subjects and indirect objects, a reanalysis option is ruled out as an account for split constituents immediately, because reanalysis processes are assumed to involve direct objects only (if the process exists at all).

The only way to counter this argument against the movement approach of split constituents would be to claim that the bounding theory does not hold for the extraction of XPs from a YP that is an “extended projection” of XP. We do not think that such a proposal could be spelt out in a convincing way, and it would be incompatible with the observation that the extraction of VPs out of IPs or CPs essentially respects bounding theory.

Doing so may in fact be simple: if an XP can be linked thematically to predicate P only if XP is merged in the projection of P, then two XPs sharing a thematic role must be merged in the same maximal projection. Alternatively, one can assume that one part of the DP merges in VP, the other in the projection of the functional head which licenses the formal features of the DP. If, as argued in Fanselow (2001a, in press), the checking of certain formal features implies theta-role assignment, and if the two DP-parts both check features with the relevant functional head, they share a thematic role and their relation is correctly predicted to be a local one. In the interest of space, we will not pursue this idea here.

Note, however, that there are also word order facts that are unexpected in simple movement theories at least, as we show in Section 5.

Given the freedom of word order in Slavic noun phrases, no similar argument can be made easily for noun phrase splits in Slavic. For PP splits, the relevant point is obvious
Distributed deletion
however; base generation does not readily explain why the highest part of a PP must contain the preposition. . In this respect, the present account is much in line with base generation theories as proposed in Fanselow (1988) or van Geenhoven (1998): there is movement, but splitting itself is not caused by extraction from something. . Pesetsky (2000) argues that featural movement of the kind introduced in Chomsky (1995) is nevertheless necessary, in addition to the phonological deletion of the upstairs copy. If Chomsky (2000) is correct in replacing feature movement by agreement at a distance, the analysis sketched above is, of course, the only kind of covert movement. . For a similar approach developed independently of us, see Hinterhölzl (1999, to appear). . Partial deletion effects might be reanalyzed as involving (partial) reconstruction of phonetic material in the overt component: phrase Σ first moves completely from a to b, later, a part of Σ is reconstructed to a. At a purely descriptive level, this approach and the theory proposed here have fairly similar consequences. . The effects of partial reconstruction can be reanalyzed as being due to distributed deletion after LF copying. An analysis of all reconstruction phenomena in terms of distributed deletion seems possible, but is well beyond the scope of the present paper. A few remarks can be found in Fanselow (2001b). . In this respect, it resembles other types of NP-discontinuity, cf. DeKuthy (2000). . Ungrammaticality may, however, be repaired for certain speakers with heavy stress. . Note that reference to the fact that an accusative NP occupies a non-derived focus position cannot account for the contrast discussed above: PP objects can occupy non-derived postverbal focus positions, too, yet they cannot be split up there. . Recall that the Minimal Link Condition (MLC) requires that K attracts α only if there is no β, β closer to K, such that K attracts β (Chomsky 1995: 310). 
α is closer to target K than β if α c-commands β (Chomsky 1995: 358). . The A-over-A condition (Chomsky 1964) reduces to the MLC if the feature triggering the locality effect is sitting on a syntactic head which projects this feature. . That is, we assume that contiguity is a graded constraint. . Croatian adds a difficulty, however, (i) shows that PPs may be split off noun phrases in a process of partial deletion – recall that dative DPs are islands for movement. At present, we have no account for this difference between German and Croatian. (i)
Knjigama je Ivan o matematici davao ocjene
Books-dat has Ivan about mathematics given grades
“Ivan has given grades to books about mathematics.”
. The morphological facts of German are mirrored in other languages with DNP. In Warlpiri, noun phrases are morphologically well formed if they begin with a (possibly empty) sequence of words not bearing Case morphemes followed by a (necessarily non-null) sequence of words (including the final one) that are Case-marked. This condition must be respected by the parts of a DNP individually. See Nash (1980).
(i)
kurdu-ngku wita-ngku ka maliki wajilipi-nyi
child-erg small-erg aux dog chase-np
kurdu wita-ngku ka maliki wajilipi-nyi
kurdu-ngku ka maliki wajilipi-nyi wita-ngku
*kurdu ka maliki wajilipi-nyi wita-ngku
. 18 informants rated (97) ungrammatical, 17 found it questionable.
References Abney, S. (1987). The English Noun Phrase in its Sentential Aspects. PhD dissertation, MIT. Anderson, S. (2000a). Towards an Optimal Account of Second-Position Phenomena. In J. Dekkers, F. van der Leeuw, and J. van de Weijer (Eds.), Optimality Theory: Phonology, syntax and acquisition (pp. 302–333). Oxford: OUP. Anderson, S. (2000b). Some Lexicalist Remarks on Incorporation Phenomena. In B. Stiebels and D. Wunderlich (Eds.), The Lexicon in Focus (pp. 123–142). Berlin: Akademie Verlag. Austin, P. & J. Bresnan (1996). Non-Configurationality in Australian Aboriginal Languages. Natural Language and Linguistic Theory, 14, 215–268. Baker, M. (1991). On Some Subject/object Non-asymmetries in Mohawk. Natural Language and Linguistic Theory, 9, 537–576. Baker, M. (1995). The Polysynthesis Parameter. Oxford: OUP. den Besten, H. & G. Webelhuth (1990). Stranding. In G. Grewendorf and W. Sternefeld (Eds.), Scrambling and Barriers (pp. 77–92). Amsterdam: John Benjamins. Boškovi´c, Ž. (1997). Second Position Cliticization: Syntax and/or phonology? Ms., University of Connecticut. Browne, W. (1975). Serbo-Croatian Enclitics for English-Speaking Learners. In R. Filipovi´c (Ed.), Contrastive Analysis of English and Serbo-Croatian. Zagreb: Institute of Linguistics. Browne, W. (1976). Two Wh-fronting Rules in Serbo-Croatian. Južnoslovenski filolog, 32, 194–204. ´ Cavar, D. (1999). Aspects of the Syntax-Phonology Interface. Doctoral dissertation. University of Potsdam. Chomsky, N. (1964). Current Issues in Linguistic Theory. The Hague: Mouton. Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: The MIT Press. Chomsky, N. (2000). Minimalist Inquiries: The framework. In R. Martin, D. Michaels and J. Uriagereka (Eds.), Step by Step: Essays on minimalist syntax in honor of Howard Lasnik (pp. 89–155). Cambridge, MA: The MIT Press. Culicover, P. & M. Rochemont (1990). Extraposition and the Complement Principle. Linguistic Inquiry, 21, 23–47. Corver, N. (1990). 
The Syntax of Left Branch Extractions. Doctoral dissertation, Tilburg University. De Kuthy, K. (2000). Discontinuous NPs in German. A case study of the interaction of syntax, semantics, and pragmatics. Doctoral dissertation, Saarbrücken. Diesing, M. (1992). Indefinites. Cambridge, MA: The MIT Press.
Dixon, R. (1972). The Dyirbal Language of North Queensland. Cambridge: CUP. Dixon, R. (1979). A Grammar of Yidiˇn. Cambridge: CUP. Fanselow, G. (1988). Aufspaltung von NP und das Problem der ‘freien’ Wortstellung. Linguistische Berichte, 114, 91–113. Fanselow, G. (1993). The Return of Base Generators. Groninger Arbeiten zur Germanistischen Linguistik (GAGL), 36, 1–74. Fanselow, G. (2001). Features, Theta-roles and Free Constituent Order. Linguistic Inquiry, 32(3). Fanselow, G. (2001b). When Formal Features need Company. In C. Féry and W. Sternefeld (Eds.), Audiatur Vox Sapientiae (pp. 131–152). Berlin: Akademie Verlag. Fanselow, G. (In press). Against Remnant VP-Movement. In A. Alexiadou, E. Anagnostopoulou, S. Barbiers and H.-M. Gaertner (Eds.), Dimensions of Movement. Amsterdam: John Benjamins. ´ Fanselow, G. & D. Cavar (2001). Remarks on the Economy of Pronunciation. In G. Müller and W. Sternefeld (Eds.), Competition in Syntax (pp. 107–150). Berlin: Mouton de Gruyter. Fanselow, G. & A. Mahajan (2000). Towards a Minimalist Theory of Wh-expletives, Whcopying, and Successive Cyclicity. In U. Lutz, G. Müller and A. von Stechow (Eds.), Wh-scope Marking. Amsterdam: John Benjamins. Fortescue, M. (1984). West Greenlandic. London: Croom Helm. Fox, D. (1995). Condition C Effects in ACD. MIT Working Papers in Linguistics, 27, 105–120. Franks, S. (1998). Clitics in Slavic. Ms. position paper Comparative Slavic Morphosyntax, Bloomington, Indiana. Franks, S. & Progovac, L. (1994). On the placement of Serbo-Croatian clitics. Indiana Slavic Studies, 7, 69–78. Frey, W. (2000). Über die syntaktische Position der Satztopiks im Deutschen. Ms., ZAS, Berlin. van Geenhoven, Veerle (1998). Semantic Incorporation and Indefinite Descriptions: Semantic and syntactic aspects of noun incorporation in West Greenlandic. Stanford: CSLI Publications. Groat, E. & J. O’Neill (1996). Spellout at the LF-interface. In W. Abraham, S.D. Epstein, H. Thraínsson and J.-W. Zwart (Eds.), Minimal Ideas (pp. 
113–139). Amsterdam: John Benjamins. Haider, H. (1985). Über sein oder nicht sein: zur Grammatik des Pronomens sich. In W. Abraham (Ed.), Erklärende Syntax des Deutschen [Studien zur deutschen Grammatik 25] (pp. 223–254). Tübingen: Narr. Haider, H. (1997). Extraposition. In D. Beerman, D. LeBlanc, and H. van Riemsdijk (Eds.), Rightward Movement [Linguistics Today 17] (pp. 115–151). Amsterdam: John Benjamins. Hale, K. (1983). Warlpiri and the Grammar of Non-configurational Languages. Natural Language and Linguistic Theory, 1, 5–47. Hiemstra, I. (1986). Some Aspects of Wh-questions in Frisian. NOWELE, 8, 97–110. Hinterhölzl, R. (1999). Licensing Movement and Stranding in the West Germanic OV Languages. Ms., Humboldt University Berlin.
Hinterhölzl, R. (In press). Remnant Movement and Partial Deletion. In A. Alexiadou, E. Anagnostopoulou, S. Barbiers and H.-M. Gaertner (Eds.), Dimensions of Movement. Amsterdam: John Benjamins. Höhle, T. (1996). German w...w-constructions. In U. Lutz and G. Müller (Eds.), Papers on Wh-Scope Marking. Arbeitspapier 76 des Sonderforschungsbereich 340. Stuttgart and Tübingen. Horn, G. (1975). The Noun Phrase Constraint. Doctoral dissertation, University of Mass., Amherst. Jelinek, E. (1984). Empty Categories, Case, and Configurationality. Natural Language and Linguistic Theory, 2, 39–76. Kayne, R. (1994). The Antisymmetry of Syntax. Cambridge, MA: The MIT Press. Kniffka, G. (1996). NP-Aufspaltung im Deutschen [KLAGE 31]. Hürth: Gabel. Kuhn, J. (1998). Resource Sensitivity in the Syntax-semantics Interface and the German Split NP Construction. In T. Kiss and D. Meurers (Eds.), Proceedings of the ESSLLI X Workshop on Current Topics in Constraint Based Theories of Germanic Syntax. Saarbrücken. Kuhn, J. (To appear). The Syntax and Semantics of Split NPs in LFG. In Proceedings of CSSP 1997. Kühner, R. & C. Stegmann (1976). Ausführliche Grammatik der lateinischen Sprache. Zweiter Teil: Satzlehre. Darmstadt: Wissenschaftliche Buchgesellschaft. Lapteva, O. A. (1976). Russkij razgovornyj sintaksis ‘Russian Colloquial Syntax’. Moscow: Nauka. Lobeck, A. (1991). Phrase Structure of Ellipsis in English. In S. Rothstein (Ed.), Perspectives on Phrase Structure [Syntax & Semantics 25] (pp. 81–107). San Diego, CA: Academic Press. Massam, D. (1986). Case Theory and the Projection Principle. PhD dissertation, MIT. Mithun, M. (1984). The Evolution of Noun Incorporation. Language, 60, 847–894. Müller, G. (1996). A Constraint on Remnant Movement. Natural Language and Linguistic Theory, 14, 355–407. Müller, G. (1998). Incomplete Category Fronting. Dordrecht: Kluwer. Müller, G. (2001).
Order Preservation, Parallel Movement, and the Emergence of the Unmarked. In G. Legendre, J. Grimshaw and S. Vikner (Eds.), Optimality Theoretic Syntax. Cambridge, MA: The MIT Press. (Also on ROA ROA-275-0798.) Nash, D. (1980). Topics in Warlpiri Grammar. New York: Garland. Nunes, Jairo (1995). The Copy Theory of Movement and Linearization of Chains in the Minimalist Program. Doctoral dissertation, University of Maryland. Nunes, J. (2001). Sideward movement. Linguistic Inquiry, 32, 303–344. Ostafin, D. M. (1986). Studies in Latin Word Order: A transformational approach. Doctoral dissertation, University of Connecticut, Storrs. Pesetsky, D. (1998). Some Optimality Principles of Sentence Pronunciation. In P. Barbosa, D. Fox, P. Hagstrom, M. McGinnis and D. Pesetsky (Eds.), Is the Best Good Enough? (pp. 337–383). Cambridge, MA: The MIT Press. Pesetsky, D. (2000). Phrasal Movement and its Kin. Cambridge, MA: The MIT Press. Pili, D. (2001). On A- and A -dislocation in the left periphery. A comparative approach to the cartography of the CP-system. Doctoral dissertation, University of Potsdam.
Riemsdijk, Henk van (1989). Movement and Regeneration. In P. Benincà (Ed.), Dialectal Variation and the Theory of Grammar (pp. 105–136). Dordrecht: Foris. Sabel, J. (1998). Principles and Parameters of Wh-movement. Habilitation thesis, Frankfurt/M. Sekerina, I. (1997). The Syntax and Processing of Scrambling Constructions in Russian. Doctoral dissertation, CUNY. Siewierska, A. (1984). Phrasal Discontinuity in Polish. Australian Journal of Linguistics, 4, 57–71. Sportiche, D. (1988). A Theory of Floating Quantifiers and its Corollaries for Constituent Structure. Linguistic Inquiry, 19, 425–449. Stechow, A. von (1992). Kompositionsprinzipien und grammatische Struktur. In P. Suchsland (Ed.), Biologische und soziale Grundlagen der Sprache (pp. 175–248). Tübingen: Niemeyer. Tanaka, M. (In preparation). Anti-Bewegungsanalyse der japanischen Topik-Transformation im Vergleich mit dem Deutschen (working title). Doctoral dissertation, University of Potsdam. Tappe, T. (1989). A Note on Split Topicalization in German. In C. Bhatt, E. Löbel and C. Schmidt (Eds.), Syntactic Phrase Structure Phenomena in Noun Phrases and Sentences (pp. 159–179). Amsterdam: John Benjamins. Thiersch, C. (1985). VP and Scrambling in the German Mittelfeld. Ms., University of Tilburg. Vogel, R. & M. Steinbach (1998). The Dative – an Oblique Case. Linguistische Berichte, 173, 65–90. Wilder, C. (1997). Phrasal Movement in LF: de re readings, binding and ellipsis. In K. Kusumoto (Ed.), NELS 27 [Proceedings of the North East Linguistic Society] (pp. 425–439). GSLA, Amherst. Yearley, J. (1993). Discontinuity in the Russian Noun Phrase. Ms., University of Massachusetts, Amherst. Zec, D. & S. Inkelas (1990). Prosodically Constrained Syntax. In S. Inkelas and D. Zec (Eds.), The Phonology-Syntax Connection. Chicago: The University of Chicago Press.
Roots, constituents, and c-command
Robert Frank, Paul Hagstrom, and K. Vijay-Shanker
Johns Hopkins University / Boston University / University of Delaware
Background
At the core of syntactic theory is the question of how grammatical structures are properly characterized. It has long been clear that sentences have constituents and hierarchical structure, and these abstract structures have generally been described in terms of trees such as the one shown in (1) for the surface string BDE separable into two constituents, B and DE, where B is hierarchically superior to both D and E. (1)
        A
       / \
      B   C
         / \
        D   E
In grammatical explanation, the relation of c-command is of fundamental importance: movement is allowed only if the moved element c-commands its trace, antecedents must c-command pronouns and anaphors, and so forth. Traditionally, c-command is defined in terms of dominance: α c-commands β iff every node dominating α dominates β (and neither dominates the other). However, the dominance relation is much less central in syntactic explanation than c-command. In fact, our hypothesis here (following Frank & Vijay-Shanker 2001) is that syntactic structures ought to be characterized directly in terms of a primitive c-command relation, as opposed to a primitive dominance relation. Frank and Vijay-Shanker (2001) show that the class of tree structures that can be characterized in terms of primitive c-command is a subclass of those that can be characterized with primitive dominance, including all of the linguistically relevant ones. This match between the expressiveness of primitive c-command and the range of natural language structures provides support for the primitive c-command hypothesis. However, since we take the existence of primitive c-command to imply the non-existence of primitive dominance, we must deal with the consequence that there is no way for grammar to refer to dominance in this sense at all. In this paper, we aim to demonstrate the tenability of our claim that dominance does not figure into grammatical explanation. We show first that the cases in which dominance was traditionally considered crucial can be translated into statements about c-command, and, second, that viewing these cases in terms of c-command furthers our understanding.
Dominance has been used primarily to define two notions, roots and constituents. In traditional terminology, the root of a tree (e.g., A in (1)) is the node which dominates all other nodes, and a constituent (e.g., {C, D, E} in (1)) is the set of nodes all of which are dominated by a specified node (e.g., C in (1)). We will consider these two concepts in turn, looking at what makes them important, how they can be described in terms of c-command, and what light this sheds on the phenomena involved.
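The traditional, dominance-based definition of c-command can be made concrete with a small Python sketch over the tree in (1). This is an illustration only: the names `PARENT`, `proper_ancestors`, `dominates`, and `c_commands` are hypothetical, and the sketch assumes that "every node dominating α" is read as proper dominance, while "neither dominates the other" uses reflexive dominance so that no node c-commands itself.

```python
# Tree (1) as a child -> parent map; A is the topmost node.
PARENT = {"B": "A", "C": "A", "D": "C", "E": "C"}
NODES = {"A", "B", "C", "D", "E"}

def proper_ancestors(node):
    """Return the set of nodes that properly dominate `node`."""
    result = set()
    while node in PARENT:
        node = PARENT[node]
        result.add(node)
    return result

def dominates(x, y):
    """Reflexive dominance: x is y itself or one of y's ancestors."""
    return x == y or x in proper_ancestors(y)

def c_commands(a, b):
    """Traditional definition: neither node dominates the other, and every
    node properly dominating a also dominates b."""
    if dominates(a, b) or dominates(b, a):
        return False
    return all(dominates(n, b) for n in proper_ancestors(a))

# All ordered pairs (c-commander, c-commandee) in tree (1).
C_COMMAND = {(a, b) for a in NODES for b in NODES if c_commands(a, b)}
print(sorted(C_COMMAND))
```

Under these assumptions the topmost node A neither c-commands nor is c-commanded by anything, which is the property the c-command-based notion of root relies on.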
Roots and substitution
There are two properties that we normally associate with the root of a structure like (1). First, the root is the node which determines the category (or features) of the (sub-)tree as a whole. Second, it is the node to which further attachments occur (in a cyclic derivation). Intuitively, the root of a structure is the node that is closest to the top, the node that is least deeply embedded in the structure. We can formalize this intuition with the less embedded relation in (2), which approximates the classical “dominance” relation in a certain range of cases (Frank & Vijay-Shanker 2001).
(2) A node x is less embedded than a node y iff x does not c-command y, and every node which c-commands x also c-commands y.
    x ≤ y iff ¬(x C y) ∧ ∀z [z C x → z C y]
There are two parts to the definition in (2). First, it says that x can only be less embedded than y if y is indeed embedded; if x c-commands y, then y is certainly not embedded below x – x could not “dominate” y in traditional terms. The second part says that x can only be less embedded than y if everything
that c-commands x also c-commands y. To give concrete examples, consider some nodes from the tree in (1), repeated below. (1) (repeated)
        A
       / \
      B   C
         / \
        D   E
In terms of primitive c-command, this structure is characterized by the following set of c-command relations:
(3) D mutually c-commands E (D c-commands E, E c-commands D)
    B mutually c-commands C (B c-commands C, C c-commands B)
    B c-commands D
    B c-commands E
Given the definition in (2), we can see that C is less embedded than D here because (i) C does not c-command D, and (ii) everything that c-commands C (namely, B) also c-commands D. The reverse does not hold; D is not less embedded than C because there is something which c-commands D (namely, E) that does not c-command C. By the same reasoning, we can see that A is less embedded than B but not vice versa, since (vacuously) everything that c-commands A also c-commands B, but there is a node (C) which c-commands B but not A. To identify the root node of (1), we can use the intuitive idea that the root is the least embedded node in terms of this formal definition of relative embeddedness and conclude that A is the root; for any node N in (1), A is less embedded than N and N is not less embedded than A. The complete list of c-command relations from (1) is given in (3). At least for the case of the structure (1), the definition of less embedded in (2) gives us the same results we would have had using traditional “dominance.”
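The calculations just described can be checked mechanically. The following Python sketch encodes definition (2) over the relations in (3), reading the B/C row as mutual c-command; the function names `less_embedded` and `least_embedded` are hypothetical and serve only to illustrate the definition.

```python
# Structure (1): node A over B and C, with C over D and E.
NODES = {"A", "B", "C", "D", "E"}

# The c-command relations in (3), written as (c-commander, c-commandee).
C_COMMAND = {
    ("D", "E"), ("E", "D"),   # D and E c-command each other
    ("B", "C"), ("C", "B"),   # B and C c-command each other
    ("B", "D"), ("B", "E"),   # B c-commands D and E
}

def less_embedded(x, y, nodes=NODES, cc=C_COMMAND):
    """Definition (2): x does not c-command y, and every node that
    c-commands x also c-commands y."""
    if (x, y) in cc:
        return False
    return all((z, y) in cc for z in nodes if (z, x) in cc)

def least_embedded(nodes=NODES, cc=C_COMMAND):
    """Nodes that are less embedded than every other node while no other
    node is less embedded than they are."""
    return {
        n for n in nodes
        if all(less_embedded(n, m, nodes, cc) and not less_embedded(m, n, nodes, cc)
               for m in nodes - {n})
    }

print(less_embedded("C", "D"))  # True
print(less_embedded("D", "C"))  # False: E c-commands D but not C
print(less_embedded("A", "B"))  # True (vacuously)
print(less_embedded("B", "A"))  # False: C c-commands B but not A
print(least_embedded())
```

Note that the relation is reflexive under this definition (every node is less embedded than itself), which is why the comparison in `least_embedded` ranges over the other nodes only.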
Roots and adjunction
Adjunction structures differ from the simpler substitution structure (1) considered above in terms of their c-command relations. In (4), D and E are sisters, and B has been adjoined to C.1 Most of the c-command relations are as before, but notice that whereas B and C stood in a mutual c-command relation in (1), B asymmetrically c-commands C in (4). That is, an adjunct c-commands its “sister” but not vice-versa (following Kayne 1995: 16).
(4)     C
       / \
      B   C
         / \
        D   E
(5) D mutually c-commands E (D c-commands E, E c-commands D)
    B c-commands C
    B c-commands D
    B c-commands E
Running through the same sort of calculation as before, we discover unsurprisingly that C is less embedded than both D and E: C does not c-command either D or E, and the only node c-commanding C, namely B, also c-commands both D and E. More interestingly, we also see that B is not less embedded than C, since B c-commands C, but neither is C less embedded than B, since there is a node (B) which c-commands C but not B. The question then arises: What is the root of the structure in (4)? There is no node which is less embedded than all other nodes; it cannot be C (since C is not less embedded than B), nor can it be B (since B is not less embedded than C). We would not like to consider (4) to be a rootless structure, since it must always be possible to determine the category of such a tree, and this is one of the functions of the root node. There is a slightly weaker way we can think of the root of a structure. There are two nodes in (4) of which we can at least say that no other node is less embedded than they are. There is no node less embedded than B, nor is there a node less embedded than C. Following this idea, we define the root node as follows.
(6) A node N is a root iff there is no other node M such that M is less embedded than N.
    N is a root ↔ ¬∃M [M ≠ N ∧ M ≤ N]
Using this definition, both B and C are roots of (4). In fact, this way of looking at roots means quite generally that adding adjuncts adds roots to the structure. So, in a structure like (7), the roots are C, F, and B.
(7)     C
       / \
      F   C
         / \
        B   C
           / \
          D   E
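The root definition in (6) can be tried out on the adjunction structures (4) and (7). The Python sketch below is illustrative only; `less_embedded` and `roots` are hypothetical names, and for (7) it assumes, by analogy with B in (5), that the adjunct F c-commands D and E as well as B and C.

```python
def less_embedded(x, y, nodes, cc):
    """Definition (2): x does not c-command y, and every node that
    c-commands x also c-commands y."""
    return (x, y) not in cc and all((z, y) in cc for z in nodes if (z, x) in cc)

def roots(nodes, cc):
    """Definition (6): N is a root iff no *other* node is less embedded than N."""
    return {n for n in nodes
            if not any(less_embedded(m, n, nodes, cc) for m in nodes - {n})}

# Structure (4): B adjoined to C; relations as in (5).
NODES_4 = {"B", "C", "D", "E"}
CC_4 = {("D", "E"), ("E", "D"), ("B", "C"), ("B", "D"), ("B", "E")}

# Structure (7): F adjoined on top of (4).
# Assumption: F c-commands every node of (4), as B does for C, D, E in (5).
NODES_7 = NODES_4 | {"F"}
CC_7 = CC_4 | {("F", n) for n in NODES_4}

print(sorted(roots(NODES_4, CC_4)))  # ['B', 'C']
print(sorted(roots(NODES_7, CC_7)))  # ['B', 'C', 'F']
```

Each adjunct is c-commanded only by adjuncts attached after it, so no node can be less embedded than it, and every adjunction therefore contributes one more root.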
However, even in multi-rooted structures like (7), two of these roots have a distinguished status. Specifically, C is the root which does not c-command any other nodes (whereas both B and F c-command C), and F is the root which is not c-commanded by any other nodes (whereas both B and C are c-commanded by F). Recall that there are two things that the root is important for: determining the label or category of the tree as a whole, and determining the site of cyclic attachment. In (7), C is the node which determines the category, so it is good that we can distinguish C from among the three roots of (7).
(8) The categorial root is the root which does not c-command any other nodes.
Since we can also distinguish F in an equally general way, we expect to see that F plays a similarly important role. We will see shortly that it plays the other major role of the traditional “root” node, determining the site of cyclic attachment. Before we make this connection, however, we must take a detour to discuss how these structures come about derivationally.
Substitution and adjunction with multiple roots
Starting with a substitution structure like (9), we can characterize the “merger” (i.e. merge, in the sense of Chomsky 1995) of the subtrees <X> and <Y> as the assertion of a mutual c-command relation between every pair of roots, where one member of the pair is taken from <X> and the other is taken from <Y>.2
(9) [Diagram: the subtrees <X> and <Y> combined by Substitution]
Adjunction, on the other hand, as in (10), only asserts c-command in one direction; so, for every pair of roots (one from <X> and one from <Y>, and where the <X> subtree is being adjoined to the <Y> subtree), the <X> root will c-command the <Y> root.
(10) [Diagram: the subtree <X> adjoined to the subtree <Y> (Adjunction)]
Notice too that if we accept that structures are combined using c-command relations (only), then these two modes of attachment, substitution and adjunction, exhaust the logically possible means of combination: c-command can be established either in both directions or in just one direction. Another interesting result follows from this and from a natural well-formedness condition on trees stated in (11), requiring that the categorial status of the tree be determinable.
(11) Categorial Identity Condition
     A well-formed tree has a unique category-determining root (a categorial root).
Recall that the category of the tree is determined by the categorial root, the root that does not c-command any other nodes. This means that an adjunction to a tree cannot change the well-formedness of that tree with respect to (11). In (10), when <X> is adjoined to <Y>, the categorial root of <Y> does not c-command any new nodes (since c-command is only asserted from <X> roots to <Y> roots), so it remains the root which does not c-command any other nodes. The <X> root(s), on the other hand, now (each) c-command the root(s) of <Y> and so can no longer serve as categorial roots. Substitution, however, establishes a mutual c-command relation between the roots of the two subtrees. Saying nothing further, the result of (9) would be a tree which does not satisfy the Categorial Identity Condition (11). In order to have a well-formed tree, a new node must be added to the representation, one which does not c-command any other nodes. This new node is the label of the combined tree; it is the node which “projects” (in the terminology of Chomsky 1995). As a consequence of this interaction between the need for a categorial root and the mechanisms underlying adjunction and substitution, we no longer have any need to posit a distinction between “segments” and “categories” (May 1977; Chomsky 1986). In effect, “segments” don’t exist; no new node is necessary in an adjunction structure. Only under substitution is a new node added to the representation.
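The two modes of combination can be sketched as operations on sets of c-command pairs. The Python code below is a rough illustration under stated assumptions, not the authors' formal system: a structure is a pair (nodes, c-command pairs); the helper names `leaf`, `adjoin`, and `substitute` are hypothetical; and the sketch assumes that a c-command relation asserted onto a root is inherited by every node that root is less embedded than, which the text does not spell out but which reproduces the relation sets in (3) and (5).

```python
from itertools import product

def less_embedded(x, y, nodes, cc):
    """(2): x does not c-command y, and whatever c-commands x c-commands y."""
    return (x, y) not in cc and all((z, y) in cc for z in nodes if (z, x) in cc)

def roots(nodes, cc):
    """(6): nodes with no other node less embedded than them."""
    return {n for n in nodes
            if not any(less_embedded(m, n, nodes, cc) for m in nodes - {n})}

def leaf(name):
    """A one-node structure with no c-command relations."""
    return {name}, set()

def _assert_c_command(source, target, pairs):
    """Add c-command from every root of `source` to every root of `target`,
    inherited (an assumption of this sketch) by the nodes below that root."""
    (s_nodes, s_cc), (t_nodes, t_cc) = source, target
    for r, t in product(roots(s_nodes, s_cc), roots(t_nodes, t_cc)):
        for y in t_nodes:
            if less_embedded(t, y, t_nodes, t_cc):  # holds for y == t as well
                pairs.add((r, y))

def adjoin(adjunct, host):
    """Adjunction: one-directional c-command; no new node is added."""
    pairs = set(adjunct[1]) | set(host[1])
    _assert_c_command(adjunct, host, pairs)
    return adjunct[0] | host[0], pairs

def substitute(left, right, label):
    """Substitution: mutual c-command, plus a new projecting node `label`
    that stands in no c-command relation to any other node."""
    pairs = set(left[1]) | set(right[1])
    _assert_c_command(left, right, pairs)
    _assert_c_command(right, left, pairs)
    return left[0] | right[0] | {label}, pairs

c_sub = substitute(leaf("D"), leaf("E"), "C")   # the subtree [C D E]
tree_1 = substitute(leaf("B"), c_sub, "A")      # structure (1)
tree_4 = adjoin(leaf("B"), c_sub)               # structure (4)

print(sorted(roots(*tree_1)))  # ['A']
print(sorted(roots(*tree_4)))  # ['B', 'C']
```

In this sketch only `substitute` introduces a node, mirroring the observation that no segment-like node is needed for adjunction.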
Having laid out an overview of the representational system, we will turn in the next few sections to some particular cases of adjunction and substitution in action.
Substitution with multiple roots
In (12), a single node X is about to be merged with a multiply rooted (YP and WP) structure. This will establish mutual c-command relations between X and YP and between X and WP. The result of this merge is shown in (13).
(12)  X        YP
              /  \
            WP    YP
                 /  \
                Y    ZP
(13)      XP
         /  \
        X    YP
            /  \
          WP    YP
               /  \
              Y    ZP
As a result of the need to satisfy the Categorial Identity Condition, a node (XP) must be added to the structure. At this point XP is not in a c-command relation with any other node. It becomes the only root of the resulting structure. One question that arises in this connection is how the properties of the newly projected node are determined; that is, why in (13) does the node labeled XP take its features from X rather than YP? One possibility is that the choice is free in the syntax, with incorrect choices being filtered out by uninterpretability at the LF interface. An alternative, suggested by Chomsky (2000), is that the possibility of substitution is regulated by the existence of a selection relation. Since this mode of combination does not impose any structural asymmetry, there must be some substantive asymmetrical relation between the combined elements, a relation that Chomsky takes to be selection.3
Adjunction with multiple roots – XP adjunction
The situation is more interesting where we adjoin a subtree <XP> to the same multiply-rooted structure. Whereas in the case above, a mutual c-command relation was established between the roots of the two structures, here only unidirectional c-command relations are established. In the resulting structure (15), XP c-commands both YP and WP. No new node is necessary to maintain well-formedness.
(14)  XP       YP
              /  \
            WP    YP
                 /  \
                Y    ZP
(15)      YP
         /  \
       XP    YP
            /  \
          WP    YP
               /  \
              Y    ZP
The interesting thing about the structure in (15) is that the c-command relations are the same as those in the structure in (16). To put it another way, (15) is nondistinct from (16). (16)
          YP
         /  \
       WP    YP
      /  \  /  \
    XP  WP Y    ZP
We are used to thinking of (16) as coming about as a result of a different derivational history, one in which XP first adjoins to WP and then the complex adjoins to YP (e.g., in the proposals about multiple wh-movement in Ackema & Neeleman 1998; Grewendorf & Sabel 1996). People have argued for structures like (16), although the structures have always appeared somewhat odd, seemingly resulting from movement which targets things which should not, cyclically speaking, be targets.
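The nondistinctness of (15) and (16) can be illustrated by building both structures and comparing their c-command relations. The Python sketch below is an illustration under assumptions: `leaf`, `adjoin`, and the inheritance step inside `adjoin` (an adjunct root c-commands everything its target root is less embedded than) are hypothetical devices of the sketch, and WP and XP are treated as single nodes without internal structure.

```python
def less_embedded(x, y, nodes, cc):
    """(2): x does not c-command y, and whatever c-commands x c-commands y."""
    return (x, y) not in cc and all((z, y) in cc for z in nodes if (z, x) in cc)

def roots(nodes, cc):
    """(6): nodes with no other node less embedded than them."""
    return {n for n in nodes
            if not any(less_embedded(m, n, nodes, cc) for m in nodes - {n})}

def leaf(name):
    """A one-node structure with no c-command relations."""
    return {name}, set()

def adjoin(adjunct, host):
    """Every root of the adjunct c-commands every root of the host and,
    by assumption, the nodes that host root is less embedded than."""
    (a_nodes, a_cc), (h_nodes, h_cc) = adjunct, host
    pairs = set(a_cc) | set(h_cc)
    for r in roots(a_nodes, a_cc):
        for t in roots(h_nodes, h_cc):
            pairs |= {(r, y) for y in h_nodes if less_embedded(t, y, h_nodes, h_cc)}
    return a_nodes | h_nodes, pairs

# The projection of Y: node YP over the mutually c-commanding Y and ZP.
yp = ({"YP", "Y", "ZP"}, {("Y", "ZP"), ("ZP", "Y")})

# Derivation of (15): WP adjoins to YP, then XP adjoins to the result.
s15 = adjoin(leaf("XP"), adjoin(leaf("WP"), yp))

# Derivation of (16): XP adjoins to WP, then the complex adjoins to YP.
s16 = adjoin(adjoin(leaf("XP"), leaf("WP")), yp)

print(s15 == s16)  # True: the two derivations yield the same relations
```

Because a structure is identified here purely by its nodes and c-command pairs, the two derivational histories cannot be told apart once the relations are in place.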
To take one example, consider the well-known case of “Absorption” of wh-words and quantifiers, discussed by Higginbotham and May (1981) and May (1985). The operation of Absorption, given in (17), turns a structure like (15) into a structure like (16) for the purposes of interpretation.
(17) Absorption (May 1985: 21, following Higginbotham & May 1981)
     . . . [ NP_i [ NP_j . . . → . . . [ NP_i NP_j ]_i,j . . .
However, if representations are described only in terms of c-command, as proposed here, no such operation is required, since the structures are already nondistinct. Another well-known example of a structure like (16) was proposed by Rudin (1988) to account for the properties of multiple wh-movement in certain languages, including Bulgarian and Romanian. In these languages, all wh-words move to the front of their clause, forming an unbreakable constituent. Thus, (18b) is degraded compared to (18a) because the wh-words are separated by the adverb vchera ‘yesterday’.
(18) a. koj kogo vchera e udaril
        who whom yesterday has hit
        ‘Who hit whom yesterday?’
     b. ??koj vchera kogo e udaril
        who yesterday whom has hit
        ‘Who hit whom yesterday?’
     (Marina Todorova, p.c., cf. Rudin 1988)
Rudin (1988) analyzed this as multiple adjunction of wh-words to SpecCP, although from our understanding of other cases of wh-movement, we expect to find wh-words moving to, or targeting, CP itself, rather than SpecCP. Additionally, movement that targets SpecCP also constitutes a violation of cyclicity when understood in terms of the extension condition of Chomsky (1992). From our perspective, however, these two apparently distinct sorts of movement are actually the same (under the further assumption that there is no distinction between specifiers and adjuncts, e.g. as argued by Kayne 1995). We can thereby avoid the need to posit a movement that targets an embedded element. On the basis of this discussion, one might be tempted to conclude that in our system the locus of attachment for a second adjunction is indeterminate among the multiple roots, with all possibilities yielding identical results. The Bulgarian data just reviewed, however, indicate that multiply fronted wh-elements form a single, unbreakable constituent, suggesting that each successive adjunct attaches to the one that precedes it. Observe that the first adjunct, WP in (14), is not only a root of its structure, but is in fact the unique root that is distinguished by being c-commanded by none of the other roots. We label this distinguished root the attachment root.
(19) The attachment root is the root not c-commanded by any other nodes.
We will see that the attachment root provides the locus of cyclic attachment (the role of the traditional notion of “root” not covered by the categorial root).4
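The two distinguished roots defined in (8) and (19) can be picked out directly from a set of c-command pairs. The Python sketch below uses structure (7) and takes its roots C, F, and B as given in the text; the function names are hypothetical, and the relation set assumes that the adjunct F c-commands D and E as well as B and C, by analogy with B in (5).

```python
# Structure (7): F and B adjoined to C, with D and E below C.
# Pairs are (c-commander, c-commandee).
C_COMMAND = {
    ("F", "B"), ("F", "C"), ("F", "D"), ("F", "E"),  # F: outermost adjunct (D, E assumed)
    ("B", "C"), ("B", "D"), ("B", "E"),              # B: inner adjunct, as in (5)
    ("D", "E"), ("E", "D"),                          # D and E are sisters
}
ROOTS = {"C", "F", "B"}  # the roots of (7) as stated in the text

def categorial_roots(roots, cc):
    """(8): roots that do not c-command any other node."""
    commanders = {a for a, _ in cc}
    return {r for r in roots if r not in commanders}

def attachment_roots(roots, cc):
    """(19): roots that are not c-commanded by any other node."""
    commanded = {b for _, b in cc}
    return {r for r in roots if r not in commanded}

print(categorial_roots(ROOTS, C_COMMAND))  # {'C'}
print(attachment_roots(ROOTS, C_COMMAND))  # {'F'}
```

Both functions return sets so that a structure violating the uniqueness required by the Categorial Identity Condition in (11) would show up as a result with zero or several members.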
Adjunction and multiple roots – head adjunction
As our final example, we consider the head-adjunction structure in (20).
(20)      XP
         /  \
       WP    XP
            /  \
           X    ZP
          / \
         Y   X
The complex head, considered alone as a subtree, has two roots, X and Y. Recall that the definition of root is based on the intuition that roots are minimally embedded; this means that in (20), X and Y are (equally) minimally embedded in the complex head subtree. There is reason to think that both X and Y are local enough to WP in SpecXP to check features. For example, consider the licensing conditions on N-words like French personne. As is well known, personne requires the presence of a local negative element. This requirement can be instantiated by the assertion that personne contains an uninterpretable negative feature ([Neg]) that must be checked for convergence. We assume that the negative head ne also contains an instance of [Neg] that is capable of checking personne’s feature. Following Pollock (1989) in the assumption that the negative head ne is generated between T and V, this would imply that in an example like (21), the Neg head adjoins to T (perhaps in a complex together with the verb), after which personne moves to SpecTP.5
(21) Personne n’est venu.
     No one neg-is came
     ‘No one came.’
Such a derivation, combined with the empirical fact in (21), entails that the [Neg] feature of ne is close enough to personne in SpecTP to check, despite the fact that the [Neg] feature is part of a complex head with T and the copula (n’est). Returning to (20), notice that, once this complex head <X> (containing X and Y) has been merged with the complement (the subtree <ZP> with categorial root ZP), both X and Y c-command <ZP> (X and Y are both roots of <X>, and merging asserts mutual c-command relations between every root of <X> and every root of <ZP>). However, thinking further back in the derivation, if Y had moved to adjoin to X from within <ZP>, this means that Y still c-commands its trace.6 Looking at it this way allows us to maintain the view that a moved element must c-command its trace.
Recap: Rootedness of well-formed structures

Let us return to the issue we started with, the properties traditionally attributed to the root of a tree structure. Recall that the root determines the category or features of the tree as a whole, as well as determining the point at which further cyclic derivational attachments occur. We have seen that in these adjoined structures, the two properties of roots are dissociated; the category determination is taken by the categorial root (C or YP in (22)), while the site of cyclic attachment is determined by the attachment root.

(22) [Tree diagram: an adjunction structure over the nodes C, B, C, D, and E, with one node labeled CATEGORIAL ROOT and another labeled ATTACHMENT ROOT.]
Constituents

So far, we have concentrated on showing that although dominance was useful for the purposes of identifying the root of a structure, it conflated two notions of “root” which the proposed c-command-based view distinguishes (the site of cyclic attachment and the categorial root). We conclude that there is no need to refer to the dominance relation for the purposes of root determination.
Robert Frank, Paul Hagstrom, and K. Vijay-Shanker
However, there is another important role traditionally played by dominance: the identification of constituents. As traditionally understood, a constituent is a collection of nodes that are picked out in some way by one of the nodes in the tree. This is usually accomplished by associating with each node n the constituent that is the collection of nodes dominated by n.

(23) Traditional definition of constituent:
     For a node r, Constituent(r) = { m | r dominates m }
The question we will address in the next few sections is what becomes of the concept of “constituent” if we re-interpret it in terms of c-command. We will see that we can maintain a close-to-traditional view of constituency, while at the same time providing insight into certain phenomena that have remained puzzling under the traditional view. Consider the subtree in (24). We want to be sure that whatever our new interpretation of constituency is, it gives us the constituents listed in (25).

(24) [AP A [BP B CP]]

(25) {AP, A, BP, B, CP}   (“AP”)
     {A}                  (“A”)
     {BP, B, CP}          (“BP”)
     {B}                  (“B”)
     {CP}                 (“CP”)
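As a concrete illustration of (23), the following Python sketch computes Constituent(r) for the tree in (24) and yields the sets listed in (25). The dictionary encoding of the tree and the function name are assumptions made purely for illustration, and dominance is taken to be reflexive, since each set in (25) includes the node that picks it out.

```python
# Tree (24), encoded as a map from each node to its daughters.
# This encoding is an assumption of the sketch, not part of the proposal.
DAUGHTERS = {
    "AP": ["A", "BP"],
    "BP": ["B", "CP"],
    "A": [],
    "B": [],
    "CP": [],
}


def constituent(r, daughters=DAUGHTERS):
    """Return the set of nodes (reflexively) dominated by r, as in (23)."""
    nodes = {r}
    for d in daughters[r]:
        nodes |= constituent(d, daughters)
    return nodes


for node in DAUGHTERS:
    print(node, sorted(constituent(node)))
```

Nothing here goes beyond the traditional definition; the sketch only makes explicit that each constituent is read off a single node by following dominance.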
A natural place to begin is to carry over the notion of less embedded, used in the preceding sections to approximate dominance for the purposes of determining roots. The idea would be to define a constituent something like in (26). For reasons we will turn to directly, however, (26) is insufficient as a definition of constituent.

(26) Revised definition of constituent (first attempt):
     For a node r, Constituent(r) = { m | r is less embedded than m }

For the structure in (24), the definition of constituent given in (26) gets the correct results. The nodes picked out by AP are {AP, A, BP, B, CP}, those picked out by BP are {BP, B, CP}, that picked out by B is {B}, and so forth. The definition in (26) runs into problems when we consider adjunction structures, such as (27) below.

(27) [ZP [Z Y Z] WP]
Recall from earlier discussion that Y is not less embedded than Z, nor vice versa. As a consequence of this, there is no node r that will pick out exactly the nodes {Y, Z} under the definition in (26). Rather, we find that the “constituent” picked out by Z is {Z} and that picked out by Y is {Y}. This runs counter to what we know about the structure of complex heads, however. Specifically, the members of a complex head move together in iterated head movement, so they must form a constituent (assuming that movement can only involve constituents). The facts lead us to expect Z to pick out the constituent {Y, Z}, and the failure of (26) to provide this result is fatal.

(28) [Tree diagram: iterated head movement within XP, with traces (t) in YP and ZP above WP. Annotation: “Y and X move together, hence must form a constituent”.]
We presented (26) as a possibility because it is a natural extension of the preceding discussion, but thinking about constituency in terms of c-command, there is an equally natural alternative conception of constituent: A constituent is the collection of nodes that are picked out by some specific node by virtue of being c-commanded by that node. That is, as in (29) below, A picks out the constituent {B, C, D} because these are the nodes that A c-commands. Moreover, no node picks out exactly the set of nodes {A, C, D}, which is the desired result since {A, C, D} is not a constituent.

(29) [F A [B C D]]
This definition is nearly correct, but there is one case that requires consideration, illustrated in (30). Were we to be faced with a ternary branching structure, such a definition would incorrectly allow two branches together to count as a constituent, picked out via c-command by the third.

(30) * [Tree diagram: a ternary-branching structure over the nodes F, A, B, C, D, and E.]
A slight refinement solves this problem; we need only require that the “root” of the constituent (C, above) does not c-command any of the elements of the constituent (specifically, excluding B from the constituent). The final definition of constituent we will adopt is given below in (31).

(31) Revised definition of constituent (final version):
     constituent(r) = { n | g C n and ¬ r C n } for a node g such that r C g and g C r.

Walking through (31), it says that the constituent with “root” r includes those nodes which are c-commanded (or “picked out”) by a certain node g and are not c-commanded by r, where g is in a mutual c-command relation with the “root” r. Also, notice what this node g in (31) is, structurally. It is, essentially, the governor of the constituent. We will return to the significance of this shortly, but first let us try the definition (31) on the adjunction structure that caused trouble for the previous potential definition back in (26). The final definition of constituent yields the constituents listed in (32) for (27).

(27) (repeated) [ZP [Z Y Z] WP]

(32) r     constituent(r)   governor
     ZP    Ø                none
     Y     {Y}              WP
     Z     {Y, Z}           WP
     WP    {WP}             Y, Z
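The definition in (31) can be checked mechanically. The Python sketch below assumes a particular set of primitive c-command pairs for (27); these pairs are an assumption of the sketch, chosen to match what the text says about this structure (the adjunct Y asymmetrically c-commands Z, and both roots of the complex head stand in mutual c-command with WP). The function name and the data encoding are likewise hypothetical.

```python
def constituents(r, nodes, C):
    """Map each licensing node g to the constituent it picks out for r, per (31).

    C is a set of ordered pairs (x, y) meaning "x c-commands y".
    An empty result means that r has no governor and defines no constituent.
    """
    found = {}
    for g in nodes:
        if (r, g) in C and (g, r) in C:  # r and g c-command each other
            found[g] = {n for n in nodes if (g, n) in C and (r, n) not in C}
    return found


# Assumed c-command pairs for (27): Y is adjoined to Z, WP is the complement.
NODES_27 = ["ZP", "Y", "Z", "WP"]
C_27 = {
    ("Y", "Z"),                 # the adjunct asymmetrically c-commands its host
    ("Y", "WP"), ("WP", "Y"),   # each root of the complex head and WP
    ("Z", "WP"), ("WP", "Z"),   # c-command each other
}

for r in NODES_27:
    print(r, constituents(r, NODES_27, C_27))
```

Under these assumed relations the sketch returns the rows of (32): nothing for ZP, {Y} and {Y, Z} with governor WP, and {WP} with either Y or Z as governor.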
Of particular note, Z is not a member of constituent(Y) (with governor WP) because Y c-commands Z. Also notice that the entire tree is not a constituent, since there is no node which can serve as a governor g to pick it out.

To return to the discussion of the Bulgarian data from Section 6, recall that (18a–b) suggest that wh-words in a multiple question form an unbreakable constituent at the front of the clause. We noted that two possible structures for these questions, (15) and (16), are nondistinct with respect to their nodes and c-command relations. Yet, only (16) provides an explanation for the “unbreakability” of the wh-words. Interestingly, the definition of Constituent provided above picks out (16) as the correct representation of these c-command relations. To see this, consider (33), which is (16) merged with another node to provide a governor. The constituents in (33) are given in (34).

(33) [Tree diagram: the structure in (16), over the nodes YP, WP, XP, Y, and ZP, merged with a further node A.]

(34) r     constituent(r)         governor
     YP    {XP, WP, YP, Y, ZP}    A
     WP    {XP, WP}               A
     XP    {XP}                   A
     Y     {Y}                    ZP
     ZP    {ZP}                   Y
The constituency in (34) is consistent with (16) (i.e. (33)) but not with (15).7
Using constituents: Movement

Now that we have a way to identify constituents in a tree, let us turn to consider why constituents are necessary. Primarily, constituents are the units on which operations can be performed. We will consider three primary operations here. In this section we will look at movement, returning to ellipsis and conjunction in upcoming sections. We argue that constituents figure into the movement operation in the following way:

(35) All and only constituents may be moved.

In the remainder of this section, we consider the restrictions on certain subcases of movement, but the explanations of each will conform to a common theme: where movement of a certain part of a tree is not allowed, it is because that part of the tree does not constitute a constituent.

Head movement

The first case we will consider is that of head movement. Recall from the list in (32) that the X0-level constituents are {Z, Y} and {Y}, but not {Z}.

(27) (repeated) [ZP [Z Y Z] WP]
This set of constituents conforms to our expectations based on the observed behavior of head-movement. Moving a complex head is allowed (as predicted, {Z, Y} is a constituent), as is excorporation of Y out of the complex head (again as predicted, {Y} is a constituent);8 however, moving the head away leaving the adjunct behind is impossible (because {Z} is not a constituent).

Non-movement of non-maximal projections

The second case we will consider is movement of non-head, non-maximal projections (“X′”). As mentioned earlier, we assume, following Kayne (1995), that there is no meaningful structural distinction between specifiers and adjuncts, or to put it another way, specifiers are adjuncts. With this in mind, consider the structure given below in (36).

(36) [YP Y [XP WP [XP(¬X′) X ZP]]]
What would it mean to be able to move X′ (the lower XP) in the structure above? It would mean that there would have to be a constituent which contains XP, X, and ZP but to the exclusion of WP. There is, however, no such constituent. The only constituent (governed by Y) which contains all of these nodes also contains WP. Thus, it is a straightforward prediction of this view that non-maximal projections, or indeed any phrase without adjuncts included, cannot be moved.

Movement of XPs

Finally, we turn to consider XP-movement, which can be divided into two cases, complement movement and specifier/adjunct movement. In general, for an XP complement, the head of which it is a complement will act as a governor and license movement. In (36) above, the head Y is a governor for constituent(XP) = {XP, WP, X, ZP}. To put it another way, phrasal complements of a head are always constituents and hence always extractable. In somewhat more traditional terms, we might say that phrasal complements of a head are “head governed” (in the sense of Rizzi 1990); but notice that we need make no statements about licensing of movement over and above the principle in (35) that all and only constituents can be moved.

As noted earlier, a consequence of our conception of constituent is that unembedded structures do not form constituents. In the current context, this leads us to expect that unembedded structures cannot undergo movement. Though such a result might strike the reader as tautological, we would like to suggest that it allows us to derive the stipulation made by Chomsky (2000) to the effect that feature checking cannot obtain under so-called “pure Merge”, i.e., Merge that does not occur as a subpart of the movement operation. Chomsky’s motivation for this stipulation is to prevent a DP merged into the specifier of vP as an external argument from checking accusative case. To see how this follows, consider the mechanism underlying licit cases of feature checking by phrasal elements, as occurs in feature-driven movement. First, the feature in need of checking, F say, identifies some other feature F′ that can check it.
Since it is assumed that bare features are inaccessible to the movement operation, the derivational system must next determine what the minimal structure S is which contains the checking feature F′ and which may be moved.9 Under the assumptions that we are making in which syntactic structures consist of a set of nodes and c-command relations among them, determining such an S is a non-trivial problem, as these structures do not directly specify a notion like “containment”. The most natural way to solve this problem is to exploit the conception of constituent under discussion. Suppose now that the feature F′, rather than coming from within the same syntactic structure as that which includes F, is present in an independently constructed structure S′. Suppose further that F′ is part of the head whose features project the (categorial) root of S′. We might expect that F (or more properly its projection) should then Merge with S′. If, however, we require that this S′ be identified as a constituent, such an instance of Merge will be impossible, as there is no governor for S′.10

As things stand so far, we also predict that XP-adjuncts and specifiers should be constituents and thus should be capable of movement quite generally. For example, in (36), constituent(WP) = {WP} with governor Y, analogous to excorporation from a complex head as discussed above. However, we know empirically that movement of specifiers and adjuncts is much more limited than movement of complements. This is an issue which we will discuss in some depth after summarizing the basic proposal with respect to constituency.

Conclusion

In the preceding sections, we have proposed that the constituent status of each of the following entities is as listed below:

(37) a. An X0 and everything adjoined to it (X0max, in the terminology of Chomsky 1995) is a constituent.
     b. A maximal projection is a constituent (except at the root).
     c. A non-maximal projection is not a constituent.
The existence of a governor plays a crucial role in defining a constituent. This allows us to understand why we seemed to need the (head government condition of the) ECP; it is simply a result of the fact that constituents are only defined with respect to a governor, combined with the constraint that only constituents can move.
Movement of specifiers, constituency and conditions on government

As noted in Section 10.3, the proposal sketched above does not distinguish between the constituency and hence extractability of complements and specifiers/adjuncts. In (38), both constituent(XP) = {XP, WP, X, ZP} and constituent(WP) = {WP} are constituents, given that XP and WP both stand in a mutual c-command relation with the head Y.

(38) [YP Y [XP WP [XP X ZP]]]
Given this, what yields the well-known complement/adjunct and complement/subject asymmetries in extraction? In the subsections below, we speculate on a number of possibilities.

Possibility I: Restrictions on possible governors

Rizzi’s (1990) explanation of contrasts between examples like (39) and (40) distinguishes saw from that in the ability to head-govern the trace of movement.

(39) Who do you think that John saw t?
(40) *Who do you think that t saw John?

This proposal can be readily reformulated in our terms, by restricting what counts as a possible node g:

(41) constituent(r) = { n | g C n and ¬ r C n } for a node g such that r C g and g C r, where g is a governor.
(42) Lexical verbs are governors. That is not a governor.
We will also include in our set of potential governors certain functional heads, e.g. empty complementizer and T, so as to allow for extraction of subjects (from that-less CPs) and adjuncts, respectively. Under this view, the source of that-t effects is the inability of that to define a constituent, leaving the DP in (43) as a non-constituent.
(43) [CP [C that] [TP [DP who] [TP T VP]]]

Notice that this also implies that the TP complement here is not itself a constituent either, since the C node that is also the node that would have to serve as governor in defining the TP constituent, if it existed. However, there is at least some evidence to support the hypothesis that such TP complements are not constituents. First, as seen in (44), note that TP, when the complement of that, cannot be extracted. Second, such a TP cannot be elided, as shown in (45) (cf. Lobeck’s 1995 proposal that elided constituents must be governed; under our proposal this would simplify to “only constituents can be elided”).

(44) *John left, I was told that t.
(45) *Even though Mary hopes that e, she doubts that Bill is coming to the party.
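The effect of the restriction in (41) on a structure like (43) can be sketched in Python as follows. The c-command pairs are assumptions of the sketch, modeled on the treatment of specifiers in (38); whether C counts as a governor is represented simply by its membership in a set, so the difference between that and an empty complementizer reduces to a difference in that set. All names are hypothetical.

```python
def constituent(r, nodes, C, governors):
    """Constituent of r per (41): as in (31), but g must be a governor.

    Returns None when no governor stands in mutual c-command with r.
    """
    for g in nodes:
        if g in governors and (r, g) in C and (g, r) in C:
            return {n for n in nodes if (g, n) in C and (r, n) not in C}
    return None


# Assumed c-command pairs for (43); DP is the specifier of TP.
NODES_43 = ["CP", "C", "DP", "TP", "T", "VP"]
C_43 = {
    ("C", "DP"), ("C", "TP"), ("C", "T"), ("C", "VP"),
    ("DP", "C"), ("TP", "C"),                 # both roots of the TP subtree
    ("DP", "TP"), ("DP", "T"), ("DP", "VP"),  # specifier c-commands its host
    ("T", "VP"), ("VP", "T"),
}

# With "that" in C, C is not a governor, as stated in (42).
print(constituent("DP", NODES_43, C_43, governors=set()))
print(constituent("TP", NODES_43, C_43, governors=set()))

# With an empty complementizer, C is assumed to be a governor.
print(constituent("DP", NODES_43, C_43, governors={"C"}))
print(constituent("TP", NODES_43, C_43, governors={"C"}))
```

On these assumptions neither DP nor TP is a constituent under that, while both become constituents once C is allowed to act as a governor, which is the pattern that the extraction and ellipsis facts are taken to reflect.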
Similar behavior is observed with the wh-complementizer if, which also fails to license extraction from the specifier of its TP complement:

(46) *John finished his novel, I asked Mary if t.
(47) *Even though Mary asked if e, she is pretty sure that Bill won’t be coming.

In the presence of empty complementizers, which license extraction of embedded subjects, both the extraction and elision of TP are possible:11

(48) John left, I was told [CP [C Ø ] t ].
(49) Even though Mary asked who [CP [C Ø ] e], she is sure that Bill is bringing Louise.

One potential problem with the hypothesis that TP is not a constituent is the fact that TPs appear to be conjoinable (50).

(50) I would never have believed that [Mary would leave] and [Bill would stay]

This becomes a non-problem, however, if we accept the proposal of Wilder (1994), who argues that conjuncts must be extended projections (CPs here) with ellipsis of repeated lexical material.

(50′) I would never have believed [that Mary would leave] and [that Bill would stay]
Possibility II: Selectional restrictions

A somewhat less conservative alternative approach to restricting the constituency, and hence extractability, of an embedded specifier builds on the selectional properties of the embedding head. Our proposal entails that phrases with specifiers have multiple roots (i.e., multiple minima in the less embedded than relation). So, for example, in (51), the CP complement of V has two roots, the categorial root CP (which c-commands no other node in <CP>) and WH.

(51) [VP V [CP WH [CP C TP]]]
Syntactic heads have selectional properties which specify what type of complement they require. When considering the complement of V in (51), what property does it have that satisfies the selectional requirements of V? Certainly, selection must attend to the categorial root, which determines that <CP> has the categorial properties of a CP. Suppose, however, that there are two kinds of selection. The first dictates properties of the root(s) of a proper complement, the second dictates properties of the categorial root (only) of a proper complement. Consider, as an example, the difference between say (a bridge verb) and whisper (a non-bridge verb). If we suppose that say has the property that its complement must be a CP (interpreted as meaning that the categorial root of its complement must have the categorial features of a CP), whereas whisper has the (stricter) property that (all of) the root(s) of its complement must have the categorial features of a CP (52), we derive the result that whisper does not tolerate successive-cyclic extraction through the specifier of its CP complement. Specifically, if a wh-word left an intermediate trace in the specifier of a CP complement of whisper, the selectional requirement of whisper would be violated: The intermediate trace would itself constitute an additional root of the CP complement, yet would not have the categorial features of a CP. The selectional properties of whisper in (52) practically forbid its CP complement to have any specifiers.12

(52) Selectional properties of bridge verbs vs. non-bridge verbs
     a. say       categorial root: CP
     b. whisper   root: CP

(53) Why did you say/*whisper [t that you had left early t]?
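The contrast in (52) amounts to two different checks over the roots of a complement, which the following Python sketch spells out. The class, the dictionary, and the category label "WH" for an intermediate wh-trace (borrowed from the node label in (51)) are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Complement:
    categorial_root: str     # category of the categorial root
    other_roots: tuple = ()  # categories of any further roots (e.g. a specifier)


# Assumed encoding of (52): True means every root must match the selected
# category; False means only the categorial root is checked.
SELECTS_ALL_ROOTS = {"say": False, "whisper": True}


def selection_satisfied(verb, comp, category="CP"):
    """Check whether comp meets the selectional requirement of verb."""
    if comp.categorial_root != category:
        return False
    if SELECTS_ALL_ROOTS[verb]:
        return all(c == category for c in comp.other_roots)
    return True


bare_cp = Complement("CP")
cp_with_trace = Complement("CP", ("WH",))  # intermediate trace in SpecCP

for verb in ("say", "whisper"):
    print(verb, selection_satisfied(verb, bare_cp),
          selection_satisfied(verb, cp_with_trace))
```

On this encoding say accepts both complements, whereas whisper rejects the one whose specifier hosts an intermediate trace, in line with (53). A specifier that is itself of category CP would pass the stricter check as well.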
As a second example of how this distinction between selection for a root and selection for a categorial root might be put to use, consider the difference between raising and control verbs.

(54) Selectional properties of raising verbs vs. control verbs
     a. seem   categorial root: TP
     b. try    root: TP
Assuming that NP-movement but not control depends on the possibility of movement through specifier of TP (pace Hornstein 1998), only seem will allow it given the selectional properties above. Parallel to the bridge/non-bridge distinction, try effectively selects for a TP with no specifier, whereas seem selects for a TP regardless of whether it has a specifier. If this approach to the raising/control distinction is on the right track, there are a couple of implications worth mentioning. First, it avoids the otherwise unmotivated TP/CP distinction between the infinitival complements to raising and control verbs. A second implication is that “EPP” (in the sense of “TP must have a specifier” as the EPP is interpreted following Chomsky 1992 et seq.) cannot be a universal property of T, since under this proposal, TP complements of control verbs lack specifiers altogether.

Possibility III: Derivational symmetric c-command

Finally, let us consider a third alternative (as a proposal for how to restrict extraction of specifiers while allowing extraction of complements), one which is even more radical (and more speculative). In our discussion above (Section 10.3), we suggested that the constituency of a specifier (as well as the constituency of an adjoined head) derives from the existence of a mutual c-command relation with an “external” element. So, in (55), WP (the specifier of XP) was taken to be a constituent in virtue of the fact that it is in a mutual c-command relation with governor Y; likewise, Z is a constituent in virtue of the fact that it is in a mutual c-command relation with governor ZP.

(55) [YP Y [XP WP [XP [X Z X] ZP]]]
Earlier in (9–10), we took these mutual c-command relations to result from the multi-root characterization of the merging of <X> and <Y>: all roots of <X> c-command all roots of <Y> (and vice versa). In this section, we will consider an alternative statement of the Merge operation:

(56) Alternative version of Merge (single-root version): X ←→ Y
     Where xC is the categorial root of <X>, and where yA is a root of <Y>, then xC c-commands yA.
     Where yC is the categorial root of <Y>, and where xA is a root of <X>, then yC c-commands xA.

Consider what happens when merging <X> and a multi-rooted structure <YP>, as in (57) below, using this revised definition of Merge. <X> has only one root node (X). <YP> has two roots, YP being the categorial root. Merging <X> and <YP> will therefore establish three c-command relations according to (56): X c-commands WP, YP c-commands X, and X c-commands YP.
(57) [Tree diagram: X merged with the two-rooted structure whose roots are YP and WP (containing Y and ZP), yielding XP; the diagram labels the CATEGORIAL ROOT and the ATTACHMENT ROOT of the merged structures.]
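The difference between the two statements of Merge can be made explicit with a short Python sketch that lists the c-command pairs each version asserts. The function names and the representation of a structure by its categorial root plus its set of roots are assumptions of the sketch; the multi-root version follows the description of the earlier formulation given in the text.

```python
def merge_multi_root(x_roots, y_roots):
    """Earlier version: every root of one structure c-commands every root
    of the other, in both directions."""
    pairs = set()
    for x in x_roots:
        for y in y_roots:
            pairs.add((x, y))
            pairs.add((y, x))
    return pairs


def merge_single_root(x_cat, x_roots, y_cat, y_roots):
    """Version (56): only the categorial root of each structure c-commands
    the roots of the other."""
    pairs = {(x_cat, y) for y in y_roots}
    pairs |= {(y_cat, x) for x in x_roots}
    return pairs


# The structures merged in (57): one root X, and two roots YP and WP,
# with YP the categorial root.
multi = merge_multi_root({"X"}, {"YP", "WP"})
single = merge_single_root("X", {"X"}, "YP", {"YP", "WP"})

print(sorted(single))
print(sorted(multi - single))
```

The single-root version asserts exactly the three relations named in the text, and the only pair it omits relative to the multi-root version is the one in which WP c-commands X.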
Notice in particular that WP does not c-command X, which means that X cannot serve as a governor to define WP as a constituent. That is, WP is not a constituent in (57). This is a step toward the solution of our problem (that is, how to restrict the movement of specifiers), since specifiers and complements are no longer symmetrical; specifiers are not necessarily constituents, whereas complements are. If only constituents can move, then we are not guaranteed to be able to move a specifier. Of course, we must not stop here, since it is certainly not the case (empirically) that specifiers can never move. Given that specifiers can sometimes move, and continuing to assume that only constituents can move, it must be the case that WP can become a constituent in certain circumstances. For WP to become a constituent means essentially that WP must come to c-command X. The question is: How might this come about? Our suggestion is (somewhat paradoxically) movement. That is, as illustrated in (58), if WP were moved higher, the WP would c-command X. At this point, WP would be a constituent (with governor X), since WP is in a mutual c-command relation with X. Put another way, movement is allowed if in the end what moved is a constituent. (58)
[Tree diagram: WP moves out of YP to a higher position within XP, from which it c-commands X.]
Opening up this possibility of “post-hoc” licensing of movement seems in one sense to have put us back where we started; any movement of a specifier to a c-commanding position would be licensed, since after the movement, there will be a mutual c-command relation between the moved specifier and a governing node. So, what then accounts for the restrictions on the movement of specifiers? Our proposal is that movement of specifiers is in general allowed, though not for cases where there is no local landing site available for the movement. Under this hypothesis, the locus of the restriction on movement of specifiers is to be found in the locality theory; let us suppose that the relevant locality domain is the “phase” as proposed by Chomsky (2000). This view implies, of course, that observed legitimate cases of movement from specifier involve only very local moves. This seems plausible. For example, in TP complements of ECM verbs, the embedded subject moves only as far as the accusative Case position (say, SpecvP). No phase boundary intervenes (phase boundaries being CP and vP). Similarly, if we suppose that raising verbs like seem have no vP projection (only a VP projection), then no phase boundary is introduced at seem and the higher TP remains local to elements in the complement of seem.

As another example which suggests that movement must be phase-local, consider the following agreement fact in Chamorro (as reported by Chung 1994:17). In (59), a wh-word has been moved from the object position in the embedded clause. Agreement morphology must appear on all verbs along the path of movement. In (59), the embedded clause is only a TP, yet the successive-cyclic agreement affects it as well. If wh-agreement is a reflex of having the trace of the wh-word in SpecvP, the facts are explained.13

(59) Hafa malago’-mu [t u-mafa’maolik t]?
     what wh[obl].want-agr wh[nom].agr-be.fixed
     ‘What do you want to be fixed?’
Lastly, what is the status of the principle that says “only constituents can move” in light of the hypothesis that something can become a constituent via movement? One possibility is that the notion of constituent is not a fundamental part of syntax, but rather an important requirement on the part of pronunciation. That is, perhaps constituency is verified at and useful for PF, while syntax (and even perhaps LF) are not sensitive to issues of constituency. If this speculation is on the right track, we would expect to find that post-Spell-Out movement (of specifiers, for example) would be unconstrained compared to overt movement (since it would no longer be important to move only constituents). This also raises an interesting possible take on why the “phase” would be the relevant locality domain as well. Suppose that constituency is necessary for pronunciation (e.g., to determine the location of prosodic boundaries), and suppose that the linguistic structure is handed off to the PF interface at every phase. We might then conclude that phase-internal movement of an XP is required for pronunciation if the XP in its base position is not itself a constituent; this predicts a difference between direct objects and other arguments (which are presumed to be introduced into the structure in specifier position) in whether they must undergo phase-internal movement. Notice also that this need not be movement “to the edge” of the phase; it need only be movement to a position c-commanding the would-be governor, but within the phase. This makes a slightly different set of predictions about the impact of phases on derivations than the view proposed in Chomsky (2000) (for example, it predicts that direct objects need not move successive-cyclically, while subjects, adjuncts, and indirect objects must, and it predicts that movement to a c-commanding position further inside a phase boundary is possible for those things which must move successive-cyclically). Further pursuit of this and other implications of these suggestions is left for future study.
Notes

1. Note that we represent adjunction structures as is standard, involving two segments of the single category to which adjunction takes place. In primitive c-command terms, however, there is no corresponding distinction between segment and category. Rather, it is simply the case that the adjunct (asymmetrically) c-commands the node to which it is adjoined.

2. We will use the notation <X> to label subtrees in the text to distinguish them from node labels. <X> is a collection of nodes and c-command relations, whose categorial root is X.

3. For Chomsky (2000), the operation involved in substitution is what he calls “set merge”, an operation that puts two structures into a single set. This is a notational variant of our proposal that substitution involves the assertion of mutual c-command. One sees these two variants in different formalizations of undirected graphs in mathematics: they may either be characterized by a symmetric relation on the set of nodes (a set S of pairs such that (x,y) ∈ S iff (y,x) ∈ S) or by a set of 2-subsets of the set of nodes. In the case of adjunction, Chomsky’s formulation of the operation as “pair merge” is identical to ours, as he takes it to involve the creation of an ordered pair. This is nothing but the addition of a single assertion to a (c-command) relation.

4. In Section 9, we show how the notion of constituent that we develop derives the conclusion that the structure that is derived from successive adjunction is not indeterminate between attachment at any of the roots, but instead involves attachment at the categorial root. In other words, while (15) and (16) are nondistinct, constituency tells us that (16) is the correct way to draw the structure.

5. Example (i) below shows that [Neg] on ne need not be checked via overt movement, and is therefore not a strong/selectional feature. This eliminates the possibility of an alternative derivation for (21) in which personne moves first to SpecNegP, checking [Neg] on the Neg0 head, after which personne moves on to SpecTP and Neg0 moves on to head-adjoin to T0. Such a derivation would not make our point, since there is a stage in the derivation where Neg0 and personne would be in a direct Spec-head relation.

(i) Je n’ai vu personne.
    I neg-have seen no one
    ‘I did not see anyone.’

6. Given the formulation of adjunction given in the text, this result will require the sort of interarboreal derivation discussed in Bobaljik and Brown (1997). An alternative to this approach might invoke a transitivity condition on the c-command relation of the sort in (i):

(i) For all nodes x, y, z, if xCy and yCz and not x≤z, then xCz.
. This has the further implication that in cases we may have thought were multiple adjunctions to XP, such as multiple adverbial phrases, are not. That is, although we can draw a tree like (15), the constituents will always come out as in (16). This result meshes with Kayne’s (1995) proposal, under which multiple adjunction to XP is impossible, and forces us to a view along the lines of that proposed by Cinque (1999) for multiple adverbs. . Whether excorporation is allowed at all or in only these cases is under debate; see Roberts (1991) for discussion and an opposing view. . As we tentatively suggest in Section 11.3, the need for identifying a moveable constituent may only pertain to structures that need to be pronounced, however. . An immediate question raised by this proposal is how expletive there is able to check T’s EPP features under pure Merge. One possibility might build on the idea that there is a radically impoverished lexical item, being the realization of the single categorial feature D. Perhaps as a consequence the derivation can avoid the step of identifying a containing constituent, as the bare feature itself is moveable. We would expect, then, that all such cases of feature checking under Merge will involve such radically impoverished lexical items. . To accept that (48)–(49) are the correct structures for these sentences requires that the empty complementizer is not being moved/elided in these cases. One argument, admittedly indirect, for the structure in (48) might center on Stowell’s (1981) proposal that empty complementizers must occur in governed positions (to explain the fact that we get them in object CP complements, but not with sentential subjects). If this is right, the empty C couldn’t front. 
Note, however, that this notion of “needing a governor” differs from the notion of “needing a governor” to be a constituent; here, it is something like a selectional requirement, determining whether the empty complementizer can appear in the structure. Concerning (49), note that, although (i) is possible, (ii) is unexpectedly ill-formed. This indicates that something more complicated is going on in elision of this kind, but we have nothing further to offer here.
(i) Mary hopes (that) John will leave.
(ii) *Even though Mary hopes [CP [C Ø] e ], she doubts that Max is coming to the party.
There is a loophole, however. If the phrase which occupies SpecCP in the complement of a non-bridge verb like whisper is itself a CP, then the selectional property of the verb could be satisfied despite the fact that its complement CP has a specifier. Sufficiently developed, this might make interesting predictions for languages like Basque (Ortiz de Urbina 1989), or Quechua (Cole 1982), in which “clausal pied piping” is allowed in wh-movement. For example, in Basque, wh-arguments can in general either move to the matrix SpecCP alone (i-a), or take the embedded clause along (i-b). We might expect to find that a version of (i) with a non-bridge verb would show obligatory clausal pied-piping, because its complement cannot have a specifier, yet without clausal pied-piping the wh-argument would have to land in the specifier of the embedded clause on its way to its matrix position. More interesting would be a case structured like (ii); here we would predict (given certain assumptions) that (ii-a) would be ill-formed because the DP must land in the specifier of the complement of whisper, whereas in (ii-b) the entire subordinate clause stops there. In (ii-b), whisper’s selectional requirements would be met, since what is in SpecCP of the complement of whisper is itself a CP. However, not only do we not have the relevant Basque data, it is also not completely clear that the assumptions necessary for these predictions to fall out are warranted; more research must be done before any solid conclusions can be reached. For example, it is not clear whether wh-words must move internal to a pied-piped CP (Ortiz de Urbina 1990) or not (Echepare 1995), nor is it clear that wh-words and pied-piped CPs share the same landing site (Echepare 1995). Development of this area must await further research.
(i) a. nor_i esan duzu [ t_i etorri dela ] ?
       who say aux arrive aux-comp
    b. [nor etorri dela]_i esan duzu t_i ?
       who arrive aux-comp say aux
    ‘Who did you say arrived?’
(ii) a. *who_i whisper [ t_i John say [ t_i arrived ] ] ? (prediction, given certain assumptions)
     b. [ who arrived ]_i whisper [ t_i John say t_i ] ?
Others who have recently provided evidence for a VP-level trace of wh-movement include Fox (1999) (arguing on the basis of interactions between Condition C and scope reconstruction) and Nissenbaum (1998) (arguing based on properties of parasitic gap constructions).
References
Ackema, P. & A. Neeleman (1998). Optimal questions. Natural Language and Linguistic Theory, 16, 443–490.
Bobaljik, J. D. & S. Brown (1997). Interarboreal Operations: Head movement and the extension requirement. Linguistic Inquiry, 28, 345–356.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: The MIT Press.
Chomsky, N. (2000). Minimalist Inquiries: The framework. In R. Martin, D. Michaels, and J. Uriagereka (Eds.), Step by Step: Essays in Minimalist Syntax in Honor of Howard Lasnik. Cambridge, MA: The MIT Press.
Chung, S. (1994). Wh-agreement and Referentiality in Chamorro. Linguistic Inquiry, 25, 1–44.
Cinque, G. (1999). Adverbs and Functional Heads. Oxford: OUP.
Cole, P. (1982). Imbabura Quechua. The Hague: North Holland.
Echepare, R. (1995). A Case for Two Types of Focus in Basque. In E. Benedicto, M. Romero and S. Tomioka (Eds.), Proceedings of Workshop on Focus [UMOP 21]. Amherst, MA: GLSA.
Fox, D. (1999). Reconstruction, Binding Theory, and the Interpretation of Chains. Linguistic Inquiry, 30, 157–196.
Frank, R. & K. Vijay-Shanker (2001). Primitive c-command. Syntax, 4(3), 164–204.
Grewendorf, G. & J. Sabel (1996). Multiple Specifiers and the Theory of Adjunction: On scrambling in German and Japanese. Sprachwissenschaft in Frankfurt Arbeitspapier 16. Johann Wolfgang Goethe-Universität, Frankfurt am Main.
Higginbotham, J. & R. May (1981). Questions, Quantifiers and Crossing. The Linguistic Review, 1, 41–80.
Hornstein, N. (1998). Movement and Chains. Syntax, 1, 99–127.
Kayne, R. (1995). The Antisymmetry of Syntax. Cambridge, MA: The MIT Press.
Lobeck, A. (1995). Ellipsis. Oxford: OUP.
May, R. (1985). Logical Form: Its structure and derivation. Cambridge, MA: The MIT Press.
Nissenbaum, J. (1998). Movement and Derived Predicates: Evidence from parasitic gaps. In U. Sauerland and O. Percus (Eds.), The Interpretive Tract [MITWPL 25]. Cambridge, MA: MIT Working Papers in Linguistics.
Ortiz de Urbina, J. (1989). Parameters in the Grammar of Basque. Dordrecht: Foris.
Ortiz de Urbina, J. (1990). Operator Feature Percolation and Clausal Pied-piping. In L. Cheng and H. Demirdache (Eds.), Papers on Wh-movement [MITWPL 13]. Cambridge, MA: MIT Working Papers in Linguistics.
Pollock, J.-Y. (1989). Verb Movement, Universal Grammar, and the Structure of IP. Linguistic Inquiry, 20, 365–424.
Rizzi, L. (1990). Relativized Minimality. Cambridge, MA: The MIT Press.
Roberts, I. (1991). Excorporation and Minimality. Linguistic Inquiry, 22(1), 209–218.
Rudin, C. (1988). On Multiple Questions and Multiple WH Fronting. Natural Language and Linguistic Theory, 6, 445–501.
Wilder, C. (1994). Coordination, ATB and ellipsis. Ms., ZAS, Max-Planck-Gesellschaft, Berlin.
A four-way classification of monadic verbs
Murat Kural
University of California, Irvine
1. Introduction
The two-way classification of monadic verbs into unaccusative and unergative verbs, which is now more or less the standard view, goes back to Perlmutter’s (1978) original work on impersonal passives, where he observed that monadic verbs do not constitute a homogeneous class. Focusing primarily on Dutch, he was able to show that a certain subset of monadic verbs, the unergatives, allows impersonal passives, while the others, the unaccusatives, do not. In his analysis, the sole argument of an unergative verb starts out as an initial 1 (an underlying subject), while that of an unaccusative verb starts out as an initial 2 (an underlying object). Perlmutter’s basic insight was preserved in subsequent theories, most notably in Burzio (1986), where the argument of an unaccusative verb (his ergative verbs) is generated as an internal argument inside the VP, whereas the argument of an unergative verb (his intransitive verbs) is generated as an external argument outside the VP. Burzio also adds at least two more standard tests to the unaccusativity lore: (a) auxiliary selection, whereby unaccusative verbs choose essere ‘be’ as the auxiliary in the perfective tense in Italian, while unergative verbs choose avere ‘have’, and (b) the cliticization of the partitive ne from the postverbal subject position, which is allowed with unaccusative verbs, but not with unergatives.1 Burzio’s (1986) work predates the VP-internal subject hypothesis of Koopman and Sportiche (1991), which has since become a standard feature of the theory. Once Burzio’s structures are modified accordingly, we obtain the following structures.
(1) a. Unaccusative verbs: [VP e [V’ V DP ]]
    b. Unergative verbs: [VP DP [V’ V ]]
The key distinction between (1a) and (1b) is that the argument of an unaccusative verb in (1a) is generated as a complement, as if it were the object of a transitive verb, whereas the argument of an unergative verb in (1b) is generated as its specifier, in the same position as the subject of a transitive verb. The following table contains some of the better-known unaccusativity tests that have been developed over the years:
(2)                                  Unaccusatives   Unergatives
    there-insertion (English)        √               X
    Locative inversion (Chichewa)    √               X
    Subject case (Basque, Hindi)     Absolutive      Ergative
    Agreement (Creek)                Object          Subject
    Cognate objects (English)        √               X
    Resultatives (English)           √               X
    way-construction (English)       √               X
These tests and various issues regarding them have been discussed thoroughly by Burzio (1986), Bresnan and Kanerva (1989), Bresnan (1994), Laka (1992), Mahajan (1990), Martin (1991), Hale and Keyser (1993), Jackendoff (1992), Levin (1993), and Levin and Rappaport (1995), among many others. The two-way distinction outlined above has been quite influential in current theories and has been adopted almost universally. This dichotomy has a solid foundation since each test does in fact differentiate two classes of monadic verbs across a clear line of separation, although the classes overlap significantly in some cases. With the exception of some well-defined classes, such as motion verbs, verbs typically display either the set of properties associated with unaccusative verbs or the ones associated with unergative verbs, as stated in (2). What verbs do not do is display some properties from the first column and others from the second in a random manner, which is what would be expected if these properties were not a reflection of some deeper structural organization of monadic verbs.
On the other hand, what makes this dichotomy a questionable classification is the fact that it is not a perfect division. Not all verbs align in exactly the way described above, and perhaps more to the point, the verbs that fall outside this classification usually form semantically coherent classes, and the way in which a given verb class deviates from the pattern in (2) is not random either. An example would be verbs of motion, which allow there-insertion and locative inversion, as in (3a–b), both of which are unaccusative properties, but which also allow cognate objects and resultatives, as in (3c–d), both of which are expected of unergative verbs.
(3) a. There walked into the room three men
    b. Into the room walked three men
    c. The three men walked a long walk
    d. The three men walked the soles off their shoes
It is also the case that verbs that disallow cognate objects do not always behave the same way. A verb like break allows the transitivity alternation, as in (4b), but appear does not, as in (5b) below.
(4) a. *The vase broke a great break
    b. The magician broke the vase
(5) a. *The rabbit appeared a quick appearance
    b. *The magician appeared the rabbit
On the other hand, there are some well-known unaccusative verbs that occasionally display some of the unergative properties: die allows a cognate object, as in (6), and grow, shrink, and sink allow bare measure phrases, as in (7).
(6) The rabbit died a horrible death
(7) a. The tree grew two inches
    b. The shirt shrank two sizes
    c. The ship sank a thousand feet
One must bear in mind when evaluating the results of the tests above that, although each one effectively points at some two-way contrast within monadic verbs, the arguments presented thus far do not establish that all these tests actually draw the same type of distinction. It is entirely possible that at least some of these tests are sensitive to different properties, such that even though they all define two different classes of monadic verbs, they actually define different types of classes, setting up different contrasts. What lies at the heart of the matter is the question of what exactly each of these tests is supposed to be diagnosing. So even though it has been customary in the literature to think that these tests uniformly identify unaccusative verbs, one must also consider the possibility that some of the tests in (2) are in fact indicators of more than one type of structural distinction.
The main contention of this paper is that (a) the tests mentioned above do not all test the same structural properties, and (b) the discrepancies in the behavior of some monadic verbs across these tests can be explained naturally by positing a four-way classification rather than the traditional two-way classification. This view is outlined in Section 2 below.
2. An alternative approach
A major dividing line between the tests listed in (2) has to do with the position that is being tested: there-insertion, locative inversion, the case of the subject, and the subject-verb agreement patterns refer to some VP-external position, and more specifically, to the subject position within the inflectional field, whether it is the tense or agreement projection. The remaining three, cognate objects, resultatives, and the way-construction, all relate to the VP-internal base position of the object, as they all seem to be localized at the internal argument position within the VP, i.e., what may be regarded as the complement of the verb. When evaluated from this perspective, some of the discrepancies sketched out in (3) through (7), and others that will be discussed below, become more than just quirky properties of each test or verb type. By shedding some light on the internal organization of various VPs, these tests effectively serve as diagnostic tools that identify the fault lines that separate monadic verbs. Given this interpretation, the distribution of the classic unaccusativity tests across various monadic verb types points at a four-way distinction in terms of the differences in VP architecture, where each class can be defined as a semantically coherent group. These groups are as follows:
1. Verbs of being are verbs that indicate that their subject comes to be in some fashion, as in appear, arise, arrive, emerge, ensue, exist, lapse, and occur. Although the verbs in this class seemingly cover a wide range of situation types, what they all have in common is the notion of becoming present in some way.
2. Change of state verbs indicate that their subject has undergone a change of state, as in the case of break, burn, change, fold, grow, heat, heal, melt, shrink, and sink. Specifically, the change of state comes in the form of the end state of the subject being stated in the verb. In other words, with these verbs, a “DP V-s” can be paraphrased as “DP becomes V-en”, e.g., The window breaks is equivalent to The window becomes broken.
3. Change of location verbs indicate the motion of the subject, as in fall, jump, march, roll, run, skip, slide, swing, turn, and walk. The crucial concept here is that the verb provides the manner in which the subject moves. If there were a way to paraphrase “DP V-s” in this class, it would be “DP moves by V-ing”, e.g., though admittedly not a perfect sentence, something like The boy fell is truth-functionally equivalent to The boy moved by falling.
4. Verbs of creation are verbs that indicate that the subject has produced an often abstract, though sometimes concrete but intangible product, as in cough, dance, dream, laugh, sing, sleep, smile, speak, and think. With these verbs, a “DP V-s” can be paraphrased, perhaps quite awkwardly, as “DP produces an N_V”, where N_V refers to the nominal version of the verb, e.g., The girl coughs is equivalent to The girl produces a cough, cf. Hale and Keyser (1993).
The type of VP architecture that will be proposed for these verb classes is sketched out in (8) below. Note that the internal organization of verbs of being in (8a) is the same as Burzio’s (1986) classic unaccusative VP structure, whereas the change of location verbs in (8c) have the same VP design as Burzio’s unergative verbs. By contrast, both the change of state verbs and verbs of creation in (8b) and (8d) have a complex, multi-layered VP architecture that involves inchoative and causative predicates respectively (the diacritics used below will be discussed later on).
(8) a. Verbs of being:
       [VP e [V’ [V appear] [DP Bill]]]
    b. Change of state:
       [VP [DP_i the vase] [V’ [V INCH] [VP [DP PRO_i] [V’ [V_S break] XP ]]]]
    c. Change of location:
       [VP [DP Bill] [V’ [V walk] XP ]]
    d. Verbs of creation:
       [VP [DP Bill] [V’ [V CAUSE] [VP [DP (a dance)] [V’ [V_-1 dance] XP ]]]]
As an aside, consider the VP structure for verbs of creation in (8d) in the context of the “small v” analysis of Kratzer (1994) and Chomsky (1995).2 A small v is, by definition, a light verb whose function is to introduce the external argument, which naturally leads to the question of how v and cause relate to one another. If they are assumed to be different verbs, it is not clear how the external argument of cause is introduced into the structure. Since v is posited as the source of all external arguments, the external argument of cause would have to be provided by v. However, this would suggest that cause is a monadic predicate, though it is not obvious what a monadic cause might mean. In some sense, it would have a meaning close to happen, but that would imply that the causative meaning itself comes from v. This would, in turn, suggest that v actually has semantic content equivalent to that of cause. However, if v indeed has the semantics of cause, it should occur only with verbs that have causative meaning, excluding verbs that lack this sense altogether, e.g., verbs that depict physical contact, such as kiss and touch, or verbs that do not entail that their object undergoes any change of state, such as watch, read, and mention.
Finally, it must be pointed out that what this paper proposes is a way to draw finer distinctions within the classic unaccusative/unergative dichotomy. Although it presents an alternative that has a four-way distinction, this does not mean that there can be no additional monadic verb categories. As our diagnostic skills sharpen in the future, there may be more than four classes that need to be accounted for. In this respect, the discussion below should be taken as a potential starting point for a finer-grained typology of monadic verbs.
Verbs of being
As shown in (8a), verbs of being, such as appear, arise, arrive, emerge, ensue, exist, lapse, and occur, are handled in this work as single-layered null specifier verbs. The verb projects a single VP and its sole thematic argument is generated as its complement.
(9) [VP e [V’ [V appear] [DP Bill]]]
The structure given above is the same structure that has been assumed for unaccusative verbs since it was first presented in Burzio (1986). As will be argued below, most of the properties of this verb class follow directly from the classic VP architecture shown in (9). The discussion will concentrate on four fundamental properties of verbs of being (as well as the other classes):
i. First, they allow postverbal subjects in the there-insertion construction, (10), and the locative inversion construction, (11), in English.
(10) a. There appeared a huge rabbit on the stage
     b. There exist various possibilities
(11) a. On the stage appeared a huge rabbit
     b. In this forest exist many magical beings
ii. Second, they do not transitivize with null morphology. The examples in (12) are not possible even though the pragmatics of both situations are controlled in a way that would allow the causer to be able to cause the event of appearing, (12a), and the state of existing, (12b).
(12) a. *The magician appeared the rabbit out of the hat
     b. *God existed the universe
A magician can make a rabbit appear out of a hat, and God is presumed to have the power to bring the universe into existence. However, neither situation can be expressed using the type of null causatives that would otherwise yield the transitivity alternation.
iii. Third, these verbs do not allow the type of non-thematic complements (NTCs) mentioned above, i.e., cognate objects, as in (13), resultatives, the way-construction, (14a), and bare measure phrases (BMPs), (14b).
(13) a. *The rabbit appeared a quick appearance
     b. *?Those people exist a strange existence
(14) a. *Cockroaches will exist their way into dominance on the planet
     b. *The toothpaste appeared three inches
The measure phrase three inches is intended as a BMP in (14b), i.e., as a phrase that delimits the appearance. This means that out of the total amount of toothpaste in the tube, a three-inch line must be visible. The intended reading is not the “appear to be three inches” sense, where the whole toothpaste appears to be three inches, which may be the size of the tube, or the entire content of the tube after it is squirted out.
iv. Fourth, verbs of being do not passivize. This constraint is operative not only in English, but also in Dutch, as shown by Perlmutter (1978).
(15) *There was appeared by three rabbits
Note that Turkish and German allow verbs of this class to appear with the passive morphology, albeit with the semantics of the impersonal construction (see Maling 1993 and Kural 1996).
The ability to license there-insertion and the ability to invert the locative around the subject are both properties that involve a specifier position. Given the VP architecture in (9), and given that no other verb class seems to allow either process (with the notable exception of directionals, which will be discussed below), it is plausible that the locus of both properties is the specifier inside the VP rather than a specifier higher up. Based on this, we may conjecture that there is licensed at the [Spec, VP] position, and that [Spec, VP] is a crucial intermediate position in the movement of the locative phrase to its surface position. In cases where [Spec, VP] is occupied by an argument, which is true for all other verb classes, there-insertion and locative inversion would be blocked. The fact that the equivalent of there-insertion is available with other verb classes in a language like Dutch would then suggest that Dutch allows er to be generated higher up, perhaps at the [Spec, TP] position. The contrast between the two language types in terms of the availability of [Spec, TP] for there may be a function of overt verb movement, as suggested by Holmberg (1986), i.e., it may correlate with the fact that an inflected verb stays relatively low in English, below T, but moves much higher in Dutch, to T or above.
In order to understand why verbs of being are not compatible with transitivization through what is presumably null causativization, one needs to compare the VP architecture of a causative built on a structure like the one in (9) with that of a causative built on a more traditional VP structure, both shown in (16). A key background assumption in this account is that cause is a dyadic verb with a Patient role that must be associated with the causee. When there is an external argument in [Spec, VP], the Patient role associates with that argument, as in (16a), but when the only possible candidate lies further below, the association of the Patient role with the target argument is blocked by the intervening thematic head, i.e., the root verb, as in (16b).
(16) a. “Plain VP” under cause: [VP DP [V’ [V CAUSE] [VP DP [V’ V XP ]]]]
     b. Unaccusative VP under cause: [VP DP [V’ [V CAUSE] [VP e [V’ V DP ]]]]
The effect intended in (16b) is similar to the minimality effects of Rizzi (1990). The correlation would not be an unusual one under Stowell’s (1981) conception of θ-role association as an instance of coindexation between the predicate and its arguments. Under this view, the minimality violation in (16b) would be handled as a binding violation, cf. Aoun’s (1985) Generalized Binding.
The VP architecture in (9) also derives the inability of these verbs to license NTCs: since the sole argument of the verb takes up the complement position, either by being generated there or by binding an empty category in that position, there is no room available for NTCs.
The fact that these verbs can generate only one argument also provides the grounds for a straightforward explanation of why this verb class is cross-linguistically so resistant to passivization. The only argument that these verbs can generate gets demoted in the passive construction. According to the Extended Projection Principle, the EPP (Chomsky 1982, 1995), the subject position must be occupied by some constituent. An expletive can fulfill this requirement at the surface, but things get further complicated at LF because of the principle of Full Interpretation, FI (Chomsky 1986), which requires that LF representations contain all and only the elements that can receive some interpretation at LF. FI forces expletives to be deleted, or at least to become transparent, at LF, so the EPP cannot be satisfied with an expletive in the subject position. With any other verb class, there is usually some argument or quasi-argument to provide the means to satisfy the EPP at LF, but this is not the case with verbs of being. As a result, the passivization of these verbs leads to an EPP violation, unless the language has some other way to satisfy the EPP (some of these strategies were discussed in Kural 1996).3
Change of location verbs
Verbs that indicate a change of location, such as fall, jump, march, roll, run, skip, slide, swing, turn, and walk, are treated in this work as single-layered thematic specifier verbs, which is the traditional unergative structure, repeated in (17) below.
(17) [VP [DP Bill] [V’ [V walk] XP ]]
i. These verbs allow there-insertion and locative inversion only in the presence of a directional PP. There is allowed in (18) and the locatives are inverted in (19), where the sentences contain a directional PP. By contrast, both there-insertion and locative inversion are blocked in (20), where the PPs are locational.
(18) a. There ran three people away from the crime scene
     b. There walked a woman into the room
(19) a. Away from the crime scene ran three people
     b. Into the room walked a woman
(20) a. *There walked a woman in the garden
     b. *In the room walked a woman
ii. These verbs can be transitivized with null (causative) morphology.
(21) a. Bill ran his dog in the park
     b. Mary walked his guests to the door
iii. Change of location verbs license various NTCs, such as BMPs, (22a), cognate objects, (22b), resultatives, and the way-construction.
(22) a. Bill ran five miles in the race
     b. Mary walked a quiet walk in the woods
iv. Finally, these verbs passivize with ease, although with some cross-linguistic variation: in a language like English, they passivize in the presence of an NTC, as is the case with the BMP in (23a) and the cognate object in (23b), although no such NTC is needed in Dutch.
(23) a. Five miles were run by Bill in the race
     b. ?A long walk was walked in the woods
If the assumption made above is correct, and both there-insertion and locative inversion require a vacant [Spec, VP], one would expect the VP architecture in (17) to exclude both constructions. However, it is also the case that there-insertion and locative inversion are licensed in the presence of a directional phrase. Given that both constructions key in on a vacant VP-level specifier, one can maintain a consistent analysis by positing a directionality phrase above the VP in these instances, whose specifier position would be available for generating there and moving the locative phrase, much like the [Spec, VP] with verbs of being.4
(24) [DirP e (there) (into the room) [Dir’ Dir [VP [DP a woman] [V’ [V walk] PP ]]]]
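The conjecture that both constructions depend on a vacant specifier at the top of the VP-level structure can be made concrete with a small sketch. The Python below is only an illustration under that assumption: the names Phrase and top_spec_vacant are invented here, a specifier of None stands for the empty position e, and the sample structures are simplified renderings of (8a), (8b), (17), and (24).

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Phrase:
    """A bare-bones phrase (hypothetical helper): label, specifier, head, complement."""
    label: str
    spec: Optional[str]                      # None models the vacant specifier "e"
    head: str
    comp: Union["Phrase", str, None] = None


def top_spec_vacant(phrase: Phrase) -> bool:
    """True if the topmost specifier is empty, the condition assumed in the text
    for there-insertion and locative inversion."""
    return phrase.spec is None


# Simplified renderings of the structures discussed in the text.
appear = Phrase("VP", None, "appear", "Bill")              # (8a)/(9)
walk = Phrase("VP", "a woman", "walk", "PP")               # (17)
walk_directional = Phrase("DirP", None, "Dir", walk)       # (24)
break_inch = Phrase("VP", "the vase", "INCH",
                    Phrase("VP", "PRO", "break", "XP"))    # (8b)

for name, structure in [("appear", appear), ("walk", walk),
                        ("walk + DirP", walk_directional),
                        ("break (INCH)", break_inch)]:
    print(name, top_spec_vacant(structure))
```

On these assumptions the check succeeds for appear and for walk under DirP and fails for bare walk, in line with (10)–(11) and (18)–(20); it also fails for the control structure in (8b), whose topmost specifier is filled.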
With respect to transitivization through null morphology, clearly change of location verbs can be incorporated into a null cause, and their arguments are high enough to be able to associate with the Patient-of-cause, since there is no thematic role dispensing head between the external argument of the change of
location verb and cause. The situation would be more or less the same in the case of directionals because Dir does not have a thematic grid associated with it.
The complement position of a change of location verb is available for various NTCs, see (17), a property these verbs display in abundance, as exemplified in (22). These verbs are also quite agreeable with passivization, and this is expected since any NTC that is primarily a noun phrase would be able to move up to the subject position once the external argument of the verb is demoted to an explicit or implicit by-phrase. To see that some of the NTCs are truly objects that can be passivized, observe the following paradigm in Turkish. A BMP, such as beş mil ‘five miles’, appears in the accusative case depending on the specificity of the distance, (25a), and the same expression becomes the subject of the corresponding passive, (25b).
(25) a. Ahmet o beş mil-i yarışta koştu
        A.-nom that five mile-acc race-loc run-past-3sg
        ‘Ahmet ran those five miles in the race’
     b. Yarışta (Ahmet tarafından) beş mil koş-ul-du
        race-loc A. by five mile run-pass-past-3sg
        ‘Five miles were run in the race’
Based on the interpretive requirement that the demoted subject must be animate, Maling (1993) argues that the impersonal passives in Dutch contain a pro-arb subject at LF, even though the surface subject is the expletive er. What separates Dutch from English is the ability of the former to license an expletive at a higher position, i.e., the specifier of an inflectional head. Once a language allows expletives to be generated in cases where there is no NTC, it would not require the presence of an NTC in the passivized form of a change of location verb. Although the mechanism that introduces the pro-arb is unclear at the moment, it is likely to be independent of the ability to license expletives above the VP level.
Change of state verbs
Verbs that indicate change of state, such as break, burn, change, fold, grow, heat, heal, melt, shrink, and sink, are inchoative-layered verbs. They are argued to be projected in structures that contain two layers of VP: a lexical layer that contains the root form, its phonetic content, and its basic irreducible semantics, and an inchoative layer that contains an elementary predicate, call it inch, which has no phonetic content and is the locus of the “change” component of the overall meaning of these verbs.
(26) [VP [DP_i the vase] [V’ [V INCH] [VP [DP PRO_i] [V’ [V_S break] XP ]]]]
A couple of points need to be noted here. First, a change of state verb indicates that the denotation of its argument undergoes some change as a result of the act denoted by the verb.5 Second, inch refers to the inception of a transformation, i.e., the change of state, not its endpoint, e.g., The boat is sinking is true in case the boat starts sinking and only a small part of it is under water; it does not require that the whole boat be submerged. Third, the head that inch combines with, which is the root verb, is a stative verb that designates the end state of the transformation, i.e., The ship is sinking means The ship has begun to be in a sunk state. This is indicated in the present work with the diacritic V_S, where a V_S is defined as a verb that must incorporate into an inch to be a legitimate verb. Also note in passing that the relation between the specifier of inch and the specifier of the root verb in (26) is one of control rather than raising. This distinction is important for the internal logic of what is being proposed here, which will become more apparent below.
Change of state verbs have the following properties:
i. They do not allow there-insertion, as in (27), or locative inversion, as in (28).
(27) a. *There sank a boat (in the harbor)
     b. *There burned a house (to the ground)
(28) a. *To the bottom sank a boat
     b. *To the ground burned a house
ii. They readily transitivize with null causative morphology.
(29) a. The enemy sank the boat in the harbor
     b. An arsonist burned the house to the ground
iii. They allow only limited types of NTCs: they take BMPs, (30a), and resultatives, (30b), but not cognate objects, (31a), or the way-construction, (31b).
(30) a. The ship sank a thousand feet
     b. The house burned to a crisp
(31) a. *The boat sank a complete sink
     b. *The house burned a spectacular burn
iv. They also do not passivize, even with NTCs that are otherwise allowed in the active form.
(32) a. *A thousand feet were sunk by the ship
     b. *A crisp was burned by the house
Given the scheme of things argued thus far, the fact that neither there-insertion nor locative inversion is allowed with change of state verbs suggests that the VP architecture of these verbs does not provide a vacant VP-level specifier. This follows if inch is a control predicate rather than a raising predicate, since its specifier is then filled by an argument of its own, as in (26). Having the thematic structure of this verb class separated into two VPs as in (26) has the added advantage of ensuring that the argument that undergoes the change is uniformly associated with the Patient role.6 However, this does not mean that all cases of Patient are necessarily introduced through inch. The object of the causative predicate cause is arguably a Patient argument since it is acted on by the causer, yet being the Patient-of-cause does not entail any discernible change of state.7
The ability of change of state verbs to transitivize under null causatives strongly suggests that the relevant thematic argument of the inch-break_S combination is not generated lower than a thematic role providing predicate. The predicate in question in these cases is inch, which is a thematically active predicate and is located as the higher predicate in the structure. Thus, for the biphrasal scheme to work in these circumstances, the argument of a change of state verb must be generated as the specifier of inch. Doing so, however, would deprive the root verb break_S of an argument, which would violate the θ-criterion, or any principle that is meant to regulate the bijection between thematic roles and arguments. This is where the concept of control comes into play. As can be seen in (26), both inch and break_S generate their own arguments, and their identity is ensured by the control relationship that holds between the two.8
A consequence of the VP architecture given for this verb class is that an event like the intransitive breaking is treated as a composite event that contains the
endstate component represented with the root verb VS, i.e., breakS, and the inception component that is represented with inch. Change of state verbs allow only a subset of all NTCs, and disallow the others. It was argued in Kural (1996, 1998) that what regulates the distribution of NTCs with this class of verbs is their requirement that their complements be secondary predicates. While the BMPs measure out the extent of the motion with change of location verbs, they primarily predicate on the state of the subject with change of state verbs, e.g., *a five-mile run athlete versus a thousand-foot sunk ship. Note that BMPs cannot bear accusative case in Turkish, even if one construed the BMP as a predesignated specific distance that is being measured, cf. change of location verbs in (25).
(33) a. Bot beş metre(*yi) battı
boat-nom five meter-acc sink-past-3sg
‘The boat sank five meters’
b. Hava onbeş derece(*yi) ısındı
air-nom fifteen degree-acc warm-past-3sg
‘It got fifteen degrees warmer’
On the other hand, resultatives are by definition constituents that predicate on the state of the subject with intransitive verbs, and they are mostly allowed with change of state verbs, e.g., a burned-to-crisp house. If this conjecture is correct, this would limit NTCs occurring with change of state verbs to only the categories that can predicate on an endstate, i.e., BMPs and resultatives, excluding cognate objects and the way construction since they are not secondary predicates.9 It can be argued that the inability of change of state verbs to passivize with the NTCs, unlike change of location verbs, is a consequence of the predicative nature of the NTCs. Since they are not argumental constituents, they are not allowed as subjects, leaving change of state verbs without a potential subject to satisfy the EPP at LF after passivization.
Verbs of creation
Semantically, what verbs of creation, such as cough, dance, dream, laugh, sing, sleep, smile, speak, and think, have in common is that they entail the creation of the nominal equivalent of the verb, e.g., cough can be paraphrased as produce a cough, and dance as produce a dance. This verb class is treated in this work as causative-layered verbs that appear in a double-layered structure reminiscent of Hale and Keyser’s (1993) work.
(34) [VP [DP Bill] [V’ [V CAUSE] [VP [DP (a dance)] [V’ [V-1 dance] XP]]]]
The lower VP contains the root form of the verb of creation and the cognate object that denotes the entity that is being created, either in implicit or in explicit form. Note that much like a VS, a V-1 is also an incomplete predicate as is. The diacritic of a raised “-1” indicates that the verb needs one more argument to be satisfied, but it cannot provide that argument thematically. This argument is supplied by the null causative predicate cause, which is an elementary predicate like inch: it lacks phonetic content and it does not provide the core meaning of the verb. The source of the creation sense one finds in this verb class is the combination of the causative layer and the often implicit cognate object.10 This verb class has the following properties:
i. They do not allow there-insertion, (35), or locative inversion, (36). (35) a. *There danced fourteen people at the party b. *There laughed some people in the audience (36) a. *At the party danced some people b. *In the audience laughed some people
ii. They do not transitivize with null (causative) morphology, a fact that has been commented on by Hale and Keyser (1993): (37) a. *The DJ/music danced the people b. *The clown laughed the children
iii. They allow the whole range of cognate objects, resultatives, BMPs, and the way construction. The following set of sentences contains a specific class of cognate objects that are very common with verbs of creation: nouns that refer to a subclass of what the true cognate object would refer to, e.g., the relation between tango and dance, as seen in (38a). The example (38b) shows one of the better known instances of the resultative construction with a verb of creation. (38) a. The couple danced a tango during the reception b. His friends laughed Bill out of the room
iv. These verbs passivize across languages with a type of variation that is familiar from change of location verbs: In a language like English they passivize in the presence of an overt NTC, as in (39), while in a language like Dutch, they passivize without one, as in (40). (39) a. A tango was danced during the reception b. Bill was laughed out of his room by his friends. (40) Er wordt hier veel gedanst ‘There is danced a lot here’
The there-insertion and locative inversion facts follow from the internal organization of the VP structure in (34), which provides no vacant specifier at the higher levels of the VP complex to host there or become the landing site for the inverted locative. On the other hand, the base position of the thematic argument is perfectly compatible with the null causative construction that would transitivize a verb of creation. In fact, the inability of these verbs to transitivize in the same manner as change of location and change of state verbs is due to an independent constraint on null causatives. Unlike overt causatives, null causatives cannot be iterated, which can be seen in the following example that starts out with a change of location verb, run, in (41a), transitivizes it once in (41b), and then attempts to add one more layer of causation in (41c). Compare (41c) with (42) and the corresponding example in the periphrastic version in (43c). (41) a. The horses ran around the barn b. John ran the horses around the barn c. *Sue ran John the horses around the barn (42) Sue made John run the horses around the barn (43) a. The horses ran around the barn b. John made the horses run around the barn c. Sue made John make the horses run around the barn
Although successive transitivization works in the periphrastic examples in (43), the second application of the null causativization is blocked in (41c). It can be further observed in (42) that there is nothing wrong in principle about embedding a transitivized change of location verb under another causativized verb. What goes wrong in (41c) is that a verb that was already incorporated into a null cause attempts to be further incorporated into another null cause. Given the VP architecture of creation verbs in (34), where a root V-1 necessarily incorporates into a null cause, embedding a verb of creation under a null causative layer would be the exact equivalent of the successive incorporation into null causatives seen in (41c) above. Thus one may plausibly argue that whatever rules out cases like (41c) would also be behind the ungrammaticality of (37).11 In terms of the VP-internal characteristics of verbs of creation, it may be sufficient to note that the VP structure in (34) allows enough room for all the NTCs that have been discussed in this work. A similar statement can be made with respect to the passivization facts. Assuming that the language is like Dutch and can generate an expletive at a position that is relatively higher up, the EPP would be satisfied by moving the covert cognate object in the lower specifier to the subject position at LF. In a language like English, where expletives can only be generated low, this verb class provides no vacant specifier to generate an expletive, which means that a verb of creation can passivize only in the presence of an NTC in the overt syntax.
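The reasoning behind (37) and (41c) can be restated as a small procedural sketch. The Python fragment below is an illustration only and not part of the analysis: the class `VP` and the functions `has_null_cause`, `null_causativize`, and `is_licit` are hypothetical names, and the ban on iterating null cause is simply stipulated here rather than derived from chain formation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VP:
    """Toy VP shell: a head, the argument in its specifier, an optional lower VP."""
    head: str                          # e.g. "run", "dance", or the null predicate "CAUSE"
    specifier: Optional[str]           # argument generated in Spec,VP
    complement: Optional["VP"] = None  # embedded VP, if any


def has_null_cause(vp: Optional[VP]) -> bool:
    """True if a null CAUSE layer already occurs somewhere in the shell."""
    while vp is not None:
        if vp.head == "CAUSE":
            return True
        vp = vp.complement
    return False


def null_causativize(vp: VP, causer: str) -> VP:
    """Add a null CAUSE layer; stipulated to fail if one is already present."""
    if has_null_cause(vp):
        raise ValueError("null CAUSE cannot be iterated")
    return VP("CAUSE", causer, vp)


def is_licit(vp: VP, causer: str) -> bool:
    """Report whether null causativization of `vp` succeeds."""
    try:
        null_causativize(vp, causer)
        return True
    except ValueError:
        return False


run = VP("run", "the horses")                          # (41a)
ran_horses = null_causativize(run, "John")             # (41b)
dance = VP("CAUSE", "Bill", VP("dance", "(a dance)"))  # structure (34)
```

Under these assumptions `is_licit(run, "John")` succeeds, while `is_licit(ran_horses, "Sue")` and `is_licit(dance, "the DJ")` both fail for the same reason, which is the parallel drawn in the text between (41c) and (37).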
Split ergativity and auxiliary selection
Two of the classic unaccusativity tests that have not been commented on thus far are (a) the case borne by the subject in a certain class of ergative languages and (b) the type of auxiliary selected in the perfective tense in some of the Romance languages. It has been argued by Mahajan (1994) that these two phenomena are intimately related, and that split ergativity, as it is called, is another manifestation of the more familiar auxiliary selection phenomena discussed by Burzio (1986). Split ergativity is a property observed in ergative languages like Hindi and Basque, where the subject is in the ergative case with verbs that are traditionally known as unergatives, and in the absolutive case with traditional unaccusatives (Mahajan 1990; Laka 1992). This alternation is optional in Hindi, (44), but not in Basque, (45). Note that the verbs in the (a) sentences below are traditionally treated as unergatives, and the ones in the (b) sentences, as unaccusatives:
(44) (Hindi)
a. kuttõ ne bhõkaa
dogs(plur) erg barked(masc-sing)
‘The dogs barked’
b. Sitaa (*ne) aayii
S.(fem) erg came(fem)
‘Sitaa came’
(45) (Basque)
a. emakume-a-k hitz egin du
woman-the-erg word made has
‘The woman has spoken’
b. emakume-a etorri da
woman-the-abs arrived is
‘The woman has arrived’
The issue regarding the case of the subject recalls Fillmore’s (1968) classic account of the way in which the thematic licensing of an argument is reflected in the type of prepositional phrase in which it later appears at the surface. The most obvious correlation between the paradigm above and Fillmore’s theory is his conjecture that all agents are licensed as by-phrases at the “deep structure”, and they remain as such in the passive construction, but in the active structures where they become subjects at the surface, the “subjectivization” transformation deletes the preposition at some point in the derivation. Taking this correlation as more than a mere coincidence, suppose that any argument that is thematically licensed by the predicate cause is licensed as a PP, which is realized either as a by-phrase or as an ergative case, depending on the language. (46)
[VP [PP P DP] [V’ [V CAUSE] [YP XP [V’ V YP]]]]
The point that this is not such a far-fetched idea can be made by showing that an argument that is both a causee and a causer in a multiple causative construction appears as a by-phrase in Turkish if both the accusative and the dative cases are taken up by other arguments. This can be seen in the following example, where Ayşe’s reading is ultimately caused by the speaker but indirectly through Ahmet. The causal interaction is between Ahmet and Ayşe, where Ahmet acts under the speaker’s directive:
(47) pro [Ahmet tarafından] [[bu kitab]ı Ayşe’ye oku-t-tur-du-m
1.sg A. by this book-acc A.-dat read-caus-caus-past-1sg
‘I made Ahmet make Ayşe read this book’
The fact that the intermediate causer appears as a by-phrase would have a straightforward account if one assumed that all causers start out their lives as by-phrases. The arguments that eventually surface as accusative, dative, or nominative expressions have their prepositions incorporated either into the verb directly, or into some functional/inflectional projection that mediates in the P-incorporation. With this correlation in the background, consider Mahajan’s (1994) claim that deep down, split ergativity in Hindi is the same procedure as auxiliary selection in the perfective in Romance and Germanic languages. Kayne’s (1993) work establishes the key connection between the two phenomena by arguing in effect that have is derived from be by incorporating a preposition into it in the manner that has been sketched out below. (48)
[Aux’ [Aux Aux P] [VP [PP [P t] DP] [V’ [V CAUSE] [YP XP [V’ V YP]]]]]
In other words, he argues that have is be-P.12 Mahajan takes this idea further by first observing that be is preserved in Hindi in the perfective tense, where the subject is in the ergative case.
(49) raam-ne vah kitaabẽ parĩĩ thĩĩ
R.-erg those book-plur read-perf-fem-plur be-fem-plur-past
‘Ram had read those books’
The correlation that Mahajan capitalizes on is the following. In a Romance language like Italian, the perfective auxiliary is have with unergative verbs, and be with unaccusative verbs. In Hindi, the perfective auxiliary is always be, but the subject bears the ergative case with unergative verbs, and the absolutive with unaccusative verbs. If Kayne is correct and the source of the have in Italian that occurs with unergative verbs is a P that gets incorporated into the verb, this would correlate with the situation in Hindi unergative verbs where the P stays with the subject. In the absence of any P-incorporation into the auxiliary, the verb remains as be.13
Given the causative architecture for verbs of creation in (34), one would not be surprised to see that the external argument of the predicate comes into the derivation as a prepositional phrase, which produces the have auxiliary in the perfective. This, however, does not in itself explain why the have auxiliary appears with change of location verbs. For this I can offer two speculative suggestions without necessarily committing to either one. The first is that all arguments generated in a specifier position are generated as by-phrases, in contrast to arguments generated as complements. The other is to assume that in these cases there is an additional predicate act, a control predicate like inch (see Kural 1996), that generates arguments as its by-phrase specifier. Either way, another fact that stands out with respect to change of location verbs with directional expressions is that they take a be auxiliary. One can account for this fact by assuming that the P incorporation into the auxiliary is blocked when there is a directional phrase in the structure. This recalls the effect of DirP that was mentioned above. Taking this as a point of departure, suppose that the incorporated P never reaches the auxiliary because it gets blocked by the DirP. (50)
[Aux’ [Aux e] [DirP [Dir’ [Dir Dir P] [VP [PP [P t] [DP a woman]] [V’ [V walk] XP]]]]]
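The auxiliary and case patterns discussed in this section can be summarized as a decision procedure. The sketch below is a simplification made for illustration: the function `perfective` and its three boolean parameters are hypothetical, and the output labels "ergative" and "unmarked" merely stand in for the presence or absence of the P on the subject.

```python
from typing import Tuple


def perfective(subject_has_p: bool,
               p_incorporates: bool,
               has_dirp: bool = False) -> Tuple[str, str]:
    """Return (auxiliary, subject marking) for a toy perfective clause.

    subject_has_p  : the subject is generated as a PP (P + DP)
    p_incorporates : the language moves that P up toward the auxiliary
    has_dirp       : a directional phrase sits between Aux and VP
    """
    if not subject_has_p:
        return ("be", "unmarked")   # no P on the subject, nothing to incorporate
    if not p_incorporates:
        return ("be", "ergative")   # Hindi type: the P stays with the subject
    if has_dirp:
        return ("be", "unmarked")   # the P is stopped at Dir and never reaches Aux
    return ("have", "unmarked")     # Italian type: have is be plus the incorporated P


italian_unergative = perfective(subject_has_p=True, p_incorporates=True)
hindi_unergative = perfective(subject_has_p=True, p_incorporates=False)
unaccusative = perfective(subject_has_p=False, p_incorporates=True)
italian_directional = perfective(True, True, has_dirp=True)
```

The ordering of the checks encodes the assumption that DirP matters only when the P would otherwise incorporate; the text does not discuss what DirP does in a Hindi-type language, so that case is left as plain ergative marking.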
Conclusion
A system in which monadic verbs are differentiated following a four-way classification provides a more effective and efficient layout that allows the apparent discrepancies between verb classes to emerge as lines of demarcation within a much richer typology of monadic verbs. The preceding discussion demonstrates that once we move away from the preconceived idea of a two-way classification of monadic verbs, we find not only that the syntactic behavior of these verbs becomes more sensible, but also that the classification of these verbs along syntactic lines fully coincides with their broad semantic properties, such as denoting a change of location or the creation of an abstract entity.
Notes
1. I am leaving out ne-cliticization facts from this paper.
2. Arad (this volume) extends the classic v theory of transitivity by positing a stative v, a causative v, and an inchoative v, which is closer to the spirit of what I am arguing for. Although we agree on the compositional nature of semantic verb classes, there are also clear differences between the two proposals. First, the basic criticism of the v theory I present in the text holds for her system as well. Second, the predicates I am using have the ability to combine with one another, while Arad’s light verbs do not. An alternative view of some of the phenomena she discusses can be found in Kural (1996).
3. Such strategies are arguably in place in languages that allow verbs of being to passivize. In the case of German and Turkish, these passives are the equivalent of the impersonal construction, and as suggested by Maling (1993), the EPP appears to be satisfied in these cases by a pro-arb. On the other hand, the equivalent construction behaves more like a true passive in Lithuanian, where one can argue that the by-phrases are moved to the subject position (Kural 1996).
4. There is an obvious parallelism between this view and Mateu’s and Rigau’s work in this volume, which provides a structural frame in which Talmy’s (1987) and by extension, Gruber’s (1965) work can be interpreted.
5. This is a rather crude definition that does not account for negation (The window broke versus The window did not break) and other modalities. Strictly speaking, one should be talking about hypotheticals. These issues are handled in detail in Kural (1996).
6. Stripped of its Patient component, the (PRO) argument of the stative root predicates sinkS (be sunk) or meltS (be molten) is more likely to be a non-Patient, perhaps similar to a “Theme” or Rozwadowska’s (1988) Neutral.
7. It is argued in Kural (1996) that the interpretation of the causative structure with the causer acting on the causee is one of two possible interpretations. The other one, where the causer manipulates the circumstances to bring about an event, is derived by having the Patient-of-cause associate with the entire VP, which stands for the whole proposition.
8. The theory of PRO has been in a state of flux since the end of Chomsky’s (1980) classic theory that restricted PRO only to ungoverned domains. This concept was based on the complementary distribution between anaphors and pronouns in the same binding domain, which came to an end first with Huang (1983), then with Chomsky (1986), when different types of binding domains were established for anaphors and pronouns, making them no longer incompatible in every context. Various proposals have been made to fill this vacuum, including Chomsky’s (1995) null case and Hornstein’s (1999) raising, none of which necessarily duplicate the “ungovernedness” requirement, which is being violated in the representation in (26).
9. As an amusing aside, note that the equivalents of the verb die behave like they belong in different verb classes; a change of location verb in English, John died a horrible death, but a change of state verb in French, Jean est mort *(d’)une mort horrible.
10. This covert cognate object in (34) serves the function that the trace of the cognate object does in Hale and Keyser’s (1993) system, where an unergative verb is derived by incorporating a cognate object into a light verb. An outstanding problem with this approach is that these verbs can cooccur with the cognate object, as in danced a great dance, or even related objects, as in danced a great waltz. This is no longer a problem if we assume that the verb binds a nominal expression at all times, covert or otherwise, instead of a trace as in Hale and Keyser (1993).
11. It is argued in Kural (1996) that the movement of a verb into successive null predicates of the same type creates a problem in index transferal and leads to a violation of chain formation.
12. Kayne actually assumes that the preposition starts out as the head of the complement of the auxiliary verb, but Mahajan turns this around and has the P start out as the preposition of the VP-internal subject, much along the lines of Fillmore’s theory of agents.
13. Note that it is not clear how this would relate to the subject versus object agreement Martin (1991) reports for Creek.
References
Aoun, J. (1985). Generalized Binding. Dordrecht: Foris Publications.
Bresnan, J. (1994). Locative Inversion and the Architecture of UG. Language, 70, 72–131.
Bresnan, J. & J. Kanerva (1989). Locative Inversion in Chichewa: A case study of factorization in grammar. Linguistic Inquiry, 20, 1–50.
Burzio, L. (1986). Italian Syntax. Dordrecht: Reidel Publishing.
Chomsky, N. (1982). Some Concepts and Consequences of the Theory of Government and Binding [LI Monographs]. Cambridge, MA: The MIT Press.
Chomsky, N. (1986). Knowledge of Language: Its nature, origin, and use. New York: Praeger.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: The MIT Press.
Fillmore, C. (1968). The Case for Case. In E. Bach and R. T. Harms (Eds.), Universals in Linguistic Theory. New York: Holt, Rinehart, and Winston.
Gruber, J. (1965). Studies in Lexical Relations. Doctoral dissertation, MIT.
Hale, K. & S. J. Keyser (1993). Argument Structure and the Lexical Expression of Syntactic Relations. In K. Hale and S. J. Keyser (Eds.), The View from Building 20. Cambridge, MA: The MIT Press.
Holmberg, A. (1986). Word Order and Syntactic Features in the Scandinavian Languages and English. Doctoral dissertation, University of Stockholm.
Hornstein, N. (1999). Movement and Control. Linguistic Inquiry, 30, 69–96.
Huang, C.-T. James (1983). A Note on the Binding Theory. Linguistic Inquiry, 14, 554–561.
Jackendoff, R. (1992). Babe Ruth Homered His Way to the Hearts of America. In T. Stowell and E. Wehrli (Eds.), Syntax and Semantics, Vol 26: Syntax and the Lexicon. San Diego: Academic Press.
Kayne, R. (1993). Towards a Modular Theory of Auxiliary Selection. Studia Linguistica, 47, 3–31.
Koopman, H. & D. Sportiche (1991). The Position of Subjects. Lingua, 85, 211–258.
Kratzer, A. (1994). The Event Argument and the Semantics of Voice. Unpublished manuscript, University of Massachusetts, Amherst.
Kural, M. (1996). Verb Incorporation and Elementary Predicates. Doctoral dissertation, UCLA.
Kural, M. (1998). Two Types of Bare Measure Phrases. In Proceedings of WECOL 96, 177–187. Department of Linguistics, California State University, Fresno.
Laka, I. (1992). Ergative for Unergatives? Talk presented at UCLA.
Levin, B. (1993). English Verb Classes and Alternations. Chicago, IL: The University of Chicago Press.
Levin, B. & M. R. Rappaport (1995). Unaccusativity: At the syntax-lexical semantics interface [LI Monographs]. Cambridge, MA: The MIT Press.
Mahajan, A. (1990). The A/A-bar Distinction and Movement Theory. Doctoral dissertation, MIT.
Mahajan, A. (1994). The Ergativity Parameter: Have-Be alternation, word order, and split ergativity. In M. Gonzales (Ed.), Proceedings of the Twenty-Fourth Meeting of the Northeastern Linguistic Society. University of Massachusetts, Amherst: GLSA Publications.
Maling, J. (1993). Unpassives of Unaccusatives. Unpublished manuscript, Brandeis University.
Martin, J. (1991). The Determination of Grammatical Relations in Syntax. Doctoral dissertation, UCLA.
Perlmutter, D. (1978). Impersonal Passives and the Unaccusative Hypothesis. In Proceedings of the Fourth Annual Meeting of the Berkeley Linguistic Society. UC Berkeley. Berkeley.
Rizzi, L. (1990). Relativized Minimality [LI Monographs]. Cambridge, MA: The MIT Press.
Rozwadowska, B. (1988). Thematic Restrictions on Derived Nominals. In W. Wilkins (Ed.), Syntax and Semantics, Vol 21: Thematic Relations. San Diego: Academic Press.
Stowell, T. (1981). The Origins of Phrase Structure. Doctoral dissertation, MIT.
Talmy, L. (1985). Lexicalization Patterns: Semantic structures in lexical forms. In T. Shopen et al. (Eds.), Language Typology and Syntactic Description III: Grammatical Categories and the Lexicon. New York: CUP.
On Agreement
Locality and feature valuation
Luis López
University of Illinois-Chicago
I propose a new look at the operations Agree and Move, taking many of the concepts in Chomsky (1998, 1999) as starting point but departing from these papers in important respects. First, I argue that the operation Agree is strictly local. Second, I argue that the operation Move is triggered by the instability created in the system by unvalued features (following similar ideas in Frampton & Gutmann 1999). Third, the concept of co-valued features is introduced: two terms with unvalued features of the same type that are related by the operation Agree must have their features valued “in tandem”. It is shown that feature co-valuation underlies expletive-associate and movement chains. These alterations are shown to have healthy consequences, since they allow us to revisit and eliminate some unnecessary assumptions.
1. Introduction
The goal of this paper is to advance the Minimalist Program. Some of its assumptions are scrutinized and rejected and a more streamlined framework is proposed that also has some empirical advantages. Concretely, I propose a new look at the operations Agree and Move, taking many of the concepts in Chomsky (1998, 1999) as starting point but departing from these papers in important respects. First, I argue that the operation Agree is strictly local and long distance agreement, as proposed by Chomsky, does not exist. Second, I argue that the operation Move is triggered by the instability created in the system by unvalued features (following similar ideas in Frampton & Gutmann 1999). Third, the concept of co-valued features is introduced: two terms with unvalued features of the same type that are related by the operation Agree must have their features valued “in tandem”.1 It is argued that the notion of feature co-valuation makes explicit some implicit assumptions concerning the expletive-associate
relation as well as the locality that restricts the connection between the links of a chain. On the conceptual side, these alterations will be shown to have healthy consequences, since they allow us to revisit and eliminate some unnecessary assumptions. Additionally, this paper presents novel analyses of structural case assignment and of the typological differences in expletive constructions, amply demonstrating the empirical advantages of this approach. As Chomsky defines it, Agree is an operation that involves two terms, a probe and a goal. Take a functional category that has a set of unvalued/uninterpretable features that need to be valued and deleted from narrow syntax. In order to value its features, the functional category must probe in its c-command domain until it finds a goal: a set of matching features – valued features of the same type – that the probe can agree with. If the functional category has an additional EPP feature, the goal will be pied-piped into the spec position of the probe. For instance, assume that Tense (henceforth T) bears unvalued φ-features. In order to value and delete its features, T probes in its c-command domain for a goal with matching features of the same type. The subject in Spec,v has valued φ-features, therefore it can be a goal for T’s features. The subject’s features value those of Tense, which delete. Additionally, T selects for a D – it has an EPP feature – a requirement that is satisfied by pied-piping the subject to Spec,T. An important distinction that Chomsky makes is that matching and agreeing are not the same thing in his system. If a probe encounters a DP with matching features whose structural case has already been assigned and deleted, the probe will not be able to agree with this DP and, moreover, it will not be able to keep probing. In other words, agreement can only take place with a constituent that has an unvalued case feature. 
Chomsky refers to this as a “freezing” effect and it provides an account of Relativized Minimality effects. Consider (1). In (1a), the matrix T can’t agree with ‘John’ because there is an intervening DP, ‘it’, with matching features. Can the matrix T agree with ‘it’? No, because ‘it’ agrees with the subordinate T and, as a consequence, ‘it’ has had its case deleted. Once case is deleted, agreement is not possible. Since T can’t agree with the intervening DP, it can’t pied-pipe it either. Finally, since the features of T and ‘it’ match, T cannot go on probing, hence the ungrammaticality of (1b): (1) a. T seem that it is likely to John win. b. *John seems that it is likely to win.
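The probe-goal search just described, including the “freezing” effect, can be sketched procedurally. The fragment below is an illustrative simplification and not Chomsky’s own formalization: the class `DP`, the function `agree`, and the representation of the c-command domain as a list ordered from closest to most distant are assumptions, and φ-completeness and the EPP are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class DP:
    name: str
    phi: Dict[str, str]   # valued phi-features of the DP
    case_active: bool     # True while structural Case is still unvalued


def agree(probe_phi: Set[str], domain: List[DP]) -> Optional[DP]:
    """Search `domain` (closest DP first) for a goal.

    Returns the DP the probe agrees with, or None if the search fails.
    """
    for dp in domain:
        if probe_phi <= set(dp.phi):    # Match: the DP has features of the probe's type
            if dp.case_active:
                dp.case_active = False  # Case is valued and deleted as a by-product
                return dp
            return None                 # "freezing": matched but inactive, probing stops
    return None


# Example (1): 'it' has already agreed with the embedded T, so its Case is gone.
it = DP("it", {"person": "3", "number": "sg"}, case_active=False)
john = DP("John", {"person": "3", "number": "sg"}, case_active=True)
matrix_goal = agree({"person", "number"}, [it, john])
```

On these assumptions the matrix probe in (1) finds no goal: it matches ‘it’, cannot agree with it, and cannot continue to ‘John’, which therefore keeps its unvalued Case.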
A second restriction on Agree is the Phase Impenetrability Condition (henceforth PIC). A phase is a structure created by a sub-numeration headed by transitive v (an abstract light verb that introduces external arguments, see Section 2) or C. Unaccusative or passive v is not the head of a phase, and neither is T. Once a phase has been completed – all the items in the numeration selected, all the Agree and Move operations that can take place already finished – it can be embedded within another phase. Take the structure represented in (2), where H1 and H3 are the heads of their respective phases: (2) H1 . . . H2 . . . [H3P XP [H3 . . . YP]]
The PIC prevents the head of a phase from probing into another phase, except its edge – for our purposes, the edge of a phase is the outer specifier. Thus in (2) H1 can probe XP, but can’t go further down. On the other hand H2, which is not the head of a phase, can probe XP and YP. In Chomsky (1999), H1 = C, H2 = T and H3 = v, so T can probe its c-command domain indefinitely whereas C can only probe Spec,T and Spec,v. The validity of operations is evaluated at the phase level. So, turning back to example (2), and assuming that H3 = v is transitive, any Agree or Move operations involving the edge of v, XP and H2 = T will be evaluated once H1 = C is merged and the phase finished. If H3 = unaccusative/passive v, operations involving H3 and YP will also be evaluated when H1 is merged and the strong phase completed.
A third property of Agree that I discuss in this paper is that of φ-completeness. This issue appears in connection with raising and ECM predicates (3a and 3b respectively) and in participle agreement (3c, a Spanish sentence):
(3) a. Several prizes are likely to be awarded.
b. We expect several prizes to be awarded. (Chomsky 1999: 5)
c. Las mujeres fueron vistas en la tienda.
The women were.3rd.pl seen.fem.pl in the store
In (3a, b) there is a non-finite Tense, called by Chomsky defective Tense (Tdef), which does not assign Case. Tdef is distinct from the non-finite T that appears in control and PROarb constructions, which assigns null Case. Why does Tdef not assign Case? In Chomsky’s system, structural Case is a by-product of agreement. Probes do not have a Case feature, so they do not assign Case, but when a probe agrees with a DP with an unvalued Case, this Case can be valued and deleted from narrow syntax (Chomsky 1999: 4). For a Case feature to be valued and deleted, the probe must have a complete set of φ-features, or be φ-complete. Finally, Chomsky suggests that Tdef and participles have some agreement features but not enough to be φ-complete. Thus, Tdef or a participle can
probe a DP but, since they are not φ-complete, the Case of the DP is not deleted and the DP can be probed again from higher up. Under one version of the story, Tdef has an EPP feature that attracts the DP to its spec, from where it can be probed from above. Under another version also considered by Chomsky, Tdef does not even have an EPP, but the higher probe can simply by-pass it to reach the DP.
To sum up: Chomsky allows for long distance agreement between a probe and a goal, provided that “freezing” effects and the PIC are respected. Probes can by-pass other probes if the latter are not φ-complete.
I think it is fair to say that some of the assumptions that ground this framework for analysis are stipulatory and a system that could dispense with them would be preferable. First of all, the distinction between matching and agreeing is fairly artificial and does not follow from any principles: why should a DP without Case be more able to agree than a DP with Case? Notice that the φ-features of the DP do not delete, since they are interpretable – apparently, Case is necessary to make the φ-features of a DP visible to a probe, but we do not know why these φ-features can’t stand on their own and be accessed by a probe without the intermediary of a Case feature. Moreover, the requirement that the goal has Case leaves out all the concord phenomena that Carstens (2000) discusses (i.e. sharing of gender and number features between a noun and its modifiers). Second, the assumptions that make up the PIC could also be questioned – turning back to (2), why can H2 probe YP but H1 can’t? It would be preferable to have the same locality requirement for all instances of Agree. Finally, how is φ-completeness to be defined? In example (3c), the participle has number and gender features and T has number and person features. Since the DP has number, gender and person features, it would seem that neither T nor the participle has a complete set of φ-features.
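The difficulty with (3c) can be made concrete under one candidate definition of φ-completeness. The definition used below (a probe is φ-complete if it bears every feature type that its goal bears) is an assumption adopted only for illustration, as are the names `is_phi_complete`, `DP_PHI`, `T_PHI`, and `PARTICIPLE_PHI`; the feature inventories are those given in the text.

```python
DP_PHI = frozenset({"person", "number", "gender"})   # the DP in (3c)
T_PHI = frozenset({"person", "number"})              # finite T in (3c)
PARTICIPLE_PHI = frozenset({"number", "gender"})     # the participle in (3c)


def is_phi_complete(probe: frozenset, goal: frozenset) -> bool:
    """Hypothetical definition: the probe bears every feature type of its goal."""
    return goal <= probe


t_complete = is_phi_complete(T_PHI, DP_PHI)
participle_complete = is_phi_complete(PARTICIPLE_PHI, DP_PHI)
```

Under this definition both values come out false, so neither probe would be able to delete the Case of the DP, which is the problem the text raises.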
Are we going to assume that T has unexpressed gender features in Spanish? On what grounds? Consideration of multiple agreement in Bantu provides further fuel against the notion of φ-completeness. Consider the Kiswahili sentences in (4), cited from Carstens and Kinyalolo (1989):
(4) a. Juma a-ta-pika chakula
       Juma agr.fut.cook food
       ‘Juma will cook food’
On Agreement
    b. Juma a-ta-kuwa a-me-pika chakula
       Juma agr.fut.be agr.perf.cook food
       ‘Juma will have cooked food’
    Where ‘a’ = agr = 3rd person singular, 1st noun class.
In Kiswahili, a verbal root can’t support both tense and aspect morphology. So, the strategy resorted to by this language is to have the aspect morphology attached to the stem and the tense morphology attached to an auxiliary verb, glossed as ‘be’. The agreement marker a- indicates subject agreement, third person singular and first noun class (where noun class is taken to be an expression of gender, see Carstens 1991). Notice that subject agreement is repeated on the main verb and on the auxiliary. Let’s consider the stage in the derivation in which the aspect marker has just been merged and the subject is still in its base-generated position. Further, I assume that assembling of the aspect marker and the verb is a PF process:
(5) [AspP a-me [vP Juma pika . . . ]]
    agr.perf Juma cook
Obviously, ‘-me-’ is an agreement probe, as shown by the overt agreement morpheme. Notice that the probe is φ-complete – or, at least, it can’t be said that the Aspect head is any less complete than T. So, matching of features between -me- and the subject should lead to deletion of the Case feature of the subject. At this point, the subject is inactive and can’t be probed by T, even if -me- had pied-piped the subject to its spec. So Chomsky predicts that T can’t agree with Juma, contrary to fact. A consequence is that we need Case features associated with some heads, independent of φ-completeness. Alternatively, we could assume that the Case feature of Juma is deleted but still present in the computation until the phase is completed. Thus, the Case feature is still accessible to T. However, notice that the same reasoning can apply to the examples in (3): we could say that Tdef does delete the Case of the DP but it survives until the phase is completed, so it is still accessible from the higher probe. But if we adopt this way of thinking, we simply do not need φ-completeness at all, only the stipulation that T is not the head of a phase. It seems safe to conclude that a system in which the operation Agree were not concerned about φ-completeness would be preferable.2 I propose that the operation Agree is more restricted than what Chomsky claims: I argue that there is no such thing as long distance agreement. Take the structure (6). Chomsky would allow for Agreement to take place between H1 and XP across H2P, unless H1 is head of a phase. Instead, I propose that the
Luis López
phrase projected by H2 creates a barrier for agreement, regardless of the type of H2, and consequently H1 can’t agree with XP, regardless of the type of H1:
(6) [H1P H1 [H2P H2 XP ]]
Further, I argue that this intervention effect is what triggers Move. In Chomsky’s model, once a probe has agreed with a goal the latter can be pied-piped to the spec position of the probe in order to satisfy an EPP feature of the probe. Instead, I argue that XP in (6) needs to move if it has unvalued features that can’t be probed. Movement is triggered by unvalued features of the term that moves. We will see below how this can be conceptualized without falling into the problems that Chomsky’s (1993) Greed created and that led to its abandonment.
The obvious conceptual advantage of my proposal is that, by disallowing any sort of long distance agreement, I drastically reduce the search space for an agreement probe, thus reducing the computational complexity involved in the operation. As will be shown, this proposal makes the PIC unnecessary and provides a good motivation for “freezing” effects, including superraising, so they do not need to be stipulated. Finally, it allows us to dispense with Tdef and φ-completeness.
In Section 2, I present my assumptions concerning Agree and Move and the notion of feature co-valuation. In Section 3, I discuss the mechanisms of Agree, Case assignment and Move within the assumptions laid out in Section 2. In Section 4, I explore another aspect of the theory of agreement, namely, the possibility that two sets of unvalued features can be co-valued. I analyze expletive constructions and show that Locality of Agreement and feature co-valuation together provide some new insight into the properties of this construction in German, French, English and Icelandic. It is shown that my assumptions provide analyses for a variety of phenomena that can’t be explained under Chomsky’s. A crucial ingredient of my analyses is that the expletive-associate relation is an Agree relation established in a strictly local configuration.
In Section 5 I discuss Superraising, and show how my proposal to enforce locality between two agreeing terms can be extended naturally to enforce locality between the links of a chain. The final section summarizes my main conclusions. Although not every aspect of the theory of Case and Agreement is discussed and some knots are still tied (or uncut), I believe enough is accomplished to conclude that the direction taken here is promising.
2. Framework
2.1 Transitivity
I adopt Chomsky’s (1995) proposal that external arguments are introduced by a functional category, a light verb represented as v (for similar ideas, see Kratzer 1996, among others who develop ideas ultimately rooted in Larson 1988). What does v select for as a complement? The traditional assumption is that it should be a VP. However, it is worth considering Marantz’s (1997) recent proposals concerning the morphology-syntax interface. Marantz argues vigorously against the Lexicalist Hypothesis – or more appropriately, he complains that Lexicalism died a while ago but most of us did not read the obituary and missed the funeral. Two lexicalist assumptions that Marantz rejects are crucial for our purposes. The first is that the lexicon is a computational space, separate from syntax, in which words are formed by putting together different bits and pieces, including roots with an inherent category label. The second assumption is that syntax does not see these bits and pieces, only the resulting lexical item with the category label attached to it. Instead, Marantz proposes a “narrow lexicon” composed of roots and bundles of grammatical features. The roots enter the computational system – there is only one for morphology and syntax – without a category label and take the complements that they select. Then they are themselves selected by a functional category. If the functional category is a v, the resulting structure will be a verbal phrase. If the functional category is a D, the result will be a nominal phrase. Henceforth, I represent a label-less root as an X and its projection as an XP.
(7) a. [vP v [XP X YP ]]
    b. [DP D [XP X YP ]]
Thus, if //buy// is selected by v, it is going to be a verb, whereas if it is selected by D, it is going to be a noun. Functional heads – Tense, Comp, Det – are feature bundles that have fairly fixed selectional requirements. I adopt Marantz’s proposals in this paper.3 v comes in several versions (see Arad 1998, this volume). We are interested in two of them. v(AG) is the functional category that selects for an agent external argument and has the property of assigning accusative Case. A second version of the light verb is the one that we find in unaccusatives and passives: it
does not assign a θ-role at all but may in some languages assign partitive Case (Belletti 1988; Lasnik 1992). Call it v(Ø).4 Further, I propose that the selectional properties of v in any of its versions are invariant: it selects an XP as a complement and a DP as a specifier, even if it can’t assign it a θ-role – or, to say the same thing in different words, v has an obligatory EPP feature that is satisfied by the external argument if we have a v(AG) or by a raising DP or an expletive if we have a v(Ø) (more on this in Sections 3.2 and 4).
2.2 Agree
Following Chomsky (1998), Agree is an operation between two items, a probe and a goal (p, g), whose objective is to provide a value for features that are introduced into the computational system without a value. Henceforth, I represent unvalued features as a variable α, so if I want to say that x has unvalued φ-features I simply write that x has [αφ]. Thus, a functional category with [αφ] can probe within its c-command domain until it finds a DP, which has a set of valued φ-features, [φ]. As a result of the probe, the [αφ] of the probe can be valued. Following a long tradition I assume that v, finite T and C have sets of [αφ] in need of valuation (see Haegeman 1992 and Zwart 1997 for evidence that C can agree). We can understand Agree (p, g) as an operation that co-values two sets of features. If one of the two sets is already valued, this value is simply copied onto the other set and the [α] symbol is removed. This is represented in (8a), where X is a probe and Y is a goal. If both probe and goal have unvalued features of the same type, they will remain unvalued, with a twist: since they are now involved in the Agree relation, these features will be co-valued. In other words, two unvalued but agreeing features can’t vary freely; eventually they must end up with the same value. I represent this with a subscript number in (8b):
(8) a. Agree (X[α], Y[1]) → X[1], Y[1]
    b. Agree (X[α], Y[β]) → X[α1], Y[α1]
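The two cases in (8) can be made concrete with a small sketch. It is only an illustration under stated assumptions: the names `Cell`, `Item` and `agree` are hypothetical, an unvalued [α] feature is modelled as `None`, and co-valuation is modelled by linking two feature slots so that valuing one later values the other. Nothing here is claimed to be part of the theory itself.

```python
class Cell:
    """One feature slot; value None stands for an unvalued [α] feature."""

    def __init__(self, value=None):
        self.value = value
        self.link = None  # set when this slot is co-valued with another slot

    def root(self):
        # Follow co-valuation links to the slot that actually holds the value.
        cell = self
        while cell.link is not None:
            cell = cell.link
        return cell


class Item:
    """A syntactic item with named feature slots (hypothetical helper)."""

    def __init__(self, name, **features):
        self.name = name
        self.cells = {feat: Cell(val) for feat, val in features.items()}

    def get(self, feat):
        return self.cells[feat].root().value


def agree(probe, goal, feat):
    """Sketch of Agree (p, g) for one feature type, following (8)."""
    p = probe.cells[feat].root()
    g = goal.cells[feat].root()
    if p is g:
        return                # already co-valued
    if p.value is None and g.value is None:
        g.link = p            # (8b): both unvalued, so they are co-valued
    elif p.value is None:
        p.value = g.value     # (8a): the goal's value is copied onto the probe
    elif g.value is None:
        g.value = p.value     # same copying, in the other direction
    # Both already valued: this sketch does nothing.


# (8a): an unvalued probe meets a valued goal.
x = Item("X", phi=None)
y = Item("Y", phi="3sg")
agree(x, y, "phi")

# (8b): two unvalued features are co-valued, then valued together later.
a = Item("A", case=None)
b = Item("B", case=None)
agree(a, b, "case")
still_unvalued = a.get("case") is None and b.get("case") is None
v = Item("v", case="acc")
agree(v, b, "case")
```

After the last call both `a` and `b` carry the value supplied by `v`, which is the sense in which co-valued features cannot vary freely.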
Co-valued variable features are going to be useful for an analysis of expletives in Section 4 and movement chains in Section 6. Should EPP participate in Agree (p, g) operations? It seems not. First, a probe looks into its c-command domain, while EPP features typically involve the spec of a head. Second, selection does not involve valuation of features. In other words, it does not seem plausible to say that v has an [αcategory] feature which is valued as [D] when it probes and finds a DP or valued as a [P] if it finds a PP instead. Selection does not look for a token feature of a certain type,
but looks for a specific instantiation of a feature. For all this, I prefer to leave EPP out of the Agree (p, g) system, as a residue of the θ-assignment system.
2.3 Case assignment as Agree
In the Principles and Parameters tradition, certain heads had the stipulated property of assigning Case, namely P, T and V. This view did not change radically until the most recent developments of the Minimalist Program. As I mentioned in the introduction, in Chomsky (1999: 4), [assign x Case] is not a feature of the probe – there is no matching relationship between probe and goal in this respect. Rather, when agreement between probe and goal takes place, and the probe is φ-complete, the unvalued Case of the nominal is valued and deleted. As we saw in my discussion of example (4), in Swahili we find that both the T head and Aspect co-occur in the same sentence and appear to be φ-complete. As a consequence, the aspect head probes the DP and, since the probe is φ-complete, the unvalued Case of the DP should be valued and deleted, effectively freezing the DP in place before T can probe it. This seems to be an undesirable result. Instead, it seems we should retain from earlier frameworks the idea that some heads are responsible for Case licensing of DPs and others are not.
Let’s then assume that some functional categories do have a Case feature and, further, that Case is one of the features that can enter an Agree (p, g) relation – notice that I am talking about Case features now like [nominative] or [accusative], not “Case assigning features”. This feature may be inherent or may be added freely as the lexical item is drawn from the lexicon to form a numeration.5 This Case feature is valued from the outset but uninterpretable. Since this feature is valued, it can be copied onto an agreeing goal that has an unvalued Case. Since it is uninterpretable, it will be deleted as soon as the phase is spelled out.
For example, v(AG) has an [accusative] feature, so when it agrees with an object with an unvalued Case, the feature [accusative] can be copied onto the feature matrix of the object. After the vP phase is completed and spelled out, both accusative features are deleted. Case assignment is therefore only a variant of the agreement relation, as represented in (9). In (9a) I represent the situation in which one of two items involved in Agree has a Case feature. What if two terms X, Y are involved in the Agree relation but both of them have unvalued Case features? I posit that the Agree relation forces them to co-value their Case variables (9b), following the general pattern in (8):
(9) a. Agree (X[C1], Y[αC]) → (X[C1], Y[C1])
    b. Agree (X[αC], Y[βC]) → (X[α1C], Y[α1C])
Co-valued Case features are going to be at work in my analyses of the connection between expletive and associate as well as in movement chains. The final question to be decided is which heads bear a Case feature. I assume that v may bear a Case feature. As already mentioned, v(AG) bears accusative Case (so that the effects of Burzio’s generalization are captured), whereas v(Ø) may have no Case or may have so-called partitive Case (after Belletti 1988; Lasnik 1992). Finite T does have [αφ], at least in English, but, contrary to standard assumptions, I claim that T does not have a Case feature. Instead, finite C bears [nominative] and non-finite C [null].6 Nominals have valued φ-features but unvalued Case, which I represent as [αC].
It may seem somewhat exotic to have C as a Case assigner (but see Platzack 1986; Vikner 1995; and Chomsky 1999, fn 17), but clearly this assumption leads to a simplification of the theory. Currently, Chomsky (1998) must assume that there are two types of infinitival heads. On the one hand, Tdef does not assign Case, has only an [αperson] feature and is selected by a V. On the other, ordinary infinitival T in control and PROarb constructions has [αperson] and [αnumber], assigns null Case and is selected by C. Therefore, Chomsky makes two sets of assumptions: (i) there are two types of non-finite T, (ii) C and V have different selectional properties because V cannot select non-defective T and C cannot select defective T. Instead, I propose that there is only one type of infinitival T, which neither has α-features nor assigns Case (see also Romero, this volume). Non-finite T can be freely selected by a non-finite C that bears null Case, by a prepositional complementizer (like ‘for’) that bears accusative, or by a lexical root that is selected by v (giving rise to raising/ECM constructions).
If infinitival T does not have a set of [αφ] features, one of the motivations for φ-completeness disappears (as I will show in detail in Section 3.2). Another advantage is that dissociating Case assignment from agreement with T turns out to be a necessary step in order to account for the properties of expletive constructions (Section 4) – for instance, the associate of the expletive in English agrees with T but appears in non-nominative (accusative or partitive) Case.
2.4 α-features and spell-out
In Chomsky’s system, Full Interpretation forces features without semantic content – uninterpretable – to be deleted by LF after being checked. The question is whether Full Interpretation also plays a role in PF. Let’s assume it does. The
uninterpretability of features is presumably irrelevant at this level. However, whether a feature is valued or not should be very relevant. Why? From Distributed Morphology (Halle and Marantz 1993) I adopt the idea that CHL only handles bundles of features – syntactic heads are paired up with phonetic representations at Morphological Structure, by means of Vocabulary Insertion. But Morphology cannot find appropriate vocabulary items for features that have no value. Thus, a structure with unvalued features cannot be spelled out. From Collins (2000) and Ura (1996) I adopt the notion that an unvalued feature must be valued at once before further unvalued features are introduced in the structure. Moreover, I propose that spell-out takes place whenever a structure has no unvalued features left. In the case of ordinary transitive sentences, spell-out will take place right after vP is built and the object has valued its Case feature (but before the subject is merged) and after CP is built and the subject has valued its own Case feature. Thus, my system in which spell-out takes place after features are valued coincides with Chomsky’s phases. With unaccusatives and passives, we are going to see that the Case of the object is not valued at the vP level, so spell-out will be delayed until CP (again as in Chomsky’s approach). With expletive constructions we are going to find cross-linguistic variation: vP spells out in French but not in German, and a mixed situation occurs in English. For clarity, consider the three types of situations depicted below (henceforth SU = external argument merged in Spec,v, OB = Compl,X):
(10) [vP α [v’ v [XP OB ]]]
(11) [vP SU [v’ v [XP α ]]]
(12) [vP α1 [v’ v [XP α1 ]]]
In (10), v’ has no unvalued features, so it can spell out before a term is merged in Spec,v with its own unvalued feature. In (11), the unvalued feature is buried within the vP and the structure can’t spell out. In (12), the unvalued features are co-valued; the structure can’t spell out but the higher unvalued feature might be valued by means of an Agree relation with a higher functional category; if that happens, the lower feature is also valued and the structure can spell out.
Notice that the set of uninterpretable features and the set of unvalued features are not co-extensive. Assume Case features are agreement features, so DPs can be [nominative], [accusative] or have their Case feature unvalued if they are not in the domain of a probe with a Case feature. [nominative] and [accusative] are uninterpretable at LF but valued, so they do not lead to a crash at PF – indeed, PF needs these features to know what morphology must be inserted. On the other hand, a DP with an unvalued Case feature cannot be matched with the appropriate vocabulary item, the structure can’t spell out and this leads to a crash.
2.5 An example: Participle agreement (including Case)
An example will serve to show the intuitive import of feature co-valuation, including Case. Take the following two Icelandic examples (taken from McGinnis 1998: 184):
(13) a.
    Hann telur sig vera [ t sterkan].
    He.nom believes himself.acc to be strong.acc
b.  Hann tel-st vera [ t sterkur].
    He.nom believes.refl to be strong.nom
    ‘He believes himself to be strong’
In (13a), the anaphor sig starts out as an argument of the adjective/participle. Both sig and the participle have unvalued Case features, as well as person and number features. sig can be probed by the adjective/participle in its initial merge position. Thus the person and number of the adjective/participle become those of the noun. Additionally, since they are involved in an Agree relation, their Case features are co-valued. sig then raises spec to spec (we will see how and why) until it is probed by the matrix v. v has an [accusative] feature which gets copied onto sig. Since sig and the adjective/participle have co-valued their Case features, the adjective/participle automatically also values its unvalued Case as [accusative]. In (13b), McGinnis, following Marantz (1984), argues that hann is an argument of the adjective/participle. If so, the adjective/participle and the pronoun will also co-value their Case features. Then, the pronoun raises until it is probed by the matrix C. C has a [nominative] feature that gets copied onto hann. By virtue of feature co-valuation, the adjective/participle ends up [nominative] too.7
2.6 Another example: Floating quantifiers
Floating quantifiers are another area in which feature co-valuation proves to be useful. When we move a DP stranding its quantifier (see fn 3) we end up with two constituents that bear the same Case feature. This can be seen in the Spanish example (14). In Spanish, dative shift can strand a quantifier
(see Branchadell 1992 and references therein). When this happens, both the moved DP and the stranded quantifier show up in dative Case, spelled out as a (see Demonte 1995 for arguments that a is a spell-out of dative Case and not a preposition). Additionally, notice the morphological change in the dative clitic:
(14) a.
    Le di a cada uno de los hombres un libro.
    cl.3rd.sg gave dat each one of the men a book
b.  Les di a los hombres un libro a cada uno.
    cl.3rd.pl gave dat the men a book dat each one
    ‘I gave each of the men a book’
In (14a) the clitic agrees with the quantifier, which is singular, whereas in (14b) it agrees with the determiner, which is plural. How can this be? I suggest that the puzzle of having two dative constituents in the same sentence can be analyzed in terms of feature co-valuation. Let’s go step by step. First, assume that dative shift is an instance of Case-driven movement (as in Collins & Thráinsson 1996, among others). It has been noted before (Demonte 1994) that the dative clitic can augment the Case valence of a verb: (15) a.
    Juan hizo un pastel para María.
    ‘Juan made a cake for Maria.’
b.  Juan le hizo un pastel a María.
    Juan cl.3rd.sg made a cake dat Maria
    ‘Juan made Maria a cake.’
c.  *Juan hizo un pastel a María.
    Juan made a cake dat Maria
It seems natural to conclude that the clitic itself is involved in assigning Case. Following suggestions in Ura (1996), we can assume that it is associated with a light verb v head. Thus, the clitic has the features [dative] and [αφ]. Take sentence (14a). The QP has dative shifted to a position where it can be probed by the clitic. The unvalued Case of the QP is valued as [dative] and the [αφ] of the clitic are valued as [3rd.sg] because these are the φ-features of the quantifier cada. Take now (14b). The crucial assumption here is that the DP starts out as the complement of the quantifier. In this configuration, the quantifier can probe the DP and, as a result, they co-value their Case features, as in (16a). The DP raises, stranding the quantifier, to a position where it can be probed by the clitic. The [dative] Case of CL is copied onto the feature structure of the DP. By feature co-valuation, the QP also ends up with the feature [dative]:
(16) a. [QP Q[α1C] DP[α1C] NP ]
     b. [CL[dat] DP[dat] . . . QP[dat] t(DP) ]
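The contrast between the two Spanish derivations can be traced step by step in a short sketch. It is a toy model under explicit assumptions: `Feature`, `agree`, `value_of` and the small `CLITIC_FORM` table are hypothetical names introduced only for illustration, unvalued features are `None`, and the clitic forms are simply read off the glosses in (14).

```python
class Feature:
    """A feature slot; None means unvalued. Linked slots are co-valued."""

    def __init__(self, value=None):
        self.value = value
        self.link = None

    def rep(self):
        f = self
        while f.link is not None:
            f = f.link
        return f


def value_of(feature):
    return feature.rep().value


def agree(probe_feature, goal_feature):
    p, g = probe_feature.rep(), goal_feature.rep()
    if p is g:
        return
    if p.value is None and g.value is None:
        g.link = p            # co-valuation of two unvalued features
    elif p.value is None:
        p.value = g.value     # copy the goal's value onto the probe
    elif g.value is None:
        g.value = p.value     # copy the probe's value onto the goal


# Assumed spell-out table, based only on the glosses in (14).
CLITIC_FORM = {"3sg": "le", "3pl": "les"}


def quantifier_not_stranded():
    """(14a): the whole QP shifts and is probed by the clitic."""
    cl = {"case": Feature("dat"), "phi": Feature()}
    qp = {"case": Feature(), "phi": Feature("3sg")}
    agree(cl["case"], qp["case"])
    agree(cl["phi"], qp["phi"])
    return value_of(qp["case"]), CLITIC_FORM[value_of(cl["phi"])]


def quantifier_stranded():
    """(14b): Q probes DP first (16a); DP then raises and CL probes it (16b)."""
    cl = {"case": Feature("dat"), "phi": Feature()}
    q = {"case": Feature()}
    dp = {"case": Feature(), "phi": Feature("3pl")}
    agree(q["case"], dp["case"])    # (16a): Case features co-valued
    agree(cl["case"], dp["case"])   # (16b): [dative] copied onto DP
    agree(cl["phi"], dp["phi"])     # the clitic's φ-features come from DP
    return value_of(q["case"]), value_of(dp["case"]), CLITIC_FORM[value_of(cl["phi"])]
```

In the stranded derivation the quantifier is never probed by the clitic, yet it ends up dative because its Case slot was linked to that of the DP beforehand.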
Additionally, CL has [αφ] that need to be valued too. The DP has φ-features [3rd.pl], and these are the features that end up copied onto the clitic. Since all features are valued, the structure is ready for spell-out, after which the uninterpretable features can delete. Obviously this analysis is sketchy. However, I believe it suffices to illustrate the usefulness of incorporating feature co-valuation in the theory.
2.7 Locality of Agreement
Principle (17) limits Agree (p, g) to elements not separated by a maximal category:
(17) Locality of Agreement
     A probe p can agree with a goal g iff
     (i) p c-commands g (g is a sister of p or dominated by a sister of p)
     (ii) there is no Xmax category such that
          (iia) Xmax dominates g.
          (iib) Xmax does not dominate p.
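The practical effect of (17) for a probing head can be illustrated with a small sketch. It does not implement the definition itself; it encodes only the consequence drawn from it in this section, namely that a head can reach its own complement and the spec of that complement and nothing deeper. The class `Phrase` and the functions `probe_domain` and `can_agree` are hypothetical names, and the segment-based reading of Xmax is taken for granted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Phrase:
    """A phrase with an optional specifier and an optional complement."""
    label: str
    spec: Optional["Phrase"] = None
    comp: Optional["Phrase"] = None


def probe_domain(phrase: Phrase) -> List[str]:
    """Labels the head of `phrase` can agree with on the assumed reading of (17)."""
    reachable = []
    if phrase.comp is not None:
        reachable.append(phrase.comp.label)           # its own complement
        if phrase.comp.spec is not None:
            reachable.append(phrase.comp.spec.label)  # spec of the complement
    return reachable


def can_agree(phrase: Phrase, goal_label: str) -> bool:
    return goal_label in probe_domain(phrase)


# Structure (6): XP is the complement of H2, so H1 cannot reach it.
in_situ = Phrase("H1P", comp=Phrase("H2P", comp=Phrase("XP")))

# The same structure once XP sits in Spec,H2: now H1 can reach it.
raised = Phrase("H1P", comp=Phrase("H2P", spec=Phrase("XP"), comp=Phrase("t(XP)")))
```

With these two structures, `can_agree(in_situ, "XP")` is false and `can_agree(raised, "XP")` is true, which is the configuration that the discussion of Move builds on.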
(17) depends strongly on our definition of an Xmax category. Consider a phrase with a spec, as exemplified in (18):
(18) [XP ZP [X’ X YP ]]
I assume that specs create adjunction structures, so that XP and X’ are not categories but segments of a category (see May 1985 and Chomsky 1986 on the notion of segment). As a consequence, ZP is not dominated by an Xmax, but only by a segment. Thus, ZP can be probed from outside XP and can also probe YP and Spec,Y. However, an item outside of XP can’t probe YP.8
Locality of Agreement subsumes the PIC. The purpose of the PIC is to ensure that a probe can’t look into a finished phase, except at the edge (spec) of this phase. (17) ensures that a probe never looks at anything but its own complement and the spec position of its complement. Locality of Agreement derives Chomsky’s “freezing” effects (Section 5.1) and renders φ-completeness useless (Section 3.2). The difficulty and the interest is to see whether this more restrictive condition can give us empirical advantages, which happens to be the case, as I will show.
2.8 Move
Another point in which I distance myself from Chomsky (1998) concerns Move. It might be useful here to briefly review Chomsky’s (1993, 1995) motivations for Move or Attract to provide some background to my own solution.9 In Chomsky (1993), constituents moved to satisfy their own formal requirements – to get Case, in other words – within the principle called Greed. Greed rules out (19), because the moved item does not achieve anything by moving to the subject position of the matrix clause:
(19) *John seems t is happy.
However, the existence of successive cyclic movement proved to be a challenge for Greed. In the examples in (20), I indicate intermediate traces in places where the moved element stops for breath but does not satisfy any licensing requirements: (20) a. Where did you think t that Peter bought it t? b. John is believed t to be likely t to be t happy.
In order to account for the existence of successive cyclic movement, a DP in search of a Case or a wh-word in search of an appropriate interrogative head should be allowed to stop in intermediate positions, even if no feature checking or Case assignment took place in these intermediate positions. To put it in Chomsky’s words (1995: 261): “Move raises α to a position β only if morphological properties of α itself would not otherwise be satisfied in the derivation”. This formulation fits in what Collins (1997) calls a Global Economy framework because in order to know if a derivational step is permissible one must let the derivation proceed until it is finished. Global Economy was considered to be inadequate if computational considerations enter the picture – in general, operations that can “look ahead” at future steps of a derivation should be suspect (see Collins 1997; Johnson & Lappin 1999). An optimal analysis of the displacement phenomenon should include successive cyclic movement organically without having to weaken the motivation for movement and introduce computational complexity. From Chomsky (1995) on, motivation for movement rests on the functional category with which the DP agrees. In Chomsky (1995) the operation Attract Feature was suggested, later abandoned.10 In Chomsky (1998), Move is
seen as the combination of Agree (p, g) and Merge of the goal in the spec position of the probe. The idea is that the (p, g) relation is sufficient to delete the uninterpretable features of probe and goal, except for one, the selectional EPP feature of the probe. In order to delete the EPP feature, the goal is pied-piped into the spec position of the probe. Thus, the application of Move presupposes a previous Agree operation and is triggered by a selectional requirement of the probe. Within this model, it still seems that successive cyclic movement must be stipulated in some form. Assuming phases and the PIC, an account of (20a) requires stipulating optional EPP features in Spec,v and Spec,C, which are licensed if they have an effect on outcome (more on this in Section 5). It is easy to see how this takes us back to the Global Economy problems that Collins discussed. The ungrammaticality of (19) must also be stipulated: if the head triggers movement, there is no principled reason why ‘John’ can’t raise upstairs. Chomsky proposes the above-mentioned “freezing” effect. To sum up, neither Move nor Agree+pied pipe integrate successive cyclic movement without stipulation and without assuming Global Economy. The challenge that I undertake here is to conceptualize displacement in such a way that it satisfies these theoretical desiderata. Let me introduce my approach to this problem by means of two metaphors. The first one is the “tension” metaphor, which I borrow from Frampton and Gutmann (1999). The presence of an [α] feature creates a tension in the structure built by Merge and the computational system tries to release it before proceeding to the next phase. Tension is released by valuing the [α] feature. The “tension” metaphor needs to be complemented with a “reaction” metaphor. 
We can compare syntactic movement to a chemical reaction: putting two substances together may set off a reaction which sometimes yields a product that is not yet stable and requires further reaction until a final stable product is obtained. As in syntactic movement, the reacting substances do not know that the product of their initial reaction is not the final stage, but the initial reaction takes place regardless. I propose to view Successive Cyclic Movement as “intermediate” reactions. Regarding movement as a reaction may help us abandon teleological metaphors of movement, with all their concomitant problems, while sticking to the fundamental idea that all syntactic operations must be motivated.
In Chomsky (1999), movement is a consequence of agreement. Contrariwise, I want to propose that movement takes place when Agree fails. Concretely, Move is triggered by the necessity of creating a (p, g) relation to value and delete [α] features of items that might be too distant for probing, given the strict locality requirement imposed by (17). Thus, the ultimate reason why there is movement is actually Locality of Agreement, together with the instability created by [α] features. Take again configuration (18) and assume that YP has an [α] feature – say unvalued Case – in need of valuation and that X – a bare root – does not have the relevant Case feature that could value unvalued Case and delete it. Under these circumstances, unvalued Case of YP can’t be probed. Furthermore, unvalued Case does not have anything to probe. It is then that YP must move. I assume movement is to the closest available spec. Why the closest and no other? First, since movement is reactive, nothing forces a moving term to go to any particular place. Second, as I explain in Section 5, an Xmax is a barrier that severs the links of a chain. This bans movement to any but the closest spec.11
It is important to note that the constituent which is about to move does not “know” if this raising is going to be successful, because in a reactive system terms are blind to what might lie ahead – and, as a matter of fact, movement may fail to value [α], which leads to another movement operation. If the closest spec can’t be probed by a head with a matching feature, the YP will have to jump again to the next spec, if one is available. The process only stops if the features of YP are finally valued or when no specs are available and a feature is trapped within a phase (more on this later). Successive cyclic movement is therefore incorporated into the system without stipulation. Moreover, the ungrammaticality of (19) is now expected: the DP ‘John’ has already valued its unvalued Case downstairs and reached equilibrium. Nothing forces it to move again and it is too far down to be probed by the matrix T. The hypothesis just presented does not require “freezing” effects. Consequently, it does not require a distinction between Match and Agree.
Since we do not need “freezing”, we have incorporated the advantage inherent in the Greed system, but notice that we do not incorporate its main disadvantage, Global Economy. Movement of x is licensed if x has an [α] feature: there is no need for look-ahead. Additionally, this hypothesis does not require that the goal have an unvalued Case feature for Agree to take place, so Concord can be let back in. To conclude, I propose that Move is triggered by the requirements of the constituent that moves, as in the Greed framework, so “freezing” is out. The [α] features create an instability that triggers a reaction – movement is not “toward something” but “because of something”. Within this view and contrary to Greed, successive cyclic movement is not surprising, it is exactly what one would expect within a reactive system. Additionally, since movement is not “toward something”, we do not need to incorporate a Global Economy assumption. That is, movement is not licensed if a feature has been checked, movement happens as long as the moving item carries an [α] feature.
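The reactive view of Move lends itself to a very small simulation. The sketch below is only an approximation under simplifying assumptions: the clause is reduced to a bottom-up list of heads, each head is assumed to make a spec available, a head can value the DP's Case only if it bears a Case feature and the DP sits in its complement or in the spec of its complement, and the name `derive` is hypothetical.

```python
def derive(spine):
    """Trace a DP with unvalued Case up a clausal spine.

    `spine` is a bottom-up list of (head, case) pairs, where case is a
    string such as "acc" or None if the head has no Case feature.
    Returns the positions the DP occupies and the Case value it ends up
    with (None if its Case is never valued).
    """
    # The DP is first merged as the complement of the lowest head.
    path = ["Compl," + spine[0][0]]
    for head, case in spine:
        if case is not None:
            # The DP is local to this head, which values its Case.
            # Equilibrium is reached, so nothing triggers further movement.
            return path, case
        # Agree fails, so the DP reacts by moving to the closest spec.
        path.append("Spec," + head)
    # No head with a Case feature was ever local to the DP.
    return path, None


# Transitive object: one step to Spec,X, then v(AG) values Case.
transitive = derive([("X", None), ("v(AG)", "acc")])

# Unaccusative object: successive cyclic movement until finite C probes it.
unaccusative = derive([("X", None), ("v(Ø)", None), ("T", None), ("C", "nom")])

# No Case-bearing head at all: the [α] feature is never valued.
stranded = derive([("X", None), ("T", None)])
```

No step of the loop inspects heads that have not merged yet, so the intermediate landing sites arise without look-ahead, and a DP whose Case has been valued simply stops.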
3. Analyses
3.1 Accusative Case and transitivity
The purpose of this section is to show how Locality of Agreement and reactive Move, together with some current assumptions concerning the role of v in assigning accusative Case, can account for some data that have remained recalcitrant after several years of intensive investigation. At the same time, this analysis will be taken as a starting point for an analysis of expletives in Section 4. Consider (21) and (22), which represent, respectively, the steps involved in building a regular transitive vP and the resulting structure (see also Romero, this volume). In derivations I omit category labels on phrases:
(21)
1. Merge (X, OB{[αC][φ]}) = {X, OB{[αC][φ]}}
2. Copy OB{[αC][φ]}
3. Merge (OB{[αC][φ]}, {X, t(OB)}) = {OB{[αC][φ]}, {X, t(OB)}}
4. Merge (v{[acc][αφ]}, {OB{[αC][φ]}, {X, t(OB)}}) = {v{[acc][αφ]}, {OB{[αC][φ]}, {X, t(OB)}}}
5. Agree (v{[acc][αφ]}, OB{[αC][φ]}) = {v{[acc][φ]}, {OB{[acc][φ]}, {X, t(OB)}}}
6. Spell out {v{[acc][φ]}, {OB{[acc][φ]}, {X, t(OB)}}}
7. Delete [-interpretable] features = {v{[acc][φ]}, {OB{[acc][φ]}, {X, t(OB)}}}
8. Merge (SU, {v{[acc][φ]}, {OB{[acc][φ]}, {X, t(OB)}}}) = {SU, {v{[acc][φ]}, {OB{[acc][φ]}, {X, t(OB)}}}}
(22) [vP SU [v’ v [XP OB [X’ X t(OB)]]]]
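The steps of (21) can also be traced mechanically. The following Python sketch is an illustration only: syntactic objects are encoded as nested pairs, features as dictionary entries with `None` for an unvalued (α) feature, and the φ value `"3sg"` is an arbitrary placeholder. The function names are assumptions made for this sketch.

```python
def merge(a, b):
    """Binary Merge: combine two syntactic objects into a pair."""
    return (a, b)


def agree(probe, goal):
    """Value the probe's φ from the goal and the goal's Case from the probe."""
    if probe["phi"] is None:
        probe["phi"] = goal["phi"]
    if goal["case"] is None:
        goal["case"] = probe["case"]


def has_unvalued(obj):
    """True if any feature inside the structure is still unvalued (None)."""
    if isinstance(obj, tuple):
        return any(has_unvalued(part) for part in obj)
    return any(value is None for value in obj.values())


# Feature values such as "3sg" are illustrative placeholders.
X = {"cat": "X"}
OB = {"cat": "OB", "case": None, "phi": "3sg"}
v = {"cat": "v", "case": "acc", "phi": None}
SU = {"cat": "SU", "case": None, "phi": "3sg"}
trace = {"cat": "t(OB)"}

xp = merge(X, OB)                 # line 1: the root merges with object
xp = merge(OB, merge(X, trace))   # lines 2-3: object copies and remerges in Spec,X
vp = merge(v, xp)                 # line 4: the light verb merges
assert has_unvalued(vp)           # v and OB still carry α-features
agree(v, OB)                      # line 5: v probes OB in Spec,X
assert not has_unvalued(vp)       # lines 6-7: nothing unvalued, spell-out is possible
vP = merge(SU, vp)                # line 8: subject merges
print(OB["case"], v["phi"])
```

After line 8 the structure again contains an unvalued feature, the Case of subject, so `has_unvalued(vP)` is true. That is what drives the next stage of the derivation.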
Let’s decode the information contained in (21). In line 1, the root merges with object, which bears valued φ-features and an unvalued Case feature. The unvalued Case feature forces it to move, as represented in lines 2 and 3. Notice that I have not subscripted any features on the trace of object: this is incorrect, I will
fix it in Section 5. In line 4, the light verb merges, with unvalued φ-features and a valued Case feature. The light verb and object may agree at once, with the result that their unvalued features are now valued, as shown in line 5. Since now we have a structure without α-features, it can be spelled-out at once (line 6) and the uninterpretable features can delete (line 7) – deleted features are represented with strike-thru. Finally, in line 8, the subject can merge. Subject merges with its own valued φ-features and unvalued Case (not represented in 21) and satisfies the selectional feature of v (EPP). This structure preserves the assumptions in Arad (1998, this volume), Chomsky (1995, 1998) and Kratzer (1996) that the same head that formally licenses the object is involved in introducing the external argument. However, Chomsky (1995) and the others assume that object raises to Spec,v and checks its accusative Case in that position, as in structure (23): (23)
[vP OB [v’ SU [v’ v [VP V t(OB)]]]]
López (2001) points out that it is problematic to have the subject θ-role assigned in the same domain where accusative Case is checked and by the same head. Take a situation in which subject has been merged with v and received a θ-role and v has an accusative Case to assign. What is to prevent v from assigning accusative Case to subject instead of object? All Chomsky (1995: 311–312, 356) does is to stipulate that arguments can only check features if they form non-trivial chains. Or take another situation: the object has raised overtly to Spec,v before subject has merged (as briefly suggested by Chomsky, and developed by Ura 1996). In this situation, what prevents the object from assuming a second θ-role? Nothing does, unless we reintroduce the θ-Criterion as a theoretical primitive, an undesirable move. Given these problems, it seems clear that it is not a good idea to have a configuration like (23) if v assigns a θ-role to subject and Case to object. This is avoided by leaving object in Spec,X and ensuring that Agree only involves the c-command domain of the probing head (and notice that there is nothing wrong with structure (23) if v does not assign Case to object in Spec,v).
Word order data favor my approach over both Chomsky (1995) and Chomsky (1998). Consider the following sentences (which I cite from Johnson 1991):
(24) a. *Chris ate slowly the meat.
b. Chris talked slowly to Gary.
(24a) exemplifies the well-known requirement of adjacency for accusative Case assignment in English (Stowell 1981). The question is where this adjacency comes from. Johnson argued that it comes from the combination of two movement operations: (i) movement of the lexical verb to a higher functional category that governs VP (the functional head could be v in our terms), (ii) movement of the DP to Spec,X, where this functional category assigns Case under government. This seems the only available option if we want to preserve Emonds’ (1976) and Pollock’s (1989) conclusions that the English verbal complex never raises to INFL overtly. How do we get the word order in (24) within Chomsky’s (1995) or (1998) assumptions? We have two choices, both of them unsavory: (i) if object raises overtly to Spec,v, we would have the word order subject-object-V, which is unattested in English; (ii) if object raises covertly, the adjacency requirement is left unaccounted for unless an adverb creates an intervention effect on object movement (probably an implausible assumption, although briefly discussed by Chomsky). Short movement to Spec,X seems descriptively to be the most adequate solution. There is some interesting evidence that the target of Raising to Object is Spec,X. It has been known since Postal (1974) that an ECM subject can bear two θ-roles, as exemplified in (25a–d), taken from Boskovic (1997). (25a, b) show that the verb ‘estimate’ selects for a DP that expresses quantity; the same restriction is apparent in (25c, d), which suggests that the ECM subject is selected and assigned a θ-role. López (2001) additionally shows that chains with two θ-roles are found in causative constructions, in which the causee argument is shared by the subordinate and the matrix predicates, as in the Spanish sentence (25e):
(25) a. Sue estimated Bill’s weight.
b. *Sue estimated Bill.
c. Sue estimated Bill’s weight to be 150 pounds.
d. *Sue estimated Bill to weigh 150 pounds.
e. Juan le hizo a Pedro reparar el coche.
Juan cl.3rd.sg make-past.3rd.sg dat Pedro repair the car
‘Juan made Pedro fix the car’
López (2001) argues that A-chains whose head has accusative or dative Case can have two θ-roles, one assigned by the subordinate predicate, the other by the matrix predicate. This is not accounted for if ECM SUs and causees move to matrix v, but it follows if they move to Spec,X, which becomes a position that is both θ- and Case-related.
3.2 Nominative Case and participle agreement
Let’s proceed with the derivation of an ordinary transitive sentence. After a vP structure like (22) is completed, the Case feature of subject is still unvalued. T merges and probes subject so it can value its own φ-features. However, T cannot value subject’s Case, and this forces subject to raise to Spec,T. Then C merges with TP. With subject in Spec,T, C probes subject and values its own φ-features while it values the subject’s Case feature. The CP can now spell out and all the uninterpretable features can delete: (26)
[CP C [TP SU [T’ T [vP t(SU) v ]]]]
1. Agree(T[αφ], SU[φ][αC]) → T[φ] . . . SU[φ][αC]
2. Agree(C[nom][αφ], SU[φ][αC]) → C[nom][φ] . . . SU[φ][nom] . . . T[φ]
This is the structure of the solution for the simplest case. Let us now consider cases in which the DP that is generated in complement position raises to Spec,T:
(27) a. A book is on the table.
b. A lady passed away.
c. The body was discovered by a lady.
In these cases, we have the v(Ø) variant: v selects for a D in its spec although it does not have a θ-role to assign. By assumption, v(Ø) in these instances does not assign Case either, because the DPs generated as complements of the main predicate show up in Spec,T. The passive is the most complicated example, as shown in (28).
(28) [C [5 T [4 be [3 participle [2 v [1 discover- the body ]]]]]]
The bare root ‘discover’ is selected by a light verb, so the structure becomes verbal and can be selected by participle morphology. The object is merged with an unvalued Case that forces it to raise to 1. In 1, object can be probed by v, which satisfies v’s unvalued φ-features but object still has its unvalued Case feature. So, object raises again to 2. In 2, object can be probed by the participle, valuing the latter’s φ-features, and satisfies v’s selectional feature (EPP). Additionally, if the participle has a Case feature, as in Icelandic, both the participle’s and object’s Case features are co-valued at this point. Object continues raising successively through 3 and 4 and in the latter it can be probed by T, thus valuing the [αφ] of T. Finally, object stops at 5, where it can value its Case with C (and so does the participle’s, if it has one, by co-valuation). Kayne’s (1989) facts about participle agreement in French follow naturally. Kayne showed that a participle in this language agrees with an object provided that the latter has raised, either as an instance of wh-movement, clitic movement or, as in this case, A-movement to subject position. In my terms, there is participle agreement with an object that stops in 2, but not if it stays in situ, because of Locality of Agreement. If position 2 is not filled because nothing raises into it, we get a default form of the participle, without a set of φ-features. As Chomsky (1999) himself notes, Kayne’s findings have no account in his system. Since he allows the participle to probe object in its c-command domain regardless of the intervening distance, he cannot explain why agreement shows up only when object raises.
3.3 Eliminating φ-completeness and Tdef
Recall that Chomsky suggests that two probes are not φ-complete: participles and non-finite T in Raising constructions. Recall also that I argued that there does not seem to be a coherent way to decide which probes are complete and which are not.
Let’s start with participles. In Chomsky (1999), T can probe the DP across the participle because the participle is not φ-complete. In his framework, if the participle were φ-complete, it would freeze the DP in place and would not
be probed by T. In my analysis, the DP agrees with the participle and with T independently as it raises, as we saw in 3.2. The notion of φ-completeness is therefore unnecessary in this context. The other construction in which φ-completeness is crucial is Raising to Subject/Raising to Object (ECM). Let me now discuss them briefly:
(29) a. John seems to be intelligent.
b. Mary expects John to be intelligent/found soon.
The parallel properties of these constructions are well-known: a surface DP constituent of the matrix predicate gets its θ-role from the subordinate predicate. The structure of the derivation of (29b) is represented in (30):
(30) Mary [vP expects [XP 6 X [TP 5 to [vP 4 v [VP 3 be [aP 2 a [XP 1 intelligent John ]]]]]]]
‘John’ has been merged within the lowest structure, which I call aP, in parallelism with vP. In many languages, adjectives and nouns agree. This is analyzed here by taking the light head that selects the root to be a probe. After movement to 1, ‘John’ can be probed and agrees with the adjective but it still has an unvalued Case that needs to be valued and deleted. Recall that the adjective/participle has Case in Icelandic. As a result of Agree (a/ptc,DP), their unvalued Case features are co-valued. The unvalued Case forces the DP to raise to the next available specifier, which is number 2 here. In 2, the DP can’t be probed by a Case assigning head (by hypothesis, ‘be’ does not assign Case here), so the DP has to continue raising. Since T is not a Case assigner and there is no C in the subordinate sentence, raising of the DP only stops at 6, where it can be probed by v and its Case is valued as [accusative]. By co-valuation, the adjective/participle’s Case is also valued as [accusative]. All Case values can now delete. The DP is now stable and does not (cannot) raise anymore. Chomsky’s (1999) analysis of (29b) is very different from mine: either Tdef pied-pipes ‘John’ to Spec,T, so its EPP feature is deleted, or the upstairs probe v by-passes Tdef. In any case, the Case feature of the DP is not valued and deleted by Tdef because Tdef does not have a complete set of φ-features. Later, the matrix v probes in its c-command domain until it reaches ‘John’ in Spec,T or in situ. v can then agree with and assign Case to the DP. Finally, v pied-pipes the DP to Spec,v to satisfy an EPP feature of v. Among the advantages of my analysis: (i) no need for φ-completeness; (ii) no need for two types of non-finite T, one of which is, by stipulation, selected by V while the other is selected by C; (iii) as discussed above, word order facts
strongly suggest that Raising to Object involves Spec,X; (iv) no connection between agreement and Case, so Concord is let back in.
4. Co-valued α-features, part 1: Expletives
No doubt, a theory of agreement can’t be complete without an analysis of expletive constructions. This I undertake in this section. In 4.1 I discuss briefly Chomsky’s analysis of expletive constructions and in 4.2 I present my own assumptions on the nature of expletives. In Sections 4.3–4.6 I present analyses of expletive constructions in German, French, English and Icelandic which, I believe, are canonical representatives of the typological variation one can encounter. The goal of this section is to show how the combination of Locality of Agreement, Move and Feature Co-valuation provides considerable empirical coverage.
4.1 Chomsky (1999)
Let us first see how Chomsky (1998, 1999) analyses a simple expletive construction like (31):
(31) There is a man in the closet.
The initial numeration includes both the expletive and the DP ‘a man’. Since Merge is simpler than Move, ‘there’ is merged in Spec,T rather than have the DP raise. ‘There’ only has the feature [person], which is enough to satisfy the EPP feature of T, but leaves the φ-features of T intact because the expletive is φ-incomplete. Thus, T can probe and agree with the associate. The φ-features of T are valued and deleted and the Case of the associate is valued as nominative and deleted. Notice that, as a result, the expletive and the DP do not enter any sort of relation – there is no expletive-associate relation, contrary to what had been commonly assumed up to this point. This analysis makes some specific predictions, some correct, others not. Among the incorrect ones, in English the associate shows up in accusative Case, not nominative (as Chomsky himself recognizes). Importantly, if the expletive merges with Spec,T, as universally assumed, one would expect to find expletive constructions with all kinds of predicates; however, in many languages, including English, expletive constructions are limited to unaccusative and passive verbs or, in other words, to predicates without an external argument. Even
when expletives are grammatical with transitive verbs (Icelandic), the resulting construction has a somewhat different form, as I will show in a minute. Unlike Chomsky (1998, 1999), I do not claim that Tense agrees directly with the associate of the expletive – which would violate Locality of Agreement – instead, the expletive enters agreement relationships with the associate and with Tense. Moreover, I claim that in the general case, expletives merge in Spec,v, which explains why only unaccusatives and passives accept expletives. With some plausible assumptions concerning the properties of functional categories and expletives, I account for the range of variation found in German, French and Icelandic, namely:
1. The Case of the associate is non-nominative in English and French, nominative in German.
2. The associate agrees with Tense in English and German, but not in French.
3. There are Transitive Expletive Constructions (TEC) in Icelandic but not in English or French.12
4.2 Parameters of variation in expletive constructions
My view of expletives of the ‘there’ type assigns them a richer feature structure than is standard so far. Expletives are of category D and as such they have a set of φ-features and Case (as for ‘there’ having Case, see Groat 1999; Lasnik 1995). Their Case feature is unvalued, like that of the other nominals. What makes expletives different from other nominals is that they are not referential, which entails that they do not have inherent φ-features either. At this point, languages have two choices. The expletive can come from the lexicon with a fixed default value, say [3rd person singular], as is the case in French. Or the φ-features may simply be unvalued, as in English or German: thus, ‘there’ ends up being both [αφ] and unvalued Case (although expletives never seem to exhibit explicit number agreement, for reasons that I do not know).
If my assumptions concerning the feature bundle of the English and German expletive are right, valuing the features of the expletive is somewhat more complicated than valuing those of a functional category or an ordinary DP: its unvalued Case forces it to establish an Agree(p, g) relation with a v or C but its [αφ] forces it to do so with a DP – establishing the expletive-associate relation. The other parameter of variation in constructions with expletives concerns v(Ø). v(Ø) may bear partitive Case or may not bear Case at all (on partitive Case: Belletti 1988; Lasnik 1995). In German, we only seem to find the non-Case-bearing v(Ø) and in French we only see the Case-bearing v(Ø). English
seems to have a mixed system, where ‘be’ bears partitive Case (as argued by Lasnik 1995) but other unaccusatives do not. Furthermore, I assume that the expletive is merged in Spec,v (except in TECs, Section 4.6) (see Groat 1999 and references therein):
(32) [vP EXPL [v’ v [XP OB [X’ X t(OB)]]]]
(32) also reflects object raising to Spec,X, pushed by its own unvalued Case.13 Notice that no Xmax is interposed between the expletive and object, so they can enter an agree relation without violating Locality of Agreement. One could consider the possibility of retaining the traditional notion that expletives are merged with TP. However, merging the expletive with vP seems preferable conceptually. We know that v(AG) selects for an external argument. We can simply assume that v always selects for a D (Section 2.1), the only difference between v(AG) and v(Ø) is that the latter has had its θ-assigning property “bleached”, but otherwise they have identical syntactic properties. In other words, the D selecting feature of v does not need to be stipulated. However, if the expletive is merged with Tense, we do have to stipulate that Tense has an additional D selecting feature, somewhat arbitrarily, since T never assigns a θ-role. Notice that we do not need an EPP feature on T to force movement of a DP to Spec,T because the need of DPs to value their Case is sufficient to trigger raising. Therefore, I stick to the idea that the expletive merges with vP, obtaining the bonus point that we explain why expletive constructions occur only with unaccusative and passive verbs. Additionally, word order suggests that object can’t have moved higher than Spec,X in English, which within my assumptions suggests that Spec,v must be filled with an expletive: (33) There arrived three men. / *There three men arrived.
Lasnik (1995) shows that object must be adjacent to the verb ‘be’ or the unaccusative in English:
(34) a. *There will be usually a man here / There will be a man usually here.
b. *There will arrive usually a man here / There will arrive a man usually here.
As we saw before, the fact that an adverb can’t stand between the object and the verb is taken as a sign that object must have raised to Spec,X.14
4.3 German
In German, as in English, we can have expletive constructions with unaccusative and passive verbs. The associate of the expletive is in nominative Case and it agrees with T (examples provided by Susanne Winkler, p.c.):
(35) a. Da/Es ist einer/e/s hier.
‘There is one here.’
b. Da/Es kam einer/s/e spät an.
‘There arrived one late.’
c. Es wurde der Körper einer bestimmten Frau der Pearl St gefunden.
There was the body of-a certain woman of-the Pearl St discovered
‘There was discovered the body of a certain woman of Pearl St.’
I assume the following derivation for vP in German with an expletive. First object and X merge, and object raises to Spec,X, pushed by its unvalued Case, then v is merged and it probes object, so its own [αφ] are valued. However, v does not have a Case feature, so the unvalued Case of object remains unvalued. The vP cannot be spelled-out and the uninterpretable features remain. ‘da/es’ is merged, satisfying the EPP of v. ‘da/es’ has [αφ] and unvalued Case. Assuming (32), no Xmax intervenes between ‘da/es’ and object, only segments of categories. Thus, the expletive can probe in its c-command domain to find matching φ-features and the relation Agree(p, g) is established between the expletive and object that values the φ-features of the expletive. Neither object nor the expletive has a value for Case. However, the expletive and its associate are now linked by the Agree relation, which co-values all their features. In other words, the Agree relation forces the unvalued Case feature of object and the [βC] of the expletive to be co-valued, even if at this point both Case features are variables. This is represented in (36.1). T is now merged. T has a set of [αφ] to be valued. It probes the expletive, which has the φ-feature values of the object. As a result, the φ-features of Tense end up being the same as those of the object through the expletive, as shown in (36.2). The unvalued Case of the expletive forces it to move to Spec,T.
C is now merged. C has [αφ] and [nominative]. It can probe the expletive in Spec,T, valuing its own φ-features. Moreover, the Case of the expletive is valued as [nominative] and, automatically, so is the Case of the associate, by virtue of the Agree relation that co-values their Case features. There are no unvalued features left, so the structure is ready for spell-out and all uninterpretable features can delete at this point, as shown in (36.3).
(36)
1. Agree(Da[αφ][βC], NP[φ][αC]) → Da[φ][α1C] . . . NP[φ][α1C]
2. Agree(T[αφ], Da[φ][α1C]) → T[φ] . . . Da[φ][α1C] . . . NP[φ][α1C]
3. Agree(C[nom][αφ], Da[φ][α1C]) → C[nom][φ] . . . Da[nom][φ] . . . T[φ] . . . NP[nom][φ]
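One way to picture co-valuation in (36) is as two nominals sharing a single Case variable, so that valuing it once values it for both. The Python sketch below is a hedged illustration of that idea only: the classes `CaseVar` and `Item`, the function `agree`, and the φ value `"3pl"` are assumptions introduced here, and movement is left out.

```python
class CaseVar:
    """A Case variable; value None means the Case is still unvalued."""
    def __init__(self):
        self.value = None


class Item:
    def __init__(self, name, phi=None, case=None):
        self.name = name
        self.phi = phi      # None = unvalued φ-features
        self.case = case    # a CaseVar for nominals, a string for Case assigners


def agree(probe, goal):
    """Agree(p, g): value the probe's φ and link or value Case features."""
    if probe.phi is None:
        probe.phi = goal.phi
    if isinstance(probe.case, CaseVar) and isinstance(goal.case, CaseVar):
        probe.case = goal.case            # two nominals: share one variable
    elif isinstance(probe.case, str) and isinstance(goal.case, CaseVar):
        goal.case.value = probe.case      # a Case-assigning probe values the goal


np = Item("NP", phi="3pl", case=CaseVar())   # the associate; "3pl" is a placeholder
da = Item("da", case=CaseVar())              # expletive: [αφ] and unvalued Case
t = Item("T")                                # [αφ], no Case feature
c = Item("C", case="nom")                    # [αφ] and [nominative]

agree(da, np)   # (36.1): φ valued, Case features co-valued
agree(t, da)    # (36.2): T gets the associate's φ through the expletive
agree(c, da)    # (36.3): nominative reaches both nominals at once
print(t.phi, c.phi, da.case.value, np.case.value)
```

T never interacts with the associate directly in this sketch. Its φ value arrives through the expletive, which is the mediating role the text assigns to it.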
Thus, my analysis of expletive constructions attributes a different role to expletives than that found in Chomsky’s recent work. For Chomsky, ‘there’ satisfies the EPP of T and T probes and agrees with the object in situ. In my account, T can’t value its φ-features with object because of Locality of Agreement, unless either object is in Spec,v or an expletive in Spec,v can act as a mediator between T and object. The expletive is close to T, so it can be probed and it is also close enough to object to probe it. The expletive-associate relation that has been so widely discussed turns out to be an Agree(p, g) relation. So far, there is no empirical difference between Chomsky’s approach and mine. Testing them against more complicated cases will show our different predictions.
4.4 French
In French, Tense does not agree with the associate of the expletive, which receives partitive Case (Belletti 1988), not nominative, as can be seen in examples (37):
(37) a. Il y a trois hommes sur la table.
it there have three men on the table
‘There are three men on the table.’
b. Il est arrivé trois femmes.
it is arrived three women
‘There arrived three women.’
c. Il en est arrivé trois.
it part is arrived three
‘Three of them arrived’
(37) are a problem for Chomsky’s approach, as he acknowledges (1998: fn 91), because in his model Tense is supposed to probe down and agree with object, not with the expletive which is, within his assumptions, merged in Spec,T. However, in my model probing of object by T is impossible because a maximal category separates Tense from object. My model seems more promising to tackle these data. Consider (38) (in which I condense structure to make relevant points more salient):
(38) C [2 est [ il v [1 arrivé OB ]]]
Object raises to 1, where it gets partitive Case from v(Ø). Since vP does not include any unvalued features, it can spell-out and delete the uninterpretable features, as shown in (39.1). The expletive is merged with vP with features [3rd person singular] and unvalued Case. Unvalued Case should force il to probe. However, there is nothing to probe, because the Case feature of object has already deleted. Notice the difference with the German case, where vP could not spell-out because of the remnant unvalued Case feature on object, leaving object’s features available for probing by the expletive. If this reasoning is correct, when il is ready to probe, it will not find a Case value available for it and there is no co-valuation between the expletive and object. After T is merged, it can probe the expletive, valuing its own φ-features as [3rd person singular], as in (39.2). Since il still has an unvalued Case feature, it raises to Spec,T. Then C is merged. Finally, il is assigned nominative Case by C and values the φ-features of C (39.3). The valued uninterpretable features of C, T and il can now be deleted:
(39)
1. Agree(v[part][αφ], NP[φ][αC]) → v[part][φ] . . . NP[part][φ]
2. Agree(T[αφ], il[3rd.sg][αC]) → T[3rd.sg] . . . il[3rd.sg][αC]
3. Agree(C[nom][αφ], il[3rd.sg][αC]) → C[nom][3rd.sg] . . . il[3rd.sg][nom] . . . T[3rd.sg]
To sum up, the peculiar property of French expletive constructions is that T agrees with the expletive and not with the associate. This is naturally accounted for if Tense probes the expletive in Spec,v instead of object. The differences between French and German are correctly attributed to the ability of v(Ø) to assign Case.
4.5 English
Expletive constructions in English are acceptable with the copulative verb, unaccusative verbs and passive verbs, as shown in (40) and (41). As in German and French, expletives are not acceptable with transitive clauses:15
(40) a. A book is on the table.
b. A lady passed away.
c. The body was discovered by a lady.
(41) a. There is a book on the table.
b. At her residence on Tuesday last, there passed away a lady of talent and refinement.
c. At the corner of Pearl and Dufferin streets last Sunday there was discovered, by a lady resident there, the body of a certain woman of Pearl Street.16
English expletive constructions seem to combine some of the properties of German and some of French, according to the verb that we choose. Let’s first consider expletive constructions with the verb ‘be’:
(42) There is only me.
(43) *There is only I.
(44) There are only three people.
A sentence like (42) is a problem for Chomsky’s approach, as he himself points out. Recall that Chomsky ties Case assignment with agreement and T is the Case assigner. But in these cases object is not in nominative Case although it agrees with T. These sentences are not a problem for me, because agreement with T is an operation distinct from Case valuation by C or v. The examples in (42) and (43) show that the English copula is a Case assigner. So, a derivation in the German style with co-valued Case features between expletive and associate does not work. On the other hand, (44) shows that there is agreement between T and object, so ‘there’ does not come from the lexicon with default features, as in French. I propose to analyze these sentences in the following step by step process. First, raising of object to Spec,X allows it to be probed by v, so its unvalued Case is valued as [partitive] while it values the φ-features of v, as shown in (45.1). The Case feature of object and the φ-features of v are valued, the vP can now be spelled-out and the uninterpretable features deleted from narrow syntax. But notice that the object’s φ-features, being interpretable, are not deleted. Second, the expletive probes and establishes a (p, g) relation with object (45.2). Thus, the expletive can value its φ-features with object but, since the latter’s Case has been deleted, cannot co-value its unvalued Case. So, at this point the expletive has unvalued Case and valued φ-features. Third, T can now probe and value its φ-features with the expletive (45.3). The result is that now T and object have the same φ-features, although object got Case from v(Ø). Finally, the expletive
raises to Spec,T where it values its own unvalued Case as nominative. All the uninterpretable features of the expletive, C and T can now be spelled-out and deleted. Thus, we have obtained (i) agreement between T and object (through ‘there’), (ii) non-nominative Case for object.
(45)
1. Agree(v[part][αφ], NP[φ][αC]) → v[part][φ] . . . NP[part][φ]
2. Agree(there[αφ][αC], NP[φ]) → there[φ][αC] . . . NP[φ]
3. Agree(T[αφ], there[φ][αC]) → T[φ] . . . there[φ][αC]
4. Agree(C[nom][αφ], there[φ][αC]) → C[nom][φ] . . . there[φ][nom] . . . T[φ]
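The derivations in (36), (39) and (45) differ only in two settings: whether v(Ø) bears partitive Case, and whether the expletive comes with default φ-features. The Python sketch below makes that comparison explicit. It is a schematic summary under those two assumptions, not an implementation of the analysis: the function name, parameter names and the placeholder value `"3pl"` are introduced here for illustration, and movement and spell-out are not modelled.

```python
from typing import Dict, Optional


def expletive_derivation(assoc_phi: str,
                         v_case: Optional[str],
                         expl_default_phi: Optional[str]) -> Dict[str, str]:
    """Trace φ and Case values through a simple expletive construction."""
    # Step 1: v(Ø) probes the associate in Spec,X; it values Case only if it bears one.
    assoc_case = v_case
    # Step 2: the expletive probes the associate. Its φ is the lexical default if
    # there is one; Case is co-valued only if the associate's Case is still unvalued.
    expl_phi = expl_default_phi if expl_default_phi is not None else assoc_phi
    covalued = assoc_case is None
    # Step 3: T probes the expletive and takes over its φ value.
    t_phi = expl_phi
    # Step 4: C values the expletive's Case as nominative.
    expl_case = "nom"
    if covalued:
        assoc_case = expl_case
    return {"T_phi": t_phi, "assoc_case": assoc_case, "expl_case": expl_case}


# Parameter settings as described in the text; "3pl" is a placeholder associate.
settings = {
    "German": dict(v_case=None, expl_default_phi=None),
    "French": dict(v_case="part", expl_default_phi="3sg"),
    "English 'be'": dict(v_case="part", expl_default_phi=None),
}
for language, params in settings.items():
    print(language, expletive_derivation("3pl", **params))
```

With a plural associate, T ends up plural in the German and English settings and singular in the French one, while the associate is nominative only in the German setting. On this encoding, English unaccusatives and passives pattern with the German row.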
Let’s now see if this analysis can be extended to unaccusatives and passives. It seems not. Consider sentences (46):
(46) a. *It is/has arrived three men.
b. *It was found a body.
c. It is a pity that a body was found in Pearl Street.
(46c) reminds the reader that English ‘it’ can be an expletive with the verb ‘be’, so the question is why it can’t be one in (46a, b), in the French manner. It seems that the solution must be found in the absence of partitive Case in English unaccusatives. If unaccusatives could assign Case in English (as Lasnik argues), (46a, b) should be grammatical: ‘three men’ would get Case from v(Ø) and the expletive would get Case from C. It seems that English unaccusatives differ from ‘be’ in that only the latter can assign partitive Case – for reasons that I do not know. The derivation of English expletive constructions with unaccusatives and passives must be more in the German way: the expletive and its associate have their Case features co-valued. The sentences in (47) provide evidence that this analysis is correct. The grammaticality judgments are delicate because pronouns do not like presentation focus, which is forced in expletive constructions. However, there seems to be a clear preference for the nominative pronoun:
(47) a. ?There arrived she, with a big, mysterious trunk.
b. *There arrived her, with a big mysterious trunk.
For completeness, I present an example of an ECM derivation with an expletive here, so the reader can compare it with the derivation of an ordinary ECM:
(48) John believes there to be good reasons for his behavior.
(49) [John v [2 believ- [1 to there be good reasons for his behavior]]].
The expletive will raise through the spec numbered 1 to spec 2, where it can receive accusative Case from the matrix v.17 Notice that this analysis predicts
agreement between the matrix v and the associate of the expletive. Given the poverty of English inflection, this prediction cannot be checked.
4.6 Transitive Expletive Constructions
Finally, let’s explore why there are TECs in Icelandic but not in English. An example is given in (50):18
(50) Thadh klarudhu margar mys ostinn alveg
there finished many mice the.cheese completely
‘Many mice completely finished the cheese.’ (Bobaljik and Jonas 1996: 217)
The immediate solution would be simply to say that Icelandic T has an EPP feature and the expletive is merged in its spec position. Subject is in Spec,v and object is in Spec,X and the adverb adjoined to a lower projection of X – essentially the structure of transitive predicates in English except for the presence of the expletive in Spec,T. However, matters do not seem to be so simple. Throughout the 1990s, evidence has accumulated that languages with TECs have a clausal structure more complex than those languages without TECs, at least in the realm of the Germanic languages. Bobaljik and Thráinsson (1998) have argued that TECs correlate with other phenomena – 2 subject and 2 object positions, multiple inflectional morphemes in one stem, obligatory V raising – which confirm the complex structure hypothesis. I assume therefore that the existence of TECs depends on the existence of a functional category that merges with TP. Following a long tradition, I call it AgrP.19 Agr and v both select for a DP in their specs, where an expletive can be merged, which makes me think that they probably are really the same category. Consider the following structure. (51) C [thadh Agr [1 T [SU v [ 2 Adv X OB ]]]
Object raises to Spec,X (and I abstract away from the option of object shift and its complications). The bare root adjoins to v and the latter raises to T. The subject raises to Spec,T. From here, there are two possibilities. The first possibility assumes that Agr assigns Case to a probed DP and the expletive gets it from C.20 The second possibility takes Agr not to be a Case assigner. In that case, subject would still raise to Spec,T, where its Case feature would be co-valued with that of the expletive. The expletive could get Case from C and subject would receive the same Case by co-valuation. I can’t see how the two possibilities can be teased apart at this point.
Interestingly, TECs provide another piece of evidence in favor of locality of agreement and against long distance agreement. Sigurðsson (1991), Frampton (1997) and Vikner (1995) have shown that in TECs subject raises to a fairly high position, to the right of the highest modal:
(52) a. adh thadh mundi einhver hafa bordhadh theta epli
        that there would someone have eaten this apple
        ‘that someone would have eaten this apple’
     b. *adh thadh mundi hafa einhver bordhadh theta epli
        that there would have someone eaten this apple
     (Vikner 1995: 191)
This can be accounted for within my assumptions, but not within Chomsky’s (or Frampton’s, who leaves this as an unresolved problem). But before I present my analysis, notice that matters become even more puzzling when we consider that in expletive constructions with unaccusatives and passives the word order is very different (McGinnis 1998: 152):
(53) Thadh hafa aldrei fari-st sjómann.
     there have never died sailors
     ‘Sailors have never died’
The question now is why the associate must raise in TECs but not in unaccusatives. Consider the following compressed representations for (52a) and (53):
(54) a. [AgrP thadh mundi [TP einhver strakur t [ . . . ]]]
     b. [CP C [TP thadh hafa [vP t farist [XP sjómann t ]]]]
Take (54a) first and assume that the highest modal is in Agr, where it has raised from T (Bobaljik and Jonas 1996). My assumptions force subject to raise to Spec,T, where it can value its Case feature, hence the high position of subject. But Chomsky would predict that (52b) should be grammatical: the expletive in Spec,Agr satisfies the EPP, and T would probe and find subject in its c-command domain. Since long distance agreement is possible, subject would have no need to raise. Take (54b) now. At this point, all we have to assume is that expletive constructions with unaccusatives may have the basic structure in (32), with the expletive merged in Spec,v. Thus, (i) object raises to Spec,X, (ii) the root adjoins to v and (iii) the expletive raises to Spec,T, as in the previous analyses of German, French and English. The resulting word order is as expected.
Co-valued α-features, part 2: Movement chains
There seems to be a trade-off between a theory in which functional categories trigger movement and a theory in which the Caseless DP does: the two differ in which aspects of the theory come for free and which must be stipulated. In Section 2.8, we saw how successive cyclic movement was an organic part of my approach while Chomsky (1998, 1999) needs to postulate a series of optional EPP features to carry a constituent from the edge of one phase to the edge of the next phase. On the other hand, it would seem that Relativized Minimality follows directly from Chomsky’s model but not from mine. If the probe triggers movement, the probe will just pied-pipe the first DP it encounters. However, Chomsky has to stipulate “freezing” effects: if the probe encounters a DP with the appropriate features but no Case, the probe can’t agree with it and can’t pied-pipe it either. Additionally, the frozen DP can’t be circumvented. This “freezing” does not follow from any principle that I can see. Consider Chomsky’s (1998) example:
(55) *There seem several people are friends of yours.
As Chomsky explains, if we did not assume “freezing”, this sentence should be grammatical within his framework. The matrix T should be able to probe ‘several people’ and value and delete its own φ-features. So, what went wrong? According to Chomsky, although the features of the probe and the goal are of the same type and therefore can match, they can’t agree, the reason being that the Case feature of the DP has already been valued and deleted. However, it is not clear why this should be so. As we saw above, the valued Case feature of a DP should probably delete, but its φ-features, which are interpretable, should not delete. Thus, it is not clear why they could not be accessible to a probe. I believe the theory would improve if we could dispense with this distinction between Match and Agree – whenever a probe finds a goal with the appropriate features, Agree should take place. It is not clear to me why a DP with Case can’t agree whereas a DP with unvalued Case can, since sound and healthy [φ] are still part of its feature structure. In this respect, the former Move+Greed framework seemed to fare a little better with (55), since all we had to say was that a DP with all its features satisfied has no motivation to move. In a similar manner, within my framework, the ungrammaticality of (55) comes about because ‘several people’ is enclosed in a maximal category and consequently is protected from probes. Consider the phrase structure of (55), represented in (56):
(56) [TP T [vP there [v’ v [XP seem [CP C [IP several people [I’ are ]]]]]]]
‘There’ and matrix T need to probe a c-commanded DP to value their [αφ], as we saw in Section 4.5. However, two maximal projections intervene, XP and CP. Since the DP does not have any α-features left, it has reached equilibrium and does not need to move further to Spec,X. The resulting derivation crashes because the [αφ] of the expletive and the matrix T are never valued. The “freezing” effects are also at play in Chomsky’s account of Superraising:
(57) a. *John seems that it is certain to fix the car.
     b. T seem that it is certain to John fix the car.
According to Chomsky, the matrix T probes into the subordinate clause until it finds the goal ‘it’, which has matching features. Although T and ‘it’ can’t agree, the probe is interrupted and can never reach ‘John’. Hence, from (57b) we can’t derive (57a). How would Superraising be accounted for within my framework without stipulating Relativized Minimality?
First, we have to look at the theory of chains and traces. Chomsky (1993) proposes the copy theory of movement – i.e., the hypothesis that what we had been calling ‘trace’ is actually a copy of the moved item. He argues that this is conceptually superior to our previous conception of a trace as an independent syntactic object because the copy theory allows us to maintain Inclusiveness. Let’s assume that traces are copies of their antecedents. A question has sometimes been asked that has not, to the best of my knowledge, received a careful answer, namely, what is the connection between the copies that form a chain. As pointed out by Roberts (1998), once the uninterpretable features of the head of a chain are deleted, we need to know what we are going to do with their copies in the foot of the chain. Presumably, they should delete too; the question is what mechanism we are going to use for this purpose. I propose that Feature Co-valuation is exactly what we need. Assume structure (58), in which XP is a constituent that has moved leaving a copy behind, which also has a copy of the unvalued features of the head of the chain:
(58) XP[α] . . . copy(XP[α])
The unvalued features of XP and its copy are co-valued and this is what makes XP and its copy form a chain. This co-valuation of features should be uncontroversial: if XP and its copy did not have their features co-valued, they would not be copies of each other. Therefore, the relation between the links of a chain is no different from the relation between expletive and associate (pursuing intuitions here that hark back to Chomsky 1986), and both are simply an agreement relation involving co-valued features. When the features of XP are valued by a probe, those of its copy are too and can delete, as was the case with the expletive-associate relation. This entails, of course, that Locality of Agreement also holds of the links of a chain, as much as of any other agreement relation. However, the formulation in (17) is limited to agreement relations that involve a probe and a goal, and here we have a different instance of feature co-valuation, one which does not involve a probe. We need a definition of Locality broader than (17), so that it also embraces agreement without a probe. I propose the following informal definition, which subsumes Locality of Agreement:
(59) Locality of Co-Valued Features (LCVF)
     Two features φ and γ of the same type can be co-valued if there is no Xmax such that Xmax dominates one of them but does not dominate the other.
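The condition in (59) can be made concrete with a small sketch. This is an illustration only, under an assumption that is not part of the proposal: each position is encoded as the set of maximal projections that count as dominating it, and deciding which projections count (for instance, whether a spec is treated as accessible from outside its own Xmax) is left to whoever builds the sets. The function name and the labels are hypothetical.

```python
# Illustrative sketch of (59); the set-based encoding is an assumption.

def can_be_covalued(dominators_a, dominators_b):
    """Return True if no maximal projection dominates one position
    but not the other (the condition stated in (59))."""
    # The symmetric difference collects every Xmax that dominates
    # exactly one of the two positions.
    return not (set(dominators_a) ^ set(dominators_b))

# (56): the expletive and 'several people', with the intervening
# maximal projections XP and CP counted only for the lower DP.
expletive = set()
associate = {"XP", "CP"}
print(can_be_covalued(expletive, associate))  # False

# Two positions dominated by the same maximal projections.
print(can_be_covalued({"TP"}, {"TP"}))  # True
```

On this encoding, co-valuation fails as soon as a single maximal projection separates the two positions, which is the situation the text describes for (56).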
Let’s now return to Superraising in light of LCVF. The links of a chain can’t skip over a spec position because an Xmax acts as a barrier and disrupts the connection between co-valued features. If a moving term skips a spec, a maximal category interposes between two links of a chain, disrupting the connection between the features. Thus, even if the head of the chain had its features valued and deleted, the foot of the chain would not, and this would lead to a crashed derivation at PF. Consider the structure in (60), under the crucial assumption that the head H only accepts one spec:
(60) XP . . .[HP YP H [. . . copy(XP) ]]
XP had to skip Spec,H, which is taken by YP. Now XP and its copy can’t Agree without violating LCVF because of the intervening Spec,H. As a result, the unvalued features of copy(XP) are never valued and deleted and the derivation can’t spell out at PF. (60) raises the issue of whether a head should allow for multiple specifiers. I agree with Chomsky (1995) that the theory of bare phrase structure does not include any such restriction. Additionally, it seems likely that C accepts more than one spec of the A’ variety, given the existence of multiple wh-movement in some languages. On the other hand, there does not seem to be any reason why particular languages could not restrict the number of specs of a given functional head to one – as a lexical property of that head which can be parametrized. Let’s assume that in some languages T allows for only one A-spec, so that superraising yields a violation; other languages would not impose any such restriction and could consequently tolerate superraising. This correlation between multiple specifiers and grammaticality of superraising is exactly the conclusion that Ura (1994, 1996) arrived at after investigating a wide range of languages. Notice that Chomsky’s “freezing” cannot accommodate Ura’s findings, because Chomsky’s approach, in which a head attracts/pied-pipes a DP unless there is an intervening DP, would predict that no language would ever allow for superraising. On the other hand, a theory that allows for movement to be triggered by the features of the item that moves, and not by the features of the attractor, can incorporate Ura’s findings if coupled with a parametrization of the multiple specs option. Therefore, Ura’s proposal can fit into the LCVF unproblematically: if a language allows for multiple specs, an XP can “leapfrog” (to use McGinnis’s graphic metaphor) another constituent without violating (59).
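The parametrization just described can be sketched as a toy predicate. This is a hedged illustration, not an implementation of Ura's or the present analysis: the encoding of the parameter as an integer limit (or `None` for no limit) and both function names are hypothetical.

```python
# Toy sketch of the multiple-specs parameter; the encoding is an assumption.

def can_leapfrog(max_a_specs, occupied_specs):
    """Return True if the head still has a free A-spec in which a
    moving XP can land instead of skipping the head's spec."""
    if max_a_specs is None:  # None: the language imposes no limit
        return True
    return occupied_specs < max_a_specs

def superraising_ok(max_a_specs):
    """Superraising as in (60): one spec of the intervening head is
    already taken by another DP (YP)."""
    return can_leapfrog(max_a_specs, occupied_specs=1)

print(superraising_ok(1))     # False: single-spec head, XP must skip the spec
print(superraising_ok(None))  # True: an extra spec is available
```

The sketch only restates the correlation: a single-spec setting forces the moving XP to skip the occupied spec, which (59) rules out, while an unrestricted setting leaves a landing site available.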
Thus, Relativized Minimality does not need to be stipulated in my framework since it derives from (59), an overall restriction on co-valued features which is independently needed. Additionally, it allows us to accommodate Ura’s correlation between multiple Spec,T and grammatical Superraising violations.
To conclude, in Sections 4 and 5 I have presented the notion that two constituents X and Y can have co-valued features, as a consequence of applying Agree to X and Y when these two constituents have unvalued features of the same type. It could be argued that the notion of co-valued features is an extra addendum to the theory, thus detracting from its elegance. However, the notion of co-valued features is simply a formalization of certain relations that the theory of grammar has to make explicit: the expletive-associate relation and the local connection between the links of a chain. Therefore, co-valued features should not be considered an extra, data-driven assumption but a proposal to unify formally aspects of grammar that are clearly related.
Conclusions and some remarks on imperfections
This paper has proposed that the Agree relation can only hold of elements standing in a very local relation, which applies to the links of a movement chain, the expletive-associate relation and the other agreement relations that a DP normally engages in with T, with v, with C, with a participle or with a predicative adjective. It has also been argued that Move is not Attract/Pied-Pipe but is triggered by the unvalued features of a term, which create an instability in the structure. Additionally, I have proposed the notion of “Co-Valued features” to make explicit the relation between an expletive and its associate and the relation between the links of a chain. The combination of locality and co-valuation gave us the possibility of analyzing the subtle but real cross-linguistic variation in the structure of expletive constructions. Although there is a considerable amount of literature on expletives, I am not aware of any work that tries to account for the cross-linguistic variation. Finally, I have shown that, within my framework, we can dispense with some assumptions necessary in Chomsky’s (1998, 1999) framework: the PIC, “freezing” effects and φ-completeness.
Although I have argued against the details of Chomsky’s analysis of agreement, I have not yet said anything about his real goal: that of articulating the hypothesis that the computational system of human language (CHL) is an optimal solution to interface conditions. As a matter of fact, I find this project worth pursuing and in the following I sketch a way one could go about it within my assumptions. Chomsky (1998, 1999) suggests that we look at the theory of movement and the role of interpretable features in an entirely new way. Until now, Case Theory was simply a theoretical primitive. It was a formal system built into the CHL that we needed to assume in order to account for some pesky data (i.e., raising of internal arguments to Spec,T in unaccusative and passive constructions). This kept the community of syntacticians more or less satisfied until the arrival of the Minimalist Program started to demand more challenging standards of explanation and propose bolder hypotheses about the constitution of CHL. The boldest and most challenging hypothesis, without doubt, is the suggestion that CHL contains no imperfections, that the system is designed as if a super-engineer had planned it. Within this point of view, Case features, and uninterpretable features more generally, are problematic: why should Tense, or a participle, show agreement with a DP? Why should DPs have Case? Chomsky (1998) proposes that uninterpretable features are there to provoke displacement, and the latter is meant to give rise to differences in “surface semantic interpretation”. Uninterpretable features are not an imperfection; they exist for a reason. Let’s very briefly summarize how he analyzes Scandinavian Object Shift. As is well known, Scandinavian languages have the option of moving an object to a position that we can identify as Spec,v. An object in Spec,v is interpreted as specific/presuppositional while a lower object is interpreted as nonspecific/existential. According to Chomsky, what pied-pipes an object to Spec,v is an optional EPP feature of v. This EPP feature is licensed because it has an effect on outcome, namely, the specific interpretation. If there is no EPP in v, the object simply stays in situ, where it can enter an Agree relation with v and receive a focus interpretation (and here I abstract away from the tight restrictions on OS summarized under the rubric Holmberg’s Generalization).
Let me suggest an alternative to Chomsky’s analysis within the confines of my framework, one which makes no use of EPP features to provoke displacement. As I suggested in Section 2, we can assume that a Case feature can be added to some functional heads as they are drawn from the lexicon. Thus, v can be [accusative] or [partitive], according to the version of v. Assume another functional head, call it F, that selects for vP. Assume further that F can also bear a Case feature. Then, we can have the situation in which the feature [accusative] is added to F and not to v. Following my assumptions concerning Move, object will have to raise to Spec,v to value its unvalued Case against F. The interpretation rule for this object works exactly as in Chomsky’s system and object in Spec,v becomes specific.
(61) a. v[acus] [XP OB X t(OB)] → focused OB
     b. F[acus] [vP OB SU v [XP t(OB) X t(OB)]] → specific OB
This, of course, does not amount to an analysis (but see López 2000, where the details of this approach are developed). Rather, I only want to show that within my framework it is possible to explore Chomsky’s hypothesis that the Case system is indeed built in for surface semantic interpretation.
Notes
1. Frampton and Gutmann (2000) hit on a similar idea independently, which they call “feature coalescence”. I find this coincidence deeply reassuring.
2. The Swahili example suggests that subject movement stops at every available step before reaching its final destination. We find evidence in many languages, including English, for successive cyclic A-movement of subject as it goes to Spec,T, as Sportiche (1988) pointed out. Consider the sentences in (i):
(i) a. The musicians would all have played a sonata.
    b. The musicians would have all played a sonata.
The fact that the floating quantifier can appear in several places between the final and the initial positions of the subject would suggest successive cyclic movement. However, Chomsky’s assumptions do not predict this; rather, they would predict just one movement from Spec,v to Spec,T triggered by the EPP feature of T. The only way he could open the door for this sort of successive movement is by stipulating that each modal/aux has a separate EPP feature and only T is φ-complete, so it can assign Case. Baltin (1995) and Bobaljik (1995) cast some doubt on the assumption that floating quantifiers indicate the path taken by the subject as it raises, proposing instead that the original idea that floating quantifiers are adverbs was right. However, floating quantifiers agree with the displaced constituents in languages with rich agreement morphology, a most unusual behavior for an adverb. Alternatively, it could be said that they are secondary predicates, which may also display overt agreement, but the distribution of floating quantifiers does not match that of secondary predicates:
(i) The musicians have all played a sonata.
(ii) *The musicians have naked played a sonata.
3. In Section 6 I adopt a slightly different version of (7b).
4. In Chomsky (1995) and Kratzer (1996), no v is assumed in unaccusatives and passives. However, under Marantz’s assumptions, there has to be one, or an unaccusative root would never be read as a verb.
5. Nominative or null Case are always present, so it seems that the head that bears it does so already in the lexicon. The Case feature on v seems to be added: some verbs may select a
complement DP (which needs Case) or a complement CP/PP (which does not), a situation that can only occur if [accusative] is added freely.
6. In a more articulated CP structure, like the one in Rizzi (1997), the Case assigning head could be Finite0.
7. Chomsky (1999) analyses (12) as an instance of long distance agreement in which the participle, which is not φ-complete, can be by-passed by the upstairs probe. A criticism of φ-completeness is in Section 1.
8. The idea that a spec position could be governed from outside was debated frequently during the GB era, with general agreement that specs are accessible to outside governors, although we might not find some of the empirical reasons as compelling nowadays. In Kayne’s (1994) system, specs are adjunctions, so ZP in (18) would also not be dominated by XP. In Chomsky’s (1995) Bare Phrase Structure, adjunction to Xmax is banned because the resulting structure of segments, it is said, would be uninterpretable. However, Chomsky’s recent notion of edge of a phase that can be probed from outside again opens the door for a porous Xmax.
9. Several interesting proposals are not going to be discussed explicitly in this section, among them Collins’ (1997) Last Resort and Lasnik’s (1995) Enlightened Self-Interest.
10. “Attract Feature” was very hard to define and raised difficult problems. For instance, as Frampton and Gutmann (1999) point out, it is unclear what kind of chain results from feature movement. Moreover, if a head detects a feature that matches, thus establishing a relation between attractor and attractee, it is unclear why attraction is further required.
11. If movement is seen as expulsion from a configuration and remerge in another, there is no room for a DP to get trapped in a loop, as it merges endlessly with the same XP, each time creating a new spec. Each of these remerges would be to the same configuration. Once an item has been “expelled” from Spec,X, it can’t go back to it.
12. As for German, some speakers that I consulted accept (i), but others do not:
(i) Es haben drei Männer ein Haus gekauft.
    there have three men a house bought
    ‘Three men have bought a house’
Given this uncertainty, I will leave German aside when discussing TECs.
13. In the analysis of German, English and French expletives I adopt the non-trivial assumption that in these languages v does not accept more than one A-spec. This prevents object from raising higher than Spec,X. This is an issue that we are going to encounter again later on.
14. Lasnik’s (1995) conclusion is different: he claims that adjacency is evidence for Case assignment by v (or equivalent). In my terms, adjacency simply indicates that object moved to Spec,X (remaining neutral as to whether it gets Case from v or not).
15. Examples like the ones in (i) could cast some doubt on the assertion that transitive verbs do not accept expletives:
(i) a. There hit the stands a new journal.
    b. There entered the room a man from England.
However, notice that the verbs in these sentences are conceptually unaccusative (‘hit’ means ‘arrive’ in (i-a)) except that they can assign Case. Phrase-structurally, the correct analysis for this type of sentence would probably have both DPs as arguments of the root, leaving Spec,v free for the expletive. Case assignment is a challenge for the v-as-Case-assigner theories generally.
16. Examples (41b) and (41c) are taken from Alice Munro’s short story ‘Meneseteung’, in the volume ‘Friend of my Youth’, published in 1990. Other examples tend to sound awkward, although it is hard to tell if the reason is the syntax or the information structure. Sentence (i) is considered ungrammatical by Chomsky (1999: 15), but (ii) sounds much better with only a small change in word order:
(i) *There came several angry men into the room.
(ii) ?Into the room there came several angry men.
In (ii), the DP is focused, whereas in (i) it is not. It seems correct to assume that expletive constructions fulfill the information structural function of focusing the DP. The contrast between (i) and (ii) arises because the DP can be more naturally focused at the end of the sentence. This seems to support Chomsky’s claim that the ungrammaticality of (i) shows that English object must be dislocated in unaccusative/passive constructions (see also examples in the previous footnote).
17. Or it may adjoin to v, as proposed by Boskovic (1997).
18. Due to the inadequacy of my word processor I represent the Icelandic interdental fricatives as -th- and -dh-.
19. Chomsky (1995) eliminates AgrPs because they have no effect on either interface. However, AgrPs may play a role in the Chomsky (1999) model. Assume AgrPs are there to assign Case features. Further, assume that checking Case features is connected with surface interpretability (following Chomsky’s (1998, 1999) hypothesis that “imperfections” are not real but part of an optimal design, see Section 7 of this paper). Then AgrPs do have an effect on outcome, so they are licensed to live.
20. This analysis entails that there are two sources for nominative Case in Icelandic. Uriagereka (1995: 167) reaches the same conclusion for constructions usually referred to as nominativus pendens in Romance languages. The following is his example from Galician:
(i) Meu fillo n’o mata a fame.
    my son not-him kills the hunger
    ‘My son, hunger does not kill him’
In (i), both meu fillo and a fame are in nominative Case. In our terms, meu fillo would get it from C and a fame from Agr.
References Arad, M. (1998). VP-Structure and the Syntax-Lexicon Interface. Doctoral dissertation, University College London. Arad, M. (This volume). On ‘Little v’. Baltin, M. (1995). Floating Quantifiers, PRO and Predication. Linguistic Inquiry, 26, 199– 248. Bejar, S. & D. Massam (1999). Multiple Case Checking. Syntax, 2, 65–79. Belletti, A. (1988). The Case of Unaccusatives. Linguistic Inquiry, 19, 1–34. Bobaljik, J. (1995). Morphosyntax: The syntax of verbal inflection. Doctoral dissertation, MIT. Bobaljik, J. & D. Jonas (1996). Subject Positions and the Roles of TP. Linguistic Inquiry, 27, 195–236. Bobaljik, J. & H. Thráinsson (1998). Two Heads Aren’t Always Better than One. Syntax, 1, 37–71. Boskovic, Z. (1997). The Syntax of Non-finite Complementation. Cambridge, MA: The MIT Press. Branchadell, A. (1992). A Study of Lexical and Non-lexical Datives. Doctoral dissertation, Universitat Autònoma de Barcelona, Spain. Carstens, V. (2000). Concord in Minimalist Theory. Linguistic Inquiry, 31, 319–356. Carstens, V. & K. Kinyalolo (1989). On IP-structure: Tense, aspect and agreement. Unpublished manuscript, Cornell University and U.C.L.A. Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht: Foris. Chomsky, N. (1986). Barriers. Cambridge, MA: The MIT Press. Chomsky, N. (1993). A Minimalist Program for Linguistic Theory. In K. Hale and J. Keyser (Eds.), The View from Building 20 (pp. 1–45). Cambridge, MA: The MIT Press. Chomsky, N. (1995). Categories and Transformations. In The Minimalist Program, 219–394. Cambridge, MA: The MIT Press. Chomsky, N. (1998). The Minimalist Program: The framework [MIT Occasional Papers in Linguistics 15]. Department of Linguistics and Philosophy, MIT, Cambridge, MA. Chomsky, N. (1999). Derivation by Phase [MIT Occasional Papers in Linguistics 18]. Department of Linguistics and Philosophy, MIT, Cambridge, MA. Collins, C. (1997). Local Economy. Cambridge, MA: The MIT Press. Collins, C. & H. Thráinsson (1996). 
VP-Internal Structure and Object Shift in Icelandic. Linguistic Inquiry, 19, 131–174. Demonte, V. (1995). Dative Alternation in Spanish. Probus, 7, 5–30. Diesing, M. (1992). Indefinites. Cambridge, MA: The MIT Press. Emonds, J. (1976). A Transformational Approach to English Syntax. New York: Academic Press. Frampton, J. (1997). Expletive Insertion. In C. Wilder, H.-M. Gärtner and M. Bierwisch (Eds.), The Role of Economy in Linguistic Theory (pp. 36–57). Berlin: Akademie Verlag. Frampton, J. & S. Gutmann (1999). Cyclic Computation, a Computationally Efficient Minimalist Syntax. Syntax, 2, 1–27. Frampton, J. & S. Gutmann (2000). Agreement is Feature Sharing. Unpublished manuscript, Northeastern University.
Groat, E. (1999). Raising the Case of Expletives. In S. Epstein and N. Hornstein (Eds.), Working Minimalism (pp. 27–43). Cambridge, MA: The MIT Press. Grohmann, K. (To appear). A Movement Approach to Contrastive Left Dislocation. Rivista di Gramatica Generativa. Haegeman, L. (1992). Theory and Description in Generative Grammar: A Case Study in West Flemish. Cambridge: CUP. Halle, M. & A. Marantz (1993). Distributed Morphology and the Pieces of Inflection. In K. Hale and J. Keyser (Eds.), The View from Building 20 (pp. 111–176). Cambridge, MA: The MIT Press. Holmberg, A. (1999). Remarks on Holmberg’s Generalization. Studia Linguistica, 53, 1–39. Johnson, D. & S. Lappin (1999). Local Constraints vs. Economy. Stanford, CA: CSLI publications. Johnson, K. (1991). Object Positions. Natural Language and Linguistic Theory, 9, 577–636. Kayne, R. (1989). Facets of Romance Past Participle Agreement. In P. Benincà (Ed.), Dialect Variation and the Theory of Grammar (pp. 85–103). Dordrecht: Foris. Kayne, R. (1994). The Antisymetry of Syntax. Cambridge, MA: The MIT Press. Koopman, H. & D. Sportiche (1991). The Position of Subjects. Lingua, 85, 211–258. Kratzer, A. (1996). Severing the External Argument from its Verb. In J. Rooryck and L. Zaring (Eds.), Phrase Structure and the Lexicon (pp. 109–138). Dordrecht: Kluwer. Larson, R. (1988). On the Double Object Construction. Linguistic Inquiry, 19, 335–391. Lasnik, H. (1992). Case and Expletives: Notes toward a parametric account. Linguistic Inquiry, 23, 381–405. Lasnik, H. (1995). Case and Expletives Revisited: On greed and other human failings. Linguistic Inquiry, 26, 615–634. Longobardi, G. (1994). Reference and Proper Names: A theory of N-movement in syntax and logical form. Linguistic Inquiry, 25, 609–665. López, L. (2000). The Edges of Catalan. Ms., University of Illinois-Chicago. López, L. (2001). On the (Non)complementarity of θ-theory and Checking Theory. Linguistic Inquiry, 32(4). Marantz, A. (1997). 
No Escape from Syntax: Don’t try morphological analysis in the privacy of your own lexicon. In A. Dimitriadis (Eds.), Proceedings of the 21st Annual Penn Linguistics Colloquium [University of Pennsylvania Working Papers in Linguistics 4.2] (pp. 201–225). Masullo, P. (1992). Incorporation and Case Theory in Spanish. A Crosslinguistic Perspective. Unpublished dissertation, University of Washington. May, R. (1985). Logical Form. Cambridge, MA: The MIT Press. McGinnis, M. J. (1998). Locality of A-Movement. Doctoral dissertation, MIT. Platzack, C. (1986). COMP, INFL and Germanic Word Order. In L. Hellan and K. Andersen (Eds.), Topics in Scandinavian Syntax (pp. 185–234). Reidel: Dordrecht. Pollock, J.-I. (1989). Verb Movement, UG and the Structure of IP. Linguistic Inquiry, 20, 365–424. Postal, P. (1974). On Raising. Cambridge, MA: The MIT Press. Rizzi, L. (1990). Relativized Minimality. Cambridge, MA: The MIT Press. Rizzi, L. (1997). The Fine Structure of the Left-periphery. In L. Haegeman (Ed.), Elements of Grammar (pp. 281–337). Dordrecht: Kluwer.
Roberts, I. (1998). Have/Be Raising, Move F and Procrastinate. Linguistic Inquiry, 29, 113–126. Romero, J. (This volume). Morphological Constraints on Syntactic Derivations. Ross, J.-R. (1983). Inner Islands. Unpublished manuscript, MIT. Selkirk, E. (1994). Sentence Prosody, Intonation, Stress and Phrasing. In J. Goldsmith (Ed.), The Handbook of Phonological Theory (pp. 550–569). New York: Blackwell. Sigurðsson, H. (1991). Icelandic Case Marked Pro and the Licensing of Lexical Arguments. Natural Language and Linguistic Theory, 9, 327–363. Sportiche, D. (1988). A Theory of Floating Quantifiers and its Corollaries for Constituent Structure. Linguistic Inquiry, 19, 425–449. Stowell, T. (1981). Origins of Phrase Structure. Doctoral dissertation, MIT. Suñer, M. (1988). The Role of Agreement in Clitic-Doubled Constructions. Natural Language and Linguistic Theory, 6, 391–434. Ura, H. (1994). Varieties of Raising and the Feature-Based Bare Phrase Structure Theory [Occasional Papers in Linguistics 7]. Cambridge, MA: The MIT Press. Ura, H. (1996). Multiple Feature-Checking: A Theory of Grammatical Function Splitting. Doctoral dissertation, MIT. Uriagereka, J. (1995). An F position in Western Romance. In K. Kiss (Ed.), Discourse Configurational Languages (pp. 153–175). New York: OUP. Vikner, S. (1995). Verb Movement and Expletive Subjects in the Germanic Languages. New York: OUP. Yang, C. (1999). Unordered Merge and Its Linearization. Syntax, 2, 38–64. Zagona, K. (1982). Government and Proper Government of Verbal Projections. Doctoral dissertation, University of Washington. Zwart, J.-W. (1997). The Morphosyntax of Verb Movement: A minimalist approach to the syntax of Dutch. Dordrecht: Kluwer.
A minimalist account of conflation processes
Parametric variation at the lexicon-syntax interface*
Jaume Mateu and Gemma Rigau
1. Introduction
The main purpose of this paper is to show that the ‘conflation processes’ involved in so-called ‘lexicalization patterns’ (see Talmy 1985) can receive an adequate explanation when translated into syntactic terms. An analysis of these conflation processes in purely semantic terms like that put forward by Talmy (1985) can be said to be descriptively adequate, but the ‘parametric variation’ to be found in such processes will be seen to crucially involve morphosyntax, not only semantics (see Snyder 1995). First of all, it will be necessary to review some of the main insights of Talmy’s work. As is well known, this cognitive linguist claims that languages can be classified according to how semantic components like Figure, Motion, Path, Manner, or Cause are conflated into the verb. For example, conflation of motion with path is argued to be typical of Romance languages like Spanish (see (1)), whereas conflation of motion with manner is typical of English (see (2)). The examples in (1) and (2) are all drawn from Talmy (1985: 69f).
(1) a. La botella entró a la cueva flotando. (Spanish)
       the bottle went+into to the cave floating
    b. La botella salió de la cueva flotando.
       the bottle went+out of the cave floating
    c. El globo subió por la chimenea flotando.
       the balloon went+up through the chimney floating
    d. El globo bajó por la chimenea flotando.
       the balloon went+down through the chimney floating
    e. La botella se alejó de la orilla flotando.
       the bottle went+away from the bank floating
(2) a. The bottle floated into the cave.
    b. The bottle floated out of the cave.
    c. The balloon floated up the chimney.
    d. The balloon floated down the chimney.
    e. The bottle floated away from the bank.
In fact, Spanish and English can be regarded as two poles of a typological dichotomy that Talmy (1991) characterized as ‘verb-framed languages’ versus ‘satellite-framed languages’. Given this distinction, there are languages encoding the path element into the verb: for example, consider the Spanish path verbs entrar ‘go in(to)’, salir ‘go out’, subir ‘go up’, etc. By contrast, other languages do not incorporate the path into the verb but leave it as a satellite around the verb. According to Talmy, the latter option is typically found in the majority of Indo-European languages (Romance being excluded). When the path remains as a satellite, one option becomes available: the manner component (for example, floating in the examples in (2)) can be encoded into the verb. The well-known ‘elasticity’ of the verb meaning in English (cf. Rappaport Hovav & Levin 1998) can be exemplified with data involving not only conflation of motion with manner (see (2)), but also conflation of causation with manner (see the examples in (3), drawn from Levin & Rapoport 1988: 279). The fact that the directionality or path component remains as a satellite in English allows the manner component (e.g., brushing) to be conflated into the causative verb in (3). As expected, the lexicalization pattern corresponding to the Romance languages (i.e., the path incorporates into the verb, saturating it lexically) prevents them from having the kind of verbal elasticity in (3), the manner component being then forced to be expressed as an adjunct if necessary: e.g., cf. Sp. ella quitó las hilas con un cepillo/cepillando (lit.: ‘she took+out the lint with a brush/brushing’).
(3) a. She brushed the lint off.
    b. She brushed the tangles out.
    c. She brushed the lint off the coat.
    d. She brushed the crumbs into the bowl.
    e. She brushed melted butter over the loaves.
    f. She brushed the coat clean.
    g. She brushed her way to healthy hair.
    h. She brushed a hole in her coat.
Notice that it is precisely the conflation of the motion or causation verb with manner that accounts for those cases where the construction rather than the verb has been argued to determine the argument structure (see Jackendoff 1990, 1997 or Goldberg 1995). As shown in Jackendoff (1990, 1997), constructions like those in (4) through (6) have syntactic and semantic restrictions of their own and, in this sense, it is indisputable that each of them deserves the status of ‘constructional idiom’. Moreover, Jackendoff (1997: 554f) noted that these constructions can be considered instances of a more general abstract construction, the ‘verb subordination archi-construction’ in (7).
(4) ‘One’s way construction’: e.g., He moaned his way out of the room.
    a. [VP V [bound pronoun]’s way PP]
    b. ‘go PP (by) V-ing’
(5) ‘Resultative construction’: e.g., He wiped the table clean.
    a. [VP V NP {AP/PP}]
    b. ‘cause NP to become AP/go PP by V-ing (it)’
(6) ‘Time-away construction’: e.g., She danced the night away.
    a. [VP V NP away]
    b. ‘waste [Time NP] V-ing’
(7) ‘Verb Subordination Archi-construction’
    a. [VP V . . . ]
    b. ‘act (by) V-ing’
Although we have no problem in attributing the status of ‘constructional idioms’ to the constructions in (4)–(6), in the sense that each of them has its own set of syntactic and semantic peculiarities, we want to show that Jackendoff’s (1997) ‘Verb Subordination Archi-construction’ in (7), as it stands, can be regarded as an epiphenomenon once a principled account of the parametric variation in the lexicon-syntax interface is provided.1 Quite importantly, we claim that the relevant explanation of the parametric issue concerning the existence of (3)–(6) in English, but not in Romance, cannot be formulated in purely semantic or aspectual terms, since it can be argued to have nothing to do with the positive or negative application of some ad hoc operations over the ‘Lexical Conceptual Structure’ (LCS) (Levin & Rapoport 1988), the ‘Aspectual Structure’ (Tenny 1994), or the ‘Event Structure’ (Pustejovsky 1991), but with one empirical fact: i.e., the syntactic properties associated with the lexical element encoding directionality are not the same in English as in Romance (cf. Snyder 1995 and Klipple 1997 for two proposals in tune with our syntactic account). ‘Semanticocentric’ analyses run into problems when language variation is taken into account, since no principled explanation can be given as to why some languages (e.g., Romance) appear to lack the relevant LCS operation, the aspectual operation or the event-type shift strategy involved in the conflation processes in (2) and (3). Accordingly, we will take pains to show that the solution to such a problem cannot be stated in purely semantic or aspectual terms.
2. On the distribution of semantic properties. A minimalist conception
Before presenting our syntactic analysis of conflation processes, it will be useful to provide a general picture concerning where semantic properties are to be distributed in the minimalist program we are assuming. Inspired by Chomsky (1995, 1998), we propose that the semantic information to be located in the model depicted in (8) can be distributed in three different places. Firstly, there are certain semantic properties that can be argued to be optimally coded into lexical entries. Secondly, there are other semantic properties that can be seen as output conditions on LF. In particular, we will be dealing with an important set of them, those that form the Projection Principle conditions (Chomsky 1998: 27). Finally, there are semantic properties belonging to systems of thought, which are to be located beyond the linguistic interface level.
(8) lexicon → computational system → spell-out
    spell-out → PF → sensoriomotor systems
    spell-out → LF → systems of thought
    (LF legibility conditions: Projection Principle conditions, Binding conditions, etc.)
Let us begin with the semantic properties that must be optimally coded into the lexical entry. We will assume that the semantic information to be located in the lexicon is the optimal information required by the ‘computational system’. It is widely acknowledged that lexical entries include semantic features entailing their corresponding categorial features.2 These lexically encoded semantic features will have to be interpreted at the interface level LF. Two classes of semantic features can be distinguished: non-relational features vs. relational features. The former entail the syntactic category Noun (N), whereas the latter entail the categories considered as syntactic predicates: Verb (V) and Particle (P). For our present purposes, P is to be regarded as a cover birelational term for Adposition, Adjective, and Adverb. So-called Adpositions are pure Particles, whereas Adjectives and Adverbs can be seen as complex Particles that incorporate a non-relational element.3 This proposal nicely captures the argument structure similarities of sentences like those in (9). All of them turn out to share the same syntactic structure, that in (10) (where functional categories have been omitted).
(9) a. The cat is in the room.
    b. The cat is happy.
    c. The cat is here.
In is a simple Particle that selects a non-relational element as its complement (room), while both happy and here are complex Particles incorporating their non-relational complement.
(10) [V V [P N [P P N]]]
Relational features can be argued to be hierarchically organized. For our present purposes, we will be assuming a coarse-grained organization of relational features like the following one: Relational features can express an eventive relation or a non-eventive relation. The eventive relation entails the syntactic category V, while the non-eventive relation entails the syntactic category P. An eventive relation can be instantiated as a causal relation or a transitional relation. In turn, the causal relation can be dynamic (e.g., dance, kill, shelve) or static (e.g., hear, know, love), and the transitional relation can also be dynamic (e.g., disappear, go, come) or static (e.g., be, belong, remain). Transitive verbs (unergatives included) are entailed by the causal relation feature, while unaccusative verbs are entailed by the transitional relation feature.
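The coarse-grained hierarchy just described can be written down as a small lookup, purely for illustration. This is a minimal sketch, not part of the formalism itself: the names `Relation`, `category_of` and `verb_class_of` are hypothetical, and the sample entries are limited to verbs cited in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relation:
    """A relational semantic feature (hypothetical encoding for illustration)."""
    eventive: bool                  # eventive vs. non-eventive relation
    kind: Optional[str] = None      # "causal" or "transitional" (eventive only)
    dynamic: Optional[bool] = None  # dynamic vs. static (eventive only)


def category_of(rel: Relation) -> str:
    """Eventive relations entail V; non-eventive relations entail P."""
    return "V" if rel.eventive else "P"


def verb_class_of(rel: Relation) -> Optional[str]:
    """Causal relations entail transitives (unergatives included);
    transitional relations entail unaccusatives."""
    if not rel.eventive:
        return None
    if rel.kind == "causal":
        return "transitive (unergatives included)"
    return "unaccusative"


# Example verbs taken from the text.
EXAMPLES = {
    "dance": Relation(True, "causal", True),
    "know": Relation(True, "causal", False),
    "go": Relation(True, "transitional", True),
    "remain": Relation(True, "transitional", False),
}

for verb, rel in EXAMPLES.items():
    print(verb, category_of(rel), verb_class_of(rel))
```

The encoding keeps the two choices (causal vs. transitional, dynamic vs. static) as independent attributes, mirroring the cross-classification in the paragraph above.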
On the other hand, a non-eventive relation can be prototypically regarded as a spatial relation. This can express a central coincidence relation (e.g., with, on, at, ...) or a terminal coincidence relation (e.g., to, out of, up to, ...).4 The relevant properties to be encoded into a lexical entry can be exemplified with those of the lexical entry corresponding to the unergative verb dance in (11).5
(11) dance
     a. phonological properties
     b. V (
     c.
What is meant by (11b) is that the categorial property V is entailed by the semantic feature, i.e., the causal relational feature. The fact that dance has tense and phi-features will not be indicated in the lexical entry, since that much is determined by its category V (presumably by UG), as noted by Chomsky (1995: 238). Finally, what is meant by (11c) is that a N must be incorporated into the verb dance (see Hale & Keyser 1993, 1998). This information is clearly idiosyncratic, and hence it must be encoded into the lexical entry. As we will see below, the information optimally encoded into lexical entries will be argued to be crucially relevant when dealing with the crosslinguistic variation involved in Talmy’s conflation processes. We can now concentrate on the semantic properties that must be located in the output conditions on LF, the interface linguistic level related to systems of thought. It is clear that LF has to meet certain ‘legibility conditions’ in order for systems of thought to access this interface level (Chomsky 1998: 7). According to Chomsky (1998: 27f), bare output conditions on LF include Binding conditions, the Case Theory, the Chain condition, the Projection Principle, etc. The legibility conditions we are interested in at present are those concerning the Projection Principle. We will assume that the Projection Principle conditions govern the relation among those three basic syntactic objects depicted in (12) (where the X in (12a) is to be regarded as a variable: it is N in unergative structures, and P in transitive and unaccusative structures). We will also assume that they govern the syntax-semantics associations depicted in (13). As a result, notice that there appears to be a strong homomorphism between the syntax and semantics of argument structure at LF.6
(12) a. [V V X]
     b. [P N [P P N]]
     c. N
(13) a. V is to be associated to an eventive relation: if there is an external argument, it is interpreted as a causal relation; otherwise it is interpreted as a transitional relation.
     b. P is to be associated to a non-eventive relation.
     c. N is to be associated to a non-relational element.
According to (13a), the eventive relation associated to V can be instantiated as two different semantic relations: if there is an external argument in the specifier position of the relevant functional category (e.g., ν in Chomsky 1995 or Voice in Kratzer 1996), the eventive relation will be instantiated as a causal relation, the external argument being interpreted as Originator (see Borer 1994, and Mateu 1999). If there is no external argument, the eventive relation will be instantiated as a transitional relation.7 Concerning the non-eventive/spatial relation in (13b), its specifier and complement are prototypically interpreted as Figure and Ground respectively (these terms being adapted from Talmy 1985).8 The output conditions on the interface linguistic level accessed by systems of thought can be regarded as instantiations of a general condition, the ‘Full Interpretation Principle’ (FIP). Beyond these conditions, we assume that there is a third set of general semantic instructions that will contribute to a representation of meaning more complete than that offered by grammar. This set of semantic instructions found between grammar and systems of thought can be argued to facilitate the access of the latter to the grammatical interface. What happens between LF and systems of thought is beyond our present concerns, but it is clear that there must be non-linguistic or encyclopedic information that ‘enriches’ the representation of meaning provided by grammar: see Chomsky (1975: 105f), Williams (1977), or Marantz (1997).
3. Conflation processes in Path constructions
In this section, we provide a syntactic account of conflation processes like those involved in (1), (2), and (3). Firstly, we will deal with the lexicalization pattern corresponding to English (i.e., conflation of manner into the {motion/causation} verb). Secondly, we will show why this lexicalization pattern does not hold for Romance languages, where the relevant lexicalization pattern involves conflation of path into the verb. Consider the examples in (14):
(14) a. Sue danced.
     b. Sue danced across the room.
     c. John danced Sue across the room.
It has often been noted in the literature that unergative verbs in English can be unaccusativized when a directional PP is added (see Hoekstra 1984; Levin & Rappaport Hovav 1995; and Ritter & Rosen 1998, among others). One interesting question to be solved is why the so-called unaccusativization process involved in (14b) does not take place in some languages, e.g., in Spanish. As we will see below, our proposal is that the solution is to be found in the different syntactic properties associated to the lexical element encoding directionality in English vs. Spanish. In order to get the syntactic derivation involved in (14b), it is required that the lexical subarray contain the substantive categories in (15), where their corresponding lexical entries can be argued to include those three kinds of information which we have exemplified with the unergative verb dance in (11). We put functional categories aside here.9
(15) dance: phonological properties
     go: no phonological properties
     across: phonological properties
     Sue: phonological properties
     room: phonological properties
     V (
We assume that the lexicon of satellite-framed (i.e., ‘non-verb-framed’) languages like English has a phonologically null verb expressing dynamic transition, besides its phonetically realized correspondent. We represent this empty unaccusative verb in boldface: go. By virtue of expressing a dynamic transition, this unaccusative verb subcategorizes for a PP denoting a directional spatial relation, which in turn relates two non-relational elements: Sue (i.e., the Figure), and room (i.e., the Ground). The syntactic object in (16) is the result of merging go with the birelational P headed by across:
(16) [V [V go] [P [N Sue] [P [P across] [N room]]]]
(16) would be interpretable at the interface with systems of thought, whereas some syntactic object like (17) would not. Indeed, the Projection Principle requires that the null verb go select a directional spatial relation but not a non-relational element as its complement, this being due to its transitional feature.
(17) [V [V go] [N room]]
However, as it stands, the syntactic computation of (16) would not be convergent at PF, because the verb go, being devoid of phonological properties, would not be interpretable or legible at the interface level with sensoriomotor systems. In order to avoid its crashing at PF, it is required that the empty verb be conflated with another element with phonological properties. The unergative verb dance represented in the numeration in (15) turns out then to be adjoined to the phonologically null unaccusative verb by means of Merge. As a result, the conflation of dance with go will be spelled out as dance. Its corresponding syntactic representation is given in (18) (functional categories omitted).
(18) [V [V [V dance] [V go]] [P [N Sue] [P [P across] [N room]]]]
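The derivation from (16) to (18) can be mimicked with a toy implementation of Merge. This is a minimal sketch under simplifying assumptions: the class `Node` and the functions `merge`, `conflate`, `heads`, `pf_legible` and `spell_out` are hypothetical names, functional categories are ignored, and PF legibility is reduced to the requirement that a verbal complex contain at least one head with phonological content.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    label: str                        # "V", "P" or "N"
    phon: Optional[str] = None        # phonological content of a head; None if null
    children: Tuple["Node", ...] = ()


def merge(head: Node, other: Node, head_first: bool = True) -> Node:
    """Binary Merge: the resulting object projects the label of `head`."""
    kids = (head, other) if head_first else (other, head)
    return Node(head.label, None, kids)


def conflate(lexical_verb: Node, null_verb: Node) -> Node:
    """Adjoin a phonologically full verb to a null verb; the null verb projects."""
    assert null_verb.phon is None and lexical_verb.phon is not None
    return merge(null_verb, lexical_verb, head_first=False)


def heads(node: Node) -> list:
    """Return the terminal heads of a syntactic object, left to right."""
    if not node.children:
        return [node]
    return [h for child in node.children for h in heads(child)]


def pf_legible(verbal_complex: Node) -> bool:
    """Simplified PF condition: some head must carry phonological content."""
    return any(h.phon for h in heads(verbal_complex))


def spell_out(node: Node) -> str:
    return " ".join(h.phon for h in heads(node) if h.phon)


# Lexical subarray as in (15): `go` is the phonologically null verb.
go = Node("V")
dance = Node("V", "dance")
sue, room = Node("N", "Sue"), Node("N", "room")
across = Node("P", "across")

pp = merge(merge(across, room), sue, head_first=False)  # [P Sue [P across room]]
v16 = merge(go, pp)                                     # structure (16)
v18 = merge(conflate(dance, go), pp)                    # structure (18)

print(pf_legible(go))                   # the bare null verb
print(pf_legible(conflate(dance, go)))  # the conflated verbal complex
print(spell_out(conflate(dance, go)))
```

Replacing `go` with a null causative head in the same code reproduces the shape of the causative derivation, since nothing in `conflate` depends on which null verb is involved.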
Given (18), our claim is that the generalized transformation used by Hale & Keyser (1997) in their account of sentences like (14b) is nothing but an instantiation of Merge. By using this operation, we provide the empty unaccusative verb with the phonological features needed for it to be legible at the interface level with sensoriomotor systems. This is in accordance with Chomsky’s claim that syntactic operations can be argued to be used in order to satisfy external conditions. Let us now analyze (14c) John danced Sue across the room. As noted by Ritter and Rosen (1998: 140–141, 157–158), a sentence like (14c) does not alternate with (14a) or (14b), but with John danced. That is to say, according to them, (14c) is not an example of the well-known causative-inchoative alternation. In their event-based approach, a delimiting predicate (e.g., across the room) is posited to be added to the activity verb dance in sentences like (14c), the former predicate licensing then a delimiting argument (e.g., Sue). Accordingly, Ritter and Rosen point out that the fact that John is the subject of the verb dance and Sue an argument of the secondary predicate across the room explains why the object need not be engaged in the action denoted by the verb: for example, they note that John must be dancing in (14c) but Sue could be a doll John is holding as he dances. Although we assume Ritter & Rosen’s empirical considerations, we have qualms about their event-based analysis, according to which a ‘delimiter phrase’ is said to be added to the activity verb, this secondary predicate licensing then a ‘delimiting object’. Rather, our syntactic analysis of the conflation process involved in (14c) takes the configuration in (19) as the main or basic one, i.e., that formed by merging a PP headed by across with the lexical item in (20), which represents a phonologically null causative verb. As above, we express the non-verb-framed nature of English path constructions by positing a null verb like that in (20).
(19) [V [V cause] [P [N Sue] [P [P across] [N room]]]]
(20) cause
     no phonological properties
     V (
In order to saturate the empty phonological properties of the causative verb, the conflation depicted in (21) is then required, the subordinate unergative verb dance providing the main causative verb with a phonological basis for it to be interpreted or legible at PF.
(21) [V [V [V dance] [V cause]] [P [N Sue] [P [P across] [N room]]]]
On the other hand, note that the same syntactic conflation process represented in (21) appears to be involved in examples like those in (22), where the path represented by the P(article), which is a complex one in (22a–c) (cf. (9b–c)), is not incorporated into the verb, this requiring the conflation of the empty causative predicate with an unergative verb by means of Merge.10
(22) a. Sue danced the night away.
     b. Tribal members ceremonially danced it open.
     c. Sue laughed herself silly.
     d. Sue sneezed the napkin off the table.
     e. Sue laughed her way into the room.
     f. Sue swam her swimsuit to tatters.
Wechsler (1995)
Quite interestingly, our syntactic analysis of the conflation processes involved in (18) and (21) can also be shown to receive empirical support from examples like the German ones in (23), which are nicely commented on by Seibert (1992: 62).
(23) a. Er schwamm aus dem Gefängnis. (German)
        He swam out.of the prison
     b. Er hat sich aus dem Gefängnis geschwommen.
        He has refl out.of the prison swum
According to Seibert (1992: 66), in (23b) “the adverbial does not denote a place the subject reaches as a natural result of swimming, i.e., the person might have been swimming in a completely different place, or the person may have never left the prison while actually swimming”. By contrast, the adverbial out of the prison in (23a) does denote a place the subject reaches as a natural result of swimming. Seibert’s comments on (23) can be explained on the basis of our conflation analysis in quite an elegant way. While (23a) involves merging the verb schwimmen (‘swim’) with the null verbal element corresponding to the transition (i.e., go), (23b) involves merging schwimmen with the null verbal element corresponding to the causation (i.e., cause), this being in full accordance with the interpretive effects noted by Seibert. That is to say, (23a) can be analyzed as (24), whereas (23b) can be analyzed as (25) (as above, we assume that the external argument in (23b) (i.e., Er ‘he’) is to be introduced by the relevant functional projection omitted here).
(24) [V [V [V schwimmen] [V go]] [P [N Er] [P [P aus] [N Gefängnis]]]]
(25) [V [V [V schwimmen] [V cause]] [P [N sich] [P [P aus] [N Gefängnis]]]]
So much for our syntactic analysis of the lexicalization pattern typical of English, that involving conflation of manner into the verb.11 Let us now deal with the lexicalization pattern corresponding to conflation of path into the verb, which has been argued to be characteristic of Romance languages.12 As noted by Talmy, in Spanish the (telic) directional or path element is conflated into the motion verb. To put it in our present terms, the lexical entry of Spanish path verbs like entrar (‘to go in’) contains the information that a telic path particle is incorporated into the verb. Since the verb-framed nature of Spanish path constructions is a fossilized property (this being due to the diachronic evolution of this language), it is clear that each verbal lexical entry affected by such a fossilization will have to reflect it. For example, the lexical entry of entrar (‘to go in’) will have to contain information like that depicted in (26):
(26) entrar
     phonological properties
     V (
Given (26), we can now explain why Spanish lacks constructions like (27b):
(27) a. Sue bailó.
        Sue danced
     b. *Sue bailó a la habitación.
        Sue danced to the room
The construction in (27b) is ungrammatical in Spanish: since the preposition corresponding to the telic path constituent is lexically incorporated into the transition verb in Romance, there is no phonologically null unaccusative verb corresponding to dynamic transition that turns out to be available in Spanish. As a result, merging an unergative verb like bailar (‘dance’) with an empty unaccusative verb expressing dynamic transition is not a real possibility in such a language. Accordingly, Spanish speakers are only allowed to encode the manner component as an adjunct: see (28).
(28) a. American Spanish
        Sue entró a la habitación (bailando).
        Sue went+into (in)to the room (dancing)
     b. European Spanish
        Sue entró en la habitación (bailando).
        Sue went+into in(to) the room (dancing)
The syntactic structure associated to (28a) is that in (29), the adjunct bailando (‘dancing’) being omitted from the syntactic argument structure:
(29) [V [V entró] [P [N Sue] [P [P a] [N habitación]]]]
The examples in (28) could be argued to pose a potential problem for our analysis of lexical incorporation:13 why is it the case that the lexically incorporated prepositional complement of the unaccusative verb can reappear in syntax (cf. a/en la habitación ‘into the room’)? We think that the fossilized kind of incorporation of P into the verb entrar is crucial in order to understand why the prepositional complement reappears: P is always projected in the syntax, this being a copy of the P incorporated into the verb. Otherwise, note that there would not be any internal specifier position available for the subject of the unaccusative sentence. This copy can be pronominal, as in the Catalan example (30a), or phonologically null when recovered via deixis (see (30b)).14
(30) a. La Sue hi entrà. (Catalan)
        Sue loc.clitic went+into
     b. La Sue entrà.
        Sue went+into
So far we have been dealing with cases where the two lexicalization patterns, conflation of motion with manner and conflation of motion with path, do not coincide in a single language. This is the case in English and Spanish: recall Talmy’s proposal that while English usually lacks conflation of path into the verb, Spanish lacks conflation of manner into the verb. However, it is interesting to notice that there are some languages that appear to combine both options, as shown by the Dutch data in (31), drawn from van Hout (1996), and by the Russian data in (32), drawn from Spencer & Zaretskaya (1998). Nevertheless, as we will see below, the incorporation of the Particle into the verb in the following data must not be analyzed in the same way as in the Spanish examples we have just dealt with above.
(31) a. John is weg-gelopen. (Dutch)
        John is away-walked
        ‘John walked away’.
     b. De gevangene is de gevangenis uit-gezwommen.
        the prisoner is the prison out-swum
        ‘The prisoner swam out of the prison’.
(32) Ona vo-šla / v-letela. (Russian)
     she in-walked / in-flew
     ‘She walked/flew in’.
It has often been said in the literature that there is an unaccusativization process involved in (31)–(32). The basic verb is said to be unergative, but the syntactic construction where it appears turns out to be unaccusative when a directional element is present. For example, note that the auxiliary selected in the Dutch data is zijn (‘be’) (see Hoekstra 1984). Our analysis of (31a) to be presented below can be argued to hold for the rest of the data in (31b) and (32). As assumed above when dealing with English data like (14b) Sue danced across the room, we want to propose that the lexicon of both Dutch and Russian contains a phonologically null verb denoting dynamic transition, besides its phonetically realized correspondent. Following our present convention, we represent this phonologically null unaccusative verbal element in boldface: for expository reasons, let us call it go once again. As noted, it is required that the empty verb be conflated with another element with full phonological properties in order for the former to be interpretable or legible at the interface level with sensoriomotor systems. To avoid its crashing at PF, the unergative verb (ge)lopen (‘walk(ed)’), which has also been selected from the lexical subarray, turns out to be adjoined to the phonologically null unaccusative verb by means of Merge. As a result, the conflation of (ge)lopen with go will be spelled out as (ge)lopen. Its corresponding simplified syntactic configuration is given in (33).
(33) [V [V [V (ge)lopen] [V go]] [P [N John] [P weg[affix]]]]
So far the analysis of (31a) is identical to its corresponding English version John walked away. However, there is an additional step in the syntactic derivation of (31) and (32), which appears to be triggered by the affixal nature of the preverbs.15 Consequently, this P will have to move to the superior verbal head, adjoining to it. In this case, Move is clearly justified because of the affixal status of P. On the other hand, it should be clear that in (31) and (32), the incorporation of the P(article) into the verb is not a fossilized process, as it is in Spanish. Crucially, notice that the morphological analysis of the verbs in (31) and (32) is quite transparent: the prefix corresponding to the P(article) can be easily identified. By contrast, Spanish path verbs like entrar (‘to go in’), bajar (‘to go down’), subir (‘to go up’), etc., constitute morphophonological atoms (that is, what corresponds to the P(article) and what corresponds to the verb in such verbs cannot be distinguished synchronically any longer), this being due to the above-mentioned fossilization process. Accordingly, it should not be surprising that the fossilized status of the incorporation of the telic P into the transition verbal head prevents Spanish from merging an unergative verbal head expressing what Talmy refers to as ‘manner’ into a null unaccusative verb expressing dynamic transition. By contrast, the non-fossilized nature of the incorporation of P into the verb in (31)–(32) allows a subordinate unergative verb to be merged into the null unaccusative verb of the path of motion construction.
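The parametric reasoning of this section, restricted to path-of-motion constructions, can be summarized as a toy decision rule. This is a minimal sketch, not a definitive formalization: the names `PathLexicon`, `has_null_go`, `manner_conflation_allowed` and `needs_p_movement` are hypothetical, and the rule simply assumes that a null dynamic-transition verb is available exactly when path incorporation is not fossilized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathLexicon:
    """Hypothetical summary of the path-related lexical properties of a language."""
    language: str
    fossilized_path_incorporation: bool  # telic P fused into the verb (e.g., Sp. entrar)
    affixal_path: bool = False           # P is a transparent preverb (e.g., Du. weg-)


def has_null_go(lex: PathLexicon) -> bool:
    """Assumption of this sketch: the null dynamic-transition verb is available
    only if path incorporation is not fossilized."""
    return not lex.fossilized_path_incorporation


def manner_conflation_allowed(lex: PathLexicon) -> bool:
    """An unergative manner verb can be merged only with an available null verb."""
    return has_null_go(lex)


def needs_p_movement(lex: PathLexicon) -> bool:
    """An affixal P must additionally move and adjoin to the verbal head."""
    return lex.affixal_path


LEXICONS = [
    PathLexicon("English", False),
    PathLexicon("Spanish", True),
    PathLexicon("Dutch", False, affixal_path=True),
    PathLexicon("Russian", False, affixal_path=True),
]

PREDICTIONS = {lex.language: manner_conflation_allowed(lex) for lex in LEXICONS}
print(PREDICTIONS)
```

The two attributes are kept separate on purpose: fossilization decides whether manner conflation is possible at all, whereas affixal status only adds a movement step to the derivation.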
4. Conflation processes in existential locative constructions
In this section, we will show that examples of satellite-framed constructions can also be found in the Romance languages, which, as noted above, have been argued to be typically verb-framed (cf. Talmy 1991). Once again it appears to be the case that such a distinction cannot be drawn across the board, but depends on the different lexical semantic domains involved (see our Footnote 11). For example, we want to argue that the existential locative constructions in (34), drawn from Torrego (1989) and Rigau (1997), involve conflation of an unergative verb into a null unaccusative verbal head expressing static or negative transition (i.e., be).
(34) a. Spanish
        En este árbol anidan cigüeñas.
        in this tree nest-3pl storks
        ‘There are some storks nesting in this tree’.
     b. Central Catalan
        (En aquest esbart,) hi ballaran adolescents.
        in this group, loc.cl. will-dance-3pl teenagers
        ‘There will be some teenagers dancing here (in this group)’.
     c. N’ hi ballaran molts.
        part.cl. loc.cl. will-dance many
        ‘Many of them will dance there’.
The unaccusativity of these constructions can be argued to be shown by the licensing of (i) a postverbal bare NP in (34a) and (34b), and (ii) the partitive clitic en/ne in (34c).16 Moreover, the sentences in (34) are impersonal just like their corresponding paraphrases in (35), which are formed by the impersonal existential verb Sp. haber / Cat. haver-hi (‘have’) plus a gerund or a pseudorelative construction:
(35) a. Spanish
        En este árbol hay cigüeñas {anidando/que anidan}.
        in this tree has-loc.cl. storks {nesting/that nest-3pl}
        ‘There are some storks nesting in this tree’.
     b. Central Catalan
        En aquest esbart, hi hauran adolescents {ballant/que ballaran}.
        in this group, loc.cl. will-have-3pl teenagers {dancing/that will dance}
        ‘There will be some teenagers dancing here (in this group)’.
Although a sentence with a definite DP is grammatical (cf. (36a)), it cannot take an existential meaning: in contrast to (34b), (36a) does not express a property of aquest esbart (‘this group’), but an activity carried out by the subject (les meves filles) (cf. Rigau 1997).
(36) Central Catalan
     a. (En aquest esbart,) hi ballaran les meves filles.
        in this group, loc.cl. will-dance-3pl the mine-pl daughters
        ‘My daughters will dance here (in this group)’.
     b. Les meves filles ballaran en aquest esbart.
        the mine-pl daughters will-dance-3pl in this group
        ‘My daughters will dance here (in this group)’.
As noted by Rigau (1997, 1999), the locative clitic hi acts as an impersonalizer in the Catalan examples in (34) and (35). It is precisely this element that prevents the sentence from having a nominative subject.17 By contrast, in (36a) the locative clitic hi is just a resumptive pronoun of a left-dislocated PP. Moreover, the parallelism between the sentences in (34) and those with the impersonal existential verb haver-hi is also visible in those Romance languages and dialects where this verb does not agree with its object NP in number: for example, compare (37a) with (39a).

(37) a. Sardinian (Jones 1993: 105)
        B’ at ballatu tres pitzinnas.
        loc.cl. has danced three girls
        ‘Three girls danced.’
     b. Northwestern Catalan
        En aquest esbart, hi balla adolescents.
        in this group, loc.cl. dances teenagers
        ‘There are some teenagers dancing here (in this group).’
Crucially, notice that the construction is ungrammatical when there is a definite DP:

(38) a. Sardinian (Jones 1993: 195)
        *B’ at ballatu cussos pitzinnas.
        loc.cl. has danced these girls
     b. Northwestern Catalan
        *En aquest esbart, hi balla els adolescents.
        in this group, loc.cl. dances the teenagers
On the other hand, the Sardinian sentence in (37a) shows that the auxiliary selected by the impersonal existential constructions under study is áere (‘have’), which is also the auxiliary selected by the existential verb áere. By contrast, the existential or locative verb éssere (‘be’) selects the auxiliary éssere. See the Sardinian examples in (39):18

(39) a. B’ at áppitu metas problemas.
        loc.cl. has had many problems
        ‘There have been many problems.’
     b. Bi sun/*at istatus issos.
        loc.cl. are/*has been they
        ‘They were there.’
     (Jones 1993: 114)
The above-mentioned properties shared by the sentences in (34) and (37), on the one hand, and those in (35) and (39a), on the other, can be explained if we assume that the lexical entry corresponding to the existential verb {Sp. haber / Cat. haver-hi / Sard. áere} contains the same abstract central coincidence preposition as that syntactically incorporated in the constructions in (34) and (37). The lexical entry we propose for the impersonal existential verb is depicted in (40):

(40) Sp. haber / Cat. haver-hi / Sard. áere phonological properties V (
As an idiosyncratic property of (40), an abstract central coincidence preposition (P) is incorporated into the verb expressing static or negative transition (see Freeze 1992; Hale & Keyser 1993; Kayne 1993, among others). This P selects a locative determiner as its specifier/subject, and an NP as a complement.19 Following Rigau (1997), we assume that the incorporation of P into the existential verb allows this verb to assign partitive case, an instance of the inherent case that P is able to assign when incorporated into the host ‘light’ verb.20

Let us now deal with the syntactic representation of the conflation process involved in constructions like those in (34) and (37). We want to argue that there is an empty unaccusative verb expressing static or negative transition (i.e., be), which in turn selects a phonologically null central coincidence P. This null P incorporates into the V in the syntax in order to satisfy the Full Interpretation Principle at PF. Since both V and P are phonologically null heads, the derivation will crash unless the empty complex verbal head is merged with the unergative verb ballar (‘to dance’) present in the numeration. Functional categories omitted, the syntactic structure corresponding to both (34b) and (37b) is represented in (41). The PP en aquest esbart (‘in this group’) has been set aside, since it is an adjunct.

(41)
[V [V [V balla] [V be]] [P [N hi] [P [P central coincidence] [N adolescents]]]]
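The derivational reasoning behind (41) can be restated as a small Python sketch. It is only an illustration under simplifying assumptions: the class and function names (`Head`, `ComplexHead`, `incorporate`, `conflate`, `converges_at_pf`) are hypothetical, and the PF condition is reduced to the requirement that a verbal head have some phonological content.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Head:
    category: str          # "V" or "P"
    label: str             # e.g. "be", "central coincidence", "ballar"
    phon: Optional[str]    # None marks a phonologically null head


@dataclass(frozen=True)
class ComplexHead:
    parts: Tuple[Head, ...]

    @property
    def phon(self) -> Optional[str]:
        # Phonological content of the complex head, if any part is overt.
        overt = [h.phon for h in self.parts if h.phon is not None]
        return " ".join(overt) if overt else None


def incorporate(host: Head, guest: Head) -> ComplexHead:
    """Syntactic incorporation of `guest` (here P) into `host` (here V)."""
    return ComplexHead((host, guest))


def conflate(verb: Head, empty_head: ComplexHead) -> ComplexHead:
    """Merge a full verb from the numeration with the empty complex head."""
    return ComplexHead((verb,) + empty_head.parts)


def converges_at_pf(head: ComplexHead) -> bool:
    """Toy PF check (an assumption): a head with no phonology crashes."""
    return head.phon is not None


BE = Head("V", "be", None)                      # null unaccusative verb
P_CC = Head("P", "central coincidence", None)   # null preposition
BALLAR = Head("V", "ballar", "balla")           # unergative verb in the numeration

empty_complex = incorporate(BE, P_CC)
conflated = conflate(BALLAR, empty_complex)

print(converges_at_pf(empty_complex))
print(converges_at_pf(conflated))
```

In this toy model the complex head formed by be and P alone fails the PF check, whereas the head obtained after merging balla passes it, which mirrors the claim that the derivation crashes unless the unergative verb is merged.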
Quite interestingly, our present analysis can be argued to provide an elegant explanation of the fact that existential locative constructions like those in (34) are not possible in English, as shown in (42).21

(42) *At the party danced some people.
Our proposal is that the phonologically null verb be does not exist in an otherwise non-verb-framed language like English. As seen in Section 3, its satellite-framed nature has been argued to be restricted to path constructions (pace Talmy 1985, 1991). Recall that we have accounted for its non-verb-framed nature by positing two empty verbal heads: go and cause. However, because English lacks a null verbal head be, merging an unergative verb like dance into a static or negative transition verb is not a real possibility in this language, as it is in Romance languages like Spanish or Catalan, which are otherwise typically verb-framed: they lack the two empty verbal heads go and cause involved in complex path constructions. Moreover, according to our present argumentation, it is plausible to assume that the absence of an empty verbal head be from English is related to the fact that this language has no existential verb equivalent to the Romance impersonal existential verb (cf. Cat. haver-hi).

To conclude, we have shown that the so-called unaccusativization process involved in both (41) and (18) (Sue danced across the room) can be analyzed in a uniform way. Quite interestingly, we have not made use of different mechanisms or strategies when dealing with each of these constructions. Rather, the conflation processes under study have been defined as an instance of the Merge operation that combines two different verbs present in the numeration, the phonologically null one being the main verb and the full verb the subordinate one: it is the former verb that determines the {cause/motion/state} meaning of the construction, while the latter expresses what Talmy refers to as the ‘manner component’.
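The cross-linguistic claim of this section can be summarized as a lookup over inventories of null verbal heads. The following Python sketch is only an illustration: the dictionary encoding, the construction labels, and the function name `conflation_available` are assumptions introduced here, and the pairing of the path construction with go simplifies the discussion of go and cause.

```python
# Inventories of phonologically null verbal heads, as claimed in the text.
# The dictionary encoding itself is an illustrative assumption.
NULL_VERBAL_HEADS = {
    "English": {"go", "cause"},
    "Spanish": {"be"},
    "Catalan": {"be"},
}

# Null head that each conflation construction depends on (hypothetical labels).
REQUIRED_HEAD = {
    "path": "go",                  # Sue danced across the room
    "existential locative": "be",  # (En aquest esbart,) hi ballaran adolescents
}


def conflation_available(language: str, construction: str) -> bool:
    """True if the language has the null head the construction requires."""
    return REQUIRED_HEAD[construction] in NULL_VERBAL_HEADS[language]


for lang in NULL_VERBAL_HEADS:
    for cxn in REQUIRED_HEAD:
        print(lang, cxn, conflation_available(lang, cxn))
```

On this encoding, the ungrammaticality of (42) follows from the absence of be in the English inventory, and the unavailability of complex path constructions in Spanish and Catalan follows from the absence of go.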
Concluding remarks

The most general conclusion to be drawn from our study is fully coherent with Chomsky’s (1995: 8) claim that “the apparent richness and diversity of linguistic phenomena is illusory and epiphenomenal, the result of interaction of fixed principles under slightly varying conditions”. Adopting such a perspective, we have shown that Talmy’s (1985, 1991) descriptive analysis of the conflation processes involved in the constructions under study can be explained within a minimalist conception. In particular, we have shown that the distinction between satellite-framed and verb-framed constructions correlates with the (un)availability of the relevant empty heads (cause/go/be), whose licensing involves appealing to Merge to avoid crashes at PF. On the other hand, the data we have analyzed here do not appear to affect functional aspects of the lexicon (Borer 1984; Chomsky 1995). Accordingly, we must conclude that ‘parametrized variation’ is not to be confined to inflectional systems. This conclusion has been independently reached by Hale & Keyser (1998), Juffs (1996), and Snyder (1995), among others.
Notes

* We are grateful to our colleagues of the Grup de Gramàtica Teòrica of the Universitat Autònoma de Barcelona, especially to Laia Amadas and Carme Picallo for their careful critical readings. Thanks to Gretel de Cuyper and Michael Kennedy for the Dutch and English data, respectively. Laia Amadas also deserves a special mention for her support and willingness to help us with the preparation of the manuscript. We are also grateful to the audience at the 1999 GLOW: Universals (ZAS, Berlin) for insightful comments and suggestions. Research for this paper has been supported by the Ministerio de Ciencia y Tecnología through project BFF2000-0403-C02-01/02, and by the Generalitat de Catalunya through project 2001 SGR 00150.

1. We do not intend to reduce the importance of semantics by adopting a syntactic approach. Our syntactic account should not be regarded as incompatible with Jackendoff’s (1990) or Goldberg’s (1995) works on the semantic restrictions concerning constructional idioms. We have put them aside in the present paper, because what we are mostly concerned with here is how these constructions can be dealt with from a syntactic perspective.

2. Accordingly, we assume the epistemological priority of ‘semantic selection’ over ‘categorial selection’ (see Grimshaw 1979 and Chomsky 1995, among others).

3. It is important to note that N, V, and P must be regarded as the syntactic categories derivable from their associated semantic features, not as their corresponding language-specific morphosyntactic realizations (see Hale & Keyser 1997, 1998). For example, it should be clear that we are not positing that happy or here are morphosyntactic Ps that turn out to incorporate an N as a morphosyntactic category. Rather, what we are positing is that both involve the incorporation of a non-relational element into a relational one.

4. See Hale (1986) for an in-depth analysis of these semantic relations.

5. According to Chomsky (1995: 238), “ lexical entry represents in the optimal way the instructions for the phonological component and for the interpretation of the LF representation: a phonological matrix, and some array of semantic properties. It must also contain whatever information is provided by the verb itself for the operations of CHL (= computational system).”
6. See Bouchard (1995), Baker (1997), and Mateu (1999) for more discussion on the homomorphic nature between syntactic and semantic structures.

7. In this sense, our proposal is similar to that developed by Harley (1995). See also Arad (this volume). The main difference is that, with Hale & Keyser (1993, 1998), we do not analyze the syntactic head associated with the eventive relation as a functional one. As shown in (12a), this is a lexical one.

8. Quite interestingly, notice that our proposal is compatible with Baker’s (1997) assumption that there are only three ‘proto-roles’ (see Dowty 1991): agent/causer, theme/patient, and goal/path/location. With Baker (1997: 121), we assume that the ‘Uniformity of Theta Assignment Hypothesis’ (UTAH) is “in the spirit of” the Minimalist Program, and that the UTAH is an important part of the theory of the interface between LF and systems of thought. This notwithstanding, we agree with Hale & Keyser’s claim that the status of UTAH in linguistic theory can be argued to be derived, once a strictly configurational account of Baker’s proto-roles is provided.

9. Following Chomsky (1998: 13), we assume that “derivations make a one-time selection of a lexical array LA from the lexicon, then map LA to expressions, dispensing with further access to the lexicon”.

10. Besides the semantic/aspectual restriction that the conflated verb must denote an activity (see Jackendoff 1990; Hoekstra 1992, among others), there also appears to be a syntactic reason excluding examples such as those in (i), which contain unaccusative verbs: the internal specifier position projected by P, which has been assumed to be subcategorized for by all unaccusative verbs (e.g., cf. (10)), i.e., that occupied by Figure, could not be licensed in (i) either. Accordingly, there is no structural position for Sue to be interpreted as Figure in (i).

(i) a. *Sue came the door open.
    b. *Sue arrived herself silly.
11. Some relevant remarks are in order here: as noted by Juffs (1996), it should be clear that the distinction between satellite-framed languages and verb-framed ones must not be drawn across the board, but rather it depends on the lexical-semantic domains analyzed. For example, English can be typically analyzed as satellite-framed with regard to ‘physical motion’. This notwithstanding, concerning ‘abstract motion’, it is both satellite-framed (e.g., cf. the adjectival resultative construction in (22b–c)) and verb-framed (cf. the huge number of change-of-state verbs in English; Levin 1993). That is, it appears to be more appropriate to speak of ‘satellite- and verb-framed constructions’ rather than ‘satellite- vs. verb-framed languages’.

12. This notwithstanding, it is important to keep in mind the following remarks found in Talmy (1985: 72): “English does have a certain number of verbs that genuinely incorporate Path, as in the Spanish conflation type, for example: enter, exit, pass, rise, descend, return, circle, cross, separate, join (...). But these verbs are not the most characteristic of English. In fact, the majority (here all except rise) are not original English forms but rather borrowings from Romance, where they are the native type”.

13. One caveat is in order here: although the PP a/en la habitación (‘into the room’) can be omitted, it is not an adjunct: see Tortora (1998), where the argumental status of these dispensable elements is argued for. According to Tortora (1998: 344), PPs like those in (28) do not occupy a VP-external position; rather they are part of the core eventuality of the VP, just like English resultative adjectival phrases in the river froze solid or the window broke open.

14. In (30b) the P lexically incorporated into V allows the phonologically null P to be properly interpreted, since the former ensures the recoverability of the latter.

15. We assume that a non-relational element (i.e., an abstract Ground) is lexically incorporated into the relational element involved in weg (‘away’). Here we will not comment on how this information is to be encoded into the lexical entry of weg: see Hale & Keyser (2000) for the proposal that some intransitive particles incorporate a non-relational element. Here we will assume that particles and prefixes do not essentially differ with respect to argument structure: both involve the birelational element P(article) (see Section 2).

16. Torrego (1989) and Rigau (1997) relate the construction in (34) to the so-called ‘locative inversion’. Nevertheless, it must be noted that the latter construction appears to have different properties: for example, the discourse conditions governing locative inversion are not the same as those governing (34). See Levin & Rappaport Hovav (1995) for arguments against taking the locative inversion construction as an ‘unaccusative diagnostic’. In order to distinguish the unaccusative constructions in (34) from the locative inversion constructions, which can be unaccusative or unergative, we will refer to the former as ‘existential locative constructions’ (see Footnote 21).

17. Following Longa, Lorenzo & Rigau (1998), we assume that Spanish has a phonologically null locative determiner represented as . See this article for motivation of this assumption.

18. As noted by Jones (1993: 113f), the clitic bi is obligatory in (39a), but optional in (39b). In the latter sentence the clitic could be replaced by a locative PP or adverbial phrase, which shows that the clitic bi is the true predicate, and not a subject clitic as in (39a). Accordingly, the subject issos (‘they’) in (39b) has nominative case.

19. In impersonal deontic existential constructions like those in (i), the specifier selected by P is a dative or locative clitic determiner, that is, a ‘quirky case’ clitic (see Rigau 1999).

(i)
    a. Northwestern Catalan
       Hi cal tres ous.
       loc.cl. is-necessary three eggs
       ‘Three eggs are necessary.’
    b. Mos cal un milió de francs.
       to-us is-necessary a million of francs
       ‘We need a million francs.’
    c. Sardinian (Jones 1993: 101)
       Bi keret tres ovos.
       loc.cl. is-necessary three eggs
       ‘Three eggs are necessary.’
    d. Nos keret unu milione de francos.
       to-us is-necessary a million of francs
       ‘We need a million francs.’
    e. Aranese Occitan
       I cau tres cagires.
       loc.cl. is-necessary three chairs
       ‘Three chairs are necessary.’
    f. Mos cau tres cagires.
       to-us is-necessary three chairs
       ‘We need three chairs.’
20. Transitive verbs are associated with accusative case, not partitive case. Consequently, the clitic en in (i-b) is the genitive case that an overt or covert quantifier assigns to the N (see Rigau 1997):

(i) a. La Maria llegeix (molts) llibres.
       ‘Mary reads (many) books.’
    b. La Maria en llegeix (molts).
       ‘Mary reads many of them.’
21. This example is commented on by Kural (this volume). See this work for an alternative explanation of the ungrammaticality of (42) (his (36a)). Once again one caveat is in order here: as noted above (cf. fn. 16), the possibility for an unergative verb like dance to enter into the so-called ‘locative inversion’ is not what is at issue here (cf. Levin & Rappaport Hovav 1995: 285). That is, here we are not referring to locative inversion cases like (36a), which is not an unaccusative construction, but to impersonal existential constructions like those in (34), which have been shown to be unaccusative (see Footnote 16).
References

Arad, M. (This volume). Universal Features and Language-particular Morphemes.
Baker, M. (1997). Thematic Roles and Syntactic Structure. In L. Haegeman (Ed.), Elements of Grammar. Dordrecht: Kluwer.
Borer, H. (1984). Parametric Syntax. Dordrecht: Reidel.
Borer, H. (1994). The Projection of Arguments. In E. Benedicto and J. Runner (Eds.), Functional Projections [UMOP 17]. Amherst: University of Massachusetts.
Bouchard, D. (1995). The Semantics of Syntax. Chicago and London: The University of Chicago Press.
Chomsky, N. (1975). Reflections on language. New York: Pantheon.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: The MIT Press.
Chomsky, N. (1998). Minimalist Inquiries: The framework. Ms., MIT. (Published as Chomsky, N. (2000). In R. Martin et al. (Eds.), Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. Cambridge, MA: The MIT Press).
Dowty, D. (1991). Thematic Proto-Roles and Argument Selection. Language, 67, 547–619.
Freeze, R. (1992). Existential and Other Locatives. Language, 68, 553–595.
Goldberg, A. (1995). Constructions: A constructional approach to argument structure. Chicago: The University of Chicago Press.
Grimshaw, J. (1979). Complement Selection and the Lexicon. Linguistic Inquiry, 10, 279–325.
Hale, K. (1986). Notes on World View and Semantic Categories: Some Warlpiri examples. In P. Muysken and H. van Riemsdijk (Eds.), Features and Projections. Dordrecht: Foris.
Hale, K. & S. J. Keyser (1993). On Argument Structure and the Lexical Expression of Syntactic Relations. In K. Hale and S. J. Keyser (Eds.), The View from Building 20: Essays in honor of Sylvain Bromberger. Cambridge, MA: The MIT Press.
Hale, K. & S. J. Keyser (1997). The Limits of Argument Structure. In A. Mendikoetxea and M. Uribe-Etxebarría (Eds.), Theoretical Issues at the Morphology-Syntax Interface. Bizcaia: Servicio Editorial de la UPV.
Hale, K. & S. J. Keyser (1998). The Basic Elements of Argument Structure. In H. Harley (Ed.), Papers from the UPenn/MIT Roundtable on Argument Structure [MIT Working Papers in Linguistics 32] (pp. 73–118).
Hale, K. & S. J. Keyser (2000). On the Time of Merge. Ms., MIT.
Harley, H. (1995). Subjects, Events, and Licensing. Doctoral dissertation, MIT.
Hoekstra, T. (1984). Transitivity. Dordrecht: Foris.
Hoekstra, T. (1992). Aspect and Theta-theory. In I. M. Roca (Ed.), Thematic Structure: Its role in grammar. Berlin: Mouton de Gruyter.
van Hout, A. (1996). Event Semantics of Verb Frame Alternations: A case study of Dutch and its acquisition. Doctoral dissertation, TILDIL dissertation series.
Jackendoff, R. (1990). Semantic Structures. Cambridge, MA: The MIT Press.
Jackendoff, R. (1997). Twistin’ the night away. Language, 73, 534–559.
Jones, M. A. (1993). Sardinian Syntax. London: Routledge.
Juffs, A. (1996). Learnability and the Lexicon. Amsterdam: John Benjamins.
Kayne, R. (1993). Toward a Modular Theory of Auxiliary Selection. Studia Linguistica, 47, 3–31.
Klipple, E. (1997). Prepositions and Variation. In A.-M. Di Sciullo (Ed.), Projections and Interface Conditions. New York: OUP.
Kratzer, A. (1996). Severing the External Argument from its Verb. In J. Rooryck and L. Zaring (Eds.), Phrase Structure and the Lexicon. Dordrecht: Kluwer.
Kural, M. (This volume). A Four-way Classification of Monadic Verbs.
Levin, B. (1993). English Verb Classes and Alternations: A preliminary investigation. Chicago: The University of Chicago Press.
Levin, B. & T. Rapoport (1988). Lexical Subordination. Chicago Linguistics Society, 24, 275–289.
Levin, B. & M. Rappaport Hovav (1995). Unaccusativity: At the syntax-lexical semantics interface. Cambridge, MA: The MIT Press.
Longa, V., G. Lorenzo & G. Rigau (1998). Subject Clitics and Clitic Recycling: Locative sentences in some Iberian Romance languages. Journal of Linguistics, 34, 125–164.
Marantz, A. (1997). No Escape from Syntax: Don’t try morphological analysis in the privacy of your own lexicon. In A. Dimitriadis et al. (Eds.), Penn Working Papers in Linguistics 4 (2) (pp. 201–225).
Mateu, J. (1999). Universals of Semantic Construal for Lexical Syntactic Relations. Paper presented at the 1999 GLOW Workshop: Sources of universals, Potsdam. Distributed as GGT-99-4 Research Report, Universitat Autònoma de Barcelona (http://ggt.uab.es).
Rappaport Hovav, M. & B. Levin (1998). Building Verb Meanings. In M. Butt and W. Geuder (Eds.), The Projection of Arguments: Lexical and compositional factors. Stanford, CA: CSLI Publications.
Rigau, G. (1997). Locative Sentences and Related Constructions in Catalan: ésser/haver alternation. In A. Mendikoetxea and M. Uribe-Etxebarría (Eds.), Theoretical Issues at the Morphology-Syntax Interface. Bizcaia: Servicio Editorial de la UPV.
Rigau, G. (1999). Relativized Impersonality: Deontic sentences in Catalan. In E. Treviño and J. Lema (Eds.), Semantic Issues in Romance Syntax. Amsterdam: John Benjamins.
Ritter, E. & S. T. Rosen (1998). Delimiting Events in Syntax. In M. Butt and W. Geuder (Eds.), The Projection of Arguments: Lexical and compositional factors. Stanford, CA: CSLI Publications.
Seibert, A. J. (1992). Intransitive Constructions in German and the Ergative Hypothesis. Doctoral dissertation, University of Trondheim, Norway.
Snyder, W. (1995). Language Acquisition and Language Variation: The role of morphology. Doctoral dissertation, MIT.
Spencer, A. & M. Zaretskaya (1998). Verb Prefixation in Russian as Lexical Subordination. Linguistics, 36, 1–39.
Talmy, L. (1985). Lexicalization Patterns: Semantic structures in lexical forms. In T. Shopen (Ed.), Language Typology and Syntactic Description III: Grammatical Categories and the Lexicon. Cambridge: CUP.
Talmy, L. (1991). Paths to Realization: A typology of event conflation. Berkeley Linguistics Society, 17.
Torrego, E. (1989). Unergative-unaccusative Alternations in Spanish. MIT Working Papers in Linguistics, 10, 253–272.
Tortora, C. (1998). Verbs of Inherently Directed Motion are Compatible with Resultative Phrases. Linguistic Inquiry, 29, 338–345.
Wechsler, S. (1995). The Semantic Basis of Argument Structure. Stanford, CA: CSLI Publications.
Williams, E. S. (1977). Discourse and Logical Form. Linguistic Inquiry, 8, 101–139.
Morphological constraints on syntactic derivations*
Juan Romero
Universidad de Alcalá de Henares/Universidad Autónoma de Madrid

Parametric variation in a minimalist framework
The aim of this paper is to sketch a model for linguistic variation within a minimalist framework. In my view, the changes that the minimalist program (MP) has introduced into syntactic explanation make the postulates of the Principles and Parameters (PP) model untenable. The proposal I will argue for is based on the idea that languages differ in the formal features they encode, i.e., there is no universal catalogue of shared formal features. Each language determines independently (although not arbitrarily) its own formal features from the universal set of features F available for the faculty of language (Chomsky 1998, 1999).

The basic assumption in the PP model is that purely grammatical properties are encoded as principles. Principle P expresses a requirement that must be satisfied by every derivation in every language. P can be satisfied in different ways within the range specified by the parametric options of P. Therefore, cross-linguistic variation is encoded as parameters over principles. Consider Case theory. Every NP is roughly subject to the following condition: it must receive/check Case. We can express this requirement as a principle, and then we can define parametric options such as right or left assignment, or overt/covert checking. These options delimit the range of variation UG allows for Case Theory.

In the Government and Binding model, the modular architecture of the computational component provides a well-defined locus for the formulation of each principle. Some of them, like government, affect the whole system; others, like the Case Filter, are module-specific. The properties of each module are satisfied at a certain level of representation (D Structure, S Structure, etc.). As a consequence, within the Government and Binding model, parametric options are easily defined and evaluated. For instance, suppose that a language L assigns Case (under government) to its right, and that this condition is evaluated at S Structure. This amounts to saying that by S Structure each NP must be in the government domain, and to the right, of the head that assigns its Case.

Under minimalist assumptions, it is not so clear how these principles can be represented or satisfied. Firstly, there are no modules where principles can be expressed. Their role in syntactic analysis has been transferred to formal features. The difference between the two models is not trivial. Feature properties are defined independently of their values: no “right assignment” can be attributed to a certain feature; all features work in the same system and under the same rules. Secondly, the only levels of representation are the interface levels: PF and LF. At this point, we can evaluate the grammaticality of the derivation, but whatever we define at these levels cannot affect the derivation: it is too late. Suppose, again, that L assigns Case to its right, and suppose a derivation D that reaches LF (I assume that if Case is not checked the derivation will be automatically canceled). There are two possibilities: (i) the NP is at H’s right or (ii) at H’s left. If it is at H’s left, the derivation crashes; on the contrary, if it is at H’s right, the derivation succeeds. What is important is that it is not possible to design a specific operation to move the NP to the right to overcome the ungrammaticality in (ii).1 Therefore, defining a parameter at the interface levels does not have any effect on the derivation.2

Due to these problems, the MP resorts to the notion of strength to account for linguistic variation. In Chomsky (1993), linguistic variation depends on an idiosyncratic property of features: their requirement of being overtly or covertly checked.
As a consequence, principles are encoded as features, and parametric variation as a “timing” of their checking. This model faces at least one very important problem: the notion overt does not correspond to morphosyntactically visible properties in a systematic way (see Chomsky 1993, 1995). Therefore, from the learner’s perspective this approach poses a complex paradox. On the one hand, although there is no necessary visible evidence for it, some operation O, triggered by F, can be defined as overt. On the other hand, learning a specific language consists basically in learning the strength of its features. Therefore we, the learners, require overt evidence to determine whether F is strong (and O is overt); otherwise we must apply a general condition, Procrastinate, which basically tells us that F is not strong. In the most recent developments of the MP, these and other problems have led us to a different understanding of linguistic variation. One of the main reasons for this shift is the fact that the notion of strong feature is anticyclic.
Strong features imply the existence of two different cycles: the strong/overt cycle, and the weak/covert cycle. In a derivational approach this is clearly an unexpected “imperfection” that would require good evidence to be sustained (see Kayne 1998; Chomsky 1998, 1999). At this point, brand new tools are required to account for linguistic variation. Chomsky (1999) proposes that the P&P approach can be reduced to a kind of uniformity approach:

(1) Uniformity Principle
    In the absence of compelling evidence to the contrary, assume languages to be uniform, with variety restricted to easily detectable properties of utterances.
The first question is what can be considered uniform. There are at least two kinds of candidates: first, those properties that constitute an inherent part of the computational component; and second, those properties that, owing to their interpretability, are motivated at the interfaces. Some candidates are the following:

Thematic relations. Following Hale and Keyser’s hypothesis, thematic relations are configurational in nature. In the MP, this can be understood as Merge by s-selection (Romero 1997). If a verb V selects a theme, in the computational component V takes a complement that will be thematically interpreted. Note that although these relations are usually considered to be evaluated at LF, it is also possible to understand them in another way. Specifically, this can be a subcase of a more general procedure: the I/C interpretation of syntactic structures (Spec–head, and head–complement relations). Thematic interpretation would be the particular instance in which syntactic structures involve relations between lexical heads.

Basic sentence structure. The sequence [COMP – TENSE – v – V] seems to be universal, although not all the functional heads are required in every sentence (see Chomsky 1999). (See Uriagereka 1998 for a possible I/C motivation for this sequence.)

Computational operations. The expressions in every language are composed by means of the same operations: Merge, Move, Agree, etc. These are not substantive universals, but purely computational devices, possibly the core of the faculty of language.

What is crucial about all of these candidates (which do not exhaust the list) is that no variation at all is allowed for them. There is no language in which the theme relation is expressed by means of a Spec–head relation, or in which variation is allowed in the inner workings of Merge, Agree, etc.
The second question about (1) is what easily detectable properties of utterances means. Chomsky (1999), following ideas developed by Vergnaud and Borer, suggests two properties that could satisfy this condition: “. . . parametric variation is restricted to the lexicon, and insofar as syntactic computation is concerned, to a narrow category of morphological properties, primarily inflectional” (p. 2). Interestingly, in the very same paragraph Chomsky states that “. . . basic inflectional properties are universal though phonetically manifested in various ways. . .”. From this point of view, syntactic variation is basically expressed by means of the phonetic properties of the formal features.3 However, under this view, it is not clear what it means for syntactic variation to be just a diverse phonetic manifestation of universal properties. Consider the analysis of the English Thematization/Extraposition rule in Chomsky (1999). The application of this rule has some syntactic effects, but crucially no feature checking relation or other syntactic operations seem to be involved in it. We can build a parallel case for agreement. Consider the sentences in (2):

(2) a. *whom did you give t the car
    b. who gave whom the car
Romero (1999) argues that agreement in English (but not, for instance, in Spanish) is morphologically configurational, therefore, in the morphological component, agreement relations must be expressed by means of Spec–head relations. Furthermore, Romero (1999) argues that dative shift involves an agreement relation. As a consequence, if at Spell Out the dative has moved away from the position where this relation is established (2a), the derivation will crash. On the other hand, if the dative remains in situ (2b), the agreement relation is morphologically evaluated and the derivation succeeds. Therefore, it seems that this line of reasoning has some initial plausibility. The question is if it is enough to account for linguistic variation. Chomsky’s hypothesis is based on the idea that every language has the very same set of interpretable/uninterpretable features (its ‘basic inflectional properties’). Therefore, every language has to express, for instance, the same agreement or Case relations. Throughout this paper I will argue that this approach is theoretically questionable and empirically insufficient, and that, as a consequence, the answer to this question, whether the phonetic properties of the formal features are enough to capture syntactic variation, is no. Syntactic derivations act primarily on formal features. Its role on the faculty of language is defined precisely on their ability to enter into syntactic computations. At the other side of the interfaces only those features that can receive either a phonetic or a semantic interpretation can survive. Where do formal
features come from? Suppose a universal set of features F available for the faculty of language.4 Due to the bare output conditions we know that this set contains phonetic and semantic features: those are the features used by the A/P and I/C components respectively. Formal features are split into two groups: interpretable and uninterpretable. Interpretable means interpretable at the interfaces (Chomsky 1995), so interpretable features are either phonetic or semantic in nature. As a consequence, I will assume that interpretable features are phonetic or semantic features that have been formalized, and that when they reach the interfaces they “recover” their original phonetic or semantic status, i.e., they are interpretable. Regarding uninterpretable features, two options come to mind. On the one hand, they can belong to the computational component. This approach seems to be the one assumed in Chomsky (1999):

. . . The relation Agree and uninterpretable features are prima facie imperfections. In MI and earlier work it is suggested that both may be part of an optimal solution to minimal design specifications by virtue of their role in establishing the property of “displacement”, which has (at least plausible) external motivation in terms of distinct kinds of semantic interpretation and perhaps processing. If so, displacement is only an apparent imperfection of natural language, as are the devices that implement it (p. 3).
Under this view, uninterpretable features form part of the mechanics of the computational component, in the same way that Merge or Move do. On the other hand, uninterpretable features can be considered interpretable features in the wrong place (Romero 1999). From this perspective, they are also formalized features, but because they are attached to a category that cannot interpret them, they do not receive an interpretation at the interfaces. There is a clear difference between these two approaches: under the first one (which I will refer to as the Visible Feature Uninterpretability Hypothesis (VFUH)), being uninterpretable is an active property in the computational component; under the second one (the Formal Features Hypothesis (FFH)), it is not: if an uninterpretable feature remains unchecked at the interfaces, the derivation crashes. The interfaces are the only places where being (un)interpretable becomes relevant. As noted above, Chomsky argues that uninterpretable inflectional features are the devices that implement displacement. There are three different kinds of uninterpretable features:
(3) (i) to select a target α
    (ii) to determine whether α offers a position for movement and if so, what kind of category can move to that position
    (iii) to select the category β that is moved
The first kind implements Agree relations: uninterpretable features in a probe look for the appropriate features to be checked against. In the absence of other requirements, this process takes place without displacement. In the VFUH, these relations are only triggered by uninterpretable features; in the FFH, by contrast, Agree is triggered by any formal feature. (3ii) expresses the property of displacement: the probe has a feature that projects (typically, an EPP feature); therefore some syntactic object is needed to merge with the probe. Under the VFUH, it is again an uninterpretable feature. However, Chomsky argues that there is some interpretation (a surface non thematic interpretation) associated with the new position. In the FFH this property is related to the presence of an interpretable selectional feature (as in the case of pure Merge). This feature is, for instance, responsible for the tense (for the subjects) or the aspectual (for the objects) structure interpretation of the sentence. Finally, the third kind of uninterpretable features is basically composed of Case features. In the VFUH these features determine whether or not an NP is available for movement. In the FFH Case features do not form part of the computational component (see Bonet 1991; Marantz 1993; Romero 1997); they are just the morphological expression of a certain relation. Essentially, the FFH is based on the idea that if a feature is formal it is because it plays an active role in the computational component, and this role is not dependent on its interpretability status. First, there are interpretable selectional features that trigger Merge. Second, the formal features of the item that triggers Merge, the head, act as probes in the label of the newly formed category.5 Descriptively speaking, in a first step, an item H triggers Merge and is attached to another syntactic object C. In this sense, Merge is unrestricted. In a second step, H checks whether C has the appropriate features.
For instance, suppose V is a verb that selects an interrogative COMP. In the first step, V just takes a complement; in the second step, Agree, it checks that its complement is [+wh]. The whole process is independent of the interpretability of the features. An obvious advantage of this proposal is that it does not require a teleological interpretation of uninterpretable features. Note that under current minimalist assumptions it is not enough to specify that a formal feature is uninterpretable; it is also necessary to stipulate that being uninterpretable triggers computational operations. This property makes sense at the interfaces,
or in a representational model, but it is stipulative from a derivational point of view. In this framework the most obvious way of dealing with linguistic variation is to assume that the vocabulary of formal features is not universally determined. Each language may “formalize” different sets of features. Whether a language defines two (Spanish), three (Latin), seventeen (Swahili) or no (English) features for gender/class is an idiosyncratic property. In order to support this hypothesis, in Section 2 I will show that certain restrictions, namely the Person-Case Constraint, only show up if there are agreement features involved. Furthermore, I will link this property to the ability found in some languages such as Japanese to delete arguments without leaving any phonetic trace, agreement or pronoun. In Section 3 I will treat the interpretability status of the EPP feature. In that section I also provide an analysis of object shift in Scandinavian languages, and an explanation for the PCC based on the interactions between agreement and EPP. Specifically, I will propose that person agreement in languages with object agreement is tied to an EPP feature.

2. The Person-Case Constraint

The Person-Case Constraint (PCC) has been argued to be a universal morphological restriction on agreement and clitic clusters (Bonet 1991). The PCC states that dative arguments are ungrammatical if the accusative is first or second person, and agreement is overtly realized. However, Ormazabal and Romero (1998, 2000) show that this description is not totally accurate. Specifically, the PCC has the following properties:
i. It is triggered by object agreement features encoded in the verb, not by the arguments. This property is clearly shown in the contrast between Basque tensed and non tensed clauses that will be developed in Section 2.1.
ii. It affects not only first and second person objects, but any animate specific arguments in object position (if this property is encoded in the verb). Consider the following examples from Castilian Spanish:
(4) a. *me le enseñaron
        1sD 3sO.Animate showed.3pS
        ‘They showed him to me’
    b. me lo enseñaron
        1sD 3sO.Inanimate showed.3pS
        ‘They showed it to me’
In this dialect, the clitic for 3rd person animate masculine objects has the form le, and the clitic for 3rd person non animate masculine objects has the form lo. The contrast between (4a) and (4b) shows that the PCC also affects 3rd person arguments.
iii. Its nature is not morphological, but syntactic. Consider the following example from Haitian Creole (M. Degraff, p.c.):
(5) a. *mwen pral bay Jan/li Mary/l
        I will give Jan/him Mary/her
    b. mwen pral bay Jan/li yon menai
        I will give Jan/him a girlfriend
In Haitian Creole, as in Spanish, the PCC ranges from 1st and 2nd person to proper names. Interestingly, Haitian Creole lacks both clitics and object agreement.
iv. It shows up in Double Object Constructions (DOC). Following an analysis pursued by different authors in recent years (Uriagereka 1988; Demonte 1995; Romero 1997; Anagnostopoulou 1998; etc.), I will assume that all the (b) members of the pairs in (6)–(10) are instances of the same construction, the DOC, which includes the constructions traditionally labeled with this name in English (6), and the applicative constructions in Bantu languages (7), as well as dative-clitic or dative-agreement constructions in Romance (8), Basque (9) or Southern Tiwa (10):
(6) a. I gave a book to Mary
    b. I gave Mary a book
(7) a. Mavuto a-na-perek-a chitseko kwa mfumu
        Mavuto SM-past-hand-asp door to chief
        ‘Mavuto handed the door to the chief’
    b. Mavuto a-na-perek-er-a mfumu chitseko
        Mavuto SM-past-hand-app-asp chief door
        ‘Mavuto handed the door to the chief’ (Baker 1988)
(8) a. entregué un coche a María
    b. le entregué un coche a María
(9) a. eskutitza Myriam-engana bidali du-te
        letter.the Miriam-allative send aux.3sO-3pS
    b. eskutitza Myriam-i bidali di-o-te
        letter.the Miriam-dat send aux.3sO-3sD-3pS
        ‘They sent the letter to Myriam’
(10) a. bi-musa-wia-ban ‘uide-’ay
        agr-cat-give-past child-to
        ‘I gave the cats to the child’
    b. ‘uide tam-musa-wia-ban
        child agr-cat-give-past
        ‘I gave the cat to the child’ (Rosen 1990)
Although for the purposes of this paper I will simply assume that all the sentences in (6)–(10) involve the same structure, note that one of the main reasons to think that the members of the pairs in (6)–(10) represent the same construction is the fact that all of these languages are subject to the PCC precisely in this alternation. Furthermore, all these pairs also share identical c-command asymmetries (Barss & Lasnik 1986), and the so-called possession or animacy restriction (Green 1974) exemplified in (11).
(11) a. I sent a letter to Paris
     b. *I sent Paris a letter
Although the existence of this restriction has been attested in more than 300 languages (see Albizu 1997), there are reasons to think that it is not universal. Specifically, it seems that this constraint does not show up at least in Japanese, Turkish, and Basque infinitive DOCs. For reasons of space, in Section 2.1 I will only consider Basque infinitives (but see Romero 1999 for a more detailed description including both Japanese and Turkish DOCs). In Section 2.2 I will propose that the lack of formal features in the languages that are not subject to the PCC can also be related to the way they drop arguments.

2.1 PCC in Basque infinitives

Consider the data in (12) from Basque (Laka 1993).
(12) a. *zuk ni etsaiari saldu na-i-o-zu
        you.erg me.nom enemy.dat sell 1sO-aux-3sD-2sS
        ‘You sold me to the enemy’
     b. gaizki iruditzen zait [zuk ni etsaiari saltzea]
        wrong seem aux [you.erg I.nom enemy.dat selling]
        ‘Your selling me to the enemy seems wrong to me’
In order to save (12a), Basque resorts to a different Case marking (allative, or others depending on the verb) and no dative agreement is triggered on the auxiliary (13b).
(13) a. *ni Myriam-i bidali na-i-o-te
        1abs Myriam-dat send 1sO-aux-3sD-3pS
        ‘They sent me to Myriam’
     b. ni Myriam-engana bidali na-u-te
        1abs Myriam-allative send 1sO-aux-3pS
        ‘They sent me to Myriam’
The contrast in (13) strongly resembles the NP/PP alternation in languages such as English. Furthermore, the structure in (13b) is precisely the one used in the sentences affected by the possession restriction:
(14) a. eskutitza Parise-ra bidali du-te
        letter Paris-allative send aux.3sO-3pS
        ‘They sent the letter to Paris’
     b. *eskutitza Paris-i bidali diote
        letter Paris-dative send aux.3sO-3sD-3pS
As a consequence, I will assume that (13a) is the Basque counterpart of the English DOC,6 and (13b) that of the to-construction. This analysis is appealing since Basque uses the allative construction exemplified in (13b) precisely in the same contexts where the to-construction is mandatory in ditransitive constructions in English (11a)/(14a). As shown in (12b), Basque infinitives do not show PCC effects. Note that the indirect object in (12b) is dative marked, and therefore patterns with (13a); that is, (12b) is an “infinitive” DOC. However, contrary to what happens in (12a), although a violation of the PCC is expected, the resulting sentence is grammatical. Finally, Basque infinitives clearly show that what is important for the computational component are the specifications of the attracting head ([+φ-features] in tensed clauses, [–φ-features] in infinitival clauses), and not those of the shifted NP, which are the same in both structures.

2.2 Agreement restrictions

An obvious way to account for the asymmetry between (12a) and (12b) is to relate it to the fact that non tensed clauses lack agreement morphology (see Laka 1993). Specifically, I propose that Basque infinitives lack the formal features that encode agreement, and, as a consequence, no agreement relation is established between the verb and its internal arguments. If this proposal is on the right track, the presence of φ features in a given language (for a given relation) can be tested by the effects of agreement restrictions such as the PCC. The contrasts previously exemplified denote a “small amount” of linguistic variation
that can be made dependent on this property. However, Romero (1999) argues that actually two former parameters can be reduced to the presence/absence of agreement features: the null topic parameter (Huang 1984), and the pro-drop parameter. Consider the sentences in (15)–(17), where different possibilities of dropping in Spanish, English, and Japanese are exemplified.
(15) a. María compró el billete
        María bought.3sS the ticket
     b. María lo compró
        María it bought.3sS
        ‘María bought it’
     c. compró el billete
        bought.3sS the ticket
        ‘He/she bought the ticket’
(16) a. Mary bought the ticket
     b. Mary bought it
     c. He/She bought the ticket
(17) a. Tanaka-wa Noriko-ni tegami-o okutta
        Tanaka-top Noriko-dat letter-acc sent
        ‘Tanaka sent Noriko the letter’
     b. Tanaka-wa tegami-o okutta
        Tanaka-top letter-acc sent
        ‘Tanaka sent her the letter’
     c. Tanaka-wa Noriko-ni okutta
        Tanaka-top Noriko-dat sent
        ‘Tanaka sent it to her’
     d. Noriko-ni tegami-o okutta
        Noriko-dat letter-acc sent
        ‘He/she sent Noriko the letter’
Each language displays a different pattern. In Spanish (15), NP-dropping requires an agreement marker or a clitic. In English (16) it requires pronouns. However, in Japanese (17) the arguments simply disappear in the phonetic output (actually, there are almost no 3rd person pronouns in this language). Since the seminal paper by Taraldsen (1978), the pro-drop parameter has been related to the presence of strong agreement, arguing that only strong agreement licenses an empty category. The intuition is that an argument can be dropped whenever its information is somehow recoverable from the agreement morphology (recoverability principle). However, languages such as Japanese or Turkish, or Basque infinitives, can freely drop arguments without resorting to any morphological agreement (Huang 1984). The hypothesis I propose is that thematic interpretation licenses NP-dropping. I assume that thematic relations are subject to some general principle of the kind argued for in Baker’s UTAH or Hale & Keyser’s recent hypothesis on argument structure. Under this view, the presence of an argument is fully predictable, and NPs can be freely dropped whenever the pragmatic conditions that govern dropping are met (see, for instance, Li & Thompson 1979). If this is correct, the question is what syntactic conditions, if any, apply to dropping. These restrictions, the licensing of empty categories, were a major topic in the Government and Binding framework. Since appealing to the phonetic content of a category (leaving aside traces) does not seem to be allowed within the MP, it is not clear how all of these proposals can be expressed in minimalist terms. As a point of departure I will adopt a very naive approach to this problem: let us say that only the material required for convergence is inserted in the numeration. From this perspective, Japanese or Turkish constitute the “unmarked” case: whenever the appropriate conditions are met, an empty category with the minimum possible amount of information (maybe a variable, as suggested by Huang) is inserted in the numeration. However, if we try to do the same in Spanish or English, the result is ungrammatical. This can be explained by the fact that these languages must satisfy an agreement relation. Therefore, some extra material must be inserted (a pro, a clitic, or a pronoun): specifically, the formal features required to delete the uninterpretable features encoded in the verb (v, T). In this sense, Spanish and English are subject essentially to the same constraint. This idea basically restates Huang’s (1984) proposal for empty categories in object position.
He argues that there are two different parameters: the null topic parameter and the pro-drop parameter. Null topic languages introduce a null operator in topic position that can license a variable. Chinese, Japanese or Korean belong to this group. In these languages strong agreement is not required for dropping an argument, since they have an alternative way of licensing an empty category: an operator-variable chain.7 The pro-drop parameter splits non null topic languages into two groups: those that can license a pro by means of strong agreement (Spanish, Basque), and those that do not have an agreement “strong enough” to license a pro (English, French). In the present proposal, null topic languages are those that do not have formal agreement features, and non null topic languages are those that must satisfy agreement relations. If the numeration is subject to an economy of insertion principle, the first option is always going to be preferred when possible. In fact, Campos
(1986) argues that in Spanish, under certain conditions, the null topic strategy is available. Finally, this system straightforwardly accounts for the question of the parameter hierarchy posed by Huang (1984). Under his view, no [+null topic]/[+pro-drop] language is allowed; [±pro-drop] languages constitute a sub-set of [–null topic] languages. This implies the existence of a hierarchy of parameters, and, as a consequence, a hierarchy of principles. No such problem arises in the model sketched here. In brief, the proposed system relies on the hypothesis that thematic projection is uniform and that it is thematic interpretation that makes it possible to recover the information missing in the phonetic output. Therefore, if no agreement relation needs to be satisfied, argument dropping is free. However, if a language has encoded formal agreement features, argument dropping will be restricted by the necessity of satisfying agreement relations. Furthermore, different languages express agreement relations in different ways that can impose additional restrictions on syntactic derivations, such as those that distinguish English from Spanish. Summarizing, linguistic variation can be explained in terms of the formal features encoded in each language. The presence of these features is available to the learner in the primary linguistic data. Furthermore, the relations established by these features may receive different morphosyntactic representations that can also affect syntactic computation.
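The logic of this proposal can be illustrated with a small sketch. The Python below is only an informal restatement of the pattern in (15)–(17), not part of the analysis itself: the record `Language`, its field `agreement_exponent` and the function `inserted_for_dropped_np` are hypothetical names introduced for illustration, and the assumption is simply that a language without formal agreement features inserts nothing beyond a minimal empty category, whereas a language with such features must insert whatever material expresses the agreement relation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Language:
    """Hypothetical record of how a language treats agreement."""
    name: str
    # None means the language encodes no formal agreement features
    # (an assumption made here for Japanese, following the text).
    agreement_exponent: Optional[str]


def inserted_for_dropped_np(lang: Language) -> str:
    """Return the material inserted in the numeration when an NP is dropped.

    Economy of insertion: only what convergence requires is inserted.
    """
    if lang.agreement_exponent is None:
        # No agreement relation to satisfy: a minimal empty category suffices.
        return "empty category"
    # An agreement relation must be satisfied, so extra material is needed.
    return lang.agreement_exponent


JAPANESE = Language("Japanese", None)
SPANISH = Language("Spanish", "agreement marker or clitic")
ENGLISH = Language("English", "pronoun")

for lang in (JAPANESE, SPANISH, ENGLISH):
    print(lang.name, "->", inserted_for_dropped_np(lang))
```

On this way of putting it, the difference between Spanish and English is not a separate parameter but a difference in how the same agreement requirement is expressed.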
3. Object movement and EPP

In the previous section I described the properties of the PCC. I argued that the PCC is an agreement restriction that constrains the combination of indirect and direct objects in languages that encode object agreement when this feature has the value [+person]. However, I did not provide any explanation for this restriction. In this section I will explore the role of the EPP feature, and I will propose that this feature is crucially involved in the PCC. Ormazabal and Romero (2000) (O&R) argue that the PCC is a movement restriction (cf. Anagnostopoulou to appear): there is only one landing site available for object movement, and two different EPP features that need to be satisfied. Specifically, O&R state the following:
i. The landing site for object movement is (Spec, V) (see also Romero 1997):
(18) [vP SUBJECT [ v [VP OBJECT [ V t OBJECT ]]]]
ii. The argument in (Spec, V) measures out the event in the sense of Tenny (1987); this explains why the argument in (Spec, V) must be specific (within a totally different framework, Koizumi 1993 makes a similar proposal);
iii. If more than one argument appears in (Spec, V), i.e. if there are multiple specifiers, the derivation crashes for I/C reasons: the aspectual interpretation of the sentence can only interpret one argument. Somehow this can be considered the aspectual counterpart of the thematic criterion.
Assuming this analysis, the nature of the EPP feature can be characterized according to the following properties: (i) it is a selectional interpretable feature; in the case at stake it is not lexical, but functional, and the projected structure enters into an aspectual interpretation; and (ii) its presence is independent of agreement features. This property is clearly shown in infinitives (Chomsky 1995). In the case of object movement, the mere existence of DOCs in Japanese or Basque infinitives also supports this conclusion. However, there must be something more to be said, since no agreement restrictions are expected from these properties. Recall that the PCC shows up if and only if the following conditions are met: (i) the verb encodes object agreement; (ii) object agreement has the value [+person] (animate/specific); and (iii) there is dative shift. This description leads us to the following question: why does non animate object agreement not trigger PCC effects? To answer this question I will first analyze object movement in transitive structures. It will be shown that, as is commonly assumed, an object need not be animate to undergo object movement. However, if it is animate, object movement becomes obligatory (when possible). As a consequence, I will argue that, in the same way as languages can determine whether verb agreement encodes person plus number but not, for instance, gender, languages can also associate person and EPP.
Note that this is not I/C (or A/P) motivated; it is just the way formal features are organized.

3.1 Scandinavian object shift

Descriptively speaking, object shift (OS) in Icelandic and other Scandinavian languages is testable with respect to the position of adverbs. In (19a), the object appears to the left of the adverb. However, as shown in (19b), this movement is not always available. Specifically, it has been argued that OS is dependent on verb movement. This constitutes the core of the so-called Holmberg’s Generalization (HG) (examples from Holmberg 1999).
(19) a. Jag kysste henne inte [VP t-kysste t-henne]
        I kissed her not
     b. *Jag har henne inte [VP kysst t-henne]
        I have her not kissed
This operation has several interesting properties. First, if the object is pronominal, OS is mandatory, at least in some languages (Icelandic, Danish, and some varieties of Norwegian). Therefore, in these languages, the in-situ counterpart of (19a) is ungrammatical (20). This property is clearly reminiscent of the PCC: whenever the object is pronominal, it has to undergo movement.
(20) *Jag kysste inte henne
     I kissed not her
Second, if the NP is ambiguous between a specific and a non specific interpretation, the specific interpretation is obtained in the raised position (21a), and the non specific one is obtained in situ (21b) (examples from Bobaljik & Thráinsson 1998). This fact is related to the event measurement properties: non-specific arguments cannot bound the event (see, for instance, Jackendoff 1995). The examples in (21) also show that object movement is not restricted to animate arguments.
(21) a. Ég las þrjár bækur ekki
        I read three book-pl not
        ‘I didn’t read three books’
     b. Ég las ekki þrjár bækur
        I read not three books
        ‘I didn’t read three books’
An intriguing property of this operation is found in verb topicalization environments (Holmberg 1999). In this construction, the verb is displaced to initial position, and it is contrastively interpreted. Interestingly, although there is an auxiliary, OS is allowed in this case:
(22) Kysst har jag henne inte (bara hållit henne i handen)
     kissed have I her not (only held her by the hand)
In the early nineties, HG was used in conjunction with the notion of equidistance: the object could only raise to (Spec, AgrO) if the verb moved first to AgrO, making the subject and the object equidistant from the OS position. However,
according to recent analyses, the object moves to (Spec, v) for agreement and Case checking. Therefore, if negation (or a VP adverb) is generated in a position above VP (or adjoined to VP), as is commonly assumed, this movement is not enough to account for the properties of OS.
(23) [vP Adv [vP Object [ Subject [ v [. . . t Object... ]]]]]
Therefore, OS implies an additional movement to some other position. Holmberg (1999) treats OS as a phonological operation triggered by a [–Focus] feature, and sensitive to the phonetic context. Holmberg notes that “not just an unmoved verb, but any phonologically visible category inside VP preceding the object position blocks OS”. The range of “phonologically visible categories” in Swedish includes prepositions (24a), indirect objects (24b), and particles (24c).
(24) a. *jag talade henne inte med t-henne
        I spoke her not with
     b. *jag gav den inte Elsa t-den
        I gave it not Elsa
     c. *dom kastade mej inte ut t-mej
        they threw me not out
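Holmberg’s observation can be restated as a simple check over the linear content of VP. The sketch below is only an illustration under strong simplifying assumptions: VP is modelled as a flat list of items in linear order, traces are marked as phonologically invisible, and the names `Item` and `object_shift_allowed` are hypothetical labels introduced here, not terms from Holmberg (1999).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One position inside VP (hypothetical flat representation)."""
    form: str
    visible: bool = True  # False for traces, which have no phonological content


def object_shift_allowed(vp, obj):
    """Return True if no phonologically visible item precedes the object in VP."""
    for item in vp:
        if item.visible and item.form == obj:
            return True   # reached the object; nothing visible preceded it
        if item.visible:
            return False  # a visible category precedes the object position
    raise ValueError("object not found in VP")


# (19a): the verb has moved out of VP, leaving only a trace before the object.
print(object_shift_allowed([Item("kysste", visible=False), Item("henne")], "henne"))
# (19b): the participle stays inside VP and blocks object shift.
print(object_shift_allowed([Item("kysst"), Item("henne")], "henne"))
# (24a): a preposition precedes the object position and blocks object shift.
print(object_shift_allowed([Item("med"), Item("henne")], "henne"))
```

The same check returns True for the verb topicalization case in (22), where the fronted participle leaves only a trace inside VP.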
However, Chomsky (1999) argues that if this movement were phonetic, we would not expect the kind of semantic effects we find in OS. As a consequence, Chomsky proposes a two step derivation. In the first step, the object moves to (Spec, v). In this position, the object intervenes between the subject and (Spec, T); therefore, it must move. This explains why object shift is always available when it is followed by wh-movement:
(25) Mary wonders what John t bought t
Chomsky argues that object shift languages such as Icelandic have a rule DISL, triggered by interpretive reasons, that moves the object to its final position. Since I assume that object shift moves the object to (Spec, V), I am not going to discuss the specific details of these proposals here. Contrary to Chomsky, I propose that it is the first step that is responsible for the interpretive properties of the construction. Recall that, assuming the properties assigned to the PCC, to obtain a DOC it is necessary to move the indirect object to (Spec, V). It is at this point that the PCC arises, and therefore it is this position that has the interpretive properties mentioned: the argument in (Spec, V) (i) measures out the event and (ii) is specific. Consider the sentences in (26):
(26) a. jag gav inte Elsa den (Swedish)
        I gave not Elsa it
     b. hann skilar bókasafninu aldrei bókunum (Icelandic)
        he returns the.library never the.books
Both sentences are instances of DOC in Swedish and Icelandic respectively. (26a) shows that the object movement position (Elsa) is below the adverb. Furthermore, (26b) shows that in DOCs it is the indirect object (bókasafninu) that can undergo OS. If so, object movement is, or can be, much more frequent than it is supposed to be, but it does not usually have phonetic consequences once the verb moves to v and the order VC is reconstructed. The consequences show up (i) in DOCs, since the indirect object precedes the direct object; (ii) when there is an additional rule, such as DISL, that moves the object to another position. Since the interpretive properties are satisfied in the first step, it seems natural to assume that this second step is not syntactic, but is sensitive to whatever features are involved in the first step. Two kinds of arguments support this conclusion: V-topicalization structures (22), and nominative OS movement. In V-topicalization structures, the verb is fronted, stranding the auxiliary. Following Holmberg (1999), since no phonetic material intervenes between the auxiliary and the object, the object can be dislocated to a higher position above negation:
(27) Kysst har jag henne inte (bara hållit henne i handen)
     kissed have I her not (only held her by the.hand)
Furthermore, Holmberg argues that this movement is not restricted to objects. A nominative experiencer argument can also undergo OS. Consider (28).
(28) mér líkar hún/tölvan ekki (Icelandic)
     me.dat like.3sg it/the.computer.nom not
     ‘I do not like it/the computer’
It is known that in this kind of sentence the dative is in subject position (see Boeckx 1998 and references therein). Therefore, although the experiencer triggers agreement with the verb, it is raised neither to subject position nor to object position. However, since it has the relevant properties, it can be affected by the same rule that dislocates objects to the left of negation. If this line of reasoning is correct, OS can be interpreted as a kind of clitic movement, possibly triggered by prosodic reasons.
3.2 EPP

In the previous section I have argued that an EPP feature can project a specifier in (Spec, V) in transitive clauses and attract a non animate object. Interestingly, its presence seems to be obligatory whenever the object is animate. In other analyses, this property may be made dependent on the phonological features of the argument. However, under this two step analysis, the first movement is purely syntactic. This fact correlates with many properties found in different languages. Consider Spanish. In Spanish, clitic doubling in general is barred for objects (29a). However, when the object is a pronominal, it is not only possible, but obligatory (29b).
(29) a. (*lo) vi el coche
        cl.3sO saw the car
        ‘I saw the car’
     b. *(lo) vi a él
        cl.3sO saw him
        ‘I saw him’
Likewise, most ergative languages are subject to a phenomenon called split ergativity. This process splits arguments between pronominals and NPs, and it basically consists in the following property: pronominals are accusative, and NPs absolutive; i.e., when the object is a pronoun, the language patterns with nominative-accusative languages. The PCC is just another instance of this general problem: animate objects do not behave as non animate objects. Suppose that formal features are not just a collection of features, but they are somehow ordered. Specifically, the presence of a certain feature in a certain position may imply the presence of another feature. In Romance languages, for instance, it is well known that N categories encode number and gender, and V categories encode person and number. Other languages, such as Arabic or Southern Tiwa, also include (or may include) gender features in V categories. Supposse, furthermore, that when object agreement encodes [person], it also necessarily encodes an EPP feature. Can this property explain the peculiar behavior of this kind of arguments? Consider the PCC effects. In an agreement object language, two sets of agreement features appear in the verb (since the formal features are ordered, we can assume that the EPP feature triggered by aspectual reasons actually corresponds to the EPP feature encode as part of the agreement cluster.): i.
If both sets are [+person], two EPP features project specifiers. Since the aspectual interpretation only tolerates one, the derivation crashes.
ii. If only one set is [+person], the indirect object raises to (Spec, V) and the derivation converges. The other argument agrees in situ with the verb.
In languages with no object agreement, the aspectual properties of the verb encode an EPP feature that forces object movement to (Spec, V).
Two important conclusions arise from this characterization. First, EPP is always an interpretable feature. As a consequence, its interpretive nature may impose conditions at the interface. Note that this analysis does not imply that multiple specifiers are always barred. Multiple specifiers are only barred whenever a configurational interpretation is assigned: thematic relations, aspectual relations, or temporal relations. Possibly this argument does not apply when multiple specifiers serve as landing sites for scope reasons.
The second conclusion concerns the interpretable/uninterpretable properties of formal features. If the EPP is interpretable, as proposed, interpretable features can be involved in movement operations. Therefore, it is their formal nature, and not their interpretability status, that makes formal features active in the computational component. I think this is a good result for at least the following reasons:
i. No “looking-forward” property needs to be invoked to trigger computational operations. If being (un)interpretable is an interface property, then it cannot be relevant throughout the derivation.
ii. It is not necessary to stipulate that every language shares the same set of formal features. Linguistic variation can in this way be linked to the formal features that each language encodes and clearly expresses. Formal features are the best candidates to be easily detectable properties of utterances.
iii. The computational component can be understood simply as a formal device that manipulates formal objects to create new objects for the interfaces. It is not required to encode substantive properties such as formal features from scratch.
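The two PCC cases distinguished above for object agreement languages (both agreement sets [+person] versus only one) can be restated as a small decision procedure. The following Python sketch is only an illustration under stated assumptions: the names `AgreementSet` and `derive` are hypothetical, and the constant `MAX_ASPECTUAL_SPECIFIERS` stands in for the claim that the aspectual interpretation tolerates a single specifier. Languages without object agreement are not modeled.

```python
from dataclasses import dataclass

# Hypothetical toy model; none of these names come from the text.
# Assumption: the aspectual interpretation tolerates exactly one specifier.
MAX_ASPECTUAL_SPECIFIERS = 1


@dataclass(frozen=True)
class AgreementSet:
    argument: str   # e.g. "indirect object"
    person: bool    # True if the set encodes [+person]

    @property
    def has_epp(self) -> bool:
        # Assumption taken from the text: object agreement that encodes
        # [person] necessarily encodes an EPP feature as well.
        return self.person


def derive(sets):
    """Return 'converges' or 'crashes' for a verb with these object agreement sets."""
    # Every EPP feature projects a specifier for its argument.
    specifiers = [s.argument for s in sets if s.has_epp]
    if len(specifiers) > MAX_ASPECTUAL_SPECIFIERS:
        return "crashes"
    return "converges"
```

For instance, `derive([AgreementSet("indirect object", True), AgreementSet("direct object", True)])` corresponds to case (i), and replacing the second `True` with `False` corresponds to case (ii).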
Notes
* This material was presented at the 22nd GLOW Colloquium in Berlin and at the Linguistic Seminar of the Instituto Universitario Ortega y Gasset (Madrid). I am very grateful to these audiences for helpful comments and discussion. I am also grateful to Violeta Demonte, Olga Fernández Soriano, Javier Ormazabal, and Miriam Uribe Etxebarria. This research has been partly supported by a Comunidad de Madrid Postdoctoral Fellowship 144/2001.

If this were the case, the derivation should be able to look forward, a very problematic move (Johnson and Lappin 1997; Collins 1997; Chomsky 1998).

The introduction of phases has slightly changed this picture. Chomsky (1999) argues that there is an evaluation process at strong phases, at the point where representations are handed over to the interfaces (see Section 3).

Romero (1999), in a somewhat different framework, proposes an analysis of the pro-drop parameter based on the phonetic properties of agreement features in different languages.

In the case of phonetic features, this set has been studied quite thoroughly, and there are reasons to believe that something similar can be postulated for semantic features (see Uriagereka 1997).

I am assuming with Hale and Keyser (1997) that by definition nominal items do not trigger Merge (see also Romero 1997). However, under a phase-based derivational system, uninterpretable feature deletion would be evaluated at the “next phase” (Chomsky 1999); therefore, an uninterpretable feature may remain unchecked until the end of the phase, where its presence would lead to a crashed derivation.

Itziar Laka (p.c.) has reported to me that Montoya (1988) reaches the same conclusion. Montoya shows that the dative c-commands the absolutive, but the absolutive c-commands the allative, etc. Unfortunately I have not had access to this paper.

Actually, parameter setting is necessary to exclude [±pro-drop] in Chinese, and Huang argues that it lacks agreement. I will come back to this in Section 4.
References
Albizu, P. (1997). The Syntax of Person Agreement. Doctoral dissertation, USC.
Anagnostopoulou, E. (1998). Dative Argument and Clitic Doubling. Ms., MIT.
Anagnostopoulou, E. (to appear). The syntax of ditransitives: Evidence from clitics. Berlin: Mouton de Gruyter.
Barss, A. & H. Lasnik (1986). A Note on Anaphora and Double Objects. Linguistic Inquiry, 17, 347–354.
Bobaljik, J. D. & H. Thráinsson (1998). Two Heads aren’t Always Better Than One. Syntax, 1 (1), 37–71.
Boeckx, C. (1998). Agreement Constraints in Icelandic and Elsewhere. Working Papers in Scandinavian Syntax, 62 (2), 1–35.
Bonet, E. (1991). Morphology after Syntax: Pronominal clitics in Romance. Doctoral dissertation, MIT.
Bonet, E. (1994). The Person-Case Constraint: A morphological approach. MIT Working Papers in Linguistics, 22, 33–52.
Campos, H. (1986). Indefinite Object Drop. Linguistic Inquiry, 12 (2), 354–359.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: The MIT Press.
Chomsky, N. (1998). Minimalist Inquiries: The framework. MIT Occasional Papers in Linguistics, 15.
Chomsky, N. (1999). Derivation by Phase. MIT Occasional Papers in Linguistics, 19.
Collins, C. (1997). Local Economy. Cambridge, MA: The MIT Press.
Demonte, V. (1991). Dos clases de objetos indirectos, la construcción de doble objeto y el alcance de las estructuras larsonianas. Ms., U. Autónoma de Madrid.
Demonte, V. (1995). Dative Alternation in Spanish. Probus, 7 (1), 5–30.
Green, G. (1974). Semantics and Syntactic Regularity. Bloomington: Indiana University Press.
Hale, K. & J. S. Keyser (1993). On Argument Structure and the Lexical Expression of Syntactic Relations. In K. Hale and S. J. Keyser (Eds.), The View from Building 20. Cambridge, MA: The MIT Press.
Hale, K. & J. S. Keyser (1997). On the Complex Nature of Simple Predicators. In A. Alsina, J. Bresnan and P. Sells (Eds.), Complex predicates (pp. 29–65). Stanford: CSLI Publications.
Harley, H. (1995). Subjects, Events, and Licensing. Doctoral dissertation, MIT.
Holmberg, A. (1999). Remarks on Holmberg’s Generalization. Studia Linguistica, 53 (1), 1–39.
Huang, J. C.-T. (1984). On the Distribution and Reference of Empty Pronouns. Linguistic Inquiry, 15 (4), 531–574.
Jackendoff, R. (1996). The Proper Treatment of Measuring out, Telicity, and Perhaps even Quantification in English. Natural Language and Linguistic Theory, 14, 305–354.
Johnson, D. & S. Lappin (1997). A Critique of the Minimalist Program. Linguistics and Philosophy, 20, 272–333.
Kayne, R. (1998). Overt vs. Covert Movement. Syntax, 1 (2), 128–191.
Koizumi, M. (1993). Object Agreement and the Split-VP Hypothesis. In J. Bobaljik and P. Collins (Eds.), Papers on Case and Agreement I, MIT Working Papers in Linguistics (pp. 99–148).
Koizumi, M. (1995). Phrase Structure in Minimalist Syntax. Doctoral dissertation, MIT.
Laka, I. (1993). The Structure of Inflection: A case study in X-zero syntax. In J. I. Hualde and J. Ortiz de Urbina (Eds.), Generative Studies in Basque Linguistics (pp. 21–70). Amsterdam: John Benjamins.
Li, C. N. & S. A. Thompson (1979). Third Person Pronouns and Zero-anaphora in Chinese Discourse. In T. Givón (Ed.), Syntax and Semantics, 19. Discourse and syntax (pp. 311–335). London: Academic Press.
Marantz, A. (1993). Implications of Asymmetries in Double Object Constructions. In S. A. Mchombo (Ed.), Theoretical Aspects of Bantu Grammar 1 (pp. 113–150). Stanford, CA: CSLI Publications.
Miyagawa, S. (1996). Word Order Restrictions and Nonconfigurationality. MITWPL, 29, 117–142.
Miyagawa, S. (1997). Against Optional Scrambling. Linguistic Inquiry, 28 (1), 1–25.
Montoya, E. (1998). Objectu dikoitzeko egiturak euskaraz. Ms., UPV/EHU.
Ormazábal, J. & J. Romero (1998). A Case against Case. Paper presented at GLOW, Tilburg.
Ormazábal, J. & J. Romero (2000). On Agreement Restrictions. Ms., UPV-UAM.
Rizzi, L. (1986). Null Objects in Italian and the Theory of pro. Linguistic Inquiry, 17 (3), 501–557.
Romero, J. (1997). Construcciones de Doble Objeto y Gramática Universal. Doctoral dissertation, U. Autónoma de Madrid.
Rosen, C. (1990). Rethinking Southern Tiwa: The geometry of a triple agreement language. Language, 66 (4), 669–713.
Taraldsen, T. (1978). On the NIC, Vacuous Application, and the That-trace Filter. Bloomington: Indiana University Linguistics Club.
Uriagereka, J. (1988). On Government. Doctoral dissertation, UConn., Storrs.
Uriagereka, J. (1997). Warps: Some thoughts on categorization. Cuadernos de Lingüística del I.U. Ortega y Gasset, 4, 1–24.
Intermediate traces, reconstruction and locality effects
Joachim Sabel
ZAS, Berlin

Introduction*
Since Chomsky (1986a), the concept of intermediate adjunction as a means of providing intermediate landing positions for wh-phrases has played an important role in explaining central properties of movement. Since then, successive-cyclic movement of a wh-phrase has been assumed to proceed via VP-adjunction before it ends up in Spec CP, as can be seen in (1):
(1) [CP What do you [VP t′ [VP like t]]]?
The intermediate adjunction hypothesis is independently supported by the fact that it accounts for weak crossover effects, locality phenomena, and reconstruction properties of moved elements with respect to scope and binding. In recent work, Chomsky (1999, 2000) assumes the so-called “phase-impenetrability condition”, which requires short movement steps in successive stages, yielding a strong form of subjacency. The idea is that the domain of a head X is not accessible to operations outside XP, but only X and its ‘edge’ are (the edge being the residue outside of X′, either Spec X or elements adjoined to XP). The phase-impenetrability condition requires that A′-movement target the edge of every ‘phase’, i.e. CP and νP (either Spec CP/νP or a position adjoined to CP/νP), and provides a theoretical motivation for the intermediate movement step in (1) (although it leaves open whether this movement targets an adjoined position or a specifier position).
In this article, I will focus on the concept of adjunction as an intermediate step in movement of XPs and X⁰-categories. Empirical arguments are presented showing that movement can never go into an intermediate adjoined position, but that it can only target an adjoined position as a goal position (used here to refer to a final destination for movement); i.e., an element that is moved into an adjoined position is “frozen in place.” Thus, I provide evidence for the constraint in (2):
(2) Constraint on Adjunction Movement (CAM)
    Movement may not proceed via intermediate adjunction.
It follows from (2) that elements may undergo successive-cyclic movement only via specifier positions. Adopting the relational definition of levels of projection (Chomsky 1995, among others), I assume that a specifier is a sister of a category with the features [–maximal, –minimal] (X′), whereas XP-adjunction creates a sister of a category with the features [+maximal, –minimal] (XP). Adjunction movement in the case of head movement creates a sister of a category with the features [–maximal, +minimal] (X⁰). Furthermore, I assume that different structural positions such as adjunction and specifier positions correlate with different intrinsic properties. For example, as will be argued in Section 3.2, adjunction movement as an instance of scrambling targets a position with A′-properties, i.e., it is a type of A′-movement, whereas scrambling to a specifier targets a position with A-properties, i.e., this type of scrambling is of the A-movement type.1
The article is organized as follows. In Section 2, I consider the theoretical and above-mentioned empirical arguments that motivated the intermediate adjunction hypothesis. Section 3 presents conceptual and empirical arguments against intermediate traces in adjoined positions. Consideration of different movement types, such as wh-movement, scrambling, A-movement, quantifier raising, and head movement, motivates the postulation of the constraint in (2), which, as I will argue, can be seen to follow from feature-checking requirements in the framework of Chomsky (1995). Furthermore, an alternative account is proposed for the wh-island phenomena discussed in Section 2 that motivated the intermediate adjunction hypothesis. This account, which is based on the framework of Chomsky (1995, Chapter 4), relies on the Uniformity Condition on Chains as well as on the assumption that C⁰ may project multiple specifiers which function as intermediate landing sites for extraction of wh-phrases. Finally, conclusions are formulated in Section 4.
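The relational definitions of specifier and adjoined positions given above, together with the constraint in (2), can be summarized in a short sketch. This is a minimal illustration, not part of the analysis itself: the names `Position`, `classify_sister` and `licit_intermediate_step` are hypothetical, and the feature pair [±maximal, ±minimal] is encoded as two booleans describing the sister of the moved element.

```python
from enum import Enum


class Position(Enum):
    SPECIFIER = "specifier"
    XP_ADJUNCT = "XP-adjoined position"
    HEAD_ADJUNCT = "head-adjoined position"


def classify_sister(maximal: bool, minimal: bool) -> Position:
    """Classify a landing site by the projection features of its sister."""
    if not maximal and not minimal:
        return Position.SPECIFIER      # sister is X' [-maximal, -minimal]
    if maximal and not minimal:
        return Position.XP_ADJUNCT     # sister is XP [+maximal, -minimal]
    if not maximal and minimal:
        return Position.HEAD_ADJUNCT   # sister is X0 [-maximal, +minimal]
    # The text does not define a landing site for this combination.
    raise ValueError("[+maximal, +minimal] sisters are not covered here")


def licit_intermediate_step(position: Position) -> bool:
    """CAM (2): movement may not proceed via intermediate adjunction."""
    return position is Position.SPECIFIER
```

Under these assumptions, an adjoined position can still be a final landing site; the function `licit_intermediate_step` only restricts which positions may be passed through.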
Evidence for intermediate adjunction
In this section, I will review several phenomena that motivated the assumption that movement via an intermediate adjoined position is necessary. The relevant arguments come from reconstruction effects with respect to binding and scope phenomena, as well as from weak crossover and locality effects. Interestingly, the concept of intermediate adjunction seems to play a central role in explaining these phenomena, irrespective of whether one adopts the ‘Barriers’ framework (Chomsky 1986a), the framework laid out in Chomsky and Lasnik (1993), or the Minimalist framework in Chomsky (1993, 1994, 1995, Chapter 4).2

Principle A reconstruction effects
Evidence for traces in intermediate positions adjoined to VP can be gained from reconstruction phenomena. Consider first the examples in (3), which provide evidence for intermediate traces in Spec CP (Barss 1986, 1988; Lebeaux 1991; Huang 1993; Chomsky 1995; among others):
(3) a. *They said [CP that I should talk to friends of each other].
    b. They wondered [CP [which friends of each other]i [IP I should talk to ti]].
    c. [CP [Which friends of each other]i did they say [CP t′i that [IP I should talk to ti]]]?
The anaphor in (3a) violates Principle A of the Binding Theory since the potential antecedent is not located in the embedded clause, which constitutes its relevant Binding Domain. If, as in (3b), the NP containing the anaphor is moved to Spec CP, i.e., into a position “closer” to its antecedent, Principle A is fulfilled and the sentence becomes grammatical. Turning now to (3c), we observe that the anaphor in the matrix Spec CP position should also violate Principle A because it is not c-commanded by its antecedent. Given that this sentence is nevertheless grammatical, it can be concluded that binding of an anaphor is possible if there is a movement site to which the anaphor may be reconstructed and in which it may be bound in accordance with Principle A. The relevant reconstruction site in (3c) is the position of the intermediate trace t′i, which is located in the intermediate Spec CP position. In this position the anaphor may be bound in accordance with Principle A, as can be seen in (3b).
Similar reconstruction phenomena can be observed in examples in which the wh-phrase containing the anaphor has crossed a wh-island (Takahashi 1994):
(4) a. *John wonders [where [Mary bought pictures of himself]].
    b. ??[Which pictures of himself]i does John t″i wonder [where [Mary t′i bought ti]]?
As with (3a), the anaphoric expression in (4a) may not be bound in its base position. Importantly, in contrast to (3b–c) it also cannot be bound in the intermediate Spec CP position, which is occupied by where. The only account for the slightly deviant character of (4b), i.e., for the possibility for John to act as a binder for himself, seems to be that there is a trace (t″i) in a VP-adjoined position of the matrix clause from which binding can be achieved. Thus (4) provides evidence for intermediate adjunction of wh-phrases to VP.
A similar argument for intermediate adjunction can be constructed in connection with scrambling. In analyses that treat scrambling as a movement phenomenon, scrambling is traditionally analyzed as Chomsky-adjunction to a maximal projection (Saito 1985, 1992). As can be seen from (5a), the embedded subject is the only possible antecedent for otagai ‘each other’.3 In (5b) the embedded object containing the reflexive is scrambled in front of the embedded subject. In this position the reflexive otagai may be bound by the matrix subject. Now consider (5c), where the NP containing the anaphor is scrambled out of the embedded clause in front of the matrix subject. In this case the matrix subject can also be co-referent with the anaphor. (5c) provides evidence that the scrambled element has moved through an IP-adjunction site in the embedded clause, which was its ultimate landing site in (5b) (Nemoto 1993: 93):4
(5) a. Joe-to Michael₁-ga [CP [IP karera-ga₂ Kate-ni [otagai*₁/₂-no hon]-o okutta to omotteiru]] (koto).
       Joe-and Michael-nom they-nom Kate-dat each other’s book-acc sent C⁰ thinking
       ‘Joe and Michael are thinking that they sent Kate each other’s book.’
    b. Joe-to Michael₁-ga [CP [IP otagai₁/₂-no hon-o [IP karera-ga₂ Kate-ni t okutta to omotteiru]]] (koto).
       Joe-and Michael-nom each other’s book-acc they-nom Kate-dat sent C⁰ thinking
    c. otagai₁/₂-no hon-o Joe-to Michael₁-ga [CP [IP t′ [IP karera-ga₂ Kate-ni t okutta to omotteiru]]] (koto).
       each other’s book-acc Joe-and Michael-nom they-nom Kate-dat sent C⁰ thinking
Hence adopting the intermediate adjunction hypothesis provides a unified account of reconstruction phenomena associated with binding properties in connection with wh-movement and scrambling.

Weak crossover
The absence of weak crossover effects in scrambling languages has been taken as evidence for intermediate adjunction. Several authors have argued that XP-movement proceeds via intermediate adjunction to IP in scrambling languages like German (Grewendorf 1988; Bayer 1993; Richards 1997; among others). This assumption was used to explain why (for many speakers) weak crossover effects are absent in contexts where they appear in non-scrambling languages like English. The contrast is illustrated in (6) vs. (7):
(6) [CP Weni hat [IP t′i [IP seinei Mutter [VP immer ti geküßt]]]]?
    who-acc has his mother always kissed
    ‘Who did his mother always kiss?’
(7) *[CP Whoi does [IP hisi mother often kiss ti]]?
The intermediate trace t′i in (6) is argued to be located in an IP-adjoined position with A-properties, which prevents both the pronoun and the variable ti from being locally A′-bound by the operator in Spec CP. Assuming intermediate adjunction therefore provides an answer to the question of why the weak crossover constraint (see Koopman & Sportiche 1982) is violated in (7) but not in (6).

Locality effects
The preceding sections have shown that weak crossover and certain binding phenomena are easily accounted for if the concept of intermediate adjunction is adopted. A further argument for the intermediate adjunction hypothesis comes from locality effects. Since Chomsky (1986a) it has been widely accepted that locality effects in connection with movement out of islands receive a straightforward explanation if intermediate adjunction is adopted. Before I discuss the relevant examples I will briefly review some technical notions. Recall that one theoretical motivation for assuming intermediate adjunction of wh-phrases in the ‘Barriers’ framework has to do with the aim of reducing the ECP to antecedent government.5 Consider for example the case of α = VP in (8).
(8) a. What do you [α like t]?
    b. What do you [α t′ [α like t]]?
The non-L-marked VP is assumed to be a barrier for antecedent government of t in the derivation (8a) (and IP would be a barrier by inheritance, see Note 5). This obviously has empirical shortcomings, since it would incorrectly block antecedent government of an object trace in a derivation like (8a). The problem arises in the framework of Chomsky (1986a) as well as in the analyses of Chomsky and Lasnik (1993) and Chomsky (1995, 2000), where the assumption is made that every non-complement is a barrier for an element β as long as it includes β. However, an element β adjoined to an element α is not completely within α. This is the case in (8b), where t′ is adjoined to VP. Here VP is neither a barrier between t′ and t nor between what and t′. Therefore, XPs that represent intermediate adjunction sites are not barriers for movement. Given the assumption that adjunction to non-arguments is possible (in contrast to arguments, see Chomsky 1986a, 1991, 1995) and that VP is not an argument, the wh-phrase adjoins to VP as in (8b) on its way to Spec CP. Consequently, every trace is antecedent governed. Thus by assuming the intermediate adjunction hypothesis, we correctly predict the absence of an ECP violation in (8b).
An account of the well-known asymmetries in (9) below is provided by the intermediate adjunction hypothesis in conjunction with the (independently supported) assumption that intermediate traces of arguments located in A′-positions are not licensed at LF and therefore must be deleted, whereas intermediate traces of adjuncts cannot be deleted (Lasnik and Saito 1984, 1992; Chomsky 1986a; Chomsky and Lasnik 1993):6
(9) a. ??What do you [VP t″ wonder [CP how John could [VP t′ [VP fix t]]]]?
    b. *How do you [VP t″ wonder [CP what John could [VP t′ [VP fix t]]]]?
    c. *Who do you [VP t′ wonder [CP how [IP t could fix the car]]]?
Only complements may be extracted out of wh-islands (9a), yielding a (mild) subjacency violation. Adjunct and subject extraction violate the ECP (Empty Category Principle) (9b–c). Let us first turn to (9a–b). The initial trace t of the extracted object and adjunct is antecedent governed and fulfills the ECP in (9a–b) for reasons already outlined in the discussion of (8b). However, the embedded IP is not L-marked and hence is a blocking category for t′, the trace adjoined to the embedded VP, and CP becomes a barrier by inheritance in both examples. Therefore, a barrier intervenes between t″ and t′ in (9a–b); hence t′ is not antecedent governed in either example. Following Chomsky and Lasnik, we can assume that this trace gets *-marked (the analogue of [–γ]-marking; i.e., the wh-phrase enters the numeration bearing a ‘*-feature’ that becomes ‘visible’ on one of its copies in (9a–c)). Importantly, *t′ remains at LF only in (9b), whereas it must be deleted in (9a), yielding the operator-variable pair (What, t). Therefore, the ECP is violated in (9b) but not in (9a). Let us now turn to (9c). In contrast to (9a), the variable t in (9c) is not antecedent governed by an intermediate trace (IP-adjunction being excluded by assumption) and is *-marked. Because t is a variable, it may not be deleted and violates the ECP. If intermediate adjunction to VP in the sentences of (9) were impossible, complement extraction across a wh-island should be as ungrammatical as adjunct and subject extraction, because the undeletable initial traces in all these examples would not be antecedent governed and would be *-marked. Hence, the concept of intermediate adjunction provides an explanation for the asymmetries found in (9).
Another strategy to account for the wh-island effects is pursued in Chomsky and Lasnik (1993). Adopting the basic idea of Rizzi’s (1990a) Relativized Minimality, they make no appeal to barrier theory for the explanation of (9). The data in (9) are then accounted for because movement of the wh-phrase does not proceed in a successive-cyclic way via Spec CP. The long-extracted wh-phrases fail to make the “shortest move” because they all skip Spec CP, which is a potential landing site. Hence, this movement violates the condition Minimize Chain Links (MCL), a derivational version of Relativized Minimality.7 Under these assumptions, the trace in Spec IP in (9c) is *-marked because the wh-phrase fails to make the shortest move (again, IP-adjunction is excluded by assumption). As with the former account in terms of barrier theory, this trace remains at LF because it represents the variable. In (9b) a uniform chain is created with one *-marked trace. Again, deletion of the *-marked trace may not apply. In (9a), on the other hand, a trace in a VP-adjoined position ensures that the variable is not *-marked. This trace itself is *-marked but deleted at LF. As with the barrier-theoretic analysis, intermediate adjunction to VP is necessary to account for the data in (9). Furthermore, given that MCL forces chain links to be minimal in length, VP-adjunction is obligatory under the assumption that VP represents a possible landing site.
To sum up, an account of the asymmetries with respect to wh-islands found in (9) can be given if one relies on the intermediate adjunction hypothesis in conjunction with the mechanism of intermediate trace deletion. Before I turn to the next piece of evidence for the intermediate adjunction hypothesis, I would like to clear up a potential objection that could be raised
against this account of locality effects, which relies on the deletion of intermediate traces. At first sight, this mechanism seems to raise a problem if at the same time it is assumed that the Binding Theory applies at LF, as in Chomsky (1995, Chapter 4). Consider again an example such as (3c), repeated here as (10).
(10) [CP [Which friends of each other]i did they say [CP t′i that [IP I should talk to ti]]]?
Recall that the anaphor inside the wh-element is not in a structurally adequate position to fulfill Principle A of the Binding Theory; however, when located in the intermediate Spec CP position, the anaphor may be bound by its antecedent. As pointed out in Chomsky (1993, 1995), adopting the copy theory of movement has the advantage that one need not assume literal reconstruction for the purposes of the Binding Theory, i.e., that the wh-element in (10) is lowered into the intermediate Spec CP position. Assuming that a copy of the wh-phrase is located in the intermediate Spec CP, Chomsky (1993, 1995) assumes (following the analysis in Lebeaux 1983, 1985) that in (10) the anaphor inside the intermediate copy (t′i) fulfills the Binding Theory at LF. Note that we now seem to face a problem: at LF the anaphor inside the intermediate trace (copy) in (10) can fulfill the Binding Theory only if intermediate traces (or copies) are not deleted at LF. On the other hand, the Uniformity Condition on Chains forces intermediate trace deletion in (10) (see Note 6). A potential solution to this paradox is to reject the assumption that the Binding Theory applies only at LF and to assume that Principle A of the Binding Theory can be stated in derivational terms, as in (11) (as has been argued by several authors, cf. Belletti and Rizzi 1988; Uriagereka 1988; Lebeaux 1991; Sabel 1996; among others).
(11) Principle A of the Binding Theory can be fulfilled at any stage of the derivation.
A mechanical instantiation of this idea might be worked out by assuming that anaphors enter the numeration with a kind of “binding-feature” that needs to be visible at the LF-interface. Visibility is achieved if the anaphor is bound (understood here as “checked”) at one step of the derivation in the relevant domain under a certain indexing I. Given (11), the anaphor in (10) fulfills Principle A at one step of the derivation, i.e., at the step where the wh-phrase is located in the intermediate Spec CP position, making an additional syntactic “reconstruction operation” at LF (however it is understood) superfluous for the purposes of Binding Theory.8
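The interaction of *-marking and intermediate trace deletion that underlies the account of (9) in this section can be made concrete with a small sketch. It is a simplified illustration under explicit assumptions, not a formalization of the cited frameworks: the names `Trace`, `lf_chain` and `violates_ecp` are hypothetical, the labels t, t', t'' are used only to tell the traces apart, every intermediate trace is taken to sit in an A′-position, and the matrix VP-adjoined trace is assumed not to be *-marked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    label: str         # illustrative label only: "t", "t'", "t''"
    starred: bool      # *-marked, i.e. not antecedent governed
    is_variable: bool  # the initial trace bound by the operator


def lf_chain(traces, argument_chain):
    """Return the traces that survive at LF.

    Assumption: intermediate traces of argument chains must be deleted,
    whereas intermediate traces of adjunct chains cannot be deleted.
    """
    if argument_chain:
        return [t for t in traces if t.is_variable]
    return list(traces)


def violates_ecp(traces, argument_chain):
    """The ECP is violated if a *-marked trace survives at LF."""
    return any(t.starred for t in lf_chain(traces, argument_chain))


# Chains listed from highest to lowest trace, following (9a-c).
ex_9a = [Trace("t''", False, False), Trace("t'", True, False), Trace("t", False, True)]
ex_9b = [Trace("t''", False, False), Trace("t'", True, False), Trace("t", False, True)]
ex_9c = [Trace("t'", False, False), Trace("t", True, True)]

results = {
    "9a": violates_ecp(ex_9a, argument_chain=True),   # object extraction
    "9b": violates_ecp(ex_9b, argument_chain=False),  # adjunct extraction
    "9c": violates_ecp(ex_9c, argument_chain=True),   # subject extraction
}
```

On these assumptions only (9a) escapes an ECP violation, since its single *-marked trace is an intermediate trace of an argument chain and is therefore deleted. The mild subjacency effect in (9a) is not modeled.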
To sum up, in this section I have shown that an account of locality effects with respect to wh-islands can be given if one relies on the intermediate adjunction hypothesis in conjunction with the mechanism of intermediate trace deletion.

Scope reconstruction
Further evidence for intermediate adjunction to VP can be gained from wh-quantifier interactions and scope reconstruction facts. The sentences in (12) are not ambiguous. A wh-phrase has crossed a weak island and cannot be interpreted as being in the scope of the universal quantifier (the examples (12) and (14) are from John Frampton, mentioned in Cheng 1991: 185; see also Longobardi 1987, Cinque 1990, Saito 1994c, and Frampton 1999 for discussion):9
(12) a. Which books do you wonder whether every student read?
     b. Which books don’t you know that every student read?
     c. Which book didn’t every student think that his teacher wrote?
Following Rizzi (1992), the Spec position of NegP can be analyzed as an A′-position filled with a sentential negation operator (that is phonetically unrealized in English). Let us assume that a wh-phrase that crosses Spec NegP violates MCL. Assuming intermediate adjunction, movement from the matrix VP position to Spec CP in (12b–c) leaves a *-marked trace, as shown in (13b–c):
(13) a. Which books do you [VP t″ wonder [CP whether every student [VP *t′ read t]]]?
     b. Which books don’t you [VP *t‴ know [CP t″ that every student [VP t′ read t]]]?
     c. Which book didn’t every student [VP *t‴ think [CP t″ that his teacher [VP t′ wrote t]]]?
The data in (13) can then be explained if we assume that a *-marked trace blocks scope reconstruction in the sense that a wh-phrase may not be reconstructed into a *-marked trace position or into a trace position that is/was c-commanded by a coindexed *-marked trace. Therefore, the wh-phrases in (13) may not be interpreted as being in the scope of the universal quantifier. Now consider the examples in (14):
(14) a. ??Which book did every student [VP t″ wonder [CP whether his teacher [VP *t′ wrote t]]]?
     b. Which book did every student [VP t‴ think [CP t″ that his teacher didn’t [VP *t′ write t]]]?
In contrast to (13), the examples in (14) are ambiguous. Nothing excludes the possibility that the wh-phrase in (14a) could be interpreted in the scope of the universal quantifier in the matrix clause, given that the trace adjoined to the matrix VP is neither *-marked nor c-commanded by a *-marked trace. Although (14b) can be accounted for in a different way, i.e., by assuming that Spec CP is the relevant reconstruction site for the wh-phrase, (14a) is an example that might be taken as further evidence for the claim that (long) wh-movement proceeds via adjunction to VP. Again, assuming a derivation which contains an intermediate trace in an adjoined position provides an account of the asymmetry found in (13)–(14).
A further argument for intermediate traces in VP-adjoined positions comes from scope reconstruction facts in connection with variable binding. It is well known that the interpretation of a pronoun as a bound variable is only possible if the pronoun is bound by a quantificational antecedent. Fox (1999) assumes that variable binding has to be syntactically encoded at LF and that it (potentially) triggers reconstruction. He argues that the contrasts between the examples in (15) provide an argument for intermediate VP-adjunction of wh-phrases. In (15a), reconstruction of the wh-phrase into the trace position is possible, licensing the bound variable reading of the pronoun he. A different situation arises in (15b). Reconstruction leads to a configuration in which the referential expression Ms. Brown is bound by the pronoun she, giving rise to a violation of Principle C of the Binding Theory. The crucial example is (15c). As is evident from (15b), the position *t in (15c) is not a possible reconstruction site. If reconstruction took place into such a position, this would incorrectly predict (15c) to yield a Principle C violation like (15b). Therefore, in order to account for the well-formedness of (15c) it seems to be necessary to postulate an intermediate trace t′ in a VP-adjoined position that represents a possible reconstruction site.
(15) a. [Which of the books that he₁ asked her₂ for] did Ms. Brown₂ [VP give every student₁ t]?
     b. *[Which of the books that he₁ asked Ms. Brown₂ for] did she₂ [VP give every student₁ *t]?
     c. [Which of the books that he₁ asked Ms. Brown₂ for] did every student₁ [VP t′ get from her₂ *t]?
To sum up, in Sections 2.1–2.4, I have presented arguments in favor of the assumption that movement proceeds via intermediate adjunction to VP and IP.
Intermediate traces, reconstruction and locality effects
The intermediate adjunction hypothesis elegantly accounts for weak crossover effects, locality phenomena, and reconstruction properties of moved elements with respect to scope and binding. In the following, however, I will argue that there is a large amount of theoretical and empirical evidence showing that the concept of intermediate adjunction overgenerates and makes incorrect predictions.10
3. Arguments against intermediate adjunction
If we assume that the intermediate adjunction hypothesis holds, several ad hoc devices are needed to constrain the unrestricted use of the mechanism. However, as we will see in the next sections, these devices are motivated only by the effect that has to be achieved, and they do not follow from strictly minimalist assumptions. Furthermore, a number of empirical problems result from the intermediate adjunction hypothesis. This suggests that it is preferable to find another explanation for the phenomena already discussed. In the following, I will present arguments against intermediate adjunction from distinct movement types, among them wh-movement, quantifier raising, scrambling, A-movement, empty operator movement, and head movement. In addition, I will try to show that the arguments against intermediate adjunction do not depend on a particular theoretical framework. They hold independently of whether the ‘Barriers’ framework (Chomsky 1986a) is adopted, the framework laid out in Chomsky and Lasnik (1993), or the Minimalist framework (Chomsky 1995, 2000). Abandoning the intermediate adjunction hypothesis implies that the phenomena mentioned in Section 2 have to be accounted for in a different way. Alternative explanations of the relevant phenomena will be explored in the following discussion. Let us now start with the problems that arise if we assume intermediate adjunction in wh-movement constructions.
3.1 Wh-movement and intermediate adjunction
3.1.1 Adjunct islands
Chomsky (1986a: 66) notes that intermediate adjunction of adjuncts to adjuncts must be prohibited; otherwise no barriers are crossed in (16), and the sentence should be grammatical.
Joachim Sabel
(16) *Where did they [VP t [VP leave Boston [PP t [PP before [CP (t ) PRO [VP t [VP meeting John t]]]]]]]?
This problem arises in the framework of Chomsky (1986a) as well as in the analysis of Chomsky and Lasnik (1993) and Chomsky (1995, 2000) where the assumption is made that every non-complement is a barrier for an element β as long as it includes β. An element β adjoined to an element α is not completely within α. This is the case in (16): t is adjoined to the adjunct PP, hence PP fails to be a barrier for t. The constraint that adjuncts may not be adjoined to adjuncts correctly rules out the derivation in (16), but it is not restrictive enough.11 It does not exclude intermediate adjunction of arguments to adjuncts. Hence, the sentences in (17) are derivable without wh-movement crossing any barriers (cf. Browning 1987: 327; Johnson 1988; Coopmans 1988, 1990; Lightfoot & Weinberg 1988; Clark 1990):
(17) a. *What did Sam [VP t [VP go out [PP t [PP without [CP (t ) PRO [VP t [VP talking about t]]]]]]]?
     b. *Who did they [VP t [VP leave Boston [PP t [PP before [CP (t ) PRO [VP t [VP meeting t ]]]]]]]?
In order to rescue the intermediate adjunction hypothesis one might argue that (16)–(17) could be ruled out by another independently motivated constraint, i.e., a constraint that excludes movement from Spec CP to an adjoined position as a case of Improper movement (Hoekstra & Bennis 1989; Müller & Sternefeld 1993; Grewendorf & Sabel 1994). Note that this constraint could be argued to be motivated anyway, given that movement from Spec CP to an A-position must be similarly ruled out (Chomsky 1981). For an illustration of this fact, consider the following cases of illicit long A-movement:
(18) a. i. *John was decided [IP t [IP t to leave at noon]].
        ii. *John was decided [CP t [IP t to leave at noon]].
     b. i. *John was expected [CP (t ) that [IP it was told t [that Mary is beautiful]]].
        ii. *Who was expected [CP t that [IP it was told t [that Mary is beautiful]]]?
Chomsky (1995: 326) excludes (18a-i) by arguing that the complement of decide requires a PRO subject. Given that John cannot bear Null Case, a feature mismatch arises, which cancels the derivation (Chomsky 1995: 309). The same explanation rules out the derivation (18a-ii). On the other hand, example (18b-i) is ruled out by the Minimal Link Condition (MLC) (see Chomsky 1995: 296, 311, 358 for discussion):
(19) a. Minimal Link Condition (MLC)
        K attracts α only if there is no β, β closer to K than α, such that K attracts β.
     b. Closeness
        β is closer to the target K than α if β c-commands α.
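The effect of (19) can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: candidates are listed top-down, so that an earlier item c-commands (and is therefore closer than) every later one, and the feature label "D" is a hypothetical stand-in for whatever property the matrix I0 attracts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    features: frozenset  # hypothetical feature labels, e.g. {"D", "wh"}


def attract(feature, candidates):
    """Return the closest candidate bearing `feature`, or None.

    `candidates` is ordered top-down: an earlier item c-commands every
    later one and so counts as closer to the target K, as in (19b).
    """
    for item in candidates:
        if feature in item.features:
            return item
    return None


# (18b-i): the expletive c-commands John, so it is the closer element.
it = Item("it", frozenset({"D"}))
john = Item("John", frozenset({"D"}))
closest_18b_i = attract("D", [it, john])

# (18b-ii): who sits in the embedded Spec CP, above the expletive.
who = Item("who", frozenset({"D", "wh"}))
closest_18b_ii = attract("D", [who, it])

print(closest_18b_i.name, closest_18b_ii.name)
```

Under these assumptions the first call picks the expletive rather than John, and the second picks who, which is the configuration in which the MLC alone fails to block superraising.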
According to (19), the matrix I0 in (18b-i) attracts the closest element. The expletive it is closer than John to the target I0. Hence, the derivation (18b-i) cannot be generated and John remains in its base position without having its Case checked overtly. Therefore, superraising as in (18b-i) is excluded.12 Note that the accounts for (18a) and (18b-i) do not extend to (18b-ii) because who, in contrast to it, has to check nominative case, in addition to a wh-feature. Who is attracted by an operator feature of the embedded C0 and moves into the embedded Spec CP (see Collins 1993, Ferguson & Groat 1994, and Sabel 2000, for this assumption). In this position it is closer to the matrix I0 than the expletive it. Hence superraising should be possible in (18b-ii), an unwelcome result. On the other hand, the Improper movement analysis, i.e., that movement from Spec CP to an A-position is impossible, rules out (18b-ii). Let us therefore assume that this constraint holds and that for similar reasons movement from Spec CP to an adjoined position is impossible (cf. also Note 4). Is the Improper movement account sufficient to exclude extraction out of adjuncts? No, this constraint is only a necessary but not a sufficient condition on successive-cyclic movement. The ban on movement from Spec CP to an adjoined position does not rule out the examples in (20), which should be perfectly acceptable if intermediate adjunction of arguments to adjuncts were possible.
(20) a. *Which movie did you [VP t [VP sleep [PP t [PP during t]]]]?
     b. *Which city did you [VP t [VP sleep in your bed [PP t [PP in t]]]]?
In (20) no barriers are crossed at all. To exclude these examples we have to state that intermediate adjunction is impossible. But if intermediate adjunction must be disallowed to account for (20), then it is preferable to state that it is impossible in general, and the sentences in (16)–(17), (20) can be excluded as CED (Condition on Extraction Domains) violations i.e., by assuming that adjuncts like other non-complements are barriers for movement.13
3.1.2 Intermediate adjunction to IP
In adopting the intermediate adjunction hypothesis, it has been stipulated that adjoining wh-phrases to IP is forbidden, in contrast to other operators (Chomsky 1986a: 5, 32). In order to see why this constraint is needed, let us consider the case of wh- and subject-islands:
(21) *Who do you [VP t wonder [CP how [IP t [IP t could fix the car]]]]?
(22) *Who do you t think [CP (t ) that [IP t [IP [NP pictures of t] are on sale]]]?
If adjunction to IP is possible, no barriers are crossed in (21), since IP is not a blocking category for t and CP therefore fails to become a barrier by inheritance. Hence, adjunction to IP must be excluded. The situation is similar if we adopt the framework of Chomsky and Lasnik (1993), i.e., the account for (21) in terms of Minimize Chain Links. If the IP-adjoined position in (21) counts as a possible landing site, the variable in Spec IP will not be *-marked, and we would incorrectly predict that (21) only violates subjacency. It is also necessary to exclude IP-adjunction in (22). Given the derivation with IP-adjunction in (22), the wh-argument has crossed only one barrier, i.e., the non-L-marked embedded subject, and we would expect that (22) only represents a mild subjacency violation. Although the ban on IP-adjunction is necessary to explain the ungrammaticality of (21)–(22), it need not be stated as a special condition; it can be subsumed under the broader constraint according to which intermediate adjunction is impossible in general. This general constraint on adjunction movement has the further advantage that it also excludes intermediate adjunction to an argument, such as the embedded subject in (22). If intermediate adjunction to the subject NP in (22) were possible, the wh-phrase would not cross a single barrier. Therefore, it was stated as a separate condition (Chomsky 1986a: 6) that adjunction to arguments is disallowed. But this constraint can also be subsumed under the more general constraint on adjunction movement, which also covers the ban on intermediate adjunction to adjuncts, mentioned in the preceding section.
Furthermore, the more general constraint still allows for non-intermediate adjunction to IP and subjects in the case of quantifier raising and inverse linking (May 1977, 1985) and it explicitly does not exclude base-generated adjunction of adjuncts (or relative clauses) to arguments (or base adjunction in the case of “inverted” subjects).14
However, if intermediate adjunction is impossible in general, one could object that we lose the account for the already mentioned contrast in (6) vs. (7), repeated here as (23a–b). As already pointed out in Section 2.2, intermediate adjunction to IP seems to be necessary to explain the absence of weak crossover effects in scrambling languages like German. Given the ban on intermediate adjunction, the contrast in (23a–b) has to be accounted for in another way.
(23) a. [CP Wen_i hat [IP t_i [IP seine_i Mutter [VP immer t_i geküßt]]]]?
        who_acc has his mother always kissed
        ‘Who did his mother always kiss?’
     b. *[CP Who_i does [IP his_i mother often kiss t_i]]?
In fact, scrambling languages such as Polish provide evidence that – contrary to what has been claimed in the literature – the absence of a weak crossover effect in (23a) is not due to an IP-adjoined intermediate trace. Polish, like German, is a scrambling language (Zabrocki 1981; Willim 1989; among others). Scrambling to IP (and VP) is possible without giving rise to weak crossover violations, as shown in (24b) (Willim 1989: 132; Anna Bondaruk p.c.):
(24) a. *Jego_i uczniowie podziwiają każdego nauczyciela_i.
        his pupils admire every teacher_acc
        ‘His pupils admire every teacher.’
     b. [IP Każdego nauczyciela_i [IP jego_i uczniowie podziwiają t]].
        every teacher_acc his pupils admire
        ‘His pupils admire every teacher.’
If intermediate adjunction to IP were the reason for the absence of weak crossover effects with wh-movement in German (23a), we would expect Polish to behave like German. However, in contrast to German, Polish shows weak crossover effects with wh-movement (Willim 1989):
(25) *Kogo_i [IP [jego_i matka] zobaczyła t]?
     who his mother see
     ‘Who did his mother see?’
Whatever account is ultimately adopted for the cross-linguistic variations in weak crossover facts, the data from Polish show that the absence of weak crossover effects in German cannot be explained in a way that relies on the intermediate adjunction hypothesis.15 This conclusion gains further independent support from constructions in Polish containing multiple wh-words. Polish is a multiple wh-fronting language where the IP-adjoined position is an overtly manifested final destination for movement of wh-phrases. In Polish, all wh-phrases are obligatorily fronted (26a). The first wh-word is fronted to Spec CP, whereas the other wh-words adjoin to IP, as illustrated in (26b) (Rudin 1988):
(26) a. Co komu Monika dała t t?
        what to-whom Monica gave
        ‘What did Monica give to whom?’
     b. [CP Co [IP komu [IP Monika . . .]]]
Given that only one wh-phrase may move to Spec CP in Polish and that further wh-elements in a sentence containing multiple wh-words obligatorily front to IP-adjoined positions, as in (26), the ban on intermediate adjunction predicts that long wh-movement of a wh-phrase located in an adjoined position should be impossible. This is in fact the case, as can be seen from (27) (Rudin 1988: 454). As shown in (27a), a subjunctive clause permits long extraction in Polish, but only one wh-element may be long fronted (27b). Given that one of the wh-elements in (27) is moved to the embedded Spec CP position and the other is adjoined to the embedded IP, only the former wh-phrase may move from the embedded to the matrix Spec CP:
(27) a. Co Maria chce [CP żeby komu Janek kupił t t]?
        what Maria wants that to whom Janek buy
     b. *Co komu Maria chce [CP żeby Janek kupił t t]?
        what to whom Maria wants that Janek buy
        ‘What does Maria want Janek to buy for whom?’
If intermediate adjunction to IP is prohibited, it is expected that the second wh-element is not allowed to move successive-cyclically from the embedded clause into the matrix clause. Hence, wh-movement in Polish provides independent empirical evidence for the ban on intermediate adjunction of wh-elements to IP.16
3.1.3 Copy movement and intermediate adjunction to VP
Let us now turn to intermediate VP-adjunction. If intermediate adjunction is generally impossible, traces in VP-adjoined positions cannot be generated. This conclusion solves a long-standing empirical problem in languages such as German and Afrikaans (du Plessis 1977), which allow for the spelling-out of traces of successive-cyclic wh-movement (see (28a) vs. (28b)). The fact that these copies may only be spelled out in Spec CP (28b) but never in VP-adjoined positions, as shown in (29), is explained if intermediate adjunction to VP is impossible.17
(28) a. [CP Wen glaubst du [CP t meinte_v Hans t_v [CP t dass das Argument t überzeugt]]]?
        who believe you thought Hans that the argument convinces
        ‘Who do you believe that Hans thought that the argument convinces?’
     b. [CP Wen glaubst du [CP wen [IP Hans meinte [CP wen [IP das Argument t überzeugt]]]]]?
        who believe you who Hans thought who the argument convinces
(29) *[CP Wen glaubst du [CP wen [IP Hans wen meinte [CP wen [IP das Argument (wen) überzeugt]]]]]?
     who believe you who Hans who thought who the argument who convinces
     ‘Who do you believe that Hans thought that the argument convinces?’
Furthermore, dispensing with VP-adjunction has advantages from a theoretical point of view if we give up the barrier theory that made VP-adjunction necessary.18 Thus, for example, the adoption of the condition Minimize Chain Links in Chomsky and Lasnik (1993: 540ff.) to account for wh-island violations makes the notion of a barrier by inheritance superfluous and gives the possibility of simplifying the theory of barriers along the lines of Huang’s (1982) Condition on Extraction Domains (CED). I will adopt this idea and assume in the following that every non-complement is a barrier (cf. also Cinque 1990; Sabel 2002). This revised notion of barrier makes VP-adjunction superfluous for elementary cases of extraction such as object extraction in (28) and it correctly predicts that adjuncts and subjects are islands for extraction (as long as they do not represent intermediate adjunction sites).
3.1.4 Pronoun binding and intermediate adjunction to VP
If VP-adjunction is excluded, the question remains of how to account for the data in Section 2.4 involving reconstruction and variable binding. As argued in Fox (1999), *t in (15b) and (15c), repeated here as (30a) and (30b), is not a possible reconstruction site.
(30) a. *[Which of the books that he1 asked Ms. Brown2 for] did she2 [VP give every student1 *t]?
     b. [Which of the books that he1 asked Ms. Brown2 for] did every student1 [VP t get from her2 *t]?
To account for the grammaticality of (30b), he assumes the presence of an intermediate trace t in VP-adjoined position that serves as a reconstruction site yielding the bound variable reading of the pronoun inside the wh-phrase. There are two reasons to reject this argument for intermediate VP-adjunction. Firstly, the argument depends on the claim that *t in (30b), like *t in (30a), is not a possible reconstruction site. But, as can be seen from (30′b), *t in (30b) is a possible reconstruction site.
(30′) a. *Did she2 give every student1 [the books that he1 asked Ms. Brown2 for]?
      b. Did every student1 get from her2 [the books that he1 asked Ms. Brown2 for]?
Secondly, (30″a) and (30a) are equally bad, although the wh-phrase in (30″a) does not contain a pronoun. Therefore reconstruction should not be forced in (30″a). This shows that the Principle C effect in (30a) does not result from reconstruction. I conclude that the distribution of Principle C violations in (30) is not due to reconstruction of the wh-phrase (i.e., either into the position t or into an intermediate VP-adjoined position t ).
(30″) a. *[Which of the books that John asked Ms. Brown1 for] did she1 [VP give every student t]?
      b. [Which of the books that John asked Ms. Brown1 for] did every student [VP get from her1 t]?
The appearance of Principle C effects is caused by many different factors; for example, the depth of embedding of the pronoun in the matrix clause seems to be relevant in (30) (see also Huang 1993 for discussion). Additionally, the licensing of a bound pronoun variable in example (30b) cannot be taken as an argument for reconstruction. The possibility of the bound variable reading in this example as well as in (15a) can be explained independently of reconstruction. Following Reinhart (1983), who argues that bound pronouns can be treated as a subcase of Principle A of the Binding Theory, I would like to suggest an account that parallels the derivational version of Principle A of the Binding Theory in (11).
(11′) A pronoun that is A-bound (in accordance with Principle B of the Binding Theory) by a quantificational antecedent at any stage of the derivation is interpreted as a bound variable.
This provides an alternative account for the analysis of the examples (15a) and (30). Given that the pronouns in these examples are all A-bound before wh-movement takes place, reconstruction is not necessary.19
The discussion so far has shown that the ban on intermediate adjunction to adjuncts, arguments, VP, and IP should be subsumed under a more general constraint, according to which intermediate adjunction in general is impossible. This conclusion is empirically supported and represents a welcome conceptual simplification. I will therefore assume that the following constraint holds:
(31) Constraint on Adjunction Movement (CAM)
     Movement may not proceed via intermediate adjunction.
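As a rough illustration of what (31) requires, the constraint can be stated as a check on the sequence of landing sites of a moved element. The sketch below is not part of the analysis itself: the function name, the position labels, and the two-way classification of landing sites ("spec" vs. "adjoined") are assumptions made for the example. The base position is deliberately left out of the path, since the CAM restricts landing sites only.

```python
def obeys_cam(landing_sites):
    """Check a movement path against the CAM in (31).

    `landing_sites` lists the positions an element moves into, in
    derivational order, as (label, kind) pairs with kind "spec" or
    "adjoined". Only the last landing site may be an adjoined position.
    """
    intermediate = landing_sites[:-1]
    return all(kind != "adjoined" for _, kind in intermediate)


# (26b): komu adjoins to IP as its final destination.
short_fronting = [("embedded IP", "adjoined")]

# (27b): komu would have to leave an IP-adjoined position again.
long_fronting = [("embedded IP", "adjoined"), ("matrix IP", "adjoined")]

# Successive-cyclic movement through specifiers only.
spec_to_spec = [("embedded Spec CP", "spec"), ("matrix Spec CP", "spec")]

print(obeys_cam(short_fronting), obeys_cam(long_fronting), obeys_cam(spec_to_spec))
```

On these assumptions the Polish short fronting in (26b) and spec-to-spec movement pass the check, whereas the long fronting of the IP-adjoined wh-phrase in (27b) does not.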
(31) ensures that movement can never target an intermediate adjoined position, but only an adjoined position which represents a goal position (a final destination for movement). According to the CAM, elements that are base-generated in adjoined positions may be long extracted via successive-cyclic movement as long as no intermediate trace in an adjoined position is created, as is the case, for example, with extraction of base-generated adjuncts and inverted subjects in null subject languages. The CAM predicts that an element α may undergo successive-cyclic movement only via specifier positions. Alternatively, α may be moved if it is adjoined to an element β, which itself undergoes movement. The latter option represents a typical case of head movement or multiple fronting (see Note 38). However, the data discussed in Section 2, which were taken as evidence for intermediate adjunction to VP, have been only partially explained in a way compatible with the CAM. In the following, it will be shown that an alternative explanation may also yield a coherent picture for the other phenomena discussed in Section 2, i.e., the Principle A reconstruction effects and extraction asymmetries that arise from wh-movement across wh-islands.
3.1.5 Wh-islands and intermediate adjunction to VP
Consider again the well-known complement/non-complement asymmetry, already discussed in Section 2.3, and repeated here in (32), without intermediate traces:
(32) a. ??What do you wonder [CP how John could [VP fix t]]?
     b. *How do you wonder [CP what John could [VP fix t]]?
     c. *Who do you wonder [CP how [IP t could fix the car]]?
Chomsky and Lasnik (1993) assume that movement that violates MCL leaves a *-marked trace. If there is no intermediate trace in the embedded sentences of (32), complement extraction across a wh-island as in (32a) should be as ungrammatical as adjunct- and subject-extraction (32b–c), because the initial traces in these examples are always *-marked. Hence, if there is no intermediate trace in the embedded sentences in (32), why is there a difference in grammaticality between complement extraction, on the one hand, and adjunct and subject extraction, on the other? The question of how to explain this extraction asymmetry is also left unanswered in Chomsky’s (1995: 295) discussion of wh-islands. Recall that Chomsky (1995) assumes that there is a distinction between +Interpretable and –Interpretable features. The [wh] features on wh-phrases are +Interpretable. Importantly, these features on XPs remain accessible for the computational system after checking, ensuring that one and the same wh-phrase may undergo successive-cyclic movement or attraction.20 It follows that wh-island violations never arise. An embedded [wh] C0 as in (33a) always attracts the closest wh-phrase according to the MLC (see (19)). This wh-phrase moves to Spec CP and is then attracted again by the [wh] matrix C0, as in (33b), because when located in the embedded Spec CP, it is closer to the matrix C0 than the wh-phrase in situ. The [wh] feature of the embedded C0 is checked in (33a). Therefore, it cannot attract what at a later step of the derivation (33b) (yielding *Who you wonder what could solve?). Furthermore, such a movement would be counter-cyclic. Chomsky notes that the example (33b) converges with all relevant features checked, yielding gibberish since the structure cannot be interpreted adequately.21
(33) a. . . . wonder [CP who [IP t could solve what]]
     b. [CP Who do you wonder [CP t [IP t could solve what]]]?
Under this account wh-island violations as in (32) simply cannot be derived. However, the observed asymmetries cannot be explained. In the following I will propose an analysis based on the Minimalist framework of Chomsky (1994, 1995, Chapter 4). We will see that an analysis assuming multiple specifiers will provide a solution for the problem mentioned above and for the other cases of wh-island violations, which were used as arguments for the intermediate adjunction hypothesis, such as the examples involving reconstruction effects already discussed in Section 2.1. Cross-linguistic variation of wh-island effects suggests that what seems to be a complement/non-complement asymmetry in (32) is in fact a θ/non-θ asymmetry. The accusative object in English, which does not overtly move out of VP for Case-checking, is extracted from its θ-position, whereas subject and adjunct extraction takes place from non-θ-positions. Hence, extraction out of wh-islands is only possible if it takes place from a θ-position (Koopman & Sportiche 1985, 1986). The relevance of this generalization is supported by the extraction facts in languages like German (Bayer 1991), in which subject and object NPs move out of VP for Case-checking. In these languages objects (34a), subjects (34c), and adjuncts (34b) may not be extracted out of wh-islands:22
(34) a. *[CP Was2 fragt sich Hans [CP wie1 [IP Fritz t2 t1 repariert hat]]]?
        what_acc asks refl. Hans_nom how Fritz_nom fixed has
     b. *[CP Wie2 fragt sich Hans [CP was1 [IP Fritz t1 t2 repariert hat]]]?
        how asks refl. Hans_nom what_acc Fritz_nom fixed has
     c. *[CP Wer2 fragt sich Hans [CP wie1 [IP t2 das Auto t1 repariert hat]]]?
        who_nom asks refl. Hans_nom how the car_acc fixed has
The observation that extraction out of wh-islands is only possible from θ-positions extends to the analysis of similar examples in pro-drop languages. As argued in Rizzi (1986), the subject position in languages such as Italian and Spanish may be occupied by an expletive pro and the inverted subject in θ-position may be Case marked in a way other than via specifier head agreement with Infl, granted that the inverted subject position behaves like a Case marked A-position. Chomsky (1995, Section 4.5), who assumes that covert movement is in fact feature-movement, reaches a similar conclusion. If we accept that the base position of the inverted subject is a θ-position, then in Spanish (and Italian; see Rizzi 1982: 51 for the corresponding examples in Italian), nothing blocks movement of the subject (35c), in contrast to English (32c), or German (34c):
(35) a. *Qué no sabes quién compró t ?
        what not know-you who bought
        ‘What don’t you know who bought?’
     b. *Por qué no sabes qué comprar t ?
        why not know-you what to-buy
        ‘Why don’t you know what to-buy?’
     c. ?Quién no sabes qué compró t ?
        who not know-you what bought
        ‘Who don’t you know what bought?’
        (Jaeggli 1988)
Let us turn to the question of how the derivation of wh-island violations proceeds, in light of the prohibition of intermediate adjunction and the MLC. I assume that the embedded C-system in these cases may contain multiple landing positions for wh-phrases, i.e., multiple specifiers (see also Reinhart 1981; Comorovski 1986, 1989; Mulders 1997; Richards 1997). I follow an idea of Koizumi (1994), who assumes a multiple specifier analysis for topicalization in English, embedded verb second in Yiddish and the Scandinavian languages, and multiple wh-fronting in the Slavic languages. Koizumi (1994) presents evidence that the head of a phrase with multiple specifiers must contain hierarchically ordered features which have to be checked in a certain order.23 Adopting the main idea of Koizumi’s analysis, I assume that the selected C-head in indirect questions may bear more than one [wh] feature. The [wh] features in this head are hierarchically ordered ([wh1] > [wh2]) and thus have to be checked in different specifier positions of CP by different wh-phrases.24 If this is true, the embedded Spec CP positions in the examples above have the following structure, with t 2 either as an A′-position (in the case of adjunct extraction) or a base/derived A-position (in the case of argument extraction):
(36)
Both wh-phrases move to the intermediate specifier positions. The intermediate trace t 2 is located in Spec2, whereas Wh1 has moved to Spec1. In (36a) and (36b), different wh-phrases occupy the specifier positions, because they differ with respect to the [wh] feature they bear. In both cases, Wh2 is closer to the attracting matrix C-head. The idea that the wh-phrases bear different [wh] features allows us to motivate movement of both wh-phrases without violating the MLC.25 Even more important, the embedded C-head bears [wh] features that require a wh-operator in both Spec positions. Note that the intermediate trace of the wh-phrase Wh2 is not an operator. Hence, this trace is *-marked after the matrix C-head attracts Wh2.26 Let us now adopt a proposal made in Chomsky (1995: 388, Fn. 75), according to which, besides L-relatedness, θ-positions are relevant for the Uniformity Condition on Chains (37). Recall that operator-variable chains are the only chains in which intermediate trace deletion takes place. No intermediate trace deletion applies in uniform chains. If we take L-relatedness and θ-positions to be the relevant property P, then we can reformulate the condition for intermediate trace deletion. Let us assume that intermediate trace deletion may only apply if a chain fulfills both conditions in (38).
(37) Uniformity Condition on Chains
     A chain C is uniform with respect to P (UN[P]) if each αi has property P or each αi has property non-P.
(38) a. A′ . . . (A′) . . . A (operator-variable construction)
     b. non-θ . . . (non-θ) . . . θ
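The interplay of (37) and (38) can be sketched as a pair of checks over a chain. The sketch assumes that (38a) describes a chain whose head and intermediate links are in A′-positions and whose tail is in an A-position, and that (38b) describes a chain whose head and intermediate links are in non-θ-positions and whose tail is in a θ-position. The class and function names are hypothetical and serve only the illustration.

```python
from typing import NamedTuple


class Link(NamedTuple):
    label: str
    a_position: bool      # True for an A-position, False for an A'-position
    theta_position: bool  # True for a theta-position


def is_uniform(chain, prop):
    """(37): every link has property `prop`, or every link lacks it."""
    values = [getattr(link, prop) for link in chain]
    return all(values) or not any(values)


def may_delete_intermediate(chain):
    """Deletion is allowed only if the chain fits both (38a) and (38b)."""
    *upper, tail = chain
    fits_38a = tail.a_position and not any(l.a_position for l in upper)
    fits_38b = tail.theta_position and not any(l.theta_position for l in upper)
    return fits_38a and fits_38b


# (32a): object extraction; the tail is the object's theta-position.
object_chain = [
    Link("matrix Spec CP", False, False),
    Link("embedded Spec2", False, False),
    Link("object position", True, True),
]

# (32c): subject extraction; the tail is Spec IP, a non-theta A-position.
subject_chain = [
    Link("matrix Spec CP", False, False),
    Link("embedded Spec2", False, False),
    Link("Spec IP", True, False),
]

print(may_delete_intermediate(object_chain), may_delete_intermediate(subject_chain))
```

Under this reading only the object chain permits deletion of the *-marked intermediate trace; the subject chain is uniform with respect to θ-positions, so its intermediate trace survives.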
Now we are able to explain the examples (32) and (34)–(35). Let us assume that these examples are derived as shown in (36a–b). In (32b–c), showing subject and adjunct extraction in English, the intermediate trace *t 2 (with respect to (36a–b)) may not be deleted since the chain (Wh2, *t 2, t 2) is uniform, i.e., each member of the chain is located in a non-θ-position. The initial trace t 2 in (32b) marks the base-position of the adjunct whereas t 2 (with respect to (36a)) in (32c) is located in Spec IP. Therefore, the intermediate traces may not be deleted. In contrast, the chain (Wh2, *t 2, t 2) in (32a) is not uniform because t 2 is located in a θ-position. In this example the intermediate trace must be deleted, and at LF we get the operator-variable pair (Wh2, t 2). The slightly deviant character of this sentence is due to the fact that a *-marked trace was created during the derivation. The explanation for (32b–c) extends to the German examples (34b–c). In contrast to (32a), object extraction out of a wh-island is ungrammatical in German since t 2 is located in a non-θ-position, i.e., the position in which structural Case is assigned to the extracted object. Hence the relevant chains in (34) are all uniform, and consequently the intermediate *-marked trace cannot be deleted. The explanation for the Spanish (and corresponding Italian) cases (35a–b) is the same as for German. In contrast to German, subjects in these languages may be extracted out of wh-islands, as in (35c), since extraction takes place from a (Case-marked) θ-position. Again deletion of the intermediate trace *t 2 is forced to create an operator-variable pair.27, 28 Note that this analysis makes the prediction that in languages with object shift, wh-questioning of objects across wh-islands should be impossible. In fact, languages with obligatory object movement into a Case position such as Icelandic do not allow for object extraction out of wh-islands (Maling 1979), see (39a). Similarly, complements in English may no longer be extracted if they undergo passivization (39b):
(39) a. *Hvað vissi enginn hver hefur skrifað t?
        ‘What does no one know who wrote?’
     b. *What do you wonder how was fixed t?
Furthermore, this analysis correctly predicts that PP arguments of verbs should be easily extractable across wh-islands in all languages. This is true not only for English (Chomsky 1986a: 39) and Dutch (Koster 1987); Comorovski (1990) further shows that it holds for French, Italian and Spanish, and it also holds in German. Compare (40a) with (34) and (40b).
(40) a. ??[CP Womit2 fragt sich Hans [CP was1 [IP Fritz t1 t2 schneiden soll]]]?
        with-what asks refl. Hans_nom what_acc Fritz_nom cut should
     b. *[CP Was2 fragt sich Hans [CP womit1 [IP Fritz t2 t1 schneiden soll]]]?
        what_acc asks refl. Hans_nom with-what Fritz_nom cut should
In addition, this analysis shows that – contrary to what has been assumed – the reconstruction data discussed in Section 2.1 do not provide evidence for intermediate adjunction. Consider again the examples (4) (=(41)) and the examples (42):
(41) a. *John wonders [where [Mary bought the pictures of himself ]].
     b. ??[Which pictures of himself ]_i does John t_i wonder [where [Mary t_i bought t_i ]]?
(42) a. *John asked Mary [where [Paul bought the pictures of herself ]].
     b. ??[Which pictures of herself ]_i did John ask Mary [where [Paul t_i bought t_i ]]?
The fact that the anaphoric expression may not be bound in its base position (41a), (42a) provides no reason to conclude that there is a trace in a VPadjoined position of the matrix clause. Instead, the long-moved wh-phrase in (41b), (42b) is extracted via Spec2 of the embedded CP, and in this position it may be bound in accordance with (11) by the antecedent in the matrix clause i.e., by the matrix subject in (41b) and by the matrix object in (42b). Summarizing the discussion on wh-islands, we can conclude that extraction out of wh-islands does not provide evidence for intermediate traces in
VP-adjoined positions. The multiple Spec analysis provides a straightforward account of the observed cross-linguistic variation with respect to extraction from wh-islands, i.e., one that is compatible with the MLC and the CAM. In the next sections, I will present further arguments against intermediate adjunction, because it is necessary to test whether movement types other than wh-movement obey the constraint. Let us consider next long-distance scrambling.
Long scrambling and intermediate adjunction
Long scrambling (scrambling across sentence boundaries) is a highly restricted process in languages like German (see Grewendorf & Sabel 1994, 1999; Sabel 1995, 1996; Wurmbrand 1998, for extensive discussion). It is only possible with specific matrix verbs. For example, in contrast to the matrix verb behaupten, ‘claim’, (43b), the matrix verb versuchen, ‘try’, (43a) allows for long scrambling out of its infinitival complement:
(43) a. [CP Vermutlich hat [IP diesen Mann keine Frau [ t zu heiraten] versucht]].
        presumably has this manacc no womannom to marry tried
     b. *[CP Vermutlich hat [IP diesen Mann keine Frau [ t zu heiraten] behauptet]].
        presumably has this manacc no womannom to marry claimed
In contrast to languages like Japanese (44a), scrambling in German may not apply out of finite clauses (44b):
(44) a. Sono hon-o John-ga [CP Mary-ga t katta to] omotteiru (koto).
        this bookacc Johnnom Marynom bought C0 thinks (the fact)
        ‘(the fact) that John thinks that Mary bought this book.’
     b. *dass dieses Buch Peter glaubt [CP dass Maria t kaufte]
        that this bookacc Peternom believes that Marynom bought
If it is assumed that the (potential) intervening barriers in (43a) and (44a) are neutralized by successive-cyclic adjunction, it is unclear why the same derivation does not provide an acceptable result in (43b), (44b). The assumption that successive-cyclic adjunction may apply in one example but not in the other
is as implausible as the assumption that the complements of the matrix verb have a different categorial status, i.e., that only in (43b), (44b) do we find CP complements which are barriers for long movement. Consider next the impossibility of scrambling out of adjunct clauses. If intermediate adjunction in conjunction with the deletion approach to intermediate argument traces is assumed (Lasnik & Saito 1984, 1992; Chomsky 1986a, 1995; Chomsky & Lasnik 1993), we have the problem of excluding the ungrammatical examples (45b) and (46b) from Russian (Yadroff 1991, 1994):
(45) a. Vse usnuli [CP groza [CP kogda t končilas’]].
        everybody fell-asleep the-storm when ended
        ‘Everybody fell asleep when the storm ended.’
     b. *Vse groza usnuli [CP kogda t končilas’].
        everybody the-storm fell-asleep when ended
(46) a. My byli udivleny [ vodku [ potomu čto on prines t]].
        we were surprised vodkaacc because he brought
        ‘We were surprised because he brought vodka.’
     b. *My vodku byli udivleny [ potomu čto on prines t].
        we vodkaacc were surprised because he brought
Consider the possibility that (45b), (46b) are derived via intermediate adjunction. The objects in these examples will be adjoined to the embedded VP, leaving a trace t′. Assuming the Lasnik and Saito mechanism, the trace t′ prevents the initial trace t from being *-marked. Given that intermediate argument traces in non-A-positions must delete, the resulting chain, (NP, t), should be a legitimate object. This is clearly the wrong prediction. The empirical problem that (45b) and (46b) pose for intermediate adjunction does not only hold for Russian. Scrambling out of adjuncts is impossible in all scrambling languages (Sabel 1996). Furthermore, I assume that the examples (45a), (46a) show that adjunction at the sentence level is possible. This poses an additional problem for an analysis of scrambling derived by successive-cyclic adjunction. If the grammar allows for successive-cyclic adjunction, then (45a), (46a) represent intermediate steps of the derivations (45b), (46b). Therefore it must be explained why a derivation of (45b), (46b) containing intermediate traces in adjoined positions is impossible, while its underlying partial derivations are possible. If we assume that scrambling applies in accordance with (31), this question does not arise. Hence, scrambling provides further evidence for the constraint on adjunction.29
Although we have seen that the intermediate adjunction hypothesis again runs into trouble, we do not yet have an account for the facts in (43)–(46). Concerning the contrast between (43a) and (43b), it was argued in Grewendorf and Sabel (1994) that long scrambling out of infinitivals is possible in German if verb incorporation in the sense of Baker (1988) between the embedded and the matrix verb takes place at LF. The verb versuchen, ‘try’, in (43a) is an incorporation verb, whereas behaupten, ‘claim’, in (43b) is not. As a result of this ‘restructuring process’, the matrix and embedded clauses behave like a ‘mono-sentential’ structure, and scrambling may apply in one movement step in (43a), but not in (43b). Crucially, this analysis does not rely on the intermediate adjunction hypothesis. We now have to answer the question of why Japanese allows for scrambling out of finite clauses (44a), in contrast to German (44b), and why scrambling out of adjunct clauses in (45b) and (46b) is ungrammatical. The impossibility of scrambling out of adjuncts (45b), (46b) should be treated on a par with the impossibility of movement out of CED-islands in general (see Sabel 2002 for discussion). Let us first turn to the question of why Japanese allows for scrambling out of finite clauses (44a), in contrast to German (44b). A cross-linguistic survey shows that a correlation exists between the possibility of long scrambling and A-properties of short scrambling. Languages like Hindi (Mahajan 1990) or Modern Persian (Browning & Karimi 1994) behave exactly like Japanese. These languages allow for scrambling out of finite clauses and show short scrambling with A-properties (see below).
In Grewendorf and Sabel (1999) it is shown that the crucial diagnostic for the A-/A′-movement characteristics of scrambling lies in the question of whether a moved category can act as a binder for an anaphoric expression that is unbound in its base position.30 To answer the question of why Japanese allows for long scrambling out of finite clauses, in contrast to German, we must take into consideration that short scrambling has A-properties in Japanese but A′-properties in German, i.e., only in Japanese can a scrambled NP act as an A-binder for an anaphor, (47b) vs. (48b). The (a)-examples represent violations of Principle A of the Binding Theory, and (48c) shows that anaphoric binding in German is possible in principle in this construction (Saito 1994b).
(47) a. ?*[IP [ Otagaii-no sensei]-ga [I′ karera-o hihansita]] (koto).
        each othergen teachernom theyacc criticized (the fact)
        (Each other’s teacher criticized them)
     b. [ Karera-oi [[ otagaii-no sensei]-ga [I′ t hihansita]]] (koto).
        theyacc each othergen teachernom criticized (the fact)
(48) a. *dass [IP [ die Lehrer von sich] [I′ den Schüler kritisiert haben]].
        that [ the teachers of himself]nom the pupilacc criticized have
     b. *dass [IP den Schüleri [ die Lehrer von sichi ] [I′ t kritisiert haben]]].
        that the pupilacc [ the teachers of himself]nom criticized have
     c. dass [IP der Schüleri [I′ [die Lehrer von sichi ] kritisiert hat]].
        that the pupilnom [the teachers of himself]acc criticized has
Furthermore, Japanese has the so-called multiple subject construction, again in contrast to German. In this construction we observe that the structurally higher subject may act as an A-binder, which suggests that it occupies the same position as the scrambled object in (47b) (Doron and Heycock 1996).
(49) [IP Johni-ga [ zibun-zisini-no hisyo-ga] [VP kubi-ni natta]] (koto).
     Johnnom selfgen secretarynom was-fired
     (Johni [is such that] hisi secretary was fired)
Chomsky (1994, 1995, Chapter 4) and Ura (1994), assuming the Merger theory of phrase-structure building in Chomsky (1994, 1995), have claimed that the sentence structure in Japanese may contain multiple specifiers, which provide multiple L-related positions. According to Chomsky (1994, 1995: 286) this option may be due to parameterized properties of Agreement (or Case), i.e., one and the same head may check Agreement (or Case) more than once. If this is true, then the higher subject in (49) and the scrambled NP in (47b) are located in a Spec2 position, which is an L-related position with A-properties (50a).
(50) a. [IP NPacc [I′ NPnom [I′ [VP ... t ...]]]]
     b. [IP NPacc [IP NPnom [I′ [VP ... t ...]]]]
Multiple L-related specifiers are not available in German, where the corresponding feature-checker can check only once. The scrambled NP in German
therefore is located in an adjoined position (48b), which is associated with A′-properties (50b). Consequently, only in (47b) can the scrambled object act as an A-binder for the anaphor. In (48b) the scrambled object is located in an IP-adjoined position with A′-properties. Therefore it cannot bind the anaphor in subject position. Now, given the CAM, we predict that an element that is moved to an adjunction site inside an embedded clause may not move further into the matrix clause. This is the case in German (44b), whereas scrambling in Japanese may proceed in a successive-cyclic manner via an embedded Spec2 IP position (44a) (cf. Ura 1994; Grewendorf and Sabel 1999 for more details of this analysis). Therefore scrambling out of finite clauses is possible in Japanese but not in German. This analysis provides a uniform account of the different A-/A′-properties and locality restrictions of scrambling in languages such as German and Japanese in that it predicts that a scrambling language allows for scrambling out of finite clauses and short A-scrambling if multiple A-specifiers are licensed in this language.31 Note also that this analysis yields an explanation for the reconstruction properties of long scrambling, i.e., one that does not rely on intermediate adjunction. Recall the discussion of the examples (5) in Section 2.1, repeated as (51) below. In (51a) the embedded subject is the only possible antecedent for otagai ‘each other’. If the embedded object containing the reflexive is short scrambled in front of the embedded subject, the reflexive otagai may be bound by the matrix subject (51b). In (51c) the NP containing the anaphor is scrambled out of the embedded clause in front of the matrix subject; in this case, too, the matrix subject can be co-referent with the anaphor.
Given the analysis of scrambling in Japanese mentioned above, the embedded object in (51c) is long scrambled via the embedded Spec2 IP position and not via an adjoined position as indicated in (51c). Then (51c) is not derived via successive-cyclic adjunction and does not provide evidence for an intermediate trace in an adjoined position inside the embedded clause.
(51) a. Joe-to Michael1-ga [CP [IP karera-ga2 Kate-ni [ otagai*1/2-no hon]-o okutta to omotteiru]] (koto).
        Joe-and Michaelnom theynom Katedat each other’s bookacc sent C0 thinking
        ‘Joe and Michael are thinking that they sent Kate each other’s book.’
     b. Joe-to Michael1-ga [CP [IP otagai1/2-no hon-o [IP karera-ga2 Kate-ni t okutta to omotteiru]]] (koto).
        Joe-and Michaelnom each other’s bookacc theynom Katedat sent C0 thinking
     c. otagai1/2-no hon-o Joe-to Michael1-ga [CP [IP t [IP karera-ga2 Kate-ni t okutta to omotteiru]]] (koto).
        each other’s bookacc Joe-and Michaelnom theynom Katedat sent C0 thinking
The proposed account for the different options of long scrambling in Japanese and German (44) as well as for its reconstruction properties (51c) relies on the assumption that a scrambled element obligatorily moves through an intermediate landing site in the embedded clause. Why is an alternative derivation of the examples (44) and (51c) impossible in which scrambling applies in one fell swoop i.e., does not proceed in a successive-cyclic manner? One possible answer is that (long) scrambling, like wh-movement, is an obligatory movement operation that is driven by a feature i.e., a scrambling feature [Σ] (see also Collins 1995; Miyagawa 1997; Grewendorf & Sabel 1999; Chomsky 2000 for this assumption). The scrambling feature is associated with Agrs-features in Infl (or ν) which triggers scrambling to IP (or νP). The fact that the so-called scrambling languages are all pro-drop languages (for example, SOV-languages such as German, Hindi, Japanese, Korean, Modern Persian, Turkish and SVO-languages such as Polish and Russian all license argumental or non-argumental pro), suggests that the language-specific ability of Agrs (or Infl) to license pro is a necessary (although not a sufficient) condition for a language to have scrambling i.e., for Agrs (or Infl) to bear the scrambling feature (for this generalization see also Koster 1986; Reuland & Kosmeijer 1988; Tonoike 1997; among others).32 Then, in a simple sentence with short scrambling to IP, Infl0 and the constituent to be scrambled contain the scrambling feature. For example, the scrambling feature in (47b) and (48b) is realized on Infl0 . Given Chomsky’s (1993) definition of ‘Checking Domain’33 this feature is checked via substitution into Spec2 of IP in Japanese (47b) or via adjunction to IP in German (43a), (48b). 
Applying the idea of feature-driven movement to long scrambling in (44) and (51c), let us assume that assignment of the scrambling feature to Agrs in I0 implies assignment of a scrambling feature to each intermediate I0 . Consequently, in sentences such as (44) or (51c) displaying long scrambling out of a finite clause to IP, the scrambling feature is located in both I0 of the matrix as well as I0 of the embedded clause and the scrambled element has to check both scrambling-features. The scrambling-feature can be checked via adjunction to IP in German or via substitution into Spec2 of IP in Japanese. Now, given the assumption that successive-cyclic adjunction is generally impossible, elements in German may not be long scrambled because a scrambled element
that is moved to an adjunction site inside the embedded clause, i.e., IP in (44b), may not move further into the matrix clause. On the other hand, scrambling in Japanese may indeed proceed in a successive-cyclic manner via the embedded Spec-(IP) position as in (44a) and (51c), i.e., not via XP-adjunction (see Grewendorf & Sabel 1999 for more empirical motivation for this analysis). In this way we can derive a uniform account of the different A-/A′-properties and locality restrictions of scrambling in languages such as German and Japanese.
Quantifier raising
Let us briefly consider one more type of adjunction movement in light of the CAM: quantifier raising (QR). Given that QR, like scrambling in German, is adjunction movement, we expect that it may not apply successive-cyclically. Combining the proposal of Chomsky (1995) that quantifier raising is triggered by some feature with the suggestion of Kiss (1987: 249) that scrambling in the overt syntax parallels QR at LF, we could say that the feature triggering quantifier raising is located in the same functional heads as the scrambling feature.34 Then, given that QR is constrained by the CAM, we can explain its clause-bound character, i.e., the non-ambiguity of (52) (see Hornstein 1984; Williams 1986; Mahajan 1990; Cheng 1991; Lasnik & Saito 1992; among others):
(52) Someone thinks that everyone saw you at the rally.
Recall that May (1985) and Chomsky (1986a: 7) argue that the distributive reading in examples such as (53) is a consequence of adjunction movement of the subject to IP at LF, as in (53′), where IP does not include everyone. Instead, who and everyone are dominated by exactly the same maximal projections (CP):
(53) [CP Who does [IP everyone like t]]?
(53′) [CP Who1 does [IP everyone2 [IP t2 like t1 ]]]?
If the IP-adjunction approach for quantifiers is adopted, we have to block a derivation for (52) in which adjunction movement of the embedded subject to IP represents an intermediate step, which is followed by long movement of the embedded subject into the matrix clause. The non-ambiguity of (52) follows from the CAM since the subject may not leave the embedded clause if it is adjoined to the embedded IP as in (52′).
(52′) Someone thinks that [IP everyone [IP t saw you at the rally]].
According to the CAM the embedded subject in (52′) is frozen in place.35 Now recall, in the light of the analysis in (53), the following examples (54) (=(14)), mentioned in Section 2.4, in favor of the intermediate adjunction hypothesis. As already pointed out, scope reconstruction facts were argued to provide evidence for intermediate adjunction.
(54) a. Which book did every student wonder [CP whether [his teacher [VP wrote t]]]?
     b. Which book did every student think [CP t that [his teacher didn’t [VP write t]]]?
The wh-phrases can be interpreted in the matrix clause, i.e., they yield a distributive reading. (54b), for example, can have a family of answers: Paul thinks that his teacher didn’t write Barriers, John thinks that his teacher didn’t write Anarchy, State and Utopia. The distributive reading in (54a) was argued to be a consequence of an intermediate trace in VP-adjoined position. However, the distributive reading in (54a) (and also (54b)) can be due to the fact that after quantifier raising the quantifier every student has scope over the entire question, a clause-bound process that, in addition, is blocked by so-called inner islands such as negation, as in (55c) (=(12c)) (see for example Beck 1996).
(55) c. Which book didn’t every student think [CP t that [his teacher [VP wrote t]]]?
Hence, the scope reconstruction facts do not provide an argument for intermediate adjunction (to VP). Let us sum up the empirical results so far. The CAM was shown to hold for different types of XP-movement, i.e., for wh-movement and scrambling (including covert quantifier movement). Furthermore, we have seen that we can give an alternative account for the scope reconstruction facts in connection with extraction out of wh-islands, i.e., one that does not rely on the intermediate adjunction hypothesis. Let us now look at the problematic consequences of intermediate adjunction for empty operator movement.
Movement of empty operators and further XP-movements
In his discussion of parasitic gap constructions, Chomsky (1986a: 65f.) assumes that NPs may be adjoined to PP-adjuncts as in (56):
(56) Which booki did [IP you [VP return ti [PP Oi [PP without [reading ei ]]]]]?
For the parasitic gap construction to be licensed, it is assumed that one composed chain has to be constructed out of the two A′-chains in (56). This (composed) chain can only be constructed if the empty operator is 0-subjacent to the ‘real’ trace ti. Hence, ti and the empty operator may not be separated by one barrier or more. In order to compose the relevant chain, operator movement in the infinitival has to take place and the operator has to adjoin to PP. Adjoined to PP, it is 0-subjacent to the trace of the wh-phrase, according to the framework of Chomsky (1986a). Given the assumption that the PP-adjunct is base-generated in a VP-internal position and not adjoined to VP or IP, the empty operator in (56) is not 0-subjacent to the structural position of the subject. This predicts the ungrammaticality of sentences like (57):
(57) *Which paperi [IP ti [VP disappeared [PP Oi [PP before you could [VP read ei ]]]]]?
If the operator in PP-adjoined position (56) were permitted to leave an intermediate trace, a derivation like (57′) would be possible:
(57′) *Which paperi [IP ti [VP Oi [VP disappeared [PP ei [PP before you could [VP ei [VP read ei ]]]]]]]?
Given (57′), the sentence is predicted to be grammatical because the empty operator and the trace in subject position are no longer separated by a barrier (cf. also Browning 1987: 201f.).36 Hence, the concept of intermediate adjunction undermines the account of the parasitic gap constructions (56)–(57). On the other hand, if we dispense with intermediate adjunction, the derivation (57′) is excluded. To complete the discussion of the CAM in connection with movement of XPs, it should be noted that further construction types have led several authors to the conclusion that intermediate adjunction may not be allowed. For example, it must also be excluded that A-movement may apply via intermediate adjunction (cf. Chomsky’s 1986a: 74, 1995: 326 discussions of Improper movement in different theoretical frameworks).
(58) a. *John seems that [IP t′ [IP it was told t [ that Mary is intelligent]]].
     b. *John seems that [IP t′ [IP it is considered [ t to be intelligent]]].
The Improper movement account of (58) in the framework of Chomsky (1986a: 74, 1986b) and Chomsky and Lasnik (1993) relies on the idea that the trace t is an R-expression (i.e., a variable, since it is locally A′-bound by t′) that is A-bound by John in violation of Principle C of the Binding Theory (see also Li 1990). However, this account of (58) is in conflict with the Uniformity Condition on Chains. As already pointed out (cf. Section 2.3), only uniform chains are legitimate LF objects in the framework of Chomsky and Lasnik (1993) and Chomsky (1995). For illustration, consider (59)–(61):
(59) A ... A ... A (A-/L-related chain)
(60) A′ ... A′ ... A′ (A′-/non-L-related chain, X0-chain)
(61) A′ ... (*A′) ... A (operator-variable construction)
The Uniformity Condition prohibits (intermediate) trace deletion in A-chains (59), where every element is located in an L-related (A-) position, and (intermediate) trace deletion in A′-chains (60), where every element occupies a non-L-related (A′-) position. In contrast, in operator-variable constructions (61) the intermediate trace occupies a non-L-related position and has to be deleted. Given this analysis we can no longer rule out the examples (58), since according to the Uniformity Condition on Chains the intermediate trace in these examples should be deleted to create a uniform A-chain, a legitimate object. The resulting chain C = (John, t) no longer violates Principle C of the Binding Theory, and the ungrammaticality of these examples is unexplained. To sum up, given the (representational version of the) Uniformity Condition, Improper movement that results from intermediate traces in adjoined positions as in (58) is not excluded, since these traces are deleted at LF. Hence a derivational constraint is needed which prohibits intermediate adjunction.37 Let us now turn to the analysis of (58) within the framework of Chomsky (1994, 1995). In (58), John is moved to the position t′. In this position John is closer to the matrix I0 than it, and John can be attracted by I0 according to the MLC (see (19)). (58) should be well-formed, contrary to fact. As pointed out in Chomsky (1995: 326), there must be a reason why the intermediate trace is not licensed, and this reason is responsible for the fact that derivation (58) is impossible. To sum up, whatever framework is adopted, we have to exclude intermediate adjunction in the course of A-movement. Further construction types have led several authors to the same conclusion.
For example, it has often been noted that the clause-bound character of extraposition points to the conclusion that rightwards adjunction may not apply as an intermediate movement step (see for example Baltin 1983 Footnote 8, 1987: 591; Guéron & May 1984: 15; May 1985: 109ff.; Aoun 1986: 135; Kroch & Joshi 1987; Raposo 1987; Nakajima 1989: 331; Davis & Alphonce 1992: 96).
Lastly, it must be mentioned that LF-movement of anaphors obeys locality restrictions that are incompatible with successive-cyclic adjunction (Hestvik 1990: 157f.). This point leads to another important aspect: head movement. Given that “the optimal assumption would be that movement of zero level categories falls under the principles that apply to movement of maximal projections . . .” (Chomsky 1986a: 68), head-movement should also obey the constraint on adjunction movement. In fact, the CAM makes the same predictions for X0- and XP-movement. Its ability to restrict head movement will be demonstrated in the next section.
Intermediate adjunction and head movement
The constraint on adjunction excludes the possibility that the strict locality restrictions holding for X0-movement (Travis 1984; Chomsky 1986a, 1991; Baker 1988) are neutralized by means of intermediate adjunction. For example, the CAM excludes violations that were traditionally subsumed under the Head Movement Constraint (HMC). Head movement that skips a potential landing site, as in *How tall be John will t, is ruled out by Minimize Chain Links (MCL) if the movement in question applies in one fell swoop (see also Lasnik 1995 for an alternative proposal). However, the constraint on adjunction excludes the possibility that intermediate adjunction to the intervening head-position provides a derivation in which MCL or the MLC is not violated (see also Chomsky 1995: 135). Consider, for example, clitic climbing. Clitics in Spanish may move out of infinitival complements of matrix verbs like querer ‘want’ or permitir ‘allow’. (62b–c) are derived from (62a) by moving the clitic te ‘you’ in (62b) and the clitic lo ‘it’ in (62c) (examples (62)–(63) are taken from Aissen and Perlmutter 1983). It is plausible to assume that te and lo form one complex head in (62c) (see Sabel 2001 for discussion). For example, both may be moved, as can be seen from (63a).
(62) a. Quiero [PRO permitirte [PRO hacerlo]].
        I-want to-allow-youdat to-do-itacc
     b. Te quiero [PRO permitir [PRO hacerlo]].
        youdat I-want to-allow to-do-itacc
     c. Quiero [PRO permitir te lo [PRO hacer]].
        I-want to-allow-youdat-itacc to-do
(63) a. Te lo quiero [PRO permitir [PRO hacer]].
        you-it I-want to-allow to-do
     b. *Lo quiero [PRO permitirte (t′) [PRO hacer t]].
        it I-want to-allow-you to-do
If te and lo form a complex head in (62c) and (63a) that is formed via adjunction, the CAM predicts that (62c) cannot be the input for a structure in which lo moves on into the highest clause, leaving an intermediate trace in the position occupied by lo in (62c). This prediction is in fact borne out, as can be seen in (63b). On the other hand, if (63b) is derived from (62a) and lo moves in one fell swoop into the highest clause, clitic climbing violates MCL (see Sabel 2001 for more discussion). To sum up, the impossibility of successive-cyclic adjunction in cases of verb movement and clitic climbing (63b) follows from the CAM and requires no further stipulation. Given the constraint on adjunction movement, derivations including intermediate traces in adjoined positions cannot be generated. An element dominated by only one segment of a complex head may not move further on its own because this movement creates a trace located in an adjoined position. At this point of the discussion, one might wonder whether the CAM is too restrictive. Two cases have to be considered. First, the CAM does not exclude a head formed by adjunction from moving as a whole, as is the case with clitic climbing (63a) (or as is the case with V-to-I-to-C in general). Here, a new head is formed after each movement step, so that every head was adjoined only once.38 Secondly, it does not exclude the possibility that the “head of a complex head” may move and strand the segment that was created by adjunction to itself. Derivations of this type must be allowed. For example, they are attested in connection with verb-movement in Dutch (see Koster 1987). As can be seen from the Dutch examples (64a–b), it is impossible for the infinitival complement to remain in its base position, nor is extraposition allowed. The only possibility is that incorporation creates a complex verb (64c). (64d) shows that the matrix verb moves to C0 . This is necessary because of the verb-second character of Dutch. 
Under the assumption that the matrix and embedded verbs must form a single head (64c), a trace inside this verbal complex must be licensed (64d).
(64) a. *dat zij [ het boek te lezen] schijnt
        that she the book to read seems
        ‘She seems to read the book.’
     b. *dat zij ti schijnt [ het boek te lezen]i
        that she seems the book to read
     c. dat zij [ het boek tv ] schijnt + te lezenv
        that she the book seems to read
     d. Zij schijnti [ het boek tv ] ti + te lezenv.
        she seems the book to read
From this I conclude that the CAM is not too restrictive. It represents a correct generalization and restricts the set of potential derivations. Adjunction movement can never go into an intermediate adjoined position, but only into an adjoined position that represents a goal position. Successive-cyclic movement of an element α may only proceed via Spec positions. Let us summarize the results of the discussion in the preceding sections. In Section 2, I have presented evidence for the intermediate adjunction hypothesis from reconstruction, weak crossover, and locality effects. In Section 3, I have tried to show that, on the other hand, a number of theoretical and empirical arguments argue against the intermediate adjunction hypothesis. Data and considerations concerning a wide range of movement types, such as wh-movement, A-movement, QR, extraposition, scrambling, and head movement, motivated the postulation of the constraint on adjunction movement (CAM), which generally excludes intermediate traces in adjoined positions:
(65) [Tree diagrams: three configurations, each containing an intermediate trace t′ in an adjoined position and an initial trace t]
     a. Substitution (A-movement, Empty operator-, and Wh-movement)
     b. XP-Adjunction (QR, Scrambling, Extraposition)
     c. X0-Adjunction (Head-Movement)
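The descriptive content of the CAM can be restated as a filter on sequences of landing sites. The following is only a minimal sketch under simplifying assumptions: the names `Step` and `obeys_cam`, the two-way labels `SPEC` and `ADJOINED`, and the encoding of the examples as lists of steps are illustrative conventions introduced here, not notation from the chapter.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical labels for the two kinds of landing site discussed in the text.
SPEC = "spec"          # substitution into a specifier position
ADJOINED = "adjoined"  # XP- or X0-adjunction


@dataclass(frozen=True)
class Step:
    site: str  # informal label of the landing site, e.g. "embedded IP"
    kind: str  # SPEC or ADJOINED


def obeys_cam(path: List[Step]) -> bool:
    """Return True if no non-final landing site is an adjoined position.

    An adjoined position is tolerated only as the goal (last) position,
    so every intermediate step must be a specifier position.
    """
    return all(step.kind != ADJOINED for step in path[:-1])


# Long scrambling out of a finite clause, German (44b): the embedded
# landing site would have to be an IP-adjoined position.
german_long = [Step("embedded IP", ADJOINED), Step("matrix IP", ADJOINED)]

# Long scrambling in Japanese (44a): the path runs via Spec2 of IP.
japanese_long = [Step("embedded Spec2 IP", SPEC), Step("matrix Spec2 IP", SPEC)]

# Short scrambling in German (48b): a single adjunction step to a goal position.
german_short = [Step("IP", ADJOINED)]

# QR of the embedded subject in (52): adjunction to the embedded IP
# followed by further movement into the matrix clause.
qr_long = [Step("embedded IP", ADJOINED), Step("matrix IP", ADJOINED)]
```

Under these assumptions `obeys_cam` accepts `japanese_long` and `german_short` and rejects `german_long` and `qr_long`, mirroring the contrasts in (44) and the clause-boundedness of QR in (52). The sketch says nothing about why the constraint holds; it only encodes the generalization.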
If the CAM represents a correct descriptive generalization, we have to ask why this constraint holds. Chomsky (1995: 195) mentions a restriction on feature-checking, which relies on the idea that elements contain hierarchically ordered features. One consequence of this restriction is that the effects of the head movement constraint can be derived. The idea is that features in a checkee, i.e., in a verbal head, have to be checked in a certain order. Assuming that lexical elements are taken to be sequences of features and checked in a certain order, Chomsky (1995) suggests that it is possible to capture the effects of Baker’s Mirror Principle in minimalist terms (see also Grimshaw 1991 and Cherny 1992 for a similar suggestion). If we generalize this idea and extend it to XP-movement, we are able to rule out the problematic examples of “Improper movement” in (65a), i.e., adjunction movement as an intermediate movement step (see Cinque 1990; Frampton 1999) that is followed by movement to a specifier position. We can assume that any operator-features (topic-, focus-, wh- or scrambling features) that are checked in adjoined positions are always checked last, i.e., if the checkee is moved into different checking positions, movement to specifier positions has to precede movement to adjoined positions. Furthermore, this analysis can be extended to other cases of Improper movement discussed in Section 3.1.1 such as long scrambling via Spec CP (see Note 4) or (18b-ii), repeated here as (66):
(66) *Who was expected [CP t that [IP it was told t [that Mary is beautiful]]]?
In (66), who has to check nominative Case in addition to a [wh] feature. The wh-element is first attracted by an operator feature of the embedded C0. In this position it is closer to the matrix I0 than the expletive it. Therefore, superraising of who should be possible in (66), contrary to fact. On the other hand, according to the above-mentioned analysis of hierarchical feature orderings, movement from Spec CP to a Case position is ruled out. This analysis can be extended to further types of Improper movement. For example, it rules out wh-movement from a [wh] Spec CP into a [–wh] Spec CP position (Lasnik and Saito 1992; Manzini 1998).

Let us next turn to successive-cyclic adjunction in (65b–c). Chomsky (1995: 285) accounts for the impossibility of successive-cyclic adjunction in the case of head movement (65c) by assuming that X0-adjunction in incorporation contexts is triggered by an [affix] feature, which erases after the first adjunction movement. Further movement of the adjoined head would violate Last Resort and is therefore excluded. Now, in order to rule out successive-cyclic adjunction in (65b), we can assume that feature-checking for XP-adjunction works in the same way: it may apply only once. Note that, besides the adjunction movement in (65b–c), this also rules out (vacuous) infinitely iterated adjunction movement (see Fukui 1993; Saito 1994b; Lee 1994: 44; Takano 1996: 245; Sabel 1999 for discussion). To illustrate this, assume that an element β adjoins to α. Then the next shortest landing site for β will again be a position adjoined to α, and so on. With no limit on the number of applications of adjunction to α, β will never move out of α. This unwarranted derivation arises in a framework such as Chomsky (1981, 1986a), where Move α applies optionally, as well as in a framework where the shortest move requirement (or Minimize Chain Links) is assumed, as in Chomsky and Lasnik (1993), and also in the framework of Chomsky (1994, 1995, Chapter 4), where it is possible in principle that one and the same head checks a certain feature more than once.
The analysis outlined above also excludes derivations of this type. Importantly, the use of an adjoined position as a goal position remains unproblematic under this analysis, since adjunction movement applies only once. Furthermore, positions that arise from base adjunction, such as the base position of (non-L-related) adjuncts or the base position of L-related elements (for example, inverted subjects), are not affected by the restriction mentioned above.
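The effect of the restriction on possible derivations can be illustrated with a small sketch. The following Python fragment is only an illustration under simplifying assumptions and is not part of the analysis itself: a chain is reduced to a list of position types, ordered from the base position to the final landing site, and the names Site and satisfies_cam are hypothetical. The check encodes the descriptive content of the CAM, namely that an adjoined position may be a goal position but never an intermediate one.

```python
from enum import Enum


class Site(Enum):
    """Position type of a chain member (hypothetical labels)."""
    BASE = "base"          # base position (may itself be base-adjoined)
    SPEC = "spec"          # specifier position
    ADJOINED = "adjoined"  # position created by adjunction movement


def satisfies_cam(chain):
    """Return True if no intermediate chain member is in an adjoined position.

    `chain` is ordered from the base position (first element) to the
    final landing site (last element). The base position and the final
    landing site are not inspected, so base adjunction and adjunction
    as a goal position are both permitted.
    """
    intermediate = chain[1:-1]
    return all(site is not Site.ADJOINED for site in intermediate)


# Successive-cyclic movement via Spec positions.
via_specs = [Site.BASE, Site.SPEC, Site.SPEC]
# Adjunction as the final (goal) step, after a Spec step.
adjunction_last = [Site.BASE, Site.SPEC, Site.ADJOINED]
# Improper movement: adjunction followed by movement to Spec.
improper = [Site.BASE, Site.ADJOINED, Site.SPEC]
# Successive-cyclic adjunction.
iterated_adjunction = [Site.BASE, Site.ADJOINED, Site.ADJOINED]

for name, chain in [
    ("via_specs", via_specs),
    ("adjunction_last", adjunction_last),
    ("improper", improper),
    ("iterated_adjunction", iterated_adjunction),
]:
    print(name, satisfies_cam(chain))
```

Under these assumptions, the first two chains are accepted and the last two are rejected. The sketch states only the generalization; it does not model the feature-ordering explanation proposed for it.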
4. Conclusion

In this article, I have argued that movement is restricted by the Constraint on Adjunction Movement (CAM), a constraint that forbids traces in intermediate adjoined positions. The CAM predicts that the only existing intermediate traces of a moved element are traces in specifier positions. The empirical evidence against intermediate adjunction was formulated with respect to different movement types, such as wh-movement, empty-operator movement, A-movement, extraposition, quantifier raising, scrambling, and head movement. Furthermore, those data that were traditionally taken to provide evidence for intermediate adjunction have been explained as involving movement via a second specifier position. For example, the analysis of scrambling in German and Japanese presented in Section 3 rests on the assumption that Japanese allows for multiple A-specifiers whereas German does not. On the other hand, multiple (CP) A′-specifiers seem to be the unmarked case across languages, as argued in connection with the proposed analysis of extraction out of wh-islands.
Notes

* For helpful comments I would like to thank Noam Chomsky, Chris Collins, John Frampton, Sam Epstein, Gisbert Fanselow, Danny Fox, Hans-Martin Gärtner, Günther Grewendorf, Eric Groat, Shin-Sook Kim, Shigeru Miyagawa, Norvin Richards, Jeff Runner, Mamoru Saito, Gert Webelhuth, and Ede Zimmermann.

1. In most recent analyses of movement phenomena within the Minimalist Program, the idea that successive-cyclic movement proceeds via intermediate adjunction plays an integral part (see Chomsky & Lasnik 1993; Saito 1994b; Takahashi 1994; Agbayani 1997; Takeda 1997; Fukui & Saito 1998; among others). For example, with respect to the reconstruction properties discussed in the following section, the analysis in Takahashi (1994) is couched in the Minimalist framework (Chomsky 1993; Chomsky & Lasnik 1993) and rests on the intermediate adjunction hypothesis, as does the analysis in Barss (1986, 1988), who assumes the Barriers framework. Furthermore, Richards (1997), working in the Minimalist framework, in his analysis of the absence of weak crossover effects in several languages, addresses the correlation with intermediate IP adjunction, like the authors mentioned in Section 2.2 who present an analysis in the Barriers framework. In addition, concerning the discussion of locality effects of movement in Section 2.3, the idea that every non-complement is a barrier, which is a variant of the notion of L-marking barrier in Chomsky (1986a), can be found in Chomsky and Lasnik (1993) and in Chomsky (2000). This analysis in terms of barrierhood relies on the distinction between “segments” and “categories” already found in Chomsky (1986a) and adopted in the Minimalist framework (see Chomsky 1993: 11, 1995: 177; Fukui & Saito 1998; among others). Thus, the following arguments presented in favor of the intermediate adjunction analysis carry over to the different frameworks.
2. There has been some debate on whether a structural difference between specifier and adjunction positions exists or whether all sisters to the projections of a head-complement structure have to be analyzed as specifiers (see Fukui 1986; Kuroda 1992; Kayne 1994; Ura 1994; Fukui & Saito 1998; among others). In contrast to my analysis in the text, it might be possible to analyze all types of XP-movement as movements to a specifier position and interpret the different properties of these movements instead in terms of different agreement or feature checking relations with the (attracting) head. This analysis, however, leaves head movement as the only existing (exceptional) type of adjunction movement (unless the latter is analyzed as XP-movement, which, however, raises new problems; see for example Toyoshima 1997) and offers no natural account for the fact that head adjunction behaves like scrambling, quantifier raising, extraposition, and other types of XP-adjunction movement with respect to successive-cyclic movement. In the following, I assume that movement may target either a specifier or an adjoined position.

3. See Mahajan (1990) for a discussion of similar examples in Hindi, and Saito (1992), Abe (1993), Nemoto (1993), and Sakai (1994) for the relevant facts in Japanese.

4. There are several reasons to assume that long scrambling as in (5) does not proceed via Spec CP (see Hoekstra & Bennis 1989; Müller & Sternefeld 1993; Grewendorf & Sabel 1994, 1999 for arguments; see also Sections 3.1.1 and 3.5 for discussion).

5. In fact, to be precise, if a disjunctive ECP (Empty Category Principle) is assumed, intermediate adjunction is likewise necessary to correctly account for subjacency violations in the Barriers framework, where subjacency violations are accounted for in terms of crossed barriers. According to Chomsky (1986a), there are two ways for an XP to become a barrier for an element β:
(i)
XP (except IP) is a barrier for β iff it is a blocking category for β, or
(ii) XP is a barrier for β iff XP immediately dominates a blocking category for β.

An Xmax is a blocking category for β if and only if Xmax is not L-marked, this being the case if Xmax is not θ-marked (i.e. if it is not assigned a θ-role under sisterhood by a lexical head). (i) is relevant for all XPs except IP, which may be a blocking category but, by definition, never a barrier according to (i). IP may only become a barrier by “inheritance”, i.e. by (ii). If a disjunctive ECP is adopted, the intermediate movement step is still necessary to guarantee that the moved wh-phrase does not violate subjacency. Consider for instance again example (iii) (=(8a)), where the trace of the extracted object is θ-governed and does not violate the disjunctive ECP:

(iii) What do [IP you [VP like t]]?

In (iii) VP is not L-marked and therefore a barrier for t according to (i), and IP is a barrier for t according to (ii). The wh-phrase has crossed two barriers and we would expect (iii) to result in ungrammaticality, contrary to fact. For reasons outlined in the following discussion in the text, the intermediate adjunction hypothesis ensures that the perfect acceptability of (iii) follows.

6. Chomsky and Lasnik (1993) rely on the Uniformity Condition on chains to derive this effect (cf. also Chomsky 1991, 1995). They argue that (i) only uniform chains, and as a special case the two-membered operator-variable pairs, are legitimate objects at the LF-interface, and that (ii) trace-deletion is a Last Resort operation that creates uniform chains and operator-variable pairs. Uniformity is a relational notion. A chain is uniform if all its members share the relevant property (UN[P]), for example L-relatedness (UN[L]). Adjuncts and heads are non-L-related elements. They only move to non-L-related positions, creating legitimate objects, i.e. uniform chains where every member occupies a non-L-related position. A-chains with each element in an A- or L-related position are also uniform. Hence, deletion of traces does not apply in these uniform chains. In the case of long wh-movement of arguments it is important that only operator-variable pairs are licensed at LF; therefore intermediate traces are deleted from A′- or non-L-related positions as a Last Resort operation, yielding legitimate LF-objects of the form (Wh, t), where t represents the Case-marked position, i.e. the variable.

7. Chomsky (1995) accounts for wh-island violations in terms of the Minimal Link Condition (MLC). This analysis will be discussed in Section 3.1.5.

8. Furthermore, if it is assumed that reconstruction is impossible for A-moved elements (Chomsky 1995) or that A-movement does not leave copies (Hornstein 1998; Lasnik 1998, 1999; Saito & Hoshi 2000), the examples in (i) are no longer problematic. Anaphors which are contained in elements that are A-moved out of the c-command domain of their antecedents do not violate Principle A. The dependent elements in (i) act as if they were contained in A′-moved phrases (10) (cf. Barss 1986: 108; Belletti & Rizzi 1988; Johnson 1985: 41ff., 1987, 1992; Pesetsky 1987; among others).

(i) a. Pictures of himself_i [VP [please t] John_i].
b. Each other_i’s pictures seem to the men_i [IP t to be t the most beautiful].
(11) accounts for the well-formedness of examples such as (10) and (i) without referring to the question of whether or not the anaphor (or the element that contains the anaphor) has undergone A- or A′-movement (see also Note 9).

9. Cinque (1990, Section 1.4.2) notes that scope reconstruction of wh-phrases across weak islands is much more restricted than ‘reconstruction’ for the purposes of the Binding Theory. An explanation for this asymmetry could make use of the idea, in line with what I have argued for in the preceding section, that reconstruction at LF for Binding Theory simply does not exist.

10. A further argument which was proposed in the literature in favor of the intermediate adjunction hypothesis concerns the explanation of Superiority effects in terms of the ECP. Assuming that traces have to be antecedent governed, Chomsky (1991, Footnotes 24 and 34) assumes that traces are licensed (i.e. +γ-marked) at intermediate stages of a derivation (see Barss 1986: 447; Saito 1994a: 209f.). In (i) the trace of the raised subject satisfies the ECP. Hence, the contrast in (i) is due to the fact that LF wh-movement of the adjunct violates the ECP, in contrast to covert movement of the object, as can be seen from the LF-representations of (i) in (ii). Given that the object adjoins to VP, as shown in (ii-a), the initial trace t_j is antecedent-governed and therefore not *-marked. Although further movement of the wh-phrase to Spec CP leaves an intermediate trace which is not c-commanded and therefore gets *-marked, this trace is deleted and the resulting operator-variable pair does not cause ungrammaticality. On the other hand, deletion of an intermediate trace is impossible in the case of the LF-moved adjunct in (ii-b). Therefore (i-b) violates the ECP.

(i) a. [CP Who_i [IP t_i [VP said what]]]?
b. *[CP Who_i [IP t_i [VP [VP solved the problem] how]]]?
(ii) a. [CP [NP [NP what]_j [NP who]_i]_i [IP t_i [VP (*t_j) [VP said t_j]]]]?
b. *[CP [NP [ADVP how]_j [NP who]_i]_i [IP t_i [VP *t_j [VP [VP solved the problem] t_j]]]]?

However, there are other superiority effects which cannot be subsumed under the ECP account (see Hendrick and Rochemont 1988; Pesetsky 1982; Lasnik and Saito 1992; and Lee 1993). In (iii) and (iv) none of the variables will be *-marked under a derivation involving intermediate adjunction as in (ii-a). Therefore, the ECP is not violated in (iii) and (iv), and the contrast in (i) no longer provides an argument for the intermediate adjunction hypothesis. One has to conclude that some version of the superiority condition has to be posited independently of the ECP.

(iii) *What did you tell who to read t? (but: Who did you tell t to read what?)
(iv) *Who did you give what to t? (but: What did you give t to whom?)

11. Furthermore, this condition is too strong since adjunction movement may target an adjunct as a goal position; see the discussion of examples (34a) and (35a) below.

12. The only derivation allowed by the MLC is one in which John stays in its base position and it raises to the matrix Spec IP position:

(i) *It was expected [that t was told John that Mary is beautiful].

However, this derivation is excluded because it has already checked its Case feature in the embedded Spec IP position and therefore may not check the Case of the matrix I0.

13. A potential problem seems to arise if we compare the ungrammaticality of object extraction from CED-islands (17) and (20) with the milder ungrammaticality of wh-island violations (i):

(i) ??What do you wonder [CP how John could [VP fix *t]]?

If intermediate adjunction is impossible, we would expect that in (i) a *-marked trace remains at LF, whereas the *-marked trace located in Spec CP can be deleted in (17). Hence, contrary to fact, (17) should involve a milder degree of ungrammaticality than (i).
This aspect is more explicitly discussed in Sabel (1998, 2002), where it is argued that the only existing intermediate trace (*t) in the Spec CP position of the adjunct clauses is undeletable.

14. In addition, if it turned out to be correct to analyze the infinitival complements of ECM verbs as IPs (and not as CPs), then the constraint mentioned above rules out a redundancy for examples like (i-a) and (i-b), where the intermediate trace is not licensed because adjunction to IPs and adjunction to arguments are impossible:

(i) a. *[CP Who does John believe [IP t [IP [a sister of t] to be stupid]]]?
b. *[CP Who does John consider [IP t [IP [students of t] to be talented]]]?
15. A plausible cross-linguistic analysis is given in Georgopoulos (1991). She argues that an account of the variation in weak crossover effects must refer to the canonical order of head and specifier. According to her analysis, a weak crossover effect arises if a pronoun is located in a specifier position (Spec IP) that does not stand in a canonical relation to its head (I0). Hence, in SVO languages like English or Polish, where I0 and Spec IP are not in a canonical relation, weak crossover effects are found with wh-movement, whereas they do not arise with overt operator-movement in SOV languages like German, Japanese, and Lakhota.

16. In my analysis of the Polish data, I have followed the suggestion made by Toman (1981), Lasnik and Saito (1984), Rudin (1988), Cheng (1991), Billings and Rudin (1994), Koizumi (1995), Bošković (1996), Richards (1997), and Stepanov (1998), among others, that the IP-adjoined position is an operator position in Polish. Several further languages provide empirical evidence that the IP-adjoined position may be an operator position for wh-elements. For example, Raghibdoust (1994) makes the same claim for Persian, and further evidence for an IP-adjoined operator position is provided by wh-scrambling in Japanese (see Nishiyama et al. 1996; Grewendorf & Sabel 1999; among others). Mahajan (1990), in his “scrambling” analysis of wh-movement in Hindi, also argues for the IP-adjunction analysis of wh-movement. Obviously, from a typological point of view, languages must be divided according to their use of different destinations for wh-elements in the overt syntax. We find languages in which wh-elements end up in Spec CP (as in English and German), in a position adjoined to IP (as in Hindi, Japanese, and Persian), or in both positions (as in Polish). A potential explanation could make use of an idea entertained by Rizzi (1990b, 1996), namely that (in addition to C0) Infl0 can also be base-generated with [wh] features.

17. See McDaniel (1989: 585f.) for the same problem with intermediate adjunction to VP posed by partial wh-movement constructions.

18. Note also that the following derivation with intermediate VP-adjunction is problematic for an account of wh-island violations in the ‘Barriers’ framework of Chomsky (1986a):

(i) *How do you t wonder [CP what [John could [VP t [VP t [VP fix t]]]]]?

The adjunct has adjoined twice to the embedded VP.
We have to exclude that t and t antecedent govern each other, which is the case if the notion m-command (cf. Chomsky 1986a) is relevant for antecedent government. The problem that is caused by a derivation like (i) (which also arises in connection with the that-t-effect, as pointed out by Chomsky 1986a; see also Takahashi 1994: 117f. for relevant discussion) disappears if c-command is the notion relevant for antecedent government. Then, although no barrier intervenes between the trace t and t, t does not c-command t, hence t is not antecedent governed and violates the ECP at LF. Note, however, that the notions “inclusion” and “exclusion” are still needed in frameworks which express structural relations in terms of strict c-command (defined in terms of the first branching node (Reinhart 1983)), in order to guarantee that head movement such as V-to-I or N-to-V leaves a commanded trace t_v (see Baker 1988: 449, Footnote 10; Chomsky & Lasnik 1993: 518, 522). If c-command is therefore defined such that α c-commands β if α does not dominate (include) β and every γ that dominates α dominates β, the problem posed by derivations such as (i) needs to be solved with an independent constraint, such as the one which prohibits intermediate adjunction, making derivations such as (i) impossible.

19. That licensing of the bound variable reading of a pronoun is subject to similar constraints as anaphoric binding can also be seen from the fact that it is independent of the A- or A′-movement properties of the constituent containing the pronoun (cf. Note 8). As shown in (i-b), in contrast to (30) and (i-a), the bound variable reading of a pronoun can also be licensed when the pronoun has left the c-command domain of its binder as a result of A-movement (Engdahl 1986; Koizumi 1992; Abe 1993: 311):

(i) a. Which of his_i parents do you believe that every man_i likes [t best]?
b. Its_i nose seems to every intelligent robot_i [t to be ugly].
20. The corresponding [wh] feature in the head of the attractor is also +Interpretable and strong in English, triggering overt movement. Furthermore, weak +Interpretable features need not be checked. Therefore wh-phrases in situ (for example in multiple questions in English) or their [wh] features are not moved at LF. The properties of +Interpretable features correlate with the assumption that +Interpretable features are legitimate LF objects that enter into interpretation. In contrast, –Interpretable features (such as Case) need to be checked in any event, and hence are eliminated at LF. –Interpretable features on XPs disappear immediately after checking. This prohibits, for example, an NP from checking one and the same feature more than once.

21. Note that Chomsky (1995, Chapter 4) allows for feature-checking via Merge in non-θ-positions. This is relevant for expletive constructions and for the analogue of (32) with whether. Like who in (33a), whether in (i) checks the [wh] feature of the embedded C0. At a later stage of the derivation whether is attracted by the matrix C-head (i). However, a trace inside the embedded Spec CP position is not licensed (see the discussion in Note 26 below). Again, the derivation in (ii) is not possible.

(i) *Whether you wonder t whether John could fix what?
(ii) *What do you wonder whether John could fix t?

22. Certain binding phenomena seem to provide evidence for object shift in German (see Wyngaerd 1989; Mahajan 1990; Sabel 1996; among others).

23. As discussed in Section 3.5 below, the idea of hierarchically ordered features can possibly be derived from the assumption that lexical elements are sequences of features which have to be checked in a certain order, as suggested in Chomsky (1995: 195).
24. Koizumi assumes that hierarchically ordered [Top] and [Neg] features ([Top] > [Neg]) in one and the same functional head trigger checking of two different elements (Topics and Negative Phrases) in different specifier positions of one and the same projection.

25. This also rules out the possibility that Wh1 moves from Spec1 to Spec2. See also Reinhart (1981) for suggestions on how to regulate the filling of multiple landing positions in Spec CP.

26. It is commonly assumed that intermediate traces are [–wh] elements, i.e., non-operators. For example, this assumption automatically excludes examples such as *Who do you wonder [t [t won the race]], in which the strong [wh] C0 head of the embedded CP needs a [wh] element in its specifier. Given that intermediate traces are [–wh] elements, the ungrammaticality of this sentence is expected. An additional empirical argument for the assumption that traces are [–wh] elements can be gained from sentences such as Who knows [who [John saw t]]. As can be seen from this example, the embedded [wh] C0 is checked by a wh-element. However, this example cannot be understood as a matrix double question, which means that this wh-element cannot be interpreted in the matrix Spec CP position. Again, this would result in a mismatch, since a [–wh] element would occupy an embedded Spec CP with a [wh] C0 head (for further discussion see Lasnik & Saito 1992; Rizzi 1996).

27. The analysis outlined in the text provides the basis for an account of the fact that the sentential complements of factive verbs are syntactic islands. One possibility is to adopt the analysis in Melvold (1991), who argues that extraction from factive islands is blocked by an empty operator in the Spec CP position of the factive complement (see also Watanabe 1993). According to Melvold (1991), island effects with respect to wh-extraction from factive complements have to be explained in analogy to wh-island violations. A different possibility arises from the analysis of factive islands in Cinque (1990: 30), who, in contrast to Melvold, assumes that these complements are not generated as sisters of V. Then they are barriers for extraction according to the idea that “every non-complement is a barrier” (Huang 1982; Chomsky & Lasnik 1993; among others). Both analyses are compatible with the analysis presented here (see Sabel 2002 for discussion).

28. Violations of the CNPC such as (ii) provide a weaker island effect than complement extraction from relative clauses as in (i), which shows a very strong violation that is equal to an ECP effect.

(i) *Which book did John have [NP a friend [CP to whom [to read *t]]]?
(ii) ?*Which book did you hear [NP a rumor [CP *t that John had read t]]?

In contrast to (ii), no intermediate trace can be generated in (i) because the intermediate Spec CP position is occupied by the relative pronoun. Note that this explanation is not undermined by the multiple specifier analysis proposed in this section. It is plausible to assume that, in contrast to embedded interrogative complements, the C-head of relative clauses may not project a second specifier position in which the long-moved wh-element can create an intermediate trace.
The reason for this may well lie in the different feature specifications of embedded [wh] C0-heads and the C0-heads of relative clauses (see, for example, Rizzi 1990a, Chapter 2). Given the asymmetries between relativization and wh-movement, the idea that relative clauses and wh-questions may have different C-systems is independently justified. For example, we find multiple wh-movement but never multiple relativization. Relative pronouns may never occur in situ, in contrast to wh-elements; relative pronouns in English license resumptive pronouns, in contrast to interrogative wh-phrases (Safir 1986); and that-t and weak crossover effects are absent in relativization, in contrast to wh-movement (Chomsky 1981, 1982). Finally, Horvath (1986: 48ff.) compares both movement types in Hungarian, showing that the landing sites of relative pronouns differ from the landing sites of interrogative wh-elements. This asymmetry is also found in Italian, where according to Rizzi (1995) relative operators and wh-operators occupy different positions; i.e. the former must precede topics, in contrast to question operators (see Brandon & Seki 1981; Tajima 1987; Tajima & Arimura 1988; and Müller & Sternefeld 1993 for further differences between wh-movement and relativization). To conclude, the explanation for the strong ungrammaticality of (i) relies on the idea that, in contrast to (ii), no intermediate trace is created in an A′-position and the *-marked trace may not be deleted.

29. The same conclusion is drawn in Hoekstra and Bennis (1989), discussing examples like (44b) from Dutch, where scrambling out of finite clauses, in contrast to sentence-internal scrambling, is impossible, as in German.
30. In addition, it is argued in Grewendorf and Sabel (1999) that further tests, which have led others to different conclusions about the A-/A′-properties of scrambling, do not in fact provide any conclusive evidence in this regard. These tests come from reconstruction effects, weak crossover effects, and parasitic gap phenomena.

31. For example, the same correlation is found in Persian. Firstly, Persian allows for the multiple subject construction (i) (Ura 1994). Secondly, a short scrambled NP may bind an anaphor (ii-b) that is not licensed otherwise due to the lack of a binder (ii-a) (Browning & Karimi 1994):

(i) Muhmud ketâb-aš gom šod.
Mahmud-nom book-nom got lost
‘Mahmud, his book got lost. (It is Mahmud that his book got lost.)’

(ii) a. *Madar-e khodash Ali-ra koshte.
mother-Ez self-him Ali-OM killed
b. Ali-ra_i madar-e khodash_i t koshte.
Ali-OM mother-Ez self-him killed
‘His own mother killed Ali.’
(OM = Object Marker)

Finally, (iii-b), derived from (iii-a), shows that Persian allows for long distance scrambling.

(iii) a. Ali fekr-mikone ke Mehry een ketab-ra be Hassan dad.
Ali thinks that Mehry this book-OM to Hassan gave
b. Ali een ketab-ra fekr-mikone ke Mehry t be Hassan dad.
Ali this book-OM thinks that Mehry to Hassan gave
‘Ali thinks that Mary gave this book to Hassan.’
In Hindi, similar correlations are found, although it is controversial whether Hindi has multiple subject constructions of the sort found in Japanese and Persian (Mahajan 1990, p.c.).

32. Several authors have argued that, in contrast to “full” pro-drop languages such as Italian and Spanish, languages such as Dutch and German represent “semi” pro-drop languages. The latter do not allow referential pro-subjects but only empty expletive pronominal pro-subjects. According to this view, the subject position in impersonal passive constructions and in constructions with VP-internal subjects is occupied by an expletive pro that satisfies the EPP (Extended Projection Principle). For more details of the analysis of semi pro-drop see McKay (1985); Platzack (1985); Safir (1985a, 1985b); Koster (1986); Grewendorf (1989).

33. According to Chomsky (1993), an element α is in the checking domain of a head (X) if (i) it is in a Spec-head relation with X, or (ii) it is in a position adjoined to the head X, or (iii) it is adjoined to the maximal projection of X, or (iv) it is adjoined to the Spec of X.

34. A similar treatment of scrambling and quantifier raising gains support from the observation that scrambling traces and traces generated by quantifier raising behave in the same way: they are not subject to Principle C (see Hornstein 1984; Aoun & Hornstein 1985; Aoun & Li 1990, 1993; Nemoto 1993: 25; Saito 1994b).

35. Note that an alternative derivation of example (52) is impossible according to which long quantifier raising applies without intermediate adjunction. Assuming that scrambling in the overt syntax parallels QR at LF, the features triggering scrambling and quantifier raising are associated with Agrs-features in the functional heads Infl0 (or v0) of the embedded and matrix clauses, obligatorily triggering successive-cyclic movement.

36. Given that ‘every non-complement is a barrier’, we can translate the constraints on parasitic gaps in terms of intervention of maximal projections.

37. Fukui (1993) argues that the same problem arises with respect to the examples in (i). After the intermediate trace is deleted, no Principle C violation should occur.

(i) a. *John was decided [IP t [IP t to leave at noon]].
b. *John was decided [CP t [IP t to leave at noon]].
However, as already mentioned in Section 3.1.1, the examples in (i) are ruled out for independent reasons, i.e., because the infinitival requires a PRO subject which checks Null Case. Given that John cannot bear Null Case, a feature mismatch arises (see Chomsky 1995: 326 for further discussion).

38. A similar construction also occurs in the case of multiple XP-movement. As is well known, multiple wh-fronting languages such as Bulgarian and Romanian allow for long distance fronting of multiple wh-elements, as in (i) from Romanian (Comorovski 1986):

(i) Cine3 cui2 ce1 ziceai ca t3 i-a promis t2 t1?
who to-whom what you-were-saying that to-him has-promised
‘Who did you say promised what to whom?’

At first sight, (i) seems to pose a problem for the assumption that movement may not proceed via intermediate adjunction. As noted in Rudin (1988), to derive a sentence like (i) without a violation of subjacency it is necessary for more than one wh-phrase to pass through the embedded Spec CP position, which means that Bulgarian and Romanian must allow multiple (intermediate) wh-traces to be adjoined to Spec CP as in (ii) (Rudin 1988: 455):

(ii) [CP Wh_i Wh_j . . . [CP [SpecCP t_i [t_j]] . . . t_i . . . t_j . . .]]?

As argued in Ackema and Neeleman (1998), Sabel (1998), Grewendorf and Sabel (1999), and Sabel (2001), multiple wh-elements in Bulgarian and Romanian move as one single constituent successive-cyclically from Spec CP to Spec CP (leaving only one intermediate trace in the embedded Spec CP position). See also Richards (1997) for an alternative analysis of multiple fronting phenomena in terms of multiple specifiers.
References Abe, J. (1993). Binding Conditions and Scrambling without A/A Distinction. Doctoral dissertation, University of Connecticut. Ackema, P. & A. Neeleman (1998). Optimal Questions. Natural Language and Linguistic Theory, 16, 443–490. Agbayani, B. (1997). Category Raising, Adjunction and Minimality. UCI Working Papers in Linguistics, 3, 1–25.
Intermediate traces, reconstruction and locality effects
Aissen, J. & D. M. Perlmutter (1983). Clause Reduction in Spanish. In D. M. Perlmutter (Ed.), Studies in Relational Grammar (pp. 360–403). Chicago: The University of Chicago Press.
Aoun, J. (1986). Generalized Binding. Cambridge, MA: The MIT Press.
Aoun, J. & N. Hornstein (1985). Quantifier-Types. Linguistic Inquiry, 16, 623–637.
Aoun, J. & Y.-H. A. Li (1990). Scope and Constituency. Linguistic Inquiry, 20, 141–172.
Aoun, J. & Y.-H. A. Li (1993). Wh-elements in Situ: Syntax or LF. Linguistic Inquiry, 24, 199–238.
Baker, M. (1988). Incorporation. A theory of grammatical function changing. Chicago: The University of Chicago Press.
Baltin, M. (1983). Extraposition: Bounding versus Government-Binding. Linguistic Inquiry, 14, 162–166.
Baltin, M. (1987). Do Antecedent-Contained Deletions Exist? Linguistic Inquiry, 18, 579–595.
Barss, A. (1986). Chains and Anaphoric Dependence. Doctoral dissertation, MIT.
Barss, A. (1988). Paths, Connectivity and Featureless Empty Categories. In A. Cardinaletti, G. Cinque and G. Giusti (Eds.), Constituent Structure (pp. 9–34). Dordrecht: Foris.
Bayer, J. (1991). Notes on the ECP in English and German. Groninger Arbeiten zur Germanistischen Linguistik, 30, 1–55.
Bayer, J. (1993). V(P)-Topicalization and the Role of Traces. Ms., Universität Stuttgart.
Beck, S. (1996). Quantified Structures as Barriers for LF Movement. Natural Language Semantics, 4 (1), 1–56.
Belletti, A. & L. Rizzi (1988). Psych-Verbs and θ-Theory. Natural Language and Linguistic Theory, 6, 291–352.
Billings, L. & C. Rudin (1994). Optimality and Superiority: A new approach to overt multiple-wh ordering. In J. Toman (Ed.), Formal Approaches to Slavic Linguistics. The College Park Meeting (pp. 35–60). Michigan Slavic Publications.
Bošković, Z. (1996). Fronting Wh-Phrases in Serbo-Croatian. Ms., University of Connecticut.
Brandon, F. R. & L. Seki (1981). A Note on Comp as a Universal. Linguistic Inquiry, 12, 659–664.
Browning, M. A. (1987). Null Operator Constructions. Doctoral dissertation, MIT.
Browning, M. A. & E. Karimi (1994). Scrambling to Object Position in Persian. In N. Corver and H. van Riemsdijk (Eds.), Studies on Scrambling (pp. 61–100). Berlin: Mouton de Gruyter.
Cheng, L. (1991). On the Typology of Wh-Questions. Doctoral dissertation, MIT.
Cherny, L. (1992). The Role of Agreement and Modality in Palauan. Proceedings of the West Coast Conference on Formal Linguistics (WCCFL), 11, 78–92.
Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, N. (1982). Some Concepts and Consequences of the Theory of Government and Binding. Cambridge, MA: The MIT Press.
Chomsky, N. (1986a). Barriers. Cambridge, MA: The MIT Press.
Chomsky, N. (1986b). Knowledge of Language. Its nature, origin and use. New York: Praeger.
Joachim Sabel
Chomsky, N. (1991). Some Notes on Economy of Derivation and Representation. In R. Freidin (Ed.), Principles and Parameters in Comparative Grammar (pp. 417–454). Cambridge, MA: The MIT Press.
Chomsky, N. (1993). A Minimalist Program for Linguistic Theory. In K. Hale and S. J. Keyser (Eds.), The View from Building 20: Essays in linguistics in honor of Sylvain Bromberger (pp. 1–52). Cambridge, MA: The MIT Press.
Chomsky, N. (1994). Bare Phrase Structure. MIT Occasional Papers in Linguistics, 5. Department of Linguistics and Philosophy, MIT.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: The MIT Press.
Chomsky, N. (1999). Derivation by Phase. MIT Occasional Papers in Linguistics, 18. Department of Linguistics and Philosophy, MIT.
Chomsky, N. (2000). Minimalist Inquiries: The framework. In R. Martin, D. Michaels, and J. Uriagereka (Eds.), Step by Step (pp. 89–155). Cambridge, MA: The MIT Press.
Chomsky, N. & H. Lasnik (1993). Principles and Parameters Theory. In J. Jacobs et al. (Eds.), Syntax: An international handbook of contemporary research (pp. 506–569). Berlin: de Gruyter.
Cinque, G. (1990). Types of A′-Dependencies. Cambridge, MA: The MIT Press.
Clark, R. (1990). Thematic Theory in Syntax and Interpretation. London: Croom Helm.
Collins, C. (1993). Topics in Ewe Syntax. Doctoral dissertation, MIT.
Collins, C. (1995). Toward a Theory of Optimal Derivations. In R. Pensalfini and H. Ura (Eds.), Papers on Minimalist Syntax, MIT Working Papers in Linguistics, 27 (pp. 65–103). Department of Linguistics and Philosophy, MIT, Cambridge, MA.
Comorovski, I. (1986). Multiple Wh Movement in Romanian. Linguistic Inquiry, 17, 171–177.
Comorovski, I. (1989). Discourse and the Syntax of Multiple Constituent Questions. Doctoral dissertation, Cornell University.
Comorovski, I. (1990). Verb Movement and Object Extraction in French. Proceedings of the North East Linguistic Society (NELS), 20, 91–105.
Coopmans, P. (1988). On Extraction from Adjuncts in VP. Proceedings of the West Coast Conference on Formal Linguistics (WCCFL), 7, 53–65.
Coopmans, P. (1990). A Note on Bars and Barriers. In J. Mascaró and M. Nespor (Eds.), Grammar in Progress. Dordrecht: Foris.
Davis, H. & C. Alphonce (1992). Passing, Linear Asymmetry, and wh-Movement. Proceedings of the North East Linguistic Society (NELS), 22, 87–100.
Doron, E. & C. Heycock (1996). Filling and Licensing Multiple Specifiers. Ms.
Engdahl, E. (1986). Constituent Questions: The syntax and semantics of questions with special reference to Swedish. Dordrecht: Reidel.
Ferguson, K. S. & E. M. Groat (1994). Defining ‘Shortest Move’. Ms., Harvard University.
Fox, D. (1999). Reconstruction, Binding Theory, and the Interpretation of Chains. Linguistic Inquiry, 30, 157–196.
Frampton, J. (1999). The Fine Structure of Wh-movement and the Proper Formulation of the ECP. The Linguistic Review, 16, 43–62.
Fukui, N. (1986). A Theory of Categorial Projection and its Applications. Doctoral dissertation, MIT.
Fukui, N. (1993). A Note on Improper Movement. The Linguistic Review, 10, 111–126.
Fukui, N. & M. Saito (1998). Order in Phrase Structure and Movement. Linguistic Inquiry, 29, 439–474.
Georgopoulos, C. (1991). Canonical Government and the Spec Parameter: An ECP account of weak crossover. Natural Language and Linguistic Theory, 9, 1–46.
Grewendorf, G. (1989). Ergativity in German. Dordrecht: Foris.
Grewendorf, G. & J. Sabel (1994). Long Scrambling and Incorporation. Linguistic Inquiry, 25, 263–308.
Grewendorf, G. & J. Sabel (1999). Scrambling in German and Japanese: Adjunction versus Multiple Specifiers. Natural Language and Linguistic Theory, 17, 1–65.
Grimshaw, J. (1991). Extended Projection. Ms., Brandeis University, Waltham, MA.
Guéron, J. & R. May (1984). Extraposition and Logical Form. Linguistic Inquiry, 15, 1–31.
Hendrick, R. & M. Rochemont (1988). Complementation, Multiple wh, and Echo Questions. Toronto Working Papers in Linguistics, 9.
Hestvik, A. (1990). LF-Movement of Pronouns. Doctoral dissertation, Brandeis University.
Hoekstra, T. & H. Bennis (1989). A Representational Theory of Empty Categories. In H. Bennis and A. van Kemenade (Eds.), Linguistics in the Netherlands 1989 (pp. 91–99). Dordrecht: Foris.
Hornstein, N. (1984). Logic as Grammar. Cambridge, MA: The MIT Press.
Hornstein, N. (1998). Movement and Chains. Syntax, 1, 99–127.
Horvath, J. (1986). Focus in the Theory of Grammar and the Syntax of Hungarian. Dordrecht: Foris.
Huang, C.-T. J. (1982). Logical Relations in Chinese and the Theory of Grammar. Doctoral dissertation, MIT.
Huang, C.-T. J. (1993). Reconstruction and the Structure of VP: Some theoretical consequences. Linguistic Inquiry, 24, 103–138.
Jaeggli, O. (1988). ECP Effects at LF in Spanish. In D. Birdsong and J.-P. Montreuil (Eds.), Advances in Linguistics (pp. 113–149). Dordrecht: Foris.
Johnson, K. (1985). A Case for Movement. Doctoral dissertation, MIT.
Johnson, K. (1987). Against the Notion ‘SUBJECT’. Linguistic Inquiry, 18, 354–361.
Johnson, K. (1988). Clausal Gerunds, the ECP, and Government. Linguistic Inquiry, 19, 583–609.
Johnson, K. (1992). Scope and Binding Theory: Comments on Zubizarreta. In T. Stowell and E. Wehrli (Eds.), Syntax and the Lexicon. Syntax and Semantics 26 (pp. 259–275). New York: Academic Press.
Kayne, R. (1989). Facets of Romance Past Participle Agreement. In P. Benincà (Ed.), Dialect Variation and the Theory of Grammar (pp. 85–103). Dordrecht: Foris.
Kayne, R. (1994). The Antisymmetry of Syntax. Cambridge, MA: The MIT Press.
Kiss, K. (1987). Configurationality in Hungarian. Budapest: Akademiai Kiado; Dordrecht: Reidel.
Koizumi, M. (1992). Copy α and Reconstruction Effects. Ms., MIT.
Koizumi, M. (1994). Layered Specifiers. Proceedings of the North East Linguistic Society (NELS), 24, 255–269.
Koizumi, M. (1995). Phrase Structure in Minimalist Syntax. Doctoral dissertation, MIT.
Koopman, H. & D. Sportiche (1982). Variables and the Bijection Principle. The Linguistic Review, 2, 139–160.
Koopman, H. & D. Sportiche (1985). θ-Theory and Extraction. GLOW Newsletter, 14, 57–58.
Koopman, H. & D. Sportiche (1986). A Note on Long Extraction in Vata and the ECP. Natural Language and Linguistic Theory, 4, 357–374.
Koster, J. (1986). The Relation between pro-drop, Scrambling, and Verb Movements. Ms., Rijksuniversiteit Groningen.
Koster, J. (1987). Domains and Dynasties. The radical autonomy of syntax. Dordrecht: Foris.
Kroch, A. & A. K. Joshi (1987). Analyzing Extraposition in a Tree Adjoining Grammar. In G. Huck and A. Ojeda (Eds.), Discontinuous Constituents. Syntax and Semantics, 20 (pp. 107–149). New York: Academic Press.
Kuroda, S.-Y. (1992). Whether We Agree or Not: A comparative syntax of English and Japanese. In S.-Y. Kuroda (Ed.), Japanese Syntax and Semantics (pp. 215–257). Dordrecht: Kluwer.
Lasnik, H. (1995). Verbal Morphology: Syntactic Structures meets the Minimalist Program. In H. Campos and P. Kempchinsky (Eds.), Evolution and Revolution in Linguistic Theory (pp. 251–275). Washington, DC: Georgetown University Press.
Lasnik, H. (1998). Some Reconstruction Riddles. Proceedings of the 22nd Annual Penn Linguistics Colloquium. Volume 5 (1), 83–98.
Lasnik, H. (1999). Chains of Arguments. In S. D. Epstein and N. Hornstein (Eds.), Working Minimalism (pp. 188–215). Cambridge, MA: The MIT Press.
Lasnik, H. & M. Saito (1984). On the Nature of Proper Government. Linguistic Inquiry, 15, 235–289.
Lasnik, H. & M. Saito (1992). Move α. Cambridge, MA: The MIT Press.
Lebeaux, D. (1983). A Distributional Difference between Reciprocals and Reflexives. Linguistic Inquiry, 14, 723–730.
Lebeaux, D. (1985). Locality and Anaphoric Binding. Linguistic Inquiry, 16, 343–363.
Lebeaux, D. (1991). Relative Clauses, Licensing, and the Nature of the Derivation. In S. D. Rothstein (Ed.), Perspectives on Phrase Structure. Syntax and Semantics, 25 (pp. 209–239). New York: Academic Press.
Lee, E.-J. (1993). Superiority Effects and Adjunct Traces. Linguistic Inquiry, 24, 177–183.
Lee, R. K. (1994). Economy of Representation. Doctoral dissertation, University of Connecticut.
Li, Y.-H. A. (1990). X⁰-Binding and Verb Incorporation. Linguistic Inquiry, 3, 399–426.
Lightfoot, D. & A. Weinberg (1988). ‘Barriers’ (A Review). Language, 64, 366–383.
Longobardi, G. (1987). In Defense of the Correspondence Hypothesis. Ms.
Mahajan, A. (1990). The A/A-bar Distinction and Movement Theory. Doctoral dissertation, MIT.
Maling, J. (1979). An Asymmetry with Respect to Wh-Islands. Linguistic Inquiry, 9, 75–89.
Manzini, R. (1998). A Minimalist Theory of Weak Islands. In P. W. Culicover and L. McNally (Eds.), The Limits of Syntax [Syntax and Semantics 29] (pp. 185–209). New York: Academic Press.
May, R. (1977). The Grammar of Quantification. Doctoral dissertation, MIT.
May, R. (1985). Logical Form. Cambridge, MA: The MIT Press.
McDaniel, D. (1989). Partial and Multiple Wh-movement. Natural Language and Linguistic Theory, 7, 565–604.
McKay, T. (1985). Infinitival Complements in German. Cambridge: CUP.
Melvold, J. (1991). Factivity and Definiteness. MIT Working Papers in Linguistics, 15, 97–117.
Miyagawa, S. (1997). Against Optional Scrambling. Linguistic Inquiry, 28, 1–25.
Mulders, I. (1997). Mirrored Specifiers. Linguistics in the Netherlands (pp. 135–146). Amsterdam: John Benjamins.
Müller, G. & W. Sternefeld (1993). Improper Movement and Unambiguous Binding. Linguistic Inquiry, 24, 461–507.
Nakajima, H. (1989). Bounding of Rightwards Movements. Linguistic Inquiry, 20, 328–334.
Nemoto, N. (1993). Chains and Case Positions: A study from scrambling in Japanese. Doctoral dissertation, University of Connecticut.
Nishiyama, K. et al. (1996). Syntactic Movement of Overt Wh-Phrases in Japanese and Korean. Japanese/Korean Linguistics, 5, 337–351.
Pesetsky, D. (1982). Paths and Categories. Doctoral dissertation, MIT.
Pesetsky, D. (1987). Binding Problems with Experiencer Verbs. Linguistic Inquiry, 18, 126–140.
Platzack, C. (1985). The Scandinavian Languages and the Null Subject Parameter. Working Papers in Scandinavian Syntax, 20.
Plessis, H. du (1977). Wh-Movement in Afrikaans. Linguistic Inquiry, 8, 723–726.
Raghibdoust, S. (1994). Multiple Wh-Fronting in Persian. Cahiers Linguistiques d’Ottawa, 21, 27–58.
Raposo, E. (1987). Romance Inversion, the Minimality Condition and the ECP. Proceedings of the North East Linguistic Society (NELS), 18, 357–374.
Reinhart, T. (1981). A Second Comp Position. In A. Belletti, L. Brandi, and L. Rizzi (Eds.), Theory of Markedness in Generative Grammar (pp. 517–557). Pisa: Scuola Normale Superiore di Pisa.
Reinhart, T. (1983). Anaphora and Semantic Interpretation. London: Croom Helm.
Reuland, E. & W. Kosmeijer (1988). Projecting Inflected Verbs. Groninger Arbeiten zur Germanistischen Linguistik (GAGL), 29, 88–113.
Richards, N. (1997). What moves where when in which language? Doctoral dissertation, MIT.
Rizzi, L. (1982). Issues in Italian Syntax. Dordrecht: Foris.
Rizzi, L. (1986). Null Objects in Italian and the Theory of pro. Linguistic Inquiry, 17, 501–557.
Rizzi, L. (1990a). Relativized Minimality. Cambridge, MA: The MIT Press.
Rizzi, L. (1990b). Speculations on Verb Second. In J. Mascaró and M. Nespor (Eds.), Grammar in Progress (pp. 25–32). Dordrecht: Foris.
Rizzi, L. (1992). Argument/Adjunct (A)symmetries. Proceedings of the North East Linguistic Society (NELS), 22, 365–381.
Rizzi, L. (1995). The Fine Structure of the Left Periphery. Ms., Université de Genève.
Rizzi, L. (1996). Residual Verb Second and the Wh-Criterion. In A. Belletti and L. Rizzi (Eds.), Parameters and Functional Heads. Essays in comparative syntax (pp. 63–90). Oxford: OUP.
Rudin, C. (1988). On Multiple Questions and Multiple Wh-Fronting. Natural Language and Linguistic Theory, 6, 445–501.
Sabel, J. (1995). On Parallels and Differences between Clitic Climbing and Long Scrambling & the Economy of Derivations. Proceedings of the North East Linguistic Society (NELS), 25, 405–423.
Sabel, J. (1996). Restrukturierung und Lokalität. Universelle Beschränkungen für Wortstellungsvarianten. Berlin: Akademie-Verlag.
Sabel, J. (1998). Principles and Parameters of Wh-Movement. Habilitation thesis, Universität Frankfurt/Main.
Sabel, J. (1999). Das Passiv im Deutschen. Derivationale Ökonomie vs. optionale Bewegung. Linguistische Berichte, 177, 87–112.
Sabel, J. (2000). Partial Wh-Movement and Typology of Wh-Questions. In U. Lutz et al. (Eds.), Wh-Scope Marking (pp. 409–446). Amsterdam: John Benjamins.
Sabel, J. (2001). Deriving Multiple Head and Phrasal Movement: The Cluster Hypothesis. Linguistic Inquiry, 32 (3), 532–547.
Sabel, J. (2002). A Minimalist Analysis of Syntactic Islands. The Linguistic Review, 19.
Safir, K. (1985a). Syntactic Chains. Cambridge: CUP.
Safir, K. (1985b). Missing Subjects in German. In J. Toman (Ed.), Studies on German Grammar (pp. 193–229). Dordrecht: Foris.
Safir, K. (1986). Relative Clauses in a Theory of Binding and Levels. Linguistic Inquiry, 17, 663–689.
Saito, M. (1985). Some Asymmetries in Japanese and Their Theoretical Implications. Doctoral dissertation, MIT.
Saito, M. (1992). Long Distance Scrambling in Japanese. Journal of East Asian Linguistics, 1, 69–118.
Saito, M. (1994a). Additional-wh Effects and the Adjunction Site Theory. Journal of East Asian Linguistics, 3, 195–240.
Saito, M. (1994b). Improper Adjunction. In M. Koizumi and H. Ura (Eds.), Formal Approaches to Japanese Linguistics I: MIT Working Papers in Linguistics, 24 (pp. 263–293). Department of Linguistics and Philosophy, MIT, Cambridge, MA.
Saito, M. (1994c). Scrambling and the Functional Interpretation of Wh-Phrases. Ms., University of Connecticut.
Saito, M. & H. Hoshi (2000). The Japanese Light Verb Construction and The Minimalist Program. In R. Martin et al. (Eds.), Step by Step. Essays on Minimalist Syntax in Honor of Howard Lasnik (pp. 261–295). Cambridge, MA: MIT Press.
Sakai, H. (1994). Derivational Economy in Long Distance Scrambling. In M. Koizumi and H. Ura (Eds.), Formal Approaches to Japanese Linguistics I: MIT Working Papers in Linguistics, 24 (pp. 295–314). Department of Linguistics and Philosophy, MIT, Cambridge, MA.
Stepanov, A. (1998). On Wh-Fronting in Russian. Proceedings of the North East Linguistic Society (NELS), 28.
Tajima, K. (1987). Wh-Q/Wh-Rel Asymmetries and Conditions on A′-Chains. Proceedings of CLS, 23, 336–349.
Tajima, K. & K. Arimura (1988). Two Types of Variables: A D-Structure Adjunction Approach to Null Operator Constructions. Proceedings of CLS, 24, 362–376.
Takahashi, D. (1994). Minimality of Movement. Doctoral dissertation, University of Connecticut, Storrs.
Takano, Y. (1996). Movement and Parametric Variation in Syntax. Doctoral dissertation, University of California, Irvine.
Takeda, K. (1997). A Note on Locality of Feature Movement and Category Movement. UCI Working Papers in Linguistics, 3, 183–202.
Toman, J. (1981). Aspects of Multiple Wh-Movement in Polish and Czech. In R. May and J. Koster (Eds.), Levels of Syntactic Representation (pp. 292–302). Dordrecht: Foris.
Tonoike, S. (1997). On Scrambling. In S. Tonoike (Ed.), Scrambling (pp. 125–159). Tokyo: Kurosio Publishers.
Toyoshima, T. (1997). Head-to-Spec Movement. Ms., Cornell University.
Travis, L. (1984). Parameters and Effects of Word Order Variation. Doctoral dissertation, MIT.
Ura, H. (1994). Varieties of Raising and the Feature-based Bare Phrase Structure Theory. MIT Occasional Papers in Linguistics, 7. Department of Linguistics and Philosophy, MIT, Cambridge, MA.
Uriagereka, J. (1988). On Government. Doctoral dissertation, University of Connecticut.
Watanabe, A. (1993). Larsonian CP Recursion, Factive Complements, and Selection. Proceedings of the North East Linguistic Society (NELS), 23, 523–537.
Williams, E. (1986). A Reassignment of the Functions of LF. Linguistic Inquiry, 17, 265–299.
Willim, E. (1989). On Word Order: A Government-Binding Study of English and Polish. Kraków.
Wyngaerd, G. Vanden (1989). Object Shift as an A-Movement Rule. In P. Branigan et al. (Eds.), Student Conference on Linguistics – MIT Working Papers in Linguistics, 11 (pp. 256–271). Department of Linguistics and Philosophy, MIT, Cambridge, MA.
Wurmbrand, S. (1998). Infinitives. Doctoral dissertation, MIT.
Yadroff, M. (1991). The Syntactic Properties of Adjunction (Scrambling in Russian). Ms., Indiana University, Bloomington.
Yadroff, M. (1994). Long Distance Scrambling is just Left Dislocation? Ms., Indiana University, Bloomington.
Zabrocki, T. (1981). Lexical Rules of Semantic Interpretation. Poznań: Uniwersytet im. Adama Mickiewicza w Poznaniu.
Index
A
absorption 117
adjunction 111f.
  with multiple roots 113f., 116, 118
Agree 48, 50, 54, 56, 165, 166, 168, 172, 241
antifreezing 86f.
Attract 179

C
case 167, 169, 173, 175, 182, 185, 237f., 242f.
categorial identity condition 114
change of location verbs 143, 148f.
change of state verbs 142, 150f.
conflation 211ff.
constituent 120f., 126, 134
c-command 109f.
Constraint on Adjunction Movement 260, 277
copy and deletion 82f.
co-valued features 165, 172, 188f., 193f., 198f.
  locality of 200, 201

D
Distributed Morphology 15, 25
dominance 110

E
EPP 168, 186, 187, 203, 249f., 254f., 305
existential locative constructions 226f.
expletive constructions 188f.

F
floating quantifiers 176ff.

G
Global economy 180f.

L
Languages
  Basque 157, 244, 245f.
  Catalan 224, 227
  Chamorro 1331
  Croatian 66ff., 72ff., 80ff., 99f.
  Dutch 224, 294f.
  Finnish 34
  French 34, 135, 192f.
  Georgian 35
  German 65ff., 73ff., 79ff., 90ff., 191f., 263, 275, 279, 283, 285
  Haitian Creole 244
  Hebrew 18, 32
  Hindi 35, 157
  Icelandic 7f., 176, 196f., 251, 282
  Irish 35
  Italian 21, 22, 23, 27, 29, 30, 31ff., 35, 279
  Japanese 77, 247, 262, 284f.
  Kiswahili 168f.
  Polish 66, 273f.
  Russian 34, 225, 284
  Sardinian 228
  Scottish Gaelic 35
  Southern Tiwa 245
  Spanish 167, 177f., 212ff., 243, 247, 254, 279, 293f.
  Swedish 252f.
  Turkish 150, 153, 158
locality of Agreement 178
long scrambling 283

M
Minimal Link Condition 271
monadic verbs 140
  unaccusative vs. unergative 139f.
Move 179, 181
Move F 46, 47, 49

O
object experiencer verbs 19f., 29f.
  stative and non-stative reading 20, 21, 22, 23

P
Parallel Movement Constraint 89
Phase Impenetrability Condition 167, 202, 259
Person-Case Constraint 243f.
Principle A reconstruction effects 261f.
Proper Binding Condition 42, 45

Q
quantifier raising 42, 289f.

R
Relativized Minimality 41, 51f., 60, 198, 265
remnant movement 41, 42, 71
root node 111, 112f.
roots in distributed morphology 15ff.

S
Scandinavian object shift 250f.
scope reconstruction 267f.
Single-Output syntax 48
Specific Subject Condition 73
Split ergativity 156f.
subject experiencer verbs 33, 34, 35, 36
substitution
  with multiple roots 113, 115
superraising 199

T
transitivity alternation 141

U
Unambiguous Domination 76
Uniformity Condition on Chains 281, 292
Uniformity Principle 239

V
verbs of being 142, 145f.
verbs of creation 143, 153f.

W
weak crossover 263

X
XP-split construction 66f.