CONTROL AS MOVEMENT
The movement theory of control (MTC) makes one major claim: that control relations in sentences like ‘John wants to leave’ are grammatically mediated by movement. This goes against the traditional view that such sentences involve not movement, but binding, and analogizes control to raising, albeit with one important distinction: whereas the target of movement in control structures is a theta position, in raising it is a non-theta position; however, the grammatical procedures underlying the two constructions are the same. This book presents the main arguments for MTC and shows it to have many theoretical advantages, the biggest being that it reduces the kinds of grammatical operations that the grammar allows, an important advantage in a minimalist setting. It also addresses the main arguments against MTC, using examples from control shift, adjunct control, and the control structure of “promise,” showing MTC to be conceptually, theoretically, and empirically superior to other approaches.

Cedric Boeckx is Research Professor at the Catalan Institute for Advanced Studies (ICREA), and a member of the Center for Theoretical Linguistics at the Universitat Autònoma de Barcelona.

Norbert Hornstein is Professor in the Department of Linguistics at the University of Maryland, College Park.

Jairo Nunes is Professor in the Department of Linguistics at the Universidade de São Paulo, Brazil.
In this series

81 Roger Lass: Historical linguistics and language change
82 John M. Anderson: A notional theory of syntactic categories
83 Bernd Heine: Possession: cognitive sources, forces and grammaticalization
84 Nomi Erteschik-Shir: The dynamics of focus structure
85 John Coleman: Phonological representations: their names, forms and powers
86 Christina Y. Bethin: Slavic prosody: language change and phonological theory
87 Barbara Dancygier: Conditionals and prediction
88 Claire Lefebvre: Creole genesis and the acquisition of grammar: the case of Haitian creole
89 Heinz Giegerich: Lexical strata in English
90 Keren Rice: Morpheme order and semantic scope
91 April McMahon: Lexical phonology and the history of English
92 Matthew Y. Chen: Tone Sandhi: patterns across Chinese dialects
93 Gregory T. Stump: Inflectional morphology: a theory of paradigm structure
94 Joan Bybee: Phonology and language use
95 Laurie Bauer: Morphological productivity
96 Thomas Ernst: The syntax of adjuncts
97 Elizabeth Closs Traugott and Richard B. Dasher: Regularity in semantic change
98 Maya Hickmann: Children’s discourse: person, space and time across languages
99 Diane Blakemore: Relevance and linguistic meaning: the semantics and pragmatics of discourse markers
100 Ian Roberts and Anna Roussou: Syntactic change: a minimalist approach to grammaticalization
101 Donka Minkova: Alliteration and sound change in early English
102 Mark C. Baker: Lexical categories: verbs, nouns and adjectives
103 Carlota S. Smith: Modes of discourse: the local structure of texts
104 Rochelle Lieber: Morphology and lexical semantics
105 Holger Diessel: The acquisition of complex sentences
106 Sharon Inkelas and Cheryl Zoll: Reduplication: doubling in morphology
107 Susan Edwards: Fluent aphasia
108 Barbara Dancygier and Eve Sweetser: Mental spaces in grammar: conditional constructions
109 Matthew Baerman, Dunstan Brown, and Greville G. Corbett: The syntax–morphology interface: a study of syncretism
110 Marcus Tomalin: Linguistics and the formal sciences: the origins of generative grammar
111 Samuel D. Epstein and T. Daniel Seely: Derivations in minimalism
112 Paul de Lacy: Markedness: reduction and preservation in phonology
113 Yehuda N. Falk: Subjects and their properties
114 P. H. Matthews: Syntactic relations: a critical survey
115 Mark C. Baker: The syntax of agreement and concord
116 Gillian Catriona Ramchand: Verb meaning and the lexicon: a first phase syntax
117 Pieter Muysken: Functional categories
118 Juan Uriagereka: Syntactic anchors: on semantic structuring
119 D. Robert Ladd: Intonational phonology, second edition
120 Leonard H. Babby: The syntax of argument structure
121 B. Elan Dresher: The contrastive hierarchy in phonology
122 David Adger, Daniel Harbour, and Laurel J. Watkins: Mirrors and microparameters: phrase structure beyond free word order
123 Niina Ning Zhang: Coordination in syntax
124 Neil Smith: Acquiring phonology
125 Nina Topintzi: Onsets: suprasegmental and prosodic behaviour
126 Cedric Boeckx, Norbert Hornstein, and Jairo Nunes: Control as movement

Earlier issues not listed are also available
CAMBRIDGE STUDIES IN LINGUISTICS
General editors: P. Austin, J. Bresnan, B. Comrie, S. Crain, W. Dressler, C. J. Ewen, R. Lass, D. Lightfoot, K. Rice, I. Roberts, S. Romaine, N. V. Smith
CONTROL AS MOVEMENT

CEDRIC BOECKX
ICREA/Universitat Autònoma de Barcelona

NORBERT HORNSTEIN
University of Maryland, College Park

JAIRO NUNES
Universidade de São Paulo, Brazil

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Dubai, Tokyo

Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK

Published in the United States of America by Cambridge University Press, New York

www.cambridge.org
Information on this title: www.cambridge.org/9780521195454

© Cedric Boeckx, Norbert Hornstein, and Jairo Nunes 2010

This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published in print format 2010

ISBN-13 978-0-511-78955-7 eBook (NetLibrary)
ISBN-13 978-0-521-19545-4 Hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Contents

Acknowledgments

1 Introduction

2 Some historical background
2.1 Introduction
2.2 What any theory of control should account for
2.3 Control in the standard-theory framework
2.4 Control in GB
2.5 Non-movement approaches to control within minimalism
2.5.1 The null-case approach
2.5.2 The Agree approach
2.6 Conclusion

3 Basic properties of the movement theory of control
3.1 Introduction
3.2 Departing from the null hypothesis: historical, architectural, and empirical reasons
3.3 Back to the future: elimination of DS and the revival of the null hypothesis
3.4 Controlled PROs as A-movement traces
3.4.1 Configurational properties
3.4.2 Interpretive properties
3.4.3 Phonetic properties and grammatical status
3.5 Conclusion

4 Empirical advantages
4.1 Introduction
4.2 Morphological invisibility
4.3 Interclausal agreement
4.4 Finite control
4.4.1 Finite control and hyper-raising
4.4.2 Finite control, islands, and intervention effects
4.4.3 Summary
4.5 The movement theory of control under the copy theory of movement
4.5.1 Adjunct control and sideward movement
4.5.2 The movement theory of control and morphological restrictions on copies
4.5.3 Backward control
4.5.4 Phonetic realization of multiple copies and copy control
4.6 Conclusion

5 Empirical challenges and solutions
5.1 Introduction
5.2 Passives, obligatory control, and Visser’s generalization
5.2.1 Relativizing A-movement
5.2.2 Impersonal passives
5.2.3 Finite control vs. hyper-raising
5.3 Nominals and control
5.3.1 Finite control into noun-complement clauses in Brazilian Portuguese
5.3.2 Raising into nominals in Hebrew
5.3.3 The contrast between raising nominals and control nominals in English
5.4 Obligatory control and morphological case
5.4.1 Quirky case and the contrast between raising and control in Icelandic
5.4.2 Apparent case-marked PROs
5.5 The minimal-distance principle, control shift, and the logic of minimality
5.5.1 Control with promise-type verbs
5.5.2 Control shift
5.5.3 Summary
5.6 Partial and split control
5.6.1 Partial control
5.6.2 Split control
5.7 Conclusion

6 On non-obligatory control
6.1 Introduction
6.2 Obligatory vs. non-obligatory control and economy computations
6.3 Some problems
6.4 A proposal
6.5 Conclusion

7 Some notes on semantic approaches to control
7.1 Introduction
7.2 General problems with selectional approaches to obligatory control
7.3 “Simpler syntax”
7.3.1 Some putative problems for the movement theory of control
7.3.2 Challenges for “simpler syntax”
7.4 Conclusion

8 The movement theory of control and the minimalist program
8.1 Introduction
8.2 Movement within minimalism and the movement theory of control
8.3 The movement theory of control and the minimalist architecture of UG
8.4 Inclusiveness, bare phrase structure, and the movement theory of control
8.5 Conclusion

References
Index
Acknowledgments
Previous versions of (part of) the material discussed here have been presented at the following universities: Connecticut, Harvard, Leiden, Lisbon, Maryland, New York, Rutgers, São Paulo, Stony Brook, Tilburg, and Utrecht; and at the following meetings: ANPOLL 2003, XVIII Colloquium on Generative Grammar, Edges in Syntax, EVELIN 2004, GLOW XXX, Going Romance 2007, LSA 2005, Romania Nova II, Ways of Structure Building, and V Workshop on Formal Linguistics at USP. We would like to thank these audiences for comments and suggestions. Special thanks to Željko Bošković, Hans Broekhuis, Lisa Cheng, Marcelo Ferreira, Michaël Gagnon, Terje Lohndal, Carme Picallo, and Johan Rooryck. We would also like to acknowledge the support received from the Generalitat de Catalunya (grant 2009SGR1079; first author), NSF (grant NSD.BCS.0722648; second author), and CNPq and FAPESP (grants 302262/2008–3 and 2006/00965–2; third author).
1 Introduction
In the following pages we develop an extended argument for a proposal whose conceptual simplicity and empirical success will, we trust, be evident to all readers. The proposal says that (obligatory) control is movement, more specifically, A-movement. We propose that the phenomena that have been used to motivate a special and separate control construction are best explained if control is treated as an A-movement dependency, on a par with other phenomena that have been traditionally treated in terms of A-movement such as passive, raising, and (local) scrambling. Put another way, we claim that maintaining the constructional specificity of control (in whatever form, be it in terms of the PRO theorem [e.g., Chomsky 1981], null case [e.g., Chomsky and Lasnik 1993; Martin 1996; and Bošković 1997], or ad hoc “anaphoric” tense-agreement dependencies [e.g., Landau 1999, 2000, 2004]) significantly hampers our understanding of the phenomenon as it leads to explanations that are roughly as complex as the phenomenon itself.

Despite virtues that we believe are transparent (see e.g., Hornstein 1999, 2001), the movement theory of control (hereafter, MTC) has proven to be quite controversial.¹ We believe that there are several reasons for this. The first one is historical. Differentiating raising from control in terms of movement has been a fixed point within generative grammar from the earliest accounts within the standard theory to current versions of minimalism (see Davies and Dubinsky 2004). Under this long-held view, which became crystallized in GB with the formulation of the (construction-specific) control module (Chomsky 1981), if raising involves movement, control cannot. It is thus not surprising that the MTC has been welcomed with considerable skepticism, as its basic proposal is exactly to analyze control in terms of (A-)movement. However, such historical bias should not deter us from a fair evaluation of the conceptual properties and empirical coverage of the MTC.
1 See e.g., Landau (2000, 2003); Culicover and Jackendoff (2001, 2005); Kiss (2005); and van Craenenbroeck, Rooryck, and van den Wyngaerd (2005) for a useful sample.
The second reason behind the controversy is also related to the long interest control has enjoyed within the generative tradition. Over the years, control phenomena have been richly described. Consequently, any new approach will likely fail, at least initially, to adequately handle some of the relevant data. Moreover, if the novel approach is conceptually tighter than the more descriptive accounts that it aims to replace (as we believe to be the case with the MTC), some features of the phenomenon heretofore assumed to be central may not be accommodated at all. This should occasion no surprise, as it reflects the well-known tension between description and explanation. Odd as it may seem, failure to cover a data point may be a mark of progress if those that are covered follow in a more principled fashion. The virtues of a proposal can be seriously misevaluated unless one keeps score of both what facts are covered and how facts are explained. A weak theory can often be “easily” extended to accommodate yet another data point, and this is not a virtue. Correspondingly, a tight theory may miss some “facts” and this is not necessarily a vice, particularly if the account is comparatively recent and the full implications of its resources have not yet been fully developed. We believe that many have been too impressed by these apparent problems without considering how the MTC might be developed to handle them. In fact, we believe that the MTC actually faces few empirical difficulties (and none of principle), whereas the current alternatives both face very serious empirical hurdles (e.g., backward control) and often empirically succeed by stipulating what should be explained (e.g., the distribution of PRO through null case). One aim of what follows is to make this case in detail. Finally, it is fair to say that the resistance to MTC is in part due to the inadequacies and limitations of previous versions of the MTC (including our own work), which we have tried to overcome here. 
Addressing the vigorous critiques of MTC here and in previous work (Hornstein 2003; Boeckx and Hornstein 2003, 2004, 2006a; Nunes 2007; Boeckx, Hornstein, and Nunes in press) has allowed us to rectify some errors, clarify the proposal, and sharpen the arguments. This stimulating intellectual exercise has led us to better appreciate the consequences of the MTC and has in fact convinced us that it covers even more empirical ground than we at first thought, as we will argue in the following chapters. For all these reasons, we thought that a detailed defense of MTC required a monograph. But before we launch our defense of MTC, a few notes are in order. First, we cannot emphasize enough that MTC does not equate “control” with “raising.” Since the MTC was first proposed, it has been regularly objected
that the MTC cannot be right because of features that control has, but raising does not, and vice versa. However, control is raising only in the descriptive sense that control is an instance of A-movement, but it is not raising qua construction. In other words, all the MTC is saying is that, like the derivation of raising, passive, or local scrambling constructions, the derivation of obligatory-control constructions also involves A-movement. The different properties of constructions involving wh-movement and topicalization, for instance, do not argue against analyzing them in terms of A’-movement. Similarly, we urge the reader not to dismiss our proposal simply because (unanalyzed) control–raising asymmetries exist. Although raising often proves useful in illustrating properties of A-movement that carry over to control, it is a ladder that ought to be kicked away as theory advances. In the chapters that follow, we in fact argue that control–raising asymmetries generally reduce to independent factors – something we take to be an indication that the MTC is on the right track.

Second, the MTC is actually not a radically new idea. It goes back as far as Bowers (1973), who already proposed that raising and control should be basically generated in the same way. However, as the proposal conflicted with core principles of almost every model of UG from Aspects to GB, it did not find fertile soil to blossom for a long time. This scenario drastically changed when the minimalist program came into the picture. Chomsky’s (1993) proposal that D-structure should be eliminated provided a very natural conceptual niche for the MTC within the generative enterprise as it removed the major theoretical obstacle that prevented movement to θ-positions. In a system with D-structure, movement to θ-positions is a non-issue, for movement can only take place once θ-assignment is taken care of. By contrast, in a system without D-structure, where movement and θ-assignment intersperse, movement to θ-positions arises at least as a logical possibility. Thus, whether or not it is a sound option has to be determined on the basis of the other architectural features of the system, as well as its empirical coverage. We hope to show that the MTC fits snugly with some leading minimalist conceptions and thus constitutes an interesting argument in its favor.

Third, as minimalism aspires to explain why UG properties are the way they are, we are interested in developing a theory of control that deduces the properties of control configurations from more basic postulates, rather than merely listing the possible controllers, controllees, control predicates, and control complements coded as features of individual lexical items.

Finally, although our specific implementation of MTC is the one that has been extended to the broadest range of data thus far, it is certainly not the only one possible. O’Neil (1995), Manzini and Roussou (2000), Kayne (2002), and
Bowers (2006) share the spirit but not the details of our analysis. For reasons of space, we will not be able to do proper justice to these works and the reader is invited to evaluate each different implementation in its own right. Let us close this introductory chapter by providing an overview of the subsequent chapters. Chapter 2 offers a brief overview of how control is handled in the standard-theory framework, in GB, and in non-movement approaches within minimalism. Chapter 3 lays out the broad features of our version of the MTC. Chapter 4 discusses some of the empirical advantages that the MTC has. Chapter 5 addresses many of the empirical challenges that have been considered to be fatal to the MTC and proposes solutions compatible with the MTC. Chapter 6 presents our take on how non-obligatory control is to be analyzed. Chapter 7 discusses the extent to which the MTC is based on more solid conceptual and empirical grounds than semantic/selectional approaches to obligatory control. Finally, Chapter 8 concludes the monograph.
2 Some historical background
2.1 Introduction
Up to very recently, there had been a more or less uncontroversial view that control phenomena should be analyzed in terms of special grammatical primitives (e.g., PRO) and construction-specific interpretive systems (e.g., the control module). In this chapter, we examine how this conception of control was instantiated in the standard-theory framework (section 2.3), in GB (section 2.4), and in non-movement analyses within the minimalist program (section 2.5), briefly outlining what we take to be the virtues and problems of each approach.¹ This discussion will provide the general background for us to discuss the core properties of (our version of) the MTC in Chapter 3 and evaluate its adequacy in the face of the general desiderata for grammatical downsizing explored in the minimalist program.
2.2 What any theory of control should account for
A theoretically sound approach to control – one that goes beyond the mere listing of the properties involved in control – must meet (at least) the following four requirements.

First, it must specify the kinds of control structures that are made available by UG and explain how and why they differ. Assuming, for instance, that obligatory control (OC) and non-obligatory control (NOC) are different, their differences should be reduced to more basic properties of the system.

Second, it must correctly describe the configurational properties of control, accounting for the positions that the controller and the controllee can occupy. In addition, it should provide an account as to why the controller and the controllee are so configured. Assuming, for instance, that the controllee can only appear in a subset of possible positions (e.g., ungoverned subjects), why are controllees so restricted?

Third, it must account for the interpretation of the controllee, explaining how the antecedent of the controllee is determined and specifying what kind of anaphoric relation obtains between the controllee and its antecedent (in both OC and NOC constructions) and why these relations obtain and not others. For instance, assuming that controllers must locally bind controllees in OC constructions, why is the control relation so restricted in these cases?

Fourth, it must specify the nature of the controllee: what is its place among the inventory of null expressions provided by UG? Is it a formative special to control constructions or is it something that is independently attested?

In the next sections, we briefly review how these concerns have been addressed from the standard-theory model to the minimalist program.

1 For much more detailed discussion, we urge the reader to consult Davies and Dubinsky’s (2004) excellent history of generative treatments of raising and control.

2.3 Control in the standard-theory framework
Within the framework of the standard theory, control phenomena were coded in the obligatory transformation referred to as equi(valent) NP deletion (END), which for our current purposes can be described as follows:²

(1) X - NP - Y - [S {for/poss} - NP - Z] - W
    Structural description: 1  2  3  4  5  6  7  →
    Structural change:      1  2  3  4  Ø  6  7
    Conditions:
    i.  2 = 5
    ii. the minimal-distance principle is satisfied
Irrelevant details aside, END applies to the (a)-structures in (2)–(5), for instance, and converts them into the corresponding (b)-sentences.

(2) a. John tried/wanted/hoped [for John to leave early] →
    b. John tried/wanted/hoped to leave early

(3) a. John regrets/insisted on/prefers [poss John leaving early] →
    b. John regrets/insisted on/prefers leaving early

(4) a. John persuaded/ordered/forced/asked/told Mary [for Mary to leave early] →
    b. John persuaded/ordered/forced/asked/told Mary to leave early

(5) a. John kissed Mary before/after/without [poss John asking if he could] →
    b. John kissed Mary before/after/without asking if he could
2 Here we abstract away from issues that are orthogonal to our discussion such as the interaction between END and the rule of complementizer deletion, which has the effect of deleting the term numbered 4 in (1). See Rosenbaum (1967, 1970) for discussion.
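The rule in (1) can be read as a small procedure. The Python sketch below is only an illustration under simplifying assumptions and is not part of the standard-theory formalism: the names `Factorization`, `equi_np_deletion`, and `surface` are hypothetical, identity (condition i) is approximated as string identity, and the minimal-distance principle (condition ii) is approximated linearly as "no other NP inside term 3". That approximation covers complement cases such as (2)–(4) but not adjunct control as in (5), where structural rather than linear closeness is what matters. Dropping term 4 in `surface` stands in for the separate rule of complementizer deletion.

```python
from typing import List, NamedTuple, Optional, Set


class Factorization(NamedTuple):
    """The seven terms of the structural description in (1)."""
    x: List[str]               # term 1
    controller: str            # term 2: the antecedent NP
    y: List[str]               # term 3
    comp: str                  # term 4: 'for' or 'poss'
    controllee: Optional[str]  # term 5: embedded subject; None encodes Ø
    z: List[str]               # term 6
    w: List[str]               # term 7


def equi_np_deletion(sd: Factorization,
                     noun_phrases: Set[str]) -> Optional[Factorization]:
    """Return the structural change of (1), or None if END cannot apply."""
    if sd.comp not in ("for", "poss"):
        return None
    # Condition (i): term 2 = term 5 (approximated as string identity).
    if sd.controllee is None or sd.controller != sd.controllee:
        return None
    # Condition (ii), linear approximation (an assumption of this sketch):
    # no other NP may sit between the controller and the embedded clause.
    if any(token in noun_phrases for token in sd.y):
        return None
    return sd._replace(controllee=None)


def surface(sd: Factorization) -> str:
    """Spell out the terms; once term 5 is Ø, term 4 is dropped as well."""
    words = sd.x + [sd.controller] + sd.y
    if sd.controllee is not None:
        words += [sd.comp, sd.controllee]
    return " ".join(words + sd.z + sd.w)


NPS = {"John", "Mary"}
ex2a = Factorization([], "John", ["tried"], "for", "John",
                     ["to", "leave", "early"], [])
ex4a = Factorization(["John", "persuaded"], "Mary", [], "for", "Mary",
                     ["to", "leave", "early"], [])
# A factorization in which the object 'Mary' intervenes between the
# matrix subject and an embedded subject 'John'.
blocked = Factorization([], "John", ["persuaded", "Mary"], "for", "John",
                        ["to", "leave", "early"], [])

print(surface(equi_np_deletion(ex2a, NPS)))
print(surface(equi_np_deletion(ex4a, NPS)))
print(equi_np_deletion(blocked, NPS))
```

Under these assumptions the first two calls yield the (b)-sentences of (2) and (4), while the third factorization is rejected because an NP intervenes in term 3.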
According to this approach, there is nothing of special interest in the nature of the controllee. It is a regular NP in the underlying structure and the fact that the corresponding surface position is phonetically null follows from the kind of transformation END is. It is a deletion transformation that removes the targeted NP, leaving nothing at surface structure. To put it differently, the superficial phonetic difference between controller and controllee results not from intrinsic lexical properties of the controllee, but from properties of the computation itself, i.e., that END is a deletion operation.

As far as the configurational properties of control are concerned, END explicitly specifies that the controllee (the target of deletion) must occur in the subject position of infinitival clauses (for-clauses) and gerunds (poss-clauses), and that the controller must be the closest NP (in compliance with the minimal-distance principle). Thus, according to the minimal-distance principle, sentences such as (4b) must be derived from the structures in (4a) and not from the one in (6) below, which would incorrectly allow the understood subject of the embedded clause to be interpreted as being coreferential with the matrix subject. As opposed to what we find in (4a), the antecedent of the controllee in (6) is not the closest NP around. As for adjunct control in sentences such as (5), the minimal-distance principle is satisfied under the assumption that the embedded clause is adjoined to the matrix clause and, as such, it is structurally closer to the subject than it is to the object.³

(6) John persuaded/ordered/forced/asked/told Mary [for John to leave early]
Finally, the interpretation properties of control are enforced by condition (i), which requires that controller and controllee be “identical,” which was understood in terms of coreference. This general approach was refined within the standard theory as more complex control structures were considered, but its axiomatic (i.e., stipulative) nature remained. The configurational and interpretive properties of control were analyzed as irreducible features of the END transformation itself. This by no means diminishes the value of these earlier approaches to control. Identifying the different properties of control phenomena with such formal rigor was unquestionably an achievement, with large consequences for theorizing beyond control structures, and it paved the way for subsequent reanalyses in GB and in the minimalist program.

3 END as stated is not entirely adequate empirically. Given (1) above, the structure in (ia), for example, should allow for control by ‘Mary’ in (ib):
(i) a. John persuaded a friend of Mary [for Mary to leave]
    b. John persuaded a friend of Mary to leave
It should be clear how requiring that some sort of command relation hold between the antecedent NP and the deleted one will help screen out cases like (i), where the “wrong” NP is chosen.

Before we leave this brief review, two points are worth mentioning which will be relevant to the discussion of these later reanalyses, including the MTC. The first one regards an empirical problem that the standard-theory approach faced in relation to the way it handled the interpretive properties of control. As we saw above, the controller and the controllee were taken to be lexically identical and the semantic relation between them was understood as coreference. Problems arise when the controller is not a referential NP, as exemplified by the contrast between (7) and (8).

(7) a. [John wants [John to win]] →
    b. John wants to win

(8) a. [Everyone wants [everyone to win]] →
    b. Everyone wants to win
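The difference between (7) and (8) can be made concrete in a toy model. The following Python sketch is an illustrative assumption, not an analysis from the standard-theory literature: a proposition is encoded simply as the set of individuals who must win, and the helper names `literal_copy_reading` and `bound_reading` are hypothetical. The first function interprets the embedded subject as a copy of the whole controller NP, as in the (a)-structures; the second interprets it as a variable bound by the controller.

```python
from typing import Dict, FrozenSet, Set

# A proposition is modeled (very coarsely) as the set of people who win.
Proposition = FrozenSet[str]
Desires = Dict[str, Set[Proposition]]


def literal_copy_reading(domain: Set[str], wants: Desires) -> bool:
    """Each x in the domain wants the proposition 'the whole domain wins'."""
    everyone_wins = frozenset(domain)
    return all(everyone_wins in wants[x] for x in domain)


def bound_reading(domain: Set[str], wants: Desires) -> bool:
    """Each x in the domain wants the proposition 'x wins'."""
    return all(frozenset({x}) in wants[x] for x in domain)


# A model in which each person wants only his or her own victory.
wants: Desires = {
    "Ann": {frozenset({"Ann"})},
    "Bob": {frozenset({"Bob"})},
}

# Referential controller, as in (7): a one-member domain, so the two
# readings cannot come apart.
john_wants: Desires = {"John": {frozenset({"John"})}}
print(literal_copy_reading({"John"}, john_wants), bound_reading({"John"}, john_wants))

# Quantified controller, as in (8): the readings diverge.
print(literal_copy_reading({"Ann", "Bob"}, wants), bound_reading({"Ann", "Bob"}, wants))
```

In this model the two readings coincide for a single referential controller but diverge once the controller quantifies over more than one individual, which is the pattern that makes a literal copy of the controller inadequate as the underlying embedded subject.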
Whereas (7a) might be taken to roughly represent the meaning of (7b), (8a) in no way represents the interpretation of (8b), which should rather be paraphrased as ‘Everyone wants himself to win.’ This suggests that, instead of an NP identical to (i.e., coreferential with) its controller in underlying structure, what we actually need is a kind of bound anaphor or an expression that can be so interpreted.⁴ The obvious question then is how to obtain this bound interpretation.

The second point worth mentioning concerns the identification of another type of control. Relatively early on, END was distinguished from a related operation dubbed super-equi (SEND). This operation also deletes a subject of a non-finite clause but, in contrast to END, it operates across unbounded stretches of sentential material, as illustrated in (9).⁵
4 If there is an anaphoric relation in control structures, then END is unlikely to be a chopping (“gap”-leaving) rule. Rather, it is more like the rules of reflexivization or pronominalization, which were operations governed by command relations. The problem is that control structures do not appear to leave lexical residues like the other construal operations. They appear to require a phonetic gap. Seen from a contemporary perspective, the problem of how to characterize the rules that lead to control structures (are they chopping rules or construal rules?) highlights the tension that we will see constantly recurring: how best to account for both the distribution of the controllee and its interpretation.
5 See Grinder (1970). Data such as (9) are not the sorts of cases Grinder discussed, but they fall under the SEND rubric.
(9) a. [S1 John said [S2 that Mary believes [S3 that [S4 John washing himself] would make a good impression on possible employers]]] →
    b. John said that Mary believes that washing himself would make a good impression on possible employers
Note that (9) violates the minimal-distance principle, as ‘Mary’ intervenes between the target of deletion (‘John’ in S4) and its antecedent (‘John’ in S1). Moreover, in contrast to standard END configurations, the controllee is not within a clausal complement (or adjunct) of a higher predicate. In (9), for instance, the controllee is within the sentential subject of S3. The following question then arises: what is the relation between END and SEND? Or to put the question somewhat differently: why should UG have two rules that have the same effect (deletion of an identical NP), but apply to different configurations?⁶

In the next sections we examine some answers to these two issues that were offered within GB and the minimalist program.

2.4 Control in GB
Building on earlier work in the extended standard theory (EST), the GB approach to control is considerably more ambitious and empirically more successful than the standard-theory model. Within GB, the controllee is a PRO, a base-generated NP containing no lexical material ([NP Ø]). This conception of the controllee as a base-generated non-lexical formative arises as a natural consequence of the GB assumptions regarding the base component. The GB theory of the base includes both phrase-structure rules, like the ones in (10), and lexical-insertion operations, like the ones in (11).

(10) a. S → NP INFL VP
     b. VP → V NP
     c. NP → N

(11) a. N → John/he/it/Bill
     b. V → kiss/see/admire
     c. INFL → past/to
6 Grinder (1970) actually collapsed END and SEND. However, later approaches identified many substantial differences between the constructions underlying END and SEND that are better captured if two kinds of control are recognized, as we shall see below.
These two types of rules operate in tandem to generate structures such as (12) below. However, they can also be used to generate structures like (13), where the subject of the clause has been generated by the phrase-structure component but has not been filled by lexical insertion. In short, a theory of the base factored into a set of phrase-structure rules and lexical-insertion operations has room for an element like PRO: it is what one gets when one generates an NP structure but does not subject it to lexical insertion.
(12) [S [NP John] past [VP see [NP Bill]]]
(13) [S [NP Ø] to [VP see [NP Bill]]]
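The division of labor between the rules in (10) and the insertion operations in (11) can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions that are not part of the original discussion: the names PS_RULES, LEXICON, expand, and generate are hypothetical, and only N is allowed to go without lexical insertion. It expands S with the rules in (10) and fills terminals from (11); leaving the subject N unfilled yields the [NP Ø] of (13).

```python
# Toy model of the GB base: phrase-structure rules (10) plus lexical insertion (11).
# All names here are illustrative assumptions, not part of the theory itself.
PS_RULES = {"S": ["NP", "INFL", "VP"], "VP": ["V", "NP"], "NP": ["N"]}
LEXICON = {
    "N": {"John", "he", "it", "Bill"},
    "V": {"kiss", "see", "admire"},
    "INFL": {"past", "to"},
}


def expand(category, choices):
    """Expand `category` top-down, consuming one entry of `choices` per terminal."""
    if category in LEXICON:
        word = choices.pop(0)
        if word is None:
            # Assumption of this sketch: only N may go without lexical insertion.
            if category != "N":
                raise ValueError(f"{category} must undergo lexical insertion")
            return "Ø"
        if word not in LEXICON[category]:
            raise ValueError(f"{word!r} cannot be inserted under {category}")
        return word
    daughters = " ".join(expand(d, choices) for d in PS_RULES[category])
    return f"[{category} {daughters}]"


def generate(choices):
    """Build a labeled bracketing for S; None means 'skip lexical insertion'."""
    return expand("S", list(choices))


print(generate(["John", "past", "see", "Bill"]))  # structure (12)
print(generate([None, "to", "see", "Bill"]))      # structure (13)
```

On this way of modeling things, PRO is not an entry in LEXICON at all; it is simply what the rule NP → N leaves behind when no insertion applies.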
This way of understanding PRO has an interesting consequence for the constructions that were captured by END in the standard theory. If one assumes that categories without lexical content are uninterpretable unless provided with “content” (by being linked with an antecedent, for example) and, furthermore, that the principle of full interpretation does not tolerate contentless structures, then the requirement that PRO must have an antecedent follows naturally.7 We wish to stress this point as it is important for some of the discussion that follows. If one treats PRO as a lexical element, it is hard to explain why PRO must be phonetically null and why it requires an antecedent. Of course, it is possible to stipulate that these two features are inherent properties of a specific lexical item (PRO), but this cannot explain why PRO is necessarily anaphoric and null. Moreover, so conceived, PRO is a rather unusual lexical element as it has no positive properties. It has no phonetic matrix and its only semantic feature is the requirement that it must be coindexed with a grammatical antecedent.8 This point is worth emphasizing. PRO, on this view, is not simply a semantically dependent expression that needs to be interpreted with respect to some salient element in the discourse (e.g., like ‘the other’ in ‘John ate one of the bagels. Harry ate the other.’). Rather, PRO is specified as needing an antecedent in a particular structural configuration. However, this is a very odd lexical feature as it is only definable in configurational (i.e., grammatical) terms. In other words, invoking such features in the construction of lexical items (be it PRO or any other item) is just a way of simulating a grammatical requirement via lexical stipulation.9 The GB approach offers a sounder alternative as it treats PRO’s properties as the result of interacting grammatical principles. 
7 See Chomsky (1980: 8): “If Coindex does not apply and the embedded clause contains PRO, then we end up with a ‘free variable’ in LF; an improper representation, not a sentence but an open sentence.”
8 This point is similar to Chomsky’s (1995) argument against considering Agr as a lexical category. Given that its only features are uninterpretable, a preferable approach, all things being equal, is to take these features as belonging to related true lexical categories.
9 For a discussion of reflexives and bound pronouns in light of this discussion, see Hornstein (2001, 2007).
This feature of the GB analysis
of control is clearly a desirable one for any theory to have. Any adequate theory of control should eschew lexically stipulating PRO’s basic properties and specify how grammatical principles interact so that the desired properties of PRO emerge. We consider some possibilities below (see also Chapter 3). Being a grammatical, non-lexical formative specified as [NP Ø], PRO is in fact quite similar to NP-traces (standard traces of A-movement) in GB.10 What distinguishes them is neither their internal structures nor their interpretation, but how they are introduced in the derivation and how they get their indices.11 PRO is inserted at D-structure, but is only coindexed later in the derivation. In contrast, NP-traces receive their indices as they are created in a movement operation (they must be coindexed with the NP that moves). However, after PROs get their indices (at S-structure or LF), they become completely indistinguishable from NP-traces. Notice that, once we take PRO and NP-traces to be indistinguishable at some point(s) in the derivation, we are already very close to the MTC. We return to this point in Chapters 3 and 4. The GB account of the distribution of PRO is similarly ambitious. Rather than simply stipulate that it appears in the subject position of non-finite clauses, GB strove to derive this fact from the binding theory. The proposal, known as the PRO theorem, went as follows (see Chomsky 1981). PRO was taken to be a pronominal anaphor and, as such, subject to both principles A and B.12 Principle A states that an anaphor must be bound in its domain; principle B that a pronoun must be free in its domain. Under the assumption that these principles apply within the same domain, they end up imposing contradictory requirements on a pronominal anaphor, namely, that it should be both free and bound in the same domain. 
10 See Chomsky (1977: 82): “We may take PRO to be just a base-generated t(x) [trace of x], x a variable; i.e., as a base generated NPx, an NP without an index.”
11 See Chomsky (1977: 82): “trace and PRO are the same element; they differ only in the way the index is assigned – as a residue of a movement rule in one case, and by a rule of control in the other . . . Note also that PRO is a non-terminal.”
12 Saying that PRO was a pronominal anaphor does not imply in the context of GB that it was a lexical formative. Traces, for instance, were treated as anaphors (Chomsky 1981) despite their clearly being non-lexical.
The only way for such an expression to meet both requirements is for it to vacuously satisfy them, i.e., by not meeting the necessary conditions for these requirements to be enforced. Thus, PRO cannot have a binding domain. Given that the binding domain for an expression was defined (in one of its formulations) as the smallest clause within which it is governed, PRO does not have a binding domain if it is ungoverned (if it has no governor, for instance). Finally, if one takes an Infl head to be a governor if
it is finite but not if it is non-finite (to and ing), one then derives the distribution of PRO: it can only appear in the subject position of non-finite clauses. A side benefit of this reasoning is that it provides an account of why PRO must be phonetically null. Within GB, case theory requires that nominals with phonetic content bear case and case is taken to be assigned under government. If PRO only appears in ungoverned positions, it cannot be case marked. Therefore, PRO cannot have phonetic content, for otherwise the case filter would be violated. Once again, this makes PRO very similar to NP-traces. These too occur in caseless positions and, not surprisingly, are phonetically null. Notice also that, by taking PRO to be a non-lexical formative, the problem posed by quantified expressions in the standard-theory framework dissolves. A sentence like (8b), for instance, repeated below in (14a), will be associated with a structure along the lines of (14b), where PRO does not have quantification properties on its own, but is rather interpreted as a bound variable, as desired.
(14) a. Everyone wants to win
     b. [Everyone wants [PRO to win]]
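The logic of the PRO theorem summarized above can be sketched in a few lines of Python. This is a deliberately simplified rendering, not a faithful implementation of GB binding theory: it assumes that an element has a binding domain just in case it is governed, and the names INFL_GOVERNS_SUBJECT, satisfies_binding, and pro_allowed_as_subject are hypothetical. The sketch checks whether a [+anaphoric, +pronominal] element can satisfy principles A and B in the subject position of a clause headed by a given Infl.

```python
# Simplified rendering of the PRO theorem; function and table names are hypothetical.
# Assumption: an element has a binding domain only if it is governed.
INFL_GOVERNS_SUBJECT = {"finite": True, "to": False, "ing": False}


def satisfies_binding(anaphoric, pronominal, governed, bound):
    """Check principles A and B for one choice of `bound` (bound in its domain)."""
    if not governed:
        return True  # no binding domain, so both principles hold vacuously
    principle_a = bound or not anaphoric         # an anaphor must be bound
    principle_b = (not bound) or not pronominal  # a pronoun must be free
    return principle_a and principle_b


def pro_allowed_as_subject(infl):
    """PRO is [+anaphoric, +pronominal]; try both values of `bound`."""
    governed = INFL_GOVERNS_SUBJECT[infl]
    return any(satisfies_binding(True, True, governed, b) for b in (True, False))


for infl in INFL_GOVERNS_SUBJECT:
    print(infl, pro_allowed_as_subject(infl))
```

Everything in the result depends on the INFL_GOVERNS_SUBJECT table: the contradiction between the two principles is avoided only where that table says the subject is ungoverned.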
At first sight, taking PRO to be a pronominal anaphor also seems to have other welcome consequences as far as its interpretation is concerned. One indeed finds examples of its anaphoric behavior, as illustrated in (15), as well as examples of its pronominal behavior, as illustrated in (16).13
(15) a. *It was expected PRO to shave himself
     b. *John1 thinks that it was expected PRO1 to shave himself
     c. *John1’s campaign expects PRO1 to shave himself
     d. John1 expects PRO1 to win and Bill2 does too (‘and Bill expects himself to win,’ not ‘and Bill expects John to win’)
     e. [The unfortunate]1 expects PRO1 to get a medal
     f. [Only Churchill]1 remembers PRO1 giving the ‘Blood, Sweat, and Tears’ speech
(16) a. It is illegal PRO to park here
     b. John1 thinks that Mary said that PRO1 shaving himself is vital
     c. John1’s friends believe that PRO1 keeping himself under control is vital if he is to succeed
     d. John1 thinks that PRO1 getting his resumé in order is crucial and Bill does too (‘Bill2 thinks that his1/2 getting his resumé in order is crucial’)
     e. [The unfortunate]1 believes that PRO1 getting a medal is unlikely
     f. Only Churchill remembers that PRO giving the BST speech was momentous
13 On the properties illustrated in (15) and (16) as well as further data and discussion, see e.g., Fodor (1975), Williams (1980), Lebeaux (1985), and Higginbotham (1992).
In (15), PRO roughly behaves like a reflexive. In configurational terms, it requires an antecedent (cf. [15a]) which must be local (cf. [15b]) and c-command it (cf. [15c]). On the interpretation side, it only supports a sloppy interpretation under VP ellipsis (cf. [15d]), a de se reading in sentences such as (15e) (i.e., it is only felicitous if the unfortunate is conscious of who he is and expects himself to get a medal), and a bound reading when its antecedent is associated with only (that is, [15f] can be paraphrased as ‘Only Churchill is such that he remembers himself giving the BST speech’ and not as ‘Only Churchill remembers that Churchill gave the BST speech’). By contrast, in (16) PRO behaves like a pronoun in every respect. Hence, it does not require an antecedent (cf. [16a]) and, where there is an antecedent, the antecedent need not be local (cf. [16b]) or c-command it (cf. [16c]). In addition, (16d) allows both strict and sloppy readings, (16e) permits both de se and non-de se interpretations, and (16f) may be falsified by situations in which people other than Churchill recall the import of the BST speech. Despite appearances, the data in (15) and (16) actually turn out to be quite problematic for the specification of PRO as a pronominal anaphor within GB. Notice that PRO displays either properties of reflexives or properties of pronouns. But in no case does it display properties of both pronouns and reflexives. Not by coincidence were data such as (15) and (16) handled by two different transformations (END and SEND, respectively) within the standard theory. Thus, it makes more sense to assume that PRO is ambiguous between a reflexive and a pronoun, than to assume it is a pronominal anaphor. However, this ambiguity thesis completely undermines the PRO theorem, as the theorem crucially assumes the existence of an element that is simultaneously a pronoun and an anaphor.
In turn, if the PRO theorem falls, we are left with no account of the distribution of PRO. The requirements of the PRO theorem have one further architectural consequence: in order to explain the distributional properties of PRO in terms of the PRO theorem, some other component of the grammar must be responsible for PRO’s specific interpretation in a given configuration. This accounts for the addition of the control module in the GB framework. The control module recognizes two types of control: obligatory control (OC), illustrated in (15), and non-obligatory control (NOC), illustrated in (16). In the case of OC, the controller is lexically specified as an argument of the embedding control verb and, in the case of non-local control, other (rarely specified but frequently adverted to) principles come into play. Notice that this amounts to saying that, like in the earlier treatment in terms of END and SEND, OC and NOC are rather distinct types of relations. Importantly, it is tacitly assumed that the control module somehow obliterates the pronominal
specification of PRO in OC constructions, and, conversely, its anaphoric specification in NOC constructions. The problem is not so much that the details of how this would be achieved are never spelled out, but that this tacit assumption casts suspicion over the initial specification of PRO as a pronominal anaphor. Why should UG provide PRO with such a specification only to see it blotted out later? After all, this does not happen with standard pronouns and anaphors: they must live with the pronominal or anaphoric specifications stated in their birth certificate. One could reply that we should learn to live with the PRO module given the nice results we obtain with the PRO theorem concerning the distribution of PRO. However, this apparent success does not survive closer scrutiny either. The first thing to be noted is that the PRO-theorem account of the distributional properties of PRO is intrinsically associated with a specific formulation of binding domains, one in which government is essentially the one and only requirement to be satisfied. Recall that all that matters for PRO to vacuously satisfy both principles A and B is that it does not have a binding domain. To lack a governor is certainly one way for PRO to be deprived of a binding domain. But, if the correct definition of binding domain ends up including other requirements, there may be other ways for PRO to lack a domain. Take, for instance, the definition in (17) (see Chomsky 1981). (17)
α is a binding domain for β iff α is the minimal NP or S containing β, in which β is governed and α has a subject accessible to β.
This is not the place to review the various reasons for including the notion of accessible subject in the definition of binding domain within GB.14 The relevant point for our discussion is that, once accessible subjects become part of the definition of binding domains, PRO may also lack a domain if it does not have an accessible subject. This in turn undermines the account of the distribution of PRO in terms of the PRO theorem, as government is no longer the only player on the field.15 To put it broadly, if binding domains are to be formulated along the lines of (17), the account of the distribution of PRO exclusively in terms of government involves an independent axiom, rather than a theorem.
14 See Lasnik and Uriagereka (1988) for a good discussion of the notion of accessible subject and the motivations for its inclusion in the definition of binding domain.
15 See Bouchard (1984) for discussion.
But the problem is actually worse than these remarks suggest. Recall that a crucial assumption in the PRO-theorem account is that a finite Infl is a governor, but a non-finite Infl is not. From an empirical point of view, this assumption
is challenged by languages like Brazilian Portuguese, which allow obligatory control into indicative clauses, as we will see in detail in sections 2.5.2.2 and 4.4 below. In order to make room for finite control in such languages, the PRO-theorem account would be forced to assume that their finite Infls are optional governors. However, it is not at all obvious how this assumption can be formally encoded in the system. Given that government is a structural relation, being a governor cannot be listed as a lexical property for the reasons discussed above. Such a lexical specification would be comparable to saying, for instance, that a given lexical item is lexically specified as being unable to c-command.16 This just does not make sense. What is required is a structural reason for preventing a non-finite Infl from governing its Spec. But, regardless of the definition of government one assumes, if a finite Infl can govern its Spec, so should a non-finite Infl, as the two structural configurations are identical. Again, to assume the opposite would be parallel to saying that, although the configurational Spec-head relation is exactly the same in both cases, a finite Infl head m-commands its Spec, but a non-finite Infl does not. In short, when details are considered, the distributional properties of PRO do not follow theorematically and it is not even obvious how to convert the PRO theorem into an axiom, as it is unnatural to encode structural properties as lexical features or formulate different notions of government for different lexical items. Even the apparent benefit of the account of PRO’s lack of case has an undesirable consequence. An A-chain in GB must be headed by a case-marked position unless it is headed by PRO.17 This statement is transparently troublesome.
16 See Hornstein, Nunes, and Grohmann (2005) for a discussion of this point.
17 See Chomsky (1981: 334ff.) for discussion.
If A-chains are independently subject to a case-licensing requirement (say, Aoun’s [1979] visibility condition, which requires that θ-roles be associated with case in order to be visible at LF), why should A-chains headed by PRO be exempted from such a requirement? Notice in particular that, given that chains headed by pro and null operators also required case licensing, PRO’s lack of phonetic content could not be the reason for this exception. To sum up: despite its laudable ambitions and its improvement over the standard-theory approach to control, the GB approach has significant empirical and theoretical problems. On the plus side, treating PRO as a grammatical formative circumvents the previous problem related to control involving quantified expressions, and accounts for why PRO is phonetically null and why (at least in the case of OC) it needs a grammatical antecedent. On the down side, the account of the distribution of PRO turns out on closer consideration to be
less a theorem than an axiomatic stipulation. Moreover, the assumption that PRO is a pronominal anaphor leads to empirical problems as the system cannot predict when PRO behaves like an anaphor and when it behaves like a pronoun. A separate control module must then be added to the theory to specify the interpretive properties of PRO. In addition, the construction-specific flavor of this new addition to the model is at odds with the general goal of the principles-and-parameters theory of deducing properties of rules and constructions from the interaction of more basic features. It is no wonder that the control module always felt like an appendix to the model and never occupied a bright spot among GB’s theoretical achievements. The GB take on control was therefore ripe for a minimalist reanalysis.
2.5 Non-movement approaches to control within minimalism
2.5.1 The null-case approach
Given that the addition of the construction-specific control module in GB was prompted by the problematic assumption that PRO is a pronominal anaphor, one would in principle expect that the abandonment of this assumption should also lead to the abandonment of the control module. However, history and logic are known to frequently go their separate ways. The first minimalist reanalysis of control, outlined in Chomsky and Lasnik (1993), gave up on the account of PRO in terms of its alleged pronominal-anaphoric nature, but basically left intact the assumption that the interpretation of PRO required a special module in the system. Let us consider how the distributional properties of PRO are handled on this account. Take the contrast between (18a) and (18b), for instance.
(18) a. John hoped [PRO1 to be elected t1]
     b. *John hoped [PRO1 to appear to t1 [that Bill was innocent]]
From the perspective of GB, PRO cannot occupy a governed position as it would then meet the requirements for binding theory to apply and would end up violating principle A or principle B. Hence, PRO cannot remain in the object position of the embedded verb in (18a) or the preposition to in (18b). However, once it moves to the subject position of the infinitival clause, which, by assumption, is ungoverned, it should circumvent the binding violation in both (18a) and (18b). The ungrammaticality of (18b) is therefore unaccounted for within GB. Notice that the contrast in (18) mimics the contrast in the ECM constructions in (19), which can be straightforwardly captured if one assumes
that a given expression cannot move from a case-marked position to another case-marked position.
(19) a. We expected [John1 to be hired t1]
     b. *We never expected [John1 to appear to t1 [that the job was easy]]
Based on the parallelism between pairs like (18) and (19), Chomsky and Lasnik (1993) propose a case-based account of the distributional properties of PRO under which A-chains headed by PRO are not exceptional as far as case licensing is concerned, as they were in GB (see section 2.4).18 The gist of their proposal is that PRO must be licensed by a special kind of case, dubbed null case, which is checked by some non-finite Infl heads. Under the assumption that the infinitival to in (18) checks null case, movement of PRO is licit in (18a) as it proceeds from a caseless (passives generally do not check case) to a case-checking position, but not in (18b), where it proceeds from a case-checking to another case-checking position. This proposal also extends to the standard cases regarding the distribution of PRO. Thus, under this view, PRO cannot appear in the subject position of finite clauses or in the object position of a transitive verb, as respectively illustrated in (20), because these are not positions in which null case is checked.
(20) a. *John hoped (that) PRO could eat a bagel
     b. *Bill saw PRO
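The two conditions at work in the null-case account of (18) and (20) can be stated as a short check over chains. The Python below is a sketch under stated assumptions rather than Chomsky and Lasnik's own formalization: the function name pro_chain_is_licit is hypothetical, a chain is reduced to the list of cases checked in its positions (from the base position up to the head of the chain), and the label "oblique" for the case checked by the preposition to is merely a placeholder.

```python
from typing import List, Optional


def pro_chain_is_licit(chain: List[Optional[str]]) -> bool:
    """`chain` lists the case checked in each position, base position first.

    None marks a caseless position. The case labels are illustrative only.
    """
    *lower, head = chain
    # PRO must end up in a position where null case is checked ...
    if head != "null":
        return False
    # ... and it may not have moved out of a position where case is checked.
    return all(case is None for case in lower)


print(pro_chain_is_licit([None, "null"]))       # (18a): passive object, then Spec of 'to'
print(pro_chain_is_licit(["oblique", "null"]))  # (18b): object of 'to', then Spec of 'to'
print(pro_chain_is_licit(["nominative"]))       # (20a): subject of a finite clause
print(pro_chain_is_licit(["accusative"]))       # (20b): object of a transitive verb
```

Only the first chain passes; the other three fail either because PRO starts out in a case-checking position or because it never reaches a null-case position.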
Notice that some extra assumption must be made in order to capture the standard contrast between (21) and (22) below, for instance. In other words, the null-case approach must somehow ensure that the infinitival ‘to’ of control constructions can license PRO, but not the infinitival ‘to’ of ECM or raising constructions. The obvious question is how to independently distinguish the ‘to’ that can check null case in (21) from its siblings in (22), which cannot. One thing is certain. One cannot simply say that these heads are lexically ambiguous in terms of their specification for case checking; otherwise, structures corresponding to (22) should be grammatical with the case-checking version of ‘to’.
(21) John hopes [PRO to graduate soon]
(22) a. *I believe [PRO to be nice]
     b. *It seems [PRO to be nice]
18 The idea of accounting for the distribution of PRO in terms of case finds its origins in Bouchard’s (1984) proposal that PRO cannot appear in a case-marked position.
Martin (1996, 2001) offers the most fully worked-out version of the null-case approach to the distribution of PRO, which attempts to place the distinction between (21) and (22) on more solid ground. Building on Stowell’s (1982) proposal that control infinitives are tensed whereas ECM and raising infinitivals are tenseless, Martin proposes that only tensed infinitivals check null case.19 Tying null case to tense has the virtue of rendering it more natural and less stipulative. Under this perspective, null case would be very similar to nominative case, as both would be checked by a tensed Infl, differing only in terms of their morphological realization. Unfortunately, the proposed independent diagnostics for distinguishing tensed from tenseless infinitivals fail to yield the expected divide between control predicates, on the one hand, and ECM and raising predicates, on the other, as convincingly shown by Wurmbrand (2005).20 Take the contrasts between the infinitival complements of the control verb ‘decide’ and the ECM verb ‘believe’ in (23)–(26) (from Wurmbrand 2005), for example.
(23) a. At 6, Leo decided to sing in the shower right then
     b. *At 6, Leo believed Bill to sing in the shower right then
(24) a. Leo decided yesterday to leave tomorrow
     b. John believes/believed Mary to be pregnant
(25) a. *Leo decided [[to leave] [which was/is true]]
     b. Leo believes [[John to be smart] [which is true]]
(26) a. Leo doesn’t want John to sing in the shower, but he decided to, anyway
     b. *Leo believes John to be honest and she believes Frank to, as well
The contrasts above are supposed to show that the control infinitival is tensed as it is compatible with eventive predicates (cf. [23a]), triggers a future reading (cf. [24a]), requires an irrealis interpretation (that is, the truth of the complement is left unspecified at the time of the utterance; cf. [25a]), and licenses VP ellipsis (cf. [26a]). Conversely, the ECM/raising infinitival clauses are taken to be tenseless as they are incompatible with eventive predicates (cf. [23b]), require a simultaneous interpretation with respect to the embedding clause (cf. [24b]),21 allow a realis interpretation (cf. [25b]), and do not license VP ellipsis (cf. [26b]).
19 See also Bošković (1997) for relevant discussion.
20 For further discussion and arguments against null case and its ties to tense, see also Landau (2000), Pires (2001, 2006), Baltin and Barrett (2002), and Hornstein (2003).
21 See Hornstein (1990) for a discussion of this interpretation in the context of sequence-of-tense constructions.
The above paradigm does indeed distinguish ‘decide,’ a control verb, from ‘believe,’ an ECM verb. The problem, as Wurmbrand (2005) shows, is that the criteria do not generalize to other control and ECM/raising cases. For instance, the control verb ‘claim’ does not license eventive predicates (cf. [27a] below) and allows a realis interpretation for its complement (cf. [27b]), whereas the control verb ‘manage’ does not trigger a future reading (cf. [27c]). In turn, the infinitival complement of the ECM verb ‘expect’ is compatible with an eventive predicate (cf. [28a]), does not permit a realis interpretation (cf. [28b]), and allows a non-simultaneous interpretation (cf. [28c]).
(27) a. *At 6, Leo claimed to sing in the shower right then
     b. Leo claimed [[to be a king], which was true]
     c. *John managed to bring his toys tomorrow
(28) a. The bridge is expected to collapse tomorrow
     b. *The train is expected [[to arrive late tomorrow] [which is true]]
     c. The printer is expected to work again tomorrow
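The failure of the diagnostics to generalize can be tabulated mechanically. The Python sketch below encodes, as a simplification, what the tense-based account predicts for each class and what the text reports for (23)–(28); the dictionary and function names are hypothetical, the diagnostic labels are shorthand ("future" stands for a non-simultaneous, later-than-matrix reading), and diagnostics the text does not discuss for a given verb are simply left out. VP ellipsis is omitted because the judgments are reported to vary across speakers.

```python
# Predictions of the tense-based null-case account (after Martin 1996, 2001),
# encoded as a simplification: True = the diagnostic is passed.
PREDICTED = {
    "control": {"eventive": True, "future": True, "realis": False},
    "ECM": {"eventive": False, "future": False, "realis": True},
}

# Judgments as reported in the text for (23)-(28); missing keys = not discussed.
OBSERVED = {
    "decide": ("control", {"eventive": True, "future": True, "realis": False}),
    "believe": ("ECM", {"eventive": False, "future": False, "realis": True}),
    "claim": ("control", {"eventive": False, "realis": True}),
    "manage": ("control", {"future": False}),
    "expect": ("ECM", {"eventive": True, "future": True, "realis": False}),
}


def mismatches(verb):
    """Return the diagnostics on which `verb` departs from its class prediction."""
    verb_class, judgments = OBSERVED[verb]
    expected = PREDICTED[verb_class]
    return sorted(d for d, value in judgments.items() if value != expected[d])


for verb in OBSERVED:
    print(verb, mismatches(verb))
```

Only ‘decide’ and ‘believe’ come out with no mismatches; ‘claim’, ‘manage’, and ‘expect’ each depart from the predicted profile on every diagnostic recorded for them.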
Wurmbrand (2005) also reviews the VP ellipsis data and observes that the data that purport to demonstrate a distinction between control (where it is allowed) and raising (where it is prohibited) are subject to substantial speaker variation (when the contrast exists at all). Besides, the clear acceptability of raising examples such as the ones in (29) indicates that the licensing of VP ellipsis fails to cleanly distinguish control from raising.
(29) a. The tower started to fall down and the church began to as well
     b. John expects the printer to break down whereas Peter expects the copier to
     c. They say that Mary doesn’t know French but she seems to
The above arguments, which decouple tense properties from control infinitivals, are seconded by the observation that PRO may also occur in gerundive subject positions, despite the fact that gerunds are generally analyzed as not tensed (see Stowell 1982; Pires 2001, 2006). This is illustrated in (30) below, where the gerund licenses PRO but not the temporal adverb.
(30) a. John hated [PRO eating turnips (*tomorrow)]
     b. John preferred [PRO eating turnips (*tomorrow)]
The overall conclusion one reaches is that, whatever tense properties nonfinite clauses have, they do not seem to be useful for distinguishing raising from control configurations. There are surely differences between raising and control complements, but this varies across verbs and there is no apparent systematic way to distinguish the two classes using the “tense” diagnostics mentioned
above. Thus, although conceptually appealing, the attempt to analyze null case as similar to nominative by associating it to a form of tense ends up failing. This is really bad news. Once the distribution of PRO cannot be reduced to a [±tense] feature of T, null case finds no independent motivation within the system and follows from nothing but the attested distribution of PRO. And the picture is not very glamorous. In order to work, the null-case approach requires three stipulations: (i) PRO has no phonetic content; (ii) null case must be assigned to PRO; and (iii) only PRO can bear null case. These three stipulations track but do not explain the facts under discussion. In other words, despite its explanatory aspirations, it seems fair to say that the null-case approach amounts to stipulating that PRO appears where it does and that it has the phonetic properties it has. What of PRO’s interpretive properties? Here there is some good news. With the PRO theorem abandoned, PRO can be treated as ambiguous, a null reflexive in some contexts (OC cases) and a null pronoun in others (NOC cases). It is then possible to reduce the interpretive properties of PRO to the interpretive properties of pronouns and reflexives. For example, that OC PRO requires a local, c-commanding antecedent follows from its being subject to principle A of the binding theory (or whatever substitutes for principle A). The fact that NOC PRO does not need an antecedent follows from its being pronominal. Given such a reduction, what remains to be determined is why OC and NOC PROs distribute as they do, i.e., why reflexive PRO appears in OC contexts and pronominal PRO in NOC contexts. One can, of course, stipulate that certain predicates select for OC and so for reflexive-like PROs, while others do not. However, it is not clear how this is to be implemented grammatically (see Chapters 6 and 7 below).
First, it is not clear how selection of embedded subjects by matrix verbs (so-called control predicates) is to be stated. If selection is a head-to-head relation, then OC is not an obvious case of selection. Second, adjunct control seems to pattern like OC and, on the standard assumption that predicates can select complements but not adjuncts, then adjunct control is expected to be NOC, contrary to fact. These are issues that we revisit in later chapters. What is worth noting here is that simply reducing OC to something like principle A and NOC to something like principle B does not by itself suffice to account for the interpretive properties of OC and NOC configurations.
2.5.2 The Agree approach
Let us now consider Landau’s (1999, 2000, 2004) alternative approach to control. Like the null-case approach reviewed in the previous section, Landau takes the existence of PRO for granted but, unlike proponents of the null-case
approach, he takes PRO to bear regular case like any other DP. In addition to this take on case, three other aspects of Landau’s approach stand out: (i) the special attention given to “partial-control” constructions; (ii) the dependence of obligatory control on the postulation of certain features and feature specifications; (iii) the interpretation of PRO mediated by (a version of) Chomsky’s (2000, 2001) Agree operation. Let us examine each of these major aspects of Landau’s system, leaving the discussion of whether or not PRO bears regular case to section 5.4.2 below.22
2.5.2.1 The relevance of partial control
Partial control refers to control constructions where an embedded predicate must take a (semantically) plural subject, but the antecedent of the controllee is (semantically) singular, as illustrated in (31).
(31) The chair hoped [PRO to gather/meet at 6/to apply together for the grant]
In (31), the matrix subject is understood as a member of the set of people denoted by the embedded subject. Assuming that this interpretive fact shows that controller and controllee are not identical, Landau takes partial-control constructions to be a strong argument for a PRO-based account of control. According to him, the mismatch in interpretation between PRO and its antecedent results from PRO being independently specified for the semantic feature mereology, which characterizes group names (for instance, ‘committee’ is [+Mer], while ‘chair’ is [−Mer]), as illustrated in (32).
(32) The chair[−Mer] hoped [PRO[+Mer] to gather/meet at 6/to apply together for the grant]
It is a great merit of Landau’s work to have shown that partial control is indeed an instance of obligatory control (the controllee requires a local c-commanding antecedent, triggers sloppy readings under ellipsis, and enforces de se readings, for instance), and to have provided a very detailed description of the types of predicates that allow partial control. Landau argues that a tensed infinitive such as the complement of “desiderative” verbs licenses it, but an untensed infinitive such as the complement of “implicative” verbs does not, as illustrated by the contrast between (31) and (33) (see section 2.5.2.2 below for details).
(33) *The chair managed [PRO to gather/meet at 6/to apply together for the grant]
22 Here we will primarily focus on Landau’s (2004) analysis of obligatory control, which he takes to replace his older treatment (Landau 1999, 2000). For discussion of the limitations of his previous treatment, see Hornstein (2003) and Landau (2007) for a rejoinder.
It is fair to say that, after Landau’s work, partial control came to be part of the empirical basis that any approach to obligatory control must take into consideration. However, the amount of ad hoc machinery required to account for partial control in Landau’s system, as we will see below in section 2.5.2.3, ends up undermining the initial appeal that a PRO-based theory appears to have. And there are empirical problems, as well. As observed by Hornstein (2003), it is not the case that any predicate that selects a plural subject licenses partial control, as shown in (34).
(34) a. They sang alike/were mutually supporting
b. ∗John hoped/wants [PRO to sing alike/to be mutually supporting]
Notice that the matrix predicate of (34b) is of the type that licenses partial control (cf. [31]). So (34b) shows that partial control must in part be determined by properties of the embedded predicate. In fact, Hornstein suggests that what seems to distinguish the predicates that support partial control from the ones that do not is that the former can select a commitative PP, as shown in (35).
(35) a. The chair met/gathered/applied together for the grant with Bill
b. ∗The chair sang alike/is mutually supporting with Bill
The data in (36)–(37) further show that being compatible with a commitative PP is not sufficient for partial control to be licensed: the commitative must be selected.
(36) a. The chair met/gathered/applied together for the grant ∗(with Bill)
b. The chair left/went out (with Bill)
c. The committee left/went out
(37) The chair preferred [PRO to leave/go out at 6] (exhaustive control: OK; partial control: ∗)
Example (36b) shows that, as opposed to what happens with ‘meet/gather/apply together’ in (36a), the commitative associated with ‘leave/go out’ is not selected. In turn, (36c) shows that a [+Mer] noun can be the subject of ‘leave/go out.’ Now, given that in Landau’s system PRO can always be intrinsically specified as [+Mer], one would expect that a sentence such as (37), whose matrix predicate is of the type that licenses partial control, should allow a partial-control reading with a [+Mer] PRO. But this does not happen. Example (37) only has an exhaustive control reading.
The fact that the availability of partial control is contingent on there being a predicate that selects a commitative complement suggests that, rather than involving a plural subject, partial control may in fact involve the licensing of a null commitative argument in a standard (“exhaustive”) obligatory-control construction. That is, a sentence such as (38a) should actually be represented as in (38b) (still keeping PRO for purposes of discussion), where pro is a null commitative argument.
(38) a. The chair preferred to meet at 6
b. [The chair]_i hoped [PRO_i to[+tense] meet pro_k at 3]
This is not the place for us to pursue the suggestion encapsulated in (38b) (see section 5.6.1 below for discussion). What is relevant for our current purposes is to point out that, in Landau’s system, the availability of partial control should be quite free once the tense requirements on the infinitive are satisfied. It is indeed quite mysterious in his system why partial control should depend on the potential licensing of commitative arguments within the infinitival clause. And if partial control turns out to be more related to the licensing of null commitative arguments, whatever accounts for exhaustive control should also cover partial control. In other words, if something along the lines of (38b) is on the right track, partial control does not intrinsically favor a PRO-based approach and we are back to the original question of what the best account of the null embedded subject of (38a) is (see section 5.6.1 below for a suggestion of how partial control can be analyzed under the MTC).
2.5.2.2 [Tense] and [Agr] features and finite control
The second major aspect of Landau’s system is the specific typology of control configurations, involving both non-finite and finite clauses, that it establishes. Following a venerable tradition, Landau assumes that the local environment of the embedded subject must provide all the necessary information to determine whether it must, can, or cannot be PRO. In particular, Landau takes the relevant local licensing features to be (semantic) [T(ense)] and (morphological) [Agr(eement)]. Where Landau departs from previous accounts is in the way these features conspire to determine the nature of control, as shown in (39) (from Landau 2004: 840).23
23 EC and PC in (39) stand for exhaustive and partial control, respectively. C(ontrol)-subjunctives and F(ree)-subjunctives are distinct in that only the former necessarily require an obligatory-control interpretation of their subjects. For purposes of exposition, below we use I for the tense head T in order to distinguish it from the tense feature [T].
(39)
Clause type                      Control               I0            C0
EC-infinitive                    Obligatory control    [−T, −Agr]    [−T]
Balkan C-subjunctive             Obligatory control    [−T, +Agr]    [−T]
Hebrew 3rd-person subjunctive    Obligatory control    [+T, +Agr]    [+T, +Agr]
PC-infinitive                    Obligatory control    [+T, −Agr]    [+T, (+Agr)]
Balkan F-subjunctive             No control            [+T, +Agr]    [+T, +Agr]
indicative                       No control            [+T, +Agr]    Ø
Consider the infinitives in (39), for instance. As mentioned in section 2.5.2.1, Landau has argued that the essential difference between an infinitival that allows partial control and one that disallows it is its tense properties: an infinitival I allows both exhaustive and partial control if specified as [+T], but only exhaustive control if specified as [−T]. This difference is meant to capture the fact that the infinitival clauses that allow partial control can be temporally independent from the matrix clause, as illustrated in (40) below. Given that the tense properties of I are predicted by the selecting predicate and that selection is a local relation, the [T] features of I are accordingly replicated on C in (39). Thus, a verb like ‘hope,’ for instance, selects a CP headed by C[+T], which in turn selects an IP headed by I[+T].
(40) a. Yesterday John hoped to travel tomorrow
b. ∗Yesterday John managed to travel tomorrow
As Landau observes, the basic intuition underlying the typology in (39) is that obligatory-control configurations do not form a natural class; they are in fact the complement subset of the natural class of non-controlled environments. Putting aside the case of Hebrew third-person subjunctives for the moment, the generalization is that if I is positively specified for both [T] and [Agr], it does not trigger obligatory control. On the other hand, a single negative specification for [T] or [Agr] ([+T, −Agr] or [−T, +Agr]) or a negative specification on both ([−T, −Agr]) will necessarily lead to obligatory control. In sum, obligatory control is the elsewhere case. Given this feature distribution, it follows that indicative complements should not display obligatory control. As Landau (2004: 849–850) puts it, “the only generalization in this domain that appears to be universal is the incompatibility of indicative clauses with OC. Anything else is possible, under certain circumstances.” However, this generalization is falsified by “referential” (i.e., non-expletive, non-arbitrary) null subjects in (colloquial) Brazilian Portuguese. As extensively argued by Ferreira (2000, 2004, 2009) and Rodrigues (2002, 2004), null subjects in Brazilian Portuguese show all the diagnostics of
obligatory control. Take the Brazilian Portuguese sentences in (41)–(45), for instance.24
(41) ∗Comprou um carro novo
Bought a car new
‘She/he bought a new car’
(42) [[o João] disse que [o pai d[o Pedro]] acha que vai ser promovido]
The João said that the father of-the Pedro thinks that goes be promoted
‘João_i said that [Pedro_j’s father]_k thinks that he_k/∗i/∗j/∗l is going to be promoted’
(43) Só o João acha que vai ganhar a corrida
Only the João thinks that goes win the race
‘Only João is an x such that x thinks that x will win the race’
NOT: ‘Only João is an x such that x thinks that he, João, will win the race’
(44) O João está achando que vai ganhar a corrida e o Pedro também está
The João is thinking that goes win the race and the Pedro too is
‘João thinks that he’s going to win the race and Pedro does, too (think that he, Pedro, is going to win the race)’
(45) O infeliz acha que devia receber uma medalha
The unfortunate thinks that should receive a medal
‘The unfortunate thinks that he himself should receive a medal’
Example (41) shows that null subjects in Brazilian Portuguese require an antecedent,25 and (42) shows that the antecedent must be the closest c-commanding DP. As for interpretation matters, a null subject in Brazilian Portuguese is interpreted as a bound variable when its antecedent is an only-DP (cf. [43]); it obligatorily triggers sloppy reading under ellipsis (cf. [44]); and it only admits a de se reading in sentences such as (45). Importantly, in all the sentences of (41)–(45), the null subject displays the diagnostics of obligatory control despite the fact that it is within a standard indicative clause. The existence of finite control into indicative complements in Brazilian Portuguese therefore presents prima facie problems for the typology proposed by Landau.26 Below we discuss the implications of this empirical fact within Landau’s Agree-based approach.
24 See Ferreira (2000, 2004, 2009) and Rodrigues (2002, 2004) for additional tests.
25 Referential null subjects in matrix clauses in Brazilian Portuguese can only be licensed as instances of topic-drop (see Ferreira 2000, Modesto 2000, and Rodrigues 2004 for relevant discussion).
2.5.2.3 Determining the interpretation of obligatorily controlled PRO via Agree
In addition to the features [T] and [Agr] to be hosted by C and I, Landau (2004: 841) also proposes that DPs must be featurally specified as to whether or not they support independent reference ([R]): lexical DPs and pro are specified as [+R] and PRO as [−R]. According to Landau (p. 841), “[b]oth values on [R] are interpretable, when occurring on nominal phrases.” However, the [−R] feature makes PRO a potential goal for agreement, for “this feature acts as an instruction to coindex the φ-features of PRO with those of an antecedent; Agree is a way of achieving that” (p. 843). The feature [R] is also assigned to some functional categories, according to the rule in (46).
(46) R-assignment rule (Landau 2004: 842)
For X0[αT, βAgr] ∈ {I0, C0, . . .}:
Ø → [+R] / X0[__], if α = β = ‘+’
Ø → [−R] / elsewhere
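The rule in (46) can be read as a function from a head's [T] and [Agr] values to an [R] value. The following is a minimal expository sketch, not part of Landau's formalization: the function name `assign_r` and the use of `None` for an absent feature are assumptions made here for illustration. It also assumes, in line with the later discussion of indicative C in this section, that a head lacking either feature receives no [R] value at all.

```python
from typing import Optional


def assign_r(t: Optional[str], agr: Optional[str]) -> Optional[str]:
    """Return the [R] value of a head X0 bearing [T] and [Agr], following (46).

    t and agr are '+', '-', or None (feature absent on the head).
    """
    if t is None or agr is None:
        # Assumption: the rule applies only to heads specified for both features.
        return None
    if t == "+" and agr == "+":
        return "+"  # Ø -> [+R] if alpha = beta = '+'
    return "-"      # Ø -> [-R] elsewhere


# Feature bundles as given in derivations (48), (50), and (52) of this section.
heads = {
    "I1 in (48)": ("-", "-"),
    "C in (48)": ("-", None),
    "I1 in (50)": ("+", "-"),
    "C in (50)": ("+", "+"),
    "I1 in (52)": ("+", "+"),
}

for label, (t, agr) in heads.items():
    print(label, "->", assign_r(t, agr))
```

Under these assumptions the function returns the same [R] values that appear on the corresponding heads in the derivations discussed below, which makes it easy to see that the rule restates the feature distribution rather than deriving it.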
Given these assumptions, let us consider the derivation of an exhaustive control construction such as (47) in Landau’s system, which is given in (48).
(47) John managed to fix the car
(48) [DP I2 [ . . . tDP . . . [CP C[−T] [IP PRO[−R] I1[−T, −Agr, −R] [tPRO . . . ]]]]]
Agreement between I1 and PRO in (48) deletes I1’s [−Agr] and [−R] features. Agreement between C and I1 then deletes C’s [−T] feature. Finally, after agreeing with the matrix subject, I2 agrees with PRO, coindexing their features and licensing PRO’s [−R] feature.27
26 See Rodrigues (2004) for arguments that Finnish may also allow obligatory control into finite indicative clauses.
27 The only relevant difference between (48) and a typical obligatory-control subjunctive in Greek such as (i) in Landau’s system is that in the latter, I1 has overt agreement morphology ([+Agr]), rather than “abstract” agreement ([−Agr]), as represented in (ii).
(i) Greek (Terzi 1997):
I Maria_1 prospathise PRO_1/∗2 na divasi
the Maria tried.3SG PRT read.3SG
‘Maria tried to read’
(ii) [DP I2 [ . . . tDP . . . [CP C[−T] [IP PRO[−R] I1[−T, +Agr, −R] [tPRO ]]]]]
In turn, a partial-control construction like (49) is to be derived along the lines of (50).
(49) The chair hoped to meet at 6
(50) [DP I2[+T, +Agr, +R] [ . . . tDP . . . [CP C[+T, +Agr, +R] [IP PRO[−R] I1[+T, −Agr, −R] [tPRO . . . ]]]]]
As before, agreement between I1 and PRO deletes I1’s [−Agr] and [−R] features. Agreement between C and I1 now deletes C’s [+T] and [+Agr] features, but not its [+R] feature, for it mismatches the [−R] feature of I1. C then checks its [+R] feature with I2. Notice that I2 agrees with C and not with PRO, which raises the question of how PRO can license its [−R] feature. According to Landau (p. 845), this feature gets licensed in virtue of I2 agreeing with C, which in turn is “coindexed” with PRO via I1. Furthermore, Landau assumes (p. 849) that if I2 and PRO do not agree directly, their [Mer] features need not match. If I2 is specified as [−Mer] and PRO is inherently specified as [+Mer], a partial-control effect will arise.
Let us finally consider the last type of obligatory-control configuration listed in (39): Hebrew third-person subjunctives.28 Given the feature specification for Hebrew subjunctives in (39), the derivation of a sentence such as (51) proceeds as in (52).
(51) Hebrew (Landau 2004)
Gil_i hivtiax [še- ec_i yitna’heg yafe]
Gil promised that will-behave.3SG.M well
‘Gil promised to behave’
(52) [DP I2[+R] [ . . . tDP . . . [CP C[+T, +Agr, +R] [IP PRO[−R] I1[+T, +Agr, +R] [tPRO . . . ]]]]]
28 Landau (2004: 815, 846) attributes the lack of a derivation with an uncontrolled third-person pro in Hebrew subjunctives to the non-existence of referential third-person pro in the language. More specifically, he assumes Shlonsky’s (1997) proposal that third-person pros in Hebrew are null Num◦ heads and, because they are null, they cannot support a third-person feature hosted by a higher D-head.
Agreement between I1 and PRO in (52) checks the [+Agr] feature of I1, but not its [+R] feature, which mismatches the [−R] feature of PRO. Agreement between C and I1 then checks all of the features of C and the [+R] feature of I1. If the matrix I2 had a [−R] feature, the remaining unchecked [−R] feature of PRO would be licensed by agreement with I2. However, in (52) I2 is specified as [+R]. The [−R] feature of PRO must then be indirectly licensed in virtue of the agreement relations between I2 and C, between C and I1, and between I1 and PRO.
As the reader can easily check, the feature specifications and computations proposed above are such that they track, but do not explain, the distribution and interpretation of PRO. Landau (p. 842) in fact acknowledges that his R-assignment rule is an “honest stipulation,” which plays the role that case played in previous models. Unfortunately, if the distribution and interpretation of PRO is to rest on a stipulation, calling it honest does not make the analysis less stipulative. In other words, it is subject to the same criticism leveled at the null-case approach: the distribution and interpretation of PRO ends up being stipulated under the guise of lexical features.
It is also worth pointing out that, under the label Agree, Landau’s proposal actually groups different kinds of relations, which do not obviously form a natural class. Thus, in addition to the familiar valuation procedure involving a [−interpretable] and a [+interpretable] feature of Chomsky (2001), the Agree operation assumed by Landau encompasses three other types of relations. First, it admits relations between two [−interpretable] features such as the agreement between C and I1 with respect to [Agr] features in (50) or the agreement between I2 and C in (52) with respect to [R] features. According to Landau (p. 849), “[t]he fact that C◦ bears [+Agr] does not stop this feature from entering Agree with [−Agr] of I◦ ; recall that [+Agr] on C◦ represents abstract [Agr] to begin with (in most cases), thus [Agr] on both heads is semantically uninterpretable and phonologically null.” That may be so, but the resort to features which are motivated neither in LF nor in PF terms not only is completely at odds with core minimalist assumptions, but also reinforces the impression that these features are only redescribing the facts to be explained. The second type of relation encompassed by Landau’s version of Agree includes “coindexing” relations such as the agreement between I2 and PRO in (48) to license PRO’s [−R] feature (which was assumed to be a [+interpretable] feature, as mentioned above). Finally, it also includes composite-“coindexing” relations such as the licensing of the [−R] feature of PRO in (50) and (52), which involves the conjunction of three basic agreement relations: between I2 and C, between C and I1, and between I1 and PRO. Even if we put aside the fact
that “coindexing” and feature valuation/deletion seem to be of different nature, it is not at all trivial to explain how the composition of the three agreement relations mentioned should result in “coindexing.” Recall that, in (50) and (52), I2 and C agree with respect to [R] as do I1 and PRO, but C and I1 agree with respect to [Agr]. In virtue of these agreement relations, PRO agrees with I2 through some kind of transitivity assumption. However, it is worth asking how transitivity arises given that the agreement relations computed do not target the same type of feature. Note that, if A is taller than B and B is fatter than C, then one can conclude nothing regarding A’s height or weight as regards C. However, for the account above to work, we must assume that this logic is overturned when certain feature sets are involved, which in turn brings the obvious minimalist question: why are these features endowed with their alleged properties? This shows that the proposed transitivity in the account of (51) does not follow as a point of logic, but is rather a stipulated feature in Landau’s system. Thus, the proposed composite-“coindexing” relations should be subject to the same skepticism we accord the Barriers approach to A-movement, which licenses A-traces by resorting to a chain coindexing mechanism combining Spec-head agreement with head-to-head government (see Chomsky 1986a, section 11).
The non-explanatory nature of the proposal is further highlighted when Landau’s account of (51) is examined in light of his take on the impossibility of PRO in indicative clauses. As we saw in (39), the feature specification proposed for indicatives involved the features [+T] and [+Agr] for I and no features for C. The reason for C not to be associated with [T] features is that the tense value of I is completely independent from the matrix clause. Furthermore, since Landau (p. 840) assumes that the presence of [Agr] on C is parasitic on [+T], if indicative C does not have [+T], it cannot have [Agr] either. Finally, if it is not specified for both features, it cannot be associated with an [R] feature, according to the R-assignment rule in (46). That being so, Landau (p. 843) claims that the reason why PRO cannot be licensed in the indicative configuration in (53) below (Landau’s [40b]) is that “Agree fails due to a feature mismatch in the R value between I◦ and PRO. Thus, indicative clauses with independent tense universally do not display OC.”
(53) [DP I2 [ . . . tDP . . . [CP C [IP I1[+T, +Agr, +R] [VP PRO[−R] . . . ]]]]]
∗Agree (I1, PRO)
This specific claim now introduces an additional aspect of composite-agreement relations: feature mismatch is taken to cause a derivational crash under direct agreement, like the relation between I1[+R] and PRO[−R] in (53), but not under “composite” agreement, like the relation among I2[+R]-C[+R]-I1[−R]-PRO[−R] in (50). Putting aside the fact that no motivation was provided for why these two instantiations of “Agree” should yield opposite results, it is important to point out that this stipulated aspect of composite agreement leads to overgeneration. Notice that the feature mismatch at the derivational step depicted in (53), that is, before PRO moves, cannot be the reason for the derivation to crash. As we saw in the derivation proposed by Landau for Hebrew subjunctive control in (52), mismatch in the values for [R] by itself is not a problem if the features can be licensed later on in the derivation. Consider for instance the structure in (54), which depicts the movement of PRO in (53).
(54) [DP I2 [ . . . tDP . . . [CP C [IP PRO[−R] I1[+T, +Agr, +R] [VP tPRO . . . ]]]]]
If I2 in (54) is specified as [−R], it will be able to agree with PRO, but the [+R] feature of I1 will remain unchecked, causing the derivation to crash. Suppose, by contrast, that I2 is specified as [+R]. As such, it should be able to agree with I1, as represented in (55), checking the [+R] feature of the latter.
(55) [DP I2[+R] [ . . . tDP . . . [CP C [IP PRO[−R] I1[+T, +Agr, +R] [VP tPRO . . . ]]]]]
What about the [−R] feature of PRO? Recall from (50) and (52) that PRO can be indirectly licensed by a chain of “agreement” relations. In (52), for example, its [−R] feature is taken to be licensed in virtue of PRO’s having agreed with I1 , which had agreed with C, which in turn had agreed with I2 . That being so, there should be no reason for PRO not to get licensed in (55) via a composite-“agreement” relation. That is, its [−R] feature should be licensed once PRO has agreed with I1 , which agrees with I2 . Crucially, composite agreement is assumed to be oblivious to feature mismatch. In other words, once the composite-agreement relations proposed by Landau are assumed, finite control into indicatives becomes freely available. In fairness, Landau (2004: 846–847) seems to assume that the [+R] feature of I1 cannot be checked by a probe higher than C: “We still account for the fact that indicative complements in Hebrew do not display OC. In a configuration like (40b) [= (53) above], as opposed to (43b) [= (52) above], the [+R] feature of I◦ remains unchecked as no corresponding feature exists on the indicative C◦ .” Notice, however, that C does not prevent a higher probe from agreeing with the embedded subject in (48), for instance. Given that the embedded subject and the embedded I are equidistant (see Chomsky 1995), it does not seem plausible to exclude the checking of the [+R] feature of I1 in (55) based on the
intervention of C. Notice also that C in (55) has no features that could block an agreement relation between a higher probe and I1.
A phase-based approach to this conundrum is of no help either. Landau (2004, footnote 26) claims that “[a]lthough not at the edge of its phase, PRO is visible to Agree from the outside since its (φ- and R-) features are interpretable (hence, never erased).” Note, however, that in Landau’s system the φ- and R-features of PRO are only valued after agreement. Thus, it is plausible to assume that spell-out/transfer must be halted until PRO has its features valued and, if this is so, we are back to the technical question of why I1 in (55) cannot be checked by the matrix probe if spell-out/transfer is on hold. Of course, one may attempt to specify the inner workings of spell-out/transfer in such a way that PRO becomes immune to spell-out/transfer at the relevant derivational step, but not I1. But that would only add to an already loaded machinery, without actually shedding light on the discussion. Still, such an attempt would require further complications. Recall that, under composite-agreement relations in Landau’s system (cf. [52] for Hebrew and [ii] in footnote 27 for Balkan languages), the higher probe agrees with the embedded C, “which is also coindexed with PRO via I◦ ” (Landau 2004: 845). This in turn indicates that the embedded I must still be available to the computation at the derivational step where PRO is to have its R-feature licensed.
To wrap up: if composite relations must be assumed in order to account for Hebrew subjunctive control, control into indicative clauses becomes freely allowed. Although this may be good news for languages such as Brazilian Portuguese, as discussed in section 2.5.2.2, it is certainly unwelcome for most languages.
2.5.2.4 Simplifying Landau’s “calculus of control”
Let us examine what the relevant property of Brazilian Portuguese indicative clauses is that triggers obligatory control. Ferreira (2000, 2004, 2009) proposes that finite Ts in Brazilian Portuguese are ambiguous in being associated with either a complete or an incomplete set of φ-features and that obligatory control is licensed in clauses with a φ-incomplete T.29 Nunes (2007, 2008a) reinterprets Ferreira’s proposal in terms of the presence or absence of the feature [person] in T. He observes that the verbal-agreement paradigm of finite clauses in Brazilian Portuguese is such that the only inflection that overtly encodes both number and person is the first-person singular inflection. All the other cases involve either number specification with default value for person (third) or default values for both person and number (third singular), as illustrated in (56).
(56) Verbal-agreement paradigm in (colloquial) Brazilian Portuguese
cantar ‘to sing’: indicative present
Subject                                                      Verb form    Features
eu ‘I’                                                       canto        P:1; N:SG
você ‘you (SG)’, ele ‘he’, ela ‘she’, a gente ‘we’           canta        P:default; N:default (= 3SG)
vocês ‘you (PL)’, eles ‘they (MASC)’, elas ‘they (FEM)’      cantam       P:default; N:PL (= 3PL)
29 Ferreira (2000, 2004, 2009) (as well as Rodrigues 2002, 2004) in fact analyzes null subjects in Brazilian Portuguese in terms of the MTC. Thus, in his system a φ-incomplete T does not value the case feature of the subject of its clause, which can then undergo A-movement to the matrix clause. We will leave a detailed discussion of null subjects in Brazilian Portuguese under the MTC for section 4.4 below.
Nunes proposes that φ-complete and φ-incomplete finite Ts in Ferreira’s terms correspond to Ts specified with number and person features or a number feature only. In case a T with just a number feature is selected, the corresponding person specification will be added in the morphological component by redundancy rules, as sketched in (57) below. That is, if T has only a number feature and it is valued as singular in the syntactic component, it will later be associated with first person in the morphological component; if the number feature receives any other value in the syntactic component (default or plural), it will later be associated with a default value for person (third) in the morphological component.
(57) cantar ‘to sing’: indicative present
Valuation of T in the        Addition of [person] in the      Surface form
syntactic component          morphological component          of the verb
N:SG                         P:1; N:SG                        canto
N:default                    P:default; N:default             canta
N:PL                         P:default; N:PL                  cantam
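The two-step mapping in (57) can be sketched as a small lookup. This is only an illustration under stated assumptions: the names `add_person`, `surface_form`, and `SURFACE_FORMS` are hypothetical labels introduced here, and the feature values are written as plain strings rather than in any particular morphological formalism.

```python
# Surface forms of 'cantar' (indicative present), keyed by (person, number) as in (57).
SURFACE_FORMS = {
    ("1", "SG"): "canto",
    ("default", "default"): "canta",
    ("default", "PL"): "cantam",
}

VALID_NUMBERS = {"SG", "default", "PL"}


def add_person(number: str) -> str:
    """Redundancy rule: supply [person] for a T valued only for number."""
    if number not in VALID_NUMBERS:
        raise ValueError(f"unknown number value: {number!r}")
    # Singular is paired with first person; any other value gets default person.
    return "1" if number == "SG" else "default"


def surface_form(number: str) -> str:
    """Map the syntactic valuation of T to the surface verb form."""
    person = add_person(number)
    return SURFACE_FORMS[(person, number)]


for number in ("SG", "default", "PL"):
    print(f"N:{number} -> P:{add_person(number)}; N:{number} -> {surface_form(number)}")
```

The point of the sketch is that person never has to be present in the syntactic valuation: it is fully predictable from the number value once the redundancy rule applies.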
If finite control in Brazilian Portuguese is related to the possibility that its finite Ts may be specified only for number in the syntactic component, we now find a commonality with Hebrew subjunctive control. As argued by Landau (2004), subjunctive control in Hebrew is restricted to the third person. This can be interpreted as indicating that the relevant subjunctive T in Hebrew may also be associated with only a number feature, in which case it will surface with default-person morphology, that is, third person. This in turn paves the way for a considerable simplification in Landau’s typology, with the desired effects. In other words, the environments where one finds obligatory control involve deficient T-heads, i.e., heads that are temporally deficient, φ-deficient, or both. Landau’s table in (39) can now be revised as in (58), where ‘+’ stands for fully specified and ‘−’ for deficient (or null).
(58)
Obligatory control
[T−, φ−]    untensed uninflected infinitives, etc.
[T+, φ−]    tensed uninflected infinitives, Brazilian Portuguese indicatives, Hebrew 3rd-person subjunctives, etc.
[T−, φ+]    Balkan untensed subjunctives, etc.
No control
[T+, φ+]    English indicatives, Balkan tensed subjunctives, etc.
The table in (58) shares with Landau the intuition that obligatory control is typologically more diverse, but drastically simplifies the “calculus of control” in Landau’s terms. Under (58), finding out whether or not a given clause licenses obligatory control does not need to take the features of C into consideration, for I is sufficient: if either [T] or [φ] is deficient, obligatory control is possible. Besides simplifying Landau’s system and accounting for finite control into indicatives in Brazilian Portuguese, (58) also eliminates the suspicious ambiguity of the combination of the specifications C[+T, +Agr] and I[+T, +Agr] in Landau’s table in (39), which were employed to describe both obligatory control in Hebrew subjunctives and no control in Balkan F-subjunctives (see section 4.4 below for further discussion). As opposed to Landau’s system, what matters in (58) is not the morphological realization of agreement features, but rather how specified the set of agreement features associated with I is. Conceptually, this is also a welcome result. If obligatory control is to be ultimately determined in the syntactic component, why should the PF realization of agreement features matter? From the perspective of (58), the availability of obligatory control is determined by the tense and agreement features that enter in the syntactic component, regardless of their later morphological realization.
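The simplified calculus in (58) amounts to a single condition on I. The sketch below states it as a predicate; the function name `allows_obligatory_control`, the boolean encoding (`True` for fully specified, `False` for deficient), and the dictionary of clause types are illustrative assumptions, with the feature assignments copied from (58).

```python
def allows_obligatory_control(tense_full: bool, phi_full: bool) -> bool:
    """Obligatory control is possible iff [T] or [phi] on I is deficient."""
    return not (tense_full and phi_full)


# (tense fully specified?, phi fully specified?) for the clause types listed in (58).
CLAUSE_TYPES = {
    "untensed uninflected infinitive": (False, False),
    "tensed uninflected infinitive": (True, False),
    "Brazilian Portuguese indicative": (True, False),
    "Hebrew 3rd-person subjunctive": (True, False),
    "Balkan untensed subjunctive": (False, True),
    "English indicative": (True, True),
    "Balkan tensed subjunctive": (True, True),
}

for clause, (tense_full, phi_full) in CLAUSE_TYPES.items():
    status = "obligatory control" if allows_obligatory_control(tense_full, phi_full) else "no control"
    print(f"{clause}: {status}")
```

Note that the predicate never consults the features of C, which is exactly the simplification relative to (39).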
The question that now arises is why obligatory control should correlate with deficiency in tense or φ-feature specification, as depicted in (58). We have seen that Landau’s Agree-based approach rested on the admittedly stipulated postulation and distribution of [R]-features, which mimicked the case-based approach to PRO in previous models. Given that the distribution of PRO is handled in such a stipulative manner, it would not be surprising if Landau’s R-assignment rule in (46) could be reformulated in such a way that it should become compatible with the generalizations embodied in (58), something that we will not pursue here. However, notice that tense or φ-feature deficiency generally characterizes “porous” domains out of which movement can take place (see e.g., Boeckx 2003, 2005). Thus, from the perspective of the MTC, the picture embodied in (58) is exactly what one would expect: we can simply replace control by A-movement, as in (59).
(59)
A-movement: √
[T−, φ−]    untensed uninflected infinitives, etc.
[T+, φ−]    tensed uninflected infinitives, Brazilian Portuguese indicatives, Hebrew 3rd-person subjunctives, etc.
[T−, φ+]    Balkan untensed subjunctives, etc.
A-movement: ∗
[T+, φ+]    English indicatives, Balkan tensed subjunctives, etc.
We return to this correlation between movement/obligatory control and Infl-deficiency in section 4.4.
2.5.2.5 Summary
Combining aspects of syntactic and semantic approaches, Landau’s Agree-based approach to control involves a rich array of features that allows him to capture many manifestations of control, with a very high degree of formal explicitness. Here we have focused on three major pillars of his proposal: (i) the importance ascribed to partial control; (ii) the typology predicted by his feature system; and (iii) the technical details of how the distribution and interpretation of PRO are to be obtained through the operation Agree. We have seen that, given their sensitivity to the argument properties of the embedded predicate, partial-control constructions may also be conceived as involving a null commitative argument, instead of an obligatorily controlled PRO with an independent semantic plural feature. In other words, the existence of partial control is not by itself an argument for PRO-based theories. As for the typology predicted, the system undergenerates in that it has no room for finite control into indicatives, which is allowed in Brazilian Portuguese. Finally, the technical apparatus rests on various stipulations regarding the properties of features needed to track the distribution and interpretation of PRO and on composite-agreement relations that are not independently motivated and lead to overgeneration. All in all, we agree with Landau (p. 842) that the theoretical foundations for his approach are on equal footing with the Case-based approach in previous models. Despite its technical precision and empirical coverage, it accounts for the distribution and the interpretation of PRO by ultimately encoding the facts to be explained in the guise of stipulative lexical features.
2.6 Conclusion
In this chapter we have discussed different approaches to control within the generative enterprise, from the standard theory to minimalism. It is an interesting fact that the standard theory and the GB approaches took PRO not as a lexical formative, but as the output of the computations of the syntactic component. Accordingly, each of them attempted to account for the distribution and interpretation of PRO in terms of the broad architectural properties of the model of UG then assumed. By contrast, the null-case and Agree approaches take PRO to be a lexical item and, despite their laudable attempts to deduce the distributional and interpretive properties of PRO, they end up simply encoding them as lexical features, thereby eschewing true explanation. With this background, we are now ready to examine the major properties of the MTC, given a minimalist setting.
3 Basic properties of the movement theory of control
3.1 Introduction
If we could start afresh, without our historical baggage and the preconceptions that often come with it, we would likely be struck by the similarities between sentences like (1a) and (1b) below. Both sentences involve a matrix predicate that embeds a non-finite sentential complement and, more interestingly, the unrealized subject of the embedded clause is interpreted as being “the same” as the subject of the matrix clause. That is, ‘John’ is the kisser in both (1a) and (1b).
(1) a. John seemed to kiss Mary
    b. John tried to kiss Mary
In the face of these structural and interpretive similarities, our fresh minds – unbiased but armed with Occam’s razor – would undoubtedly attempt to capture them in a uniform way, with the same mechanisms, unless presented with strong independent reasons for not doing so. The seduction of this simple reasoning encapsulates the MTC. The MTC takes it that the null hypothesis for the derivation of raising and control constructions such as (1a) and (1b) should resort to the same grammatical devices. Thus, if (1a) is to be analyzed in terms of A-movement, (1b) should prima facie be analyzed as involving A-movement as well. Of course, null hypotheses can be, and frequently are, incorrect. But the incorrectness has to be demonstrated and this – in our view – has not been the case with the MTC, despite claims to the contrary, as we shall discuss. In this chapter, we present the basic features of (our version of) the MTC, leaving a detailed discussion of its empirical advantages and its solutions to problems raised in the literature to chapters 4 and 5. Section 3.2 starts with a historical discussion of factors that prevented the MTC from being entertained as the null hypothesis from day one. Section 3.3 shows how the abandonment of D-structure in the minimalist program made it possible and natural to explore the null hypothesis underlying the MTC. Section 3.4 shows how an analysis of controlled PRO as a trace of A-movement deduces the configurational, phonetic, and interpretive properties of obligatory control and how obligatorily controlled PRO can be dispensed with as a grammatical formative. Finally, section 3.5 reviews the architectural features of the MTC and its place within the minimalist program, showing that there exist no strong reasons for discarding the null hypothesis regarding the derivation of control and raising constructions.
3.2 Departing from the null hypothesis: historical, architectural, and empirical reasons
Consider once again the examples in (1). For all their similarities, there is a difference between the two sentences. In (1b) ‘John’ has an interpretive function in virtue of being the matrix subject that is absent in (1a). Infelicitously, we might say that in (1b) John is described as both a kisser and a trier while in (1a), though he is a kisser still, he is in no sense a seemer. This is reflected in the fact that (1a) has a paraphrase like ‘it seemed that John kissed Mary,’ while (1b) has (at best) the very awkward paraphrase ‘John tried for John to kiss Mary’ and no possible paraphrase analogous to the one for (1a): ‘it tried for John to kiss Mary’ is almost incomprehensible. Generative grammarians, it is fair to say, have been more impressed by the last noted difference than the aforementioned similarities, for they have generally taken the semantic difference revealed by the paraphrases to indicate that these otherwise similar sentences have entirely different generative (derivational) profiles. Example (1a) is taken to mediate the relation between the two subject positions by moving ‘John’ from the embedded to the matrix position, leaving behind a coindexed trace, as illustrated in (2a) below. By contrast, (1b) is taken to relate ‘John’ to the embedded-subject position through some kind of binding relation, as represented in (2b).1
(2) a. [John1 seemed [t1 to kiss Mary]]
    b. [John1 tried [PRO1 to kiss Mary]]
It is important to note that enriching the theoretical apparatus by having UG invoke different grammatical resources in order to capture the interpretive difference between (1a) and (1b) is not the only conceivable option. As already suggested in section 3.1, one might imagine keeping the theoretical apparatus constant and analyzing (1b) also in terms of movement, as illustrated in (3) below. From this perspective, the interpretive difference between (1a) and (1b) is to be ascribed to the indisputable fact that there is an additional θ-role available in the matrix clause of (1b) (the “trier” θ-role), but not in the matrix clause of (1a). Therefore, as ‘John’ moves to the matrix clause, it may establish a new thematic relation in (1b), but not in (1a).
(3) [John1 tried [t1 to kiss Mary]]
1 Unless it is relevant to the discussion, we abstract from the VP internal-subject hypothesis in this chapter. Also for presentational purposes, we will employ GB representations in terms of traces and PRO when nothing is at stake; as we saw in section 2.3, similar distinctions were made in earlier periods using different technology.
The analysis in (3), which essentially embodies the MTC, is arguably the null hypothesis for the analysis of (1b). Given the pervasive role of movement in the grammar – however it is encoded – Occam’s razor should urge us to attempt to make do with movement, given that it is already independently required. Interestingly, this has not been a widely explored path in generative grammar2 and it is worth pausing to consider why this is so. In part, this can be attributed to historical reasons regarding the development of the field. In the earliest days of generative grammar, simply attaining descriptive adequacy was a tremendous challenge as the basic formal tools to handle linguistic data were still in the making. It is therefore unsurprising that these first stages were essentially taxonomic, establishing the inventory of possible constructions in natural languages and formulating the rules that should yield the catalogued constructions. Thus, until the early 1980s, transformations were complex operations that generated alternative structures typed by construction (e.g., the passive rule, the wh-question formation rule, the relativization rule, the topicalization rule, etc.). In this scenario, differentiating a raising rule from a control rule (the equi(valent) NP deletion rule; see section 2.3) makes good sense. One very good diagnostic for differentiating two constructions is their differing effects on meaning and, as we noted above, raising sentences like (1a) do differ from control sentences like (1b) as far as the thematic powers of the subject ‘John’ are concerned. This motivation, though reasonable in this background, ceases to be persuasive with the emergence of the principles-and-parameters approach. One important legacy of the GB era with the shift from constructions and rules to principles and parameters is that constructions are now viewed as epiphenomena, resulting from the interaction of more basic operations. 
Instead of a (roughly) one-to-one relation between constructions and rules, we now find basic operations such as Move α (or, more radically still, Lasnik and Saito’s [1992] Affect α) underlying the derivation of a multitude of different types of constructions. For instance, the derivation of passive and raising constructions is taken to employ the same grammatical device, namely, (A-)movement, rather than resorting to the distinct rules of passive and raising. The differences between these constructions, such as the dethematicization of the external argument in the case of passives, are then factored out and analyzed in terms of other independent components of the grammar (e.g., θ-theory). But once one goes this far, there is no obvious conceptual barrier to categorizing control with passive and raising, all sharing the same generative resources. The three constructions could potentially be derived by the same grammatical tool (A-movement) and their differences allocated to different components of the grammar. Let us make the same point in a slightly different way: nobody assumes that wh-questions are the same as relative clauses. Nonetheless, it is now widely accepted (among generative grammarians) that, whatever differences the two constructions enjoy (and there are many), these differences do not undermine the claim that their derivations both involve a common (A’-)movement operation. The MTC asks that this same reasoning be applied to raising and control configurations. The source of their differences may reside not in the operations that go into generating them, but in the interaction of their specific properties with other grammatical components. In section 3.3 and Chapter 5 below, we will examine in detail potential sources for the differences between control and raising constructions documented in the literature. The important point to bear in mind here is that with the abandonment of constructions as grammatical primitives, there remains no logical impediment for entertaining the hypothesis that control structures are derived by movement.
2 An early notable exception is Bowers (1973).
In fact, the principles-and-parameters perspective, in which constructions are not theoretical primitives, invites one to eliminate the exceptional theoretical status of control qua construction in the grammar. Recall from our discussion in section 2.4 that the analysis of control in GB could not ultimately be reduced to binding/construal without additional provisos. Once PRO is analyzed as a pronominal anaphor, the account of the interpretive properties of PRO requires a specific grammatical module, the control module, which must somehow disregard the pronominal specification of PRO in OC constructions and its anaphoric specification in NOC constructions. Such construction sensitivity looks like a fossil from previous stages of the generative enterprise that one would like to get rid of. The second historical reason for why the MTC was not pursued within generative grammar, which also accounts for why the construction-specific flavor of the control module was tolerated within GB, has to do with the general architecture of the grammar standardly assumed prior to the minimalist
program. More specifically, the assumption of D(eep)-structure (DS) as a level of representation left no room for an approach along the lines of the MTC.3 Let us see why. In prior models, DS has two important properties. First, it codes all the relevant argument-structure information a sentence expresses. It is, in technical lingo, a pure “representation of GF-θ,” where all and only argument (thematic) positions are filled.4 Thus, if a predicate has a logical subject and object (agent and theme), then both subject and object positions must be lexically filled at DS. On the other hand, if a verb has a logical object but no subject (e.g., a passive or unaccusative verb), the object position must be lexically filled at DS, whereas the subject position must be empty. The second important aspect of DS is that, functionally, it is the output of phrase-building operations (namely, phrase-structure rules and lexical-insertion operations) and the input to the transformational component. In other words, in models that include a DS level all lexical-insertion operations precede all movement transformations. Let us now examine how the sentences in (1) repeated below in (4) are analyzed under these assumptions. Take the representations in (5), for instance, which indicate that ‘John’ has moved from the embedded to the matrix-subject position, leaving a trace behind.
(4) a. John seemed to kiss Mary
    b. John tried to kiss Mary
(5) a. [John1 seemed [t1 to kiss Mary]]
    b. [John1 tried [t1 to kiss Mary]]
Given the requirement that lexical insertion must precede movement, the DS representation of (5a) and (5b) should be as in (6) below, with ‘John’ generated in the embedded-subject position. In both (6a) and (6b), ‘kiss’ has two arguments and its subject and object positions are correctly filled. ‘Seem’ in (6a) does not assign a θ-role to its subject position and, accordingly, its subject position is left empty. By contrast, ‘try’ does assign a θ-role to its subject position, but there is no category filling this position in (6b). Hence, a movement-based derivation along the lines of (5b) for the control construction in (4b) is ruled out at DS, as the thematic requirements of ‘try’ are not satisfied at this level. The derivation of (4b) should therefore have two distinct elements filling the subject positions, as illustrated in (7b), with ‘John’ occupying the matrix-subject position and PRO the embedded-subject position. Notice that a similar DS representation for raising constructions, as illustrated in (7a), is not licit, as the subject position of ‘seem’ is filled despite the fact that it is not thematic. In other words, assuming DS in the grammar unavoidably leads to a movement analysis of raising and a construal analysis of control.
(6) a. DS: [Seemed [John to kiss Mary]]
    b. DS: *[Tried [John to kiss Mary]]
(7) a. DS: *[John seemed [PRO to kiss Mary]]
    b. DS: [John tried [PRO to kiss Mary]]
3 Deep structure is not quite the same as D-structure. However, for current purposes the differences are insignificant as are the differences between various models of grammar that included a D-structure level, e.g., EST and both early and late GB models.
4 See Chomsky (1981: 43).
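The “all and only thematic positions are filled” requirement can be made concrete with a small check. What follows is only a minimal sketch under simplifying assumptions of our own: each clause is reduced to its subject position, and the names `SubjectPosition` and `ds_violations` are hypothetical illustrations, not part of any established formalism.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubjectPosition:
    predicate: str
    thematic: bool          # does the predicate assign a theta-role to its subject?
    filler: Optional[str]   # material occupying the position at DS, or None if empty


def ds_violations(positions: List[SubjectPosition]) -> List[str]:
    """Return the predicates whose subject position is filled but not thematic,
    or thematic but not filled (the 'all and only' requirement on DS)."""
    bad = []
    for pos in positions:
        filled = pos.filler is not None
        if filled != pos.thematic:
            bad.append(pos.predicate)
    return bad


# Hypothetical encodings of the DS representations in (6) and (7).
DS_6A = [SubjectPosition("seem", False, None), SubjectPosition("kiss", True, "John")]
DS_6B = [SubjectPosition("try", True, None), SubjectPosition("kiss", True, "John")]
DS_7A = [SubjectPosition("seem", False, "John"), SubjectPosition("kiss", True, "PRO")]
DS_7B = [SubjectPosition("try", True, "John"), SubjectPosition("kiss", True, "PRO")]

for label, ds in [("6a", DS_6A), ("6b", DS_6B), ("7a", DS_7A), ("7b", DS_7B)]:
    print(label, ds_violations(ds))
```

Under this encoding only (6a) and (7b) come out without violations, which mirrors the conclusion that a DS-based model pairs raising with movement and control with PRO.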
The big empirical virtue of assigning distinct DS representations to raising and control constructions along the lines of (6a) and (7b) is that it derives semantic differences between these constructions in a principled manner. Take the contrasts in (8)–(10), for example.
(8) a. There seems to be someone kissing Mary
    b. *There tried to be someone kissing Mary
(9) a. The cat seems to be out of the bag (idiomatic interpretation: OK)
    b. The cat tried to be out of the bag (idiomatic interpretation: *)
(10) a. The doctor seemed to examine Mary ∼ Mary seemed to be examined by the doctor
     b. The doctor tried to examine Mary ≠ Mary tried to be examined by the doctor
In (8), we can find an expletive in the subject of ‘seem’ but not of ‘try.’ Why? Because expletives have no semantic content and so cannot bear θ-roles.5 As the subject of ‘seem’ is not thematic, it must be empty at DS, as shown in (11a) below; ‘there’ can then be inserted in the subject position of ‘seem’ after DS without semantic (or grammatical) violence being done. In contrast, as the subject position of ‘try’ is thematic, at DS it must be filled by a category that can bear a θ-role. If it is left empty at DS, as in (11b) and later filled with ‘there,’ the thematic properties of ‘try’ will not be satisfied at DS. If ‘there’ fills this position already at DS, as in (11c), we again have an illicit DS representation, for ‘there’ is not a valid θ-role bearer.
(11) a. DS: [Seems [to be someone kissing Mary]]
     b. DS: *[Tried [to be someone kissing Mary]]
     c. DS: *[There tried [to be someone kissing Mary]]
5 At least not conventional ones. We set aside for the nonce the “quasi”-argument status of ‘it’ in weather constructions like ‘it is raining’ (see Chomsky 1986b for discussion).
We can play the same game with the idioms in (9). Idioms are not compositionally interpreted. If we take this to mean that they cannot bear conventional θ-roles, then an idiom (or part of one) cannot be simultaneously interpreted idiomatically and thematically. In (9a) ‘the cat’ can retain its idiomatic meaning, as the subject of ‘seem’ is not thematic and is left empty at DS, as represented in (12a) below. Thus, (9a) can be interpreted as meaning that it seems like the secret has been revealed. This is not possible in (9b). Once ‘the cat’ is in a thematic position, as shown in (12b), it cannot have an additional idiomatic interpretation. Consequently, though (9b) is well formed and interpretable, it means something like ‘the kitty tried to escape the confining sac.’ It has nothing to do with secrets revealed or otherwise.
(12) a. DS: [Seems [[the cat] to be out of the bag]]
     b. DS: [[The cat] tried [PRO to be out of the bag]]
The pairs of sentences in (10) offer a final illustration of the same point. The sentences in (10a) are rough paraphrases of one another, both meaning that it seems that the doctor examined Mary. They display “voice transparency” in the sense that passivizing the embedded clause has little effect on the interpretation of the whole. This transparency is captured at DS as ‘Mary’ fills the object position of the embedded verb in the DS representation of both sentences, as illustrated in (13) below. By contrast, the control counterparts in (10b) have completely different meanings, with the effort being made by the doctor in the first sentence but by Mary in the second. Lack of voice transparency in (10b) is attributed to the different positions ‘Mary’ occupies at DS in each case. As illustrated below, at DS ‘Mary’ is the thematic object of ‘examine’ in (14a) but the thematic subject of ‘try’ in (14b).
(13) a. DS: [Seemed [the doctor to [examine Mary]]]
     b. DS: [Seemed [to be [examined Mary] by the doctor]]
(14) a. DS: [The doctor tried [PRO to [examine Mary]]]
     b. DS: [Mary tried [to be [examined PRO] by the doctor]]
In sum, the two features of DS reviewed above (namely, that DS is substantively the level at which all and only thematic positions must be filled and functionally the level that feeds movement operations) conspire to eliminate a movement approach to control along the lines of (15) below at DS. This is one reason why the MTC was not considered viable in GB or in previous models that assumed DS.
(15) [John1 tried [t1 to kiss Mary]]
Furthermore, the interaction of these two features of DS derives subtle interpretive properties of raising and control structures and this was certainly a big achievement. So much so that retaining the clumsy construction sensitivity of the control module in a principles-and-parameters model seemed a reasonable price to pay. But questions arise if the architecture of the model changes. Specifically, what if we give up DS? Can the interpretive differences between raising and control still be captured? If so, are we not then free to reconsider the null hypothesis regarding control, expressed in (15)? This is the subject of the next section.
3.3 Back to the future: elimination of DS and the revival of the null hypothesis
As we discussed in section 3.2, assuming DS has drastic consequences for the (simplest) version of the MTC expressed in (15). As DS requires that all θ-roles must be discharged before any movement takes place, the θ-roles associated with the controller and the controllee must be assigned before any movement operation, which in practice prevents the controller and the controllee from being associated via movement. Models that eschew a DS level are therefore free to pursue the MTC. In other words, without DS there is no obvious architectural reason against reducing raising and control to a common movement source. This observation has become especially relevant with the emergence of the minimalist program in the early 1990s, as minimalism argues against the postulation of non-interface levels such as DS.6 In particular, minimalists have explored the idea that lexical insertion and θ-assignment, on the one hand, and movement, on the other, can be freely interspersed. A sentence such as (16), for instance, is to be associated with the (simplified) derivation in (17) (see footnote 1), where movement of ‘what’ in (17e) is sandwiched between different applications of lexical insertion and θ-assignment.
(16) What did Mary say that John saw
(17) a. Merger of ‘saw’ and ‘what’ + θ-assignment: [saw what]
     b. Merger of T: [T [saw what]]
     c. Merger of ‘John’ + θ-assignment: [John [T [saw what]]]
     d. Merger of ‘that’: [that [John [T [saw what]]]]
     e. Movement of ‘what’: [whati [that [John [T [saw ti]]]]]
     f. Merger of ‘say’ + θ-assignment: [say [whati [that [John [T [saw ti]]]]]]
     g. Merger of T: [T [say [whati [that [John [T [saw ti]]]]]]]
     h. Merger of ‘Mary’ + θ-assignment: [Mary [T [say [whati [that [John [T [saw ti]]]]]]]]
     i. Merger of C: [C [Mary [T [say [whati [that [John [T [saw ti]]]]]]]]]
     j. Movement of ‘what’: [what [C [Mary [T [say [whati [that [John [T [saw ti]]]]]]]]]]
6 For relevant discussion, see Chomsky (1995), Uriagereka (1998), and Hornstein, Nunes, and Grohmann (2005), among others.
Once it is independently assumed that θ-role assignment may follow applications of movement, it is not at all odd to suppose that a control construction such as (18) below could be derived along the lines of (19), where the “trier” θ-role is assigned after movement of ‘John’ (cf. [19g]). This is even more so if movement is in fact a composite operation that includes merger as one of its basic operations.7 That is, if movement of ‘John’ in (19g) involves plugging it into the structure via merger, this merger operation should in principle license θ-assignment in the same way merger of ‘Mary’ in (19a) or ‘John’ in (19c), for instance, does.
(18) John tried to kiss Mary
(19) a. Merger of ‘kiss’ and ‘Mary’ + θ-assignment: [kiss Mary]
     b. Merger of T: [T [kiss Mary]]
     c. Merger of ‘John’ + θ-assignment: [John [T [kiss Mary]]]
     d. Merger of C: [C [John [T [kiss Mary]]]]
     e. Merger of ‘tried’ + θ-assignment: [tried [C [John [T [kiss Mary]]]]]
     f. Merger of T: [T [tried [C [John [T [kiss Mary]]]]]]
     g. Movement of ‘John’ + θ-assignment: [Johni [T [tried [C [ti [T [kiss Mary]]]]]]]
7 This holds whether move is identical to merge (is internal merge) or contains it as a subpart (involves copy and merge). See Chomsky (1995, 2000, 2001), Nunes (1995, 2001, 2004, in press), and Hornstein (2001) for relevant discussion.
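The bookkeeping behind a derivation like (19) can be sketched in a few lines. This is a toy illustration under our own assumptions, not an implementation of any particular minimalist formalism: the function `derive`, the step encoding, and the role label "kissee" are hypothetical, and step (19a) is split into two entries for simplicity. The only point it makes is that once merge, move, and θ-assignment may be interleaved, a moved element can collect a second θ-role.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Step = Tuple[str, str, Optional[str]]  # (operation, item, theta-role assigned or None)


def derive(steps: List[Step]) -> Dict[str, List[str]]:
    """Run merge/move steps in order and return the theta-roles each item collects.
    There is no DS-like stage: roles may be assigned at any merge or move step."""
    roles: Dict[str, List[str]] = defaultdict(list)
    merged = set()
    for op, item, role in steps:
        if op == "move" and item not in merged:
            raise ValueError(f"cannot move {item!r} before it has been merged")
        merged.add(item)
        if role is not None:
            roles[item].append(role)
    return dict(roles)


# Simplified rendering of (19): 'John tried to kiss Mary'.
CONTROL: List[Step] = [
    ("merge", "kiss", None),
    ("merge", "Mary", "kissee"),
    ("merge", "T", None),
    ("merge", "John", "kisser"),
    ("merge", "C", None),
    ("merge", "tried", None),
    ("merge", "T", None),
    ("move", "John", "trier"),   # movement into a thematic position
]

# Raising counterpart: 'seemed' has no subject theta-role to assign.
RAISING: List[Step] = CONTROL[:5] + [
    ("merge", "seemed", None),
    ("merge", "T", None),
    ("move", "John", None),
]

print(derive(CONTROL)["John"])
print(derive(RAISING)["John"])
```

In this sketch the control and raising derivations use exactly the same operations; they differ only in whether the final movement step is accompanied by θ-assignment.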
The question that we now have before us is empirical. Can the analysis sketched in (19) account for the differences between raising and control illustrated in (8)–(10), repeated below in (20)–(22)?
(20) a. There seems to be someone kissing Mary
     b. *There tried to be someone kissing Mary
(21) a. The cat seems to be out of the bag (idiomatic interpretation: OK)
     b. The cat tried to be out of the bag (idiomatic interpretation: *)
(22) a. The doctor seemed to examine Mary ∼ Mary seemed to be examined by the doctor
     b. The doctor tried to examine Mary ≠ Mary tried to be examined by the doctor
As we can verify in the relevant representations of the control structures in (23), the answer is affirmative.
(23) a. *[Therei tried [ti to be someone kissing Mary]]
     b. [[The cat]i tried [ti to be out of the bag]]
     c. [[The doctor]i tried [ti to [examine Mary]]]
     d. [Maryi tried [ti to be [examined ti] by the doctor]]
In (23a), ‘there’ moves to a thematic position, but it is not a licit θ-role bearer; hence the unacceptability of (20b). Under the assumption that a given element cannot be simultaneously interpreted as a thematic argument and an idiom chunk, movement of ‘the cat’ to the thematic matrix-subject position in (23b) will only be licit if ‘the cat’ is not interpreted idiomatically in the embedded clause (cf. [21b]). Finally, ‘Mary’ is associated only with the “examinee” θ-role in (23c), but with both the “examinee” and “trier” θ-roles in (23d); hence the two sentences in (22b) are not paraphrases of one another. In sum, once DS is removed, the MTC arises as a theoretical possibility worth considering – in fact, it becomes, we believe, the null hypothesis. Moreover, semantic contrasts such as the ones in (20)–(22), which were used to motivate the derivation of raising and control in terms of distinct grammatical devices (traces vs. PRO), are accounted for by invoking the same or comparable
assumptions that a PRO-based DS approach must resort to. In other words, the use of the minimal technical apparatus independently required (i.e., movement) has so far also proven to be empirically adequate. Before we close this section and examine other consequences of the analysis outlined thus far, it is important to observe that we are not claiming that the MTC follows from the abandonment of DS (although the “simplest” removal of DS does entail this; cf. Chapter 8 for discussion). Chomsky (2004), for instance, proposes that all thematic information must be discharged via “external” merge, that is, the merge operation that is not part of movement (“internal” merge). This shows that dropping DS does not entail movement into thematic positions.8 However, eliminating DS is a necessary condition for the viability of the MTC. In fact, one might be tempted to suggest that any model that rejects DS on principled grounds should welcome the unification of raising and control by seeing both as products of movement, especially if there is almost no (or rather little) empirical cost in doing so. In this sense, the MTC fits rather comfortably with the broad architectural features of the minimalist program, as we shall discuss in detail later.
3.4 Controlled PROs as A-movement traces
By treating OC PROs as traces of A-movement (movement to a thematic position), the MTC unifies form and meaning. Not only does it account for the lack of phonetic realization of PRO and the distribution of PRO and its controller, but it also goes a fair way to accounting for PRO’s interpretive properties. To illustrate, we examine in the following sections the distinctive characteristics of OC PRO listed in (24) (see section 2.4).
(24) a. OC PRO requires an antecedent:
        *[It was hoped [PRO1 to shave himself]]
     b. Its antecedent must c-command it:
        *[John1’s campaign hopes [PRO1 to shave himself]]
     c. Its antecedent must be local:
        *[John1 thinks [that it was hoped [PRO1 to shave himself]]]
     d. It cannot appear in case-marked positions:
        *[John1 said [(that) PRO1 will travel tomorrow]]
     e. It gets a sloppy interpretation under ellipsis:
        [John1 wants [PRO1 to win]] and [Bill does too]
        (‘and Bill wants himself to win’/*‘and Bill wants John to win’)
     f. It cannot have split antecedents:
        *[John1 asked Bill2 [PRO1+2 to shave themselves/each other]]
     g. It has an obligatory de se interpretation in “unfortunate” contexts:
        [[The unfortunate]1 expects [PRO1 to get a medal]]
        (#although he doesn’t expect himself to get a medal)
     h. It must receive a bound reading when linked to an only-DP:
        [[Only Churchill]1 remembers [PRO1 giving the BST speech]]
        (‘Only Churchill is such that he remembers himself giving the BST speech’/*‘Nobody else remembers that Churchill gave the BST speech’)
8 See Hornstein, Nunes, and Grohmann (2005, section 2.3.2.2) for relevant discussion.
3.4.1 Configurational properties
The properties illustrated in (24a–d) are standard properties of traces of A-movement (NP-traces in GB terminology).9 In (24a), for instance, PRO requires an antecedent, but if it takes the expletive as its antecedent, it cannot license ‘himself.’ The same situation is found in (25), where the trace must take ‘it’ as its antecedent and, therefore, fails to license the anaphor.10
(25) *[It1 was expected [t1 to shave himself]]
The similarity also extends to the c-command and locality restrictions on the position of the controller. Under the standard assumptions that movement (in general) targets a c-commanding position11 and that A-movement in particular is very local, the ungrammaticality of (24b) and (24c) reduces to the ungrammaticality of the representations in (26), where the antecedent does not c-command the A-trace in (26a) and is not local in (26b) due to the intervention of ‘it.’
(26) a. *[[John1’s sister] was hired t1]
     b. *[John1 seems [that it was likely [t1 to shave himself]]]
The analysis of the ungrammaticality of (24c) in terms of minimality is in essence a reincarnation of Rosenbaum’s (1967, 1970) minimal-distance principle (see section 2.3). Take the object-control construction in (27) below, for instance. Under an (updated) Larsonian representation of ditransitive constructions, the structure of the matrix vP in (27) should be as in (28). If PRO is a trace, it must be the trace of ‘Mary’ in (28). If it were the trace of ‘John,’ movement of ‘John’ from the embedded-subject position to the matrix [Spec, vP] would violate minimality, as ‘Mary’ intervenes.12
(27) [Johnk convinced Maryi [PROi/*k to leave]]
9 As discussed in section 2.4, the parallels between OC PRO and NP-traces have often been noted. See, for example, Koster (1987), where the similarities are expressly emphasized.
10 There arises the question of why the sentence in (ia) below cannot be derived along the lines of (ib), where the OC PRO/A-trace does have a (local c-commanding) antecedent. We postpone the discussion of sentences such as (i) to section 5.2 below, where we argue that they are excluded due to a minimality violation.
   (i) a. *John was hoped to shave himself
       b. *[John1 was hoped [PRO1/t1 to shave himself]]
11 We return to this issue in section 4.5.1 below, where we discuss cases of movement to non-c-commanding positions in adjunct-control constructions.
(28) [vP Johnk [v’ convincedw+v [VP Maryi [V’ tw [PROi/*k to leave]]]]]
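The minimality reasoning applied to (28) can be illustrated with a short sketch. It rests on encoding assumptions of our own: each node of a binary-branching tree is identified by its path from the root (0 for the left daughter, 1 for the right), and the helpers `c_commands` and `closest_c_commander` are hypothetical names introduced only for this illustration.

```python
from typing import Dict, Optional, Tuple

Path = Tuple[int, ...]  # path from the root: 0 = left daughter, 1 = right daughter


def c_commands(a: Path, b: Path) -> bool:
    """a c-commands b if neither dominates the other and a's mother dominates b."""
    if len(b) < len(a):
        return False              # b is too high to sit under a's mother
    if b[:len(a)] == a:
        return False              # a is b itself or dominates b
    return b[:len(a) - 1] == a[:-1]


def closest_c_commander(dps: Dict[str, Path], target: Path) -> Optional[str]:
    """Among the DPs that c-command the target, return the structurally closest one."""
    candidates = {name: path for name, path in dps.items() if c_commands(path, target)}
    if not candidates:
        return None
    # All c-commanders of a node lie along its path to the root, so the deepest is closest.
    return max(candidates, key=lambda name: len(candidates[name]))


# Positions in (28): vP -> John v'; v' -> convinced+v VP; VP -> Mary V'; V' -> t [PRO to leave]
DPS_28 = {"John": (0,), "Mary": (1, 1, 0)}
EMBEDDED_SUBJECT = (1, 1, 1, 1, 0)

print(closest_c_commander(DPS_28, EMBEDDED_SUBJECT))
```

On this encoding both ‘John’ and ‘Mary’ c-command the embedded subject, but ‘Mary’ is closer, which corresponds to the object-control reading of (27); with no intervening object, the subject would be the closest candidate.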
As for (24d), it is also a property that applies to A-traces, as shown in the “hyper-raising” case in (29), where ‘John’ moves from a position associated with nominative case.
(29) *[John1 seems [(that) t1 will travel tomorrow]]
Under current minimalist technology, the fact that one does not find traces in case-marked positions follows from Chomsky’s (2001) activation condition, according to which an element is active for purposes of A-movement only if it has not checked/valued its case feature. Again, if OC PRO is an A-trace, it must occupy a caseless position, otherwise the activation condition is violated.13 Finally, notice that here we don’t have two subspecies of A-chains (the canonical ones, which have to be headed by a case position, and the exceptional ones headed by PRO, which are not subject to this requirement), as was the case in GB (see section 2.4). Under the MTC, A-chains are uniformly associated with a case position (see section 5.4 below for further discussion).
12 We return to a discussion of apparent counter-examples such as (i) in section 5.5.1 below.
   (i) [Johnk promised Maryi [PROk/*i to go]]
13 Notice that this does not entail that OC PRO/trace is always excluded from the subject position of finite clauses. As we saw in sections 2.5.2.2 and 2.5.2.4, Brazilian Portuguese allows obligatory control into indicative finite clauses. We will return to the case properties of finite-control constructions in section 4.4 below.
3.4 Controlled PROs as A-movement traces
3.4.2 Interpretive properties
The requirement that ellipsis involving OC have a sloppy reading (cf. [24e]) tracks what we find in raising constructions, as exemplified by (30) below, where the second conjunct is understood as ‘Bill also seems to be cooperative.’ Regardless of how ellipsis is to be ultimately analyzed, the similarities of interpretation between (24e) and (30) can receive a straightforward account if they are associated with the same type of dependency relation (i.e., movement), as represented in (31).
(30) John seems to be cooperative and Bill does too
(31) a. [John1 wants [t1 to win]] and [Bill does too]
b. [John1 seems [t1 to be cooperative]] and [Bill does too]
The prohibition against split antecedents (cf. [24f]) also finds a straightforward account under the MTC. From the point of view of the MTC, if α is the antecedent of PRO, then α must have moved from the position occupied by PRO, that is, PRO is the trace of the so-called antecedent. That being so, split antecedents would only be possible if two DPs could move from one and the same position. However, a standard assumption within generative grammar is that two expressions cannot occupy the very same position. This restriction is even clearer under the bare phrase-structure system (see Chomsky 1994, 1995), where there is no distinction between positions and elements that undergo merger and merge is assumed to be a binary operation. Under these assumptions there is no way for the computational system to simultaneously merge two unconnected terms α and β to another term γ. Thus, the restriction on split antecedents turns out to have more to do with PRO (the source of the movement) than the antecedents themselves (the targets of movement).14
Before addressing how the properties in (24g) and (24h) can be accounted for, it should be noted that PRO differs from standard pronouns, as far as de se readings and bound readings with ‘only’ are concerned. As we saw earlier, the interpretation of a sentence such as (32a) below requires that the unfortunate be conscious of who he is and expect himself to get a medal; hence the addition in parentheses is infelicitous. The use of a pronoun instead of PRO in (32b) does not trigger such a restrictive interpretation. It is compatible with a de se reading, but also admits a non-de se interpretation. For example, it admits a reading with the unfortunate expecting that some particular individual should get a medal given what he read about this person without having the knowledge that this individual is actually him, the unfortunate. In this scenario, (32b) may be felicitously followed by the addition in parentheses. Similarly, in (33) the pronoun supports the bound reading required by PRO, as well as a coreferential reading where the sentence may be falsified if anybody remembers that Churchill delivered the BST speech.
(32) a. [[The unfortunate]1 expects [PRO1 to get a medal]] (#Although he doesn’t expect himself to get a medal)
b. [[The unfortunate]1 expects [that he1 should get a medal]] (Although he doesn’t expect himself to get a medal)
(33) a. [[Only Churchill]1 remembers [PRO1 giving the BST speech]] (‘Only Churchill is such that he remembers himself giving the BST speech’/ ∗‘Nobody else remembers that Churchill gave the BST speech’)
b. [[Only Churchill] remembers [that he gave the BST speech]] (‘Only Churchill is such that he remembers himself giving the BST speech’/ ‘Nobody else remembers that Churchill gave the BST speech’)
14 At first sight, the existence of partial control (see section 2.5.2.1) and split-control (see Oded 2006 and Fujii 2006) constructions seems to be problematic for the MTC. However, as will be discussed in sections 5.6.1 and 5.6.2 below, based on Rodrigues (2007) and Fujii (2006), the source of the movement in these apparently problematic cases involves not unconnected expressions, but rather complex “conjunctive” subjects.
Having the contrasts in (32) and (33) in mind, let us consider how expressions that have been assigned multiple θ-roles are to be interpreted. Take the control construction in (34) below, for instance. According to the MTC, it is associated with the (simplified) derivation in (35), where the moved DP in (35d) ends up being marked with two θ-roles after moving to the thematic-subject position of the matrix clause. The natural interpretation for the thematic relations encoded in (35d) is expressed by the logical form given in (36).
(34) John expected to kiss Mary
(35) a. Applications of merge: [to kiss Mary]
b. Merger of ‘John’ + assignment of “kisser” θ-role: [John^kisser to kiss Mary]
c. Applications of merge: [T expected [John^kisser to kiss Mary]]
d. Movement of ‘John’ + assignment of “expecter” θ-role: [John1^expecter+kisser T expected [t1 to kiss Mary]]
(36) John (λx [x expected x kiss Mary])
Following Reinhart (1983) and Salmon (1986), we can understand (36) as ascribing the property of expecting oneself to kiss Mary to John.15 Importantly, complex monadic predicates such as (36) have an inherently reflexive semantics (note the gloss above: expecting oneself to kiss Mary), thus being semantically very different from structures where two distinct expressions have a dependency relation. In effect, Reinhart and Salmon provide the semantic wherewithal for distinguishing the interpretation of multiple thematic positions within a chain (cf. [32a], [33a], and [34]) from multiple thematic positions in a dependency relation across chains (cf. [32b] and [33b]).16 In other words, the logical forms of (32a) and (33a) are as represented in (37) below and the logical forms of (32b) and (33b), as in (38). Intra-chain “binding” is restricted to de se and bound readings as it involves complex monadic predicates, as opposed to inter-chain binding.17
(37) a. [The unfortunate] (λx [x expected x to win a medal])
b. [Only Churchill] (λx [x remembers x giving the BST speech])
(38) a. [The unfortunate] (λx [x expected that he should win a medal])
b. [Only Churchill] (λx [x remembers that he gave the BST speech])
15 See the discussion in Grodzinsky and Reinhart (1993: 74), whence this locution is taken.
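The difference between the bound and the coreferential readings of the ‘only’ sentences in (33) can be checked on a toy model. The sketch below is a deliberately crude illustration, not a semantic analysis: the scenario, the names in `domain`, and the `remembers` relation are invented, and the relation flattens "remembering oneself giving the speech" and "remembering that someone gave the speech" into a single set of pairs, so it says nothing about de se attitudes as such. It only shows that the two readings can come apart in truth value.

```python
def only(individual, predicate, domain):
    """'Only a is P': a satisfies P and nobody else in the domain does."""
    return predicate(individual) and not any(
        predicate(x) for x in domain if x != individual
    )


# Hypothetical scenario (an assumption for illustration only).
domain = {"Churchill", "Attlee", "Eden"}

# (rememberer, speaker): rememberer remembers speaker giving the speech.
remembers = {("Churchill", "Churchill"), ("Attlee", "Churchill")}

# Bound reading: the same variable fills both thematic positions.
bound = only("Churchill", lambda x: (x, x) in remembers, domain)

# Coreferential reading: the second position is fixed to Churchill.
coreferential = only("Churchill", lambda x: (x, "Churchill") in remembers, domain)

print(bound, coreferential)
```

In this scenario the bound reading is true, because nobody else remembers himself giving the speech, while the coreferential reading is false, because Attlee also remembers Churchill giving it. This matches the observation that the PRO sentence allows only the first reading, whereas the pronoun sentence allows both.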
In sum, it appears that, by analyzing OC PRO as an A-trace, the MTC can also derive OC PRO’s central interpretive features, whereas this is something tricky to capture in any model that treats OC PRO as a pronoun of sorts, as we have seen in section 2.5.
16 The similarity in interpretation between reflexives and PRO invites us to consider analyses of reflexivization also in terms of A-movement. See Hornstein (2001, 2007), Boeckx, Hornstein, and Nunes (2007, 2008), and references therein for specific proposals and relevant discussion.
17 This is not the way that Reinhart (1983) interprets matters. She understands pronominal binding as another case of λ-abstraction. Thus, (i) can also have the structure in (ii) in her approach, allowing the interpretation that the property of expecting oneself to kiss Mary is attributed to Alfred.
(i) Alfred1 expected that he1 would kiss Mary
(ii) Alfred (λx [x expected that x would kiss Mary])
However, in contrast to (iii) below, (i) need not be interpreted in a de se manner. Moreover, even if we replace ‘Alfred’ in (i) with a quantified DP that binds the pronoun (so blocking the option of a coreferential reading of the coindexed pronoun), as in (iv), the de se reading is still not forced. This suggests that the binding of a pronoun does not result in a complex monadic predicate with an inherently reflexive semantics. To put this another way, there is an important semantic difference between a single expression “binding” two θ-positions and two expressions each “binding” a θ-position and themselves in a binding relation. Only the former yields a necessarily reflexive (i.e., de se) reading. For further discussion of this issue, see Hornstein and Pietroski (2009).
(iii) Alfred expected PRO to kiss Mary
(iv) [Every soldier]1 expected that he1 would kiss Mary
3.4.3 Phonetic properties and grammatical status
Recall from section 2.5 that in non-movement analyses within minimalism, PRO is a primitive lexical formative. Thus, like all of its other properties, its lack of phonetic content is taken to be an irreducible (i.e., non-explainable) lexical property.18 It pays to reexamine what a strange kind of lexical item PRO is, under this view. It has virtually no properties of its own. It has no phonetic properties and its only semantic property is that of a pure variable, in effect, a placeholder for the interpretation of its antecedent. PRO so conceived has even less theoretical appeal than Agr heads, as the latter are (at least) often phonetically visible. Chomsky (1995) has argued that Agr essentially encodes a syntactic relation and, as such, it should be understood as a grammar-internal formative and not as a lexical head. Similar considerations apply to PRO. Treating it as a lexical formative in fact presents more questions than answers. In the case of its phonetic content, the stipulation that it must be phonetically empty brings with it the obvious question of why this should be so (see section 4.5.4 below for further discussion). In contrast to this lexical approach, its predecessors analyzed the lack of phonetic content of the “controllee” as following from properties of the computational system then assumed (see sections 2.3 and 2.4). In the standard theory, the “controllee” was phonetically null as it underwent a deletion transformation (the equivalent NP deletion rule). In turn, PRO in (early) GB was analyzed as a base-generated empty category, [NP Ø], resulting from the phrase-structure rule NP → N not followed by lexical insertion for N. In addition, if PRO had phonetic content, it would be subject to the case filter, and case assignment operated under government. Since PRO could not be governed (the PRO theorem), it followed that it could not have phonetic content.
All in all, the great virtue of the standard theory and GB accounts when compared to non-movement analyses within minimalism is that they attempted to provide a rationale within which PRO’s phonetic emptiness should follow from general considerations.
18 To reiterate: for non-movement accounts there is really no alternative within a minimalist setting given bare phrase structure except to treat PRO as a lexical primitive. This requirement, in turn, prevents such accounts from explaining why control clauses have the properties they have as they must all pack the specific requirements characteristic of control clauses into the lexical specifications of PRO. The reason that PRO needs a local antecedent is that it is lexically specified to require one. The reason that it is phonetically null is that it is lexically required to be so. In effect, given minimalist assumptions, a PRO-based approach to control can at best track/describe the properties of control constructions but it cannot possibly explain them. In this sense, contemporary PRO-based accounts are far less interesting than their GB predecessors.
The MTC also subscribes to the view underlying the standard theory and GB that OC PRO is not a lexical formative, but a product of the grammar, i.e., a trace left by an A-movement operation. Interestingly, the MTC shares specific aspects with both the standard theory and GB. With (early) GB, it shares the view that OC PRO and NP-traces are indistinguishable at LF. Recall from section 2.4 that both OC PRO and NP-traces were taken to be categories of the form [NP Ø], with the only difference between them being the provenance of the index tying them to their antecedents (see Chomsky 1977: 82): for NP-traces, the index is part of the movement operation; for PRO, it arises from a construal rule, the rule of control. However, the conception of PRO and NP-traces as [NP Ø] categories is completely at odds with the bare phrase-structure system adopted in the minimalist program (see Chomsky 1994, 1995). A key feature of bare phrase structure is that it dispenses with the distinction between a lexical element and the position it occupies. Phrases are understood as projections of lexical items and are built through successive applications of merge. Consequently, there are no lexically unfilled positions. In fact, there is no structure other than the structures formed by successively merging (projections of) lexical items. Thus, in a minimalist setting, PRO and NP-traces cannot be associated with the structure [NP Ø], for it is impossible to generate a phrase without a lexical head, given bare phrase structure. Notice that simply taking PRO and NP-traces to be associated with structures such as [NP t] will not do either. One of the core architectural properties of the minimalist program is the inclusiveness condition (see Chomsky 1995), which enforces parsimony in the set of primitives proposed, by requiring that LF objects be built based only on the features of the lexical items that feed the derivation. 
The inclusiveness condition bans the creation of new objects in the course of syntactic computations, by only allowing restricted manipulation of the (features of the) lexical items that form syntactic structures. Under this view, traces cannot be theoretical primitives as they are not built from items of the lexicon, but are rather created (out of nothing) by the computation itself (i.e., movement operations). Based on conceptual reasons such as the ones discussed here and empirical reasons having to do with reconstruction effects, Chomsky (1993) incorporates the copy theory of movement into the minimalist program.19 According to the copy theory, movement amounts to copying lexical items or their projections, merging the copied material into the structure, and deleting lower copies in the phonological component. A sentence such as (39), for instance, is analyzed along the lines of (40), where superscripted indices annotate copies.
(39) John was arrested
(40) a. Applications of merge: [was [arrested John]]
b. Copying and merger of ‘John’: [John1 [was [arrested John1]]]
c. Deletion of the lower copy in the phonological component (deleted copy in angle brackets): [John1 [was [arrested <John1>]]]
19 For further conceptual and empirical arguments for assuming the copy theory of movement, see e.g., Chomsky (1995), Hornstein (1995, 2001), Nunes (1995, 2004, in press), Bošković and Nunes (2007), the collection of papers in Corver and Nunes (2007), Kandybowicz (2009), and references therein.
Thus, if OC PRO is an NP-trace, it should also be a copy of the moved element, in compliance with the inclusiveness condition. The control construction in (41) below, for instance, is to be derived as in (42). In other words, whatever is independently responsible for deletion of traces (lower copies) should also account for PRO’s lack of phonetic content: it is a deleted copy (see section 4.5 below for further discussion).
(41) John hoped to see Mary
(42) a. Applications of merge: [T hoped [John to see Mary]]
b. Copying and merger of ‘John’ + θ-assignment: [John1 [T hoped [John1 to see Mary]]]
c. Deletion of the lower copy in the phonological component (deleted copy in angle brackets): [John1 [T hoped [<John1> to see Mary]]]
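The three steps of the derivation in (42) (merge, copy plus remerge, and deletion of the lower copy) can be mimicked in a few lines of Python. This is a rough sketch under stated assumptions rather than an implementation of any particular proposal: `LexicalItem`, `select`, `merge`, `copy`, and `spell_out` are hypothetical names, syntactic objects are plain nested pairs, a copy is modelled as the very same token rather than a new selection, and the pronounced copy is simply taken to be the leftmost one, which happens to coincide with the highest copy in this example.

```python
from dataclasses import dataclass
from itertools import count

_tokens = count(1)  # fresh identity for each selection of a lexical item


@dataclass(frozen=True)
class LexicalItem:
    form: str   # phonological form ("" for a silent head such as T)
    token: int  # identity assigned when the item is selected


def select(form: str) -> LexicalItem:
    """Select an item: each call yields a distinct token."""
    return LexicalItem(form, next(_tokens))


def merge(a, b):
    """Binary merge, modelled as an ordered pair."""
    return (a, b)


def copy(item: LexicalItem) -> LexicalItem:
    """A copy is the same token again, not a new selection."""
    return item


def leaves(so):
    """Yield the lexical items of a syntactic object from left to right."""
    if isinstance(so, LexicalItem):
        yield so
    else:
        for part in so:
            yield from leaves(part)


def spell_out(so) -> str:
    """Pronounce only the first occurrence of each token (lower copies deleted)."""
    seen, words = set(), []
    for leaf in leaves(so):
        if leaf.token in seen:
            continue  # a lower copy: deleted in the phonological component
        seen.add(leaf.token)
        if leaf.form:
            words.append(leaf.form)
    return " ".join(words)


# (42a): applications of merge
john = select("John")
embedded = merge(john, merge(select("to"), merge(select("see"), select("Mary"))))
step_a = merge(select(""), merge(select("hoped"), embedded))  # "" stands in for T

# (42b): copying and merger of 'John' in the matrix clause
step_b = merge(copy(john), step_a)

# (42c): deletion of the lower copy at spell-out
print(spell_out(step_b))  # John hoped to see Mary
```

Nothing in the sketch refers to a dedicated silent element: the embedded subject is an ordinary lexical item that goes unpronounced only because a higher copy of the same token exists.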
By taking OC PRO to be a copy resulting from movement to a thematic position, this minimalist implementation of the MTC complies with both the inclusiveness condition and bare phrase structure. Copies replicate lexical items or syntactic objects built from lexical items and deletion of copies takes place in the phonological component. That is, as far as the syntactic computation goes, PRO is neither a phrase with no lexical head, nor a phrase headed by an entity that is not a lexical item. It is either a lexical item or a phrase built from lexical items, which gets deleted in the phonological component. In other words, this minimalist implementation of the MTC not only accounts for the distribution, interpretation, and lack of phonetic content of OC PRO, but in fact paves the way to the elimination of PRO as an exotic lexical primitive, thereby simplifying the general apparatus of the model. In a sense, the deletion operation in (42c) can be viewed as a descendant of the equi-NP deletion rule of the standard theory, now generalized to constructions other than control. Importantly, it differs from its predecessor in being able to correctly account for the meaning difference between (43a) and (43b) below. Recall from section 2.3 that, by relying on phonetic identity, the equi-NP deletion rule should derive (43a) from the structure corresponding to (43b), incorrectly predicting that the two sentences should have the same meaning.
(43) a. Everyone wants to win
b. Everyone wants everyone to win
By contrast, the deletion operation seen in (40c) and (42c) deals with copies and not simply with elements that happen to have the same phonological shape.20 Under this view, the sentences in (43) are respectively associated with the structures in (44).
(44) a. [Everyone1 T wants [everyone1 to win]]
b. [Everyone2 T wants [everyone1 to win]]
We have two copies of ‘everyone’ in (44a), but two distinct occurrences of ‘everyone’ in (44b). In other words, the numeration underlying (44a) has only one instance of ‘everyone,’ which is merged in the embedded clause and later copied to be merged in the matrix clause. By contrast, the numeration underlying (44b) has two instances of ‘everyone,’ each of which is merged in a different clause. When shipped to the phonological component, the structures in (44) will then receive different treatment despite their superficial similarity. That is, deletion will target the lower copy of ‘everyone’ in (44a), as shown in (45) below, but not the second independent occurrence of ‘everyone’ in (44b). (45) then surfaces as (43a) and is interpreted as a complex monadic predicate, as discussed in section 3.4.2. By contrast, (44b) surfaces as (43b) and each quantifier ranges over a different variable.
(45) [Everyone1 T wants [<everyone1> to win]] (the deleted copy is enclosed in angle brackets)
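The contrast between deletion under copy identity and deletion under mere phonetic identity can be illustrated with the two structures in (44). The following Python sketch is a toy comparison built on assumptions of its own: sentences are flattened to lists of `(form, index)` pairs, with an index only on DPs, and `delete_by_copy` and `delete_by_form` are hypothetical stand-ins for copy deletion and for a crude version of the older equi-NP deletion rule, respectively.

```python
def delete_by_copy(tokens):
    """Delete a DP only if a higher DP bears the same index (same token)."""
    seen, words = set(), []
    for form, index in tokens:
        if index is not None:
            if index in seen:
                continue  # lower copy of an element already pronounced
            seen.add(index)
        words.append(form)
    return " ".join(words)


def delete_by_form(tokens):
    """Delete a DP whenever a higher DP merely sounds the same."""
    seen, words = set(), []
    for form, index in tokens:
        if index is not None:
            if form.lower() in seen:
                continue  # phonetically identical to a higher DP
            seen.add(form.lower())
        words.append(form)
    return " ".join(words)


# (44a): one 'everyone' in the numeration, copied into the matrix clause.
s44a = [("Everyone", 1), ("wants", None), ("everyone", 1), ("to", None), ("win", None)]

# (44b): two independent occurrences of 'everyone' in the numeration.
s44b = [("Everyone", 2), ("wants", None), ("everyone", 1), ("to", None), ("win", None)]

print(delete_by_copy(s44a))  # Everyone wants to win
print(delete_by_copy(s44b))  # Everyone wants everyone to win
print(delete_by_form(s44b))  # Everyone wants to win
```

Only the index-sensitive version keeps (44b) distinct from (44a); the form-based version collapses the two, which is the incorrect prediction attributed above to deletion under phonetic identity.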
To sum up, we mentioned earlier that the MTC accords well with the general features of the minimalist program. In addition to the required abandonment of DS, we have seen in this section that the implementation of the MTC in terms of the copy theory complies with the inclusiveness condition and bare phrase structure. More importantly, by sticking to these core minimalist precepts, the version of the MTC outlined here not only accounted for the distribution, interpretation, and lack of phonetic content of OC PRO, but ended up eliminating PRO as a primitive of the grammar. This substantial result will be strengthened further as we examine its empirical consequences in the next chapters.21
20 As Chomsky (1995: 227) points out, “the syntactic objects formed by distinct applications of Select to LI [lexical item] must be distinguished; two occurrences of the pronoun ‘he,’ for example, may have entirely different properties at LF.”
3.5 Conclusion
As mentioned in section 2.2, an adequate theory of control must meet (at least) the following four requirements. First, it must enumerate the kinds of control structures, specifying if and why obligatory control (OC) and non-obligatory control (NOC) are different. Second, it must account for the distributional properties of control, explaining why the controlled element appears where it does. Third, it must account for the interpretation of the controlled element, showing how PRO’s antecedent is determined and what kind of anaphoric relation obtains between PRO and its antecedent in OC and NOC structures. And fourth, it must determine the nature of the controlled element, specifying its place among the inventory of null expressions provided by universal grammar.
In this chapter, we outlined the broad features of (our version of) the MTC, paying exclusive attention to OC (we return to NOC in Chapter 6 below). The version of the MTC explored here borrows liberally from the insights of its predecessors (see Chapter 2). Like EST and GB, it treats the controlled element in OC contexts as a non-lexical formative (more precisely, as a residue of movement). Like EST, it also adopts the minimal-distance principle, in fact deriving it in terms of relativized minimality (see section 5.5 below for more detailed discussion). However, in contrast to (virtually all) earlier theories that strictly distinguished raising from control, the MTC insists that these are both formed via the same grammatical operation, namely move. This is made possible by the renunciation of one assumption that has been held constant since the earliest days of generative grammar: we give up the assumption that movement into θ-positions is impossible. This is in many ways the key innovative feature of MTC, although it is not specific to it,22 and we will see in the next chapters that this innovation has some nice empirical consequences.
The specific answers provided by (our version of) the MTC to the questions above are the following. First, the core case of control is obligatory control (OC). OC is an instance of A-movement. The controlled element is the residue of A-movement. The only difference between the A-movement found in control configurations and that found in more familiar cases of raising and passive is that the one that occurs in control moves the DP through multiple θ-positions. In all other respects, it is the same operation in both instances and the “traces”/copies left behind are expected to be indistinguishable from one another. As we will discuss in detail in Chapter 6 below, the other species of control is non-obligatory control (NOC), which amounts to the elsewhere case, occurring when movement cannot take place. Second, given that OC PRO is a residue of A-movement, it can only appear in positions from which A-movement is possible. It is well known that subjects can A-move from non-finite clauses (not only infinitives and gerunds, but also φ-defective finite clauses as discussed in section 2.5.2.4). Thus, PRO can occur as the subject of these kinds of clauses. Third, OC amounts to the formation of a multiply θ-marked A-chain and is interpreted as standard A-chains are. Taking a page out of the GB playbook, we assume with Chomsky (1981) that A-traces and reflexives are both anaphors. Thus, the controlled element in OC contexts should be interpreted essentially like a locally bound anaphor. Moreover, as the controlled element is simply a residue of A-movement, its antecedent is simply the element that has moved from that position. In short, it is the head of the (multiply θ-marked) A-chain of which it is a link.
21 For instance, Nunes (1999, 2004) has argued that in certain well-defined circumstances, the phonological component may realize a lower copy instead of the head of the chain or even multiple copies. If this is correct, we should in principle expect to find OC constructions (under the relevant circumstances) with the “OC PRO” copy phonetically realized. We return to specific cases in sections 4.5.3 and 4.5.4 below.
The possible antecedents for OC PRO are then defined in terms of the positions to which the “controllee” can move. Indeed, given that A-movement is subject to strict locality conditions, we expect OC to be subject to these locality restrictions as well. We will see ample evidence for this in the next chapters. Finally, under the copy theory of movement, OC PRO is a copy left by a movement operation, which is later deleted in the phonological component. So OC PRO does not surface with phonetic content for the same reasons that “traces”/lower copies do not. Thus, the MTC allows a considerable simplification in the grammar by eliminating OC PRO as a theoretical primitive, as well as parts of the control module responsible for its interpretation (on NOC, see Chapter 6 below). The most appealing feature of the MTC for us is the fact that the theory not only provides a principled account for PRO’s distribution, but the account it provides for the syntactic distribution of OC PRO immediately extends to cover most (if not all) of OC PRO’s interpretive properties. This makes the MTC rather unique given that other current approaches to control treat these two facets of PRO as quite unrelated. This all seems too good to be true, but we assure the reader that the following chapters will show that MTC is indeed even more successful than we have made it seem in this chapter.
22 For instance, sideward movement to thematic positions is argued to be involved in the derivation of parasitic gap and ATB constructions (see Nunes 1995, 2001, 2004; Hornstein 2001; and Hornstein and Nunes 2002). See section 4.5.1 below for relevant discussion.
4 Empirical advantages
4.1 Introduction
In this chapter we explore some empirical consequences of (our version of) the MTC. We start by discussing the welcome results one obtains by assuming that OC PRO is a trace of A-movement, regardless of the view on traces one takes. We will see that OC PRO and standard A-traces pattern alike in being invisible to some morphological computations (section 4.2), being transparent for interclausal agreement (section 4.3), and being allowed in the subject position of finite clauses when the finite T is not an obligatory case assigner/checker (section 4.4). Next, we discuss empirical consequences of the MTC when the copy theory of movement is taken into consideration. In particular, we will discuss cases where an OC PRO behaves like an overt element in being subject to morphological restrictions (section 4.5.2), cases where OC PROs are phonetically realized (see section 4.5.3 on backward control and section 4.5.4 on copy control), and cases where OC PROs are traces of sideward movement, i.e., movement from one tree to another independent tree (see section 4.5.1 on adjunct control). Finally, section 4.6 presents our conclusion that the data covered by the MTC discussed in this chapter prove fatal for any PRO-based account of OC.
4.2 Morphological invisibility
It has long been observed that PRO differs from A’-traces in not blocking sandhi phenomena, the most well-known example of such being wanna-contraction in English, as illustrated in (1) and (2).1
(1) [Who1 do you want PRO to banish t1 from the room] → Who do you wanna banish from the room?
(2) [Who1 do you want t1 to vanish from the room] → ∗Who do you wanna vanish from the room?
Interestingly, as has been known since Lightfoot (1976), A-traces also allow contraction:
(3) a. [John1 is going t1 to kiss Mary] → John is gonna kiss Mary
b. [John1 used t1 to kiss Mary] → John usta kiss Mary
c. [John1 has t1 to kiss Mary] → John hasta kiss Mary
1 For relevant discussion, see e.g., Lightfoot (1976), Postal and Pullum (1978), Jaeggli (1980), Boeckx (2000), and references therein.
The parallel behavior of PRO and A-traces in licensing contraction is exactly what is predicted by the MTC, regardless of what the correct analysis for this contrast between A’- and A-traces is. Take, for instance, the influential proposal that the contrast has to do with case (see e.g., Jaeggli 1980), that is, case-marked elements block contraction, but caseless elements do not. If OC PRO is an A-trace, as advocated by the MTC, it must sit in a caseless position and, therefore, should not prevent contraction under this view. Moreover, as observed by Boeckx (2000), if the case-based account of the contrast between (1)/(3) and (2) turns out to be correct, it poses serious questions for any approach to control that takes OC PRO to be case marked, be it in terms of null case (see e.g., Martin 2001) or in terms of regular case (see e.g., Landau 2004). The MTC also accounts for superficially similar constructions such as (4) (from Postal and Pullum 1978), where contraction cannot take place.
(4) [I don’t want [[PRO to undress in public] to become standard practice]] → ∗I don’t wanna undress in public to become standard practice
Notice that the infinitival clause in (4) is in the subject position of the embedded clause. In other words, it is a subject island and should prevent movement from the position occupied by PRO. From the perspective of the MTC, that amounts to saying that the empty category occupying the subject position of the infinitival clause in (4) cannot be an OC PRO/A-trace (see Chapter 6 below). Given that only A-traces are invisible for purposes of contraction, the contrast between (1) and (4) now follows straightforwardly.
4.3 Interclausal agreement
OC PRO and A-traces also pattern alike in triggering agreement in their local domain, matching the features of their antecedent.2 Take case concord, for instance. As discussed by Cecchetto and Oniga (2004), in languages with rich case morphology such as Latin, an adjectival predicate agrees with the subject in case (as well as in number and gender), as illustrated in (5a) below. Under the standard assumption that copular constructions involve raising from the embedded predicate, (5a) is to be represented as in (5b).
(5) Latin (Cecchetto and Oniga 2004):
a. Ego sum bonus
I.NOM am good.NOM
‘I am good’
b. [TP egoi sum [SC ti bonus]]
What is relevant for our discussion is that similar case agreement is also found in OC PRO clauses, as shown in (6a) with subject control and (6b) with object control.
(6) Latin (Cecchetto and Oniga 2004):
a. [Ego volo [PRO esse bonus]]
I.NOM want to-be good.NOM
‘I want to be good’
b. [Ego iubeo te [PRO esse bonum]]
I.NOM order you.ACC to-be good.ACC
‘I command you to be good’
The embedded adjectival predicate exhibits nominative case in (6a) and accusative case in (6b). This follows if PRO is not case marked within the embedded clause. If it were, we would have to make the awkward assumption that the infinitival T-head in Latin assigns either nominative or accusative depending on the kind of control involved (subject or object control). As pointed out by Cecchetto and Oniga, case matching of the type illustrated in (6) is thus problematic for any approach that takes PRO to be a bearer of structural case. By contrast, the agreement pattern exhibited in (6) is exactly what is expected under the MTC. If the OC PROs in (6) are actually A-traces, as respectively represented in (7) below, they should pattern like the A-trace of (5b). In other words, the embedded predicates in (7) must agree with the antecedent of the trace in the subject position of the infinitival clause.
(7) a. [Egoi volo [ti esse bonus]]
b. [Ego iubeo tei [ti esse bonum]]
2 Although this is the general pattern, there are well-known cases (also in Latin) where an OC PRO seems to mismatch the features of its antecedent. We postpone the discussion of these potentially problematic cases to section 5.4.2 below, where we show that the problems are either apparent or are due to independent factors.
It should be clear that our point here is not to debate on how to technically capture the agreement relation between the embedded predicate and the antecedent of the trace in (5b) and (7). For concreteness, we may assume that the embedded predicate surfaces the way it does because it is in agreement relation with a
62
Empirical advantages
link of the chain headed by ‘ego’ in (5b) and (7a) and ‘te’ in (7b) or that the trace in a local agreement relation with the embedded predicate is a copy of the moved element. What should be borne in mind is that, from the perspective of the MTC, whatever the technical analysis is that enforces case matching in (5b), it must also apply to (7).
The requirement of interclausal agreement in raising and OC configurations also holds of φ-features, as discussed by Rodrigues (2004, 2007) with respect to gender. Rodrigues examines the agreement pattern triggered by nouns such as the Romance counterpart of ‘victim,’ which is invariably [+feminine], regardless of whether it refers to males or females. In the raising constructions in (8), for instance, the adjectival predicates take the feminine form even in the context where a man has been hurt.
(8) a. Italian (Rodrigues 2004):
       La vittima sembra essere ferita/*ferito
       The victim seems be injured.FEM/injured.MASC
    b. Brazilian Portuguese (Rodrigues 2004):
       A vítima parece estar ferida/*?ferido
       The victim seems be injured.FEM/injured.MASC
       ‘The victim seems to be injured’
The agreement seen in (8) is replicated in OC constructions such as (9), but not in non-OC constructions such as (10), again in a context where the victim is a male.
(9) a. Italian (Rodrigues 2004):
       La vittima ha cercato di essere trasferita/??trasferito alla stazione di polizia di College Park
       The victim had tried of be transferred.FEM/transferred.MASC to-the station of police of College Park
    b. Brazilian Portuguese (Rodrigues 2004):
       A vítima tentou ser transferida/??transferido para a delegacia de polícia de College Park
       The victim tried be transferred.FEM/transferred.MASC to the station of police of College Park
       ‘The victim tried to be transferred to the police station at College Park’
(10) a. Italian (Rodrigues 2004):
        La vittima ha detto che essere *portata/portato alla stazione di polizia non era una buona idea
        The victim has said that be brought.FEM/brought.MASC to-the station of police not was a good idea
        ‘The victim said that being brought to the police station was not a good idea’
     b. Brazilian Portuguese (Rodrigues 2004):
        A vítima disse que ser ??transferida/transferido para outra cidade não é uma boa idéia
        The victim said that be transferred.FEM/transferred.MASC to other city not is a good idea
        ‘The victim said that being transferred to another city is not a good idea’
As Rodrigues notes, the agreement contrast between OC and NOC in (9) and (10) requires non-trivial provisos under a PRO-based analysis of control. From a purely formal point of view, PRO should behave uniformly with respect to agreement. In other words, in both types of constructions PRO should either be the element triggering agreement with the embedded predicate (independently of its antecedent) or be transparent to interclausal agreement. Thus, the fact that the null subject of the infinitival clause in (9) and (10) does not behave uniformly suggests that we are not dealing with the same type of empty category in these constructions. Rodrigues further points out that the MTC, on the other hand, provides a straightforward account of the contrast between (9) and (10). In (9) we have OC (more specifically, subject control). Thus, the null subject inside the infinitival is an A-trace and, as such, it should pattern with the A-trace in the embedded subject position of the raising constructions in (8). Hence, we have a transparent domain for interclausal agreement in (9) and the agreement morphology on the embedded predicate must match the gender feature of the antecedent of the embedded subject. Again, this should be so independently of the specific analysis one assumes for interclausal agreement in standard raising constructions. The sentences in (10), on the other hand, cannot be analyzed as involving A-traces in the subject of the infinitival clause, as the infinitival is a subject island. Since (10) cannot be analyzed in terms of A-traces, interclausal agreement is blocked and the embedded predicate takes an (arguably default) masculine form.

4.4 Finite control
As we saw in detail in section 2.5.2, Landau’s (2004) Agree-based approach to control faces several problems. It resorts to various stipulations regarding the features employed to track the distribution of PRO, and the system of composite agreement relations proposed to account for the interpretation of PRO, in addition to not being independently motivated, leads to overgeneration. In this section, we will pay closer attention to a salient undergeneration problem of Landau’s system, which we believe provides decisive evidence in favor of the MTC, namely, the existence of control into indicative clauses.
Recall that the typology of control structures predicted by Landau’s system given in (11) below (from Landau 2004: 840; cf. [39] in Chapter 2) explicitly blocks OC into indicative clauses. In his own words (pp. 849–850), “the only generalization in this domain that appears to be universal is the incompatibility of indicative clauses with OC.” (11)
Obligatory control
                                   I0            C0
  EC-infinitive                    [−T, −Agr]    [−T]
  Balkan C-subjunctive             [−T, +Agr]    [−T]
  Hebrew 3rd-person subjunctive    [+T, +Agr]    [+T, +Agr]
  PC-infinitive                    [+T, −Agr]    [+T, (+Agr)]
No control
                                   I0            C0
  Balkan F-subjunctive             [+T, +Agr]    [+T, +Agr]
  indicative                       [+T, +Agr]    Ø
However, we have seen that null subjects in finite indicative clauses in Brazilian Portuguese display all the diagnostics for OC (see section 2.5.2.2). In a sentence such as (12) below, for instance, the embedded subject can only be interpreted as controlled by the most local c-commanding DP, namely, só o irmão do João ‘only João’s brother.’ Furthermore, the null subject has only a bound interpretation, only licenses sloppy readings under ellipsis, and must receive a de se reading in the appropriate contexts.3
(12) Brazilian Portuguese:
     [[O Pedro]i disse [que [só o irmão d[o Joãok]]m estava achando [que Øm/*i/*k/*w deveria ganhar uma medalha]]]
     The Pedro said that only the brother of-the João was thinking that should receive a medal
     ‘Pedro said that [only João’s brother]m was thinking that [he]m should get a medal’
Once finite control into indicative clauses is empirically attested in Brazilian Portuguese, one has to determine which special property allows it and why it is rather rare from a crosslinguistic point of view. As discussed in section 2.5.2.4, Ferreira (2000, 2004, 2009) has proposed that indicative Ts in Brazilian Portuguese are ambiguous in that they may be associated with a complete or an incomplete φ-set. Nunes (2007, 2008a) has reinterpreted this ambiguity in terms of how the person and number features of T are combined in the course of the computation. More specifically, Nunes proposes that finite Ts in Brazilian Portuguese may enter the numeration specified for number and person or for number only. When T is only specified for number, well-formedness conditions in the morphological component trigger the addition of the feature person in accordance with the redundancy rule sketched in (13) below. Crucially, the paradigm of verbal-agreement morphology in (colloquial) Brazilian Portuguese given in (14) (cf. [56] in Chapter 2) is such that the only form that distinctively encodes person and number is the syncretic inflection for first-person singular; the other two inflections involve a default value (third) for the person feature.

3 See Ferreira (2000, 2004, 2009) and Rodrigues (2002, 2004) for a discussion of these properties of null subjects in Brazilian Portuguese in the context of the MTC.

(13)
When T is only specified for number (N):
  (i) Add [P:1], if N is valued as SG;
  (ii) otherwise, add [P:default].

(14) Verbal-agreement paradigm in (colloquial) Brazilian Portuguese
     cantar ‘to sing’: indicative present

     eu ‘I’                                                     canto     P:1; N:SG
     você ‘you (SG)’, ele ‘he’, ela ‘she’, a gente ‘we’         canta     P:default; N:default (= 3SG)
     vocês ‘you (PL)’, eles ‘they (MASC)’, elas ‘they (FEM)’    cantam    P:default; N:PL (= 3PL)
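The interaction between the redundancy rule in (13) and the paradigm in (14) can be made concrete with a small sketch. The function and variable names below (`add_person`, `surface_form`, `PARADIGM`) are illustrative assumptions rather than part of the analysis, and the feature values are encoded as plain strings purely for convenience.

```python
from typing import Optional

# Paradigm (14): (person, number) -> surface form of cantar 'to sing'.
# The string encoding of feature values is an assumption of this sketch.
PARADIGM = {
    ("1", "SG"): "canto",
    ("default", "default"): "canta",
    ("default", "PL"): "cantam",
}


def add_person(number: str) -> str:
    """Redundancy rule (13): add P:1 if N is SG, otherwise add P:default."""
    return "1" if number == "SG" else "default"


def surface_form(number: str, person: Optional[str] = None) -> str:
    """Spell out T; a missing person feature is supplied by rule (13)."""
    if person is None:  # T entered the numeration specified for number only
        person = add_person(number)
    return PARADIGM[(person, number)]


# A number-only T and a fully specified T surface with the same form.
for n in ("SG", "default", "PL"):
    print(n, "->", surface_form(n))
```

The point of the sketch is only that the two routes to a surface form are indistinguishable in the output, which is what makes the paradigm in (14) ambiguous.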
Thus, the three different verbal inflections available in (14) can be obtained in two ways: (i) T is specified for both person and number throughout the derivation, as in (15); or (ii) T is only specified for number and the feature person is associated with T in the morphological component in accordance with (13), as shown in (16) (cf. [57] in Chapter 2). (15)
cantar ‘to sing’: indicative present

     Valuation of T in the syntactic component    Surface form of the verb
     P:1; N:SG                                    canto
     P:default; N:default                         canta
     P:default; N:PL                              cantam

(16) cantar ‘to sing’: indicative present

     Valuation of T in the        Addition of [person] in the     Surface form
     syntactic component          morphological component         of the verb
     N:SG                         P:1; N:SG                       canto
     N:default                    P:default; N:default            canta
     N:PL                         P:default; N:PL                 cantam
Under this view, the derivation of a sentence such as (17), for instance, proceeds along the lines of (18) (with English words for convenience).
(17) Brazilian Portuguese:
     O João disse que comprou um carro novo
     The João said that bought a car new
     ‘João said that he bought a new car’
(18) a. [TP T[N:u]/EPP [vP João[case:u] buy- a new car]]
     b. [TP João[case:u] T[N:default]/EPP [vP t buy- a new car]]
     c. [vP João[case:u] said [CP that [TP t T[N:default]/EPP [vP t buy- a new car]]]]
     d. [TP T[P:u; N:u]/EPP [vP João[case:u] said [CP that [TP t T[N:default]/EPP . . . ]]]]
     e. [TP João[case:NOM] T[P:default; N:default]/EPP [vP t said [CP that . . . ]]]
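The bookkeeping behind the steps in (18) can be sketched as a toy simulation. It assumes, with Chomsky (2000, 2001), that only a T with a complete set of person and number features values case, and that a DP stays active until its case is valued. The class and function names (`DP`, `assign_theta`, `agree_with_T`, `derive_finite_control`) are hypothetical and stand in for no particular formal implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DP:
    name: str
    case: Optional[str] = None  # None stands for an unvalued [case:u]
    theta_roles: int = 0

    @property
    def active(self) -> bool:
        # A DP remains available for A-movement until its case is valued.
        return self.case is None


def assign_theta(dp: DP) -> None:
    """The DP picks up a thematic role in a [Spec, vP] position."""
    dp.theta_roles += 1


def agree_with_T(dp: DP, phi_complete: bool) -> None:
    """T agrees with the DP; only a phi-complete T values its case."""
    if not dp.active:
        raise ValueError(f"{dp.name} is inactive: case already valued")
    if phi_complete:
        dp.case = "NOM"


def derive_finite_control() -> DP:
    joao = DP("João")
    assign_theta(joao)                      # (18a): merged in the embedded vP
    agree_with_T(joao, phi_complete=False)  # (18b): number-only embedded T
    assign_theta(joao)                      # (18c): matrix [Spec, vP]
    agree_with_T(joao, phi_complete=True)   # (18d-e): complete matrix T
    return joao


print(derive_finite_control())
```

In this sketch the DP ends the derivation with nominative case and two thematic roles, which is the signature of control under the MTC.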
The past indicative T in (18a) comes from the numeration with just a number feature, which gets valued (as default) after agreeing with ‘João,’ as shown in (18b). In the morphological component, a default person feature is added to T and the embedded verb surfaces with the “third-person singular” form ‘comprou’ (cf. [17]). Given that only a φ-complete T is able to check/value the case feature on a DP (Chomsky 2000, 2001), ‘João’ remains active after agreeing with the φ-incomplete T in (18b). It may then raise to the matrix [Spec, vP], where it receives an additional θ-role, yielding (18c). The next finite T to enter the derivation comes from the numeration with a complete φ-set (person and number), as shown in (18d). It then agrees with ‘João,’ valuing its case feature and having its own features valued, as illustrated in (18e).
The advantages of this analysis of OC into indicative clauses in Brazilian Portuguese in terms of φ-incompleteness are twofold. First, not only does it allow us to incorporate Brazilian Portuguese into the picture, but it also makes it possible to considerably simplify the “calculus of control,” as Landau calls it. Whether or not a given clausal structure allows for OC may be determined solely based on the tense and φ-feature properties of T, as shown in (19) (cf. [58] in Chapter 2), where ‘+’ stands for fully specified and ‘−’ for deficient or null. (19)
Obligatory control
  [T−, φ−]   untensed uninflected infinitives, etc.
  [T+, φ−]   tensed uninflected infinitives, Brazilian Portuguese indicatives, Hebrew 3rd-person subjunctives, etc.
  [T−, φ+]   Balkan untensed subjunctives, etc.
No control
  [T+, φ+]   English indicatives, Balkan tensed subjunctives, etc.
As mentioned in section 2.5.2.4, the simplification of (11) as in (19) also eliminates the conceptually suspicious ambiguity of the composite C[+T, +Agr] – I[+T, +Agr] in Landau’s system, which is used to describe both OC in Hebrew subjunctives and no control in Balkan F(ree)-subjunctives. Given that OC in Hebrew subjunctives is restricted to third person, as argued by Landau, a natural analysis of Hebrew constructions such as (20) below, for instance, is to take the T-head of their subjunctive clauses to be also ambiguous with respect to how person and number features are associated. In other words, like Brazilian Portuguese indicative Ts, Hebrew subjunctive Ts may enter the derivation fully specified for person and number or specified for just number and receive a default (third) person in the morphological component. (20)
Hebrew (Landau 2004):
Gili hivtiax [še- eci yitna’heg yafe]
Gil promised that will-behave.3SG.M well
‘Gil promised to behave’
In fact, the parallel between Hebrew subjunctives and Brazilian Portuguese indicatives is even clearer if we consider the idiolectal variation in Brazilian Portuguese illustrated in (21): (21) a.
Brazilian Portuguese (Nunes 2008a):
Eu falei que %(eu) comi o bolo
I spoke.1SG that I ate.1SG the cake
‘I said that I ate the cake’
b. Você/ele/a gente falou que (você/ele/a gente) comeu o bolo
   you.SG/he/we spoke.3SG that you.SG/he/we ate.3SG the cake
   ‘You(SG)/he/we said that you(SG)/he/we ate the cake’
c. Vocês/eles falaram que (vocês/eles) comeram o bolo
   You.PL/they spoke.3PL that you.PL/they ate.3PL the cake
   ‘You(PL)/they said that you(PL)/they ate the cake’
In (21a–c), the embedded and the matrix subjects are coreferential. For all speakers of BP, the realization of the embedded subjects in (21b) and (21c), which trigger default (third-person) agreement, is truly optional. By contrast, a good number of speakers prefer an overt pronoun when the embedded subject triggers first-person agreement (cf. [21a]).4 Nunes (2008a) attributes this idiolectal variation to the presence or absence of the specification of the redundancy rule in (13i) across speakers’ grammars. For speakers who do not have (13i) in their grammars, finite control into indicatives is like what we find in Hebrew subjunctives: it is only possible with subjects that trigger (default) third-person agreement.
The second advantage of the analysis of OC into indicatives in Brazilian Portuguese in terms of φ-incompleteness is that it makes it possible to understand why this subtype of OC is rare from a crosslinguistic point of view. Since the incorporation of case theory into GB, it has been standardly assumed that there is a strong correlation between finiteness and the presence of a full φ-set. The unmarked situation is for finite Ts to be φ-complete ([T+, φ+]) and for non-finite Ts to be φ-incomplete ([T−, φ−]). However, the correlation, albeit strong, is not absolute. Although patterns with opposite values for T and φ are not garden-variety species across languages, they do exist. Witness, for instance, inflected infinitivals such as (22) below in Portuguese, where the subject is licensed with nominative case within the infinitival,5 and “porous” subjunctives such as (23) in Greek, where the embedded subject can leave the subjunctive clause and undergo A-movement to the matrix-subject position.6 Thus, the fact that OC into indicative clauses is rather uncommon is related to the marked character of mismatches between T and φ with respect to full specification; in the case of indicative OC, we have the mismatching specification [T+, φ−], as seen in (19).

4 Duarte (1995, 2000) shows that the percentage of null subjects with third person is significantly higher than with first person in both spoken and written corpora of Brazilian Portuguese.

(22)
Brazilian Portuguese:
Eles ganharem o jogo foi realmente uma surpresa
They win.INF.3PL the game was really a surprise
‘Their winning the game was a real surprise’
(23) Greek (Alexiadou and Anagnostopoulou 1998):
     Ta pedhia dhen fenonte na doulevoun
     The children not seem.3PL SUBJ work.3PL
     ‘The children do not seem to work’
5 See e.g., Raposo (1987, 1989), Martins (2001), and Pires (2006) for relevant discussion.
6 See e.g., Varlokosta (1993), Terzi (1997), Alexiadou and Anagnostopoulou (1998), Roussou (2001), and Boeckx (2003, 2008) for relevant discussion. On other cases of raising out of subjunctive clauses, see e.g., Uchibori (2000) for Japanese and Uriagereka (2006) for Romance. A-movement in (23) is consistent with last resort if the subject has not been case-licensed in the embedded clause, as happens in finite control in Brazilian Portuguese (cf. [18]). Alternatively, Boeckx (2003, 2008) argues that the reason movement can take place out of defective finite domains is that a chain can be extended up to its point of maximal checking, which in the case of A-chains is defined in terms of [T+, φ+] (see Richards 2001 and Rizzi 2006 for related ideas).

Bearing this in mind, let us now consider why finite control provides convincing evidence in favor of the MTC. As is well known, A-movement from subject positions typically takes place from non-finite, uninflected clauses, as illustrated by the contrast in (24).
(24) a. John is likely [t to be home]
     b. *John is likely [t is home]
However, there are indeed cases where a finite (inflected) subjunctive clause does not block A-movement, as we have just seen in (23). Boeckx (2003, 2008) argues that the relevant property that renders a given domain porous to A-movement is its deficiency with respect to φ-features (cf. [24a]) or with respect to tense (cf. [23]). That being so, the picture in (19) can in fact be subsumed under the more general table in (25) (cf. [59] in Chapter 2). (25)
A-movement: √
  [T−, φ−]   untensed uninflected infinitives, etc.
  [T+, φ−]   tensed uninflected infinitives, Brazilian Portuguese indicatives, Hebrew 3rd-person subjunctives, etc.
  [T−, φ+]   Balkan untensed subjunctives, etc.
A-movement: *
  [T+, φ+]   English indicatives, Balkan tensed subjunctives, etc.
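The generalization encoded in (25) amounts to a two-feature check, which the following sketch spells out. The function name `allows_a_movement` and the boolean encoding (True for ‘+’, False for ‘−’) are assumptions made here for illustration. The Brazilian Portuguese entry represents only the φ-incomplete option of its ambiguous indicative T.

```python
def allows_a_movement(t_plus: bool, phi_plus: bool) -> bool:
    """Table (25): a domain blocks A-movement only if it is [T+, phi+]."""
    return not (t_plus and phi_plus)


# Illustrative clause types from (25), encoded as (T, phi) specifications.
EXAMPLES = {
    "untensed uninflected infinitive": (False, False),
    "Brazilian Portuguese indicative (phi-incomplete T)": (True, False),
    "Balkan untensed subjunctive": (False, True),
    "English indicative": (True, True),
}

for label, (t_plus, phi_plus) in EXAMPLES.items():
    verdict = "porous" if allows_a_movement(t_plus, phi_plus) else "opaque"
    print(f"{label}: {verdict}")
```

Selectional idiosyncrasies of particular matrix predicates are deliberately left out of the sketch; it captures only the feature-based part of the prediction.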
In other words, the MTC predicts that, modulo idiosyncrasies of selection by a matrix predicate, if the relevant Infl head of a given domain is negatively specified for T or φ, it should allow both control and raising constructions.7 In the sections that follow we show that this prediction is indeed borne out and explore some of its consequences.

7 It is interesting to note in this context that control is sometimes much more well behaved than raising. As Mark Baker pointed out to us (personal communication), Kinande control structures behave as usual, with person/number/class agreement on the matrix verb, and infinitival morphology on the embedded verb, as illustrated in (i) below. Raising constructions such as (ii), on the other hand, have matching person/number/case agreement on both the matrix and the embedded verbs.
(i) Kinande (Mark Baker, personal communication):
    Mo-tw-a-gan-ire eri-seny-a olukwi
    AFF.1PS.T.refuse.EXT INF.chop.FV wood
    ‘We refused to chop the wood’
(ii) Kinande (Mark Baker, personal communication):
    a. Tu-li-nga mo-tw-a-na-gend-ire
       1PS.be-if AFF.1PS.T.INDEED.go.EXT
       ‘We seem to have left’
    b. Ebitsungu bi-li-nga mo-by-a-huk-ir-w-e
       potatoes.8 8.be-if AFF.8.T.cook.PASS.EXT
       ‘The potatoes seem to have been cooked’
So Kinande raising seems to flout the generalization that φ-complete agreement freezes the relevant DP. However, as Baker notes, tense marking cannot go on ‘seem’ in (ii), but a full range of tense morphemes is possible on the embedded verb. This suggests that raising in (ii) may actually involve an inflected modal adjunct. For alternatives to reconciling raising in Bantu and the strong-agreement freezing effect, see Henderson (2006) and Boeckx (2008).

4.4.1 Finite control and hyper-raising

It is a well-known fact that languages such as Greek and Romanian resort to subjunctive clauses for both control and raising constructions, as respectively illustrated in (26) and (27) below.8 Assuming with Boeckx (2003, 2008) that T-deficient or φ-deficient domains are transparent for purposes of A-movement, the paradigm in (26) and (27) is exactly what the MTC leads us to expect. If these subjunctives are specified as T−, as argued by Landau (2004), they should not block A-movement. Whether we obtain a control or a hyper-raising construction (in the sense of Ura 1994) then depends on whether the embedded subject moves first to a θ-position, as in (26), or directly to the matrix [Spec, TP], as in (27).
(26) a. Greek (Terzi 1997):
        I Maria prospathise na divasi
        The Maria tried.3SG SUBJ read.3SG
        ‘Maria tried to read’
     b. Romanian (Dobrovie-Sorin 1994):
        Ion vrea să plece devreme mîine
        Ion want.3SG SUBJ leave.3SG early tomorrow
        ‘Ion wants to leave early tomorrow’
(27) a. Greek (Alexiadou and Anagnostopoulou 1998):
        Ta pedhia dhen fenonte na doulevoun
        The children not seem.3PL SUBJ work.3PL
        ‘The children do not seem to work’
     b. Romanian (Dobrovie-Sorin 1994):
        Copiii tăi par să fie foarte obosiţi
        Children your seem.3PL SUBJ be.3PL very tired
        ‘Your children seem to be very tired’

8 See e.g., Varlokosta (1993), Terzi (1997), Alexiadou and Anagnostopoulou (1998), and Roussou (2001) on Greek and Grosu and Horvath (1984), Dobrovie-Sorin (1994), and Alboiu (2007) on Romanian.

However, it is fair to concede that one could reasonably argue that the paradigm in (26)–(27) is only consistent with the MTC and not a knockout argument in its favor, for the pattern in (26)–(27) is associated with a morphological gap in these languages, namely, the lack of true infinitival morphology (in these contexts). So one could say that at an abstract level these are infinitival constructions in the syntactic component and we are back to square one as to what the best theory to account for infinitival OC is. In this regard, Brazilian Portuguese provides us with the relevant test case, as it does not lack infinitives (in fact, it has both the uninflected and the inflected varieties), but allows OC into indicatives, as discussed earlier. As the MTC predicts, it also allows hyper-raising out of indicative clauses, as illustrated in (28b).9
(28) a.
Brazilian Portuguese (Ferreira 2000, Nunes 2008a):
Parece/acabou que os estudantes viajaram mais cedo
Seem.3SG/finished.3SG that the students traveled.3PL more early
‘It seems/turned out that the students traveled earlier’
b. Os estudantes parecem/acabaram que viajaram mais cedo
   The students seem.3PL/finished.3PL that traveled.3PL more early
   ‘The students seem to have traveled earlier’/‘The students ended up traveling earlier’
The hyper-raising constructions in (28) are derived along the lines of (29) (again with English words, for convenience).
(29) a. [TP T[N:u]/EPP [vP [the students][case:u] traveled earlier]]
     b. [TP [the students][case:u] T[N:PL]/EPP [vP t traveled earlier]]
     c. [TP T[P:u; N:u]/EPP [vP seem/turned out [CP that [TP [the students][case:u] T[N:PL]/EPP . . . ]]]]
     d. [TP [the students][case:NOM] T[P:default; N:PL]/EPP [vP t seem/turned out [CP that . . . ]]]
If the embedded T in (29a) were fully specified with respect to φ-features, the subject would have been case licensed in the embedded clause, yielding an impersonal construction, as seen in (28a). However, this is not what happens in (29a). T is associated only with number and the subject does not have its case valued after agreeing with T, as shown in (29b). After further computations, a fully inflected T is selected, as shown in (29c), and enters into an agreement relation with the embedded subject, allowing all unvalued features to be valued, as represented in (29d).10 Notice that, although both the matrix and the embedded verb surface in the third-person plural form (cf. [28b]), they differ with respect to how this specification is carried out. The matrix T enters the numeration with person and number features, which then get trivially valued in the syntactic component through Agree. The embedded T, on the other hand, only has a number feature as it enters the derivation. After being valued in the syntactic component, the number feature is then combined with a default person specification in the morphological component in accordance with (13ii).
Independent evidence for the derivation sketched in (29) is provided by the sentences in (30), which show that idiom chunks, which are generally resistant to A’-movement, can also appear in hyper-raising constructions.11
(30) Brazilian Portuguese (Martins and Nunes 2005, in press):
     a. [a vaca]i acabou que ti foi pro brejo
        The cow finished that went to-the swamp
        Idiomatic reading: ‘It turned out that things went bad’
     b. [o pau]i parece que ti comeu feio
        The stick seems that ate ugly
        ‘It seems that there was a big discussion/fight’
Notice that the only relevant difference between the derivation involving OC into finite clauses discussed earlier (cf. [18]) and the derivation of hyper-raising constructions sketched in (29) is that, in the latter, the matrix light verb does not have another θ-role to assign and the embedded subject moves directly to the matrix [Spec, TP].12 But putting aside this independent difference, the MTC predicts that the two types of constructions should go hand in hand. In this regard, consider the Brazilian Portuguese data in (31)–(32), discussed in Nunes (2008a).
(31) Brazilian Portuguese (Nunes 2008a):
     a. [Ninguém mexeu um dedo para me ajudar]
        Nobody moved a finger to me help
        ‘Nobody lifted a finger to help me’
     b. *[Ninguém disse [que a Maria mexeu um dedo para me ajudar]]
        Nobody said that the Maria moved a finger to me help
        ‘Nobody said that Maria didn’t lift a finger to help me’
(32) Brazilian Portuguese (Nunes 2008a):
     a. [Ninguém disse [que ia mexer um dedo para me ajudar]]
        Nobody said that went move a finger to me help
        ‘Nobody said that he wasn’t going to lift a finger to help me’
     b. [Ninguém parecia [que ia mexer um dedo para me ajudar]]
        Nobody seemed that went move a finger to me help
        ‘It seemed that nobody was going to lift a finger to help me’

9 See Ferreira (2000, 2004, 2009), Duarte (2004), Martins and Nunes (2005, in press), and Nunes (2007, 2008a) for relevant discussion.
10 It is worth pointing out that, although finite control and hyper-raising constructions necessarily involve a φ-incomplete finite T in the embedded clause and φ-complete finite T in the matrix clause, nothing need be stipulated to ensure this result. The asymmetry between matrix and embedded clauses is trivially derived from UG principles (see Ferreira 2000, 2004, 2009 for discussion). Although both φ-complete and φ-incomplete finite Ts are legitimate options for any given numeration in Brazilian Portuguese, UG principles determine whether or not the choice and the structural locus of a φ-incomplete finite T give rise to a convergent derivation. If the matrix clause is associated with a φ-incomplete finite T, there is no source of case assignment for the matrix subject and the derivation simply crashes. In other words, a φ-incomplete finite T will only yield a convergent derivation if it sits within an embedded clause, being no different from other types of φ-incomplete Ts, such as the infinitival T-head of standard raising and OC constructions.
11 See Ferreira (2000), Martins and Nunes (2005, in press), and Nunes (2007, 2008a) for additional independent evidence.
12 Technical questions arise regarding phase-based computations. The matrix vP in (29) does not count as a (“strong”) phase as its head is not a “transitive” light verb (see Chomsky 2000, 2001). But what about the embedded CP? Should it not count as a phase and block movement of the embedded subject? Several different answers have been proposed to address this potential problem (see Ferreira 2000; Rodrigues 2004; Nunes 2007, 2008a; and Martins and Nunes in press). For purposes of the current discussion, it suffices to assume with Ferreira (2000) that a C-head that selects for a φ-incomplete TP does not count as a strong phase head; hence, the embedded CP in (28b)/(29) does not count as a phase, as the head of its complement bears only a number feature (cf. [29a]). We return to this issue in section 5.2 below, where we discuss data bearing on Visser’s generalization which add an interesting twist to this discussion.
The contrast in (31) illustrates the well-known fact that a negative polarity item such as the minimizer um dedo ‘a finger’ and its licenser (in this case, ninguém ‘nobody’) must be in the same clause. Curiously, if we have a null rather than an overt embedded subject in sentences analogous to (31b), the minimizer can now be licensed by the matrix subject, as shown in (32a). Even more interesting is the fact that it is not the case that any type of null subject will do. Although contrasts such as (31) also hold in European Portuguese, sentences analogous to (32) are unacceptable in this dialect. Given that Brazilian Portuguese allows finite control into indicative clauses, as argued above, the contrast between the two dialects with respect to (32) receives a straightforward account from the perspective of the MTC. The embedded null subject in sentences such as (32) in European Portuguese, which is a prototypical pro-drop language, is pro. Hence, these sentences are ruled out in European Portuguese because the minimizer and its licenser are not in the same clause (in addition, [32b] violates the θ-criterion, as there is no θ-role available for the matrix subject). By contrast, in Brazilian Portuguese the embedded null subject is a trace of the matrix subject, as illustrated in (33) below (with English words). Thus, the minimizer can be licensed by the clause-mate trace of the negative quantifier (or it can be licensed before the quantifier leaves the embedded clause).
(33) a. [TP nobodyi [vP ti said [CP that [TP ti would [vP ti lift a finger to help me]]]]]
     b. [TP nobodyi [vP seemed [CP that [TP ti would [vP ti lift a finger to help me]]]]]
Again, we see that once the embedded clause is porous due to the availability of an indicative T-head specified as φ−, the derivation will yield a control (cf. [32a]) or a hyper-raising construction (cf. [32b]) depending on whether or not the moving subject is assigned an additional θ-role on its way to the matrix [Spec, TP] (cf. [33]).13

13 A learnability question that arises is what exactly led indicative Ts to be analyzed by children as ambiguous between φ-complete and φ-incomplete in Brazilian Portuguese. After all, the ambiguous morphological paradigm cannot be the whole story, for in English, for instance, verbal morphology is considerably weak, but hyper-raising is not allowed. Furthermore, whatever the relevant property turns out to be, it should arguably be a marked property; otherwise, hyper-raising should be a very common phenomenon. Nunes (2008a) suggests that the relevant trigger for this reanalysis in Brazilian Portuguese was the existence of inflected infinitives in the language. More specifically, Nunes proposes that, while finite verbal morphology started becoming weakened, Brazilian Portuguese learners still had to acquire a marked property of Portuguese, namely, the existence of inflected infinitives. Interestingly, for all Portuguese verbs, the inflected realization of some forms is the same as the uninflected form. Take the verb cantar ‘to sing,’ for example, and compare its uninflected form (‘cantar’) with the paradigm of inflected forms in (i) below. Although the paradigm is considerably meager in (colloquial) Brazilian Portuguese, both dialects have a considerable number of forms that are ambiguous between being inflected or uninflected. Thus, successful acquisition of infinitives in both dialects requires that learners must postulate that (certain) infinitival forms are ambiguous between being φ-complete (the inflected ones) and φ-incomplete (the uninflected ones). That being the case, Nunes (2008a) suggests that the weakening of finite verbal morphology in Brazilian Portuguese led learners to generalize this pattern and uniformize the whole paradigm, taking both infinitival and indicative Ts to be systematically ambiguous. Thus, hyper-raising also became possible with inflected infinitives in Brazilian Portuguese, as illustrated in (ii) (see section 5.2.3 below for further discussion). For relevant discussion and alternative approaches, see also Ferreira (2000), Rodrigues (2004), and Martins and Nunes (2009).
(i) Inflected infinitives: cantar ‘to sing’

    European Portuguese               (Colloquial) Brazilian Portuguese
    1SG (eu)        cantar            1SG (eu)        cantar
    2SG (tu)        cantares
    2SG (você)      cantar            2SG (você)      cantar
    3SG (ele)       cantar            3SG (ele)       cantar
    1PL (nós)       cantarmos
    1PL (a gente)   cantar            1PL (a gente)   cantar
    2PL (vós)       cantardes
    2PL (vocês)     cantarem          2PL (vocês)     cantarem
    3PL (eles)      cantarem          3PL (eles)      cantarem

(ii) Brazilian Portuguese (Nunes 2008a):
    a. É difícil desses professores elogiarem alguém
       Is difficult of-these teachers praise.INF.3PL someone
    b. Esses professores são difíceis de elogiarem alguém
       These teachers are difficult of praise.INF.3PL someone
       ‘These teachers rarely praise someone’
4.4.2 Finite control, islands, and intervention effects

If finite control is also derived via A-movement, as advocated by the MTC, we should expect it to exhibit island and intervention effects. Moreover, given that hyper-raising constructions are also derived by A-movement, finite control and hyper-raising should also pattern alike in this regard. With this in mind, let us examine the sentences in (34).
(34) Brazilian Portuguese (Nunes 2008a):
     a. *[O João disse [que [o bolo [que comeu]] não estava bom]]
        The João said that the cake that ate not was good
        ‘João said that the cake that he ate was not good’
     b. *[O João parece [que [o bolo [que comeu]] não estava bom]]
        The João seems that the cake that ate not was good
        ‘It seems that the cake that João ate was not good’
Under the proposal that both finite control and hyper-raising involve A-movement, the sentences in (34) should be derived by moving ‘o João’ from the subject position of the relative clause to the matrix subject position. However, this movement crosses two islands (the relative clause itself and the embedded subject containing it), explaining why the resulting sentences in (34) are unacceptable. In other words, island effects such as the ones documented in (34) provide additional evidence for the MTC.14 Let us now consider the OC constructions in (35a) and (36a) below, whose simplified structures under the MTC are given in (35b) and (36b) respectively.15

14 See Ferreira (2000, 2004, 2009), Rodrigues (2002, 2004), and Nunes (2009a) for further discussion.
15 That constructions such as (36a) are indeed cases of OC is shown by the fact that the embedded subject: (a) cannot have an arbitrary interpretation (cf. [ia]); (b) must be interpreted as the most local c-commanding DP (cf. [ib]); (c) can only have a bound interpretation when controlled by only-DPs (cf. [ic]); (d) only licenses sloppy reading under ellipsis (cf. [id]); and (e) requires de se readings in the appropriate contexts (cf. [ie]). See Landau (2003) and Barrie (2007) for relevant discussion.

(i)   a. ∗John doesn’t know what PRO_arb to eat
      b. Peter_i said that [John_k’s mother]_w doesn’t know what PRO_w/∗i/∗k to read
      c. A: Only John wondered what to do
         B: No! I also wondered what I/#he should do
      d. John doesn’t know what to eat, and Mary doesn’t either
         (‘and Mary also doesn’t know what she/∗he should eat’)
      e. The unfortunate wondered how to get along with people after the war
         (‘[The unfortunate]_i wondered how [he himself]_i was going to get along with people after the war’)
Empirical advantages
What is relevant for our discussion is that the derivations of these sentences involve a step where the embedded subject moves to the matrix [Spec, vP], crossing a filled [Spec, CP], as illustrated in (37).

(35)  a. What did John try to do?
      b. [CP what_i did [TP John_k [vP t_k try [CP t_i C [TP t_k to do t_i]]]]]

(36)  a. John wondered what to do
      b. [TP John_k [vP t_k wondered [CP what_i [TP t_k to do t_i]]]]

(37)  a. [vP John_k try [CP what_i C [TP t_k to do t_i]]]
      b. [vP John_k wondered [CP what_i C [TP t_k to do t_i]]]
The acceptability of the sentences in (35a) and (36a) indicates that we must assume some version of Rizzi’s (1990) relativized minimality under which an element in [Spec, CP] does not count as a proper intervener for A-movement of the embedded subject. Leaving aside matters of technical implementation, this observation makes two predictions with respect to finite control and hyper-raising constructions. First, if A-movement of an embedded subject across a filled [Spec, CP] is a licit operation in the case of standard non-finite control, as seen in (35) and (36), it should also be legitimate in cases of finite control into indicatives and hyper-raising, as all of these different constructions involve the same derivational device: A-movement. The Brazilian Portuguese data in (38a) and (39a), whose simplified derivations are provided in (38b) and (39b) (with English words), show that this prediction is indeed fulfilled.

(38)  Brazilian Portuguese (Nunes 2009a):
      a. O que o João disse que comeu?
         What the João said that ate
         ‘What did João say that he ate?’
      b. [CP what_i [TP João_k [vP t_k said [CP t_i that [TP t_k ate t_i]]]]]

(39)  Brazilian Portuguese (Nunes 2009a):
      a. O que o João parece que comeu?
         What the João seems that ate
         ‘What does João seem to have eaten?’
      b. [CP what_i [TP João_k [vP seems [CP t_i that [TP t_k ate t_i]]]]]
The second prediction made by the MTC is that if A-movement of the embedded subject is blocked by elements in other left-periphery positions, both finite control and hyper-raising should be affected. As discussed in Nunes (2008a), this prediction is borne out for both types of constructions regardless of whether they involve indicative or subjunctive embedded clauses, as illustrated in (40)–(43).

(40)  Brazilian Portuguese (Nunes 2009a):
      a. Alguém disse que o bolo, o João comeu
         Someone said that the cake the João ate
         ‘Someone said that João ate the cake’
      b. Parece que o bolo, o João comeu
         Seems that the cake the João ate
         ‘It seems that João ate the cake’

(41)  Romanian:
      a. Ion vrea ca pe Maria s-o ajute numai Petre
         Ion wants that PE Maria SUBJ.her help only Petre
         ‘Ion wants Mary to be helped only by Petre’ (Dobrovie-Sorin 1994: 124)
      b. Se poate ca bombele să explodeze în orice moment
         REFL can.PRES.3SG that the-bombs SUBJ explode in any moment
         ‘It is possible that the bombs will go off any minute’ (Grosu and Horvath 1984: 352)

(42)  Brazilian Portuguese (Nunes 2009a):
      a. ∗[O João]_i disse que o bolo, t_i comeu
         The João said that the cake ate
         ‘João said that he ate the cake’
      b. ∗[O João]_i parece que o bolo, t_i comeu
         The João seems that the cake ate
         ‘João seems to have eaten the cake’

(43)  Romanian:
      a. ∗Ion începe ca pe Maria s-o ajute
         Ion starts that PE Maria SUBJ.her help
         ‘Ion is beginning to help Maria’ (Dobrovie-Sorin 1994: 124)
      b. ∗Bombele pot ca în orice moment să explodeze
         The-bombs can.PRES.3PL that in any moment SUBJ explode
         ‘The bombs can go off any minute’ (Grosu and Horvath 1987)
Example (40) shows that Brazilian Portuguese allows a left-dislocated element in the embedded clause, whereas (41) illustrates the fact that the subjunctive complement introduced by the complementizer ‘ca’ in Romanian must have a
left-dislocated constituent.16 The contrast between (40)/(41), on the one hand, and (42)/(43), on the other, thus indicates that left-dislocated elements do count as proper interveners for A-movement out of the embedded clause. This raises the independent question of why elements in [Spec, CP] and left-dislocated elements contrast in this fashion.17 Although at the moment we do not have anything conclusive to say on this issue,18 it is worth stressing that whatever

16 See e.g., Grosu and Horvath (1984) and Dobrovie-Sorin (1994) for relevant discussion.
17 Based on the interesting contrast in (i) below, Ferreira (2000, 2004, 2009) proposes that [Spec, CP] may count as a proper intervener for the embedded subject, depending on the nature of its occupant. The idea is that once the embedded subjects of (i) are moving to receive the matrix external θ-role, minimality considerations should prevent them from crossing elements that are potential θ-role bearers; hence the argumental DP que livro ‘which book’ in (ia) counts as an intervener for the embedded subject, but not the adverbial elements quando ‘when’ or por que ‘why’ in (ib), which are not potential θ-role bearers. Although able to correctly distinguish the acceptability patterns of (ia) from (ib), Ferreira’s proposal incorrectly predicts that obligatory-control sentences such as (35a), (36a), and (38a) should also be unacceptable, given that the DP in the embedded [Spec, CP] is a potential θ-role bearer.

(i)   Brazilian Portuguese (Ferreira 2000):
      a. ∗O João não sabe que livro leu na semana passada
         The João not knows which book read in-the week past
         ‘João doesn’t know which book he read last week’
      b. O João não sabe quando/por que leu esse livro
         The João not knows when/why read this book
         ‘João doesn’t know when/why he read this book’

Nunes (2009a) argues that the key to this puzzle is to be found in another contrast noted by Ferreira, namely, the contrast between a filled [Spec, CP], as in (ia), and an analogous left-dislocation structure, as in (ii), where the minimality effect is much more salient.

(ii)  Brazilian Portuguese (Ferreira 2000):
      ∗O João disse que esses livros, leu na semana passada
      The João said that these books read in-the week past
      ‘João said that he read these books last week’

Given the pervasive use of topic/left-dislocation structures in Brazilian Portuguese (see e.g., Pontes 1987; Kato 1999, 2000; Britto 1997; Galves 1998, 2001; and Negrão 1999), Nunes suggests that the argumental wh-element in (ia) lands in a left-dislocated position on the way to the embedded [Spec, CP]. In other words, the marginality of (ia) is not due to a wh-phrase in [Spec, CP], but to its trace in the left-dislocated position, as sketched in (iii) below (with English words). In turn, the contrast between the unambiguous left-dislocation structure in (ii) and (ia) can be accounted for if (ia) marginally allows the argumental DP to skip the embedded left-dislocation position.

(iii) [TP João_i doesn’t [vP t_i know [CP [which book]_k [LD t_k [TP t_i read t_k last week]]]]]
18 But see Nunes (2009a) for the suggestion that the relevant distinction is that C is a phase head and, as such, it allows multiple Specs for successive cyclic movement. Under this view, hyper-raising of subjects involves movement through the embedded [Spec, CP]. If the embedded CP already has a filled Spec, no minimality issue arises as the two Specs are equidistant (see Chomsky 1995).

the relevant minimality notion is that makes the correct distinctions, it must group finite control and hyper-raising as a natural class and this is exactly what the MTC does. Both are products of A-movement.

4.4.3 Summary

Finite-control constructions are especially illuminating in the debate on how to analyze OC. As they involve full-fledged clausal structures, as opposed to the leaner structures generally found in non-finite control, they permit a much more varied testing. In the sections above, we saw that finite control and hyper-raising constructions pattern alike with respect to reconstruction effects regarding the licensing of minimizers (section 4.4.1), island effects, and minimality (section 4.4.2). The overall conclusion that their parallel behavior leads to is that they are derived through the same derivational resources, namely, A-movement. In our opinion, finite control provides one of the strongest empirical arguments for the MTC. It is hard to see how alternative approaches on the market can account for the parallel behavior between finite control and hyper-raising without adding stipulations isolating finite control from standard non-finite control or replicating the A-movement operation involved under the guise of different technologies.

4.5 The movement theory of control under the copy theory of movement

As we discussed in Chapter 3, in models having D-structure (DS) as one of their components, there is no room for movement to thematic positions. Recall that under DS-based models, lexical insertion and θ-role assignment precede all movement operations. Thus, if a given θ-role fails to be assigned at DS, convergence cannot be enforced by having the relevant θ-relation established by a later movement operation because the derivation is already ruled out at DS. As we also pointed out, if the notion of DS is abandoned, as is the case within minimalism, this picture completely changes. More specifically, once it is assumed that lexical insertion, θ-role assignment, and movement actually intersperse, movement to thematic positions arises as a logical possibility, indeed as a natural expectation. One can of course retain the old properties of DS by assuming that relations can be established by merge but not by move (see Chomsky 1995). But it is worth emphasizing that, once DS is gone, the game is different and the
evaluation of the assumptions of the new model also changes. If movement to θ-positions is a theoretical possibility allowed by the model, the burden of proof is on proposals that exclude it. In other words, under the general architecture of the minimalist program, it is the ban on movement to θ-positions that requires proper justification, as it is an independent proviso in a model lacking DS. In Chapter 3 and in the sections above, we have argued that, in fact, the null hypothesis in a model that eschews DS receives substantial support in the domain of OC, as it accounts for the distribution and interpretation of OC in terms of A-movement without enriching the theoretical apparatus.19 Again, it is the ban on movement to θ-positions that represents an enrichment of the theory if we keep the same assumptions constant and such an enrichment brings with it a full bag of additional provisos such as PRO as a special type of empty category and the control module, to name just the most prominent ones (see Chapter 2 for other additional assumptions required in different models). Thus far, we have defended the claim that OC PRO is an empty category left by A-movement, without pausing to discuss the ontology of this empty category. In section 3.4.3 we in fact mentioned that if OC PRO is a trace, as advocated by the MTC, and if traces are copies under the copy theory of movement, OC PRO should also be a copy under the copy theory. However, everything that was discussed so far could be implemented in terms of traces or copies. We will shift gears now and examine the implications of the copy theory for the MTC in more detail. Recall that the original motivation for the incorporation of the copy theory within the minimalist program was twofold. First, it permitted the replacement of DS- and SS-based analyses by LF-based analyses in the case of the binding theory (Chomsky 1993).
Second, it made the output of movement operations compatible with the inclusiveness condition (Chomsky 1995), which bans creationism in syntax in the sense that all syntactic structures must be built based on lexical material present in the numeration. Thus, although traces violate the inclusiveness condition as there are no such entities in the lexicon, copies are in compliance with this condition as they are replicas of lexical material present in the numeration or structures built from this lexical material. As such, the copy theory is plausibly one of the solid architectural pillars of the minimalist program.

19 For further evidence for movement to θ-positions, see e.g., Nunes (1995, 2001, 2004), Nunes and Uriagereka (2000), Hornstein (2001), Hornstein and Nunes (2002), in the domain of parasitic gaps and across-the-board movement, and Lidz and Idsardi (1997), Hornstein (2001, 2007), Kayne (2002), Zwart (2002), Grohmann (2003), and Boeckx, Hornstein, and Nunes (2007, 2008) in the domain of anaphora.
It is worth stressing that this reanalysis of movement operations is not a notational variant of the trace theory just involving the substitution of one type of category without phonetic realization for another. Rather, it has wholesale conceptual and empirical implications within minimalism. Consider the derivation of (44), for instance, given in (45).

(44)  John was promoted

(45)  a. Num = {John_1, was_1, promoted_1}
      b. Selection of ‘promoted’:
         Num = {John_1, was_1, promoted_0}
         V = promoted
      c. Selection of ‘John’:
         Num = {John_0, was_1, promoted_0}
         V = promoted
         N = John
      d. Merger of ‘promoted’ and ‘John’:
         Num = {John_0, was_1, promoted_0}
         VP = [promoted John]
      e. θ-marking of ‘John’:
         Num = {John_0, was_1, promoted_0}
         VP = [promoted John^θ]
      f. Selection of ‘was’:
         Num = {John_0, was_0, promoted_0}
         VP = [promoted John^θ]
         T = was
      g. Merger of ‘was’ and VP:
         TP = [was promoted John^θ]
      h. Copying of ‘John’:
         TP = [was promoted John^θ]
         N = John
      i. Merger of ‘John’ and TP:
         TP = [John was promoted John^θ]
      j. Deletion in the phonological component (deleted copy in angle brackets):
         TP = [John was promoted <John^θ>]
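The steps in (45) can be replayed mechanically. The following is only a minimal sketch, not the authors' formalization: the names `Item`, `select`, `merge`, `copy`, and `delete_lower_copies` are hypothetical, trees are plain nested lists, and "highest copy" is approximated as "leftmost copy", which happens to hold for this structure.

```python
from dataclasses import dataclass

@dataclass
class Item:
    """One occurrence (copy) of a lexical item."""
    form: str
    theta: bool = False    # has this occurrence been theta-marked?
    silent: bool = False   # set by deletion in the phonological component

def select(num, form):
    """Take one token of `form` out of the numeration (its index drops by 1)."""
    assert num[form] > 0, f"{form!r} is exhausted"
    num[form] -= 1
    return Item(form)

def merge(a, b):
    """Combine two root syntactic objects into a new root [a b]."""
    return [a, b]

def copy(item):
    """Replicate a constituent, keeping whatever theta-marking it already has."""
    return Item(item.form, item.theta)

def leaves(tree):
    if isinstance(tree, Item):
        yield tree
    else:
        for child in tree:
            yield from leaves(child)

def delete_lower_copies(tree, form):
    """Silence every copy of `form` except the leftmost one (assumed highest)."""
    copies = [x for x in leaves(tree) if x.form == form]
    for lower in copies[1:]:
        lower.silent = True

def pronounce(tree):
    return " ".join(x.form for x in leaves(tree) if not x.silent)

num = {"John": 1, "was": 1, "promoted": 1}   # (45a)
v = select(num, "promoted")                  # (45b)
n = select(num, "John")                      # (45c)
vp = merge(v, n)                             # (45d)
n.theta = True                               # (45e)
t = select(num, "was")                       # (45f)
tp = merge(t, vp)                            # (45g)
n2 = copy(n)                                 # (45h)
tp = merge(n2, tp)                           # (45i)
delete_lower_copies(tp, "John")              # (45j)
print(pronounce(tp))                         # John was promoted
```

Note that no function called "move" appears anywhere in the sketch: displacement falls out of one call to `copy` followed by one call to `merge`.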
The relevant steps for our discussion are the ones represented in (45h– i). As we can see, there is actually no operation of movement employed in the derivation in (45). Displacement in natural languages is reanalyzed under the copy theory as the output of two basic operations: copy, which replicates the targeted constituent (cf. [45h]), and the independently required merge operation, which assembles larger syntactic objects by combining lexical items (cf. [45c–d]) or complex syntactic objects assembled in previous derivational
steps (cf. [45h–i]).20 Thus, if movement is not a primitive operation of the computational system, keeping the DS-based ban on movement to θ-positions by assuming that merge but not move licenses θ-assignment requires further elaboration. However, establishing how to state this prohibition in a natural way is no trivial matter. Notice that if syntactic structures are not built in a single step, as in DS-based models, but through various applications of merge, as assumed in the minimalist program, it is an unavoidable property of the system that merge can license θ-assignment (cf. [45d–e]). That being so, the natural expectation is that in the derivation of an OC construction such as (46) below, for instance, the matrix light verb should be able to assign its external θ-role to the copy of ‘John’ with which it has merged, as sketched in (47c–d). Any departure from this expectation should be welcomed with skepticism and accompanied by solid evidence.

(46)  John hopes to win

(47)  a. Applications of select, merge, θ-assignment, and copy:
         vP2 = [hopes [John^θ1 to [vP1 John^θ1 win]]]
      b. Copying of ‘John’:
         vP2 = [hopes [John^θ1 to [vP1 John^θ1 win]]]
         N = John^θ1
      c. Merger of ‘John’ and vP2:
         vP2 = [John^θ1 hopes [John^θ1 to [vP1 John^θ1 win]]]
      d. θ-marking of ‘John’ by the matrix v:
         vP2 = [John^θ1,θ2 hopes [John^θ1 to [vP1 John^θ1 win]]]
      e. Further applications of select, merge, and copy:
         TP = [John^θ1,θ2 T [John^θ1,θ2 hopes [John^θ1 to [vP1 John^θ1 win]]]]
      f. Deletion in the phonological component (deleted copies in angle brackets):
         TP = [John^θ1,θ2 T [<John^θ1,θ2> hopes [<John^θ1> to [vP1 <John^θ1> win]]]]
Notice also that the point remains valid regardless of whether one assumes a featural view on θ-roles, as illustrated in (47), or a configurational view, as sketched in (48) below, where ‘John’ can be interpreted as establishing the thematic relations θ1 and θ2 in virtue of having one of its copies merged into the embedded [Spec, vP] and another copy merged into the matrix [Spec, vP]. Crucially, the structure that reaches LF (cf. [48d]) preserves the relevant configurations associated with each thematic relation. In other words, both (47) and (48) can be appropriately interpreted as a complex monadic predicate such as (49) in the semantic component (see section 3.4.2).

20 Merge itself may be a composite of the more basic operations concatenate and label, as argued by Hornstein and Nunes (2008) and Hornstein (2009). We will put this possibility aside for the sake of brevity.

(48)  a. Applications of select, merge, and copy:
         vP2 = [hopes [John to [vP1 John win]]]
         (θ1: the copy of ‘John’ in the embedded [Spec, vP])
      b. Copying of ‘John’:
         vP2 = [hopes [John to [vP1 John win]]]
         (θ1: the copy of ‘John’ in the embedded [Spec, vP])
         N = John
      c. Merger of ‘John’ and vP2:
         vP2 = [John hopes [John to [vP1 John win]]]
         (θ2: the copy of ‘John’ in the matrix [Spec, vP]; θ1: the copy in the embedded [Spec, vP])
      d. Further applications of select, merge, and copy:
         TP = [John T [John hopes [John to [vP1 John win]]]]
      e. Deletion in the phonological component (deleted copies in angle brackets):
         TP = [John T [<John> hopes [<John> to [vP1 <John> win]]]]
         (θ2: the copy of ‘John’ in the matrix [Spec, vP]; θ1: the copy in the embedded [Spec, vP])

(49)  John (λx [x hopes x to win])
In sum, the copy theory unifies the kinds of thematic discharge found in control and non-control structures. Once move reduces to copy plus merge, all instances of θ-assignment, even those arising from “movement,” result from merging into a thematic position. In addition, the copy theory highlights the conceptual awkwardness of keeping the DS-based ban on movement to thematic positions within a system where traces are independently conceived of as targets of a copying operation. If the copy of a given syntactic object can participate in syntactic relations that are licensed by merge such as case-, φ-feature, or EPP-checking, for instance (cf. [45h–i]), it should also be able to participate in θ-relations as these are also licensed by merge (cf. [45d–e]).21 In the next sections we focus on specific empirical consequences of the copy theory for the MTC.

4.5.1 Adjunct control and sideward movement

Let us now examine another theoretical possibility that is made available once DS is eliminated and syntactic objects are built step by step through the operation merge. Take the derivation of (50), for instance, as given in (51).

(50)  The man saw Jane

21 Those partial to the merge/remerge variety of the copy theory, which treats movement as just one more case of merge (see Chomsky 2001, 2004), should find the idea that θ-roles could be discharged under remerge just as congenial. In other words, if being merged into a θ-position suffices for a given DP to get a θ-role, being remerged should as well.
(51)  a. Num = {the_1, man_1, T_1, saw_1, Jane_1}
      b. Selection of ‘saw’:
         Num = {the_1, man_1, T_1, saw_0, Jane_1}
         V = saw
      c. Selection of ‘Jane’:
         Num = {the_1, man_1, T_1, saw_0, Jane_0}
         V = saw
         N = Jane
      d. Merger of ‘saw’ and ‘Jane’:
         Num = {the_1, man_1, T_1, saw_0, Jane_0}
         VP = [saw Jane]
      e. Selection of ‘the’:
         Num = {the_0, man_1, T_1, saw_0, Jane_0}
         VP = [saw Jane]
         D = the
      f. Selection of ‘man’:
         Num = {the_0, man_0, T_1, saw_0, Jane_0}
         VP = [saw Jane]
         D = the
         N = man
      g. Merger of ‘the’ and ‘man’:
         Num = {the_0, man_0, T_1, saw_0, Jane_0}
         VP = [saw Jane]
         DP = [the man]
      h. Merger of DP and VP:
         Num = {the_0, man_0, T_1, saw_0, Jane_0}
         VP = [[the man] [saw Jane]]
      i. Selection of T:
         Num = {the_0, man_0, T_0, saw_0, Jane_0}
         VP = [[the man] [saw Jane]]
         T
      j. Merger of T and VP:
         TP = [T [[the man] [saw Jane]]]
      k. Copying of [the man]:
         TP = [T [[the man] [saw Jane]]]
         DP = [the man]
      l. Merger of DP and TP:
         [[the man] T [[the man] [saw Jane]]]
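The bookkeeping in (51a–h) can be made concrete with a small model of the derivational working space. This is a sketch under simplifying assumptions: the `Workspace` class and its methods are hypothetical names, lexical items are bare strings, the numeration indices are left out, and the extension condition is approximated as "merge may only combine objects that are currently roots".

```python
class Workspace:
    """Derivational working space: the list of current root syntactic objects."""

    def __init__(self):
        self.roots = []
        self.peak = 0  # largest number of simultaneous roots observed

    def select(self, item):
        """Bring a lexical item into the workspace as a new root."""
        self.roots.append(item)
        self.peak = max(self.peak, len(self.roots))
        return item

    def merge(self, a, b):
        """Merge two roots into one; embedded objects are not legitimate targets."""
        if a not in self.roots or b not in self.roots:
            raise ValueError("extension condition: merge may only target roots")
        self.roots.remove(a)
        self.roots.remove(b)
        new = (a, b)
        self.roots.append(new)
        return new

# The convergent order of (51b-h): build [the man] before merging it with VP.
ws = Workspace()
vp = ws.merge(ws.select("saw"), ws.select("Jane"))   # (51b-d)
the = ws.select("the")                               # (51e)
man = ws.select("man")                               # (51f): three roots coexist
dp = ws.merge(the, man)                              # (51g)
clause = ws.merge(dp, vp)                            # (51h)

# The alternative order: 'man' merges with VP first, so 'the' can no longer
# combine with 'man' alone and only the unwanted structure is derivable.
ws2 = Workspace()
vp2 = ws2.merge(ws2.select("saw"), ws2.select("Jane"))
man2 = ws2.select("man")
bad = ws2.merge(man2, vp2)
the2 = ws2.select("the")
try:
    ws2.merge(the2, man2)
    blocked = False
except ValueError:
    blocked = True
unwanted = ws2.merge(the2, bad)

print(ws.peak, clause, blocked, unwanted)
```

In the first derivation `ws.peak` records that three roots are present at once at the step corresponding to (51f), which is the point the text draws from complex specifiers.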
The exaggeratedly detailed presentation of the steps in (51) is meant to make it transparent that it is quite common for a given derivational step to involve more than one root syntactic object (more than one “tree”) at a time. This is trivially true in the first steps of any derivation. In order for merge to start operating, two items must be selected from the numeration (cf. [51c]) and under bare phrase structure each of these items constitutes a (root) minimal-maximal projection (Chomsky 1995). But there are in addition two other general cases
where the computational system must handle more than one root syntactic object at a time. The first one is illustrated in (51e–h). Under the assumption that merge can only target root syntactic objects (Chomsky’s [1995] extension condition), the syntactic object in (51h) can only be assembled if ‘the’ and ‘man’ in (51f) merge first and the resulting object [the man] then merges with [saw Jane] (cf. [51g–h]). If ‘man’ first merges with [saw Jane], the next merger yields the unwanted structure ∗[the [man [saw Jane]]]. Thus, whenever a given structure has a complex specifier, its derivation must involve a step analogous to (51f), with (at least) three root syntactic objects in the derivational working space.22 Finally, the last general case of more than one root syntactic object in a given derivational step arises when the computational system copies a substructure of a root syntactic object. As exemplified in (51k), before the copy gets merged we again have more than one root tree in the derivational working space. The interesting point for our discussion is that the derivational steps exemplified in (51) make room for an additional possibility, abstractly sketched in (52).

(52)  a. Applications of select, merge, and copy:
         K = [... α ...]
         L = [...]
      b. Copying of α:
         K = [... α ...]
         L = [...]
         M = α
      c. Merger of α and L:
         K = [... α ...]
         N = [α [L ...]]

After the computational system builds the root syntactic objects K and L in (52a), a copy of α from within K is made (cf. [52b]) and merged with L, yielding the syntactic object N in (52c). Nunes (1995) calls the steps sketched in (52) ‘sideward movement.’23 Terminological metaphors aside, note that there is no intrinsic difference between the “upward” movement seen in (51j–l), for instance, and the “sideward” movement sketched in (52) in what regards the computational steps involved. In both cases, we have trivial applications of movement, viewed as copy plus merge.24

22 The same holds for complex adjuncts. See Uriagereka (1999) and Nunes and Uriagereka (2000) for relevant discussion.
23 Steps like the ones in (52) are referred to as an “interarboreal operation” by Bobaljik (1995) and Bobaljik and Brown (1997), and as “paracyclic movement” by Uriagereka (1998, chapter 4).
24 If movement is parasitic on agree and agree requires c-command, as in Chomsky’s (2000, 2001, 2004) system, then this is not quite right. We are abstracting from this difference here.
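The claim that "upward" and "sideward" movement involve the same computational steps can be illustrated with a single function that serves both. This is a minimal sketch with hypothetical names (`contains`, `copy_and_merge`); trees are immutable tuples, so making a copy amounts to reusing the value, and the placeholder leaves "x", "y", "z", "w" stand in for the unspecified material in (52).

```python
def contains(tree, x):
    """True if x is the tree itself or occurs somewhere inside it."""
    if tree == x:
        return True
    return isinstance(tree, tuple) and any(contains(child, x) for child in tree)

def copy_and_merge(roots, alpha, source, host):
    """Copy alpha from inside the root `source` and merge the copy with the root `host`.

    host == source      -> what is usually described as upward movement
    host is another root -> sideward movement, as in (52)
    """
    if source not in roots or host not in roots:
        raise ValueError("source and host must both be root objects")
    if not contains(source, alpha):
        raise ValueError("alpha must occur inside source")
    merged = (alpha, host)
    return [merged if root == host else root for root in roots]

# Sideward, as in (52): alpha is copied out of K and merged with L.
K = ("x", ("α", "y"))
L = ("z", "w")
sideward = copy_and_merge([K, L], "α", K, L)

# Upward, as in (51j-l): [the man] is copied out of TP and merged with TP itself.
dp = ("the", "man")
tp = ("T", (dp, ("saw", "Jane")))
upward = copy_and_merge([tp], dp, tp, tp)

print(sideward)
print(upward)
```

The only thing that differs between the two calls is which root is passed as `host`; nothing in the function itself distinguishes the two cases.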
This point is worth emphasizing, as it has been consistently misunderstood. Within DS-based models such as GB, sideward movement is not a theoretical possibility, for all syntactic computations operate with a single root tree – the one made available by DS. However, the picture is completely different within minimalism. Once DS is abandoned and structure building is carried out by merge, multiple root syntactic objects in a single derivational step not only are allowed in principle, but must indeed be employed in derivations involving complex specifiers, for instance, in order for the extension condition to be satisfied (cf. [51d–h]). Under the standard architectural features of minimalism, sideward movement is therefore not a novel operation or a new species of movement. It is just a description of a specific interaction between copy and merge. The fact that α in (52b–c), for instance, does not merge with the structure that contains the “source” of the copy, as opposed to [the man] in (51j–l), may follow from two independent (therefore irrelevant) reasons: (i) that α in (52b) has two root syntactic objects around to merge with, whereas [the man] in (51j) only has one; and (ii) that merger of α with L in (52b) may be licensed by last resort, but merger with K may not. The derivation of V-to-T movement under the sideward-movement analysis sketched in (53) illustrates this point.25

(53)  a. Applications of select, merge, and copy:
         VP = [... V ...]
         T
      b. Copying of V:
         VP = [... V ...]
         T
         V
      c. Merger of T and V (by adjunction):
         VP = [... V ...]
         K = [T⁰ V [T⁰ T]]
      d. Merger of K and VP:
         [TP [T⁰ V [T⁰ T]] [VP ... V ...]]
If T and VP had merged in (53a), yielding [TP T VP], the extension condition should then prevent the verb from adjoining to T, as T would no longer be a root syntactic object. However, V-to-T adjunction can comply with the extension condition if it proceeds as in (53b–d). Crucially, once the derivational step in (53b) is reached, the copied V must merge with T rather than VP as it arguably has features to check with the former, but not the latter. To put it in general terms, given the standard minimalist assumptions reviewed above, sideward movement naturally arises within the system without enriching the grammatical apparatus. Thus, unless explicitly prohibited, sideward movement comes for free as one more instantiation of copy plus merge. Consequently, it is excluding sideward movement as a grammatical possibility that requires additional theoretical devices. The theories that exclude sideward movement by invidiously distinguishing the merger in (51k–l) from the one in (52b–c) are the methodologically profligate ones, not those that allow it. This is not the place to defend the virtues of sideward movement.26 Rather we wish to illustrate how it allows a smooth analysis of adjunct-control constructions such as (54) below. As illustrated in (55), adjunct control has virtually the same properties as OC into complements.

(54)  John_i saw Mary after/before/while PRO_i eating a bagel

(55)  a. Adjunct-control PRO requires a local c-commanding antecedent:
         John_i said [that [Mary_k’s brother]_m left [after PRO_m/∗i/∗k/∗w eating a bagel]]27
      b. Adjunct-control PRO only licenses sloppy readings under ellipsis:
         John left before PRO singing and Bill did too
         ‘and Bill_i left before he_i/∗John sang’
      c. Adjunct-control PRO can only have a bound interpretation when controlled by only-DPs:
         Only Churchill left after PRO giving the speech
         ‘[Nobody else]_i left after he_i/∗Churchill gave the speech’
      d. In the appropriate type of adjuncts (e.g., purposives), adjunct-control PRO obligatorily requires a de se interpretation:28
         The unfortunate wrote a petition (in order) PRO to get a medal
         ‘[The unfortunate]_i wrote a petition so that [he himself]_i would get a medal’

25 For relevant discussion, see Bobaljik (1995), Nunes (1995, 2001, 2004), Bobaljik and Brown (1997), Uriagereka (1998), and Hornstein and Nunes (2002, 2008).
In the preceding sections and chapters, we have argued that the properties illustrated in (55a–d) are signature properties of A-movements/chains. Let us then examine how sideward movement allows OC into complements and adjunct control to be both subsumed under a movement analysis. Take the (simplified) derivation of (56), given in (57), for instance.29

(56)  John_1 saw Mary after PRO_1 eating lunch

(57)  a. Applications of select, merge, and copy:
         Num = {John_0, T+_1, saw_0, Mary_0, after_0, T−_0, eating_0, lunch_0}
         PP = [after John T− eating lunch]
         VP = [saw Mary]
      b. Copying of ‘John’:
         PP = [after John T− eating lunch]
         VP = [saw Mary]
         N = John
      c. Merger of ‘John’ and VP:
         PP = [after John T− eating lunch]
         VP = [John saw Mary]
      d. Merger of PP and VP:
         [VP [VP John saw Mary] [PP after John T− eating lunch]]
      e. Selection of T+:
         Num = {John_0, T+_0, saw_0, Mary_0, after_0, T−_0, eating_0, lunch_0}
         [VP [VP John saw Mary] [PP after John T− eating lunch]]
         T+
      f. Merger of T+ and VP:
         TP = [T+ [VP [VP John saw Mary] [PP after John T− eating lunch]]]
      g. Copying of ‘John’:
         TP = [T+ [VP [VP John saw Mary] [PP after John T− eating lunch]]]
         N = John
      h. Merger of ‘John’ and TP:
         TP = [John [T+ [VP [VP John saw Mary] [PP after John T− eating lunch]]]]
      i. Deletion in the phonological component (deleted copies in angle brackets):
         TP = [John [T+ [VP [VP <John> saw Mary] [PP after <John> T− eating lunch]]]]

26 See Nunes (1995, 2001, 2004), Hornstein (2001), and Drummond (2009) for an extensive discussion of the conceptual and empirical virtues of sideward movement, apparent problems, and solutions to prevent many cases of overgeneration.
27 For reasons that will become clear in section 4.5.1.2 below, PRO in (55a) can be interpreted as ‘John’ if the adjunct clause is interpreted as modifying the matrix verb.
28 These cases were pointed out to us by John Nissenbaum (personal communication).
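The derivation in (57) can be traced step by step in the same style. Again this is a hedged sketch rather than the authors' implementation: trees are nested tuples of strings, `leaves` and `pronounce` are hypothetical helpers, the T heads are treated as unpronounced, θ-assignment is recorded in a plain list, and "highest copy" is approximated as "first copy in left-to-right order".

```python
SILENT_HEADS = {"T+", "T-"}

def leaves(tree):
    """Yield the terminal strings of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        yield tree
    else:
        for child in tree:
            yield from leaves(child)

def pronounce(tree, moved):
    """Linearize the tree, keeping only the first (assumed highest) copy of `moved`."""
    words, seen = [], False
    for word in leaves(tree):
        if word in SILENT_HEADS:
            continue
        if word == moved:
            if seen:
                continue
            seen = True
        words.append(word)
    return " ".join(words)

theta = []                                              # theta-roles borne by 'John'

pp = ("after", ("John", ("T-", ("eating", "lunch"))))   # (57a)
theta.append("external role of 'eating'")
vp = ("saw", "Mary")                                    # (57a)

john = pp[1][0]                                         # (57b) copy 'John' out of PP
vp = (john, vp)                                         # (57c) sideward merger with VP
theta.append("external role of 'saw'")
vp = (vp, pp)                                           # (57d) PP adjoins to VP
tp = ("T+", vp)                                         # (57e-f)
tp = (john, tp)                                         # (57g-h) upward copy and merge

copies = sum(1 for word in leaves(tp) if word == "John")
print(pronounce(tp, "John"))                            # (57i)
print(copies, theta)
```

Under these assumptions the output string corresponds to (56), with three copies of ‘John’ in the structure, one pronounced, and two θ-roles associated with them.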
After VP and PP have been assembled in (57a), ‘saw’ still has its external -role to assign, but there is no remaining element in the numeration to receive it. Notice, however, that ‘John’ has not checked case within the gerund clause (its T-head does not have a complete -set) and is therefore still active for purposes of A-movement. The computation then creates a copy of ‘John’ (cf. [57b]) and merges it with the VP (an instance of sideward movement), allowing the external -role to be discharged (cf. [57c]). The PP then adjoins to VP (cf. [57d]), the matrix TP is built (cf. [57f]), and the matrix subject moves to (i.e., gets copied and merged into) [Spec, TP] (cf. [57g–h]). Note that both the sideward movement in (57b–c) and the upward movement in 29 See Hornstein (1999, 2001, 2003) for discussion.
4.5 MTC under the copy theory of movement
89
(57g–h) extend the targeted tree. Notice also that sideward movement provides an escape hatch for ‘John’ to have its case checked. Were ‘John’ not to move, the derivation would crash as it could not get its case checked within the adjunct clause. The derivation sketched in (57) shows that A-movement of ‘John’ from within the adjunct to a thematic position (the external argument of ‘saw’) yields a control structure. It is worth pointing out that no special proviso was added to the system in order to achieve this result. The MTC when interpreted under the copy theory predicts that any instance of obligatory control should involve applications of copy and merge and obligatory adjunct control should be no different. Thus, if PRO in (56) is a residue of (sideward) movement to a thematic position, we expect it to function like an (A-)trace and we expect the relation between ‘John’ and PRO in (56) to manifest all the properties of A-chain dependencies. As the data in (55) indicate, our expectations are not disappointed. The adjunct-control analysis in terms of sideward movement outlined in (57) also accounts for the fact that adjunct control does not differ from subject or object control in terms of interclausal agreement. Recall from section 4.3 that epicene nouns such as the counterpart of ‘victim’ in Romance trigger [+feminine] agreement even in contexts when the victim is a male. As observed by Rodrigues (2004), this pattern of agreement also obtains in adjunct-control configurations such as (58), which indicates that the trace left by sideward movement of ‘the victim’ triggers [+feminine] agreement on the embedded participial morphology, just like the trace left by upward movement in instances of raising and subject control (cf. [8] and [9] above). (58) a.
Italian (Rodrigues 2004):
La vittima mori’ dopo essere stata trasportata /??stato trasportato all’ ospedale
The victim died after be been.FEM brought.FEM been.MASC brought.MASC to-the hospital
b. Brazilian Portuguese (Rodrigues 2004):
A vítima morreu depois de ser trazida /??trazido para o hospital
The victim died after of be brought.FEM brought.MASC to the hospital
‘The victim died after being brought to the hospital’
In the next sections we discuss three additional welcome consequences of the sideward-movement analysis of adjunct control sketched in (57).
4.5.1.1 Subject–object asymmetry in adjunct control
A very distinctive property of adjunct control is that PRO must be controlled by the subject and not the object of the next higher clause, as illustrated in (59).30
(59) John_i saw Mary_k after PRO_i/∗k eating lunch
Rosenbaum (1970) attempts to account for this subject–object asymmetry by extending the minimal-distance principle to cases of adjunct control. Roughly speaking, the minimal-distance principle blocks object control into adjuncts on the assumption that ‘John’ but not ‘Mary’ c-commands the adjunct. Unfortunately, this assumption is not obviously correct. For example, it is possible for objects to bind pronouns found within adjuncts, as Orson Welles taught us: (60)
John will drink [no wine]_i before it_i is ready for drinking
If we assume that in order to be interpreted as a bound variable a pronoun must be c-commanded by its antecedent, this implies that ‘it’ in (60) must be c-commanded by ‘no wine’ at least at LF. But then objects should be able to control into adjuncts, contrary to fact (cf. [59]). The sideward-movement account outlined in (57) allows us to explain this subject/object asymmetry if we assume with Chomsky (1995) that movement is subject to economy. More specifically, Chomsky has proposed that move is less economical than merge. That is, merge trumps move when both are available and both lead to convergent derivations. In the context of the copy theory, where move is understood as copy plus merge, this proposal can be interpreted as saying that the operation copy is costly and should be employed only under convergence pressure.31 With this in mind, consider the derivational step in (61). (61)
Applications of select, merge, and copy:
Num = {John_0, T+_1, saw_0, Mary_1, after_0, T−_0, eating_0, lunch_0}
PP = [after John T− eating lunch]
VP = saw
In (61), ‘saw’ must assign its internal θ-role and there are two potential candidates to receive it: ‘Mary,’ which is still in the numeration, and ‘John,’ which is still active for purposes of A-movement in virtue of having its case unchecked, as discussed earlier. If ‘Mary’ is selected and merged with ‘saw,’ as seen in (57a), the derivation converges as a subject-control structure, after ‘John’ undergoes sideward movement to [Spec, VP] (cf. [57b–e]). In turn, if ‘John’ is copied and merged with ‘saw,’ as shown in (62), the derivation should in principle converge as well, this time yielding an object-control structure.
(62) a. Copying of ‘John’:
PP = [after John T− eating lunch]
V = saw
N = John
b. Merger of ‘John’ and V:
PP = [after John T− eating lunch]
VP = [saw John]
c. Selection and merger of ‘Mary’:
PP = [after John T− eating lunch]
VP = [Mary saw John]
d. Merger of PP and VP:
[VP [Mary saw John] [PP after John T− eating lunch]]
e. Selection and merger of T+:
TP = [T+ [VP [Mary saw John] [PP after John T− eating lunch]]]
f. Copying and merger of ‘Mary’:
[TP Mary [T+ [VP [Mary saw John] [PP after John T− eating lunch]]]]
g. Deletion in the phonological component:
[TP Mary [T+ [VP [Mary saw John] [PP after John T− eating lunch]]]]
30 But see section 7.3.2.1 below, where we discuss cases of adjunct control in Portuguese where we find subject or object control depending on whether we have wh-in situ or wh-movement.
31 See Nunes (1995, 2001, 2004) and Hornstein and Nunes (2002) for relevant discussion.
Note, however, that the derivation in (62) violates economy as copying of ‘John’ is employed (cf. [62a]) at a derivational step where selection and merger of ‘Mary’ would suffice to yield a convergent result (cf. [57]). Notice that once ‘Mary’ becomes the object of ‘saw’ (cf. [57a]), it receives accusative case and therefore cannot check the external θ-role of ‘saw,’ as it becomes inactive for purposes of A-relations. In this scenario, copying of and merger of ‘John’ are indeed legitimate (cf. [57f–g]), as there is no alternative option that leads to convergence. In sum, if economy independently restricts movement/copying and sideward movement is just an instance of copy plus merge, then the restriction to subject control into adjuncts is what we expect (and find).32
32 The result is actually a bit more robust than this. There are various ways of ensuring preference of merger over movement in these contexts. For example, one may piggyback on Chomsky’s (2000) proposal that a numeration is organized in subarrays, and require that material that has been selected and integrated into the structure can be accessed again only after all relevant elements of the active subarray have been used. This possibility is explored in Nunes and Uriagereka (2000) and Nunes (2001, 2004).
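The economy reasoning around (61) and (62) can be made concrete with a small sketch. The Python below is only a toy model under stated assumptions: the function name `discharge_theta_role`, the `NOMINALS` set, and the encoding of the numeration as a dictionary of remaining indices are all hypothetical, and the convergence requirement on "merge over move" is not modelled. It shows one way the preference for selecting from the numeration over copying could decide who receives each θ-role of ‘saw’.

```python
# Toy model of the "merge over move" preference at step (61).
# All names are illustrative assumptions, not notation from the text.

NOMINALS = {"John", "Mary"}  # items able to receive a theta-role in this toy lexicon


def discharge_theta_role(numeration, active):
    """Pick the element that receives a pending theta-role.

    numeration: dict mapping lexical items to their remaining index
    active:     set of already-merged nominals whose case is still unchecked
    Returns a pair (operation, item), or None if nothing is available.
    """
    # Option 1: select an unused nominal from the numeration and merge it.
    for item in sorted(numeration):
        if item in NOMINALS and numeration[item] > 0:
            numeration[item] -= 1
            return ("merge", item)
    # Option 2 (costlier): copy an active element and merge the copy.
    for item in sorted(active):
        return ("copy+merge", item)
    return None


# State of the derivation in (61): 'Mary' and T+ are still in the numeration.
numeration = {"John": 0, "T+": 1, "saw": 0, "Mary": 1,
              "after": 0, "T-": 0, "eating": 0, "lunch": 0}
active = {"John"}  # 'John' has not checked case inside the gerund

internal = discharge_theta_role(numeration, active)  # internal theta-role of 'saw'
external = discharge_theta_role(numeration, active)  # external theta-role of 'saw'
print(internal, external)
```

In this sketch ‘Mary’ is never added to `active` because, as the text notes, it receives accusative case once it becomes the object; that is what leaves copying of ‘John’ as the only option for the external θ-role.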
4.5.1.2 Adjunct control and CED effects
At first sight, the proposal that adjunct control is also a product of (A-)movement seems to face problems when familiar instances of movement out of adjuncts are taken into consideration. A sentence such as (63), for instance, displays a typical CED effect (see Huang 1982), showing that such movement does not lead to acceptable results.
(63) ∗[[Which book]_i did [John [vP [vP talk to Mary] [PP after he read t_i]]]]
Given the unacceptability of (63), the question that arises is why sideward movement in the adjunct-control construction in (64) below (cf. [57]) does not violate the CED. Or, to put matters slightly differently, what prevents sideward movement from applying in (63), incorrectly bleeding CED effects? The challenge for the MTC is thus to provide an account that at once explains why movement leads to CED effects in (63) but not in (64). (64)
[John_i [vP [vP t_i saw Mary] [PP after t_i eating lunch]]]
Theoretically accounting for the difference is less difficult than it may appear to be. Let us more closely examine the derivational steps prior to movement of ‘John’ in (64) and ‘which book’ in (63), as respectively represented in (65) and (66). (65)
Applications of select, merge, and copy:
PP = [after John T− eating lunch]
vP = [saw Mary]
(66) Applications of select, merge, and copy:
[Did [John [vP [vP talk to Mary] [PP [which book] after he read [which book]]]]]
Whatever the ultimate analysis of adjunct islands turns out to be, it should obviously apply to adjuncts. But notice that the notion of being an adjunct is not an absolute, but a relational notion. In other words, a given constituent X is an adjunct of Y only after X and Y are merged. Before that, each is an independent syntactic object. This is exactly the case in (65). The PP is not an adjunct of the vP at this derivational step, but an independent tree. Copying from within the PP should in fact be no different from the copying that takes place in standard instances of movement such as movement of the subject from [Spec, vP] to [Spec, TP] depicted in (67).
(67) a. Applications of select and merge:
TP = [T [vP he saw her]]
b. Copying of ‘he’:
TP = [T [vP he saw her]]
D = he
c. Merger of ‘he’ and TP:
[TP he [T [vP he saw her]]]
As for the derivation of (63), at the derivational step where the wh-phrase should move out of the PP, the PP has already adjoined to the vP (cf. [66]). Therefore, extraction out of it should indeed yield a CED effect. To derive (63) without incurring a CED violation requires moving ‘which book’ from within the PP before the PP is adjoined to the vP. However, this movement targets [Spec, CP]. Thus, the adjunct cannot be merged until C has been added to the derivation, if it is to remain porous to movement. In other words, a potential derivation of (63) employing sideward movement should involve the derivational steps in (68), where the PP remains an independent syntactic object throughout the computation. (68) a.
Applications of select, merge, and copy:
PP = [[which book] after he read [which book]]
CP = [did [John [vP talk to Mary]]]
b. Copying of ‘which book’:
PP = [[which book] after he read [which book]]
CP = [did [John [vP talk to Mary]]]
DP = [which book]
c. Merger of ‘which book’ and CP:
PP = [[which book] after he read [which book]]
CP = [[which book] [did [John [vP talk to Mary]]]]
Although sideward movement of ‘which book’ in (68b–c) allows the relevant wh-features to be licensed, there is no convergent continuation for (68c). Notice that the PP in (68c) must adjoin to vP for interpretive reasons. However, once vP has been integrated into a larger structure, merger between PP and vP violates the extension condition. To sum up, if PP is adjoined to the vP “at the right time” (cf. [66]), it becomes an adjunct island for later movements from within it. If adjunction of PP is delayed to allow feature checking later on in the derivation (cf. [68]), the extension condition prevents PP from merging with vP. In either case, the derivation of (63) is correctly blocked.33
Notice that this analysis also derives locality effects in adjunct-control constructions. In a sentence such as (69), for instance, where the gerund is adjoined to the embedded clause, PRO cannot take the matrix subject as its antecedent, but only the subject of the next clause up.
(69) [John_i left the room [after Mary_k answered the questions without PRO_k/∗i understanding them]]
In order for the matrix control in (69) to be obtained under the MTC, ‘John’ should be generated in the adjunct clause, as shown in (70) below. Movement of ‘John’ to the matrix [Spec, vP] should then yield a CED effect, as it would be crossing an adjunct island. Similar to what happens to the wh-phrase in (66), when the relevant target for the movement of ‘John’ (the matrix [Spec, vP]) is introduced in the derivation it is already too late, as ‘John’ is trapped within an adjunct island.
(70) PP = [after Mary [vP [vP answered the questions] [PP without John understanding them]]]
vP = [left the room]
33 One wonders what blocks a derivation in which the wh-element of (63) undergoes sideward movement to an outer [Spec, vP] before PP adjoins to vP, as illustrated in (i) below, which should then yield the sentence in (63) in compliance with the extension condition. Crucially, although the copy of ‘which book’ inside PP is trapped within the adjunct, the copy in the outer [Spec, vP] is free to undergo further movement to [Spec, CP].
(i) a. Applications of select, merge, and copy:
PP = [[which book] after he read [which book]]
vP = [vP John talk to Mary]
b. Copying of [which book]:
PP = [[which book] after he read [which book]]
vP = [vP John talk to Mary]
DP = [which book]
c. Merger of DP and vP:
PP = [[which book] after he read [which book]]
vP = [[which book] [v’ John talk to Mary]]
d. Merger of PP and vP (by adjunction):
[vP [vP [which book] [v’ John talk to Mary]] [PP [which book] after he read [which book]]]
We would like to suggest that sideward movement in (ib–c) is prevented by last resort and the ban on global computations with look-ahead. A standard assumption within the phase-based model (Chomsky 2000, 2001, 2004) is that movement to the edge of a phase is not driven by regular feature checking, but by the phase impenetrability condition, which basically forces elements that are in the complement domain of a phase head to move to its edge so that later relations can take place (see Bošković 2007 for relevant discussion). Notice, however, that the wh-phrase in (ia) is already at the edge and is not in the complement domain of any phase head. Thus, copying of the wh-phrase at this derivational step is not licensed by last resort. If computational decisions are to be made locally without look-ahead, as we are assuming here, the wh-phrase will then remain in the edge of PP until PP adjoins to vP. See Nunes (2008c) for further discussion.
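The timing argument developed for (63), (64), and (68) can be sketched as a small simulation. The Python below is a minimal illustration under stated assumptions: the class `SO` and the functions `merge`, `adjoin`, and `copy_from` are hypothetical names, the CED is reduced to "no copying out of an object that has already been adjoined", the extension condition is reduced to "merge and adjunction apply only to root objects", and last resort (relevant to the derivation discussed in footnote 33) is not modelled.

```python
# Toy workspace model: adjunct islands are relational, extension holds for all merge.
# Names and simplifications are assumptions made for illustration only.

class ExtensionViolation(Exception):
    pass


class CEDViolation(Exception):
    pass


class SO:
    """A syntactic object; it is a root until it is merged into something."""

    def __init__(self, label, parts=()):
        self.label = label
        self.parent = None
        self.adjoined = False
        for part in parts:
            part.parent = self

    def is_root(self):
        return self.parent is None


def merge(label, a, b):
    # Extension condition: both inputs must be root syntactic objects.
    if not (a.is_root() and b.is_root()):
        raise ExtensionViolation(label)
    return SO(label, (a, b))


def adjoin(adjunct, host):
    # Adjunction is treated as merge, hence also subject to extension.
    if not (adjunct.is_root() and host.is_root()):
        raise ExtensionViolation(host.label)
    adjunct.adjoined = True
    return SO(host.label, (host, adjunct))


def copy_from(container, item_label):
    # CED: copying out of a domain that is (inside) an adjoined object is barred.
    node = container
    while node is not None:
        if node.adjoined:
            raise CEDViolation(item_label)
        node = node.parent
    return SO(item_label)


def adjunct_control():            # cf. (64)/(65)
    pp, vp = SO("PP"), SO("vP")
    john = copy_from(pp, "John")  # PP is still an independent tree
    vp = merge("vP", john, vp)
    adjoin(pp, vp)


def early_adjunction():           # cf. (66)
    pp, vp = SO("PP"), SO("vP")
    vp = adjoin(pp, vp)
    merge("CP", SO("did John"), vp)
    copy_from(pp, "which book")   # PP is already an adjunct island


def late_adjunction():            # cf. (68)
    pp, vp = SO("PP"), SO("vP")
    cp = merge("CP", SO("did John"), vp)
    wh = copy_from(pp, "which book")
    merge("CP", wh, cp)
    adjoin(pp, vp)                # vP is no longer a root


def run(derivation):
    try:
        derivation()
        return "converges"
    except CEDViolation:
        return "CED violation"
    except ExtensionViolation:
        return "extension violation"


for d in (adjunct_control, early_adjunction, late_adjunction):
    print(d.__name__, "->", run(d))
```

The sketch collapses the clausal spine into a single `merge` step, so it says nothing about intermediate projections; its only purpose is to show that the same copying operation succeeds or fails depending on whether adjunction has already applied.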
It is worth stressing that the analysis of the contrast between (63) and (64) advocated above explores very natural assumptions within minimalism.34 First, once syntactic structures are built step by step, relational notions such as adjunct of are only instantiated if the relevant objects have been merged.35 A given XP will become an adjunct island only after it adjoins to its target; before that, it is a porous domain like any other root syntactic object. Thus, sideward movement in a sense bleeds CED by applying before CED becomes relevant. Second, every application of merge (including adjunction) is subject to the extension condition.36 Third, the operation of copy is subject to last resort, computed in a local fashion (without look-ahead). Thus, one cannot idly create a copy and leave it dangling because it will be useful later on in the derivation. Copies are not made until prompted by their targets.37 In the next section we will see that this analysis also receives independent support from finite control.
34 For a full elaboration see Hornstein (2001), Nunes (2001, 2004), and Hornstein and Nunes (2002).
35 The same considerations of course apply to the notion specifier/subject of (see Nunes and Uriagereka 2000; Hornstein 2001; and Nunes 2001, 2004, for relevant discussion). Hornstein and Kiguchi (2003) and Kiguchi (2004), for instance, analyze Higginbotham’s (1980) PRO-gate phenomena illustrated in (i) in terms of sideward movement of a given DP from within an XP before this XP becomes a subject island, as illustrated in (ii) (see section 6.2 below for further discussion).
(i) [[PRO_i having to get up early] upset everyone_i]
(ii) a. Applications of select, merge, and copy:
K = [everyone having to get up early]
L = upset
b. Copying of ‘everyone’:
K = [everyone having to get up early]
L = upset
M = everyone
c. Merger of L and M:
K = [everyone having to get up early]
N = [upset everyone]
d. Merger of K and N:
[[everyone having to get up early] upset everyone]
e. Deletion in the phonological component:
[[everyone having to get up early] upset everyone]
36 Starting with Chomsky (1993), it has been often assumed that the extension condition does not apply to adjunction. Clearly an approach that treats all instances of merge as subject to extension holds the methodological high ground. For a reanalysis of the principal empirical arguments for assuming that adjunction is not subject to extension, see Nunes (1995, 2001, 2004).
4.5.1.3 Finite adjunct control
In section 4.4, we saw that Brazilian Portuguese displays finite control into indicative clauses, which was attributed to the possibility that its indicative T-heads be associated with person and number (T+) or number only (T−). When associated with T−, indicative finite clauses do not value the case feature of their subject, which then remains active for purposes of A-movement, yielding finite-control configurations. This analysis also extends to finite adjunct clauses, as illustrated in (71), which displays all the diagnostics of OC.38
(71)
Brazilian Portuguese:
O João viu a Maria depois que entrou na sala
The João saw the Maria after that entered in-the room
‘John_i saw Mary_k when he_i/∗she_k entered the room’
The (simplified) derivation of (71) proceeds along the lines of (72) (with English words for convenience).
(72) a. Applications of select, merge, and copy:
K = [CP after that João T− entered the room]
L = [vP saw Maria]
b. Copying of ‘João’:
K = [CP after that João T− entered the room]
L = [vP saw Maria]
M = João
c. Merger of L and M:
K = [CP after that João T− entered the room]
N = [vP João saw Maria]
d. Merger of K and N (by adjunction):
[vP [vP João saw Maria] [CP after that João T− entered the room]]
e. Selection and merger of T+:
[TP T+ [vP [vP João saw Maria] [CP after that João T− entered the room]]]
f. Copying and merger of ‘João’:
[TP João [T’ T+ [vP [vP João saw Maria] [CP after that João T− entered the room]]]]
g. Deletion in the phonological component:
[TP João [T’ T+ [vP [vP João saw Maria] [CP after that João T− entered the room]]]]
37 For additional arguments and relevant discussion, see Hornstein (2001), Nunes (2001, 2004), and Hornstein and Nunes (2002).
38 For data and relevant discussion, see Ferreira (2000, 2004, 2009) and Rodrigues (2004).
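The final step of (72), deletion in the phonological component, can be illustrated with a rough sketch. The Python below is only an approximation under explicit assumptions: the structure in (72g) is encoded as nested lists, `SILENT` and `pronounce` are hypothetical names, and "keep the highest copy" is simplified to "keep the leftmost occurrence", which happens to give the right result for this structure but is not offered as a general algorithm for chain reduction.

```python
# Toy linearization of (72g): pronounce only one copy of a moved element.
# Assumption: the highest copy is also the leftmost one in this structure.

SILENT = {"T+", "T-"}  # heads treated as having no phonetic content here


def leaves(tree):
    """Yield the terminal strings of a nested-list structure, left to right."""
    if isinstance(tree, str):
        yield tree
    else:
        for sub in tree:
            yield from leaves(sub)


def pronounce(tree, moved):
    """Spell out the tree, deleting every non-initial copy of items in `moved`."""
    seen = set()
    words = []
    for leaf in leaves(tree):
        if leaf in SILENT:
            continue
        if leaf in moved:
            if leaf in seen:
                continue  # lower copy: deleted in the phonological component
            seen.add(leaf)
        words.append(leaf)
    return " ".join(words)


structure_72g = ["João",
                 ["T+",
                  [["João", ["saw", "Maria"]],
                   ["after", "that", "João", "T-", "entered", "the", "room"]]]]

print(pronounce(structure_72g, moved={"João"}))
```

The output string lines up with the word-by-word gloss of (71), in which the embedded subject position is phonetically empty.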
The T-head of the adjunct clause in (72a) is not φ-complete and therefore cannot freeze ‘João’ for purposes of A-movement. ‘João’ can then be copied and merged with vP and receive the external θ-role of the matrix clause. Crucially, when sideward movement occurs (cf. [72b–c]), CP is an independent tree and no adjunct violation arises.39 By contrast, if the relevant clause has already been adjoined, the discussion in section 4.5.1.2 predicts that a CED effect should be detected. That this prediction is correct is shown by the sentence in (34a), repeated below in (73), which should involve the illegitimate movement of ‘João’ to the matrix [Spec, vP] in (74) (with English words).
(73) Brazilian Portuguese:
∗[O João disse [que [o bolo [que comeu]] não estava bom]]
The João said that the cake that ate not was good
‘João said that the cake that he ate was not good’
(74) [vP said that [the cake [that João T ate]] was not good]
Another correct prediction of the analysis presented in section 4.5.1.2 is that finite adjunct control should also display sensitivity to locality (cf. [69]), as illustrated in (75) below. When the adjunct modifies the intermediate clause, the null subject inside the adjunct can only be interpreted as ‘Maria.’ If it were to be interpreted as the matrix subject, ‘João’ would have to be generated in the adjunct clause, but movement to the matrix [Spec, vP] should then yield a CED effect (cf. [76] with English words). Therefore, adjunct finite control in Brazilian Portuguese provides additional evidence that adjunct control should be analyzed in terms of sideward movement, as argued earlier.
(75) Brazilian Portuguese:
[O João saiu da sala [depois que a Maria gritou porque estava com medo]]
The João left of-the room after that the Maria screamed because was with fear
‘João left the room [after Maria screamed because she/∗he was scared]’
(76) [vP [left the room] [PP after Maria [vP [vP screamed] [because João T− was scared]]]]
39 Notice that, as in the case of non-finite control, the null embedded subject of (71) cannot be controlled by the object. This subject–object asymmetry can therefore also be accounted for in terms of the “merge-over-move” economy preference discussed in section 4.5.1.1 (see also Modesto 2000; Rodrigues 2004; Nunes 2008c; and section 7.3.2.1 below for further discussion).
4.5.1.4 Summary
The above discussion shows that, given some natural minimalist assumptions, it is possible to assimilate adjunct control to the MTC quite straightforwardly. The subject orientation observed in adjunct control can be accounted for in terms of the “merge-over-move” economy preference, and the fact that the sideward movement involved in adjunct control is not subject to the CED is explained in terms of the derivational timing when such movement takes place. Moreover, we have seen that the account proposed also captures finite adjunct control. In effect, the MTC offers a complete analysis of adjunct control within a framework of minimalist assumptions.
We find it interesting that cases of adjunct control can be analyzed in terms of movement. The reason is that adjunct control cannot be reduced to the thematic requirements of an embedding control predicate (as is standard in cases of control into complement clauses), as there is no thematic relation between the matrix verb and the adjunct. Thus, whatever control one finds here cannot be a function of the properties of the matrix predicate, as is usually assumed in cases of control into complement clauses. Nonetheless, adjunct control displays all the diagnostic properties of OC found in control into complements. This suggests that, at least in adjunct control, the controller is structurally specified. Moreover, if the properties of OC PROs within adjuncts are structurally determined, then it would be very odd to treat cases of control into complements (where the very same properties appear) as derived by entirely different operations and in entirely different ways. The MTC’s ability to unify complement and adjunct control thus constitutes another argument in its favor, we believe.
Most other approaches to control, based as they are (at least in part) on selection properties of the embedding predicate, have had little to say about these cases despite their having all the signature properties of OC found in the case of complements.40
40 See sections 7.2 and 7.3.2 below for a detailed discussion of this point.
4.5.2 The movement theory of control and morphological restrictions on copies
So far, we have been discussing interesting consequences revolving around the reinterpretation of the operation move as copy plus merge. We now turn to other empirical implications of the implementation of the MTC under the copy theory by inspecting the copies themselves. Consider the superficially similar constructions in (77) from European Portuguese.
(77) European Portuguese (Martins and Nunes 2008):
a. Custou-me vender a casa
Cost-me sell.INF the house
‘It saddened me that I/he/she had to sell the house’
b. Custou-me a vender a casa
Cost-me to sell.INF the house
‘It was hard for me/∗him/∗her to succeed in selling the house’
Martins and Nunes (2008) show that these sentences contrast not only in their meaning, but also in their structural properties. More specifically, they argue that the infinitival subject of (77a) has its case licensed clause-internally, but not the infinitival subject of (77b). Taking the presence or absence of the preposition a ‘to’ as a diagnostic for which structure is at stake, they show that only structures such as (77a) − without the preposition − allow inflected infinitives, independent tense, and expletive subjects, as respectively illustrated in (78). (78)
European Portuguese (Martins and Nunes 2008):
a. Custou-me (∗a) venderem a casa
Cost-me to sell.INF.3PL the house
‘It saddened me that they had to sell the house’
∗‘It was hard for me to get them to greet the boss’
b. Custa-me (∗a) tê-lo despedido
Costs-me to have.INF-him fired
‘It saddens me that I had to fire him’
∗‘It is hard for me that I succeeded in firing him’
c. Custa-nos (∗a) haver pessoas com fome
Costs-us to have.INF people with hunger
‘It saddens us that there are hungry people’
∗‘It is hard for us to succeed in causing people to be hungry’
Based on these differences, Martins and Nunes propose that (77a) and (77b) are to be respectively analyzed along the lines of (79) and (80) below (with English words). In (77a)/(79), the null subject inside the infinitival clause is a pro, which may be coreferential with the experiencer complement of the matrix verb. By contrast, the embedded null subject in (77b)/(80) is a trace of the experiencer; in other words, (77b) is an OC construction.41
(79) [TP pro_expl cost me_i [pro_i sell the house]]
(80) [TP pro_expl cost me_i to [t_i sell the house]]
41 Martins and Nunes (2008) show that constructions such as (77b) indeed display all the diagnostics of obligatory control, contrasting with constructions like (77a). For instance, as opposed to constructions such as (77a), constructions such as (77b) require sloppy reading under ellipsis and de se interpretation in the relevant contexts, as illustrated in (i) and (ii) (see Martins and Nunes [2008] for the other diagnostics and further discussion).
(i) European Portuguese (Martins and Nunes 2008):
a. Custou-me beber aquilo e a ele, custou-lhe também
Cost-me drink that and to him cost-him too
‘It was hard for me that I had to drink that and it was also hard for him that I/he had to drink that’
b. Custou-me a beber aquilo e a ele, custou-lhe também
Cost-me to drink that and to him cost-him too
‘It was hard for me to succeed in drinking that and it was also hard for him to succeed in drinking that/∗in having me drink that’
(ii) European Portuguese (Martins and Nunes 2008):
[Context: an amnesic soldier watches a documentary in which he is the protagonist, but he does not remember that the person being shown is him himself]
a. Custou-lhe depor as armas
Cost-him lay-down the arms
‘It saddened him that the documentary character had to lay down his arms’
b. #Custou-lhe a depor as armas
Cost-him to lay-down the arms
#‘It was hard for him to succeed in laying down his arms’
What is relevant for our present discussion is an additional contrast discussed by Martins and Nunes. Let us first consider the background data in (81)–(82).
(81) European Portuguese (Martins and Nunes 2008):
a. O João levantou-se cedo
The João raised.SE early
‘João got up early’
b. Vive-se bem nesta cidade
Lives.SE well in-this city
‘One lives well in this city’
(82) European Portuguese (Martins and Nunes 2008):
a. ∗Levanta-se-se cedo neste país
Raises.SE-SE early in-this country
‘One gets up early in this country’
b. ∗Pode-se levantar-se cedo neste país
Can.SE raise.SE early in-this country
‘One can get up early in this country’
Example (81) shows that in Portuguese, the third-person clitic se is ambiguous between a reflexive, as in (81a), and an indefinite, as in (81b). In turn, (82) shows that the two uses cannot coexist within a single clause. Example (82b) further shows that the problem with (82a) is not that we have two clitics associated with a single verb. In (82b) the clitics are each associated with a different verb and an ungrammatical result still obtains as the clitics are co-occurring within the same clause (the finite verb is a modal auxiliary). With this restriction in mind, consider now the interesting contrast in (83).
(83) European Portuguese (Martins and Nunes 2008):
a. Custou-me levantar-me cedo
Cost-me raise-me early
‘Getting up early is hard for me’
b. ∗Custou-me a levantar-me cedo
Cost-me to raise-me early
‘It was hard for me to succeed in getting up early’
These sentences indicate that the first-person reflexive clitic me can co-occur with a homophonous experiencer in the matrix clause when control is not involved (cf. [83a]), but not when OC is at stake (cf. [83b]). In other words, the OC construction replicates the restriction seen in (82a), where no control is involved, and the question is why this should be so. As Martins and Nunes argue, the contrast in (83) can receive a natural explanation if one adopts the MTC as implemented under the copy theory. From this perspective, the trace in the OC structure in (80) is a copy of the matrix experiencer. Thus, the relevant structures associated with (83a) and (83b) are the ones given in (84) and (85) (with English words and superscripted indices annotating copies). (84)
[TP pro_expl cost me_i [pro_i [raise me early]]]
(85) [TP pro_expl cost me^i to [me^i [raise me early]]]
In (83a)/(84) the embedded reflexive has as its clause-mate a pro coreferential with the matrix experiencer. In the OC construction in (83b)/(85), on the other hand, the embedded subject is a copy of the experiencer. Thus, we find two instances of ‘me’ in the embedded clausal domain in (85), but not in (84). This indicates that the ungrammaticality of (83b) in effect reduces to the independent restriction in the morphological component banning two identical clitics in the same clause seen in (82). As Martins and Nunes also point out, it is not at all obvious how contrasts such as the one in (83) can be captured under a PRO-based account, given that the reflexive and the experiencer should be in different clauses in each of the sentences in (83), as represented in (86) and (87) (with English words).
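The reasoning behind the contrast in (83) can be stated as a simple check. The Python below is a hedged sketch, not an implementation of Martins and Nunes’s proposal: the function name `violates_clitic_ban` and the encoding of a clausal domain as a list of (form, is_clitic) pairs are assumptions made for illustration, and the check is assumed to apply before lower copies are deleted.

```python
# Toy check for the morphological ban on two identical clitics in one clause.
# The data encoding is a hypothetical simplification of (82a), (84), (85), (87).

from collections import Counter


def violates_clitic_ban(clause):
    """clause: list of (form, is_clitic) pairs within one clausal domain."""
    counts = Counter(form for form, is_clitic in clause if is_clitic)
    return any(n > 1 for n in counts.values())


# Embedded clausal domains, with English words as in the text.
embedded_84 = [("pro", False), ("raise", False), ("me", True), ("early", False)]
embedded_85 = [("me", True), ("raise", False), ("me", True), ("early", False)]
embedded_87 = [("PRO", False), ("raise", False), ("me", True), ("early", False)]
clause_82a = [("raises", False), ("se", True), ("se", True), ("early", False)]

for name, clause in [("(84)", embedded_84), ("(85)", embedded_85),
                     ("(87)", embedded_87), ("(82a)", clause_82a)]:
    print(name, "violates the ban:", violates_clitic_ban(clause))
```

On this encoding only the copy-based representation in (85) patterns with (82a); a representation with pro or PRO as the embedded subject contains a single ‘me’ in the embedded domain and so passes the check.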
(86) [TP pro_expl cost me_i [pro_i [raise me early]]]
(87) [TP pro_expl cost me_i [PRO_i [raise me early]]]
To conclude, by assuming the MTC and the copy theory, we expect OC PRO to behave like a regular copy of its antecedent. More precisely, we expect OC PRO to be subject to whatever computations and restrictions apply to its antecedent in the phonological component. This is exactly what we find in OC involving experiencers in European Portuguese. The fact that the morphological restriction ruling out two identical clitics applies to OC PRO but not to a coreferential pro strongly indicates that OC PRO is a copy of its antecedent and, therefore, that OC is derived by copying/movement.
4.5.3 Backward control
So far we have examined customary cases of forward control, in which the controllee is c-commanded by the controller, as represented in (88) below, where ‘Δ’ is meant to be neutral regarding the grammatical nature of the controllee. Backward control, as sketched in (89), is also attested in several languages.
(88) a. [DP_1 V [Δ_1 . . . ]]
b. [DP V DP_1 [Δ_1 . . . ]]
(89) a. [Δ_1 V [DP_1 . . . ]]
b. [DP V Δ_1 [DP_1 . . . ]]
Backward-control constructions have the following two defining properties: (i) they exhibit a control interpretation in that a single DP is interpreted as being associated with two or more θ-roles; and (ii) in overt syntax, the controller appears in a lower position than the controllee. In this section we discuss the implications of the existence of backward-control constructions for the debate on how to better account for OC. We start by indicating the problems that backward control poses to PRO-based theories and how they can receive a natural explanation under the MTC. We then proceed to present actual examples of backward control in two different languages: Tsez and Korean.42
4.5.3.1 Backward control and PRO-based approaches to control
PRO-based approaches assume that control is an inter-chain relation in which the phonetically unexpressed θ-role carrier is PRO and that this inter-chain relation is a species of binding (see Chapter 2). Under this view, the abstract scheme in (89) takes the form in (90).
(90) a. [PRO_1 V [DP_1 . . . ]]
b. [DP V PRO_1 [DP_1 . . . ]]
42 For further illustrations of backward control, see e.g., Polinsky and Potsdam (2002) on Tsez, Monahan (2003) on Korean, Potsdam (2006) on Malagasy, Alboiu (2007) on Romanian, Haddad (2007) on Telugu and Assamese, and Alexiadou, Anagnostopoulou, Iordachioaia, and Marchis (2008) on Greek.
Backward-control configurations as in (90) pose deadly problems for any account of OC in terms of PRO. Consider the distributional properties of PRO, for instance. In (90), PRO is allowed to occupy both the subject and the object position. Moreover, when PRO occupies the subject position, there is no specific tense restriction on the type of Infl the clause may have. Thus, (90) is problematic for the standard accounts that tie the distributional properties of PRO to government (e.g., Chomsky 1981), null Case (e.g., Martin 2001), or the tense properties of the heads C and T of the clause containing PRO (e.g., Landau 2004), for PRO in (90) can appear in governed positions marked with regular case, regardless of the tense properties of the clause containing it. When the interpretive properties of control are taken into account, the problems become even more damaging. The standard assumption is that controller and controllee stand in a binding/anaphoric relation. However, PRO-based analyses of backward control are inconsistent with the two central principles of binding informally described in (91).
(91) a. Anaphors cannot bind their antecedent.
     b. Anaphors must be bound by their antecedent.
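The clash between the configuration in (90a) and the principles in (91) can be checked mechanically. The following is a minimal sketch, not part of the original analysis: the `Node` class, the helper names, and the simplified definition of c-command (the mother of α dominates β) are all assumptions made for illustration.

```python
class Node:
    """A node in a toy phrase-structure tree (hypothetical helper, not from the text)."""

    def __init__(self, label, index=None):
        self.label = label      # category label, e.g. "DP" or "PRO"
        self.index = index      # referential index, or None
        self.parent = None
        self.children = []

    def add(self, *kids):
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def dominates(a, b):
    """True if a properly dominates b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a, b):
    """Simplified c-command: neither node dominates the other,
    and the mother of a dominates b."""
    if a is b or a.parent is None:
        return False
    if dominates(a, b) or dominates(b, a):
        return False
    return dominates(a.parent, b)


def binds(a, b):
    """a binds b iff they are coindexed and a c-commands b."""
    return a.index is not None and a.index == b.index and c_commands(a, b)


def violations(anaphor, antecedent):
    """Return the principles in (91) that the pair violates."""
    found = []
    if binds(anaphor, antecedent):
        found.append("(91a)")
    if not binds(antecedent, anaphor):
        found.append("(91b)")
    return found


def clause(subject, embedded_subject):
    """Build [subject V [embedded_subject ...]] and return both subjects."""
    embedded = Node("TP").add(embedded_subject, Node("..."))
    Node("TP").add(subject, Node("VP").add(Node("V"), embedded))
    return subject, embedded_subject


# Backward control under a PRO-based analysis, as in (90a): [PRO1 V [DP1 ...]]
pro, dp = clause(Node("PRO", 1), Node("DP", 1))
backward = violations(anaphor=pro, antecedent=dp)

# Forward control under a PRO-based analysis: [DP1 V [PRO1 ...]]
dp2, pro2 = clause(Node("DP", 1), Node("PRO", 1))
forward = violations(anaphor=pro2, antecedent=dp2)

print(backward)  # both principles are violated
print(forward)   # no violation
```

On this toy encoding, the backward configuration violates both principles, while the forward configuration violates neither.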
In (90), DP1 fails to bind PRO and hence should not be a potential antecedent for PRO. Worse still, PRO c-commands DP1 and so should induce a principle C effect, leaving DP1 unable to bear the same semantic value as PRO. In standard cases, this should induce a disjoint reference or strong crossover effect. Clearly, this is incompatible with the fact that in the relevant languages and constructions, the structures in (90) manifest the same control relations we find in (89). Every theory of binding in the generative tradition has adhered to the principles in (91). We think that it is reasonable to conclude that any theory of control that must abandon either or both is very unlikely to be correct. PRO-based accounts are forced into this uncomfortable situation. More precisely, if binding generally involves c-command (i.e., if α binds β then α c-commands β) and something like principle C obtains (i.e., if α c-commands β then β cannot bind α), then PRO-based accounts of backward control must violate one or both of these assumptions. In effect, given the existence of backward control, PRO-based accounts of control violate the basic principles of binding and so one or the other must be abandoned. The conservative strategy, especially given
the straightforward account provided by the MTC, as we will see below, should be the abandonment of PRO-based theories. In sum, the existence of backward control creates a serious theoretical conundrum for PRO-based accounts of control. Indeed, standard PRO-based accounts appear to forbid backward control, as it would lead to binding violations. In this sense, the existence of backward control constitutes a strong counterexample to PRO-based theories of control, not one that is readily massaged away.
4.5.3.2 Backward control and lower-copy pronunciation
The MTC finds itself in a much more comfortable situation as regards backward control. Not only can it provide an account of backward control that is consistent with standard assumptions regarding binding, but backward control is in fact a phenomenon to be expected when we examine the MTC under the copy theory of movement. The reasoning goes as follows. As we have discussed above, movement under the copy theory reduces to applications of copy and merge, as illustrated in (92)–(93) below. In the general case, when a given structure containing copies is spelled out, the highest copy is kept and the lower ones are deleted, as sketched in (94a).43 However, a growing body of literature shows that it is also possible to find cases where it is the highest copy that gets deleted, as represented in (94b).44
(92) K = [. . . α . . .]
(93) a. Copying of α: K = [. . . α . . .]   α
     b. Merger of α and K: L = [α [. . . α . . .]]
(94) Deletion in the phonological component (deleted copies in angle brackets):
     a. L = [α [. . . ⟨α⟩ . . .]]
     b. L = [⟨α⟩ [. . . α . . .]]
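The steps in (92)–(94) can be sketched procedurally. The sketch below is only an illustration under simplifying assumptions: structures are nested lists of strings, every terminal identical to α counts as a copy of it, and the function names `copy_and_merge` and `spell_out` are hypothetical.

```python
def terminals(tree):
    """Flatten a nested-list structure into its left-to-right terminals."""
    if isinstance(tree, str):
        return [tree]
    out = []
    for part in tree:
        out.extend(terminals(part))
    return out


def copy_and_merge(k, alpha):
    """(93): copy alpha from inside K and merge the copy with K, giving L = [alpha [K]]."""
    if alpha not in terminals(k):
        raise ValueError(f"{alpha!r} does not occur in K")
    return [alpha, k]


def spell_out(tree, alpha, pronounce="highest"):
    """(94): keep one copy of alpha and delete the rest.

    Assumption of this sketch: every terminal identical to alpha is a copy of it.
    """
    words = terminals(tree)
    positions = [i for i, w in enumerate(words) if w == alpha]
    keep = positions[0] if pronounce == "highest" else positions[-1]
    return [w for i, w in enumerate(words) if w != alpha or i == keep]


K = ["x", ["alpha", "y"]]            # (92): K = [... alpha ...]
L = copy_and_merge(K, "alpha")       # (93b): L = [alpha [... alpha ...]]

print(spell_out(L, "alpha", "highest"))  # (94a): lower copy deleted
print(spell_out(L, "alpha", "lowest"))   # (94b): higher copy deleted
```

The syntactic object L is the same in both calls. Only the choice made at spell-out differs, which is the point of (94).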
43 See Nunes (1995, 1999, 2004) and the collection of papers in Corver and Nunes (2007) for relevant discussion.
44 See Bošković and Nunes (2007) and references therein for an extensive review of cases of lower-copy pronunciation.
Consider the Romanian data in (95) and (96) below, for instance, discussed by Bošković (2002). Example (95) shows that Romanian is a multiple wh-fronting (MWF) language. However, the object wh-phrase does not appear to move
if it is homophonous with the fronted-subject wh-phrase, as illustrated in (96). Bošković proposes that Romanian has a low-level PF constraint against adjacent homophonous wh-phrases, which rules out (96b). As for the exceptional pattern in (96a), Bošković argues that it also involves multiple wh-fronting in the syntactic component but, in order to comply with the PF constraint on adjacent homophonous elements, the higher copy of the object wh-phrase is deleted and the lower one is pronounced instead, as illustrated in the simplified structure in (97).
(95) Romanian (Bošković 2002):
     a. Cine ce precede?
        who what precedes
     b. *Cine precede ce?
        who precedes what
        ‘Who precedes what?’
(96) Romanian (Bošković 2002):
     a. Ce precede ce?
        what precedes what
     b. *Ce ce precede?
        what what precedes
        ‘What precedes what?’
(97) Deletion in the phonological component (deleted copy in angle brackets):
     [ce ⟨ce_i⟩ precede ce_i]
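The interaction between multiple wh-fronting and the PF constraint in (95)–(97) can be sketched as a default-plus-repair procedure. This is a minimal illustration, not Bošković's formalization: homophony is approximated by identical spelling, the constraint is checked on any adjacent pair of words, and the function names are hypothetical.

```python
def has_adjacent_homophones(words):
    """True if two neighbouring words are pronounced identically (here: same spelling)."""
    return any(a.lower() == b.lower() for a, b in zip(words, words[1:]))


def spell_out_wh(structure, chain):
    """Pronounce one copy of the fronted object wh-phrase.

    structure: terminals after multiple wh-fronting, e.g. [subj-wh, obj-wh, V, obj-wh]
    chain:     (high, low) positions of the two copies of the object wh-phrase
    """
    high, low = chain
    # Default: pronounce the higher copy, i.e. delete the lower one.
    default = [w for i, w in enumerate(structure) if i != low]
    if not has_adjacent_homophones(default):
        return default
    # Otherwise delete the higher copy and pronounce the lower one, as in (97).
    return [w for i, w in enumerate(structure) if i != high]


print(spell_out_wh(["cine", "ce", "precede", "ce"], (1, 3)))  # pattern (95a)
print(spell_out_wh(["ce", "ce", "precede", "ce"], (1, 3)))    # pattern (96a)
```

Both calls start from the same kind of syntactic output, with the object wh-phrase fronted. The lower copy surfaces only when pronouncing the higher one would violate the PF constraint.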
Given this general scenario, it is easy to see that the MTC offers a very simple account of backward control. The abstract syntactic configurations for forward and backward control in (88) and (89), repeated in (98) and (99) for convenience, reduce to the configurations in (100) under the copy-theory implementation of the MTC, where the two instances of DP1 are copies.
(98) a. [DP1 V [Δ1 . . . ]]
     b. [DP V DP1 [Δ1 . . . ]]
(99) a. [Δ1 V [DP1 . . . ]]
     b. [DP V Δ1 [DP1 . . . ]]
(100) Subject- and object-control configurations in the syntactic component:
     a. [DP1 V [DP1 . . . ]]
     b. [DP V DP1 [DP1 . . . ]]
In other words, the control relation is a chain-internal relation in both forward and backward control. Their difference is a matter of which copy is phonologically expressed. In the more common case, the lower copy is deleted, as
shown in (101), yielding forward-control constructions. When the upper copy is deleted instead, as shown in (102), we have cases of backward control.
(101) Deletion in the phonological component (forward control; deleted copies in angle brackets):
     a. [DP1 V [⟨DP1⟩ . . . ]]
     b. [DP V DP1 [⟨DP1⟩ . . . ]]
(102) Deletion in the phonological component (backward control; deleted copies in angle brackets):
     a. [⟨DP1⟩ V [DP1 . . . ]]
     b. [DP V ⟨DP1⟩ [DP1 . . . ]]
Interestingly and importantly, not only do the problems discussed in section 4.5.3.1 evaporate in the context of the MTC in conjunction with the copy theory, but backward control is even to be expected in a minimalist context where the copy theory obtains. As nothing theoretically prevents the grammatical option of pronouncing lower copies, this possibility constitutes the null hypothesis. Furthermore, given that there is reasonable evidence that this option is empirically realized (see footnote 44), it is tempting to conclude (and we will not resist this temptation) that the null hypothesis under the copy-theory implementation of the MTC is that backward control should be an option of UG. We turn next to illustrations of backward control and the arguments that support their existence.
4.5.3.3 Empirical illustrations of backward control
Polinsky and Potsdam (2002) document a case of subject backward control in Tsez, a language of the Caucasus. The example in (103), for instance, illustrates a case of backward subject control, embodying the pattern in (89a)/(99a) above.45
(103) Tsez (Polinsky and Potsdam 2002):
      [Δ1/*2 [kidbā1 ziya bišra] yoqsi]
             girl.ERG cow.ABS feed.INF began
      ‘The girl began to feed the cow’
45 Both the verbs -oqa ‘begin’ and -iča ‘continue’ in Tsez can occur in configurations such as (103).
The evidence for the proposed structure in (103), with the null subject in the matrix clause and the overt subject in the embedded one, is the following. First, the matrix verbs in constructions such as (103) are thematic in that they impose selectional restrictions on their subjects. For instance, they require that their subjects be [+animate] and [+volitional], thus excluding sentences such
as (104) below. Furthermore, like standard control verbs, they cannot host idiomatic expressions, as illustrated in (105).
(104) Tsez (Polinsky and Potsdam 2002):
      a. #kʷart’-ā č’ikay yexur-a roq-si
         hammer.ERG glass.ABS break.INF begin.PAST.EVID
         ‘The hammer began to break the glass’
      b. #kid-ber hazab bukad-a yoq-si
         girl.DAT suffering.ABS see.INF begin.PAST.EVID
         ‘The girl began to suffer’
(105) Tsez (Polinsky and Potsdam 2002):
      a. t’ont’oħ-ā buq bac’-xo
         darkness.ERG sun.ABS eat.PRES
         ‘The sun has been eclipsed’ (lit. ‘Darkness eats the sun’)
      b. *t’ont’oħ-ā buq bac’-a boq-xo
         darkness.ERG sun.ABS eat.INF begin.PRES
         ‘The eclipse of the sun has begun’
Second, the case marking on the overt subject is always that which is found on subjects in the embedded clause. For example, the verb teqa ‘hear’ takes a dative subject regardless of whether it is embedded under -oqa:
(106) Tsez (Polinsky and Potsdam 2002):
      kid-ber babiw-s xabar teq-a y-oq-si
      girl.II.DAT father.GEN story.III.ABS hear.INF begin.PAST.EVID
      ‘The girl began to hear the father’s story’
Third, in Tsez, scrambling is rather free both to the left and the right. However, it is also clause bounded. In particular, scrambling out of an infinitive is not permitted. With this in mind, scrambling can be used as a diagnostic of sentence structure. In -oqa constructions, the overt subject cannot scramble with matrix-clause elements, as illustrated in (107) below. Moreover, it is possible to scramble the whole embedded clause and, when one does, the subject cannot be left behind but must scramble with the rest of the clause. Thus, from (107a) we can obtain (108a) but not (108b), and this is what we expect if the overt subject ‘kidbā’ is part of the complement clause.
(107) Tsez (Polinsky and Potsdam 2002):
      a. ħuł [kidbā ziya bišra] yoqsi
         yesterday girl.ERG cow feed began
      b. *kidbā ħuł [ziya bišra] yoqsi
         girl.ERG yesterday cow feed began
         ‘Yesterday the girl began to feed the cow’
(108) Tsez (Polinsky and Potsdam 2002):
      a. ħuł yoqsi [kidbā ziya bišra]
         yesterday began girl.ERG cow feed
      b. *ħuł kidbā yoqsi [ziya bišra]
         yesterday girl.ERG began cow feed
         ‘Yesterday the girl began to feed the cow’
Finally, let us examine the event quantification in (109) and (110) below. (109a) is ambiguous, with uyrax ātiru ‘for the fourth time’ modifying either the embedded verb (a fourth feeding), as shown in (109b), or the matrix verb (a fourth beginning), as shown in (109c). In contrast, (110) only has the reading in which the embedded clause is modified (a fourth feeding), which is what we would expect if the overt subject were in the embedded clause.
(109) Tsez (Polinsky and Potsdam 2002):
      a. uyrax ātiru kidbā ziya bišra yoqsi
         fourth time girl.ERG cow feed began
         ‘The girl began to feed the cow for the fourth time’/‘The girl began for the fourth time to feed the cow’
      b. [uyrax ātiru kidbā ziya bišra] yoqsi
      c. uyrax ātiru [kidbā ziya bišra] yoqsi
(110) Tsez (Polinsky and Potsdam 2002):
      kidbā uyrax ātiru ziya bišra yoqsi
      girl.ERG fourth time cow feed began
      ‘The girl began to feed the cow for the fourth time’/*‘The girl began for the fourth time to feed the cow’
Polinsky and Potsdam present other types of evidence all pointing to the same conclusion, namely, that in Tsez backward-control constructions the subject position of the matrix clause is obligatorily null, thematic, and obligatorily bound by the embedded overt subject. The copy-theory implementation of the MTC can account for these facts as follows. The surface form in (111a), for instance, is derived from the syntactic structure in (111b), where there are two copies of ‘kidbā,’ followed by deletion of the higher copy in the phonological component, as shown in (111c).46
(111) Tsez (Polinsky and Potsdam 2002; deleted copy in angle brackets):
      a. [[kidbā ziya bišra] yoqsi]
         girl.ERG cow feed began
         ‘The girl began to feed the cow’
      b. [kidbā [kidbā ziya bišra] yoqsi]
      c. [⟨kidbā⟩ [kidbā ziya bišra] yoqsi]
46 Polinsky and Potsdam (2002) account for these facts in terms of the MTC by proposing that backward-control constructions involve covert movement of the embedded subject to the matrix θ-position at LF. This also yields the desired LF structure in (111b) for the overt surface form (111a). However, if one assumes that derivations involve a single cycle, a current minimalist assumption (see Chomsky 2000), this is not a viable analysis, as single-cycle theories reject LF-movement operations. Overt movement combined with lower-copy pronunciation has the effect of covert LF-movement but without requiring that covert movement exist (see Polinsky and Potsdam [2006] for a discussion of this possibility and comparison with backward subject raising).
Polinsky and Potsdam provide independent evidence for the proposed analysis (see footnote 46), based on data such as (112) below. Example (112a) shows that Tsez reflexives are clause bound. However, a reflexive in the matrix clause of a backward-control construction can be bound by a DP in a lower clause, as shown in (112b). This makes perfect sense if the lower DP raises to the matrix clause and from there binds the reflexive. At the C–I interface, there is a copy of ‘irbahin-ā’ in the matrix subject position licensing the matrix reflexive in (112b) and this contrasts with (112a), where there is no copy of užā ‘boy’ in the matrix clause.
(112) Tsez (Polinsky and Potsdam 2002):
      a. babir_k nesā nesir_k/*i [užā_i ɣutku roda] retin
         father REFL.I.DAT boy house build wanted
         ‘The father wanted for himself that the boy should build the house’
      b. nesā nesir_i [irbahin-ā_i halmaɣor ɣutku roda] oqsi
         REFL.I.DAT Ibrahim.I.DAT friend house make began
         ‘Ibrahim began, for himself, to build a house for his friend’
Polinsky and Potsdam offer further elaborations of the proposal sketched here and discuss various technical issues related to its implementation. However, their main point is twofold. First, it is quite unclear how the standard theories of control that involve PRO and binding would account for backward-control constructions. In fact, as noted in section 4.5.3.1, on PRO-based approaches to control, backward control should be simply impossible. Second, it is easy to explain the properties of backward-control phenomena if one adopts a movement approach to control (even more so under the copy theory). Let us now consider a case of object backward control from Korean, as discussed by Monahan (2003).47 In Korean, predicates like seltukha ‘persuade,’ sikhita and kangyohata ‘force,’ chungkohata ‘advise’ and jeanhata ‘suggest’ allow for two kinds of control configurations, which are superficially distinguished by the case of the controller, as illustrated in (113).
47 For further data and discussion, see Monahan (2003) and Polinsky, Monahan, and Kwon (2007).
(113) Korean (Monahan 2003):
      Chelswu-nun Yenghi-lul/ka kakey-ey ka-tolok seltukha-ess-ta
      Chelswu.TOP Yenghi.ACC/NOM store.LOC go.COMP persuade.PAST.DECL
      ‘Chelswu persuaded Yenghi to go to the store’
There are two important facts to note concerning (113). First, these sentences are close paraphrases of one another regardless of whether ‘Yenghi’ bears nominative or accusative case. Second, the case ‘Yenghi’ carries correlates with whether it resides in the matrix or the embedded clause in overt syntax. When marked nominative, it patterns like an embedded subject. When marked accusative, it patterns like a matrix object. Taken together, this pair of facts implies that these sentences have the structure in (114), with ‘Δ’ indicating a phonetically null position.
(114) Korean (Monahan 2003):
      a. Chelswu-nun Yenghi-lul_i [Δ_i kakey-ey ka-tolok] seltukha-ess-ta
         Chelswu.TOP Yenghi.ACC store.LOC go.COMP persuade.PAST.DECL
      b. Chelswu-nun Δ_i [Yenghi-ka_i kakey-ey ka-tolok] seltukha-ess-ta
         Chelswu.TOP Yenghi.NOM store.LOC go.COMP persuade.PAST.DECL
         ‘Chelswu persuaded Yenghi to go to the store’
The case of interest here is (114b). Sentences like these have the interpretations of object-control constructions but the overt syntax of expect-type predicates. Thus, like standard object-control constructions and unlike ECM constructions, their matrix verbs impose animacy restrictions on their complements, as shown in (115), and do not display voice transparency, as illustrated in (116) (as indicated by the translations, [116a] and [116b] do not paraphrase one another).
(115) Korean (Monahan 2003):
      #Chelswu-nun tol-i tteleci-tolok seltukha-ess-ta
      Chelswu.TOP rock.NOM fall.COMP persuade.PAST.DECL
      #‘Chelswu persuaded the rocks to fall’
(116) Korean (Monahan 2003):
      a. Chelswu-nun Yenghi-ka Swuyeng-ul intophyu ha-tolok seltukha-ess-ta
         Chelswu.TOP Yenghi.NOM Swuyeng.ACC interview do.COMP persuade.PAST.DECL
         ‘Chelswu persuaded Yenghi to interview Swuyeng’
      b. Chelswu-nun Swuyeng-i Yenghi-eykey intephyu pat-tolok seltukha-ess-ta
         Chelswu.TOP Swuyeng.NOM Yenghi.DAT interview pass.COMP persuade.PAST.DECL
         ‘Chelswu persuaded Swuyeng to be interviewed by Yenghi’
On the other hand, there is convincing evidence that the nominative controller in constructions such as (114b) is located within the embedded clause. First, the matrix verbs of these constructions only permit accusative marking on their complement DP, as illustrated in (117) below, which does not have a sentential complement. Thus, the case of the controller in (114b) must be licensed within the embedded clause.
(117) Korean (Monahan 2003):
      Chelswu-nun Yenghi-lul/*ka seltukha-ess-ta
      Chelswu.TOP Yenghi.ACC/*NOM persuade.PAST.DECL
      ‘Chelswu persuaded Yenghi’
Second, the interpretation of temporal adverbs correlates with the case marking on the control DP, as illustrated in (118).
(118) Korean (Monahan 2003):
      Chelswu-nun Yenghi-lul/*ka nayil kakey-ey mayil ka-tolok seltukha-l ke-ya
      Chelswu.TOP Yenghi.ACC/*NOM tomorrow store.LOC every-day go.COMP persuade.FUT.DECL
      ‘Tomorrow Chelswu will persuade Yenghi to go to the store every day’
The contrast between the accusative and nominative marking on ‘Yenghi’ in (118) follows straightforwardly if the nominative DP resides in the embedded clause and the accusative DP in the matrix clause. Under the standard assumption that adverbs can only modify the clauses they are part of, the adverb nayil ‘tomorrow’ is in the embedded clause when following the nominative DP and, therefore, it will clash with the interpretation of mayil ‘every day,’ leading to unacceptability. On the other hand, when the controller has accusative case, ‘nayil’ can reside in the matrix clause and the sentence is fine. Third, clausal scrambling must pied-pipe a nominative controller, but not its accusative counterpart, as shown in (119). Again this can be accounted for if only the nominative controller is inside the complement clause.
(119) Korean (Monahan 2003):
      Chelswu-nun [kakey-ey ka-tolok]_i Yenghi-lul/*-ka t_i seltukha-ess-ta
      Chelswu.TOP store.LOC go.COMP Yenghi.ACC/*NOM persuade.PAST.DECL
      ‘Chelswu persuaded Yenghi to go to the store’
The upshot of these various considerations is the following. The thematic properties of the controller are the same irrespective of the case it carries. However, if marked accusative, it resides in the matrix clause and, if marked
nominative, it is the subject of the embedded complement. Thus, control clauses with a nominative controller in Korean are backward-control configurations and, as argued by Monahan (2003), can be easily accounted for if OC is movement and if lower copies can be pronounced. In other words, the copy version of the MTC assigns a single syntactic structure to the two sentences of (114), for instance, as illustrated in (120).
(120) [Chelswu-nun Yenghi [Yenghi kakey-ey ka-tolok] seltukha-ess-ta]
      Chelswu.TOP Yenghi Yenghi store.LOC go.COMP persuade.PAST.DECL
If the matrix copy is phonetically realized, as illustrated in (121a) below, it appears with accusative case (cf. [114a]) and we have a standard instance of forward object control. If the embedded copy is pronounced instead, as illustrated in (121b), it appears with nominative case (cf. [114b]), yielding a backward object-control construction.48
(121) (deleted copies in angle brackets)
      a. [Chelswu-nun Yenghi-lul [⟨Yenghi⟩ kakey-ey ka-tolok] seltukha-ess-ta]
         Chelswu.TOP Yenghi.ACC Yenghi store.LOC go.COMP persuade.PAST.DECL
      b. [Chelswu-nun ⟨Yenghi⟩ [Yenghi-ka kakey-ey ka-tolok] seltukha-ess-ta]
         Chelswu.TOP Yenghi Yenghi.NOM store.LOC go.COMP persuade.PAST.DECL
In sum, the movement analysis of control can treat the nominative version of the Korean constructions above analogously to the Tsez cases via a process of overt movement of the embedded subject to the internal θ-position of the matrix verb, coupled with pronunciation of the lower copy. It should be stressed that this is a genuine option within UG once one allows movement to θ-positions and adopts the copy theory of movement, eschewing traces in favor of copies. Like the Tsez data examined earlier, the Korean data discussed above pose a very serious problem for PRO-based accounts of control, as they would require that, in backward-control constructions, an anaphor (i.e., PRO) binds the DP that licenses it, something that is strongly disallowed in all other cases of binding. Thus, the existence of backward control constitutes, in our view, a very powerful reason for rejecting PRO-based accounts of control (see section 4.5.3.1). As the MTC is the only current account of control that makes it possible to dispense with OC PRO, the existence of backward control points ineluctably to the conclusion that some version of the MTC is correct.49
48 Cormack and Smith (2004) challenge this analysis and propose an alternative based on the fact that Korean allows clausal scrambling and null objects quite freely. According to them, the derivation of a sentence like (114b), for instance, involves scrambling of the clausal complement to the left of a null object pronoun, as represented in (i).
(i) a. [Chelswu-nun [Yenghi-ka1 kakey-ey ka-tolok]2 pro1 t2 seltukha-ess-ta]
    b. [DP [TP DP1 . . . ]2 pro1 t2 V]
Although this alternative is reasonable, there is good evidence that it does not work for Korean (we would like to thank Sungshim Hong and Sun-Woong Kim for patient and invaluable critical discussions concerning these constructions and the data reprised below). Take the contrast between (ii) and (iiia) below, for instance. Example (ii) does not involve object control and the embedded nominative quantificational subject cannot bind the null pronoun inside the matrix adjunct, as the former cannot scope over the latter when not in the same clause. By contrast, the backward-control construction in (iiia) allows pronominal binding, which indicates that its matrix object position is occupied not by a null pro, as proposed by Cormack and Smith, but by a deleted copy of the quantified embedded subject, as represented in (iiib).
(ii) *pro1 ttaylin hwuey, Chelswu-nun Yenghi-ka [motun salam1-i ttena-tolok] seltukhassta
     hit.MOD after Chelswu.TOP Yenghi.ACC everyone.NOM leave.C persuaded
     *‘After hitting him1, Chelswu persuaded Yenghi that everyone1 left’
(iii) a. pro1 ttaylin hwuey, Chelswu-nun [motun salam1-i ttena-tolok] seltukhassta
         hit.MOD after Chelswu.TOP everyone.NOM leave.C persuaded
         ‘After hitting him1, Chelswu persuaded everyone1 to leave’
      b. pro1 ttaylin hwuey, Chelswu-nun ⟨motun salam1⟩ [motun salam1-i ttena-tolok] seltukhassta
In fact, were (ib) the correct structure for Korean backward-control configurations, the relation between the DP1 and pro would be an instance of coreference and not binding, for the embedded subject does not c-command pro. Thus, if the embedded subject in (ib) were quantificational, then pro would be an E-type pronoun. Note, however, that, although we can have cross-sentential relations between an overt E-type pronoun and a preceding quantificational expression, as illustrated in (iv) below, this is not possible with constructions analogous to (ib), as shown in (v). On the assumption that the null pronoun should function like the overt one in this case, the unacceptability of (v) again indicates that a null pronoun is not actually available in backward-control cases such as (114b).
(iv) Motun saram-tul-i ku siktang pica-lul coahanta.
     all person.PL.NOM the restaurant.GEN pizza.ACC like.PRS.DECL
     Ku-tul-un kakkum keki-e ka-n-ta
     they once.in.while there go
     ‘All the people1 like the pizza restaurant. They1 go there once in a while’
(v) *John-i [motun salam-i1 ttena-tolok] ku-tul1 seltukhassta
    John.NOM everyone.NOM leave.C they.ACC persuaded
    ‘John persuaded everyone1 that they1 should leave’
Finally, Potsdam (2006) shows that Cormack and Smith’s (2004) proposal does not constitute a general crosslinguistic alternative either, for Malagasy allows backward control but not null objects (see Potsdam 2006 for data and discussion).
4.5.3.4 Wrapping up
Backward control is not restricted to the languages surveyed, but is in fact widely attested (see the references in footnote 42 and Polinsky and Potsdam 2002 for a brief review). This is of great significance, as backward control poses fatal problems for PRO-based accounts of control, but is instead expected under a copy-theory version of the MTC. As discussed in section 4.5.3.1, in both backward subject-control constructions such as the ones found in Tsez and in backward object-control constructions such as the ones found in Korean, the putative PRO under a PRO-based account c-commands the controller, being at odds with standard binding principles. Furthermore, it is no longer descriptively accurate to maintain that PRO only occurs in the subject position of untensed clauses for, in backward object-control constructions, the putative PRO appears in the matrix object position. We believe that the moral of these constructions is clear: if backward control exists, then PRO-based analyses of control are incorrect. As we believe that there is considerable evidence that backward control obtains, it appears to us that PRO-based accounts are doomed to inadequacy.
49 It is worth pointing out that backward control also argues against “mixed” theories of control such as the one in Martin (1996). This proposal resembles the MTC in allowing for a single chain at LF comprising two θ-roles. However, it differs from the MTC in having a PRO-like element that merges into the controlled θ-position. The single chain is derived from a process wherein the PRO cliticizes to the controller (or a region very near it) resulting in the unification of the two disparate chains (i.e., the two chains “collapse” into one). This proposal has problems with backward-control cases for it predicts that the phonetic gap will always appear in the “lower” position. Unlike the MTC, it cannot rely on the copy theory to provide the right phonological options as there is no copy of the controlled expression down below, only a PRO-like element. Similar problems afflict approaches to control like that proposed in Manzini and Roussou (2000) on the assumption that feature movement cannot be in a downward direction. Interestingly, a doubling account of control like the analysis of reflexivization in Zwart (2002), wherein the related DPs are generated together and then one moves away to check another θ-role, is compatible with backward-control data. Backward control would result when the “PRO” moves to check the controller θ-role, and regular control results when the doubled DP moves to check the controller θ-role. This approach requires allowing rather complex “doubling” structures as there is no upper bound on the number of possible “PROs” in a control configuration, but this is already required to handle cases of multiple reflexivization in the envisaged framework. One last point: this suggests that local reflexivization and OC should be treated via the same mechanisms, as proposed by Hornstein (2001).
It is important to observe that backward control also raises challenges for the MTC, as far as some details of technical implementation go. As we have noted, backward control is both consistent with the MTC and expected when the MTC is coupled with the copy theory of movement. This said, a full account of backward control must also specify what licenses the phonetic interpretation of copies, be they high or low. We have tacitly assumed, following Polinsky and Potsdam (2002), Monahan (2003), and Boeckx, Hornstein, and Nunes (2007, 2008), that case may be the responsible factor. Descriptively speaking, it appears that, in languages that allow backward control, copying a DP may leave its case feature stranded (cf. [120]). This is consistent with the idea that variation is limited to the PF/A–P side of the grammar with the LF/C–I side being uniform across languages. However, the details of a full case proposal are still to be worked out to round out a full account of backward constructions in the context of the MTC. Backward control has another useful consequence. Like finite control (see section 4.4), its existence implies that control complements are clausal. This is interesting for there is a long tradition within generative grammar suggesting that control complements are not actually clausal.50 This is plausible when the (null) controllee is in the embedded clause as there is nothing phonetically evident in such cases. However, if backward control has all of the properties of standard control, as appears to be the case, then we can see that control complements are clausal. After all, we can “see” the controller sitting in the embedded clause. It is extremely unlikely (at least it would require a lot of evidence) that backward control involves clausal complements while forward control involves VPs or predicates of some kind, for their properties are identical, as we saw in the case of Korean, for example. 
Theoretically, the properties of backward control are easy to explain given the copy-theory version of MTC on the assumption that control complements are clausal. The best conclusion then is that control complements are indeed clausal and that this fact, made evident in the case of backward control, holds for control quite generally (see section 7.2 for further discussion).
50 See e.g., Bresnan (1982), Chierchia (1984), and the discussion in section 7.2 below.
4.5.4 Phonetic realization of multiple copies and copy control
Let us now consider another type of control construction which is predicted to exist by the copy version of the MTC, namely, copy-control constructions. In the discussion above, we tacitly assumed that, given a series of copies left by movement, only one gets pronounced at PF. However, any version of the
copy theory must account for why realization of multiple copies is forbidden. The null hypothesis is that all copies should in principle be on an equal footing with respect to phonetic realization. Nunes (1995, 1999, 2004) proposes that the ban on phonetic realization of multiple copies has to do with linearization considerations. The gist of his proposal is that copies count as “the same” for purposes of linearization in the phonological component because they are non-distinct elements (technically, they relate to the same occurrences of lexical items of the numeration) and this creates problems. In each of the structures in (122a) and (122b) below, for instance, the two copies of ‘John’ (annotated by the indices) induce contradictory linearization requirements. The first copy of ‘John’ in (122a) must precede ‘was,’ which must in turn precede the second copy. However, given that these copies are non-distinct, this amounts to saying that ‘John’ must precede and be preceded by ‘was,’ which is a contradictory requirement. The same applies to (122b): ‘John’ must precede and be preceded by ‘hoped.’
(122) a. [John_i [was [arrested John_i ]]]
      b. [John_i [T hoped [John_i to see Mary]]]
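The contradiction described for (122) can be made concrete with a small sketch. It rests on simplifying assumptions that are not in the text: the left-to-right order of a list stands in for asymmetric c-command in a right-branching structure, non-distinct copies are represented by identical strings, and the function names are hypothetical.

```python
def precedence_pairs(terminals):
    """All pairs (a, b) such that a precedes b in the given order."""
    return {(a, b) for i, a in enumerate(terminals) for b in terminals[i + 1:]}


def contradictions(terminals):
    """Pairs that are required both to precede and to follow each other."""
    pairs = precedence_pairs(terminals)
    return {tuple(sorted((a, b))) for (a, b) in pairs if (b, a) in pairs}


def delete_copy(terminals, item, keep):
    """Chain reduction: keep only the copy of item at position keep."""
    return [w for i, w in enumerate(terminals) if w != item or i == keep]


# (122a) with both copies of 'John' present
s122a = ["John", "was", "arrested", "John"]
print(contradictions(s122a))

# Deleting either copy removes every contradiction
print(contradictions(delete_copy(s122a, "John", keep=0)))  # higher copy pronounced
print(contradictions(delete_copy(s122a, "John", keep=3)))  # lower copy pronounced
```

With both copies present, ‘John’ is required to precede and follow ‘was’ and ‘arrested’ (and itself). Once all links but one are deleted, no contradictory requirement remains, whichever link survives.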
To put this somewhat differently, the phonological component demands that syntactic structure be converted to linear precedence (say, by Kayne’s [1994] LCA), but a chain is a discontinuous object and cannot be mapped onto a single position at PF. Thus, in order for a structure containing a chain to be linearized, all of its links but one must be deleted. Thus, the reason that all copies but one are phonetically null is that, if they were not deleted, derivations could not converge as they could not be linearized and so would not receive PF interpretations. As for the choice of the link to survive deletion, we have already seen in section 4.5.3 that, although the common situation is for the highest copy to be pronounced, there do exist constructions where a lower copy is pronounced instead, yielding instances of backward control if the relevant movement is movement to a thematic position. Again, this fits within the null hypothesis regarding the copy theory: if a constituent K is a replica of K’ and K’ can be phonetically realized, K can be phonetically realized as well. This line of thinking predicts that if copies do not interfere with linearization, they should in principle be able to surface overtly. Nunes (1999, 2004) argues that, under certain conditions, this actually happens. Here is the reasoning. Suppose for instance that, once the syntactic structure in (123a) below, with two copies of p, is spelled out, the morphological component fuses (in the sense
of Halle and Marantz 1993) the terminals m and p, yielding the atomic blended terminal #mp# (or #pm#, for that matter), with no internal structure accessible to further morphological or syntactic computations, as sketched in (123b).
(123) a. Spelled-out structure: [M pⁱ [L r [K pⁱ m]]]
b. Fusion in the morphological component: [M pⁱ [L r [K #mpⁱ#]]]
The content of #mp# in (123b) cannot be directly linearized with respect to r or the upper copy of p because it is an inaccessible part of #mp#. From an LCA perspective, for instance, the blended material within #mp# is not accessible to c-command computations. However, it can be indirectly linearized in (123b) in virtue of being an integral part of #mp#: given that the upper copy of p asymmetrically c-commands r and that r asymmetrically c-commands #mp#, we should obtain the linear order p>r>#mp#. In other words, the material inside #mp# gets linearized in a way analogous to how the phoneme /l/ is indirectly linearized in ‘John loves Mary’ due to its being part of the lexical item ‘loves.’ But, crucially, once the lower copy of p in (123b) becomes invisible for standard linearization computations, the linearization problems caused by the presence of multiple copies discussed as regards (123) cease to exist. Thus, the structure in (123b) not only can, but must, surface with two copies of p at PF. An example should make this idea clear.51 Consider verb clefting in Vata, as illustrated in (124) below. Koopman (1984) shows that the two verbal occurrences in (124) cannot be separated by islands, which indicates that they should be related by movement. The problem, however, is that, if these occurrences are to be treated as copies under the copy theory, then it should not be possible to linearize the structure containing them in accordance with the LCA, as discussed above with respect to (123). Since the pronoun à ‘we,’ for example, asymmetrically c-commands and is asymmetrically c-commanded by (a copy of) the verb li ‘eat,’ the LCA should induce the contradictory requirement that ‘à’ precede and be preceded by ‘li.’
(124) Vata (Koopman 1984):
li à li-da zué saká
eat we eat.PAST yesterday rice
‘We ATE rice yesterday’
51 For further illustration, see Nunes (1999, 2004, in press), Bošković and Nunes (2007), the collection of papers in Corver and Nunes (2007), and references therein.
Nunes (2004) proposes that this possibility does not in fact arise because the highest copy of the clefted verb gets morphologically fused and thereby evades the purview of the LCA. More precisely, he analyzes verb clefting in Vata as involving verb movement to a focus head, followed by fusion in the morphological component between the moved verb and the focus head, as represented in (125a) below. Of the three verbal copies in (125a), the LCA only “sees” the lower two after the highest copy gets fused with Foc°. The lowest copy is then deleted (cf. [125b]) and the structure is linearized as in (124), with two copies of the verb phonetically realized.
(125) a. Fusion: [FocP #[Foc⁰ Vⁱ [Foc⁰ Foc°]]# [TP . . . [T⁰ Vⁱ [T⁰ T°]] [VP . . . Vⁱ . . . ]]]
b. Deletion of copies (the deleted lowest copy is enclosed in angle brackets): [FocP #[Foc⁰ Vⁱ [Foc⁰ Foc°]]# [TP . . . [T⁰ Vⁱ [T⁰ T°]] [VP . . . ⟨Vⁱ⟩ . . . ]]]
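The interplay between fusion and deletion in (125) can be sketched in a few lines of Python. This is a toy rendering under stated assumptions, not the analysis itself: copies are identified only by the (hypothetical) position labels "Foc", "T", and "V"; a fused copy is simply exempt from deletion; and among the copies that remain visible to linearization the highest one is assumed to survive. The function name `pronounced_copies` is likewise hypothetical.

```python
def pronounced_copies(chain, fused=()):
    """Return the positions whose copies end up phonetically realized.

    chain: copy positions ordered from highest to lowest.
    fused: positions whose copy has been fused with another head and is
           therefore invisible to linearization (assumption of the sketch).
    """
    # Copies that the linearization procedure can still "see".
    visible = [pos for pos in chain if pos not in fused]
    # Assumption: of the visible copies, only the highest survives deletion.
    kept = visible[:1]
    return [pos for pos in chain if pos in fused or pos in kept]


# Vata verb clefting, (125): the copy adjoined to Foc is fused.
vata = pronounced_copies(["Foc", "T", "V"], fused={"Foc"})

# The same chain without fusion: ordinary single-copy pronunciation.
plain = pronounced_copies(["Foc", "T", "V"])
```

With fusion of the highest copy, two copies are realized (the fused one and the copy in T), as in (124); without fusion, only the highest copy is.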
Nunes (2004) presents two pieces of evidence in favor of this account of verb clefting in Vata. The first one relates to Koopman’s (1984: 158) observation that the restricted set of verbs that cannot undergo clefting in Vata has in common the property that they cannot serve as input for morphological processes that apply to other verbs. If these verbs cannot participate in any morphological process, they certainly should not be able to undergo the morphological fusion with Foc◦ depicted in (125a) and should not be allowed in predicate clefting constructions. The second piece of evidence is provided by the fact, also observed by Koopman, that the fronted verb in these focus constructions must be morphologically unencumbered; in particular, none of the tense or negative particles that occur with the verb in Infl may appear with the fronted verb, as illustrated in (126) below. This makes sense if these particles render the verb
morphologically too complex, thereby preventing the verb from undergoing fusion with the focus head.
(126) Vata (Koopman 1984):
a. (∗nà-) le wa n´a`-le-ka
NEG eat they NEG.eat.FT
‘They will not EAT’
b. li(∗-wa) wà li-wa zué
eat.TP they eat.TP yesterday
‘They ATE yesterday’
What is relevant for our purposes here is that these restrictions indicate that the realization of multiple copies should indeed be very sensitive to morphological information, given that multiple copies are only allowed when some copies get morphologically reanalyzed as being part of a fused terminal. The first kind of relevant information regards the feature composition of the elements that are to be fused. After all, not any two terminals can get fused, but only the ones that satisfy the morphological requirements of one another. In Vata, for instance, the duplication of focused material only affects verbs and many languages only allow multiple copies of wh-elements. This may be interpreted as a reflex of the morphological (categorial) restrictions a given head may impose on the copy with which it may fuse. The second kind of information concerns morphological complexity. As a rule, the more morphologically complex a given element is, the less likely it is that it will undergo fusion and become part of a terminal. Thus, the addition of specific morphemes (which may vary from language to language) may make the resulting element morphologically “too heavy” to become reanalyzed as part of a word. This seems to be what is going on in (126), with the addition of Infl particles to the fronted verb. Of course, if a given copy is syntactically complex (i.e., it is phrasal), it is also morphologically complex and not a good candidate to undergo morphological fusion. Now comes the punch line. If multiple copies may be phonetically realized when fusion allows the linearization problem to be circumvented and if control is movement, the copy version of the MTC predicts that control constructions with more than one copy phonetically realized should exist. 
It also predicts that such constructions should display hallmarks of fusion such as sensitivity to morphological information (fusion may or may not take place depending on the morphological properties of the copies involved) and morphological complexity (the more morphologically complex a given copy is, the more unlikely it is to undergo fusion). Boeckx, Hornstein, and Nunes (2007, 2008) show that these predictions are indeed fulfilled.
Consider the data in (127) below from San Lucas Quiaviní Zapotec, discussed by Lee (2003).52
(127) San Lucas Quiaviní Zapotec (Lee 2003):
a. R-cààa’z Gye’eihlly g-auh Gye’eihlly bxaady
HAB.want Mike IRR.eat Mike grasshopper
‘Mike wants to eat grasshopper’
b. B-quìi’lly bxuuhahz Gye’eihlly ch-iia Gye’eihlly scweel
PERF.persuade priest Mike IRR.go Mike school
‘The priest persuaded Mike to go to school’
c. B-ìi’lly-ga’ Gye’eihlly zi’cygàa’ nih cay-uhny Gye’eihlly zèèiny
PERF.sing-also Mike while that PROG.do Mike work
‘Mike sang while he worked’
Each of the sentences in (127) shows a bound copy in the embedded-subject position. Interestingly, the similarities of these constructions with standard control constructions go beyond translation. They also trigger a sloppy reading under ellipsis, as shown in (128), and the bound copy displays complementarity with a coreferential pronoun, as shown in (129). (128)
San Lucas Quiaviní Zapotec (Lee 2003):
a. R-cààa’z Gye’eihlly g-ahcnèe Gye’eihlly Lia Paamm zë’cy cahgza’ Li’eb
HAB.want Mike IRR.help Mike FEM Pam likewise Felipe
‘Mike wants to help Pam, and so does Felipe (want to help Pam/∗want Mike to help Pam)’
b. Zi’cygàa’ nih cay-uhny Gye’eihlly zèèiny b-ìi’lly-ga’ Gye’eihlly zë’cy cahgza’ Li’eb
While that PROG.do Mike work PERF.sing.also Mike likewise Felipe
‘While Mikeᵢ was working, heᵢ sang, and so did Felipeₖ (sing while heₖ worked)’
52 Hmong also seems to allow structures analogous to (131), as illustrated in (i) below (see Boeckx, Hornstein, and Nunes 2007, 2008 for relevant discussion).
(i) Hmong (Quinn 2004)
a. Pov xav kom Pov noj mov
Pov want/think so-that Pov eat rice
‘Pov wants to eat’
b. Maiv ntxias Pov kom Pov rov qab noj mov
Maiv persuaded Pov so-that Pov return back eat rice
‘Maiv persuaded Pov to eat’
c. Pov pais tom qab Pov hais goodbye rau tus pojniam
Pov left after back Pov said good-bye to CLF woman
‘Pov left after saying good-bye to the woman’
(129) San Lucas Quiaviní Zapotec (Felicia Lee, personal communication):
a. R-caaa’z Gye’eihlly g-ahcnèe-ëng Lia Paamm
HAB.want Mike IRR.help.3SG.PROX FEM Pam
‘Mikeᵢ wants himₖ/∗ᵢ to help Pam’
b. Zi’cygàa’ nih cay-uhny-ëng zèèiny b-ìi’lly-ga’ Gye’eihlly
While that PROG.do.3SG.PROX work PERF.sing.also Mike
‘While heᵢ/∗ₖ worked, Mikeₖ sang’
Hornstein, Boeckx, and Nunes (2007, 2008) argue that the data in (127)–(128) are indeed cases of control, i.e., movement to thematic positions, with both the controller and the controllee copies phonetically realized. More specifically, they propose that these constructions involve morphological fusion of the controllee copy with the null “self” morpheme available in this language.53 As we should expect given the discussion above, if a control chain involves morphologically encumbered copies, fusion will be blocked and phonetic realization of more than one copy leads to an ungrammatical result. That this prediction is correct is illustrated by the copy-control constructions in (130a), which involves a quantifier phrase, and in (130b), whose links contain an anaphoric possessor.54 (130)
San Lucas Quiaviní Zapotec (Lee 2003):
a. ∗Yra’ta’ zhyàa’p r-cààa’z g-ahcnèe’ yra’ta’ zhyàa’p Lia Paamm
Every girl HAB.want IRR.help every girl FEM Pam
‘Every girl wants to help Pam’
b. ∗R-e’ihpy Gye’eihlly behts-ni’ g-a’uh behts-ni’ bx:àady
HAB.tell Mike brother.REFL.POSS IRR.eat brother.REFL.POSS grasshopper
‘Mike told his brother to eat grasshoppers’
Let us reexamine the adjunct copy-control case in (127c). As discussed in section 4.5.1, adjunct control involves sideward movement. The fact that sideward movement may also lead to phonetic realization of multiple copies thus further stresses the point that sideward movement is nothing more than one of the instantiations of copy plus merge. Interestingly, there are languages which only allow adjunct copy control, which indicates that the relevant head that triggers fusion in these languages is within the adjunct clause. In his detailed study on control structures in Telugu and Assamese, Haddad (2007, 2009) shows that adjunct copy-control constructions such as (131) and (132) below (CNP stands for conjunctive participle particle) display all the traditional diagnostics of obligatory control and argues that they should also be analyzed in terms of sideward movement and phonetic realization of multiple copies.
(131) Telugu (Haddad 2007):
[[Kumar sinima cuus-tuu] [Kumar popkorn tinnaa-Du]]
Kumar.NOM movie watch.CNP Kumar.NOM popcorn ate.3.SG.M
‘While watching a movie, Kumar ate popcorn’
(132) Assamese (Haddad 2007):
[[Ram-Or khong uth-i] [Ram-e mor ghorto bhangil-e]]
Ram.GEN anger raise.CNP Ram.NOM my house destroyed.3
‘Having got angry, Ram destroyed my house’
53 Hornstein, Boeckx, and Nunes (2007, 2008) argue that fusion with this null “self” morpheme is also what underlies the existence of copy-reflexive constructions in San Lucas Quiaviní Zapotec and Hmong (see footnote 52) such as the ones illustrated in (i).
(i) a. San Lucas Quiaviní Zapotec (Lee 2003):
B-gwa Gye’eihlly Gye’eihlly
PERF.shave Mike Mike
‘Mike shaved himself’
b. Hmong (Mortensen 2003):
Pov yeej qhuas Pov
Pao always praise Pao
‘Pao always praises himself’
54 See Hornstein, Boeckx, and Nunes (2007, 2008) for details and further discussion.
Given the role of morphological fusion in making the phonetic realization of multiple copies possible, it comes as no surprise that multiple copies are only possible if, in Haddad’s (2007: 87) words, the subject “does not exceed one or two words,” as illustrated by the ungrammaticality of (133).
(133) Telugu (Haddad 2007):
∗[[Kumar maryu Sarita sinim cuu-tuu] [Kumar maryu Sarita popkorn tinna-ru]]
Kumar.NOM and Sarita.NOM movie watch.CNP Kumar.NOM and Sarita.NOM popcorn ate
‘While Kumar and Sarita were watching a movie, they ate popcorn’
To summarize, if movement into θ-positions is possible, we should in principle expect such movement to also yield constructions with multiple copies, provided that we have evidence that one of the copies is morphologically reanalyzed. The existence of copy-control constructions therefore provides another (in our view, powerful) argument for the copy version of the MTC, and against PRO-based accounts of control. After all, what can be more convincing for the copy version of the MTC than the phonetic realization of both controller and controllee as copies?
4.6 Conclusion
Counter-examples come in various flavors. A common variety consists of cases that the theory does not cover, though it seems, intuitively, that it should as the data are of a piece with those that it does. Also common are cases that appear to contradict the theory by having a status at odds with what it predicts. In each case, the theory can be saved by judicious twiddling, appending a codicil that adds or excludes the relevant problematic data point. Though often tinged with adhockery, this kind of maneuver is both common and well understood.55 Counter-examples are more serious when more Janus-faced, i.e., when they are perched between two competing theories and smile towards one but frown towards the other. In such circumstances, the “exceptions prove the rule” in the originally intended sense that the counter-examples for one proposal “prove” the second by at once disconfirming the former and confirming the latter. Methodological kudos does (and should) accrue to theories that are able to clean up the ad hoc messes of their competitors. More significant yet are counter-examples that seem to cut to the heart of a theory. These are surprisingly rare, at least in the linguistics literature.56 In such cases defusing the counter-example requires reneging on well-established background principles. Such problematic data are deeply telling for they suggest either that the favored theory is clearly wrong or that the larger theory in which it is embedded is in need of extensive revision. Typically, the larger background assumptions are saved and the more specific proposals sacrificed. Why this brief divagation into the epistemology of counter-examples? Because we believe that part of the data discussed in the previous sections constitute arguments of the strongest type for the debate on obligatory control. 
In particular, backward and copy-control constructions prove fatal for accounts of control that are PRO-based, that is, accounts that assume a primitive expression like PRO which at once bears a θ-role and has the controller as antecedent. Or, more precisely, they lead to the conclusion that no PRO-based approach to control can be right. Moreover, backward and copy control are not only fully consistent with the MTC, but are in fact expected, given the copy theory of movement. This in turn leads directly to the conclusion that some version of the MTC must be correct. Finally, given that the only current alternative to PRO-based accounts of control is the MTC,57 whatever problems of technical implementation the MTC currently faces cannot be grounds for arguing in favor of PRO-based accounts.
Before closing this chapter, we would also like to call attention to the fact that the MTC and the minimalist program snugly fit together conceptually, a feature of the MTC that we believe to be of more than passing interest. As discussed in Chapter 3, the MTC fits well with the elimination of DS. Without DS, the requirement that all θ-role assignment be prior to all movement operations is set aside, and this opens the possibility that θ-roles may also be assigned under movement. In this chapter, we have shown that the copy theory of movement makes room for the existence of cases of lower-copy and multiple-copy pronunciation and that the copy implementation of the MTC correctly predicts that control structures may also exercise these options. In sum, the MTC follows naturally under central features of the minimalist program: the elimination of DS and the copy theory of movement. It is in this sense that the MTC is a quintessentially minimalist account of control. By exploring the logical consequences of these pillars of the minimalist program, the MTC not only provides a conceptually well-grounded analysis of control, but also substantially broadens the empirical coverage of previous analyses. In fact, the nice fit between the MTC and these minimalist ideas constitutes, we believe, an additional argument in its favor.
55 Nor is it always inappropriate. If a theory is basically correct then an ad hoc addition can be interpreted as claiming that the putative counter-example is not actually a real one.
56 Examples include Chomsky’s (1955, 1957) famous arguments against finite-state grammars and simple phrase-structure grammar as models for linguistic competence. Another plausible example is the island and subjacency arguments against Chomsky’s (1982) approach to parasitic gaps in terms of a functional interpretation of empty categories (see Kayne 1984).
57 Another possible approach would revive some version of the earliest equi-accounts. Baltin and Barrett (2002) make such a proposal. However, we believe that such an approach only appears to be different from the MTC. More particularly, an adequate equi-account will have to incorporate the MTC if it is to overcome the original difficulty for equi-approaches posed by the contrast between (ia) and (ib) below (see section 2.3). Given that (ia) and (ib) do not mean the same, distinguishing between the putative deleted instance of ‘everyone’ in the embedded clause of (ia) and the non-deleted instance in (ib) requires assuming that only (ib) results from two selections of ‘everyone’ in the derivation. But this is just to endorse the MTC. If this is correct, then equi-accounts are just subspecies of the MTC.
(i) a. Everyone wants to leave
b. Everyone wants everyone to leave
5 Empirical challenges and solutions
5.1 Introduction
In this chapter we discuss a fair sample of the empirical challenges that have been perceived by many as providing lethal counter-arguments to the MTC.1 We will start with apparent problems that stem from contrasts between raising and control constructions. Sections 5.2 and 5.3 examine cases where raising is not allowed, but control is. Section 5.4 discusses cases in which the different morphological patterns associated with raising and control have been interpreted as showing that control constructions involve a case-marked PRO. In section 5.5, we examine control constructions involving promise-type verbs and control-shift phenomena, which appear to be at odds with the minimality assumptions underlying the MTC. Finally, in section 5.6 we discuss partial and split control, which at first sight seem to require the postulation of PRO, as the controllee appears to be distinct from the controller. As the reader will see, the apparent problems have more to do with a proper characterization of the relevant constructions than with the MTC itself. It is beyond the scope of this chapter to offer an independent full-fledged account of each of them. However, we will show that, under very reasonable assumptions (most of which are standard in GB and/or minimalism), the apparently problematic data can receive a streamlined analysis under the MTC and in many cases turn out to actually provide compelling evidence in its favor.
1 See Hornstein (2001, 2003), Boeckx and Hornstein (2003, 2004, 2006a, 2006b, 2007), Nunes (2007, 2009a, 2010), and Boeckx, Hornstein, and Nunes (in press) for discussion of possible solutions to other empirical objections that have been raised. For the sake of brevity, here we will not review details of our earlier attempts to address the issues to be discussed in this chapter.
5.2 Passives, obligatory control, and Visser’s generalization
Landau (2003) has claimed that the MTC overgenerates by incorrectly ruling in passives of subject-control predicates (see also Kiss 2005 and section 5.2.2 below). The argument runs as follows. If the only difference between the derivation of subject control and raising constructions is that, in the former, movement of the embedded subject first targets a θ-position, as illustrated in (1) and (2) below, control and raising should pattern alike when this θ-position is eliminated. Arguably, this is what happens if the subject-control predicate is passivized. Thus, sentences such as (3a) should be licit under the derivation sketched in (3b), contrary to fact.
(1) a. John tried to kiss Mary
b. [Johnᵢ tried [tᵢ to kiss Mary]]
(2) a. John seems to love Mary
b. [Johnᵢ seems [tᵢ to love Mary]]
(3) a. ∗John was tried to kiss Mary
b. ∗[Johnᵢ was tried [tᵢ to kiss Mary]]
At the risk of beating a dead horse, we would like to reiterate a point made in the previous chapters: the MTC is not a raising theory of control. Rather, it contends that raising and obligatory-control constructions are derived by the same operation: A-movement. So, in order for Landau’s argument to be valid, it should be first demonstrated that the licensing conditions for the relevant A-movement to apply in (1b), (2b), and (3b) are the same. We show in the next section that, when the three derivations are carefully inspected, this is actually not the case. But before examining these derivations in detail and providing an account of (3a), let us enlarge our data set. The unacceptability of (3a) is generally taken to fall under Visser’s generalization (see Bresnan 1982), given that object control and ECM verbs do not shy away from passivization, as illustrated in (4) and (5).
(4) a. John was persuaded to kiss Mary
b. [Johnᵢ was persuaded [tᵢ to kiss Mary]]
(5) a. John was expected to kiss Mary
b. [Johnᵢ was expected [tᵢ to kiss Mary]]
Interestingly, the observation that subject-control verbs do not behave like raising predicates when passivized goes beyond structures involving non-finite complements. Recall from section 4.4 that Brazilian Portuguese allows both finite control and hyper-raising, as illustrated in (6) and (7) below. Relevant to the present discussion is the fact that, if the matrix verb of a finite-control structure such as (6) is passivized, hyper-raising is blocked, as shown in (8).2 In other words, the contrast between raising and subject control seen in (2) and (3) also arises when A-movement out of finite clauses is involved. So, whatever the ultimate analysis for the unacceptability of (3a) is, it should in principle apply to (8a) as well.
(6) Brazilian Portuguese:
a. Os meninos disseram que não fizeram a tarefa
The boys said.PL that not did.PL the homework
‘The boys said that they didn’t do their homework’
b. [[Os meninos]ᵢ disseram [que tᵢ não fizeram a tarefa]]
(7) Brazilian Portuguese:
a. Os meninos parecem que não fizeram a tarefa
The boys seem.PL that not did.PL the homework
‘It seems that the boys didn’t do their homework’
b. [[Os meninos]ᵢ parecem [que tᵢ não fizeram a tarefa]]
(8) Brazilian Portuguese:
a. ∗Os meninos foram ditos que não fizeram a tarefa
The boys were said.MASC.PL that not did.PL the homework
‘It was said that the boys didn’t do their homework’
b. ∗[[Os meninos]ᵢ foram ditos [que tᵢ não fizeram a tarefa]]
2 See Nunes (2007, 2008a, 2010) for discussion.
Let us then consider how the MTC can provide an account of the data in (1)–(8) by paying close attention to the licensing conditions involved in each case where the embedded subject undergoes A-movement to the matrix clause.
5.2.1 Relativizing A-movement
Leaving the discussion of finite control and hyper-raising constructions to section 5.2.3, let us start by considering the derivational step preceding the movement of the embedded subject to the matrix clause in the derivations of (1)–(5), as sketched in (9)–(13) (irrelevant details omitted).
(9) a. John tried to kiss Mary
b. [vP v [VP tried [CP C [TP John to kiss Mary]]]]
(10) a. John seems to love Mary
b. [TP T [VP seems [TP John to love Mary]]]
(11) a. ∗John was tried to kiss Mary
b. [PpleP -en [VP tried [CP C [TP John to kiss Mary]]]]
(12) a. John was persuaded to kiss Mary
b. [VP persuaded [CP C [TP John to kiss Mary]]]
(13) a. John was expected to kiss Mary
b. [PpleP -en [VP expected [TP John to kiss Mary]]]
In (9b) and (12b), the trigger for the movement of ‘John’ is θ-related: the matrix light verb in (9b) and the matrix main verb in (12b) need to assign their remaining θ-role. Movement in (10b), (11b), and (13b), on the other hand, is motivated by agreement in φ-features with a finite T or a passive participial head.3 Let us first examine the instances of A-movement triggered by φ-agreement. One salient difference between the structures where such movement is allowed (cf. [10b] and [13b]) and the one where it is not (cf. [11b]) has to do with the categorial nature of the embedded clause. Under standard assumptions, raising and ECM verbs select for TP, whereas control verbs select for CP. Nunes (2007, 2010) argues that this difference is what underlies the contrast between (10b) and (13b), on the one hand, and (11b), on the other. Assuming with Chomsky (2008) that clausal φ-features are actually hosted by C (they are associated with T only by inheritance from C), Nunes proposes that the agreement relation between ‘-en’ and ‘John’ in (11b) is blocked due to the intervention of C, as depicted in (14) below.4 More specifically, if movement of ‘John’ is to be anchored on φ-agreement, the intervening φ-features of C induce a minimality violation. Once ‘John’ is prevented from undergoing A-movement, it cannot have its case feature checked/valued (the embedded C/T is not a case checker/assigner) and the derivation crashes.
(14) [PpleP -en [VP tried [CP C [TP John to kiss Mary]]]] (φ-agreement between -en and ‘John’: ∗)
By contrast, (10b) and (13b) involve no CP layer in the embedded clause. Thus, movement of ‘John’ couched on φ-agreement is licit, for there is no intervening φ-feature bearer, as respectively shown in (15) and (16).
(15) [TP Johnᵢ T [VP seems [TP tᵢ to love Mary]]] (movement of ‘John’: OK)
(16) [PpleP Johnᵢ -en [VP expected [TP tᵢ to kiss Mary]]] (movement of ‘John’: OK)
3 We assume with Chomsky (2000, 2001) that case checking is contingent on φ-agreement and that passive participial heads are associated with (defective) φ-features (and case) regardless of whether or not these features are overtly realized.
4 Notice that passivization of the whole clause may yield an acceptable result, as exemplified in (i).
(i) To kiss Mary was tried (by John) repeatedly throughout the evening
Similar considerations apply to the derivation of “long passives” in German such as (17) below. Wurmbrand (2001) uses contrasts such as the one in (18) between an impersonal and a long passive to argue that, in long passives, the matrix control verb is a restructuring verb that takes VP for a complement. Once these complements involve just the lower shell of the verbal skeleton, there is no appropriate antecedent to license the anaphor; hence the unacceptability of (18b). (17)
German (Wurmbrand 2001):
dass die Traktoren zu reparieren versucht wurden
that the tractors.NOM to repair tried were
‘that they tried to repair the tractors’
(18) German (Wurmbrand 2001):
a. Es wurde versucht [PROᵢ sichᵢ den Fisch mit Streifen vorzustellen]
It was tried SELF the fish with stripes.ACC to-imagine
‘They tried to imagine what the fish would look like with stripes’
b. ∗weil {sich} der Fisch {sich} vorzustellen versucht wurde
since SELF the fish.NOM SELF to-imagine tried was
‘since they tried to recall the image of the fish’
Assuming that Wurmbrand’s analysis of long passives is correct (see section 5.2.2 below for a discussion of impersonal passives), the relevant derivational step underlying (17) is as sketched in (19) (with English words for convenience). φ-agreement with the passive participial head can license A-movement of the embedded object in (19), for there is no intervening element that hosts φ-features.
(19) [PpleP -en [VP tried [VP repair [the tractor]]]] (φ-agreement between -en and ‘the tractor’: OK)
As Nunes points out, given this relativized minimality approach to A-movement, the acceptability of (12a) becomes very illuminating. Although there is a CP layer in the complement clause (cf. [12b]), movement of the embedded subject is motivated by θ-considerations, not φ-agreement. Hence, the φ-features of C do not block the movement of ‘John,’ as represented in (20a) below. Later on, when the passive participial head is introduced, as shown in (20b), ‘John’ has already moved out of the CP and can therefore enter into an agreement relation with -en and move to [Spec, PpleP] without any problems, as there are no intervening elements bearing φ-features. In other words, movement for θ-reasons in (20a) provides an escape hatch for ‘John’ to enter into φ-agreement relations later in the derivation.
(20) a. [VP Johnᵢ persuaded [CP C [TP tᵢ to kiss Mary]]] (movement of ‘John’: OK)
b. [PpleP -en [VP Johnᵢ persuaded [CP C [TP tᵢ to kiss Mary]]]] (φ-agreement between -en and ‘John’: OK)
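The relativized character of the intervention effect can be illustrated with a short Python sketch. It is a deliberately crude model under assumptions introduced only for illustration: a probe's search path is a flat list ordered from the closest element to the farthest, the feature labels "phi" and "theta_ok" are hypothetical stand-ins (the latter meaning "can receive the θ-role the probing verb still has to assign"), and the function name `closest_goal` is likewise hypothetical.

```python
def closest_goal(path, feature):
    """Return the label of the first element on the search path bearing `feature`.

    `path` lists (label, features) pairs ordered from closest to the probe
    to farthest from it; None is returned if no element qualifies.
    """
    for label, features in path:
        if feature in features:
            return label
    return None


# Assumption of the sketch: C hosts φ-features but cannot bear the θ-role.
C = ("C", {"phi"})
JOHN = ("John", {"phi", "theta_ok"})

# (11b): -en probes for φ-features across the CP complement of 'tried'.
passive_subject_control = [C, JOHN]
# (10b)/(13b): T or -en probes into a TP complement, with no C present.
raising_or_ecm = [JOHN]
# (20a): 'persuaded' looks for a bearer of its remaining θ-role.
theta_step = [C, JOHN]
# (20b): after that step, 'John' is above the embedded C when -en probes.
passive_object_control = [JOHN, C]

for name, path, feature in [
    ("(11b)", passive_subject_control, "phi"),
    ("(10b)/(13b)", raising_or_ecm, "phi"),
    ("(20a)", theta_step, "theta_ok"),
    ("(20b)", passive_object_control, "phi"),
]:
    print(name, feature, "->", closest_goal(path, feature))
```

On these assumptions the φ-probe in (11b) finds C before it finds ‘John’, whereas in (10b)/(13b) and (20b) the closest φ-bearer is ‘John’ itself; the θ-related search in (20a) skips C altogether, which is the sense in which the intervention is relativized to the trigger of the movement.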
Let us now examine the instances of A-movement for θ-purposes in (9b) and (12b) more closely. Given the blocking role played by C with respect to φ-agreement, one wonders why it does not act as a proper intervener for the movement of ‘John.’ After all, C gets θ-marked as it is the head of the complement of the matrix lexical verb. There are two possible approaches to this issue. Under the first one, C does not qualify as a proper intervener because CP is not an appropriate element to carry the external θ-role assigned by the matrix light verb in (9b) or the experiencer θ-role assigned by ‘persuaded’ in (12b); propositions are simply incompatible with these θ-roles. Alternatively, we may assume, following Abels (2003) and Grohmann (2003), that movement cannot be too local. Thus, a given element cannot resort to movement to check two θ-roles inside the same thematic domain for reasons of anti-locality. If so, once CP is not an eligible candidate to receive the unassigned θ-role of (9b) and (12b), C does not count as a proper intervener for movement of the embedded subject. We will leave the choice between these two approaches for another occasion. Suffice it to say that either of them correctly allows movement of an embedded subject to a θ-position in control configurations. In sum, Nunes’s (2007, 2010) analysis makes it clear that the lack of passivization of subject-control verbs is not at all a problem for the MTC. Quite the opposite!
This account of Visser’s generalization in fact provides an answer for another of Landau’s criticisms. Landau (2003: 488) claims that the MTC does not seem able to account for the crosslinguistic generalization that infinitival complementizers are found in control structures, but not in raising structures. This empirical generalization is illustrated by the contrast in Hebrew in (21) below involving the verb xadal ‘stop, cease,’ which is ambiguous between a control and a raising verb. Example (21a) involves an animate subject and the infinitival complementizer me- is possible, whereas (21b) has an inanimate subject and me- is blocked. As Landau points out, given the connection between infinitival complementizers and control, the contrast in (21) is to be expected, for in general only animate DPs can be controllers.
(21) Hebrew (Landau 2003):
a. Rina xadla (me-)le’acben et Gil
Rina stopped (from-)to-irritate ACC Gil
‘Rina stopped irritating Gil’
b. Ha-muzika ha-ro’ešet xadla (∗me-)le’acben et Gil
The-music the-noisy stopped (from-)to-irritate ACC Gil
‘The loud music stopped irritating Gil’
Nunes’s (2007, 2010) proposal reviewed above provides a straightforward account of this generalization. If an infinitival clause has a CP layer, its subject will not be able to undergo A-movement for agreement/case purposes due to the intervention of C (a φ-feature carrier). Hence, standard raising constructions are incompatible with (φ-feature-bearing) complementizers (but see section 5.2.3 for further discussion). By contrast, if the movement is θ-related, C does not count as an intervener; hence control structures may involve an overt complementizer.

The proposal reviewed above also provides an answer to a related challenge posed by van Craenenbroeck, Rooryck, and van den Wyngaerd (2005). Their reasoning is the following. If ‘John’ can move in the control structure in (22a) because it does not have its case feature checked, why can it not move in (22b) on a par with the raising derivation in (22c)?

(22) a. [John_i tried [t_i to win]]
     b. *[John_i is important [t_i to win]]
     c. [John_i is likely [t_i to win]]
Again, the fact that the three sentences above can be derived through A-movement does not entail that the relevant movements have the same motivations (and restrictions) or that the structural configurations are kept constant. In the case of (22), movement of ‘John’ is sanctioned by θ-reasons in (22a), but by φ-agreement reasons in (22b) and (22c). Moreover, under standard assumptions, the control and the impersonal constructions in (22a) and (22b) involve CP infinitivals, whereas the raising construction in (22c) involves a TP infinitival, as respectively shown in (23) below. Once these points are disentangled, it is easy to see that movement of ‘John’ is legitimate in (23a) (the φ-features of C do not induce an intervention effect for θ-related movements) and in (23c) (there is no intervener bearing φ-features), but not in (23b), for the φ-features of C induce a minimality effect.

(23) a. [vP v [VP tried [CP C [TP John to win]]]]       movement of ‘John’: OK
     b. [TP is-T important [CP C [TP John to win]]]    movement of ‘John’: *
     c. [TP is-T likely [TP John to win]]              movement of ‘John’: OK
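The reasoning behind (23) can be restated as a small Python sketch. This is only an illustrative toy under the assumptions of this section, not part of Nunes's proposal: the names `Head` and `movement_is_licit` are invented here, and the function abstracts away from every condition on movement other than intervention by φ-features.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Head:
    """A head crossed by a movement path (hypothetical, illustrative type)."""
    label: str
    has_phi: bool  # True if the head bears phi-features


def movement_is_licit(trigger: str, crossed: List[Head]) -> bool:
    """Return True if A-movement is not blocked by phi-feature minimality.

    trigger: "theta" for theta-related movement (control),
             "phi" for movement driven by phi-agreement/case (raising).
    crossed: heads lying between the launching site and the target.
    """
    if trigger not in ("theta", "phi"):
        raise ValueError("trigger must be 'theta' or 'phi'")
    if trigger == "theta":
        # The phi-features of C are oblivious to theta-related movement.
        return True
    # Phi-driven movement is blocked by any crossed phi-bearing head.
    return not any(head.has_phi for head in crossed)


C = Head("C", has_phi=True)

# (23a) control: theta-related movement across C
tried = movement_is_licit("theta", [C])
# (23b) impersonal: phi-driven movement across C
important = movement_is_licit("phi", [C])
# (23c) raising: phi-driven movement out of a bare TP, no C crossed
likely = movement_is_licit("phi", [])

print(tried, important, likely)  # True False True
```

The three results mirror the OK/*/OK pattern in (23): what distinguishes the cases is the trigger of the movement together with the presence or absence of a φ-bearing C, not A-movement as such.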
Let us now consider how this proposal can be extended to impersonal passives, which have also been claimed to offer deadly counter-evidence to the MTC.

5.2.2 Impersonal passives

Kiss (2005) claims that German impersonal passives pose two types of problems for the MTC. The first one involves contrasts such as the one in (24) below. (24a) shows that, just as we saw in English, passivization of an embedded subject in a subject-control structure is disallowed in German. In turn, (24b) is taken to show that, if no movement takes place, the matrix subject-control verb can appear in the passive voice, yielding an impersonal construction. Interestingly, however, the interpretation of (24b) is that the implicit argument of the matrix verb controls the external argument of the embedded verb.

(24) German (Kiss 2005):
  a. *Der Mann wurde zu tanzen gewünscht
     The man PASS.AUX.3SG to dance wished
  b. Es wurde gewünscht zu tanzen
     It PASS.AUX.3SG wished to dance
The second potential counter-argument has a familiar format. Although a passivized subject-control verb cannot take an impersonal passive for a complement, as seen in (25a), a raising verb can, as shown in (25b).

(25) German (Kiss 2005):
  a. *Es wurde gewünscht getanzt zu werden
     It PASS.AUX.3SG wished danced to PASS.AUX.INF
     ‘Somebody wished to dance’
  b. Es scheint getanzt zu werden
     It seems danced to PASS.AUX.INF
     ‘It seems that someone is dancing’
It seems to us that, despite their intrinsic interest, the contrasts in (24) and (25) are more relevant to a proper analysis of (impersonal) passives, than to control per se. In other words, before analyzing (24) and (25), one must first establish the relevant property that languages like German have that allows a simple impersonal passive such as (26) below. It is beyond the scope of this volume to provide such an analysis. For concreteness, we will assume the gist of Baker, Johnson, and Roberts’s (1989) proposal and show that the analysis of Visser’s generalization reviewed in section 5.2.1 also accounts for contrasts such as (24) and (25).
(26) German (Jaeggli 1986):
     Es wurde getanzt
     It was danced
     ‘There was dancing’
Baker, Johnson, and Roberts propose that the passive morpheme is a clitic of sorts which is assigned the external θ-role of the predicate and is doubled either by a “by-phrase” or by an empty category (IMP). According to them, the contrast between English and German with respect to impersonal passives is due to the different case-licensing possibilities the passive morpheme has in each language: accusative in English and accusative or nominative in German. Adapting Baker, Johnson, and Roberts’s proposal and updating it in current parlance, we take the passive morpheme to be a type of light verb with one distinctive property: its external argument may be null (IMP) or realized as a by-phrase.⁵ As for the difference between English and German, we assume that in English IMP is licensed by structural (accusative) case, whereas in German IMP can be licensed by either structural or inherent case. Under this view, the impersonal passive in (26) is to be represented as in (27) (with English words), where -en assigns the external θ-role and inherent case to IMP.

(27) [it was [vP IMP [v’ -en [VP danced]]]]
Assuming an analysis along the lines of (27), let us now return to (24). Example (24a) is basically subject to the same analysis applied to English in section 5.2.1. That is, given the configuration in (28) (with English words), movement of the embedded subject to the outer Spec of -en for purposes of φ-agreement (see footnote 5) is blocked by the φ-features of C.

(28) [vP IMP [v’ -en [VP wished [CP C [TP [the man] to dance]]]]]    movement of ‘the man’: *
One could ask what prevents the embedded subject from moving to the matrix [Spec, vP] to receive the external θ-role assigned by -en in a derivation without IMP. The answer should be: “Nothing!” for the φ-features of C are oblivious to θ-related movements. However, if -en assigns its external θ-role to an overt DP, it gets morphologically realized as a by-phrase, which is not the case in (24a). In other words, this hypothesized derivation is what actually underlies impersonal sentences such as (29), as sketched in (30) (with English words).

(29) German (Kiss 2005):
     Es wurde von dem Mann gewünscht, zu dem Treffen zu kommen
     It PASS.AUX.3SG by the man wished to the meeting to come
     ‘The man wished to join the meeting’

(30) [it was [vP [the man]_i [v’ -en [VP wished [CP C [TP t_i to come to the meeting]]]]]]    movement of ‘the man’: OK

⁵ From this perspective, the derivation of a standard passive such as (i) below proceeds along the lines of (ii). Notice that movement of the object to [Spec, TP] first stops in the outer Spec of vP (cf. [iib]). This intermediate step circumvents a potential minimality violation induced by IMP in the inner [Spec, vP], as the two Specs are equidistant.

  (i) John was arrested

  (ii) a. [vP IMP [v’ -en [VP arrested John]]]
       b. [vP John_i [v’ IMP [v’ -en [VP arrested t_i]]]]
       c. [TP was-T [vP John_i [v’ IMP [v’ -en [VP arrested t_i]]]]]
       d. [TP John_i was-T [vP t_i [v’ IMP [v’ -en [VP arrested t_i]]]]]
As for (24b), its derivation proceeds as in (31) below (with English words). IMP is generated in the embedded clause, where it receives the external θ-role of the embedded predicate, and further computations yield the structure in (31a). Notice that (31a) is the typical configuration for subject control to obtain: the matrix light verb (the passive -en) still has to assign its external θ-role and the embedded subject is still active for purposes of A-movement as it has not checked its case yet. The embedded subject then moves to [Spec, vP], as shown in (31b), where it receives another θ-role and inherent case from -en. Crucially, movement of IMP in (31b) is triggered by θ-considerations and the φ-features of C do not render it a proper intervener for such a movement.

(31) a. [vP -en [VP wished [CP C [TP IMP to dance]]]]
     b. [vP IMP_i [v’ -en [VP wished [CP C [TP t_i to dance]]]]]    movement of IMP: OK
Let us now examine the contrast in (25), starting with the grammatical raising construction in (25b). The derivation starts by building an impersonal passive, as shown in (32a) below (with English words), where IMP receives inherent case from -en. Further computations then assemble the infinitival TP in (32b), which must have its EPP-feature checked. Assuming that inherently case-marked elements are inert for purposes of A-movement (see e.g., McGinnis 1998; Hornstein and Nunes 2002; and Rezac 2004), IMP becomes inert after it receives inherent case and cannot move to check the EPP in (32b). Expletive insertion solves this problem, as shown in (32c). After (32d) is built, the expletive then moves to check the EPP and the φ-features of the matrix T, yielding the convergent result in (32e). No minimality issue arises, as there is no CP layer in a standard raising configuration.

(32) a. [vP IMP [v’ -en [VP danced]]]
     b. [TP to_EPP be [vP IMP [v’ -en [VP danced]]]]
     c. [TP it to be [vP IMP [v’ -en [VP danced]]]]
     d. [TP T [VP seems [TP it to be [vP IMP [v’ -en [VP danced]]]]]]
     e. [TP it_i T [VP seems [TP t_i to be [vP IMP [v’ -en [VP danced]]]]]]
Finally, let us see what goes wrong with (25a). Its derivation proceeds in an identical fashion to the derivation of (25b) until the point when (33a) below is formed (cf. [32c]). Further applications of merge then yield (33b). The matrix -en in (33b) has to assign its external θ-role. The embedded subject cannot move to receive this θ-role because it is an expletive. In turn, IMP in (33b) is a potential θ-role bearer but it has already received inherent case from the lower -en and has become inert for purposes of A-movement. In addition, ‘it’ intervenes between the matrix and the embedded [Spec, vP]. So the matrix -en can only assign its external θ-role if another IMP is merged in its Spec, as represented in (33c). An unsolvable problem then shows up in (33d), after the matrix T is introduced in the derivation. The embedded expletive still has its case unchecked but it cannot enter into an agreement relation with the matrix T due to the intervention of the φ-features of C; hence the unacceptability of (25a).

(33) a. [TP it to be [vP IMP [v’ -en [VP danced]]]]
     b. [vP -en [VP wish [CP C [TP it to be [vP IMP [v’ -en [VP dance]]]]]]]
     c. [vP IMP [v’ -en [VP wish [CP C [TP it to be [vP IMP [v’ -en [VP dance]]]]]]]]
     d. [TP T [vP IMP [v’ -en [VP wish [CP C [TP it to be [vP IMP [v’ -en [VP dance]]]]]]]]]    agreement between matrix T and ‘it’: *
There are a couple of details to spell out in the approach outlined above, such as the nature of Baker, Johnson, and Roberts’s (1989) empty category IMP, postulated to represent the external argument in passives.⁶ But we would like to emphasize that such details have to do with the ultimate analysis of passives and not directly with obligatory control. The important point to bear in mind is that, upon close inspection, the puzzling contrasts in (24) and (25) prove to be amenable to a streamlined MTC approach. Rather than being fatal counter-evidence to the MTC, the data involving impersonal passives brought up by Kiss (2005) actually turn out to lend support to the MTC and the approach to Visser’s generalization in terms of minimality discussed in section 5.2.1.

⁶ See Baker, Johnson, and Roberts (1989: 228–229) for some remarks on the similarities and differences between IMP and arbitrary PRO.

5.2.3 Finite control vs. hyper-raising

Let us now return to the contrast between passivization involving finite control and hyper-raising in Brazilian Portuguese, as illustrated in (34).

(34) Brazilian Portuguese:
  a. *Os meninos foram ditos que não fizeram a tarefa
     The boys were said.MASC.PL that not did.PL the homework
     ‘It was said that the boys didn’t do their homework’
  b. Os meninos parecem que não fizeram a tarefa
     The boys seem.PL that not did.PL the homework
     ‘It seems that the boys didn’t do their homework’
From the perspective of the approach to Visser’s generalization reviewed in section 5.2.1, the unacceptability of (34a) is not surprising. Given the configuration in (35) below (with English words), movement of the embedded subject for purposes of φ-agreement is blocked by the intervening φ-features of C. The unexpected case is the hyper-raising construction in (34b), which also involves A-movement for φ-agreement purposes across a φ-feature-bearing C, as represented in (36). The question then is why the movement depicted in (36) does not yield a minimality effect.

(35) [vP [the boys] [v’ pro [v’ -en [VP said [CP that [TP t didn’t do the homework]]]]]]    movement of ‘the boys’: *

(36) [TP [the boys] T [VP seem [CP that [TP t didn’t do the homework]]]]
Nunes (2007, 2008a, 2010) proposes that the contrast between (34a) and (34b) is related to an interesting correlation between movement of the embedded subject and movement of the embedded clause. As shown in (37)–(40), movement of the embedded subject for purposes of φ-agreement is possible just in case the embedded CP cannot move.

(37) Brazilian Portuguese (Nunes 2008a):
  a. Parece [que os meninos fizeram a tarefa]
     Seems that the boys did the homework
     ‘It seems that the boys did their homework’
  b. *[[Que os meninos fizeram a tarefa]_i parece t_i]
     That the boys did the homework seems
     ‘It seems that the boys did the homework’
  c. [[Os meninos]_i parecem que t_i fizeram a tarefa]
     The boys seem that did the homework
     ‘The boys seem to have done the homework’

(38) Brazilian Portuguese (Nunes 2008a):
  a. Acabou [que os estudantes viajaram mais cedo]
     Finished that the students traveled more early
     ‘It turned out that the students traveled earlier’
  b. *[[Que os estudantes viajaram mais cedo]_i acabou t_i]
     That the students traveled more early finished
     ‘It turned out that the students traveled earlier’
  c. [[Os estudantes]_i acabaram que t_i viajaram mais cedo]
     The students finished that traveled more early
     ‘The students ended up traveling earlier’

(39) Brazilian Portuguese (Nunes 2008a):
  a. Periga [que aqueles funcionários vão ser demitidos]
     Is-in-danger that those employees go be fired
  b. *[[Que aqueles funcionários vão ser demitidos]_i periga t_i]
     That those employees go be fired is-in-danger
     ‘Those employees are in danger of being fired’
  c. [[Aqueles funcionários]_i perigam que t_i vão ser demitidos]
     Those employees are-in-danger that go be fired
     ‘Those employees are in danger of being fired’

(40) Brazilian Portuguese (Nunes 2008a):
  a. Não foi dito/mencionado [que os meninos fizeram a tarefa]
     Not was said/mentioned that the boys did the homework
     ‘It was not said/mentioned that the boys did their homework’
  b. [[Que os meninos fizeram a tarefa]_i não foi dito/mencionado t_i]
     That the boys did the homework not was said/mentioned
     ‘That the boys did their homework was not said/mentioned’
  c. *[[Os meninos]_i não foram ditos/mencionados que t_i fizeram a tarefa]
     The boys not were said/mentioned that did the homework
     ‘It was not said/mentioned that the boys did their homework’
Again, the contrast between (40b) and (40c) follows straightforwardly. If the embedded C counts as an intervener for φ-related movement, ruling out (40c) (cf. [35]), it is not surprising that its projection can indeed undergo movement for φ-agreement purposes, as in (40b).⁷ The challenge is to determine what renders the embedded C in (37b)/(38b)/(39b) inert for purposes of φ-agreement, thereby freezing movement of the embedded CP and freeing movement of the embedded subject (cf. [37c]/[38c]/[39c]).

Nunes (2007, 2008a, 2010) argues that this issue is related to the well-known fact that English experiencers in raising constructions do not block movement (cf. [41a]/[42a] below), despite the fact that they arguably c-command into the raising domain, inducing principle-C effects (cf. [41b]/[42b]). Under the assumption that the experiencers in (41) and (42) are assigned inherent case by the raising verb, they become immobile for A-purposes and do not count as proper interveners for the movement of ‘Mary’ in (41a) and (42a).

(41) a. Mary_i seems to him [t_i to be nice]
     b. *It seems to him_i that John_i is nice

(42) a. Mary_i struck him [t_i as a fool]
     b. *It struck him_i that John_i was a fool
Returning to (37b)/(38b)/(39b), Nunes proposes that verbs like parecer ‘seem,’ acabar ‘turn out,’ and perigar ‘be on the verge of’ in Brazilian Portuguese assign inherent case to the head of their CP complements. Once C is assigned inherent case, it should behave like the experiencers of (41) and (42). In other words, it becomes inert for purposes of φ-agreement, which accounts for the immobility of the CP. In turn, if C is inert for φ-agreement purposes, it does not block movement of the embedded subject, as sketched in (43), allowing for hyper-raising.⁸

(43) [TP DP_i T [VP parece/acabou/periga [CP que_{inherent case} [TP t_i ...]]]]    movement of DP: OK
     seems/turned out/is on the verge of that

⁷ Alternatively, Nunes (2007) has proposed an account of the paradigm in (37)–(40) based on Hornstein’s (2009) reinterpretation of Chomsky’s (1964) A-over-A condition in terms of paths. The idea is that if the embedded CP and the embedded subject can both undergo A-movement to participate in a given φ-agreement relation, movement of CP blocks movement of the embedded subject as it defines a shorter path towards the targeted specifier. We will leave a comparison between the A-over-A and the c-command approaches to another occasion.

⁸ As observed by Nunes (2008a), this proposal is able to accommodate some micro-variation among speakers. For instance, (39c) is not as acceptable as (37c) for some speakers. Given that inherent case is a lexical property that is to some extent idiosyncratic, variation across speakers with respect to the lexical idiosyncrasies of specific impersonal predicates is therefore unsurprising. Recall also that finite Ts in Brazilian Portuguese may be φ-incomplete (see section 4.4). The ungrammaticality of the passive construction in (40c) thus indicates that C counts as an intervener for φ-purposes regardless of whether it is φ-complete or φ-incomplete (see Nunes 2008a for discussion). Only when it receives inherent case does it become transparent (the same considerations apply to the impersonal constructions in [46] and [47] below).

It is also worth pointing out that assigning inherent case to C is a necessary, but not sufficient, condition for hyper-raising to be permitted. Given that the embedded clause of (ia) below is immobile, it is arguably the case that ‘seem’ in English also assigns inherent case to its complement CP. However, hyper-raising is not allowed in English (cf. [ib]), as is well known. The relevant difference between English and Brazilian Portuguese is that finite Ts assign case to their subjects obligatorily in English, but optionally in Brazilian Portuguese (see Ferreira 2000, 2004, 2009; Rodrigues 2004, 2007; and Nunes 2008a). Thus, even though inherent-case assignment to the embedded C in (ib) makes it transparent for purposes of A-movement, the embedded subject has already checked/valued its case and is inactive for A-movement purposes. By contrast, in Brazilian Portuguese the embedded subject may be active if the embedded finite T is associated with an incomplete set of φ-features (see section 4.4).

  (i) a. *[[that John left]_i seems t_i]
      b. *[John_i seems [that t_i left]]

Nunes presents two pieces of evidence for this proposal. The first one involves the contrast between (37) and (44), where ‘parecer’ takes a small clause as its complement.

(44) Brazilian Portuguese (Nunes 2008a):
  a. Parece óbvio que eles viajaram
     Seems obvious that they traveled
     ‘It seems obvious that they traveled’
  b. Que eles viajaram parece óbvio
     That they traveled seems obvious
     ‘That they traveled seems obvious’
  c. *Eles parecem óbvios que viajaram
     They seem obvious that traveled
     ‘It seems obvious that they traveled’

In (44) CP is not an argument of parecer ‘seem’ but of óbvio ‘obvious.’ Thus, ‘parecer’ cannot assign inherent case to CP and the embedded C is active for purposes of φ-agreement relations. Accordingly, CP can move (cf. [44b]) and hyper-raising is blocked (cf. [44c]) due to the intervention of C, as sketched in (45).

(45) [TP DP_i T [VP parece [SC óbvio [CP que [TP t_i ...]]]]]    movement of DP: *
     seems obvious that

The second piece of evidence regards the paradigm in (46)–(47).
(46) Brazilian Portuguese (Nunes 2008a):
  a. É fácil/difícil (d)esses professores elogiarem os alunos
     Is easy/difficult of-these teachers praise.3PL the students
     ‘It’s easy/hard for these teachers to praise the students’
  b. Esses professores são fáceis/difíceis *(de) elogiarem os alunos
     These teachers are easy/difficult of praise.3PL the students
     ‘These teachers often/rarely praise the students’

(47) Brazilian Portuguese (Nunes 2008a):
  a. É bem provável/lamentável (*d)os professores terem elogiado o diretor
     Is very probable/regrettable of-the teachers have.3PL praised the director
  b. *Os professores são bem prováveis/lamentáveis de terem elogiado o diretor
     The teachers are very probable/regrettable of have.3PL praised the director
     ‘It is very likely/regrettable that the teachers praised the director’
Examples (46a) and (47a) show that impersonal predicates such as ‘to be easy/hard’ in Brazilian Portuguese allow the dummy preposition de ‘of’ to precede their infinitival complements, whereas predicates such as ‘to be probable/regrettable’ do not. In turn, (46b) and (47b) show that only the predicates that license the dummy preposition admit hyper-raising. Importantly, hyper-raising can only take place in the presence of the dummy preposition (cf. [46b]). Nunes takes ‘de’ to be a realization of inherent case, which is (optionally) assigned by some impersonal predicates to their CP complements. If ‘de’ is not present or is not licensed, movement of the embedded subject is blocked by C, as shown in (48) below. By contrast, if ‘de’ is present, C is assigned inherent case, thereby becoming inert for A-movement, and does not block movement of the embedded subject, as sketched in (49). As we should expect given the present analysis, movement of the infinitival is possible just in case ‘de’ is not present, as illustrated in (50).

(48) [TP DP_i T is easy/difficult/probable/regrettable [CP C [TP t_i ...]]]    movement of DP: *

(49) [TP DP_i T is easy/difficult de [CP C_{inherent case} [TP t_i ...]]]    movement of DP: OK

(50) Brazilian Portuguese (Nunes 2008a):
     (*D)esses professores elogiarem alguém é (bem) fácil/difícil
     Of-these teachers praise.3PL someone is well easy/difficult
     ‘These teachers easily/rarely praise someone’
To wrap up, the contrast between passivization of finite-control structures and hyper-raising in Brazilian Portuguese (cf. [34a] vs. [34b]) is due to an independent property, namely, the fact that an element marked with inherent case is inert for A-relations. Once this independent fact is taken into account, finite-control and hyper-raising structures in Brazilian Portuguese behave exactly as the MTC predicts.
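The correlation between CP movement and hyper-raising discussed in this section can be summarized in a short Python sketch. It is a deliberately simplified toy under the assumptions reviewed above: the class and function names are hypothetical, and the two boolean properties stand in for inherent-case assignment to C and for case checking by the embedded finite T.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiniteComplement:
    """An embedded finite CP (hypothetical, illustrative type)."""
    c_has_inherent_case: bool   # the matrix predicate assigned inherent case to C
    subject_case_checked: bool  # the embedded T already checked the subject's case


def cp_can_move(clause: FiniteComplement) -> bool:
    # A C marked with inherent case is inert, so its CP is frozen in place.
    return not clause.c_has_inherent_case


def subject_can_hyper_raise(clause: FiniteComplement) -> bool:
    # C must be inert (so it does not intervene) and the subject must
    # still be active, i.e. its case must not have been checked.
    return clause.c_has_inherent_case and not clause.subject_case_checked


# 'parecer' in Brazilian Portuguese, as in (37): inherent case on C,
# embedded T with an incomplete phi-set.
parecer = FiniteComplement(c_has_inherent_case=True, subject_case_checked=False)
# Passive 'dito', as in (40): no inherent case on C.
dito = FiniteComplement(c_has_inherent_case=False, subject_case_checked=False)
# English 'seem' with a finite complement: inherent case on C, but the
# embedded T obligatorily checks the subject's case.
english_seem = FiniteComplement(c_has_inherent_case=True, subject_case_checked=True)

for name, clause in [("parecer", parecer), ("dito", dito), ("seem", english_seem)]:
    print(name, cp_can_move(clause), subject_can_hyper_raise(clause))
```

On these assumptions, the first configuration allows hyper-raising but not CP movement, the second allows CP movement but not hyper-raising, and the third allows neither, which is the pattern reported for (37), (40), and English finite complements of ‘seem,’ respectively.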
5.3 Nominals and control

In this section, we discuss an argument that Culicover and Jackendoff (2001) presented against the MTC, based on another instantiation of the contrast between control and raising: control from within nominals is allowed in English, but raising into nominals is not, as illustrated in (51).

(51) a. John’s attempt to leave
     b. *John’s appearance to leave
By now, the reader can already anticipate the refrain that accompanies this line of objection: “If control involves A-movement, why doesn’t it pattern with raising?” But Culicover and Jackendoff’s criticism purports to go beyond the specific analysis of (51), as such contrasts are taken to favor a semantics-based analysis. According to them (pp. 501–502), these contrasts “raise no particular problem for theories of control based on argument structure or conceptual structure. In these theories an implicit argument is precisely a semantic/functional argument that has no NP corresponding to it in phrase structure.”

It is worth noticing the different empirical predictions this semantics-based approach and the MTC make with respect to crosslinguistic variation. Under the plausible assumption that the argument structure and the conceptual structure associated with ‘attempt’ and ‘appearance’ are kept constant across languages, contrasts such as the one in (51) should be universal under the semantics-based approach. Thus, if it turns out that there are languages where raising into nominals coexists with control from within nominals, the semantics-based approach will find itself in a very uncomfortable position, as it is incompatible with such variation.

The MTC, on the other hand, is not committed to the universality of contrasts such as (51). From the perspective of the MTC, the two constructions in (51) must be derived by A-movement of the embedded subject motivated by θ-reasons in (51a) and φ-agreement/case reasons in (51b). However, it should not be surprising if the syntactic configurations involved in control nominals and raising nominals vary across languages, yielding contrasts such as (51) in some languages but not in others. The ungrammaticality of (51b), for instance, indicates that its syntactic configuration in English must be such that it prevents movement of the embedded subject, but from this one cannot conclude that every language will display the same syntactic configuration in this domain. The question then is not whether the MTC is compatible with crosslinguistic variation with respect to contrasts such as (51) (it is!), but whether it can account for it.

Below we discuss two different cases that bear on the issue of crosslinguistic variation: finite control into indicative noun-complement clauses in Brazilian Portuguese and raising into nominals in Hebrew. After showing that these structures can be adequately handled by the MTC but are problematic for the semantics-based approach outlined by Culicover and Jackendoff, we will then suggest an analysis for the English contrast in (51).

5.3.1 Finite control into noun-complement clauses in Brazilian Portuguese

Consider the contrast in (52) in Brazilian Portuguese, where the embedded null subject is licensed when its clause is embedded under afirmação ‘statement,’ but not under probabilidade ‘probability.’

(52) Brazilian Portuguese (Nunes 2009b):
  a. A afirmação d[o João]_i de [que Ø_i fez o trabalho] é falsa
     The affirmation of-the João of that did the job is false
     ‘João’s statement that he did the job is false’
  b. *A probabilidade d[o João]_i de [que Ø_i tenha feito o trabalho] é alta
     The probability of-the João of that has.SUBJ done the job is high
     *‘João’s probability that he did the job is high’
From the perspective of the MTC, the contrast in (52) follows straightforwardly. Given that nominals in Brazilian Portuguese only assign inherent case, movement makes it possible for the embedded subject to receive inherent case from ‘afirmação’ in (52a), but not in (52b) as ‘probabilidade’ does not have an additional θ-role to assign. Notice that the contrast in (52) alone is not enough to make a case against a semantic approach. After all, the contrast in (52) is replicated in (53) below, where the embedded subject is not null. Under a semantics-based approach, the differences between the argument or conceptual structures of ‘afirmação’ and ‘probabilidade’ should suffice to account for both (52) and (53).

(53) Brazilian Portuguese (Nunes in press):
  a. A afirmação do João de [que a Maria fez o trabalho] é falsa
     The affirmation of-the João of that the Maria did the job is false
     ‘João’s claim that Maria did the job was false’
  b. *A probabilidade do João de [que a Maria tenha feito o trabalho] é alta
     The probability of-the João of that the Maria has.SUBJ done the job is high
     *‘João’s probability that Maria did the job is high’
However, there are several aspects that show that the derivation of (52a), for instance, crucially hinges on independent syntactic properties of Brazilian Portuguese. Recall from section 4.4 that referential null subjects in (colloquial) Brazilian Portuguese behave like A-traces rather than null pronominals and this was attributed to its finite Ts being able to host an incomplete φ-set (see Ferreira 2000, 2004, 2009 and Nunes 2008a). In this regard, it is worth pointing out that the null subject of (52a) displays the same behavior as the other instances of referential null subjects in Brazilian Portuguese. For instance, if there is no antecedent for a null subject inside a noun-complement clause, the sentence becomes unacceptable, as shown in (54).⁹

(54) Brazilian Portuguese (Nunes 2009b):
  a. *A hipótese de [que Ø vai ser eleito] é de rir
     The hypothesis of that goes be elected is of laugh
     ‘The hypothesis that he’s going to be elected is laughable’
  b. *A afirmação de [que Ø fez o trabalho] é falsa
     The affirmation of that did the job is false
     ‘The statement that he did the job is false’
Second, as Nunes (2009b) observes, there is an interesting correlation in Brazilian Portuguese between the presence of a dummy preposition preceding the noun-complement clause and the licensing of the embedded null subject. In general, noun-complement clauses may be optionally preceded by the dummy preposition ‘de,’ as shown in (55) below. However, an intriguing contrast arises when the noun-complement clause involves a null subject: if the null subject is an expletive, ‘de’ remains optional (cf. [56]); on the other hand, if it is referential, ‘de’ becomes obligatory (cf. [57]).¹⁰

(55) Brazilian Portuguese (Nunes 2009b):
  a. A hipótese (de) [que a Terra é chata] não foi esquecida
     The hypothesis of that the Earth is flat not was forgotten
     ‘The hypothesis that the Earth is flat was not forgotten’
  b. Ele comentou a afirmação do João (de) [que a Ana era inocente]
     He commented the affirmation of-the João of that the Ana was innocent
     ‘He commented on João’s statement that Ana was innocent’

(56) Brazilian Portuguese (Nunes 2009b):
  a. A hipótese do João (de) [que Ø_expl não existe movimento-wh nessa língua] parece estar errada
     The hypothesis of-the João of that not exists wh-movement in-this language seems be wrong
     ‘João’s hypothesis that there doesn’t exist wh-movement in this language seems to be wrong’
  b. A afirmação (de) [que Ø_expl nunca chove aqui] é exagerada
     The affirmation of that never rains here is exaggerated
     ‘The claim that it never rains here is an exaggeration’

(57) Brazilian Portuguese (Nunes 2009b):
  a. A hipótese d[o João]_i *(de) [que Ø_i vai ser eleito] é de rir
     The hypothesis of-the João of that goes be elected is of laugh
     ‘João’s hypothesis that he’s going to be elected is laughable’
  b. A afirmação d[o João]_i *(de) [que Ø_i fez o trabalho] é falsa
     The affirmation of-the João of that did the job is false
     ‘João’s statement that he did the job is false’

⁹ See Nunes (2009b) for further evidence that referential null subjects within noun-complement clauses in Brazilian Portuguese also behave like A-traces.

¹⁰ In (55) and (56), the alternatives with the preposition are generally associated with formal style and written language. However, there is no stylistic difference when a referential null subject is involved (cf. [57]), for the absence of the preposition yields gibberish (see Nunes 2009b for further discussion).
Assuming that (57) is also a case of finite control (therefore, A-movement under the MTC), one wonders why the referential null subjects/A-traces require the presence of the preposition. Building on Stowell (1981), Nunes (2009b) uses contrasts such as (58) below to argue that, in Brazilian Portuguese, the presence or absence of ‘de’ in these constructions respectively signals whether we are dealing with a true complement structure or an appositive of sorts. More specifically, Nunes takes ‘de’ in these constructions in Brazilian Portuguese to be the realization of the inherent case assigned by the subcategorizing noun to the embedded clause. If ‘de’ encodes a noun-complement configuration in virtue of realizing inherent case, its presence in (58) yields unacceptable results as these sentences involve a predication configuration.
(58) a. A hipótese é (∗de) que o João tenha feito isso
        The hypothesis is of that the João has done this
        ‘The hypothesis is that João did this’
     b. A alegação é (∗de) que a Maria viaja muito
        The allegation is of that the Maria travels much
        ‘The allegation is that Maria travels too much’
With this overall picture in mind, the derivation of the version of (57b) with the preposition involves the steps sketched in (59) (with English words).
(59) a. Applications of merge and move:
        CP = [that João T[N] did this]
        N = affirmation
     b. Merger between N and CP + inherent-case assignment:
        [affirmation [that João T[N] did this]_{inherent case}]
     c. Movement of the embedded subject + θ-role assignment:
        [João_{inherent case} affirmation [that João T[N] did this]_{inherent case}]
     d. Movement of the head noun:11
        [affirmation [João_{inherent case} affirmation [that João T[N] did this]_{inherent case}]]
     e. Deletion of copies in the phonological component:
        [affirmation [João_{inherent case} affirmation [that João T[N] did this]_{inherent case}]]
     f. Realization of inherent case:
        [affirmation [de João] [de that did this]]
As in the other instances of finite control in Brazilian Portuguese, the embedded T is associated with an incomplete φ-set in (59a) (see Ferreira 2000 and Nunes 2008a) and is unable to check the case of the embedded subject, which remains active for purposes of A-relations. After the noun and the CP undergo set-merge (in the sense of Chomsky 2000), the CP is assigned inherent case in virtue of the θ-role it receives from the noun (cf. [59b]). Next, the embedded subject moves, receives the external θ-role associated with the subcategorizing noun and is also marked with inherent case (cf. [59c]). Finally, both inherent cases are realized as ‘de’ in the morphological component (cf. [59f]).
By contrast, the version of (57b) without ‘de’ has no convergent derivation at its disposal. Given that the absence of ‘de’ signals that the embedded CP is an adjunct rather than a complement, the noun and CP must then undergo pair-merge (in the sense of Chomsky 2000). However, if the CP becomes an adjunct, the embedded subject cannot move out of it as it would induce a CED violation. Thus, the only relevant possibility to be considered is the one in which the embedded subject undergoes sideward movement before CP becomes an adjunct (see sections 4.5.1.2 and 4.5.1.3), as illustrated in (60) (with English words).
11 The linear order of (57b) indicates that, after ‘o João’ moves to the relevant θ-position associated with ‘afirmação,’ the latter moves to a higher position. The nature of such positions is orthogonal to the current discussion.
(60) a. Applications of merge and move:
        CP = [that João T[N] did this]
        N = affirmation
     b. Sideward movement (copy + merge) + θ-role assignment:
        CP = [that João T[N] did this]
        NP = [João_{inherent case} affirmation]
     c. Adjunction of CP to NP:12
        [NP [NP João_{inherent case} affirmation] [CP that João T[N] did this]]
     d. Movement of the head noun (see footnote 11):
        [affirmation [NP [NP João_{inherent case} t] [CP that João T[N] did this]]]
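Whether an output such as (60d) can be pronounced depends on whether its terminals can be put into a consistent linear order. The following Python sketch is only an illustration under simplifying assumptions of our own: terminals are plain strings, two copies of the same item count as one and the same element (spelled ‘Joao’ here, without the tilde), and linearization fails whenever the induced precedence relation is reflexive or symmetric. The function and variable names are hypothetical.

```python
from itertools import combinations


def precedence_pairs(terminals):
    """All (x, y) such that some occurrence of x precedes some occurrence of y."""
    # combinations() respects list order, so x always comes from an earlier position.
    return set(combinations(terminals, 2))


def linearization_conflicts(terminals):
    """Return (items that precede themselves, pairs that precede each other)."""
    pairs = precedence_pairs(terminals)
    reflexive = {x for (x, y) in pairs if x == y}
    symmetric = {frozenset((x, y)) for (x, y) in pairs
                 if x != y and (y, x) in pairs}
    return reflexive, symmetric


# (60d) with both copies pronounced: no chain, hence no chain reduction.
without_reduction = ["affirmation", "Joao", "that", "Joao", "did", "this"]
# (59e) after the lower copy has been deleted (case morphology omitted).
with_reduction = ["affirmation", "Joao", "that", "did", "this"]

print(linearization_conflicts(without_reduction))
print(linearization_conflicts(with_reduction))
```

With both copies pronounced, the toy check reports that ‘Joao’ precedes itself and that ‘Joao’ and ‘that’ precede each other; once the lower copy is deleted, no conflict remains.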
Nunes (2009b) argues that, although the derivational steps in (60) are licit, the final output cannot be linearized. Notice that the copies of ‘João’ in (60d) are not in a chain configuration as they do not stand in a c-command relation. Assuming that deletion of copies can only operate with chains, chain reduction (see Nunes 2004) cannot be employed in (60d). Failure to delete one of the copies of ‘João’ in (60d) in turn causes linearization problems as the system gets contradictory instructions: ‘João’ should precede and be preceded by ‘that,’ as well as precede itself (see Nunes 1999, 2004 for discussion). Note that when ‘de’ is present instead, i.e., when we have a true noun-complement structure as in (59d), the upper copy of ‘João’ c-commands and forms a chain with the lower copy, allowing chain reduction to apply in the phonological component, delete the lower copy, and circumvent potential linearization problems (cf. [59e]).
Going back to the comparison between the MTC and approaches based on argument or conceptual structure, the MTC is able to account for all the data concerning referential null subjects within noun-complement clauses in Brazilian Portuguese. Under the assumption that such subjects are A-traces, we account for why they require an antecedent and why the clauses containing them must be true complements (preceded by ‘de’) and not adjuncts (lacking ‘de’). By contrast, the empirical coverage of the semantics-based alternative is limited to (52). Short of ad hoc provisos, there seems to be no coherent way to explain the behavior of referential null subjects within noun-complement clauses in Brazilian Portuguese and their apparent requirement of a dummy preposition, based solely on the argument or conceptual structure of the relevant noun. In sum, when all pertinent data are taken into consideration, it is fair to say that the semantic account of the contrast in (52) turns out to be spurious.
12 It is immaterial for the purposes of our discussion if CP adjoins to a projection higher than NP.
5.3.2 Raising into nominals in Hebrew
The problem posed by Hebrew to semantics-based approaches to control is even stronger than the one presented by finite control into noun-complement clauses discussed above. As convincingly argued by Sichel (2007), along with standard control from within nominals, Hebrew also allows constructions that involve raising into nominals, as illustrated in (61).
(61) Hebrew (Sichel 2007):
  a. ha-nisayon Sel rina [le-hagi’a ba-zman]
     the-attempt of Rina to-arrive on-time
     ‘Rina’s attempt to arrive on time’
  b. ha-sikuyim Sel rina [le-hagi’a ba-zman]
     the-chances of Rina to-arrive on-time
     ‘Rina’s chances to arrive on time’
Evidence that (61b) does involve raising is provided by the pairs in (62)–(64) below. The a-sentences of (62)–(64) show that the control noun corresponding to ‘attempt’ imposes selectional restrictions on the DP associated with it, therefore being incompatible with inanimate elements, expletives, and idiom chunks respectively. In turn, the b-sentences show that the opposite holds of the noun corresponding to ‘chances,’ which imposes no such restrictions. Therefore, Sichel concludes, the b-sentences involve a raising noun and the element case-marked by it has raised from the embedded clause.13
13 See Sichel (2007) for further evidence and discussion.
(62) Hebrew (Sichel 2007):
  a. ∗[ha-nisayon Sel ha-te’oria lihiyot nexonot] hirgiz otanu
     the-attempt of the-theory to-be correct annoyed us
  b. [ha-sikuyim Sel ha-te’oria lihiyot nexona] kluSim le-maday
     the-chances of the-theory to-be correct.FEM.SG slim quite
     ‘The chances of the theory being correct are pretty slim’
(63) Hebrew (Sichel 2007):
  a. ∗[ha-nisayon Se ze likrot [Se-bibi yibaxer]] hifti’a otanu
     the-attempt of it to-happen that-Bibi will-be-elected surprised us
  b. [ha-sikuyim Se ze likrot [Se-bibi yibaxer]] tovim
     the-chances of it to-happen that-Bibi will-be-elected good
     ‘The chances of it happening that Bibi will be elected are good’
(64) Hebrew (Sichel 2007):
  a. ∗[ha-nisayon Sel ha-kerax le-hiSaver be-macav ka-ze] hu tipSi
     the-attempt of the-ice to-break in-situations like-this is silly
  b. [ha-sikuyim Sel ha-kerax le-hiSaver be-macav ka-ze] kluSim
     the-chances of the-ice to-break in-situations like-this slim
     ‘The chances of the ice breaking in this kind of situation are slim’ (idiomatic reading)
The fact that raising nominals are attested in Hebrew provides strong evidence against an account of the impossibility of such nominals in English in terms of argument or conceptual structure. It is reasonable to suppose that the argument and conceptual structures of these nominals are the same in the two languages; hence, the two languages should not differ with respect to raising. By contrast, the MTC is much better equipped to handle these cases as it relies on the different syntactic configurations that may underlie raising nominal constructions in Hebrew and English.
Leaving English to the next section, let us consider under what conditions raising nominal constructions should be allowed in Hebrew from the perspective of the MTC. Although both structures in (61), for instance, involve movement of the DP case-marked by the dummy preposition, the motivation is different in each case. In (61a), the movement is triggered by θ-related reasons, namely, to allow the external θ-role associated with ‘attempt’ to be assigned. Hence the incompatibility with elements that cannot bear this θ-role, such as inanimate elements (cf. [62a]), expletives (cf. [63a]), or idiom chunks (cf. [64a]). As for the raising structure in (61b), the relevant A-movement involved must be triggered by φ-agreement/case considerations. If so, there can be no intervening element bearing φ-features. In particular, there should be no CP projection intervening between the raising nominal and the embedded subject, as C hosts φ-features (see section 5.2.1).
Evidence that this conjecture is correct is provided by Sichel’s (2007) discussion of negative concord. In Hebrew, negative DPs must be licensed by clause-mate negation, as shown in (65) below. Importantly, in nominal constructions a negative embedded subject must be licensed by negation in the matrix clause and not by negation on the embedded verb, as illustrated in (66). As Sichel points out, the licensing of the embedded subject of (66a) is similar to what we see in the ECM constructions in (67), which under standard assumptions should not involve a CP layer.
(65) Hebrew (Sichel 2007):
     af talmid ∗(lo) nice’ax
     no student NEG won
     ‘No student won’
(66) Hebrew (Sichel 2007):
  a. lo he’emanti [ba-sikuyim/netiya Sel af talmid le-hitkonen]
     NEG believed-I-in the-chances/tendency of no student to-prepare
     ‘I didn’t believe in the chances/tendency of any student preparing’
  b. ∗he’emanti [ba-sikuyim/netiya Sel af talmid lo le-hitkonen]
     believed-I-in the-chances/tendency of no student NEG to-prepare
     ‘I didn’t believe in the chances/tendency of any student preparing’
(67) Hebrew (Sichel 2007):
     lo zaxarti [af talmid mitkonen]
     NEG remembered no student preparing
     ‘I didn’t remember any student preparing’
The existence of raising into nominals is therefore not surprising from the perspective of the MTC. All depends on the specific syntactic structures involved. In this sense, the contrast between control and raising in the nominal domain may shed more light on the structure of nominal expressions than on the nature of control.
5.3.3 The contrast between raising nominals and control nominals in English
Now that we have seen that raising into nominals should not be excluded as a matter of principle, let us return to the contrast in (51), repeated here in (68).
(68) a. John’s attempt to leave
     b. ∗John’s appearance to leave
The first thing to point out is that it is not exactly correct that English never allows raising into nominals. As Culicover and Jackendoff (2001, footnote 10) acknowledge, mentioning the data in (69) below, “there do exist examples that appear to be parallels of raising in nominals.” As in the case of Hebrew, the existence of data such as (69) is quite problematic for approaches to control based on argument or conceptual structure.
(69) John’s likelihood/probability of winning
A comparison between (68a) and (70) below, on the one hand, and (69), on the other, is very suggestive.
(70) ∗John’s likelihood/probability to win
The acceptable pattern is possible when the dummy preposition ‘of’ is present, which looks very similar to what happens with hyper-raising out of inflected infinitivals in Brazilian Portuguese (see section 5.2.3). Following a suggestion by Lisa Cheng (personal communication), Nunes (2010) proposes that ‘of’ in constructions like (69) is in fact the realization of the inherent case assigned to the embedded non-finite clause. The contrast between (69) and (70) now follows from the interaction between inherent case and φ-intervention discussed earlier with respect to hyper-raising in Brazilian Portuguese. The derivation of (70) involves movement of the embedded subject for φ-agreement/case reasons skipping C, which yields a minimality violation, as sketched in (71).
(71) [DP ’s [NP likelihood/probability [CP C [TP John to win]]]]
     (movement of ‘John’ across C: ∗)
As for (69), the subcategorizing nominal assigns inherent case to its complement, which is morphologically realized as ‘of.’ Once CP receives inherent case, it is no longer active for A-purposes and its head is not computed for purposes of A-minimality. Movement of the embedded subject then proceeds without problems, as illustrated in (72).14
(72) [DP ’s [NP likelihood/probability [CP C_{inherent case} [TP John winning]]]]
     (movement of ‘John’ across C: OK)
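The contrast between (71) and (72) can be restated as a small check on the heads crossed by A-movement. The Python sketch below is a toy model under assumptions of our own: each crossed head is reduced to two Boolean properties, and a φ-bearing head counts as an intervener unless its projection has received inherent case. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Head:
    label: str
    phi: bool = False            # does the head bear phi-features?
    inherent_case: bool = False  # has its projection received inherent case?


def intervenes(head):
    """A phi-bearing head blocks A-movement unless inherent case has made it inert."""
    return head.phi and not head.inherent_case


def can_raise(crossed_heads):
    """True if no crossed head counts as an intervener for A-minimality."""
    return not any(intervenes(h) for h in crossed_heads)


# (71): to-infinitival complement; C bears phi-features and has no inherent case.
path_71 = [Head("C", phi=True)]
# (72): of-gerund complement; the CP has received inherent case (realized as 'of').
path_72 = [Head("C", phi=True, inherent_case=True)]

print(can_raise(path_71))  # False
print(can_raise(path_72))  # True
```

The sketch deliberately says nothing about which nominals assign inherent case; in the text this is treated as a partly lexical matter.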
Although admittedly sketchy, this proposal accounts for the fact that only some nominals admit raising (inherent case is to some extent a lexical idiosyncrasy), and explains why a dummy preposition should be resorted to. However, it does not explain why raising into the nominal domain in English is much more restricted than what we saw in Hebrew. In particular, English does not allow raising of expletives or idiom chunks in this context, as illustrated in (73) (cf. [63b] and [64b]).
(73) a. ∗its likelihood of raining/annoying me that Jane is late
     b. ∗the shit’s likelihood of hitting the fan in these situations (Sichel 2007)
14 Alternatively, we can extend the analysis of finite control into nominals in Brazilian Portuguese (see section 5.3.1) to English. In other words, the presence of the dummy preposition ‘of’ signals a complement from which movement is allowed; conversely, the absence of the dummy preposition may indicate an adjunct to NP and then there is no way to derive the structure licitly via sideward movement (cf. [60]). Suggestive evidence for such an approach is the contrast in (i), which shows that, in predicative environments, of-gerunds behave like standard complements (cf. [iia]), as opposed to to-infinitivals (cf. [ib]/[iib]).
   (i) a. ∗The likelihood/probability was of winning
       b. The desire/attempt was to win
   (ii) a. ∗The driver is of my car
        b. The book is about Chomsky
Following Nunes (2010), we suggest that the difference between English and Hebrew has to do with the span of the relevant A-movement in each language, as sketched in (74).
(74) a. English: [DP ’s [NP N [CP C_{inherent case} [TP DP . . . ]]]]
     b. Hebrew: [NP N [ . . . [TP DP . . . ]]]
In English, the moved subject crosses not only C, but also the subcategorizing N (cf. [74a]). In Hebrew, on the other hand, the moved subject seems to occupy a position lower than the subcategorizing noun (cf. [74b]), if we are to judge by the surface order of (61)–(64). That being so, N in English should induce a minimality effect for non-referential elements similar to what we find in the A′-domain, where referential and non-referential wh-phrases sharply contrast with respect to movement across a weak island (see e.g., Rizzi 1990), as illustrated in (75).
(75) a. ∗What headway do you wonder [how PRO to make t on this project]
     b. ?What project do you wonder [how PRO to make headway on t] (Rizzi 1990)
In the case of raising, it is plausible to think that the raising nominal induces (weak) intervention effects for φ-related movements, blocking non-referential expressions from raising, because it is ultimately a φ-feature bearer. As for why referential expressions are not subject to such intervention, it is worth noting that, within NP, the subcategorizing noun functions as a predicate and not as an argument. Perhaps this is what makes it transparent for the movement of true arguments. If so, contrasts such as the ones illustrated in (76) below follow from the fact that the functional head associated with -ing is nominal in (76a), but verbal in (76b) (see e.g., Chomsky 1970 and Reuland 1983). That is, movement of the idiom chunk is blocked by the nominal -ing in (76a), but allowed by the verbal -ing in (76b).
(76) a. The cat’s being out of the bag was a big problem for the government (idiomatic reading: ∗)
     b. The cat being out of the bag was a big problem for the government (idiomatic reading: OK)
As for expletives, the contrasts in (77) below can receive the same minimality account if we assume with Rosenbaum (1967) and Hornstein and Witkos (2003), among others, that ‘it’ and ‘there’ are generated together with their “associates” (the CP complement and ‘someone’ respectively) before moving to the subject position of the gerund.15
(77) a. It/∗its seeming that we would get a raise motivated everyone to work harder
     b. There/∗there’s being someone here was surprising
Whether this general account will prove fruitful will depend, it seems to us, more on the ultimate structure of nominals per se than on the MTC. But it is worth emphasizing that this fine-grained range of (im)possible instances of raising within a single language poses the same kind of problem for semantics-based approaches of the type envisioned by Culicover and Jackendoff (2001) as crosslinguistic variation regarding raising into nominals does. By contrast, these fine-grained distinctions are very congenial to the MTC as they arguably stem from minimality issues governing movement for φ-agreement/case purposes. Furthermore, the contrast with constructions involving control nominals, which are much less diversified, is unsurprising, for the relevant movement involved in control, although being of the A-type, is of a different nature: θ-related rather than φ-related.
5.4 Obligatory control and morphological case
Let us now examine the case and agreement patterns found in control contexts in Icelandic and Basque. We focus on these two languages as their overt morphology presents interesting case/agreement correlations that have been taken to shed light on the nature of control. It has also been claimed that these correlations constitute knock-out evidence against the MTC (see e.g., Landau 2003, 2006, 2007; San Martin 2004; Sigurðsson 2008; and Bobaljik and Landau 2009). Let us then see whether the MTC is up to the challenge.
5.4.1 Quirky case and the contrast between raising and control in Icelandic
Landau (2003, 2006, 2007), Sigurðsson (2008), and Bobaljik and Landau (2009) have claimed that the fact that raising and control constructions differ with respect to the realization of quirky case constitutes a fatal problem for the MTC. As illustrated in (78) and (79) below, the matrix subject of raising constructions surfaces with the quirky dative case specified by the embedded verb (cf. [78a]/[79a]), as opposed to the matrix subject of control predicates, whose case realization is determined solely by the properties of the matrix domain (cf. [78b]/[79b]). If both raising and control involve A-movement, so the argument goes, case realization should be the same in both types of constructions.
15 Rosenbaum (1967) made this proposal only for ‘it.’ We believe there are good reasons to extend the insight to ‘there’ (see Hornstein and Witkos 2003).
(78) Icelandic:
  a. Mönnunum/∗Mennirnir virðist báðum hafa verið hjálpað
     Men-the.DAT/∗NOM seems both.DAT have been helped.DFLT
     ‘The men seem to have both been helped’ (Sigurðsson 2008)
  b. Hann/∗Honum vonast til að verða bjargað af fjallinu
     He.NOM/∗DAT hopes for to be rescued.DFLT of the-mountain
     ‘He hopes to be rescued from the mountain’ (Andrews 1990, reproduced in Bobaljik and Landau 2009)
(79) Icelandic:
  a. Strákunum er talið (hafa verið) bjargað
     The-boys.MASC.PL.DAT is.SG believed.DFLT to-have been rescued.DFLT
     ‘The boys are believed to have been rescued’ (Andrews 1990, reproduced in Bobaljik and Landau 2009)
  b. Strákarnir vonast til að verða hjálpað/∗hjálpaðir/∗hjálpuðum
     The-boys.NOM hope for to be helped.DFLT/∗PL.NOM/∗PL.DAT
     ‘The boys hope to be helped’ (Sigurðsson 1991, reproduced in Bobaljik and Landau 2009)
Following Boeckx, Hornstein, and Nunes (in press), we show below that, when the relevant properties involved in quirky-case assignment and agreement are sorted out, the MTC in fact makes the right cut with respect to contrasts such as (78) and (79).
As is well known, Icelandic has a morphologically rich case-agreement system in which structural and quirky case are associated with different agreement paradigms (for comprehensive overviews, see Sigurðsson 1991 and Thráinsson 2008). As is the case in other languages, quirky case in Icelandic displays properties of both inherent and structural case (see e.g., Zaenen, Maling, and Thráinsson 1985). Like inherent case and unlike structural case, it is associated with a θ-role and is lexically determined. On the other hand, it is unlike inherent case in that it does not render its recipient frozen for purposes of A-movement; it rather behaves like structural case in requiring an agreement relation with a φ-complete head in order to be deactivated for A-purposes. Thus, elements marked with quirky case can undergo standard A-movement in passives and ECM constructions, for instance, as respectively illustrated in (80) below. In other words, elements bearing quirky case are indeed quirky mainly from a morphological point of view, as they do not lose their morphological case under passivization (cf. [80a]) or ECM (cf. [80b]) and systematically fail to trigger agreement on the finite verb (cf. [80a]).16
(80) Icelandic (Andrews 1990, reproduced in Bobaljik and Landau 2009):
  a. Strákunum var bjargað
     The-boys.DAT.MASC.PL was rescued.DFLT
     ‘The boys were rescued’
  b. Ég tel strákunum (hafa verið) bjargað
     I believe the-boys.DAT.MASC.PL to-have been rescued.DFLT
     ‘I believe the boys to have been rescued’
Given (80), it indeed appears to be surprising from the perspective of the MTC that quirky case is preserved under raising, but not under control. But, before jumping to hasty conclusions, we should first examine in more detail how quirky-case assignment and checking obtain in a simple sentence. Assuming the structure of passives outlined in section 5.2.2, let us consider the derivation of the quirky passive sentence in (80a), for instance, as sketched in (81) (with English words).17
(81) a. V = rescued
        DP = [the boys][case:?]
     b. [VP rescued [the boys][case:DAT]]
     c. [vP IMP [v’ -en[case:?; φ:?] [VP rescued [the boys][case:DAT]]]]
     d. [vP [the boys][case:DAT] [v’ IMP [v’ -en[Case:dflt; φ:dflt] [VP rescued t]]]]
     e. [TP T[φ:?] be [vP [the boys][case:DAT] [v’ IMP [v’ -en[Case:dflt; φ:dflt] [VP rescued t]]]]]
     f. [TP [the boys][Case:DAT] [T’ T[φ:dflt] be [vP t [v’ IMP [v’ -en[Case:dflt; φ:dflt] [VP rescued t]]]]]]
Given the derivational step in (81a), the verb merges with DP and assigns quirky case to it, as shown in (81b). Such an assignment only means that the case feature of the DP has been valued as dative. Crucially, the quirky-case-marked DP is still active for purposes of A-relations and its case feature must be checked against a φ-complete probe in order to be deactivated. The next step of the derivation in (81c) introduces the passive -en, which has case and an incomplete set of φ-features (gender and number). Given that both -en and the object DP are active, they can enter into an agreement relation and the object can move to [Spec, -en], as shown in (81d) (see footnote 5). As mentioned above, when a given probe enters into an agreement relation with an element marked with quirky case, the features of the probe get default values if there is no nominative goal around (see footnote 16); hence, the features of -en in (81d) get valued as nominative, neuter, and singular.18 Notice that, as opposed to the features of -en, which are deleted (for LF purposes) once valued, as represented by the outlined characters, the case feature of the moved object remains active as it has not entered into an agreement relation with a φ-complete probe yet. This only happens when the finite T enters the derivation in (81e). Agreement between the φ-complete T and the quirky DP then sets the value of the φ-features of T as default (third-person singular) and all the uninterpretable features, including the case feature of the subject, get deleted (for LF purposes), yielding the structure in (81f), which surfaces as (80a).
Obviously, the remarks above constitute no innovative treatment of quirky case. They just spell out in Agree parlance the old intuition that quirky case has both inherent and structural characteristics (see e.g., Freidin and Sprouse 1991 and Chomsky 2000). This is, however, sufficient for us to tackle the contrast between raising and control illustrated in (78) and (79). In fact, the logic to be exploited here is no different from the one we used to account for why idiom interpretation is preserved under raising, but not under control, as illustrated in (82) (see section 3.2).
16 In addition, if a given clause involves a quirky subject and a nominative object, agreement on the finite T is determined by the nominative object but it cannot be first or second person (see e.g., Sigurðsson 1996).
17 For ease of exposition, in this section we will assume that T has already inherited the φ-features of C (see Chomsky 2008).
(82) a. The cat seems to be out of the bag (idiomatic interpretation: OK)
     b. The cat tried to be out of the bag (idiomatic interpretation: ∗)
Arguably, the DP [the cat] in (82a) acquires the idiosyncratic meaning of the idiom when it merges with [out of the bag], as represented in (83) below. That being so, movement of [the cat] to the matrix subject position in (82a) and (82b) has different implications. In the case of (82a), movement is triggered for φ-agreement reasons, as sketched in (84), which leaves the idiosyncratic meaning specification of [the cat] unaltered. By contrast, movement in (82b) is θ-related and, in this case, an ungrammatical result obtains. Either [the cat] cannot move because it is not a potential θ-role bearer once it has become an idiom chunk, as represented in (85a), or, if it moves, assignment of the external θ-role to it obliterates its idiomatic specification and it can no longer be interpreted as an idiom chunk, as shown in (85b).
18 This is also the pattern found with nominal and adjectival predicates (cf. [88]/[89] below). On the pattern displayed by secondary predicates and floating quantifiers, see section 5.4.2 below.
(83) a. DP = [the cat]
        PP = [out of the bag]
     b. [[the cat]_{idiom chunk} [out of the bag]]
(84) [TP [the cat]_{idiom chunk} T seems [TP t to be [t out of the bag]]]
     (movement of [the cat]: OK)
(85) a. [vP v [tried [CP C [TP [the cat]_{idiom chunk} to be [t out of the bag]]]]]
        (movement of [the cat] to [Spec, vP]: ∗)
     b. [vP [the cat]_{idiom chunk} v [tried [CP C [TP t to be [t out of the bag]]]]]
Similar reasoning accounts for the raising-control contrasts in (78) and (79). The derivation of the raising construction in (79a), for instance, proceeds along the lines of (86) (with English words).
(86) a. Assignment of quirky case:
        [rescued [the boys][case:DAT]]
     b. Merger of a φ-incomplete probe:
        [vP IMP [v’ -en[case:?; φ:?] [VP rescued [the boys][case:DAT]]]]
     c. Agreement between the passive participle and the quirky DP + movement:
        [vP [the boys][case:DAT] [v’ IMP [v’ -en[Case:dflt; φ:dflt] [VP rescued t]]]]
     d. Applications of merge and move:
        [vP IMP [v’ -en[case:?; φ:?] [VP believed [TP [the boys][case:DAT] to have been [vP t . . . ]]]]]
     e. Agreement between the passive participle and the quirky DP + movement:
        [vP [the boys][case:DAT] [v’ IMP [v’ -en[Case:dflt; φ:dflt] [VP believed [TP t . . . ]]]]]
     f. Merger of a φ-complete probe:
        [TP T[φ:?] be [vP [the boys][case:DAT] [v’ IMP [v’ -en[Case:dflt; φ:dflt] [VP believed . . . ]]]]]
     g. Agreement between T and the quirky DP + movement:
        [TP [the boys][Case:DAT] [T’ T[φ:dflt] be [vP t [v’ IMP [v’ -en[Case:dflt; φ:dflt] . . . ]]]]]
     h. [the-boys.DAT.MASC.PL is.DFLT believed.DFLT to-have been rescued.DFLT]
After having its case feature valued in (86a), the quirky DP enters into an agreement relation with each of the two passive morphemes, setting their features to default values (cf. [86c] and [86e]), but remains active for purposes of A-movement as passive morphemes are not associated with a complete φ-set (they do not have the feature person). The quirky DP will only become inactive after agreeing with the finite T (a φ-complete probe), as shown in (86g). As in the derivation of a simple passive (cf. [81]), the embedded object triggers default agreement on its way to the matrix [Spec, TP] and surfaces with the quirky case it received in the most embedded clause (cf. [86h]). By contrast, the derivation of the control structure in (79b), for instance, proceeds as sketched in (87) (with English words).
(87) a. Assignment of quirky case:
        [helped [the boys][case:DAT]]
     b. Merger of a φ-incomplete probe:
        [vP IMP [v’ -en[case:?; φ:?] [VP helped [the boys][case:DAT]]]]
     c. Agreement between the passive participle and the quirky DP + movement:
        [vP [the boys][case:DAT] [v’ IMP [v’ -en[Case:dflt; φ:dflt] [VP helped t]]]]
     d. Applications of merge and move:
        [vP v [VP hope [CP C [TP [the boys][Case:DAT] to be [vP t . . . ]]]]]
     e. Movement and θ-assignment:
        [vP [the boys][Case:?] [v’ v [VP hope [CP C [TP t to be [vP t . . . ]]]]]]
     f. Merger of a φ-complete probe:
        [TP T[φ:?] [vP [the boys][Case:?] [v’ v [VP hope [CP C [TP t to be [vP t . . . ]]]]]]]
     g. Agreement between T and DP, case valuation, and movement:
        [TP [the boys][Case:NOM] [T’ T[φ:3PL] [vP t [v’ v [VP hope [CP C [TP t to be [vP t . . . ]]]]]]]]
     h. [the-boys.NOM hope.3PL to be helped.DFLT]
The derivation of the embedded clause proceeds like the derivation of raising constructions until we hit the derivational steps in (87d–e). As opposed to the derivational steps in (86c) or (86e), which involve movement driven by agreement, movement of the quirky DP in (87e) is triggered by the -properties of the matrix light verb and this makes a very big difference. Recall that a given quirky case is intrinsically tied to a specific -role. Thus, it is natural to assume that assigning an additional -role to an element bearing quirky
158
Empirical challenges and solutions
case may obliterate the quirky-case value previously specified. In other words, obliteration of the quirky-case value seen in (87e) is similar to what we saw in (85b), where -assignment to an idiom chunk deletes the idiom specification. Once the quirky-case value in (87e) is eliminated, the derivation then proceeds in a standard fashion, with the DP having its case valued through agreement with a -complete probe (cf. [87g]). Given that in (87) the probe is a finite T, the moved DP surfaces as nominative (cf. [87h]). Notice that the controller surfaces with structural case in (87h) because the role it received in the matrix [Spec, vP] was not tied to any specific morphology. This is not the only possibility, though. If the -role of the matrix predicate is associated with quirky case, the controller will then surface with the last quirky case it received and will trigger default agreement of the finite T. This is how we propose control structures with quirky case assigners in both the matrix and the embedded clause are to be derived. The derivation of (88), for example, proceeds as sketched in (89) (with English words).19 (88)
Icelandic (Sigurðsson 2008):
Hana langar ekki til að vera kalt
Her.ACC longs not for to be cold.DFLT
‘She doesn’t want to be (feeling) cold’
(89) a. Assignment of quirky case: [AP cold pron.3SG.FEM[case:DAT] ]
b. Merger of a φ-incomplete probe: [IP Infl[φ:?] [AP cold pron.3SG.FEM[case:DAT] ]] OK
c. Agreement between Infl and the quirky DP + movement: [IP pron.3SG.FEM[case:DAT] [I’ Infl[φ:dflt] [AP cold t]]]
d. Applications of merge and move: [vP v [VP want [CP C [TP pron.3SG.FEM[case:DAT] to be [IP t . . . ]]]]]
e. Movement and θ-assignment: [vP pron.3SG.FEM[case:ACC] [v’ v [VP want [CP C [TP t to be [IP t . . . ]]]]]]
f. Merger of a φ-complete probe: [TP T[φ:?] [vP pron.3SG.FEM[Case:ACC] [v’ v [VP want [CP C [TP t to be [IP t . . . ]]]]]]]
g. Agreement between T and DP + movement: [TP pron.3SG.FEM[Case:ACC] [T’ T[φ:dflt] [vP t [v’ v [VP want [CP C [TP t to be . . . ]]]]]]]
h. [her.ACC wants.DFLT not to be cold.DFLT]
19 Example (i) shows that the embedded predicate of (88) (under the intended meaning) assigns quirky dative to its subject.
(i) Icelandic (Sigurðsson 2008):
Mér er kalt
Me.DAT is.3SG cold.DFLT
‘I am (feeling) cold’
After the pronoun merges with the adjective in (89a), it receives quirky case and, after the Infl probe associated with adjectival predicates is merged (cf. [89b]), it enters into an agreement relation with the pronoun and its φ-features receive default values (cf. [89c]). The interesting step for the current discussion is the one after (89d) is assembled. The matrix light verb needs to assign its external θ-role and the embedded subject is still active for the computation as it has not entered into an agreement relation with a φ-complete probe. As we saw earlier, assignment of a θ-role to an element marked with quirky case obliterates the quirky specification previously established. Interestingly, the matrix light verb is also a quirky-case assigner.20 Thus, the previous quirky-case value is eliminated and the one associated with the θ-assignment of the matrix predicate (accusative) is specified, as seen in (89e). Further checking with finite T finally deactivates the case feature of the moved pronoun and values the φ-features of T as default, as seen in (89g), which surfaces as (89h) (cf. [88]).
We would like to stress that assignment of a θ-role to an idiom chunk and assignment of a θ-role to an element marked with quirky case are similar but not identical. In particular, deletion of the idiom specification in (85b) is arguably triggered by interpretability at the C–I interface. By contrast, in instances of θ-role assignment to an element previously marked with quirky case (cf. [87e]), the issue is a morphological one: does the morphology of the grammar in question allow preservation of quirky-case specification when a new θ-role is assigned? It is not inconceivable that different grammars may have opposite answers and even different answers depending on specific quirky values or θ-roles.
The latter scenario can be illustrated by dialects of Spanish that allow embedded quirky morphology on the controller of some obligatory control verbs, as discussed by Bošković (1994) based on work by González (1988, 1990). In (90) below, for instance, the preposition preceding the controller is determined by the embedded rather than the matrix verb. That different possibilities may be accommodated by morphology undoubtedly underlies part of the variation found in speakers’ judgments regarding quirky case.
(90) Spanish (González 1988, 1990):
A Juan le quiere gustar Marta
To Juan CL.DAT wants like Marta
‘Juan wants to like Marta’
20 Nothing will substantially change if it turns out that the quirky case is assigned by the matrix main verb instead of the light verb.
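The case bookkeeping behind the derivations in (87) and (89) can be restated as a small Python sketch. It is only an illustration under simplifying assumptions: the names `DP`, `assign_theta`, `agree_with_finite_t`, `control`, and `raising` are hypothetical labels introduced here, the sketch models only the Icelandic-type pattern in which a new θ-role obliterates an earlier quirky value, and it ignores everything about the derivation except case and agreement.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DP:
    phi: str                     # interpretable phi-features, e.g. "3PL"
    case: Optional[str] = None   # None = case feature still unvalued
    quirky: bool = False         # True if the case value is tied to a theta-role


def assign_theta(dp: DP, quirky_case: Optional[str] = None) -> None:
    """Theta-assignment (assumed Icelandic-type morphology): any earlier
    quirky value is obliterated; a quirky assigner supplies its own value,
    otherwise the case feature is left unvalued."""
    dp.case = quirky_case
    dp.quirky = quirky_case is not None


def agree_with_finite_t(dp: DP) -> str:
    """Agreement with a phi-complete finite T; returns the phi value on T."""
    if dp.quirky:
        return "DFLT"            # quirky DP keeps its case; T shows default agreement
    dp.case = "NOM"              # structural case valued by finite T
    return dp.phi


def control(phi: str, embedded_quirky: Optional[str],
            matrix_quirky: Optional[str] = None) -> Tuple[Optional[str], str]:
    dp = DP(phi)
    assign_theta(dp, embedded_quirky)   # theta-role in the embedded clause
    assign_theta(dp, matrix_quirky)     # movement to the matrix [Spec, vP]
    t_phi = agree_with_finite_t(dp)
    return dp.case, t_phi


def raising(phi: str, embedded_quirky: Optional[str]) -> Tuple[Optional[str], str]:
    dp = DP(phi)
    assign_theta(dp, embedded_quirky)   # the only theta-role: nothing overwrites it
    t_phi = agree_with_finite_t(dp)
    return dp.case, t_phi


print(control("3PL", "DAT"))             # pattern of (87): ('NOM', '3PL')
print(control("3SG.FEM", "DAT", "ACC"))  # pattern of (89): ('ACC', 'DFLT')
print(raising("3PL", "DAT"))             # raising keeps quirky case: ('DAT', 'DFLT')
```

The only difference between `control` and `raising` in this sketch is the second call to `assign_theta`, which mirrors the claim that movement to a θ-position, unlike movement driven by agreement alone, can affect quirky morphology.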
In sum, by making use of fairly standard assumptions regarding quirky case, the MTC can handle the contrast between raising and control perfectly well. In fact, the difference between raising and control with respect to the preservation of quirky-case morphology lies exactly where the MTC would lead us to look. A-movement in raising is related to φ-agreement and is therefore oblivious to θ-relations that the relevant DP may have participated in. By contrast, A-movement in control is motivated by θ-considerations and, therefore, it may in principle be sensitive to θ-related issues.
5.4.2 Apparent case-marked PROs
5.4.2.1 Icelandic
The second type of challenge posed to the MTC coming from Icelandic involves control configurations in which embedded floating quantifiers and secondary predicates display case-agreement morphology that at face value seems to be independent from the controller in the matrix clause (see Landau 2003; Sigurðsson 2008; and Bobaljik and Landau 2009). In the sentences in (91) below, for instance, the matrix subject bears (structural) nominative case (cf. [91a]) and (quirky) accusative case (cf. [91b]), but the secondary predicate in the embedded clause shows up with dative case, which is the quirky case assigned by the embedded verb.
(91) Icelandic (Boeckx and Hornstein 2006a):
a. Jón vonast til að leiðast ekki einum
Jon.NOM hopes to to be-bored not alone.DAT
‘Jon hopes not to be bored alone’
b. Bjarna langaði ekki til að leiðast einum
Bjarni.ACC wanted not to to be-bored alone.DAT
‘Bjarni wanted not to be bored alone’
The reason why (91) seems intriguing from the perspective of the MTC is that floating quantifiers and secondary predicates agree in case and φ-features with the nominal expression they are associated with, as illustrated in (92) below. If the secondary predicates in (91) mismatch the case of the matrix subject but exhibit the quirky agreement licensed by the embedded verb, this at face value appears to show that the controller of the agreement cannot be a trace/copy of the matrix subject. Rather, the agreement on the secondary predicate should be determined by a case-marked PRO (see Sigurðsson 1991, 2008).
(92) Icelandic (Sigurðsson 2008):
a. Bræðurnir voru ekki báðir kosnir í stjórnina
Brothers-the.NOM.MASC.PL were not both.NOM.MASC.PL elected to board-the
‘The brothers were not both elected to the board’
b. Bræðrunum var báðum boðið á fundinn
Brothers-the.DAT.MASC.PL was both.DAT.MASC.PL invited.DFLT to meeting-the
‘The brothers were both invited to the meeting’
Again, appearances are misleading and a close look at the relevant derivations promptly reveals the source of the case dissimilarity between the matrix subject and the embedded secondary predicate. The derivation in (91a), for instance, is as sketched in (93) (with English words):
(93) a. Merger between DP and the secondary predicate: [J.[case:?] alone[case:?; φ:?] ]
b. Concord: [J.[case:?] alone[case:?; φ:SG.MASC] ]
c. Merger of the verb + assignment of quirky case: [VP be-bored [J.[case:DAT] alone[case:?; φ:SG.MASC] ]]
d. Concord: [VP be-bored [J.[case:DAT] alone[Case:DAT; φ:SG.MASC] ]]
e. Applications of merge and move: [vP v [VP hope [CP C [TP J.[case:DAT] not to be-bored t alone[Case:DAT; φ:SG.MASC] ]]]]
f. Movement and θ-assignment: [vP J.[case:?] [v’ v [VP hope [CP t not to be-bored t alone[Case:DAT; φ:SG.MASC] ]]]]
g. Merger of a φ-complete probe: [TP T[φ:?] [vP J.[case:?] [v’ v [VP hope [CP t not to be-bored t alone[Case:DAT; φ:SG.MASC] ]]]]]
h. Agreement between T and DP + movement: [TP J.[Case:NOM] [T’ T[φ:3SG] [vP t hope not to be-bored t alone[Case:DAT; φ:SG.MASC] ]]]
Assume that concord between floating quantifiers/secondary predicates and nominal expressions takes place under mutual c-command, valuing (and deleting for LF purposes) the uninterpretable features of the floating quantifiers/secondary predicates. That being so, merger between the secondary predicate and the DP in (93a) allows the φ-features of the secondary predicate to be valued (cf. [93b]). Once a quirky-case assigner is introduced in (93c), it values the case feature of ‘Jon’ as dative (cf. [93c]), which in turn allows the secondary predicate to have its case feature valued via concord (cf. [93d]). Movement of ‘Jon’ to the embedded [Spec, TP] in (93e) strands the secondary predicate. The crucial step is the next one. As discussed in section 5.4.1, assignment of a θ-role to an element marked with quirky case obliterates the previous quirky-case value. This is what happens in (93f) after ‘Jon’ moves to receive the external θ-role of the matrix light verb. Finally, the moved subject agrees with a finite T and surfaces as nominative (cf. [93h]).21
Notice that the derivation of (91b) is essentially identical to the derivation of (91a). The only relevant difference is that the matrix light verb is a quirky-case assigner (see footnote 20) and, when it assigns its external θ-role to the moved subject, the previous quirky-case value is overwritten by the new one, as sketched in (94) (with English words).
(94) a. Merger between DP and the secondary predicate + concord: [B.[case:?] alone[case:?; φ:SG.MASC] ]
b. Assignment of quirky case: [VP be-bored [B.[case:DAT] alone[case:?; φ:SG.MASC] ]]
c. Concord: [VP be-bored [B.[case:DAT] alone[Case:DAT; φ:SG.MASC] ]]
d. Movement and θ-assignment + quirky valuation: [vP B.[case:ACC] [v’ v [VP wanted [CP t not to be-bored t alone[Case:DAT; φ:SG.MASC] ]]]]
e. Agreement with a φ-complete T + movement: [TP B.[Case:ACC] [T’ T[φ:dflt] [vP t wanted not to be-bored t alone[Case:DAT; φ:SG.MASC] ]]]
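The timing effect in (93) and (94) can be restated as a short Python sketch: concord copies the case value the DP has at the point where the two elements are still in a mutual c-command configuration, and later changes to the DP do not reach the stranded predicate. The function name `derive` and the dictionary representation are hypothetical conveniences, and the sketch tracks nothing but the case and φ-features of the two elements.

```python
from typing import Optional, Tuple


def derive(embedded_quirky: Optional[str],
           matrix_quirky: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (surface case of the controller, case on the secondary predicate)."""
    dp = {"case": None, "phi": "SG.MASC"}
    pred = {"case": None, "phi": None}

    # Merger + concord: the DP's phi-features are valued from the start,
    # so the predicate's phi-features are valued immediately.
    pred["phi"] = dp["phi"]

    # The embedded verb assigns quirky case (if it is a quirky-case assigner) ...
    dp["case"] = embedded_quirky
    # ... and concord copies whatever case value the DP has at this point.
    pred["case"] = dp["case"]

    # The DP moves on and strands the predicate. Theta-assignment in the
    # matrix clause overwrites the earlier quirky value on the DP only.
    dp["case"] = matrix_quirky

    # A finite T values a still-unvalued case feature as nominative.
    if dp["case"] is None:
        dp["case"] = "NOM"
    return dp["case"], pred["case"]


print(derive("DAT", None))   # pattern of (91a)/(93): ('NOM', 'DAT')
print(derive("DAT", "ACC"))  # pattern of (91b)/(94): ('ACC', 'DAT')
```

Because the predicate's value is copied before the matrix θ-role is assigned, the mismatch between controller and secondary predicate falls out from the order of operations alone.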
Let us pause for a moment to reconsider the valuation of the uninterpretable features of the secondary predicate in the derivations outlined in (93) and (94). Upon merger, the secondary predicate gets its φ-features valued by the corresponding features of the nominal expression it associates with, but not its case feature. This has a straightforward explanation. The φ-features of the relevant nominal expression are interpretable and hence valued at every derivational step. Its case feature, on the other hand, is uninterpretable and is unvalued upon merger. Only after the case feature of the nominal expression is valued can it value the case feature of the secondary predicate. In the derivations discussed, this happens after the embedded predicate assigns quirky case to the nominal expression (cf. [93c–d] and [94b–c]). Bearing this in mind, let us now consider instances where there is no quirky-case assignment in the embedded clause, as exemplified in (95).
(95) Icelandic (Sigurðsson 2008):
a. Bræðrunum líkaði illa að vera ekki báðir kosnir
Brothers-the.DAT.MASC.PL liked ill to be not both.NOM.MASC.PL elected
‘The brothers disliked not being both elected’
b. Ólaf langaði að fara einn í veisluna
Olaf.ACC longed to go alone.NOM to party-the
‘Olaf wished to go alone to the party’
21 Just as we saw in section 5.4.1, the surface form of a given DP depends on the last case-assigning/valuing head it interacts with, which provides evidence for the view that case must be assigned/valued rather than checked.
In the absence of quirky-case assignment in the embedded clauses of (95), the floating quantifier/secondary predicate surfaces as nominative, regardless of the case specification of the controller. Interestingly, case valuation of the controller takes place after it leaves the floating quantifier/secondary predicate stranded. Consider the simplified derivation of (95b) given in (96) below, for instance.
(96) a. Merger between DP and the floating quantifier + concord: [O.[case:?] alone[case:?; φ:SG.MASC] ]
b. Applications of merge and move: [vP v [VP longed [CP C [TP O.[case:?] to go [t alone[case:?; φ:SG.MASC] ]]]]]]
c. Movement and θ-assignment + quirky valuation: [vP O.[case:ACC] [v’ v [VP longed [CP C [TP t to go [t alone[case:?; φ:SG.MASC] ]]]]]]]
d. Agreement with a φ-complete T + movement: [TP O.[Case:ACC] [T’ T[φ:dflt] [vP t longed t to go [t alone[case:?; φ:SG.MASC] ]]]]]
In (96) ‘Olaf’ gets its case value after it moves to the matrix [Spec, vP], leaving the stranded secondary predicate with its case feature unvalued. We propose that, in such circumstances, the case of the secondary predicate or floating quantifier is assigned a default value in the morphological component (see Boeckx and Hornstein 2006a; Boeckx, Hornstein, and Nunes in press). As we have already seen when discussing default specification for passive morphemes (see section 5.4.1), the default value for case in Icelandic is nominative;22 hence the nominative specification on the floating quantifier and secondary predicate of (95) irrespective of the case value of the controller.
It is worth noting that, contrary to what Sigurðsson (2008) and Bobaljik and Landau (2009) claim, the MTC has a very simple explanation for why control infinitives in Icelandic cannot license an overt subject, as illustrated in (97).
(97) Icelandic:
a. ∗Jón vonast til [hann/Eiríkur að verða ráðinn]
Jon.NOM hopes he/Eric.NOM to be hired.NOM.MASC.SG
‘John hopes for him(self)/Eric to be hired’
(Jónsson 1996, reproduced in Bobaljik and Landau 2009)
b. Ég bað Maríu [að (∗hún/∗Ásta) fara ein þangað]
I asked Maria.ACC to she/Asta.NOM go alone.NOM.FEM.SG there
‘I asked Maria (for her/Asta) to go there alone’
(Thráinsson 1979, adapted in Bobaljik and Landau 2009)
c. Ég vonast til [að (∗mér/∗Jóni) verða hjálpað]
I.NOM hope for to me/Jon.DAT be helped
‘I hoped (for myself/Jon) to be helped’
(Zaenen, Maling, and Thráinsson 1985, adapted in Bobaljik and Landau 2009)
Since there is no local φ-complete probe in the embedded clause of (97a) and (97b), the embedded subject does not have its case valued and the derivation crashes. By contrast, in (97c) the embedded subject is valued with quirky case. However, in order to be licensed, the element bearing quirky case needs to agree with a φ-complete probe and there is no such element within the embedded clause; hence, the derivation of (97c) also crashes. The reasoning above is quite simple: no case licensing, no convergence. Sentences such as (97) are in fact quite problematic for approaches that take the morphological facts reviewed above to indicate that control infinitives in Icelandic license a PRO marked with structural or quirky case. If this is so, why are the sentences in (97) out? When all is weighed, it seems that approaches that take control infinitives to license a PRO marked with “regular” case are very similar to the null-case approach. At the end of the day, PRO is specified as being licensed by a peculiar kind of case which can license no other nominal expression.
For the sake of completeness, let us consider a final set of data. Sigurðsson (2008) points out that, in addition to the “central facts” discussed above, with nominative specification for embedded agreeing elements, case matching with the controller is marginally possible, although it is sensitive to a variety of as yet poorly understood factors (the type of predicate, the value of the quirky case, idiolectal variation, etc.). He notes, for example, that it is easier for controllers bearing structural accusative case to transmit their cases to a downstairs predicate than it is for controllers bearing quirky case, as respectively exemplified with structural and quirky accusative in (98a) and (98b) below, where the numbers in parentheses indicate how many out of 15 informants in his survey judged the relevant agreement pattern as OK.23 He also notes that, although transmitting quirky dative is marginally possible (cf. [98c]), transmitting quirky genitive is not an option (cf. [98d]). Importantly, case transmission never applies if the embedded predicate has a quirky-case property, as illustrated in (98e), where the embedded predicate assigns quirky dative.
(98) Icelandic (Sigurðsson 2008):
a. Hún bað Ólaf að fara bara einan í veisluna (12/15)
She.NOM asked Olaf.ACC to go just alone.ACC to party-the
‘She asked Olaf to just go alone to the party’
b. Ólaf langaði að vera fyrstan (2/15)
Olaf.ACC longed to be the-first-one.ACC
‘Olaf wanted to be the first one’
c. Ólafi fannst gaman að vera fyrstum (3/15)
Olaf.DAT found pleasurable to be the-first-one.DAT
‘Olaf was pleased to be the first one’
d. ∗Við kölluðum til Ólafs að vera rólegs (0/15)
We shouted on Olaf.GEN to be calm.GEN
‘We shouted to Olaf to be calm’
e. ∗Við báðum hana að verða boðna eina
We.NOM asked her.ACC to be invited.ACC.FEM.SG alone.ACC.FEM.SG
‘We asked her to get invited alone’
22 This is in consonance with Sigurðsson’s (2008: 419) observation that “Icelandic is unusual in overtly marking many of its nominatives in morphology. Even so, it is reasonable to analyze nominative as the unmarked case in Icelandic, as in many other languages.”
The sharp contrast between (98a) and (98e) is exactly what our approach predicts. Recall that, when quirky case assignment is available in the embedded clause, as is the case in (98e), the nominal expression bearing quirky case is able to enter into agreement/concord relations and value case and φ-features of the agreeing elements around (cf. [93d] and [94c]). Hence, case transmission is blocked in sentences such as (98e) because the relevant elements have already had their features valued. By contrast, the corresponding elements in the embedded clauses of (98a–d) do not have their features valued in the syntactic component. It is then plausible to assume that this leaves the door open for competing morphological strategies (default nominative assignment or long-distance case copying). If case transmission is indeed a morphological process, it would not be surprising that it would be subject to a variety of morphological factors, including sensitivity to the type of case (structural, cf. [98a], vs. quirky, cf. [98b]) and to the specific case value, as the contrast between dative and genitive in (98c) and (98d) illustrates.24
In sum, aside from the murky status of case transmission (a status it has under any approach to control), the dissimilarity between the features of the controller and an embedded floating quantifier or secondary predicate is not an insurmountable problem for the MTC, contrary to what is frequently claimed. It in fact follows very naturally as a by-product of the dynamics of the derivation. Once quirky cases are tied to θ-roles, assignment of a new θ-role to an element marked with quirky case obliterates the quirky-case value previously specified. Thus, once a quirky-case-marked element moves to a θ-position, its case realization will be determined from that point on in the derivation, with no connection with the quirky agreement it had previously triggered.
23 See Sigurðsson (2008) for relevant figures regarding judgments classified as ? or ∗.
5.4.2.2 Basque
Let us finally turn to control in Basque, which San Martin (2004) has claimed runs against the expectations of MTC in that it also appears to involve a case-marked PRO (see also Landau 2006).
In a nutshell, San Martin’s argument is the following. As she observes, the case patterns that arise in Basque are not related to the nature of the predicate involved, but rather to the number of argument DPs. Thus, if a given clause has a single argument DP, it is marked with absolutive case (morphologically, a zero morpheme); if it has two argument DPs, one is marked with absolutive case and the other with ergative case; finally, if it has three argument DPs, one is marked with absolutive case, another with ergative case, and the remaining element with dative case.25 This is respectively illustrated in (99).
(99) a. Jon etxera joan da
Jon.ABS house.ALL go AUX.3ABS
‘John has gone home’
b. Jonek ogia erosi du
Jon.ERG bread.DET.ABS buy AUX.3ABS.3ERG
‘John has bought bread’
c. Nik Mariari oparia eman diot
I.ERG Mary.DAT present.DET.ABS give AUX.3ABS.3DAT.1ERG
‘I have given the present to Mary’
24 See Boeckx, Hornstein, and Nunes (in press) for further discussion. Case transmission also appears to be sensitive to the presence of an intervening DP, as it is blocked under ‘promise,’ as illustrated in (i). Landau (2007) also reports a great deal of variability in situations of case transmission in Russian. Interestingly, Landau notes that the presence of an intervening argument also blocks case transmission.
(i) Icelandic (Ussery 2008, reporting data from Andrews 1982):
Þeir telja hana hafa lofað honum að vera góð/∗góða
They believe her.ACC have promised him.DAT to be good.NOM.FEM.SG/ACC.FEM.SG
‘They believe her to have promised him to be good’
With this in mind, consider the control structure in (100) below. The embedded clause of (100) displays the case pattern associated with three arguments, given that one argument is realized as dative. This in turn indicates that ergative case must have been assigned in the embedded clause. Assuming that the absolutive case on the controller in the matrix clause disqualifies it as a potential candidate to be the element bearing ergative case, San Martin concludes that the embedded clause of (100) must involve a PRO marked with ergative case, thereby accounting for the presence of dative case. (100)
Jon_i [Ø_i Mariari ogia ematen] saiatu da
Jon.ABS Mary.DAT bread.DET.ABS give.NOMIN.INN try AUX.3ABS
‘John has tried to give bread to Mary’
San Martin’s argument is indeed very ingenious. However, we believe that it just shows how control structures such as (100) can be handled in a PRO-based approach, not that the MTC cannot account for such constructions. Basque has the canonical morphological profile of an ergative language in the sense that transitive objects and intransitive subjects are equally marked absolutive. But morphology aside, Basque arguments behave syntactically on a par with what is found in nominative–accusative languages. For example, in the domain of control, it is subjects that are invariably controlled, as illustrated in (101) below with subjects of transitive and unaccusative predicates respectively. And as far as its inherent case system goes (which, following Laka [2006], we take to include ergative and dative), Basque is much closer to Icelandic than, say, German, as it allows inherently case-marked elements to function as regular subjects (and even enter into φ-agreement with a full φ-probe, as illustrated by the agreement on the auxiliary in [99]–[101]).
(101) a. Ni_i [Ø_i oparia erosten] saiatu naiz
I.ABS present.DET.ABS buy.NOMIN.INN try AUX.1ABS
‘I have tried to buy the present’
b. Ni_i [Ø_i etxera joaten] saiatu naiz
I.ABS house.ALL go.NOMIN.INN try AUX.1ABS
‘I have tried to go home’
25 This is reminiscent of the morphological case-assignment mechanisms proposed in Yip, Maling, and Jackendoff (1987), Marantz (1991), and Harley (1995), among others.
Thus, the analysis we proposed in section 5.4.2.1 to account for apparent case-marked PROs in Icelandic extends straightforwardly to Basque. The derivation of (100), for instance, proceeds along the lines of (102) (with English words).
(102) a. [John[case:ERG] give bread[case:ABS] Mary[case:DAT] ]
b. [vP v [tried [John[case:ERG] give bread[case:ABS] Mary[case:DAT] ]]]
c. [vP John[case:?] [v’ v [tried [t give bread[case:ABS] Mary[case:DAT] ]]]]
d. [TP John[case:ABS] has[3ABS] [vP t [v’ v [tried [t give bread[case:ABS] Mary[case:DAT] ]]]]]
In (102a), ‘John’ is generated in the embedded clause; hence, the presence of three arguments activates the three case specifications and the case of ‘John’ is specified as ergative. However, after ‘John’ moves to the matrix [Spec, vP], it receives another θ-role and its previous case specification is obliterated. Further case computations then specify ‘John’ as absolutive as it is the sole argument DP in the matrix clause. Again, we have witnessed case mismatch between the controller and the controllee (its trace/copy) which arises in the course of the derivation. More importantly, the mismatch arises exactly where predicted by the MTC: when an inherently case-marked element receives an additional θ-role.
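The Basque computation in (102) can be restated as a brief Python sketch. It assumes, purely for illustration, that the count-based pattern in (99) can be encoded as a lookup from the number of argument DPs to a list of cases, and that arguments are listed in the order subject, direct object, indirect object; the names `CASE_BY_COUNT`, `clause_cases`, and `derive_control` are hypothetical.

```python
from typing import Dict, List, Tuple

# Count-based case pattern from (99): one argument -> ABS; two -> ERG, ABS;
# three -> ERG, ABS, DAT. The ordering (subject, direct object, indirect
# object) is an assumption made for this sketch.
CASE_BY_COUNT = {
    1: ["ABS"],
    2: ["ERG", "ABS"],
    3: ["ERG", "ABS", "DAT"],
}


def clause_cases(args: List[str]) -> Dict[str, str]:
    """Assign case to the argument DPs of a single clause by their number."""
    return dict(zip(args, CASE_BY_COUNT[len(args)]))


def derive_control(embedded_args: List[str],
                   controller: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    # (102a): case is computed over all argument DPs of the embedded clause.
    embedded = clause_cases(embedded_args)
    # (102c): the controller moves to the matrix [Spec, vP]; the new theta-role
    # obliterates its earlier case value, so the matrix computation starts afresh.
    # (102d): the controller is the sole argument DP of the matrix clause.
    matrix = clause_cases([controller])
    return embedded, matrix


embedded, matrix = derive_control(["John", "bread", "Mary"], "John")
print(embedded)  # {'John': 'ERG', 'bread': 'ABS', 'Mary': 'DAT'}
print(matrix)    # {'John': 'ABS'}
```

The dative on ‘Mary’ is licensed by the three-argument computation in the embedded clause, while the absolutive on the controller reflects the separate one-argument computation in the matrix clause.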
5.5 The minimal-distance principle, control shift, and the logic of minimality
Rosenbaum (1970) proposed the minimal-distance principle to account for why, typically, the antecedent of an obligatorily controlled PRO is the most proximate nominal expression, as illustrated in (103) (see section 2.3).
(103) a. John said that Mary tried [PRO to wash herself/∗himself]
b. John persuaded Mary [PRO to wash herself/∗himself]
The MTC captures the empirical virtues of the minimal-distance principle via minimality (see section 3.4.1): if obligatorily controlled PRO is a residue of A-movement, then moving ‘John’ from the embedded-subject position in (103) to the matrix-subject position traverses the c-commanding intervening nominal ‘Mary,’ as shown in (104) below. As this violates (relativized) minimality, ‘John’ cannot be an antecedent of PRO. Conversely, ‘Mary’ can be the controller in (103), as there is no intervening DP (cf. [105]).
(104) a. [ said that Mary tried [John to . . . ]] ↑ ∗
b. [ persuaded+v [Mary t_persuaded [John to . . . ]]] ↑ ∗
(105) a. [John said that [Mary tried [t to . . . ]]] ↑ OK
b. [John persuaded+v [Mary t_persuaded [t to . . . ]]] ↑ OK
We believe it to be a virtue of the MTC that it so elegantly derives Rosenbaum’s minimal-distance principle. However, many (though equally impressed with the intimate connection between the MTC and the minimal-distance principle) have concluded that the MTC is fatally tainted empirically precisely because of this tight relation. There are well-known counter-examples to Rosenbaum’s minimal-distance principle that (some claim) demonstrate that the minimal-distance principle is incorrect. As the MTC assumes the minimal-distance principle in the guise of minimality, the conclusion reached by some is that it too must be false. The argument form is impeccable. Whether the facts are fatal is less clear, as we will argue.
The putative problems for the minimal-distance principle come in two varieties (see e.g., Culicover and Jackendoff 2001 and Landau 2003). The first involves control into the complement of verbs like ‘promise,’ as illustrated in (106) below, and the second involves cases of “control shift,” as illustrated in (107b). In both (106) and (107b) ‘John’ appears to control PRO, despite the fact that ‘Mary’ surfaces between PRO and ‘John.’ As this seems to involve a minimality violation under the MTC, control by the matrix subject should be unavailable.
(106)
John promised Mary [PRO to wash himself]
(107) a. John asked Mary [PRO to shave herself/∗himself]
b. John asked Mary [PRO to be allowed to shave himself/∗herself]
There has been an animated discussion regarding how compelling these facts are. For example, the promise-examples are not uniformly deemed acceptable.26 Importantly, as emphasized by Boeckx and Hornstein (2004), Rosenbaum (1967) had already observed (citing C. Chomsky 1969) that control cases like (106) are mastered rather late in the acquisition process, if mastered at all. It appears that there are speakers for whom these cases are never considered acceptable.27 As Rosenbaum correctly notes, this is a problem for those who consider these cases as identical to the standard cases in (103), which appear to be acquired quite straightforwardly. Why, after all, do the acquisition profiles of sentences like (106) differ so significantly from those in (103) if they are unexceptional cases of control? Curiously, this learnability feature of verbs like ‘promise’ has received less attention than the acceptability of (106) (for some speakers). Landau (2003: 480), for example, dismisses the relevance of the acquisition puzzle, claiming that “although cases of type [106] are not as common as those of type [103b], they are far too systematic to be dismissed as ‘highly marked’ exceptions.” Boeckx and Hornstein (2004) observe that the reasoning behind this line of argument is flawed if one accepts that theories of UG aim to provide an answer to the logical problem of language acquisition (Plato’s Problem), a standard assumption at least since Aspects. The interesting acquisition profiles that verbs like ‘promise’ have constitute prima facie evidence that all is not quite standard with these cases from a grammatical point of view and that whatever evidence they provide against the minimal-distance principle requires some further massaging. 
We take it that a complete account of these cases has three parts: (i) that the cases in (103) are the central examples of complement control (as Rosenbaum originally proposed); (ii) that some speakers find (106) acceptable; and (iii) that some speakers either never come to admit cases like (106) or take a long time to acquire them. In section 5.5.1 below, we outline an approach that covers all three data points.
26 Stockwell, Schachter, and Partee (1973: 536), for example, describe this type of sentence as “only marginally grammatical.”
27 See Courtenay (1998) for discussion.
The putative counter-example in (107b) is equally problematic. There is currently no good account for why control shift obtains. In fact, though seldom noted, control shift is incompatible with standard analyses of control in which the controller choice is catalogued as a (diacritical) fact about the selection properties of the embedding verb. Whatever goes on in control shift cases, it transcends the reaches of selection. For instance, to allow the verb ‘ask’ to select its object as the controller in (107a) but its subject as the controller in (107b) requires allowing it to “see” the verb of the embedded clause: if it is something like ‘allowed’ then the subject controls, otherwise the object (see Farkas 1988). However, if selection is a local head-to-head relation (the standard assumption), the embedded verb is just too remote from ‘ask’ to be visible and so cannot be exploited in this way to determine the “correct” controller. Some may take this to indicate that controller selection is not a lexical property of the embedding predicate but a compositional fact about the embedding predicate coupled with the composed semantic contribution of the embedded sentence.28 However, we still await a (non-ad hoc) explanation of how these different readings arise.29
Again, it seems to us that it is not clear what the control reversal facts in (107) tell us about the adequacy of the MTC. What is clearer is that the reversal only applies in certain stylized situations in which the object of ‘ask’ authorizes the embedded event in some way. An adequate theory of these cases should tell us why control shift occurs in just this narrow range of cases. We attempt to do so in section 5.5.2 below.
5.5.1 Control with promise-type verbs
Let us review exactly how the MTC derives the MDP as a special case. For concreteness, consider a persuade-structure like (108).
(108)
[DP1 [persuade [DP2 [PRO∗1/2 to go home]]]]
The movement from the position occupied by PRO (understood as an Atrace) to the position occupied by DP1 is prohibited if we assume that DP2 triggers a minimality violation. This holds if DP2 “intervenes” between DP1 28 This seems to be the position of Culicover and Jackendoff (2001) and van Craenenbroeck, Rooryck, and van den Wyngaerd (2005), for instance. 29 There is a considerable literature discussing various factors involved in control-shift interpretations that build on Farkas (1988). However, these works do not explain why these readings arise or why these factors are relevant. Rather, they basically reiterate the observed facts and factors by noting that the isolated factors are dispositive and that they contribute in some way to the observed interpretations.
172
Empirical challenges and solutions
and PRO. DP2 intervenes if it c-commands PRO, DP1 c-commands PRO, and DP1 c-commands DP2 . If any of these clauses fails to obtain, then DP2 is not an intervener and movement of DP1 from the PRO position does not violate minimality. Given this logic, the object of control cases involving ‘promise’ like (106), repeated below in (109a), should not count as an intervener if it fails to ccommand the embedded subject, as sketched in (109b), where ‘Mary’ is the complement of a null head. The question is whether it makes sense to suppose that there is an extra layer of structure in promise-cases preventing the object from counting as an intervener. (109) a. John promised Mary [PRO to wash himself] b. [John promised [XP X Mary] [t to wash himself]] ↑ OK
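The three-clause definition of intervention given above can be made concrete with a small sketch. The Python below is an illustration only, under stated assumptions: the node labels S, VP, V' and CP, and the helper names Node, dominates, c_commands and intervenes, are hypothetical and introduced purely for exposition; c-command is taken in a familiar form (the mother of A dominates B, and neither A nor B dominates the other). The two trees encode the bracketings in (108) and (109b).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality: two nodes are equal only if they are the same object
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        # Wire up parent pointers so dominance can be computed by walking upward.
        for child in self.children:
            child.parent = self


def dominates(a: Node, b: Node) -> bool:
    """True if a properly dominates b (a is an ancestor of b)."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a: Node, b: Node) -> bool:
    """Assumed definition: a's mother dominates b, and neither a nor b dominates the other."""
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    return a.parent is not None and dominates(a.parent, b)


def intervenes(dp2: Node, dp1: Node, pro: Node) -> bool:
    """The three clauses from the text: DP2 > PRO, DP1 > PRO, DP1 > DP2 (> = c-commands)."""
    return c_commands(dp2, pro) and c_commands(dp1, pro) and c_commands(dp1, dp2)


# (108): [DP1 [persuade [DP2 [PRO to go home]]]]  (category labels are illustrative)
dp1, dp2, pro = Node("DP1"), Node("DP2"), Node("PRO")
persuade_tree = Node("S", [
    dp1,
    Node("VP", [
        Node("persuade"),
        Node("V'", [dp2, Node("CP", [pro, Node("to go home")])]),
    ]),
])

# (109b): [John promised [XP X Mary] [t to wash himself]]
john, mary, trace = Node("John"), Node("Mary"), Node("t")
promise_tree = Node("S", [
    john,
    Node("VP", [
        Node("promised"),
        Node("XP", [Node("X"), mary]),  # 'Mary' is buried under a null head X
        Node("CP", [trace, Node("to wash himself")]),
    ]),
])

persuade_blocked = intervenes(dp2, dp1, pro)     # True: DP2 c-commands PRO
promise_blocked = intervenes(mary, john, trace)  # False: 'Mary' does not c-command t

print(persuade_blocked, promise_blocked)
```

On these assumptions the check returns True for the persuade-structure and False for the promise-structure, because the extra XP layer is the only thing that stops ‘Mary’ from c-commanding the embedded subject position.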
There is indeed a considerable amount of evidence that points to the conclusion that ‘Mary’ in (109) is not the complement of ‘promise,’ but the complement of a null preposition. First, ‘promise’ is often classed with other subject-control verbs whose nominal complement is preceded by the preposition ‘to,’ as illustrated in (110) below.30 (110)
[John vowed/committed [PP to Mary] [t to wash himself]] ↑ OK
‘Vow’ in particular is of interest as it is semantically very close to ‘promise’ and so it is not unreasonable to suppose that the thematic role of the nominal argument of the two verbs is identical. Given (a non-relativized version of) Baker’s (1988, 1997) uniformity of theta assignment hypothesis (UTAH), the apparent direct object of ‘promise’ in (109a) should then be mapped into the complement of a preposition, thus obviating any minimality problems (cf. [109b]). This conclusion is bolstered by the oft-noted observation that the nominalization of ‘promise’ requires ‘to’ rather than ‘of’ before the nominal complement, despite the fact that ‘of’ is what normally precedes structural objects within nominalizations: (111) a. John’s promise to/∗ of Mary to leave b. My promising to/∗ of Mary to leave
30 See e.g., Landau (1999, 2003).
In fact, the preposition ‘to’ may surface in other syntactic frames associated with ‘promise,’ as shown in (112) below, and this leads us to the second set of facts that indicate that ‘Mary’ in (109a) is not the theme/patient complement of ‘promise.’ The two syntactic frames of ‘promise’ seen in (109a) and (112) resemble what we find with standard double-object constructions in English such as (113), that is, a fronted thematic goal surfaces without its preposition. (112)
I didn’t promise this to Mary
(113) a. John gave a present to Mary b. John gave Mary a present
Importantly, the nominal object of ‘promise’ in constructions like (109a) patterns like the goal of double-object constructions and not like the object of control constructions involving verbs like ‘persuade.’ Thus, as opposed to the object of standard object-control verbs, the object of ‘promise’ and the goal of a double-object construction disallows wh-movement (cf. [114]),31 heavy NP shift (cf. [115]), and secondary predicates (cf. [116]). (114) a. Whoi did you persuade ti to leave the party? b. ∗ Whoi did you promise ti to leave the party? c. ∗ Whoi did you give ti a book? (115) a. You persuaded ti to leave [the party every man that you met]i b. ∗ You promised ti to leave [the party every man that you met]i c. ∗ You gave ti a book [every man that you met]i (116) a. John persuaded Mary1 to go to the party undressed1 b. ∗ John promised Mary1 to go to the party undressed1 c. ∗ John gave Mary1 a book undressed1
31 Notice that sentences like (ia) below with an overt preposition are only weakly unacceptable, again paralleling double-object constructions (cf. [ib]). Interestingly, when the preposition is overt, wh-movement improves quite a bit, as shown in (ii).
(i) a. ?John promised to Mary to leave the party early b. ?John gave to Mary a book
(ii) (?)To whom did John promise to leave the party early?
These secondary-predication structures are particularly interesting for it is well known that PPs cannot be subjects of such predicates, as shown in (117) below. Baker (1997) argues that the goal of double-object constructions such as (116c) patterns like indirect objects because it is actually a prepositional complement, despite the lack of an overt preposition. This is exactly what we are saying regarding the object of ‘promise’ in constructions like (109a). In effect, the object of ‘promise’ behaves just like the object of ‘vow’ (cf. [118]), the only difference being the overt preposition in the latter case.32
(117) ∗John gave a book to Mary1 undressed1
(118) ∗John vowed to Mary1 to leave the party early undressed1
In sum, if the object of ‘promise’ in sentences like (119a) below is actually tucked inside a PP (see footnote 32), as sketched in (119b), movement of the embedded subject over the matrix object is a licit operation. The matrix object does not c-command the embedded subject and so does not intervene for purposes of minimality. Thus, generating the subject-control reading is expected to be possible.33 (119) a. John promised Mary to donate money to the library fund b. [ promised [P Mary] [John to donate money to the library fund]] ↑ OK
32 Strictly speaking, the present analysis does not require the postulation of null prepositions in English. It suffices for what we need here that ‘promise,’ just like ‘give’ in double-object constructions, renders an overt preposition, e.g., ‘to,’ null at some point of the derivation (perhaps via incorporation, as Baker [1997] proposes).
33 If movement of ‘John’ in (119b) is allowed because the matrix object does not c-command it, there arises the question of why the matrix object induces principle-C effects with respect to material inside the embedded clause, as illustrated in (ia) below, with a null preposition, and (ib), with an overt preposition. Note that the problem posed by (i) is not different from what we find in (ii), where raising over an experiencer PP is allowed in English, despite the fact that the nominal within PP induces a principle-C effect with respect to material inside the clausal complement.
(i) a. ∗John promised heri to visit Maryi b. ∗John vowed to heri to visit Maryi
(ii) a. [Johni seems to her [ti to be nice]] b. ∗[It seems to heri [that Maryi is nice]]
Kitahara (1997) has proposed that, at the point when raising takes place in (iia), the pronoun is within PP and does not induce intervention effects; later on in the derivation, the pronoun undergoes covert movement to a position from where it c-commands the clausal complement, thereby inducing principle-C effects (see also Boeckx 1999). As the reader can see, if something along these lines is correct, it applies to (i) in a straightforward fashion. Alternatively, we can exploit the account of (ii) suggested in section 5.2.3, according to which ‘to’ in (ii) is a marker of inherent case and inherently case-marked elements do not induce A-intervention. Again, this suggestion carries over to (i) straightforwardly. For purposes of presentation, we will keep using representations like (119b) in the discussion that follows and, accordingly, assume Kitahara’s (1997) and Boeckx’s (1999) proposal regarding principle-C effects. However, it should be noted that, as far as we can see, both approaches make the same empirical predictions regarding the material discussed here.
Moreover, this null-preposition hypothesis supplies the ingredients for an answer as to why some speakers never seem to allow sentences like (119a) or that they are late in getting to them. It should be no surprise to discover that null prepositions may be difficult to pin down. Clearly, the data used above to motivate its presence is too exotic to be considered part of the primary linguistic data. Thus, the main evidence for the grammaticality of sentences like (119a) are actual instances of such sentences used in the appropriate context. It might not be too great a stretch to assume that some speakers never receive the relevant input in sufficient amounts. There is a second confound here as well. If one adopts a non-relativized version of UTAH (see Baker 1997), then arguments will be projected to grammatical-function positions on the basis of their thematic proto-role consequences (see Dowty 1991). DP arguments that are internal to VP and that have sufficient theme/patient properties will be assigned to object positions. Those that do not have enough of such properties but have goal/path/location properties will be treated as oblique and mapped to the object of some preposition.34 The question that then arises is whether ‘Mary’ in cases like (119a) has more oblique- or more theme/patient-like consequences. Given the meaning of ‘promise,’ it would not be surprising if a child concluded that the promisee was affected (in some suitable sense) by the proffered promise. Were this to happen, the DP associated with the promisee would be treated as a direct object and minimality would block movement across it. As Baker (1997) emphasizes, if one takes a coarse-grained view of θ-roles and adopts a Dowty-style proto-role analysis to underwrite the projection of proto-roles to syntactic positions, then there will be cases where the relevant proto-role may be obscure (and/or ambiguous).
One way to alleviate the obscurity is by syntactic means, e.g., putting in an overt preposition will signal that a DP is not a direct object of the verb.35 However, if such overt evidence is absent, then ambiguity will be rife and more subtle calculations of semantic consequences will be needed to settle matters. It should not be surprising to find that these methods occasionally do not apply uniformly across speakers and that what some speakers catalogue as theme/patient others consider oblique. What is true for ‘promise’ also holds for ‘threaten,’ another putative subject-control predicate. It appears that some speakers accept sentences like (120).36 (120)
John threatened Mary to kiss Sue
34 See Baker (1997) for one reasonable elaboration of these mapping principles. 35 It is worth noting that the authors are acquainted with some speakers that cannot understand sentences like ‘John promised Mary to leave’ as involving subject control. For these speakers, these sentences considerably improve with the subject-control reading if ‘to’ is inserted before ‘Mary’ (‘John promised to Mary to leave’). 36 See Landau (2003).
Some, including one of the present authors, find (120) very unacceptable. The question on a Baker–Dowty account of θ-role projection is what one takes the proto-role of ‘Mary’ to be in (120). One way of being affected is to have one’s psychological state changed. It seems reasonable to conclude from (120) that Mary’s state of mind might be affected by John’s action. If so, ‘Mary’ will be a proto-theme/patient in (120) and so be mapped to the syntax as a direct object. Of course, some may conclude that Mary is merely a recipient of a threat and so not primarily an affected object and so might map ‘Mary’ in (120) to an oblique position, thereby permitting A-movement across it. Do promises alter one? Do threats? It is subtle questions like these that determine the thematic syntax of these constructions (on a Baker–Dowty approach) and that also determine whether a given DP will act as an intervener in an MTC account. It should not be surprising that the core cases (‘persuade,’ ‘force’) are unproblematic, while other verbs are not clear cut in their semantic consequences. In these latter cases, we should expect variation and we appear to find it. To conclude. We noted that an account of subject control sentences involving verbs like ‘promise’ needs to explain how they can be generated, why they are not allowed by all speakers, and how they contrast with control constructions like ‘persuade.’ Our proposal above offers a sketch that addresses all three concerns and is compatible with the MTC. In fact, we could go further. The MTC requires that post-verbal DPs in sentences like (119a) not intervene in order for the subject control reading to be generated. The evidence that such DPs act like objects of prepositions and not like direct objects is just what one would expect were the MTC correct. Put tendentiously, this is what the MTC predicts.
The same cannot be said for more conventional approaches which stipulate the antecedents of PRO via ad hoc diacritics that annotate the argument structure of embedding verbs.37 5.5.2 Control shift Let us now examine control shift. Consider (121), for instance. (121) a. John1 asked/begged/petitioned Mary2 [PRO2/∗ 1 to leave the party early] b. John1 asked/begged/petitioned Mary [PRO1 to be allowed/permitted to leave the party early]
37 As in Landau (2000) and Culicover and Jackendoff (2005), for example.
The cases in (121a) are typical cases of object control. Note that the indicated licit indexation observes the minimal-distance principle, the more proximate object blocking the more remote subject. In (121b), on the other hand, the matrix subject is a licit antecedent for PRO and it appears that the control relation is established over the matrix object, in violation of both the minimal-distance principle and minimality (if PRO is understood to be a residue of movement). Control shift can be licensed without an overt ‘allow’/‘permit’ verb in the embedded clause, as illustrated in (122). (122) a. John asked the guard [PRO to smoke one more cigarette] b. John asked the manager [PRO to pitch in the last game]
These cases are ambiguous with either subject or object being potential controllers. However, when ‘John’ controls, the only acceptable reading is as paraphrased in (123), with an overt instance of ‘allow’/‘permit.’ (123) a. John asked the guard to be allowed to smoke one more cigarette b. John asked the manager to be permitted to pitch in the last game
Moreover, the subject-control reading of the sentences in (122) becomes infelicitous if the guard/manager has no authority to grant the request. In effect, an optimal paraphrase of (122a), for instance, is (124a) below. When the object cannot be interpreted as the source of permission/authority, the sentence becomes unacceptable, as illustrated in (124b).38 (124) a. John asked the guardi to be allowed by himi to smoke one more cigarette b. ∗ John asked the guard to be allowed by the general to smoke one more cigarette
38 In addition, Chomsky (1980) has noted that control shift is rather sensitive to the right embedded content. Thus, (i) contrasts with (123).
(i) ∗John1 asked Mary [PRO1 to have/get permission to shave himself]
Why these cases contrast with (123) is unclear. However, one speculation is that the distinction active vs. passive may matter and that the passive makes the source reading more readily available on the matrix object.
Landau (2003: 480) observes that control shift “is sensitive to pragmatic factors” like “authority relations” and that languages differ in how sensitive they are to such factors. However, it has never been made clear why control shift is sensitive to such a tight interpretive constraint. Here we would like to propose that control-shift cases are in fact amenable to essentially the same kind of analysis proposed in section 5.5.1. What happens in such cases is that a matrix object may be treated as a thematic object or as a thematic oblique (a “source” to be specific). In the latter case, the DP is mapped to the complement position of a preposition, thereby allowing movement across it without any violation of minimality, just as in the case of ‘promise.’ Let us consider the details. The contrast between (121a) and (121b) with respect to the subject-control reading is in fact unexpected only if the matrix object of each sentence is associated with the same syntactic configuration. However, one reasonable possibility is that the thematic function of the matrix object is different in (121a) and (121b) and, if so, they should be projected into different syntactic positions under (a non-relativized version of) UTAH (see Baker 1997). Specifically, if the matrix object in (122), for instance, is a theme/patient then it will be a complement of ‘ask.’ If, however, it gets a source interpretation (i.e., it is understood as the source of the authority/permission to carry out the request whose content is specified by the embedded clause), then it should syntactically project as the object of a preposition, as illustrated in (125) below.39 This source interpretation underlies the implicit understanding of the matrix object as the agent of the “allowing,” as seen in (124). (125)
[John asked [PP P Mary] [PRO to be allowed to leave]]
The structure in (125) should look very familiar. It is what we proposed for matrix object of subject-control constructions with ‘promise’ (cf. [119b]). As in promise-cases, ‘Mary’ in (125) does not c-command PRO, or does not c-command it at that point in the derivation where ‘John’ moves over it, as shown in (126) (see footnote 33).40 (126)
[ asked [PP P Mary] [John to be allowed to leave]] ↑ OK
39 Baker (1997: 108) proposes collapsing sources, locations, goals, and paths into one proto-role.
40 As was seen with promise-cases, the nominal inside the PP complement in control-shift structures induces principle-C effects with respect to material inside the clausal complement, as illustrated in (ia) below, and again this holds regardless of whether the preposition is null or overt, as shown in (ib). See footnote 33 for two alternative accounts that are compatible with the MTC.
(i) a. ∗John asked her1 PRO to be allowed to visit Mary1 b. ∗John asked of/from her1 permission to visit Mary1
In fact, the interaction between control shift and control constructions involving ‘promise’ has a very curious result. So far, we have discussed standard cases of control shift, where the control shifts from the object to the subject. However, promise-cases yield “reverse” control-shift effects, that is, cases where the control shifts from the subject to the object. Consider the data below, for example.
(127) a. John promised Mary to leave b. ∗ Mary was promised (by John) to leave (128)
Mary was promised (by John) to be allowed to leave
Example (127a) is a typical example of subject control with ‘promise,’ where the object does not block movement of the embedded subject. In turn, (127b) shows that the matrix object of (127a) cannot be passivized. Interestingly, if the embedded clause contains a predicate that makes a control-shift reading salient, passivization becomes licit, as seen in (128). Our proposal has a straightforward account for the contrast between (127b) and (128). In (127b), ‘John’ or IMP, the empty counterpart of a by-phrase (see section 5.2.2), is the intended controller of the embedded subject. Given the MTC, this amounts to saying that ‘John’/IMP must be generated in the embedded clause and move to the Spec of the passive morpheme. In turn, this entails that the matrix object must be buried in a PP layer in order for minimality to be observed, as illustrated in (129). But if ‘Mary’ in (129) is the complement of a preposition, it is then reasonable to assume that it cannot be passivized (cf. [127b]). Conversely, if ‘Mary’ is generated as a standard verbal complement, it should be able to undergo passivization; however, being a DP complement, it blocks movement of the embedded subject. Either way, the derivation crashes. (129)
[ -en [promised [PP P Mary] [IMP/John to leave]]] ↑ OK
In (128), on the other hand, ‘Mary’ is the controller (and not the source of authority). In other words, the derivation of (128) parallels the derivation of passives involving object-control verbs such as ‘persuade’ (cf. [20] in section 5.2.1): ‘Mary’ is generated in the embedded clause and moves to the specifier of ‘promise’ before landing in the matrix [Spec, TP], as sketched in (130).
(130) a. [VP Maryi promised [ti to be allowed ti to leave]] ↑ OK
b. [vP Maryi [v’ [IMP/by John] promised+-en [VP ti tpromised [ti to be allowed . . . ]]]] ↑ OK
c. [TP Maryi T [vP ti [v’ [IMP/by John] promised+-en [VP ti tpromised [ti to be . . . ]]]]] ↑ OK
Note that this account ties together control shift and the interpretive restrictions that characterize it. “Authority” restrictions translate naturally into a
source role for the nominal object of the matrix clause. Source roles being oblique are syntactically mapped into PP configurations. Given this mapping, the matrix object does not function as an intervener for another DP that moves across it, licensing subject control. To put this another way: the MTC (coupled with UTAH) requires that the matrix nominal object receive an oblique thematic interpretation in order for subject control to be possible in such syntactic configurations. The fact that this occurs thus constitutes a further argument in favor of the MTC. In effect, the MTC explains why control-shifted contexts are semantically restricted to cases where the matrix object is thematically oblique. In short, if we are correct that in control-shifted clauses the matrix object is actually an oblique inside a PP, then the purported empirical problem that control shift posed for the MTC is actually an argument in its favor and an argument against more standard accounts that decouple control selection from the minimal-distance principle or minimality. There is additional independent empirical evidence for the PP structure in (126). First, there are related forms that overtly show prepositions, as illustrated in (131). (131) a. John asked/begged (of/from) Mary permission to leave b. John asked/begged permission to leave of/from Mary c. John petitioned from/??of Mary a permit to leave
Here the preposition is overtly manifest and the source reading on ‘Mary’ is forced. These must be interpreted with Mary authorizing the leaving, just as in the control-shifted readings above. Examples (131a) and (131b) seem to display something analogous to the dative alternation found in double-object constructions and promise-constructions (see section 5.5.1). Interestingly, the preposition is required when the source is at the end of the clause but can be deleted if it is in medial position.41 Given that these verbs already have structures with oblique source arguments, it is reasonable that such roles can arise in the control-shifted readings as well, the main difference residing in the deletion of the preposition in the shifted cases. Second, the diagnostics operative in double-object constructions and promise-cases (see section 5.5.1) also extend to these configurations. For example, wh-movement is possible under standard object control (cf. [132a]), but strained under the control shift (cf. [132b]), which is more on a par with promise- (cf. [132c]) and double-object constructions (cf. [132d]). 41 The deleted version may be slightly preferred.
(132) a. I wonder [whoi John asked ti [PROi to leave early]]
b. ??I wonder [whoi Johnk asked ti [PROk to be allowed to leave early]]
c. ??I wonder [whoi Johnk promised ti [PROk to leave early]]
d. ??I wonder [whoi John gave ti a book]
This effect is also evident in those cases where either a theme/patient or a source reading can be attributed to the matrix nominal object. Example (133a) below, for instance, is ambiguous. However, when the matrix object undergoes wh-movement, the strongly preferred reading is one in which ‘who’ is the controller. That is, under the subject-control reading, (133b) patterns like the transparent control-shift construction in (133c). (133) a. John1 asked/begged [the guard]2 [PRO2/1 to smoke a cigarette] b. Who2 did John1 ask/beg t2 [PRO2/??1 to smoke a cigarette] c. ??Who2 did John1 ask/beg t2 [PRO1 to be allowed to smoke a cigarette]
In turn, heavy NP shift is allowed under object control, but unavailable under control shift, as illustrated in (134). (134) a. [John1 asked/begged t2 [PRO2/??1 to stop working] [every employee that he met]2 ] b. [∗ John1 asked/begged t2 [PRO1 to be allowed to stop working] [every employee that he met]2 ]
Finally, secondary predicates modifying the matrix object are disallowed in the shifted reading, though acceptable in the object-control case, as shown in (135). (135) a. John1 asked/begged Mary2 , unsure of herself, [PRO2 to sing at the gala] b. ∗ John1 asked/begged Mary2 , unsure of herself, [PRO1 to be allowed to sing at the gala]
In sum, the diagnostics used to argue that the matrix object of control constructions with ‘promise’ and the goal of double-object constructions are actually contained in an underlying PP also extend to the control-shift cases. This then provides independent support for the structure proposed in (125)/(126). 5.5.3 Summary Let us recap and conclude. Several have claimed that subject control over “objects” in promise- and in control-shifted cases constitutes strong evidence against the empirical adequacy of Rosenbaum’s minimal-distance principle and, pari passu, the MTC, which has the minimal-distance principle as a consequence. We have noted that the arguments depend on the premise that
the matrix object DP in constructions like (136) is a direct complement of the matrix verb. (136) a. John1 promised Mary [PRO1 to leave the party early] b. John1 asked Mary [PRO1 (to be allowed) to smoke a cigarette in her apartment]
If, however, this DP is actually embedded within a PP, then it would not serve as an intervener and so minimality would not block movement across it. In effect, the MTC requires that this apparent direct object actually be analyzed as an indirect object at that point in the derivation where the movement applies and this requires that it get an oblique thematic interpretation given UTAH. We have reviewed evidence pointing to precisely this conclusion and so these cases shift from being problems for the MTC to being evidence in its favor. Pace Landau (2003: 481), it is far from clear that “a theory that does not derive the [minimal-distance principle] is, ceteris paribus, better off than one that does.” Our view is well expressed by Elizabeth Bennet’s reply to Mr. Darcy at the end of Pride and Prejudice: “[our] feelings are quite different, just the opposite in fact.”
5.6 Partial and split control
In section 5.4.2, we discussed cases where controller and controllee differ with respect to the agreement patterns they trigger and argued that the dissimilarity arises in the course of the computation, as the moving DP (the controller) receives a θ-role that is incompatible with the quirky-case value previously assigned. In this section we discuss two cases where controller and controllee seem to be semantically distinct. The first case involves partial-control constructions like (137) below, where the embedded predicate requires a semantically plural subject but the controller is singular and must be interpreted as a member of the set of referents denoted by the embedded subject. The second case involves constructions where the controlled subject has split antecedents, as illustrated in (138). (137)
[[The chair]i decided [PROi+k to meet at 6]]
(138)
[Johni proposed to Maryk [PROi+k to meet each other at 3]]
Although semantically plural, the controllees of these constructions differ with respect to syntactic number (see Landau 2000). In split control the controllee is also syntactically plural, but in partial control the syntactic number of the controllee is determined by the controller. In (137), for instance, the controllee
is syntactically singular, like its controller. This can be seen by the contrast between (138) and (139). Given that a plural anaphor must be licensed by a syntactically plural antecedent, ‘each other’ can be licensed in the split-control structure in (138), but not in the partial-control structure in (139). (139)
∗[[The chair]i decided [PROi+k to meet each other at 6]]
Examples (137) and (138) clearly contrast with (standard) exhaustive-control constructions such as the ones in (140), where the controller must be unique and have the same (semantic and syntactic) number specification as the controllee. (140) a. ∗ [[The chair]i managed [PROi+k to meet at 6]] b. ∗ [Maryi expects Johnk to try [PROi+k to leave]]
As discussed throughout this volume, exhaustive control receives a straightforward analysis under the MTC. If the controllee is a trace/copy of the moved element, the controller should completely determine the referential properties of the controllee. Similarly, there can be no split antecedents for the controllee because two distinct elements cannot move from the exact same position (see section 3.4.1). Given this overall picture, partial- and split-control constructions look especially challenging for the version of the MTC explored here. If obligatorily controlled PRO is simply a residue of A-movement (i.e., a copy of the antecedent interpreted as a bound variable), how can it be interpreted as semantically plural in (137), for example, or have distinct binders in (138)? It is worth observing that there is in fact a clear-cut divide between (standard) exhaustive control, on the one hand, and split and partial control, on the other. The latter is much more amenable to crosslinguistic variation, lexical idiosyncrasies, and non-uniform judgments among speakers. So, whatever turns out to be the ultimate analysis of these constructions, it should treat them as somewhat special when compared to exhaustive-control constructions. In the next sections we discuss how partial and split control can be handled as special cases under the MTC. 5.6.1 Partial control The fact that controller and controllee need not match in semantic number in partial-control constructions has been taken as a strong argument for a PRO-based analysis. Landau (2004), for instance, analyzes the partial-control construction in (141a) along the lines of (141b), where the feature [+Mer] characterizes group names.
(141) a. The chair hoped to meet at 6/apply together for the grant b. [[The chair][−Mer] hoped [PRO[+Mer] to meet at 6/apply together for the grant]]
Note that the representation in (141b) by itself is not able to ensure that PRO is interpreted as being a set which includes the referent of the matrix subject as a member. All it says is that PRO is semantically plural. In addition, it does not distinguish tensed infinitives such as (141a), which license partial control, from untensed infinitives such as (142), which do not. (142) a. ∗ The chair managed to meet at 6/apply together for the grant b. ∗ [[The chair][−Mer] managed [PRO[+Mer] to meet at 6/apply together for the grant]]
Landau obtains the wanted results by requiring that PRO have its φ-features licensed and ascribing a specific set of features to the C- and T-heads in tensed infinitivals which then allows PRO to get its φ-features valued through a chain of agreement relations involving: (i) the matrix T and the matrix subject; (ii) the matrix T and the embedded C; (iii) the embedded C and the (tensed) infinitival T; and (iv) the embedded infinitival T and PRO (see sections 2.5.2 and 4.4). As discussed in detail in section 2.5.2, crucial parts of the postulated feature system (the R-assignment rule, for instance) are not independently motivated (something that Landau [2004: 852] himself acknowledges) and the typology predicted is empirically incorrect in that it has no room for finite control into indicatives, something that is allowed in Brazilian Portuguese. In addition, the use of a specific chain of agreement relations to account for partial control overgenerates and incorrectly allows finite control in languages like English, for instance. Here we will put these problems aside and focus on some incorrect empirical predictions that a PRO-based analysis à la Landau makes with respect to partial control, in contrast with the MTC. One such incorrect prediction was already mentioned in section 2.5.2. As observed by Hornstein (2003), the presence of an embedded predicate that requires a plural subject is not sufficient for partial control to be licensed, as shown in (143).
(143) ∗John hoped [PRO to sing alike/to be mutually supporting]
Notice that in (143) the matrix predicate is of the type that licenses partial control (cf. [141a]). Thus, the availability of partial control seems to be related to properties of the embedded predicate as well.
Hornstein (2003) in fact suggests that only predicates that select a commitative PP can support partial control. Compare the data in (144) and (145), for example.
(144) a. ∗The chair sang alike/was mutually supporting with Bill
b. The chair left/went out (with Bill)
c. The chair met/applied together for the grant (∗with Bill)
(145) a. ∗The chair hoped to sing alike/be mutually supporting
b. The chair hoped to leave/go out
∗‘The chair hoped that he and other people would leave/go out’
c. The chair hoped to meet/apply together for the grant
Given that predicates like ‘sing alike’ and ‘be mutually supporting’ require a plural subject, (144a) shows that the plural meaning cannot be obtained via a commitative. Accordingly, these predicates do not license partial control (cf. [145a]). Examples (144b)/(145b) in turn show that being compatible with an (adjunct) commitative is not sufficient to license partial control ([145b] only admits an exhaustive interpretation for the embedded subject). Rather, the commitative must be selected, as shown by (144c)/(145c). When the facts in (144) and (145) are taken into consideration, the appeal of the PRO-based analysis gets bleached. Rather than involving a PRO with different semantic features, as represented in (141b), the derivation of partial-control structures in fact seems to involve the licensing of a null commitative, as illustrated in (146).
(146) a. [The chair hoped [PRO to meet procommitative at 6]]
b. [The chair hoped [PRO to apply together procommitative for the grant]]
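The generalization drawn from (144) and (145) can be restated as a small lookup, offered here only as an illustrative sketch. The class name `Predicate`, its two boolean fields, and the function `licenses_partial_control` are assumptions introduced for this illustration; they are not part of Hornstein's (2003) proposal, and the three entries simply transcribe the judgments reported in (144) and (145).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    """Hypothetical lexical entry recording the judgments in (144)."""
    name: str
    allows_commitative: bool   # a 'with'-PP is possible at all
    selects_commitative: bool  # the 'with'-PP is selected, not an adjunct


def licenses_partial_control(pred: Predicate) -> bool:
    # Compatibility with an adjunct commitative is not enough (144b)/(145b);
    # only a selected commitative supports partial control (144c)/(145c).
    return pred.selects_commitative


# Entries transcribed from the judgments in (144); illustrative only.
PREDICATES = [
    Predicate("sing alike", allows_commitative=False, selects_commitative=False),
    Predicate("leave", allows_commitative=True, selects_commitative=False),
    Predicate("meet", allows_commitative=True, selects_commitative=True),
]

for pred in PREDICATES:
    print(pred.name, licenses_partial_control(pred))
```

On this encoding only ‘meet’ comes out as licensing partial control, which matches the contrast between (145b) and (145c).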
Note that, if partial control just involves the licensing of null commitatives, one need not commit hostages to postulating PRO or assigning special properties to it. Under the MTC, the embedded subjects of the partial-control constructions in (146) should in fact be garden-variety NP-traces, as shown in (147) below.
(147) a. [[The chair]i hoped [ti to meet procommitative at 6]]
b. [[The chair]i hoped [ti to apply together procommitative for the grant]]
Data such as (144) and (145) thus show that it is not the case that partial control necessarily favors PRO-based analyses or that it is fatal to the MTC. As seen in (147), the MTC is also compatible with partial-control phenomena. In fact, we may go one step further and show that the MTC analysis encapsulated in (147) is empirically superior to PRO-based approaches.
First, PRO-based approaches admit unavailable readings with verbs that select commitatives. Take the sentence in (148), for instance.
(148) The chair hoped to meet with the president
Under the MTC approach outlined in (147), (148) is to be represented as in (149) below, where the embedded subject is a trace of the matrix subject. Notice that the structure in (149) can only support an exhaustive control reading: the chair is the only person to meet with the president.
(149) [[The chair]i hoped [ti to meet with the president]]
By contrast, under PRO-based analyses, (148) is to be represented as in (150) below. Suppose that PRO in (150) is semantically plural (it is marked [+Mer] in Landau’s [2004] terms), a possibility available given that the infinitival is tensed. If so, (150) should allow an interpretation under which the chair hoped that a group of people including him would meet with the president. However, this reading is impossible in (148). Why an overt commitative eliminates an otherwise available partial-control reading is therefore quite surprising under PRO-based analyses.
(150) [[The chair] hoped [PRO to meet with the president]]
Let us now consider the tense restrictions. Suppose we adapt Landau’s proposal that tensed and untensed infinitivals contrast in licensing partial control and assume that tensed infinitivals license null commitative complements, but tenseless infinitivals do not. At first sight, there is no gain in such a translation. However, when we look closer, the two approaches do make distinct predictions. Under a PRO-based approach, partial control is something related to the embedded null subject of tensed infinitivals, whereas under the MTC partial control is related to the licensing of null commitative complements. Thus, we should in principle be able to find interpretive effects associated with partial control even when no infinitival clauses are involved. Bearing this in mind, consider the contrasts in (151) and (152), based on Rodrigues (2007).
(151) a. ∗The chair met at 6
b. The chair can only meet tomorrow
(152) a. ∗The chair applied together for the grant
b. The chair cannot apply together for the grant
Under a PRO-based approach, (151a) and (152a) are excluded because the subject is semantically singular and there is no position available for a semantically plural PRO. As Rodrigues (2007) points out, this reasoning should carry over to sentences like (151b) and (152b), predicting that these sentences should be equally unacceptable, contrary to fact. Following Wurmbrand’s (2006) proposal that “tensed” infinitivals actually involve an abstract modal operator (‘woll’), Rodrigues argues that standard partial-control constructions involving infinitivals and data such as (151b) and (152b) can receive a uniform analysis if the null pronoun that acts as the trigger for the semantically plural reading is licensed by the modal.42
Another argument against PRO-based approaches to partial control and in favor of the MTC involves secondary predicates (see Rodrigues 2007). Consider the sentences in (153), for example.
(153) a. John hates to meet angry
b. John wants to meet ready for all contingencies
In each of the sentences in (153), the secondary predicate modifies ‘John.’ Under the null commitative approach suggested above coupled with the MTC, this is no surprise. As illustrated in (154) below, the secondary predicate modifies the trace of the moved subject. Note that the null commitative is an indirect object of sorts (it corresponds to an overt PP) and, as we saw in section 5.5, secondary predicates cannot modify indirect objects.
(154) a. [Johni hates [ti to meet procommitative angry]]
b. [Johni wants [ti to meet procommitative ready for all contingencies]]
By contrast, PRO-based accounts of partial control seem unable to capture the interpretation of the secondary predicate in sentences such as the ones in (153). Given that secondary predication is clause-bound, the secondary predicate of the structures in (155) below cannot modify the matrix subject. Moreover, one cannot say that this modification is indirectly allowed in virtue of PRO being plural and ‘John’ being included in the denotation of PRO. Were that possible, we should find a similar reading in a sentence such as (156), with the secondary predicates modifying individuals denoted by the semantically plural subject. However, the secondary predicates of (156) must hold of the whole committee and cannot be restricted to its leaders, for example.
(155) a. [Johni hates [PROi+j to meet angry]]
b. [Johni wants [PROi+j to meet ready for all contingencies]]
(156) [Its leaders said [that the committee met drunk/angry/ready for anything]]
42 Rodrigues (2007) actually proposes that partial-control constructions involve a complex DP subject with null pronoun along the lines of (i) below. In the derivation of a partial-control sentence such as (iia), for instance, the DP of (i) moves to the matrix clause, leaving pro stranded, as sketched in (iib). Although compatible with the MTC, Rodrigues’s proposal does not account for the commitative restrictions on the embedded predicate illustrated by (145) and (148). Thus, we have reinterpreted her original proposal of a complex DP subject in terms of a null commitative complement.
(i) [pro DP]
(ii) a. The chair decided to meet at 6
b. [[The chair]i decided [ti to [[pro ti ] meet at 6]]]
In addition, Rodrigues (2007) also shows that embedded secondary predicates syntactically agree with the matrix subject in partial-control constructions. As she observes, this can be clearly seen when the controller involves nouns such as vítima ‘victim’ in Brazilian Portuguese, which is feminine regardless of whether its referent is male or female. In a sentence like (157) below, for instance, the secondary predicate must surface in the feminine singular form, agreeing in gender and number with the matrix subject. Again, these agreement facts are exactly what one expects under the commitative approach coupled with the MTC, as illustrated in (158) (with English words), where the secondary predicate agrees with the trace of the matrix subject.
(157) Brazilian Portuguese:
A vítima decidiu [se reunir vestida informalmente]
The.FEM victim.FEM decided REFL gather dressed.FEM casually
‘The victim decided to gather dressed casually’
(158) [[The.FEM victim.FEM]i decided [ti to gather dressed.FEM casually]]
Rodrigues (2007) provides one final very ingenious argument in favor of the MTC based on Torrego’s (1996) work on control constructions in Spanish such as (159a), which are arguably related to the possibility of subject doubling in (159b).
(159) Spanish (Torrego 1996):
a. No sabemos si firmar los lingüistas la carta
Not know.1PL whether sign.INF the linguists the letter
‘We don’t know whether the linguists among us should sign the letter’
b. Fuimos los lingüistas
went.1PL the linguists
‘The linguists among us went’
Rodrigues refers to cases like (159a) as inverse partial control, for the floating DP in the embedded clause is understood as a subset of the set denoted by the matrix subject, as indicated by the translation. Given the subject-doubling structure in (159b), it is plausible that the inverse partial-control construction in (159a) also involves subject doubling, that is, the floating DP and the controllee should form a doubling structure. The relevant question is whether (159a) is to be represented as in (160), where the floating DP forms a doubling structure with the trace of the matrix subject, as the MTC would require, or with the structure in (161), where the floating DP doubles PRO.
(160) [proi not know.1PL [whether [TP ti sign [vP [ti the linguists] tsign the letter]]]]
(161) [proi not know.1PL [whether [TP PROi sign [vP [ti the linguists] tsign the letter]]]]
Rodrigues builds an answer based on the contrast between (159a) and (162a), which mimics the contrast between (159b) and (162b) observed by Torrego.
(162) Spanish:
a. ∗Nosotros no sabemos si ir los lingüistas
We not know.1PL whether go.INF the linguists
‘We don’t know whether the linguists among us should go’ (Rodrigues 2007)
b. ∗Nosotros fuimos los lingüistas
We went.1PL the linguists
‘The linguists among us went’ (Torrego 1996)
The contrast between (159) and (162) shows that a null first-person plural pronoun can be doubled, but an overt one cannot. That being so, Rodrigues concludes that the ungrammaticality of (162a) can only be accounted for by the MTC structure in (160), but not by the structure in (161). In (160), the floating DP doubles a trace, i.e., a copy, of the matrix subject. Thus, if the matrix subject is overt, it should not allow doubling for the same reasons (162b) does not license it. By contrast, under the PRO-based structure in (161), the floating DP is not directly associated with the matrix subject, but with PRO. So, if this association is licit in (159a), it should also be allowed in (162a), contrary to fact. Again, the empirical coverage provided by the MTC with respect to inverse partial-control constructions is superior to PRO-based approaches.43
43 See Rodrigues (2007) for additional arguments.
To conclude, the existence of partial-control interpretations is often taken to be incompatible with movement theories of control. This section has reviewed approaches to partial-control phenomena that are fully compatible with the MTC and has also discussed constructions that present serious problems for PRO-based approaches. However, this should not be taken to imply that we fully understand partial control. There remain several open questions. For example, why the extensive speaker and language variation regarding the accessibility of such readings? All speakers easily get the exhaustive readings and for many the partial readings are difficult to get if attainable at all. But even in this regard the MTC proves to be no worse than PRO-based alternatives, which do not have an explanation for these facts either. All things considered, it is fair to say that the problems that partial-control phenomena allegedly pose to the MTC are more related to our lack of understanding of these phenomena than to architectural features of the MTC.
5.6.2 Split control
Let us now turn to split control. Landau (2000) points out that, although most verbs do not allow split control, some do, as illustrated in (163) below.44 In this section we present an MTC approach to split control based on Fujii’s (2006) work on control in Japanese. As Japanese ties the possibility of split control to a specific particle, it can shed more light on what licenses split control and how it is derived.
(163) Johni proposed to Maryk [PROi+k to help each other]
As Fujii shows, Japanese has three mood particles that trigger obligatory control: the “intentive” marker -(y)oo, the imperative marker -e/-ro, and the “exhortative” marker -(y)oo, as respectively illustrated in (164) below.45 Interestingly, each marker is associated with a different type of obligatory control: the intentive marker is associated with subject control, the imperative marker with object control, and the exhortative marker with split control. Here we will focus on exhortatives.
(164) Japanese (Fujii 2006):
a. Taroi -wa [PROi boku-no beeguru-o tabe-yoo-to] keikakusita
Taro.TOP my bagel.ACC eat.INTENT.C planned
‘Taro planned to eat my bagel’
b. Yokoi -wa Hiroshik -ni [PROk boku-no beeguru-o tabe-ro-to] meireisita
Yoko.TOP Hiroshi.DAT my bagel.ACC eat.IMP.C ordered
‘Yoko ordered Hiroshi to eat my bagel’
c. Taroi -wa Hiroshik -ni [PROi+k otagai-o tasuke-a-oo-to] teiansita
Taro.TOP Hiroshi.DAT each other.ACC help.RECIP.EXHORT.C proposed
‘Taro proposed to Hiroshi to help each other’
44 Similar facts have been observed in German, Hebrew, Turkish (see Oded 2006), and Japanese (see Fujii 2006).
45 Japanese exhortative constructions are interpreted in a way similar to English let’s-constructions. See Fujii (2006) for discussion.
Fujii shows that, aside from the ban on split antecedents, exhortative constructions such as (164c) test positive for all the standard diagnostics of obligatory control. Relevant to our current discussion is the fact that the antecedents of exhortative constructions must be local to one another, that is, they cannot be in different clauses,46 as illustrated by the contrast in (165) below. Note that in (165a) there are two local controllers for PRO in the intermediate clause, ‘kare’ and ‘Hiroshi.’ In (165b), on the other hand, there is only one local antecedent available, ‘kare.’
(165) Japanese (Fujii 2006):
a. Taroi -wa otooto-ni [karei -ga Hiroshik -ni [PROi+k otagai-o sonkeisi-a-oo-to] itta-koto]-o tugeta
Taro.TOP brother.DAT he.NOM Hiroshi.DAT each other.ACC respect.RECIP.EXHORT.C said.C.ACC told
‘Taroi told his brother that hei said to Hiroshik to respect each otheri+k ’
b. ∗Taroi -wa Hiroshik -ni [karei -ga [PROi+k otagai-o sonkeisi-a-oo-to] omotteiru-koto]-o tugeta
Taro.TOP Hiroshi.DAT he.NOM each other.ACC respect.RECIP.EXHORT.C thinks.C.ACC told
‘Taroi told Hiroshik that hei thought that theyi+k should respect each other’
Also relevant to the present discussion is a particular gap in the mood paradigm in Japanese. As seen above, Japanese allows subject control with “intentive” mood (cf. [164a]), object control with imperative mood (cf. [164b]), and split control with exhortative mood (cf. [164c]). Fujii (2006) points out that this leaves out one possibility: subject control in a two-DP argument structure, i.e., the flipside of what exists in the imperative mood, as sketched in (166).
(166) ∗[DP1 DP2 [CP PRO1 . . . MOOD . . . ] V]
Both the ungrammaticality of (165b) and the unavailability of mood markers associated with the control configuration in (166) have the flavor of a minimality/minimal-distance principle effect. But if minimality obtains, how are split-control configurations like (165a) derived? Fujii argues that the mood particles highlighted in (164) head mood phrases (MoodPs) and that there is no case available for the subject of MoodPs. In the particular case of exhortative -(y)oo, Fujii proposes that it licenses some sort of coordinate structure in its Spec, as represented in (167).
(167) [MoodP [α + β] [Mood’ -(y)oo TP]]
46 See Landau (2000) for similar observations regarding split control in English.
Suppose that the complex specifier in (167) is some sort of commitative expression. If so, split-control structures such as (164c) or (165a) can be derived along the lines of (168) below. In (168a), [+ β] moves to the internal θ-position of the subcategorizing V. Under the assumption that [+ β] is complex with + akin to a null commitative preposition, β does not count as an intervener and α can move to [Spec, vP] without any minimality problems. In the relevant respects, movement of α in (168b) is parallel to what we saw with subject-control cases involving ‘promise’ (see section 5.5.1).47
(168) a. [VP [+ β] [V’ V . . . [MoodP [α [+ β]] [Mood’ -(y)oo TP]]]]
b. [vP α [v’ v [VP [+ β] [V’ V . . . [MoodP [α [+ β]] [Mood’ -(y)oo TP]]]]]]
47 The derivation that Fujii (2006) actually proposes for licit instances of split control involves the steps illustrated in (i) below. β moves to receive the θ-role of the matrix V, pied-piping α (cf. [ia]). α then moves to the matrix [Spec, vP], yielding (ib).
(i) a. [VP [α + β] [V’ V . . . [MoodP [α + β] [Mood’ -(y)oo TP]]]]
b. [vP α [v’ v [VP [α + β] [V’ V . . . [MoodP [α + β] [Mood’ -(y)oo TP]]]]]]
Although compatible with the MTC, the derivation above seems to make the incorrect prediction that a sentence such as (iia), for instance, could mean ‘John washed Bill and himself,’ given the structure in (iib). In order to block (iib), we have kept the gist of Fujii’s original proposal that split control involves some sort of coordination, but reinterpreted it under a commitative structure.
(ii) a. John washed Bill
b. [vP John v [VP washed [John + Bill]]]
Going back to English, it is plausible that its split-control constructions also involve a derivation with an exhortative MoodP, as in (168). The sentence in (163), for example, repeated in (169) below, can be derived as sketched in (170). The considerable amount of overlap between the verbs that allow exhortative -(y)oo in Japanese and the verbs that allow split control in English suggests that something along these lines may indeed be on the right track.
(169) John proposed to Mary to help each other
(170) a. [VP proposed . . . [MoodP [John [+Mary]] to help each other]]
b. [vP John proposed-v [VP [to [+Mary]] tproposed . . . [MoodP [John [+Mary]] . . . ]]]
The account reviewed above leaves several questions open. For instance, we have witnessed above another interaction between commitatives and modals (see section 5.6.1). Why interactions like these should hold is far from clear. For example, why is split control limited to exhortatives in Japanese? Why do imperatives not support such readings in Japanese? The problem becomes more pressing when one sees that the semantic apparatus that handles imperatives is easily extended to accommodate exhortatives. It appears that they only differ in that imperatives encumber the addressee’s to-do list while exhortatives fill in the speaker’s to-do list as well. It is easy to understand why exhortatives require plural (commitative) subjects, but it is harder to understand why imperatives must resist them.
As for the question of why certain verbs resist split control, Landau (2000) and Fujii (2006) propose that this follows from the semantics. According to Landau (2000: 55), for instance, “[u]nlike propose and ask, recommend and order do not allow split control – for obvious reasons, given that in order to engage in some action, one does not recommend to/order other people to do it.” We do not find this at all obvious. Why can one not order someone or recommend to someone to engage in a collaborative activity, e.g., washing each other? Why can ‘John ordered Mary to wash each other’ not mean that John ordered Mary to engage with him in the activity wherein each of John and Mary washes the other? This does not seem semantically untoward, nor are recommendations to do so any odder semantically. The fact is that split control seems to be very restricted, perhaps applying only to verbs that support exhortative interpretations. However, why this is so and what exactly characterizes exhortations so that they essentially differ from imperatives and other kinds of moods remains, we believe, quite unclear. This said, were there a semantic restriction of the kind Landau and Fujii suggest, it could be easily combined with the MTC (as Fujii observes).
5.7 Conclusion
As discussed throughout this volume, the MTC takes obligatory control to be established via A-movement. The only relevant difference between obligatory control, on the one hand, and raising, passivization, and (local) scrambling, on the other, is that the relevant A-movement in the case of obligatory control is triggered by θ-reasons. We think that this difference is not significant enough (at least not in a framework like minimalism, which does not recognize any substantive notion of D-structure) to warrant a special control operation/construction/rule. However, we do think this thematic difference is important when it comes to explaining some divergences between obligatory-control constructions and the other types of constructions that rely on A-movement. Indeed, as a research strategy, we would like it to be true of any difference between obligatory control and, say, raising that it reduce to the extra thematic relation established under control.
In the previous sections we have examined a variety of empirical phenomena that have been claimed to pose insuperable problems for the MTC. We have shown that, under close scrutiny, all the allegedly deadly counter-examples can receive plausible analyses under the MTC. Even more importantly, all the answers given stemmed from two simple ideas: (i) that, if control involves movement, (relativized) minimality must be obeyed; and (ii) that quirky-case-marked DPs must be stripped of their quirky case if they are to be further marked. What we have done is make the relevant configurations explicit so that minimality can be properly computed and θ-marking properly characterized. All in all, it seems to us that, rather than presenting fatal counter-examples to the MTC as often claimed, most of the phenomena reviewed in this chapter end up lending strong conceptual and empirical support to the MTC.
It is, of course, up to the reader to decide if we are right and to what degree the purported difficulties for the MTC reviewed and reanalyzed here vitiate the project of reducing control to movement.
6 On non-obligatory control
6.1 Introduction
Within the MTC, non-obligatory control (NOC) has been pushed to the side, with the focus of inquiry resting on obligatory control (OC). The most obvious reason for this is that, as opposed to OC, NOC does not resort to movement. Nonetheless, the MTC is incomplete without an account of the distribution of NOC and in this chapter we would like to present our thoughts on this issue.1 But before we proceed, a disclaimer is in order. As NOC is the elsewhere case (when movement is not involved), there may be different types of NOC which in turn may be subject to different licensing conditions. The discussion below presents the beginnings of an account of NOC that we believe is quite reasonable. However, should it turn out to be partially or totally incorrect, this does not affect the essence of the preceding chapters. As mentioned above, the MTC effectively has something to say about control relations that exhibit movement diagnostics but not much about construal relations that are not derived by movement. Thus, although we think that the proposal to be discussed below fits snugly with the version of MTC advocated in this volume, the two are to some extent independent from one another. That said, let us move to the discussion proper.
The chapter is organized as follows. In section 6.2, we discuss configurations where OC and NOC are in complementary distribution and present Hornstein’s (1999, 2001, 2003, 2007) proposal that NOC PRO is a null pronoun and that the complementary distribution between OC and NOC rests on an economy competition between movement and pronominalization. Section 6.3 examines (apparent) counter-examples to this complementarity and section 6.4 outlines an approach according to which the interpretation of NOC PRO is a dual function of the grammar and the parser. Section 6.5 concludes the chapter.
1 The discussion to be presented below is primarily based on Boeckx and Hornstein (2007).
6.2 Obligatory vs. non-obligatory control and economy computations
The pairs of examples in (1)–(6) below illustrate the systematic contrast between OC and NOC (see Chapter 2). Example (1a) shows that OC PRO requires an antecedent; (2a) that the antecedent must be local; (3a) that the antecedent must be in a c-commanding position; (4a) that OC yields sloppy readings under ellipsis; (5a) that OC PRO must be interpreted as a bound variable when associated with an only-DP; and (6a) that OC PRO only admits a de se reading in “unfortunate” contexts. On the other hand, the corresponding b-examples show that exactly the opposite holds of NOC PRO: it does not require an antecedent (cf. [1b]); if it has an antecedent, the antecedent need not be local (cf. [2b]) or in a c-commanding position (cf. [3b]); and it allows both strict and sloppy readings under ellipsis (cf. [4b]), bound and coreferential readings when associated with only-DPs (cf. [5b]), and de se and non-de se readings (cf. [6b]).
(1) a. ∗It was expected PRO to shave himself
b. It is illegal PRO to park here
(2) a. ∗Johni thinks that it was expected PROi to shave himself
b. Johni thinks that Mary said that PROi shaving himself is vital
(3) a. ∗Johni ’s campaign expects PROi to shave himself
b. Johni ’s friends believe that PROi keeping himself under control is vital if he is to succeed
(4) a. Johni expects PROi to win and Billk does too (‘and Billk expects himself to win,’ not ‘and Billk expects himi to win’)
b. Johni thinks that PROi getting his resumé in order is crucial and Bill does too (‘Billk thinks that hisi/k getting his resumé in order is crucial’)
(5) a. [Only Churchill]i remembers PROi giving the ‘Blood, Sweat, and Tears’ speech
b. Only Churchill remembers that PRO giving the BST speech was momentous
(6) a. [The unfortunate]i expects PROi to get a medal
b. [The unfortunate]i believes that PROi getting a medal is unlikely
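The six diagnostics in (1)–(6) amount to two mirror-image profiles, which the following Python sketch tabulates. The key names, the dictionaries `OC_PROFILE` and `NOC_PROFILE`, and the helper `classify` are hypothetical labels chosen for this illustration only; the sketch merely records the pattern described above and makes no claim about how the contrast is derived.

```python
# Each key names one diagnostic from (1)-(6); the value states whether the
# property holds of OC PRO. NOC PRO shows the opposite value on every test.
OC_PROFILE = {
    "requires_antecedent": True,            # (1)
    "antecedent_must_be_local": True,       # (2)
    "antecedent_must_c_command": True,      # (3)
    "allows_strict_reading": False,         # (4)
    "allows_coreferential_reading": False,  # (5), with only-DPs
    "allows_non_de_se_reading": False,      # (6)
}
NOC_PROFILE = {test: not value for test, value in OC_PROFILE.items()}


def classify(observed: dict) -> str:
    """Label a (possibly partial) set of observations as OC or NOC."""
    if not observed:
        return "undetermined"
    fits_oc = all(OC_PROFILE[test] == value for test, value in observed.items())
    fits_noc = all(NOC_PROFILE[test] == value for test, value in observed.items())
    if fits_oc:
        return "OC"
    if fits_noc:
        return "NOC"
    return "undetermined"  # mixed behavior matches neither profile


print(classify({"requires_antecedent": True, "allows_strict_reading": False}))
print(classify({"allows_strict_reading": True}))
```

A configuration that passed some tests in the OC direction and others in the NOC direction would come out as "undetermined" here, which is the kind of case that the apparent counter-examples of section 6.3 bear on.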
Note that the b-examples in (2)–(6) also illustrate a typical environment where an NOC PRO can be found: an island configuration. In fact, the complementary distribution between OC and NOC generally correlates with environments where movement can or cannot take place. Hornstein (2001, 2003, 2007) accounts for this correlation by reinterpreting in minimalist terms the old idea that (resumptive) pronouns are employed as a last resort saving strategy when movement fails. More concretely, Hornstein argues that movement is more economical than pronominalization. Thus, if OC PRO is a residue of movement under the MTC, as argued in the previous chapters, NOC PRO can be analyzed as a null pronoun (pro) which is resorted to when movement is not possible. Under this view, the structures in (6), for instance, are to be represented along the lines of (7) below, with a trace in (7a) and pro in (7b). Crucially, they cannot be represented as in (8): in (8b) movement of the embedded subject should induce an island violation and in (8a) there is an economy violation as the less economical option of pronominalization was employed instead of movement (cf. [7a]) in a configuration where both options would lead to convergent results.
(7) a. [[The unfortunate]i expects [ti to get a medal]]
b. [[The unfortunate]i believes that [[proi getting a medal] is unlikely]]
(8) a. ∗[[The unfortunate]i expects [proi to get a medal]]
b. ∗[[The unfortunate]i believes that [[ti getting a medal] is unlikely]]
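The logic of the economy competition can be made explicit with a deliberately simplified Python sketch. The function names `embedded_subject` and `is_licit` are hypothetical, and equating "movement converges" with "the subject is not inside an island" is a simplifying assumption made for this illustration; Hornstein's actual technical implementation is considerably richer.

```python
def embedded_subject(inside_island: bool) -> str:
    """Pick the most economical convergent realization of the null subject."""
    # Simplifying assumption: movement converges unless an island intervenes.
    movement_converges = not inside_island
    # Movement is cheaper than pronominalization, so it wins whenever it
    # converges; pro is the last-resort option.
    return "trace" if movement_converges else "pro"


def is_licit(inside_island: bool, choice: str) -> bool:
    """A structure is licit only if it uses the option economy selects."""
    return choice == embedded_subject(inside_island)


# (7a)/(8a): complement of 'expects', no island.
print(is_licit(inside_island=False, choice="trace"))  # (7a)
print(is_licit(inside_island=False, choice="pro"))    # (8a)
# (7b)/(8b): subject of a gerund in subject position, an island.
print(is_licit(inside_island=True, choice="pro"))     # (7b)
print(is_licit(inside_island=True, choice="trace"))   # (8b)
```

The two failures have different sources, as in the text: the (8a) case loses on economy, whereas the (8b) case is excluded because movement out of the island does not converge.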
The distribution and interpretation of pro in the b-examples of (1)–(6) to a great extent mimic the distribution and interpretation of overt pronouns, as illustrated in (9) below.2 That is, an overt pronoun does not need a linguistic antecedent (cf. [9a]); it may be associated with a non-local (cf. [9b]) or non-c-commanding (cf. [9c]) antecedent; it admits both strict and sloppy readings under ellipsis (cf. [9d]), bound and coreferential readings in sentences like (9e), and de se and non-de se readings in “unfortunate” contexts (cf. [9f]).
(9) a. It is illegal for him to park here
b. Johni thinks that Mary said that hisi shaving himself is vital
c. Johni ’s friends believe that hisi keeping himself under control is vital if he is to succeed
d. Johni thinks that hisi getting his resumé in order is crucial and Bill does too (‘Billk thinks that hisi/k getting his resumé in order is crucial’)
e. Only Churchill remembers that his giving the BST speech was momentous
f. [The unfortunate]i believes that hisi getting a medal is unlikely
This general competition between movement and pronominalization extends beyond standard instances of control.3 For instance, Floripi (2003), Rodrigues (2004), and Floripi and Nunes (2009) show that, in Brazilian Portuguese, a null possessor sitting within an object displays all the diagnostics of OC. Thus, the empty category in (10) below must be interpreted as the closest c-commanding antecedent (cf. [10a]) and only admits sloppy readings under ellipsis (cf. [10b]), bound readings when associated with only-DPs (cf. [10c]), and de se readings (cf. [10d]). Assuming Hornstein’s (1999, 2001) theory of control, these authors analyze the empty category in (10) as a trace left by movement of the possessor to the specifier of the closest vP, as illustrated in (11).
(10) Brazilian Portuguese (Floripi and Nunes 2009):
a. [O Pedrom acha que [o amigo [d[o João]i ]]k telefonou para a mãe eck/∗i/∗m ]
The Pedro thinks that the friend of-the João called to the mother
‘Pedrom thinks that [Joãoi ’s friend]k called hisk/∗i/∗m mother’
b. [[O João]i vai telefonar para a mãe eci ] e [a Maria também vai]
The João goes call to the mother and the Maria also goes
‘João will call his mother and Mary will call her mother, too’ (sloppy reading only)
c. [[Só o João] ligou para a mãe ec]
Only the João called to the mother
‘Only João called his mother → Nobody else called his own mother’ NOT ‘Nobody else called João’s mother’
d. [Non-de se context: João doesn’t remember who he is or that the person under discussion is his brother]
#[João passou a admirar o irmão ec]
João passed to admire the brother
‘João came to admire his brother’ (de se reading only; infelicitous in this context)
(11) Brazilian Portuguese:
[TP [o João] [vP ti ligou [PP para [DP a mãe ti ]]]]
The João called to the mother
‘Johni called hisi mother’
2 But see section 6.3 below for a discussion of cases where null and overt pronouns do not go hand in hand.
3 See Hornstein (2001, 2003, 2007) for further arguments, technical implementation, and general discussion on the competition between movement and pronominalization with respect to derivational economy.
Interestingly, as observed by Floripi (2003) and Floripi and Nunes (2009), if the null possessor sits within a subject, we find a different behavior: the antecedent for the null possessor need not be within its clause (cf. [12a]) and the null possessor is compatible with strict and sloppy readings (cf. [12b]), bound and coreferential readings (cf. [12c]), and de se and non-de se readings (cf. [12d]). In other words, the null possessors in (12) pattern like their overt counterparts in (13).

(12) Brazilian Portuguese (Floripi and Nunes 2009):
  a. [[O João]_i disse que [[o amigo ec_i] vai viajar]]
     The João said that the friend goes travel
     ‘João_i said that his_i friend is going to travel’
  b. [[A Maria]_i vai recomendar a pessoa [que [um amigo ec_i] entrevistou] e [o João]_k também vai]
     The Maria goes recommend the person that a friend interviewed and the João also goes
     ‘Maria is going to recommend the person that a friend of hers interviewed and João is also going to recommend a person that a friend of his/hers interviewed’ (sloppy and strict readings available)
  c. [[Só o João] leu o livro [que [a mãe ec] indicou]]
     Only the João read the book that the mother recommended
     ‘Only João read the book that his mother recommended’ → ‘Nobody else read the book that his own mother recommended’ or ‘Nobody else read the book that João’s mother recommended’
  d. [Non-de se context: João doesn’t remember who he is or that the person under discussion is his brother]
     João_i se surpreendeu [quando [o irmão ec_i] fez um discurso]
     João REFL surprised when the brother made a speech
     ‘João got surprised when his brother made a speech’ (non-de se reading available)

(13) Brazilian Portuguese (Floripi and Nunes 2009):
  a. [[O João]_i disse que [[o amigo dele_i] vai viajar]]
     The João said that the friend of-him goes travel
     ‘João said that his friend is going to travel’
  b. [[A Maria]_i vai recomendar a pessoa [que [um amigo dela_i] entrevistou] e [o João]_k também vai]
     The Maria goes recommend the person that a friend of-her interviewed and the João also goes
     ‘Maria is going to recommend the person that a friend of hers interviewed and João is also going to recommend a person that a friend of his/hers interviewed’ (sloppy and strict readings available)
  c. [[Só o João] leu o livro [que [a mãe dele] indicou]]
     Only the João read the book that the mother of-him recommended
     ‘Only João read the book that his mother recommended’ → ‘Nobody else read the book that his own mother recommended’ or ‘Nobody else read the book that João’s mother recommended’
  d. [Non-de se context: João doesn’t remember who he is or that the person under discussion is his brother]
     João_i se surpreendeu [quando [o irmão dele_i] fez um discurso]
     João REFL surprised when the brother of-him made a speech
     ‘João got surprised when his brother made a speech’ (non-de se reading available)
Adopting Hornstein’s economy approach to the contrast between OC and NOC, Floripi (2003) and Floripi and Nunes (2009) argue that the null possessor in (12) is a null pronoun, as illustrated in (14), and that pronominalization is sanctioned in these cases as movement out of the subject position is not allowed. Crucially, if pronominalization were generally available, the sentences in (10) should exhibit no interpretive restrictions thanks to the alternative derivation with pro. Again, if movement is possible, it preempts pronominalization.

(14) Brazilian Portuguese:
     [[O João]_i disse que [[o amigo pro_i] vai viajar]]
     The João said that the friend goes travel
     ‘João_i said that his_i friend is going to travel’
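The economy logic behind the contrast between (10)/(11) and (12)/(14) can be pictured as a toy decision rule. The Python sketch below is purely illustrative and is not part of the cited analyses: the names `EmptyCategory` and `license_empty_category`, and the reduction of the grammar to a single Boolean input, are assumptions made here for exposition.

```python
from enum import Enum


class EmptyCategory(Enum):
    TRACE = "copy/trace of movement (OC)"
    PRO = "null pronoun pro (NOC)"


def license_empty_category(movement_is_licit: bool) -> EmptyCategory:
    """Toy economy rule: movement, when licit, preempts pronominalization."""
    return EmptyCategory.TRACE if movement_is_licit else EmptyCategory.PRO


# (11): null possessor inside an object; movement to the closest [Spec, vP] is licit.
object_possessor = license_empty_category(movement_is_licit=True)

# (14): null possessor inside a subject; movement out of the subject is not allowed.
subject_possessor = license_empty_category(movement_is_licit=False)

print(object_possessor.name, subject_possessor.name)
```

On this simplified picture, a null pronoun is never chosen freely: it appears only where the movement derivation is unavailable.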
Let us examine a slightly more complex case which sheds additional light on the correlation between the impossibility of movement and the availability of pro (and hence NOC). First, consider the PRO-gate sentence (see Higginbotham 1980) in (15) below. Hornstein and Kiguchi (2003) and Kiguchi (2004) show that PRO-gate structures like (15) display all the diagnostics of OC and propose that they involve sideward movement from within the infinitival subject to the object position. The derivation of (15), for instance, is taken to proceed along the lines of (16). After the two independent syntactic objects in (16a) are built, the computational system copies ‘Mary’ and merges it with ‘delighted,’ as shown in (16b). Further computations yield the simplified structure in (16c), which surfaces as (15) after the copy of ‘Mary’ within the clausal subject is deleted in the phonological component (cf. [16d]). A crucial feature of the derivation sketched in (16) is that, as opposed to what we saw in the b-examples in (2)–(6) and in the null-possessor constructions in (12), movement of ‘Mary’ in (16b) is licit as it takes place before the non-finite clause becomes a subject island (see section 4.5.1.2). Once movement is possible, pronominalization is blocked, as shown in (17).4

(15) [[PRO_i washing herself] delighted Mary_i]

(16) a. Applications of select and merge:
        [Mary washing herself]    delighted
     b. Applications of copy and merge (sideward movement):
        [Mary_i washing herself]    [delighted Mary_i]
     c. Application of merge:
        [[Mary_i washing herself] delighted Mary_i]
     d. Deletion in the phonological component:
        [[Mary_i washing herself] delighted Mary_i] (with the copy of ‘Mary’ within the clausal subject deleted)

(17) *[[her_i washing herself] delighted Mary_i]

Now consider what happens when structures such as (15) are embedded, as illustrated in (18).

(18) a. John said that [[PRO_i/*her_i washing herself] delighted Mary_i]
     b. John_k said that [[pro_k/him_k washing himself] delighted Mary]

Example (18a) is unsurprising as it replicates what we saw in (15)/(17): if movement of ‘Mary’ to the object of ‘delighted’ is allowed (cf. [16] and (i) in footnote 4), pronominalization is not. Conversely, once movement of ‘John’ from within the gerund to the matrix clause in (18b) is out due to the subject island, pronominalization is possible. The contrast between (18a) and (18b) shows that structures should not be classified as OC or NOC, for a given structure may allow OC or NOC. Rather, it is relations that are OC or NOC.

That the MTC treats OC and NOC as relations is an important point that is worth emphasizing. OC and NOC are descriptive predicates that are more analogous to bound and free than to interrogative and declarative, i.e., OC and NOC describe relations between nominal expressions, not selection/subcategorization relations between predicates and types of clausal complements. As grammatical theory does not distinguish clauses as reflexive or pronominal depending on whether they contain anaphors or pronouns, it should not, by parity of reasoning, identify sentences as OC/NOC clauses. Consequently, to say that a given predicate selects/subcategorizes for an OC or NOC structure can only be viewed as a descriptive statement, with no explanatory ambitions.5 With this general picture in mind, let us now consider some potential problems.

4 It is also consistent with what follows if we assume a derivation reminiscent of Belletti and Rizzi’s (1988) analysis of psych-verb constructions, with the surface subject in (15) being base-generated in a position lower than the position occupied by the surface object, and movement of ‘Mary’ proceeding in an upward fashion, as sketched in (i) below. Regardless of whether (15) is to be derived along the lines of (16) or (i), the important point to bear in mind is that movement of ‘Mary’ is licit and therefore preempts pronominalization.
(i) a. [VP delighted [Mary washing herself]]
    b. [VP Mary_i [delighted [t_i washing herself]]]
    c. [vP delighted_k [VP Mary_i [t_k [t_i washing herself]]]]
    d. [TP [t_i washing herself]_m [vP delighted_k [VP Mary_i [t_k t_m]]]]
6.3
Some problems
Consider the representation of the sentence in (19) given in (20).

(19) John persuaded Mary to leave

(20) John_1 persuaded Mary_2 [PRO_2/*1 to leave]

Under the MTC, PRO in (20) is a copy/trace of A-movement and this explains why ‘Mary’ is the antecedent and ‘John’ cannot be (see section 3.4.1). For ‘John’ to be the antecedent requires that it move over ‘Mary’ on its way to [Spec, vP]; as this violates minimality, it cannot be the antecedent. In contrast, movement of ‘Mary’ from the embedded clause to the Spec of ‘persuade’ does not violate minimality. Assuming that pronominalization and movement compete and that movement is derivationally more economical than pronominalization (see section 6.2), we account for why (19) cannot be associated with the structure in (21a) below. Successful movement of ‘Mary’ to the Spec of ‘persuade’ blocks pronominalization. However, this account does not explain why the illicit movement of ‘John’ in the derivation of (19) does not make room for pronominalization. In other words, why can (19) not be associated with the structure in (21b), where pronominalization “rescues” a failed movement connection between the two subject positions? What prevents a DP that cannot licitly antecede PRO (i.e., a copy/trace of A-movement) from binding a pro (i.e., a null pronoun) in the same position?6

(21) a. *John_1 persuaded Mary_2 [pro_2 to leave]
     b. John_1 persuaded Mary_2 [pro_1 to leave]

5 This claim has the following consequence: verbs cannot be classified as taking OC complements and so a standard approach to OC, one which treats control as a selection relation between a predicate and its embedded sentential complement, is conceptually misguided. For more discussion see section 7.2 below.
6 One possible answer to this question is not open to us: that ‘persuade’ selects for an OC complement. If what we say in footnote 5 is correct, this makes as much sense as claiming that a predicate selects for an embedded reflexive structure. On one reading of the inclusiveness condition this kind of selection would be ruled out, for it codes grammatical restrictions in lexical selection. Lexical items cannot be so coded on a strict reading of inclusiveness.
To phrase the problem differently, we have assumed that a coupling between an antecedent and a pronoun is licit just in case movement cannot establish the same relation. If one can move from a position to another, a DP in the “target” cannot bind a pronoun in the “launch” site, i.e., the position of the trace. However, this also implies that if movement is not possible between positions A and B then binding should be. What we see in (21b) is a concrete example of this option. However, we also see that it is impossible; (19) cannot be interpreted with ‘John’ as the leaver. Consider another problematic case:

(22) John kissed Mary without getting embarrassed

Example (22) is a case of adjunct control. As discussed in detail in section 4.5.1, adjunct control involves sideward movement from the subject position of the embedded clause before it becomes an adjunct. Furthermore, given the preference for merge over move, adjunct control involves subject control, rather than object control, as represented in (23).

(23) John_1 kissed Mary_2 without PRO_1/*2 getting embarrassed

If sideward movement of ‘John’ from the position of PRO in (23) is licit, we expect a coreferential pronoun to be ruled out in this position, which is indeed the case, as shown in (24) below. By the same reasoning, if sideward movement of ‘Mary’ in (23) is not allowed, the question is why pronominalization cannot save the derivation. That is, why can Mary in (22) not be the one who gets embarrassed, given the availability of the structure in (25)?

(24) *John_1 kissed Mary without him_1 getting embarrassed

(25) John_1 kissed Mary_2 without pro_2 getting embarrassed

Note that the unacceptability of (22) with the structure in (25) is even more troublesome than the unacceptability of (19) under the representation in (21b). As opposed to (21b), the overt counterpart of (25) yields an acceptable result, as shown in (26).

(26) a. *John_1 persuaded Mary [him_1 to leave]
     b. John kissed Mary_1 without her_1 getting embarrassed
A similar pattern is found in null-possessor constructions in Brazilian Portuguese. In a sentence such as (27) below, movement from the null-possessor position can only go as far as the lower [Spec, vP] without violating minimality, as sketched in (28a). Thus, the null possessor in (27) must be interpreted as the lower subject and economy considerations regarding the derivational cost of movement and pronominalization exclude the representation in (28b).

(27) Brazilian Portuguese:
     [O João]_k acha que [o Pedro]_i vai ligar para a mãe ec_i/*k
     The João thinks that the Pedro goes call to the mother
     ‘João thinks that Pedro_i is going to call his_i mother’

(28) a. [O João] acha que [o Pedro]_i vai ligar para a mãe t_i
     b. *[O João] acha que [o Pedro]_i vai ligar para a mãe pro_i

By the same token, once movement of ‘o João’ in (29a) below is excluded due to the intervention of the embedded subject, the question is why the null possessor in (27) cannot be interpreted as the matrix subject under the structure in (29b) with a null pronoun, despite the fact that an overt pronoun allows this interpretation, as illustrated in (30). So why is a null pronoun with the same reading unacceptable?

(29) a. *[O João]_k acha que [o Pedro] vai ligar para a mãe t_k
     b. [O João]_k acha que [o Pedro] vai ligar para a mãe pro_k

(30) Brazilian Portuguese:
     [O João]_k acha que [o Pedro]_i vai ligar para a mãe dele_k
     The João thinks that the Pedro goes call to the mother of-him
     ‘João_k thinks that Pedro is going to call his_k mother’
We outline a possible answer in the next section.
6.4
A proposal
If we insist that the problems in (21b), (25), and (29b), repeated below in (31), get a unified approach (not an obvious requirement, but not a bad one either), the fact that the overt pronouns in (26b) and (30), repeated in (32), allow the relevant readings suggests that more than grammatical requirements are at issue. What else could be at stake? Following Boeckx and Hornstein (2007), we would like to suggest a parsing-based approach. More particularly, we propose that the structures in (31) are not blocked by the grammar, but would never be accepted by a well-behaved parser.7

(31) a. John_1 persuaded Mary [pro_1 to leave]
     b. John kissed Mary_2 without pro_2 getting embarrassed
     c. Brazilian Portuguese:
        [O João]_k acha que [o Pedro] vai ligar para a mãe pro_k
        The João thinks that the Pedro goes call to the mother
        ‘João_k thinks that Pedro is going to call his_k mother’

(32) a. John kissed Mary_1 without her_1 getting embarrassed
     b. Brazilian Portuguese:
        [O João]_k acha que [o Pedro] vai ligar para a mãe dele_k
        The João thinks that the Pedro goes call to the mother of-him
        ‘João_k thinks that Pedro is going to call his_k mother’

7 If we also assume that producers and parsers meet similar constraints, then these structures would not be produced either. Such an assumption is natural in any kind of analysis-by-synthesis model.
Let us make the following (as far as we can tell, fairly standard) assumptions:

(33) a. Parsers move from left to right and project structure rapidly and deterministically on the basis of local information.
     b. Parsers are transparent with respect to grammars. So, if grammars encode a condition, parsers respect it.8
Given the assumption in (33b), we expect parsers to prefer traces to pronouns (if grammars prefer movement to pronominalization) and, consequently, that parsers will treat gaps as copies/traces in preference to analyzing them as null pronominal pros. In addition, we expect parsers to be sensitive to earlier information. As a parser builds structure left to right, it will prefer to treat a potential gap as a copy/trace (rather than a pro) if it can. Take the sentence in (19), for instance, repeated in (34).

(34) John persuaded Mary to leave

As the sentence is parsed, we arrive at ‘to’ and the parser realizes that it must assign a subject to the embedded clause. Moreover, the parser “sees” that the subject is a null category, either a pro or PRO (= trace/copy). As the parser incorporates the principles of the grammar and grammars “prefer” movement to pronominalization, the parser “prefers” to drop a trace here if it can. As it can, it does, and we get (35) below. Finally, the trace/copy in (35) must have ‘Mary’ as its antecedent due to minimality. Thus, (34) gets the parse in (35), which requires that ‘Mary’ be the antecedent of “PRO.”

(35) John persuaded Mary [t to leave]
As for (31a), it would require that at ‘to’ the parser drop a pro in the subject position, for the only empty category that could take ‘John’ as antecedent is a null pronoun. However, to drop a pro requires ignoring the parser’s (built-in) preference for a trace/copy over a pronoun, all things being equal, a preference the parser has in virtue of being structurally transparent to the grammar, which prefers movement over pronominalization. This makes (31a) computationally unavailable and this accounts for the lack of the indicated interpretation.9

The same account extends to (31b) and (31c). When the parser gets to the subject position within the gerund in (31b) or the possessor position in (31c) and needs to drop an empty category, it must drop a trace/copy if it can. Thus, it prefers a “PRO” to a pro. As a “PRO” can be licitly dropped here, it must be. If it is, however, then ‘John’ must be the antecedent in (31b) due to merge-over-move computations and ‘o Pedro’ must be the antecedent in (31c) due to minimality. In other words, once the parser analyzes the null subject of the gerund and the null possessor as traces, the indicated readings in (31b) and (31c) become unavailable.

One point is worth emphasizing here. The “preference” the parser displays arises as a design feature of a parser that conforms to transparency (a very good condition, perhaps an optimal one, for regulating the relation between grammars and parsers). It is often assumed that parsing strictures can be overridden given greater resources. So, for example, center-embedding structures can be parsed given more memory “space.” The suggestion above, however, cannot be so easily ameliorated. The problem is not one where extra resources would help. If parsing principles must respect grammatical ones (i.e., if transparency holds), then a parser cannot circumvent these principles by using additional memory or attention resources.10 The parse is simply not available.

Observe that this account turns on there being an empty category to be parsed. By “seeing” nothing there, the parser must “decide” what sort of empty category to drop in the relevant position. As it prefers dropping traces if it can, it drops a trace and not a null pronoun. However, if there is an overt pronoun occupying the same position, the parser is not faced with any choice as to what it must do as overt pronouns are grammatically licit here. Thus, we can derive sentences like (32a) and (32b) with an overt pronoun anteceded by ‘Mary’ and ‘o João,’ respectively.

8 This does not imply that grammars are identical to parsers (Phillips 1996), a position which we think is untenable (Phillips [2004] appears to agree on this). Our assumption only implies that the parser respects the design features of grammars. For discussion of the transparency relation between grammars and parsers, see Berwick and Weinberg (1984).
9 Our discussion has reified parsing by adverting to properties of a well-designed “parser.” However, the account survives even if in place of parsers all we have is parsing. The transparency assumption then would amount to saying that, when using the grammar to divine the nature of an empty category, parsing adopts those principles prized by the grammar. Thus, on encountering a phonetic gap, a trace is preferred to pro.
10 This does not mean that a pro can never be placed where a “PRO” can be. See below for a case where a pro can be posited in a place where a “PRO” is licit in order to advance another parsing desideratum.
Consider now the last set of cases. We noted that examples like (18b), repeated here in (36), are fine with the indicated interpretation.

(36) John_k said that [[pro_k washing himself] delighted Mary]

Here the parser gets to the subject gerund and “encounters” an empty category.11 It can treat it as a copy/trace or a pro. Note, however, that the empty category is inside an island and if the parser wants to link ‘John’ to this element, it must treat it as a pro. Observe that if the empty category were analyzed as a copy/trace, the connection with ‘John’ would be illicit as it would require movement from an island. As a “PRO” cannot provide the support for this relation, a pro is licensed by the grammar. However, this does not end matters. We have seen that (37) is also acceptable.

(37) John said that [[PRO_i washing herself] delighted Mary_i]

The PRO here is a residue of movement.12 So it seems that the parser can drop a trace here. Why then does this not prevent dropping a pro in (36)? The answer is that the parser here must weigh a competing parsing demand. It is known that parsers like to assign interpretations to empty categories (and dependent elements in general) very quickly.13 Thus, pronouns quite generally greedily appropriate suitable interpretive antecedents (referential anchors) very rapidly on-line. If we add the assumption that parsers are interpretively greedy to our previous two assumptions in (33), then in cases such as (36) and (37), the parser has competing preferences: it would like to assign an interpretation to the empty category (at this point in the parse) and it would prefer to treat the empty category as a trace rather than a pronoun. In contexts like (36) and (37) these desiderata pull in opposite directions: if the empty category is understood as a pro it can be related to ‘John’ and so can rapidly be provided with an interpretation at this point. However, this will also require overriding its preference for traces over pronouns. On the other hand, if it drops a trace here, it can adhere to its preference for traces over pronouns (i.e., “PRO” over pro), but it cannot resolve the interpretation of the empty category at this point (as there is no antecedent yet available for the copy/trace). Recall that this is a case where the antecedent will only become visible downstream and our parser operates from left to right (cf. [33a]). In short, as both options have their virtues, we suggest that both parses are available.14

11 The parser “knows” that this is a subject because it follows ‘that’ and because it knows that ‘say’ does not take gerundive complements. This kind of structural and lexical information is standardly assumed to be available on-line to the parser. Thus, at the point where the gerund is encountered, the relevant information that the gerund is a subject (and hence an island) is known.
12 Either sideward movement (cf. [16]) or movement as in a psych-verb construction (cf. [i] in footnote 4).
13 See, for example, Nicol and Swinney (1989), Osterhout and Mobley (1995), and Badecker and Straub (2002), for discussion. Thanks to Nina Kazanina for very helpful discussion and references.

It is instructive to compare (36) and (37) with the structures in (31). In (31) the potential antecedents are both to the left of the empty category. Thus, if the empty category is analyzed as a trace, the parser not only complies with transparency (i.e., dropping a trace), but also satisfies its desire to interpret the empty category quickly as it may rely on the local information already parsed. By contrast, in (36) and (37) the potential antecedents for the empty categories are on opposite sides, creating a situation in which transparency and quick assignment of interpretation to the empty category pull in opposite directions, yielding two distinct outputs: if transparency is enforced, we get OC (cf. [37]); if quick interpretation is enforced, we have NOC (cf. [36]).

If this analysis is roughly on the right track, then some predictions follow. Consider a sentence like (38), for instance, uttered discourse initially.

(38) Having to wash behind the ears made Mary angry at Bill
Here, there is no parsing advantage to interpreting the empty category as a null pronoun (there is nothing to link the empty category to so that it can be quickly interpreted). As such, we would expect the parser to drop a “PRO” here, giving us a structure like (39).

(39) [PRO having to wash behind the ears] . . .

Thus, the parser will analyze the empty category as a residue of A-movement. In (38) the controller of PRO will then necessarily be ‘Mary.’15 Note that, were there a pro here, it should be able to have ‘Bill’ as antecedent in (38). However, this reading seems unavailable in (38). If we substitute ‘his’ for ‘the’ in (38), Mary is doing the washing behind Bill’s ears! Note, however, that an overt pronoun can have ‘Bill’ as antecedent:

(40) Him_i having to wash behind the ears made Mary angry at Bill_i

The reason for the acceptability of (40) under the intended reading is that the pronoun is grammatically permitted in this position and the parser does nothing more than put what it hears where it hears it. Thus, what cannot occur here, because of parsing preferences, is a null pronoun, namely, pro.

14 It is possible that different speakers weigh these options differently. Nina Kazanina (personal communication) has found many speakers for whom sentences like (37) with ‘John’ as antecedent become very odd when ‘Mary’ is encountered. This suggests that these speakers value transparency more than reference resolution. It goes without saying that the proposal above is not a fully worked-out account and that much more detailed work needs to be done to flesh it out. For some interesting work extending this line of reasoning, see Snarska (2009).
15 See Hornstein and Kiguchi (2003) and Kiguchi (2004) for details and evidence that the empty category in (38) is an OC PRO and not a pro.

6.5
Conclusion
Hornstein (1999, 2001, 2003, 2007) has proposed that, as opposed to OC PRO, which is analyzed as an A-trace, NOC PRO is essentially a null pronoun (pro). The interpretation of an NOC PRO then reduces to figuring out the distribution of pro. Hornstein has argued that its distribution is to a large extent determined by economy considerations: assuming that movement is more economical than pronominalization, the resort to pro (i.e., NOC) will be sanctioned only when movement (i.e., OC) fails. In this chapter, we have seen that the assumption that movement is more economical than pronominalization is, however, insufficient to block some unacceptable instances of NOC whose movement counterpart is not licit. Following Boeckx and Hornstein (2007), we have argued here that the distribution of pro is a dual function of the grammar and the parser. If acceptability reflects both generability and parsability, then being unparsable may provide a reason for the unexpected unacceptability of instances of NOC where OC/movement is not an option. We have relied on the assumption, common in the parsing literature, that parsers use the same sorts of principles and target the same sorts of entities as grammars do, i.e., they are transparent to grammars. We have also assumed that parsers like to resolve the interpretation of empty categories very quickly. In combination, these two assumptions provide the beginnings of an account of NOC by predicting where and when pro is available.
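As a rough illustration of how the two parsing assumptions summarized above interact, the following Python sketch encodes the choice a left-to-right parser faces at a gap. It is a deliberately simplified toy, not an implementation of any actual parser or of the authors' proposal in full: the `Gap` record, its three Boolean fields, and the function `available_parses` are hypothetical names introduced here, and the grammar's contribution is reduced to whether a copy/trace would be licit at the gap.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Gap:
    """A phonetically empty position met by a left-to-right parser (hypothetical encoding)."""
    trace_licit: bool              # would the grammar allow a copy/trace here?
    trace_antecedent_parsed: bool  # is the antecedent of that trace already to the left?
    pro_antecedent_parsed: bool    # is some already-parsed DP available to anchor a pro?


def available_parses(gap: Gap) -> Set[str]:
    """Return the analyses of the gap that the toy parser entertains."""
    if not gap.trace_licit:
        # Movement fails (e.g. it would cross an island): only pro is left.
        return {"pro"}
    if gap.trace_antecedent_parsed:
        # A trace satisfies transparency and is interpreted at once.
        return {"trace"}
    if gap.pro_antecedent_parsed:
        # Transparency favors a trace, quick interpretation favors pro.
        return {"trace", "pro"}
    # Nothing to the left to link to: pro brings no parsing advantage.
    return {"trace"}


# (31a) 'John persuaded Mary [__ to leave]': 'Mary' is already parsed.
persuade = available_parses(Gap(True, True, True))
# (36)/(37) 'John said that [[__ washing ...] delighted Mary]'.
embedded_gerund = available_parses(Gap(True, False, True))
# (38) 'Having to wash behind the ears made Mary angry at Bill'.
discourse_initial = available_parses(Gap(True, False, False))
# (14) null possessor inside a subject: movement out of the subject is barred.
subject_possessor = available_parses(Gap(False, False, True))

print(persuade, embedded_gerund, discourse_initial, subject_possessor)
```

The sketch treats overt pronouns as outside its scope, since, as noted in section 6.4, the parser faces no choice when a pronoun is actually pronounced.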
7
Some notes on semantic approaches to control
7.1
Introduction
Much of this book has been devoted to a defense of a specific syntactic approach to (obligatory) control. We have provided many conceptual and empirical arguments in favor of a movement-based approach and against the major alternative syntactic treatments reviewed in Chapter 2. In this chapter, we would like to consider the major non-syntactic approaches to control that can be found in the literature and argue that they are inadequate on various grounds. The approaches we will focus on seek to reduce control to selection, or invoke a rich inventory of semantic types or conceptual structures that bypass syntax. Many of the arguments we will produce against such approaches are independent of our favorite syntactic analysis and can be formulated in various frameworks. However, we think that it is when we contrast non-syntactic approaches to control with a movement-based account that the former more clearly reveal their explanatory weaknesses. The chapter is organized as follows. Section 7.2 discusses some general problems with selectional approaches to control. Section 7.3 focuses on Culicover and Jackendoff’s “simpler-syntax” approach. Finally, a brief conclusion is presented in section 7.4.
7.2
General problems with selectional approaches to obligatory control
There are several critical features of the interpretation of OC PRO. One of the most obvious is that it requires a local antecedent. Typically, the antecedent is an argument of the embedding predicate (the subject for subject control, the object for object control). The locality requirement is one of the key features of control configurations and must be accommodated. One initially plausible way of capturing this locality effect focuses on the observation that the interpretation of PRO, in particular the identification of its antecedent, is determined in 210
7.2 General problems with selectional approaches
211
some way by the embedding predicate. One linguistically reasonable way of implementing this observation is to treat antecedent identification as a species of selection restrictions that the matrix predicate imposes on its embedded complement clause. The virtue of so proceeding lies in the possibility of tracing the locality of antecedent selection in control clauses to the very tight locality that characterizes selection more generally (as everyone knows, selection in natural languages is confined to very local domains). To a greater or lesser extent, this is a general intuition that underlies both “purely” semantic and mixed semantic–syntactic approaches to control.1 Rather than discussing specific technical implementations of this idea, here we would like to highlight a couple of problems that seem to be common to all selectionbased approaches. The first one regards what is being selected. As discussed in section 6.2, one cannot simply say that a given predicate selects for an OC structure, for the same structural configuration may allow both OC and NOC, as illustrated in (1) below. In other words, it is not structures but relations that are OC or NOC (see section 6.2). In the case of (1), for instance, the relation between ec and ‘Mary’ is one of OC (see e.g., Kiguchi 2004; Hornstein and Kiguchi 2003), whereas the one between ec and ‘John’ is one of NOC (see e.g., Hornstein 2001; Boeckx and Hornstein 2007). (1) a. b.
John said that [[eci washing herself] delighted Maryi ] Johnk said that [[eck washing himself] delighted Mary]
One selection approach that does not appear to face this problem involves treating control infinitives as properties, i.e., VPs, rather than clausal complements (see e.g., Chierchia 1984). If control complements are properties, they can be directly selected by the control predicate and need not involve an empty category such as PRO. Under this view, the semantic identification of the “concealed” embedded subject is determined by entailment relations triggered by the meaning of the control predicate. However, as Wurmbrand (2001) shows in detail for German, this proposal can capture at best a subset of control structures, those that can be analyzed as involving just a VP layer. Other types of obligatory control infinitivals in German show evidence of a full-blown clausal structure, which brings back the question of how to identify the content of their OC PRO subjects.2 1 See e.g., Chierchia (1984), Landau (2000), Rooryck (2001, 2007), Wurmbrand (2001), Culicover and Jackendoff (2005). 2 See Wurmbrand (2001) for relevant data and arguments for the two types of OC infinitivals in German.
Some notes on semantic approaches to control
In previous chapters we in fact discussed three kinds of data that argue that control configurations are indeed syntactically clausal/propositional and do have embedded subjects. The first case involves finite control into indicatives in Brazilian Portuguese, as illustrated in (2) below, where the embedded clause displays the same clausal material that is present in its non-control counterpart with an overt subject (see section 2.5.2.2). The second case involves backward-control configurations, where the controller is pronounced in the embedded clause, as illustrated in (3) (see section 4.5.3.3), something that should be impossible if control complements were VPs. Finally, the third case involves copy-control sentences such as (4), where the embedded subject is realized as a copy of the controller (see section 4.5.4). In effect, backward- and copy-control constructions wear their clausal structure on their phonological sleeves. On the assumption that control involves the same grammatical relation across languages, the existence of backward-control and copy-reflexive structures indicates that the controller in OC structures has moved from an embedded clausal complement.

(2) Brazilian Portuguese:
    [[O João] disse que [o pai d[o Pedro]] acha que ec vai ser promovido]
    The João said that the father of-the Pedro thinks that goes be promoted
    ‘João_i said that [Pedro_j’s father]_k thinks that he_k/*i/*j/*l is going to be promoted’
(3) Tsez (Polinsky and Potsdam 2002):
    [Δ_1/*2 [kidbā_1 ziya bišra] yoqsi]
    girl.ERG cow.ABS feed.INF began
    ‘The girl began to feed the cow’

(4) San Lucas Quiaviní Zapotec (Lee 2003):
    R-càaà’z Gye’eihlly g-auh Gye’eihlly bxaady
    HAB.want Mike IRR.eat Mike grasshopper
    ‘Mike wants to eat grasshopper’
If establishing an OC relation cannot be obtained through direct selection, the other option is to attempt to enforce the desired relation by some sort of indirect selection.3 Consider for instance an object control construction such as (5).

(5) [John [VP persuaded Mary [CP C [TP PRO to leave]]]]

3 See, for instance, Wurmbrand (2001: 250) on obligatory control involving infinitivals larger than VP in German: “Assuming that obligatory control is determined lexically/semantically, a syntactic subject and the application of (syntactic) control mechanisms is in a sense vacuous in obligatory control constructions, since the antecedent of the infinitival subject is already pre-specified as part of the meaning of an obligatory control predicate. We claim that it is exactly this redundancy that licenses (but does not necessitate) the omission of a syntactic subject in obligatory control constructions.”

Given that head-to-head selection is a very local process, ‘persuade’ can select C, but no other head further down. However, given that C selects T, it seems plausible that ‘persuade’ indirectly imposes restrictions on the embedded T in virtue of selecting C. One can then attempt to stretch this reasoning and assume that ‘persuade’ could also impose restrictions on the element occupying the embedded [Spec, TP]. If these chains of indirect-selection relations could be licensed, there should in principle be room for accommodating the fact that in (5) the subject of the embedded clause must be controlled by the matrix object. Though not implausible, this kind of transitivity of selection must be used sparingly for, if not restricted, it voids the attractive locality feature of a selection-based account. For example, just as C selects T, T selects V. Progressive-present tense, for instance, comports badly with stative predicates: *‘John is knowing/understanding/recognizing the problem.’ Nonetheless, so far as we know, T does not impose selection restrictions on arguments of the predicate, e.g., requiring that the subject be animate or that the object have some specified thematic role or be some kind of DP (a reflexive, for instance). But, as soon as we resort to transitivity of selection for control, it is not clear how to prevent it from working in unwanted cases. For example, what prevents a matrix verb from imposing restrictions on the argument of a verb embedded two clauses down if transitivity of selection is assumed? Generalized in this way, selection is no longer local and thus is not suited to explain the locality of antecedent selection in control clauses. So, at the very least, a selection-based account of control needs to specify exactly which heads interact, how they do so and what information can get imparted. We do not see how this can be achieved in a non-stipulative fashion. Things are actually much worse.
Assuming that a technical way out of the transitivity problem may be found, more problems surface. One of the core properties of OC PRO is that it needs an antecedent. Why? This cannot be a matter of selection. It must be due to some feature of OC PRO. What feature? On the MTC the fact that OC PRO requires an antecedent follows from OC PRO being a residue of movement and one cannot find a residue of movement without something having moved. But on a non-movement account, this requirement must be an intrinsic property of OC PRO. What intrinsic property? One could say it is [+anaphoric] and this is what requires that it have an antecedent.
However, this is not enough. One must explain why it must be [+anaphoric] in “OC environments.” Why is there no analogue of OC PRO without this feature? Tracing the requirement that OC PRO have an antecedent to a [+anaphoric] specification only tracks the observed fact. To explain what we find requires showing why the analogue of OC PRO without the [+anaphoric] feature does not exist in “OC environments.” Anything short of this does little more than redescribe the facts, as the feature on OC PRO that forces the presence of an antecedent boils down to a diacritic. Observe that this does not imply that such an approach to control is incorrect. But it implicitly denies that there is anything interesting to be said about the properties of control configurations beyond specifying that what we find is due to the idiosyncratic lexical properties of OC PRO and the special selectional/semantic features (very broadly understood) of the embedding predicate. Perhaps. However, it is clear that this is not so much a theory of control as the claim that control has no theory. Methodologically, we should be dragged kicking and screaming to this rather nihilistic conclusion.
Attempts to eliminate the diacritic smell of selectional approaches by couching the interpretation of OC PRO on other semantic grounds have not been successful either. Take, for instance, Rooryck’s (2007) approach, under which the sentence in (6a) is analyzed as in (6b) and (6c) (Rooryck 2007: 287; Tr stands for transition, a cover term for accomplishments and achievements):

(6) a. Kim forced Sue to leave
    b. Subevent structure: [Tr e1* act (Kim, (Sue, leave)) {en+1} leave (Sue)]Tr
    c. Syntactic structure: [V force] ( . . . ) [CP (C-)T {en+1} {Tunrealized} [V leave]]
    d. Plain English: At the event time e1*, Kim undertakes action with respect to Sue’s leaving, resulting in a subevent at an undetermined moment after e1*, at which Sue leaves.
In Rooryck’s own words (p. 287), “infinitival [unrealized] (C-)T has anaphoric phi-features, which do the job of PRO. Identification of [unrealized] (C-)T with the [unrealized] subevent entails identification of (C-)T anaphoric phi-features with the phi-features of all and only those argument(s) contained in the [unrealized] subevent. Control thus rides piggyback on the temporal identification of the infinitive by the matrix verb.” Putting aside the conceptual issue of why temporal identification should provide the basis for argument identification, such an approach cannot be extended to finite-control sentences such as (2) in Brazilian Portuguese. The tense of the indicative complement
clause is completely independent from the matrix clause, as illustrated by (7), and yet OC obtains as in cases of non-finite complements (see sections 2.5.2.2 and 4.4).4

(7) Brazilian Portuguese:
    a. Ontem [o João]_i disse que ec_i/*k está estudando muito nos últimos dias
       Yesterday the João said that is studying much in-the last days
       ‘Yesterday John said that he has been studying a lot lately’
    b. Ontem [o João]_i disse que ec_i/*k vai viajar amanhã
       Yesterday the João said that goes travel tomorrow
       ‘Yesterday John said that he’s going to travel tomorrow’
    c. Ontem [o João]_i disse que ec_i/*k tinha viajado na semana passada
       Yesterday the João said that had traveled in-the week past
       ‘Yesterday John said that he had traveled last week’
4 As we saw in section 2.5.1, Wurmbrand (2005) shows that not all control complements show the same kinds of temporal dependencies, so it is not clear that we could generalize the above approach.

Let us finally turn to – we think – the most lethal problem for selection-based accounts. So far, we have proceeded on the premise that it is prima facie reasonable to codify antecedents in control configurations in terms of selection. One prominent property of selection is that it holds for complements, not adjuncts. The distribution of the latter is quite free, at least for a large class of adjuncts. Thus, one expectation of a selection-based account of control interpretation is that control into adjuncts should have properties quite different from control into complements. This, however, is incorrect. As we have discussed in section 4.5.1, adjunct control shows all the characteristics of complement control, as illustrated in (8).

(8) a. Adjunct-control PRO requires a local c-commanding antecedent:
       John_i said [that [Mary_k’s brother]_m left [after PRO_m/*i/*k/*w eating a bagel]]
    b. Adjunct-control PRO only licenses sloppy readings under ellipsis:
       John left before PRO singing and Bill did too
       ‘and Bill_i left before he_i/*John sang’
    c. Adjunct-control PRO can only have a bound interpretation when controlled by only-DPs:
       Only Churchill left after PRO giving the speech
       ‘[Nobody else]_i left after he_i/*Churchill gave the speech’
    d. In the appropriate type of adjuncts (e.g., purposives), adjunct-control PRO obligatorily requires a de se interpretation:
       The unfortunate wrote a petition (in order) PRO to get a medal
       ‘[The unfortunate]_i wrote a petition so that [he himself]_i would get a medal’
That adjunct control shows all the diagnostic properties of controlled PRO is somewhat unexpected if these properties are the result of selection restrictions imposed by a higher predicate on its complement. The parallel between adjunct and complement control suggests that such accounts are on the wrong track. Crucially, under a movement approach of the sort we have advocated, control into adjuncts falls out from the same mechanism as control into complements (see section 4.5.1).5 Let us now consider Culicover and Jackendoff’s “simpler-syntax” approach.

5 See section 7.3.2.1 below for additional problems to semantic approaches to control brought up by the interaction of adjunct control and wh-movement.

7.3 “Simpler syntax”

Culicover and Jackendoff (2001, 2005) propose an approach to control that aims to explain the interpretive properties of control structures directly in terms of the lexical meanings of their constituent parts.6 This lexical meaning-based perspective relies on Jackendoff’s (1983, 1990) framework of conceptual semantics, which assumes a very rich, generative apparatus outside narrow syntax. C&J make three principal claims regarding control. First, the central problem in control consists in “identifying the factors that determine possible controllers in any given circumstance” (2005: 416) (i.e., determining the antecedent of “PRO”).7 Second, controller selection is largely a function of the “lexical semantics of the predicate that selects the infinitival or gerundive complement” (p. 416), where the requisite aspects of meaning are explicitly represented in the predicate’s conceptual structure. And third, the control relation is not simply a diacritic feature of a given verb or class of verbs but is “an organic part of their meaning” (p. 420). In other words, “the control properties of heads . . . follow insofar as possible from their meanings, couched in terms of conceptual structure” (p. 445, footnote 19). The following quote well illustrates C&J’s core conception of what control is: “Our own intuition . . . is that the control behavior of persuade and promise is an essential part of their meanings; there could not be a verb that meant the same thing as persuade but had the control behavior of promise” (p. 420). As C&J develop this lexical-semantic perspective to control, they offer several arguments against the MTC. For this reason, we will begin our discussion of C&J’s treatment of control by defusing the critical points they make against the MTC. Then we turn a critical eye towards C&J’s own proposals. Ultimately, we argue that C&J’s approach fails in its explanatory ambitions and reduces to little more than a list of cases.

6 As Culicover and Jackendoff’s approach to control is more fully developed in their 2005 book, we will base our critique on that work (henceforth C&J).
7 C&J avoid using the PRO-notation in discussing control, though they note (p. 416, footnote 1) that their arguments are “largely neutral” concerning PRO. As we find the exposition is streamlined if we include PRO, we do so in what follows.

7.3.1 Some putative problems for the movement theory of control

C&J offer three major kinds of objections to syntactic approaches to control. Let us consider them in turn.

7.3.1.1 Control by adjectival expressions

One of the objections raised by C&J involves sentences such as (9) below, which is taken to show that OC PRO is licensed despite the absence of an appropriate antecedent in the syntactic structure. The reasoning is supposed to go as follows.8 PRO in (9) is interpreted as ‘America’ though it does not appear in the structure. The most similar potential antecedent is ‘American.’ However, it cannot be the antecedent of PRO on the assumption that adjectives cannot antecede PRO.

(9) An American attempt [PRO to dominate the Middle East]
But why assume that ‘American’ in (9) cannot be a controller? It appears to be able to “corefer” with a pronoun and even bind a reflexive, as illustrated in (10) below. As such, it might plausibly be able to control a PRO.

(10) a. Every American_1 military attempt at pacification in Iraq was a failure. It_1 was finally forced to approach the UN
     b. Every American presentation of itself in international venues is appalling
Assuming that (9) indeed involves OC, the MTC should analyze it along the lines of (11), where ‘American’ gets the external θ-role of ‘dominate’ and then moves up to the thematic domain of ‘attempt’ and receives its external θ-role.

(11) [DP an [NP American attempt [CP C [TP American PRO to [vP American dominate the Middle East]]]]]

8 C&J do not actually spell out the problem that (9) poses for syntactic theories of control. They simply cite the facts and conclude that they speak for themselves. We hope that we have thrust the right words into their mouths.
Is the movement from a nominal to an adjectival position in (11) illicit? It is not actually very clear what is wrong with this sort of derivation. For whatever reason, adjectives like ‘American’ seem to be able to bind nominal anaphors (cf. [10b]). Such binding normally requires categorical identity, but this seems not to hold in these cases. If not, there is no obvious reason why the same categorical laxity should not extend to movement. Moreover, it is likely that a parallel assumption would have to be made were one to provide an account of (9) in semantic terms. Such an account would have to allow the adjective ‘American’ to saturate the attempter θ-role in the mapping between conceptual structure and syntax despite the fact that adjectives do not typically do this. In other words, if what ‘American’ is doing in (9) is not typical of adjectives, then the interrelation between conceptual structure and syntax will have to reflect this exceptional behavior by expressly permitting it. Unless it can be demonstrated that the quirky behavior of adjectives like ‘American’ is actually smooth and just what one should expect when viewed from C&J’s perspective, it is not clear why cases like (9) are more problematic for syntactic theories than for semantics-based accounts. In sum, as far as we can tell, as it stands, the observation that sentences like (9) involve control does not in itself argue against syntactic accounts or for semantic ones.

7.3.1.2 (Apparent) lack of antecedent for PRO

Another challenge that C&J pose to the MTC involves sentences such as (12) below. The alleged problem with (12) is that there is no controller at all in the overt syntax (ignoring the parenthetical material), but the sentences are understood as the attempter being the leaver and the person ordered leaving, i.e., it is ‘Fred’ when the PP is overt.

(12) a. Any such attempt (by Fred) [PRO to leave] will be punished
     b. Yesterday’s orders (to Fred) [PRO to leave] have been cancelled
C&J provide a possible way out for the syntactically inclined. They note (without much enthusiasm) that “[o]ne could stipulate a phantom position in the specifier that can never be realized by anything but a null NP; alternatively one could stipulate a null by-phrase in [12a] and a null to-phrase in [12b]” (2005: 418). C&J claim that this move would be problematic as it would be motivated on theory-internal grounds and hence with “no independent syntactic motivation.” Why theory-internal grounds in general are not motivation enough is unclear to us. We can see why C&J’s global theoretical predilections for a very spare syntax would find these motivations weak. However, as we do not share their aesthetic, it is not clear why we should not adopt their apposite proposal. If the theory that provides the requisite “internal grounds” is desirable independently (which we think the MTC is), then argument is needed not to follow the theory’s promptings. This places the burden of proof on C&J to demonstrate that such an extension fails empirically for, otherwise, the route motivated on theory-internal grounds is methodologically preferred, pace C&J. Note, incidentally, that, given the discussion of pro in Chapter 6, there is nothing untoward about having a null pronoun move from within the embedded clause of (12a) and (12b) to the thematic domain of ‘attempt’ and ‘orders.’ Deverbal nouns arguably have thematic requirements analogous to those of their verbal counterparts and project into the syntax in a well-behaved UTAH-like manner (see Baker 1997). Thus, after a pro moves to the thematic domain of ‘attempt’ and ‘orders’ in (12), it receives the relevant θ-role and can be licensed by the inherent case associated with nominals (see Chomsky 1986b).9 C&J note (2005: 418) two further cases where there is no conceivable source for the controller:

(13) a. How about PRO taking a swim together?
     b. PRO undressing myself/ourselves/yourself in public may annoy Bill

Here C&J suggest that only at pain of adopting Ross’s (1970) theory of performative deletion could these facts be accounted for in a syntactic theory of control like the MTC. This is incorrect. Independent of Ross’s proposal,10 the sentences in (13) are not cases of OC but of NOC. As such, they are not the residues of A-movement, but are instances of pro. In fact, these are cases of what C&J call free control and we call non-obligatory control. Depending on the conversational context, these PROs can be anaphoric to anything: ‘Bill and Martha are filthy from all the work in the garden. Can you think of any way they can clean up? How about washing each other in the outdoor shower?’ Consequently, this is not a particular problem for the MTC or any other syntactic account unafraid of null elements and not wedded to the specific theoretical assumptions in C&J’s “simpler syntax.”

9 As argued by Hornstein, Martins, and Nunes (2008), Nunes (2008b), and Kato and Nunes (2009), inherent case is not overtly realized if its bearer is a phonologically empty category.
10 Ross’s earlier analysis has been revived recently in Etxepare (1998) and Nissenbaum (2000), among others.

7.3.1.3 Apparent syntax–semantics mismatches

C&J advance a third kind of argument against syntactic accounts of control. They claim (2005: 419) that “the choice of controller can be doubly dissociated from syntactic configuration,” that is, in some cases “the same syntactic configuration can be associated with different controllers” and in other cases “the controller can appear in different syntactic configurations, while preserving meaning.” According to them, (14) and (15) illustrate cases of same structure but different controllers and (16), cases of different structures but same controller.

(14) a. John_i persuaded Sarah_j [PRO_j/*i to dance]
     b. John_i promised Sarah_j [PRO_i/*j to dance]

(15) a. John_i talked about [PRO_i/gen dancing with Jeff]
     b. John_i refrained from [PRO_i/*gen dancing with Jeff]

(16) a. Bill ordered Fred_i [PRO_i to leave immediately]
     b. Fred_i’s order from Bill [PRO_i to leave immediately]
     c. The order from Bill to Fred_i [PRO_i to leave immediately]
     d. Fred_i received Bill’s order [PRO_i to leave immediately]
In (14) PRO is controlled by ‘Sarah’ with ‘persuade’ but by ‘John’ with ‘promise.’ In turn, (15a) contrasts with (15b) in that the former allows an impersonal (gen) reading with PRO, while the latter does not. As regards (16), ‘Fred’ is the antecedent for PRO despite the “blatant” syntactic differences among the examples. C&J (p. 419) asseverate that only “the dogma that control is syntactic” could motivate a syntactic distinction in these cases on which to rest an explanation for the witnessed differences. For C&J, “intuition suggests that the differences are a consequence of what the verbs mean” (p. 419). This may be correct, but we see precious little beyond confident assertion here. First of all, a false dichotomy has been drawn here between lexical meaning and syntactic structure. Since the earliest days of generative grammar it has been recognized that these two notions are closely related. A predicate’s thematic-argument structure and how this structure is syntactically realized are closely connected. As such, it is reasonable to suppose that if predicates have different argument structures they also have different syntactic structures. Using such an assumption based on Baker’s (1997) UTAH, we argued in section 5.5.1 that, contrary to surface appearances, ‘Sarah’ in (14) is a direct object when the complement of ‘persuade,’ but the object of a null preposition (as in dative-shifted structures) when associated with ‘promise.’ This difference accounts for why, in the latter case, ‘John’ is a possible controller while, in the former, it is not. If this is correct, then it suggests that considerable care must be taken when diagnosing underlying syntactic structure from apparent surface form. Generative grammarians have regularly noted that surface form can be misleading. We believe that the ‘persuade’/‘promise’ contrast is a good illustration of this, pace C&J’s assumption that the syntax of ‘persuade’ and ‘promise’ in (14) is the same. Of course, if it is not, as argued in section 5.5.1, their double-dissociation argument weakens.
Consider now the contrast in (15). Example (15a), but not (15b), allows PRO to be interpreted impersonally. C&J do not, to our knowledge, explain why this contrast exists besides attributing it to the lexical meanings of the predicates involved. However, it is plausible that part of the contrast revolves around the semantics of the matrix predicates in these constructions and not specifically with the interpretation of PRO. Why so? Chomsky (1986b) proposed the following rule of thumb: if an overt impersonal is disallowed in a certain structure, then it is no surprise that a covert one with PRO is barred as well. In the case of ‘talk about’ vs. ‘refrain from’ it is very plausible that part of the relevant difference reflects the meanings of these predicates, as attested by the examples in (17).

(17) a. John talked about one’s/her dancing with Fred
     b. John_1 refrained from *one’s/*her/his_1/PRO_1 dancing with Fred
Example (17b) shows that ‘refrain from’ requires that the subject of its gerundive complement be anteceded by the subject of the predicate regardless of how the gerundive subject is expressed. Given this, one might attribute the oddities in (17b) to a fact about the meaning of ‘refrain from,’ e.g., it makes no sense for one to refrain from anyone else’s doing something.11 When identity of reference between subjects is established (be it by binding or control), semantic coherence obtains and the sentences are acceptable. We have no problem with this sort of answer in this case precisely because the constraints ‘refrain from’ imposes on interpretation are insensitive to the particular syntax used to express it. When this obtains, a lexical-semantics account seems reasonable, even when it is unclear how to develop the account in detail. Note that, even if this account of the semantics of ‘refrain from’ is correct, it does not imply that the PRO in (15b) is not a trace of movement.12 There is no contradiction (or even infelicity or dogma) in assuming that the meaning of ‘refrain from’ demands certain antecedence relations and at the same time assuming that these relations are established by certain grammatical processes in cases like (15b) (e.g., movement, as the MTC would propose).

11 To say this is not yet to explain in virtue of what this makes no sense. It is just a bald statement saying that whatever ‘refrain from’ means, it implies one can only refrain from one’s own activities.
12 Note that replacing ‘his’ with ‘him’ in (17b) renders the sentence unacceptable, as ‘him’ cannot be interpreted as anteceded by ‘John’ in this configuration. If pronouns can only be used where movement is forbidden (see Hornstein 2001, 2007, and Chapter 6 above), this suggests that A-movement out of accusative-assigning gerunds is allowed, but A-movement out of genitive-assigning gerunds is not.

To get a better bearing on when a lexical-semantics account is apposite, it is useful to contrast ‘refrain from’ with ‘talk about.’ The latter’s meaning contrasts with the former’s in tolerating a DP in the subject of its gerundive complement that is not controlled by its subject, as the acceptability of ‘one’s’ and ‘her’ in (17a) demonstrates. Thus, it cannot be that the required interpretation of (15a) with ‘John’ being antecedent of PRO follows from the meaning of ‘talk about’ in a way parallel to ‘refrain from’ in (15b). In other words, the fact that different pronouns are semantically coherent in (17a) raises the question of why the antecedent of PRO in (15a) remains ‘John’ even in cases like (18) below, where an overt pronoun can have ‘Mary’ as antecedent. Note that one cannot simply attribute this to the meaning of ‘talk about’ for, whatever its meaning is, it is clearly compatible with having ‘Mary’ as antecedent (as seen if we replace PRO with a pronoun). But, if this is correct, then some account of this fact that does not rely exclusively on the lexical semantics of ‘talk about’ is required.

(18) Mary_1 said that John talked about her_1/*PRO_1 washing herself

C&J tend, in our view, to treat all cases of control as if they were like the ‘refrain from’ example above. However, in most of the cases of interest, it is far from clear that the meaning of the predicate precludes the control relations claimed to be impossible, at least if we test the relevant cases using alternative syntactic realizations. Consider one more illustration of the problem, C&J’s favorite contrast involving ‘promise’ vs. ‘persuade’:

(19) a. John_i promised Bill_k that he_i/k would leave on time
     b. John_i persuaded Bill_k that he_i/k should leave on time

(20) a. John_i promised Bill_k PRO_i/*k to leave on time
     b. John_i persuaded Bill_k PRO_k/*i to leave on time
Example (19) shows that the embedded pronoun can be coherently anteceded by either ‘John’ or ‘Bill.’ That being so, why is it that in (20a) the only possible antecedent is ‘John’ while in (20b) it is ‘Bill’? Note that the embedded predicates in all the cases of (19) and (20) are “actional” (using C&J’s categorization). Nonetheless, in (19) the antecedent of ‘he’ is not fixed while in (20) the antecedent of PRO is. If the “meaning” of the matrix verb operates to fix antecedence in (20), why does it fail to work its magic in (19)? It would appear that more than the meaning of the embedding predicate is relevant. At the very least, the syntactic structure involved also contributes to the attested control properties. Syntactic theories of control, including the MTC, have aimed to bridge the gap between what lexical semantics plausibly supplies and what is actually observed. We have argued that PRO is a residue of A-movement and movement to the matrix-subject position in persuade-constructions violates minimality, which explains why ‘John’ is not a potential controller in (20b) (see section 3.4.1). Minimality does not block similar movement in (20a) as ‘Bill’ is within a PP (see section 5.5.1). As the movement is licit, ‘John’ can control PRO in (20a).13 In sum, a full account of the control properties involved here is underdetermined by the lexical semantics of the embedding predicates, i.e., the control properties of ‘promise’ and ‘persuade’ are nothing like those of ‘refrain from.’

13 One additional point. According to C&J (p. 432), Rosenbaum’s (1967) minimal-distance principle (MDP) should be abandoned because it “fails to account for long distance control in subject complements and for free and nearly free control in object complements (e.g. John talked to Sarah about defending himself).” Note, however, that this poses no problem at all for the MTC. Recall from section 3.4.1 that, within the MTC, the MDP is subsumed under minimality (a condition on movement). As such, the MDP/minimality is relevant only when movement obtains – in cases of OC, but not in cases of NOC. The cases that C&J mention are cases of NOC and, as such, they should not show MDP/minimality effects, which is indeed correct. In fact, C&J’s attitude towards MDP is somewhat difficult to discern. In the body of their control chapter, they repeatedly claim that the MDP is empirically inadequate (see pp. 432, 434, 440). However, in a curious footnote (p. 435, footnote 10) they concede that “[i]t is true . . . that there is a strong bias toward interpreting NP V NP to VP as object control, and this may be a default constructional meaning that makes it hard for some speakers (especially young ones, as shown by C. Chomsky [1969]) to get subject control readings.” This sounds very much like Rosenbaum’s original proposal concerning the MDP, which was understood as a markedness condition, hence defeasible given sufficient data. Rosenbaum argued that ‘promise’ was not a counter-example to the MDP despite its being a subject-control predicate, precisely because of its odd acquisition profile. Rosenbaum cited C. Chomsky’s work on the acquisition of ‘promise’ and noted that its late mastery is just what we would expect if the MDP were a markedness condition on determining controllers. Thus, it appears to us that, despite invidious remarks to the contrary, C&J adopt a version of the MDP which they interpret in markedness terms and which can be overridden by “predicate and complement semantics.”
Some notes on semantic approaches to control
Consider now the observation that in (16), repeated below in (21), different syntactic structures result in identical control configurations. This, in itself, is not particularly surprising given the assumption that different syntactic configurations can arise from the same basic underlying form. The different syntactic configurations in (22), for example, are not a problem for any syntactic theory of control (including the MTC) given the standard assumption that actives and passives are transformationally related.
(21) a. Bill ordered Fred_i [PRO_i to leave immediately]
b. Fred_i’s order from Bill [PRO_i to leave immediately]
c. The order from Bill to Fred_i [PRO_i to leave immediately]
d. Fred_i received Bill’s order [PRO_i to leave immediately]
(22) a. John ordered Bill PRO to leave
b. Bill was ordered (by John) PRO to leave
Thus, the import of C&J’s point regarding the examples in (21) relies on two assumptions: (i) that the same control relations are realized in all these configurations; and (ii) that these different configurations are not syntactically related. Were the sentences in (21) syntactically related, they would be no more problematic than the sentences in (22). So the relevant questions are: (i) whether the interpretive properties of verbs and their deverbal nominalizations are indeed identical; and (ii) whether it is reasonable to suppose that they share an underlying form. Let us consider these questions in turn. Do verbs and their nominalized counterparts have the same interpretive properties? Not completely. Though the sentences in (21) all have interpretations in which PRO is controlled by the individual who has been ordered, this need not be identical to the recipient of an order. Thus, for example, (23a) and (23b) are acceptable while (23c) is not.
(23) a. Mary saw the order from Bill to Fred [PRO to ready ourselves/oneself/herself for departure]
b. Bill’s order to Fred [PRO to ready ourselves for departure]
c. *Bill ordered Fred [PRO to ready ourselves for departure]
In (23a) and (23b) ‘Fred’ can be interpreted as the person registering the order though not necessarily ordered himself. With this understanding of the to-PP, one can interpret PRO rather freely, depending on the context. Such a reading is not available in (23c). Thus, the to-PP inside a nominal is not an infallible guide to the thematic role of its complement or, to put it in different words, it does not seem to be the case that in (21) we necessarily have a single interpretation associated with different syntactic forms, as implied by C&J.
7.3 “Simpler syntax”
But putting such cases aside, there is a reading of (21b–d) in which ‘Fred’ is understood as having been ordered and, under this reading, it controls PRO in the nominal cases, just as in the verbal counterpart in (21a), which indicates that control within nominals strongly parallels what we find in their verbal counterparts. This brings us to our second question: do verbs and their deverbal nominalizations share a common underlying syntactic form? C&J presume that they do not, although they do assume that they share an underlying conceptual structure. However, it is not clear, at least to us, that there are compelling arguments against verbs and their nominalizations sharing a common underlying syntactic form. This is certainly not an exotic assumption and it has been fairly common since the earliest days of generative grammar.14 C&J present no arguments against it and so we see no reason as yet to reject it. We also accept their implicit argument that the MTC is committed to the claim that verbs and their deverbal nominalizations are syntactically related.15 If control is due to movement in the former then it is also due to movement in the latter (see section 5.3). However, given the state of flux concerning the details of the relation, we have nothing concrete to propose about the structure of verbs and nouns. Suffice it to say that we know of no reason why movement could not operate within either domain with the same interpretive effects if we assume that the way thematic information projects within the verbal and nominal domains is guided in the same way (e.g., via UTAH). If this is correct, although C&J are right that one should strive for a unified account of control with nominal and verbal predicates, their conclusion that this necessarily requires computing control non-syntactically at the level of conceptual structure does not follow. 
This conclusion requires arguing that verbs and their deverbal nominals are not syntactically related and we see no reason to accept this assumption at this point.16
14 This was the assumption in Lees (1960). Chomsky (1970) argued against Lees’s transformational analysis but not against the view that verbs and nouns could have parallel underlying representations. This is certainly implicit in ‘Remarks’ where it is assumed that the object of ‘destruction’ and ‘destroy’ are semantically and syntactically analogous. The same basic assumption carries over to recent distributed-morphology analyses. Here categorical identity is a rather surfacy property. If this is so, then nothing prevents treating nouns and verbs as essentially derived from a common structure that is categorically neutral.
15 Actually, this assumes that the semantic properties of nominals are determined directly from the structure of the nominal and not via some relation to its corresponding sentence. We leave this possibility aside for now.
16 In fact, to our knowledge most generative grammarians assume that there is a syntactic relation between verbs and their deverbal counterparts.
In sum, the main argument C&J have against syntactic accounts of control in general and the MTC in particular is that the requisite syntactic structure
is unavailable. The strongest argument revolves around the parallel properties of nouns and verbs as regards control. For the MTC to capture these parallels in terms of movement requires tolerating a more abstract syntax than C&J seem comfortable with, though not one that is particularly exotic given current assumptions. As such, their arguments against syntactic approaches like the MTC rest on premises that syntactic-centric theories have no problem rejecting and, consequently, most of their objections simply beg most of the relevant questions.
7.3.2 Challenges for “simpler syntax”
Let us now consider some problems for the approach based on conceptual structure advocated by C&J.
7.3.2.1 Adjunct control
As discussed in section 7.2, adjunct control is quite problematic for semantic approaches to control and the “simpler-syntax” approach is not exceptional in this regard. In fact, C&J concede that “[o]ne point in control theory where some syntactic constraint seems unavoidable is in the control of adjuncts” (2005: 425). However, we believe that C&J fail to fully appreciate how severe a challenge adjunct control is to their entire enterprise. As discussed in section 4.5.1 and reviewed in section 7.2, adjunct control shows all of the typical marks of complement control. In an approach like “simpler syntax,” control is fundamentally a fact about “the lexical semantics of the predicate that selects the infinitival or gerundive complement” (p. 416). But, in adjunct-control configurations, there is no special property in the lexical semantics of the embedding predicate that could determine control, yet the very same properties associated with complement control arise. So why should this be if obligatory control is primarily a lexical-semantic fact about embedding predicates?
Given that adjunct control exhibits all the diagnostic properties of complement control but without the specific lexical predicates whose semantics is supposed to ground these properties, it is very unclear how C&J could integrate adjunct control into their basic story. Stipulating a very specific rule to the effect that, when there is an adjunct in need of a controller, the local syntactic subject must function as the unique controller will not have any explanatory heft. In fact, it will be a backward step when compared to earlier accounts that pursued a conceptually more appealing route subsuming adjunct-control cases to the MDP.
As noted in section 4.5.1, the MTC accommodates adjunct control via sideward movement, a grammatical option made available once D-structure is dispensed with. The distinctive properties of these constructions can be largely derived given recent minimalist assumptions, as we have shown. This allows the MTC to give a unified account to both complement and adjunct control. Thus, the fact that they show identical properties is no surprise, as it would be on a mixed theory of the kind mooted in “simpler syntax.”17 In fact, things are even worse than this for C&J’s analysis. Recent work by Nunes (2008c) on adjunct control in Brazilian and European Portuguese shows that independent syntactic properties may yield subject or object control in adjunct-control configurations. More specifically, he shows that in Brazilian and European Portuguese the subject of infinitival adjunct clauses may be controlled by the matrix subject or the matrix object, depending on whether or not the matrix object undergoes wh-movement, as illustrated in (24) below.18 Example (24b) has a wh-in situ in the matrix clause and the result is subject control, as in (24a), with no wh-element involved. By contrast, (24c) has wh-movement and now both subject and object control are possible.
17 C&J claim that even in the case of adjunct control “some nonsyntactic influence is necessary” (p. 426). They note that there can be control of adjuncts by implicit arguments within nominals in examples such as (i) below and conclude that this tells against syntactic approaches. However, this position again assumes that the syntax cannot have phonetically null arguments, a claim that we have argued is contentious.
(i) Such a brutal interrogation of the suspect without PRO considering the legal repercussions could lead to disaster
C&J (p. 147) also discuss cases such as (ii) below, which illustrate the so-called “event” control originally discussed by Williams (1985).
(ii) a. The ship sinks (in order) PRO to further the plot
b. The ship was sunk (in order) PRO to collect the insurance
C&J analyze these as involving control by an implicit agent. However, whether this is so is quite unclear. Example (iia) can be seen as a typical case of event control with the paraphrase ‘The ship sinks in order for its sinking to further the plot.’ PRO is not controlled by an implicit agent but by the event. Curiously, this interpretation is not available in other cases of adjunct control: *‘The ship sinks before/after furthering the plot.’ Example (iib) is more unusual as it seems that the implicit agents of the sinking are the collectors. However, even here there must be something else going on. For example, these cases of control do not license anaphors, even with the by-phrase present: *‘The ship was sunk (by John) to make himself famous.’ It is unclear why not, if indeed it is the implicit agent that is doing the controlling. The unacceptability follows if even here we have some kind of event control.
18 For original discussion of the finite counterparts of (24) in Brazilian Portuguese, see Modesto (2000), Rodrigues (2004), and Nunes (2008c).
(24) European and Brazilian Portuguese (Nunes 2008c):
a. O João_i cumprimentou a Maria_k depois de PRO_i/*k entrar na sala
   The João greeted the Maria after of enter in-the room
   ‘João greeted Maria after entering the room’
b. O João_i cumprimentou quem_k depois de PRO_i/*k entrar na sala?
   The João greeted who after of enter in-the room
   ‘Who did João greet after entering the room?’
c. Quem_k é que o João_i cumprimentou t_k depois de PRO_i/k entrar na sala?
   Who is that the João greeted after of enter in-the room
   ‘Who_k did João_i greet after he_i/k entered the room?’
Assuming with Bošković (2007) that the strong feature that triggers successive cyclic movement is hosted by the moving element, Nunes (2008c) proposes that in languages like Brazilian and European Portuguese, with optional wh-movement, this feature is lexically optional on wh-elements. Moreover, the presence of this feature in the derivation has consequences for economy computations regarding merge-over-move. Recall that subject control over object control is enforced in adjunct control due to merge being more economical than move (see Hornstein 2001 and section 4.5.1.1 above). In the case of (24a), for instance, if ‘o João’ is in the subject position of the adjunct clause, it cannot undergo sideward movement to the complement of the matrix verb, for merger of ‘a Maria’ in this position is more economical. So, after ‘a Maria’ is merged, ‘o João’ can only move to the matrix [Spec, vP], yielding subject control. Bearing this in mind, let us now consider the contrast between (24b) and (24c) regarding object control. Nunes argues that they involve the derivations sketched in (25a) and (25b) respectively.
(25) a. *O João [[cumprimentou [quem_uF]_i] [depois de t_i entrar na sala]]
   The João greeted who after of enter.INF in-the room
   ‘Who_i did João greet after he_i entered the room?’
b. [quem_√F]_i é que o João [[cumprimentou t_i] [depois de t_i entrar na sala]]
   Who is that the João greeted after of enter.INF in-the room
   ‘Who_i did João greet after he_i entered the room?’
The wh-element of both derivations in (25) entered the numeration specified with a strong feature uF, which in turn requires that the wh-phrase must move if possible. This requirement of the strong feature now overrules merge-over-move, for things are not equal anymore. If the wh-element sits in the subject of the adjunct clause and sideward movement to the matrix-object position is possible, such movement must take place. Now, if merge-over-move is circumvented in the presence of a strong feature, this strong feature must be checked. Hence, (25a) is unacceptable not because movement of the wh-element from the adjunct clause to the matrix-object position violates merge-over-move, but because the strong feature of the wh-phrase remained unchecked. When it is checked by moving to [Spec, CP], as in (25b), the derivation converges, yielding an object-control reading.19 In conclusion, it is not at all obvious how C&J can incorporate adjunct control in their system as it attempts to derive obligatory control from semantic relations between embedding predicates and their complements. To put this more starkly: the fact that one gets all the properties of obligatory control in adjunct-control configurations in the absence of the relevant properties that C&J identify as the main causal agent suggests that they have nabbed the wrong suspect. By contrast, the MTC not only provides a unified movement account of both complement and adjunct control, but can also account for cases where adjunct control may be of both the subject and the object type.
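The economy reasoning behind (24) and (25) can be restated as a small decision procedure. The following Python sketch is only an illustration of that reasoning under simplifying assumptions: the names Nominal, derive, and CRASH are hypothetical, each derivation is reduced to two nominals, and nothing here is meant as an implementation of the analysis in Nunes (2008c).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nominal:
    """A nominal in the toy numeration (illustrative assumption)."""
    form: str
    strong_feature: bool = False  # stands in for the uF of (25)


CRASH = "crash: strong feature unchecked"


def derive(adjunct_subject: Nominal, other: Nominal, wh_fronted: bool) -> str:
    """Return the control reading of a toy adjunct-control derivation.

    adjunct_subject starts out as the subject of the adjunct clause and
    other is the remaining nominal of the matrix clause. wh_fronted records
    whether the wh-phrase ends up in [Spec, CP].
    """
    if adjunct_subject.strong_feature:
        # The strong feature overrules merge-over-move: the adjunct subject
        # must move sideward to the matrix-object position.
        matrix_object, reading = adjunct_subject, "object control"
    else:
        # Merge-over-move: the other nominal is merged as the matrix object,
        # so the adjunct subject can only move on to the matrix [Spec, vP].
        matrix_object, reading = other, "subject control"
    if matrix_object.strong_feature and not wh_fronted:
        # As in (25a): the strong feature is never checked.
        return CRASH
    return reading


joao = Nominal("o João")
maria = Nominal("a Maria")
quem_plain = Nominal("quem")
quem_strong = Nominal("quem", strong_feature=True)

print(derive(joao, maria, wh_fronted=False))        # (24a): subject control
print(derive(joao, quem_plain, wh_fronted=False))   # (24b): subject control
print(derive(quem_strong, joao, wh_fronted=False))  # (25a): crash
print(derive(quem_strong, joao, wh_fronted=True))   # (25b): object control
print(derive(joao, quem_strong, wh_fronted=True))   # footnote 19: subject control
```

The sketch encodes the strong feature as a property of the moving element, so the choice between subject and object control falls out of which nominal occupies the adjunct-subject position and whether the feature is eventually checked.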
19 The subject-control reading of (24c) is obtained from a derivation in which the wh-phrase gets merged in the matrix-object position and ‘o João’ moves from the adjunct clause to the matrix [Spec, vP], as illustrated in (i) (see Nunes 2008c).
(i) [Quem_√F]_i é que o João_k [t_k [cumprimentou t_i] [depois de t_k entrar na sala]]
   Who is that the João greeted after of enter.INF in-the room
   ‘Who_i did João_k greet after he_k entered the room?’
7.3.2.2 The problem of the distribution of PRO
C&J characterize the empirical problem of control as essentially that of controller selection. In addition to the problems this sort of approach suffers from (see section 7.2), C&J’s treatment does not address any of the other factors often taken to be central to control phenomena. Most conspicuously absent is any account of the distribution of PRO. C&J’s overall approach to syntax (“simpler syntax”) is against using abstract expressions like PRO (though, curiously, they have no qualms about abstract
entities in either semantics or phonology). However, whether one adopts PRO or eschews it, one is still left with the task of explaining why controllees must be syntactic subjects and why they so typically reside in non-finite clauses. Note, we say syntactic subjects for we take it as given that there is no thematic restriction on controllees. Anything that can be a syntactic subject in a clause (either base generated or derived) is a potential controllee under the appropriate predicate. The problem of the distribution of PRO amounts to explaining why the control relation (most particularly OC) is generally restricted to controllee subjects of non-finite clauses. Consider a concrete analysis of control by C&J. According to them, a lexical predicate’s control obligations are embedded in its conceptual structure as an instruction that one of its arguments antecede some argument of the embedded complement. As C&J restrict their attention largely to actional complements, these instructions are represented as in (26), for example. (26)
X^α INTEND [α ACT]
The αs in (26) indicate that X, the logical subject of INTEND, controls the α in the embedded action complement. The question of the distribution of PRO amounts to the question of why it is that this controllee within the ACT complement is always a syntactic subject. Note first that there is no obvious semantic reason for why only syntactic subjects are controlled. Even if one takes control to be a relation to an actional property, it does not follow that the relevant open variable must be in subject position. Of course, one can always stipulate that the second α in (26) is the subject of the embedded ACT complement, but this would be unsatisfactory for obvious reasons. The problem C&J’s approach faces is clear: C&J stress that it is conceptual structure (i.e., semantics), and not overt syntax, that is “an appropriate level for stating control relations” (p. 419). But, as noted, it is syntactic subjects, not conceptual subjects, that are controllees. Thus, if one restricts oneself to conceptual-structure information alone, it is unclear how to track one of the most basic facts about control, namely, that it is syntactic subjects that are controlled.
7.3.2.3 Problems with C&J’s decompositional approach
The final problem we would like to point out with respect to the approach to control defended in “simpler syntax” is that it is essentially based on a decompositional approach to natural-language predicates and, as such, it suffers from the sorts of problems often noted to be endemic to such approaches;
namely, they are either clearly wrong or (where they are not vacuous) largely stipulative. The basic idea behind C&J’s analysis of control consists of two steps. First, they take control to be a semantically primitive relation inherent in the meaning of certain basic predicates. As an illustration of what this means, consider once again the discussion above of ‘refrain from.’ It is arguable that it is part of the basic semantics of this predicate that one cannot refrain from someone else’s doings. This is manifested in the status of sentences like ‘John refrained from Mary’s leaving early,’ which are not simply ungrammatical, but, one might contend, incoherent. Evidence that this is a basic feature of the semantics of ‘refrain from’ (and not merely a feature of the control configuration) comes from noting the semantic acceptability of ‘John refrained from his leaving early’ when ‘John’ is understood as anteceding ‘his.’ Thus, the effects of the meaning of the predicate are exercised across disparate syntactic forms and enforce the same condition: one can only refrain from acts that one commits oneself. The second step is to propose that all cases of obligatory control stem from including one of these primitive-control predicates in their meaning as represented in their conceptual structure. For example, C&J take ‘intend’ to be a primitive controller with the conceptual structure in (26). The binding between X and the ACT is taken to be “inherent” within (26). Thus, the binding that one sees in (27) results from the fact that ‘intend’ (the word) is associated with the conceptual structure of INTEND in (26).20 (27)
John_i intended PRO_i to leave early
However, INTEND can also be part of more complex conceptual structures. For example, the conceptual structure of ‘persuade’ also involves it as it means ‘cause to come to intend.’21 Thus the reason that one finds control in (28) is in virtue of the conceptual structure of ‘persuade’ including INTEND, which requires as part of its inherent meaning that the persuadee INTEND to do what s/he is persuaded to do. (28)
John persuaded Mary_i PRO_i to leave early
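The two-step logic just described, with control inherent in a primitive such as INTEND and inherited by any predicate whose conceptual structure contains that primitive, can be made concrete with a toy encoding. The Python sketch below is only an illustration of the proposal as summarized here: the class Concept, the function controller, and the role labels are hypothetical names, not C&J’s formalism, and INTEND is simply assumed to be the only primitive controller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """A node in a toy conceptual structure (illustrative assumption)."""
    predicate: str
    argument: Optional[str] = None          # role label of the predicate's argument
    complement: Optional["Concept"] = None  # embedded conceptual structure


# Assumption for this sketch: INTEND is the only primitive controller.
PRIMITIVE_CONTROLLERS = {"INTEND"}


def controller(structure: Concept) -> Optional[str]:
    """Return the argument of the first primitive controller found, if any."""
    node: Optional[Concept] = structure
    while node is not None:
        if node.predicate in PRIMITIVE_CONTROLLERS:
            return node.argument
        node = node.complement
    return None


ACT = Concept("ACT")

# 'intend' is associated directly with INTEND, as in (26).
intend = Concept("INTEND", argument="intender", complement=ACT)

# 'persuade' as CAUSE to COME to INTEND: the persuadee is the intender.
persuade = Concept(
    "CAUSE",
    argument="persuader",
    complement=Concept(
        "COME",
        complement=Concept("INTEND", argument="persuadee", complement=ACT),
    ),
)

print(controller(intend))    # intender
print(controller(persuade))  # persuadee
```

On this encoding, subject control in (27) and object control in (28) both reduce to the single stipulation attached to INTEND, which is the feature of the approach that the discussion below takes issue with.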
20 Henceforth, natural-language predicates (words) will be in single quotes and conceptual-structure predicates will be capitalized.
21 Actually, more likely it involves CAUSE to COME to INTEND, three conceptual-structure predicates.
Other primitives include OBLIGATE, SHOULD_root, and CS, a complex stand-in for verbs of the force-dynamic class of predicates (pp. 447–448). C&J
restrict their study to the class of voluntary actions, which covers those cases where the predicate takes an ACT complement, this being a semantic class that cuts across the finite/non-finite category (cf. their discussion on pp. 429–431). C&J propose that these cases all involve primitive predicates like those noted above which, as part of their inherent semantics, select a unique antecedent from the arguments of the embedding predicate to control the “actor” for the ACTion that is complement to that predicate.22 From the above, it should be clear that the success of this approach to control relies on two details: (i) finding a class of primitive predicates for which it is plausible that their control properties are inherent in their meaning; and (ii) showing how complex predicates include these and manifest control in virtue of including these in their conceptual structures. We believe that C&J succeed in neither task and as a result fail to successfully motivate this approach to control. Here is the source of our skepticism. First, consider the primitive-semantic-control predicates. The astute reader will have noticed the flurry of capital letters in defining the primitives: INTEND, OBLIGATE, etc. Why the resort to upper case? One reason is that these denote primitives in the conceptual system, not natural-language predicates. As such, for example, INTEND is not the same as ‘intend,’ nor is CS a predicate of English at all. This raises the following puzzle, which we dub Fodor’s problem (as Jerry Fodor is someone who has worried about this issue consistently since the heyday of generative semantics; see e.g. Fodor 1975): how are we to understand predicates like INTEND, OBLIGATE, and so on? If, for example, we take INTEND to mean ‘intend’ and we take control to mean what it conventionally means (namely, uniquely specifying the value of the syntactic-subject argument of the ACT complement) then it is hard to see that it requires control as part of its meaning.
The reason is that ‘intend’ behaves quite unlike ‘refrain from,’ which we took to be a plausible model for what might be meant by “controlling in virtue of its meaning.” For example, unlike INTEND, ‘intend’ imposes no requirement on the subject of its complement, as C&J note in their discussion of coercion (p. 452), providing the following kinds of examples:
22 It is unclear why conceptual structures require unique control. There are verbs that one could imagine would semantically fit with more than one controller. For example, ‘agree.’ If α agrees with β then α and β agree. Nonetheless, though ‘John and Bill agreed to help each other’ is perfectly acceptable, *‘John agreed with Bill to help each other’ is not. Other options come easily to mind. Why conceptual structures specify exactly one controller is, as far as one can see, a stipulation.
(29) a. Hilary intends for Ben to come to the party
b. Hilary intends that Ben come to the party
Note this is precisely what ‘refrain from’ disallowed (*‘Hilary refrained from Bob coming to the party’) and this suggests that it does not follow from the very meaning of ‘intend’ that the subject of the embedded ACT complement must be controlled by the matrix subject. Thus, INTEND cannot mean ‘intend.’ But what does it mean, then? Here is one place we do not wish to go: it means just what ‘intend’ means but it requires that the embedded subject be controlled as with ‘refrain from.’ The reason is that we have no reason to think that such a predicate exists and even if it did, it is not clear in what sense the primitive-control property follows from the meaning of INTEND except by stipulation. In other words, one problem with capitalization is that it threatens to bleach all explanatory value from the exercise as it stipulates what it claims to explain. C&J are aware of this problem and they seem to take a different tack. From what we can gather, they do assume that INTEND means ‘intend’ and that all uses of ‘intend’ actually involve control, though not necessarily in the sentence at hand but in some more abstract way. C&J are not very precise here so we may have misunderstood them, but what they seem to say is the following (the relevant discussion is in C&J 2005: 452). In cases like (29) control exists despite all appearances to the contrary. What these sentences mean is paraphrased “approximately” as (30), and (30) involves control of the PRO of the interpolated predicate ‘bring it about.’ (30)
Hilary_i intends PRO_i to bring it about that Bill come to the party
So, to allow INTEND to mean ‘intend’ and to show that ‘intend’ always involves control, C&J expand the concept of control to include cases of control of an implicit subject of an implicit ACT complement as well as explicit ones. However, now we are faced with the second problem for decompositional approaches like this one: paraphrases are not meanings. Though we agree that the sentences in (29) can be paraphrased as (30) for many circumstances, this does not imply that (29) means what (30) does. So, for example, it is not clear to us that (31) below is a contradiction, which it should be if (29) meant (30). (31)
Hilary intended for Bill to come to the party though, being lazy and complacent, she intended to do nothing whatsoever to bring this about
We can make a stronger case to the same effect with another of C&J’s examples. They analyze ‘plan’ as also involving INTEND. Thus, (32) should be interpreted as (33):
(32)
Hilary is planning that Ben will come to the party
(33)
Hilary is planning to bring it about that Ben will come to the party
However, here it is quite clear that Hilary could be planning what we have in (32) without planning what we have in (33). Rather, her plans can be based on the assumption that Ben will come without her doing or having to do anything about it. In short, (34) is clearly not contradictory. (34)
Hilary is planning that Ben will come without planning to bring it about that Ben will come
There is a way of repairing this problem. We can assume that ‘plan’ does not involve INTEND as a subpart. However, if it does not, it leaves unexplained why ‘plan’ heads a control structure when coupled with a non-finite complement as in ‘John_i planned PRO_i to leave.’ One could always fish around for another primitive controller to ameliorate matters, but in the end the problem illustrated here is, we believe, a quite general instance of Fodor’s problem. Fodor has repeatedly pointed out that the capitalized predicates exploited in semantic-decomposition proposals rarely mean what their lower-case brothers do. As such, it is never quite clear what they do mean. But if this is so, it is an unpromising strategy to aim to explain how their putative semantic powers arise from their meanings. More often than not, either what is delivered is not an explanation but a stipulation, or the meanings are left unspecified (albeit with a nod and a wink towards the lower-case analogue). This general problem carries over to C&J’s control account. Until we know what INTEND means exactly, we cannot have any confidence that its meaning underlies its control powers. Moreover, as all control ultimately rests on the meaning of these primitive predicates, we are left with no explanation of control at all. There is yet another problem with C&J’s strategy. Recall what the aim of the game is. The project is to show how what grammarians think of as control actually piggybacks on a more basic notion of semantic control that arises from a small class of primitive predicates that enforce control as an inherent part of their meanings. However, what C&J claim is that there exists a type of implicit control in sentences like (29) and (32) that appears not to be an instance of grammatical control at all; there are no PROs, non-finite clauses, or apparent binding.
In other words, C&J in effect propose that there are cases of semantic control even in the absence of syntactic control, as what the primitive-control predicates enforce is not syntactic control but semantic control. But this leads to a prediction: licit control interpretations are semantically, not syntactically, delimited. This seems clearly incorrect. Note that under C&J’s approach,
(35) below is a control configuration with the rough interpretation in (36). However, if (36) is a fine reading of (35), why can (37a) with indexation in (37b) not mean what (35) means? (35)
Ben’s mother said that Hilary is planning that Ben will come to the party
(36)
Ben’s mother said that Hilary is planning to bring it about that Ben come to the party
(37) a. Ben’s mother said that Hilary is planning PRO to come to the party
b. *Ben’s_1 mother said that Hilary is planning PRO_1 to come to the party
What gets us (37b) is the mechanism C&J exploit to “regularize” (29) and (32) in service of getting INTEND to mean ‘intend.’ As noted, this move effectively divorces syntactic and semantic control as control configurations obtain even in the absence of a syntactically bound PRO. But so construing control should allow a syntactic PRO to be bound by an antecedent other than its grammatical controller so long as semantic control can obtain. This is what is done in (37b). It should be fine with the meaning paraphrased in (36), i.e., ‘Hilary’ controls the subject of ‘to bring it about’ and so the control imposed by the “meaning” of INTEND has been satisfied. However, (37a) does not have this meaning. Thus, the “fix” C&J propose to allow INTEND to be a basic controller in virtue of its meaning actually ends up leaving unexplained standard cases of syntactic control. Another illustration of Fodor’s problem as pertains to control is appropriate. C&J suggest that the object-control properties of ‘persuade’ come from its meaning “cause to come to intend” (p. 446). But anyone can see that this is not what ‘persuade’ means (e.g., one can cause someone to come to intend to leave by threatening or humiliating them rather than persuading them); it is, at most, a part of its meaning. But even this is only clear for those cases where the complement is infinitival. Thus (38) is not well paraphrased as (39) despite ‘persuade’ being the matrix predicate. (38)
John persuaded Mary that Frank had come
(39)
John caused Mary to come to intend that Frank had come
If so, then it cannot be that ‘persuade’ (somewhat) means ‘cause to come to intend’ but, at most, that ‘persuade’ with an infinitival complement means this. Thus, the meaning, such as it is, is a product of the composition of the non-finite complement and ‘persuade.’ But what is it about this combination that results in this meaning? C&J do not say. Perhaps because the complement when finite need not be an ACTion. But note here it is an ACTion (‘what Frank did was come/Frank came voluntarily’). How does the fact that it need not have been an action (although it is one) affect the compositional properties of the predicate? C&J again do not say. Moreover, even if the complement is infinitival, unless it has a PRO subject we do not see syntactic control, as illustrated in (40) below. Though (40) is degraded, it is perfectly meaningful. We understand it to mean something roughly like (41).
(40) ??John persuaded Mary for Bill to come
(41) John persuaded Mary to bring it about that Bill come
Note that if we assume that (40) is actually a case of semantic control (like in [29] and [32]), then we end up with the problem analogous to that in (37a): why can (42) below not mean what (43) means? One possible answer is that control semantics only arises when the infinitival complement has a PRO subject. This would get the result but it would clearly not be very satisfactory for we would want to know what it is that PRO contributes and this is precisely what the proposed theory of control is supposed to explain.
(42) John1 persuaded Mary PRO1 to come
(43) John1 persuaded Mary to bring it about that he1 come
There are several moves one might make to finesse this line of reasoning. However, the point is not whether one could do this in this particular case. Rather, the point is that C&J owe us a theory of how to compose complex conceptual structures from primitive ones so that we see why control occurs with non-finite complements but not with finite ones. When ‘persuade’ has a finite clausal complement there is no control. Why not, given the meaning of ‘persuade’ noted above? Why does it become semantic control with a non-finite complement and why grammatical control when the non-finite complement has a PRO subject? C&J do not say. However, unless C&J say why control with ‘persuade’ relies on these conditions (i.e., unless some rules of composition are offered), we have not been offered an account of how the meaning of ‘persuade’ results in control.
We could go on and on. However, we hope that the main objection to C&J’s semantic project is clear. Theirs is an essentially decompositional approach to control. As such, it suffers the well-known vicissitudes of all such theories. In place of capitalization, C&J need to provide an account of the meanings of the primitive predicates and principles of composition for forming complex conceptual structures from basic ones. The problem is that compositional theories never really manage to do this. Either the meanings of the conceptual structures are identified with that of the words that embody them (IDENTIFY/‘identify’) or their meaning is left either unspecified or stipulated. In the first instance it is easy to argue that the proposed meanings cannot be correct. In the second, no real account is offered at all. As such, decompositional accounts generally resolve into elaborate discussions of individual cases. In effect, in place of a general account, we are provided with a list. We hope that the reader appreciates how much more the MTC offers.
7.4
Conclusion
This may be a good moment to note that accepting the MTC does not entail rejecting semantic contributions to determining control. As noted above with respect to ‘refrain from,’ for instance, a better understanding of the lexical semantics of embedding predicates in complement-control structures will certainly supplement syntactic accounts in explaining why a given binding relation must obtain regardless of whether the embedded subject is overt or not. However, the overall conclusion of the discussion in the previous sections is that, left to their own devices, semantic approaches to obligatory control do not constitute a viable alternative to syntactic approaches as they are bound to both under- and overgenerate. Once they are essentially based on the semantic relations between control predicates and their complements, they undergenerate in being unable to account for adjunct control and for the syntactic conditioning one may find in the specification of subject or object control in adjunct configurations (see section 7.3.2.1). On the other hand, they overgenerate in incorrectly predicting obligatory control where the subject of the embedded complement to a control predicate is overt (see the discussion of [19], [29], [32], for instance). That semantic approaches to obligatory control fail to achieve an adequate level of theoretical explanation becomes even more transparent when a detailed comparison with the MTC is made. As shown in the sections above, the MTC is able to provide a non-stipulative, conceptually motivated, and unified account of all the problems faced by semantic approaches to control, without sacrificing empirical coverage.
8
The movement theory of control and the minimalist program
8.1
Introduction
The movement theory of control (MTC) has been, much to our surprise, a controversial proposal. This book has aimed to outline (and trumpet) its virtues and to consider (and parry) purported vices. In this concluding chapter, we would like to return to the MTC’s conceptual foundations, especially as regards its relation to the broader concerns of the minimalist program. Truth be told, our main interest in the MTC has much more to do with minimalism than with control. Or, more accurately, we believe that control is currently interesting because of its near perfect fit with certain central tenets of the minimalist program as realized in the MTC. Indeed, we could be persuaded to go so far as to claim that a minimalistically respectable account of control will necessarily have some version of the MTC at its core. This chapter is a defense of this claim. The defense will proceed through various layers of abstraction. First, we will remind readers of the fact that (obligatory) control relations exemplify canonical properties of movement, minimalistically construed. Second, we return once again to how the minimalist program and the MTC conceptually intertwine, the latter presupposing the truth of some of the central tenets of the former − in particular, the elimination of D-structure. Third, we argue that the MTC alone is consistent with the explanatory ambitions of the minimalist program. The main reason for this is that PRO-centered accounts of control must run afoul of the anti-construction bias characteristic of the minimalist program, a prejudice embodied in the inclusiveness condition, which we interpret to forbid (among other things) coding of formal properties as lexical properties. 
In the process of making these claims we note that the MTC is actually a conservative extension in a minimalist setting of the classical approaches to control, and in one sense much more consistent with classical generative analyses than are PRO-based accounts, despite the latter’s superficial notational similarity to the GB view. The take-home message is that if you like the minimalist program, then you should love the MTC and eschew PRO-based accounts of control, as they fit poorly with the principles, architecture, and explanatory aspirations of the minimalist program.
8.2
Movement within minimalism and the movement theory of control
Within the minimalist program there are several signature properties of movement, the three major ones being locality, economy, and copying. We have argued that the dependency exhibited in obligatory control (OC) acts as if governed by the same restrictions. In other words, given standard minimalist marks of movement, OC is a dependency generated by movement. Let us illustrate.
With respect to locality, there are two cases: minimality effects and freezing effects. Let us consider them in turn, starting with minimality. Intervening DPs in A-positions block the establishment of further A-dependencies by prohibiting A-movement across them. This, we have argued, is the source of the minimal-distance principle’s restrictions on possible antecedents of PRO. Thus, in cases like (1), ‘John’ is not a possible antecedent of PRO, as moving across ‘Mary’ is prohibited by minimality (see section 3.4.1).
(1) John1 persuaded Mary2 PRO∗1/2 to leave
Moreover, apparent exceptions to this generalization, e.g., subject control with ‘promise’ and control shift, arise precisely because the apparent intervening DPs do not actually c-command the PRO position and so do not function as interveners (see section 5.5). Note that this same account explains why it is that OC PRO must be a syntactic subject (see section 7.3.2.2). Consider the structure in (2), for instance, which represents A-movement from a non-subject position:
(2) [ . . . DP1 . . . [DP2 V DP1 . . . ]]
The dependency illustrated in (2) between the two instances of DP1 violates minimality as they span an intervening DP2. Given that OC requires an antecedent, an OC chain will always bottom out in a syntactic subject, for all other dependencies will necessarily violate minimality.1
1 Still compatible with minimality are potential cases with a non-subject OC PRO where PRO sits in a caseless position. For example, Hornstein (2001), following a suggestion by Alan Munn, proposes that “reflexive” predicates like ‘wash,’ ‘shave,’ and ‘dress’ in English optionally assign case and may therefore license an OC PRO/A-trace in the object position, as represented in (i). The effects of minimality in these constructions can be observed in (ii), where ‘Mary’ cannot be the antecedent of PRO, because ‘Bill’ intervenes.
(i) Bill1 washed/shaved/dressed PRO1
(ii) Mary1 wants Bill2 to wash/shave/dress PRO2/∗1
Let us now consider the effects of freezing, which in the case of A-relations prohibits A-movement from a “φ-complete” domain. Freezing explains why control, like raising, is possible into non-finite clauses but is generally prohibited into finite ones, as illustrated in (3) below. Given that in English finite clauses are φ-complete domains, freezing prohibits A-movement from the subject position of the embedded clause of (3b).
(3) a. John hoped [PRO to win]
    b. ∗John hoped [PRO won]
Two points are worth emphasizing here. First, the MTC is unique among current approaches to control in providing a principled explanation of the contrast in (3) (see section 2.5). All other analyses are reduced to stipulating the distribution facts of OC PRO − that it appears exclusively in the subject position of φ-defective clauses, e.g., infinitives and gerunds. Second, exceptions to freezing are expected to be possible control configurations, and indeed they are. For example, in Brazilian Portuguese, finite indicative clauses can be defective and, accordingly, they do support OC dependencies (see section 4.4).
Let us now examine another characteristic feature of movement within MP, economy in the sense of Chomsky 1995. It too governs OC dependencies. In the guise of merge-over-move, it is instrumental in explaining the restriction to subject antecedents in adjunct-control configurations. In (4) below, for example, ‘John,’ but not ‘Sue,’ can control PRO due to merge-over-move (see section 4.5.1.1). If merge-over-move is at the root of this restriction, then OC dependencies must be generated by movement, as merge-over-move regulates movement, and not construal/Agree operations.2
(4) John1 saw Sue2 [before PRO1/∗2 leaving the party]
2 Recall that Portuguese allows subject or object control in adjunct configurations depending on whether or not the matrix wh-object is in situ or has undergone movement (see section 7.3.2.1). This once again shows that it is the possibilities for movement to take place that lead to subject or object control (see Nunes 2008c).
A fourth diagnostic property of movement within the minimalist program is that it produces copies (hence the copy theory of movement). If control involves movement, then there should be copies. This expectation is clearly borne out in copy-control languages and languages that allow backward control (see sections 4.5.4 and 4.5.3). Indeed, as we have argued, it seems to us that only the MTC can account for backward control, for such control configurations in PRO-based theories necessarily violate principle C and so should be impossible.3 That the MTC combines so neatly with the copy theory of movement to provide a straightforward account of these phenomena is, in our view, a particularly clear illustration of the tight conceptual fit between central tenets of the MTC and the minimalist program.
Note that these four features of OC configurations reflect the operation of very general principles of movement in the minimalist program. Following the duck principle (if it quacks, walks, and flies like a duck, it is a duck), from the fact that OC dependencies look as if they respect these principles, it follows that OC is a dependency generated by movement, at least from a minimalist perspective. Importantly, the above principles and their concomitant technology are central features of minimalist grammars. To the extent that they reflect more fundamental minimalist conceptions (and most do),4 control must be a movement relation on minimalist grounds given that it exhibits what the minimalist program takes to be the core properties of movement.
3 To date we know of no PRO-centric analysis of backward control. This suggests that there is a consensus: should backward control exist, then PRO-based analyses are incorrect. For recent further novel examples of backward control, see e.g., Fujii (2006) and Alexiadou, Anagnostopoulou, Iordachioaia, and Marchis (2008).
4 For example, the copy theory of movement, which replaces the less methodologically valued trace theory, may be taken to follow from the “simplest” definition of merge (see Chomsky 2004). In turn, minimality and freezing have been central features of generative grammar since the late 1980s and are excellent candidates of “least effort” principles, which are the hallmark of the minimalist program. Of the group, merge-over-move is the most conceptually suspect, though also the most deeply embedded in current phase-based approaches.
8.3
The movement theory of control and the minimalist architecture of UG
The MTC rests on the assumption that movement into θ-positions is grammatically viable. In other words, the MTC is incompatible with D-structure. D-structure, recall, is the syntactic level where all and only θ-relations are coded. It is also the input to all transformation processes (e.g., movement). Together, these two properties (i) prohibit movement into θ-positions and (ii) require that all argument DPs begin their derivational lives in θ-positions. The MTC is clearly incompatible with (i) and thus its theoretical viability requires the elimination of D-structure as a grammatical level. As disposing of D-structure (a methodologically unwelcome grammar-internal level) is a central architectural feature of the minimalist program, there exists a tight conceptual connection between the minimalist program and the MTC. How tight? The MTC clearly implies the absence of D-structure. Moreover, the absence of D-structure is sufficient to license the MTC. Thus, a central architectural feature of the minimalist program − the elimination of D-structure − is both necessary and sufficient for the MTC.
Before considering this in more detail, let us take a brief digression. Though incompatible with (i), the MTC is not incompatible with (ii). The MTC requires that movement into θ-positions be possible, but it is agnostic as to whether argument DPs must begin their derivational lives in θ-positions. We mention this, for the only empirical argument against movement into θ-positions that we know of, which was offered in Chomsky 1995, can rely on (ii) only. Consider the sentence (5a), for example. This sentence is derived if ‘John’ is merged into the embedded [Spec, TP] in preference to raising ‘someone.’ ‘John’ then moves up to the matrix [Spec, vP] to get the external θ-role and up to the matrix [Spec, TP] to get case checked, as illustrated in (5b). If we assume that ‘someone’ in (5b) can check (partitive) case with ‘be’ (see e.g., Belletti 1988 and Lasnik 1995), then the derivation should converge. Moreover, this derivation should block the derivation of (6). Chomsky (1995) notes that (5b) respects merge-over-move, whereas the derivation underlying (6) would not (‘someone’ is raised to [Spec, TP] instead of ‘John’ being merged into that position).
(5) a. ∗John expects to be someone kissing Sam
    b. [TP John [vP John expects [TP John to be [vP someone kissing Sam]]]]
(6) John expects someone to be kissing Sam
Chomsky (1995) uses the ban on movement into θ-positions to rule out the derivation in (5b). However, this approach is not in consonance with his general attempt to minimize “look-ahead.” He observes that prohibiting first merge of an argument into a non-thematic position will allow the illicit nature of (5b) to become immediately evident. This is correct. However, if this prohibition is not primitive but actually a subpart of the more general prohibition against movement into θ-positions, it is not clear that the grammar will locally block the derivation. The purported grammatical violation (the lack of a θ-role in ‘John’) will only become evident downstream, as it were, and not at the point of the derivation where ‘John’ is introduced. So the (primitive) prohibition against first merging into a non-thematic position is not only sufficient to rule out the derivation in (5b), but also more congenial to local computations. This is good news for the MTC. So interpreted, the ungrammaticality of (5b) does not tell against the MTC because the MTC is consistent with the requirement that argument DPs enter derivations through a thematic door, so long as subsequent movement into θ-positions is countenanced.
This said, the reader should not conclude that the proposed prohibition against first merge into non-thematic positions is correct. There are accounts that derive the facts in (5) and (6) by assuming that ‘be’ is not a case checker and that ‘someone’ cannot check accusative case against the matrix light verb due to the intervention of the lowest copy of ‘John’ (Nunes 1995, 2004) or by assuming that non-finite clauses do not have TP specifiers (see Castillo, Drury, and Grohmann 1999 and Epstein and Seely 2006). Furthermore, it is not clear why the proposed restriction against merging into non-thematic positions should exist if D-structure does not (not a small matter in the context of minimalist theorizing). However, for current purposes, it is worth noting that these issues, though important, are incidental to the MTC proper and − what is important here − that the data in (5) and (6) are not problems for it.
To recap: one of the central architectural features of the minimalist program has been the elimination of D-structure. There is thus a very close conceptual connection between the minimalist program and the MTC, to wit, that a necessary condition for the theoretical viability of the MTC (that D-structure not exist) is one of the central tenets of the minimalist program. This contrasts with most other theories of control currently being considered. They do not rely on any distinctive minimalist assumptions and thus, though they might be compatible with the minimalist program, their theoretical apparatus (though not the technology used to express control dependencies) is largely independent of it.
The conceptual connections between the minimalist program and the MTC are stronger still. Not only does the MTC imply the absence of D-structure, but the absence of D-structure is sufficient for the MTC given standard ancillary assumptions. Specifically: once D-structure is eliminated as a grammatical level, nothing prohibits movement into θ-positions. However, if so, then the MTC is a grammatical option. Thus, not only is eliminating D-structure a necessary condition for the MTC, it is a sufficient one as well:
(7) MTC ↔ no D-structure
In other words, to the extent that the elimination of D-structure is a central feature of the minimalist program, the MTC is quintessentially minimalist. If this is correct, the reader may be asking, why has this not been observed previously? The main reason is that eliminating D-structure does not imply removing D-structure conditions from the grammar. Here is some Whig history: Chomsky’s (1993) argument against D-structure was actually quite narrowly focused. D-structure within GB, for example, is a level with many distinctive properties: it is input to the transformational component, meaning that all D-structure operations precede all transformational operations, and it represents “pure GF-θ.” Chomsky (1993) argues that the first noted feature of D-structure (Satisfy in the parlance of Chomsky [1993]) must be dispensed with and grammars must adopt generalized transformations that allow derivations to interleave operations akin to lexical insertion with operations akin to movement. Chomsky (1993) actually retains the thematic restrictions coded at D-structure, but in another form. It proposes banning movement into θ-positions, or restricting a DP’s θ-role assignment to its first merge. In sum, in Chomsky (1993), the elimination of D-structure is only partial. The MTC requires that it be complete: not only must Satisfy be rejected, but the segregation of functions between lexical insertion and movement (the first being designated to satisfy θ-relations, the latter to satisfy all the other grammatical dependencies) should be given up as well. The upshot then is that the MTC requires a more radical elimination of D-structure than considered in Chomsky (1993).5
5 More accurately: as discussed above, the MTC requires that movement into θ-positions be an option; it commits no hostages to whether argument DPs must first merge into θ-positions.
How reasonable is this? On methodological grounds, the wholesale elimination of D-structure and its restrictions is clearly the preferred option. If D-structure has certain properties, then eliminating D-structure entails removing the restrictions coded in D-structure from UG. Moreover, we believe that current theoretical assumptions internal to the minimalist program lead to the same conclusion. The 1993 vintage of the minimalist program distinguishes two different operations: move and merge. The former contrasts with the latter in being greedy and being driven by feature-checking requirements. More recent avatars of the minimalist program take merge and move to be different instances of the very same operation. If so, then either both should be subject to feature-checking requirements or neither should be. Whichever tack one takes, however, the prior differentiation between move and merge is conceptually difficult to retain and, correspondingly we believe, the prohibition against movement into θ-positions becomes theoretically awkward to enforce. Thus, on both methodological and theory-internal grounds, we believe that there is every reason to retain the methodologically superior option (the complete elimination of D-structure and its properties) that underwrites the MTC.
In sum: the MTC is closely tied to a central feature of the Minimalist Program – the elimination of D-structure. Removing D-structure and its attendant properties from UG is both necessary and sufficient for the MTC to be grammatically viable. In other words, once D-structure is removed, the only way of preventing it is to encumber UG with principles that are not otherwise methodologically or theoretically required. The minimalist ethos frowns on this.6 Of course, this does not mean that the MTC is correct. The facts may force us to back away from the minimalist optimum. We have argued extensively that the facts do not require this. However, regardless of how this plays out empirically, we think it a virtue of the MTC that it follows quite seamlessly from the elimination of D-structure, especially in the context of the minimalist program.
6 See for example Chomsky’s (2004) discussions on treating move as just internal merge. As he observes, one need not do this, but not doing it is methodologically suspect as it comes for free unless specifically blocked.
8.4
Inclusiveness, bare phrase structure, and the movement theory of control
The elimination of D-structure has a second important consequence for the theory of control. D-structure, recall, consists of two kinds of operations: phrase-structure rules and lexical-insertion operations. In this context, PRO is a perfectly sensible grammatical element. It is what arises via the application of a DP phrase-structure rule sans lexical insertion, namely [DP e]. In a minimalist approach that adopts bare phrase structure (see Chomsky 1994, 1995), this derivational option no longer exists. Bare phrase structure strongly embodies the X’-theoretic conception that phrases are projections of lexical elements.7 No lexeme, no phrase structure. This makes the GB conception of PRO above a non-starter.
7 One can adopt the endocentricity assumption behind X’-theory without assuming that phrases are projections of lexical heads. On such an interpretation, X’-conventions regulate phrase-structure rules: XP→ . . . X . . . It is consistent with this that there be lexical-insertion rules that replace X with lexical items of category X. However, the original motivation behind X’-theory was to remove the redundancy in systems that generated phrase structure and lexical insertion that pruned it. The pruning seemed to imply that lexical items coded structural consequences. If this is so, why not just build the structure required by the head that is inserted − in other words, build to specifications contained in the head? This assumes not merely endocentricity but the stronger notion that phrases are projections of heads. On this latter view, one cannot have phrases that are lexically headless.
So, how is PRO to be described? There are two options: as a primitive lexical item or as a grammar-internal formative. The second option is the one explored by the MTC. Before considering the first option, let us quickly see how the MTC is actually a conservative extension in a minimalist setting of the classical GB conception. Within GB, PRO is, as noted, a grammatical formative − an expression whose existence and properties derive from the organization of the grammar. In this respect, PROs are cousins to traces, the main difference being their grammatical provenance, traces being by-products of movement. Nevertheless, both elements are structurally similar (both have the shape [DP e]), the only formal difference being the source of their indices, movement supplying the requisite index for traces and construal doing the same for PROs. In GB, traces and PROs are formally identical at LF.8 The MTC models this similarity by identifying PROs and A-traces. As copies replace traces in the minimalist program, the formal similarity embedded within GB carries over to the minimalist program if PROs are copies as well. This is the crux of the MTC; PROs are what we call A-traces that have wandered into θ-positions. What is critical to note here is that copies are perfectly well-defined elements within minimalism and occurrences/copies of an expression are licit entities consistent with bare phrase structure.9 Moreover, the properties of control structures are expected to derive from general principles of grammar, as control relations − like A-trace dependencies − are grammatical products formed by move/(re)merge. So, as in GB, the MTC embodies the assumption that the properties of control configurations derive from (and so directly reflect) the underlying operations and principles of UG.
8 PROs are identical to A-traces in particular, which like PROs are bereft of case, in contrast to A’-traces.
9 One argument against traces is that unlike copies they violate the inclusiveness condition. On the assumption that derivations cannot “add” new elements to the derivation, traces are methodologically suspect elements. Of course, if PROs and traces are identical, then classical GB PROs are suspect elements as well, being purely theory-internal formatives.
Let us now consider the option of treating PRO as a lexical item and not a grammatical formative. On this conception, PRO is like ‘the,’ ‘dog,’ ‘bring,’ ‘this,’ etc. It lives in the lexicon and it can merge and move, just like any other lexical item or phrase. There are no problems with bare phrase structure on this conception because PRO functions like any other (nominal) expression drawn from the lexicon. However, it is worth considering for a moment how radical a departure this is from the classical conceptions of control. Generative grammar has generally analyzed control properties as grammatical by-products. In the standard theory, PRO is a phonetic gap that results from deletion under equi. Taking “PRO” to be the product of this operation aims to explain its semantic and phonetic properties. In turn, in GB, PRO is [DP e]. The analysis of PRO as [DP e] is meant to account for its distribution and its semantic and phonological interpretation. In both cases, the analysis of “PRO” reflects the view that control facts directly follow from basic operations and organizing principles of grammars (see sections 2.3 and 2.4). In contrast, assuming that PRO is a lexical item (rather than a grammatical formative) is, in effect, to treat its special licensing requirements as lexical quirks and this seems to us quite wrongheaded.
Indeed, in a generative context, one can go further. So treating PRO is to endorse a form of “constructionism.” Here is what we mean. Since the early 1980s, generative grammarians have assumed that constructions do not exist. What does this mean? It is the claim that the fundamental principles of grammar operate independently of the lexical items that they manipulate. For example, relative clauses are not islands because they involve particular lexical heads or contain particular lexical items, but because they instantiate particular structures. Topicalization, focus, and relativization do not obey islands because they involve topic, focus, or relative heads, but because they all involve A’-movement. In other words, grammatical operations and restrictions have the properties they do not because of the functional features of the “constructions” in which they apply, but because of the formal properties that these constructions instantiate. It is in this sense that constructions do not exist; they are not the fundamental units of syntactic analysis. The problem with treating PRO as a lexical item is that it amounts to analyzing control configurations as constructions: control properties follow from the unique properties of the lexical item PRO, which defines the construction.
In effect, the “control construction” directly reflects the idiosyncratic properties of a distinctive lexical item, rather than the basic operations and organization of the grammar. The GB antipathy to construction accounts carries over to the minimalist program, where it is coded in the inclusiveness condition. The inclusiveness condition can be interpreted as a prohibition against confusing lexical and structural information. It discourages coding structural information onto lexical formatives. But this is what reducing the grammatical properties of control structures to the lexical requirements of a lexical item PRO does. Indeed, many (if not all) the properties of this “lexical” item cannot even be identified independently of the grammar. PRO needs a local, c-commanding, syntactic antecedent and can only be licensed within (tense- or -)defective domains. How are these requirements to be stated in purely “lexical” terms? How can they be expressed except by adverting to grammars, their structures, and their basic operations and principles? They cannot be. PRO’s requirements are grammatical licensing requirements. Postulating PRO makes no sense except in a grammatical
context. Its requirements are entirely grammar-internal. Even describing what they are requires reference to principles and operations of the grammar. Consequently, treating PRO as a lexical element violates the spirit of the inclusiveness condition and so renders PRO a suspect element, given minimalist standards. In the end, postulating lexical elements like PRO to account for the attested properties of control cannot possibly yield explanations of these properties (descriptions yes, explanations no), for a lexical item like PRO codes as part of its content the very properties that are supposed to be explained. This is the (very high) cost of violating the inclusiveness condition. If this is correct, then the upshot is that PRO-based accounts of control within minimalism are not compatible with the explanatory ideals of the minimalist program. But as the only non-PRO-centric theory of control is the MTC, this implies that only the MTC is compatible with the minimalist program. The reader will have noticed that this reinforces our earlier conclusion reached above regarding the MTC and the elimination of D-structure. Dispensing with D-structure implicates the MTC. Bare phrase structure implicates it as well. There is thus a very tight conceptual fit between the MTC and the minimalist program. We take this to be a positive feature of the analysis. Once again, one should not conclude that, because the MTC fits well with the minimalist program, the MTC is correct. However, it does suggest that those with minimalist aspirations should smile on the MTC and that the burden of proof must be with those that reject it. Furthermore, if the fit between the minimalist program and the MTC is as tight as we have suggested, then the evidentiary bar relevant to rejecting the MTC should be quite high. If we are correct, then, among the alternatives on offer at present, only the MTC has the capacity to move beyond description to explanation. 
The reason is that only the MTC evades constructionism and tries to derive the properties of control structures from general principles of grammar rather than from the special licensing conditions of a peculiar lexical item. Moreover, if we are right, the kind of explanation we should provide incorporates the MTC, as this is an inherent feature of minimalism that ineluctably follows the elimination of D-structure.
8.5 Conclusion
In the earlier chapters we have tried to elaborate a movement theory of control. We have tried to address the empirical difficulties attributed to the MTC as well as outline what we take to be its empirical strengths. In this last chapter we have briefly recapped why we think that the MTC is a particularly interesting
theory in the context of the minimalist program. We have argued that, in a minimalist context, the MTC is essentially the null hypothesis about control. Conceptually, methodologically, and theoretically, it is an almost perfect fit with the minimalist program. Given these virtues, its considerable empirical coverage is a delightful bonus. The MTC not only covers virtually all of the classical facts in a principled manner; it has also led to the discovery of new kinds of data (e.g., backward control and copy control) and points to novel kinds of derivations (sideward movement) that have led to an appreciation of the subtle possibilities inherent in the modern minimalist approach to grammar. The MTC might be wrong (though we doubt this), but we do not believe that it will be wrong in a trivial way. Given how deeply interwoven it is with general minimalist precepts, principles, and operations, showing that it is inadequate should tell us a lot about how control in particular and anaphoric dependencies in general are grammatically coded and, importantly, about the scope and prospects of the minimalist project.
References
Abels, Klaus. 2003. Successive cyclicity, anti-locality, and adposition stranding. Doctoral dissertation. University of Connecticut. Alboiu, Gabriela. 2007. Moving forward with Romanian backward control and raising. In W. D. Davies and S. Dubinsky (eds.). New Horizons in the Analysis of Control and Raising. Dordrecht: Springer, 187–211. Alexiadou, Artemis, and Elena Anagnostopoulou. 1998. Parametrizing Agr: word order, verb-movement and EPP-checking. Natural Language and Linguistic Theory 16(3): 491–539. Alexiadou, Artemis, Elena Anagnostopoulou, Gianina Iordachioaia, and Mihaela Marchis. 2008. A stronger argument for backward control. Paper presented at NELS, Cornell University. Andrews, Avery. 1982. The representation of case in Modern Icelandic. In J. Bresnan (ed.). The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press, 424–503. 1990. Case structures and control in Modern Icelandic. In Joan Maling and Annie Zaenen (eds.). Modern Icelandic Syntax. San Diego, CA: Academic Press, 187– 234. Aoun, Josef. 1979. On government, case marking, and clitic placement. Unpublished manuscript. MIT. Badecker, William, and Kathleen Straub. 2002. The processing role of structural constraints on the interpretation of pronouns and anaphors. Journal of Experimental Psychology: Learning, Memory and Cognition 28: 748–769. Baker, Mark. 1988. Incorporation: A Theory of Grammatical Function Changing. University of Chicago Press. 1997. Thematic Roles and Syntactic Structure. In L. Haegeman (ed.). Elements of Grammar. Dordrecht: Kluwer, 73–138. Baker, Mark, Kyle Johnson, and Ian Roberts. 1989. Passive arguments raised. Linguistic Inquiry 20: 219–252. Baltin, Mark, and Leslie Barrett. 2002. The null content of null case. Unpublished manuscript. New York University. Barrie, Michael. 2007. Control and wh-infinitivals. In W. D. Davies and S. Dubinsky (eds.). New Horizons in the Analysis of Control and Raising. Dordrecht: Springer, 263–279. Belletti, A. 1988. 
The case of unaccusatives. Linguistic Inquiry 19: 1–34.
Belletti, A., and L. Rizzi. 1988. Psych verbs and theta theory. Natural Language and Linguistic Theory 6: 291–352. Berwick, Robert, and Amy Weinberg. 1984. The Grammatical Basis of Linguistic Performance. Cambridge, MA: MIT Press. Bobaljik, Jonathan. 1995. In terms of merge: copy and head movement. MIT Working Papers in Linguistics 27: 41–64. Bobaljik, Jonathan, and Samuel Brown. 1997. Inter-arboreal operations: head movement and the extension requirement. Linguistic Inquiry 28: 345–356. Bobaljik, Jonathan, and Idan Landau. 2009. Icelandic control is not A-movement: the case from case. Linguistic Inquiry 40: 113–132. Boeckx, Cedric. 1999. Conflicting c-command requirements. Studia Linguistica 53: 227–250. 2000. A note on contraction. Linguistic Inquiry 31: 357–366. 2003. Islands and Chains. Amsterdam: John Benjamins. 2008. Bare Syntax. Oxford University Press. Boeckx, Cedric, and Norbert Hornstein. 2003. Reply to “Control is not movement.” Linguistic Inquiry 34: 269–280. Boeckx, Cedric, and Norbert Hornstein. 2004. Movement under control. Linguistic Inquiry 35: 431–452. Boeckx, Cedric, and Norbert Hornstein. 2006a. Control in Icelandic and theories of control. Linguistic Inquiry 37: 591–606. Boeckx, Cedric, and Norbert Hornstein. 2006b. The virtues of control as movement. Syntax 9: 118–130. Boeckx, Cedric, and Norbert Hornstein. 2007. On (non-)obligatory control. In W. D. Davies and S. Dubinsky (eds.). New Horizons in the Analysis of Control and Raising. Dordrecht: Springer, 251–262. Boeckx, Cedric, Norbert Hornstein, and Jairo Nunes. 2007. Overt copies in reflexive and control structures: a movement analysis. University of Maryland Working Papers in Linguistics 15: 1–45. www.ling.umd.edu/publications/volume15 Boeckx, Cedric, Norbert Hornstein, and Jairo Nunes. 2008. Copy-reflexive and copy-control constructions: a movement analysis. Linguistic Variation Yearbook 8: 61–100. Boeckx, Cedric, Norbert Hornstein, and Jairo Nunes. 2010.
Icelandic control really is A-movement: reply to Bobaljik and Landau. Linguistic Inquiry 41(1): 111–130. Bošković, Željko. 1994. D-structure, theta-criterion, and movement into theta-positions. Linguistic Analysis 24: 247–286. 1997. The Syntax of Nonfinite Complementation: An Economy Approach. Cambridge, MA: MIT Press. 2002. On multiple wh-fronting. Linguistic Inquiry 33: 351–383. 2007. On the locality and motivation of move and Agree: an even more minimal theory. Linguistic Inquiry 38: 589–644. Bošković, Željko, and Jairo Nunes. 2007. The copy theory of movement: a view from PF. In Norbert Corver and Jairo Nunes (eds.). The Copy Theory of Movement. Amsterdam: John Benjamins, 13–74. Bouchard, Dennis. 1984. On the Content of Empty Categories. Dordrecht: Foris.
Bowers, John. 1973. Grammatical relations. Doctoral dissertation. MIT. 2006. On reducing control to movement. Syntax 11: 125–143. Bresnan, Joan. 1982. Control and complementation. In Joan Bresnan (ed.). The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press, 282–390. Britto, Helena. 1997. Deslocados à esquerda, resumptivo-sujeito, ordem SV: a expressão do juízo categórico e tético no português do Brasil. Doctoral dissertation. Universidade Estadual de Campinas. Castillo, Juan Carlos, John Drury, and Kleanthes Grohmann. 1999. Merge over move and the extended projection principle. University of Maryland Working Papers in Linguistics 8: 63–103. Cecchetto, Carlo, and Renato Oniga. 2004. A challenge to null case theory. Linguistic Inquiry 35: 141–149. Chierchia, Gennaro. 1984. Topics in the syntax and semantics of infinitives and gerunds. Doctoral dissertation. University of Massachusetts. Chomsky, Carol. 1969. The Acquisition of Syntax in Children from 5 to 10. Cambridge, MA: MIT Press. Chomsky, Noam. 1955. The Logical Structure of Linguistic Theory. New York: Plenum Press. 1957. Syntactic Structures. The Hague: Mouton. 1964. Current Issues in Linguistic Theory. The Hague: Mouton. 1970. Remarks on nominalisation. In R. Jacobs and P. Rosenbaum (eds.). Readings in English Transformational Grammar. Waltham, MA: Ginn, 184–221. 1977. On wh-movement. In P. W. Culicover, T. Wasow, and A. Akmajian (eds.). Formal Syntax. New York, NY: Academic Press, 71–132. 1980. On binding. Linguistic Inquiry 11: 1–46. 1981. Lectures on Government and Binding. Dordrecht: Foris. 1982. Some Concepts and Consequences of the Theory of Government and Binding. Cambridge, MA: MIT Press. 1986a. Barriers. Cambridge, MA: MIT Press. 1986b. Knowledge of Language: Its Nature, Origin and Use. New York: Praeger. 1993. A minimalist program for linguistic theory. In Kenneth Hale and Samuel Jay Keyser (eds.).
The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger. Cambridge, MA: MIT Press, 1–52. 1994. Bare phrase-structure. MIT Occasional Papers in Linguistics 5. 1995. Categories and Transformations. In The Minimalist Program. Cambridge, MA: MIT Press, 219–394. 2000. Minimalist inquiries: the framework. In Roger Martin, David Michaels, and Juan Uriagereka (eds.). Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik. Cambridge, MA: MIT Press, 89–155. 2001. Derivation by phase. In Michael Kenstowicz (ed.). Ken Hale: A Life in Language. Cambridge, MA: MIT Press, 1–52. 2004. Beyond explanatory adequacy. In A. Belletti (ed.). Structures and Beyond. Oxford University Press, 104–131. 2008. On phases. In R. Freidin, C. P. Otero, and M. L. Zubizarreta (eds.). Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud. Cambridge, MA: MIT Press, 133–166.
Chomsky, Noam, and Howard Lasnik. 1993. The theory of principles and parameters. In Joachim Jacobs, Arnim von Stechow, Wolfgang Sternefeld, and Theo Vennemann (eds.). Syntax: An International Handbook of Contemporary Research. Berlin: Walter de Gruyter, 506–569. Cormack, Annabel, and Neil Smith. 2004. Backward control in Korean and Japanese. University College of London Working Papers in Linguistics 16: 57–83. Corver, Norbert, and Jairo Nunes. 2007. The Copy Theory of Movement. Amsterdam: John Benjamins. Courtenay, Karen. 1998. Summary: subject-control verb PROMISE in English. http://linguistlist.org/issues/9/9-651.html Culicover, P., and R. Jackendoff. 2001. Control is not movement. Linguistic Inquiry 30: 483–512. Culicover, P., and R. Jackendoff. 2005. Simpler Syntax. Oxford University Press. Davies, W. D., and S. Dubinsky. 2004. The Grammar of Raising and Control: A Course in Syntactic Argumentation. Oxford: Blackwell. Dobrovie-Sorin, Carmen. 1994. The Syntax of Romanian: Comparative Studies in Romance. Berlin: Mouton de Gruyter. Dowty, David. 1991. Thematic proto-roles and argument selection. Language 67: 547–619. Drummond, Alex. 2009. On the surprisingly constrained nature of sideward movement. Unpublished manuscript. University of Maryland. Duarte, Maria Eugênia. 1995. A perda do princípio “evite pronome” no português brasileiro. Doctoral dissertation. Universidade Estadual de Campinas. 2000. The loss of the “avoid pronoun” principle in Brazilian Portuguese. In M. Kato and E. Negrão (eds.). Brazilian Portuguese and the Null Subject Parameter. Madrid and Frankfurt am Main: Iberoamericana and Vervuert, 17–36. 2004. On the embedding of a syntactic change. Language Variation in Europe, Papers from ICLaVE 2: 145–155. Epstein, Samuel D., and T. Daniel Seely. 2006. Derivations in Minimalism. Cambridge University Press. Etxepare, Ricardo. 1998. The syntax of illocutionary force. Doctoral dissertation. University of Maryland. Farkas, Donka. 1988.
On obligatory control. Linguistics and Philosophy 11: 27–58. Ferreira, Marcelo. 2000. Argumentos nulos em Português Brasileiro. MA thesis. Universidade Estadual de Campinas. 2004. Hyperraising and null subjects in Brazilian Portuguese. MIT Working Papers in Linguistics 47: Collected Papers on Romance Syntax, 57–85. 2009. Null subjects and finite control in Brazilian Portuguese. In J. Nunes (ed.). Minimalist Essays on Brazilian Portuguese Syntax. Amsterdam: John Benjamins, 17–49. Floripi, Simone. 2003. Argumentos nulos dentro de DPs em Português Brasileiro. MA thesis. Universidade Estadual de Campinas. Floripi, Simone, and Jairo Nunes. 2009. Movement and resumption in null possessor constructions in Brazilian Portuguese. In J. Nunes (ed.). Minimalist Essays on Brazilian Portuguese Syntax. Amsterdam: John Benjamins, 51–68.
Fodor, Jerry. 1975. The Language of Thought. New York: Thomas Y. Crowell. Freidin, Robert, and Rex Sprouse. 1991. Lexical case phenomena. In R. Freidin (ed.). Principles and Parameters in Comparative Syntax. Cambridge, MA: MIT Press, 392–416. Fujii, Tomohiro. 2006. Some theoretical issues in Japanese control. Doctoral dissertation. University of Maryland. Galves, Charlotte. 1998. Tópicos, sujeitos, pronomes e concordância no Português Brasileiro. Cadernos de Estudos Lingüísticos 34: 7–21. 2001. Ensaios sobre as Gramáticas do Português. Campinas: Editora da UNICAMP. González, Nora. 1988. Object and Raising in Spanish. New York: Garland. 1990. Unusual inversion in Chilean Spanish. In Paul M. Postal and Brian D. Joseph (eds.). Studies in Relational Grammar 3. University of Chicago Press, 87–103. Grinder, J. 1970. Super equi-NP deletion. Chicago Linguistic Society (CLS) 6: 297–317. Grodzinsky, Yosef, and Tanya Reinhart. 1993. The innateness of binding and of coreference. Linguistic Inquiry 24: 69–101. Grohmann, Kleanthes K. 2003. Prolific Peripheries. Amsterdam: John Benjamins. Grosu, Alexander, and Julia Horvath. 1984. The GB theory and raising in Romanian. Linguistic Inquiry 15: 348–353. Grosu, Alexander, and Julia Horvath. 1987. On nonfiniteness in extraction constructions. Natural Language and Linguistic Theory 5: 181–196. Haddad, Youssef. 2007. Adjunct control in Telugu and Assamese. Doctoral dissertation. University of Florida. 2009. Copy control in Telugu. Journal of Linguistics 45: 69–109. Halle, Morris, and Alec Marantz. 1993. Distributed morphology and the pieces of inflection. In Kenneth Hale and Samuel Keyser (eds.). The View from Building 20: Essays in Honor of Sylvain Bromberger. Cambridge, MA: MIT Press, 111–176. Harley, Heidi. 1995. Subjects, events, and licensing. Doctoral dissertation. MIT. Henderson, Brent. 2006. Multiple agreement and inversion in Bantu. Syntax 9: 275–289. Higginbotham, J. 1980. Pronouns and bound variables.
Linguistic Inquiry 11: 679–708. 1992. Reference and control. In R. Larson, S. Iatridou, U. Lahiri, and J. Higginbotham (eds.). Control and Grammar. Dordrecht: Kluwer, 79–108. Hornstein, Norbert. 1990. As Time Goes By. Cambridge, MA: MIT Press. 1995. Logical Form: From GB to Minimalism. Oxford: Blackwell. 1999. Movement and control. Linguistic Inquiry 30: 69–96. 2001. Move! A Minimalist Theory of Construal. Oxford: Blackwell. 2003. On control. In Randall Hendrick (ed.). Minimalist Syntax. Oxford: Blackwell, 6–81. 2007. Pronouns in minimal setting. In Norbert Corver and Jairo Nunes (eds.). The Copy Theory of Movement. Amsterdam: John Benjamins, 351–385. 2009. A Theory of Syntax: Minimal Operations and Universal Grammar. Cambridge University Press. Hornstein, Norbert, Ana Maria Martins, and Jairo Nunes. 2008. Perception and causative structures in English and European Portuguese: φ-feature agreement and the distribution of bare and prepositional infinitives. Syntax 11(2): 198–222.
Hornstein, Norbert, and Hirohisa Kiguchi. 2003. PRO gate and movement. Penn Working Papers in Linguistics 8: 101–114. Hornstein, Norbert, and Jairo Nunes. 2002. On asymmetries between parasitic gap and across-the-board constructions. Syntax 5(1): 26–54. Hornstein, Norbert, and Jairo Nunes. 2008. Adjunction, labeling, and bare phrase structure. Biolinguistics 2: 57–86. Hornstein, Norbert, Jairo Nunes, and Kleanthes Grohmann. 2005. Understanding Minimalism. Cambridge University Press. Hornstein, Norbert, and Paul Pietroski. 2009. Obligatory control and local reflexives: copies as vehicles for de se readings. Unpublished manuscript. University of Maryland. Hornstein, Norbert, and Jacek Witkos. 2003. Yet another approach to existential constructions. In Lars-Olof Delsing, C. Falk, G. Josefsson, H. Sigurðsson (eds.). Grammar in Focus: Festschrift for Anders Platzak. Lund: Lund University, 167–184. Huang, C.-T. James. 1982. Logical relations in Chinese and the theory of grammar. Doctoral dissertation. MIT. Jackendoff, Ray. 1983. Semantics and Cognition. Cambridge: MIT Press. 1990. Semantic Structures. Cambridge: MIT Press. Jaeggli, Osvaldo. 1980. Remarks on to-contraction. Linguistic Inquiry 11: 239–245. 1986. Passive. Linguistic Inquiry 17: 587–622. Jónsson, Jóhannes Gísli. 1996. Clausal architecture and case in Icelandic. Doctoral dissertation. University of Massachusetts. Kandybowicz, Jason. 2009. The Grammar of Repetition: Nupe Grammar at the Syntax–Phonology Interface. Amsterdam: John Benjamins. Kato, Mary A. 1999. Strong and weak pronominals and the null subject parameter. Probus 11(1): 1–38. 2000. The partial pro-drop nature and the restricted VS order in Brazilian Portuguese. In M. Kato and E. Negrão (eds.). Brazilian Portuguese and the Null Subject Parameter. Madrid and Frankfurt am Main: Iberoamericana and Vervuert, 223–258. Kato, Mary A., and Jairo Nunes. 2009.
A uniform raising analysis for standard and nonstandard relative clauses in Brazilian Portuguese. In J. Nunes (ed.). Minimalist Essays on Brazilian Portuguese Syntax. Amsterdam: John Benjamins, 93–120. Kayne, Richard. 1984. Connectedness and Binary Branching. Dordrecht: Foris. 1994. The Antisymmetry of Syntax. Cambridge, MA: MIT Press. 2002. Pronouns and their antecedents. In S. D. Epstein and T. D. Seely (eds.). Derivation and Explanation in the Minimalist Program. Oxford: Blackwell, 133–166. Kiguchi, Hirohisa. 2004. Syntax unchained. Doctoral dissertation. University of Maryland. Kiss, Tibor. 2005. On the empirical viability of the movement theory of control. Unpublished manuscript. Ruhr-Universität Bochum. Kitahara, Hisatsugu. 1997. Elementary Operations and Optimal Derivations. Cambridge: MIT Press. Koopman, Hilda. 1984. The Syntax of Verbs. Dordrecht: Foris. Koster, Jan. 1987. Domains and dynasties. Dordrecht: Foris.
Laka, Itziar. 2006. On the nature of case in Basque: structural or inherent? In H. Broekhuis, N. Corver, J. Koster, R. Huybregts, and U. Kleinhenz (eds.). Organizing Grammar: Linguistic Studies in Honor of Henk van Riemsdijk. Berlin: Mouton de Gruyter, 374–382. Landau, Idan. 1999. Elements of control. Doctoral dissertation. MIT. 2000. Elements of Control: Structure and Meaning in Infinitival Constructions. Dordrecht: Kluwer. 2003. Movement out of control. Linguistic Inquiry 34: 471–498. 2004. The scale of finiteness and the calculus of control. Natural Language and Linguistic Theory 22: 811–877. 2006. Severing the distribution of PRO from case. Syntax 9: 153–170. 2007. Movement-resistant aspects of control. In W. D. Davies and S. Dubinsky (eds.). New Horizons in the Analysis of Control and Raising. Dordrecht: Springer, 293–325. Lasnik, Howard. 1995. Case and expletives revisited. Linguistic Inquiry 26: 615–633. Lasnik, Howard, and Mamoru Saito. 1992. Move α. Cambridge, MA: MIT Press. Lasnik, Howard, and Juan Uriagereka. 1988. A Course in GB Syntax: Lectures on Binding and Empty Categories. Cambridge, MA: MIT Press. Lebeaux, David. 1985. Locality and anaphoric binding. The Linguistic Review 4: 343–363. Lee, Felicia. 2003. Anaphoric R-expressions as bound variables. Syntax 6: 84–114. Lees, Robert B. 1960. The grammar of English nominalizations. The Hague: Mouton de Gruyter. Lidz, Jeff, and William Idsardi. 1997. Chains and phono-logical form. UPenn Working Papers in Linguistics 8: 109–125. Lightfoot, David. 1976. Trace theory and twice-moved NPs. Linguistic Inquiry 7: 559–582. Manzini, Maria Rita, and Anna Roussou. 2000. A minimalist theory of A-movement and control. Lingua 110: 409–447. Marantz, Alec. 1991. Case and licensing. In Germán F. Westphal, Benjamin Ao, and Hee-Rahk Chae (eds.). Proceedings of the Eighth Eastern States Conference on Linguistics. Columbus, OH: Ohio State University, 234–253. Martin, Roger. 1996. A minimalist theory of PRO and control.
Doctoral dissertation. University of Connecticut. 2001. Null case and the distribution of PRO. Linguistic Inquiry 32: 141–166. Martins, Ana Maria. 2001. On the origin of the Portuguese inflected infinitive: a new perspective on an enduring debate. In L. J. Brinton (ed.). Historical Linguistics 1999: Selected Papers from the 14th International Conference on Historical Linguistics. Amsterdam: John Benjamins, 207–222. Martins, Ana Maria, and Jairo Nunes. 2005. Raising issues in Brazilian and European Portuguese. Journal of Portuguese Linguistics 4: 53–77. Martins, Ana Maria, and Jairo Nunes. 2008. Personal and impersonal infinitives in European Portuguese and obligatory control. Unpublished manuscript. Universidade de Lisboa and Universidade de São Paulo.
Martins, Ana Maria, and Jairo Nunes. 2009. Syntactic change as chain reaction: the emergence of hyper-raising in Brazilian Portuguese. In P. Crisma and G. Longobardi (eds.). Historical Syntax and Linguistic Theory. Oxford University Press, 144–157. Martins, Ana Maria, and Jairo Nunes. In press. Apparent hyper-raising in Brazilian Portuguese: agreement with topics across a finite CP. In E. P. Panagiotidis (ed.). The Complementiser Phase: Subjects and Wh-Dependencies. Oxford University Press. McGinnis, Martha. 1998. Locality in A-movement. Doctoral dissertation. MIT. Modesto, Marcello. 2000. On the identification of null arguments. Doctoral dissertation. University of Southern California. Monahan, Philip J. 2003. Backward object control in Korean. In G. Garding and M. Tsujimura (eds.). WCCFL 22 Proceedings. Somerville, MA: Cascadilla Press, 356–369. Mortensen, David. 2003. Two kinds of variable elements in Hmong anaphora. Unpublished manuscript. UC Berkeley. Negrão, Esmeralda. 1999. O Português Brasileiro: uma língua voltada para o discurso. Doctoral dissertation. Universidade de São Paulo. Nicol, J., and D. Swinney. 1989. The role of structure in coreference assignment during sentence processing. Journal of Psycholinguistic Research 18: 5–20. Nissenbaum, Jon. 2000. Investigations of covert phrase movement. Doctoral dissertation. MIT. Nunes, Jairo. 1995. The copy theory of movement and linearization of chains in the minimalist program. Doctoral dissertation. University of Maryland. 1999. Linearization of chains and phonetic realization of chain links. In Samuel David Epstein and Norbert Hornstein (eds.). Working Minimalism. Cambridge, MA: MIT Press, 217–249. 2001. Sideward movement. Linguistic Inquiry 31(2): 303–344. 2004. Linearization of Chains and Sideward Movement. Cambridge, MA: MIT Press. 2007. A-over-A, inherent case, and relativized probing. GLOW Newsletter 58. 2008a.
Inherent case as a licensing condition for A-movement: the case of hyperraising constructions in Brazilian Portuguese. Journal of Portuguese Linguistics 7: 83–108. 2008b. Preposition insertion in the mapping from spell-out to PF. Linguistics in Potsdam 28: Optimality Theory and Minimalism: Interface Theories, 133–156. 2008c. Sideward movement: triggers, timing, and output. Paper presented at Ways of Structure Building, University of the Basque Country. 2009a. A note on wh-islands and finite control in Brazilian Portuguese. Estudos da Lingua(gem). 2009b. Dummy prepositions and the licensing of null subjects in Brazilian Portuguese. In E. Aboh, E. van der Linden, J. Quer, and P. Sleeman (eds.). Romance Languages and Linguistic Theory: Selected Papers from “Going Romance” 2007. Amsterdam: John Benjamins, 243–265. 2010. Relativizing Minimality for A-movement: φ- and θ-relations. Probus 22: 1–25. In press. The copy theory. In C. Boeckx (ed.). The Oxford Handbook of Linguistic Minimalism. Oxford University Press.
Nunes, Jairo, and Juan Uriagereka. 2000. Cyclicity and extraction domains. Syntax 3: 20–43. Oded, Ilknur. 2006. Control in Turkish. MA thesis. Bogazici University. O’Neil, J. 1995. Out of control. NELS 25: 361–371. Osterhout, L., and L. A. Mobley. 1995. Event-related brain potentials elicited by failure to agree. Journal of Memory and Language 34: 739–773. Phillips, Colin. 1996. Order and structure. Doctoral dissertation. MIT. 2004. The real-time status of island constraints. Unpublished manuscript. University of Maryland. Pires, Acrisio. 2001. The syntax of gerunds and infinitives: subjects, case and control. Doctoral dissertation. University of Maryland. 2006. The Minimalist Syntax of Defective Domains: Gerunds and Infinitives. Amsterdam: John Benjamins. Polinsky, Maria, Philip Monahan, and Nayoung Kwon. 2007. Object control in Korean: how many constructions? Language Research 43: 1–33. Polinsky, Maria, and Eric Potsdam. 2002. Backward control. Linguistic Inquiry 33: 245–282. Polinsky, Maria, and Eric Potsdam. 2006. Expanding the scope of control and raising. Syntax 9: 171–192. Pontes, Eunice. 1987. O Tópico no Português do Brasil. Campinas: Pontes. Postal, Paul, and Geoffrey Pullum. 1978. Traces and the description of English complementizer contraction. Linguistic Inquiry 9: 1–29. Potsdam, Eric. 2006. Backward object control in Malagasy: evidence against an empty category analysis. In Donald Baumer, David Montero, and Michael Scanlon (eds.). Proceedings of the 25th West Coast Conference on Formal Linguistics. Somerville, MA: Cascadilla Press, 328–336. Quinn, C. 2004. Field notes on white Hmong reflexives. Unpublished manuscript. Harvard University. Raposo, Eduardo. 1987. Romance infinitival clauses and case theory. In C. Neidle and R. A. Núñez Cedeño (eds.). Studies in Romance Languages. Dordrecht: Foris, 237–249. 1989. Prepositional infinitival constructions in European Portuguese. In O. Jaeggli and K. Safir (eds.). The Null Subject Parameter.
Dordrecht: Kluwer, 277– 305. Reinhart, Tanya. 1983. Anaphora and Semantic Interpretation. London: Croom Helm. Reuland, Eric. 1983. Governing -ing. Linguistic Inquiry 14: 101–136. Rezac, Milan. 2004. Elements of cyclic syntax: Agree and merge. Doctoral dissertation. University of Toronto. Richards, Norvin. 2001. Movement in language. Oxford University Press. Rizzi, Luigi. 1990. Relativized minimality. Cambridge, MA: MIT Press. 2006. On the form of chains: criterial positions and ECP effects. In Lisa Cheng and Norbert Corver (eds.). WH-Movement: Moving On. Cambridge, MA: MIT Press, 97–133. Rodrigues, Cilene. 2002. Morphology and null subjects in Brazilian Portuguese. In D. Lightfoot (ed.). Syntactic Effects of Morphological Change. Oxford University Press, 160–178.
2004. Impoverished morphology and A-movement out of case domains. Doctoral dissertation. University of Maryland. 2007. Agreement and flotation in partial and inverse partial control configurations. In W. D. Davies and S. Dubinsky (eds.). New Horizons in the Analysis of Control and Raising. Dordrecht: Springer, 213–229. Rooryck, Johan. 2001. Configurations of Sentential Complementation: Perspectives from Romance Languages. London: Routledge. 2007. Control via selection. In W. D. Davies and S. Dubinsky (eds.). New Horizons in the Analysis of Control and Raising. Dordrecht: Springer, 281–292. Rosenbaum, P. S. 1967. The Grammar of English Predicate Complement Constructions. Cambridge, MA: MIT Press. 1970. A principle governing deletion in English sentential complementation. In R. Jacobs and P. Rosenbaum (eds.). Readings in English Transformational Grammar. Waltham, MA: Ginn, 20–29. Ross, John R. 1970. On declarative sentences. In R. Jacobs and P. Rosenbaum (eds.). Readings in English Transformational Grammar. Waltham, MA: Ginn, 222–272. Roussou, Anna. 2001. Control and raising in and out of subjunctive complements. In María Luisa Rivero and Angela Ralli (eds.). Comparative Syntax of the Balkan Languages. Oxford University Press, 74–104. Salmon, N. 1986. Reflexivity. Notre Dame Journal of Formal Logic 27: 401–429. San Martin, Itziar. 2004. On subordination and the distribution of PRO. Doctoral dissertation. University of Maryland. Shlonsky, Ur. 1997. Clause Structure and Word Order in Hebrew and Arabic: An Essay in Comparative Semitic Syntax. Oxford University Press. Sichel, Ivy. 2007. Raising in DP revisited. In W. D. Davies and S. Dubinsky (eds.). New Horizons in the Analysis of Control and Raising. Dordrecht: Springer, 15–34. Sigurðsson, Halldór Ármann. 1991. Icelandic case marked PRO and the licensing of lexical arguments. Natural Language and Linguistic Theory 9: 327–363. 1996. Icelandic finite verb agreement. Working Papers in Scandinavian Syntax 57: 1–46.
2008. The case of PRO. Natural Language and Linguistic Theory 26: 403–450. Snarska, Anna. 2009. On certain issues of control in English and Polish: partial control, split control and super-equi. Doctoral dissertation. Adam Mickiewicza University, Poznan. Stockwell, Robert P., Paul Schachter, and Barbara Hall Partee. 1973. The Major Syntactic Structures of English. New York: Holt, Rinehart and Winston. Stowell, Tim. 1981. The origins of phrase structure. Doctoral dissertation. MIT. 1982. The tense of infinitives. Linguistic Inquiry 13: 561–570. Terzi, Arontho. 1997. PRO and null case in finite clauses. The Linguistic Review 14: 335–360. Thráinsson, Höskuldur. 1979. On complementation in Icelandic. Doctoral dissertation. Harvard University. 2008. The Syntax of Icelandic. Cambridge University Press. Torrego, Esther. 1996. On quantifier float in control clauses. Linguistic Inquiry 27: 111–126.
Uchibori, Asako. 2000. The syntax of subjunctive complements: evidence from Japanese. Doctoral dissertation. University of Connecticut. Ura, Hiroyuki. 1994. Varieties of raising and the feature-based bare phrase structure theory. MIT Working Papers in Linguistics 23: 297–316. Uriagereka, Juan. 1998. Rhyme and Reason: An Introduction to Minimalist Syntax. Cambridge, MA: MIT Press. 1999. Multiple Spell-Out. In Samuel David Epstein and Norbert Hornstein (eds.). Working Minimalism. Cambridge, MA: MIT Press, 251–282. 2006. Complete and partial Infl. In Cedric Boeckx (ed.). Agreement systems. Amsterdam: John Benjamins, 267–298. Ussery, Cherlon. 2008. What it means to agree: the behavior of case and phi features in Icelandic control. In C. B. Chang and H. J. Haynie (eds.). Proceedings of WCCFL 26. Somerville, MA: Cascadilla Press, 480–488. van Craenenbroeck, Jeroen, Johan Rooryck, and Guido van den Wyngaerd. 2005. If control raises, it fails to copy, reconstruct, and linearize. Paper presented at the LSA Linguistic Institute Workshop “New Horizons in the Grammar of Raising and Control.” Harvard University, July, 8–10. Varlokosta, Spyridoula. 1993. Control in Modern Greek. Doctoral dissertation. University of Maryland. Williams, Edwin. 1980. Predication. Linguistic Inquiry 11: 203–238. 1985. PRO and the subject of NP. Natural Language and Linguistic Theory 3: 277–295. Wurmbrand, Susanne. 2001. Infinitives: Restructuring and Clause Structure. Berlin: Mouton de Gruyter. 2005. Tense in infinitives. Paper presented at the LSA Linguistic Institute Workshop “New Horizons in the Grammar of Raising and Control.” Harvard University, July, 8–10. 2006. WollP: where syntax and semantics meet. Unpublished manuscript. University of Connecticut. Yip, Moira, Joan Maling, and Ray Jackendoff. 1987. Case in tiers. Language 63: 217–250. Zaenen, Annie, Joan Maling, and Höskuldur Thráinsson. 1985. Case and grammatical functions. Natural Language and Linguistic Theory 3: 441–483. Zwart, C.
J.-W. 2002. Issues relating to a derivational theory of binding. In S. D. Epstein and T. D. Seely (eds.). Derivation and Explanation in the Minimalist Program. Oxford: Blackwell, 269–304.
Index
acquisition, 170, 223
activation condition, 48
Agree, 26, 63
A-movement, 1, 29, 34, 36, 46–48, 57, 68, 70, 75, 77, 78, 88, 128, 148, 160, 194, 208, 239
Assamese, 122
bare phrase structure, 49, 53, 84, 245
Basque, 166–168
binding domain, 11–14
binding theory, 11, 16, 80
Bošković, Željko, 104, 159, 228
case theory, 12, 68
case transmission, 165–166
c-command, 13, 21, 47, 90, 102
Chomsky, Noam, 1, 3, 14, 16, 28, 46, 48, 52, 53, 57, 66, 72, 79, 80, 90, 128, 145, 219, 221, 242, 244
clitic, 100–101, 133
commitative PP, 22–34, 185–193
control
  adjunct, 7, 87, 90, 92, 121, 203, 215, 226, 240
  complement, 170, 216, 226
  module, 16, 39–43
  partial, 21, 182
copy theory of movement, 53, 57, 59, 80–90, 99, 101, 104, 108, 240
Culicover, Peter, 141–152, 216
de se interpretation, 13, 25, 49–51, 196–199
double-object construction, 173–181
D-structure, 3, 36, 79, 194, 238, 241–248
economy, 90, 91, 195–209, 228, 239–240
ellipsis, 13, 18–25, 49, 64, 120, 196–198
equivalent NP deletion, 6, 54
  super-equi, 8
exceptional case-marking verb, 16–19, 126–128, 148, 153
exhortative construction, 190–193
expletive, 41, 47, 99, 134, 143, 150, 151, 173
extension condition, 86, 93
Ferreira, Marcelo, 24, 31, 64, 143
floating quantifier, 160–166
Fodor’s problem, 234
freezing, 138, 239–240
Fujii, Tomohiro, 190–193
German, 129, 132–134
government and binding theory, 1, 5, 9–16, 35, 38, 43, 48, 52, 57, 86, 244, 246
Greek, 68, 70
heavy NP shift, 181
Hebrew, 27, 67, 130, 147–149
Hmong, 120, 121
hyper-raising, 48, 70, 126, 127, 136, 140, 149
Icelandic, 153–154, 158, 160–168
idiom, 42, 45, 72, 107, 147, 150, 151, 155–159
inclusiveness condition, 53–55, 80, 238, 247
inherent case, 133–153, 168, 219
island, 60, 63, 75, 92, 117, 151, 196, 200, 207, 247
Italian, 62, 89
Jackendoff, Ray, 141–152, 216
Japanese, 190–193
Kinande, 69
Kiss, Tibor, 125, 132–136
Koopman, Hilda, 117–119
Korean, 110–112, 113
Landau, Idan, 20–35, 63, 66, 125, 130, 152, 160, 170, 177, 183–186, 190, 193
Lasnik, Howard, 16, 38, 242
Latin, 61
lexical semantics, 216–226, 237
Lightfoot, David, 60
linear correspondence axiom, 116–118
linearization, 116–119, 146
merge, 44, 46, 49, 53, 79, 83, 86, 90, 203, 206, 228, 244
  pair-merge, 145
  set-merge, 145
minimal-distance principle, 6, 7, 47, 90, 169, 176, 191, 239
minimalist program, 3, 36, 39, 43, 53, 80, 124, 238
nominalization, 172, 224
null case, 17–20, 227
null pronoun, 187, 195, 197, 200, 204–209, 219
parser, 204–209
passive, 39, 57, 125, 133, 153, 179, 224
  long passive, 129
phase, 31
Portuguese
  Brazilian, 24, 31–33, 62–63, 64–66, 67–68, 71–74, 77, 78, 89, 96–97, 127, 136–140, 142–146, 188, 198–200, 204, 212, 215, 227, 228
  European, 17–20, 74, 99–101, 227
principle C, 103, 138, 241
principles-and-parameters theory, 16, 38
PRO theorem, 11–12, 13
pronominalization, 197–209
proto-role, 175
quirky case, 152, 182, 194
reconstruction effect, 53, 79
redundancy rule, 32, 65
Reinhart, Tanya, 50
relativized minimality, 56, 76, 129, 169, 194
Rodrigues, Cilene, 24, 62–63, 89, 186–189, 197
Romanian, 70, 77, 105
Rooryck, Johan, 214
Rosenbaum, Peter S., 47, 90, 151, 169, 170, 223
Salmon, Nathan, 50
San Lucas Quiaviní Zapotec, 120–121, 212
Satisfy, 244
scrambling, 107, 111
selection restriction, 20–24, 69, 106, 147, 171, 201, 211–216, 229
Sichel, Ivy, 147–150
sideward movement, 85, 92, 121, 146, 200, 203, 227
Sigurðsson, Halldór Ármann, 152, 164–165
sloppy reading, 21, 49, 64, 120, 196, 197–198
Spanish, 160, 188–189
split antecedent, 49, 182
standard theory, 6, 52
strict reading, 196–198, 199
Telugu, 122
thematic role, 172, 213, 224
transparency, 206–208
Tsez, 106–109, 212
uniformity of theta assignment hypothesis (UTAH), 172–182, 219–225
Vata, 118–119
visibility condition, 15
Visser’s generalization, 126, 130, 132, 136
voice transparency, 42, 110
wanna-contraction, 59
wh-movement, 180–181, 227
Wurmbrand, Susi, 18, 129, 187, 211